comparison COG/cddid.tbl @ 3:e42d30da7a74 draft

Uploaded
author dereeper
date Thu, 30 May 2024 11:52:25 +0000
2:97e4e3e818b6 3:e42d30da7a74
1 214330 CHL00001 rpoB RNA polymerase beta subunit 1070
2 214331 CHL00002 matK maturase K 504
3 176948 CHL00003 psbA photosystem II protein D1 338
4 176949 CHL00004 psbD photosystem II protein D2 353
5 176950 CHL00005 rps16 ribosomal protein S16 82
6 176951 CHL00008 petG cytochrome b6/f complex subunit V 37
7 176952 CHL00009 petN cytochrome b6/f complex subunit VIII 29
8 214332 CHL00010 infA translation initiation factor 1 78
9 176954 CHL00011 ndhD NADH dehydrogenase subunit 4 498
10 176955 CHL00012 ndhJ NADH dehydrogenase subunit J 158
11 214333 CHL00013 rpoA RNA polymerase alpha subunit 327
12 214334 CHL00014 ndhI NADH dehydrogenase subunit I 167
13 176958 CHL00015 ndhE NADH dehydrogenase subunit 4L 101
14 214335 CHL00016 ndhG NADH dehydrogenase subunit 6 182
15 176960 CHL00017 ndhH NADH dehydrogenase subunit 7 393
16 214336 CHL00018 rpoC1 RNA polymerase beta' subunit 663
17 176962 CHL00019 atpF ATP synthase CF0 B subunit 184
18 176963 CHL00020 psbN photosystem II protein N 43
19 176964 CHL00022 ndhC NADH dehydrogenase subunit 3 120
20 214337 CHL00023 ndhK NADH dehydrogenase subunit K 225
21 176966 CHL00024 psbI photosystem II protein I 36
22 214338 CHL00025 ndhF NADH dehydrogenase subunit 5 741
23 214339 CHL00027 rps15 ribosomal protein S15 90
24 214340 CHL00028 clpP ATP-dependent Clp protease proteolytic subunit 200
25 176970 CHL00029 rpl36 ribosomal protein L36 26
26 176971 CHL00030 rpl23 ribosomal protein L23 93
27 176972 CHL00031 psbT photosystem II protein T 33
28 214341 CHL00032 ndhA NADH dehydrogenase subunit 1 363
29 176974 CHL00033 ycf3 photosystem I assembly protein Ycf3 168
30 214342 CHL00034 rpl22 ribosomal protein L22 117
31 214343 CHL00035 psbC photosystem II 44 kDa protein 473
32 176977 CHL00036 ycf4 photosystem I assembly protein Ycf4 184
33 176978 CHL00037 petA cytochrome f 320
34 176979 CHL00038 psbL photosystem II protein L 38
35 176980 CHL00039 psbF photosystem II protein VI 39
36 176981 CHL00040 rbcL ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit 475
37 176982 CHL00041 rps11 ribosomal protein S11 116
38 214344 CHL00042 rps8 ribosomal protein S8 132
39 214345 CHL00043 cemA envelope membrane protein 261
40 176985 CHL00044 rpl16 ribosomal protein L16 135
41 214346 CHL00045 ccsA cytochrome c biogenesis protein 319
42 176987 CHL00046 atpI ATP synthase CF0 A subunit 228
43 214347 CHL00047 psbK photosystem II protein K 58
44 214348 CHL00048 rps3 ribosomal protein S3 214
45 176990 CHL00049 ndhB NADH dehydrogenase subunit 2 494
46 176991 CHL00050 rps19 ribosomal protein S19 92
47 176992 CHL00051 rps12 ribosomal protein S12 123
48 176993 CHL00052 rpl2 ribosomal protein L2 273
49 176994 CHL00053 rps7 ribosomal protein S7 155
50 176995 CHL00054 psaB photosystem I P700 chlorophyll a apoprotein A2 734
51 176996 CHL00056 psaA photosystem I P700 chlorophyll a apoprotein A1 750
52 176997 CHL00057 rpl14 ribosomal protein L14 122
53 176998 CHL00058 petD cytochrome b6/f complex subunit IV 160
54 176999 CHL00059 atpA ATP synthase CF1 alpha subunit 485
55 214349 CHL00060 atpB ATP synthase CF1 beta subunit 494
56 177001 CHL00061 atpH ATP synthase CF0 C subunit 81
57 214350 CHL00062 psbB photosystem II 47 kDa protein 504
58 214351 CHL00063 atpE ATP synthase CF1 epsilon subunit 134
59 177004 CHL00064 psbE photosystem II protein V 83
60 177005 CHL00065 psaC photosystem I subunit VII 81
61 177006 CHL00066 psbH photosystem II protein H 73
62 177007 CHL00067 rps2 ribosomal protein S2 230
63 214352 CHL00068 rpl20 ribosomal protein L20 115
64 177009 CHL00070 petB cytochrome b6 215
65 177010 CHL00071 tufA elongation factor Tu 409
66 177011 CHL00072 chlL photochlorophyllide reductase subunit L 290
67 214353 CHL00073 chlN photochlorophyllide reductase subunit N 457
68 214354 CHL00074 rps14 ribosomal protein S14 100
69 177014 CHL00075 rpl21 ribosomal protein L21 108
70 214355 CHL00076 chlB photochlorophyllide reductase subunit B 513
71 177016 CHL00077 rps18 ribosomal protein S18 86
72 214356 CHL00078 rpl5 ribosomal protein L5 181
73 214357 CHL00079 rps9 ribosomal protein S9 130
74 177019 CHL00080 psbM photosystem II protein M 34
75 177020 CHL00081 chlI Mg-protoporyphyrin IX chelatase 350
76 177021 CHL00082 psbZ photosystem II protein Z 62
77 214358 CHL00083 rpl12 ribosomal protein L12 131
78 177023 CHL00084 rpl19 ribosomal protein L19 117
79 214359 CHL00085 ycf24 putative ABC transporter 485
80 164492 CHL00086 apcA allophycocyanin alpha subunit 161
81 164493 CHL00088 apcB allophycocyanin beta subunit 161
82 100206 CHL00089 apcF allophycocyanin beta 18 subunit 169
83 164494 CHL00090 apcD allophycocyanin gamma subunit 161
84 164495 CHL00091 apcE phycobillisome linker protein 877
85 177025 CHL00093 groEL chaperonin GroEL 529
86 214360 CHL00094 dnaK heat shock protein 70 621
87 214361 CHL00095 clpC Clp protease ATP binding subunit 821
88 214362 CHL00098 tsf elongation factor Ts 200
89 214363 CHL00099 ilvB acetohydroxyacid synthase large subunit 585
90 214364 CHL00100 ilvH acetohydroxyacid synthase small subunit 174
91 214365 CHL00101 trpG anthranilate synthase component 2 190
92 214366 CHL00102 rps20 ribosomal protein S20 93
93 214367 CHL00103 rpl35 ribosomal protein L35 65
94 177033 CHL00104 rpl33 ribosomal protein L33 66
95 177034 CHL00105 psaJ photosystem I subunit IX 42
96 177035 CHL00106 petL cytochrome b6/f complex subunit VI 31
97 177036 CHL00108 psbJ photosystem II protein J 40
98 177037 CHL00112 rpl28 ribosomal protein L28; Provisional 63
99 177038 CHL00113 rps4 ribosomal protein S4; Reviewed 201
100 100224 CHL00114 psbX photosystem II protein X; Reviewed 39
101 177039 CHL00115 rpl34 ribosomal protein L34; Reviewed 46
102 214368 CHL00117 rpoC2 RNA polymerase beta'' subunit; Reviewed 1364
103 214369 CHL00118 atpG ATP synthase CF0 B' subunit; Validated 156
104 177042 CHL00119 atpD ATP synthase CF1 delta subunit; Validated 184
105 177043 CHL00120 psaL photosystem I subunit XI; Validated 143
106 214370 CHL00121 rpl27 ribosomal protein L27; Reviewed 86
107 214371 CHL00122 secA preprotein translocase subunit SecA; Validated 870
108 177046 CHL00123 rps6 ribosomal protein S6; Validated 97
109 177047 CHL00124 acpP acyl carrier protein; Validated 82
110 177048 CHL00125 psaE photosystem I subunit IV; Reviewed 64
111 177049 CHL00127 rpl11 ribosomal protein L11; Validated 140
112 177050 CHL00128 psbW photosystem II protein W; Reviewed 113
113 177051 CHL00129 rpl1 ribosomal protein L1; Reviewed 229
114 177052 CHL00130 rbcS ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit; Reviewed 138
115 214372 CHL00131 ycf16 sulfate ABC transporter protein; Validated 252
116 177054 CHL00132 psaF photosystem I subunit III; Validated 185
117 177055 CHL00133 psbV photosystem II cytochrome c550; Validated 163
118 177056 CHL00134 petF ferredoxin; Validated 99
119 177057 CHL00135 rps10 ribosomal protein S10; Validated 101
120 177058 CHL00136 rpl31 ribosomal protein L31; Validated 68
121 177059 CHL00137 rps13 ribosomal protein S13; Validated 122
122 177060 CHL00138 rps5 ribosomal protein S5; Validated 143
123 214373 CHL00139 rpl18 ribosomal protein L18; Validated 109
124 177062 CHL00140 rpl6 ribosomal protein L6; Validated 178
125 214374 CHL00141 rpl24 ribosomal protein L24; Validated 83
126 177064 CHL00142 rps17 ribosomal protein S17; Validated 84
127 177065 CHL00143 rpl3 ribosomal protein L3; Validated 207
128 177066 CHL00144 odpB pyruvate dehydrogenase E1 component beta subunit; Validated 327
129 177067 CHL00145 psaD photosystem I subunit II; Validated 139
130 214375 CHL00147 rpl4 ribosomal protein L4; Validated 215
131 214376 CHL00148 orf27 Ycf27; Reviewed 240
132 177069 CHL00149 odpA pyruvate dehydrogenase E1 component alpha subunit; Reviewed 341
133 164542 CHL00151 preA prenyl transferase; Reviewed 323
134 214377 CHL00152 rpl32 ribosomal protein L32; Validated 53
135 177071 CHL00154 rpl29 ribosomal protein L29; Validated 67
136 177072 CHL00159 rpl13 ribosomal protein L13; Validated 143
137 214378 CHL00160 rpl9 ribosomal protein L9; Provisional 153
138 214379 CHL00161 secY preprotein translocase subunit SecY; Validated 417
139 214380 CHL00162 thiG thiamin biosynthesis protein G; Validated 267
140 214381 CHL00163 ycf65 putative ribosomal protein 3; Validated 99
141 164550 CHL00164 psaK photosystem I subunit X; Validated 86
142 214382 CHL00165 ftrB ferredoxin thioreductase subunit beta; Validated 116
143 214383 CHL00168 pbsA heme oxygenase; Provisional 238
144 100270 CHL00170 cpcA phycocyanin alpha subunit; Reviewed 162
145 100271 CHL00171 cpcB phycocyanin beta subunit; Reviewed 172
146 133617 CHL00172 cpeB phycoerythrin beta subunit; Provisional 177
147 100273 CHL00173 cpeA phycoerythrin alpha subunit; Provisional 164
148 214384 CHL00174 accD acetyl-CoA carboxylase beta subunit; Reviewed 296
149 214385 CHL00175 minD septum-site determining protein; Validated 281
150 214386 CHL00176 ftsH cell division protein; Validated 638
151 214387 CHL00177 ccs1 c-type cytochrome biogenensis protein; Validated 426
152 177082 CHL00180 rbcR LysR transcriptional regulator; Provisional 305
153 177083 CHL00181 cbbX CbbX; Provisional 287
154 177084 CHL00182 tatC Sec-independent translocase component C; Provisional 249
155 177085 CHL00183 petJ cytochrome c553; Provisional 108
156 177086 CHL00184 ycf12 Ycf12; Provisional 33
157 177087 CHL00185 ycf59 magnesium-protoporphyrin IX monomethyl ester cyclase; Provisional 351
158 177088 CHL00186 psaI photosystem I subunit VIII; Validated 36
159 214388 CHL00187 cysT sulfate transport protein; Provisional 237
160 214389 CHL00188 hisH imidazole glycerol phosphate synthase subunit hisH; Provisional 210
161 177089 CHL00189 infB translation initiation factor 2; Provisional 742
162 177090 CHL00190 psaM photosystem I subunit XII; Provisional 30
163 214390 CHL00191 ycf61 DNA-directed RNA polymerase subunit omega; Provisional 76
164 214391 CHL00192 syfB phenylalanyl-tRNA synthetase beta chain; Provisional 704
165 177092 CHL00193 ycf35 Ycf35; Provisional 128
166 177093 CHL00194 ycf39 Ycf39; Provisional 317
167 177094 CHL00195 ycf46 Ycf46; Provisional 489
168 177095 CHL00196 psbY photosystem II protein Y; Provisional 36
169 214392 CHL00197 carA carbamoyl-phosphate synthase arginine-specific small subunit; Provisional 382
170 214393 CHL00198 accA acetyl-CoA carboxylase carboxyltransferase alpha subunit; Provisional 322
171 164575 CHL00199 infC translation initiation factor 3; Provisional 182
172 214394 CHL00200 trpA tryptophan synthase alpha subunit; Provisional 263
173 164576 CHL00201 syh histidine-tRNA synthetase; Provisional 430
174 133644 CHL00202 argB acetylglutamate kinase; Provisional 284
175 164577 CHL00203 fabH 3-oxoacyl-acyl-carrier-protein synthase 3; Provisional 326
176 214395 CHL00204 ycf1 Ycf1; Provisional 1832
177 214396 CHL00206 ycf2 Ycf2; Provisional 2281
178 214397 CHL00207 rpoB RNA polymerase beta subunit; Provisional 1077
179 223080 COG0001 HemL Glutamate-1-semialdehyde aminotransferase [Coenzyme transport and metabolism]. 432
180 223081 COG0002 ArgC N-acetyl-gamma-glutamylphosphate reductase [Amino acid transport and metabolism]. 349
181 223082 COG0003 ArsA Anion-transporting ATPase, ArsA/GET3 family [Inorganic ion transport and metabolism]. 322
182 223083 COG0004 AmtB Ammonia channel protein AmtB [Inorganic ion transport and metabolism]. 409
183 223084 COG0005 XapA Purine nucleoside phosphorylase [Nucleotide transport and metabolism]. 262
184 223085 COG0006 PepP Xaa-Pro aminopeptidase [Amino acid transport and metabolism]. 384
185 223086 COG0007 CysG Uroporphyrinogen-III methylase (siroheme synthase) [Coenzyme transport and metabolism]. 244
186 223087 COG0008 GlnS Glutamyl- or glutaminyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 472
187 223088 COG0009 SUA5 tRNA A37 threonylcarbamoyladenosine synthetase subunit TsaC/SUA5/YrdC [Translation, ribosomal structure and biogenesis]. 211
188 223089 COG0010 SpeB Arginase family enzyme [Amino acid transport and metabolism]. 305
189 223090 COG0011 YqgV Uncharacterized conserved protein YqgV, UPF0045/DUF77 family [Function unknown]. 100
190 223091 COG0012 GTP1 Ribosome-binding ATPase YchF, GTP1/OBG family [Translation, ribosomal structure and biogenesis]. 372
191 223092 COG0013 AlaS Alanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 879
192 223093 COG0014 ProA Gamma-glutamyl phosphate reductase [Amino acid transport and metabolism]. 417
193 223094 COG0015 PurB Adenylosuccinate lyase [Nucleotide transport and metabolism]. 438
194 223095 COG0016 PheS Phenylalanyl-tRNA synthetase alpha subunit [Translation, ribosomal structure and biogenesis]. 335
195 223096 COG0017 AsnS Aspartyl/asparaginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 435
196 223097 COG0018 ArgS Arginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 577
197 223098 COG0019 LysA Diaminopimelate decarboxylase [Amino acid transport and metabolism]. 394
198 223099 COG0020 UppS Undecaprenyl pyrophosphate synthase [Lipid transport and metabolism]. 245
199 223100 COG0021 TktA Transketolase [Carbohydrate transport and metabolism]. 663
200 223101 COG0022 AcoB Pyruvate/2-oxoglutarate/acetoin dehydrogenase complex, dehydrogenase (E1) component [Energy production and conversion]. 324
201 223102 COG0023 SUI1 Translation initiation factor 1 (eIF-1/SUI1) [Translation, ribosomal structure and biogenesis]. 104
202 223103 COG0024 Map Methionine aminopeptidase [Translation, ribosomal structure and biogenesis]. 255
203 223104 COG0025 NhaP NhaP-type Na+/H+ or K+/H+ antiporter [Inorganic ion transport and metabolism]. 429
204 223105 COG0026 PurK Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) [Nucleotide transport and metabolism]. 375
205 223106 COG0027 PurT Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) [Nucleotide transport and metabolism]. 394
206 223107 COG0028 IlvB Acetolactate synthase large subunit or other thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 550
207 223108 COG0029 NadB Aspartate oxidase [Coenzyme transport and metabolism]. 518
208 223109 COG0030 RsmA 16S rRNA A1518 and A1519 N6-dimethyltransferase RsmA/KsgA/DIM1 (may also have DNA glycosylase/AP lyase activity) [Translation, ribosomal structure and biogenesis]. 259
209 223110 COG0031 CysK Cysteine synthase [Amino acid transport and metabolism]. 300
210 223111 COG0033 Pgm Phosphoglucomutase [Carbohydrate transport and metabolism]. 524
211 223112 COG0034 PurF Glutamine phosphoribosylpyrophosphate amidotransferase [Nucleotide transport and metabolism]. 470
212 223113 COG0035 Upp Uracil phosphoribosyltransferase [Nucleotide transport and metabolism]. 210
213 223114 COG0036 Rpe Pentose-5-phosphate-3-epimerase [Carbohydrate transport and metabolism]. 220
214 223115 COG0037 TilS tRNA(Ile)-lysidine synthase TilS/MesJ [Translation, ribosomal structure and biogenesis]. 298
215 223116 COG0038 ClcA H+/Cl- antiporter ClcA [Inorganic ion transport and metabolism]. 443
216 223117 COG0039 Mdh Malate/lactate dehydrogenase [Energy production and conversion]. 313
217 223118 COG0040 HisG ATP phosphoribosyltransferase [Amino acid transport and metabolism]. 290
218 223119 COG0041 PurE Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase [Nucleotide transport and metabolism]. 162
219 223120 COG0042 DusA tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 323
220 223121 COG0043 UbiD 3-polyprenyl-4-hydroxybenzoate decarboxylase [Coenzyme transport and metabolism]. 477
221 223122 COG0044 AllB Dihydroorotase or related cyclic amidohydrolase [Nucleotide transport and metabolism]. 430
222 223123 COG0045 SucC Succinyl-CoA synthetase, beta subunit [Energy production and conversion]. 387
223 223124 COG0046 PurL1 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain [Nucleotide transport and metabolism]. 743
224 223125 COG0047 PurL2 Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain [Nucleotide transport and metabolism]. 231
225 223126 COG0048 RpsL Ribosomal protein S12 [Translation, ribosomal structure and biogenesis]. 129
226 223127 COG0049 RpsG Ribosomal protein S7 [Translation, ribosomal structure and biogenesis]. 148
227 223128 COG0050 TufB Translation elongation factor EF-Tu, a GTPase [Translation, ribosomal structure and biogenesis]. 394
228 223129 COG0051 RpsJ Ribosomal protein S10 [Translation, ribosomal structure and biogenesis]. 104
229 223130 COG0052 RpsB Ribosomal protein S2 [Translation, ribosomal structure and biogenesis]. 252
230 223131 COG0053 FieF Divalent metal cation (Fe/Co/Zn/Cd) transporter [Inorganic ion transport and metabolism]. 304
231 223132 COG0054 RibE 6,7-dimethyl-8-ribityllumazine synthase (Riboflavin synthase beta chain) [Coenzyme transport and metabolism]. 152
232 223133 COG0055 AtpD FoF1-type ATP synthase, beta subunit [Energy production and conversion]. 468
233 223134 COG0056 AtpA FoF1-type ATP synthase, alpha subunit [Energy production and conversion]. 504
234 223135 COG0057 GapA Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]. 335
235 223136 COG0058 GlgP Glucan phosphorylase [Carbohydrate transport and metabolism]. 750
236 223137 COG0059 IlvC Ketol-acid reductoisomerase [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 338
237 223138 COG0060 IleS Isoleucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 933
238 223139 COG0061 NadF NAD kinase [Nucleotide transport and metabolism]. 281
239 223140 COG0062 Nnr1 NAD(P)H-hydrate repair enzyme Nnr, NAD(P)H-hydrate epimerase domain [Nucleotide transport and metabolism]. 203
240 223141 COG0063 Nnr2 NAD(P)H-hydrate repair enzyme Nnr, NAD(P)H-hydrate dehydratase domain [Nucleotide transport and metabolism]. 284
241 223142 COG0064 GatB Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit [Translation, ribosomal structure and biogenesis]. 483
242 223143 COG0065 LeuC Homoaconitase/3-isopropylmalate dehydratase large subunit [Amino acid transport and metabolism]. 423
243 223144 COG0066 LeuD 3-isopropylmalate dehydratase small subunit [Amino acid transport and metabolism]. 191
244 223145 COG0067 GltB1 Glutamate synthase domain 1 [Amino acid transport and metabolism]. 371
245 223146 COG0068 HypF Hydrogenase maturation factor HypF (carbamoyltransferase) [Posttranslational modification, protein turnover, chaperones]. 750
246 223147 COG0069 GltB2 Glutamate synthase domain 2 [Amino acid transport and metabolism]. 485
247 223148 COG0070 GltB3 Glutamate synthase domain 3 [Amino acid transport and metabolism]. 301
248 223149 COG0071 IbpA Molecular chaperone IbpA, HSP20 family [Posttranslational modification, protein turnover, chaperones]. 146
249 223150 COG0072 PheT Phenylalanyl-tRNA synthetase beta subunit [Translation, ribosomal structure and biogenesis]. 650
250 223151 COG0073 EMAP tRNA-binding EMAP/Myf domain [Translation, ribosomal structure and biogenesis]. 123
251 223152 COG0074 SucD Succinyl-CoA synthetase, alpha subunit [Energy production and conversion]. 293
252 223153 COG0075 PucG Archaeal aspartate aminotransferase or a related aminotransferase, includes purine catabolism protein PucG [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 383
253 223154 COG0076 GadA Glutamate or tyrosine decarboxylase or a related PLP-dependent protein [Amino acid transport and metabolism]. 460
254 223155 COG0077 PheA2 Prephenate dehydratase [Amino acid transport and metabolism]. 279
255 223156 COG0078 ArgF Ornithine carbamoyltransferase [Amino acid transport and metabolism]. 310
256 223157 COG0079 HisC Histidinol-phosphate/aromatic aminotransferase or cobyric acid decarboxylase [Amino acid transport and metabolism]. 356
257 223158 COG0080 RplK Ribosomal protein L11 [Translation, ribosomal structure and biogenesis]. 141
258 223159 COG0081 RplA Ribosomal protein L1 [Translation, ribosomal structure and biogenesis]. 228
259 223160 COG0082 AroC Chorismate synthase [Amino acid transport and metabolism]. 369
260 223161 COG0083 ThrB Homoserine kinase [Amino acid transport and metabolism]. 299
261 223162 COG0084 TatD Tat protein secretion system quality control protein TatD (DNase activity) [Cell motility]. 256
262 223163 COG0085 RpoB DNA-directed RNA polymerase, beta subunit/140 kD subunit [Transcription]. 1060
263 223164 COG0086 RpoC DNA-directed RNA polymerase, beta' subunit/160 kD subunit [Transcription]. 808
264 223165 COG0087 RplC Ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. 218
265 223166 COG0088 RplD Ribosomal protein L4 [Translation, ribosomal structure and biogenesis]. 214
266 223167 COG0089 RplW Ribosomal protein L23 [Translation, ribosomal structure and biogenesis]. 94
267 223168 COG0090 RplB Ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. 275
268 223169 COG0091 RplV Ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 120
269 223170 COG0092 RpsC Ribosomal protein S3 [Translation, ribosomal structure and biogenesis]. 233
270 223171 COG0093 RplN Ribosomal protein L14 [Translation, ribosomal structure and biogenesis]. 122
271 223172 COG0094 RplE Ribosomal protein L5 [Translation, ribosomal structure and biogenesis]. 180
272 223173 COG0095 LplA Lipoate-protein ligase A [Coenzyme transport and metabolism]. 248
273 223174 COG0096 RpsH Ribosomal protein S8 [Translation, ribosomal structure and biogenesis]. 132
274 223175 COG0097 RplF Ribosomal protein L6P/L9E [Translation, ribosomal structure and biogenesis]. 178
275 223176 COG0098 RpsE Ribosomal protein S5 [Translation, ribosomal structure and biogenesis]. 181
276 223177 COG0099 RpsM Ribosomal protein S13 [Translation, ribosomal structure and biogenesis]. 121
277 223178 COG0100 RpsK Ribosomal protein S11 [Translation, ribosomal structure and biogenesis]. 129
278 223179 COG0101 TruA tRNA U38,U39,U40 pseudouridine synthase TruA [Translation, ribosomal structure and biogenesis]. 266
279 223180 COG0102 RplM Ribosomal protein L13 [Translation, ribosomal structure and biogenesis]. 148
280 223181 COG0103 RpsI Ribosomal protein S9 [Translation, ribosomal structure and biogenesis]. 130
281 223182 COG0104 PurA Adenylosuccinate synthase [Nucleotide transport and metabolism]. 430
282 223183 COG0105 Ndk Nucleoside diphosphate kinase [Nucleotide transport and metabolism]. 135
283 223184 COG0106 HisA Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase [Amino acid transport and metabolism]. 241
284 223185 COG0107 HisF Imidazole glycerol phosphate synthase subunit HisF [Amino acid transport and metabolism]. 256
285 223186 COG0108 RibB 3,4-dihydroxy-2-butanone 4-phosphate synthase [Coenzyme transport and metabolism]. 203
286 223187 COG0109 CyoE Polyprenyltransferase (heme O synthase) [Coenzyme transport and metabolism, Lipid transport and metabolism]. 304
287 223188 COG0110 WbbJ Acetyltransferase (isoleucine patch superfamily) [General function prediction only]. 190
288 223189 COG0111 SerA Phosphoglycerate dehydrogenase or related dehydrogenase [Coenzyme transport and metabolism, General function prediction only]. 324
289 223190 COG0112 GlyA Glycine/serine hydroxymethyltransferase [Amino acid transport and metabolism]. 413
290 223191 COG0113 HemB Delta-aminolevulinic acid dehydratase, porphobilinogen synthase [Coenzyme transport and metabolism]. 330
291 223192 COG0114 FumC Fumarate hydratase class II [Energy production and conversion]. 462
292 223193 COG0115 IlvE Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 284
293 223194 COG0116 RlmL 23S rRNA G2445 N2-methylase RlmL [Translation, ribosomal structure and biogenesis]. 381
294 223195 COG0117 RibD1 Pyrimidine deaminase domain of riboflavin biosynthesis protein RibD [Coenzyme transport and metabolism]. 146
295 223196 COG0118 HisH Imidazoleglycerol phosphate synthase glutamine amidotransferase subunit HisH [Amino acid transport and metabolism]. 204
296 223197 COG0119 LeuA Isopropylmalate/homocitrate/citramalate synthases [Amino acid transport and metabolism]. 409
297 223198 COG0120 RpiA Ribose 5-phosphate isomerase [Carbohydrate transport and metabolism]. 227
298 223199 COG0121 YafJ Predicted glutamine amidotransferase [General function prediction only]. 252
299 223200 COG0122 AlkA 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [Replication, recombination and repair]. 285
300 223201 COG0123 AcuC Acetoin utilization deacetylase AcuC or a related deacetylase [Chromatin structure and dynamics, Secondary metabolites biosynthesis, transport and catabolism]. 340
301 223202 COG0124 HisS Histidyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 429
302 223203 COG0125 Tmk Thymidylate kinase [Nucleotide transport and metabolism]. 208
303 223204 COG0126 Pgk 3-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 395
304 223205 COG0127 RdgB Inosine/xanthosine triphosphate pyrophosphatase, all-alpha NTP-PPase family [Nucleotide transport and metabolism]. 194
305 223206 COG0128 AroA 5-enolpyruvylshikimate-3-phosphate synthase [Amino acid transport and metabolism]. 428
306 223207 COG0129 IlvD Dihydroxyacid dehydratase/phosphogluconate dehydratase [Amino acid transport and metabolism, Carbohydrate transport and metabolism]. 575
307 223208 COG0130 TruB tRNA U55 pseudouridine synthase TruB, may also work on U342 of tmRNA [Translation, ribosomal structure and biogenesis]. 271
308 223209 COG0131 HisB2 Imidazoleglycerol phosphate dehydratase HisB [Amino acid transport and metabolism]. 195
309 223210 COG0132 BioD Dethiobiotin synthetase [Coenzyme transport and metabolism]. 223
310 223211 COG0133 TrpB Tryptophan synthase beta chain [Amino acid transport and metabolism]. 396
311 223212 COG0134 TrpC Indole-3-glycerol phosphate synthase [Amino acid transport and metabolism]. 254
312 223213 COG0135 TrpF Phosphoribosylanthranilate isomerase [Amino acid transport and metabolism]. 208
313 223214 COG0136 Asd Aspartate-semialdehyde dehydrogenase [Amino acid transport and metabolism]. 334
314 223215 COG0137 ArgG Argininosuccinate synthase [Amino acid transport and metabolism]. 403
315 223216 COG0138 PurH AICAR transformylase/IMP cyclohydrolase PurH [Nucleotide transport and metabolism]. 515
316 223217 COG0139 HisI1 Phosphoribosyl-AMP cyclohydrolase [Amino acid transport and metabolism]. 111
317 223218 COG0140 HisI2 Phosphoribosyl-ATP pyrophosphohydrolase [Amino acid transport and metabolism]. 92
318 223219 COG0141 HisD Histidinol dehydrogenase [Amino acid transport and metabolism]. 425
319 223220 COG0142 IspA Geranylgeranyl pyrophosphate synthase [Coenzyme transport and metabolism]. 322
320 223221 COG0143 MetG Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 558
321 223222 COG0144 RsmB 16S rRNA C967 or C1407 C5-methylase, RsmB/RsmF family [Translation, ribosomal structure and biogenesis]. 355
322 223223 COG0145 HyuA N-methylhydantoinase A/oxoprolinase/acetone carboxylase, beta subunit [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 674
323 223224 COG0146 HyuB N-methylhydantoinase B/oxoprolinase/acetone carboxylase, alpha subunit [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 563
324 223225 COG0147 TrpE Anthranilate/para-aminobenzoate synthases component I [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 462
325 223226 COG0148 Eno Enolase [Carbohydrate transport and metabolism]. 423
326 223227 COG0149 TpiA Triosephosphate isomerase [Carbohydrate transport and metabolism]. 251
327 223228 COG0150 PurM Phosphoribosylaminoimidazole (AIR) synthetase [Nucleotide transport and metabolism]. 345
328 223229 COG0151 PurD Phosphoribosylamine-glycine ligase [Nucleotide transport and metabolism]. 428
329 223230 COG0152 PurC Phosphoribosylaminoimidazole-succinocarboxamide synthase [Nucleotide transport and metabolism]. 247
330 223231 COG0153 GalK Galactokinase [Carbohydrate transport and metabolism]. 390
331 223232 COG0154 GatA Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit or related amidase [Translation, ribosomal structure and biogenesis]. 475
332 223233 COG0155 CysI Sulfite reductase, beta subunit (hemoprotein) [Inorganic ion transport and metabolism]. 510
333 223234 COG0156 BioF 7-keto-8-aminopelargonate synthetase or related enzyme [Coenzyme transport and metabolism]. 388
334 223235 COG0157 NadC Nicotinate-nucleotide pyrophosphorylase [Coenzyme transport and metabolism]. 280
335 223236 COG0158 Fbp Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 326
336 223237 COG0159 TrpA Tryptophan synthase alpha chain [Amino acid transport and metabolism]. 265
337 223238 COG0160 GabT 4-aminobutyrate aminotransferase or related aminotransferase [Amino acid transport and metabolism]. 447
338 223239 COG0161 BioA Adenosylmethionine-8-amino-7-oxononanoate aminotransferase [Coenzyme transport and metabolism]. 449
339 223240 COG0162 TyrS Tyrosyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 401
340 223241 COG0163 UbiX UbiX family flavin prenyltransferase [Coenzyme transport and metabolism]. 191
341 223242 COG0164 RnhB Ribonuclease HII [Replication, recombination and repair]. 199
342 223243 COG0165 ArgH Argininosuccinate lyase [Amino acid transport and metabolism]. 459
343 223244 COG0166 Pgi Glucose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 446
344 223245 COG0167 PyrD Dihydroorotate dehydrogenase [Nucleotide transport and metabolism]. 310
345 223246 COG0168 TrkG Trk-type K+ transport system, membrane component [Inorganic ion transport and metabolism]. 499
346 223247 COG0169 AroE Shikimate 5-dehydrogenase [Amino acid transport and metabolism]. 283
347 223248 COG0170 SEC59 Dolichol kinase [Posttranslational modification, protein turnover, chaperones]. 216
348 223249 COG0171 NadE NH3-dependent NAD+ synthetase [Coenzyme transport and metabolism]. 268
349 223250 COG0172 SerS Seryl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 429
350 223251 COG0173 AspS Aspartyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 585
351 223252 COG0174 GlnA Glutamine synthetase [Amino acid transport and metabolism]. 443
352 223253 COG0175 CysH 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase or related enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 261
353 223254 COG0176 TalA Transaldolase [Carbohydrate transport and metabolism]. 239
354 223255 COG0177 Nth Endonuclease III [Replication, recombination and repair]. 211
355 223256 COG0178 UvrA Excinuclease UvrABC ATPase subunit [Replication, recombination and repair]. 935
356 223257 COG0179 MhpD 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) [Secondary metabolites biosynthesis, transport and catabolism]. 266
357 223258 COG0180 TrpS Tryptophanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 314
358 223259 COG0181 HemC Porphobilinogen deaminase [Coenzyme transport and metabolism]. 307
359 223260 COG0182 MtnA Methylthioribose-1-phosphate isomerase (methionine salvage pathway), a paralog of eIF-2B alpha subunit [Amino acid transport and metabolism]. 346
360 223261 COG0183 PaaJ Acetyl-CoA acetyltransferase [Lipid transport and metabolism]. 392
361 223262 COG0184 RpsO Ribosomal protein S15P/S13E [Translation, ribosomal structure and biogenesis]. 89
362 223263 COG0185 RpsS Ribosomal protein S19 [Translation, ribosomal structure and biogenesis]. 93
363 223264 COG0186 RpsQ Ribosomal protein S17 [Translation, ribosomal structure and biogenesis]. 87
364 223265 COG0187 GyrB DNA gyrase/topoisomerase IV, subunit B [Replication, recombination and repair]. 635
365 223266 COG0188 GyrA DNA gyrase/topoisomerase IV, subunit A [Replication, recombination and repair]. 804
366 223267 COG0189 RimK Glutathione synthase/RimK-type ligase, ATP-grasp superfamily [Coenzyme transport and metabolism, Translation, ribosomal structure and biogenesis]. 318
367 223268 COG0190 FolD 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase [Coenzyme transport and metabolism]. 283
368 223269 COG0191 Fba Fructose/tagatose bisphosphate aldolase [Carbohydrate transport and metabolism]. 286
369 223270 COG0192 MetK S-adenosylmethionine synthetase [Coenzyme transport and metabolism]. 388
370 223271 COG0193 Pth Peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 190
371 223272 COG0194 Gmk Guanylate kinase [Nucleotide transport and metabolism]. 191
372 223273 COG0195 NusA Transcription antitermination factor NusA, contains S1 and KH domains [Transcription]. 190
373 223274 COG0196 RibF FAD synthase [Coenzyme transport and metabolism]. 304
374 223275 COG0197 RplP Ribosomal protein L16/L10AE [Translation, ribosomal structure and biogenesis]. 146
375 223276 COG0198 RplX Ribosomal protein L24 [Translation, ribosomal structure and biogenesis]. 104
376 223277 COG0199 RpsN Ribosomal protein S14 [Translation, ribosomal structure and biogenesis]. 61
377 223278 COG0200 RplO Ribosomal protein L15 [Translation, ribosomal structure and biogenesis]. 152
378 223279 COG0201 SecY Preprotein translocase subunit SecY [Intracellular trafficking, secretion, and vesicular transport]. 436
379 223280 COG0202 RpoA DNA-directed RNA polymerase, alpha subunit/40 kD subunit [Transcription]. 317
380 223281 COG0203 RplQ Ribosomal protein L17 [Translation, ribosomal structure and biogenesis]. 116
381 223282 COG0204 PlsC 1-acyl-sn-glycerol-3-phosphate acyltransferase [Lipid transport and metabolism]. 255
382 223283 COG0205 PfkA 6-phosphofructokinase [Carbohydrate transport and metabolism]. 347
383 223284 COG0206 FtsZ Cell division GTPase FtsZ [Cell cycle control, cell division, chromosome partitioning]. 338
384 223285 COG0207 ThyA Thymidylate synthase [Nucleotide transport and metabolism]. 268
385 223286 COG0208 NrdF Ribonucleotide reductase beta subunit, ferritin-like domain [Nucleotide transport and metabolism]. 348
386 223287 COG0209 NrdA Ribonucleotide reductase alpha subunit [Nucleotide transport and metabolism]. 651
387 223288 COG0210 UvrD Superfamily I DNA or RNA helicase [Replication, recombination and repair]. 655
388 223289 COG0211 RpmA Ribosomal protein L27 [Translation, ribosomal structure and biogenesis]. 87
389 223290 COG0212 FAU1 5-formyltetrahydrofolate cyclo-ligase [Coenzyme transport and metabolism]. 191
390 223291 COG0213 DeoA Thymidine phosphorylase [Nucleotide transport and metabolism]. 435
391 223292 COG0214 PdxS Pyridoxal biosynthesis lyase PdxS [Coenzyme transport and metabolism]. 296
392 223293 COG0215 CysS Cysteinyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 464
393 223294 COG0216 PrfA Protein chain release factor A [Translation, ribosomal structure and biogenesis]. 363
394 223295 COG0217 TACO1 Transcriptional and/or translational regulatory protein YebC/TACO1 [Transcription, Translation, ribosomal structure and biogenesis]. 241
395 223296 COG0218 EngB GTP-binding protein EngB required for normal cell division [Cell cycle control, cell division, chromosome partitioning]. 200
396 223297 COG0219 TrmL tRNA(Leu) C34 or U34 (ribose-2'-O)-methylase TrmL, contains SPOUT domain [Translation, ribosomal structure and biogenesis]. 155
397 223298 COG0220 TrmB tRNA G46 methylase TrmB [Translation, ribosomal structure and biogenesis]. 227
398 223299 COG0221 Ppa Inorganic pyrophosphatase [Energy production and conversion, Inorganic ion transport and metabolism]. 171
399 223300 COG0222 RplL Ribosomal protein L7/L12 [Translation, ribosomal structure and biogenesis]. 124
400 223301 COG0223 Fmt Methionyl-tRNA formyltransferase [Translation, ribosomal structure and biogenesis]. 307
401 223302 COG0224 AtpG FoF1-type ATP synthase, gamma subunit [Energy production and conversion]. 287
402 223303 COG0225 MsrA Peptide methionine sulfoxide reductase MsrA [Posttranslational modification, protein turnover, chaperones]. 174
403 223304 COG0226 PstS ABC-type phosphate transport system, periplasmic component [Inorganic ion transport and metabolism]. 318
404 223305 COG0227 RpmB Ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 77
405 223306 COG0228 RpsP Ribosomal protein S16 [Translation, ribosomal structure and biogenesis]. 87
406 223307 COG0229 MsrB Peptide methionine sulfoxide reductase MsrB [Posttranslational modification, protein turnover, chaperones]. 140
407 223308 COG0230 RpmH Ribosomal protein L34 [Translation, ribosomal structure and biogenesis]. 44
408 223309 COG0231 Efp Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. 131
409 223310 COG0232 Dgt dGTP triphosphohydrolase [Nucleotide transport and metabolism]. 412
410 223311 COG0233 Frr Ribosome recycling factor [Translation, ribosomal structure and biogenesis]. 187
411 223312 COG0234 GroES Co-chaperonin GroES (HSP10) [Posttranslational modification, protein turnover, chaperones]. 96
412 223313 COG0235 AraD Ribulose-5-phosphate 4-epimerase/Fuculose-1-phosphate aldolase [Carbohydrate transport and metabolism]. 219
413 223314 COG0236 AcpP Acyl carrier protein [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 80
414 223315 COG0237 CoaE Dephospho-CoA kinase [Coenzyme transport and metabolism]. 201
415 223316 COG0238 RpsR Ribosomal protein S18 [Translation, ribosomal structure and biogenesis]. 75
416 223317 COG0239 CrcB Fluoride ion exporter CrcB/FEX, affects chromosome condensation [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism]. 126
417 223318 COG0240 GpsA Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 329
418 223319 COG0241 HisB1 Histidinol phosphatase or a related phosphatase [Amino acid transport and metabolism]. 181
419 223320 COG0242 Def Peptide deformylase [Translation, ribosomal structure and biogenesis]. 168
420 223321 COG0243 BisC Anaerobic selenocysteine-containing dehydrogenase [Energy production and conversion]. 765
421 223322 COG0244 RplJ Ribosomal protein L10 [Translation, ribosomal structure and biogenesis]. 175
422 223323 COG0245 IspF 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase [Lipid transport and metabolism]. 159
423 223324 COG0246 MtlD Mannitol-1-phosphate/altronate dehydrogenases [Carbohydrate transport and metabolism]. 473
424 223325 COG0247 GlpC Fe-S oxidoreductase [Energy production and conversion]. 388
425 223326 COG0248 GppA Exopolyphosphatase/pppGpp-phosphohydrolase [Nucleotide transport and metabolism, Signal transduction mechanisms, Inorganic ion transport and metabolism]. 492
426 223327 COG0249 MutS DNA mismatch repair ATPase MutS [Replication, recombination and repair]. 843
427 223328 COG0250 NusG Transcription antitermination factor NusG [Transcription]. 178
428 223329 COG0251 RidA Enamine deaminase RidA, house cleaning of reactive enamine intermediates, YjgF/YER057c/UK114 family [Defense mechanisms]. 130
429 223330 COG0252 AnsA L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D [Translation, ribosomal structure and biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 351
430 223331 COG0253 DapF Diaminopimelate epimerase [Amino acid transport and metabolism]. 272
431 223332 COG0254 RpmE Ribosomal protein L31 [Translation, ribosomal structure and biogenesis]. 75
432 223333 COG0255 RpmC Ribosomal protein L29 [Translation, ribosomal structure and biogenesis]. 69
433 223334 COG0256 RplR Ribosomal protein L18 [Translation, ribosomal structure and biogenesis]. 125
434 223335 COG0257 RpmJ Ribosomal protein L36 [Translation, ribosomal structure and biogenesis]. 38
435 223336 COG0258 Exo 5'-3' exonuclease [Replication, recombination and repair]. 310
436 223337 COG0259 PdxH Pyridoxine/pyridoxamine 5'-phosphate oxidase [Coenzyme transport and metabolism]. 214
437 223338 COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]. 485
438 223339 COG0261 RplU Ribosomal protein L21 [Translation, ribosomal structure and biogenesis]. 103
439 223340 COG0262 FolA Dihydrofolate reductase [Coenzyme transport and metabolism]. 167
440 223341 COG0263 ProB Glutamate 5-kinase [Amino acid transport and metabolism]. 369
441 223342 COG0264 Tsf Translation elongation factor EF-Ts [Translation, ribosomal structure and biogenesis]. 296
442 223343 COG0265 DegQ Periplasmic serine protease, S1-C subfamily, contains C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]. 347
443 223344 COG0266 Nei Formamidopyrimidine-DNA glycosylase [Replication, recombination and repair]. 273
444 223345 COG0267 RpmG Ribosomal protein L33 [Translation, ribosomal structure and biogenesis]. 50
445 223346 COG0268 RpsT Ribosomal protein S20 [Translation, ribosomal structure and biogenesis]. 88
446 223347 COG0269 SgbH 3-keto-L-gulonate-6-phosphate decarboxylase [Carbohydrate transport and metabolism]. 217
447 223348 COG0270 Dcm Site-specific DNA-cytosine methylase [Replication, recombination and repair]. 328
448 223349 COG0271 BolA Stress-induced morphogen (activity unknown) [Signal transduction mechanisms]. 90
449 223350 COG0272 Lig NAD-dependent DNA ligase [Replication, recombination and repair]. 667
450 223351 COG0274 DeoC Deoxyribose-phosphate aldolase [Nucleotide transport and metabolism]. 228
451 223352 COG0275 RmsH 16S rRNA C1402 N4-methylase RsmH [Translation, ribosomal structure and biogenesis]. 314
452 223353 COG0276 HemH Protoheme ferro-lyase (ferrochelatase) [Coenzyme transport and metabolism]. 320
453 223354 COG0277 GlcD FAD/FMN-containing dehydrogenase [Energy production and conversion]. 459
454 223355 COG0278 GrxD Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 105
455 223356 COG0279 GmhA Phosphoheptose isomerase [Carbohydrate transport and metabolism]. 176
456 223357 COG0280 Pta Phosphotransacetylase [Energy production and conversion]. 327
457 223358 COG0281 SfcA Malic enzyme [Energy production and conversion]. 432
458 223359 COG0282 AckA Acetate kinase [Energy production and conversion]. 396
459 223360 COG0283 Cmk Cytidylate kinase [Nucleotide transport and metabolism]. 222
460 223361 COG0284 PyrF Orotidine-5'-phosphate decarboxylase [Nucleotide transport and metabolism]. 240
461 223362 COG0285 FolC Folylpolyglutamate synthase/Dihydropteroate synthase [Coenzyme transport and metabolism]. 427
462 223363 COG0286 HsdM Type I restriction-modification system, DNA methylase subunit [Defense mechanisms]. 489
463 223364 COG0287 TyrA Prephenate dehydrogenase [Amino acid transport and metabolism]. 279
464 223365 COG0288 CynT Carbonic anhydrase [Inorganic ion transport and metabolism]. 207
465 223366 COG0289 DapB Dihydrodipicolinate reductase [Amino acid transport and metabolism]. 266
466 223367 COG0290 InfC Translation initiation factor IF-3 [Translation, ribosomal structure and biogenesis]. 176
467 223368 COG0291 RpmI Ribosomal protein L35 [Translation, ribosomal structure and biogenesis]. 65
468 223369 COG0292 RplT Ribosomal protein L20 [Translation, ribosomal structure and biogenesis]. 118
469 223370 COG0293 RlmE 23S rRNA U2552 (ribose-2'-O)-methylase RlmE/FtsJ [Translation, ribosomal structure and biogenesis]. 205
470 223371 COG0294 FolP Dihydropteroate synthase [Coenzyme transport and metabolism]. 274
471 223372 COG0295 Cdd Cytidine deaminase [Nucleotide transport and metabolism]. 134
472 223373 COG0296 GlgB 1,4-alpha-glucan branching enzyme [Carbohydrate transport and metabolism]. 628
473 223374 COG0297 GlgA Glycogen synthase [Carbohydrate transport and metabolism]. 487
474 223375 COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 82
475 223376 COG0299 PurN Folate-dependent phosphoribosylglycinamide formyltransferase PurN [Nucleotide transport and metabolism]. 200
476 223377 COG0300 DltE Short-chain dehydrogenase [General function prediction only]. 265
477 223378 COG0301 ThiI Adenylyl- and sulfurtransferase ThiI, participates in tRNA 4-thiouridine and thiamine biosynthesis [Coenzyme transport and metabolism, Translation, ribosomal structure and biogenesis]. 383
478 223379 COG0302 FolE GTP cyclohydrolase I [Coenzyme transport and metabolism]. 195
479 223380 COG0303 MoeA Molybdopterin biosynthesis enzyme [Coenzyme transport and metabolism]. 404
480 223381 COG0304 FabB 3-oxoacyl-(acyl-carrier-protein) synthase [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 412
481 223382 COG0305 DnaB Replicative DNA helicase [Replication, recombination and repair]. 435
482 223383 COG0306 PitA Phosphate/sulfate permease [Inorganic ion transport and metabolism]. 326
483 223384 COG0307 RibC Riboflavin synthase alpha chain [Coenzyme transport and metabolism]. 204
484 223385 COG0308 PepN Aminopeptidase N [Amino acid transport and metabolism]. 859
485 223386 COG0309 HypE Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 339
486 223387 COG0310 CbiM ABC-type Co2+ transport system, permease component [Inorganic ion transport and metabolism]. 204
487 223388 COG0311 PdxT Glutamine amidotransferase PdxT (pyridoxal biosynthesis) [Coenzyme transport and metabolism]. 194
488 223389 COG0312 TldD Predicted Zn-dependent protease or its inactivated homolog [General function prediction only]. 454
489 223390 COG0313 RsmI 16S rRNA C1402 (ribose-2'-O) methylase RsmI [Translation, ribosomal structure and biogenesis]. 275
490 223391 COG0314 MoaE Molybdopterin synthase catalytic subunit [Coenzyme transport and metabolism]. 149
491 223392 COG0315 MoaC Molybdenum cofactor biosynthesis enzyme [Coenzyme transport and metabolism]. 157
492 223393 COG0316 IscA Fe-S cluster assembly iron-binding protein IscA [Posttranslational modification, protein turnover, chaperones]. 110
493 223394 COG0317 SpoT (p)ppGpp synthase/hydrolase, HD superfamily [Signal transduction mechanisms, Transcription]. 701
494 223395 COG0318 CaiC Acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 534
495 223396 COG0319 YbeY ssRNA-specific RNase YbeY, 16S rRNA maturation enzyme [Translation, ribosomal structure and biogenesis]. 153
496 223397 COG0320 LipA Lipoate synthase [Coenzyme transport and metabolism]. 306
497 223398 COG0321 LipB Lipoate-protein ligase B [Coenzyme transport and metabolism]. 221
498 223399 COG0322 UvrC Excinuclease UvrABC, nuclease subunit [Replication, recombination and repair]. 581
499 223400 COG0323 MutL DNA mismatch repair ATPase MutL [Replication, recombination and repair]. 638
500 223401 COG0324 MiaA tRNA A37 N6-isopentenyltransferase MiaA [Translation, ribosomal structure and biogenesis]. 308
501 223402 COG0325 YggS Uncharacterized pyridoxal phosphate-containing protein, affects Ilv metabolism, UPF0001 family [General function prediction only]. 228
502 223403 COG0326 HtpG Molecular chaperone, HSP90 family [Posttranslational modification, protein turnover, chaperones]. 623
503 223404 COG0327 NIF3 Putative GTP cyclohydrolase 1 type 2, NIF3 family [Coenzyme transport and metabolism]. 250
504 223405 COG0328 RnhA Ribonuclease HI [Replication, recombination and repair]. 154
505 223406 COG0329 DapA Dihydrodipicolinate synthase/N-acetylneuraminate lyase [Amino acid transport and metabolism, Cell wall/membrane/envelope biogenesis]. 299
506 223407 COG0330 HflC Regulator of protease activity HflC, stomatin/prohibitin superfamily [Posttranslational modification, protein turnover, chaperones]. 291
507 223408 COG0331 FabD Malonyl CoA-acyl carrier protein transacylase [Lipid transport and metabolism]. 310
508 223409 COG0332 FabH 3-oxoacyl-[acyl-carrier-protein] synthase III [Lipid transport and metabolism]. 323
509 223410 COG0333 RpmF Ribosomal protein L32 [Translation, ribosomal structure and biogenesis]. 57
510 223411 COG0334 GdhA Glutamate dehydrogenase/leucine dehydrogenase [Amino acid transport and metabolism]. 411
511 223412 COG0335 RplS Ribosomal protein L19 [Translation, ribosomal structure and biogenesis]. 115
512 223413 COG0336 TrmD tRNA G37 N-methylase TrmD [Translation, ribosomal structure and biogenesis]. 240
513 223414 COG0337 AroB 3-dehydroquinate synthetase [Amino acid transport and metabolism]. 360
514 223415 COG0338 Dam Site-specific DNA-adenine methylase [Replication, recombination and repair]. 274
515 223416 COG0339 Dcp Zn-dependent oligopeptidase [Posttranslational modification, protein turnover, chaperones]. 683
516 223417 COG0340 BirA2 Biotin-(acetyl-CoA carboxylase) ligase [Coenzyme transport and metabolism]. 238
517 223418 COG0341 SecF Preprotein translocase subunit SecF [Intracellular trafficking, secretion, and vesicular transport]. 305
518 223419 COG0342 SecD Preprotein translocase subunit SecD [Intracellular trafficking, secretion, and vesicular transport]. 506
519 223420 COG0343 Tgt Queuine/archaeosine tRNA-ribosyltransferase [Translation, ribosomal structure and biogenesis]. 372
520 223421 COG0344 PlsY Phospholipid biosynthesis protein PlsY, probable glycerol-3-phosphate acyltransferase [Lipid transport and metabolism]. 200
521 223422 COG0345 ProC Pyrroline-5-carboxylate reductase [Amino acid transport and metabolism]. 266
522 223423 COG0346 GloA Catechol 2,3-dioxygenase or other lactoylglutathione lyase family enzyme [Secondary metabolites biosynthesis, transport and catabolism]. 138
523 223424 COG0347 GlnK Nitrogen regulatory protein PII [Signal transduction mechanisms, Amino acid transport and metabolism]. 112
524 223425 COG0348 NapH Polyferredoxin [Energy production and conversion]. 386
525 223426 COG0349 Rnd Ribonuclease D [Translation, ribosomal structure and biogenesis]. 361
526 223427 COG0350 AdaB O6-methylguanine-DNA--protein-cysteine methyltransferase [Replication, recombination and repair]. 168
527 223428 COG0351 ThiD Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase [Coenzyme transport and metabolism]. 263
528 223429 COG0352 ThiE Thiamine monophosphate synthase [Coenzyme transport and metabolism]. 211
529 223430 COG0353 RecR Recombinational DNA repair protein RecR [Replication, recombination and repair]. 198
530 223431 COG0354 YgfZ Folate-binding Fe-S cluster repair protein YgfZ, possible role in tRNA modification [Posttranslational modification, protein turnover, chaperones]. 305
531 223432 COG0355 AtpC FoF1-type ATP synthase, epsilon subunit [Energy production and conversion]. 135
532 223433 COG0356 AtpB FoF1-type ATP synthase, membrane subunit a [Energy production and conversion]. 246
533 223434 COG0357 RsmG 16S rRNA G527 N7-methylase RsmG (former glucose-inhibited division protein B) [Translation, ribosomal structure and biogenesis]. 215
534 223435 COG0358 DnaG DNA primase (bacterial type) [Replication, recombination and repair]. 568
535 223436 COG0359 RplI Ribosomal protein L9 [Translation, ribosomal structure and biogenesis]. 148
536 223437 COG0360 RpsF Ribosomal protein S6 [Translation, ribosomal structure and biogenesis]. 112
537 223438 COG0361 InfA Translation initiation factor IF-1 [Translation, ribosomal structure and biogenesis]. 75
538 223439 COG0362 Gnd 6-phosphogluconate dehydrogenase [Carbohydrate transport and metabolism]. 473
539 223440 COG0363 NagB 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase [Carbohydrate transport and metabolism]. 238
540 223441 COG0364 Zwf Glucose-6-phosphate 1-dehydrogenase [Carbohydrate transport and metabolism]. 483
541 223442 COG0365 Acs Acyl-coenzyme A synthetase/AMP-(fatty) acid ligase [Lipid transport and metabolism]. 528
542 223443 COG0366 AmyA Glycosidase [Carbohydrate transport and metabolism]. 505
543 223444 COG0367 AsnB Asparagine synthetase B (glutamine-hydrolyzing) [Amino acid transport and metabolism]. 542
544 223445 COG0368 CobS Cobalamin synthase [Coenzyme transport and metabolism]. 246
545 223446 COG0369 CysJ Sulfite reductase, alpha subunit (flavoprotein) [Inorganic ion transport and metabolism]. 587
546 223447 COG0370 FeoB Fe2+ transport system protein B [Inorganic ion transport and metabolism]. 653
547 223448 COG0371 GldA Glycerol dehydrogenase or related enzyme, iron-containing ADH family [Energy production and conversion]. 360
548 223449 COG0372 GltA Citrate synthase [Energy production and conversion]. 390
549 223450 COG0373 HemA Glutamyl-tRNA reductase [Coenzyme transport and metabolism]. 414
550 223451 COG0374 HyaB Ni,Fe-hydrogenase I large subunit [Energy production and conversion]. 545
551 223452 COG0375 HybF Hydrogenase maturation metallochaperone HypA/HybF, involved in Ni insertion [Posttranslational modification, protein turnover, chaperones]. 115
552 223453 COG0376 KatG Catalase (peroxidase I) [Inorganic ion transport and metabolism]. 730
553 223454 COG0377 NuoB NADH:ubiquinone oxidoreductase 20 kD subunit (chain B) or related Fe-S oxidoreductase [Energy production and conversion]. 194
554 223455 COG0378 HypB Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase [Posttranslational modification, protein turnover, chaperones]. 202
555 223456 COG0379 NadA Quinolinate synthase [Coenzyme transport and metabolism]. 324
556 223457 COG0380 OtsA Trehalose-6-phosphate synthase [Carbohydrate transport and metabolism]. 486
557 223458 COG0381 WecB UDP-N-acetylglucosamine 2-epimerase [Cell wall/membrane/envelope biogenesis]. 383
558 223459 COG0382 UbiA 4-hydroxybenzoate polyprenyltransferase [Coenzyme transport and metabolism]. 289
559 223460 COG0383 AMS1 Alpha-mannosidase [Carbohydrate transport and metabolism]. 943
560 223461 COG0384 YHI9 Predicted epimerase YddE/YHI9, PhzF superfamily [General function prediction only]. 291
561 223462 COG0385 YfeH Predicted Na+-dependent transporter [General function prediction only]. 319
562 223463 COG0386 BtuE Glutathione peroxidase, house-cleaning role in reducing lipid peroxides [Defense mechanisms, Lipid transport and metabolism]. 162
563 223464 COG0387 ChaA Ca2+/H+ antiporter [Inorganic ion transport and metabolism]. 368
564 223465 COG0388 YafV Predicted amidohydrolase [General function prediction only]. 274
565 223466 COG0389 DinP Nucleotidyltransferase/DNA polymerase involved in DNA repair [Replication, recombination and repair]. 354
566 223467 COG0390 FetB ABC-type iron transport system FetAB, permease component [Inorganic ion transport and metabolism]. 256
567 223468 COG0391 CofD Archaeal 2-phospho-L-lactate transferase/Bacterial gluconeogenesis factor, CofD/UPF0052 family [Coenzyme transport and metabolism, Carbohydrate transport and metabolism]. 323
568 223469 COG0392 AglD2 Uncharacterized membrane protein YbhN, UPF0104 family [Function unknown]. 322
569 223470 COG0393 YbjQ Uncharacterized conserved protein YbjQ, UPF0145 family [Function unknown]. 108
570 223471 COG0394 Wzb Protein-tyrosine-phosphatase [Signal transduction mechanisms]. 139
571 223472 COG0395 UgpE ABC-type glycerol-3-phosphate transport system, permease component [Carbohydrate transport and metabolism]. 281
572 223473 COG0396 SufC Fe-S cluster assembly ATPase SufC [Posttranslational modification, protein turnover, chaperones]. 251
573 223474 COG0397 YdiU Uncharacterized conserved protein YdiU, UPF0061 family [Function unknown]. 488
574 223475 COG0398 TVP38 Uncharacterized membrane protein YdjX, TVP38/TMEM64 family, SNARE-associated domain [Function unknown]. 223
575 223476 COG0399 WecE dTDP-4-amino-4,6-dideoxygalactose transaminase [Cell wall/membrane/envelope biogenesis]. 374
576 223477 COG0400 YpfH Predicted esterase [General function prediction only]. 207
577 223478 COG0401 YqaE Uncharacterized membrane protein YqaE, homolog of Blt101, UPF0057 family [Function unknown]. 56
578 223479 COG0402 SsnA Cytosine/adenosine deaminase or related metal-dependent hydrolase [Nucleotide transport and metabolism, General function prediction only]. 421
579 223480 COG0403 GcvP1 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain [Amino acid transport and metabolism]. 450
580 223481 COG0404 GcvT Glycine cleavage system T protein (aminomethyltransferase) [Amino acid transport and metabolism]. 379
581 223482 COG0405 Ggt Gamma-glutamyltranspeptidase [Amino acid transport and metabolism]. 539
582 223483 COG0406 PhoE Broad specificity phosphatase PhoE [Carbohydrate transport and metabolism]. 208
583 223484 COG0407 HemE Uroporphyrinogen-III decarboxylase [Coenzyme transport and metabolism]. 352
584 223485 COG0408 HemF Coproporphyrinogen III oxidase [Coenzyme transport and metabolism]. 303
585 223486 COG0409 HypD Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 364
586 223487 COG0410 LivF ABC-type branched-chain amino acid transport system, ATPase component [Amino acid transport and metabolism]. 237
587 223488 COG0411 LivG ABC-type branched-chain amino acid transport system, ATPase component [Amino acid transport and metabolism]. 250
588 223489 COG0412 DLH Dienelactone hydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 236
589 223490 COG0413 PanB Ketopantoate hydroxymethyltransferase [Coenzyme transport and metabolism]. 268
590 223491 COG0414 PanC Pantothenate synthetase [Coenzyme transport and metabolism]. 285
591 223492 COG0415 PhrB Deoxyribodipyrimidine photolyase [Replication, recombination and repair]. 461
592 223493 COG0416 PlsX Fatty acid/phospholipid biosynthesis enzyme [Lipid transport and metabolism]. 338
593 223494 COG0417 PolB DNA polymerase elongation subunit (family B) [Replication, recombination and repair]. 792
594 223495 COG0418 PyrC Dihydroorotase [Nucleotide transport and metabolism]. 344
595 223496 COG0419 SbcC DNA repair exonuclease SbcCD ATPase subunit [Replication, recombination and repair]. 908
596 223497 COG0420 SbcD DNA repair exonuclease SbcCD nuclease subunit [Replication, recombination and repair]. 390
597 223498 COG0421 SpeE Spermidine synthase [Amino acid transport and metabolism]. 282
598 223499 COG0422 ThiC Thiamine biosynthesis protein ThiC [Coenzyme transport and metabolism]. 432
599 223500 COG0423 GRS1 Glycyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 558
600 223501 COG0424 Maf Predicted house-cleaning NTP pyrophosphatase, Maf/HAM1 superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 193
601 223502 COG0425 TusA TusA-related sulfurtransferase [Posttranslational modification, protein turnover, chaperones]. 78
602 223503 COG0426 NorV Flavorubredoxin [Energy production and conversion]. 388
603 223504 COG0427 ACH1 Acyl-CoA hydrolase [Energy production and conversion]. 501
604 223505 COG0428 ZupT Zinc transporter ZupT [Inorganic ion transport and metabolism]. 266
605 223506 COG0429 YheT Predicted hydrolase of the alpha/beta-hydrolase fold [General function prediction only]. 345
606 223507 COG0430 RCL1 RNA 3'-terminal phosphate cyclase [RNA processing and modification]. 341
607 223508 COG0431 SsuE NAD(P)H-dependent FMN reductase [Energy production and conversion]. 184
608 223509 COG0432 YjbQ Thiamin phosphate synthase YjbQ, UPF0047 family [Coenzyme transport and metabolism]. 137
609 223510 COG0433 YjgR Archaeal DNA helicase HerA or a related bacterial ATPase, contains HAS-barrel and ATPase domains [Replication, recombination and repair]. 520
610 223511 COG0434 SgcQ Predicted TIM-barrel enzyme [General function prediction only]. 263
611 223512 COG0435 ECM4 Glutathionyl-hydroquinone reductase [Energy production and conversion]. 324
612 223513 COG0436 AspB Aspartate/methionine/tyrosine aminotransferase [Amino acid transport and metabolism]. 393
613 223514 COG0437 HybA Fe-S-cluster-containing dehydrogenase component [Energy production and conversion]. 203
614 223515 COG0438 RfaB Glycosyltransferase involved in cell wall biosynthesis [Cell wall/membrane/envelope biogenesis]. 381
615 223516 COG0439 AccC Biotin carboxylase [Lipid transport and metabolism]. 449
616 223517 COG0440 IlvH Acetolactate synthase, small subunit [Amino acid transport and metabolism]. 163
617 223518 COG0441 ThrS Threonyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 589
618 223519 COG0442 ProS Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 500
619 223520 COG0443 DnaK Molecular chaperone DnaK (HSP70) [Posttranslational modification, protein turnover, chaperones]. 579
620 223521 COG0444 DppD ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 316
621 223522 COG0445 MnmG tRNA U34 5-carboxymethylaminomethyl modifying enzyme MnmG/GidA [Translation, ribosomal structure and biogenesis]. 621
622 223523 COG0446 FadH2 NADPH-dependent 2,4-dienoyl-CoA reductase, sulfur reductase, or a related oxidoreductase [Lipid transport and metabolism]. 415
623 223524 COG0447 MenB 1,4-Dihydroxy-2-naphthoyl-CoA synthase [Coenzyme transport and metabolism]. 282
624 223525 COG0448 GlgC ADP-glucose pyrophosphorylase [Carbohydrate transport and metabolism]. 393
625 223526 COG0449 GlmS Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains [Cell wall/membrane/envelope biogenesis]. 597
626 223527 COG0450 AhpC Alkyl hydroperoxide reductase subunit AhpC (peroxiredoxin) [Defense mechanisms]. 194
627 223528 COG0451 WcaG Nucleoside-diphosphate-sugar epimerase [Cell wall/membrane/envelope biogenesis]. 314
628 223529 COG0452 CoaBC Phosphopantothenoylcysteine synthetase/decarboxylase [Coenzyme transport and metabolism]. 392
629 223530 COG0454 PhnO N-acetyltransferase, GNAT superfamily (includes histone acetyltransferase HPA2) [Transcription, General function prediction only]. 156
630 223531 COG0455 FlhG MinD-like ATPase involved in chromosome partitioning or flagellar assembly [Cell cycle control, cell division, chromosome partitioning, Cell motility]. 262
631 223532 COG0456 RimI Ribosomal protein S18 acetylase RimI and related acetyltransferases [Translation, ribosomal structure and biogenesis]. 177
632 223533 COG0457 TPR Tetratricopeptide (TPR) repeat [General function prediction only]. 291
633 223534 COG0458 CarB Carbamoylphosphate synthase large subunit [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 400
634 223535 COG0459 GroEL Chaperonin GroEL (HSP60 family) [Posttranslational modification, protein turnover, chaperones]. 524
635 223536 COG0460 ThrA Homoserine dehydrogenase [Amino acid transport and metabolism]. 333
636 223537 COG0461 PyrE Orotate phosphoribosyltransferase [Nucleotide transport and metabolism]. 201
637 223538 COG0462 PrsA Phosphoribosylpyrophosphate synthetase [Nucleotide transport and metabolism, Amino acid transport and metabolism]. 314
638 223539 COG0463 WcaA Glycosyltransferase involved in cell wall biosynthesis [Cell wall/membrane/envelope biogenesis]. 291
639 223540 COG0464 SpoVK AAA+-type ATPase, SpoVK/Ycf46/Vps4 family [Cell wall/membrane/envelope biogenesis, Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 494
640 223541 COG0465 HflB ATP-dependent Zn proteases [Posttranslational modification, protein turnover, chaperones]. 596
641 223542 COG0466 Lon ATP-dependent Lon protease, bacterial type [Posttranslational modification, protein turnover, chaperones]. 782
642 223543 COG0467 RAD55 RecA-superfamily ATPase, KaiC/GvpD/RAD55 family [Signal transduction mechanisms]. 260
643 223544 COG0468 RecA RecA/RadA recombinase [Replication, recombination and repair]. 279
644 223545 COG0469 PykF Pyruvate kinase [Carbohydrate transport and metabolism]. 477
645 223546 COG0470 HolB DNA polymerase III, delta prime subunit [Replication, recombination and repair]. 230
646 223547 COG0471 CitT Di- and tricarboxylate transporter [Carbohydrate transport and metabolism]. 461
647 223548 COG0472 Rfe UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase [Cell wall/membrane/envelope biogenesis]. 319
648 223549 COG0473 LeuB Isocitrate/isopropylmalate dehydrogenase [Energy production and conversion, Amino acid transport and metabolism]. 348
649 223550 COG0474 MgtA Magnesium-transporting ATPase (P-type) [Inorganic ion transport and metabolism]. 917
650 223551 COG0475 KefB Kef-type K+ transport system, membrane component KefB [Inorganic ion transport and metabolism]. 397
651 223552 COG0476 ThiF Molybdopterin or thiamine biosynthesis adenylyltransferase [Coenzyme transport and metabolism]. 254
652 223553 COG0477 ProP MFS family permease [Carbohydrate transport and metabolism, Amino acid transport and metabolism, Inorganic ion transport and metabolism, General function prediction only]. 338
653 223554 COG0478 RIO2 RIO-like serine/threonine protein kinase fused to N-terminal HTH domain [Signal transduction mechanisms]. 304
654 223555 COG0479 FrdB Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit [Energy production and conversion]. 234
655 223556 COG0480 FusA Translation elongation factor EF-G, a GTPase [Translation, ribosomal structure and biogenesis]. 697
656 223557 COG0481 LepA Translation elongation factor EF-4, membrane-bound GTPase [Translation, ribosomal structure and biogenesis]. 603
657 223558 COG0482 MnmA tRNA U34 2-thiouridine synthase MnmA/TrmU, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 356
658 223559 COG0483 SuhB Archaeal fructose-1,6-bisphosphatase or related enzyme of inositol monophosphatase family [Carbohydrate transport and metabolism]. 260
659 223560 COG0484 DnaJ DnaJ-class molecular chaperone with C-terminal Zn finger domain [Posttranslational modification, protein turnover, chaperones]. 371
660 223561 COG0486 MnmE tRNA U34 5-carboxymethylaminomethyl modifying GTPase MnmE/TrmE [Translation, ribosomal structure and biogenesis]. 454
661 223562 COG0488 Uup ATPase components of ABC transporters with duplicated ATPase domains [General function prediction only]. 530
662 223563 COG0489 Mrp Chromosome partitioning ATPase, Mrp family, contains Fe-S cluster [Cell cycle control, cell division, chromosome partitioning]. 265
663 223564 COG0490 KhtT K+/H+ antiporter YhaU, regulatory subunit KhtT [Inorganic ion transport and metabolism]. 162
664 223565 COG0491 GloB Glyoxylase or a related metal-dependent hydrolase, beta-lactamase superfamily II [General function prediction only]. 252
665 223566 COG0492 TrxB Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]. 305
666 223567 COG0493 GltD NADPH-dependent glutamate synthase beta chain or related oxidoreductase [Amino acid transport and metabolism, General function prediction only]. 457
667 223568 COG0494 MutT 8-oxo-dGTP pyrophosphatase MutT and related house-cleaning NTP pyrophosphohydrolases, NUDIX family [Defense mechanisms]. 161
668 223569 COG0495 LeuS Leucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 814
669 223570 COG0496 SurE Broad specificity polyphosphatase and 5'/3'-nucleotidase SurE [Replication, recombination and repair]. 252
670 223571 COG0497 RecN DNA repair ATPase RecN [Replication, recombination and repair]. 557
671 223572 COG0498 ThrC Threonine synthase [Amino acid transport and metabolism]. 411
672 223573 COG0499 SAM1 S-adenosylhomocysteine hydrolase [Coenzyme transport and metabolism]. 420
673 223574 COG0500 SmtA SAM-dependent methyltransferase [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 257
674 223575 COG0501 HtpX Zn-dependent protease with chaperone function [Posttranslational modification, protein turnover, chaperones]. 302
675 223576 COG0502 BioB Biotin synthase or related enzyme [Coenzyme transport and metabolism]. 335
676 223577 COG0503 Apt Adenine/guanine phosphoribosyltransferase or related PRPP-binding protein [Nucleotide transport and metabolism]. 179
677 223578 COG0504 PyrG CTP synthase (UTP-ammonia lyase) [Nucleotide transport and metabolism]. 533
678 223579 COG0505 CarA Carbamoylphosphate synthase small subunit [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 368
679 223580 COG0506 PutA Proline dehydrogenase [Amino acid transport and metabolism]. 391
680 223581 COG0507 RecD ATP-dependent exoDNAse (exonuclease V), alpha subunit, helicase superfamily I [Replication, recombination and repair]. 696
681 223582 COG0508 AceF Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component [Energy production and conversion]. 404
682 223583 COG0509 GcvH Glycine cleavage system H protein (lipoate-binding) [Amino acid transport and metabolism]. 131
683 223584 COG0510 CotS Thiamine kinase and related kinases [Coenzyme transport and metabolism]. 269
684 223585 COG0511 AccB Biotin carboxyl carrier protein [Coenzyme transport and metabolism, Lipid transport and metabolism]. 140
685 223586 COG0512 PabA Anthranilate/para-aminobenzoate synthase component II [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 191
686 223587 COG0513 SrmB Superfamily II DNA and RNA helicase [Replication, recombination and repair]. 513
687 223588 COG0514 RecQ Superfamily II DNA helicase RecQ [Replication, recombination and repair]. 590
688 223589 COG0515 SPS1 Serine/threonine protein kinase [Signal transduction mechanisms]. 384
689 223590 COG0516 GuaB IMP dehydrogenase/GMP reductase [Nucleotide transport and metabolism]. 170
690 223591 COG0517 CBS CBS domain [Signal transduction mechanisms]. 117
691 223592 COG0518 GuaA1 GMP synthase - Glutamine amidotransferase domain [Nucleotide transport and metabolism]. 198
692 223593 COG0519 GuaA2 GMP synthase, PP-ATPase domain/subunit [Nucleotide transport and metabolism]. 315
693 223594 COG0520 CsdA Selenocysteine lyase/Cysteine desulfurase [Amino acid transport and metabolism]. 405
694 223595 COG0521 MoaB Molybdopterin biosynthesis enzyme MoaB [Coenzyme transport and metabolism]. 169
695 223596 COG0522 RpsD Ribosomal protein S4 or related protein [Translation, ribosomal structure and biogenesis]. 205
696 223597 COG0523 YejR GTPase, G3E family [General function prediction only]. 323
697 223598 COG0524 RbsK Sugar or nucleoside kinase, ribokinase family [Carbohydrate transport and metabolism]. 311
698 223599 COG0525 ValS Valyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 877
699 223600 COG0526 TrxA Thiol-disulfide isomerase or thioredoxin [Posttranslational modification, protein turnover, chaperones]. 127
700 223601 COG0527 LysC Aspartokinase [Amino acid transport and metabolism]. 447
701 223602 COG0528 PyrH Uridylate kinase [Nucleotide transport and metabolism]. 238
702 223603 COG0529 CysC Adenylylsulfate kinase or related kinase [Inorganic ion transport and metabolism]. 197
703 223604 COG0530 ECM27 Ca2+/Na+ antiporter [Inorganic ion transport and metabolism]. 320
704 223605 COG0531 PotE Amino acid transporter [Amino acid transport and metabolism]. 466
705 223606 COG0532 InfB Translation initiation factor IF-2, a GTPase [Translation, ribosomal structure and biogenesis]. 509
706 223607 COG0533 TsaD tRNA A37 threonylcarbamoyltransferase TsaD [Translation, ribosomal structure and biogenesis]. 342
707 223608 COG0534 NorM Na+-driven multidrug efflux pump [Defense mechanisms]. 455
708 223609 COG0535 SkfB Radical SAM superfamily enzyme, MoaA/NifB/PqqE/SkfB family [General function prediction only]. 347
709 223610 COG0536 Obg GTPase involved in cell partitioning and DNA repair [Cell cycle control, cell division, chromosome partitioning, Replication, recombination and repair]. 369
710 223611 COG0537 Hit Diadenosine tetraphosphate (Ap4A) hydrolase or other HIT family hydrolase [Nucleotide transport and metabolism, Carbohydrate transport and metabolism, General function prediction only]. 138
711 223612 COG0538 Icd Isocitrate dehydrogenase [Energy production and conversion]. 407
712 223613 COG0539 RpsA Ribosomal protein S1 [Translation, ribosomal structure and biogenesis]. 541
713 223614 COG0540 PyrB Aspartate carbamoyltransferase, catalytic chain [Nucleotide transport and metabolism]. 316
714 223615 COG0541 Ffh Signal recognition particle GTPase [Intracellular trafficking, secretion, and vesicular transport]. 451
715 223616 COG0542 ClpA ATP-dependent Clp protease ATP-binding subunit ClpA [Posttranslational modification, protein turnover, chaperones]. 786
716 223617 COG0543 Mcr1 NAD(P)H-flavin reductase [Coenzyme transport and metabolism, Energy production and conversion]. 252
717 223618 COG0544 Tig FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) [Posttranslational modification, protein turnover, chaperones]. 441
718 223619 COG0545 FkpA FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 205
719 223620 COG0546 Gph Phosphoglycolate phosphatase, HAD superfamily [Energy production and conversion]. 220
720 223621 COG0547 TrpD Anthranilate phosphoribosyltransferase [Amino acid transport and metabolism]. 338
721 223622 COG0548 ArgB Acetylglutamate kinase [Amino acid transport and metabolism]. 265
722 223623 COG0549 ArcC Carbamate kinase [Amino acid transport and metabolism]. 312
723 223624 COG0550 TopA DNA topoisomerase IA [Replication, recombination and repair]. 570
724 223625 COG0551 YrdD ssDNA-binding Zn-finger and Zn-ribbon domains of topoisomerase 1 [Replication, recombination and repair]. 140
725 223626 COG0552 FtsY Signal recognition particle GTPase [Intracellular trafficking, secretion, and vesicular transport]. 340
726 223627 COG0553 HepA Superfamily II DNA or RNA helicase, SNF2 family [Transcription, Replication, recombination and repair]. 866
727 223628 COG0554 GlpK Glycerol kinase [Energy production and conversion]. 499
728 223629 COG0555 CysU ABC-type sulfate transport system, permease component [Inorganic ion transport and metabolism]. 274
729 223630 COG0556 UvrB Excinuclease UvrABC helicase subunit UvrB [Replication, recombination and repair]. 663
730 223631 COG0557 VacB Exoribonuclease R [Transcription]. 706
731 223632 COG0558 PgsA Phosphatidylglycerophosphate synthase [Lipid transport and metabolism]. 192
732 223633 COG0559 LivH Branched-chain amino acid ABC-type transport system, permease component [Amino acid transport and metabolism]. 297
733 223634 COG0560 SerB Phosphoserine phosphatase [Amino acid transport and metabolism]. 212
734 223635 COG0561 Cof Hydroxymethylpyrimidine pyrophosphatase and other HAD family phosphatases [Coenzyme transport and metabolism, General function prediction only]. 264
735 223636 COG0562 Glf UDP-galactopyranose mutase [Cell wall/membrane/envelope biogenesis]. 374
736 223637 COG0563 Adk Adenylate kinase or related kinase [Nucleotide transport and metabolism]. 178
737 223638 COG0564 RluA Pseudouridylate synthase, 23S rRNA- or tRNA-specific [Translation, ribosomal structure and biogenesis]. 289
738 223639 COG0565 TrmJ tRNA C32,U32 (ribose-2'-O)-methylase TrmJ or a related methyltransferase [Translation, ribosomal structure and biogenesis]. 242
739 223640 COG0566 SpoU tRNA G18 (ribose-2'-O)-methylase SpoU [Translation, ribosomal structure and biogenesis]. 260
740 223641 COG0567 SucA 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes [Energy production and conversion]. 906
741 223642 COG0568 RpoD DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) [Transcription]. 342
742 223643 COG0569 TrkA Trk K+ transport system, NAD-binding component [Inorganic ion transport and metabolism]. 225
743 223644 COG0571 Rnc dsRNA-specific ribonuclease [Transcription]. 235
744 223645 COG0572 Udk Uridine kinase [Nucleotide transport and metabolism]. 218
745 223646 COG0573 PstC ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 310
746 223647 COG0574 PpsA Phosphoenolpyruvate synthase/pyruvate phosphate dikinase [Carbohydrate transport and metabolism]. 740
747 223648 COG0575 CdsA CDP-diglyceride synthetase [Lipid transport and metabolism]. 265
748 223649 COG0576 GrpE Molecular chaperone GrpE (heat shock protein) [Posttranslational modification, protein turnover, chaperones]. 193
749 223650 COG0577 SalY ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 419
750 223651 COG0578 GlpA Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 532
751 223652 COG0579 LhgO L-2-hydroxyglutarate oxidase LhgO [Carbohydrate transport and metabolism]. 429
752 223653 COG0580 GlpF Glycerol uptake facilitator and related aquaporins (Major Intrinsic Protein Family) [Carbohydrate transport and metabolism]. 241
753 223654 COG0581 PstA ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 292
754 223655 COG0582 XerC Integrase [Replication, recombination and repair, Mobilome: prophages, transposons]. 309
755 223656 COG0583 LysR DNA-binding transcriptional regulator, LysR family [Transcription]. 297
756 223657 COG0584 UgpQ Glycerophosphoryl diester phosphodiesterase [Lipid transport and metabolism]. 257
757 223658 COG0585 TruD tRNA(Glu) U13 pseudouridine synthase TruD [Translation, ribosomal structure and biogenesis]. 406
758 223659 COG0586 DedA Uncharacterized membrane protein DedA, SNARE-associated domain [Function unknown]. 208
759 223660 COG0587 DnaE DNA polymerase III, alpha subunit [Replication, recombination and repair]. 1139
760 223661 COG0588 GpmA Phosphoglycerate mutase (BPG-dependent) [Carbohydrate transport and metabolism]. 230
761 223662 COG0589 UspA Nucleotide-binding universal stress protein, UspA family [Signal transduction mechanisms]. 154
762 223663 COG0590 TadA tRNA(Arg) A34 adenosine deaminase TadA [Translation, ribosomal structure and biogenesis]. 152
763 223664 COG0591 PutP Na+/proline symporter [Amino acid transport and metabolism]. 493
764 223665 COG0592 DnaN DNA polymerase III sliding clamp (beta) subunit, PCNA homolog [Replication, recombination and repair]. 364
765 223666 COG0593 DnaA Chromosomal replication initiation ATPase DnaA [Replication, recombination and repair]. 408
766 223667 COG0594 RnpA RNase P protein component [Translation, ribosomal structure and biogenesis]. 117
767 223668 COG0595 RnjA mRNA degradation ribonuclease J1/J2 [Translation, ribosomal structure and biogenesis]. 555
768 223669 COG0596 MhpC Pimeloyl-ACP methyl ester carboxylesterase [Coenzyme transport and metabolism, General function prediction only]. 282
769 223670 COG0597 LspA Lipoprotein signal peptidase [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 167
770 223671 COG0598 CorA Mg2+ and Co2+ transporter CorA [Inorganic ion transport and metabolism]. 322
771 223672 COG0599 YurZ Uncharacterized conserved protein YurZ, alkylhydroperoxidase/carboxymuconolactone decarboxylase family [General function prediction only]. 124
772 223673 COG0600 TauC ABC-type nitrate/sulfonate/bicarbonate transport system, permease component [Inorganic ion transport and metabolism]. 258
773 223674 COG0601 DppB ABC-type dipeptide/oligopeptide/nickel transport system, permease component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 317
774 223675 COG0602 NrdG Organic radical activating enzyme [General function prediction only]. 212
775 223676 COG0603 QueC 7-cyano-7-deazaguanine synthase (queuosine biosynthesis) [Translation, ribosomal structure and biogenesis]. 222
776 223677 COG0604 Qor NADPH:quinone reductase or related Zn-dependent oxidoreductase [Energy production and conversion, General function prediction only]. 326
777 223678 COG0605 SodA Superoxide dismutase [Inorganic ion transport and metabolism]. 204
778 223679 COG0606 YifB Predicted ATPase with chaperone activity [Posttranslational modification, protein turnover, chaperones]. 490
779 223680 COG0607 PspE Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]. 110
780 223681 COG0608 RecJ Single-stranded DNA-specific exonuclease, DHH superfamily, may be involved in archaeal DNA replication initiation [Replication, recombination and repair]. 491
781 223682 COG0609 FepD ABC-type Fe3+-siderophore transport system, permease component [Inorganic ion transport and metabolism]. 334
782 223683 COG0610 COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases ... [Defense mechanisms]. 962
783 223684 COG0611 ThiL Thiamine monophosphate kinase [Coenzyme transport and metabolism]. 317
784 223685 COG0612 PqqL Predicted Zn-dependent peptidase [General function prediction only]. 438
785 223686 COG0613 YciV Predicted metal-dependent phosphoesterase TrpH, contains PHP domain [General function prediction only]. 258
786 223687 COG0614 FepB ABC-type Fe3+-hydroxamate transport system, periplasmic component [Inorganic ion transport and metabolism]. 319
787 223688 COG0615 TagD Glycerol-3-phosphate cytidylyltransferase, cytidylyltransferase family [Cell wall/membrane/envelope biogenesis]. 140
788 223689 COG0616 SppA Periplasmic serine protease, ClpP class [Posttranslational modification, protein turnover, chaperones]. 317
789 223690 COG0617 PcnB tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 412
790 223691 COG0618 NrnA nanoRNase/pAp phosphatase, hydrolyzes c-di-AMP and oligoRNAs [Nucleotide transport and metabolism]. 332
791 223692 COG0619 EcfT Energy-coupling factor transporter transmembrane protein EcfT [Coenzyme transport and metabolism]. 252
792 223693 COG0620 MetE Methionine synthase II (cobalamin-independent) [Amino acid transport and metabolism]. 330
793 223694 COG0621 MiaB tRNA A37 methylthiotransferase MiaB [Translation, ribosomal structure and biogenesis]. 437
794 223695 COG0622 YfcE Predicted phosphodiesterase [General function prediction only]. 172
795 223696 COG0623 FabI Enoyl-[acyl-carrier-protein] reductase (NADH) [Lipid transport and metabolism]. 259
796 223697 COG0624 ArgE Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase or related deacylase [Amino acid transport and metabolism]. 409
797 223698 COG0625 GstA Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 211
798 223699 COG0626 MetC Cystathionine beta-lyase/cystathionine gamma-synthase [Amino acid transport and metabolism]. 396
799 223700 COG0627 FrmB S-formylglutathione hydrolase FrmB [Defense mechanisms]. 316
800 223701 COG0628 PerM Predicted PurR-regulated permease PerM [General function prediction only]. 355
801 223702 COG0629 Ssb Single-stranded DNA-binding protein [Replication, recombination and repair]. 167
802 223703 COG0630 VirB11 Type IV secretory pathway ATPase VirB11/Archaellum biosynthesis ATPase [Intracellular trafficking, secretion, and vesicular transport]. 312
803 223704 COG0631 PTC1 Serine/threonine protein phosphatase PrpC [Signal transduction mechanisms]. 262
804 223705 COG0632 RuvA Holliday junction resolvasome RuvABC DNA-binding subunit [Replication, recombination and repair]. 201
805 223706 COG0633 Fdx Ferredoxin [Energy production and conversion]. 102
806 223707 COG0634 HptA Hypoxanthine-guanine phosphoribosyltransferase [Nucleotide transport and metabolism]. 178
807 223708 COG0635 HemN Coproporphyrinogen III oxidase or related Fe-S oxidoreductase [Coenzyme transport and metabolism]. 416
808 223709 COG0636 AtpE FoF1-type ATP synthase, membrane subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K [Energy production and conversion]. 79
809 223710 COG0637 YcjU Beta-phosphoglucomutase or related phosphatase, HAD superfamily [Carbohydrate transport and metabolism, General function prediction only]. 221
810 223711 COG0638 PRE1 20S proteasome, alpha and beta subunits [Posttranslational modification, protein turnover, chaperones]. 236
811 223712 COG0639 ApaH Diadenosine tetraphosphatase ApaH/serine/threonine protein phosphatase, PP2A family [Signal transduction mechanisms]. 155
812 223713 COG0640 ArsR DNA-binding transcriptional regulator, ArsR family [Transcription]. 110
813 223714 COG0641 AslB Sulfatase maturation enzyme AslB, radical SAM superfamily [Posttranslational modification, protein turnover, chaperones]. 378
814 223715 COG0642 BaeS Signal transduction histidine kinase [Signal transduction mechanisms]. 336
815 223716 COG0643 CheA Chemotaxis protein histidine kinase CheA [Cell motility, Signal transduction mechanisms]. 716
816 223717 COG0644 FixC Dehydrogenase (flavoprotein) [Energy production and conversion]. 396
817 223718 COG0645 COG0645 Predicted kinase [General function prediction only]. 170
818 223719 COG0646 MetH1 Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]. 311
819 223720 COG0647 NagD Ribonucleotide monophosphatase NagD, HAD superfamily [Nucleotide transport and metabolism]. 269
820 223721 COG0648 Nfo Endonuclease IV [Replication, recombination and repair]. 280
821 223722 COG0649 NuoD NADH:ubiquinone oxidoreductase 49 kD subunit (chain D) [Energy production and conversion]. 398
822 223723 COG0650 HyfC Formate hydrogenlyase subunit 4 [Energy production and conversion]. 309
823 223724 COG0651 HyfB Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 504
824 223725 COG0652 PpiB Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family [Posttranslational modification, protein turnover, chaperones]. 158
825 223726 COG0653 SecA Preprotein translocase subunit SecA (ATPase, RNA helicase) [Intracellular trafficking, secretion, and vesicular transport]. 822
826 223727 COG0654 UbiH 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme transport and metabolism, Energy production and conversion]. 387
827 223728 COG0655 WrbA Multimeric flavodoxin WrbA [Energy production and conversion]. 207
828 223729 COG0656 ARA1 Aldo/keto reductase, related to diketogulonate reductase [Secondary metabolites biosynthesis, transport and catabolism]. 280
829 223730 COG0657 Aes Acetyl esterase/lipase [Lipid transport and metabolism]. 312
830 223731 COG0658 ComEC Predicted membrane metal-binding protein [General function prediction only]. 453
831 223732 COG0659 SUL1 Sulfate permease or related transporter, MFS superfamily [Inorganic ion transport and metabolism]. 554
832 223733 COG0661 AarF Predicted unusual protein kinase regulating ubiquinone biosynthesis, AarF/ABC1/UbiB family [Coenzyme transport and metabolism, Signal transduction mechanisms]. 517
833 223734 COG0662 ManC Mannose-6-phosphate isomerase, cupin superfamily [Carbohydrate transport and metabolism]. 127
834 223735 COG0663 PaaY Carbonic anhydrase or acetyltransferase, isoleucine patch superfamily [General function prediction only]. 176
835 223736 COG0664 Crp cAMP-binding domain of CRP or a regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]. 214
836 223737 COG0665 DadA Glycine/D-amino acid oxidase (deaminating) [Amino acid transport and metabolism]. 387
837 223738 COG0666 ANKYR Ankyrin repeat [Signal transduction mechanisms]. 235
838 223739 COG0667 Tas Predicted oxidoreductase (related to aryl-alcohol dehydrogenase) [General function prediction only]. 316
839 223740 COG0668 MscS Small-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 316
840 223741 COG0669 CoaD Phosphopantetheine adenylyltransferase [Coenzyme transport and metabolism]. 159
841 223742 COG0670 YbhL Integral membrane protein, interacts with FtsH [General function prediction only]. 233
842 223743 COG0671 PgpB Membrane-associated phospholipid phosphatase [Lipid transport and metabolism]. 232
843 223744 COG0672 FTR1 High-affinity Fe2+/Pb2+ permease [Inorganic ion transport and metabolism]. 383
844 223745 COG0673 MviM Predicted dehydrogenase [General function prediction only]. 342
845 223746 COG0674 PorA Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, alpha subunit [Energy production and conversion]. 365
846 223747 COG0675 InsQ Transposase [Mobilome: prophages, transposons]. 364
847 223748 COG0676 YeaD D-hexose-6-phosphate mutarotase [Carbohydrate transport and metabolism]. 287
848 223749 COG0677 WecC UDP-N-acetyl-D-mannosaminuronate dehydrogenase [Cell wall/membrane/envelope biogenesis]. 436
849 223750 COG0678 AHP1 Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 165
850 223751 COG0679 YfdV Predicted permease [General function prediction only]. 311
851 223752 COG0680 HyaD Ni,Fe-hydrogenase maturation factor [Energy production and conversion]. 160
852 223753 COG0681 LepB Signal peptidase I [Intracellular trafficking, secretion, and vesicular transport]. 166
853 223754 COG0682 Lgt Prolipoprotein diacylglyceryltransferase [Cell wall/membrane/envelope biogenesis]. 287
854 223755 COG0683 LivK ABC-type branched-chain amino acid transport system, periplasmic component [Amino acid transport and metabolism]. 366
855 223756 COG0684 RraA Regulator of RNase E activity RraA [Translation, ribosomal structure and biogenesis]. 210
856 223757 COG0685 MetF 5,10-methylenetetrahydrofolate reductase [Amino acid transport and metabolism]. 291
857 223758 COG0686 Ald Alanine dehydrogenase [Amino acid transport and metabolism]. 371
858 223759 COG0687 PotD Spermidine/putrescine-binding periplasmic protein [Amino acid transport and metabolism]. 363
859 223760 COG0688 Psd Phosphatidylserine decarboxylase [Lipid transport and metabolism]. 239
860 223761 COG0689 Rph Ribonuclease PH [Translation, ribosomal structure and biogenesis]. 230
861 223762 COG0690 SecE Preprotein translocase subunit SecE [Intracellular trafficking, secretion, and vesicular transport]. 73
862 223763 COG0691 SmpB tmRNA-binding protein [Posttranslational modification, protein turnover, chaperones]. 153
863 223764 COG0692 Ung Uracil DNA glycosylase [Replication, recombination and repair]. 223
864 223765 COG0693 ThiJ Putative intracellular protease/amidase [General function prediction only]. 188
865 223766 COG0694 NifU Fe-S cluster biogenesis protein NfuA, 4Fe-4S-binding domain [Posttranslational modification, protein turnover, chaperones]. 93
866 223767 COG0695 GrxC Glutaredoxin [Posttranslational modification, protein turnover, chaperones]. 80
867 223768 COG0696 GpmI Phosphoglycerate mutase (BPG-independent, AlkP superfamily) [Carbohydrate transport and metabolism]. 509
868 223769 COG0697 RhaT Permease of the drug/metabolite transporter (DMT) superfamily [Carbohydrate transport and metabolism, Amino acid transport and metabolism, General function prediction only]. 292
869 223770 COG0698 RpiB Ribose 5-phosphate isomerase RpiB [Carbohydrate transport and metabolism]. 151
870 223771 COG0699 CrfC Replication fork clamp-binding protein CrfC (dynamin-like GTPase family) [Replication, recombination and repair]. 546
871 223772 COG0700 SpmB Spore maturation protein SpmB (function unknown) [Function unknown]. 162
872 223773 COG0701 YraQ Uncharacterized membrane protein YraQ, UPF0718 family [Function unknown]. 317
873 223774 COG0702 YbjT Uncharacterized conserved protein YbjT, contains NAD(P)-binding and DUF2867 domains [General function prediction only]. 275
874 223775 COG0703 AroK Shikimate kinase [Amino acid transport and metabolism]. 172
875 223776 COG0704 PhoU Phosphate uptake regulator [Inorganic ion transport and metabolism]. 240
876 223777 COG0705 GlpG Membrane associated serine protease, rhomboid family [Posttranslational modification, protein turnover, chaperones]. 228
877 223778 COG0706 YidC Membrane protein insertase Oxa1/YidC/SpoIIIJ, required for the localization of integral membrane proteins [Cell wall/membrane/envelope biogenesis]. 314
878 223779 COG0707 MurG UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase [Cell wall/membrane/envelope biogenesis]. 357
879 223780 COG0708 XthA Exonuclease III [Replication, recombination and repair]. 261
880 223781 COG0709 SelD Selenophosphate synthase [Amino acid transport and metabolism]. 346
881 223782 COG0710 AroD 3-dehydroquinate dehydratase [Amino acid transport and metabolism]. 231
882 223783 COG0711 AtpF FoF1-type ATP synthase, membrane subunit b or b' [Energy production and conversion]. 161
883 223784 COG0712 AtpH FoF1-type ATP synthase, delta subunit [Energy production and conversion]. 178
884 223785 COG0713 NuoK NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) [Energy production and conversion]. 100
885 223786 COG0714 MoxR MoxR-like ATPase [General function prediction only]. 329
886 223787 COG0715 TauA ABC-type nitrate/sulfonate/bicarbonate transport system, periplasmic component [Inorganic ion transport and metabolism]. 335
887 223788 COG0716 FldA Flavodoxin [Energy production and conversion]. 151
888 223789 COG0717 Dcd Deoxycytidine triphosphate deaminase [Nucleotide transport and metabolism]. 183
889 223790 COG0718 YbaB Conserved DNA-binding protein YbaB (function unknown) [General function prediction only]. 105
890 223791 COG0719 SufB Fe-S cluster assembly scaffold protein SufB [Posttranslational modification, protein turnover, chaperones]. 412
891 223792 COG0720 QueD 6-pyruvoyl-tetrahydropterin synthase [Coenzyme transport and metabolism]. 127
892 223793 COG0721 GatC Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit [Translation, ribosomal structure and biogenesis]. 96
893 223794 COG0722 AroG1 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 351
894 223795 COG0723 QcrA Rieske Fe-S protein [Energy production and conversion]. 177
895 223796 COG0724 RRM RNA recognition motif (RRM) domain [Translation, ribosomal structure and biogenesis]. 306
896 223797 COG0725 ModA ABC-type molybdate transport system, periplasmic component [Inorganic ion transport and metabolism]. 258
897 223798 COG0726 CDA1 Peptidoglycan/xylan/chitin deacetylase, PgdA/CDA1 family [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 267
898 223799 COG0727 YkgJ Fe-S-cluster containing protein [General function prediction only]. 132
899 223800 COG0728 MviN Peptidoglycan biosynthesis protein MviN/MurJ, putative lipid II flippase [Cell wall/membrane/envelope biogenesis]. 518
900 223801 COG0729 TamA Outer membrane translocation and assembly module TamA [Cell wall/membrane/envelope biogenesis]. 594
901 223802 COG0730 YfcA Uncharacterized membrane protein YfcA [Function unknown]. 258
902 223803 COG0731 Tyw1 Wyosine [tRNA(Phe)-imidazoG37] synthetase, radical SAM superfamily [Translation, ribosomal structure and biogenesis]. 296
903 223804 COG0732 HsdS Restriction endonuclease S subunit [Defense mechanisms]. 391
904 223805 COG0733 YocR Na+-dependent transporter, SNF family [General function prediction only]. 439
905 223806 COG0735 Fur Fe2+ or Zn2+ uptake regulation protein [Inorganic ion transport and metabolism]. 145
906 223807 COG0736 AcpS Phosphopantetheinyl transferase (holo-ACP synthase) [Lipid transport and metabolism]. 127
907 223808 COG0737 UshA 2',3'-cyclic-nucleotide 2'-phosphodiesterase/5'- or 3'-nucleotidase, 5'-nucleotidase family [Nucleotide transport and metabolism, Defense mechanisms]. 517
908 223809 COG0738 FucP Fucose permease [Carbohydrate transport and metabolism]. 422
909 223810 COG0739 NlpD Murein DD-endopeptidase MepM and murein hydrolase activator NlpD, contain LysM domain [Cell wall/membrane/envelope biogenesis]. 277
910 223811 COG0740 ClpP ATP-dependent protease ClpP, protease subunit [Posttranslational modification, protein turnover, chaperones]. 200
911 223812 COG0741 MltE Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) [Cell wall/membrane/envelope biogenesis]. 296
912 223813 COG0742 RsmD 16S rRNA G966 N2-methylase RsmD [Translation, ribosomal structure and biogenesis]. 187
913 223814 COG0743 Dxr 1-deoxy-D-xylulose 5-phosphate reductoisomerase [Lipid transport and metabolism]. 385
914 223815 COG0744 MrcB Membrane carboxypeptidase (penicillin-binding protein) [Cell wall/membrane/envelope biogenesis]. 661
915 223816 COG0745 OmpR DNA-binding response regulator, OmpR family, contains REC and winged-helix (wHTH) domain [Signal transduction mechanisms, Transcription]. 229
916 223817 COG0746 MobA Molybdopterin-guanine dinucleotide biosynthesis protein A [Coenzyme transport and metabolism]. 192
917 223818 COG0747 DdpA ABC-type transport system, periplasmic component [Amino acid transport and metabolism]. 556
918 223819 COG0748 HugZ Putative heme iron utilization protein [Inorganic ion transport and metabolism]. 245
919 223820 COG0749 PolA DNA polymerase I - 3'-5' exonuclease and polymerase domains [Replication, recombination and repair]. 593
920 223821 COG0750 RseP Membrane-associated protease RseP, regulator of RpoE activity [Posttranslational modification, protein turnover, chaperones, Transcription]. 375
921 223822 COG0751 GlyS Glycyl-tRNA synthetase, beta subunit [Translation, ribosomal structure and biogenesis]. 691
922 223823 COG0752 GlyQ Glycyl-tRNA synthetase, alpha subunit [Translation, ribosomal structure and biogenesis]. 298
923 223824 COG0753 KatE Catalase [Inorganic ion transport and metabolism]. 496
924 223825 COG0754 Gsp Glutathionylspermidine synthase [Amino acid transport and metabolism]. 387
925 223826 COG0755 CcmC ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 281
926 223827 COG0756 Dut dUTPase [Nucleotide transport and metabolism, Defense mechanisms]. 148
927 223828 COG0757 AroQ 3-dehydroquinate dehydratase [Amino acid transport and metabolism]. 146
928 223829 COG0758 Smf Predicted Rossmann fold nucleotide-binding protein DprA/Smf involved in DNA uptake [Replication, recombination and repair]. 350
929 223830 COG0759 YidD Membrane-anchored protein YidD, putative component of membrane protein insertase Oxa1/YidC/SpoIIIJ [Cell wall/membrane/envelope biogenesis]. 92
930 223831 COG0760 SurA Parvulin-like peptidyl-prolyl isomerase [Posttranslational modification, protein turnover, chaperones]. 320
931 223832 COG0761 IspH 4-Hydroxy-3-methylbut-2-enyl diphosphate reductase IspH [Lipid transport and metabolism]. 294
932 223833 COG0762 Ycf19 Uncharacterized conserved protein YggT, Ycf19 family [Function unknown]. 96
933 223834 COG0763 LpxB Lipid A disaccharide synthetase [Cell wall/membrane/envelope biogenesis]. 381
934 223835 COG0764 FabA 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase [Lipid transport and metabolism]. 147
935 223836 COG0765 HisM ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 222
936 223837 COG0766 MurA UDP-N-acetylglucosamine enolpyruvyl transferase [Cell wall/membrane/envelope biogenesis]. 421
937 223838 COG0767 MlaE ABC-type transporter Mla maintaining outer membrane lipid asymmetry, permease component MlaE [Cell wall/membrane/envelope biogenesis]. 267
938 223839 COG0768 FtsI Cell division protein FtsI/penicillin-binding protein 2 [Cell cycle control, cell division, chromosome partitioning, Cell wall/membrane/envelope biogenesis]. 599
939 223840 COG0769 MurE UDP-N-acetylmuramyl tripeptide synthase [Cell wall/membrane/envelope biogenesis]. 475
940 223841 COG0770 MurF UDP-N-acetylmuramyl pentapeptide synthase [Cell wall/membrane/envelope biogenesis]. 451
941 223842 COG0771 MurD UDP-N-acetylmuramoylalanine-D-glutamate ligase [Cell wall/membrane/envelope biogenesis]. 448
942 223843 COG0772 FtsW Bacterial cell division protein FtsW, lipid II flippase [Cell cycle control, cell division, chromosome partitioning]. 381
943 223844 COG0773 MurC UDP-N-acetylmuramate-alanine ligase [Cell wall/membrane/envelope biogenesis]. 459
944 223845 COG0774 LpxC UDP-3-O-acyl-N-acetylglucosamine deacetylase [Cell wall/membrane/envelope biogenesis]. 300
945 223846 COG0775 Pfs Nucleoside phosphorylase [Nucleotide transport and metabolism]. 234
946 223847 COG0776 HimA Bacterial nucleoid DNA-binding protein [Replication, recombination and repair]. 94
947 223848 COG0777 AccD Acetyl-CoA carboxylase beta subunit [Lipid transport and metabolism]. 294
948 223849 COG0778 NfnB Nitroreductase [Energy production and conversion]. 207
949 223850 COG0779 RimP Ribosome maturation factor RimP [Translation, ribosomal structure and biogenesis]. 153
950 223851 COG0780 QueFC NADPH-dependent 7-cyano-7-deazaguanine reductase QueF, C-terminal domain, T-fold superfamily [Translation, ribosomal structure and biogenesis]. 149
951 223852 COG0781 NusB Transcription termination factor NusB [Transcription]. 151
952 223853 COG0782 GreA Transcription elongation factor, GreA/GreB family [Transcription]. 151
953 223854 COG0783 Dps DNA-binding ferritin-like protein (oxidative damage protectant) [Inorganic ion transport and metabolism, Defense mechanisms]. 156
954 223855 COG0784 CheY CheY chemotaxis protein or a CheY-like REC (receiver) domain [Signal transduction mechanisms]. 130
955 223856 COG0785 CcdA Cytochrome c biogenesis protein CcdA [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 220
956 223857 COG0786 GltS Na+/glutamate symporter [Amino acid transport and metabolism]. 404
957 223858 COG0787 Alr Alanine racemase [Cell wall/membrane/envelope biogenesis]. 360
958 223859 COG0788 PurU Formyltetrahydrofolate hydrolase [Nucleotide transport and metabolism]. 287
959 223860 COG0789 SoxR DNA-binding transcriptional regulator, MerR family [Transcription]. 124
960 223861 COG0790 TPR TPR repeat [Signal transduction mechanisms]. 292
961 223862 COG0791 Spr Cell wall-associated hydrolase, NlpC family [Cell wall/membrane/envelope biogenesis]. 197
962 223863 COG0792 YraN Predicted endonuclease distantly related to archaeal Holliday junction resolvase [Replication, recombination and repair]. 114
963 223864 COG0793 CtpA C-terminal processing protease CtpA/Prc, contains a PDZ domain [Posttranslational modification, protein turnover, chaperones]. 406
964 223865 COG0794 GutQ D-arabinose 5-phosphate isomerase GutQ [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 202
965 223866 COG0795 LptF Lipopolysaccharide export LptBFGC system, permease protein LptF [Cell wall/membrane/envelope biogenesis, Cell motility]. 364
966 223867 COG0796 MurI Glutamate racemase [Cell wall/membrane/envelope biogenesis]. 269
967 223868 COG0797 RlpA Rare lipoprotein A, peptidoglycan hydrolase digesting "naked" glycans, contains C-terminal SPOR domain [Cell wall/membrane/envelope biogenesis]. 233
968 223869 COG0798 ACR3 Arsenite efflux pump ArsB, ACR3 family [Inorganic ion transport and metabolism]. 342
969 223870 COG0799 RsfS Ribosomal silencing factor RsfS, regulates association of 30S and 50S subunits [Translation, ribosomal structure and biogenesis]. 115
970 223871 COG0800 Eda 2-keto-3-deoxy-6-phosphogluconate aldolase [Carbohydrate transport and metabolism]. 211
971 223872 COG0801 FolK 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase [Coenzyme transport and metabolism]. 160
972 223873 COG0802 TsaE tRNA A37 threonylcarbamoyladenosine biosynthesis protein TsaE [Translation, ribosomal structure and biogenesis]. 149
973 223874 COG0803 ZnuA ABC-type Zn uptake system ZnuABC, Zn-binding component ZnuA [Inorganic ion transport and metabolism]. 303
974 223875 COG0804 UreC Urease alpha subunit [Amino acid transport and metabolism]. 568
975 223876 COG0805 TatC Sec-independent protein secretion pathway component TatC [Intracellular trafficking, secretion, and vesicular transport]. 255
976 223877 COG0806 RimM Ribosomal 30S subunit maturation factor RimM, required for 16S rRNA processing [Translation, ribosomal structure and biogenesis]. 174
977 223878 COG0807 RibA GTP cyclohydrolase II [Coenzyme transport and metabolism]. 193
978 223879 COG0809 QueA S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) [Translation, ribosomal structure and biogenesis]. 348
979 223880 COG0810 TonB Periplasmic protein TonB, links inner and outer membranes [Cell wall/membrane/envelope biogenesis]. 244
980 223881 COG0811 TolQ Biopolymer transport protein ExbB/TolQ [Intracellular trafficking, secretion, and vesicular transport]. 216
981 223882 COG0812 MurB UDP-N-acetylenolpyruvoylglucosamine reductase [Cell wall/membrane/envelope biogenesis]. 291
982 223883 COG0813 DeoD Purine-nucleoside phosphorylase [Nucleotide transport and metabolism]. 236
983 223884 COG0814 SdaC Amino acid permease [Amino acid transport and metabolism]. 415
984 223885 COG0815 Lnt Apolipoprotein N-acyltransferase [Cell wall/membrane/envelope biogenesis]. 518
985 223886 COG0816 YqgF RNase H-fold protein, predicted Holliday junction resolvase in Firmicutes and mycoplasmas, involved in anti-termination at Rho-dependent terminators [Transcription]. 141
986 223887 COG0817 RuvC Holliday junction resolvasome RuvABC endonuclease subunit [Replication, recombination and repair]. 160
987 223888 COG0818 DgkA Diacylglycerol kinase [Lipid transport and metabolism]. 123
988 223889 COG0819 TenA Thiaminase [Coenzyme transport and metabolism]. 218
989 223890 COG0820 RlmN Adenine C2-methylase RlmN of 23S rRNA A2503 and tRNA A37 [Translation, ribosomal structure and biogenesis]. 349
990 223891 COG0821 IspG 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase IspG/GcpE [Lipid transport and metabolism]. 361
991 223892 COG0822 IscU NifU homolog involved in Fe-S cluster formation [Posttranslational modification, protein turnover, chaperones]. 150
992 223893 COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking, secretion, and vesicular transport]. 425
993 223894 COG0824 FadM Acyl-CoA thioesterase FadM [Lipid transport and metabolism]. 137
994 223895 COG0825 AccA Acetyl-CoA carboxylase alpha subunit [Lipid transport and metabolism]. 317
995 223896 COG0826 PrtC Collagenase-like protease, PrtC family [Posttranslational modification, protein turnover, chaperones]. 347
996 223897 COG0827 YtxK Adenine-specific DNA methylase [Replication, recombination and repair]. 381
997 223898 COG0828 RpsU Ribosomal protein S21 [Translation, ribosomal structure and biogenesis]. 67
998 223899 COG0829 UreH Urease accessory protein UreH [Posttranslational modification, protein turnover, chaperones]. 269
999 223900 COG0830 UreF Urease accessory protein UreF [Posttranslational modification, protein turnover, chaperones]. 229
1000 223901 COG0831 UreA Urease gamma subunit [Amino acid transport and metabolism]. 100
1001 223902 COG0832 UreB Urease beta subunit [Amino acid transport and metabolism]. 106
1002 223903 COG0833 LysP Amino acid permease [Amino acid transport and metabolism]. 541
1003 223904 COG0834 HisJ ABC-type amino acid transport/signal transduction system, periplasmic component/domain [Amino acid transport and metabolism, Signal transduction mechanisms]. 275
1004 223905 COG0835 CheW Chemotaxis signal transduction protein [Cell motility, Signal transduction mechanisms]. 165
1005 223906 COG0836 CpsB Mannose-1-phosphate guanylyltransferase [Cell wall/membrane/envelope biogenesis]. 333
1006 223907 COG0837 Glk Glucokinase [Carbohydrate transport and metabolism]. 320
1007 223908 COG0838 NuoA NADH:ubiquinone oxidoreductase subunit 3 (chain A) [Energy production and conversion]. 123
1008 223909 COG0839 NuoJ NADH:ubiquinone oxidoreductase subunit 6 (chain J) [Energy production and conversion]. 166
1009 223910 COG0840 Tar Methyl-accepting chemotaxis protein [Cell motility, Signal transduction mechanisms]. 408
1010 223911 COG0841 AcrB Multidrug efflux pump subunit AcrB [Defense mechanisms]. 1009
1011 223912 COG0842 YadH ABC-type multidrug transport system, permease component [Defense mechanisms]. 286
1012 223913 COG0843 CyoB Heme/copper-type cytochrome/quinol oxidase, subunit 1 [Energy production and conversion]. 566
1013 223914 COG0845 AcrA Multidrug efflux pump subunit AcrA (membrane-fusion protein) [Cell wall/membrane/envelope biogenesis, Defense mechanisms]. 372
1014 223915 COG0846 SIR2 NAD-dependent protein deacetylase, SIR2 family [Posttranslational modification, protein turnover, chaperones]. 250
1015 223916 COG0847 DnaQ DNA polymerase III, epsilon subunit or related 3'-5' exonuclease [Replication, recombination and repair]. 243
1016 223917 COG0848 ExbD Biopolymer transport protein ExbD [Intracellular trafficking, secretion, and vesicular transport]. 137
1017 223918 COG0849 FtsA Cell division ATPase FtsA [Cell cycle control, cell division, chromosome partitioning]. 418
1018 223919 COG0850 MinC Septum formation inhibitor MinC [Cell cycle control, cell division, chromosome partitioning]. 219
1019 223920 COG0851 MinE Septum formation topological specificity factor MinE [Cell cycle control, cell division, chromosome partitioning]. 88
1020 223921 COG0852 NuoC NADH:ubiquinone oxidoreductase 27 kD subunit (chain C) [Energy production and conversion]. 176
1021 223922 COG0853 PanD Aspartate 1-decarboxylase [Coenzyme transport and metabolism]. 126
1022 223923 COG0854 PdxJ Pyridoxine 5'-phosphate synthase PdxJ [Coenzyme transport and metabolism]. 243
1023 223924 COG0855 Ppk Polyphosphate kinase [Inorganic ion transport and metabolism]. 696
1024 223925 COG0856 PyrE2 Orotate phosphoribosyltransferase homolog [Nucleotide transport and metabolism]. 203
1025 223926 COG0857 PtaN BioD-like N-terminal domain of phosphotransacetylase [General function prediction only]. 354
1026 223927 COG0858 RbfA Ribosome-binding factor A [Translation, ribosomal structure and biogenesis]. 118
1027 223928 COG0859 RfaF ADP-heptose:LPS heptosyltransferase [Cell wall/membrane/envelope biogenesis]. 334
1028 223929 COG0860 AmiC N-acetylmuramoyl-L-alanine amidase [Cell wall/membrane/envelope biogenesis]. 231
1029 223930 COG0861 TerC Membrane protein TerC, possibly involved in tellurium resistance [Inorganic ion transport and metabolism]. 254
1030 223931 COG0863 YhdJ DNA modification methylase [Replication, recombination and repair]. 302
1031 223932 COG0864 NikR Metal-responsive transcriptional regulator, contains CopG/Arc/MetJ DNA-binding domain [Transcription]. 136
1032 223933 COG1001 AdeC Adenine deaminase [Nucleotide transport and metabolism]. 584
1033 223934 COG1002 YeeA Type II restriction/modification system, DNA methylase subunit YeeA [Defense mechanisms]. 786
1034 223935 COG1003 GcvP2 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain [Amino acid transport and metabolism]. 496
1035 223936 COG1004 Ugd UDP-glucose 6-dehydrogenase [Cell wall/membrane/envelope biogenesis]. 414
1036 223937 COG1005 NuoH NADH:ubiquinone oxidoreductase subunit 1 (chain H) [Energy production and conversion]. 332
1037 223938 COG1006 MnhC Multisubunit Na+/H+ antiporter, MnhC subunit [Inorganic ion transport and metabolism]. 115
1038 223939 COG1007 NuoN NADH:ubiquinone oxidoreductase subunit 2 (chain N) [Energy production and conversion]. 475
1039 223940 COG1008 NuoM NADH:ubiquinone oxidoreductase subunit 4 (chain M) [Energy production and conversion]. 497
1040 223941 COG1009 NuoL NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 606
1041 223942 COG1010 CobJ Precorrin-3B methylase [Coenzyme transport and metabolism]. 249
1042 223943 COG1011 YigB FMN phosphatase YigB, HAD superfamily [Coenzyme transport and metabolism]. 229
1043 223944 COG1012 AdhE Acyl-CoA reductase or other NAD-dependent aldehyde dehydrogenase [Energy production and conversion]. 472
1044 223945 COG1013 PorB Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, beta subunit [Energy production and conversion]. 294
1045 223946 COG1014 PorG Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, gamma subunit [Energy production and conversion]. 203
1046 223947 COG1015 DeoB Phosphopentomutase [Carbohydrate transport and metabolism]. 397
1047 223948 COG1017 Hmp Hemoglobin-like flavoprotein [Energy production and conversion]. 150
1048 223949 COG1018 Fpr Ferredoxin-NADP reductase [Energy production and conversion]. 266
1049 223950 COG1019 CAB4 Phosphopantetheine adenylyltransferase [Coenzyme transport and metabolism]. 158
1050 223951 COG1020 EntF Non-ribosomal peptide synthetase component F [Secondary metabolites biosynthesis, transport and catabolism]. 642
1051 223952 COG1021 EntE Non-ribosomal peptide synthetase component E (peptide arylation enzyme) [Secondary metabolites biosynthesis, transport and catabolism]. 542
1052 223953 COG1022 FAA1 Long-chain acyl-CoA synthetase (AMP-forming) [Lipid transport and metabolism]. 613
1053 223954 COG1023 YqeC 6-phosphogluconate dehydrogenase (decarboxylating) [Carbohydrate transport and metabolism]. 300
1054 223955 COG1024 CaiD Enoyl-CoA hydratase/carnitine racemase [Lipid transport and metabolism]. 257
1055 223956 COG1025 Ptr Secreted/periplasmic Zn-dependent peptidases, insulinase-like [Posttranslational modification, protein turnover, chaperones]. 937
1056 223957 COG1026 Cym1 Zn-dependent peptidase, M16 (insulinase) family [Posttranslational modification, protein turnover, chaperones]. 978
1057 223958 COG1027 AspA Aspartate ammonia-lyase [Amino acid transport and metabolism]. 471
1058 223959 COG1028 FabG NAD(P)-dependent dehydrogenase, short-chain alcohol dehydrogenase family [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 251
1059 223960 COG1029 FwdB Formylmethanofuran dehydrogenase subunit B [Energy production and conversion]. 429
1060 223961 COG1030 NfeD Membrane-bound serine protease (ClpP class) [Posttranslational modification, protein turnover, chaperones]. 436
1061 223962 COG1031 TM1601 Radical SAM superfamily enzyme with C-terminal helix-hairpin-helix motif [General function prediction only]. 560
1062 223963 COG1032 YgiQ Radical SAM superfamily enzyme YgiQ, UPF0313 family [General function prediction only]. 490
1063 223964 COG1033 COG1033 Predicted exporter protein, RND superfamily [General function prediction only]. 727
1064 223965 COG1034 NuoG NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) [Energy production and conversion]. 693
1065 223966 COG1035 FrhB Coenzyme F420-reducing hydrogenase, beta subunit [Energy production and conversion]. 332
1066 223967 COG1036 COG1036 Archaeal flavoprotein [Energy production and conversion]. 187
1067 223968 COG1038 PycA Pyruvate carboxylase [Energy production and conversion]. 1149
1068 223969 COG1039 RnhC Ribonuclease HIII [Replication, recombination and repair]. 297
1069 223970 COG1040 ComFC Predicted amidophosphoribosyltransferases [General function prediction only]. 225
1070 223971 COG1041 Trm11 tRNA G10 N-methylase Trm11 [Translation, ribosomal structure and biogenesis]. 347
1071 223972 COG1042 ACCS Acyl-CoA synthetase (NDP forming) [Energy production and conversion]. 598
1072 223973 COG1043 LpxA Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase [Cell wall/membrane/envelope biogenesis]. 260
1073 223974 COG1044 LpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase [Cell wall/membrane/envelope biogenesis]. 338
1074 223975 COG1045 CysE Serine acetyltransferase [Amino acid transport and metabolism]. 194
1075 223976 COG1047 SlpA FKBP-type peptidyl-prolyl cis-trans isomerase 2 [Posttranslational modification, protein turnover, chaperones]. 174
1076 223977 COG1048 AcnA Aconitase A [Energy production and conversion]. 861
1077 223978 COG1049 AcnB Aconitase B [Energy production and conversion]. 852
1078 223979 COG1051 YjhB ADP-ribose pyrophosphatase YjhB, NUDIX family [Nucleotide transport and metabolism]. 145
1079 223980 COG1052 LdhA Lactate dehydrogenase or related 2-hydroxyacid dehydrogenase [Energy production and conversion, Coenzyme transport and metabolism, General function prediction only]. 324
1080 223981 COG1053 SdhA Succinate dehydrogenase/fumarate reductase, flavoprotein subunit [Energy production and conversion]. 562
1081 223982 COG1054 YceA Predicted sulfurtransferase [General function prediction only]. 308
1082 223983 COG1055 ArsB Na+/H+ antiporter NhaD or related arsenite permease [Inorganic ion transport and metabolism]. 424
1083 223984 COG1056 NadR Nicotinamide mononucleotide adenylyltransferase [Coenzyme transport and metabolism]. 172
1084 223985 COG1057 NadD Nicotinic acid mononucleotide adenylyltransferase [Coenzyme transport and metabolism]. 197
1085 223986 COG1058 CinA Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA [General function prediction only]. 255
1086 223987 COG1059 ENDO3c Thermostable 8-oxoguanine DNA glycosylase [Replication, recombination and repair, Defense mechanisms]. 210
1087 223988 COG1060 ThiH 2-iminoacetate synthase ThiH/Menaquinone biosynthesis enzyme MqnC [Coenzyme transport and metabolism]. 370
1088 223989 COG1061 SSL2 Superfamily II DNA or RNA helicase [Transcription, Replication, recombination and repair]. 442
1089 223990 COG1062 FrmA Zn-dependent alcohol dehydrogenase [General function prediction only]. 366
1090 223991 COG1063 Tdh Threonine dehydrogenase or related Zn-dependent dehydrogenase [Amino acid transport and metabolism, General function prediction only]. 350
1091 223992 COG1064 AdhP D-arabinose 1-dehydrogenase, Zn-dependent alcohol dehydrogenase family [Carbohydrate transport and metabolism]. 339
1092 223993 COG1066 Sms Predicted ATP-dependent serine protease [Posttranslational modification, protein turnover, chaperones]. 456
1093 223994 COG1067 LonB Predicted ATP-dependent protease [Posttranslational modification, protein turnover, chaperones]. 647
1094 223995 COG1069 AraB Ribulose kinase [Carbohydrate transport and metabolism]. 544
1095 223996 COG1070 XylB Sugar (pentulose or hexulose) kinase [Carbohydrate transport and metabolism]. 502
1096 223997 COG1071 AcoA TPP-dependent pyruvate or acetoin dehydrogenase subunit alpha [Energy production and conversion]. 358
1097 223998 COG1072 CoaA Pantothenate kinase [Coenzyme transport and metabolism]. 283
1098 223999 COG1073 FrsA Fermentation-respiration switch protein FrsA, has esterase activity, DUF1100 family [Signal transduction mechanisms]. 299
1099 224000 COG1074 RecB ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) [Replication, recombination and repair]. 1139
1100 224001 COG1075 EstA Triacylglycerol esterase/lipase EstA, alpha/beta hydrolase fold [Lipid transport and metabolism]. 336
1101 224002 COG1076 DjlA DnaJ-domain-containing proteins 1 [Posttranslational modification, protein turnover, chaperones]. 174
1102 224003 COG1077 MreB Actin-like ATPase involved in cell morphogenesis [Cell cycle control, cell division, chromosome partitioning]. 342
1103 224004 COG1078 YdhJ HD superfamily phosphohydrolase [General function prediction only]. 421
1104 224005 COG1079 YufQ ABC-type uncharacterized transport system, permease component [General function prediction only]. 304
1105 224006 COG1080 PtsA Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) [Carbohydrate transport and metabolism]. 574
1106 224007 COG1082 YcjR Sugar phosphate isomerase/epimerase [Carbohydrate transport and metabolism]. 274
1107 224008 COG1083 NeuA CMP-N-acetylneuraminic acid synthetase [Cell wall/membrane/envelope biogenesis]. 228
1108 224009 COG1084 Nog1 GTP-binding protein, GTP1/Obg family [General function prediction only]. 346
1109 224010 COG1085 GalT Galactose-1-phosphate uridylyltransferase [Carbohydrate transport and metabolism]. 338
1110 224011 COG1086 FlaA1 NDP-sugar epimerase, includes UDP-GlcNAc-inverting 4,6-dehydratase FlaA1 and capsular polysaccharide biosynthesis protein EpsC [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 588
1111 224012 COG1087 GalE UDP-glucose 4-epimerase [Cell wall/membrane/envelope biogenesis]. 329
1112 224013 COG1088 RfbB dTDP-D-glucose 4,6-dehydratase [Cell wall/membrane/envelope biogenesis]. 340
1113 224014 COG1089 Gmd GDP-D-mannose dehydratase [Cell wall/membrane/envelope biogenesis]. 345
1114 224015 COG1090 YfcH NAD dependent epimerase/dehydratase family enzyme [General function prediction only]. 297
1115 224016 COG1091 RfbD dTDP-4-dehydrorhamnose reductase [Cell wall/membrane/envelope biogenesis]. 281
1116 224017 COG1092 RlmK 23S rRNA G2069 N7-methylase RlmK or C1962 C5-methylase RlmI [Translation, ribosomal structure and biogenesis]. 393
1117 224018 COG1093 SUI2 Translation initiation factor 2, alpha subunit (eIF-2alpha) [Translation, ribosomal structure and biogenesis]. 269
1118 224019 COG1094 Krr1 rRNA processing protein Krr1/Pno1, contains KH domain [Translation, ribosomal structure and biogenesis]. 194
1119 224020 COG1095 RPB7 DNA-directed RNA polymerase, subunit E'/Rpb7 [Transcription]. 183
1120 224021 COG1096 Csl4 Exosome complex RNA-binding protein Csl4, contains S1 and Zn-ribbon domains [Translation, ribosomal structure and biogenesis]. 188
1121 224022 COG1097 Rrp4 Exosome complex RNA-binding protein Rrp4, contains S1 and KH domains [Translation, ribosomal structure and biogenesis]. 239
1122 224023 COG1098 YabR Predicted RNA-binding protein, contains ribosomal protein S1 (RPS1) domain [General function prediction only]. 129
1123 224024 COG1099 COG1099 Predicted metal-dependent hydrolase, TIM-barrel fold [General function prediction only]. 254
1124 224025 COG1100 Gem1 GTPase SAR1 family domain [General function prediction only]. 219
1125 224026 COG1101 PhnK ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 263
1126 224027 COG1102 CmkB Cytidylate kinase [Nucleotide transport and metabolism]. 179
1127 224028 COG1103 COG1103 Archaeal Cys-tRNA synthase (O-phospho-L-seryl-tRNA:Cys-tRNA synthase) [Translation, ribosomal structure and biogenesis]. 382
1128 224029 COG1104 NifS Cysteine sulfinate desulfinase/cysteine desulfurase or related enzyme [Amino acid transport and metabolism]. 386
1129 224030 COG1105 FruK Fructose-1-phosphate kinase or kinase (PfkB) [Carbohydrate transport and metabolism]. 310
1130 224031 COG1106 AAA15 ATPase/GTPase, AAA15 family [General function prediction only]. 371
1131 224032 COG1107 COG1107 Archaea-specific RecJ-like exonuclease, contains DnaJ-type Zn finger domain [Replication, recombination and repair]. 715
1132 224033 COG1108 ZnuB ABC-type Mn2+/Zn2+ transport system, permease component [Inorganic ion transport and metabolism]. 274
1133 224034 COG1109 ManB Phosphomannomutase [Carbohydrate transport and metabolism]. 464
1134 224035 COG1110 TopG2 Reverse gyrase [Replication, recombination and repair]. 1187
1135 224036 COG1111 MPH1 ERCC4-related helicase [Replication, recombination and repair]. 542
1136 224037 COG1112 DNA2 Superfamily I DNA and/or RNA helicase [Replication, recombination and repair]. 767
1137 224038 COG1113 AnsP L-asparagine transporter and related permeases [Amino acid transport and metabolism]. 462
1138 224039 COG1114 BrnQ Branched-chain amino acid permeases [Amino acid transport and metabolism]. 431
1139 224040 COG1115 AlsT Na+/alanine symporter [Amino acid transport and metabolism]. 452
1140 224041 COG1116 TauB ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component [Inorganic ion transport and metabolism]. 248
1141 224042 COG1117 PstB ABC-type phosphate transport system, ATPase component [Inorganic ion transport and metabolism]. 253
1142 224043 COG1118 CysA ABC-type sulfate/molybdate transport systems, ATPase component [Inorganic ion transport and metabolism]. 345
1143 224044 COG1119 ModF ABC-type molybdenum transport system, ATPase component/photorepair protein PhrA [Inorganic ion transport and metabolism]. 257
1144 224045 COG1120 FepC ABC-type cobalamin/Fe3+-siderophores transport system, ATPase component [Inorganic ion transport and metabolism, Coenzyme transport and metabolism]. 258
1145 224046 COG1121 ZnuC ABC-type Mn2+/Zn2+ transport system, ATPase component [Inorganic ion transport and metabolism]. 254
1146 224047 COG1122 EcfA2 Energy-coupling factor transporter ATP-binding protein EcfA2 [Inorganic ion transport and metabolism, General function prediction only]. 235
1147 224048 COG1123 GsiA ABC-type glutathione transport system ATPase component, contains duplicated ATPase domain [Posttranslational modification, protein turnover, chaperones]. 539
1148 224049 COG1124 DppF ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 252
1149 224050 COG1125 OpuBA ABC-type proline/glycine betaine transport system, ATPase component [Amino acid transport and metabolism]. 309
1150 224051 COG1126 GlnQ ABC-type polar amino acid transport system, ATPase component [Amino acid transport and metabolism]. 240
1151 224052 COG1127 MlaF ABC-type transporter Mla maintaining outer membrane lipid asymmetry, ATPase component MlaF [Cell wall/membrane/envelope biogenesis]. 263
1152 224053 COG1129 MglA ABC-type sugar transport system, ATPase component [Carbohydrate transport and metabolism]. 500
1153 224054 COG1131 CcmA ABC-type multidrug transport system, ATPase component [Defense mechanisms]. 293
1154 224055 COG1132 MdlB ABC-type multidrug transport system, ATPase and permease component [Defense mechanisms]. 567
1155 224056 COG1133 SbmA ABC-type long-chain fatty acid transport system, fused permease and ATPase components [Lipid transport and metabolism]. 405
1156 224057 COG1134 TagH ABC-type polysaccharide/polyol phosphate transport system, ATPase component [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 249
1157 224058 COG1135 AbcC ABC-type methionine transport system, ATPase component [Amino acid transport and metabolism]. 339
1158 224059 COG1136 LolD ABC-type lipoprotein export system, ATPase component [Cell wall/membrane/envelope biogenesis]. 226
1159 224060 COG1137 LptB ABC-type lipopolysaccharide export system, ATPase component [Cell wall/membrane/envelope biogenesis]. 243
1160 224061 COG1138 CcmF Cytochrome c biogenesis factor [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 648
1161 224062 COG1139 LutB L-lactate utilization protein LutB, contains a ferredoxin-type domain [Energy production and conversion]. 459
1162 224063 COG1140 NarY Nitrate reductase beta subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 513
1163 224064 COG1141 Fer Ferredoxin [Energy production and conversion]. 68
1164 224065 COG1142 HycB Fe-S-cluster-containing hydrogenase component 2 [Energy production and conversion]. 165
1165 224066 COG1143 NuoI Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) [Energy production and conversion]. 172
1166 224067 COG1144 PorD Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, delta subunit [Energy production and conversion]. 91
1167 224068 COG1145 NapF Ferredoxin [Energy production and conversion]. 99
1168 224069 COG1146 PreA NAD-dependent dihydropyrimidine dehydrogenase, PreA subunit [Nucleotide transport and metabolism]. 68
1169 224070 COG1148 HdrA Heterodisulfide reductase, subunit A (polyferredoxin) [Energy production and conversion]. 622
1170 224071 COG1149 COG1149 MinD superfamily P-loop ATPase, contains an inserted ferredoxin domain [General function prediction only]. 284
1171 224072 COG1150 HdrC Heterodisulfide reductase, subunit C [Energy production and conversion]. 195
1172 224073 COG1151 Hcp Hydroxylamine reductase (hybrid-cluster protein) [Inorganic ion transport and metabolism, Energy production and conversion]. 576
1173 224074 COG1152 CdhA CO dehydrogenase/acetyl-CoA synthase alpha subunit [Energy production and conversion]. 772
1174 224075 COG1153 FwdD Formylmethanofuran dehydrogenase subunit D [Energy production and conversion]. 128
1175 224076 COG1154 Dxs Deoxyxylulose-5-phosphate synthase [Coenzyme transport and metabolism, Lipid transport and metabolism]. 627
1176 224077 COG1155 NtpA Archaeal/vacuolar-type H+-ATPase catalytic subunit A/Vma1 [Energy production and conversion]. 588
1177 224078 COG1156 NtpB Archaeal/vacuolar-type H+-ATPase subunit B/Vma2 [Energy production and conversion]. 463
1178 224079 COG1157 FliI Flagellar biosynthesis/type III secretory pathway ATPase [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 441
1179 224080 COG1158 Rho Transcription termination factor Rho [Transcription]. 422
1180 224081 COG1159 Era GTPase Era, involved in 16S rRNA processing [Translation, ribosomal structure and biogenesis]. 298
1181 224082 COG1160 Der Predicted GTPases [General function prediction only]. 444
1182 224083 COG1161 RbgA Ribosome biogenesis GTPase A [Translation, ribosomal structure and biogenesis]. 322
1183 224084 COG1162 RsgA Putative ribosome biogenesis GTPase RsgA [Translation, ribosomal structure and biogenesis]. 301
1184 224085 COG1163 Rbg1 Ribosome-interacting GTPase 1 [Translation, ribosomal structure and biogenesis]. 365
1185 224086 COG1164 PepF Oligoendopeptidase F [Amino acid transport and metabolism]. 598
1186 224087 COG1165 MenD 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase [Coenzyme transport and metabolism]. 566
1187 224088 COG1166 SpeA Arginine decarboxylase (spermidine biosynthesis) [Amino acid transport and metabolism]. 652
1188 224089 COG1167 ARO8 DNA-binding transcriptional regulator, MocR family, contains an aminotransferase domain [Transcription, Amino acid transport and metabolism]. 459
1189 224090 COG1168 MalY Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities [Amino acid transport and metabolism, General function prediction only]. 388
1190 224091 COG1169 MenF Isochorismate synthase EntC [Coenzyme transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 423
1191 224092 COG1171 IlvA Threonine dehydratase [Amino acid transport and metabolism]. 347
1192 224093 COG1172 AraH Ribose/xylose/arabinose/galactoside ABC-type transport system, permease component [Carbohydrate transport and metabolism]. 316
1193 224094 COG1173 DppC ABC-type dipeptide/oligopeptide/nickel transport system, permease component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 289
1194 224095 COG1174 OpuBB ABC-type proline/glycine betaine transport system, permease component [Amino acid transport and metabolism]. 221
1195 224096 COG1175 UgpA ABC-type sugar transport system, permease component [Carbohydrate transport and metabolism]. 295
1196 224097 COG1176 PotB ABC-type spermidine/putrescine transport system, permease component I [Amino acid transport and metabolism]. 287
1197 224098 COG1177 PotC ABC-type spermidine/putrescine transport system, permease component II [Amino acid transport and metabolism]. 267
1198 224099 COG1178 FbpB ABC-type Fe3+ transport system, permease component [Inorganic ion transport and metabolism]. 540
1199 224100 COG1179 TcdA tRNA A37 threonylcarbamoyladenosine dehydratase [Translation, ribosomal structure and biogenesis]. 263
1200 224101 COG1180 PflA Pyruvate-formate lyase-activating enzyme [Posttranslational modification, protein turnover, chaperones]. 260
1201 224102 COG1181 DdlA D-alanine-D-alanine ligase and related ATP-grasp enzymes [Cell wall/membrane/envelope biogenesis, General function prediction only]. 317
1202 224103 COG1182 AzoR FMN-dependent NADH-azoreductase [Energy production and conversion]. 202
1203 224104 COG1183 PssA Phosphatidylserine synthase [Lipid transport and metabolism]. 234
1204 224105 COG1184 GCD2 Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. 301
1205 224106 COG1185 Pnp Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) [Translation, ribosomal structure and biogenesis]. 692
1206 224107 COG1186 PrfB Protein chain release factor B [Translation, ribosomal structure and biogenesis]. 239
1207 224108 COG1187 RsuA 16S rRNA U516 pseudouridylate synthase RsuA and related 23S rRNA U2605, pseudouridylate synthases [Translation, ribosomal structure and biogenesis]. 248
1208 224109 COG1188 HslR Ribosomal 50S subunit-recycling heat shock protein, contains S4 domain [Translation, ribosomal structure and biogenesis]. 100
1209 224110 COG1189 YqxC Predicted rRNA methylase YqxC, contains S4 and FtsJ domains [Translation, ribosomal structure and biogenesis]. 245
1210 224111 COG1190 LysU Lysyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 502
1211 224112 COG1191 FliA DNA-directed RNA polymerase specialized sigma subunit [Transcription]. 247
1212 224113 COG1192 BcsQ Cellulose biosynthesis protein BcsQ [Cell motility]. 259
1213 224114 COG1193 MutS2 dsDNA-specific endonuclease/ATPase MutS2 [Replication, recombination and repair]. 753
1214 224115 COG1194 MutY Adenine-specific DNA glycosylase, acts on AG and A-oxoG pairs [Replication, recombination and repair]. 342
1215 224116 COG1195 RecF Recombinational DNA repair ATPase RecF [Replication, recombination and repair]. 363
1216 224117 COG1196 Smc Chromosome segregation ATPase [Cell cycle control, cell division, chromosome partitioning]. 1163
1217 224118 COG1197 Mfd Transcription-repair coupling factor (superfamily II helicase) [Replication, recombination and repair, Transcription]. 1139
1218 224119 COG1198 PriA Primosomal protein N' (replication factor Y) - superfamily II helicase [Replication, recombination and repair]. 730
1219 224120 COG1199 DinG Rad3-related DNA helicase [Replication, recombination and repair]. 654
1220 224121 COG1200 RecG RecG-like helicase [Replication, recombination and repair]. 677
1221 224122 COG1201 Lhr Lhr-like helicase [Replication, recombination and repair]. 814
1222 224123 COG1202 COG1202 Superfamily II helicase, archaea-specific [Replication, recombination and repair]. 830
1223 224124 COG1203 Cas3 CRISPR/Cas system-associated endonuclease/helicase Cas3 [Defense mechanisms]. 733
1224 224125 COG1204 BRR2 Replicative superfamily II helicase [Replication, recombination and repair]. 766
1225 224126 COG1205 YprA ATP-dependent helicase YprA, contains C-terminal metal-binding DUF1998 domain [Replication, recombination and repair]. 851
1226 224127 COG1206 TrmFO Folate-dependent tRNA-U54 methylase TrmFO/GidA [Translation, ribosomal structure and biogenesis]. 439
1227 224128 COG1207 GlmU Bifunctional protein GlmU, N-acetylglucosamine-1-phosphate-uridyltransferase/glucosamine-1-phosphate-acetyltransferase [Cell wall/membrane/envelope biogenesis]. 460
1228 224129 COG1208 GCD1 NDP-sugar pyrophosphorylase, includes eIF-2Bgamma, eIF-2Bepsilon, and LPS biosynthesis proteins [Translation, ribosomal structure and biogenesis, Cell wall/membrane/envelope biogenesis]. 358
1229 224130 COG1209 RmlA1 dTDP-glucose pyrophosphorylase [Cell wall/membrane/envelope biogenesis]. 286
1230 224131 COG1210 GalU UTP-glucose-1-phosphate uridylyltransferase [Cell wall/membrane/envelope biogenesis]. 291
1231 224132 COG1211 IspD 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase [Lipid transport and metabolism]. 230
1232 224133 COG1212 KdsB CMP-2-keto-3-deoxyoctulosonic acid synthetase [Cell wall/membrane/envelope biogenesis]. 247
1233 224134 COG1213 COG1213 Choline kinase [Lipid transport and metabolism]. 239
1234 224135 COG1214 TsaB tRNA A37 threonylcarbamoyladenosine modification protein TsaB [Translation, ribosomal structure and biogenesis]. 220
1235 224136 COG1215 BcsA Glycosyltransferase, catalytic subunit of cellulose synthase and poly-beta-1,6-N-acetylglucosamine synthase [Cell motility]. 439
1236 224137 COG1216 GT2 Glycosyltransferase, GT2 family [Carbohydrate transport and metabolism]. 305
1237 224138 COG1217 TypA Predicted membrane GTPase involved in stress response [Signal transduction mechanisms]. 603
1238 224139 COG1218 CysQ 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase [Inorganic ion transport and metabolism]. 276
1239 224140 COG1219 ClpX ATP-dependent protease Clp, ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 408
1240 224141 COG1220 HslU ATP-dependent protease HslVU (ClpYQ), ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 444
1241 224142 COG1221 PspF Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain [Transcription, Signal transduction mechanisms]. 403
1242 224143 COG1222 RPT1 ATP-dependent 26S proteasome regulatory subunit [Posttranslational modification, protein turnover, chaperones]. 406
1243 224144 COG1223 COG1223 Predicted ATPase, AAA+ superfamily [General function prediction only]. 368
1244 224145 COG1224 TIP49 DNA helicase TIP49, TBP-interacting protein [Transcription]. 450
1245 224146 COG1225 Bcp Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 157
1246 224147 COG1226 Kch Voltage-gated potassium channel Kch [Inorganic ion transport and metabolism]. 212
1247 224148 COG1227 PPX1 Inorganic pyrophosphatase/exopolyphosphatase [Energy production and conversion, Inorganic ion transport and metabolism]. 311
1248 224149 COG1228 HutI Imidazolonepropionase or related amidohydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 406
1249 224150 COG1229 FwdA Formylmethanofuran dehydrogenase subunit A [Energy production and conversion]. 575
1250 224151 COG1230 CzcD Co/Zn/Cd efflux system component [Inorganic ion transport and metabolism]. 296
1251 224152 COG1231 YobN Monoamine oxidase [Amino acid transport and metabolism]. 450
1252 224153 COG1232 HemY Protoporphyrinogen oxidase [Coenzyme transport and metabolism]. 444
1253 224154 COG1233 COG1233 Phytoene dehydrogenase-related protein [Secondary metabolites biosynthesis, transport and catabolism]. 487
1254 224155 COG1234 ElaC Ribonuclease BN, tRNA processing enzyme [Translation, ribosomal structure and biogenesis]. 292
1255 224156 COG1235 PhnP Phosphoribosyl 1,2-cyclic phosphodiesterase [Inorganic ion transport and metabolism]. 269
1256 224157 COG1236 YSH1 RNA processing exonuclease, beta-lactamase fold, Cft2 family [Translation, ribosomal structure and biogenesis]. 427
1257 224158 COG1237 COG1237 Metal-dependent hydrolase, beta-lactamase superfamily II [General function prediction only]. 259
1258 224159 COG1238 YgaA Uncharacterized membrane protein YqaA, SNARE-associated domain [Function unknown]. 161
1259 224160 COG1239 ChlI Mg-chelatase subunit ChlI [Coenzyme transport and metabolism]. 423
1260 224161 COG1240 ChlD Mg-chelatase subunit ChlD [Coenzyme transport and metabolism]. 261
1261 224162 COG1241 Mcm2 DNA replicative helicase MCM subunit Mcm2, Cdc46/Mcm family [Replication, recombination and repair]. 682
1262 224163 COG1242 YhcC Radical SAM superfamily enzyme [General function prediction only]. 312
1263 224164 COG1243 ELP3 Histone acetyltransferase, component of the RNA polymerase elongator complex [Transcription, Chromatin structure and dynamics]. 515
1264 224165 COG1244 COG1244 Uncharacterized Fe-S cluster-containing protein. MiaB family [General function prediction only]. 358
1265 224166 COG1245 Rli1 Translation initiation factor RLI1, contains Fe-S and AAA+ ATPase domains [Translation, ribosomal structure and biogenesis]. 591
1266 224167 COG1246 ArgA N-acetylglutamate synthase or related acetyltransferase, GNAT family [Amino acid transport and metabolism]. 153
1267 224168 COG1247 YncA L-amino acid N-acyltransferase YncA [Amino acid transport and metabolism]. 169
1268 224169 COG1249 Lpd Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component or related enzyme [Energy production and conversion]. 454
1269 224170 COG1250 FadB 3-hydroxyacyl-CoA dehydrogenase [Lipid transport and metabolism]. 307
1270 224171 COG1251 NirB NAD(P)H-nitrite reductase, large subunit [Energy production and conversion]. 793
1271 224172 COG1252 Ndh NADH dehydrogenase, FAD-containing subunit [Energy production and conversion]. 405
1272 224173 COG1253 TlyC Hemolysin or related protein, contains CBS domains [General function prediction only]. 429
1273 224174 COG1254 AcyP Acylphosphatase [Energy production and conversion]. 92
1274 224175 COG1255 COG1255 Uncharacterized protein, UPF0146 family [Function unknown]. 129
1275 224176 COG1256 FlgK Flagellar hook-associated protein FlgK [Cell motility]. 552
1276 224177 COG1257 HMG1 Hydroxymethylglutaryl-CoA reductase [Lipid transport and metabolism]. 436
1277 224178 COG1258 Pus10 tRNA U54 and U55 pseudouridine synthase Pus10 [Translation, ribosomal structure and biogenesis]. 398
1278 224179 COG1259 COG1259 Bifunctional DNase/RNase [General function prediction only]. 151
1279 224180 COG1260 INO1 Myo-inositol-1-phosphate synthase [Lipid transport and metabolism]. 362
1280 224181 COG1261 FlgA Flagella basal body P-ring formation protein FlgA [Cell motility]. 220
1281 224182 COG1262 YfmG Formylglycine-generating enzyme, required for sulfatase activity, contains SUMF1/FGE domain [Posttranslational modification, protein turnover, chaperones]. 314
1282 224183 COG1263 PtsG1 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific [Carbohydrate transport and metabolism]. 393
1283 224184 COG1264 PtsG2 Phosphotransferase system IIB components [Carbohydrate transport and metabolism]. 88
1284 224185 COG1266 YdiL Membrane protease YdiL, CAAX protease family [Posttranslational modification, protein turnover, chaperones]. 226
1285 224186 COG1267 PgpA Phosphatidylglycerophosphatase A [Lipid transport and metabolism]. 160
1286 224187 COG1268 BioY Biotin transporter BioY [Coenzyme transport and metabolism]. 184
1287 224188 COG1269 NtpI Archaeal/vacuolar-type H+-ATPase subunit I/STV1 [Energy production and conversion]. 660
1288 224189 COG1270 CbiB Cobalamin biosynthesis protein CobD/CbiB [Coenzyme transport and metabolism]. 320
1289 224190 COG1271 AppC Cytochrome bd-type quinol oxidase, subunit 1 [Energy production and conversion]. 457
1290 224191 COG1272 YqfA Predicted membrane channel-forming protein YqfA, hemolysin III family [Intracellular trafficking, secretion, and vesicular transport]. 226
1291 224192 COG1273 YkoV Non-homologous end joining protein Ku, dsDNA break repair [Replication, recombination and repair]. 278
1292 224193 COG1274 PepCK Phosphoenolpyruvate carboxykinase, GTP-dependent [Energy production and conversion]. 608
1293 224194 COG1275 TehA Tellurite resistance protein TehA and related permeases [Defense mechanisms]. 329
1294 224195 COG1276 PcoD Putative copper export protein [Inorganic ion transport and metabolism]. 289
1295 224196 COG1277 NosY ABC-type transport system involved in multi-copper enzyme maturation, permease component [Posttranslational modification, protein turnover, chaperones]. 278
1296 224197 COG1278 CspC Cold shock protein, CspA family [Transcription]. 67
1297 224198 COG1279 ArgO Arginine exporter protein ArgO [Amino acid transport and metabolism]. 202
1298 224199 COG1280 RhtB Threonine/homoserine/homoserine lactone efflux protein [Amino acid transport and metabolism]. 208
1299 224200 COG1281 HslO Redox-regulated molecular chaperone, HSP33 family [Posttranslational modification, protein turnover, chaperones]. 286
1300 224201 COG1282 PntB NAD/NADP transhydrogenase beta subunit [Energy production and conversion]. 463
1301 224202 COG1283 NptA Na+/phosphate symporter [Inorganic ion transport and metabolism]. 533
1302 224203 COG1284 YitT Uncharacterized membrane-anchored protein YitT, contains DUF161 and DUF2179 domains [Function unknown]. 289
1303 224204 COG1285 SapB Uncharacterized membrane protein YhiD, involved in acid resistance [Function unknown]. 221
1304 224205 COG1286 CvpA Uncharacterized membrane protein, required for colicin V production [Function unknown]. 182
1305 224206 COG1287 Stt3 Asparagine N-glycosylation enzyme, membrane subunit Stt3 [Posttranslational modification, protein turnover, chaperones]. 773
1306 224207 COG1288 YfcC Uncharacterized membrane protein YfcC, ion transporter superfamily [General function prediction only]. 481
1307 224208 COG1289 YccC Uncharacterized membrane protein YccC [Function unknown]. 674
1308 224209 COG1290 QcrB Cytochrome b subunit of the bc complex [Energy production and conversion]. 381
1309 224210 COG1291 MotA Flagellar motor component MotA [Cell motility]. 266
1310 224211 COG1292 BetT Choline-glycine betaine transporter [Cell wall/membrane/envelope biogenesis]. 537
1311 224212 COG1293 YloA Predicted component of the ribosome quality control (RQC) complex, YloA/Tae2 family, contains fibronectin-binding (FbpA) and DUF814 domains [Translation, ribosomal structure and biogenesis]. 564
1312 224213 COG1294 AppB Cytochrome bd-type quinol oxidase, subunit 2 [Energy production and conversion]. 346
1313 224214 COG1295 BrkB Uncharacterized membrane protein, BrkB/YihY/UPF0761 family (not an RNase) [Function unknown]. 303
1314 224215 COG1296 AzlC Predicted branched-chain amino acid permease (azaleucine resistance) [Amino acid transport and metabolism]. 238
1315 224216 COG1297 OPT Uncharacterized membrane protein, oligopeptide transporter (OPT) family [Function unknown]. 624
1316 224217 COG1298 FlhA Flagellar biosynthesis pathway, component FlhA [Cell motility]. 696
1317 224218 COG1299 FrwC Phosphotransferase system, fructose-specific IIC component [Carbohydrate transport and metabolism]. 343
1318 224219 COG1300 SpoIIM Uncharacterized membrane protein SpoIIM, required for sporulation [Cell cycle control, cell division, chromosome partitioning]. 207
1319 224220 COG1301 GltP Na+/H+-dicarboxylate symporter [Energy production and conversion]. 415
1320 224221 COG1302 YloU Uncharacterized conserved protein YloU, alkaline shock protein (Asp23) family [Function unknown]. 131
1321 224222 COG1303 COG1303 Predicted rRNA methylase, SpoU family [General function prediction only]. 179
1322 224223 COG1304 LldD FMN-dependent dehydrogenase, includes L-lactate dehydrogenase and type II isopentenyl diphosphate isomerase [Energy production and conversion, Lipid transport and metabolism, General function prediction only]. 360
1323 224224 COG1305 YebA Transglutaminase-like enzyme, putative cysteine protease [Posttranslational modification, protein turnover, chaperones]. 319
1324 224225 COG1306 COG1306 Predicted glycosyl hydrolase, alpha amylase family [General function prediction only]. 400
1325 224226 COG1307 DegV Fatty acid-binding protein DegV (function unknown) [Lipid transport and metabolism]. 282
1326 224227 COG1308 EGD2 Transcription factor homologous to NACalpha-BTF3 [Transcription]. 122
1327 224228 COG1309 AcrR DNA-binding transcriptional regulator, AcrR family [Transcription]. 201
1328 224229 COG1310 Rri1 Proteasome lid subunit RPN8/RPN11, contains Jab1/MPN domain metalloenzyme (JAMM) motif [Posttranslational modification, protein turnover, chaperones]. 134
1329 224230 COG1311 HYS2 Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B [Replication, recombination and repair]. 481
1330 224231 COG1312 UxuA D-mannonate dehydratase [Carbohydrate transport and metabolism]. 362
1331 224232 COG1313 PflX Uncharacterized Fe-S protein PflX, radical SAM superfamily [General function prediction only]. 335
1332 224233 COG1314 SecG Preprotein translocase subunit SecG [Intracellular trafficking, secretion, and vesicular transport]. 86
1333 224234 COG1315 COG1315 Uncharacterized conserved protein, DUF342 family [Function unknown]. 543
1334 224235 COG1316 Cps2a Anionic cell wall polymer biosynthesis enzyme, LytR-Cps2A-Psr (LCP) family [Cell wall/membrane/envelope biogenesis]. 307
1335 224236 COG1317 FliH Flagellar biosynthesis/type III secretory pathway protein FliH [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 234
1336 224237 COG1318 COG1318 Predicted transcriptional regulator [Transcription]. 182
1337 224238 COG1319 CoxM CO or xanthine dehydrogenase, FAD-binding subunit [Energy production and conversion]. 284
1338 224239 COG1320 MnhG Multisubunit Na+/H+ antiporter, MnhG subunit [Inorganic ion transport and metabolism]. 113
1339 224240 COG1321 MntR Mn-dependent transcriptional regulator, DtxR family [Transcription]. 154
1340 224241 COG1322 RmuC DNA anti-recombination protein (rearrangement mutator) RmuC [Replication, recombination and repair]. 448
1341 224242 COG1323 YlbM Predicted nucleotidyltransferase [General function prediction only]. 358
1342 224243 COG1324 CutA Uncharacterized protein involved in tolerance to divalent cations [Inorganic ion transport and metabolism]. 104
1343 224244 COG1325 COG1325 Exosome subunit, RNA binding protein with dsRBD fold [Translation, ribosomal structure and biogenesis]. 149
1344 224245 COG1326 COG1326 Uncharacterized archaeal Zn-finger protein [General function prediction only]. 201
1345 224246 COG1327 NrdR Transcriptional regulator NrdR, contains Zn-ribbon and ATP-cone domains [Transcription]. 156
1346 224247 COG1328 NrdD Anaerobic ribonucleoside-triphosphate reductase [Nucleotide transport and metabolism]. 700
1347 224248 COG1329 CdnL RNA polymerase-interacting regulator, CarD/CdnL/TRCF family [Transcription]. 166
1348 224249 COG1330 RecC Exonuclease V gamma subunit [Replication, recombination and repair]. 1078
1349 224250 COG1331 YyaL Uncharacterized conserved protein YyaL, SSP411 family, contains thioredoxin and six-hairpin glycosidase-like domains [General function prediction only]. 667
1350 224251 COG1332 Csm5 CRISPR/Cas system CSM-associated protein Csm5, group 7 of RAMP superfamily [Defense mechanisms]. 369
1351 224252 COG1333 ResB Cytochrome c biogenesis protein ResB [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 478
1352 224253 COG1334 FlaG Uncharacterized conserved protein, FlaG/YvyC family [General function prediction only]. 120
1353 224254 COG1335 PncA Nicotinamidase-related amidase [Coenzyme transport and metabolism, General function prediction only]. 205
1354 224255 COG1336 Cmr4 CRISPR/Cas system CMR subunit Cmr4, Cas7 group, RAMP superfamily [Defense mechanisms]. 298
1355 224256 COG1337 Csm3 CRISPR/Cas system CSM-associated protein Csm3, group 7 of RAMP superfamily [Defense mechanisms]. 249
1356 224257 COG1338 FliP Flagellar biosynthetic protein FliP [Cell motility]. 248
1357 224258 COG1339 Rfk Archaeal CTP-dependent riboflavin kinase [Coenzyme transport and metabolism]. 214
1358 224259 COG1340 COG1340 Uncharacterized coiled-coil protein, contains DUF342 domain [Function unknown]. 294
1359 224260 COG1341 Grc3 Polynucleotide 5'-kinase, involved in rRNA processing [Translation, ribosomal structure and biogenesis]. 398
1360 224261 COG1342 COG1342 Predicted DNA-binding protein, UPF0251 family [General function prediction only]. 99
1361 224262 COG1343 Cas2 CRISPR/Cas system-associated endoribonuclease Cas2 [Defense mechanisms]. 89
1362 224263 COG1344 FlgL Flagellin and related hook-associated protein FlgL [Cell motility]. 360
1363 224264 COG1345 FliD Flagellar capping protein FliD [Cell motility]. 483
1364 224265 COG1346 LrgB Putative effector of murein hydrolase [Cell wall/membrane/envelope biogenesis]. 230
1365 224266 COG1347 NqrD Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD [Energy production and conversion]. 208
1366 224267 COG1348 NifH Nitrogenase subunit NifH, an ATPase [Inorganic ion transport and metabolism]. 278
1367 224268 COG1349 GlpR DNA-binding transcriptional regulator of sugar metabolism, DeoR/GlpR family [Transcription, Carbohydrate transport and metabolism]. 253
1368 224269 COG1350 COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) [Amino acid transport and metabolism]. 432
1369 224270 COG1351 ThyX Thymidylate synthase ThyX [Nucleotide transport and metabolism]. 273
1370 224271 COG1352 CheR Methylase of chemotaxis methyl-accepting proteins [Cell motility, Signal transduction mechanisms]. 268
1371 224272 COG1353 Cas10 CRISPR/Cas system-associated protein Cas10, large subunit of type III CRISPR-Cas systems, contains HD superfamily nuclease domain [Defense mechanisms]. 799
1372 224273 COG1354 ScpA Chromatin segregation and condensation protein Rec8/ScpA/Scc1, kleisin family [Replication, recombination and repair]. 248
1373 224274 COG1355 Mho1 Predicted class III extradiol dioxygenase, MEMO1 family [General function prediction only]. 279
1374 224275 COG1356 COG1356 Transcriptional regulator [Transcription]. 143
1375 224276 COG1357 YjbI Uncharacterized protein YjbI, contains pentapeptide repeats [Function unknown]. 238
1376 224277 COG1358 Rpl7Ae Ribosomal protein L7Ae or related RNA K-turn-binding protein [Translation, ribosomal structure and biogenesis]. 116
1377 224278 COG1359 YgiN Quinol monooxygenase YgiN [Energy production and conversion]. 100
1378 224279 COG1360 MotB Flagellar motor protein MotB [Cell motility]. 244
1379 224280 COG1361 COG1361 Uncharacterized conserved protein [Function unknown]. 500
1380 224281 COG1362 LAP4 Aspartyl aminopeptidase [Amino acid transport and metabolism]. 437
1381 224282 COG1363 FrvX Putative aminopeptidase FrvX [Amino acid transport and metabolism, Carbohydrate transport and metabolism]. 355
1382 224283 COG1364 ArgJ N-acetylglutamate synthase (N-acetylornithine aminotransferase) [Amino acid transport and metabolism]. 404
1383 224284 COG1365 COG1365 Predicted ATPase, PP-loop superfamily [General function prediction only]. 255
1384 224285 COG1366 SpoIIAA Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) [Signal transduction mechanisms]. 117
1385 224286 COG1367 Cmr1 CRISPR/Cas system CMR-associated protein Cmr1, group 7 of RAMP superfamily [Defense mechanisms]. 393
1386 224287 COG1368 MdoB Phosphoglycerol transferase MdoB or a related enzyme of AlkP superfamily [Cell wall/membrane/envelope biogenesis]. 650
1387 224288 COG1369 POP5 RNase P/RNase MRP subunit POP5 [Translation, ribosomal structure and biogenesis]. 124
1388 224289 COG1370 COG1370 tRNA-guanine transglycosylase, archaeosine-15-forming [Translation, ribosomal structure and biogenesis]. 155
1389 224290 COG1371 Archease Archease protein family (MTH1598/TM1083) [General function prediction only]. This archease family of proteins has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism. 137
1390 224291 COG1372 Hop Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons]. 420
1391 224292 COG1373 COG1373 Predicted ATPase, AAA+ superfamily [General function prediction only]. 398
1392 224293 COG1374 NIP7 Ribosome biogenesis protein Nip4, contains PUA domain [Translation, ribosomal structure and biogenesis]. 176
1393 224294 COG1376 ErfK Lipoprotein-anchoring transpeptidase ErfK/SrfK [Cell wall/membrane/envelope biogenesis]. 232
1394 224295 COG1377 FlhB Flagellar biosynthesis protein FlhB [Cell motility]. 363
1395 224296 COG1378 TrmB Sugar-specific transcriptional regulator TrmB [Transcription]. 247
1396 224297 COG1379 YqxK PHP family phosphoesterase with a Zn ribbon [General function prediction only]. 403
1397 224298 COG1380 YohJ Putative effector of murein hydrolase LrgA, UPF0299 family [General function prediction only]. 128
1398 224299 COG1381 RecO Recombinational DNA repair protein (RecF pathway) [Replication, recombination and repair]. 251
1399 224300 COG1382 GimC Prefoldin, chaperonin cofactor [Posttranslational modification, protein turnover, chaperones]. 119
1400 224301 COG1383 RPS17A Ribosomal protein S17E [Translation, ribosomal structure and biogenesis]. 74
1401 224302 COG1384 LysS Lysyl-tRNA synthetase, class I [Translation, ribosomal structure and biogenesis]. 521
1402 224303 COG1385 RsmE 16S rRNA U1498 N3-methylase RsmE [Translation, ribosomal structure and biogenesis]. 246
1403 224304 COG1386 ScpB Chromosome segregation and condensation protein ScpB [Transcription]. 184
1404 224305 COG1387 HIS2 Histidinol phosphatase or related hydrolase of the PHP family [Amino acid transport and metabolism, General function prediction only]. 237
1405 224306 COG1388 LysM LysM repeat [Cell wall/membrane/envelope biogenesis]. 124
1406 224307 COG1389 COG1389 DNA topoisomerase VI, subunit B [Replication, recombination and repair]. 538
1407 224308 COG1390 NtpE Archaeal/vacuolar-type H+-ATPase subunit E/Vma4 [Energy production and conversion]. 194
1408 224309 COG1391 GlnE Glutamine synthetase adenylyltransferase [Posttranslational modification, protein turnover, chaperones]. 963
1409 224310 COG1392 YkaA Uncharacterized conserved protein YkaA, distantly related to PhoU, UPF0111/DUF47 family [Function unknown]. 217
1410 224311 COG1393 ArsC Arsenate reductase and related proteins, glutaredoxin family [Inorganic ion transport and metabolism]. 117
1411 224312 COG1394 NtpD Archaeal/vacuolar-type H+-ATPase subunit D/Vma8 [Energy production and conversion]. 211
1412 224313 COG1395 COG1395 Predicted transcriptional regulator [Transcription]. 313
1413 224314 COG1396 HipB Transcriptional regulator, contains XRE-family HTH domain [Transcription]. 120
1414 224315 COG1397 DraG ADP-ribosylglycohydrolase [Posttranslational modification, protein turnover, chaperones]. 314
1415 224316 COG1398 OLE1 Fatty-acid desaturase [Lipid transport and metabolism]. 289
1416 224317 COG1399 YceD Uncharacterized metal-binding protein YceD, DUF177 family [Function unknown]. 176
1417 224318 COG1400 SEC65 Signal recognition particle subunit SEC65 [Intracellular trafficking, secretion, and vesicular transport]. 93
1418 224319 COG1401 McrB 5-methylcytosine-specific restriction endonuclease McrBC, GTP-binding regulatory subunit McrB [Defense mechanisms]. 601
1419 224320 COG1402 ArfB Creatinine amidohydrolase/Fe(II)-dependent formamide hydrolase involved in riboflavin and F420 biosynthesis [Coenzyme transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 250
1420 224321 COG1403 McrA 5-methylcytosine-specific restriction endonuclease McrA [Defense mechanisms]. 146
1421 224322 COG1404 AprE Serine protease, subtilisin family [Posttranslational modification, protein turnover, chaperones]. 508
1422 224323 COG1405 SUA7 Transcription initiation factor TFIIIB, Brf1 subunit/Transcription initiation factor TFIIB [Transcription]. 285
1423 224324 COG1406 CheX Chemotaxis protein CheX, a CheY~P-specific phosphatase [Cell motility]. 153
1424 224325 COG1407 COG1407 Metallophosphoesterase superfamily enzyme [General function prediction only]. 235
1425 224326 COG1408 YaeI Predicted phosphohydrolase, MPP superfamily [General function prediction only]. 284
1426 224327 COG1409 CpdA 3',5'-cyclic AMP phosphodiesterase CpdA [Signal transduction mechanisms]. 301
1427 224328 COG1410 MetH2 Methionine synthase I, cobalamin-binding domain [Amino acid transport and metabolism]. 842
1428 224329 COG1411 COG1411 Uncharacterized protein related to proFAR isomerase (HisA) [General function prediction only]. 229
1429 224330 COG1412 Fcf1 rRNA-processing protein FCF1 [Translation, ribosomal structure and biogenesis]. 136
1430 224331 COG1413 HEAT HEAT repeat [General function prediction only]. 335
1431 224332 COG1414 IclR DNA-binding transcriptional regulator, IclR family [Transcription]. 246
1432 224333 COG1415 COG1415 Uncharacterized protein [Function unknown]. 373
1433 224334 COG1416 COG1416 Intracellular sulfur oxidation protein, DsrE/DsrF family [Inorganic ion transport and metabolism]. 112
1434 224335 COG1417 COG1417 Uncharacterized protein [Function unknown]. 288
1435 224336 COG1418 RnaY HD superfamily phosphodiesterase, includes HD domain of RNase Y [Translation, ribosomal structure and biogenesis, General function prediction only]. 222
1436 224337 COG1419 FlhF Flagellar biosynthesis GTPase FlhF [Cell motility]. 407
1437 224338 COG1420 HrcA Transcriptional regulator of heat shock response [Transcription]. 346
1438 224339 COG1421 Csm2 CRISPR/Cas system CSM-associated protein Csm2, small subunit [Defense mechanisms]. 137
1439 224340 COG1422 COG1422 Uncharacterized archaeal membrane protein, DUF106 family, distantly related to YidC/Oxa1 [Function unknown]. 201
1440 224341 COG1423 COG1423 ATP-dependent RNA circularization protein, DNA/RNA ligase (PAB1020) family [Replication, recombination and repair]. 382
1441 224342 COG1424 BioW Pimeloyl-CoA synthetase [Coenzyme transport and metabolism]. 239
1442 224343 COG1426 RodZ Cytoskeletal protein RodZ, contains Xre-like HTH and DUF4115 domains [Cell cycle control, cell division, chromosome partitioning]. 284
1443 224344 COG1427 MqnA Menaquinone biosynthesis enzyme MqnA [Coenzyme transport and metabolism]. 252
1444 224345 COG1428 Dck Deoxyadenosine/deoxycytidine kinase [Nucleotide transport and metabolism]. 216
1445 224346 COG1429 CobN Cobalamin biosynthesis protein CobN, Mg-chelatase [Coenzyme transport and metabolism]. 1388
1446 224347 COG1430 COG1430 Uncharacterized conserved membrane protein, UPF0127 family [Function unknown]. 126
1447 224348 COG1431 COG1431 Argonaute homolog, implicated in RNA metabolism and viral defense [Translation, ribosomal structure and biogenesis, Defense mechanisms]. 685
1448 224349 COG1432 LabA Uncharacterized conserved protein, LabA/DUF88 family [Function unknown]. 181
1449 224350 COG1433 NifX Predicted Fe-Mo cluster-binding protein, NifX family [Posttranslational modification, protein turnover, chaperones]. 121
1450 224351 COG1434 YdcF Uncharacterized SAM-binding protein YdcF, DUF218 family [General function prediction only]. 223
1451 224352 COG1435 Tdk Thymidine kinase [Nucleotide transport and metabolism]. 201
1452 224353 COG1436 NtpF Archaeal/vacuolar-type H+-ATPase subunit F/Vma7 [Energy production and conversion]. 104
1453 224354 COG1437 CyaB Adenylate cyclase class IV, CYTH domain (includes archaeal enzymes of unknown function) [Signal transduction mechanisms, General function prediction only]. 178
1454 224355 COG1438 ArgR Arginine repressor [Transcription]. 150
1455 224356 COG1439 Nob1 rRNA maturation endonuclease Nob1 [Translation, ribosomal structure and biogenesis]. 177
1456 224357 COG1440 CelA Phosphotransferase system cellobiose-specific component IIB [Carbohydrate transport and metabolism]. 102
1457 224358 COG1441 MenC O-succinylbenzoate synthase [Coenzyme transport and metabolism]. 321
1458 224359 COG1442 RfaJ Lipopolysaccharide biosynthesis protein, LPS:glycosyltransferase [Cell wall/membrane/envelope biogenesis]. 325
1459 224360 COG1443 Idi Isopentenyldiphosphate isomerase [Lipid transport and metabolism]. 185
1460 224361 COG1444 TmcA tRNA(Met) C34 N-acetyltransferase TmcA [Translation, ribosomal structure and biogenesis]. 758
1461 224362 COG1445 FrwB Phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism]. 122
1462 224363 COG1446 IaaA Isoaspartyl peptidase or L-asparaginase, Ntn-hydrolase superfamily [Amino acid transport and metabolism]. 307
1463 224364 COG1447 CelC Phosphotransferase system cellobiose-specific component IIA [Carbohydrate transport and metabolism]. 105
1464 224365 COG1448 TyrB Aspartate/tyrosine/aromatic aminotransferase [Amino acid transport and metabolism]. 396
1465 224366 COG1449 COG1449 Alpha-amylase/alpha-mannosidase, GH57 family [Carbohydrate transport and metabolism]. 615
1466 224367 COG1450 PulD Type II secretory pathway component GspD/PulD (secretin) [Intracellular trafficking, secretion, and vesicular transport]. 587
1467 224368 COG1451 YgjP Predicted metal-dependent hydrolase [General function prediction only]. 223
1468 224369 COG1452 LptD LPS assembly outer membrane protein LptD (organic solvent tolerance protein OstA) [Cell wall/membrane/envelope biogenesis]. 784
1469 224370 COG1453 COG1453 Predicted oxidoreductase of the aldo/keto reductase family [General function prediction only]. 391
1470 224371 COG1454 EutG Alcohol dehydrogenase, class IV [Energy production and conversion]. 377
1471 224372 COG1455 CelB Phosphotransferase system cellobiose-specific component IIC [Carbohydrate transport and metabolism]. 432
1472 224373 COG1456 CdhE CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein) [Energy production and conversion]. 467
1473 224374 COG1457 CodB Purine-cytosine permease or related protein [Nucleotide transport and metabolism]. 442
1474 224375 COG1458 COG1458 Predicted DNA-binding protein containing PIN domain, UPF0278 family [General function prediction only]. 221
1475 224376 COG1459 PulF Type II secretory pathway, component PulF [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 397
1476 224377 COG1460 RpoF DNA-directed RNA polymerase, subunit F [Transcription]. 114
1477 224378 COG1461 YloV Predicted kinase related to dihydroxyacetone kinase [General function prediction only]. 542
1478 224379 COG1462 CsgG Curli biogenesis system outer membrane secretion channel CsgG [Cell wall/membrane/envelope biogenesis]. 252
1479 224380 COG1463 MlaD ABC-type transporter Mla maintaining outer membrane lipid asymmetry, periplasmic component MlaD [Cell wall/membrane/envelope biogenesis]. 359
1480 224381 COG1464 NlpA ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]. 268
1481 224382 COG1465 AroB2 3-dehydroquinate synthase, class II [Amino acid transport and metabolism]. 376
1482 224383 COG1466 HolA DNA polymerase III, delta subunit [Replication, recombination and repair]. 334
1483 224384 COG1467 PRI1 Eukaryotic-type DNA primase, catalytic (small) subunit [Replication, recombination and repair]. 341
1484 224385 COG1468 Cas4 CRISPR/Cas system-associated exonuclease Cas4, RecB family [Defense mechanisms]. 190
1485 224386 COG1469 FolE2 GTP cyclohydrolase FolE2 [Coenzyme transport and metabolism]. 289
1486 224387 COG1470 COG1470 Uncharacterized membrane protein [Function unknown]. 513
1487 224388 COG1471 RPS4A Ribosomal protein S4E [Translation, ribosomal structure and biogenesis]. 241
1488 224389 COG1472 BglX Periplasmic beta-glucosidase and related glycosidases [Carbohydrate transport and metabolism]. 397
1489 224390 COG1473 AbgB Metal-dependent amidase/aminoacylase/carboxypeptidase [General function prediction only]. 392
1490 224391 COG1474 CDC6 Cdc6-related protein, AAA superfamily ATPase [Replication, recombination and repair]. 366
1491 224392 COG1475 Spo0J Chromosome segregation protein Spo0J, contains ParB-like nuclease domain [Cell cycle control, cell division, chromosome partitioning]. 240
1492 224393 COG1476 XRE DNA-binding transcriptional regulator, XRE-family HTH domain [Transcription]. 68
1493 224394 COG1477 ApbE Thiamine biosynthesis lipoprotein ApbE [Coenzyme transport and metabolism]. 337
1494 224395 COG1478 CofE F420-0:Gamma-glutamyl ligase (F420 biosynthesis) [Coenzyme transport and metabolism]. 257
1495 224396 COG1479 COG1479 Uncharacterized conserved protein, contains ParB-like and HNH nuclease domains [Function unknown]. 409
1496 224397 COG1480 YqfF Membrane-associated HD superfamily phosphohydrolase [General function prediction only]. 700
1497 224398 COG1481 WhiA DNA-binding transcriptional regulator WhiA, involved in cell division [Transcription]. 308
1498 224399 COG1482 ManA Mannose-6-phosphate isomerase, class I [Carbohydrate transport and metabolism]. 312
1499 224400 COG1483 COG1483 Predicted ATPase, AAA+ superfamily [General function prediction only]. 774
1500 224401 COG1484 DnaC DNA replication protein DnaC [Replication, recombination and repair]. 254
1501 224402 COG1485 YhcM Predicted ATPase [General function prediction only]. 367
1502 224403 COG1486 CelF Alpha-galactosidase/6-phospho-beta-glucosidase, family 4 of glycosyl hydrolase [Carbohydrate transport and metabolism]. 442
1503 224404 COG1487 VapC Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 133
1504 224405 COG1488 PncB Nicotinic acid phosphoribosyltransferase [Coenzyme transport and metabolism]. 405
1505 224406 COG1489 SfsA DNA-binding protein, stimulates sugar fermentation [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 235
1506 224407 COG1490 Dtd D-Tyr-tRNAtyr deacylase [Translation, ribosomal structure and biogenesis]. 145
1507 224408 COG1491 COG1491 Predicted nucleic acid-binding OB-fold protein [General function prediction only]. 202
1508 224409 COG1492 CobQ Cobyric acid synthase [Coenzyme transport and metabolism]. 486
1509 224410 COG1493 HprK Serine kinase of the HPr protein, regulates carbohydrate metabolism [Signal transduction mechanisms]. 308
1510 224411 COG1494 GlpX Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase or related protein [Carbohydrate transport and metabolism]. 332
1511 224412 COG1495 DsbB Disulfide bond formation protein DsbB [Posttranslational modification, protein turnover, chaperones]. 170
1512 224413 COG1496 YfiH Copper oxidase (laccase) domain [Inorganic ion transport and metabolism]. 249
1513 224414 COG1497 COG1497 Predicted transcriptional regulator [Transcription]. 260
1514 224415 COG1498 SIK1 RNA processing factor Prp31, contains Nop domain [Translation, ribosomal structure and biogenesis]. 395
1515 224416 COG1499 NMD3 NMD protein affecting ribosome stability and mRNA decay [Translation, ribosomal structure and biogenesis]. 355
1516 224417 COG1500 Sdo1 Ribosome maturation protein Sdo1 [Translation, ribosomal structure and biogenesis]. 234
1517 224418 COG1501 YicI Alpha-glucosidase, glycosyl hydrolase family GH31 [Carbohydrate transport and metabolism]. 772
1518 224419 COG1502 Cls Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthase or related enzyme [Lipid transport and metabolism]. 438
1519 224420 COG1503 eRF1 Peptide chain release factor 1 (eRF1) [Translation, ribosomal structure and biogenesis]. 411
1520 224421 COG1504 COG1504 Uncharacterized protein [Function unknown]. 121
1521 224422 COG1505 PreP Prolyl oligopeptidase PreP, S9A serine peptidase family [Amino acid transport and metabolism]. 648
1522 224423 COG1506 DAP2 Dipeptidyl aminopeptidase/acylaminoacyl peptidase [Amino acid transport and metabolism]. 620
1523 224424 COG1507 COG1507 Uncharacterized protein, DUF501 family [Function unknown]. 167
1524 224425 COG1508 RpoN DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog [Transcription]. 444
1525 224426 COG1509 EpmB L-lysine 2,3-aminomutase (EF-P beta-lysylation pathway) [Amino acid transport and metabolism]. 369
1526 224427 COG1510 GbsR DNA-binding transcriptional regulator GbsR, MarR family [Transcription]. 177
1527 224428 COG1511 YhgE Uncharacterized membrane protein YhgE, phage infection protein (PIP) family [Function unknown]. 780
1528 224429 COG1512 YgcG Uncharacterized membrane protein YgcG, contains a TPM-fold domain [Function unknown]. 271
1529 224430 COG1513 CynS Cyanate lyase [Inorganic ion transport and metabolism]. 151
1530 224431 COG1514 LigT 2'-5' RNA ligase [Translation, ribosomal structure and biogenesis]. 180
1531 224432 COG1515 Nfi Deoxyinosine 3'endonuclease (endonuclease V) [Replication, recombination and repair]. 212
1532 224433 COG1516 FliS Flagellin-specific chaperone FliS [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 132
1533 224434 COG1517 Csx1 CRISPR/Cas system-associated protein Csx1, contains CARF domain [Defense mechanisms]. 406
1534 224435 COG1518 Cas1 CRISPR/Cas system-associated endonuclease Cas1 [Defense mechanisms]. 327
1535 224436 COG1519 KdtA 3-deoxy-D-manno-octulosonic-acid transferase [Cell wall/membrane/envelope biogenesis]. 419
1536 224437 COG1520 PQQ Outer membrane protein assembly factor BamB, contains PQQ-like beta-propeller repeat [Cell wall/membrane/envelope biogenesis]. 370
1537 224438 COG1521 CoaX Pantothenate kinase type III [Coenzyme transport and metabolism]. 251
1538 224439 COG1522 Lrp DNA-binding transcriptional regulator, Lrp family [Transcription]. 154
1539 224440 COG1523 PulA Pullulanase/glycogen debranching enzyme [Carbohydrate transport and metabolism]. 697
1540 224441 COG1524 Npp1 Predicted pyrophosphatase or phosphodiesterase, AlkP superfamily [General function prediction only]. 450
1541 224442 COG1525 YncB Endonuclease YncB, thermonuclease family [Replication, recombination and repair]. 192
1542 224443 COG1526 FdhD Formate dehydrogenase assembly factor FdhD [Energy production and conversion]. 266
1543 224444 COG1527 NtpC Archaeal/vacuolar-type H+-ATPase subunit C/Vma6 [Energy production and conversion]. 346
1544 224445 COG1528 Ftn Ferritin [Inorganic ion transport and metabolism]. 167
1545 224446 COG1529 CoxL CO or xanthine dehydrogenase, Mo-binding subunit [Energy production and conversion]. 731
1546 224447 COG1530 CafA Ribonuclease G or E [Translation, ribosomal structure and biogenesis]. 487
1547 224448 COG1531 COG1531 Uncharacterized protein, UPF0248 family [Function unknown]. 77
1548 224449 COG1532 COG1532 CooT family nickel-binding protein [General function prediction only]. 57
1549 224451 COG1534 YhbY RNA-binding protein YhbY [Translation, ribosomal structure and biogenesis]. 97
1550 224452 COG1535 EntB Isochorismate hydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 218
1551 224453 COG1536 FliG Flagellar motor switch protein FliG [Cell motility]. 339
1552 224454 COG1537 PelA Stalled ribosome rescue protein Dom34, pelota family [Translation, ribosomal structure and biogenesis]. 352
1553 224455 COG1538 TolC Outer membrane protein TolC [Cell wall/membrane/envelope biogenesis]. 457
1554 224456 COG1539 FolB Dihydroneopterin aldolase [Coenzyme transport and metabolism]. 121
1555 224457 COG1540 YbgL Lactam utilization protein B (function unknown) [General function prediction only]. 252
1556 224458 COG1541 PaaK Phenylacetate-coenzyme A ligase PaaK, adenylate-forming domain family [Coenzyme transport and metabolism]. 438
1557 224459 COG1542 COG1542 Uncharacterized protein [Function unknown]. 593
1558 224460 COG1543 COG1543 Predicted glycosyl hydrolase, contains GH57 and DUF1957 domains [Carbohydrate transport and metabolism]. 504
1559 224461 COG1544 RaiA Ribosome-associated translation inhibitor RaiA [Translation, ribosomal structure and biogenesis]. 110
1560 224462 COG1545 COG1545 Uncharacterized OB-fold protein, contains Zn-ribbon domain [General function prediction only]. 140
1561 224463 COG1546 PncC Nicotinamide mononucleotide (NMN) deamidase PncC [Coenzyme transport and metabolism]. 162
1562 224464 COG1547 YpuF Predicted metal-dependent hydrolase [Function unknown]. 156
1563 224465 COG1548 COG1548 Uncharacterized protein, hydantoinase/oxoprolinase family [Function unknown]. 330
1564 224466 COG1549 COG1549 Archaeosine tRNA-guanine transglycosylase, contains uracil-DNA-glycosylase and PUA domains [Translation, ribosomal structure and biogenesis]. 519
1565 224467 COG1550 YlxP Uncharacterized conserved protein YlxP, DUF503 family [Function unknown]. 95
1566 224468 COG1551 CsrA sRNA-binding carbon storage regulator CsrA [Signal transduction mechanisms]. 73
1567 224469 COG1552 RPL40A Ribosomal protein L40E [Translation, ribosomal structure and biogenesis]. 50
1568 224470 COG1553 DsrE Sulfur relay (sulfurtransferase) complex TusBCD TusD component, DsrE family [Inorganic ion transport and metabolism]. 126
1569 224471 COG1554 ATH1 Trehalose and maltose hydrolase (possible phosphorylase) [Carbohydrate transport and metabolism]. 772
1570 224472 COG1555 ComEA DNA uptake protein ComE and related DNA-binding proteins [Replication, recombination and repair]. 149
1571 224473 COG1556 LutC L-lactate utilization protein LutC, contains LUD domain [Energy production and conversion]. 218
1572 224474 COG1558 FlgC Flagellar basal body rod protein FlgC [Cell motility]. 137
1573 224475 COG1559 YceG Cell division protein YceG, involved in septum cleavage [Cell cycle control, cell division, chromosome partitioning]. 342
1574 224476 COG1560 HtrB Lauroyl/myristoyl acyltransferase [Lipid transport and metabolism]. 308
1575 224477 COG1561 YicC Uncharacterized conserved protein YicC, UPF0701 family [Function unknown]. 290
1576 224478 COG1562 ERG9 Phytoene/squalene synthetase [Lipid transport and metabolism]. 288
1577 224479 COG1563 COG1563 Uncharacterized MnhB-related membrane protein [General function prediction only]. 87
1578 224480 COG1564 ThiN Thiamine pyrophosphokinase [Coenzyme transport and metabolism]. 212
1579 224481 COG1565 MidA SAM-dependent methyltransferase, MidA family [General function prediction only]. 370
1580 224482 COG1566 EmrA Multidrug resistance efflux pump [Defense mechanisms]. 352
1581 224483 COG1567 Csm4 CRISPR/Cas system CSM-associated protein Csm4, group 5 of RAMP superfamily [Defense mechanisms]. 313
1582 224484 COG1568 COG1568 Predicted methyltransferase [General function prediction only]. 354
1583 224485 COG1569 COG1569 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 142
1584 224486 COG1570 XseA Exonuclease VII, large subunit [Replication, recombination and repair]. 440
1585 224487 COG1571 TiaS tRNA(Ile2) C34 agmatinyltransferase TiaS [Translation, ribosomal structure and biogenesis]. 421
1586 224489 COG1573 Udg4 Uracil-DNA glycosylase [Replication, recombination and repair]. 202
1587 224490 COG1574 YtcJ Predicted amidohydrolase YtcJ [General function prediction only]. 535
1588 224491 COG1575 MenA 1,4-dihydroxy-2-naphthoate octaprenyltransferase [Coenzyme transport and metabolism]. 303
1589 224492 COG1576 RlmH 23S rRNA pseudoU1915 N3-methylase RlmH [Translation, ribosomal structure and biogenesis]. 155
1590 224493 COG1577 ERG12 Mevalonate kinase [Lipid transport and metabolism]. 307
1591 224494 COG1578 COG1578 Uncharacterized conserved protein, contains ATP-grasp and redox domains [Function unknown]. 285
1592 224495 COG1579 COG1579 Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 239
1593 224496 COG1580 FliL Flagellar basal body-associated protein FliL [Cell motility]. 159
1594 224497 COG1581 AlbA Archaeal DNA-binding protein [Transcription]. 91
1595 224498 COG1582 FlgEa Uncharacterized protein YlzI, FlbEa/FlbD family [General function prediction only]. 67
1596 224499 COG1583 Cas6 CRISPR/Cas system endoribonuclease Cas6, RAMP superfamily [Defense mechanisms]. 240
1597 224500 COG1584 SatP Succinate-acetate transporter protein [Energy production and conversion]. 207
1598 224501 COG1585 YbbJ Membrane protein implicated in regulation of membrane protease activity [Posttranslational modification, protein turnover, chaperones]. 140
1599 224502 COG1586 SpeD S-adenosylmethionine decarboxylase or arginine decarboxylase [Amino acid transport and metabolism]. 136
1600 224503 COG1587 HemD Uroporphyrinogen-III synthase [Coenzyme transport and metabolism]. 248
1601 224504 COG1588 POP4 RNase P/RNase MRP subunit p29 [Translation, ribosomal structure and biogenesis]. 95
1602 224505 COG1589 FtsQ Cell division septal protein FtsQ [Cell cycle control, cell division, chromosome partitioning]. 269
1603 224506 COG1590 Tyw3 tRNA(Phe) wybutosine-synthesizing methylase Tyw3 [Translation, ribosomal structure and biogenesis]. 208
1604 224507 COG1591 COG1591 Holliday junction resolvase, archaeal type [Replication, recombination and repair]. 137
1605 224508 COG1592 YotD Rubrerythrin [Energy production and conversion]. 166
1606 224509 COG1593 DctQ TRAP-type C4-dicarboxylate transport system, large permease component [Carbohydrate transport and metabolism]. 379
1607 224510 COG1594 RPB9 DNA-directed RNA polymerase, subunit M/Transcription elongation factor TFIIS [Transcription]. 113
1608 224511 COG1595 RpoE DNA-directed RNA polymerase specialized sigma subunit, sigma24 family [Transcription]. 182
1609 224512 COG1596 Wza Periplasmic protein involved in polysaccharide export, contains SLBB domain of the beta-grasp fold [Cell wall/membrane/envelope biogenesis]. 239
1610 224513 COG1597 LCB5 Diacylglycerol kinase family enzyme [Lipid transport and metabolism, General function prediction only]. 301
1611 224514 COG1598 HicB Predicted nuclease of the RNAse H fold, HicB family [Defense mechanisms]. 73
1612 224515 COG1599 RFA1 ssDNA-binding replication factor A, large subunit [Replication, recombination and repair]. 407
1613 224516 COG1600 QueG Epoxyqueuosine reductase QueG (queuosine biosynthesis) [Translation, ribosomal structure and biogenesis]. 337
1614 224517 COG1601 GCD7 Translation initiation factor 2, beta subunit (eIF-2beta)/eIF-5 N-terminal domain [Translation, ribosomal structure and biogenesis]. 151
1615 224518 COG1602 COG1602 Uncharacterized protein [Function unknown]. 402
1616 224519 COG1603 RPP1 RNase P/RNase MRP subunit p30 [Translation, ribosomal structure and biogenesis]. 229
1617 224520 COG1604 Cmr6 CRISPR/Cas system CMR subunit Cmr6, Cas7 group, RAMP superfamily [Defense mechanisms]. 257
1618 224521 COG1605 PheA Chorismate mutase [Amino acid transport and metabolism]. 101
1619 224522 COG1606 COG1606 ATP-utilizing enzyme, PP-loop superfamily [General function prediction only]. 269
1620 224523 COG1607 YciA Acyl-CoA hydrolase [Lipid transport and metabolism]. 157
1621 224524 COG1608 COG1608 Isopentenyl phosphate kinase [Lipid transport and metabolism]. 252
1622 224525 COG1609 PurR DNA-binding transcriptional regulator, LacI/PurR family [Transcription]. 333
1623 224526 COG1610 YqeY Uncharacterized conserved protein YqeY [Function unknown]. 148
1624 224527 COG1611 YgdH Predicted Rossmann fold nucleotide-binding protein [General function prediction only]. 205
1625 224528 COG1612 CtaA Heme A synthase [Coenzyme transport and metabolism]. 323
1626 224529 COG1613 Sbp ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 348
1627 224530 COG1614 CdhC CO dehydrogenase/acetyl-CoA synthase beta subunit [Energy production and conversion]. 470
1628 224531 COG1615 COG1615 Uncharacterized membrane protein, UPF0182 family [Function unknown]. 885
1629 224532 COG1617 Cgi121 tRNA threonylcarbamoyladenosine modification (KEOPS) complex, Cgi121 subunit [Translation, ribosomal structure and biogenesis]. 158
1630 224533 COG1618 THEP1 Nucleoside-triphosphatase THEP1 [Nucleotide transport and metabolism]. 179
1631 224534 COG1619 LdcA Muramoyltetrapeptide carboxypeptidase LdcA (peptidoglycan recycling) [Cell wall/membrane/envelope biogenesis]. 313
1632 224535 COG1620 LldP L-lactate permease [Energy production and conversion]. 522
1633 224536 COG1621 SacC Sucrose-6-phosphate hydrolase SacC, GH32 family [Carbohydrate transport and metabolism]. 486
1634 224537 COG1622 CyoA Heme/copper-type cytochrome/quinol oxidase, subunit 2 [Energy production and conversion]. 247
1635 224538 COG1623 DisA Diadenylate cyclase (c-di-AMP synthetase), DNA integrity scanning protein DisA [Signal transduction mechanisms]. 349
1636 224539 COG1624 DisA_N Diadenylate cyclase (c-di-AMP synthetase), DisA_N domain [Signal transduction mechanisms]. 247
1637 224540 COG1625 NifB Fe-S oxidoreductase, related to NifB/MoaA family [Energy production and conversion]. 414
1638 224541 COG1626 TreA Neutral trehalase [Carbohydrate transport and metabolism]. 558
1639 224542 COG1627 COG1627 Uncharacterized protein [Function unknown]. 419
1640 224543 COG1628 COG1628 Endonuclease V homolog, UPF0215 family [General function prediction only]. 185
1641 224544 COG1629 CirA Outer membrane receptor proteins, mostly Fe transport [Inorganic ion transport and metabolism]. 768
1642 224545 COG1630 NurA NurA 5'-3' nuclease [Replication, recombination and repair]. 379
1643 224546 COG1631 RPL42A Ribosomal protein L44E [Translation, ribosomal structure and biogenesis]. 94
1644 224547 COG1632 RPL15A Ribosomal protein L15E [Translation, ribosomal structure and biogenesis]. 195
1645 224548 COG1633 YhjR Rubrerythrin [Inorganic ion transport and metabolism]. 176
1646 224549 COG1634 COG1634 Uncharacterized Rossmann fold enzyme [Function unknown]. 232
1647 224550 COG1635 THI4 Archaeal ribulose 1,5-bisphosphate synthetase/yeast thiazole synthase [Coenzyme transport and metabolism]. 262
1648 224551 COG1636 COG1636 Predicted ATPase, Adenine nucleotide alpha hydrolases (AANH) superfamily [General function prediction only]. 204
1649 224552 COG1637 NucS Endonuclease NucS, RecB family [Replication, recombination and repair]. 253
1650 224553 COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]. 332
1651 224554 COG1639 HDOD HD-like signal output (HDOD) domain, no enzymatic activity [Signal transduction mechanisms]. 289
1652 224555 COG1640 MalQ 4-alpha-glucanotransferase [Carbohydrate transport and metabolism]. 520
1653 224556 COG1641 COG1641 Uncharacterized conserved protein, DUF111 family [Function unknown]. 387
1654 224557 COG1643 HrpA HrpA-like RNA helicase [Translation, ribosomal structure and biogenesis]. 845
1655 224558 COG1644 RPB10 DNA-directed RNA polymerase, subunit N (RpoN/RPB10) [Transcription]. 63
1656 224559 COG1645 COG1645 Uncharacterized Zn-finger containing protein, UPF0148 family [General function prediction only]. 131
1657 224560 COG1646 PcrB Heptaprenylglyceryl phosphate synthase [Lipid transport and metabolism]. 240
1658 224561 COG1647 YvaK Esterase/lipase [Secondary metabolites biosynthesis, transport and catabolism]. 243
1659 224562 COG1648 CysG2 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) [Coenzyme transport and metabolism]. 210
1660 224563 COG1649 YddW Uncharacterized lipoprotein YddW, UPF0748 family [Function unknown]. 418
1661 224564 COG1650 COG1650 D-tyrosyl-tRNA(Tyr) deacylase [Translation, ribosomal structure and biogenesis]. 266
1662 224565 COG1651 DsbG Protein-disulfide isomerase [Posttranslational modification, protein turnover, chaperones]. 244
1663 224566 COG1652 XkdP Nucleoid-associated protein YgaU, contains BON and LysM domains [Function unknown]. 269
1664 224567 COG1653 UgpB ABC-type glycerol-3-phosphate transport system, periplasmic component [Carbohydrate transport and metabolism]. 433
1665 224568 COG1654 BirA Biotin operon repressor [Transcription]. 79
1666 224569 COG1655 COG1655 Uncharacterized protein, DUF2225 family [Function unknown]. 267
1667 224570 COG1656 COG1656 Uncharacterized conserved protein, contains PIN domain [Function unknown]. 165
1668 224571 COG1657 SqhC Squalene cyclase [Lipid transport and metabolism]. 517
1669 224572 COG1658 RnmV 5S rRNA maturation endonuclease (Ribonuclease M5), contains TOPRIM domain [Translation, ribosomal structure and biogenesis]. 127
1670 224573 COG1659 COG1659 Uncharacterized protein, linocin/CFP29 family [Function unknown]. 267
1671 224574 COG1660 RapZ RNase adaptor protein for sRNA GlmZ degradation, contains a P-loop ATPase domain [Signal transduction mechanisms]. 286
1672 224575 COG1661 COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif [General function prediction only]. 141
1673 224576 COG1662 InsB Transposase and inactivated derivatives, IS1 family [Mobilome: prophages, transposons]. 121
1674 224577 COG1663 LpxK Tetraacyldisaccharide-1-P 4'-kinase [Cell wall/membrane/envelope biogenesis]. 336
1675 224578 COG1664 CcmA Cytoskeletal protein CcmA, bactofilin family [Cytoskeleton]. 146
1676 224579 COG1665 COG1665 Predicted nucleotidyltransferase [General function prediction only]. 315
1677 224580 COG1666 YajQ Uncharacterized conserved protein YajQ, UPF0234 family [Function unknown]. 165
1678 224581 COG1667 COG1667 Uncharacterized protein [Function unknown]. 254
1679 224582 COG1668 NatB ABC-type Na+ efflux pump, permease component [Energy production and conversion, Inorganic ion transport and metabolism]. 407
1680 224583 COG1669 COG1669 Predicted nucleotidyltransferase [General function prediction only]. 97
1681 224584 COG1670 RimL Protein N-acetyltransferase, RimJ/RimL family [Translation, ribosomal structure and biogenesis, Posttranslational modification, protein turnover, chaperones]. 187
1682 224585 COG1671 YaiI Uncharacterized conserved protein YaiI, UPF0178 family [Function unknown]. 150
1683 224586 COG1672 AAAA Predicted ATPase, archaeal AAA+ ATPase superfamily [General function prediction only]. 359
1684 224587 COG1673 COG1673 Predicted RNA-binding protein, contains PUA-like EVE domain [General function prediction only]. 151
1685 224588 COG1674 FtsK DNA segregation ATPase FtsK/SpoIIIE and related proteins [Cell cycle control, cell division, chromosome partitioning]. 858
1686 224589 COG1675 TFA1 Transcription initiation factor IIE, alpha subunit [Transcription]. 176
1687 224590 COG1676 SEN2 tRNA splicing endonuclease [Translation, ribosomal structure and biogenesis]. 181
1688 224591 COG1677 FliE Flagellar hook-basal body complex protein FliE [Cell motility]. 105
1689 224592 COG1678 AlgH Putative transcriptional regulator, AlgH/UPF0301 family [Transcription]. 194
1690 224593 COG1679 COG1679 Predicted aconitase [Energy production and conversion]. 403
1691 224594 COG1680 AmpC CubicO group peptidase, beta-lactamase class C family [Defense mechanisms]. 390
1692 224595 COG1681 FlaB Archaellin (archaeal flagellin) [Cell motility]. 209
1693 224596 COG1682 TagG ABC-type polysaccharide/polyol phosphate export permease [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 263
1694 224597 COG1683 YbbK Uncharacterized conserved protein YbbK, DUF523 family [Function unknown]. 156
1695 224598 COG1684 FliR Flagellar biosynthesis protein FliR [Cell motility]. 258
1696 224599 COG1685 AroK2 Archaeal shikimate kinase [Amino acid transport and metabolism]. 278
1697 224600 COG1686 DacC D-alanyl-D-alanine carboxypeptidase [Cell wall/membrane/envelope biogenesis]. 389
1698 224601 COG1687 AzlD Branched-chain amino acid transport protein AzlD [Amino acid transport and metabolism]. 106
1699 224602 COG1688 Cas5 CRISPR/Cas system-associated protein Cas5, RAMP superfamily [Defense mechanisms]. 240
1700 224603 COG1689 COG1689 Uncharacterized protein [Function unknown]. 274
1701 224604 COG1690 RtcB RNA-splicing ligase RtcB, repairs tRNA damage [Translation, ribosomal structure and biogenesis]. 432
1702 224605 COG1691 COG1691 NCAIR mutase (PurE)-related protein [Nucleotide transport and metabolism]. 254
1703 224606 COG1692 YmdB Calcineurin-like phosphoesterase [General function prediction only]. 266
1704 224607 COG1693 COG1693 Repressor of nif and glnA expression [Transcription]. 325
1705 224608 COG1694 MazG NTP pyrophosphatase, house-cleaning of non-canonical NTPs [Defense mechanisms]. 102
1706 224609 COG1695 PadR DNA-binding transcriptional regulator, PadR family [Transcription]. 138
1707 224610 COG1696 DltB D-alanyl-lipoteichoic acid acyltransferase DltB, MBOAT superfamily [Cell wall/membrane/envelope biogenesis]. 425
1708 224611 COG1697 Spo11 DNA topoisomerase VI, subunit A [Replication, recombination and repair]. 356
1709 224612 COG1698 COG1698 Uncharacterized protein, UPF0147 family [Function unknown]. 93
1710 224613 COG1699 FliW Flagellar assembly factor FliW [Cell motility]. 146
1711 224614 COG1700 COG1700 Predicted component of virus defense system, contains PD-(D/E)xK nuclease domain, DUF524 [Defense mechanisms]. 503
1712 224615 COG1701 COG1701 Archaeal phosphopantothenate synthetase [Coenzyme transport and metabolism]. 256
1713 411689 COG1702 PhoH Phosphate starvation-inducible protein PhoH, predicted ATPase [Signal transduction mechanisms]. 260
1714 224617 COG1703 ArgK Putative periplasmic protein kinase ArgK or related GTPase of G3E family [Posttranslational modification, protein turnover, chaperones]. 323
1715 224618 COG1704 LemA Uncharacterized conserved protein [Function unknown]. 185
1716 224619 COG1705 FlgJ Flagellum-specific peptidoglycan hydrolase FlgJ [Cell wall/membrane/envelope biogenesis, Cell motility]. 201
1717 224620 COG1706 FlgI Flagellar basal body P-ring protein FlgI [Cell motility]. 365
1718 224621 COG1707 COG1707 Uncharacterized protein, contains ACT and thioredoxin-like domains [General function prediction only]. 218
1719 224622 COG1708 COG1708 Predicted nucleotidyltransferase [General function prediction only]. 128
1720 224623 COG1709 COG1709 Predicted transcriptional regulator [Transcription]. 241
1721 224624 COG1710 COG1710 Uncharacterized protein [Function unknown]. 139
1722 224625 COG1711 COG1711 DNA replication initiation complex subunit, GINS family [Replication, recombination and repair]. 223
1723 224626 COG1712 COG1712 Predicted dinucleotide-utilizing enzyme [General function prediction only]. 255
1724 224627 COG1713 YqeK HD superfamily phosphohydrolase YqeK (fused to NMNAT in mycoplasmas) [General function prediction only]. 187
1725 224628 COG1714 YckC Uncharacterized membrane protein YckC, RDD family [Function unknown]. 172
1726 224629 COG1715 Mrr Restriction endonuclease Mrr [Defense mechanisms]. 308
1727 224630 COG1716 FHA Forkhead associated (FHA) domain, binds pSer, pThr, pTyr [Signal transduction mechanisms]. 191
1728 224631 COG1717 Rpl32e Ribosomal protein L32E [Translation, ribosomal structure and biogenesis]. 133
1729 224632 COG1718 RIO1 Serine/threonine-protein kinase RIO1 [Signal transduction mechanisms]. 268
1730 224633 COG1719 COG1719 Predicted hydrocarbon binding protein, contains 4VR domain [General function prediction only]. 158
1731 224634 COG1720 TsaA tRNA (Thr-GGU) A37 N-methylase [Translation, ribosomal structure and biogenesis]. 156
1732 224635 COG1721 YeaD2 Uncharacterized conserved protein, DUF58 family, contains vWF domain [Function unknown]. 416
1733 224636 COG1722 XseB Exonuclease VII small subunit [Replication, recombination and repair]. 81
1734 224637 COG1723 Rmd1 Uncharacterized protein, Rmd1/YagE family [Function unknown]. 331
1735 224638 COG1724 YcfA Predicted RNA binding protein YcfA, dsRBD-like fold, HicA-like mRNA interferase family [General function prediction only]. 66
1736 224639 COG1725 YhcF DNA-binding transcriptional regulator YhcF, GntR family [Transcription]. 125
1737 224640 COG1726 NqrA Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA [Energy production and conversion]. 447
1738 224641 COG1727 RPL18A Ribosomal protein L18E [Translation, ribosomal structure and biogenesis]. 122
1739 224642 COG1728 YaaR Uncharacterized protein YaaR, TM1646/DUF327 family [Function unknown]. 151
1740 224643 COG1729 YbgF Periplasmic TolA-binding protein (function unknown) [General function prediction only]. 262
1741 224644 COG1730 GIM5 Prefoldin subunit 5 [Posttranslational modification, protein turnover, chaperones]. 145
1742 224645 COG1731 RibC2 Archaeal riboflavin synthase [Coenzyme transport and metabolism]. 154
1743 224646 COG1732 OsmF Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) [Cell wall/membrane/envelope biogenesis]. 300
1744 224647 COG1733 HxlR DNA-binding transcriptional regulator, HxlR family [Transcription]. 120
1745 224648 COG1734 DksA RNA polymerase-binding transcription factor DksA [Translation, ribosomal structure and biogenesis]. 120
1746 224649 COG1735 Php Predicted metal-dependent hydrolase, phosphotriesterase family [General function prediction only]. 316
1747 224650 COG1736 DPH2 Diphthamide synthase subunit DPH2 [Translation, ribosomal structure and biogenesis]. 347
1748 224651 COG1737 RpiR DNA-binding transcriptional regulator, MurR/RpiR family, contains HTH and SIS domains [Transcription]. 281
1749 224652 COG1738 YhhQ Uncharacterized PurR-regulated membrane protein YhhQ, DUF165 family [Function unknown]. 233
1750 224653 COG1739 YIH1 Putative translation regulator, IMPACT (imprinted ancient) protein family [General function prediction only]. 203
1751 224654 COG1740 HyaA Ni,Fe-hydrogenase I small subunit [Energy production and conversion]. 355
1752 224655 COG1741 YhaK Redox-sensitive bicupin YhaK, pirin superfamily [General function prediction only]. 276
1753 224656 COG1742 YnfA Uncharacterized inner membrane protein YnfA, drug/metabolite transporter superfamily [General function prediction only]. 109
1754 224657 COG1743 COG1743 Adenine-specific DNA methylase, contains a Zn-ribbon domain [Replication, recombination and repair]. 875
1755 224658 COG1744 Med Basic membrane lipoprotein Med, periplasmic binding protein (PBP1-ABC) superfamily [Cell wall/membrane/envelope biogenesis]. 345
1756 224659 COG1745 COG1745 Uncharacterized euryarchaeal protein, UPF0058 family [Function unknown]. 94
1757 224660 COG1746 CCA1 tRNA nucleotidyltransferase (CCA-adding enzyme) [Translation, ribosomal structure and biogenesis]. 443
1758 224661 COG1747 COG1747 Uncharacterized N-terminal domain of the transcription elongation factor GreA [Function unknown]. 711
1759 224662 COG1748 Lys9 Saccharopine dehydrogenase, NADP-dependent [Amino acid transport and metabolism]. 389
1760 224663 COG1749 FlgE Flagellar hook protein FlgE [Cell motility]. 423
1761 224664 COG1750 COG1750 Predicted archaeal serine protease, S18 family [General function prediction only]. 579
1762 224665 COG1751 COG1751 Uncharacterized protein [Function unknown]. 186
1763 224666 COG1752 RssA Predicted acylesterase/phospholipase RssA, contains patatin domain [General function prediction only]. 306
1764 224667 COG1753 VapB3 Predicted antitoxin, CopG family [Defense mechanisms]. 74
1765 224668 COG1754 COG1754 Uncharacterized C-terminal domain of topoisomerase IA [Function unknown]. 298
1766 224669 COG1755 YpbQ Uncharacterized protein YpbQ, isoprenylcysteine carboxyl methyltransferase (ICMT) family [Function unknown]. 172
1767 224670 COG1756 Emg1 rRNA pseudouridine-1189 N-methylase Emg1, Nep1/Mra1 family [Translation, ribosomal structure and biogenesis]. 223
1768 224671 COG1757 NhaC Na+/H+ antiporter NhaC [Energy production and conversion]. 485
1769 224672 COG1758 RpoZ DNA-directed RNA polymerase, subunit K/omega [Transcription]. 74
1770 224673 COG1759 PurP 5-formaminoimidazole-4-carboxamide-1-beta-D-ribofuranosyl 5'-monophosphate synthetase (purine biosynthesis) [Nucleotide transport and metabolism]. 361
1771 224674 COG1760 SdaA L-serine deaminase [Amino acid transport and metabolism]. 262
1772 224675 COG1761 RPB11 DNA-directed RNA polymerase, subunit L [Transcription]. 99
1773 224676 COG1762 PtsN Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 152
1774 224677 COG1763 MobB Molybdopterin-guanine dinucleotide biosynthesis protein [Coenzyme transport and metabolism]. 161
1775 224678 COG1764 OsmC Organic hydroperoxide reductase OsmC/OhrA [Defense mechanisms]. 143
1776 224679 COG1765 YhfA Uncharacterized OsmC-related protein [General function prediction only]. 137
1777 224680 COG1766 FliF Flagellar biosynthesis/type III secretory pathway M-ring protein FliF/YscJ [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 545
1778 224681 COG1767 CitG Triphosphoribosyl-dephospho-CoA synthetase [Coenzyme transport and metabolism]. 288
1779 224682 COG1768 COG1768 Predicted phosphohydrolase [General function prediction only]. 230
1780 224683 COG1769 Cmr3 CRISPR/Cas system CMR-associated protein Cmr3, group 5 of RAMP superfamily [Defense mechanisms]. 335
1781 224684 COG1770 PtrB Protease II [Amino acid transport and metabolism]. 682
1782 224685 COG1771 COG1771 Uncharacterized protein, contains N-terminal Zn-finger domain [Function unknown]. 471
1783 224686 COG1772 COG1772 Uncharacterized protein [Function unknown]. 178
1784 224687 COG1773 YgaK Rubredoxin [Energy production and conversion]. 55
1785 224688 COG1774 YaaT Cell fate regulator YaaT, PSP1 superfamily (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 265
1786 224689 COG1775 HgdB Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB [Secondary metabolites biosynthesis, transport and catabolism]. 379
1787 224690 COG1776 CheC Chemotaxis protein CheY-P-specific phosphatase CheC [Signal transduction mechanisms]. 203
1788 224691 COG1777 COG1777 Predicted transcriptional regulator [Transcription]. 217
1789 224692 COG1778 KdsC 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase KdsC and related HAD superfamily phosphatases [Cell wall/membrane/envelope biogenesis, General function prediction only]. 170
1790 224693 COG1779 Zpr1 C4-type Zn-finger protein [General function prediction only]. 201
1791 224694 COG1780 NrdI Protein involved in ribonucleotide reduction [Nucleotide transport and metabolism]. 141
1792 224695 COG1781 PyrI Aspartate carbamoyltransferase, regulatory subunit [Nucleotide transport and metabolism]. 153
1793 224696 COG1782 COG1782 Predicted metal-dependent RNase, contains metallo-beta-lactamase and KH domains [General function prediction only]. 637
1794 224697 COG1783 XtmB Phage terminase large subunit [Mobilome: prophages, transposons]. 414
1795 224698 COG1784 COG1784 TctA family transporter [General function prediction only]. 395
1796 224699 COG1785 PhoA Alkaline phosphatase [Inorganic ion transport and metabolism, General function prediction only]. 482
1797 224700 COG1786 COG1786 Swiveling domain associated with predicted aconitase [General function prediction only]. 131
1798 224701 COG1787 COG1787 Endonuclease, HJR/Mrr/RecB family [Defense mechanisms]. 217
1799 224702 COG1788 AtoD Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit [Lipid transport and metabolism]. 220
1800 224703 COG1790 COG1790 Uncharacterized protein [Function unknown]. 209
1801 224704 COG1791 Adi1 Acireductone dioxygenase (methionine salvage), cupin superfamily [Amino acid transport and metabolism]. 181
1802 224705 COG1792 MreC Cell shape-determining protein MreC [Cell cycle control, cell division, chromosome partitioning]. 284
1803 224706 COG1793 CDC9 ATP-dependent DNA ligase [Replication, recombination and repair]. 444
1804 224707 COG1794 RacX Aspartate/glutamate racemase [Cell wall/membrane/envelope biogenesis]. 230
1805 224708 COG1795 COG1795 Formaldehyde-activating enzyme necessary for methanogenesis [Energy production and conversion]. 170
1806 224709 COG1796 PolX DNA polymerase/3'-5' exonuclease PolX [Replication, recombination and repair]. 326
1807 224710 COG1797 CobB Cobyrinic acid a,c-diamide synthase [Coenzyme transport and metabolism]. 451
1808 224711 COG1798 DPH5 Diphthamide biosynthesis methyltransferase [Translation, ribosomal structure and biogenesis]. 260
1809 224712 COG1799 YlmF FtsZ-interacting cell division protein YlmF [Cell cycle control, cell division, chromosome partitioning]. 167
1810 224713 COG1800 COG1800 Predicted transglutaminase-like protease [General function prediction only]. 335
1811 224714 COG1801 YecE Uncharacterized conserved protein YecE, DUF72 family [Function unknown]. 263
1812 224715 COG1802 GntR DNA-binding transcriptional regulator, GntR family [Transcription]. 230
1813 224716 COG1803 MgsA Methylglyoxal synthase [Carbohydrate transport and metabolism]. 142
1814 224717 COG1804 CaiB Crotonobetainyl-CoA:carnitine CoA-transferase CaiB and related acyl-CoA transferases [Lipid transport and metabolism]. 396
1815 224718 COG1805 NqrB Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB [Energy production and conversion]. 400
1816 224719 COG1806 PpsR Regulator of PEP synthase PpsR, kinase-PPPase family (combines ADP:protein kinase and phosphorylase activities) [Signal transduction mechanisms]. 273
1817 224720 COG1807 ArnT 4-amino-4-deoxy-L-arabinose transferase or related glycosyltransferase of PMT family [Cell wall/membrane/envelope biogenesis]. 535
1818 224721 COG1808 COG1808 Uncharacterized membrane protein [Function unknown]. 334
1819 224722 COG1809 ComA Phosphosulfolactate synthase, CoM biosynthesis protein A [Coenzyme transport and metabolism]. 258
1820 224723 COG1810 COG1810 Uncharacterized conserved protein [Function unknown]. 224
1821 224724 COG1811 YqgA Uncharacterized membrane protein YqgA, affects biofilm formation [Function unknown]. 228
1822 224725 COG1812 MetK2 Archaeal S-adenosylmethionine synthetase [Coenzyme transport and metabolism]. 400
1823 224726 COG1813 aMBF1 Archaeal ribosome-binding protein aMBF1, putative translation factor, contains Zn-ribbon and HTH domains [Translation, ribosomal structure and biogenesis]. 165
1824 224727 COG1814 Ccc1 Predicted Fe2+/Mn2+ transporter, VIT1/CCC1 family [Inorganic ion transport and metabolism]. 229
1825 224728 COG1815 FlgB Flagellar basal body rod protein FlgB [Cell motility]. 133
1826 224729 COG1816 Add Adenosine deaminase [Nucleotide transport and metabolism]. 345
1827 224730 COG1817 COG1817 Predicted glycosyltransferase [General function prediction only]. 346
1828 224731 COG1818 Tan1 tRNA(Ser,Leu) C12 N-acetylase TAN1, contains THUMP domain [Translation, ribosomal structure and biogenesis]. 175
1829 224732 COG1819 YjiC UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism]. 406
1830 224733 COG1820 NagA N-acetylglucosamine-6-phosphate deacetylase [Carbohydrate transport and metabolism]. 380
1831 224734 COG1821 COG1821 Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 307
1832 224735 COG1822 COG1822 Uncharacterized membrane protein [Function unknown]. 349
1833 224736 COG1823 TcyP L-cystine uptake protein TcyP, sodium:dicarboxylate symporter family [Amino acid transport and metabolism]. 458
1834 224737 COG1824 MgtE2 Permease, similar to cation transporters [Inorganic ion transport and metabolism]. 203
1835 224738 COG1825 RplY Ribosomal protein L25 (general stress protein Ctc) [Translation, ribosomal structure and biogenesis]. 93
1836 224739 COG1826 TatA Sec-independent protein translocase protein TatA [Intracellular trafficking, secretion, and vesicular transport]. 94
1837 224740 COG1827 NiaR Transcriptional regulator of NAD metabolism, contains HTH and 3H domains [Transcription, Coenzyme transport and metabolism]. 168
1838 224741 COG1828 PurS Phosphoribosylformylglycinamidine (FGAM) synthase, PurS component [Nucleotide transport and metabolism]. 83
1839 224742 COG1829 COG1829 Archaeal pantoate kinase [Coenzyme transport and metabolism]. 283
1840 224743 COG1830 FbaB Fructose-bisphosphate aldolase class Ia, DhnA family [Carbohydrate transport and metabolism]. 265
1841 224744 COG1831 COG1831 Predicted metal-dependent hydrolase, urease superfamily [General function prediction only]. 285
1842 224745 COG1832 YccU Predicted CoA-binding protein [General function prediction only]. 140
1843 224746 COG1833 COG1833 Uri superfamily endonuclease [General function prediction only]. 132
1844 224747 COG1834 DdaH N-Dimethylarginine dimethylaminohydrolase [Amino acid transport and metabolism]. 267
1845 224748 COG1835 OafA Peptidoglycan/LPS O-acetylase OafA/YrhL, contains acyltransferase and SGNH-hydrolase domains [Cell wall/membrane/envelope biogenesis]. 386
1846 224749 COG1836 COG1836 Uncharacterized membrane protein [Function unknown]. 247
1847 224750 COG1837 YlqC Predicted RNA-binding protein YlqC, contains KH domain, UPF0109 family [General function prediction only]. 76
1848 224751 COG1838 FumA Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain [Energy production and conversion]. 184
1849 224752 COG1839 COG1839 Adenosine/AMP kinase [Nucleotide transport and metabolism]. 162
1850 224753 COG1840 AfuA ABC-type Fe3+ transport system, periplasmic component [Inorganic ion transport and metabolism]. 299
1851 224754 COG1841 RpmD Ribosomal protein L30/L7E [Translation, ribosomal structure and biogenesis]. 55
1852 224755 COG1842 PspA Phage shock protein A [Transcription, Signal transduction mechanisms]. 225
1853 224756 COG1843 FlgD Flagellar hook assembly protein FlgD [Cell motility]. 222
1854 224757 COG1844 COG1844 Uncharacterized protein [Function unknown]. 125
1855 224758 COG1845 CyoC Heme/copper-type cytochrome/quinol oxidase, subunit 3 [Energy production and conversion]. 209
1856 224759 COG1846 MarR DNA-binding transcriptional regulator, MarR family [Transcription]. 126
1857 224760 COG1847 Jag Predicted RNA-binding protein Jag, contains KH and R3H domains [General function prediction only]. 208
1858 224761 COG1848 COG1848 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 140
1859 224762 COG1849 COG1849 Uncharacterized protein [Function unknown]. 90
1860 224763 COG1850 RbcL Ribulose 1,5-bisphosphate carboxylase, large subunit, or a RuBisCO-like protein [Carbohydrate transport and metabolism]. 429
1861 224764 COG1851 COG1851 Uncharacterized protein, UPF0128 family [Function unknown]. 229
1862 224765 COG1852 COG1852 Uncharacterized protein, DUF116 family [Function unknown]. 209
1863 224766 COG1853 RutF NADH-FMN oxidoreductase RutF, flavin reductase (DIM6/NTAB) family [Energy production and conversion]. 176
1864 224767 COG1854 LuxS S-ribosylhomocysteine lyase LuxS, autoinducer biosynthesis [Signal transduction mechanisms]. 161
1865 224768 COG1855 COG1855 Predicted ATPase, PilT family [General function prediction only]. 604
1866 224769 COG1856 COG1856 Uncharacterized protein, radical SAM superfamily [General function prediction only]. 275
1867 224770 COG1857 Cas7 CRISPR/Cas system-associated protein Cas7, RAMP superfamily [Defense mechanisms]. 334
1868 224771 COG1858 MauG Cytochrome c peroxidase [Posttranslational modification, protein turnover, chaperones]. 364
1869 224772 COG1859 KptA RNA:NAD 2'-phosphotransferase, TPT1/KptA family [Translation, ribosomal structure and biogenesis]. 211
1870 224773 COG1860 COG1860 Uncharacterized conserved protein, UPF0179 family [Nucleotide transport and metabolism, Replication, recombination and repair]. 147
1871 224774 COG1861 SpsF Spore coat polysaccharide biosynthesis protein SpsF, cytidylyltransferase family [Cell wall/membrane/envelope biogenesis]. 241
1872 224775 COG1862 YajC Preprotein translocase subunit YajC [Intracellular trafficking, secretion, and vesicular transport]. 97
1873 224776 COG1863 MnhE Multisubunit Na+/H+ antiporter, MnhE subunit [Inorganic ion transport and metabolism]. 158
1874 224777 COG1864 NUC1 DNA/RNA endonuclease G, NUC1 [Nucleotide transport and metabolism]. 281
1875 224778 COG1865 CbiZ Adenosylcobinamide amidohydrolase [Coenzyme transport and metabolism]. 200
1876 224779 COG1866 PckA Phosphoenolpyruvate carboxykinase, ATP-dependent [Energy production and conversion]. 529
1877 224780 COG1867 TRM1 tRNA G26 N,N-dimethylase Trm1 [Translation, ribosomal structure and biogenesis]. 380
1878 224781 COG1868 FliM Flagellar motor switch protein FliM [Cell motility]. 332
1879 224782 COG1869 RbsD D-ribose pyranose/furanose isomerase RbsD [Carbohydrate transport and metabolism]. 135
1880 224783 COG1871 CheD Chemotaxis receptor (MCP) glutamine deamidase CheD [Cell motility, Signal transduction mechanisms]. 164
1881 224784 COG1872 YggU Uncharacterized conserved protein YggU, UPF0235/DUF167 family [Function unknown]. 102
1882 224785 COG1873 YlmC Sporulation protein YlmC, PRC-barrel domain family [General function prediction only]. 87
1883 224786 COG1874 GanA Beta-galactosidase GanA [Carbohydrate transport and metabolism]. 673
1884 224787 COG1875 YlaK Predicted ribonuclease YlaK, contains NYN-type RNase and PhoH-family ATPase domains [General function prediction only]. 436
1885 224788 COG1876 LdcB LD-carboxypeptidase LdcB, LAS superfamily [Cell wall/membrane/envelope biogenesis]. 241
1886 224789 COG1877 OtsB Trehalose-6-phosphatase [Carbohydrate transport and metabolism]. 266
1887 224790 COG1878 COG1878 Kynurenine formamidase [Amino acid transport and metabolism]. 218
1888 224791 COG1879 RbsB ABC-type sugar transport system, periplasmic component, contains N-terminal xre family HTH domain [Carbohydrate transport and metabolism]. 322
1889 224792 COG1880 CdhB CO dehydrogenase/acetyl-CoA synthase epsilon subunit [Energy production and conversion]. 170
1890 224793 COG1881 PEBP Uncharacterized conserved protein, phosphatidylethanolamine-binding protein (PEBP) family [General function prediction only]. 174
1891 224794 COG1882 PflD Pyruvate-formate lyase [Energy production and conversion]. 755
1892 224795 COG1883 OadB Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit [Energy production and conversion]. 375
1893 224796 COG1884 Sbm Methylmalonyl-CoA mutase, N-terminal domain/subunit [Lipid transport and metabolism]. 548
1894 224797 COG1885 COG1885 Uncharacterized protein, UPF0212 family [Function unknown]. 115
1895 224798 COG1886 FliN Flagellar motor switch/type III secretory pathway protein FliN [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 136
1896 224799 COG1887 TagB CDP-glycerol glycerophosphotransferase, TagB/SpsB family [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 388
1897 224800 COG1888 COG1888 Uncharacterized protein [Function unknown]. 97
1898 224801 COG1889 NOP1 Fibrillarin-like rRNA methylase [Translation, ribosomal structure and biogenesis]. 231
1899 224802 COG1890 RPS3A Ribosomal protein S3AE [Translation, ribosomal structure and biogenesis]. 214
1900 224803 COG1891 COG1891 Uncharacterized protein, UPF0264 family [Function unknown]. 235
1901 224804 COG1892 PpcA Phosphoenolpyruvate carboxylase [Carbohydrate transport and metabolism]. 488
1902 224805 COG1893 PanE Ketopantoate reductase [Coenzyme transport and metabolism]. 307
1903 224806 COG1894 NuoF NADH:ubiquinone oxidoreductase, NADH-binding 51 kD subunit (chain F) [Energy production and conversion]. 424
1904 224807 COG1895 COG1895 Uncharacterized protein, contains HEPN domain, UPF0332 family [Function unknown]. 129
1905 224808 COG1896 YfbR 5'-deoxynucleotidase YfbR and related HD superfamily hydrolases [Nucleotide transport and metabolism, General function prediction only]. 193
1906 224809 COG1897 MetA Homoserine trans-succinylase [Amino acid transport and metabolism]. 307
1907 224810 COG1898 RfbC dTDP-4-dehydrorhamnose 3,5-epimerase or related enzyme [Cell wall/membrane/envelope biogenesis]. 173
1908 224811 COG1899 DYS1 Deoxyhypusine synthase [Posttranslational modification, protein turnover, chaperones, Translation, ribosomal structure and biogenesis]. 318
1909 224812 COG1900 COG1900 Uncharacterized conserved protein, DUF39 family [Function unknown]. 365
1910 224813 COG1901 COG1901 tRNA pseudouridine-54 N-methylase [Translation, ribosomal structure and biogenesis]. 197
1911 224814 COG1902 FadH 2,4-dienoyl-CoA reductase or related NADH-dependent reductase, Old Yellow Enzyme (OYE) family [Energy production and conversion]. 363
1912 224815 COG1903 CbiD Cobalamin biosynthesis protein CbiD [Coenzyme transport and metabolism]. 367
1913 224816 COG1904 UxaC Glucuronate isomerase [Carbohydrate transport and metabolism]. 463
1914 224817 COG1905 NuoE NADH:ubiquinone oxidoreductase 24 kD subunit (chain E) [Energy production and conversion]. 160
1915 224818 COG1906 COG1906 Uncharacterized protein [Function unknown]. 388
1916 224819 COG1907 COG1907 Predicted archaeal sugar kinase [General function prediction only]. 312
1917 224820 COG1908 FrhD Coenzyme F420-reducing hydrogenase, delta subunit [Energy production and conversion]. 132
1918 224821 COG1909 COG1909 Uncharacterized protein, UPF0218 family [Function unknown]. 167
1919 224822 COG1910 YvgK Periplasmic molybdate-binding protein/domain [Inorganic ion transport and metabolism]. 223
1920 224823 COG1911 RPL30E Ribosomal protein L30E [Translation, ribosomal structure and biogenesis]. 100
1921 224824 COG1912 COG1912 S-adenosylmethionine hydrolase (SAM-hydroxide adenosyltransferase) [Coenzyme transport and metabolism]. 268
1922 224825 COG1913 COG1913 Predicted Zn-dependent protease [General function prediction only]. 181
1923 224826 COG1914 MntH Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism]. 416
1924 224827 COG1915 COG1915 Uncharacterized conserved protein, contains Saccharopine dehydrogenase N-terminal (SDHN) domain [Function unknown]. 415
1925 224828 COG1916 COG1916 Pheromone shutdown protein TraB, contains GTxH motif (function unknown) [Function unknown]. 388
1926 224829 COG1917 QdoI Cupin domain protein related to quercetin dioxygenase [General function prediction only]. 131
1927 224830 COG1918 FeoA Fe2+ transport system protein FeoA [Inorganic ion transport and metabolism]. 75
1928 224831 COG1920 COG1920 2-phospho-L-lactate guanylyltransferase, coenzyme F420 biosynthesis enzyme, CobY/MobA/RfbA family [Coenzyme transport and metabolism]. 210
1929 224832 COG1921 SelA Seryl-tRNA(Sec) selenium transferase [Translation, ribosomal structure and biogenesis]. 395
1930 224833 COG1922 WecG UDP-N-acetyl-D-mannosaminuronic acid transferase, WecB/TagA/CpsF family [Cell wall/membrane/envelope biogenesis]. 253
1931 224834 COG1923 Hfq sRNA-binding regulator protein Hfq [Signal transduction mechanisms]. 77
1932 224835 COG1924 YjiL Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) [Lipid transport and metabolism]. 396
1933 224836 COG1925 PtsH Phosphotransferase system, HPr and related phosphotransfer proteins [Signal transduction mechanisms, Carbohydrate transport and metabolism]. 88
1934 224837 COG1926 COG1926 Predicted phosphoribosyltransferase [General function prediction only]. 220
1935 224838 COG1927 Mtd F420-dependent methylenetetrahydromethanopterin dehydrogenase [Energy production and conversion]. 277
1936 224839 COG1928 PMT1 Dolichyl-phosphate-mannose--protein O-mannosyl transferase [Posttranslational modification, protein turnover, chaperones]. 699
1937 224840 COG1929 GlxK Glycerate kinase [Carbohydrate transport and metabolism]. 378
1938 224841 COG1930 CbiN ABC-type cobalt transport system, periplasmic component [Inorganic ion transport and metabolism]. 97
1939 224842 COG1931 COG1931 Predicted RNA binding protein with dsRBD fold, UPF0201 family [General function prediction only]. 140
1940 224843 COG1932 SerC Phosphoserine aminotransferase [Coenzyme transport and metabolism, Amino acid transport and metabolism]. 365
1941 224844 COG1933 PolC Archaeal DNA polymerase II, large subunit [Replication, recombination and repair]. 253
1942 224845 COG1934 LptA Lipopolysaccharide export system protein LptA [Cell wall/membrane/envelope biogenesis]. 173
1943 224846 COG1935 COG1935 Uncharacterized protein [Function unknown]. 122
1944 224847 COG1936 Fap7 Broad-specificity NMP kinase [Nucleotide transport and metabolism]. 180
1945 224848 COG1937 FrmR DNA-binding transcriptional regulator, FrmR family [Transcription]. 89
1946 224849 COG1938 COG1938 Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 244
1947 224850 COG1939 MrnC 23S rRNA maturation mini-RNase III [Translation, ribosomal structure and biogenesis]. 132
1948 224851 COG1940 NagC Sugar kinase of the NBD/HSP70 family, may contain an N-terminal HTH domain [Transcription, Carbohydrate transport and metabolism]. 314
1949 224852 COG1941 FrhG Coenzyme F420-reducing hydrogenase, gamma subunit [Energy production and conversion]. 247
1950 224853 COG1942 PptA Phenylpyruvate tautomerase PptA, 4-oxalocrotonate tautomerase family [Secondary metabolites biosynthesis, transport and catabolism]. 69
1951 224854 COG1943 RAYT REP element-mobilizing transposase RayT [Mobilome: prophages, transposons]. 136
1952 224855 COG1944 YcaO Ribosomal protein S12 methylthiotransferase accessory factor YcaO [Translation, ribosomal structure and biogenesis]. 398
1953 224856 COG1945 PdaD Pyruvoyl-dependent arginine decarboxylase (PvlArgDC) [Amino acid transport and metabolism]. 163
1954 224857 COG1946 TesB Acyl-CoA thioesterase [Lipid transport and metabolism]. 289
1955 224858 COG1947 IspE 4-diphosphocytidyl-2C-methyl-D-erythritol kinase [Lipid transport and metabolism]. 289
1956 224859 COG1948 MUS81 ERCC4-type nuclease [Replication, recombination and repair]. 254
1957 224860 COG1949 Orn Oligoribonuclease (3'-5' exoribonuclease) [RNA processing and modification]. 184
1958 224861 COG1950 YvlD Uncharacterized membrane protein YvlD, DUF360 family [Function unknown]. 120
1959 224862 COG1951 TtdA Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain [Energy production and conversion]. 297
1960 224863 COG1952 SecB Preprotein translocase subunit SecB [Intracellular trafficking, secretion, and vesicular transport]. 157
1961 224864 COG1953 FUI1 Cytosine/uracil/thiamine/allantoin permease [Nucleotide transport and metabolism, Coenzyme transport and metabolism]. 497
1962 224865 COG1954 GlpP Glycerol-3-phosphate responsive antiterminator (mRNA-binding) [Transcription]. 181
1963 224866 COG1955 FlaJ Archaellum biogenesis protein FlaJ, TadC family [Cell motility]. 527
1964 224867 COG1956 GAF GAF domain-containing protein, putative methionine-R-sulfoxide reductase [Defense mechanisms, Signal transduction mechanisms]. 163
1965 224868 COG1957 URH1 Inosine-uridine nucleoside N-ribohydrolase [Nucleotide transport and metabolism]. 311
1966 224869 COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]. 79
1967 224870 COG1959 IscR DNA-binding transcriptional regulator, IscR family [Transcription]. 150
1968 224871 COG1960 CaiA Acyl-CoA dehydrogenase related to the alkylation response protein AidB [Lipid transport and metabolism]. 393
1969 224872 COG1961 PinE Site-specific DNA recombinase related to the DNA invertase Pin [Replication, recombination and repair]. 222
1970 224873 COG1962 MtrH Tetrahydromethanopterin S-methyltransferase, subunit H [Coenzyme transport and metabolism]. 313
1971 224874 COG1963 YuiD Acid phosphatase family membrane protein YuiD [General function prediction only]. 153
1972 224875 COG1964 COG1964 Uncharacterized Fe-S cluster-containing enzyme, radical SAM superfamily [General function prediction only]. 475
1973 224876 COG1965 CyaY Iron-binding protein CyaY, frataxin homolog [Inorganic ion transport and metabolism]. 106
1974 224877 COG1966 CstA Carbon starvation protein CstA [Signal transduction mechanisms]. 575
1975 224878 COG1967 COG1967 Uncharacterized membrane protein [Function unknown]. 271
1976 224879 COG1968 UppP Undecaprenyl pyrophosphate phosphatase [Lipid transport and metabolism]. 270
1977 224880 COG1969 HyaC Ni,Fe-hydrogenase I cytochrome b subunit [Energy production and conversion]. 227
1978 224881 COG1970 MscL Large-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 130
1979 224882 COG1971 MntP Putative Mn2+ efflux pump MntP [Inorganic ion transport and metabolism]. 190
1980 224883 COG1972 NupC Nucleoside permease NupC [Nucleotide transport and metabolism]. 404
1981 224884 COG1973 HypE Hydrogenase maturation factor HypE [Posttranslational modification, protein turnover, chaperones]. 449
1982 224885 COG1974 LexA SOS-response transcriptional repressor LexA (RecA-mediated autopeptidase) [Transcription, Signal transduction mechanisms]. 201
1983 224886 COG1975 XdhC Xanthine and CO dehydrogenase maturation factor, XdhC/CoxF family [Posttranslational modification, protein turnover, chaperones]. 278
1984 224887 COG1976 TIF6 Translation initiation factor 6 (eIF-6) [Translation, ribosomal structure and biogenesis]. 222
1985 224888 COG1977 MoaD Molybdopterin converting factor, small subunit [Coenzyme transport and metabolism]. 84
1986 224889 COG1978 YkuK Predicted RNase H-related nuclease YkuK, DUF458 family [General function prediction only]. 152
1987 224890 COG1979 YqdH Alcohol dehydrogenase YqhD, Fe-dependent ADH family [Energy production and conversion]. 384
1988 224891 COG1980 COG1980 Archaeal fructose 1,6-bisphosphatase [Carbohydrate transport and metabolism]. 369
1989 224892 COG1981 COG1981 Uncharacterized membrane protein [Function unknown]. 149
1990 224893 COG1982 LdcC Arginine/lysine/ornithine decarboxylase [Amino acid transport and metabolism]. 557
1991 224894 COG1983 PspC Phage shock protein PspC (stress-responsive transcriptional regulator) [Transcription, Signal transduction mechanisms]. 70
1992 224895 COG1984 DUR1B Allophanate hydrolase subunit 2 [Amino acid transport and metabolism]. 314
1993 224896 COG1985 RibD Pyrimidine reductase, riboflavin biosynthesis [Coenzyme transport and metabolism]. 218
1994 224897 COG1986 YjjX Non-canonical (house-cleaning) NTP pyrophosphatase, all-alpha NTP-PPase family [Nucleotide transport and metabolism, Defense mechanisms]. 175
1995 224898 COG1987 FliQ Flagellar biosynthesis protein FliQ [Cell motility]. 89
1996 224899 COG1988 YbcI Membrane-bound metal-dependent hydrolase YbcI, DUF457 family [General function prediction only]. 190
1997 224900 COG1989 PulO Prepilin signal peptidase PulO (type II secretory pathway) or related peptidase [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 254
1998 224901 COG1990 Pth2 Peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 122
1999 224902 COG1991 COG1991 Uncharacterized protein, UPF0333 family [Function unknown]. 131
2000 224903 COG1992 COG1992 Predicted transcriptional regulator fused phosphomethylpyrimidine kinase (thiamin biosynthesis) [General function prediction only]. 181
2001 224904 COG1993 COG1993 PII-like signaling protein [Signal transduction mechanisms]. 109
2002 224905 COG1994 SpoIVFB Zn-dependent protease (includes SpoIVFB) [Posttranslational modification, protein turnover, chaperones]. 230
2003 224906 COG1995 PdxA 4-hydroxy-L-threonine phosphate dehydrogenase PdxA [Coenzyme transport and metabolism]. 332
2004 224907 COG1996 RPC10 DNA-directed RNA polymerase, subunit RPC12/RpoP, contains C4-type Zn-finger [Transcription]. 49
2005 224908 COG1997 RPL43A Ribosomal protein L37AE/L43A [Translation, ribosomal structure and biogenesis]. 89
2006 224909 COG1998 RPS27AE Ribosomal protein S27AE [Translation, ribosomal structure and biogenesis]. 51
2007 224910 COG1999 Sco1 Cytochrome oxidase Cu insertion factor, SCO1/SenC/PrrC family [Posttranslational modification, protein turnover, chaperones]. 207
2008 224911 COG2000 COG2000 Uncharacterized Fe-S cluster-containing protein [General function prediction only]. 226
2009 224912 COG2001 MraZ MraZ, DNA-binding transcriptional regulator and inhibitor of RsmH methyltransferase activity [Translation, ribosomal structure and biogenesis]. 146
2010 224913 COG2002 AbrB Bifunctional DNA-binding transcriptional regulator of stationary/sporulation/toxin gene expression and antitoxin component of the YhaV-PrlF toxin-antitoxin module [Transcription, Defense mechanisms]. 89
2011 224914 COG2003 RadC DNA repair protein RadC, contains a helix-hairpin-helix DNA-binding motif [Replication, recombination and repair]. 224
2012 224915 COG2004 RPS24A Ribosomal protein S24E [Translation, ribosomal structure and biogenesis]. 107
2013 224916 COG2005 ModE DNA-binding transcriptional regulator ModE (molybdenum-dependent) [Transcription]. 130
2014 224917 COG2006 COG2006 Uncharacterized conserved protein, DUF362 family [Function unknown]. 293
2015 224918 COG2007 RPS8A Ribosomal protein S8E [Translation, ribosomal structure and biogenesis]. 127
2016 224919 COG2008 GLY1 Threonine aldolase [Amino acid transport and metabolism]. 342
2017 224920 COG2009 SdhC Succinate dehydrogenase/fumarate reductase, cytochrome b subunit [Energy production and conversion]. 132
2018 224921 COG2010 CccA Cytochrome c, mono- and diheme variants [Energy production and conversion]. 150
2019 224922 COG2011 MetP ABC-type methionine transport system, permease component [Amino acid transport and metabolism]. 222
2020 224923 COG2012 RPB5 DNA-directed RNA polymerase, subunit H, RpoH/RPB5 [Transcription]. 80
2021 224924 COG2013 AIM24 Uncharacterized conserved protein, AIM24 family [Function unknown]. 227
2022 224925 COG2014 COG2014 Uncharacterized conserved protein, contains DUF4213 and DUF364 domains [Function unknown]. 250
2023 224926 COG2015 BDS1 Alkyl sulfatase BDS1 and related hydrolases, metallo-beta-lactamase superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 655
2024 224927 COG2016 Tma20 Predicted ribosome-associated RNA-binding protein Tma20, contains PUA domain [Translation, ribosomal structure and biogenesis]. 161
2025 224928 COG2017 GalM Galactose mutarotase or related enzyme [Carbohydrate transport and metabolism]. 308
2026 224929 COG2018 COG2018 Predicted regulator of Ras-like GTPase activity, Roadblock/LC7/MglB family [Signal transduction mechanisms]. 119
2027 224930 COG2019 AdkA Archaeal adenylate kinase [Nucleotide transport and metabolism]. 189
2028 224931 COG2020 STE14 Protein-S-isoprenylcysteine O-methyltransferase Ste14 [Posttranslational modification, protein turnover, chaperones]. 187
2029 224932 COG2021 MET2 Homoserine acetyltransferase [Amino acid transport and metabolism]. 368
2030 224933 COG2022 ThiG Thiamin biosynthesis thiazole synthase ThiGH, ThiG subunit [Coenzyme transport and metabolism]. 262
2031 224934 COG2023 RPR2 RNase P subunit RPR2 [Translation, ribosomal structure and biogenesis]. 105
2032 224935 COG2024 SepRS O-phosphoseryl-tRNA(Cys) synthetase [Translation, ribosomal structure and biogenesis]. 536
2033 224936 COG2025 FixB Electron transfer flavoprotein, alpha subunit [Energy production and conversion]. 313
2034 224937 COG2026 RelE mRNA-degrading endonuclease RelE, toxin component of the RelBE toxin-antitoxin system [Defense mechanisms]. 90
2035 224938 COG2027 DacB D-alanyl-D-alanine carboxypeptidase [Cell wall/membrane/envelope biogenesis]. 470
2036 224939 COG2028 COG2028 Uncharacterized protein [Function unknown]. 145
2037 224940 COG2029 COG2029 Uncharacterized protein [Function unknown]. 189
2038 224941 COG2030 MaoC Acyl dehydratase [Lipid transport and metabolism]. 159
2039 224942 COG2031 AtoE Short chain fatty acids transporter [Lipid transport and metabolism]. 446
2040 224943 COG2032 SodC Cu/Zn superoxide dismutase [Inorganic ion transport and metabolism]. 179
2041 224944 COG2033 SORL Desulfoferrodoxin, superoxide reductase-like (SORL) domain [Energy production and conversion]. 126
2042 224945 COG2034 COG2034 Uncharacterized membrane protein [Function unknown]. 85
2043 224946 COG2035 COG2035 Uncharacterized membrane protein [Function unknown]. 276
2044 224947 COG2036 HHT1 Archaeal histone H3/H4 [Chromatin structure and dynamics]. 91
2045 224948 COG2037 Ftr Formylmethanofuran:tetrahydromethanopterin formyltransferase [Energy production and conversion]. 297
2046 224949 COG2038 CobT NaMN:DMB phosphoribosyltransferase [Coenzyme transport and metabolism]. 347
2047 224950 COG2039 Pcp Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) [Posttranslational modification, protein turnover, chaperones]. 207
2048 224951 COG2040 MHT1 Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) [Amino acid transport and metabolism]. 300
2049 224952 COG2041 YedY Periplasmic DMSO/TMAO reductase YedYZ, molybdopterin-dependent catalytic subunit [Energy production and conversion]. 271
2050 224953 COG2042 Tsr3 Ribosome biogenesis protein Tsr3 (rRNA maturation) [Translation, ribosomal structure and biogenesis]. 179
2051 224954 COG2043 COG2043 Uncharacterized conserved protein, DUF169 family [Function unknown]. 237
2052 224955 COG2044 COG2044 Predicted peroxiredoxin [General function prediction only]. 120
2053 224956 COG2045 ComB Phosphosulfolactate phosphohydrolase or related enzyme [Coenzyme transport and metabolism, General function prediction only]. 230
2054 224957 COG2046 MET3 ATP sulfurylase (sulfate adenylyltransferase) [Inorganic ion transport and metabolism]. 397
2055 224958 COG2047 COG2047 Proteasome assembly chaperone (PAC2) family protein [General function prediction only]. 258
2056 224959 COG2048 HdrB Heterodisulfide reductase, subunit B [Energy production and conversion]. 293
2057 224960 COG2049 DUR1A Allophanate hydrolase subunit 1 [Amino acid transport and metabolism]. 223
2058 224961 COG2050 PaaI Acyl-coenzyme A thioesterase PaaI, contains HGG motif [Secondary metabolites biosynthesis, transport and catabolism]. 141
2059 224962 COG2051 RPS27A Ribosomal protein S27E [Translation, ribosomal structure and biogenesis]. 67
2060 224963 COG2052 RemA Regulator of extracellular matrix RemA, YlzA/DUF370 family [Cell wall/membrane/envelope biogenesis]. 89
2061 224964 COG2053 RPS28A Ribosomal protein S28E/S33 [Translation, ribosomal structure and biogenesis]. 69
2062 224965 COG2054 COG2054 Uncharacterized archaeal kinase related to aspartokinase [General function prediction only]. 212
2063 224966 COG2055 AllD Malate/lactate/ureidoglycolate dehydrogenase, LDH2 family [Energy production and conversion]. 349
2064 224967 COG2056 YuiF Predicted histidine transporter YuiF, NhaC family [Amino acid transport and metabolism]. 444
2065 224968 COG2057 AtoA Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit [Lipid transport and metabolism]. 225
2066 224969 COG2058 RPP1A Ribosomal protein L12E/L44/L45/RPP1/RPP2 [Translation, ribosomal structure and biogenesis]. 109
2067 224970 COG2059 ChrA Chromate transport protein ChrA [Inorganic ion transport and metabolism]. 195
2068 224971 COG2060 KdpA K+-transporting ATPase, A chain [Inorganic ion transport and metabolism]. 560
2069 224972 COG2061 COG2061 Uncharacterized conserved protein, contains ACT domain [General function prediction only]. 170
2070 224973 COG2062 SixA Phosphohistidine phosphatase SixA [Signal transduction mechanisms]. 163
2071 224974 COG2063 FlgH Flagellar basal body L-ring protein FlgH [Cell motility]. 230
2072 224975 COG2064 TadC Pilus assembly protein TadC [Extracellular structures]. 320
2073 224976 COG2065 PyrR Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase [Nucleotide transport and metabolism]. 179
2074 224977 COG2066 GlsA Glutaminase [Amino acid transport and metabolism]. 309
2075 224978 COG2067 FadL Long-chain fatty acid transport protein [Lipid transport and metabolism]. 440
2076 224979 COG2068 MocA CTP:molybdopterin cytidylyltransferase MocA [Coenzyme transport and metabolism]. 199
2077 224980 COG2069 CdhD CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein) [Energy production and conversion]. 403
2078 224981 COG2070 YrpB NAD(P)H-dependent flavin oxidoreductase YrpB, nitropropane dioxygenase family [General function prediction only]. 336
2079 224982 COG2071 PuuD Gamma-glutamyl-gamma-aminobutyrate hydrolase PuuD (putrescine degradation), contains GATase1-like domain [Amino acid transport and metabolism]. 243
2080 224983 COG2072 CzcO Predicted flavoprotein CzcO associated with the cation diffusion facilitator CzcD [Inorganic ion transport and metabolism]. 443
2081 224984 COG2073 CbiG Cobalamin biosynthesis protein CbiG [Coenzyme transport and metabolism]. 298
2082 224985 COG2074 Pgk2 2-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 299
2083 224986 COG2075 RPL24A Ribosomal protein L24E [Translation, ribosomal structure and biogenesis]. 66
2084 224987 COG2076 EmrE Multidrug transporter EmrE and related cation transporters [Defense mechanisms]. 106
2085 224988 COG2077 Tpx Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 158
2086 224989 COG2078 AMMECR1 Uncharacterized conserved protein, AMMECR1 domain [Function unknown]. 203
2087 224990 COG2079 PrpD 2-methylcitrate dehydratase PrpD [Carbohydrate transport and metabolism]. 453
2088 224991 COG2080 CoxS Aerobic-type carbon monoxide dehydrogenase, small subunit, CoxS/CutS family [Energy production and conversion]. 156
2089 224992 COG2081 YhiN Predicted flavoprotein YhiN [General function prediction only]. 408
2090 224993 COG2082 CobH Precorrin isomerase [Coenzyme transport and metabolism]. 210
2091 224994 COG2083 COG2083 Uncharacterized protein, UPF0216 family [Function unknown]. 140
2092 224995 COG2084 MmsB 3-hydroxyisobutyrate dehydrogenase or related beta-hydroxyacid dehydrogenase [Lipid transport and metabolism]. 286
2093 224996 COG2085 COG2085 Predicted dinucleotide-binding enzyme [General function prediction only]. 211
2094 224997 COG2086 FixA Electron transfer flavoprotein, alpha and beta subunits [Energy production and conversion]. 260
2095 224998 COG2087 CobU Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase [Coenzyme transport and metabolism]. 175
2096 224999 COG2088 SpoVG DNA-binding protein SpoVG, cell septation regulator [Cell cycle control, cell division, chromosome partitioning]. 95
2097 225000 COG2089 SpsE Sialic acid synthase SpsE, contains C-terminal SAF domain [Cell wall/membrane/envelope biogenesis]. 347
2098 225001 COG2090 COG2090 Uncharacterized protein [Function unknown]. 141
2099 225002 COG2091 Sfp Phosphopantetheinyl transferase [Coenzyme transport and metabolism]. 223
2100 225003 COG2092 EFB1 Translation elongation factor EF-1beta [Translation, ribosomal structure and biogenesis]. 88
2101 225004 COG2093 Spt4 RNA polymerase subunit RPABC4/transcription elongation factor Spt4 [Transcription]. 64
2102 225005 COG2094 Mpg 3-methyladenine DNA glycosylase Mpg [Replication, recombination and repair]. 200
2103 225006 COG2095 MarC Small neutral amino acid transporter SnatA, MarC family [Amino acid transport and metabolism]. 203
2104 225007 COG2096 PduO Cob(I)alamin adenosyltransferase [Coenzyme transport and metabolism]. 184
2105 225008 COG2097 RPL31A Ribosomal protein L31E [Translation, ribosomal structure and biogenesis]. 89
2106 225009 COG2098 COG2098 Uncharacterized protein [Function unknown]. 116
2107 225010 COG2099 CobK Precorrin-6x reductase [Coenzyme transport and metabolism]. 257
2108 225011 COG2100 COG2100 Uncharacterized Fe-S cluster-containing enzyme, radical SAM superfamily [General function prediction only]. 414
2109 225012 COG2101 SPT15 TATA-box binding protein (TBP), component of TFIID and TFIIIB [Transcription]. 185
2110 225013 COG2102 Dph6 Diphthamide synthase (EF-2-diphthine--ammonia ligase) [Translation, ribosomal structure and biogenesis]. 223
2111 225014 COG2103 MurQ N-acetylmuramic acid 6-phosphate (MurNAc-6-P) etherase [Cell wall/membrane/envelope biogenesis]. 298
2112 225015 COG2104 ThiS Sulfur carrier protein ThiS (thiamine biosynthesis) [Coenzyme transport and metabolism]. 68
2113 225016 COG2105 YtfP Uncharacterized conserved protein YtfP, gamma-glutamylcyclotransferase (GGCT)/AIG2-like family [General function prediction only]. 120
2114 225017 COG2106 MTH1 Predicted RNA methylase MTH1, SPOUT superfamily [General function prediction only]. 272
2115 225018 COG2107 MqnD Menaquinone biosynthesis enzyme MqnD [Coenzyme transport and metabolism]. 272
2116 225019 COG2108 COG2108 Uncharacterized conserved protein related to pyruvate formate-lyase activating enzyme [Function unknown]. 353
2117 225020 COG2109 BtuR ATP:corrinoid adenosyltransferase [Coenzyme transport and metabolism]. 198
2118 225021 COG2110 YmdB O-acetyl-ADP-ribose deacetylase (regulator of RNase III), contains Macro domain [Translation, ribosomal structure and biogenesis]. 179
2119 225022 COG2111 MnhB Multisubunit Na+/H+ antiporter, MnhB subunit [Inorganic ion transport and metabolism]. 162
2120 225023 COG2112 COG2112 Predicted Ser/Thr protein kinase [Signal transduction mechanisms]. 201
2121 225024 COG2113 ProX ABC-type proline/glycine betaine transport system, periplasmic component [Amino acid transport and metabolism]. 302
2122 225025 COG2114 AcyC Adenylate cyclase, class 3 [Signal transduction mechanisms]. 227
2123 225026 COG2115 XylA Xylose isomerase [Carbohydrate transport and metabolism]. 438
2124 225027 COG2116 FocA Formate/nitrite transporter FocA, FNT family [Inorganic ion transport and metabolism]. 265
2125 225028 COG2117 COG2117 Predicted subunit of tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 198
2126 225029 COG2118 PDCD5 DNA-binding TFAR19-related protein, PDSD5 family [General function prediction only]. 116
2127 225030 COG2119 Gdt1 Putative Ca2+/H+ antiporter, TMEM165/GDT1 family [General function prediction only]. 190
2128 225031 COG2120 LmbE N-acetylglucosaminyl deacetylase, LmbE family [Carbohydrate transport and metabolism]. 237
2129 225032 COG2121 COG2121 Uncharacterized conserved protein, lysophospholipid acyltransferase (LPLAT) superfamily [Function unknown]. 214
2130 225033 COG2122 COG2122 Uncharacterized protein, UPF0280 family, ApbE superfamily [Function unknown]. 256
2131 225034 COG2123 Rrp42 Exosome complex RNA-binding protein Rrp42, RNase PH superfamily [Translation, ribosomal structure and biogenesis]. 272
2132 225035 COG2124 CypX Cytochrome P450 [Secondary metabolites biosynthesis, transport and catabolism, Defense mechanisms]. 411
2133 225036 COG2125 RPS6A Ribosomal protein S6E (S10) [Translation, ribosomal structure and biogenesis]. 120
2134 225037 COG2126 RPL37A Ribosomal protein L37E [Translation, ribosomal structure and biogenesis]. 61
2135 225038 COG2127 ClpS ATP-dependent Clp protease adapter protein ClpS [Posttranslational modification, protein turnover, chaperones]. 107
2136 225039 COG2128 YciW Alkylhydroperoxidase family enzyme, contains CxxC motif [Inorganic ion transport and metabolism]. 177
2137 225040 COG2129 COG2129 Predicted phosphoesterase, related to the Icc protein [General function prediction only]. 226
2138 225041 COG2130 CurA NADPH-dependent curcumin reductase CurA [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 340
2139 225042 COG2131 ComEB Deoxycytidylate deaminase [Nucleotide transport and metabolism]. 164
2140 225043 COG2132 SufI Multicopper oxidase with three cupredoxin domains (includes cell division protein FtsP and spore coat protein CotA) [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism, Cell wall/membrane/envelope biogenesis]. 451
2141 225044 COG2133 YliI Glucose/arabinose dehydrogenase, beta-propeller fold [Carbohydrate transport and metabolism]. 399
2142 225045 COG2134 Cdh CDP-diacylglycerol pyrophosphatase [Lipid transport and metabolism]. 252
2143 225046 COG2135 SRAP Putative SOS response-associated peptidase YedK [Posttranslational modification, protein turnover, chaperones]. 226
2144 225047 COG2136 IMP4 rRNA maturation protein Rpf1, contains Brix/IMP4 (anticodon-binding) domain [Translation, ribosomal structure and biogenesis]. 191
2145 225048 COG2137 RecX SOS response regulatory protein OraA/RecX, interacts with RecA [Posttranslational modification, protein turnover, chaperones]. 174
2146 225049 COG2138 SirB Sirohydrochlorin ferrochelatase [Coenzyme transport and metabolism]. 245
2147 225050 COG2139 RPL21A Ribosomal protein L21E [Translation, ribosomal structure and biogenesis]. 98
2148 225051 COG2140 OxdD Oxalate decarboxylase/archaeal phosphoglucose isomerase, cupin superfamily [Carbohydrate transport and metabolism]. 209
2149 225052 COG2141 SsuD Flavin-dependent oxidoreductase, luciferase family (includes alkanesulfonate monooxygenase SsuD and methylene tetrahydromethanopterin reductase) [Coenzyme transport and metabolism, General function prediction only]. 336
2150 225053 COG2142 SdhD Succinate dehydrogenase, hydrophobic anchor subunit [Energy production and conversion]. 117
2151 225054 COG2143 SoxW Thioredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 182
2152 225055 COG2144 COG2144 Selenophosphate synthetase-related protein [General function prediction only]. 324
2153 225056 COG2145 ThiM Hydroxyethylthiazole kinase, sugar kinase family [Coenzyme transport and metabolism]. 265
2154 225057 COG2146 NirD Ferredoxin subunit of nitrite reductase or a ring-hydroxylating dioxygenase [Inorganic ion transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 106
2155 225058 COG2147 RPL19A Ribosomal protein L19E [Translation, ribosomal structure and biogenesis]. 150
2156 225059 COG2148 WcaJ Sugar transferase involved in LPS biosynthesis (colanic, teichoic acid) [Cell wall/membrane/envelope biogenesis]. 226
2157 225060 COG2149 YidH Uncharacterized membrane protein YidH, DUF202 family [Function unknown]. 120
2158 225061 COG2150 COG2150 Predicted regulator of amino acid metabolism, contains ACT domain [General function prediction only]. 167
2159 225062 COG2151 PaaD Metal-sulfur cluster biosynthetic enzyme [Posttranslational modification, protein turnover, chaperones]. 111
2160 225063 COG2152 COG2152 Predicted glycosyl hydrolase, GH43/DUF377 family [Carbohydrate transport and metabolism]. 314
2161 225064 COG2153 ElaA Predicted N-acyltransferase, GNAT family [General function prediction only]. 155
2162 225065 COG2154 PhhB Pterin-4a-carbinolamine dehydratase [Coenzyme transport and metabolism]. 101
2163 225066 COG2155 YuzA Uncharacterized membrane protein YuzA, DUF378 family [Function unknown]. 79
2164 225067 COG2156 KdpC K+-transporting ATPase, c chain [Inorganic ion transport and metabolism]. 190
2165 225068 COG2157 RPL20A Ribosomal protein L20A (L18A) [Translation, ribosomal structure and biogenesis]. 85
2166 225069 COG2158 COG2158 Uncharacterized protein, contains a Zn-finger-like domain [General function prediction only]. 112
2167 225070 COG2159 COG2159 Predicted metal-dependent hydrolase, TIM-barrel fold [General function prediction only]. 293
2168 225071 COG2160 AraA L-arabinose isomerase [Carbohydrate transport and metabolism]. 497
2169 225072 COG2161 StbD Antitoxin component YafN of the YafNO toxin-antitoxin module, PHD/YefM family [Defense mechanisms]. 86
2170 225073 COG2162 NhoA Arylamine N-acetyltransferase [Secondary metabolites biosynthesis, transport and catabolism]. 275
2171 225074 COG2163 RPL14A Ribosomal protein L14E/L6E/L27E [Translation, ribosomal structure and biogenesis]. 125
2172 225075 COG2164 COG2164 Uncharacterized protein [Function unknown]. 126
2173 225076 COG2165 PulG Type II secretory pathway, pseudopilin PulG [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 149
2174 225077 COG2166 SufE Sulfur transfer protein SufE, Fe-S cluster assembly [Posttranslational modification, protein turnover, chaperones]. 144
2175 225078 COG2167 RPL39 Ribosomal protein L39E [Translation, ribosomal structure and biogenesis]. 51
2176 225079 COG2168 DsrH Sulfur transfer complex TusBCD TusB component, DsrH family [Posttranslational modification, protein turnover, chaperones]. 96
2177 225080 COG2169 AdaA Methylphosphotriester-DNA--protein-cysteine methyltransferase (N-terminal fragment of Ada), contains Zn-binding and two AraC-type DNA-binding domains [Replication, recombination and repair]. 187
2178 225081 COG2170 YbdK Gamma-glutamyl:cysteine ligase YbdK, ATP-grasp superfamily [Posttranslational modification, protein turnover, chaperones]. 369
2179 225082 COG2171 DapD Tetrahydrodipicolinate N-succinyltransferase [Amino acid transport and metabolism]. 271
2180 225083 COG2172 RsbW Anti-sigma regulatory factor (Ser/Thr protein kinase) [Signal transduction mechanisms]. 146
2181 225084 COG2173 DdpX D-alanyl-D-alanine dipeptidase [Cell wall/membrane/envelope biogenesis]. 211
2182 225085 COG2174 RPL34A Ribosomal protein L34E [Translation, ribosomal structure and biogenesis]. 93
2183 225086 COG2175 TauD Taurine dioxygenase, alpha-ketoglutarate-dependent [Secondary metabolites biosynthesis, transport and catabolism]. 286
2184 225087 COG2176 PolC DNA polymerase III, alpha subunit (gram-positive type) [Replication, recombination and repair]. 1444
2185 225088 COG2177 FtsX Cell division protein FtsX [Cell cycle control, cell division, chromosome partitioning]. 297
2186 225089 COG2178 COG2178 Predicted RNA- or ssDNA-binding protein, translin family [General function prediction only]. 204
2187 225090 COG2179 YqeG Predicted phosphohydrolase YqeG, HAD superfamily [General function prediction only]. 175
2188 225091 COG2180 NarJ Nitrate reductase assembly protein NarJ, required for insertion of molybdenum cofactor [Energy production and conversion, Inorganic ion transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 179
2189 225092 COG2181 NarI Nitrate reductase gamma subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 228
2190 225093 COG2182 MalE Maltose-binding periplasmic protein MalE [Carbohydrate transport and metabolism]. 420
2191 225094 COG2183 Tex Transcriptional accessory protein Tex/SPT6 [Transcription]. 780
2192 225095 COG2184 FIDO Fido, protein-threonine AMPylation domain [Signal transduction mechanisms]. 201
2193 225096 COG2185 Sbm Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) [Lipid transport and metabolism]. 143
2194 225097 COG2186 FadR DNA-binding transcriptional regulator, FadR family [Transcription]. 241
2195 225098 COG2187 COG2187 Aminoglycoside phosphotransferase family enzyme [General function prediction only]. 337
2196 225099 COG2188 MngR DNA-binding transcriptional regulator, GntR family [Transcription]. 236
2197 225100 COG2189 Mod Adenine specific DNA methylase Mod [Replication, recombination and repair]. 590
2198 225101 COG2190 NagE Phosphotransferase system IIA component [Carbohydrate transport and metabolism]. 156
2199 225102 COG2191 FwdE Formylmethanofuran dehydrogenase subunit E [Energy production and conversion]. 206
2200 225103 COG2192 COG2192 Predicted carbamoyl transferase, NodU family [General function prediction only]. 555
2201 225104 COG2193 Bfr Bacterioferritin (cytochrome b1) [Inorganic ion transport and metabolism]. 157
2202 225105 COG2194 OpgE Phosphoethanolamine transferase for periplasmic glucans (OPG), alkaline phosphatase superfamily [Cell wall/membrane/envelope biogenesis]. 555
2203 225106 COG2195 PepD2 Di- or tripeptidase [Amino acid transport and metabolism]. 414
2204 225107 COG2197 CitB DNA-binding response regulator, NarL/FixJ family, contains REC and HTH domains [Signal transduction mechanisms, Transcription]. 211
2205 225108 COG2198 HPtr HPt (histidine-containing phosphotransfer) domain [Signal transduction mechanisms]. 122
2206 225109 COG2199 GGDEF GGDEF domain, diguanylate cyclase (c-di-GMP synthetase) or its enzymatically inactive variants [Signal transduction mechanisms]. 181
2207 225110 COG2200 EAL EAL domain, c-di-GMP-specific phosphodiesterase class I (or its enzymatically inactive variant) [Signal transduction mechanisms]. 256
2208 225111 COG2201 CheB Chemotaxis response regulator CheB, contains REC and protein-glutamate methylesterase domains [Cell motility, Signal transduction mechanisms]. 350
2209 225112 COG2202 PAS PAS domain [Signal transduction mechanisms]. 232
2210 225113 COG2203 FhlA GAF domain [Signal transduction mechanisms]. 175
2211 225114 COG2204 AtoC DNA-binding transcriptional response regulator, NtrC family, contains REC, AAA-type ATPase, and a Fis-type DNA-binding domains [Signal transduction mechanisms]. 464
2212 225115 COG2205 KdpD K+-sensing histidine kinase KdpD [Signal transduction mechanisms]. 890
2213 225116 COG2206 HDGYP HD-GYP domain, c-di-GMP phosphodiesterase class II (or its inactivated variant) [Signal transduction mechanisms]. 344
2214 225117 COG2207 AraC AraC-type DNA-binding domain and AraC-containing proteins [Transcription]. 127
2215 225118 COG2208 RsbU Serine phosphatase RsbU, regulator of sigma subunit [Signal transduction mechanisms, Transcription]. 367
2216 225119 COG2209 NqrE Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE [Energy production and conversion]. 198
2217 225120 COG2210 YrkE Peroxiredoxin family protein [Energy production and conversion]. 137
2218 225121 COG2211 MelB Na+/melibiose symporter or related transporter [Carbohydrate transport and metabolism]. 467
2219 225122 COG2212 MnhF Multisubunit Na+/H+ antiporter, MnhF subunit [Inorganic ion transport and metabolism]. 89
2220 225123 COG2213 MtlA Phosphotransferase system, mannitol-specific IIBC component [Carbohydrate transport and metabolism]. 472
2221 225124 COG2214 CbpA Curved DNA-binding protein CbpA, contains a DnaJ-like domain [Transcription]. 237
2222 225125 COG2215 RcnA ABC-type nickel/cobalt efflux system, permease component RcnA [Inorganic ion transport and metabolism]. 303
2223 225126 COG2216 KdpB High-affinity K+ transport system, ATPase chain B [Inorganic ion transport and metabolism]. 681
2224 225127 COG2217 ZntA Cation transport ATPase [Inorganic ion transport and metabolism]. 713
2225 225128 COG2218 FwdC Formylmethanofuran dehydrogenase subunit C [Energy production and conversion]. 264
2226 225129 COG2219 PRI2 Eukaryotic-type DNA primase, large subunit [Replication, recombination and repair]. 363
2227 225130 COG2220 UlaG L-ascorbate metabolism protein UlaG, beta-lactamase superfamily [Carbohydrate transport and metabolism]. 258
2228 225131 COG2221 DsrA Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits [Inorganic ion transport and metabolism]. 317
2229 225132 COG2222 AgaS Fructoselysine-6-P-deglycase FrlB and related proteins with duplicated sugar isomerase (SIS) domain [Cell wall/membrane/envelope biogenesis]. 340
2230 225133 COG2223 NarK Nitrate/nitrite transporter NarK [Inorganic ion transport and metabolism]. 417
2231 225134 COG2224 AceA Isocitrate lyase [Energy production and conversion]. 433
2232 225135 COG2225 AceB Malate synthase [Energy production and conversion]. 545
2233 225136 COG2226 UbiE Ubiquinone/menaquinone biosynthesis C-methylase UbiE [Coenzyme transport and metabolism]. 238
2234 225137 COG2227 UbiG 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase [Coenzyme transport and metabolism]. 243
2235 225138 COG2229 Srp102 Signal recognition particle receptor subunit beta, a GTPase [Intracellular trafficking, secretion, and vesicular transport]. 187
2236 225139 COG2230 Cfa Cyclopropane fatty-acyl-phospholipid synthase and related methyltransferases [Lipid transport and metabolism]. 283
2237 225140 COG2231 COG2231 Uncharacterized protein related to Endonuclease III [General function prediction only]. 215
2238 225141 COG2232 COG2232 Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 389
2239 225142 COG2233 UraA Xanthine/uracil permease [Nucleotide transport and metabolism]. 451
2240 225143 COG2234 Iap Zn-dependent amino- or carboxypeptidase, M28 family [Posttranslational modification, protein turnover, chaperones, Amino acid transport and metabolism]. 435
2241 225144 COG2235 ArcA Arginine deiminase [Amino acid transport and metabolism]. 409
2242 225145 COG2236 Hpt1 Hypoxanthine phosphoribosyltransferase [Coenzyme transport and metabolism]. 192
2243 225146 COG2237 COG2237 Uncharacterized membrane protein [Function unknown]. 364
2244 225147 COG2238 RPS19A Ribosomal protein S19E (S16A) [Translation, ribosomal structure and biogenesis]. 147
2245 225148 COG2239 MgtE Mg/Co/Ni transporter MgtE (contains CBS domain) [Inorganic ion transport and metabolism]. 451
2246 225149 COG2240 PdxK Pyridoxal/pyridoxine/pyridoxamine kinase [Coenzyme transport and metabolism]. 281
2247 225150 COG2241 CobL Precorrin-6B methylase 1 [Coenzyme transport and metabolism]. 210
2248 225151 COG2242 CobL Precorrin-6B methylase 2 [Coenzyme transport and metabolism]. 187
2249 225152 COG2243 CobF Precorrin-2 methylase [Coenzyme transport and metabolism]. 234
2250 225153 COG2244 RfbX Membrane protein involved in the export of O-antigen and teichoic acid [Cell wall/membrane/envelope biogenesis]. 480
2251 225154 COG2245 COG2245 Uncharacterized membrane protein [Function unknown]. 182
2252 225155 COG2246 GtrA Putative flippase GtrA (transmembrane translocase of bactoprenol-linked glucose) [Lipid transport and metabolism]. 139
2253 225156 COG2247 LytB Putative cell wall-binding domain [Cell wall/membrane/envelope biogenesis]. 337
2254 225157 COG2248 COG2248 Predicted hydrolase, metallo-beta-lactamase superfamily [General function prediction only]. 304
2255 225158 COG2249 MdaB Putative NADPH-quinone reductase (modulator of drug activity B) [General function prediction only]. 189
2256 225159 COG2250 HEPN HEPN domain [Function unknown]. 132
2257 225160 COG2251 COG2251 Predicted nuclease, RecB family [General function prediction only]. 474
2258 225161 COG2252 AzgA Xanthine/uracil/vitamin C permease, AzgA family [Nucleotide transport and metabolism]. 436
2259 225162 COG2253 COG2253 Predicted nucleotidyltransferase component of viral defense system [Defense mechanisms]. 258
2260 225163 COG2254 Cas3 CRISPR/Cas system-associated endonuclease Cas3-HD [Defense mechanisms]. 230
2261 225164 COG2255 RuvB Holliday junction resolvasome RuvABC, ATP-dependent DNA helicase subunit [Replication, recombination and repair]. 332
2262 225165 COG2256 RarA Replication-associated recombination protein RarA (DNA-dependent ATPase) [Replication, recombination and repair]. 436
2263 225166 COG2257 YlqH Type III secretion system substrate exporter, FlhB-like [Intracellular trafficking, secretion, and vesicular transport]. 92
2264 225167 COG2258 YiiM Uncharacterized conserved protein YiiM, contains MOSC domain [Function unknown]. 210
2265 225168 COG2259 DoxX Uncharacterized membrane protein YphA, DoxX/SURF4 family [Function unknown]. 142
2266 225169 COG2260 Nop10 rRNA maturation protein Nop10, contains Zn-ribbon domain [Translation, ribosomal structure and biogenesis]. 59
2267 225170 COG2261 YeaQ Uncharacterized membrane protein YeaQ/YmgE, transglycosylase-associated protein family [General function prediction only]. 82
2268 225171 COG2262 HflX 50S ribosomal subunit-associated GTPase HflX [Translation, ribosomal structure and biogenesis]. 411
2269 225172 COG2263 COG2263 Predicted RNA methylase [General function prediction only]. 198
2270 225173 COG2264 PrmA Ribosomal protein L11 methylase PrmA [Translation, ribosomal structure and biogenesis]. 300
2271 225174 COG2265 TrmA tRNA/tmRNA/rRNA uracil-C5-methylase, TrmA/RlmC/RlmD family [Translation, ribosomal structure and biogenesis]. 432
2272 225175 COG2266 COG2266 GTP:adenosylcobinamide-phosphate guanylyltransferase [Coenzyme transport and metabolism]. 177
2273 225176 COG2267 PldB Lysophospholipase, alpha-beta hydrolase superfamily [Lipid transport and metabolism]. 298
2274 225177 COG2268 YqiK Uncharacterized membrane protein YqiK, contains Band7/PHB/SPFH domain [Function unknown]. 548
2275 225178 COG2269 EpmA Elongation factor P--beta-lysine ligase (EF-P beta-lysylation pathway) [Translation, ribosomal structure and biogenesis]. 322
2276 225179 COG2270 BtlA MFS-type transporter involved in bile tolerance, Atg22 family [General function prediction only]. 438
2277 225180 COG2271 UhpC Sugar phosphate permease [Carbohydrate transport and metabolism]. 448
2278 225181 COG2272 PnbA Carboxylesterase type B [Lipid transport and metabolism]. 491
2279 225182 COG2273 BglS Beta-glucanase, GH16 family [Carbohydrate transport and metabolism]. 355
2280 225183 COG2274 SunT ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain [Defense mechanisms]. 709
2281 225184 COG2301 CitE Citrate lyase beta subunit [Carbohydrate transport and metabolism]. 283
2282 225185 COG2302 YlmH RNA-binding protein YlmH, contains S4-like domain [General function prediction only]. 257
2283 225186 COG2303 BetA Choline dehydrogenase or related flavoprotein [Lipid transport and metabolism, General function prediction only]. 542
2284 225187 COG2304 YfbK Secreted protein containing bacterial Ig-like domain and vWFA domain [General function prediction only]. 399
2285 225188 COG2306 COG2306 Predicted RNA-binding protein, associated with RNAse of E/G family [General function prediction only]. 183
2286 225189 COG2307 COG2307 Uncharacterized conserved protein, Alpha-E superfamily [Function unknown]. 313
2287 225190 COG2308 COG2308 Uncharacterized conserved protein, circularly permuted ATPgrasp superfamily [Function unknown]. 488
2288 225191 COG2309 AmpS Leucyl aminopeptidase (aminopeptidase T) [Amino acid transport and metabolism]. 385
2289 225192 COG2310 TerZ Stress response protein SCP2 [Signal transduction mechanisms]. 182
2290 225193 COG2311 YeiB Uncharacterized membrane protein YeiB [Function unknown]. 394
2291 225194 COG2312 YbfO Erythromycin esterase homolog [Secondary metabolites biosynthesis, transport and catabolism]. 405
2292 225195 COG2313 PsuG Pseudouridine-5'-phosphate glycosidase (pseudoU degradation) [Nucleotide transport and metabolism]. 310
2293 225196 COG2314 TM2 Uncharacterized membrane protein YozV, TM2 domain [Function unknown]. 95
2294 225197 COG2315 MmcQ Predicted DNA-binding protein with double-wing structural motif, MmcQ/YjbR family [Transcription]. 118
2295 225198 COG2316 COG2316 Predicted hydrolase, HD superfamily [General function prediction only]. 212
2296 225199 COG2317 YpwA Zn-dependent carboxypeptidase, M32 family [Posttranslational modification, protein turnover, chaperones]. 497
2297 225200 COG2318 DinB Uncharacterized damage-inducible protein DinB (forms a four-helix bundle) [Function unknown]. 172
2298 225201 COG2319 WD40 WD40 repeat [General function prediction only]. 466
2299 225202 COG2320 GrpB GrpB domain, predicted nucleotidyltransferase, UPF0157 family [General function prediction only]. 185
2300 225203 COG2321 YpfJ Predicted metalloprotease [General function prediction only]. 295
2301 225204 COG2322 YozB Uncharacterized membrane protein YozB, DUF420 family [Function unknown]. 177
2302 225205 COG2323 YcaP Uncharacterized membrane protein YcaP, DUF421 family [Function unknown]. 224
2303 225206 COG2324 COG2324 Uncharacterized membrane protein [Function unknown]. 281
2304 225207 COG2326 COG2326 Polyphosphate kinase 2, PPK2 family [Energy production and conversion]. 270
2305 225208 COG2327 WcaK Polysaccharide pyruvyl transferase family protein WcaK [Cell wall/membrane/envelope biogenesis]. 385
2306 225209 COG2329 HmoA Heme-degrading monooxygenase HmoA and related ABM domain proteins [Coenzyme transport and metabolism]. 105
2307 225210 COG2331 COG2331 Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 82
2308 225211 COG2332 CcmE Cytochrome c-type biogenesis protein CcmE [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 153
2309 225212 COG2333 ComEC Metal-dependent hydrolase, beta-lactamase superfamily II [General function prediction only]. 293
2310 225213 COG2334 SrkA Ser/Thr protein kinase RdoA involved in Cpx stress response, MazF antagonist [Signal transduction mechanisms]. 331
2311 225214 COG2335 FAS1 Uncharacterized surface protein containing fasciclin (FAS1) repeats [General function prediction only]. 187
2312 225215 COG2336 MazE Antitoxin component of the MazEF toxin-antitoxin module [Signal transduction mechanisms]. 82
2313 225216 COG2337 MazF mRNA-degrading endonuclease, toxin component of the MazEF toxin-antitoxin module [Defense mechanisms]. 112
2314 225217 COG2339 PrsW Membrane proteinase PrsW, cleaves anti-sigma factor RsiW, M82 family [Signal transduction mechanisms]. 274
2315 225218 COG2340 YkwD Uncharacterized conserved protein YkwD, contains CAP (CSP/antigen 5/PR1) domain [Function unknown]. 207
2316 225219 COG2342 COG2342 Endo alpha-1,4 polygalactosaminidase, GH114 family (was erroneously annotated as Cys-tRNA synthetase) [Carbohydrate transport and metabolism]. 300
2317 225220 COG2343 COG2343 Uncharacterized conserved protein, DUF427 family [Function unknown]. 132
2318 225221 COG2344 Rex NADH/NAD ratio-sensing transcriptional regulator Rex [Transcription]. 211
2319 225222 COG2345 COG2345 Predicted transcriptional regulator, ArsR family [Transcription]. 218
2320 225223 COG2346 YjbI Truncated hemoglobin YjbI [Inorganic ion transport and metabolism]. 133
2321 225224 COG2348 FmhB Lipid II:glycine glycyltransferase (Peptidoglycan interpeptide bridge formation enzyme) [Cell wall/membrane/envelope biogenesis]. 418
2322 225225 COG2350 YciI Uncharacterized conserved protein YciI, contains a putative active-site phosphohistidine [General function prediction only]. 92
2323 225226 COG2351 HiuH 5-hydroxyisourate hydrolase (purine catabolism), transthyretin-related family [Nucleotide transport and metabolism]. 124
2324 225227 COG2352 Ppc Phosphoenolpyruvate carboxylase [Energy production and conversion]. 910
2325 225228 COG2353 YceI Polyisoprenoid-binding periplasmic protein YceI [General function prediction only]. 192
2326 225229 COG2354 MutK Uncharacterized membrane protein MutK, may be involved in DNA repair [Function unknown]. 303
2327 225230 COG2355 COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog [Posttranslational modification, protein turnover, chaperones, Amino acid transport and metabolism]. 313
2328 225231 COG2356 EndA Endonuclease I [Replication, recombination and repair]. 237
2329 225232 COG2357 YjbM ppGpp synthetase catalytic domain (RelA/SpoT-type nucleotidyltransferase) [Nucleotide transport and metabolism, Signal transduction mechanisms]. 231
2330 225233 COG2358 Imp TRAP-type uncharacterized transport system, periplasmic component [General function prediction only]. 321
2331 225234 COG2359 SpoVS Stage V sporulation protein SpoVS (function unknown) [Function unknown]. 87
2332 225235 COG2360 Aat Leu/Phe-tRNA-protein transferase [Posttranslational modification, protein turnover, chaperones]. 221
2333 225236 COG2361 COG2361 Uncharacterized conserved protein, contains HEPN domain [Function unknown]. 117
2334 225237 COG2362 DppA D-aminopeptidase [Amino acid transport and metabolism]. 274
2335 225238 COG2363 YgdD Uncharacterized membrane protein YgdD, TMEM256/DUF423 family [Function unknown]. 124
2336 225239 COG2364 YczE Uncharacterized membrane protein YczE [Function unknown]. 210
2337 225240 COG2365 Oca4 Protein tyrosine/serine phosphatase [Signal transduction mechanisms]. 249
2338 225241 COG2366 PvdQ Acyl-homoserine lactone (AHL) acylase PvdQ [Secondary metabolites biosynthesis, transport and catabolism]. 768
2339 225242 COG2367 PenP Beta-lactamase class A [Defense mechanisms]. 329
2340 225243 COG2368 YoaI Aromatic ring hydroxylase [Secondary metabolites biosynthesis, transport and catabolism]. 493
2341 225244 COG2369 COG2369 Uncharacterized conserved protein, contains phage Mu gpF-like domain [Function unknown]. 432
2342 225245 COG2370 HupE Hydrogenase/urease accessory protein HupE [Posttranslational modification, protein turnover, chaperones]. 201
2343 225246 COG2371 UreE Urease accessory protein UreE [Posttranslational modification, protein turnover, chaperones]. 155
2344 225247 COG2372 CopC Copper-binding protein CopC (methionine-rich) [Inorganic ion transport and metabolism]. 127
2345 225248 COG2373 YfaS Uncharacterized conserved protein YfaS, alpha-2-macroglobulin family [General function prediction only]. 1621
2346 225249 COG2374 COG2374 Predicted extracellular nuclease [General function prediction only]. 798
2347 225250 COG2375 ViuB NADPH-dependent ferric siderophore reductase, contains FAD-binding and SIP domains [Inorganic ion transport and metabolism]. 265
2348 225251 COG2376 DAK1 Dihydroxyacetone kinase [Carbohydrate transport and metabolism]. 323
2349 225252 COG2377 AnmK 1,6-Anhydro-N-acetylmuramate kinase [Cell wall/membrane/envelope biogenesis]. 371
2350 225253 COG2378 YafY Predicted DNA-binding transcriptional regulator YafY, contains an HTH and WYL domains [Transcription]. 311
2351 225254 COG2379 GckA Glycerate-2-kinase [Carbohydrate transport and metabolism]. 422
2352 225255 COG2380 COG2380 Uncharacterized protein [Function unknown]. 327
2353 225256 COG2382 Fes Enterochelin esterase or related enzyme [Inorganic ion transport and metabolism]. 299
2354 225257 COG2383 COG2383 Uncharacterized membrane protein, Fun14 family [Function unknown]. 109
2355 225258 COG2384 TrmK tRNA A22 N-methylase [Translation, ribosomal structure and biogenesis]. 226
2356 225259 COG2385 SpoIID Peptidoglycan hydrolase (amidase) enhancer domain [Cell wall/membrane/envelope biogenesis]. 397
2357 225260 COG2386 CcmB ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 221
2358 225261 COG2388 YidJ Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 99
2359 225262 COG2389 COG2389 Uncharacterized metal-binding protein, DUF2227 family [Function unknown]. 179
2360 225263 COG2390 DeoR DNA-binding transcriptional regulator LsrR, DeoR family [Transcription]. 321
2361 225264 COG2391 YedE Uncharacterized membrane protein YedE/YeeE, contains two sulfur transport domains [General function prediction only]. 198
2362 225265 COG2401 MK0520 ABC-type ATPase fused to a predicted acetyltransferase domain [General function prediction only]. 593
2363 225266 COG2402 COG2402 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 135
2364 225267 COG2403 COG2403 Predicted GTPase [General function prediction only]. 449
2365 225268 COG2404 NrnB Oligoribonuclease NrnB or cAMP/cGMP phosphodiesterase, DHH superfamily [Translation, ribosomal structure and biogenesis, Signal transduction mechanisms]. 339
2366 225269 COG2405 COG2405 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 157
2367 225270 COG2406 COG2406 Protein distantly related to bacterial ferritins [General function prediction only]. 172
2368 225271 COG2407 FucI L-fucose isomerase or related protein [Carbohydrate transport and metabolism]. 470
2369 225272 COG2409 YdfJ Uncharacterized membrane protein YdfJ, MMPL/SSD domain [Function unknown]. 937
2370 225273 COG2410 COG2410 Predicted nuclease (RNAse H fold) [General function prediction only]. 178
2371 225274 COG2411 COG2411 Uncharacterized protein [Function unknown]. 188
2372 225275 COG2412 COG2412 Uncharacterized protein [Function unknown]. 101
2373 225276 COG2413 COG2413 Predicted nucleotidyltransferase [General function prediction only]. 228
2374 225277 COG2414 YdhV Aldehyde:ferredoxin oxidoreductase [Energy production and conversion]. 614
2375 225278 COG2419 COG2419 Trm5-related predicted tRNA methylase [Translation, ribosomal structure and biogenesis]. 336
2376 225279 COG2421 FmdA Acetamidase/formamidase [Energy production and conversion]. 305
2377 225280 COG2423 OCDMu Ornithine cyclodeaminase/archaeal alanine dehydrogenase, mu-crystallin family [Amino acid transport and metabolism]. 330
2378 225281 COG2425 ViaA Uncharacterized protein, contains a von Willebrand factor type A (vWA) domain [Function unknown]. 437
2379 225282 COG2426 COG2426 Uncharacterized membrane protein [Function unknown]. 142
2380 225283 COG2427 YjgD Uncharacterized conserved protein YjgD, DUF1641 family [Function unknown]. 148
2381 225284 COG2428 Sfm1 Rps3 or RNA methylase involved in ribosome biogenesis, SPOUT family [Translation, ribosomal structure and biogenesis]. 196
2382 225285 COG2429 Gch31 Archaeal GTP cyclohydrolase III [Nucleotide transport and metabolism]. 250
2383 225286 COG2430 COG2430 Uncharacterized protein [Function unknown]. 236
2384 225287 COG2431 YbjE Uncharacterized membrane protein YbjE, DUF340 family [Function unknown]. 297
2385 225288 COG2433 COG2433 Possible nuclease of RNase H fold, RuvC/YqgF family [General function prediction only]. 652
2386 225289 COG2440 FixX Ferredoxin-like protein FixX [Energy production and conversion]. 99
2387 225290 COG2441 COG2441 Predicted butyrate kinase, DUF1464 family [General function prediction only]. 374
2388 225291 COG2442 COG2442 Uncharacterized conserved protein, DUF433 family [Function unknown]. 79
2389 225292 COG2443 Sss1 Preprotein translocase subunit Sss1 [Intracellular trafficking, secretion, and vesicular transport]. 65
2390 225293 COG2445 YutE Uncharacterized conserved protein YutE, UPF0331/DUF86 family [Function unknown]. 138
2391 225294 COG2450 COG2450 Predicted archaeal cell division protein, SepF homolog, DUF552 family [Cell cycle control, cell division, chromosome partitioning]. 124
2392 225295 COG2451 Rpl35A Ribosomal protein L35AE/L33A [Translation, ribosomal structure and biogenesis]. 100
2393 225296 COG2452 COG2452 Predicted site-specific integrase-resolvase [Mobilome: prophages, transposons]. 193
2394 225297 COG2453 CDC14 Protein-tyrosine phosphatase [Signal transduction mechanisms]. 180
2395 225298 COG2454 COG2454 Uncharacterized protein [Function unknown]. 211
2396 225299 COG2456 COG2456 Uncharacterized protein [Function unknown]. 121
2397 225300 COG2457 COG2457 Uncharacterized protein [Function unknown]. 199
2398 225301 COG2461 COG2461 Uncharacterized conserved protein, DUF438 domain, may contain hemerythrin domain [Function unknown]. 409
2399 225302 COG2469 COG2469 Uncharacterized protein, contains HTH domain [Function unknown]. 284
2400 225303 COG2501 YbcJ Ribosome-associated protein YbcJ, S4-like RNA binding protein [Translation, ribosomal structure and biogenesis]. 73
2401 225304 COG2502 AsnA Asparagine synthetase A [Amino acid transport and metabolism]. 330
2402 225305 COG2503 COG2503 Predicted secreted acid phosphatase [General function prediction only]. 274
2403 225306 COG2508 PucR DNA-binding transcriptional regulator, PucR family [Transcription]. 421
2404 225307 COG2509 COG2509 FAD-dependent dehydrogenase [General function prediction only]. 486
2405 225308 COG2510 COG2510 Uncharacterized membrane protein [Function unknown]. 140
2406 225309 COG2511 GatE Archaeal Glu-tRNAGln amidotransferase subunit E, contains GAD domain [Translation, ribosomal structure and biogenesis]. 631
2407 225310 COG2512 COG2512 Uncharacterized membrane protein [Function unknown]. 258
2408 225311 COG2513 PrpB 2-Methylisocitrate lyase and related enzymes, PEP mutase family [Carbohydrate transport and metabolism]. 289
2409 225312 COG2514 CatE Catechol-2,3-dioxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 265
2410 225313 COG2515 Acd 1-aminocyclopropane-1-carboxylate deaminase/D-cysteine desulfhydrase, PLP-dependent ACC family [Amino acid transport and metabolism]. 323
2411 225314 COG2516 COG2516 Biotin synthase-related protein, radical SAM superfamily [General function prediction only]. 339
2412 225315 COG2517 COG2517 Predicted RNA-binding protein, contains C-terminal EMAP domain [General function prediction only]. 219
2413 225316 COG2518 Pcm Protein-L-isoaspartate O-methyltransferase [Posttranslational modification, protein turnover, chaperones]. 209
2414 225317 COG2519 Gcd14 tRNA A58 N-methylase Trm61 [Translation, ribosomal structure and biogenesis]. 256
2415 225318 COG2520 Trm5 tRNA G37 N-methylase Trm5 [Translation, ribosomal structure and biogenesis]. 341
2416 225319 COG2521 COG2521 Predicted archaeal methyltransferase [General function prediction only]. 287
2417 225320 COG2522 COG2522 Predicted transcriptional regulator [General function prediction only]. 119
2418 225321 COG2524 COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]. 294
2419 225322 COG2602 YbxI Beta-lactamase class D [Defense mechanisms]. 254
2420 225323 COG2603 SelU tRNA 2-selenouridine synthase SelU, contains rhodanese domain [Translation, ribosomal structure and biogenesis]. 334
2421 225324 COG2604 COG2604 Uncharacterized conserved protein [Function unknown]. 594
2422 225325 COG2605 COG2605 Predicted kinase related to galactokinase and mevalonate kinase [General function prediction only]. 333
2423 225326 COG2606 EbsC Cys-tRNA(Pro) deacylase, prolyl-tRNA editing enzyme YbaK/EbsC [Translation, ribosomal structure and biogenesis]. 155
2424 225327 COG2607 COG2607 Predicted ATPase, AAA+ superfamily [General function prediction only]. 287
2425 225328 COG2608 CopZ Copper chaperone CopZ [Inorganic ion transport and metabolism]. 71
2426 225329 COG2609 AceE Pyruvate dehydrogenase complex, dehydrogenase (E1) component [Energy production and conversion]. 887
2427 225330 COG2610 GntT H+/gluconate symporter or related permease [Carbohydrate transport and metabolism, General function prediction only]. 442
2428 225331 COG2703 COG2703 Hemerythrin [Signal transduction mechanisms]. 144
2429 225332 COG2704 DcuA Anaerobic C4-dicarboxylate transporter [Carbohydrate transport and metabolism]. 436
2430 225333 COG2706 Pgl 6-phosphogluconolactonase, cycloisomerase 2 family [Carbohydrate transport and metabolism]. 346
2431 225334 COG2707 YeaL Uncharacterized membrane protein, DUF441 family [Function unknown]. 151
2432 225335 COG2710 NifD Nitrogenase molybdenum-iron protein, alpha and beta chains [Inorganic ion transport and metabolism]. 456
2433 225336 COG2715 SpmA Spore maturation protein SpmA (function unknown) [General function prediction only]. 206
2434 225337 COG2716 GcvR Glycine cleavage system regulatory protein [Amino acid transport and metabolism]. 176
2435 225338 COG2717 YedZ Periplasmic DMSO/TMAO reductase YedYZ, heme-binding membrane subunit [Energy production and conversion]. 209
2436 225339 COG2718 YeaH Uncharacterized conserved protein YeaH/YhbH, required for sporulation, DUF444 family [General function prediction only]. 423
2437 225340 COG2719 SpoVR Stage V sporulation protein SpoVR/YcgB, involved in spore cortex formation (function unknown) [Cell cycle control, cell division, chromosome partitioning]. 495
2438 225341 COG2720 YoaR Vancomycin resistance protein YoaR (function unknown), contains peptidoglycan-binding and VanW domains [Defense mechanisms]. 376
2439 225342 COG2721 UxaA Altronate dehydratase [Carbohydrate transport and metabolism]. 381
2440 225343 COG2723 BglB Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase [Carbohydrate transport and metabolism]. 460
2441 225344 COG2730 BglC Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism]. 407
2442 225345 COG2731 EbgC Beta-galactosidase, beta subunit [Carbohydrate transport and metabolism]. 154
2443 225346 COG2732 BarS Barstar, RNAse (barnase) inhibitor [Transcription]. 91
2444 225347 COG2733 YjiN Uncharacterized membrane-anchored protein YjiN, DUF445 family [Function unknown]. 415
2445 225348 COG2738 YugP Zn-dependent membrane protease YugP [Posttranslational modification, protein turnover, chaperones]. 226
2446 225349 COG2739 YlxM Predicted DNA-binding protein YlxM, UPF0122 family [Transcription]. 105
2447 225350 COG2740 YlxR Predicted RNA-binding protein YlxR, DUF448 family [General function prediction only]. 95
2448 225351 COG2746 YokD Aminoglycoside N3'-acetyltransferase [Defense mechanisms]. 251
2449 225352 COG2747 FlgM Negative regulator of flagellin synthesis (anti-sigma28 factor) [Transcription, Cell motility]. 93
2450 225353 COG2755 TesA Lysophospholipase L1 or related esterase [Amino acid transport and metabolism]. 216
2451 225354 COG2759 MIS1 Formyltetrahydrofolate synthetase [Nucleotide transport and metabolism]. 554
2452 225355 COG2761 FrnE Predicted dithiol-disulfide isomerase, DsbA family [Posttranslational modification, protein turnover, chaperones]. 225
2453 225356 COG2764 PhnB Uncharacterized conserved protein PhnB, glyoxalase superfamily [General function prediction only]. 136
2454 225357 COG2766 PrkA Predicted Ser/Thr protein kinase [Signal transduction mechanisms]. 649
2455 225358 COG2768 COG2768 Uncharacterized Fe-S cluster protein [Function unknown]. 354
2456 225359 COG2770 HAMP HAMP domain [Signal transduction mechanisms]. 83
2457 225360 COG2771 CsgD DNA-binding transcriptional regulator, CsgD family [Transcription]. 65
2458 225361 COG2801 Tra5 Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]. 232
2459 225362 COG2802 LON Uncharacterized protein, LON-like domain, ASCH/PUA-like superfamily [Function unknown]. 221
2460 225363 COG2804 PulE Type II secretory pathway ATPase GspE/PulE or T4P pilus assembly pathway ATPase PilB [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 500
2461 225364 COG2805 PilT Tfp pilus assembly protein PilT, pilus retraction ATPase [Cell motility, Extracellular structures]. 353
2462 225365 COG2807 CynX Cyanate permease [Inorganic ion transport and metabolism]. 395
2463 225366 COG2808 PaiB Predicted FMN-binding regulatory protein PaiB [Signal transduction mechanisms]. 209
2464 225367 COG2810 COG2810 Predicted type IV restriction endonuclease [Defense mechanisms]. 284
2465 225368 COG2811 NtpH Archaeal/vacuolar-type H+-ATPase subunit H [Energy production and conversion]. 108
2466 225369 COG2812 DnaX DNA polymerase III, gamma/tau subunits [Replication, recombination and repair]. 515
2467 225370 COG2813 RsmC 16S rRNA G1207 methylase RsmC [Translation, ribosomal structure and biogenesis]. 300
2468 225371 COG2814 AraJ Predicted arabinose efflux permease, MFS family [Carbohydrate transport and metabolism]. 394
2469 225372 COG2815 PASTA PASTA domain, binds beta-lactams [Cell wall/membrane/envelope biogenesis]. 303
2470 225373 COG2816 NPY1 NADH pyrophosphatase NudC, Nudix superfamily [Nucleotide transport and metabolism]. 279
2471 225374 COG2818 Tag 3-methyladenine DNA glycosylase Tag [Replication, recombination and repair]. 188
2472 225375 COG2819 YbbA Predicted hydrolase of the alpha/beta superfamily [General function prediction only]. 264
2473 225376 COG2820 Udp Uridine phosphorylase [Nucleotide transport and metabolism]. 248
2474 225377 COG2821 MltA Membrane-bound lytic murein transglycosylase [Cell wall/membrane/envelope biogenesis]. 373
2475 225378 COG2822 EfeO Iron uptake system EfeUOB, periplasmic (or lipoprotein) component EfeO/EfeM [Inorganic ion transport and metabolism]. 376
2476 225379 COG2823 OsmY Osmotically-inducible protein OsmY, contains BON domain [Function unknown]. 196
2477 225380 COG2824 PhnA Uncharacterized Zn-ribbon-containing protein [General function prediction only]. 112
2478 225381 COG2825 HlpA Periplasmic chaperone for outer membrane proteins, Skp family [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 170
2479 225382 COG2826 Tra8 Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons]. 318
2480 225383 COG2827 YhbQ Predicted endonuclease, GIY-YIG superfamily [Replication, recombination and repair]. 95
2481 225384 COG2828 PrpF 2-Methylaconitate cis-trans-isomerase PrpF (2-methyl citrate pathway) [Energy production and conversion]. 378
2482 225385 COG2829 PldA Outer membrane phospholipase A [Cell wall/membrane/envelope biogenesis]. 317
2483 225386 COG2830 COG2830 Uncharacterized protein [Function unknown]. 214
2484 225387 COG2831 FhaC Hemolysin activation/secretion protein [Intracellular trafficking, secretion, and vesicular transport]. 554
2485 225388 COG2832 YbaN Uncharacterized membrane protein YbaN, DUF454 family [Function unknown]. 119
2486 225389 COG2833 COG2833 Uncharacterized conserved protein, contains ferritin-like DUF455 domain [Function unknown]. 268
2487 225390 COG2834 LolA Outer membrane lipoprotein-sorting protein [Cell wall/membrane/envelope biogenesis]. 211
2488 225391 COG2835 YcaR Uncharacterized conserved protein YbaR, Trm112 family [Function unknown]. 60
2489 225392 COG2836 TauE Sulfite exporter TauE/SafE [Inorganic ion transport and metabolism]. 232
2490 225393 COG2837 EfeB Periplasmic deferrochelatase/peroxidase EfeB [Inorganic ion transport and metabolism]. 352
2491 225394 COG2838 IcdM Monomeric isocitrate dehydrogenase [Energy production and conversion]. 744
2492 225395 COG2839 YqgC Uncharacterized conserved protein YqgC, DUF456 family [Function unknown]. 160
2493 225396 COG2840 SmrA DNA-nicking endonuclease, Smr domain [Replication, recombination and repair]. 184
2494 225397 COG2841 YdcH Uncharacterized conserved protein YdcH, DUF465 family [Function unknown]. 72
2495 225398 COG2842 COG2842 Bacteriophage DNA transposition protein, AAA+ family ATPase [Mobilome: prophages, transposons]. 297
2496 225399 COG2843 COG2843 Poly-gamma-glutamate biosynthesis protein CapA/YwtB (capsule formation), metallophosphatase superfamily [Cell wall/membrane/envelope biogenesis]. 372
2497 225400 COG2844 GlnD UTP:GlnB (protein PII) uridylyltransferase [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 867
2498 225401 COG2845 COG2845 Uncharacterized protein [Function unknown]. 354
2499 225402 COG2846 RIC Iron-sulfur cluster repair protein YtfE, RIC family, contains ScdAN and hemerythrin domains [Posttranslational modification, protein turnover, chaperones]. 221
2500 225403 COG2847 COG2847 Copper(I)-binding protein [Inorganic ion transport and metabolism]. 151
2501 225404 COG2848 COG2848 Uncharacterized conserved protein, UPF0210 family [Cell cycle control, cell division, chromosome partitioning]. 445
2502 225405 COG2849 YwqK Antitoxin component YwqK of the YwqJK toxin-antitoxin module [Defense mechanisms]. 230
2503 225406 COG2850 RoxA Ribosomal protein L16 Arg81 hydroxylase, contains JmjC domain [Translation, ribosomal structure and biogenesis]. 383
2504 225407 COG2851 CitM Mg2+/citrate symporter [Energy production and conversion]. 433
2505 225408 COG2852 YcjD Very-short-patch-repair endonuclease [Replication, recombination and repair]. 129
2506 225409 COG2853 VacJ ABC-type transporter Mla maintaining outer membrane lipid asymmetry, lipoprotein component MlaA [Cell wall/membrane/envelope biogenesis]. 250
2507 225410 COG2854 MlaC ABC-type transporter Mla maintaining outer membrane lipid asymmetry, periplasmic MlaC component [Lipid transport and metabolism]. 202
2508 225411 COG2855 YeiH Uncharacterized membrane protein YadS [Function unknown]. 334
2509 225412 COG2856 ImmA Zn-dependent peptidase ImmA, M78 family [Posttranslational modification, protein turnover, chaperones]. 213
2510 225413 COG2857 CYT1 Cytochrome c1 [Energy production and conversion]. 250
2511 225414 COG2859 COG2859 Uncharacterized protein [Function unknown]. 237
2512 225415 COG2860 YadS Uncharacterized membrane protein YeiH [Function unknown]. 209
2513 225416 COG2861 YibQ Uncharacterized conserved protein YibQ, putative polysaccharide deacetylase 2 family [Carbohydrate transport and metabolism]. 250
2514 225417 COG2862 YqhA Uncharacterized membrane protein YqhA [Function unknown]. 169
2515 225418 COG2863 CytC553 Cytochrome c553 [Energy production and conversion]. 121
2516 225419 COG2864 FdnI Cytochrome b subunit of formate dehydrogenase [Energy production and conversion]. 218
2517 225420 COG2865 COG2865 Predicted transcriptional regulator, contains HTH domain [Transcription]. 467
2518 225421 COG2866 MpaA Murein tripeptide amidase MpaA [Cell wall/membrane/envelope biogenesis]. 374
2519 225422 COG2867 PasT Ribosome association toxin PasT (RatA) of the RatAB toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 146
2520 225423 COG2868 YsxB Uncharacterized conserved protein YsxB, DUF464 family [Function unknown]. 109
2521 225424 COG2869 NqrC Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC [Energy production and conversion]. 264
2522 225425 COG2870 RfaE ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase [Cell wall/membrane/envelope biogenesis]. 467
2523 225426 COG2871 NqrF Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF [Energy production and conversion]. 410
2524 225427 COG2872 AlaX Ser-tRNA(Ala) deacylase AlaX (editing enzyme) [Translation, ribosomal structure and biogenesis]. 241
2525 225428 COG2873 MET17 O-acetylhomoserine/O-acetylserine sulfhydrylase, pyridoxal phosphate-dependent [Amino acid transport and metabolism]. 426
2526 225429 COG2874 FlaH Archaellum biogenesis protein FlaH, an ATPase [Cell motility]. 235
2527 225430 COG2875 CobM Precorrin-4 methylase [Coenzyme transport and metabolism]. 254
2528 225431 COG2876 AroGA 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 286
2529 225432 COG2877 KdsA 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase [Cell wall/membrane/envelope biogenesis]. 279
2530 225433 COG2878 RnfB Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfB subunit [Energy production and conversion]. 198
2531 225434 COG2879 YbdD Uncharacterized short protein YbdD, DUF466 family [Function unknown]. 65
2532 225435 COG2880 COG2880 Predicted DNA-binding protein, potential antitoxin AbrB/MazE fold [General function prediction only]. 67
2533 225436 COG2881 COG2881 Uncharacterized protein [Function unknown]. 181
2534 225437 COG2882 FliJ Flagellar biosynthesis chaperone FliJ [Cell motility]. 148
2535 225438 COG2884 FtsE ABC-type ATPase involved in cell division [Cell cycle control, cell division, chromosome partitioning]. 223
2536 225439 COG2885 OmpA Outer membrane protein OmpA and related peptidoglycan-associated (lipo)proteins [Cell wall/membrane/envelope biogenesis]. 190
2537 225440 COG2886 COG2886 Predicted antitoxin, contains HTH domain [General function prediction only]. 88
2538 225441 COG2887 COG2887 RecB family exonuclease [Replication, recombination and repair]. 269
2539 225442 COG2888 COG2888 Predicted RNA-binding protein involved in translation, contains Zn-ribbon domain, DUF1610 family [General function prediction only]. 61
2540 225443 COG2890 HemK Methylase of polypeptide chain release factors [Translation, ribosomal structure and biogenesis]. 280
2541 225444 COG2891 MreD Cell shape-determining protein MreD [Cell wall/membrane/envelope biogenesis]. 167
2542 225445 COG2892 Pcc1 tRNA threonylcarbamoyladenosine modification (KEOPS) complex, Pcc1 subunit [Translation, ribosomal structure and biogenesis]. 82
2543 225446 COG2893 ManX Phosphotransferase system, mannose/fructose-specific component IIA [Carbohydrate transport and metabolism]. 143
2544 225447 COG2894 MinD Septum formation inhibitor-activating ATPase MinD [Cell cycle control, cell division, chromosome partitioning]. 272
2545 225448 COG2895 CysN Sulfate adenylyltransferase subunit 1, EFTu-like GTPase family [Inorganic ion transport and metabolism]. 431
2546 225449 COG2896 MoaA Molybdenum cofactor biosynthesis enzyme MoaA [Coenzyme transport and metabolism]. 322
2547 225450 COG2897 SseA 3-mercaptopyruvate sulfurtransferase SseA, contains two rhodanese domains [Inorganic ion transport and metabolism]. 285
2548 225451 COG2898 MprF Lysylphosphatidylglycerol synthetase, C-terminal domain, DUF2156 family [Function unknown]. 538
2549 225452 COG2899 COG2899 Uncharacterized protein [Function unknown]. 346
2550 225453 COG2900 SlyX Uncharacterized coiled-coil protein SlyX (sensitive to lysis X) [Function unknown]. 72
2551 225454 COG2901 Fis DNA-binding protein Fis (factor for inversion stimulation) [Transcription]. 98
2552 225455 COG2902 Gdh2 NAD-specific glutamate dehydrogenase [Amino acid transport and metabolism]. 1592
2553 225456 COG2904 QueFN NADPH-dependent 7-cyano-7-deazaguanine reductase QueF, N-terminal domain [Translation, ribosomal structure and biogenesis]. 137
2554 225457 COG2905 COG2905 Signal-transduction protein containing cAMP-binding, CBS, and nucleotidyltransferase domains [Signal transduction mechanisms]. 610
2555 225458 COG2906 Bfd Bacterioferritin-associated ferredoxin [Inorganic ion transport and metabolism]. 63
2556 225459 COG2907 COG2907 Predicted NAD/FAD-binding protein [General function prediction only]. 447
2557 225460 COG2908 LpxH UDP-2,3-diacylglucosamine pyrophosphatase LpxH [Cell wall/membrane/envelope biogenesis]. 237
2558 225461 COG2909 MalT ATP-, maltotriose- and DNA-dependent transcriptional regulator MalT [Transcription]. 894
2559 225462 COG2910 YwnB Putative NADH-flavin reductase [General function prediction only]. 211
2560 225463 COG2911 TamB Autotransporter translocation and assembly factor TamB [Intracellular trafficking, secretion, and vesicular transport]. 1278
2561 225464 COG2912 SirB1 Regulator of sirC expression, contains transglutaminase-like and TPR domains [Signal transduction mechanisms]. 269
2562 225465 COG2913 BamE Outer membrane protein assembly factor BamE, lipoprotein component of the BamABCDE complex [Cell wall/membrane/envelope biogenesis]. 147
2563 225466 COG2914 PasI Putative antitoxin component PasI (RatB) of the RatAB toxin-antitoxin module, ubiquitin-RnfH superfamily [Defense mechanisms]. 99
2564 225467 COG2915 HflD Regulator of phage lambda lysogenization HflD, binds to CII and stimulates its degradation [Mobilome: prophages, transposons, Signal transduction mechanisms]. 207
2565 225468 COG2916 Hns DNA-binding protein H-NS [Transcription]. 128
2566 225469 COG2917 YciB Intracellular septation protein A [Cell cycle control, cell division, chromosome partitioning]. 180
2567 225470 COG2918 GshA Gamma-glutamylcysteine synthetase [Coenzyme transport and metabolism]. 518
2568 225471 COG2919 FtsB Cell division protein FtsB [Cell cycle control, cell division, chromosome partitioning]. 117
2569 225472 COG2920 DsrC Sulfur relay (sulfurtransferase) protein, DsrC/TusE family [Inorganic ion transport and metabolism]. 111
2570 225473 COG2921 YbeD Putative lipoic acid-binding regulatory protein [Signal transduction mechanisms]. 90
2571 225474 COG2922 Smg Uncharacterized conserved protein Smg, DUF494 family [Function unknown]. 157
2572 225475 COG2923 DsrF Sulfur relay (sulfurtransferase) complex TusC component, DsrF/TusC family [Inorganic ion transport and metabolism]. 118
2573 225476 COG2924 YggX Fe-S cluster biosynthesis and repair protein YggX [Inorganic ion transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 90
2574 225477 COG2925 SbcB Exonuclease I [Replication, recombination and repair]. 475
2575 225478 COG2926 YeeX Uncharacterized conserved protein YeeX, DUF496 family [Function unknown]. 109
2576 225479 COG2927 HolC DNA polymerase III, chi subunit [Replication, recombination and repair]. 144
2577 225480 COG2928 COG2928 Uncharacterized membrane protein [Function unknown]. 222
2578 225481 COG2929 COG2929 Uncharacterized conserved protein, DUF497 family [Function unknown]. 93
2579 225482 COG2930 SYLF Lipid-binding SYLF domain [Lipid transport and metabolism]. 227
2580 225483 COG2931 COG2931 Ca2+-binding protein, RTX toxin-related [Secondary metabolites biosynthesis, transport and catabolism]. 510
2581 225484 COG2932 COG2932 Phage repressor protein C, contains Cro/C1-type HTH and peptidase s24 domains [Mobilome: prophages, transposons]. 214
2582 225485 COG2933 RlmM 23S rRNA C2498 (ribose-2'-O)-methylase RlmM [Translation, ribosomal structure and biogenesis]. 358
2583 225486 COG2935 Ate1 Arginyl-tRNA--protein-N-Asp/Glu arginylyltransferase [Posttranslational modification, protein turnover, chaperones]. 253
2584 225487 COG2936 COG2936 Predicted acyl esterase [General function prediction only]. 563
2585 225488 COG2937 PlsB Glycerol-3-phosphate O-acyltransferase [Lipid transport and metabolism]. 810
2586 225489 COG2938 SdhE Succinate dehydrogenase flavin-adding protein, antitoxin component of the CptAB toxin-antitoxin module [Posttranslational modification, protein turnover, chaperones]. 94
2587 225490 COG2939 Kex1 Carboxypeptidase C (cathepsin A) [Amino acid transport and metabolism]. 498
2588 225491 COG2940 SET SET domain-containing protein (function unknown) [General function prediction only]. 480
2589 225492 COG2941 Coq7 Demethoxyubiquinone hydroxylase, CLK1/Coq7/Cat5 family [Coenzyme transport and metabolism]. 204
2590 225493 COG2942 YihS Mannose or cellobiose epimerase, N-acyl-D-glucosamine 2-epimerase family [Carbohydrate transport and metabolism]. 388
2591 225494 COG2943 MdoH Membrane glycosyltransferase [Cell wall/membrane/envelope biogenesis, Carbohydrate transport and metabolism]. 736
2592 225495 COG2944 YiaG DNA-binding transcriptional regulator YiaG, XRE-type HTH domain [Transcription]. 104
2593 225496 COG2945 COG2945 Alpha/beta superfamily hydrolase [General function prediction only]. 210
2594 225497 COG2946 NicK DNA relaxase NicK [Replication, recombination and repair]. 377
2595 225498 COG2947 COG2947 Predicted RNA-binding protein, contains PUA-like domain [General function prediction only]. 156
2596 225499 COG2948 VirB10 Type IV secretory pathway, VirB10 components [Intracellular trafficking, secretion, and vesicular transport]. 360
2597 225500 COG2949 SanA Uncharacterized periplasmic protein SanA, affects membrane permeability for vancomycin [Cell wall/membrane/envelope biogenesis]. 235
2598 225501 COG2951 MltB Membrane-bound lytic murein transglycosylase B [Cell wall/membrane/envelope biogenesis]. 343
2599 225502 COG2952 COG2952 Uncharacterized protein [Function unknown]. 183
2600 225503 COG2954 CYTH CYTH domain, found in class IV adenylate cyclase and various triphosphatases [General function prediction only]. 156
2601 225504 COG2956 YciM Lipopolysaccharide biosynthesis regulator YciM, contains six TPR domains and a predicted metal-binding C-terminal domain [Cell wall/membrane/envelope biogenesis]. 389
2602 225505 COG2957 AguA Agmatine/peptidylarginine deiminase [Amino acid transport and metabolism]. 346
2603 225506 COG2958 COG2958 Uncharacterized protein [Function unknown]. 307
2604 225507 COG2959 HemX Uncharacterized conserved protein HemX (no evidence of involvement in heme biosynthesis) [Function unknown]. 391
2605 225508 COG2960 YqiC Uncharacterized conserved protein YqiC, BMFP domain [Function unknown]. 103
2606 225509 COG2961 RlmJ 23S rRNA A2030 N6-methylase RlmJ [Translation, ribosomal structure and biogenesis]. 279
2607 225510 COG2962 RarD Uncharacterized membrane protein RarD, contains two EamA domains [Function unknown]. 293
2608 225511 COG2963 InsE Transposase and inactivated derivatives [Mobilome: prophages, transposons]. 116
2609 225512 COG2964 YheO Predicted transcriptional regulator YheO, contains PAS and DNA-binding HTH domains [Transcription]. 220
2610 225513 COG2965 PriB Primosomal replication protein N [Replication, recombination and repair]. 103
2611 225514 COG2966 YjjP Uncharacterized membrane protein YjjP, DUF1212 family [Function unknown]. 250
2612 225515 COG2967 ApaG Uncharacterized protein affecting Mg2+/Co2+ transport [Inorganic ion transport and metabolism]. 126
2613 225516 COG2968 YggE Uncharacterized conserved protein YggE, contains kinase-interacting SIMPL domain [Function unknown]. 243
2614 225517 COG2969 SspB Stringent starvation protein B, binds SsrA peptide [Posttranslational modification, protein turnover, chaperones]. 155
2615 225518 COG2971 BadF BadF-type ATPase, related to human N-acetylglucosamine kinase [Carbohydrate transport and metabolism]. 301
2616 225519 COG2972 YesM Sensor histidine kinase YesM [Signal transduction mechanisms]. 456
2617 225520 COG2973 TrpR Trp operon repressor [Transcription]. 103
2618 225521 COG2974 RdgC DNA recombination-dependent growth factor C [Replication, recombination and repair]. 303
2619 225522 COG2975 IscX Fe-S-cluster formation regulator IscX/YfhJ [Posttranslational modification, protein turnover, chaperones]. 64
2620 225523 COG2976 YfgM Putative negative regulator of RcsB-dependent stress response [Signal transduction mechanisms]. 207
2621 225524 COG2977 EntD 4'-phosphopantetheinyl transferase EntD (siderophore biosynthesis) [Secondary metabolites biosynthesis, transport and catabolism]. 228
2622 225525 COG2978 AbgT p-Aminobenzoyl-glutamate transporter AbgT [Coenzyme transport and metabolism]. 516
2623 225526 COG2979 YebE Uncharacterized membrane protein YebE, DUF533 family [Function unknown]. 225
2624 225527 COG2980 LptE Outer membrane lipoprotein LptE/RlpB (LPS assembly) [Cell wall/membrane/envelope biogenesis]. 178
2625 225528 COG2981 CysZ Uncharacterized protein involved in cysteine biosynthesis [Amino acid transport and metabolism]. 250
2626 225529 COG2982 AsmA Uncharacterized protein involved in outer membrane biogenesis [Cell wall/membrane/envelope biogenesis]. 648
2627 225530 COG2983 YcgN Uncharacterized cysteine cluster protein YcgN, CxxCxxCC family [Function unknown]. 153
2628 225531 COG2984 COG2984 ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 322
2629 225532 COG2985 YbjL Uncharacterized membrane protein YbjL, putative transporter [General function prediction only]. 544
2630 225533 COG2986 HutH Histidine ammonia-lyase [Amino acid transport and metabolism]. 498
2631 225534 COG2987 HutU Urocanate hydratase [Amino acid transport and metabolism]. 561
2632 225535 COG2988 AstE Succinylglutamate desuccinylase [Amino acid transport and metabolism]. 324
2633 225536 COG2989 YcbB Murein L,D-transpeptidase YcbB/YkuD [Cell wall/membrane/envelope biogenesis]. 561
2634 225537 COG2990 VirK Uncharacterized protein VirK/YbjX, DUF535 family [Function unknown]. 300
2635 225538 COG2991 COG2991 Uncharacterized protein [Function unknown]. 77
2636 225539 COG2992 Bax Uncharacterized FlgJ-related protein [General function prediction only]. 262
2637 225540 COG2993 CcoO Cbb3-type cytochrome oxidase, cytochrome c subunit [Energy production and conversion]. 227
2638 225541 COG2994 HlyC ACP:hemolysin acyltransferase (hemolysin-activating protein) [Posttranslational modification, protein turnover, chaperones]. 148
2639 225542 COG2995 PqiA Uncharacterized paraquat-inducible protein A [Function unknown]. 418
2640 225543 COG2996 CvfB Predicted RNA-binding protein, contains S1 domains, virulence factor B family [General function prediction only]. 287
2641 225544 COG2998 TupA Tungsten ABC transporter substrate-binding protein [Inorganic ion transport and metabolism]. 280
2642 225545 COG2999 GrxB Glutaredoxin 2 [Posttranslational modification, protein turnover, chaperones]. 215
2643 225546 COG3000 ERG3 Sterol desaturase/sphingolipid hydroxylase, fatty acid hydroxylase superfamily [Lipid transport and metabolism]. 271
2644 225547 COG3001 FN3K Fructosamine-3-kinase [Carbohydrate transport and metabolism]. 286
2645 225548 COG3002 YbcC Uncharacterized conserved protein YbcC, UPF0753/DUF2309 family [Function unknown]. 880
2646 225549 COG3004 NhaA Na+/H+ antiporter NhaA [Energy production and conversion, Inorganic ion transport and metabolism]. 390
2647 225550 COG3005 NapC Tetraheme cytochrome c subunit of nitrate or TMAO reductase [Energy production and conversion]. 190
2648 225551 COG3006 MukF Chromosome condensin MukBEF complex, kleisin-like MukF subunit [Cell cycle control, cell division, chromosome partitioning]. 440
2649 225552 COG3007 COG3007 Trans-2-enoyl-CoA reductase [Lipid transport and metabolism]. 398
2650 225553 COG3008 PqiB Paraquat-inducible protein B (function unknown) [Function unknown]. 553
2651 225554 COG3009 YmbA Uncharacterized lipoprotein YmbA [Function unknown]. 190
2652 225555 COG3010 NanE Putative N-acetylmannosamine-6-phosphate epimerase [Carbohydrate transport and metabolism]. 229
2653 225556 COG3011 YuxK Predicted thiol-disulfide oxidoreductase YuxK, DCC family [General function prediction only]. 137
2654 225557 COG3012 YchJ Uncharacterized conserved protein YchJ, contains N- and C-terminal SEC-C domains [Function unknown]. 151
2655 225558 COG3013 YfbU Uncharacterized protein YfbU, UPF0304 family [Function unknown]. 168
2656 225559 COG3014 COG3014 Uncharacterized protein [Function unknown]. 449
2657 225560 COG3015 CutF Uncharacterized lipoprotein NlpE involved in copper resistance [Cell wall/membrane/envelope biogenesis, Defense mechanisms]. 178
2658 225561 COG3016 PhuW Uncharacterized iron-regulated protein [Function unknown]. 295
2659 225562 COG3017 LolB Outer membrane lipoprotein LolB, involved in outer membrane biogenesis [Cell wall/membrane/envelope biogenesis]. 206
2660 225563 COG3018 COG3018 Uncharacterized protein [Function unknown]. 115
2661 225564 COG3019 COG3019 Uncharacterized conserved protein [Function unknown]. 149
2662 225565 COG3021 YafD Uncharacterized conserved protein YafD, endonuclease/exonuclease/phosphatase (EEP) superfamily [General function prediction only]. 309
2663 225566 COG3022 YaaA Cytoplasmic iron level regulating protein YaaA, DUF328/UPF0246 family [Inorganic ion transport and metabolism]. 253
2664 225567 COG3023 AmpD N-acetyl-anhydromuramyl-L-alanine amidase AmpD [Cell wall/membrane/envelope biogenesis]. 257
2665 225568 COG3024 YacG Endogenous inhibitor of DNA gyrase, YacG/DUF329 family [Replication, recombination and repair]. 65
2666 225569 COG3025 PPPi Inorganic triphosphatase YgiF, contains CYTH and CHAD domains [Inorganic ion transport and metabolism]. 432
2667 225570 COG3026 RseB Negative regulator of sigma E activity [Signal transduction mechanisms]. 320
2668 225571 COG3027 ZapA Cell division protein ZapA, inhibits GTPase activity of FtsZ [Cell cycle control, cell division, chromosome partitioning]. 105
2669 225572 COG3028 YjgA Ribosomal 50S subunit-associated protein YjgA (function unknown), DUF615 family [Translation, ribosomal structure and biogenesis]. 187
2670 225573 COG3029 FrdC Fumarate reductase subunit C [Energy production and conversion]. 129
2671 225574 COG3030 FxsA Protein affecting phage T7 exclusion by the F plasmid, UPF0716 family [General function prediction only]. 158
2672 225575 COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking, secretion, and vesicular transport]. 275
2673 225576 COG3033 TnaA Tryptophanase [Amino acid transport and metabolism]. 471
2674 225577 COG3034 YafK Murein L,D-transpeptidase YafK [Cell wall/membrane/envelope biogenesis]. 298
2675 225578 COG3036 ArfA Stalled ribosome alternative rescue factor ArfA [Translation, ribosomal structure and biogenesis]. 66
2676 225579 COG3037 UlaA Ascorbate-specific PTS system EIIC-type component UlaA [Carbohydrate transport and metabolism]. 481
2677 225580 COG3038 CybB Cytochrome b561 [Energy production and conversion]. 181
2678 225581 COG3039 IS5 Transposase and inactivated derivatives, IS5 family [Mobilome: prophages, transposons]. 230
2679 225582 COG3040 Blc Bacterial lipocalin [Cell wall/membrane/envelope biogenesis]. 174
2680 225583 COG3041 YafQ mRNA-degrading endonuclease (mRNA interferase) YafQ, toxin component of the YafQ-DinJ toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 91
2681 225584 COG3042 Hlx Putative hemolysin [General function prediction only]. 85
2682 225585 COG3043 NapB Nitrate reductase cytochrome c-type subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 155
2683 225586 COG3044 COG3044 Predicted ATPase of the ABC class [General function prediction only]. 554
2684 225587 COG3045 CreA Periplasmic catabolite regulation protein CreA (function unknown) [Signal transduction mechanisms]. 165
2685 225588 COG3046 COG3046 Uncharacterized protein related to deoxyribodipyrimidine photolyase [General function prediction only]. 505
2686 225589 COG3047 OmpW Outer membrane protein W [Cell wall/membrane/envelope biogenesis]. 213
2687 225590 COG3048 DsdA D-serine dehydratase [Amino acid transport and metabolism]. 443
2688 225591 COG3049 YxeI Penicillin V acylase or related amidase, Ntn superfamily [Cell wall/membrane/envelope biogenesis, General function prediction only]. 353
2689 225592 COG3050 HolD DNA polymerase III, psi subunit [Replication, recombination and repair]. 133
2690 225593 COG3051 CitF Citrate lyase, alpha subunit [Energy production and conversion]. 513
2691 225594 COG3052 CitD Citrate lyase, gamma subunit [Energy production and conversion]. 98
2692 225595 COG3053 CitC Citrate lyase synthetase [Energy production and conversion]. 352
2693 225596 COG3054 YtfJ Predicted transcriptional regulator [General function prediction only]. 184
2694 225597 COG3055 NanM N-acetylneuraminic acid mutarotase [Cell wall/membrane/envelope biogenesis]. 381
2695 225598 COG3056 YajG Uncharacterized lipoprotein YajG [Function unknown]. 204
2696 225599 COG3057 SeqA Negative regulator of replication initiation [Replication, recombination and repair]. 181
2697 225600 COG3058 FdhE Formate dehydrogenase maturation protein FdhE [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 308
2698 225601 COG3059 YkgB Uncharacterized membrane protein YkgB [Function unknown]. 182
2699 225602 COG3060 MetJ Transcriptional regulator of met regulon [Transcription, Amino acid transport and metabolism]. 105
2700 225603 COG3061 OapA Cell envelope opacity-associated protein A (function unknown) [Function unknown]. 242
2701 225604 COG3062 NapD Cytoplasmic chaperone NapD for the signal peptide of periplasmic nitrate reductase NapAB [Posttranslational modification, protein turnover, chaperones]. 94
2702 225605 COG3063 PilF Tfp pilus assembly protein PilF [Cell motility, Extracellular structures]. 250
2703 225606 COG3064 TolA Membrane protein involved in colicin uptake [Cell wall/membrane/envelope biogenesis]. 387
2704 225607 COG3065 Slp Starvation-inducible outer membrane lipoprotein [Cell wall/membrane/envelope biogenesis]. 191
2705 225608 COG3066 MutH DNA mismatch repair protein MutH [Replication, recombination and repair]. 229
2706 225609 COG3067 NhaB Na+/H+ antiporter NhaB [Energy production and conversion, Inorganic ion transport and metabolism]. 516
2707 225610 COG3068 YjaG Uncharacterized protein YjaG, DUF416 family [Function unknown]. 194
2708 225611 COG3069 DcuC C4-dicarboxylate transporter [Energy production and conversion]. 451
2709 225612 COG3070 TfoX Transcriptional regulator of competence genes, TfoX/Sxy family [Transcription]. 121
2710 225613 COG3071 HemY Uncharacterized conserved protein HemY, contains two TPR repeats [Function unknown]. 400
2711 225614 COG3072 CyaA Adenylate cyclase [Nucleotide transport and metabolism]. 853
2712 225615 COG3073 RseA Negative regulator of sigma E activity [Signal transduction mechanisms]. 213
2713 225616 COG3074 ZapB Cell division protein ZapB, interacts with FtsZ [Cell cycle control, cell division, chromosome partitioning]. 79
2714 225617 COG3075 GlpB Anaerobic glycerol-3-phosphate dehydrogenase [Amino acid transport and metabolism]. 421
2715 225618 COG3076 RraB Regulator of RNase E activity RraB [Translation, ribosomal structure and biogenesis]. 135
2716 225619 COG3077 RelB Antitoxin component of the RelBE or YafQ-DinJ toxin-antitoxin module [Defense mechanisms]. 88
2717 225620 COG3078 YihI Ribosome assembly protein YihI, activator of Der GTPase [Translation, ribosomal structure and biogenesis]. 169
2718 225621 COG3079 YgfB Uncharacterized conserved protein YgfB, UPF0149 family [Function unknown]. 186
2719 225622 COG3080 FrdD Fumarate reductase subunit D [Energy production and conversion]. 118
2720 225623 COG3081 NdpA Nucleoid-associated protein YejK (function unknown) [Function unknown]. 335
2721 225624 COG3082 YejL Uncharacterized conserved protein YejL, UPF0352 family [Function unknown]. 74
2722 225625 COG3083 YejM Membrane-anchored periplasmic protein YejM, alkaline phosphatase superfamily [Cell wall/membrane/envelope biogenesis]. 600
2723 225626 COG3084 YihD Uncharacterized protein YihD, DUF1040 family [Function unknown]. 88
2724 225627 COG3085 YifE Uncharacterized conserved protein YifE, UPF0438 family [Function unknown]. 112
2725 225628 COG3086 RseC Positive regulator of sigma E activity [Signal transduction mechanisms]. 150
2726 225629 COG3087 FtsN Cell division protein FtsN [Cell cycle control, cell division, chromosome partitioning]. 264
2727 225630 COG3088 NrfF Cytochrome c-type biogenesis protein CcmH/NrfF [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 153
2728 225631 COG3089 YheU Uncharacterized conserved protein YheU, UPF0270 family [Function unknown]. 72
2729 225632 COG3090 DctM TRAP-type C4-dicarboxylate transport system, small permease component [Carbohydrate transport and metabolism]. 177
2730 225633 COG3091 SprT Predicted Zn-dependent metalloprotease, SprT family [General function prediction only]. 156
2731 225634 COG3092 YfbV Uncharacterized membrane protein YfbV, UPF0208 family [Function unknown]. 149
2732 225635 COG3093 VapI Plasmid maintenance system antidote protein VapI, contains XRE-type HTH domain [Defense mechanisms]. 104
2733 225636 COG3094 SirB2 Uncharacterized membrane protein SirB2 [Function unknown]. 129
2734 225637 COG3095 MukE Chromosome condensin MukBEF, MukE localization factor [Cell cycle control, cell division, chromosome partitioning]. 238
2735 225638 COG3096 MukB Chromosome condensin MukBEF, ATPase and DNA-binding subunit MukB [Cell cycle control, cell division, chromosome partitioning]. 1480
2736 225639 COG3097 YqfB Uncharacterized protein YqfB, UPF0267 family [Function unknown]. 106
2737 225640 COG3098 YqcC Uncharacterized conserved protein YqcC, DUF446 family [Function unknown]. 109
2738 225641 COG3099 YciU Uncharacterized conserved protein YciU, UPF0263 family [Function unknown]. 108
2739 225642 COG3100 YcgL Uncharacterized conserved protein YcgL, UPF0745 family [Function unknown]. 103
2740 225643 COG3101 EpmC Elongation factor P hydroxylase (EF-P beta-lysylation pathway) [Translation, ribosomal structure and biogenesis]. 180
2741 225644 COG3102 YecM Uncharacterized conserved protein YecM, predicted metalloenzyme [General function prediction only]. 185
2742 225645 COG3103 YgiM Uncharacterized conserved protein YgiM, contains N-terminal SH3 domain, DUF1202 family [General function prediction only]. 205
2743 225646 COG3104 PTR2 Dipeptide/tripeptide permease [Amino acid transport and metabolism]. 498
2744 225647 COG3105 YhcB Uncharacterized membrane-anchored protein YhcB, DUF1043 family [Function unknown]. 138
2745 225648 COG3106 YcjX Predicted ATPase, YcjX-like family [General function prediction only]. 467
2746 225649 COG3107 LpoA Outer membrane lipoprotein LpoA, binds and activates PBP1a [Cell wall/membrane/envelope biogenesis]. 604
2747 225650 COG3108 YcbK Uncharacterized conserved protein YcbK, DUF882 family [Function unknown]. 185
2748 225651 COG3109 ProQ sRNA-binding protein [Signal transduction mechanisms]. 208
2749 225652 COG3110 YccT Uncharacterized conserved protein YccT, UPF0319 family [Function unknown]. 216
2750 225653 COG3111 YdeI Predicted periplasmic protein YdeI with OB-fold, BOF family [Function unknown]. 128
2751 225654 COG3112 YacL Uncharacterized protein YacL, UPF0231 family [Function unknown]. 121
2752 225655 COG3113 MlaB ABC-type transporter Mla maintaining outer membrane lipid asymmetry, MlaB component, contains STAS domain [Cell wall/membrane/envelope biogenesis]. 99
2753 225656 COG3114 CcmD Heme exporter protein D [Intracellular trafficking, secretion, and vesicular transport]. 67
2754 225657 COG3115 ZipA Cell division protein ZipA, interacts with FtsZ [Cell cycle control, cell division, chromosome partitioning]. 324
2755 225658 COG3116 FtsL Cell division protein FtsL, interacts with FtsB, FtsL and FtsQ [Cell cycle control, cell division, chromosome partitioning]. 105
2756 225659 COG3117 YrbK Lipopolysaccharide export system protein LptC [Cell wall/membrane/envelope biogenesis]. 188
2757 225660 COG3118 YbbN Negative regulator of GroEL, contains thioredoxin-like and TPR-like domains [Posttranslational modification, protein turnover, chaperones]. 304
2758 225661 COG3119 AslA Arylsulfatase A or related enzyme [Inorganic ion transport and metabolism]. 475
2759 225662 COG3120 MatP Macrodomain Ter protein organizer, MatP/YcbG family [Replication, recombination and repair]. 149
2760 225663 COG3121 FimC P pilus assembly protein, chaperone PapD [Extracellular structures]. 235
2761 225664 COG3122 YaiL Uncharacterized conserved protein YaiL, DUF2058 family [Function unknown]. 215
2762 225665 COG3123 YaiE Uncharacterized conserved protein YaiE, UPF0345 family [Function unknown]. 94
2763 225666 COG3124 AcpH Acyl carrier protein phosphodiesterase [Lipid transport and metabolism]. 193
2764 225667 COG3125 CyoD Heme/copper-type cytochrome/quinol oxidase, subunit 4 [Energy production and conversion]. 111
2765 225668 COG3126 YbaY Uncharacterized lipoprotein YbaY [Function unknown]. 158
2766 225669 COG3127 YbbP Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component [Secondary metabolites biosynthesis, transport and catabolism]. 829
2767 225670 COG3128 PiuC Predicted 2-oxoglutarate- and Fe(II)-dependent dioxygenase YbiX [General function prediction only]. 229
2768 225671 COG3129 RlmF 23S rRNA A1618 N6-methylase RlmF [Translation, ribosomal structure and biogenesis]. 292
2769 225672 COG3130 Rmf Ribosome modulation factor [Translation, ribosomal structure and biogenesis]. 55
2770 225673 COG3131 MdoG Periplasmic glucans biosynthesis protein [Cell wall/membrane/envelope biogenesis]. 534
2771 225674 COG3132 YceH Uncharacterized conserved protein YceH, UPF0502 family [Function unknown]. 215
2772 225675 COG3133 SlyB Outer membrane lipoprotein SlyB [Cell wall/membrane/envelope biogenesis]. 154
2773 225676 COG3134 YcfJ Uncharacterized conserved protein YcfJ, contains glycine zipper 2TM domain [Function unknown]. 179
2774 225677 COG3135 BenE Predicted benzoate:H+ symporter BenE [Secondary metabolites biosynthesis, transport and catabolism]. 402
2775 225678 COG3136 GlpM Uncharacterized membrane protein, GlpM family [Function unknown]. 111
2776 225679 COG3137 YdiY Putative salt-induced outer membrane protein YdiY [Cell wall/membrane/envelope biogenesis]. 262
2777 225680 COG3138 AstA Arginine/ornithine N-succinyltransferase beta subunit [Amino acid transport and metabolism]. 336
2778 225681 COG3139 yeaC Uncharacterized conserved protein YeaC, DUF1315 family [Function unknown]. 90
2779 225682 COG3140 yoaH Uncharacterized conserved protein YoaH, UPF0181 family [Function unknown]. 60
2780 225683 COG3141 YebG dsDNA-binding SOS-regulon protein, induction by DNA damage requires cAMP [Replication, recombination and repair]. 97
2781 225684 COG3142 CutC Copper homeostasis protein CutC [Inorganic ion transport and metabolism]. 241
2782 225685 COG3143 CheZ Chemotaxis regulator CheZ, phosphatase of CheY~P [Cell motility, Signal transduction mechanisms]. 217
2783 225686 COG3144 FliK Flagellar hook-length control protein FliK [Cell motility]. 417
2784 225687 COG3145 AlkB Alkylated DNA repair dioxygenase AlkB [Replication, recombination and repair]. 194
2785 225688 COG3146 COG3146 Predicted N-acyltransferase [General function prediction only]. 387
2786 225689 COG3147 DedD Cell division protein DedD (periplasmic protein involved in septation) [Cell cycle control, cell division, chromosome partitioning]. 226
2787 225690 COG3148 YfiP Uncharacterized conserved protein YfiP, DTW domain [Function unknown]. 231
2788 225691 COG3149 PulM Type II secretory pathway, component PulM [Intracellular trafficking, secretion, and vesicular transport]. 181
2789 225692 COG3150 ycfP Predicted esterase YcfP, UPF0227 family [General function prediction only]. 191
2790 225693 COG3151 yqiB Uncharacterized protein YqiB, DUF1249 family [Function unknown]. 147
2791 225694 COG3152 yhaH Uncharacterized membrane protein YhaH, DUF805 family [Function unknown]. 125
2792 225695 COG3153 yhbS Predicted N-acetyltransferase YhbS [General function prediction only]. 171
2793 225696 COG3154 SCP2 Predicted lipid carrier protein YhbT, SCP2 domain [Lipid transport and metabolism]. 168
2794 225697 COG3155 ElbB Enhancing lycopene biosynthesis protein 2 [Secondary metabolites biosynthesis, transport and catabolism]. 217
2795 225698 COG3156 PulK Type II secretory pathway, component PulK [Intracellular trafficking, secretion, and vesicular transport]. 323
2796 225699 COG3157 Hcp Type VI protein secretion system component Hcp (secreted cytotoxin) [Intracellular trafficking, secretion, and vesicular transport]. 162
2797 225700 COG3158 Kup K+ transporter [Inorganic ion transport and metabolism]. 627
2798 225701 COG3159 YigA Uncharacterized conserved protein YigA, DUF484 family [Function unknown]. 218
2799 225702 COG3160 Rsd Regulator of sigma D [Transcription]. 162
2800 225703 COG3161 UbiC 4-hydroxybenzoate synthetase (chorismate-pyruvate lyase) [Coenzyme transport and metabolism]. 174
2801 225704 COG3162 YjcH Uncharacterized membrane protein, DUF485 family [Function unknown]. 102
2802 225705 COG3164 YhdR Uncharacterized conserved protein YhdP, contains DUF3971 and AsmA2 domains [Function unknown]. 1271
2803 225706 COG3165 UbiJ Ubiquinone biosynthesis protein UbiJ, contains SCP2 domain [Coenzyme transport and metabolism]. 204
2804 225707 COG3166 PilN Tfp pilus assembly protein PilN [Cell motility, Extracellular structures]. 206
2805 225708 COG3167 PilO Tfp pilus assembly protein PilO [Cell motility, Extracellular structures]. 211
2806 225709 COG3168 PilP Tfp pilus assembly protein PilP [Cell motility, Extracellular structures]. 170
2807 225710 COG3169 COG3169 Uncharacterized conserved protein, DUF486 family [Function unknown]. 116
2808 225711 COG3170 FimV Tfp pilus assembly protein FimV [Cell motility, Extracellular structures]. 755
2809 225712 COG3171 YggL Uncharacterized conserved protein YggL, DUF469 family [Function unknown]. 119
2810 225713 COG3172 NadR3 Nicotinamide riboside kinase [Coenzyme transport and metabolism]. 187
2811 225714 COG3173 YcbJ Predicted kinase, aminoglycoside phosphotransferase (APT) family [General function prediction only]. 321
2812 225715 COG3174 COG3174 Uncharacterized membrane protein, DUF4010 family [Function unknown]. 371
2813 225716 COG3175 COX11 Cytochrome c oxidase assembly protein Cox11 [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 195
2814 225717 COG3176 COG3176 Putative hemolysin [General function prediction only]. 292
2815 225718 COG3177 COG3177 Fic family protein [Transcription]. 348
2816 225719 COG3178 COG3178 Predicted phosphotransferase, aminoglycoside/choline kinase (APH/ChoK) family [General function prediction only]. 351
2817 225720 COG3179 COG3179 Predicted chitinase [General function prediction only]. 206
2818 225721 COG3180 AbrB Uncharacterized membrane protein AbrB, regulator of aidB expression [General function prediction only]. 352
2819 225722 COG3181 TctC Tripartite-type tricarboxylate transporter, receptor component TctC [Energy production and conversion]. 319
2820 225723 COG3182 PiuB Uncharacterized iron-regulated membrane protein [Function unknown]. 442
2821 225724 COG3183 COG3183 Predicted restriction endonuclease, HNH family [Defense mechanisms]. 272
2822 225725 COG3184 COG3184 Uncharacterized protein, contains DUF2059 domain [Function unknown]. 183
2823 225726 COG3185 HppD 4-hydroxyphenylpyruvate dioxygenase and related hemolysins [Amino acid transport and metabolism, General function prediction only]. 363
2824 225727 COG3186 PhhA Phenylalanine-4-hydroxylase [Amino acid transport and metabolism]. 291
2825 225728 COG3187 HslJ Heat shock protein HslJ [Posttranslational modification, protein turnover, chaperones]. 142
2826 225729 COG3188 FimD Outer membrane usher protein FimD/PapC [Cell motility, Extracellular structures]. 835
2827 225730 COG3189 YeaO Uncharacterized conserved protein YeaO, DUF488 family [Function unknown]. 117
2828 225731 COG3190 FliO Flagellar biogenesis protein FliO [Cell motility]. 137
2829 225732 COG3191 DmpA L-aminopeptidase/D-esterase [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 348
2830 225733 COG3192 EutH Ethanolamine transporter EutH, required for ethanolamine utilization at low pH [Amino acid transport and metabolism]. 389
2831 225734 COG3193 GlcG Uncharacterized conserved protein GlcG, DUF336 family [Function unknown]. 141
2832 225735 COG3194 AllA Ureidoglycolate hydrolase (allantoin degradation) [Nucleotide transport and metabolism]. 168
2833 225736 COG3195 PucL 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) decarboxylase [Nucleotide transport and metabolism]. 176
2834 225737 COG3196 CbrC Uncharacterized protein CbrC, UPF0167 family [Function unknown]. 183
2835 225738 COG3197 FixS Cytochrome oxidase maturation protein, CcoS/FixS family [Posttranslational modification, protein turnover, chaperones]. 58
2836 225739 COG3198 COG3198 Uncharacterized protein [Function unknown]. 172
2837 225740 COG3199 COG3199 Predicted polyphosphate- or ATP-dependent NAD kinase [Nucleotide transport and metabolism]. 355
2838 225741 COG3200 AroG2 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase, class II [Amino acid transport and metabolism]. 445
2839 225742 COG3201 PnuC Nicotinamide riboside transporter PnuC [Coenzyme transport and metabolism]. 222
2840 225743 COG3202 TlcC ATP/ADP translocase [Energy production and conversion]. 509
2841 225744 COG3203 OmpC Outer membrane protein (porin) [Cell wall/membrane/envelope biogenesis]. 354
2842 225745 COG3204 YjiK Uncharacterized protein YjiK [Function unknown]. 316
2843 225746 COG3205 COG3205 Uncharacterized membrane protein [Function unknown]. 72
2844 225747 COG3206 GumC Uncharacterized protein involved in exopolysaccharide biosynthesis [Cell wall/membrane/envelope biogenesis]. 458
2845 225748 COG3207 Dit1 Pyoverdine/dityrosine biosynthesis protein Dit1 [Secondary metabolites biosynthesis, transport and catabolism]. 330
2846 225749 COG3208 GrsT Surfactin synthase thioesterase subunit [Secondary metabolites biosynthesis, transport and catabolism]. 244
2847 225750 COG3209 RhsA Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only]. 796
2848 225751 COG3210 FhaB Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport]. 1013
2849 225752 COG3211 PhoX Secreted phosphatase, PhoX family [General function prediction only]. 616
2850 225753 COG3212 YkoI Uncharacterized membrane protein YkoI [Function unknown]. 144
2851 225754 COG3213 NnrS Uncharacterized protein involved in response to NO [Defense mechanisms]. 396
2852 225755 COG3214 YcaQ Uncharacterized conserved protein YcaQ, contains winged helix DNA-binding domain [General function prediction only]. 400
2853 225756 COG3215 PilZ Tfp pilus assembly protein PilZ [Cell motility, Extracellular structures]. 117
2854 225757 COG3216 COG3216 Uncharacterized conserved protein, DUF2062 family [Function unknown]. 184
2855 225758 COG3217 YcbX Uncharacterized conserved protein YcbX, contains MOSC and Fe-S domains [General function prediction only]. 270
2856 225759 COG3218 COG3218 ABC-type uncharacterized transport system, auxiliary component [General function prediction only]. 205
2857 225760 COG3219 COG3219 Uncharacterized protein, DUF2063 family [Function unknown]. 237
2858 225761 COG3220 COG3220 Uncharacterized conserved protein, UPF0276 family [Function unknown]. 282
2859 225762 COG3221 PhnD ABC-type phosphate/phosphonate transport system, periplasmic component [Inorganic ion transport and metabolism]. 299
2860 225763 COG3222 COG3222 Uncharacterized conserved protein, glycosyltransferase A (GT-A) superfamily, DUF2064 family [Function unknown]. 211
2861 225764 COG3223 PsiE Phosphate starvation-inducible membrane PsiE (function unknown) [General function prediction only]. 138
2862 225765 COG3224 COG3224 Antibiotic biosynthesis monooxygenase (ABM) superfamily enzyme [General function prediction only]. 195
2863 225766 COG3225 GldG ABC-type uncharacterized transport system involved in gliding motility, auxiliary component [Cell motility]. 538
2864 225767 COG3226 YbjK DNA-binding transcriptional regulator YbjK [Transcription]. 204
2865 225768 COG3227 LasB Zn-dependent metalloprotease [Posttranslational modification, protein turnover, chaperones]. 507
2866 225769 COG3228 MtfA Mlc titration factor MtfA, regulates ptsG expression [Signal transduction mechanisms]. 266
2867 225770 COG3230 HemO Heme oxygenase [Inorganic ion transport and metabolism]. 196
2868 225771 COG3231 Aph Aminoglycoside phosphotransferase [Translation, ribosomal structure and biogenesis]. 266
2869 225772 COG3232 HpaF 5-carboxymethyl-2-hydroxymuconate isomerase [Amino acid transport and metabolism]. 127
2870 225773 COG3233 COG3233 Predicted deacetylase [General function prediction only]. 233
2871 225774 COG3234 yfaT Uncharacterized conserved protein YfaT, DUF1175 family [Function unknown]. 215
2872 225775 COG3235 COG3235 Uncharacterized membrane protein [Function unknown]. 223
2873 225776 COG3236 ybiA N-glycosylase of 5-amino-6-ribosylamino-2,4-pyrimidinedione 5'-phosphate (riboflavin biosynthesis damage control) [Coenzyme transport and metabolism]. 162
2874 225777 COG3237 yjbJ Uncharacterized conserved protein YjbJ, UPF0337 family [Function unknown]. 67
2875 225778 COG3238 ydcZ Uncharacterized membrane protein YdcZ, DUF606 family [Function unknown]. 150
2876 225779 COG3239 DesA Fatty acid desaturase [Lipid transport and metabolism]. 343
2877 225780 COG3240 COG3240 Phospholipase/lecithinase/hemolysin [Lipid transport and metabolism, General function prediction only]. 370
2878 225781 COG3241 COG3241 Azurin [Energy production and conversion]. 151
2879 225782 COG3242 yjeT Uncharacterized conserved protein YjeT, DUF2065 family [Function unknown]. 62
2880 225783 COG3243 PhaC Poly(3-hydroxyalkanoate) synthetase [Lipid transport and metabolism]. 445
2881 225784 COG3245 CytC5 Cytochrome c5 [Energy production and conversion]. 126
2882 225785 COG3246 COG3246 Uncharacterized conserved protein, DUF849 family [Function unknown]. 298
2883 225786 COG3247 HdeD Uncharacterized membrane protein HdeD, DUF308 family [Function unknown]. 185
2884 225787 COG3248 Tsx Nucleoside-specific outer membrane channel protein Tsx [Cell wall/membrane/envelope biogenesis]. 284
2885 225788 COG3249 COG3249 Uncharacterized protein [Function unknown]. 343
2886 225789 COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]. 808
2887 225790 COG3251 MbtH Uncharacterized conserved protein YbdZ, MbtH family [Function unknown]. 71
2888 225791 COG3252 Mch Methenyltetrahydromethanopterin cyclohydrolase [Coenzyme transport and metabolism]. 314
2889 225792 COG3253 YwfI Chlorite dismutase [Inorganic ion transport and metabolism]. 230
2890 225793 COG3254 RhaM L-rhamnose mutarotase [Cell wall/membrane/envelope biogenesis]. 105
2891 225794 COG3255 SCP2 Putative sterol carrier protein [Lipid transport and metabolism]. 134
2892 225795 COG3256 NorB Nitric oxide reductase large subunit [Inorganic ion transport and metabolism]. 717
2893 225796 COG3257 AllE Ureidoglycine aminohydrolase [Nucleotide transport and metabolism]. 264
2894 225797 COG3258 CytC Cytochrome c [Energy production and conversion]. 293
2895 225798 COG3259 FrhA Coenzyme F420-reducing hydrogenase, alpha subunit [Energy production and conversion]. 441
2896 225799 COG3260 HycG Ni,Fe-hydrogenase III small subunit [Energy production and conversion]. 148
2897 225800 COG3261 HycE2 Ni,Fe-hydrogenase III large subunit [Energy production and conversion]. 382
2898 225801 COG3262 HycE1 Ni,Fe-hydrogenase III component G [Energy production and conversion]. 165
2899 225802 COG3263 NhaP2 NhaP-type Na+/H+ and K+/H+ antiporter with C-terminal TrkAC and CorC domains [Energy production and conversion, Inorganic ion transport and metabolism]. 574
2900 225803 COG3264 MscK Small-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 835
2901 225804 COG3265 GntK Gluconate kinase [Carbohydrate transport and metabolism]. 161
2902 225805 COG3266 DamX Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning]. 292
2903 225806 COG3267 ExeA Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking, secretion, and vesicular transport]. 269
2904 225807 COG3268 COG3268 Uncharacterized conserved protein, related to short-chain dehydrogenases [Function unknown]. 382
2905 225808 COG3269 COG3269 Predicted RNA-binding protein, contains TRAM domain [General function prediction only]. 73
2906 225809 COG3270 Ncl1 Ribosome biogenesis protein, NOL1/NOP2/fmu family [Translation, ribosomal structure and biogenesis]. 127
2907 225810 COG3271 COG3271 Predicted double-glycine peptidase [General function prediction only]. 201
2908 225811 COG3272 YbgA Uncharacterized conserved protein YbgA, DUF1722 family [Function unknown]. 163
2909 225812 COG3273 COG3273 Uncharacterized conserved protein, contains PhoU and TrkA_C domains [Function unknown]. 204
2910 225813 COG3274 WecH Surface polysaccharide O-acyltransferase, integral membrane enzyme [Cell wall/membrane/envelope biogenesis]. 332
2911 225814 COG3275 LytS Sensor histidine kinase, LytS/YehU family [Signal transduction mechanisms]. 557
2912 225815 COG3276 SelB Selenocysteine-specific translation elongation factor [Translation, ribosomal structure and biogenesis]. 447
2913 225816 COG3277 GAR1 rRNA processing protein Gar1 [Translation, ribosomal structure and biogenesis]. 98
2914 225817 COG3278 CcoN Cbb3-type cytochrome oxidase, subunit 1 [Energy production and conversion]. 482
2915 225818 COG3279 LytT DNA-binding response regulator, LytR/AlgR family [Transcription, Signal transduction mechanisms]. 244
2916 225819 COG3280 TreY Maltooligosyltrehalose synthase [Carbohydrate transport and metabolism]. 889
2917 225820 COG3281 Ble Predicted trehalose synthase [Carbohydrate transport and metabolism]. 438
2918 225821 COG3283 TyrR Transcriptional regulator of aromatic amino acids metabolism [Transcription, Amino acid transport and metabolism]. 511
2919 225822 COG3284 AcoR Transcriptional regulator of acetoin/glycerol metabolism [Transcription]. 606
2920 225823 COG3285 LigD Eukaryotic-type DNA primase [Replication, recombination and repair]. 299
2921 225824 COG3286 COG3286 Uncharacterized protein [Function unknown]. 204
2922 225825 COG3287 COG3287 Uncharacterized conserved protein, contains FIST_N domain [Function unknown]. 379
2923 225826 COG3288 PntA NAD/NADP transhydrogenase alpha subunit [Energy production and conversion]. 356
2924 225827 COG3290 CitA Sensor histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]. 537
2925 225828 COG3291 COG3291 PKD repeat [Function unknown]. 297
2926 225829 COG3292 COG3292 Periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 671
2927 225830 COG3293 COG3293 Transposase [Mobilome: prophages, transposons]. 124
2928 225831 COG3294 COG3294 Metal-dependent phosphatase/phosphodiesterase, HD superfamily [General function prediction only]. 269
2929 225832 COG3295 COG3295 Uncharacterized protein [Function unknown]. 213
2930 225833 COG3296 Tic20 Uncharacterized conserved protein, Tic20 family [Function unknown]. 143
2931 225834 COG3297 PulL Type II secretory pathway, component PulL [Intracellular trafficking, secretion, and vesicular transport]. 390
2932 225835 COG3298 COG3298 Predicted 3'-5' exonuclease related to the exonuclease domain of PolB [Replication, recombination and repair]. 122
2933 225836 COG3299 JayE Uncharacterized phage protein gp47/JayE [Mobilome: prophages, transposons]. 353
2934 225837 COG3300 MHYT MHYT domain, NO-binding membrane sensor [Signal transduction mechanisms]. 236
2935 225838 COG3301 NrfD Formate-dependent nitrite reductase, membrane component NrfD [Inorganic ion transport and metabolism]. 305
2936 225839 COG3302 DmsC DMSO reductase anchor subunit [Energy production and conversion]. 281
2937 225840 COG3303 NrfA Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit [Inorganic ion transport and metabolism]. 501
2938 225841 COG3304 YccF Uncharacterized membrane protein YccF, DUF307 family [Function unknown]. 145
2939 225842 COG3305 COG3305 Uncharacterized membrane protein, DUF2068 family [Function unknown]. 152
2940 225843 COG3306 COG3306 Glycosyltransferase involved in LPS biosynthesis, GR25 family [Cell wall/membrane/envelope biogenesis]. 255
2941 225844 COG3307 RfaL O-antigen ligase [Cell wall/membrane/envelope biogenesis]. 424
2942 225845 COG3308 COG3308 Uncharacterized membrane protein [Function unknown]. 131
2943 225846 COG3309 VapD Virulence-associated protein VapD (function unknown) [Function unknown]. 96
2944 225847 COG3310 COG3310 Uncharacterized protein, DUF1415 family [Function unknown]. 196
2945 225848 COG3311 AlpA Predicted DNA-binding transcriptional regulator AlpA [Transcription, Mobilome: prophages, transposons]. 70
2946 225849 COG3312 AtpI FoF1-type ATP synthase assembly protein I [Energy production and conversion]. 128
2947 225850 COG3313 YdhL Predicted Fe-S protein YdhL, DUF1289 family [General function prediction only]. 74
2948 225851 COG3314 YjiH Uncharacterized membrane protein YjiH, contains nucleoside recognition GATE domain [Function unknown]. 427
2949 225852 COG3315 YktD O-Methyltransferase involved in polyketide biosynthesis [Secondary metabolites biosynthesis, transport and catabolism]. 297
2950 225853 COG3316 Rve Transposase (or an inactivated derivative) [Mobilome: prophages, transposons]. 215
2951 225854 COG3317 NlpB Uncharacterized lipoprotein, NlpB/DapX family [Function unknown]. 342
2952 225855 COG3318 YecA Uncharacterized conserved protein YecA, UPF0149 family, contains C-terminal Zn-binding SEC-C motif [Function unknown]. 216
2953 225856 COG3319 EntF Thioesterase domain of type I polyketide synthase or non-ribosomal peptide synthetase [Secondary metabolites biosynthesis, transport and catabolism]. 257
2954 225857 COG3320 Lys2b Thioester reductase domain of alpha aminoadipate reductase Lys2 and NRPSs [Secondary metabolites biosynthesis, transport and catabolism]. 382
2955 225858 COG3321 PksD Acyl transferase domain in polyketide synthase (PKS) enzymes [Secondary metabolites biosynthesis, transport and catabolism]. 1061
2956 225859 COG3322 CHASE4 Extracellular (periplasmic) sensor domain CHASE (specificity unknown) [Signal transduction mechanisms]. 295
2957 225860 COG3323 YqfO Uncharacterized protein YbgI, a toroidal structure with a dinuclear metal site [Function unknown]. 109
2958 225861 COG3324 COG3324 Predicted enzyme related to lactoylglutathione lyase [General function prediction only]. 127
2959 225862 COG3325 ChiA Chitinase, GH18 family [Carbohydrate transport and metabolism]. 441
2960 225863 COG3326 YsdA Uncharacterized membrane protein YsdA, DUF1294 family [Function unknown]. 94
2961 225864 COG3327 PaaX DNA-binding transcriptional regulator PaaX (phenylacetic acid degradation) [Transcription]. 291
2962 225865 COG3328 IS285 Transposase (or an inactivated derivative) [Mobilome: prophages, transposons]. 379
2963 225866 COG3329 COG3329 Uncharacterized conserved protein [Function unknown]. 372
2964 225867 COG3330 COG3330 Uncharacterized conserved protein [Function unknown]. 215
2965 225868 COG3331 PrfA Penicillin-binding protein-related factor A, putative recombinase [General function prediction only]. 177
2966 225869 COG3332 NRDE Uncharacterized conserved protein, contains NRDE domain [Function unknown]. 270
2967 225870 COG3333 COG3333 TctA family transporter [General function prediction only]. 504
2968 225871 COG3334 MotE Flagellar motility protein MotE, a chaperone for MotC folding [Cell motility]. 192
2969 225872 COG3335 COG3335 Transposase [Mobilome: prophages, transposons]. 132
2970 225873 COG3336 CtaG Cytochrome c oxidase assembly factor CtaG [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 299
2971 225874 COG3337 Cmr5 CRISPR/Cas system CMR-associated protein Cmr5, small subunit [Defense mechanisms]. 134
2972 225875 COG3338 Cah Carbonic anhydrase [Inorganic ion transport and metabolism]. 250
2973 225876 COG3339 YkvA Uncharacterized membrane protein YkvA, DUF1232 family [Function unknown]. 116
2974 225877 COG3340 PepE Peptidase E [Amino acid transport and metabolism]. 224
2975 225878 COG3341 Rnh1 RNase HI-related protein, contains viroplasmin and RNaseH domains [General function prediction only]. 225
2976 225879 COG3342 COG3342 Uncharacterized conserved protein, Ntn-hydrolase superfamily [General function prediction only]. 265
2977 225880 COG3343 RpoE DNA-directed RNA polymerase, delta subunit [Transcription]. 175
2978 225881 COG3344 YkfC Retron-type reverse transcriptase [Mobilome: prophages, transposons]. 328
2979 225882 COG3345 GalA Alpha-galactosidase [Carbohydrate transport and metabolism]. 687
2980 225883 COG3346 Shy1 Cytochrome oxidase assembly protein Shy1 [Posttranslational modification, protein turnover, chaperones]. 252
2981 225884 COG3347 RhaD Rhamnose utilisation protein RhaD, predicted bifunctional aldolase and dehydrogenase [Carbohydrate transport and metabolism]. 404
2982 225885 COG3349 COG3349 Uncharacterized conserved protein, contains NAD-binding domain and a Fe-S cluster [General function prediction only]. 485
2983 225886 COG3350 COG3350 Uncharacterized conserved protein, YHS domain [Function unknown]. 53
2984 225887 COG3351 FlaD Archaellum component FlaD/FlaE [Cell motility]. 214
2985 225888 COG3352 FlaC Archaellum component FlaC [Cell motility]. 157
2986 225889 COG3353 FlaF Archaellum component FlaF, FlaF/FlaG flagellin family [Cell motility]. 137
2987 225890 COG3354 FlaG Archaellum component FlaG, FlaF/FlaG flagellin family [Cell motility]. 154
2988 225891 COG3355 COG3355 Predicted transcriptional regulator [Transcription]. 126
2989 225892 COG3356 COG3356 Predicted membrane-associated lipid hydrolase, neutral ceramidase superfamily [Lipid transport and metabolism]. 578
2990 225893 COG3357 COG3357 Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]. 97
2991 225894 COG3358 COG3358 Uncharacterized conserved protein, DUF1684 family [Function unknown]. 262
2992 225895 COG3359 YprB Uncharacterized conserved protein YprB, contains RNaseH-like and TPR domains [General function prediction only]. 278
2993 225896 COG3360 COG3360 Flavin-binding protein dodecin [General function prediction only]. 71
2994 225897 COG3361 YqjF Uncharacterized protein YqjF, DUF2071 family [Function unknown]. 240
2995 225898 COG3363 PurO Archaeal IMP cyclohydrolase [Nucleotide transport and metabolism]. 200
2996 225899 COG3364 COG3364 Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 112
2997 225900 COG3365 COG3365 Uncharacterized protein, DUF2073 family [Function unknown]. 118
2998 225901 COG3366 COG3366 Uncharacterized protein [Function unknown]. 311
2999 225902 COG3367 COG3367 Uncharacterized conserved protein, NAD-dependent epimerase/dehydratase family [General function prediction only]. 339
3000 225903 COG3368 COG3368 Predicted permease [General function prediction only]. 465
3001 225904 COG3369 COG3369 Uncharacterized protein, contains Zn-finger domain of CDGSH type [Function unknown]. 78
3002 225905 COG3370 COG3370 Uncharacterized protein [Function unknown]. 113
3003 225906 COG3371 COG3371 Uncharacterized membrane protein [Function unknown]. 181
3004 225907 COG3372 COG3372 Predicted nuclease of restriction endonuclease-like (RecB) superfamily, implicated in nucleotide excision repair [General function prediction only]. 396
3005 225908 COG3373 COG3373 Predicted transcriptional regulator, contains HTH domain [Transcription]. 108
3006 225909 COG3374 COG3374 Uncharacterized membrane protein [Function unknown]. 197
3007 225910 COG3375 COG3375 Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 266
3008 225911 COG3376 HoxN High-affinity nickel permease [Inorganic ion transport and metabolism]. 342
3009 225912 COG3377 YunC Uncharacterized protein YunC, DUF1805 family [Function unknown]. 95
3010 225913 COG3378 COG3378 Phage- or plasmid-associated DNA primase [Mobilome: prophages, transposons]. 517
3011 225914 COG3379 COG3379 Predicted phosphohydrolase or phosphomutase, AlkP superfamily [General function prediction only]. 471
3012 225915 COG3380 COG3380 Predicted NAD/FAD-dependent oxidoreductase [General function prediction only]. 331
3013 225916 COG3381 TorD Cytoplasmic chaperone TorD involved in molybdoenzyme TorA maturation [Posttranslational modification, protein turnover, chaperones]. 204
3014 225917 COG3382 B3/B4 B3/B4 domain (DNA/RNA-binding domain of Phe-tRNA-synthetase) [General function prediction only]. 229
3015 225918 COG3383 YjgC Predicted molybdopterin-dependent oxidoreductase YjgC [General function prediction only]. 978
3016 225919 COG3384 LigB Aromatic ring-opening dioxygenase, catalytic subunit, LigB family [Secondary metabolites biosynthesis, transport and catabolism]. 268
3017 225920 COG3385 InsG IS4 transposase [Mobilome: prophages, transposons]. 292
3018 225921 COG3386 YvrE Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]. 307
3019 225922 COG3387 SGA1 Glucoamylase (glucan-1,4-alpha-glucosidase), GH15 family [Carbohydrate transport and metabolism]. 612
3020 225923 COG3388 COG3388 Predicted transcriptional regulator [Transcription]. 101
3021 225924 COG3389 COG3389 Presenilin-like membrane protease, A22 family [Posttranslational modification, protein turnover, chaperones]. 277
3022 225925 COG3390 COG3390 Replication protein A (RPA) family protein [Replication, recombination and repair]. 196
3023 225926 COG3391 YncE DNA-binding beta-propeller fold protein YncE [General function prediction only]. 381
3024 225927 COG3392 COG3392 Adenine-specific DNA methylase [Replication, recombination and repair]. 330
3025 225928 COG3393 COG3393 Predicted acetyltransferase, GNAT family [General function prediction only]. 268
3026 225929 COG3394 ChbG Predicted glycoside hydrolase or deacetylase ChbG, UPF0249 family [Function unknown]. 257
3027 225930 COG3395 YgbK Uncharacterized conserved protein YgbK, DUF1537 family [Function unknown]. 413
3028 225931 COG3396 YdbO 1,2-phenylacetyl-CoA epoxidase, catalytic subunit [Secondary metabolites biosynthesis, transport and catabolism]. 265
3029 225932 COG3397 COG3397 Predicted carbohydrate-binding protein, contains CBM5 and CBM33 domains [General function prediction only]. 308
3030 225933 COG3398 COG3398 Predicted transcriptional regulator, contains two HTH domains [Transcription]. 240
3031 225934 COG3399 COG3399 Uncharacterized protein [Function unknown]. 148
3032 225935 COG3400 COG3400 Uncharacterized protein [Function unknown]. 471
3033 225936 COG3401 FN3 Fibronectin type 3 domain [General function prediction only]. 343
3034 225937 COG3402 YdbS Uncharacterized membrane protein YdbS, contains bPH2 (bacterial pleckstrin homology) domain [Function unknown]. 161
3035 225938 COG3403 YcgG Uncharacterized protein YcgG, contains conserved FPC and CPF motifs [Function unknown]. 257
3036 225939 COG3404 FtcD Formiminotetrahydrofolate cyclodeaminase [Amino acid transport and metabolism]. 208
3037 225940 COG3405 BcsZ Endo-1,4-beta-D-glucanase Y [Carbohydrate transport and metabolism]. 360
3038 225941 COG3407 MVD1 Mevalonate pyrophosphate decarboxylase [Lipid transport and metabolism]. 329
3039 225942 COG3408 GDB1 Glycogen debranching enzyme (alpha-1,6-glucosidase) [Carbohydrate transport and metabolism]. 641
3040 225943 COG3409 PGRP Peptidoglycan-binding (PGRP) domain of peptidoglycan hydrolases [Cell wall/membrane/envelope biogenesis]. 185
3041 225944 COG3410 BH3996 Uncharacterized protein, DUF2075 family [Function unknown]. 191
3042 225945 COG3411 2Fe2S (2Fe-2S) ferredoxin [Energy production and conversion]. 64
3043 225946 COG3412 DhaM PTS-EIIA-like component DhaM of the dihydroxyacetone kinase DhaKLM complex [Signal transduction mechanisms]. 129
3044 225947 COG3413 COG3413 Predicted DNA binding protein, contains HTH domain [General function prediction only]. 215
3045 225948 COG3414 SgaB Phosphotransferase system, galactitol-specific IIB component [Carbohydrate transport and metabolism]. 93
3046 225949 COG3415 COG3415 Transposase [Mobilome: prophages, transposons]. 138
3047 225950 COG3416 COG3416 Uncharacterized protein [Function unknown]. 233
3048 225951 COG3417 LpoB Outer membrane lipoprotein LpoB, binds and activates PBP1b [Cell wall/membrane/envelope biogenesis]. 200
3049 225952 COG3418 FlgN Flagellar biosynthesis/type III secretory pathway chaperone [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 146
3050 225953 COG3419 PilY1 Tfp pilus assembly protein, tip-associated adhesin PilY1 [Cell motility, Extracellular structures]. 1036
3051 225954 COG3420 NosD Nitrous oxidase accessory protein NosD, contains tandem CASH domains [Inorganic ion transport and metabolism]. 408
3052 225955 COG3421 COG3421 Uncharacterized protein [Function unknown]. 812
3053 225956 COG3422 YegP Uncharacterized conserved protein YegP, UPF0339 family [Function unknown]. 59
3054 225957 COG3423 SfsB Predicted transcriptional regulator, lambda repressor-like DNA-binding domain [Transcription]. 82
3055 225958 COG3424 BH0617 Predicted naringenin-chalcone synthase [Secondary metabolites biosynthesis, transport and catabolism]. 356
3056 225959 COG3425 PksG 3-hydroxy-3-methylglutaryl CoA synthase [Lipid transport and metabolism]. 377
3057 225960 COG3426 Buk Butyrate kinase [Energy production and conversion]. 358
3058 225961 COG3427 CoxG Carbon monoxide dehydrogenase subunit G [Energy production and conversion]. 146
3059 225962 COG3428 YdbT Uncharacterized membrane protein YdbT, contains bPH2 (bacterial pleckstrin homology) domain [Function unknown]. 494
3060 225963 COG3429 OpcA Glucose-6-phosphate dehydrogenase assembly protein OpcA, contains a peptidoglycan-binding domain [Carbohydrate transport and metabolism]. 314
3061 225964 COG3430 COG3430 Archaeal flagellin (archaellin), FlaG/FlaF family [Cell motility]. 161
3062 225965 COG3431 COG3431 Uncharacterized membrane protein, DUF373 family [Function unknown]. 142
3063 225966 COG3432 COG3432 Predicted transcriptional regulator [Transcription]. 95
3064 225967 COG3433 DhbB2 Aryl carrier domain [Secondary metabolites biosynthesis, transport and catabolism]. 74
3065 225968 COG3434 YuxH c-di-GMP-related signal transduction protein, contains EAL and HDOD domains [Signal transduction mechanisms]. 407
3066 225969 COG3435 COG3435 Gentisate 1,2-dioxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 351
3067 225970 COG3436 COG3436 Transposase [Mobilome: prophages, transposons]. 157
3068 225971 COG3437 RpfG Response regulator c-di-GMP phosphodiesterase, RpfG family, contains REC and HD-GYP domains [Signal transduction mechanisms]. 360
3069 225972 COG3439 COG3439 Uncharacterized conserved protein, DUF302 family [Function unknown]. 137
3070 225973 COG3440 COG3440 Predicted restriction endonuclease [Defense mechanisms]. 301
3071 225974 COG3442 COG3442 Glutamine amidotransferase related to the GATase domain of CobQ [General function prediction only]. 250
3072 225975 COG3443 ZinT Periplasmic Zn/Cd-binding protein ZinT [Inorganic ion transport and metabolism]. 193
3073 225976 COG3444 AgaB Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB [Carbohydrate transport and metabolism]. 159
3074 225977 COG3445 GrcA Autonomous glycyl radical cofactor GrcA [Coenzyme transport and metabolism]. 127
3075 225978 COG3447 MASE1 Integral membrane sensor domain MASE1 [Signal transduction mechanisms]. 308
3076 225979 COG3448 COG3448 CBS-domain-containing membrane protein [Signal transduction mechanisms]. 382
3077 225980 COG3449 SbmC DNA gyrase inhibitor GyrI [Replication, recombination and repair]. 154
3078 225981 COG3450 COG3450 Predicted enzyme of the cupin superfamily [General function prediction only]. 116
3079 225982 COG3451 VirB4 Type IV secretory pathway, VirB4 component [Intracellular trafficking, secretion, and vesicular transport]. 796
3080 225983 COG3452 CHASE Extracellular (periplasmic) sensor domain CHASE (specificity unknown) [Signal transduction mechanisms]. 297
3081 225984 COG3453 COG3453 Predicted phosphohydrolase, protein tyrosine phosphatase (PTP) superfamily, DUF442 family [General function prediction only]. 130
3082 225985 COG3454 PhnM Alpha-D-ribose 1-methylphosphonate 5-triphosphate diphosphatase PhnM [Inorganic ion transport and metabolism]. 377
3083 225986 COG3455 COG3455 Type VI protein secretion system component VasF [Intracellular trafficking, secretion, and vesicular transport]. 262
3084 225987 COG3456 COG3456 Predicted component of the type VI protein secretion system, contains a FHA domain [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 430
3085 225988 COG3457 YhfX Predicted amino acid racemase [Amino acid transport and metabolism]. 353
3086 225989 COG3458 Axe1 Cephalosporin-C deacetylase or related acetyl esterase [Secondary metabolites biosynthesis, transport and catabolism]. 321
3087 225990 COG3459 COG3459 Cellobiose phosphorylase [Carbohydrate transport and metabolism]. 1056
3088 225991 COG3460 PaaB 1,2-phenylacetyl-CoA epoxidase, PaaB subunit [Secondary metabolites biosynthesis, transport and catabolism]. 117
3089 225992 COG3461 COG3461 Uncharacterized protein [Function unknown]. 103
3090 225993 COG3462 COG3462 Uncharacterized membrane protein [Function unknown]. 117
3091 225994 COG3463 COG3463 Uncharacterized membrane protein [Function unknown]. 458
3092 225995 COG3464 COG3464 Transposase [Mobilome: prophages, transposons]. 402
3093 225996 COG3465 YwgA Uncharacterized protein YwgA [Function unknown]. 171
3094 225997 COG3466 ISA1214 Putative transposon-encoded protein [Mobilome: prophages, transposons]. 52
3095 225998 COG3467 NimA Nitroimidazole reductase NimA or a related FMN-containing flavoprotein, pyridoxamine 5'-phosphate oxidase superfamily [Defense mechanisms]. 166
3096 225999 COG3468 AidA Type V secretory pathway, adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 592
3097 226000 COG3469 Chi1 Chitinase [Carbohydrate transport and metabolism]. 332
3098 226001 COG3470 Tpd Uncharacterized protein probably involved in high-affinity Fe2+ transport [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 179
3099 226002 COG3471 COG3471 Predicted secreted (periplasmic) protein [Function unknown]. 235
3100 226003 COG3472 COG3472 Uncharacterized protein [Function unknown]. 342
3101 226004 COG3473 COG3473 Maleate cis-trans isomerase [Secondary metabolites biosynthesis, transport and catabolism]. 238
3102 226005 COG3474 Cyc7 Cytochrome c2 [Energy production and conversion]. 135
3103 226006 COG3475 LicD Phosphorylcholine metabolism protein LicD [Lipid transport and metabolism]. 256
3104 226007 COG3476 TspO Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) [Signal transduction mechanisms]. 161
3105 226008 COG3477 YagU Uncharacterized membrane protein YagU, involved in acid resistance, DUF1440 family [Function unknown]. 176
3106 226009 COG3478 YpzJ Predicted nucleic-acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 68
3107 226010 COG3479 PadC Phenolic acid decarboxylase [Secondary metabolites biosynthesis, transport and catabolism]. 175
3108 226011 COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]. 342
3109 226012 COG3481 YhaM 3'-5' exoribonuclease YhaM, can participate in 23S rRNA maturation, HD superfamily [Translation, ribosomal structure and biogenesis]. 287
3110 226013 COG3482 COG3482 Uncharacterized protein [Function unknown]. 237
3111 226014 COG3483 TDO2 Tryptophan 2,3-dioxygenase (vermilion) [Amino acid transport and metabolism]. 262
3112 226015 COG3484 COG3484 Predicted proteasome-type protease [Posttranslational modification, protein turnover, chaperones]. 255
3113 226016 COG3485 PcaH Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport and catabolism]. 226
3114 226017 COG3486 IucD Lysine/ornithine N-monooxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 436
3115 226018 COG3487 IrpA Uncharacterized iron-regulated protein [Function unknown]. 446
3116 226019 COG3488 COG3488 Uncharacterized conserved protein with two CxxC motifs, DUF1111 family [General function prediction only]. 481
3117 226020 COG3489 COG3489 Predicted periplasmic lipoprotein [Function unknown]. 359
3118 226021 COG3490 COG3490 Uncharacterized protein [Function unknown]. 366
3119 226022 COG3491 PcbC Isopenicillin N synthase and related dioxygenases [Secondary metabolites biosynthesis, transport and catabolism]. 322
3120 226023 COG3492 COG3492 Uncharacterized protein, DUF1244 family [Function unknown]. 104
3121 226024 COG3493 CitS Na+/citrate or Na+/malate symporter [Energy production and conversion]. 438
3122 226025 COG3494 COG3494 Uncharacterized conserved protein, DUF1009 family [Function unknown]. 279
3123 226026 COG3495 COG3495 Uncharacterized protein, DUF3299 family [Function unknown]. 166
3124 226027 COG3496 COG3496 Uncharacterized conserved protein, DUF1365 family [Function unknown]. 261
3125 226028 COG3497 COG3497 Phage tail sheath protein FI [Mobilome: prophages, transposons]. 394
3126 226029 COG3498 COG3498 Phage tail tube protein FII [Mobilome: prophages, transposons]. 169
3127 226030 COG3499 COG3499 Phage protein U [Mobilome: prophages, transposons]. 147
3128 226031 COG3500 gpD Phage protein D [Mobilome: prophages, transposons]. 350
3129 226032 COG3501 VgrG Uncharacterized conserved protein, implicated in type VI secretion and phage assembly [Intracellular trafficking, secretion, and vesicular transport, Mobilome: prophages, transposons, General function prediction only]. 550
3130 226033 COG3502 COG3502 Uncharacterized conserved protein, DUF952 family [Function unknown]. 115
3131 226034 COG3503 COG3503 Uncharacterized membrane protein [Function unknown]. 323
3132 226035 COG3504 VirB9 Type IV secretory pathway, VirB9 components [Intracellular trafficking, secretion, and vesicular transport]. 265
3133 226036 COG3505 VirD4 Type IV secretory pathway, VirD4 component, TraG/TraD family ATPase [Intracellular trafficking, secretion, and vesicular transport]. 596
3134 226037 COG3506 Ree1 Regulation of enolase protein 1 (function unknown), concanavalin A-like superfamily [Function unknown]. 189
3135 226038 COG3507 XynB2 Beta-xylosidase [Carbohydrate transport and metabolism]. 549
3136 226039 COG3508 HmgA Homogentisate 1,2-dioxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 427
3137 226040 COG3509 LpqC Poly(3-hydroxybutyrate) depolymerase [Secondary metabolites biosynthesis, transport and catabolism]. 312
3138 226041 COG3510 CmcI Cephalosporin hydroxylase [Defense mechanisms]. 237
3139 226042 COG3511 PlcC Phospholipase C [Cell wall/membrane/envelope biogenesis]. 527
3140 226043 COG3512 Cas2 CRISPR/Cas system-associated protein Cas2, endoribonuclease [Defense mechanisms]. 116
3141 226044 COG3513 Cas9 CRISPR/Cas system Type II associated protein, contains McrA/HNH and RuvC-like nuclease domains [Defense mechanisms]. 1088
3142 226045 COG3514 COG3514 Uncharacterized conserved protein, DUF4415 family [Function unknown]. 93
3143 226046 COG3515 COG3515 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 346
3144 226047 COG3516 COG3516 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 169
3145 226048 COG3517 COG3517 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 495
3146 226049 COG3518 COG3518 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 157
3147 226050 COG3519 COG3519 Type VI protein secretion system component VasA [Intracellular trafficking, secretion, and vesicular transport]. 621
3148 226051 COG3520 COG3520 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 335
3149 226052 COG3521 COG3521 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 159
3150 226053 COG3522 COG3522 Predicted component of the type VI protein secretion system [Intracellular trafficking, secretion, and vesicular transport]. 446
3151 226054 COG3523 IcmF Type VI protein secretion system component VasK [Intracellular trafficking, secretion, and vesicular transport]. 1188
3152 226055 COG3524 KpsE Capsule polysaccharide export protein KpsE/RkpR [Cell wall/membrane/envelope biogenesis]. 372
3153 226056 COG3525 Chb N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism]. 732
3154 226057 COG3526 COG3526 Predicted selenoprotein, Rdx family [Function unknown]. 99
3155 226058 COG3527 AlsD Alpha-acetolactate decarboxylase [Secondary metabolites biosynthesis, transport and catabolism]. 234
3156 226059 COG3528 COG3528 Uncharacterized protein, DUF2219 family [Function unknown]. 330
3157 226060 COG3529 COG3529 Predicted nucleic-acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 66
3158 226061 COG3530 COG3530 Uncharacterized conserved protein, DUF3820 family [Function unknown]. 71
3159 226062 COG3531 COG3531 Predicted protein-disulfide isomerase, contains CxxC motif [Posttranslational modification, protein turnover, chaperones]. 212
3160 226063 COG3533 COG3533 Uncharacterized conserved protein, DUF1680 family [Function unknown]. 589
3161 226064 COG3534 AbfA Alpha-L-arabinofuranosidase [Carbohydrate transport and metabolism]. 501
3162 226065 COG3535 COG3535 Uncharacterized conserved protein, DUF917 family [Function unknown]. 357
3163 226066 COG3536 COG3536 Uncharacterized conserved protein, DUF971 family [Function unknown]. 120
3164 226067 COG3537 COG3537 Putative alpha-1,2-mannosidase [Carbohydrate transport and metabolism]. 768
3165 226068 COG3538 COG3538 Meiotically up-regulated gene 157 (Mug157) protein (function unknown) [Function unknown]. 434
3166 226069 COG3539 FimA Pilin (type 1 fimbria component protein) [Cell motility]. 184
3167 226070 COG3540 PhoD Phosphodiesterase/alkaline phosphatase D [Inorganic ion transport and metabolism]. 522
3168 226071 COG3541 YcgL Predicted nucleotidyltransferase [General function prediction only]. 248
3169 226072 COG3542 CFF1 Predicted sugar epimerase, cupin superfamily [General function prediction only]. 162
3170 226073 COG3543 COG3543 Uncharacterized protein [Function unknown]. 135
3171 226074 COG3544 COG3544 Uncharacterized conserved protein, DUF305 family [Function unknown]. 190
3172 226075 COG3545 YdeN Predicted esterase of the alpha/beta hydrolase fold [General function prediction only]. 181
3173 226076 COG3546 CotJC Mn-containing catalase (includes spore coat protein CotJC) [Inorganic ion transport and metabolism]. 277
3174 226077 COG3547 COG3547 Transposase [Mobilome: prophages, transposons]. 303
3175 226078 COG3548 COG3548 Uncharacterized membrane protein [Function unknown]. 197
3176 226079 COG3549 HigB Plasmid maintenance system killer protein [Defense mechanisms]. 94
3177 226080 COG3550 HipA Serine/threonine protein kinase HipA, toxin component of the HipAB toxin-antitoxin module [Signal transduction mechanisms]. 392
3178 226081 COG3551 COG3551 Uncharacterized protein [Function unknown]. 402
3179 226082 COG3552 CoxE Uncharacterized conserved protein, contains von Willebrand factor type A (vWA) domain [Function unknown]. 395
3180 226083 COG3553 COG3553 Uncharacterized protein [Function unknown]. 96
3181 226084 COG3554 COG3554 Uncharacterized protein [Function unknown]. 190
3182 226085 COG3555 LpxO2 Aspartyl/asparaginyl beta-hydroxylase, cupin superfamily [Posttranslational modification, protein turnover, chaperones]. 291
3183 226086 COG3556 COG3556 Uncharacterized membrane protein [Function unknown]. 150
3184 226087 COG3557 YgaC Uncharacterized protein associated with RNAses G and E, UPF0374/DUF402 family [Function unknown]. 177
3185 226088 COG3558 COG3558 Uncharacterized conserved protein, nuclear transport factor 2 (NTF2) superfamily [Function unknown]. 154
3186 226089 COG3559 TnrB3 Putative exporter of polyketide antibiotics [Intracellular trafficking, secretion, and vesicular transport]. 536
3187 226090 COG3560 FMR2 Fatty acid repression mutant protein (predicted oxidoreductase) [General function prediction only]. 200
3188 226091 COG3561 COG3561 Phage anti-repressor protein [Mobilome: prophages, transposons]. 110
3189 226092 COG3562 KpsS Capsule polysaccharide modification protein KpsS [Cell wall/membrane/envelope biogenesis]. 403
3190 226093 COG3563 KpsC Capsule polysaccharide export protein KpsC/LpsZ [Cell wall/membrane/envelope biogenesis]. 671
3191 226094 COG3564 COG3564 Uncharacterized conserved protein, DUF779 family [Function unknown]. 116
3192 226095 COG3565 COG3565 Predicted dioxygenase of extradiol dioxygenase family [General function prediction only]. 138
3193 226096 COG3566 COG3566 Uncharacterized protein [Function unknown]. 379
3194 226097 COG3567 COG3567 Uncharacterized protein [Function unknown]. 452
3195 226098 COG3568 ElsH Metal-dependent hydrolase, endonuclease/exonuclease/phosphatase family [General function prediction only]. 259
3196 226099 COG3569 Top1 DNA topoisomerase IB [Replication, recombination and repair]. 354
3197 226100 COG3570 StrB Streptomycin 6-kinase [Defense mechanisms]. 274
3198 226101 COG3571 COG3571 Predicted hydrolase of the alpha/beta-hydrolase fold [General function prediction only]. 213
3199 226102 COG3572 Gsh2 Gamma-glutamylcysteine synthetase [Coenzyme transport and metabolism]. 456
3200 226103 COG3573 COG3573 Predicted oxidoreductase [General function prediction only]. 552
3201 226104 COG3575 COG3575 Uncharacterized protein [Function unknown]. 184
3202 226105 COG3576 COG3576 Predicted flavin-nucleotide-binding protein, pyridoxine 5'-phosphate oxidase superfamily [General function prediction only]. 173
3203 226106 COG3577 COG3577 Predicted aspartyl protease [General function prediction only]. 215
3204 226107 COG3579 PepC Aminopeptidase C [Amino acid transport and metabolism]. 444
3205 226108 COG3580 COG3580 Predicted nucleotide-binding protein, sugar kinase/HSP70/actin superfamily [General function prediction only]. 351
3206 226109 COG3581 COG3581 Predicted nucleotide-binding protein, sugar kinase/HSP70/actin superfamily [General function prediction only]. 420
3207 226110 COG3582 COG3582 Predicted nucleic acid binding protein containing the AN1-type Zn-finger [General function prediction only]. 162
3208 226111 COG3583 YabE Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown]. 309
3209 226112 COG3584 3D 3D (Asp-Asp-Asp) domain [Function unknown]. 109
3210 226113 COG3585 MopI Molybdopterin-binding protein [Coenzyme transport and metabolism]. 69
3211 226114 COG3586 COG3586 Predicted transport protein [Function unknown]. 101
3212 226115 COG3587 COG3587 Restriction endonuclease [Defense mechanisms]. 985
3213 226116 COG3588 Fba1 Fructose-bisphosphate aldolase class 1 [Carbohydrate transport and metabolism]. 332
3214 226117 COG3589 COG3589 Uncharacterized protein [Function unknown]. 360
3215 226118 COG3590 PepO Predicted metalloendopeptidase [Posttranslational modification, protein turnover, chaperones]. 654
3216 226119 COG3591 eMpr V8-like Glu-specific endopeptidase [Posttranslational modification, protein turnover, chaperones]. 251
3217 226120 COG3592 YjdI Uncharacterized Fe-S cluster protein YjdI [Function unknown]. 74
3218 226121 COG3593 YbjD Predicted ATP-dependent endonuclease of the OLD family, contains P-loop ATPase and TOPRIM domains [Replication, recombination and repair]. 581
3219 226122 COG3594 NolL Fucose 4-O-acetylase or related acetyltransferase [Carbohydrate transport and metabolism]. 343
3220 226123 COG3595 YvlB Uncharacterized conserved protein YvlB, contains DUF4097 and DUF4098 domains [Function unknown]. 318
3221 226124 COG3596 YeeP Predicted GTPase [General function prediction only]. 296
3222 226125 COG3597 COG3597 Uncharacterized conserved protein, DUF697 family [Function unknown]. 139
3223 226126 COG3598 RepA RecA-family ATPase [Replication, recombination and repair]. 402
3224 226127 COG3599 DivIVA Cell division septum initiation DivIVA, interacts with FtsZ, MinD and other proteins [Cell cycle control, cell division, chromosome partitioning]. 212
3225 226128 COG3600 GepA Uncharacterized phage-associated protein [Mobilome: prophages, transposons]. 154
3226 226129 COG3601 FmnP Riboflavin transporter FmnP [Coenzyme transport and metabolism]. 186
3227 226130 COG3602 COG3602 Uncharacterized protein [Function unknown]. 134
3228 226131 COG3603 COG3603 Uncharacterized protein [Function unknown]. 128
3229 226132 COG3604 FhlA Transcriptional regulator containing GAF, AAA-type ATPase, and DNA-binding Fis domains [Transcription, Signal transduction mechanisms]. 550
3230 226133 COG3605 PtsP Signal transduction protein containing GAF and PtsI domains [Signal transduction mechanisms]. 756
3231 226134 COG3607 COG3607 Predicted lactoylglutathione lyase [General function prediction only]. 133
3232 226135 COG3608 COG3608 Predicted deacylase [General function prediction only]. 331
3233 226136 COG3609 ParD Transcriptional regulator, contains Arc/MetJ-type RHH (ribbon-helix-helix) DNA-binding domain [Transcription]. 89
3234 226137 COG3610 YjjB Uncharacterized membrane protein YjjB, DUF3815 family [Function unknown]. 156
3235 226138 COG3611 DnaB Replication initiation and membrane attachment protein DnaB [Replication, recombination and repair]. 417
3236 226139 COG3612 COG3612 Uncharacterized protein [Function unknown]. 157
3237 226140 COG3613 RCL Nucleoside 2-deoxyribosyltransferase [Nucleotide transport and metabolism]. 172
3238 226141 COG3614 CHASE1 Extracellular (periplasmic) sensor domain CHASE1 (specificity unknown) [Signal transduction mechanisms]. 348
3239 226142 COG3615 TehB Uncharacterized protein/domain, possibly involved in tellurite resistance [Function unknown]. 99
3240 226143 COG3616 Dsd1 D-serine deaminase, pyridoxal phosphate-dependent [Amino acid transport and metabolism]. 368
3241 226144 COG3617 COG3617 Prophage antirepressor [Mobilome: prophages, transposons]. 176
3242 226145 COG3618 COG3618 Predicted metal-dependent hydrolase, TIM-barrel fold [General function prediction only]. 279
3243 226146 COG3619 YoaK Uncharacterized membrane protein YoaK, UPF0700 family [Function unknown]. 226
3244 226147 COG3620 COG3620 Predicted transcriptional regulator with C-terminal CBS domains [Transcription]. 187
3245 226148 COG3621 PATA Patatin-like phospholipase/acyl hydrolase [General function prediction only]. 394
3246 226149 COG3622 Hyi Hydroxypyruvate isomerase [Carbohydrate transport and metabolism]. 260
3247 226150 COG3623 SgaU L-ribulose-5-phosphate 3-epimerase UlaE [Carbohydrate transport and metabolism]. 287
3248 226151 COG3624 PhnG Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnG [Inorganic ion transport and metabolism]. 151
3249 226152 COG3625 PhnH Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnH [Inorganic ion transport and metabolism]. 196
3250 226153 COG3626 PhnI Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnI [Inorganic ion transport and metabolism]. 367
3251 226154 COG3627 PhnJ Alpha-D-ribose 1-methylphosphonate 5-phosphate C-P lyase [Inorganic ion transport and metabolism]. 291
3252 226155 COG3628 COG3628 Phage baseplate assembly protein W [Mobilome: prophages, transposons]. 116
3253 226156 COG3629 DnrI DNA-binding transcriptional activator of the SARP family [Signal transduction mechanisms]. 280
3254 226157 COG3630 OadG Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, gamma subunit [Energy production and conversion]. 84
3255 226158 COG3631 YesE Ketosteroid isomerase-related protein [General function prediction only]. 133
3256 226159 COG3633 SstT Na+/serine symporter [Amino acid transport and metabolism]. 407
3257 226160 COG3634 AhpF Alkyl hydroperoxide reductase subunit AhpF [Defense mechanisms]. 520
3258 226161 COG3635 ApgM 2,3-bisphosphoglycerate-independent phosphoglycerate mutase, archaeal type [Carbohydrate transport and metabolism]. 408
3259 226162 COG3636 COG3636 DNA-binding prophage protein [Mobilome: prophages, transposons]. 100
3260 226163 COG3637 LomR Opacity protein and related surface antigens [Cell wall/membrane/envelope biogenesis]. 199
3261 226164 COG3638 PhnC ABC-type phosphate/phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 258
3262 226165 COG3639 PhnE ABC-type phosphate/phosphonate transport system, permease component [Inorganic ion transport and metabolism]. 283
3263 226166 COG3640 CooC CO dehydrogenase nickel-insertion accessory protein CooC1 [Posttranslational modification, protein turnover, chaperones]. 255
3264 226167 COG3641 PfoR Uncharacterized membrane protein PfoR (does not regulate perfringolysin O expression) [Function unknown]. 348
3265 226168 COG3642 Bud32 tRNA A-37 threonylcarbamoyl transferase component Bud32 [Translation, ribosomal structure and biogenesis]. 204
3266 226169 COG3643 GluFT Glutamate formiminotransferase [Amino acid transport and metabolism]. 302
3267 226170 COG3644 COG3644 Uncharacterized protein [Function unknown]. 194
3268 226171 COG3645 KilAC Phage antirepressor protein YoqD, KilAC domain [Mobilome: prophages, transposons]. 135
3269 226172 COG3646 pRha Phage regulatory protein Rha [Mobilome: prophages, transposons]. 167
3270 226173 COG3647 YjdF Uncharacterized membrane protein YjdF [Function unknown]. 205
3271 226174 COG3648 UriC Uricase (urate oxidase) [Secondary metabolites biosynthesis, transport and catabolism]. 299
3272 226175 COG3649 Csh2 CRISPR/Cas system type I-B associated protein Csh2, Cas7 group, RAMP superfamily [Defense mechanisms]. 283
3273 226176 COG3650 COG3650 Uncharacterized membrane protein [Function unknown]. 149
3274 226177 COG3651 COG3651 Uncharacterized conserved protein, DUF2237 family [Function unknown]. 125
3275 226178 COG3652 COG3652 Predicted outer membrane protein [Function unknown]. 170
3276 226179 COG3653 COG3653 N-acyl-D-aspartate/D-glutamate deacylase [Secondary metabolites biosynthesis, transport and catabolism]. 579
3277 226180 COG3654 Doc Prophage maintenance system killer protein [Mobilome: prophages, transposons]. 132
3278 226181 COG3655 YozG DNA-binding transcriptional regulator, XRE family [Transcription]. 73
3279 226182 COG3656 COG3656 Predicted periplasmic protein [Function unknown]. 172
3280 226183 COG3657 COG3657 Putative component of the toxin-antitoxin plasmid stabilization module [Defense mechanisms]. 100
3281 226184 COG3658 CytB Cytochrome b [Energy production and conversion]. 192
3282 226185 COG3659 OprB Carbohydrate-selective porin OprB [Cell wall/membrane/envelope biogenesis]. 439
3283 226186 COG3660 ELM1 Mitochondrial fission protein ELM1 [Cell cycle control, cell division, chromosome partitioning]. 329
3284 226187 COG3661 AguA2 Alpha-glucuronidase [Carbohydrate transport and metabolism]. 684
3285 226188 COG3662 COG3662 Uncharacterized conserved protein, DUF2236 family [Function unknown]. 300
3286 226189 COG3663 Mug G:T/U-mismatch repair DNA glycosylase [Replication, recombination and repair]. 169
3287 226190 COG3664 XynB Beta-xylosidase [Carbohydrate transport and metabolism]. 428
3288 226191 COG3665 YcgI Uncharacterized conserved protein YcgI, DUF1989 family [Function unknown]. 264
3289 226192 COG3666 COG3666 Transposase [Mobilome: prophages, transposons]. 161
3290 226193 COG3667 PcoB Uncharacterized protein involved in copper resistance [Inorganic ion transport and metabolism]. 321
3291 226194 COG3668 ParE Plasmid stabilization system protein ParE [Mobilome: prophages, transposons]. 98
3292 226195 COG3669 AfuC Alpha-L-fucosidase [Carbohydrate transport and metabolism]. 430
3293 226196 COG3670 COG3670 Carotenoid cleavage dioxygenase or a related enzyme [Secondary metabolites biosynthesis, transport and catabolism]. 490
3294 226197 COG3671 COG3671 Uncharacterized membrane protein [Function unknown]. 125
3295 226198 COG3672 COG3672 Predicted transglutaminase-like cysteine proteinase [Posttranslational modification, protein turnover, chaperones]. 191
3296 226199 COG3673 COG3673 Uncharacterized protein, PA2063/DUF2235 family [Function unknown]. 423
3297 226200 COG3675 Lip2 Predicted lipase [Lipid transport and metabolism]. 332
3298 226201 COG3676 COG3676 Transposase and inactivated derivatives [Mobilome: prophages, transposons]. 126
3299 226202 COG3677 InsA Transposase [Mobilome: prophages, transposons]. 129
3300 226203 COG3678 CpxP Periplasmic protein refolding chaperone Spy/CpxP family [Posttranslational modification, protein turnover, chaperones]. 160
3301 226204 COG3679 YlbF Cell fate regulator YlbF, YheA/YmcA/DUF963 family (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 118
3302 226205 COG3680 COG3680 Uncharacterized protein [Function unknown]. 259
3303 226206 COG3681 CdsB L-cysteine desulfidase [Amino acid transport and metabolism]. 433
3304 226207 COG3682 COG3682 Predicted transcriptional regulator [Transcription]. 123
3305 226208 COG3683 COG3683 ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 213
3306 226209 COG3684 LacD Tagatose-1,6-bisphosphate aldolase [Carbohydrate transport and metabolism]. 306
3307 226210 COG3685 YciE Ferritin-like metal-binding protein YciE [Inorganic ion transport and metabolism]. 167
3308 226211 COG3686 COG3686 Uncharacterized conserved protein, MAPEG superfamily [Function unknown]. 125
3309 226212 COG3687 COG3687 Predicted metal-dependent hydrolase [General function prediction only]. 280
3310 226213 COG3688 YacP Predicted RNA-binding protein containing a PIN domain [General function prediction only]. 173
3311 226214 COG3689 YcgQ Uncharacterized membrane protein YcgQ, UPF0703/DUF1980 family [Function unknown]. 271
3312 226215 COG3691 YfcZ Uncharacterized conserved protein YfcZ, UPF0381/DUF406 family [Function unknown]. 98
3313 226216 COG3692 YifN Uncharacterized protein YifN, PemK superfamily [Function unknown]. 142
3314 226217 COG3693 XynA Endo-1,4-beta-xylanase, GH35 family [Carbohydrate transport and metabolism]. 345
3315 226218 COG3694 COG3694 ABC-type uncharacterized transport system, permease component [General function prediction only]. 260
3316 226219 COG3695 Atl1 Alkylated DNA nucleotide flippase Atl1, participates in nucleotide excision repair, Ada-like DNA-binding domain [Transcription]. 103
3317 226220 COG3696 CusA Cu/Ag efflux pump CusA [Inorganic ion transport and metabolism]. 1027
3318 226221 COG3697 CitX Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) [Coenzyme transport and metabolism, Lipid transport and metabolism]. 182
3319 226222 COG3698 YigE Uncharacterized protein YigE, DUF2233 family [Function unknown]. 250
3320 226223 COG3700 AphA Acid phosphatase (class B) [Inorganic ion transport and metabolism, General function prediction only]. 237
3321 226224 COG3701 TrbF Type IV secretory pathway, TrbF components [Intracellular trafficking, secretion, and vesicular transport]. 228
3322 226225 COG3702 VirB3 Type IV secretory pathway, VirB3 components [Intracellular trafficking, secretion, and vesicular transport]. 105
3323 226226 COG3703 ChaC Cation transport regulator ChaC [Inorganic ion transport and metabolism]. 190
3324 226227 COG3704 VirB6 Type IV secretory pathway, VirB6 components [Intracellular trafficking, secretion, and vesicular transport]. 406
3325 226228 COG3705 HisZ ATP phosphoribosyltransferase regulatory subunit HisZ [Amino acid transport and metabolism]. 390
3326 226229 COG3706 PleD Two-component response regulator, PleD family, consists of two REC domains and a diguanylate cyclase (GGDEF) domain [Signal transduction mechanisms, Transcription]. 435
3327 226230 COG3707 AmiR Two-component response regulator, AmiR/NasT family, consists of REC and RNA-binding antiterminator (ANTAR) domains [Signal transduction mechanisms, Transcription]. 194
3328 226231 COG3708 YdeE Predicted transcriptional regulator YdeE, contains AraC-type DNA-binding domain [Transcription]. 157
3329 226232 COG3709 PhnN Ribose 1,5-bisphosphokinase PhnN [Carbohydrate transport and metabolism]. 192
3330 226233 COG3710 CadC1 DNA-binding winged helix-turn-helix (wHTH) domain [Transcription]. 148
3331 226234 COG3711 BglG Transcriptional antiterminator [Transcription]. 491
3332 226235 COG3712 FecR Periplasmic ferric-dicitrate binding protein FecR, regulates iron transport through sigma-19 [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 322
3333 226236 COG3713 OmpV Outer membrane scaffolding protein for murein synthesis, MipA/OmpV family [Cell wall/membrane/envelope biogenesis]. 258
3334 226237 COG3714 YhhN Uncharacterized membrane protein YhhN [Function unknown]. 212
3335 226238 COG3715 ManY Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC [Carbohydrate transport and metabolism]. 265
3336 226239 COG3716 ManZ Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID [Carbohydrate transport and metabolism]. 269
3337 226240 COG3717 KduI 5-keto 4-deoxyuronate isomerase [Carbohydrate transport and metabolism]. 278
3338 226241 COG3718 IolB 5-deoxy-D-glucuronate isomerase [Carbohydrate transport and metabolism]. 270
3339 226242 COG3719 RnaI Ribonuclease I [Translation, ribosomal structure and biogenesis]. 249
3340 226243 COG3720 HemS Putative heme degradation protein [Inorganic ion transport and metabolism]. 349
3341 226244 COG3721 HugX Putative heme iron utilization protein [Inorganic ion transport and metabolism]. 176
3342 226245 COG3722 MtlR DNA-binding transcriptional regulator, MltR family [Transcription]. 174
3343 226246 COG3723 RecT Recombinational DNA repair protein RecT [Replication, recombination and repair]. 276
3344 226247 COG3724 AstB Succinylarginine dihydrolase [Amino acid transport and metabolism]. 442
3345 226248 COG3725 AmpE Membrane protein required for beta-lactamase induction [Defense mechanisms]. 282
3346 226249 COG3726 AhpA Uncharacterized membrane protein affecting hemolysin expression [Function unknown]. 214
3347 226250 COG3727 Vsr G:T-mismatch repair DNA endonuclease, very short patch repair protein [Replication, recombination and repair]. 150
3348 226251 COG3728 XtmA Phage terminase, small subunit [Mobilome: prophages, transposons]. 179
3349 226252 COG3729 GsiB General stress protein YciG, contains tandem KGG domains [General function prediction only]. 73
3350 226253 COG3730 SrlA Phosphotransferase system sorbitol-specific component IIC [Carbohydrate transport and metabolism]. 176
3351 226254 COG3731 SrlB Phosphotransferase system sorbitol-specific component IIA [Carbohydrate transport and metabolism]. 123
3352 226255 COG3732 SrlE Phosphotransferase system sorbitol-specific component IIBC [Carbohydrate transport and metabolism]. 328
3353 226256 COG3733 TynA Cu2+-containing amine oxidase [Secondary metabolites biosynthesis, transport and catabolism]. 654
3354 226257 COG3734 DgoK 2-keto-3-deoxy-galactonokinase [Carbohydrate transport and metabolism]. 306
3355 226258 COG3735 TraB Uncharacterized conserved protein YbaP, TraB family [Function unknown]. 299
3356 226259 COG3736 VirB8 Type IV secretory pathway, component VirB8 [Intracellular trafficking, secretion, and vesicular transport]. 239
3357 226260 COG3737 COG3737 Uncharacterized conserved protein, contains Mth938-like domain [Function unknown]. 127
3358 226261 COG3738 YiijF Uncharacterized protein YijF, DUF1287 family [Function unknown]. 200
3359 226262 COG3739 YoaT Uncharacterized membrane protein YoaT, DUF817 family [Function unknown]. 263
3360 226263 COG3740 COG3740 Phage head maturation protease [Mobilome: prophages, transposons]. 194
3361 226264 COG3741 HutG N-formylglutamate amidohydrolase [Amino acid transport and metabolism]. 272
3362 226265 COG3742 COG3742 Uncharacterized protein, contains PIN domain [Function unknown]. 131
3363 226266 COG3743 H3TH Predicted 5' DNA nuclease, flap endonuclease-1-like, helix-3-turn-helix (H3TH) domain [Replication, recombination and repair]. 133
3364 226267 COG3744 COG3744 PIN domain nuclease, a component of toxin-antitoxin system (PIN domain) [Defense mechanisms]. 130
3365 226268 COG3745 CpaB Flp pilus assembly protein CpaB [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 276
3366 226269 COG3746 OprP Phosphate-selective porin [Inorganic ion transport and metabolism]. 426
3367 226270 COG3747 COG3747 Phage terminase, small subunit [Mobilome: prophages, transposons]. 160
3368 226271 COG3748 COG3748 Uncharacterized membrane protein [Function unknown]. 407
3369 226272 COG3749 COG3749 Uncharacterized conserved protein, DUF934 family [Function unknown]. 167
3370 226273 COG3750 COG3750 Uncharacterized conserved protein, UPF0335 family [Function unknown]. 85
3371 226274 COG3751 EGL9 Proline 4-hydroxylase (includes Rps23 Pro-64 3,4-dihydroxylase Tpa1), contains SM-20 domain [Translation, ribosomal structure and biogenesis, Posttranslational modification, protein turnover, chaperones]. 252
3372 226275 COG3752 COG3752 Steroid 5-alpha reductase family enzyme [General function prediction only]. 272
3373 226276 COG3753 YidB Uncharacterized conserved protein YidB, DUF937 family [Function unknown]. 143
3374 226277 COG3754 RgpF Lipopolysaccharide biosynthesis protein [Cell wall/membrane/envelope biogenesis]. 595
3375 226278 COG3755 YecT Uncharacterized conserved protein YecT, DUF1311 family [Function unknown]. 127
3376 226279 COG3756 YdaU Uncharacterized conserved protein YdaU, DUF1376 family [Function unknown]. 153
3377 226280 COG3757 Acm Lysozyme M1 (1,4-beta-N-acetylmuramidase), GH25 family [Cell wall/membrane/envelope biogenesis]. 269
3378 226281 COG3758 Ves Various environmental stresses-induced protein Ves (function unknown) [Function unknown]. 193
3379 226282 COG3759 COG3759 Uncharacterized membrane protein [Function unknown]. 121
3380 226283 COG3760 ProX Predicted aminoacyl-tRNA deacylase, YbaK-like aminoacyl-tRNA editing domain [General function prediction only]. 164
3381 226284 COG3761 NDUFA12 NADH:ubiquinone oxidoreductase 17.2 kD subunit [Energy production and conversion]. 118
3382 226285 COG3762 COG3762 Uncharacterized membrane protein [Function unknown]. 213
3383 226286 COG3763 YneF Uncharacterized protein YneF, UPF0154 family [Function unknown]. 71
3384 226287 COG3764 SrtA Sortase (surface protein transpeptidase) [Cell wall/membrane/envelope biogenesis]. 210
3385 226288 COG3765 WzzB LPS O-antigen chain length determinant protein, WzzB/FepE family [Cell wall/membrane/envelope biogenesis]. 347
3386 226289 COG3766 YjfL Uncharacterized membrane protein YjfL, UPF0719 family [Function unknown]. 133
3387 226290 COG3767 COG3767 Uncharacterized low-complexity protein [Function unknown]. 95
3388 226291 COG3768 YcjF Uncharacterized membrane protein YcjF, UPF0283 family [Function unknown]. 350
3389 226292 COG3769 YedP Predicted mannosyl-3-phosphoglycerate phosphatase, HAD superfamily [Carbohydrate transport and metabolism]. 274
3390 226293 COG3770 MepA Murein endopeptidase [Cell wall/membrane/envelope biogenesis]. 284
3391 226294 COG3771 YciS Uncharacterized membrane protein YciS, DUF1049 family [Function unknown]. 97
3392 226295 COG3772 RrrD Phage-related lysozyme (muramidase), GH24 family [Cell wall/membrane/envelope biogenesis]. 152
3393 226296 COG3773 CwlJ Cell wall hydrolase CwlJ, involved in spore germination [Cell cycle control, cell division, chromosome partitioning, Cell wall/membrane/envelope biogenesis]. 249
3394 226297 COG3774 OCH1 Mannosyltransferase OCH1 or related enzyme [Cell wall/membrane/envelope biogenesis]. 347
3395 226298 COG3775 SgcC Phosphotransferase system, galactitol-specific IIC component [Carbohydrate transport and metabolism]. 446
3396 226299 COG3776 YhhL Uncharacterized conserved protein YhhL, DUF1145 family [Function unknown]. 91
3397 226300 COG3777 HTD2 Hydroxyacyl-ACP dehydratase HTD2, hotdog domain [Lipid transport and metabolism]. 273
3398 226301 COG3778 YmfQ Uncharacterized protein YmfQ in lambdoid prophage, DUF2313 family [Mobilome: prophages, transposons]. 188
3399 226302 COG3779 YegJ Uncharacterized conserved protein YegJ, DUF2314 family [Function unknown]. 151
3400 226303 COG3780 COG3780 DNA endonuclease related to intein-encoded endonucleases [Replication, recombination and repair]. 266
3401 226304 COG3781 YneE Predicted membrane chloride channel, bestrophin family [Inorganic ion transport and metabolism]. 306
3402 226305 COG3782 COG3782 Uncharacterized protein [Function unknown]. 289
3403 226306 COG3783 CybC Soluble cytochrome b562 [Energy production and conversion]. 100
3404 226307 COG3784 YdbL Uncharacterized conserved protein YdbL, DUF1318 family [Function unknown]. 109
3405 226308 COG3785 HspQ Heat shock protein HspQ [Posttranslational modification, protein turnover, chaperones]. 116
3406 226309 COG3786 COG3786 L,D-peptidoglycan transpeptidase YkuD, ErfK/YbiS/YcfS/YnhG family [Cell wall/membrane/envelope biogenesis]. 217
3407 226310 COG3787 YhbP Uncharacterized conserved protein YhbP, UPF0306 family [Function unknown]. 145
3408 226311 COG3788 YecN Uncharacterized membrane protein YecN, MAPEG domain [Function unknown]. 131
3409 226312 COG3789 YjfI Uncharacterized protein YjfI, DUF2170 family [Function unknown]. 146
3410 226313 COG3790 YbgE Predicted membrane protein, encoded in cydAB operon [Function unknown]. 97
3411 226314 COG3791 COG3791 Uncharacterized conserved protein [Function unknown]. 133
3412 226315 COG3792 COG3792 Uncharacterized protein [Function unknown]. 122
3413 226316 COG3793 TerB Tellurite resistance protein [Inorganic ion transport and metabolism]. 144
3414 226317 COG3794 PetE Plastocyanin [Energy production and conversion]. 128
3415 226318 COG3795 COG3795 Uncharacterized conserved protein [Function unknown]. 123
3416 226319 COG3797 COG3797 Uncharacterized conserved protein, DUF1697 family [Function unknown]. 178
3417 226320 COG3798 COG3798 Uncharacterized protein [Function unknown]. 75
3418 226321 COG3799 Mal Methylaspartate ammonia-lyase [Amino acid transport and metabolism]. 410
3419 226322 COG3800 COG3800 Predicted transcriptional regulator [General function prediction only]. 332
3420 226323 COG3801 COG3801 Uncharacterized protein [Function unknown]. 124
3421 226324 COG3802 GguC Uncharacterized protein [Function unknown]. 333
3422 226325 COG3803 COG3803 Uncharacterized conserved protein, DUF924 family [Function unknown]. 182
3423 226326 COG3804 COG3804 Uncharacterized protein [Function unknown]. 350
3424 226327 COG3805 DodA Aromatic ring-cleaving dioxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 120
3425 226328 COG3806 ChrR Anti-sigma factor ChrR, cupin superfamily [Signal transduction mechanisms]. 216
3426 226329 COG3807 SH3 SH3-like domain [Function unknown]. 171
3427 226330 COG3808 OVP1 Na+ or H+-translocating membrane pyrophosphatase [Energy production and conversion]. 703
3428 226331 COG3809 COG3809 Predicted nucleic acid-binding protein, contains Zn-finger domain [General function prediction only]. 88
3429 226332 COG3811 YjhX Uncharacterized protein YjhX, UPF0386/DUF2084 family [Function unknown]. 85
3430 226333 COG3812 COG3812 Uncharacterized protein [Function unknown]. 193
3431 226334 COG3813 COG3813 Uncharacterized protein [Function unknown]. 84
3432 226335 COG3814 SspB2 SspB-like protein, predicted to bind SsrA peptide [Posttranslational modification, protein turnover, chaperones]. 157
3433 226336 COG3815 COG3815 Uncharacterized membrane protein [Function unknown]. 113
3434 226337 COG3816 COG3816 Uncharacterized protein, DUF1285 family [Function unknown]. 205
3435 226338 COG3817 COG3817 Uncharacterized membrane protein [Function unknown]. 313
3436 226339 COG3818 COG3818 Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 167
3437 226340 COG3819 COG3819 Uncharacterized membrane protein [Function unknown]. 229
3438 226341 COG3820 COG3820 Uncharacterized protein, DUF1013 family [Function unknown]. 230
3439 226342 COG3821 COG3821 Uncharacterized membrane protein [Function unknown]. 234
3440 226343 COG3822 YdaE D-lyxose ketol-isomerase [Carbohydrate transport and metabolism]. 225
3441 226344 COG3823 COG3823 Glutamine cyclotransferase [Posttranslational modification, protein turnover, chaperones]. 262
3442 226345 COG3824 COG3824 Predicted Zn-dependent protease, minimal metalloprotease (MMP)-like domain [Posttranslational modification, protein turnover, chaperones]. 136
3443 226346 COG3825 CoxE Uncharacterized conserved protein, contains von Willebrand factor type A (vWA) domain [Function unknown]. 393
3444 226347 COG3826 COG3826 Uncharacterized protein [Function unknown]. 236
3445 226348 COG3827 PopZ Cell pole-organizing protein PopZ [Cell cycle control, cell division, chromosome partitioning]. 231
3446 226349 COG3828 COG3828 Type 1 glutamine amidotransferase (GATase1)-like domain [General function prediction only]. 239
3447 226350 COG3829 RocR Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding Fis domains [Transcription, Signal transduction mechanisms]. 560
3448 226351 COG3830 ACT ACT domain, binds amino acids and other small ligands [Signal transduction mechanisms]. 90
3449 226352 COG3831 WGR WGR domain, predicted DNA-binding domain in MolR [Transcription]. 85
3450 226353 COG3832 YndB Uncharacterized conserved protein YndB, AHSA1/START domain [Function unknown]. 149
3451 226354 COG3833 MalG ABC-type maltose transport system, permease component [Carbohydrate transport and metabolism]. 282
3452 226355 COG3835 CdaR Sugar diacid utilization regulator [Transcription, Signal transduction mechanisms]. 376
3453 226356 COG3836 HpcH 2-keto-3-deoxy-L-rhamnonate aldolase RhmA [Carbohydrate transport and metabolism]. 255
3454 226357 COG3837 COG3837 Uncharacterized conserved protein, cupin superfamily [Function unknown]. 161
3455 226358 COG3838 VirB2 Type IV secretory pathway, VirB2 components (pilins) [Intracellular trafficking, secretion, and vesicular transport]. 108
3456 226359 COG3839 MalK ABC-type sugar transport system, ATPase component [Carbohydrate transport and metabolism]. 338
3457 226360 COG3840 ThiQ ABC-type thiamine transport system, ATPase component [Coenzyme transport and metabolism]. 231
3458 226361 COG3842 PotA ABC-type Fe3+/spermidine/putrescine transport systems, ATPase components [Amino acid transport and metabolism]. 352
3459 226362 COG3843 VirD2 Type IV secretory pathway, VirD2 components (relaxase) [Intracellular trafficking, secretion, and vesicular transport]. 326
3460 226363 COG3844 Bna5 Kynureninase [Amino acid transport and metabolism]. 407
3461 226364 COG3845 YufO ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 501
3462 226365 COG3846 TrbL Type IV secretory pathway, TrbL components [Intracellular trafficking, secretion, and vesicular transport]. 452
3463 226366 COG3847 Flp Flp pilus assembly protein, pilin Flp [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 58
3464 226367 COG3848 PykA2 Phosphohistidine swiveling domain of PEP-utilizing enzymes [Signal transduction mechanisms]. 111
3465 226368 COG3850 NarQ Signal transduction histidine kinase, nitrate/nitrite-specific [Signal transduction mechanisms]. 574
3466 226369 COG3851 UhpB Signal transduction histidine kinase, glucose-6-phosphate specific [Signal transduction mechanisms]. 497
3467 226370 COG3852 NtrB Signal transduction histidine kinase, nitrogen specific [Signal transduction mechanisms]. 363
3468 226371 COG3853 TelA Uncharacterized conserved protein YaaN involved in tellurite resistance [Defense mechanisms]. 386
3469 226372 COG3854 SpoIIIAA Stage III sporulation protein SpoIIIAA [Cell cycle control, cell division, chromosome partitioning]. 308
3470 226373 COG3855 Fbp2 Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 648
3471 226374 COG3856 Sbp Small basic protein (function unknown) [Function unknown]. 113
3472 226375 COG3857 AddB ATP-dependent helicase/DNAse subunit B [Replication, recombination and repair]. 1108
3473 226376 COG3858 YaaH Spore germination protein YaaH [Cell cycle control, cell division, chromosome partitioning]. 423
3474 226377 COG3859 ThiT Thiamine transporter ThiT [Coenzyme transport and metabolism]. 185
3475 226378 COG3860 COG3860 Uncharacterized protein, DUF2087 family [Function unknown]. 89
3476 226379 COG3861 YsnF Stress response protein YsnF (function unknown) [Function unknown]. 195
3477 226380 COG3862 COG3862 Uncharacterized protein with two CxxC motifs [Function unknown]. 117
3478 226381 COG3863 YycO Uncharacterized protein YycO, NlpC/P60 family [Function unknown]. 231
3479 226382 COG3864 COG3864 Predicted metal-dependent peptidase [General function prediction only]. 396
3480 226383 COG3865 COG3865 Glyoxalase superfamily enzyme, possibly 3-demethylubiquinone-9 3-methyltransferase [General function prediction only]. 151
3481 226384 COG3866 PelB Pectate lyase [Carbohydrate transport and metabolism]. 345
3482 226385 COG3867 GanB Arabinogalactan endo-1,4-beta-galactosidase [Carbohydrate transport and metabolism]. 403
3483 226386 COG3868 COG3868 Predicted glycosyl hydrolase, GH114 family [Carbohydrate transport and metabolism]. 306
3484 226387 COG3869 McsB Protein-arginine kinase [Posttranslational modification, protein turnover, chaperones]. 352
3485 226388 COG3870 YaaQ Uncharacterized protein YaaQ, DUF970 family [Function unknown]. 109
3486 226389 COG3871 YzzA General stress protein 26 (function unknown) [Function unknown]. 145
3487 226390 COG3872 YqhQ Uncharacterized conserved protein YqhQ [Function unknown]. 318
3488 226391 COG3874 YtfJ Uncharacterized spore protein YtfJ [Function unknown]. 138
3489 226392 COG3875 LarA Nickel-dependent lactate racemase [Cell wall/membrane/envelope biogenesis]. 423
3490 226393 COG3876 YbbC Uncharacterized conserved protein YbbC, DUF1343 family [Function unknown]. 409
3491 226394 COG3877 COG3877 Uncharacterized protein, DUF2089 family [Function unknown]. 122
3492 226395 COG3878 YwqG Uncharacterized protein YwqG, DUF1963 family [Function unknown]. 261
3493 226396 COG3879 YlxW Uncharacterized conserved protein YlxW, UPF0749 family [Function unknown]. 247
3494 226397 COG3880 McsA Protein-arginine kinase activator protein McsA [Posttranslational modification, protein turnover, chaperones]. 176
3495 226398 COG3881 YrrD Uncharacterized protein YrrD, contains PRC-barrel domain [Function unknown]. 176
3496 226399 COG3882 FkbH Predicted enzyme involved in methoxymalonyl-ACP biosynthesis [Lipid transport and metabolism]. 574
3497 226400 COG3883 CwlO1 Uncharacterized N-terminal domain of peptidoglycan hydrolase CwlO [Function unknown]. 265
3498 226401 COG3884 FatA Acyl-ACP thioesterase [Lipid transport and metabolism]. 250
3499 226402 COG3885 COG3885 Aromatic ring-opening dioxygenase, LigB subunit [Secondary metabolites biosynthesis, transport and catabolism]. 261
3500 226403 COG3886 COG3886 HKD family nuclease [Replication, recombination and repair]. 198
3501 226404 COG3887 GdpP c-di-AMP phosphodiesterase, consists of a GGDEF-like and DHH domains [Signal transduction mechanisms]. 655
3502 226405 COG3888 COG3888 Predicted transcriptional regulator [Transcription]. 321
3503 226406 COG3889 COG3889 Predicted periplasmic protein [Function unknown]. 872
3504 226407 COG3890 ERG8 Phosphomevalonate kinase [Lipid transport and metabolism]. 337
3505 226408 COG3892 IolC Myo-inositol catabolism protein IolC [Carbohydrate transport and metabolism]. 310
3506 226409 COG3893 COG3893 Inactivated superfamily I helicase [Replication, recombination and repair]. 697
3507 226410 COG3894 COG3894 Uncharacterized 2Fe-2S and 4Fe-4S clusters-containing protein, contains DUF4445 domain [Function unknown]. 614
3508 226411 COG3895 MliC Membrane-bound inhibitor of C-type lysozyme [Cell wall/membrane/envelope biogenesis]. 112
3509 226412 COG3896 COG3896 Chloramphenicol 3-O-phosphotransferase [Defense mechanisms]. 205
3510 226413 COG3897 Nnt1 Predicted nicotinamide N-methylase [General function prediction only]. 218
3511 226414 COG3898 COG3898 Uncharacterized membrane-anchored protein [Function unknown]. 531
3512 226415 COG3899 COG3899 Predicted ATPase [General function prediction only]. 849
3513 226416 COG3900 COG3900 Predicted periplasmic protein [Function unknown]. 262
3514 226417 COG3901 NosR Regulator of nitric oxide reductase transcription [Transcription]. 482
3515 226418 COG3903 COG3903 Predicted ATPase [General function prediction only]. 414
3516 226419 COG3904 COG3904 Predicted periplasmic protein [Function unknown]. 245
3517 226420 COG3905 COG3905 Predicted transcriptional regulator [Transcription]. 83
3518 226421 COG3906 YrzB Uncharacterized protein YrzB, UPF0473 family [Function unknown]. 105
3519 226422 COG3907 COG3907 Membrane-associated enzyme, PAP2 (acid phosphatase) superfamily [General function prediction only]. 249
3520 226423 COG3908 COG3908 Uncharacterized protein [Function unknown]. 77
3521 226424 COG3909 CytC556 Cytochrome c556 [Energy production and conversion]. 147
3522 226425 COG3910 COG3910 Predicted ATPase [General function prediction only]. 233
3523 226426 COG3911 COG3911 Predicted ATPase [General function prediction only]. 183
3524 226427 COG3913 SciT Uncharacterized protein [Function unknown]. 227
3525 226428 COG3914 Spy Predicted O-linked N-acetylglucosamine transferase, SPINDLY family [Posttranslational modification, protein turnover, chaperones]. 620
3526 226429 COG3915 COG3915 Uncharacterized protein [Function unknown]. 155
3527 226430 COG3916 LasI N-acyl-L-homoserine lactone synthetase [Signal transduction mechanisms]. 209
3528 226431 COG3917 NahD 2-hydroxychromene-2-carboxylate isomerase [Secondary metabolites biosynthesis, transport and catabolism]. 203
3529 226432 COG3918 COG3918 Uncharacterized membrane protein [Function unknown]. 153
3530 226433 COG3919 COG3919 Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 415
3531 226434 COG3920 COG3920 Two-component sensor histidine kinase, HisKA and HATPase domains [Signal transduction mechanisms]. 221
3532 226435 COG3921 COG3921 Uncharacterized conserved protein [Function unknown]. 300
3533 226436 COG3923 PriC Primosomal replication protein N'' [Replication, recombination and repair]. 175
3534 226437 COG3924 YhdT Uncharacterized membrane protein YhdT [Function unknown]. 80
3535 226438 COG3925 FruA N-terminal domain of the phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 103
3536 226439 COG3926 ZliS Lysozyme family protein [General function prediction only]. 252
3537 226440 COG3930 COG3930 Uncharacterized protein [Function unknown]. 434
3538 226441 COG3931 HutG2 Predicted N-formylglutamate amidohydrolase [Amino acid transport and metabolism]. 263
3539 226442 COG3932 COG3932 Uncharacterized conserved protein [Function unknown]. 209
3540 226443 COG3933 LevR Transcriptional regulatory protein LevR, contains PRD, AAA+ and EIIA domains [Transcription]. 470
3541 226444 COG3934 COG3934 Endo-1,4-beta-mannosidase [Carbohydrate transport and metabolism]. 587
3542 226445 COG3935 DnaD DNA replication protein DnaD [Replication, recombination and repair]. 246
3543 226446 COG3936 YfiQ Membrane-bound acyltransferase YfiQ, involved in biofilm formation [Carbohydrate transport and metabolism]. 349
3544 226447 COG3937 PhaF Polyhydroxyalkanoate synthesis regulator phasin [Secondary metabolites biosynthesis, transport and catabolism, Signal transduction mechanisms]. 108
3545 226448 COG3938 PrdF Proline racemase [Amino acid transport and metabolism]. 341
3546 226449 COG3940 COG3940 Beta-xylosidase, GH43 family [Carbohydrate transport and metabolism]. 324
3547 226450 COG3941 HI1514 Phage tail tape-measure protein, controls tail length [Mobilome: prophages, transposons]. 633
3548 226451 COG3942 COG3942 Surface antigen [Cell wall/membrane/envelope biogenesis]. 173
3549 226452 COG3943 COG3943 Uncharacterized conserved protein [Function unknown]. 329
3550 226453 COG3944 YveK Capsular polysaccharide biosynthesis protein [Cell wall/membrane/envelope biogenesis]. 226
3551 226454 COG3945 COG3945 Hemerythrin-like domain [General function prediction only]. 189
3552 226455 COG3946 VirJ Type IV secretory pathway, VirJ component [Intracellular trafficking, secretion, and vesicular transport]. 456
3553 226456 COG3947 SAPR Two-component response regulator, SAPR family, consists of REC, wHTH and BTAD domains [Signal transduction mechanisms, Transcription]. 361
3554 226457 COG3948 COG3948 Phage-related baseplate assembly protein [Mobilome: prophages, transposons]. 306
3555 226458 COG3949 YkvI Uncharacterized membrane protein YkvI [Function unknown]. 349
3556 226459 COG3950 COG3950 Predicted ATP-binding protein involved in virulence [General function prediction only]. 440
3557 226460 COG3951 FlgJ1 Rod binding protein domain [Cell motility]. 166
3558 226461 COG3952 LABN Uncharacterized N-terminal domain of lipid-A-disaccharide synthase [General function prediction only]. 113
3559 226462 COG3953 SLT SLT domain protein [Mobilome: prophages, transposons]. 235
3560 226463 COG3954 PrkB Phosphoribulokinase [Carbohydrate transport and metabolism]. 289
3561 226464 COG3955 COG3955 Uncharacterized protein, DUF1919 family [Cell wall/membrane/envelope biogenesis]. 211
3562 226465 COG3956 YabN Uncharacterized conserved protein YabN, contains tetrapyrrole methylase and MazG-like pyrophosphatase domain [General function prediction only]. 488
3563 226466 COG3957 XFP Phosphoketolase [Carbohydrate transport and metabolism]. 793
3564 226467 COG3958 TktA2 Transketolase, C-terminal subunit [Carbohydrate transport and metabolism]. 312
3565 226468 COG3959 TktA1 Transketolase, N-terminal subunit [Carbohydrate transport and metabolism]. 243
3566 226469 COG3960 Gcl Glyoxylate carboligase [Secondary metabolites biosynthesis, transport and catabolism]. 592
3567 226470 COG3961 PDC1 TPP-dependent 2-oxoacid decarboxylase, includes indolepyruvate decarboxylase [Carbohydrate transport and metabolism, Coenzyme transport and metabolism, General function prediction only]. 557
3568 226471 COG3962 IolD TPP-dependent trihydroxycyclohexane-1,2-dione (THcHDO) dehydratase, myo-inositol metabolism [Carbohydrate transport and metabolism]. 617
3569 226472 COG3963 COG3963 Phospholipid N-methyltransferase [Lipid transport and metabolism]. 194
3570 226473 COG3964 COG3964 Predicted amidohydrolase [General function prediction only]. 386
3571 226474 COG3965 COG3965 Predicted Co/Zn/Cd cation transporter, cation efflux family [Inorganic ion transport and metabolism]. 314
3572 226475 COG3966 DltD Poly D-alanine transfer protein DltD, involved in esterification of teichoic acids [Cell wall/membrane/envelope biogenesis]. 415
3573 226476 COG3967 DltE Short-chain dehydrogenase involved in D-alanine esterification of teichoic acids [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 245
3574 226477 COG3968 GlnA3 Glutamine synthetase type III [Amino acid transport and metabolism]. 724
3575 226478 COG3969 YbdN Predicted phosphoadenosine phosphosulfate sulfurtransferase, contains C-terminal DUF3440 domain [General function prediction only]. 407
3576 226479 COG3970 COG3970 Fumarylacetoacetate (FAA) hydrolase family protein [General function prediction only]. 379
3577 226480 COG3971 MhpD 2-keto-4-pentenoate hydratase [Secondary metabolites biosynthesis, transport and catabolism]. 264
3578 226481 COG3972 COG3972 Superfamily I DNA and RNA helicases [Replication, recombination and repair]. 660
3579 226482 COG3973 HelD DNA helicase IV [Replication, recombination and repair]. 747
3580 226483 COG3975 COG3975 Predicted metalloprotease, contains C-terminal PDZ domain [General function prediction only]. 558
3581 226484 COG3976 COG3976 Uncharacterized protein, contains FMN-binding domain [General function prediction only]. 135
3582 226485 COG3977 AvtA Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase [Amino acid transport and metabolism]. 417
3583 226486 COG3978 IlvM Acetolactate synthase small subunit, contains ACT domain [Energy production and conversion]. 86
3584 226487 COG3979 COG3979 Chitodextrinase [Carbohydrate transport and metabolism]. 181
3585 226488 COG3980 SpsG Spore coat polysaccharide biosynthesis protein SpsG, predicted glycosyltransferase [Cell wall/membrane/envelope biogenesis]. 318
3586 226489 COG3981 COG3981 Predicted acetyltransferase [General function prediction only]. 174
3587 226490 COG4001 COG4001 Uncharacterized protein [Function unknown]. 102
3588 226491 COG4002 COG4002 Predicted methyltransferase MtxX, methanogen marker protein 4 [General function prediction only]. 256
3589 226492 COG4003 COG4003 Uncharacterized protein, DUF2095 family [Function unknown]. 98
3590 226493 COG4004 COG4004 Uncharacterized protein [Function unknown]. 96
3591 226494 COG4006 COG4006 CRISPR/Cas system-associated protein Csm6, COG1517 family [Defense mechanisms]. 278
3592 226495 COG4007 COG4007 Predicted dehydrogenase related to H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase [General function prediction only]. 340
3593 226496 COG4008 COG4008 Predicted metal-binding transcription factor, methanogenesis marker domain 9 [Transcription]. 153
3594 226497 COG4009 COG4009 Uncharacterized protein [Function unknown]. 88
3595 226498 COG4010 COG4010 Uncharacterized protein [Function unknown]. 170
3596 226499 COG4012 COG4012 Uncharacterized protein, DUF1786 family [Function unknown]. 342
3597 226500 COG4013 COG4013 Uncharacterized protein [Function unknown]. 91
3598 226501 COG4014 COG4014 Uncharacterized protein [Function unknown]. 97
3599 226502 COG4015 COG4015 Predicted dinucleotide-utilizing enzyme of the ThiF/HesA family [General function prediction only]. 217
3600 226503 COG4016 COG4016 Uncharacterized protein, UPF0254 family [Function unknown]. 165
3601 226504 COG4017 COG4017 Uncharacterized protein [Function unknown]. 254
3602 226505 COG4018 COG4018 Uncharacterized protein [Function unknown]. 505
3603 226506 COG4019 COG4019 Uncharacterized protein [Function unknown]. 156
3604 226507 COG4020 COG4020 Uncharacterized protein [Function unknown]. 332
3605 226508 COG4021 Thg1 tRNA(His) 5'-end guanylyltransferase [Translation, ribosomal structure and biogenesis]. 249
3606 226509 COG4022 COG4022 Uncharacterized protein [Function unknown]. 286
3607 226510 COG4023 SBH1 Preprotein translocase subunit Sec61beta [Intracellular trafficking, secretion, and vesicular transport]. 57
3608 226511 COG4024 COG4024 Uncharacterized protein [Function unknown]. 218
3609 226512 COG4025 COG4025 Uncharacterized membrane protein [Function unknown]. 284
3610 226513 COG4026 COG4026 Uncharacterized protein, contains TOPRIM domain, potential nuclease [General function prediction only]. 290
3611 226514 COG4027 COG4027 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase [Nucleotide transport and metabolism]. 194
3612 226515 COG4028 COG4028 Predicted P-loop ATPase/GTPase [General function prediction only]. 271
3613 226516 COG4029 COG4029 Uncharacterized protein [Function unknown]. 142
3614 226517 COG4030 COG4030 Predicted phosphohydrolase, HAD superfamily [General function prediction only]. 315
3615 226518 COG4031 COG4031 Uncharacterized protein, DUF2103 family [Function unknown]. 227
3616 226519 COG4032 COG4032 Sulfopyruvate decarboxylase, TPP-binding subunit (coenzyme M biosynthesis) [Coenzyme transport and metabolism]. 172
3617 226520 COG4033 COG4033 Uncharacterized protein [Function unknown]. 102
3618 226521 COG4034 COG4034 Uncharacterized protein [Function unknown]. 328
3619 226522 COG4035 COG4035 Uncharacterized membrane protein [Function unknown]. 108
3620 226523 COG4036 EhaG Energy-converting hydrogenase Eha subunit G [Energy production and conversion]. 224
3621 226524 COG4037 EhaF Energy-converting hydrogenase Eha subunit F [Energy production and conversion]. 163
3622 226525 COG4038 EhaE Energy-converting hydrogenase Eha subunit E [Energy production and conversion]. 87
3623 226526 COG4039 EhaC Energy-converting hydrogenase Eha subunit C [Energy production and conversion]. 86
3624 226527 COG4040 COG4040 Uncharacterized membrane protein [Function unknown]. 85
3625 226528 COG4041 EhaB Energy-converting hydrogenase Eha subunit B [Energy production and conversion]. 171
3626 226529 COG4042 EhaA Energy-converting hydrogenase Eha subunit A [Energy production and conversion]. 104
3627 226530 COG4043 ASCH ASC-1 homology (ASCH) domain, predicted RNA-binding domain [General function prediction only]. 111
3628 226531 COG4044 COG4044 Uncharacterized protein [Function unknown]. 247
3629 226532 COG4046 COG4046 Uncharacterized protein [Function unknown]. 368
3630 226533 COG4047 COG4047 N-glycosylase/DNA lyase [Replication, recombination and repair]. 243
3631 226534 COG4048 COG4048 Uncharacterized protein [Function unknown]. 123
3632 226535 COG4049 COG4049 Uncharacterized protein, contains archaeal-type C2H2 Zn-finger [General function prediction only]. 65
3633 226536 COG4050 COG4050 Uncharacterized protein [Function unknown]. 152
3634 226537 COG4051 COG4051 Uncharacterized protein [Function unknown]. 202
3635 226538 COG4052 COG4052 Uncharacterized protein related to methyl coenzyme M reductase subunit C, methanogenesis marker protein 7 [General function prediction only]. 310
3636 226539 COG4053 COG4053 Uncharacterized protein [Function unknown]. 244
3637 226540 COG4054 McrB Methyl coenzyme M reductase, beta subunit [Coenzyme transport and metabolism]. 447
3638 226541 COG4055 McrD Methyl coenzyme M reductase, subunit D [Coenzyme transport and metabolism]. 165
3639 226542 COG4056 McrC Methyl coenzyme M reductase, subunit C [Coenzyme transport and metabolism]. 204
3640 226543 COG4057 McrG Methyl coenzyme M reductase, gamma subunit [Coenzyme transport and metabolism]. 257
3641 226544 COG4058 McrA Methyl coenzyme M reductase, alpha subunit [Coenzyme transport and metabolism]. 553
3642 226545 COG4059 MtrE Tetrahydromethanopterin S-methyltransferase, subunit E [Coenzyme transport and metabolism]. 304
3643 226546 COG4060 MtrD Tetrahydromethanopterin S-methyltransferase, subunit D [Coenzyme transport and metabolism]. 230
3644 226547 COG4061 MtrC Tetrahydromethanopterin S-methyltransferase, subunit C [Coenzyme transport and metabolism]. 262
3645 226548 COG4062 MtrB Tetrahydromethanopterin S-methyltransferase, subunit B [Coenzyme transport and metabolism]. 108
3646 226549 COG4063 MtrA Tetrahydromethanopterin S-methyltransferase, subunit A [Coenzyme transport and metabolism]. 238
3647 226550 COG4064 MtrG Tetrahydromethanopterin S-methyltransferase, subunit G [Coenzyme transport and metabolism]. 75
3648 226551 COG4065 COG4065 Uncharacterized protein [Function unknown]. 480
3649 226552 COG4066 COG4066 Uncharacterized protein, UPF0305 family [Function unknown]. 165
3650 226553 COG4067 COG4067 Uncharacterized conserved protein [Function unknown]. 162
3651 226554 COG4068 COG4068 Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 64
3652 226555 COG4069 COG4069 Uncharacterized protein [Function unknown]. 367
3653 226556 COG4070 COG4070 Uncharacterized protein, methanogenesis marker protein 3, UPF0288 family [Function unknown]. 512
3654 226557 COG4071 COG4071 Uncharacterized protein, related to F420-0:gamma-glutamyl ligase [Function unknown]. 278
3655 226558 COG4072 COG4072 Uncharacterized protein [Function unknown]. 161
3656 226559 COG4073 COG4073 Uncharacterized protein [Function unknown]. 198
3657 226560 COG4074 Mth 5,10-methenyltetrahydromethanopterin hydrogenase [Energy production and conversion]. 343
3658 226561 COG4075 COG4075 Uncharacterized protein, distantly related to nitrogen regulatory protein PII [Function unknown]. 110
3659 226562 COG4076 COG4076 Predicted RNA methylase [General function prediction only]. 252
3660 226563 COG4077 COG4077 Uncharacterized protein [Function unknown]. 156
3661 226564 COG4078 EhaH Energy-converting hydrogenase Eha subunit H [Energy production and conversion]. 221
3662 226565 COG4079 COG4079 Uncharacterized protein [Function unknown]. 293
3663 226566 COG4080 COG4080 SpoU rRNA Methylase family enzyme [Translation, ribosomal structure and biogenesis]. 147
3664 226567 COG4081 COG4081 Uncharacterized protein [Function unknown]. 148
3665 226568 COG4083 COG4083 Exosortase/Archaeosortase [Replication, recombination and repair]. 239
3666 226569 COG4084 COG4084 Energy-converting hydrogenase A subunit M [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 135
3667 226570 COG4085 YhcR DNA/RNA endonuclease YhcR, contains UshA esterase domain [RNA processing and modification]. 204
3668 226571 COG4086 YpuA Uncharacterized protein YpuA, DUF1002 family [Function unknown]. 299
3669 226572 COG4087 COG4087 Soluble P-type ATPase [General function prediction only]. 152
3670 226573 COG4088 Kti12 tRNA Uridine 5-carbamoylmethylation protein Kti12 (Killer toxin insensitivity protein) [Translation, ribosomal structure and biogenesis]. 261
3671 226574 COG4089 COG4089 Uncharacterized membrane protein [Function unknown]. 235
3672 226575 COG4090 COG4090 Uncharacterized protein [Function unknown]. 154
3673 226576 COG4091 COG4091 Predicted homoserine dehydrogenase, contains C-terminal SAF domain [Amino acid transport and metabolism]. 438
3674 226577 COG4092 COG4092 Predicted glycosyltransferase involved in capsule biosynthesis [Cell wall/membrane/envelope biogenesis]. 346
3675 226578 COG4093 COG4093 Uncharacterized protein [Function unknown]. 338
3676 226579 COG4094 COG4094 Uncharacterized membrane protein [Function unknown]. 219
3677 226580 COG4095 SWEET Sugar transporter, SemiSWEET family, contains PQ motif [Carbohydrate transport and metabolism]. 89
3678 226581 COG4096 HsdR Type I site-specific restriction endonuclease, part of a restriction-modification system [Defense mechanisms]. 875
3679 226582 COG4097 COG4097 Predicted ferric reductase [Inorganic ion transport and metabolism]. 438
3680 226583 COG4098 comFA Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein) [Replication, recombination and repair]. 441
3681 226584 COG4099 COG4099 Predicted peptidase [General function prediction only]. 387
3682 226585 COG4100 YnbB Cystathionine beta-lyase family protein involved in aluminum resistance [Inorganic ion transport and metabolism, General function prediction only]. 416
3683 226586 COG4101 RmlC Uncharacterized protein, RmlC-like cupin domain [General function prediction only]. 142
3684 226587 COG4102 COG4102 Uncharacterized conserved protein, DUF1501 family [Function unknown]. 418
3685 226588 COG4103 TerB Uncharacterized conserved protein, tellurite resistance protein B (TerB) family [Function unknown]. 148
3686 226589 COG4104 PAAR Zn-binding Pro-Ala-Ala-Arg (PAAR) domain, involved in Type VI secretion [Intracellular trafficking, secretion, and vesicular transport]. 98
3687 226590 COG4105 BamD Outer membrane protein assembly factor BamD, BamD/ComL family [Cell wall/membrane/envelope biogenesis]. 254
3688 226591 COG4106 Tam Trans-aconitate methyltransferase [Energy production and conversion]. 257
3689 226592 COG4107 PhnK ABC-type phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 258
3690 226593 COG4108 PrfC Peptide chain release factor RF-3 [Translation, ribosomal structure and biogenesis]. 528
3691 226594 COG4109 YtoI Predicted transcriptional regulator containing CBS domains [Transcription]. 432
3692 226595 COG4110 TerA Uncharacterized protein involved in tellurium resistance [Defense mechanisms]. 200
3693 226596 COG4111 COG4111 Uncharacterized conserved protein [Function unknown]. 322
3694 226597 COG4112 YmaB Predicted phosphoesterase, NUDIX family [General function prediction only]. 203
3695 226598 COG4113 COG4113 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 134
3696 226599 COG4114 FhuF Ferric iron reductase protein FhuF, involved in iron transport [Inorganic ion transport and metabolism]. 251
3697 226600 COG4115 COG4115 Toxin component of the Txe-Axe toxin-antitoxin module, Txe/YoeB family [Defense mechanisms]. 84
3698 226601 COG4116 YjbK Predicted triphosphatase or cyclase YjbK, contains CYTH domain [General function prediction only]. 193
3699 226602 COG4117 YdhU Thiosulfate reductase cytochrome b subunit [Inorganic ion transport and metabolism]. 221
3700 226603 COG4118 Phd Antitoxin component of toxin-antitoxin stability system, DNA-binding transcriptional repressor [Defense mechanisms]. 84
3701 226604 COG4119 COG4119 Predicted NTP pyrophosphohydrolase, NUDIX family [Nucleotide transport and metabolism, General function prediction only]. 161
3702 226605 COG4120 COG4120 ABC-type uncharacterized transport system, permease component [General function prediction only]. 293
3703 226606 COG4121 MnmC tRNA U34 5-methylaminomethyl-2-thiouridine-forming methyltransferase MnmC [Translation, ribosomal structure and biogenesis]. 252
3704 226607 COG4122 YrrM Predicted O-methyltransferase YrrM [General function prediction only]. 219
3705 226608 COG4123 TrmN6 tRNA1(Val) A37 N6-methylase TrmN6 [Translation, ribosomal structure and biogenesis]. 248
3706 226609 COG4124 ManB2 Beta-mannanase [Carbohydrate transport and metabolism]. 355
3707 226610 COG4125 COG4125 Uncharacterized membrane protein [Function unknown]. 149
3708 226611 COG4126 Dcg1 Asp/Glu/hydantoin racemase [Amino acid transport and metabolism]. 230
3709 226612 COG4127 COG4127 Predicted restriction endonuclease, Mrr-cat superfamily [General function prediction only]. 318
3710 226613 COG4128 Zot Zona occludens toxin, predicted ATPase [General function prediction only]. 398
3711 226614 COG4129 YgaE Uncharacterized membrane protein YgaE, UPF0421/DUF939 family [Function unknown]. 332
3712 226615 COG4130 COG4130 Predicted sugar epimerase, xylose isomerase-like family [Carbohydrate transport and metabolism]. 272
3713 226616 COG4132 COG4132 ABC-type uncharacterized transport system, permease component [General function prediction only]. 282
3714 226617 COG4133 CcmA ABC-type transport system involved in cytochrome c biogenesis, ATPase component [Posttranslational modification, protein turnover, chaperones]. 209
3715 226618 COG4134 YnjB ABC-type uncharacterized transport system YnjBCD, periplasmic component [General function prediction only]. 384
3716 226619 COG4135 YnjC ABC-type uncharacterized transport system YnjBCD, permease component [General function prediction only]. 551
3717 226620 COG4136 YnjD ABC-type uncharacterized transport system YnjBCD, ATPase component [General function prediction only]. 213
3718 226621 COG4137 YpjD ABC-type uncharacterized transport system, permease component [General function prediction only]. 265
3719 226622 COG4138 BtuD ABC-type cobalamin transport system, ATPase component [Coenzyme transport and metabolism]. 248
3720 226623 COG4139 BtuC ABC-type cobalamin transport system, permease component [Coenzyme transport and metabolism]. 326
3721 226624 COG4143 TbpA ABC-type thiamine transport system, periplasmic component [Coenzyme transport and metabolism]. 336
3722 226625 COG4145 PanF Na+/pantothenate symporter [Coenzyme transport and metabolism]. 473
3723 226626 COG4146 YidK Uncharacterized membrane permease YidK, sodium:solute symporter family [General function prediction only]. 571
3724 226627 COG4147 ActP Na+(or H+)/acetate symporter ActP [Energy production and conversion]. 529
3725 226628 COG4148 ModC ABC-type molybdate transport system, ATPase component [Inorganic ion transport and metabolism]. 352
3726 226629 COG4149 ModC ABC-type molybdate transport system, permease component [Inorganic ion transport and metabolism]. 225
3727 226630 COG4150 CysP ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 341
3728 226631 COG4152 YhaQ ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 300
3729 226632 COG4154 FucU L-fucose mutarotase/ribose pyranase, RbsD/FucU family [Carbohydrate transport and metabolism]. 144
3730 226633 COG4158 COG4158 Predicted ABC-type sugar transport system, permease component [General function prediction only]. 329
3731 226634 COG4160 ArtM ABC-type arginine/histidine transport system, permease component [Amino acid transport and metabolism]. 228
3732 226635 COG4161 ArtP ABC-type arginine transport system, ATPase component [Amino acid transport and metabolism]. 242
3733 226636 COG4166 OppA ABC-type oligopeptide transport system, periplasmic component [Amino acid transport and metabolism]. 562
3734 226637 COG4167 SapF ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 267
3735 226638 COG4168 SapB ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 321
3736 226639 COG4170 SapD ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 330
3737 226640 COG4171 SapC ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 296
3738 226641 COG4172 YejF ABC-type microcin C transport system, duplicated ATPase component YejF [Secondary metabolites biosynthesis, transport and catabolism]. 534
3739 226642 COG4174 YejB ABC-type microcin C transport system, permease component YejB [Secondary metabolites biosynthesis, transport and catabolism]. 364
3740 226643 COG4175 ProV ABC-type proline/glycine betaine transport system, ATPase component [Amino acid transport and metabolism]. 386
3741 226644 COG4176 ProW ABC-type proline/glycine betaine transport system, permease component [Amino acid transport and metabolism]. 290
3742 226645 COG4177 LivM ABC-type branched-chain amino acid transport system, permease component [Amino acid transport and metabolism]. 314
3743 226646 COG4178 YddA ABC-type uncharacterized transport system, permease and ATPase components [General function prediction only]. 604
3744 226647 COG4181 YbbA Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, ATPase component [Secondary metabolites biosynthesis, transport and catabolism]. 228
3745 226648 COG4185 COG4185 Predicted ABC-type ATPase [General function prediction only]. 187
3746 226649 COG4186 COG4186 Calcineurin-like phosphoesterase superfamily protein [General function prediction only]. 186
3747 226650 COG4187 RocB Arginine utilization protein RocB [Amino acid transport and metabolism]. 553
3748 226651 COG4188 COG4188 Predicted dienelactone hydrolase [General function prediction only]. 365
3749 226652 COG4189 COG4189 Predicted transcriptional regulator [Transcription]. 308
3750 226653 COG4190 COG4190 Predicted transcriptional regulator [Transcription]. 144
3751 226654 COG4191 COG4191 Signal transduction histidine kinase regulating C4-dicarboxylate transport system [Signal transduction mechanisms]. 603
3752 226655 COG4192 COG4192 Signal transduction histidine kinase regulating phosphoglycerate transport system [Signal transduction mechanisms]. 673
3753 226656 COG4193 LytD Beta-N-acetylglucosaminidase [Carbohydrate transport and metabolism]. 245
3754 226657 COG4194 COG4194 Uncharacterized membrane protein, DUF1648 family [Function unknown]. 350
3755 226658 COG4195 YjqB Phage-related replication protein YjqB, UPF0714/DUF867 family [Mobilome: prophages, transposons]. 208
3756 226659 COG4196 COG4196 Uncharacterized conserved protein, DUF2126 family [Function unknown]. 808
3757 226660 COG4197 YdaS DNA-binding transcriptional regulator YdaS, prophage-encoded, Cro superfamily [Transcription]. 96
3758 226661 COG4198 COG4198 Uncharacterized conserved protein, DUF1015 family [Function unknown]. 405
3759 226662 COG4199 RecJ ssDNA-specific exonuclease RecJ [Replication, recombination and repair]. 201
3760 226663 COG4200 EfiE Predicted lantibiotic-exporting membrane permease, EfiE/EfiG/ABC2 family [Defense mechanisms]. 239
3761 226664 COG4206 BtuB Outer membrane cobalamin receptor protein [Coenzyme transport and metabolism]. 608
3762 226665 COG4208 CysW ABC-type sulfate transport system, permease component [Inorganic ion transport and metabolism]. 287
3763 226666 COG4209 LplB ABC-type polysaccharide transport system, permease component [Carbohydrate transport and metabolism]. 309
3764 226667 COG4211 MglC ABC-type glucose/galactose transport system, permease component [Carbohydrate transport and metabolism]. 336
3765 226668 COG4213 XylF ABC-type xylose transport system, periplasmic component [Carbohydrate transport and metabolism]. 341
3766 226669 COG4214 XylH ABC-type xylose transport system, permease component [Carbohydrate transport and metabolism]. 394
3767 226670 COG4215 ArtQ ABC-type arginine transport system, permease component [Amino acid transport and metabolism]. 230
3768 226671 COG4218 MtrF Tetrahydromethanopterin S-methyltransferase, subunit F [Coenzyme transport and metabolism]. 73
3769 226672 COG4219 MecR1 Signal transducer regulating beta-lactamase production, contains metallopeptidase domain [Signal transduction mechanisms]. 337
3770 226673 COG4220 Nu1 Phage DNA packaging protein, Nu1 subunit of terminase [Mobilome: prophages, transposons]. 174
3771 226674 COG4221 YdfG NADP-dependent 3-hydroxy acid dehydrogenase YdfG [Energy production and conversion]. 246
3772 226675 COG4222 COG4222 Uncharacterized conserved protein [Function unknown]. 391
3773 226676 COG4223 COG4223 Uncharacterized conserved protein [Function unknown]. 422
3774 226677 COG4224 YnzC Uncharacterized protein YnzC, UPF0291/DUF896 family [Function unknown]. 77
3775 226678 COG4225 YesR Rhamnogalacturonyl hydrolase YesR [Carbohydrate transport and metabolism]. 357
3776 226679 COG4226 HicB Predicted nuclease of the RNAse H fold, HicB family [General function prediction only]. 111
3777 226680 COG4227 ArdC Antirestriction protein ArdC [Replication, recombination and repair]. 316
3778 226681 COG4228 COG4228 Mu-like prophage DNA circulation protein [Mobilome: prophages, transposons]. 451
3779 226682 COG4229 Utr4 Enolase-phosphatase E1 involved in methionine salvage [Amino acid transport and metabolism]. 229
3780 226683 COG4230 PutA2 Delta 1-pyrroline-5-carboxylate dehydrogenase [Amino acid transport and metabolism]. 769
3781 226684 COG4231 IorA TPP-dependent indolepyruvate ferredoxin oxidoreductase, alpha subunit [Energy production and conversion]. 640
3782 226685 COG4232 DsbD Thiol:disulfide interchange protein [Posttranslational modification, protein turnover, chaperones]. 569
3783 226686 COG4233 COG4233 Thiol-disulfide interchange protein, contains DsbC and DsbD domains [Posttranslational modification, protein turnover, chaperones, Energy production and conversion]. 273
3784 226687 COG4235 NrfG Cytochrome c-type biogenesis protein CcmH/NrfG [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 287
3785 226688 COG4237 HyfE Hydrogenase-4 membrane subunit HyfE [Energy production and conversion]. 218
3786 226689 COG4238 Lpp Outer membrane murein-binding lipoprotein Lpp [Cell wall/membrane/envelope biogenesis]. 78
3787 226690 COG4239 YejE ABC-type microcin C transport system, permease component YejE [Secondary metabolites biosynthesis, transport and catabolism]. 341
3788 226691 COG4240 Tda10 Pantothenate kinase-related protein Tda10 (topoisomerase I damage affected protein) [General function prediction only]. 300
3789 226692 COG4241 YybS Uncharacterized conserved protein YybS, DUF2232 family [Function unknown]. 314
3790 226693 COG4242 CphB Cyanophycinase and related exopeptidases [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 293
3791 226694 COG4243 COG4243 Uncharacterized membrane protein [Function unknown]. 156
3792 226695 COG4244 COG4244 Uncharacterized membrane protein [Function unknown]. 160
3793 226696 COG4245 TerY Uncharacterized conserved protein YegL, contains vWA domain of TerY type [Function unknown]. 207
3794 226697 COG4246 COG4246 Uncharacterized protein [Function unknown]. 340
3795 226698 COG4247 Phy 3-phytase (myo-inositol-hexaphosphate 3-phosphohydrolase) [Lipid transport and metabolism]. 364
3796 226699 COG4248 YegI Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains [General function prediction only]. 637
3797 226700 COG4249 COG4249 Uncharacterized protein, contains caspase domain [General function prediction only]. 380
3798 226701 COG4250 DICT Sensory domain found in diguanylate cyclases and two-component systems (DICT domain) [Signal transduction mechanisms]. 226
3799 226702 COG4251 COG4251 Bacteriophytochrome (light-regulated signal transduction histidine kinase) [Signal transduction mechanisms]. 750
3800 226703 COG4252 CHASE2 Extracellular (periplasmic) sensor domain CHASE2 (specificity unknown) [Signal transduction mechanisms]. 400
3801 226704 COG4253 COG4253 Uncharacterized conserved protein, DUF2345 family [Function unknown]. 278
3802 226705 COG4254 COG4254 Uncharacterized conserved protein, contains LysM and FecR domains [General function prediction only]. 339
3803 226706 COG4255 COG4255 Uncharacterized protein [Function unknown]. 318
3804 226707 COG4256 HemP Hemin uptake protein HemP [Coenzyme transport and metabolism]. 63
3805 226708 COG4257 Vgb Streptogramin lyase [Defense mechanisms]. 353
3806 226709 COG4258 COG4258 Predicted exporter [General function prediction only]. 788
3807 226710 COG4259 COG4259 Uncharacterized protein [Function unknown]. 121
3808 226711 COG4260 YdjI Membrane protease subunit, stomatin/prohibitin family, contains C-terminal Zn-ribbon domain [Posttranslational modification, protein turnover, chaperones]. 345
3809 226712 COG4261 COG4261 Predicted acyltransferase, LPLAT superfamily [General function prediction only]. 309
3810 226713 COG4262 COG4262 Predicted spermidine synthase with an N-terminal membrane domain [General function prediction only]. 508
3811 226714 COG4263 NosZ Nitrous oxide reductase [Inorganic ion transport and metabolism]. 637
3812 226715 COG4264 RhbC Siderophore synthetase component [Inorganic ion transport and metabolism]. 602
3813 226716 COG4266 Alc Allantoicase [Nucleotide transport and metabolism]. 334
3814 226717 COG4267 COG4267 Uncharacterized membrane protein [Function unknown]. 467
3815 226718 COG4268 McrC 5-methylcytosine-specific restriction endonuclease McrBC, regulatory subunit McrC [Defense mechanisms]. 439
3816 226719 COG4269 YjgN Uncharacterized membrane protein YjgN, DUF898 family [Function unknown]. 364
3817 226720 COG4270 COG4270 Uncharacterized membrane protein [Function unknown]. 131
3818 226721 COG4271 COG4271 Predicted nucleotide-binding protein containing TIR-like domain [General function prediction only]. 233
3819 226722 COG4272 COG4272 Uncharacterized membrane protein [Function unknown]. 125
3820 226723 COG4273 COG4273 Uncharacterized protein, contains metal-binding DGC domain [Function unknown]. 135
3821 226724 COG4274 COG4274 Uncharacterized protein, contains GYD domain [Function unknown]. 104
3822 226725 COG4275 ChrB1 Chromate resistance protein ChrB1 [Inorganic ion transport and metabolism]. 143
3823 226726 COG4276 SRPBCC Ligand-binding SRPBCC domain [General function prediction only]. 153
3824 226727 COG4277 COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif [General function prediction only]. 404
3825 226728 COG4278 COG4278 Uncharacterized protein [Function unknown]. 269
3826 226729 COG4279 COG4279 Uncharacterized conserved protein, contains Zn finger domain [Function unknown]. 266
3827 226730 COG4280 COG4280 Uncharacterized membrane protein [Function unknown]. 236
3828 226731 COG4281 ACB Acyl-CoA-binding protein [Lipid transport and metabolism]. 87
3829 226732 COG4282 SMI1 Cell wall assembly regulator SMI1 [Cell wall/membrane/envelope biogenesis]. 191
3830 226733 COG4283 DinB Uncharacterized protein DinB, DUF1706 family [Function unknown]. 170
3831 226734 COG4284 QRI1 UDP-N-acetylglucosamine pyrophosphorylase [Carbohydrate transport and metabolism]. 472
3832 226735 COG4285 COG4285 Uncharacterized conserved protein, contains N-terminal glutamine amidotransferase (GATase1)-like domain [General function prediction only]. 253
3833 226736 COG4286 COG4286 Uncharacterized protein, UPF0160 family [Function unknown]. 306
3834 226737 COG4287 PqaA PhoPQ-activated pathogenicity-related protein [General function prediction only]. 507
3835 226738 COG4288 COG4288 Uncharacterized protein [Function unknown]. 124
3836 226739 COG4289 COG4289 Uncharacterized protein [Function unknown]. 458
3837 226740 COG4290 COG4290 Guanyl-specific ribonuclease Sa [Nucleotide transport and metabolism]. 152
3838 226741 COG4291 COG4291 Uncharacterized membrane protein [Function unknown]. 228
3839 226742 COG4292 LtrA Low temperature requirement protein LtrA (function unknown) [Function unknown]. 387
3840 226743 COG4293 COG4293 Uncharacterized protein, DUF1802 family [Function unknown]. 184
3841 226744 COG4294 Uve UV DNA damage repair endonuclease [Replication, recombination and repair]. 347
3842 226745 COG4295 COG4295 Uncharacterized protein [Function unknown]. 285
3843 226746 COG4296 COG4296 Uncharacterized protein [Function unknown]. 156
3844 226747 COG4297 YjlB Uncharacterized protein YjlB, cupin superfamily [Function unknown]. 163
3845 226748 COG4298 COG4298 Uncharacterized protein [Function unknown]. 95
3846 226749 COG4299 COG4299 Predicted acyltransferase [General function prediction only]. 371
3847 226750 COG4300 CadD Cadmium resistance protein CadD, predicted permease [Inorganic ion transport and metabolism]. 205
3848 226751 COG4301 COG4301 Uncharacterized conserved protein, contains predicted SAM-dependent methyltransferase domain [General function prediction only]. 321
3849 226752 COG4302 EutC Ethanolamine ammonia-lyase, small subunit [Amino acid transport and metabolism]. 294
3850 226753 COG4303 EutB Ethanolamine ammonia-lyase, large subunit [Amino acid transport and metabolism]. 453
3851 226754 COG4304 COG4304 Uncharacterized protein [Function unknown]. 166
3852 226755 COG4305 YoaJ Peptidoglycan-binding domain, expansin [Cell wall/membrane/envelope biogenesis]. 232
3853 226756 COG4306 COG4306 Uncharacterized protein [Function unknown]. 160
3854 226757 COG4307 COG4307 Uncharacterized protein, DUF2248 family [Function unknown]. 349
3855 226758 COG4308 LimA Limonene-1,2-epoxide hydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 130
3856 226759 COG4309 COG4309 Uncharacterized conserved protein, DUF2249 family [Function unknown]. 98
3857 226760 COG4310 COG4310 Uncharacterized protein, contains an aminopeptidase-like domain [General function prediction only]. 435
3858 226761 COG4311 SoxD Sarcosine oxidase delta subunit [Amino acid transport and metabolism]. 97
3859 226762 COG4312 COG4312 Predicted dithiol-disulfide oxidoreductase, DUF899 family [General function prediction only]. 247
3860 226763 COG4313 SphA Uncharacterized conserved protein [Function unknown]. 304
3861 226764 COG4314 NosL Nitrous oxide reductase accessory protein NosL [Inorganic ion transport and metabolism]. 176
3862 226765 COG4315 COG4315 Predicted lipoprotein with conserved Yx(FWY)xxD motif (function unknown) [Function unknown]. 138
3863 226766 COG4316 COG4316 Uncharacterized protein [Function unknown]. 138
3864 226767 COG4317 XapX Xanthosine utilization system component, XapX domain [Nucleotide transport and metabolism]. 93
3865 226768 COG4318 COG4318 Uncharacterized protein [Function unknown]. 221
3866 226769 COG4319 YybH Ketosteroid isomerase homolog [General function prediction only]. 137
3867 226770 COG4320 COG4320 Uncharacterized conserved protein, DUF2252 family [Function unknown]. 410
3868 226771 COG4321 COG4321 Predicted DNA-binding protein, contains Ribbon-helix-helix (RHH) domain [General function prediction only]. 102
3869 226772 COG4322 COG4322 Uncharacterized protein [Function unknown]. 304
3870 226773 COG4323 COG4323 Uncharacterized protein [Function unknown]. 105
3871 226774 COG4324 COG4324 Predicted aminopeptidase [General function prediction only]. 376
3872 226775 COG4325 COG4325 Uncharacterized membrane protein [Function unknown]. 464
3873 226776 COG4326 Spo0M Sporulation-control protein spo0M [Cell cycle control, cell division, chromosome partitioning]. 270
3874 226777 COG4327 COG4327 Uncharacterized membrane protein [Function unknown]. 101
3875 226778 COG4328 COG4328 Predicted nuclease (RNAse H fold) [General function prediction only]. 266
3876 226779 COG4329 COG4329 Uncharacterized membrane protein [Function unknown]. 160
3877 226780 COG4330 COG4330 Uncharacterized membrane protein [Function unknown]. 211
3878 226781 COG4331 COG4331 Uncharacterized membrane protein [Function unknown]. 167
3879 226782 COG4332 COG4332 Uncharacterized protein [Function unknown]. 203
3880 226783 COG4333 COG4333 Uncharacterized protein [Function unknown]. 167
3881 226784 COG4334 COG4334 Uncharacterized protein [Function unknown]. 131
3882 226785 COG4335 AlkC 3-methyladenine DNA glycosylase AlkC [Replication, recombination and repair]. 167
3883 226786 COG4336 YcsI Uncharacterized protein YcsI, UPF0317 family [Function unknown]. 265
3884 226787 COG4337 COG4337 Uncharacterized protein [Function unknown]. 206
3885 226788 COG4338 COG4338 Uncharacterized protein, DUF2256 family [Function unknown]. 54
3886 226789 COG4339 COG4339 Predicted metal-dependent phosphohydrolase, HD superfamily [General function prediction only]. 208
3887 226790 COG4340 COG4340 Predicted dioxygenase, 2-oxoglutarate and Fe-dependent (2OG-Fe) dioxygenase superfamily [General function prediction only]. 226
3888 226791 COG4341 COG4341 Predicted HD phosphohydrolase [General function prediction only]. 186
3889 226792 COG4342 COG4342 Integrase/Recombinase [Mobilome: prophages, transposons]. 291
3890 226793 COG4343 Cas4 CRISPR/Cas system-associated exonuclease Cas4, RecB family [Defense mechanisms]. 281
3891 226794 COG4344 COG4344 Predicted transcriptional regulator, contains HTH domain [Transcription]. 175
3892 226795 COG4345 COG4345 Uncharacterized protein [Function unknown]. 181
3893 226796 COG4346 COG4346 Predicted membrane-bound dolichyl-phosphate-mannose-protein mannosyltransferase [Posttranslational modification, protein turnover, chaperones]. 438
3894 226797 COG4347 YpjA Uncharacterized membrane protein YpjA [Function unknown]. 200
3895 226798 COG4352 RPL13 Ribosomal protein L13E [Translation, ribosomal structure and biogenesis]. 113
3896 226799 COG4353 COG4353 Uncharacterized protein [Function unknown]. 192
3897 226800 COG4354 COG4354 Uncharacterized protein, contains GBA2_N and DUF608 domains [Function unknown]. 721
3898 226801 COG4357 COG4357 Uncharacterized protein, contains Zn-finger domain of CHY type [Function unknown]. 105
3899 226802 COG4359 MtnX 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase (methionine salvage) [Amino acid transport and metabolism]. 220
3900 226803 COG4360 APA2 ATP adenylyltransferase (5',5'''-P-1,P-4-tetraphosphate phosphorylase II) [Nucleotide transport and metabolism]. 298
3901 226804 COG4362 COG4362 Nitric oxide synthase, oxygenase domain [Inorganic ion transport and metabolism]. 355
3902 226805 COG4365 YllA Uncharacterized protein YllA, UPF0747 family [Function unknown]. 537
3903 226806 COG4367 COG4367 Uncharacterized protein [Function unknown]. 97
3904 226807 COG4370 COG4370 Uncharacterized protein [Function unknown]. 412
3905 226808 COG4371 COG4371 Uncharacterized membrane protein [Function unknown]. 334
3906 226809 COG4372 COG4372 Uncharacterized conserved protein, contains DUF3084 domain [Function unknown]. 499
3907 226810 COG4373 COG4373 Mu-like prophage FluMu protein gp28 [Mobilome: prophages, transposons]. 509
3908 226811 COG4374 COG4374 PIN domain nuclease, a component of toxin-antitoxin system (PIN domain) [Defense mechanisms]. 130
3909 226812 COG4377 YhfC Uncharacterized membrane protein YhfC [Function unknown]. 258
3910 226813 COG4378 COG4378 Uncharacterized protein [Function unknown]. 103
3911 226814 COG4379 COG4379 Mu-like prophage tail protein gpP [Mobilome: prophages, transposons]. 386
3912 226815 COG4380 COG4380 Uncharacterized protein [Function unknown]. 216
3913 226816 COG4381 gp46 Mu-like prophage protein gp46 [Mobilome: prophages, transposons]. 135
3914 226817 COG4382 gp16 Mu-like prophage protein gp16 [Mobilome: prophages, transposons]. 170
3915 226818 COG4383 gp29 Mu-like prophage protein gp29 [Mobilome: prophages, transposons]. 517
3916 226819 COG4384 gp45 Mu-like prophage protein gp45 [Mobilome: prophages, transposons]. 203
3917 226820 COG4385 gpI Bacteriophage P2-related tail formation protein [Mobilome: prophages, transposons]. 206
3918 226821 COG4386 COG4386 Mu-like prophage tail sheath protein gpL [Mobilome: prophages, transposons]. 487
3919 226822 COG4387 gp436 Mu-like prophage protein gp36 [Mobilome: prophages, transposons]. 139
3920 226823 COG4388 COG4388 Mu-like prophage I protein [Mobilome: prophages, transposons]. 357
3921 226824 COG4389 COG4389 Site-specific recombinase [Replication, recombination and repair]. 677
3922 226825 COG4390 COG4390 Uncharacterized protein [Function unknown]. 106
3923 226826 COG4391 COG4391 Uncharacterized conserved protein, contains Zn-finger domain [Function unknown]. 62
3924 226827 COG4392 AzlD2 Branched-chain amino acid transport protein [Amino acid transport and metabolism]. 107
3925 226828 COG4393 COG4393 Uncharacterized membrane protein [Function unknown]. 405
3926 226829 COG4394 EarP Elongation-Factor P (EF-P) rhamnosyltransferase EarP [Translation, ribosomal structure and biogenesis]. 370
3927 226830 COG4395 Tim44 Predicted lipid-binding transport protein, Tim44 family [Lipid transport and metabolism]. 281
3928 226831 COG4396 COG4396 Mu-like prophage host-nuclease inhibitor protein Gam [Mobilome: prophages, transposons]. 170
3929 226832 COG4397 COG4397 Mu-like prophage major head subunit gpT [Mobilome: prophages, transposons]. 308
3930 226833 COG4398 FIST Small ligand-binding sensory domain FIST [Signal transduction mechanisms]. 389
3931 226834 COG4399 YheB Uncharacterized membrane protein YheB, UPF0754 family [Function unknown]. 376
3932 226835 COG4401 AroH Chorismate mutase [Amino acid transport and metabolism]. 125
3933 226836 COG4402 COG4402 Uncharacterized protein [Function unknown]. 457
3934 226837 COG4403 LcnDR2 Lantibiotic modifying enzyme [Defense mechanisms]. 963
3935 226838 COG4405 YhfF Predicted RNA-binding protein YhfF, contains PUA-like ASCH domain [General function prediction only]. 140
3936 226839 COG4408 COG4408 Uncharacterized protein [Function unknown]. 431
3937 226840 COG4409 NanH Neuraminidase (sialidase) [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 728
3938 226841 COG4412 COG4412 Bacillopeptidase F, M6 metalloprotease family [Posttranslational modification, protein turnover, chaperones]. 760
3939 226842 COG4413 Utp Urea transporter [Amino acid transport and metabolism]. 319
3940 226843 COG4416 Com Mu-like prophage FluMu protein Com [Mobilome: prophages, transposons]. 60
3941 226844 COG4420 COG4420 Uncharacterized membrane protein [Function unknown]. 191
3942 226845 COG4421 COG4421 Capsular polysaccharide biosynthesis protein [Cell wall/membrane/envelope biogenesis]. 368
3943 226846 COG4422 COG4422 Bacteriophage protein gp37 [Mobilome: prophages, transposons]. 250
3944 226847 COG4423 COG4423 Uncharacterized protein [Function unknown]. 81
3945 226848 COG4424 LpsS LPS sulfotransferase NodH [Cell wall/membrane/envelope biogenesis]. 250
3946 226849 COG4425 COG4425 Uncharacterized membrane protein [Function unknown]. 588
3947 226850 COG4427 COG4427 Uncharacterized protein [Function unknown]. 350
3948 226851 COG4430 YdeI Uncharacterized conserved protein YdeI, YjbR/CyaY-like superfamily, DUF1801 family [Function unknown]. 200
3949 226852 COG4443 COG4443 Uncharacterized protein [Function unknown]. 72
3950 226853 COG4445 MiaE tRNA isopentenyl-2-thiomethyl-A-37 hydroxylase MiaE (synthesis of 2-methylthio-cis-ribozeatin) [Translation, ribosomal structure and biogenesis]. 203
3951 226854 COG4446 COG4446 Uncharacterized conserved protein, DUF1499 family [Function unknown]. 141
3952 226855 COG4447 COG4447 Uncharacterized protein related to plant photosystem II stability/assembly factor [General function prediction only]. 339
3953 226856 COG4448 AnsA2 L-asparaginase II [Amino acid transport and metabolism]. 339
3954 226857 COG4449 COG4449 Predicted protease, Abi (CAAX) family [General function prediction only]. 827
3955 226858 COG4451 RbcS Ribulose bisphosphate carboxylase small subunit [Carbohydrate transport and metabolism]. 127
3956 226859 COG4452 CreD Inner membrane protein involved in colicin E2 resistance [Defense mechanisms]. 443
3957 226860 COG4453 COG4453 Uncharacterized conserved protein, DUF1778 family [Function unknown]. 95
3958 226861 COG4454 COG4454 Uncharacterized copper-binding protein, cupredoxin-like subfamily [General function prediction only]. 158
3959 226862 COG4455 ImpE Protein of avirulence locus involved in temperature-dependent protein secretion [General function prediction only]. 273
3960 226863 COG4456 VagC Virulence-associated protein VagC (function unknown) [Function unknown]. 74
3961 226864 COG4457 SrfB Uncharacterized protein [Function unknown]. 1014
3962 226865 COG4458 SrfC Uncharacterized protein [Function unknown]. 821
3963 226866 COG4459 NapE Periplasmic nitrate reductase system, NapE component [Energy production and conversion]. 62
3964 226867 COG4460 COG4460 Uncharacterized protein [Function unknown]. 130
3965 226868 COG4461 LprI Uncharacterized protein LprI [Function unknown]. 185
3966 226869 COG4463 CtsR Transcriptional regulator CtsR [Transcription]. 153
3967 226870 COG4464 YwqE Tyrosine-protein phosphatase YwqE [Signal transduction mechanisms]. 254
3968 226871 COG4465 CodY GTP-sensing pleiotropic transcriptional regulator CodY [Transcription]. 261
3969 226872 COG4466 COG4466 Uncharacterized protein Veg, DUF1021 family [Function unknown]. 80
3970 226873 COG4467 YabA Regulator of replication initiation timing [Replication, recombination and repair]. 114
3971 226874 COG4468 GalT2 Galactose-1-phosphate uridylyltransferase [Carbohydrate transport and metabolism]. 503
3972 226875 COG4469 CoiA Competence protein CoiA-like family, contains a predicted nuclease domain [General function prediction only]. 342
3973 226876 COG4470 YutD Uncharacterized protein YutD, DUF1027 family [Function unknown]. 126
3974 226877 COG4471 YlbG Uncharacterized protein YlbG, UPF0298 family [Function unknown]. 90
3975 226878 COG4472 IreB-like IreB family regulatory phosphoprotein. IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135 88
3976 226879 COG4473 EcsB Predicted ABC-type exoprotein transport system, permease component [Intracellular trafficking, secretion, and vesicular transport]. 379
3977 226880 COG4474 YoqJ Uncharacterized SPBc2 prophage-derived protein YoqJ [Mobilome: prophages, transposons]. 180
3978 226881 COG4475 YwlG Uncharacterized protein YwlG, UPF0340 family [Function unknown]. 180
3979 226882 COG4476 YktA Uncharacterized protein YktA, UPF0223 family [Function unknown]. 90
3980 226883 COG4477 EzrA Septation ring formation regulator EzrA [Cell cycle control, cell division, chromosome partitioning]. 570
3981 226884 COG4478 COG4478 Uncharacterized membrane protein [Function unknown]. 210
3982 226885 COG4479 YozE Uncharacterized protein YozE, UPF0346 family [Function unknown]. 74
3983 226886 COG4481 COG4481 Uncharacterized protein, DUF951 family [Function unknown]. 60
3984 226887 COG4483 YqgQ Uncharacterized protein YqgQ, DUF910 family [Function unknown]. 68
3985 226888 COG4485 YfhO Uncharacterized membrane protein YfhO [Function unknown]. 858
3986 226889 COG4487 COG4487 Uncharacterized protein, contains DUF2130 domain [Function unknown]. 438
3987 226890 COG4492 PheB ACT domain-containing protein [General function prediction only]. 150
3988 226891 COG4493 YktB Uncharacterized protein YktB, UPF0637 family [Function unknown]. 209
3989 226892 COG4495 COG4495 Uncharacterized protein [Function unknown]. 109
3990 226893 COG4496 YerC Predicted DNA-binding transcriptional regulator YerC, contains ArsR-like HTH domain [General function prediction only]. 100
3991 226894 COG4499 YukC Uncharacterized membrane protein YukC [Function unknown]. 434
3992 226895 COG4502 YorC 5'(3')-deoxyribonucleotidase [Nucleotide transport and metabolism]. 180
3993 226896 COG4506 YwiB Uncharacterized beta-barrel protein YwiB, DUF1934 family [Function unknown]. 143
3994 226897 COG4508 Dut2 Dimeric dUTPase, all-alpha-NTP-PPase (MazG) superfamily [Nucleotide transport and metabolism]. 161
3995 226898 COG4509 SrtB class B sortase (surface protein transpeptidase) [Cell wall/membrane/envelope biogenesis]. 244
3996 226899 COG4512 AgrB Accessory gene regulator protein AgrB [Transcription, Signal transduction mechanisms]. 198
3997 226900 COG4517 COG4517 Uncharacterized protein [Function unknown]. 109
3998 226901 COG4518 gp41 Mu-like prophage FluMu protein gp41 [Mobilome: prophages, transposons]. 122
3999 226902 COG4519 COG4519 Uncharacterized protein [Function unknown]. 95
4000 226903 COG4520 LipA17 Surface antigen [Cell wall/membrane/envelope biogenesis]. 136
4001 226904 COG4521 TauA ABC-type taurine transport system, periplasmic component [Inorganic ion transport and metabolism]. 334
4002 226905 COG4525 TauB ABC-type taurine transport system, ATPase component [Inorganic ion transport and metabolism]. 259
4003 226906 COG4529 YdhS Uncharacterized NAD(P)/FAD-binding protein YdhS [General function prediction only]. 474
4004 226907 COG4530 COG4530 Uncharacterized protein [Function unknown]. 129
4005 226908 COG4531 ZnuA ABC-type Zn2+ transport system, periplasmic component/surface adhesin [Inorganic ion transport and metabolism]. 318
4006 226909 COG4533 SgrR DNA-binding transcriptional regulator SgrR of sgrS sRNA, contains a MarR-type HTH domain and a periplasmic-type solute-binding domain [Transcription]. 564
4007 226910 COG4535 CorC Mg2+ and Co2+ transporter CorC, contains CBS pair and CorC-HlyC domains [Inorganic ion transport and metabolism]. 293
4008 226911 COG4536 CorB Mg2+ and Co2+ transporter CorB, contains DUF21, CBS pair, and CorC-HlyC domains [Inorganic ion transport and metabolism]. 423
4009 226912 COG4537 ComGC Competence protein ComGC [Mobilome: prophages, transposons]. 107
4010 226913 COG4538 COG4538 Uncharacterized protein [Function unknown]. 112
4011 226914 COG4539 COG4539 Uncharacterized membrane protein YGL010W [Function unknown]. 180
4012 226915 COG4540 gpV Phage P2 baseplate assembly protein gpV [Mobilome: prophages, transposons]. 184
4013 226916 COG4541 COG4541 Uncharacterized membrane protein [Function unknown]. 100
4014 226917 COG4542 PduX Protein involved in propanediol utilization, and related proteins (includes coumermycin biosynthetic... [Secondary metabolites biosynthesis, transport and catabolism]. 293
4015 226918 COG4544 COG4544 Uncharacterized conserved protein [Function unknown]. 260
4016 226919 COG4545 COG4545 Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 85
4017 226920 COG4547 CobT2 Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphorib... [Coenzyme transport and metabolism]. 620
4018 226921 COG4548 NorD Nitric oxide reductase activation protein [Inorganic ion transport and metabolism]. 637
4019 226922 COG4549 YcnI Uncharacterized protein YcnI, contains cohesin/reeler-like domain [Function unknown]. 178
4020 226923 COG4550 YmcA Cell fate regulator YmcA, YheA/YmcA/DUF963 family (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 120
4021 226924 COG4551 COG4551 Predicted protein tyrosine phosphatase [General function prediction only]. 109
4022 226925 COG4552 Eis Predicted acetyltransferase [General function prediction only]. 389
4023 226926 COG4553 DepA Poly-beta-hydroxyalkanoate depolymerase [Lipid transport and metabolism]. 415
4024 226927 COG4555 NatA ABC-type Na+ transport system, ATPase component NatA [Energy production and conversion, Inorganic ion transport and metabolism]. 245
4025 226928 COG4558 ChuT ABC-type hemin transport system, periplasmic component [Inorganic ion transport and metabolism]. 300
4026 226929 COG4559 COG4559 ABC-type hemin transport system, ATPase component [Inorganic ion transport and metabolism]. 259
4027 226930 COG4564 COG4564 Signal transduction histidine kinase [Signal transduction mechanisms]. 459
4028 226931 COG4565 CitB Response regulator of citrate/malate metabolism [Transcription, Signal transduction mechanisms]. 224
4029 226932 COG4566 FixJ Two-component response regulator, FixJ family, consists of REC and HTH domains [Signal transduction mechanisms, Transcription]. 202
4030 226933 COG4567 COG4567 Two-component response regulator, ActR/RegA family, consists of REC and Fis-type HTH domains [Signal transduction mechanisms, Transcription]. 182
4031 226934 COG4568 Rof Transcriptional antiterminator Rof (Rho-off) [Transcription]. 84
4032 226935 COG4569 MhpF Acetaldehyde dehydrogenase (acetylating) [Secondary metabolites biosynthesis, transport and catabolism]. 310
4033 226936 COG4570 RusA Holliday junction resolvase RusA (prophage-encoded endonuclease) [Replication, recombination and repair]. 132
4034 226937 COG4571 OmpT Outer membrane protease [Cell wall/membrane/envelope biogenesis]. 314
4035 226938 COG4572 ChaB Cation transport regulator ChaB [Inorganic ion transport and metabolism]. 76
4036 226939 COG4573 GatZ Tagatose-1,6-bisphosphate aldolase non-catalytic subunit AgaZ/GatZ [Carbohydrate transport and metabolism]. 426
4037 226940 COG4574 Eco Serine protease inhibitor ecotin [Posttranslational modification, protein turnover, chaperones]. 162
4038 226941 COG4575 ElaB Membrane-anchored ribosome-binding protein, inhibits growth in stationary phase, ElaB/YqjD/DUF883 family [Translation, ribosomal structure and biogenesis]. 104
4039 226942 COG4576 CcmL Carboxysome shell and ethanolamine utilization microcompartment protein CcmK/EutM [Secondary metabolites biosynthesis, transport and catabolism, Energy production and conversion]. 89
4040 226943 COG4577 CcmK Carboxysome shell and ethanolamine utilization microcompartment protein CcmL/EutN [Secondary metabolites biosynthesis, transport and catabolism, Energy production and conversion]. 150
4041 226944 COG4578 GutM DNA-binding transcriptional regulator of glucitol operon [Transcription]. 128
4042 226945 COG4579 AceK Isocitrate dehydrogenase kinase/phosphatase [Signal transduction mechanisms]. 578
4043 226946 COG4580 LamB Maltoporin (phage lambda and maltose receptor) [Carbohydrate transport and metabolism]. 429
4044 226947 COG4581 Dob10 Superfamily II RNA helicase [Replication, recombination and repair]. 1041
4045 226948 COG4582 ZapD Cell division protein ZapD, interacts with FtsZ [Cell cycle control, cell division, chromosome partitioning]. 244
4046 226949 COG4583 SoxG Sarcosine oxidase gamma subunit [Amino acid transport and metabolism]. 189
4047 226950 COG4584 COG4584 Transposase [Mobilome: prophages, transposons]. 278
4048 226951 COG4585 COG4585 Signal transduction histidine kinase [Signal transduction mechanisms]. 365
4049 226952 COG4586 COG4586 ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 325
4050 226953 COG4587 COG4587 ABC-type uncharacterized transport system, permease component [General function prediction only]. 268
4051 226954 COG4588 AcfC Accessory colonization factor AcfC, contains ABC-type periplasmic domain [Cell wall/membrane/envelope biogenesis]. 252
4052 226955 COG4589 YnbB CDP-diglyceride synthetase [Lipid transport and metabolism]. 303
4053 226956 COG4590 COG4590 ABC-type uncharacterized transport system, permease component [General function prediction only]. 733
4054 226957 COG4591 LolE ABC-type transport system, involved in lipoprotein release, permease component [Cell wall/membrane/envelope biogenesis]. 408
4055 226958 COG4592 FepB ABC-type Fe2+-enterobactin transport system, periplasmic component [Inorganic ion transport and metabolism]. 319
4056 226959 COG4594 FecB ABC-type Fe3+-citrate transport system, periplasmic component [Inorganic ion transport and metabolism]. 310
4057 226960 COG4597 BatB ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 397
4058 226961 COG4598 HisP ABC-type histidine transport system, ATPase component [Amino acid transport and metabolism]. 256
4059 226962 COG4603 COG4603 ABC-type uncharacterized transport system, permease component [General function prediction only]. 356
4060 226963 COG4604 CeuD ABC-type enterochelin transport system, ATPase component [Inorganic ion transport and metabolism]. 252
4061 226964 COG4605 CeuC ABC-type enterochelin transport system, permease component [Inorganic ion transport and metabolism]. 316
4062 226965 COG4606 CeuB ABC-type enterochelin transport system, permease component [Inorganic ion transport and metabolism]. 321
4063 226966 COG4607 CeuA ABC-type enterochelin transport system, periplasmic component [Inorganic ion transport and metabolism]. 320
4064 226967 COG4608 AppF ABC-type oligopeptide transport system, ATPase component [Amino acid transport and metabolism]. 268
4065 226968 COG4615 PvdE ABC-type siderophore export system, fused ATPase and permease components [Inorganic ion transport and metabolism]. 546
4066 226969 COG4618 ArpD ABC-type protease/lipase transport system, ATPase and permease components [Intracellular trafficking, secretion, and vesicular transport]. 580
4067 226970 COG4619 FetA ABC-type iron transport system FetAB, ATPase component [Inorganic ion transport and metabolism]. 223
4068 226971 COG4623 MltF Membrane-bound lytic murein transglycosylase MltF [Cell wall/membrane/envelope biogenesis, Signal transduction mechanisms]. 473
4069 226972 COG4624 Nar1 Iron only hydrogenase large subunit, C-terminal domain [Energy production and conversion]. 411
4070 226973 COG4625 COG4625 Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown]. 577
4071 226974 COG4626 YmfN Phage terminase-like protein, large subunit, contains N-terminal HTH domain [Mobilome: prophages, transposons]. 546
4072 226975 COG4627 COG4627 Predicted SAM-dependent methyltransferase [General function prediction only]. 185
4073 226976 COG4628 COG4628 Uncharacterized conserved protein, DUF2132 family [Function unknown]. 136
4074 226977 COG4630 XdhA Xanthine dehydrogenase, Fe-S cluster and FAD-binding subunit XdhA [Nucleotide transport and metabolism]. 493
4075 226978 COG4631 XdhB Xanthine dehydrogenase, molybdopterin-binding subunit XdhB [Nucleotide transport and metabolism]. 781
4076 226979 COG4632 EpsL Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acety... [Carbohydrate transport and metabolism]. 320
4077 226980 COG4633 COG4633 Plastocyanin domain containing protein [General function prediction only]. 272
4078 226981 COG4634 COG4634 Predicted nuclease, contains PIN domain, potential toxin-antitoxin system component [General function prediction only]. 113
4079 226982 COG4635 HemG Protoporphyrinogen IX oxidase, menaquinone-dependent (flavodoxin domain) [Coenzyme transport and metabolism]. 175
4080 226983 COG4636 Uma2 Endonuclease, Uma2 family (restriction endonuclease fold) [General function prediction only]. 200
4081 226984 COG4637 COG4637 Predicted ATPase [General function prediction only]. 373
4082 226985 COG4638 HcaE Phenylpropionate dioxygenase or related ring-hydroxylating dioxygenase, large terminal subunit [Inorganic ion transport and metabolism, General function prediction only]. 367
4083 226986 COG4639 COG4639 Predicted kinase [General function prediction only]. 168
4084 226987 COG4640 YvbJ Uncharacterized membrane protein YvbJ [Function unknown]. 465
4085 226988 COG4641 COG4641 Spore maturation protein CgeB [Cell cycle control, cell division, chromosome partitioning]. 373
4086 226989 COG4642 COG4642 Uncharacterized conserved protein [Function unknown]. 139
4087 226990 COG4643 COG4643 Uncharacterized domain associated with phage/plasmid primase [Mobilome: prophages, transposons]. 366
4088 226991 COG4644 COG4644 Transposase and inactivated derivatives, TnpA family [Mobilome: prophages, transposons]. 323
4089 226992 COG4645 OpgC Predicted acyltransferase [General function prediction only]. 410
4090 226993 COG4646 COG4646 Adenine-specific DNA methylase, N12 class [Replication, recombination and repair]. 637
4091 226994 COG4647 AcxC Acetone carboxylase, gamma subunit [Secondary metabolites biosynthesis, transport and catabolism]. 165
4092 226995 COG4648 COG4648 Uncharacterized membrane protein [Function unknown]. 201
4093 226996 COG4649 COG4649 Uncharacterized protein [Function unknown]. 221
4094 226997 COG4650 RtcR Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain [Transcription, Signal transduction mechanisms]. 531
4095 226998 COG4651 RosB Predicted Kef-type K+ transport protein, K+/H+ antiporter domain [Inorganic ion transport and metabolism]. 408
4096 226999 COG4652 COG4652 Uncharacterized protein [Function unknown]. 657
4097 227000 COG4653 COG4653 Predicted phage phi-C31 gp36 major capsid-like protein [Mobilome: prophages, transposons]. 422
4098 227001 COG4654 CytC552 Cytochrome c551/c552 [Energy production and conversion]. 110
4099 227002 COG4655 COG4655 Uncharacterized membrane protein [Function unknown]. 565
4100 227003 COG4656 RnfC Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfC subunit [Energy production and conversion]. 529
4101 227004 COG4657 RnfA Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfA subunit [Energy production and conversion]. 193
4102 227005 COG4658 RnfD Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfD subunit [Energy production and conversion]. 338
4103 227006 COG4659 RnfG Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfG subunit [Energy production and conversion]. 195
4104 227007 COG4660 RnfE Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfE subunit [Energy production and conversion]. 212
4105 227008 COG4662 TupA ABC-type tungstate transport system, periplasmic component [Inorganic ion transport and metabolism]. 227
4106 227009 COG4663 FcbT1 TRAP-type mannitol/chloroaromatic compound transport system, periplasmic component [Secondary metabolites biosynthesis, transport and catabolism]. 363
4107 227010 COG4664 FcbT3 TRAP-type mannitol/chloroaromatic compound transport system, large permease component [Secondary metabolites biosynthesis, transport and catabolism]. 447
4108 227011 COG4665 FcbT2 TRAP-type mannitol/chloroaromatic compound transport system, small permease component [Secondary metabolites biosynthesis, transport and catabolism]. 182
4109 227012 COG4666 COG4666 TRAP-type uncharacterized transport system, fused permease components [General function prediction only]. 642
4110 227013 COG4667 YjjU Predicted phospholipase, patatin/cPLA2 family [Lipid transport and metabolism]. 292
4111 227014 COG4668 MtlA2 Mannitol/fructose-specific phosphotransferase system, IIA domain [Carbohydrate transport and metabolism]. 142
4112 227015 COG4669 EscJ Type III secretory pathway, lipoprotein EscJ [Intracellular trafficking, secretion, and vesicular transport]. 246
4113 227016 COG4670 YdiF Acyl CoA:acetate/3-ketoacid CoA transferase [Lipid transport and metabolism]. 527
4114 227017 COG4671 COG4671 Predicted glycosyl transferase [General function prediction only]. 400
4115 227018 COG4672 gp18 Phage-related protein [Mobilome: prophages, transposons]. 231
4116 227019 COG4674 COG4674 ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 249
4117 227020 COG4675 MdpB Microcystin-dependent protein (function unknown) [Function unknown]. 170
4118 227021 COG4676 YfaP Uncharacterized conserved protein YfaP, DUF2135 family [Function unknown]. 268
4119 227022 COG4677 PemB Pectin methylesterase and related acyl-CoA thioesterases [Carbohydrate transport and metabolism, Lipid transport and metabolism]. 405
4120 227023 COG4678 COG4678 Muramidase (phage lambda lysozyme) [Cell wall/membrane/envelope biogenesis, Mobilome: prophages, transposons]. 180
4121 227024 COG4679 COG4679 Phage-related protein [Mobilome: prophages, transposons]. 116
4122 227025 COG4680 HigB mRNA-degrading endonuclease (mRNA interferase) HigB, toxic component of the HigAB toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 98
4123 227026 COG4681 YaeQ Uncharacterized conserved protein YaeQ, suppresses RfaH defect [Function unknown]. 181
4124 227027 COG4682 YiaA Uncharacterized membrane protein YiaA [Function unknown]. 128
4125 227028 COG4683 COG4683 Uncharacterized protein [Function unknown]. 120
4126 227029 COG4684 COG4684 Uncharacterized membrane protein [Function unknown]. 189
4127 227030 COG4685 YfaA Uncharacterized conserved protein YfaA, DUF2138 family [Function unknown]. 571
4128 227031 COG4687 COG4687 Uncharacterized protein [Function unknown]. 122
4129 227032 COG4688 COG4688 Uncharacterized protein [Function unknown]. 665
4130 227033 COG4689 Adc Acetoacetate decarboxylase [Secondary metabolites biosynthesis, transport and catabolism]. 247
4131 227034 COG4690 PepD Dipeptidase [Amino acid transport and metabolism]. 464
4132 227035 COG4691 StbC Plasmid stability protein [Defense mechanisms]. 80
4133 227036 COG4692 COG4692 Predicted neuraminidase (sialidase) [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 381
4134 227037 COG4693 PchG Oxidoreductase (NAD-binding), involved in siderophore biosynthesis [Inorganic ion transport and metabolism]. 361
4135 227038 COG4694 RloC Wobble nucleotide-excising tRNase [Translation, ribosomal structure and biogenesis]. 758
4136 227039 COG4695 BeeE Phage portal protein BeeE [Mobilome: prophages, transposons]. 398
4137 227040 COG4696 COG4696 Predicted phosphohydrolase, Cof family, HAD superfamily [General function prediction only]. 180
4138 227041 COG4697 COG4697 Uncharacterized protein [Function unknown]. 319
4139 227042 COG4698 YpmS Uncharacterized protein YpmS, DUF2140 family [Function unknown]. 197
4140 227043 COG4699 COG4699 Uncharacterized protein [Function unknown]. 120
4141 227044 COG4700 COG4700 Uncharacterized protein [Function unknown]. 251
4142 227045 COG4701 COG4701 Uncharacterized protein [Function unknown]. 162
4143 227046 COG4702 COG4702 Uncharacterized protein, UPF0303 family [Function unknown]. 168
4144 227047 COG4703 YkuJ Uncharacterized protein YkuJ, DUF1797 family [Function unknown]. 74
4145 227048 COG4704 COG4704 Uncharacterized conserved protein, DUF2141 family [Function unknown]. 151
4146 227049 COG4705 COG4705 Uncharacterized membrane-anchored protein [Function unknown]. 258
4147 227050 COG4706 COG4706 Predicted 3-hydroxyacyl-ACP dehydratase, HotDog domain [Lipid transport and metabolism]. 161
4148 227051 COG4707 COG4707 Prophage pi2 protein 07 [Mobilome: prophages, transposons]. 107
4149 227052 COG4708 COG4708 Uncharacterized membrane protein [Function unknown]. 169
4150 227053 COG4709 COG4709 Uncharacterized membrane protein [Function unknown]. 195
4151 227054 COG4710 COG4710 Predicted DNA-binding protein with an HTH domain [General function prediction only]. 80
4152 227055 COG4711 COG4711 Uncharacterized membrane protein [Function unknown]. 217
4153 227056 COG4712 COG4712 Uncharacterized protein [Function unknown]. 234
4154 227057 COG4713 COG4713 Uncharacterized membrane protein [Function unknown]. 489
4155 227058 COG4714 COG4714 Uncharacterized membrane-anchored protein [Function unknown]. 303
4156 227059 COG4715 COG4715 Uncharacterized conserved protein, contains Zn finger domain [Function unknown]. 587
4157 227060 COG4716 COG4716 Myosin-crossreactive antigen (function unknown) [Function unknown]. 587
4158 227061 COG4717 YhaN Uncharacterized protein YhaN, contains AAA domain [Function unknown]. 984
4159 227062 COG4718 COG4718 Phage-related protein [Mobilome: prophages, transposons]. 111
4160 227063 COG4719 COG4719 Uncharacterized protein [Function unknown]. 176
4161 227064 COG4720 COG4720 Uncharacterized membrane protein [Function unknown]. 177
4162 227065 COG4721 YkoE ABC-type thiamine/hydroxymethylpyrimidine transport system, permease component [Coenzyme transport and metabolism]. 192
4163 227066 COG4722 YomH Phage-related protein [Mobilome: prophages, transposons]. 239
4164 227067 COG4723 COG4723 Phage-related protein, tail component [Mobilome: prophages, transposons]. 198
4165 227068 COG4724 COG4724 Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism]. 553
4166 227069 COG4725 IME4 N6-adenosine-specific RNA methylase IME4 [Translation, ribosomal structure and biogenesis]. 198
4167 227070 COG4726 PilX Tfp pilus assembly protein PilX [Cell motility, Extracellular structures]. 196
4168 227071 COG4727 COG4727 Uncharacterized protein [Function unknown]. 287
4169 227072 COG4728 COG4728 Uncharacterized protein, DUF1653 family [Function unknown]. 124
4170 227073 COG4729 COG4729 Uncharacterized protein, DUF1850 family [Function unknown]. 156
4171 227074 COG4731 COG4731 Uncharacterized conserved protein, DUF2147 family [Function unknown]. 162
4172 227075 COG4732 ThiW Predicted membrane protein [Function unknown]. 177
4173 227076 COG4733 COG4733 Phage-related protein, tail component [Mobilome: prophages, transposons]. 952
4174 227077 COG4734 ArdA Antirestriction protein [Defense mechanisms]. 193
4175 227078 COG4735 YaaW Uncharacterized protein YaaW, UPF0174 family [Function unknown]. 211
4176 227079 COG4736 CcoQ Cbb3-type cytochrome oxidase, subunit 3 [Energy production and conversion]. 60
4177 227080 COG4737 COG4737 Uncharacterized protein [Function unknown]. 123
4178 227081 COG4738 COG4738 Predicted transcriptional regulator [Transcription]. 124
4179 227082 COG4739 COG4739 Uncharacterized protein, contains ferredoxin domain [Function unknown]. 182
4180 227083 COG4740 COG4740 Predicted metalloprotease [General function prediction only]. 176
4181 227084 COG4741 COG4741 Predicted secreted endonuclease distantly related to archaeal Holliday junction resolvase [Nucleotide transport and metabolism]. 175
4182 227085 COG4742 COG4742 Predicted transcriptional regulator, contains HTH domain [Transcription]. 260
4183 227086 COG4743 COG4743 Uncharacterized membrane protein [Function unknown]. 316
4184 227087 COG4744 COG4744 Uncharacterized protein [Function unknown]. 121
4185 227088 COG4745 COG4745 Predicted membrane-bound mannosyltransferase [General function prediction only]. 556
4186 227089 COG4746 COG4746 Uncharacterized protein [Function unknown]. 80
4187 227090 COG4747 ACTx2 Uncharacterized conserved protein, contains tandem ACT domains [Function unknown]. 142
4188 227091 COG4748 COG4748 Uncharacterized protein, contains restriction enzyme R protein N terminal (HSDR_N) domain [Function unknown]. 365
4189 227092 COG4749 COG4749 Uncharacterized protein [Function unknown]. 196
4190 227093 COG4750 LicC CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epi... [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 231
4191 227094 COG4752 COG4752 Uncharacterized protein [Function unknown]. 190
4192 227095 COG4753 YesN Two-component response regulator, YesN/AraC family, consists of REC and AraC-type DNA-binding domains [Signal transduction mechanisms, Transcription]. 475
4193 227096 COG4754 COG4754 Uncharacterized protein [Function unknown]. 157
4194 227097 COG4755 COG4755 Uncharacterized protein [Function unknown]. 151
4195 227098 COG4756 COG4756 Predicted cation transporter [General function prediction only]. 367
4196 227099 COG4757 COG4757 Predicted alpha/beta hydrolase [General function prediction only]. 281
4197 227100 COG4758 LiaF Predicted membrane protein [Function unknown]. 235
4198 227101 COG4759 COG4759 Uncharacterized protein, contains thioredoxin-like domain [General function prediction only]. 316
4199 227102 COG4760 COG4760 Uncharacterized membrane protein, YccA/Bax inhibitor family [Function unknown]. 276
4200 227103 COG4762 COG4762 Uncharacterized protein, UPF0548 family [Function unknown]. 168
4201 227104 COG4763 YcfT Uncharacterized membrane protein YcfT [Function unknown]. 388
4202 227105 COG4764 COG4764 Uncharacterized protein [Function unknown]. 197
4203 227106 COG4765 COG4765 Uncharacterized protein [Function unknown]. 164
4204 227107 COG4766 EutQ Ethanolamine utilization protein EutQ, cupin superfamily (function unknown) [Amino acid transport and metabolism]. 176
4205 227108 COG4767 VanZ Glycopeptide antibiotics resistance protein [Defense mechanisms]. 199
4206 227109 COG4768 COG4768 Uncharacterized protein YoxC, contains an MCP-like domain [Function unknown]. 139
4207 227110 COG4769 COG4769 Uncharacterized membrane protein [Function unknown]. 181
4208 227111 COG4770 PccA Acetyl/propionyl-CoA carboxylase, alpha subunit [Lipid transport and metabolism]. 645
4209 227112 COG4771 FepA Outer membrane receptor for ferrienterochelin and colicins [Inorganic ion transport and metabolism]. 699
4210 227113 COG4772 FecA Outer membrane receptor for Fe3+-dicitrate [Inorganic ion transport and metabolism]. 753
4211 227114 COG4773 FhuE Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid [Inorganic ion transport and metabolism]. 719
4212 227115 COG4774 Fiu Outer membrane receptor for monomeric catechols [Inorganic ion transport and metabolism]. 750
4213 227116 COG4775 BamA Outer membrane protein assembly factor BamA [Cell wall/membrane/envelope biogenesis]. 766
4214 227117 COG4776 Rnb Exoribonuclease II [Transcription]. 645
4215 227118 COG4778 PhnL Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnL [Inorganic ion transport and metabolism]. 235
4216 227119 COG4779 FepG ABC-type enterobactin transport system, permease component [Inorganic ion transport and metabolism]. 346
4217 227120 COG4781 UgpQ1 Membrane-anchored glycerophosphoryl diester phosphodiesterase (GDPDase), membrane domain [Lipid transport and metabolism]. 340
4218 227121 COG4782 COG4782 Esterase/lipase superfamily enzyme [General function prediction only]. 377
4219 227122 COG4783 YfgC Putative Zn-dependent protease, contains TPR repeats [General function prediction only]. 484
4220 227123 COG4784 COG4784 Putative Zn-dependent protease [General function prediction only]. 479
4221 227124 COG4785 NlpI Lipoprotein NlpI, contains TPR repeats [Cell wall/membrane/envelope biogenesis]. 297
4222 227125 COG4786 FlgG Flagellar basal body rod protein FlgG [Cell motility]. 265
4223 227126 COG4787 FlgF Flagellar basal body rod protein FlgF [Cell motility]. 251
4224 227127 COG4789 EscV Type III secretory pathway, component EscV [Intracellular trafficking, secretion, and vesicular transport]. 689
4225 227128 COG4790 EscR Type III secretory pathway, component EscR [Intracellular trafficking, secretion, and vesicular transport]. 214
4226 227129 COG4791 EscT Type III secretory pathway, component EscT [Intracellular trafficking, secretion, and vesicular transport]. 259
4227 227130 COG4792 EscU Type III secretory pathway, component EscU [Intracellular trafficking, secretion, and vesicular transport]. 349
4228 227131 COG4794 EscS Type III secretory pathway, component EscS [Intracellular trafficking, secretion, and vesicular transport]. 89
4229 227132 COG4795 PulJ Type II secretory pathway, component PulJ [Intracellular trafficking, secretion, and vesicular transport]. 194
4230 227133 COG4796 HofQ Type II secretory pathway, component HofQ [Intracellular trafficking, secretion, and vesicular transport]. 709
4231 227134 COG4797 COG4797 Predicted regulatory domain of a methyltransferase [General function prediction only]. 268
4232 227135 COG4798 COG4798 Predicted methyltransferase [General function prediction only]. 238
4233 227136 COG4799 MmdA Acetyl-CoA carboxylase, carboxyltransferase component [Lipid transport and metabolism]. 526
4234 227137 COG4800 COG4800 Predicted transcriptional regulator with an HTH domain [Transcription]. 170
4235 227138 COG4801 COG4801 Predicted acyltransferase, contains DUF342 domain [General function prediction only]. 277
4236 227139 COG4802 FtrB Ferredoxin-thioredoxin reductase, catalytic subunit [Energy production and conversion]. 110
4237 227140 COG4803 COG4803 Uncharacterized membrane protein [Function unknown]. 170
4238 227141 COG4804 YhcG Predicted nuclease of restriction endonuclease-like (RecB) superfamily, DUF1016 family [General function prediction only]. 159
4239 227142 COG4805 COG4805 Uncharacterized conserved protein, DUF885 family [Function unknown]. 588
4240 227143 COG4806 RhaA L-rhamnose isomerase [Carbohydrate transport and metabolism]. 419
4241 227144 COG4807 YehS Uncharacterized conserved protein YehS, DUF1456 family [Function unknown]. 155
4242 227145 COG4808 YehR Uncharacterized lipoprotein YehR, DUF1307 family [Function unknown]. 152
4243 227146 COG4809 Pfk2 Archaeal ADP-dependent phosphofructokinase/glucokinase [Carbohydrate transport and metabolism]. 466
4244 227147 COG4810 EutS Ethanolamine utilization protein EutS, ethanolamine utilization microcompartment shell protein [Amino acid transport and metabolism]. 121
4245 227148 COG4811 YobD Uncharacterized membrane protein YobD, UPF0266 family [Function unknown]. 152
4246 227149 COG4812 EutT Ethanolamine utilization cobalamin adenosyltransferase [Amino acid transport and metabolism]. 255
4247 227150 COG4813 ThuA Trehalose utilization protein [Carbohydrate transport and metabolism]. 261
4248 227151 COG4814 COG4814 Uncharacterized protein with an alpha/beta hydrolase fold [Function unknown]. 288
4249 227152 COG4815 COG4815 Uncharacterized protein [Function unknown]. 145
4250 227153 COG4816 EutL Ethanolamine utilization protein EutL, ethanolamine utilization microcompartment shell protein [Amino acid transport and metabolism]. 219
4251 227154 COG4817 GINS DNA-binding ferritin-like protein (Dps family) [Replication, recombination and repair]. 111
4252 227155 COG4818 COG4818 Uncharacterized membrane protein [Function unknown]. 105
4253 227156 COG4819 EutA Ethanolamine utilization protein EutA, possible chaperonin protecting lyase from inhibition [Amino acid transport and metabolism]. 473
4254 227157 COG4820 EutJ Ethanolamine utilization protein EutJ, possible chaperonin [Amino acid transport and metabolism]. 277
4255 227158 COG4821 COG4821 Uncharacterized protein, contains SIS (Sugar ISomerase) phosphosugar binding domain [General function prediction only]. 243
4256 227159 COG4822 CbiK Cobalamin biosynthesis protein CbiK, Co2+ chelatase [Coenzyme transport and metabolism]. 265
4257 227160 COG4823 AbiF Abortive infection bacteriophage resistance protein [Defense mechanisms]. 299
4258 227161 COG4824 COG4824 Phage-related holin (Lysis protein) [Mobilome: prophages, transposons]. 133
4259 227162 COG4825 COG4825 Uncharacterized membrane-anchored protein [Function unknown]. 395
4260 227163 COG4826 SERPIN Serine protease inhibitor [Posttranslational modification, protein turnover, chaperones]. 410
4261 227164 COG4827 COG4827 Predicted transporter [General function prediction only]. 239
4262 227165 COG4828 COG4828 Uncharacterized membrane protein [Function unknown]. 113
4263 227166 COG4829 CatC1 Muconolactone delta-isomerase [Secondary metabolites biosynthesis, transport and catabolism]. 98
4264 227167 COG4830 RPS26B Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]. 108
4265 227168 COG4831 COG4831 Roadblock/LC7 domain [Signal transduction mechanisms]. 109
4266 227169 COG4832 COG4832 Uncharacterized protein [Function unknown]. 207
4267 227170 COG4833 COG4833 Predicted alpha-1,6-mannanase, GH76 family [Carbohydrate transport and metabolism]. 377
4268 227171 COG4834 COG4834 Uncharacterized protein [Function unknown]. 334
4269 227172 COG4835 COG4835 Uncharacterized protein [Function unknown]. 124
4270 227173 COG4836 YwzB Uncharacterized membrane protein YwzB [Function unknown]. 77
4271 227174 COG4837 YuzD Disulfide oxidoreductase YuzD [Posttranslational modification, protein turnover, chaperones]. 106
4272 227175 COG4838 YlaN Uncharacterized protein YlaN, UPF0358 family [Function unknown]. 92
4273 227176 COG4839 FtsL2 Cell division protein FtsL [Cell cycle control, cell division, chromosome partitioning]. 120
4274 227177 COG4840 YfkK Uncharacterized protein YfkK, UPF0435 family [Function unknown]. 71
4275 227178 COG4841 YneR Uncharacterized protein YneR, related to HesB/YadR/YfhF family [Function unknown]. 95
4276 227179 COG4842 YukE Uncharacterized conserved protein YukE [Function unknown]. 97
4277 227180 COG4843 YebE Uncharacterized protein YebE, UPF0316/DUF2179 family [Function unknown]. 179
4278 227181 COG4844 YuzB Uncharacterized protein YuzB, UPF0349 family [Function unknown]. 78
4279 227182 COG4845 CatA Chloramphenicol O-acetyltransferase [Defense mechanisms]. 219
4280 227183 COG4846 CcdC Membrane protein CcdC involved in cytochrome C biogenesis [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 163
4281 227184 COG4847 COG4847 Uncharacterized protein [Function unknown]. 103
4282 227185 COG4848 YtpQ Uncharacterized protein YtpQ, UPF0354 family [Function unknown]. 265
4283 227186 COG4849 COG4849 Predicted nucleotidyltransferase [General function prediction only]. 269
4284 227187 COG4850 App1 Phosphatidate phosphatase APP1 [Lipid transport and metabolism]. 373
4285 227188 COG4851 CamS Protein involved in sex pheromone biosynthesis [General function prediction only]. 382
4286 227189 COG4852 COG4852 Uncharacterized membrane protein [Function unknown]. 134
4287 227190 COG4853 YycI Two-component signal transduction system YycFG, regulatory protein YycI [Signal transduction mechanisms]. 264
4288 227191 COG4854 COG4854 Uncharacterized membrane protein [Function unknown]. 126
4289 227192 COG4855 COG4855 Uncharacterized protein [Nucleotide transport and metabolism]. 76
4290 227193 COG4856 YbbR Uncharacterized protein, YbbR domain [Function unknown]. 403
4291 227194 COG4857 COG4857 5-Methylthioribose kinase, methionine salvage pathway [Amino acid transport and metabolism]. 408
4292 227195 COG4858 COG4858 Uncharacterized membrane-anchored protein [Function unknown]. 226
4293 227196 COG4859 COG4859 Uncharacterized protein [Function unknown]. 105
4294 227197 COG4860 COG4860 Predicted DNA-binding transcriptional regulator, ArsR family [Transcription]. 170
4295 227198 COG4861 COG4861 Uncharacterized protein [Function unknown]. 345
4296 227199 COG4862 MecA Negative regulator of genetic competence, sporulation and motility [Transcription, Signal transduction mechanisms, Cell motility]. 224
4297 227200 COG4863 YycH Two-component signal transduction system YycFG, regulatory protein YycH [Signal transduction mechanisms]. 439
4298 227201 COG4864 YqfA Uncharacterized protein YqfA, UPF0365 family [Function unknown]. 328
4299 227202 COG4865 GlmE Glutamate mutase epsilon subunit [Amino acid transport and metabolism]. 485
4300 227203 COG4866 COG4866 Uncharacterized protein [Function unknown]. 294
4301 227204 COG4867 COG4867 Uncharacterized protein, contains von Willebrand factor type A (vWA) domain [Function unknown]. 652
4302 227205 COG4868 COG4868 Uncharacterized protein, UPF0371 family [Function unknown]. 493
4303 227206 COG4869 PduL Propanediol utilization protein [Secondary metabolites biosynthesis, transport and catabolism]. 210
4304 227207 COG4870 COG4870 Cysteine protease, C1A family [Posttranslational modification, protein turnover, chaperones]. 372
4305 227208 COG4871 COG4871 Metal-binding transcriptional regulator, contains putative Fe-S cluster and ArsR family DNA binding domain [Transcription]. 193
4306 227209 COG4872 COG4872 Uncharacterized membrane protein [Function unknown]. 394
4307 227210 COG4873 YkvS Uncharacterized protein YkvS, DUF2187 family [Function unknown]. 81
4308 227211 COG4874 COG4874 Uncharacterized protein [Function unknown]. 318
4309 227212 COG4875 COG4875 Uncharacterized protein [Function unknown]. 156
4310 227213 COG4876 YdaT Uncharacterized protein YdaT [Function unknown]. 138
4311 227214 COG4877 COG4877 Uncharacterized protein [Function unknown]. 63
4312 227215 COG4878 COG4878 Uncharacterized protein [Function unknown]. 309
4313 227216 COG4879 COG4879 Uncharacterized protein [Function unknown]. 243
4314 227217 COG4880 COG4880 Secreted protein containing C-terminal beta-propeller domain distantly related to WD-40 repeats [General function prediction only]. 603
4315 227218 COG4881 COG4881 Predicted membrane protein [Function unknown]. 371
4316 227219 COG4882 COG4882 Predicted aminopeptidase, Iap family [General function prediction only]. 486
4317 227220 COG4883 COG4883 Uncharacterized protein [Function unknown]. 500
4318 227221 COG4884 YfeS Uncharacterized conserved protein YfeS, contains WGR domain [Function unknown]. 176
4319 227222 COG4885 COG4885 Uncharacterized protein [Function unknown]. 312
4320 227223 COG4886 LRR Leucine-rich repeat (LRR) protein [Transcription]. 394
4321 227224 COG4887 COG4887 Uncharacterized metal-binding protein, DUF1847 family [Function unknown]. 191
4322 227225 COG4888 Elf1 Transcription elongation factor Elf1, contains Zn-ribbon domain [Transcription]. 104
4323 227226 COG4889 COG4889 Predicted helicase [General function prediction only]. 1518
4324 227227 COG4890 COG4890 Predicted outer membrane lipoprotein [Function unknown]. 37
4325 227228 COG4891 COG4891 Uncharacterized protein [Function unknown]. 93
4326 227229 COG4892 COG4892 Predicted heme/steroid binding protein [General function prediction only]. 81
4327 227230 COG4893 COG4893 Uncharacterized protein [Function unknown]. 123
4328 227231 COG4894 YxjI Uncharacterized protein YxjI, Tubby2 superfamily [Function unknown]. 159
4329 227232 COG4895 YwbE Uncharacterized protein YwbE, DUF2196 family [Function unknown]. 63
4330 227233 COG4896 YlaI Uncharacterized protein YlaI, DUF2197 family [Function unknown]. 68
4331 227234 COG4897 CsbA General stress protein CsbA (function unknown) [Function unknown]. 78
4332 227235 COG4898 COG4898 Uncharacterized protein [Function unknown]. 115
4333 227236 COG4899 COG4899 Uncharacterized protein [Function unknown]. 166
4334 227237 COG4900 COG4900 Predicted metallopeptidase [General function prediction only]. 133
4335 227238 COG4901 RPS25 Ribosomal protein S25 [Translation, ribosomal structure and biogenesis]. 107
4336 227239 COG4902 COG4902 Uncharacterized protein [Function unknown]. 189
4337 227240 COG4903 ComK Competence transcription factor ComK [Transcription]. 190
4338 227241 COG4904 COG4904 Uncharacterized protein [Function unknown]. 174
4339 227242 COG4905 COG4905 Uncharacterized membrane protein [Function unknown]. 243
4340 227243 COG4906 COG4906 Uncharacterized membrane protein [Function unknown]. 696
4341 227244 COG4907 COG4907 Uncharacterized membrane protein [Function unknown]. 595
4342 227245 COG4908 COG4908 Uncharacterized protein, contains a NRPS condensation (elongation) domain [General function prediction only]. 439
4343 227246 COG4909 PduC Propanediol dehydratase, large subunit [Secondary metabolites biosynthesis, transport and catabolism]. 554
4344 227247 COG4910 PduE Propanediol dehydratase, small subunit [Secondary metabolites biosynthesis, transport and catabolism]. 170
4345 227248 COG4911 COG4911 Uncharacterized protein [Function unknown]. 123
4346 227249 COG4912 AlkD 3-methyladenine DNA glycosylase AlkD [Replication, recombination and repair]. 222
4347 227250 COG4913 COG4913 Uncharacterized protein, contains a C-terminal ATPase domain [Function unknown]. 1104
4348 227251 COG4914 COG4914 Predicted nucleotidyltransferase [General function prediction only]. 190
4349 227252 COG4915 XpaC 5-bromo-4-chloroindolyl phosphate hydrolysis protein [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 204
4350 227253 COG4916 COG4916 Uncharacterized protein [Function unknown]. 329
4351 227254 COG4917 EutP Ethanolamine utilization protein EutP, contains a P-loop NTPase domain [Amino acid transport and metabolism]. 148
4352 227255 COG4918 YqkB Predicted Fe-S cluster biosynthesis protein [General function prediction only]. 114
4353 227256 COG4919 RPS30 Ribosomal protein S30 [Translation, ribosomal structure and biogenesis]. 54
4354 227257 COG4920 COG4920 Uncharacterized membrane protein [Function unknown]. 249
4355 227258 COG4921 COG4921 Uncharacterized protein [Function unknown]. 131
4356 227259 COG4922 COG4922 Predicted SnoaL-like aldol condensation-catalyzing enzyme [General function prediction only]. 129
4357 227260 COG4923 COG4923 Predicted nuclease (RNAse H fold) [General function prediction only]. 245
4358 227261 COG4924 COG4924 Uncharacterized protein [Function unknown]. 386
4359 227262 COG4925 COG4925 Uncharacterized protein [Function unknown]. 166
4360 227263 COG4926 PblB Phage-related protein [Mobilome: prophages, transposons]. 698
4361 227264 COG4927 COG4927 Predicted choloylglycine hydrolase [General function prediction only]. 336
4362 227265 COG4928 COG4928 Predicted P-loop ATPase, KAP-like [General function prediction only]. 646
4363 227266 COG4929 COG4929 Uncharacterized membrane-anchored protein [Function unknown]. 190
4364 227267 COG4930 COG4930 Predicted ATP-dependent Lon-type protease [Posttranslational modification, protein turnover, chaperones]. 683
4365 227268 COG4932 COG4932 Uncharacterized surface anchored protein [Function unknown]. 1531
4366 227269 COG4933 COG4933 Predicted transcriptional regulator, contains an HTH and PUA-like domains [Transcription]. 124
4367 227270 COG4934 COG4934 Serine protease, subtilase family [Posttranslational modification, protein turnover, chaperones]. 1174
4368 227271 COG4935 COG4935 Regulatory P domain of the subtilisin-like proprotein convertases and other proteases [Posttranslational modification, protein turnover, chaperones]. 177
4369 227272 COG4936 PocR Ligand-binding sensor domain [Signal transduction mechanisms]. 169
4370 227273 COG4937 FDXACB Ferredoxin-fold anticodon binding domain [Translation, ribosomal structure and biogenesis]. 171
4371 227274 COG4938 COG4938 Predicted ATPase [General function prediction only]. 374
4372 227275 COG4939 Tpp15 Major membrane immunogen, membrane-anchored lipoprotein [Function unknown]. 147
4373 227276 COG4940 ComGF Competence protein ComGF [Mobilome: prophages, transposons]. 154
4374 227277 COG4941 COG4941 Predicted RNA polymerase sigma factor, contains C-terminal TPR domain [Transcription]. 415
4375 227278 COG4942 EnvC Septal ring factor EnvC, activator of murein hydrolases AmiA and AmiB [Cell cycle control, cell division, chromosome partitioning]. 420
4376 227279 COG4943 YjcC Environmental sensor c-di-GMP phosphodiesterase, contains periplasmic CSS-motif sensor and cytoplasmic EAL domain [Signal transduction mechanisms]. 524
4377 227280 COG4944 COG4944 Uncharacterized protein [Function unknown]. 213
4378 227281 COG4945 DOMON Carbohydrate-binding DOMON domain [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 570
4379 227282 COG4946 COG4946 Uncharacterized N-terminal domain of tricorn protease [Function unknown]. 668
4380 227283 COG4947 COG4947 Esterase/lipase superfamily enzyme [General function prediction only]. 227
4381 227284 COG4948 RspA L-alanine-DL-glutamate epimerase or related enzyme of enolase superfamily [Cell wall/membrane/envelope biogenesis, General function prediction only]. 372
4382 227285 COG4949 COG4949 Uncharacterized membrane-anchored protein [Function unknown]. 424
4383 227286 COG4950 YciW1 N-terminal domain of uncharacterized protein YciW (function unknown) [Function unknown]. 193
4384 227287 COG4951 COG4951 Uncharacterized protein [Function unknown]. 361
4385 227288 COG4952 COG4952 L-rhamnose isomerase [Cell wall/membrane/envelope biogenesis]. 430
4386 227289 COG4953 PbpC Membrane carboxypeptidase/penicillin-binding protein PbpC [Cell wall/membrane/envelope biogenesis]. 733
4387 227290 COG4954 COG4954 Uncharacterized protein [Function unknown]. 135
4388 227291 COG4955 YpbB Uncharacterized protein YpbB, contains C-terminal HTH domain [Function unknown]. 343
4389 227292 COG4956 COG4956 Uncharacterized conserved protein YacL, contains PIN and TRAM domains [General function prediction only]. 356
4390 227293 COG4957 COG4957 Predicted transcriptional regulator [Transcription]. 148
4391 227294 COG4959 TraF Type IV secretory pathway, protease TraF [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 173
4392 227295 COG4960 CpaA Flp pilus assembly protein, protease CpaA [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 168
4393 227296 COG4961 TadG Flp pilus assembly protein TadG [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 185
4394 227297 COG4962 CpaF Pilus assembly protein, ATPase of CpaF family [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 355
4395 227298 COG4963 CpaE Flp pilus assembly protein, ATPase CpaE [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 366
4396 227299 COG4964 CpaC Flp pilus assembly protein, secretin CpaC [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 455
4397 227300 COG4965 TadB Flp pilus assembly protein TadB [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 309
4398 227301 COG4966 PilW Tfp pilus assembly protein PilW [Cell motility, Extracellular structures]. 318
4399 227302 COG4967 PilV Tfp pilus assembly protein PilV [Cell motility, Extracellular structures]. 162
4400 227303 COG4968 PilE Tfp pilus assembly protein PilE [Cell motility, Extracellular structures]. 139
4401 227304 COG4969 PilA Tfp pilus assembly protein, major pilin PilA [Cell motility, Extracellular structures]. 125
4402 227305 COG4970 FimT Tfp pilus assembly protein FimT [Cell motility, Extracellular structures]. 181
4403 227306 COG4972 PilM Tfp pilus assembly protein, ATPase PilM [Cell motility, Extracellular structures]. 354
4404 227307 COG4973 XerC Site-specific recombinase XerC [Replication, recombination and repair]. 299
4405 227308 COG4974 XerD Site-specific recombinase XerD [Replication, recombination and repair]. 300
4406 227309 COG4975 GlcU Glucose uptake protein GlcU [Carbohydrate transport and metabolism]. 288
4407 227310 COG4976 COG4976 Predicted methyltransferase, contains TPR repeat [General function prediction only]. 287
4408 227311 COG4977 GlxA Transcriptional regulator GlxA family, contains an amidase domain and an AraC-type DNA-binding HTH domain [Transcription]. 328
4409 227312 COG4978 BltR2 Bacterial effector-binding domain [Signal transduction mechanisms]. 153
4410 227313 COG4980 GvpP Gas vesicle protein [General function prediction only]. 115
4411 227314 COG4981 COG4981 Enoyl reductase domain of yeast-type FAS1 [Lipid transport and metabolism]. 717
4412 227315 COG4982 FabG2 3-oxoacyl-ACP reductase domain of yeast-type FAS1 [Lipid transport and metabolism]. 866
4413 227316 COG4983 COG4983 Uncharacterized protein, contains Primase-polymerase (Primpol) domain [Function unknown]. 495
4414 227317 COG4984 COG4984 Uncharacterized membrane protein [Function unknown]. 644
4415 227318 COG4985 COG4985 ABC-type phosphate transport system, auxiliary component [Inorganic ion transport and metabolism]. 289
4416 227319 COG4986 COG4986 ABC-type anion transport system, duplicated permease component [Inorganic ion transport and metabolism]. 523
4417 227320 COG4987 CydC ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 573
4418 227321 COG4988 CydD ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 559
4419 227322 COG4989 YdhF Predicted oxidoreductase [General function prediction only]. 298
4420 227323 COG4990 YvpB Predicted cysteine peptidase, C39 family [General function prediction only]. 195
4421 227324 COG4991 YraI Uncharacterized conserved protein YraI [Function unknown]. 155
4422 227325 COG4992 ArgD Acetylornithine/succinyldiaminopimelate/putrescine aminotransferase [Amino acid transport and metabolism]. 404
4423 227326 COG4993 Gcd Glucose dehydrogenase [Carbohydrate transport and metabolism]. 773
4424 227327 COG4994 COG4994 Uncharacterized protein [Function unknown]. 120
4425 227328 COG4995 COG4995 Uncharacterized conserved protein, contains CHAT domain [Function unknown]. 420
4426 227329 COG4996 COG4996 Predicted phosphatase [General function prediction only]. 164
4427 227330 COG4997 COG4997 Predicted house-cleaning noncanonical NTP pyrophosphatase, all-alpha NTP-PPase (MazG) superfamily [General function prediction only]. 95
4428 227331 COG4998 RecB Predicted endonuclease, RecB family [Replication, recombination and repair]. 209
4429 227332 COG4999 BarA5 Uncharacterized domain of BarA-like signal transduction histidine kinase [Signal transduction mechanisms]. 140
4430 227333 COG5000 NtrY Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation [Signal transduction mechanisms]. 712
4431 227334 COG5001 COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain [Signal transduction mechanisms]. 663
4432 227335 COG5002 VicK Signal transduction histidine kinase [Signal transduction mechanisms]. 459
4433 227336 COG5003 COG5003 Mu-like prophage protein gp37 [Mobilome: prophages, transposons]. 151
4434 227337 COG5004 COG5004 P2-like prophage tail protein X [Mobilome: prophages, transposons]. 70
4435 227338 COG5005 COG5005 Mu-like prophage protein gpG [Mobilome: prophages, transposons]. 140
4436 227339 COG5006 RhtA Threonine/homoserine efflux transporter RhtA [Amino acid transport and metabolism]. 292
4437 227340 COG5007 IbaG Acid stress-induced BolA-like protein IbaG/YrbA, predicted regulator of iron metabolism [Signal transduction mechanisms]. 80
4438 227341 COG5008 PilU Tfp pilus assembly protein, ATPase PilU [Cell motility, Extracellular structures]. 375
4439 227342 COG5009 MrcA Membrane carboxypeptidase/penicillin-binding protein [Cell wall/membrane/envelope biogenesis]. 797
4440 227343 COG5010 TadD Flp pilus assembly protein TadD, contains TPR repeats [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 257
4441 227344 COG5011 COG5011 Uncharacterized conserved protein, DUF2344 family [Function unknown]. 228
4442 227345 COG5012 MtbC1 Methanogenic corrinoid protein MtbC1 [Energy production and conversion]. 227
4443 227346 COG5013 NarG Nitrate reductase alpha subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 1227
4444 227347 COG5014 COG5014 Uncharacterized Fe-S cluster-containing protein, radical SAM superfamily [General function prediction only]. 228
4445 227348 COG5015 COG5015 Uncharacterized protein, pyridoxamine 5'-phosphate oxidase (PNPOx-like) family [Function unknown]. 132
4446 227349 COG5016 OadA1 Pyruvate/oxaloacetate carboxyltransferase [Energy production and conversion]. 472
4447 227350 COG5017 COG5017 UDP-N-acetylglucosamine transferase subunit ALG13 [Carbohydrate transport and metabolism]. 161
4448 227351 COG5018 KapD Inhibitor of the KinA pathway to sporulation, predicted exonuclease [General function prediction only]. 210
4449 227352 COG5019 CDC3 Septin family protein [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 373
4450 227353 COG5020 KTR1 Mannosyltransferase [Carbohydrate transport and metabolism]. 399
4451 227354 COG5021 HUL4 Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 872
4452 227355 COG5022 COG5022 Myosin heavy chain [General function prediction only]. 1463
4453 227356 COG5023 COG5023 Tubulin [Cytoskeleton]. 443
4454 227357 COG5024 COG5024 Cyclin [Cell division and chromosome partitioning]. 440
4455 227358 COG5025 COG5025 Transcription factor of the Forkhead/HNF3 family [Transcription]. 610
4456 227359 COG5026 COG5026 Hexokinase [Carbohydrate transport and metabolism]. 466
4457 227360 COG5027 SAS2 Histone acetyltransferase (MYST family) [Chromatin structure and dynamics]. 395
4458 227361 COG5028 COG5028 Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion]. 861
4459 227362 COG5029 CAL1 Prenyltransferase, beta subunit [Posttranslational modification, protein turnover, chaperones, Lipid transport and metabolism]. 342
4460 227363 COG5030 APS2 Clathrin adaptor complex, small subunit [Intracellular trafficking and secretion]. 152
4461 227364 COG5031 COQ4 Ubiquinone biosynthesis protein Coq4 [Coenzyme transport and metabolism]. 235
4462 227365 COG5032 TEL1 Phosphatidylinositol kinase or protein kinase, PI-3 family [Signal transduction mechanisms]. 2105
4463 227366 COG5033 TFG3 Transcription initiation factor IIF, auxiliary subunit [Transcription]. 225
4464 227367 COG5034 TNG2 Chromatin remodeling protein, contains PhD zinc finger [Chromatin structure and dynamics]. 271
4465 227368 COG5035 CDC50 Cell cycle control protein [Cell division and chromosome partitioning / Transcription / Signal transduction mechanisms]. 372
4466 227369 COG5036 COG5036 SPX domain-containing protein involved in vacuolar polyphosphate accumulation [Inorganic ion transport and metabolism, Intracellular trafficking, secretion, and vesicular transport]. 509
4467 227370 COG5037 TOS9 Gluconate transport-inducing protein [Signal transduction mechanisms / Carbohydrate transport and metabolism]. 248
4468 227371 COG5038 COG5038 Ca2+-dependent lipid-binding protein, contains C2 domain [General function prediction only]. 1227
4469 227372 COG5039 EpsI Exopolysaccharide biosynthesis protein EpsI, predicted pyruvyl transferase [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 339
4470 227373 COG5040 BMH1 14-3-3 family protein [Signal transduction mechanisms]. 268
4471 227374 COG5041 SKB2 Casein kinase II, beta subunit [Signal transduction mechanisms / Cell division and chromosome partitioning / Transcription]. 242
4472 227375 COG5042 NUP Purine nucleoside permease [Nucleotide transport and metabolism]. 349
4473 227376 COG5043 MRS6 Vacuolar protein sorting-associated protein [Intracellular trafficking and secretion]. 2552
4474 227377 COG5044 MRS6 RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]. 434
4475 227378 COG5045 COG5045 Ribosomal protein S10E [Translation, ribosomal structure and biogenesis]. 105
4476 227379 COG5046 MAF1 Protein involved in Mod5 protein sorting [Posttranslational modification, protein turnover, chaperones]. 282
4477 227380 COG5047 SEC23 Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion]. 755
4478 227381 COG5048 COG5048 FOG: Zn-finger [General function prediction only]. 467
4479 227382 COG5049 XRN1 5'-3' exonuclease [Replication, recombination and repair]. 953
4480 227383 COG5050 EPT1 sn-1,2-diacylglycerol ethanolamine- and cholinephosphotransferases [Lipid metabolism]. 384
4481 227384 COG5051 RPL36A Ribosomal protein L36E [Translation, ribosomal structure and biogenesis]. 97
4482 227385 COG5052 YOP1 Protein involved in membrane traffic [Intracellular trafficking and secretion]. 186
4483 227386 COG5053 CDC33 Translation initiation factor 4E (eIF-4E) [Translation, ribosomal structure and biogenesis]. 217
4484 227387 COG5054 ERV1 Mitochondrial sulfhydryl oxidase involved in the biogenesis of cytosolic Fe/S proteins [Posttranslational modification, protein turnover, chaperones]. 181
4485 227388 COG5055 RAD52 Recombination DNA repair protein (RAD52 pathway) [Replication, recombination and repair]. 375
4486 227389 COG5056 ARE1 Acyl-CoA cholesterol acyltransferase [Lipid metabolism]. 512
4487 227390 COG5057 LAG1 Phosphotyrosyl phosphatase activator [Cell division and chromosome partitioning / Signal transduction mechanisms]. 353
4488 227391 COG5058 LAG1 Protein transporter of the TRAM (translocating chain-associating membrane) superfamily, longevity assurance factor [Intracellular trafficking and secretion]. 395
4489 227392 COG5059 KIP1 Kinesin-like protein [Cytoskeleton]. 568
4490 227393 COG5061 ERO1 Oxidoreductin, endoplasmic reticulum membrane-associated protein involved in disulfide bond formation [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 425
4491 227394 COG5062 COG5062 Uncharacterized membrane protein [Function unknown]. 429
4492 227395 COG5063 CTH1 CCCH-type Zn-finger protein [General function prediction only]. 351
4493 227396 COG5064 SRP1 Karyopherin (importin) alpha [Intracellular trafficking and secretion]. 526
4494 227397 COG5065 PHO88 Protein involved in inorganic phosphate transport [Inorganic ion transport and metabolism]. 185
4495 227398 COG5066 SCS2 VAMP-associated protein involved in inositol metabolism [Intracellular trafficking and secretion]. 242
4496 227399 COG5067 DBF4 Protein kinase essential for the initiation of DNA replication [DNA replication, recombination, and repair / Cell division and chromosome partitioning]. 468
4497 227400 COG5068 ARG80 Regulator of arginine metabolism and related MADS box-containing transcription factors [Transcription]. 412
4498 227401 COG5069 SAC6 Ca2+-binding actin-bundling protein fimbrin/plastin (EF-Hand superfamily) [Cytoskeleton]. 612
4499 227402 COG5070 VRG4 Nucleotide-sugar transporter [Carbohydrate transport and metabolism / Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 309
4500 227403 COG5071 RPN5 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 439
4501 227404 COG5072 ALK1 Serine/threonine kinase of the haspin family [Cell division and chromosome partitioning]. 488
4502 227405 COG5073 VID24 Vacuolar import and degradation protein [Intracellular trafficking and secretion]. 272
4503 227406 COG5074 COG5074 t-SNARE complex subunit, syntaxin [Intracellular trafficking, secretion, and vesicular transport]. 280
4504 227407 COG5075 COG5075 Uncharacterized conserved protein [Function unknown]. 305
4505 227408 COG5076 COG5076 Transcription factor involved in chromatin remodeling, contains bromodomain [Chromatin structure and dynamics / Transcription]. 371
4506 227409 COG5077 COG5077 Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 1089
4507 227410 COG5078 COG5078 Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 153
4508 227411 COG5079 SAC3 Nuclear protein export factor [Intracellular trafficking and secretion / Cell division and chromosome partitioning]. 646
4509 227412 COG5080 YIP1 Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]. 227
4510 227413 COG5081 COG5081 Predicted membrane protein [Function unknown]. 180
4511 227414 COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 190
4512 227415 COG5083 SMP2 Phosphatidate phosphatase PAH1, contains Lipin and LNS2 domains; can be involved in plasmid maintenance [Lipid transport and metabolism]. 580
4513 227416 COG5084 YTH1 Cleavage and polyadenylation specificity factor (CPSF) Clipper subunit and related makorin family Zn-finger proteins [General function prediction only]. 285
4514 227417 COG5085 COG5085 Predicted membrane protein [Function unknown]. 230
4515 227418 COG5086 COG5086 Uncharacterized conserved protein [Function unknown]. 218
4516 227419 COG5087 RTT109 Uncharacterized conserved protein [Function unknown]. 349
4517 227420 COG5088 SOH1 Rad5p-binding protein [General function prediction only]. 114
4518 227421 COG5090 TFG2 Transcription initiation factor IIF, small subunit (RAP30) [Transcription]. 297
4519 227422 COG5091 SGT1 Suppressor of G2 allele of skp1 and related proteins [General function prediction only]. 368
4520 227423 COG5092 NMT1 N-myristoyl transferase [Lipid metabolism]. 451
4521 227424 COG5093 COG5093 Uncharacterized conserved protein [Function unknown]. 185
4522 227425 COG5094 TAF9 Transcription initiation factor TFIID, subunit TAF9 (also component of histone acetyltransferase SAGA) [Transcription]. 145
4523 227426 COG5095 TAF6 Transcription initiation factor TFIID, subunit TAF6 (also component of histone acetyltransferase SAGA) [Transcription]. 450
4524 227427 COG5096 COG5096 Vesicle coat complex, various subunits [Intracellular trafficking, secretion, and vesicular transport]. 757
4525 227428 COG5097 MED6 RNA polymerase II transcriptional regulation mediator [Transcription]. 210
4526 227429 COG5098 COG5098 Chromosome condensation complex Condensin, subunit D2 [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 1128
4527 227430 COG5099 COG5099 RNA-binding protein of the Puf family, translational repressor [Translation, ribosomal structure and biogenesis]. 777
4528 227431 COG5100 NPL4 Nuclear pore protein [Nuclear structure]. 571
4529 227432 COG5101 CRM1 Importin beta-related nuclear transport receptor [Nuclear structure / Intracellular trafficking and secretion]. 1053
4530 227433 COG5102 SFT2 Membrane protein involved in ER to Golgi transport [Intracellular trafficking and secretion]. 201
4531 227434 COG5103 CDC39 Cell division control protein, negative regulator of transcription [Cell division and chromosome partitioning / Transcription]. 2005
4532 227435 COG5104 PRP40 Splicing factor [RNA processing and modification]. 590
4533 227436 COG5105 MIH1 Mitotic inducer, protein phosphatase [Cell division and chromosome partitioning]. 427
4534 227437 COG5106 RPF2 Uncharacterized conserved protein [Function unknown]. 316
4535 227438 COG5107 RNA14 Pre-mRNA 3'-end processing (cleavage and polyadenylation) factor [RNA processing and modification]. 660
4536 227439 COG5108 RPO41 Mitochondrial DNA-directed RNA polymerase [Transcription]. 1117
4537 227440 COG5109 COG5109 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 396
4538 227441 COG5110 RPN1 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 881
4539 227442 COG5111 RPC34 DNA-directed RNA polymerase III, subunit C34 [Transcription]. 301
4540 227443 COG5112 UFD2 U1-like Zn-finger-containing protein [General function prediction only]. 126
4541 227444 COG5113 UFD2 Ubiquitin fusion degradation protein 2 [Posttranslational modification, protein turnover, chaperones]. 929
4542 227445 COG5114 COG5114 Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]. 432
4543 227446 COG5116 RPN2 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 926
4544 227447 COG5117 NOC3 Protein involved in the nuclear export of pre-ribosomes [Translation, ribosomal structure and biogenesis / Intracellular trafficking and secretion]. 657
4545 227448 COG5118 BDP1 Transcription initiation factor TFIIIB, Bdp1 subunit [Transcription]. 507
4546 227449 COG5119 COG5119 Uncharacterized protein, contains ParB-like nuclease domain [General function prediction only]. 119
4547 227450 COG5120 GOT1 Membrane protein involved in Golgi transport [Intracellular trafficking and secretion]. 129
4548 227451 COG5122 TRS23 Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 134
4549 227452 COG5123 TOA2 Transcription initiation factor IIA, gamma subunit [Transcription]. 113
4550 227453 COG5124 COG5124 Protein predicted to be involved in meiotic recombination [Cell division and chromosome partitioning / General function prediction only]. 209
4551 227454 COG5125 COG5125 Uncharacterized conserved protein [Function unknown]. 259
4552 227455 COG5126 FRQ1 Ca2+-binding protein, EF-hand superfamily [Signal transduction mechanisms]. 160
4553 227456 COG5127 COG5127 Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]. 383
4554 227457 COG5128 COG5128 Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 208
4555 227458 COG5129 MAK16 Nuclear protein with HMG-like acidic region [General function prediction only]. 303
4556 227459 COG5130 YIP3 Prenylated rab acceptor 1 and related proteins [Intracellular trafficking and secretion / Signal transduction mechanisms]. 169
4557 227460 COG5131 URM1 Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones]. 96
4558 227461 COG5132 BUD31 Cell cycle control protein, G10 family [Transcription / Cell division and chromosome partitioning]. 146
4559 227462 COG5133 COG5133 Uncharacterized conserved protein [Function unknown]. 181
4560 227463 COG5134 COG5134 Uncharacterized conserved protein [Function unknown]. 272
4561 227464 COG5135 COG5135 Uncharacterized protein [Function unknown]. 245
4562 227465 COG5136 COG5136 U1 snRNP-specific protein C [RNA processing and modification]. 188
4563 227466 COG5137 COG5137 Histone chaperone involved in gene silencing [Transcription / Chromatin structure and dynamics]. 279
4564 227467 COG5138 COG5138 Uncharacterized conserved protein [Function unknown]. 168
4565 227468 COG5139 COG5139 Uncharacterized conserved protein [Function unknown]. 397
4566 227469 COG5140 UFD1 Ubiquitin fusion-degradation protein [Posttranslational modification, protein turnover, chaperones]. 331
4567 227470 COG5141 COG5141 PHD zinc finger-containing protein [General function prediction only]. 669
4568 227471 COG5142 OXR1 Oxidation resistance protein [DNA replication, recombination, and repair]. 212
4569 227472 COG5143 SNC1 Synaptobrevin/VAMP-like protein [Intracellular trafficking and secretion]. 190
4570 227473 COG5144 TFB2 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB2 [Transcription / DNA replication, recombination, and repair]. 447
4571 227474 COG5145 RAD14 DNA excision repair protein [DNA replication, recombination, and repair]. 292
4572 227475 COG5146 PanK Pantothenate kinase [Coenzyme transport and metabolism]. 342
4573 227476 COG5147 REB1 Myb superfamily proteins, including transcription factors and mRNA splicing factors [Transcription / RNA processing and modification / Cell division and chromosome partitioning]. 512
4574 227477 COG5148 RPN10 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones]. 243
4575 227478 COG5149 TOA1 Transcription initiation factor IIA, large chain [Transcription]. 293
4576 227479 COG5150 COG5150 Class 2 transcription repressor NC2, beta subunit (Dr1) [Transcription]. 148
4577 227480 COG5151 SSL1 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription / DNA replication, recombination, and repair]. 421
4578 227481 COG5152 COG5152 Uncharacterized conserved protein, contains RING and CCCH-type Zn-fingers [General function prediction only]. 259
4579 227482 COG5153 CVT17 Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking, secretion, and vesicular transport]. 425
4580 227483 COG5154 BRX1 RNA-binding protein required for 60S ribosomal subunit biogenesis [Translation, ribosomal structure and biogenesis]. 283
4581 227484 COG5155 ESP1 Separase, a protease involved in sister chromatid separation [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 1622
4582 227485 COG5156 DOC1 Anaphase-promoting complex (APC), subunit 10 [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 189
4583 227486 COG5157 CDC73 RNA polymerase II accessory factor [Transcription]. 362
4584 227487 COG5158 SEC1 Proteins involved in synaptic transmission and general secretion, Sec1 family [Intracellular trafficking and secretion]. 582
4585 227488 COG5159 RPN6 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 421
4586 227489 COG5160 ULP1 Protease, Ulp1 family [Posttranslational modification, protein turnover, chaperones]. 578
4587 227490 COG5161 SFT1 Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]. 1319
4588 227491 COG5162 COG5162 Transcription initiation factor TFIID, subunit TAF10 (also component of histone acetyltransferase SAGA) [Transcription]. 197
4589 227492 COG5163 NOP7 Protein required for biogenesis of the 60S ribosomal subunit [Translation, ribosomal structure and biogenesis]. 591
4590 227493 COG5164 SPT5 Transcription elongation factor [Transcription]. 607
4591 227494 COG5165 POB3 Nucleosome-binding factor SPN, POB3 subunit [Transcription / DNA replication, recombination, and repair / Chromatin structure and dynamics]. 508
4592 227495 COG5166 COG5166 Uncharacterized conserved protein [Function unknown]. 657
4593 227496 COG5167 VID27 Protein involved in vacuole import and degradation [Intracellular trafficking and secretion]. 776
4594 227497 COG5169 HSF1 Heat shock transcription factor [Transcription]. 282
4595 227498 COG5170 CDC55 Serine/threonine protein phosphatase 2A, regulatory subunit [Signal transduction mechanisms]. 460
4596 227499 COG5171 YRB1 Ran GTPase-activating protein (Ran-binding protein) [Intracellular trafficking and secretion]. 211
4597 227500 COG5173 SEC6 Exocyst complex subunit SEC6 [Intracellular trafficking and secretion]. 742
4598 227501 COG5174 TFA2 Transcription initiation factor IIE, beta subunit [Transcription]. 285
4599 227502 COG5175 MOT2 Transcriptional repressor [Transcription]. 480
4600 227503 COG5176 MSL5 Splicing factor (branch point binding protein) [RNA processing and modification]. 269
4601 227504 COG5177 COG5177 Uncharacterized conserved protein [Function unknown]. 769
4602 227505 COG5178 PRP8 U5 snRNP spliceosome subunit [RNA processing and modification]. 2365
4603 227506 COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]. 968
4604 227507 COG5180 PBP1 PAB1-binding protein PBP1, interacts with poly(A)-binding protein [RNA processing and modification]. 654
4605 227508 COG5181 HSH155 U2 snRNP spliceosome subunit [RNA processing and modification]. 975
4606 227509 COG5182 CUS1 Splicing factor 3b, subunit 2 [RNA processing and modification]. 429
4607 227510 COG5183 SSM4 E3 ubiquitin-protein ligase DOA10 [Posttranslational modification, protein turnover, chaperones]. 1175
4608 227511 COG5184 ATS1 Alpha-tubulin suppressor and related RCC1 domain-containing proteins [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 476
4609 227512 COG5185 HEC1 Protein involved in chromosome segregation, interacts with SMC proteins [Cell cycle control, cell division, chromosome partitioning]. 622
4610 227513 COG5186 PAP1 Poly(A) polymerase Pap1 [RNA processing and modification]. 552
4611 227514 COG5187 RPN7 26S proteasome regulatory complex component, contains PCI domain [Posttranslational modification, protein turnover, chaperones]. 412
4612 227515 COG5188 PRP9 Splicing factor 3a, subunit 3 [RNA processing and modification]. 470
4613 227516 COG5189 SFP1 Putative transcriptional repressor regulating G2/M transition [Transcription / Cell division and chromosome partitioning]. 423
4614 227517 COG5190 FCP1 TFIIF-interacting CTD phosphatase, includes NLI-interacting factor [Transcription]. 390
4615 227518 COG5191 COG5191 Uncharacterized conserved protein, contains HAT (Half-A-TPR) repeat [General function prediction only]. 435
4616 227519 COG5192 BMS1 GTP-binding protein required for 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 1077
4617 227520 COG5193 LHP1 La protein, small RNA-binding pol III transcript stabilizing protein and related La-motif-containing proteins involved in translation [Posttranslational modification, protein turnover, chaperones / Translation, ribosomal structure and biogenesis]. 438
4618 227521 COG5194 APC11 Component of SCF ubiquitin ligase and anaphase-promoting complex [Posttranslational modification, protein turnover, chaperones / Cell division and chromosome partitioning]. 88
4619 227522 COG5195 COG5195 Uncharacterized conserved protein [Function unknown]. 118
4620 227523 COG5196 ERD2 ER lumen protein retaining receptor [Intracellular trafficking and secretion]. 214
4621 227524 COG5197 COG5197 Predicted membrane protein [Function unknown]. 284
4622 227525 COG5198 Ptpl Protein tyrosine phosphatase-like protein (contains Pro instead of catalytic Arg) [General function prediction only]. 209
4623 227526 COG5199 SCP1 Calponin [Cytoskeleton]. 178
4624 227527 COG5200 LUC7 U1 snRNP component, mediates U1 snRNP association with cap-binding complex [RNA processing and modification]. 258
4625 227528 COG5201 SKP1 SCF ubiquitin ligase, SKP1 component [Posttranslational modification, protein turnover, chaperones]. 158
4626 227529 COG5202 COG5202 Predicted membrane protein [Function unknown]. 512
4627 227530 COG5204 SPT4 Transcription elongation factor SPT4 [Transcription]. 112
4628 227531 COG5206 GPI8 Glycosylphosphatidylinositol transamidase (GPIT), subunit GPI8 [Posttranslational modification, protein turnover, chaperones]. 382
4629 227532 COG5207 UBP14 Uncharacterized Zn-finger protein, UBP-type [General function prediction only]. 749
4630 227533 COG5208 HAP5 CCAAT-binding factor, subunit C [Transcription]. 286
4631 227534 COG5209 RCD1 Uncharacterized protein involved in cell differentiation/sexual development [General function prediction only]. 315
4632 227535 COG5210 COG5210 GTPase-activating protein [General function prediction only]. 496
4633 227536 COG5211 SSU72 RNA polymerase II-interacting protein involved in transcription start site selection [Transcription]. 197
4634 227537 COG5212 PDE1 cAMP phosphodiesterase [Signal transduction mechanisms]. 356
4635 227538 COG5213 FIP1 Polyadenylation factor I complex, subunit FIP1 [RNA processing and modification]. 266
4636 227539 COG5214 POL12 DNA polymerase alpha-primase complex, polymerase-associated subunit B [DNA replication, recombination, and repair]. 581
4637 227540 COG5215 KAP95 Karyopherin (importin) beta [Intracellular trafficking and secretion]. 858
4638 227541 COG5216 COG5216 Uncharacterized conserved protein [Function unknown]. 67
4639 227542 COG5217 BIM1 Microtubule-binding protein involved in cell cycle control [Cell division and chromosome partitioning / Cytoskeleton]. 342
4640 227543 COG5218 YCG1 Chromosome condensation complex Condensin, subunit G [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 885
4641 227544 COG5219 COG5219 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 1525
4642 227545 COG5220 TFB3 Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB3 [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 314
4643 227546 COG5221 DOP1 Dopey and related predicted leucine zipper transcription factors [Transcription]. 1618
4644 227547 COG5222 COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 427
4645 227548 COG5223 COG5223 Uncharacterized conserved protein [Function unknown]. 240
4646 227549 COG5224 HAP2 CCAAT-binding factor, subunit B [Transcription]. 248
4647 227550 COG5225 RRS1 Uncharacterized protein involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 172
4648 227551 COG5226 CEG1 mRNA capping enzyme, guanylyltransferase (alpha) subunit [RNA processing and modification]. 404
4649 227552 COG5227 SMT3 Ubiquitin-like protein (sentrin) [Posttranslational modification, protein turnover, chaperones]. 103
4650 227553 COG5228 POP2 mRNA deadenylase subunit [RNA processing and modification]. 299
4651 227554 COG5229 LOC7 Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 662
4652 227555 COG5230 COG5230 Uncharacterized conserved protein [Function unknown]. 194
4653 227556 COG5231 VMA13 Vacuolar H+-ATPase V1 sector, subunit H [Energy production and conversion]. 432
4654 227557 COG5232 SEC62 Preprotein translocase subunit Sec62 [Intracellular trafficking and secretion]. 259
4655 227558 COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]. 417
4656 227559 COG5234 CIN1 Beta-tubulin folding cofactor D [Posttranslational modification, protein turnover, chaperones / Cytoskeleton]. 993
4657 227560 COG5235 RFA2 Single-stranded DNA-binding replication protein A (RPA), medium (30 kD) subunit [DNA replication, recombination, and repair]. 258
4658 227561 COG5236 COG5236 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 493
4659 227562 COG5237 PER1 Predicted membrane protein [Function unknown]. 319
4660 227563 COG5238 RNA1 Ran GTPase-activating protein (RanGAP) involved in mRNA processing and transport [Signal transduction mechanisms, RNA processing and modification]. 388
4661 227564 COG5239 CCR4 mRNA deadenylase, 3'-5' endonuclease subunit Ccr4 [RNA processing and modification]. 378
4662 227565 COG5240 SEC21 Vesicle coat complex COPI, gamma subunit [Intracellular trafficking and secretion]. 898
4663 227566 COG5241 RAD10 Nucleotide excision repair endonuclease NEF1, RAD10 subunit [DNA replication, recombination, and repair]. 224
4664 227567 COG5242 TFB4 RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair]. 296
4665 227568 COG5243 HRD1 HRD ubiquitin ligase complex, ER membrane component [Posttranslational modification, protein turnover, chaperones]. 491
4666 227569 COG5244 NIP100 Dynactin complex subunit involved in mitotic spindle partitioning in anaphase B [Cell cycle control, cell division, chromosome partitioning]. 669
4667 227570 COG5245 DYN1 Dynein, heavy chain [Cytoskeleton]. 3164
4668 227571 COG5246 PRP11 Splicing factor 3a, subunit 2 [RNA processing and modification]. 222
4669 227572 COG5247 BUR6 Class 2 transcription repressor NC2, alpha subunit (DRAP1 homolog) [Transcription]. 113
4670 227573 COG5248 TAF19 Transcription initiation factor TFIID, subunit TAF13 [Transcription]. 126
4671 227574 COG5249 RER1 Golgi protein involved in Golgi-to-ER retrieval [Intracellular trafficking and secretion]. 180
4672 227575 COG5250 RPB4 RNA polymerase II, fourth largest subunit [Transcription]. 138
4673 227576 COG5251 TAF40 Transcription initiation factor TFIID, subunit TAF11 [Transcription]. 199
4674 227577 COG5252 COG5252 Uncharacterized conserved protein, contains CCCH-type Zn-finger protein [General function prediction only]. 299
4675 227578 COG5253 MSS4 Phosphatidylinositol-4-phosphate 5-kinase [Signal transduction mechanisms]. 612
4676 227579 COG5254 ARV1 Predicted membrane protein [Function unknown]. 239
4677 227580 COG5255 COG5255 Uncharacterized protein [Function unknown]. 239
4678 227581 COG5256 TEF1 Translation elongation factor EF-1alpha (GTPase) [Translation, ribosomal structure and biogenesis]. 428
4679 227582 COG5257 GCD11 Translation initiation factor 2, gamma subunit (eIF-2gamma; GTPase) [Translation, ribosomal structure and biogenesis]. 415
4680 227583 COG5258 GTPBP1 GTPase [General function prediction only]. 527
4681 227584 COG5259 RSC8 RSC chromatin remodeling complex subunit RSC8 [Chromatin structure and dynamics / Transcription]. 531
4682 227585 COG5260 TRF4 DNA polymerase sigma [Replication, recombination and repair]. 482
4683 227586 COG5261 IQG1 Protein involved in regulation of cellular morphogenesis/cytokinesis [Cell division and chromosome partitioning / Signal transduction mechanisms]. 1054
4684 227587 COG5262 HTA1 Histone H2A [Chromatin structure and dynamics]. 132
4685 227588 COG5263 COG5263 Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism]. 313
4686 227589 COG5264 VTC1 Vacuolar transporter chaperone [Posttranslational modification, protein turnover, chaperones]. 126
4687 227590 COG5265 ATM1 ABC-type transport system involved in Fe-S cluster assembly, permease and ATPase components [Posttranslational modification, protein turnover, chaperones]. 497
4688 227591 COG5266 COG5266 Uncharacterized conserved protein, contains GH25 family domain [General function prediction only]. 264
4689 227592 COG5267 COG5267 Uncharacterized conserved protein, DUF1800 family [Function unknown]. 496
4690 227593 COG5268 TrbD Type IV secretory pathway, TrbD component [Intracellular trafficking, secretion, and vesicular transport]. 93
4691 227594 COG5269 ZUO1 Ribosome-associated chaperone zuotin [Translation, ribosomal structure and biogenesis / Posttranslational modification, protein turnover, chaperones]. 379
4692 227595 COG5270 PUA PUA domain (predicted RNA-binding domain) [Translation, ribosomal structure and biogenesis]. 202
4693 227596 COG5271 MDN1 Midasin, AAA ATPase with vWA domain, involved in ribosome maturation [Translation, ribosomal structure and biogenesis]. 4600
4694 319244 COG5272 UBI4 UBI4; linked to 3D-structure 74
4695 227598 COG5273 COG5273 Uncharacterized protein containing DHHC-type Zn finger [General function prediction only]. 309
4696 227599 COG5274 CYB5 Cytochrome b involved in lipid metabolism [Energy production and conversion, Lipid transport and metabolism]. 164
4697 227600 COG5275 COG5275 BRCT domain type II [General function prediction only]. 276
4698 227601 COG5276 COG5276 Uncharacterized conserved protein [Function unknown]. 370
4699 227602 COG5277 COG5277 Actin-related protein [Cytoskeleton]. 444
4700 227603 COG5278 CHASE3 Extracellular (periplasmic) sensor domain CHASE3 (specificity unknown) [Signal transduction mechanisms]. 207
4701 227604 COG5279 CYK3 Cytokinesis protein 3, contains TGc (transglutaminase/protease-like) domain [Cell cycle control, cell division, chromosome partitioning]. 521
4702 227605 COG5280 YqbO Phage-related minor tail protein [Mobilome: prophages, transposons]. 634
4703 227606 COG5281 COG5281 Phage-related minor tail protein [Mobilome: prophages, transposons]. 833
4704 227607 COG5282 COG5282 Uncharacterized conserved protein, DUF2342 family [Function unknown]. 359
4705 227608 COG5283 COG5283 Phage-related tail protein [Mobilome: prophages, transposons]. 1213
4706 227609 COG5285 PhyH Ectoine hydroxylase-related dioxygenase, phytanoyl-CoA dioxygenase (PhyH) family [Secondary metabolites biosynthesis, transport and catabolism]. 299
4707 227610 COG5290 COG5290 IkappaB kinase complex, IKAP component [Transcription]. 1243
4708 227611 COG5291 COG5291 Predicted membrane protein [Function unknown]. 313
4709 227612 COG5293 YydB Uncharacterized protein YydD, contains DUF2326 domain [Function unknown]. 591
4710 227613 COG5294 YxeA Uncharacterized protein YxeA, DUF1093 family [Function unknown]. 113
4711 227614 COG5295 Hia Autotransporter adhesin [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 715
4712 227615 COG5296 COG5296 Transcription factor involved in TATA site selection and in elongation by RNA polymerase II [Transcription]. 521
4713 227616 COG5297 CelA1 Cellulase/cellobiase CelA1 [Carbohydrate transport and metabolism]. 544
4714 227617 COG5298 YdaL Predicted metal-dependent carbohydrate esterase YdaL, contains NodB-like catalytic (CE4) domain [General function prediction only]. 530
4715 227618 COG5301 COG5301 Phage-related tail fibre protein [Mobilome: prophages, transposons]. 587
4716 227619 COG5302 COG5302 Post-segregation antitoxin (ccd killing mechanism protein) encoded by the F plasmid [Mobilome: prophages, transposons]. 80
4717 227620 COG5304 COG5304 Predicted DNA binding protein, CopG/RHH family [Transcription]. 92
4718 227621 COG5305 COG5305 Uncharacterized membrane protein [Function unknown]. 552
4719 227622 COG5306 COG5306 Uncharacterized protein [Function unknown]. 621
4720 227623 COG5307 COG5307 Guanine-nucleotide exchange factor, contains Sec7 domain [General function prediction only]. 1024
4721 227624 COG5308 NUP170 Nuclear pore complex subunit [Intracellular trafficking and secretion]. 1263
4722 227625 COG5309 Scw11 Exo-beta-1,3-glucanase, GH17 family [Carbohydrate transport and metabolism]. 305
4723 227626 COG5310 COG5310 Homospermidine synthase [Secondary metabolites biosynthesis, transport and catabolism]. 481
4724 227627 COG5314 COG5314 Conjugal transfer/entry exclusion protein [Mobilome: prophages, transposons]. 252
4725 227628 COG5316 COG5316 Uncharacterized protein [Function unknown]. 421
4726 227629 COG5317 COG5317 Uncharacterized protein [Function unknown]. 175
4727 227630 COG5319 COG5319 Uncharacterized protein [Function unknown]. 142
4728 227631 COG5321 COG5321 Uncharacterized protein [Function unknown]. 164
4729 227632 COG5322 COG5322 Predicted amino acid dehydrogenase [General function prediction only]. 351
4730 227633 COG5323 COG5323 Large terminase phage packaging protein [Mobilome: prophages, transposons]. 410
4731 227634 COG5324 Trl1 tRNA splicing ligase [Translation, ribosomal structure and biogenesis]. 758
4732 227635 COG5325 COG5325 t-SNARE complex subunit, syntaxin [Intracellular trafficking and secretion]. 283
4733 227636 COG5328 COG5328 Uncharacterized protein, UPF0262 family [Function unknown]. 160
4734 227637 COG5329 COG5329 Phosphoinositide polyphosphatase (Sac family) [Signal transduction mechanisms]. 570
4735 227638 COG5330 COG5330 Uncharacterized conserved protein, DUF2336 family [Function unknown]. 364
4736 227639 COG5331 COG5331 Uncharacterized protein [Function unknown]. 139
4737 227640 COG5333 CCL1 Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH/TFIIK, cyclin H subunit [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 297
4738 227641 COG5336 AtpI2 FoF1-type ATP synthase assembly protein I [Energy production and conversion]. 116
4739 227642 COG5337 CotH Spore coat protein CotH [Cell wall/membrane/envelope biogenesis]. 473
4740 227643 COG5338 COG5338 Uncharacterized protein [Function unknown]. 468
4741 227644 COG5339 YdgA Uncharacterized conserved protein YdgA, DUF945 family [Function unknown]. 479
4742 227645 COG5340 COG5340 Transcriptional regulator, predicted component of viral defense system [Defense mechanisms]. 269
4743 227646 COG5341 COG5341 Uncharacterized protein [Function unknown]. 132
4744 227647 COG5342 IalB Invasion protein IalB, involved in pathogenesis [General function prediction only]. 181
4745 227648 COG5343 RskA Anti-sigma-K factor RskA [Signal transduction mechanisms]. 240
4746 227649 COG5345 COG5345 Uncharacterized protein [Function unknown]. 358
4747 227650 COG5346 COG5346 Uncharacterized membrane protein [Function unknown]. 136
4748 227651 COG5347 COG5347 GTPase-activating protein that regulates ARFs (ADP-ribosylation factors), involved in ARF-mediated vesicular transport [Intracellular trafficking and secretion]. 319
4749 227652 COG5349 COG5349 Uncharacterized conserved protein, DUF983 family [Function unknown]. 126
4750 227653 COG5350 COG5350 Predicted protein tyrosine phosphatase [General function prediction only]. 172
4751 227654 COG5351 COG5351 Uncharacterized protein [Function unknown]. 367
4752 227655 COG5352 COG5352 Uncharacterized protein [Function unknown]. 169
4753 227656 COG5353 YpmB Uncharacterized protein YpmB, contains C-terminal PepSY domain [Function unknown]. 161
4754 227657 COG5354 COG5354 Uncharacterized protein, contains Trp-Asp (WD) repeat [General function prediction only]. 561
4755 227658 COG5360 COG5360 Uncharacterized conserved protein, heparinase superfamily [Function unknown]. 566
4756 227659 COG5361 COG5361 Uncharacterized conserved protein [Mobilome: prophages, transposons]. 458
4757 227660 COG5362 COG5362 Phage terminase large subunit [Mobilome: prophages, transposons]. 202
4758 227661 COG5366 COG5366 Protein involved in propagation of M2 dsRNA satellite of L-A virus [General function prediction only]. 531
4759 227662 COG5368 COG5368 Uncharacterized protein [Function unknown]. 451
4760 227663 COG5369 COG5369 Uncharacterized conserved protein [Function unknown]. 743
4761 227664 COG5371 COG5371 Golgi nucleoside diphosphatase [Nucleotide transport and metabolism]. 549
4762 227665 COG5373 COG5373 Uncharacterized membrane protein [Function unknown]. 931
4763 227666 COG5374 COG5374 Uncharacterized conserved protein [Function unknown]. 192
4764 227667 COG5375 COG5375 Uncharacterized protein [Function unknown]. 216
4765 227668 COG5377 COG5377 Phage-related protein, predicted endonuclease [Mobilome: prophages, transposons]. 319
4766 227669 COG5378 COG5378 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 175
4767 227670 COG5379 BtaA S-adenosylmethionine:diacylglycerol 3-amino-3-carboxypropyl transferase [Lipid transport and metabolism]. 414
4768 227671 COG5380 LimK Lipase chaperone LimK [Posttranslational modification, protein turnover, chaperones]. 283
4769 227672 COG5381 COG5381 Uncharacterized protein [Function unknown]. 184
4770 227673 COG5383 YdcJ Uncharacterized metalloenzyme YdcJ, glyoxalase superfamily [General function prediction only]. 295
4771 227674 COG5384 Mpp10 U3 small nucleolar ribonucleoprotein component [Translation, ribosomal structure and biogenesis]. 569
4772 227675 COG5385 COG5385 Uncharacterized protein [Function unknown]. 214
4773 227676 COG5386 NEAT Heme-binding NEAT domain [Inorganic ion transport and metabolism]. 352
4774 227677 COG5387 Atp12 Chaperone required for the assembly of the mitochondrial F1-ATPase [Posttranslational modification, protein turnover, chaperones]. 264
4775 227678 COG5388 COG5388 Uncharacterized protein [Function unknown]. 209
4776 227679 COG5389 COG5389 Uncharacterized protein [Function unknown]. 181
4777 227680 COG5391 COG5391 Phox homology (PX) domain protein [Intracellular trafficking and secretion / General function prediction only]. 524
4778 227681 COG5393 YqjE Uncharacterized membrane protein YqjE [Function unknown]. 131
4779 227682 COG5394 COG5394 Polyhydroxyalkanoate (PHA) synthesis regulator protein, binds DNA and PHA [Secondary metabolites biosynthesis, transport and catabolism, Signal transduction mechanisms]. 193
4780 227683 COG5395 COG5395 Uncharacterized membrane protein [Function unknown]. 131
4781 227684 COG5397 COG5397 Uncharacterized protein [Function unknown]. 349
4782 227685 COG5398 COG5398 Heme oxygenase [Coenzyme transport and metabolism]. 238
4783 227686 COG5399 COG5399 Uncharacterized protein [Function unknown]. 139
4784 227687 COG5400 COG5400 Uncharacterized protein [Function unknown]. 205
4785 227688 COG5401 GerM Spore germination protein GerM [Cell cycle control, cell division, chromosome partitioning]. 250
4786 227689 COG5402 COG5402 Uncharacterized protein [Function unknown]. 194
4787 227690 COG5403 COG5403 Uncharacterized protein [Function unknown]. 285
4788 227691 COG5404 SulA Cell division inhibitor SulA, prevents FtsZ ring assembly [Cell cycle control, cell division, chromosome partitioning]. 169
4789 227692 COG5405 HslV ATP-dependent protease HslVU (ClpYQ), peptidase subunit [Posttranslational modification, protein turnover, chaperones]. 178
4790 227693 COG5406 COG5406 Nucleosome binding factor SPN, SPT16 subunit [Transcription, Replication, recombination and repair, Chromatin structure and dynamics]. 1001
4791 227694 COG5407 SEC63 Preprotein translocase subunit Sec63 [Intracellular trafficking, secretion, and vesicular transport]. 610
4792 227695 COG5408 COG5408 SPX domain-containing protein [Signal transduction mechanisms]. 296
4793 227696 COG5409 COG5409 EXS domain-containing protein [Signal transduction mechanisms]. 384
4794 227697 COG5410 COG5410 Uncharacterized protein [Function unknown]. 305
4795 227698 COG5411 COG5411 Phosphatidylinositol 5-phosphate phosphatase [Signal transduction mechanisms]. 460
4796 227699 COG5412 COG5412 Phage-related protein [Mobilome: prophages, transposons]. 637
4797 227700 COG5413 COG5413 Uncharacterized integral membrane protein [Function unknown]. 168
4798 227701 COG5414 Taf7 TATA-binding protein-associated factor Taf7, part of the TFIID transcription initiation complex [Transcription]. 392
4799 227702 COG5415 COG5415 Predicted integral membrane metal-binding protein [General function prediction only]. 251
4800 227703 COG5416 YrvD Uncharacterized integral membrane protein [Function unknown]. 98
4801 227704 COG5417 YukD Uncharacterized ubiquitin-like protein YukD [Function unknown]. 81
4802 227705 COG5418 COG5418 Predicted secreted protein [Function unknown]. 164
4803 227706 COG5419 COG5419 Uncharacterized protein [Function unknown]. 160
4804 227707 COG5420 COG5420 Uncharacterized protein [Function unknown]. 71
4805 227708 COG5421 COG5421 Transposase [Mobilome: prophages, transposons]. 480
4806 227709 COG5422 ROM1 RhoGEF, Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases [Signal transduction mechanisms]. 1175
4807 227710 COG5423 COG5423 Predicted metal-binding protein [Function unknown]. 167
4808 227711 COG5424 PqqC Pyrroloquinoline quinone (PQQ) biosynthesis protein C [Coenzyme transport and metabolism]. 242
4809 227712 COG5425 Usg Usg protein (tryptophan operon, function unknown) [Function unknown]. 90
4810 227713 COG5426 COG5426 Uncharacterized membrane protein [Function unknown]. 254
4811 227714 COG5427 COG5427 Uncharacterized membrane protein [Function unknown]. 684
4812 227715 COG5428 YuzE Uncharacterized protein YuzE, DUF2283 family [Function unknown]. 69
4813 227716 COG5429 COG5429 Uncharacterized protein [Function unknown]. 261
4814 227717 COG5430 SCPU Spore coat protein U (SCPU) domain, function unknown [Function unknown]. 174
4815 227718 COG5431 COG5431 Predicted nucleic acid-binding protein, contains Zn-finger domain [General function prediction only]. 117
4816 227719 COG5432 RAD18 RING-finger-containing E3 ubiquitin ligase [Signal transduction mechanisms]. 391
4817 227720 COG5433 YhhI Predicted transposase YbfD/YdcC associated with H repeats [Mobilome: prophages, transposons]. 121
4818 227721 COG5434 Pgu1 Polygalacturonase [Carbohydrate transport and metabolism]. 542
4819 227722 COG5435 COG5435 Uncharacterized protein [Function unknown]. 147
4820 227723 COG5436 COG5436 Uncharacterized membrane protein [Function unknown]. 182
4821 227724 COG5437 COG5437 Predicted secreted protein [Function unknown]. 138
4822 227725 COG5438 COG5438 Uncharacterized membrane protein [Function unknown]. 385
4823 227726 COG5439 COG5439 Uncharacterized protein [Function unknown]. 112
4824 227727 COG5440 COG5440 Uncharacterized protein [Function unknown]. 161
4825 227728 COG5441 COG5441 Uncharacterized protein, UPF0261 family [Function unknown]. 401
4826 227729 COG5442 FlaF Flagellar biosynthesis regulator FlaF [Cell motility]. 115
4827 227730 COG5443 FlbT Flagellar biosynthesis regulator FlbT [Cell motility]. 148
4828 227731 COG5444 YeeF Predicted ribonuclease, toxin component of the YeeF-YezG toxin-antitoxin module [Defense mechanisms]. 565
4829 227732 COG5445 YfaQ Uncharacterized conserved protein YfaQ, DUF2300 domain [Function unknown]. 268
4830 227733 COG5446 CbtA Uncharacterized membrane protein, predicted cobalt transporter CbtA [General function prediction only]. 233
4831 227734 COG5447 COG5447 Uncharacterized protein [Function unknown]. 115
4832 227735 COG5448 COG5448 Uncharacterized protein [Function unknown]. 184
4833 227736 COG5449 COG5449 Uncharacterized protein [Function unknown]. 225
4834 227737 COG5450 VapB6 Transcription regulator of the Arc/MetJ class [Transcription]. 84
4835 227738 COG5451 COG5451 Predicted secreted protein [Function unknown]. 128
4836 227739 COG5452 COG5452 Uncharacterized protein [Function unknown]. 180
4837 227740 COG5453 COG5453 Uncharacterized protein [Function unknown]. 96
4838 227741 COG5454 COG5454 Predicted secreted protein [Function unknown]. 89
4839 227742 COG5455 RcnB Periplasmic regulator RcnB of Ni and Co efflux [Inorganic ion transport and metabolism]. 129
4840 227743 COG5456 FixH Nitrogen fixation protein FixH [Inorganic ion transport and metabolism]. 166
4841 227744 COG5457 YjiS Uncharacterized conserved protein YjiS, DUF1127 family [Function unknown]. 63
4842 227745 COG5458 COG5458 Uncharacterized protein [Function unknown]. 144
4843 227746 COG5459 Rsm22 Ribosomal protein RSM22 (predicted mitochondrial rRNA methylase) [Translation, ribosomal structure and biogenesis]. 484
4844 227747 COG5460 COG5460 Uncharacterized conserved protein, DUF2164 family [Function unknown]. 82
4845 227748 COG5461 CpaD Type IV pilus biogenesis protein CpaD/CtpE [Extracellular structures]. 224
4846 227749 COG5462 COG5462 Predicted secreted (periplasmic) protein [Function unknown]. 138
4847 227750 COG5463 YgiB Uncharacterized conserved protein YgiB, involved in biofilm formation, UPF0441/DUF1190 family [Function unknown]. 198
4848 227751 COG5464 YadD Predicted transposase YdaD [Replication, recombination and repair]. 289
4849 227752 COG5465 COG5465 Uncharacterized protein [Function unknown]. 166
4850 227753 COG5466 COG5466 Predicted small metal-binding protein [Function unknown]. 59
4851 227754 COG5467 COG5467 Uncharacterized protein [Function unknown]. 104
4852 227755 COG5468 COG5468 Predicted secreted (periplasmic) protein [Function unknown]. 172
4853 227756 COG5469 COG5469 Predicted metal-binding protein [Function unknown]. 143
4854 227757 COG5470 COG5470 Uncharacterized conserved protein, DUF1330 family [Function unknown]. 96
4855 227758 COG5471 COG5471 Predicted phage recombinase, RecA/RadA family [Mobilome: prophages, transposons]. 107
4856 227759 COG5472 COG5472 Predicted small integral membrane protein [Function unknown]. 164
4857 227760 COG5473 COG5473 Uncharacterized membrane protein [Function unknown]. 290
4858 227761 COG5474 COG5474 Uncharacterized protein [Function unknown]. 159
4859 227762 COG5475 YodC Uncharacterized conserved protein YodC, DUF2158 family [Function unknown]. 60
4860 227763 COG5476 COG5476 Microcystin degradation protein MlrC, contains DUF1485 domain [General function prediction only]. 488
4861 227764 COG5477 COG5477 Predicted small integral membrane protein [Function unknown]. 97
4862 227765 COG5478 Fet4 Low affinity Fe/Cu permease [Inorganic ion transport and metabolism]. 141
4863 227766 COG5479 Psp3 Uncharacterized conserved protein, contains LGFP repeats [Function unknown]. 556
4864 227767 COG5480 COG5480 Uncharacterized membrane protein [Function unknown]. 147
4865 227768 COG5481 COG5481 Uncharacterized protein [Function unknown]. 67
4866 227769 COG5482 COG5482 Uncharacterized protein [Function unknown]. 229
4867 227770 COG5483 COG5483 Uncharacterized conserved protein, DUF488 family [Function unknown]. 289
4868 227771 COG5484 YjcR Uncharacterized protein YjcR, contains N-terminal HTH domain [Function unknown]. 279
4869 227772 COG5485 COG5485 Predicted ester cyclase [General function prediction only]. 131
4870 227773 COG5486 COG5486 Predicted metal-binding membrane protein [Function unknown]. 283
4871 227774 COG5487 YtjA Uncharacterized membrane protein YtjA, UPF0391 family [Function unknown]. 54
4872 227775 COG5488 COG5488 Uncharacterized membrane protein [Function unknown]. 164
4873 227776 COG5489 COG5489 Uncharacterized conserved protein, DUF736 family [Function unknown]. 107
4874 227777 COG5490 COG5490 Uncharacterized protein [Function unknown]. 158
4875 227778 COG5491 Did4 Archaeal division protein CdvB, Snf7/Vps24/ESCRT-III family [Cell cycle control, cell division, chromosome partitioning]. 204
4876 227779 COG5492 YjdB Uncharacterized conserved protein YjdB, contains Ig-like domain [General function prediction only]. 329
4877 227780 COG5493 COG5493 Uncharacterized protein [Function unknown]. 231
4878 227781 COG5494 COG5494 Predicted thioredoxin/glutaredoxin [Posttranslational modification, protein turnover, chaperones]. 265
4879 227782 COG5495 COG5495 Predicted oxidoreductase, contains short-chain dehydrogenase (SDR) and DUF2520 domains [General function prediction only]. 289
4880 227783 COG5496 COG5496 Predicted thioesterase [General function prediction only]. 130
4881 227784 COG5497 COG5497 Predicted secreted protein [Function unknown]. 228
4882 227785 COG5498 Acf2 Endoglucanase Acf2 [Carbohydrate transport and metabolism]. 760
4883 227786 COG5499 HigA Antitoxin component HigA of the HigAB toxin-antitoxin module, contains an N-terminal HTH domain [Defense mechanisms]. 120
4884 227787 COG5500 COG5500 Uncharacterized membrane protein [Function unknown]. 159
4885 227788 COG5501 COG5501 Predicted secreted protein [Function unknown]. 148
4886 227789 COG5502 COG5502 Uncharacterized conserved protein, DUF2267 family [Function unknown]. 135
4887 227790 COG5503 RpoEps DNA-dependent RNA polymerase auxiliary subunit epsilon [Transcription, Defense mechanisms]. 69
4888 227791 COG5504 YjaZ Predicted Zn-dependent protease YjaZ, DUF2268 family [General function prediction only]. 280
4889 227792 COG5505 COG5505 Uncharacterized membrane protein [Function unknown]. 384
4890 227793 COG5506 YueI Uncharacterized protein YueI, DUF2278 family [Function unknown]. 144
4891 227794 COG5507 YbaA Uncharacterized conserved protein YbaA, DUF1428 family [Function unknown]. 117
4892 227795 COG5508 COG5508 Uncharacterized protein [Function unknown]. 84
4893 227796 COG5509 COG5509 Uncharacterized small protein, DUF1192 family [Function unknown]. 65
4894 227797 COG5510 COG5510 Predicted small secreted protein [Function unknown]. 44
4895 227798 COG5511 COG5511 Bacteriophage capsid protein [Mobilome: prophages, transposons]. 492
4896 227799 COG5512 COG5512 Predicted nucleic acid-binding protein, contains Zn-ribbon domain (includes truncated derivatives) [General function prediction only]. 194
4897 227800 COG5513 COG5513 Predicted secreted protein [Function unknown]. 113
4898 227801 COG5514 COG5514 Uncharacterized protein [Function unknown]. 203
4899 227802 COG5515 COG5515 Uncharacterized protein [Function unknown]. 70
4900 227803 COG5516 COG5516 Conserved protein containing a Zn-ribbon-like motif, possibly RNA-binding [General function prediction only]. 196
4901 227804 COG5517 HcaF 3-phenylpropionate/cinnamic acid dioxygenase, small subunit [Secondary metabolites biosynthesis, transport and catabolism]. 164
4902 227805 COG5518 COG5518 Bacteriophage capsid portal protein [Mobilome: prophages, transposons]. 492
4903 227806 COG5519 COG5519 Uncharacterized protein, DUF927 family [Function unknown]. 562
4904 227807 COG5520 XynC O-Glycosyl hydrolase [Cell wall/membrane/envelope biogenesis]. 433
4905 227808 COG5521 YvdJ Maltodextrin utilization protein YvdJ (function unknown) [Carbohydrate transport and metabolism]. 275
4906 227809 COG5522 YwaF Uncharacterized membrane protein YwaF [Function unknown]. 236
4907 227810 COG5523 COG5523 Uncharacterized membrane protein [Function unknown]. 271
4908 227811 COG5524 COG5524 Bacteriorhodopsin [Energy production and conversion, Signal transduction mechanisms]. 285
4909 227812 COG5525 YbcX Phage terminase, large subunit GpA [Mobilome: prophages, transposons]. 611
4910 227813 COG5526 COG5526 Lysozyme family protein [General function prediction only]. 191
4911 227814 COG5527 COG5527 Protein involved in initiation of plasmid replication [Mobilome: prophages, transposons]. 342
4912 227815 COG5528 COG5528 Uncharacterized membrane protein [Function unknown]. 155
4913 227816 COG5529 COG5529 Pyocin large subunit [Secondary metabolites biosynthesis, transport and catabolism]. 326
4914 227817 COG5530 COG5530 Uncharacterized membrane protein [Function unknown]. 247
4915 227818 COG5531 Rsc6 Chromatin remodeling complex protein RSC6, contains SWIB domain [Chromatin structure and dynamics]. 237
4916 227819 COG5532 yfdQ Uncharacterized conserved protein YfdQ, DUF2303 family [Function unknown]. 269
4917 227820 COG5533 COG5533 Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 415
4918 227821 COG5534 RepA Plasmid replication initiator protein [Mobilome: prophages, transposons]. 383
4919 227822 COG5535 RAD4 DNA repair protein RAD4 [DNA replication, recombination, and repair]. 650
4920 227823 COG5536 BET4 Protein prenyltransferase, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 328
4921 227824 COG5537 IRR1 Cohesin [Cell division and chromosome partitioning]. 740
4922 227825 COG5538 SEC66 Endoplasmic reticulum translocation complex, subunit SEC66 [Cell motility and secretion]. 180
4923 227826 COG5539 COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]. 306
4924 227827 COG5540 COG5540 RING-finger-containing ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 374
4925 227828 COG5541 RET3 Vesicle coat complex COPI, zeta subunit [Posttranslational modification, protein turnover, chaperones]. 187
4926 227829 COG5542 COG5542 Mannosyltransferase related to Gpi18 [Carbohydrate transport and metabolism]. 420
4927 227830 COG5543 COG5543 Uncharacterized conserved protein [Function unknown]. 1400
4928 227831 COG5544 yfiM Uncharacterized conserved protein YfiM, DUF2279 family [Function unknown]. 101
4929 227832 COG5545 COG5545 Predicted P-loop ATPase and inactivated derivatives [Mobilome: prophages, transposons]. 517
4930 227833 COG5546 COG5546 Uncharacterized membrane protein [Function unknown]. 80
4931 227834 COG5547 COG5547 Uncharacterized membrane protein [Function unknown]. 62
4932 227835 COG5548 COG5548 Uncharacterized membrane protein, UPF0136 family [Function unknown]. 105
4933 227836 COG5549 COG5549 Predicted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 236
4934 227837 COG5550 COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]. 125
4935 227838 COG5551 Cas6 CRISPR/Cas system endoribonuclease Cas6, RAMP superfamily [Defense mechanisms]. 261
4936 227839 COG5552 COG5552 Uncharacterized protein [Function unknown]. 88
4937 227840 COG5553 COG5553 Predicted metal-dependent enzyme of the double-stranded beta helix superfamily [General function prediction only]. 191
4938 227841 COG5554 NifT Nitrogen fixation protein [Secondary metabolites biosynthesis, transport and catabolism]. 69
4939 227842 COG5555 COG5555 Cytolysin, a secreted calcineurin-like phosphatase [Intracellular trafficking, secretion, and vesicular transport]. 392
4940 227843 COG5556 COG5556 Uncharacterized protein [Function unknown]. 110
4941 227844 COG5557 HybB Ni/Fe-hydrogenase 2 integral membrane subunit HybB [Energy production and conversion]. 401
4942 227845 COG5558 COG5558 Transposase [Mobilome: prophages, transposons]. 261
4943 227846 COG5559 COG5559 Uncharacterized conserved small protein [Function unknown]. 65
4944 227847 COG5560 UBP12 Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 823
4945 227848 COG5561 COG5561 Predicted metal-binding protein [Function unknown]. 101
4946 227849 COG5562 YbcV Prophage-encoded protein YbcV, DUF1398 family [Mobilome: prophages, transposons]. 137
4947 227850 COG5563 COG5563 Uncharacterized membrane protein [Function unknown]. 379
4948 227851 COG5564 COG5564 Predicted TIM-barrel enzyme [Function unknown]. 276
4949 227852 COG5565 COG5565 Bacteriophage terminase large (ATPase) subunit and inactivated derivatives [Mobilome: prophages, transposons]. 79
4950 227853 COG5566 COG5566 Transcriptional regulator, Middle operon regulator (Mor) family [Transcription]. 137
4951 227854 COG5567 YifL Predicted small periplasmic lipoprotein YifL (function unknown) [Function unknown]. 58
4952 227855 COG5568 COG5568 Uncharacterized protein [Function unknown]. 85
4953 227856 COG5569 CusF Periplasmic Cu and Ag efflux protein CusF [Inorganic ion transport and metabolism]. 108
4954 227857 COG5570 COG5570 Uncharacterized protein [Function unknown]. 57
4955 227858 COG5571 YhjY Uncharacterized protein YhjY, contains autotransporter beta-barrel domain [General function prediction only]. 239
4956 227859 COG5572 COG5572 Uncharacterized membrane protein [Function unknown]. 104
4957 227860 COG5573 COG5573 Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 142
4958 227861 COG5574 PEX10 RING-finger-containing E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 271
4959 227862 COG5575 ORC2 Origin recognition complex, subunit 2 [DNA replication, recombination, and repair]. 535
4960 227863 COG5576 COG5576 Homeodomain-containing transcription factor [Transcription]. 156
4961 227864 COG5577 CotF Spore coat protein CotF [Cell wall/membrane/envelope biogenesis]. 145
4962 227865 COG5578 YesL Uncharacterized membrane protein YesL [Function unknown]. 208
4963 227866 COG5579 COG5579 Uncharacterized protein, DUF1810 family [Function unknown]. 143
4964 227867 COG5580 COG5580 Activator of HSP90 ATPase [Posttranslational modification, protein turnover, chaperones]. 272
4965 227868 COG5581 YcgR c-di-GMP-binding flagellar brake protein YcgR, contains PilZNR and PilZ domains [Cell motility]. 233
4966 227869 COG5582 YpiB Uncharacterized protein YpiB, UPF0302 family [Function unknown]. 182
4967 227870 COG5583 COG5583 Uncharacterized protein [Function unknown]. 54
4968 227871 COG5584 COG5584 Predicted small secreted protein [Function unknown]. 103
4969 227872 COG5585 COG5585 NAD+--asparagine ADP-ribosyltransferase [Signal transduction mechanisms]. 417
4970 227873 COG5586 COG5586 Uncharacterized protein [Function unknown]. 110
4971 227874 COG5587 COG5587 Uncharacterized conserved protein, DUF2461 family [Function unknown]. 228
4972 227875 COG5588 COG5588 Uncharacterized protein [Function unknown]. 207
4973 227876 COG5589 COG5589 Uncharacterized protein [Function unknown]. 164
4974 227877 COG5590 COG5590 Ubiquinone biosynthesis protein COQ9 [Coenzyme transport and metabolism]. 229
4975 227878 COG5591 COG5591 Uncharacterized protein [Function unknown]. 103
4976 227879 COG5592 COG5592 Hemerythrin superfamily protein [General function prediction only]. 171
4977 227880 COG5593 COG5593 Nucleic-acid-binding protein possibly involved in ribosomal biogenesis [Translation, ribosomal structure and biogenesis]. 821
4978 227881 COG5594 COG5594 Uncharacterized integral membrane protein [Function unknown]. 827
4979 227882 COG5595 COG5595 Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 256
4980 227883 COG5596 TIM22 Mitochondrial import inner membrane translocase, subunit TIM22 [Posttranslational modification, protein turnover, chaperones]. 191
4981 227884 COG5597 Gnt1 Alpha-N-acetylglucosamine transferase [Cell wall/membrane/envelope biogenesis]. 368
4982 227885 COG5598 MttB2 Trimethylamine:corrinoid methyltransferase [Coenzyme transport and metabolism]. 526
4983 227886 COG5599 COG5599 Protein tyrosine phosphatase [Signal transduction mechanisms]. 302
4984 227887 COG5600 COG5600 Transcription-associated recombination protein [DNA replication, recombination, and repair]. 413
4985 227888 COG5601 CDC36 General negative regulator of transcription subunit [Transcription]. 172
4986 227889 COG5602 Sin3 Histone deacetylase complex, regulatory component SIN3 [Chromatin structure and dynamics]. 1163
4987 227890 COG5603 TRS20 Subunit of TRAPP, an ER-Golgi tethering complex [Cell motility and secretion]. 136
4988 227891 COG5604 COG5604 Uncharacterized conserved protein [Function unknown]. 523
4989 227892 COG5605 COG5605 Cytochrome c oxidase subunit IV [Energy production and conversion]. 115
4990 227893 COG5606 COG5606 Predicted DNA-binding protein, XRE-type HTH domain [General function prediction only]. 91
4991 227894 COG5607 CHAD CHAD domain (function unknown) [Function unknown]. 283
4992 227895 COG5608 COG5608 LEA14-like desiccation related protein [Defense mechanisms]. 161
4993 227896 COG5609 YbcI Uncharacterized protein YbcI, DUF2294 family [Function unknown]. 124
4994 227897 COG5610 COG5610 Predicted hydrolase, HAD superfamily [General function prediction only]. 635
4995 227898 COG5611 COG5611 Predicted nucleic-acid-binding protein, contains PIN domain [General function prediction only]. 130
4996 227899 COG5612 COG5612 Uncharacterized membrane protein [Function unknown]. 148
4997 227900 COG5613 COG5613 Uncharacterized protein [Function unknown]. 400
4998 227901 COG5614 COG5614 Bacteriophage head-tail adaptor [Mobilome: prophages, transposons]. 109
4999 227902 COG5615 COG5615 Uncharacterized membrane protein [Function unknown]. 161
5000 227903 COG5616 TolBN TolB amino-terminal domain (function unknown) [General function prediction only]. 152
5001 227904 COG5617 COG5617 Uncharacterized membrane protein [Function unknown]. 801
5002 227905 COG5618 COG5618 Predicted periplasmic lipoprotein [Function unknown]. 206
5003 227906 COG5619 COG5619 Uncharacterized protein [Function unknown]. 224
5004 227907 COG5620 COG5620 Uncharacterized protein [Function unknown]. 200
5005 227908 COG5621 COG5621 Predicted secreted hydrolase [General function prediction only]. 354
5006 227909 COG5622 COG5622 Protein required for attachment to host cells [Cell wall/membrane/envelope biogenesis]. 139
5007 227910 COG5623 CLP1 Predicted GTPase subunit of the pre-mRNA cleavage complex [Translation, ribosomal structure and biogenesis]. 424
5008 227911 COG5624 COG5624 Transcription initiation factor TFIID, subunit TAF12 [Transcription]. 505
5009 227912 COG5625 COG5625 Predicted DNA-binding transcriptional regulator, contains HTH domain [Transcription]. 113
5010 227913 COG5626 COG5626 Uncharacterized protein [Function unknown]. 97
5011 227914 COG5627 COG5627 SUMO ligase MMS21, Smc5/6 complex, required for cell growth and DNA repair [Replication, recombination and repair]. 275
5012 227915 COG5628 COG5628 Predicted acetyltransferase [General function prediction only]. 143
5013 227916 COG5629 COG5629 Predicted metal-binding protein [Function unknown]. 321
5014 227917 COG5630 Arg2 Acetylglutamate synthase [Amino acid transport and metabolism]. 495
5015 227918 COG5631 COG5631 Predicted transcription regulator, contains HTH domain, MarR family [Transcription]. 199
5016 227919 COG5632 CwlA N-acetylmuramoyl-L-alanine amidase CwlA [Cell wall/membrane/envelope biogenesis]. 302
5017 227920 COG5633 YcfL Uncharacterized conserved protein YcfL [Function unknown]. 123
5018 227921 COG5634 YukJ Uncharacterized protein YukJ, DUF2278 family [Function unknown]. 223
5019 227922 COG5635 COG5635 Predicted NTPase, NACHT family domain [Signal transduction mechanisms]. 824
5020 227923 COG5636 COG5636 Uncharacterized conserved protein, contains Zn-ribbon-like motif [Function unknown]. 284
5021 227924 COG5637 COG5637 Uncharacterized membrane protein [Function unknown]. 217
5022 227925 COG5638 COG5638 Uncharacterized conserved protein [Function unknown]. 622
5023 227926 COG5639 COG5639 Uncharacterized protein [Function unknown]. 77
5024 227927 COG5640 COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]. 413
5025 227928 COG5641 GAT1 GATA Zn-finger-containing transcription factor [Transcription]. 498
5026 227929 COG5642 COG5642 Uncharacterized conserved protein, DUF2384 family [Function unknown]. 149
5027 227930 COG5643 COG5643 Protein containing a metal-binding domain shared with formylmethanofuran dehydrogenase subunit E [General function prediction only]. 685
5028 227931 COG5644 COG5644 U3 small nucleolar RNA-associated protein 14 [Function unknown]. 869
5029 227932 COG5645 YceK Uncharacterized conserved protein YceK [Function unknown]. 80
5030 227933 COG5646 YdhG Uncharacterized conserved protein YdhG, YjbR/CyaY-like superfamily, DUF1801 family [Function unknown]. 126
5031 227934 COG5647 COG5647 Cullin, a subunit of E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 773
5032 227935 COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]. 211
5033 227936 COG5649 COG5649 Uncharacterized protein [Function unknown]. 132
5034 227937 COG5650 COG5650 Uncharacterized membrane protein [Function unknown]. 536
5035 227938 COG5651 COG5651 PPE-repeat protein [Function unknown]. 490
5036 227939 COG5652 COG5652 VanZ-like family protein (function unknown) [Function unknown]. 148
5037 227940 COG5653 BcsL Acetyltransferase involved in cellulose biosynthesis, CelD/BcsL family [Cell motility]. 406
5038 227941 COG5654 COG5654 Uncharacterized conserved protein, contains RES domain [Function unknown]. 163
5039 227942 COG5655 REP Plasmid rolling circle replication initiator protein REP and truncated derivatives [Mobilome: prophages, transposons]. 256
5040 227943 COG5656 SXM1 Importin, protein involved in nuclear import [Posttranslational modification, protein turnover, chaperones]. 970
5041 227944 COG5657 CSE1 CAS/CSE protein involved in chromosome segregation [Cell division and chromosome partitioning]. 947
5042 227945 COG5658 COG5658 Uncharacterized membrane protein [Function unknown]. 204
5043 227946 COG5659 COG5659 SRSO17 transposase [Mobilome: prophages, transposons]. 385
5044 227947 COG5660 YlaC Predicted anti-sigma-YlaC factor YlaD, contains Zn-finger domain [Signal transduction mechanisms]. 238
5045 227948 COG5661 COG5661 Predicted secreted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 210
5046 227949 COG5662 RsiW Transmembrane transcriptional regulator (anti-sigma factor RsiW) [Transcription]. 256
5047 227950 COG5663 YqfW Uncharacterized protein, HAD superfamily [General function prediction only]. 194
5048 227951 COG5664 COG5664 Predicted secreted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 201
5049 227952 COG5665 COG5665 CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription]. 548
5050 177099 MTH00001 ND4L NADH dehydrogenase subunit 4L; Provisional 99
5051 214398 MTH00004 ND5 NADH dehydrogenase subunit 5; Validated 602
5052 164583 MTH00005 ATP6 ATP synthase F0 subunit 6; Provisional 231
5053 133649 MTH00007 COX1 cytochrome c oxidase subunit I; Validated 511
5054 164584 MTH00008 COX2 cytochrome c oxidase subunit II; Validated 228
5055 177101 MTH00009 COX3 cytochrome c oxidase subunit III; Validated 259
5056 164586 MTH00010 ND1 NADH dehydrogenase subunit 1; Validated 311
5057 164587 MTH00011 ND2 NADH dehydrogenase subunit 2; Validated 330
5058 164588 MTH00012 ND3 NADH dehydrogenase subunit 3; Validated 117
5059 133655 MTH00013 ND4L NADH dehydrogenase subunit 4L; Validated 97
5060 214399 MTH00014 ND4 NADH dehydrogenase subunit 4; Validated 452
5061 164590 MTH00015 ND6 NADH dehydrogenase subunit 6; Validated 155
5062 177102 MTH00016 CYTB cytochrome b; Validated 378
5063 164592 MTH00018 ND3 NADH dehydrogenase subunit 3; Validated 113
5064 214400 MTH00020 ND5 NADH dehydrogenase subunit 5; Reviewed 610
5065 214401 MTH00021 ND6 NADH dehydrogenase subunit 6; Validated 188
5066 164595 MTH00022 CYTB cytochrome b; Validated 379
5067 214402 MTH00023 COX2 cytochrome c oxidase subunit II; Validated 240
5068 214403 MTH00024 COX3 cytochrome c oxidase subunit III; Validated 261
5069 214404 MTH00025 ATP8 ATP synthase F0 subunit 8; Validated 70
5070 164599 MTH00026 COX1 cytochrome c oxidase subunit I; Provisional 534
5071 214405 MTH00027 COX2 cytochrome c oxidase subunit II; Provisional 262
5072 214406 MTH00028 COX3 cytochrome c oxidase subunit III; Provisional 297
5073 177108 MTH00029 ND1 NADH dehydrogenase subunit 1; Provisional 343
5074 164603 MTH00030 ND3 NADH dehydrogenase subunit 3; Provisional 123
5075 214407 MTH00032 ND5 NADH dehydrogenase subunit 5; Provisional 669
5076 133672 MTH00033 CYTB cytochrome b; Provisional 383
5077 177109 MTH00034 CYTB cytochrome b; Validated 379
5078 177110 MTH00035 ATP6 ATP synthase F0 subunit 6; Validated 229
5079 214408 MTH00036 ATP8 ATP synthase F0 subunit 8; Validated 54
5080 177112 MTH00037 COX1 cytochrome c oxidase subunit I; Provisional 517
5081 177113 MTH00038 COX2 cytochrome c oxidase subunit II; Provisional 229
5082 177114 MTH00039 COX3 cytochrome c oxidase subunit III; Validated 260
5083 214409 MTH00040 ND1 NADH dehydrogenase subunit 1; Validated 323
5084 177116 MTH00041 ND2 NADH dehydrogenase subunit 2; Validated 349
5085 177117 MTH00042 ND3 NADH dehydrogenase subunit 3; Validated 116
5086 214410 MTH00043 ND4L NADH dehydrogenase subunit 4L; Validated 98
5087 214411 MTH00044 ND4 NADH dehydrogenase subunit 4; Validated 458
5088 177120 MTH00045 ND6 NADH dehydrogenase subunit 6; Validated 162
5089 177121 MTH00046 CYTB cytochrome b; Validated 355
5090 214412 MTH00047 COX2 cytochrome c oxidase subunit II; Provisional 194
5091 177123 MTH00048 COX1 cytochrome c oxidase subunit I; Provisional 511
5092 177124 MTH00049 COX3 cytochrome c oxidase subunit III; Validated 215
5093 177125 MTH00050 ATP6 ATP synthase F0 subunit 6; Validated 170
5094 177126 MTH00051 COX2 cytochrome c oxidase subunit II; Provisional 234
5095 164623 MTH00052 COX3 cytochrome c oxidase subunit III; Provisional 262
5096 164624 MTH00053 CYTB cytochrome b; Provisional 381
5097 177127 MTH00054 ND1 NADH dehydrogenase subunit 1; Provisional 324
5098 177128 MTH00055 ND3 NADH dehydrogenase subunit 3; Provisional 118
5099 177129 MTH00057 ND6 NADH dehydrogenase subunit 6; Provisional 186
5100 177130 MTH00058 ND1 NADH dehydrogenase subunit 1; Provisional 293
5101 177131 MTH00059 ND2 NADH dehydrogenase subunit 2; Provisional 289
5102 177132 MTH00060 ND3 NADH dehydrogenase subunit 3; Provisional 116
5103 177133 MTH00061 ND4L NADH dehydrogenase subunit 4L; Provisional 86
5104 214413 MTH00062 ND4 NADH dehydrogenase subunit 4; Provisional 417
5105 214414 MTH00063 ND5 NADH dehydrogenase subunit 5; Provisional 522
5106 177136 MTH00064 ND6 NADH dehydrogenase subunit 6; Provisional 151
5107 214415 MTH00065 ND6 NADH dehydrogenase subunit 6; Provisional 172
5108 214416 MTH00066 ND5 NADH dehydrogenase subunit 5; Provisional 598
5109 177139 MTH00067 ND4L NADH dehydrogenase subunit 4L; Provisional 98
5110 214417 MTH00068 ND4 NADH dehydrogenase subunit 4; Provisional 458
5111 177141 MTH00069 ND3 NADH dehydrogenase subunit 3; Provisional 114
5112 177142 MTH00070 ND2 NADH dehydrogenase subunit 2; Provisional 346
5113 214418 MTH00071 ND1 NADH dehydrogenase subunit 1; Provisional 322
5114 164642 MTH00072 ATP8 ATP synthase F0 subunit 8; Provisional 54
5115 177144 MTH00073 ATP6 ATP synthase F0 subunit 6; Provisional 227
5116 177145 MTH00074 CYTB cytochrome b; Provisional 380
5117 177146 MTH00075 COX3 cytochrome c oxidase subunit III; Provisional 261
5118 164646 MTH00076 COX2 cytochrome c oxidase subunit II; Provisional 228
5119 214419 MTH00077 COX1 cytochrome c oxidase subunit I; Provisional 514
5120 177148 MTH00079 COX1 cytochrome c oxidase subunit I; Provisional 508
5121 177149 MTH00080 COX2 cytochrome c oxidase subunit II; Provisional 231
5122 177150 MTH00083 COX3 cytochrome c oxidase subunit III; Provisional 256
5123 177151 MTH00086 CYTB cytochrome b; Provisional 355
5124 177152 MTH00087 ATP6 ATP synthase F0 subunit 6; Provisional 195
5125 177153 MTH00090 ND1 NADH dehydrogenase subunit 1; Provisional 284
5126 177154 MTH00091 ND2 NADH dehydrogenase subunit 2; Provisional 273
5127 177155 MTH00092 ND3 NADH dehydrogenase subunit 3; Provisional 111
5128 177156 MTH00093 ND4L NADH dehydrogenase subunit 4L; Provisional 77
5129 177157 MTH00094 ND4 NADH dehydrogenase subunit 4; Provisional 403
5130 177158 MTH00095 ND5 NADH dehydrogenase subunit 5; Provisional 527
5131 177159 MTH00097 ND6 NADH dehydrogenase subunit 6; Provisional 121
5132 177160 MTH00098 COX2 cytochrome c oxidase subunit II; Validated 227
5133 177161 MTH00099 COX3 cytochrome c oxidase subunit III; Validated 261
5134 177162 MTH00100 CYTB cytochrome b; Provisional 379
5135 177163 MTH00101 ATP6 ATP synthase F0 subunit 6; Validated 226
5136 214420 MTH00102 ATP8 ATP synthase F0 subunit 8; Validated 67
5137 177165 MTH00103 COX1 cytochrome c oxidase subunit I; Validated 513
5138 177166 MTH00104 ND1 NADH dehydrogenase subunit 1; Provisional 318
5139 177167 MTH00105 ND2 NADH dehydrogenase subunit 2; Provisional 347
5140 177168 MTH00106 ND3 NADH dehydrogenase subunit 3; Provisional 115
5141 177169 MTH00107 ND4L NADH dehydrogenase subunit 4L; Provisional 98
5142 177170 MTH00108 ND5 NADH dehydrogenase subunit 5; Provisional 602
5143 214421 MTH00109 ND6 NADH dehydrogenase subunit 6; Provisional 175
5144 177172 MTH00110 ND4 NADH dehydrogenase subunit 4; Provisional 459
5145 214422 MTH00111 ND1 NADH dehydrogenase subunit 1; Provisional 323
5146 214423 MTH00112 ND2 NADH dehydrogenase subunit 2; Provisional 346
5147 177175 MTH00113 ND3 NADH dehydrogenase subunit 3; Provisional 114
5148 214424 MTH00115 ND6 NADH dehydrogenase subunit 6; Provisional 174
5149 177177 MTH00116 COX1 cytochrome c oxidase subunit I; Provisional 515
5150 177178 MTH00117 COX2 cytochrome c oxidase subunit II; Provisional 227
5151 177179 MTH00118 COX3 cytochrome c oxidase subunit III; Provisional 261
5152 214425 MTH00119 CYTB cytochrome b; Provisional 380
5153 177181 MTH00120 ATP6 ATP synthase F0 subunit 6; Provisional 227
5154 214426 MTH00123 ATP8 ATP synthase F0 subunit 8; Provisional 54
5155 214427 MTH00124 ND4 NADH dehydrogenase subunit 4; Provisional 457
5156 177184 MTH00125 ND4L NADH dehydrogenase subunit 4L; Provisional 98
5157 177185 MTH00126 ND4L NADH dehydrogenase subunit 4L; Provisional 98
5158 177186 MTH00127 ND4 NADH dehydrogenase subunit 4; Provisional 459
5159 177187 MTH00129 COX2 cytochrome c oxidase subunit II; Provisional 230
5160 177188 MTH00130 COX3 cytochrome c oxidase subunit III; Provisional 261
5161 177189 MTH00131 CYTB cytochrome b; Provisional 380
5162 177190 MTH00132 ATP6 ATP synthase F0 subunit 6; Provisional 227
5163 177191 MTH00133 ATP8 ATP synthase F0 subunit 8; Provisional 55
5164 177192 MTH00134 ND1 NADH dehydrogenase subunit 1; Provisional 324
5165 177193 MTH00135 ND2 NADH dehydrogenase subunit 2; Provisional 347
5166 177194 MTH00136 ND3 NADH dehydrogenase subunit 3; Provisional 116
5167 214428 MTH00137 ND5 NADH dehydrogenase subunit 5; Provisional 611
5168 177196 MTH00138 ND6 NADH dehydrogenase subunit 6; Provisional 173
5169 214429 MTH00139 COX2 cytochrome c oxidase subunit II; Provisional 226
5170 214430 MTH00140 COX2 cytochrome c oxidase subunit II; Provisional 228
5171 177199 MTH00141 COX3 cytochrome c oxidase subunit III; Provisional 259
5172 214431 MTH00142 COX1 cytochrome c oxidase subunit I; Provisional 511
5173 177201 MTH00143 ND1 NADH dehydrogenase subunit 1; Provisional 307
5174 214432 MTH00144 ND2 NADH dehydrogenase subunit 2; Provisional 328
5175 177203 MTH00145 CYTB cytochrome b; Provisional 379
5176 177204 MTH00147 ATP8 ATP synthase F0 subunit 8; Provisional 51
5177 214433 MTH00148 ND3 NADH dehydrogenase subunit 3; Provisional 117
5178 214434 MTH00149 ND4L NADH dehydrogenase subunit 4L; Provisional 97
5179 214435 MTH00150 ND4 NADH dehydrogenase subunit 4; Provisional 417
5180 214436 MTH00151 ND5 NADH dehydrogenase subunit 5; Provisional 565
5181 214437 MTH00152 ND6 NADH dehydrogenase subunit 6; Provisional 163
5182 177210 MTH00153 COX1 cytochrome c oxidase subunit I; Provisional 511
5183 214438 MTH00154 COX2 cytochrome c oxidase subunit II; Provisional 227
5184 214439 MTH00155 COX3 cytochrome c oxidase subunit III; Provisional 255
5185 214440 MTH00156 CYTB cytochrome b; Provisional 356
5186 214441 MTH00157 ATP6 ATP synthase F0 subunit 6; Provisional 223
5187 177215 MTH00158 ATP8 ATP synthase F0 subunit 8; Provisional 32
5188 214442 MTH00160 ND2 NADH dehydrogenase subunit 2; Provisional 335
5189 177217 MTH00161 ND3 NADH dehydrogenase subunit 3; Provisional 113
5190 177218 MTH00162 ND4L NADH dehydrogenase subunit 4L; Provisional 89
5191 214443 MTH00163 ND4 NADH dehydrogenase subunit 4; Provisional 445
5192 214444 MTH00165 ND5 NADH dehydrogenase subunit 5; Provisional 573
5193 214445 MTH00166 ND6 NADH dehydrogenase subunit 6; Provisional 160
5194 177222 MTH00167 COX1 cytochrome c oxidase subunit I; Provisional 512
5195 177223 MTH00168 COX2 cytochrome c oxidase subunit II; Provisional 225
5196 214446 MTH00169 ATP8 ATP synthase F0 subunit 8; Provisional 67
5197 177225 MTH00171 ATP8 ATP synthase F0 subunit 8; Provisional 54
5198 214447 MTH00172 ATP6 ATP synthase F0 subunit 6; Provisional 232
5199 214448 MTH00173 ATP6 ATP synthase F0 subunit 6; Provisional 231
5200 133799 MTH00174 ATP6 ATP synthase F0 subunit 6; Provisional 252
5201 177228 MTH00175 ATP6 ATP synthase F0 subunit 6; Provisional 244
5202 214449 MTH00176 ATP6 ATP synthase F0 subunit 6; Provisional 229
5203 177230 MTH00179 ATP6 ATP synthase F0 subunit 6; Provisional 227
5204 177231 MTH00180 ND4L NADH dehydrogenase subunit 4L; Provisional 99
5205 214450 MTH00181 ND4L NADH dehydrogenase subunit 4L; Provisional 93
5206 214451 MTH00182 COX1 cytochrome c oxidase subunit I; Provisional 525
5207 177234 MTH00183 COX1 cytochrome c oxidase subunit I; Provisional 516
5208 177235 MTH00184 COX1 cytochrome c oxidase subunit I; Provisional 519
5209 164736 MTH00185 COX2 cytochrome c oxidase subunit II; Provisional 230
5210 177236 MTH00186 ATP8 ATP synthase F0 subunit 8; Provisional 52
5211 177237 MTH00188 ND4L NADH dehydrogenase subunit 4L; Provisional 97
5212 177238 MTH00189 COX3 cytochrome c oxidase subunit III; Provisional 260
5213 177239 MTH00191 CYTB cytochrome b; Provisional 365
5214 177240 MTH00192 ND4L NADH dehydrogenase subunit 4L; Provisional 99
5215 177241 MTH00193 ND1 NADH dehydrogenase subunit 1; Provisional 306
5216 214452 MTH00195 ND1 NADH dehydrogenase subunit 1; Provisional 307
5217 214453 MTH00196 ND2 NADH dehydrogenase subunit 2; Provisional 365
5218 177244 MTH00197 ND2 NADH dehydrogenase subunit 2; Provisional 323
5219 214454 MTH00198 ND2 NADH dehydrogenase subunit 2; Provisional 607
5220 177245 MTH00199 ND2 NADH dehydrogenase subunit 2; Provisional 460
5221 177246 MTH00200 ND2 NADH dehydrogenase subunit 2; Provisional 347
5222 214455 MTH00202 ND3 NADH dehydrogenase subunit 3; Provisional 117
5223 214456 MTH00203 ND3 NADH dehydrogenase subunit 3; Provisional 112
5224 164750 MTH00204 ND4 NADH dehydrogenase subunit 4; Provisional 485
5225 214457 MTH00205 ND4 NADH dehydrogenase subunit 4; Provisional 448
5226 214458 MTH00206 ND4 NADH dehydrogenase subunit 4; Provisional 450
5227 164753 MTH00207 ND5 NADH dehydrogenase subunit 5; Provisional 572
5228 177251 MTH00208 ND5 NADH dehydrogenase subunit 5; Provisional 628
5229 177252 MTH00209 ND5 NADH dehydrogenase subunit 5; Provisional 564
5230 177253 MTH00210 ND5 NADH dehydrogenase subunit 5; Provisional 616
5231 214459 MTH00211 ND5 NADH dehydrogenase subunit 5; Provisional 597
5232 214460 MTH00212 ND6 NADH dehydrogenase subunit 6; Provisional 160
5233 177256 MTH00213 ND6 NADH dehydrogenase subunit 6; Provisional 239
5234 214461 MTH00214 ND6 NADH dehydrogenase subunit 6; Provisional 168
5235 164761 MTH00216 ND1 NADH dehydrogenase subunit 1; Provisional 327
5236 214462 MTH00217 ND4 NADH dehydrogenase subunit 4; Provisional 482
5237 214463 MTH00218 ND1 NADH dehydrogenase subunit 1; Provisional 311
5238 214464 MTH00219 COX3 cytochrome c oxidase subunit III; Provisional 262
5239 164765 MTH00222 ATP9 ATP synthase F0 subunit 9; Provisional 77
5240 177260 MTH00223 COX1 cytochrome c oxidase subunit I; Provisional 512
5241 164767 MTH00224 CYTB cytochrome b; Provisional 379
5242 214465 MTH00225 ND1 NADH dehydrogenase subunit 1; Provisional 305
5243 214466 MTH00226 ND4 NADH dehydrogenase subunit 4; Provisional 505
5244 164770 MTH00260 ATP8 ATP synthase F0 subunit 8; Provisional 53
5245 177263 MTH00261 ATP8 ATP synthase F0 subunit 8; Provisional 68
5246 411074 NF000031 MFS_efflux_LmrA lincomycin efflux MFS transporter Lmr(A). 484
5247 411075 NF000040 Tn10_TetC tetracycline resistance-associated transcriptional repressor TetC. TetC, as found in composite transposon Tn10, is a transcriptional repressor of itself and of TetD, which is a transcriptional activator for some stress response proteins in the SoxS/MarA/Rob regulon in E. coli and which therefore contributes to antibiotic resistance. 197
5248 411076 NF000058 Erm41 23S rRNA (adenine(2058)-N(6))-methyltransferase Erm(41). 173
5249 411077 NF000060 MFS_efflux_LmrB lincomycin efflux MFS transporter Lmr(B). The lin-2 mutant, described in PMID:12499232, alters Lmr(B) expression in Bacillus subtilis and allows Lmr(B) to confer resistance to lincomycin. 479
5250 411078 NF000106 40850658_otr oxytetracycline efflux ABC transporter Otr(C) ATP-binding subunit. 351
5251 411079 NF000140 AAC_6p_Salmo AAC(6')-Iy/Iaa family aminoglycoside 6'-N-acetyltransferase. Members of this family are chromosomal acetyltransferases from the genus Salmonella. Analysis has demonstrated a case in which a member designated AAC(6')-Iy, identical in two different strains isolated from a single patient, conferred resistance to tobramycin only in the isolate where a deletion event upstream of the gene resulted in high expression. Members of this family are therefore considered cryptic aminoglycoside N-acetyltransferases, which may or may not confer resistance, depending on expression levels. 145
5252 411080 NF000217 MATE_multi_FepA multidrug efflux MATE transporter FepA. 443
5253 411081 NF000342 glpA_Cterm aminoglycoside O-phosphotransferase APH(4)-Ib. The C-terminal half of CAA52372.1 shows homology to hygromycin-modifying enzyme APH(4)-Ia. The N-terminal region shows no homology to any other protein. The gene glpA was identified as one of two in a 3.0-kb DNA segment capable of conferring on E. coli the ability to degrade and use the phosphonate herbicide glyphosate as a sole carbon source. The expressed protein conferred minor (3-fold) increase in tolerance to hygromycin, but this protein probably should not be considered an aminoglycoside modifying enzyme associated with the spread of resistance in bacteria toward clinically important antimicrobials. 228
5254 411082 NF000349 mfpA_AE000516.2 pentapeptide repeat protein MfpA. 183
5255 411083 NF000391 EmrB multidrug efflux MFS transporter permease subunit EmrB. 501
5256 380146 NF000535 MSCRAMM_SdrC MSCRAMM family adhesin SdrC. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase. 963
5257 333720 NF000536 YmiA YmiA family putative membrane protein. 42
5258 333721 NF000537 YncL stress response membrane protein YncL. 30
5259 333722 NF000539 plantaricin plantaricin C family lantibiotic. This family describes plantaricin C-like lantibiotic precursors. The seed alignment straddles the cleavage motif (typically GG), and includes both an extended leader peptide region and a Cys-rich core peptide region. Because of the mosaic structure of lantibiotic precursors, this family can be expected to overlap other lantibiotic precursor families in the same clan. 65
5260 380147 NF000540 alt_ValS valine--tRNA ligase. 827
5261 411084 NF012135 ANT_3pp_9_crypt aminoglycoside nucleotidyltransferase ANT(3'')/ANT(9). Known members of this family are restricted to Salmonella enterica. Activities as both a streptomycin 3''-O-adenylyltransferase and a spectinomycin 9-O-adenylyltransferase are cryptic because of a lack of expression, but detected if cells are grown on minimal rather than rich media (see PMID:21507083). 262
5262 333724 NF012136 SecA2_Lm accessory Sec system translocase SecA2. Members of this family are SecA2, part of a Sec-like preprotein translocase called accessory Sec. This SecA2 family is characteristic of Listeria species. 776
5263 333725 NF012138 exosort_XrtR exosortase R. 160
5264 333726 NF012139 exosort_XrtP exosortase P. 159
5265 411085 NF012144 ramA_TF RamA family antibiotic efflux transcriptional regulator. 108
5266 380148 NF012162 surf_Nterm_1 surface-anchored protein thioester-forming domain. This model describes a conserved region, fairly rich in insertions and deletions, located just past the signal peptide region in long, variable, and typically highly repetitive and sortase-dependent surface proteins. Members are found in a broad range of taxa, including many strains of Streptococcus pneumoniae. A conserved Cys forms a thioester bond, often to a host protein for covalent attachment. 234
5267 411086 NF012163 BaeS_SmeS sensor histidine kinase efflux regulator BaeS. 457
5268 333728 NF012164 AlbA subtilosin maturase AlbA. AlbA is a radical SAM/SPASM domain-containing protein responsible for introducing thioether crosslinks during the maturation of bacteriocins such as subtilosin A. 442
5269 411087 NF012168 BlaI_of_BCL penicillinase repressor BlaI. 121
5270 333729 NF012179 CptA phosphoethanolamine transferase CptA. 556
5271 380149 NF012181 MSCRAMM_SdrD MSCRAMM family adhesin SdrD. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase. 1379
5272 333731 NF012182 exosortase_XrtQ exosortase Q. 256
5273 380150 NF012196 Ig_like_ice Ig-like domain. This variant form of the Ig-like domain occurs as a repeat in a number of large adhesins, including a 1.5-MDa ice-binding adhesin, the Marinomonas primoryensis antifreeze protein. 108
5274 380151 NF012197 lonely_Cys lonely Cys domain. This model describes an unusual domain, over 700 amino acids long, that is largely restricted to the Streptomyces (prodigious producers of natural products) and that may occur ten or more times in giant proteins. The most striking feature is an extremely low cysteine composition, one residue per domain, and that in an essentially invariant position. 706
5275 411088 NF012198 MarA_TF MDR efflux pump AcrAB transcriptional activator MarA. 124
5276 380152 NF012200 choice_anch_D choice-of-anchor D domain. This HMM describes a repeat domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, such as type IX secretion (T9SS) or PEP-CTERM. 107
5277 333735 NF012201 WIAG-tail WIAG-tail domain. This 80-amino acid domain occurs in proteins in a single copy at the C-terminus. In most proteins, the domain immediately follows a long, variable run of tandem 10-amino acid repeats. The domain is named for its C-terminal motif, WIAxGx, hence the name WIAG-tail. 80
5278 380153 NF012204 adhes_FxxPxG leukotoxin LktA family filamentous adhesin N-terminal domain. This model, related to TIGR01901, describes a conserved single-copy N-terminal domain found in repeat-rich, extremely long proteins such as the leukotoxin LktA of Fusobacterium necrophorum. 152
5279 333737 NF012206 LktA_tand_53 leukotoxin LktA-type filamentous protein tandem repeat. This repeat, about 53 amino acids in length, may comprise most of the length of proteins over 3000 amino acids long. The best characterized protein with this repeat is the leukotoxin LktA of Fusobacterium necrophorum, where it is the major virulence factor. 53
5280 411089 NF012208 SDR_dihy_bifunc bifunctional dihydropteridine reductase/dihydrofolate reductase TmpR. Members of this family are SDR family oxidoreductases, unrelated to previously known families of dihydrofolate reductase (DHFR), one of which was demonstrated to be a bifunctional dihydropteridine reductase/dihydrofolate reductase. The DHFR activity can give a heterologously expressed protein the ability to confer resistance to trimethoprim, an inhibitor of most forms of DHFR. 233
5281 333738 NF012209 LEPR-8K LEPR-XLL family repeat protein signature domain. This model, just 24 amino acids long, describes an N-terminal single-copy region that contains the most highly conserved motif in a collection of repeat-filled giant proteins. Member proteins average over 8000 amino acids and include at least one longer than 35,000 in length. The signature motif is LEPRxLL 24
5282 380154 NF012210 PDxFFG PDxFFG domain. This model represents the conserved N-terminal domain of a family of large proteins with signal peptides, found in Mycoplasma and Ureaplasma. A short conserved N-terminal domain and a large conserved C-terminal domain are separated by poorly conserved regions of variable length. This domain is named for its best conserved motif, PDxFFG. 269
5283 333740 NF012211 tand_rpt_95 tandem-95 repeat. This 95-amino acid repeat occurs in tandem in proteins that may be several thousand amino acids long. 98
5284 380155 NF012221 MARTX_Nterm MARTX multifunctional-autoprocessing repeats-in-toxin holotoxin N-terminal region. This model describes the N-terminal 1900 amino acids of MARTX family multifunctional-autoprocessing repeats-in-toxin holotoxins, which contain both repeat regions that facilitate their entry into eukaryotic target cells, and multiple effector domains. 1848
5285 411090 NF012226 AdeS_HK two-component sensor histidine kinase AdeS. Mutations in this component of the two-component regulatory system for the AdeABC efflux pump can confer adaptive resistance to certain antibiotics, including tigecycline. 353
5286 411091 NF012227 AdeR_RR efflux system response regulator transcription factor AdeR. This protein, the DNA-binding regulator AdeR, works with its two-component system partner AdeS, to modulate expression of the AdeABC efflux pump. 239
5287 411092 NF012228 RobA_TF MDR efflux pump AcrAB transcriptional activator RobA. The original characterization of RobA as a Right side Origin of replication Binding protein A (robA) may be misleading. Characterizations in large numbers of papers since then treat RobA as a transcriptional activator of the AcrAB antibiotic efflux pump. 286
5288 333742 NF012230 LWXIA_domain LWXIA domain. This domain occurs exclusively at the C-terminus of a set of long proteins (average length 4000 residues), and is separated from the rest of the protein sequence by a Pro and Ser-rich spacer region of poorly conserved, low-complexity sequence. This domain is named for its most conserved motif, LWxIA. Some but not all sequences in the seed alignment score well locally to the LysM domain model of PF01476, which may have a general peptidoglycan binding function. 74
5289 333743 NF028536 PAP2_near_MCR1 PAP2 family protein. Members of this family belong to the PAP2 superfamily (see PF01569). The founding members of this family are notable for being encoded next to mcr-1, a phosphoethanolamine--lipid A transferase that confers resistance to colistin. 237
5290 380156 NF028538 PAP2_lipid_A PAP2 family lipid A phosphatase. All members of the seed alignment for this family belong to the PAP2 superfamily and therefore share homology with the lipid A 1-phosphatase LpxE of Helicobacter pylori. LpxE removes one of two KDO sugar phosphates from lipid A, making it possible for a phosphoethanolamine--lipid A transferase to add the modifying group that increases resistance to colistin. All members of the seed alignment for this model are encoded close to the gene for a phosphoethanolamine--lipid A transferase, such as MCR-1. 224
5291 380157 NF032891 tail_200_repeat tandem large repeat. This HMM describes a domain of nearly 200 amino acids, found in up to 14 tandem repeats in the C-terminal region of very large proteins, in Vibrio parahaemolyticus and related species. 192
5292 380158 NF032893 tail-700 PLxRFG domain. This domain, nearly 700 residues long, begins with a nearly invariant motif YxPLxRFGx[YF]. It occurs as the extreme C-terminal domain of large size, some over 5000 amino acids long with an average of nearly 3000. The function is unknown. 681
5293 333748 NF033070 rSAM_AprD4 AprD4 family radical SAM diol-dehydratase. AprD4 is a radical SAM enzyme involved in C3-deoxygenation of the intermediate paromamine during biosynthesis of the aminoglycoside apramycin. It acts as a diol-dehydratase, and works with the partner protein, AprD3, a reductase. 456
5294 380159 NF033071 SusD starch-binding outer membrane lipoprotein SusD. SusD (Starch Uptake System D) is an outer membrane lipoprotein that binds starch and participates in a TonB-dependent nutrient uptake complex. Related proteins from similar TonB-dependent complexes that import other, usually multimeric nutrient substrates include RagB and NanU. 558
5295 333750 NF033072 NanU SusD family outer membrane lipoprotein NanU. NanU, related to SusD and RagB, is an outer membrane lipoprotein from a TonB-dependent nutrient uptake complex. 521
5296 380160 NF033073 LPXTG_double doubled motif LPXTG anchor domain. This unusual LPXTG-type C-terminal protein sorting domain occurs largely in the genus Clostridium and typically is separated from the main body of the protein by a glycine-rich linker sequence. In this domain, the classical sortase cleavage motif, LPXTG, has the consensus sequence VPLAxLPKTG. Much of this motif, the sequence VPLAxLP, is repeated an average 20 amino acids upstream within this domain. This unusual structure of a sortase recognition site-containing domain suggests a specialized form of interaction with its cognate sortase. 66
5297 380161 NF033092 HK_WalK cell wall metabolism sensor histidine kinase WalK. This model describes WalK as found in Staphylococcus aureus (sp|Q2G2U4.1|WALK_STAA8). A shorter version, as found in Streptococcus pneumoniae, called WalK(Spn) or VicK, is not included. WalK is part of a two-component system and works with partner protein WalR. 594
5298 380162 NF033093 HK_VicK cell wall metabolism sensor histidine kinase VicK. This model describes the protein VicK (or WalK) as found in Streptococcus pneumoniae. This protein is shorter than the WalK of Staphylococcus aureus, although apparently is functionally similar. Compare to model NF033092 (HK_WalK). VicK is a sensor histidine kinase involved in regulating cell wall metabolism. Its two component system partner is the response regulator VicR. 448
5299 380163 NF033113 halo_ClmS chloramphenicol-biosynthetic FADH2-dependent halogenase CmlS. 570
5300 411093 NF033124 estX alpha/beta fold putative hydrolase EstX. 280
5301 411094 NF033138 RND-peri-MexC MexC family multidrug efflux RND transporter periplasmic adaptor subunit. 375
5302 411095 NF033143 efflux_OM_AdeK multidrug efflux RND transporter AdeIJK outer membrane channel subunit AdeK. 480
5303 411096 NF033147 GXX_rpt_CTERM Gly-Xaa-Xaa repeat protein C-terminal domain. This model often occurs at the C-terminus, and companion model N_to_GlyXaaXaa (NF033172) at the N-terminus, of proteins that in between consist largely of variable numbers of Gly-Xaa-Xaa repeats, reminiscent of collagen repeats. Member proteins observed have been found so far only in Gram-positive bacteria. This domain contains a motif IPxTG near its C-terminus, suggesting it is processed by some form of sortase. 132
5304 411097 NF033153 phage_ICD_like host cell division inhibitor Icd domain. Icd from temperate phage P1 inhibits cell division in its host. Homologous sequence is found in many other proteins, often as the C-terminal region of what appears to be a much larger protein. Putative phage proteins that contain this domain may be designated "host cell division inhibitor Icd-like protein". See PMID: 8491703 for a description of Icd. Many proteins with this domain also have the Ash domain described by PF10554, which also occurs in phage. 48
5305 411098 NF033154 endonuc_SmrA DNA endonuclease SmrA. YdaL is a small endonuclease with homology to the C-terminal domain found in the endonuclease MutS2, but not found in the related mismatch repair protein MutS. The biological role of this endonuclease is not yet known. As one of two Small MutS2-Related proteins in E. coli, this protein was designated SmrA by Gui, et al. (PMID:21276852). The term SMR is much better known for describing a large family of Small Multidrug Resistance (SMR) efflux transporters, but in that context is used with three capital letters. 189
5306 380165 NF033155 CatA_like_1 CatA-like O-acetyltransferase. Members of this family are homologs to members of the CatA family of chloramphenicol acetyltransferases, although less than 30% identical. There is no evidence that members of this family act on or confer resistance to chloramphenicol. 209
5307 380166 NF033157 SWFGD_domain SWFGD domain. This small domain (29 amino acids long) is named for its most conspicuous (although not invariant) motif, SWFGD. The motif occurs primarily, although not exclusively, in protein sequences with a BON domain (PF04972), suspected of involvement in attachment to phospholipid membranes, or with a DUF2171 domain (PF09939). Two copies of the motif may be found in a single protein. 29
5308 380167 NF033158 Myrrcad Myrrcad domain. This domain appears at or near the C-terminus in expanded paralogous families of proteins in Mycoplasma and Candidatus Mycoplasma genomes. Proteins with this domain typically show an N-terminal sequence region followed by a repeat region, highly variable in length and similar to the leucine-rich repeat. The center region of this domain resembles alpha-helical hydrophobic transmembrane segments. The domain ends with a cluster of basic residues, suggesting an orientation in which the C-terminal residues of the domain face the cytosol. The coinage "Myrrcad", rendered without full capitalization because it is an acronym rather than an amino acid sequence motif, signifies "MYcoplasma Repeat-Rich protein C-terminal Anchor Domain" 36
5309 380168 NF033160 lipo_LipL36 lipoprotein LipL36. Members of this family are lipoprotein LipL36, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 383
5310 380169 NF033161 lipo_LipL41 lipoprotein LipL41. Members of this family are lipoprotein LipL41, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 358
5311 380170 NF033162 lipo_LipL21 lipoprotein LipL21. Members of this family are lipoprotein LipL21, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 191
5312 380171 NF033163 lipo_LipL71 lipoprotein LipL71. Members of this family are lipoprotein LipL71, also known as LruA, as described in Leptospira interrogans but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 472
5313 380172 NF033164 lipo_LipL46 lipoprotein LipL46. Members of this family are lipoprotein LipL46, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 413
5314 380173 NF033165 lipo_LipL45 lipoprotein LipL45. Members of this family are lipoprotein LipL45, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 391
5315 380174 NF033166 lipo_LipL31 lipoprotein LipL31. Members of this family are lipoprotein LipL31, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 210
5316 380175 NF033167 lipo_LIC11695 LIC_11695/LIC_11696 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira. Two paralogs, LIC_11695 and LIC_11696 are found in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 (a well-studied reference strain), where they are encoded by tandem genes. 186
5317 380176 NF033168 lipo_LIC10766 LIC_10766 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira. 142
5318 380177 NF033169 lipo_LIC10494 LIC_10494 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira. 213
5319 380178 NF033170 lipo_LIC13355 LIC_13355 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira. 273
5320 411099 NF033171 lipo_LIC11139 LIC_11139 family putative lipoprotein. Members of this family are restricted to the genus Leptospira. They are putative lipoproteins with only a single Cys residue, invariant at the proposed lipoprotein signal cleavage site. Residues in the -1 position are unusual for lipoproteins in general, but consistent with observations in Leptospira, as described in PMID:26890609. 157
5321 380180 NF033172 N_to_GlyXaaXaa collagen-like repeat preface domain. All protein sequence used in the seed alignment for this model comes from the N-terminal region of proteins with extended collagen-like Gly-rich repeat regions, and occur in Firmicutes. 122
5322 380181 NF033173 anticapsin_BacC dihydroanticapsin 7-dehydrogenase. Members of this family are dihydroanticapsin 7-dehydrogenase (EC 1.1.1.385), one of seven key molecular markers for biosynthesis of the non-cognate amino acid anticapsin, a building block for the dipeptide antibiotic natural product bacilysin. 252
5323 380182 NF033175 fuso_auto_Nterm autotransporter-associated N-terminal domain. This domain typically is found in the genus Fusobacterium, in N-terminal regions of large proteins that are recognized as autotransporter proteins by C-terminal regions matching Pfam model PF03797. In paralogous families of such proteins, the N-terminal and C-terminal regions are fairly well-conserved, but the repetitive central region is poorly conserved in both length and sequence. 121
5324 380183 NF033176 auto_AIDA-I autotransporter adhesin AIDA-I. 1287
5325 380184 NF033177 auto_Ag43 autotransporter adhesin Ag43. 948
5326 380185 NF033178 auto_BigA autotransporter adhesin BigA. Members of this family are the adhesin BigA, found in Salmonella. BigA is an autotransporter, meaning it has a C-terminal outer membrane beta-barrel, through which passenger regions of the proteins are transported. 1875
5327 380186 NF033179 TnsA_like_Actin TnsA-like heteromeric transposase endonuclease subunit. The transposase of transposon Tn7 contains multiple subunits. Members of this family are largely restricted to the Actinobacteria, resemble the endonuclease subunit TnsA of the multimeric transposase of Tn7 and its relatives, and occur in genomic neighborhoods that suggest a similar role in transposition. 212
5328 380187 NF033181 NiFeSe_hydrog nickel-dependent hydrogenase large subunit. Members of this family are the large subunit of the periplasmic nickel-dependent hydrogenase. Some members contain a selenocysteine residue. 509
5329 380188 NF033183 colliding_TM low-complexity tail membrane protein. Members of this family appear typically as an unusual gene pair with members of PF11998 (DUF3493). Strangely, members tend to have tail-to-tail overlapping regions, where the tail region from this protein is long, low-complexity, and typically rich in Asp, Glu, Asn, Gln, Ser, and Thr. These gene pairs occur broadly in Cyanobacteria. The function of this pair of convergently-transcribed overlapping proteins is unknown. The low-complexity region was trimmed from the seed alignment to build this HMM. 188
5330 411100 NF033186 internalin_K class 1 internalin InlK. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface-anchored proteins with an N-terminal signal peptide, leucine-rich repeats, and a C-terminal LPXTG processing and cell surface anchoring site. Members of this family are internalin K (InlK), a virulence factor. See PMID:17764999 for a general discussion of internalins, and PMID:21829365, PMID:22082958, and PMID:23958637 for more information about internalin K. 604
5331 380191 NF033187 internalin_J class 1 internalin InlJ. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface-anchored proteins with an N-terminal signal peptide, leucine-rich repeats, and a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin J (InlJ). 846
5332 380192 NF033188 internalin_H InlH/InlC2 family class 1 internalin. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface or secreted proteins with an N-terminal signal peptide, leucine-rich repeats, and usually a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin H (InlH), or internalin C2, two class 1 (LPXTG-type) internalins that are closely related, one apparently derived from the other through a recombination event. 548
5333 380193 NF033189 internalin_A class 1 internalin InlA. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface or secreted proteins with an N-terminal signal peptide, leucine-rich repeats, and usually a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin A (InlA), a class 1 (LPXTG-type) internalin. 799
5334 411101 NF033190 inl_like_NEAT_1 NEAT domain-containing leucine-rich repeat protein. Members of this family have an N-terminal NEAT (near transporter) domain often associated with iron transport, followed by a leucine-rich repeat region with significant sequence similarity to the internalins of Listeria monocytogenes. However, since Bacillus cereus (from which this protein was described, in PMID:16978259) is not considered an intracellular pathogen, and the function may be iron transport rather than internalization, applying the name "internalin" to this family probably would be misleading. 754
5335 380194 NF033191 JDVT-CTERM JDVT-CTERM protein-sorting domain. This bacterial C-terminal protein-sorting domain, superficially similar to MYXO-CTERM (TIGR03901), occurs in a variety of Proteobacteria, including Janthinobacterium, Duganella, Vibrio breoganii, and Thioalkalivibrio, hence the name JDVT. Its local genomic context in species examined so far includes a homolog of eukaryotic type II CAAX prenylation site proteases (see PF02517). The architecture of the domain consists of a run of Gly residues and an invariant Cys residue (a probable modification site), followed by a hydrophobic predicted transmembrane alpha helix and then a cluster of basic residues, mostly Arg. 34
5336 380195 NF033192 JDVT-CAAX JDVT-CTERM system CAAX-type protease. See NF033191. 98
5337 380196 NF033193 lipo_NDxxF NDxxF motif lipoprotein. Members of this family are lipoproteins, about 200 amino acids long in precursor form, found in Staphylococcus aureus, Bacillus cereus, and various other Firmicutes. The protein family is named for one of its several highly conserved motifs. 199
5338 380197 NF033194 lipo_EMYY EMYY motif lipoprotein. Members of this family are lipoproteins, about 300 amino acids long in precursor form, found broadly in the genus Staphylococcus and in some related species. 292
5339 380198 NF033195 F430_CfbB Ni-sirohydrochlorin a,c-diamide synthase. Members of this family are Ni-sirohydrochlorin a,c-diamide synthase, involved in synthesizing coenzyme F430, used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) the enzyme cobyrinic acid a,c-diamide synthase, involved in cobalamin biosynthesis. 451
5340 380199 NF033196 c_type_nonphoto c-type cytochrome. Members of this family are apparent c-type cytochromes that resemble the photosynthetic reaction center c-type cytochrome (PF02276) but are smaller and found in non-photosynthetic organisms. 97
5341 380200 NF033197 F430_CfbE coenzyme F430 synthase. Members of this family are coenzyme F430 synthase, involved in synthesizing coenzyme F430, which is used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) MurD, an enzyme of bacterial cell wall biosynthesis. 419
5342 380201 NF033198 F430_CfbA sirohydrochlorin nickelochelatase. Members of this family are sirohydrochlorin nickelochelatase, involved in synthesizing coenzyme F430, used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) sirohydrochlorin cobaltochelatase, involved in cobalamin biosynthesis. Some members of this family are double in length because of a duplication. 124
5343 380202 NF033200 F430_CfbC Ni-sirohydrochlorin a,c-diamide reductive cyclase ATP-dependent reductase subunit. This family, very closely related to the nitrogenase iron protein, was identified as a subunit involved in biosynthesis of coenzyme F430 in archaeal methanogens and archaeal anaerobic methanotrophs. 260
5344 380203 NF033201 Vip_LPXTG_Lm cell invasion LPXTG protein Vip. Vip (Virulence protein), like the LPXTG-type internalins, is an LPXTG-anchored surface protein of the mammalian cell-invading pathogen Listeria monocytogenes, but absent from the related species Listeria innocua. For certain cell types, Vip is required for Listeria's ability to invade. It appears to bind the endoplasmic reticulum (ER) resident chaperone Gp96 as its receptor. 414
5345 380204 NF033202 GW_glycos_SH3 GW domain. The GW domain of Listeria belongs to the clan of SH3-like domains. A similar but broader model (PF13457) occurs in Pfam. The GW domain occurs as repeats on surface proteins of the cell-invading pathogenic bacterium Listeria monocytogenes, and is involved in binding to glycosaminoglycans. Members of this family include the GW-type internalin InlB and several paralogs. 81
5346 380205 NF033203 entero_EhxA enterohemolysin EhxA. Members of this family are the RTX toxin called enterohemolysin or EhxA, because it is found in enterohemorrhagic Escherichia coli (EHEC) strains such as O157:H7. 997
5347 380206 NF033205 IPExxxVDY IPExxxVDY family protein. This protein family is uncharacterized. Member proteins average about 160 amino acids in size, and feature two widely separated invariant Asn residues, as well as the larger (though less invariant) motif for which it is named. Members are found primarily in the Flavobacteriia branch of the Bacteroidetes. 152
5348 380207 NF033206 ScyE_fam ScyD/ScyE family protein. This family includes ScyE, a protein involved in scytonemin biosynthesis and export, and its paralog ScyD. Some members of the family contain a C-terminal PEP-CTERM domain that predicts anchoring to the outer membrane. 330
5349 380208 NF033207 midcut_by_XrtH midcut-by-XrtH protein. Members of this protein family occur in bacterial genomes that encode the exosortase/archaeosortase family member XrtH (exosortase H). While many targets of XrtH are C-terminal protein-processing signals described by TIGR04174, the IPTL-CTERM domain, members of this family have a version of that signal in the N-terminal half of the protein. This architecture suggests that XrtH may perform a cleavage that releases the C-terminal domain of midcut-by-XrtH proteins from the membrane, perhaps as part of some regulatory pathway. 171
5350 411102 NF033208 choice_anch_E choice-of-anchor E domain-containing protein. This HMM describes a domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, usually PEP-CTERM but occasionally a type IX secretion (T9SS) recognition domain. 171
5351 380210 NF033210 RBP7 reticulate body protein Rbp-7. Members of this family are the 7-kDa reticulate body protein found in several species of Chlamydia. The protein is often overlooked during genome structural annotation; members observed so far are 65 to 67 amino acids long. The protein has been demonstrated by mass spectroscopy, and shown to be present only in reticulate or intermediate bodies. 74
5352 411103 NF033212 SapB_AmfS_lanti SapB/AmfS family lanthipeptide. Members of this family are class III lanthipeptide precursors. These typically are short peptides, encoded next to the gene for the lanthipeptide synthetase that creates the characteristic lanthionine (or beta-methyl lanthionine) bridges for which these natural products are named. Members of this family include SapB and AmfS, which are considered morphogens rather than antibiotics. Members also include labionin-containing peptides such as labyrinthopeptins A1 and A2. 40
5353 411104 NF033213 matur_PanM aspartate 1-decarboxylase autocleavage activator PanM. Members of this family, called PanM (or PanZ), although related to the GNAT family N-acetyltransferases, have a different function. The enzyme PanD, aspartate 1-decarboxylase, has an active site modified Ser residue, created by cleavage of a precursor form. PanM promotes the maturation of the CoA biosynthesis enzyme PanD, but also inhibits its activity in the presence of CoA. Figure 6 in PMID:26276430 identifies residues considered critical to interaction with PanD; seed alignment sequences and cutoff scores were chosen to separate proposed PanM from functionally distinct relatives. 130
5354 380213 NF033214 ComC_Streptocco competence-stimulating peptide ComC. Members of this family are ComC, a secreted peptide that stimulates competence for natural transformation in Streptococcus. ComC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish ComC from other pheromone precursors, such as BlpC. 41
5355 380214 NF033215 BlpC_Streptocco quorum-sensing system pheromone BlpC. Members of this family are BlpC, a peptide pheromone that stimulates production of BLP (bacteriocin-like peptides) family class II bacteriocins. BlpC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish BlpC from other pheromone precursors, such as ComC. 51
5356 380215 NF033216 lipo_YgdI_YgdR YgdI/YgdR family lipoprotein. Members of this family are exclusively lipoproteins of small size, including YgdI and YgdR from E. coli K-12. 71
5357 380216 NF033217 Fur_reg_FbpC Fur-regulated basic protein FbpC. Members of this family are FbpC, Fur-regulated basic protein C. This protein has also been described as MrgC (metal-regulated gene C). Members of this family are found so far only in the genus Bacillus, although the small size may have interfered with gene-finding. 29
5358 411105 NF033218 anchor_AmaP alkaline shock response membrane anchor protein AmaP. The founding member of this family, AmaP (Asp23 membrane anchoring protein), is related to Asp23 through part of its length, but includes a highly hydrophobic N-terminal region that should make it an integral membrane protein. Asp23 (alkaline shock protein of 23 kDa), described in PMID:7864904, is a cytosolic protein in Staphylococcus aureus, strongly induced by a pH shift from 7 to 10, and also recruited to the membrane. AmaP appears to be the partner protein with an integral membrane segment and the ability to anchor Asp23 to the membrane. This model was built to identify full-length homologs of AmaP, while excluding Asp23. Some but not all members of this family score above the cutoffs of Pfam model version PF03780.11, but full-length homologs of Asp23 score considerably higher. Asp23 family members previously were known as DUF322. 166
5359 380218 NF033222 listolys_S listeriolysin S family toxin. Members of this family include listeriolysin S from some strains of Listeria monocytogenes, and staphylolysin S. Members are encoded in biosynthetic clusters similar to those for streptolysin S (found by model TIGR03602), a precursor similar in length, architecture, and composition, but different enough to require a different HMM. 32
5360 380219 NF033223 YHYH_alt YHYH domain. Proteins with this form of YHYH motif-containing domain have it located near the N-terminus of a protein just after a signal peptide region. The domain has two characteristic motifs, GxC and YH[YC]H, separated by a short spacer region of variable length. A different family of YHYH domain-containing proteins, in which the YHYH domain is more C-terminal and is repeated, is described by Pfam model PF14240. 25
5361 380220 NF033224 PmrR LpxT activity modulator PmrR. Members of this family are PmrR, an extremely small protein at 29 amino acids in length. 29
5362 380221 NF033225 spore_CmpA cortex morphogenetic protein CmpA. Members of this family are CmpA (cortex morphogenetic protein A), a small protein (37 amino acids) involved in endospore formation and frequently missed during genome annotation. 36
5363 380222 NF033226 small_MntS manganese accumulation protein MntS. Members of this family are MntS, a small protein of about 42 amino acids that seems to assist bacteria in accumulating manganese when iron is limiting. It may function as a manganese chaperone, or may inhibit manganese efflux transporters. 41
5364 380223 NF033227 Fur_reg_FbpB Fur-regulated basic protein FbpB. This model describes FbpB (Fur-regulated basic protein B), one of three paralogous small proteins recognized by Pfam model PF13040 in Bacillus subtilis. 43
5365 380224 NF033228 div_inhib_SidA cell division inhibitor SidA. This protein, SidA (SOS-induced inhibitor of cell division A), is found so far in Caulobacter and Phenylobacterium. It interacts with FtsW. 29
5366 380225 NF033229 small_MgtR protein MgtR. 30
5367 411106 NF033230 phage_region_01 phage region protein. Members of this family are found broadly in the Gammaproteobacteria and may be a marker of temperate phage or prophage. A member (YP_008766900.1) occurs in Shigella phage SfIV. 142
5368 380226 NF033231 small_Blr division septum protein Blr. Members of this family are Blr, named beta-lactam resistance protein because mutants have heightened sensitivity to beta-lactam antibiotics, but actually involved in cell division. The protein is very small. 40
5369 380227 NF033232 small_YtzI YtzI protein. Members of this family include YtzI from Bacillus subtilis, and homologs widely distributed in the Firmicutes. The pattern of sequence conservation suggests the protein begins with a hydrophobic stretch without any basic residue near the initiator Met residue. Members of this family average about 53 residues in length. At the time this model was constructed, members were included in Pfam model PF12606.6 (Tumour necrosis factor receptor superfamily member 19), seemingly in error. This model was constructed to separate the two families. 41
5370 380228 NF033233 twin_helix twin transmembrane helix small protein. Members of the seed alignment for this family are small (average length 68 residues), strictly bacterial, and extremely hydrophobic. Pfam model PF04588 (HIG_1_N) includes both eukaryotic proteins, including a protein from the fish Gillichthys mirabilis, and the members of this family. Similarity between those eukaryotic proteins and the members of this model may represent convergent evolution related to the similar composition of their transmembrane alpha-helical regions, rather than a common origin or common function. 58
5371 380229 NF033376 lat_flg_LafA_1 lateral flagellin LafA. This HMM describes a rare second type of flagellin from E. coli and some closely related species, called LafA, where the familiar and common flagellin, FliC, is nearly universal and carries the H-antigen used for serotyping strains. In contrast to FliC, whose center region is highly variable, LafA shows little variability in sequence. In many E. coli strains, the Flag-2 locus either is absent or is cryptic, appearing degraded and non-functional. 304
5372 380230 NF033377 OMA_tautomer 4-oxalomesaconate tautomerase. 347
5373 380231 NF033379 FrucBisAld_I fructose-bisphosphate aldolase class I. This family consists of fructose-bisphosphate aldolase class I. All members of the seed alignment are from prokaryotes, although class I is the common form in plants and animals. The common form in prokaryotes is class II. 324
5374 380232 NF033380 Rlm_2499C5 23S rRNA (cytosine(2499)-C(5))-methyltransferase. This model describes a 23S rRNA modification related to RlmI of E. coli, but that modifies site C2499 rather than C1962. 391
5375 411107 NF033381 MonaBetaBRL_TX monalysin family beta-barrel pore-forming toxin. Members of this family are secreted in a water-soluble pro-toxin form, but undergo cleavage and oligomerization to form beta-barrel pores. The founding member of the family is monalysin from Pseudomonas entomophila. This family is built narrowly, and therefore excludes a set of pore-forming proteins (not necessarily toxins) from a eukaryote, Dictyostelium. Analogous (but perhaps not homologous) beta-type pore-forming toxins include aerolysin and leukocidin. 230
5376 380234 NF033382 OMP_33_36 porin Omp33-36. Members of this family are outer membrane beta-barrel proteins that facilitate passive transport from the extracellular milieu into the periplasm. Known members are limited to the genus Acinetobacter, and the name, Omp33-36, reflects variability of this protein across the lineage. Note that this HMM previously was named CarO in error. Both this protein and CarO affect carbapenem transport across the outer membrane and thus carbapenem susceptibility or resistance. 293
5377 411108 NF033383 induct_EntF EntF family bacteriocin induction factor. Members of this family have leader sequences like bacteriocins (see TIGR01847), but characterized examples function as signaling peptides that induce production of a nearby encoded bacteriocin, rather than as bacteriocins themselves. The founding member of this family is enterocin induction factor EntF. 39
5378 380236 NF033384 enterocin_MR10 enterocin L50 family leaderless bacteriocin. Members of this family are leaderless peptide components of bacteriocins in which the two subunits share about 74% identity with each other and are each about 43 amino acids long. Members include enterocin subunits L50A and L50B, MR10A and MR10B, etc. 43
5379 380237 NF033385 enterocin_LsbB LsbB family leaderless bacteriocin. Members of this family are leaderless peptide components of bacteriocins with a conserved motif KXXXGXXPWE. 35
5380 380238 NF033388 ubiq_like_UBact ubiquitin-like protein UBact. This HMM describes a protein family that includes most, but not all, of the proteins designated UBact (a ubiquitin-like protein) in the article first describing a biosystem related to bacterial pupylation. Protein modification by ubiquitin in eukaryotes, by pupylation in many bacteria, and by this system in a few other bacteria is considered a signal that can trigger altered protein handling such as rapid degradation. Proteins that the authors consider members of the same family, but that show very little sequence similarity other than protein size and the final two residues, include WP_008669967.1 in the genus Rhodopirellula and OHA48658.1 in Candidatus Terrybacteria. 54
5381 411109 NF033389 scrub_typh_TSA22 major outer membrane protein TSA22. Members of this family are TSA22, one of three major outer membrane proteins, the so-called type-specific antigens, in two species of Orientia, including O. tsutsugamushi, causative agent of scrub typhus. The other type-specific antigens are TSA47 (a serine protease in the DegP/HtrA family) and the much better known TSA56, which is quite variable in size and may be used for strain typing. 202
5382 380240 NF033390 Orientia_TSA56 type-specific antigen TSA56. This protein is the immunodominant major cell surface protein of Orientia tsutsugamushi, known as "56-kDa type-specific antigen" or TSA56. It should not be confused with unrelated proteins TSA47 (a serine protease) or TSA22. An ortholog is found in Orientia chuto, and included in the seed alignment. 525
5383 380241 NF033391 lipid_A_LpxO lipid A hydroxylase LpxO. Members of this family are LpxO, an enzyme that modifies one of the lipid chains in lipid A by hydroxylation, with resulting changes in resistance to the host immune response and to the antibiotic colistin. This family, as built, includes LpxO1 from Pseudomonas aeruginosa, but not its paralog LpxO2. 297
5384 380242 NF033392 PSM_delta PSM-delta family phenol-soluble modulin. Members of this family are phenol-soluble modulins (short peptides, usually cytolysins) with an intact N-formyl-methionine at the N-terminus. 23
5385 380243 NF033393 TRP47_fam_Nterm TRP47 family tandem repeat effector N-terminal domain. This HMM describes a conserved N-terminal domain of a family of proteins found, so far, only in the genus Ehrlichia. The N-terminal domain is followed by a long repeat region with a large content of acidic and serine residues, but other than in composition, the repeats themselves may be unrelated from one lineage to another. Characterized examples, such as TRP47 from Ehrlichia chaffeensis, are glycoproteins and are immunodominant antigens. 105
5386 411110 NF033394 capsid_maj_Podo phage major capsid protein. A founding member of this family, AKO59007.1, was identified as the major head protein in Brucella phage 02_19 during a comparison of Brucella phage genomes. The N-terminal half appears to be the better conserved region with fewer insertions and deletions. 311
5387 380245 NF033395 fibronec_SfbI fibronectin-binding protein SfbI. SfbI is a fibronectin-binding protein with a C-terminal LPXTG region that mediates processing by sortase and covalent attachment to the cell wall. Near the N-terminus is a TED domain, which includes a Cys residue that forms a covalent thioester bond. 555
5388 380246 NF033396 pilus_ancill_1 pilus ancillary protein 1. 737
5389 380247 NF033399 thiazolyl_GetA GE37468 family thiazolyl peptide. 47
5390 380248 NF033400 thiazolyl_B thiazolylpeptide-type bacteriocin. 45
5391 380249 NF033401 thiazolyl_BerA thiocillin/thiostrepton family thiazolyl peptide. Members of this family include the precursor peptides for the antibiotics thiostrepton, nosiheptide, thiocillin, and berninamycin. 42
5392 380250 NF033402 linaridin_RiPP linaridin family RiPP. Linaridins are ribosomally translated, post-translationally modified peptide natural products, or RiPPs. Examples include cypemycin, SGR-1832, and legonaridin. 63
5393 380251 NF033403 linaridin_rel linaridin-like RiPP. Members of this family share N-terminal (leader peptide) sequence with the linaridin family of ribosomally translated, post-translationally modified natural product precursors. 64
5394 380252 NF033404 YneK putative protein YneK. Members of this family, YneK, are found so far only in Escherichia coli and Escherichia albertii, with 68% sequence identity but in identical gene neighborhoods. In E. coli O157:H7 strain Sakai, yneK has a nonsense mutation and is therefore truncated. The function is unknown. 371
5395 411111 NF033407 SnoaL_meth_ester SnoaL/DnrD family polyketide biosynthesis methyl ester cyclase. This HMM represents mutually closely related methyl ester cyclases from a number of polyketide biosynthesis pathways. Examples include proteins designated SnoaL (nogalamycin biosynthesis), DnrD (doxorubicin biosynthesis), RdmA (rhodomycin biosynthesis), etc. 144
5396 380254 NF033411 small_mem_YnhF YnhF family membrane protein. Members of this protein family are small membrane proteins, about 29 amino acids in length. YnhF from E. coli was shown to have an intact fMet residue at the N-terminus and to be chloroform-soluble. The previously generated narrow cluster PRK14756 includes some members of this family. 29
5397 380255 NF033412 primase_PriX eukaryotic-type DNA primase noncatalytic subunit PriX. In most archaea, the eukaryotic-type DNA primase has catalytic subunit PriS and a regulatory subunit PriL. The proteins in this family are PriX, an essential second noncatalytic subunit found in a subset of the archaea. 98
5398 380256 NF033413 RiPP_TM1316 Cys-rich RiPP peptide. Members of this family include the small, Cys-rich peptide TM1316 of Thermotoga maritima, encoded near the peptide-modifying radical SAM/SPASM protein TM1317 (AE000512.1). TM1316 is expressed at very high levels in stationary phase. 31
5399 380257 NF033414 bottro_RiPP bottromycin family RiPP peptide. Bottromycins are one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics. 44
5400 380258 NF033415 thiovirid_RiPP thioviridamide family RiPP peptide. Thioviridamide represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics. 72
5401 380259 NF033416 YM-216391_RiPP YM-216391 family RiPP peptide. YM-216391 represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics. 35
5402 380260 NF033417 glycocin_F_RiPP glycocin F family RiPP peptide. Glycocin F, from Lactobacillus plantarum strain KW30, represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics. Members of the family are glycosylated, which is uncommon among RiPP natural products. 68
5403 380261 NF033418 T6SS_TagK type VI secretion system-associated protein TagK. Members of this family have full-length homology to SciF, a type VI secretion system (T6SS) protein from Salmonella typhimurium island SPI-6. Homologs occur in some but not all T6SS loci, and the broader family is now called TagK. 304
5404 380262 NF033419 T6SS_TagK_dom TagK family protein C-terminal domain. 127
5405 411112 NF033420 T6SS_PAAR_dom type VI secretion system PAAR domain. The PAAR domain is widespread, but this model represents a narrow clade that may occur in type VI secretion systems (T6SS), either free-standing or fused to a long extension. Effector domains of T6SS may be separate proteins, or may be fused to one of the tube or spike proteins: VgrG (spike), Hcp (tube), or this PAAR family (spike tip). Members of this family that have long extensions are likely to be T6SS effectors. 94
5406 411113 NF033422 onco_T4SS_CagA type IV secretion system oncogenic effector CagA. CagA, an effector injected into host cells by the type IV secretion system (T4SS) apparatus of Helicobacter pylori, is an oncogenic toxin. Tyrosine phosphorylation at multiple Glu-Pro-Ile-Tyr-Ala (EPIYA) motifs creates a scaffold that interacts with multiple host signaling systems and sometimes allows neoplasias to begin in gastric epithelial cells. 1056
5407 380266 NF033424 chlamy_CPAF protease-like activity factor CPAF. CPAF (chlamydial protease/proteasome-like activity factor) is a serine protease secreted by various species of the intracellular pathogen Chlamydia. Early attribution of contributions of CPAF to virulence by cleavage of specific host cell substrates contains a number of errors, and the true role of CPAF, and its contributions to virulence, remain under study. 565
5408 380267 NF033425 PSM_alpha_1_2 alpha-1/alpha-2 family phenol-soluble modulin. Members of this family are extremely short proteins, about 21 amino acids long, that are known to retain an N-formyl-methionine (fMet) at the N-terminus. These proteins, phenol-soluble modulins of the alpha class, including alpha-1 and alpha-2 from Staphylococcus aureus, are exported by an ABC transporter, and affect the state of the host immune system in a number of ways. 21
5409 380268 NF033426 PSM_alpha_3 alpha-3 family phenol-soluble modulin. 22
5410 380269 NF033427 PSM_alpha_Shaem alpha family phenol-soluble modulin. This family is based on an alpha family Staphylococcus haemolyticus phenol-soluble modulin, somewhat similar to delta-lysin. 20
5411 380270 NF033428 PSM_epsilo epsilon family phenol-soluble modulin. Members of this family are epsilon-family phenol-soluble modulins. Species with members include Staphylococcus epidermidis, Staphylococcus capitis, Staphylococcus lugdunensis, Staphylococcus pseudintermedius, and Staphylococcus schleiferi. 21
5412 411114 NF033429 ImuA_translesion translesion DNA synthesis-associated protein ImuA. A three-gene cassette encoding ImuA, ImuB, and ImuC ("inducible mutagenesis") is induced by DNA damage, is capable of DNA synthesis across DNA damage lesions, and consequently is associated with mutagenesis. This family, ImuA (previously misnamed SulA in Pseudomonas putida) shows some homology to SulA itself and to RecA. ImuB resembles Y-family polymerases but may be catalytically inactive. ImuC is a C-family polymerase, and catalytically active. 181
5413 380272 NF033430 TfxA_RiPP trifolitoxin family RiPP peptide. Trifolitoxin is a ribosomally synthesized and post-translationally modified peptide natural product (RiPP) antibiotic produced by some strains of Rhizobium and active against others. At the time of building this model, only two variant sequences from this family are detected, both 42 amino acid-long TfxA peptides that differ at only two positions. 42
5414 380273 NF033431 cinnamycin_RiPP cinnamycin family lantibiotic. Members of this family are RiPP precursor peptides from which the lantibiotic cinnamycin is the most heavily studied. Mature cinnamycin is 19 amino acids long with nine post-translational modifications, including lanthionine, methyllanthionine, and lysinoalanine bridge modifications. 77
5415 380274 NF033432 ThioGly_TfuA_rel TfuA-related McrA-glycine thioamidation protein. 211
5416 380275 NF033433 NisI_immun_dup NisI/SpaI lantibiotic immunity lipoprotein domain. This HMM describes a domain that occurs twice in the nisin lantibiotic self-immunity lipoprotein NisI, and once in the subtilin lantibiotic self-immunity lipoprotein SpaI, and once or twice in numerous other known or putative lantibiotic resistance lipoproteins. 104
5417 380276 NF033434 AzmA_fam_RiPP azolemycin family RiPP peptide. The azolemycin precursor peptide, AzmA, contains a mature (core) peptide region derived from the sequence VVSTCTI. Examining genomic context suggests that RiPP precursor peptides related to AzmA are much more similar in the leader peptide region than in the core region, and that the proper length for the precursor peptide is probably about 36 amino acids. 36
5418 380277 NF033435 S-layer_Clost S-layer protein SlpA. In Clostridioides difficile, the S-layer protein precursor, SlpA, is one member of a large paralogous family of proteins that share several cell wall-binding repeats. SlpA is cleaved into a larger and smaller protein. The S-layer protein itself is important to adhesion, and portions of it are highly variable, while the N-terminal and C-terminal regions are well conserved. 728
5419 380278 NF033436 SpoVM_broad stage V sporulation protein SpoVM. Members of this family are SpoVM (stage V sporulation protein M). 26
5420 380279 NF033437 YpdK membrane protein YpdK. 23
5421 411115 NF033438 BREX_BrxD BREX system ATP-binding protein BrxD. BrxD is an ATP-binding protein found in types 2 and 6 of BREX (bacteriophage exclusion) phage resistance systems. 423
5422 380281 NF033439 small_mem_YoeI membrane protein YoeI. YoeI, a hydrophobic protein of only 20 amino acids, is found in at least these genera: Escherichia, Salmonella, Citrobacter, Enterobacter, and Klebsiella. It is known to be expressed in E. coli. 20
5423 380282 NF033440 small_YrbN protein YrbN. YrbN, a small protein of only 26 amino acids, is found in at least these genera: Escherichia, Salmonella, Enterobacter, Klebsiella, and Yersinia. It is known to be expressed in E. coli. 26
5424 380283 NF033441 BREX_BrxC BREX system P-loop protein BrxC. BrxC is a P-loop-containing protein, and probable ATPase, from BREX (bacteriophage exclusion) systems of type 1. 1173
5425 411116 NF033442 BREX_PglW BREX system serine/threonine kinase PglW. Members of this family are PglW, a predicted serine/threonine kinase of the Pgl (phage growth limitation) system (now called BREX type 2) and the BREX type 3 system. 1387
5426 411117 NF033443 BREX_PglZ_6 BREX-6 system phosphatase PglZ. 958
5427 380286 NF033444 BREX_PglZ_5 BREX-5 system phosphatase PglZ. 704
5428 411118 NF033445 BREX_PglZ_4 BREX-4 system phosphatase PglZ. 734
5429 411119 NF033446 BREX_PglZ_2 BREX-2 system phosphatase PglZ. 890
5430 380289 NF033447 BrxE_fam BrxE family protein. This family is uncharacterized, but a subgroup within this family is BrxE, a protein of unknown function found in type 6 BREX phage resistance systems. 166
5431 380290 NF033448 BREX_6_BrxE BREX-6 system BrxE protein. Members of this family are BrxE, a protein of unknown function that is found in type 6 BREX systems of phage defense. 182
5432 411120 NF033449 BREX_PglZ_3 BREX-3 system phosphatase PglZ. BREX is a phage defense system (BacteRiophage EXclusion), with a number of described subtypes. The first described, PGL (phage growth limitation), is now called BREX-2. This model describes one of the two core proteins universal across the first six defined BREX subtypes, the phosphatase-like PglZ domain protein, as found in BREX-3 systems. 642
5433 380292 NF033450 BREX_PglZ_1_B BREX-1 system phosphatase PglZ type B. BREX (bacteriophage exclusion) is a phage resistance system, in which two protein families are core but other proteins are variable. BREX subtypes are based on PglZ domain protein, a putative phosphatase. This family is one of two major subtypes of PglZ as seen in type 1 BREX systems. Most members of this family contain an additional C-terminal domain that is not included in the seed alignment. Family TIGR02687 describes the alternative type A for PglZ of BREX-1. 672
5434 380293 NF033451 BREX_2_MTaseX BREX-2 system adenine-specific DNA-methyltransferase PglX. This protein, PglX, is a site-specific DNA methyltransferase associated with PGL (phage growth limitation), a type 2 BREX (bacteriophage exclusion) system. The phage resistance appears not to be restriction, but does manage to inhibit phage replication. 1188
5435 411121 NF033452 BREX_1_MTaseX BREX-1 system adenine-specific DNA-methyltransferase PglX. This protein, PglX, is a site-specific DNA methyltransferase associated with BREX (bacteriophage exclusion) type 1 systems. The phage resistance appears not to be through restriction-modification, as phage DNA appears not to get degraded, but it does manage to inhibit phage replication. 1187
5436 380295 NF033453 BREX_3_BrxF BREX-3 system P-loop-containing protein BrxF. This family of proteins that are about 150 amino acids in length includes BrxF from type 3 BREX (bacteriophage exclusion) systems. Most members have the P-loop motif GxxGxGKT, but the region is surprisingly poorly conserved in a sizable fraction of otherwise strongly similar proteins. 149
5437 380296 NF033454 BREX_5_MTaseX BREX-5 system adenine-specific DNA-methyltransferase PglX. 1401
5438 380297 NF033455 BREX_6_MTaseX BREX-6 system adenine-specific DNA-methyltransferase PglX. 1319
5439 380298 NF033456 RiPP_CCRG-2 CCRG-2 family RiPP leader. This model consists of the conserved leader peptide domain of CCRG-2 family ribosomal peptide natural products (RiPPs), up to the Gly-Gly presumptive cleavage site, plus one additional residue (hydrophobic, or another Gly). Members are found almost exclusively in the genus Prochlorococcus, in multiple instances per genome. 18
5440 380299 NF033457 elgicin_lanti elgicin/penisin family lantibiotic. The HMM describes the elgicin family of lantipeptides active as bacteriocins. The leader domain occurs in additional proteins that lack homology in the core region, although sharing richness in Cys residues there. 64
5441 380300 NF033458 lipid_A_LpxG UDP-2,3-diacylglucosamine diphosphatase LpxG. Members of this family are LpxG, the lipid A biosynthesis enzyme UDP-2,3-diacylglucosamine diphosphatase (EC 3.6.1.54). This family is unrelated to the more common LpxH, or to LpxI, which share the same activity. 321
5442 380301 NF033459 DksA_like RNA polymerase-binding protein DksA. 113
5443 380302 NF033460 glycerol3P_ox_II type 2 glycerol-3-phosphate oxidase. This FAD-dependent enzyme (EC 1.1.3.21), glycerol-3-phosphate oxidase, also called L-alpha-glycerophosphate oxidase, converts sn-glycerol 3-phosphate plus oxygen to glycerone phosphate plus hydrogen peroxide, which contributes to virulence. This form, called type 2, is shorter than type 1 and is found exclusively in Mycoplasmas and other members of the Mollicutes. 361
5444 411122 NF033461 glycerol3P_ox_1 type 1 glycerol-3-phosphate oxidase. Glycerol-3-phosphate oxidase, also called alpha-glycerophosphate oxidase (GlpO), is an FAD-dependent enzyme related to the glycerol-3-phosphate dehydrogenase GlpD. Notably, GlpO releases hydrogen peroxide, which can contribute to virulence. 607
5445 380304 NF033464 cyanoexo_CrtC cyanoexosortase C. Cyanoexosortase C (CrtC) belongs to the exosortase/archaeosortase family of multiple membrane proteins that act as cysteine proteases, and probably as transpeptidases, analogous to (but unrelated to) the sortases. CrtC is known so far only in Cyanobacteria, and appears so far primarily in the lesser-studied genera: Leptolyngbya, Oscillatoriales, Scytonema, Alkalinema, Phormidesmis, etc. 278
5446 380305 NF033465 PTPA-CTERM PTPA-CTERM protein sorting domain. This C-terminal sorting and processing signal, called PTPA-CTERM, is a variant of the widespread PEP-CTERM domain. It is restricted to a subset of Cyanobacteria that encode the sorting enzyme named cyanoexosortase C. 23
5447 380306 NF033471 J25_fam_lasso acinetodin/klebsidin/J25 family lasso peptide. Members of this family include precursors of at least three lasso peptides, namely microcin J25, acinetodin and klebsidin. All members of the family are encoded as neighboring genes to homologs of the processing protein genes mcjB, mcjC, and mcjD. 39
5448 380307 NF033474 DivGenRetAVD diversity-generating retroelement protein Avd. Avd (accessory variability determinant) is part of a diversity-generating retroelement (DGR) system through which a portion of a protein-coding gene can be rewritten, creating diversity that can affect host range. The founding member of this family, bAvd, from a Bordetella bacteriophage, belongs to a retrohoming element called BPP-1. Members of this family are four-helix bundle proteins, related to those of family TIGR02436, some of whose members are found in long intervening sequence (IVS) regions in 23S rRNA. 104
5449 380308 NF033477 EmaA_autotrans collagen-binding adhesin autotransporter EmaA. EmaA (extracellular matrix protein adhesin) is an outer membrane protein first described in Aggregatibacter (Actinobacillus) actinomycetemcomitans, a member of the Gammaproteobacteria. It is a glycoprotein, and has a C-terminal beta-barrel domain that marks it as an autotransporter. It serves as an adhesin that binds collagen, the most abundant material in the host extracellular matrix. It shares homology with the collagen-binding protein YadA of Yersinia enterocolitica. 1694
5450 380309 NF033478 YadA_autotrans trimeric autotransporter adhesin YadA. YadA (Yersinia adhesin A) is a type Vc secretion system autotransporter found in at least two pathogenic Yersinia species, Y. enterocolitica and Y. pseudotuberculosis. It forms trimers, with three monomers contributing to a single outer membrane beta-barrel. It binds collagen and other components of the extracellular matrix. 459
5451 380310 NF033479 Efa1_rel_toxin LifA/Efa1-related large cytotoxin. Members of this family are large and almost certainly multifunctional proteins found in various pathogens from genus Chlamydia, about 3000 amino acids in size and related to lymphostatin (Efa1/LifA) from enteropathogenic Escherichia coli. Roles have been suggested for Efa1 (EHEC factor for adherence) in adhesion, so some members have been annotated as adherence proteins rather than cytotoxins. 3223
5452 411123 NF033480 bifunc_MprF bifunctional lysylphosphatidylglycerol flippase/synthetase MprF. The C-terminal region of MprF transfers lysine from a charged tRNA onto phosphatidylglycerol to make lysylphosphatidylglycerol (EC 2.3.2.3). The N-terminal region of MprF acts as a flippase. MprF helps confer resistance to antimicrobial cationic peptides. 839
5453 411124 NF033481 auto_Ata trimeric autotransporter adhesin Ata. Ata (Acinetobacter trimeric autotransporter) has an architecture that consists of a long signal peptide, a repetitive passenger domain that varies in length from strain to strain, and a C-terminal domain of four transmembrane beta strands that forms one third of the pore for autotransporter activity and anchoring in the outer membrane. 1862
5454 411125 NF033482 RiPP_thiocil thiocillin family RiPP. Members of this family are ribosomally synthesized and post-translationally modified peptide natural product (RiPP) precursors. Nearly all contain at least one Cys residue expected to be involved in thiazolyl peptide modifications, and are expected to behave at least in part as antibiotics. Genomes of some species (Bacillus atrophaeus, Streptomyces coelicoflavus, etc.) may contain multiple paralogs. 48
5455 411126 NF033483 PknB_PASTA_kin Stk1 family PASTA domain-containing Ser/Thr kinase. 563
5456 411127 NF033484 Stp1_PP2C_phos Stp1/IreP family PP2C-type Ser/Thr phosphatase. Many Gram-positive bacteria have a protein kinase/protein phosphatase gene pair that responds to peptidoglycan metabolites and can be instrumental in resistance to beta-lactam antibiotics. Characterized examples of the phosphatase component are Stp1 of Staphylococcus aureus and IreP of Enterococcus faecalis. 232
5457 411128 NF033485 small_SCO1431 SCO1431 family membrane protein. Members of this family, including SCO1431 from Streptomyces coelicolor A3(2), are small and extremely hydrophobic proteins that lack an N-terminal signal peptide. Known members are restricted to the genus Streptomyces, where the protein family is widespread. 47
5458 411129 NF033486 harvest_ssl1498 ssl1498 family light-harvesting-like protein. Members of this family appear restricted to the Cyanobacteria, and include ssl1498 from Synechocystis sp. PCC 6803. These proteins are small, usually about 56 amino acids, with an N-terminal half related to a number of light-harvesting proteins, and a highly hydrophobic C-terminal half likely to be embedded in membrane. 53
5459 411130 NF033487 Lacal_2735_fam Lacal_2735 family protein. This small protein is widespread but uncharacterized. Most members are shorter than 60 amino acids in length. 54
5460 411131 NF033488 lmo0937_fam_TM lmo0937 family membrane protein. Members of this family are very small (about 45 amino acids) and highly hydrophobic, suggesting a presence in the membrane, and have a broad phylogenetic distribution. The member protein lmo0937, from the pathogen Listeria monocytogenes, is described as up-regulated when the bacterium is in the mouse spleen, suggesting a role in stress response. 43
5461 411132 NF033490 small_SPW0924 SPW_0924 family protein. Members of this family average less than 44 amino acids in length, and are found exclusively in the Actinobacteria (mostly Streptomyces). The N-terminal half is organized like a signal peptide, beginning Met-Arg and then continuing with a hydrophobic stretch that is unusually rich in alanine. The C-terminal region has a nearly invariant motif TSPxPLLTTVP. The function is unknown. 44
5462 411133 NF033491 BA3454_fam BA3454 family stress response protein. BA3454, a protein less than 45 amino acids long, is up-regulated strongly by SpxA2 during stress conditions. Related proteins are found widely in the genus Bacillus, although not in Bacillus subtilis. 43
5463 411134 NF033492 podovir_small putative phage replication protein. Members of this family of very small, highly hydrophobic proteins are restricted to the genus Acinetobacter, and appear to be entirely of phage origin. One member is encoded in the Podoviral Bacteriophage YMC/09/02/B1251 ABA BP (although the coding region is not predicted in GenBank record JX403940.1). Evidence suggesting the reading frame really does encode protein includes the overlap of a TGA stop codon with an ATG start codon at both ends of this protein, suggesting translational coupling with the much larger adjacent genes immediately upstream and downstream. 36
5464 411135 NF033493 MetS_like_NSS MetS family NSS transporter small subunit. MetS, as described in the Gram-positive bacterium Corynebacterium glutamicum, is the small subunit of MetPS, an NSS (Neurotransmitter:Sodium Symporter) transporter involved in methionine and alanine import. While MetS itself is small, only 60 amino acids, homologs in gamma proteobacteria such as Vibrio sp., similarly found next to an NSS transporter large subunit, may be barely half that length and consist almost entirely of a predicted hydrophobic region that would localize to within the plasma membrane. 30
5465 411136 NF033494 NSS_import_MetS methionine/alanine import NSS transporter subunit MetS. 53
5466 411137 NF033495 phage_BC1881 BC1881 family protein. Members of this family of very small proteins (average length is about 50 residues) include BC1881 from the phBC6A51 prophage region of the Bacillus cereus ATCC 14579 genome. 45
5467 411138 NF033496 DUF2080_fam_acc DUF2080 family transposase-associated protein. Members of this family appear restricted to the archaea. They tend to be encoded upstream of predicted transposase genes within insertion sequences such as ISNagr11, ISHca1, ISH36, etc. The widespread distribution suggests this protein may be more than a mere passenger gene and may participate in some transposase-associated function. See PF09853, COG3466, and arCOG03884 for alternative (currently narrow) treatments of this family. 34
5468 411139 NF033497 rubre_like_arch rubrerythrin-like domain. This rubrerythrin-like domain is found primarily in the archaea, occasionally as part of a larger redox-active protein. It features two CxxC motifs with a spacer of 12 to 13 amino acids. 34
5469 411140 NF033498 YlcG_phage_expr YlcG family protein. Members of this family include YlcG from the DLP12 prophage region of Escherichia coli K-12, and homologs from the Gifsy-1 and Gifsy-2 prophage regions of Salmonella enterica subsp. enterica serovar Typhimurium str. LT2. Members of this protein family are small, about 46 amino acids long. YlcG is known to be expressed. It is encoded immediately downstream of the Holliday junction resolvase RusA. 41
5470 411141 NF033499 Xis_Gifsy_1 excisionase Xis. Members of this family are excisionases such as Xis from the Gifsy-1 prophage of Salmonella enterica subsp. enterica serovar Typhimurium str. LT2. 92
5471 411142 NF033500 phi80_GamL host nuclease inhibitor GamL. Members of this family, including GamL from phage phi80, are phage inhibitors of host nucleases such as RecBCD and SbcCD. This family has a distant relationship to Gam (see PF06064), which inhibits RecBCD. 89
5472 411143 NF033501 ArfB_arch_rifla 2-amino-5-formylamino-6-ribosylaminopyrimidin-4(3H)-one 5'-monophosphate deformylase. MJ0116 from Methanocaldococcus jannaschii, the founding member of this family, was shown to be 2-amino-5-formylamino-6-ribosylaminopyrimidin-4(3H)-one 5'-monophosphate deformylase, catalyzing the second step in archaeal riboflavin and Fo biosynthesis. 219
5473 411144 NF033503 LarB nickel pincer cofactor biosynthesis protein LarB. This protein, related to AIR carboxylase, is part of a three protein system involved in producing a specialized nicotinic acid-derived, nickel-containing cofactor, as used in the nickel-dependent lactate racemase of lactic acid bacteria. 209
5474 411145 NF033504 Ni_dep_LarA nickel-dependent lactate racemase. LarA from Lactobacillus plantarum is a nickel-dependent lactate racemase and the founding member of a family of isomerases that depend on a nicotinic acid-derived nickel pincer cofactor. While it is not yet clear which homologs of LarA act preferentially on lactate, this model identifies one clade of architecturally similar proteins from among a broader set of LarA homologs. Note that the crystal structure 4NAR, on deposit at PDB but not associated with any publication, represents a protein from Thermotoga maritima that falls outside the scope of this family and that is annotated in PDB as a putative uronate isomerase. 417
5475 411146 NF033505 paceosortase pacearchaeosortase. Members of this family, pacearchaeosortase, are archaeosortases from the uncultured (so far) Candidatus Pacearchaeota archaeon lineage and its close relatives. In most assemblies where pacearchaeosortase is found, only one protein can be found that is likely to be its target for sorting and cleavage, and it is encoded by the adjacent gene. This dedicated arrangement suggests the adjacent gene encodes a critical surface protein, most likely one that helps form an S-layer. 167
5476 411147 NF033506 PACE-CTERM-PROT putative S-layer protein. Assembled genomes from the Candidatus Pacearchaeota archaeon group and its close relatives, so far all uncultured, have a single archaeosortase, called pacearchaeosortase. A search for archaeosortase target proteins, with a C-terminal domain resembling other archaeosortase and exosortase sorting signal regions, found this family as the best candidate. It is nearly always encoded by a gene found adjacent to the pacearchaeosortase gene. This dedicated arrangement, a sorting enzyme encoded next to its only predicted sorting substrate, suggests that members of this family, called PACE-CTERM, may be an important and abundant surface protein, most likely the major S-layer protein. 558
5477 411148 NF033507 Loki-CTERM Loki-CTERM protein-sorting domain. 26
5478 411149 NF033510 Ca_tandemer Ca2+-stabilized adhesin repeat. This repeat is found in proteins such as the biofilm-associated protein Bap of Acinetobacter baumannii (which can exceed 8000 amino acids in length), the calcium-stabilized ice-binding adhesin of the Antarctic bacterium Marinomonas primoryensis, and the giant calcium-binding adhesin SiiE of Salmonella enterica. 97
5479 411150 NF033511 metallo_CpaA metalloendopeptidase CpaA. 575
5480 411151 NF033512 T2SS_chap_CpaB metalloprotease secretion chaperone CpaB. The cpaA and cpaB gene pair, as described in the genus Acinetobacter, consists of a metalloendopeptidase virulence factor, CpaA, and a tightly binding membrane-bound chaperone, CpaB, important for its secretion by a type II secretion system (T2SS). CpaA, in at least some Acinetobacter, is the most heavily secreted T2SS effector, and behaves as a virulence factor that cleaves factor V in blood and alters coagulation. 205
5481 411152 NF033515 lipo_6_6_Borrel Lp6.6 family lipoprotein. In Borrelia burgdorferi, the tiny lipoprotein Lp6.6 is plasmid-borne and was originally described as a major low-molecular-weight lipoprotein that could constitute 2% of the dry weight of defatted cells. Lp6.6 was later found to facilitate transmission from ticks to mice, and to be down-regulated after infection. 60
5482 411153 NF033516 transpos_IS3 IS3 family transposase. 369
5483 411154 NF033517 transpos_IS66 IS66 family transposase. Members of this protein family are DDE transposases from the IS66 family insertion sequences, which typically consist of two accessory genes (TnpA and TnpB) and the third gene encoding the transposase. 388
5484 411155 NF033518 transpos_IS607 IS607 family transposase. 187
5485 411156 NF033519 transpos_ISAzo13 ISAzo13 family transposase. 387
5486 411157 NF033520 transpos_IS982 IS982 family transposase. Currently, there are 46 seed sequences in this family. 243
5487 411158 NF033521 lasso_leader_L3 lasso RiPP family leader peptide. 20
5488 411159 NF033522 lasso_benenodin benenodin family lasso peptide. This family consists of both the leader (removed) and core (mature) portions of precursor peptides from the benenodin family of lasso peptides. 43
5489 411160 NF033523 lasso_peptidase Atxe2 family lasso peptide isopeptidase. 637
5490 411161 NF033524 lasso_PadeA_fam paeninodin family lasso peptide. Members of this family are lasso peptides in the paeninodin family, mostly from the genera Bacillus, Paenibacillus, and Thermobacillus. The HMM covers the leader peptide region but only about half of the core peptide region, since it is quite diverse. 30
5491 411162 NF033525 lasso_albusnod albusnodin family lasso peptide. Members of this family are lasso peptides in the family of albusnodin, and appear limited so far to the Actinobacteria. Members are more strongly conserved in the core peptide region than in the leader peptide region, which is unusual for ribosomally produced, post-translationally modified natural products. The founding member of this family is not only circularized by the formation of an isopeptide bond, but also acetylated. 40
5492 411163 NF033527 transpos_Tn3 Tn3 family transposase. 954
5493 411164 NF033528 lasso_cyano lasso peptide. Members of this family are lasso peptide precursors of a type common in the Cyanobacteria and so far restricted to them. Some members have a pair of Cys residues in the core peptide region, C-terminal to the region modeled in the HMM, and therefore would form bicyclic compounds. 35
5494 411165 NF033529 transpos_ISLre2 ISLre2 family transposase. 378
5495 411166 NF033530 lasso_PqqD_Strm lasso peptide biosynthesis PqqD family chaperone. Members of this family are homologs of PqqD, a chaperone that binds RiPP peptide precursors for their modification into bioactive natural products. By context, this set is involved in the biosynthesis of threaded-lasso peptides. This model focuses on lasso peptide systems from Actinobacteria. Similar systems, with different subfamilies of PqqD-related peptides, occur in lasso peptide systems in other lineages. A characterized example is LarB1 from the lariatin system of Rhodococcus jostii. 78
5496 411167 NF033532 lone7para_assoc type VII secretion system-associated protein. Members of this family occur almost exclusively in the genus Streptomyces, in the context of type VII secretion systems (T7SS). Several paralogs may accompany a single T7SS. A few members of this family are large proteins with additional domains that add or remove ADP-ribosylations, suggesting that all family members may have effector activity as well, and that the longer members of the family are multifunctional effector proteins. 162
5497 411168 NF033533 lone7_assoc_B type VII secretion system-associated protein. Members of this family are found almost entirely in the genus Streptomyces, and are associated with a type VII secretion system (T7SS). 135
5498 411169 NF033534 rhodolasso lasso peptide. Members of this family are lasso peptides as found in the genus Rhodothermus. The leader peptide region shows sequence relatedness to several other groups of lasso peptide precursors. Known members of this family have a pair of Cys residues in the core peptide region, suggesting a bicyclic product, with one cross-link from the signature lasso peptide isopeptide bond and the other from the cysteine disulfide bond. This subfamily of lasso peptide appears not to have been discussed in the literature yet. 49
5499 411170 NF033535 lass_lactam_cya lasso peptide isopeptide bond-forming cyclase. Members of this family are the isopeptide bond-forming cyclase of lasso peptide biosynthesis systems, from a subgroup that contains primarily cyanobacterial examples. These proteins resemble the glutamine-hydrolyzing asparagine synthase AsnB (EC 6.3.5.4). 668
5500 411171 NF033536 lasso_PqqD_Bac lasso peptide biosynthesis PqqD family chaperone. Members of this family are homologs of PqqD, a chaperone that binds RiPP peptide precursors for their modification into bioactive natural products. By context, this set is involved in the biosynthesis of threaded-lasso peptides. This model focuses on lasso peptide systems from Firmicutes. Similar systems, with different subfamilies of PqqD-related peptides, occur in lasso peptide systems in other lineages. 88
5501 411172 NF033537 lasso_biosyn_B2 lasso peptide biosynthesis B2 protein. 130
5502 411173 NF033538 transpos_IS91 IS91 family transposase. 376
5503 411174 NF033539 transpos_IS1380 IS1380 family transposase. Proteins of this family are DDE type transposases, which are encoded by IS1380 family elements. It was first identified and characterized in an Acetobacter pasteurianus mutant with ethanol oxidation deficiency caused by disruption of the cytochrome c gene by the IS1380 element. 417
5504 411175 NF033540 transpos_IS701 IS701 family transposase. Members of this family are transposases in the family of that of insertion element IS701, narrowly defined. Note that a molecular phylogenetic tree of the broader sets of transposases from IS elements classified as IS701 family or IS4 family by ISFINDER shows the two groups interleaved. This model represents an unambiguous clade that includes IS701 itself and the majority of proteins called IS701 family. The poorly conserved C-terminal region of members of this family is not included in the seed alignment. 345
5505 411176 NF033541 transpos_ISH3 ISH3 family transposase. This family contains transposases from the insertion element ISH3, and related transposases from other mobile elements with similar transposases. This model reproduces the classification from ISFinder except for ISC1439B-like transposases, since those are extremely different. 298
5506 411177 NF033542 transpos_IS110 IS110 family transposase. Proteins of this family are DEDD (Asp, Glu, Asp, Asp) type transposases, which are encoded by the IS110 family elements. 345
5507 411178 NF033543 transpos_IS256 IS256 family transposase. Members of this family belong to the branch of the IS256-like family of transposases that includes the founding member. It excludes the IS1249 group. 406
5508 411179 NF033544 transpos_IS1249 IS1249 family transposase. Members of this family belong to the IS1249 group branch of the broader IS256 family of transposases. This group differs sharply from the main branch, which includes founding member IS256 itself, by having an N-terminal region with two pairs of closely spaced Cys residues. 381
5509 411180 NF033545 transpos_IS630 IS630 family transposase. 298
5510 411181 NF033546 transpos_IS21 IS21 family transposase. 296
5511 411182 NF033547 transpos_IS1595 IS1595 family transposase. Most transposases of this family, IS1595, have an additional short N-terminal domain with a pair of CxxC motifs. 211
5512 411183 NF033550 transpos_ISL3 ISL3 family transposase. 369
5513 411184 NF033551 transpos_IS1182 IS1182 family transposase. Members of this family are transposases of the IS1182 family. About two-thirds of the members of this family have an extra domain between the middle and the C-terminal domain, about 50 amino acids in size and containing four invariant Cys residues. 437
5514 411185 NF033553 MerP_Gpos mercury resistance system substrate-binding protein MerP. Members of this family are MerP, a substrate-binding protein for a system in which toxic Hg(II) is transported into the cytosol, where it can be reduced to the much less toxic form Hg(0). Members of this family are found, so far, in Gram-positive bacteria. A related MerP family, as found in plasmid-borne systems in Gram-negative bacteria, is described by TIGR02052. 111
5515 411186 NF033554 floc_PepA flocculation-associated PEP-CTERM protein PepA. PepA was described in Zoogloea resiniphila as a PEP-CTERM protein regulated by the PrsK/PrsR two-component system. Knocking out that system blocks flocculation, after which expression of recombinant PepA can restore flocculation. 258
5516 411187 NF033556 MerTP_fusion mercuric transport protein MerTP. MerTP is a transport protein for the mercuric ion, Hg(II). Once imported to the cytosol, the highly toxic ion can be converted by MerA, mercury(II) reductase, to the less toxic Hg atom. 186
5517 411188 NF033557 LLB_putidacin putidacin L1 family lectin-like bacteriocin. Putidacin L1 is a well-described member of a family of lectin-like bacteriocins found almost exclusively in Pseudomonas. This subfamily is narrowly defined, and does not include the homolog pyocin L1. 274
5518 411189 NF033558 transpos_IS1 IS1 family transposase. Proteins of this family are DDE transposases encoded by the IS1 family elements usually through a translational frameshift mechanism. 199
5519 411190 NF033559 transpos_IS1634 IS1634 family transposase. Members of this protein family are DDE type transposases encoded by the IS1634 family elements, which were first identified and characterized in Mycoplasma mycoides. 463
5520 411191 NF033561 macrolact_Ik_Al albusnodin/ikarugamycin family macrolactam cyclase. Members of this family show homology to enzymes known to form the lactam bond of the isopeptide linkage of lasso peptides. This family includes the peptide cyclase involved in biosynthesis of the lasso peptide albusnodin. However, another member of this family belongs to the biosynthesis cassette for ikarugamycin, a macrolactam whose biosynthesis relies on a hybrid PKS/NRPS system, not a ribosomally produced peptide. 561
5521 411192 NF033562 BH0509_fam BH0509 family protein. This family of unknown function appears restricted to the Firmicutes. Proteomics evidence for expression was provided for the member from Bacillus cereus by Dr. Samuel Payne, Pacific Northwest National Labs. The family is named, for now, after BH0509 from Bacillus halodurans. 43
5522 411193 NF033563 transpos_IS30 IS30 family transposase. 267
5523 411194 NF033564 transpos_ISAs1 ISAs1 family transposase. 314
5524 411195 NF033566 adhes_LIC20035 LIC20035 family adhesin. LIC20035 of Leptospira interrogans was characterized as a surface exposed adhesin that binds host extracellular matrix components. Orthologs appear restricted to the genus Leptospira. Member proteins average about 430 residues in length, much of which consists of repeats. All members are predicted lipoproteins. 428
5525 411196 NF033567 act_recrut_TARP type III secretion system actin-recruiting effector Tarp. The founding member of the Tarp (translocated actin-recruiting phosphoprotein) family is CT456 from Chlamydia trachomatis. Tarp is a type III secretion system effector. Orthologs are found in other Chlamydia, but are highly variable in length because many lack much of the repeat region. 761
5526 411197 NF033570 FIB_Spiroplas cytoskeletal motor fibril protein Fib. Fib, a 59K protein also called fibrillin, is the repeating subunit of the linear fibril that runs through the shortest path along the length of Spiroplasma cells, members of the Mollicutes that lack cell walls but have a spiral shape. The fibril is a linear contractile ribbon, and Fib is its only component. 511
5527 411198 NF033571 motil_scm1_spiro motility-associated protein Scm1. Scm1 (Spiroplasma citri motility gene 1) was shown by loss-of-function mutation to be involved in motility. The Scm1 family is widespread in the genus Spiroplasma, members of the Mollicutes (bacteria with no cell wall) that have a spiral shape organized around a contractile ribbon fibril made of repeating subunits of the Fib (fibril) protein. 401
5528 411199 NF033572 transpos_ISKra4 ISKra4 family transposase. 414
5529 411200 NF033573 transpos_IS200 IS200/IS605 family transposase. Most IS200/IS605 family insertion sequences encode both this transposase, TnpA, about 130 amino acids long, and a larger accessory protein, TnpB, that may act as a methyltransferase. 126
5530 411201 NF033576 mCpol mCpol domain. The mCpol domain (minimal CRISPR polymerase) is named for its homology relationship to the catalytic domain of the CRISPR polymerases (often called Cmr2 or Cas10). It is predicted to generate cyclic nucleotides, potentially sensed by CARF domains, which in turn activate various effector domains including HEPN RNases. CARF sensors and effectors are found in conserved genome contexts. It is part of a broader class of conflict systems reliant on the production of second messenger nucleotides or nucleotide derivatives. The putative function of the mCpol domain implies that CRISPR polymerases of the type III CRISPR/Cas systems have a nucleotide synthetase functional role. 118
5531 411202 NF033577 transpos_IS481 IS481 family transposase. 283
5532 411203 NF033578 transpos_IS5_1 IS5 family transposase. 415
5533 411204 NF033579 transpos_IS5_2 IS5 family transposase. 287
5534 411205 NF033580 transpos_IS5_3 IS5 family transposase. 257
5535 411206 NF033581 transpos_IS5_4 IS5 family transposase. 284
5536 411207 NF033583 staphy_B_SbnC staphyloferrin B biosynthesis protein SbnC. SbnC, related to siderophore biosynthesis protein IucA and IucC, is encoded in Staphylococcus aureus in the sbnABCDEFGHI locus responsible for the biosynthesis of staphyloferrin B, a carboxylate-type siderophore. SbnC is found in many species of Staphylococcus. 584
5537 411208 NF033586 staphy_B_SbnF staphyloferrin B biosynthesis protein SbnF. 577
5538 411209 NF033587 transpos_IS6 IS6 family transposase. 203
5539 411210 NF033588 transpos_ISC774 IS6 family transposase. ISC774 is an example of an outlier clade of IS6 family insertion sequences. Members so far appear restricted to the archaeal genus Sulfolobus. 195
5540 411211 NF033589 staphy_B_SbnI bifunctional transcriptional regulator/O-phospho-L-serine synthase SbnI. SbnI is a bifunctional protein involved in staphyloferrin B (staphylobactin) biosynthesis in Staphylococcus aureus and other members of the genus. The N-terminal region is heme-binding, and loses the ability to bind DNA when heme is bound. Under low iron conditions, the biosynthesis operon for staphyloferrin B, a carboxylate-type siderophore, is derepressed. The C-terminal domain is a kinase that acts on free serine, producing O-phospho-L-serine, which is used as one of the precursors of staphyloferrin B. 254
5541 411212 NF033590 transpos_IS4_3 IS4 family transposase. 403
5542 411213 NF033591 transpos_IS4_2 IS4 family transposase. 340
5543 411214 NF033592 transpos_IS4_1 IS4 family transposase. 332
5544 411215 NF033593 transpos_ISNCY_1 ISNCY family transposase. The ISNCY insertion sequence family, as defined by ISFinder, encodes several apparently unrelated families of transposases. Members of this family resemble the transposases of ISNCY family elements such as ISRm17 from Sinorhizobium meliloti, ISMav9 from Mycobacterium avium, and ISNfl1 from Nostoc commune. 444
5545 411216 NF033594 transpos_ISNCY_2 ISNCY family transposase. The ISNCY insertion sequence family, as defined by ISFinder, encodes several apparently unrelated families of transposases. Members of this family resemble the transposases of ISNCY family elements such as IS1202, ISTde1, ISKpn21, and ISCARN1. 367
5546 411217 NF033595 denti_PrtP dentilisin complex serine proteinase subunit PrtP. PrtP, a chymotrypsin-like protease known as dentilisin, forms a complex with PrcB and PrcA. It is found in Treponema denticola and in numerous other Treponema species. Dentilisin from T. denticola plays a significant role in pathogen-host interactions in periodontal disease. 608
5547 411218 NF033596 denti_PrcB dentilisin complex subunit PrcB. 174
5548 411219 NF033597 denti_PrcA dentilisin complex subunit PrcA. PrcA is a lipoprotein that, together with PrcB and the serine proteinase subunit PrtP, forms a chymotrypsin-like surface complex that is also known as dentilisin, after its discovery and characterization in Treponema denticola. Dentilisin is an important virulence factor in periodontal disease. 611
5549 411220 NF033598 elast_bind_EbpS elastin-binding protein EbpS. The elastin-binding protein EbpS is an adhesin described in Staphylococcus aureus, with orthologs found in many additional staphylococcal species. EbpS is a membrane protein that lacks an N-terminal signal peptide region, has extensive regions of low-complexity sequence rich in Asn and Gln, and has a C-terminal LysM domain. 466
5550 411221 NF033599 His_racem_CntK histidine racemase CntK. CntK (cobalt and nickel transport system protein K) is a histidine racemase that performs the first step in the biosynthesis of staphylopine, a metallophore involved in the import of multiple divalent cations. It was first characterized in Staphylococcus aureus. 271
5551 411222 NF033600 staphylopine_DH staphylopine biosynthesis dehydrogenase. 424
5552 411223 NF033601 Sta_opine_CntL staphylopine biosynthesis enzyme CntL. CntL (cobalt and nickel transporter L) is an enzyme involved in biosynthesis of staphylopine, a metallophore involved in the import of zinc, cobalt, nickel, and other divalent cations. CntL transfers aminobutyrate from S-adenosylmethionine, and is sometimes misannotated as a SAM-dependent methyltransferase. The staphylopine biosynthesis pathway was first characterized in Staphylococcus aureus. 255
5553 411224 NF033602 campy_sm_acidic highly acidic protein. This highly acidic protein, usually between 50 and 55 amino acids long, is found so far in Campylobacter jejuni and Campylobacter coli. A reanalysis of proteomics data, performed by Dr. Samuel Payne of Pacific Northwest National Labs, shows strong evidence for expression of AIW09513.1, founding member of the family. 50
5554 411225 NF033603 mini-MOMP_1 mini-MOMP protein. Mini-MOMP proteins, found in several species of Campylobacter, are small proteins (about 63 amino acids long before removal of the signal peptide) with strong homology to the N-terminal region of MOMP, the major outer membrane protein that is Campylobacter's major porin. 63
5555 411226 NF033604 epsi_CJH_07325 CJH_07325 family protein. Members of the CJH_07325 family are small proteins, shorter than 70 amino acids, expressed in members of the genera Campylobacter and Helicobacter. The function is unknown. 56
5556 411227 NF033605 Zn_bnd_ABC_AdcA zinc ABC transporter substrate-binding lipoprotein AdcA. 516
5557 411228 NF033606 heat_AAA_ClpK heat shock survival AAA family ATPase ClpK. ClpK, a Clp family AAA ATPase, was discovered as a plasmid-encoded determinant for survival of heat shock along with other putative heat shock proteins. ClpK requires the presence of ClpP to confer heat resistance. ClpK is about 65% identical to ClpG. Note that PMID:26974352 and PMID:29263094 discuss both ClpG itself and a member of this family (ClpK) that they call ClpG-GI. 949
5558 411229 NF033607 disagg_AAA_ClpG AAA family protein disaggregase ClpG. ClpG, as characterized in Pseudomonas aeruginosa, is a Clp family member of the AAA+ family of ATPases. ClpG has stand-alone ability to disaggregate proteins from aggregates that result from heat stress. Both ClpG and its mobilized homolog ClpK provide increased survival of exposure to heat. 932
5559 411230 NF033608 type_I_tox_Fst type I toxin-antitoxin system Fst family toxin. This model represents an expansion of Pfam model PF13955, for type I toxin-antitoxin system Fst family toxins, with increased sensitivity to pick up a toxin from Streptococcus mutans described in PMID:23326602. 28
5560 411231 NF033609 MSCRAMM_ClfA MSCRAMM family adhesin clumping factor ClfA. Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif. 934
5561 411232 NF033610 SLATT_3 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. The SLATT domain defined here (170 residues long) is similar to the DUF4231 domain (105 residues long) described in Pfam model PF14015. 164
5562 411233 NF033611 SAVED SAVED domain. The SAVED domain is predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers. 260
5563 411234 NF033615 CDF_MamM magnetosome biogenesis CDF transporter MamM. 290
5564 411235 NF033616 CDF_MamB magnetosome biogenesis CDF transporter MamB. 284
5565 411236 NF033617 RND_permease_2 multidrug efflux RND transporter permease subunit. 1009
5566 411237 NF033618 mlaB_1 lipid asymmetry maintenance protein MlaB. MlaB belongs to a system that maintains asymmetry in the outer membrane of Gram-negative bacteria, with LPS in the outer leaflet and phospholipids in the inner leaflet. Several components of the system share homology with typical ABC transporters. 94
5567 411238 NF033619 perm_MlaE_1 lipid asymmetry maintenance ABC transporter permease subunit MlaE. 253
5568 411239 NF033620 pqiC membrane integrity-associated transporter subunit PqiC. PqiC (YmbA), a lipoprotein, has been identified as part of the PqiABC system, a transporter that bridges the inner and outer membranes in species such as Escherichia coli and is important to membrane integrity. 185
5569 411240 NF033621 de_GSH_amidase deaminated glutathione amidase. 260
5570 411241 NF033622 repair_DdrC DNA damage response protein DdrC. DdrC is a DNA-binding protein that seems restricted to the genus Deinococcus, and that plays a role in the ability of members of that genus to recover from fragmentation of their DNA. Note that the region where DdrC is found in Deinococcus radiodurans R1 originally had incorrect structural annotation, with a feature designated DR_0003 shown on the opposite strand. 223
5571 411242 NF033623 urate_HpxO FAD-dependent urate hydroxylase HpxO. HpxO is an FAD-dependent urate hydroxylase (EC 1.14.13.113). Like the factor independent urate hydroxylase (EC 1.7.3.3), it consumes O2 and converts urate to 5-hydroxyisourate, which decomposes spontaneously to allantoin and CO2. However, HpxO oxidizes NADH to NAD(+), and produces H2O, while EC 1.7.3.3 produces H2O2 as a byproduct. 382
5572 411243 NF033624 HpxX oxalurate catabolism protein HpxX. HpxX is a small protein of unknown function, about 60 residues in length, encoded in the set of four genes, hpxWXYZ, that belong to the oxalurate metabolism portion of a complete pathway for hypoxanthine (hpx) utilization, as in Klebsiella pneumoniae. 55
5573 411244 NF033625 HpxZ oxalurate catabolism protein HpxZ. HpxZ is not characterized, but it is encoded in the cluster hpxWXYZ, associated with oxalurate catabolism, within the larger hpx (hypoxanthine utilization) locus of species such as Klebsiella pneumoniae, where KPN_01771 is HpxZ. 122
5574 411245 NF033628 snapalysin snapalysin. Snapalysin (SnpA, or Small Neutral Protease A) belongs to the metzincin family of zinc-dependent metalloendopeptidases. 207
5575 411246 NF033629 RiPP_CPAC RiPP peptide. 52
5576 411247 NF033630 SLATT_6 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems. 179
5577 411248 NF033631 SLATT_5 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential. 182
5578 411249 NF033632 SLATT_4 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems. 154
5579 411250 NF033633 SLATT_2 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations. 181
5580 411251 NF033634 SLATT_1 SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain. 135
5581 411252 NF033635 SLATT_fungal SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict. 125
5582 411253 NF033638 RNase_AS polyadenylate-specific 3'-exoribonuclease AS. RNase AS is a 3'-exoribonuclease, found in Mycobacterium tuberculosis and other Actinobacteria, that acts specifically to degrade polyadenylate sequences from the 3'-end of RNA. 155
5583 411254 NF033640 N_Twi_rSAM twitch domain-containing radical SAM protein. Members of this family are unusual among radical SAM proteins in several ways. First, the N-terminal region consists of an iron-sulfur cluster-binding twitch domain (half of a SPASM domain), something usually found C-terminal to the radical SAM domain. Second, the radical SAM domains in many of the members of this family score poorly vs. the Pfam HMM, PF04055 (version 19), used to identify radical SAM. Lastly, the majority of members sequenced to date come from uncultured bacteria from marine or aquifer sources rather than from conventionally cultured bacterial isolates. The function is unknown. 396
5584 411255 NF033641 antiterm_LoaP antiterminator LoaP. LoaP is a paralog of NusG with an extensive presence in Firmicutes. The founding member, from Bacillus amyloliquefaciens, was shown to serve as an antiterminator for the transcription of genes involved in antibiotic biosynthesis. 166
5585 411256 NF033642 stress_AzuC stress response protein AzuC. AzuC is a basic, extremely small protein (28 amino acids in Escherichia coli K-12) whose expression is repressed by cyclic AMP response protein (CRP) and stimulated by acidic pH. 26
5586 411257 NF033644 antiterm_UpxY UpxY family transcription antiterminator. The UpxY family of NusG-related transcription antiterminators was described originally from a paralogous family of eight members from Bacteroides fragilis, UpaY to UphY, each of which was associated with a distinct capsular polysaccharide biosynthesis locus. There is no UpxY protein per se. 162
5587 411258 NF033645 pilus_FilE putative pilus assembly protein FilE. FilE is found almost exclusively in the genus Acinetobacter, and is assigned as a putative pilus system protein from local genomic contexts that include several additional putative pilus system proteins. Note that some members of this protein family have proline-rich repeat regions for which spurious translation in another reading frame can give a false-positive match to Pfam's collagen repeat region HMM, PF01391. 411
5588 411259 NF033647 adhesin_LEA LEA family epithelial adhesin N-terminal domain. LEA (Lactobacillus epithelium adhesin), as characterized in an adhesive commensal strain of Lactobacillus crispatus (ST1), is a large, repetitive protein with an N-terminal YSIRK-type signal peptide and a C-terminal LPXTG site for processing by sortase and attachment to the cell surface. Family members contain variable numbers of 82-amino-acid-long repeats similar to Lactobacillus Rib/alpha-like repeats. This HMM describes the N-terminal region upstream of the repeat region, just over 600 amino acids long. 655
5589 411260 NF033649 LipDrop_Rv1109c lipid droplet-associated protein. RHA1_ro05869 from Rhodococcus jostii RHA1, an ortholog of Rv1109c from Mycobacterium tuberculosis, has been shown to associate specifically with lipid droplets that consist of a neutral lipid core enveloped by a phospholipid monolayer and surface proteins. Lipid droplets of triacylglycerol can be especially prominent in members of the genus Rhodococcus, but occur also in Mycobacterium tuberculosis and can support dormancy of that pathogen. 199
5590 411261 NF033650 ANR_neg_reg ANR family transcriptional regulator. ANR family transcriptional regulators include the AggR-activated regulator Aar. The name ANR (AraC Negative Regulators) refers to an effect on, rather than homology to, certain AraC family transcriptional regulators. 54
5591 411262 NF033652 LbtU_sider_porin LbtU family siderophore porin. LbtU, from Legionella pneumophila, a novel TonB-independent siderophore uptake outer membrane protein from a species that lacks TonB, is the founding member of a class of porins that may be involved generally in siderophore-mediated iron acquisition. 322
5592 411263 NF033656 DMQ_monoox_COQ7 2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinone monooxygenase. 205
5593 411264 NF033657 choice_anch_F choice-of-anchor F family protein. Choice-of-anchor F is a domain found in prokaryotic proteins with a variety of C-terminal sorting and transit domains. These include the autotransporter outer membrane beta-barrel domain, the JDVT-CTERM domain, and variant forms of PEP-CTERM domains. 296
5594 411265 NF033662 acid_disulf_rpt acidic double-disulfide repeat. The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids. 32
5595 411266 NF033663 AceI_fam_PACE AceI family chlorhexidine efflux PACE transporter. The AceI family of proton-coupled multidrug/biocide efflux transporters includes several shown to respond to the presence of the biocide chlorhexidine and improve resistance to it, among other biocides and antibiotics. Members of the family seen so far, including founding member AceI, have been chromosomal and restricted to the genus Acinetobacter. 136
5596 411267 NF033664 PACE_transport PACE efflux transporter. PACE transporters, including the chlorhexidine efflux transporter AceI, average about 140 amino acids in length and consist of two tandem homologous domains described by Pfam model PF05232. PACE transporters are single component efflux transporters that sit in the plasma membrane and couple proton import to substrate export. 130
5597 411268 NF033665 PACE_efflu_PCE multidrug/biocide efflux PACE transporter. PACE (proteobacterial antimicrobial compound efflux) transporters are single component proton-coupled efflux pumps that help confer resistance to a number of biocides and antibiotics. The family has also been named PCE (proteobacterial chlorhexidine efflux). Members of this subfamily of the PACE transporters, distinct from the AceI-like branch, include several whose expression is increased by exposure to chlorhexidine and/or help confer increased resistance to it. 130
5598 411269 NF033668 rSAM_PA0069 PA0069 family radical SAM protein. PA0069 from Pseudomonas aeruginosa is the founding member of a family of radical SAM enzymes of unknown function. Note that inclusion of some members of this family in COG1533, along with some spore photoproduct lyase (SPL) proteins, has led to some family members being annotated as SPL. 348
5599 411270 NF033672 mbn_chaper_assoc copper uptake system-associated domain. Proteins from this family may contain just an N-terminal signal peptide followed by this domain, or have a copper chaperone domain in addition, just past the signal peptide. A majority of bacteria that encode a peptide-derived methanobactin precursor encode a member of this family as well, strongly suggesting a role for this domain in copper acquisition. 104
5600 411271 NF033674 stress_OB_fold NirD/YgiW/YdeI family stress tolerance protein. Members of this family possess an N-terminal signal peptide, and are associated with tolerance to various toxic stresses. These include antimicrobial peptides (YdeI), hydrogen peroxide (YgiW and YdeI), and nickel (NcrY and NirD). 110
5601 411272 NF033675 NTTRR-F1 NTTRR-F1 domain. NTTRR-F1 (N-terminal To Repetitive Region - Firmicutes 1) is a homology domain found strictly as the N-terminal non-repetitive region of otherwise highly repetitive proteins of various Firmicutes. The repetitive region that follows typically is collagen-like, with every third residue a glycine. 155
5602 411273 NF033676 Lacb_SerRich_Nt serine-rich glycoprotein adhesin prefix domain. Lacb_SerRich_Nt describes a Lactobacillus-restricted N-terminal non-repetitive sequence region shared by proteins with extensive serine-rich repeat regions, all likely to function as adhesins. This region contains a variant form of the KxYKxGKxW motif (see TIGR03715) followed by a region related to serine-rich glycoprotein adhesins of the Streptococci. 80
5603 411274 NF033677 biofilm_BapA_N BapA prefix-like domain. Two largely unrelated repetitive proteins, both named biofilm-associated protein BapA (from Salmonella enterica and from Paracoccus denitrificans) share homology domains at the two ends. Both lack a typical signal peptide for translocation by Sec, and instead depend on type I secretion for export and for contribution to biofilm formation. The conserved prefix (i.e. N-terminal) domain is shared by a number of other large, repetitive proteins of Proteobacteria thought to be associated with adhesion or biofilm formation. 64
5604 411275 NF033678 C69_fam_dipept C69 family dipeptidase. Members of the MEROPS C69 family (subfamily 001) are dipeptidases (EC 3.4.13.-). 463
5605 411276 NF033679 DNRLRE_dom DNRLRE domain. The DNRLRE domain, with a length of about 160 amino acids, appears typically in large, repetitive surface proteins of bacteria and archaea, sometimes repeated several times. It occurs, notably, three times in the C-terminal region of the enzyme disaggregatase from the archaeal species Methanosarcina mazei, each time with the motif DNRLRE, for which the domain is named. Archaeal proteins within this family are described particularly well by the currently more narrowly defined Pfam model, PF06848. Note that the catalytic region of disaggregatase, in the N-terminal portion of the protein, is modeled by a different HMM, PF08480. 164
5606 411277 NF033680 exonuc_ExeM-GG extracellular exonuclease ExeM. ExeM, as described in Shewanella oneidensis, is a biofilm formation-associated exonuclease that cleaves extracellular DNA (eDNA), a biofilm component. Members of the ExeM family contain two or three pairs of Cys residues, presumed to form disulfide bonds, and a C-terminal GlyGly-CTERM membrane-anchoring segment. Strangely, engineered removal of the GlyGly-CTERM region did not result in net export from the cell and appearance of the enzyme in culture supernatants. 883
5607 411278 NF033681 ExeM_NucH_DNase ExeM/NucH family extracellular endonuclease. 545
5608 411279 NF033682 retention_LapA retention module-containing protein. The retention module, as described for the giant adhesin LapA of Pseudomonas fluorescens and for an ice-binding giant adhesin of an Antarctic bacterium, appears at the N-terminus of a number of very large repetitive proteins, many of which have C-terminal regions that make them substrates for type I secretion systems. 145
5609 411280 NF033683 di_4Fe-4S_YfhL YfhL family 4Fe-4S dicluster ferredoxin. 79
5610 411281 NF033684 suffix_2_RND transporter suffix domain. Members of this protein family contain a highly hydrophobic region about 70 amino acids long that usually occurs as essentially the full length of a small membrane protein, but in some cases occurs as a C-terminal suffix domain for RND efflux transporter permease subunit proteins. 69
5611 411282 NF033685 Tet_leader_L tetracycline resistance efflux system leader peptide. Stalling of the ribosomal translation of a tetracycline resistance system mRNA from inability to properly translate a short leader peptide can affect mRNA secondary structure, preventing early transcriptional termination and therefore inducing expression of the resistance protein. This leader peptide is found upstream of efflux transporters such as tet(L) and tet(45) (which occur in Gram-positive organisms), but on occasion may occur as a fused additional N-terminal domain of such efflux proteins. 20
5612 411283 NF033686 leader_PheM_1 pheST operon leader peptide PheM. 14
5613 411284 NF033688 MG406_fam MG406 family protein. Homologs to MG406 from Mycoplasma genitalium and MPN605 from Mycoplasma pneumoniae are about 150 amino acids long on average, highly hydrophobic, widespread in but restricted to the Mollicutes, and highly divergent there. MG406 itself appears to be an essential gene. 117
5614 411285 NF033689 N2Fix_CO_CowN N(2)-fixation sustaining protein CowN. Carbon monoxide inhibits the ability of Mo-nitrogenase to fix nitrogen, but expression of CowN is protective, and allows nitrogen fixation to continue. 89
5615 411286 NF033690 ErmCL_fam_lead ErmCL family antibiotic resistance leader peptide. 19
5616 411287 NF033691 immunity_MafI MafI family immunity protein. MafI proteins, as described in Neisseria species, are small proteins encoded in modules with MafB (multiple adhesin family B) secreted toxins. Homologs are found broadly, primarily in Proteobacterial genera such as Neisseria, Cronobacter, Enterobacter, Pseudomonas, etc. 67
5617 411288 NF033694 perox_inhi_SPIN SPIN family peroxidase inhibitor. SPIN (staphylococcal peroxidase inhibitor) binds to and inhibits human myeloperoxidase to resist killing after phagocytosis by neutrophils. 99
5618 411289 NF033696 CrpP_fam CrpP family protein. CrpP, described originally as a protein encoded by Pseudomonas aeruginosa plasmid pUM505, confers elevated resistance to ciprofloxacin, but not to four other fluoroquinolone-type antibiotics tested. The apparent mechanism is phosphorylation. However, CrpP appears more closely related to the ribosome modulation factor Rmf than to any known aminoglycoside phosphotransferase. 58
5619 411290 NF033697 leader_RseD rpoE leader peptide RseD. RseD, as described originally in Escherichia coli, is a leader peptide translationally coupled to the extracytoplasmic stress response sigma factor RpoE. It participates in the CsrA-mediated fine-tuning of the stress response. It is found also in Salmonella enterica, Cedecea neteri, etc. The corresponding locus in Klebsiella pneumoniae appears to be split into an upstream and a downstream peptide. 31
5620 411291 NF033701 yciY_fam YciY family protein. Members of the YciY family are named after the gene symbol given in E. coli K-12, but members of the family are found also in Salmonella, Klebsiella, Yersinia, Erwinia, etc. 56
5621 411292 NF033703 transcr_KstR cholesterol catabolism transcriptional regulator KstR. KstR, a protein characterized in Mycobacterium tuberculosis (MTB) and M. smegmatis, is a TetR family transcriptional regulator that is essential for pathogenesis in MTB. It controls the expression of about 80 proteins involved in the earlier stages of cholesterol catabolism. KstR binds not to cholesterol itself, but to catabolites found early in the degradation pathway. 185
5622 411293 NF033704 helico_Hpn_like nickel-binding protein HpnL. The Hpn-like protein HpnL, or Hpn-2, including founding member HP1432, is a histidine and glutamine-rich metal-binding polypeptide that binds nickel, among other metals, and may help deliver sequestered nickel for the biosynthesis of urease, which is critical to the survival of Helicobacter pylori during exposures to strongly acidic conditions. This protein is a paralog to the histidine-rich nickel storage protein Hpn (e.g. HP1427). Both nickel-binding proteins are expressed in response to nickel, under control of the regulator NikR. 71
5623 411294 NF033705 helico_Hpn nickel storage protein Hpn. Hpn (Helicobacter pylori nickel) is a histidine-rich polypeptide, 60 amino acids in length, capable of binding nickel and several other metals. It can store a reserve of nickel for biosynthesis of active urease, critical to the ability of the bacterium to survive acid exposure during colonization, and hydrogenase. Hpn is closely related in its N-terminal region to a somewhat longer His and Gln-rich protein, called the Hpn-like protein. Both are expressed, under control of NikR, in response to nickel. 60
5624 411295 NF033706 Ni_bind_SCO4226 SCO4226 family nickel-binding protein. Members of the SCO4226 family belong to the larger family of DUF4242 domain-containing proteins, described by Pfam model PF14026. SCO4226 itself was shown to dimerize and bind four nickel atoms per homodimer. 82
5625 411296 NF033707 T9SS_sortase type IX secretion system sortase PorU. PorU, part of type IX secretion systems (T9SS), is the protease responsible for both removing the C-terminal sorting signal found in substrates and for its replacement by anionic LPS, through which most T9SS substrates become attached to the cell surface after secretion. 1056
5626 411297 NF033708 T9SS_Cterm_ChiA T9SS sorting signal type C. The sorting signals of type IX secretion systems (T9SS) in the CFB bacteria are long, compared to other prokaryotic C-terminal sorting motif-containing signals, including LPXTG, PEP-CTERM, and GlyGly-CTERM, and they seem to contain multiple motifs. A few T9SS substrates, including ChiA, have a variant form of T9SS sorting signal that may score poorly to both TIGR04183 (type A) and TIGR04131 (type B), depend on T9SS for secretion, but are released from the cell rather than left anchored to the cell surface. 55
5627 411298 NF033709 PorV_fam PorV/PorQ family protein. Proteins closely related to PorV, and its paralog PorQ, are found regularly in species with type IX secretion systems (T9SS), the system associated with a type of gliding motility in many of the Bacteroidetes. 327
5628 411299 NF033710 T9SS_OM_PorV type IX secretion system outer membrane channel protein PorV. PorV, as characterized in oral pathogen Porphyromonas gingivalis, is a component of the type IX secretion system (T9SS) needed to process a subset of T9SS substrates. PorV is a paralog of PorQ. 368
5629 411300 NF033711 T9SS_PorQ type IX secretion system protein PorQ. 330
5630 411301 NF033712 B12_rSAM_KedN5 KedN5 family methylcobalamin-dependent radical SAM C-methyltransferase. KedN5, the founding member of a family of radical SAM enzymes with an N-terminal B12-binding domain, is a C-methyltransferase that relies on a methylcobalamin cofactor during natural product biosynthesis. 624
5631 411302 NF033713 DbpA DbpA. 166
5632 411303 NF033715 glycyl_HPDL_Lrg 4-hydroxyphenylacetate decarboxylase large subunit. 4-hydroxyphenylacetate decarboxylase, an enzyme with a glycyl radical active site, depends on a radical SAM enzyme for activation, and is found in strict anaerobes such as Clostridium difficile. It has a large and a small subunit. 901
5633 411304 NF033716 glycyl_HPDL_Sma 4-hydroxyphenylacetate decarboxylase small subunit. 4-hydroxyphenylacetate decarboxylase, an enzyme with a glycyl radical active site, depends on a radical SAM enzyme for activation, and is found in strict anaerobes such as Clostridium difficile. It has a large and a small subunit. 79
5634 411305 NF033717 HPDL_rSAM_activ 4-hydroxyphenylacetate decarboxylase activase. 4-hydroxyphenylacetate decarboxylase activase is a radical SAM enzyme, found in anaerobic bacteria where 4-hydroxyphenylacetate decarboxylase occurs and required to prepare the glycyl radical active site of the enzyme. 311
5635 411306 NF033718 indole_decarb indoleacetate decarboxylase. Indoleacetate decarboxylase is a single subunit glycyl radical enzyme that depends on a cognate radical SAM enzyme for its activation. It performs the final step in the anaerobic fermentation of tryptophan to skatole, a malodorous volatile compound. 868
5636 411307 NF033719 ind_deCO2_activ indoleacetate decarboxylase activase. 302
5637 411308 NF033720 DbpB decorin-binding protein DbpB. 182
5638 411309 NF033721 P12_lipo P12 family lipoprotein. 287
5639 411310 NF033723 S2_P23 S2/P23 family protein. 179
5640 411311 NF033724 P13_porin P13 family porin. 178
5641 411312 NF033725 borfam_49 chromosome replication/partitioning protein. 150
5642 411313 NF033726 borfam52 P52 family lipoprotein. 171
5643 411314 NF033728 borfam54_1 complement regulator-acquiring protein. 319
5644 411315 NF033729 borfam54_2 complement regulator-acquiring protein. 219
5645 411316 NF033730 borfam54_3 complement regulator-acquiring protein. 292
5646 411317 NF033731 borfam63 fibronectin-binding protein RevA. 160
5647 411318 NF033732 borfam95 exported protein A EppA. 176
5648 411319 NF033733 MFS_ArsK arsenite efflux MFS transporter ArsK. ArsK, a major facilitator superfamily (MFS) transporter, was shown in Agrobacterium tumefaciens to be induced by arsenite and antimonite, to reduce their accumulation, and to confer resistance when expressed heterologously. 388
5649 411320 NF033734 MFS_ArsJ organoarsenical efflux MFS transporter ArsJ. ArsJ regularly is encoded next to a glyceraldehyde-3-phosphate dehydrogenase that can synthesize 1-arseno-3-phosphoglycerate in the presence of arsenate, and appears to provide arsenate resistance by exporting that organoarsenical compound before it spontaneously dissociates into arsenate and 3-phosphoglycerate. 392
5650 411321 NF033735 G3PDH_Arsen ArsJ-associated glyceraldehyde-3-phosphate dehydrogenase. 324
5651 411322 NF033737 Amm_Lyn_leader ammosamide/lymphostin biosynthesis leader domain. Precursors of the ammosamide (Amm6) and lymphostin family of natural products share both a strongly conserved leader peptide region that interacts with natural product biosynthesis enzymes, and a critical Trp residue at or near the C-terminus that is incorporated into the small molecule natural product eventually produced - a pyrroloquinoline alkaloid whose core is derived from tryptophan. An additional, shorter homolog, Xan, encoded in a less well understood natural product biosynthesis locus, lacks the critical Trp. It is thought to be a trans-acting leader peptide that does not need to be linked covalently to a core peptide, if that is the substrate of the natural product biosynthesis enzymes, in order to enable them to perform their modifications. 27
5652 411323 NF033738 microvirid_RiPP microviridin/marinostatin family tricyclic proteinase inhibitor. Members of the microviridin/marinostatin family are ribosomally translated peptides whose post-translational processing converts them into tricyclic depsipeptides that serve as serine proteinase inhibitors. A single precursor usually has one core peptide region near the C-terminus, with a nearly invariant TxKYPSD motif, but may instead have two or three repeats of the core region. 47
5653 411324 NF033739 intramemb_PrsW intramembrane metalloprotease PrsW. PrsW, an intramembrane protease, cleaves the anti-sigma factor RsiW, which regulates the activity of the ECF-type sigma factor SigW. 209
5654 411325 NF033740 MarP_fam_protase MarP family serine protease. The founding member of this family of membrane-spanning serine proteases, which is restricted to Actinobacteria, is the acid resistance periplasmic serine protease MarP of Mycobacterium tuberculosis. Recent work shows that MarP is required to cleave and activate the peptidoglycan hydrolase RipA, and loss of RipA activity creates a defect in progeny separation during cell division. Therefore, the requirement for MarP in order to survive acidic conditions may be a consequence of peptidoglycan hydrolysis requirements, explaining why MarP family members are distributed more broadly in the Actinobacteria than the subset of species capable of surviving intracellularly as pathogens. 390
5655 411326 NF033741 NlpC_p60_RipA NlpC/P60 family peptidoglycan endopeptidase RipA. 457
5656 411327 NF033742 NlpC_p60_RipB NlpC/P60 family peptidoglycan endopeptidase RipB. 206
5657 411328 NF033743 NlpC_inact_RipD NlpC/P60 family peptidoglycan-binding protein RipD. RipD proteins, such as founding member Rv1566c from Mycobacterium tuberculosis, are catalytically inactive paralogs of the peptidoglycan endopeptidases RipA and RipB. A catalytically important Cys and His pair is replaced by Ala-83 and Ser-132. 177
5658 411329 NF033745 class_C_sortase class C sortase. 218
5659 411330 NF033746 class_D_sortase class D sortase. 135
5660 411331 NF033747 class_E_sortase class E sortase. 211
5661 411332 NF033748 class_F_sortase class F sortase. 155
5662 411333 NF033749 bact_hemeryth bacteriohemerythrin. Bacteriohemerythrin, an O2-carrying protein that lacks a heme moiety, is named based on its homology to eukaryotic proteins such as myohemerythrin. 129
5663 411334 NF033750 vWF_bind_Staph von Willebrand factor binding protein Vwb. The von Willebrand factor binding protein Vwb, like its paralog staphylocoagulase, is a coagulase and a virulence factor. It induces clotting, not by being an enzyme, but by activating prothrombin to generate fibrin. 510
5664 411335 NF033751 pallilysin_like pallilysin-related adhesin. In contrast to pallilysin itself (a bifunctional adhesin and protease), members of the pallilysin-related adhesin family average twice the length, lack the HEXXH motif essential to pallilysin's metalloprotease activity, and are likely to function in virulence only as an adhesin. Typical members of this family include TDE0840 from Treponema denticola and BB0038 from Borrelia burgdorferi, which share less than 20% pairwise amino acid sequence identity. 385
5665 411336 NF033752 linaridin_CypA cypemycin family RiPP. The cypemycin precursor CypA belongs to the linaridin class (linear "arid" peptide, following dehydration modifications) of RiPP natural product precursors. The signature terminal motif CL[VI]C is modified by decarboxylation of the C-terminal Cys residue, followed by cyclization. 52
5666 411337 NF033753 RiPP_decarbCypD CypD family RiPP peptide-cysteine decarboxylase. CypD, a Cys decarboxylase flavoprotein, oxidatively removes the carboxyl moiety from the C-terminal Cys residue of CypA, the precursor of the RiPP natural product cypemycin. 182
5667 411338 NF033754 gliding_CglC adventurous gliding motility lipoprotein CglC. CglC (cell contact-dependent gliding, or conditional gliding, motility protein C), also called adventurous gliding motility protein AgmO, is found in delta-proteobacterial species that exhibit a taxonomically restricted form of gliding motility. 156
5668 411339 NF033755 gliding_CglE adventurous gliding motility protein CglE. 172
5669 411340 NF033756 gliding_GltC adventurous gliding motility protein GltC. GltC is a soluble periplasmic protein required for a type of gliding motility found in certain social delta-proteobacteria, including the model species Myxococcus xanthus. 621
5670 411341 NF033757 gliding_CglB adventurous gliding motility lipoprotein CglB. CglB is an outer membrane lipoprotein required for A-motility in Myxococcus xanthus and other delta-proteobacteria. It is transferable between cells that have a compatible TraAB system, and was therefore named conditional (or cell contact-dependent) gliding motility protein B, or CglB. 403
5671 411342 NF033758 gliding_GltE adventurous gliding motility TPR repeat lipoprotein GltE. GltE (also called AglT) is a tetratricopeptide repeat protein with a lipoprotein signal peptide and a role in A-motility (adventurous gliding motility) in Myxococcus xanthus and other delta-proteobacteria. 411
5672 411343 NF033759 exchanger_TraA outer membrane exchange protein TraA. TraA, together with its partner TraB, mediates a large scale exchange of outer membrane lipoproteins, and lipids, between closely related strains or clonally identical cells, in certain delta-proteobacterial species such as Myxococcus xanthus. The exchange mechanism is likely to involve fusion of outer membrane, probably done to coordinate the social behaviors these bacteria display. 662
5673 411344 NF033760 gliding_GltG adventurous gliding motility protein GltG. GltG proteins, including the founding member MXAN_4867 from Myxococcus xanthus, occur in certain delta-proteobacteria and are involved in adventurous gliding (A-)motility. GltG has an N-terminal forkhead-associated (FHA) domain, often associated with signal transduction. 647
5674 411345 NF033761 gliding_GltJ adventurous gliding motility protein GltJ. Adventurous gliding motility protein GltJ, also known as AgmX, occurs in delta-proteobacteria such as Myxococcus xanthus. 671
5675 411346 NF033762 social_mot_Tgl social motility TPR repeat lipoprotein Tgl. Social motility in delta-proteobacterial species such as Myxococcus xanthus depends on a type IV pilus, which in turn depends on assembly of the PilQ secretin complex. Tgl, a tetratricopeptide repeat (TPR) outer membrane lipoprotein, is required for PilQ assembly. 252
5676 411347 NF033763 exchanger_TraB outer membrane exchange protein TraB. TraB, as described originally in the delta-proteobacterium, is a protein with a C-terminal OmpA-like domain, and is encoded in an operon with TraA. Together TraAB make it possible for bacterial cells with close enough kinship to exchange outer membrane lipoproteins, such that certain motility defects in mutant cells can be corrected through the exchange. This exchange most likely involves membrane fusion events, as large amounts of lipid are also exchanged, and it is restricted by a bi-directional kin recognition requirement. Among wild-type cells, these exchanges likely help coordinate various social behaviors. 522
5677 411348 NF033764 gliding_CglF adventurous gliding motility protein CglF. CglF, as originally described in Myxococcus xanthus, is a gliding motility protein. It has the property that motility in a cglF loss mutant can be restored by close contact with compatible kin strains where cglF is wild type. The restoration of motility depends on bidirectional kinship recognition, which is mediated by TraAB and leads to exchange of outer membrane proteins and lipids. This cell contact requirement leads to the gene symbol, cglF, meaning conditional (or, cell contact-dependent) gliding F. This protein has also been called GltF. 87
5678 411349 NF033765 gliding_CglD adventurous gliding motility lipoprotein CglD. 155
5679 411350 NF033766 choice_anch_G choice-of-anchor G family protein. Choice-of-anchor proteins belong to homology families in which various branches carry C-terminal sorting signals known, or suspected, to be processed by transpeptidases such as sortase, exosortase, archaeosortase, rhombosortase, or the type 9 secretion system sorting enzyme. Members of this family, called choice-of-anchor G, include sortase and exosortase targets, and are likely to be found on the cell surface. 282
5680 411351 NF033767 exosort_XrtS exosortase S. Members of the exosortase S family occur in the high GC Gram-positive order Micrococcales (a branch of the Actinobacteria), in genera such as Arthrobacter, Microbacterium, Curtobacterium, and Paenarthrobacter. 155
5681 411352 NF033768 myxo_SS_tail AgmX/PglI C-terminal domain. The myxo_SS_tail domain occurs as the C-terminal domain in multiple proteins per genome for a number of species capable of surface gliding motility, e.g. 12 in Myxococcus xanthus. Member proteins include the adventurous gliding motility proteins AgmX (GltJ) and PglI in M. xanthus. The domain is about 92 amino acids long, and features a pair of Cys residues about 45 amino acids apart in almost all cases. 92
5682 411353 NF033769 after_VWA_1 after-VIT domain. The after-VIT domain is a bacterial surface protein C-terminal domain found on some proteins that have both the Vault protein Inter-alpha-Trypsin (VIT) domain and a von Willebrand factor type A domain. Note that some of after-VIT domain-containing proteins, such as members of TIGR03788, may have a known C-terminal sorting signal, such as LPXTG or PEP-CTERM, instead of the after-VIT domain. The after-VIT domain appears to be homologous to the myxo_SS_tail domain, some of whose member proteins are involved in adventurous gliding motility in Myxococcus xanthus, and it is similarly located. 90
5683 411354 NF033770 exosort_XrtT exosortase T. Exosortase T is a variant form of exosortase, typically found in Alphaproteobacteria in genera such as Pseudovibrio, Labrenzia, and Ruegeria. Members of this family may be dedicated enzymes, processing a single substrate encoded by a nearby gene, rather than processing multiple proteins like exosortases A and B. 472
5684 411355 NF033771 colonize_BriC biofilm-regulating peptide BriC. BriC (Biofilm-Regulating peptide Induced by Competence), as characterized in Streptococcus pneumoniae, is a cell-cell communication peptide, or peptide pheromone, that is induced by expression of ComE (a master regulator of competence). BriC contributes to biofilm formation, and to colonization in a mouse model. 60
5685 411356 NF033772 pheromone_VP1 peptide pheromone VP1. VP1 (virulence peptide 1), as characterized in Streptococcus pneumoniae (a.k.a. pneumococcus), is part of a large panel of secreted regulatory peptides with paralogous leader domains, typically ending with the cleavage site GlyGly, but with highly variable core regions. VP1, along with other pneumococcal peptide pheromones such as BriC, participates in cell-cell communication to regulate pathogenic processes such as biofilm formation. 65
5686 411357 NF033773 tellur_TrgA TrgA family protein. TrgA, a protein associated with tellurium resistance (but less critical to the phenotype than TrgB, encoded by the adjacent gene), is the founding member of a family of hydrophobic proteins, about 150 amino acids in length, probably embedded in the membrane, and possibly involved in transport. 145
5687 411358 NF033774 phos_trans_PitA inorganic phosphate transporter PitA. PitA is a low-affinity transporter for inorganic phosphate. It imports phosphate complexed to divalent cations such as Mg(2+) or Ca(2+). Loss of PitA function can confer reduced sensitivity to high levels of Zn(2+). The PitA of Escherichia coli has a closely related paralog, PitB, that is functional but only minimally expressed. Note that the term PitA is used broadly. This exception-level HMM identifies a rather narrow clade of PitA transporters which, however, includes both PitA and PitB of E. coli K-12. 499
5688 411359 NF033775 P_type_ZntA Zn(II)/Cd(II)/Pb(II) translocating P-type ATPase ZntA. 732
5689 411360 NF033776 stress_YhcN peroxide/acid stress response protein YhcN. 87
5690 411361 NF033777 M_group_A_cterm M protein C-terminal domain. M protein (emm) is an important virulence protein and serology-defining surface antigen of Streptococcus pyogenes (group A Streptococcus). M protein has an amino-terminal YSIRK-type signal sequence (associated with cross-wall targeting in dividing cells), and a C-terminal LPXTG domain for processing by sortase and covalent attachment to the Gram-positive cell wall. Past the signal peptide, M protein has a hypervariable region, but this HMM describes only the well-conserved region C-terminal to the hypervariable region. It discriminates M protein from two related proteins, Enn and Mrp. 218
5691 411362 NF033778 trans_TimA TIM44-related membrane protein TimA. TimA was first described in Caulobacter crescentus, a member of the alpha-proteobacterial lineage that gave rise to the mitochondrion. It is notable because of homology to Tim44, a protein involved in protein translocation in both human and yeast mitochondria, although TimA itself is not considered involved in protein translocation. TimA is found localized to the plasma membrane, on the cytosolic face. The mitochondrial homolog, Tim44, is found as a peripheral protein of the inner face of the mitochondrial inner membrane, and serves as an adaptor to recruit other subunits into the translocation complex, rather than in the transport channel itself. 198
5692 411363 NF033779 Tim44_TimA_adap Tim44/TimA family putative adaptor protein. Members of this family resemble both the eukaryotic protein Tim44, important to the assembly of a protein translocase in mitochondria, and the TimA protein of alpha-proteobacteria such as Caulobacter crescentus. TimA may assist in protein recruitment to the membrane, as Tim44, but appears not to be part of any complex associated with protein translocation. 215
5693 411364 NF033780 exosort_XrtU_C exosortase U C-terminal region. The XrtU family of exosortases is marked by a distinctive C-terminal region, modeled by the exosort_XrtU_C hidden Markov model. Because members of the archaeosortase and exosortase family perform cleavage of target proteins, in pathways thought to lead to new covalent attachment at the C-terminus and surface anchoring, the XrtU C-terminal domain may indicate the presence of a novel anchoring chemistry. 173
5694 411365 NF033782 lipoprot_Omp28 Omp28 family outer membrane lipoprotein. The Omp28 family of lipoproteins is named for a founding member described in Porphyromonas gingivalis, where it has been shown across many strains to be an expressed surface antigen. All members of the family are predicted lipoproteins. 263
5695 411366 NF033785 sulfur_OscA sulfur starvation response protein OscA. OscA (organosulfur compound A) is a small protein, about 60 amino acids in length, in the DUF2292 family. As characterized in Pseudomonas corrugata, OscA is required during sulfur starvation for obtaining it from organosulfur compounds. The pathway is required to remediate oxidative stress from chromate, so oscA was discovered by the loss of high resistance to chromate in Pseudomonas corrugata 28 when the gene is insertionally inactivated. The oscA gene tends to be found near sulfate transporter genes. 60
5696 411367 NF033787 HTH_BldC BldC family transcriptional regulator. BldC, a helix-turn-helix transcription factor with homology to the mercury resistance transcriptional regulator MerR, is a DNA-binding protein. It is considered the founding member of a subfamily of regulators with an asymmetric head-to-tail oligomerization for cooperative DNA binding, rather than classic dimerization. 49
5697 411368 NF033788 HTH_metalloreg metalloregulator ArsR/SmtB family transcription factor. Transcriptional repressors that sense toxic heavy metals such as arsenic or cadmium, and are released from DNA so that resistance factors will be expressed, include ArsR, SmtB, ZiaR, CadC, CadX, KmtR, etc. However, some members of this family, including the sporulation delaying system autorepressor SdpR and its family (see NF033789), may lack metal-binding sites and instead regulate other cellular processes. 76
5698 411369 NF033789 repress_SdpR autorepressor SdpR family transcription factor. Transcription factors in the family of the sporulation delaying system autorepressor SdpR (of Bacillus subtilis) resemble metalloregulatory transcriptional repressors such as ArsR, SmtB, CadX, ZiaR, etc., but may lack the key metal-binding residues. 79
5699 411370 NF033790 CnrY_NccY_antiS CnrY/NccY family anti-sigma factor. 95
5700 411371 NF033791 ActR_PrrA_rreg ActR/PrrA/RegA family redox response regulator transcription factor. ActR, PrrA, and RegA are examples of lineage-specific names given to a response regulator transcription factor that acts as a global regulator and belongs to a sensor-regulator pair that is highly conserved in the alpha-proteobacteria. Examples of this regulator include ActR (acid tolerance regulator) in Sinorhizobium meliloti, stationary phase response regulator SpdR in Caulobacter crescentus, 173
5701 411372 NF033792 ActS_PrrB_HisK ActS/PrrB/RegB family redox-sensitive histidine kinase. This redox-responsive histidine kinase, found in alpha-proteobacteria, shows strong sequence conservation, including the notable motif [VA]AAAAHELGTPxTI. It always acts as a partner to an ActR/PrrA/RegA family global response regulator transcription factor in a two-component sensory transduction system. Lineage-specific names and gene symbols given to this histidine kinase reflect downstream regulator changes such as entry into stationary phase, anaerobic expression of photosynthesis genes, and survival of exposure to low pH. 423
5702 411373 NF033793 peri_CopK periplasmic Cu(I)/Cu(II)-binding protein CopK. 88
5703 411374 NF033794 chaper_CopZ_Eh copper chaperone CopZ. Copper chaperone CopZ, as the name is used in Enterococcus hirae and related species, is a small copper-binding protein with close homology to domains found, sometimes in multiple copies, in various copper-translocating P-type ATPases, and to distinct families of other small copper chaperones that are also named CopZ. 68
5704 411375 NF033795 chaper_CopZ_Bs copper chaperone CopZ. This model describes CopZ, a small copper chaperone, as found in Bacillus subtilis and related species. A number of longer proteins, such as copper-translocating P-type ATPases, contain multiple CopZ-like domains, each with the signature invariant CxxC motif. CopZ from other species may be more different in sequence from this family than some of those domains of longer proteins. 66
5705 411376 NF033796 selen_YedE_FdhT selenium metabolism membrane protein YedE/FdhT. Members of this family are predicted multiple membrane-spanning proteins, and therefore thought likely to be transporters. It appears that all species whose genomes encode a member of this family produce selenocysteine-containing enzymes, typically formate dehydrogenase. The family member from Campylobacter jejuni was shown to be essential for formate dehydrogenase (a selenocysteine-containing enzyme) expression and activity, and so it was named a formate dehydrogenase accessory protein, FdhT. Note that this family is related to (but distinct from) that of TIGR04112, which is similarly restricted to species with pathways for selenium incorporation. 387
5706 411377 NF033797 phero_SHP2_SHP3 SHP2/SHP3 family peptide pheromone. 23
5707 411378 NF033798 biofilm_StcA StcA family protein. StcA (streptococcal charged A protein), as described in Streptococcus pyogenes, is a small, positively charged, secreted protein that participates in a quorum-sensing system, and promotes biofilm formation. Related proteins are found in several other Streptococcus spp. 89
5708 411379 NF033799 inhib_PhrA PhrA family phosphatase inhibitor. 44
5709 411380 NF033800 quorum_NprX quorum-signaling peptide NprX. NprX, also called NprRB, belongs to the NprR-NprX quorum-sensing system in Bacillus. The mature form of the peptide pheromone is the SKPDIVG heptapeptide. 43
5710 411381 NF033801 NprX_fam NprX family peptide pheromone. 42
5711 411382 NF033802 AimP_fam lysogeny pheromone AimP family peptide. AimP is the quorum signaling-like peptide pheromone of a phage system, called the arbitrium system, that detects environmental evidence of predecessor phage activity, in order to direct a lysis/lysogeny decision. 43
5712 411383 NF033803 TOMM_BorA TOMM family putative cytolysin BorA. The BorA family of thiazole/oxazole-modified microcin (TOMM) peptides that are putative cytolysins and virulence factors have been found, so far, encoded by plasmids in members of the genus Borreliella. 37
5713 411384 NF033804 Streccoc_I_II antigen I/II family LPXTG-anchored adhesin. Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral Streptococci, and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii, antigen I/II from S. mutans, etc. 1552
5714 411385 NF033805 invasion_CiaB invasion protein CiaB. CiaB (Campylobacter invasion antigen B) is important for host cell invasion by Campylobacter. It is found as well in a number of other species capable of invasion, including Campylobacter coli, Campylobacter rectus, and Helicobacter pullorum. 600
5715 411386 NF033806 laterosporulin laterosporulin family class IId bacteriocin. 52
5716 411387 NF033807 CopL_fam CopL family metal-binding regulatory protein. The founding member of this family was shown to be involved in the copper-responsive expression of a multicopper oxidase copA encoded downstream. The regulatory function likely involves copper-binding, but activity as a DNA-binding transcriptional regulator was not demonstrated directly. 129
5717 411388 NF033808 copper_CopD copper homeostasis membrane protein CopD. CopD is an inner membrane protein encoded in Gram-negative bacterial systems that provide resistance to excess copper. It is typically chromosomal, although the PcoD protein of the plasmid-borne pco copper resistance system is an exceptional member of the CopD family. 295
5718 411389 NF033814 copper_CopC copper homeostasis periplasmic binding protein CopC. 109
5719 411390 NF033816 Cj0069_fam Cj0069 family protein. Cj0069 from Campylobacter jejuni, described as a serological marker that can show a patient was previously infected with that organism, is the founding member of an uncommon broadly distributed family of proteins. Members of that family are found also in various Helicobacter spp., Bradyrhizobium spp., and Corynebacterium spp. 333
5720 411391 NF033817 Mplas_variab_LP variable surface lipoprotein signal domain. This HMM describes a homology domain of lipoprotein signal peptides restricted to the genus Mycoplasma, and found in paralogous families that typically are associated with antigenic phase variation, as expression of some members of the family is turned on, others turned off, by means of promoter region modifications by site-specific DNA invertases. The avg family of Mycoplasma agalactiae represents one such family of lipoproteins. 30
5721 411392 NF033819 IS66_TnpB IS66 family insertion sequence element accessory protein TnpB. The IS66 family insertion sequence element encodes a DDE transposase TnpC, and two accessory proteins, TnpA and TnpB. It has been assumed that the TnpA, TnpB, and TnpC proteins are produced independently in appropriate amounts and form a complex, which acts as a transposase to promote the transposition of an IS66 family element. 90
5722 411393 NF033820 STM0539_fam STM0539 family protein. STM0539, from Salmonella enterica strain LT2, is the founding member of a family of proteins of unknown function, about 145 amino acids in length. 141
5723 411394 NF033821 YoaK YoaK family small membrane protein. YoaK is a small protein (about 32 amino acids) found in E. coli (from which it is named), Salmonella, Klebsiella, Pantoea, and related taxa. It associates with the inner membrane. 32
5724 411395 NF033823 archmetzin archaemetzincin family Zn-dependent metalloprotease. 170
5725 411396 NF033824 Myxo_non_zincin non-proteolytic archaemetzincin-like protein. Members of this family resemble the archaemetzincins, a mostly archaeal family of Zn-dependent metalloproteases, but lack the critical catalytic glutamate (E) in the signature motif HEXXH, within the longer pattern HEXXHXXGX3CX4CXMX17CXXC that defines the archaemetzincins. Members of this family are found in Myxococcus xanthus and related members of the Delta-proteobacteria. 174
5726 411397 NF033826 immun_CdiI ribonuclease toxin immunity protein CdiI. CdiI proteins, including the founding member from Escherichia coli strain STEC_O31, serve as immunity proteins for the toxic tRNA-cleaving ribonuclease toxin CdiA. The system confers contact-dependent inhibition (cdi) between different strains of bacteria. 118
5727 411398 NF033827 CDF_efflux_DmeF CDF family Co(II)/Ni(II) efflux transporter DmeF. DmeF, a metal efflux transporter, belongs to the cation diffusion facilitator (CDF) family. Examples from different species have been described as primarily being induced by, or performing efflux of, different panels of metals, including Co(II) and Ni(II) for Rhizobium leguminosarum and Agrobacterium tumefaciens, and a broader spectrum for Wautersia metallidurans CH34. 308
5728 411399 NF033828 entry_exc2_fam Exc2 family lipoprotein. Exc2, a plasmid-encoded predicted lipoprotein of small size, was once described as a plasmid entry exclusion protein (1985). However, a more recent article (1995), says entry exclusion activity should instead be ascribed to MbeD. 131
5729 411400 NF033829 plas_excl_MbeD MbeD family mobilization/exclusion protein. MbeD, as found in the ColE1 plasmid, was originally described as a plasmid mobilization protein. Later, it was shown that MbeD additionally was responsible for a plasmid entry exclusion phenotype that had previously been ascribed to products of the exc1 and exc2 genes. 69
5730 411401 NF033830 NleE_fam_methyl NleE/OspZ family T3SS effector cysteine methyltransferase. NleE from Escherichia coli O157:H7 strain Sakai, and its homolog OspZ from Shigella, are the founding members of a family of SAM-dependent protein--cysteine methyltransferases. Both are type III secretion system (T3SS) effectors involved in virulence, and have the host protein NF-kappa-B as substrates. 212
5731 411402 NF033831 sce7725_fam sce7725 family protein. This family of uncharacterized proteins is named for founding member sce7725 from Sorangium cellulosum, from the Deltaproteobacteria. It belongs to a gene pair found sporadically in genera as diverse as Enterococcus, Lactobacillus, Staphylococcus, Streptococcus, Acinetobacter, and Klebsiella. The partner in each gene pair is a member of the sce7726 family. 311
5732 411403 NF033832 sce7726_fam sce7726 family protein. This family of uncharacterized proteins is named for founding member sce7726 from Sorangium cellulosum, from the Deltaproteobacteria. It belongs to a gene pair found sporadically in genera as diverse as Enterococcus, Lactobacillus, Staphylococcus, Streptococcus, Acinetobacter, and Klebsiella, or in phage from those lineages. The partner in each gene pair is a member of the sce7725 family. 182
5733 411404 NF033833 rhodan_ChrE rhodanese family chromate resistance protein ChrE. 108
5734 411405 NF033835 VraH_fam VraH family protein. 54
5735 411406 NF033837 GarQ_core garvicin Q family class II bacteriocin core domain. This HMM describes the core (mature) peptide region of GarQ, the class II bacteriocin garvicin (garvieacin) Q, and homologous peptides with similar core regions. Some members, such as GarQ itself, have a classical GlyGly-containing ComC/BlpC-like leader peptide, but others have N-terminal regions (probable leader peptides) of a different type. 42
5736 411407 NF033838 PspC_subgroup_1 pneumococcal surface protein PspC, choline-binding form. The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. 684
5737 411408 NF033839 PspC_subgroup_2 pneumococcal surface protein PspC, LPXTG-anchored form. The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. 557
5738 411409 NF033840 PspC_relate_1 PspC-related protein choline-binding protein 1. Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC. 648
5739 411410 NF033841 small_YshB YshB family small membrane protein. YshB, a membrane-associated protein typically 36 to 40 amino acids in length, is found conserved in genera of the gamma-proteobacteria including Cronobacter, Enterobacter, Escherichia, Klebsiella, Salmonella, and Serratia. The gene symbol derives from E. coli K-12. 40
5740 411411 NF033842 small_MgtS protein MgtS. MgtS, previously called YneM, is a small inner membrane protein that modulates Mg(2+) concentrations through its effects on the P-type transporter MgtA. 30
5741 411412 NF033843 small_YpfM protein YpfM. 19
5742 411413 NF033844 small_YqgB acid stress response protein YqgB. 40
5743 411414 NF033845 MSCRAMM_ClfB MSCRAMM family adhesin clumping factor ClfB. Clumping factor B is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif. 871
5744 411415 NF033846 Rumino_NPXTG NPXTG family C-terminal sorting domain. Rumino_NPXTG represents a flavor of C-terminal protein sorting signal in species related to Ruminococcus albus. In that lineage, multiple sortases per genome may be found, including multiple B-type sortases. Proteins found by this HMM (more than 12 encoded in a representative complete genome) may represent substrates of a panel of related sortases, while additional proteins found in Ruminococcus genomes with below-cutoff hits to this model may be processed by other sortases. 29
5745 411416 NF033847 MCP_Sipho major capsid protein, Siphoviridae type. This protein is a phage major capsid protein, as reported in primary sequence submissions of a large number of Siphoviridae, many of which have hosts in the Mycobacterium and Gordonia genera of bacteria. 549
5746 411417 NF033848 VgrG_rel VgrG-related protein. Members of this family resemble Vgr proteins of type VI secretion systems (T6SS) as found in various proteobacteria. However, members of this family occur instead in genera such as Streptomyces and Roseiflexus. The biological roles and molecular functions of proteins in this family appear not to have been characterized. 547
5747 411418 NF033849 ser_rich_anae_1 serine-rich protein. This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments. 1122
5748 411419 NF033850 LCxxNW mobility-associated LCxxNW protein. This protein belongs to a family of small proteins, about 65 amino acids long, with three invariant Cys residues, including one in the motif LCxxNW, for which the family is named. Member proteins are found in contexts that suggest they are accessory proteins of mobile elements. The context includes members of families PF01695, PF01610, and PF00665, all associated with transposition and/or integration. 61
5749 411420 NF033852 fulvocin_rel bacteriocin fulvocin C-related protein. Fulvocin C was described in 1981 as a bacteriocin from Myxococcus fulvus, 45 amino acids long with 8 cysteines. The precursor form was not described. However, the most closely related precursor-like proteins represent the founding members of a family of proteins that average over 225 amino acids in length, the majority of which have a C-terminal tail region in which the 8 Cys residues are essentially invariant. The long N-terminal region, and the sharp change in amino acid composition at the start of what appears to be the bacteriocin core peptide region, suggests that the N-terminal region may contribute directly in peptide maturation or transport, rather than merely providing for recognition by maturation proteins. 149
5750 411421 NF033853 KPN_two_small small membrane protein. Members of this family, including paralogs KPN_01023 and KPN_01923 from Klebsiella pneumoniae subsp. pneumoniae MGH 78578, are small proteins, typically 40 amino acids in length. A highly hydrophobic region, about 25 amino acids in length with a composition typical of transmembrane segments, is followed by a highly basic region of about 15 amino acids. Analogous proteins of similar size, with apparently similar hydrophobic regions, have been described as membrane-associated proteins expressed in response to various types of cell envelope stress. 37
5751 411422 NF033854 esterase_BioV pimelyl-ACP methyl ester esterase BioV. BioV, found in Helicobacter pylori and a number of related Epsilonproteobacteria, replaces BioH as a pimelyl-ACP methyl ester esterase required for biotin biosynthesis. 167
5752 411423 NF033855 tRNA_MNMC2 tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD. This HMM describes either the N-terminal region, called MNMC2, of the tRNA modification bifunctional enzyme MnmC, or a free-standing protein that performs the same methyltransferase function, in partnership with an FAD-dependent protein, or C-terminal region, called MNMC1 (see TIGR03197). 205
5753 411424 NF033856 T4SS_effec_BID T4SS effector BID domain. The BID domain (Bartonella intracellular delivery domain) is recognized by the type IV secretion system (T4SS) virB (not trw) of Bartonella and related taxa (e.g. Ochrobactrum), and is found in T4SS effector proteins such as BepA, BepB, BepC, etc. Multiple copies of the domain may be found in a single protein. 109
5754 411425 NF033857 BPSL0067_fam BPSL0067 family protein. 115
5755 411426 NF033858 ABC2_perm_RbbA ribosome-associated ATPase/putative transporter RbbA. 907
5756 411427 NF033859 SMEK_N SMEK domain. The SMEK domain is named for four genera in which multiple, diverse members of this uncommon family of bacterial proteins are found: Staphylococcus, Mycoplasma, Escherichia, and Klebsiella. Members of the family are highly variable in length. This domain occurs as the N-terminal region. The four scattered invariant residues in the seed alignment, which may provide a clue to function, are Glu, Asp, Gln, and Lys. 97
5757 411428 NF033860 Wzy_O6_O28 oligosaccharide repeat unit polymerase. Members of this family are oligosaccharide repeat unit polymerases in a subfamily that includes the Wzy proteins for polymerization of the O-antigens O6, O28, O39, O59, and several others. 384
5758 411429 NF033861 sm_mem_Ecr Ecr family regulatory small membrane protein. Ecr, as described in the genus Enterobacter, is a small membrane protein predicted to span the inner membrane. It was found to modulate expression of PhoP, part of the PhoP-PhoQ two component system, which in turn induces the arnBCADTEF operon, leading to modification of LPS and conferring elevated resistance to colistin. The form described in PMID:31169899 (WP_048029797.1), at 72 amino acids in length, is aberrant compared with the typical length for the majority of family members, about 46 residues. 46
5759 411430 NF033863 immun_TipC_fam TipC family immunity protein. This family is named for founding member TipC1 (previously TipC), an immunity protein for the toxin TelC, which is a type VII secretion system (T7SS) effector lipid II phosphatase. 192
5760 411431 NF033864 cytochrome579 cytochrome 579. Cytochrome 579, as described originally in Leptospirillum from acid mine drainage, is an abundant red cytochrome that acts as an electron transfer protein involved in Fe(II) oxidation. 178
5761 411432 NF033865 fusolisin autotransporter serine protease fusolisin. 1003
5762 411433 NF033869 viru_reg_Rsp AraC family transcriptional regulator Rsp. Rsp (repressor of surface proteins), as described in Staphylococcus aureus, is a large protein with an AraC-like helix-turn-helix DNA-binding domain. Regulatory targets include the accessory gene regulator (agr) operon, which in turn regulates a large number of virulence factors. 701
5763 411434 NF033870 VOMP_auto_Cterm Vomp family autotransporter C-terminal domain. The Vomp (variably expressed outer-membrane proteins) family, as described in Bartonella, consists of autotransporter surface proteins including collagen-binding autotransporter adhesins VompA and VompC. 356
5764 411435 NF033871 deAMP_SidD deAMPylase SidD family protein. The founding member of this protein family, SidD from Legionella pneumophila, is a type IV secretion system effector that acts as a deAMPylase of the host protein. Homologs are found in several other Gammaproteobacteria, including several different Legionella and Coxiella species. 180
5765 411436 NF033872 SidA_fam T4SS effector SidA family protein. The founding member of this protein family, SidA from Legionella pneumophila, is a minimally characterized type IV secretion system substrate. Homologs are found in a wide range of Legionella species. 379
5766 411437 NF033873 SidJ_poly_Glu SidJ family T4SS effector polyglutamylation protein. SidJ, called a pseudokinase because its polyglutamylation activity differs from what might be expected from its kinase-like fold, is the founding member of a family of such pseudokinases. SidJ itself, as described in Legionella pneumophila, is exported by a type IV secretion system (T4SS), and modifies and modulates the activity of T4SS effector SidE. 758
5767 411438 NF033874 SidJ_rel_pseudo SidJ-related pseudokinase. Members of this family are uncharacterized but exhibit strong local sequence similarity to SidJ, a protein that exhibits a protein kinase fold, but that surprisingly exhibits a different activity. In the case of SidJ itself, the activity is polyglutamylation activity. For this family, the activity is unknown. 508
5768 411439 NF033875 Agg_substance LPXTG-anchored aggregation substance. Aggregation substances, as described in Enterococcus, are LPXTG-anchored large surface proteins that contribute to virulence. Several closely related paralogs may be found in a single strain. 1306
5769 411440 NF033876 flagella_HExxH flagellinolysin. Flagellinolysin is a variant form of bacterial flagellin in which the normally hypervariable central region contains an M9 (MEROPS classification) family metalloprotease domain, with its signature HExxH motif. The founding member of the family, from the pathogen Clostridium haemolyticum, shows EDTA-sensitive metalloprotease activity. The large count of flagellin subunits in a complete flagellum means the capacity of flagellinolysin to perform as a protease may have implications for host-pathogen relationships. 380
5770 411441 NF033878 thiovarsolin thiovarsolin family RiPP. The thiovarsolins, named for a founding member from Streptomyces varsoviensis, are RiPPs (ribosomally synthesized and post-translationally modified peptide). As with the thioviridamides, thiovarsolin precursors are encoded in loci that encode YcaO and TfuA family proteins, suggesting post-translational modification by thioamidation. 89
5771 411442 NF033879 smalltalk smalltalk protein. Smalltalk is a membrane-associated protein of very small size (less than 35 amino acids), found broadly in Bacteroides and Prevotella, both of which are prevalent in human gut microbiomes. Genomic context suggests a role in crosstalk in the gut microbiome, whether that involve toxins and immunity, signaling, or some other form of interaction. The family was identified and discussed by Sberro, et al., in a screen for overlooked small proteins encoded within human microbiomes, and named smalltalk here for its small size and cross-talk role. 29
5772 411443 NF033880 Prli42 stressosome-associated protein Prli42. Prli42, as characterized in Listeria monocytogenes and found broadly in the Firmicutes, is a membrane protein of very small size, essential to the function of the stressosome. It appears to be related to DUF4044 (PF13253). 31
5773 411444 NF033881 aureocin_A53 aureocin A53 family class IId bacteriocin. Members of this family include leaderless, unmodified class IId bacteriocins such as lacticin Q, BacSp222, and the founding member aureocin A53. 48
5774 411445 NF033882 T4SS_lipo_DotD type IVB secretion system lipoprotein DotD. Members of this family are the lipoprotein DotD from type IVB secretion systems, which are also called Dot/Icm secretion systems. DotD is related to conjugal transfer protein TraH as that term is used in IncI1 plasmid transfer regions. 143
5775 411446 NF033883 conj_TraQ_IncI1 conjugal transfer protein TraQ. 175
5776 411447 NF033884 conj_TraO_IncI1 conjugal transfer protein TraO. TraO, involved in the conjugal transfer of plasmids such as IncI1 plasmids, shares homology with IcmE of type IVB secretion systems. 381
5777 411448 NF033885 conj_TraP_IncI1 conjugal transfer protein TraP. Members of this family are the conjugal transfer protein TraP, as the term is used for the member protein from IncI1 plasmids and for their homologs. Note that the same terminology may be applied to unrelated proteins from other forms of conjugal transfer system. 217
5778 411449 NF033886 T4SS_DotA type IVB secretion system protein DotA. This HMM distinguishes DotA of type IVB secretion systems from TraY as the term is used in the conjugal transfer systems of IncI1 family plasmids. 777
5779 411450 NF033887 conj_TraX conjugal transfer protein TraX. 163
5780 411451 NF033888 conj_TraW conjugal transfer protein TraW. Members of this family are the TraW protein of conjugal plasmid transfer systems, as the term is used for certain transfer systems, including that of IncI1 family plasmids. Note that an unrelated protein, also designated TraW, participates in the assembly of F-pilin subunits, involved in transfer of F-plasmids. 380
5781 411452 NF033889 termin_lrg_T7 phage terminase large subunit. The phage terminase large subunit (TerL) is also called DNA maturase B. It oligomerizes into an ATPase that forces DNA into a pre-existing phage prohead to package the DNA. This TerL family includes members from phage T7 and phage T3, among others. 499
5782 411453 NF033890 DotM_IcmP_IVB type IVB secretion system coupling complex protein DotM/IcmP. 354
5783 411454 NF033891 surf_exc_IncI1 plasmid IncI1-type surface exclusion protein ExcA. The surface exclusion protein ExcA, as found in R64 and other IncI1 family plasmids, is not required for plasmid transfer. Instead, it is required for blocking transfer of closely related plasmids into the host cell. 210
5784 411455 NF033892 XcbB_CpsF_sero XcbB/CpsF family capsular polysaccharide biosynthesis protein. Two partially characterized members of this family are XcbB, as described in Neisseria meningitidis serotype X, and CpsF as described in Enterococcus faecalis serotype C. In the latter case, loss of CpsF converts capsular polysaccharide to serotype D. 291
5785 411456 NF033893 pheromone_ipd peptide pheromone inhibitor Ipd. The pheromone inhibitor iPD1, in mature form, is the last 8 amino acids of the product of the ipd gene. It was described in conjugative plasmids of Enterococcus faecalis. 21
5786 411457 NF033894 Eex_IncN EexN family lipoprotein. Members of this family are lipoproteins, typically associated with mobile elements such as phage, and related to the entry exclusion protein Eex of IncN-type plasmids. Members of this family tend to be small (shorter than 90 amino acids) with three invariant Cys residues, one of which belongs to the lipoprotein signal peptide. This family shares some similarities with TIGR04359, which includes the entry exclusion lipoprotein TrbK of IncPalpha-type plasmids. 65
5787 411458 NF033896 MFS_LfrA efflux MFS transporter LfrA. This efflux transporter, as characterized in Mycolicibacterium (Mycobacterium) smegmatis, provides low-level fluoroquinolone resistance (lfr) when overexpressed. 504
5788 411459 NF033897 GUMAP_C GUMAP protein C-terminal domain. GUMAP (Giant Ureaplasma Membrane-Anchored Protein) is a very large protein found in several species of the genus Ureaplasma, a lineage that resembles Gram-positive bacteria by ancestry but that lacks a peptidoglycan cell wall. GUMAP proteins average about 5000 amino acids in length. Near the C-terminus, these proteins have a strongly hydrophobic segment followed immediately by a cluster of basic residues, as is characteristic of proteins with a C-terminal membrane anchor. Because of the potentially large computation cost of performing database searches with a protein profile HMM that is thousands of residues long, this HMM models only about 750 amino acids of the C-terminal region of GUMAP proteins. 729
5789 411460 NF033898 QWxxN_dom QWxxN domain-containing protein. The QWxxN domain is about 125 amino acids long, and appears typically as the conserved core region in up to 9 tandem repeats, each about 200 amino acids long. Proteins with this domain are known so far only in the genus Enterococcus, and may reach over 3000 amino acids in length. 127
5790 411461 NF033899 T4SS_pilin_TrwL VirB2 family type IV secretion system major pilin TrwL. TrwL is the major pilin of the Trw type IV secretion system (T4SS) of Bartonella species. It is related to VirB2 of related T4SS and to the conjugal transfer protein TrbC. The Trw system is unusual for having duplications of certain subunits, and TrwL has divergent, tandem-duplicated copies, named in series TrwL1, TrwL2, etc. 103
5791 411462 NF033900 T4SS_IcmE_DotG type IVB secretion system protein DotG/IcmE. 1012
5792 411463 NF033901 L_lactate_LldD FMN-dependent L-lactate dehydrogenase LldD. LldD is an FMN-dependent L-lactate dehydrogenase. It occurs in E. coli, Salmonella, and as one of two L-lactate dehydrogenases in Pseudomonas aeruginosa. It is unrelated to the NAD-dependent enzyme. 377
5793 411464 NF033902 iso_D2_wall_anc SpaH/EbpB family LPXTG-anchored major pilin. Members of this family are pilin major subunits whose structure includes an LPXTG motif-containing signal (see TIGR01167) near the C-terminus, for processing by sortases. Most contain a recognizable D2-type fimbrial isopeptide formation domain (see TIGR04226), in which Lys-to-Asn isopeptide bond formation provides additional structural integrity to support adhesion despite shear. For proper members of this subfamily, lengths fall typically in the range of 460 to 640 amino acids in length. Many members of this family contribute to the virulence of certain Gram-positive pathogens, including SpaA, SpaD, and SpaH from Corynebacterium diphtheriae, and EbpB and EbpC from Enterococcus faecalis. 533
5794 411465 NF033903 VaFE_rpt VaFE repeat. The VaFE domain, about 121 amino acids long, typically occurs as a tandem repeat in sortase-anchored surface proteins of Gram-positive bacteria. A single protein may have from one to over fifteen VaFE domains. The domain is named for a particularly strong motif with a nearly invariant Phe-Glu residue pair. The function of the VaFE domain is unknown. 121
5795 411466 NF033904 LlsX_fam LlsX family protein. LlsX, as found in Listeria monocytogenes, is a small protein of unknown function, encoded in the island responsible for listeriolysin S biosynthesis and processing. Related proteins are found in additional Gram-positive lineages, such as Streptococcus sobrinus and Lactobacillus sp. 90
5796 411467 NF033906 ExsE_fam T3SS regulon translocated regulator ExsE family protein. ExsE, through protein-protein interaction, serves in a regulatory cascade that modulates the role of ExsA, a transcriptional activator of Pseudomonas aeruginosa's type III secretion system (T3SS) regulon. ExsE itself is a substrate for translocation (i.e. removal) by the T3SS system, providing feedback that modulates expression of secretion system genes. Homologs found in multiple species of Aeromonas and Photorhabdus may be functionally equivalent. Note that VP1702 from Vibrio parahaemolyticus, given the same gene symbol and ascribed an equivalent function, appears unrelated in sequence. 77
5797 411468 NF033907 ExsE2_fam T3SS regulon translocated regulator ExsE2. ExsE2, as described in two Vibrio species, is called functionally equivalent to ExsE from Pseudomonas aeruginosa, but appears to lack detectable sequence homology. In each model organism, a type III secretion system (T3SS) contains a DNA-binding transcriptional activator ExsA, an anti-activator ExsD that binds ExsA and inhibits its activity, and a secretion chaperone ExsC that can bind to and counteract ExsD. ExsE2 (in V. alginolyticus) or ExsE (in P. aeruginosa) can bind its chaperone ExsC until it exits the cell by successful T3SS translocation, providing fine tuning of T3SS gene expression through these interactions. We renamed the Vibrio family from ExsE to ExsE2 to help minimize confusion between these two very dissimilar families. 93
5798 411469 NF033908 AcfA_fam_omp AcfA family outer membrane beta-barrel protein. AcfA (accessory colonization factor A), as discussed in Vibrio cholerae, is a porin-like outer membrane beta-barrel protein. It is encoded in a locus with other proteins also termed accessory colonization factor, near the toxin-coregulated pilus genes, and its presence aids in intestinal colonization, but its molecular function is unknown. Members of the broader family, described by this HMM, are found in many species of Vibrio, Photobacterium, Aliivibrio, and related genera. The name AcfA is used also for a member of this family from Vibrio alginolyticus, that is no more than 40 percent identical in amino acid sequence, but it is unclear that all members of this family should be considered AcfA. 214
5799 411470 NF033909 opacity_OapA opacity-associated protein OapA. This family consists of full-length homologs to OapA, opacity-associated protein A as described in Haemophilus influenzae. OapA shares a C-terminal homology domain, called the OapA domain, with the Escherichia coli protein YtfB, which is now known to bind peptidoglycan through its OapA domain and to act as a cell division protein. 421
5800 411471 NF033910 LWR_salt LWR-salt protein. This family of uncharacterized proteins was assigned the name LWR-salt (pronounced "lower-salt") to mark a well-conserved motif LWR (part of the longer motif WxFFRDxLWRG) and its restriction to the Halobacteria, a branch of the Archaea. 118
5801 411472 NF033911 botu_NTNH non-toxic nonhemagglutinin NTNH. The botulinum neurotoxin (BoNT) is always encoded together with associated non-toxic proteins (ANTPs) that are co-produced with it and form a complex that protects the toxin. Often, NtnH (non-toxic nonhemagglutinin) is one of these ANTPs, with the ntnh gene lying immediately upstream of the bont gene. 1164
5802 411473 NF033912 msc mechanosensitive ion channel. Proteins of this subfamily are mechanosensitive channels, involved in numerous biological functions. Representative proteins of this subfamily are WP_092311603 and WP_068170239 (TCDB accession: 1.A.23.8.3 and 1.A.23.8.5, respectively). 365
5803 411474 NF033913 fibronec_FbpA LPXTG-anchored fibronectin-binding protein FbpA. FbpA, a fibronectin-binding protein described in Streptococcus pyogenes, has a YSIRK-type (crosswall-targeting) signal peptide and a C-terminal LPXTG motif for covalent attachment to the cell wall. It is unrelated to the PavA-like protein from Streptococcus gordonii (see BlastRule NBR009716) that was given the identical name, so the phrase LPXTG-anchored is added to the protein name for clarity. 386
5804 411475 NF033914 antiphage_ZorA_1 anti-phage defense protein ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense. 619
5805 411476 NF033915 antiphage_ZorA_2 anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense. 383
5806 411477 NF033916 antiphage_ZorA_3 anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense. 509
5807 411478 NF033917 antiphage_ZorA_4 anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense. 417
5808 411479 NF033918 LGIC_1 ligand-gated ion channel. Prokaryotic ligand-gated ion channels (LGICs) are a large group of transmembrane transporters, which might contribute to adaptation to pH change. 305
5809 411480 NF033919 PA2779_fam PA2779 family protein. Homologs of PA2779, an uncharacterized protein, average about 130 amino acids in length. The most distinctive feature is an extremely hydrophobic region at or near the C-terminus, consisting almost entirely of the bulkier hydrophobic residues Val, Ile, Leu, and Phe. The function is unknown. 123
5810 411481 NF033920 C39_PA2778_fam PA2778 family cysteine peptidase. Members of this family are MEROPS classification C39 family cysteine peptidases, a group that includes many processing enzymes for peptide natural products such as lantibiotics and other bacteriocins. This family, more specifically, includes PA2778 as found in Pseudomonas aeruginosa. All members of the defining seed alignment are encoded in the vicinity of a homolog of PA2779 (see HMM NF033919). Note that the C-terminal region consists largely of tetratricopeptide repeats (TPR), so classification using this HMM must be based on comparing the top domain score to the second gathering threshold (GA2). 255
5811 411482 NF033921 por_somb iron uptake porin. Proteins of this family have typical porin structures. It has been reported that Synechococcus outer membrane (Som) porins (SomA and SomB) are involved in iron uptake in the cyanobacterium Synechococcus. 481
5812 411483 NF033922 opr_porin_1 Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates. 328
5813 411484 NF033923 opr_proin_2 Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates. 386
5814 411485 NF033924 T3SS_LcrQ_reg type III secretion system exported negative regulator LcrQ/YscM1. The type III secretion system (T3SS) protein called LcrQ in Yersinia pseudotuberculosis and YscM1 in Yersinia enterocolitica is a post-transcriptional regulator of T3SS effector gene expression. Successful chaperone-dependent export by the T3SS allows the translation of T3SS effector proteins to proceed. 110
5815 411486 NF033925 pora_1 PorA family porin. This HMM hits Corynebacterial Porin A (PorA) family porins, which are short membrane proteins. 42
5816 411487 NF033926 msp_porin MSP porin. Members of this HMM are MSP porins (major outer sheath proteins) in Treponema. They may play a role in immune evasion and persistence. 512
5817 411488 NF033927 alph_xenorhab_B alpha-xenorhabdolysin family binary toxin subunit B. 223
5818 411489 NF033928 alph_xenorhab_A alpha-xenorhabdolysin family binary toxin subunit A. Alpha-xenorhabdolysin was the founding member of a family of alpha-helical pore-forming binary toxins. YaxAB from Yersinia enterocolitica has been studied structurally. This HMM represents subunit A proteins such as XaxA and YaxA, capable of binding to the membrane even in the absence of the B subunit. This family is related to the Bacillus haemolytic enterotoxin family (see PF05791.9), although thresholds for this HMM are set to exclude that family. 340
5819 411490 NF033930 pneumo_PspA pneumococcal surface protein A. The pneumococcal surface proteins, found in Streptococcus pneumoniae, are repetitive, with patterns of localized high sequence identity across pairs of proteins given different specific names, such that recombination may be presumed. This protein, PspA, has an N-terminal region that lacks a cross-wall-targeting YSIRK type extended signal peptide, in contrast to the closely related choline-binding protein CbpA which has a similar C-terminus but a YSIRK-containing region at the N-terminus. 660
5820 411491 NF033932 LapB_rpt_80 LapB C-terminal region repeat. This model describes a tandem repeat about 80 amino acids in length per repeat, found in at least 12 different surface-exposed proteins of the pathogen Listeria monocytogenes, and in particular found 10 times in tandem in the surface protein LapB, for which the repeat is named. 83
5821 411492 NF033934 KCU-star KCU-star family selenoprotein. This family is named KCU-star because nearly all member proteins end with tripeptide lysine-cysteine-selenocysteine, followed immediately by a stop codon (represented by an asterisk, or star). Members occur primarily in species of Helicobacter (although not Helicobacter pylori, in which selenocysteine incorporation capability has been lost) and Campylobacter. This small family belongs to the larger YbdD/YjiX (DUF466) family described by Pfam model PF04328. 57
5822 411493 NF033935 inclusion_IncB inclusion membrane protein IncB. When Chlamydia invades a cell, the host-derived membrane of the vacuole in which it resides becomes known as the inclusion membrane. The chlamydial type III secretion system (T3SS) delivers a number of effector proteins into the inclusion membrane, including this protein, IncB (inclusion membrane protein B). IncB proteins from different chlamydial species share a conserved hydrophobic C-terminal region, represented by this HMM, but their N-terminal regions vary considerably in length and sequence. 78
5823 411494 NF033936 CuZnOut_SO0444 SO_0444 family Cu/Zn efflux transporter. Members of this family are apparent metal cation efflux transporters. Architectural features include an average length of about 400 residues, with well conserved and highly hydrophobic N-terminal and C-terminal domains. The central region is highly variable in length and sequence, and rich in both Cys and His residues, as often seen in proteins produced in response to toxic concentrations of certain metals. The founding member, SO_0444, was shown to confer resistance to high levels of Cu and Zn ions. The best conserved region of the protein is a CSCG motif in the N-terminal region, found at least twice in a selenocysteine-containing form, USCG. 336
5824 411495 NF033937 porH_1 PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins. 97
5825 411496 NF033938 porH_2 PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins. 63
5826 411497 NF033939 DESULF_POR1 outer membrane homotrimeric porin. Proteins of this HMM family are primarily identified in sulfate-reducing Desulfovibrio, but this HMM may also hit proteins from other Gram-negative bacteria. Porins of this family form transmembrane pores for the passive transport of small molecules across the outer membranes of Gram-negative bacteria. 455
5827 411498 NF033940 ErpA_rel ErpA-related iron-sulfur cluster insertion protein. 95
5828 411499 NF033942 GjpA outer membrane porin GjpA. GjpA was first identified in Gordonia jacobaea strain MV-1 as an outer membrane channel-forming protein. It has been reported that GjpA could be of relevance in the import and export of negatively charged molecules across the cell wall. 331
5829 411500 NF033943 RTX_toxin RTX family hemolysin. RTX family toxins are secreted from the bacteria and inserted into the membranes of infected cells, causing host cell rupture. 875
5830 411501 NF033945 AcrIIA2_fam AcrIIA2 family anti-CRISPR protein. Anti-CRISPR proteins are phage proteins that defeat CRISPR-Cas systems for immunity based on phage-derived spacers found in arrays between CRISPR repeats. The founding member of this family, AcrIIA2, works against a CRISPR-Cas class II system. 118
5831 411502 NF033946 AcrIIA4_fam AcrIIA4 family anti-CRISPR protein. AcrIIA4 is an anti-CRISPR protein that affects Cas9, a class II CRISPR system protein used in biotechnology applications for targeted genome editing. 86
5832 411503 NF033947 PEP-cistern cistern family PEP-CTERM protein. Members of this family are PEP-CTERM proteins, that is, surface proteins of Gram-negative organisms that carry a short C-terminal region used to help target proteins to their proper cellular location, hold them in position for post-translational modifications that might need to occur (such as glycosylation), and which is eventually removed by exosortase as the protein is ligated to something else. In this family the most conspicuous feature other than the PEP-CTERM sorting signal (with variants that include PEP, PAP, PTP, and SEP) is a pair of Cys residues about 6 amino acids apart from each other. The second Cys occurs in the middle of a run of amino acids that are all either small (Gly, Ser, Ala) or else Asn. The local context suggests the Cys occurs at a turn at the end of a structural feature such as an alpha-helix or beta-strand, rather than in the middle of one. The word "cistern" was assigned to suggest the proposed Cys-turn feature. 198
5833 411504 NF033948 AcrIIA3_fam anti-CRISPR protein AcrIIA3. AcrIIA3, as found in siphophages infecting Listeria and Streptococcus, is an anti-CRISPR protein that prevents Cas9-containing CRISPR systems from protecting against phage infection. 119
5834 411505 NF033949 Cas12b type V CRISPR-associated protein Cas12b. 1187
5835 411506 NF033950 Cas12c type V CRISPR-associated protein Cas12c. 1246
5836 411507 NF033951 Cas12d type V CRISPR-associated protein Cas12d. 1118
5837 411508 NF033952 AcrID1_fam AcrID1 family anti-CRISPR protein. The AcrID1 family of anti-CRISPR proteins occurs in virus infecting the Archaea, primarily Sulfolobus. It targets and inactivates type I-D CRISPR-Cas systems. 93
5838 411509 NF033953 AcrF10_fam AcrF10 family anti-CRISPR protein. Members of the AcrF10 family of anti-CRISPR proteins have been found in phage from various Vibrio, Shewanella, and their relatives. AcrF10 is considered a DNA mimic protein. 94
5839 411510 NF035921 staph_coagu staphylocoagulase. Past residue 485, the protein consists of a variable number of 27-amino acid tandem repeats (see PF04022). This HMM omits coverage of a portion of the repetitive C-terminal region. 520
5840 411511 NF035922 Trp_DH_ScyB tryptophan dehydrogenase ScyB. The tryptophan dehydrogenase (EC 1.4.1.19) ScyB performs a reversible NAD(+)-dependent deamination of L-Trp to 3-indolepyruvate. ScyB occurs in Cyanobacteria that biosynthesize scytonemin, a natural sunscreen, from tryptophan. 346
5841 411512 NF035923 TPP_ScyA scytonemin biosynthesis protein ScyA. This HMM distinguishes ScyA itself, found within the ScyABCDEF operon, from closely related paralogous TPP-binding ScyA-related proteins encoded outside the operon. 623
5842 411513 NF035924 scytonem_ScyC scytonemin biosynthesis cyclase/decarboxylase ScyC. Of the various markers of scytonemin biosynthesis, ScyC appears to be the clearest, as there are few or no close homologs from outside of the set of confidently predicted scytonemin producer bacteria. 312
5843 411514 NF035925 Geo26A_fam geobacillin-26 family protein. 197
5844 411515 NF035926 scyF_NHL_VPEP scytonemin biosynthesis PEP-CTERM protein ScyF. ScyF has multiple tandem copies of the NHL repeat, making it a beta-propeller protein. It has an N-terminal signal peptide, and a C-terminal PEP-CTERM domain for recognition and cleavage by exosortase, suggesting that ScyF is sorted to an extracellular location, perhaps the outer leaflet of the outer membrane, and attached there covalently. 376
5845 411516 NF035927 TPP_ScyA_rel ScyA-related TPP-binding enzyme. Members of this family are not ScyA itself, which is described in TPP_ScyA (NF035923), but instead are closely related paralogs that are likewise restricted to the Cyanobacteria. Nostoc punctiforme, for example, has ScyA itself, but also has three members of this family, all more closely related to each other than to ScyA. By homology, members of this family are expected to be thiamine pyrophosphate-binding enzymes. 551
5846 411517 NF035928 holin_1 bacteriophage holin. Holins form transmembrane pores for releasing endolysins that hydrolyze the cell wall and induce cell death. 117
5847 411518 NF035929 lectin_1 lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacteria tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor. 837
5848 411519 NF035930 lectin_2 lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacteria tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor. 238
5849 411520 NF035931 lectin_3 lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacteria tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor. 226
5850 411521 NF035932 lectin_4 lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacteria tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor. 318
5851 411522 NF035933 ESAT6_1 pore-forming ESAT-6 family protein. 107
5852 411523 NF035934 ESAT6_2 pore-forming ESAT-6 family protein. 96
5853 411524 NF035935 ESAT6_3 pore-forming ESAT-6 family protein. 98
5854 411525 NF035936 agg_sub_LPXTH serine-rich aggregation substance UasX. Members of this protein family are repetitive, serine-rich surface proteins of the Firmicutes, found primarily in the genus Leuconostoc. The variant form of sortase signal, LPXTH, is replaced by LPXTG in members from some lineages, such as Weissella oryzae, and therefore recognizable. Some members of this family have the KxYKxGKxW type signal peptide as seen in the glycoprotein adhesin GspB, a substrate of the accessory Sec system for secretion. WOSG25_050600 from Weissella oryzae SG25 is identified in a publication as an unnamed aggregation substance, a conclusion supported by the sorting signals and composition reported here. We assign the gene symbol uasX (unnamed aggregation substance X) based on our evaluation of the family. 281
5855 411526 NF035937 EboA_family EboA family metabolite traffic protein. This HMM describes a narrow, cyanobacterial-only clade of members of the EboA (eustigmatophyte/bacterial operon A) family. Members of this family appear required for transport of certain secondary metabolite precursors to the periplasm, including (but not limited to) precursors of scytonemin. More than half the members of this clade belong to scytonemin producers. 219
5856 411527 NF035938 EboA_domain EboA domain-containing protein. EboA (eustigmatophyte/bacterial operon A) belongs to a broadly distributed system of six proteins. The ebo system in general appears to be involved in trafficking certain natural product precursors whose biosynthesis requires export at least as far as the periplasm. Scytonemin is an example of one natural product whose biosynthesis requires an Ebo system. 141
5857 411528 NF035939 TIM_EboE metabolite traffic protein EboE. EboE (eustigmatophyte/bacterial operon E) belongs to the TIM barrel fold superfamily by homology, as shown by Pfam model PF01261. Although its exact function is unknown, EboE is encoded as part of a widely distributed system that appears to allow export of precursor metabolites of various secondary metabolites so that biosynthesis can be completed outside of the cytosol. 375
5858 411529 NF035940 prenyl_rel_EboC UbiA-like protein EboC. 292
5859 411530 NF035941 GBS_alph_likeN alpha-like surface protein N-terminal domain. Most Group B Streptococcus (GBS) have a member of a mosaic family of repetitive surface proteins. The founding member is the alpha C protein (bca), but other named members with complete sequences include Alp2, Alp3, and Rib. This HMM describes the shared, non-repetitive N-terminal region, including a YSIRK-like signal peptide region. 225
5860 411531 NF035942 T3SS_eff_HopBF1 T3SS effector protein kinase HopBF1. HopBF1, found in plant pathogens such as Pseudomonas syringae and in the human pathogen Ewingella americana, is a type III secretion system effector that acts as a protein kinase. It phosphorylates the eukaryotic chaperone HSP90 on a serine residue, inhibiting its ATPase activity. The inhibition interferes with the proper folding of client proteins of HSP90 that are important to resistance to bacterial infection. 200
5861 411532 NF035943 exosort_XrtV exosortase V. 262
5862 411533 NF035944 PEPxxWA-CTERM PEPxxWA-CTERM sorting domain. This variant form of PEP-CTERM sorting signal shows unusually strong conservation across much of the hydrophobic transmembrane segment, including an unusual Trp (W) residue at position 6 of the seed alignment. The Trp is replaced by Tyr (Y) in a number of proteins hit by the HMM. The top-scoring members of this family tend to occur in Sphingomonas and related genera, encoded by genomes that also encode the sorting enzyme XrtV (exosortase V). 24
5863 411534 NF035945 Zn_serralysin serralysin family metalloprotease. 457
5864 411535 NF035950 RumC_sactiRiPP RumC family sactipeptide. The founding member of this family of radical SAM/SPASM-modified sactipeptide bacteriocins is ruminococcin C1, AEC03333.1 from Ruminococcus gnavus. This bacteriocin, from a human gut bacterium, is interesting because of its anti-clostridial activity. Known homologs to RumC1 average about 65 amino acids in length and have four invariant Cys residues. 63
5865 411536 NF035951 rSAM_RumMC RumMC family radical SAM sactipeptide maturase. The RumMC family of radical SAM/SPASM domain peptide modification enzymes creates sulfur-to-alpha-carbon (sactipeptide) linkages in RiPP peptide natural products in the family of ruminococcin C. 510
5866 411537 NF035952 WxPxxD_TM WxPxxD family membrane protein. This uncommon, extremely hydrophobic protein of about 240 amino acids occurs sporadically in members of the Firmicutes, including the Listeria, Bacillus, Anoxybacillus, and Terribacillus genera. The protein is named for its most distinctive motif. The function is unknown, but the size and hydrophobicity suggests a transport-related function. 234
5867 411538 NF035953 integrity_Cei envelope integrity protein Cei. Cei (cell envelope integrity), as described for the founding member Rv2700 from Mycobacterium tuberculosis, is a transmembrane protein with an extracellular LytR_C domain. It lacks any DNA-binding domain and is not a transcriptional regulator. It shares homology to C-terminal regions present in some members of the LytR-CpsA-Psr family, a family in which some characterized members transfer teichoic acids from carriers to mature peptidoglycan. 211
5868 411539 NF035954 ocin_CA_C0660 CA_C0660 family putative sactipeptide bacteriocin. Members of this family are Cys-rich peptides about 65 amino acids in length, regularly found in the vicinity of the radical SAM/SPASM domain enzymes. The most closely related such radical SAM enzyme is the RiPP modification enzyme that introduces sulfur-to-alpha-carbon peptide (sactipeptide) modification in ruminococcin C, whose precursor is very similar in length and in Cys content and arrangement. 64
5869 411540 NF037932 ocin_sys_WGxF bacteriocin-like WGxF protein. Members of this protein family of hydrophobic proteins about 60 amino acids long are found in various members of the Firmicutes in three-gene contexts that suggest a role as a bacteriocin or an immunity protein. The protein is named for its most striking sequence feature, a nearly invariant WGxF motif. The two conserved neighboring families are TIGR01654-like (e.g. WP_149116529.1) and TIGR03608-like (e.g. WP_149116530.1). 59
5870 411541 NF037933 EpaQ_fam EpaQ family protein. EpaQ, as described in the Gram-positive bacterium Enterococcus faecalis, is encoded with the enterococcal polysaccharide antigen (epa) operon. It is distantly related to some O-antigen ligases of Gram-negative bacteria, and may have a similar molecular function. EpaQ contributes to biofilm formation and resistance to certain antibiotics. 377
5871 411542 NF037934 holdfast_HfaA holdfast anchoring protein HfaA. 115
5872 411543 NF037935 holdfast_HfaB holdfast anchoring protein HfaB. HfaB, part of the holdfast anchoring complex of Caulobacter and related bacteria, is a homolog of the outer membrane protein CsgG of curli biogenesis. 264
5873 411544 NF037936 holdfast_HfaD holdfast anchor protein HfaD. 373
5874 411545 NF037937 septum_RefZ forespore capture DNA-binding protein RefZ. RefZ (regulator of FtsZ), a DNA-binding protein in the family of TetR/AcrR family transcriptional regulators, participates in septum placement and in chromosome capture during the asymmetrical cell division in endospore formation. The five nearly palindromic DNA motifs (RBMs) to which RefZ binds affect chromosomal localization, not transcription, so RefZ is not considered a transcription factor. 195
5875 411546 NF037938 Myr_Ysa_major MyfA/PsaA family fimbrial adhesin. The related adhesins MyfA and PsaA are fimbrial major subunits of Myf fimbriae from Yersinia enterocolitica, and Psa (also known as pH6 antigen) fimbriae from Y. pestis. 158
5876 411547 NF037940 PKS_MbtD mycobactin polyketide synthase MbtD. 973
5877 411548 NF037941 PKS_NbtC nocobactin polyketide synthase NbtC. 1026
5878 411549 NF037942 ac_ACP_DH_MbtN mycobactin biosynthesis acyl-ACP dehydrogenase MbtN. MbtN belongs to a family of dehydrogenases that in most cases act on acyl groups carried on CoA. However, MbtN appears to act on an acyl group carried instead on the mycobactin biosynthesis acyl carrier protein MbtL. 376
5879 411550 NF037944 holin_2 bacteriophage holin. Proteins of this family are homologs of the mycobacterial phage holin Gp29. They can cause host cell lysis to release progeny phage particles. 64
5880 411551 NF037945 holin_3 bacteriophage holin. Bacteriophage holin can cause host cell lysis to release progeny phage particles. Proteins of this family have about 60 amino acids, and a transmembrane domain is usually found on the C-terminal. 37
5881 411552 NF037946 terminal_TopJ terminal organelle assembly protein TopJ. 440
5882 411553 NF037947 holin_4 bacteriophage holin. Bacteriophage holin can cause host cell lysis to release progeny phage particles. Proteins of this family usually have two transmembrane domains. 75
5883 411554 NF037948 signal_int_SinM signal integration modulator SinM. SinM (signal integration modulator) is a regulatory partner of the hybrid histidine kinase/response regulator SinK (signal integrating kinase). 387
5884 411555 NF037949 holin_5 bacteriophage holin. Proteins of this family can cause host cell lysis to release progeny phage particles. 64
5885 411556 NF037950 spanin2_1 bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane. 130
5886 411557 NF037951 spanin2_2 bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane. 107
5887 411558 NF037952 spanin2_3 bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane. 98
5888 411559 NF037953 frad septal junction protein FraD. Proteins of this family are components of cyanobacterial septal junctions (microplasmodesmata) in heterocyst-forming cyanobacteria. 305
5889 411560 NF037954 het_cyst_PatD heterocyst frequency control protein PatD. 116
5890 411561 NF037955 mfs MFS transporter. 366
5891 411562 NF037957 freyrasin_like freyrasin family ranthipeptide. A ranthipeptide is a ribosomally synthesized and post-translationally modified peptide (RiPP) whose linkages from cysteine sulfur atoms are to something besides the alpha carbons of other amino acids. Thus, ranthipeptides differ from sactipeptides (sulfur-to-alpha carbon RiPP peptides). The founding member of this family is freyrasin itself, the PapA protein from Paenibacillus polymyxa (see SUA72395.1). All members so far are encoded next to radical SAM peptide maturases. 49
5892 411563 NF037958 QH_gamma quinohemoprotein amine dehydrogenase subunit gamma. 100
5893 411564 NF037959 MFS_SpdSyn fused MFS/spermidine synthase. Proteins of this family are fusions of an N-terminal MFS (Major Facilitator Superfamily) transporter domain and a C-terminal spermidine synthase (SpdSyn)-like domain. The encoding genes are usually near the genes encoding S-adenosylmethionine decarboxylase (AdoMetDC) on many bacterial genomes. It has been shown in Shewanella oneidensis that the fused protein aminopropylates a substrate other than putrescine, and has a role outside of polyamine biosynthesis. 480
5894 411565 NF037960 MFS_trans MFS transporter. 382
5895 411566 NF037961 RodA_shape rod shape-determining protein RodA. Proteins of this family are members of the FtsW/RodA/SpoVE superfamily. It has been reported that RodA proteins play important roles in maintaining cell shape and antibiotic resistance in bacteria. 415
5896 411567 NF037962 arsenic_eff arsenic efflux protein. Most proteins of this family have 8 transmembrane domains with two 4 transmembrane halves separated by a hydrophilic loop of variable sizes. It has been reported that some proteins of this family are involved in arsenate/arsenite resistance. 286
5897 411568 NF037963 heterocyst_HetZ heterocyst differentiation protein HetZ. The HMM distinguishes HetZ itself, a heterocyst differentiation protein, from a closely related paralog. 366
5898 411569 NF037964 HetZ_related HetZ-related protein. Members of this cyanobacterial protein family are paralogs of the heterocyst differentiation protein HetZ and occur in largely the same set of genomes. 370
5899 411570 NF037965 HetZ_rel_2 HetZ-related protein 2. Members of this family are cyanobacterial proteins distantly related to heterocyst differentiation protein HetZ, which also has a much more closely related set of paralogs in heterocyst-forming species. 367
5900 411571 NF037966 HetP_family HetP family heterocyst commitment protein. HetP and its paralogs occur in heterocyst-forming members of the Cyanobacteria, and play a role in commitment to development into heterocysts, which specialize in nitrogen fixation rather than photosynthesis. 63
5901 411572 NF037967 SemiSWEET_1 SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 7 TMHs. 196
5902 411573 NF037968 SemiSWEET_2 SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 3 TMHs. 76
5903 411574 NF037969 SemiSWEET_3 SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 3 TMHs. 97
5904 411575 NF037970 vanZ_1 VanZ family protein. VanZ was originally identified in Enterococcus faecium. VanZ increases teicoplanin resistance in Enterococcus faecium, but has no impact on vancomycin resistance. Proteins of this family are homologs of the VanZ protein. They may be involved in teicoplanin resistance. 107
5905 411576 NF037971 lipo_BcpO CDI system lipoprotein BcpO. BcpO is a small lipoprotein, about 74 amino acids long on average, encoded in Burkholderia bcpAIOB locus systems for two-partner secretion and contact dependent growth inhibition. 66
5906 411577 NF037972 HuaA_fam_RiPP huazacin family RiPP peptide. 40
5907 411578 NF037974 SslE_AcfD_Zn_LP SslE/AcfD family lipoprotein zinc metalloprotease. Members of this family are surface lipoprotein zinc metalloproteases, from the family that includes accessory colonization factor AcfD from Vibrio cholerae, SslE (YghJ) from E. coli (Secreted and Surface-associated Lipoprotein from E. coli), and VPA1376 from Vibrio parahaemolyticus. Each is about 1500 amino acids long, and SslE is a known substrate of a type II secretion system (T2SS). SslE is known to have mucinase activity. 1389
5908 411579 NF037975 pilot_rel_YacC YacC family pilotin-like protein. Members of this family, including YacC from Escherichia coli K-12, resemble the lipoprotein GspS of type II secretion systems (T2SS), but in general are not lipoproteins. In E. coli K-12, where the T2SS is cryptic (not expressed, but able to function after manipulation to force its expression), YacC is encoded far from the locus where the main set of T2SS genes are found, and it is not clear that YacC is a true GspS. 115
5909 411580 NF037976 gtrA_1 GtrA family protein. The GtrA family proteins have 3-4 transmembrane domains. They are involved in translocation of lipid-linked glucose across the cytoplasmic membrane. 126
5910 411581 NF037977 Lpg0189_fam Lpg0189 family type II secretion system effector. 279
5911 411582 NF037978 T2SS_GspB type II secretion system assembly factor GspB. GspB (general secretory pathway B) occurs in type II secretion systems (T2SS) and is viewed as an accessory protein, a factor involved in the assembly process rather than integral to the completed T2SS apparatus. 158
5912 411583 NF037979 Na_transp sodium-dependent transporter. 417
5913 411584 NF037980 T2SS_GspK type II secretion system minor pseudopilin GspK. 311
5914 411585 NF037981 NCS2_1 purine/pyrimidine permease. Proteins of this family usually have 14 transmembrane domains. They belong to the NCS2 superfamily transporters. They are specific purine and/or pyrimidine permeases. 419
5915 411586 NF037982 Nramp_1 Nramp family divalent metal transporter. Nramp (natural resistance-associated macrophage protein) family divalent metal transporters are widely conserved divalent metal transporters, which enable manganese import in bacteria and dietary iron uptake in mammals. 404
5916 411587 NF037993 cyano_chori_ly cyanobacterial-type chorismate lyase. This variant form of chorismate lyase is widespread in the cyanobacteria, including founding example sll1797 from Synechocystis sp. PCC6803. Previously, members of this family were named DUF98 domain-containing protein, as found by Pfam model PF01947. The product, 4-hydroxybenzoate, is next prenylated by slr0926, during plastoquinone biosynthesis. 176
5917 411588 NF037994 DcuC_1 C4-dicarboxylate transporter DcuC. Proteins of this family usually have 11-12 transmembrane regions. They transport C4-dicarboxylates under anaerobic conditions, which plays an important role in anaerobic energy metabolism. 414
5918 411589 NF037995 TRAP_S1 TRAP transporter substrate-binding protein DctP. Proteins of this family are members of the superfamily of Tripartite ATP-independent Periplasmic Transporter (TRAP-T). They transport hydrophobic substrates, usually lipoprotein. 271
5919 411590 NF037996 B-4DMT B-4DMT family transporter. Proteins of this family usually have four transmembrane regions. They are classified as a new transporter family (9.B.148) by TCDB. 139
5920 411591 NF037997 Na_Pi_symport Na/Pi symporter. Proteins of this family belong to the Phosphate:Na+ Symporter (PNaS) superfamily. 294
5921 411592 NF037998 RND_1 protein translocase. 1237
5922 411593 NF037999 mutacin lantibiotic mutacin. Mutacins are lantibiotics in the epidermin/gallidermin/nisin family, found in the biofilm-forming dental caries pathogen Streptococcus mutans. Named members of the family include mutacin I and mutacin 1140. This HMM separates the mutacins (MutA) from paralog MutA' encoded nearby, which lacks mutacin activity. 63
5923 411594 NF038000 mutacin_prime mutacin-like lantipeptide. MutA', a paralog of the nisin-like lantibiotic mutacin precursor MutA, is a lantipeptide of unknown function, encoded in Streptococcus mutans within the same region. It is not required for mutacin function. 64
5924 411595 NF038001 HYExAFE HYExAFE family protein. This uncharacterized protein is named for its best conserved region, the motif HY[ED]xAFE, found near the N-terminus. It appears to be limited in taxonomic range to members of the Planctomycetes. 164
5925 411596 NF038002 bifunc_CbiS bifunctional adenosylcobinamide hydrolase/alpha-ribazole phosphatase CbiS. 333
5926 411597 NF038004 darobactin_RiPP darobactin family peptide antibiotic. Darobactin, discovered in the genus Photorhabdus, is a peptide antibiotic, made from a ribosomally translated precursor, and modified by the radical SAM/SPASM peptide maturase DarE. Darobactin A is the founding member of a new class of antibiotic that appears to target BamA, a component of the outer membrane beta-barrel assembly machine. It is seven amino acids long, Trp1-Asn2-Trp3-Ser4-Lys5-Ser6-Phe7, with two crosslinks, one from Trp1 to Trp3, the other from Trp3 to Lys5. Homologs of the darobactin A precursor are encoded in various strains of Photorhabdus, Yersinia, and Vibrio. 45
5927 411598 NF038005 rSAM_mat_DarE darobactin maturation radical SAM/SPASM protein DarE. The radical SAM/SPASM protein DarE is a maturase for the ribosomally translated, post-translationally modified peptide natural product (RiPP) darobactin, including forms A, B, C, D, and E. The mature form is just seven amino acids long, with two cross-links, a Trp1-Trp3 linkage and a Trp3-Lys5 (or Arg5) linkage. 432
5928 411599 NF038006 NhaD_1 sodium:proton antiporter NhaD. Proteins of this family usually have 10-13 transmembrane regions. They extrude Na+ or Li+ in exchange for H+. They have been identified and characterized in a number of bacterial species. 405
5929 411600 NF038007 ABC_ATP_DarD darobactin export ABC transporter ATP-binding protein. 218
5930 411601 NF038008 ABC_perm_DarB darobactin export ABC transporter permease subunit. 777
5931 411602 NF038009 TatB_1 twin-arginine translocation system-associated protein. This family is suggested by TCDB to be part of the twin-arginine translocation (TAT) system, but lacks detectable sequence similarity to subunits such as TatB. 70
5932 411603 NF038010 ABC_adapt_DarC darobactin export ABC transporter periplasmic adaptor subunit. 417
5933 411604 NF038011 PelF GT4 family glycosyltransferase PelF. Proteins of this family are components of the exopolysaccharide Pel transporter. It has been reported that PelF is a soluble glycosyltransferase that uses UDP-glucose as the substrate for the synthesis of exopolysaccharide Pel, whereas PelG is a Wzx-like and PST family exopolysaccharide transporter. 489
5934 411605 NF038012 DMT_1 DMT family transporter. Proteins of this family belong to the drug/metabolite transporter (DMT) superfamily. 276
5935 411606 NF038013 AceTr_1 acetate uptake transporter. Proteins of this family are acetate transporters, which usually have 6 transmembrane regions. The homologue in E. coli is YaaH. 176
5936 411607 NF038014 Chlamy_inclu_1 inclusion-associated protein. Proteins of this family are inclusion-associated proteins in Chlamydia. It has been shown that protein CPj0783, which is identical to the HMM seed protein WP_010892266, was localized on Chlamydial inclusion. CPj0783 interacted with host Huntingtin-protein14, which may play an important role in disturbing the vesicle transport system to escape host lysosomal or autophagosomal degradation. 236
5937 411608 NF038015 AztD metallochaperone AztD. Proteins of this family are components of the AztABCD zinc-uptake system. AztC, AztB, and AztA are the extracellular solute-binding protein (SBP), permease, and ATP-binding protein, respectively. AztD is a zinc chaperone to AztC, and it may store zinc in the periplasm for transfer through the AztABCD transporter system. 374
5938 411609 NF038016 sporang_Gsm sporangiospore maturation cell wall hydrolase GsmA. The peptidoglycan-hydrolyzing enzyme GsmA occurs in some sporangia-forming members of the Actinobacteria, such as Actinoplanes missouriensis, and is required for proper separation of spores. GsmA proteins have one or two SH3 domains N-terminal to the hydrolase domain. 312
5939 411610 NF038017 ABC_perm1 ABC transporter permease. Proteins of this family are the permease subunit of an ABC transporter complex, which may be involved in tungstate uptake. 184
5940 411611 NF038018 qmoC quinone-interacting membrane-bound oxidoreductase complex subunit QmoC. Proteins of this family are the transmembrane subunit of the quinone-interacting membrane-bound oxidoreductase complex, which consists of the QmoA, QmoB, and QmoC proteins. It has been reported that the QmoABC complex is essential for efficiently delivering electrons to adenosine 5'-phosphosulfate reductase AprAB, which is important in sulfate reduction in sulfate reducing prokaryotes (SRPs). 363
5941 411612 NF038019 PE_process_PecA PecA family PE domain-processing aspartic protease. PecA from Mycobacterium marinum, and by homology, three related paralogs from Mycobacterium tuberculosis (PE26, PE_PGRS35, and PE_PGRS16) are all PE domain-containing proteins secreted by a type VII secretion system (T7SS, also called ESX in Mycobacterium), and all share a C-terminal aspartic protease-like domain. PecA itself is now known to be a functional aspartic protease that cleaves within the PE domain of T7SS secretion substrates that have the domain, including itself. Members of this family typically contain a long, variable, low-complexity region. This HMM represents the aspartic protease region C-terminal to the low-complexity region. 278
5942 411613 NF038020 HeR heliorhodopsin HeR. This HMM represents heliorhodopsins, a group of phylogenetically distinct microbial rhodopsins, which play an important role in absorbing and transferring light energy for numerous biological processes in bacteria. Heliorhodopsin was initially identified and characterized in a Gram-positive actinobacterium based on functional metagenomics and photochemical approaches. Heliorhodopsins have seven transmembrane domains, and exhibit similar biological function as microbial rhodopsins. However, heliorhodopsins form a distinct cluster based on phylogenetic analyses. Most microbial rhodopsins are hit by the Pfam HMM PF01036, which does not hit heliorhodopsins. 244
5943 411614 NF038021 mannan_LmeA mannan chain length control protein LmeA. 264
5944 411615 NF038022 PorACj_fam PorACj family cell wall channel-forming small protein. Members of this unusual protein family are small (often 40 amino acids or shorter), variable, and detected so far only in the genus Corynebacterium. Despite its small size, the founding member is reported to form homooligomeric channels in the cell wall (not the plasma membrane). This family, as built, may also include PorA subunits of PorA/PorH heterooligomeric cell wall channels. 32
5945 411616 NF038023 S_layer_PS2 S-layer protein PS2. 499
5946 411617 NF038024 CRR6_slr1097 CRR6 family NdhI maturation factor. The protein Slr1097 and its functionally equivalent cyanobacterial homologs are required for proper maturation of NdhI, a subunit of NADPH dehydrogenase complexes, so that NDH-1 complexes can assemble properly. The related protein in the model plant species Arabidopsis thaliana is known as CRR6 (chlororespiratory reduction 6). 151
5947 411618 NF038025 dapto_LiaX daptomycin-sensing surface protein LiaX. LiaX (lipid-II-interacting antibiotics X), as described in Enterococcus faecalis, is expressed under control of the LiaR response regulator, and is involved in the process of resistance to daptomycin and to antimicrobial peptides of the innate immune response. 513
5948 411619 NF038026 RsaX20_sORF putative metal homeostasis protein. Members of this family average just 38 amino acids in length, but are widely conserved among species of the Gram-positive genera that include Staphylococcus, Enterococcus, Leuconostoc, and Lactobacillus. Expression of an RNA designated RsaX20, which encodes a member of this family, was studied in a Staphylococcus aureus of possible structural RNAs, and shown to be controlled by Fur-like transcription factor Zur. The broad conservation of the coding region across multiple species and protein-like pattern of amino acid substitutions in multiple sequence alignments strongly suggest that members of this family are indeed translated into functional proteins. 32
5949 411620 NF038027 TssQ_fam TssQ family T6SS-associated lipoprotein. The founding member of this protein family, TssQ (BB0812) from Bordetella bronchiseptica, is a lipoprotein, and possibly all other family members are as well. TssQ is encoded within a T6SS locus but its function remains unknown. 100
5950 411621 NF038028 spiralin_repeat spiralin repeat. The spiralin repeat is a domain that appears once in spiralin (the major lipoprotein of Spiroplasma species) and up to six times in related proteins. 88
5951 411622 NF038029 LP_plasma lipoprotein signal peptide. This HMM describes one of several homology families of mutually closely related lipoprotein signal peptides that are found in the Mycoplasmas and closely related genera (Mesoplasma, Spiroplasma), but are absent outside that taxonomic group. Member proteins include spiralin, the most abundant lipoprotein in Spiroplasma. 24
5952 411623 NF038030 spiralin_LP spiralin lipoprotein. Spiralin is the major lipoprotein in multiple species of Spiroplasma, a relative of the Mycoplasmas. 239
5953 411624 NF038031 PavB_Nterm PavB family adhesin N-terminal domain. This HMM describes the portion of PavB from Streptococcus pneumoniae, and closely related proteins from Streptococcus mitis and Streptococcus pseudopneumoniae, N-terminal to the repetitive region with variable numbers of SSURE (Streptococcal Surface REpeats) regions (see PF11966), which bind fibronectin. The PavB region is notable, in part, for its rare variant, WSIRR, of the YSIRK motif signal peptide. Full-length versions of proteins from this family have a C-terminal LPXTG-containing region for sortase-mediated anchoring to the cell wall. 128
5954 411625 NF038032 CehA_McbA_metalo CehA/McbA family metallohydrolase domain. This domain, a branch of the PHP superfamily, is found in several partially characterized metallohydrolases, including McbA and CehA. Both were studied as hydrolases of carbaryl, a xenobiotic compound that does not contain a phosphate group, suggesting that presuming members of this family to be phosphoesterases (like many PHP domain-containing proteins) may be incorrect. 315
5955 411626 NF038033 FEA1_rel_lipo FEA1-related lipoprotein. 406
5956 411627 NF038034 lactGbeta_entB lactococcin G-beta/enterocin 1071B family bacteriocin. This HMM was built to improve on Pfam model PF11632, which in version PF11632.8 had a two-member seed. It includes 12 residues of leader peptide and GlyGly cleavage motif (see TIGR01847), and has a shorter but more broadly conserved core peptide region. Characterized member proteins include lactococcin G and enterocin 1071B. 38
5957 411628 NF038035 lactGalph_entA lactococcin G-alpha/enterocin 1071A family bacteriocin. 34
5958 411629 NF038036 TCP11_Legionella TCP11-related protein. 927
5959 411630 NF038037 cytob_DsrM sulfate reduction electron transfer complex DsrMKJOP subunit DsrM. Proteins of this family are the DsrM subunit of the DsrMKJOP complex, which is a membrane-bound redox complex involved in sulfate reduction in sulfate-reducing organisms (SROs). The dsrM gene encodes a cytochrome b reductase, which usually has six transmembrane helices and five conserved histidines. 320
5960 411631 NF038038 cytoc_DsrJ sulfate reduction electron transfer complex DsrMKJOP subunit DsrJ. Proteins of this family are the DsrJ subunit of the DsrMKJOP complex, which is a membrane-bound redox complex involved in sulfate reduction in sulfate-reducing organisms (SROs). The dsrJ gene encodes a triheme periplasmic cytochrome c subunit, which contains three conserved heme c-binding sites (CXXCH) at the C-terminus. 119
5961 411632 NF038039 WGxxGxxG-CTERM WGxxGxxG-CTERM domain. This domain, a possible protein-sorting signal, begins with the motif that usually takes the form DWGW, followed by a hydrophobic (and probably membrane-spanning) stretch xGxxGxxGxxG (where x is hydrophobic), and then a short patch of mostly basic residues at the C-terminus. This domain has a broad taxonomic range that includes Firmicutes, Cyanobacteria, and Deinococcus. Members from the Firmicutes tend to occur together with a DUF3231 (PF11553) family protein. 19
5962 411633 NF038040 phero_PhrK_fam PhrK family phosphatase-inhibitory pheromone. 40
5963 411634 NF038041 fim_Mfa1_fam Mfa1 family fimbria major subunit. 483
5964 411635 NF038042 actinodefensin actinodefensin. The actinodefensin family is named (here) as an Actinomyces-specific branch of the (otherwise eukaryotic) arthropod defensin family described by Pfam model PF01097. 70
5965 411636 NF038043 act_def_assoc_A actinodefensin-associated protein A. Actinodefensin (see family NF038042) is a bacterial branch of the arthropod defensin family. Members of that family occur in the Actinomyces lineage, have a distinctive N-terminal region that may reflect how processing and transport occur, and are found in a conserved gene neighborhood. Actinodefensin-associated protein A is found exclusively in these conserved gene neighborhoods. 379
5966 411637 NF038044 act_def_assoc_B actinodefensin-associated protein B. Members of this family are small proteins, averaging about 70 amino acids in length, restricted to the Actinomycetes. Member proteins typically occur in the vicinity of actinodefensin, which represents an Actinomycetes-restricted branch of the arthropod defensin family. The function of this protein is unknown. 63
5967 411638 NF038045 GEF_RalF T4SS guanine nucleotide exchange effector RalF. 341
5968 411639 NF038047 not_Tcp10 AAWKG family protein. Members of this family are found primarily in Streptomyces. The family is notable in part because a region outside of the N-terminal region modeled here contains 9-residue repeats that resemble the 18-residue repeats found by PF07202 in the C-terminal region of eukaryotic T-complex protein 10. The family is uncharacterized. This model was constructed, and named for the family's most prominent motif, AAWKG, to head off any possible confusion with Tcp10. 398
5969 411640 NF038048 DIP1984_fam DIP1984 family protein. Members of this family, including the Corynebacterium diphtheriae protein DIP1984, which has a solved crystal structure, are uncharacterized with respect to function. Some members of this family previously have been annotated, incorrectly, as septolysin. This model was constructed to overrule and correct such errors. Note that septolysin O, and other members of the family of cholesterol-dependent cytolysins such as listeriolysin O (WP_003722731.1), are unrelated. 149
5970 411641 NF038049 SelD_rel_HyperS SelD-related putative sulfur metabolism protein. 465
5971 411642 NF038050 NrtS nitrate/nitrite transporter NrtS. NrtS family proteins were first identified and characterized in Synechococcus sp. PCC 7002. The homologous proteins NrtS1 and NrtS2 are encoded by two neighboring genes on the Synechococcus sp. PCC 7002 genome. The heteromeric transporter was shown to transport nitrite as well as nitrate. This HMM hits both NrtS1 and NrtS2 proteins, which have extremely high sequence identity and conserved motifs. 56
5972 411643 NF038051 MamC magnetosome protein MamC. Magnetosomes are membrane-enclosed organelles containing crystals of magnetite (Fe3O4) that cause magnetotactic bacteria to orient in magnetic fields. MamC interacts with the magnetite surface and affects the size and shape of the growing crystal. 97
5973 411644 NF038052 histone_lik_HC2 histone H1-like DNA-binding protein Hc2. This model describes highly repetitive, Lys and Arg-rich histone H1-like protein Hc2, as found in the genus Chlamydia. 201
5974 411645 NF038053 hist_H1_lk_Burk histone H1-like DNA-binding protein. This model describes histone H1-like repetitive basic proteins as found in Burkholderia and related species. It excludes the Hc2 family found in Chlamydia and involved in DNA condensation for the formation of elementary bodies. 192
5975 411646 NF038054 T3SS_SctI type III secretion system inner rod subunit SctI. This model describes protein SctI (Secretion and Cellular Translocation I), an inner rod protein in the basal body of the type III secretion system (T3SS) as found in many pathogenic bacteria. SctI has some sequence similarity to the needle filament protein SctF. Lineage-specific names for SctI in various bacteria include BsaK, PrgJ, and MxiI. 76
5976 411647 NF038055 T3SS_SctB_pilot type III secretion system translocon subunit SctB. One SctB and four SctE subunits, located at the tip of the type III secretion system (T3SS) injectosome, combine to form the translocon (translocator pore) in the membrane of targeted cells. Species-specific names for this highly variable component of T3SS include YopD, EspB, IpaC, SipC, etc. 311
5977 411648 NF038058 adhes_P110_Nter P110/LppT family adhesin N-terminal domain. Members of this family include the multifunctional adhesin P110 (as the name is used in Mycoplasma hyopneumoniae, not Mycoplasma genitalium) and homologs presumed also to be adhesins. Homologs include LppT (lipoprotein T), which despite its name seems to lack the signal peptide region Cys residue required for conversion into a lipoprotein, and paralogs MHP683 and MHP684 from Mycoplasma hyopneumoniae. 202
5978 411649 NF038065 Pr6Pr Pr6Pr family membrane protein. This family is defined by TCDB as the prokaryotic 6 TMS (Pr6Pr) family of membrane proteins (http://www.tcdb.org/search/result.php?tc=9.b.302). The function of proteins in this family is not understood. 162
5979 411650 NF038066 MptB polyprenol phosphomannose-dependent alpha 1,6 mannosyltransferase MptB. Proteins of this family are involved in the initiation of core alpha-(1,6) mannan biosynthesis of lipomannan (LM-A) and multi-mannosylated polymer (LM-B), extending triacylated phosphatidyl-myo-inositol dimannoside (Ac1PIM2) and mannosylated glycolipid, 1,2-di-O-C16/C18:1-(alpha-D-mannopyranosyl)-(1->4)-(alpha-D-glucopyranosyluronic acid)-(1->3)-glycerol (Man1GlcAGroAc2), respectively. 554
5980 411651 NF038067 OMP_CarO ornithine uptake porin CarO. The outer membrane porin CarO (carbapenem resistance-associated outer membrane protein), found in Acinetobacter, is of clinical interest because of its role in allowing entry of carbapenem antibiotics and its variability from strain to strain. CarO should not be confused with Omp33-36, an essentially unrelated outer membrane protein that is similar in size and that also affects carbapenem susceptibility. 249
5981 411652 NF038068 LaoB_over_CadC L-arginine responsive protein LaoB. LaoB (L-arginine responsive overlapping gene) is a small, rare protein, encoded in an antisense frame to the gene for CadC in some strains of Escherichia coli. CadC, in response to acid stress and the presence of sufficient lysine, activates expression of a lysine decarboxylation and lysine/cadaverine antiport system, which provides resistance to the acidity. 41
5982 411653 NF038070 LmbU_fam_TF LmbU family transcriptional regulator. LmbU is a well-described member of a family of DNA-binding transcriptional activators for natural product biosynthesis in Streptomyces and related species. Besides LmbU (lincomycin), some other members include CouE (coumermycin A1), CloE (clorobiocin), and HrmB (hormaomycin). 171
5983 411654 NF038071 lat_flg_LafA_2 lateral flagellin LafA. 283
5984 411655 NF038072 IcmL_DotI_only type IVB secretion system apparatus protein IcmL/DotI. 200
5985 411656 NF038073 rSAM_STM4011 STM4011 family radical SAM protein. Members of this family are putative radical SAM proteins that (at the time of HMM construction) fall outside the scope of Pfam model PF04055 available at the time (version 21), as many radical SAM proteins do. The function is unknown. Members are somewhat variable in architecture, and found mostly in Streptomyces and related species. However, the family is named for a member in Salmonella enterica, in the model strain LT2, protein STM4011, whose function is unknown. 278
5986 411657 NF038074 fam_STM4014 STM4014 family protein. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa. 350
5987 411658 NF038075 fam_STM4013 STM4013/SEN3800 family hydrolase. Members of this family are sulfatase-like metal-dependent hydrolases, of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. See PF00884 for information on related proteins. 261
5988 411659 NF038076 fam_STM4015 STM4015 family protein. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa. 286
5989 411660 NF038077 MFS_export_MxcK myxochelin export MFS transporter MxcK. 405
5990 411661 NF038078 NRPS_MxcG myxochelin non-ribosomal peptide synthetase MxcG. 1444
5991 411662 NF038079 TonB_sider_MxcH TonB-dependent siderophore myxochelin receptor MxcH. 796
5992 411663 NF038080 PG_bind_siph peptidoglycan-binding domain. This domain shows apparent homology to known or putative peptidoglycan-binding domains in families such as PF01471. The domain occurs once, or twice, at the C-terminus of proteins such as cell wall amidases. In particular, member proteins can be found among putative lysins of phage of Streptomyces from the Siphoviridae family, such as phiBT1. 76
5993 411664 NF038081 BN159_2729_fam BN159_2729 family protein. This uncharacterized protein family occurs in Streptomyces and related species. Some members have insertions of long stretches of low-complexity sequences. 237
5994 411665 NF038082 phiSA1p31 phiSA1p31 domain. This domain occurs in Streptomyces and related lineages, in proteins with highly variable architectures, typically at or near the C-terminus. Member proteins include at least two from known temperate phage of Streptomyces, including phiSA1p31, for which it is named, from Streptomyces phage phiSASD1. 60
5995 411666 NF038083 CU044_5270_fam CU044_5270 family protein. Members of this family occur largely in Streptomyces and related species, often with several members per genome. Lengths average about 340 amino acids. The function is unknown. 284
5996 411667 NF038084 DHCW_cupin DHCW motif cupin fold protein. Members of this uncharacterized protein family resemble other cupin superfamily small barrel proteins. This family has a signature motif, DHCW, for which the family is named. 106
5997 411668 NF038085 MSMEG_6728_fam MSMEG_6728 family protein. 149
5998 411669 NF038086 anchor_synt_A protein sorting system archaetidylserine synthase. Members of this family are homologs of CDP-diacylglycerol--serine O-phosphatidyltransferase PssA of subclass II, as found in Gram-positive bacteria, but occur in a branch of the archaea. In Haloferax volcanii, the member of this family HVO_1143, together with the PssD-related decarboxylase HVO_0146, were both shown to be required for the archaeosortase ArtA to cause removal of the PGF-CTERM sorting signal and replacement with a prenyl-derived C-terminal lipid anchor. Based on these observations, members of this family are suggested to be CDP-2,3-di-O-geranylgeranyl-sn-glycerol:l-serine O-archaetidyltransferase, generating archaetidylserine en route to archaetidylethanolamine biosynthesis. 222
5999 411670 NF038087 arch_ser_synth archaetidylserine synthase. Members of this family, including founding member MTH_1027 from Methanothermobacter thermautotrophicus, resemble subclass II bacterial CDP-diacylglycerol--serine O-phosphatidyltransferase, but act instead as CDP-2,3-Di-O-geranylgeranyl-sn-glycerol:L-serine O-archaetidyltransferase (archaetidylserine synthase). 216
6000 411671 NF038088 anchor_synt_D protein sorting system archaetidylserine decarboxylase. Members of this family, including founding member HVO_0146 from Haloferax volcanii, are archaeal homologs of bacterial phosphatidylserine decarboxylases (PssD). HVO_0146, and the PssA homolog HVO_1143, were shown to be required for archaeosortase A (ArtA)-mediated removal of the PGF-CTERM protein-sorting signal and replacement with a large, prenyl-derived, C-terminal anchoring lipid moiety that is proposed to be archaetidylethanolamine. 196
6001 411672 NF038090 IscA_HesB_Se IscA/HesB family protein. Members of this family, a large fraction of which are selenoproteins, are homologous to proteins of iron-sulfur cluster biosynthesis such as IscA, and belong to the broader set of HesB-related proteins. 99
6002 411673 NF038091 T4SS_VirB10 type IV secretion system protein VirB10. Members of this family are VirB10, an outer membrane-associated protein from the apparatus of protein type IV secretion systems (T4SS). The model attempts to exclude related TraI proteins of conjugal transfer systems as well as the ComB10 protein of a DNA-translocating competence protein of Helicobacter pylori. Because the N-terminal regions of VirB10 proteins are highly variable, the model 197
6003 411674 NF038092 T4SS_ComB10 DNA type IV secretion system protein ComB10. ComB10, a VirB10 homolog, is an outer membrane-associated component of a DNA-translocating type IV secretion system (T4SS) involved in competence. Most T4SS translocate proteins, but both ComB10 as found in Helicobacter, and the T4SS of Agrobacterium, translocate DNA. 225
6004 411675 NF038093 GrdX GrdX family protein. GrdX is a small protein, of unknown function, encoded in grd operons for selenocysteine-dependent glycine reductase systems. A small number of GrdX proteins appear to be encoded with a UGA-encoded selenocysteine residue appearing close to the N-terminus, at sites that do not align with Cys residues in other members of the family. This arrangement suggests that the ability to complete translation of GrdX selenoproteins may have regulatory value, in addition to whatever may be the molecular function of GrdX itself. 118
6005 411676 NF038094 CueP_fam CueP family metal-binding protein. Members of this family include CueP itself, a copper-binding periplasmic metallochaperone, as found in Salmonella enterica serovar Typhimurium. Many family members, although not CueP itself, are lipoproteins. Several other members of the family are selenoproteins, including members from Bacillus selenitireducens, Bacillus beveridgei, and others. 146
6006 411677 NF038095 met_chaper_CueP copper-binding periplasmic metallochaperone CueP. This narrowly built model for CueP includes periplasmic proteins from Salmonella enterica, in which it contributes to an increased tolerance to copper, and from various other Gram-negative bacteria. It does not include CueP lipoproteins from species such as Corynebacterium diphtheriae. 177
6007 411678 NF038096 thylak_slr1796 thylakoid membrane photosystem I accumulation factor. Members of this family, restricted to the Cyanobacteria and chloroplasts, show homology to thioredoxins. However, the core region of family protein alignment shows either one Cys residue or zero, suggesting family members may share the thioredoxin-like fold but lack redox capability. The founding member of the family, slr1796, was shown by proteomics to localize to the thylakoid membrane, consistent with taxonomic restriction to the Cyanobacteria. Targeted mutation of slr1796 shows a role in successful translation and insertion of photosystem I proteins into the thylakoid membrane. 157
6008 411679 NF038097 KCGN_DNA_rpt KCGN motif-containing spurious repeat. This AntiFam-type HMM recognizes spurious protein translations, often with the motif KCGN, of a DNA repeat widespread in the genus Leptospira. 21
6009 411680 NF038098 GyrA_w_intein intein-containing DNA gyrase subunit A. 1232
6010 411681 NF038099 AsSugarArsM arsenosugar biosynthesis arsenite methyltransferase ArsM. This form of arsenite methyltransferase works together with the radical SAM enzyme ArsS in a pathway of arsenosugar biosynthesis. Examples of ArsM such as slr0303 from Synechocystis sp. PCC 6803 and alr3095 from Nostoc sp. PCC 7120 are encoded next to ArsS. 321
6011 411682 NF038101 Trm112_arch methyltransferase partner Trm112. This HMM describes an archaeal branch of a small protein, Trm112, that is conserved in the three domains of life and that serves as a general activator of methyltransferases for RNA or protein. 59
6012 411683 NF038104 lipo_NF038104 NF038104 family lipoprotein. This family of small lipoproteins of unknown function, about 68 amino acids long, occurs in genera that include Acinetobacter, Moraxella, Neisseria, and Psychrobacter. The N-terminal half, including the lipoprotein signal peptide, shows significant sequence similarity to the divisome-associated lipoprotein YraP, a three-fold longer protein, as found in Escherichia coli. 62
6013 411684 NF038105 acin_NF038105 NF038105 family protein. This family of small proteins, about 66 amino acids long, appears universal in the first 20 species of Acinetobacter examined, but absent outside the genus. 62
6014 411685 NF038106 gamma_NF038106 PA4642 family protein. Members of this family are small (about 95 amino acids), uncharacterized, and apparently restricted to the Gammaproteobacteria. Members include PA4642 from Pseudomonas aeruginosa PAO1. 93
6015 411686 NF038107 rSAM_NF038107 Cys-every-fifth radical SAM/SPASM peptide maturase CefB. Members of this family are radical SAM/SPASM domain proteins, most of which perform post-translational modification on RiPP peptide precursors or enzyme subunits. Target residues for modification often are Cys residues. Members of family NF038108, with a Cys Every Fifth position (CefA) over most of the short length of that family, are the putative target RiPP proteins. 444
6016 411687 NF038108 RiPP_NF038108 Cys-every-fifth RiPP peptide CefA. Members typically are shorter than 100 residues, with from nine to eleven Cys residues spaced strictly as every fifth residue and usually with at least one adjacent Gly. Most family members occur in the vicinity of a peptide-modifying radical SAM/SPASM domain protein, marking those family members as putative RiPP (ribosomally translated, post-translationally modified peptide) precursors. Because of the small size, richness in Cys and Gly residues, and strictly repetitive nature, it may be expected that some predicted proteins, scoring above the thresholds for the model, are related by convergent evolution rather than by homology, and are not themselves RiPP precursors. 65
6017 411688 NF038110 Lys_methyl_FliB flagellin lysine-N-methylase. 375
6018 222768 PHA00002 A DNA replication initiation protein gpA 515
6019 222769 PHA00003 B internal scaffolding protein 120
6020 164773 PHA00006 D external scaffolding protein 151
6021 164774 PHA00007 E cell lysis protein 91
6022 222770 PHA00008 J DNA packaging protein 25
6023 164775 PHA00009 F capsid protein 427
6024 164776 PHA00010 G major spike protein 179
6025 222771 PHA00012 I assembly protein 361
6026 222772 PHA00019 IV phage assembly protein 428
6027 164777 PHA00022 VII minor coat protein 28
6028 106880 PHA00024 IX minor coat protein 33
6029 222773 PHA00025 VIII major coat protein 76
6030 133846 PHA00026 cp coat protein 129
6031 133847 PHA00027 lys lysis protein 58
6032 222774 PHA00028 rep RNA replicase, beta subunit 561
6033 222775 PHA00080 PHA00080 DksA-like zinc finger domain containing protein 72
6034 106886 PHA00094 VI minor coat protein 112
6035 164779 PHA00097 K protein K 56
6036 222776 PHA00098 PHA00098 hypothetical protein 112
6037 164781 PHA00099 PHA00099 minor capsid protein 147
6038 177266 PHA00101 PHA00101 internal virion protein B 194
6039 222777 PHA00144 PHA00144 major head protein 438
6040 133855 PHA00147 PHA00147 upper collar protein 308
6041 222778 PHA00148 PHA00148 lower collar protein 242
6042 222779 PHA00149 PHA00149 DNA encapsidation protein 331
6043 177267 PHA00159 PHA00159 endonuclease I 148
6044 222780 PHA00198 PHA00198 nonstructural protein 86
6045 177268 PHA00201 PHA00201 major capsid protein 343
6046 164786 PHA00202 PHA00202 DNA replication initiation protein 388
6047 164787 PHA00212 PHA00212 putative transcription regulator 63
6048 222781 PHA00276 PHA00276 phage lambda Rz-like lysis protein 144
6049 106901 PHA00280 PHA00280 putative NHN endonuclease 121
6050 164789 PHA00327 PHA00327 minor capsid protein 187
6051 222782 PHA00330 PHA00330 putative replication initiation protein 316
6052 222783 PHA00350 PHA00350 putative assembly protein 399
6053 177271 PHA00360 II replication initiation protein 421
6054 222784 PHA00363 PHA00363 major capsid protein 557
6055 222785 PHA00368 PHA00368 internal virion protein D 1315
6056 164794 PHA00369 H minor spike protein 325
6057 164795 PHA00370 III attachment protein 297
6058 222786 PHA00371 mat maturation protein 418
6059 133872 PHA00380 PHA00380 tail protein 599
6060 164796 PHA00404 PHA00404 hypothetical protein 42
6061 222787 PHA00405 PHA00405 hypothetical protein 85
6062 164797 PHA00406 PHA00406 hypothetical protein 48
6063 164798 PHA00407 PHA00407 phage lambda Rz1-like protein 84
6064 222788 PHA00415 25 baseplate wedge subunit 131
6065 133878 PHA00422 PHA00422 hypothetical protein 69
6066 164800 PHA00425 PHA00425 DNA packaging protein, small subunit 88
6067 164801 PHA00426 PHA00426 type II holin 67
6068 222789 PHA00428 PHA00428 tail tubular protein A 193
6069 222790 PHA00430 PHA00430 tail fiber protein 568
6070 222791 PHA00431 PHA00431 internal virion protein C 746
6071 177277 PHA00432 PHA00432 internal virion protein A 137
6072 222792 PHA00435 PHA00435 capsid assembly protein 306
6073 222793 PHA00437 PHA00437 tail assembly protein 94
6074 133887 PHA00438 PHA00438 hypothetical protein 81
6075 222794 PHA00439 PHA00439 exonuclease 286
6076 133889 PHA00440 PHA00440 host protein H-NS-interacting protein 98
6077 222795 PHA00441 PHA00441 hypothetical protein 89
6078 222796 PHA00442 PHA00442 host recBCD nuclease inhibitor 59
6079 177281 PHA00446 PHA00446 hypothetical protein 89
6080 177282 PHA00447 PHA00447 lysozyme 142
6081 133894 PHA00448 PHA00448 hypothetical protein 70
6082 164812 PHA00450 PHA00450 host dGTPase inhibitor 85
6083 177283 PHA00451 PHA00451 protein kinase 362
6084 222797 PHA00452 PHA00452 T3/T7-like RNA polymerase 807
6085 164815 PHA00453 PHA00453 hypothetical protein 41
6086 222798 PHA00454 PHA00454 ATP-dependent DNA ligase 315
6087 133900 PHA00455 PHA00455 hypothetical protein 85
6088 164817 PHA00456 PHA00456 hypothetical protein 34
6089 222799 PHA00457 PHA00457 inhibitor of host bacterial RNA polymerase 63
6090 222800 PHA00458 PHA00458 single-stranded DNA-binding protein 233
6091 222801 PHA00476 PHA00476 hypothetical protein 110
6092 133905 PHA00489 PHA00489 scaffolding protein 101
6093 133906 PHA00490 PHA00490 terminal protein 266
6094 222802 PHA00497 pol RNA-dependent RNA polymerase 673
6095 133907 PHA00510 PHA00510 transcriptional regulator 125
6096 222803 PHA00514 PHA00514 dsDNA binding protein 98
6097 133909 PHA00515 PHA00515 hypothetical protein 53
6098 222804 PHA00520 PHA00520 packaging NTPase P4 330
6099 222805 PHA00527 PHA00527 hypothetical protein 129
6100 133910 PHA00540 PHA00540 hypothetical protein 715
6101 106954 PHA00542 PHA00542 putative Cro-like protein 82
6102 164822 PHA00547 PHA00547 hypothetical protein 337
6103 177288 PHA00616 PHA00616 hypothetical protein 44
6104 177289 PHA00617 PHA00617 ribbon-helix-helix domain containing protein 80
6105 177290 PHA00619 PHA00619 CRISPR-associated Cas4-like protein 201
6106 106959 PHA00626 PHA00626 hypothetical protein 59
6107 222806 PHA00645 PHA00645 hypothetical protein 125
6108 133916 PHA00646 PHA00646 hypothetical protein 65
6109 106962 PHA00649 PHA00649 hypothetical protein 83
6110 106963 PHA00650 PHA00650 hypothetical protein 82
6111 106964 PHA00652 PHA00652 hypothetical protein 128
6112 164824 PHA00653 mtd major tropism determinant 381
6113 106966 PHA00657 PHA00657 crystallin beta/gamma motif-containing protein 2052
6114 106967 PHA00658 PHA00658 putative lysin 720
6115 133918 PHA00660 PHA00660 hypothetical protein 215
6116 106970 PHA00661 PHA00661 hypothetical protein 734
6117 222807 PHA00662 PHA00662 hypothetical protein 215
6118 106972 PHA00663 PHA00663 hypothetical protein 68
6119 106973 PHA00664 PHA00664 hypothetical protein 140
6120 106974 PHA00665 PHA00665 major capsid protein 329
6121 222808 PHA00666 PHA00666 putative protease 233
6122 106976 PHA00667 PHA00667 hypothetical protein 158
6123 222809 PHA00669 PHA00669 hypothetical protein 114
6124 106978 PHA00670 PHA00670 hypothetical protein 540
6125 106979 PHA00671 PHA00671 hypothetical protein 135
6126 133920 PHA00672 PHA00672 hypothetical protein 152
6127 106981 PHA00673 PHA00673 acetyltransferase domain containing protein 154
6128 106982 PHA00675 PHA00675 hypothetical protein 78
6129 106983 PHA00676 PHA00676 hypothetical protein 96
6130 106984 PHA00679 PHA00679 hypothetical protein 71
6131 106985 PHA00680 PHA00680 hypothetical protein 143
6132 222810 PHA00684 PHA00684 hypothetical protein 128
6133 106987 PHA00687 PHA00687 hypothetical protein 56
6134 106988 PHA00689 PHA00689 hypothetical protein 62
6135 106989 PHA00691 PHA00691 hypothetical protein 68
6136 106990 PHA00692 PHA00692 hypothetical protein 74
6137 222811 PHA00724 PHA00724 hypothetical protein 83
6138 177293 PHA00725 PHA00725 hypothetical protein 81
6139 177294 PHA00726 PHA00726 hypothetical protein 89
6140 222812 PHA00727 PHA00727 hypothetical protein 278
6141 177296 PHA00728 PHA00728 hypothetical protein 151
6142 177297 PHA00729 PHA00729 NTP-binding motif containing protein 226
6143 222813 PHA00730 int integrase 337
6144 222814 PHA00731 PHA00731 hypothetical protein 96
6145 177300 PHA00732 PHA00732 hypothetical protein 79
6146 177301 PHA00733 PHA00733 hypothetical protein 128
6147 177302 PHA00734 PHA00734 hypothetical protein 95
6148 177303 PHA00735 PHA00735 hypothetical protein 808
6149 177304 PHA00736 PHA00736 hypothetical protein 79
6150 177305 PHA00738 PHA00738 putative HTH transcription regulator 108
6151 177306 PHA00739 V3 structural protein VP3 92
6152 222815 PHA00742 PHA00742 hypothetical protein 211
6153 177308 PHA00743 PHA00743 helix-turn-helix protein 51
6154 164842 PHA00771 PHA00771 head assembly protein 151
6155 107010 PHA00780 PHA00780 hypothetical protein 80
6156 133939 PHA00781 PHA00781 hypothetical protein 59
6157 164843 PHA00821 PHA00821 hypothetical protein 295
6158 222816 PHA00911 21 prohead core scaffolding protein and protease 212
6159 222817 PHA00965 PHA00965 tail protein 588
6160 177310 PHA00979 PHA00979 putative major coat protein 77
6161 222818 PHA01075 PHA01075 major capsid protein 408
6162 107017 PHA01076 PHA01076 putative encapsidation protein 378
6163 222819 PHA01077 PHA01077 putative lower collar protein 251
6164 164848 PHA01078 PHA01078 putative upper collar protein 249
6165 164849 PHA01079 PHA01079 hypothetical protein 48
6166 164850 PHA01080 PHA01080 hypothetical protein 80
6167 133945 PHA01081 PHA01081 putative minor coat protein 104
6168 222820 PHA01082 PHA01082 putative transcription regulator 133
6169 164851 PHA01083 PHA01083 hypothetical protein 149
6170 107025 PHA01159 PHA01159 hypothetical protein 114
6171 107026 PHA01160 PHA01160 nonstructural protein 40
6172 222821 PHA01327 PHA01327 hypothetical protein 49
6173 164853 PHA01346 PHA01346 hypothetical protein 53
6174 107029 PHA01351 PHA01351 putative minor structural protein 1070
6175 107030 PHA01365 PHA01365 hypothetical protein 91
6176 222822 PHA01366 PHA01366 hypothetical protein 337
6177 133949 PHA01399 PHA01399 membrane protein P6 242
6178 164854 PHA01474 PHA01474 nonstructural protein 52
6179 107034 PHA01486 PHA01486 nonstructural protein 32
6180 107035 PHA01511 PHA01511 coat protein 430
6181 164855 PHA01513 mnt Mnt 82
6182 107037 PHA01514 PHA01514 O-antigen conversion protein C 485
6183 107038 PHA01516 PHA01516 hypothetical protein 98
6184 107039 PHA01519 PHA01519 hypothetical protein 115
6185 177311 PHA01547 PHA01547 putative internal virion protein A 206
6186 222823 PHA01548 PHA01548 hypothetical protein 167
6187 164858 PHA01622 PHA01622 CRISPR-associated Cas4-like protein 204
6188 222824 PHA01623 PHA01623 hypothetical protein 56
6189 222825 PHA01624 PHA01624 hypothetical protein 102
6190 164860 PHA01625 PHA01625 hypothetical protein 249
6191 222826 PHA01627 PHA01627 DNA binding protein 107
6192 164861 PHA01630 PHA01630 putative group 1 glycosyl transferase 331
6193 164862 PHA01631 PHA01631 hypothetical protein 176
6194 133953 PHA01632 PHA01632 hypothetical protein 64
6195 107050 PHA01633 PHA01633 putative glycosyl transferase group 1 335
6196 133954 PHA01634 PHA01634 hypothetical protein 156
6197 222827 PHA01635 PHA01635 hypothetical protein 231
6198 107053 PHA01707 dut 2'-deoxyuridine 5'-triphosphatase 158
6199 222828 PHA01732 PHA01732 proline-rich protein 94
6200 107055 PHA01733 PHA01733 hypothetical protein 153
6201 222829 PHA01735 PHA01735 hypothetical protein 76
6202 133956 PHA01740 PHA01740 putative single-stranded DNA-binding protein 158
6203 222830 PHA01745 PHA01745 hypothetical protein 306
6204 107059 PHA01746 PHA01746 hypothetical protein 131
6205 222831 PHA01747 PHA01747 putative ATP-dependent protease 425
6206 222832 PHA01748 PHA01748 hypothetical protein 60
6207 177316 PHA01749 PHA01749 coat protein 134
6208 107063 PHA01750 PHA01750 hypothetical protein 75
6209 222833 PHA01751 PHA01751 hypothetical protein 110
6210 177317 PHA01752 PHA01752 hypothetical protein 488
6211 133958 PHA01753 PHA01753 Holliday junction resolvase 121
6212 133959 PHA01754 PHA01754 hypothetical protein 69
6213 222834 PHA01755 PHA01755 hypothetical protein 562
6214 107069 PHA01756 PHA01756 hypothetical protein 268
6215 222835 PHA01757 PHA01757 hypothetical protein 98
6216 222836 PHA01769 PHA01769 hypothetical protein 98
6217 177318 PHA01782 PHA01782 hypothetical protein 177
6218 164869 PHA01790 PHA01790 streptodornase 326
6219 222837 PHA01794 PHA01794 hypothetical protein 134
6220 177320 PHA01795 PHA01795 hypothetical protein 280
6221 222838 PHA01806 PHA01806 hypothetical protein 200
6222 222839 PHA01807 PHA01807 hypothetical protein 153
6223 164872 PHA01808 PHA01808 putative structural protein 98
6224 107079 PHA01809 PHA01809 hypothetical protein 65
6225 177323 PHA01810 PHA01810 hypothetical protein 100
6226 177324 PHA01811 PHA01811 hypothetical protein 78
6227 177325 PHA01812 PHA01812 hypothetical protein 122
6228 107083 PHA01813 PHA01813 hypothetical protein 58
6229 107084 PHA01814 PHA01814 hypothetical protein 137
6230 107085 PHA01815 PHA01815 hypothetical protein 55
6231 107086 PHA01816 PHA01816 hypothetical protein 160
6232 177326 PHA01817 PHA01817 hypothetical protein 479
6233 107088 PHA01818 PHA01818 hypothetical protein 458
6234 107089 PHA01819 PHA01819 hypothetical protein 129
6235 222840 PHA01886 PHA01886 TM2 domain-containing protein 78
6236 177328 PHA01929 PHA01929 putative scaffolding protein 306
6237 222841 PHA01971 PHA01971 hypothetical protein 123
6238 222842 PHA01972 PHA01972 structural protein 828
6239 177330 PHA01976 PHA01976 helix-turn-helix protein 67
6240 177331 PHA02004 PHA02004 capsid protein 332
6241 222843 PHA02030 PHA02030 hypothetical protein 336
6242 222844 PHA02031 PHA02031 putative DnaG-like primase 266
6243 222845 PHA02046 PHA02046 hypothetical protein 99
6244 222846 PHA02047 PHA02047 phage lambda Rz1-like protein 101
6245 177336 PHA02053 PHA02053 hypothetical protein 115
6246 177337 PHA02054 PHA02054 hypothetical protein 94
6247 177338 PHA02057 PHA02057 ADP-ribosylation superfamily-like protein 319
6248 164889 PHA02067 PHA02067 hypothetical protein 221
6249 177339 PHA02078 PHA02078 hypothetical protein 54
6250 164890 PHA02085 PHA02085 hypothetical protein 87
6251 107108 PHA02086 PHA02086 hypothetical protein 88
6252 107109 PHA02087 PHA02087 hypothetical protein 83
6253 107110 PHA02088 PHA02088 hypothetical protein 125
6254 177340 PHA02090 PHA02090 hypothetical protein 79
6255 177341 PHA02091 PHA02091 hypothetical protein 72
6256 177342 PHA02092 PHA02092 hypothetical protein 108
6257 177343 PHA02094 PHA02094 hypothetical protein 81
6258 107115 PHA02095 PHA02095 hypothetical protein 84
6259 107116 PHA02096 PHA02096 hypothetical protein 103
6260 177344 PHA02097 PHA02097 hypothetical protein 59
6261 107118 PHA02098 PHA02098 hypothetical protein 56
6262 107119 PHA02099 PHA02099 hypothetical protein 84
6263 107120 PHA02100 PHA02100 hypothetical protein 112
6264 177345 PHA02101 PHA02101 hypothetical protein 101
6265 222847 PHA02102 PHA02102 hypothetical protein 72
6266 222848 PHA02103 PHA02103 hypothetical protein 135
6267 177347 PHA02104 PHA02104 hypothetical protein 89
6268 133990 PHA02105 PHA02105 hypothetical protein 68
6269 177348 PHA02106 PHA02106 hypothetical protein 91
6270 164900 PHA02107 PHA02107 hypothetical protein 216
6271 177349 PHA02108 PHA02108 hypothetical protein 48
6272 222849 PHA02109 PHA02109 hypothetical protein 233
6273 107130 PHA02110 PHA02110 hypothetical protein 98
6274 107131 PHA02114 PHA02114 hypothetical protein 127
6275 164902 PHA02115 PHA02115 hypothetical protein 105
6276 177351 PHA02117 PHA02117 glutathionylspermidine synthase domain-containing protein 397
6277 107134 PHA02118 PHA02118 hypothetical protein 202
6278 107135 PHA02119 PHA02119 hypothetical protein 87
6279 177352 PHA02122 PHA02122 hypothetical protein 65
6280 107137 PHA02123 PHA02123 hypothetical protein 146
6281 133998 PHA02125 PHA02125 thioredoxin-like protein 75
6282 222850 PHA02126 PHA02126 hypothetical protein 153
6283 107140 PHA02127 PHA02127 hypothetical protein 57
6284 107141 PHA02128 PHA02128 hypothetical protein 151
6285 107142 PHA02130 PHA02130 hypothetical protein 81
6286 107143 PHA02131 PHA02131 hypothetical protein 70
6287 107144 PHA02132 PHA02132 hypothetical protein 86
6288 107145 PHA02135 PHA02135 hypothetical protein 122
6289 177353 PHA02141 PHA02141 hypothetical protein 105
6290 134000 PHA02142 PHA02142 putative RNA ligase 366
6291 107148 PHA02145 PHA02145 hypothetical protein 230
6292 107149 PHA02146 PHA02146 hypothetical protein 86
6293 107150 PHA02148 PHA02148 hypothetical protein 110
6294 134001 PHA02150 PHA02150 hypothetical protein 77
6295 177354 PHA02151 PHA02151 hypothetical protein 217
6296 107153 PHA02152 PHA02152 hypothetical protein 96
6297 107154 PHA02239 PHA02239 putative protein phosphatase 235
6298 107155 PHA02241 PHA02241 hypothetical protein 182
6299 107156 PHA02243 PHA02243 hypothetical protein 160
6300 107157 PHA02244 PHA02244 ATPase-like protein 383
6301 177355 PHA02246 PHA02246 hypothetical protein 192
6302 134004 PHA02248 PHA02248 hypothetical protein 204
6303 177356 PHA02256 PHA02256 hypothetical protein 113
6304 107161 PHA02264 PHA02264 hypothetical protein 152
6305 164905 PHA02265 PHA02265 hypothetical protein 103
6306 107163 PHA02275 PHA02275 hypothetical protein 125
6307 107164 PHA02277 PHA02277 hypothetical protein 150
6308 177357 PHA02278 PHA02278 thioredoxin-like protein 103
6309 107166 PHA02283 PHA02283 hypothetical protein 210
6310 107167 PHA02284 PHA02284 hypothetical protein 251
6311 107168 PHA02290 PHA02290 hypothetical protein 234
6312 177358 PHA02291 PHA02291 hypothetical protein 132
6313 177359 PHA02310 PHA02310 hypothetical protein 130
6314 164907 PHA02324 PHA02324 hypothetical protein 47
6315 177360 PHA02325 PHA02325 hypothetical protein 72
6316 164909 PHA02334 PHA02334 hypothetical protein 64
6317 164910 PHA02335 PHA02335 hypothetical protein 118
6318 177361 PHA02337 PHA02337 putative high light inducible protein 35
6319 164912 PHA02357 PHA02357 hypothetical protein 81
6320 222851 PHA02358 PHA02358 hypothetical protein 194
6321 107178 PHA02360 PHA02360 hypothetical protein 70
6322 107179 PHA02414 PHA02414 hypothetical protein 111
6323 177362 PHA02415 PHA02415 DNA primase domain-containing protein 930
6324 107181 PHA02416 PHA02416 hypothetical protein 167
6325 164914 PHA02417 PHA02417 hypothetical protein 83
6326 107183 PHA02436 PHA02436 hypothetical protein 52
6327 177363 PHA02446 PHA02446 hypothetical protein 166
6328 164916 PHA02447 PHA02447 hypothetical protein 86
6329 107186 PHA02448 PHA02448 hypothetical protein 192
6330 134010 PHA02450 PHA02450 hypothetical protein 53
6331 177364 PHA02451 PHA02451 hypothetical protein 54
6332 164918 PHA02456 PHA02456 zinc metallopeptidase motif-containing protein 141
6333 164919 PHA02458 A protein A*; Reviewed 341
6334 177365 PHA02503 PHA02503 putative transcription regulator; Provisional 57
6335 107192 PHA02508 PHA02508 putative minor coat protein; Provisional 93
6336 222852 PHA02510 X gene X product; Reviewed 116
6337 177367 PHA02513 V1 structural protein V1; Reviewed 135
6338 107197 PHA02515 PHA02515 hypothetical protein; Provisional 508
6339 134016 PHA02516 W baseplate wedge subunit; Provisional 103
6340 222853 PHA02517 PHA02517 putative transposase OrfB; Reviewed 277
6341 222854 PHA02518 PHA02518 ParA-like protein; Provisional 211
6342 107201 PHA02519 PHA02519 plasmid partition protein SopA; Reviewed 387
6343 164924 PHA02523 43B DNA polymerase subunit B; Provisional 391
6344 164925 PHA02524 43A DNA polymerase subunit A; Provisional 498
6345 177369 PHA02528 43 DNA polymerase; Provisional 881
6346 222855 PHA02529 O capsid-scaffolding protein; Provisional 278
6347 222856 PHA02530 pseT polynucleotide kinase; Provisional 300
6348 222857 PHA02531 20 portal vertex protein; Provisional 514
6349 222858 PHA02533 17 large terminase protein; Provisional 534
6350 222859 PHA02535 P terminase ATPase subunit; Provisional 581
6351 222860 PHA02536 Q portal vertex protein; Provisional 346
6352 222861 PHA02537 M terminase endonuclease subunit; Provisional 230
6353 164934 PHA02538 N capsid protein; Provisional 348
6354 222862 PHA02539 18 tail sheath protein; Provisional 648
6355 222863 PHA02540 61 DNA primase; Provisional 337
6356 177376 PHA02541 23 major capsid protein; Provisional 518
6357 222864 PHA02542 41 41 helicase; Provisional 473
6358 222865 PHA02543 regA translation repressor protein; Provisional 125
6359 222866 PHA02544 44 clamp loader, small subunit; Provisional 316
6360 177380 PHA02545 45 sliding clamp; Provisional 223
6361 222867 PHA02546 47 endonuclease subunit; Provisional 340
6362 222868 PHA02547 55 RNA polymerase sigma factor; Provisional 179
6363 177383 PHA02548 24 capsid vertex protein; Provisional 412
6364 222869 PHA02550 32 single-stranded DNA binding protein; Provisional 304
6365 177385 PHA02551 19 tail tube protein; Provisional 163
6366 222870 PHA02552 4 head completion protein; Provisional 151
6367 222871 PHA02553 6 baseplate wedge subunit; Provisional 611
6368 177388 PHA02554 13 neck protein; Provisional 311
6369 222872 PHA02555 14 neck protein; Provisional 216
6370 222873 PHA02556 15 tail sheath stabilizer and completion protein; Provisional 273
6371 222874 PHA02557 22 prohead core protein; Provisional 271
6372 222875 PHA02558 uvsW UvsW helicase; Provisional 501
6373 222876 PHA02559 59 59 protein; Provisional 216
6374 164955 PHA02560 FI major tail sheath protein; Provisional 388
6375 222877 PHA02561 D tail protein; Provisional 351
6376 222878 PHA02562 46 endonuclease subunit; Provisional 562
6377 222879 PHA02563 PHA02563 DNA polymerase; Provisional 630
6378 222880 PHA02564 V virion protein; Provisional 141
6379 177395 PHA02565 49 recombination endonuclease VII; Provisional 157
6380 222881 PHA02566 alt ADP-ribosyltransferase; Provisional 684
6381 222882 PHA02567 rnh RnaseH; Provisional 304
6382 164963 PHA02568 J baseplate assembly protein; Provisional 300
6383 177398 PHA02569 39 DNA topoisomerase II large subunit; Provisional 602
6384 177399 PHA02570 dexA exonuclease; Provisional 220
6385 177400 PHA02571 a-gt.4 hypothetical protein; Provisional 109
6386 222883 PHA02572 nrdA ribonucleoside-diphosphate reductase subunit alpha; Provisional 753
6387 222884 PHA02573 30.3 hypothetical protein; Provisional 148
6388 177403 PHA02574 57B hypothetical protein; Provisional 149
6389 222885 PHA02575 1 deoxynucleoside monophosphate kinase; Provisional 227
6390 177405 PHA02576 3 tail completion and sheath stabilizer protein; Provisional 177
6391 222886 PHA02577 2 DNA end protector protein; Provisional 181
6392 177407 PHA02578 53 baseplate wedge subunit; Provisional 181
6393 177408 PHA02579 7 baseplate wedge subunit; Provisional 1030
6394 177409 PHA02580 8 baseplate wedge subunit; Provisional 331
6395 222887 PHA02581 9 baseplate wedge tail fiber connector; Provisional 284
6396 222888 PHA02582 10 baseplate wedge subunit and tail pin; Provisional 604
6397 222889 PHA02583 11 baseplate wedge subunit and tail pin; Provisional 218
6398 222890 PHA02584 34 long tail fiber, proximal subunit; Provisional 1229
6399 222891 PHA02585 16 small terminase protein; Provisional 161
6400 222892 PHA02586 68 prohead core protein; Provisional 140
6401 222893 PHA02587 30 DNA ligase; Provisional 488
6402 222894 PHA02588 cd deoxycytidylate deaminase; Provisional 168
6403 222895 PHA02589 rnlA RNA ligase A; Provisional 378
6404 164985 PHA02590 PHA02590 hypothetical protein; Provisional 105
6405 164986 PHA02591 PHA02591 hypothetical protein; Provisional 83
6406 222896 PHA02592 52 DNA topoisomerase II medium subunit; Provisional 439
6407 222897 PHA02593 62 clamp loader small subunit; Provisional 191
6408 222898 PHA02594 nadV nicotinamide phosphoribosyl transferase; Provisional 470
6409 222899 PHA02595 tk.4 hypothetical protein; Provisional 154
6410 222900 PHA02596 5 baseplate hub subunit and tail lysozyme; Provisional 576
6411 222901 PHA02597 30.2 hypothetical protein; Provisional 197
6412 222902 PHA02598 denA endonuclease II; Provisional 138
6413 222903 PHA02599 dsbA double-stranded DNA binding protein; Provisional 91
6414 164995 PHA02600 FII major tail tube protein; Provisional 169
6415 222904 PHA02601 int integrase; Provisional 333
6416 177427 PHA02602 56 dCTP pyrophosphatase; Provisional 172
6417 222905 PHA02603 nrdC.11 hypothetical protein; Provisional 330
6418 177429 PHA02604 rI.-1 hypothetical protein; Provisional 126
6419 177430 PHA02605 54 baseplate subunit; Provisional 305
6420 222906 PHA02606 5.1 hypothetical protein; Provisional 179
6421 177432 PHA02607 wac fibritin; Provisional 454
6422 177433 PHA02608 67 prohead core protein; Provisional 80
6423 165004 PHA02609 uvsW.1 hypothetical protein; Provisional 76
6424 165005 PHA02610 uvsY.-2 hypothetical protein; Provisional 53
6425 222907 PHA02611 51 baseplate hub assembly protein; Provisional 249
6426 222908 PHA02612 27 baseplate hub subunit; Provisional 372
6427 222909 PHA02613 48 baseplate subunit; Provisional 361
6428 222910 PHA02614 PHA02614 Major capsid protein VP1; Provisional 363
6429 222911 PHA02616 PHA02616 VP2/VP3; Provisional 259
6430 177439 PHA02620 PHA02620 VP3; Provisional 353
6431 177440 PHA02621 PHA02621 agnoprotein; Provisional 68
6432 222912 PHA02624 PHA02624 large T antigen; Provisional 647
6433 177442 PHA02627 PHA02627 hypothetical protein; Provisional 73
6434 165015 PHA02629 PHA02629 A-type inclusion body protein; Provisional 61
6435 165016 PHA02633 PHA02633 hypothetical protein; Provisional 63
6436 165017 PHA02634 PHA02634 hypothetical protein; Provisional 49
6437 165018 PHA02635 PHA02635 ankyrin-like protein; Provisional 61
6438 165019 PHA02636 PHA02636 hypothetical protein; Provisional 47
6439 222913 PHA02637 PHA02637 TNF-alpha-receptor-like protein; Provisional 127
6440 165021 PHA02638 PHA02638 CC chemokine receptor-like protein; Provisional 417
6441 165022 PHA02639 PHA02639 EEV host range protein; Provisional 295
6442 165023 PHA02641 PHA02641 hypothetical protein; Provisional 188
6443 165024 PHA02642 PHA02642 C-type lectin-like protein; Provisional 216
6444 165025 PHA02643 PHA02643 hypothetical protein; Provisional 82
6445 165026 PHA02644 PHA02644 hypothetical protein; Provisional 112
6446 165027 PHA02646 PHA02646 virion protein; Provisional 156
6447 165029 PHA02649 PHA02649 hypothetical protein; Provisional 95
6448 165030 PHA02650 PHA02650 hypothetical protein; Provisional 81
6449 165031 PHA02651 PHA02651 IL-1 receptor antagonist; Provisional 165
6450 165032 PHA02652 PHA02652 hypothetical protein; Provisional 70
6451 177443 PHA02653 PHA02653 RNA helicase NPH-II; Provisional 675
6452 165034 PHA02655 PHA02655 hypothetical protein; Provisional 94
6453 165035 PHA02656 PHA02656 viral TNFR II-like protein; Provisional 199
6454 165036 PHA02657 PHA02657 hypothetical protein; Provisional 95
6455 165037 PHA02658 PHA02658 hypothetical protein; Provisional 92
6456 165038 PHA02659 PHA02659 endothelin precursor; Provisional 70
6457 165039 PHA02660 PHA02660 serpin-like protein; Provisional 364
6458 177444 PHA02661 PHA02661 vascular endothelial growth factor like protein; Provisional 146
6459 177445 PHA02662 PHA02662 ORF131 putative membrane protein; Provisional 226
6460 177446 PHA02663 PHA02663 hypothetical protein; Provisional 172
6461 177447 PHA02664 PHA02664 hypothetical protein; Provisional 534
6462 177448 PHA02665 PHA02665 hypothetical protein; Provisional 322
6463 222914 PHA02666 PHA02666 hypothetical protein; Provisional 287
6464 177450 PHA02668 PHA02668 GM-CSF/IL-2 inhibition factor; Provisional 265
6465 177451 PHA02669 PHA02669 hypothetical protein; Provisional 210
6466 222915 PHA02670 PHA02670 ORF112 putative chemokine-binding protein; Provisional 287
6467 177453 PHA02671 PHA02671 hypothetical protein; Provisional 179
6468 177454 PHA02672 PHA02672 ORF110 EEV glycoprotein; Provisional 166
6469 177455 PHA02673 PHA02673 ORF109 EEV glycoprotein; Provisional 161
6470 177456 PHA02674 PHA02674 ORF107 virion morphogenesis; Provisional 60
6471 177457 PHA02675 PHA02675 ORF104 fusion protein; Provisional 90
6472 177458 PHA02676 PHA02676 A-type inclusion protein; Provisional 520
6473 222916 PHA02677 PHA02677 hypothetical protein; Provisional 108
6474 177460 PHA02678 PHA02678 hypothetical protein; Provisional 89
6475 177461 PHA02679 PHA02679 ORF091 IMV membrane protein; Provisional 53
6476 177462 PHA02680 PHA02680 ORF090 IMV phosphorylated membrane protein; Provisional 91
6477 222917 PHA02681 PHA02681 ORF089 virion membrane protein; Provisional 92
6478 177464 PHA02682 PHA02682 ORF080 virion core protein; Provisional 280
6479 177465 PHA02683 PHA02683 ORF078 thioredoxin-like protein; Provisional 75
6480 177466 PHA02684 PHA02684 ORF066 virion protein; Provisional 221
6481 177467 PHA02685 PHA02685 ORF065 virion protein; Provisional 155
6482 177468 PHA02686 PHA02686 hypothetical protein; Provisional 138
6483 222918 PHA02687 PHA02687 ORF061 late transcription factor VLTF-4; Provisional 231
6484 222919 PHA02688 PHA02688 ORF059 IMV protein VP55; Provisional 323
6485 177471 PHA02689 PHA02689 ORF051 putative membrane protein; Provisional 128
6486 222920 PHA02690 PHA02690 hypothetical protein; Provisional 90
6487 177473 PHA02691 PHA02691 hypothetical protein; Provisional 110
6488 177474 PHA02692 PHA02692 hypothetical protein; Provisional 70
6489 177475 PHA02693 PHA02693 hypothetical protein; Provisional 710
6490 177476 PHA02694 PHA02694 hypothetical protein; Provisional 292
6491 177477 PHA02695 PHA02695 hypothetical protein; Provisional 725
6492 222921 PHA02696 PHA02696 hypothetical protein; Provisional 79
6493 222922 PHA02697 PHA02697 hypothetical protein; Provisional 255
6494 177480 PHA02698 PHA02698 hypothetical protein; Provisional 89
6495 165075 PHA02699 PHA02699 hypothetical protein; Provisional 466
6496 177481 PHA02700 PHA02700 ORF017 DNA-binding phosphoprotein; Provisional 106
6497 177482 PHA02701 PHA02701 ORF020 dsRNA-binding PKR inhibitor; Provisional 183
6498 177483 PHA02702 PHA02702 ORF033 IMV membrane protein; Provisional 78
6499 165079 PHA02703 PHA02703 ORF007 dUTPase; Provisional 165
6500 165080 PHA02705 PHA02705 hypothetical protein; Provisional 72
6501 165081 PHA02706 PHA02706 hypothetical protein; Provisional 58
6502 165082 PHA02707 PHA02707 hypothetical protein; Provisional 37
6503 177484 PHA02708 PHA02708 hypothetical protein; Provisional 148
6504 165084 PHA02709 PHA02709 hypothetical protein; Provisional 44
6505 165085 PHA02711 PHA02711 Toll/IL-receptor-like protein; Provisional 190
6506 165086 PHA02713 PHA02713 hypothetical protein; Provisional 557
6507 165087 PHA02714 PHA02714 CD-30-like protein; Provisional 110
6508 165088 PHA02715 PHA02715 hypothetical protein; Provisional 202
6509 165089 PHA02716 PHA02716 CPXV016; CPX019; EVM010; Provisional 764
6510 165090 PHA02718 PHA02718 hypothetical protein; Provisional 69
6511 165092 PHA02723 PHA02723 hypothetical protein; Provisional 77
6512 165093 PHA02724 PHA02724 hydrophobic IMV membrane protein; Provisional 53
6513 165094 PHA02725 PHA02725 hypothetical protein; Provisional 170
6514 165095 PHA02726 PHA02726 hypothetical protein; Provisional 94
6515 165096 PHA02728 PHA02728 uncharacterized protein; Provisional 184
6516 165097 PHA02729 PHA02729 hypothetical protein; Provisional 94
6517 165098 PHA02730 PHA02730 ankyrin-like protein; Provisional 672
6518 177485 PHA02731 PHA02731 putative integrase; Provisional 231
6519 165099 PHA02732 PHA02732 hypothetical protein; Provisional 1467
6520 165101 PHA02734 PHA02734 coat protein; Provisional 149
6521 165102 PHA02735 PHA02735 putative DNA polymerase type B; Provisional 716
6522 165103 PHA02736 PHA02736 Viral ankyrin protein; Provisional 154
6523 165104 PHA02737 PHA02737 hypothetical protein; Provisional 72
6524 222923 PHA02738 PHA02738 hypothetical protein; Provisional 320
6525 222924 PHA02739 PHA02739 hypothetical protein; Provisional 116
6526 165107 PHA02740 PHA02740 protein tyrosine phosphatase; Provisional 298
6527 165108 PHA02741 PHA02741 hypothetical protein; Provisional 169
6528 165109 PHA02742 PHA02742 protein tyrosine phosphatase; Provisional 303
6529 222925 PHA02743 PHA02743 Viral ankyrin protein; Provisional 166
6530 165111 PHA02744 PHA02744 hypothetical protein; Provisional 88
6531 222926 PHA02745 PHA02745 hypothetical protein; Provisional 265
6532 165113 PHA02746 PHA02746 protein tyrosine phosphatase; Provisional 323
6533 165114 PHA02747 PHA02747 protein tyrosine phosphatase; Provisional 312
6534 165115 PHA02748 PHA02748 viral innexin-like protein; Provisional 360
6535 165116 PHA02749 PHA02749 hypothetical protein; Provisional 322
6536 165117 PHA02750 PHA02750 hypothetical protein; Provisional 240
6537 165118 PHA02751 PHA02751 hypothetical protein; Provisional 233
6538 177486 PHA02752 PHA02752 hypothetical protein; Provisional 242
6539 165120 PHA02753 PHA02753 hypothetical protein; Provisional 298
6540 165121 PHA02754 PHA02754 hypothetical protein; Provisional 67
6541 165122 PHA02755 PHA02755 hypothetical protein; Provisional 96
6542 165123 PHA02756 PHA02756 hypothetical protein; Provisional 164
6543 165124 PHA02757 PHA02757 hypothetical protein; Provisional 75
6544 165125 PHA02758 PHA02758 hypothetical protein; Provisional 321
6545 165126 PHA02759 PHA02759 virus coat protein VP2; Provisional 245
6546 165127 PHA02762 PHA02762 hypothetical protein; Provisional 62
6547 177487 PHA02763 PHA02763 hypothetical protein; Provisional 102
6548 165129 PHA02764 PHA02764 hypothetical protein; Provisional 399
6549 165130 PHA02765 PHA02765 hypothetical protein; Provisional 117
6550 165131 PHA02766 PHA02766 hypothetical protein; Provisional 73
6551 165132 PHA02767 PHA02767 hypothetical protein; Provisional 101
6552 165133 PHA02768 PHA02768 hypothetical protein; Provisional 55
6553 165134 PHA02769 PHA02769 hypothetical protein; Provisional 154
6554 165135 PHA02770 PHA02770 hypothetical protein; Provisional 81
6555 165136 PHA02771 PHA02771 hypothetical protein; Provisional 90
6556 165137 PHA02772 PHA02772 hypothetical protein; Provisional 95
6557 165138 PHA02773 PHA02773 hypothetical protein; Provisional 112
6558 222927 PHA02774 PHA02774 E1; Provisional 613
6559 165140 PHA02775 PHA02775 E6; Provisional 160
6560 165141 PHA02776 PHA02776 E7 protein; Provisional 101
6561 165142 PHA02777 PHA02777 major capsid L1 protein; Provisional 555
6562 222928 PHA02778 PHA02778 major capsid L1 protein; Provisional 503
6563 222929 PHA02779 PHA02779 E6 protein; Provisional 150
6564 177490 PHA02780 PHA02780 hypothetical protein; Provisional 73
6565 165146 PHA02781 PHA02781 hypothetical protein; Provisional 78
6566 165147 PHA02782 PHA02782 hypothetical protein; Provisional 503
6567 165148 PHA02783 PHA02783 uncharacterized protein; Provisional 181
6568 165149 PHA02785 PHA02785 IL-beta-binding protein; Provisional 326
6569 222930 PHA02786 PHA02786 uncharacterized protein; Provisional 192
6570 165152 PHA02789 PHA02789 uncharacterized protein; Provisional 173
6571 165153 PHA02790 PHA02790 Kelch-like protein; Provisional 480
6572 165154 PHA02791 PHA02791 ankyrin-like protein; Provisional 284
6573 165155 PHA02792 PHA02792 ankyrin-like protein; Provisional 631
6574 165156 PHA02793 PHA02793 hypothetical protein; Provisional 66
6575 165157 PHA02795 PHA02795 ankyrin-like protein; Provisional 437
6576 222931 PHA02798 PHA02798 ankyrin-like protein; Provisional 489
6577 165159 PHA02800 PHA02800 hypothetical protein; Provisional 161
6578 165161 PHA02807 PHA02807 hypothetical protein; Provisional 155
6579 222932 PHA02809 PHA02809 hypothetical protein; Provisional 111
6580 165163 PHA02811 PHA02811 putative host range protein; Provisional 197
6581 165164 PHA02813 PHA02813 hypothetical protein; Provisional 354
6582 165165 PHA02815 PHA02815 hypothetical protein; Provisional 64
6583 222933 PHA02816 PHA02816 hypothetical protein; Provisional 106
6584 165167 PHA02817 PHA02817 EEV Host range protein; Provisional 225
6585 165168 PHA02818 PHA02818 hypothetical protein; Provisional 92
6586 165169 PHA02819 PHA02819 hypothetical protein; Provisional 71
6587 222934 PHA02820 PHA02820 phospholipase-D-like protein; Provisional 424
6588 222935 PHA02823 PHA02823 chemokine binding protein; Provisional 255
6589 177491 PHA02825 PHA02825 LAP/PHD finger-like protein; Provisional 162
6590 165173 PHA02826 PHA02826 IL-1 receptor-like protein; Provisional 227
6591 177492 PHA02827 PHA02827 hypothetical protein; Provisional 150
6592 165175 PHA02828 PHA02828 putative transmembrane protein; Provisional 100
6593 165176 PHA02831 PHA02831 EEV host range protein; Provisional 268
6594 165177 PHA02834 PHA02834 chemokine receptor-like protein; Provisional 323
6595 165178 PHA02835 PHA02835 putative secreted protein; Provisional 186
6596 165179 PHA02836 PHA02836 putative transmembrane protein; Provisional 153
6597 165180 PHA02837 PHA02837 uncharacterized protein; Provisional 190
6598 165181 PHA02838 PHA02838 hypothetical protein; Provisional 68
6599 165182 PHA02839 PHA02839 IL-24-like protein; Provisional 156
6600 165183 PHA02840 PHA02840 hypothetical protein; Provisional 82
6601 165184 PHA02841 PHA02841 hypothetical protein; Provisional 103
6602 165185 PHA02843 PHA02843 hypothetical protein; Provisional 73
6603 165186 PHA02844 PHA02844 putative transmembrane protein; Provisional 75
6604 165187 PHA02845 PHA02845 hypothetical protein; Provisional 91
6605 165188 PHA02849 PHA02849 putative transmembrane protein; Provisional 82
6606 165189 PHA02851 PHA02851 EEV glycoprotein; Provisional 223
6607 165190 PHA02852 PHA02852 putative virion structural protein; Provisional 153
6608 165191 PHA02854 PHA02854 putative host range protein; Provisional 178
6609 222936 PHA02855 PHA02855 anti-apoptotic membrane protein; Provisional 180
6610 165193 PHA02857 PHA02857 monoglyceride lipase; Provisional 276
6611 165194 PHA02858 PHA02858 EIF2a-like PKR inhibitor; Provisional 86
6612 165195 PHA02859 PHA02859 ankyrin repeat protein; Provisional 209
6613 165196 PHA02861 PHA02861 uncharacterized protein; Provisional 149
6614 165197 PHA02862 PHA02862 5L protein; Provisional 156
6615 222937 PHA02864 PHA02864 hypothetical protein; Provisional 240
6616 165199 PHA02865 PHA02865 MHC-like TNF binding protein; Provisional 338
6617 165200 PHA02866 PHA02866 Hypothetical protein; Provisional 333
6618 165201 PHA02867 PHA02867 C-type lectin protein; Provisional 167
6619 165202 PHA02869 PHA02869 C4L/C10L-like gene family protein; Provisional 418
6620 165203 PHA02871 PHA02871 hypothetical protein; Provisional 222
6621 222938 PHA02872 PHA02872 EFc gene family protein; Provisional 124
6622 165205 PHA02874 PHA02874 ankyrin repeat protein; Provisional 434
6623 165206 PHA02875 PHA02875 ankyrin repeat protein; Provisional 413
6624 165207 PHA02876 PHA02876 ankyrin repeat protein; Provisional 682
6625 222939 PHA02878 PHA02878 ankyrin repeat protein; Provisional 477
6626 222940 PHA02880 PHA02880 hypothetical protein; Provisional 189
6627 165210 PHA02881 PHA02881 hypothetical protein; Provisional 161
6628 165211 PHA02882 PHA02882 putative serine/threonine kinase; Provisional 294
6629 165212 PHA02884 PHA02884 ankyrin repeat protein; Provisional 300
6630 165213 PHA02885 PHA02885 putative interleukin binding protein; Provisional 135
6631 165214 PHA02887 PHA02887 EGF-like protein; Provisional 126
6632 165215 PHA02888 PHA02888 hypothetical protein; Provisional 96
6633 165216 PHA02889 PHA02889 hypothetical protein; Provisional 241
6634 165217 PHA02890 PHA02890 hypothetical protein; Provisional 278
6635 165218 PHA02891 PHA02891 hypothetical protein; Provisional 120
6636 165219 PHA02892 PHA02892 hypothetical protein; Provisional 75
6637 165220 PHA02893 PHA02893 hypothetical protein; Provisional 88
6638 165221 PHA02894 PHA02894 hypothetical protein; Provisional 97
6639 165222 PHA02896 PHA02896 A-type inclusion like protein; Provisional 616
6640 165223 PHA02898 PHA02898 virion envelope protein; Provisional 92
6641 222941 PHA02901 PHA02901 virus redox protein; Provisional 75
6642 165225 PHA02902 PHA02902 putative IMV membrane protein; Provisional 70
6643 165226 PHA02907 PHA02907 hypothetical protein; Provisional 182
6644 165227 PHA02909 PHA02909 hypothetical protein; Provisional 72
6645 165228 PHA02910 PHA02910 hypothetical protein; Provisional 171
6646 177496 PHA02911 PHA02911 C-type lectin-like protein; Provisional 213
6647 177497 PHA02913 PHA02913 TGF-beta-like protein; Provisional 172
6648 165230 PHA02914 PHA02914 Immunoglobulin-like domain protein; Provisional 500
6649 165231 PHA02917 PHA02917 ankyrin-like protein; Provisional 661
6650 165232 PHA02919 PHA02919 host-range protein; Provisional 150
6651 165233 PHA02920 PHA02920 putative virulence factor; Provisional 117
6652 165234 PHA02922 PHA02922 hypothetical protein; Provisional 153
6653 165235 PHA02923 PHA02923 hypothetical protein; Provisional 315
6654 222942 PHA02924 PHA02924 hypothetical protein; Provisional 156
6655 165237 PHA02926 PHA02926 zinc finger-like protein; Provisional 242
6656 222943 PHA02927 PHA02927 secreted complement-binding protein; Provisional 263
6657 165239 PHA02928 PHA02928 Hypothetical protein; Provisional 214
6658 222944 PHA02929 PHA02929 N1R/p28-like protein; Provisional 238
6659 165241 PHA02930 PHA02930 hypothetical protein; Provisional 81
6660 165242 PHA02931 PHA02931 hypothetical protein; Provisional 72
6661 222945 PHA02932 PHA02932 hypothetical protein; Provisional 221
6662 165244 PHA02933 PHA02933 unchracterized protein; Provisional 149
6663 165245 PHA02934 PHA02934 Hypothetical protein; Provisional 253
6664 222946 PHA02935 PHA02935 Hypothetical protein; Provisional 349
6665 165247 PHA02937 PHA02937 hypothetical protein; Provisional 310
6666 165248 PHA02938 PHA02938 hypothetical protein; Provisional 361
6667 222947 PHA02939 PHA02939 hypothetical protein; Provisional 144
6668 165250 PHA02940 PHA02940 hypothetical protein; Provisional 315
6669 222948 PHA02941 PHA02941 hypothetical protein; Provisional 356
6670 165252 PHA02942 PHA02942 putative transposase; Provisional 383
6671 165253 PHA02943 PHA02943 hypothetical protein; Provisional 165
6672 165254 PHA02944 PHA02944 hypothetical protein; Provisional 180
6673 165255 PHA02945 PHA02945 interferon resistance protein; Provisional 88
6674 165256 PHA02946 PHA02946 ankyin-like protein; Provisional 446
6675 222949 PHA02947 PHA02947 S-S bond formation pathway protein; Provisional 215
6676 165258 PHA02948 PHA02948 serine protease inhibitor-like protein; Provisional 373
6677 165259 PHA02949 PHA02949 Hypothetical protein; Provisional 65
6678 177499 PHA02951 PHA02951 Hypothetical protein; Provisional 337
6679 222950 PHA02952 PHA02952 EEV maturation protein; Provisional 648
6680 165262 PHA02953 PHA02953 IEV and EEV membrane glycoprotein; Provisional 170
6681 165263 PHA02954 PHA02954 EEV membrane glycoprotein; Provisional 317
6682 165264 PHA02955 PHA02955 hypothetical protein; Provisional 213
6683 165265 PHA02956 PHA02956 hypothetical protein; Provisional 189
6684 165266 PHA02957 PHA02957 hypothetical protein; Provisional 206
6685 165267 PHA02961 PHA02961 hypothetical protein; Provisional 658
6686 165268 PHA02962 PHA02962 hypothetical protein; Provisional 722
6687 165269 PHA02963 PHA02963 hypothetical protein; Provisional 210
6688 165270 PHA02965 PHA02965 hypothetical protein; Provisional 466
6689 165271 PHA02966 PHA02966 hypothetical protein; Provisional 67
6690 165272 PHA02967 PHA02967 hypothetical protein; Provisional 128
6691 165273 PHA02968 PHA02968 hypothetical protein; Provisional 414
6692 165274 PHA02969 PHA02969 hypothetical protein; Provisional 111
6693 222951 PHA02970 PHA02970 hypothetical protein; Provisional 115
6694 165276 PHA02972 PHA02972 hypothetical protein; Provisional 109
6695 165277 PHA02973 PHA02973 hypothetical protein; Provisional 102
6696 165278 PHA02974 PHA02974 putative IMV membrane protein; Provisional 81
6697 165279 PHA02975 PHA02975 hypothetical protein; Provisional 69
6698 165280 PHA02976 PHA02976 hypothetical protein; Provisional 181
6699 165281 PHA02977 PHA02977 hypothetical protein; Provisional 201
6700 165282 PHA02978 PHA02978 hypothetical protein; Provisional 135
6701 165283 PHA02979 PHA02979 hypothetical protein; Provisional 140
6702 165284 PHA02980 PHA02980 hypothetical protein; Provisional 160
6703 165285 PHA02982 PHA02982 hypothetical protein; Provisional 251
6704 222952 PHA02983 PHA02983 hypothetical protein; Provisional 180
6705 165287 PHA02984 PHA02984 hypothetical protein; Provisional 286
6706 165288 PHA02985 PHA02985 hypothetical protein; Provisional 271
6707 222953 PHA02986 PHA02986 hypothetical protein; Provisional 141
6708 165290 PHA02987 PHA02987 Ig domain OX-2-like protein; Provisional 189
6709 165291 PHA02988 PHA02988 hypothetical protein; Provisional 283
6710 222954 PHA02989 PHA02989 ankyrin repeat protein; Provisional 494
6711 222955 PHA02991 PHA02991 HT motif gene family protein; Provisional 120
6712 222956 PHA02992 PHA02992 hypothetical protein; Provisional 728
6713 165295 PHA02993 PHA02993 hypothetical protein; Provisional 147
6714 222957 PHA02994 PHA02994 hypothetical protein; Provisional 218
6715 165297 PHA02995 PHA02995 DNA-binding virion core protein; Provisional 101
6716 177503 PHA02996 PHA02996 poly(A) polymerase large subunit; Provisional 467
6717 222958 PHA02998 PHA02998 RNA polymerase subunit; Provisional 195
6718 222959 PHA02999 PHA02999 Hypothetical protein; Provisional 382
6719 177505 PHA03000 PHA03000 Hypothetical protein; Provisional 566
6720 222960 PHA03001 PHA03001 putative virion core protein; Provisional 132
6721 165303 PHA03002 PHA03002 Hypothetical protein; Provisional 679
6722 177506 PHA03003 PHA03003 palmytilated EEV membrane glycoprotein; Provisional 369
6723 177507 PHA03004 PHA03004 putative membrane protein; Provisional 270
6724 222961 PHA03005 PHA03005 sulfhydryl oxidase; Provisional 96
6725 165307 PHA03006 PHA03006 hypothetical protein; Provisional 323
6726 165308 PHA03007 PHA03007 hypothetical protein; Provisional 540
6727 165309 PHA03008 PHA03008 hypothetical protein; Provisional 234
6728 165310 PHA03010 PHA03010 hypothetical protein; Provisional 546
6729 165311 PHA03011 PHA03011 hypothetical protein; Provisional 120
6730 165312 PHA03012 PHA03012 hypothetical protein; Provisional 279
6731 165313 PHA03013 PHA03013 hypothetical protein; Provisional 109
6732 165314 PHA03014 PHA03014 hypothetical protein; Provisional 163
6733 165315 PHA03016 PHA03016 hypothetical protein; Provisional 441
6734 165316 PHA03017 PHA03017 hypothetical protein; Provisional 228
6735 165317 PHA03018 PHA03018 hypothetical protein; Provisional 174
6736 165318 PHA03019 PHA03019 hypothetical protein; Provisional 77
6737 165319 PHA03020 PHA03020 hypothetical protein; Provisional 352
6738 165320 PHA03022 PHA03022 hypothetical protein; Provisional 335
6739 165321 PHA03023 PHA03023 hypothetical protein; Provisional 112
6740 165322 PHA03024 PHA03024 hypothetical protein; Provisional 229
6741 165323 PHA03025 PHA03025 hypothetical protein; Provisional 68
6742 165324 PHA03026 PHA03026 hypothetical protein; Provisional 421
6743 165325 PHA03027 PHA03027 hypothetical protein; Provisional 325
6744 165326 PHA03028 PHA03028 hypothetical protein; Provisional 185
6745 165327 PHA03029 PHA03029 hypothetical protein; Provisional 92
6746 165328 PHA03030 PHA03030 hypothetical protein; Provisional 122
6747 165329 PHA03031 PHA03031 hypothetical protein; Provisional 449
6748 165330 PHA03033 PHA03033 hypothetical protein; Provisional 142
6749 165331 PHA03034 PHA03034 hypothetical protein; Provisional 145
6750 165332 PHA03035 PHA03035 hypothetical protein; Provisional 158
6751 222962 PHA03036 PHA03036 DNA polymerase; Provisional 1004
6752 222963 PHA03041 PHA03041 virion core protein; Provisional 153
6753 222964 PHA03042 PHA03042 CD47-like protein; Provisional 286
6754 165336 PHA03043 PHA03043 hypothetical protein; Provisional 130
6755 165337 PHA03044 PHA03044 IMV membrane protein; Provisional 74
6756 177510 PHA03045 PHA03045 IMV membrane protein; Provisional 113
6757 165339 PHA03046 PHA03046 Hypothetical protein; Provisional 142
6758 165340 PHA03047 PHA03047 IMV membrane receptor-like protein; Provisional 53
6759 165341 PHA03048 PHA03048 IMV membrane protein; Provisional 93
6760 165342 PHA03049 PHA03049 IMV membrane protein; Provisional 68
6761 165343 PHA03050 PHA03050 glutaredoxin; Provisional 108
6762 165344 PHA03051 PHA03051 Hypothetical protein; Provisional 88
6763 165345 PHA03052 PHA03052 Hypothetical protein; Provisional 69
6764 165346 PHA03054 PHA03054 IMV membrane protein; Provisional 72
6765 165347 PHA03055 PHA03055 Hypothetical protein; Provisional 79
6766 165348 PHA03056 PHA03056 putative myristoylated protein; Provisional 165
6767 222965 PHA03057 PHA03057 Hypothetical protein; Provisional 146
6768 222966 PHA03058 PHA03058 Hypothetical protein; Provisional 124
6769 222967 PHA03060 PHA03060 Hypothetical protein; Provisional 71
6770 177511 PHA03061 PHA03061 putative DNA-binding virion core protein; Provisional 311
6771 177512 PHA03062 PHA03062 putative IMV membrane protein; Provisional 78
6772 222968 PHA03065 PHA03065 Hypothetical protein; Provisional 438
6773 165355 PHA03066 PHA03066 Hypothetical protein; Provisional 110
6774 222969 PHA03067 PHA03067 hypothetical protein; Provisional 383
6775 177515 PHA03068 PHA03068 DNA-binding phosphoprotein; Provisional 270
6776 165358 PHA03069 PHA03069 DNA-binding protein; Provisional 119
6777 177516 PHA03070 PHA03070 DNA-binding virion core protein; Provisional 249
6778 165360 PHA03071 PHA03071 late transcription factor VLTF-1; Provisional 260
6779 222970 PHA03072 PHA03072 putative viral membrane protein; Provisional 190
6780 177518 PHA03073 PHA03073 late transcription factor VLTF-2; Provisional 150
6781 165363 PHA03074 PHA03074 late transcription factor VLTF-3; Provisional 225
6782 177519 PHA03075 PHA03075 glutaredoxin-like protein; Provisional 123
6783 222971 PHA03078 PHA03078 transcriptional elongation factor; Provisional 219
6784 165366 PHA03079 PHA03079 hypothetical protein; Provisional 87
6785 222972 PHA03080 PHA03080 putative virion core protein; Provisional 366
6786 222973 PHA03081 PHA03081 putative metalloprotease; Provisional 595
6787 222974 PHA03082 PHA03082 DNA-dependent RNA polymerase subunit; Provisional 63
6788 222975 PHA03083 PHA03083 poxvirus myristoylprotein; Provisional 334
6789 222976 PHA03087 PHA03087 G protein-coupled chemokine receptor-like protein; Provisional 335
6790 222977 PHA03089 PHA03089 late transcription factor VLTF-4; Provisional 191
6791 222978 PHA03091 PHA03091 putative alpha aminitin-sensitive protein; Provisional 232
6792 165374 PHA03092 PHA03092 semaphorin-like protein; Provisional 134
6793 222979 PHA03093 PHA03093 EEV glycoprotein; Provisional 185
6794 165376 PHA03094 PHA03094 dUTPase; Provisional 144
6795 222980 PHA03095 PHA03095 ankyrin-like protein; Provisional 471
6796 222981 PHA03096 PHA03096 p28-like protein; Provisional 284
6797 222982 PHA03097 PHA03097 C-type lectin-like protein; Provisional 157
6798 222983 PHA03098 PHA03098 kelch-like protein; Provisional 534
6799 165381 PHA03099 PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional 139
6800 222984 PHA03100 PHA03100 ankyrin repeat protein; Provisional 422
6801 222985 PHA03101 PHA03101 DNA topoisomerase type I; Provisional 314
6802 222986 PHA03102 PHA03102 Small T antigen; Reviewed 153
6803 222987 PHA03103 PHA03103 double-strand RNA-binding protein; Provisional 183
6804 222988 PHA03105 PHA03105 EEV glycoprotein; Provisional 188
6805 165387 PHA03108 PHA03108 poly(A) polymerase small subunit; Provisional 300
6806 222989 PHA03111 PHA03111 Ser/Thr kinase; Provisional 444
6807 222990 PHA03112 PHA03112 IL-18 binding protein; Provisional 141
6808 177532 PHA03115 PHA03115 hypothetical protein; Provisional 340
6809 165391 PHA03118 PHA03118 multifunctional expression regulator; Provisional 474
6810 222991 PHA03119 PHA03119 helicase-primase primase subunit; Provisional 1085
6811 165393 PHA03120 PHA03120 tegument protein VP22; Provisional 310
6812 165395 PHA03123 PHA03123 dUTPase; Provisional 402
6813 165396 PHA03124 PHA03124 dUTPase; Provisional 418
6814 222992 PHA03125 PHA03125 dUTPase; Provisional 376
6815 165398 PHA03126 PHA03126 dUTPase; Provisional 326
6816 222993 PHA03127 PHA03127 dUTPase; Provisional 322
6817 165400 PHA03128 PHA03128 dUTPase; Provisional 376
6818 222994 PHA03129 PHA03129 dUTPase; Provisional 436
6819 222995 PHA03130 PHA03130 dUTPase; Provisional 368
6820 222996 PHA03131 PHA03131 dUTPase; Provisional 286
6821 222997 PHA03132 PHA03132 thymidine kinase; Provisional 580
6822 165405 PHA03133 PHA03133 thymidine kinase; Provisional 368
6823 177537 PHA03134 PHA03134 thymidine kinase; Provisional 340
6824 165407 PHA03135 PHA03135 thymidine kinase; Provisional 343
6825 177538 PHA03136 PHA03136 thymidine kinase; Provisional 378
6826 165410 PHA03138 PHA03138 thymidine kinase; Provisional 340
6827 165411 PHA03139 PHA03139 helicase-primase primase subunit; Provisional 860
6828 222998 PHA03140 PHA03140 helicase-primase primase subunit; Provisional 772
6829 177540 PHA03141 PHA03141 helicase-primase primase subunit; Provisional 101
6830 222999 PHA03142 PHA03142 helicase-primase primase subunit BSLF1; Provisional 835
6831 223000 PHA03144 PHA03144 helicase-primase primase subunit; Provisional 746
6832 165416 PHA03145 PHA03145 helicase-primase primase subunit; Provisional 1058
6833 177543 PHA03146 PHA03146 helicase-primase primase subunit; Provisional 1075
6834 165418 PHA03147 PHA03147 hypothetical protein; Provisional 280
6835 223001 PHA03148 PHA03148 hypothetical protein; Provisional 289
6836 165420 PHA03149 PHA03149 hypothetical protein; Provisional 66
6837 223002 PHA03150 PHA03150 hypothetical protein; Provisional 456
6838 177546 PHA03151 PHA03151 hypothetical protein; Provisional 259
6839 165423 PHA03152 PHA03152 hypothetical protein; Provisional 138
6840 165425 PHA03154 PHA03154 hypothetical protein; Provisional 304
6841 165426 PHA03155 PHA03155 hypothetical protein; Provisional 115
6842 165427 PHA03156 PHA03156 hypothetical protein; Provisional 90
6843 165429 PHA03158 PHA03158 hypothetical protein; Provisional 273
6844 165430 PHA03159 PHA03159 hypothetical protein; Provisional 160
6845 165431 PHA03160 PHA03160 hypothetical protein; Provisional 499
6846 165432 PHA03161 PHA03161 hypothetical protein; Provisional 150
6847 165433 PHA03162 PHA03162 hypothetical protein; Provisional 135
6848 165434 PHA03163 PHA03163 hypothetical protein; Provisional 92
6849 177547 PHA03164 PHA03164 hypothetical protein; Provisional 88
6850 165436 PHA03165 PHA03165 hypothetical protein; Provisional 57
6851 177548 PHA03166 PHA03166 hypothetical protein; Provisional 580
6852 223003 PHA03169 PHA03169 hypothetical protein; Provisional 413
6853 165441 PHA03170 PHA03170 UL37 tegument protein; Provisional 293
6854 165442 PHA03171 PHA03171 UL37 tegument protein; Provisional 499
6855 165443 PHA03172 PHA03172 UL37 tegument protein; Provisional 951
6856 223004 PHA03173 PHA03173 UL37 tegument protein; Provisional 1028
6857 177551 PHA03175 PHA03175 UL43 envelope protein; Provisional 413
6858 223005 PHA03176 PHA03176 UL43 envelope protein; Provisional 420
6859 177552 PHA03178 PHA03178 UL43 envelope protein; Provisional 403
6860 223006 PHA03179 PHA03179 UL43 envelope protein; Provisional 387
6861 165451 PHA03180 PHA03180 helicase-primase primase subunit; Provisional 1071
6862 165452 PHA03181 PHA03181 helicase-primase primase subunit; Provisional 764
6863 177553 PHA03185 PHA03185 UL14 tegument protein; Provisional 214
6864 223007 PHA03187 PHA03187 UL14 tegument protein; Provisional 322
6865 165458 PHA03188 PHA03188 UL14 tegument protein; Provisional 199
6866 223008 PHA03189 PHA03189 UL14 tegument protein; Provisional 348
6867 165460 PHA03190 PHA03190 UL14 tegument protein; Provisional 196
6868 165461 PHA03191 PHA03191 UL14 tegument protein; Provisional 238
6869 177555 PHA03193 PHA03193 tegument protein VP11/12; Provisional 594
6870 177556 PHA03195 PHA03195 tegument protein VP11/12; Provisional 746
6871 165466 PHA03199 PHA03199 uracil DNA glycosylase; Provisional 304
6872 165467 PHA03200 PHA03200 uracil DNA glycosylase; Provisional 255
6873 165468 PHA03201 PHA03201 uracil DNA glycosylase; Provisional 318
6874 165469 PHA03202 PHA03202 uracil DNA glycosylase; Provisional 313
6875 165471 PHA03204 PHA03204 uracil DNA glycosylase; Provisional 322
6876 165473 PHA03207 PHA03207 serine/threonine kinase US3; Provisional 392
6877 177557 PHA03209 PHA03209 serine/threonine kinase US3; Provisional 357
6878 165476 PHA03210 PHA03210 serine/threonine kinase US3; Provisional 501
6879 223009 PHA03211 PHA03211 serine/threonine kinase US3; Provisional 461
6880 165478 PHA03212 PHA03212 serine/threonine kinase US3; Provisional 391
6881 165479 PHA03214 PHA03214 nuclear protein UL24; Provisional 252
6882 223010 PHA03215 PHA03215 nuclear protein UL24; Provisional 262
6883 177558 PHA03216 PHA03216 nuclear protein UL24; Provisional 272
6884 223011 PHA03218 PHA03218 nuclear protein UL24; Provisional 306
6885 165484 PHA03219 PHA03219 nuclear protein UL24; Provisional 300
6886 165485 PHA03222 PHA03222 single-stranded binding protein UL29; Provisional 337
6887 165486 PHA03225 PHA03225 DNA packaging protein UL33; Provisional 125
6888 223012 PHA03229 PHA03229 DNA packaging protein UL33; Provisional 132
6889 223013 PHA03230 PHA03230 nuclear protein UL55; Provisional 180
6890 223014 PHA03231 PHA03231 glycoprotein BALF4; Provisional 829
6891 223015 PHA03232 PHA03232 DNA packaging protein UL32; Provisional 586
6892 223016 PHA03233 PHA03233 DNA packaging protein UL32; Provisional 518
6893 177562 PHA03234 PHA03234 DNA packaging protein UL33; Provisional 338
6894 223017 PHA03235 PHA03235 DNA packaging protein UL33; Provisional 409
6895 223018 PHA03236 PHA03236 DNA packaging protein UL33; Provisional 127
6896 223019 PHA03237 PHA03237 envelope glycoprotein M; Provisional 424
6897 177565 PHA03239 PHA03239 envelope glycoprotein M; Provisional 429
6898 165499 PHA03240 PHA03240 envelope glycoprotein M; Provisional 258
6899 177566 PHA03242 PHA03242 envelope glycoprotein M; Provisional 428
6900 177567 PHA03244 PHA03244 large tegument protein UL36; Provisional 478
6901 223020 PHA03246 PHA03246 large tegument protein UL36; Provisional 3095
6902 223021 PHA03247 PHA03247 large tegument protein UL36; Provisional 3151
6903 223022 PHA03248 PHA03248 DNA packaging tegument protein UL25; Provisional 583
6904 223023 PHA03249 PHA03249 DNA packaging tegument protein UL25; Provisional 653
6905 165509 PHA03250 PHA03250 UL35; Provisional 564
6906 223024 PHA03252 PHA03252 DNA packaging tegument protein UL25; Provisional 589
6907 223025 PHA03253 PHA03253 UL35; Provisional 609
6908 165513 PHA03255 PHA03255 BDLF3; Provisional 234
6909 165514 PHA03256 PHA03256 BDLF3; Provisional 77
6910 177569 PHA03257 PHA03257 Capsid triplex subunit 2; Provisional 316
6911 165516 PHA03258 PHA03258 Capsid triplex subunit 2; Provisional 304
6912 165517 PHA03259 PHA03259 Capsid triplex subunit 2; Provisional 302
6913 165518 PHA03260 PHA03260 Capsid triplex subunit 2; Provisional 339
6914 223026 PHA03261 PHA03261 Capsid triplex subunit 1; Provisional 469
6915 223027 PHA03262 PHA03262 Capsid triplex subunit 1; Provisional 264
6916 223028 PHA03263 PHA03263 Capsid triplex subunit 1; Provisional 332
6917 223029 PHA03264 PHA03264 envelope glycoprotein D; Provisional 416
6918 165523 PHA03265 PHA03265 envelope glycoprotein D; Provisional 402
6919 165527 PHA03269 PHA03269 envelope glycoprotein C; Provisional 566
6920 165528 PHA03270 PHA03270 envelope glycoprotein C; Provisional 466
6921 223030 PHA03271 PHA03271 envelope glycoprotein C; Provisional 490
6922 223031 PHA03273 PHA03273 envelope glycoprotein C; Provisional 486
6923 177573 PHA03275 PHA03275 envelope glycoprotein K; Provisional 340
6924 165533 PHA03276 PHA03276 envelope glycoprotein K; Provisional 337
6925 177574 PHA03278 PHA03278 envelope glycoprotein K; Provisional 347
6926 165536 PHA03279 PHA03279 envelope glycoprotein K; Provisional 361
6927 165538 PHA03281 PHA03281 envelope glycoprotein E; Provisional 642
6928 165539 PHA03282 PHA03282 envelope glycoprotein E; Provisional 540
6929 223032 PHA03283 PHA03283 envelope glycoprotein E; Provisional 542
6930 177576 PHA03286 PHA03286 envelope glycoprotein E; Provisional 492
6931 165546 PHA03289 PHA03289 envelope glycoprotein I; Provisional 352
6932 165547 PHA03290 PHA03290 envelope glycoprotein I; Provisional 357
6933 223033 PHA03291 PHA03291 envelope glycoprotein I; Provisional 401
6934 177577 PHA03292 PHA03292 envelope glycoprotein I; Provisional 413
6935 223034 PHA03293 PHA03293 deoxyribonuclease; Provisional 523
6936 223035 PHA03294 PHA03294 envelope glycoprotein H; Provisional 835
6937 223036 PHA03295 PHA03295 envelope glycoprotein H; Provisional 714
6938 165553 PHA03296 PHA03296 envelope glycoprotein H; Provisional 814
6939 165554 PHA03297 PHA03297 envelope glycoprotein L; Provisional 185
6940 165555 PHA03298 PHA03298 envelope glycoprotein L; Provisional 167
6941 165556 PHA03299 PHA03299 envelope glycoprotein L; Provisional 195
6942 223037 PHA03301 PHA03301 envelope glycoprotein L; Provisional 226
6943 223038 PHA03302 PHA03302 envelope glycoprotein L; Provisional 253
6944 165560 PHA03303 PHA03303 envelope glycoprotein L; Provisional 159
6945 223039 PHA03307 PHA03307 transcriptional regulator ICP4; Provisional 1352
6946 165563 PHA03308 PHA03308 transcriptional regulator ICP4; Provisional 1463
6947 165564 PHA03309 PHA03309 transcriptional regulator ICP4; Provisional 2033
6948 223040 PHA03311 PHA03311 helicase-primase subunit BBLF4; Provisional 782
6949 177582 PHA03312 PHA03312 helicase-primase subunit BBLF2/3; Provisional 709
6950 223041 PHA03321 PHA03321 tegument protein VP11/12; Provisional 694
6951 223042 PHA03322 PHA03322 tegument protein VP11/12; Provisional 674
6952 223043 PHA03323 PHA03323 nuclear egress membrane protein UL34; Provisional 272
6953 165570 PHA03324 PHA03324 nuclear egress membrane protein UL34; Provisional 274
6954 223044 PHA03325 PHA03325 nuclear-egress-membrane-like protein; Provisional 418
6955 223045 PHA03326 PHA03326 nuclear egress membrane protein; Provisional 275
6956 223046 PHA03328 PHA03328 nuclear egress lamina protein UL31; Provisional 316
6957 165574 PHA03330 PHA03330 putative primase; Provisional 771
6958 223047 PHA03332 PHA03332 membrane glycoprotein; Provisional 1328
6959 223048 PHA03333 PHA03333 putative ATPase subunit of terminase; Provisional 752
6960 223049 PHA03334 PHA03334 putative DNA polymerase catalytic subunit; Provisional 1545
6961 223050 PHA03335 PHA03335 hypothetical protein; Provisional 385
6962 223051 PHA03336 PHA03336 uncharacterized protein; Provisional 462
6963 165582 PHA03338 PHA03338 US22 family homolog; Provisional 344
6964 165586 PHA03342 PHA03342 US22 family homolog; Provisional 511
6965 165587 PHA03343 PHA03343 US22 family homolog; Provisional 578
6966 165588 PHA03344 PHA03344 US22 family homolog; Provisional 672
6967 223052 PHA03346 PHA03346 US22 family homolog; Provisional 520
6968 177588 PHA03347 PHA03347 uracil DNA glycosylase; Provisional 252
6969 177589 PHA03348 PHA03348 tegument protein UL21; Provisional 526
6970 177590 PHA03349 PHA03349 tegument protein UL16; Provisional 343
6971 177591 PHA03351 PHA03351 tegument protein UL16; Provisional 235
6972 223053 PHA03352 PHA03352 tegument protein UL16; Provisional 340
6973 177593 PHA03354 PHA03354 Alkaline exonuclease; Provisional 81
6974 177594 PHA03356 PHA03356 tegument protein UL11; Provisional 93
6975 177595 PHA03357 PHA03357 Alkaline exonuclease; Provisional 81
6976 177596 PHA03358 PHA03358 Alkaline exonuclease; Provisional 75
6977 223054 PHA03359 PHA03359 UL17 tegument protein; Provisional 686
6978 177598 PHA03360 PHA03360 tegument protein; Provisional 442
6979 223055 PHA03361 PHA03361 UL7 tegument protein; Provisional 302
6980 223056 PHA03362 PHA03362 single-stranded binding protein UL29; Provisional 1189
6981 223057 PHA03364 PHA03364 hypothetical protein; Provisional 264
6982 177602 PHA03365 PHA03365 hypothetical protein; Provisional 419
6983 223058 PHA03366 PHA03366 FGAM-synthase; Provisional 1304
6984 223059 PHA03367 PHA03367 single-stranded DNA binding protein; Provisional 1115
6985 223060 PHA03368 PHA03368 DNA packaging terminase subunit 1; Provisional 738
6986 223061 PHA03369 PHA03369 capsid maturational protease; Provisional 663
6987 177607 PHA03370 PHA03370 virion protein US2; Provisional 269
6988 177608 PHA03371 PHA03371 circ protein; Provisional 240
6989 177609 PHA03372 PHA03372 DNA packaging terminase subunit 1; Provisional 668
6990 223062 PHA03373 PHA03373 tegument protein; Provisional 247
6991 223063 PHA03374 PHA03374 hypothetical protein; Provisional 730
6992 223064 PHA03375 PHA03375 hypothetical protein; Provisional 844
6993 177613 PHA03376 PHA03376 BARF1; Provisional 221
6994 177614 PHA03377 PHA03377 EBNA-3C; Provisional 1000
6995 223065 PHA03378 PHA03378 EBNA-3B; Provisional 991
6996 223066 PHA03379 PHA03379 EBNA-3A; Provisional 935
6997 223067 PHA03380 PHA03380 transactivating tegument protein VP16; Provisional 432
6998 177618 PHA03381 PHA03381 tegument protein VP22; Provisional 290
6999 177619 PHA03383 PHA03383 PCNA-like protein; Provisional 262
7000 223068 PHA03384 PHA03384 early DNA-binding protein E2A; Provisional 445
7001 177621 PHA03385 IX capsid protein IX,hexon associated protein IX; Provisional 135
7002 177622 PHA03386 P10 fibrous body protein; Provisional 94
7003 177623 PHA03387 gp37 spherodin-like protein; Provisional 267
7004 177624 PHA03388 ORF1_granulin Granulin; Provisional 248
7005 177625 PHA03389 polh polyhedrin; Provisional 246
7006 223069 PHA03390 pk1 serine/threonine-protein kinase 1; Provisional 267
7007 223070 PHA03391 p47 viral transcription regulator p47; Provisional 395
7008 223071 PHA03392 egt ecdysteroid UDP-glucosyltransferase; Provisional 507
7009 223072 PHA03393 odv-e66 occlusion-derived virus envelope protein E66; Provisional 682
7010 223073 PHA03394 lef-8 DNA-directed RNA polymerase subunit beta-like protein; Provisional 865
7011 177631 PHA03395 p10 fibrous body protein; Provisional 87
7012 223074 PHA03396 lef-9 late expression factor 9; Provisional 493
7013 177633 PHA03397 vlf-1 very late expression factor 1; Provisional 363
7014 223075 PHA03398 PHA03398 viral phosphatase superfamily protein; Provisional 303
7015 223076 PHA03399 pif3 per os infectivity factor 3; Provisional 200
7016 223077 PHA03402 PHA03402 hypothetical protein; Provisional 81
7017 177637 PHA03405 PHA03405 hypothetical protein; Provisional 130
7018 223078 PHA03410 PHA03410 hypothetical protein; Provisional 170
7019 177639 PHA03411 PHA03411 putative methyltransferase; Provisional 279
7020 177640 PHA03412 PHA03412 putative methyltransferase; Provisional 241
7021 177641 PHA03413 PHA03413 putative internal core protein; Provisional 1304
7022 177642 PHA03414 PHA03414 virion protein; Provisional 1337
7023 177643 PHA03415 PHA03415 putative internal virion protein; Provisional 1019
7024 177644 PHA03416 PHA03416 hypothetical E4 protein; Provisional 92
7025 177645 PHA03417 PHA03417 E4 protein; Provisional 118
7026 177646 PHA03418 PHA03418 hypothetical E4 protein; Provisional 230
7027 223079 PHA03419 PHA03419 E4 protein; Provisional 200
7028 177648 PHA03420 PHA03420 E4 protein; Provisional 137
7029 177649 PLN00009 PLN00009 cyclin-dependent kinase A; Provisional 294
7030 215027 PLN00010 PLN00010 cyclin-dependent kinases regulatory subunit; Provisional 86
7031 177651 PLN00011 PLN00011 cysteine synthase 323
7032 215028 PLN00012 PLN00012 chlorophyll synthetase; Provisional 375
7033 177653 PLN00014 PLN00014 light-harvesting-like protein 3; Provisional 250
7034 177654 PLN00015 PLN00015 protochlorophyllide reductase 308
7035 215029 PLN00016 PLN00016 RNA-binding protein; Provisional 378
7036 177656 PLN00017 PLN00017 photosystem I reaction centre subunit VI; Provisional 90
7037 215030 PLN00019 PLN00019 photosystem I reaction center subunit III; Provisional 223
7038 215031 PLN00020 PLN00020 ribulose bisphosphate carboxylase/oxygenase activase -RuBisCO activase (RCA); Provisional 413
7039 177659 PLN00021 PLN00021 chlorophyllase 313
7040 215032 PLN00022 PLN00022 electron transfer flavoprotein subunit alpha; Provisional 356
7041 177661 PLN00023 PLN00023 GTP-binding protein; Provisional 334
7042 215033 PLN00025 PLN00025 photosystem II light harvesting chlorophyll a/b binding protein; Provisional 262
7043 177663 PLN00026 PLN00026 aquaporin NIP; Provisional 298
7044 177664 PLN00027 PLN00027 aquaporin TIP; Provisional 252
7045 177665 PLN00028 PLN00028 nitrate transmembrane transporter; Provisional 476
7046 215034 PLN00032 PLN00032 DNA-directed RNA polymerase; Provisional 71
7047 215035 PLN00033 PLN00033 photosystem II stability/assembly factor; Provisional 398
7048 215036 PLN00034 PLN00034 mitogen-activated protein kinase kinase; Provisional 353
7049 177669 PLN00035 PLN00035 histone H4; Provisional 103
7050 177670 PLN00036 PLN00036 40S ribosomal protein S4; Provisional 261
7051 177671 PLN00037 PLN00037 photosystem II oxygen-evolving enhancer protein 1; Provisional 313
7052 215037 PLN00038 PLN00038 photosystem I reaction center subunit XI (PsaL); Provisional 165
7053 177673 PLN00039 PLN00039 photosystem II reaction center Psb28 protein; Provisional 111
7054 215038 PLN00040 PLN00040 Protein MAK16 homolog; Provisional 233
7055 215039 PLN00041 PLN00041 photosystem I reaction center subunit II; Provisional 196
7056 177676 PLN00042 PLN00042 photosystem II oxygen-evolving enhancer protein 2; Provisional 260
7057 165621 PLN00043 PLN00043 elongation factor 1-alpha; Provisional 447
7058 165622 PLN00044 PLN00044 multi-copper oxidase-related protein; Provisional 596
7059 177677 PLN00045 PLN00045 photosystem I reaction center subunit IV; Provisional 101
7060 215040 PLN00046 PLN00046 photosystem I reaction center subunit O; Provisional 141
7061 177679 PLN00047 PLN00047 photosystem II biogenesis protein Psb29; Provisional 283
7062 177680 PLN00048 PLN00048 photosystem I light harvesting chlorophyll a/b binding protein 3; Provisional 262
7063 177681 PLN00049 PLN00049 carboxyl-terminal processing protease; Provisional 389
7064 165628 PLN00050 PLN00050 expansin A; Provisional 247
7065 177682 PLN00051 PLN00051 RNA-binding S4 domain-containing protein; Provisional 267
7066 177683 PLN00052 PLN00052 prolyl 4-hydroxylase; Provisional 310
7067 215041 PLN00053 PLN00053 photosystem II subunit R; Provisional 117
7068 215042 PLN00054 PLN00054 photosystem I reaction center subunit N; Provisional 139
7069 177686 PLN00055 PLN00055 photosystem II reaction center protein H; Provisional 73
7070 177687 PLN00056 PLN00056 photosystem Q(B) protein; Provisional 353
7071 177688 PLN00057 PLN00057 proliferating cell nuclear antigen; Provisional 263
7072 177689 PLN00058 PLN00058 photosystem II reaction center subunit T; Provisional 103
7073 177690 PLN00059 PLN00059 PsbP domain-containing protein 1; Provisional 286
7074 177691 PLN00060 PLN00060 meiotic recombination protein SPO11-2; Provisional 384
7075 215043 PLN00061 PLN00061 photosystem II protein Psb27; Provisional 150
7076 177693 PLN00062 PLN00062 TATA-box-binding protein; Provisional 179
7077 215044 PLN00063 PLN00063 photosystem II core complex proteins psbY; Provisional 194
7078 215045 PLN00064 PLN00064 photosystem II protein Psb27; Provisional 166
7079 215046 PLN00066 PLN00066 PsbP domain-containing protein 4; Provisional 262
7080 177697 PLN00067 PLN00067 PsbP domain-containing protein 6; Provisional 263
7081 177698 PLN00068 PLN00068 photosystem II CP47 chlorophyll A apoprotein; Provisional 508
7082 215047 PLN00070 PLN00070 aconitate hydratase 936
7083 177700 PLN00071 PLN00071 photosystem I subunit VII; Provisional 81
7084 177701 PLN00072 PLN00072 3-isopropylmalate isomerase/dehydratase small subunit; Provisional 246
7085 215048 PLN00074 PLN00074 photosystem II D2 protein (PsbD); Provisional 353
7086 215049 PLN00075 PLN00075 Photosystem II reaction center protein K; Provisional 52
7087 215050 PLN00077 PLN00077 photosystem II reaction centre W protein; Provisional 128
7088 165653 PLN00078 PLN00078 photosystem I reaction center subunit N (PsaN); Provisional 122
7089 165655 PLN00081 PLN00081 photosystem I reaction center subunit V (PsaG); Provisional 141
7090 215051 PLN00082 PLN00082 photosystem II reaction centre W protein (PsbW); Provisional 67
7091 177706 PLN00083 PLN00083 photosystem II subunit R; Provisional 101
7092 177707 PLN00084 PLN00084 photosystem II subunit S (PsbS); Provisional 214
7093 177708 PLN00085 PLN00085 photosystem II reaction center protein M (PsbM); Provisional 149
7094 177709 PLN00088 PLN00088 predicted protein; Provisional 127
7095 177710 PLN00089 PLN00089 fucoxanthin-chlorophyll a/c binding protein; Provisional 209
7096 165663 PLN00090 PLN00090 photosystem II reaction center M protein; Provisional 113
7097 215052 PLN00091 PLN00091 photosystem I reaction center subunit V (PsaG); Provisional 160
7098 177712 PLN00092 PLN00092 photosystem I reaction center subunit V (PsaG); Provisional 137
7099 177713 PLN00093 PLN00093 geranylgeranyl diphosphate reductase; Provisional 450
7100 215053 PLN00094 PLN00094 aconitate hydratase 2; Provisional 938
7101 165668 PLN00095 PLN00095 chlorophyllide a oxygenase; Provisional 394
7102 177715 PLN00096 PLN00096 isocitrate dehydrogenase (NADP+); Provisional 393
7103 165670 PLN00097 PLN00097 photosystem I light harvesting complex Lhca2/4, chlorophyll a/b binding; Provisional 244
7104 177716 PLN00098 PLN00098 light-harvesting complex I chlorophyll a/b-binding protein (Lhac); Provisional 267
7105 177717 PLN00099 PLN00099 light-harvesting complex I Chlorophyll A-B binding protein Lhca1; Provisional 243
7106 215054 PLN00100 PLN00100 light-harvesting complex chlorophyll-a/b protein of photosystem I (Lhca); Provisional 246
7107 215055 PLN00101 PLN00101 Photosystem I light-harvesting complex type 4 protein; Provisional 250
7108 177720 PLN00103 PLN00103 isocitrate dehydrogenase (NADP+); Provisional 410
7109 215056 PLN00104 PLN00104 MYST-like histone acetyltransferase; Provisional 450
7110 215057 PLN00105 PLN00105 malate/L-lactate dehydrogenase; Provisional 330
7111 215058 PLN00106 PLN00106 malate dehydrogenase 323
7112 165679 PLN00107 PLN00107 FAD-dependent oxidoreductase; Provisional 257
7113 177724 PLN00108 PLN00108 unknown protein; Provisional 257
7114 177725 PLN00110 PLN00110 flavonoid 3',5'-hydroxylase (F3'5'H); Provisional 504
7115 215059 PLN00111 PLN00111 accumulation of photosystem one; Provisional 399
7116 215060 PLN00112 PLN00112 malate dehydrogenase (NADP); Provisional 444
7117 215061 PLN00113 PLN00113 leucine-rich repeat receptor-like protein kinase; Provisional 968
7118 177729 PLN00115 PLN00115 pollen allergen group 3; Provisional 118
7119 177730 PLN00116 PLN00116 translation elongation factor EF-2 subunit; Provisional 843
7120 215062 PLN00118 PLN00118 isocitrate dehydrogenase (NAD+) 372
7121 177732 PLN00119 PLN00119 endoglucanase 489
7122 215063 PLN00120 PLN00120 fucoxanthin-chlorophyll a-c binding protein; Provisional 202
7123 177733 PLN00121 PLN00121 histone H3; Provisional 136
7124 215064 PLN00122 PLN00122 serine/threonine protein phosphatase 2A; Provisional 170
7125 215065 PLN00123 PLN00123 isocitrate dehydrogenase (NAD+) 360
7126 177736 PLN00124 PLN00124 succinyl-CoA ligase [GDP-forming] subunit beta; Provisional 422
7127 215066 PLN00125 PLN00125 Succinyl-CoA ligase [GDP-forming] subunit alpha 300
7128 165695 PLN00126 PLN00126 succinate dehydrogenase, cytochrome b subunit family; Provisional 129
7129 177738 PLN00127 PLN00127 succinate dehydrogenase (ubiquinone) cytochrome b subunit; Provisional 178
7130 177739 PLN00128 PLN00128 Succinate dehydrogenase [ubiquinone] flavoprotein subunit 635
7131 215067 PLN00129 PLN00129 succinate dehydrogenase [ubiquinone] iron-sulfur subunit 276
7132 177741 PLN00130 PLN00130 succinate dehydrogenase (SDH3); Provisional 213
7133 165700 PLN00131 PLN00131 hypothetical protein; Provisional 218
7134 215068 PLN00133 PLN00133 class I-fumarate hydratase; Provisional 576
7135 215069 PLN00134 PLN00134 fumarate hydratase; Provisional 458
7136 177744 PLN00135 PLN00135 malate dehydrogenase 309
7137 215070 PLN00136 PLN00136 silicon transporter; Provisional 482
7138 215071 PLN00137 PLN00137 NHAD transporter family protein; Provisional 424
7139 165706 PLN00138 PLN00138 large subunit ribosomal protein LP2; Provisional 113
7140 165707 PLN00139 PLN00139 hypothetical protein; Provisional 320
7141 165708 PLN00140 PLN00140 alcohol acetyltransferase family protein; Provisional 444
7142 215072 PLN00141 PLN00141 Tic62-NAD(P)-related group II protein; Provisional 251
7143 215073 PLN00142 PLN00142 sucrose synthase 815
7144 165711 PLN00143 PLN00143 tyrosine/nicotianamine aminotransferase; Provisional 409
7145 177748 PLN00144 PLN00144 acetylornithine transaminase 382
7146 215074 PLN00145 PLN00145 tyrosine/nicotianamine aminotransferase; Provisional 430
7147 215075 PLN00146 PLN00146 40S ribosomal protein S15a; Provisional 130
7148 215076 PLN00147 PLN00147 light-harvesting complex I chlorophyll-a/b binding protein Lhca5; Provisional 252
7149 215077 PLN00148 PLN00148 potassium transporter; Provisional 785
7150 177753 PLN00149 PLN00149 potassium transporter; Provisional 779
7151 215078 PLN00150 PLN00150 potassium ion transporter family protein; Provisional 779
7152 215079 PLN00151 PLN00151 potassium transporter; Provisional 852
7153 177755 PLN00152 PLN00152 DNA-directed RNA polymerase; Provisional 130
7154 165721 PLN00153 PLN00153 histone H2A; Provisional 129
7155 177756 PLN00154 PLN00154 histone H2A; Provisional 136
7156 165723 PLN00155 PLN00155 histone H2A; Provisional 58
7157 215080 PLN00156 PLN00156 histone H2AX; Provisional 139
7158 177758 PLN00157 PLN00157 histone H2A; Provisional 132
7159 215081 PLN00158 PLN00158 histone H2B; Provisional 116
7160 165727 PLN00160 PLN00160 histone H3; Provisional 97
7161 215082 PLN00161 PLN00161 histone H3; Provisional 135
7162 215083 PLN00162 PLN00162 transport protein sec23; Provisional 761
7163 165730 PLN00163 PLN00163 histone H4; Provisional 59
7164 215084 PLN00164 PLN00164 glucosyltransferase; Provisional 480
7165 165732 PLN00165 PLN00165 hypothetical protein; Provisional 88
7166 165733 PLN00166 PLN00166 aquaporin TIP2; Provisional 250
7167 215085 PLN00167 PLN00167 aquaporin TIP5; Provisional 256
7168 215086 PLN00168 PLN00168 Cytochrome P450; Provisional 519
7169 177765 PLN00169 PLN00169 CETS family protein; Provisional 175
7170 215087 PLN00170 PLN00170 photosystem II light-harvesting-Chl-binding protein Lhcb6 (CP24); Provisional 255
7171 215088 PLN00171 PLN00171 photosystem light-harvesting complex -chlorophyll a/b binding protein Lhcb7; Provisional 324
7172 177768 PLN00172 PLN00172 ubiquitin conjugating enzyme; Provisional 147
7173 177769 PLN00174 PLN00174 predicted protein; Provisional 160
7174 215089 PLN00175 PLN00175 aminotransferase family protein; Provisional 413
7175 215090 PLN00176 PLN00176 galactinol synthase 333
7176 177772 PLN00177 PLN00177 sulfite oxidase; Provisional 393
7177 177773 PLN00178 PLN00178 sulfite reductase 623
7178 215091 PLN00179 PLN00179 acyl-[acyl-carrier protein] desaturase 390
7179 177775 PLN00180 PLN00180 NDF6 (NDH-dependent flow 6); Provisional 180
7180 177776 PLN00181 PLN00181 protein SPA1-RELATED; Provisional 793
7181 165748 PLN00182 PLN00182 putative aquaporin NIP4; Provisional 283
7182 215092 PLN00183 PLN00183 putative aquaporin NIP7; Provisional 274
7183 177778 PLN00184 PLN00184 aquaporin NIP1; Provisional 296
7184 177779 PLN00185 PLN00185 60S ribosomal protein L4-1; Provisional 405
7185 215093 PLN00186 PLN00186 ribosomal protein S26; Provisional 109
7186 177781 PLN00187 PLN00187 photosystem II light-harvesting complex II protein Lhcb4; Provisional 286
7187 215094 PLN00188 PLN00188 enhanced disease resistance protein (EDR2); Provisional 719
7188 177783 PLN00189 PLN00189 40S ribosomal protein S9; Provisional 194
7189 177784 PLN00190 PLN00190 60S ribosomal protein L21; Provisional 158
7190 215095 PLN00191 PLN00191 enolase 457
7191 215096 PLN00192 PLN00192 aldehyde oxidase 1344
7192 215097 PLN00193 PLN00193 expansin-A; Provisional 256
7193 215098 PLN00194 PLN00194 aldose 1-epimerase; Provisional 337
7194 165762 PLN00196 PLN00196 alpha-amylase; Provisional 428
7195 215099 PLN00197 PLN00197 beta-amylase; Provisional 573
7196 215100 PLN00198 PLN00198 anthocyanidin reductase; Provisional 338
7197 177791 PLN00200 PLN00200 argininosuccinate synthase; Provisional 404
7198 177792 PLN00202 PLN00202 beta-ureidopropionase 405
7199 215101 PLN00203 PLN00203 glutamyl-tRNA reductase 519
7200 215102 PLN00204 PLN00204 CP12 gene family protein; Provisional 126
7201 177795 PLN00205 PLN00205 ribosomal protein L13 family protein; Provisional 191
7202 215103 PLN00206 PLN00206 DEAD-box ATP-dependent RNA helicase; Provisional 518
7203 215104 PLN00207 PLN00207 polyribonucleotide nucleotidyltransferase; Provisional 891
7204 177798 PLN00208 PLN00208 translation initiation factor (eIF); Provisional 145
7205 165774 PLN00209 PLN00209 ribosomal protein S27; Provisional 86
7206 177799 PLN00210 PLN00210 40S ribosomal protein S16; Provisional 141
7207 215105 PLN00211 PLN00211 predicted protein; Provisional 61
7208 215106 PLN00212 PLN00212 glutelin; Provisional 493
7209 165778 PLN00213 PLN00213 predicted protein; Provisional 118
7210 177800 PLN00214 PLN00214 putative protein; Provisional 115
7211 165780 PLN00215 PLN00215 predicted protein; Provisional 110
7212 165781 PLN00216 PLN00216 predicted protein; Provisional 69
7213 165782 PLN00217 PLN00217 predicted protein; Provisional 210
7214 165783 PLN00218 PLN00218 predicted protein; Provisional 151
7215 165784 PLN00219 PLN00219 predicted protein; Provisional 65
7216 215107 PLN00220 PLN00220 tubulin beta chain; Provisional 447
7217 177802 PLN00221 PLN00221 tubulin alpha chain; Provisional 450
7218 215108 PLN00222 PLN00222 tubulin gamma chain; Provisional 454
7219 165788 PLN00223 PLN00223 ADP-ribosylation factor; Provisional 181
7220 215109 PLN00410 PLN00410 U5 snRNP protein, DIM1 family; Provisional 142
7221 177805 PLN00411 PLN00411 nodulin MtN21 family protein; Provisional 358
7222 215110 PLN00412 PLN00412 NADP-dependent glyceraldehyde-3-phosphate dehydrogenase; Provisional 496
7223 165792 PLN00413 PLN00413 triacylglycerol lipase 479
7224 177807 PLN00414 PLN00414 glycosyltransferase family protein 446
7225 177808 PLN00415 PLN00415 3-ketoacyl-CoA synthase 466
7226 177809 PLN00416 PLN00416 carbonate dehydratase 258
7227 177810 PLN00417 PLN00417 oxidoreductase, 2OG-Fe(II) oxygenase family protein 348
7228 177811 PLN02150 PLN02150 terpene synthase/cyclase family protein 96
7229 177812 PLN02151 PLN02151 trehalose-phosphatase 354
7230 177813 PLN02152 PLN02152 indole-3-acetate beta-glucosyltransferase 455
7231 177814 PLN02153 PLN02153 epithiospecifier protein 341
7232 215111 PLN02154 PLN02154 carbonic anhydrase 290
7233 165802 PLN02155 PLN02155 polygalacturonase 394
7234 177816 PLN02156 PLN02156 gibberellin 2-beta-dioxygenase 335
7235 177817 PLN02157 PLN02157 3-hydroxyisobutyryl-CoA hydrolase-like protein 401
7236 177818 PLN02159 PLN02159 Fe(2+) transport protein 337
7237 177819 PLN02160 PLN02160 thiosulfate sulfurtransferase 136
7238 177820 PLN02161 PLN02161 beta-amylase 531
7239 177821 PLN02162 PLN02162 triacylglycerol lipase 475
7240 177822 PLN02164 PLN02164 sulfotransferase 346
7241 177823 PLN02165 PLN02165 adenylate isopentenyltransferase 334
7242 165812 PLN02166 PLN02166 dTDP-glucose 4,6-dehydratase 436
7243 215112 PLN02167 PLN02167 UDP-glycosyltransferase family protein 475
7244 215113 PLN02168 PLN02168 copper ion binding / pectinesterase 545
7245 177826 PLN02169 PLN02169 fatty acid (omega-1)-hydroxylase/midchain alkane hydroxylase 500
7246 215114 PLN02170 PLN02170 probable pectinesterase/pectinesterase inhibitor 529
7247 215115 PLN02171 PLN02171 endoglucanase 629
7248 215116 PLN02172 PLN02172 flavin-containing monooxygenase FMO GS-OX 461
7249 177830 PLN02173 PLN02173 UDP-glucosyl transferase family protein 449
7250 177831 PLN02174 PLN02174 aldehyde dehydrogenase family 3 member H1 484
7251 177832 PLN02175 PLN02175 endoglucanase 484
7252 215117 PLN02176 PLN02176 putative pectinesterase 340
7253 215118 PLN02177 PLN02177 glycerol-3-phosphate acyltransferase 497
7254 177834 PLN02178 PLN02178 cinnamyl-alcohol dehydrogenase 375
7255 177835 PLN02179 PLN02179 carbonic anhydrase 235
7256 177836 PLN02180 PLN02180 gamma-glutamyl transpeptidase 4 639
7257 177837 PLN02182 PLN02182 cytidine deaminase 339
7258 165828 PLN02183 PLN02183 ferulate 5-hydroxylase 516
7259 177838 PLN02184 PLN02184 superoxide dismutase [Fe] 212
7260 215119 PLN02187 PLN02187 rooty/superroot1 462
7261 215120 PLN02188 PLN02188 polygalacturonase/glycoside hydrolase family protein 404
7262 215121 PLN02189 PLN02189 cellulose synthase 1040
7263 215122 PLN02190 PLN02190 cellulose synthase-like protein 756
7264 177843 PLN02191 PLN02191 L-ascorbate oxidase 574
7265 215123 PLN02192 PLN02192 3-ketoacyl-CoA synthase 511
7266 177844 PLN02193 PLN02193 nitrile-specifier protein 470
7267 177845 PLN02194 PLN02194 cytochrome-c oxidase 265
7268 215124 PLN02195 PLN02195 cellulose synthase A 977
7269 177847 PLN02196 PLN02196 abscisic acid 8'-hydroxylase 463
7270 177848 PLN02197 PLN02197 pectinesterase 588
7271 177849 PLN02198 PLN02198 glutathione gamma-glutamylcysteinyltransferase 573
7272 177850 PLN02199 PLN02199 shikimate kinase 303
7273 215125 PLN02200 PLN02200 adenylate kinase family protein 234
7274 177852 PLN02201 PLN02201 probable pectinesterase/pectinesterase inhibitor 520
7275 177853 PLN02202 PLN02202 carbonate dehydratase 284
7276 165847 PLN02203 PLN02203 aldehyde dehydrogenase 484
7277 215126 PLN02204 PLN02204 diacylglycerol kinase 601
7278 177855 PLN02205 PLN02205 alpha,alpha-trehalose-phosphate synthase [UDP-forming] 854
7279 177856 PLN02206 PLN02206 UDP-glucuronate decarboxylase 442
7280 177857 PLN02207 PLN02207 UDP-glycosyltransferase 468
7281 177858 PLN02208 PLN02208 glycosyltransferase family protein 442
7282 177859 PLN02209 PLN02209 serine carboxypeptidase 437
7283 215127 PLN02210 PLN02210 UDP-glucosyl transferase 456
7284 215128 PLN02211 PLN02211 methyl indole-3-acetate methyltransferase 273
7285 165857 PLN02213 PLN02213 sinapoylglucose-malate O-sinapoyltransferase/ carboxypeptidase 319
7286 177862 PLN02214 PLN02214 cinnamoyl-CoA reductase 342
7287 215129 PLN02216 PLN02216 protein SRG1 357
7288 215130 PLN02217 PLN02217 probable pectinesterase/pectinesterase inhibitor 670
7289 177865 PLN02218 PLN02218 polygalacturonase ADPG 431
7290 165863 PLN02219 PLN02219 probable galactinol--sucrose galactosyltransferase 2 775
7291 177866 PLN02220 PLN02220 delta-9 acyl-lipid desaturase 299
7292 177867 PLN02221 PLN02221 asparaginyl-tRNA synthetase 572
7293 177868 PLN02222 PLN02222 phosphoinositide phospholipase C 2 581
7294 165867 PLN02223 PLN02223 phosphoinositide phospholipase C 537
7295 177869 PLN02224 PLN02224 methionine-tRNA ligase 616
7296 177870 PLN02225 PLN02225 1-deoxy-D-xylulose-5-phosphate synthase 701
7297 177871 PLN02226 PLN02226 2-oxoglutarate dehydrogenase E2 component 463
7298 177872 PLN02227 PLN02227 fructose-bisphosphate aldolase I 399
7299 177873 PLN02228 PLN02228 Phosphoinositide phospholipase C 567
7300 177874 PLN02229 PLN02229 alpha-galactosidase 427
7301 177875 PLN02230 PLN02230 phosphoinositide phospholipase C 4 598
7302 177876 PLN02231 PLN02231 alanine transaminase 534
7303 165876 PLN02232 PLN02232 ubiquinone biosynthesis methyltransferase 160
7304 177877 PLN02233 PLN02233 ubiquinone biosynthesis methyltransferase 261
7305 177878 PLN02234 PLN02234 1-deoxy-D-xylulose-5-phosphate synthase 641
7306 177879 PLN02235 PLN02235 ATP citrate (pro-S)-lyase 423
7307 177880 PLN02236 PLN02236 choline kinase 344
7308 215131 PLN02237 PLN02237 glyceraldehyde-3-phosphate dehydrogenase B 442
7309 215132 PLN02238 PLN02238 hypoxanthine phosphoribosyltransferase 189
7310 177883 PLN02240 PLN02240 UDP-glucose 4-epimerase 352
7311 215133 PLN02241 PLN02241 glucose-1-phosphate adenylyltransferase 436
7312 215134 PLN02242 PLN02242 methionine gamma-lyase 418
7313 177886 PLN02243 PLN02243 S-adenosylmethionine synthase 386
7314 215135 PLN02244 PLN02244 tocopherol O-methyltransferase 340
7315 215136 PLN02245 PLN02245 ATP phosphoribosyl transferase 403
7316 215137 PLN02246 PLN02246 4-coumarate--CoA ligase 537
7317 165890 PLN02247 PLN02247 indole-3-acetic acid-amido synthetase 606
7318 215138 PLN02248 PLN02248 cellulose synthase-like protein 1135
7319 177891 PLN02249 PLN02249 indole-3-acetic acid-amido synthetase 597
7320 215139 PLN02250 PLN02250 lipid phosphate phosphatase 314
7321 215140 PLN02251 PLN02251 pyrophosphate-dependent phosphofructokinase 568
7322 215141 PLN02252 PLN02252 nitrate reductase [NADPH] 888
7323 177895 PLN02253 PLN02253 xanthoxin dehydrogenase 280
7324 215142 PLN02254 PLN02254 gibberellin 3-beta-dioxygenase 358
7325 215143 PLN02255 PLN02255 H(+)-translocating inorganic pyrophosphatase 765
7326 215144 PLN02256 PLN02256 arogenate dehydrogenase 304
7327 177899 PLN02257 PLN02257 phosphoribosylamine--glycine ligase 434
7328 215145 PLN02258 PLN02258 9-cis-epoxycarotenoid dioxygenase NCED 590
7329 177901 PLN02259 PLN02259 branched-chain-amino-acid aminotransferase 2 388
7330 215146 PLN02260 PLN02260 probable rhamnose biosynthetic enzyme 668
7331 215147 PLN02262 PLN02262 fructose-1,6-bisphosphatase 340
7332 177904 PLN02263 PLN02263 serine decarboxylase 470
7333 215148 PLN02264 PLN02264 lipoxygenase 919
7334 215149 PLN02265 PLN02265 probable phenylalanyl-tRNA synthetase beta chain 597
7335 215150 PLN02266 PLN02266 endoglucanase 510
7336 215151 PLN02267 PLN02267 enoyl-CoA hydratase/isomerase family protein 239
7337 177909 PLN02268 PLN02268 probable polyamine oxidase 435
7338 215152 PLN02269 PLN02269 Pyruvate dehydrogenase E1 component subunit alpha 362
7339 165912 PLN02270 PLN02270 phospholipase D alpha 808
7340 215153 PLN02271 PLN02271 serine hydroxymethyltransferase 586
7341 177912 PLN02272 PLN02272 glyceraldehyde-3-phosphate dehydrogenase 421
7342 215154 PLN02274 PLN02274 inosine-5'-monophosphate dehydrogenase 505
7343 215155 PLN02275 PLN02275 transferase, transferring glycosyl groups 371
7344 215156 PLN02276 PLN02276 gibberellin 20-oxidase 361
7345 177916 PLN02277 PLN02277 H(+)-translocating inorganic pyrophosphatase 730
7346 215157 PLN02278 PLN02278 succinic semialdehyde dehydrogenase 498
7347 177918 PLN02279 PLN02279 ent-kaur-16-ene synthase 784
7348 215158 PLN02280 PLN02280 IAA-amino acid hydrolase 478
7349 177920 PLN02281 PLN02281 chlorophyllide a oxygenase 536
7350 165923 PLN02282 PLN02282 phosphoglycerate kinase 401
7351 177921 PLN02283 PLN02283 alpha-dioxygenase 633
7352 177922 PLN02284 PLN02284 glutamine synthetase 354
7353 215159 PLN02285 PLN02285 methionyl-tRNA formyltransferase 334
7354 215160 PLN02286 PLN02286 arginine-tRNA ligase 576
7355 215161 PLN02287 PLN02287 3-ketoacyl-CoA thiolase 452
7356 215162 PLN02288 PLN02288 mannose-6-phosphate isomerase 394
7357 215163 PLN02289 PLN02289 ribulose-bisphosphate carboxylase small chain 176
7358 215164 PLN02290 PLN02290 cytokinin trans-hydroxylase 516
7359 177928 PLN02291 PLN02291 phospho-2-dehydro-3-deoxyheptonate aldolase 474
7360 215165 PLN02292 PLN02292 ferric-chelate reductase 702
7361 177930 PLN02293 PLN02293 adenine phosphoribosyltransferase 187
7362 177931 PLN02294 PLN02294 cytochrome c oxidase subunit Vb 174
7363 215166 PLN02295 PLN02295 glycerol kinase 512
7364 215167 PLN02296 PLN02296 carbonate dehydratase 269
7365 177934 PLN02297 PLN02297 ribose-phosphate pyrophosphokinase 326
7366 165939 PLN02298 PLN02298 hydrolase, alpha/beta fold family protein 330
7367 215168 PLN02299 PLN02299 1-aminocyclopropane-1-carboxylate oxidase 321
7368 215169 PLN02300 PLN02300 lactoylglutathione lyase 286
7369 215170 PLN02301 PLN02301 pectinesterase/pectinesterase inhibitor 548
7370 215171 PLN02302 PLN02302 ent-kaurenoic acid oxidase 490
7371 215172 PLN02303 PLN02303 urease 837
7372 215173 PLN02304 PLN02304 probable pectinesterase 379
7373 215174 PLN02305 PLN02305 lipoxygenase 918
7374 177941 PLN02306 PLN02306 hydroxypyruvate reductase 386
7375 177942 PLN02307 PLN02307 phosphoglucomutase 579
7376 177943 PLN02308 PLN02308 endoglucanase 492
7377 215175 PLN02309 PLN02309 5'-adenylylsulfate reductase 457
7378 215176 PLN02310 PLN02310 triacylglycerol lipase 405
7379 215177 PLN02311 PLN02311 chalcone isomerase 271
7380 215178 PLN02312 PLN02312 acyl-CoA oxidase 680
7381 177947 PLN02313 PLN02313 Pectinesterase/pectinesterase inhibitor 587
7382 215179 PLN02314 PLN02314 pectinesterase 586
7383 177949 PLN02315 PLN02315 aldehyde dehydrogenase family 7 member 508
7384 215180 PLN02316 PLN02316 synthase/transferase 1036
7385 215181 PLN02317 PLN02317 arogenate dehydratase 382
7386 177952 PLN02318 PLN02318 phosphoribulokinase/uridine kinase 656
7387 177953 PLN02319 PLN02319 aminomethyltransferase 404
7388 177954 PLN02320 PLN02320 seryl-tRNA synthetase 502
7389 215182 PLN02321 PLN02321 2-isopropylmalate synthase 632
7390 177956 PLN02322 PLN02322 acyl-CoA thioesterase 154
7391 215183 PLN02323 PLN02323 probable fructokinase 330
7392 177958 PLN02324 PLN02324 triacylglycerol lipase 415
7393 215184 PLN02325 PLN02325 nudix hydrolase 144
7394 215185 PLN02326 PLN02326 3-oxoacyl-[acyl-carrier-protein] synthase III 379
7395 215186 PLN02327 PLN02327 CTP synthase 557
7396 215187 PLN02328 PLN02328 lysine-specific histone demethylase 1 homolog 808
7397 215188 PLN02329 PLN02329 3-isopropylmalate dehydrogenase 409
7398 215189 PLN02330 PLN02330 4-coumarate--CoA ligase-like 1 546
7399 177965 PLN02331 PLN02331 phosphoribosylglycinamide formyltransferase 207
7400 215190 PLN02332 PLN02332 membrane bound O-acyl transferase (MBOAT) family protein 465
7401 215191 PLN02333 PLN02333 glucose-6-phosphate 1-dehydrogenase 604
7402 215192 PLN02334 PLN02334 ribulose-phosphate 3-epimerase 229
7403 177969 PLN02335 PLN02335 anthranilate synthase 222
7404 177970 PLN02336 PLN02336 phosphoethanolamine N-methyltransferase 475
7405 215193 PLN02337 PLN02337 lipoxygenase 866
7406 177972 PLN02338 PLN02338 3-phosphoshikimate 1-carboxyvinyltransferase 443
7407 177973 PLN02339 PLN02339 NAD+ synthase (glutamine-hydrolysing) 700
7408 215194 PLN02340 PLN02340 endoglucanase 614
7409 215195 PLN02341 PLN02341 pfkB-type carbohydrate kinase family protein 470
7410 177976 PLN02342 PLN02342 ornithine carbamoyltransferase 348
7411 177977 PLN02343 PLN02343 allene oxide cyclase 229
7412 177978 PLN02344 PLN02344 chorismate mutase 284
7413 177979 PLN02345 PLN02345 endoglucanase 469
7414 215196 PLN02346 PLN02346 histidine biosynthesis bifunctional protein hisIE 271
7415 215197 PLN02347 PLN02347 GMP synthetase 536
7416 215198 PLN02348 PLN02348 phosphoribulokinase 395
7417 215199 PLN02349 PLN02349 glycerol-3-phosphate acyltransferase 426
7418 215200 PLN02350 PLN02350 phosphogluconate dehydrogenase (decarboxylating) 493
7419 215201 PLN02351 PLN02351 cytochromes b561 family protein 242
7420 215202 PLN02352 PLN02352 phospholipase D epsilon 758
7421 177986 PLN02353 PLN02353 probable UDP-glucose 6-dehydrogenase 473
7422 177987 PLN02354 PLN02354 copper ion binding / oxidoreductase 552
7423 215203 PLN02355 PLN02355 probable galactinol--sucrose galactosyltransferase 1 758
7424 215204 PLN02356 PLN02356 phosphoglycerate kinase 423
7425 215205 PLN02357 PLN02357 serine acetyltransferase 360
7426 165999 PLN02358 PLN02358 glyceraldehyde-3-phosphate dehydrogenase 338
7427 166000 PLN02359 PLN02359 ethanolaminephosphotransferase 389
7428 166001 PLN02360 PLN02360 probable 6-phosphogluconolactonase 268
7429 177990 PLN02361 PLN02361 alpha-amylase 401
7430 215206 PLN02362 PLN02362 hexokinase 509
7431 215207 PLN02363 PLN02363 phosphoribosylanthranilate isomerase 256
7432 166005 PLN02364 PLN02364 L-ascorbate peroxidase 1 250
7433 177993 PLN02365 PLN02365 2-oxoglutarate-dependent dioxygenase 300
7434 215208 PLN02366 PLN02366 spermidine synthase 308
7435 177995 PLN02367 PLN02367 lactoylglutathione lyase 233
7436 177996 PLN02368 PLN02368 alanine transaminase 407
7437 215209 PLN02369 PLN02369 ribose-phosphate pyrophosphokinase 302
7438 215210 PLN02370 PLN02370 acyl-ACP thioesterase 419
7439 215211 PLN02371 PLN02371 phosphoglucosamine mutase family protein 583
7440 215212 PLN02372 PLN02372 violaxanthin de-epoxidase 455
7441 178001 PLN02373 PLN02373 soluble inorganic pyrophosphatase 188
7442 215213 PLN02374 PLN02374 pyruvate dehydrogenase (acetyl-transferring) 433
7443 178003 PLN02375 PLN02375 molybdopterin biosynthesis protein CNX3 270
7444 178004 PLN02376 PLN02376 1-aminocyclopropane-1-carboxylate synthase 496
7445 166018 PLN02377 PLN02377 3-ketoacyl-CoA synthase 502
7446 166019 PLN02378 PLN02378 glutathione S-transferase DHAR1 213
7447 178005 PLN02379 PLN02379 pfkB-type carbohydrate kinase family protein 367
7448 178006 PLN02380 PLN02380 1-acyl-sn-glycerol-3-phosphate acyltransferase 376
7449 215214 PLN02381 PLN02381 valyl-tRNA synthetase 1066
7450 178008 PLN02382 PLN02382 probable sucrose-phosphatase 413
7451 178009 PLN02383 PLN02383 aspartate semialdehyde dehydrogenase 344
7452 215215 PLN02384 PLN02384 ribose-5-phosphate isomerase 264
7453 215216 PLN02385 PLN02385 hydrolase; alpha/beta fold family protein 349
7454 166027 PLN02386 PLN02386 superoxide dismutase [Cu-Zn] 152
7455 215217 PLN02387 PLN02387 long-chain-fatty-acid-CoA ligase family protein 696
7456 215218 PLN02388 PLN02388 phosphopantetheine adenylyltransferase 177
7457 215219 PLN02389 PLN02389 biotin synthase 379
7458 178014 PLN02390 PLN02390 molybdopterin synthase catalytic subunit 111
7459 178015 PLN02392 PLN02392 probable steroid reductase DET2 260
7460 215220 PLN02393 PLN02393 leucoanthocyanidin dioxygenase like protein 362
7461 215221 PLN02394 PLN02394 trans-cinnamate 4-monooxygenase 503
7462 166036 PLN02395 PLN02395 glutathione S-transferase 215
7463 178018 PLN02396 PLN02396 hexaprenyldihydroxybenzoate methyltransferase 322
7464 215222 PLN02397 PLN02397 aspartate transaminase 423
7465 215223 PLN02398 PLN02398 hydroxyacylglutathione hydrolase 329
7466 178021 PLN02399 PLN02399 phospholipid hydroperoxide glutathione peroxidase 236
7467 215224 PLN02400 PLN02400 cellulose synthase 1085
7468 215225 PLN02401 PLN02401 diacylglycerol o-acyltransferase 446
7469 178024 PLN02402 PLN02402 cytidine deaminase 303
7470 178025 PLN02403 PLN02403 aminocyclopropanecarboxylate oxidase 303
7471 178026 PLN02404 PLN02404 6,7-dimethyl-8-ribityllumazine synthase 141
7472 215226 PLN02405 PLN02405 hexokinase 497
7473 215227 PLN02406 PLN02406 ethanolamine-phosphate cytidylyltransferase 418
7474 178029 PLN02407 PLN02407 diphosphomevalonate decarboxylase 343
7475 215228 PLN02408 PLN02408 phospholipase A1 365
7476 178031 PLN02409 PLN02409 serine--glyoxylate aminotransaminase 401
7477 178032 PLN02410 PLN02410 UDP-glucuronosyl/UDP-glucosyl transferase family protein 451
7478 178033 PLN02411 PLN02411 12-oxophytodienoate reductase 391
7479 166053 PLN02412 PLN02412 probable glutathione peroxidase 167
7480 215229 PLN02413 PLN02413 choline-phosphate cytidylyltransferase 294
7481 178035 PLN02414 PLN02414 glycine dehydrogenase (decarboxylating) 993
7482 178036 PLN02415 PLN02415 uricase 304
7483 178037 PLN02416 PLN02416 probable pectinesterase/pectinesterase inhibitor 541
7484 178038 PLN02417 PLN02417 dihydrodipicolinate synthase 280
7485 215230 PLN02418 PLN02418 delta-1-pyrroline-5-carboxylate synthase 718
7486 166060 PLN02419 PLN02419 methylmalonate-semialdehyde dehydrogenase [acylating] 604
7487 178040 PLN02420 PLN02420 endoglucanase 525
7488 215231 PLN02421 PLN02421 phosphotransferase, alcohol group as acceptor/kinase 330
7489 215232 PLN02422 PLN02422 dephospho-CoA kinase 232
7490 178043 PLN02423 PLN02423 phosphomannomutase 245
7491 215233 PLN02424 PLN02424 ketopantoate hydroxymethyltransferase 332
7492 215234 PLN02425 PLN02425 probable fructose-bisphosphate aldolase 390
7493 215235 PLN02426 PLN02426 cytochrome P450, family 94, subfamily C protein 502
7494 178047 PLN02427 PLN02427 UDP-apiose/xylose synthase 386
7495 215236 PLN02428 PLN02428 lipoic acid synthase 349
7496 166070 PLN02429 PLN02429 triosephosphate isomerase 315
7497 178049 PLN02430 PLN02430 long-chain-fatty-acid-CoA ligase 660
7498 178050 PLN02431 PLN02431 ferredoxin--nitrite reductase 587
7499 178051 PLN02432 PLN02432 putative pectinesterase 293
7500 215237 PLN02433 PLN02433 uroporphyrinogen decarboxylase 345
7501 178053 PLN02434 PLN02434 fatty acid hydroxylase 237
7502 215238 PLN02435 PLN02435 probable UDP-N-acetylglucosamine pyrophosphorylase 493
7503 215239 PLN02436 PLN02436 cellulose synthase A 1094
7504 178056 PLN02437 PLN02437 ribonucleoside--diphosphate reductase large subunit 813
7505 178057 PLN02438 PLN02438 inositol-3-phosphate synthase 510
7506 215240 PLN02439 PLN02439 arginine decarboxylase 559
7507 215241 PLN02440 PLN02440 amidophosphoribosyltransferase 479
7508 215242 PLN02441 PLN02441 cytokinin dehydrogenase 525
7509 178061 PLN02442 PLN02442 S-formylglutathione hydrolase 283
7510 178062 PLN02443 PLN02443 acyl-coenzyme A oxidase 664
7511 215243 PLN02444 PLN02444 HMP-P synthase 642
7512 215244 PLN02445 PLN02445 anthranilate synthase component I 523
7513 215245 PLN02446 PLN02446 (5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase 262
7514 215246 PLN02447 PLN02447 1,4-alpha-glucan-branching enzyme 758
7515 215247 PLN02448 PLN02448 UDP-glycosyltransferase family protein 459
7516 178068 PLN02449 PLN02449 ferrochelatase 485
7517 178069 PLN02450 PLN02450 1-aminocyclopropane-1-carboxylate synthase 468
7518 215248 PLN02451 PLN02451 homoserine kinase 370
7519 178071 PLN02452 PLN02452 phosphoserine transaminase 365
7520 178072 PLN02453 PLN02453 complex I subunit 105
7521 215249 PLN02454 PLN02454 triacylglycerol lipase 414
7522 178074 PLN02455 PLN02455 fructose-bisphosphate aldolase 358
7523 215250 PLN02456 PLN02456 citrate synthase 455
7524 215251 PLN02457 PLN02457 phenylalanine ammonia-lyase 706
7525 215252 PLN02458 PLN02458 transferase, transferring glycosyl groups 346
7526 215253 PLN02459 PLN02459 probable adenylate kinase 261
7527 215254 PLN02460 PLN02460 indole-3-glycerol-phosphate synthase 338
7528 215255 PLN02461 PLN02461 Probable pyruvate kinase 511
7529 215256 PLN02462 PLN02462 sedoheptulose-1,7-bisphosphatase 304
7530 178082 PLN02463 PLN02463 lycopene beta cyclase 447
7531 215257 PLN02464 PLN02464 glycerol-3-phosphate dehydrogenase 627
7532 215258 PLN02465 PLN02465 L-galactono-1,4-lactone dehydrogenase 573
7533 215259 PLN02466 PLN02466 aldehyde dehydrogenase family 2 member 538
7534 215260 PLN02467 PLN02467 betaine aldehyde dehydrogenase 503
7535 178087 PLN02468 PLN02468 putative pectinesterase/pectinesterase inhibitor 565
7536 178088 PLN02469 PLN02469 hydroxyacylglutathione hydrolase 258
7537 215261 PLN02470 PLN02470 acetolactate synthase 585
7538 215262 PLN02471 PLN02471 superoxide dismutase [Mn] 231
7539 215263 PLN02472 PLN02472 uncharacterized protein 246
7540 166114 PLN02473 PLN02473 glutathione S-transferase 214
7541 178092 PLN02474 PLN02474 UTP--glucose-1-phosphate uridylyltransferase 469
7542 215264 PLN02475 PLN02475 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase 766
7543 178094 PLN02476 PLN02476 O-methyltransferase 278
7544 178095 PLN02477 PLN02477 glutamate dehydrogenase 410
7545 215265 PLN02478 PLN02478 alternative oxidase 328
7546 178097 PLN02479 PLN02479 acetate-CoA ligase 567
7547 178098 PLN02480 PLN02480 Probable pectinesterase 343
7548 215266 PLN02481 PLN02481 Omega-hydroxypalmitate O-feruloyl transferase 436
7549 178100 PLN02482 PLN02482 glutamate-1-semialdehyde 2,1-aminomutase 474
7550 178101 PLN02483 PLN02483 serine palmitoyltransferase 489
7551 178102 PLN02484 PLN02484 probable pectinesterase/pectinesterase inhibitor 587
7552 215267 PLN02485 PLN02485 oxidoreductase 329
7553 178104 PLN02486 PLN02486 aminoacyl-tRNA ligase 383
7554 215268 PLN02487 PLN02487 zeta-carotene desaturase 569
7555 178106 PLN02488 PLN02488 probable pectinesterase/pectinesterase inhibitor 509
7556 215269 PLN02489 PLN02489 homocysteine S-methyltransferase 335
7557 215270 PLN02490 PLN02490 MPBQ/MSBQ methyltransferase 340
7558 215271 PLN02491 PLN02491 carotenoid 9,10(9',10')-cleavage dioxygenase 545
7559 215272 PLN02492 PLN02492 ribonucleoside-diphosphate reductase 324
7560 166134 PLN02493 PLN02493 probable peroxisomal (S)-2-hydroxy-acid oxidase 367
7561 178111 PLN02494 PLN02494 adenosylhomocysteinase 477
7562 215273 PLN02495 PLN02495 oxidoreductase, acting on the CH-CH group of donors 385
7563 215274 PLN02496 PLN02496 probable phosphopantothenoylcysteine decarboxylase 209
7564 178113 PLN02497 PLN02497 probable pectinesterase 331
7565 215275 PLN02498 PLN02498 omega-3 fatty acid desaturase 450
7566 178115 PLN02499 PLN02499 glycerol-3-phosphate acyltransferase 498
7567 215276 PLN02500 PLN02500 cytochrome P450 90B1 490
7568 215277 PLN02501 PLN02501 digalactosyldiacylglycerol synthase 794
7569 215278 PLN02502 PLN02502 lysyl-tRNA synthetase 553
7570 215279 PLN02503 PLN02503 fatty acyl-CoA reductase 2 605
7571 178120 PLN02504 PLN02504 nitrilase 346
7572 178121 PLN02505 PLN02505 omega-6 fatty acid desaturase 381
7573 215280 PLN02506 PLN02506 putative pectinesterase/pectinesterase inhibitor 537
7574 215281 PLN02507 PLN02507 glutathione reductase 499
7575 178124 PLN02508 PLN02508 magnesium-protoporphyrin IX monomethyl ester [oxidative] cyclase 357
7576 178125 PLN02509 PLN02509 cystathionine beta-lyase 464
7577 178126 PLN02510 PLN02510 probable 1-acyl-sn-glycerol-3-phosphate acyltransferase 374
7578 215282 PLN02511 PLN02511 hydrolase 388
7579 178128 PLN02512 PLN02512 acetylglutamate kinase 309
7580 178129 PLN02513 PLN02513 adenylosuccinate synthase 427
7581 166155 PLN02514 PLN02514 cinnamyl-alcohol dehydrogenase 357
7582 178130 PLN02515 PLN02515 naringenin,2-oxoglutarate 3-dioxygenase 358
7583 178131 PLN02516 PLN02516 methylenetetrahydrofolate dehydrogenase (NADP+) 299
7584 178132 PLN02517 PLN02517 phosphatidylcholine-sterol O-acyltransferase 642
7585 215283 PLN02518 PLN02518 pheophorbide a oxygenase 539
7586 215284 PLN02519 PLN02519 isovaleryl-CoA dehydrogenase 404
7587 178135 PLN02520 PLN02520 bifunctional 3-dehydroquinate dehydratase/shikimate dehydrogenase 529
7588 215285 PLN02521 PLN02521 galactokinase 497
7589 178137 PLN02522 PLN02522 ATP citrate (pro-S)-lyase 608
7590 215286 PLN02523 PLN02523 galacturonosyltransferase 559
7591 215287 PLN02524 PLN02524 S-adenosylmethionine decarboxylase 355
7592 215288 PLN02525 PLN02525 phosphatidic acid phosphatase family protein 352
7593 178141 PLN02526 PLN02526 acyl-coenzyme A oxidase 412
7594 178142 PLN02527 PLN02527 aspartate carbamoyltransferase 306
7595 215289 PLN02528 PLN02528 2-oxoisovalerate dehydrogenase E2 component 416
7596 178144 PLN02529 PLN02529 lysine-specific histone demethylase 1 738
7597 178145 PLN02530 PLN02530 histidine-tRNA ligase 487
7598 215290 PLN02531 PLN02531 GTP cyclohydrolase I 469
7599 215291 PLN02532 PLN02532 asparagine-tRNA synthetase 633
7600 215292 PLN02533 PLN02533 probable purple acid phosphatase 427
7601 215293 PLN02534 PLN02534 UDP-glycosyltransferase 491
7602 215294 PLN02535 PLN02535 glycolate oxidase 364
7603 178151 PLN02536 PLN02536 diaminopimelate epimerase 267
7604 178152 PLN02537 PLN02537 diaminopimelate decarboxylase 410
7605 215295 PLN02538 PLN02538 2,3-bisphosphoglycerate-independent phosphoglycerate mutase 558
7606 178154 PLN02539 PLN02539 glucose-6-phosphate 1-dehydrogenase 491
7607 215296 PLN02540 PLN02540 methylenetetrahydrofolate reductase 565
7608 215297 PLN02541 PLN02541 uracil phosphoribosyltransferase 244
7609 215298 PLN02542 PLN02542 fructose-1,6-bisphosphatase 412
7610 215299 PLN02543 PLN02543 pfkB-type carbohydrate kinase family protein 496
7611 178159 PLN02544 PLN02544 phosphoribosylaminoimidazole-succinocarboxamide synthase 370
7612 215300 PLN02545 PLN02545 3-hydroxybutyryl-CoA dehydrogenase 295
7613 215301 PLN02546 PLN02546 glutathione reductase 558
7614 215302 PLN02547 PLN02547 dUTP pyrophosphatase 157
7615 178163 PLN02548 PLN02548 adenosine kinase 332
7616 178164 PLN02549 PLN02549 asparagine synthase (glutamine-hydrolyzing) 578
7617 178165 PLN02550 PLN02550 threonine dehydratase 591
7618 178166 PLN02551 PLN02551 aspartokinase 521
7619 215303 PLN02552 PLN02552 isopentenyl-diphosphate delta-isomerase 247
7620 178168 PLN02553 PLN02553 inositol-phosphate phosphatase 270
7621 215304 PLN02554 PLN02554 UDP-glycosyltransferase family protein 481
7622 178170 PLN02555 PLN02555 limonoid glucosyltransferase 480
7623 178171 PLN02556 PLN02556 cysteine synthase/L-3-cyanoalanine synthase 368
7624 178172 PLN02557 PLN02557 phosphoribosylformylglycinamidine cyclo-ligase 379
7625 166199 PLN02558 PLN02558 CDP-diacylglycerol-glycerol-3-phosphate/ 3-phosphatidyltransferase 203
7626 178173 PLN02559 PLN02559 chalcone--flavonone isomerase 230
7627 178174 PLN02560 PLN02560 enoyl-CoA reductase 308
7628 178175 PLN02561 PLN02561 triosephosphate isomerase 253
7629 215305 PLN02562 PLN02562 UDP-glycosyltransferase 448
7630 178177 PLN02563 PLN02563 aminoacyl-tRNA ligase 963
7631 178178 PLN02564 PLN02564 6-phosphofructokinase 484
7632 166206 PLN02565 PLN02565 cysteine synthase 322
7633 215306 PLN02566 PLN02566 amine oxidase (copper-containing) 646
7634 215307 PLN02567 PLN02567 alpha,alpha-trehalase 554
7635 215308 PLN02568 PLN02568 polyamine oxidase 539
7636 178182 PLN02569 PLN02569 threonine synthase 484
7637 215309 PLN02571 PLN02571 triacylglycerol lipase 413
7638 215310 PLN02572 PLN02572 UDP-sulfoquinovose synthase 442
7639 215311 PLN02573 PLN02573 pyruvate decarboxylase 578
7640 215312 PLN02574 PLN02574 4-coumarate--CoA ligase-like 560
7641 215313 PLN02575 PLN02575 haloacid dehalogenase-like hydrolase 381
7642 215314 PLN02576 PLN02576 protoporphyrinogen oxidase 496
7643 178189 PLN02577 PLN02577 hydroxymethylglutaryl-CoA synthase 459
7644 215315 PLN02578 PLN02578 hydrolase 354
7645 215316 PLN02579 PLN02579 sphingolipid delta-4 desaturase 323
7646 215317 PLN02580 PLN02580 trehalose-phosphatase 384
7647 215318 PLN02581 PLN02581 red chlorophyll catabolite reductase 267
7648 178194 PLN02582 PLN02582 1-deoxy-D-xylulose-5-phosphate synthase 677
7649 178195 PLN02583 PLN02583 cinnamoyl-CoA reductase 297
7650 178196 PLN02584 PLN02584 5'-methylthioadenosine nucleosidase 249
7651 215319 PLN02585 PLN02585 magnesium protoporphyrin IX methyltransferase 315
7652 166227 PLN02586 PLN02586 probable cinnamyl alcohol dehydrogenase 360
7653 178198 PLN02587 PLN02587 L-galactose dehydrogenase 314
7654 215320 PLN02588 PLN02588 glycerol-3-phosphate acyltransferase 525
7655 166230 PLN02589 PLN02589 caffeoyl-CoA O-methyltransferase 247
7656 178200 PLN02590 PLN02590 probable tyrosine decarboxylase 539
7657 178201 PLN02591 PLN02591 tryptophan synthase 250
7658 215321 PLN02592 PLN02592 ent-copalyl diphosphate synthase 800
7659 178203 PLN02593 PLN02593 adrenodoxin-like ferredoxin protein 117
7660 215322 PLN02594 PLN02594 phosphatidate cytidylyltransferase 342
7661 178205 PLN02595 PLN02595 cytochrome c oxidase subunit VI protein 102
7662 178206 PLN02596 PLN02596 hexokinase-like 490
7663 178207 PLN02597 PLN02597 phosphoenolpyruvate carboxykinase [ATP] 555
7664 215323 PLN02598 PLN02598 omega-6 fatty acid desaturase 421
7665 178209 PLN02599 PLN02599 dihydroorotase 364
7666 178210 PLN02600 PLN02600 enoyl-CoA hydratase 251
7667 178211 PLN02601 PLN02601 beta-carotene hydroxylase 303
7668 178212 PLN02602 PLN02602 lactate dehydrogenase 350
7669 178213 PLN02603 PLN02603 asparaginyl-tRNA synthetase 565
7670 215324 PLN02604 PLN02604 oxidoreductase 566
7671 215325 PLN02605 PLN02605 monogalactosyldiacylglycerol synthase 382
7672 215326 PLN02606 PLN02606 palmitoyl-protein thioesterase 306
7673 215327 PLN02607 PLN02607 1-aminocyclopropane-1-carboxylate synthase 447
7674 178218 PLN02608 PLN02608 L-ascorbate peroxidase 289
7675 215328 PLN02609 PLN02609 catalase 492
7676 215329 PLN02610 PLN02610 probable methionyl-tRNA synthetase 801
7677 178221 PLN02611 PLN02611 glutamate--cysteine ligase 482
7678 215330 PLN02612 PLN02612 phytoene desaturase 567
7679 215331 PLN02613 PLN02613 endoglucanase 498
7680 166255 PLN02614 PLN02614 long-chain acyl-CoA synthetase 666
7681 178224 PLN02615 PLN02615 arginase 338
7682 215332 PLN02616 PLN02616 tetrahydrofolate dehydrogenase/cyclohydrolase, putative 364
7683 178226 PLN02617 PLN02617 imidazole glycerol phosphate synthase hisHF 538
7684 215333 PLN02618 PLN02618 tryptophan synthase, beta chain 410
7685 178228 PLN02619 PLN02619 nucleoside-diphosphate kinase 238
7686 166261 PLN02620 PLN02620 indole-3-acetic acid-amido synthetase 612
7687 178229 PLN02621 PLN02621 nicotinamidase 197
7688 166263 PLN02622 PLN02622 iron superoxide dismutase 261
7689 215334 PLN02623 PLN02623 pyruvate kinase 581
7690 215335 PLN02624 PLN02624 ornithine-delta-aminotransferase 474
7691 178232 PLN02625 PLN02625 uroporphyrin-III C-methyltransferase 263
7692 215336 PLN02626 PLN02626 malate synthase 551
7693 178234 PLN02627 PLN02627 glutamyl-tRNA synthetase 535
7694 215337 PLN02628 PLN02628 fructose-1,6-bisphosphatase family protein 351
7695 215338 PLN02629 PLN02629 powdery mildew resistance 5 387
7696 178237 PLN02630 PLN02630 pfkB-type carbohydrate kinase family protein 335
7697 178238 PLN02631 PLN02631 ferric-chelate reductase 699
7698 215339 PLN02632 PLN02632 phytoene synthase 334
7699 178240 PLN02633 PLN02633 palmitoyl protein thioesterase family protein 314
7700 215340 PLN02634 PLN02634 probable pectinesterase 359
7701 215341 PLN02635 PLN02635 disproportionating enzyme 538
7702 215342 PLN02636 PLN02636 acyl-coenzyme A oxidase 686
7703 215343 PLN02638 PLN02638 cellulose synthase A (UDP-forming), catalytic subunit 1079
7704 178245 PLN02639 PLN02639 oxidoreductase, 2OG-Fe(II) oxygenase family protein 337
7705 215344 PLN02640 PLN02640 glucose-6-phosphate 1-dehydrogenase 573
7706 215345 PLN02641 PLN02641 anthranilate phosphoribosyltransferase 343
7707 178248 PLN02642 PLN02642 copper, zinc superoxide dismutase 164
7708 215346 PLN02643 PLN02643 ADP-glucose phosphorylase 336
7709 215347 PLN02644 PLN02644 acetyl-CoA C-acetyltransferase 394
7710 178251 PLN02645 PLN02645 phosphoglycolate phosphatase 311
7711 215348 PLN02646 PLN02646 argininosuccinate lyase 474
7712 215349 PLN02647 PLN02647 acyl-CoA thioesterase 437
7713 215350 PLN02648 PLN02648 allene oxide synthase 480
7714 215351 PLN02649 PLN02649 glucose-6-phosphate isomerase 560
7715 178256 PLN02650 PLN02650 dihydroflavonol-4-reductase 351
7716 178257 PLN02651 PLN02651 cysteine desulfurase 364
7717 215352 PLN02652 PLN02652 hydrolase; alpha/beta fold family protein 395
7718 178259 PLN02653 PLN02653 GDP-mannose 4,6-dehydratase 340
7719 215353 PLN02654 PLN02654 acetate-CoA ligase 666
7720 215354 PLN02655 PLN02655 ent-kaurene oxidase 466
7721 178262 PLN02656 PLN02656 tyrosine transaminase 409
7722 178263 PLN02657 PLN02657 3,8-divinyl protochlorophyllide a 8-vinyl reductase 390
7723 215355 PLN02658 PLN02658 homogentisate 1,2-dioxygenase 435
7724 215356 PLN02659 PLN02659 Probable galacturonosyltransferase 534
7725 178266 PLN02660 PLN02660 pantoate--beta-alanine ligase 284
7726 178267 PLN02661 PLN02661 Putative thiazole synthesis 357
7727 178268 PLN02662 PLN02662 cinnamyl-alcohol dehydrogenase family protein 322
7728 166304 PLN02663 PLN02663 hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase 431
7729 178269 PLN02664 PLN02664 enoyl-CoA hydratase/delta3,5-delta2,4-dienoyl-CoA isomerase 275
7730 215357 PLN02665 PLN02665 pectinesterase family protein 366
7731 215358 PLN02666 PLN02666 5-oxoprolinase 1275
7732 215359 PLN02667 PLN02667 inositol polyphosphate multikinase 286
7733 178273 PLN02668 PLN02668 indole-3-acetate carboxyl methyltransferase 386
7734 178274 PLN02669 PLN02669 xylulokinase 556
7735 178275 PLN02670 PLN02670 transferase, transferring glycosyl groups 472
7736 178276 PLN02671 PLN02671 pectinesterase 359
7737 215360 PLN02672 PLN02672 methionine S-methyltransferase 1082
7738 215361 PLN02673 PLN02673 quinolinate synthetase A 724
7739 178279 PLN02674 PLN02674 adenylate kinase 244
7740 215362 PLN02676 PLN02676 polyamine oxidase 487
7741 215363 PLN02677 PLN02677 mevalonate kinase 387
7742 215364 PLN02678 PLN02678 seryl-tRNA synthetase 448
7743 178283 PLN02679 PLN02679 hydrolase, alpha/beta fold family protein 360
7744 215365 PLN02680 PLN02680 carbon-monoxide oxygenase 232
7745 215366 PLN02681 PLN02681 proline dehydrogenase 455
7746 215367 PLN02682 PLN02682 pectinesterase family protein 369
7747 215368 PLN02683 PLN02683 pyruvate dehydrogenase E1 component subunit beta 356
7748 166325 PLN02684 PLN02684 Probable galactinol--sucrose galactosyltransferase 750
7749 215369 PLN02685 PLN02685 iron superoxide dismutase 299
7750 215370 PLN02686 PLN02686 cinnamoyl-CoA reductase 367
7751 215371 PLN02687 PLN02687 flavonoid 3'-monooxygenase 517
7752 178291 PLN02688 PLN02688 pyrroline-5-carboxylate reductase 266
7753 215372 PLN02689 PLN02689 Bifunctional isoaspartyl peptidase/L-asparaginase 318
7754 178293 PLN02690 PLN02690 Agmatine deiminase 374
7755 215373 PLN02691 PLN02691 porphobilinogen deaminase 351
7756 178295 PLN02692 PLN02692 alpha-galactosidase 412
7757 178296 PLN02693 PLN02693 IAA-amino acid hydrolase 437
7758 178297 PLN02694 PLN02694 serine O-acetyltransferase 294
7759 178298 PLN02695 PLN02695 GDP-D-mannose-3',5'-epimerase 370
7760 215374 PLN02696 PLN02696 1-deoxy-D-xylulose-5-phosphate reductoisomerase 454
7761 215375 PLN02697 PLN02697 lycopene epsilon cyclase 529
7762 178301 PLN02698 PLN02698 Probable pectinesterase/pectinesterase inhibitor 497
7763 215376 PLN02699 PLN02699 Bifunctional molybdopterin adenylyltransferase/molybdopterin molybdenumtransferase 659
7764 215377 PLN02700 PLN02700 homoserine dehydrogenase family protein 377
7765 178304 PLN02701 PLN02701 alpha-mannosidase 1050
7766 215378 PLN02702 PLN02702 L-idonate 5-dehydrogenase 364
7767 178306 PLN02703 PLN02703 beta-fructofuranosidase 618
7768 166345 PLN02704 PLN02704 flavonol synthase 335
7769 178307 PLN02705 PLN02705 beta-amylase 681
7770 178308 PLN02706 PLN02706 glucosamine 6-phosphate N-acetyltransferase 150
7771 178309 PLN02707 PLN02707 Soluble inorganic pyrophosphatase 267
7772 215379 PLN02708 PLN02708 Probable pectinesterase/pectinesterase inhibitor 553
7773 178311 PLN02709 PLN02709 nudix hydrolase 222
7774 215380 PLN02710 PLN02710 farnesyltranstransferase subunit beta 439
7775 215381 PLN02711 PLN02711 Probable galactinol--sucrose galactosyltransferase 777
7776 215382 PLN02712 PLN02712 arogenate dehydrogenase 667
7777 215383 PLN02713 PLN02713 Probable pectinesterase/pectinesterase inhibitor 566
7778 178316 PLN02714 PLN02714 thiamin pyrophosphokinase 229
7779 178317 PLN02715 PLN02715 lipid phosphate phosphatase 327
7780 178318 PLN02716 PLN02716 nicotinate-nucleotide diphosphorylase (carboxylating) 308
7781 178319 PLN02717 PLN02717 uridine nucleosidase 316
7782 178320 PLN02718 PLN02718 Probable galacturonosyltransferase 603
7783 178321 PLN02719 PLN02719 triacylglycerol lipase 518
7784 178322 PLN02720 PLN02720 complex II 140
7785 178323 PLN02721 PLN02721 threonine aldolase 353
7786 166363 PLN02722 PLN02722 indole-3-acetamide amidohydrolase 422
7787 178324 PLN02723 PLN02723 3-mercaptopyruvate sulfurtransferase 320
7788 215384 PLN02724 PLN02724 Molybdenum cofactor sulfurase 805
7789 178326 PLN02725 PLN02725 GDP-4-keto-6-deoxymannose-3,5-epimerase-4-reductase 306
7790 215385 PLN02726 PLN02726 dolichyl-phosphate beta-D-mannosyltransferase 243
7791 215386 PLN02727 PLN02727 NAD kinase 986
7792 215387 PLN02728 PLN02728 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase 252
7793 215388 PLN02729 PLN02729 PSII-Q subunit 220
7794 178331 PLN02730 PLN02730 enoyl-[acyl-carrier-protein] reductase 303
7795 178332 PLN02731 PLN02731 Putative lipid phosphate phosphatase 333
7796 215389 PLN02732 PLN02732 Probable NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 159
7797 215390 PLN02733 PLN02733 phosphatidylcholine-sterol O-acyltransferase 440
7798 178335 PLN02734 PLN02734 glycyl-tRNA synthetase 684
7799 215391 PLN02735 PLN02735 carbamoyl-phosphate synthase 1102
7800 178337 PLN02736 PLN02736 long-chain acyl-CoA synthetase 651
7801 215392 PLN02737 PLN02737 inositol monophosphatase family protein 363
7802 215393 PLN02738 PLN02738 carotene beta-ring hydroxylase 633
7803 215394 PLN02739 PLN02739 serine acetyltransferase 355
7804 178341 PLN02740 PLN02740 Alcohol dehydrogenase-like 381
7805 178342 PLN02741 PLN02741 riboflavin synthase 194
7806 215395 PLN02742 PLN02742 Probable galacturonosyltransferase 534
7807 215396 PLN02743 PLN02743 nicotinamidase 239
7808 215397 PLN02744 PLN02744 dihydrolipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase complex 539
7809 178346 PLN02745 PLN02745 Putative pectinesterase/pectinesterase inhibitor 596
7810 178347 PLN02746 PLN02746 hydroxymethylglutaryl-CoA lyase 347
7811 215398 PLN02747 PLN02747 N-carbamoylputrescine amidase 296
7812 215399 PLN02748 PLN02748 tRNA dimethylallyltransferase 468
7813 178350 PLN02749 PLN02749 Uncharacterized protein At1g47420 173
7814 178351 PLN02750 PLN02750 oxidoreductase, 2OG-Fe(II) oxygenase family protein 345
7815 215400 PLN02751 PLN02751 glutamyl-tRNA(Gln) amidotransferase 544
7816 215401 PLN02752 PLN02752 [acyl-carrier protein] S-malonyltransferase 343
7817 178354 PLN02753 PLN02753 triacylglycerol lipase 531
7818 215402 PLN02754 PLN02754 chorismate synthase 413
7819 178356 PLN02755 PLN02755 complex I subunit 71
7820 166397 PLN02756 PLN02756 S-methyl-5-thioribose kinase 418
7821 215403 PLN02757 PLN02757 sirohydrochlorin ferrochelatase 154
7822 215404 PLN02758 PLN02758 oxidoreductase, 2OG-Fe(II) oxygenase family protein 361
7823 178359 PLN02759 PLN02759 Formate--tetrahydrofolate ligase 637
7824 215405 PLN02760 PLN02760 4-aminobutyrate:pyruvate transaminase 504
7825 215406 PLN02761 PLN02761 lipase class 3 family protein 527
7826 215407 PLN02762 PLN02762 pyruvate kinase complex alpha subunit 509
7827 215408 PLN02763 PLN02763 hydrolase, hydrolyzing O-glycosyl compounds 978
7828 178364 PLN02764 PLN02764 glycosyltransferase family protein 453
7829 215409 PLN02765 PLN02765 pyruvate kinase 526
7830 215410 PLN02766 PLN02766 coniferyl-aldehyde dehydrogenase 501
7831 215411 PLN02768 PLN02768 AMP deaminase 835
7832 215412 PLN02769 PLN02769 Probable galacturonosyltransferase 629
7833 215413 PLN02770 PLN02770 haloacid dehalogenase-like hydrolase family protein 248
7834 178370 PLN02771 PLN02771 carbamoyl-phosphate synthase (glutamine-hydrolyzing) 415
7835 215414 PLN02772 PLN02772 guanylate kinase 398
7836 178372 PLN02773 PLN02773 pectinesterase 317
7837 178373 PLN02774 PLN02774 brassinosteroid-6-oxidase 463
7838 178374 PLN02775 PLN02775 Probable dihydrodipicolinate reductase 286
7839 215415 PLN02776 PLN02776 prenyltransferase 341
7840 178376 PLN02777 PLN02777 photosystem I P subunit (PSI-P) 167
7841 178377 PLN02778 PLN02778 3,5-epimerase/4-reductase 298
7842 215416 PLN02779 PLN02779 haloacid dehalogenase-like hydrolase family protein 286
7843 166421 PLN02780 PLN02780 ketoreductase/ oxidoreductase 320
7844 215417 PLN02781 PLN02781 Probable caffeoyl-CoA O-methyltransferase 234
7845 215418 PLN02782 PLN02782 Branched-chain amino acid aminotransferase 403
7846 178380 PLN02783 PLN02783 diacylglycerol O-acyltransferase 315
7847 215419 PLN02784 PLN02784 alpha-amylase 894
7848 215420 PLN02785 PLN02785 Protein HOTHEAD 587
7849 178383 PLN02786 PLN02786 isochorismate synthase 533
7850 215421 PLN02787 PLN02787 3-oxoacyl-[acyl-carrier-protein] synthase II 540
7851 215422 PLN02788 PLN02788 phenylalanine-tRNA synthetase 402
7852 215423 PLN02789 PLN02789 farnesyltranstransferase 320
7853 215424 PLN02790 PLN02790 transketolase 654
7854 215425 PLN02791 PLN02791 Nudix hydrolase homolog 770
7855 178389 PLN02792 PLN02792 oxidoreductase 536
7856 215426 PLN02793 PLN02793 Probable polygalacturonase 443
7857 178391 PLN02794 PLN02794 cardiolipin synthase 341
7858 178392 PLN02795 PLN02795 allantoinase 505
7859 215427 PLN02796 PLN02796 D-glycerate 3-kinase 347
7860 178394 PLN02797 PLN02797 phosphatidyl-N-dimethylethanolamine N-methyltransferase 164
7861 215428 PLN02798 PLN02798 nitrilase 286
7862 215429 PLN02799 PLN02799 Molybdopterin synthase sulfur carrier subunit 82
7863 215430 PLN02800 PLN02800 imidazoleglycerol-phosphate dehydratase 261
7864 215431 PLN02801 PLN02801 beta-amylase 517
7865 215432 PLN02802 PLN02802 triacylglycerol lipase 509
7866 178400 PLN02803 PLN02803 beta-amylase 548
7867 178401 PLN02804 PLN02804 chalcone isomerase 206
7868 178402 PLN02805 PLN02805 D-lactate dehydrogenase [cytochrome] 555
7869 178403 PLN02806 PLN02806 complex I subunit 81
7870 215433 PLN02807 PLN02807 diaminohydroxyphosphoribosylaminopyrimidine deaminase 380
7871 166449 PLN02808 PLN02808 alpha-galactosidase 386
7872 178405 PLN02809 PLN02809 4-hydroxybenzoate nonaprenyltransferase 289
7873 178406 PLN02810 PLN02810 carbon-monoxide oxygenase 231
7874 178407 PLN02811 PLN02811 hydrolase 220
7875 178408 PLN02812 PLN02812 5-formyltetrahydrofolate cyclo-ligase 211
7876 215434 PLN02813 PLN02813 pfkB-type carbohydrate kinase family protein 426
7877 215435 PLN02814 PLN02814 beta-glucosidase 504
7878 215436 PLN02815 PLN02815 L-aspartate oxidase 594
7879 215437 PLN02816 PLN02816 mannosyltransferase 546
7880 166458 PLN02817 PLN02817 glutathione dehydrogenase (ascorbate) 265
7881 215438 PLN02818 PLN02818 tocopherol cyclase 403
7882 215439 PLN02819 PLN02819 lysine-ketoglutarate reductase/saccharopine dehydrogenase 1042
7883 178415 PLN02820 PLN02820 3-methylcrotonyl-CoA carboxylase, beta chain 569
7884 215440 PLN02821 PLN02821 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase 460
7885 178417 PLN02822 PLN02822 serine palmitoyltransferase 481
7886 178418 PLN02823 PLN02823 spermine synthase 336
7887 178419 PLN02824 PLN02824 hydrolase, alpha/beta fold family protein 294
7888 215441 PLN02825 PLN02825 amino-acid N-acetyltransferase 515
7889 178421 PLN02826 PLN02826 dihydroorotate dehydrogenase 409
7890 215442 PLN02827 PLN02827 Alcohol dehydrogenase-like 378
7891 178422 PLN02828 PLN02828 formyltetrahydrofolate deformylase 268
7892 215443 PLN02829 PLN02829 Probable galacturonosyltransferase 639
7893 215444 PLN02830 PLN02830 UDP-sugar pyrophosphorylase 615
7894 215445 PLN02831 PLN02831 Bifunctional GTP cyclohydrolase II/ 3,4-dihydroxy-2-butanone-4-phosphate synthase 450
7895 215446 PLN02832 PLN02832 glutamine amidotransferase subunit of pyridoxal 5'-phosphate synthase complex 248
7896 215447 PLN02833 PLN02833 glycerol acyltransferase family protein 376
7897 215448 PLN02834 PLN02834 3-dehydroquinate synthase 433
7898 178429 PLN02835 PLN02835 oxidoreductase 539
7899 215449 PLN02836 PLN02836 3-oxoacyl-[acyl-carrier-protein] synthase 437
7900 215450 PLN02837 PLN02837 threonine-tRNA ligase 614
7901 166479 PLN02838 PLN02838 3-hydroxyacyl-CoA dehydratase subunit of elongase 221
7902 178432 PLN02839 PLN02839 nudix hydrolase 372
7903 215451 PLN02840 PLN02840 tRNA dimethylallyltransferase 421
7904 178434 PLN02841 PLN02841 GPI mannosyltransferase 440
7905 178435 PLN02842 PLN02842 nucleotide kinase 505
7906 215452 PLN02843 PLN02843 isoleucyl-tRNA synthetase 974
7907 215453 PLN02844 PLN02844 oxidoreductase/ferric-chelate reductase 722
7908 215454 PLN02845 PLN02845 Branched-chain-amino-acid aminotransferase-like protein 336
7909 166487 PLN02846 PLN02846 digalactosyldiacylglycerol synthase 462
7910 178439 PLN02847 PLN02847 triacylglycerol lipase 633
7911 178440 PLN02848 PLN02848 adenylosuccinate lyase 458
7912 215455 PLN02849 PLN02849 beta-glucosidase 503
7913 215456 PLN02850 PLN02850 aspartate-tRNA ligase 530
7914 178443 PLN02851 PLN02851 3-hydroxyisobutyryl-CoA hydrolase-like protein 407
7915 215457 PLN02852 PLN02852 ferredoxin-NADP+ reductase 491
7916 215458 PLN02853 PLN02853 Probable phenylalanyl-tRNA synthetase alpha chain 492
7917 215459 PLN02854 PLN02854 3-ketoacyl-CoA synthase 521
7918 215460 PLN02855 PLN02855 Bifunctional selenocysteine lyase/cysteine desulfurase 424
7919 215461 PLN02856 PLN02856 fumarylacetoacetase 424
7920 215462 PLN02857 PLN02857 octaprenyl-diphosphate synthase 416
7921 215463 PLN02858 PLN02858 fructose-bisphosphate aldolase 1378
7922 178450 PLN02859 PLN02859 glutamine-tRNA ligase 788
7923 215464 PLN02860 PLN02860 o-succinylbenzoate-CoA ligase 563
7924 178452 PLN02861 PLN02861 long-chain-fatty-acid-CoA ligase 660
7925 178453 PLN02862 PLN02862 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 216
7926 215465 PLN02863 PLN02863 UDP-glucuronosyl/UDP-glucosyl transferase family protein 477
7927 178455 PLN02864 PLN02864 enoyl-CoA hydratase 310
7928 215466 PLN02865 PLN02865 galactokinase 423
7929 215467 PLN02866 PLN02866 phospholipase D 1068
7930 178458 PLN02867 PLN02867 Probable galacturonosyltransferase 535
7931 178459 PLN02868 PLN02868 acyl-CoA thioesterase family protein 413
7932 166510 PLN02869 PLN02869 fatty aldehyde decarbonylase 620
7933 215468 PLN02870 PLN02870 Probable galacturonosyltransferase 533
7934 215469 PLN02871 PLN02871 UDP-sulfoquinovose:DAG sulfoquinovosyltransferase 465
7935 215470 PLN02872 PLN02872 triacylglycerol lipase 395
7936 215471 PLN02873 PLN02873 coproporphyrinogen-III oxidase 274
7937 178462 PLN02874 PLN02874 3-hydroxyisobutyryl-CoA hydrolase-like protein 379
7938 215472 PLN02875 PLN02875 4-hydroxyphenylpyruvate dioxygenase 398
7939 215473 PLN02876 PLN02876 acyl-CoA dehydrogenase 822
7940 215474 PLN02877 PLN02877 alpha-amylase/limit dextrinase 970
7941 178466 PLN02878 PLN02878 homogentisate phytyltransferase 280
7942 178467 PLN02879 PLN02879 L-ascorbate peroxidase 251
7943 215475 PLN02880 PLN02880 tyrosine decarboxylase 490
7944 215476 PLN02881 PLN02881 tetrahydrofolylpolyglutamate synthase 530
7945 215477 PLN02882 PLN02882 aminoacyl-tRNA ligase 1159
7946 178471 PLN02883 PLN02883 Branched-chain amino acid aminotransferase 384
7947 178472 PLN02884 PLN02884 6-phosphofructokinase 411
7948 178473 PLN02885 PLN02885 nicotinate phosphoribosyltransferase 545
7949 215478 PLN02886 PLN02886 aminoacyl-tRNA ligase 389
7950 215479 PLN02887 PLN02887 hydrolase family protein 580
7951 215480 PLN02888 PLN02888 enoyl-CoA hydratase 265
7952 215481 PLN02889 PLN02889 oxo-acid-lyase/anthranilate synthase 918
7953 178478 PLN02890 PLN02890 geranyl diphosphate synthase 422
7954 178479 PLN02891 PLN02891 IMP cyclohydrolase 547
7955 215482 PLN02892 PLN02892 isocitrate lyase 570
7956 215483 PLN02893 PLN02893 Cellulose synthase-like protein 734
7957 215484 PLN02894 PLN02894 hydrolase, alpha/beta fold family protein 402
7958 215485 PLN02895 PLN02895 phosphoacetylglucosamine mutase 562
7959 178484 PLN02896 PLN02896 cinnamyl-alcohol dehydrogenase 353
7960 178485 PLN02897 PLN02897 tetrahydrofolate dehydrogenase/cyclohydrolase, putative 345
7961 215486 PLN02898 PLN02898 HMP-P kinase/thiamin-monophosphate pyrophosphorylase 502
7962 178487 PLN02899 PLN02899 alpha-galactosidase 633
7963 215487 PLN02900 PLN02900 alanyl-tRNA synthetase 936
7964 215488 PLN02901 PLN02901 1-acyl-sn-glycerol-3-phosphate acyltransferase 214
7965 215489 PLN02902 PLN02902 pantothenate kinase 876
7966 215490 PLN02903 PLN02903 aminoacyl-tRNA ligase 652
7967 178492 PLN02904 PLN02904 oxidoreductase 357
7968 178493 PLN02905 PLN02905 beta-amylase 702
7969 215491 PLN02906 PLN02906 xanthine dehydrogenase 1319
7970 215492 PLN02907 PLN02907 glutamate-tRNA ligase 722
7971 178496 PLN02908 PLN02908 threonyl-tRNA synthetase 686
7972 178497 PLN02909 PLN02909 Endoglucanase 486
7973 215493 PLN02910 PLN02910 polygalacturonate 4-alpha-galacturonosyltransferase 657
7974 178499 PLN02911 PLN02911 inositol-phosphate phosphatase 296
7975 178500 PLN02912 PLN02912 oxidoreductase, 2OG-Fe(II) oxygenase family protein 348
7976 178501 PLN02913 PLN02913 dihydrofolate synthetase 510
7977 178502 PLN02914 PLN02914 hexokinase 490
7978 215494 PLN02915 PLN02915 cellulose synthase A [UDP-forming], catalytic subunit 1044
7979 178504 PLN02916 PLN02916 pectinesterase family protein 502
7980 215495 PLN02917 PLN02917 CMP-KDO synthetase 293
7981 215496 PLN02918 PLN02918 pyridoxine (pyridoxamine) 5'-phosphate oxidase 544
7982 215497 PLN02919 PLN02919 haloacid dehalogenase-like hydrolase family protein 1057
7983 215498 PLN02920 PLN02920 pantothenate kinase 1 398
7984 178509 PLN02921 PLN02921 naphthoate synthase 327
7985 215499 PLN02922 PLN02922 prenyltransferase 315
7986 178511 PLN02923 PLN02923 xylose isomerase 478
7987 178512 PLN02924 PLN02924 thymidylate kinase 220
7988 178513 PLN02925 PLN02925 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase 733
7989 215500 PLN02926 PLN02926 histidinol dehydrogenase 431
7990 178515 PLN02927 PLN02927 antheraxanthin epoxidase/zeaxanthin epoxidase 668
7991 215501 PLN02928 PLN02928 oxidoreductase family protein 347
7992 215502 PLN02929 PLN02929 NADH kinase 301
7993 178518 PLN02930 PLN02930 CDP-diacylglycerol-serine O-phosphatidyltransferase 353
7994 215503 PLN02931 PLN02931 nucleoside diphosphate kinase family protein 177
7995 178520 PLN02932 PLN02932 3-ketoacyl-CoA synthase 478
7996 178521 PLN02933 PLN02933 Probable pectinesterase/pectinesterase inhibitor 530
7997 215504 PLN02934 PLN02934 triacylglycerol lipase 515
7998 215505 PLN02935 PLN02935 Bifunctional NADH kinase/NAD(+) kinase 508
7999 178524 PLN02936 PLN02936 epsilon-ring hydroxylase 489
8000 215506 PLN02937 PLN02937 Putative isoaspartyl peptidase/L-asparaginase 414
8001 178526 PLN02938 PLN02938 phosphatidylserine decarboxylase 428
8002 215507 PLN02939 PLN02939 transferase, transferring glycosyl groups 977
8003 178528 PLN02940 PLN02940 riboflavin kinase 382
8004 215508 PLN02941 PLN02941 inositol-tetrakisphosphate 1-kinase 328
8005 178530 PLN02942 PLN02942 dihydropyrimidinase 486
8006 215509 PLN02943 PLN02943 aminoacyl-tRNA ligase 958
8007 178531 PLN02945 PLN02945 nicotinamide-nucleotide adenylyltransferase/nicotinate-nucleotide adenylyltransferase 236
8008 178532 PLN02946 PLN02946 cysteine-tRNA ligase 557
8009 215510 PLN02947 PLN02947 oxidoreductase 374
8010 178534 PLN02948 PLN02948 phosphoribosylaminoimidazole carboxylase 577
8011 215511 PLN02949 PLN02949 transferase, transferring glycosyl groups 463
8012 215512 PLN02950 PLN02950 4-alpha-glucanotransferase 909
8013 215513 PLN02951 PLN02951 Molybdopterin biosynthesis protein CNX2 373
8014 178538 PLN02952 PLN02952 phosphoinositide phospholipase C 599
8015 178539 PLN02953 PLN02953 phosphatidate cytidylyltransferase 403
8016 215514 PLN02954 PLN02954 phosphoserine phosphatase 224
8017 178541 PLN02955 PLN02955 8-amino-7-oxononanoate synthase 476
8018 215515 PLN02956 PLN02956 PSII-Q subunit 185
8019 215516 PLN02957 PLN02957 copper, zinc superoxide dismutase 238
8020 215517 PLN02958 PLN02958 diacylglycerol kinase/D-erythro-sphingosine kinase 481
8021 215518 PLN02959 PLN02959 aminoacyl-tRNA ligase 1084
8022 215519 PLN02960 PLN02960 alpha-amylase 897
8023 178546 PLN02961 PLN02961 alanine-tRNA ligase 223
8024 178547 PLN02962 PLN02962 hydroxyacylglutathione hydrolase 251
8025 215520 PLN02964 PLN02964 phosphatidylserine decarboxylase 644
8026 178549 PLN02965 PLN02965 Probable pheophorbidase 255
8027 178550 PLN02966 PLN02966 cytochrome P450 83A1 502
8028 215521 PLN02967 PLN02967 kinase 581
8029 215522 PLN02968 PLN02968 Probable N-acetyl-gamma-glutamyl-phosphate reductase 381
8030 215523 PLN02969 PLN02969 9-cis-epoxycarotenoid dioxygenase 610
8031 215524 PLN02970 PLN02970 serine racemase 328
8032 166612 PLN02971 PLN02971 tryptophan N-hydroxylase 543
8033 215525 PLN02972 PLN02972 Histidyl-tRNA synthetase 763
8034 178556 PLN02973 PLN02973 beta-fructofuranosidase 571
8035 215526 PLN02974 PLN02974 adenosylmethionine-8-amino-7-oxononanoate transaminase 817
8036 166616 PLN02975 PLN02975 complex I subunit 97
8037 215527 PLN02976 PLN02976 amine oxidase 1713
8038 215528 PLN02977 PLN02977 glutathione synthetase 478
8039 215529 PLN02978 PLN02978 pyridoxal kinase 308
8040 166620 PLN02979 PLN02979 glycolate oxidase 366
8041 215530 PLN02980 PLN02980 2-oxoglutarate decarboxylase/ hydro-lyase/ magnesium ion binding / thiamin pyrophosphate binding 1655
8042 215531 PLN02981 PLN02981 glucosamine:fructose-6-phosphate aminotransferase 680
8043 215532 PLN02982 PLN02982 galactinol-raffinose galactosyltransferase/hydrolase, hydrolyzing O-glycosyl compounds 865
8044 215533 PLN02983 PLN02983 biotin carboxyl carrier protein of acetyl-CoA carboxylase 274
8045 215534 PLN02984 PLN02984 oxidoreductase, 2OG-Fe(II) oxygenase family protein 341
8046 178566 PLN02985 PLN02985 squalene monooxygenase 514
8047 178567 PLN02986 PLN02986 cinnamyl-alcohol dehydrogenase family protein 322
8048 166628 PLN02987 PLN02987 Cytochrome P450, family 90, subfamily A 472
8049 178568 PLN02988 PLN02988 3-hydroxyisobutyryl-CoA hydrolase 381
8050 178569 PLN02989 PLN02989 cinnamyl-alcohol dehydrogenase family protein 325
8051 215535 PLN02990 PLN02990 Probable pectinesterase/pectinesterase inhibitor 572
8052 215536 PLN02991 PLN02991 oxidoreductase 543
8053 178572 PLN02992 PLN02992 coniferyl-alcohol glucosyltransferase 481
8054 215537 PLN02993 PLN02993 lupeol synthase 763
8055 166635 PLN02994 PLN02994 1-aminocyclopropane-1-carboxylate synthase 153
8056 178574 PLN02995 PLN02995 Probable pectinesterase/pectinesterase inhibitor 539
8057 215538 PLN02996 PLN02996 fatty acyl-CoA reductase 491
8058 178576 PLN02997 PLN02997 flavonol synthase 325
8059 215539 PLN02998 PLN02998 beta-glucosidase 497
8060 178577 PLN02999 PLN02999 photosystem II oxygen-evolving enhancer 3 protein (PsbQ) 190
8061 178578 PLN03000 PLN03000 amine oxidase 881
8062 166642 PLN03001 PLN03001 oxidoreductase, 2OG-Fe(II) oxygenase family protein 262
8063 178579 PLN03002 PLN03002 oxidoreductase, 2OG-Fe(II) oxygenase family protein 332
8064 178580 PLN03003 PLN03003 Probable polygalacturonase At3g15720 456
8065 178581 PLN03004 PLN03004 UDP-glycosyltransferase 451
8066 178582 PLN03005 PLN03005 beta-fructofuranosidase 550
8067 178583 PLN03006 PLN03006 carbonate dehydratase 301
8068 178584 PLN03007 PLN03007 UDP-glucosyltransferase family protein 482
8069 178585 PLN03008 PLN03008 Phospholipase D delta 868
8070 166650 PLN03009 PLN03009 cellulase 495
8071 215540 PLN03010 PLN03010 polygalacturonase 409
8072 166653 PLN03012 PLN03012 Camelliol C synthase 759
8073 178587 PLN03013 PLN03013 cysteine synthase 429
8074 178588 PLN03014 PLN03014 carbonic anhydrase 347
8075 178589 PLN03015 PLN03015 UDP-glucosyl transferase 470
8076 178590 PLN03016 PLN03016 sinapoylglucose-malate O-sinapoyltransferase 433
8077 178591 PLN03017 PLN03017 trehalose-phosphatase 366
8078 178592 PLN03018 PLN03018 homomethionine N-hydroxylase 534
8079 166660 PLN03019 PLN03019 carbonic anhydrase 330
8080 215541 PLN03020 PLN03020 low-temperature-induced protein; Provisional 556
8081 178593 PLN03021 PLN03021 Low-temperature-induced protein; Provisional 619
8082 215542 PLN03023 PLN03023 Expansin-like B1; Provisional 247
8083 178595 PLN03024 PLN03024 Putative EG45-like domain containing protein 1; Provisional 125
8084 178596 PLN03025 PLN03025 replication factor C subunit; Provisional 319
8085 178597 PLN03026 PLN03026 histidinol-phosphate aminotransferase; Provisional 380
8086 215543 PLN03028 PLN03028 pyrophosphate--fructose-6-phosphate 1-phosphotransferase; Provisional 610
8087 215544 PLN03029 PLN03029 type-a response regulator protein; Provisional 222
8088 215545 PLN03030 PLN03030 cationic peroxidase; Provisional 324
8089 215546 PLN03031 PLN03031 hypothetical protein; Provisional 102
8090 166673 PLN03032 PLN03032 serine decarboxylase; Provisional 374
8091 178601 PLN03033 PLN03033 2-dehydro-3-deoxyphosphooctonate aldolase; Provisional 290
8092 178602 PLN03034 PLN03034 phosphoglycerate kinase; Provisional 481
8093 178603 PLN03036 PLN03036 glutamine synthetase; Provisional 432
8094 215547 PLN03037 PLN03037 lipase class 3 family protein; Provisional 525
8095 166679 PLN03039 PLN03039 ethanolaminephosphotransferase; Provisional 337
8096 215548 PLN03042 PLN03042 Lactoylglutathione lyase; Provisional 185
8097 178606 PLN03043 PLN03043 Probable pectinesterase/pectinesterase inhibitor; Provisional 538
8098 215549 PLN03044 PLN03044 GTP cyclohydrolase I; Provisional 188
8099 178608 PLN03046 PLN03046 D-glycerate 3-kinase; Provisional 460
8100 215550 PLN03049 PLN03049 pyridoxine (pyridoxamine) 5'-phosphate oxidase; Provisional 462
8101 215551 PLN03050 PLN03050 pyridoxine (pyridoxamine) 5'-phosphate oxidase; Provisional 246
8102 215552 PLN03051 PLN03051 acyl-activating enzyme; Provisional 499
8103 215553 PLN03052 PLN03052 acetate--CoA ligase; Provisional 728
8104 178613 PLN03055 PLN03055 AMP deaminase; Provisional 602
8105 166697 PLN03058 PLN03058 dynein light chain type 1 family protein; Provisional 128
8106 166698 PLN03059 PLN03059 beta-galactosidase; Provisional 840
8107 215554 PLN03060 PLN03060 inositol phosphatase-like protein; Provisional 206
8108 215555 PLN03063 PLN03063 alpha,alpha-trehalose-phosphate synthase (UDP-forming); Provisional 797
8109 215556 PLN03064 PLN03064 alpha,alpha-trehalose-phosphate synthase (UDP-forming); Provisional 934
8110 178617 PLN03065 PLN03065 isocitrate dehydrogenase (NADP+); Provisional 483
8111 215557 PLN03069 PLN03069 magnesiumprotoporphyrin-IX chelatase subunit H; Provisional 1220
8112 178619 PLN03070 PLN03070 photosystem I reaction center subunit psaK 247; Provisional 128
8113 178620 PLN03071 PLN03071 GTP-binding nuclear protein Ran; Provisional 219
8114 178621 PLN03072 PLN03072 60S ribosomal protein L12; Provisional 166
8115 215558 PLN03073 PLN03073 ABC transporter F family; Provisional 718
8116 215559 PLN03074 PLN03074 auxin influx permease; Provisional 473
8117 178624 PLN03075 PLN03075 nicotianamine synthase; Provisional 296
8118 215560 PLN03076 PLN03076 ARF guanine nucleotide exchange factor (ARF-GEF); Provisional 1780
8119 215561 PLN03077 PLN03077 Protein ECB2; Provisional 857
8120 215562 PLN03078 PLN03078 Putative tRNA pseudouridine synthase; Provisional 513
8121 178628 PLN03079 PLN03079 Uncharacterized protein At4g33100; Provisional 91
8122 178629 PLN03080 PLN03080 Probable beta-xylosidase; Provisional 779
8123 215563 PLN03081 PLN03081 pentatricopeptide (PPR) repeat-containing protein; Provisional 697
8124 215564 PLN03082 PLN03082 Iron-sulfur cluster assembly; Provisional 163
8125 215565 PLN03083 PLN03083 E3 UFM1-protein ligase 1 homolog; Provisional 803
8126 178633 PLN03084 PLN03084 alpha/beta hydrolase fold protein; Provisional 383
8127 215566 PLN03085 PLN03085 nucleobase:cation symporter-1; Provisional 221
8128 178635 PLN03086 PLN03086 PRLI-interacting factor K; Provisional 567
8129 215567 PLN03087 PLN03087 BODYGUARD 1 domain containing hydrolase; Provisional 481
8130 215568 PLN03088 PLN03088 SGT1, suppressor of G2 allele of SKP1; Provisional 356
8131 215569 PLN03089 PLN03089 hypothetical protein; Provisional 373
8132 178639 PLN03090 PLN03090 auxin-responsive family protein; Provisional 104
8133 215570 PLN03091 PLN03091 hypothetical protein; Provisional 459
8134 178641 PLN03093 PLN03093 Protein SENSITIVITY TO RED LIGHT REDUCED 1; Provisional 273
8135 178642 PLN03094 PLN03094 Substrate binding subunit of ER-derived-lipid transporter; Provisional 370
8136 215571 PLN03095 PLN03095 NADH:ubiquinone oxidoreductase 18 kDa subunit; Provisional 115
8137 215572 PLN03096 PLN03096 glyceraldehyde-3-phosphate dehydrogenase A; Provisional 395
8138 178645 PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional 846
8139 215573 PLN03098 LPA1 LOW PSII ACCUMULATION1; Provisional 453
8140 215574 PLN03099 PIR Protein PIR; Provisional 1232
8141 215575 PLN03100 PLN03100 Permease subunit of ER-derived-lipid transporter; Provisional 292
8142 215576 PLN03102 PLN03102 acyl-activating enzyme; Provisional 579
8143 215577 PLN03103 PLN03103 GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional 403
8144 215578 PLN03104 FHL FAR-RED-ELONGATED HYPOCOTYL1-LIKE; Provisional 201
8145 178652 PLN03105 TCP24 transcription factor TCP24 (TEOSINTE BRANCHED1, CYCLOIDEA, AND PCF FAMILY 24); Provisional 324
8146 215579 PLN03106 TCP2 Protein TCP2; Provisional 447
8147 215580 PLN03107 PLN03107 eukaryotic translation initiation factor 5A; Provisional 159
8148 178655 PLN03108 PLN03108 Rab family protein; Provisional 210
8149 215581 PLN03109 PLN03109 ETHYLENE-INSENSITIVE3-like3 protein; Provisional 599
8150 178657 PLN03110 PLN03110 Rab GTPase; Provisional 216
8151 215582 PLN03111 PLN03111 DNA-directed RNA polymerase II subunit family protein; Provisional 206
8152 215583 PLN03112 PLN03112 cytochrome P450 family protein; Provisional 514
8153 215584 PLN03113 PLN03113 DNA ligase 1; Provisional 744
8154 178661 PLN03114 PLN03114 ADP-ribosylation factor GTPase-activating protein AGD10; Provisional 395
8155 215585 PLN03115 PLN03115 ferredoxin--NADP(+) reductase; Provisional 367
8156 215586 PLN03116 PLN03116 ferredoxin--NADP+ reductase; Provisional 307
8157 178664 PLN03117 PLN03117 Branched-chain-amino-acid aminotransferase; Provisional 355
8158 215587 PLN03118 PLN03118 Rab family protein; Provisional 211
8159 178666 PLN03119 PLN03119 putative ADP-ribosylation factor GTPase-activating protein AGD14; Provisional 648
8160 215588 PLN03120 PLN03120 nucleic acid binding protein; Provisional 260
8161 215589 PLN03121 PLN03121 nucleic acid binding protein; Provisional 243
8162 178669 PLN03122 PLN03122 Poly [ADP-ribose] polymerase; Provisional 815
8163 215590 PLN03123 PLN03123 poly [ADP-ribose] polymerase; Provisional 981
8164 215591 PLN03124 PLN03124 poly [ADP-ribose] polymerase; Provisional 643
8165 215592 PLN03126 PLN03126 Elongation factor Tu; Provisional 478
8166 178673 PLN03127 PLN03127 Elongation factor Tu; Provisional 447
8167 215593 PLN03128 PLN03128 DNA topoisomerase 2; Provisional 1135
8168 215594 PLN03129 PLN03129 NADP-dependent malic enzyme; Provisional 581
8169 215595 PLN03130 PLN03130 ABC transporter C family member; Provisional 1622
8170 178677 PLN03131 PLN03131 hypothetical protein; Provisional 705
8171 178678 PLN03132 PLN03132 NADH dehydrogenase (ubiquinone) flavoprotein 1; Provisional 461
8172 215596 PLN03133 PLN03133 beta-1,3-galactosyltransferase; Provisional 636
8173 178680 PLN03134 PLN03134 glycine-rich RNA-binding protein 4; Provisional 144
8174 178681 PLN03136 PLN03136 Ferredoxin; Provisional 148
8175 215597 PLN03137 PLN03137 ATP-dependent DNA helicase; Q4-like; Provisional 1195
8176 215598 PLN03138 PLN03138 Protein TOC75; Provisional 796
8177 178684 PLN03139 PLN03139 formate dehydrogenase; Provisional 386
8178 215599 PLN03140 PLN03140 ABC transporter G family member; Provisional 1470
8179 215600 PLN03141 PLN03141 3-epi-6-deoxocathasterone 23-monooxygenase; Provisional 452
8180 215601 PLN03142 PLN03142 Probable chromatin-remodeling complex ATPase chain; Provisional 1033
8181 215602 PLN03143 PLN03143 nudix hydrolase; Provisional 291
8182 178689 PLN03144 PLN03144 Carbon catabolite repressor protein 4 homolog; Provisional 606
8183 215603 PLN03145 PLN03145 Protein phosphatase 2c; Provisional 365
8184 178691 PLN03146 PLN03146 aspartyl protease family protein; Provisional 431
8185 178692 PLN03147 PLN03147 ribosomal protein S19; Provisional 92
8186 178693 PLN03148 PLN03148 Blue copper-like protein; Provisional 167
8187 178694 PLN03149 PLN03149 peptidyl-prolyl isomerase H (cyclophilin H); Provisional 186
8188 178695 PLN03150 PLN03150 hypothetical protein; Provisional 623
8189 215604 PLN03151 PLN03151 cation/calcium exchanger; Provisional 650
8190 178697 PLN03152 PLN03152 hypothetical protein; Provisional 241
8191 215605 PLN03153 PLN03153 hypothetical protein; Provisional 537
8192 215606 PLN03154 PLN03154 putative allyl alcohol dehydrogenase; Provisional 348
8193 178700 PLN03155 PLN03155 cytochrome c oxidase subunit 5C; Provisional 63
8194 178701 PLN03156 PLN03156 GDSL esterase/lipase; Provisional 351
8195 178702 PLN03157 PLN03157 spermidine hydroxycinnamoyl transferase; Provisional 447
8196 215607 PLN03158 PLN03158 methionine aminopeptidase; Provisional 396
8197 215608 PLN03159 PLN03159 cation/H(+) antiporter 15; Provisional 832
8198 215609 PLN03160 PLN03160 uncharacterized protein; Provisional 219
8199 178706 PLN03161 PLN03161 Probable xyloglucan endotransglucosylase/hydrolase protein; Provisional 291
8200 178707 PLN03162 PLN03162 golden-2 like transcription factor; Provisional 526
8201 215610 PLN03164 PLN03164 3-oxo-5-alpha-steroid 4-dehydrogenase, C-terminal domain containing protein; Provisional 323
8202 178709 PLN03165 PLN03165 chaperone protein dnaJ-related; Provisional 111
8203 178710 PLN03166 PLN03166 60S ribosomal protein L34; Provisional 96
8204 215611 PLN03167 PLN03167 Chaperonin-60 beta subunit; Provisional 600
8205 178712 PLN03168 PLN03168 chalcone synthase; Provisional 389
8206 215612 PLN03169 PLN03169 chalcone synthase family protein; Provisional 391
8207 178714 PLN03170 PLN03170 chalcone synthase; Provisional 401
8208 178715 PLN03171 PLN03171 chalcone synthase-like protein; Provisional 399
8209 178716 PLN03172 PLN03172 chalcone synthase family protein; Provisional 393
8210 178717 PLN03173 PLN03173 chalcone synthase; Provisional 391
8211 215613 PLN03174 PLN03174 Chalcone-flavanone isomerase-related; Provisional 278
8212 178719 PLN03175 PLN03175 hypothetical protein; Provisional 415
8213 178720 PLN03176 PLN03176 flavanone-3-hydroxylase; Provisional 120
8214 215614 PLN03178 PLN03178 leucoanthocyanidin dioxygenase; Provisional 360
8215 215615 PLN03180 PLN03180 reversibly glycosylated polypeptide; Provisional 346
8216 215616 PLN03181 PLN03181 glycosyltransferase; Provisional 453
8217 215617 PLN03182 PLN03182 xyloglucan 6-xylosyltransferase; Provisional 429
8218 178725 PLN03183 PLN03183 acetylglucosaminyltransferase family protein; Provisional 421
8219 215618 PLN03184 PLN03184 chloroplast Hsp70; Provisional 673
8220 215619 PLN03185 PLN03185 phosphatidylinositol phosphate kinase; Provisional 765
8221 178728 PLN03186 PLN03186 DNA repair protein RAD51 homolog; Provisional 342
8222 215620 PLN03187 PLN03187 meiotic recombination protein DMC1 homolog; Provisional 344
8223 215621 PLN03188 PLN03188 kinesin-12 family protein; Provisional 1320
8224 215622 PLN03189 PLN03189 Protease specific for SMALL UBIQUITIN-RELATED MODIFIER (SUMO); Provisional 490
8225 215623 PLN03190 PLN03190 aminophospholipid translocase; Provisional 1178
8226 215624 PLN03191 PLN03191 Type I inositol-1,4,5-trisphosphate 5-phosphatase 2; Provisional 621
8227 215625 PLN03192 PLN03192 Voltage-dependent potassium channel; Provisional 823
8228 178735 PLN03193 PLN03193 beta-1,3-galactosyltransferase; Provisional 408
8229 215626 PLN03194 PLN03194 putative disease resistance protein; Provisional 187
8230 215627 PLN03195 PLN03195 fatty acid omega-hydroxylase; Provisional 516
8231 215628 PLN03196 PLN03196 MOC1-like protein; Provisional 487
8232 178739 PLN03198 PLN03198 delta6-acyl-lipid desaturase; Provisional 526
8233 178740 PLN03199 PLN03199 delta6-acyl-lipid desaturase-like protein; Provisional 485
8234 215629 PLN03200 PLN03200 cellulose synthase-interactive protein; Provisional 2102
8235 215630 PLN03201 PLN03201 RAB geranylgeranyl transferase beta-subunit; Provisional 316
8236 215631 PLN03202 PLN03202 protein argonaute; Provisional 900
8237 178744 PLN03205 PLN03205 ATR interacting protein; Provisional 652
8238 178745 PLN03206 PLN03206 phosphoribosylformylglycinamidine synthase; Provisional 1307
8239 215632 PLN03207 PLN03207 stomagen; Provisional 113
8240 178747 PLN03208 PLN03208 E3 ubiquitin-protein ligase RMA2; Provisional 193
8241 178748 PLN03209 PLN03209 translocon at the inner envelope of chloroplast subunit 62; Provisional 576
8242 215633 PLN03210 PLN03210 Resistant to P. syringae 6; Provisional 1153
8243 215634 PLN03211 PLN03211 ABC transporter G-25; Provisional 659
8244 178751 PLN03212 PLN03212 Transcription repressor MYB5; Provisional 249
8245 178752 PLN03213 PLN03213 repressor of silencing 3; Provisional 759
8246 215635 PLN03214 PLN03214 probable enoyl-CoA hydratase/isomerase; Provisional 278
8247 178754 PLN03215 PLN03215 ascorbic acid mannose pathway regulator 1; Provisional 373
8248 178755 PLN03216 PLN03216 actin depolymerizing factor; Provisional 141
8249 178756 PLN03217 PLN03217 transcription factor ATBS1; Provisional 93
8250 215636 PLN03218 PLN03218 maturation of RBCL 1; Provisional 1060
8251 178758 PLN03219 PLN03219 uncharacterized protein; Provisional 108
8252 178759 PLN03220 PLN03220 uncharacterized protein; Provisional 105
8253 178760 PLN03221 PLN03221 rapid alkalinization factor 23; Provisional 137
8254 178761 PLN03222 PLN03222 rapid alkalinization factor 23-like protein; Provisional 119
8255 215637 PLN03223 PLN03223 Polycystin cation channel protein; Provisional 1634
8256 178763 PLN03224 PLN03224 probable serine/threonine protein kinase; Provisional 507
8257 215638 PLN03225 PLN03225 Serine/threonine-protein kinase SNT7; Provisional 566
8258 215639 PLN03226 PLN03226 serine hydroxymethyltransferase; Provisional 475
8259 178766 PLN03227 PLN03227 serine palmitoyltransferase-like protein; Provisional 392
8260 178767 PLN03228 PLN03228 methylthioalkylmalate synthase; Provisional 503
8261 178768 PLN03229 PLN03229 acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha; Provisional 762
8262 178769 PLN03230 PLN03230 acetyl-coenzyme A carboxylase carboxyl transferase; Provisional 431
8263 178770 PLN03231 PLN03231 putative alpha-galactosidase; Provisional 357
8264 215640 PLN03232 PLN03232 ABC transporter C family member; Provisional 1495
8265 178772 PLN03233 PLN03233 putative glutamate-tRNA ligase; Provisional 523
8266 178773 PLN03234 PLN03234 cytochrome P450 83B1; Provisional 499
8267 178774 PLN03236 PLN03236 4-alpha-glucanotransferase; Provisional 745
8268 215641 PLN03237 PLN03237 DNA topoisomerase 2; Provisional 1465
8269 215642 PLN03238 PLN03238 probable histone acetyltransferase MYST; Provisional 290
8270 178777 PLN03239 PLN03239 histone acetyltransferase; Provisional 351
8271 178778 PLN03240 PLN03240 putative Low-temperature-induced protein; Provisional 626
8272 215643 PLN03241 PLN03241 magnesium chelatase subunit H; Provisional 1315
8273 178780 PLN03242 PLN03242 diacylglycerol o-acyltransferase; Provisional 410
8274 215644 PLN03243 PLN03243 haloacid dehalogenase-like hydrolase; Provisional 260
8275 178782 PLN03244 PLN03244 alpha-amylase; Provisional 872
8276 215645 PLN03246 PLN03246 26S proteasome regulatory subunit; Provisional 303
8277 234564 PRK00001 rplC 50S ribosomal protein L3; Validated 210
8278 234565 PRK00002 aroB 3-dehydroquinate synthase; Reviewed 358
8279 234566 PRK00004 rplX 50S ribosomal protein L24; Reviewed 105
8280 234567 PRK00005 fmt methionyl-tRNA formyltransferase; Reviewed 309
8281 234568 PRK00006 fabZ 3-hydroxyacyl-ACP dehydratase FabZ. 147
8282 234569 PRK00007 PRK00007 elongation factor G; Reviewed 693
8283 234570 PRK00009 PRK00009 phosphoenolpyruvate carboxylase; Reviewed 911
8284 178791 PRK00010 rplE 50S ribosomal protein L5; Validated 179
8285 234571 PRK00011 glyA serine hydroxymethyltransferase; Reviewed 416
8286 234572 PRK00012 gatA Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatA. 459
8287 234573 PRK00013 groEL chaperonin GroEL; Reviewed 542
8288 134031 PRK00014 ribB 3,4-dihydroxy-2-butanone 4-phosphate synthase; Provisional 230
8289 234574 PRK00015 rnhB ribonuclease HII; Validated 197
8290 234575 PRK00016 PRK00016 metal-binding heat shock protein; Provisional 159
8291 234576 PRK00019 rpmE 50S ribosomal protein L31; Reviewed 72
8292 134035 PRK00020 truB tRNA pseudouridine synthase B; Provisional 244
8293 234577 PRK00021 truA tRNA pseudouridine(38-40) synthase TruA. 244
8294 234578 PRK00022 lolB lipoprotein localization protein LolB. 202
8295 234579 PRK00023 cmk (d)CMP kinase. 225
8296 178801 PRK00024 PRK00024 DNA repair protein RadC. 224
8297 234580 PRK00025 lpxB lipid-A-disaccharide synthase; Reviewed 380
8298 234581 PRK00026 trmD tRNA (guanine-N(1)-)-methyltransferase; Reviewed 244
8299 234582 PRK00028 infC translation initiation factor IF-3; Reviewed 177
8300 234583 PRK00029 PRK00029 YdiU family protein. 487
8301 178806 PRK00030 minC septum site-determining protein MinC. 292
8302 178807 PRK00031 lolA outer membrane lipoprotein chaperone LolA. 195
8303 234584 PRK00032 PRK00032 septum formation inhibitor Maf. 190
8304 178809 PRK00033 clpS ATP-dependent Clp protease adaptor protein ClpS; Reviewed 100
8305 178810 PRK00034 gatC Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatC. 95
8306 234585 PRK00035 hemH ferrochelatase; Reviewed 333
8307 134050 PRK00036 PRK00036 primosomal replication protein N; Reviewed 107
8308 234586 PRK00037 hisS histidyl-tRNA synthetase; Reviewed 412
8309 134052 PRK00038 rnpA ribonuclease P protein component. 123
8310 234587 PRK00039 ruvC Holliday junction resolvase; Reviewed 164
8311 234588 PRK00040 rpsP 30S ribosomal protein S16; Reviewed 75
8312 178815 PRK00041 PRK00041 hypothetical protein; Validated 93
8313 234589 PRK00042 tpiA triosephosphate isomerase; Provisional 250
8314 234590 PRK00043 thiE thiamine phosphate synthase. 212
8315 234591 PRK00044 psd phosphatidylserine decarboxylase; Reviewed 288
8316 234592 PRK00045 hemA glutamyl-tRNA reductase; Reviewed 423
8317 234593 PRK00046 murB UDP-N-acetylmuramate dehydrogenase. 334
8318 234594 PRK00047 glpK glycerol kinase GlpK. 498
8319 234595 PRK00048 PRK00048 dihydrodipicolinate reductase; Provisional 257
8320 234596 PRK00049 PRK00049 elongation factor Tu; Reviewed 396
8321 234597 PRK00050 PRK00050 16S rRNA (cytosine(1402)-N(4))-methyltransferase RsmH. 296
8322 234598 PRK00051 hisI phosphoribosyl-AMP cyclohydrolase; Reviewed 125
8323 234599 PRK00052 PRK00052 prolipoprotein diacylglyceryl transferase; Reviewed 269
8324 234600 PRK00053 alr alanine racemase; Reviewed 363
8325 234601 PRK00054 PRK00054 dihydroorotate dehydrogenase electron transfer subunit; Reviewed 250
8326 234602 PRK00055 PRK00055 ribonuclease Z; Reviewed 270
8327 234603 PRK00056 mtgA monofunctional biosynthetic peptidoglycan transglycosylase; Provisional 236
8328 234604 PRK00058 PRK00058 peptide-methionine (S)-S-oxide reductase MsrA. 213
8329 234605 PRK00059 prsA peptidylprolyl isomerase; Provisional 336
8330 234606 PRK00061 ribH 6,7-dimethyl-8-ribityllumazine synthase; Provisional 154
8331 234607 PRK00062 PRK00062 glutamate-1-semialdehyde 2,1-aminomutase. 426
8332 234608 PRK00064 recF recombination protein F; Reviewed 361
8333 178836 PRK00066 ldh L-lactate dehydrogenase; Reviewed 315
8334 234609 PRK00068 PRK00068 hypothetical protein; Validated 970
8335 234610 PRK00070 acpS 4'-phosphopantetheinyl transferase; Provisional 126
8336 234611 PRK00071 nadD nicotinate-nucleotide adenylyltransferase. 203
8337 234612 PRK00072 hemC porphobilinogen deaminase; Reviewed 295
8338 234613 PRK00073 pgk phosphoglycerate kinase; Provisional 389
8339 234614 PRK00074 guaA GMP synthase; Reviewed 511
8340 234615 PRK00075 cbiD cobalt-precorrin-6A synthase; Reviewed 361
8341 234616 PRK00076 recR recombination protein RecR; Reviewed 196
8342 234617 PRK00077 eno enolase; Provisional 425
8343 234618 PRK00078 PRK00078 Maf-like protein; Reviewed 192
8344 234619 PRK00080 ruvB Holliday junction branch migration DNA helicase RuvB. 328
8345 234620 PRK00081 coaE dephospho-CoA kinase; Reviewed 194
8346 234621 PRK00082 hrcA heat-inducible transcription repressor; Provisional 339
8347 178850 PRK00083 frr ribosome recycling factor; Reviewed 185
8348 178851 PRK00084 ispF 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; Reviewed 159
8349 234622 PRK00085 recO DNA repair protein RecO; Reviewed 247
8350 234623 PRK00087 PRK00087 bifunctional 4-hydroxy-3-methylbut-2-enyl diphosphate reductase/30S ribosomal protein S1. 647
8351 234624 PRK00089 era GTPase Era; Reviewed 292
8352 234625 PRK00090 bioD ATP-dependent dethiobiotin synthetase BioD. 222
8353 234626 PRK00091 miaA tRNA delta(2)-isopentenylpyrophosphate transferase; Reviewed 307
8354 234627 PRK00092 PRK00092 ribosome maturation protein RimP; Reviewed 154
8355 234628 PRK00093 PRK00093 GTP-binding protein Der; Reviewed 435
8356 234629 PRK00094 gpsA NAD(P)H-dependent glycerol-3-phosphate dehydrogenase. 325
8357 234630 PRK00095 mutL DNA mismatch repair endonuclease MutL. 617
8358 234631 PRK00098 PRK00098 GTPase RsgA; Reviewed 298
8359 234632 PRK00099 rplJ 50S ribosomal protein L10; Reviewed 172
8360 234633 PRK00102 rnc ribonuclease III; Reviewed 229
8361 234634 PRK00103 PRK00103 rRNA large subunit methyltransferase; Provisional 157
8362 234635 PRK00104 scpA segregation and condensation protein A; Reviewed 242
8363 234636 PRK00105 cobT nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase; Reviewed 335
8364 178867 PRK00106 PRK00106 ribonuclease Y. 535
8365 234637 PRK00107 gidB 16S rRNA (guanine(527)-N(7))-methyltransferase RsmG. 187
8366 234638 PRK00108 mraY phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional 344
8367 234639 PRK00109 PRK00109 Holliday junction resolvase RuvX. 138
8368 234640 PRK00110 PRK00110 YebC/PmpR family DNA-binding transcriptional regulator. 245
8369 234641 PRK00111 PRK00111 hypothetical protein; Provisional 180
8370 234642 PRK00112 tgt queuine tRNA-ribosyltransferase; Provisional 366
8371 234643 PRK00114 hslO Hsp33 family molecular chaperone HslO. 293
8372 234644 PRK00115 hemE uroporphyrinogen decarboxylase; Validated 346
8373 234645 PRK00116 ruvA Holliday junction branch migration protein RuvA. 192
8374 234646 PRK00117 recX recombination regulator RecX; Reviewed 157
8375 234647 PRK00118 PRK00118 putative DNA-binding protein; Validated 104
8376 234648 PRK00120 PRK00120 dITP/XTP pyrophosphatase; Reviewed 196
8377 234649 PRK00121 trmB tRNA (guanine-N(7)-)-methyltransferase; Reviewed 202
8378 234650 PRK00122 rimM 16S rRNA-processing protein RimM; Provisional 172
8379 178882 PRK00124 PRK00124 YaiI/YqxD family protein. 151
8380 234651 PRK00125 pyrF orotidine 5'-phosphate decarboxylase; Reviewed 278
8381 234652 PRK00128 ipk 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 286
8382 234653 PRK00129 upp uracil phosphoribosyltransferase; Reviewed 209
8383 178886 PRK00130 truB tRNA pseudouridine synthase B; Provisional 290
8384 234654 PRK00131 aroK shikimate kinase; Reviewed 175
8385 178888 PRK00132 rpsI 30S ribosomal protein S9; Reviewed 130
8386 234655 PRK00133 metG methionyl-tRNA synthetase; Reviewed 673
8387 234656 PRK00134 ccrB fluoride efflux transporter family protein. 104
8388 234657 PRK00135 scpB segregation and condensation protein B; Reviewed 188
8389 234658 PRK00136 rpsH 30S ribosomal protein S8; Validated 130
8390 234659 PRK00137 rplI 50S ribosomal protein L9; Reviewed 147
8391 234660 PRK00139 murE UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase; Provisional 460
8392 234661 PRK00140 rplK 50S ribosomal protein L11; Validated 141
8393 234662 PRK00141 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 473
8394 234663 PRK00142 PRK00142 rhodanese-related sulfurtransferase. 314
8395 234664 PRK00143 mnmA tRNA-specific 2-thiouridylase MnmA; Reviewed 346
8396 234665 PRK00145 PRK00145 putative inner membrane protein translocase component YidC; Provisional 223
8397 234666 PRK00147 queA S-adenosylmethionine:tRNA ribosyltransferase-isomerase; Provisional 342
8398 178901 PRK00148 PRK00148 Maf-like protein; Reviewed 194
8399 234667 PRK00149 dnaA chromosomal replication initiator protein DnaA. 401
8400 234668 PRK00150 def peptide deformylase; Reviewed 165
8401 234669 PRK00153 PRK00153 YbaB/EbfC family nucleoid-associated protein. 104
8402 234670 PRK00155 ispD D-ribitol-5-phosphate cytidylyltransferase. 227
8403 234671 PRK00157 rplL 50S ribosomal protein L7/L12; Reviewed 123
8404 178907 PRK00159 PRK00159 putative septation inhibitor protein; Reviewed 87
8405 178908 PRK00162 glpE thiosulfate sulfurtransferase GlpE. 108
8406 234672 PRK00164 moaA GTP 3',8-cyclase MoaA. 331
8407 234673 PRK00166 apaH symmetrical bis(5'-nucleosyl)-tetraphosphatase. 275
8408 234674 PRK00168 coaD phosphopantetheine adenylyltransferase; Provisional 159
8409 234675 PRK00170 PRK00170 azoreductase; Reviewed 201
8410 234676 PRK00172 rpmI 50S ribosomal protein L35; Reviewed 65
8411 178914 PRK00173 rph ribonuclease PH; Reviewed 238
8412 234677 PRK00174 PRK00174 acetyl-CoA synthetase; Provisional 637
8413 234678 PRK00175 metX homoserine O-acetyltransferase; Provisional 379
8414 166839 PRK00178 tolB Tol-Pal system protein TolB. 430
8415 234679 PRK00179 pgi glucose-6-phosphate isomerase; Reviewed 548
8416 234680 PRK00180 PRK00180 acetate kinase A/propionate kinase 2; Reviewed 402
8417 234681 PRK00182 tatB Sec-independent protein translocase subunit TatB. 160
8418 166842 PRK00183 PRK00183 hypothetical protein; Provisional 157
8419 166843 PRK00187 PRK00187 NorM family multidrug efflux MATE transporter. 464
8420 234682 PRK00188 trpD anthranilate phosphoribosyltransferase; Provisional 339
8421 234683 PRK00191 tatA twin arginine translocase protein A; Provisional 84
8422 234684 PRK00192 PRK00192 mannosyl-3-phosphoglycerate phosphatase; Reviewed 273
8423 178923 PRK00194 PRK00194 ACT domain-containing protein. 90
8424 234685 PRK00197 proA gamma-glutamyl phosphate reductase; Provisional 417
8425 178925 PRK00199 ihfB integration host factor subunit beta; Reviewed 94
8426 234686 PRK00202 nusB transcription antitermination factor NusB. 137
8427 178927 PRK00203 rnhA ribonuclease H; Reviewed 150
8428 178928 PRK00207 PRK00207 sulfurtransferase complex subunit TusD. 128
8429 234687 PRK00208 thiG thiazole synthase; Reviewed 250
8430 178930 PRK00211 PRK00211 sulfurtransferase complex subunit TusC. 119
8431 234688 PRK00215 PRK00215 transcriptional repressor LexA. 205
8432 234689 PRK00216 ubiE bifunctional demethylmenaquinone methyltransferase/2-methoxy-6-polyprenyl-1,4-benzoquinol methylase UbiE. 239
8433 234690 PRK00218 PRK00218 lysogenization regulator HflD. 207
8434 234691 PRK00220 PRK00220 glycerol-3-phosphate 1-O-acyltransferase PlsY. 198
8435 234692 PRK00222 PRK00222 peptide-methionine (R)-S-oxide reductase MsrB. 142
8436 234693 PRK00226 greA transcription elongation factor GreA; Reviewed 157
8437 178937 PRK00227 glnD [protein-PII] uridylyltransferase. 693
8438 234694 PRK00228 PRK00228 YqgE/AlgH family protein. 191
8439 234695 PRK00230 PRK00230 orotidine-5'-phosphate decarboxylase. 230
8440 234696 PRK00232 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase; Reviewed 332
8441 166864 PRK00234 PRK00234 Maf-like protein; Reviewed 192
8442 234697 PRK00235 cobS cobalamin synthase; Reviewed 249
8443 234698 PRK00236 xerC site-specific tyrosine recombinase XerC; Reviewed 297
8444 178943 PRK00239 rpsT 30S ribosomal protein S20; Reviewed 88
8445 234699 PRK00241 nudC NAD(+) diphosphatase. 256
8446 178945 PRK00247 PRK00247 putative inner membrane protein translocase component YidC; Validated 429
8447 234700 PRK00249 flgH flagellar basal body L-ring protein FlgH. 222
8448 234701 PRK00252 alaS alanyl-tRNA synthetase; Reviewed 865
8449 178948 PRK00253 fliE flagellar hook-basal body protein FliE; Reviewed 108
8450 234702 PRK00254 PRK00254 ski2-like helicase; Provisional 720
8451 166874 PRK00257 PRK00257 4-phosphoerythronate dehydrogenase PdxB. 381
8452 234703 PRK00258 aroE shikimate 5-dehydrogenase; Reviewed 278
8453 234704 PRK00259 PRK00259 intracellular septation protein A; Reviewed 179
8454 234705 PRK00260 cysS cysteinyl-tRNA synthetase; Validated 463
8455 234706 PRK00269 zipA cell division protein ZipA; Reviewed 293
8456 234707 PRK00270 rpsU 30S ribosomal protein S21; Reviewed 64
8457 234708 PRK00274 ksgA 16S rRNA (adenine(1518)-N(6)/adenine(1519)-N(6))-dimethyltransferase RsmA. 272
8458 234709 PRK00275 glnD PII uridylyl-transferase; Provisional 895
8459 178954 PRK00276 infA translation initiation factor IF-1; Validated 72
8460 178955 PRK00277 clpP ATP-dependent Clp protease proteolytic subunit; Reviewed 200
8461 234710 PRK00278 trpC indole-3-glycerol phosphate synthase TrpC. 260
8462 234711 PRK00279 adk adenylate kinase; Reviewed 215
8463 234712 PRK00281 PRK00281 undecaprenyl-diphosphate phosphatase. 268
8464 234713 PRK00283 xerD tyrosine recombinase. 299
8465 178960 PRK00284 pqqA pyrroloquinoline quinone precursor peptide PqqA. 23
8466 178961 PRK00285 ihfA integration host factor subunit alpha; Reviewed 99
8467 234714 PRK00286 xseA exodeoxyribonuclease VII large subunit; Reviewed 438
8468 234715 PRK00290 dnaK molecular chaperone DnaK; Provisional 627
8469 234716 PRK00292 glk glucokinase; Provisional 316
8470 234717 PRK00293 dipZ thiol:disulfide interchange protein precursor; Provisional 571
8471 166894 PRK00294 hscB co-chaperone HscB; Provisional 173
8472 166895 PRK00295 PRK00295 hypothetical protein; Provisional 68
8473 234718 PRK00296 minE cell division topological specificity factor MinE; Reviewed 86
8474 178967 PRK00299 PRK00299 sulfurtransferase TusA. 81
8475 234719 PRK00300 gmk guanylate kinase; Provisional 205
8476 234720 PRK00301 aat leucyl/phenylalanyl-tRNA--protein transferase; Reviewed 233
8477 234721 PRK00302 lnt apolipoprotein N-acyltransferase; Reviewed 505
8478 166901 PRK00304 PRK00304 hypothetical protein; Provisional 75
8479 178971 PRK00306 PRK00306 50S ribosomal protein L29; Reviewed 66
8480 234722 PRK00310 rpsC 30S ribosomal protein S3; Reviewed 232
8481 234723 PRK00311 panB 3-methyl-2-oxobutanoate hydroxymethyltransferase; Reviewed 264
8482 178974 PRK00312 pcm protein-L-isoaspartate(D-aspartate) O-methyltransferase. 212
8483 234724 PRK00315 PRK00315 potassium-transporting ATPase subunit KdpC. 193
8484 234725 PRK00317 mobA molybdopterin-guanine dinucleotide biosynthesis protein MobA; Reviewed 193
8485 234726 PRK00321 rdgC recombination associated protein; Reviewed 303
8486 234727 PRK00325 algL polysaccharide lyase. 359
8487 234728 PRK00326 PRK00326 transcriptional regulator MraZ. 139
8488 178979 PRK00329 PRK00329 GIY-YIG nuclease superfamily protein; Validated 86
8489 234729 PRK00331 PRK00331 isomerizing glutamine--fructose-6-phosphate transaminase. 604
8490 234730 PRK00339 minC septum site-determining protein MinC. 249
8491 166914 PRK00341 PRK00341 DUF493 domain-containing protein. 91
8492 234731 PRK00343 ipk 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 271
8493 234732 PRK00346 surE 5'(3')-nucleotidase/polyphosphatase; Provisional 250
8494 234733 PRK00347 PRK00347 DNA/RNA nuclease SfsA. 234
8495 234734 PRK00349 uvrA excinuclease ABC subunit UvrA. 943
8496 178985 PRK00357 rpsS 30S ribosomal protein S19; Reviewed 92
8497 234735 PRK00358 pyrH uridylate kinase; Provisional 231
8498 234736 PRK00359 rpmB 50S ribosomal protein L28; Reviewed 76
8499 178988 PRK00364 groES co-chaperonin GroES; Reviewed 95
8500 234737 PRK00366 ispG flavodoxin-dependent (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate synthase. 360
8501 234738 PRK00369 pyrC dihydroorotase; Provisional 392
8502 178991 PRK00373 PRK00373 V-type ATP synthase subunit D; Reviewed 204
8503 234739 PRK00376 lspA lipoprotein signal peptidase. 160
8504 234740 PRK00377 cbiT cobalt-precorrin-6Y C(15)-methyltransferase; Provisional 198
8505 178993 PRK00378 PRK00378 nucleoid-associated protein NdpA; Validated 334
8506 234741 PRK00380 panC pantoate--beta-alanine ligase; Reviewed 281
8507 234742 PRK00389 gcvT glycine cleavage system aminomethyltransferase GcvT. 359
8508 234743 PRK00390 leuS leucyl-tRNA synthetase; Validated 805
8509 178997 PRK00391 rpsR 30S ribosomal protein S18; Reviewed 79
8510 234744 PRK00392 rpoZ DNA-directed RNA polymerase subunit omega; Reviewed 69
8511 234745 PRK00393 ribA GTP cyclohydrolase II RibA. 197
8512 234746 PRK00394 PRK00394 transcription factor; Reviewed 179
8513 179001 PRK00395 hfq RNA-binding protein Hfq; Provisional 79
8514 179002 PRK00396 rnpA ribonuclease P protein component. 130
8515 234747 PRK00398 rpoP DNA-directed RNA polymerase subunit P; Provisional 46
8516 179004 PRK00399 rpmH 50S ribosomal protein L34; Reviewed 44
8517 179005 PRK00400 hisE phosphoribosyl-ATP diphosphatase. 105
8518 234748 PRK00402 PRK00402 3-isopropylmalate dehydratase large subunit; Reviewed 418
8519 166942 PRK00404 tatB Sec-independent protein translocase subunit TatB. 141
8520 234749 PRK00405 rpoB DNA-directed RNA polymerase subunit beta; Reviewed 1112
8521 179008 PRK00407 Archease Archease protein family (MTH1598/TM1083). This archease family of proteins has two SHS2 domains, with one inserted into the other. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism. 139
8522 234750 PRK00409 PRK00409 recombination and DNA strand exchange inhibitor protein; Reviewed 782
8523 234751 PRK00411 cdc6 ORC1-type DNA replication protein. 394
8524 234752 PRK00413 thrS threonyl-tRNA synthetase; Reviewed 638
8525 179012 PRK00414 gmhA D-sedoheptulose 7-phosphate isomerase. 192
8526 179013 PRK00415 rps27e 30S ribosomal protein S27e; Reviewed 59
8527 234753 PRK00416 dcd deoxycytidine triphosphate deaminase; Reviewed 177
8528 234754 PRK00418 PRK00418 DNA gyrase inhibitor YacG. 62
8529 234755 PRK00419 PRK00419 DNA primase small subunit PriS. 376
8530 234756 PRK00420 PRK00420 hypothetical protein; Validated 112
8531 234757 PRK00421 murC UDP-N-acetylmuramate--L-alanine ligase; Provisional 461
8532 234758 PRK00423 tfb transcription initiation factor IIB; Reviewed 310
8533 179020 PRK00430 fis DNA-binding transcriptional regulator Fis. 95
8534 234759 PRK00431 PRK00431 ADP-ribose-binding protein. 177
8535 234760 PRK00432 PRK00432 30S ribosomal protein S27ae; Validated 50
8536 179023 PRK00435 ef1B elongation factor 1-beta; Validated 88
8537 234761 PRK00436 argC N-acetyl-gamma-glutamyl-phosphate reductase; Validated 343
8538 234762 PRK00439 leuD 3-isopropylmalate dehydratase small subunit; Reviewed 163
8539 234763 PRK00440 rfc replication factor C small subunit; Reviewed 319
8540 179027 PRK00441 argR arginine repressor; Provisional 149
8541 234764 PRK00442 tatA twin-arginine translocase TatA/TatE family subunit. 92
8542 179028 PRK00443 nagB glucosamine-6-phosphate deaminase; Provisional 261
8543 234765 PRK00446 cyaY frataxin-like protein; Provisional 105
8544 234766 PRK00447 PRK00447 hypothetical protein; Provisional 144
8545 234767 PRK00448 polC DNA polymerase III PolC; Validated 1437
8546 234768 PRK00450 dapF diaminopimelate epimerase; Provisional 274
8547 234769 PRK00451 PRK00451 aminomethyl-transferring glycine dehydrogenase subunit GcvPA. 447
8548 179034 PRK00453 rpsF 30S ribosomal protein S6; Reviewed 108
8549 234770 PRK00454 engB GTP-binding protein YsxC; Reviewed 196
8550 234771 PRK00455 pyrE orotate phosphoribosyltransferase; Validated 202
8551 234772 PRK00458 PRK00458 adenosylmethionine decarboxylase. 127
8552 234773 PRK00461 rpmC 50S ribosomal protein L29; Reviewed 87
8553 234774 PRK00464 nrdR transcriptional repressor NrdR. 154
8554 179039 PRK00465 rpmJ 50S ribosomal protein L36; Reviewed 37
8555 166979 PRK00466 PRK00466 acetyl-lysine deacetylase; Validated 346
8556 179040 PRK00468 PRK00468 KH domain-containing protein. 75
8557 179041 PRK00474 rps9p 30S ribosomal protein S9P; Reviewed 134
8558 234775 PRK00476 aspS aspartyl-tRNA synthetase; Validated 588
8559 234776 PRK00478 scpA segregation and condensation protein ScpA. 505
8560 234777 PRK00481 PRK00481 NAD-dependent deacetylase; Provisional 242
8561 234778 PRK00484 lysS lysyl-tRNA synthetase; Reviewed 491
8562 234779 PRK00485 fumC fumarate hydratase; Reviewed 464
8563 234780 PRK00488 pheS phenylalanyl-tRNA synthetase subunit alpha; Validated 339
8564 234781 PRK00489 hisG ATP phosphoribosyltransferase; Reviewed 287
8565 234782 PRK00499 rnpA ribonuclease P; Reviewed 114
8566 234783 PRK00504 rpmG 50S ribosomal protein L33; Validated 50
8567 234784 PRK00507 PRK00507 deoxyribose-phosphate aldolase; Provisional 221
8568 234785 PRK00509 PRK00509 argininosuccinate synthase; Provisional 399
8569 179052 PRK00513 minC septum formation inhibitor; Reviewed 214
8570 234786 PRK00517 prmA 50S ribosomal protein L11 methyltransferase. 250
8571 234787 PRK00521 rbfA 30S ribosome-binding factor RbfA. 120
8572 179055 PRK00522 tpx thiol peroxidase. 167
8573 179056 PRK00523 PRK00523 hypothetical protein; Provisional 72
8574 179057 PRK00528 rpmE 50S ribosomal protein L31; Reviewed 71
8575 234788 PRK00529 PRK00529 elongation factor P; Validated 186
8576 134311 PRK00536 speE spermidine synthase; Provisional 262
8577 179059 PRK00539 atpC F0F1 ATP synthase subunit epsilon; Validated 133
8578 234789 PRK00549 PRK00549 competence damage-inducible protein A; Provisional 414
8579 234790 PRK00550 rpsE 30S ribosomal protein S5; Validated 168
8580 179062 PRK00553 PRK00553 ribose-phosphate pyrophosphokinase; Provisional 332
8581 179063 PRK00555 PRK00555 galactokinase; Provisional 363
8582 234791 PRK00556 minC septum formation inhibitor; Reviewed 194
8583 234792 PRK00558 uvrC excinuclease ABC subunit UvrC. 598
8584 167003 PRK00560 PRK00560 molybdenum cofactor guanylyltransferase MobA. 196
8585 100598 PRK00561 ppnK NAD(+) kinase. 259
8586 179066 PRK00564 hypA hydrogenase nickel incorporation protein; Provisional 117
8587 234793 PRK00565 rplV 50S ribosomal protein L22; Reviewed 112
8588 234794 PRK00566 PRK00566 DNA-directed RNA polymerase subunit beta'; Provisional 1156
8589 234795 PRK00567 mscL large-conductance mechanosensitive channel protein MscL. 134
8590 134322 PRK00568 PRK00568 carbon storage regulator; Provisional 76
8591 234796 PRK00571 atpC F0F1 ATP synthase subunit epsilon; Validated 135
8592 100605 PRK00573 lspA signal peptidase II; Provisional 184
8593 234797 PRK00575 tatA Sec-independent protein translocase subunit TatA. 92
8594 234798 PRK00576 PRK00576 molybdopterin-guanine dinucleotide biosynthesis protein A; Provisional 178
8595 234799 PRK00578 prfB peptide chain release factor 2; Validated 367
8596 234800 PRK00587 PRK00587 YbaB/EbfC family nucleoid-associated protein. 99
8597 179073 PRK00588 rnpA ribonuclease P protein component. 118
8598 234801 PRK00591 prfA peptide chain release factor 1; Validated 359
8599 179075 PRK00595 rpmG 50S ribosomal protein L33; Validated 53
8600 179076 PRK00596 rpsJ 30S ribosomal protein S10; Reviewed 102
8601 234802 PRK00601 dut dUTP diphosphatase. 150
8602 100616 PRK00611 PRK00611 putative disulfide oxidoreductase; Provisional 135
8603 234803 PRK00615 PRK00615 glutamate-1-semialdehyde aminotransferase; Provisional 433
8604 167014 PRK00624 PRK00624 glycine cleavage system protein H; Provisional 114
8605 134335 PRK00625 PRK00625 shikimate kinase; Provisional 173
8606 234804 PRK00629 pheT phenylalanyl-tRNA synthetase subunit beta; Reviewed 791
8607 234805 PRK00630 PRK00630 nickel responsive regulator; Provisional 148
8608 234806 PRK00635 PRK00635 excinuclease ABC subunit A; Provisional 1809
8609 179080 PRK00642 PRK00642 inorganic pyrophosphatase; Provisional 205
8610 100624 PRK00647 PRK00647 hypothetical protein; Validated 96
8611 234807 PRK00648 PRK00648 Maf-like protein; Reviewed 191
8612 134340 PRK00650 PRK00650 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 288
8613 234808 PRK00652 lpxK tetraacyldisaccharide 4'-kinase; Reviewed 325
8614 234809 PRK00654 glgA glycogen synthase GlgA. 466
8615 179084 PRK00665 petG cytochrome b6-f complex subunit PetG; Reviewed 37
8616 179085 PRK00668 ndk multifunctional nucleoside diphosphate kinase/apyrimidinic endonuclease/3'-; Validated 134
8617 234810 PRK00676 hemA glutamyl-tRNA reductase; Validated 338
8618 179086 PRK00683 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 418
8619 234811 PRK00685 PRK00685 metal-dependent hydrolase; Provisional 228
8620 234812 PRK00694 PRK00694 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Validated 606
8621 234813 PRK00696 sucC ADP-forming succinate--CoA ligase subunit beta. 388
8622 234814 PRK00698 tmk thymidylate kinase; Validated 205
8623 234815 PRK00701 PRK00701 divalent metal cation transporter MntH. 439
8624 234816 PRK00702 PRK00702 ribose-5-phosphate isomerase RpiA. 220
8625 234817 PRK00704 PRK00704 photosystem I reaction center protein subunit XI; Provisional 160
8626 234818 PRK00708 PRK00708 twin-arginine translocase subunit TatB. 209
8627 234819 PRK00711 PRK00711 D-amino acid dehydrogenase. 416
8628 234820 PRK00714 PRK00714 RNA pyrophosphohydrolase; Reviewed 156
8629 234821 PRK00719 PRK00719 alkanesulfonate monooxygenase; Provisional 378
8630 234822 PRK00720 tatA twin-arginine translocase TatA/TatE family subunit. 78
8631 179097 PRK00723 PRK00723 phosphatidylserine decarboxylase; Provisional 297
8632 234823 PRK00724 PRK00724 formate dehydrogenase accessory sulfurtransferase FdhD. 263
8633 234824 PRK00725 glgC glucose-1-phosphate adenylyltransferase; Provisional 425
8634 234825 PRK00726 murG undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase; Provisional 357
8635 234826 PRK00730 rnpA ribonuclease P protein component. 138
8636 179101 PRK00732 fliE flagellar hook-basal body complex protein FliE. 102
8637 234827 PRK00733 hppA membrane-bound proton-translocating pyrophosphatase; Validated 666
8638 179103 PRK00736 PRK00736 hypothetical protein; Provisional 68
8639 179104 PRK00737 PRK00737 small nuclear ribonucleoprotein; Provisional 72
8640 179105 PRK00741 prfC peptide chain release factor 3; Provisional 526
8641 234828 PRK00742 PRK00742 chemotaxis-specific protein-glutamate methyltransferase CheB. 354
8642 179107 PRK00745 PRK00745 4-oxalocrotonate tautomerase; Provisional 62
8643 179108 PRK00748 PRK00748 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase; Validated 233
8644 234829 PRK00750 lysK lysyl-tRNA synthetase; Reviewed 510
8645 134373 PRK00753 psbL photosystem II reaction center L; Provisional 39
8646 179110 PRK00754 PRK00754 signal recognition particle protein Srp19; Provisional 95
8647 179111 PRK00756 PRK00756 acyltransferase NodA; Provisional 196
8648 179112 PRK00758 PRK00758 GMP synthase subunit A; Validated 184
8649 179113 PRK00762 hypA hydrogenase maturation nickel metallochaperone HypA. 124
8650 234830 PRK00766 PRK00766 hypothetical protein; Provisional 194
8651 179115 PRK00767 PRK00767 transcriptional regulator BetI; Validated 197
8652 234831 PRK00768 nadE ammonia-dependent NAD(+) synthetase. 268
8653 179117 PRK00770 PRK00770 deoxyhypusine synthase. 384
8654 179118 PRK00771 PRK00771 signal recognition particle protein Srp54; Provisional 437
8655 234832 PRK00772 PRK00772 3-isopropylmalate dehydrogenase; Provisional 358
8656 234833 PRK00773 rplX 50S ribosomal protein LX; Validated 76
8657 234834 PRK00777 PRK00777 pantetheine-phosphate adenylyltransferase. 153
8658 234835 PRK00779 PRK00779 ornithine carbamoyltransferase; Provisional 304
8659 234836 PRK00782 PRK00782 MEMO1 family protein. 267
8660 234837 PRK00783 PRK00783 DNA-directed RNA polymerase subunit D; Provisional 263
8661 234838 PRK00784 PRK00784 cobyric acid synthase. 488
8662 179126 PRK00790 fliE flagellar hook-basal body complex protein FliE. 109
8663 179127 PRK00794 flbT flagellar biosynthesis repressor FlbT; Reviewed 132
8664 234839 PRK00801 PRK00801 hypothetical protein; Provisional 201
8665 234840 PRK00802 PRK00802 DNA-3-methyladenine glycosylase. 188
8666 234841 PRK00805 PRK00805 putative deoxyhypusine synthase; Provisional 329
8667 179131 PRK00807 PRK00807 50S ribosomal protein L24e; Validated 52
8668 179132 PRK00808 PRK00808 bacteriohemerythrin. 150
8669 179133 PRK00809 PRK00809 hypothetical protein; Provisional 144
8670 234842 PRK00810 nifW nitrogenase stabilizing/protective protein NifW. 113
8671 234843 PRK00811 PRK00811 polyamine aminopropyltransferase. 283
8672 234844 PRK00816 rnfD electron transport complex protein RnfD; Reviewed 350
8673 179136 PRK00819 PRK00819 RNA 2'-phosphotransferase; Reviewed 179
8674 234845 PRK00823 phhB pterin-4-alpha-carbinolamine dehydratase; Validated 97
8675 179138 PRK00831 rpmJ 50S ribosomal protein L36; Validated 41
8676 179139 PRK00843 egsA NAD(P)-dependent glycerol-1-phosphate dehydrogenase; Reviewed 350
8677 234846 PRK00844 glgC glucose-1-phosphate adenylyltransferase; Provisional 407
8678 179141 PRK00846 PRK00846 hypothetical protein; Provisional 77
8679 234847 PRK00847 thyX FAD-dependent thymidylate synthase; Reviewed 217
8680 234848 PRK00854 rocD ornithine--oxo-acid transaminase; Reviewed 401
8681 179143 PRK00855 PRK00855 argininosuccinate lyase; Provisional 459
8682 234849 PRK00856 pyrB aspartate carbamoyltransferase catalytic subunit. 305
8683 234850 PRK00861 PRK00861 putative lipid kinase; Reviewed 300
8684 234851 PRK00865 PRK00865 glutamate racemase; Provisional 261
8685 179147 PRK00870 PRK00870 haloalkane dehalogenase; Provisional 302
8686 234852 PRK00871 PRK00871 glutathione-regulated potassium-efflux system oxidoreductase KefF. 176
8687 179149 PRK00872 PRK00872 hypothetical protein; Provisional 157
8688 179150 PRK00876 nadE NAD(+) synthase. 326
8689 234853 PRK00877 hisD bifunctional histidinal dehydrogenase/histidinol dehydrogenase; Reviewed 425
8690 234854 PRK00881 purH bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase; Provisional 513
8691 234855 PRK00884 PRK00884 Maf-like protein; Reviewed 194
8692 234856 PRK00885 PRK00885 phosphoribosylamine--glycine ligase; Provisional 420
8693 234857 PRK00886 PRK00886 2-phosphosulfolactate phosphatase family protein. 240
8694 179156 PRK00888 ftsB cell division protein FtsB; Reviewed 105
8695 179157 PRK00889 PRK00889 adenylylsulfate kinase; Provisional 175
8696 234858 PRK00892 lpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase; Provisional 343
8697 234859 PRK00893 PRK00893 aspartate carbamoyltransferase regulatory subunit; Reviewed 152
8698 234860 PRK00901 PRK00901 methylated-DNA--protein-cysteine methyltransferase; Provisional 155
8699 179161 PRK00907 PRK00907 hypothetical protein; Provisional 92
8700 179162 PRK00910 ribB 3,4-dihydroxy-2-butanone-4-phosphate synthase. 218
8701 234861 PRK00911 PRK00911 dihydroxy-acid dehydratase; Provisional 552
8702 234862 PRK00912 PRK00912 ribonuclease P protein component 3; Provisional 237
8703 234863 PRK00913 PRK00913 multifunctional aminopeptidase A; Provisional 483
8704 234864 PRK00915 PRK00915 2-isopropylmalate synthase; Validated 513
8705 179167 PRK00919 PRK00919 glutamine-hydrolyzing GMP synthase subunit GuaA. 307
8706 234865 PRK00923 PRK00923 sirohydrochlorin nickelochelatase. 126
8707 179169 PRK00924 PRK00924 5-keto-4-deoxyuronate isomerase; Provisional 276
8708 234866 PRK00927 PRK00927 tryptophanyl-tRNA synthetase; Reviewed 333
8709 234867 PRK00933 PRK00933 ribosomal biogenesis protein; Validated 165
8710 234868 PRK00934 PRK00934 ribose-phosphate pyrophosphokinase; Provisional 285
8711 179173 PRK00939 PRK00939 stress response translation initiation inhibitor YciH. 99
8712 179174 PRK00941 PRK00941 acetyl-CoA decarbonylase/synthase complex subunit alpha; Validated 781
8713 234869 PRK00942 PRK00942 acetylglutamate kinase; Provisional 283
8714 234870 PRK00943 PRK00943 selenide, water dikinase SelD. 347
8715 234871 PRK00944 PRK00944 hypothetical protein; Provisional 195
8716 179177 PRK00945 PRK00945 acetyl-CoA decarbonylase/synthase complex subunit epsilon; Provisional 171
8717 234872 PRK00950 PRK00950 histidinol-phosphate transaminase. 361
8718 234873 PRK00951 hisB imidazoleglycerol-phosphate dehydratase HisB. 195
8719 234874 PRK00955 PRK00955 YgiQ family radical SAM protein. 620
8720 179181 PRK00956 thyA thymidylate synthase; Provisional 208
8721 234875 PRK00957 PRK00957 methionine synthase; Provisional 305
8722 234876 PRK00960 PRK00960 seryl-tRNA synthetase; Provisional 517
8723 179184 PRK00961 PRK00961 H(2)-dependent methylenetetrahydromethanopterin dehydrogenase; Provisional 342
8724 179185 PRK00962 PRK00962 hypothetical protein; Provisional 165
8725 234877 PRK00964 PRK00964 tetrahydromethanopterin S-methyltransferase subunit A; Provisional 225
8726 179187 PRK00965 PRK00965 tetrahydromethanopterin S-methyltransferase subunit B; Provisional 96
8727 179188 PRK00967 PRK00967 hypothetical protein; Provisional 105
8728 234878 PRK00968 PRK00968 tetrahydromethanopterin S-methyltransferase subunit D; Provisional 240
8729 234879 PRK00969 PRK00969 methanogenesis marker 3 protein. 508
8730 234880 PRK00971 PRK00971 glutaminase; Provisional 307
8731 234881 PRK00972 PRK00972 tetrahydromethanopterin S-methyltransferase subunit E; Provisional 292
8732 179193 PRK00973 PRK00973 glucose-6-phosphate isomerase; Provisional 446
8733 234882 PRK00976 PRK00976 methanogenesis marker 12 protein. 326
8734 179195 PRK00977 PRK00977 exodeoxyribonuclease VII small subunit; Provisional 80
8735 234883 PRK00979 PRK00979 tetrahydromethanopterin S-methyltransferase subunit H; Provisional 308
8736 179197 PRK00982 acpP acyl carrier protein; Provisional 78
8737 234884 PRK00984 truD tRNA pseudouridine synthase D; Reviewed 341
8738 179199 PRK00989 truB tRNA pseudouridine synthase B; Provisional 230
8739 234885 PRK00994 PRK00994 F420-dependent methylenetetrahydromethanopterin dehydrogenase. 277
8740 234886 PRK00996 PRK00996 ribonuclease HIII; Provisional 304
8741 234887 PRK01001 PRK01001 putative inner membrane protein translocase component YidC; Provisional 795
8742 179203 PRK01002 PRK01002 nickel responsive regulator; Provisional 141
8743 179204 PRK01005 PRK01005 V-type ATP synthase subunit E; Provisional 207
8744 134464 PRK01008 PRK01008 queuine tRNA-ribosyltransferase; Provisional 372
8745 179205 PRK01018 PRK01018 50S ribosomal protein L30e; Reviewed 99
8746 167141 PRK01021 lpxB lipid-A-disaccharide synthase; Reviewed 608
8747 234888 PRK01022 PRK01022 hypothetical protein; Provisional 167
8748 179207 PRK01024 PRK01024 Na(+)-translocating NADH-quinone reductase subunit B; Provisional 503
8749 179208 PRK01026 PRK01026 tetrahydromethanopterin S-methyltransferase subunit G; Provisional 77
8750 234889 PRK01029 tolB Tol-Pal system protein TolB. 428
8751 234890 PRK01030 PRK01030 tetrahydromethanopterin S-methyltransferase subunit C; Provisional 264
8752 234891 PRK01033 PRK01033 imidazole glycerol phosphate synthase subunit HisF; Provisional 258
8753 234892 PRK01037 trmD tRNA (guanine-N(1)-)-methyltransferase/unknown domain fusion protein; Reviewed 357
8754 234893 PRK01045 ispH 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Reviewed 298
8755 234894 PRK01059 PRK01059 ATP:guanido phosphotransferase; Provisional 346
8756 179214 PRK01060 PRK01060 endonuclease IV; Provisional 281
8757 234895 PRK01061 PRK01061 Na(+)-translocating NADH-quinone reductase subunit E; Provisional 244
8758 179216 PRK01064 PRK01064 hypothetical protein; Provisional 78
8759 167150 PRK01066 PRK01066 porphobilinogen deaminase; Provisional 231
8760 179217 PRK01076 PRK01076 L-rhamnose isomerase; Provisional 419
8761 234896 PRK01077 PRK01077 cobyrinate a,c-diamide synthase. 451
8762 234897 PRK01096 PRK01096 deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional 440
8763 179220 PRK01099 rpoK DNA-directed RNA polymerase subunit K; Provisional 62
8764 234898 PRK01100 PRK01100 accessory gene regulator ArgB-like protein. 210
8765 234899 PRK01103 PRK01103 bifunctional DNA-formamidopyrimidine glycosylase/DNA-(apurinic or apyrimidinic site) lyase. 274
8766 234900 PRK01109 PRK01109 ATP-dependent DNA ligase; Provisional 590
8767 234901 PRK01110 rpmF 50S ribosomal protein L32; Validated 60
8768 234902 PRK01112 PRK01112 2,3-bisphosphoglycerate-dependent phosphoglycerate mutase. 228
8769 234903 PRK01115 PRK01115 DNA polymerase sliding clamp; Validated 247
8770 234904 PRK01117 PRK01117 adenylosuccinate synthetase; Provisional 430
8771 179228 PRK01119 PRK01119 putative heavy metal-binding protein. 106
8772 234905 PRK01122 PRK01122 potassium-transporting ATPase subunit KdpB. 679
8773 234906 PRK01123 PRK01123 shikimate kinase; Provisional 282
8774 234907 PRK01130 PRK01130 putative N-acetylmannosamine-6-phosphate 2-epimerase. 221
8775 234908 PRK01143 rpl11p 50S ribosomal protein L11P; Validated 163
8776 234909 PRK01146 PRK01146 DNA-directed RNA polymerase subunit L; Provisional 85
8777 179234 PRK01151 rps17E 30S ribosomal protein S17e; Validated 58
8778 179235 PRK01153 PRK01153 nicotinamide-nucleotide adenylyltransferase; Provisional 174
8779 100796 PRK01156 PRK01156 chromosome segregation protein; Provisional 895
8780 234910 PRK01158 PRK01158 phosphoglycolate phosphatase; Provisional 230
8781 234911 PRK01160 PRK01160 hypothetical protein; Provisional 178
8782 234912 PRK01170 PRK01170 bifunctional pantetheine-phosphate adenylyltransferase/NTP phosphatase. 322
8783 100801 PRK01172 PRK01172 ATP-dependent DNA helicase. 674
8784 234913 PRK01175 PRK01175 phosphoribosylformylglycinamidine synthase I; Provisional 261
8785 167170 PRK01177 PRK01177 hypothetical protein; Provisional 140
8786 179239 PRK01178 rps24e 30S ribosomal protein S24e; Reviewed 99
8787 234914 PRK01184 PRK01184 flagellar hook-basal body complex protein FliE. 184
8788 179241 PRK01185 ppnK NAD(+) kinase. 271
8789 100807 PRK01189 PRK01189 V-type ATP synthase subunit F; Provisional 104
8790 234915 PRK01191 rpl24p 50S ribosomal protein L24P; Validated 120
8791 234916 PRK01192 PRK01192 50S ribosomal protein L31e; Reviewed 89
8792 100810 PRK01194 PRK01194 V-type ATP synthase subunit E; Provisional 185
8793 234917 PRK01198 PRK01198 V-type ATP synthase subunit C; Provisional 352
8794 234918 PRK01202 PRK01202 glycine cleavage system protein GcvH. 127
8795 100813 PRK01203 PRK01203 prefoldin subunit alpha; Provisional 130
8796 100814 PRK01207 PRK01207 methionine synthase; Provisional 343
8797 234919 PRK01209 cobD cobalamin biosynthesis protein. 312
8798 179247 PRK01211 PRK01211 dihydroorotase; Provisional 409
8799 234920 PRK01212 PRK01212 homoserine kinase; Provisional 301
8800 234921 PRK01213 PRK01213 phosphoribosylformylglycinamidine synthase subunit PurL. 724
8801 179250 PRK01215 PRK01215 nicotinamide mononucleotide deamidase-related protein. 264
8802 179251 PRK01216 PRK01216 DNA polymerase IV; Validated 351
8803 179252 PRK01217 PRK01217 hypothetical protein; Provisional 114
8804 179253 PRK01220 PRK01220 malonate decarboxylase subunit delta; Provisional 99
8805 234922 PRK01221 PRK01221 deoxyhypusine synthase. 312
8806 234923 PRK01222 PRK01222 phosphoribosylanthranilate isomerase. 210
8807 234924 PRK01229 PRK01229 N-glycosylase/DNA lyase; Provisional 208
8808 179257 PRK01231 ppnK NAD(+) kinase. 295
8809 234925 PRK01233 glyS glycyl-tRNA synthetase subunit beta; Validated 682
8810 234926 PRK01236 PRK01236 S-adenosylmethionine decarboxylase proenzyme; Provisional 131
8811 234927 PRK01237 PRK01237 triphosphoribosyl-dephospho-CoA synthase; Validated 289
8812 179261 PRK01242 rpl39e 50S ribosomal protein L39e; Validated 50
8813 179262 PRK01250 PRK01250 inorganic diphosphatase. 176
8814 179263 PRK01253 PRK01253 preprotein translocase subunit Sec61beta. 54
8815 234928 PRK01254 PRK01254 YgiQ family radical SAM protein. 707
8816 234929 PRK01259 PRK01259 ribose-phosphate diphosphokinase. 309
8817 234930 PRK01261 aroD 3-dehydroquinate dehydratase; Provisional 229
8818 234931 PRK01265 PRK01265 heat shock protein HtpX; Provisional 324
8819 234932 PRK01269 PRK01269 tRNA s(4)U8 sulfurtransferase; Provisional 482
8820 179269 PRK01271 PRK01271 tautomerase PptA. 76
8821 179270 PRK01278 argD acetylornithine transaminase protein; Provisional 389
8822 234933 PRK01285 PRK01285 pyruvoyl-dependent arginine decarboxylase; Reviewed 155
8823 234934 PRK01286 PRK01286 deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional 336
8824 234935 PRK01287 xerC site-specific tyrosine recombinase XerC; Reviewed 358
8825 234936 PRK01293 PRK01293 phosphoribosyl-dephospho-CoA transferase; Provisional 207
8826 234937 PRK01294 PRK01294 lipase secretion chaperone. 336
8827 167205 PRK01295 PRK01295 phosphoglyceromutase; Provisional 206
8828 234938 PRK01297 PRK01297 ATP-dependent RNA helicase RhlB; Provisional 475
8829 234939 PRK01305 PRK01305 arginyl-tRNA-protein transferase; Provisional 240
8830 167208 PRK01310 PRK01310 hypothetical protein; Validated 104
8831 234940 PRK01313 rnpA ribonuclease P protein component. 129
8832 234941 PRK01315 PRK01315 putative inner membrane protein translocase component YidC; Provisional 329
8833 234942 PRK01318 PRK01318 membrane protein insertase; Provisional 521
8834 179280 PRK01322 PRK01322 6-carboxyhexanoate--CoA ligase; Provisional 242
8835 179281 PRK01326 prsA foldase protein PrsA; Reviewed 310
8836 234943 PRK01343 PRK01343 zinc-binding protein; Provisional 57
8837 234944 PRK01345 PRK01345 heat shock protein HtpX; Provisional 317
8838 234945 PRK01346 PRK01346 enhanced intracellular survival protein Eis. 411
8839 234946 PRK01355 PRK01355 azoreductase; Reviewed 199
8840 167217 PRK01356 hscB co-chaperone HscB; Provisional 166
8841 234947 PRK01362 PRK01362 fructose-6-phosphate aldolase. 214
8842 179286 PRK01368 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 454
8843 179287 PRK01371 PRK01371 Sec-independent protein translocase protein TatB. 137
8844 234948 PRK01372 ddl D-alanine--D-alanine ligase; Reviewed 304
8845 134546 PRK01379 cyaY iron donor protein CyaY. 103
8846 179289 PRK01381 PRK01381 trp operon repressor. 99
8847 234949 PRK01388 PRK01388 arginine deiminase; Provisional 406
8848 234950 PRK01390 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 460
8849 234951 PRK01392 citX 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-citrate lyase; Reviewed 180
8850 179293 PRK01395 PRK01395 V-type ATP synthase subunit F; Provisional 104
8851 179294 PRK01397 PRK01397 50S ribosomal protein L31; Provisional 78
8852 234952 PRK01402 hslO Hsp33-like chaperonin; Reviewed 328
8853 234953 PRK01406 gltX glutamyl-tRNA synthetase; Reviewed 476
8854 167229 PRK01415 PRK01415 hypothetical protein; Validated 247
8855 234954 PRK01424 PRK01424 S-adenosylmethionine:tRNA ribosyltransferase-isomerase; Provisional 366
8856 234955 PRK01433 hscA chaperone protein HscA; Provisional 595
8857 179297 PRK01438 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 480
8858 167232 PRK01441 PRK01441 Maf-like protein; Reviewed 207
8859 179298 PRK01470 tatA twin arginine translocase protein A; Provisional 51
8860 100879 PRK01474 atpC F0F1 ATP synthase subunit epsilon; Validated 112
8861 134562 PRK01482 fliE flagellar hook-basal body complex protein FliE. 108
8862 234956 PRK01490 tig trigger factor; Provisional 435
8863 100883 PRK01492 rnpA ribonuclease P protein component. 118
8864 234957 PRK01526 PRK01526 Maf-like protein; Reviewed 205
8865 179300 PRK01528 truB tRNA pseudouridine synthase B; Provisional 292
8866 134567 PRK01530 PRK01530 hypothetical protein; Reviewed 105
8867 134568 PRK01533 PRK01533 histidinol-phosphate aminotransferase; Validated 366
8868 234958 PRK01544 PRK01544 bifunctional N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase/tRNA (m7G46) methyltransferase; Reviewed 506
8869 100891 PRK01546 PRK01546 hypothetical protein; Provisional 79
8870 234959 PRK01550 truB tRNA pseudouridine synthase B; Provisional 304
8871 179302 PRK01558 PRK01558 V-type ATP synthase subunit E; Provisional 198
8872 234960 PRK01565 PRK01565 thiamine biosynthesis protein ThiI; Provisional 394
8873 179304 PRK01574 lspA signal peptidase II; Provisional 163
8874 234961 PRK01581 speE polyamine aminopropyltransferase. 374
8875 234962 PRK01584 PRK01584 alanyl-tRNA synthetase; Provisional 594
8876 234963 PRK01610 PRK01610 putative voltage-gated ClC-type chloride channel ClcB; Provisional 418
8877 234964 PRK01611 argS arginyl-tRNA synthetase; Reviewed 507
8878 234965 PRK01614 tatE Sec-independent protein translocase subunit TatA. 85
8879 234966 PRK01617 PRK01617 hypothetical protein; Provisional 154
8880 179310 PRK01622 PRK01622 membrane protein insertase YidC. 256
8881 179311 PRK01625 sspH acid-soluble spore protein H; Provisional 59
8882 167247 PRK01631 PRK01631 hypothetical protein; Provisional 76
8883 179312 PRK01636 ccrB fluoride efflux transporter CrcB. 118
8884 179313 PRK01637 PRK01637 virulence factor BrkB family protein. 286
8885 179314 PRK01641 leuD 3-isopropylmalate dehydratase small subunit. 200
8886 234967 PRK01642 cls cardiolipin synthetase; Reviewed 483
8887 179316 PRK01655 spxA transcriptional regulator Spx; Reviewed 131
8888 167253 PRK01658 PRK01658 CidA/LrgA family holin-like protein. 122
8889 234968 PRK01663 PRK01663 C4-dicarboxylate transporter DctA; Reviewed 428
8890 234969 PRK01678 rpmE2 type B 50S ribosomal protein L31. 87
8891 234970 PRK01683 PRK01683 trans-aconitate 2-methyltransferase; Provisional 258
8892 234971 PRK01686 hisG ATP phosphoribosyltransferase catalytic subunit; Reviewed 215
8893 234972 PRK01688 PRK01688 histidinol-phosphate aminotransferase; Provisional 351
8894 179322 PRK01699 fliE flagellar hook-basal body complex protein FliE. 99
8895 167260 PRK01706 PRK01706 adenosylmethionine decarboxylase. 123
8896 179323 PRK01710 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 458
8897 234973 PRK01712 PRK01712 carbon storage regulator CsrA. 64
8898 167263 PRK01713 PRK01713 ornithine carbamoyltransferase; Provisional 334
8899 234974 PRK01722 PRK01722 formimidoylglutamase; Provisional 320
8900 234975 PRK01723 PRK01723 3-deoxy-D-manno-octulosonic-acid kinase; Reviewed 239
8901 179327 PRK01732 rnpA ribonuclease P; Reviewed 114
8902 234976 PRK01736 PRK01736 hypothetical protein; Reviewed 190
8903 234977 PRK01741 PRK01741 cell division protein ZipA; Provisional 332
8904 179329 PRK01742 tolB Tol-Pal system protein TolB. 429
8905 234978 PRK01747 mnmC bifunctional tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD/FAD-dependent 5-carboxymethylaminomethyl-2-thiouridine(34) oxidoreductase MnmC. 662
8906 234979 PRK01749 PRK01749 disulfide bond formation protein DsbB. 176
8907 179332 PRK01752 PRK01752 YchJ family protein. 156
8908 234980 PRK01759 glnD bifunctional uridylyltransferase/uridylyl-removing protein GlnD. 854
8909 234981 PRK01766 PRK01766 multidrug efflux protein; Reviewed 456
8910 179334 PRK01770 PRK01770 Sec-independent protein translocase subunit TatB. 171
8911 179335 PRK01773 hscB Fe-S protein assembly co-chaperone HscB. 173
8912 234982 PRK01777 PRK01777 RnfH family protein. 95
8913 167278 PRK01792 ribB 3,4-dihydroxy-2-butanone-4-phosphate synthase. 214
8914 179337 PRK01810 PRK01810 DNA polymerase IV; Validated 407
8915 179338 PRK01816 PRK01816 hypothetical protein; Provisional 143
8916 234983 PRK01821 PRK01821 hypothetical protein; Provisional 133
8917 234984 PRK01827 thyA thymidylate synthase; Reviewed 264
8918 167284 PRK01833 tatA Sec-independent protein translocase subunit TatA. 74
8919 179341 PRK01839 PRK01839 septum formation inhibitor Maf. 209
8920 234985 PRK01842 PRK01842 hypothetical protein; Provisional 149
8921 100947 PRK01844 PRK01844 YneF family protein. 72
8922 234986 PRK01851 truB tRNA pseudouridine synthase B; Provisional 303
8923 234987 PRK01862 PRK01862 voltage-gated chloride channel ClcB. 574
8924 179345 PRK01885 greB transcription elongation factor GreB; Reviewed 157
8925 234988 PRK01889 PRK01889 GTPase RsgA; Reviewed 356
8926 234989 PRK01903 rnpA ribonuclease P protein component. 133
8927 234990 PRK01904 PRK01904 DUF2057 domain-containing protein. 219
8928 179348 PRK01905 PRK01905 Fis family transcriptional regulator. 77
8929 179349 PRK01906 PRK01906 tetraacyldisaccharide 4'-kinase; Provisional 338
8930 179350 PRK01908 PRK01908 electron transport complex protein RnfG; Validated 205
8931 234991 PRK01909 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 329
8932 179352 PRK01911 ppnK inorganic polyphosphate/ATP-NAD kinase; Provisional 292
8933 179353 PRK01917 PRK01917 cation-binding hemerythrin HHE family protein; Provisional 139
8934 234992 PRK01919 tatB Sec-independent protein translocase subunit TatB. 169
8935 179355 PRK01964 PRK01964 4-oxalocrotonate tautomerase; Provisional 64
8936 234993 PRK01966 ddl D-alanine--D-alanine ligase. 333
8937 234994 PRK01973 PRK01973 septum site-determining protein MinC. 271
8938 179358 PRK02001 PRK02001 ribosome assembly cofactor RimP. 152
8939 234995 PRK02006 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 498
8940 179360 PRK02047 PRK02047 hypothetical protein; Provisional 91
8941 179361 PRK02048 PRK02048 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Provisional 611
8942 179362 PRK02079 PRK02079 pyrroloquinoline quinone biosynthesis peptide chaperone PqqD. 88
8943 234996 PRK02083 PRK02083 imidazole glycerol phosphate synthase subunit HisF; Provisional 253
8944 234997 PRK02090 PRK02090 phosphoadenylyl-sulfate reductase. 241
8945 234998 PRK02098 PRK02098 phosphoribosyl-dephospho-CoA transferase; Provisional 221
8946 234999 PRK02101 PRK02101 peroxide stress protein YaaA. 255
8947 179366 PRK02102 PRK02102 ornithine carbamoyltransferase; Validated 331
8948 179367 PRK02103 PRK02103 malonate decarboxylase acyl carrier protein. 105
8949 235000 PRK02106 PRK02106 choline dehydrogenase; Validated 560
8950 235001 PRK02107 PRK02107 glutamate--cysteine ligase; Provisional 523
8951 235002 PRK02110 PRK02110 disulfide bond formation protein B; Provisional 169
8952 179371 PRK02113 PRK02113 MBL fold metallo-hydrolase. 252
8953 235003 PRK02114 PRK02114 formylmethanofuran--tetrahydromethanopterin formyltransferase; Provisional 297
8954 179373 PRK02118 PRK02118 V-type ATP synthase subunit B; Provisional 436
8955 235004 PRK02119 PRK02119 hypothetical protein; Provisional 73
8956 235005 PRK02122 PRK02122 glucosamine-6-phosphate deaminase-like protein; Validated 652
8957 235006 PRK02126 PRK02126 ribonuclease Z; Provisional 334
8958 235007 PRK02134 PRK02134 chitin disaccharide deacetylase. 249
8959 235008 PRK02135 PRK02135 hypothetical protein; Provisional 201
8960 235009 PRK02141 PRK02141 Maf-like protein; Reviewed 207
8961 179379 PRK02155 ppnK NAD kinase. 291
8962 167325 PRK02166 PRK02166 hypothetical protein; Reviewed 184
8963 235010 PRK02186 PRK02186 argininosuccinate lyase; Provisional 887
8964 235011 PRK02190 PRK02190 agmatinase; Provisional 301
8965 179381 PRK02193 truB tRNA pseudouridine synthase B; Provisional 279
8966 179382 PRK02195 PRK02195 V-type ATP synthase subunit D; Provisional 201
8967 235012 PRK02201 PRK02201 putative inner membrane protein translocase component YidC; Provisional 357
8968 235013 PRK02220 PRK02220 4-oxalocrotonate tautomerase; Provisional 61
8969 179385 PRK02224 PRK02224 DNA double-strand break repair Rad50 ATPase. 880
8970 235014 PRK02227 PRK02227 (5-formylfuran-3-yl)methyl phosphate synthase. 238
8971 179387 PRK02228 PRK02228 V-type ATP synthase subunit F; Provisional 100
8972 179388 PRK02230 PRK02230 inorganic pyrophosphatase; Provisional 184
8973 167337 PRK02231 ppnK NAD(+) kinase. 272
8974 235015 PRK02234 recU Holliday junction-specific endonuclease; Reviewed 195
8975 235016 PRK02237 PRK02237 YnfA family protein. 109
8976 179391 PRK02240 PRK02240 GTP cyclohydrolase IIa. 254
8977 179392 PRK02249 PRK02249 DNA primase regulatory subunit PriL. 343
8978 179393 PRK02250 PRK02250 hypothetical protein; Provisional 166
8979 179394 PRK02251 PRK02251 cell division protein CrgA. 87
8980 179395 PRK02253 PRK02253 deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional 167
8981 235017 PRK02255 PRK02255 putrescine carbamoyltransferase; Provisional 338
8982 235018 PRK02256 PRK02256 putative aminopeptidase 1; Provisional 462
8983 235019 PRK02259 PRK02259 aspartoacylase; Provisional 288
8984 179399 PRK02260 PRK02260 S-ribosylhomocysteine lyase. 158
8985 179400 PRK02261 PRK02261 methylaspartate mutase subunit S; Provisional 137
8986 235020 PRK02264 PRK02264 N(5),N(10)-methenyltetrahydromethanopterin cyclohydrolase; Provisional 317
8987 179402 PRK02265 PRK02265 acetoacetate decarboxylase; Provisional 246
8988 235021 PRK02268 PRK02268 hypothetical protein; Provisional 141
8989 167353 PRK02269 PRK02269 ribose-phosphate diphosphokinase. 320
8990 235022 PRK02271 PRK02271 methylenetetrahydromethanopterin reductase; Provisional 325
8991 235023 PRK02277 PRK02277 orotate phosphoribosyltransferase-like protein; Provisional 200
8992 235024 PRK02287 PRK02287 DUF367 family protein. 171
8993 179406 PRK02289 PRK02289 4-oxalocrotonate tautomerase; Provisional 60
8994 235025 PRK02290 PRK02290 3-dehydroquinate synthase II family protein. 344
8995 235026 PRK02292 PRK02292 V-type ATP synthase subunit E; Provisional 188
8996 235027 PRK02301 PRK02301 deoxyhypusine synthase. 316
8997 179410 PRK02302 PRK02302 hypothetical protein; Provisional 89
8998 235028 PRK02304 PRK02304 adenine phosphoribosyltransferase; Provisional 175
8999 235029 PRK02308 uvsE putative UV damage endonuclease; Provisional 303
9000 235030 PRK02315 PRK02315 adaptor protein MecA. 233
9001 235031 PRK02318 PRK02318 mannitol-1-phosphate 5-dehydrogenase; Provisional 381
9002 235032 PRK02362 PRK02362 ATP-dependent DNA helicase. 737
9003 235033 PRK02363 PRK02363 DNA-directed RNA polymerase subunit delta; Reviewed 129
9004 179417 PRK02382 PRK02382 dihydroorotase; Provisional 443
9005 179418 PRK02391 PRK02391 heat shock protein HtpX; Provisional 296
9006 235034 PRK02395 PRK02395 hypothetical protein; Provisional 279
9007 179420 PRK02399 PRK02399 hypothetical protein; Provisional 406
9008 235035 PRK02406 PRK02406 DNA polymerase IV; Validated 343
9009 235036 PRK02412 aroD type I 3-dehydroquinate dehydratase. 253
9010 235037 PRK02427 PRK02427 3-phosphoshikimate 1-carboxyvinyltransferase; Provisional 435
9011 235038 PRK02436 xerD site-specific tyrosine recombinase XerD. 245
9012 235039 PRK02458 PRK02458 ribose-phosphate pyrophosphokinase; Provisional 323
9013 235040 PRK02463 PRK02463 OxaA-like protein precursor; Provisional 307
9014 179427 PRK02471 PRK02471 bifunctional glutamate--cysteine ligase GshA/glutathione synthetase GshB. 752
9015 235041 PRK02472 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 447
9016 167380 PRK02478 PRK02478 Maf-like protein; Reviewed 199
9017 235042 PRK02484 truB tRNA pseudouridine synthase B; Provisional 294
9018 179430 PRK02487 PRK02487 heme-degrading domain-containing protein. 163
9019 179431 PRK02491 PRK02491 putative deoxyribonucleotide triphosphate pyrophosphatase/unknown domain fusion protein; Reviewed 328
9020 235043 PRK02492 PRK02492 deoxyhypusine synthase. 347
9021 179433 PRK02496 adk adenylate kinase; Provisional 184
9022 235044 PRK02504 PRK02504 NAD(P)H-quinone oxidoreductase subunit N. 513
9023 235045 PRK02506 PRK02506 dihydroorotate dehydrogenase 1A; Reviewed 310
9024 235046 PRK02507 PRK02507 proton extrusion protein PcxA; Provisional 422
9025 235047 PRK02509 PRK02509 hypothetical protein; Provisional 973
9026 235048 PRK02515 psbU photosystem II complex extrinsic protein PsbU. 132
9027 134722 PRK02529 petN cytochrome b6-f complex subunit PetN; Provisional 33
9028 235049 PRK02534 PRK02534 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 312
9029 179440 PRK02539 PRK02539 DUF896 family protein. 85
9030 179441 PRK02542 PRK02542 photosystem I assembly protein Ycf4; Provisional 188
9031 235050 PRK02546 PRK02546 NAD(P)H-quinone oxidoreductase subunit 4; Provisional 525
9032 179443 PRK02551 PRK02551 flavoprotein NrdI; Provisional 154
9033 167396 PRK02553 psbK photosystem II reaction center protein K; Provisional 45
9034 179444 PRK02557 psbE cytochrome b559 subunit alpha; Provisional 81
9035 235051 PRK02561 psbF cytochrome b559 subunit beta; Provisional 44
9036 179446 PRK02565 PRK02565 photosystem II reaction center protein J; Provisional 39
9037 167400 PRK02576 psbZ photosystem II reaction center protein PsbZ. 62
9038 235052 PRK02597 rpoC2 DNA-directed RNA polymerase subunit beta'; Provisional 1331
9039 179448 PRK02603 PRK02603 photosystem I assembly protein Ycf3; Provisional 172
9040 235053 PRK02610 PRK02610 histidinol-phosphate transaminase. 374
9041 235054 PRK02615 PRK02615 thiamine phosphate synthase. 347
9042 179451 PRK02624 psbH photosystem II reaction center protein PsbH. 64
9043 235055 PRK02625 rpoC1 DNA-directed RNA polymerase subunit gamma; Provisional 627
9044 235056 PRK02627 PRK02627 acetylornithine aminotransferase; Provisional 396
9045 235057 PRK02628 nadE NAD synthetase; Reviewed 679
9046 179455 PRK02645 ppnK NAD(+) kinase. 305
9047 179456 PRK02649 ppnK NAD(+) kinase. 305
9048 179457 PRK02651 PRK02651 photosystem I iron-sulfur center protein PsaC. 81
9049 235058 PRK02654 PRK02654 putative inner membrane protein translocase component YidC; Provisional 375
9050 179459 PRK02655 psbI photosystem II reaction center protein I. 38
9051 179460 PRK02693 PRK02693 apocytochrome f; Reviewed 312
9052 235059 PRK02705 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 459
9053 235060 PRK02710 PRK02710 plastocyanin; Provisional 119
9054 235061 PRK02714 PRK02714 o-succinylbenzoate synthase. 320
9055 235062 PRK02724 PRK02724 30S ribosomal protein PSRP-3. 104
9056 235063 PRK02726 PRK02726 molybdenum cofactor guanylyltransferase. 200
9057 235064 PRK02731 PRK02731 histidinol-phosphate aminotransferase; Validated 367
9058 235065 PRK02733 PRK02733 photosystem I reaction center subunit IX; Provisional 42
9059 235066 PRK02746 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 345
9060 179468 PRK02749 PRK02749 photosystem I reaction center subunit IV; Provisional 71
9061 179469 PRK02755 truB tRNA pseudouridine synthase B; Provisional 295
9062 235067 PRK02759 PRK02759 bifunctional phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP diphosphatase HisIE. 203
9063 235068 PRK02769 PRK02769 histidine decarboxylase; Provisional 380
9064 235069 PRK02770 PRK02770 adenosylmethionine decarboxylase. 139
9065 179472 PRK02793 PRK02793 phi X174 lysis protein; Provisional 72
9066 179473 PRK02794 PRK02794 DNA polymerase IV; Provisional 419
9067 235070 PRK02797 PRK02797 TDP-N-acetylfucosamine:lipid II N-acetylfucosaminyltransferase. 322
9068 235071 PRK02801 PRK02801 primosomal replication protein N; Provisional 101
9069 235072 PRK02812 PRK02812 ribose-phosphate pyrophosphokinase; Provisional 330
9070 235073 PRK02813 PRK02813 putative aminopeptidase 2; Provisional 428
9071 179478 PRK02816 PRK02816 phycocyanobilin:ferredoxin oxidoreductase; Validated 243
9072 179479 PRK02821 PRK02821 RNA-binding protein. 77
9073 235074 PRK02830 PRK02830 Na(+)-translocating NADH-quinone reductase subunit E; Provisional 202
9074 235075 PRK02833 PRK02833 phosphate-starvation-inducible protein PsiE; Provisional 133
9075 235076 PRK02842 PRK02842 ferredoxin:protochlorophyllide reductase (ATP-dependent) subunit N. 427
9076 235077 PRK02853 PRK02853 hypothetical protein; Provisional 161
9077 179484 PRK02854 PRK02854 primosomal protein DnaT. 179
9078 235078 PRK02858 PRK02858 germination protease; Provisional 369
9079 179486 PRK02862 glgC glucose-1-phosphate adenylyltransferase; Provisional 429
9080 235079 PRK02866 PRK02866 cyanate hydratase; Validated 147
9081 235080 PRK02868 PRK02868 hypothetical protein; Provisional 245
9082 235081 PRK02870 PRK02870 heat shock protein HtpX; Provisional 336
9083 179490 PRK02877 PRK02877 hypothetical protein; Provisional 106
9084 179491 PRK02886 PRK02886 hypothetical protein; Provisional 87
9085 235082 PRK02888 PRK02888 nitrous-oxide reductase; Validated 635
9086 235083 PRK02889 tolB Tol-Pal system protein TolB. 427
9087 179494 PRK02898 PRK02898 energy-coupling factor ABC transporter substrate-binding protein. 100
9088 179495 PRK02899 PRK02899 genetic competence negative regulator. 197
9089 235084 PRK02901 PRK02901 O-succinylbenzoate synthase; Provisional 327
9090 179497 PRK02909 PRK02909 flagellar transcriptional regulator FlhD. 105
9091 235085 PRK02910 PRK02910 ferredoxin:protochlorophyllide reductase (ATP-dependent) subunit B. 519
9092 179499 PRK02913 PRK02913 hypothetical protein; Provisional 150
9093 235086 PRK02919 PRK02919 oxaloacetate decarboxylase subunit gamma; Provisional 82
9094 179501 PRK02922 PRK02922 cell surface composition regulator GlgS. 67
9095 235087 PRK02925 PRK02925 glucuronate isomerase; Reviewed 466
9096 179503 PRK02929 PRK02929 L-arabinose isomerase; Provisional 499
9097 179504 PRK02935 PRK02935 hypothetical protein; Provisional 110
9098 179505 PRK02936 argD acetylornithine transaminase. 377
9099 179506 PRK02939 PRK02939 YnfC family lipoprotein. 236
9100 235088 PRK02943 PRK02943 secA regulator SecM. 167
9101 179508 PRK02944 PRK02944 YidC family membrane integrase SpoIIIJ. 255
9102 235089 PRK02946 aceK bifunctional isocitrate dehydrogenase kinase/phosphatase protein; Validated 575
9103 179510 PRK02947 PRK02947 sugar isomerase domain-containing protein. 246
9104 179511 PRK02948 PRK02948 IscS subfamily cysteine desulfurase. 381
9105 235090 PRK02951 PRK02951 DNA replication terminus site-binding protein; Provisional 309
9106 179513 PRK02955 PRK02955 small acid-soluble spore protein SspI; Provisional 68
9107 235091 PRK02958 tatA Sec-independent protein translocase subunit TatA. 73
9108 235092 PRK02963 PRK02963 carbon starvation induced protein CsiD. 316
9109 235093 PRK02967 PRK02967 nickel-responsive transcriptional regulator NikR. 139
9110 235094 PRK02971 PRK02971 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol flippase subunit ArnF; Provisional 129
9111 179518 PRK02975 PRK02975 O-antigen assembly polymerase. 450
9112 235095 PRK02983 lysS bifunctional lysylphosphatidylglycerol synthetase/lysine--tRNA ligase LysX. 1094
9113 179520 PRK02984 sspO acid-soluble spore protein O; Provisional 49
9114 235096 PRK02991 PRK02991 D-serine dehydratase; Provisional 441
9115 179522 PRK02998 prsA peptidylprolyl isomerase; Reviewed 283
9116 235097 PRK02999 PRK02999 malate synthase G; Provisional 726
9117 179524 PRK03001 PRK03001 zinc metalloprotease HtpX. 283
9118 101162 PRK03002 prsA peptidylprolyl isomerase PrsA. 285
9119 179525 PRK03003 PRK03003 GTP-binding protein Der; Reviewed 472
9120 235098 PRK03007 PRK03007 deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional 428
9121 235099 PRK03011 PRK03011 butyrate kinase; Provisional 358
9122 235100 PRK03031 rnpA ribonuclease P protein component. 122
9123 179529 PRK03057 PRK03057 hypothetical protein; Provisional 180
9124 235101 PRK03059 PRK03059 PII uridylyl-transferase; Provisional 856
9125 179531 PRK03065 hutP anti-terminator HutP; Provisional 148
9126 235102 PRK03072 PRK03072 heat shock protein HtpX; Provisional 288
9127 235103 PRK03080 PRK03080 phosphoserine transaminase. 378
9128 179534 PRK03081 sspK small, acid-soluble spore protein K. 50
9129 179535 PRK03092 PRK03092 ribose-phosphate diphosphokinase. 304
9130 179536 PRK03094 PRK03094 hypothetical protein; Provisional 80
9131 179537 PRK03095 prsA peptidylprolyl isomerase PrsA. 287
9132 179538 PRK03100 PRK03100 Sec-independent protein translocase subunit TatB. 136
9133 235104 PRK03103 PRK03103 DNA polymerase IV; Reviewed 409
9134 179540 PRK03113 PRK03113 putative disulfide oxidoreductase; Provisional 139
9135 179541 PRK03114 PRK03114 DUF84 family protein. 169
9136 235105 PRK03124 PRK03124 S-adenosylmethionine decarboxylase proenzyme; Provisional 127
9137 179543 PRK03137 PRK03137 1-pyrroline-5-carboxylate dehydrogenase; Provisional 514
9138 179544 PRK03140 PRK03140 phosphatidylserine decarboxylase; Provisional 259
9139 179545 PRK03147 PRK03147 thiol-disulfide oxidoreductase ResA. 173
9140 235106 PRK03158 PRK03158 histidinol-phosphate aminotransferase; Provisional 359
9141 235107 PRK03170 PRK03170 dihydrodipicolinate synthase; Provisional 292
9142 179548 PRK03174 sspH small acid-soluble spore protein H. 59
9143 235108 PRK03180 ligB ATP-dependent DNA ligase; Reviewed 508
9144 235109 PRK03187 tgl transglutaminase; Provisional 272
9145 235110 PRK03188 PRK03188 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 300
9146 179552 PRK03195 PRK03195 DUF721 family protein. 186
9147 235111 PRK03202 PRK03202 ATP-dependent 6-phosphofructokinase. 320
9148 179554 PRK03204 PRK03204 haloalkane dehalogenase; Provisional 286
9149 235112 PRK03244 argD acetylornithine transaminase. 398
9150 235113 PRK03287 truB tRNA pseudouridine synthase B; Provisional 298
9151 235114 PRK03298 PRK03298 endonuclease NucS. 224
9152 235115 PRK03317 PRK03317 histidinol-phosphate aminotransferase; Provisional 368
9153 179559 PRK03321 PRK03321 putative aminotransferase; Provisional 352
9154 179560 PRK03333 coaE dephospho-CoA kinase/protein folding accessory domain-containing protein; Provisional 395
9155 235116 PRK03341 PRK03341 arginine repressor; Provisional 168
9156 235117 PRK03343 PRK03343 transaldolase; Validated 368
9157 235118 PRK03348 PRK03348 DNA polymerase IV; Provisional 454
9158 179564 PRK03352 PRK03352 DNA polymerase IV; Validated 346
9159 235119 PRK03353 ribB 3,4-dihydroxy-2-butanone 4-phosphate synthase; Provisional 217
9160 179566 PRK03354 PRK03354 crotonobetainyl-CoA dehydrogenase; Validated 380
9161 179567 PRK03355 PRK03355 glycerol-3-phosphate 1-O-acyltransferase. 783
9162 179568 PRK03356 PRK03356 L-carnitine/gamma-butyrobetaine antiport BCCT transporter. 504
9163 179569 PRK03359 PRK03359 putative electron transfer flavoprotein FixA; Reviewed 256
9164 235120 PRK03363 fixB electron transfer flavoprotein subunit alpha/FixB family protein. 313
9165 179571 PRK03369 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 488
9166 179572 PRK03371 pdxA D-threonate 4-phosphate dehydrogenase. 326
9167 235121 PRK03372 ppnK inorganic polyphosphate/ATP-NAD kinase; Provisional 306
9168 235122 PRK03378 ppnK inorganic polyphosphate/ATP-NAD kinase; Provisional 292
9169 179575 PRK03379 PRK03379 vitamin B12-transporter protein BtuF; Provisional 260
9170 235123 PRK03381 PRK03381 PII uridylyl-transferase; Provisional 774
9171 235124 PRK03427 PRK03427 cell division protein ZipA; Provisional 333
9172 235125 PRK03430 PRK03430 hypothetical protein; Validated 157
9173 179579 PRK03437 PRK03437 3-isopropylmalate dehydrogenase; Provisional 344
9174 179580 PRK03449 PRK03449 putative inner membrane protein translocase component YidC; Provisional 304
9175 235126 PRK03459 rnpA ribonuclease P; Reviewed 122
9176 235127 PRK03467 PRK03467 hypothetical protein; Provisional 144
9177 179583 PRK03482 PRK03482 phosphoglycerate mutase GpmB. 215
9178 179584 PRK03501 ppnK NAD kinase. 264
9179 179585 PRK03511 minC septum site-determining protein MinC. 228
9180 179586 PRK03512 PRK03512 thiamine phosphate synthase. 211
9181 179587 PRK03515 PRK03515 ornithine carbamoyltransferase subunit I; Provisional 336
9182 235128 PRK03522 rumB 23S rRNA (uracil(747)-C(5))-methyltransferase RlmC. 315
9183 179589 PRK03525 PRK03525 L-carnitine CoA-transferase. 405
9184 235129 PRK03537 PRK03537 molybdate ABC transporter substrate-binding protein. 188
9185 179591 PRK03545 PRK03545 putative arabinose transporter; Provisional 390
9186 179592 PRK03554 tatA Sec-independent protein translocase subunit TatA. 89
9187 235130 PRK03557 PRK03557 CDF family zinc transporter ZitB. 312
9188 235131 PRK03562 PRK03562 glutathione-regulated potassium-efflux system protein KefC; Provisional 621
9189 179595 PRK03564 PRK03564 formate dehydrogenase accessory protein FdhE; Provisional 309
9190 179596 PRK03573 PRK03573 transcriptional regulator SlyA; Provisional 144
9191 235132 PRK03577 PRK03577 acid shock protein. 102
9192 235133 PRK03578 hscB Fe-S protein assembly co-chaperone HscB. 176
9193 179599 PRK03580 PRK03580 crotonobetainyl-CoA hydratase. 261
9194 235134 PRK03584 PRK03584 acetoacetate--CoA ligase. 655
9195 235135 PRK03592 PRK03592 haloalkane dehalogenase; Provisional 295
9196 235136 PRK03598 PRK03598 putative efflux pump membrane fusion protein; Provisional 331
9197 179603 PRK03600 nrdI class Ib ribonucleoside-diphosphate reductase assembly flavoprotein NrdI. 134
9198 235137 PRK03601 PRK03601 HTH-type transcriptional regulator HdfR. 275
9199 235138 PRK03604 moaC bifunctional molybdenum cofactor biosynthesis protein MoaC/MogA; Provisional 312
9200 179606 PRK03606 PRK03606 ureidoglycolate lyase. 162
9201 179607 PRK03609 umuC translesion error-prone DNA polymerase V subunit UmuC. 422
9202 235139 PRK03612 PRK03612 polyamine aminopropyltransferase. 521
9203 235140 PRK03619 PRK03619 phosphoribosylformylglycinamidine synthase subunit PurQ. 219
9204 235141 PRK03620 PRK03620 5-dehydro-4-deoxyglucarate dehydratase; Provisional 303
9205 235142 PRK03624 PRK03624 putative acetyltransferase; Provisional 140
9206 179612 PRK03625 tatE twin-arginine translocase subunit TatE. 67
9207 235143 PRK03629 tolB Tol-Pal system protein TolB. 429
9208 179614 PRK03633 PRK03633 putative MFS family transporter protein; Provisional 381
9209 179615 PRK03634 PRK03634 rhamnulose-1-phosphate aldolase; Provisional 274
9210 235144 PRK03635 PRK03635 ArgP/LysG family DNA-binding transcriptional regulator. 294
9211 235145 PRK03636 PRK03636 hypothetical protein; Provisional 179
9212 235146 PRK03640 PRK03640 o-succinylbenzoate--CoA ligase. 483
9213 179619 PRK03641 PRK03641 DUF2057 family protein. 220
9214 179620 PRK03642 PRK03642 putative periplasmic esterase; Provisional 432
9215 235147 PRK03643 PRK03643 tagaturonate reductase. 471
9216 179622 PRK03646 dadX catabolic alanine racemase. 355
9217 235148 PRK03655 PRK03655 putative ion channel protein; Provisional 414
9218 235149 PRK03657 PRK03657 2-oxo-tetronate isomerase. 170
9219 179625 PRK03659 PRK03659 glutathione-regulated potassium-efflux system protein KefB; Provisional 601
9220 179626 PRK03660 PRK03660 anti-sigma F factor; Provisional 146
9221 179627 PRK03661 PRK03661 nicotinamide-nucleotide amidase. 164
9222 179628 PRK03669 PRK03669 mannosyl-3-phosphoglycerate phosphatase-related protein. 271
9223 167581 PRK03670 PRK03670 competence damage-inducible protein A; Provisional 252
9224 179629 PRK03673 PRK03673 nicotinamide mononucleotide deamidase-related protein YfaY. 396
9225 179630 PRK03681 hypA hydrogenase/urease nickel incorporation protein HypA. 114
9226 179631 PRK03692 PRK03692 putative UDP-N-acetyl-D-mannosaminuronic acid transferase; Provisional 243
9227 235150 PRK03695 PRK03695 vitamin B12-transporter ATPase; Provisional 248
9228 235151 PRK03699 PRK03699 putative transporter; Provisional 394
9229 235152 PRK03705 PRK03705 glycogen debranching protein GlgX. 658
9230 179635 PRK03708 ppnK NAD(+) kinase. 277
9231 179636 PRK03715 argD acetylornithine transaminase protein; Provisional 395
9232 167589 PRK03717 PRK03717 ribonuclease P protein component 2; Provisional 120
9233 179637 PRK03719 PRK03719 ecotin; Provisional 166
9234 235153 PRK03731 aroL shikimate kinase AroL. 171
9235 167593 PRK03732 PRK03732 hypothetical protein; Provisional 114
9236 179639 PRK03735 PRK03735 cytochrome b6; Provisional 223
9237 235154 PRK03739 PRK03739 2-isopropylmalate synthase; Validated 552
9238 179641 PRK03743 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 332
9239 235155 PRK03745 PRK03745 signal recognition particle protein Srp19; Provisional 100
9240 179642 PRK03757 PRK03757 YceI family protein. 191
9241 235156 PRK03759 PRK03759 isopentenyl-diphosphate Delta-isomerase. 184
9242 235157 PRK03760 PRK03760 hypothetical protein; Provisional 117
9243 235158 PRK03761 PRK03761 LPS assembly outer membrane complex protein LptD; Provisional 778
9244 179646 PRK03762 PRK03762 YbaB/EbfC family nucleoid-associated protein. 103
9245 179647 PRK03767 PRK03767 NAD(P)H:quinone oxidoreductase; Provisional 200
9246 179648 PRK03776 PRK03776 phosphatidylglycerol--membrane-oligosaccharide glycerophosphotransferase. 762
9247 235159 PRK03784 PRK03784 vitamin B12-transporter permease; Provisional 331
9248 235160 PRK03803 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 448
9249 179651 PRK03806 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 438
9250 235161 PRK03814 PRK03814 oxaloacetate decarboxylase subunit gamma; Provisional 85
9251 235162 PRK03815 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 401
9252 235163 PRK03817 PRK03817 galactokinase; Provisional 351
9253 179654 PRK03818 PRK03818 putative transporter; Validated 552
9254 179655 PRK03822 lplA lipoate-protein ligase A; Provisional 338
9255 235164 PRK03824 hypA hydrogenase nickel incorporation protein HypA. 135
9256 235165 PRK03826 PRK03826 5'-nucleotidase; Provisional 195
9257 179658 PRK03830 PRK03830 small acid-soluble spore protein Tlp; Provisional 73
9258 235166 PRK03837 PRK03837 transcriptional regulator NanR; Provisional 241
9259 179660 PRK03839 PRK03839 putative kinase; Provisional 180
9260 179661 PRK03846 PRK03846 adenylylsulfate kinase; Provisional 198
9261 235167 PRK03854 opgC glucans biosynthesis protein MdoC. 375
9262 179663 PRK03858 PRK03858 DNA polymerase IV; Validated 396
9263 235168 PRK03868 PRK03868 glucose-6-phosphate isomerase; Provisional 410
9264 235169 PRK03879 PRK03879 ribonuclease P protein component 1; Validated 96
9265 235170 PRK03881 PRK03881 hypothetical protein; Provisional 467
9266 167628 PRK03887 PRK03887 methylated-DNA--protein-cysteine methyltransferase; Provisional 175
9267 179667 PRK03892 PRK03892 Ribonuclease P protein component 3. 216
9268 179668 PRK03893 PRK03893 putative sialic acid transporter; Provisional 496
9269 179669 PRK03902 PRK03902 transcriptional regulator MntR. 142
9270 235171 PRK03903 PRK03903 transaldolase; Provisional 274
9271 235172 PRK03906 PRK03906 mannonate dehydratase; Provisional 385
9272 235173 PRK03907 fliE flagellar hook-basal body complex protein FliE. 97
9273 179673 PRK03910 PRK03910 D-cysteine desulfhydrase; Validated 331
9274 235174 PRK03911 PRK03911 HrcA family transcriptional regulator. 260
9275 235175 PRK03918 PRK03918 DNA double-strand break repair ATPase Rad50. 880
9276 179676 PRK03922 PRK03922 hypothetical protein; Provisional 113
9277 179677 PRK03926 PRK03926 mevalonate kinase; Provisional 302
9278 235176 PRK03932 asnC asparaginyl-tRNA synthetase; Validated 450
9279 235177 PRK03934 PRK03934 phosphatidylserine decarboxylase; Provisional 265
9280 235178 PRK03941 PRK03941 NTPase; Reviewed 174
9281 179681 PRK03946 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase; Provisional 307
9282 235179 PRK03947 PRK03947 prefoldin subunit alpha; Reviewed 140
9283 235180 PRK03954 PRK03954 ribonuclease P protein component 4; Validated 121
9284 179684 PRK03955 PRK03955 DUF126 domain-containing protein. 131
9285 179685 PRK03957 PRK03957 V-type ATP synthase subunit F; Provisional 100
9286 235181 PRK03958 PRK03958 tRNA 2'-O-methylase; Reviewed 176
9287 167649 PRK03963 PRK03963 V-type ATP synthase subunit E; Provisional 198
9288 167650 PRK03967 PRK03967 histidinol-phosphate transaminase. 337
9289 235182 PRK03968 PRK03968 DNA primase large subunit PriL. 399
9290 179688 PRK03971 PRK03971 deoxyhypusine synthase. 334
9291 179689 PRK03972 PRK03972 ribosomal biogenesis protein; Validated 208
9292 179690 PRK03975 tfx putative transcriptional regulator; Provisional 141
9293 235183 PRK03976 rpl37ae 50S ribosomal protein L37Ae; Reviewed 90
9294 235184 PRK03979 PRK03979 ADP-specific phosphofructokinase; Provisional 463
9295 235185 PRK03980 PRK03980 flap endonuclease-1; Provisional 292
9296 235186 PRK03982 PRK03982 heat shock protein HtpX; Provisional 288
9297 235187 PRK03983 PRK03983 exosome complex exonuclease Rrp41; Provisional 244
9298 235188 PRK03987 PRK03987 translation initiation factor IF-2 subunit alpha; Validated 262
9299 235189 PRK03988 PRK03988 translation initiation factor IF-2 subunit beta; Validated 138
9300 235190 PRK03991 PRK03991 threonyl-tRNA synthetase; Validated 613
9301 179699 PRK03992 PRK03992 proteasome-activating nucleotidase; Provisional 389
9302 235191 PRK03995 PRK03995 D-aminoacyl-tRNA deacylase. 267
9303 235192 PRK03996 PRK03996 archaeal proteasome endopeptidase complex subunit alpha. 241
9304 235193 PRK03999 PRK03999 translation initiation factor IF-5A; Provisional 129
9305 235194 PRK04000 PRK04000 translation initiation factor IF-2 subunit gamma; Validated 411
9306 235195 PRK04004 PRK04004 translation initiation factor IF-2; Validated 586
9307 235196 PRK04005 PRK04005 50S ribosomal protein L18e; Provisional 111
9308 235197 PRK04007 rps28e 30S ribosomal protein S28e; Validated 70
9309 235198 PRK04011 PRK04011 peptide chain release factor 1; Provisional 411
9310 179708 PRK04012 PRK04012 translation initiation factor IF-1A; Provisional 100
9311 101376 PRK04013 argD acetylornithine/acetyl-lysine aminotransferase; Provisional 364
9312 235199 PRK04015 PRK04015 DNA/RNA-binding protein AlbA. 91
9313 235200 PRK04016 PRK04016 DNA-directed RNA polymerase subunit N; Provisional 62
9314 179711 PRK04017 PRK04017 hypothetical protein; Provisional 132
9315 179712 PRK04019 rplP0 acidic ribosomal protein P0; Validated 330
9316 235201 PRK04020 rps2P 30S ribosomal protein S2; Provisional 204
9317 167678 PRK04021 PRK04021 hypothetical protein; Reviewed 92
9318 235202 PRK04023 PRK04023 DNA polymerase II large subunit; Validated 1121
9319 235203 PRK04024 PRK04024 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 412
9320 179716 PRK04025 PRK04025 adenosylmethionine decarboxylase. 139
9321 235204 PRK04027 PRK04027 30S ribosomal protein S7P; Reviewed 195
9322 235205 PRK04028 PRK04028 Glu-tRNA(Gln) amidotransferase subunit GatE. 630
9323 235206 PRK04031 PRK04031 DNA primase; Provisional 408
9324 235207 PRK04032 PRK04032 CDP-2,3-bis-(O-geranylgeranyl)-sn-glycerol synthase. 159
9325 179721 PRK04034 rps8p 30S ribosomal protein S8P; Reviewed 130
9326 235208 PRK04036 PRK04036 DNA-directed DNA polymerase II small subunit. 504
9327 235209 PRK04038 rps19p 30S ribosomal protein S19P; Provisional 134
9328 235210 PRK04040 PRK04040 adenylate kinase; Provisional 188
9329 235211 PRK04042 rpl4lp 50S ribosomal protein L4P; Provisional 254
9330 235212 PRK04043 tolB Tol-Pal system protein TolB. 419
9331 235213 PRK04044 rps5p 30S ribosomal protein S5P; Reviewed 211
9332 179728 PRK04046 PRK04046 translation initiation factor IF-6; Provisional 222
9333 235214 PRK04049 PRK04049 30S ribosomal protein S8e; Validated 127
9334 179730 PRK04051 rps4p 30S ribosomal protein S4P; Validated 177
9335 235215 PRK04053 rps13p 30S ribosomal protein S13P; Reviewed 149
9336 179732 PRK04056 PRK04056 septum formation inhibitor Maf. 180
9337 235216 PRK04057 PRK04057 30S ribosomal protein S3Ae; Validated 203
9338 179734 PRK04059 rpl34e 50S ribosomal protein L34e; Validated 88
9339 235217 PRK04069 PRK04069 serine-protein kinase RsbW; Provisional 161
9340 179736 PRK04073 rocD ornithine--oxo-acid transaminase; Provisional 396
9341 235218 PRK04081 PRK04081 hypothetical protein; Provisional 207
9342 235219 PRK04098 PRK04098 Sec-independent protein translocase subunit TatB. 158
9343 179739 PRK04099 truB tRNA pseudouridine synthase B; Provisional 273
9344 179740 PRK04101 PRK04101 metallothiol transferase FosB. 139
9345 235220 PRK04115 PRK04115 hypothetical protein; Provisional 137
9346 235221 PRK04123 PRK04123 ribulokinase; Provisional 548
9347 235222 PRK04125 PRK04125 antiholin-like protein LrgA. 141
9348 167709 PRK04128 PRK04128 1-(5-phosphoribosyl)-5-((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 228
9349 235223 PRK04132 PRK04132 replication factor C small subunit; Provisional 846
9350 179745 PRK04135 PRK04135 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 395
9351 179746 PRK04136 rpl40e 50S ribosomal protein L40e; Provisional 48
9352 235224 PRK04140 PRK04140 transcriptional regulator. 317
9353 235225 PRK04143 PRK04143 protein-ADP-ribose hydrolase. 264
9354 179749 PRK04147 PRK04147 N-acetylneuraminate lyase; Provisional 293
9355 235226 PRK04148 PRK04148 hypothetical protein; Provisional 134
9356 235227 PRK04149 sat sulfate adenylyltransferase; Reviewed 391
9357 179752 PRK04151 PRK04151 IMP cyclohydrolase; Provisional 197
9358 235228 PRK04155 PRK04155 protein deglycase HchA. 287
9359 235229 PRK04156 gltX glutamyl-tRNA synthetase; Provisional 567
9360 235230 PRK04158 PRK04158 GTP-sensing pleiotropic transcriptional regulator CodY. 256
9361 235231 PRK04160 PRK04160 diphthine synthase; Provisional 258
9362 235232 PRK04161 PRK04161 tagatose 1,6-diphosphate aldolase; Reviewed 329
9363 235233 PRK04163 PRK04163 exosome complex protein Rrp4. 235
9364 235234 PRK04164 PRK04164 hypothetical protein; Provisional 181
9365 235235 PRK04165 PRK04165 acetyl-CoA decarbonylase/synthase complex subunit gamma; Provisional 450
9366 235236 PRK04168 PRK04168 tungstate ABC transporter substrate-binding protein WtpA. 334
9367 235237 PRK04169 PRK04169 heptaprenylglyceryl phosphate synthase. 232
9368 235238 PRK04171 PRK04171 16S rRNA (pseudouridine(914)-N(1))-methyltransferase Nep1. 222
9369 235239 PRK04172 pheS phenylalanine--tRNA ligase subunit alpha. 489
9370 235240 PRK04173 PRK04173 glycyl-tRNA synthetase; Provisional 456
9371 179766 PRK04175 rpl7ae 50S ribosomal protein L7Ae; Validated 122
9372 235241 PRK04176 PRK04176 ribulose-1,5-biphosphate synthetase; Provisional 257
9373 235242 PRK04179 rpl37e 50S ribosomal protein L37e; Reviewed 62
9374 179769 PRK04180 PRK04180 pyridoxal 5'-phosphate synthase lyase subunit PdxS. 293
9375 235243 PRK04181 PRK04181 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 257
9376 235244 PRK04182 PRK04182 cytidylate kinase; Provisional 180
9377 235245 PRK04183 PRK04183 Glu-tRNA(Gln) amidotransferase subunit GatD. 419
9378 235246 PRK04184 PRK04184 DNA topoisomerase VI subunit B; Validated 535
9379 179774 PRK04190 PRK04190 glucose-6-phosphate isomerase; Provisional 191
9380 235247 PRK04191 rps3p 30S ribosomal protein S3P; Reviewed 207
9381 235248 PRK04192 PRK04192 V-type ATP synthase subunit A; Provisional 586
9382 235249 PRK04194 PRK04194 nickel pincer cofactor biosynthesis protein LarC. 392
9383 235250 PRK04195 PRK04195 replication factor C large subunit; Provisional 482
9384 235251 PRK04196 PRK04196 V-type ATP synthase subunit B; Provisional 460
9385 235252 PRK04199 rpl10e 50S ribosomal protein L16. 172
9386 179781 PRK04200 PRK04200 cofactor-independent phosphoglycerate mutase; Provisional 395
9387 235253 PRK04201 PRK04201 zinc transporter ZupT; Provisional 265
9388 235254 PRK04203 rpl1P 50S ribosomal protein L1P; Reviewed 215
9389 235255 PRK04204 PRK04204 RNA 3'-terminal phosphate cyclase. 343
9390 179785 PRK04205 PRK04205 hypothetical protein; Provisional 229
9391 179786 PRK04207 PRK04207 type II glyceraldehyde-3-phosphate dehydrogenase. 341
9392 179787 PRK04208 rbcL ribulose bisphosphate carboxylase; Reviewed 468
9393 235256 PRK04210 PRK04210 phosphoenolpyruvate carboxykinase (GTP). 601
9394 235257 PRK04211 rps12P 30S ribosomal protein S12P; Reviewed 145
9395 179790 PRK04213 PRK04213 GTP-binding protein EngB. 201
9396 179791 PRK04214 rbn ribonuclease BN/unknown domain fusion protein; Reviewed 412
9397 235258 PRK04217 PRK04217 hypothetical protein; Provisional 110
9398 235259 PRK04219 rpl5p 50S ribosomal protein L5P; Reviewed 177
9399 179793 PRK04220 PRK04220 2-phosphoglycerate kinase; Provisional 301
9400 179794 PRK04223 rpl22p 50S ribosomal protein L22P; Reviewed 153
9401 235260 PRK04231 rpl3p 50S ribosomal protein L3P; Reviewed 337
9402 235261 PRK04233 PRK04233 hypothetical protein; Provisional 129
9403 235262 PRK04235 PRK04235 hypothetical protein; Provisional 196
9404 179798 PRK04239 PRK04239 DNA-binding protein. 110
9405 235263 PRK04243 PRK04243 50S ribosomal protein L15e; Validated 196
9406 235264 PRK04247 PRK04247 endonuclease NucS. 238
9407 235265 PRK04250 PRK04250 dihydroorotase; Provisional 398
9408 179802 PRK04257 PRK04257 hypothetical protein; Provisional 78
9409 179803 PRK04260 PRK04260 acetylornithine transaminase. 375
9410 235266 PRK04262 PRK04262 hypothetical protein; Provisional 347
9411 235267 PRK04266 PRK04266 fibrillarin-like rRNA/tRNA 2'-O-methyltransferase. 226
9412 179806 PRK04270 PRK04270 RNA-guided pseudouridylation complex pseudouridine synthase subunit Cbf5. 300
9413 179807 PRK04280 PRK04280 transcriptional regulator ArgR. 148
9414 235268 PRK04282 PRK04282 exosome complex protein Rrp42. 271
9415 235269 PRK04284 PRK04284 ornithine carbamoyltransferase; Provisional 332
9416 235270 PRK04286 PRK04286 hypothetical protein; Provisional 298
9417 179810 PRK04288 PRK04288 antiholin-like protein LrgB; Provisional 232
9418 235271 PRK04290 PRK04290 30S ribosomal protein S6e; Validated 115
9419 179812 PRK04293 PRK04293 adenylosuccinate synthetase; Provisional 333
9420 235272 PRK04296 PRK04296 thymidine kinase; Provisional 190
9421 235273 PRK04301 radA DNA repair and recombination protein RadA; Validated 317
9422 235274 PRK04302 PRK04302 triosephosphate isomerase; Provisional 223
9423 235275 PRK04306 PRK04306 50S ribosomal protein L21e; Reviewed 98
9424 235276 PRK04307 PRK04307 protein-disulfide oxidoreductase DsbI. 218
9425 167786 PRK04308 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 445
9426 235277 PRK04309 PRK04309 DNA-directed RNA polymerase subunit A''; Validated 383
9427 235278 PRK04311 PRK04311 selenocysteine synthase; Provisional 464
9428 179820 PRK04313 PRK04313 30S ribosomal protein S4e; Validated 237
9429 235279 PRK04319 PRK04319 acetyl-CoA synthetase; Provisional 570
9430 235280 PRK04322 PRK04322 peptidyl-tRNA hydrolase; Provisional 113
9431 179823 PRK04323 PRK04323 hypothetical protein; Provisional 91
9432 179824 PRK04325 PRK04325 hypothetical protein; Provisional 74
9433 179825 PRK04326 PRK04326 methionine synthase; Provisional 330
9434 235281 PRK04328 PRK04328 hypothetical protein; Provisional 249
9435 235282 PRK04330 PRK04330 hypothetical protein; Provisional 88
9436 235283 PRK04333 PRK04333 50S ribosomal protein L14e; Validated 84
9437 235284 PRK04334 PRK04334 hypothetical protein; Provisional 251
9438 235285 PRK04335 PRK04335 cell division protein ZipA; Provisional 313
9439 179831 PRK04337 PRK04337 50S ribosomal protein L35Ae; Validated 87
9440 235286 PRK04338 PRK04338 N(2),N(2)-dimethylguanosine tRNA methyltransferase; Provisional 382
9441 235287 PRK04342 PRK04342 DNA topoisomerase IV subunit A. 367
9442 235288 PRK04346 PRK04346 tryptophan synthase subunit beta; Validated 397
9443 235289 PRK04350 PRK04350 thymidine phosphorylase; Provisional 490
9444 235290 PRK04351 PRK04351 SprT family protein. 149
9445 235291 PRK04358 PRK04358 hypothetical protein; Provisional 217
9446 235292 PRK04366 PRK04366 aminomethyl-transferring glycine dehydrogenase subunit GcvPB. 481
9447 179839 PRK04374 PRK04374 [protein-PII] uridylyltransferase. 869
9448 235293 PRK04375 PRK04375 protoheme IX farnesyltransferase; Provisional 296
9449 179841 PRK04387 PRK04387 hypothetical protein; Provisional 90
9450 235294 PRK04388 PRK04388 disulfide bond formation protein B; Provisional 172
9451 179843 PRK04390 rnpA ribonuclease P protein component. 120
9452 235295 PRK04405 prsA peptidylprolyl isomerase; Provisional 298
9453 235296 PRK04406 PRK04406 hypothetical protein; Provisional 75
9454 235297 PRK04423 PRK04423 LPS-assembly protein LptD. 798
9455 179847 PRK04424 PRK04424 transcription factor FapR. 185
9456 167814 PRK04425 PRK04425 septum formation protein Maf. 196
9457 179848 PRK04435 PRK04435 ACT domain-containing protein. 147
9458 235298 PRK04439 PRK04439 methionine adenosyltransferase. 399
9459 235299 PRK04443 PRK04443 [LysW]-lysine hydrolase. 348
9460 235300 PRK04447 PRK04447 hypothetical protein; Provisional 351
9461 235301 PRK04452 PRK04452 acetyl-CoA decarbonylase/synthase complex subunit delta; Provisional 319
9462 235302 PRK04456 PRK04456 acetyl-CoA decarbonylase/synthase complex subunit beta; Reviewed 463
9463 179854 PRK04457 PRK04457 polyamine aminopropyltransferase. 262
9464 179855 PRK04460 PRK04460 nickel-responsive transcriptional regulator NikR. 137
9465 179856 PRK04516 minC septum site-determining protein MinC. 235
9466 235303 PRK04517 PRK04517 hypothetical protein; Provisional 216
9467 235304 PRK04523 PRK04523 N-acetylornithine carbamoyltransferase; Reviewed 335
9468 235305 PRK04527 PRK04527 argininosuccinate synthase; Provisional 400
9469 235306 PRK04531 PRK04531 acetylglutamate kinase; Provisional 398
9470 235307 PRK04537 PRK04537 ATP-dependent RNA helicase RhlB; Provisional 572
9471 179862 PRK04539 ppnK inorganic polyphosphate/ATP-NAD kinase; Provisional 296
9472 179863 PRK04542 PRK04542 elongation factor P; Provisional 189
9473 179864 PRK04561 tatA twin arginine translocase protein A; Provisional 75
9474 235308 PRK04570 PRK04570 cell division protein ZipA; Provisional 243
9475 235309 PRK04596 minC septum site-determining protein MinC. 248
9476 179867 PRK04598 tatA twin-arginine translocase subunit TatA. 81
9477 179868 PRK04612 argD acetylornithine transaminase. 408
9478 179869 PRK04635 PRK04635 histidinol-phosphate aminotransferase; Provisional 354
9479 179870 PRK04642 truB tRNA pseudouridine synthase B; Provisional 300
9480 135173 PRK04654 PRK04654 sec-independent translocase; Provisional 214
9481 179871 PRK04663 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 438
9482 179872 PRK04690 murD UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 468
9483 179873 PRK04694 PRK04694 Maf-like protein; Reviewed 190
9484 235310 PRK04750 ubiB putative ubiquinone biosynthesis protein UbiB; Reviewed 537
9485 179875 PRK04758 PRK04758 hypothetical protein; Validated 181
9486 179876 PRK04761 ppnK inorganic polyphosphate/ATP-NAD kinase; Reviewed 246
9487 179877 PRK04778 PRK04778 septation ring formation regulator EzrA; Provisional 569
9488 235311 PRK04781 PRK04781 histidinol-phosphate aminotransferase; Provisional 364
9489 235312 PRK04792 tolB Tol-Pal system protein TolB. 448
9490 179880 PRK04804 minC septum site-determining protein MinC. 221
9491 235313 PRK04813 PRK04813 D-alanine--poly(phosphoribitol) ligase subunit DltA. 503
9492 179882 PRK04820 rnpA ribonuclease P protein component. 145
9493 179883 PRK04833 PRK04833 argininosuccinate lyase; Provisional 455
9494 235314 PRK04837 PRK04837 ATP-dependent RNA helicase RhlB; Provisional 423
9495 235315 PRK04841 PRK04841 HTH-type transcriptional regulator MalT. 903
9496 179886 PRK04860 PRK04860 SprT family zinc-dependent metalloprotease. 160
9497 235316 PRK04863 mukB chromosome partition protein MukB. 1486
9498 179888 PRK04870 PRK04870 histidinol-phosphate transaminase. 356
9499 235317 PRK04885 ppnK inorganic polyphosphate/ATP-NAD kinase; Provisional 265
9500 235318 PRK04897 PRK04897 heat shock protein HtpX; Provisional 298
9501 235319 PRK04914 PRK04914 RNA polymerase-associated protein RapA. 956
9502 179892 PRK04922 tolB Tol-Pal system beta propeller repeat protein TolB. 433
9503 179893 PRK04923 PRK04923 ribose-phosphate diphosphokinase. 319
9504 235320 PRK04926 dgt deoxyguanosinetriphosphate triphosphohydrolase; Provisional 503
9505 179895 PRK04930 PRK04930 glutathione-regulated potassium-efflux system ancillary protein KefG; Provisional 184
9506 179896 PRK04940 PRK04940 hypothetical protein; Provisional 180
9507 235321 PRK04946 PRK04946 endonuclease SmrB. 181
9508 179898 PRK04949 PRK04949 putative sulfate transport protein CysZ; Validated 251
9509 235322 PRK04950 PRK04950 ProP expression regulator; Provisional 213
9510 235323 PRK04960 PRK04960 universal stress protein UspB; Provisional 111
9511 179901 PRK04964 PRK04964 hypothetical protein; Provisional 66
9512 179902 PRK04965 PRK04965 NADH:flavorubredoxin reductase NorW. 377
9513 179903 PRK04966 PRK04966 hypothetical protein; Provisional 72
9514 235324 PRK04968 PRK04968 SecY interacting protein Syd; Provisional 181
9515 179905 PRK04972 PRK04972 putative transporter; Provisional 558
9516 235325 PRK04974 PRK04974 glycerol-3-phosphate 1-O-acyltransferase PlsB. 818
9517 235326 PRK04976 torD chaperone protein TorD; Validated 202
9518 179908 PRK04980 PRK04980 hypothetical protein; Provisional 102
9519 179909 PRK04984 PRK04984 fatty acid metabolism transcriptional regulator FadR. 239
9520 235327 PRK04987 PRK04987 fumarate reductase subunit FrdC. 130
9521 235328 PRK04989 psbM photosystem II reaction center protein M; Provisional 35
9522 179912 PRK04998 PRK04998 hypothetical protein; Provisional 88
9523 235329 PRK05007 PRK05007 bifunctional uridylyltransferase/uridylyl-removing protein GlnD. 884
9524 179914 PRK05014 hscB co-chaperone HscB; Provisional 171
9525 235330 PRK05015 PRK05015 aminopeptidase B; Provisional 424
9526 235331 PRK05022 PRK05022 nitric oxide reductase transcriptional regulator NorR. 509
9527 235332 PRK05031 PRK05031 tRNA (uracil-5-)-methyltransferase; Validated 362
9528 235333 PRK05033 truB tRNA pseudouridine synthase B; Provisional 312
9529 235334 PRK05035 PRK05035 electron transport complex protein RnfC; Provisional 695
9530 179920 PRK05054 PRK05054 exoribonuclease II; Provisional 644
9531 235335 PRK05057 aroK shikimate kinase AroK. 172
9532 179922 PRK05066 PRK05066 transcriptional regulator ArgR. 156
9533 179923 PRK05070 PRK05070 DNA mismatch repair protein; Provisional 218
9534 235336 PRK05074 PRK05074 non-canonical purine NTP phosphatase. 173
9535 235337 PRK05077 frsA esterase FrsA. 414
9536 235338 PRK05082 PRK05082 N-acetylmannosamine kinase; Provisional 291
9537 235339 PRK05084 xerS site-specific tyrosine recombinase XerS; Reviewed 357
9538 235340 PRK05086 PRK05086 malate dehydrogenase; Provisional 312
9539 179929 PRK05087 PRK05087 D-alanine--poly(phosphoribitol) ligase subunit DltC. 78
9540 235341 PRK05089 PRK05089 cytochrome C oxidase assembly protein; Provisional 188
9541 179931 PRK05090 PRK05090 hypothetical protein; Validated 95
9542 235342 PRK05092 PRK05092 PII uridylyl-transferase; Provisional 931
9543 179933 PRK05093 argD acetylornithine/succinyldiaminopimelate transaminase. 403
9544 179934 PRK05094 PRK05094 dsDNA-mimic protein; Reviewed 107
9545 235343 PRK05096 PRK05096 guanosine 5'-monophosphate oxidoreductase; Provisional 346
9546 235344 PRK05097 PRK05097 macrodomain Ter protein MatP. 150
9547 179937 PRK05101 PRK05101 galactokinase; Provisional 382
9548 235345 PRK05105 PRK05105 O-succinylbenzoate synthase; Provisional 322
9549 235346 PRK05111 PRK05111 acetylornithine deacetylase; Provisional 383
9550 235347 PRK05113 PRK05113 electron transport complex protein RnfB; Provisional 191
9551 179941 PRK05114 PRK05114 YoaH family protein. 59
9552 235348 PRK05122 PRK05122 major facilitator superfamily transporter; Provisional 399
9553 235349 PRK05124 cysN sulfate adenylyltransferase subunit 1; Provisional 474
9554 235350 PRK05134 PRK05134 bifunctional 2-polyprenyl-6-hydroxyphenol methylase/3-demethylubiquinol 3-O-methyltransferase UbiG. 233
9555 235351 PRK05137 tolB Tol-Pal system protein TolB. 435
9556 235352 PRK05151 PRK05151 electron transport complex protein RsxA; Provisional 193
9557 235353 PRK05157 PRK05157 pyrroloquinoline quinone biosynthesis protein PqqC; Provisional 246
9558 235354 PRK05159 aspC aspartyl-tRNA synthetase; Provisional 437
9559 235355 PRK05163 rpsL 30S ribosomal protein S12; Validated 124
9560 179950 PRK05166 PRK05166 histidinol-phosphate transaminase. 371
9561 179951 PRK05168 PRK05168 ribonuclease T; Provisional 211
9562 235356 PRK05170 PRK05170 YcgN family cysteine cluster protein. 147
9563 179953 PRK05174 PRK05174 bifunctional 3-hydroxydecanoyl-ACP dehydratase/trans-2-decenoyl-ACP isomerase. 172
9564 235357 PRK05177 minC septum formation inhibitor MinC. 239
9565 235358 PRK05179 rpsM 30S ribosomal protein S13; Validated 122
9566 235359 PRK05182 PRK05182 DNA-directed RNA polymerase subunit alpha; Provisional 310
9567 235360 PRK05183 hscA chaperone protein HscA; Provisional 616
9568 235361 PRK05184 PRK05184 pyrroloquinoline quinone biosynthesis protein PqqB; Provisional 302
9569 179959 PRK05185 rplT 50S ribosomal protein L20; Provisional 114
9570 235362 PRK05192 PRK05192 tRNA uridine-5-carboxymethylaminomethyl(34) synthesis enzyme MnmG. 618
9571 235363 PRK05198 PRK05198 2-dehydro-3-deoxyphosphooctonate aldolase; Provisional 264
9572 235364 PRK05201 hslU ATP-dependent protease ATPase subunit HslU. 443
9573 235365 PRK05205 PRK05205 bifunctional pyr operon transcriptional regulator/uracil phosphoribosyltransferase PyrR. 176
9574 179964 PRK05208 PRK05208 hypothetical protein; Provisional 168
9575 235366 PRK05218 PRK05218 heat shock protein 90; Provisional 613
9576 235367 PRK05222 PRK05222 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase; Provisional 758
9577 235368 PRK05225 PRK05225 ketol-acid reductoisomerase; Validated 487
9578 235369 PRK05231 PRK05231 homoserine kinase; Provisional 319
9579 179969 PRK05234 mgsA methylglyoxal synthase; Validated 142
9580 235370 PRK05244 PRK05244 Der GTPase-activating protein YihI. 177
9581 235371 PRK05246 PRK05246 glutathione synthetase; Provisional 316
9582 235372 PRK05248 PRK05248 hypothetical protein; Provisional 121
9583 235373 PRK05249 PRK05249 Si-specific NAD(P)(+) transhydrogenase. 461
9584 235374 PRK05250 PRK05250 S-adenosylmethionine synthetase; Validated 384
9585 235375 PRK05253 PRK05253 sulfate adenylyltransferase subunit CysD. 301
9586 235376 PRK05254 PRK05254 uracil-DNA glycosylase; Provisional 224
9587 235377 PRK05255 PRK05255 ribosome-associated protein. 171
9588 235378 PRK05256 PRK05256 chromosome partition protein MukE. 238
9589 179979 PRK05257 PRK05257 malate:quinone oxidoreductase; Validated 494
9590 179980 PRK05260 PRK05260 chromosome partition protein MukF. 440
9591 235379 PRK05261 PRK05261 phosphoketolase. 785
9592 179982 PRK05264 PRK05264 met regulon transcriptional regulator MetJ. 105
9593 235380 PRK05265 PRK05265 pyridoxine 5'-phosphate synthase; Provisional 239
9594 235381 PRK05269 PRK05269 transaldolase B; Provisional 318
9595 235382 PRK05270 PRK05270 UDP-glucose--hexose-1-phosphate uridylyltransferase. 493
9596 235383 PRK05273 PRK05273 D-tyrosyl-tRNA(Tyr) deacylase; Provisional 147
9597 235384 PRK05274 PRK05274 2-keto-3-deoxygluconate permease; Provisional 326
9598 235385 PRK05277 PRK05277 H(+)/Cl(-) exchange transporter ClcA. 438
9599 235386 PRK05279 PRK05279 N-acetylglutamate synthase; Validated 441
9600 179990 PRK05282 PRK05282 dipeptidase PepE. 233
9601 235387 PRK05283 PRK05283 deoxyribose-phosphate aldolase; Provisional 257
9602 235388 PRK05286 PRK05286 quinone-dependent dihydroorotate dehydrogenase. 344
9603 235389 PRK05287 PRK05287 cell division protein ZapD. 250
9604 235390 PRK05289 PRK05289 acyl-ACP--UDP-N-acetylglucosamine O-acyltransferase. 262
9605 235391 PRK05290 PRK05290 hybrid cluster protein; Provisional 546
9606 235392 PRK05291 trmE tRNA uridine-5-carboxymethylaminomethyl(34) synthesis GTPase MnmE. 449
9607 179997 PRK05293 glgC glucose-1-phosphate adenylyltransferase; Provisional 380
9608 235393 PRK05294 carB carbamoyl-phosphate synthase large subunit. 1066
9609 235394 PRK05297 PRK05297 phosphoribosylformylglycinamidine synthase; Provisional 1290
9610 235395 PRK05298 PRK05298 excinuclease ABC subunit UvrB. 652
9611 235396 PRK05299 rpsB 30S ribosomal protein S2; Provisional 258
9612 235397 PRK05301 PRK05301 pyrroloquinoline quinone biosynthesis protein PqqE; Provisional 378
9613 235398 PRK05302 PRK05302 30S ribosomal protein S7; Validated 156
9614 235399 PRK05303 flgI flagellar basal body P-ring protein FlgI. 367
9615 235400 PRK05305 PRK05305 phosphatidylserine decarboxylase family protein. 206
9616 235401 PRK05306 infB translation initiation factor IF-2; Validated 746
9617 180007 PRK05309 PRK05309 30S ribosomal protein S11; Validated 128
9618 235402 PRK05312 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 336
9619 180009 PRK05313 PRK05313 hypothetical protein; Provisional 452
9620 235403 PRK05318 PRK05318 deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional 432
9621 235404 PRK05319 rplD 50S ribosomal protein L4; Provisional 205
9622 235405 PRK05320 PRK05320 rhodanese superfamily protein; Provisional 257
9623 235406 PRK05321 PRK05321 nicotinate phosphoribosyltransferase; Provisional 400
9624 235407 PRK05322 PRK05322 galactokinase; Provisional 387
9625 235408 PRK05324 PRK05324 succinylglutamate desuccinylase; Provisional 329
9626 235409 PRK05325 PRK05325 hypothetical protein; Provisional 401
9627 235410 PRK05326 PRK05326 potassium/proton antiporter. 562
9628 235411 PRK05327 rpsD 30S ribosomal protein S4; Validated 203
9629 235412 PRK05329 PRK05329 glycerol-3-phosphate dehydrogenase subunit GlpB. 422
9630 235413 PRK05330 PRK05330 oxygen-dependent coproporphyrinogen oxidase. 300
9631 235414 PRK05331 PRK05331 phosphate acyltransferase PlsX. 334
9632 235415 PRK05333 PRK05333 NAD-dependent protein deacetylase. 285
9633 235416 PRK05335 PRK05335 tRNA (uracil-5-)-methyltransferase Gid; Reviewed 436
9634 235417 PRK05337 PRK05337 beta-hexosaminidase; Provisional 337
9635 235418 PRK05338 rplS 50S ribosomal protein L19; Provisional 116
9636 235419 PRK05339 PRK05339 pyruvate, phosphate dikinase/phosphoenolpyruvate synthase regulator. 269
9637 235420 PRK05340 PRK05340 UDP-2,3-diacylglucosamine hydrolase; Provisional 241
9638 235421 PRK05341 PRK05341 homogentisate 1,2-dioxygenase; Provisional 438
9639 235422 PRK05342 clpX ATP-dependent Clp protease ATP-binding subunit ClpX. 412
9640 235423 PRK05346 PRK05346 Na(+)-translocating NADH-quinone reductase subunit C; Provisional 256
9641 235424 PRK05347 PRK05347 glutaminyl-tRNA synthetase; Provisional 554
9642 235425 PRK05349 PRK05349 Na(+)-translocating NADH-quinone reductase subunit B; Provisional 405
9643 180033 PRK05350 PRK05350 acyl carrier protein; Provisional 82
9644 235426 PRK05352 PRK05352 Na(+)-translocating NADH-quinone reductase subunit A; Provisional 448
9645 235427 PRK05354 PRK05354 biosynthetic arginine decarboxylase. 634
9646 235428 PRK05355 PRK05355 3-phosphoserine/phosphohydroxythreonine transaminase. 360
9647 235429 PRK05359 PRK05359 oligoribonuclease; Provisional 181
9648 235430 PRK05362 PRK05362 phosphopentomutase; Provisional 394
9649 235431 PRK05363 PRK05363 protein-methionine-sulfoxide reductase catalytic subunit MsrP. 280
9650 180040 PRK05365 PRK05365 malonic semialdehyde reductase; Provisional 195
9651 235432 PRK05367 PRK05367 aminomethyl-transferring glycine dehydrogenase. 954
9652 235433 PRK05368 PRK05368 homoserine O-succinyltransferase; Provisional 302
9653 235434 PRK05370 PRK05370 argininosuccinate synthase; Validated 447
9654 235435 PRK05371 PRK05371 x-prolyl-dipeptidyl aminopeptidase; Provisional 767
9655 180045 PRK05377 PRK05377 fructose-1,6-bisphosphate aldolase; Reviewed 296
9656 235436 PRK05379 PRK05379 bifunctional nicotinamide-nucleotide adenylyltransferase/Nudix hydroxylase. 340
9657 235437 PRK05380 pyrG CTP synthetase; Validated 533
9658 235438 PRK05382 PRK05382 chorismate synthase; Validated 359
9659 235439 PRK05385 PRK05385 phosphoribosylaminoimidazole synthetase; Provisional 327
9660 235440 PRK05387 PRK05387 histidinol-phosphate aminotransferase; Provisional 353
9661 235441 PRK05388 argJ bifunctional glutamate N-acetyltransferase/amino-acid acetyltransferase ArgJ. 395
9662 235442 PRK05389 truB tRNA pseudouridine synthase B; Provisional 305
9663 235443 PRK05395 PRK05395 type II 3-dehydroquinate dehydratase. 146
9664 180054 PRK05396 tdh L-threonine 3-dehydrogenase; Validated 341
9665 180055 PRK05398 PRK05398 formyl-coenzyme A transferase; Provisional 416
9666 235444 PRK05399 PRK05399 DNA mismatch repair protein MutS; Provisional 854
9667 235445 PRK05402 PRK05402 1,4-alpha-glucan branching protein GlgB. 726
9668 235446 PRK05406 PRK05406 5-oxoprolinase subunit PxpA. 246
9669 180059 PRK05408 PRK05408 oxidative damage protection protein; Provisional 90
9670 235447 PRK05409 PRK05409 hypothetical protein; Provisional 281
9671 180061 PRK05412 PRK05412 putative nucleotide-binding protein; Reviewed 161
9672 235448 PRK05414 PRK05414 urocanate hydratase; Provisional 556
9673 235449 PRK05415 PRK05415 hypothetical protein; Provisional 341
9674 235450 PRK05416 PRK05416 RNase adapter RapZ. 288
9675 235451 PRK05417 PRK05417 glutathione-dependent formaldehyde-activating enzyme; Provisional 191
9676 235452 PRK05419 PRK05419 protein-methionine-sulfoxide reductase heme-binding subunit MsrQ. 205
9677 235453 PRK05420 PRK05420 aquaporin Z; Provisional 231
9678 235454 PRK05421 PRK05421 endonuclease/exonuclease/phosphatase family protein. 263
9679 235455 PRK05422 smpB SsrA-binding protein SmpB. 148
9680 180070 PRK05423 PRK05423 DUF496 family protein. 104
9681 180071 PRK05424 rplA 50S ribosomal protein L1; Validated 230
9682 235456 PRK05425 PRK05425 asparagine synthetase AsnA; Provisional 327
9683 235457 PRK05426 PRK05426 peptidyl-tRNA hydrolase; Provisional 189
9684 235458 PRK05427 PRK05427 putative manganese-dependent inorganic pyrophosphatase; Provisional 308
9685 235459 PRK05428 PRK05428 HPr kinase/phosphorylase; Provisional 308
9686 235460 PRK05429 PRK05429 gamma-glutamyl kinase; Provisional 372
9687 235461 PRK05431 PRK05431 seryl-tRNA synthetase; Provisional 425
9688 235462 PRK05433 PRK05433 GTP-binding protein LepA; Provisional 600
9689 235463 PRK05434 PRK05434 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 507
9690 235464 PRK05435 rpmA 50S ribosomal protein L27; Validated 82
9691 235465 PRK05437 PRK05437 isopentenyl pyrophosphate isomerase; Provisional 352
9692 235466 PRK05439 PRK05439 pantothenate kinase; Provisional 311
9693 235467 PRK05441 murQ N-acetylmuramic acid-6-phosphate etherase; Reviewed 299
9694 235468 PRK05442 PRK05442 malate dehydrogenase; Provisional 326
9695 235469 PRK05443 PRK05443 polyphosphate kinase; Provisional 691
9696 235470 PRK05444 PRK05444 1-deoxy-D-xylulose-5-phosphate synthase; Provisional 580
9697 180087 PRK05445 PRK05445 YfbU family protein. 164
9698 235471 PRK05446 PRK05446 bifunctional histidinol-phosphatase/imidazoleglycerol-phosphate dehydratase HisB. 354
9699 235472 PRK05447 PRK05447 1-deoxy-D-xylulose 5-phosphate reductoisomerase; Provisional 385
9700 180090 PRK05449 PRK05449 aspartate alpha-decarboxylase; Provisional 126
9701 235473 PRK05450 PRK05450 3-deoxy-manno-octulosonate cytidylyltransferase; Provisional 245
9702 235474 PRK05451 PRK05451 dihydroorotase; Provisional 345
9703 235475 PRK05452 PRK05452 anaerobic nitric oxide reductase flavorubredoxin; Provisional 479
9704 235476 PRK05454 PRK05454 glucans biosynthesis glucosyltransferase MdoH. 605
9705 235477 PRK05456 PRK05456 ATP-dependent protease subunit HslV. 172
9706 235478 PRK05457 PRK05457 protease HtpX. 284
9707 235479 PRK05458 PRK05458 guanosine 5'-monophosphate oxidoreductase; Provisional 326
9708 180098 PRK05461 apaG Co2+/Mg2+ efflux protein ApaG; Reviewed 127
9709 235480 PRK05462 PRK05462 adenosylmethionine decarboxylase. 266
9710 180100 PRK05463 PRK05463 putative hydro-lyase. 262
9711 235481 PRK05464 PRK05464 Na(+)-translocating NADH-quinone reductase subunit F; Provisional 409
9712 235482 PRK05465 PRK05465 ethanolamine ammonia-lyase subunit EutC. 260
9713 235483 PRK05467 PRK05467 Fe(II)-dependent oxygenase superfamily protein; Provisional 226
9714 235484 PRK05469 PRK05469 tripeptide aminopeptidase PepT. 408
9715 180105 PRK05470 PRK05470 fumarate reductase subunit FrdD. 118
9716 235485 PRK05471 PRK05471 CDP-diacylglycerol pyrophosphatase; Provisional 252
9717 235486 PRK05472 PRK05472 redox-sensing transcriptional repressor Rex; Provisional 213
9718 180108 PRK05473 IreB-like IreB family regulatory phosphoprotein. IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135 86
9719 235487 PRK05474 PRK05474 xylose isomerase; Provisional 437
9720 235488 PRK05476 PRK05476 S-adenosyl-L-homocysteine hydrolase; Provisional 425
9721 235489 PRK05477 gatB Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatB. 474
9722 235490 PRK05478 PRK05478 3-isopropylmalate dehydratase large subunit. 466
9723 235491 PRK05479 PRK05479 ketol-acid reductoisomerase; Provisional 330
9724 235492 PRK05480 PRK05480 uridine/cytidine kinase; Provisional 209
9725 235493 PRK05481 PRK05481 lipoyl synthase; Provisional 289
9726 235494 PRK05482 PRK05482 potassium-transporting ATPase subunit A; Provisional 559
9727 180117 PRK05483 rplN 50S ribosomal protein L14; Validated 122
9728 235495 PRK05498 rplF 50S ribosomal protein L6; Validated 178
9729 180119 PRK05500 PRK05500 bifunctional orotidine-5'-phosphate decarboxylase/orotate phosphoribosyltransferase. 477
9730 180120 PRK05506 PRK05506 bifunctional sulfate adenylyltransferase subunit 1/adenylylsulfate kinase protein; Provisional 632
9731 180121 PRK05508 PRK05508 methionine-R-sulfoxide reductase. 119
9732 235496 PRK05518 rpl6p 50S ribosomal protein L6P; Reviewed 180
9733 235497 PRK05528 PRK05528 peptide-methionine (S)-S-oxide reductase. 156
9734 135428 PRK05529 PRK05529 cell division protein FtsQ; Provisional 255
9735 180124 PRK05537 PRK05537 bifunctional sulfate adenylyltransferase/adenylylsulfate kinase. 568
9736 235498 PRK05541 PRK05541 adenylylsulfate kinase; Provisional 176
9737 235499 PRK05550 PRK05550 bifunctional methionine sulfoxide reductase B/A protein; Provisional 283
9738 235500 PRK05557 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Validated 248
9739 235501 PRK05559 PRK05559 DNA topoisomerase IV subunit B; Reviewed 631
9740 235502 PRK05560 PRK05560 DNA gyrase subunit A; Validated 805
9741 235503 PRK05561 PRK05561 DNA topoisomerase 4 subunit A. 742
9742 235504 PRK05562 PRK05562 NAD(P)-dependent oxidoreductase. 223
9743 235505 PRK05563 PRK05563 DNA polymerase III subunits gamma and tau; Validated 559
9744 180132 PRK05564 PRK05564 DNA polymerase III subunit delta'; Validated 313
9745 235506 PRK05565 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 247
9746 235507 PRK05567 PRK05567 inosine 5'-monophosphate dehydrogenase; Reviewed 486
9747 235508 PRK05568 PRK05568 flavodoxin; Provisional 142
9748 135442 PRK05569 PRK05569 flavodoxin; Provisional 141
9749 235509 PRK05571 PRK05571 ribose-5-phosphate isomerase B; Provisional 148
9750 180137 PRK05572 PRK05572 RNA polymerase sporulation sigma factor, SigF/SigG family. 252
9751 235510 PRK05573 rplU 50S ribosomal protein L21; Validated 103
9752 235511 PRK05574 holA DNA polymerase III subunit delta; Reviewed 340
9753 180140 PRK05575 cbiC precorrin-8X methylmutase; Validated 204
9754 235512 PRK05576 PRK05576 cobalt-factor II C(20)-methyltransferase. 229
9755 180142 PRK05578 PRK05578 cytidine deaminase; Validated 131
9756 235513 PRK05579 PRK05579 bifunctional phosphopantothenoylcysteine decarboxylase/phosphopantothenate synthase; Validated 399
9757 235514 PRK05580 PRK05580 primosome assembly protein PriA; Validated 679
9758 235515 PRK05581 PRK05581 ribulose-phosphate 3-epimerase; Validated 220
9759 235516 PRK05582 PRK05582 type I DNA topoisomerase. 650
9760 235517 PRK05583 PRK05583 ribosomal protein L7Ae family protein; Provisional 104
9761 180148 PRK05584 PRK05584 5'-methylthioadenosine/adenosylhomocysteine nucleosidase. 230
9762 235518 PRK05585 yajC preprotein translocase subunit YajC; Validated 106
9763 180150 PRK05586 PRK05586 acetyl-CoA carboxylase biotin carboxylase subunit. 447
9764 235519 PRK05588 PRK05588 histidinol phosphate phosphatase. 255
9765 235520 PRK05589 PRK05589 peptide chain release factor 2; Provisional 325
9766 235521 PRK05590 PRK05590 hypothetical protein; Provisional 166
9767 235522 PRK05591 rplQ 50S ribosomal protein L17; Validated 113
9768 235523 PRK05592 rplO 50S ribosomal protein L15; Reviewed 146
9769 235524 PRK05593 rplR 50S ribosomal protein L18; Reviewed 117
9770 235525 PRK05595 PRK05595 replicative DNA helicase; Provisional 444
9771 235526 PRK05597 PRK05597 molybdopterin biosynthesis protein MoeB; Validated 355
9772 235527 PRK05599 PRK05599 SDR family oxidoreductase. 246
9773 235528 PRK05600 PRK05600 thiamine biosynthesis protein ThiF; Validated 370
9774 235529 PRK05601 PRK05601 DNA polymerase III subunit epsilon; Validated 377
9775 235530 PRK05602 PRK05602 RNA polymerase sigma factor; Reviewed 186
9776 235531 PRK05605 PRK05605 long-chain-fatty-acid--CoA ligase; Validated 573
9777 180161 PRK05609 nusG transcription antitermination protein NusG; Validated 181
9778 235532 PRK05610 rpsQ 30S ribosomal protein S17; Reviewed 84
9779 180163 PRK05611 rpmD 50S ribosomal protein L30; Reviewed 59
9780 168128 PRK05613 PRK05613 O-acetylhomoserine/O-acetylserine sulfhydrylase. 437
9781 180164 PRK05614 gltA citrate synthase. 419
9782 235533 PRK05617 PRK05617 3-hydroxyisobutyryl-CoA hydrolase; Provisional 342
9783 235534 PRK05618 PRK05618 50S ribosomal protein L25/general stress protein Ctc; Reviewed 197
9784 180167 PRK05620 PRK05620 long-chain fatty-acid--CoA ligase. 576
9785 235535 PRK05621 PRK05621 F0F1 ATP synthase subunit gamma; Validated 284
9786 180169 PRK05625 PRK05625 5-amino-6-(5-phosphoribosylamino)uracil reductase; Validated 217
9787 180170 PRK05626 rpsO 30S ribosomal protein S15; Reviewed 89
9788 235536 PRK05627 PRK05627 bifunctional riboflavin kinase/FAD synthetase. 305
9789 180172 PRK05628 PRK05628 coproporphyrinogen III oxidase; Validated 375
9790 180173 PRK05629 PRK05629 hypothetical protein; Validated 318
9791 180174 PRK05630 PRK05630 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 422
9792 235537 PRK05632 PRK05632 phosphate acetyltransferase; Reviewed 684
9793 235538 PRK05634 PRK05634 nucleosidase; Provisional 185
9794 180177 PRK05636 PRK05636 replicative DNA helicase; Provisional 505
9795 180178 PRK05637 PRK05637 anthranilate synthase component II; Provisional 208
9796 235539 PRK05638 PRK05638 threonine synthase; Validated 442
9797 168145 PRK05639 PRK05639 acetyl ornithine aminotransferase family protein. 457
9798 101884 PRK05640 PRK05640 putative monovalent cation/H+ antiporter subunit B; Reviewed 151
9799 235540 PRK05641 PRK05641 putative acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated 153
9800 168147 PRK05642 PRK05642 DnaA regulatory inactivator Hda. 234
9801 235541 PRK05643 PRK05643 DNA polymerase III subunit beta; Validated 367
9802 235542 PRK05644 gyrB DNA gyrase subunit B; Validated 638
9803 135493 PRK05645 PRK05645 lysophospholipid acyltransferase. 295
9804 235543 PRK05646 PRK05646 lipid A biosynthesis lauroyl acyltransferase; Provisional 310
9805 235544 PRK05647 purN phosphoribosylglycinamide formyltransferase; Reviewed 200
9806 235545 PRK05650 PRK05650 SDR family oxidoreductase. 270
9807 235546 PRK05653 fabG 3-oxoacyl-ACP reductase FabG. 246
9808 235547 PRK05654 PRK05654 acetyl-CoA carboxylase carboxyltransferase subunit beta. 292
9809 168156 PRK05656 PRK05656 acetyl-CoA C-acetyltransferase. 393
9810 235548 PRK05657 PRK05657 RNA polymerase sigma factor RpoS; Validated 325
9811 235549 PRK05658 PRK05658 RNA polymerase sigma factor RpoD; Validated 619
9812 168159 PRK05659 PRK05659 sulfur carrier protein ThiS; Validated 66
9813 235550 PRK05660 PRK05660 radical SAM family heme chaperone HemW. 378
9814 180188 PRK05664 PRK05664 threonine-phosphate decarboxylase; Reviewed 330
9815 168162 PRK05665 PRK05665 amidotransferase; Provisional 240
9816 235551 PRK05667 dnaG DNA primase; Validated 580
9817 235552 PRK05670 PRK05670 anthranilate synthase component II; Provisional 189
9818 168165 PRK05671 PRK05671 aspartate-semialdehyde dehydrogenase; Reviewed 336
9819 235553 PRK05672 dnaE2 error-prone DNA polymerase; Validated 1046
9820 235554 PRK05673 dnaE DNA polymerase III subunit alpha; Validated 1135
9821 168168 PRK05674 PRK05674 gamma-carboxygeranoyl-CoA hydratase; Validated 265
9822 180193 PRK05675 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 570
9823 168170 PRK05677 PRK05677 long-chain-fatty-acid--CoA ligase; Validated 562
9824 180194 PRK05678 PRK05678 succinyl-CoA synthetase subunit alpha; Validated 291
9825 235555 PRK05679 PRK05679 pyridoxal 5'-phosphate synthase. 195
9826 180196 PRK05680 flgB flagellar basal body rod protein FlgB; Reviewed 137
9827 235556 PRK05681 flgC flagellar basal body rod protein FlgC; Reviewed 135
9828 235557 PRK05682 flgE flagellar hook protein FlgE; Validated 407
9829 235558 PRK05683 flgK flagellar hook-associated protein FlgK; Validated 676
9830 235559 PRK05684 flgJ flagellar assembly peptidoglycan hydrolase FlgJ. 312
9831 235560 PRK05685 fliS flagellar export chaperone FliS. 132
9832 235561 PRK05686 fliG flagellar motor switch protein G; Validated 339
9833 235562 PRK05687 fliH flagellar assembly protein FliH. 246
9834 168181 PRK05688 fliI flagellar protein export ATPase FliI. 451
9835 235563 PRK05689 fliJ flagella biosynthesis chaperone FliJ. 147
9836 180204 PRK05690 PRK05690 molybdopterin biosynthesis protein MoeB; Provisional 245
9837 235564 PRK05691 PRK05691 peptide synthase; Validated 4334
9838 180206 PRK05692 PRK05692 hydroxymethylglutaryl-CoA lyase; Provisional 287
9839 168186 PRK05693 PRK05693 SDR family oxidoreductase. 274
9840 180207 PRK05696 fliL flagellar basal body-associated protein FliL; Reviewed 170
9841 235565 PRK05697 PRK05697 flagellar basal body-associated protein FliL-like protein; Validated 137
9842 168189 PRK05698 fliN flagellar motor switch protein FliN. 155
9843 235566 PRK05699 fliP flagellar biosynthesis protein FliP; Reviewed 245
9844 235567 PRK05700 fliQ flagellar type III secretion system protein FliQ. 89
9845 235568 PRK05701 fliR flagellar type III secretion system protein FliR. 242
9846 235569 PRK05702 flhB flagellar type III secretion system protein FlhB. 359
9847 235570 PRK05703 flhF flagellar biosynthesis protein FlhF. 424
9848 235571 PRK05704 PRK05704 2-oxoglutarate dehydrogenase complex dihydrolipoyllysine-residue succinyltransferase. 407
9849 180215 PRK05707 PRK05707 DNA polymerase III subunit delta'; Validated 328
9850 235572 PRK05708 PRK05708 putative 2-dehydropantoate 2-reductase. 305
9851 235573 PRK05710 PRK05710 tRNA glutamyl-Q(34) synthetase GluQRS. 299
9852 235574 PRK05711 PRK05711 DNA polymerase III subunit epsilon; Provisional 240
9853 235575 PRK05713 PRK05713 iron-sulfur-binding ferredoxin reductase. 312
9854 168201 PRK05714 PRK05714 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Provisional 405
9855 180218 PRK05715 PRK05715 NADH-quinone oxidoreductase subunit NuoK. 100
9856 235576 PRK05716 PRK05716 methionine aminopeptidase; Validated 252
9857 168204 PRK05717 PRK05717 SDR family oxidoreductase. 255
9858 235577 PRK05718 PRK05718 keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase; Provisional 212
9859 235578 PRK05720 mtnA methylthioribose-1-phosphate isomerase; Reviewed 344
9860 235579 PRK05722 PRK05722 glucose-6-phosphate 1-dehydrogenase; Validated 495
9861 168208 PRK05723 PRK05723 flavodoxin; Provisional 151
9862 235580 PRK05724 PRK05724 acetyl-CoA carboxylase carboxyltransferase subunit alpha; Validated 319
9863 235581 PRK05728 PRK05728 DNA polymerase III subunit chi; Validated 142
9864 235582 PRK05729 valS valyl-tRNA synthetase; Reviewed 874
9865 235583 PRK05731 PRK05731 thiamine monophosphate kinase; Provisional 318
9866 235584 PRK05732 PRK05732 2-octaprenyl-6-methoxyphenyl hydroxylase; Validated 395
9867 235585 PRK05733 PRK05733 single-stranded DNA-binding protein; Provisional 172
9868 235586 PRK05738 rplW 50S ribosomal protein L23; Reviewed 92
9869 235587 PRK05740 secE preprotein translocase subunit SecE; Reviewed 92
9870 180230 PRK05742 PRK05742 carboxylating nicotinate-nucleotide diphosphorylase. 277
9871 235588 PRK05743 ileS isoleucyl-tRNA synthetase; Reviewed 912
9872 180232 PRK05748 PRK05748 replicative DNA helicase; Provisional 448
9873 235589 PRK05749 PRK05749 3-deoxy-D-manno-octulosonic-acid transferase; Reviewed 425
9874 180234 PRK05751 PRK05751 preprotein translocase subunit SecB; Validated 156
9875 235590 PRK05752 PRK05752 uroporphyrinogen-III synthase; Validated 255
9876 180236 PRK05753 PRK05753 nucleoside diphosphate kinase regulator; Provisional 137
9877 235591 PRK05755 PRK05755 DNA polymerase I; Provisional 880
9878 235592 PRK05756 PRK05756 pyridoxal kinase PdxY. 286
9879 235593 PRK05758 PRK05758 F0F1 ATP synthase subunit delta; Validated 177
9880 180240 PRK05759 PRK05759 F0F1 ATP synthase subunit B; Validated 156
9881 180241 PRK05760 PRK05760 F0F1 ATP synthase subunit I; Validated 124
9882 235594 PRK05761 PRK05761 DNA-directed DNA polymerase I. 787
9883 235595 PRK05762 PRK05762 DNA polymerase II; Reviewed 786
9884 235596 PRK05764 PRK05764 aspartate aminotransferase; Provisional 393
9885 235597 PRK05765 PRK05765 precorrin-3B C17-methyltransferase; Provisional 246
9886 235598 PRK05766 rps14P 30S ribosomal protein S14P; Reviewed 52
9887 180247 PRK05767 rpl44e 50S ribosomal protein L44e; Validated 92
9888 235599 PRK05769 PRK05769 acetyl ornithine aminotransferase family protein. 441
9889 235600 PRK05771 PRK05771 V-type ATP synthase subunit I; Validated 646
9890 168237 PRK05772 PRK05772 S-methyl-5-thioribose-1-phosphate isomerase. 363
9891 235601 PRK05773 PRK05773 3,4-dihydroxy-2-butanone 4-phosphate synthase; Validated 219
9892 235602 PRK05776 PRK05776 DNA topoisomerase I; Provisional 670
9893 235603 PRK05777 PRK05777 NADH-quinone oxidoreductase subunit NuoN. 476
9894 235604 PRK05778 PRK05778 2-oxoglutarate ferredoxin oxidoreductase subunit beta; Validated 301
9895 235605 PRK05782 PRK05782 bifunctional sirohydrochlorin cobalt chelatase/precorrin-8X methylmutase; Validated 335
9896 235606 PRK05783 PRK05783 hypothetical protein; Provisional 84
9897 180256 PRK05784 PRK05784 phosphoribosylamine--glycine ligase; Provisional 486
9898 235607 PRK05785 PRK05785 hypothetical protein; Provisional 226
9899 235608 PRK05786 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 238
9900 235609 PRK05787 PRK05787 cobalt-precorrin-7 (C(5))-methyltransferase. 210
9901 235610 PRK05788 PRK05788 cobalt-precorrin 5A hydrolase. 315
9902 180261 PRK05790 PRK05790 putative acyltransferase; Provisional 393
9903 235611 PRK05793 PRK05793 amidophosphoribosyltransferase; Provisional 469
9904 180263 PRK05799 PRK05799 oxygen-independent coproporphyrinogen III oxidase. 374
9905 235612 PRK05800 cobU adenosylcobinamide kinase/adenosylcobinamide-phosphate guanylyltransferase; Validated 170
9906 235613 PRK05802 PRK05802 sulfide/dihydroorotate dehydrogenase-like FAD/NAD-binding protein. 320
9907 180266 PRK05803 PRK05803 RNA polymerase sporulation sigma factor SigK. 233
9908 180267 PRK05805 PRK05805 phosphate butyryltransferase; Validated 301
9909 235614 PRK05807 PRK05807 RNA-binding protein S1. 136
9910 180269 PRK05808 PRK05808 3-hydroxybutyryl-CoA dehydrogenase; Validated 282
9911 180270 PRK05809 PRK05809 short-chain-enoyl-CoA hydratase. 260
9912 235615 PRK05812 secD preprotein translocase subunit SecD; Reviewed 462
9913 235616 PRK05813 PRK05813 single-stranded DNA-binding protein; Provisional 219
9914 235617 PRK05815 PRK05815 F0F1 ATP synthase subunit A; Validated 227
9915 235618 PRK05818 PRK05818 DNA polymerase III subunit delta'; Validated 261
9916 180275 PRK05819 deoD DeoD-type purine-nucleoside phosphorylase. 235
9917 180276 PRK05820 deoA thymidine phosphorylase; Reviewed 440
9918 235619 PRK05826 PRK05826 pyruvate kinase; Provisional 465
9919 180278 PRK05828 PRK05828 acyl carrier protein; Validated 84
9920 180279 PRK05834 PRK05834 hypothetical protein; Provisional 194
9921 180280 PRK05835 PRK05835 class II fructose-1,6-bisphosphate aldolase. 307
9922 180281 PRK05839 PRK05839 succinyldiaminopimelate transaminase. 374
9923 235620 PRK05841 flgE flagellar hook protein FlgE; Validated 603
9924 235621 PRK05842 flgD flagellar hook assembly protein FlgD. 295
9925 180284 PRK05844 PRK05844 pyruvate flavodoxin oxidoreductase subunit gamma; Validated 186
9926 235622 PRK05846 PRK05846 NADH:ubiquinone oxidoreductase subunit M; Reviewed 497
9927 180286 PRK05848 PRK05848 carboxylating nicotinate-nucleotide diphosphorylase. 273
9928 235623 PRK05849 PRK05849 hypothetical protein; Provisional 783
9929 235624 PRK05850 PRK05850 acyl-CoA synthetase; Validated 578
9930 180289 PRK05851 PRK05851 long-chain-fatty acid--ACP ligase MbtM. 525
9931 235625 PRK05852 PRK05852 fatty acid--CoA ligase family protein. 534
9932 235626 PRK05853 PRK05853 hypothetical protein; Validated 161
9933 235627 PRK05854 PRK05854 SDR family oxidoreductase. 313
9934 235628 PRK05855 PRK05855 SDR family oxidoreductase. 582
9935 180293 PRK05857 PRK05857 fatty acid--CoA ligase. 540
9936 235629 PRK05858 PRK05858 acetolactate synthase. 542
9937 180295 PRK05862 PRK05862 enoyl-CoA hydratase; Provisional 257
9938 135627 PRK05863 PRK05863 sulfur carrier protein ThiS; Provisional 65
9939 168278 PRK05864 PRK05864 enoyl-CoA hydratase; Provisional 276
9940 235630 PRK05865 PRK05865 sugar epimerase family protein. 854
9941 235631 PRK05866 PRK05866 SDR family oxidoreductase. 293
9942 135631 PRK05867 PRK05867 SDR family oxidoreductase. 253
9943 180297 PRK05868 PRK05868 FAD-binding protein. 372
9944 235632 PRK05869 PRK05869 enoyl-CoA hydratase; Validated 222
9945 180298 PRK05870 PRK05870 enoyl-CoA hydratase; Provisional 249
9946 235633 PRK05872 PRK05872 short chain dehydrogenase; Provisional 296
9947 102036 PRK05874 PRK05874 L-fuculose-phosphate aldolase; Validated 217
9948 180300 PRK05875 PRK05875 short chain dehydrogenase; Provisional 276
9949 135637 PRK05876 PRK05876 short chain dehydrogenase; Provisional 275
9950 235634 PRK05877 PRK05877 aminodeoxychorismate synthase component I; Provisional 405
9951 235635 PRK05878 PRK05878 pyruvate phosphate dikinase; Provisional 530
9952 180303 PRK05880 PRK05880 F0F1 ATP synthase subunit C; Validated 81
9953 180304 PRK05883 PRK05883 acyl carrier protein; Validated 91
9954 135642 PRK05884 PRK05884 SDR family oxidoreductase. 223
9955 235636 PRK05886 yajC preprotein translocase subunit YajC; Validated 109
9956 235637 PRK05888 PRK05888 NADH-quinone oxidoreductase subunit NuoI. 164
9957 180306 PRK05889 PRK05889 biotin/lipoyl-binding carrier protein. 71
9958 180307 PRK05892 PRK05892 nucleoside diphosphate kinase regulator; Provisional 158
9959 235638 PRK05896 PRK05896 DNA polymerase III subunits gamma and tau; Validated 605
9960 135648 PRK05898 dnaE DNA polymerase III subunit alpha. 971
9961 235639 PRK05899 PRK05899 transketolase; Reviewed 586
9962 235640 PRK05901 PRK05901 RNA polymerase sigma factor; Provisional 509
9963 235641 PRK05904 PRK05904 coproporphyrinogen III oxidase; Provisional 353
9964 235642 PRK05905 PRK05905 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 258
9965 168292 PRK05906 PRK05906 lipid A biosynthesis lauroyl acyltransferase; Provisional 454
9966 235643 PRK05907 PRK05907 hypothetical protein; Provisional 311
9967 168293 PRK05910 PRK05910 type III secretion system protein; Validated 584
9968 235644 PRK05911 PRK05911 RNA polymerase sigma factor sigma-28; Reviewed 257
9969 235645 PRK05912 PRK05912 tyrosyl-tRNA synthetase; Validated 408
9970 102059 PRK05917 PRK05917 DNA polymerase III subunit delta'; Validated 290
9971 180312 PRK05920 PRK05920 aromatic acid decarboxylase; Validated 204
9972 102061 PRK05922 PRK05922 type III secretion system ATPase; Validated 434
9973 235646 PRK05925 PRK05925 aspartate kinase; Provisional 440
9974 168296 PRK05926 PRK05926 hypothetical protein; Provisional 370
9975 135660 PRK05927 PRK05927 dehypoxanthine futalosine cyclase. 350
9976 235647 PRK05928 hemD uroporphyrinogen-III synthase; Reviewed 249
9977 235648 PRK05932 PRK05932 RNA polymerase factor sigma-54; Reviewed 455
9978 180315 PRK05933 PRK05933 type III secretion system protein; Validated 372
9979 168300 PRK05934 PRK05934 type III secretion system protein; Validated 341
9980 235649 PRK05935 PRK05935 biotin--protein ligase; Provisional 190
9981 102071 PRK05937 PRK05937 8-amino-7-oxononanoate synthase; Provisional 370
9982 235650 PRK05939 PRK05939 cystathionine gamma-synthase family protein. 397
9983 235651 PRK05940 PRK05940 anthranilate synthase component I. 463
9984 180317 PRK05942 PRK05942 aspartate aminotransferase; Provisional 394
9985 180318 PRK05943 PRK05943 50S ribosomal protein L25; Reviewed 94
9986 180319 PRK05945 sdhA succinate dehydrogenase/fumarate reductase flavoprotein subunit. 575
9987 180320 PRK05948 PRK05948 precorrin-2 C(20)-methyltransferase. 238
9988 180321 PRK05949 PRK05949 RNA polymerase sigma factor; Validated 327
9989 235652 PRK05950 sdhB succinate dehydrogenase iron-sulfur subunit; Reviewed 232
9990 180323 PRK05951 ubiA prenyltransferase; Reviewed 296
9991 235653 PRK05952 PRK05952 beta-ketoacyl-ACP synthase. 381
9992 180325 PRK05953 PRK05953 Precorrin-8X methylmutase. 208
9993 180326 PRK05954 PRK05954 precorrin-8X methylmutase; Provisional 203
9994 235654 PRK05957 PRK05957 pyridoxal phosphate-dependent aminotransferase. 389
9995 235655 PRK05958 PRK05958 8-amino-7-oxononanoate synthase; Reviewed 385
9996 168315 PRK05962 PRK05962 amidase; Validated 424
9997 180328 PRK05963 PRK05963 beta-ketoacyl-ACP synthase III. 326
9998 235656 PRK05964 PRK05964 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 423
9999 180330 PRK05965 PRK05965 hypothetical protein; Provisional 459
10000 235657 PRK05967 PRK05967 cystathionine beta-lyase; Provisional 395
10001 168320 PRK05968 PRK05968 hypothetical protein; Provisional 389
10002 235658 PRK05972 ligD ATP-dependent DNA ligase; Reviewed 860
10003 168322 PRK05973 PRK05973 replicative DNA helicase; Provisional 237
10004 235659 PRK05974 PRK05974 phosphoribosylformylglycinamidine synthase subunit PurS; Reviewed 80
10005 168324 PRK05975 PRK05975 3-carboxy-cis,cis-muconate cycloisomerase; Provisional 351
10006 235660 PRK05976 PRK05976 dihydrolipoamide dehydrogenase; Validated 472
10007 180334 PRK05978 PRK05978 hypothetical protein; Provisional 148
10008 180335 PRK05980 PRK05980 crotonase/enoyl-CoA hydratase family protein. 260
10009 235661 PRK05981 PRK05981 enoyl-CoA hydratase/isomerase. 266
10010 180337 PRK05985 PRK05985 cytosine deaminase; Provisional 391
10011 235662 PRK05986 PRK05986 cob(I)yrinic acid a,c-diamide adenosyltransferase. 191
10012 180339 PRK05988 PRK05988 formate dehydrogenase subunit gamma; Validated 156
10013 235663 PRK05989 cobN cobaltochelatase subunit CobN; Reviewed 1244
10014 180341 PRK05990 PRK05990 precorrin-2 C(20)-methyltransferase; Reviewed 241
10015 180342 PRK05991 PRK05991 precorrin-3B C17-methyltransferase; Provisional 250
10016 180343 PRK05993 PRK05993 SDR family oxidoreductase. 277
10017 180344 PRK05994 PRK05994 O-acetylhomoserine aminocarboxypropyltransferase; Validated 427
10018 235664 PRK05995 PRK05995 enoyl-CoA hydratase; Provisional 262
10019 235665 PRK05996 motB MotB family protein. 423
10020 235666 PRK06002 fliI flagellar protein export ATPase FliI. 450
10021 168340 PRK06003 flgB flagellar basal body rod protein FlgB; Reviewed 126
10022 235667 PRK06004 flgB flagellar basal body rod protein FlgB; Reviewed 127
10023 180347 PRK06005 flgA flagellar basal body P-ring formation protein FlgA. 160
10024 235668 PRK06007 fliF flagellar basal body M-ring protein FliF. 542
10025 235669 PRK06008 flgL flagellar hook-associated family protein. 348
10026 235670 PRK06009 flgD flagellar hook assembly protein FlgD. 140
10027 235671 PRK06010 fliQ flagellar biosynthesis protein FliQ; Reviewed 88
10028 235672 PRK06012 flhA flagellar type III secretion system protein FlhA. 697
10029 168348 PRK06015 PRK06015 2-dehydro-3-deoxy-phosphogluconate aldolase. 201
10030 235673 PRK06018 PRK06018 putative acyl-CoA synthetase; Provisional 542
10031 235674 PRK06019 PRK06019 phosphoribosylaminoimidazole carboxylase ATPase subunit; Reviewed 372
10032 168351 PRK06023 PRK06023 crotonase/enoyl-CoA hydratase family protein. 251
10033 235675 PRK06025 PRK06025 acetyl-CoA C-acetyltransferase. 417
10034 180353 PRK06026 PRK06026 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase; Validated 212
10035 235676 PRK06027 purU formyltetrahydrofolate deformylase; Reviewed 286
10036 235677 PRK06029 PRK06029 UbiX family flavin prenyltransferase. 185
10037 180356 PRK06030 PRK06030 hypothetical protein; Provisional 124
10038 235678 PRK06031 PRK06031 phosphoribosyltransferase; Provisional 233
10039 235679 PRK06032 fliH flagellar assembly protein H; Validated 199
10040 180359 PRK06033 PRK06033 flagellar motor switch protein FliN. 83
10041 235680 PRK06034 PRK06034 hypothetical protein; Provisional 279
10042 180361 PRK06035 PRK06035 3-hydroxyacyl-CoA dehydrogenase; Validated 291
10043 180362 PRK06036 PRK06036 S-methyl-5-thioribose-1-phosphate isomerase. 339
10044 180363 PRK06038 PRK06038 N-ethylammeline chlorohydrolase; Provisional 430
10045 235681 PRK06039 ileS isoleucyl-tRNA synthetase; Reviewed 975
10046 235682 PRK06041 PRK06041 archaellar assembly protein FlaJ. 553
10047 180366 PRK06043 PRK06043 fumarate hydratase; Provisional 192
10048 180367 PRK06046 PRK06046 alanine dehydrogenase; Validated 326
10049 180368 PRK06048 PRK06048 acetolactate synthase large subunit. 561
10050 235683 PRK06049 rpl30p 50S ribosomal protein L30P; Reviewed 154
10051 235684 PRK06052 PRK06052 methionine synthase. 344
10052 180371 PRK06057 PRK06057 short chain dehydrogenase; Provisional 255
10053 235685 PRK06058 PRK06058 4-aminobutyrate--2-oxoglutarate transaminase. 443
10054 180373 PRK06059 PRK06059 lipid-transfer protein; Provisional 399
10055 180374 PRK06060 PRK06060 p-hydroxybenzoic acid--AMP ligase FadD22. 705
10056 235686 PRK06061 PRK06061 amidase; Provisional 483
10057 235687 PRK06062 PRK06062 hypothetical protein; Provisional 451
10058 180377 PRK06063 PRK06063 DEDDh family exonuclease. 313
10059 235688 PRK06064 PRK06064 thiolase domain-containing protein. 389
10060 180379 PRK06065 PRK06065 thiolase domain-containing protein. 392
10061 180380 PRK06066 PRK06066 thiolase domain-containing protein. 385
10062 180381 PRK06067 PRK06067 flagellar accessory protein FlaH; Validated 234
10063 235689 PRK06069 sdhA succinate dehydrogenase/fumarate reductase flavoprotein subunit. 577
10064 168377 PRK06072 PRK06072 enoyl-CoA hydratase; Provisional 248
10065 235690 PRK06073 PRK06073 NADH dehydrogenase subunit A; Validated 124
10066 235691 PRK06074 PRK06074 NADH dehydrogenase subunit C; Provisional 189
10067 180385 PRK06075 PRK06075 NADH-quinone oxidoreductase subunit D. 392
10068 235692 PRK06076 PRK06076 NADH-quinone oxidoreductase subunit NuoH. 322
10069 235693 PRK06077 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 252
10070 180387 PRK06078 PRK06078 pyrimidine-nucleoside phosphorylase; Reviewed 434
10071 235694 PRK06079 PRK06079 enoyl-[acyl-carrier-protein] reductase FabI. 252
10072 235695 PRK06080 PRK06080 1,4-dihydroxy-2-naphthoate octaprenyltransferase; Validated 293
10073 180390 PRK06082 PRK06082 aspartate aminotransferase family protein. 459
10074 180391 PRK06083 PRK06083 sulfur carrier protein ThiS; Provisional 84
10075 180392 PRK06084 PRK06084 bifunctional O-acetylhomoserine aminocarboxypropyltransferase/cysteine synthase. 425
10076 180393 PRK06087 PRK06087 medium-chain fatty-acid--CoA ligase. 547
10077 180394 PRK06090 PRK06090 DNA polymerase III subunit delta'; Validated 319
10078 180395 PRK06091 PRK06091 membrane protein FdrA; Validated 555
10079 235696 PRK06092 PRK06092 4-amino-4-deoxychorismate lyase; Reviewed 268
10080 180397 PRK06096 PRK06096 molybdenum transport protein ModD; Provisional 284
10081 235697 PRK06099 PRK06099 F0F1 ATP synthase subunit I; Validated 126
10082 180398 PRK06100 PRK06100 DNA polymerase III subunit psi; Provisional 132
10083 180399 PRK06101 PRK06101 SDR family oxidoreductase. 240
10084 235698 PRK06102 PRK06102 amidase. 452
10085 180401 PRK06105 PRK06105 aminotransferase; Provisional 460
10086 180402 PRK06106 PRK06106 carboxylating nicotinate-nucleotide diphosphorylase. 281
10087 180403 PRK06107 PRK06107 aspartate transaminase. 402
10088 180404 PRK06108 PRK06108 pyridoxal phosphate-dependent aminotransferase. 382
10089 235699 PRK06110 PRK06110 threonine dehydratase. 322
10090 180406 PRK06111 PRK06111 acetyl-CoA carboxylase biotin carboxylase subunit; Validated 450
10091 235700 PRK06112 PRK06112 acetolactate synthase catalytic subunit; Validated 578
10092 135765 PRK06113 PRK06113 7-alpha-hydroxysteroid dehydrogenase; Validated 255
10093 180408 PRK06114 PRK06114 SDR family oxidoreductase. 254
10094 180409 PRK06115 PRK06115 dihydrolipoamide dehydrogenase; Reviewed 466
10095 235701 PRK06116 PRK06116 glutathione reductase; Validated 450
10096 180411 PRK06123 PRK06123 SDR family oxidoreductase. 248
10097 235702 PRK06124 PRK06124 SDR family oxidoreductase. 256
10098 235703 PRK06125 PRK06125 short chain dehydrogenase; Provisional 259
10099 235704 PRK06126 PRK06126 hypothetical protein; Provisional 545
10100 235705 PRK06127 PRK06127 enoyl-CoA hydratase; Provisional 269
10101 180413 PRK06128 PRK06128 SDR family oxidoreductase. 300
10102 235706 PRK06129 PRK06129 3-hydroxyacyl-CoA dehydrogenase; Validated 308
10103 235707 PRK06130 PRK06130 3-hydroxybutyryl-CoA dehydrogenase; Validated 311
10104 235708 PRK06131 PRK06131 dihydroxy-acid dehydratase; Validated 571
10105 235709 PRK06132 PRK06132 hypothetical protein; Provisional 359
10106 235710 PRK06133 PRK06133 glutamate carboxypeptidase; Reviewed 410
10107 180419 PRK06134 PRK06134 putative FAD-binding dehydrogenase; Reviewed 581
10108 235711 PRK06136 PRK06136 uroporphyrinogen-III C-methyltransferase. 249
10109 235712 PRK06138 PRK06138 SDR family oxidoreductase. 252
10110 235713 PRK06139 PRK06139 SDR family oxidoreductase. 330
10111 180421 PRK06141 PRK06141 ornithine cyclodeaminase family protein. 314
10112 235714 PRK06142 PRK06142 crotonase/enoyl-CoA hydratase family protein. 272
10113 180423 PRK06143 PRK06143 enoyl-CoA hydratase; Provisional 256
10114 180424 PRK06144 PRK06144 enoyl-CoA hydratase; Provisional 262
10115 102207 PRK06145 PRK06145 acyl-CoA synthetase; Validated 497
10116 235715 PRK06147 PRK06147 3-oxoacyl-(acyl carrier protein) synthase; Validated 348
10117 180426 PRK06148 PRK06148 hypothetical protein; Provisional 1013
10118 235716 PRK06149 PRK06149 aminotransferase. 972
10119 180428 PRK06151 PRK06151 N-ethylammeline chlorohydrolase; Provisional 488
10120 235717 PRK06153 PRK06153 hypothetical protein; Provisional 393
10121 235718 PRK06154 PRK06154 thiamine pyrophosphate-requiring protein. 565
10122 235719 PRK06155 PRK06155 crotonobetaine/carnitine-CoA ligase; Provisional 542
10123 235720 PRK06156 PRK06156 dipeptidase. 520
10124 180433 PRK06157 PRK06157 acetyl-CoA acetyltransferase; Validated 398
10125 180434 PRK06158 PRK06158 thiolase; Provisional 384
10126 180435 PRK06161 PRK06161 putative monovalent cation/H+ antiporter subunit F; Reviewed 89
10127 235721 PRK06163 PRK06163 hypothetical protein; Provisional 202
10128 235722 PRK06164 PRK06164 acyl-CoA synthetase; Validated 540
10129 180437 PRK06169 PRK06169 putative amidase; Provisional 466
10130 235723 PRK06170 PRK06170 amidase; Provisional 490
10131 180439 PRK06171 PRK06171 sorbitol-6-phosphate 2-dehydrogenase; Provisional 266
10132 180440 PRK06172 PRK06172 SDR family oxidoreductase. 253
10133 180441 PRK06173 PRK06173 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 429
10134 180442 PRK06175 PRK06175 L-aspartate oxidase; Provisional 433
10135 180443 PRK06176 PRK06176 cystathionine gamma-synthase. 380
10136 235724 PRK06178 PRK06178 acyl-CoA synthetase; Validated 567
10137 235725 PRK06179 PRK06179 short chain dehydrogenase; Provisional 270
10138 180446 PRK06180 PRK06180 short chain dehydrogenase; Provisional 277
10139 235726 PRK06181 PRK06181 SDR family oxidoreductase. 263
10140 180448 PRK06182 PRK06182 short chain dehydrogenase; Validated 273
10141 235727 PRK06183 mhpA bifunctional 3-(3-hydroxy-phenyl)propionate/3-hydroxycinnamic acid hydroxylase. 500
10142 235728 PRK06184 PRK06184 hypothetical protein; Provisional 502
10143 235729 PRK06185 PRK06185 FAD-dependent oxidoreductase. 407
10144 180452 PRK06186 PRK06186 hypothetical protein; Validated 229
10145 235730 PRK06187 PRK06187 long-chain-fatty-acid--CoA ligase; Validated 521
10146 235731 PRK06188 PRK06188 acyl-CoA synthetase; Validated 524
10147 235732 PRK06189 PRK06189 allantoinase; Provisional 451
10148 235733 PRK06190 PRK06190 enoyl-CoA hydratase; Provisional 258
10149 235734 PRK06193 PRK06193 hypothetical protein; Provisional 206
10150 180458 PRK06194 PRK06194 hypothetical protein; Provisional 287
10151 235735 PRK06195 PRK06195 DNA polymerase III subunit epsilon; Validated 309
10152 235736 PRK06196 PRK06196 oxidoreductase; Provisional 315
10153 235737 PRK06197 PRK06197 short chain dehydrogenase; Provisional 306
10154 180462 PRK06198 PRK06198 short chain dehydrogenase; Provisional 260
10155 235738 PRK06199 PRK06199 ornithine cyclodeaminase; Validated 379
10156 235739 PRK06200 PRK06200 2,3-dihydroxy-2,3-dihydrophenylpropionate dehydrogenase; Provisional 263
10157 180465 PRK06201 PRK06201 hypothetical protein; Validated 221
10158 180466 PRK06202 PRK06202 hypothetical protein; Provisional 232
10159 235740 PRK06203 aroB 3-dehydroquinate synthase; Reviewed 389
10160 235741 PRK06205 PRK06205 acetyl-CoA C-acetyltransferase. 404
10161 235742 PRK06207 PRK06207 pyridoxal phosphate-dependent aminotransferase. 405
10162 235743 PRK06208 PRK06208 class II aldolase/adducin family protein. 274
10163 180471 PRK06209 PRK06209 glutamate-1-semialdehyde 2,1-aminomutase; Provisional 431
10164 180472 PRK06210 PRK06210 enoyl-CoA hydratase; Provisional 272
10165 235744 PRK06213 PRK06213 crotonase/enoyl-CoA hydratase family protein. 229
10166 235745 PRK06214 PRK06214 sulfite reductase subunit alpha. 530
10167 235746 PRK06215 PRK06215 hypothetical protein; Provisional 238
10168 168472 PRK06217 PRK06217 hypothetical protein; Validated 183
10169 235747 PRK06222 PRK06222 sulfide/dihydroorotate dehydrogenase-like FAD/NAD-binding protein. 281
10170 180477 PRK06223 PRK06223 malate dehydrogenase; Reviewed 307
10171 235748 PRK06224 PRK06224 citryl-CoA lyase. 263
10172 235749 PRK06225 PRK06225 pyridoxal phosphate-dependent aminotransferase. 380
10173 235750 PRK06228 PRK06228 F0F1 ATP synthase subunit epsilon; Validated 131
10174 180481 PRK06231 PRK06231 F0F1 ATP synthase subunit B; Validated 205
10175 180482 PRK06233 PRK06233 vitamin B12 independent methionine synthase. 372
10176 168478 PRK06234 PRK06234 methionine gamma-lyase; Provisional 400
10177 235751 PRK06241 PRK06241 phosphoenolpyruvate synthase; Validated 871
10178 180484 PRK06242 PRK06242 flavodoxin; Provisional 150
10179 180485 PRK06245 cofG FO synthase subunit 1; Reviewed 336
10180 180486 PRK06246 PRK06246 fumarate hydratase; Provisional 280
10181 180487 PRK06247 PRK06247 pyruvate kinase; Provisional 476
10182 180488 PRK06249 PRK06249 putative 2-dehydropantoate 2-reductase. 313
10183 235752 PRK06251 PRK06251 V-type ATP synthase subunit K; Validated 102
10184 235753 PRK06252 PRK06252 methylcobalamin:coenzyme M methyltransferase; Validated 339
10185 235754 PRK06253 PRK06253 O-phosphoseryl-tRNA synthetase; Reviewed 529
10186 235755 PRK06256 PRK06256 biotin synthase; Validated 336
10187 235756 PRK06259 PRK06259 succinate dehydrogenase/fumarate reductase iron-sulfur subunit; Provisional 486
10188 235757 PRK06260 PRK06260 threonine synthase; Validated 397
10189 235758 PRK06263 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 543
10190 235759 PRK06264 cbiC cobalt-precorrin-8 methylmutase. 210
10191 235760 PRK06265 PRK06265 cobalt transporter CbiM. 199
10192 235761 PRK06266 PRK06266 transcription initiation factor E subunit alpha; Validated 178
10193 235762 PRK06267 PRK06267 hypothetical protein; Provisional 350
10194 235763 PRK06270 PRK06270 homoserine dehydrogenase; Provisional 341
10195 180501 PRK06271 PRK06271 V-type ATP synthase subunit K; Validated 213
10196 235764 PRK06273 PRK06273 ferredoxin; Provisional 165
10197 235765 PRK06274 PRK06274 indolepyruvate oxidoreductase subunit beta. 197
10198 235766 PRK06276 PRK06276 acetolactate synthase large subunit. 586
10199 235767 PRK06277 PRK06277 energy conserving hydrogenase EhbF. 478
10200 180505 PRK06278 PRK06278 cobyrinic acid a,c-diamide synthase; Validated 476
10201 180506 PRK06279 PRK06279 putative monovalent cation/H+ antiporter subunit E; Reviewed 100
10202 235768 PRK06280 PRK06280 hypothetical protein; Provisional 77
10203 180508 PRK06281 PRK06281 putative monovalent cation/H+ antiporter subunit B; Reviewed 154
10204 180509 PRK06285 PRK06285 chorismate mutase; Provisional 96
10205 235769 PRK06286 PRK06286 putative monovalent cation/H+ antiporter subunit G; Reviewed 91
10206 180511 PRK06287 PRK06287 cobalt transport protein CbiN; Validated 107
10207 235770 PRK06288 PRK06288 RNA polymerase sigma factor WhiG; Reviewed 268
10208 235771 PRK06289 PRK06289 acetyl-CoA acetyltransferase; Provisional 403
10209 235772 PRK06290 PRK06290 LL-diaminopimelate aminotransferase. 410
10210 235773 PRK06291 PRK06291 aspartate kinase; Provisional 465
10211 235774 PRK06292 PRK06292 dihydrolipoamide dehydrogenase; Validated 460
10212 180517 PRK06293 PRK06293 single-stranded DNA-binding protein; Provisional 161
10213 180518 PRK06294 PRK06294 coproporphyrinogen III oxidase; Provisional 370
10214 180519 PRK06298 PRK06298 type III secretion system protein; Validated 356
10215 235775 PRK06299 rpsA 30S ribosomal protein S1; Reviewed 565
10216 235776 PRK06300 PRK06300 enoyl-(acyl carrier protein) reductase; Provisional 299
10217 235777 PRK06302 PRK06302 acetyl-CoA carboxylase biotin carboxyl carrier protein. 155
10218 180523 PRK06305 PRK06305 DNA polymerase III subunits gamma and tau; Validated 451
10219 180524 PRK06309 PRK06309 DNA polymerase III subunit epsilon; Validated 232
10220 180525 PRK06310 PRK06310 DNA polymerase III subunit epsilon; Validated 250
10221 180526 PRK06315 PRK06315 type III secretion system ATPase; Provisional 442
10222 235778 PRK06319 PRK06319 DNA topoisomerase I/SWI domain fusion protein; Validated 860
10223 180528 PRK06321 PRK06321 replicative DNA helicase; Provisional 472
10224 235779 PRK06327 PRK06327 dihydrolipoamide dehydrogenase; Validated 475
10225 180530 PRK06328 PRK06328 HrpE/YscL family type III secretion apparatus protein. 223
10226 235780 PRK06330 PRK06330 transcript cleavage factor/unknown domain fusion protein; Validated 718
10227 235781 PRK06333 PRK06333 beta-ketoacyl-ACP synthase. 424
10228 180533 PRK06334 PRK06334 long chain fatty acid--[acyl-carrier-protein] ligase; Validated 539
10229 235782 PRK06341 PRK06341 single-stranded DNA-binding protein; Provisional 166
10230 180535 PRK06342 PRK06342 transcription elongation factor GreA. 160
10231 180536 PRK06347 PRK06347 1,4-beta-N-acetylmuramoylhydrolase. 592
10232 180537 PRK06348 PRK06348 pyridoxal phosphate-dependent aminotransferase. 384
10233 235783 PRK06349 PRK06349 homoserine dehydrogenase; Provisional 426
10234 180539 PRK06352 PRK06352 threonine synthase; Validated 351
10235 235784 PRK06354 PRK06354 pyruvate kinase; Provisional 590
10236 180541 PRK06357 PRK06357 hypothetical protein; Provisional 216
10237 180542 PRK06358 PRK06358 threonine-phosphate decarboxylase; Provisional 354
10238 180543 PRK06361 PRK06361 histidinol phosphate phosphatase domain-containing protein. 212
10239 235785 PRK06365 PRK06365 thiolase domain-containing protein. 430
10240 102340 PRK06366 PRK06366 acetyl-CoA C-acetyltransferase. 388
10241 235786 PRK06369 nac nascent polypeptide-associated complex protein; Reviewed 115
10242 235787 PRK06370 PRK06370 FAD-containing oxidoreductase. 463
10243 180547 PRK06371 PRK06371 S-methyl-5-thioribose-1-phosphate isomerase. 329
10244 235788 PRK06372 PRK06372 translation initiation factor IF-2B subunit delta; Provisional 253
10245 180548 PRK06380 PRK06380 metal-dependent hydrolase; Provisional 418
10246 235789 PRK06381 PRK06381 threonine synthase; Validated 319
10247 180550 PRK06382 PRK06382 threonine dehydratase; Provisional 406
10248 235790 PRK06386 PRK06386 replication factor A; Reviewed 358
10249 102351 PRK06388 PRK06388 amidophosphoribosyltransferase; Provisional 474
10250 235791 PRK06389 PRK06389 argininosuccinate lyase; Provisional 434
10251 235792 PRK06390 PRK06390 adenylosuccinate lyase; Provisional 451
10252 102354 PRK06392 PRK06392 homoserine dehydrogenase; Provisional 326
10253 102355 PRK06393 rpoE DNA-directed RNA polymerase subunit E''; Validated 64
10254 235793 PRK06394 rpl13p 50S ribosomal protein L13P; Reviewed 146
10255 102357 PRK06395 PRK06395 phosphoribosylamine--glycine ligase; Provisional 435
10256 135898 PRK06397 PRK06397 V-type ATP synthase subunit H; Validated 111
10257 235794 PRK06398 PRK06398 aldose dehydrogenase; Validated 258
10258 235795 PRK06402 rpl12p 50S ribosomal protein L12P; Reviewed 106
10259 102361 PRK06404 PRK06404 anthranilate synthase component I; Reviewed 351
10260 235796 PRK06406 PRK06406 vitamin B12-dependent ribonucleotide reductase. 771
10261 180556 PRK06407 PRK06407 ornithine cyclodeaminase; Provisional 301
10262 235797 PRK06411 PRK06411 NADH-quinone oxidoreductase subunit NuoB. 183
10263 235798 PRK06416 PRK06416 dihydrolipoamide dehydrogenase; Reviewed 462
10264 180559 PRK06418 PRK06418 transcription elongation factor NusA-like protein; Validated 166
10265 235799 PRK06419 rpl15p 50S ribosomal protein L15P; Reviewed 148
10266 102368 PRK06423 PRK06423 phosphoribosylformylglycinamidine synthase; Provisional 73
10267 102369 PRK06424 PRK06424 transcription factor; Provisional 144
10268 102370 PRK06425 PRK06425 histidinol-phosphate aminotransferase; Validated 332
10269 180561 PRK06427 PRK06427 bifunctional hydroxy-methylpyrimidine kinase/ hydroxy-phosphomethylpyrimidine kinase; Reviewed 266
10270 102372 PRK06432 PRK06432 NADH dehydrogenase subunit A; Validated 144
10271 180562 PRK06433 PRK06433 NADH dehydrogenase subunit J; Provisional 88
10272 102374 PRK06434 PRK06434 cystathionine gamma-lyase; Validated 384
10273 235800 PRK06436 PRK06436 2-hydroxyacid dehydrogenase. 303
10274 135906 PRK06437 PRK06437 hypothetical protein; Provisional 67
10275 102377 PRK06438 PRK06438 hypothetical protein; Provisional 292
10276 102378 PRK06439 PRK06439 NADH dehydrogenase subunit J; Provisional 72
10277 235801 PRK06443 PRK06443 chorismate mutase; Validated 177
10278 102381 PRK06444 PRK06444 prephenate dehydrogenase; Provisional 197
10279 180563 PRK06445 PRK06445 acetyl-CoA C-acetyltransferase. 394
10280 235802 PRK06446 PRK06446 hypothetical protein; Provisional 436
10281 180565 PRK06450 PRK06450 threonine synthase; Validated 338
10282 235803 PRK06451 PRK06451 NADP-dependent isocitrate dehydrogenase. 412
10283 180567 PRK06452 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 566
10284 235804 PRK06455 PRK06455 riboflavin synthase; Provisional 155
10285 180569 PRK06456 PRK06456 acetolactate synthase large subunit. 572
10286 180570 PRK06457 PRK06457 pyruvate dehydrogenase; Provisional 549
10287 235805 PRK06458 PRK06458 hydrogenase 4 subunit F; Validated 490
10288 235806 PRK06459 PRK06459 hydrogenase 4 subunit B; Validated 585
10289 235807 PRK06460 PRK06460 hypothetical protein; Provisional 376
10290 180574 PRK06461 PRK06461 single-stranded DNA-binding protein; Reviewed 129
10291 235808 PRK06462 PRK06462 asparagine synthetase A; Reviewed 335
10292 180576 PRK06463 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 255
10293 235809 PRK06464 PRK06464 phosphoenolpyruvate synthase; Validated 795
10294 180578 PRK06466 PRK06466 acetolactate synthase 3 large subunit. 574
10295 180579 PRK06467 PRK06467 dihydrolipoamide dehydrogenase; Reviewed 471
10296 235810 PRK06473 PRK06473 NADH-quinone oxidoreductase subunit M. 500
10297 235811 PRK06474 PRK06474 hypothetical protein; Provisional 178
10298 180582 PRK06475 PRK06475 FAD-binding protein. 400
10299 235812 PRK06476 PRK06476 pyrroline-5-carboxylate reductase; Reviewed 258
10300 180584 PRK06481 PRK06481 flavocytochrome c. 506
10301 235813 PRK06482 PRK06482 SDR family oxidoreductase. 276
10302 180586 PRK06483 PRK06483 dihydromonapterin reductase; Provisional 236
10303 168574 PRK06484 PRK06484 short chain dehydrogenase; Validated 520
10304 235814 PRK06486 PRK06486 aldolase. 262
10305 180588 PRK06487 PRK06487 2-hydroxyacid dehydrogenase. 317
10306 168577 PRK06488 PRK06488 sulfur carrier protein ThiS; Validated 65
10307 235815 PRK06489 PRK06489 hypothetical protein; Provisional 360
10308 180590 PRK06490 PRK06490 glutamine amidotransferase; Provisional 239
10309 180591 PRK06494 PRK06494 enoyl-CoA hydratase; Provisional 259
10310 168580 PRK06495 PRK06495 enoyl-CoA hydratase/isomerase family protein. 257
10311 180592 PRK06498 PRK06498 isocitrate lyase; Provisional 531
10312 235816 PRK06500 PRK06500 SDR family oxidoreductase. 249
10313 235817 PRK06501 PRK06501 beta-ketoacyl-ACP synthase. 425
10314 180595 PRK06504 PRK06504 acetyl-CoA C-acetyltransferase. 390
10315 180596 PRK06505 PRK06505 enoyl-[acyl-carrier-protein] reductase FabI. 271
10316 180597 PRK06508 PRK06508 acyl carrier protein; Provisional 93
10317 180598 PRK06512 PRK06512 thiamine phosphate synthase. 221
10318 235818 PRK06518 PRK06518 hypothetical protein; Provisional 177
10319 235819 PRK06519 PRK06519 beta-ketoacyl-ACP synthase. 398
10320 180601 PRK06520 PRK06520 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase. 368
10321 235820 PRK06521 PRK06521 hydrogenase 4 subunit B; Validated 667
10322 235821 PRK06522 PRK06522 2-dehydropantoate 2-reductase; Reviewed 304
10323 180604 PRK06523 PRK06523 short chain dehydrogenase; Provisional 260
10324 180605 PRK06524 PRK06524 biotin carboxylase-like protein; Validated 493
10325 180606 PRK06525 PRK06525 hydrogenase 4 subunit D; Validated 479
10326 180607 PRK06526 PRK06526 transposase; Provisional 254
10327 180608 PRK06529 PRK06529 amidase; Provisional 482
10328 235822 PRK06531 yajC preprotein translocase subunit YajC; Validated 113
10329 180610 PRK06539 PRK06539 ribonucleoside-diphosphate reductase subunit alpha. 822
10330 235823 PRK06541 PRK06541 aspartate aminotransferase family protein. 460
10331 180612 PRK06543 PRK06543 carboxylating nicotinate-nucleotide diphosphorylase. 281
10332 235824 PRK06545 PRK06545 prephenate dehydrogenase; Validated 359
10333 180614 PRK06546 PRK06546 pyruvate dehydrogenase; Provisional 578
10334 235825 PRK06547 PRK06547 hypothetical protein; Provisional 172
10335 75628 PRK06548 PRK06548 ribonuclease H; Provisional 161
10336 235826 PRK06549 PRK06549 acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated 130
10337 180617 PRK06550 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 235
10338 180618 PRK06552 PRK06552 keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase; Provisional 213
10339 235827 PRK06553 PRK06553 lipid A biosynthesis lauroyl acyltransferase; Provisional 308
10340 180620 PRK06555 PRK06555 pyrophosphate--fructose-6-phosphate 1-phosphotransferase; Validated 403
10341 235828 PRK06556 PRK06556 vitamin B12-dependent ribonucleotide reductase; Validated 953
10342 235829 PRK06557 PRK06557 L-ribulose-5-phosphate 4-epimerase; Validated 221
10343 235830 PRK06558 PRK06558 V-type ATP synthase subunit K; Validated 159
10344 235831 PRK06559 PRK06559 carboxylating nicotinate-nucleotide diphosphorylase. 290
10345 180625 PRK06563 PRK06563 crotonase/enoyl-CoA hydratase family protein. 255
10346 180626 PRK06565 PRK06565 amidase; Validated 566
10347 235832 PRK06567 PRK06567 putative bifunctional glutamate synthase subunit beta/2-polyprenylphenol hydroxylase; Validated 1028
10348 168615 PRK06568 PRK06568 F0F1 ATP synthase subunit B; Validated 154
10349 180627 PRK06569 PRK06569 F0F1 ATP synthase subunit B'; Validated 155
10350 180628 PRK06580 PRK06580 putative monovalent cation/H+ antiporter subunit E; Reviewed 103
10351 235833 PRK06581 PRK06581 DNA polymerase III subunit delta'; Validated 263
10352 180630 PRK06582 PRK06582 coproporphyrinogen III oxidase; Provisional 390
10353 235834 PRK06585 holA DNA polymerase III subunit delta; Reviewed 343
10354 168619 PRK06588 PRK06588 putative monovalent cation/H+ antiporter subunit D; Reviewed 506
10355 235835 PRK06589 PRK06589 putative monovalent cation/H+ antiporter subunit D; Reviewed 489
10356 235836 PRK06590 PRK06590 NADH:ubiquinone oxidoreductase subunit L; Reviewed 624
10357 235837 PRK06591 PRK06591 putative monovalent cation/H+ antiporter subunit D; Reviewed 432
10358 235838 PRK06596 PRK06596 RNA polymerase factor sigma-32; Reviewed 284
10359 235839 PRK06598 PRK06598 aspartate-semialdehyde dehydrogenase; Reviewed 369
10360 235840 PRK06599 PRK06599 DNA topoisomerase I; Validated 675
10361 180638 PRK06602 PRK06602 NADH:ubiquinone oxidoreductase subunit A; Validated 121
10362 168626 PRK06603 PRK06603 enoyl-[acyl-carrier-protein] reductase FabI. 260
10363 235841 PRK06606 PRK06606 branched-chain amino acid transaminase. 306
10364 235842 PRK06608 PRK06608 serine/threonine dehydratase. 338
10365 168629 PRK06617 PRK06617 2-octaprenyl-6-methoxyphenyl hydroxylase; Validated 374
10366 168630 PRK06620 PRK06620 hypothetical protein; Validated 214
10367 102471 PRK06628 PRK06628 lipid A biosynthesis lauroyl acyltransferase; Provisional 290
10368 168631 PRK06630 PRK06630 hypothetical protein; Provisional 99
10369 168632 PRK06633 PRK06633 acetyl-CoA C-acetyltransferase. 392
10370 235843 PRK06635 PRK06635 aspartate kinase; Reviewed 404
10371 235844 PRK06638 PRK06638 NADH-quinone oxidoreductase subunit J. 198
10372 135984 PRK06642 PRK06642 single-stranded DNA-binding protein; Provisional 152
10373 180643 PRK06645 PRK06645 DNA polymerase III subunits gamma and tau; Validated 507
10374 102480 PRK06646 PRK06646 DNA polymerase III subunit chi; Provisional 154
10375 235845 PRK06647 PRK06647 DNA polymerase III subunits gamma and tau; Validated 563
10376 180645 PRK06649 PRK06649 V-type ATP synthase subunit K; Validated 143
10377 235846 PRK06654 fliL flagellar basal body-associated protein FliL; Reviewed 181
10378 235847 PRK06655 flgD flagellar hook assembly protein FlgD. 225
10379 168637 PRK06661 PRK06661 hypothetical protein; Provisional 231
10380 180648 PRK06663 PRK06663 flagellar hook-associated protein 3. 419
10381 235848 PRK06664 fliD flagellar hook-associated protein FliD; Validated 661
10382 180650 PRK06665 flgK flagellar hook-associated protein FlgK; Validated 627
10383 235849 PRK06666 fliM flagellar motor switch protein FliM; Validated 337
10384 180652 PRK06667 motB flagellar motor protein MotB; Validated 252
10385 235850 PRK06669 fliH flagellar assembly protein H; Validated 281
10386 180654 PRK06672 PRK06672 hypothetical protein; Validated 341
10387 135998 PRK06673 PRK06673 DNA polymerase III subunit beta; Validated 376
10388 235851 PRK06676 rpsA 30S ribosomal protein S1; Reviewed 390
10389 180656 PRK06680 PRK06680 D-amino acid aminotransferase; Reviewed 286
10390 136002 PRK06683 PRK06683 hypothetical protein; Provisional 82
10391 180657 PRK06687 PRK06687 TRZ/ATZ family protein. 419
10392 235852 PRK06688 PRK06688 enoyl-CoA hydratase; Provisional 259
10393 180659 PRK06690 PRK06690 acetyl-CoA C-acyltransferase. 361
10394 180660 PRK06696 PRK06696 uridine kinase; Validated 223
10395 136007 PRK06698 PRK06698 bifunctional 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase/phosphatase; Validated 459
10396 235853 PRK06701 PRK06701 short chain dehydrogenase; Provisional 290
10397 102505 PRK06702 PRK06702 bifunctional O-acetylhomoserine aminocarboxypropyltransferase/cysteine synthase. 432
10398 235854 PRK06703 PRK06703 flavodoxin; Provisional 151
10399 180663 PRK06704 PRK06704 RNA polymerase subunit sigma-70. 228
10400 180664 PRK06705 PRK06705 argininosuccinate lyase; Provisional 502
10401 235855 PRK06707 PRK06707 amidase; Provisional 536
10402 180666 PRK06710 PRK06710 long-chain-fatty-acid--CoA ligase; Validated 563
10403 168652 PRK06714 PRK06714 S-adenosylhomocysteine nucleosidase; Validated 236
10404 180667 PRK06718 PRK06718 NAD(P)-binding protein. 202
10405 180668 PRK06719 PRK06719 precorrin-2 dehydrogenase; Validated 157
10406 180669 PRK06720 PRK06720 hypothetical protein; Provisional 169
10407 136018 PRK06721 PRK06721 threonine synthase; Reviewed 352
10408 180670 PRK06722 PRK06722 exonuclease; Provisional 281
10409 180671 PRK06724 PRK06724 hypothetical protein; Provisional 128
10410 180672 PRK06725 PRK06725 acetolactate synthase large subunit. 570
10411 136022 PRK06728 PRK06728 aspartate-semialdehyde dehydrogenase; Provisional 347
10412 75717 PRK06731 flhF flagellar biosynthesis regulator FlhF; Validated 270
10413 235856 PRK06732 PRK06732 phosphopantothenate--cysteine ligase; Validated 229
10414 180674 PRK06733 PRK06733 hypothetical protein; Provisional 151
10415 180675 PRK06737 PRK06737 ACT domain-containing protein. 76
10416 180676 PRK06739 PRK06739 pyruvate kinase; Validated 352
10417 180677 PRK06740 PRK06740 histidinol phosphate phosphatase domain-containing protein. 331
10418 102525 PRK06742 PRK06742 flagellar motor protein MotS; Reviewed 225
10419 136027 PRK06743 PRK06743 flagellar motor protein MotP; Reviewed 254
10420 75726 PRK06746 PRK06746 peptide chain release factor 2; Provisional 326
10421 180678 PRK06748 PRK06748 hypothetical protein; Validated 83
10422 168658 PRK06749 PRK06749 replicative DNA helicase; Provisional 428
10423 168659 PRK06751 PRK06751 single-stranded DNA-binding protein; Provisional 173
10424 168660 PRK06752 PRK06752 single-stranded DNA-binding protein; Validated 112
10425 168661 PRK06753 PRK06753 hypothetical protein; Provisional 373
10426 180679 PRK06754 mtnB methylthioribulose 1-phosphate dehydratase. 208
10427 102532 PRK06755 PRK06755 hypothetical protein; Validated 209
10428 168663 PRK06756 PRK06756 flavodoxin; Provisional 148
10429 136035 PRK06758 PRK06758 hypothetical protein; Provisional 128
10430 235857 PRK06759 PRK06759 RNA polymerase factor sigma-70; Validated 154
10431 180681 PRK06760 PRK06760 hypothetical protein; Provisional 223
10432 180682 PRK06761 PRK06761 hypothetical protein; Provisional 282
10433 235858 PRK06762 PRK06762 hypothetical protein; Provisional 166
10434 136040 PRK06763 PRK06763 F0F1 ATP synthase subunit alpha; Validated 213
10435 102540 PRK06764 PRK06764 hypothetical protein; Provisional 105
10436 235859 PRK06765 PRK06765 homoserine O-acetyltransferase; Provisional 389
10437 180685 PRK06767 PRK06767 methionine gamma-lyase; Provisional 386
10438 180686 PRK06769 PRK06769 HAD-IIIA family hydrolase. 173
10439 180687 PRK06770 PRK06770 hypothetical protein; Provisional 180
10440 180688 PRK06771 PRK06771 hypothetical protein; Provisional 93
10441 102546 PRK06772 PRK06772 salicylate synthase. 434
10442 180689 PRK06774 PRK06774 aminodeoxychorismate synthase component II. 191
10443 180690 PRK06777 PRK06777 4-aminobutyrate--2-oxoglutarate transaminase. 421
10444 235860 PRK06778 PRK06778 hypothetical protein; Validated 289
10445 136048 PRK06781 PRK06781 amidophosphoribosyltransferase; Provisional 471
10446 235861 PRK06782 PRK06782 flagellar motor switch protein; Reviewed 528
10447 180693 PRK06788 PRK06788 flagellar motor switch protein; Validated 119
10448 180694 PRK06789 PRK06789 flagellar motor switch protein; Validated 74
10449 180695 PRK06792 flgD flagellar basal body rod modification protein; Validated 190
10450 180696 PRK06793 fliI flagellar protein export ATPase FliI. 432
10451 180697 PRK06797 flgB flagellar basal body rod protein FlgB; Reviewed 135
10452 180698 PRK06798 fliD flagellar hook-associated protein 2. 440
10453 180699 PRK06799 flgK flagellar hook-associated protein FlgK; Validated 431
10454 180700 PRK06800 fliH flagellar assembly protein H; Validated 228
10455 180701 PRK06801 PRK06801 ketose 1,6-bisphosphate aldolase. 286
10456 180702 PRK06802 flgC flagellar basal body rod protein FlgC; Reviewed 141
10457 235862 PRK06803 flgE flagellar basal body protein FlaE. 402
10458 235863 PRK06804 flgA flagellar basal body P-ring formation protein FlgA. 261
10459 180705 PRK06806 PRK06806 class II aldolase. 281
10460 235864 PRK06807 PRK06807 3'-5' exonuclease. 313
10461 180707 PRK06811 PRK06811 RNA polymerase factor sigma-70; Validated 189
10462 168683 PRK06813 PRK06813 homoserine dehydrogenase; Validated 346
10463 235865 PRK06814 PRK06814 acyl-[ACP]--phospholipid O-acyltransferase. 1140
10464 180709 PRK06815 PRK06815 threonine/serine dehydratase. 317
10465 235866 PRK06816 PRK06816 StlD/DarB family beta-ketosynthase. 378
10466 235867 PRK06819 PRK06819 FliC/FljB family flagellin. 376
10467 180712 PRK06820 PRK06820 EscN/YscN/HrcN family type III secretion system ATPase. 440
10468 136070 PRK06823 PRK06823 ornithine cyclodeaminase family protein. 315
10469 168689 PRK06824 PRK06824 translation initiation factor Sui1; Validated 118
10470 235868 PRK06826 dnaE DNA polymerase III DnaE; Reviewed 1151
10471 180714 PRK06827 PRK06827 phosphoribosylpyrophosphate synthetase; Provisional 382
10472 180715 PRK06828 PRK06828 amidase; Provisional 491
10473 235869 PRK06830 PRK06830 ATP-dependent 6-phosphofructokinase. 443
10474 180717 PRK06833 PRK06833 L-fuculose-phosphate aldolase. 214
10475 235870 PRK06834 PRK06834 hypothetical protein; Provisional 488
10476 235871 PRK06835 PRK06835 DNA replication protein DnaC; Validated 329
10477 180720 PRK06836 PRK06836 pyridoxal phosphate-dependent aminotransferase. 394
10478 180721 PRK06837 PRK06837 ArgE/DapE family deacylase. 427
10479 168698 PRK06839 PRK06839 o-succinylbenzoate--CoA ligase. 496
10480 235872 PRK06840 PRK06840 3-oxoacyl-ACP synthase. 339
10481 180723 PRK06841 PRK06841 short chain dehydrogenase; Provisional 255
10482 180724 PRK06842 PRK06842 Fe-S-containing hydro-lyase. 185
10483 180725 PRK06843 PRK06843 inosine 5-monophosphate dehydrogenase; Validated 404
10484 235873 PRK06846 PRK06846 putative deaminase; Validated 410
10485 235874 PRK06847 PRK06847 hypothetical protein; Provisional 375
10486 235875 PRK06848 PRK06848 cytidine deaminase. 139
10487 235876 PRK06849 PRK06849 hypothetical protein; Provisional 389
10488 235877 PRK06850 PRK06850 hypothetical protein; Provisional 507
10489 235878 PRK06851 PRK06851 hypothetical protein; Provisional 367
10490 180731 PRK06852 PRK06852 aldolase; Validated 304
10491 180732 PRK06853 PRK06853 indolepyruvate oxidoreductase subunit beta; Reviewed 197
10492 235879 PRK06854 PRK06854 adenylyl-sulfate reductase subunit alpha. 608
10493 180734 PRK06855 PRK06855 pyridoxal phosphate-dependent aminotransferase. 433
10494 180735 PRK06856 PRK06856 DNA polymerase III subunit psi; Validated 128
10495 235880 PRK06860 PRK06860 lipid A biosynthesis lauroyl acyltransferase; Provisional 309
10496 136097 PRK06863 PRK06863 single-stranded DNA-binding protein; Provisional 168
10497 235881 PRK06870 secG preprotein translocase subunit SecG; Reviewed 76
10498 180738 PRK06871 PRK06871 DNA polymerase III subunit delta'; Validated 325
10499 180739 PRK06876 PRK06876 F0F1 ATP synthase subunit C; Validated 78
10500 168717 PRK06882 PRK06882 acetolactate synthase 3 large subunit. 574
10501 180740 PRK06886 PRK06886 hypothetical protein; Validated 329
10502 168719 PRK06893 PRK06893 DnaA regulatory inactivator Hda. 229
10503 235882 PRK06895 PRK06895 anthranilate synthase component II. 190
10504 235883 PRK06901 PRK06901 oxidoreductase. 322
10505 136106 PRK06904 PRK06904 replicative DNA helicase; Validated 472
10506 180742 PRK06911 rpsN 30S ribosomal protein S14; Reviewed 100
10507 180743 PRK06912 acoL dihydrolipoamide dehydrogenase; Validated 458
10508 180744 PRK06914 PRK06914 SDR family oxidoreductase. 280
10509 180745 PRK06915 PRK06915 peptidase. 422
10510 180746 PRK06916 PRK06916 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 460
10511 235884 PRK06917 PRK06917 aspartate aminotransferase family protein. 447
10512 235885 PRK06918 PRK06918 4-aminobutyrate aminotransferase; Reviewed 451
10513 180749 PRK06920 dnaE DNA polymerase III subunit alpha. 1107
10514 180750 PRK06921 PRK06921 hypothetical protein; Provisional 266
10515 180751 PRK06922 PRK06922 class I SAM-dependent methyltransferase. 677
10516 235886 PRK06923 PRK06923 isochorismate synthase DhbC; Validated 399
10517 180753 PRK06924 PRK06924 (S)-benzoin forming benzil reductase. 251
10518 235887 PRK06925 PRK06925 flagellar motor protein MotB. 230
10519 180755 PRK06926 PRK06926 flagellar motor protein MotP; Reviewed 271
10520 235888 PRK06928 PRK06928 pyrroline-5-carboxylate reductase; Reviewed 277
10521 180757 PRK06930 PRK06930 positive control sigma-like factor; Validated 170
10522 235889 PRK06931 PRK06931 diaminobutyrate--2-oxoglutarate transaminase. 459
10523 235890 PRK06932 PRK06932 2-hydroxyacid dehydrogenase. 314
10524 235891 PRK06933 PRK06933 YscQ/HrcQ family type III secretion apparatus protein. 308
10525 180760 PRK06934 PRK06934 flavodoxin; Provisional 221
10526 180761 PRK06935 PRK06935 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 258
10527 180762 PRK06936 PRK06936 EscN/YscN/HrcN family type III secretion system ATPase. 439
10528 180763 PRK06937 PRK06937 HrpE/YscL family type III secretion apparatus protein. 204
10529 235892 PRK06938 PRK06938 diaminobutyrate--2-oxoglutarate aminotransferase; Provisional 464
10530 235893 PRK06939 PRK06939 2-amino-3-ketobutyrate coenzyme A ligase; Provisional 397
10531 180766 PRK06940 PRK06940 short chain dehydrogenase; Provisional 275
10532 235894 PRK06943 PRK06943 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 453
10533 180768 PRK06944 PRK06944 sulfur carrier protein ThiS; Provisional 65
10534 235895 PRK06945 flgK flagellar hook-associated protein FlgK; Validated 651
10535 180770 PRK06946 PRK06946 lipid A biosynthesis lauroyl acyltransferase; Provisional 293
10536 180771 PRK06947 PRK06947 SDR family oxidoreductase. 248
10537 180772 PRK06948 PRK06948 ribonucleotide reductase-like protein; Provisional 595
10538 180773 PRK06949 PRK06949 SDR family oxidoreductase. 258
10539 180774 PRK06953 PRK06953 SDR family oxidoreductase. 222
10540 180775 PRK06954 PRK06954 acetyl-CoA C-acetyltransferase. 397
10541 235896 PRK06955 PRK06955 biotin--[acetyl-CoA-carboxylase] ligase. 300
10542 180777 PRK06958 PRK06958 single-stranded DNA-binding protein; Provisional 182
10543 235897 PRK06959 PRK06959 threonine-phosphate decarboxylase. 339
10544 235898 PRK06964 PRK06964 DNA polymerase III subunit delta'; Validated 342
10545 180780 PRK06965 PRK06965 acetolactate synthase 3 catalytic subunit; Validated 587
10546 180781 PRK06973 PRK06973 nicotinate-nucleotide adenylyltransferase. 243
10547 235899 PRK06975 PRK06975 bifunctional uroporphyrinogen-III synthetase/uroporphyrin-III C-methyltransferase; Reviewed 656
10548 235900 PRK06978 PRK06978 carboxylating nicotinate-nucleotide diphosphorylase. 294
10549 235901 PRK06986 fliA flagellar biosynthesis sigma factor; Validated 236
10550 235902 PRK06988 PRK06988 formyltransferase. 312
10551 235903 PRK06991 PRK06991 electron transport complex subunit RsxB. 270
10552 235904 PRK06995 flhF flagellar biosynthesis protein FlhF. 484
10553 235905 PRK06996 PRK06996 UbiH/UbiF/VisC/COQ6 family ubiquinone biosynthesis hydroxylase. 398
10554 180789 PRK06997 PRK06997 enoyl-[acyl-carrier-protein] reductase FabI. 260
10555 235906 PRK07003 PRK07003 DNA polymerase III subunit gamma/tau. 830
10556 235907 PRK07004 PRK07004 replicative DNA helicase; Provisional 460
10557 180792 PRK07006 PRK07006 isocitrate dehydrogenase; Reviewed 409
10558 235908 PRK07008 PRK07008 long-chain-fatty-acid--CoA ligase; Validated 539
10559 180794 PRK07018 flgA flagellar basal body P-ring formation protein FlgA. 235
10560 235909 PRK07021 fliL flagellar basal body-associated protein FliL; Reviewed 162
10561 180796 PRK07023 PRK07023 SDR family oxidoreductase. 243
10562 235910 PRK07024 PRK07024 SDR family oxidoreductase. 257
10563 235911 PRK07027 PRK07027 cobalamin biosynthesis protein CbiG; Provisional 126
10564 235912 PRK07028 PRK07028 bifunctional hexulose-6-phosphate synthase/ribonuclease regulator; Validated 430
10565 180800 PRK07030 PRK07030 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 466
10566 180801 PRK07033 PRK07033 DotU family type VI secretion system protein. 427
10567 168775 PRK07034 PRK07034 hypothetical protein; Provisional 536
10568 180802 PRK07035 PRK07035 SDR family oxidoreductase. 252
10569 235913 PRK07036 PRK07036 aminotransferase. 466
10570 180803 PRK07037 PRK07037 extracytoplasmic-function sigma-70 factor; Validated 163
10571 235914 PRK07041 PRK07041 SDR family oxidoreductase. 230
10572 235915 PRK07042 PRK07042 amidase; Provisional 464
10573 235916 PRK07044 PRK07044 aldolase II superfamily protein; Provisional 252
10574 136171 PRK07045 PRK07045 putative monooxygenase; Reviewed 388
10575 235917 PRK07046 PRK07046 aminotransferase; Validated 453
10576 235918 PRK07048 PRK07048 threo-3-hydroxy-L-aspartate ammonia-lyase. 321
10577 180809 PRK07049 PRK07049 cystathionine gamma-synthase family protein. 427
10578 180810 PRK07050 PRK07050 cystathionine beta-lyase; Provisional 394
10579 180811 PRK07051 PRK07051 biotin carboxyl carrier domain-containing protein. 80
10580 235919 PRK07053 PRK07053 glutamine amidotransferase; Provisional 234
10581 235920 PRK07054 PRK07054 isochorismate synthase. 475
10582 235921 PRK07056 PRK07056 amidase; Provisional 454
10583 180814 PRK07057 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 591
10584 235922 PRK07058 PRK07058 acetate/propionate family kinase. 396
10585 235923 PRK07059 PRK07059 Long-chain-fatty-acid--CoA ligase; Validated 557
10586 180817 PRK07060 PRK07060 short chain dehydrogenase; Provisional 245
10587 180818 PRK07062 PRK07062 SDR family oxidoreductase. 265
10588 235924 PRK07063 PRK07063 SDR family oxidoreductase. 260
10589 180820 PRK07064 PRK07064 thiamine pyrophosphate-binding protein. 544
10590 168796 PRK07066 PRK07066 L-carnitine dehydrogenase. 321
10591 235925 PRK07067 PRK07067 L-iditol 2-dehydrogenase. 257
10592 180822 PRK07069 PRK07069 short chain dehydrogenase; Validated 251
10593 180823 PRK07074 PRK07074 SDR family oxidoreductase. 257
10594 136191 PRK07075 PRK07075 isochorismate lyase. 101
10595 235926 PRK07077 PRK07077 phosphorylase. 238
10596 235927 PRK07078 PRK07078 hypothetical protein; Validated 759
10597 235928 PRK07079 PRK07079 hypothetical protein; Provisional 469
10598 235929 PRK07080 PRK07080 amino acid--[acyl-carrier-protein] ligase. 317
10599 180828 PRK07081 PRK07081 acyl carrier protein; Provisional 83
10600 180829 PRK07084 PRK07084 class II fructose-1,6-bisphosphate aldolase. 321
10601 235930 PRK07085 PRK07085 diphosphate--fructose-6-phosphate 1-phosphotransferase; Provisional 555
10602 180831 PRK07088 PRK07088 ribonucleoside-diphosphate reductase subunit alpha. 764
10603 180832 PRK07090 PRK07090 class II aldolase/adducin domain protein; Provisional 260
10604 235931 PRK07092 PRK07092 benzoylformate decarboxylase; Reviewed 530
10605 235932 PRK07093 PRK07093 para-aminobenzoate synthase component I; Validated 323
10606 180835 PRK07094 PRK07094 biotin synthase; Provisional 323
10607 235933 PRK07097 PRK07097 gluconate 5-dehydrogenase; Provisional 265
10608 235934 PRK07101 PRK07101 hypothetical protein; Provisional 187
10609 180838 PRK07102 PRK07102 SDR family oxidoreductase. 243
10610 180839 PRK07103 PRK07103 polyketide beta-ketoacyl:acyl carrier protein synthase; Validated 410
10611 180840 PRK07105 PRK07105 pyridoxamine kinase; Validated 284
10612 180841 PRK07106 PRK07106 phosphoribosylaminoimidazolecarboxamide formyltransferase. 390
10613 180842 PRK07107 PRK07107 IMP dehydrogenase. 502
10614 180843 PRK07108 PRK07108 acetyl-CoA C-acyltransferase. 392
10615 235935 PRK07109 PRK07109 short chain dehydrogenase; Provisional 334
10616 235936 PRK07110 PRK07110 polyketide biosynthesis enoyl-CoA hydratase; Validated 249
10617 235937 PRK07111 PRK07111 anaerobic ribonucleoside triphosphate reductase; Provisional 735
10618 235938 PRK07112 PRK07112 enoyl-CoA hydratase/isomerase. 255
10619 235939 PRK07114 PRK07114 bifunctional 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxy-phosphogluconate aldolase. 222
10620 235940 PRK07115 PRK07115 AMP nucleosidase; Provisional 258
10621 180850 PRK07116 PRK07116 flavodoxin; Provisional 160
10622 180851 PRK07117 PRK07117 acyl carrier protein; Validated 79
10623 235941 PRK07118 PRK07118 Fe-S cluster domain-containing protein. 280
10624 235942 PRK07119 PRK07119 2-ketoisovalerate ferredoxin reductase; Validated 352
10625 180854 PRK07121 PRK07121 FAD-binding protein. 492
10626 168831 PRK07122 PRK07122 RNA polymerase sigma factor SigF; Reviewed 264
10627 180855 PRK07132 PRK07132 DNA polymerase III subunit delta'; Validated 299
10628 235943 PRK07133 PRK07133 DNA polymerase III subunits gamma and tau; Validated 725
10629 235944 PRK07135 dnaE DNA polymerase III DnaE; Validated 973
10630 235945 PRK07139 PRK07139 amidase; Provisional 439
10631 235946 PRK07143 PRK07143 hypothetical protein; Provisional 279
10632 235947 PRK07152 nadD nicotinate-nucleotide adenylyltransferase. 342
10633 235948 PRK07157 PRK07157 acetate kinase; Provisional 400
10634 235949 PRK07159 PRK07159 F0F1 ATP synthase subunit C; Validated 100
10635 235950 PRK07164 PRK07164 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase; Provisional 218
10636 235951 PRK07165 PRK07165 ATP F0F1 synthase subunit alpha. 507
10637 180864 PRK07168 PRK07168 uroporphyrin-III C-methyltransferase. 474
10638 180865 PRK07178 PRK07178 acetyl-CoA carboxylase biotin carboxylase subunit. 472
10639 180866 PRK07179 PRK07179 quorum-sensing autoinducer synthase. 407
10640 75964 PRK07182 flgB flagellar basal body rod protein FlgB; Reviewed 148
10641 235952 PRK07187 PRK07187 ribonucleoside-diphosphate reductase subunit alpha. 721
10642 235953 PRK07188 PRK07188 nicotinate phosphoribosyltransferase; Provisional 352
10643 235954 PRK07189 PRK07189 malonate decarboxylase subunit beta; Reviewed 301
10644 235955 PRK07190 PRK07190 FAD-binding protein. 487
10645 180871 PRK07191 flgK flagellar hook-associated protein FlgK; Validated 456
10646 235956 PRK07192 flgL flagellar hook-associated protein FlgL; Reviewed 305
10647 235957 PRK07193 fliF flagellar MS-ring protein; Reviewed 552
10648 235958 PRK07194 fliG flagellar motor switch protein G; Reviewed 334
10649 180875 PRK07196 fliI flagellar protein export ATPase FliI. 434
10650 235959 PRK07198 PRK07198 GTP cyclohydrolase II. 418
10651 235960 PRK07199 PRK07199 ribose-phosphate diphosphokinase. 301
10652 235961 PRK07200 PRK07200 aspartate/ornithine carbamoyltransferase family protein; Validated 395
10653 235962 PRK07201 PRK07201 SDR family oxidoreductase. 657
10654 235963 PRK07203 PRK07203 putative aminohydrolase SsnA. 442
10655 235964 PRK07204 PRK07204 beta-ketoacyl-ACP synthase III. 329
10656 235965 PRK07205 PRK07205 hypothetical protein; Provisional 444
10657 180883 PRK07206 PRK07206 hypothetical protein; Provisional 416
10658 235966 PRK07207 PRK07207 ribonucleoside-diphosphate reductase subunit alpha. 965
10659 235967 PRK07208 PRK07208 hypothetical protein; Provisional 479
10660 235968 PRK07209 PRK07209 ribonucleotide-diphosphate reductase subunit beta; Validated 369
10661 180887 PRK07211 PRK07211 single-stranded DNA binding protein. 485
10662 235969 PRK07213 PRK07213 chlorohydrolase; Provisional 375
10663 235970 PRK07217 PRK07217 replication factor A; Reviewed 311
10664 180890 PRK07218 PRK07218 Single-stranded DNA binding protein. 423
10665 235971 PRK07219 PRK07219 DNA topoisomerase I; Validated 822
10666 180892 PRK07220 PRK07220 DNA topoisomerase I; Validated 740
10667 235972 PRK07225 PRK07225 DNA-directed RNA polymerase subunit B'; Validated 605
10668 235973 PRK07226 PRK07226 fructose-bisphosphate aldolase; Provisional 267
10669 180895 PRK07228 PRK07228 5'-deoxyadenosine deaminase. 445
10670 235974 PRK07229 PRK07229 aconitate hydratase; Validated 646
10671 235975 PRK07231 FabG-like SDR family oxidoreductase. 251
10672 235976 PRK07232 PRK07232 bifunctional malic enzyme oxidoreductase/phosphotransacetylase; Reviewed 752
10673 235977 PRK07233 PRK07233 hypothetical protein; Provisional 434
10674 235978 PRK07234 PRK07234 putative monovalent cation/H+ antiporter subunit D; Reviewed 470
10675 235979 PRK07235 PRK07235 amidase; Provisional 502
10676 235980 PRK07236 PRK07236 hypothetical protein; Provisional 386
10677 180903 PRK07238 PRK07238 bifunctional RNase H/acid phosphatase; Provisional 372
10678 235981 PRK07239 PRK07239 bifunctional uroporphyrinogen-III synthetase/response regulator domain protein; Validated 381
10679 180905 PRK07246 PRK07246 bifunctional ATP-dependent DNA helicase/DNA polymerase III subunit epsilon; Validated 820
10680 180906 PRK07247 PRK07247 3'-5' exonuclease. 195
10681 168880 PRK07248 PRK07248 chorismate mutase. 87
10682 180907 PRK07251 PRK07251 FAD-containing oxidoreductase. 438
10683 180908 PRK07252 PRK07252 S1 RNA-binding domain-containing protein. 120
10684 235982 PRK07259 PRK07259 dihydroorotate dehydrogenase. 301
10685 180910 PRK07260 PRK07260 enoyl-CoA hydratase; Provisional 255
10686 180911 PRK07261 PRK07261 DNA topology modulation protein. 171
10687 235983 PRK07269 PRK07269 cystathionine gamma-synthase; Reviewed 364
10688 235984 PRK07272 PRK07272 amidophosphoribosyltransferase; Provisional 484
10689 180914 PRK07274 PRK07274 single-stranded DNA-binding protein; Provisional 131
10690 180915 PRK07275 PRK07275 single-stranded DNA-binding protein; Provisional 162
10691 180916 PRK07276 PRK07276 DNA polymerase III subunit delta'; Validated 290
10692 180917 PRK07279 dnaE DNA polymerase III DnaE; Reviewed 1034
10693 180918 PRK07281 PRK07281 methionyl aminopeptidase. 286
10694 180919 PRK07282 PRK07282 acetolactate synthase large subunit. 566
10695 180920 PRK07283 PRK07283 YlxQ-related RNA-binding protein. 98
10696 180921 PRK07306 PRK07306 ribonucleotide-diphosphate reductase subunit alpha; Validated 720
10697 180922 PRK07308 PRK07308 flavodoxin; Validated 146
10698 235985 PRK07309 PRK07309 pyridoxal phosphate-dependent aminotransferase. 391
10699 235986 PRK07313 PRK07313 phosphopantothenoylcysteine decarboxylase; Validated 182
10700 235987 PRK07314 PRK07314 beta-ketoacyl-ACP synthase II. 411
10701 180926 PRK07315 PRK07315 fructose-bisphosphate aldolase; Provisional 293
10702 235988 PRK07318 PRK07318 dipeptidase PepV; Reviewed 466
10703 180928 PRK07322 PRK07322 adenine phosphoribosyltransferase; Provisional 178
10704 235989 PRK07324 PRK07324 transaminase; Validated 373
10705 235990 PRK07326 PRK07326 SDR family oxidoreductase. 237
10706 235991 PRK07327 PRK07327 enoyl-CoA hydratase/isomerase family protein. 268
10707 235992 PRK07328 PRK07328 histidinol-phosphatase; Provisional 269
10708 180933 PRK07329 PRK07329 hypothetical protein; Provisional 246
10709 235993 PRK07331 PRK07331 cobalt transporter CbiM. 322
10710 180935 PRK07333 PRK07333 ubiquinone biosynthesis hydroxylase. 403
10711 235994 PRK07334 PRK07334 threonine dehydratase; Provisional 403
10712 180937 PRK07337 PRK07337 pyridoxal phosphate-dependent aminotransferase. 388
10713 235995 PRK07338 PRK07338 hydrolase. 402
10714 235996 PRK07340 PRK07340 delta(1)-pyrroline-2-carboxylate reductase family protein. 304
10715 235997 PRK07342 PRK07342 peptide chain release factor 2; Provisional 339
10716 235998 PRK07349 PRK07349 amidophosphoribosyltransferase; Provisional 500
10717 180941 PRK07352 PRK07352 F0F1 ATP synthase subunit B; Validated 174
10718 235999 PRK07353 PRK07353 F0F1 ATP synthase subunit B'; Validated 140
10719 180942 PRK07354 PRK07354 F0F1 ATP synthase subunit C; Validated 81
10720 236000 PRK07360 PRK07360 FO synthase subunit 2; Reviewed 371
10721 180944 PRK07362 PRK07362 NADP-dependent isocitrate dehydrogenase. 474
10722 180945 PRK07363 PRK07363 NADH-quinone oxidoreductase subunit M. 501
10723 236001 PRK07364 PRK07364 FAD-dependent hydroxylase. 415
10724 180947 PRK07366 PRK07366 LL-diaminopimelate aminotransferase. 388
10725 236002 PRK07369 PRK07369 dihydroorotase; Provisional 418
10726 180949 PRK07370 PRK07370 enoyl-[acyl-carrier-protein] reductase FabI. 258
10727 236003 PRK07373 PRK07373 DNA polymerase III subunit alpha; Reviewed 449
10728 168927 PRK07374 dnaE DNA polymerase III subunit alpha; Validated 1170
10729 236004 PRK07375 PRK07375 Na+/H+ antiporter subunit C. 112
10730 236005 PRK07376 PRK07376 NADH-quinone oxidoreductase subunit L. 673
10731 236006 PRK07377 PRK07377 hypothetical protein; Provisional 184
10732 236007 PRK07379 PRK07379 coproporphyrinogen III oxidase; Provisional 400
10733 180954 PRK07380 PRK07380 adenylosuccinate lyase; Provisional 431
10734 236008 PRK07390 PRK07390 NAD(P)H-quinone oxidoreductase subunit F; Validated 613
10735 236009 PRK07392 PRK07392 threonine-phosphate decarboxylase; Validated 360
10736 168934 PRK07394 PRK07394 hypothetical protein; Provisional 342
10737 236010 PRK07395 PRK07395 L-aspartate oxidase; Provisional 553
10738 180958 PRK07396 PRK07396 dihydroxynaphthoic acid synthetase; Validated 273
10739 236011 PRK07399 PRK07399 DNA polymerase III subunit delta'; Validated 314
10740 180960 PRK07400 PRK07400 30S ribosomal protein S1; Reviewed 318
10741 180961 PRK07402 PRK07402 precorrin-6Y C5,15-methyltransferase subunit CbiT. 196
10742 180962 PRK07403 PRK07403 type I glyceraldehyde-3-phosphate dehydrogenase. 337
10743 180963 PRK07405 PRK07405 RNA polymerase sigma factor SigD; Validated 317
10744 236012 PRK07406 PRK07406 RNA polymerase sigma factor RpoD; Validated 373
10745 180965 PRK07408 PRK07408 RNA polymerase sigma factor SigF; Reviewed 256
10746 236013 PRK07409 PRK07409 threonine synthase; Validated 353
10747 180967 PRK07411 PRK07411 molybdopterin-synthase adenylyltransferase MoeB. 390
10748 180968 PRK07413 PRK07413 cob(I)yrinic acid a,c-diamide adenosyltransferase. 382
10749 168945 PRK07414 PRK07414 P-loop NTPase family protein. 178
10750 180969 PRK07415 PRK07415 NAD(P)H-quinone oxidoreductase subunit H; Validated 394
10751 180970 PRK07417 PRK07417 prephenate/arogenate dehydrogenase. 279
10752 236014 PRK07418 PRK07418 acetolactate synthase large subunit. 616
10753 236015 PRK07419 PRK07419 2-carboxy-1,4-naphthoquinone phytyltransferase. 304
10754 236016 PRK07424 PRK07424 bifunctional sterol desaturase/short chain dehydrogenase; Validated 406
10755 236017 PRK07428 PRK07428 carboxylating nicotinate-nucleotide diphosphorylase. 288
10756 180975 PRK07429 PRK07429 phosphoribulokinase; Provisional 327
10757 236018 PRK07431 PRK07431 aspartate kinase; Provisional 587
10758 180977 PRK07432 PRK07432 S-methyl-5'-thioadenosine phosphorylase. 290
10759 180978 PRK07440 PRK07440 thiamine biosynthesis protein ThiS. 70
10760 236019 PRK07445 PRK07445 O-succinylbenzoic acid--CoA ligase; Reviewed 452
10761 236020 PRK07449 PRK07449 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase; Validated 568
10762 236021 PRK07451 PRK07451 translation initiation factor. 115
10763 180982 PRK07452 PRK07452 DNA polymerase III subunit delta; Validated 326
10764 180983 PRK07453 PRK07453 protochlorophyllide reductase. 322
10765 180984 PRK07454 PRK07454 SDR family oxidoreductase. 241
10766 180985 PRK07455 PRK07455 bifunctional 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxy-phosphogluconate aldolase. 187
10767 180986 PRK07459 PRK07459 single-stranded DNA-binding protein; Provisional 121
10768 180987 PRK07468 PRK07468 crotonase/enoyl-CoA hydratase family protein. 262
10769 180988 PRK07470 PRK07470 acyl-CoA synthetase; Validated 528
10770 236022 PRK07471 PRK07471 DNA polymerase III subunit delta'; Validated 365
10771 168961 PRK07473 PRK07473 M20/M25/M40 family metallo-hydrolase. 376
10772 236023 PRK07474 PRK07474 sulfur oxidation protein SoxY; Provisional 154
10773 236024 PRK07475 PRK07475 hypothetical protein; Provisional 245
10774 236025 PRK07476 eutB threonine dehydratase; Provisional 322
10775 180993 PRK07478 PRK07478 short chain dehydrogenase; Provisional 254
10776 180994 PRK07480 PRK07480 putative aminotransferase; Validated 456
10777 168967 PRK07481 PRK07481 hypothetical protein; Provisional 449
10778 236026 PRK07482 PRK07482 hypothetical protein; Provisional 461
10779 236027 PRK07483 PRK07483 aspartate aminotransferase family protein. 443
10780 236028 PRK07486 PRK07486 amidase; Provisional 484
10781 236029 PRK07487 PRK07487 amidase; Provisional 469
10782 236030 PRK07488 PRK07488 indoleacetamide hydrolase. 472
10783 236031 PRK07490 PRK07490 hypothetical protein; Provisional 245
10784 181000 PRK07492 PRK07492 adenylosuccinate lyase; Provisional 435
10785 181001 PRK07494 PRK07494 UbiH/UbiF family hydroxylase. 388
10786 236032 PRK07495 PRK07495 4-aminobutyrate--2-oxoglutarate transaminase. 425
10787 236033 PRK07500 rpoH2 RNA polymerase factor sigma-32; Reviewed 289
10788 236034 PRK07502 PRK07502 prephenate/arogenate dehydrogenase family protein. 307
10789 181005 PRK07503 PRK07503 methionine gamma-lyase; Provisional 403
10790 168979 PRK07504 PRK07504 O-succinylhomoserine sulfhydrylase; Reviewed 398
10791 181006 PRK07505 PRK07505 hypothetical protein; Provisional 402
10792 236035 PRK07508 PRK07508 aminodeoxychorismate synthase component I. 378
10793 181008 PRK07509 PRK07509 crotonase/enoyl-CoA hydratase family protein. 262
10794 181009 PRK07511 PRK07511 enoyl-CoA hydratase; Provisional 260
10795 236036 PRK07512 PRK07512 L-aspartate oxidase; Provisional 513
10796 181011 PRK07514 PRK07514 malonyl-CoA synthase; Validated 504
10797 236037 PRK07515 PRK07515 3-oxoacyl-(acyl carrier protein) synthase III; Reviewed 372
10798 181013 PRK07516 PRK07516 thiolase domain-containing protein. 389
10799 236038 PRK07521 flgK flagellar hook-associated protein FlgK; Validated 483
10800 236039 PRK07522 PRK07522 acetylornithine deacetylase; Provisional 385
10801 236040 PRK07523 PRK07523 gluconate 5-dehydrogenase; Provisional 255
10802 236041 PRK07524 PRK07524 5-guanidino-2-oxopentanoate decarboxylase. 535
10803 236042 PRK07525 PRK07525 sulfoacetaldehyde acetyltransferase; Validated 588
10804 236043 PRK07529 PRK07529 AMP-binding domain protein; Validated 632
10805 181018 PRK07530 PRK07530 3-hydroxybutyryl-CoA dehydrogenase; Validated 292
10806 236044 PRK07531 PRK07531 carnitine 3-dehydrogenase. 495
10807 181020 PRK07533 PRK07533 enoyl-[acyl-carrier-protein] reductase FabI. 258
10808 236045 PRK07534 PRK07534 betaine--homocysteine S-methyltransferase. 336
10809 181022 PRK07535 PRK07535 methyltetrahydrofolate:corrinoid/iron-sulfur protein methyltransferase; Validated 261
10810 236046 PRK07538 PRK07538 hypothetical protein; Provisional 413
10811 181024 PRK07539 PRK07539 NADH-quinone oxidoreductase subunit NuoE. 154
10812 181025 PRK07544 PRK07544 branched-chain amino acid aminotransferase; Validated 292
10813 169002 PRK07546 PRK07546 hypothetical protein; Provisional 209
10814 181026 PRK07550 PRK07550 aminotransferase. 386
10815 181027 PRK07558 PRK07558 F0F1 ATP synthase subunit C; Validated 74
10816 181028 PRK07559 PRK07559 2'-deoxycytidine 5'-triphosphate deaminase; Provisional 365
10817 236047 PRK07560 PRK07560 elongation factor EF-2; Reviewed 731
10818 236048 PRK07561 PRK07561 DNA topoisomerase I subunit omega; Validated 859
10819 236049 PRK07562 PRK07562 vitamin B12-dependent ribonucleotide reductase. 1220
10820 236050 PRK07564 PRK07564 phosphoglucomutase; Validated 543
10821 236051 PRK07565 PRK07565 dihydroorotate dehydrogenase-like protein. 334
10822 236052 PRK07566 PRK07566 chlorophyll synthase ChlG. 314
10823 181035 PRK07567 PRK07567 glutamine amidotransferase; Provisional 242
10824 181036 PRK07568 PRK07568 pyridoxal phosphate-dependent aminotransferase. 397
10825 181037 PRK07569 PRK07569 bidirectional hydrogenase complex protein HoxU; Validated 234
10826 181038 PRK07570 PRK07570 succinate dehydrogenase/fumarate reductase iron-sulfur subunit; Validated 250
10827 236053 PRK07571 PRK07571 bidirectional hydrogenase complex protein HoxE; Reviewed 169
10828 181039 PRK07572 PRK07572 cytosine deaminase; Validated 426
10829 236054 PRK07573 sdhA fumarate reductase/succinate dehydrogenase flavoprotein subunit. 640
10830 181041 PRK07574 PRK07574 NAD-dependent formate dehydrogenase. 385
10831 236055 PRK07575 PRK07575 dihydroorotase; Provisional 438
10832 236056 PRK07576 PRK07576 short chain dehydrogenase; Provisional 264
10833 181044 PRK07577 PRK07577 SDR family oxidoreductase. 234
10834 236057 PRK07578 PRK07578 short chain dehydrogenase; Provisional 199
10835 236058 PRK07579 PRK07579 dTDP-4-amino-4,6-dideoxyglucose formyltransferase. 245
10836 236059 PRK07580 PRK07580 Mg-protoporphyrin IX methyl transferase; Validated 230
10837 236060 PRK07581 PRK07581 hypothetical protein; Validated 339
10838 236061 PRK07582 PRK07582 cystathionine gamma-lyase; Validated 366
10839 236062 PRK07583 PRK07583 cytosine deaminase. 438
10840 236063 PRK07586 PRK07586 acetolactate synthase large subunit. 514
10841 169028 PRK07588 PRK07588 FAD-binding domain. 391
10842 236064 PRK07589 PRK07589 ornithine cyclodeaminase; Validated 346
10843 181053 PRK07590 PRK07590 L,L-diaminopimelate aminotransferase; Validated 409
10844 236065 PRK07591 PRK07591 threonine synthase; Validated 421
10845 136438 PRK07594 PRK07594 EscN/YscN/HrcN family type III secretion system ATPase. 433
10846 236066 PRK07597 secE preprotein translocase subunit SecE; Reviewed 64
10847 236067 PRK07598 PRK07598 RNA polymerase sigma factor SigC; Validated 415
10848 181057 PRK07608 PRK07608 UbiH/UbiF family hydroxylase. 388
10849 181058 PRK07609 PRK07609 CDP-6-deoxy-delta-3,4-glucoseen reductase; Validated 339
10850 181059 PRK07627 PRK07627 dihydroorotase; Provisional 425
10851 236068 PRK07630 PRK07630 CobD/CbiB family protein; Provisional 312
10852 181061 PRK07631 PRK07631 amidophosphoribosyltransferase; Provisional 475
10853 236069 PRK07632 PRK07632 ribonucleotide-diphosphate reductase subunit alpha; Validated 699
10854 181063 PRK07634 PRK07634 pyrroline-5-carboxylate reductase; Reviewed 245
10855 236070 PRK07636 ligB ATP-dependent DNA ligase; Reviewed 275
10856 236071 PRK07638 PRK07638 acyl-CoA synthetase; Validated 487
10857 181065 PRK07639 PRK07639 petrobactin biosynthesis protein AsbD. 86
10858 181066 PRK07649 PRK07649 aminodeoxychorismate/anthranilate synthase component II. 195
10859 181067 PRK07650 PRK07650 4-amino-4-deoxychorismate lyase; Provisional 283
10860 236072 PRK07656 PRK07656 long-chain-fatty-acid--CoA ligase; Validated 513
10861 181069 PRK07657 PRK07657 enoyl-CoA hydratase; Provisional 260
10862 181070 PRK07658 PRK07658 enoyl-CoA hydratase; Provisional 257
10863 236073 PRK07659 PRK07659 enoyl-CoA hydratase; Provisional 260
10864 181072 PRK07661 PRK07661 acetyl-CoA C-acetyltransferase. 391
10865 236074 PRK07666 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 239
10866 169051 PRK07667 PRK07667 uridine kinase; Provisional 193
10867 181074 PRK07668 PRK07668 hypothetical protein; Validated 254
10868 181075 PRK07670 PRK07670 FliA/WhiG family RNA polymerase sigma factor. 251
10869 181076 PRK07671 PRK07671 bifunctional cystathionine gamma-lyase/homocysteine desulfhydrase. 377
10870 181077 PRK07677 PRK07677 short chain dehydrogenase; Provisional 252
10871 181078 PRK07678 PRK07678 aminotransferase; Validated 451
10872 181079 PRK07679 PRK07679 pyrroline-5-carboxylate reductase; Reviewed 279
10873 181080 PRK07680 PRK07680 late competence protein ComER; Validated 273
10874 181081 PRK07681 PRK07681 LL-diaminopimelate aminotransferase. 399
10875 181082 PRK07682 PRK07682 aminotransferase. 378
10876 236075 PRK07683 PRK07683 aminotransferase A; Validated 387
10877 181084 PRK07688 PRK07688 thiamine/molybdopterin biosynthesis ThiF/MoeB-like protein; Validated 339
10878 181085 PRK07691 PRK07691 putative monovalent cation/H+ antiporter subunit D; Reviewed 496
10879 181086 PRK07695 PRK07695 thiazole tautomerase TenI. 201
10880 169065 PRK07696 PRK07696 thiamine biosynthesis protein ThiS. 67
10881 181087 PRK07701 flgL flagellar hook-associated protein FlgL; Validated 298
10882 181088 PRK07708 PRK07708 hypothetical protein; Validated 219
10883 169068 PRK07709 PRK07709 fructose-bisphosphate aldolase; Provisional 285
10884 236076 PRK07710 PRK07710 acetolactate synthase large subunit. 571
10885 236077 PRK07714 PRK07714 YlxQ family RNA-binding protein. 100
10886 181090 PRK07718 fliL flagellar basal body-associated protein FliL; Reviewed 142
10887 181091 PRK07720 fliJ flagellar biosynthesis chaperone FliJ. 146
10888 181092 PRK07721 fliI flagellar protein export ATPase FliI. 438
10889 236078 PRK07726 PRK07726 DNA topoisomerase 3. 658
10890 236079 PRK07729 PRK07729 glyceraldehyde-3-phosphate dehydrogenase; Validated 343
10891 236080 PRK07734 motB flagellar motor protein MotB; Reviewed 259
10892 236081 PRK07735 PRK07735 NADH-quinone oxidoreductase subunit C. 430
10893 236082 PRK07737 fliD flagellar hook-associated protein 2. 501
10894 236083 PRK07738 PRK07738 flagellar protein FlaG; Provisional 117
10895 236084 PRK07739 flgK flagellar hook-associated protein FlgK; Validated 507
10896 236085 PRK07740 PRK07740 hypothetical protein; Provisional 244
10897 236086 PRK07742 PRK07742 phosphate butyryltransferase; Validated 299
10898 236087 PRK07748 PRK07748 3'-5' exonuclease KapD. 207
10899 169084 PRK07756 PRK07756 NADH-quinone oxidoreductase subunit A. 122
10900 236088 PRK07757 PRK07757 N-acetyltransferase. 152
10901 181104 PRK07758 PRK07758 hypothetical protein; Provisional 95
10902 236089 PRK07761 PRK07761 DNA polymerase III subunit beta; Validated 376
10903 236090 PRK07764 PRK07764 DNA polymerase III subunits gamma and tau; Validated 824
10904 181107 PRK07765 PRK07765 aminodeoxychorismate/anthranilate synthase component II. 214
10905 236091 PRK07768 PRK07768 long-chain-fatty-acid--CoA ligase; Validated 545
10906 181109 PRK07769 PRK07769 long-chain-fatty-acid--CoA ligase; Validated 631
10907 236092 PRK07772 PRK07772 single-stranded DNA-binding protein; Provisional 186
10908 236093 PRK07773 PRK07773 replicative DNA helicase; Validated 886
10909 236094 PRK07774 PRK07774 SDR family oxidoreductase. 250
10910 181113 PRK07775 PRK07775 SDR family oxidoreductase. 274
10911 236095 PRK07777 PRK07777 putative succinyldiaminopimelate transaminase DapC. 387
10912 181115 PRK07785 PRK07785 NADH dehydrogenase subunit C; Provisional 235
10913 169098 PRK07786 PRK07786 long-chain-fatty-acid--CoA ligase; Validated 542
10914 236096 PRK07787 PRK07787 acyl-CoA synthetase; Validated 471
10915 236097 PRK07788 PRK07788 acyl-CoA synthetase; Validated 549
10916 236098 PRK07789 PRK07789 acetolactate synthase 1 catalytic subunit; Validated 612
10917 236099 PRK07791 PRK07791 short chain dehydrogenase; Provisional 286
10918 181120 PRK07792 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 306
10919 236100 PRK07798 PRK07798 acyl-CoA synthetase; Validated 533
10920 181122 PRK07799 PRK07799 crotonase/enoyl-CoA hydratase family protein. 263
10921 181123 PRK07801 PRK07801 acetyl-CoA C-acetyltransferase. 382
10922 236101 PRK07803 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 626
10923 236102 PRK07804 PRK07804 L-aspartate oxidase; Provisional 541
10924 181126 PRK07806 PRK07806 SDR family oxidoreductase. 248
10925 181127 PRK07807 PRK07807 GuaB1 family IMP dehydrogenase-related protein. 479
10926 236103 PRK07810 PRK07810 O-succinylhomoserine sulfhydrylase; Provisional 403
10927 236104 PRK07811 PRK07811 cystathionine gamma-synthase; Provisional 388
10928 236105 PRK07812 PRK07812 O-acetylhomoserine aminocarboxypropyltransferase; Validated 436
10929 181131 PRK07814 PRK07814 SDR family oxidoreductase. 263
10930 236106 PRK07818 PRK07818 dihydrolipoamide dehydrogenase; Reviewed 466
10931 181133 PRK07819 PRK07819 3-hydroxybutyryl-CoA dehydrogenase; Validated 286
10932 236107 PRK07823 PRK07823 S-methyl-5'-thioadenosine phosphorylase. 264
10933 236108 PRK07824 PRK07824 o-succinylbenzoate--CoA ligase. 358
10934 181136 PRK07825 PRK07825 short chain dehydrogenase; Provisional 273
10935 236109 PRK07827 PRK07827 enoyl-CoA hydratase family protein. 260
10936 236110 PRK07831 PRK07831 SDR family oxidoreductase. 262
10937 181139 PRK07832 PRK07832 SDR family oxidoreductase. 272
10938 236111 PRK07843 PRK07843 3-oxosteroid 1-dehydrogenase. 557
10939 236112 PRK07845 PRK07845 flavoprotein disulfide reductase; Reviewed 466
10940 181142 PRK07846 PRK07846 mycothione reductase; Reviewed 451
10941 236113 PRK07847 PRK07847 amidophosphoribosyltransferase; Provisional 510
10942 236114 PRK07849 PRK07849 aminodeoxychorismate lyase. 292
10943 181145 PRK07850 PRK07850 steroid 3-ketoacyl-CoA thiolase. 387
10944 181146 PRK07851 PRK07851 acetyl-CoA C-acetyltransferase. 406
10945 236115 PRK07854 PRK07854 enoyl-CoA hydratase; Provisional 243
10946 181147 PRK07855 PRK07855 lipid-transfer protein; Provisional 386
10947 236116 PRK07856 PRK07856 SDR family oxidoreductase. 252
10948 236117 PRK07857 PRK07857 chorismate mutase. 106
10949 236118 PRK07860 PRK07860 NADH dehydrogenase subunit G; Validated 797
10950 236119 PRK07865 PRK07865 N-succinyldiaminopimelate aminotransferase; Reviewed 364
10951 236120 PRK07867 PRK07867 acyl-CoA synthetase; Validated 529
10952 236121 PRK07868 PRK07868 acyl-CoA synthetase; Validated 994
10953 181154 PRK07869 PRK07869 amidase; Provisional 468
10954 169138 PRK07874 PRK07874 ATP synthase F0 subunit C. 80
10955 236122 PRK07877 PRK07877 Rv1355c family protein. 722
10956 181156 PRK07878 PRK07878 molybdopterin biosynthesis-like protein MoeZ; Validated 392
10957 236123 PRK07883 PRK07883 DEDD exonuclease domain-containing protein. 557
10958 236124 PRK07889 PRK07889 enoyl-[acyl-carrier-protein] reductase FabI. 256
10959 181159 PRK07890 PRK07890 short chain dehydrogenase; Provisional 258
10960 236125 PRK07896 PRK07896 carboxylating nicotinate-nucleotide diphosphorylase. 289
10961 236126 PRK07899 rpsA 30S ribosomal protein S1; Reviewed 486
10962 181162 PRK07904 PRK07904 decaprenylphospho-beta-D-erythro-pentofuranosid-2-ulose 2-reductase. 253
10963 181163 PRK07906 PRK07906 hypothetical protein; Provisional 426
10964 236127 PRK07907 PRK07907 hypothetical protein; Provisional 449
10965 236128 PRK07908 PRK07908 threonine-phosphate decarboxylase. 349
10966 236129 PRK07910 PRK07910 beta-ketoacyl-ACP synthase. 418
10967 169151 PRK07912 PRK07912 salicylate synthase. 449
10968 236130 PRK07914 PRK07914 hypothetical protein; Reviewed 320
10969 236131 PRK07920 PRK07920 lipid A biosynthesis lauroyl acyltransferase; Provisional 298
10970 181169 PRK07921 PRK07921 RNA polymerase sigma factor SigB; Reviewed 324
10971 236132 PRK07922 PRK07922 amino-acid N-acetyltransferase. 169
10972 181171 PRK07928 PRK07928 NADH dehydrogenase subunit A; Validated 119
10973 236133 PRK07933 PRK07933 dTMP kinase. 213
10974 181173 PRK07937 PRK07937 lipid-transfer protein; Provisional 352
10975 181174 PRK07938 PRK07938 enoyl-CoA hydratase family protein. 249
10976 236134 PRK07940 PRK07940 DNA polymerase III subunit delta'; Validated 394
10977 181176 PRK07942 PRK07942 DNA polymerase III subunit epsilon; Provisional 232
10978 236135 PRK07945 PRK07945 PHP domain-containing protein. 335
10979 236136 PRK07946 PRK07946 putative monovalent cation/H+ antiporter subunit C; Reviewed 163
10980 181179 PRK07948 PRK07948 putative monovalent cation/H+ antiporter subunit F; Reviewed 86
10981 181180 PRK07952 PRK07952 DNA replication protein DnaC; Validated 244
10982 236137 PRK07956 ligA NAD-dependent DNA ligase LigA; Validated 665
10983 181182 PRK07960 fliI flagellum-specific ATP synthase FliI. 455
10984 181183 PRK07963 fliN flagellar motor switch protein FliN; Validated 137
10985 181184 PRK07967 PRK07967 beta-ketoacyl-ACP synthase I. 406
10986 181185 PRK07979 PRK07979 acetolactate synthase 3 large subunit. 574
10987 181186 PRK07983 PRK07983 exodeoxyribonuclease X; Provisional 219
10988 181187 PRK07984 PRK07984 enoyl-ACP reductase FabI. 262
10989 181188 PRK07985 PRK07985 SDR family oxidoreductase. 294
10990 181189 PRK07986 PRK07986 adenosylmethionine--8-amino-7-oxononanoate transaminase; Validated 428
10991 181190 PRK07993 PRK07993 DNA polymerase III subunit delta'; Validated 334
10992 236138 PRK07994 PRK07994 DNA polymerase III subunits gamma and tau; Validated 647
10993 181192 PRK07998 gatY class II aldolase. 283
10994 169179 PRK08005 PRK08005 ribulose-phosphate 3 epimerase family protein. 210
10995 181193 PRK08006 PRK08006 replicative DNA helicase DnaB. 471
10996 181194 PRK08007 PRK08007 aminodeoxychorismate synthase component 2. 187
10997 181195 PRK08008 caiC putative crotonobetaine/carnitine-CoA ligase; Validated 517
10998 181196 PRK08010 PRK08010 pyridine nucleotide-disulfide oxidoreductase; Provisional 441
10999 236139 PRK08013 PRK08013 oxidoreductase; Provisional 400
11000 181198 PRK08017 PRK08017 SDR family oxidoreductase. 256
11001 181199 PRK08020 ubiF 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Reviewed 391
11002 181200 PRK08025 PRK08025 kdo(2)-lipid IV(A) palmitoleoyltransferase. 305
11003 236140 PRK08026 PRK08026 FliC/FljB family flagellin. 529
11004 181202 PRK08027 flgL flagellar hook-filament junction protein FlgL. 317
11005 236141 PRK08032 fliD flagellar capping protein; Reviewed 462
11006 181204 PRK08035 PRK08035 YscQ/HrcQ family type III secretion apparatus protein. 323
11007 181205 PRK08040 PRK08040 putative semialdehyde dehydrogenase; Provisional 336
11008 181206 PRK08042 PRK08042 formate hydrogenlyase subunit 3; Reviewed 593
11009 181207 PRK08043 PRK08043 bifunctional acyl-ACP--phospholipid O-acyltransferase/long-chain-fatty-acid--ACP ligase. 718
11010 169193 PRK08044 PRK08044 allantoinase AllB. 449
11011 169194 PRK08045 PRK08045 cystathionine gamma-synthase; Provisional 386
11012 181208 PRK08049 PRK08049 F0F1 ATP synthase subunit I; Validated 124
11013 236142 PRK08051 fre FMN reductase; Validated 232
11014 181210 PRK08053 PRK08053 sulfur carrier protein ThiS; Provisional 66
11015 236143 PRK08055 PRK08055 chorismate mutase; Provisional 181
11016 181212 PRK08056 PRK08056 threonine-phosphate decarboxylase; Provisional 356
11017 236144 PRK08057 PRK08057 cobalt-precorrin-6x reductase; Reviewed 248
11018 181214 PRK08058 PRK08058 DNA polymerase III subunit delta'; Validated 329
11019 181215 PRK08059 PRK08059 general stress protein 13; Validated 123
11020 181216 PRK08061 rpsN type Z 30S ribosomal protein S14. 61
11021 236145 PRK08063 PRK08063 enoyl-[acyl-carrier-protein] reductase FabL. 250
11022 236146 PRK08064 PRK08064 cystathionine beta-lyase MetC. 390
11023 181219 PRK08068 PRK08068 transaminase; Reviewed 389
11024 236147 PRK08071 PRK08071 L-aspartate oxidase; Provisional 510
11025 181221 PRK08072 PRK08072 carboxylating nicotinate-nucleotide diphosphorylase. 277
11026 181222 PRK08073 flgL flagellar hook-associated protein 3. 287
11027 236148 PRK08074 PRK08074 bifunctional ATP-dependent DNA helicase/DNA polymerase III subunit epsilon; Validated 928
11028 181224 PRK08084 PRK08084 DnaA inactivator Hda. 235
11029 181225 PRK08085 PRK08085 gluconate 5-dehydrogenase; Provisional 254
11030 181226 PRK08087 PRK08087 L-fuculose-phosphate aldolase. 215
11031 236149 PRK08088 PRK08088 4-aminobutyrate--2-oxoglutarate transaminase. 425
11032 169215 PRK08091 PRK08091 ribulose-phosphate 3-epimerase; Validated 228
11033 236150 PRK08097 ligB NAD-dependent DNA ligase LigB. 562
11034 236151 PRK08099 PRK08099 multifunctional transcriptional regulator/nicotinamide-nucleotide adenylyltransferase/ribosylnicotinamide kinase NadR. 399
11035 181230 PRK08105 PRK08105 flavodoxin; Provisional 149
11036 181231 PRK08114 PRK08114 cystathionine beta-lyase; Provisional 395
11037 236152 PRK08115 PRK08115 vitamin B12-dependent ribonucleotide reductase. 858
11038 236153 PRK08116 PRK08116 hypothetical protein; Validated 268
11039 181234 PRK08117 PRK08117 aspartate aminotransferase family protein. 433
11040 181235 PRK08118 PRK08118 DNA topology modulation protein. 167
11041 236154 PRK08119 PRK08119 flagellar motor switch protein; Validated 382
11042 236155 PRK08123 PRK08123 histidinol-phosphatase HisJ. 270
11043 181238 PRK08124 PRK08124 flagellar motor protein MotA; Validated 263
11044 236156 PRK08125 PRK08125 bifunctional UDP-4-amino-4-deoxy-L-arabinose formyltransferase/UDP-glucuronic acid oxidase ArnA. 660
11045 236157 PRK08126 PRK08126 hypothetical protein; Provisional 432
11046 181241 PRK08130 PRK08130 putative aldolase; Validated 213
11047 181242 PRK08131 PRK08131 3-oxoadipyl-CoA thiolase. 401
11048 236158 PRK08132 PRK08132 FAD-dependent oxidoreductase; Provisional 547
11049 181244 PRK08133 PRK08133 O-succinylhomoserine sulfhydrylase; Validated 390
11050 236159 PRK08134 PRK08134 O-acetylhomoserine aminocarboxypropyltransferase; Validated 433
11051 236160 PRK08136 PRK08136 glycosyl transferase family protein; Provisional 317
11052 236161 PRK08137 PRK08137 amidase; Provisional 497
11053 236162 PRK08138 PRK08138 enoyl-CoA hydratase; Provisional 261
11054 181249 PRK08139 PRK08139 enoyl-CoA hydratase; Validated 266
11055 236163 PRK08140 PRK08140 enoyl-CoA hydratase; Provisional 262
11056 236164 PRK08142 PRK08142 thiolase domain-containing protein. 388
11057 236165 PRK08147 flgK flagellar hook-associated protein FlgK; Validated 547
11058 236166 PRK08149 PRK08149 FliI/YscN family ATPase. 428
11059 181254 PRK08150 PRK08150 crotonase/enoyl-CoA hydratase family protein. 255
11060 181255 PRK08153 PRK08153 pyridoxal phosphate-dependent aminotransferase. 369
11061 236167 PRK08154 PRK08154 anaerobic benzoate catabolism transcriptional regulator; Reviewed 309
11062 181257 PRK08155 PRK08155 acetolactate synthase large subunit. 564
11063 236168 PRK08156 PRK08156 EscU/YscU/HrcU family type III secretion system export apparatus switch protein. 361
11064 181259 PRK08158 PRK08158 YscQ/HrcQ family type III secretion apparatus protein. 303
11065 181260 PRK08159 PRK08159 enoyl-[acyl-carrier-protein] reductase FabI. 272
11066 236169 PRK08162 PRK08162 acyl-CoA synthetase; Validated 545
11067 181262 PRK08163 PRK08163 3-hydroxybenzoate 6-monooxygenase. 396
11068 236170 PRK08166 PRK08166 NADH-quinone oxidoreductase subunit NuoG. 791
11069 236171 PRK08168 PRK08168 NADH-quinone oxidoreductase subunit L. 516
11070 181265 PRK08170 PRK08170 acetyl-CoA C-acetyltransferase. 426
11071 181266 PRK08172 PRK08172 acyl carrier protein. 82
11072 236172 PRK08173 PRK08173 DNA topoisomerase III; Validated 862
11073 181268 PRK08175 PRK08175 aminotransferase; Validated 395
11074 181269 PRK08176 pdxK pyridoxine/pyridoxal/pyridoxamine kinase. 281
11075 236173 PRK08177 PRK08177 SDR family oxidoreductase. 225
11076 236174 PRK08178 PRK08178 acetolactate synthase 1 small subunit. 96
11077 181271 PRK08179 prfH peptide chain release factor-like protein; Reviewed 200
11078 236175 PRK08180 PRK08180 feruloyl-CoA synthase; Reviewed 614
11079 136670 PRK08181 PRK08181 transposase; Validated 269
11080 169261 PRK08182 PRK08182 single-stranded DNA-binding protein; Provisional 148
11081 236176 PRK08183 PRK08183 NADH:ubiquinone oxidoreductase subunit NDUFA12. 133
11082 181274 PRK08184 PRK08184 benzoyl-CoA-dihydrodiol lyase; Provisional 550
11083 181275 PRK08185 PRK08185 hypothetical protein; Provisional 283
11084 236177 PRK08186 PRK08186 allophanate hydrolase; Provisional 600
11085 236178 PRK08187 PRK08187 pyruvate kinase; Validated 493
11086 236179 PRK08188 PRK08188 ribonucleotide-diphosphate reductase subunit alpha; Validated 714
11087 236180 PRK08190 PRK08190 bifunctional enoyl-CoA hydratase/phosphate acetyltransferase; Validated 466
11088 169269 PRK08192 PRK08192 aspartate carbamoyltransferase; Provisional 338
11089 236181 PRK08193 araD L-ribulose-5-phosphate 4-epimerase AraD. 231
11090 181281 PRK08194 PRK08194 tartrate dehydrogenase; Provisional 352
11091 181282 PRK08195 PRK08195 4-hydroxy-2-oxovalerate/4-hydroxy-2-oxopentanoic acid aldolase; Validated 337
11092 181283 PRK08197 PRK08197 threonine synthase; Validated 394
11093 236182 PRK08198 PRK08198 threonine dehydratase; Provisional 404
11094 181285 PRK08199 PRK08199 thiamine pyrophosphate protein; Validated 557
11095 169276 PRK08201 PRK08201 dipeptidase. 456
11096 236183 PRK08202 PRK08202 purine nucleoside phosphorylase; Provisional 272
11097 236184 PRK08203 PRK08203 hydroxydechloroatrazine ethylaminohydrolase; Reviewed 451
11098 181288 PRK08204 PRK08204 hypothetical protein; Provisional 449
11099 236185 PRK08205 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 583
11100 236186 PRK08206 PRK08206 diaminopropionate ammonia-lyase; Provisional 399
11101 236187 PRK08207 PRK08207 coproporphyrinogen III oxidase; Provisional 488
11102 181292 PRK08208 PRK08208 coproporphyrinogen III oxidase family protein. 430
11103 236188 PRK08210 PRK08210 aspartate kinase I; Reviewed 403
11104 236189 PRK08211 PRK08211 YjhG/YagF family D-xylonate dehydratase. 655
11105 181295 PRK08213 PRK08213 gluconate 5-dehydrogenase; Provisional 259
11106 181296 PRK08215 PRK08215 RNA polymerase sporulation sigma factor SigG. 258
11107 181297 PRK08217 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 253
11108 181298 PRK08219 PRK08219 SDR family oxidoreductase. 227
11109 236190 PRK08220 PRK08220 2,3-dihydroxybenzoate-2,3-dehydrogenase; Validated 252
11110 181300 PRK08221 PRK08221 anaerobic sulfite reductase subunit AsrB. 263
11111 181301 PRK08222 PRK08222 hydrogenase 4 subunit H; Validated 181
11112 181302 PRK08223 PRK08223 hypothetical protein; Validated 287
11113 236191 PRK08224 ligC ATP-dependent DNA ligase; Reviewed 350
11114 181304 PRK08225 PRK08225 acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated 70
11115 181305 PRK08226 PRK08226 SDR family oxidoreductase UcpA. 263
11116 181306 PRK08227 PRK08227 3-hydroxy-5-phosphonooxypentane-2,4-dione thiolase. 264
11117 236192 PRK08228 PRK08228 L(+)-tartrate dehydratase subunit beta; Validated 204
11118 236193 PRK08229 PRK08229 2-dehydropantoate 2-reductase; Provisional 341
11119 181309 PRK08230 PRK08230 tartrate dehydratase subunit alpha; Validated 299
11120 181310 PRK08233 PRK08233 hypothetical protein; Provisional 182
11121 181311 PRK08235 PRK08235 acetyl-CoA C-acetyltransferase. 393
11122 236194 PRK08236 PRK08236 hypothetical protein; Provisional 212
11123 236195 PRK08238 PRK08238 UbiA family prenyltransferase. 479
11124 236196 PRK08241 PRK08241 RNA polymerase subunit sigma-70. 339
11125 236197 PRK08242 PRK08242 acetyl-CoA C-acetyltransferase. 402
11126 236198 PRK08243 PRK08243 4-hydroxybenzoate 3-monooxygenase; Validated 392
11127 236199 PRK08244 PRK08244 monooxygenase. 493
11128 236200 PRK08245 PRK08245 hypothetical protein; Validated 240
11129 181319 PRK08246 PRK08246 serine/threonine dehydratase. 310
11130 181320 PRK08247 PRK08247 methionine biosynthesis PLP-dependent protein. 366
11131 236201 PRK08248 PRK08248 homocysteine synthase. 431
11132 236202 PRK08249 PRK08249 cystathionine gamma-synthase family protein. 398
11133 181323 PRK08250 PRK08250 glutamine amidotransferase; Provisional 235
11134 181324 PRK08251 PRK08251 SDR family oxidoreductase. 248
11135 181325 PRK08252 PRK08252 crotonase/enoyl-CoA hydratase family protein. 254
11136 236203 PRK08255 PRK08255 bifunctional salicylyl-CoA 5-hydroxylase/oxidoreductase. 765
11137 181327 PRK08256 PRK08256 lipid-transfer protein; Provisional 391
11138 236204 PRK08257 PRK08257 acetyl-CoA acetyltransferase; Validated 498
11139 181329 PRK08258 PRK08258 enoyl-CoA hydratase family protein. 277
11140 236205 PRK08259 PRK08259 crotonase/enoyl-CoA hydratase family protein. 254
11141 236206 PRK08260 PRK08260 enoyl-CoA hydratase; Provisional 296
11142 236207 PRK08261 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 450
11143 236208 PRK08262 PRK08262 M20 family peptidase. 486
11144 181334 PRK08263 PRK08263 short chain dehydrogenase; Provisional 275
11145 181335 PRK08264 PRK08264 SDR family oxidoreductase. 238
11146 236209 PRK08265 PRK08265 short chain dehydrogenase; Provisional 261
11147 181337 PRK08266 PRK08266 hypothetical protein; Provisional 542
11148 236210 PRK08267 PRK08267 SDR family oxidoreductase. 260
11149 236211 PRK08268 PRK08268 3-hydroxy-acyl-CoA dehydrogenase; Validated 507
11150 181340 PRK08269 PRK08269 3-hydroxybutyryl-CoA dehydrogenase; Validated 314
11151 236212 PRK08270 PRK08270 ribonucleoside triphosphate reductase. 656
11152 181342 PRK08271 PRK08271 anaerobic ribonucleoside triphosphate reductase; Provisional 623
11153 236213 PRK08272 PRK08272 crotonase/enoyl-CoA hydratase family protein. 302
11154 181344 PRK08273 PRK08273 thiamine pyrophosphate protein; Provisional 597
11155 236214 PRK08274 PRK08274 FAD-dependent tricarballylate dehydrogenase TcuA. 466
11156 181346 PRK08275 PRK08275 putative oxidoreductase; Provisional 554
11157 236215 PRK08276 PRK08276 long-chain-fatty-acid--CoA ligase; Validated 502
11158 236216 PRK08277 PRK08277 D-mannonate oxidoreductase; Provisional 278
11159 181349 PRK08278 PRK08278 SDR family oxidoreductase. 273
11160 236217 PRK08279 PRK08279 long-chain-acyl-CoA synthetase; Validated 600
11161 236218 PRK08284 PRK08284 precorrin 6A synthase; Provisional 253
11162 181352 PRK08285 cobH precorrin-8X methylmutase; Reviewed 208
11163 181353 PRK08286 cbiC cobalt-precorrin-8 methylmutase. 214
11164 181354 PRK08287 PRK08287 decarboxylating cobalt-precorrin-6B (C(15))-methyltransferase. 187
11165 236219 PRK08289 PRK08289 glyceraldehyde-3-phosphate dehydrogenase; Reviewed 477
11166 236220 PRK08290 PRK08290 enoyl-CoA hydratase; Provisional 288
11167 236221 PRK08291 PRK08291 cyclodeaminase. 330
11168 236222 PRK08292 PRK08292 AMP nucleosidase; Provisional 489
11169 181359 PRK08293 PRK08293 3-hydroxyacyl-CoA dehydrogenase. 287
11170 236223 PRK08294 PRK08294 phenol 2-monooxygenase; Provisional 634
11171 181361 PRK08295 PRK08295 RNA polymerase sporulation sigma factor SigH. 208
11172 181362 PRK08296 PRK08296 hypothetical protein; Provisional 603
11173 236224 PRK08297 PRK08297 L-lysine aminotransferase; Provisional 443
11174 236225 PRK08298 PRK08298 cytidine deaminase; Validated 136
11175 236226 PRK08299 PRK08299 NADP-dependent isocitrate dehydrogenase. 402
11176 236227 PRK08300 PRK08300 acetaldehyde dehydrogenase; Validated 302
11177 236228 PRK08301 PRK08301 RNA polymerase sporulation sigma factor SigE. 234
11178 236229 PRK08303 PRK08303 short chain dehydrogenase; Provisional 305
11179 236230 PRK08304 PRK08304 stage V sporulation protein AD; Validated 337
11180 181370 PRK08305 spoVFB dipicolinate synthase subunit B; Reviewed 196
11181 181371 PRK08306 PRK08306 dipicolinate synthase subunit DpsA. 296
11182 181372 PRK08307 PRK08307 stage III sporulation protein SpoAB; Provisional 171
11183 236231 PRK08308 PRK08308 acyl-CoA synthetase; Validated 414
11184 236232 PRK08309 PRK08309 short chain dehydrogenase; Provisional 177
11185 181375 PRK08310 PRK08310 amidase; Provisional 395
11186 236233 PRK08311 PRK08311 RNA polymerase sigma factor SigI. 237
11187 236234 PRK08312 PRK08312 indolepyruvate oxidoreductase subunit beta family protein. 510
11188 181378 PRK08313 PRK08313 thiolase domain-containing protein. 386
11189 236235 PRK08314 PRK08314 long-chain-fatty-acid--CoA ligase; Validated 546
11190 236236 PRK08315 PRK08315 AMP-binding domain protein; Validated 559
11191 181381 PRK08316 PRK08316 acyl-CoA synthetase; Validated 523
11192 181382 PRK08317 PRK08317 hypothetical protein; Provisional 241
11193 236237 PRK08318 PRK08318 NAD-dependent dihydropyrimidine dehydrogenase subunit PreA. 420
11194 181384 PRK08319 PRK08319 energy-coupling factor ABC transporter permease. 224
11195 236238 PRK08320 PRK08320 branched-chain amino acid aminotransferase; Reviewed 288
11196 181386 PRK08321 PRK08321 1,4-dihydroxy-2-naphthoyl-CoA synthase. 302
11197 236239 PRK08322 PRK08322 acetolactate synthase large subunit. 547
11198 236240 PRK08323 PRK08323 phenylhydantoinase; Validated 459
11199 236241 PRK08324 PRK08324 bifunctional aldolase/short-chain dehydrogenase. 681
11200 236242 PRK08326 PRK08326 R2-like ligand-binding oxidase. 311
11201 236243 PRK08327 PRK08327 thiamine pyrophosphate-requiring protein. 569
11202 169382 PRK08328 PRK08328 hypothetical protein; Provisional 231
11203 236244 PRK08329 PRK08329 threonine synthase; Validated 347
11204 169384 PRK08330 PRK08330 biotin--protein ligase; Provisional 236
11205 181392 PRK08332 PRK08332 vitamin B12-dependent ribonucleotide reductase. 1740
11206 181393 PRK08333 PRK08333 aldolase. 184
11207 169386 PRK08334 PRK08334 S-methyl-5-thioribose-1-phosphate isomerase. 356
11208 169387 PRK08335 PRK08335 translation initiation factor IF-2B subunit alpha; Validated 275
11209 181394 PRK08338 PRK08338 2-oxoacid:ferredoxin oxidoreductase subunit gamma. 170
11210 169389 PRK08339 PRK08339 short chain dehydrogenase; Provisional 263
11211 169390 PRK08340 PRK08340 SDR family oxidoreductase. 259
11212 181395 PRK08341 PRK08341 amidophosphoribosyltransferase; Provisional 442
11213 236245 PRK08343 secD preprotein translocase subunit SecD; Reviewed 417
11214 236246 PRK08344 PRK08344 V-type ATP synthase subunit K; Validated 157
11215 236247 PRK08345 PRK08345 cytochrome-c3 hydrogenase subunit gamma; Provisional 289
11216 181399 PRK08348 PRK08348 NADH-plastoquinone oxidoreductase subunit; Provisional 120
11217 169396 PRK08349 PRK08349 hypothetical protein; Validated 198
11218 169397 PRK08350 PRK08350 hypothetical protein; Provisional 341
11219 181400 PRK08351 PRK08351 DNA-directed RNA polymerase subunit E''; Validated 61
11220 169399 PRK08354 PRK08354 putative aminotransferase; Provisional 311
11221 169400 PRK08356 PRK08356 hypothetical protein; Provisional 195
11222 169401 PRK08359 PRK08359 transcription factor; Validated 176
11223 181401 PRK08360 PRK08360 aspartate aminotransferase family protein. 443
11224 236248 PRK08361 PRK08361 aspartate aminotransferase; Provisional 391
11225 181402 PRK08363 PRK08363 alanine aminotransferase; Validated 398
11226 236249 PRK08364 PRK08364 sulfur carrier protein ThiS; Provisional 70
11227 169406 PRK08366 vorA 2-ketoisovalerate ferredoxin oxidoreductase subunit alpha; Reviewed 390
11228 181403 PRK08367 porA pyruvate ferredoxin oxidoreductase subunit alpha; Reviewed 394
11229 236250 PRK08373 PRK08373 aspartate kinase; Validated 341
11230 169409 PRK08374 PRK08374 homoserine dehydrogenase; Provisional 336
11231 236251 PRK08375 PRK08375 putative monovalent cation/H+ antiporter subunit D; Reviewed 487
11232 236252 PRK08376 PRK08376 putative monovalent cation/H+ antiporter subunit D; Reviewed 521
11233 181406 PRK08377 PRK08377 NADH dehydrogenase subunit N; Validated 494
11234 236253 PRK08378 PRK08378 hypothetical protein; Provisional 93
11235 169414 PRK08381 PRK08381 putative monovalent cation/H+ antiporter subunit F; Reviewed 87
11236 169415 PRK08382 PRK08382 putative monovalent cation/H+ antiporter subunit E; Reviewed 201
11237 181407 PRK08383 PRK08383 putative monovalent cation/H+ antiporter subunit E; Reviewed 168
11238 236254 PRK08384 PRK08384 thiamine biosynthesis protein ThiI; Provisional 381
11239 236255 PRK08385 PRK08385 carboxylating nicotinate-nucleotide diphosphorylase. 278
11240 236256 PRK08386 PRK08386 putative monovalent cation/H+ antiporter subunit B; Reviewed 151
11241 169420 PRK08387 PRK08387 putative monovalent cation/H+ antiporter subunit B; Reviewed 131
11242 236257 PRK08388 PRK08388 putative monovalent cation/H+ antiporter subunit C; Reviewed 119
11243 236258 PRK08389 PRK08389 putative monovalent cation/H+ antiporter subunit C; Reviewed 114
11244 169423 PRK08392 PRK08392 hypothetical protein; Provisional 215
11245 181411 PRK08393 PRK08393 N-ethylammeline chlorohydrolase; Provisional 424
11246 169425 PRK08395 PRK08395 fumarate hydratase; Provisional 162
11247 236259 PRK08401 PRK08401 L-aspartate oxidase; Provisional 466
11248 169427 PRK08402 PRK08402 replication factor A; Reviewed 355
11249 169428 PRK08404 PRK08404 V-type ATP synthase subunit H; Validated 103
11250 181413 PRK08406 PRK08406 transcription elongation factor NusA-like protein; Validated 140
11251 181414 PRK08410 PRK08410 D-2-hydroxyacid dehydrogenase. 311
11252 236260 PRK08411 PRK08411 flagellin; Reviewed 572
11253 236261 PRK08412 flgL flagellar hook-associated protein FlgL; Validated 827
11254 181416 PRK08415 PRK08415 enoyl-[acyl-carrier-protein] reductase FabI. 274
11255 181417 PRK08416 PRK08416 enoyl-ACP reductase. 260
11256 236262 PRK08417 PRK08417 metal-dependent hydrolase. 386
11257 181419 PRK08418 PRK08418 metal-dependent hydrolase. 408
11258 181420 PRK08419 PRK08419 lipid A biosynthesis lauroyl acyltransferase; Reviewed 298
11259 236263 PRK08425 flgE flagellar hook protein FlgE; Validated 731
11260 236264 PRK08432 PRK08432 flagellar motor switch protein FliY; Validated 283
11261 181423 PRK08433 PRK08433 flagellar motor switch protein FliN. 111
11262 236265 PRK08439 PRK08439 3-oxoacyl-(acyl carrier protein) synthase II; Reviewed 406
11263 181425 PRK08441 oorC 2-oxoglutarate-acceptor oxidoreductase subunit OorC; Reviewed 183
11264 181426 PRK08444 PRK08444 aminofutalosine synthase MqnE. 353
11265 181427 PRK08445 PRK08445 dehypoxanthine futalosine cyclase. 348
11266 181428 PRK08446 PRK08446 coproporphyrinogen III oxidase family protein. 350
11267 236266 PRK08447 PRK08447 ribonucleoside-diphosphate reductase subunit alpha. 789
11268 236267 PRK08451 PRK08451 DNA polymerase III subunits gamma and tau; Validated 535
11269 236268 PRK08452 PRK08452 flagellar protein FlaG; Provisional 124
11270 181432 PRK08453 fliD flagellar filament capping protein FliD. 673
11271 181433 PRK08455 fliL flagellar basal body-associated protein FliL; Reviewed 182
11272 181434 PRK08456 PRK08456 flagellar motor protein MotA; Validated 257
11273 181435 PRK08457 motB flagellar motor protein MotB; Reviewed 257
11274 236269 PRK08462 PRK08462 acetyl-CoA carboxylase biotin carboxylase subunit. 445
11275 169452 PRK08463 PRK08463 acetyl-CoA carboxylase subunit A; Validated 478
11276 181437 PRK08470 PRK08470 adenylosuccinate lyase; Provisional 442
11277 236270 PRK08471 flgK flagellar hook-associated protein FlgK; Validated 613
11278 181439 PRK08472 fliI flagellar protein export ATPase FliI. 434
11279 236271 PRK08474 PRK08474 F0F1 ATP synthase subunit delta; Validated 176
11280 236272 PRK08475 PRK08475 F0F1 ATP synthase subunit B; Validated 167
11281 181442 PRK08476 PRK08476 F0F1 ATP synthase subunit B'; Validated 141
11282 236273 PRK08477 PRK08477 biotin--[acetyl-CoA-carboxylase] ligase. 211
11283 181444 PRK08482 PRK08482 F0F1 ATP synthase subunit C; Validated 105
11284 236274 PRK08485 PRK08485 DNA polymerase III subunit delta'; Validated 206
11285 236275 PRK08486 PRK08486 single-stranded DNA-binding protein; Provisional 182
11286 236276 PRK08487 PRK08487 DNA polymerase III subunit delta; Validated 328
11287 181448 PRK08489 PRK08489 NAD(P)H-quinone oxidoreductase subunit 3. 129
11288 181449 PRK08491 PRK08491 NADH-quinone oxidoreductase subunit C. 263
11289 236277 PRK08493 PRK08493 NADH-quinone oxidoreductase subunit G. 819
11290 236278 PRK08506 PRK08506 replicative DNA helicase; Provisional 472
11291 181452 PRK08507 PRK08507 prephenate dehydrogenase; Validated 275
11292 236279 PRK08508 PRK08508 biotin synthase; Provisional 279
11293 236280 PRK08515 flgA flagellar basal body P-ring formation protein FlgA. 222
11294 236281 PRK08517 PRK08517 3'-5' exonuclease. 257
11295 181456 PRK08525 PRK08525 amidophosphoribosyltransferase; Provisional 445
11296 181457 PRK08526 PRK08526 threonine dehydratase; Provisional 403
11297 181458 PRK08527 PRK08527 acetolactate synthase large subunit. 563
11298 181459 PRK08533 PRK08533 flagellar accessory protein FlaH; Reviewed 230
11299 181460 PRK08534 PRK08534 pyruvate ferredoxin oxidoreductase subunit gamma; Reviewed 181
11300 236282 PRK08535 PRK08535 ribose 1,5-bisphosphate isomerase. 310
11301 181462 PRK08537 PRK08537 2-oxoacid:ferredoxin oxidoreductase subunit gamma. 177
11302 236283 PRK08540 PRK08540 adenylosuccinate lyase; Reviewed 449
11303 236284 PRK08541 PRK08541 flagellin; Validated 211
11304 236285 PRK08554 PRK08554 peptidase; Reviewed 438
11305 181465 PRK08557 PRK08557 hypothetical protein; Provisional 417
11306 181466 PRK08558 PRK08558 adenine phosphoribosyltransferase; Provisional 238
11307 181467 PRK08559 nusG transcription antitermination protein NusG; Validated 153
11308 236286 PRK08560 PRK08560 tyrosyl-tRNA synthetase; Validated 329
11309 236287 PRK08561 rps15p 30S ribosomal protein S15P; Reviewed 151
11310 236288 PRK08562 rpl32e 50S ribosomal protein L32e; Validated 125
11311 236289 PRK08563 PRK08563 DNA-directed RNA polymerase subunit E'; Provisional 187
11312 236290 PRK08564 PRK08564 S-methyl-5'-thioadenosine phosphorylase. 267
11313 236291 PRK08565 PRK08565 DNA-directed RNA polymerase subunit B; Provisional 1103
11314 236292 PRK08566 PRK08566 DNA-directed RNA polymerase subunit A'; Validated 882
11315 236293 PRK08568 PRK08568 preprotein translocase subunit SecY; Reviewed 462
11316 236294 PRK08569 rpl18p 50S ribosomal protein L18P; Reviewed 193
11317 236295 PRK08570 rpl19e 50S ribosomal protein L19e; Reviewed 150
11318 181478 PRK08571 rpl14p 50S ribosomal protein L14P; Reviewed 132
11319 236296 PRK08572 rps17p 30S ribosomal protein S17P; Reviewed 108
11320 236297 PRK08573 PRK08573 bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 448
11321 236298 PRK08574 PRK08574 cystathionine gamma-synthase family protein. 385
11322 236299 PRK08575 PRK08575 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase; Provisional 326
11323 236300 PRK08576 PRK08576 hypothetical protein; Provisional 438
11324 236301 PRK08577 PRK08577 hypothetical protein; Provisional 136
11325 236302 PRK08578 PRK08578 preprotein translocase subunit SecF; Reviewed 292
11326 236303 PRK08579 PRK08579 anaerobic ribonucleoside triphosphate reductase; Provisional 625
11327 236304 PRK08581 PRK08581 amidase domain-containing protein. 619
11328 236305 PRK08582 PRK08582 RNA-binding protein S1. 139
11329 236306 PRK08583 PRK08583 RNA polymerase sigma factor SigB; Validated 257
11330 181490 PRK08588 PRK08588 succinyl-diaminopimelate desuccinylase; Reviewed 377
11331 181491 PRK08589 PRK08589 SDR family oxidoreductase. 272
11332 236307 PRK08591 PRK08591 acetyl-CoA carboxylase biotin carboxylase subunit; Validated 451
11333 181493 PRK08593 PRK08593 aspartate aminotransferase family protein. 445
11334 236308 PRK08594 PRK08594 enoyl-[acyl-carrier-protein] reductase FabI. 257
11335 181495 PRK08596 PRK08596 acetylornithine deacetylase; Validated 421
11336 236309 PRK08599 PRK08599 oxygen-independent coproporphyrinogen III oxidase. 377
11337 181497 PRK08600 PRK08600 putative monovalent cation/H+ antiporter subunit C; Reviewed 113
11338 236310 PRK08601 PRK08601 NADH dehydrogenase subunit 5; Validated 509
11339 181499 PRK08605 PRK08605 D-lactate dehydrogenase; Validated 332
11340 236311 PRK08609 PRK08609 DNA polymerase/3'-5' exonuclease PolX. 570
11341 181501 PRK08610 PRK08610 fructose-bisphosphate aldolase; Reviewed 286
11342 181502 PRK08611 PRK08611 pyruvate oxidase; Provisional 576
11343 236312 PRK08617 PRK08617 acetolactate synthase AlsS. 552
11344 236313 PRK08618 PRK08618 ornithine cyclodeaminase family protein. 325
11345 181505 PRK08621 PRK08621 galactose-6-phosphate isomerase subunit LacA; Reviewed 142
11346 181506 PRK08622 PRK08622 galactose-6-phosphate isomerase subunit LacB; Reviewed 171
11347 236314 PRK08624 PRK08624 hypothetical protein; Provisional 373
11348 181507 PRK08626 PRK08626 fumarate reductase flavoprotein subunit; Provisional 657
11349 181508 PRK08628 PRK08628 SDR family oxidoreductase. 258
11350 181509 PRK08629 PRK08629 coproporphyrinogen III oxidase family protein. 433
11351 236315 PRK08633 PRK08633 2-acyl-glycerophospho-ethanolamine acyltransferase; Validated 1146
11352 236316 PRK08636 PRK08636 LL-diaminopimelate aminotransferase. 403
11353 181512 PRK08637 PRK08637 hypothetical protein; Provisional 388
11354 236317 PRK08638 PRK08638 bifunctional threonine ammonia-lyase/L-serine ammonia-lyase TdcB. 333
11355 236318 PRK08639 PRK08639 threonine dehydratase; Validated 420
11356 181515 PRK08640 sdhB succinate dehydrogenase iron-sulfur subunit; Reviewed 249
11357 236319 PRK08641 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 589
11358 181517 PRK08642 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 253
11359 181518 PRK08643 PRK08643 (S)-acetoin forming diacetyl reductase. 256
11360 236320 PRK08644 PRK08644 sulfur carrier protein ThiS adenylyltransferase ThiF. 212
11361 236321 PRK08645 PRK08645 bifunctional homocysteine S-methyltransferase/5,10-methylenetetrahydrofolate reductase protein; Reviewed 612
11362 236322 PRK08649 PRK08649 GuaB3 family IMP dehydrogenase-related protein. 368
11363 236323 PRK08651 PRK08651 succinyl-diaminopimelate desuccinylase; Reviewed 394
11364 236324 PRK08652 PRK08652 acetylornithine deacetylase; Provisional 347
11365 236325 PRK08654 PRK08654 acetyl-CoA carboxylase biotin carboxylase subunit. 499
11366 236326 PRK08655 PRK08655 prephenate dehydrogenase; Provisional 437
11367 181526 PRK08659 PRK08659 2-oxoacid:acceptor oxidoreductase subunit alpha. 376
11368 181527 PRK08660 PRK08660 aldolase. 181
11369 236327 PRK08661 PRK08661 prolyl-tRNA synthetase; Provisional 477
11370 236328 PRK08662 PRK08662 nicotinate phosphoribosyltransferase; Reviewed 343
11371 236329 PRK08664 PRK08664 aspartate-semialdehyde dehydrogenase; Reviewed 349
11372 236330 PRK08665 PRK08665 vitamin B12-dependent ribonucleotide reductase. 752
11373 169548 PRK08666 PRK08666 5'-methylthioadenosine phosphorylase; Validated 261
11374 236331 PRK08667 PRK08667 hydrogenase membrane subunit; Validated 644
11375 236332 PRK08668 PRK08668 NADH dehydrogenase subunit M; Validated 610
11376 181534 PRK08671 PRK08671 methionine aminopeptidase; Provisional 291
11377 181535 PRK08673 PRK08673 3-deoxy-7-phosphoheptulonate synthase; Reviewed 335
11378 181536 PRK08674 PRK08674 bifunctional phosphoglucose/phosphomannose isomerase; Validated 337
11379 181537 PRK08676 PRK08676 hydrogenase membrane subunit; Validated 485
11380 169553 PRK08690 PRK08690 enoyl-[acyl-carrier-protein] reductase FabI. 261
11381 236333 PRK08691 PRK08691 DNA polymerase III subunits gamma and tau; Validated 709
11382 181538 PRK08699 PRK08699 DNA polymerase III subunit delta'; Validated 325
11383 169556 PRK08703 PRK08703 SDR family oxidoreductase. 239
11384 169557 PRK08706 PRK08706 lipid A biosynthesis lauroyl acyltransferase; Provisional 289
11385 236334 PRK08719 PRK08719 ribonuclease H; Reviewed 147
11386 181539 PRK08722 PRK08722 beta-ketoacyl-ACP synthase II. 414
11387 236335 PRK08724 fliD flagellar filament capping protein FliD. 673
11388 181541 PRK08727 PRK08727 DnaA regulatory inactivator Hda. 233
11389 181542 PRK08733 PRK08733 LpxL/LpxP family Kdo(2)-lipid IV(A) lauroyl/palmitoleoyl acyltransferase. 306
11390 181543 PRK08734 PRK08734 lauroyl acyltransferase. 305
11391 181544 PRK08737 PRK08737 acetylornithine deacetylase; Provisional 364
11392 236336 PRK08742 PRK08742 adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional 472
11393 136958 PRK08745 PRK08745 ribulose-phosphate 3-epimerase; Provisional 223
11394 181546 PRK08751 PRK08751 long-chain fatty acid--CoA ligase. 560
11395 181547 PRK08760 PRK08760 replicative DNA helicase; Provisional 476
11396 236337 PRK08762 PRK08762 molybdopterin-synthase adenylyltransferase MoeB. 376
11397 181549 PRK08763 PRK08763 single-stranded DNA-binding protein; Provisional 164
11398 181550 PRK08764 PRK08764 Rnf electron transport complex subunit RnfB. 135
11399 181551 PRK08769 PRK08769 DNA polymerase III subunit delta'; Validated 319
11400 181552 PRK08773 PRK08773 UbiH/UbiF family hydroxylase. 392
11401 181553 PRK08775 PRK08775 homoserine O-succinyltransferase. 343
11402 181554 PRK08776 PRK08776 O-succinylhomoserine (thiol)-lyase. 405
11403 181555 PRK08780 PRK08780 DNA topoisomerase I; Provisional 780
11404 136970 PRK08787 PRK08787 peptide chain release factor 2; Provisional 313
11405 236338 PRK08788 PRK08788 enoyl-CoA hydratase; Validated 287
11406 181557 PRK08808 PRK08808 type II secretion system protein J. 211
11407 181558 PRK08811 PRK08811 uroporphyrinogen-III synthase; Validated 266
11408 236339 PRK08813 PRK08813 threonine dehydratase; Provisional 349
11409 236340 PRK08815 PRK08815 GTP cyclohydrolase II RibA. 375
11410 181561 PRK08818 PRK08818 prephenate dehydrogenase; Provisional 370
11411 181562 PRK08840 PRK08840 replicative DNA helicase; Provisional 464
11412 181563 PRK08841 PRK08841 aspartate kinase; Validated 392
11413 181564 PRK08849 PRK08849 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Provisional 384
11414 236341 PRK08850 PRK08850 2-octaprenyl-6-methoxyphenol hydroxylase; Validated 405
11415 181566 PRK08857 PRK08857 aminodeoxychorismate/anthranilate synthase component II. 193
11416 181567 PRK08861 PRK08861 O-succinylhomoserine (thiol)-lyase. 388
11417 236342 PRK08862 PRK08862 SDR family oxidoreductase. 227
11418 236343 PRK08868 PRK08868 flagellar protein FlaG; Provisional 144
11419 236344 PRK08869 PRK08869 polar flagellin E. 376
11420 236345 PRK08870 flgL flagellar hook-associated protein FlgL; Reviewed 404
11421 181572 PRK08871 flgK flagellar hook-associated protein FlgK; Validated 626
11422 181573 PRK08878 PRK08878 cobalamin biosynthesis family protein. 317
11423 181574 PRK08881 rpsN 30S ribosomal protein S14; Reviewed 101
11424 181575 PRK08883 PRK08883 ribulose-phosphate 3-epimerase; Provisional 220
11425 181576 PRK08887 PRK08887 nicotinate-nicotinamide nucleotide adenylyltransferase. 174
11426 236346 PRK08898 PRK08898 oxygen-independent coproporphyrinogen III oxidase-like protein. 394
11427 236347 PRK08903 PRK08903 DnaA regulatory inactivator Hda; Validated 227
11428 236348 PRK08905 PRK08905 lysophospholipid acyltransferase family protein. 289
11429 181580 PRK08912 PRK08912 aminotransferase. 387
11430 236349 PRK08913 flgL flagellin. 301
11431 236350 PRK08916 PRK08916 flagellar motor switch protein FliN. 116
11432 236351 PRK08927 fliI flagellar protein export ATPase FliI. 442
11433 181584 PRK08931 PRK08931 S-methyl-5'-thioadenosine phosphorylase. 289
11434 181585 PRK08936 PRK08936 glucose-1-dehydrogenase; Provisional 261
11435 236352 PRK08937 PRK08937 adenylosuccinate lyase; Provisional 216
11436 236353 PRK08939 PRK08939 primosomal protein DnaI; Reviewed 306
11437 236354 PRK08942 PRK08942 D-glycero-beta-D-manno-heptose 1,7-bisphosphate 7-phosphatase. 181
11438 236355 PRK08943 PRK08943 lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase; Validated 314
11439 236356 PRK08944 motB flagellar motor protein MotB; Reviewed 302
11440 236357 PRK08945 PRK08945 putative oxoacyl-(acyl carrier protein) reductase; Provisional 247
11441 181592 PRK08947 fadA 3-ketoacyl-CoA thiolase; Reviewed 387
11442 181593 PRK08951 PRK08951 malate synthase; Provisional 190
11443 169599 PRK08955 PRK08955 glyceraldehyde-3-phosphate dehydrogenase; Validated 334
11444 181594 PRK08958 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 588
11445 181595 PRK08960 PRK08960 pyridoxal phosphate-dependent aminotransferase. 387
11446 236358 PRK08961 PRK08961 bifunctional aspartate kinase/diaminopimelate decarboxylase. 861
11447 181597 PRK08963 fadI 3-ketoacyl-CoA thiolase; Reviewed 428
11448 181598 PRK08965 PRK08965 putative monovalent cation/H+ antiporter subunit E; Reviewed 162
11449 181599 PRK08972 fliI flagellar protein export ATPase FliI. 444
11450 236359 PRK08974 PRK08974 long-chain-fatty-acid--CoA ligase FadD. 560
11451 181601 PRK08978 PRK08978 acetolactate synthase 2 catalytic subunit; Reviewed 548
11452 181602 PRK08979 PRK08979 acetolactate synthase 3 large subunit. 572
11453 236360 PRK08983 fliN flagellar motor switch protein FliN. 127
11454 181604 PRK08990 PRK08990 flagellar motor protein PomA; Reviewed 254
11455 181605 PRK08993 PRK08993 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 253
11456 181606 PRK08997 PRK08997 isocitrate dehydrogenase; Provisional 334
11457 236361 PRK08999 PRK08999 Nudix family hydrolase. 312
11458 181608 PRK09004 PRK09004 FMN-binding protein MioC; Provisional 146
11459 181609 PRK09009 PRK09009 SDR family oxidoreductase. 235
11460 236362 PRK09010 PRK09010 single-stranded DNA-binding protein; Provisional 177
11461 181611 PRK09014 rfaH transcription/translation regulatory transformer protein RfaH. 162
11462 181612 PRK09016 PRK09016 carboxylating nicotinate-nucleotide diphosphorylase. 296
11463 181613 PRK09019 PRK09019 stress response translation initiation inhibitor YciH. 108
11464 181614 PRK09027 PRK09027 cytidine deaminase; Provisional 295
11465 181615 PRK09028 PRK09028 cystathionine beta-lyase; Provisional 394
11466 236363 PRK09029 PRK09029 O-succinylbenzoic acid--CoA ligase; Provisional 458
11467 236364 PRK09034 PRK09034 aspartate kinase; Reviewed 454
11468 181618 PRK09038 PRK09038 flagellar motor protein MotD; Reviewed 281
11469 181619 PRK09039 PRK09039 peptidoglycan-binding protein. 343
11470 181620 PRK09040 PRK09040 hypothetical protein; Provisional 214
11471 236365 PRK09041 motB motility protein MotB. 317
11472 236366 PRK09045 PRK09045 TRZ/ATZ family hydrolase. 443
11473 236367 PRK09047 PRK09047 RNA polymerase factor sigma-70; Validated 161
11474 181624 PRK09050 PRK09050 beta-ketoadipyl CoA thiolase; Validated 401
11475 181625 PRK09051 PRK09051 beta-ketothiolase BktB. 394
11476 181626 PRK09052 PRK09052 acetyl-CoA C-acyltransferase. 399
11477 181627 PRK09053 PRK09053 3-carboxy-cis,cis-muconate cycloisomerase; Provisional 452
11478 181628 PRK09054 PRK09054 phosphogluconate dehydratase; Validated 603
11479 181629 PRK09057 PRK09057 coproporphyrinogen III oxidase; Provisional 380
11480 236368 PRK09058 PRK09058 heme anaerobic degradation radical SAM methyltransferase ChuW/HutW. 449
11481 181631 PRK09059 PRK09059 dihydroorotase; Validated 429
11482 181632 PRK09060 PRK09060 dihydroorotase; Validated 444
11483 236369 PRK09061 PRK09061 D-glutamate deacylase; Validated 509
11484 236370 PRK09064 PRK09064 5-aminolevulinate synthase; Validated 407
11485 181635 PRK09065 PRK09065 glutamine amidotransferase; Provisional 237
11486 236371 PRK09070 PRK09070 aminodeoxychorismate synthase component I. 447
11487 181637 PRK09071 PRK09071 glycosyl transferase family protein. 323
11488 236372 PRK09072 PRK09072 SDR family oxidoreductase. 263
11489 236373 PRK09076 PRK09076 enoyl-CoA hydratase; Provisional 258
11490 236374 PRK09077 PRK09077 L-aspartate oxidase; Provisional 536
11491 236375 PRK09078 sdhA succinate dehydrogenase flavoprotein subunit; Reviewed 598
11492 181642 PRK09082 PRK09082 methionine aminotransferase; Validated 386
11493 236376 PRK09084 PRK09084 aspartate kinase III; Validated 448
11494 169652 PRK09087 PRK09087 hypothetical protein; Validated 226
11495 181644 PRK09088 PRK09088 acyl-CoA synthetase; Validated 488
11496 236377 PRK09094 PRK09094 putative monovalent cation/H+ antiporter subunit C; Reviewed 114
11497 181646 PRK09098 PRK09098 HrpE/YscL family type III secretion apparatus protein. 233
11498 169656 PRK09099 PRK09099 type III secretion system ATPase; Provisional 441
11499 181647 PRK09101 nrdB ribonucleotide-diphosphate reductase subunit beta; Reviewed 376
11500 236378 PRK09102 PRK09102 ribonucleoside-diphosphate reductase subunit alpha. 601
11501 181649 PRK09103 PRK09103 ribonucleoside-diphosphate reductase subunit alpha. 758
11502 236379 PRK09104 PRK09104 hypothetical protein; Validated 464
11503 181651 PRK09105 PRK09105 pyridoxal phosphate-dependent aminotransferase. 370
11504 236380 PRK09107 PRK09107 acetolactate synthase 3 catalytic subunit; Validated 595
11505 236381 PRK09108 PRK09108 type III secretion system protein HrcU; Validated 353
11506 181654 PRK09109 motC flagellar motor protein; Reviewed 246
11507 181655 PRK09110 PRK09110 flagellar motor stator protein MotA. 283
11508 236382 PRK09111 PRK09111 DNA polymerase III subunits gamma and tau; Validated 598
11509 169667 PRK09112 PRK09112 DNA polymerase III subunit delta'; Validated 351
11510 181657 PRK09116 PRK09116 beta-ketoacyl-ACP synthase. 405
11511 236383 PRK09120 PRK09120 p-hydroxycinnamoyl CoA hydratase/lyase; Validated 275
11512 181659 PRK09121 PRK09121 methionine synthase. 339
11513 236384 PRK09123 PRK09123 amidophosphoribosyltransferase; Provisional 479
11514 181661 PRK09124 PRK09124 ubiquinone-dependent pyruvate dehydrogenase. 574
11515 181662 PRK09125 PRK09125 DNA ligase; Provisional 282
11516 236385 PRK09126 PRK09126 FAD-dependent hydroxylase. 392
11517 236386 PRK09129 PRK09129 NADH dehydrogenase subunit G; Validated 776
11518 236387 PRK09130 PRK09130 NADH dehydrogenase subunit G; Validated 687
11519 236388 PRK09133 PRK09133 hypothetical protein; Provisional 472
11520 236389 PRK09134 PRK09134 SDR family oxidoreductase. 258
11521 181668 PRK09135 PRK09135 pteridine reductase; Provisional 249
11522 236390 PRK09136 PRK09136 S-methyl-5'-thioinosine phosphorylase. 245
11523 181670 PRK09140 PRK09140 2-dehydro-3-deoxy-6-phosphogalactonate aldolase; Reviewed 206
11524 236391 PRK09145 PRK09145 3'-5' exonuclease. 202
11525 236392 PRK09146 PRK09146 DNA polymerase III subunit epsilon; Validated 239
11526 236393 PRK09147 PRK09147 succinyldiaminopimelate transaminase; Provisional 396
11527 181674 PRK09148 PRK09148 LL-diaminopimelate aminotransferase. 405
11528 181675 PRK09162 PRK09162 hypoxanthine-guanine phosphoribosyltransferase; Provisional 181
11529 181676 PRK09165 PRK09165 replicative DNA helicase; Provisional 497
11530 236394 PRK09169 PRK09169 hypothetical protein; Validated 2316
11531 169691 PRK09173 PRK09173 F0F1 ATP synthase subunit B; Validated 159
11532 169692 PRK09174 PRK09174 F0F1 ATP synthase subunit B. 204
11533 236395 PRK09177 PRK09177 xanthine-guanine phosphoribosyltransferase; Validated 156
11534 236396 PRK09181 PRK09181 aspartate kinase; Validated 475
11535 236397 PRK09182 PRK09182 DNA polymerase III subunit epsilon; Validated 294
11536 181681 PRK09183 PRK09183 transposase/IS protein; Provisional 259
11537 181682 PRK09184 PRK09184 acyl carrier protein; Provisional 89
11538 236398 PRK09185 PRK09185 beta-ketoacyl-ACP synthase. 392
11539 236399 PRK09186 PRK09186 flagellin modification protein A; Provisional 256
11540 236400 PRK09188 PRK09188 serine/threonine protein kinase; Provisional 365
11541 169701 PRK09189 PRK09189 uroporphyrinogen-III synthase; Validated 240
11542 236401 PRK09190 PRK09190 RNA-binding protein. 220
11543 236402 PRK09191 PRK09191 two-component response regulator; Provisional 261
11544 236403 PRK09192 PRK09192 fatty acyl-AMP ligase. 579
11545 236404 PRK09193 PRK09193 indolepyruvate ferredoxin oxidoreductase; Validated 1165
11546 236405 PRK09194 PRK09194 prolyl-tRNA synthetase; Provisional 565
11547 181690 PRK09195 gatY tagatose-bisphosphate aldolase; Reviewed 284
11548 181691 PRK09196 PRK09196 fructose-bisphosphate aldolase class II. 347
11549 236406 PRK09197 PRK09197 fructose-bisphosphate aldolase; Provisional 350
11550 236407 PRK09198 PRK09198 putative nicotinate phosphoribosyltransferase; Provisional 463
11551 236408 PRK09200 PRK09200 preprotein translocase subunit SecA; Reviewed 790
11552 236409 PRK09201 PRK09201 AtzE family amidohydrolase. 465
11553 236410 PRK09202 nusA transcription elongation factor NusA; Validated 470
11554 236411 PRK09203 rplP 50S ribosomal protein L16; Reviewed 138
11555 236412 PRK09204 secY preprotein translocase subunit SecY; Reviewed 426
11556 181699 PRK09206 PRK09206 pyruvate kinase PykF. 470
11557 181700 PRK09209 PRK09209 ribonucleoside-diphosphate reductase subunit alpha. 761
11558 236413 PRK09210 PRK09210 RNA polymerase sigma factor RpoD; Validated 367
11559 169719 PRK09212 PRK09212 pyruvate dehydrogenase subunit beta; Validated 327
11560 236414 PRK09213 PRK09213 pur operon repressor; Provisional 271
11561 181703 PRK09216 rplM 50S ribosomal protein L13; Reviewed 144
11562 181704 PRK09218 PRK09218 peptide deformylase; Validated 136
11563 181705 PRK09219 PRK09219 xanthine phosphoribosyltransferase; Validated 189
11564 236415 PRK09220 PRK09220 methylthioribulose 1-phosphate dehydratase. 204
11565 181707 PRK09221 PRK09221 beta alanine--pyruvate transaminase; Provisional 445
11566 236416 PRK09222 PRK09222 NADP-dependent isocitrate dehydrogenase. 482
11567 236417 PRK09224 PRK09224 threonine ammonia-lyase IlvA. 504
11568 236418 PRK09225 PRK09225 threonine synthase; Validated 462
11569 236419 PRK09228 PRK09228 guanine deaminase; Provisional 433
11570 236420 PRK09229 PRK09229 N-formimino-L-glutamate deiminase; Validated 456
11571 181713 PRK09230 PRK09230 cytosine deaminase; Provisional 426
11572 236421 PRK09231 PRK09231 fumarate reductase flavoprotein subunit; Validated 582
11573 236422 PRK09234 fbiC FO synthase; Reviewed 843
11574 181716 PRK09236 PRK09236 dihydroorotase; Reviewed 444
11575 236423 PRK09237 PRK09237 amidohydrolase/deacetylase family metallohydrolase. 380
11576 236424 PRK09238 PRK09238 bifunctional aconitate hydratase 2/2-methylisocitrate dehydratase; Validated 835
11577 181719 PRK09239 PRK09239 chorismate mutase; Provisional 104
11578 236425 PRK09240 thiH 2-iminoacetate synthase ThiH. 371
11579 181721 PRK09242 PRK09242 SDR family oxidoreductase. 257
11580 236426 PRK09243 PRK09243 nicotinate phosphoribosyltransferase; Validated 464
11581 181723 PRK09245 PRK09245 crotonase/enoyl-CoA hydratase family protein. 266
11582 236427 PRK09246 PRK09246 amidophosphoribosyltransferase; Provisional 501
11583 236428 PRK09247 PRK09247 ATP-dependent DNA ligase; Validated 539
11584 236429 PRK09248 PRK09248 putative hydrolase; Validated 246
11585 236430 PRK09249 PRK09249 coproporphyrinogen dehydrogenase. 453
11586 236431 PRK09250 PRK09250 class I fructose-bisphosphate aldolase. 348
11587 236432 PRK09255 PRK09255 malate synthase; Validated 531
11588 181730 PRK09256 PRK09256 aminoacyl-tRNA hydrolase. 138
11589 181731 PRK09257 PRK09257 aromatic amino acid transaminase. 396
11590 181732 PRK09258 PRK09258 3-oxoacyl-(acyl carrier protein) synthase III; Reviewed 338
11591 236433 PRK09259 PRK09259 putative oxalyl-CoA decarboxylase; Validated 569
11592 236434 PRK09260 PRK09260 3-hydroxyacyl-CoA dehydrogenase. 288
11593 236435 PRK09261 PRK09261 phospho-2-dehydro-3-deoxyheptonate aldolase; Validated 349
11594 181735 PRK09262 PRK09262 hypothetical protein; Provisional 225
11595 236436 PRK09263 PRK09263 anaerobic ribonucleoside triphosphate reductase; Provisional 711
11596 236437 PRK09264 PRK09264 diaminobutyrate--2-oxoglutarate transaminase. 425
11597 181738 PRK09265 PRK09265 aminotransferase AlaT; Validated 404
11598 236438 PRK09266 PRK09266 hypothetical protein; Provisional 266
11599 236439 PRK09267 PRK09267 flavodoxin FldA; Validated 169
11600 236440 PRK09268 PRK09268 acetyl-CoA C-acetyltransferase. 427
11601 236441 PRK09269 PRK09269 chorismate mutase; Provisional 193
11602 236442 PRK09270 PRK09270 nucleoside triphosphate hydrolase domain-containing protein; Reviewed 229
11603 181744 PRK09271 PRK09271 flavodoxin; Provisional 160
11604 181745 PRK09272 PRK09272 hypothetical protein; Provisional 109
11605 181746 PRK09273 PRK09273 hypothetical protein; Provisional 211
11606 236443 PRK09274 PRK09274 peptide synthase; Provisional 552
11607 236444 PRK09275 PRK09275 bifunctional aspartate transaminase/aspartate 4-decarboxylase. 527
11608 181749 PRK09276 PRK09276 LL-diaminopimelate aminotransferase; Provisional 385
11609 236445 PRK09277 PRK09277 aconitate hydratase AcnA. 888
11610 236446 PRK09279 PRK09279 pyruvate phosphate dikinase; Provisional 879
11611 236447 PRK09280 PRK09280 F0F1 ATP synthase subunit beta; Validated 463
11612 236448 PRK09281 PRK09281 F0F1 ATP synthase subunit alpha; Validated 502
11613 236449 PRK09282 PRK09282 pyruvate carboxylase subunit B; Validated 592
11614 236450 PRK09283 PRK09283 porphobilinogen synthase. 323
11615 236451 PRK09284 PRK09284 thiamine biosynthesis protein ThiC; Provisional 607
11616 236452 PRK09285 PRK09285 adenylosuccinate lyase; Provisional 456
11617 236453 PRK09287 PRK09287 NADP-dependent phosphogluconate dehydrogenase. 459
11618 236454 PRK09288 purT formate-dependent phosphoribosylglycinamide formyltransferase. 395
11619 236455 PRK09289 PRK09289 riboflavin synthase. 194
11620 236456 PRK09290 PRK09290 allantoate amidohydrolase; Reviewed 413
11621 181762 PRK09291 PRK09291 SDR family oxidoreductase. 257
11622 236457 PRK09292 PRK09292 Na(+)-translocating NADH-quinone reductase subunit D; Validated 209
11623 236458 PRK09293 PRK09293 class 1 fructose-bisphosphatase. 327
11624 181765 PRK09294 PRK09294 phthiocerol/phthiodiolone dimycocerosyl transferase. 416
11625 181766 PRK09295 PRK09295 cysteine desulfurase SufS. 406
11626 181767 PRK09296 PRK09296 cysteine desulfuration protein SufE. 138
11627 236459 PRK09297 PRK09297 tRNA-splicing endonuclease subunit alpha; Reviewed 169
11628 236460 PRK09300 PRK09300 tRNA splicing endonuclease; Reviewed 330
11629 181770 PRK09301 PRK09301 circadian clock protein KaiB; Provisional 103
11630 236461 PRK09302 PRK09302 circadian clock protein KaiC; Reviewed 509
11631 236462 PRK09303 PRK09303 histidine kinase. 380
11632 236463 PRK09304 PRK09304 arginine exporter ArgO. 207
11633 137204 PRK09310 aroDE bifunctional 3-dehydroquinate dehydratase/shikimate dehydrogenase. 477
11634 181774 PRK09311 PRK09311 bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 402
11635 181775 PRK09314 PRK09314 bifunctional 3,4-dihydroxy-2-butanone 4-phosphate synthase/GTP cyclohydrolase II. 339
11636 236464 PRK09318 PRK09318 bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 387
11637 236465 PRK09319 PRK09319 bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase RibB/GTP cyclohydrolase II RibA. 555
11638 236466 PRK09325 PRK09325 coenzyme F420-reducing hydrogenase subunit beta; Validated 282
11639 181779 PRK09326 PRK09326 F420H2 dehydrogenase subunit F; Provisional 341
11640 236467 PRK09328 PRK09328 N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase; Provisional 275
11641 236468 PRK09330 PRK09330 cell division protein FtsZ; Validated 384
11642 236469 PRK09331 PRK09331 Sep-tRNA:Cys-tRNA synthetase; Provisional 387
11643 236470 PRK09333 PRK09333 30S ribosomal protein S19e; Provisional 150
11644 181784 PRK09334 PRK09334 30S ribosomal protein S25e; Provisional 86
11645 181785 PRK09335 PRK09335 30S ribosomal protein S26e; Provisional 95
11646 181786 PRK09336 PRK09336 30S ribosomal protein S30e; Provisional 50
11647 181787 PRK09343 PRK09343 prefoldin subunit beta; Provisional 121
11648 236471 PRK09344 PRK09344 phosphoenolpyruvate carboxykinase. 526
11649 236472 PRK09347 folE GTP cyclohydrolase I; Provisional 188
11650 236473 PRK09348 glyQ glycyl-tRNA synthetase subunit alpha; Validated 283
11651 236474 PRK09350 PRK09350 elongation factor P--(R)-beta-lysine ligase. 306
11652 236475 PRK09352 PRK09352 beta-ketoacyl-ACP synthase 3. 319
11653 236476 PRK09354 recA recombinase A; Provisional 349
11654 236477 PRK09355 PRK09355 hydroxyethylthiazole kinase; Validated 263
11655 236478 PRK09356 PRK09356 imidazolonepropionase; Validated 406
11656 236479 PRK09357 pyrC dihydroorotase; Validated 423
11657 236480 PRK09358 PRK09358 adenosine deaminase; Provisional 340
11658 236481 PRK09360 lamB maltoporin LamB. 415
11659 236482 PRK09361 radB DNA repair and recombination protein RadB; Provisional 225
11660 181800 PRK09362 PRK09362 phosphoribosylaminoimidazole-succinocarboxamide synthase; Reviewed 238
11661 236483 PRK09364 moaC cyclic pyranopterin monophosphate synthase MoaC. 159
11662 236484 PRK09367 PRK09367 histidine ammonia-lyase; Provisional 500
11663 236485 PRK09368 PRK09368 gas vesicle structural protein GvpA. 140
11664 236486 PRK09369 PRK09369 UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Validated 417
11665 181805 PRK09371 PRK09371 gas vesicle structural protein GvpA. 68
11666 236487 PRK09372 PRK09372 ribonuclease E inhibitor RraA. 159
11667 236488 PRK09374 rplB 50S ribosomal protein L2; Validated 276
11668 236489 PRK09375 PRK09375 quinolinate synthase NadA. 319
11669 236490 PRK09376 rho transcription termination factor Rho; Provisional 416
11670 236491 PRK09377 tsf elongation factor Ts; Provisional 290
11671 181811 PRK09379 PRK09379 LytR family transcriptional regulator. 303
11672 181812 PRK09381 trxA thioredoxin TrxA. 109
11673 236492 PRK09382 ispDF bifunctional 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase/2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase protein; Provisional 378
11674 236493 PRK09389 PRK09389 (R)-citramalate synthase; Provisional 488
11675 181815 PRK09390 fixJ response regulator FixJ; Provisional 202
11676 236494 PRK09391 fixK transcriptional regulator FixK; Provisional 230
11677 181817 PRK09392 ftrB transcriptional activator FtrB; Provisional 236
11678 181818 PRK09393 ftrA transcriptional activator FtrA; Provisional 322
11679 236495 PRK09395 actP cation/acetate symporter ActP. 551
11680 236496 PRK09398 sspN acid-soluble spore protein N; Provisional 47
11681 181821 PRK09399 sspP small acid-soluble spore protein P. 48
11682 236497 PRK09400 secE preprotein translocase subunit SecE; Reviewed 61
11683 236498 PRK09401 PRK09401 reverse gyrase; Reviewed 1176
11684 236499 PRK09404 sucA 2-oxoglutarate dehydrogenase E1 component; Reviewed 924
11685 236500 PRK09405 aceE pyruvate dehydrogenase subunit E1; Reviewed 891
11686 181826 PRK09406 gabD1 succinic semialdehyde dehydrogenase; Reviewed 457
11687 236501 PRK09407 gabD2 succinic semialdehyde dehydrogenase; Reviewed 524
11688 181828 PRK09408 ompX outer membrane protein OmpX. 171
11689 181829 PRK09409 PRK09409 IS2 transposase TnpB; Reviewed 301
11690 236502 PRK09410 ulaA PTS system ascorbate-specific transporter subunit IIC; Reviewed 452
11691 181831 PRK09411 PRK09411 carbamate kinase; Reviewed 297
11692 236503 PRK09412 PRK09412 anaerobic C4-dicarboxylate transporter; Reviewed 433
11693 181833 PRK09413 PRK09413 IS2 repressor TnpA; Reviewed 121
11694 181834 PRK09414 PRK09414 NADP-specific glutamate dehydrogenase. 445
11695 181835 PRK09415 PRK09415 RNA polymerase factor sigma C; Reviewed 179
11696 181836 PRK09416 lstR PadR family transcriptional regulator. 135
11697 181837 PRK09417 mogA molybdenum cofactor biosynthesis protein MogA; Provisional 193
11698 236504 PRK09418 PRK09418 bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 780
11699 236505 PRK09419 PRK09419 multifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase/5'-nucleotidase. 1163
11700 236506 PRK09420 cpdB bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 649
11701 181841 PRK09421 modB molybdate ABC transporter permease subunit. 229
11702 181842 PRK09422 PRK09422 ethanol-active dehydrogenase/acetaldehyde-active reductase; Provisional 338
11703 181843 PRK09423 gldA glycerol dehydrogenase; Provisional 366
11704 236507 PRK09424 pntA Re/Si-specific NAD(P)(+) transhydrogenase subunit alpha. 509
11705 181845 PRK09425 prpD bifunctional 2-methylcitrate dehydratase/aconitate hydratase. 480
11706 236508 PRK09426 PRK09426 methylmalonyl-CoA mutase; Reviewed 714
11707 236509 PRK09427 PRK09427 bifunctional indole-3-glycerol-phosphate synthase TrpC/phosphoribosylanthranilate isomerase TrpF. 454
11708 236510 PRK09428 pssA CDP-diacylglycerol--serine O-phosphatidyltransferase. 451
11709 236511 PRK09429 mepA penicillin-insensitive murein endopeptidase; Reviewed 275
11710 236512 PRK09430 djlA co-chaperone DjlA. 267
11711 236513 PRK09431 asnB asparagine synthetase B; Provisional 554
11712 181852 PRK09432 metF methylenetetrahydrofolate reductase. 296
11713 181853 PRK09433 thiP thiamine transporter membrane protein; Reviewed 525
11714 236514 PRK09434 PRK09434 aminoimidazole riboside kinase; Provisional 304
11715 236515 PRK09435 PRK09435 methylmalonyl Co-A mutase-associated GTPase MeaB. 332
11716 181856 PRK09436 thrA bifunctional aspartokinase I/homoserine dehydrogenase I; Provisional 819
11717 181857 PRK09437 bcp thioredoxin-dependent thiol peroxidase; Reviewed 154
11718 236516 PRK09438 nudB dihydroneopterin triphosphate pyrophosphatase; Provisional 148
11719 181859 PRK09439 PRK09439 PTS system glucose-specific transporter subunit; Provisional 169
11720 236517 PRK09440 avtA valine--pyruvate transaminase; Provisional 416
11721 236518 PRK09441 PRK09441 cytoplasmic alpha-amylase; Reviewed 479
11722 236519 PRK09442 panF sodium/pantothenate symporter. 483
11723 236520 PRK09444 pntB Re/Si-specific NAD(P)(+) transhydrogenase subunit beta. 462
11724 236521 PRK09448 PRK09448 DNA starvation/stationary phase protection protein Dps; Provisional 162
11725 181865 PRK09449 PRK09449 dUMP phosphatase; Provisional 224
11726 236522 PRK09450 cyaA class I adenylate cyclase. 830
11727 181867 PRK09451 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 456
11728 236523 PRK09452 potA spermidine/putrescine ABC transporter ATP-binding protein PotA. 375
11729 181869 PRK09453 PRK09453 phosphodiesterase; Provisional 182
11730 236524 PRK09454 ugpQ cytoplasmic glycerophosphodiester phosphodiesterase; Provisional 249
11731 236525 PRK09455 rseB anti-sigma E factor; Provisional 319
11732 181872 PRK09456 PRK09456 alpha-D-glucose-1-phosphatase; Provisional 199
11733 181873 PRK09457 astD succinylglutamic semialdehyde dehydrogenase; Reviewed 487
11734 236526 PRK09458 pspB envelope stress response membrane protein PspB. 75
11735 181875 PRK09459 pspG envelope stress response protein PspG. 76
11736 181876 PRK09461 ansA cytoplasmic asparaginase I; Provisional 335
11737 236527 PRK09462 fur ferric uptake regulator; Provisional 148
11738 236528 PRK09463 fadE acyl-CoA dehydrogenase; Reviewed 777
11739 181879 PRK09464 pdhR pyruvate dehydrogenase complex transcriptional repressor PdhR. 254
11740 236529 PRK09465 tolC outer membrane channel protein; Reviewed 446
11741 236530 PRK09466 metL bifunctional aspartate kinase II/homoserine dehydrogenase II; Provisional 810
11742 236531 PRK09467 envZ osmolarity sensor protein; Provisional 435
11743 181883 PRK09468 ompR osmolarity response regulator; Provisional 239
11744 181884 PRK09469 glnA glutamate--ammonia ligase. 469
11745 236532 PRK09470 cpxA envelope stress sensor histidine kinase CpxA. 461
11746 181886 PRK09471 oppB oligopeptide ABC transporter permease OppB. 306
11747 181887 PRK09472 ftsA cell division protein FtsA; Reviewed 420
11748 181888 PRK09473 oppD oligopeptide transporter ATP-binding component; Provisional 330
11749 236533 PRK09474 malE maltose/maltodextrin ABC transporter substrate-binding protein MalE. 396
11750 236534 PRK09476 napG quinol dehydrogenase periplasmic component; Provisional 254
11751 236535 PRK09477 napH quinol dehydrogenase membrane component; Provisional 271
11752 181892 PRK09478 mglC galactose/methyl galactoside ABC transporter permease MglC. 336
11753 236536 PRK09479 glpX fructose 1,6-bisphosphatase II; Reviewed 319
11754 181894 PRK09480 slmA division inhibitor protein; Provisional 194
11755 236537 PRK09481 sspA stringent starvation protein A; Provisional 211
11756 181896 PRK09482 PRK09482 flap endonuclease-like protein; Provisional 256
11757 236538 PRK09483 PRK09483 response regulator; Provisional 217
11758 181898 PRK09484 PRK09484 3-deoxy-manno-octulosonate-8-phosphatase KdsC. 183
11759 181899 PRK09485 mmuM homocysteine methyltransferase; Provisional 304
11760 181900 PRK09487 sdhC succinate dehydrogenase cytochrome b556 subunit. 129
11761 181901 PRK09488 sdhD succinate dehydrogenase membrane anchor subunit. 115
11762 181902 PRK09489 rsmC 16S rRNA (guanine(1207)-N(2))-methyltransferase RsmC. 342
11763 236539 PRK09490 metH B12-dependent methionine synthase; Provisional 1229
11764 181904 PRK09491 rimI ribosomal-protein-alanine N-acetyltransferase; Provisional 146
11765 181905 PRK09492 treR HTH-type transcriptional regulator TreR. 315
11766 181906 PRK09493 glnQ glutamine ABC transporter ATP-binding protein GlnQ. 240
11767 181907 PRK09494 glnP glutamine ABC transporter permease protein; Reviewed 219
11768 236540 PRK09495 glnH glutamine ABC transporter periplasmic protein; Reviewed 247
11769 236541 PRK09496 trkA Trk system potassium transporter TrkA. 453
11770 181910 PRK09497 potB spermidine/putrescine ABC transporter membrane protein; Reviewed 285
11771 181911 PRK09498 sifA type III secretion system effector SifA. 336
11772 137339 PRK09499 sifB type III secretion system effector SifB. 316
11773 236542 PRK09500 potC spermidine/putrescine ABC transporter membrane protein; Reviewed 256
11774 181913 PRK09501 potD spermidine/putrescine ABC transporter periplasmic substrate-binding protein; Reviewed 348
11775 181914 PRK09502 iscA iron-sulfur cluster assembly protein IscA. 107
11776 181915 PRK09504 sufA iron-sulfur cluster assembly scaffold protein; Provisional 122
11777 236543 PRK09505 malS alpha-amylase; Reviewed 683
11778 236544 PRK09506 mrcB bifunctional glycosyl transferase/transpeptidase; Reviewed 830
11779 169931 PRK09507 cspE cold shock-like protein CspE. 69
11780 181918 PRK09508 leuO leucine transcriptional activator; Reviewed 314
11781 181919 PRK09509 fieF CDF family cation-efflux pump FieF. 299
11782 236545 PRK09510 tolA cell envelope integrity inner membrane protein TolA; Provisional 387
11783 181921 PRK09511 nirD nitrite reductase small subunit NirD. 108
11784 181922 PRK09512 rbsC ribose ABC transporter permease protein; Reviewed 320
11785 181923 PRK09513 fruK 1-phosphofructokinase; Provisional 312
11786 181924 PRK09514 zntR Zn(2+)-responsive transcriptional regulator. 140
11787 169939 PRK09517 PRK09517 multifunctional thiamine-phosphate pyrophosphorylase/synthase/phosphomethylpyrimidine kinase; Provisional 755
11788 236546 PRK09518 PRK09518 bifunctional cytidylate kinase/GTPase Der; Reviewed 712
11789 77219 PRK09519 recA intein-containing recombinase RecA. 790
11790 236547 PRK09521 PRK09521 exosome complex RNA-binding protein Csl4; Provisional 189
11791 181927 PRK09522 PRK09522 bifunctional anthranilate synthase glutamate amidotransferase component TrpG/anthranilate phosphoribosyltransferase TrpD. 531
11792 236548 PRK09525 lacZ beta-galactosidase. 1027
11793 181929 PRK09526 lacI lac repressor; Reviewed 342
11794 181930 PRK09527 lacA galactoside O-acetyltransferase; Reviewed 203
11795 236549 PRK09528 lacY galactoside permease; Reviewed 420
11796 236550 PRK09529 PRK09529 bifunctional acetyl-CoA decarbonylase/synthase complex subunit alpha/beta; Reviewed 711
11797 181933 PRK09532 PRK09532 DNA polymerase III subunit alpha; Reviewed 874
11798 236551 PRK09533 PRK09533 bifunctional transaldolase/phosphoglucose isomerase; Validated 948
11799 236552 PRK09534 btuF corrinoid ABC transporter substrate-binding protein; Reviewed 359
11800 236553 PRK09535 btuC cobalamin ABC transporter permease BtuC. 366
11801 236554 PRK09536 btuD corrinoid ABC transporter ATPase; Reviewed 402
11802 236555 PRK09537 pylS pyrrolysine--tRNA(Pyl) ligase. 417
11803 236556 PRK09539 PRK09539 tRNA-splicing endonuclease subunit beta; Reviewed 124
11804 137367 PRK09541 emrE EmrE family multidrug efflux SMR transporter. 110
11805 236557 PRK09542 manB phosphomannomutase/phosphoglucomutase; Reviewed 445
11806 181938 PRK09543 znuB zinc ABC transporter permease subunit ZnuB. 261
11807 181939 PRK09544 znuC high-affinity zinc transporter ATPase; Reviewed 251
11808 236558 PRK09545 znuA zinc ABC transporter substrate-binding protein ZnuA. 311
11809 181941 PRK09546 zntB zinc transporter ZntB. 324
11810 181942 PRK09547 nhaB sodium/proton antiporter NhaB. 513
11811 236559 PRK09548 PRK09548 PTS ascorbate-specific subunit IIBC. 602
11812 236560 PRK09549 mtnW 2,3-diketo-5-methylthiopentyl-1-phosphate enolase; Reviewed 407
11813 236561 PRK09550 mtnK methylthioribose kinase; Reviewed 401
11814 236562 PRK09552 mtnX 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase; Reviewed 219
11815 181947 PRK09553 tauD taurine dioxygenase; Reviewed 277
11816 236563 PRK09554 feoB Fe(2+) transporter permease subunit FeoB. 772
11817 181949 PRK09555 feoA ferrous iron transporter A. 74
11818 236564 PRK09556 uhpT hexose-6-phosphate:phosphate antiporter. 467
11819 236565 PRK09557 PRK09557 fructokinase; Reviewed 301
11820 236566 PRK09558 ushA bifunctional UDP-sugar hydrolase/5'-nucleotidase periplasmic precursor; Reviewed 551
11821 236567 PRK09559 PRK09559 putative global regulator; Reviewed 327
11822 236568 PRK09560 nhaA pH-dependent sodium/proton antiporter; Reviewed 389
11823 181955 PRK09561 nhaA sodium/proton antiporter NhaA. 388
11824 236569 PRK09562 mazG nucleoside triphosphate pyrophosphohydrolase; Reviewed 262
11825 236570 PRK09563 rbgA GTPase YlqF; Reviewed 287
11826 181958 PRK09564 PRK09564 coenzyme A disulfide reductase; Reviewed 444
11827 236571 PRK09565 PRK09565 heme-binding protein. 533
11828 236572 PRK09566 nirA ferredoxin-nitrite reductase; Reviewed 513
11829 236573 PRK09567 nirA NirA family protein. 593
11830 236574 PRK09568 PRK09568 DNA primase regulatory subunit PriL. 306
11831 181961 PRK09569 PRK09569 citrate (Si)-synthase. 437
11832 236575 PRK09570 rpoH DNA-directed RNA polymerase subunit H; Reviewed 79
11833 181963 PRK09573 PRK09573 (S)-2,3-di-O-geranylgeranylglyceryl phosphate synthase; Reviewed 279
11834 236576 PRK09575 vmrA MATE family efflux transporter. 453
11835 169981 PRK09577 PRK09577 multidrug efflux RND transporter permease subunit. 1032
11836 169982 PRK09578 PRK09578 MexX/AxyX family multidrug efflux RND transporter periplasmic adaptor subunit. 385
11837 169983 PRK09579 PRK09579 multidrug efflux RND transporter permease subunit. 1017
11838 181965 PRK09580 sufC cysteine desulfurase ATPase component; Reviewed 248
11839 236577 PRK09581 pleD response regulator PleD; Reviewed 457
11840 181967 PRK09582 chaB putative cation transport regulator ChaB. 76
11841 236578 PRK09583 PRK09583 mycothiol-dependent maleylpyruvate isomerase; Reviewed 241
11842 181969 PRK09584 tppB dipeptide/tripeptide permease DtpA. 500
11843 236579 PRK09585 anmK anhydro-N-acetylmuramic acid kinase; Reviewed 365
11844 181971 PRK09586 murP PTS system N-acetylmuramic acid transporter subunits EIIBC; Reviewed 476
11845 181972 PRK09588 PRK09588 hypothetical protein; Reviewed 376
11846 181973 PRK09589 celA 6-phospho-beta-glucosidase; Reviewed 476
11847 181974 PRK09590 celB PTS cellobiose transporter subunit IIB. 104
11848 181975 PRK09591 celC PTS cellobiose transporter subunit IIA. 104
11849 181976 PRK09592 celD PTS cellobiose transporter subunit IIC. 449
11850 236580 PRK09593 arb 6-phospho-beta-glucosidase; Reviewed 478
11851 181978 PRK09597 PRK09597 lipid A 1-phosphatase LpxE. 190
11852 236581 PRK09598 PRK09598 phosphoethanolamine--lipid A transferase EptA. 522
11853 236582 PRK09599 PRK09599 NADP-dependent phosphogluconate dehydrogenase. 301
11854 236583 PRK09601 PRK09601 redox-regulated ATPase YchF. 364
11855 236584 PRK09602 PRK09602 translation-associated GTPase; Reviewed 396
11856 181983 PRK09603 PRK09603 DNA-directed RNA polymerase subunit beta/beta'. 2890
11857 236585 PRK09604 PRK09604 tRNA (adenosine(37)-N6)-threonylcarbamoyltransferase complex transferase subunit TsaD. 332
11858 236586 PRK09605 PRK09605 bifunctional N(6)-L-threonylcarbamoyladenine synthase/serine/threonine protein kinase. 535
11859 236587 PRK09606 PRK09606 DNA-directed RNA polymerase subunit B''; Validated 494
11860 236588 PRK09607 rps11p 30S ribosomal protein S11P; Reviewed 132
11861 181988 PRK09609 PRK09609 hypothetical protein; Provisional 312
11862 236589 PRK09612 rpl2p 50S ribosomal protein L2P; Validated 238
11863 236590 PRK09613 thiH thiamine biosynthesis protein ThiH; Reviewed 469
11864 236591 PRK09614 nrdF ribonucleotide-diphosphate reductase subunit beta; Reviewed 324
11865 181992 PRK09615 ggt gamma-glutamyltranspeptidase; Reviewed 581
11866 236592 PRK09616 pheT phenylalanine--tRNA ligase subunit beta. 552
11867 236593 PRK09617 PRK09617 type III secretion system protein; Reviewed 243
11868 236594 PRK09618 flgD flagellar hook assembly protein FlgD. 142
11869 181996 PRK09619 flgD flagellar hook assembly protein FlgD. 218
11870 181997 PRK09620 PRK09620 hypothetical protein; Provisional 229
11871 236595 PRK09621 PRK09621 ATP synthase subunit C. 141
11872 181999 PRK09622 porA 2-oxoacid:ferredoxin oxidoreductase subunit alpha. 407
11873 170016 PRK09623 vorD 3-methyl-2-oxobutanoate dehydrogenase subunit delta. 105
11874 170017 PRK09624 porD pyruvate ferredoxin oxidoreductase subunit delta; Reviewed 105
11875 236596 PRK09625 porD pyruvate flavodoxin oxidoreductase subunit delta; Reviewed 133
11876 236597 PRK09626 oorD 2-oxoglutarate-acceptor oxidoreductase subunit OorD; Reviewed 103
11877 182002 PRK09627 oorA 2-oxoglutarate synthase subunit alpha. 375
11878 182003 PRK09628 oorB 2-oxoglutarate ferredoxin oxidoreductase subunit beta. 277
11879 104071 PRK09629 PRK09629 bifunctional thiosulfate sulfurtransferase/phosphatidylserine decarboxylase; Provisional 610
11880 170022 PRK09630 PRK09630 DNA topoisomerase IV subunit A; Provisional 479
11881 236598 PRK09631 PRK09631 DNA topoisomerase IV subunit A; Provisional 635
11882 236599 PRK09632 PRK09632 ATP-dependent DNA ligase; Reviewed 764
11883 182006 PRK09633 ligD DNA ligase D. 610
11884 182007 PRK09634 nusB transcription antitermination protein NusB; Provisional 207
11885 182008 PRK09635 sigI RNA polymerase sigma factor SigI; Provisional 290
11886 236600 PRK09636 PRK09636 RNA polymerase sigma factor SigJ; Provisional 293
11887 236601 PRK09637 PRK09637 RNA polymerase sigma factor SigZ; Provisional 181
11888 182010 PRK09638 PRK09638 RNA polymerase sigma factor SigY; Reviewed 176
11889 236602 PRK09639 PRK09639 RNA polymerase sigma factor SigX; Provisional 166
11890 236603 PRK09640 PRK09640 RNA polymerase sigma factor SigX; Reviewed 188
11891 182012 PRK09641 PRK09641 RNA polymerase sigma factor SigW; Provisional 187
11892 170031 PRK09642 PRK09642 RNA polymerase sigma factor. 160
11893 236604 PRK09643 PRK09643 RNA polymerase sigma factor SigM; Reviewed 192
11894 170033 PRK09644 PRK09644 RNA polymerase sigma factor SigM; Provisional 165
11895 236605 PRK09645 PRK09645 ECF RNA polymerase sigma factor SigL. 173
11896 182015 PRK09646 PRK09646 ECF RNA polymerase sigma factor SigK. 194
11897 236606 PRK09647 PRK09647 RNA polymerase sigma factor SigE; Reviewed 203
11898 236607 PRK09648 PRK09648 RNA polymerase sigma factor ShbA. 189
11899 137458 PRK09649 PRK09649 RNA polymerase sigma factor SigC; Reviewed 185
11900 182018 PRK09651 PRK09651 RNA polymerase sigma factor FecI; Provisional 172
11901 236608 PRK09652 PRK09652 RNA polymerase sigma factor RpoE; Provisional 182
11902 236609 PRK09653 eutD phosphotransacetylase. 324
11903 182021 PRK09662 PRK09662 GspL-like protein; Provisional 286
11904 182022 PRK09664 PRK09664 low affinity tryptophan permease TnaB. 415
11905 182023 PRK09665 PRK09665 PTS galactitol transporter subunit IIA. 150
11906 236610 PRK09669 PRK09669 putative symporter YagG; Provisional 444
11907 236611 PRK09672 PRK09672 phage exclusion protein Lit; Provisional 305
11908 182026 PRK09674 PRK09674 enoyl-CoA hydratase-isomerase; Provisional 255
11909 137467 PRK09677 PRK09677 putative lipopolysaccharide biosynthesis O-acetyl transferase WbbJ; Provisional 192
11910 137468 PRK09678 PRK09678 DNA-binding transcriptional regulator; Provisional 72
11911 182027 PRK09681 PRK09681 putative type II secretion protein GspC; Provisional 276
11912 236612 PRK09685 PRK09685 DNA-binding transcriptional activator FeaR; Provisional 302
11913 170047 PRK09687 PRK09687 putative lyase; Provisional 280
11914 182029 PRK09689 PRK09689 prophage protein NinE; Provisional 56
11915 170049 PRK09692 PRK09692 integrase; Provisional 413
11916 236613 PRK09693 PRK09693 Cascade antiviral complex protein; Validated 489
11917 182031 PRK09694 PRK09694 CRISPR-associated helicase/endonuclease Cas3. 878
11918 182032 PRK09695 PRK09695 glycolate permease GlcA. 560
11919 182033 PRK09697 PRK09697 putative general secretion pathway protein GspB. 139
11920 182034 PRK09698 PRK09698 D-allose kinase; Provisional 302
11921 182035 PRK09699 PRK09699 D-allose ABC transporter permease. 312
11922 182036 PRK09700 PRK09700 D-allose ABC transporter ATP-binding protein AlsA. 510
11923 182037 PRK09701 PRK09701 D-allose transporter substrate-binding protein. 311
11924 77355 PRK09702 PRK09702 PTS sugar transporter subunit IIB. 161
11925 236614 PRK09705 cynX putative cyanate transporter; Provisional 393
11926 182039 PRK09706 PRK09706 transcriptional repressor DicA; Reviewed 135
11927 182040 PRK09707 PRK09707 putative lipoprotein; Provisional 1343
11928 236615 PRK09709 PRK09709 exodeoxyribonuclease VIII. 877
11929 182042 PRK09710 lar type I toxin-antitoxin system endodeoxyribonuclease toxin RalR. 64
11930 137485 PRK09713 focB formate transporter. 282
11931 182043 PRK09716 PRK09716 YhaC family protein. 395
11932 182044 PRK09717 PRK09717 stationary phase growth adaptation protein; Provisional 179
11933 182045 PRK09718 PRK09718 SopA family protein. 512
11934 182046 PRK09719 PRK09719 hypothetical protein; Provisional 89
11935 182047 PRK09720 cybC cytochrome b562; Provisional 100
11936 236616 PRK09722 PRK09722 allulose-6-phosphate 3-epimerase; Provisional 229
11937 236617 PRK09723 PRK09723 fimbrial-like adhesin. 421
11938 182049 PRK09726 PRK09726 type II toxin-antitoxin system antitoxin HipB. 88
11939 137493 PRK09727 PRK09727 his operon leader peptide; Provisional 16
11940 182050 PRK09729 PRK09729 hypothetical protein; Provisional 68
11941 182051 PRK09730 PRK09730 SDR family oxidoreductase. 247
11942 182052 PRK09731 PRK09731 type II secretion system protein. 178
11943 170072 PRK09732 PRK09732 hypothetical protein; Provisional 134
11944 182053 PRK09733 PRK09733 fimbrial-like protein. 181
11945 236618 PRK09736 PRK09736 5-methylcytosine-specific restriction enzyme subunit McrC; Provisional 352
11946 236619 PRK09737 PRK09737 type I restriction-modification system specificity subunit. 461
11947 182055 PRK09738 PRK09738 small toxic polypeptide; Provisional 52
11948 236620 PRK09739 PRK09739 NAD(P)H oxidoreductase. 199
11949 182057 PRK09741 PRK09741 hypothetical protein; Provisional 148
11950 137503 PRK09744 PRK09744 DNA-binding transcriptional regulator DicC; Provisional 75
11951 182058 PRK09750 PRK09750 hypothetical protein; Provisional 64
11952 137505 PRK09751 PRK09751 putative ATP-dependent helicase Lhr; Provisional 1490
11953 182059 PRK09752 PRK09752 AIDA-I family autotransporter YfaL. 1250
11954 170080 PRK09754 PRK09754 phenylpropionate dioxygenase ferredoxin reductase subunit; Provisional 396
11955 182060 PRK09755 PRK09755 ABC transporter substrate-binding protein. 535
11956 182061 PRK09756 PRK09756 PTS N-acetylgalactosamine transporter subunit IIB. 158
11957 236621 PRK09757 PRK09757 PTS N-acetylgalactosamine transporter subunit IIC. 267
11958 182063 PRK09759 PRK09759 type I toxin-antitoxin system toxin HokA. 50
11959 182064 PRK09762 PRK09762 galactosamine-6-phosphate isomerase; Provisional 232
11960 182065 PRK09764 PRK09764 GntR family transcriptional regulator. 240
11961 182066 PRK09765 PRK09765 PTS system 2-O-a-mannosyl-D-glycerate specific transporter subunit IIABC; Provisional 631
11962 182067 PRK09767 PRK09767 DUF559 domain-containing protein. 117
11963 170086 PRK09772 PRK09772 transcriptional antiterminator BglG; Provisional 278
11964 236622 PRK09774 PRK09774 fec operon regulator FecR; Reviewed 319
11965 236623 PRK09775 PRK09775 type II toxin-antitoxin system HipA family toxin YjjJ. 442
11966 182070 PRK09776 PRK09776 putative diguanylate cyclase; Provisional 1092
11967 182071 PRK09777 fecD Fe(3+) dicitrate ABC transporter permease subunit FecD. 318
11968 170091 PRK09778 PRK09778 type I toxin-antitoxin system antitoxin YafN. 97
11969 170092 PRK09781 PRK09781 hypothetical protein; Provisional 181
11970 236624 PRK09782 PRK09782 bacteriophage N4 receptor, outer membrane subunit; Provisional 987
11971 236625 PRK09783 PRK09783 copper/silver efflux system membrane fusion protein CusB; Provisional 409
11972 182074 PRK09784 PRK09784 YccE family protein. 417
11973 182075 PRK09786 PRK09786 endodeoxyribonuclease RUS; Reviewed 120
11974 182076 PRK09790 PRK09790 hypothetical protein; Reviewed 91
11975 182077 PRK09791 PRK09791 LysR family transcriptional regulator. 302
11976 182078 PRK09792 PRK09792 4-aminobutyrate transaminase; Provisional 421
11977 182079 PRK09793 PRK09793 methyl-accepting chemotaxis protein IV. 533
11978 182080 PRK09795 PRK09795 aminopeptidase; Provisional 361
11979 182081 PRK09796 PRK09796 PTS system cellobiose/arbutin/salicin-specific transporter subunits IIBC; Provisional 472
11980 182082 PRK09798 PRK09798 MazF-MazE toxin-antitoxin system antitoxin MazE. 82
11981 182083 PRK09799 PRK09799 putative oxidoreductase; Provisional 258
11982 182084 PRK09800 PRK09800 putative hypoxanthine oxidase; Provisional 956
11983 182085 PRK09801 PRK09801 LysR family transcriptional regulator. 310
11984 182086 PRK09802 PRK09802 DeoR family transcriptional regulator. 269
11985 182087 PRK09804 PRK09804 C4-dicarboxylate transporter DcuC. 455
11986 77417 PRK09806 PRK09806 tryptophanase leader peptide; Provisional 24
11987 182088 PRK09807 PRK09807 hypothetical protein; Provisional 161
11988 137533 PRK09810 PRK09810 lipoprotein antitoxin entericidin A. 41
11989 182089 PRK09812 PRK09812 type II toxin-antitoxin system ChpB family toxin. 116
11990 182090 PRK09813 PRK09813 fructoselysine 6-kinase; Provisional 260
11991 236626 PRK09814 PRK09814 sugar transferase. 333
11992 77423 PRK09816 thrL thr operon leader peptide; Provisional 21
11993 182092 PRK09818 PRK09818 kinase inhibitor. 183
11994 182093 PRK09819 PRK09819 mannosylglycerate hydrolase. 875
11995 182094 PRK09821 PRK09821 putative transporter; Provisional 454
11996 182095 PRK09822 PRK09822 lipopolysaccharide core biosynthesis protein; Provisional 269
11997 170114 PRK09823 PRK09823 putative inner membrane protein; Provisional 160
11998 236627 PRK09824 PRK09824 PTS system beta-glucoside-specific transporter subunits IIABC; Provisional 627
11999 182097 PRK09825 idnK gluconokinase. 176
12000 182098 PRK09828 PRK09828 putative fimbrial outer membrane usher protein; Provisional 865
12001 182099 PRK09831 PRK09831 GNAT family N-acetyltransferase. 147
12002 182100 PRK09834 PRK09834 DNA-binding transcriptional regulator. 263
12003 182101 PRK09835 PRK09835 Cu(+)/Ag(+) sensor histidine kinase. 482
12004 182102 PRK09836 PRK09836 DNA-binding transcriptional activator CusR; Provisional 227
12005 182103 PRK09837 PRK09837 Cu(I)/Ag(I) efflux RND transporter outer membrane protein. 461
12006 182104 PRK09838 PRK09838 periplasmic copper-binding protein; Provisional 115
12007 182105 PRK09840 PRK09840 catecholate siderophore receptor Fiu; Provisional 761
12008 182106 PRK09841 PRK09841 tyrosine-protein kinase. 726
12009 236628 PRK09846 recT recombination protein RecT. 266
12010 182108 PRK09847 PRK09847 gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase; Provisional 494
12011 182109 PRK09848 PRK09848 glucuronide transporter; Provisional 448
12012 236629 PRK09849 PRK09849 putative oxidoreductase; Provisional 702
12013 182111 PRK09850 PRK09850 pseudouridine kinase; Provisional 313
12014 182112 PRK09852 PRK09852 cryptic 6-phospho-beta-glucosidase; Provisional 474
12015 236630 PRK09853 PRK09853 putative selenate reductase subunit YgfK; Provisional 1019
12016 182114 PRK09854 cmtB PTS mannitol transporter subunit IIA. 147
12017 182115 PRK09855 PRK09855 PTS N-acetylgalactosamine transporter subunit IID. 263
12018 182116 PRK09856 PRK09856 fructoselysine 3-epimerase; Provisional 275
12019 182117 PRK09857 PRK09857 recombination-promoting nuclease RpnA. 292
12020 137559 PRK09859 PRK09859 multidrug transporter subunit MdtE. 385
12021 182118 PRK09860 PRK09860 putative alcohol dehydrogenase; Provisional 383
12022 182119 PRK09861 PRK09861 lipoprotein NlpA. 272
12023 182120 PRK09862 PRK09862 ATP-dependent protease. 506
12024 182121 PRK09863 PRK09863 putative frv operon regulatory protein; Provisional 584
12025 182122 PRK09864 PRK09864 aminopeptidase. 356
12026 182123 PRK09866 PRK09866 clamp-binding protein CrfC. 741
12027 182124 PRK09867 PRK09867 hypothetical protein; Provisional 209
12028 182125 PRK09870 PRK09870 tyrosine recombinase; Provisional 200
12029 182126 PRK09871 PRK09871 tyrosine recombinase; Provisional 198
12030 182127 PRK09874 PRK09874 multidrug efflux MFS transporter MdtG. 408
12031 182128 PRK09875 PRK09875 phosphotriesterase-related protein. 292
12032 182129 PRK09877 PRK09877 2,3-diketo-L-gulonate TRAP transporter small permease protein YiaM; Provisional 157
12033 182130 PRK09880 PRK09880 L-idonate 5-dehydrogenase; Provisional 343
12034 182131 PRK09881 PRK09881 D,D-dipeptide ABC transporter permease. 296
12035 236631 PRK09885 PRK09885 type II toxin-antitoxin system YafO family toxin. 132
12036 77467 PRK09890 PRK09890 cold shock protein CspG; Provisional 70
12037 170147 PRK09891 PRK09891 protein YmcE. 76
12038 182133 PRK09894 PRK09894 diguanylate cyclase; Provisional 296
12039 182134 PRK09897 PRK09897 FAD-NAD(P)-binding protein. 534
12040 182135 PRK09898 PRK09898 ferredoxin-like protein. 208
12041 182136 PRK09902 PRK09902 lipopolysaccharide kinase InaA. 216
12042 104216 PRK09903 PRK09903 transporter YfdV. 314
12043 182137 PRK09906 PRK09906 DNA-binding transcriptional regulator HcaR; Provisional 296
12044 182138 PRK09907 PRK09907 endoribonuclease MazF. 111
12045 182139 PRK09908 PRK09908 xanthine dehydrogenase iron sulfur-binding subunit XdhC. 159
12046 182140 PRK09912 PRK09912 L-glyceraldehyde 3-phosphate reductase; Provisional 346
12047 182141 PRK09913 PRK09913 PTS fructose transporter subunit IIA. 148
12048 182142 PRK09915 PRK09915 MdtP family multidrug efflux transporter outer membrane subunit. 488
12049 182143 PRK09917 PRK09917 threonine/serine exporter. 157
12050 236632 PRK09918 PRK09918 putative fimbrial chaperone protein; Provisional 230
12051 236633 PRK09919 PRK09919 anti-adapter protein IraM; Provisional 114
12052 182146 PRK09920 PRK09920 acetyl-CoA:acetoacetyl-CoA transferase subunit alpha; Provisional 219
12053 182147 PRK09921 PRK09921 permease DsdX; Provisional 445
12054 182148 PRK09922 PRK09922 lipopolysaccharide 1,6-galactosyltransferase. 359
12055 137592 PRK09925 PRK09925 leu operon leader peptide; Provisional 28
12056 236634 PRK09926 PRK09926 fimbrial chaperone. 246
12057 236635 PRK09928 PRK09928 choline transport protein BetT; Provisional 679
12058 182151 PRK09929 PRK09929 hypothetical protein; Provisional 91
12059 182152 PRK09932 PRK09932 glycerate 3-kinase. 381
12060 182153 PRK09934 PRK09934 fimbriae assembly protein. 171
12061 182154 PRK09935 PRK09935 fimbriae biosynthesis transcriptional regulator FimZ. 210
12062 182155 PRK09936 PRK09936 DUF4434 family protein. 296
12063 77494 PRK09937 PRK09937 cold shock-like protein CspD. 74
12064 182156 PRK09939 PRK09939 acid resistance putative oxidoreductase YdeP. 759
12065 182157 PRK09940 PRK09940 transcriptional regulator YdeO; Provisional 253
12066 182158 PRK09943 PRK09943 HTH-type transcriptional regulator PuuR. 185
12067 137602 PRK09945 PRK09945 hypothetical protein; Provisional 418
12068 182159 PRK09946 PRK09946 hypothetical protein; Provisional 270
12069 182160 PRK09947 PRK09947 YdhW family putative oxidoreductase system protein. 215
12070 236636 PRK09950 PRK09950 putative transporter; Provisional 506
12071 182162 PRK09951 PRK09951 hypothetical protein; Provisional 222
12072 182163 PRK09952 PRK09952 shikimate transporter; Provisional 438
12073 182164 PRK09953 wcaD putative colanic acid biosynthesis protein; Provisional 404
12074 182165 PRK09954 PRK09954 sugar kinase. 362
12075 182166 PRK09955 rihB ribosylpyrimidine nucleosidase. 313
12076 182167 PRK09956 PRK09956 ISNCY family transposase. 308
12077 182168 PRK09958 PRK09958 acid-sensing system DNA-binding response regulator EvgA. 204
12078 182169 PRK09959 PRK09959 acid-sensing system histidine kinase EvgS. 1197
12079 182170 PRK09961 PRK09961 aminopeptidase. 344
12080 170182 PRK09965 PRK09965 3-phenylpropionate dioxygenase ferredoxin subunit; Provisional 106
12081 182171 PRK09966 PRK09966 diguanylate cyclase DgcN. 407
12082 182172 PRK09967 PRK09967 OmpA family protein. 160
12083 182173 PRK09968 PRK09968 protein-serine/threonine phosphatase. 218
12084 236637 PRK09970 PRK09970 xanthine dehydrogenase subunit XdhA; Provisional 759
12085 182175 PRK09971 PRK09971 xanthine dehydrogenase subunit XdhB; Provisional 291
12086 170188 PRK09973 PRK09973 lipoprotein YqhH. 85
12087 236638 PRK09974 PRK09974 type II toxin-antitoxin system PrlF family antitoxin. 111
12088 182177 PRK09975 PRK09975 DNA-binding transcriptional regulator EnvR; Provisional 213
12089 182178 PRK09977 PRK09977 MgtC/SapB family protein. 215
12090 137624 PRK09978 PRK09978 DNA-binding transcriptional regulator GadX; Provisional 274
12091 77522 PRK09979 PRK09979 rho operon leader peptide rhoL. 33
12092 182179 PRK09980 ompL porin OmpL. 230
12093 182180 PRK09981 PRK09981 DUF406 domain-containing protein. 99
12094 137627 PRK09982 PRK09982 universal stress protein UspD; Provisional 142
12095 182181 PRK09983 pflD putative formate acetyltransferase 2; Provisional 765
12096 182182 PRK09984 PRK09984 phosphonate ABC transporter ATP-binding protein. 262
12097 182183 PRK09986 PRK09986 LysR family transcriptional regulator. 294
12098 182184 PRK09987 PRK09987 dTDP-4-dehydrorhamnose reductase; Provisional 299
12099 182185 PRK09989 PRK09989 HPr family phosphocarrier protein. 258
12100 182186 PRK09990 PRK09990 DNA-binding transcriptional regulator GlcC; Provisional 251
12101 182187 PRK09993 PRK09993 C-lysozyme inhibitor; Provisional 153
12102 182188 PRK09997 PRK09997 hydroxypyruvate isomerase; Provisional 258
12103 182189 PRK10001 PRK10001 serine-type D-Ala-D-Ala carboxypeptidase. 400
12104 236639 PRK10002 PRK10002 porin OmpF. 362
12105 236640 PRK10003 PRK10003 ferric-rhodotorulic acid outer membrane transporter; Provisional 729
12106 182192 PRK10005 PRK10005 dihydroxyacetone kinase ADP-binding subunit DhaL. 210
12107 182193 PRK10014 PRK10014 DNA-binding transcriptional repressor MalI; Provisional 342
12108 182194 PRK10015 PRK10015 oxidoreductase; Provisional 429
12109 182195 PRK10016 PRK10016 DNA gyrase inhibitor SbmC. 156
12110 182196 PRK10017 PRK10017 colanic acid biosynthesis protein; Provisional 426
12111 182197 PRK10018 PRK10018 colanic acid biosynthesis glycosyltransferase WcaA. 279
12112 236641 PRK10019 PRK10019 nickel/cobalt efflux transporter RcnA. 279
12113 182199 PRK10022 PRK10022 putative DNA-binding transcriptional regulator; Provisional 167
12114 182200 PRK10026 PRK10026 arsenate reductase (glutaredoxin). 141
12115 182201 PRK10027 PRK10027 cryptic adenine deaminase; Provisional 588
12116 236642 PRK10030 PRK10030 YiiX family permuted papain-like enzyme. 197
12117 182203 PRK10034 PRK10034 gluconate transporter GntP. 447
12118 182204 PRK10037 PRK10037 cellulose biosynthesis protein BcsQ. 250
12119 170217 PRK10039 PRK10039 hypothetical protein; Provisional 127
12120 182205 PRK10040 PRK10040 hypothetical protein; Provisional 52
12121 236643 PRK10044 PRK10044 ferrichrome outer membrane transporter; Provisional 727
12122 182207 PRK10045 PRK10045 ACP phosphodiesterase. 193
12123 182208 PRK10046 dpiA two-component response regulator DpiA; Provisional 225
12124 236644 PRK10049 pgaA outer membrane protein PgaA; Provisional 765
12125 182210 PRK10050 PRK10050 curli production assembly/transport protein CsgF. 138
12126 182211 PRK10051 csgA major curlin subunit CsgA. 151
12127 182212 PRK10053 PRK10053 YdeI family stress tolerance OB fold protein. 130
12128 182213 PRK10054 PRK10054 efflux MFS transporter YdeE. 395
12129 182214 PRK10057 rpsV stationary-phase-induced ribosome-associated protein. 44
12130 236645 PRK10060 PRK10060 cyclic di-GMP phosphodiesterase. 663
12131 182216 PRK10061 PRK10061 DNA damage-inducible protein YebG; Provisional 96
12132 182217 PRK10062 PRK10062 hypothetical protein; Provisional 303
12133 182218 PRK10063 PRK10063 colanic acid biosynthesis glycosyltransferase WcaE. 248
12134 236646 PRK10064 PRK10064 catecholate siderophore receptor CirA; Provisional 663
12135 236647 PRK10069 PRK10069 3-phenylpropionate/cinnamic acid dioxygenase subunit beta. 183
12136 182221 PRK10070 PRK10070 proline/glycine betaine ABC transporter ATP-binding protein ProV. 400
12137 182222 PRK10072 PRK10072 HTH-type transcriptional regulator. 96
12138 182223 PRK10073 PRK10073 putative glycosyl transferase; Provisional 328
12139 182224 PRK10076 PRK10076 pyruvate formate lyase II activase; Provisional 213
12140 182225 PRK10077 xylE D-xylose transporter XylE; Provisional 479
12141 236648 PRK10078 PRK10078 ribose 1,5-bisphosphokinase; Provisional 186
12142 182227 PRK10079 PRK10079 phosphonate metabolism transcriptional regulator PhnF; Provisional 241
12143 170240 PRK10081 PRK10081 lipoprotein toxin entericidin B. 48
12144 182228 PRK10082 PRK10082 hypochlorite stress DNA-binding transcriptional regulator HypT. 303
12145 182229 PRK10083 PRK10083 putative oxidoreductase; Provisional 339
12146 236649 PRK10084 PRK10084 dTDP-glucose 4,6 dehydratase; Provisional 352
12147 182231 PRK10086 PRK10086 DNA-binding transcriptional regulator DsdC. 311
12148 182232 PRK10089 PRK10089 chaperone CsaA. 112
12149 182233 PRK10090 PRK10090 aldehyde dehydrogenase A; Provisional 409
12150 182234 PRK10091 PRK10091 MFS transport protein AraJ; Provisional 382
12151 182235 PRK10092 PRK10092 maltose O-acetyltransferase; Provisional 183
12152 182236 PRK10093 PRK10093 primosomal replication protein N''; Provisional 171
12153 182237 PRK10094 PRK10094 HTH-type transcriptional activator AllS. 308
12154 236650 PRK10095 PRK10095 ribonuclease I; Provisional 268
12155 182239 PRK10096 citG triphosphoribosyl-dephospho-CoA synthase; Provisional 292
12156 182240 PRK10098 PRK10098 putative dehydrogenase; Provisional 350
12157 182241 PRK10100 PRK10100 transcriptional regulator CsgD. 216
12158 182242 PRK10101 csgB curlin minor subunit CsgB; Provisional 151
12159 182243 PRK10102 csgC curli assembly protein CsgC; Provisional 110
12160 236651 PRK10106 PRK10106 multiple antibiotic resistance protein MarB. 65
12161 182245 PRK10110 PRK10110 PTS maltose transporter subunit IICB. 530
12162 182246 PRK10113 PRK10113 cell division activator CedA. 80
12163 182247 PRK10115 PRK10115 protease 2; Provisional 686
12164 182248 PRK10116 PRK10116 universal stress protein UspC; Provisional 142
12165 182249 PRK10117 PRK10117 trehalose-6-phosphate synthase; Provisional 474
12166 236652 PRK10118 PRK10118 flagellar hook length control protein FliK. 408
12167 182251 PRK10119 PRK10119 putative hydrolase; Provisional 231
12168 182252 PRK10122 PRK10122 UTP--glucose-1-phosphate uridylyltransferase GalF. 297
12169 182253 PRK10123 wcaM putative colanic acid biosynthesis protein; Provisional 464
12170 182254 PRK10124 PRK10124 putative UDP-glucose lipid carrier transferase; Provisional 463
12171 182255 PRK10125 PRK10125 colanic acid biosynthesis glycosyltransferase WcaC. 405
12172 182256 PRK10126 PRK10126 low molecular weight protein-tyrosine-phosphatase Wzb. 147
12173 182257 PRK10128 PRK10128 2-keto-3-deoxy-L-rhamnonate aldolase; Provisional 267
12174 182258 PRK10130 PRK10130 HTH-type transcriptional regulator EutR. 350
12175 182259 PRK10132 PRK10132 hypothetical protein; Provisional 108
12176 182260 PRK10133 PRK10133 L-fucose:H+ symporter permease. 438
12177 236653 PRK10137 PRK10137 alpha-glucosidase; Provisional 786
12178 182262 PRK10139 PRK10139 serine endoprotease DegQ. 455
12179 182263 PRK10140 PRK10140 N-acetyltransferase. 162
12180 236654 PRK10141 PRK10141 DNA-binding transcriptional repressor ArsR; Provisional 117
12181 182265 PRK10144 PRK10144 formate-dependent nitrite reductase complex subunit NrfF; Provisional 126
12182 182266 PRK10146 PRK10146 aminoalkylphosphonate N-acetyltransferase. 144
12183 236655 PRK10147 phnH phosphonate C-P lyase system protein PhnH. 196
12184 236656 PRK10148 PRK10148 VOC family metalloprotein YjdN. 147
12185 236657 PRK10150 PRK10150 beta-D-glucuronidase; Provisional 604
12186 182270 PRK10151 PRK10151 50S ribosomal protein L7/L12-serine acetyltransferase. 179
12187 236658 PRK10153 PRK10153 DNA-binding transcriptional activator CadC; Provisional 517
12188 182272 PRK10154 PRK10154 DUF2541 family protein. 134
12189 182273 PRK10157 PRK10157 putative oxidoreductase FixC; Provisional 428
12190 236659 PRK10158 PRK10158 bifunctional tRNA pseudouridine(32) synthase/23S rRNA pseudouridine(746) synthase RluA. 219
12191 182275 PRK10159 PRK10159 phosphoporin PhoE. 351
12192 182276 PRK10160 PRK10160 taurine ABC transporter permease TauC. 275
12193 182277 PRK10161 PRK10161 phosphate response regulator transcription factor PhoB. 229
12194 236660 PRK10162 PRK10162 acetyl esterase. 318
12195 182279 PRK10163 PRK10163 HTH-type transcriptional repressor AllR. 271
12196 182280 PRK10167 PRK10167 hypothetical protein; Provisional 169
12197 182281 PRK10170 PRK10170 Ni/Fe-hydrogenase large subunit. 597
12198 182282 PRK10171 PRK10171 Ni/Fe-hydrogenase b-type cytochrome subunit. 235
12199 182283 PRK10172 PRK10172 AppA family phytase/histidine-type acid phosphatase. 436
12200 182284 PRK10173 PRK10173 glucose-1-phosphatase/inositol phosphatase; Provisional 413
12201 182285 PRK10174 PRK10174 hypothetical protein; Provisional 75
12202 182286 PRK10175 PRK10175 YceK/YidQ family lipoprotein. 75
12203 236661 PRK10177 PRK10177 YchO/YchP family invasin. 465
12204 236662 PRK10178 PRK10178 D-alanyl-D-alanine dipeptidase; Provisional 184
12205 182289 PRK10179 PRK10179 formate dehydrogenase-N subunit gamma; Provisional 217
12206 182290 PRK10183 PRK10183 hypothetical protein; Provisional 56
12207 182291 PRK10187 PRK10187 trehalose-6-phosphate phosphatase; Provisional 266
12208 182292 PRK10188 PRK10188 transcriptional regulator SdiA. 240
12209 182293 PRK10189 PRK10189 EmmdR/YeeO family multidrug/toxin efflux MATE transporter. 478
12210 182294 PRK10190 PRK10190 L,D-transpeptidase; Provisional 310
12211 182295 PRK10191 PRK10191 putative acyl transferase; Provisional 146
12212 182296 PRK10194 PRK10194 ferredoxin-type protein NapF. 163
12213 182297 PRK10197 PRK10197 GABA permease. 446
12214 182298 PRK10198 PRK10198 formate hydrogenlyase regulator HycA. 152
12215 182299 PRK10199 PRK10199 alkaline phosphatase isozyme conversion aminopeptidase; Provisional 346
12216 182300 PRK10200 PRK10200 putative racemase; Provisional 230
12217 182301 PRK10201 PRK10201 G/U mismatch-specific DNA glycosylase. 168
12218 182302 PRK10202 ebgC beta-galactosidase subunit beta. 149
12219 182303 PRK10203 PRK10203 hypothetical protein; Provisional 122
12220 182304 PRK10204 PRK10204 hypothetical protein; Provisional 55
12221 182305 PRK10206 PRK10206 putative oxidoreductase; Provisional 344
12222 182306 PRK10207 PRK10207 dipeptide/tripeptide permease DtpB. 489
12223 182307 PRK10208 PRK10208 acid-activated periplasmic chaperone HdeA. 114
12224 182308 PRK10209 PRK10209 HdeD family acid-resistance protein. 190
12225 182309 PRK10213 nepI purine ribonucleoside efflux pump NepI. 394
12226 182310 PRK10214 PRK10214 ilvB operon leader peptide IvbL. 32
12227 236663 PRK10215 PRK10215 hypothetical protein; Provisional 218
12228 182312 PRK10216 PRK10216 HTH-type transcriptional regulator YidZ. 319
12229 182313 PRK10217 PRK10217 dTDP-glucose 4,6-dehydratase; Provisional 355
12230 104396 PRK10218 PRK10218 translational GTPase TypA. 607
12231 182314 PRK10219 PRK10219 superoxide response transcriptional regulator SoxS. 107
12232 182315 PRK10220 PRK10220 phnA family protein. 111
12233 182316 PRK10222 PRK10222 PTS ascorbate transporter subunit IIB. 85
12234 236664 PRK10224 PRK10224 pyr operon leader peptide. 44
12235 182318 PRK10225 PRK10225 Uxu operon transcriptional regulator. 257
12236 182319 PRK10226 PRK10226 isoaspartyl peptidase; Provisional 313
12237 182320 PRK10227 PRK10227 HTH-type transcriptional regulator CueR. 135
12238 236665 PRK10229 PRK10229 threonine efflux system; Provisional 206
12239 182322 PRK10234 PRK10234 transcriptional regulator GutM. 118
12240 182323 PRK10236 PRK10236 acidic protein MsyB. 237
12241 182324 PRK10238 PRK10238 aromatic amino acid transporter AroP. 456
12242 182325 PRK10239 PRK10239 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. 159
12243 182326 PRK10240 PRK10240 (2E,6E)-farnesyl-diphosphate-specific ditrans,polycis-undecaprenyl-diphosphate synthase. 229
12244 182327 PRK10241 PRK10241 hydroxyacylglutathione hydrolase; Provisional 251
12245 236666 PRK10244 PRK10244 anti-adapter protein IraP. 88
12246 182329 PRK10245 adrA diguanylate cyclase AdrA; Provisional 366
12247 182330 PRK10246 PRK10246 exonuclease subunit SbcC; Provisional 1047
12248 182331 PRK10247 PRK10247 putative ABC transporter ATP-binding protein YbbL; Provisional 225
12249 236667 PRK10249 PRK10249 phenylalanine transporter; Provisional 458
12250 182333 PRK10250 PRK10250 MmcQ/YjbR family DNA-binding protein. 122
12251 182334 PRK10251 PRK10251 enterobactin synthase subunit EntD. 207
12252 236668 PRK10252 entF enterobactin non-ribosomal peptide synthetase EntF. 1296
12253 182336 PRK10253 PRK10253 iron-enterobactin ABC transporter ATP-binding protein. 265
12254 182337 PRK10254 PRK10254 proofreading thioesterase EntH. 137
12255 182338 PRK10255 PRK10255 PTS system N-acetyl glucosamine specific transporter subunits IIABC; Provisional 648
12256 182339 PRK10257 PRK10257 putative kinase inhibitor protein; Provisional 158
12257 182340 PRK10258 PRK10258 biotin biosynthesis protein BioC; Provisional 251
12258 137782 PRK10259 PRK10259 hypothetical protein; Provisional 86
12259 182341 PRK10260 PRK10260 L,D-transpeptidase; Provisional 306
12260 182342 PRK10261 PRK10261 glutathione transporter ATP-binding protein; Provisional 623
12261 182343 PRK10262 PRK10262 thioredoxin reductase; Provisional 321
12262 236669 PRK10263 PRK10263 DNA translocase FtsK; Provisional 1355
12263 182345 PRK10264 PRK10264 hydrogenase 1 maturation protease; Provisional 195
12264 182346 PRK10265 PRK10265 chaperone modulator CbpM. 101
12265 182347 PRK10266 PRK10266 curved DNA-binding protein. 306
12266 182348 PRK10270 PRK10270 putative aminodeoxychorismate lyase; Provisional 340
12267 182349 PRK10271 thiK thiamine kinase; Provisional 188
12268 182350 PRK10276 PRK10276 translesion error-prone DNA polymerase V autoproteolytic subunit. 139
12269 182351 PRK10278 PRK10278 SirB family protein. 130
12270 182352 PRK10279 PRK10279 patatin-like phospholipase RssA. 300
12271 182353 PRK10280 PRK10280 peptidyl-dipeptidase Dcp. 681
12272 182354 PRK10281 PRK10281 PhzF family isomerase. 299
12273 182355 PRK10286 PRK10286 methylated-DNA--[protein]-cysteine S-methyltransferase. 171
12274 182356 PRK10287 PRK10287 thiosulfate:cyanide sulfurtransferase; Provisional 104
12275 182357 PRK10290 PRK10290 superoxide dismutase [Cu-Zn] SodC2. 173
12276 182358 PRK10291 PRK10291 glyoxalase I; Provisional 129
12277 182359 PRK10292 PRK10292 fumarate hydratase FumD. 69
12278 182360 PRK10293 PRK10293 1,4-dihydroxy-2-naphthoyl-CoA hydrolase. 136
12279 182361 PRK10294 PRK10294 6-phosphofructokinase 2; Provisional 309
12280 182362 PRK10296 PRK10296 DNA-binding transcriptional regulator ChbR; Provisional 278
12281 182363 PRK10297 PRK10297 PTS system N,N'-diacetylchitobiose-specific transporter subunit IIC; Provisional 452
12282 182364 PRK10299 PRK10299 PhoP/PhoQ regulator MgrB. 47
12283 182365 PRK10301 PRK10301 CopC domain-containing protein YobA. 124
12284 182366 PRK10302 PRK10302 hypothetical protein; Provisional 272
12285 182367 PRK10304 PRK10304 non-heme ferritin. 165
12286 182368 PRK10306 PRK10306 zinc/cadmium-binding protein; Provisional 216
12287 236670 PRK10307 PRK10307 colanic acid biosynthesis glycosyltransferase WcaI. 412
12288 236671 PRK10308 PRK10308 3-methyl-adenine DNA glycosylase II; Provisional 283
12289 182371 PRK10309 PRK10309 galactitol-1-phosphate 5-dehydrogenase. 347
12290 182372 PRK10310 PRK10310 PTS galactitol transporter subunit IIB. 94
12291 182373 PRK10314 PRK10314 GNAT family N-acetyltransferase. 153
12292 182374 PRK10316 PRK10316 hypothetical protein; Provisional 209
12293 236672 PRK10318 PRK10318 hypothetical protein; Provisional 121
12294 182376 PRK10319 PRK10319 N-acetylmuramoyl-L-alanine amidase AmiA. 287
12295 182377 PRK10323 PRK10323 cysteine/O-acetylserine transporter. 195
12296 182378 PRK10324 PRK10324 ribosome-associated translation inhibitor RaiA. 113
12297 182379 PRK10325 PRK10325 heat shock protein GrpE; Provisional 197
12298 182380 PRK10328 PRK10328 DNA-binding protein StpA. 134
12299 182381 PRK10329 PRK10329 glutaredoxin-like protein NrdH. 81
12300 182382 PRK10330 PRK10330 electron transport protein HydN. 181
12301 182383 PRK10331 PRK10331 L-fuculokinase; Provisional 470
12302 182384 PRK10332 PRK10332 prepilin-type N-terminal cleavage/methylation domain-containing protein. 107
12303 182385 PRK10333 PRK10333 5-formyltetrahydrofolate cyclo-ligase family protein; Provisional 182
12304 182386 PRK10334 PRK10334 small-conductance mechanosensitive channel MscS. 286
12305 182387 PRK10336 PRK10336 two-component system response regulator QseB. 219
12306 182388 PRK10337 PRK10337 sensor protein QseC; Provisional 449
12307 182389 PRK10339 PRK10339 DNA-binding transcriptional repressor EbgR; Provisional 327
12308 236673 PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed 1021
12309 182391 PRK10341 PRK10341 transcriptional regulator TdcA. 312
12310 182392 PRK10342 PRK10342 glycerate kinase I; Provisional 381
12311 182393 PRK10343 PRK10343 ribosome assembly RNA-binding protein YhbY. 97
12312 182394 PRK10344 PRK10344 DNA-binding transcriptional regulator SfsB. 92
12313 182395 PRK10345 PRK10345 PhoP regulatory network protein YrbL. 210
12314 182396 PRK10347 PRK10347 putative adenosine monophosphate-protein transferase Fic. 200
12315 182397 PRK10348 PRK10348 ribosome-associated heat shock protein Hsp15; Provisional 133
12316 137836 PRK10349 PRK10349 pimeloyl-ACP methyl ester esterase BioH. 256
12317 182398 PRK10350 PRK10350 DUF2756 family protein. 145
12318 182399 PRK10351 PRK10351 4'-phosphopantetheinyl transferase AcpT. 187
12319 182400 PRK10352 PRK10352 nickel transporter permease NikB; Provisional 314
12320 182401 PRK10353 PRK10353 DNA-3-methyladenine glycosylase I. 187
12321 182402 PRK10354 PRK10354 RNA chaperone/antiterminator CspA. 70
12322 182403 PRK10355 xylF D-xylose ABC transporter substrate-binding protein. 330
12323 182404 PRK10356 PRK10356 protein bax. 274
12324 182405 PRK10357 PRK10357 putative glutathione S-transferase; Provisional 202
12325 182406 PRK10358 PRK10358 tRNA (uridine(34)/cytosine(34)/5-carboxymethylaminomethyluridine(34)-2'-O)-methyltransferase TrmL. 157
12326 182407 PRK10359 PRK10359 lipopolysaccharide core heptose(II) kinase RfaY. 232
12327 182408 PRK10360 PRK10360 transcriptional regulator UhpA. 196
12328 182409 PRK10361 PRK10361 DNA recombination protein RmuC; Provisional 475
12329 182410 PRK10363 cpxP cell-envelope stress modulator CpxP. 166
12330 236674 PRK10364 PRK10364 two-component system sensor histidine kinase ZraS. 457
12331 182412 PRK10365 PRK10365 sigma-54-dependent response regulator transcription factor ZraR. 441
12332 182413 PRK10367 PRK10367 DNA-damage-inducible SOS response protein; Provisional 441
12333 236675 PRK10369 PRK10369 heme lyase subunit NrfE; Provisional 571
12334 182415 PRK10370 PRK10370 formate-dependent nitrite reductase complex subunit NrfG; Provisional 198
12335 182416 PRK10371 PRK10371 transcriptional regulator MelR. 302
12336 182417 PRK10372 PRK10372 PTS ascorbate transporter subunit IIA. 154
12337 236676 PRK10376 PRK10376 putative oxidoreductase; Provisional 290
12338 182419 PRK10377 PRK10377 PTS glucitol/sorbitol transporter subunit IIA. 120
12339 236677 PRK10378 PRK10378 inactive ferrous ion transporter periplasmic protein EfeO; Provisional 375
12340 182421 PRK10380 PRK10380 hypothetical protein; Provisional 63
12341 182422 PRK10381 PRK10381 LPS O-antigen length regulator; Provisional 377
12342 182423 PRK10382 PRK10382 alkyl hydroperoxide reductase subunit C; Provisional 187
12343 236678 PRK10386 PRK10386 curli production assembly/transport protein CsgE. 130
12344 236679 PRK10387 PRK10387 glutaredoxin 2; Provisional 210
12345 182426 PRK10391 PRK10391 transcription modulator YdgT. 71
12346 236680 PRK10396 PRK10396 hypothetical protein; Provisional 221
12347 182428 PRK10397 PRK10397 lipoprotein; Provisional 137
12348 236681 PRK10401 PRK10401 HTH-type transcriptional regulator GalS. 346
12349 236682 PRK10402 PRK10402 DNA-binding transcriptional activator YeiL; Provisional 226
12350 182431 PRK10403 PRK10403 nitrate/nitrite response regulator protein NarP. 215
12351 182432 PRK10404 PRK10404 stress response protein ElaB. 101
12352 182433 PRK10406 PRK10406 alpha-ketoglutarate transporter; Provisional 432
12353 182434 PRK10408 PRK10408 L-valine transporter subunit YgaH. 111
12354 182435 PRK10409 PRK10409 HypC/HybG/HupF family hydrogenase formation chaperone. 90
12355 236683 PRK10410 PRK10410 nitrous oxide-stimulated promoter family protein. 100
12356 236684 PRK10411 PRK10411 L-fucose operon activator. 240
12357 182438 PRK10413 PRK10413 hydrogenase maturation factor HybG. 82
12358 236685 PRK10414 PRK10414 biopolymer transporter ExbB. 244
12359 182440 PRK10415 PRK10415 tRNA-dihydrouridine synthase B; Provisional 321
12360 236686 PRK10416 PRK10416 signal recognition particle-docking protein FtsY; Provisional 318
12361 236687 PRK10417 nikC nickel transporter permease NikC; Provisional 272
12362 236688 PRK10418 nikD nickel transporter ATP-binding protein NikD; Provisional 254
12363 236689 PRK10419 nikE nickel ABC transporter ATP-binding protein NikE. 268
12364 182445 PRK10420 PRK10420 L-lactate permease; Provisional 551
12365 236690 PRK10421 PRK10421 DNA-binding transcriptional repressor LldR; Provisional 253
12366 182447 PRK10422 PRK10422 lipopolysaccharide core biosynthesis protein; Provisional 352
12367 182448 PRK10423 PRK10423 transcriptional repressor RbsR; Provisional 327
12368 170429 PRK10424 PRK10424 ilv operon leader peptide. 32
12369 182449 PRK10425 PRK10425 3'-5' ssDNA/RNA exonuclease TatD. 258
12370 236691 PRK10426 PRK10426 alpha-glucosidase; Provisional 635
12371 182451 PRK10427 PRK10427 PTS fructose-like transporter subunit IIB. 114
12372 182452 PRK10428 PRK10428 hypothetical protein; Provisional 69
12373 182453 PRK10429 PRK10429 melibiose:sodium transporter MelB. 473
12374 182454 PRK10430 PRK10430 two-component system response regulator DcuR. 239
12375 236692 PRK10431 PRK10431 N-acetylmuramoyl-L-alanine amidase II; Provisional 445
12376 236693 PRK10433 PRK10433 putative RNA methyltransferase; Provisional 228
12377 182457 PRK10434 srlR DNA-binding transcriptional repressor. 256
12378 182458 PRK10435 cadB cadaverine/lysine antiporter. 435
12379 236694 PRK10436 PRK10436 hypothetical protein; Provisional 462
12380 182460 PRK10437 PRK10437 carbonic anhydrase; Provisional 220
12381 182461 PRK10438 PRK10438 C-N hydrolase family amidase; Provisional 256
12382 236695 PRK10439 PRK10439 enterobactin/ferric enterobactin esterase; Provisional 411
12383 182463 PRK10440 PRK10440 iron-enterobactin ABC transporter permease. 330
12384 182464 PRK10441 PRK10441 Fe(3+)-siderophore ABC transporter permease. 335
12385 182465 PRK10443 rihA ribonucleoside hydrolase 1; Provisional 311
12386 182466 PRK10444 PRK10444 HAD-IIA family hydrolase. 248
12387 182467 PRK10445 PRK10445 endonuclease VIII; Provisional 263
12388 182468 PRK10446 PRK10446 30S ribosomal protein S6--L-glutamate ligase. 300
12389 182469 PRK10447 PRK10447 FtsH protease modulator YccA. 219
12390 182470 PRK10449 PRK10449 heat shock protein HslJ. 140
12391 182471 PRK10452 PRK10452 multidrug/spermidine efflux SMR transporter subunit MdtJ. 120
12392 182472 PRK10454 PRK10454 PTS N,N'-diacetylchitobiose transporter subunit IIA. 115
12393 182473 PRK10455 PRK10455 periplasmic protein; Reviewed 161
12394 182474 PRK10456 PRK10456 arginine succinyltransferase; Provisional 344
12395 182475 PRK10457 PRK10457 hypothetical protein; Provisional 82
12396 236696 PRK10458 PRK10458 DNA cytosine methylase; Provisional 467
12397 236697 PRK10459 PRK10459 MOP flippase family protein. 492
12398 182478 PRK10461 PRK10461 thiamine biosynthesis lipoprotein ApbE; Provisional 350
12399 182479 PRK10463 PRK10463 hydrogenase nickel incorporation protein HypB; Provisional 290
12400 182480 PRK10465 PRK10465 hydrogenase-2 assembly chaperone. 159
12401 182481 PRK10466 hybD HyaD/HybD family hydrogenase maturation endopeptidase. 164
12402 182482 PRK10467 PRK10467 hydrogenase 2 large subunit; Provisional 567
12403 182483 PRK10468 PRK10468 hydrogenase 2 small subunit; Provisional 371
12404 182484 PRK10470 PRK10470 ribosome hibernation promoting factor. 95
12405 182485 PRK10472 PRK10472 low affinity gluconate transporter; Provisional 445
12406 182486 PRK10473 PRK10473 MdtL family multidrug efflux MFS transporter. 392
12407 170468 PRK10474 PRK10474 PTS fructose-like transporter subunit IIB. 88
12408 236698 PRK10475 PRK10475 23S rRNA pseudouridine(2604) synthase RluF. 290
12409 182488 PRK10476 PRK10476 multidrug transporter subunit MdtN. 346
12410 182489 PRK10477 PRK10477 outer membrane lipoprotein Blc; Provisional 177
12411 182490 PRK10478 PRK10478 PTS fructose transporter subunit EIIC. 359
12412 182491 PRK10481 PRK10481 hypothetical protein; Provisional 224
12413 182492 PRK10483 PRK10483 tryptophan permease; Provisional 414
12414 236699 PRK10484 PRK10484 putative transporter; Provisional 523
12415 182494 PRK10486 PRK10486 (4S)-4-hydroxy-5-phosphonooxypentane-2,3-dione isomerase. 96
12416 236700 PRK10489 PRK10489 enterobactin transporter EntS. 417
12417 236701 PRK10490 PRK10490 sensor protein KdpD; Provisional 895
12418 236702 PRK10494 PRK10494 envelope biogenesis factor ElyC. 259
12419 182498 PRK10497 PRK10497 phage shock protein PspD. 73
12420 182499 PRK10499 PRK10499 PTS sugar transporter subunit IIB. 106
12421 236703 PRK10502 PRK10502 putative acyl transferase; Provisional 182
12422 182501 PRK10503 PRK10503 MdtB/MuxB family multidrug efflux RND transporter permease subunit. 1040
12423 182502 PRK10504 PRK10504 putative transporter; Provisional 471
12424 236704 PRK10506 PRK10506 prepilin peptidase-dependent protein. 162
12425 182504 PRK10507 PRK10507 bifunctional glutathionylspermidine amidase/glutathionylspermidine synthetase; Provisional 619
12426 182505 PRK10508 PRK10508 luciferase-like monooxygenase. 333
12427 182506 PRK10509 PRK10509 bacterioferritin-associated ferredoxin; Provisional 64
12428 182507 PRK10510 PRK10510 OmpA family lipoprotein. 219
12429 182508 PRK10512 PRK10512 selenocysteinyl-tRNA-specific translation factor; Provisional 614
12430 182509 PRK10513 PRK10513 sugar phosphate phosphatase; Provisional 270
12431 182510 PRK10514 PRK10514 putative acetyltransferase; Provisional 145
12432 170492 PRK10515 PRK10515 hypothetical protein; Provisional 90
12433 236705 PRK10517 PRK10517 magnesium-transporting P-type ATPase MgtA. 902
12434 236706 PRK10518 PRK10518 alkaline phosphatase; Provisional 476
12435 182513 PRK10519 PRK10519 hypothetical protein; Provisional 151
12436 182514 PRK10520 rhtB homoserine/homoserine lactone efflux protein; Provisional 205
12437 236707 PRK10522 PRK10522 multidrug transporter membrane component/ATP-binding component; Provisional 547
12438 236708 PRK10523 PRK10523 envelope stress response activation lipoprotein NlpE. 234
12439 182517 PRK10524 prpE propionyl-CoA synthetase; Provisional 629
12440 182518 PRK10525 PRK10525 cytochrome o ubiquinol oxidase subunit II; Provisional 315
12441 182519 PRK10526 PRK10526 acyl-CoA thioesterase II; Provisional 286
12442 182520 PRK10527 PRK10527 DUF454 family protein. 125
12443 182521 PRK10528 PRK10528 multifunctional acyl-CoA thioesterase I and protease I and lysophospholipase L1; Provisional 191
12444 182522 PRK10529 PRK10529 DNA-binding transcriptional activator KdpE; Provisional 225
12445 182523 PRK10530 PRK10530 pyridoxal phosphate (PLP) phosphatase; Provisional 272
12446 236709 PRK10531 PRK10531 putative acyl-CoA thioester hydrolase. 422
12447 182525 PRK10532 PRK10532 threonine and homoserine efflux system; Provisional 293
12448 182526 PRK10533 PRK10533 putative lipoprotein; Provisional 171
12449 236710 PRK10534 PRK10534 L-threonine aldolase; Provisional 333
12450 182528 PRK10535 PRK10535 macrolide ABC transporter ATP-binding protein/permease MacB. 648
12451 182529 PRK10536 PRK10536 phosphate starvation-inducible protein PhoH. 262
12452 236711 PRK10537 PRK10537 voltage-gated potassium channel protein. 393
12453 182531 PRK10538 PRK10538 bifunctional NADP-dependent 3-hydroxy acid dehydrogenase/3-hydroxypropionate dehydrogenase YdfG. 248
12454 182532 PRK10540 PRK10540 osmotically-inducible lipoprotein OsmB. 72
12455 182533 PRK10542 PRK10542 glutathione S-transferase; Provisional 201
12456 182534 PRK10543 PRK10543 superoxide dismutase [Fe]. 193
12457 182535 PRK10545 PRK10545 excinuclease Cho. 286
12458 182536 PRK10546 PRK10546 pyrimidine (deoxy)nucleoside triphosphate diphosphatase. 135
12459 236712 PRK10547 PRK10547 chemotaxis protein CheA; Provisional 670
12460 182538 PRK10548 PRK10548 flagella biosynthesis regulatory protein FliT. 121
12461 182539 PRK10549 PRK10549 two-component system sensor histidine kinase BaeS. 466
12462 236713 PRK10550 PRK10550 tRNA dihydrouridine(16) synthase DusC. 312
12463 182541 PRK10551 PRK10551 cyclic di-GMP phosphodiesterase. 518
12464 182542 PRK10553 PRK10553 chaperone NapD. 87
12465 182543 PRK10554 PRK10554 outer membrane porin protein C; Provisional 355
12466 182544 PRK10555 PRK10555 multidrug efflux RND transporter permease AcrD. 1037
12467 182545 PRK10556 PRK10556 hypothetical protein; Provisional 111
12468 236714 PRK10557 PRK10557 prepilin peptidase-dependent protein. 192
12469 182547 PRK10558 PRK10558 alpha-dehydro-beta-deoxy-D-glucarate aldolase; Provisional 256
12470 182548 PRK10559 PRK10559 p-hydroxybenzoic acid efflux pump subunit AaeA. 310
12471 182549 PRK10560 hofQ outer membrane porin HofQ; Provisional 386
12472 182550 PRK10561 PRK10561 sn-glycerol-3-phosphate ABC transporter permease UgpA. 280
12473 236715 PRK10562 PRK10562 putative acetyltransferase; Provisional 145
12474 182552 PRK10563 PRK10563 6-phosphogluconate phosphatase; Provisional 221
12475 236716 PRK10564 PRK10564 maltose operon protein MalM. 303
12476 182554 PRK10565 PRK10565 putative carbohydrate kinase; Provisional 508
12477 182555 PRK10566 PRK10566 esterase; Provisional 249
12478 182556 PRK10568 PRK10568 molecular chaperone OsmY. 203
12479 182557 PRK10569 PRK10569 NAD(P)H-dependent FMN reductase; Provisional 191
12480 236717 PRK10572 PRK10572 arabinose operon transcriptional regulator AraC. 290
12481 182559 PRK10573 PRK10573 protein transport protein HofC. 399
12482 236718 PRK10574 PRK10574 putative major pilin subunit; Provisional 146
12483 182561 PRK10575 PRK10575 Fe3+-hydroxamate ABC transporter ATP-binding protein FhuC. 265
12484 236719 PRK10576 PRK10576 Fe(3+)-hydroxamate ABC transporter substrate-binding protein FhuD. 292
12485 236720 PRK10577 PRK10577 Fe(3+)-hydroxamate ABC transporter permease FhuB. 668
12486 182564 PRK10578 PRK10578 hypothetical protein; Provisional 207
12487 182565 PRK10579 PRK10579 pyrimidine/purine nucleoside phosphorylase. 94
12488 182566 PRK10580 proY putative proline-specific permease; Provisional 457
12489 182567 PRK10581 PRK10581 (2E,6E)-farnesyl diphosphate synthase. 299
12490 182568 PRK10582 PRK10582 cytochrome o ubiquinol oxidase subunit IV; Provisional 109
12491 182569 PRK10584 PRK10584 putative ABC transporter ATP-binding protein YbbA; Provisional 228
12492 182570 PRK10586 PRK10586 putative oxidoreductase; Provisional 362
12493 236721 PRK10588 PRK10588 hypothetical protein; Provisional 97
12494 236722 PRK10590 PRK10590 ATP-dependent RNA helicase RhlE; Provisional 456
12495 182573 PRK10591 PRK10591 hypothetical protein; Provisional 92
12496 182574 PRK10592 PRK10592 putrescine transporter subunit: membrane component of ABC superfamily; Provisional 281
12497 182575 PRK10593 PRK10593 hypothetical protein; Provisional 297
12498 236723 PRK10594 PRK10594 murein L,D-transpeptidase; Provisional 608
12499 182577 PRK10595 PRK10595 cell division inhibitor SulA. 164
12500 182578 PRK10597 PRK10597 DNA damage-inducible protein I; Provisional 81
12501 182579 PRK10598 PRK10598 lipoprotein; Provisional 186
12502 182580 PRK10599 PRK10599 sodium-potassium/proton antiporter ChaA. 366
12503 182581 PRK10600 PRK10600 nitrate/nitrite two-component system sensor histidine kinase NarX. 569
12504 182582 PRK10602 PRK10602 murein tripeptide amidase MpaA. 237
12505 236724 PRK10604 PRK10604 sensor protein RstB; Provisional 433
12506 182584 PRK10605 PRK10605 N-ethylmaleimide reductase; Provisional 362
12507 182585 PRK10606 btuE putative glutathione peroxidase; Provisional 183
12508 170568 PRK10610 PRK10610 chemotaxis protein CheY. 129
12509 236725 PRK10611 PRK10611 protein-glutamate O-methyltransferase CheR. 287
12510 182587 PRK10612 PRK10612 chemotaxis protein CheW. 167
12511 182588 PRK10613 PRK10613 DUF2594 family protein. 74
12512 182589 PRK10614 PRK10614 multidrug efflux system subunit MdtC; Provisional 1025
12513 182590 PRK10617 PRK10617 cytochrome c-type protein NapC; Provisional 200
12514 236726 PRK10618 PRK10618 phosphotransfer intermediate protein in two-component regulatory system with RcsBC; Provisional 894
12515 182592 PRK10619 PRK10619 histidine ABC transporter ATP-binding protein HisP. 257
12516 182593 PRK10621 PRK10621 hypothetical protein; Provisional 266
12517 182594 PRK10622 pheA bifunctional chorismate mutase/prephenate dehydratase; Provisional 386
12518 182595 PRK10624 PRK10624 L-1,2-propanediol oxidoreductase; Provisional 382
12519 236727 PRK10625 tas putative aldo-keto reductase; Provisional 346
12520 182597 PRK10626 PRK10626 hypothetical protein; Provisional 239
12521 182598 PRK10628 PRK10628 LigB family dioxygenase; Provisional 246
12522 236728 PRK10629 PRK10629 EnvZ/OmpR regulon moderator MzrA. 127
12523 182600 PRK10631 PRK10631 p-hydroxybenzoic acid efflux subunit AaeB; Provisional 652
12524 182601 PRK10632 PRK10632 HTH-type transcriptional activator AaeR. 309
12525 182602 PRK10633 PRK10633 hypothetical protein; Provisional 80
12526 182603 PRK10634 PRK10634 L-threonylcarbamoyladenylate synthase type 1 TsaC. 190
12527 182604 PRK10635 PRK10635 bacterioferritin; Provisional 158
12528 236729 PRK10636 PRK10636 putative ABC transporter ATP-binding protein; Provisional 638
12529 182606 PRK10637 cysG siroheme synthase CysG. 457
12530 182607 PRK10638 PRK10638 glutaredoxin 3; Provisional 83
12531 182608 PRK10639 PRK10639 formate dehydrogenase cytochrome b556 subunit. 211
12532 182609 PRK10640 rhaB rhamnulokinase; Provisional 471
12533 236730 PRK10641 btuB TonB-dependent vitamin B12 receptor BtuB. 614
12534 182611 PRK10642 PRK10642 proline/glycine betaine transporter ProP. 490
12535 182612 PRK10643 PRK10643 two-component system response regulator PmrA. 222
12536 182613 PRK10644 PRK10644 arginine/agmatine antiporter. 445
12537 182614 PRK10645 PRK10645 divalent cation tolerance protein CutA. 112
12538 182615 PRK10646 PRK10646 tRNA (adenosine(37)-N6)-threonylcarbamoyltransferase complex ATPase subunit type 1 TsaE. 153
12539 182616 PRK10647 PRK10647 ferric iron reductase involved in ferric hydroxamate transport; Provisional 262
12540 182617 PRK10649 PRK10649 phosphoethanolamine transferase CptA. 577
12541 182618 PRK10650 PRK10650 multidrug/spermidine efflux SMR transporter subunit MdtI. 109
12542 182619 PRK10651 PRK10651 transcriptional regulator NarL; Provisional 216
12543 182620 PRK10653 PRK10653 ribose ABC transporter substrate-binding protein RbsB. 295
12544 182621 PRK10654 dcuC C4-dicarboxylate transporter DcuC; Provisional 455
12545 182622 PRK10655 potE putrescine transporter; Provisional 438
12546 182623 PRK10657 PRK10657 isoaspartyl dipeptidase; Provisional 388
12547 236731 PRK10658 PRK10658 putative alpha-glucosidase; Provisional 665
12548 182625 PRK10659 PRK10659 acetate uptake transporter. 188
12549 182626 PRK10660 tilS tRNA(Ile)-lysidine synthetase; Provisional 436
12550 236732 PRK10662 PRK10662 beta-lactam binding protein AmpH; Provisional 378
12551 182628 PRK10663 PRK10663 cytochrome o ubiquinol oxidase subunit III; Provisional 204
12552 170612 PRK10664 PRK10664 DNA-binding protein HU-beta. 90
12553 182629 PRK10665 PRK10665 P-II family nitrogen regulator. 112
12554 182630 PRK10666 PRK10666 ammonium transporter AmtB. 428
12555 182631 PRK10667 PRK10667 Hha toxicity modulator TomB. 122
12556 182632 PRK10668 PRK10668 DNA-binding transcriptional repressor AcrR; Provisional 215
12557 182633 PRK10669 PRK10669 putative cation:proton antiport protein; Provisional 558
12558 182634 PRK10670 PRK10670 Cys-tRNA(Pro)/Cys-tRNA(Cys) deacylase YbaK. 159
12559 182635 PRK10671 copA copper-exporting P-type ATPase CopA. 834
12560 236733 PRK10672 PRK10672 endolytic peptidoglycan transglycosylase RlpA. 361
12561 182637 PRK10673 PRK10673 esterase. 255
12562 236734 PRK10674 PRK10674 deoxyribodipyrimidine photolyase; Provisional 472
12563 182639 PRK10675 PRK10675 UDP-galactose-4-epimerase; Provisional 338
12564 182640 PRK10676 PRK10676 DNA-binding transcriptional regulator ModE; Provisional 263
12565 182641 PRK10677 modA molybdate transporter periplasmic protein; Provisional 257
12566 182642 PRK10678 moaE molybdopterin synthase catalytic subunit MoaE. 150
12567 182643 PRK10680 PRK10680 molybdopterin biosynthesis protein MoeA; Provisional 411
12568 182644 PRK10681 PRK10681 DNA-binding transcriptional repressor DeoR; Provisional 252
12569 182645 PRK10682 PRK10682 putrescine transporter subunit: periplasmic-binding component of ABC superfamily; Provisional 370
12570 182646 PRK10683 PRK10683 putrescine transporter subunit: membrane component of ABC superfamily; Provisional 317
12571 236735 PRK10684 PRK10684 HCP oxidoreductase, NADH-dependent; Provisional 332
12572 182648 PRK10687 PRK10687 purine nucleoside phosphoramidase; Provisional 119
12573 182649 PRK10689 PRK10689 transcription-repair coupling factor; Provisional 1147
12574 182650 PRK10691 PRK10691 fumarylacetoacetate hydrolase family protein. 219
12575 182651 PRK10692 PRK10692 stress response protein YchH. 92
12576 182652 PRK10693 PRK10693 two-component system response regulator RssB. 303
12577 236736 PRK10694 PRK10694 acyl-CoA thioester hydrolase YciA. 133
12578 182654 PRK10695 PRK10695 YdbH family protein. 859
12579 236737 PRK10696 PRK10696 tRNA 2-thiocytidine biosynthesis protein TtcA; Provisional 258
12580 182656 PRK10697 PRK10697 envelope stress response membrane protein PspC. 118
12581 182657 PRK10698 PRK10698 phage shock protein PspA; Provisional 222
12582 182658 PRK10699 PRK10699 phosphatidylglycerophosphatase B; Provisional 244
12583 182659 PRK10700 PRK10700 23S rRNA pseudouridine(2605) synthase RluB. 289
12584 236738 PRK10701 PRK10701 DNA-binding transcriptional regulator RstA; Provisional 240
12585 182661 PRK10702 PRK10702 endonuclease III; Provisional 211
12586 236739 PRK10703 PRK10703 HTH-type transcriptional repressor PurR. 341
12587 182663 PRK10707 PRK10707 putative NUDIX hydrolase; Provisional 190
12588 182664 PRK10708 PRK10708 protein DsrB. 62
12589 182665 PRK10710 PRK10710 DNA-binding transcriptional regulator BaeR; Provisional 240
12590 182666 PRK10711 PRK10711 hypothetical protein; Provisional 231
12591 236740 PRK10712 PRK10712 PTS system fructose-specific transporter subunits IIBC; Provisional 563
12592 182668 PRK10713 PRK10713 2Fe-2S ferredoxin-like protein. 84
12593 182669 PRK10714 PRK10714 undecaprenyl phosphate 4-deoxy-4-formamido-L-arabinose transferase; Provisional 325
12594 182670 PRK10715 flk flagella biosynthesis regulator Flk. 335
12595 236741 PRK10716 PRK10716 long-chain fatty acid transporter FadL. 435
12596 182672 PRK10717 PRK10717 cysteine synthase A; Provisional 330
12597 236742 PRK10718 PRK10718 RpoE-regulated lipoprotein; Provisional 191
12598 236743 PRK10719 eutA ethanolamine ammonia-lyase reactivating factor EutA. 475
12599 236744 PRK10720 PRK10720 uracil transporter; Provisional 428
12600 170660 PRK10721 PRK10721 hypothetical protein; Provisional 66
12601 236745 PRK10722 PRK10722 two-component system QseEF-associated lipoprotein QseG. 247
12602 182677 PRK10723 PRK10723 polyphenol oxidase. 243
12603 182678 PRK10724 PRK10724 type II toxin-antitoxin system RatA family toxin. 158
12604 182679 PRK10725 PRK10725 fructose-1-phosphate/6-phosphogluconate phosphatase. 188
12605 236746 PRK10726 PRK10726 DUF3561 family protein. 105
12606 182681 PRK10727 PRK10727 HTH-type transcriptional regulator GalR. 343
12607 182682 PRK10729 nudF ADP-ribose pyrophosphatase NudF; Provisional 202
12608 182683 PRK10733 hflB ATP-dependent zinc metalloprotease FtsH. 644
12609 182684 PRK10734 PRK10734 putative calcium/sodium:proton antiporter; Provisional 325
12610 182685 PRK10735 tldD protease TldD; Provisional 481
12611 236747 PRK10736 PRK10736 DNA-protecting protein DprA. 374
12612 236748 PRK10737 PRK10737 peptidylprolyl isomerase. 196
12613 182688 PRK10738 PRK10738 OsmC family protein. 134
12614 170674 PRK10739 PRK10739 YhgN family NAAT transporter. 197
12615 182689 PRK10740 PRK10740 high-affinity branched-chain amino acid ABC transporter permease LivH. 308
12616 236749 PRK10742 PRK10742 16S rRNA (guanine(1516)-N(2))-methyltransferase RsmJ. 250
12617 182691 PRK10743 PRK10743 heat shock chaperone IbpA. 137
12618 182692 PRK10744 pstB phosphate ABC transporter ATP-binding protein PstB. 260
12619 182693 PRK10745 trkD low affinity potassium transporter Kup. 622
12620 182694 PRK10746 PRK10746 putative transport protein YifK; Provisional 461
12621 182695 PRK10747 PRK10747 putative protoheme IX biogenesis protein; Provisional 398
12622 182696 PRK10748 PRK10748 5-amino-6-(5-phospho-D-ribitylamino)uracil phosphatase YigB. 238
12623 182697 PRK10749 PRK10749 lysophospholipase L2; Provisional 330
12624 182698 PRK10750 PRK10750 Trk system potassium transporter TrkH. 483
12625 236750 PRK10751 PRK10751 molybdopterin-guanine dinucleotide biosynthesis protein B; Provisional 173
12626 182700 PRK10752 PRK10752 sulfate ABC transporter substrate-binding protein. 329
12627 138142 PRK10753 PRK10753 DNA-binding protein HU-alpha. 90
12628 182701 PRK10754 PRK10754 NADPH:quinone reductase. 327
12629 236751 PRK10755 PRK10755 two-component system sensor histidine kinase PmrB. 356
12630 236752 PRK10756 PRK10756 protein CreA. 157
12631 236753 PRK10757 PRK10757 inositol-1-monophosphatase. 267
12632 182705 PRK10759 PRK10759 YfiM family lipoprotein. 106
12633 236754 PRK10760 PRK10760 murein hydrolase B; Provisional 359
12634 236755 PRK10762 PRK10762 D-ribose transporter ATP binding protein; Provisional 501
12635 182708 PRK10763 PRK10763 phospholipase A; Provisional 289
12636 236756 PRK10764 PRK10764 potassium-tellurite ethidium and proflavin transporter; Provisional 324
12637 182710 PRK10765 PRK10765 oxygen-insensitive NADPH nitroreductase. 240
12638 182711 PRK10766 PRK10766 two-component system response regulator TorR. 221
12639 236757 PRK10767 PRK10767 chaperone protein DnaJ; Provisional 371
12640 182713 PRK10768 PRK10768 ribonucleoside hydrolase RihC; Provisional 304
12641 182714 PRK10769 folA type 3 dihydrofolate reductase. 159
12642 236758 PRK10770 PRK10770 peptidyl-prolyl cis-trans isomerase SurA; Provisional 413
12643 182716 PRK10771 thiQ thiamine ABC transporter ATP-binding protein ThiQ. 232
12644 182717 PRK10772 PRK10772 cell division protein FtsL; Provisional 108
12645 182718 PRK10773 murF UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase; Reviewed 453
12646 182719 PRK10774 PRK10774 cell division protein FtsW; Provisional 404
12647 182720 PRK10775 PRK10775 cell division protein FtsQ; Provisional 276
12648 182721 PRK10776 PRK10776 8-oxo-dGTP diphosphatase MutT. 129
12649 182722 PRK10778 dksA RNA polymerase-binding protein DksA. 151
12650 182723 PRK10779 PRK10779 sigma E protease regulator RseP. 449
12651 182724 PRK10780 PRK10780 molecular chaperone Skp. 165
12652 182725 PRK10781 rcsF Rcs stress response system protein RcsF. 133
12653 182726 PRK10782 PRK10782 D-methionine ABC transporter permease MetI. 217
12654 182727 PRK10783 mltD membrane-bound lytic murein transglycosylase D; Provisional 456
12655 236759 PRK10785 PRK10785 maltodextrin glucosidase; Provisional 598
12656 182729 PRK10786 ribD bifunctional diaminohydroxyphosphoribosylaminopyrimidine deaminase/5-amino-6-(5-phosphoribosylamino)uracil reductase RibD. 367
12657 182730 PRK10787 PRK10787 DNA-binding ATP-dependent protease La; Provisional 784
12658 182731 PRK10788 PRK10788 periplasmic folding chaperone; Provisional 623
12659 182732 PRK10789 PRK10789 SmdA family multidrug ABC transporter permease/ATP-binding protein. 569
12660 182733 PRK10790 PRK10790 SmdB family multidrug efflux ABC transporter permease/ATP-binding protein. 592
12661 182734 PRK10791 PRK10791 peptidylprolyl isomerase B. 164
12662 236760 PRK10792 PRK10792 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 285
12663 182736 PRK10793 PRK10793 D-alanyl-D-alanine carboxypeptidase fraction A; Provisional 403
12664 182737 PRK10794 PRK10794 rod shape-determining protein RodA. 370
12665 236761 PRK10795 PRK10795 penicillin-binding protein 2; Provisional 634
12666 236762 PRK10796 PRK10796 LPS-assembly lipoprotein RlpB; Provisional 188
12667 236763 PRK10797 PRK10797 glutamate and aspartate transporter subunit; Provisional 302
12668 182741 PRK10799 PRK10799 type 2 GTP cyclohydrolase I. 247
12669 182742 PRK10800 PRK10800 acyl-CoA thioesterase YbgC; Provisional 130
12670 182743 PRK10801 PRK10801 Tol-Pal system protein TolQ. 227
12671 182744 PRK10802 PRK10802 peptidoglycan-associated lipoprotein Pal. 173
12672 182745 PRK10803 PRK10803 tol-pal system protein YbgF; Provisional 263
12673 182746 PRK10805 PRK10805 formate transporter; Provisional 285
12674 182747 PRK10807 PRK10807 intermembrane transport protein PqiB. 547
12675 236764 PRK10808 PRK10808 outer membrane protein A; Reviewed 351
12676 182749 PRK10809 PRK10809 30S ribosomal protein S5 alanine N-acetyltransferase. 194
12677 236765 PRK10810 PRK10810 anti-sigma-28 factor FlgM. 98
12678 236766 PRK10811 rne ribonuclease E; Reviewed 1068
12679 236767 PRK10812 PRK10812 putative DNAse; Provisional 265
12680 182753 PRK10814 PRK10814 lipoprotein-releasing ABC transporter permease subunit LolC. 399
12681 182754 PRK10815 PRK10815 two-component system sensor histidine kinase PhoQ. 485
12682 182755 PRK10816 PRK10816 two-component system response regulator PhoP. 223
12683 182756 PRK10818 PRK10818 septum site-determining protein MinD. 270
12684 236768 PRK10819 PRK10819 transport protein TonB; Provisional 246
12685 236769 PRK10820 PRK10820 transcriptional regulator TyrR. 520
12686 182759 PRK10824 PRK10824 Grx4 family monothiol glutaredoxin. 115
12687 236770 PRK10826 PRK10826 hexitol phosphatase HxpB. 222
12688 182761 PRK10828 PRK10828 putative oxidoreductase; Provisional 183
12689 236771 PRK10829 PRK10829 ribonuclease D; Provisional 373
12690 182763 PRK10832 PRK10832 CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. 182
12691 236772 PRK10833 PRK10833 putative assembly protein; Provisional 617
12692 182765 PRK10834 PRK10834 outer membrane permeability protein SanA. 239
12693 182766 PRK10835 PRK10835 hypothetical protein; Provisional 373
12694 182767 PRK10836 PRK10836 lysine transporter; Provisional 489
12695 182768 PRK10837 PRK10837 putative DNA-binding transcriptional regulator; Provisional 290
12696 236773 PRK10838 spr bifunctional murein DD-endopeptidase/murein LD-carboxypeptidase. 190
12697 236774 PRK10839 PRK10839 16S rRNA pseudouridine(516) synthase RsuA. 232
12698 182771 PRK10840 PRK10840 transcriptional regulator RcsB; Provisional 216
12699 182772 PRK10841 PRK10841 two-component system sensor histidine kinase RcsC. 924
12700 182773 PRK10845 PRK10845 colicin V production protein; Provisional 162
12701 182774 PRK10846 PRK10846 bifunctional folylpolyglutamate synthase/dihydrofolate synthase; Provisional 416
12702 182775 PRK10847 PRK10847 DedA family protein. 219
12703 182776 PRK10848 PRK10848 phosphohistidine phosphatase SixA. 159
12704 182777 PRK10850 PRK10850 phosphocarrier protein Hpr. 85
12705 182778 PRK10851 PRK10851 sulfate/thiosulfate ABC transporter ATP-binding protein CysA. 353
12706 236775 PRK10852 PRK10852 thiosulfate ABC transporter substrate-binding protein CysP. 338
12707 182780 PRK10853 PRK10853 putative reductase; Provisional 118
12708 182781 PRK10854 PRK10854 exopolyphosphatase; Provisional 513
12709 236776 PRK10856 PRK10856 cytoskeleton protein RodZ. 331
12710 236777 PRK10857 PRK10857 Fe-S cluster assembly transcriptional regulator IscR. 164
12711 182784 PRK10858 PRK10858 nitrogen regulatory protein P-II. 112
12712 236778 PRK10859 PRK10859 membrane-bound lytic murein transglycosylase MltF. 482
12713 182786 PRK10860 PRK10860 tRNA-specific adenosine deaminase; Provisional 172
12714 182787 PRK10861 PRK10861 signal peptidase I. 324
12715 182788 PRK10862 PRK10862 SoxR-reducing system protein RseC. 154
12716 182789 PRK10863 PRK10863 anti-sigma-E factor RseA. 216
12717 236779 PRK10864 PRK10864 putative methyltransferase; Provisional 346
12718 182791 PRK10865 PRK10865 ATP-dependent chaperone ClpB. 857
12719 182792 PRK10866 PRK10866 outer membrane protein assembly factor BamD. 243
12720 236780 PRK10867 PRK10867 signal recognition particle protein; Provisional 433
12721 236781 PRK10869 PRK10869 recombination and repair protein; Provisional 553
12722 182795 PRK10870 PRK10870 transcriptional repressor MprA; Provisional 176
12723 236782 PRK10871 nlpD murein hydrolase activator NlpD. 319
12724 182797 PRK10872 relA (p)ppGpp synthetase I/GTP pyrophosphokinase; Provisional 743
12725 182798 PRK10873 PRK10873 hypothetical protein; Provisional 131
12726 182799 PRK10874 PRK10874 cysteine desulfurase CsdA. 401
12727 236783 PRK10875 recD exodeoxyribonuclease V subunit alpha. 615
12728 236784 PRK10876 recB exonuclease V subunit beta; Provisional 1181
12729 182802 PRK10877 PRK10877 protein disulfide isomerase II DsbC; Provisional 232
12730 182803 PRK10878 PRK10878 FAD assembly factor SdhE. 72
12731 182804 PRK10879 PRK10879 proline aminopeptidase P II; Provisional 438
12732 182805 PRK10880 PRK10880 adenine DNA glycosylase. 350
12733 236785 PRK10881 PRK10881 Ni/Fe-hydrogenase cytochrome b subunit. 394
12734 236786 PRK10882 PRK10882 hydrogenase 2 operon protein HybA. 328
12735 182808 PRK10883 PRK10883 FtsI repressor; Provisional 471
12736 182809 PRK10884 PRK10884 SH3 domain-containing protein; Provisional 206
12737 182810 PRK10885 cca multifunctional CCA addition/repair protein. 409
12738 182811 PRK10886 PRK10886 DnaA initiator-associating protein DiaA; Provisional 196
12739 236787 PRK10887 glmM phosphoglucosamine mutase; Provisional 443
12740 182813 PRK10888 PRK10888 octaprenyl diphosphate synthase; Provisional 323
12741 182814 PRK10892 PRK10892 arabinose-5-phosphate isomerase KdsD. 326
12742 236788 PRK10893 PRK10893 LPS export ABC transporter periplasmic protein LptC. 192
12743 182816 PRK10894 PRK10894 lipopolysaccharide ABC transporter substrate-binding protein LptA. 180
12744 182817 PRK10895 PRK10895 lipopolysaccharide ABC transporter ATP-binding protein; Provisional 241
12745 182818 PRK10896 PRK10896 PTS IIA-like nitrogen regulatory protein PtsN. 154
12746 182819 PRK10897 PRK10897 PTS phosphocarrier protein NPr. 90
12747 182820 PRK10898 PRK10898 serine endoprotease DegS. 353
12748 236789 PRK10899 PRK10899 AsmA2 domain-containing protein. 1022
12749 236790 PRK10901 PRK10901 16S rRNA (cytosine(967)-C(5))-methyltransferase RsmB. 427
12750 236791 PRK10902 PRK10902 FKBP-type peptidyl-prolyl cis-trans isomerase; Provisional 269
12751 182824 PRK10903 PRK10903 peptidylprolyl isomerase A. 190
12752 182825 PRK10904 PRK10904 adenine-specific DNA-methyltransferase. 271
12753 236792 PRK10905 PRK10905 cell division protein DamX; Validated 328
12754 182827 PRK10906 PRK10906 DeoR/GlpR family transcriptional regulator. 252
12755 182828 PRK10907 PRK10907 intramembrane serine protease GlpG; Provisional 276
12756 182829 PRK10908 PRK10908 cell division ATP-binding protein FtsE. 222
12757 236793 PRK10909 rsmD 16S rRNA m(2)G966-methyltransferase; Provisional 199
12758 182831 PRK10910 PRK10910 DUF1145 family protein. 89
12759 182832 PRK10911 PRK10911 oligopeptidase A; Provisional 680
12760 182833 PRK10913 PRK10913 dipeptide ABC transporter permease DppC. 300
12761 182834 PRK10914 PRK10914 dipeptide ABC transporter permease DppB. 339
12762 182835 PRK10916 PRK10916 ADP-heptose--LPS heptosyltransferase RfaF. 348
12763 236794 PRK10917 PRK10917 ATP-dependent DNA helicase RecG; Provisional 681
12764 182837 PRK10918 PRK10918 phosphate ABC transporter substrate-binding protein PstS. 346
12765 182838 PRK10919 PRK10919 ATP-dependent DNA helicase Rep; Provisional 672
12766 236795 PRK10920 PRK10920 putative uroporphyrinogen III C-methyltransferase; Provisional 390
12767 182840 PRK10921 PRK10921 Sec-independent protein translocase subunit TatC. 258
12768 236796 PRK10922 PRK10922 4-hydroxy-3-polyprenylbenzoate decarboxylase. 497
12769 182842 PRK10923 glnG nitrogen regulation protein NR(I); Provisional 469
12770 182843 PRK10925 PRK10925 superoxide dismutase [Mn]. 206
12771 182844 PRK10926 PRK10926 ferredoxin-NADP reductase; Provisional 248
12772 236797 PRK10927 PRK10927 cell division protein FtsN. 319
12773 236798 PRK10929 PRK10929 putative mechanosensitive channel protein; Provisional 1109
12774 236799 PRK10930 PRK10930 FtsH protease activity modulator HflK. 419
12775 182848 PRK10931 PRK10931 adenosine-3'(2'),5'-bisphosphate nucleotidase; Provisional 246
12776 182849 PRK10933 PRK10933 trehalose-6-phosphate hydrolase; Provisional 551
12777 236800 PRK10935 PRK10935 nitrate/nitrite two-component system sensor histidine kinase NarQ. 565
12778 236801 PRK10936 PRK10936 TMAO reductase system periplasmic protein TorT; Provisional 343
12779 182852 PRK10938 PRK10938 putative molybdenum transport ATP-binding protein ModF; Provisional 490
12780 182853 PRK10939 PRK10939 autoinducer-2 (AI-2) kinase; Provisional 520
12781 182854 PRK10941 PRK10941 tetratricopeptide repeat-containing protein. 269
12782 236802 PRK10942 PRK10942 serine endoprotease DegP. 473
12783 170841 PRK10943 PRK10943 cold shock-like protein CspC; Provisional 69
12784 182856 PRK10945 PRK10945 hemolysin expression modulator Hha. 72
12785 236803 PRK10946 entE (2,3-dihydroxybenzoyl)adenylate synthase. 536
12786 182858 PRK10947 PRK10947 DNA-binding transcriptional regulator H-NS. 135
12787 236804 PRK10948 PRK10948 Fe-S cluster assembly protein SufD. 424
12788 182860 PRK10949 PRK10949 signal peptide peptidase SppA. 618
12789 236805 PRK10952 PRK10952 proline/glycine betaine ABC transporter permease ProW. 355
12790 182862 PRK10953 cysJ NADPH-dependent assimilatory sulfite reductase flavoprotein subunit. 600
12791 182863 PRK10954 PRK10954 thiol:disulfide interchange protein DsbA. 207
12792 182864 PRK10955 PRK10955 envelope stress response regulator transcription factor CpxR. 232
12793 236806 PRK10957 PRK10957 iron-enterobactin transporter periplasmic binding protein; Provisional 317
12794 236807 PRK10958 PRK10958 leucine export protein LeuE; Provisional 212
12795 182867 PRK10959 PRK10959 outer membrane protein W; Provisional 212
12796 236808 PRK10963 PRK10963 hypothetical protein; Provisional 223
12797 236809 PRK10964 PRK10964 lipopolysaccharide heptosyltransferase RfaC. 322
12798 236810 PRK10965 PRK10965 multicopper oxidase; Provisional 523
12799 182871 PRK10966 PRK10966 exonuclease subunit SbcD; Provisional 407
12800 182872 PRK10969 PRK10969 DNA polymerase III subunit theta; Reviewed 75
12801 182873 PRK10971 PRK10971 sulfate/thiosulfate ABC transporter permease CysT. 277
12802 182874 PRK10972 PRK10972 cell division protein ZapA. 109
12803 182875 PRK10973 PRK10973 sn-glycerol-3-phosphate ABC transporter permease UgpE. 281
12804 182876 PRK10974 PRK10974 sn-glycerol-3-phosphate ABC transporter substrate-binding protein UgpB. 438
12805 182877 PRK10975 PRK10975 dTDP-4-amino-4,6-dideoxy-D-galactose acyltransferase. 194
12806 182878 PRK10976 PRK10976 putative hydrolase; Provisional 266
12807 182879 PRK10977 PRK10977 hypothetical protein; Provisional 509
12808 182880 PRK10982 PRK10982 galactose/methyl galactoside transporter ATP-binding protein; Provisional 491
12809 182881 PRK10983 PRK10983 AI-2E family transporter YdiK. 368
12810 182882 PRK10984 PRK10984 sigma factor-binding protein Crl. 127
12811 182883 PRK10985 PRK10985 putative hydrolase; Provisional 324
12812 236811 PRK10987 PRK10987 beta-lactamase regulator AmpE. 284
12813 182885 PRK10991 fucI L-fucose isomerase; Provisional 588
12814 236812 PRK10992 PRK10992 iron-sulfur cluster repair protein YtfE. 220
12815 236813 PRK10993 PRK10993 omptin family outer membrane protease. 314
12816 236814 PRK10995 PRK10995 MarC family NAAT transporter. 221
12817 182889 PRK10996 PRK10996 thioredoxin 2; Provisional 139
12818 236815 PRK10997 yieM ATPase RavA stimulator ViaA. 487
12819 182891 PRK10998 malG maltose ABC transporter permease MalG. 296
12820 236816 PRK10999 malF maltose ABC transporter permease MalF. 520
12821 182893 PRK11000 PRK11000 maltose/maltodextrin ABC transporter ATP-binding protein MalK. 369
12822 236817 PRK11001 mtlR MltR family transcriptional regulator. 171
12823 182895 PRK11006 phoR phosphate regulon sensor histidine kinase PhoR. 430
12824 182896 PRK11007 PRK11007 PTS system trehalose(maltose)-specific transporter subunits IIBC; Provisional 473
12825 236818 PRK11009 aphA class B acid phosphatase. 237
12826 182898 PRK11010 ampG muropeptide MFS transporter AmpG. 491
12827 236819 PRK11013 PRK11013 DNA-binding transcriptional regulator LysR; Provisional 309
12828 236820 PRK11014 PRK11014 HTH-type transcriptional repressor NsrR. 141
12829 236821 PRK11017 codB cytosine permease; Provisional 404
12830 236822 PRK11018 PRK11018 putative sulfurtransferase YedF. 78
12831 182903 PRK11019 PRK11019 DksA/TraR family C4-type zinc finger protein. 88
12832 182904 PRK11020 PRK11020 YibL family ribosome-associated protein. 118
12833 236823 PRK11021 PRK11021 putative transporter; Provisional 410
12834 182906 PRK11022 dppD dipeptide transporter ATP-binding subunit; Provisional 326
12835 182907 PRK11023 PRK11023 divisome-associated lipoprotein YraP. 191
12836 236824 PRK11024 PRK11024 colicin uptake protein TolR; Provisional 141
12837 182909 PRK11025 PRK11025 23S rRNA pseudouridine(955/2504/2580) synthase RluC. 317
12838 182910 PRK11026 ftsX cell division ABC transporter subunit FtsX; Provisional 309
12839 236825 PRK11027 PRK11027 hypothetical protein; Provisional 112
12840 182912 PRK11028 PRK11028 6-phosphogluconolactonase; Provisional 330
12841 182913 PRK11029 PRK11029 protease modulator HflC. 334
12842 236826 PRK11031 PRK11031 guanosine-5'-triphosphate,3'-diphosphate diphosphatase. 496
12843 182915 PRK11032 PRK11032 zinc ribbon-containing protein. 160
12844 236827 PRK11033 zntA zinc/cadmium/mercury/lead-transporting ATPase; Provisional 741
12845 236828 PRK11034 clpA ATP-dependent Clp protease ATP-binding subunit; Provisional 758
12846 182918 PRK11036 PRK11036 tRNA uridine 5-oxyacetic acid(34) methyltransferase CmoM. 255
12847 182919 PRK11037 PRK11037 hypothetical protein; Provisional 83
12848 182920 PRK11038 PRK11038 hypothetical protein; Provisional 47
12849 182921 PRK11039 PRK11039 DUF1249 family protein. 140
12850 182922 PRK11040 PRK11040 peptidase PmbA; Provisional 446
12851 182923 PRK11041 PRK11041 DNA-binding transcriptional regulator CytR; Provisional 309
12852 182924 PRK11043 PRK11043 Bcr/CflA family multidrug efflux MFS transporter. 401
12853 236829 PRK11045 pagP lipid IV(A) palmitoyltransferase PagP. 184
12854 236830 PRK11049 PRK11049 D-alanine/D-serine/glycine permease; Provisional 469
12855 182927 PRK11050 PRK11050 manganese-binding transcriptional regulator MntR. 152
12856 236831 PRK11052 malQ 4-alpha-glucanotransferase; Provisional 695
12857 182929 PRK11053 PRK11053 oxygen-insensitive NAD(P)H nitroreductase. 217
12858 182930 PRK11054 helD DNA helicase IV; Provisional 684
12859 182931 PRK11055 galM galactose-1-epimerase; Provisional 342
12860 236832 PRK11056 PRK11056 YijD family membrane protein. 120
12861 182933 PRK11057 PRK11057 ATP-dependent DNA helicase RecQ; Provisional 607
12862 182934 PRK11058 PRK11058 GTPase HflX; Provisional 426
12863 236833 PRK11059 PRK11059 regulatory protein CsrD; Provisional 640
12864 182936 PRK11060 PRK11060 rod shape-determining protein MreD; Provisional 162
12865 182937 PRK11061 PRK11061 phosphoenolpyruvate--protein phosphotransferase. 748
12866 182938 PRK11062 nhaR transcriptional activator NhaR; Provisional 296
12867 182939 PRK11063 metQ D-methionine ABC transporter substrate-binding protein MetQ. 271
12868 182940 PRK11064 wecC UDP-N-acetyl-D-mannosamine dehydrogenase; Provisional 415
12869 236834 PRK11067 PRK11067 outer membrane protein assembly factor YaeT; Provisional 803
12870 182942 PRK11068 PRK11068 phosphatidylglycerophosphatase A; Provisional 164
12871 236835 PRK11069 recC exodeoxyribonuclease V subunit gamma. 1122
12872 182944 PRK11070 PRK11070 single-stranded-DNA-specific exonuclease RecJ. 575
12873 182945 PRK11071 PRK11071 esterase YqiA; Provisional 190
12874 236836 PRK11072 PRK11072 bifunctional [glutamate--ammonia ligase]-adenylyl-L-tyrosine phosphorylase/[glutamate--ammonia-ligase] adenylyltransferase. 943
12875 182947 PRK11073 glnL nitrogen regulation protein NR(II). 348
12876 182948 PRK11074 PRK11074 putative DNA-binding transcriptional regulator; Provisional 300
12877 236837 PRK11081 PRK11081 tRNA guanosine-2'-O-methyltransferase; Provisional 229
12878 236838 PRK11083 PRK11083 DNA-binding response regulator CreB; Provisional 228
12879 182951 PRK11085 PRK11085 magnesium/cobalt transporter CorA. 316
12880 236839 PRK11086 PRK11086 sensory histidine kinase DcuS; Provisional 542
12881 236840 PRK11087 PRK11087 oxidative stress defense protein; Provisional 231
12882 236841 PRK11088 rrmA 23S rRNA methyltransferase A; Provisional 272
12883 182955 PRK11089 PRK11089 PTS system glucose-specific transporter subunits IIBC; Provisional 477
12884 236842 PRK11091 PRK11091 aerobic respiration control sensor protein ArcB; Provisional 779
12885 236843 PRK11092 PRK11092 bifunctional GTP diphosphokinase/guanosine-3',5'-bis pyrophosphate 3'-pyrophosphohydrolase. 702
12886 182958 PRK11096 ansB L-asparaginase II; Provisional 347
12887 236844 PRK11097 PRK11097 cellulase. 376
12888 182960 PRK11098 PRK11098 peptide antibiotic transporter SbmA. 409
12889 236845 PRK11099 PRK11099 putative inner membrane protein; Provisional 399
12890 236846 PRK11100 PRK11100 sensory histidine kinase CreC; Provisional 475
12891 236847 PRK11101 glpA anaerobic glycerol-3-phosphate dehydrogenase subunit A. 546
12892 182964 PRK11102 PRK11102 Bcr/CflA family multidrug efflux MFS transporter. 377
12893 182965 PRK11103 PRK11103 PTS mannose transporter subunit IID. 282
12894 182966 PRK11104 hemG menaquinone-dependent protoporphyrinogen IX dehydrogenase. 177
12895 182967 PRK11106 PRK11106 queuosine biosynthesis protein QueC; Provisional 231
12896 236848 PRK11107 PRK11107 hybrid sensory histidine kinase BarA; Provisional 919
12897 236849 PRK11109 PRK11109 fused PTS fructose transporter subunit IIA/HPr protein. 375
12898 236850 PRK11111 PRK11111 YchE family NAAT transporter. 214
12899 182971 PRK11112 PRK11112 tRNA pseudouridine synthase C; Provisional 257
12900 182972 PRK11113 PRK11113 D-alanyl-D-alanine carboxypeptidase/endopeptidase; Provisional 477
12901 236851 PRK11114 PRK11114 cellulose biosynthesis cyclic di-GMP-binding regulatory protein BcsB. 756
12902 182974 PRK11115 PRK11115 phosphate signaling complex protein PhoU. 236
12903 182975 PRK11118 PRK11118 putative monooxygenase; Provisional 100
12904 236852 PRK11119 proX proline/glycine betaine ABC transporter substrate-binding protein ProX. 331
12905 236853 PRK11121 nrdG anaerobic ribonucleoside-triphosphate reductase-activating protein. 154
12906 182978 PRK11122 artM arginine ABC transporter permease ArtM. 222
12907 182979 PRK11123 PRK11123 arginine ABC transporter permease ArtQ. 238
12908 182980 PRK11124 artP arginine transporter ATP-binding subunit; Provisional 242
12909 236854 PRK11125 nrfA ammonia-forming cytochrome c nitrite reductase. 480
12910 236855 PRK11126 PRK11126 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase; Provisional 242
12911 182983 PRK11127 PRK11127 autonomous glycyl radical cofactor GrcA; Provisional 127
12912 236856 PRK11128 PRK11128 3-phenylpropionate MFS transporter. 382
12913 182985 PRK11130 moaD molybdopterin synthase small subunit; Provisional 81
12914 182986 PRK11131 PRK11131 ATP-dependent RNA helicase HrpA; Provisional 1294
12915 182987 PRK11132 cysE serine acetyltransferase; Provisional 273
12916 182988 PRK11133 serB phosphoserine phosphatase; Provisional 322
12917 236857 PRK11138 PRK11138 outer membrane biogenesis protein BamB; Provisional 394
12918 182990 PRK11139 PRK11139 DNA-binding transcriptional activator GcvA; Provisional 297
12919 236858 PRK11142 PRK11142 ribokinase; Provisional 306
12920 236859 PRK11143 glpQ glycerophosphodiester phosphodiesterase; Provisional 355
12921 182993 PRK11144 modC molybdenum ABC transporter ATP-binding protein ModC. 352
12922 182994 PRK11145 pflA pyruvate formate lyase 1-activating protein. 246
12923 236860 PRK11146 PRK11146 lipoprotein-releasing ABC transporter permease subunit LolE. 412
12924 236861 PRK11147 PRK11147 ABC transporter ATPase component; Reviewed 635
12925 182997 PRK11148 PRK11148 cyclic 3',5'-adenosine monophosphate phosphodiesterase; Provisional 275
12926 182998 PRK11150 rfaD ADP-L-glycero-D-mannoheptose-6-epimerase; Provisional 308
12927 182999 PRK11151 PRK11151 DNA-binding transcriptional regulator OxyR; Provisional 305
12928 236862 PRK11152 ilvM acetolactate synthase 2 small subunit. 76
12929 236863 PRK11153 metN DL-methionine transporter ATP-binding subunit; Provisional 343
12930 236864 PRK11154 fadJ fatty acid oxidation complex subunit alpha FadJ. 708
12931 236865 PRK11160 PRK11160 cysteine/glutathione ABC transporter membrane/ATP-binding component; Reviewed 574
12932 183004 PRK11161 PRK11161 fumarate/nitrate reduction transcriptional regulator Fnr. 235
12933 236866 PRK11162 mltA murein transglycosylase A; Provisional 355
12934 236867 PRK11165 PRK11165 diaminopimelate decarboxylase; Provisional 420
12935 236868 PRK11166 PRK11166 chemotaxis regulator CheZ; Provisional 214
12936 236869 PRK11168 glpC anaerobic glycerol-3-phosphate dehydrogenase subunit C. 396
12937 183009 PRK11169 PRK11169 leucine-responsive transcriptional regulator Lrp. 164
12938 183010 PRK11170 nagA N-acetylglucosamine-6-phosphate deacetylase; Provisional 382
12939 183011 PRK11171 PRK11171 (S)-ureidoglycine aminohydrolase. 266
12940 183012 PRK11172 dkgB 2,5-didehydrogluconate reductase DkgB. 267
12941 183013 PRK11173 PRK11173 two-component response regulator; Provisional 237
12942 236870 PRK11174 PRK11174 cysteine/glutathione ABC transporter membrane/ATP-binding component; Reviewed 588
12943 236871 PRK11175 PRK11175 universal stress protein UspE; Provisional 305
12944 183016 PRK11176 PRK11176 lipid A ABC transporter ATP-binding protein/permease MsbA. 582
12945 183017 PRK11177 PRK11177 phosphoenolpyruvate-protein phosphotransferase PtsI. 575
12946 183018 PRK11178 PRK11178 uridine phosphorylase; Provisional 251
12947 183019 PRK11179 PRK11179 DNA-binding transcriptional regulator AsnC; Provisional 153
12948 183020 PRK11180 rluD 23S rRNA pseudouridine(1911/1915/1917) synthase RluD. 325
12949 183021 PRK11181 PRK11181 23S rRNA (guanosine(2251)-2'-O)-methyltransferase RlmB. 244
12950 236872 PRK11183 PRK11183 D-lactate dehydrogenase; Provisional 564
12951 236873 PRK11186 PRK11186 carboxy terminal-processing peptidase. 667
12952 236874 PRK11187 PRK11187 replication initiation negative regulator SeqA. 182
12953 183025 PRK11188 rrmJ 23S rRNA (uridine(2552)-2'-O)-methyltransferase RlmE. 209
12954 236875 PRK11189 PRK11189 lipoprotein NlpI; Provisional 296
12955 183027 PRK11190 PRK11190 iron-sulfur cluster biogenesis protein NfuA. 192
12956 236876 PRK11191 PRK11191 ribonuclease E inhibitor RraB. 138
12957 236877 PRK11192 PRK11192 ATP-dependent RNA helicase SrmB; Provisional 434
12958 236878 PRK11193 PRK11193 23S rRNA accumulation protein YceD. 172
12959 183031 PRK11194 PRK11194 ribosomal RNA large subunit methyltransferase N; Provisional 372
12960 236879 PRK11195 PRK11195 lysophospholipid transporter LplT; Provisional 393
12961 183033 PRK11197 lldD L-lactate dehydrogenase; Provisional 381
12962 236880 PRK11198 PRK11198 LysM domain/BON superfamily protein; Provisional 147
12963 183035 PRK11199 tyrA bifunctional chorismate mutase/prephenate dehydrogenase; Provisional 374
12964 183036 PRK11200 grxA glutaredoxin 1; Provisional 85
12965 236881 PRK11202 PRK11202 HTH-type transcriptional repressor FabR. 203
12966 236882 PRK11204 PRK11204 N-glycosyltransferase; Provisional 420
12967 236883 PRK11205 tbpA thiamine transporter substrate binding subunit; Provisional 330
12968 183040 PRK11207 PRK11207 tellurite resistance methyltransferase TehB. 197
12969 183041 PRK11212 PRK11212 7-cyano-7-deazaguanine/7-aminomethyl-7-deazaguanine transporter. 210
12970 183042 PRK11228 fecC iron-dicitrate ABC transporter permease FecC. 323
12971 183043 PRK11230 PRK11230 glycolate oxidase subunit GlcD; Provisional 499
12972 183044 PRK11231 fecE Fe(3+) dicitrate ABC transporter ATP-binding protein FecE. 255
12973 183045 PRK11233 PRK11233 nitrogen assimilation transcriptional regulator; Provisional 305
12974 236884 PRK11234 nfrB phage adsorption protein NfrB. 727
12975 183047 PRK11235 PRK11235 type II toxin-antitoxin system RelB/DinJ family antitoxin. 80
12976 183048 PRK11239 PRK11239 hypothetical protein; Provisional 215
12977 183049 PRK11240 PRK11240 penicillin-binding protein 1C; Provisional 772
12978 183050 PRK11241 gabD NADP-dependent succinate-semialdehyde dehydrogenase I. 482
12979 183051 PRK11242 PRK11242 DNA-binding transcriptional regulator CynR; Provisional 296
12980 183052 PRK11244 phnP carbon-phosphorus lyase complex accessory protein; Provisional 250
12981 183053 PRK11245 folX dihydroneopterin triphosphate 2'-epimerase. 120
12982 236885 PRK11246 PRK11246 YtjB family periplasmic protein. 218
12983 183055 PRK11247 ssuB aliphatic sulfonates transport ATP-binding subunit; Provisional 257
12984 183056 PRK11248 tauB taurine ABC transporter ATP-binding subunit. 255
12985 236886 PRK11249 katE hydroperoxidase II; Provisional 752
12986 183058 PRK11251 PRK11251 osmotically-inducible lipoprotein OsmE. 109
12987 183059 PRK11253 ldcA L,D-carboxypeptidase A; Provisional 305
12988 236887 PRK11259 solA N-methyl-L-tryptophan oxidase. 376
12989 183061 PRK11260 PRK11260 cystine ABC transporter substrate-binding protein. 266
12990 236888 PRK11263 PRK11263 cardiolipin synthase ClsB. 411
12991 183063 PRK11264 PRK11264 putative amino-acid ABC transporter ATP-binding protein YecC; Provisional 250
12992 183064 PRK11267 PRK11267 biopolymer transport protein ExbD; Provisional 141
12993 183065 PRK11268 pstA phosphate ABC transporter permease PstA. 295
12994 183066 PRK11269 PRK11269 glyoxylate carboligase; Provisional 591
12995 183067 PRK11272 PRK11272 putative DMT superfamily transporter inner membrane protein; Provisional 292
12996 236889 PRK11273 glpT glycerol-3-phosphate transporter. 452
12997 236890 PRK11274 glcF glycolate oxidase subunit GlcF. 407
12998 183070 PRK11275 pstC phosphate ABC transporter permease PstC. 319
12999 236891 PRK11278 PRK11278 NADH-quinone oxidoreductase subunit NuoF. 448
13000 183072 PRK11280 PRK11280 hypothetical protein; Provisional 170
13001 236892 PRK11281 PRK11281 mechanosensitive channel MscK. 1113
13002 236893 PRK11282 glcE glycolate oxidase FAD binding subunit; Provisional 352
13003 183075 PRK11283 gltP glutamate/aspartate:proton symporter; Provisional 437
13004 183076 PRK11285 araH L-arabinose transporter permease protein; Provisional 333
13005 183077 PRK11288 araG L-arabinose ABC transporter ATP-binding protein AraG. 501
13006 236894 PRK11289 ampC beta-lactamase/D-alanine carboxypeptidase; Provisional 384
13007 236895 PRK11295 PRK11295 HNH nuclease family protein. 113
13008 183080 PRK11300 livG leucine/isoleucine/valine transporter ATP-binding subunit; Provisional 255
13009 236896 PRK11301 livM leucine/isoleucine/valine transporter permease subunit; Provisional 419
13010 183082 PRK11302 PRK11302 DNA-binding transcriptional regulator HexR; Provisional 284
13011 236897 PRK11303 PRK11303 catabolite repressor/activator. 328
13012 236898 PRK11308 dppF dipeptide transporter ATP-binding subunit; Provisional 327
13013 183085 PRK11316 PRK11316 bifunctional D-glycero-beta-D-manno-heptose-7-phosphate kinase/D-glycero-beta-D-manno-heptose 1-phosphate adenylyltransferase HldE. 473
13014 183086 PRK11320 prpB 2-methylisocitrate lyase; Provisional 292
13015 183087 PRK11325 PRK11325 scaffold protein; Provisional 127
13016 183088 PRK11331 PRK11331 5-methylcytosine-specific restriction enzyme subunit McrB; Provisional 459
13017 183089 PRK11337 PRK11337 MurR/RpiR family transcriptional regulator. 292
13018 183090 PRK11339 abgT putative aminobenzoyl-glutamate transporter; Provisional 508
13019 236899 PRK11340 PRK11340 phosphodiesterase YaeI; Provisional 271
13020 183092 PRK11342 mhpD 2-keto-4-pentenoate hydratase; Provisional 262
13021 183093 PRK11346 PRK11346 hypothetical protein; Provisional 285
13022 183094 PRK11347 PRK11347 antitoxin ChpS; Provisional 83
13023 183095 PRK11352 PRK11352 formaldehyde-responsive transcriptional repressor FrmR. 91
13024 236900 PRK11354 kil FtsZ inhibitor protein; Reviewed 73
13025 183096 PRK11357 frlA amino acid permease. 445
13026 183097 PRK11359 PRK11359 cyclic-di-GMP phosphodiesterase; Provisional 799
13027 236901 PRK11360 PRK11360 two-component system sensor histidine kinase AtoS. 607
13028 183099 PRK11361 PRK11361 acetoacetate metabolism transcriptional regulator AtoC. 457
13029 183100 PRK11365 ssuC aliphatic sulfonate ABC transporter permease SsuC. 263
13030 183101 PRK11366 puuD gamma-glutamyl-gamma-aminobutyrate hydrolase; Provisional 254
13031 183102 PRK11367 PRK11367 hypothetical protein; Provisional 476
13032 183103 PRK11370 PRK11370 YciI family protein. 99
13033 183104 PRK11371 PRK11371 hypothetical protein; Provisional 126
13034 236902 PRK11372 PRK11372 C-type lysozyme inhibitor. 109
13035 183106 PRK11375 PRK11375 putative allantoin permease. 484
13036 183107 PRK11376 hlyE hemolysin HlyE. 303
13037 183108 PRK11377 PRK11377 dihydroxyacetone kinase subunit M; Provisional 473
13038 183109 PRK11379 PRK11379 putative outer membrane porin protein; Provisional 417
13039 236903 PRK11380 PRK11380 hypothetical protein; Provisional 353
13040 183111 PRK11382 frlB fructoselysine 6-phosphate deglycase. 340
13041 105206 PRK11383 PRK11383 YiaB family inner membrane protein. 145
13042 183112 PRK11385 PRK11385 fimbrial chaperone. 236
13043 236904 PRK11387 PRK11387 S-methylmethionine permease. 471
13044 183114 PRK11388 PRK11388 DNA-binding transcriptional regulator DhaR; Provisional 638
13045 138553 PRK11391 etp phosphotyrosine-protein phosphatase; Provisional 144
13046 183115 PRK11394 PRK11394 23S rRNA pseudouridine(2457) synthase RluE. 217
13047 236905 PRK11396 PRK11396 environmental stress-induced protein Ves. 191
13048 183117 PRK11397 dacD serine-type D-Ala-D-Ala carboxypeptidase DacD. 388
13049 105214 PRK11401 PRK11401 enamine/imine deaminase. 129
13050 183118 PRK11402 PRK11402 transcriptional regulator PhoB. 241
13051 183119 PRK11403 PRK11403 hypothetical protein; Provisional 113
13052 183120 PRK11404 PRK11404 PTS fructose-like transporter subunit IIBC. 482
13053 236906 PRK11408 PRK11408 hypothetical protein; Provisional 145
13054 171099 PRK11409 PRK11409 YoeB-YefM toxin-antitoxin system antitoxin YefM. 83
13055 236907 PRK11410 PRK11410 hypothetical protein; Provisional 561
13056 183123 PRK11411 fecB iron-dicitrate transporter substrate-binding subunit; Provisional 303
13057 183124 PRK11412 PRK11412 uracil/xanthine transporter. 433
13058 183125 PRK11413 PRK11413 putative hydratase; Provisional 751
13059 183126 PRK11414 PRK11414 GntR family transcriptional regulator. 221
13060 183127 PRK11415 PRK11415 hypothetical protein; Provisional 74
13061 236908 PRK11423 PRK11423 methylmalonyl-CoA decarboxylase; Provisional 261
13062 236909 PRK11424 PRK11424 DNA-binding transcriptional activator TdcR; Provisional 114
13063 183129 PRK11425 PRK11425 PTS N-acetylgalactosamine transporter subunit IIB. 157
13064 183130 PRK11426 PRK11426 hypothetical protein; Provisional 132
13065 183131 PRK11427 PRK11427 multidrug efflux transporter permease subunit MdtO. 683
13066 183132 PRK11430 PRK11430 putative CoA-transferase; Provisional 381
13067 171110 PRK11431 PRK11431 quaternary ammonium compound efflux SMR transporter SugE. 105
13068 183133 PRK11432 fbpC ferric ABC transporter ATP-binding protein. 351
13069 236910 PRK11433 PRK11433 aldehyde oxidoreductase 2Fe-2S subunit; Provisional 217
13070 183135 PRK11436 PRK11436 biofilm-dependent modulation protein; Provisional 71
13071 236911 PRK11439 pphA protein-serine/threonine phosphatase. 218
13072 183137 PRK11440 PRK11440 putative hydrolase; Provisional 188
13073 183138 PRK11443 PRK11443 lipoprotein; Provisional 124
13074 183139 PRK11445 PRK11445 FAD-binding protein. 351
13075 183140 PRK11447 PRK11447 cellulose synthase subunit BcsC; Provisional 1157
13076 236912 PRK11448 hsdR type I restriction enzyme EcoKI subunit R; Provisional 1123
13077 171118 PRK11449 PRK11449 metal-dependent hydrolase. 258
13078 183142 PRK11453 PRK11453 O-acetylserine/cysteine exporter. 299
13079 183143 PRK11459 PRK11459 multidrug resistance outer membrane protein MdtQ; Provisional 478
13080 183144 PRK11460 PRK11460 putative hydrolase; Provisional 232
13081 183145 PRK11462 PRK11462 putative transporter; Provisional 460
13082 236913 PRK11463 fxsA phage T7 F exclusion suppressor FxsA; Reviewed 148
13083 183147 PRK11465 PRK11465 putative mechanosensitive channel protein; Provisional 741
13084 236914 PRK11466 PRK11466 hybrid sensory histidine kinase TorS; Provisional 914
13085 183149 PRK11467 PRK11467 secY/secA suppressor protein; Provisional 124
13086 183150 PRK11468 PRK11468 dihydroxyacetone kinase subunit DhaK; Provisional 356
13087 183151 PRK11469 PRK11469 manganese efflux pump MntP. 188
13088 183152 PRK11470 PRK11470 YebB family permuted papain-like enzyme. 200
13089 236915 PRK11475 PRK11475 DNA-binding transcriptional activator BglJ; Provisional 207
13090 183154 PRK11476 PRK11476 carnitine metabolism transcriptional regulator CaiF. 113
13091 183155 PRK11477 PRK11477 CdaR family transcriptional regulator. 385
13092 183156 PRK11478 PRK11478 VOC family protein. 129
13093 183157 PRK11479 PRK11479 YaeF family permuted papain-like enzyme. 274
13094 183158 PRK11480 tauA taurine transporter substrate binding subunit; Provisional 320
13095 183159 PRK11482 PRK11482 DNA-binding transcriptional regulator. 317
13096 183160 PRK11486 PRK11486 flagellar type III secretion system protein FliO. 124
13097 236916 PRK11492 hyfE hydrogenase 4 membrane subunit; Provisional 216
13098 236917 PRK11493 sseA 3-mercaptopyruvate sulfurtransferase; Provisional 281
13099 236918 PRK11498 bcsA cellulose synthase catalytic subunit; Provisional 852
13100 236919 PRK11504 tynA primary-amine oxidase. 647
13101 183165 PRK11505 PRK11505 phosphate starvation-inducible protein PsiF. 106
13102 183166 PRK11507 PRK11507 ribosome-associated protein YbcJ. 70
13103 183167 PRK11508 PRK11508 sulfurtransferase TusE. 109
13104 183168 PRK11509 PRK11509 hydrogenase-1 operon protein HyaE; Provisional 132
13105 236920 PRK11511 PRK11511 MDR efflux pump AcrAB transcriptional activator MarA. 127
13106 183170 PRK11512 PRK11512 multiple antibiotic resistance transcriptional regulator MarR. 144
13107 236921 PRK11513 PRK11513 cytochrome b561; Provisional 176
13108 183172 PRK11517 PRK11517 DNA-binding response regulator HprR. 223
13109 183173 PRK11519 PRK11519 tyrosine-protein kinase Wzc. 719
13110 183174 PRK11521 PRK11521 DUF2509 family protein. 124
13111 183175 PRK11522 PRK11522 putrescine--2-oxoglutarate aminotransferase; Provisional 459
13112 183176 PRK11523 PRK11523 transcriptional regulator ExuR. 253
13113 183177 PRK11524 PRK11524 adenine-specific DNA-methyltransferase. 284
13114 183178 PRK11525 dinD DNA-damage-inducible protein D; Provisional 279
13115 236922 PRK11528 PRK11528 hypothetical protein; Provisional 254
13116 236923 PRK11530 PRK11530 hypothetical protein; Provisional 183
13117 183181 PRK11534 PRK11534 DNA-binding transcriptional regulator CsiR; Provisional 224
13118 183182 PRK11536 PRK11536 6-N-hydroxylaminopurine resistance protein; Provisional 223
13119 183183 PRK11537 PRK11537 putative GTP-binding protein YjiA; Provisional 318
13120 183184 PRK11538 PRK11538 ribosome silencing factor. 105
13121 236924 PRK11539 PRK11539 ComEC family competence protein; Provisional 755
13122 183186 PRK11543 gutQ arabinose-5-phosphate isomerase GutQ. 321
13123 236925 PRK11544 hycI hydrogenase 3 maturation protease; Provisional 156
13124 236926 PRK11545 gntK gluconokinase. 163
13125 183189 PRK11546 zraP zinc resistance sensor/chaperone ZraP. 143
13126 183190 PRK11548 PRK11548 outer membrane protein assembly factor BamE. 113
13127 236927 PRK11551 PRK11551 putative 3-hydroxyphenylpropionic transporter MhpT; Provisional 406
13128 236928 PRK11552 PRK11552 putative DNA-binding transcriptional regulator; Provisional 225
13129 236929 PRK11553 PRK11553 alkanesulfonate transporter substrate-binding subunit; Provisional 314
13130 183194 PRK11556 PRK11556 MdtA/MuxA family multidrug efflux RND transporter periplasmic adaptor subunit. 415
13131 183195 PRK11557 PRK11557 MurR/RpiR family transcriptional regulator. 278
13132 236930 PRK11558 PRK11558 putative ssRNA endonuclease; Provisional 97
13133 183197 PRK11559 garR tartronate semialdehyde reductase; Provisional 296
13134 183198 PRK11560 PRK11560 kdo(2)-lipid A phosphoethanolamine 7''-transferase. 558
13135 183199 PRK11561 PRK11561 isovaleryl CoA dehydrogenase; Provisional 538
13136 183200 PRK11562 PRK11562 nitrite transporter NirC; Provisional 268
13137 236931 PRK11563 PRK11563 bifunctional aldehyde dehydrogenase/enoyl-CoA hydratase; Provisional 675
13138 236932 PRK11564 PRK11564 stationary phase inducible protein CsiE; Provisional 426
13139 183203 PRK11565 dkgA 2,5-didehydrogluconate reductase DkgA. 275
13140 183204 PRK11566 hdeB acid-activated periplasmic chaperone HdeB. 102
13141 183205 PRK11568 PRK11568 IMPACT family protein. 204
13142 183206 PRK11569 PRK11569 glyoxylate bypass operon transcriptional repressor IclR. 274
13143 183207 PRK11570 PRK11570 peptidyl-prolyl cis-trans isomerase; Provisional 206
13144 183208 PRK11572 PRK11572 copper homeostasis protein CutC; Provisional 248
13145 236933 PRK11573 PRK11573 hypothetical protein; Provisional 413
13146 183210 PRK11574 PRK11574 protein deglycase YajL. 196
13147 183211 PRK11578 PRK11578 macrolide transporter subunit MacA; Provisional 370
13148 183212 PRK11579 PRK11579 putative oxidoreductase; Provisional 346
13149 183213 PRK11582 PRK11582 flagella biosynthesis regulatory protein FliZ. 169
13150 183214 PRK11586 napB nitrate reductase cytochrome c-type subunit. 149
13151 183215 PRK11587 PRK11587 putative phosphatase; Provisional 218
13152 236934 PRK11588 PRK11588 putative basic amino acid antiporter YfcC. 506
13153 236935 PRK11589 gcvR glycine cleavage system transcriptional repressor; Provisional 190
13154 183218 PRK11590 PRK11590 hypothetical protein; Provisional 211
13155 183219 PRK11593 folB bifunctional dihydroneopterin aldolase/7,8-dihydroneopterin epimerase. 119
13156 183220 PRK11594 PRK11594 efflux system membrane protein; Provisional 67
13157 183221 PRK11595 PRK11595 DNA utilization protein GntX; Provisional 227
13158 183222 PRK11596 PRK11596 cyclic-di-GMP phosphodiesterase; Provisional 255
13159 183223 PRK11597 PRK11597 heat shock chaperone IbpB; Provisional 142
13160 183224 PRK11598 PRK11598 putative metal dependent hydrolase; Provisional 545
13161 183225 PRK11602 cysW sulfate/thiosulfate ABC transporter permease CysW. 283
13162 183226 PRK11607 potG putrescine ABC transporter ATP-binding subunit PotG. 377
13163 236936 PRK11608 pspF phage shock protein operon transcriptional activator; Provisional 326
13164 183228 PRK11609 PRK11609 bifunctional nicotinamidase/pyrazinamidase. 212
13165 183229 PRK11611 PRK11611 enhanced serine sensitivity protein SseB; Provisional 246
13166 183230 PRK11613 folP dihydropteroate synthase; Provisional 282
13167 183231 PRK11614 livF high-affinity branched-chain amino acid ABC transporter ATP-binding protein LivF. 237
13168 183232 PRK11615 PRK11615 hypothetical protein; Provisional 185
13169 183233 PRK11616 PRK11616 hypothetical protein; Provisional 109
13170 236937 PRK11617 PRK11617 deoxyribonuclease V. 224
13171 183235 PRK11618 PRK11618 inner membrane ABC transporter permease protein YjfF; Provisional 317
13172 183236 PRK11619 PRK11619 lytic murein transglycosylase; Provisional 644
13173 236938 PRK11621 PRK11621 Tat proofreading chaperone DmsD. 204
13174 183238 PRK11622 PRK11622 ABC transporter substrate-binding protein. 401
13175 236939 PRK11623 pcnB poly(A) polymerase I; Provisional 472
13176 183240 PRK11624 cdsA phosphatidate cytidylyltransferase. 285
13177 183241 PRK11625 PRK11625 Rho-binding antiterminator; Provisional 84
13178 183242 PRK11627 PRK11627 hypothetical protein; Provisional 192
13179 183243 PRK11628 PRK11628 transcriptional regulator BolA; Provisional 105
13180 183244 PRK11629 lolD lipoprotein-releasing ABC transporter ATP-binding protein LolD. 233
13181 183245 PRK11630 PRK11630 threonylcarbamoyl-AMP synthase. 206
13182 236940 PRK11633 PRK11633 cell division protein DedD; Provisional 226
13183 236941 PRK11634 PRK11634 ATP-dependent RNA helicase DeaD; Provisional 629
13184 183248 PRK11636 mrcA penicillin-binding protein 1a; Provisional 850
13185 236942 PRK11637 PRK11637 AmiB activator; Provisional 428
13186 236943 PRK11638 PRK11638 ECA polysaccharide chain length modulation protein. 342
13187 183251 PRK11639 PRK11639 zinc uptake transcriptional repressor Zur. 169
13188 183252 PRK11640 PRK11640 putative transcriptional regulator; Provisional 191
13189 236944 PRK11642 PRK11642 ribonuclease R. 813
13190 236945 PRK11644 PRK11644 signal transduction histidine-protein kinase/phosphatase UhpB. 495
13191 183255 PRK11646 PRK11646 multidrug efflux MFS transporter MdtH. 400
13192 183256 PRK11648 PRK11648 metal-dependent hydrolase. 195
13193 236946 PRK11649 PRK11649 putative peptidase; Provisional 439
13194 236947 PRK11650 ugpC sn-glycerol-3-phosphate ABC transporter ATP-binding protein UgpC. 356
13195 183259 PRK11652 emrD multidrug transporter EmrD. 394
13196 236948 PRK11653 PRK11653 DUF1190 family protein. 225
13197 236949 PRK11655 ubiC chorismate pyruvate lyase; Provisional 169
13198 183262 PRK11657 dsbG disulfide isomerase/thiol-disulfide oxidase; Provisional 251
13199 183263 PRK11658 PRK11658 UDP-4-amino-4-deoxy-L-arabinose aminotransferase. 379
13200 183264 PRK11659 PRK11659 cytochrome c nitrite reductase pentaheme subunit; Provisional 183
13201 183265 PRK11660 PRK11660 putative transporter; Provisional 568
13202 183266 PRK11663 PRK11663 glucose-6-phosphate receptor/MFS transporter UhpC. 434
13203 236950 PRK11664 PRK11664 ATP-dependent RNA helicase HrpB; Provisional 812
13204 236951 PRK11667 PRK11667 hypothetical protein; Provisional 163
13205 236952 PRK11669 pbpG D-alanyl-D-alanine endopeptidase; Provisional 306
13206 183270 PRK11670 PRK11670 iron-sulfur cluster carrier protein ApbC. 369
13207 183271 PRK11671 mltC membrane-bound lytic murein transglycosylase MltC. 359
13208 183272 PRK11675 PRK11675 LexA regulated protein; Provisional 90
13209 236953 PRK11677 PRK11677 DUF1043 family protein. 134
13210 236954 PRK11678 PRK11678 putative chaperone; Provisional 450
13211 236955 PRK11679 PRK11679 outer membrane protein assembly factor BamC. 346
13212 183276 PRK11688 PRK11688 thioesterase family protein. 154
13213 183277 PRK11689 PRK11689 aromatic amino acid efflux DMT transporter YddG. 295
13214 236956 PRK11697 PRK11697 two-component system response regulator BtsR. 238
13215 236957 PRK11700 PRK11700 VOC family protein. 187
13216 183280 PRK11701 phnK phosphonate C-P lyase system protein PhnK; Provisional 258
13217 183281 PRK11702 PRK11702 hypothetical protein; Provisional 108
13218 183282 PRK11705 PRK11705 cyclopropane fatty acyl phospholipid synthase. 383
13219 183283 PRK11706 PRK11706 TDP-4-oxo-6-deoxy-D-glucose transaminase; Provisional 375
13220 236958 PRK11709 PRK11709 putative L-ascorbate 6-phosphate lactonase; Provisional 355
13221 183285 PRK11712 PRK11712 ribonuclease G; Provisional 489
13222 236959 PRK11713 PRK11713 16S ribosomal RNA methyltransferase RsmE; Provisional 234
13223 236960 PRK11715 PRK11715 cell envelope integrity protein CreD. 436
13224 236961 PRK11716 PRK11716 HTH-type transcriptional activator IlvY. 269
13225 236962 PRK11718 PRK11718 sigma D regulator. 161
13226 236963 PRK11720 PRK11720 UDP-glucose--hexose-1-phosphate uridylyltransferase. 346
13227 236964 PRK11727 PRK11727 23S rRNA (adenine(1618)-N(6))-methyltransferase RlmF. 321
13228 183292 PRK11728 PRK11728 L-2-hydroxyglutarate oxidase. 393
13229 183293 PRK11730 fadB fatty acid oxidation complex subunit alpha FadB. 715
13230 236965 PRK11742 PRK11742 bifunctional NADH:ubiquinone oxidoreductase subunit C/D; Provisional 575
13231 236966 PRK11747 dinG ATP-dependent DNA helicase DinG; Provisional 697
13232 236967 PRK11749 PRK11749 dihydropyrimidine dehydrogenase subunit A; Provisional 457
13233 236968 PRK11750 gltB glutamate synthase subunit alpha; Provisional 1485
13234 183298 PRK11752 PRK11752 putative S-transferase; Provisional 264
13235 236969 PRK11753 PRK11753 cAMP-activated global transcriptional regulator CRP. 211
13236 236970 PRK11756 PRK11756 exonuclease III; Provisional 268
13237 236971 PRK11760 PRK11760 putative 23S rRNA C2498 ribose 2'-O-ribose methyltransferase; Provisional 357
13238 236972 PRK11761 cysM cysteine synthase CysM. 296
13239 183303 PRK11762 nudE adenosine nucleotide hydrolase NudE; Provisional 185
13240 236973 PRK11767 PRK11767 SpoVR family protein; Provisional 498
13241 236974 PRK11768 PRK11768 serine/threonine protein kinase. 325
13242 236975 PRK11770 PRK11770 YccF domain-containing protein. 135
13243 236976 PRK11773 uvrD DNA-dependent helicase II; Provisional 721
13244 236977 PRK11776 PRK11776 ATP-dependent RNA helicase DbpA; Provisional 460
13245 236978 PRK11778 PRK11778 putative inner membrane peptidase; Provisional 330
13246 236979 PRK11779 sbcB exonuclease I; Provisional 476
13247 236980 PRK11780 PRK11780 isoprenoid biosynthesis glyoxalase ElbB. 217
13248 236981 PRK11783 rlmL bifunctional 23S rRNA (guanine(2069)-N(7))-methyltransferase RlmK/23S rRNA (guanine(2445)-N(2))-methyltransferase RlmL. 702
13249 236982 PRK11784 PRK11784 tRNA 2-selenouridine synthase; Provisional 345
13250 236983 PRK11788 PRK11788 tetratricopeptide repeat protein; Provisional 389
13251 236984 PRK11789 PRK11789 1,6-anhydro-N-acetylmuramyl-L-alanine amidase AmpD. 185
13252 236985 PRK11790 PRK11790 phosphoglycerate dehydrogenase. 409
13253 236986 PRK11792 queF 7-cyano-7-deazaguanine reductase; Provisional 273
13254 183318 PRK11797 PRK11797 D-ribose pyranase; Provisional 139
13255 236987 PRK11798 PRK11798 ClpXP protease specificity-enhancing factor; Provisional 138
13256 236988 PRK11805 PRK11805 50S ribosomal protein L3 N(5)-glutamine methyltransferase. 307
13257 236989 PRK11809 putA trifunctional transcriptional regulator/proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed 1318
13258 236990 PRK11814 PRK11814 cysteine desulfurase activator complex subunit SufB; Provisional 486
13259 236991 PRK11815 PRK11815 tRNA dihydrouridine(20/20a) synthase DusA. 333
13260 236992 PRK11819 PRK11819 putative ABC transporter ATP-binding protein; Reviewed 556
13261 236993 PRK11820 PRK11820 YicC family protein. 288
13262 236994 PRK11823 PRK11823 DNA repair protein RadA; Provisional 446
13263 236995 PRK11824 PRK11824 polynucleotide phosphorylase/polyadenylase; Provisional 693
13264 183328 PRK11827 PRK11827 protein YcaR. 60
13265 183329 PRK11829 PRK11829 biofilm formation regulator HmsP; Provisional 660
13266 236996 PRK11830 dapD 2,3,4,5-tetrahydropyridine-2,6-carboxylate N-succinyltransferase; Provisional 272
13267 236997 PRK11831 PRK11831 phospholipid ABC transporter ATP-binding protein MlaF. 269
13268 183332 PRK11832 PRK11832 hydrogen peroxide resistance inhibitor IprA. 207
13269 183333 PRK11835 PRK11835 hypothetical protein; Provisional 114
13270 183334 PRK11836 PRK11836 deubiquitinase; Provisional 403
13271 183335 PRK11837 PRK11837 undecaprenyl pyrophosphate phosphatase; Provisional 202
13272 236998 PRK11840 PRK11840 bifunctional sulfur carrier protein/thiazole synthase protein; Provisional 326
13273 236999 PRK11854 aceF pyruvate dehydrogenase dihydrolipoyltransacetylase; Validated 633
13274 237000 PRK11855 PRK11855 dihydrolipoamide acetyltransferase; Reviewed 547
13275 237001 PRK11856 PRK11856 branched-chain alpha-keto acid dehydrogenase subunit E2; Reviewed 411
13276 237002 PRK11857 PRK11857 dihydrolipoamide acetyltransferase; Reviewed 306
13277 183341 PRK11858 aksA trans-homoaconitate synthase; Reviewed 378
13278 237003 PRK11860 PRK11860 bifunctional 3-phosphoshikimate 1-carboxyvinyltransferase/cytidylate kinase. 661
13279 183343 PRK11861 PRK11861 bifunctional prephenate dehydrogenase/3-phosphoshikimate 1-carboxyvinyltransferase; Provisional 673
13280 237004 PRK11863 PRK11863 N-acetyl-gamma-glutamyl-phosphate reductase; Provisional 313
13281 237005 PRK11864 PRK11864 3-methyl-2-oxobutanoate dehydrogenase subunit beta. 300
13282 183346 PRK11865 PRK11865 pyruvate synthase subunit beta. 299
13283 183347 PRK11866 PRK11866 2-oxoacid ferredoxin oxidoreductase subunit beta; Provisional 279
13284 237006 PRK11867 PRK11867 2-oxoglutarate ferredoxin oxidoreductase subunit beta; Reviewed 286
13285 183349 PRK11869 PRK11869 2-oxoacid ferredoxin oxidoreductase subunit beta; Provisional 280
13286 183350 PRK11872 antC anthranilate 1,2-dioxygenase electron transfer component AntC. 340
13287 237007 PRK11873 arsM arsenite methyltransferase. 272
13288 183352 PRK11874 petL cytochrome b6-f complex subunit PetL; Reviewed 30
13289 183353 PRK11875 psbT photosystem II reaction center protein T; Reviewed 31
13290 183354 PRK11876 petM cytochrome b6-f complex subunit PetM; Reviewed 32
13291 183355 PRK11877 psaI photosystem I reaction center subunit VIII; Reviewed 38
13292 183356 PRK11878 psaM photosystem I reaction center subunit XII; Reviewed 34
13293 237008 PRK11880 PRK11880 pyrroline-5-carboxylate reductase; Reviewed 267
13294 237009 PRK11883 PRK11883 protoporphyrinogen oxidase; Reviewed 451
13295 237010 PRK11886 PRK11886 bifunctional biotin--[acetyl-CoA-carboxylase] ligase/biotin operon repressor BirA. 319
13296 183360 PRK11889 flhF flagellar biosynthesis protein FlhF. 436
13297 183361 PRK11890 PRK11890 phosphate acetyltransferase; Provisional 312
13298 183362 PRK11891 PRK11891 aspartate carbamoyltransferase; Provisional 429
13299 237011 PRK11892 PRK11892 pyruvate dehydrogenase subunit beta; Provisional 464
13300 237012 PRK11893 PRK11893 methionyl-tRNA synthetase; Reviewed 511
13301 183365 PRK11895 ilvH acetolactate synthase 3 regulatory subunit; Reviewed 161
13302 237013 PRK11898 PRK11898 prephenate dehydratase; Provisional 283
13303 237014 PRK11899 PRK11899 prephenate dehydratase; Provisional 279
13304 237015 PRK11901 PRK11901 hypothetical protein; Reviewed 327
13305 183369 PRK11902 ampG muropeptide transporter; Reviewed 402
13306 237016 PRK11903 PRK11903 3,4-dehydroadipyl-CoA semialdehyde dehydrogenase. 521
13307 237017 PRK11904 PRK11904 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA. 1038
13308 237018 PRK11905 PRK11905 bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed 1208
13309 183373 PRK11906 PRK11906 HilA family transcriptional regulator YgeH. 458
13310 237019 PRK11907 PRK11907 bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 814
13311 183375 PRK11908 PRK11908 bifunctional UDP-4-keto-pentose/UDP-xylose synthase. 347
13312 183376 PRK11909 PRK11909 cobalt transporter CbiM. 230
13313 183377 PRK11910 PRK11910 amidase; Provisional 615
13314 138812 PRK11911 flgD flagellar basal body rod modification protein; Provisional 140
13315 237020 PRK11913 phhA phenylalanine 4-monooxygenase; Reviewed 275
13316 237021 PRK11914 PRK11914 diacylglycerol kinase; Reviewed 306
13317 237022 PRK11915 PRK11915 lysophospholipid acyltransferase. 621
13318 183380 PRK11916 PRK11916 electron transfer flavoprotein subunit alpha. 312
13319 183381 PRK11917 PRK11917 bifunctional adhesin/ABC transporter aspartate/glutamate-binding protein; Reviewed 259
13320 171344 PRK11920 rirA iron-responsive transcriptional regulator RirA. 153
13321 237023 PRK11921 PRK11921 anaerobic nitric oxide reductase flavorubredoxin. 394
13322 237024 PRK11922 PRK11922 RNA polymerase sigma factor; Provisional 231
13323 171347 PRK11923 algU RNA polymerase sigma factor AlgU; Provisional 193
13324 183384 PRK11924 PRK11924 RNA polymerase sigma factor; Provisional 179
13325 237025 PRK11929 PRK11929 bifunctional UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,6-diaminopimelate ligase MurE/UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase MurF. 958
13326 237026 PRK11930 PRK11930 putative bifunctional UDP-N-acetylmuramoyl-tripeptide:D-alanyl-D-alanine ligase/alanine racemase; Provisional 822
13327 183387 PRK11933 yebU rRNA (cytosine-C(5)-)-methyltransferase RsmF; Reviewed 470
13328 237027 PRK12266 glpD glycerol-3-phosphate dehydrogenase; Reviewed 508
13329 237028 PRK12267 PRK12267 methionyl-tRNA synthetase; Reviewed 648
13330 237029 PRK12268 PRK12268 methionyl-tRNA synthetase; Reviewed 556
13331 105491 PRK12269 PRK12269 bifunctional cytidylate kinase/ribosomal protein S1; Provisional 863
13332 237030 PRK12270 kgd multifunctional oxoglutarate decarboxylase/oxoglutarate dehydrogenase thiamine pyrophosphate-binding subunit/dihydrolipoyllysine-residue succinyltransferase subunit. 1228
13333 183392 PRK12271 rps10p 30S ribosomal protein S10P; Reviewed 102
13334 237031 PRK12273 aspA aspartate ammonia-lyase; Provisional 472
13335 237032 PRK12274 PRK12274 serine/threonine protein kinase; Provisional 218
13336 183395 PRK12275 PRK12275 hypothetical protein; Reviewed 116
13337 237033 PRK12276 PRK12276 putative heme peroxidase; Provisional 248
13338 183397 PRK12277 PRK12277 50S ribosomal protein L13e; Provisional 83
13339 237034 PRK12278 PRK12278 50S ribosomal protein L21. 221
13340 138835 PRK12279 PRK12279 50S ribosomal protein L22/unknown domain fusion protein; Provisional 311
13341 237035 PRK12280 rplW 50S ribosomal protein L23; Reviewed 158
13342 183399 PRK12281 rplX 50S ribosomal protein L24; Reviewed 76
13343 183400 PRK12282 PRK12282 tryptophanyl-tRNA synthetase II; Reviewed 333
13344 183401 PRK12283 PRK12283 tryptophanyl-tRNA synthetase; Reviewed 398
13345 237036 PRK12284 PRK12284 tryptophanyl-tRNA synthetase; Reviewed 431
13346 237037 PRK12285 PRK12285 tryptophanyl-tRNA synthetase; Reviewed 368
13347 237038 PRK12286 rpmF 50S ribosomal protein L32; Reviewed 57
13348 183405 PRK12287 tqsA pheromone autoinducer 2 transporter; Reviewed 344
13349 237039 PRK12288 PRK12288 small ribosomal subunit biogenesis GTPase RsgA. 347
13350 237040 PRK12289 PRK12289 small ribosomal subunit biogenesis GTPase RsgA. 352
13351 237041 PRK12290 thiE thiamine phosphate synthase. 437
13352 237042 PRK12291 PRK12291 apolipoprotein N-acyltransferase; Reviewed 418
13353 237043 PRK12292 hisZ ATP phosphoribosyltransferase regulatory subunit; Provisional 391
13354 183411 PRK12293 hisZ ATP phosphoribosyltransferase regulatory subunit; Provisional 281
13355 237044 PRK12294 hisZ ATP phosphoribosyltransferase regulatory subunit; Provisional 272
13356 183413 PRK12295 hisZ ATP phosphoribosyltransferase regulatory subunit; Provisional 373
13357 237045 PRK12296 obgE GTPase CgtA; Reviewed 500
13358 237046 PRK12297 obgE GTPase CgtA; Reviewed 424
13359 237047 PRK12298 obgE GTPase CgtA; Reviewed 390
13360 237048 PRK12299 obgE GTPase CgtA; Reviewed 335
13361 237049 PRK12300 leuS leucyl-tRNA synthetase; Reviewed 897
13362 183419 PRK12301 bssS biofilm formation regulator BssS. 84
13363 183420 PRK12302 bssR biofilm formation regulator BssR. 127
13364 183421 PRK12303 PRK12303 tumor necrosis factor alpha-inducing protein; Reviewed 192
13365 237050 PRK12305 thrS threonyl-tRNA synthetase; Reviewed 575
13366 183423 PRK12306 uvrC excinuclease ABC subunit C; Reviewed 519
13367 237051 PRK12307 PRK12307 MFS transporter. 426
13368 183425 PRK12308 PRK12308 argininosuccinate lyase. 614
13369 183426 PRK12309 PRK12309 transaldolase. 391
13370 183427 PRK12310 PRK12310 hydroxylamine reductase; Provisional 433
13371 183428 PRK12311 rpsB 30S ribosomal protein S2. 326
13372 237052 PRK12313 PRK12313 1,4-alpha-glucan branching protein GlgB. 633
13373 183430 PRK12314 PRK12314 gamma-glutamyl kinase; Provisional 266
13374 237053 PRK12315 PRK12315 1-deoxy-D-xylulose-5-phosphate synthase; Provisional 581
13375 237054 PRK12316 PRK12316 peptide synthase; Provisional 5163
13376 237055 PRK12317 PRK12317 elongation factor 1-alpha; Reviewed 425
13377 183434 PRK12318 PRK12318 methionyl aminopeptidase. 291
13378 183435 PRK12319 PRK12319 acetyl-CoA carboxylase subunit alpha; Provisional 256
13379 138873 PRK12320 PRK12320 hypothetical protein; Provisional 699
13380 237056 PRK12321 cobN cobaltochelatase subunit CobN; Reviewed 1100
13381 183437 PRK12322 PRK12322 NADH-quinone oxidoreductase subunit D. 366
13382 237057 PRK12323 PRK12323 DNA polymerase III subunit gamma/tau. 700
13383 237058 PRK12324 PRK12324 decaprenyl-phosphate phosphoribosyltransferase. 295
13384 237059 PRK12325 PRK12325 prolyl-tRNA synthetase; Provisional 439
13385 237060 PRK12326 PRK12326 preprotein translocase subunit SecA; Reviewed 764
13386 237061 PRK12327 nusA transcription elongation factor NusA; Provisional 362
13387 237062 PRK12328 nusA transcription termination/antitermination protein NusA. 374
13388 237063 PRK12329 nusA transcription termination factor NusA. 449
13389 183445 PRK12330 PRK12330 methylmalonyl-CoA carboxytransferase subunit 5S. 499
13390 183446 PRK12331 PRK12331 oxaloacetate decarboxylase subunit alpha. 448
13391 183447 PRK12332 tsf elongation factor Ts; Reviewed 198
13392 237064 PRK12333 PRK12333 nucleoside triphosphate pyrophosphohydrolase; Reviewed 204
13393 237065 PRK12334 PRK12334 nucleoside triphosphate pyrophosphohydrolase; Reviewed 277
13394 183450 PRK12335 PRK12335 tellurite resistance protein TehB; Provisional 287
13395 183451 PRK12336 PRK12336 translation initiation factor IF-2 subunit beta; Provisional 201
13396 183452 PRK12337 PRK12337 2-phosphoglycerate kinase; Provisional 475
13397 237066 PRK12338 PRK12338 hypothetical protein; Provisional 319
13398 105560 PRK12339 PRK12339 mevalonate-3-phosphate 5-kinase. 197
13399 183454 PRK12341 PRK12341 acyl-CoA dehydrogenase. 381
13400 183455 PRK12342 PRK12342 electron transfer flavoprotein. 254
13401 237067 PRK12343 PRK12343 cyclic pyranopterin monophosphate synthase MoaC. 151
13402 237068 PRK12344 PRK12344 putative alpha-isopropylmalate/homocitrate synthase family transferase; Provisional 524
13403 183458 PRK12346 PRK12346 transaldolase A; Provisional 316
13404 183459 PRK12347 sgbE L-ribulose-5-phosphate 4-epimerase; Reviewed 231
13405 183460 PRK12348 sgaE L-ribulose-5-phosphate 4-epimerase; Reviewed 228
13406 237069 PRK12349 PRK12349 citrate synthase. 369
13407 237070 PRK12350 PRK12350 citrate synthase 2; Provisional 353
13408 183463 PRK12351 PRK12351 methylcitrate synthase; Provisional 378
13409 183464 PRK12352 PRK12352 putative carbamate kinase; Reviewed 316
13410 237071 PRK12353 PRK12353 putative amino acid kinase; Reviewed 314
13411 183466 PRK12354 PRK12354 carbamate kinase; Reviewed 307
13412 237072 PRK12355 PRK12355 type-F conjugative transfer system mating-pair stabilization protein TraN. 558
13413 237073 PRK12356 PRK12356 glutaminase; Reviewed 319
13414 237074 PRK12357 PRK12357 glutaminase; Reviewed 326
13415 183470 PRK12358 PRK12358 glucosamine-6-phosphate deaminase. 239
13416 183471 PRK12359 PRK12359 flavodoxin FldB; Provisional 172
13417 237075 PRK12360 PRK12360 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Provisional 281
13418 183473 PRK12361 PRK12361 hypothetical protein; Provisional 547
13419 237076 PRK12362 PRK12362 germination protease; Provisional 318
13420 171438 PRK12363 PRK12363 phosphoglycerol transferase I; Provisional 703
13421 237077 PRK12364 PRK12364 ribonucleoside-diphosphate reductase subunit alpha. 842
13422 171440 PRK12365 PRK12365 ribonucleoside-diphosphate reductase subunit alpha. 1046
13423 237078 PRK12366 PRK12366 replication factor A; Reviewed 637
13424 237079 PRK12367 PRK12367 short chain dehydrogenase; Provisional 245
13425 171443 PRK12369 PRK12369 putative transporter; Reviewed 326
13426 237080 PRK12370 PRK12370 HilA/EilA family virulence transcriptional regulator. 553
13427 171444 PRK12371 PRK12371 ribonuclease III; Reviewed 235
13428 237081 PRK12372 PRK12372 ribonuclease III; Reviewed 413
13429 237082 PRK12373 PRK12373 NADH-quinone oxidoreductase subunit E. 400
13430 237083 PRK12374 PRK12374 ATP-dependent dethiobiotin synthetase BioD. 231
13431 183481 PRK12376 PRK12376 putative transaldolase; Provisional 236
13432 183482 PRK12377 PRK12377 putative replication protein; Provisional 248
13433 237084 PRK12378 PRK12378 YebC/PmpR family DNA-binding transcriptional regulator. 235
13434 183484 PRK12379 PRK12379 propionate kinase. 396
13435 183485 PRK12380 PRK12380 hydrogenase/urease nickel incorporation protein. 113
13436 183486 PRK12381 PRK12381 bifunctional succinylornithine transaminase/acetylornithine transaminase; Provisional 406
13437 183487 PRK12382 PRK12382 putative transporter; Provisional 392
13438 237085 PRK12383 PRK12383 putative mutase; Provisional 406
13439 183489 PRK12384 PRK12384 sorbitol-6-phosphate dehydrogenase; Provisional 259
13440 183490 PRK12385 PRK12385 succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 244
13441 237086 PRK12386 PRK12386 fumarate reductase iron-sulfur subunit; Provisional 251
13442 183492 PRK12387 PRK12387 formate hydrogenlyase complex iron-sulfur subunit; Provisional 180
13443 171459 PRK12388 PRK12388 class II fructose-bisphosphatase. 321
13444 183493 PRK12389 PRK12389 glutamate-1-semialdehyde aminotransferase; Provisional 428
13445 183494 PRK12390 PRK12390 1-aminocyclopropane-1-carboxylate deaminase; Provisional 337
13446 237087 PRK12391 PRK12391 TrpB-like pyridoxal phosphate-dependent enzyme. 427
13447 171463 PRK12392 PRK12392 bacteriochlorophyll c synthase; Provisional 331
13448 237088 PRK12393 PRK12393 amidohydrolase; Provisional 457
13449 183497 PRK12394 PRK12394 metallo-dependent hydrolase. 379
13450 183498 PRK12395 PRK12395 maltoporin; Provisional 419
13451 183499 PRK12396 PRK12396 5-methylribose kinase; Reviewed 409
13452 183500 PRK12397 PRK12397 acetate/propionate family kinase. 404
13453 237089 PRK12398 PRK12398 pyruvoyl-dependent arginine decarboxylase; Provisional 162
13454 183502 PRK12399 PRK12399 tagatose 1,6-diphosphate aldolase; Reviewed 324
13455 171470 PRK12400 PRK12400 D-amino acid aminotransferase; Reviewed 290
13456 237090 PRK12402 PRK12402 replication factor C small subunit 2; Reviewed 337
13457 171472 PRK12403 PRK12403 aspartate aminotransferase family protein. 460
13458 183504 PRK12404 PRK12404 stage V sporulation protein AD; Provisional 334
13459 237091 PRK12405 PRK12405 electron transport complex RsxE subunit; Provisional 231
13460 183506 PRK12406 PRK12406 long-chain-fatty-acid--CoA ligase; Provisional 509
13461 183507 PRK12407 flgH flagellar basal body L-ring protein FlgH. 221
13462 237092 PRK12408 PRK12408 glucokinase; Provisional 336
13463 237093 PRK12409 PRK12409 D-amino acid dehydrogenase small subunit; Provisional 410
13464 237094 PRK12410 PRK12410 glutamylglutaminyl-tRNA synthetase; Provisional 433
13465 183511 PRK12411 PRK12411 cytidine deaminase; Provisional 132
13466 183512 PRK12412 PRK12412 bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 268
13467 183513 PRK12413 PRK12413 bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 253
13468 183514 PRK12414 PRK12414 putative aminotransferase; Provisional 384
13469 183515 PRK12415 PRK12415 fructose-bisphosphatase class II. 322
13470 183516 PRK12416 PRK12416 protoporphyrinogen oxidase; Provisional 463
13471 237095 PRK12417 secY preprotein translocase subunit SecY; Reviewed 404
13472 183518 PRK12418 PRK12418 cysteinyl-tRNA synthetase; Provisional 384
13473 237096 PRK12419 PRK12419 6,7-dimethyl-8-ribityllumazine synthase. 158
13474 237097 PRK12420 PRK12420 histidyl-tRNA synthetase; Provisional 423
13475 237098 PRK12421 PRK12421 ATP phosphoribosyltransferase regulatory subunit; Provisional 392
13476 183521 PRK12422 PRK12422 chromosomal replication initiator protein DnaA. 445
13477 171489 PRK12423 PRK12423 LexA repressor; Provisional 202
13478 171490 PRK12425 PRK12425 class II fumarate hydratase. 464
13479 183522 PRK12426 PRK12426 elongation factor P; Provisional 185
13480 183523 PRK12427 PRK12427 FliA/WhiG family RNA polymerase sigma factor. 231
13481 237099 PRK12428 PRK12428 coniferyl-alcohol dehydrogenase. 241
13482 237100 PRK12429 PRK12429 3-hydroxybutyrate dehydrogenase; Provisional 258
13483 237101 PRK12430 PRK12430 putative bifunctional flagellar biosynthesis protein FliO/FliP; Provisional 379
13484 171495 PRK12434 PRK12434 tRNA pseudouridine(38-40) synthase TruA. 245
13485 183526 PRK12435 PRK12435 ferrochelatase; Provisional 311
13486 171497 PRK12436 PRK12436 UDP-N-acetylmuramate dehydrogenase. 305
13487 183527 PRK12437 PRK12437 prolipoprotein diacylglyceryl transferase; Reviewed 269
13488 171499 PRK12438 PRK12438 hypothetical protein; Provisional 991
13489 171500 PRK12439 PRK12439 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional 341
13490 183528 PRK12440 PRK12440 acetate/propionate family kinase. 397
13491 237102 PRK12442 PRK12442 translation initiation factor IF-1; Reviewed 87
13492 183530 PRK12444 PRK12444 threonyl-tRNA synthetase; Reviewed 639
13493 171504 PRK12445 PRK12445 lysyl-tRNA synthetase; Reviewed 505
13494 171505 PRK12446 PRK12446 undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase; Reviewed 352
13495 237103 PRK12447 PRK12447 histidinol dehydrogenase; Reviewed 426
13496 237104 PRK12448 PRK12448 dihydroxy-acid dehydratase; Provisional 615
13497 183533 PRK12449 PRK12449 acyl carrier protein; Provisional 80
13498 138982 PRK12450 PRK12450 foldase protein PrsA; Reviewed 309
13499 183534 PRK12451 PRK12451 arginyl-tRNA synthetase; Reviewed 562
13500 171510 PRK12452 PRK12452 cardiolipin synthase. 509
13501 183535 PRK12454 PRK12454 carbamate kinase-like carbamoyl phosphate synthetase; Reviewed 313
13502 84141 PRK12456 PRK12456 Na(+)-translocating NADH-quinone reductase subunit E; Provisional 199
13503 237105 PRK12457 PRK12457 3-deoxy-8-phosphooctulonate synthase. 281
13504 183536 PRK12458 PRK12458 glutathione synthetase; Provisional 338
13505 237106 PRK12459 PRK12459 S-adenosylmethionine synthetase; Provisional 386
13506 183538 PRK12460 PRK12460 2-keto-3-deoxygluconate permease; Provisional 312
13507 183539 PRK12461 PRK12461 UDP-N-acetylglucosamine acyltransferase; Provisional 255
13508 183540 PRK12462 PRK12462 phosphoserine aminotransferase; Provisional 364
13509 171518 PRK12463 PRK12463 chorismate synthase; Reviewed 390
13510 237107 PRK12464 PRK12464 1-deoxy-D-xylulose 5-phosphate reductoisomerase; Provisional 383
13511 183542 PRK12465 PRK12465 xylose isomerase; Provisional 445
13512 183543 PRK12466 PRK12466 3-isopropylmalate dehydratase large subunit. 471
13513 237108 PRK12467 PRK12467 peptide synthase; Provisional 3956
13514 171522 PRK12468 flhB flagellar biosynthesis protein FlhB; Reviewed 386
13515 237109 PRK12469 PRK12469 RNA polymerase factor sigma-54; Provisional 481
13516 171524 PRK12470 PRK12470 amidase; Provisional 462
13517 237110 PRK12472 PRK12472 hypothetical protein; Provisional 508
13518 183546 PRK12473 PRK12473 hypothetical protein; Provisional 198
13519 139002 PRK12474 PRK12474 hypothetical protein; Provisional 518
13520 183547 PRK12475 PRK12475 thiamine/molybdopterin biosynthesis MoeB-like protein; Provisional 338
13521 171527 PRK12476 PRK12476 putative fatty-acid--CoA ligase; Provisional 612
13522 183548 PRK12478 PRK12478 crotonase/enoyl-CoA hydratase family protein. 298
13523 183549 PRK12479 PRK12479 branched-chain-amino-acid transaminase. 299
13524 183550 PRK12480 PRK12480 D-lactate dehydrogenase; Provisional 330
13525 171531 PRK12481 PRK12481 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 251
13526 171532 PRK12482 PRK12482 flagellar motor stator protein MotA. 287
13527 237111 PRK12483 PRK12483 threonine dehydratase; Reviewed 521
13528 237112 PRK12484 PRK12484 nicotinate phosphoribosyltransferase; Provisional 443
13529 171535 PRK12485 PRK12485 bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 369
13530 237113 PRK12486 dmdA dimethylsulfoniopropionate demethylase. 368
13531 183553 PRK12487 PRK12487 putative 4-hydroxy-4-methyl-2-oxoglutarate aldolase. 163
13532 237114 PRK12488 PRK12488 cation/acetate symporter. 549
13533 237115 PRK12489 PRK12489 anaerobic C4-dicarboxylate transporter; Reviewed 443
13534 237116 PRK12490 PRK12490 6-phosphogluconate dehydrogenase-like protein; Reviewed 299
13535 105695 PRK12491 PRK12491 pyrroline-5-carboxylate reductase; Reviewed 272
13536 171539 PRK12492 PRK12492 long-chain-fatty-acid--CoA ligase; Provisional 562
13537 237117 PRK12493 PRK12493 magnesium chelatase subunit H; Provisional 1310
13538 183557 PRK12494 PRK12494 NAD(P)H-quinone oxidoreductase subunit J. 172
13539 183558 PRK12495 PRK12495 hypothetical protein; Provisional 226
13540 237118 PRK12496 PRK12496 hypothetical protein; Provisional 164
13541 237119 PRK12497 PRK12497 YraN family protein. 119
13542 183561 PRK12504 PRK12504 DUF4040 domain-containing protein. 178
13543 237120 PRK12505 PRK12505 putative monovalent cation/H+ antiporter subunit B; Reviewed 159
13544 237121 PRK12507 PRK12507 putative monovalent cation/H+ antiporter subunit B; Reviewed 332
13545 183563 PRK12508 PRK12508 Na(+)/H(+) antiporter subunit B. 139
13546 237122 PRK12509 PRK12509 Na+/H+ antiporter subunit B. 137
13547 237123 PRK12511 PRK12511 RNA polymerase sigma factor; Provisional 182
13548 171551 PRK12512 PRK12512 RNA polymerase sigma factor; Provisional 184
13549 183566 PRK12513 PRK12513 RNA polymerase sigma factor; Provisional 194
13550 105710 PRK12514 PRK12514 RNA polymerase sigma factor; Provisional 179
13551 183567 PRK12515 PRK12515 RNA polymerase sigma factor; Provisional 189
13552 183568 PRK12516 PRK12516 RNA polymerase sigma factor; Provisional 187
13553 183569 PRK12517 PRK12517 RNA polymerase sigma factor; Provisional 188
13554 237124 PRK12518 PRK12518 RNA polymerase sigma factor; Provisional 175
13555 237125 PRK12519 PRK12519 RNA polymerase sigma factor; Provisional 194
13556 237126 PRK12520 PRK12520 RNA polymerase sigma factor; Provisional 191
13557 183571 PRK12522 PRK12522 RNA polymerase sigma factor; Provisional 173
13558 183572 PRK12523 PRK12523 RNA polymerase sigma factor; Reviewed 172
13559 183573 PRK12524 PRK12524 RNA polymerase sigma factor; Provisional 196
13560 139037 PRK12525 PRK12525 RNA polymerase sigma factor; Provisional 168
13561 237127 PRK12526 PRK12526 RNA polymerase sigma factor; Provisional 206
13562 171560 PRK12527 PRK12527 RNA polymerase sigma factor; Reviewed 159
13563 171561 PRK12528 PRK12528 RNA polymerase sigma factor; Provisional 161
13564 183574 PRK12529 PRK12529 RNA polymerase sigma factor; Provisional 178
13565 237128 PRK12530 PRK12530 RNA polymerase sigma factor; Provisional 189
13566 105726 PRK12531 PRK12531 RNA polymerase sigma factor; Provisional 194
13567 171564 PRK12532 PRK12532 RNA polymerase sigma factor; Provisional 195
13568 237129 PRK12533 PRK12533 RNA polymerase sigma factor; Provisional 216
13569 183576 PRK12534 PRK12534 RNA polymerase sigma factor; Provisional 187
13570 237130 PRK12535 PRK12535 RNA polymerase sigma factor; Provisional 196
13571 237131 PRK12536 PRK12536 RNA polymerase sigma factor; Provisional 181
13572 171568 PRK12537 PRK12537 RNA polymerase sigma factor; Provisional 182
13573 139048 PRK12538 PRK12538 RNA polymerase sigma factor; Provisional 233
13574 237132 PRK12539 PRK12539 RNA polymerase sigma factor SigF. 184
13575 183579 PRK12540 PRK12540 RNA polymerase sigma factor; Provisional 182
13576 183580 PRK12541 PRK12541 RNA polymerase sigma factor; Provisional 161
13577 183581 PRK12542 PRK12542 RNA polymerase sigma factor; Provisional 185
13578 183582 PRK12543 PRK12543 RNA polymerase sigma factor; Provisional 179
13579 183583 PRK12544 PRK12544 RNA polymerase factor sigma-70. 206
13580 183584 PRK12545 PRK12545 RNA polymerase factor sigma-70. 201
13581 139055 PRK12546 PRK12546 RNA polymerase sigma factor; Provisional 188
13582 139056 PRK12547 PRK12547 RNA polymerase sigma factor; Provisional 164
13583 183585 PRK12548 PRK12548 shikimate dehydrogenase. 289
13584 183586 PRK12549 PRK12549 shikimate 5-dehydrogenase; Reviewed 284
13585 183587 PRK12550 PRK12550 shikimate 5-dehydrogenase; Reviewed 272
13586 139060 PRK12551 PRK12551 ATP-dependent Clp protease proteolytic subunit; Reviewed 196
13587 183588 PRK12552 PRK12552 ATP-dependent Clp protease proteolytic subunit. 222
13588 237133 PRK12553 PRK12553 ATP-dependent Clp protease proteolytic subunit; Reviewed 207
13589 237134 PRK12554 PRK12554 undecaprenyl pyrophosphate phosphatase; Reviewed 276
13590 237135 PRK12555 PRK12555 chemotaxis-specific protein-glutamate methyltransferase CheB. 337
13591 183592 PRK12556 PRK12556 tryptophanyl-tRNA synthetase; Provisional 332
13592 237136 PRK12557 PRK12557 H(2)-dependent methylenetetrahydromethanopterin dehydrogenase-related protein; Provisional 342
13593 183594 PRK12558 PRK12558 glutamyl-tRNA synthetase; Provisional 445
13594 79035 PRK12559 PRK12559 transcriptional regulator Spx; Provisional 131
13595 183595 PRK12560 PRK12560 adenine phosphoribosyltransferase; Provisional 187
13596 237137 PRK12561 PRK12561 NAD(P)H-quinone oxidoreductase subunit 4; Provisional 504
13597 105755 PRK12562 PRK12562 ornithine carbamoyltransferase subunit F; Provisional 334
13598 237138 PRK12563 PRK12563 sulfate adenylyltransferase subunit CysD. 312
13599 237139 PRK12564 PRK12564 carbamoyl-phosphate synthase small subunit. 360
13600 171585 PRK12566 PRK12566 glycine dehydrogenase; Provisional 954
13601 237140 PRK12567 PRK12567 putative monovalent cation/H+ antiporter subunit B; Reviewed 218
13602 139075 PRK12568 PRK12568 glycogen branching enzyme; Provisional 730
13603 237141 PRK12569 PRK12569 hypothetical protein; Provisional 245
13604 237142 PRK12570 PRK12570 N-acetylmuramic acid-6-phosphate etherase; Reviewed 296
13605 183601 PRK12571 PRK12571 1-deoxy-D-xylulose-5-phosphate synthase; Provisional 641
13606 183602 PRK12573 PRK12573 Na(+)/H(+) antiporter subunit B. 140
13607 183603 PRK12574 PRK12574 monovalent cation/H+ antiporter subunit B. 141
13608 171592 PRK12575 PRK12575 succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 235
13609 237143 PRK12576 PRK12576 succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 279
13610 183605 PRK12577 PRK12577 succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 329
13611 183606 PRK12578 PRK12578 thiolase domain-containing protein. 385
13612 183607 PRK12579 PRK12579 putative monovalent cation/H+ antiporter subunit B; Reviewed 258
13613 79055 PRK12580 PRK12580 omptin family plasminogen activator Pla. 312
13614 79056 PRK12581 PRK12581 oxaloacetate decarboxylase; Provisional 468
13615 237144 PRK12582 PRK12582 acyl-CoA synthetase; Provisional 624
13616 237145 PRK12583 PRK12583 acyl-CoA synthetase; Provisional 558
13617 237146 PRK12584 PRK12584 flagellin A; Reviewed 510
13618 183610 PRK12585 PRK12585 putative monovalent cation/H+ antiporter subunit G; Reviewed 197
13619 237147 PRK12586 PRK12586 Na+/H+ antiporter subunit G. 145
13620 183612 PRK12587 PRK12587 putative monovalent cation/H+ antiporter subunit G; Reviewed 118
13621 183613 PRK12592 PRK12592 putative monovalent cation/H+ antiporter subunit G; Reviewed 126
13622 183614 PRK12595 PRK12595 bifunctional 3-deoxy-7-phosphoheptulonate synthase/chorismate mutase; Reviewed 360
13623 105779 PRK12596 PRK12596 putative monovalent cation/H+ antiporter subunit E; Reviewed 171
13624 183615 PRK12597 PRK12597 F0F1 ATP synthase subunit beta; Provisional 461
13625 237148 PRK12599 PRK12599 putative monovalent cation/H+ antiporter subunit F; Reviewed 91
13626 183617 PRK12600 PRK12600 Na(+)/H(+) antiporter subunit F1. 94
13627 183618 PRK12603 PRK12603 putative monovalent cation/H+ antiporter subunit F; Reviewed 86
13628 183619 PRK12604 PRK12604 putative monovalent cation/H+ antiporter subunit F; Reviewed 84
13629 237149 PRK12606 PRK12606 GTP cyclohydrolase I; Reviewed 201
13630 183621 PRK12607 PRK12607 phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional 313
13631 237150 PRK12608 PRK12608 transcription termination factor Rho; Provisional 380
13632 237151 PRK12612 PRK12612 putative monovalent cation/H+ antiporter subunit F; Reviewed 87
13633 171609 PRK12613 PRK12613 galactose-6-phosphate isomerase subunit LacA; Provisional 141
13634 171610 PRK12615 PRK12615 galactose-6-phosphate isomerase subunit LacB; Reviewed 171
13635 183624 PRK12616 PRK12616 bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 270
13636 183625 PRK12617 flgA flagellar basal body P-ring formation protein FlgA. 214
13637 183626 PRK12618 flgA flagellar basal body P-ring formation protein FlgA. 141
13638 183627 PRK12619 flgB flagellar basal body rod protein FlgB; Provisional 130
13639 183628 PRK12620 flgB flagellar basal body rod protein FlgB; Provisional 132
13640 171613 PRK12621 flgB flagellar basal body rod protein FlgB; Provisional 136
13641 183629 PRK12622 flgB flagellar basal body rod protein FlgB; Provisional 135
13642 183630 PRK12623 flgB flagellar basal body rod protein FlgB; Provisional 131
13643 139107 PRK12624 flgB flagellar basal body rod protein FlgB; Provisional 143
13644 183631 PRK12625 flgB flagellar basal body rod protein FlgB; Provisional 132
13645 183632 PRK12626 flgB flagellar basal body rod protein FlgB; Provisional 162
13646 237152 PRK12627 flgB FlgB family protein. 128
13647 79089 PRK12628 flgC flagellar basal body rod protein FlgC; Provisional 140
13648 183634 PRK12629 flgC flagellar basal body rod protein FlgC; Provisional 135
13649 183635 PRK12630 flgC flagellar basal body rod protein FlgC; Provisional 143
13650 183636 PRK12631 flgC flagellar basal body rod protein FlgC; Provisional 138
13651 183637 PRK12632 flgC flagellar basal body rod protein FlgC; Provisional 130
13652 183638 PRK12633 flgD flagellar basal body rod modification protein; Provisional 230
13653 183639 PRK12634 flgD flagellar basal body rod modification protein; Reviewed 221
13654 183640 PRK12636 flgG flagellar basal body rod protein FlgG; Provisional 263
13655 183641 PRK12637 flgE flagellar hook protein FlgE; Provisional 473
13656 183642 PRK12640 flgF flagellar basal body rod protein FlgF; Reviewed 246
13657 105809 PRK12641 flgF flagellar basal-body rod protein FlgF. 252
13658 237153 PRK12642 flgF flagellar basal-body rod protein FlgF. 241
13659 139117 PRK12643 flgF flagellar basal body rod protein FlgF; Reviewed 209
13660 237154 PRK12644 PRK12644 putative monovalent cation/H+ antiporter subunit A; Reviewed 965
13661 237155 PRK12645 PRK12645 monovalent cation/H+ antiporter subunit A; Reviewed 800
13662 183646 PRK12646 PRK12646 DUF4040 family protein. 800
13663 237156 PRK12647 PRK12647 putative monovalent cation/H+ antiporter subunit A; Reviewed 761
13664 237157 PRK12648 PRK12648 putative monovalent cation/H+ antiporter subunit A; Reviewed 948
13665 183649 PRK12649 PRK12649 putative monovalent cation/H+ antiporter subunit A; Reviewed 789
13666 237158 PRK12650 PRK12650 DUF4040 family protein. 962
13667 237159 PRK12651 PRK12651 Na+/H+ antiporter subunit E. 158
13668 237160 PRK12652 PRK12652 monovalent cation/H+ antiporter subunit E. 357
13669 183653 PRK12653 PRK12653 fructose-6-phosphate aldolase; Reviewed 220
13670 237161 PRK12654 PRK12654 monovalent cation/H+ antiporter subunit E. 151
13671 183655 PRK12655 PRK12655 fructose-6-phosphate aldolase; Reviewed 220
13672 183656 PRK12656 PRK12656 fructose-6-phosphate aldolase; Reviewed 222
13673 183657 PRK12657 PRK12657 putative monovalent cation/H+ antiporter subunit F; Reviewed 100
13674 183658 PRK12658 PRK12658 Na+/H+ antiporter subunit C. 125
13675 183659 PRK12659 PRK12659 Na+/H+ antiporter subunit C. 117
13676 183660 PRK12660 PRK12660 putative monovalent cation/H+ antiporter subunit C; Reviewed 114
13677 237162 PRK12661 PRK12661 putative monovalent cation/H+ antiporter subunit C; Reviewed 140
13678 183662 PRK12662 PRK12662 putative monovalent cation/H+ antiporter subunit D; Reviewed 492
13679 237163 PRK12663 PRK12663 Na+/H+ antiporter subunit D. 497
13680 237164 PRK12664 PRK12664 putative monovalent cation/H+ antiporter subunit D; Reviewed 527
13681 237165 PRK12665 PRK12665 putative monovalent cation/H+ antiporter subunit D; Reviewed 521
13682 237166 PRK12666 PRK12666 putative monovalent cation/H+ antiporter subunit D; Reviewed 528
13683 237167 PRK12667 PRK12667 putative monovalent cation/H+ antiporter subunit D; Reviewed 520
13684 237168 PRK12668 PRK12668 Na(+)/H(+) antiporter subunit D. 581
13685 183669 PRK12670 PRK12670 putative monovalent cation/H+ antiporter subunit G; Reviewed 99
13686 183670 PRK12671 PRK12671 putative monovalent cation/H+ antiporter subunit G; Reviewed 120
13687 183671 PRK12672 PRK12672 putative monovalent cation/H+ antiporter subunit G; Reviewed 118
13688 237169 PRK12674 PRK12674 putative monovalent cation/H+ antiporter subunit G; Reviewed 99
13689 171652 PRK12675 PRK12675 putative monovalent cation/H+ antiporter subunit G; Reviewed 104
13690 183673 PRK12676 PRK12676 bifunctional fructose-bisphosphatase/inositol-phosphate phosphatase. 263
13691 237170 PRK12677 PRK12677 xylose isomerase; Provisional 384
13692 237171 PRK12678 PRK12678 transcription termination factor Rho; Provisional 672
13693 183676 PRK12679 cbl HTH-type transcriptional regulator Cbl. 316
13694 183677 PRK12680 PRK12680 LysR family transcriptional regulator. 327
13695 183678 PRK12681 cysB HTH-type transcriptional regulator CysB. 324
13696 183679 PRK12682 PRK12682 transcriptional regulator CysB-like protein; Reviewed 309
13697 237172 PRK12683 PRK12683 transcriptional regulator CysB-like protein; Reviewed 309
13698 237173 PRK12684 PRK12684 CysB family HTH-type transcriptional regulator. 313
13699 183682 PRK12685 flgB flagellar basal body rod protein FlgB; Reviewed 116
13700 183683 PRK12686 PRK12686 carbamate kinase; Reviewed 312
13701 105853 PRK12687 PRK12687 flagellin; Reviewed 311
13702 171664 PRK12688 PRK12688 flagellin; Reviewed 751
13703 183684 PRK12689 flgF flagellar basal-body rod protein FlgF. 253
13704 183685 PRK12690 flgF flagellar hook-basal body complex protein. 238
13705 183686 PRK12691 flgG flagellar basal body rod protein FlgG; Reviewed 262
13706 139158 PRK12692 flgG flagellar basal body rod protein FlgG; Reviewed 262
13707 183687 PRK12693 flgG flagellar basal body rod protein FlgG; Provisional 261
13708 183688 PRK12694 flgG flagellar basal body rod protein FlgG; Reviewed 260
13709 237174 PRK12696 flgH flagellar basal body L-ring protein; Reviewed 236
13710 237175 PRK12697 flgH flagellar basal body L-ring protein FlgH. 226
13711 183690 PRK12698 flgH flagellar basal body L-ring protein FlgH. 224
13712 105864 PRK12699 flgH flagellar basal body L-ring protein; Reviewed 246
13713 139164 PRK12700 flgH flagellar basal body L-ring protein; Reviewed 230
13714 183691 PRK12701 flgH flagellar basal body L-ring protein; Reviewed 230
13715 105866 PRK12702 PRK12702 mannosyl-3-phosphoglycerate phosphatase; Reviewed 302
13716 237176 PRK12703 PRK12703 tRNA 2'-O-methylase; Reviewed 339
13717 237177 PRK12704 PRK12704 phosphodiesterase; Provisional 520
13718 237178 PRK12705 PRK12705 hypothetical protein; Provisional 508
13719 183694 PRK12706 flgI flagellar basal body P-ring protein; Provisional 328
13720 139168 PRK12708 flgJ peptidoglycan hydrolase; Reviewed 134
13721 237179 PRK12709 flgJ flagellar rod assembly protein/muramidase FlgJ; Provisional 320
13722 139170 PRK12710 flgJ flagellar rod assembly protein/muramidase FlgJ; Provisional 291
13723 237180 PRK12711 flgJ flagellar assembly peptidoglycan hydrolase FlgJ. 392
13724 139172 PRK12712 flgJ flagellar rod assembly protein/muramidase FlgJ; Provisional 344
13725 139173 PRK12713 flgJ flagellar rod assembly protein/muramidase FlgJ; Provisional 339
13726 183697 PRK12714 flgK flagellar hook-associated protein FlgK; Provisional 624
13727 183698 PRK12715 flgK flagellar hook-associated protein FlgK; Provisional 649
13728 171679 PRK12717 flgL flagellar hook-associated protein 3. 523
13729 79176 PRK12718 flgL flagellar hook-associated protein FlgL; Provisional 510
13730 183699 PRK12720 PRK12720 EscV/YscV/HrcV family type III secretion system export apparatus protein. 675
13731 183700 PRK12721 PRK12721 EscU/YscU/HrcU family type III secretion system export apparatus switch protein. 349
13732 237181 PRK12722 PRK12722 flagellar transcriptional regulator FlhC. 187
13733 183702 PRK12723 PRK12723 flagellar biosynthesis regulator FlhF; Provisional 388
13734 183703 PRK12724 PRK12724 flagellar biosynthesis regulator FlhF; Provisional 432
13735 183704 PRK12726 PRK12726 flagellar biosynthesis regulator FlhF; Provisional 407
13736 237182 PRK12727 PRK12727 flagellar biosynthesis protein FlhF. 559
13737 237183 PRK12728 fliE flagellar hook-basal body protein FliE; Provisional 102
13738 183707 PRK12729 fliE flagellar hook-basal body protein FliE; Provisional 127
13739 183708 PRK12735 PRK12735 elongation factor Tu; Reviewed 396
13740 237184 PRK12736 PRK12736 elongation factor Tu; Reviewed 394
13741 183710 PRK12737 gatY tagatose-bisphosphate aldolase subunit GatY. 284
13742 183711 PRK12738 kbaY tagatose-bisphosphate aldolase subunit KbaY. 286
13743 237185 PRK12739 PRK12739 elongation factor G; Reviewed 691
13744 237186 PRK12740 PRK12740 elongation factor G-like protein EF-G2. 668
13745 183714 PRK12742 PRK12742 SDR family oxidoreductase. 237
13746 237187 PRK12743 PRK12743 SDR family oxidoreductase. 256
13747 183716 PRK12744 PRK12744 SDR family oxidoreductase. 257
13748 237188 PRK12745 PRK12745 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 256
13749 183718 PRK12746 PRK12746 SDR family oxidoreductase. 254
13750 183719 PRK12747 PRK12747 short chain dehydrogenase; Provisional 252
13751 237189 PRK12748 PRK12748 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 256
13752 183721 PRK12749 PRK12749 quinate/shikimate dehydrogenase; Reviewed 288
13753 183722 PRK12750 cpxP periplasmic repressor CpxP; Reviewed 170
13754 171704 PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed 162
13755 183723 PRK12753 PRK12753 transketolase; Reviewed 663
13756 183724 PRK12754 PRK12754 transketolase; Reviewed 663
13757 237190 PRK12755 PRK12755 phospho-2-dehydro-3-deoxyheptonate aldolase; Provisional 353
13758 183726 PRK12756 PRK12756 Trp-sensitive 3-deoxy-7-phosphoheptulonate synthase AroH. 348
13759 237191 PRK12757 PRK12757 cell division protein FtsN; Provisional 256
13760 237192 PRK12758 PRK12758 DNA gyrase/topoisomerase IV subunit A. 869
13761 139206 PRK12759 PRK12759 bifunctional glutaredoxin/ribonucleoside-diphosphate reductase subunit beta; Provisional 410
13762 237193 PRK12764 PRK12764 fumarylacetoacetate hydrolase family protein. 500
13763 237194 PRK12765 PRK12765 flagellar filament capping protein FliD. 595
13764 183731 PRK12766 PRK12766 50S ribosomal protein L32e; Provisional 232
13765 237195 PRK12767 PRK12767 carbamoyl phosphate synthase-like protein; Provisional 326
13766 237196 PRK12768 PRK12768 sulfate transporter family protein. 240
13767 183733 PRK12769 PRK12769 putative oxidoreductase Fe-S binding subunit; Reviewed 654
13768 237197 PRK12770 PRK12770 putative glutamate synthase subunit beta; Provisional 352
13769 237198 PRK12771 PRK12771 putative glutamate synthase (NADPH) small subunit; Provisional 564
13770 237199 PRK12772 PRK12772 fused FliR family export protein/FlhB family type III secretion system protein. 609
13771 183737 PRK12773 flhB flagellar biosynthesis protein FlhB; Reviewed 646
13772 183738 PRK12775 PRK12775 putative trifunctional 2-polyprenylphenol hydroxylase/glutamate synthase subunit beta/ferritin domain-containing protein; Provisional 1006
13773 237200 PRK12778 PRK12778 bifunctional dihydroorotate dehydrogenase B NAD binding subunit/NADPH-dependent glutamate synthase. 752
13774 183740 PRK12779 PRK12779 putative bifunctional glutamate synthase subunit beta/2-polyprenylphenol hydroxylase; Provisional 944
13775 183741 PRK12780 fliR flagellar biosynthesis protein FliR; Reviewed 251
13776 139219 PRK12781 fliQ flagellar biosynthetic protein FliQ. 88
13777 139220 PRK12782 flgC flagellar basal body rod protein FlgC; Reviewed 138
13778 171720 PRK12783 fliP flagellar biosynthesis protein FliP; Reviewed 255
13779 183742 PRK12784 PRK12784 hypothetical protein; Provisional 84
13780 183743 PRK12785 fliL flagellar basal body-associated protein FliL; Reviewed 166
13781 237201 PRK12786 flgA flagellar basal body P-ring formation protein FlgA. 338
13782 183745 PRK12787 fliX flagellar assembly regulator FliX; Reviewed 138
13783 237202 PRK12788 flgH flagellar basal body L-ring protein FlgH. 234
13784 183746 PRK12789 flgI flagellar basal body P-ring protein FlgI. 367
13785 237203 PRK12790 PRK12790 flagellar rod assembly protein FlgJ. 115
13786 237204 PRK12791 flbT flagellar biosynthesis repressor FlbT; Reviewed 131
13787 237205 PRK12792 flhA flagellar biosynthesis protein FlhA; Reviewed 694
13788 237206 PRK12793 flaF flagellar biosynthesis regulator FlaF. 115
13789 237207 PRK12794 flaF flagellar biosynthesis regulatory protein FlaF; Reviewed 122
13790 237208 PRK12795 fliM flagellar motor switch protein FliM; Reviewed 388
13791 237209 PRK12796 spaP EscR/YscR/HrcR family type III secretion system export apparatus protein. 221
13792 237210 PRK12797 PRK12797 type III secretion system protein YscR; Provisional 213
13793 237211 PRK12798 PRK12798 chemotaxis protein; Reviewed 421
13794 183756 PRK12799 motB flagellar motor protein MotB; Reviewed 421
13795 183757 PRK12800 fliF flagellar MS-ring protein; Reviewed 574
13796 139237 PRK12802 PRK12802 flagellin; Provisional 282
13797 183758 PRK12803 PRK12803 flagellin; Provisional 335
13798 183759 PRK12804 PRK12804 flagellin; Provisional 301
13799 183760 PRK12805 PRK12805 FliC/FljB family flagellin. 287
13800 183761 PRK12806 PRK12806 flagellin; Provisional 475
13801 171737 PRK12807 PRK12807 flagellin; Provisional 287
13802 237212 PRK12808 PRK12808 flagellin; Provisional 476
13803 183762 PRK12809 PRK12809 putative oxidoreductase Fe-S binding subunit; Reviewed 639
13804 237213 PRK12810 gltD glutamate synthase subunit beta; Reviewed 471
13805 139245 PRK12812 flgD flagellar basal body rod modification protein; Reviewed 259
13806 237214 PRK12813 flgD flagellar basal body rod modification protein; Reviewed 223
13807 139246 PRK12814 PRK12814 putative NADPH-dependent glutamate synthase small subunit; Provisional 652
13808 237215 PRK12815 carB carbamoyl phosphate synthase large subunit; Reviewed 1068
13809 183766 PRK12816 flgG flagellar basal body rod protein FlgG; Reviewed 264
13810 183767 PRK12817 flgG flagellar basal body rod protein FlgG; Reviewed 260
13811 183768 PRK12818 flgG flagellar basal body rod protein FlgG; Reviewed 256
13812 183769 PRK12819 flgG flagellar basal-body rod protein FlgG. 257
13813 105955 PRK12820 PRK12820 bifunctional aspartyl-tRNA synthetase/aspartyl/glutamyl-tRNA amidotransferase subunit C; Provisional 706
13814 237216 PRK12821 PRK12821 aspartyl/glutamyl-tRNA amidotransferase subunit C-like protein; Provisional 477
13815 237217 PRK12822 PRK12822 phospho-2-dehydro-3-deoxyheptonate aldolase; Provisional 356
13816 183772 PRK12823 benD 1,6-dihydroxycyclohexa-2,4-diene-1-carboxylate dehydrogenase; Provisional 260
13817 183773 PRK12824 PRK12824 3-oxoacyl-ACP reductase. 245
13818 237218 PRK12825 fabG 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 249
13819 183775 PRK12826 PRK12826 SDR family oxidoreductase. 251
13820 237219 PRK12827 PRK12827 short chain dehydrogenase; Provisional 249
13821 237220 PRK12828 PRK12828 short chain dehydrogenase; Provisional 239
13822 183778 PRK12829 PRK12829 short chain dehydrogenase; Provisional 264
13823 183779 PRK12830 PRK12830 UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Reviewed 417
13824 183780 PRK12831 PRK12831 putative oxidoreductase; Provisional 464
13825 183781 PRK12833 PRK12833 acetyl-CoA carboxylase biotin carboxylase subunit; Provisional 467
13826 183782 PRK12834 PRK12834 putative FAD-binding dehydrogenase; Reviewed 549
13827 237221 PRK12835 PRK12835 3-ketosteroid-delta-1-dehydrogenase; Reviewed 584
13828 237222 PRK12837 PRK12837 FAD-binding protein. 513
13829 183784 PRK12838 PRK12838 carbamoyl phosphate synthase small subunit; Reviewed 354
13830 237223 PRK12839 PRK12839 FAD-dependent oxidoreductase. 572
13831 237224 PRK12842 PRK12842 putative succinate dehydrogenase; Reviewed 574
13832 237225 PRK12843 PRK12843 FAD-dependent oxidoreductase. 578
13833 183787 PRK12844 PRK12844 3-ketosteroid-delta-1-dehydrogenase; Reviewed 557
13834 237226 PRK12845 PRK12845 3-ketosteroid-delta-1-dehydrogenase; Reviewed 564
13835 237227 PRK12846 PRK12846 peptide deformylase; Reviewed 165
13836 237228 PRK12847 ubiA 4-hydroxybenzoate octaprenyltransferase. 285
13837 237229 PRK12848 ubiA 4-hydroxybenzoate octaprenyltransferase. 282
13838 237230 PRK12849 groEL chaperonin GroEL; Reviewed 542
13839 237231 PRK12850 groEL chaperonin GroEL; Reviewed 544
13840 171770 PRK12851 groEL chaperonin GroEL; Reviewed 541
13841 237232 PRK12852 groEL chaperonin GroEL; Reviewed 545
13842 237233 PRK12853 PRK12853 glucose-6-phosphate dehydrogenase. 482
13843 237234 PRK12854 PRK12854 glucose-6-phosphate 1-dehydrogenase; Provisional 484
13844 171774 PRK12855 PRK12855 hypothetical protein; Provisional 103
13845 105987 PRK12856 PRK12856 hypothetical protein; Provisional 103
13846 237235 PRK12857 PRK12857 class II fructose-1,6-bisphosphate aldolase. 284
13847 237236 PRK12858 PRK12858 tagatose 1,6-diphosphate aldolase; Reviewed 340
13848 183797 PRK12859 PRK12859 3-ketoacyl-(acyl-carrier-protein) reductase; Provisional 256
13849 237237 PRK12860 PRK12860 flagellar transcriptional regulator FlhC. 189
13850 183798 PRK12861 PRK12861 malic enzyme; Reviewed 764
13851 183799 PRK12862 PRK12862 malic enzyme; Reviewed 763
13852 183800 PRK12863 PRK12863 YciI-like protein; Reviewed 94
13853 183801 PRK12864 PRK12864 YciI-like protein; Reviewed 89
13854 171782 PRK12865 PRK12865 YciI-like protein; Reviewed 97
13855 237238 PRK12866 PRK12866 YciI-like protein; Reviewed 97
13856 237239 PRK12869 ubiA protoheme IX farnesyltransferase; Reviewed 279
13857 237240 PRK12870 ubiA 4-hydroxybenzoate octaprenyltransferase. 290
13858 106000 PRK12871 ubiA prenyltransferase; Reviewed 297
13859 237241 PRK12872 ubiA prenyltransferase; Reviewed 285
13860 171787 PRK12873 ubiA 4-hydroxybenzoate polyprenyltransferase. 294
13861 237242 PRK12874 ubiA 4-hydroxybenzoate polyprenyltransferase. 291
13862 237243 PRK12875 ubiA prenyltransferase; Reviewed 282
13863 237244 PRK12876 ubiA prenyltransferase; Reviewed 300
13864 183808 PRK12878 ubiA 4-hydroxybenzoate octaprenyltransferase. 314
13865 237245 PRK12879 PRK12879 3-oxoacyl-(acyl carrier protein) synthase III; Reviewed 325
13866 171793 PRK12880 PRK12880 beta-ketoacyl-ACP synthase III. 353
13867 237246 PRK12881 acnA aconitate hydratase AcnA. 889
13868 183811 PRK12882 ubiA prenyltransferase; Reviewed 276
13869 171796 PRK12883 ubiA prenyltransferase UbiA-like protein; Reviewed 277
13870 183812 PRK12884 ubiA prenyltransferase; Reviewed 279
13871 237247 PRK12886 ubiA prenyltransferase; Reviewed 291
13872 183813 PRK12887 ubiA tocopherol phytyltransferase; Reviewed 308
13873 183814 PRK12888 ubiA 4-hydroxybenzoate octaprenyltransferase. 284
13874 237248 PRK12890 PRK12890 allantoate amidohydrolase; Reviewed 414
13875 237249 PRK12891 PRK12891 allantoate amidohydrolase; Reviewed 414
13876 183817 PRK12892 PRK12892 allantoate amidohydrolase; Reviewed 412
13877 237250 PRK12893 PRK12893 Zn-dependent hydrolase. 412
13878 237251 PRK12895 ubiA prenyltransferase; Reviewed 286
13879 237252 PRK12896 PRK12896 methionine aminopeptidase; Reviewed 255
13880 171806 PRK12897 PRK12897 type I methionyl aminopeptidase. 248
13881 237253 PRK12898 secA preprotein translocase subunit SecA; Reviewed 656
13882 237254 PRK12899 secA preprotein translocase subunit SecA; Reviewed 970
13883 237255 PRK12900 secA preprotein translocase subunit SecA; Reviewed 1025
13884 237256 PRK12901 secA preprotein translocase subunit SecA; Reviewed 1112
13885 237257 PRK12902 secA preprotein translocase subunit SecA; Reviewed 939
13886 237258 PRK12903 secA preprotein translocase subunit SecA; Reviewed 925
13887 237259 PRK12904 PRK12904 preprotein translocase subunit SecA; Reviewed 830
13888 237260 PRK12906 secA preprotein translocase subunit SecA; Reviewed 796
13889 183828 PRK12907 secY preprotein translocase subunit SecY; Reviewed 434
13890 171815 PRK12911 PRK12911 bifunctional preprotein translocase subunit SecD/SecF; Reviewed 1403
13891 183829 PRK12921 PRK12921 oxidoreductase. 305
13892 237261 PRK12928 PRK12928 lipoyl synthase; Provisional 290
13893 183831 PRK12933 secD protein translocase subunit SecD. 604
13894 183832 PRK12935 PRK12935 acetoacetyl-CoA reductase; Provisional 247
13895 171820 PRK12936 PRK12936 3-ketoacyl-(acyl-carrier-protein) reductase NodG; Reviewed 245
13896 171821 PRK12937 PRK12937 short chain dehydrogenase; Provisional 245
13897 171822 PRK12938 PRK12938 3-ketoacyl-ACP reductase. 246
13898 183833 PRK12939 PRK12939 short chain dehydrogenase; Provisional 250
13899 171824 PRK12996 ulaA PTS ascorbate transporter subunit IIC. 463
13900 237262 PRK12997 PRK12997 PTS sugar transporter subunit IIC. 466
13901 237263 PRK12999 PRK12999 pyruvate carboxylase; Reviewed 1146
13902 183836 PRK13004 PRK13004 YgeY family selenium metabolism-linked hydrolase. 399
13903 237264 PRK13007 PRK13007 succinyl-diaminopimelate desuccinylase; Reviewed 352
13904 237265 PRK13009 PRK13009 succinyl-diaminopimelate desuccinylase; Reviewed 375
13905 139334 PRK13010 purU formyltetrahydrofolate deformylase; Reviewed 289
13906 237266 PRK13011 PRK13011 formyltetrahydrofolate deformylase; Reviewed 286
13907 237267 PRK13012 PRK13012 2-oxoacid dehydrogenase subunit E1; Provisional 896
13908 237268 PRK13013 PRK13013 acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase family protein. 427
13909 237269 PRK13014 PRK13014 methionine sulfoxide reductase A; Provisional 186
13910 237270 PRK13015 PRK13015 3-dehydroquinate dehydratase; Reviewed 146
13911 237271 PRK13016 PRK13016 dihydroxy-acid dehydratase; Provisional 577
13912 237272 PRK13017 PRK13017 dihydroxy-acid dehydratase; Provisional 596
13913 237273 PRK13018 PRK13018 cell division protein FtsZ; Provisional 378
13914 183845 PRK13019 clpS ATP-dependent Clp protease adapter ClpS. 94
13915 183846 PRK13020 PRK13020 riboflavin synthase subunit alpha; Provisional 206
13916 237274 PRK13021 secF preprotein translocase subunit SecF; Reviewed 297
13917 237275 PRK13022 secF protein translocase subunit SecF. 289
13918 171842 PRK13023 PRK13023 protein translocase subunit SecDF. 758
13919 237276 PRK13024 PRK13024 bifunctional preprotein translocase subunit SecD/SecF; Reviewed 755
13920 237277 PRK13026 PRK13026 acyl-CoA dehydrogenase; Reviewed 774
13921 183850 PRK13027 PRK13027 C4-dicarboxylate transporter DctA; Reviewed 421
13922 183851 PRK13028 PRK13028 tryptophan synthase subunit beta; Provisional 402
13923 237278 PRK13029 PRK13029 indolepyruvate ferredoxin oxidoreductase family protein. 1186
13924 237279 PRK13030 PRK13030 indolepyruvate ferredoxin oxidoreductase family protein. 1159
13925 106068 PRK13031 PRK13031 preprotein translocase subunit SecB; Provisional 149
13926 171848 PRK13032 PRK13032 chemotaxis-inhibiting protein CHIPS; Reviewed 149
13927 171849 PRK13033 PRK13033 formyl peptide receptor-like 1 inhibitory protein; Reviewed 133
13928 237280 PRK13034 PRK13034 serine hydroxymethyltransferase; Reviewed 416
13929 171851 PRK13035 PRK13035 superantigen-like protein SSL5; Reviewed. 234
13930 171852 PRK13036 PRK13036 superantigen-like protein SSL11; Reviewed. 227
13931 106074 PRK13037 PRK13037 superantigen-like protein SSL1; Reviewed. 226
13932 171853 PRK13038 PRK13038 superantigen-like protein SSL10; Reviewed. 227
13933 171854 PRK13039 PRK13039 superantigen-like protein SSL8; Reviewed. 232
13934 106077 PRK13040 PRK13040 superantigen-like protein SSL6; Reviewed. 231
13935 106078 PRK13041 PRK13041 superantigen-like protein SSL2; Reviewed. 231
13936 183854 PRK13042 PRK13042 superantigen-like protein SSL4; Reviewed. 291
13937 171855 PRK13043 PRK13043 superantigen-like protein SSL14; Reviewed. 241
13938 237281 PRK13054 PRK13054 lipid kinase; Reviewed 300
13939 237282 PRK13055 PRK13055 putative lipid kinase; Reviewed 334
13940 183857 PRK13057 PRK13057 lipid kinase. 287
13941 183858 PRK13059 PRK13059 putative lipid kinase; Reviewed 295
13942 183859 PRK13103 secA preprotein translocase subunit SecA; Reviewed 913
13943 183860 PRK13104 secA preprotein translocase subunit SecA; Reviewed 896
13944 183861 PRK13105 ubiA prenyltransferase; Reviewed 282
13945 237283 PRK13106 ubiA prenyltransferase; Reviewed 300
13946 183863 PRK13107 PRK13107 preprotein translocase subunit SecA; Reviewed 908
13947 237284 PRK13108 PRK13108 prolipoprotein diacylglyceryl transferase; Reviewed 460
13948 183864 PRK13109 flhB flagellar biosynthesis protein FlhB; Reviewed 358
13949 237285 PRK13111 trpA tryptophan synthase subunit alpha; Provisional 258
13950 237286 PRK13125 trpA tryptophan synthase subunit alpha; Provisional 244
13951 171868 PRK13128 PRK13128 D-aminopeptidase; Reviewed 518
13952 237287 PRK13130 PRK13130 RNA-protein complex protein Nop10. 56
13953 237288 PRK13141 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 205
13954 171871 PRK13142 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 192
13955 237289 PRK13143 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 200
13956 183870 PRK13145 araD L-ribulose-5-phosphate 4-epimerase; Provisional 234
13957 237290 PRK13146 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 209
13958 183872 PRK13149 PRK13149 H/ACA RNA-protein complex component Gar1; Reviewed 73
13959 139376 PRK13150 PRK13150 cytochrome c maturation protein CcmE. 159
13960 171876 PRK13152 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 201
13961 183873 PRK13159 PRK13159 cytochrome c-type biogenesis protein CcmE; Reviewed 155
13962 183874 PRK13165 PRK13165 cytochrome c maturation protein CcmE. 160
13963 237291 PRK13168 rumA 23S rRNA (uracil(1939)-C(5))-methyltransferase RlmD. 443
13964 183876 PRK13169 PRK13169 DNA replication initiation control protein YabA. 110
13965 183877 PRK13170 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 196
13966 183878 PRK13181 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 199
13967 237292 PRK13182 racA chromosome-anchoring protein RacA. 175
13968 171884 PRK13183 psbN photosystem II reaction center protein PsbN. 46
13969 183880 PRK13184 pknD serine/threonine-protein kinase PknD. 932
13970 237293 PRK13185 chlL protochlorophyllide reductase iron-sulfur ATP-binding protein; Provisional 270
13971 237294 PRK13186 lpxC UDP-3-O-acyl-N-acetylglucosamine deacetylase. 295
13972 237295 PRK13187 PRK13187 UDP-3-O-acyl N-acetylglycosamine deacetylase. 304
13973 237296 PRK13188 PRK13188 bifunctional UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase/(3R)-hydroxymyristoyl-[acyl-carrier-protein] dehydratase; Reviewed 464
13974 237297 PRK13189 PRK13189 peroxiredoxin; Provisional 222
13975 106159 PRK13190 PRK13190 putative peroxiredoxin; Provisional 202
13976 183885 PRK13191 PRK13191 putative peroxiredoxin; Provisional 215
13977 183886 PRK13192 PRK13192 bifunctional urease subunit gamma/beta; Reviewed 208
13978 237298 PRK13193 PRK13193 pyroglutamyl-peptidase I. 209
13979 183887 PRK13194 PRK13194 pyrrolidone-carboxylate peptidase; Provisional 208
13980 171894 PRK13195 PRK13195 pyrrolidone-carboxylate peptidase; Provisional 222
13981 171895 PRK13196 PRK13196 pyroglutamyl-peptidase I. 211
13982 237299 PRK13197 PRK13197 pyrrolidone-carboxylate peptidase; Provisional 215
13983 171897 PRK13198 ureB urease subunit beta; Reviewed 158
13984 237300 PRK13199 psaB photosystem I P700 chlorophyll a apoprotein A2; Provisional 742
13985 237301 PRK13200 psaA photosystem I P700 chlorophyll a apoprotein A1; Provisional 766
13986 237302 PRK13201 ureB urease subunit beta; Reviewed 136
13987 106171 PRK13202 ureB urease subunit beta; Reviewed 104
13988 237303 PRK13203 ureB urease subunit beta; Reviewed 102
13989 171902 PRK13204 ureB urease subunit beta; Reviewed 159
13990 106174 PRK13205 ureB urease subunit beta; Reviewed 162
13991 237304 PRK13206 ureC urease subunit alpha; Reviewed 573
13992 237305 PRK13207 ureC urease subunit alpha; Reviewed 568
13993 237306 PRK13208 valS valyl-tRNA synthetase; Reviewed 800
13994 237307 PRK13209 PRK13209 L-ribulose-5-phosphate 3-epimerase. 283
13995 237308 PRK13210 PRK13210 L-ribulose-5-phosphate 3-epimerase. 284
13996 237309 PRK13211 PRK13211 N-acetylglucosamine-binding protein GbpA. 478
13997 106181 PRK13213 araD L-ribulose-5-phosphate 4-epimerase; Reviewed 231
13998 183899 PRK13214 PRK13214 photosystem I reaction center subunit X; Reviewed 86
13999 183900 PRK13216 PRK13216 photosystem I reaction center subunit X-like protein; Reviewed 91
14000 237310 PRK13222 PRK13222 N-acetylmuramic acid 6-phosphate phosphatase MupP. 226
14001 171912 PRK13223 PRK13223 phosphoglycolate phosphatase; Provisional 272
14002 106187 PRK13225 PRK13225 phosphoglycolate phosphatase; Provisional 273
14003 237311 PRK13226 PRK13226 phosphoglycolate phosphatase; Provisional 229
14004 183903 PRK13230 PRK13230 nitrogenase reductase-like protein; Reviewed 279
14005 183904 PRK13231 PRK13231 nitrogenase reductase-like protein; Reviewed 264
14006 106194 PRK13232 nifH nitrogenase reductase; Reviewed 273
14007 183905 PRK13233 nifH nitrogenase iron protein. 275
14008 183906 PRK13234 nifH nitrogenase reductase; Reviewed 295
14009 183907 PRK13235 nifH nitrogenase reductase; Reviewed 274
14010 237312 PRK13236 PRK13236 nitrogenase reductase; Reviewed 296
14011 237313 PRK13237 PRK13237 tyrosine phenol-lyase; Provisional 460
14012 237314 PRK13238 tnaA tryptophanase. 460
14013 183911 PRK13239 PRK13239 alkylmercury lyase MerB. 206
14014 183912 PRK13240 pbsY photosystem II protein Y; Reviewed 40
14015 183913 PRK13241 ureA urease subunit gamma; Provisional 100
14016 139420 PRK13242 ureA urease subunit gamma; Provisional 100
14017 183914 PRK13243 PRK13243 glyoxylate reductase; Reviewed 333
14018 183915 PRK13244 PRK13244 protease inhibitor. 145
14019 183916 PRK13245 hetR heterocyst differentiation control protein; Reviewed 299
14020 106208 PRK13246 PRK13246 15,16-dihydrobiliverdin:ferredoxin oxidoreductase. 236
14021 237315 PRK13247 PRK13247 15,16-dihydrobiliverdin:ferredoxin oxidoreductase. 238
14022 139425 PRK13248 PRK13248 phycoerythrobilin:ferredoxin oxidoreductase; Provisional 253
14023 139426 PRK13249 PRK13249 phycoerythrobilin:ferredoxin oxidoreductase; Provisional 257
14024 139427 PRK13250 PRK13250 phycoerythrobilin:ferredoxin oxidoreductase; Provisional 248
14025 183917 PRK13251 PRK13251 trp RNA-binding attenuation protein MtrB. 75
14026 183918 PRK13252 PRK13252 betaine aldehyde dehydrogenase; Provisional 488
14027 237316 PRK13253 PRK13253 citrate lyase subunit gamma; Provisional 92
14028 237317 PRK13254 PRK13254 cytochrome c maturation protein CcmE. 148
14029 183921 PRK13255 PRK13255 thiopurine S-methyltransferase; Reviewed 218
14030 237318 PRK13256 PRK13256 thiopurine S-methyltransferase; Reviewed 226
14031 237319 PRK13257 PRK13257 allantoicase; Provisional 336
14032 237320 PRK13258 PRK13258 7-cyano-7-deazaguanine reductase; Provisional 114
14033 237321 PRK13259 PRK13259 septation regulator SpoVG. 94
14034 183926 PRK13260 PRK13260 2,3-diketo-L-gulonate reductase; Provisional 332
14035 237322 PRK13261 ureE urease accessory protein UreE; Provisional 159
14036 183928 PRK13262 ureE urease accessory protein UreE; Provisional 231
14037 237323 PRK13263 ureE urease accessory protein UreE; Provisional 206
14038 183930 PRK13264 PRK13264 3-hydroxyanthranilate 3,4-dioxygenase; Provisional 177
14039 183931 PRK13265 PRK13265 glycine/sarcosine/betaine reductase complex protein A; Reviewed 154
14040 237324 PRK13266 PRK13266 Thf1-like protein; Reviewed 225
14041 237325 PRK13267 PRK13267 archaemetzincin-like protein; Reviewed 179
14042 183934 PRK13270 treF alpha,alpha-trehalase TreF. 549
14043 237326 PRK13271 treA alpha,alpha-trehalase TreA. 569
14044 183936 PRK13272 treA alpha,alpha-trehalase TreA. 542
14045 237327 PRK13273 mdoD glucan biosynthesis protein D; Provisional 476
14046 237328 PRK13274 mdoG glucan biosynthesis protein G; Provisional 516
14047 183939 PRK13275 mtrF tetrahydromethanopterin S-methyltransferase subunit F; Provisional 67
14048 183940 PRK13276 PRK13276 iron-sulfur cluster repair di-iron protein ScdA. 224
14049 183941 PRK13277 PRK13277 5-formaminoimidazole-4-carboxamide-1-(beta)-D-ribofuranosyl 5'-monophosphate synthetase-like protein; Provisional 366
14050 237329 PRK13278 purP formate--phosphoribosylaminoimidazolecarboxamide ligase. 358
14051 237330 PRK13279 arnT lipid IV(A) 4-amino-4-deoxy-L-arabinosyltransferase. 552
14052 237331 PRK13280 PRK13280 N-glycosylase/DNA lyase; Provisional 269
14053 237332 PRK13281 PRK13281 N-succinylarginine dihydrolase. 442
14054 183946 PRK13282 PRK13282 flagellar assembly protein FliW; Provisional 128
14055 183947 PRK13283 PRK13283 flagellar assembly protein FliW; Provisional 134
14056 237333 PRK13284 PRK13284 flagellar assembly protein FliW; Provisional 145
14057 237334 PRK13285 PRK13285 flagellar assembly protein FliW; Provisional 148
14058 237335 PRK13286 amiE aliphatic amidase. 345
14059 183950 PRK13287 amiF formamidase; Provisional 333
14060 237336 PRK13288 PRK13288 pyrophosphatase PpaX; Provisional 214
14061 237337 PRK13289 PRK13289 NO-inducible flavohemoprotein. 399
14062 183953 PRK13290 ectC L-ectoine synthase; Reviewed 125
14063 183954 PRK13291 PRK13291 putative metal-dependent hydrolase. 173
14064 183955 PRK13292 PRK13292 NADH-quinone oxidoreductase subunit B/C/D. 788
14065 183956 PRK13293 PRK13293 F420-0--gamma-glutamyl ligase; Reviewed 245
14066 183957 PRK13294 PRK13294 F420-0--gamma-glutamyl ligase; Provisional 448
14067 171961 PRK13295 PRK13295 cyclohexanecarboxylate-CoA ligase; Reviewed 547
14068 106256 PRK13296 PRK13296 CCA tRNA nucleotidyltransferase. 360
14069 139469 PRK13297 PRK13297 tRNA CCA-pyrophosphorylase; Provisional 364
14070 237338 PRK13298 PRK13298 tRNA CCA-pyrophosphorylase; Provisional 417
14071 237339 PRK13299 PRK13299 tRNA CCA-pyrophosphorylase; Provisional 394
14072 237340 PRK13300 PRK13300 CCA tRNA nucleotidyltransferase. 447
14073 106261 PRK13301 PRK13301 putative L-aspartate dehydrogenase; Provisional 267
14074 237341 PRK13302 PRK13302 aspartate dehydrogenase. 271
14075 237342 PRK13303 PRK13303 aspartate dehydrogenase. 265
14076 237343 PRK13304 PRK13304 aspartate dehydrogenase. 265
14077 183962 PRK13305 sgbH 3-keto-L-gulonate-6-phosphate decarboxylase UlaD. 218
14078 237344 PRK13306 ulaD 3-dehydro-L-gulonate-6-phosphate decarboxylase. 216
14079 183964 PRK13307 PRK13307 bifunctional 5,6,7,8-tetrahydromethanopterin hydro-lyase/3-hexulose-6-phosphate synthase. 391
14080 183965 PRK13308 ureC urease subunit alpha; Reviewed 569
14081 183966 PRK13309 ureC urease subunit alpha; Reviewed 572
14082 183967 PRK13310 PRK13310 N-acetyl-D-glucosamine kinase; Provisional 303
14083 106271 PRK13311 PRK13311 N-acetyl-D-glucosamine kinase; Provisional 256
14084 139480 PRK13312 PRK13312 staphylobilin-forming heme oxygenase IsdG. 107
14085 183968 PRK13313 PRK13313 staphylobilin-forming heme oxygenase IsdI. 108
14086 183969 PRK13314 PRK13314 heme oxygenase. 107
14087 237345 PRK13315 PRK13315 heme oxygenase. 107
14088 183970 PRK13316 PRK13316 heme oxygenase IsdG. 121
14089 237346 PRK13317 PRK13317 pantothenate kinase; Provisional 277
14090 237347 PRK13318 PRK13318 type III pantothenate kinase. 258
14091 237348 PRK13320 PRK13320 type III pantothenate kinase. 244
14092 237349 PRK13321 PRK13321 type III pantothenate kinase. 256
14093 237350 PRK13322 PRK13322 pantothenate kinase; Reviewed 246
14094 106284 PRK13324 PRK13324 type III pantothenate kinase. 258
14095 183976 PRK13325 PRK13325 bifunctional biotin--[acetyl-CoA-carboxylase] ligase/type III pantothenate kinase. 592
14096 237351 PRK13326 PRK13326 type III pantothenate kinase. 262
14097 183977 PRK13327 PRK13327 type III pantothenate kinase. 242
14098 237352 PRK13328 PRK13328 type III pantothenate kinase. 255
14099 183979 PRK13329 PRK13329 pantothenate kinase; Reviewed 249
14100 237353 PRK13331 PRK13331 pantothenate kinase; Reviewed 251
14101 183981 PRK13333 PRK13333 type III pantothenate kinase. 206
14102 139494 PRK13335 PRK13335 superantigen-like protein SSL3; Reviewed. 356
14103 183982 PRK13337 PRK13337 putative lipid kinase; Reviewed 304
14104 183983 PRK13339 PRK13339 malate:quinone oxidoreductase; Reviewed 497
14105 183984 PRK13340 PRK13340 alanine racemase; Reviewed 406
14106 237354 PRK13341 PRK13341 AAA family ATPase. 725
14107 237355 PRK13342 PRK13342 recombination factor protein RarA; Reviewed 413
14108 183987 PRK13343 PRK13343 F0F1 ATP synthase subunit alpha; Provisional 502
14109 183988 PRK13344 spxA transcriptional regulator Spx; Reviewed 132
14110 106303 PRK13345 PRK13345 superantigen-like protein SSL9; Reviewed. 232
14111 106304 PRK13346 PRK13346 superantigen-like protein SSL7; Reviewed. 231
14112 237356 PRK13347 PRK13347 coproporphyrinogen III oxidase; Provisional 453
14113 237357 PRK13348 PRK13348 HTH-type transcriptional regulator ArgP. 294
14114 106307 PRK13349 PRK13349 superantigen-like protein SSL13; Reviewed. 241
14115 171995 PRK13350 PRK13350 superantigen-like protein SSL12; Reviewed. 238
14116 237358 PRK13351 PRK13351 elongation factor G-like protein. 687
14117 237359 PRK13352 PRK13352 phosphomethylpyrimidine synthase ThiC. 431
14118 183992 PRK13353 PRK13353 aspartate ammonia-lyase; Provisional 473
14119 237360 PRK13354 PRK13354 tyrosyl-tRNA synthetase; Provisional 410
14120 237361 PRK13355 PRK13355 bifunctional HTH-domain containing protein/aminotransferase; Provisional 517
14121 237362 PRK13356 PRK13356 branched-chain amino acid aminotransferase. 286
14122 237363 PRK13357 PRK13357 branched-chain amino acid aminotransferase; Provisional 356
14123 183997 PRK13358 PRK13358 protocatechuate 4,5-dioxygenase subunit beta; Provisional 269
14124 183998 PRK13359 PRK13359 beta-ketoadipyl CoA thiolase; Provisional 400
14125 183999 PRK13360 PRK13360 omega amino acid--pyruvate transaminase; Provisional 442
14126 237364 PRK13361 PRK13361 GTP 3',8-cyclase MoaA. 329
14127 184001 PRK13362 PRK13362 protoheme IX farnesyltransferase; Provisional 306
14128 184002 PRK13363 PRK13363 protocatechuate 4,5-dioxygenase subunit beta; Provisional 335
14129 184003 PRK13364 PRK13364 protocatechuate 4,5-dioxygenase subunit beta; Provisional 278
14130 184004 PRK13365 PRK13365 protocatechuate 4,5-dioxygenase subunit beta; Provisional 279
14131 184005 PRK13366 PRK13366 protocatechuate 4,5-dioxygenase subunit beta; Provisional 284
14132 184006 PRK13367 PRK13367 gallate dioxygenase. 420
14133 184007 PRK13368 PRK13368 3-deoxy-manno-octulosonate cytidylyltransferase; Provisional 238
14134 237365 PRK13369 PRK13369 glycerol-3-phosphate dehydrogenase; Provisional 502
14135 237366 PRK13370 mhpB 3-carboxyethylcatechol 2,3-dioxygenase. 313
14136 237367 PRK13371 PRK13371 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Provisional 387
14137 106330 PRK13372 pcmA protocatechuate 4,5-dioxygenase subunit alpha/beta. 444
14138 106331 PRK13373 PRK13373 putative dioxygenase; Provisional 344
14139 237368 PRK13374 PRK13374 DeoD-type purine-nucleoside phosphorylase. 233
14140 172015 PRK13375 pimE mannosyltransferase; Provisional 409
14141 237369 PRK13376 pyrB bifunctional aspartate carbamoyltransferase catalytic subunit/aspartate carbamoyltransferase regulatory subunit; Provisional 525
14142 184013 PRK13377 PRK13377 protocatechuate 4,5-dioxygenase subunit alpha; Provisional 129
14143 139527 PRK13378 PRK13378 protocatechuate 4,5-dioxygenase subunit alpha; Provisional 117
14144 184014 PRK13379 PRK13379 protocatechuate 4,5-dioxygenase subunit alpha; Provisional 119
14145 237370 PRK13380 PRK13380 glycine cleavage system protein H; Provisional 144
14146 237371 PRK13381 PRK13381 peptidase T; Provisional 404
14147 172019 PRK13382 PRK13382 bile acid CoA ligase. 537
14148 139531 PRK13383 PRK13383 acyl-CoA synthetase; Provisional 516
14149 172020 PRK13384 PRK13384 porphobilinogen synthase. 322
14150 184017 PRK13385 PRK13385 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase; Provisional 230
14151 237372 PRK13386 fliH flagellar assembly protein H; Provisional 236
14152 237373 PRK13387 PRK13387 1,4-dihydroxy-2-naphthoate octaprenyltransferase; Provisional 317
14153 237374 PRK13388 PRK13388 acyl-CoA synthetase; Provisional 540
14154 184021 PRK13389 PRK13389 UTP--glucose-1-phosphate uridylyltransferase GalU. 302
14155 139538 PRK13390 PRK13390 acyl-CoA synthetase; Provisional 501
14156 184022 PRK13391 PRK13391 acyl-CoA synthetase; Provisional 511
14157 184023 PRK13392 PRK13392 5-aminolevulinate synthase; Provisional 410
14158 184024 PRK13393 PRK13393 5-aminolevulinate synthase; Provisional 406
14159 184025 PRK13394 PRK13394 3-hydroxybutyrate dehydrogenase; Provisional 262
14160 237375 PRK13395 PRK13395 ureidoglycolate lyase. 171
14161 237376 PRK13396 PRK13396 3-deoxy-7-phosphoheptulonate synthase; Provisional 352
14162 172030 PRK13397 PRK13397 3-deoxy-7-phosphoheptulonate synthase; Provisional 250
14163 184028 PRK13398 PRK13398 3-deoxy-7-phosphoheptulonate synthase; Provisional 266
14164 184029 PRK13399 PRK13399 fructose-bisphosphate aldolase class II. 347
14165 184030 PRK13400 PRK13400 30S ribosomal protein S18; Provisional 147
14166 184031 PRK13401 PRK13401 30S ribosomal protein S18; Provisional 82
14167 184032 PRK13402 PRK13402 glutamate 5-kinase. 368
14168 106361 PRK13403 PRK13403 ketol-acid reductoisomerase; Provisional 335
14169 184033 PRK13404 PRK13404 dihydropyrimidinase; Provisional 477
14170 237377 PRK13405 bchH magnesium chelatase subunit H; Provisional 1209
14171 237378 PRK13406 bchD magnesium chelatase subunit D; Provisional 584
14172 184036 PRK13407 bchI magnesium chelatase subunit I; Provisional 334
14173 184037 PRK13409 PRK13409 ribosome biogenesis/translation initiation ATPase RLI. 590
14174 184038 PRK13410 PRK13410 molecular chaperone DnaK; Provisional 668
14175 184039 PRK13411 PRK13411 molecular chaperone DnaK; Provisional 653
14176 237379 PRK13412 fkp bifunctional fucokinase/L-fucose-1-P-guanylyltransferase; Provisional 974
14177 184041 PRK13413 mpi master DNA invertase Mpi family serine-type recombinase. 200
14178 139556 PRK13414 PRK13414 flagellar biosynthesis protein FliZ; Provisional 209
14179 184042 PRK13415 PRK13415 flagella biosynthesis protein FliZ; Provisional 219
14180 237380 PRK13417 PRK13417 F0F1 ATP synthase subunit A; Provisional 352
14181 237381 PRK13419 PRK13419 F0F1 ATP synthase subunit A; Provisional 342
14182 237382 PRK13420 PRK13420 F0F1 ATP synthase subunit A; Provisional 226
14183 237383 PRK13421 PRK13421 F0F1 ATP synthase subunit A; Provisional 223
14184 184046 PRK13422 PRK13422 F0F1 ATP synthase subunit gamma; Provisional 298
14185 237384 PRK13423 PRK13423 F0F1 ATP synthase subunit gamma; Provisional 288
14186 172047 PRK13424 PRK13424 F0F1 ATP synthase subunit gamma; Provisional 291
14187 139564 PRK13425 PRK13425 F0F1 ATP synthase subunit gamma; Provisional 291
14188 237385 PRK13426 PRK13426 F0F1 ATP synthase subunit gamma; Provisional 291
14189 172049 PRK13427 PRK13427 F0F1 ATP synthase subunit gamma; Provisional 289
14190 184048 PRK13428 PRK13428 F0F1 ATP synthase subunit delta; Provisional 445
14191 237386 PRK13429 PRK13429 F0F1 ATP synthase subunit delta; Provisional 181
14192 237387 PRK13430 PRK13430 F0F1 ATP synthase subunit delta; Provisional 271
14193 184051 PRK13431 PRK13431 F0F1 ATP synthase subunit delta; Provisional 180
14194 139571 PRK13434 PRK13434 F0F1 ATP synthase subunit delta; Provisional 184
14195 184052 PRK13435 PRK13435 response regulator; Provisional 145
14196 184053 PRK13436 PRK13436 F0F1 ATP synthase subunit delta; Provisional 179
14197 184054 PRK13441 PRK13441 F0F1 ATP synthase subunit delta; Provisional 180
14198 184055 PRK13442 atpC F0F1 ATP synthase subunit epsilon; Provisional 89
14199 237388 PRK13443 atpC F0F1 ATP synthase subunit epsilon; Provisional 136
14200 139576 PRK13444 atpC F0F1 ATP synthase subunit epsilon; Provisional 127
14201 184056 PRK13446 atpC F0F1 ATP synthase subunit epsilon; Provisional 136
14202 184057 PRK13447 PRK13447 F0F1 ATP synthase subunit epsilon; Provisional 136
14203 139579 PRK13448 atpC F0F1 ATP synthase subunit epsilon; Provisional 135
14204 184058 PRK13449 atpC ATP synthase F1 subunit epsilon. 88
14205 184059 PRK13450 atpC F0F1 ATP synthase subunit epsilon; Provisional 132
14206 172059 PRK13451 atpC F0F1 ATP synthase subunit epsilon; Provisional 101
14207 106409 PRK13452 atpC F0F1 ATP synthase subunit epsilon; Provisional 145
14208 184060 PRK13453 PRK13453 F0F1 ATP synthase subunit B; Provisional 173
14209 184061 PRK13454 PRK13454 F0F1 ATP synthase subunit B'; Provisional 181
14210 184062 PRK13455 PRK13455 F0F1 ATP synthase subunit B; Provisional 184
14211 237389 PRK13456 PRK13456 DNA protection protein DPS; Provisional 186
14212 139585 PRK13460 PRK13460 F0F1 ATP synthase subunit B; Provisional 173
14213 184064 PRK13461 PRK13461 F0F1 ATP synthase subunit B; Provisional 159
14214 139587 PRK13462 PRK13462 acid phosphatase; Provisional 203
14215 172065 PRK13463 PRK13463 phosphoserine phosphatase 1. 203
14216 184065 PRK13464 PRK13464 F0F1 ATP synthase subunit B. 101
14217 172066 PRK13466 PRK13466 F0F1 ATP synthase subunit C; Provisional 66
14218 237390 PRK13467 PRK13467 F0F1 ATP synthase subunit C; Provisional 66
14219 184067 PRK13468 PRK13468 F0F1 ATP synthase subunit C; Provisional 82
14220 184068 PRK13469 PRK13469 F0F1 ATP synthase subunit C; Provisional 79
14221 184069 PRK13471 PRK13471 F0F1 ATP synthase subunit C; Provisional 85
14222 237391 PRK13473 PRK13473 aminobutyraldehyde dehydrogenase. 475
14223 237392 PRK13474 PRK13474 cytochrome b6-f complex iron-sulfur subunit; Provisional 178
14224 184072 PRK13475 PRK13475 ribulose-bisphosphate carboxylase. 443
14225 184073 PRK13476 PRK13476 cytochrome b6-f complex subunit IV; Provisional 160
14226 237393 PRK13477 PRK13477 bifunctional pantoate--beta-alanine ligase/(d)CMP kinase. 512
14227 184075 PRK13478 PRK13478 phosphonoacetaldehyde hydrolase; Provisional 267
14228 184076 PRK13479 PRK13479 2-aminoethylphosphonate--pyruvate transaminase; Provisional 368
14229 237394 PRK13480 PRK13480 3'-5' exoribonuclease YhaM; Provisional 314
14230 184078 PRK13481 PRK13481 glycosyltransferase; Provisional 232
14231 237395 PRK13482 PRK13482 DNA integrity scanning protein DisA; Provisional 352
14232 184080 PRK13483 PRK13483 ligand-gated channel protein. 660
14233 139605 PRK13484 PRK13484 IreA family TonB-dependent siderophore receptor. 682
14234 139606 PRK13486 PRK13486 TonB-dependent receptor. 696
14235 237396 PRK13487 PRK13487 chemoreceptor glutamine deamidase CheD; Provisional 201
14236 237397 PRK13488 PRK13488 chemoreceptor glutamine deamidase CheD; Provisional 157
14237 237398 PRK13489 PRK13489 chemoreceptor glutamine deamidase CheD; Provisional 233
14238 184084 PRK13490 PRK13490 chemoreceptor glutamine deamidase CheD; Provisional 162
14239 184085 PRK13491 PRK13491 chemoreceptor glutamine deamidase CheD; Provisional 199
14240 184086 PRK13493 PRK13493 chemoreceptor glutamine deamidase CheD; Provisional 213
14241 184087 PRK13494 PRK13494 chemoreceptor glutamine deamidase CheD; Provisional 163
14242 184088 PRK13495 PRK13495 chemoreceptor glutamine deamidase CheD; Provisional 159
14243 237399 PRK13497 PRK13497 chemoreceptor glutamine deamidase CheD; Provisional 184
14244 237400 PRK13498 PRK13498 chemoreceptor glutamine deamidase CheD; Provisional 167
14245 237401 PRK13499 PRK13499 L-rhamnose/proton symporter RhaT. 345
14246 184091 PRK13500 PRK13500 HTH-type transcriptional activator RhaR. 312
14247 184092 PRK13501 PRK13501 HTH-type transcriptional activator RhaR. 290
14248 184093 PRK13502 PRK13502 HTH-type transcriptional activator RhaR. 282
14249 184094 PRK13503 PRK13503 HTH-type transcriptional activator RhaS. 278
14250 237402 PRK13504 PRK13504 NADPH-dependent assimilatory sulfite reductase hemoprotein subunit. 569
14251 237403 PRK13505 PRK13505 formate--tetrahydrofolate ligase; Provisional 557
14252 237404 PRK13506 PRK13506 formate--tetrahydrofolate ligase; Provisional 578
14253 184098 PRK13507 PRK13507 formate--tetrahydrofolate ligase; Provisional 587
14254 237405 PRK13508 PRK13508 tagatose-6-phosphate kinase; Provisional 309
14255 184100 PRK13509 PRK13509 HTH-type transcriptional regulator UlaR. 251
14256 184101 PRK13510 PRK13510 sulfurtransferase complex subunit TusB. 95
14257 184102 PRK13511 PRK13511 6-phospho-beta-galactosidase; Provisional 469
14258 184103 PRK13512 PRK13512 coenzyme A disulfide reductase; Provisional 438
14259 184104 PRK13513 PRK13513 ligand-gated channel protein. 659
14260 237406 PRK13515 PRK13515 carboxylate-amine ligase; Provisional 371
14261 237407 PRK13516 PRK13516 gamma-glutamyl:cysteine ligase; Provisional 373
14262 237408 PRK13517 PRK13517 glutamate--cysteine ligase. 373
14263 184108 PRK13518 PRK13518 glutamate--cysteine ligase. 357
14264 237409 PRK13520 PRK13520 tyrosine decarboxylase MfnA. 371
14265 184110 PRK13523 PRK13523 NADPH dehydrogenase NamA; Provisional 337
14266 237410 PRK13524 PRK13524 FepA family TonB-dependent siderophore receptor. 744
14267 237411 PRK13525 PRK13525 pyridoxal 5'-phosphate synthase glutaminase subunit PdxT. 189
14268 184113 PRK13526 PRK13526 glutamine amidotransferase subunit PdxT; Provisional 179
14269 237412 PRK13527 PRK13527 glutamine amidotransferase subunit PdxT; Provisional 200
14270 237413 PRK13528 PRK13528 outer membrane receptor FepA; Provisional 727
14271 237414 PRK13529 PRK13529 oxaloacetate-decarboxylating malate dehydrogenase. 563
14272 237415 PRK13530 PRK13530 arsenate reductase (thioredoxin). 133
14273 184118 PRK13531 PRK13531 regulatory ATPase RavA; Provisional 498
14274 237416 PRK13532 PRK13532 nitrate reductase catalytic subunit NapA. 830
14275 237417 PRK13533 PRK13533 7-cyano-7-deazaguanine tRNA-ribosyltransferase; Provisional 487
14276 237418 PRK13534 PRK13534 7-cyano-7-deazaguanine tRNA-ribosyltransferase; Provisional 639
14277 184122 PRK13535 PRK13535 erythrose 4-phosphate dehydrogenase; Provisional 336
14278 237419 PRK13536 PRK13536 nodulation factor ABC transporter ATP-binding protein NodI. 340
14279 237420 PRK13537 PRK13537 nodulation factor ABC transporter ATP-binding protein NodI. 306
14280 184125 PRK13538 PRK13538 cytochrome c biogenesis heme-transporting ATPase CcmA. 204
14281 237421 PRK13539 PRK13539 cytochrome c biogenesis protein CcmA; Provisional 207
14282 184127 PRK13540 PRK13540 cytochrome c biogenesis protein CcmA; Provisional 200
14283 184128 PRK13541 PRK13541 cytochrome c biogenesis protein CcmA; Provisional 195
14284 184129 PRK13543 PRK13543 heme ABC exporter ATP-binding protein CcmA. 214
14285 184130 PRK13545 tagH teichoic acids export protein ATP-binding subunit; Provisional 549
14286 184131 PRK13546 PRK13546 teichoic acids export ABC transporter ATP-binding subunit TagH. 264
14287 184132 PRK13547 hmuV heme ABC transporter ATP-binding protein. 272
14288 237422 PRK13548 hmuV hemin importer ATP-binding subunit; Provisional 258
14289 184134 PRK13549 PRK13549 xylose transporter ATP-binding subunit; Provisional 506
14290 184135 PRK13551 PRK13551 agmatine deiminase; Provisional 362
14291 184136 PRK13552 frdB fumarate reductase iron-sulfur subunit; Provisional 239
14292 237423 PRK13553 PRK13553 fumarate reductase cytochrome b subunit. 258
14293 237424 PRK13554 PRK13554 fumarate reductase cytochrome b-556 subunit; Provisional 241
14294 184139 PRK13555 PRK13555 FMN-dependent NADH-azoreductase. 208
14295 184140 PRK13556 PRK13556 FMN-dependent NADH-azoreductase. 208
14296 237425 PRK13557 PRK13557 histidine kinase; Provisional 540
14297 237426 PRK13558 PRK13558 bacterio-opsin activator; Provisional 665
14298 237427 PRK13559 PRK13559 hypothetical protein; Provisional 361
14299 106506 PRK13560 PRK13560 hypothetical protein; Provisional 807
14300 184143 PRK13561 PRK13561 putative diguanylate cyclase; Provisional 651
14301 184144 PRK13562 PRK13562 ACT domain-containing protein. 84
14302 237428 PRK13564 PRK13564 anthranilate synthase component 1. 520
14303 184146 PRK13565 PRK13565 anthranilate synthase component I; Provisional 490
14304 237429 PRK13566 PRK13566 anthranilate synthase component I. 720
14305 184148 PRK13567 PRK13567 anthranilate synthase component I; Provisional 468
14306 237430 PRK13568 hofQ DNA uptake porin HofQ. 381
14307 184150 PRK13569 PRK13569 anthranilate synthase component I; Provisional 506
14308 237431 PRK13570 PRK13570 anthranilate synthase component I; Provisional 455
14309 184152 PRK13571 PRK13571 anthranilate synthase component I; Provisional 506
14310 237432 PRK13572 PRK13572 anthranilate synthase component I; Provisional 435
14311 184154 PRK13573 PRK13573 anthranilate synthase component I; Provisional 503
14312 184155 PRK13574 PRK13574 anthranilate synthase component I; Provisional 420
14313 184156 PRK13575 PRK13575 type I 3-dehydroquinate dehydratase. 238
14314 237433 PRK13576 PRK13576 type I 3-dehydroquinate dehydratase. 216
14315 184158 PRK13577 PRK13577 diaminopimelate epimerase; Provisional 281
14316 237434 PRK13578 PRK13578 ornithine decarboxylase; Provisional 720
14317 237435 PRK13579 gcvT glycine cleavage system aminomethyltransferase GcvT. 370
14318 184161 PRK13580 PRK13580 glycine hydroxymethyltransferase. 493
14319 237436 PRK13581 PRK13581 D-3-phosphoglycerate dehydrogenase; Provisional 526
14320 237437 PRK13582 thrH bifunctional phosphoserine phosphatase/homoserine phosphotransferase ThrH. 205
14321 237438 PRK13583 hisG ATP phosphoribosyltransferase. 228
14322 172153 PRK13584 hisG ATP phosphoribosyltransferase. 204
14323 184165 PRK13585 PRK13585 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino]imidazole-4-carboxamide isomerase. 241
14324 237439 PRK13586 PRK13586 1-(5-phosphoribosyl)-5- ((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 232
14325 172156 PRK13587 PRK13587 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase; Provisional 234
14326 237440 PRK13588 PRK13588 flagellin B; Provisional 514
14327 172158 PRK13589 PRK13589 flagellin A. 576
14328 184168 PRK13590 PRK13590 putative bifunctional OHCU decarboxylase/allantoate amidohydrolase; Provisional 591
14329 184169 PRK13591 ubiA prenyltransferase; Provisional 307
14330 139690 PRK13592 ubiA prenyltransferase; Provisional 299
14331 172161 PRK13595 ubiA prenyltransferase; Provisional 292
14332 237441 PRK13596 PRK13596 NADH-quinone oxidoreductase subunit NuoF. 433
14333 184171 PRK13598 hisB imidazoleglycerol-phosphate dehydratase; Provisional 193
14334 106544 PRK13599 PRK13599 peroxiredoxin. 215
14335 184172 PRK13600 PRK13600 putative ribosomal protein L7Ae-like; Provisional 84
14336 184173 PRK13601 PRK13601 putative L7Ae-like ribosomal protein; Provisional 82
14337 184174 PRK13602 PRK13602 50S ribosomal protein L7ae-like protein. 82
14338 172166 PRK13603 PRK13603 fumarate reductase subunit C; Provisional 126
14339 184175 PRK13604 luxD acyl transferase; Provisional 307
14340 237442 PRK13605 PRK13605 endoribonuclease SymE; Provisional 113
14341 237443 PRK13606 PRK13606 LPPG:FO 2-phospho-L-lactate transferase; Provisional 303
14342 237444 PRK13607 PRK13607 proline dipeptidase; Provisional 443
14343 184179 PRK13608 PRK13608 diacylglycerol glucosyltransferase; Provisional 391
14344 237445 PRK13609 PRK13609 diacylglycerol glucosyltransferase; Provisional 380
14345 139699 PRK13610 PRK13610 photosystem II reaction center protein Psb28; Provisional 113
14346 106556 PRK13611 PRK13611 photosystem II reaction center protein Psb28; Provisional 104
14347 237446 PRK13612 PRK13612 photosystem II reaction center protein Psb28; Provisional 113
14348 237447 PRK13613 PRK13613 lipoprotein LpqB; Provisional 599
14349 237448 PRK13614 PRK13614 lipoprotein LpqB; Provisional 573
14350 184183 PRK13615 PRK13615 lipoprotein LpqB; Provisional 557
14351 237449 PRK13616 PRK13616 MtrAB system accessory protein LpqB. 591
14352 106562 PRK13617 psbV cytochrome c-550; Provisional 170
14353 184185 PRK13618 psbV cytochrome c-550; Provisional 163
14354 172177 PRK13619 psbV cytochrome c-550; Provisional 160
14355 139707 PRK13620 psbV cytochrome c-550; Provisional 215
14356 237450 PRK13621 psbV cytochrome c-550; Provisional 170
14357 106567 PRK13622 psbV cytochrome c-550; Provisional 180
14358 184186 PRK13623 PRK13623 iron-sulfur cluster insertion protein ErpA; Provisional 115
14359 184187 PRK13625 PRK13625 bis(5'-nucleosyl)-tetraphosphatase PrpE; Provisional 245
14360 184188 PRK13626 PRK13626 HTH-type transcriptional regulator SgrR. 552
14361 184189 PRK13627 PRK13627 carnitine operon protein CaiE; Provisional 196
14362 184190 PRK13628 PRK13628 serine/threonine transporter SstT; Provisional 402
14363 184191 PRK13629 PRK13629 threonine/serine transporter TdcC; Provisional 443
14364 237451 PRK13631 cbiO cobalt transporter ATP-binding subunit; Provisional 320
14365 237452 PRK13632 cbiO cobalt transporter ATP-binding subunit; Provisional 271
14366 237453 PRK13633 PRK13633 energy-coupling factor transporter ATPase. 280
14367 237454 PRK13634 cbiO cobalt transporter ATP-binding subunit; Provisional 290
14368 184195 PRK13635 cbiO energy-coupling factor ABC transporter ATP-binding protein. 279
14369 184196 PRK13636 cbiO cobalt transporter ATP-binding subunit; Provisional 283
14370 237455 PRK13637 cbiO energy-coupling factor transporter ATPase. 287
14371 184198 PRK13638 cbiO energy-coupling factor ABC transporter ATP-binding protein. 271
14372 184199 PRK13639 cbiO cobalt transporter ATP-binding subunit; Provisional 275
14373 184200 PRK13640 cbiO energy-coupling factor transporter ATPase. 282
14374 237456 PRK13641 cbiO energy-coupling factor transporter ATPase. 287
14375 184202 PRK13642 cbiO energy-coupling factor transporter ATPase. 277
14376 184203 PRK13643 cbiO energy-coupling factor transporter ATPase. 288
14377 106587 PRK13644 cbiO energy-coupling factor transporter ATPase. 274
14378 184204 PRK13645 cbiO energy-coupling factor transporter ATPase. 289
14379 184205 PRK13646 cbiO energy-coupling factor transporter ATPase. 286
14380 237457 PRK13647 cbiO cobalt transporter ATP-binding subunit; Provisional 274
14381 184207 PRK13648 cbiO cobalt transporter ATP-binding subunit; Provisional 269
14382 184208 PRK13649 cbiO energy-coupling factor transporter ATPase. 280
14383 184209 PRK13650 cbiO energy-coupling factor transporter ATPase. 279
14384 184210 PRK13651 PRK13651 cobalt transporter ATP-binding subunit; Provisional 305
14385 172200 PRK13652 cbiO cobalt transporter ATP-binding subunit; Provisional 277
14386 237458 PRK13654 PRK13654 magnesium-protoporphyrin IX monomethyl ester cyclase; Provisional 355
14387 237459 PRK13655 PRK13655 phosphoenolpyruvate carboxylase; Provisional 494
14388 237460 PRK13656 PRK13656 enoyl-[acyl-carrier-protein] reductase FabV. 398
14389 184214 PRK13657 PRK13657 glucan ABC transporter ATP-binding protein/permease. 588
14390 184215 PRK13658 PRK13658 hypothetical protein; Provisional 59
14391 184216 PRK13659 PRK13659 DUF1283 family protein. 103
14392 237461 PRK13660 PRK13660 hypothetical protein; Provisional 182
14393 184218 PRK13661 PRK13661 ECF-type riboflavin transporter substrate-binding protein. 182
14394 184219 PRK13662 PRK13662 hypothetical protein; Provisional 177
14395 184220 PRK13663 PRK13663 hypothetical protein; Provisional 493
14396 184221 PRK13664 PRK13664 hypothetical protein; Provisional 62
14397 237462 PRK13665 PRK13665 hypothetical protein; Provisional 316
14398 184223 PRK13666 PRK13666 hypothetical protein; Provisional 92
14399 184224 PRK13667 PRK13667 hypothetical protein; Provisional 70
14400 237463 PRK13668 PRK13668 hypothetical protein; Provisional 267
14401 184226 PRK13669 PRK13669 hypothetical protein; Provisional 78
14402 237464 PRK13670 PRK13670 nucleotidyltransferase. 388
14403 184228 PRK13671 PRK13671 nucleotidyltransferase. 298
14404 184229 PRK13672 PRK13672 hypothetical protein; Provisional 71
14405 237465 PRK13673 PRK13673 hypothetical protein; Provisional 118
14406 237466 PRK13674 PRK13674 GTP cyclohydrolase I FolE2. 271
14407 184232 PRK13675 PRK13675 GTP cyclohydrolase; Provisional 308
14408 237467 PRK13676 PRK13676 YlbF/YmcA family competence regulator. 114
14409 184234 PRK13677 PRK13677 DUF3461 family protein. 125
14410 184235 PRK13678 PRK13678 hypothetical protein; Provisional 95
14411 184236 PRK13679 PRK13679 hypothetical protein; Provisional 168
14412 184237 PRK13680 PRK13680 hypothetical protein; Provisional 117
14413 184238 PRK13681 PRK13681 protein YohO. 35
14414 184239 PRK13682 PRK13682 hypothetical protein; Provisional 51
14415 184240 PRK13683 PRK13683 hypothetical protein; Provisional 87
14416 237468 PRK13684 PRK13684 photosynthesis system II assembly factor Ycf48. 334
14417 184242 PRK13685 PRK13685 hypothetical protein; Provisional 326
14418 237469 PRK13686 PRK13686 photosystem II reaction center protein Ycf12. 43
14419 184244 PRK13687 PRK13687 hypothetical protein; Provisional 85
14420 237470 PRK13688 PRK13688 N-acetyltransferase. 156
14421 237471 PRK13689 PRK13689 hypothetical protein; Provisional 75
14422 237472 PRK13690 PRK13690 hypothetical protein; Provisional 184
14423 139768 PRK13691 PRK13691 (3R)-hydroxyacyl-ACP dehydratase subunit HadC; Provisional 166
14424 237473 PRK13692 PRK13692 (3R)-hydroxyacyl-ACP dehydratase subunit HadA; Provisional 159
14425 184249 PRK13693 PRK13693 (3R)-hydroxyacyl-ACP dehydratase subunit HadB; Provisional 142
14426 237474 PRK13694 PRK13694 hypothetical protein; Provisional 83
14427 237475 PRK13695 PRK13695 NTPase. 174
14428 237476 PRK13696 PRK13696 hypothetical protein; Provisional 62
14429 184253 PRK13697 PRK13697 cytochrome c6; Provisional 111
14430 184254 PRK13698 PRK13698 ParB/RepB/Spo0J family plasmid partition protein. 323
14431 184255 PRK13699 PRK13699 putative methylase; Provisional 227
14432 184256 PRK13700 PRK13700 conjugal transfer protein TraD; Provisional 732
14433 237477 PRK13701 psiB conjugation system SOS inhibitor PsiB. 144
14434 184258 PRK13702 PRK13702 replication regulatory protein RepA. 85
14435 184259 PRK13703 PRK13703 conjugal pilus assembly protein TraF; Provisional 248
14436 184260 PRK13704 PRK13704 plasmid SOS inhibition protein A; Provisional 240
14437 184261 PRK13705 PRK13705 plasmid-partitioning protein SopA; Provisional 388
14438 184262 PRK13706 PRK13706 conjugal transfer pilus acetylase TraX. 248
14439 184263 PRK13707 PRK13707 type IV conjugative transfer system protein TraL. 101
14440 184264 PRK13708 PRK13708 type II toxin-antitoxin system toxin CcdB. 101
14441 237478 PRK13709 PRK13709 conjugal transfer nickase/helicase TraI; Provisional 1747
14442 184266 PRK13710 PRK13710 type II toxin-antitoxin system antitoxin CcdA. 72
14443 184267 PRK13711 PRK13711 P-type conjugative transfer protein TrbJ. 113
14444 184268 PRK13712 PRK13712 conjugal transfer protein TrbA; Provisional 115
14445 184269 PRK13713 PRK13713 relaxosome protein TraM. 118
14446 184270 PRK13715 PRK13715 conjugal transfer protein TraR; Provisional 73
14447 106657 PRK13716 PRK13716 RepA leader peptide Tap. 24
14448 184271 PRK13717 PRK13717 type-F conjugative transfer system protein TrbI. 128
14449 172260 PRK13718 PRK13718 conjugal transfer protein TrbE; Provisional 84
14450 237479 PRK13719 PRK13719 conjugal transfer transcriptional regulator TraJ; Provisional 217
14451 172262 PRK13720 PRK13720 modulator of post-segregation killing protein; Provisional 70
14452 237480 PRK13721 PRK13721 conjugal transfer ATP-binding protein TraC; Provisional 844
14453 184274 PRK13722 PRK13722 lytic transglycosylase; Provisional 169
14454 237481 PRK13723 PRK13723 conjugal transfer pilus assembly protein TraH; Provisional 451
14455 237482 PRK13724 PRK13724 conjugal transfer protein TrbD; Provisional 65
14456 184277 PRK13725 PRK13725 tRNA(fMet)-specific endonuclease VapC. 132
14457 184278 PRK13726 PRK13726 type IV conjugative transfer system protein TraE. 188
14458 237483 PRK13727 PRK13727 conjugal transfer pilin chaperone TraQ; Provisional 80
14459 237484 PRK13728 PRK13728 conjugal transfer protein TrbB; Provisional 181
14460 184281 PRK13729 PRK13729 conjugal transfer pilus assembly protein TraB; Provisional 475
14461 184282 PRK13730 PRK13730 conjugal transfer pilus assembly protein TrbC; Provisional 212
14462 184283 PRK13731 PRK13731 complement resistance protein TraT. 243
14463 184284 PRK13732 PRK13732 single-stranded DNA-binding protein; Provisional 175
14464 184285 PRK13733 PRK13733 conjugal transfer protein TraV; Provisional 171
14465 237485 PRK13734 PRK13734 conjugal transfer pilin subunit TraA; Provisional 120
14466 184287 PRK13735 PRK13735 conjugal transfer mating pair stabilization protein TraG; Provisional 942
14467 237486 PRK13736 PRK13736 conjugal transfer protein TraK; Provisional 245
14468 237487 PRK13737 PRK13737 conjugal transfer pilus assembly protein TraU; Provisional 330
14469 184290 PRK13738 PRK13738 conjugal transfer pilus assembly protein TraW; Provisional 209
14470 237488 PRK13739 PRK13739 conjugal transfer protein TraP; Provisional 198
14471 184292 PRK13740 PRK13740 conjugal transfer relaxosome protein TraY. 70
14472 172283 PRK13741 PRK13741 conjugal transfer protein TraS. 171
14473 184293 PRK13742 PRK13742 replication initiation protein RepE. 245
14474 184294 PRK13743 PRK13743 conjugal transfer protein TrbF; Provisional 141
14475 139817 PRK13744 PRK13744 conjugal transfer protein TrbG; Provisional 83
14476 237489 PRK13745 PRK13745 anaerobic sulfatase-maturation protein. 412
14477 184296 PRK13746 PRK13746 aminoglycoside resistance protein; Provisional 262
14478 184297 PRK13747 PRK13747 putative mercury resistance protein; Provisional 78
14479 184298 PRK13748 PRK13748 putative mercuric reductase; Provisional 561
14480 184299 PRK13749 PRK13749 HTH-type transcriptional regulator MerD. 121
14481 184300 PRK13750 PRK13750 replication protein; Provisional 285
14482 184301 PRK13751 PRK13751 putative mercuric transport protein; Provisional 116
14483 184302 PRK13752 PRK13752 mercuric resistance operon transcriptional regulator MerR. 144
14484 184303 PRK13753 PRK13753 dihydropteroate synthase; Provisional 279
14485 184304 PRK13754 PRK13754 fertility inhibition protein FinO. 186
14486 237490 PRK13755 PRK13755 organomercurial transporter MerC. 139
14487 172294 PRK13756 PRK13756 TetR family transcriptional regulator. 205
14488 172295 PRK13757 PRK13757 type A chloramphenicol O-acetyltransferase. 219
14489 172296 PRK13758 PRK13758 anaerobic sulfatase-maturase; Provisional 370
14490 237491 PRK13759 PRK13759 arylsulfatase; Provisional 485
14491 237492 PRK13760 PRK13760 ribosome assembly factor SBDS. 231
14492 184308 PRK13761 PRK13761 phosphopantothenate/pantothenate synthetase. 248
14493 237493 PRK13762 PRK13762 4-demethylwyosine synthase TYW1. 322
14494 237494 PRK13763 PRK13763 putative RNA-processing protein; Provisional 180
14495 184311 PRK13764 PRK13764 ATPase; Provisional 602
14496 237495 PRK13765 PRK13765 ATP-dependent protease Lon; Provisional 637
14497 237496 PRK13766 PRK13766 Hef nuclease; Provisional 773
14498 237497 PRK13767 PRK13767 ATP-dependent helicase; Provisional 876
14499 237498 PRK13768 PRK13768 GTPase; Provisional 253
14500 172307 PRK13769 PRK13769 histidinol dehydrogenase; Provisional 368
14501 172308 PRK13770 PRK13770 histidinol dehydrogenase; Provisional 416
14502 184316 PRK13771 PRK13771 putative alcohol dehydrogenase; Provisional 334
14503 172310 PRK13772 PRK13772 formimidoylglutamase; Provisional 314
14504 237499 PRK13773 PRK13773 formimidoylglutamase; Provisional 324
14505 184317 PRK13774 PRK13774 formimidoylglutamase; Provisional 311
14506 172313 PRK13775 PRK13775 formimidoylglutamase; Provisional 328
14507 237500 PRK13776 PRK13776 formimidoylglutamase; Provisional 318
14508 237501 PRK13777 PRK13777 HTH-type transcriptional regulator Hpr. 185
14509 184320 PRK13778 paaA phenylacetate-CoA oxygenase subunit PaaA; Provisional 314
14510 237502 PRK13779 PRK13779 bifunctional PTS system fructose-specific transporter subunit IIA/HPr protein; Provisional 503
14511 237503 PRK13780 PRK13780 phosphocarrier protein HPr; Provisional 88
14512 237504 PRK13781 paaB phenylacetate-CoA oxygenase subunit PaaB; Provisional 95
14513 172320 PRK13782 PRK13782 HPr family phosphocarrier protein. 82
14514 237505 PRK13783 PRK13783 adenylosuccinate synthetase; Provisional 404
14515 172322 PRK13784 PRK13784 adenylosuccinate synthetase; Provisional 428
14516 237506 PRK13785 PRK13785 adenylosuccinate synthetase; Provisional 454
14517 184325 PRK13786 PRK13786 adenylosuccinate synthetase; Provisional 424
14518 172324 PRK13787 PRK13787 adenylosuccinate synthetase; Provisional 423
14519 184326 PRK13788 PRK13788 adenylosuccinate synthetase; Provisional 404
14520 184327 PRK13789 PRK13789 phosphoribosylamine--glycine ligase; Provisional 426
14521 237507 PRK13790 PRK13790 phosphoribosylamine--glycine ligase; Provisional 379
14522 237508 PRK13791 PRK13791 c-type lysozyme inhibitor. 113
14523 106733 PRK13792 PRK13792 lysozyme inhibitor; Provisional 127
14524 184329 PRK13793 PRK13793 nicotinate-nicotinamide nucleotide adenylyltransferase. 196
14525 237509 PRK13794 PRK13794 hypothetical protein; Provisional 479
14526 237510 PRK13795 PRK13795 hypothetical protein; Provisional 636
14527 237511 PRK13796 PRK13796 GTPase YqeH; Provisional 365
14528 106738 PRK13797 PRK13797 allantoicase. 516
14529 184333 PRK13798 PRK13798 putative OHCU decarboxylase; Provisional 166
14530 106740 PRK13799 PRK13799 unknown domain/N-carbamoyl-L-amino acid hydrolase fusion protein; Provisional 591
14531 237512 PRK13800 PRK13800 fumarate reductase/succinate dehydrogenase flavoprotein subunit. 897
14532 184335 PRK13802 PRK13802 bifunctional indole-3-glycerol phosphate synthase/tryptophan synthase subunit beta; Provisional 695
14533 237513 PRK13803 PRK13803 bifunctional phosphoribosylanthranilate isomerase/tryptophan synthase subunit beta; Provisional 610
14534 237514 PRK13804 ileS isoleucyl-tRNA synthetase; Provisional 961
14535 237515 PRK13805 PRK13805 bifunctional acetaldehyde-CoA/alcohol dehydrogenase; Provisional 862
14536 237516 PRK13806 rpsA 30S ribosomal protein S1; Provisional 491
14537 237517 PRK13807 PRK13807 maltose phosphorylase; Provisional 756
14538 172341 PRK13808 PRK13808 adenylate kinase; Provisional 333
14539 184340 PRK13809 PRK13809 orotate phosphoribosyltransferase; Provisional 206
14540 184341 PRK13810 PRK13810 orotate phosphoribosyltransferase; Provisional 187
14541 237518 PRK13811 PRK13811 orotate phosphoribosyltransferase; Provisional 170
14542 237519 PRK13812 PRK13812 orotate phosphoribosyltransferase; Provisional 176
14543 237520 PRK13813 PRK13813 orotidine 5'-phosphate decarboxylase; Provisional 215
14544 139876 PRK13814 pyrB aspartate carbamoyltransferase. 310
14545 172345 PRK13815 PRK13815 ribosome-binding factor A; Provisional 122
14546 184344 PRK13816 PRK13816 ribosome-binding factor A; Provisional 131
14547 139879 PRK13817 PRK13817 ribosome-binding factor A; Provisional 119
14548 184345 PRK13818 PRK13818 ribosome-binding factor A; Provisional 121
14549 237521 PRK13820 PRK13820 argininosuccinate synthase; Provisional 394
14550 184347 PRK13821 thyA thymidylate synthase; Provisional 323
14551 237522 PRK13822 PRK13822 conjugal transfer coupling protein TraG; Provisional 641
14552 184348 PRK13823 PRK13823 conjugal transfer protein TrbD; Provisional 94
14553 184349 PRK13824 PRK13824 replication initiation protein RepC; Provisional 404
14554 237523 PRK13825 PRK13825 conjugal transfer protein TraB; Provisional 388
14555 237524 PRK13826 PRK13826 Dtr system oriT relaxase; Provisional 1102
14556 184351 PRK13828 rimM 16S rRNA-processing protein RimM; Provisional 161
14557 184352 PRK13829 rimM 16S rRNA-processing protein RimM; Provisional 162
14558 237525 PRK13830 PRK13830 conjugal transfer protein TrbE; Provisional 818
14559 172358 PRK13831 PRK13831 conjugal transfer protein TrbI; Provisional 432
14560 184353 PRK13832 PRK13832 plasmid partitioning protein; Provisional 520
14561 172360 PRK13833 PRK13833 conjugal transfer protein TrbB; Provisional 323
14562 172361 PRK13834 PRK13834 putative autoinducer synthesis protein; Provisional 207
14563 172362 PRK13835 PRK13835 conjugal transfer protein TrbH; Provisional 145
14564 172363 PRK13836 PRK13836 conjugal transfer protein TrbF; Provisional 220
14565 237526 PRK13837 PRK13837 two-component system VirA-like sensor kinase. 828
14566 172365 PRK13838 PRK13838 conjugal transfer pilin processing protease TraF; Provisional 176
14567 237527 PRK13839 PRK13839 conjugal transfer protein TrbG; Provisional 277
14568 237528 PRK13840 PRK13840 sucrose phosphorylase; Provisional 495
14569 237529 PRK13841 PRK13841 conjugal transfer protein TrbL; Provisional 391
14570 172369 PRK13842 PRK13842 conjugal transfer protein TrbJ; Provisional 267
14571 237530 PRK13843 PRK13843 conjugal transfer protein TraH; Provisional 207
14572 139904 PRK13844 PRK13844 recombination protein RecR; Provisional 200
14573 172371 PRK13845 PRK13845 putative glycerol-3-phosphate acyltransferase PlsX; Provisional 437
14574 139906 PRK13846 PRK13846 phosphate acyltransferase PlsX. 316
14575 172372 PRK13847 PRK13847 type IV conjugative transfer system coupling protein TraD. 71
14576 172373 PRK13848 PRK13848 conjugal transfer protein TraC; Provisional 98
14577 139909 PRK13849 PRK13849 conjugal transfer ATPase VirC1. 231
14578 237531 PRK13850 PRK13850 type IV secretion system protein VirD4; Provisional 670
14579 172375 PRK13851 PRK13851 type IV secretion system protein VirB11; Provisional 344
14580 139912 PRK13852 PRK13852 type IV secretion system protein. 295
14581 139913 PRK13853 PRK13853 type IV secretion system protein VirB4; Provisional 789
14582 139914 PRK13854 PRK13854 type IV secretion system protein VirB3; Provisional 108
14583 172376 PRK13855 PRK13855 type IV secretion system protein VirB10; Provisional 376
14584 172377 PRK13856 PRK13856 two-component response regulator VirG; Provisional 241
14585 172378 PRK13857 PRK13857 pilin major subunit VirB2. 120
14586 237532 PRK13858 PRK13858 T-DNA border endonuclease VirD1. 147
14587 172380 PRK13859 PRK13859 type IV secretion system lipoprotein VirB7; Provisional 55
14588 172381 PRK13860 PRK13860 pilin minor subunit VirB5. 220
14589 172382 PRK13861 PRK13861 type IV secretion system protein VirB9; Provisional 292
14590 172383 PRK13862 PRK13862 conjugal transfer protein VirC2. 201
14591 237533 PRK13863 PRK13863 T-DNA border endonuclease VirD2. 446
14592 237534 PRK13864 PRK13864 type IV secretion system lytic transglycosylase VirB1; Provisional 245
14593 172386 PRK13865 PRK13865 type IV secretion system protein VirB8; Provisional 229
14594 172387 PRK13866 PRK13866 plasmid partitioning protein RepB; Provisional 336
14595 172388 PRK13867 PRK13867 type IV secretion system effector chaperone VirE1. 65
14596 237535 PRK13868 PRK13868 type IV secretion system single-stranded DNA binding effector VirE2. 556
14597 139929 PRK13869 PRK13869 plasmid-partitioning protein RepA; Provisional 405
14598 172390 PRK13870 PRK13870 transcriptional regulator TraR; Provisional 234
14599 172391 PRK13871 PRK13871 conjugal transfer protein TrbC; Provisional 135
14600 184356 PRK13872 PRK13872 conjugal transfer protein TrbF; Provisional 228
14601 237536 PRK13873 PRK13873 conjugal transfer ATPase TrbE; Provisional 811
14602 184358 PRK13874 PRK13874 conjugal transfer protein TrbJ; Provisional 230
14603 237537 PRK13875 PRK13875 conjugal transfer protein TrbL; Provisional 440
14604 237538 PRK13876 PRK13876 conjugal transfer coupling protein TraG; Provisional 663
14605 184361 PRK13877 PRK13877 conjugal transfer transcriptional regulator TraJ. 114
14606 237539 PRK13878 PRK13878 conjugal transfer relaxase TraI; Provisional 746
14607 184363 PRK13879 PRK13879 P-type conjugative transfer protein TrbJ. 253
14608 237540 PRK13880 PRK13880 conjugal transfer coupling protein TraG; Provisional 636
14609 237541 PRK13881 PRK13881 conjugal transfer protein TrbI; Provisional 472
14610 237542 PRK13882 PRK13882 conjugal transfer protein TrbP; Provisional 232
14611 184367 PRK13883 PRK13883 conjugal transfer protein TrbH; Provisional 151
14612 184368 PRK13884 PRK13884 conjugal transfer peptidase TraF; Provisional 178
14613 237543 PRK13885 PRK13885 conjugal transfer protein TrbG; Provisional 299
14614 184370 PRK13886 PRK13886 conjugal transfer protein TraL; Provisional 241
14615 237544 PRK13887 PRK13887 conjugal transfer protein TrbF; Provisional 250
14616 237545 PRK13888 PRK13888 conjugal transfer protein TrbN; Provisional 206
14617 237546 PRK13889 PRK13889 conjugal transfer relaxase TraA; Provisional 988
14618 237547 PRK13890 PRK13890 conjugal transfer protein TrbA; Provisional 120
14619 184375 PRK13891 PRK13891 conjugal transfer protein TrbE; Provisional 852
14620 184376 PRK13892 PRK13892 conjugal transfer protein TrbC; Provisional 134
14621 237548 PRK13893 PRK13893 conjugal transfer protein TrbM; Provisional 193
14622 184377 PRK13894 PRK13894 conjugal transfer ATPase TrbB; Provisional 319
14623 184378 PRK13895 PRK13895 conjugal transfer protein TraM; Provisional 144
14624 184379 PRK13896 PRK13896 cobyrinic acid a,c-diamide synthase; Provisional 433
14625 237549 PRK13897 PRK13897 type IV secretion system component VirD4; Provisional 606
14626 172418 PRK13898 PRK13898 type IV secretion system ATPase VirB4; Provisional 800
14627 237550 PRK13899 PRK13899 type IV secretion system protein VirB3; Provisional 97
14628 184381 PRK13900 PRK13900 type IV secretion system ATPase VirB11; Provisional 332
14629 139961 PRK13901 ruvA Holliday junction branch migration protein RuvA. 196
14630 237551 PRK13902 alaS alanyl-tRNA synthetase; Provisional 900
14631 237552 PRK13903 murB UDP-N-acetylmuramate dehydrogenase. 363
14632 184384 PRK13904 murB UDP-N-acetylmuramate dehydrogenase. 257
14633 237553 PRK13905 murB UDP-N-acetylmuramate dehydrogenase. 298
14634 184386 PRK13906 murB UDP-N-acetylmuramate dehydrogenase. 307
14635 139967 PRK13907 rnhA ribonuclease H; Provisional 128
14636 184387 PRK13908 PRK13908 recombination protein RecO. 204
14637 237554 PRK13909 PRK13909 RecB-like helicase. 910
14638 172427 PRK13910 PRK13910 DNA glycosylase MutY; Provisional 289
14639 139971 PRK13911 PRK13911 exodeoxyribonuclease III; Provisional 250
14640 184389 PRK13912 PRK13912 nuclease NucT; Provisional 177
14641 184390 PRK13913 PRK13913 3-methyladenine DNA glycosylase; Provisional 218
14642 237555 PRK13914 PRK13914 invasion associated endopeptidase. 481
14643 237556 PRK13915 PRK13915 putative glucosyl-3-phosphoglycerate synthase; Provisional 306
14644 139976 PRK13916 PRK13916 plasmid segregation protein ParR; Provisional 97
14645 184393 PRK13917 PRK13917 plasmid segregation protein ParM; Provisional 344
14646 237557 PRK13918 PRK13918 CRP/FNR family transcriptional regulator; Provisional 202
14647 184395 PRK13919 PRK13919 putative RNA polymerase sigma E protein; Provisional 186
14648 237558 PRK13920 PRK13920 putative anti-sigmaE protein; Provisional 206
14649 237559 PRK13921 PRK13921 CRISPR-associated Cse2 family protein; Provisional 173
14650 237560 PRK13922 PRK13922 rod shape-determining protein MreC; Provisional 276
14651 237561 PRK13923 PRK13923 putative spore coat protein regulator protein YlbO; Provisional 170
14652 184399 PRK13925 rnhB ribonuclease HII; Provisional 198
14653 184400 PRK13926 PRK13926 ribonuclease HII; Provisional 207
14654 237562 PRK13927 PRK13927 rod shape-determining protein MreB; Provisional 334
14655 237563 PRK13928 PRK13928 rod shape-determining protein Mbl; Provisional 336
14656 184403 PRK13929 PRK13929 rod shape-determining protein MreBH; Provisional 335
14657 237564 PRK13930 PRK13930 rod shape-determining protein MreB; Provisional 335
14658 184405 PRK13931 PRK13931 5'/3'-nucleotidase SurE. 261
14659 172445 PRK13932 PRK13932 stationary phase survival protein SurE; Provisional 257
14660 184406 PRK13933 PRK13933 stationary phase survival protein SurE; Provisional 253
14661 237565 PRK13934 PRK13934 stationary phase survival protein SurE; Provisional 266
14662 237566 PRK13935 PRK13935 stationary phase survival protein SurE; Provisional 253
14663 237567 PRK13936 PRK13936 phosphoheptose isomerase; Provisional 197
14664 184408 PRK13937 PRK13937 phosphoheptose isomerase; Provisional 188
14665 139997 PRK13938 PRK13938 phosphoheptose isomerase; Provisional 196
14666 172450 PRK13940 PRK13940 glutamyl-tRNA reductase; Provisional 414
14667 184409 PRK13942 PRK13942 protein-L-isoaspartate O-methyltransferase; Provisional 212
14668 237568 PRK13943 PRK13943 protein-L-isoaspartate O-methyltransferase; Provisional 322
14669 140001 PRK13944 PRK13944 protein-L-isoaspartate O-methyltransferase; Provisional 205
14670 184410 PRK13945 PRK13945 formamidopyrimidine-DNA glycosylase; Provisional 282
14671 184411 PRK13946 PRK13946 shikimate kinase; Provisional 184
14672 184412 PRK13947 PRK13947 shikimate kinase; Provisional 171
14673 184413 PRK13948 PRK13948 shikimate kinase; Provisional 182
14674 140006 PRK13949 PRK13949 shikimate kinase; Provisional 169
14675 172457 PRK13951 PRK13951 bifunctional shikimate kinase AroK/3-dehydroquinate synthase AroB. 488
14676 237569 PRK13952 mscL large conductance mechanosensitive channel protein MscL. 142
14677 184415 PRK13953 mscL large conductance mechanosensitive channel protein MscL. 125
14678 172460 PRK13954 mscL large conductance mechanosensitive channel protein MscL. 119
14679 184416 PRK13955 mscL large conductance mechanosensitive channel protein MscL. 130
14680 184417 PRK13956 dut dUTP diphosphatase. 147
14681 140013 PRK13957 PRK13957 indole-3-glycerol-phosphate synthase; Provisional 247
14682 184418 PRK13958 PRK13958 N-(5'-phosphoribosyl)anthranilate isomerase; Provisional 207
14683 237570 PRK13959 PRK13959 phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional 341
14684 184420 PRK13960 PRK13960 phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional 367
14685 237571 PRK13961 PRK13961 phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional 296
14686 237572 PRK13962 PRK13962 bifunctional phosphoglycerate kinase/triosephosphate isomerase; Provisional 645
14687 237573 PRK13963 PRK13963 rRNA maturation RNase YbeY. 258
14688 184424 PRK13964 coaD pantetheine-phosphate adenylyltransferase. 140
14689 184425 PRK13965 PRK13965 ribonucleotide-diphosphate reductase subunit beta; Provisional 335
14690 140022 PRK13966 nrdF2 ribonucleotide-diphosphate reductase subunit beta; Provisional 324
14691 140023 PRK13967 nrdF1 ribonucleotide-diphosphate reductase subunit beta; Provisional 322
14692 184426 PRK13968 PRK13968 putative succinate semialdehyde dehydrogenase; Provisional 462
14693 184427 PRK13969 PRK13969 proline racemase; Provisional 334
14694 172473 PRK13970 PRK13970 4-hydroxyproline epimerase. 311
14695 184428 PRK13971 PRK13971 4-hydroxyproline epimerase. 333
14696 172475 PRK13972 PRK13972 GSH-dependent disulfide bond oxidoreductase; Provisional 215
14697 184429 PRK13973 PRK13973 thymidylate kinase; Provisional 213
14698 172477 PRK13974 PRK13974 dTMP kinase. 212
14699 184430 PRK13975 PRK13975 dTMP kinase. 196
14700 237574 PRK13976 PRK13976 dTMP kinase. 209
14701 237575 PRK13977 PRK13977 myosin-cross-reactive antigen; Provisional 576
14702 184433 PRK13978 PRK13978 ribose 5-phosphate isomerase A. 228
14703 237576 PRK13979 PRK13979 DNA topoisomerase IV subunit A; Provisional 957
14704 184435 PRK13980 PRK13980 NAD synthetase; Provisional 265
14705 237577 PRK13981 PRK13981 NAD synthetase; Provisional 540
14706 172484 PRK13982 PRK13982 bifunctional SbtC-like/phosphopantothenoylcysteine decarboxylase/phosphopantothenate synthase; Provisional 475
14707 237578 PRK13983 PRK13983 M20 family metallo-hydrolase. 400
14708 172486 PRK13984 PRK13984 putative oxidoreductase; Provisional 604
14709 184438 PRK13985 ureB urease subunit alpha. 568
14710 184439 PRK13986 PRK13986 urease subunit beta. 225
14711 237579 PRK13987 PRK13987 cell division topological specificity factor MinE; Provisional 91
14712 184441 PRK13988 PRK13988 cell division topological specificity factor MinE; Provisional 97
14713 184442 PRK13989 PRK13989 cell division topological specificity factor MinE; Provisional 84
14714 172492 PRK13990 PRK13990 cell division topological specificity factor MinE; Provisional 90
14715 172493 PRK13991 PRK13991 cell division topological specificity factor MinE; Provisional 87
14716 237580 PRK13992 minC septum site-determining protein MinC. 205
14717 237581 PRK13994 PRK13994 potassium-transporting ATPase subunit C; Provisional 222
14718 184445 PRK13995 PRK13995 K(+)-transporting ATPase subunit C. 203
14719 172497 PRK13996 PRK13996 potassium-transporting ATPase subunit C; Provisional 197
14720 172498 PRK13997 PRK13997 K(+)-transporting ATPase subunit C. 193
14721 172499 PRK13998 PRK13998 K(+)-transporting ATPase subunit C. 186
14722 172500 PRK13999 PRK13999 K(+)-transporting ATPase subunit C. 201
14723 184446 PRK14000 PRK14000 K(+)-transporting ATPase subunit C. 185
14724 172502 PRK14001 PRK14001 K(+)-transporting ATPase subunit C. 189
14725 172503 PRK14002 PRK14002 K(+)-transporting ATPase subunit C. 186
14726 184447 PRK14003 PRK14003 K(+)-transporting ATPase subunit C. 194
14727 172505 PRK14004 hisH imidazole glycerol phosphate synthase subunit HisH; Provisional 210
14728 184448 PRK14010 PRK14010 K(+)-transporting ATPase subunit B. 673
14729 237582 PRK14011 PRK14011 prefoldin subunit alpha; Provisional 144
14730 184450 PRK14012 PRK14012 IscS subfamily cysteine desulfurase. 404
14731 237583 PRK14013 PRK14013 hypothetical protein; Provisional 338
14732 237584 PRK14014 PRK14014 putative acyltransferase; Provisional 301
14733 237585 PRK14015 pepN aminopeptidase N; Provisional 875
14734 237586 PRK14016 PRK14016 cyanophycin synthetase; Provisional 727
14735 184455 PRK14017 PRK14017 galactonate dehydratase; Provisional 382
14736 184456 PRK14018 PRK14018 bifunctional peptide-methionine (S)-S-oxide reductase MsrA/peptide-methionine (R)-S-oxide reductase MsrB. 521
14737 237587 PRK14019 PRK14019 bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 367
14738 184458 PRK14021 PRK14021 bifunctional shikimate kinase/3-dehydroquinate synthase; Provisional 542
14739 237588 PRK14022 PRK14022 UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--L-lysine ligase. 481
14740 184460 PRK14023 PRK14023 homoaconitate hydratase small subunit; Provisional 166
14741 237589 PRK14024 PRK14024 phosphoribosyl isomerase A; Provisional 241
14742 184462 PRK14025 PRK14025 multifunctional 3-isopropylmalate dehydrogenase/D-malate dehydrogenase; Provisional 330
14743 172521 PRK14027 PRK14027 quinate/shikimate dehydrogenase (NAD+). 283
14744 172522 PRK14028 PRK14028 pyruvate ferredoxin oxidoreductase subunit gamma/delta; Provisional 312
14745 172523 PRK14029 PRK14029 pyruvate/ketoisovalerate ferredoxin oxidoreductase subunit gamma; Provisional 185
14746 184463 PRK14030 PRK14030 glutamate dehydrogenase; Provisional 445
14747 184464 PRK14031 PRK14031 NADP-specific glutamate dehydrogenase. 444
14748 184465 PRK14032 PRK14032 citrate synthase; Provisional 447
14749 237590 PRK14033 PRK14033 bifunctional 2-methylcitrate synthase/citrate synthase. 375
14750 184467 PRK14034 PRK14034 citrate synthase; Provisional 372
14751 184468 PRK14035 PRK14035 citrate synthase; Provisional 371
14752 237591 PRK14036 PRK14036 citrate synthase; Provisional 377
14753 184470 PRK14037 PRK14037 citrate synthase; Provisional 377
14754 172532 PRK14038 PRK14038 ADP-specific glucokinase. 453
14755 184471 PRK14039 PRK14039 ADP-dependent glucokinase; Provisional 453
14756 237592 PRK14040 PRK14040 oxaloacetate decarboxylase subunit alpha. 593
14757 237593 PRK14041 PRK14041 pyruvate carboxylase subunit B. 467
14758 172536 PRK14042 PRK14042 pyruvate carboxylase subunit B; Provisional 596
14759 172537 PRK14045 PRK14045 1-aminocyclopropane-1-carboxylate deaminase; Provisional 329
14760 237594 PRK14046 PRK14046 malate--CoA ligase subunit beta; Provisional 392
14761 184475 PRK14047 PRK14047 tetrahydromethanopterin S-methyltransferase subunit H. 310
14762 172540 PRK14048 PRK14048 ferrichrome/ferrioxamine B periplasmic transporter; Provisional 374
14763 172541 PRK14049 PRK14049 ferrioxamine B receptor precursor protein; Provisional 726
14764 237595 PRK14050 PRK14050 TonB-dependent siderophore receptor. 728
14765 184476 PRK14051 PRK14051 negative regulator GrlR; Provisional 123
14766 184477 PRK14052 PRK14052 adenosine monophosphate-protein transferase vopS. 387
14767 237596 PRK14053 PRK14053 methyltransferase; Provisional 194
14768 237597 PRK14054 PRK14054 peptide-methionine (S)-S-oxide reductase. 172
14769 172547 PRK14055 PRK14055 aromatic amino acid hydroxylase; Provisional 362
14770 237598 PRK14056 PRK14056 aromatic amino acid hydroxylase. 578
14771 172549 PRK14057 PRK14057 epimerase; Provisional 254
14772 237599 PRK14058 PRK14058 [LysW]-aminoadipate/[LysW]-glutamate kinase. 268
14773 184482 PRK14059 PRK14059 pyrimidine reductase family protein. 251
14774 172552 PRK14061 PRK14061 unknown domain/lipoate-protein ligase A fusion protein; Provisional 562
14775 184483 PRK14063 PRK14063 exodeoxyribonuclease VII small subunit; Provisional 76
14776 172554 PRK14064 PRK14064 exodeoxyribonuclease VII small subunit; Provisional 75
14777 184484 PRK14065 PRK14065 exodeoxyribonuclease VII small subunit; Provisional 86
14778 172556 PRK14066 PRK14066 exodeoxyribonuclease VII small subunit; Provisional 75
14779 172557 PRK14067 PRK14067 exodeoxyribonuclease VII small subunit; Provisional 80
14780 184485 PRK14068 PRK14068 exodeoxyribonuclease VII small subunit; Provisional 76
14781 172559 PRK14069 PRK14069 exodeoxyribonuclease VII small subunit; Provisional 95
14782 184486 PRK14070 PRK14070 exodeoxyribonuclease VII small subunit; Provisional 69
14783 184487 PRK14071 PRK14071 ATP-dependent 6-phosphofructokinase. 360
14784 237600 PRK14072 PRK14072 diphosphate--fructose-6-phosphate 1-phosphotransferase. 416
14785 172564 PRK14074 rpsF 30S ribosomal protein S6; Provisional 257
14786 184489 PRK14075 pnk NAD(+) kinase. 256
14787 237601 PRK14076 pnk bifunctional NADP phosphatase/NAD kinase. 569
14788 172567 PRK14077 pnk NAD(+) kinase. 287
14789 184491 PRK14079 recF recombination protein F; Provisional 349
14790 237602 PRK14081 PRK14081 triple tyrosine motif-containing protein; Provisional 667
14791 184493 PRK14082 PRK14082 hypothetical protein; Provisional 65
14792 237603 PRK14083 PRK14083 HSP90 family protein; Provisional 601
14793 184495 PRK14084 PRK14084 DNA-binding response regulator. 246
14794 237604 PRK14085 PRK14085 imidazolonepropionase; Provisional 382
14795 237605 PRK14086 dnaA chromosomal replication initiator protein DnaA. 617
14796 172577 PRK14087 dnaA chromosomal replication initiator protein DnaA. 450
14797 172578 PRK14088 dnaA chromosomal replication initiator protein DnaA. 440
14798 237606 PRK14089 PRK14089 lipid-A-disaccharide synthase. 347
14799 184499 PRK14090 PRK14090 phosphoribosylformylglycinamidine synthase subunit PurL. 601
14800 237607 PRK14091 PRK14091 RNA chaperone Hfq. 165
14801 172582 PRK14092 PRK14092 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. 163
14802 184501 PRK14093 PRK14093 UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanine ligase; Provisional 479
14803 172584 PRK14094 psbM photosystem II reaction center protein PsbM. 50
14804 237608 PRK14095 pgi glucose-6-phosphate isomerase; Provisional 533
14805 237609 PRK14096 pgi glucose-6-phosphate isomerase; Provisional 528
14806 184504 PRK14097 pgi glucose-6-phosphate isomerase; Provisional 448
14807 172588 PRK14098 PRK14098 starch synthase. 489
14808 237610 PRK14099 PRK14099 glycogen synthase GlgA. 485
14809 184506 PRK14100 PRK14100 2-phosphosulfolactate phosphatase; Provisional 237
14810 184507 PRK14101 PRK14101 bifunctional transcriptional regulator/glucokinase. 638
14811 184508 PRK14102 nifW nitrogenase-stabilizing/protective protein NifW. 105
14812 184509 PRK14103 PRK14103 trans-aconitate 2-methyltransferase; Provisional 255
14813 172594 PRK14104 PRK14104 chaperonin GroEL; Provisional 546
14814 237611 PRK14105 PRK14105 selenide, water dikinase SelD. 345
14815 184511 PRK14106 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional 450
14816 237612 PRK14108 PRK14108 bifunctional [glutamine synthetase] adenylyltransferase/[glutamine synthetase]-adenylyl-L-tyrosine phosphorylase. 986
14817 237613 PRK14109 PRK14109 bifunctional [glutamine synthetase] adenylyltransferase/[glutamine synthetase]-adenylyl-L-tyrosine phosphorylase. 1007
14818 184514 PRK14110 PRK14110 F0F1 ATP synthase subunit gamma; Provisional 291
14819 184515 PRK14111 PRK14111 F0F1 ATP synthase subunit gamma; Provisional 290
14820 172602 PRK14112 PRK14112 urease accessory protein UreE; Provisional 149
14821 237614 PRK14113 PRK14113 urease accessory protein UreE; Provisional 152
14822 172604 PRK14114 PRK14114 1-(5-phosphoribosyl)-5- ((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 241
14823 184516 PRK14115 gpmA 2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 247
14824 172606 PRK14116 gpmA 2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 228
14825 184517 PRK14117 gpmA phosphoglyceromutase; Provisional 230
14826 172608 PRK14118 gpmA 2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 227
14827 184518 PRK14119 gpmA phosphoglyceromutase; Provisional 228
14828 184519 PRK14120 gpmA phosphoglyceromutase; Provisional 249
14829 237615 PRK14121 PRK14121 tRNA (guanine-N(7)-)-methyltransferase; Provisional 390
14830 184521 PRK14122 PRK14122 tRNA pseudouridine synthase B; Provisional 312
14831 184522 PRK14123 PRK14123 tRNA pseudouridine synthase B; Provisional 305
14832 172614 PRK14124 PRK14124 tRNA pseudouridine synthase B; Provisional 308
14833 184523 PRK14125 PRK14125 cell division suppressor protein YneA; Provisional 103
14834 172616 PRK14126 PRK14126 cell division protein ZapA; Provisional 85
14835 237616 PRK14127 PRK14127 cell division regulator GpsB. 109
14836 184525 PRK14128 iraD anti-adapter protein IraD. 69
14837 184526 PRK14129 PRK14129 heat shock protein HspQ; Provisional 105
14838 237617 PRK14131 PRK14131 N-acetylneuraminate epimerase. 376
14839 237618 PRK14132 PRK14132 riboflavin kinase; Provisional 126
14840 184529 PRK14133 PRK14133 DNA polymerase IV; Provisional 347
14841 184530 PRK14134 recX recombination regulator RecX; Provisional 283
14842 237619 PRK14135 recX recombination regulator RecX; Provisional 263
14843 237620 PRK14136 recX recombination regulator RecX; Provisional 309
14844 172626 PRK14137 recX recombination regulator RecX; Provisional 195
14845 172627 PRK14138 PRK14138 NAD-dependent deacetylase; Provisional 244
14846 237621 PRK14139 PRK14139 heat shock protein GrpE; Provisional 185
14847 237622 PRK14140 PRK14140 heat shock protein GrpE; Provisional 191
14848 172630 PRK14141 PRK14141 heat shock protein GrpE; Provisional 209
14849 237623 PRK14142 PRK14142 heat shock protein GrpE; Provisional 223
14850 237624 PRK14143 PRK14143 heat shock protein GrpE; Provisional 238
14851 184535 PRK14144 PRK14144 heat shock protein GrpE; Provisional 199
14852 184536 PRK14145 PRK14145 heat shock protein GrpE; Provisional 196
14853 172635 PRK14146 PRK14146 heat shock protein GrpE; Provisional 215
14854 237625 PRK14147 PRK14147 heat shock protein GrpE; Provisional 172
14855 172637 PRK14148 PRK14148 heat shock protein GrpE; Provisional 195
14856 184538 PRK14149 PRK14149 heat shock protein GrpE; Provisional 191
14857 184539 PRK14150 PRK14150 heat shock protein GrpE; Provisional 193
14858 172640 PRK14151 PRK14151 heat shock protein GrpE; Provisional 176
14859 184540 PRK14153 PRK14153 heat shock protein GrpE; Provisional 194
14860 237626 PRK14154 PRK14154 heat shock protein GrpE; Provisional 208
14861 237627 PRK14155 PRK14155 heat shock protein GrpE; Provisional 208
14862 237628 PRK14156 PRK14156 heat shock protein GrpE; Provisional 177
14863 184543 PRK14157 PRK14157 heat shock protein GrpE; Provisional 227
14864 172646 PRK14158 PRK14158 heat shock protein GrpE; Provisional 194
14865 172647 PRK14159 PRK14159 heat shock protein GrpE; Provisional 176
14866 237629 PRK14160 PRK14160 heat shock protein GrpE; Provisional 211
14867 237630 PRK14161 PRK14161 heat shock protein GrpE; Provisional 178
14868 237631 PRK14162 PRK14162 heat shock protein GrpE; Provisional 194
14869 184546 PRK14163 PRK14163 heat shock protein GrpE; Provisional 214
14870 237632 PRK14164 PRK14164 heat shock protein GrpE; Provisional 218
14871 184548 PRK14165 PRK14165 winged helix-turn-helix domain-containing protein/riboflavin kinase; Provisional 217
14872 172654 PRK14166 PRK14166 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 282
14873 184549 PRK14167 PRK14167 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 297
14874 237633 PRK14168 PRK14168 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 297
14875 184550 PRK14169 PRK14169 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 282
14876 172658 PRK14170 PRK14170 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 284
14877 172659 PRK14171 PRK14171 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 288
14878 172660 PRK14172 PRK14172 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 278
14879 184551 PRK14173 PRK14173 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 287
14880 172662 PRK14174 PRK14174 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 295
14881 184552 PRK14175 PRK14175 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 286
14882 184553 PRK14176 PRK14176 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 287
14883 172665 PRK14177 PRK14177 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 284
14884 172666 PRK14178 PRK14178 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 279
14885 237634 PRK14179 PRK14179 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase. 284
14886 172668 PRK14180 PRK14180 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 282
14887 172669 PRK14181 PRK14181 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 287
14888 172670 PRK14182 PRK14182 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 282
14889 184555 PRK14183 PRK14183 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 281
14890 237635 PRK14184 PRK14184 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 286
14891 184556 PRK14185 PRK14185 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 293
14892 237636 PRK14186 PRK14186 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 297
14893 172675 PRK14187 PRK14187 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 294
14894 184558 PRK14188 PRK14188 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 296
14895 184559 PRK14189 PRK14189 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase. 285
14896 184560 PRK14190 PRK14190 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 284
14897 172679 PRK14191 PRK14191 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 285
14898 184561 PRK14192 PRK14192 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 283
14899 237637 PRK14193 PRK14193 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional 284
14900 172682 PRK14194 PRK14194 bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 301
14901 184563 PRK14195 PRK14195 fluoride efflux transporter CrcB. 125
14902 184564 PRK14196 PRK14196 fluoride efflux transporter CrcB. 127
14903 172685 PRK14197 PRK14197 fluoride efflux transporter CrcB. 124
14904 172686 PRK14198 PRK14198 fluoride efflux transporter CrcB. 124
14905 172687 PRK14199 PRK14199 fluoride efflux transporter CrcB. 128
14906 237638 PRK14200 PRK14200 fluoride efflux transporter CrcB. 127
14907 184566 PRK14201 PRK14201 fluoride efflux transporter CrcB. 121
14908 172690 PRK14202 PRK14202 fluoride efflux transporter CrcB. 128
14909 237639 PRK14203 PRK14203 fluoride efflux transporter CrcB. 132
14910 172692 PRK14204 PRK14204 fluoride efflux transporter CrcB. 127
14911 172693 PRK14205 PRK14205 fluoride efflux transporter CrcB. 118
14912 172694 PRK14206 PRK14206 fluoride efflux transporter CrcB. 127
14913 172695 PRK14207 PRK14207 fluoride efflux transporter CrcB. 123
14914 172696 PRK14208 PRK14208 fluoride efflux transporter CrcB. 126
14915 237640 PRK14209 PRK14209 fluoride efflux transporter CrcB. 124
14916 172698 PRK14210 PRK14210 fluoride efflux transporter CrcB. 127
14917 172699 PRK14211 PRK14211 fluoride efflux transporter CrcB. 114
14918 184569 PRK14212 PRK14212 fluoride efflux transporter CrcB. 128
14919 184570 PRK14213 PRK14213 camphor resistance protein CrcB; Provisional 118
14920 184571 PRK14214 PRK14214 fluoride efflux transporter CrcB. 118
14921 172703 PRK14215 PRK14215 fluoride efflux transporter CrcB. 126
14922 184572 PRK14216 PRK14216 fluoride efflux transporter CrcB. 132
14923 172705 PRK14217 PRK14217 fluoride efflux transporter CrcB. 134
14924 184573 PRK14218 PRK14218 fluoride efflux transporter CrcB. 133
14925 172707 PRK14219 PRK14219 fluoride efflux transporter CrcB. 132
14926 237641 PRK14220 PRK14220 fluoride efflux transporter CrcB. 120
14927 184575 PRK14221 PRK14221 fluoride efflux transporter CrcB. 124
14928 172710 PRK14222 PRK14222 fluoride efflux transporter CrcB. 124
14929 184576 PRK14223 PRK14223 fluoride efflux transporter CrcB. 122
14930 237642 PRK14224 PRK14224 fluoride efflux transporter CrcB. 126
14931 172713 PRK14225 PRK14225 fluoride efflux transporter CrcB. 137
14932 172714 PRK14226 PRK14226 fluoride efflux transporter CrcB. 130
14933 172715 PRK14227 PRK14227 fluoride efflux transporter CrcB. 124
14934 237643 PRK14228 PRK14228 fluoride efflux transporter CrcB. 122
14935 172717 PRK14229 PRK14229 fluoride efflux transporter CrcB. 108
14936 172718 PRK14230 PRK14230 camphor resistance protein CrcB; Provisional 119
14937 184579 PRK14231 PRK14231 fluoride efflux transporter CrcB. 129
14938 237644 PRK14232 PRK14232 fluoride efflux transporter CrcB. 120
14939 172721 PRK14233 PRK14233 fluoride efflux transporter CrcB. 133
14940 172722 PRK14234 PRK14234 fluoride efflux transporter CrcB. 124
14941 237645 PRK14235 PRK14235 phosphate transporter ATP-binding protein; Provisional 267
14942 184582 PRK14236 PRK14236 phosphate transporter ATP-binding protein; Provisional 272
14943 237646 PRK14237 PRK14237 phosphate transporter ATP-binding protein; Provisional 267
14944 184584 PRK14238 PRK14238 phosphate transporter ATP-binding protein; Provisional 271
14945 184585 PRK14239 PRK14239 phosphate transporter ATP-binding protein; Provisional 252
14946 184586 PRK14240 PRK14240 phosphate transporter ATP-binding protein; Provisional 250
14947 184587 PRK14241 PRK14241 phosphate transporter ATP-binding protein; Provisional 258
14948 172730 PRK14242 PRK14242 phosphate ABC transporter ATP-binding protein. 253
14949 184588 PRK14243 PRK14243 phosphate transporter ATP-binding protein; Provisional 264
14950 172732 PRK14244 PRK14244 phosphate ABC transporter ATP-binding protein; Provisional 251
14951 172733 PRK14245 PRK14245 phosphate ABC transporter ATP-binding protein; Provisional 250
14952 172734 PRK14246 PRK14246 phosphate ABC transporter ATP-binding protein; Provisional 257
14953 172735 PRK14247 PRK14247 phosphate ABC transporter ATP-binding protein; Provisional 250
14954 237647 PRK14248 PRK14248 phosphate ABC transporter ATP-binding protein; Provisional 268
14955 184590 PRK14249 PRK14249 phosphate ABC transporter ATP-binding protein; Provisional 251
14956 237648 PRK14250 PRK14250 phosphate ABC transporter ATP-binding protein; Provisional 241
14957 172739 PRK14251 PRK14251 phosphate ABC transporter ATP-binding protein; Provisional 251
14958 172740 PRK14252 PRK14252 phosphate ABC transporter ATP-binding protein; Provisional 265
14959 172741 PRK14253 PRK14253 phosphate ABC transporter ATP-binding protein; Provisional 249
14960 237649 PRK14254 PRK14254 phosphate ABC transporter ATP-binding protein; Provisional 285
14961 172743 PRK14255 PRK14255 phosphate ABC transporter ATP-binding protein; Provisional 252
14962 172744 PRK14256 PRK14256 phosphate ABC transporter ATP-binding protein; Provisional 252
14963 172745 PRK14257 PRK14257 phosphate ABC transporter ATP-binding protein; Provisional 329
14964 184593 PRK14258 PRK14258 phosphate ABC transporter ATP-binding protein; Provisional 261
14965 172747 PRK14259 PRK14259 phosphate ABC transporter ATP-binding protein; Provisional 269
14966 172748 PRK14260 PRK14260 phosphate ABC transporter ATP-binding protein; Provisional 259
14967 172749 PRK14261 PRK14261 phosphate ABC transporter ATP-binding protein; Provisional 253
14968 172750 PRK14262 PRK14262 phosphate ABC transporter ATP-binding protein; Provisional 250
14969 172751 PRK14263 PRK14263 phosphate ABC transporter ATP-binding protein; Provisional 261
14970 184594 PRK14264 PRK14264 phosphate ABC transporter ATP-binding protein; Provisional 305
14971 237650 PRK14265 PRK14265 phosphate ABC transporter ATP-binding protein; Provisional 274
14972 237651 PRK14266 PRK14266 phosphate ABC transporter ATP-binding protein; Provisional 250
14973 184596 PRK14267 PRK14267 phosphate ABC transporter ATP-binding protein; Provisional 253
14974 172756 PRK14268 PRK14268 phosphate ABC transporter ATP-binding protein; Provisional 258
14975 172757 PRK14269 PRK14269 phosphate ABC transporter ATP-binding protein; Provisional 246
14976 184597 PRK14270 PRK14270 phosphate ABC transporter ATP-binding protein; Provisional 251
14977 172759 PRK14271 PRK14271 phosphate ABC transporter ATP-binding protein; Provisional 276
14978 172760 PRK14272 PRK14272 phosphate ABC transporter ATP-binding protein; Provisional 252
14979 172761 PRK14273 PRK14273 phosphate ABC transporter ATP-binding protein; Provisional 254
14980 172762 PRK14274 PRK14274 phosphate ABC transporter ATP-binding protein; Provisional 259
14981 237652 PRK14275 PRK14275 phosphate ABC transporter ATP-binding protein; Provisional 286
14982 237653 PRK14276 PRK14276 chaperone protein DnaJ; Provisional 380
14983 184599 PRK14277 PRK14277 chaperone protein DnaJ; Provisional 386
14984 237654 PRK14278 PRK14278 chaperone protein DnaJ; Provisional 378
14985 237655 PRK14279 PRK14279 molecular chaperone DnaJ. 392
14986 237656 PRK14280 PRK14280 molecular chaperone DnaJ. 376
14987 237657 PRK14281 PRK14281 chaperone protein DnaJ; Provisional 397
14988 184603 PRK14282 PRK14282 chaperone protein DnaJ; Provisional 369
14989 184604 PRK14283 PRK14283 chaperone protein DnaJ; Provisional 378
14990 237658 PRK14284 PRK14284 chaperone protein DnaJ; Provisional 391
14991 172773 PRK14285 PRK14285 chaperone protein DnaJ; Provisional 365
14992 172774 PRK14286 PRK14286 chaperone protein DnaJ; Provisional 372
14993 237659 PRK14287 PRK14287 chaperone protein DnaJ; Provisional 371
14994 172776 PRK14288 PRK14288 molecular chaperone DnaJ. 369
14995 237660 PRK14289 PRK14289 molecular chaperone DnaJ. 386
14996 172778 PRK14290 PRK14290 chaperone protein DnaJ; Provisional 365
14997 237661 PRK14291 PRK14291 chaperone protein DnaJ; Provisional 382
14998 237662 PRK14292 PRK14292 chaperone protein DnaJ; Provisional 371
14999 237663 PRK14293 PRK14293 molecular chaperone DnaJ. 374
15000 237664 PRK14294 PRK14294 chaperone protein DnaJ; Provisional 366
15001 237665 PRK14295 PRK14295 molecular chaperone DnaJ. 389
15002 237666 PRK14296 PRK14296 chaperone protein DnaJ; Provisional 372
15003 184611 PRK14297 PRK14297 molecular chaperone DnaJ. 380
15004 184612 PRK14298 PRK14298 chaperone protein DnaJ; Provisional 377
15005 237667 PRK14299 PRK14299 chaperone protein DnaJ; Provisional 291
15006 172788 PRK14300 PRK14300 chaperone protein DnaJ; Provisional 372
15007 237668 PRK14301 PRK14301 chaperone protein DnaJ; Provisional 373
15008 184614 PRK14314 glmM phosphoglucosamine mutase; Provisional 450
15009 237669 PRK14315 glmM phosphoglucosamine mutase; Provisional 448
15010 237670 PRK14316 glmM phosphoglucosamine mutase; Provisional 448
15011 237671 PRK14317 glmM phosphoglucosamine mutase; Provisional 465
15012 237672 PRK14318 glmM phosphoglucosamine mutase; Provisional 448
15013 172795 PRK14319 glmM phosphoglucosamine mutase; Provisional 430
15014 172796 PRK14320 glmM phosphoglucosamine mutase; Provisional 443
15015 172797 PRK14321 glmM phosphoglucosamine mutase; Provisional 449
15016 184619 PRK14322 glmM phosphoglucosamine mutase; Provisional 429
15017 184620 PRK14323 glmM phosphoglucosamine mutase; Provisional 440
15018 184621 PRK14324 glmM phosphoglucosamine mutase; Provisional 446
15019 237673 PRK14325 PRK14325 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 444
15020 237674 PRK14326 PRK14326 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 502
15021 184624 PRK14327 PRK14327 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 509
15022 237675 PRK14328 PRK14328 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 439
15023 237676 PRK14329 PRK14329 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 467
15024 184627 PRK14330 PRK14330 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 434
15025 184628 PRK14331 PRK14331 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 437
15026 172808 PRK14332 PRK14332 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 449
15027 237677 PRK14333 PRK14333 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 448
15028 184630 PRK14334 PRK14334 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 440
15029 237678 PRK14335 PRK14335 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 455
15030 184632 PRK14336 PRK14336 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 418
15031 172813 PRK14337 PRK14337 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 446
15032 184633 PRK14338 PRK14338 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 459
15033 184634 PRK14339 PRK14339 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 420
15034 237679 PRK14340 PRK14340 (dimethylallyl)adenosine tRNA methylthiotransferase; Provisional 445
15035 237680 PRK14341 PRK14341 lipoyl(octanoyl) transferase LipB. 213
15036 237681 PRK14342 PRK14342 lipoyl(octanoyl) transferase LipB. 213
15037 237682 PRK14343 PRK14343 lipoyl(octanoyl) transferase LipB. 235
15038 237683 PRK14344 PRK14344 lipoyl(octanoyl) transferase LipB. 223
15039 184638 PRK14345 PRK14345 lipoyl(octanoyl) transferase LipB. 234
15040 237684 PRK14346 PRK14346 lipoyl(octanoyl) transferase LipB. 230
15041 172823 PRK14347 PRK14347 lipoyl(octanoyl) transferase LipB. 209
15042 172824 PRK14348 PRK14348 lipoyl(octanoyl) transferase LipB. 221
15043 172825 PRK14349 PRK14349 lipoyl(octanoyl) transferase LipB. 220
15044 172826 PRK14350 ligA NAD-dependent DNA ligase LigA; Provisional 669
15045 184640 PRK14351 ligA NAD-dependent DNA ligase LigA; Provisional 689
15046 184641 PRK14352 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 482
15047 184642 PRK14353 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 446
15048 184643 PRK14354 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 458
15049 237685 PRK14355 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 459
15050 237686 PRK14356 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 456
15051 237687 PRK14357 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 448
15052 237688 PRK14358 glmU bifunctional N-acetylglucosamine-1-phosphate uridyltransferase/glucosamine-1-phosphate acetyltransferase; Provisional 481
15053 237689 PRK14359 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 430
15054 184646 PRK14360 glmU bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 450
15055 172837 PRK14361 PRK14361 Maf-like protein; Provisional 187
15056 172838 PRK14362 PRK14362 Maf-like protein; Provisional 207
15057 184647 PRK14363 PRK14363 Maf-like protein; Provisional 204
15058 184648 PRK14364 PRK14364 Maf-like protein; Provisional 181
15059 237690 PRK14365 PRK14365 Maf-like protein; Provisional 197
15060 237691 PRK14366 PRK14366 Maf-like protein; Provisional 195
15061 237692 PRK14367 PRK14367 Maf-like protein; Provisional 202
15062 237693 PRK14368 PRK14368 Maf-like protein; Provisional 193
15063 184650 PRK14369 PRK14369 membrane protein insertion efficiency factor YidD. 119
15064 184651 PRK14370 PRK14370 hypothetical protein; Provisional 120
15065 172847 PRK14371 PRK14371 hypothetical protein; Provisional 81
15066 172848 PRK14372 PRK14372 membrane protein insertion efficiency factor YidD. 97
15067 172849 PRK14373 PRK14373 hypothetical protein; Provisional 73
15068 237694 PRK14374 PRK14374 membrane protein insertion efficiency factor YidD. 118
15069 172851 PRK14375 PRK14375 membrane protein insertion efficiency factor YidD. 70
15070 237695 PRK14376 PRK14376 membrane protein insertion efficiency factor YidD. 176
15071 172853 PRK14377 PRK14377 membrane protein insertion efficiency factor YidD. 104
15072 237696 PRK14378 PRK14378 membrane protein insertion efficiency factor YidD. 103
15073 237697 PRK14379 PRK14379 membrane protein insertion efficiency factor YidD. 95
15074 184654 PRK14380 PRK14380 hypothetical protein; Provisional 81
15075 172857 PRK14381 PRK14381 membrane protein insertion efficiency factor YidD. 103
15076 172858 PRK14382 PRK14382 hypothetical protein; Provisional 68
15077 237698 PRK14383 PRK14383 membrane protein insertion efficiency factor YidD. 84
15078 172860 PRK14384 PRK14384 hypothetical protein; Provisional 56
15079 172861 PRK14385 PRK14385 membrane protein insertion efficiency factor YidD. 96
15080 172862 PRK14386 PRK14386 membrane protein insertion efficiency factor YidD. 106
15081 184655 PRK14387 PRK14387 membrane protein insertion efficiency factor YidD. 84
15082 172864 PRK14388 PRK14388 membrane protein insertion efficiency factor YidD. 82
15083 184656 PRK14389 PRK14389 membrane protein insertion efficiency factor YidD. 98
15084 172866 PRK14390 PRK14390 hypothetical protein; Provisional 63
15085 172867 PRK14391 PRK14391 membrane protein insertion efficiency factor YidD. 84
15086 237699 PRK14392 PRK14392 glycerol-3-phosphate acyltransferase. 207
15087 172869 PRK14393 PRK14393 glycerol-3-phosphate acyltransferase. 194
15088 172870 PRK14394 PRK14394 glycerol-3-phosphate acyltransferase. 195
15089 172871 PRK14395 PRK14395 glycerol-3-phosphate acyltransferase. 195
15090 184657 PRK14396 PRK14396 glycerol-3-phosphate acyltransferase. 190
15091 237700 PRK14397 PRK14397 membrane protein; Provisional 222
15092 237701 PRK14398 PRK14398 glycerol-3-phosphate acyltransferase. 191
15093 237702 PRK14399 PRK14399 membrane protein; Provisional 258
15094 237703 PRK14400 PRK14400 glycerol-3-phosphate acyltransferase. 201
15095 184658 PRK14401 PRK14401 membrane protein; Provisional 187
15096 237704 PRK14402 PRK14402 glycerol-3-phosphate acyltransferase. 198
15097 172879 PRK14403 PRK14403 glycerol-3-phosphate acyltransferase. 196
15098 237705 PRK14404 PRK14404 glycerol-3-phosphate acyltransferase. 201
15099 237706 PRK14405 PRK14405 membrane protein; Provisional 202
15100 172882 PRK14406 PRK14406 glycerol-3-phosphate acyltransferase. 199
15101 172883 PRK14407 PRK14407 glycerol-3-phosphate acyltransferase. 219
15102 172884 PRK14408 PRK14408 glycerol-3-phosphate acyltransferase. 257
15103 172885 PRK14409 PRK14409 glycerol-3-phosphate acyltransferase. 205
15104 237707 PRK14410 PRK14410 glycerol-3-phosphate acyltransferase. 235
15105 184663 PRK14411 PRK14411 membrane protein; Provisional 204
15106 184664 PRK14412 PRK14412 glycerol-3-phosphate acyltransferase. 198
15107 172889 PRK14413 PRK14413 glycerol-3-phosphate acyltransferase. 197
15108 184665 PRK14414 PRK14414 glycerol-3-phosphate acyltransferase. 210
15109 184666 PRK14415 PRK14415 glycerol-3-phosphate acyltransferase. 216
15110 184667 PRK14416 PRK14416 membrane protein; Provisional 200
15111 184668 PRK14417 PRK14417 membrane protein; Provisional 232
15112 237708 PRK14418 PRK14418 glycerol-3-phosphate acyltransferase. 236
15113 237709 PRK14419 PRK14419 membrane protein; Provisional 199
15114 237710 PRK14420 PRK14420 acylphosphatase; Provisional 91
15115 237711 PRK14421 PRK14421 acylphosphatase; Provisional 99
15116 237712 PRK14422 PRK14422 acylphosphatase; Provisional 93
15117 237713 PRK14423 PRK14423 acylphosphatase; Provisional 92
15118 184674 PRK14424 PRK14424 acylphosphatase; Provisional 94
15119 172901 PRK14425 PRK14425 acylphosphatase; Provisional 94
15120 184675 PRK14426 PRK14426 acylphosphatase; Provisional 92
15121 172903 PRK14427 PRK14427 acylphosphatase; Provisional 94
15122 172904 PRK14428 PRK14428 acylphosphatase; Provisional 97
15123 184676 PRK14429 PRK14429 acylphosphatase; Provisional 90
15124 172906 PRK14430 PRK14430 acylphosphatase; Provisional 92
15125 184677 PRK14431 PRK14431 acylphosphatase; Provisional 89
15126 184678 PRK14432 PRK14432 acylphosphatase; Provisional 93
15127 184679 PRK14433 PRK14433 acylphosphatase; Provisional 87
15128 184680 PRK14434 PRK14434 acylphosphatase; Provisional 92
15129 184681 PRK14435 PRK14435 acylphosphatase; Provisional 90
15130 172912 PRK14436 PRK14436 acylphosphatase; Provisional 91
15131 172913 PRK14437 PRK14437 acylphosphatase; Provisional 109
15132 172914 PRK14438 PRK14438 acylphosphatase; Provisional 91
15133 237714 PRK14439 PRK14439 acylphosphatase; Provisional 163
15134 172916 PRK14440 PRK14440 acylphosphatase; Provisional 90
15135 172917 PRK14441 PRK14441 acylphosphatase; Provisional 93
15136 172918 PRK14442 PRK14442 acylphosphatase; Provisional 91
15137 172919 PRK14443 PRK14443 acylphosphatase; Provisional 93
15138 172920 PRK14444 PRK14444 acylphosphatase; Provisional 92
15139 172921 PRK14445 PRK14445 acylphosphatase; Provisional 91
15140 172922 PRK14446 PRK14446 acylphosphatase; Provisional 88
15141 172923 PRK14447 PRK14447 acylphosphatase; Provisional 95
15142 172924 PRK14448 PRK14448 acylphosphatase; Provisional 90
15143 184682 PRK14449 PRK14449 acylphosphatase; Provisional 90
15144 184683 PRK14450 PRK14450 acylphosphatase; Provisional 91
15145 237715 PRK14451 PRK14451 acylphosphatase; Provisional 89
15146 237716 PRK14452 PRK14452 acylphosphatase; Provisional 107
15147 184685 PRK14453 PRK14453 chloramphenicol/florfenicol resistance protein; Provisional 347
15148 184686 PRK14454 PRK14454 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 342
15149 237717 PRK14455 PRK14455 ribosomal RNA large subunit methyltransferase N; Provisional 356
15150 172932 PRK14456 PRK14456 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 368
15151 184688 PRK14457 PRK14457 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 345
15152 184689 PRK14459 PRK14459 ribosomal RNA large subunit methyltransferase N; Provisional 373
15153 172935 PRK14460 PRK14460 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 354
15154 237718 PRK14461 PRK14461 ribosomal RNA large subunit methyltransferase N; Provisional 371
15155 237719 PRK14462 PRK14462 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 356
15156 237720 PRK14463 PRK14463 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 349
15157 184691 PRK14464 PRK14464 RNA methyltransferase. 344
15158 172940 PRK14465 PRK14465 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 342
15159 237721 PRK14466 PRK14466 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 345
15160 184693 PRK14467 PRK14467 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 348
15161 184694 PRK14468 PRK14468 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 343
15162 172944 PRK14469 PRK14469 23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 343
15163 172945 PRK14470 PRK14470 ribosomal RNA large subunit methyltransferase N; Provisional 336
15164 184695 PRK14471 PRK14471 F0F1 ATP synthase subunit B; Provisional 164
15165 172947 PRK14472 PRK14472 F0F1 ATP synthase subunit B; Provisional 175
15166 172948 PRK14473 PRK14473 F0F1 ATP synthase subunit B; Provisional 164
15167 184696 PRK14474 PRK14474 F0F1 ATP synthase subunit B; Provisional 250
15168 184697 PRK14475 PRK14475 F0F1 ATP synthase subunit B; Provisional 167
15169 237722 PRK14476 PRK14476 nitrogenase molybdenum-cofactor biosynthesis protein NifN; Provisional 455
15170 172952 PRK14477 PRK14477 bifunctional nitrogenase molybdenum-cofactor biosynthesis protein NifE/NifN; Provisional 917
15171 184699 PRK14478 PRK14478 nitrogenase molybdenum-cofactor biosynthesis protein NifE; Provisional 475
15172 237723 PRK14479 PRK14479 dihydroxyacetone kinase; Provisional 568
15173 237724 PRK14481 PRK14481 dihydroxyacetone kinase subunit DhaK; Provisional 331
15174 172956 PRK14483 PRK14483 DhaKLM operon coactivator DhaQ; Provisional 329
15175 184702 PRK14484 PRK14484 phosphotransferase mannose-specific family component IIA; Provisional 124
15176 184703 PRK14485 PRK14485 putative bifunctional cbb3-type cytochrome c oxidase subunit I/II; Provisional 712
15177 184704 PRK14486 PRK14486 putative bifunctional cbb3-type cytochrome c oxidase subunit II/cytochrome c; Provisional 294
15178 237725 PRK14487 PRK14487 cbb3-type cytochrome c oxidase subunit II; Provisional 217
15179 237726 PRK14488 PRK14488 cbb3-type cytochrome c oxidase subunit I; Provisional 473
15180 237727 PRK14489 PRK14489 putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobA/MobB; Provisional 366
15181 237728 PRK14490 PRK14490 putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MobA; Provisional 369
15182 237729 PRK14491 PRK14491 putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MoeA; Provisional 597
15183 237730 PRK14493 PRK14493 putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MoaE; Provisional 274
15184 237731 PRK14494 PRK14494 putative molybdopterin-guanine dinucleotide biosynthesis protein MobB/FeS domain-containing protein; Provisional 229
15185 172967 PRK14495 PRK14495 putative molybdopterin-guanine dinucleotide biosynthesis protein MobB/unknown domain fusion protein; Provisional 452
15186 172968 PRK14497 PRK14497 putative molybdopterin biosynthesis protein MoeA/unknown domain fusion protein; Provisional 546
15187 237732 PRK14498 PRK14498 putative molybdopterin biosynthesis protein MoeA/LysR substrate binding-domain-containing protein; Provisional 633
15188 237733 PRK14499 PRK14499 cyclic pyranopterin monophosphate synthase MoaC/MOSC-domain-containing protein. 308
15189 237734 PRK14500 PRK14500 putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MoaC/MobA; Provisional 346
15190 184712 PRK14501 PRK14501 putative bifunctional trehalose-6-phosphate synthase/HAD hydrolase subfamily IIB; Provisional 726
15191 184713 PRK14502 PRK14502 bifunctional mannosyl-3-phosphoglycerate synthase/mannosyl-3 phosphoglycerate phosphatase; Provisional 694
15192 237735 PRK14503 PRK14503 mannosyl-3-phosphoglycerate synthase; Provisional 393
15193 237736 PRK14504 PRK14504 photosynthetic reaction center subunit M; Provisional 315
15194 172976 PRK14505 PRK14505 bifunctional photosynthetic reaction center subunit L/M; Provisional 643
15195 184716 PRK14506 PRK14506 photosynthetic reaction center subunit L; Provisional 276
15196 237737 PRK14507 PRK14507 malto-oligosyltrehalose synthase. 1693
15197 237738 PRK14508 PRK14508 4-alpha-glucanotransferase; Provisional 497
15198 237739 PRK14510 PRK14510 bifunctional glycogen debranching protein GlgX/4-alpha-glucanotransferase. 1221
15199 237740 PRK14511 PRK14511 malto-oligosyltrehalose synthase. 879
15200 237741 PRK14512 PRK14512 ATP-dependent Clp protease proteolytic subunit; Provisional 197
15201 237742 PRK14513 PRK14513 ATP-dependent Clp protease proteolytic subunit; Provisional 201
15202 184722 PRK14514 PRK14514 ATP-dependent Clp endopeptidase proteolytic subunit ClpP. 221
15203 237743 PRK14515 PRK14515 aspartate ammonia-lyase; Provisional 479
15204 184724 PRK14520 rpsP 30S ribosomal protein S16; Provisional 155
15205 237744 PRK14521 rpsP 30S ribosomal protein S16; Provisional 186
15206 172988 PRK14522 rpsP 30S ribosomal protein S16; Provisional 116
15207 172989 PRK14523 rpsP 30S ribosomal protein S16; Provisional 137
15208 172990 PRK14524 rpsP 30S ribosomal protein S16; Provisional 94
15209 172991 PRK14525 rpsP 30S ribosomal protein S16; Provisional 88
15210 172992 PRK14526 PRK14526 adenylate kinase; Provisional 211
15211 237745 PRK14527 PRK14527 adenylate kinase; Provisional 191
15212 172994 PRK14528 PRK14528 adenylate kinase; Provisional 186
15213 237746 PRK14529 PRK14529 adenylate kinase; Provisional 223
15214 237747 PRK14530 PRK14530 adenylate kinase; Provisional 215
15215 172997 PRK14531 PRK14531 adenylate kinase; Provisional 183
15216 184729 PRK14532 PRK14532 adenylate kinase; Provisional 188
15217 184730 PRK14533 groES co-chaperonin GroES; Provisional 91
15218 173000 PRK14534 cysS cysteinyl-tRNA synthetase; Provisional 481
15219 173001 PRK14535 cysS cysteinyl-tRNA synthetase; Provisional 699
15220 184731 PRK14536 cysS cysteinyl-tRNA synthetase; Provisional 490
15221 237748 PRK14537 PRK14537 50S ribosomal protein L20/unknown domain fusion protein; Provisional 230
15222 173004 PRK14538 PRK14538 putative bifunctional signaling protein/50S ribosomal protein L9; Provisional 838
15223 184732 PRK14539 PRK14539 50S ribosomal protein L11/unknown domain fusion protein; Provisional 196
15224 184733 PRK14540 PRK14540 nucleoside diphosphate kinase; Provisional 134
15225 173007 PRK14541 PRK14541 nucleoside diphosphate kinase; Provisional 140
15226 173008 PRK14542 PRK14542 nucleoside diphosphate kinase; Provisional 137
15227 237749 PRK14543 PRK14543 nucleoside diphosphate kinase; Provisional 169
15228 173010 PRK14544 PRK14544 nucleoside diphosphate kinase; Provisional 183
15229 184734 PRK14545 PRK14545 nucleoside diphosphate kinase; Provisional 139
15230 184735 PRK14547 rplD 50S ribosomal protein L4; Provisional 298
15231 237750 PRK14548 PRK14548 50S ribosomal protein L23P; Provisional 84
15232 237751 PRK14549 PRK14549 50S ribosomal protein L29P; Provisional 69
15233 173015 PRK14550 rnhB ribonuclease HII; Provisional 204
15234 237752 PRK14551 rnhB ribonuclease HII; Provisional 212
15235 237753 PRK14552 PRK14552 C/D box methylation guide ribonucleoprotein complex aNOP56 subunit; Provisional 414
15236 184740 PRK14553 PRK14553 ribosomal-processing cysteine protease Prp. 108
15237 237754 PRK14554 PRK14554 tRNA pseudouridine(54/55) synthase Pus10. 422
15238 237755 PRK14555 PRK14555 RNA-binding protein. 145
15239 173021 PRK14556 pyrH UMP kinase. 249
15240 173022 PRK14557 pyrH uridylate kinase; Provisional 247
15241 173023 PRK14558 pyrH uridylate kinase; Provisional 231
15242 237756 PRK14559 PRK14559 serine/threonine phosphatase. 645
15243 237757 PRK14560 PRK14560 putative RNA-binding protein; Provisional 160
15244 184745 PRK14561 PRK14561 hypothetical protein; Provisional 194
15245 184746 PRK14562 PRK14562 haloacid dehalogenase superfamily protein; Provisional 204
15246 184747 PRK14563 PRK14563 ribosome modulation factor; Provisional 55
15247 237758 PRK14565 PRK14565 triosephosphate isomerase; Provisional 237
15248 184749 PRK14566 PRK14566 triosephosphate isomerase; Provisional 260
15249 173031 PRK14567 PRK14567 triosephosphate isomerase; Provisional 253
15250 184750 PRK14568 vanB D-alanine--D-lactate ligase; Provisional 343
15251 173033 PRK14569 PRK14569 D-alanyl-alanine synthetase A; Provisional 296
15252 173034 PRK14570 PRK14570 D-alanyl-alanine synthetase A; Provisional 364
15253 184751 PRK14571 PRK14571 D-alanyl-alanine synthetase A; Provisional 299
15254 173036 PRK14572 PRK14572 D-alanyl-alanine synthetase A; Provisional 347
15255 184752 PRK14573 PRK14573 bifunctional UDP-N-acetylmuramate--L-alanine ligase/D-alanine--D-alanine ligase. 809
15256 173038 PRK14574 hmsH poly-beta-1,6 N-acetyl-D-glucosamine export porin PgaA. 822
15257 173039 PRK14575 PRK14575 putative peptidase; Provisional 406
15258 173040 PRK14576 PRK14576 putative endopeptidase; Provisional 405
15259 173042 PRK14578 PRK14578 elongation factor P; Provisional 187
15260 184753 PRK14581 hmsF outer membrane N-deacetylase; Provisional 672
15261 184754 PRK14582 pgaB poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB. 671
15262 184755 PRK14583 hmsR poly-beta-1,6 N-acetyl-D-glucosamine synthase. 444
15263 184756 PRK14584 hmsS hemin storage system protein; Provisional 153
15264 173049 PRK14585 pgaD putative PGA biosynthesis protein; Provisional 137
15265 173050 PRK14586 PRK14586 tRNA pseudouridine(38-40) synthase TruA. 245
15266 173051 PRK14587 PRK14587 tRNA pseudouridine synthase ACD; Provisional 256
15267 173052 PRK14588 PRK14588 tRNA pseudouridine(38-40) synthase TruA. 272
15268 237759 PRK14589 PRK14589 tRNA pseudouridine(38-40) synthase TruA. 265
15269 173054 PRK14590 rimM 16S rRNA-processing protein RimM; Provisional 171
15270 173055 PRK14591 rimM 16S rRNA-processing protein RimM; Provisional 169
15271 173056 PRK14592 rimM 16S rRNA-processing protein RimM; Provisional 165
15272 237760 PRK14593 rimM ribosome maturation factor RimM. 184
15273 173058 PRK14594 rimM 16S rRNA-processing protein RimM; Provisional 166
15274 184757 PRK14595 PRK14595 peptide deformylase; Provisional 162
15275 184758 PRK14596 PRK14596 peptide deformylase; Provisional 199
15276 237761 PRK14597 PRK14597 peptide deformylase; Provisional 166
15277 237762 PRK14598 PRK14598 peptide deformylase; Provisional 187
15278 173063 PRK14599 trmD tRNA (guanine-N(1)-)-methyltransferase/unknown domain fusion protein; Provisional 222
15279 173064 PRK14600 ruvA Holliday junction branch migration protein RuvA. 186
15280 173065 PRK14601 ruvA Holliday junction branch migration protein RuvA. 183
15281 173066 PRK14602 ruvA Holliday junction branch migration protein RuvA. 203
15282 237763 PRK14603 ruvA Holliday junction branch migration protein RuvA. 197
15283 184760 PRK14604 ruvA Holliday junction branch migration protein RuvA. 195
15284 184761 PRK14605 ruvA Holliday junction branch migration protein RuvA. 194
15285 184762 PRK14606 ruvA Holliday junction branch migration protein RuvA. 188
15286 237764 PRK14607 PRK14607 bifunctional anthranilate synthase component II/anthranilate phosphoribosyltransferase. 534
15287 237765 PRK14608 PRK14608 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 290
15288 237766 PRK14609 PRK14609 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 269
15289 184766 PRK14610 PRK14610 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 283
15290 184767 PRK14611 PRK14611 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 275
15291 237767 PRK14612 PRK14612 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 276
15292 173077 PRK14613 PRK14613 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 297
15293 173078 PRK14614 PRK14614 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional 280
15294 237768 PRK14615 PRK14615 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 296
15295 237769 PRK14616 PRK14616 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 287
15296 237770 PRK14618 PRK14618 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional 328
15297 237771 PRK14619 PRK14619 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional 308
15298 173083 PRK14620 PRK14620 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional 326
15299 173084 PRK14621 PRK14621 YbaB/EbfC family nucleoid-associated protein. 111
15300 173085 PRK14622 PRK14622 YbaB/EbfC family nucleoid-associated protein. 103
15301 184771 PRK14623 PRK14623 YbaB/EbfC family nucleoid-associated protein. 106
15302 173087 PRK14624 PRK14624 YbaB/EbfC family nucleoid-associated protein. 115
15303 184772 PRK14625 PRK14625 hypothetical protein; Provisional 109
15304 173089 PRK14626 PRK14626 YbaB/EbfC family nucleoid-associated protein. 110
15305 173090 PRK14627 PRK14627 YbaB/EbfC family nucleoid-associated protein. 100
15306 173091 PRK14628 PRK14628 YbaB/EbfC family nucleoid-associated protein. 118
15307 173092 PRK14629 PRK14629 YbaB/EbfC family nucleoid-associated protein. 99
15308 173093 PRK14630 PRK14630 ribosome maturation factor RimP. 143
15309 237772 PRK14631 PRK14631 ribosome maturation factor RimP. 174
15310 173095 PRK14632 PRK14632 ribosome maturation factor RimP. 172
15311 173096 PRK14633 PRK14633 ribosome maturation factor RimP. 150
15312 173097 PRK14634 PRK14634 ribosome maturation factor RimP. 155
15313 184774 PRK14635 PRK14635 ribosome maturation factor RimP. 162
15314 237773 PRK14636 PRK14636 ribosome maturation protein RimP. 176
15315 237774 PRK14637 PRK14637 ribosome maturation factor RimP. 151
15316 184777 PRK14638 PRK14638 ribosome maturation factor RimP. 150
15317 173102 PRK14639 PRK14639 ribosome maturation factor RimP. 140
15318 173103 PRK14640 PRK14640 hypothetical protein; Provisional 152
15319 173104 PRK14641 PRK14641 ribosome maturation factor RimP. 173
15320 237775 PRK14642 PRK14642 ribosome maturation factor RimP. 197
15321 173106 PRK14643 PRK14643 ribosome maturation factor RimP. 164
15322 184779 PRK14644 PRK14644 hypothetical protein; Provisional 136
15323 184780 PRK14645 PRK14645 ribosome maturation factor RimP. 154
15324 173109 PRK14646 PRK14646 ribosome maturation factor RimP. 155
15325 173110 PRK14647 PRK14647 ribosome maturation factor RimP. 159
15326 173111 PRK14648 PRK14648 UDP-N-acetylmuramate dehydrogenase. 354
15327 173112 PRK14649 PRK14649 UDP-N-acetylmuramate dehydrogenase. 295
15328 173113 PRK14650 PRK14650 UDP-N-acetylmuramate dehydrogenase. 302
15329 237776 PRK14651 PRK14651 UDP-N-acetylmuramate dehydrogenase. 273
15330 237777 PRK14652 PRK14652 UDP-N-acetylmuramate dehydrogenase. 302
15331 237778 PRK14653 PRK14653 UDP-N-acetylmuramate dehydrogenase. 297
15332 173117 PRK14654 mraY phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional 302
15333 173118 PRK14655 mraY phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional 304
15334 237779 PRK14656 acpS holo-[acyl-carrier-protein] synthase. 126
15335 173120 PRK14657 acpS holo-[acyl-carrier-protein] synthase. 123
15336 173121 PRK14658 acpS holo-ACP synthase. 115
15337 237780 PRK14659 acpS holo-[acyl-carrier-protein] synthase. 122
15338 173123 PRK14660 acpS holo-[acyl-carrier-protein] synthase. 125
15339 184782 PRK14661 acpS holo-[acyl-carrier-protein] synthase. 169
15340 184783 PRK14662 acpS 4'-phosphopantetheinyl transferase; Provisional 120
15341 237781 PRK14663 acpS holo-[acyl-carrier-protein] synthase. 116
15342 173127 PRK14664 PRK14664 tRNA-specific 2-thiouridylase MnmA; Provisional 362
15343 173128 PRK14665 mnmA tRNA-specific 2-thiouridylase MnmA; Provisional 360
15344 237782 PRK14666 uvrC excinuclease ABC subunit C; Provisional 694
15345 237783 PRK14667 uvrC excinuclease ABC subunit C; Provisional 567
15346 184785 PRK14668 uvrC excinuclease ABC subunit C; Provisional 577
15347 237784 PRK14669 uvrC excinuclease ABC subunit C; Provisional 624
15348 173133 PRK14670 uvrC excinuclease ABC subunit C; Provisional 574
15349 237785 PRK14671 uvrC excinuclease ABC subunit C; Provisional 621
15350 173135 PRK14672 uvrC excinuclease ABC subunit C; Provisional 691
15351 237786 PRK14673 PRK14673 hypothetical protein; Provisional 137
15352 184788 PRK14674 PRK14674 hypothetical protein; Provisional 133
15353 173138 PRK14675 PRK14675 hypothetical protein; Provisional 125
15354 173139 PRK14676 PRK14676 hypothetical protein; Provisional 117
15355 184789 PRK14677 PRK14677 hypothetical protein; Provisional 107
15356 173141 PRK14678 PRK14678 hypothetical protein; Provisional 120
15357 173142 PRK14679 PRK14679 hypothetical protein; Provisional 128
15358 173143 PRK14680 PRK14680 hypothetical protein; Provisional 134
15359 237787 PRK14681 PRK14681 hypothetical protein; Provisional 158
15360 173145 PRK14682 PRK14682 hypothetical protein; Provisional 117
15361 173146 PRK14683 PRK14683 hypothetical protein; Provisional 122
15362 173147 PRK14684 PRK14684 hypothetical protein; Provisional 120
15363 173148 PRK14685 PRK14685 hypothetical protein; Provisional 177
15364 184791 PRK14686 PRK14686 hypothetical protein; Provisional 119
15365 237788 PRK14687 PRK14687 hypothetical protein; Provisional 173
15366 184792 PRK14688 PRK14688 hypothetical protein; Provisional 121
15367 173152 PRK14689 PRK14689 hypothetical protein; Provisional 124
15368 237789 PRK14690 PRK14690 molybdopterin biosynthesis protein MoeA; Provisional 419
15369 173154 PRK14691 PRK14691 3-oxoacyl-(acyl carrier protein) synthase II; Provisional 342
15370 173155 PRK14692 PRK14692 flagellar hook-associated protein FlgL. 749
15371 173156 PRK14693 PRK14693 hypothetical protein; Provisional 552
15372 237790 PRK14694 PRK14694 putative mercuric reductase; Provisional 468
15373 173158 PRK14695 PRK14695 serine/threonine transporter SstT; Provisional 319
15374 184793 PRK14696 tynA primary-amine oxidase. 721
15375 184794 PRK14697 PRK14697 bifunctional 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase/phosphatase; Provisional 233
15376 184795 PRK14698 PRK14698 V-type ATP synthase subunit A; Provisional 1017
15377 173162 PRK14699 PRK14699 replication factor A; Provisional 484
15378 173163 PRK14700 PRK14700 recombination factor protein RarA; Provisional 300
15379 237791 PRK14701 PRK14701 reverse gyrase; Provisional 1638
15380 237792 PRK14702 PRK14702 insertion element IS2 transposase InsD; Provisional 262
15381 237793 PRK14703 PRK14703 glutaminyl-tRNA synthetase/YqeY domain fusion protein; Provisional 771
15382 184798 PRK14704 PRK14704 anaerobic ribonucleoside triphosphate reductase; Provisional 618
15383 237794 PRK14705 PRK14705 glycogen branching enzyme; Provisional 1224
15384 237795 PRK14706 PRK14706 glycogen branching enzyme; Provisional 639
15385 173170 PRK14707 PRK14707 hypothetical protein; Provisional 2710
15386 173171 PRK14708 PRK14708 flagellin; Provisional 888
15387 173172 PRK14709 PRK14709 hypothetical protein; Provisional 469
15388 173173 PRK14710 PRK14710 hypothetical protein; Provisional 86
15389 173174 PRK14711 ureE urease accessory protein UreE; Provisional 191
15390 237796 PRK14712 PRK14712 conjugal transfer nickase/helicase TraI; Provisional 1623
15391 237797 PRK14713 PRK14713 bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 530
15392 237798 PRK14714 PRK14714 DNA-directed DNA polymerase II large subunit. 1337
15393 237799 PRK14715 PRK14715 DNA-directed DNA polymerase II large subunit. 1627
15394 237800 PRK14716 PRK14716 glycosyl transferase family protein. 504
15395 184803 PRK14717 PRK14717 putative glycine/sarcosine/betaine reductase complex protein A; Provisional 107
15396 173181 PRK14718 PRK14718 ribonuclease III; Provisional 467
15397 237801 PRK14719 PRK14719 bifunctional RNAse/5-amino-6-(5-phosphoribosylamino)uracil reductase; Provisional 360
15398 184804 PRK14720 PRK14720 transcription elongation factor GreA. 906
15399 173184 PRK14721 flhF flagellar biosynthesis regulator FlhF; Provisional 420
15400 173185 PRK14722 flhF flagellar biosynthesis regulator FlhF; Provisional 374
15401 237802 PRK14723 flhF flagellar biosynthesis regulator FlhF; Provisional 767
15402 237803 PRK14724 PRK14724 DNA topoisomerase III; Provisional 987
15403 237804 PRK14725 PRK14725 pyruvate kinase; Provisional 608
15404 237805 PRK14726 PRK14726 protein translocase subunit SecDF. 855
15405 237806 PRK14727 PRK14727 putative mercuric reductase; Provisional 479
15406 173191 PRK14729 miaA tRNA delta(2)-isopentenylpyrophosphate transferase; Provisional 300
15407 184807 PRK14730 coaE dephospho-CoA kinase; Provisional 195
15408 173193 PRK14731 coaE dephospho-CoA kinase; Provisional 208
15409 237807 PRK14732 coaE dephospho-CoA kinase; Provisional 196
15410 173195 PRK14733 coaE dephospho-CoA kinase; Provisional 204
15411 237808 PRK14734 coaE dephospho-CoA kinase; Provisional 200
15412 173197 PRK14735 atpC F0F1 ATP synthase subunit epsilon; Provisional 139
15413 173198 PRK14736 atpC F0F1 ATP synthase subunit epsilon; Provisional 133
15414 173199 PRK14737 gmk guanylate kinase; Provisional 186
15415 237809 PRK14738 gmk guanylate kinase; Provisional 206
15416 173201 PRK14740 kdbF K(+)-transporting ATPase subunit F. 29
15417 173202 PRK14741 spoVM stage V sporulation protein SpoVM. 26
15418 173203 PRK14742 thrL thr operon leader peptide; Provisional 28
15419 173204 PRK14743 thrL thr operon leader peptide; Provisional 22
15420 173205 PRK14744 PRK14744 leu operon leader peptide; Provisional 28
15421 173206 PRK14745 PRK14745 RepA leader peptide Tap; Provisional 25
15422 173207 PRK14746 PRK14746 RepA leader peptide Tap; Provisional 24
15423 184810 PRK14747 PRK14747 cytochrome b6-f complex subunit PetN; Provisional 29
15424 173209 PRK14748 kdpF K(+)-transporting ATPase subunit F. 29
15425 173210 PRK14749 PRK14749 cytochrome bd-II oxidase subunit CbdX. 30
15426 173211 PRK14750 kdpF K(+)-transporting ATPase subunit F. 29
15427 173212 PRK14751 PRK14751 tetracycline resistance determinant leader peptide; Provisional 28
15428 173213 PRK14752 PRK14752 delta-lysin family phenol-soluble modulin. 44
15429 184811 PRK14753 PRK14753 30S ribosomal protein Thx; Provisional 27
15430 173215 PRK14754 PRK14754 toxic peptide TisB; Provisional 29
15431 173216 PRK14755 PRK14755 transcriptional regulatory protein PufK; Provisional 20
15432 341227 PRK14756 small_mem_YnhF YnhF family membrane protein; Validated. 29
15433 173218 PRK14757 PRK14757 putative protamine-like protein; Provisional 29
15434 173219 PRK14758 PRK14758 hypothetical protein; Provisional 27
15435 173220 PRK14759 PRK14759 K(+)-transporting ATPase subunit F. 29
15436 173221 PRK14760 PRK14760 protein YoaJ. 24
15437 173222 PRK14761 PRK14761 fur leader peptide. 28
15438 173223 PRK14762 PRK14762 protein YohP. 27
15439 184813 PRK14763 PRK14763 pyrroloquinoline quinone precursor peptide PqqA. 23
15440 237810 PRK14764 PRK14764 lipoprotein signal peptidase; Provisional 209
15441 173227 PRK14766 PRK14766 lipoprotein signal peptidase; Provisional 201
15442 237811 PRK14767 PRK14767 lipoprotein signal peptidase; Provisional 174
15443 173229 PRK14768 PRK14768 lipoprotein signal peptidase; Provisional 148
15444 173230 PRK14769 PRK14769 lipoprotein signal peptidase; Provisional 156
15445 173231 PRK14770 PRK14770 lipoprotein signal peptidase; Provisional 167
15446 184816 PRK14771 PRK14771 lipoprotein signal peptidase; Provisional 165
15447 237812 PRK14772 PRK14772 lipoprotein signal peptidase; Provisional 190
15448 173234 PRK14773 PRK14773 lipoprotein signal peptidase; Provisional 192
15449 173235 PRK14774 PRK14774 lipoprotein signal peptidase; Provisional 185
15450 237813 PRK14775 PRK14775 lipoprotein signal peptidase; Provisional 170
15451 173237 PRK14776 PRK14776 lipoprotein signal peptidase; Provisional 170
15452 237814 PRK14777 PRK14777 lipoprotein signal peptidase; Provisional 184
15453 173239 PRK14778 PRK14778 lipoprotein signal peptidase; Provisional 186
15454 184818 PRK14779 PRK14779 lipoprotein signal peptidase; Provisional 159
15455 173241 PRK14780 PRK14780 lipoprotein signal peptidase; Provisional 263
15456 237815 PRK14781 PRK14781 lipoprotein signal peptidase; Provisional 153
15457 173243 PRK14782 PRK14782 signal peptidase II. 157
15458 173244 PRK14783 PRK14783 lipoprotein signal peptidase; Provisional 182
15459 184819 PRK14784 PRK14784 lipoprotein signal peptidase; Provisional 160
15460 184820 PRK14785 PRK14785 lipoprotein signal peptidase; Provisional 171
15461 173247 PRK14786 PRK14786 lipoprotein signal peptidase; Provisional 154
15462 173248 PRK14787 PRK14787 lipoprotein signal peptidase; Provisional 159
15463 237816 PRK14788 PRK14788 lipoprotein signal peptidase; Provisional 200
15464 173250 PRK14789 PRK14789 lipoprotein signal peptidase; Provisional 191
15465 173251 PRK14790 PRK14790 lipoprotein signal peptidase; Provisional 169
15466 237817 PRK14791 PRK14791 lipoprotein signal peptidase; Provisional 146
15467 237818 PRK14792 PRK14792 signal peptidase II. 159
15468 184823 PRK14793 PRK14793 lipoprotein signal peptidase; Provisional 150
15469 173255 PRK14794 PRK14794 lipoprotein signal peptidase; Provisional 136
15470 173256 PRK14795 PRK14795 lipoprotein signal peptidase; Provisional 158
15471 184824 PRK14796 PRK14796 lipoprotein signal peptidase; Provisional 161
15472 184825 PRK14797 PRK14797 lipoprotein signal peptidase; Provisional 150
15473 184826 PRK14799 thrS threonyl-tRNA synthetase; Provisional 545
15474 173265 PRK14804 PRK14804 ornithine carbamoyltransferase; Provisional 311
15475 237819 PRK14805 PRK14805 ornithine carbamoyltransferase; Provisional 302
15476 237820 PRK14806 PRK14806 bifunctional cyclohexadienyl dehydrogenase/3-phosphoshikimate 1-carboxyvinyltransferase; Provisional 735
15477 184829 PRK14807 PRK14807 histidinol-phosphate transaminase. 351
15478 173269 PRK14808 PRK14808 histidinol-phosphate transaminase. 335
15479 184830 PRK14809 PRK14809 histidinol-phosphate transaminase. 357
15480 173271 PRK14810 PRK14810 formamidopyrimidine-DNA glycosylase; Provisional 272
15481 184831 PRK14811 PRK14811 formamidopyrimidine-DNA glycosylase; Provisional 269
15482 173273 PRK14812 PRK14812 hypothetical protein; Provisional 119
15483 173274 PRK14813 PRK14813 NADH dehydrogenase subunit B; Provisional 189
15484 173275 PRK14814 PRK14814 NADH dehydrogenase subunit B; Provisional 186
15485 237821 PRK14815 PRK14815 NADH dehydrogenase subunit B; Provisional 183
15486 173277 PRK14816 PRK14816 NADH-quinone oxidoreductase subunit B. 182
15487 173278 PRK14817 PRK14817 NADH-quinone oxidoreductase subunit B. 181
15488 173279 PRK14818 PRK14818 NADH dehydrogenase subunit B; Provisional 173
15489 237822 PRK14819 PRK14819 NADH-quinone oxidoreductase subunit B. 264
15490 184833 PRK14820 PRK14820 NADH-quinone oxidoreductase subunit B. 180
15491 184834 PRK14821 PRK14821 XTP/dITP diphosphatase. 184
15492 184835 PRK14822 PRK14822 XTP/dITP diphosphatase. 200
15493 237823 PRK14823 PRK14823 putative deoxyribonucleoside-triphosphatase; Provisional 191
15494 237824 PRK14824 PRK14824 putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional 201
15495 173286 PRK14825 PRK14825 putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional 199
15496 173287 PRK14826 PRK14826 putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional 222
15497 173288 PRK14827 PRK14827 undecaprenyl pyrophosphate synthase; Provisional 296
15498 237825 PRK14828 PRK14828 undecaprenyl pyrophosphate synthase; Provisional 256
15499 237826 PRK14829 PRK14829 undecaprenyl pyrophosphate synthase; Provisional 243
15500 184840 PRK14830 PRK14830 undecaprenyl pyrophosphate synthase; Provisional 251
15501 184841 PRK14831 PRK14831 undecaprenyl pyrophosphate synthase; Provisional 249
15502 237827 PRK14832 PRK14832 undecaprenyl pyrophosphate synthase; Provisional 253
15503 237828 PRK14833 PRK14833 di-trans,poly-cis-decaprenylcistransferase. 233
15504 237829 PRK14834 PRK14834 isoprenyl transferase. 249
15505 237830 PRK14835 PRK14835 isoprenyl transferase. 275
15506 237831 PRK14836 PRK14836 undecaprenyl pyrophosphate synthase; Provisional 253
15507 173298 PRK14837 PRK14837 undecaprenyl pyrophosphate synthase; Provisional 230
15508 184846 PRK14838 PRK14838 isoprenyl transferase. 242
15509 237832 PRK14839 PRK14839 di-trans,poly-cis-decaprenylcistransferase. 239
15510 173301 PRK14840 PRK14840 undecaprenyl pyrophosphate synthase; Provisional 250
15511 173302 PRK14841 PRK14841 undecaprenyl pyrophosphate synthase; Provisional 233
15512 173303 PRK14842 PRK14842 undecaprenyl pyrophosphate synthase; Provisional 241
15513 184847 PRK14843 PRK14843 dihydrolipoamide acetyltransferase; Provisional 347
15514 173305 PRK14844 PRK14844 DNA-directed RNA polymerase subunit beta/beta'. 2836
15515 237833 PRK14845 PRK14845 translation initiation factor IF-2; Provisional 1049
15516 237834 PRK14846 truB tRNA pseudouridine synthase B; Provisional 345
15517 184849 PRK14847 PRK14847 2-isopropylmalate synthase. 333
15518 184850 PRK14848 PRK14848 type III secretion system effector deubiquitinase SseL. 317
15519 184851 PRK14849 PRK14849 autotransporter barrel domain-containing lipoprotein. 1806
15520 237835 PRK14850 PRK14850 penicillin-binding protein 1b; Provisional 764
15521 184853 PRK14851 PRK14851 hypothetical protein; Provisional 679
15522 184854 PRK14852 PRK14852 hypothetical protein; Provisional 989
15523 184855 PRK14853 nhaA pH-dependent sodium/proton antiporter; Provisional 423
15524 184856 PRK14854 nhaA pH-dependent sodium/proton antiporter; Provisional 383
15525 237836 PRK14855 nhaA pH-dependent sodium/proton antiporter; Provisional 423
15526 184858 PRK14856 nhaA sodium/proton antiporter NhaA. 438
15527 184859 PRK14857 tatA TatA/E family twin arginine-targeting protein translocase. 90
15528 184860 PRK14858 tatA twin arginine translocase protein A; Provisional 108
15529 184861 PRK14859 tatA twin arginine translocase protein A; Provisional 63
15530 184862 PRK14860 tatA twin arginine translocase protein A; Provisional 64
15531 237837 PRK14861 tatA twin arginine translocase protein A; Provisional 61
15532 237838 PRK14862 rimO 30S ribosomal protein S12 methylthiotransferase RimO. 440
15533 184865 PRK14863 PRK14863 bifunctional regulator KidO; Provisional 292
15534 184866 PRK14864 PRK14864 biofilm peroxide resistance protein BsmA. 104
15535 237839 PRK14865 rnpA ribonuclease P protein component. 116
15536 237840 PRK14866 PRK14866 hypothetical protein; Provisional 451
15537 237841 PRK14867 PRK14867 DNA topoisomerase VI subunit B; Provisional 659
15538 237842 PRK14868 PRK14868 DNA topoisomerase VI subunit B; Provisional 795
15539 237843 PRK14869 PRK14869 putative manganese-dependent inorganic diphosphatase. 546
15540 184872 PRK14872 PRK14872 rod shape-determining protein MreC; Provisional 337
15541 237844 PRK14873 PRK14873 primosomal protein N'. 665
15542 237845 PRK14874 PRK14874 aspartate-semialdehyde dehydrogenase; Provisional 334
15543 184875 PRK14875 PRK14875 acetoin dehydrogenase E2 subunit dihydrolipoyllysine-residue acetyltransferase; Provisional 371
15544 237846 PRK14876 PRK14876 conjugal transfer mating pair stabilization protein TraN; Provisional 928
15545 184877 PRK14877 PRK14877 conjugal transfer mating pair stabilization protein TraN; Provisional 1062
15546 184878 PRK14878 PRK14878 UGMP family protein; Provisional 323
15547 237847 PRK14879 PRK14879 Kae1-associated kinase Bud32. 211
15548 237848 PRK14886 PRK14886 KEOPS complex subunit Cgi121. 167
15549 237849 PRK14887 PRK14887 KEOPS complex Pcc1-like subunit; Provisional 84
15550 237850 PRK14888 PRK14888 KEOPS complex Pcc1-like subunit; Provisional 59
15551 184883 PRK14889 PRK14889 VKOR family protein; Provisional 143
15552 184884 PRK14890 PRK14890 putative Zn-ribbon RNA-binding protein; Provisional 59
15553 184885 PRK14891 PRK14891 50S ribosomal protein L24e/unknown domain fusion protein; Provisional 131
15554 184886 PRK14892 PRK14892 putative transcription elongation factor Elf1; Provisional 99
15555 184887 PRK14893 PRK14893 V-type ATP synthase subunit K; Provisional 161
15556 237851 PRK14894 PRK14894 glycyl-tRNA synthetase; Provisional 539
15557 184889 PRK14895 gltX glutamyl-tRNA synthetase; Provisional 513
15558 237852 PRK14896 ksgA 16S ribosomal RNA methyltransferase A. 258
15559 237853 PRK14897 PRK14897 unknown domain/DNA-directed RNA polymerase subunit A'' fusion protein; Provisional 509
15560 237854 PRK14898 PRK14898 DNA-directed RNA polymerase subunit A''; Provisional 858
15561 237855 PRK14900 valS valyl-tRNA synthetase; Provisional 1052
15562 237856 PRK14901 PRK14901 16S rRNA methyltransferase B; Provisional 434
15563 237857 PRK14902 PRK14902 16S rRNA (cytosine(967)-C(5))-methyltransferase RsmB. 444
15564 184896 PRK14903 PRK14903 16S rRNA methyltransferase B; Provisional 431
15565 237858 PRK14904 PRK14904 16S rRNA methyltransferase B; Provisional 445
15566 184898 PRK14905 PRK14905 triosephosphate isomerase/PTS system glucose/sucrose-specific transporter subunit IIB; Provisional 355
15567 184899 PRK14906 PRK14906 DNA-directed RNA polymerase subunit beta'. 1460
15568 184900 PRK14907 rplD 50S ribosomal protein L4; Provisional 295
15569 237859 PRK14908 PRK14908 glycine--tRNA ligase. 1000
15570 184902 PRK14938 PRK14938 Ser-tRNA(Thr) hydrolase; Provisional 387
15571 237860 PRK14939 gyrB DNA gyrase subunit B; Provisional 756
15572 184904 PRK14940 PRK14940 DNA polymerase III subunit beta; Provisional 367
15573 184905 PRK14941 PRK14941 DNA polymerase III subunit beta; Provisional 374
15574 184906 PRK14942 PRK14942 DNA polymerase III subunit beta; Provisional 373
15575 184907 PRK14943 PRK14943 DNA polymerase III subunit beta; Provisional 374
15576 184908 PRK14944 PRK14944 DNA polymerase III subunit beta; Provisional 375
15577 184909 PRK14945 PRK14945 DNA polymerase III subunit beta; Provisional 362
15578 184910 PRK14946 PRK14946 DNA polymerase III subunit beta; Provisional 366
15579 237861 PRK14947 PRK14947 DNA polymerase III subunit beta; Provisional 384
15580 237862 PRK14948 PRK14948 DNA polymerase III subunit gamma/tau. 620
15581 237863 PRK14949 PRK14949 DNA polymerase III subunits gamma and tau; Provisional 944
15582 237864 PRK14950 PRK14950 DNA polymerase III subunits gamma and tau; Provisional 585
15583 237865 PRK14951 PRK14951 DNA polymerase III subunits gamma and tau; Provisional 618
15584 237866 PRK14952 PRK14952 DNA polymerase III subunits gamma/tau. 584
15585 237867 PRK14953 PRK14953 DNA polymerase III subunits gamma and tau; Provisional 486
15586 184918 PRK14954 PRK14954 DNA polymerase III subunits gamma and tau; Provisional 620
15587 184919 PRK14955 PRK14955 DNA polymerase III subunits gamma and tau; Provisional 397
15588 184920 PRK14956 PRK14956 DNA polymerase III subunits gamma and tau; Provisional 484
15589 184921 PRK14957 PRK14957 DNA polymerase III subunits gamma and tau; Provisional 546
15590 184922 PRK14958 PRK14958 DNA polymerase III subunits gamma and tau; Provisional 509
15591 184923 PRK14959 PRK14959 DNA polymerase III subunits gamma and tau; Provisional 624
15592 237868 PRK14960 PRK14960 DNA polymerase III subunit gamma/tau. 702
15593 184925 PRK14961 PRK14961 DNA polymerase III subunits gamma and tau; Provisional 363
15594 237869 PRK14962 PRK14962 DNA polymerase III subunits gamma and tau; Provisional 472
15595 184927 PRK14963 PRK14963 DNA polymerase III subunits gamma and tau; Provisional 504
15596 237870 PRK14964 PRK14964 DNA polymerase III subunits gamma and tau; Provisional 491
15597 237871 PRK14965 PRK14965 DNA polymerase III subunits gamma and tau; Provisional 576
15598 184930 PRK14966 PRK14966 unknown domain/N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase fusion protein; Provisional 423
15599 184931 PRK14967 PRK14967 putative methyltransferase; Provisional 223
15600 237872 PRK14968 PRK14968 putative methyltransferase; Provisional 188
15601 237873 PRK14969 PRK14969 DNA polymerase III subunits gamma and tau; Provisional 527
15602 184934 PRK14970 PRK14970 DNA polymerase III subunits gamma and tau; Provisional 367
15603 237874 PRK14971 PRK14971 DNA polymerase III subunit gamma/tau. 614
15604 184936 PRK14973 PRK14973 DNA topoisomerase I; Provisional 936
15605 237875 PRK14974 PRK14974 signal recognition particle-docking protein FtsY. 336
15606 237876 PRK14975 PRK14975 bifunctional 3'-5' exonuclease/DNA polymerase; Provisional 553
15607 237877 PRK14976 PRK14976 5'-3' exonuclease; Provisional 281
15608 184940 PRK14977 PRK14977 bifunctional DNA-directed RNA polymerase A'/A'' subunit; Provisional 1321
15609 237878 PRK14979 PRK14979 DNA-directed RNA polymerase subunit D; Provisional 195
15610 184942 PRK14980 PRK14980 DNA-directed RNA polymerase subunit G; Provisional 127
15611 184943 PRK14981 PRK14981 RNA polymerase Rpb4 family protein. 112
15612 184944 PRK14982 PRK14982 acyl-ACP reductase; Provisional 340
15613 237879 PRK14983 PRK14983 aldehyde decarbonylase; Provisional 231
15614 237880 PRK14984 PRK14984 gluconate transporter. 438
15615 237881 PRK14985 PRK14985 maltodextrin phosphorylase; Provisional 798
15616 184948 PRK14986 PRK14986 glycogen phosphorylase; Provisional 815
15617 184949 PRK14987 PRK14987 HTH-type transcriptional regulator GntR. 331
15618 237882 PRK14988 PRK14988 GMP/IMP nucleotidase; Provisional 224
15619 184951 PRK14989 PRK14989 nitrite reductase subunit NirD; Provisional 847
15620 184952 PRK14990 PRK14990 anaerobic dimethyl sulfoxide reductase subunit A; Provisional 814
15621 237883 PRK14991 PRK14991 tetrathionate reductase subunit TtrA. 1031
15622 184954 PRK14992 PRK14992 tetrathionate reductase subunit TtrC. 335
15623 184955 PRK14993 PRK14993 tetrathionate reductase subunit TtrB. 244
15624 184956 PRK14994 PRK14994 SAM-dependent 16S ribosomal RNA C1402 ribose 2'-O-methyltransferase; Provisional 287
15625 184957 PRK14995 PRK14995 SmvA family efflux MFS transporter. 495
15626 184958 PRK14996 PRK14996 TetR family transcriptional regulator; Provisional 192
15627 184959 PRK14997 PRK14997 LysR family transcriptional regulator; Provisional 301
15628 184960 PRK14998 PRK14998 cold shock-like protein CspD; Provisional 73
15629 184961 PRK14999 PRK14999 histidine utilization repressor; Provisional 241
15630 184962 PRK15000 PRK15000 peroxiredoxin C. 200
15631 184963 PRK15001 PRK15001 23S rRNA (guanine(1835)-N(2))-methyltransferase RlmG. 378
15632 184964 PRK15002 PRK15002 redox-sensitive transcriptional activator SoxR. 154
15633 184965 PRK15003 PRK15003 cytochrome d ubiquinol oxidase subunit II. 379
15634 184966 PRK15004 PRK15004 adenosylcobalamin/alpha-ribazole phosphatase. 199
15635 184967 PRK15005 PRK15005 universal stress protein UspF. 144
15636 184968 PRK15006 PRK15006 thiosulfate reductase cytochrome B subunit; Provisional 261
15637 184969 PRK15007 PRK15007 arginine ABC transporter substrate-binding protein. 243
15638 184970 PRK15008 PRK15008 HTH-type transcriptional regulator RutR; Provisional 212
15639 184971 PRK15009 PRK15009 GDP-mannose pyrophosphatase NudK; Provisional 191
15640 184972 PRK15010 PRK15010 lysine/arginine/ornithine ABC transporter substrate-binding protein ArgT. 260
15641 184973 PRK15011 PRK15011 sugar efflux transporter SetB. 393
15642 184974 PRK15012 PRK15012 menaquinone-specific isochorismate synthase; Provisional 431
15643 184975 PRK15014 PRK15014 6-phospho-beta-glucosidase BglA; Provisional 477
15644 184976 PRK15015 PRK15015 carbon starvation protein CstA. 701
15645 184977 PRK15016 PRK15016 isochorismate synthase EntC; Provisional 391
15646 184978 PRK15017 PRK15017 cytochrome o ubiquinol oxidase subunit I; Provisional 663
15647 184979 PRK15018 PRK15018 1-acyl-sn-glycerol-3-phosphate acyltransferase; Provisional 245
15648 184980 PRK15019 PRK15019 cysteine desulfurase sulfur acceptor subunit CsdE. 147
15649 237884 PRK15020 PRK15020 ethanolamine utilization cob(I)yrinic acid a,c-diamide adenosyltransferase EutT. 267
15650 184982 PRK15021 PRK15021 microcin C ABC transporter permease; Provisional 341
15651 184983 PRK15022 PRK15022 non-heme ferritin-like protein. 167
15652 184984 PRK15023 PRK15023 L-serine deaminase; Provisional 454
15653 184985 PRK15025 PRK15025 ureidoglycolate dehydrogenase; Provisional 349
15654 184986 PRK15026 PRK15026 aminoacyl-histidine dipeptidase; Provisional 485
15655 184987 PRK15027 PRK15027 xylulokinase; Provisional 484
15656 184988 PRK15028 PRK15028 cytochrome d ubiquinol oxidase subunit II. 378
15657 184989 PRK15029 PRK15029 arginine decarboxylase; Provisional 755
15658 184990 PRK15030 PRK15030 multidrug efflux RND transporter periplasmic adaptor subunit AcrA. 397
15659 184991 PRK15031 PRK15031 5-carboxymethyl-2-hydroxymuconate Delta-isomerase. 126
15660 184992 PRK15032 PRK15032 pentaheme c-type cytochrome TorC. 390
15661 237885 PRK15033 PRK15033 tricarballylate utilization 4Fe-4S protein TcuB. 389
15662 184994 PRK15034 PRK15034 nitrate/nitrite transport protein NarU; Provisional 462
15663 184995 PRK15035 PRK15035 cytochrome bd-II oxidase subunit 1; Provisional 514
15664 184996 PRK15036 PRK15036 hydroxyisourate hydrolase; Provisional 137
15665 184997 PRK15037 PRK15037 D-mannonate oxidoreductase; Provisional 486
15666 184998 PRK15038 PRK15038 autoinducer 2 ABC transporter permease LsrD. 330
15667 184999 PRK15039 PRK15039 Ni(II)/Co(II)-binding transcriptional repressor RcnR. 90
15668 185000 PRK15040 PRK15040 L-serine ammonia-lyase. 454
15669 185001 PRK15041 PRK15041 methyl-accepting chemotaxis protein. 554
15670 237886 PRK15042 pduD propanediol/glycerol family dehydratase medium subunit. 219
15671 185003 PRK15043 PRK15043 HTH-type transcriptional regulator MlrA. 243
15672 185004 PRK15044 PRK15044 transcriptional regulator SirC; Provisional 295
15673 185005 PRK15045 PRK15045 cellulose biosynthesis protein BcsE; Provisional 519
15674 237887 PRK15046 PRK15046 2-aminoethylphosphonate ABC transporter substrate-binding protein; Provisional 349
15675 185007 PRK15047 PRK15047 N-hydroxyarylamine O-acetyltransferase; Provisional 281
15676 185008 PRK15048 PRK15048 methyl-accepting chemotaxis protein II; Provisional 553
15677 185009 PRK15049 PRK15049 L-asparagine permease. 499
15678 237888 PRK15050 PRK15050 2-aminoethylphosphonate transport system permease PhnU; Provisional 296
15679 185011 PRK15051 PRK15051 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol flippase subunit ArnE; Provisional 111
15680 237889 PRK15052 PRK15052 tagatose-bisphosphate aldolase subunit GatZ. 421
15681 185013 PRK15053 dpiB sensor histidine kinase DpiB; Provisional 545
15682 185014 PRK15054 PRK15054 nitrate reductase molybdenum cofactor assembly chaperone. 231
15683 237890 PRK15055 PRK15055 anaerobic sulfite reductase subunit AsrA. 344
15684 185016 PRK15056 PRK15056 manganese/iron ABC transporter ATP-binding protein. 272
15685 185017 PRK15057 PRK15057 UDP-glucose 6-dehydrogenase; Provisional 388
15686 185018 PRK15058 PRK15058 cytochrome b562; Provisional 128
15687 185019 PRK15059 PRK15059 2-hydroxy-3-oxopropionate reductase. 292
15688 185020 PRK15060 PRK15060 2,3-diketo-L-gulonate transporter large permease YiaN. 425
15689 237891 PRK15061 PRK15061 catalase/peroxidase. 726
15690 237892 PRK15062 PRK15062 hydrogenase isoenzymes formation protein HypD; Provisional 364
15691 237893 PRK15063 PRK15063 isocitrate lyase; Provisional 428
15692 237894 PRK15064 PRK15064 ABC transporter ATP-binding protein; Provisional 530
15693 237895 PRK15065 PRK15065 mannose/fructose/sorbose family PTS transporter subunit IIC. 262
15694 237896 PRK15066 PRK15066 inner membrane transport permease; Provisional 257
15695 237897 PRK15067 PRK15067 ethanolamine ammonia-lyase subunit EutB. 461
15696 237898 PRK15068 PRK15068 tRNA 5-methoxyuridine(34)/uridine 5-oxyacetic acid(34) synthase CmoB. 322
15697 185029 PRK15069 PRK15069 histidine ABC transporter permease HisM. 234
15698 237899 PRK15070 PRK15070 phosphate propanoyltransferase. 211
15699 237900 PRK15071 PRK15071 lipopolysaccharide ABC transporter permease; Provisional 356
15700 237901 PRK15072 PRK15072 D-galactonate dehydratase family protein. 404
15701 185033 PRK15074 PRK15074 inosine/guanosine kinase; Provisional 434
15702 237902 PRK15075 PRK15075 tricarballylate/proton symporter TcuC. 434
15703 185035 PRK15076 PRK15076 alpha-galactosidase; Provisional 431
15704 237903 PRK15078 PRK15078 polysaccharide export protein Wza; Provisional 379
15705 185037 PRK15079 PRK15079 oligopeptide ABC transporter ATP-binding protein OppF; Provisional 331
15706 237904 PRK15080 PRK15080 ethanolamine utilization protein EutJ; Provisional 267
15707 185039 PRK15081 PRK15081 glutathione ABC transporter permease GsiC; Provisional 306
15708 185040 PRK15082 PRK15082 glutathione ABC transporter permease GsiD; Provisional 301
15709 237905 PRK15083 PRK15083 PTS system mannitol-specific transporter subunit IICBA; Provisional 639
15710 185042 PRK15084 PRK15084 formate hydrogenlyase maturation protein HycH; Provisional 133
15711 237906 PRK15086 PRK15086 ethanolamine utilization protein EutH; Provisional 372
15712 185044 PRK15087 PRK15087 hemolysin; Provisional 219
15713 185045 PRK15088 PRK15088 PTS system mannose-specific transporter subunits IIAB; Provisional 322
15714 185046 PRK15090 PRK15090 DNA-binding transcriptional regulator KdgR; Provisional 257
15715 185047 PRK15091 PRK15091 phospholipid-binding lipoprotein MlaA. 251
15716 237907 PRK15092 PRK15092 DNA-binding transcriptional repressor LrhA; Provisional 310
15717 185049 PRK15093 PRK15093 peptide ABC transporter ATP-binding protein SapD. 330
15718 185050 PRK15094 PRK15094 magnesium/cobalt transporter CorC. 292
15719 237908 PRK15095 PRK15095 FKBP-type peptidyl-prolyl cis-trans isomerase; Provisional 156
15720 185052 PRK15097 PRK15097 cytochrome bd-I ubiquinol oxidase subunit CydA. 522
15721 185053 PRK15098 PRK15098 beta-glucosidase BglX. 765
15722 185054 PRK15099 PRK15099 lipid III flippase WzxE. 416
15723 185055 PRK15100 PRK15100 cystine ABC transporter permease. 220
15724 185056 PRK15101 PRK15101 protease3; Provisional 961
15725 237909 PRK15102 PRK15102 trimethylamine-N-oxide reductase TorA. 825
15726 237910 PRK15103 PRK15103 membrane integrity-associated transporter subunit PqiA. 419
15727 185059 PRK15104 PRK15104 oligopeptide ABC transporter substrate-binding protein OppA; Provisional 543
15728 185060 PRK15105 PRK15105 peptidoglycan synthase FtsI; Provisional 578
15729 237911 PRK15106 PRK15106 nucleoside-specific channel-forming protein Tsx; Provisional 289
15730 185062 PRK15107 PRK15107 glutamate/aspartate ABC transporter permease GltK. 224
15731 185063 PRK15108 PRK15108 biotin synthase; Provisional 345
15732 185064 PRK15109 PRK15109 antimicrobial peptide ABC transporter periplasmic binding protein SapA; Provisional 547
15733 185065 PRK15110 PRK15110 peptide ABC transporter permease SapB. 321
15734 185066 PRK15111 PRK15111 peptide ABC transporter permease SapC. 296
15735 185067 PRK15112 PRK15112 peptide ABC transporter ATP-binding protein SapF. 267
15736 185068 PRK15113 PRK15113 glutathione transferase. 214
15737 185069 PRK15114 PRK15114 tRNA (cytidine/uridine-2'-O-)-methyltransferase TrmJ; Provisional 245
15738 185070 PRK15115 PRK15115 response regulator GlrR; Provisional 444
15739 185071 PRK15116 PRK15116 sulfur acceptor protein CsdL; Provisional 268
15740 237912 PRK15117 PRK15117 phospholipid-binding protein MlaC. 211
15741 185073 PRK15118 PRK15118 universal stress protein UspA. 144
15742 237913 PRK15119 PRK15119 UDP-N-acetylglucosamine--undecaprenyl-phosphate N-acetylglucosaminephosphotransferase. 365
15743 185075 PRK15120 PRK15120 lipopolysaccharide ABC transporter permease LptF; Provisional 366
15744 185076 PRK15121 PRK15121 MDR efflux pump AcrAB transcriptional activator RobA. 289
15745 237914 PRK15122 PRK15122 magnesium-transporting ATPase; Provisional 903
15746 237915 PRK15123 PRK15123 lipopolysaccharide core heptose(I) kinase RfaP; Provisional 268
15747 185079 PRK15124 PRK15124 RNA 2',3'-cyclic phosphodiesterase. 176
15748 185080 PRK15126 PRK15126 HMP-PP phosphatase. 272
15749 185081 PRK15127 PRK15127 multidrug efflux RND transporter permease subunit AcrB. 1049
15750 185082 PRK15128 PRK15128 23S rRNA (cytosine(1962)-C(5))-methyltransferase RlmI. 396
15751 185083 PRK15129 PRK15129 L-Ala-D/L-Glu epimerase; Provisional 321
15752 237916 PRK15130 PRK15130 spermidine N1-acetyltransferase; Provisional 186
15753 185085 PRK15131 PRK15131 mannose-6-phosphate isomerase; Provisional 389
15754 185086 PRK15132 PRK15132 tyrosine transporter TyrP; Provisional 403
15755 185087 PRK15133 PRK15133 microcin C ABC transporter permease YejB; Provisional 364
15756 237917 PRK15134 PRK15134 microcin C ABC transporter ATP-binding protein YejF; Provisional 529
15757 185089 PRK15135 PRK15135 histidine ABC transporter permease HisQ. 228
15758 185090 PRK15136 PRK15136 multidrug efflux MFS transporter periplasmic adaptor subunit EmrA. 390
15759 185091 PRK15137 PRK15137 DNA-specific endonuclease I; Provisional 235
15760 185092 PRK15138 PRK15138 alcohol dehydrogenase. 387
15761 185093 PRK15171 PRK15171 lipopolysaccharide 3-alpha-galactosyltransferase. 334
15762 237918 PRK15172 PRK15172 aldose-1-epimerase. 300
15763 185095 PRK15173 PRK15173 peptidase; Provisional 323
15764 185096 PRK15174 PRK15174 Vi polysaccharide transport protein VexE. 656
15765 185097 PRK15175 PRK15175 Vi polysaccharide ABC transporter protein VexA. 355
15766 185098 PRK15176 PRK15176 Vi polysaccharide ABC transporter inner membrane protein VexB. 264
15767 185099 PRK15177 PRK15177 Vi polysaccharide ABC transporter ATP-binding protein VexC. 213
15768 185100 PRK15178 PRK15178 Vi polysaccharide ABC transporter inner membrane protein VexD. 434
15769 185101 PRK15179 PRK15179 Vi polysaccharide biosynthesis protein TviE; Provisional 694
15770 185102 PRK15180 PRK15180 Vi polysaccharide biosynthesis protein TviD; Provisional 831
15771 185103 PRK15181 PRK15181 Vi polysaccharide biosynthesis UDP-N-acetylglucosaminuronic acid C-4 epimerase TviC. 348
15772 185104 PRK15182 PRK15182 Vi polysaccharide biosynthesis UDP-N-acetylglucosamine C-6 dehydrogenase TviB. 425
15773 185105 PRK15183 PRK15183 Vi polysaccharide biosynthesis regulator TviA. 143
15774 185106 PRK15184 PRK15184 curli production assembly/transport protein CsgG; Provisional 277
15775 185107 PRK15185 PRK15185 transcriptional regulator HilD; Provisional 309
15776 185108 PRK15186 PRK15186 AraC family transcriptional regulator; Provisional 291
15777 185109 PRK15187 PRK15187 fimbrial protein BcfA; Provisional 180
15778 185110 PRK15188 PRK15188 fimbrial chaperone protein BcfB; Provisional 228
15779 185111 PRK15189 PRK15189 fimbrial protein BcfD; Provisional 335
15780 185112 PRK15190 PRK15190 fimbrial protein BcfE; Provisional 181
15781 185113 PRK15191 PRK15191 fimbrial protein BcfF; Provisional 172
15782 185114 PRK15192 PRK15192 pili/flagellar-assembly chaperone. 234
15783 237919 PRK15193 PRK15193 outer membrane usher protein; Provisional 876
15784 237920 PRK15194 PRK15194 type 1 fimbrial protein subunit FimA. 185
15785 185117 PRK15195 PRK15195 molecular chaperone FimC. 229
15786 185118 PRK15196 PRK15196 type III secretion system effector PipB2. 350
15787 185119 PRK15197 PRK15197 type III secretion system effector PipB. 291
15788 185120 PRK15198 PRK15198 outer membrane usher protein FimD. 860
15789 237921 PRK15199 fimH type 1 fimbrin D-mannose specific adhesin FimH. 335
15790 185122 PRK15200 PRK15200 type 1 fimbrial protein subunit FimI. 177
15791 185123 PRK15201 PRK15201 fimbriae biosynthesis transcriptional regulator FimW. 198
15792 237922 PRK15202 PRK15202 type III secretion system chaperone. 117
15793 185125 PRK15203 PRK15203 4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase; Provisional 429
15794 185126 PRK15204 PRK15204 undecaprenyl-phosphate galactose phosphotransferase; Provisional 476
15795 237923 PRK15205 PRK15205 long polar fimbrial protein LpfE; Provisional 176
15796 237924 PRK15206 PRK15206 long polar fimbrial protein LpfD; Provisional 359
15797 185129 PRK15207 PRK15207 outer membrane usher protein LpfC. 842
15798 237925 PRK15208 PRK15208 molecular chaperone LpfB. 228
15799 185131 PRK15209 PRK15209 long polar fimbrial protein LpfA; Provisional 174
15800 185132 PRK15210 PRK15210 fimbrial protein. 194
15801 185133 PRK15211 PRK15211 fimbrial chaperone. 229
15802 185134 PRK15212 PRK15212 virulence protein SpvA; Provisional 255
15803 237926 PRK15213 PRK15213 outer membrane usher protein PefC. 797
15804 185136 PRK15214 PRK15214 fimbrial protein PefA; Provisional 172
15805 185137 PRK15215 PRK15215 fimbriae biosynthesis regulatory protein; Provisional 100
15806 185138 PRK15216 PRK15216 putative fimbrial biosynthesis regulatory protein; Provisional 340
15807 185139 PRK15217 PRK15217 fimbrial outer membrane usher protein; Provisional 826
15808 185140 PRK15218 PRK15218 fimbrial assembly chaperone. 226
15809 237927 PRK15219 PRK15219 carbonic anhydrase; Provisional 245
15810 237928 PRK15220 PRK15220 fimbrial protein YehD. 178
15811 185143 PRK15221 PRK15221 Saf-pilin pilus formation protein SafA; Provisional 165
15812 185144 PRK15222 PRK15222 putative pilin structural protein SafD; Provisional 156
15813 185145 PRK15223 PRK15223 fimbrial biogenesis outer membrane usher protein. 836
15814 185146 PRK15224 PRK15224 pili assembly chaperone PapD. 237
15815 185147 PRK15228 PRK15228 fimbrial protein SefA; Provisional 165
15816 185148 PRK15231 PRK15231 fimbrial adhesin protein SefD; Provisional 150
15817 185149 PRK15233 PRK15233 fimbrial chaperone SefB. 246
15818 185150 PRK15235 PRK15235 outer membrane fimbrial usher protein SefC; Provisional 814
15819 237929 PRK15238 PRK15238 inner membrane transporter YjeM; Provisional 496
15820 185152 PRK15239 PRK15239 putative fimbrial protein StaA; Provisional 197
15821 185153 PRK15240 PRK15240 resistance to complement killing; Provisional 185
15822 185154 PRK15241 PRK15241 putative fimbrial protein StaD; Provisional 188
15823 185155 PRK15243 PRK15243 virulence genes transcriptional activator SpvR. 297
15824 185156 PRK15244 PRK15244 type III secretion system effector NAD(+)--protein-arginine ADP-ribosyltransferase SpvB. 591
15825 185157 PRK15245 PRK15245 type III secretion system effector phosphothreonine lyase. 241
15826 185158 PRK15246 PRK15246 fimbrial assembly chaperone. 233
15827 185159 PRK15247 PRK15247 fimbrial usher protein StbD. 441
15828 185160 PRK15248 PRK15248 fimbrial outer membrane usher protein. 853
15829 237930 PRK15249 PRK15249 fimbrial chaperone. 253
15830 185162 PRK15250 PRK15250 type III secretion system effector cysteine hydrolase SpvD. 216
15831 237931 PRK15251 PRK15251 cytolethal distending toxin subunit B family protein. 271
15832 237932 PRK15252 PRK15252 putative fimbrial-like adhesin protein. 344
15833 185165 PRK15253 PRK15253 putative fimbrial assembly chaperone protein StcB; Provisional 242
15834 185166 PRK15254 PRK15254 fimbrial chaperone protein StdC; Provisional 239
15835 185167 PRK15255 PRK15255 fimbrial outer membrane usher protein StdB; Provisional 829
15836 185168 PRK15260 PRK15260 fimbrial protein SteF; Provisional 178
15837 185169 PRK15261 PRK15261 fimbrial protein SteA; Provisional 195
15838 237933 PRK15262 PRK15262 putative fimbrial protein StaF; Provisional 197
15839 185171 PRK15263 PRK15263 fimbrial-like protein. 196
15840 185172 PRK15265 PRK15265 subtilase cytotoxin subunit B-like protein; Provisional 134
15841 185173 PRK15266 PRK15266 subtilase cytotoxin subunit B; Provisional 135
15842 185174 PRK15267 PRK15267 subtilase cytotoxin subunit B-like protein; Provisional 141
15843 185175 PRK15272 PRK15272 pertussis-like toxin subunit ArtA. 242
15844 185176 PRK15273 PRK15273 fimbrial biogenesis outer membrane usher protein. 881
15845 185177 PRK15274 PRK15274 putative periplasmic fimbrial chaperone protein SteC; Provisional 257
15846 185178 PRK15275 PRK15275 putative fimbrial protein SteD; Provisional 166
15847 185179 PRK15276 PRK15276 putative fimbrial subunit SteE; Provisional 153
15848 185180 PRK15278 PRK15278 type III secretion protein BopE; Provisional 261
15849 185181 PRK15279 PRK15279 type III secretion system guanine nucleotide exchange factor SopE. 240
15850 185182 PRK15280 PRK15280 type III secretion system guanine nucleotide exchange factor SopE2. 240
15851 185183 PRK15283 PRK15283 fimbrial major subunit StfA. 186
15852 185184 PRK15284 PRK15284 putative fimbrial outer membrane usher protein StfC; Provisional 881
15853 185185 PRK15285 PRK15285 fimbrial chaperone StfD. 250
15854 185186 PRK15286 PRK15286 putative minor fimbrial subunit StfE; Provisional 170
15855 185187 PRK15287 PRK15287 fimbrial minor subunit StfF. 158
15856 185188 PRK15288 PRK15288 putative minor fimbrial subunit StfG; Provisional 176
15857 185189 PRK15289 lpfA fimbrial protein; Provisional 190
15858 237934 PRK15290 lfpB fimbrial chaperone. 243
15859 185191 PRK15291 PRK15291 fimbrial protein StgD; Provisional 355
15860 237935 PRK15292 PRK15292 type 1 fimbrial protein. 365
15861 185193 PRK15293 PRK15293 fimbrial protein SthD. 185
15862 185194 PRK15294 PRK15294 fimbrial outer membrane usher protein. 845
15863 237936 PRK15295 PRK15295 fimbrial assembly chaperone. 226
15864 185196 PRK15296 PRK15296 fimbrial protein SthA. 181
15865 185197 PRK15297 PRK15297 fimbrial protein. 359
15866 185198 PRK15298 PRK15298 fimbrial outer membrane usher protein. 848
15867 185199 PRK15299 PRK15299 fimbrial chaperone protein StiB; Provisional 227
15868 185200 PRK15300 PRK15300 fimbrial protein StiA; Provisional 179
15869 185201 PRK15301 PRK15301 hypothetical protein; Provisional 186
15870 185202 PRK15302 PRK15302 hypothetical protein; Provisional 229
15871 185203 PRK15303 PRK15303 hypothetical protein; Provisional 229
15872 237937 PRK15304 PRK15304 putative fimbrial outer membrane usher protein; Provisional 801
15873 185205 PRK15305 PRK15305 fimbrial protein StkG. 353
15874 237938 PRK15306 PRK15306 fimbrial protein. 190
15875 185207 PRK15307 PRK15307 major fimbrial protein StkA; Provisional 201
15876 237939 PRK15308 PRK15308 fimbrial protein TcfA. 234
15877 185209 PRK15309 PRK15309 fimbrial protein TcfB. 191
15878 185210 PRK15310 PRK15310 fimbrial outer membrane usher protein TcfC; Provisional 895
15879 185211 PRK15311 PRK15311 fimbrial protein TcfD. 359
15880 185212 PRK15312 PRK15312 antimicrobial resistance protein Mig-14; Provisional 298
15881 237940 PRK15313 PRK15313 intestinal colonization autotransporter adhesin MisL. 955
15882 185214 PRK15314 PRK15314 outer membrane protein RatB; Provisional 2435
15883 237941 PRK15315 PRK15315 outer membrane protein RatA; Provisional 1865
15884 185216 PRK15316 PRK15316 RatA-like protein; Provisional 2683
15885 237942 PRK15317 PRK15317 alkyl hydroperoxide reductase subunit F; Provisional 517
15886 237943 PRK15318 PRK15318 intimin-like protein SinH; Provisional 730
15887 185219 PRK15319 PRK15319 fibronectin-binding autotransporter adhesin ShdA. 2039
15888 185220 PRK15320 PRK15320 transcriptional activator SprB; Provisional 251
15889 185221 PRK15321 PRK15321 type III secretion system effector protein OrgC. 120
15890 185222 PRK15322 PRK15322 oxygen-regulated invasion protein OrgB. 210
15891 185223 PRK15323 PRK15323 oxygen-regulated invasion protein OrgA. 167
15892 185224 PRK15324 PRK15324 EscJ/YscJ/HrcJ family type III secretion inner membrane ring protein. 252
15893 185225 PRK15325 PRK15325 type III secretion system inner rod protein PrgJ. 80
15894 185226 PRK15326 PRK15326 type III secretion system needle complex protein. 80
15895 237944 PRK15327 PRK15327 PrgH/EprH family type III secretion inner membrane ring protein. 393
15896 185228 PRK15328 PRK15328 type III secretion system invasion protein IagB. 160
15897 237945 PRK15329 PRK15329 chaperone SicP. 138
15898 185230 PRK15330 PRK15330 type III secretion system needle tip protein SipD. 343
15899 185231 PRK15331 PRK15331 type III secretion system translocator chaperone SicA. 165
15900 185232 PRK15332 PRK15332 SpaR/YscT/HrcT type III secretion system export apparatus protein. 263
15901 185233 PRK15333 PRK15333 EscS/YscS/HrcS family type III secretion system export apparatus protein. 86
15902 185234 PRK15334 PRK15334 type III secretion system protein SpaN. 336
15903 185235 PRK15335 PRK15335 type III secretion system protein SpaM; Provisional 147
15904 185236 PRK15336 PRK15336 type III secretion system chaperone SpaK; Provisional 135
15905 237946 PRK15337 PRK15337 EscV/YscV/HrcV family type III secretion system export apparatus protein. 686
15906 237947 PRK15338 PRK15338 YopN/LcrE/InvE/MxiC type III secretion system gatekeeper. 372
15907 237948 PRK15339 PRK15339 EscC/YscC/HrcC family type III secretion system outer membrane ring protein. 559
15908 185240 PRK15340 PRK15340 AraC family transcriptional regulator InvF. 216
15909 185241 PRK15341 PRK15341 type III secretion system invasion lipoprotein InvH. 147
15910 185242 PRK15344 PRK15344 type III secretion system needle protein SsaG; Provisional 71
15911 237949 PRK15345 PRK15345 SepL/TyeA/HrpJ family type III secretion system protein. 326
15912 237950 PRK15346 PRK15346 EscC/YscC/HrcC family type III secretion system outer membrane ring protein. 499
15913 237951 PRK15347 PRK15347 two component system sensor kinase. 921
15914 185246 PRK15348 PRK15348 EscJ/YscJ/HrcJ family type III secretion inner membrane ring protein SsaJ. 249
15915 185247 PRK15349 PRK15349 EscT/YscT/HrcT family type III secretion system export apparatus protein. 259
15916 185248 PRK15350 PRK15350 EscS/YscS/HrcS family type III secretion system export apparatus protein. 88
15917 185249 PRK15351 PRK15351 type III secretion system apparatus protein SsaP. 124
15918 185250 PRK15352 PRK15352 type III secretion system apparatus protein SsaO. 125
15919 185251 PRK15353 PRK15353 type III secretion system apparatus protein. 122
15920 185252 PRK15354 PRK15354 type III secretion system apparatus protein SsaK. 224
15921 185253 PRK15355 PRK15355 type III secretion system protein SsaI; Provisional 82
15922 185254 PRK15356 PRK15356 type III secretion system protein SsaH; Provisional 75
15923 185255 PRK15357 PRK15357 pathogenicity island 2 effector protein SseG; Provisional 229
15924 185256 PRK15358 PRK15358 type III secretion systems effector SseF. 239
15925 185257 PRK15359 PRK15359 type III secretion system chaperone protein SscB; Provisional 144
15926 185258 PRK15360 PRK15360 pathogenicity island 2 effector protein SseE; Provisional 137
15927 185259 PRK15361 PRK15361 type III secretion system translocon protein SseD. 195
15928 237952 PRK15362 PRK15362 type III secretion system translocon protein. 473
15929 185261 PRK15363 PRK15363 CesD/SycD/LcrH family type III secretion system chaperone SscA. 157
15930 185262 PRK15364 PRK15364 type III secretion system translocon protein SseB. 196
15931 185263 PRK15365 PRK15365 type III secretion system chaperone SseA; Provisional 107
15932 185264 PRK15366 PRK15366 type III secretion system chaperone SsaE; Provisional 80
15933 185265 PRK15367 PRK15367 EscD/YscD/HrpQ family type III secretion system inner membrane ring protein. 395
15934 185266 PRK15368 PRK15368 type III secretion system protein SpiC. 127
15935 185267 PRK15369 PRK15369 two component system response regulator. 211
15936 185268 PRK15370 PRK15370 type III secretion system effector E3 ubiquitin transferase SlrP. 754
15937 185269 PRK15371 PRK15371 YopJ family type III secretion system effector serine/threonine acetyltransferase. 287
15938 185270 PRK15372 PRK15372 type III secretion system effector SseI. 292
15939 185271 PRK15373 PRK15373 IpaC/SipC family type III secretion system needle tip complex protein. 411
15940 185272 PRK15374 PRK15374 type III secretion system needle tip complex protein SipB. 593
15941 185273 PRK15375 PRK15375 type III secretion system effector GTPase-activating protein SptP. 535
15942 185274 PRK15376 PRK15376 type III secretion system effector SipA. 670
15943 185275 PRK15377 PRK15377 type III secretion system effector HECT-type E3 ubiquitin transferase. 782
15944 237953 PRK15378 PRK15378 type III secretion system effector inositol phosphate phosphatase. 564
15945 185277 PRK15379 PRK15379 type III secretion system effector SopD. 317
15946 185278 PRK15380 PRK15380 type III secretion system effector SopD2. 319
15947 185279 PRK15381 PRK15381 type III secretion system effector. 408
15948 185280 PRK15382 PRK15382 NleB family type III secretion system effector arginine glycosyltransferase. 326
15949 185281 PRK15383 PRK15383 type III secretion system effector arginine glycosyltransferase. 335
15950 185282 PRK15384 PRK15384 type III secretion system effector arginine glycosyltransferase SseK1. 336
15951 185283 PRK15385 PRK15385 MgtC family protein. 225
15952 237954 PRK15386 PRK15386 type III secretion effector GogB. 426
15953 185285 PRK15387 PRK15387 type III secretion system effector E3 ubiquitin transferase SspH2. 788
15954 185286 PRK15388 PRK15388 superoxide dismutase [Cu-Zn]. 177
15955 237955 PRK15389 PRK15389 fumarate hydratase; Provisional 536
15956 185288 PRK15390 PRK15390 fumarate hydratase FumA; Provisional 548
15957 185289 PRK15391 PRK15391 class I fumarate hydratase. 548
15958 185290 PRK15392 PRK15392 class I fumarate hydratase. 550
15959 185291 PRK15393 PRK15393 NUDIX hydrolase YfcD; Provisional 180
15960 185292 PRK15394 PRK15394 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD; Provisional 296
15961 185293 PRK15395 PRK15395 galactose/glucose ABC transporter substrate-binding protein MglB. 330
15962 185294 PRK15396 PRK15396 major outer membrane lipoprotein. 78
15963 185295 PRK15397 PRK15397 nicotinamide riboside transporter PnuC; Provisional 239
15964 237956 PRK15398 PRK15398 aldehyde dehydrogenase. 465
15965 185297 PRK15399 PRK15399 lysine decarboxylase. 713
15966 185298 PRK15400 PRK15400 lysine decarboxylase. 714
15967 237957 PRK15401 PRK15401 DNA oxidative demethylase AlkB. 213
15968 185300 PRK15402 PRK15402 MdfA family multidrug efflux MFS transporter. 406
15969 237958 PRK15403 PRK15403 multidrug efflux MFS transporter MdtM. 413
15970 237959 PRK15404 PRK15404 high-affinity branched-chain amino acid ABC transporter substrate-binding protein. 369
15971 185303 PRK15405 PRK15405 ethanolamine utilization microcompartment protein EutL. 217
15972 185304 PRK15406 PRK15406 oligopeptide ABC transporter permease OppC; Provisional 302
15973 237960 PRK15407 PRK15407 lipopolysaccharide biosynthesis protein RfbH; Provisional 438
15974 237961 PRK15408 PRK15408 autoinducer 2 ABC transporter substrate-binding protein LsrB. 336
15975 185307 PRK15409 PRK15409 glyoxylate/hydroxypyruvate reductase GhrB. 323
15976 185308 PRK15410 PRK15410 DgsA anti-repressor MtfA; Provisional 260
15977 185309 PRK15411 rcsA transcriptional regulator RcsA. 207
15978 185310 PRK15412 PRK15412 thiol:disulfide interchange protein DsbE; Provisional 185
15979 185311 PRK15413 PRK15413 glutathione ABC transporter substrate-binding protein GsiB; Provisional 512
15980 185312 PRK15414 PRK15414 phosphomannomutase. 456
15981 185313 PRK15415 PRK15415 propanediol utilization microcompartment protein PduB. 266
15982 185314 PRK15416 PRK15416 lipopolysaccharide core heptose(II)-phosphate phosphatase; Provisional 201
15983 185315 PRK15417 PRK15417 integron integrase. 337
15984 237962 PRK15418 PRK15418 transcriptional regulator LsrR; Provisional 318
15985 185317 PRK15419 PRK15419 sodium/proline symporter PutP. 502
15986 185318 PRK15420 fucU L-fucose mutarotase; Provisional 140
15987 185319 PRK15421 PRK15421 HTH-type transcriptional regulator MetR. 317
15988 185320 PRK15422 PRK15422 septal ring assembly protein ZapB; Provisional 79
15989 185321 PRK15423 PRK15423 hypoxanthine phosphoribosyltransferase; Provisional 178
15990 237963 PRK15424 PRK15424 propionate catabolism operon regulatory protein PrpR; Provisional 538
15991 185323 PRK15425 gapA glyceraldehyde-3-phosphate dehydrogenase. 331
15992 237964 PRK15426 PRK15426 cellulose biosynthesis regulator YedQ. 570
15993 185325 PRK15427 PRK15427 colanic acid biosynthesis glycosyltransferase WcaL; Provisional 406
15994 185326 PRK15428 PRK15428 putative propanediol utilization protein PduM; Provisional 163
15995 237965 PRK15429 PRK15429 formate hydrogenlyase transcriptional activator FlhA. 686
15996 185328 PRK15430 PRK15430 EamA family transporter RarD. 296
15997 185329 PRK15431 PRK15431 [Fe-S]-dependent transcriptional repressor FeoC. 78
15998 185330 PRK15432 PRK15432 autoinducer 2 ABC transporter permease LsrC; Provisional 344
15999 185331 PRK15433 PRK15433 branched-chain amino acid transporter carrier protein BrnQ. 439
16000 237966 PRK15434 PRK15434 GDP-mannose mannosyl hydrolase. 159
16001 185333 PRK15435 PRK15435 bifunctional DNA-binding transcriptional regulator/O6-methylguanine-DNA methyltransferase Ada. 353
16002 185334 PRK15437 PRK15437 histidine ABC transporter substrate-binding protein HisJ; Provisional 259
16003 185335 PRK15438 PRK15438 erythronate-4-phosphate dehydrogenase PdxB; Provisional 378
16004 185336 PRK15439 PRK15439 autoinducer 2 ABC transporter ATP-binding protein LsrA; Provisional 510
16005 185337 PRK15440 PRK15440 L-rhamnonate dehydratase; Provisional 394
16006 185338 PRK15441 PRK15441 peptidyl-prolyl cis-trans isomerase C; Provisional 93
16007 185339 PRK15442 PRK15442 beta-lactamase TEM; Provisional 284
16008 185340 PRK15443 pduE diol dehydratase small subunit. 138
16009 185341 PRK15444 pduC propanediol/glycerol family dehydratase large subunit. 554
16010 185342 PRK15445 PRK15445 arsenical efflux pump membrane protein ArsB. 427
16011 237967 PRK15446 PRK15446 phosphonate metabolism protein PhnM; Provisional 383
16012 237968 PRK15447 PRK15447 putative protease; Provisional 301
16013 185345 PRK15448 PRK15448 ethanolamine utilization microcompartment protein EutN. 95
16014 185346 PRK15449 PRK15449 ferredoxin-like protein FixX; Provisional 95
16015 185347 PRK15450 PRK15450 signal transduction protein PmrD; Provisional 85
16016 185348 PRK15451 PRK15451 carboxy-S-adenosyl-L-methionine synthase CmoA. 247
16017 237969 PRK15452 PRK15452 putative protease; Provisional 443
16018 237970 PRK15453 PRK15453 phosphoribulokinase; Provisional 290
16019 185351 PRK15454 PRK15454 ethanolamine utilization ethanol dehydrogenase EutG. 395
16020 185352 PRK15455 PRK15455 PrkA family serine protein kinase; Provisional 644
16021 185353 PRK15456 PRK15456 universal stress protein UspG; Provisional 142
16022 185354 PRK15457 PRK15457 ethanolamine utilization acetate kinase EutQ. 233
16023 185355 PRK15458 PRK15458 tagatose 6-phosphate aldolase subunit KbaZ; Provisional 426
16024 185356 PRK15459 PRK15459 flagella biosynthesis chaperone FlgN. 140
16025 185357 PRK15460 cpsB mannose-1-phosphate guanyltransferase; Provisional 478
16026 185358 PRK15461 PRK15461 sulfolactaldehyde 3-reductase. 296
16027 237971 PRK15462 PRK15462 dipeptide permease. 493
16028 185360 PRK15463 PRK15463 cold shock-like protein CspF; Provisional 70
16029 185361 PRK15464 PRK15464 cold shock-like protein CspH; Provisional 70
16030 185362 PRK15465 pabB aminodeoxychorismate synthase component 1. 453
16031 185363 PRK15466 PRK15466 ethanolamine utilization microcompartment protein EutK. 166
16032 185364 PRK15467 PRK15467 ethanolamine utilization acetate kinase EutP. 158
16033 185365 PRK15468 PRK15468 ethanolamine utilization microcompartment protein EutS. 111
16034 185366 PRK15469 ghrA glyoxylate/hydroxypyruvate reductase GhrA. 312
16035 185367 PRK15470 emtA membrane-bound lytic murein transglycosylase EmtA. 203
16036 185368 PRK15471 PRK15471 chain length determinant protein WzzB; Provisional 325
16037 185369 PRK15472 PRK15472 nucleoside triphosphatase NudI; Provisional 141
16038 185370 PRK15473 cbiF cobalt-precorrin-4 methyltransferase. 257
16039 185371 PRK15474 PRK15474 ethanolamine utilization microcompartment protein EutM. 97
16040 185372 PRK15475 PRK15475 oxaloacetate decarboxylase subunit beta; Provisional 433
16041 185373 PRK15476 PRK15476 oxaloacetate decarboxylase subunit beta; Provisional 433
16042 185374 PRK15477 PRK15477 oxaloacetate decarboxylase subunit beta; Provisional 433
16043 185375 PRK15478 cbiH precorrin-3B C(17)-methyltransferase. 241
16044 185376 PRK15479 PRK15479 transcriptional regulator TctD. 221
16045 185377 PRK15480 PRK15480 glucose-1-phosphate thymidylyltransferase RfbA; Provisional 292
16046 185378 PRK15481 PRK15481 transcriptional regulatory protein PtsJ; Provisional 431
16047 185379 PRK15482 PRK15482 HTH-type transcriptional regulator MurR. 285
16048 237972 PRK15483 PRK15483 type III restriction-modification system endonuclease. 986
16049 185381 PRK15484 PRK15484 lipopolysaccharide N-acetylglucosaminyltransferase. 380
16050 185382 PRK15485 PRK15485 energy-coupling factor ABC transporter transmembrane protein. 225
16051 185383 PRK15486 hpaC 4-hydroxyphenylacetate 3-monooxygenase reductase subunit; Provisional 170
16052 185384 PRK15487 PRK15487 O-antigen ligase RfaL; Provisional 400
16053 237973 PRK15488 PRK15488 thiosulfate reductase PhsA; Provisional 759
16054 237974 PRK15489 nfrB glycosyl transferase family protein. 703
16055 185387 PRK15490 PRK15490 Vi polysaccharide biosynthesis glycosyltransferase TviE. 578
16056 185388 PRK15491 PRK15491 replication factor A; Provisional 374
16057 185389 PRK15492 PRK15492 triosephosphate isomerase; Provisional 260
16058 185390 PRK15493 PRK15493 bifunctional S-methyl-5'-thioadenosine deaminase/S-adenosylhomocysteine deaminase. 435
16059 185391 PRK15494 era GTPase Era; Provisional 339
16060 240225 PTZ00004 PTZ00004 actin-2; Provisional 378
16061 173310 PTZ00005 PTZ00005 phosphoglycerate kinase; Provisional 417
16062 240226 PTZ00007 PTZ00007 (NAP-L) nucleosome assembly protein -L; Provisional 337
16063 185394 PTZ00008 PTZ00008 (NAP-S) nucleosome assembly protein-S; Provisional 185
16064 240227 PTZ00009 PTZ00009 heat shock 70 kDa protein; Provisional 653
16065 240228 PTZ00010 PTZ00010 tubulin beta chain; Provisional 445
16066 140051 PTZ00013 PTZ00013 plasmepsin 4 (PM4); Provisional 450
16067 240229 PTZ00014 PTZ00014 myosin-A; Provisional 821
16068 185397 PTZ00015 PTZ00015 histone H4; Provisional 102
16069 240230 PTZ00016 PTZ00016 aquaglyceroporin; Provisional 294
16070 185399 PTZ00017 PTZ00017 histone H2A; Provisional 134
16071 185400 PTZ00018 PTZ00018 histone H3; Provisional 136
16072 240231 PTZ00019 PTZ00019 fructose-bisphosphate aldolase; Provisional 355
16073 240232 PTZ00021 PTZ00021 falcipain-2; Provisional 489
16074 173322 PTZ00023 PTZ00023 glyceraldehyde-3-phosphate dehydrogenase; Provisional 337
16075 240233 PTZ00024 PTZ00024 cyclin-dependent protein kinase; Provisional 335
16076 185402 PTZ00026 PTZ00026 60S ribosomal protein L15; Provisional 204
16077 240234 PTZ00027 PTZ00027 60S ribosomal protein L6; Provisional 190
16078 185404 PTZ00028 PTZ00028 40S ribosomal protein S6e; Provisional 218
16079 185405 PTZ00029 PTZ00029 60S ribosomal protein L10a; Provisional 216
16080 185406 PTZ00030 PTZ00030 60S ribosomal protein L20; Provisional 121
16081 173329 PTZ00031 PTZ00031 ribosomal protein L2; Provisional 317
16082 240235 PTZ00032 PTZ00032 60S ribosomal protein L18; Provisional 211
16083 140068 PTZ00033 PTZ00033 60S ribosomal protein L24; Provisional 125
16084 173331 PTZ00034 PTZ00034 40S ribosomal protein S10; Provisional 124
16085 185407 PTZ00035 PTZ00035 Rad51 protein; Provisional 337
16086 173333 PTZ00036 PTZ00036 glycogen synthase kinase; Provisional 440
16087 240236 PTZ00037 PTZ00037 DnaJ_C chaperone protein; Provisional 421
16088 240237 PTZ00038 PTZ00038 ferredoxin; Provisional 191
16089 173336 PTZ00039 PTZ00039 40S ribosomal protein S20; Provisional 115
16090 240238 PTZ00040 PTZ00040 translation initiation factor E4; Provisional 233
16091 240239 PTZ00041 PTZ00041 60S ribosomal protein L35a; Provisional 120
16092 240240 PTZ00043 PTZ00043 cytochrome c oxidase subunit; Provisional 268
16093 185411 PTZ00044 PTZ00044 ubiquitin; Provisional 76
16094 240241 PTZ00045 PTZ00045 apical membrane antigen 1; Provisional 595
16095 240242 PTZ00046 PTZ00046 rifin; Provisional 358
16096 240243 PTZ00047 PTZ00047 cytochrome c oxidase subunit II; Provisional 162
16097 185414 PTZ00048 PTZ00048 cytochrome c; Provisional 115
16098 240244 PTZ00049 PTZ00049 cathepsin C-like protein; Provisional 693
16099 240245 PTZ00050 PTZ00050 3-oxoacyl-acyl carrier protein synthase; Provisional 421
16100 173347 PTZ00051 PTZ00051 thioredoxin; Provisional 98
16101 185416 PTZ00052 PTZ00052 thioredoxin reductase; Provisional 499
16102 240246 PTZ00053 PTZ00053 methionine aminopeptidase 2; Provisional 470
16103 185418 PTZ00054 PTZ00054 60S ribosomal protein L23; Provisional 139
16104 240247 PTZ00055 PTZ00055 glutathione synthetase; Provisional 619
16105 240248 PTZ00056 PTZ00056 glutathione peroxidase; Provisional 199
16106 173353 PTZ00057 PTZ00057 glutathione s-transferase; Provisional 205
16107 185420 PTZ00058 PTZ00058 glutathione reductase; Provisional 561
16108 185421 PTZ00059 PTZ00059 dynein light chain; Provisional 90
16109 240249 PTZ00060 PTZ00060 cyclophilin; Provisional 183
16110 173356 PTZ00061 PTZ00061 DNA-directed RNA polymerase; Provisional 205
16111 240250 PTZ00062 PTZ00062 glutaredoxin; Provisional 204
16112 240251 PTZ00063 PTZ00063 histone deacetylase; Provisional 436
16113 173359 PTZ00064 PTZ00064 histone acetyltransferase; Provisional 552
16114 240252 PTZ00065 PTZ00065 60S ribosomal protein L14; Provisional 130
16115 173361 PTZ00066 PTZ00066 pyruvate kinase; Provisional 513
16116 185422 PTZ00067 PTZ00067 40S ribosomal S23; Provisional 143
16117 240253 PTZ00068 PTZ00068 60S ribosomal protein L13a; Provisional 202
16118 240254 PTZ00069 PTZ00069 60S ribosomal protein L5; Provisional 300
16119 240255 PTZ00070 PTZ00070 40S ribosomal protein S2; Provisional 257
16120 240256 PTZ00071 PTZ00071 40S ribosomal protein S24; Provisional 132
16121 185427 PTZ00072 PTZ00072 40S ribosomal protein S13; Provisional 148
16122 240257 PTZ00073 PTZ00073 60S ribosomal protein L37; Provisional 91
16123 185429 PTZ00074 PTZ00074 60S ribosomal protein L34; Provisional 135
16124 240258 PTZ00075 PTZ00075 Adenosylhomocysteinase; Provisional 476
16125 173371 PTZ00076 PTZ00076 60S ribosomal protein L17; Provisional 253
16126 185431 PTZ00077 PTZ00077 asparagine synthetase-like protein; Provisional 586
16127 185432 PTZ00078 PTZ00078 Superoxide dismutase [Fe]; Provisional 193
16128 185433 PTZ00079 PTZ00079 NADP-specific glutamate dehydrogenase; Provisional 454
16129 240259 PTZ00081 PTZ00081 enolase; Provisional 439
16130 173376 PTZ00082 PTZ00082 L-lactate dehydrogenase; Provisional 321
16131 185434 PTZ00083 PTZ00083 40S ribosomal protein S27; Provisional 85
16132 240260 PTZ00084 PTZ00084 40S ribosomal protein S3; Provisional 220
16133 240261 PTZ00085 PTZ00085 40S ribosomal protein S28; Provisional 73
16134 185437 PTZ00086 PTZ00086 40S ribosomal protein S16; Provisional 147
16135 185438 PTZ00087 PTZ00087 thrombosponding-related protein; Provisional 340
16136 240262 PTZ00088 PTZ00088 adenylate kinase 1; Provisional 229
16137 173383 PTZ00089 PTZ00089 transketolase; Provisional 661
16138 173384 PTZ00090 PTZ00090 40S ribosomal protein S11; Provisional 233
16139 185439 PTZ00091 PTZ00091 40S ribosomal protein S5; Provisional 193
16140 240263 PTZ00092 PTZ00092 aconitate hydratase-like protein; Provisional 898
16141 173387 PTZ00093 PTZ00093 nucleoside diphosphate kinase, cytosolic; Provisional 149
16142 240264 PTZ00094 PTZ00094 serine hydroxymethyltransferase; Provisional 452
16143 140127 PTZ00095 PTZ00095 40S ribosomal protein S19; Provisional 169
16144 185442 PTZ00096 PTZ00096 40S ribosomal protein S15; Provisional 143
16145 185443 PTZ00097 PTZ00097 60S ribosomal protein L19; Provisional 175
16146 173391 PTZ00098 PTZ00098 phosphoethanolamine N-methyltransferase; Provisional 263
16147 185444 PTZ00099 PTZ00099 rab6; Provisional 176
16148 240265 PTZ00100 PTZ00100 DnaJ chaperone protein; Provisional 116
16149 185445 PTZ00101 PTZ00101 rhomboid-1 protease; Provisional 278
16150 240266 PTZ00102 PTZ00102 disulphide isomerase; Provisional 477
16151 240267 PTZ00103 PTZ00103 60S ribosomal protein L3; Provisional 390
16152 240268 PTZ00104 PTZ00104 S-adenosylmethionine synthase; Provisional 398
16153 240269 PTZ00105 PTZ00105 60S ribosomal protein L12; Provisional 140
16154 185450 PTZ00106 PTZ00106 60S ribosomal protein L30; Provisional 108
16155 240270 PTZ00107 PTZ00107 hexokinase; Provisional 464
16156 240271 PTZ00108 PTZ00108 DNA topoisomerase 2-like protein; Provisional 1388
16157 240272 PTZ00109 PTZ00109 DNA gyrase subunit b; Provisional 903
16158 240273 PTZ00110 PTZ00110 helicase; Provisional 545
16159 173403 PTZ00111 PTZ00111 DNA replication licensing factor MCM4; Provisional 915
16160 240274 PTZ00112 PTZ00112 origin recognition complex 1 protein; Provisional 1164
16161 240275 PTZ00113 PTZ00113 proliferating cell nuclear antigen; Provisional 275
16162 185455 PTZ00114 PTZ00114 Heat shock protein 60; Provisional 555
16163 240276 PTZ00115 PTZ00115 40S ribosomal protein S12; Provisional 290
16164 173408 PTZ00116 PTZ00116 signal peptidase; Provisional 185
16165 173409 PTZ00117 PTZ00117 malate dehydrogenase; Provisional 319
16166 240277 PTZ00118 PTZ00118 40S ribosomal protein S4; Provisional 262
16167 240278 PTZ00119 PTZ00119 40S ribosomal protein S15; Provisional 302
16168 185458 PTZ00120 PTZ00120 D-tyrosyl-tRNA(Tyr) deacylase; Provisional 154
16169 173412 PTZ00121 PTZ00121 MAEBL; Provisional 2084
16170 240279 PTZ00122 PTZ00122 phosphoglycerate mutase; Provisional 299
16171 240280 PTZ00123 PTZ00123 phosphoglycerate mutase like-protein; Provisional 236
16172 173415 PTZ00124 PTZ00124 adenosine deaminase; Provisional 362
16173 240281 PTZ00125 PTZ00125 ornithine aminotransferase-like protein; Provisional 400
16174 240282 PTZ00126 PTZ00126 tyrosyl-tRNA synthetase; Provisional 383
16175 240283 PTZ00127 PTZ00127 cytochrome c oxidase assembly protein; Provisional 403
16176 185464 PTZ00128 PTZ00128 cytochrome c oxidase assembly protein-like; Provisional 232
16177 185465 PTZ00129 PTZ00129 40S ribosomal protein S14; Provisional 149
16178 185466 PTZ00130 PTZ00130 heat shock protein 90; Provisional 814
16179 185467 PTZ00131 PTZ00131 glycophorin-binding protein; Provisional 413
16180 240284 PTZ00132 PTZ00132 GTP-binding nuclear protein Ran; Provisional 215
16181 173423 PTZ00133 PTZ00133 ADP-ribosylation factor; Provisional 182
16182 185469 PTZ00134 PTZ00134 40S ribosomal protein S18; Provisional 154
16183 240285 PTZ00135 PTZ00135 60S acidic ribosomal protein P0; Provisional 310
16184 185471 PTZ00136 PTZ00136 eukaryotic translation initiation factor 6-like protein; Provisional 247
16185 173427 PTZ00137 PTZ00137 2-Cys peroxiredoxin; Provisional 261
16186 185472 PTZ00138 PTZ00138 small nuclear ribonucleoprotein; Provisional 89
16187 240286 PTZ00139 PTZ00139 Succinate dehydrogenase [ubiquinone] flavoprotein subunit; Provisional 617
16188 173430 PTZ00140 PTZ00140 sexual stage antigen s45/48; Provisional 447
16189 185474 PTZ00141 PTZ00141 elongation factor 1- alpha; Provisional 446
16190 240287 PTZ00142 PTZ00142 6-phosphogluconate dehydrogenase; Provisional 470
16191 240288 PTZ00143 PTZ00143 deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional 155
16192 240289 PTZ00144 PTZ00144 dihydrolipoamide succinyltransferase; Provisional 418
16193 240290 PTZ00145 PTZ00145 phosphoribosylpyrophosphate synthetase; Provisional 439
16194 240291 PTZ00146 PTZ00146 fibrillarin; Provisional 293
16195 140176 PTZ00147 PTZ00147 plasmepsin-1; Provisional 453
16196 240292 PTZ00148 PTZ00148 40S ribosomal protein S8; Provisional 205
16197 240293 PTZ00149 PTZ00149 hypoxanthine phosphoribosyltransferase; Provisional 241
16198 240294 PTZ00150 PTZ00150 phosphoglucomutase-2-like protein; Provisional 584
16199 173440 PTZ00151 PTZ00151 translationally controlled tumor-like protein; Provisional 172
16200 173441 PTZ00152 PTZ00152 cofilin/actin-depolymerizing factor 1-like protein; Provisional 122
16201 173442 PTZ00153 PTZ00153 lipoamide dehydrogenase; Provisional 659
16202 240295 PTZ00154 PTZ00154 40S ribosomal protein S17; Provisional 134
16203 185484 PTZ00155 PTZ00155 40S ribosomal protein S9; Provisional 181
16204 185485 PTZ00156 PTZ00156 60S ribosomal protein L11; Provisional 172
16205 240296 PTZ00157 PTZ00157 60S ribosomal protein L36a; Provisional 84
16206 185487 PTZ00158 PTZ00158 40S ribosomal protein S15A; Provisional 130
16207 240297 PTZ00159 PTZ00159 60S ribosomal protein L32; Provisional 133
16208 185489 PTZ00160 PTZ00160 60S ribosomal protein L27a; Provisional 147
16209 240298 PTZ00162 PTZ00162 DNA-directed RNA polymerase II subunit 7; Provisional 176
16210 185490 PTZ00163 PTZ00163 hypothetical protein; Provisional 230
16211 240299 PTZ00164 PTZ00164 bifunctional dihydrofolate reductase-thymidylate synthase; Provisional 514
16212 240300 PTZ00165 PTZ00165 aspartyl protease; Provisional 482
16213 240301 PTZ00166 PTZ00166 DNA polymerase delta catalytic subunit; Provisional 1054
16214 185493 PTZ00167 PTZ00167 RNA polymerase subunit 8c; Provisional 144
16215 185494 PTZ00168 PTZ00168 mitochondrial carrier protein; Provisional 259
16216 240302 PTZ00169 PTZ00169 ADP/ATP transporter on adenylate translocase; Provisional 300
16217 240303 PTZ00170 PTZ00170 D-ribulose-5-phosphate 3-epimerase; Provisional 228
16218 240304 PTZ00171 PTZ00171 acyl carrier protein; Provisional 148
16219 185497 PTZ00172 PTZ00172 40S ribosomal protein S26; Provisional 108
16220 185498 PTZ00173 PTZ00173 60S ribosomal protein L10; Provisional 213
16221 240305 PTZ00174 PTZ00174 phosphomannomutase; Provisional 247
16222 185500 PTZ00175 PTZ00175 diphthine synthase; Provisional 270
16223 140204 PTZ00176 PTZ00176 erythrocyte membrane protein 1 (PfEMP1); Provisional 1317
16224 240306 PTZ00178 PTZ00178 60S ribosomal protein L17; Provisional 181
16225 140206 PTZ00179 PTZ00179 60S ribosomal protein L9; Provisional 189
16226 173464 PTZ00180 PTZ00180 60S ribosomal protein L8; Provisional 260
16227 140208 PTZ00181 PTZ00181 60S ribosomal protein L38; Provisional 82
16228 185502 PTZ00182 PTZ00182 3-methyl-2-oxobutanate dehydrogenase; Provisional 355
16229 185503 PTZ00183 PTZ00183 centrin; Provisional 158
16230 185504 PTZ00184 PTZ00184 calmodulin; Provisional 149
16231 140212 PTZ00185 PTZ00185 ATPase alpha subunit; Provisional 574
16232 140213 PTZ00186 PTZ00186 heat shock 70 kDa precursor protein; Provisional 657
16233 240307 PTZ00187 PTZ00187 succinyl-CoA synthetase alpha subunit; Provisional 317
16234 240308 PTZ00188 PTZ00188 adrenodoxin reductase; Provisional 506
16235 240309 PTZ00189 PTZ00189 60S ribosomal protein L21; Provisional 160
16236 140217 PTZ00190 PTZ00190 60S ribosomal protein L29; Provisional 70
16237 185507 PTZ00191 PTZ00191 60S ribosomal protein L23a; Provisional 145
16238 173472 PTZ00192 PTZ00192 60S ribosomal protein L13; Provisional 218
16239 140220 PTZ00193 PTZ00193 60S ribosomal protein L31; Provisional 188
16240 185508 PTZ00194 PTZ00194 60S ribosomal protein L26; Provisional 143
16241 140222 PTZ00195 PTZ00195 60S ribosomal protein L18; Provisional 198
16242 185509 PTZ00196 PTZ00196 60S ribosomal protein L36; Provisional 98
16243 185510 PTZ00197 PTZ00197 60S ribosomal protein L28; Provisional 146
16244 173474 PTZ00198 PTZ00198 60S ribosomal protein L22; Provisional 122
16245 185511 PTZ00199 PTZ00199 high mobility group protein; Provisional 94
16246 240310 PTZ00200 PTZ00200 cysteine proteinase; Provisional 448
16247 240311 PTZ00201 PTZ00201 amastin surface glycoprotein; Provisional 192
16248 240312 PTZ00202 PTZ00202 tuzin; Provisional 550
16249 185513 PTZ00203 PTZ00203 cathepsin L protease; Provisional 348
16250 140231 PTZ00204 PTZ00204 hypothetical protein; Provisional 120
16251 140232 PTZ00205 PTZ00205 DNA polymerase kappa; Provisional 571
16252 240313 PTZ00206 PTZ00206 amino acid transporter; Provisional 467
16253 140234 PTZ00207 PTZ00207 hypothetical protein; Provisional 591
16254 240314 PTZ00208 PTZ00208 65 kDa invariant surface glycoprotein; Provisional 436
16255 140236 PTZ00209 PTZ00209 retrotransposon hot spot protein; Provisional 693
16256 140237 PTZ00210 PTZ00210 UDP-GlcNAc-dependent glycosyltransferase; Provisional 382
16257 240315 PTZ00211 PTZ00211 ribonucleoside-diphosphate reductase small subunit; Provisional 330
16258 185514 PTZ00212 PTZ00212 T-complex protein 1 subunit beta; Provisional 533
16259 185515 PTZ00213 PTZ00213 asparagine synthetase A; Provisional 348
16260 173479 PTZ00214 PTZ00214 high cysteine membrane protein Group 4; Provisional 800
16261 185516 PTZ00215 PTZ00215 ribose 5-phosphate isomerase; Provisional 151
16262 240316 PTZ00216 PTZ00216 acyl-CoA synthetase; Provisional 700
16263 240317 PTZ00217 PTZ00217 flap endonuclease-1; Provisional 393
16264 185518 PTZ00218 PTZ00218 40S ribosomal protein S29; Provisional 54
16265 185519 PTZ00219 PTZ00219 Sec61 alpha subunit; Provisional 474
16266 173484 PTZ00220 PTZ00220 Activator of HSP-90 ATPase; Provisional 132
16267 140248 PTZ00221 PTZ00221 cyclophilin; Provisional 249
16268 140249 PTZ00222 PTZ00222 60S ribosomal protein L7a; Provisional 263
16269 140250 PTZ00223 PTZ00223 40S ribosomal protein S4; Provisional 273
16270 240318 PTZ00224 PTZ00224 protein phosphatase 2C; Provisional 381
16271 140252 PTZ00225 PTZ00225 60S ribosomal protein L10a; Provisional 214
16272 240319 PTZ00226 PTZ00226 fumarate hydratase; Provisional 570
16273 140254 PTZ00227 PTZ00227 variable surface protein Vir14; Provisional 418
16274 240320 PTZ00228 PTZ00228 variable surface protein Vir21; Provisional 350
16275 140256 PTZ00229 PTZ00229 variable surface protein Vir30; Provisional 317
16276 240321 PTZ00230 PTZ00230 variable surface protein Vir7; Provisional 364
16277 140258 PTZ00231 PTZ00231 variable surface protein Vir17; Provisional 385
16278 240322 PTZ00232 PTZ00232 variable surface protein Vir27; Provisional 363
16279 240323 PTZ00233 PTZ00233 variable surface protein Vir18; Provisional 509
16280 240324 PTZ00234 PTZ00234 variable surface protein Vir12; Provisional 433
16281 185521 PTZ00235 PTZ00235 DNA polymerase epsilon subunit B; Provisional 291
16282 173487 PTZ00236 PTZ00236 mitochondrial import inner membrane translocase subunit tim17; Provisional 164
16283 240325 PTZ00237 PTZ00237 acetyl-CoA synthetase; Provisional 647
16284 140265 PTZ00238 PTZ00238 expression site-associated gene (ESAG); Provisional 326
16285 173488 PTZ00239 PTZ00239 serine/threonine protein phosphatase 2A; Provisional 303
16286 140267 PTZ00240 PTZ00240 60S ribosomal protein P0; Provisional 323
16287 240326 PTZ00241 PTZ00241 40S ribosomal protein S11; Provisional 158
16288 185524 PTZ00242 PTZ00242 protein tyrosine phosphatase; Provisional 166
16289 240327 PTZ00243 PTZ00243 ABC transporter; Provisional 1560
16290 140271 PTZ00244 PTZ00244 serine/threonine-protein phosphatase PP1; Provisional 294
16291 140272 PTZ00245 PTZ00245 ubiquitin activating enzyme; Provisional 287
16292 173491 PTZ00246 PTZ00246 proteasome subunit alpha; Provisional 253
16293 240328 PTZ00247 PTZ00247 adenosine kinase; Provisional 345
16294 240329 PTZ00248 PTZ00248 eukaryotic translation initiation factor 2 subunit 1; Provisional 319
16295 140276 PTZ00249 PTZ00249 variable surface protein Vir28; Provisional 516
16296 140277 PTZ00250 PTZ00250 variable surface protein Vir23; Provisional 350
16297 140278 PTZ00251 PTZ00251 fatty acid elongase; Provisional 272
16298 240330 PTZ00252 PTZ00252 histone H2A; Provisional 134
16299 140280 PTZ00253 PTZ00253 tryparedoxin peroxidase; Provisional 199
16300 240331 PTZ00254 PTZ00254 40S ribosomal protein SA; Provisional 249
16301 240332 PTZ00255 PTZ00255 60S ribosomal protein L37a; Provisional 90
16302 173495 PTZ00256 PTZ00256 glutathione peroxidase; Provisional 183
16303 240333 PTZ00257 PTZ00257 Glycoprotein GP63 (leishmanolysin); Provisional 622
16304 240334 PTZ00258 PTZ00258 GTP-binding protein; Provisional 390
16305 240335 PTZ00259 PTZ00259 endonuclease G; Provisional 434
16306 240336 PTZ00260 PTZ00260 dolichyl-phosphate beta-glucosyltransferase; Provisional 333
16307 240337 PTZ00261 PTZ00261 acyltransferase; Provisional 355
16308 240338 PTZ00262 PTZ00262 subtilisin-like protease; Provisional 639
16309 140289 PTZ00263 PTZ00263 protein kinase A catalytic subunit; Provisional 329
16310 173500 PTZ00264 PTZ00264 circumsporozoite-related antigen; Provisional 148
16311 240339 PTZ00265 PTZ00265 multidrug resistance protein (mdr1); Provisional 1466
16312 173502 PTZ00266 PTZ00266 NIMA-related protein kinase; Provisional 1021
16313 140293 PTZ00267 PTZ00267 NIMA-related protein kinase; Provisional 478
16314 140294 PTZ00268 PTZ00268 glycosylphosphatidylinositol-specific phospholipase C; Provisional 380
16315 140295 PTZ00269 PTZ00269 variant surface glycoprotein; Provisional 472
16316 240340 PTZ00270 PTZ00270 variable surface protein Vir32; Provisional 333
16317 140297 PTZ00271 PTZ00271 hypoxanthine-guanine phosphoribosyltransferase; Provisional 211
16318 240341 PTZ00272 PTZ00272 heat shock protein 83 kDa (Hsp83); Provisional 701
16319 140299 PTZ00273 PTZ00273 oxidase reductase; Provisional 320
16320 140300 PTZ00274 PTZ00274 cytochrome b5 reductase; Provisional 325
16321 185536 PTZ00275 PTZ00275 biotin-acetyl-CoA-carboxylase ligase; Provisional 285
16322 140302 PTZ00276 PTZ00276 biotin/lipoate protein ligase; Provisional 245
16323 240342 PTZ00278 PTZ00278 ARP2/3 complex subunit; Provisional 174
16324 240343 PTZ00280 PTZ00280 Actin-related protein 3; Provisional 414
16325 173506 PTZ00281 PTZ00281 actin; Provisional 376
16326 240344 PTZ00283 PTZ00283 serine/threonine protein kinase; Provisional 496
16327 140307 PTZ00284 PTZ00284 protein kinase; Provisional 467
16328 140308 PTZ00285 PTZ00285 glucosamine-6-phosphate isomerase; Provisional 253
16329 185539 PTZ00286 PTZ00286 6-phospho-1-fructokinase; Provisional 459
16330 240345 PTZ00287 PTZ00287 6-phosphofructokinase; Provisional 1419
16331 240346 PTZ00288 PTZ00288 glucokinase 1; Provisional 405
16332 240347 PTZ00290 PTZ00290 galactokinase; Provisional 468
16333 185541 PTZ00292 PTZ00292 ribokinase; Provisional 326
16334 185542 PTZ00293 PTZ00293 thymidine kinase; Provisional 211
16335 240348 PTZ00294 PTZ00294 glycerol kinase-like protein; Provisional 504
16336 240349 PTZ00295 PTZ00295 glucosamine-fructose-6-phosphate aminotransferase; Provisional 640
16337 240350 PTZ00296 PTZ00296 choline kinase; Provisional 442
16338 140318 PTZ00297 PTZ00297 pantothenate kinase; Provisional 1452
16339 240351 PTZ00298 PTZ00298 mevalonate kinase; Provisional 328
16340 140320 PTZ00299 PTZ00299 homoserine kinase; Provisional 336
16341 140321 PTZ00300 PTZ00300 pyruvate kinase; Provisional 454
16342 140322 PTZ00301 PTZ00301 uridine kinase; Provisional 210
16343 240352 PTZ00302 PTZ00302 N-acetylglucosamine-phosphate mutase; Provisional 585
16344 140324 PTZ00303 PTZ00303 phosphatidylinositol kinase; Provisional 1374
16345 185547 PTZ00304 PTZ00304 NADH dehydrogenase [ubiquinone] flavoprotein 1; Provisional 461
16346 140326 PTZ00305 PTZ00305 NADH:ubiquinone oxidoreductase; Provisional 297
16347 140327 PTZ00306 PTZ00306 NADH-dependent fumarate reductase; Provisional 1167
16348 140328 PTZ00307 PTZ00307 ethanolamine phosphotransferase; Provisional 417
16349 140329 PTZ00308 PTZ00308 ethanolamine-phosphate cytidylyltransferase; Provisional 353
16350 240353 PTZ00309 PTZ00309 glucose-6-phosphate 1-dehydrogenase; Provisional 542
16351 240354 PTZ00310 PTZ00310 AMP deaminase; Provisional 1453
16352 185549 PTZ00311 PTZ00311 phosphoenolpyruvate carboxykinase; Provisional 561
16353 140333 PTZ00312 PTZ00312 inositol-1,4,5-triphosphate 5-phosphatase; Provisional 356
16354 140334 PTZ00313 PTZ00313 inosine-adenosine-guanosine-nucleoside hydrolase; Provisional 326
16355 240355 PTZ00314 PTZ00314 inosine-5'-monophosphate dehydrogenase; Provisional 495
16356 240356 PTZ00315 PTZ00315 2'-phosphotransferase; Provisional 582
16357 140337 PTZ00316 PTZ00316 profilin; Provisional 150
16358 240357 PTZ00317 PTZ00317 NADP-dependent malic enzyme; Provisional 559
16359 185553 PTZ00318 PTZ00318 NADH dehydrogenase-like protein; Provisional 424
16360 173521 PTZ00319 PTZ00319 NADH-cytochrome B5 reductase; Provisional 300
16361 140341 PTZ00320 PTZ00320 ribosomal protein L14; Provisional 188
16362 240358 PTZ00321 PTZ00321 ribosomal protein L11; Provisional 342
16363 140343 PTZ00322 PTZ00322 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase; Provisional 664
16364 185554 PTZ00323 PTZ00323 NAD+ synthase; Provisional 294
16365 240359 PTZ00324 PTZ00324 glutamate dehydrogenase 2; Provisional 1002
16366 240360 PTZ00325 PTZ00325 malate dehydrogenase; Provisional 321
16367 240361 PTZ00326 PTZ00326 phenylalanyl-tRNA synthetase alpha chain; Provisional 494
16368 240362 PTZ00327 PTZ00327 eukaryotic translation initiation factor 2 gamma subunit; Provisional 460
16369 140349 PTZ00328 PTZ00328 eukaryotic initiation factor 5a; Provisional 166
16370 185558 PTZ00329 PTZ00329 eukaryotic translation initiation factor 1A; Provisional 155
16371 140351 PTZ00330 PTZ00330 acetyltransferase; Provisional 147
16372 240363 PTZ00331 PTZ00331 alpha/beta hydrolase; Provisional 212
16373 240364 PTZ00332 PTZ00332 paraflagellar rod protein; Provisional 589
16374 240365 PTZ00333 PTZ00333 triosephosphate isomerase; Provisional 255
16375 240366 PTZ00334 PTZ00334 trans-sialidase; Provisional 780
16376 185562 PTZ00335 PTZ00335 tubulin alpha chain; Provisional 448
16377 185563 PTZ00337 PTZ00337 surface protease GP63; Provisional 567
16378 240367 PTZ00338 PTZ00338 dimethyladenosine transferase-like protein; Provisional 294
16379 240368 PTZ00339 PTZ00339 UDP-N-acetylglucosamine pyrophosphorylase; Provisional 482
16380 240369 PTZ00340 PTZ00340 O-sialoglycoprotein endopeptidase-like protein; Provisional 345
16381 173534 PTZ00341 PTZ00341 Ring-infected erythrocyte surface antigen; Provisional 1136
16382 240370 PTZ00342 PTZ00342 acyl-CoA synthetase; Provisional 746
16383 240371 PTZ00343 PTZ00343 triose or hexose phosphate/phosphate translocator; Provisional 350
16384 240372 PTZ00344 PTZ00344 pyridoxal kinase; Provisional 296
16385 240373 PTZ00345 PTZ00345 glycerol-3-phosphate dehydrogenase; Provisional 365
16386 240374 PTZ00346 PTZ00346 histone deacetylase; Provisional 429
16387 240375 PTZ00347 PTZ00347 phosphomethylpyrimidine kinase; Provisional 504
16388 173541 PTZ00348 PTZ00348 tyrosyl-tRNA synthetase; Provisional 682
16389 185571 PTZ00349 PTZ00349 dehydrodolichyl diphosphate synthetase; Provisional 322
16390 240376 PTZ00350 PTZ00350 adenylosuccinate synthetase; Provisional 436
16391 173544 PTZ00351 PTZ00351 adenylosuccinate synthetase; Provisional 710
16392 240377 PTZ00352 PTZ00352 60S ribosomal protein L13; Provisional 212
16393 173546 PTZ00353 PTZ00353 glycosomal glyceraldehyde-3-phosphate dehydrogenase; Provisional 342
16394 173547 PTZ00354 PTZ00354 alcohol dehydrogenase; Provisional 334
16395 173548 PTZ00355 PTZ00355 Rhoptry-associated protein 2; Provisional 400
16396 185573 PTZ00356 PTZ00356 peptidyl-prolyl cis-trans isomerase (PPIase); Provisional 115
16397 173550 PTZ00357 PTZ00357 methyltransferase; Provisional 1072
16398 240378 PTZ00358 PTZ00358 hypothetical protein; Provisional 367
16399 173552 PTZ00359 PTZ00359 hypothetical protein; Provisional 443
16400 240379 PTZ00360 PTZ00360 sexual stage antigen; Provisional 543
16401 185575 PTZ00361 PTZ00361 26 proteosome regulatory subunit 4-like protein; Provisional 438
16402 240380 PTZ00362 PTZ00362 hypothetical protein; Provisional 479
16403 185577 PTZ00363 PTZ00363 rab-GDP dissociation inhibitor; Provisional 443
16404 240381 PTZ00364 PTZ00364 dipeptidyl-peptidase I precursor; Provisional 548
16405 240382 PTZ00365 PTZ00365 60S ribosomal protein L7Ae-like; Provisional 266
16406 240383 PTZ00366 PTZ00366 Surface antigen (SAG) superfamily; Provisional 392
16407 240384 PTZ00367 PTZ00367 squalene epoxidase; Provisional 567
16408 173561 PTZ00368 PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional 148
16409 240385 PTZ00369 PTZ00369 Ras-like protein; Provisional 189
16410 240386 PTZ00370 PTZ00370 STEVOR; Provisional 296
16411 240387 PTZ00371 PTZ00371 aspartyl aminopeptidase; Provisional 465
16412 240388 PTZ00372 PTZ00372 endonuclease 4-like protein; Provisional 413
16413 185582 PTZ00373 PTZ00373 60S Acidic ribosomal protein P2; Provisional 112
16414 240389 PTZ00374 PTZ00374 dihydroxyacetone phosphate acyltransferase; Provisional 1108
16415 185583 PTZ00375 PTZ00375 dihydroxyacetone kinase-like protein; Provisional 584
16416 240390 PTZ00376 PTZ00376 aspartate aminotransferase; Provisional 404
16417 240391 PTZ00377 PTZ00377 alanine aminotransferase; Provisional 481
16418 173571 PTZ00378 PTZ00378 hypothetical protein; Provisional 518
16419 173572 PTZ00380 PTZ00380 microtubule-associated protein (MAP); Provisional 121
16420 240392 PTZ00381 PTZ00381 aldehyde dehydrogenase family protein; Provisional 493
16421 173574 PTZ00382 PTZ00382 Variant-specific surface protein (VSP); Provisional 96
16422 240393 PTZ00383 PTZ00383 malate:quinone oxidoreductase; Provisional 497
16423 173576 PTZ00384 PTZ00384 choline kinase; Provisional 383
16424 185588 PTZ00385 PTZ00385 lysyl-tRNA synthetase; Provisional 659
16425 240394 PTZ00386 PTZ00386 formyl tetrahydrofolate synthetase; Provisional 625
16426 240395 PTZ00387 PTZ00387 epsilon tubulin; Provisional 465
16427 240396 PTZ00388 PTZ00388 40S ribosomal protein S8-like; Provisional 223
16428 185592 PTZ00389 PTZ00389 40S ribosomal protein S7; Provisional 184
16429 240397 PTZ00390 PTZ00390 ubiquitin-conjugating enzyme; Provisional 152
16430 240398 PTZ00391 PTZ00391 transport protein particle component (TRAPP) superfamily; Provisional 168
16431 240399 PTZ00393 PTZ00393 protein tyrosine phosphatase; Provisional 241
16432 173585 PTZ00394 PTZ00394 glucosamine-fructose-6-phosphate aminotransferase; Provisional 670
16433 185594 PTZ00395 PTZ00395 Sec24-related protein; Provisional 1560
16434 240400 PTZ00396 PTZ00396 Casein kinase II subunit beta; Provisional 251
16435 240401 PTZ00397 PTZ00397 macrophage migration inhibition factor-like protein; Provisional 116
16436 173589 PTZ00398 PTZ00398 phosphoenolpyruvate carboxylase; Provisional 974
16437 240402 PTZ00399 PTZ00399 cysteinyl-tRNA-synthetase; Provisional 651
16438 240403 PTZ00400 PTZ00400 DnaK-type molecular chaperone; Provisional 663
16439 173592 PTZ00401 PTZ00401 aspartyl-tRNA synthetase; Provisional 550
16440 240404 PTZ00402 PTZ00402 glutamyl-tRNA synthetase; Provisional 601
16441 173594 PTZ00403 PTZ00403 phosphatidylserine decarboxylase; Provisional 353
16442 173595 PTZ00404 PTZ00404 cytochrome P450; Provisional 482
16443 173596 PTZ00405 PTZ00405 cytochrome c; Provisional 114
16444 173597 PTZ00407 PTZ00407 DNA topoisomerase IA; Provisional 805
16445 240405 PTZ00408 PTZ00408 NAD-dependent deacetylase; Provisional 242
16446 173599 PTZ00409 PTZ00409 Sir2 (Silent Information Regulator) protein; Provisional 271
16447 185600 PTZ00410 PTZ00410 NAD-dependent SIR2; Provisional 349
16448 240406 PTZ00411 PTZ00411 transaldolase-like protein; Provisional 333
16449 240407 PTZ00412 PTZ00412 leucyl aminopeptidase; Provisional 569
16450 240408 PTZ00413 PTZ00413 lipoate synthase; Provisional 398
16451 173604 PTZ00414 PTZ00414 10 kDa heat shock protein; Provisional 100
16452 185603 PTZ00415 PTZ00415 transmission-blocking target antigen s230; Provisional 2849
16453 240409 PTZ00416 PTZ00416 elongation factor 2; Provisional 836
16454 173607 PTZ00417 PTZ00417 lysine-tRNA ligase; Provisional 585
16455 240410 PTZ00418 PTZ00418 Poly(A) polymerase; Provisional 593
16456 240411 PTZ00419 PTZ00419 valyl-tRNA synthetase-like protein; Provisional 995
16457 240412 PTZ00420 PTZ00420 coronin; Provisional 568
16458 173611 PTZ00421 PTZ00421 coronin; Provisional 493
16459 185607 PTZ00422 PTZ00422 glideosome-associated protein 50; Provisional 394
16460 240413 PTZ00423 PTZ00423 glideosome-associated protein 45; Provisional 193
16461 185609 PTZ00424 PTZ00424 helicase 45; Provisional 401
16462 240414 PTZ00425 PTZ00425 asparagine-tRNA ligase; Provisional 586
16463 173616 PTZ00426 PTZ00426 cAMP-dependent protein kinase catalytic subunit; Provisional 340
16464 173617 PTZ00427 PTZ00427 isoleucine-tRNA ligase, putative; Provisional 1205
16465 185611 PTZ00428 PTZ00428 60S ribosomal protein L4; Provisional 381
16466 240415 PTZ00429 PTZ00429 beta-adaptin; Provisional 746
16467 185612 PTZ00430 PTZ00430 glucose-6-phosphate isomerase; Provisional 552
16468 173621 PTZ00431 PTZ00431 pyrroline carboxylate reductase; Provisional 260
16469 240416 PTZ00432 PTZ00432 falcilysin; Provisional 1119
16470 185613 PTZ00433 PTZ00433 tyrosine aminotransferase; Provisional 412
16471 185614 PTZ00434 PTZ00434 cytosolic glyceraldehyde 3-phosphate dehydrogenase; Provisional 361
16472 240417 PTZ00435 PTZ00435 isocitrate dehydrogenase; Provisional 413
16473 185616 PTZ00436 PTZ00436 60S ribosomal protein L19-like protein; Provisional 357
16474 240418 PTZ00437 PTZ00437 glutaminyl-tRNA synthetase; Provisional 574
16475 185618 PTZ00438 PTZ00438 gamete antigen 27/25-like protein; Provisional 374
16476 240419 PTZ00440 PTZ00440 reticulocyte binding protein 2-like protein; Provisional 2722
16477 240420 PTZ00441 PTZ00441 sporozoite surface protein 2 (SSP2); Provisional 576
16478 185621 PTZ00442 PTZ00442 sexual stage antigen s48/45-like protein; Provisional 347
16479 185622 PTZ00443 PTZ00443 Thioredoxin domain-containing protein; Provisional 224
16480 185623 PTZ00444 PTZ00444 hypothetical protein; Provisional 184
16481 240421 PTZ00445 PTZ00445 p36-like protein; Provisional 219
16482 240422 PTZ00446 PTZ00446 vacuolar sorting protein SNF7-like; Provisional 191
16483 185626 PTZ00447 PTZ00447 apical membrane antigen 1-like protein; Provisional 508
16484 185627 PTZ00448 PTZ00448 hypothetical protein; Provisional 373
16485 185628 PTZ00449 PTZ00449 104 kDa microneme/rhoptry antigen; Provisional 943
16486 185629 PTZ00450 PTZ00450 macrophage migration inhibitory factor-like protein; Provisional 113
16487 185630 PTZ00451 PTZ00451 dephospho-CoA kinase; Provisional 244
16488 185631 PTZ00452 PTZ00452 actin; Provisional 375
16489 185632 PTZ00453 PTZ00453 cyclin-dependent kinase; Provisional 96
16490 240423 PTZ00454 PTZ00454 26S protease regulatory subunit 6B-like protein; Provisional 398
16491 240424 PTZ00455 PTZ00455 3-ketoacyl-CoA thiolase; Provisional 438
16492 185635 PTZ00456 PTZ00456 acyl-CoA dehydrogenase; Provisional 622
16493 185636 PTZ00457 PTZ00457 acyl-CoA dehydrogenase; Provisional 520
16494 185637 PTZ00458 PTZ00458 acyl CoA binding protein; Provisional 90
16495 185638 PTZ00459 PTZ00459 mucin-associated surface protein (MASP); Provisional 291
16496 185639 PTZ00460 PTZ00460 acyl-CoA dehydrogenase; Provisional 646
16497 185640 PTZ00461 PTZ00461 isovaleryl-CoA dehydrogenase; Provisional 410
16498 185641 PTZ00462 PTZ00462 Serine-repeat antigen protein; Provisional 1004
16499 185642 PTZ00463 PTZ00463 histone H2B; Provisional 117
16500 240425 PTZ00464 PTZ00464 SNF-7-like protein; Provisional 211
16501 185644 PTZ00465 PTZ00465 rhoptry-associated protein 1 (RAP-1); Provisional 565
16502 240426 PTZ00466 PTZ00466 actin-like protein; Provisional 380
16503 185646 PTZ00467 PTZ00467 40S ribosomal protein S30; Provisional 66
16504 185647 PTZ00468 PTZ00468 phosphofructokinase family protein; Provisional 1328
16505 185648 PTZ00469 PTZ00469 60S ribosomal subunit protein L18; Provisional 187
16506 240427 PTZ00470 PTZ00470 glycoside hydrolase family 47 protein; Provisional 522
16507 240428 PTZ00471 PTZ00471 60S ribosomal protein L27; Provisional 134
16508 240429 PTZ00472 PTZ00472 serine carboxypeptidase (CBP1); Provisional 462
16509 240430 PTZ00473 PTZ00473 Plasmodium Vir superfamily; Provisional 420
16510 240431 PTZ00474 PTZ00474 tryptophan/threonine-rich antigen superfamily; Provisional 316
16511 185654 PTZ00475 PTZ00475 RESA-like protein; Provisional 282
16512 240432 PTZ00477 PTZ00477 rhoptry-associated protein; Provisional 524
16513 185656 PTZ00478 PTZ00478 Sec superfamily; Provisional 81
16514 185657 PTZ00479 PTZ00479 RAP Superfamily; Provisional 435
16515 185658 PTZ00480 PTZ00480 serine/threonine-protein phosphatase; Provisional 320
16516 185659 PTZ00481 PTZ00481 Membrane attack complex/ Perforin (MACPF) Superfamily; Provisional 524
16517 240433 PTZ00482 PTZ00482 membrane-attack complex/perforin (MACPF) Superfamily; Provisional 844
16518 185661 PTZ00483 PTZ00483 proliferating cell nuclear antigen; Provisional 264
16519 240434 PTZ00484 PTZ00484 GTP cyclohydrolase I; Provisional 259
16520 240435 PTZ00485 PTZ00485 aldolase 1-epimerase; Provisional 376
16521 240436 PTZ00486 PTZ00486 apyrase Superfamily; Provisional 352
16522 240437 PTZ00487 PTZ00487 ceramidase; Provisional 715
16523 185666 PTZ00488 PTZ00488 Proteasome subunit beta type-5; Provisional 247
16524 240438 PTZ00489 PTZ00489 glutamate 5-kinase; Provisional 264
16525 185668 PTZ00490 PTZ00490 Ferredoxin superfamily; Provisional 143
16526 240439 PTZ00491 PTZ00491 major vault protein; Provisional 850
16527 240440 PTZ00493 PTZ00493 phosphomethylpyrimidine kinase; Provisional 321
16528 185671 PTZ00494 PTZ00494 tuzin-like protein; Provisional 664
16529 272847 TIGR00001 rpmI_bact ribosomal protein L35. This ribosomal protein is found in bacteria and organelles only. It is not closely related to any eukaryotic or archaeal ribosomal protein. [Protein synthesis, Ribosomal proteins: synthesis and modification] 63
16530 272848 TIGR00002 S16 ribosomal protein S16. This model describes ribosomal S16 of bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification] 78
16531 188014 TIGR00003 TIGR00003 copper ion binding protein. This model describes an apparently copper-specific subfamily of the metal-binding domain HMA (pfam00403). Closely related sequences outside this model include mercury resistance proteins and repeated domains of eukaryotic copper transport proteins. Members of this family are strictly prokaryotic. The model identifies both small proteins consisting of just this domain and N-terminal regions of cation (probably copper) transporting ATPases. [Transport and binding proteins, Cations and iron carrying compounds] 66
16532 129116 TIGR00004 TIGR00004 reactive intermediate/imine deaminase. This protein was described initially as an inhibitor of protein synthesis initiation, then as an endoribonuclease active on single-stranded mRNA, endoribonuclease L-PSP. Members of this family, conserved in all domains of life and often with several members per bacterial genome, appear to catalyze a reaction that minimizes toxic by-products from reactions catalyzed by pyridoxal phosphate-dependent enzymes. [Cellular processes, Other] 124
16533 161659 TIGR00005 rluA_subfam pseudouridine synthase, RluA family. In E. coli, RluD (SfhB) modifies uridine to pseudouridine at 23S RNA U1911, 1915, and 1917, RluC modifies 955, 2504 and 2580, and RluA modifies U746 and tRNA U32. An additional homolog from E. coli outside this family, TruC (SP|Q46918), modifies uracil-65 in transfer RNAs to pseudouridine. [Protein synthesis, tRNA and rRNA base modification] 299
16534 272849 TIGR00006 TIGR00006 16S rRNA (cytosine(1402)-N(4))-methyltransferase. This model describes RsmH, a 16S rRNA methyltransferase. Previously, this gene was designated MraW, known to be essential in E. coli and widely conserved in bacteria. [Protein synthesis, tRNA and rRNA base modification] 307
16535 272850 TIGR00007 TIGR00007 phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase. This protein family consists of HisA, phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase, the enzyme catalyzing the fourth step in histidine biosynthesis. It is closely related to the enzyme HisF for the sixth step. Examples of this enzyme in Actinobacteria have been found to be bifunctional, also possessing phosphoribosylanthranilate isomerase activity; the trusted cutoff here has now been raised to 275.0 to exclude the bifunctional group, now represented by model TIGR01919. HisA from Lactococcus lactis was reported to be inactive (MEDLINE:93322317). [Amino acid biosynthesis, Histidine family] 230
16536 188015 TIGR00008 infA translation initiation factor IF-1. This family consists of translation initiation factor IF-1 as found in bacteria and chloroplasts. This protein, about 70 residues in length, consists largely of an S1 RNA binding domain (pfam00575). [Protein synthesis, Translation factors] 69
16537 272851 TIGR00009 L28 ribosomal protein L28. This model describes bacterial and chloroplast forms of the 50S ribosomal protein L28, a polypeptide about 60 amino acids in length. Mitochondrial homologs differ substantially in architecture (e.g. SP|P36525 from Saccharomyces cerevisiae, which is 258 amino acids long) and are not included. [Protein synthesis, Ribosomal proteins: synthesis and modification] 56
16538 272852 TIGR00010 TIGR00010 hydrolase, TatD family. PSI-BLAST, starting with a urease alpha subunit, finds a large superfamily of proteins, including a number of different enzymes that act as hydrolases at C-N bonds other than peptide bonds (EC 3.5.-.-), many uncharacterized proteins, and the members of this family. Several genomes have multiple paralogs related to this family. However, a set of 17 proteins can be found, one each from 17 of the first 20 genomes, such that each member forms a bidirectional best hit across genomes with all other members of the set. This core set (and one other near-perfect member), but not the other paralogs, form the seed for this model. Additionally, members of the seed alignment and all trusted hits, but not all paralogs, have a conserved motif DxHxH near the amino end. The member from E. coli was recently shown to have DNase activity. [Unknown function, Enzymes of unknown specificity] 252
16539 272853 TIGR00011 YbaK_EbsC Cys-tRNA(Pro) deacylase. This model represents the YbaK family, bacterial proteins whose full length sequence is homologous to an insertion domain in proline--tRNA ligases. The domain deacylates mischarged tRNAs. The YbaK protein of Haemophilus influenzae (HI1434) likewise deacylates Ala-tRNA(Pro), but not the correctly charged Pro-tRNA(Pro). A crystallographic study of HI1434 suggests a nucleotide binding function. Previously, a member of this family was described as EbsC and was thought to be involved in cell wall metabolism. [Protein synthesis, tRNA aminoacylation] 152
16540 272854 TIGR00012 L29 ribosomal protein L29. This model describes a ribosomal large subunit protein, called L29 in prokaryotic (50S) large subunits and L35 in eukaryotic (60S) large subunits. [Protein synthesis, Ribosomal proteins: synthesis and modification] 55
16541 129125 TIGR00013 taut 4-oxalocrotonate tautomerase family enzyme. 4-oxalocrotonate tautomerase is a homohexamer in which each monomer is very small, at about 62 amino acids. Pro-1 of the mature protein serves as a general base. The enzyme functions in meta-cleavage pathways of aromatic hydrocarbon catabolism. Because several Arg residues located near the active site in the crystal structure of Pseudomonas putida are not conserved among all members of this family, because the literature describes a general role in the isomerization of beta,gamma-unsaturated enones to their alpha,beta-isomers, and because of the presence of fairly distantly related paralogs in Campylobacter jejuni, the family is regarded as not necessarily uniform in function. [Energy metabolism, Other] 63
16542 272855 TIGR00014 arsC arsenate reductase (glutaredoxin). This model describes a distinct clade, including ArsC itself, of the broader ArsC family described by Pfam pfam03960. This clade is almost completely restricted to the Proteobacteria. An anion-translocating ATPase has been identified as the product of the arsenical resistance operon of resistance plasmid R773. When expressed in Escherichia coli this ATP-driven oxyanion pump catalyses extrusion of the oxyanions arsenite, antimonite and arsenate. The pump is composed of two polypeptides, the products of the arsA and arsB genes. The pump alone produces resistance to arsenite and antimonite. This protein, ArsC, catalyzes the reduction of arsenate to arsenite, and thus extends resistance to include arsenate. [Cellular processes, Detoxification] 114
16543 272856 TIGR00016 ackA acetate kinase. Acetate kinase is involved in the activation of acetate to acetyl CoA and in the secretion of acetate. It catalyzes the reaction ATP + acetate = ADP + acetyl phosphate. Some members of this family have been shown to act on propionate as well as acetate. An example of a propionate/acetate kinase is TdcD of E. coli (SP|P11868), an enzyme of an anaerobic pathway of threonine catabolism. It is not known how many members of this family act on additional substrates besides acetate. [Energy metabolism, Fermentation] 404
16544 129128 TIGR00017 cmk cytidylate kinase. This family consists of cytidylate kinase, which catalyzes the phosphorylation of cytidine 5'-monophosphate (dCMP) to cytidine 5'-diphosphate (dCDP) in the presence of ATP or GTP. UMP and dCMP can also act as acceptors. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 217
16545 272857 TIGR00018 panC pantoate--beta-alanine ligase. This family is pantoate--beta-alanine ligase, the last enzyme of pantothenate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 282
16546 129130 TIGR00019 prfA peptide chain release factor 1. This model describes peptide chain release factor 1 (PrfA, RF-1), and excludes the related peptide chain release factor 2 (PrfB, RF-2). RF-1 helps recognize and terminate translation at UAA and UAG stop codons. The mitochondrial release factors are prfA-like, although not included above the trusted cutoff for this model. RF-1 does not have a translational frameshift. [Protein synthesis, Translation factors] 360
16547 272858 TIGR00020 prfB peptide chain release factor 2. In many but not all taxa, there is a conserved real translational frameshift at a TGA codon. RF-2 helps terminate translation at TGA codons and can therefore regulate its own production by readthrough when RF-2 is insufficient. There is a Pfam model called "RF-1" for the superfamily of RF-1, RF-2, mitochondrial, RF-H, etc. [Protein synthesis, Translation factors] 364
16548 272859 TIGR00021 rpiA ribose 5-phosphate isomerase. This model describes ribose 5-phosphate isomerase, an enzyme of the non-oxidative branch of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway] 218
16549 129133 TIGR00022 TIGR00022 YhcH/YjgK/YiaL family protein. This family consists of conserved hypothetical proteins, about 150 amino acids in length. Members with limited information include YhcH, a possible sugar isomerase of sialic acid catabolism, and YjgK. [Unknown function, General] 142
16550 272860 TIGR00023 TIGR00023 acyl-phosphate glycerol 3-phosphate acyltransferase. This model represents the full length of acylphosphate:glycerol 3-phosphate acyltransferase, an integral membrane protein about 200 amino acids in length, called PlsY in Streptococcus pneumoniae, YneS in Bacillus subtilis, and YgiH in E. coli. It is found in a single copy in a large number of bacteria, including the Mycoplasmas but not Mycobacteria or spirochetes, for example. Its partner is PlsX (see TIGR00182), and the pair can replace PlsB for synthesizing 1-acylglycerol-3-phosphate. [Fatty acid and phospholipid metabolism, Biosynthesis] 196
16551 129135 TIGR00024 SbcD_rel_arch putative phosphoesterase, SbcD/Mre11-related. Members of this uncharacterized family share a motif approximating DXH(X25)GDXXD(X25)GNHD as found in several phosphoesterases, including the nucleases SbcD and Mre11. SbcD is a subunit of the SbcCD nuclease of E. coli that can cleave DNA hairpins to unblock stalled DNA replication. All members of this family are archaeal. [Unknown function, Enzymes of unknown specificity] 225
16552 272861 TIGR00025 Mtu_efflux ABC transporter efflux protein, DrrB family. The seed members for this model are a paralogous family of Mycobacterium tuberculosis. Nearly all proteins scoring above the noise cutoff are from high-GC Gram-positive organisms. The members of this paralogous family of efflux proteins are all found in operons with ATP-binding chain partners. They are related to a putative daunorubicin resistance efflux protein of Streptomyces peucetius. This model represents a branch of a larger superfamily that also includes NodJ, a part of the NodIJ pair of nodulation-triggering signal efflux proteins. The members of this branch may all act in antibiotic resistance. 232
16553 211538 TIGR00026 hi_GC_TIGR00026 deazaflavin-dependent oxidoreductase, nitroreductase family. This model represents a family of proteins found in paralogous families in the genera Mycobacterium and Streptomyces. Seven members are in Mycobacterium tuberculosis. Member protein Rv3547 has been characterized as a deazaflavin-dependent nitroreductase. [Unknown function, Enzymes of unknown specificity] 113
16554 272862 TIGR00027 mthyl_TIGR00027 methyltransferase, TIGR00027 family. This model represents a set of probable methyltransferases, about 300 amino acids long, with essentially full length homology. Members share an N-terminal region described by Pfam model pfam02409. Included are a paralogous family of 12 proteins in Mycobacterium tuberculosis, plus close homologs in related species, a family of 8 in the archaeon Methanosarcina acetivorans, and small numbers of members in other species, including plants. [Unknown function, Enzymes of unknown specificity] 260
16555 272863 TIGR00028 Mtu_PIN_fam toxin-antitoxin system PIN domain toxin. Members of this family consist almost entirely of a PIN (PilT N terminus) domain (see pfam01850). This family was originally defined as a set of twelve closely related paralogs found in Mycobacterium tuberculosis, but additional members are now found in Synechococcus sp. WH8102, etc. Inspection of genomic regions suggests these represent toxin components of toxin-antitoxin regions, potentially important to creating dormant persister cells. 142
16556 211539 TIGR00029 S20 ribosomal protein S20. This family consists of bacterial (and chloroplast) examples of the bacterial ribosomal small subunit protein S20. [Protein synthesis, Ribosomal proteins: synthesis and modification] 87
16557 129141 TIGR00030 S21p ribosomal protein S21. This model describes bacterial ribosomal protein S21 and most mitochondrial and chloroplast equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification] 58
16558 272864 TIGR00031 UDP-GALP_mutase UDP-galactopyranose mutase. This enzyme is involved in the conversion of UDP-GALP into UDP-GALF through a 2-keto intermediate. It contains FAD as a cofactor. The gene is known as glf, ceoA, and rfbD. It is known experimentally in E. coli, Mycobacterium tuberculosis, and Klebsiella pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 377
16559 199987 TIGR00032 argG argininosuccinate synthase. argG in bacteria, ARG1 in Saccharomyces cerevisiae. There is a very unusual clustering in the alignment, with a deep split between one cohort of E. coli, H. influenzae, and Streptomyces, and the other cohort of eukaryotes, archaea, and the rest of the eubacteria. [Amino acid biosynthesis, Glutamate family] 394
16560 272865 TIGR00033 aroC chorismate synthase. Homotetramer (noted in E.coli) suggests reason for good conservation. [Amino acid biosynthesis, Aromatic amino acid family] 351
16561 129145 TIGR00034 aroFGH phospho-2-dehydro-3-deoxyheptonate aldolase. [Amino acid biosynthesis, Aromatic amino acid family] 344
16562 213495 TIGR00035 asp_race aspartate racemase. Aspartate racemases and some close homologs of unknown function are related to the more common glutamate racemases, but form a distinct evolutionary branch. This model identifies members of the aspartate racemase-related subset of amino acid racemases. [Energy metabolism, Amino acids and amines] 229
16563 129147 TIGR00036 dapB 4-hydroxy-tetrahydrodipicolinate reductase. [Amino acid biosynthesis, Aspartate family] 266
16564 272866 TIGR00037 eIF_5A translation elongation factor IF5A. Recent work (2009) changed the view of eIF5A in eukaryotes and aIF5A in archaea, hypusine-containing proteins, from translation initiation factor to translation elongation factor. [Protein synthesis, Translation factors] 130
16565 272867 TIGR00038 efp translation elongation factor P. function: involved in peptide bond synthesis. stimulate efficient translation and peptide-bond synthesis on native or reconstituted 70S ribosomes in vitro. probably functions indirectly by altering the affinity of the ribosome for aminoacyl-tRNA, thus increasing their reactivity as acceptors for peptidyl transferase (by similarity). The trusted cutoff of this model is set high enough to exclude members of TIGR02178, an EFP-like protein of certain Gammaproteobacteria. [Protein synthesis, Translation factors] 184
16566 272868 TIGR00039 6PTHBS 6-pyruvoyl tetrahydropterin synthase/QueD family protein. This model has been downgraded from hypothetical_equivalog to subfamily. The animal enzymes are known to be 6-pyruvoyl tetrahydropterin synthase. The function of the bacterial branch of the sequence lineage had been thought to be the same, but many are now taken to be QueD, an enzyme of queuosine biosynthesis. Queuosine is a hypermodified base in the wobble position of some tRNAs in most species. A new model is built to be the QueD equivalog model. [Protein synthesis, tRNA and rRNA base modification] 124
16567 272869 TIGR00040 yfcE phosphoesterase, MJ0936 family. Members of this largely uncharacterized family share a motif approximating DXH(X25)GDXXD(X25)GNHD as found in several phosphoesterases, including the nucleases SbcD and Mre11, and a family of uncharacterized archaeal putative phosphoesterases described by TIGR00024. In this family, the His residue in GNHD portion of the motif is not conserved. The member MJ0936, one of two from Methanococcus jannaschii, was shown to act on model phosphodiesterase substrates; a divalent cation was required. [Unknown function, Enzymes of unknown specificity] 158
16568 161676 TIGR00041 DTMP_kinase dTMP kinase. Function: phosphorylation of DTMP to form DTDP in both de novo and salvage pathways of DTTP synthesis. Catalytic activity: ATP + thymidine 5'-phosphate = ADP + thymidine 5'-diphosphate. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 195
16569 272870 TIGR00042 TIGR00042 non-canonical purine NTP pyrophosphatase, RdgB/HAM1 family. Saccharomyces cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine, which can be a natural product of monooxygenase activity on adenine. Methanococcus jannaschii MJ0226 and E. coli RdgB are also characterized as pyrophosphatases active against non-standard purines NTPs. E. coli RdgB appears to act by intercepting non-canonical deoxyribonucleotide triphosphates from replication precursor pools. [DNA metabolism, DNA replication, recombination, and repair] 184
16570 272871 TIGR00043 TIGR00043 rRNA maturation RNase YbeY. This metalloprotein family is represented by a single member sequence only in nearly every bacterium. Crystallography demonstrated metal-binding activity, possibly to nickel. It is predicted to be a metallohydrolase, and more recently it was shown that mutants have a ribosomal RNA processing defect. [Protein synthesis, Other] 110
16571 129155 TIGR00044 TIGR00044 pyridoxal phosphate enzyme, YggS family. Members of this protein family include YggS from Escherichia coli and YBL036C, an uncharacterized pyridoxal protein of Saccharomyces cerevisiae. [Unknown function, Enzymes of unknown specificity] 229
16572 272872 TIGR00045 TIGR00045 glycerate kinase. The only characterized member of this family so far is the glycerate kinase GlxK (EC 2.7.1.31) of E. coli. This enzyme acts after glyoxylate carboligase and 2-hydroxy-3-oxopropionate reductase (tartronate semialdehyde reductase) in the conversion of glyoxylate to 3-phosphoglycerate (the D-glycerate pathway) as a part of allantoin degradation. [Energy metabolism, Other] 375
16573 272873 TIGR00046 TIGR00046 RNA methyltransferase, RsmE family. Members of this protein family, previously called conserved hypothetical protein TIGR00046, include the YggJ protein of E. coli, which has now been shown to methylate U1498 in 16S rRNA. [Protein synthesis, tRNA and rRNA base modification] 240
16574 272874 TIGR00048 rRNA_mod_RlmN 23S rRNA (adenine(2503)-C(2))-methyltransferase. Members of this family are RlmN, a 23S rRNA m2A2503 methyltransferase in the radical SAM enzyme family. Closely related is Cfr, a Staphylococcus sciuri plasmid-borne homolog to this family, which has been identified as essential to transferable resistance to chloramphenicol and florfenicol. Cfr methylates 23S RNA at a different site. [Protein synthesis, tRNA and rRNA base modification] 355
16575 272875 TIGR00049 TIGR00049 Iron-sulfur cluster assembly accessory protein. Proteins in this subfamily appear to be associated with the process of FeS-cluster assembly. The HesB proteins are associated with the nif gene cluster and the Rhizobium gene IscN has been shown to be required for nitrogen fixation. Nitrogenase includes multiple FeS clusters and many genes for their assembly. The E. coli SufA protein is associated with SufS, a NifS homolog and SufD which are involved in the FeS cluster assembly of the FhnF protein. The Azotobacter protein IscA (homologs of which are also found in E. coli) is associated with IscS, another NifS homolog and IscU, a nifU homolog as well as other factors consistent with a role in FeS cluster chemistry. A homolog from Geobacter contains a selenocysteine in place of an otherwise invariant cysteine, further suggesting a role in redox chemistry. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 105
16576 272876 TIGR00050 rRNA_methyl_1 RNA methyltransferase, TrmH family, group 1. This is part of the trmH (spoU) family of S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases, and is now characterized, in E. coli, as a tRNA:Cm32/Um32 methyltransferase. It may be named TrMet(Xm32), or TrmJ, according to the nomenclature style chosen [Protein synthesis, tRNA and rRNA base modification] 233
16577 129161 TIGR00051 TIGR00051 acyl-CoA thioester hydrolase, YbgC/YbaW family. This model describes a subset of related acyl-CoA thioesterases that include several at least partially characterized proteins. YbgC is an acyl-CoA thioesterase associated with the Tol-Pal system. YbaW is part of the FadM regulon. [Unknown function, General] 117
16578 129162 TIGR00052 TIGR00052 nudix-type nucleoside diphosphatase, YffH/AdpP family. Members of this family include proteins of about 200 amino acids, including the recently characterized nudix hydrolase YffH, shown to be highly active as a GDP-mannose pyrophosphatase. It also includes the C-terminal half of a 361-amino acid protein, TrgB from Rhodobacter sphaeroides, shown experimentally to help confer tellurite resistance. This model also hits a region near the C-terminus of a 1092-amino acid protein of C. elegans. [Unknown function, Enzymes of unknown specificity] 185
16579 272877 TIGR00053 TIGR00053 addiction module toxin component, YafQ family. This model represents a cluster of eubacterial proteins and a cluster of archaeal proteins, all of which are uncharacterized, from 85 to 102 residues in length, and similar in sequence. These include YafQ, a ribosome-associated endoribonuclease that serves as part of a toxin-antitoxin system, for which DinJ is the antidote component. [Cellular processes, Adaptations to atypical conditions] 90
16580 272878 TIGR00054 TIGR00054 RIP metalloprotease RseP. Members of this nearly universal bacterial protein family are regulated intramembrane proteolysis (RIP) proteases. Older and synonymous gene symbols include yaeL in E. coli, mmpA in Caulobacter crescentus, etc. This family includes a region that hits the PDZ domain, found in a number of proteins targeted to the membrane by binding to a peptide ligand. The N-terminal region of this family contains a perfectly conserved motif HEXGH as found in a number of metalloproteinases, where the Glu is the active site and the His residues coordinate the metal cation. Membership in this family is determined by a match to the full length of the seed alignment; the model also detects fragments as well as matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 419
16581 129165 TIGR00055 uppS undecaprenyl diphosphate synthase. This enzyme builds undecaprenyl diphosphate, a molecule that in bacteria is used as a carrier in synthesizing cell wall components. Alternate name: undecaprenyl pyrophosphate synthetase. Activity has been demonstrated experimentally for members of this family from Micrococcus luteus, E. coli, Haemophilus influenzae, and Streptococcus pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 226
16582 129166 TIGR00056 TIGR00056 ABC transport permease subunit. This model describes a subfamily of ABC transporter permease subunits. One member of this family has been associated with the toluene tolerance phenotype of Pseudomonas putida, another with L-glutamate transport, another with maintenance of lipid asymmetry. Many bacterial species have one or two members. The Mycobacteria have large paralogous families included in the DUF140 family but excluded from this subfamily based on extreme divergence at the amino end and on phylogenetic and UPGMA trees on the more conserved regions. [Hypothetical proteins, Conserved] 259
16583 272879 TIGR00057 TIGR00057 tRNA threonylcarbamoyl adenosine modification protein, Sua5/YciO/YrdC/YwlC family. Has paralogs, but YrdC called a tRNA modification protein. Ref 2 authors say probably heteromultimeric complex. Paralogs may mean it does the final binding to the tRNA. [Protein synthesis, tRNA and rRNA base modification] 201
16584 129168 TIGR00058 Hemerythrin hemerythrin family non-heme iron protein. This family includes oxygen carrier proteins of various oligomeric states from the vascular fluid (hemerythrin) and muscle (myohemerythrin) of some marine invertebrates. Each unit binds 2 non-heme Fe using 5 H, one E and one D. One member of this family, from the sandworm Nereis diversicolor, is an unusual (non-metallothionein) cadmium-binding protein. Homologous proteins, excluded from this narrowly defined family, are found in archaea and bacteria (see pfam01814). 115
16585 272880 TIGR00059 L17 ribosomal protein L17. Eubacterial and mitochondrial. The mitochondrial form, from yeast, contains an additional 110 amino acids C-terminal to the region found by this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 111
16586 272881 TIGR00060 L18_bact ribosomal protein L18, bacterial type. The archaeal and eukaryotic type rpL18 is not detectable under this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 114
16587 129171 TIGR00061 L21 ribosomal protein L21. Eubacterial and chloroplast. [Protein synthesis, Ribosomal proteins: synthesis and modification] 101
16588 272882 TIGR00062 L27 ribosomal protein L27. Eubacterial, chloroplast, and mitochondrial. Mitochondrial members have an additional C-terminal domain. [Protein synthesis, Ribosomal proteins: synthesis and modification] 84
16589 129173 TIGR00063 folE GTP cyclohydrolase I. Alternate name: Punch (Drosophila). GTP cyclohydrolase I (EC 3.5.4.16) catalyzes the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 180
16590 272883 TIGR00064 ftsY signal recognition particle-docking protein FtsY. There is a weak division between FtsY and SRP54; both are GTPases. In E.coli, ftsY is an essential gene located in an operon with cell division genes ftsE and ftsX, but its apparent function is as the signal recognition particle docking protein. [Protein fate, Protein and peptide secretion and trafficking] 277
16591 272884 TIGR00065 ftsZ cell division protein FtsZ. This family consists of cell division protein FtsZ, a GTPase found in bacteria, the chloroplast of plants, and in archaebacteria. Structurally similar to tubulin, FtsZ undergoes GTP-dependent polymerization into filaments that form a cytoskeleton involved in septum synthesis. [Cellular processes, Cell division] 349
16592 129176 TIGR00066 g_glut_trans gamma-glutamyltranspeptidase. Also called gamma-glutamyltranspeptidase (ggt). Some members of this family have antibiotic synthesis or resistance activities. In the case of a cephalosporin acylase from Pseudomonas sp., the enzyme was shown to retain some gamma-glutamyltranspeptidase activity. Other, more distantly related proteins have ggt-related activities and score below the trusted cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 516
16593 272885 TIGR00067 glut_race glutamate racemase. This family consists of glutamate racemase, a protein required for making the UDP-N-acetylmuramoyl-pentapeptide used as a precursor in bacterial peptidoglycan biosynthesis. The most closely related proteins differing in function are aspartate racemases. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 251
16594 272886 TIGR00068 glyox_I lactoylglutathione lyase. Lactoylglutathione lyase is also known as aldoketomutase and glyoxalase I. Glyoxalase I is a homodimer in many species. In some eukaryotes, including yeasts and plants, the orthologous protein carries a tandem duplication, is twice as long, and hits this model twice. [Central intermediary metabolism, Amino sugars, Energy metabolism, Other] 150
16595 272887 TIGR00069 hisD histidinol dehydrogenase. This model describes a polypeptide sequence catalyzing the final step in histidine biosynthesis, found sometimes as an independent protein and sometimes as a part of a multifunctional protein. [Amino acid biosynthesis, Histidine family] 393
16596 272888 TIGR00070 hisG ATP phosphoribosyltransferase. Members of this family from B. subtilis, Aquifex aeolicus, and Synechocystis PCC6803 (and related taxa) lack the C-terminal third of the sequence. The sole homolog from Archaeoglobus fulgidus lacks the N-terminal 50 residues (as reported) and is otherwise atypical of the rest of the family. This model excludes the C-terminal extension. [Amino acid biosynthesis, Histidine family] 183
16597 272889 TIGR00071 hisT_truA tRNA pseudouridine(38-40) synthase. Members of this family are the tRNA modification enzyme TruA, tRNA pseudouridine(38-40) synthase. In a few species (e.g. Bacillus anthracis), TruA is represented by two paralogs. [Protein synthesis, tRNA and rRNA base modification] 227
16598 272890 TIGR00072 hydrog_prot hydrogenase maturation protease. HycI and HoxM are well-characterized as responsible for C-terminal protease activity on their respective hydrogenase large chains. A large number of homologous proteins appear responsible for the maturation of various forms of hydrogenase. 145
16599 272891 TIGR00073 hypB hydrogenase accessory protein HypB. A GTP hydrolase for assembly of nickel metallocenter of hydrogenase. A similar protein, ureG, is an accessory protein for urease, which also uses nickel. Hits scoring 75 and above are safe as orthologs. [SS 1/05/04 I changed the role_ID and process GO from protein folding to protein modification, since a protein folding role has not been established, but HypB is implicated in insertion of nickel into the large subunit of NiFe hydrogenases.] [Protein fate, Protein modification and repair] 208
16600 129184 TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. This protein is suggested to act as a chaperone for a hydrogenase large subunit, holding the precursor form before metallocenter nickel incorporation. [SS 12/31/03] More recently proposed additional function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. Added metallochaperone and protein mod GO terms. [Protein fate, Protein folding and stabilization, Protein fate, Protein modification and repair] 76
16601 272892 TIGR00075 hypD hydrogenase expression/formation protein HypD. HypD is involved in the hyp operon which is needed for the activity of the three hydrogenase isoenzymes in Escherichia coli. HypD is one of the genes needed for formation of these enzymes. This protein has been found in gram-negative and gram-positive bacteria and Archaea. [Protein fate, Protein modification and repair] 369
16602 272893 TIGR00077 lspA lipoprotein signal peptidase. Alternate name: lipoprotein signal peptidase [Protein fate, Protein and peptide secretion and trafficking] 166
16603 272894 TIGR00078 nadC nicotinate-nucleotide pyrophosphorylase. Synonym: quinolinate phosphoribosyltransferase (decarboxylating) [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 265
16604 272895 TIGR00079 pept_deformyl peptide deformylase. Peptide deformylase (EC 3.5.1.88), also called polypeptide deformylase, is a metalloenzyme that uses water to release formate from the N-terminal formyl-L-methionine of bacterial and chloroplast peptides. This enzyme should not be confused with formylmethionine deformylase (EC 3.5.1.31) which is active on free N-formyl methionine and has been reported from rat intestine. [Protein fate, Protein modification and repair] 161
16605 272896 TIGR00080 pimt protein-L-isoaspartate(D-aspartate) O-methyltransferase. This is an all-kingdom (but not all species) full-length ortholog enzyme for repairing aging proteins. Among the prokaryotes, the gene name is pcm. Among eukaryotes, pimt. [Protein fate, Protein modification and repair] 215
16606 272897 TIGR00081 purC phosphoribosylaminoimidazole-succinocarboxamide synthase. Alternate name: SAICAR synthetase purine de novo biosynthesis. E.coli example noted as homotrimer. Check length. Longer versions may be multifunctional enzymes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 237
16607 129191 TIGR00082 rbfA ribosome-binding factor A. Associates with free 30S ribosomal subunits (but not with 30S subunits that are part of 70S ribosomes or polysomes). Essential for efficient processing of 16S rRNA. May interact with the 5'terminal helix region of 16S rRNA. Mutants lacking rbfA have a cold-sensitive phenotype. [Transcription, RNA processing] 114
16608 272898 TIGR00083 ribF riboflavin kinase/FMN adenylyltransferase. multifunctional enzyme: riboflavin kinase (EC 2.7.1.26) (flavokinase) / FMN adenylyltransferase (EC 2.7.7.2) (FAD pyrophosphorylase) (FAD synthetase). [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 288
16609 129193 TIGR00084 ruvA Holliday junction DNA helicase, RuvA subunit. RuvA specifically binds Holliday junctions as a sandwich of two tetramers and maintains the configuration of the junction. It forms a complex with two hexameric rings of RuvB, the subunit that contains helicase activity. The complex drives ATP-dependent branch migration of the Holliday junction recombination intermediate. The endonuclease RuvC resolves junctions. [DNA metabolism, DNA replication, recombination, and repair] 191
16610 129194 TIGR00086 smpB SsrA-binding protein. This model describes the SsrA-binding protein, also called tmRNA binding protein, small protein B, and SmpB. The small, stable RNA SsrA (also called tmRNA or 10Sa RNA) recognizes stalled ribosomes such as occur during translation from message that lacks a stop codon. It becomes charged with Ala like a tRNA, then acts as mRNA to resume translation started with the defective mRNA. The short C-terminal peptide tag added by the SsrA system marks the abortively translated protein for degradation. SmpB binds SsrA after its aminoacylation but before the coupling of the Ala to the nascent polypeptide chain and is an essential part of the SsrA peptide tagging system. SmpB has been associated with the survival of bacterial pathogens in conditions of stress. It is universal in the first 100 sequenced bacterial genomes. [Protein synthesis, Other] 144
16611 272899 TIGR00087 surE 5'/3'-nucleotidase SurE. This protein family originally was named SurE because of its role in stationary phase survival in Escherichia coli. In E. coli, surE is next to pcm, an L-isoaspartyl protein repair methyltransferase that is also required for stationary phase survival. Recent work shows that viewing SurE as an acid phosphatase (3.1.3.2) is not accurate. Rather, SurE in E. coli, Thermotoga maritima, and Pyrobaculum aerophilum acts strictly on nucleoside 5'- and 3'-monophosphates. E. coli SurE is Recommended cutoffs are 15 for homology, 40 for probable orthology, and 200 for orthology with full-length homology. [Cellular processes, Adaptations to atypical conditions] 247
16612 129196 TIGR00088 trmD tRNA (guanine-N1)-methyltransferase. This model is specific for the tRNA modification enzyme tRNA (guanine-N1)-methyltransferase (trmD). This enzyme methylates guanosine-37 in a number of tRNAs. The enzyme's catalytic activity is as follows: S-adenosyl-L-methionine + tRNA = S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine. [Protein synthesis, tRNA and rRNA base modification] 233
16613 272900 TIGR00089 TIGR00089 radical SAM methylthiotransferase, MiaB/RimO family. This subfamily contains the tRNA-i(6)A37 modification enzyme, MiaB (TIGR01574). The phylogenetic tree indicates 4 distinct clades, one of which corresponds to MiaB. The other three clades are modelled by hypothetical equivalogs (TIGR01125, TIGR01579 and TIGR01578). Together, the four models hit every sequence hit by the subfamily model without any overlap between them. This subfamily is apparently a part of a larger superfamily of enzymes utilizing both a 4Fe4S cluster and S-adenosyl methionine (SAM) to initiate radical reactions. MiaB acts on a particular isoprenylated Adenine base of certain tRNAs causing thiolation at an aromatic carbon, and probably also transferring a methyl group from SAM to the thiol. The particular substrate of the three other clades is unknown but may be very closely related. 429
16614 272901 TIGR00090 rsfS_iojap_ybeB ribosome silencing factor RsfS/YbeB/iojap. This model describes a widely distributed family of bacterial proteins related to iojap from plants. It includes RsfS(YbeB) from E. coli. The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. The conserved function of this protein is to silence ribosomes by binding the ribosomal large subunit and impairing joining with the small subunit in response to nutrient stress. Note that RsfS (starvation) is an author-endorsed change from the published symbol RsfA, which conflicted with previously published gene symbols. [Protein synthesis, Translation factors] 99
16615 161703 TIGR00091 TIGR00091 tRNA (guanine-N(7)-)-methyltransferase. This predicted S-adenosylmethionine-dependent methyltransferase is found in a single copy in most Bacteria. It is also found, with a short amino-terminal extension in eukaryotes. Its function is unknown. In E. coli, this protein flanks the DNA repair protein MutY, also called micA. [Protein synthesis, tRNA and rRNA base modification] 194
16616 129200 TIGR00092 TIGR00092 GTP-binding protein YchF. This predicted GTP-binding protein is found in a single copy in every complete bacterial genome, and is found in Eukaryotes. A more distantly related protein, separated from this model, is found in the archaea. It is known to bind GTP and double-stranded nucleic acid. It is suggested to belong to a nucleoprotein complex and act as a translation factor. [Unknown function, General] 368
16617 272902 TIGR00093 TIGR00093 pseudouridine synthase. This model identifies panels of pseudouridine synthase enzymes that make RNA modifications involved in maturing the protein translation apparatus. Counts per genome vary: two in Staphylococcus aureus, three in Pseudomonas putida, four in E. coli, etc. [Protein synthesis, tRNA and rRNA base modification] 128
16618 272903 TIGR00094 tRNA_TruD_broad tRNA pseudouridine synthase, TruD family. an EGAD loading error caused one member to be called surE, but that's an adjacent gene. MJ11364 is a strong partial match from 50 to 230 aa. [Protein synthesis, tRNA and rRNA base modification] 387
16619 188022 TIGR00095 TIGR00095 16S rRNA (guanine(966)-N(2))-methyltransferase RsmD. This model represents a family of uncharacterized bacterial proteins. Members are present in nearly every complete bacterial genome, always in a single copy. PSI-BLAST analysis shows homology to several families of SAM-dependent methyltransferases, including ribosomal RNA adenine dimethylases. [Protein synthesis, tRNA and rRNA base modification] 190
16620 129204 TIGR00096 TIGR00096 16S rRNA (cytidine(1402)-2'-O)-methyltransferase. This protein, previously known as YraL, is RsmI, one of a pair of genes involved in a unique dimethyl modification of a cytidine in 16S rRNA. See pfam00590 (tetrapyrrole methylase), which demonstrates homology between this family and other members, including several methylases for the tetrapyrrole class of compound, as well as the enzyme diphthine synthase. [Protein synthesis, tRNA and rRNA base modification] 276
16621 272904 TIGR00097 HMP-P_kinase hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. This model represents a bifunctional enzyme, phosphomethylpyrimidine kinase (EC 2.7.4.7)/Hydroxymethylpyrimidine kinase (EC 2.7.1.49), the ThiD/J protein of thiamine biosynthesis. The protein is commonly observed within operons containing other thiamine biosynthesis genes. Numerous examples are fusion proteins with other thiamine-biosynthetic domains. Saccharomyces has three recent paralogs, two of which are isofunctional and score above the trusted cutoff. The third shows a longer branch length in a phylogenetic tree and scores below the trusted cutoff, as do putative second copies in a number of species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 254
16622 272905 TIGR00099 Cof-subfamily Cof subfamily of IIB subfamily of haloacid dehalogenase superfamily. This subfamily of sequences falls within the Class-IIB subfamily (TIGR01484) of the Haloacid Dehalogenase superfamily of aspartate-nucleophile hydrolases. The use of the name "Cof" as an identifier here is arbitrary and refers to the E. coli Cof protein. This subfamily is notable for the large number of recent paralogs in many species. Listeria, for instance, has 12, Clostridium, Lactococcus and Streptococcus pneumoniae have 8 each, Enterococcus and Salmonella have 7 each, and Bacillus subtilis, Mycoplasma, Staphylococcus and E. coli have 6 each. This high degree of gene duplication is limited to the gamma proteobacteria and low-GC gram positive lineages. The profusion of genes in this subfamily is not coupled with a high degree of divergence, so it is impossible to determine an accurate phylogeny at the equivalog level. Considering the relationship of this subfamily to the other known members of the HAD-IIB subfamily (TIGR01484), sucrose and trehalose phosphatases and phosphomannomutase, it seems a reasonable hypothesis that these enzymes act on phosphorylated sugars. Possibly the diversification of genes in this subfamily represents the diverse sugars and polysaccharides that various bacteria find in their biological niches. The members of this subfamily are restricted almost exclusively to bacteria (one sequence from S. pombe scores above trusted, while another is between trusted and noise). It is notable that no archaea are found in this group, the closest relations to the archaea found here being two Deinococcus sequences. [Unknown function, Enzymes of unknown specificity] 256
16623 272906 TIGR00100 hypA hydrogenase nickel insertion protein HypA. CXXC-~12X-CXXC and genetically seems a regulatory protein. In H. pylori, a hypA mutant abolished hydrogenase activity and decreased urease activity. Nickel supplementation in media restored urease activity and partial hydrogenase activity. HypA is probably involved in inserting Ni in enzymes. [Protein fate, Protein modification and repair] 115
16624 129208 TIGR00101 ureG urease accessory protein UreG. This model represents UreG, a GTP hydrolase that acts in the assembly of the nickel metallocenter of urease. It is found only in urease-positive species, although some urease-positive species (e.g. Bacillus subtilis) lack this protein. A similar protein, hypB, is an accessory protein for expression of hydrogenase, which also uses nickel. [Central intermediary metabolism, Nitrogen metabolism] 199
16625 211546 TIGR00103 DNA_YbaB_EbfC DNA-binding protein, YbaB/EbfC family. The function of this protein is unknown, but it has been expressed and crystallized. Its gene nearly always occurs next to recR and/or dnaX. It is restricted to Bacteria and the plant Arabidopsis. The plant form contains an additional N-terminal region that may serve as a transit peptide and shows a close relationship to the cyanobacterial member, suggesting that it is a chloroplast protein. Members of this family are found in a single copy per bacterial genome, but are broadly distributed. A member is present even in the minimal gene complement of Mycoplasma genitalium. [Unknown function, General] 101
16626 129210 TIGR00104 tRNA_TsaA tRNA-Thr(GGU) m(6)t(6)A37 methyltransferase TsaA. This protein has been characterized by crystallography in complex with S-Adenosylmethionine, making it a probable S-adenosylmethionine-dependent methyltransferase. Analysis in EcoGene links this protein to the enzyme characterization mapped to the tsaA gene in Escherichia coli. [Unknown function, Enzymes of unknown specificity] 142
16627 272907 TIGR00105 L31 ribosomal protein L31. This family consists exclusively of bacterial (and organellar) 50S ribosomal protein L31. In some species, such as Bacillus subtilis, this protein exists in two forms (RpmE and YtiA), one of which (RpmE) contains a pair of motifs, CXC and CXXC, for binding zinc. [Protein synthesis, Ribosomal proteins: synthesis and modification] 68
16628 272908 TIGR00106 TIGR00106 uncharacterized protein, MTH1187 family. This protein has been crystallized in both Methanobacterium thermoautotrophicum and yeast, but its function remains unknown. Both crystal structures showed sulfate ions bound at the interface of two dimers to form a tetramer. [Unknown function, General] 97
16629 188024 TIGR00107 deoD purine-nucleoside phosphorylase, family 1 (deoD). Purine nucleoside phosphorylase (also called inosine phosphorylase) is a purine salvage enzyme. Purine nucleosides, such as guanosine, inosine, or xanthosine, plus orthophosphate, can be converted to their respective purine bases (guanine, hypoxanthine, or xanthine) plus ribose-1-phosphate. This family of purine nucleoside phosphorylase is restricted to the bacteria. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 232
16630 272909 TIGR00109 hemH ferrochelatase. Human ferrochelatase, found at the mitochondrial inner membrane inner surface, was shown in an active recombinant form to be a homodimer. This contrasts to an earlier finding by gel filtration that overexpressed E. coli ferrochelatase runs as a monomer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 322
16631 272910 TIGR00110 ilvD dihydroxy-acid dehydratase. This protein, dihydroxy-acid dehydratase, catalyzes the fourth step in valine and isoleucine biosynthesis. It contains a catalytically essential [4Fe-4S] cluster. This model generates scores of up to 150 bits vs. 6-phosphogluconate dehydratase, a homologous enzyme. [Amino acid biosynthesis, Pyruvate family] 535
16632 129217 TIGR00111 pelota mRNA surveillance protein pelota. This model describes the Drosophila protein Pelota, the budding yeast protein DOM34 which it can replace, and a set of closely related archaeal proteins. Members contain a proposed RNA binding motif. The meiotic defect in pelota mutants may be a complex result of a protein translation defect, as suggested in yeast by ribosomal protein RPS30A being a multicopy suppressor and by an altered polyribosome profile in DOM34 mutants rescued by RPS30A. This family is homologous to a family of peptide chain release factors. Pelota is proposed to act in protein translation. [Protein synthesis, Translation factors] 351
16633 272911 TIGR00112 proC pyrroline-5-carboxylate reductase. This enzyme catalyzes the final step in proline biosynthesis. Among the four paralogs in Bacillus subtilis (proG, proH, proI, and comER), ComER is the most divergent and does not prevent proline auxotrophy from mutation of the other three. It is excluded from the seed and scores between the trusted and noise cutoffs. [Amino acid biosynthesis, Glutamate family] 245
16634 272912 TIGR00113 queA S-adenosylmethionine:tRNA ribosyltransferase-isomerase. This model describes the enzyme for S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA). QueA synthesizes Queuosine which is usually in the first position of the anticodon of tRNAs specific for asparagine, aspartate, histidine, and tyrosine. [Protein synthesis, tRNA and rRNA base modification] 344
16635 211550 TIGR00114 lumazine-synth 6,7-dimethyl-8-ribityllumazine synthase. This enzyme catalyzes the cyclo-ligation of 3,4-dihydroxy-2-butanone-4-P and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione to form 6,7-dimethyl-8-ribityllumazine, the immediate precursor of riboflavin. Sometimes referred to as riboflavin synthase, beta subunit, this should not be confused with the alpha subunit which carries out the subsequent reaction. Archaeal members of this family are considered putative, although included in the seed and scoring above the trusted cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 138
16636 272913 TIGR00115 tig trigger factor. Trigger factor is a ribosome-associated molecular chaperone and is the first chaperone to interact with nascent polypeptide. Trigger factor can bind at the same time as the signal recognition particle (SRP), but is excluded by the SRP receptor (FtsY). The central domain of trigger factor has peptidyl-prolyl cis/trans isomerase activity. This protein is found in a single copy in virtually every bacterial genome. [Protein fate, Protein folding and stabilization] 410
16637 272914 TIGR00116 tsf translation elongation factor Ts. Translational elongation factor Ts (EF-Ts) catalyzes the exchange of GTP for the GDP of the EF-Tu.GDP complex as part of the cycle of translation elongation. This protein is found in Bacteria, mitochondria, and chloroplasts. [Protein synthesis, Translation factors] 291
16638 129223 TIGR00117 acnB aconitate hydratase 2. Aconitate hydratase (aconitase) is an enzyme of the TCA cycle. This model describes aconitase 2, AcnB, which has weak similarity to aconitase 1. It is found almost exclusively in the Proteobacteria. [Energy metabolism, TCA cycle] 844
16639 272915 TIGR00118 acolac_lg acetolactate synthase, large subunit, biosynthetic type. Two groups of proteins form acetolactate from two molecules of pyruvate. The type of acetolactate synthase described in this model also catalyzes the formation of acetohydroxybutyrate from pyruvate and 2-oxobutyrate, an early step in the branched chain amino acid biosynthesis; it is therefore also termed acetohydroxyacid synthase. In bacteria, this catalytic chain is associated with a smaller regulatory chain in an alpha2/beta2 heterotetramer. Acetolactate synthase is a thiamine pyrophosphate enzyme. In this type, FAD and Mg++ are also found. Several isozymes of this enzyme are found in E. coli K12, one of which contains a frameshift in the large subunit gene and is not expressed. [Amino acid biosynthesis, Pyruvate family] 558
16640 272916 TIGR00119 acolac_sm acetolactate synthase, small subunit. Acetolactate synthase is a heterodimeric thiamine pyrophosphate enzyme with large and small subunits. One of the three isozymes in E. coli K12 contains a frameshift in the large subunit gene and is not expressed. Acetohydroxyacid synthase is a synonym. [Amino acid biosynthesis, Pyruvate family] 157
16641 161718 TIGR00120 ArgJ glutamate N-acetyltransferase/amino-acid acetyltransferase. This enzyme can acetylate Glu to N-acetyl-Glu by deacetylating N-2-acetyl-ornithine into ornithine; the two halves of this reaction represent the first and fifth steps in the synthesis of Arg (or citrulline) from Glu by way of ornithine (EC 2.3.1.35). In Bacillus stearothermophilus, but not in Thermus thermophilus HB27, the enzyme is bifunctional and can also use acetyl-CoA to acetylate Glu (EC 2.3.1.1). [Amino acid biosynthesis, Glutamate family] 404
16642 272917 TIGR00121 birA_ligase birA, biotin-[acetyl-CoA-carboxylase] ligase region. This model represents the biotin--acetyl-CoA-carboxylase ligase region of biotin--acetyl-CoA-carboxylase ligase. In Escherichia coli and some other species, this enzyme is part of a bifunctional protein BirA that includes a small, N-terminal biotin operon repressor domain. Proteins identified by this model should not be called bifunctional unless they are also identified by birA_repr_reg (TIGR00122). The protein name suggests that this enzyme transfers biotin only to acetyl-CoA-carboxylase but it also transfers the biotin moiety to other proteins. The apparent orthologs among the eukaryotes are larger proteins that contain a single copy of this domain. [Protein fate, Protein modification and repair] 237
16643 272918 TIGR00122 birA_repr_reg BirA biotin operon repressor domain. This model represents the amino-terminal helix-turn-helix repressor region of the biotin--acetyl-CoA-carboxylase ligase/biotin operon repressor bifunctional protein BirA. In many species, the biotin--acetyl-CoA-carboxylase ligase ortholog lacks this DNA-binding repressor region and therefore is not equivalent to the well-characterized BirA of E. coli. This model may recognize some other putative repressor proteins, such as DnrO of Streptomyces peucetius with scores below the noise cutoff but with significance shown by low E-value. [Regulatory functions, DNA interactions] 69
16644 272919 TIGR00123 cbiM cobalamin biosynthesis protein CbiM. A cutoff of 200 bits for trusted orthologs of cbiM is suggested. Scores lower than 200 but higher than 20 may be considered sufficient to call a protein cobalamin biosynthesis protein CbiM-related. The seed alignment for this model is a cluster of very closely related proteins from Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Methanococcus jannaschii, and Salmonella typhimurium, each of which has greater than 50% identity to all the others. The ortholog from Salmonella is the source of the gene symbol cbiM for this set. In Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, and Methanococcus jannaschii, a second homolog of cbiM is also found. These cbiM-related proteins appear to represent a distinct but less well-conserved orthologous group. Still more distant homologs include sll0383 from Synechocystis sp. and HI1621 from Haemophilus influenzae; the latter protein, from a species that does not synthesize cobalamin, is the most divergent member of the group. The functions of and relationships among the set of proteins homologous to cbiM have not been determined. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 214
16645 129230 TIGR00124 cit_ly_ligase [citrate (pro-3S)-lyase] ligase. ATP is cleaved to AMP and pyrophosphate during the reaction. The carboxyl end is homologous to a number of cytidyltransferases that also release pyrophosphate. [Energy metabolism, Fermentation, Protein fate, Protein modification and repair] 332
16646 272920 TIGR00125 cyt_tran_rel cytidyltransferase-like domain. Protein families that contain at least one copy of this domain include citrate lyase ligase, pantoate-beta-alanine ligase, glycerol-3-phosphate cytidyltransferase, ADP-heptose synthase, phosphocholine cytidylyltransferase, lipopolysaccharide core biosynthesis protein KdtB, the bifunctional protein NadR, and a number whose function is unknown. Many of these proteins are known to use CTP or ATP and release pyrophosphate. 66
16647 272921 TIGR00126 deoC deoxyribose-phosphate aldolase. Deoxyribose-phosphate aldolase is involved in the catabolism of nucleotides and deoxyribonucleotides. The catalytic process is as follows: 2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde. It is found in both gram-positive and gram-negative bacteria. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other] 211
16648 129233 TIGR00127 nadp_idh_euk isocitrate dehydrogenase, NADP-dependent, eukaryotic type. This model describes a eukaryotic, NADP-dependent form of isocitrate dehydrogenase. These eukaryotic enzymes differ considerably from a fairly tight cluster that includes all other related isocitrate dehydrogenases, 3-isopropylmalate dehydrogenases, and tartrate dehydrogenases. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. This model does not discriminate cytosolic, mitochondrial, and chloroplast proteins. However, the model starts very near the amino end of the cytosolic form; the finding of additional amino-terminal sequence may indicate a transit peptide. [Energy metabolism, TCA cycle] 409
16649 272922 TIGR00128 fabD malonyl CoA-acyl carrier protein transacylase. This enzyme of fatty acid biosynthesis transfers the malonyl moiety from coenzyme A to acyl-carrier protein. The seed alignment for this family of proteins contains a single member each from a number of bacterial species but also an additional pair of closely related, uncharacterized proteins from B. subtilis, one of which has a long C-terminal extension. [Fatty acid and phospholipid metabolism, Biosynthesis] 290
16650 272923 TIGR00129 fdhD_narQ formate dehydrogenase family accessory protein FdhD. FdhD in E. coli and NarQ in B. subtilis are required for the activity of formate dehydrogenase. The gene name in B. subtilis reflects the requirement of the neighboring gene narA for nitrate assimilation, for which NarQ is not required. In some species, the gene is associated not with a known formate dehydrogenase but with a related putative molybdopterin-binding oxidoreductase. A reasonable hypothesis is that this protein helps prepare a required cofactor for assembly into the holoenzyme. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 237
16651 161726 TIGR00130 frhD coenzyme F420-reducing hydrogenase delta subunit (putative coenzyme F420 hydrogenase processing subunit). FrhD is not part of the active FRH heterotrimer, but is probably a protease required for maturation. Alternative name: 8-hydroxy-5-deazaflavin (F420) reducing hydrogenase (FRH) subunit delta. [Protein fate, Protein modification and repair] 153
16652 272924 TIGR00131 gal_kin galactokinase. Galactokinase is a member of the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase) and shares with them an amino-terminal domain probably related to ATP binding. The galactokinases found by this model are divided into two sets. Prokaryotic forms are generally shorter. The eukaryotic forms are longer because of additional central regions and in some cases are known to be bifunctional, with regulatory activities that are independent of galactokinase activity. [Energy metabolism, Sugars] 386
16653 272925 TIGR00132 gatA aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, A subunit. In many species, Gln--tRNA ligase is missing. tRNA(Gln) is misacylated with Glu after which a heterotrimeric amidotransferase converts Glu to Gln. This model represents the amidase chain of that heterotrimer, encoded by the gatA gene. In the Archaea, Asn--tRNA ligase is also missing. This amidase subunit may also function in the conversion of Asp-tRNA(Asn) to Asn-tRNA(Asn), presumably with a different recognition unit to replace gatB. Both Methanococcus jannaschii and Methanobacterium thermoautotrophicum have both authentic gatB and a gatB-related gene, but only one gene like gatA. It has been shown that gatA can be expressed only when gatC is also expressed. In most species expressing the amidotransferase, the gatC ortholog is about 90 residues in length, but in Mycoplasma genitalium and Mycoplasma pneumoniae the gatC equivalent is found as the C-terminal domain of a much longer protein. Not surprisingly, the Mycoplasmas also represent the most atypical lineage of gatA orthology. This orthology group is more narrowly defined here than in Proc Natl Acad Sci USA 94, 11819-11826 (1997). In particular, a Rhodococcus homolog found in association with nitrile hydratase genes and described as an enantiomer-selective amidase active on several 2-aryl propionamides, is excluded here. It is likely, however, that the amidase subunit GatA is not exclusively a part of the Glu-tRNA(Gln) amidotransferase heterotrimer and restricted to that function in all species. [Protein synthesis, tRNA aminoacylation] 460
16654 272926 TIGR00133 gatB aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, B subunit. The heterotrimer GatABC is responsible for transferring the NH2 group that converts Glu to Gln, or Asp to Asn after the Glu or Asp has been ligated to the tRNA for Gln or Asn, respectively. In Lactobacillus, GatABC is responsible only for tRNA(Gln). In the Archaea, GatABC is responsible only for tRNA(Asn), while GatDE is responsible for tRNA(Gln). In lineages that include Thermus, Chlamydia, or Acidithiobacillus, the GatABC complex catalyzes both. [Protein synthesis, tRNA aminoacylation] 478
16655 213509 TIGR00134 gatE_arch glutamyl-tRNA(Gln) amidotransferase, subunit E. This peptide is found only in the Archaea. It is paralogous to the gatB-encoded subunit of Glu-tRNA(Gln) amidotransferase. The GatABC system operates in many bacteria to convert Glu-tRNA(Gln) into Gln-tRNA(Gln). However, the homologous system in archaea instead converts Asp-tRNA(Asn) to Asn-tRNA(Asn). Glu-tRNA(Gln) is converted to Gln-tRNA(Gln) by a heterodimeric amidotransferase of GatE (this protein) and GatD. The Archaea have an Asp-tRNA(Asn) amidotransferase instead of an Asp--tRNA ligase, but the genes have not been identified. It is likely that this protein replaces gatB in Asp-tRNA(Asn) amidotransferase but that both enzymes share gatA. [Protein synthesis, tRNA aminoacylation] 620
16656 129241 TIGR00135 gatC aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, C subunit. Archaea, organelles, and many bacteria charge Gln-tRNA by first misacylating it with Glu and then amidating Glu to Gln. This small protein is part of the amidotransferase heterotrimer and appears to be important to the stability of the amidase subunit encoded by gatA, but its function may not be required in every organism that expresses gatA and gatB. The seed alignment for this model does not include any eukaryotic sequence and is not guaranteed to find eukaryotic examples, although it does find some. Saccharomyces cerevisiae, which expresses the amidotransferase for mitochondrial protein translation, seems to lack a gatC ortholog. This model has been revised to remove the candidate sequence from Methanococcus jannaschii, now part of a related model. [Protein synthesis, tRNA aminoacylation] 93
16657 272927 TIGR00136 gidA glucose-inhibited division protein A. GidA, the longer of two forms of GidA-related proteins, appears to be present in all complete eubacterial genomes so far, as well as Saccharomyces cerevisiae. A subset of these organisms have a closely related protein. GidA is absent in the Archaea. It appears to act with MnmE, in an alpha2/beta2 heterotetramer, in the 5-carboxymethylaminomethyl modification of uridine 34 in certain tRNAs. The shorter, related protein, previously called gid or gidA(S), is now called TrmFO (see model TIGR00137). [Protein synthesis, tRNA and rRNA base modification] 616
16658 129243 TIGR00137 gid_trmFO tRNA:m(5)U-54 methyltransferase. This model represents an orthologous set of proteins present in relatively few bacteria but very tightly conserved where it occurs. It is closely related to gidA (glucose-inhibited division protein A), which appears to be present in all complete eubacterial genomes so far and in Saccharomyces cerevisiae. It was designated gid but is now recognized as a tRNA:m(5)U-54 methyltransferase and is now designated trmFO. [Protein synthesis, tRNA and rRNA base modification] 433
16659 272928 TIGR00138 rsmG_gidB 16S rRNA (guanine(527)-N(7))-methyltransferase RsmG. RsmG was previously called GidB (glucose-inhibited division protein B). It is present as a single copy in nearly all complete eubacterial genomes. It is missing only from some obligate intracellular species of various lineages (Chlamydiae, Ehrlichia, Wolbachia, Anaplasma, Buchnera, etc.). RsmG shows a methyltransferase fold in its crystal structure, and acts as a 7-methylguanosine (m(7)G) methyltransferase, apparently specific to 16S rRNA. [Protein synthesis, tRNA and rRNA base modification] 181
16660 129245 TIGR00139 h_aconitase homoaconitase. Homoaconitase is known only as a fungal enzyme from two species, where it is part of an unusual lysine biosynthesis pathway. Because this model is based on just two sequences from a narrow taxonomic range, it may not recognize distant orthologs, should any exist. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures, but 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble leuC and leuD over their lengths but are even closer to the respective domains of homoaconitase, and their identity is uncertain. [Amino acid biosynthesis, Aspartate family] 712
16661 129246 TIGR00140 hupD hydrogenase expression/formation protein. Valid names: hupD, hynC, hoxM. C at 64 and 67 are believed to be metal binding. Postulated to be involved in processing of hydrogenase. Superfamily suggests that it is a peptidase/protease. [Protein fate, Protein modification and repair] 134
16662 129247 TIGR00142 hycI hydrogenase maturation protease HycI. Hydrogenase maturation protease is a protease that is involved in the C-terminal processing of HycE, the large subunit of hydrogenase 3 from E. coli. This protein seems to be found in E. coli and in Archaea. [Protein fate, Protein modification and repair] 146
16663 272929 TIGR00143 hypF [NiFe] hydrogenase maturation protein HypF. A previously described regulatory effect of HypF mutation is attributable to loss of activity of a regulatory hydrogenase. A zinc finger-like region, CXXCX(18)CXXCX(24)CXXCX(18)CXXC, further supported the regulatory hypothesis. However, more recent work (PUBMED:11375153) shows the direct effect is on the activity of expressed hydrogenases with nickel/iron centers, rather than on expression. [Protein fate, Protein modification and repair] 711
16664 129249 TIGR00144 beta_RFAP_syn beta-RFAP synthase. This protein family contains several archaeal examples of beta-ribofuranosylaminobenzene 5-prime-phosphate synthase (beta-RFAP synthase), an enzyme involved in methanopterin biosynthesis. In some species, two members of this family are found. It is unclear whether both act as beta-RFAP synthase. This family is related to the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase). Members are found so far only in the Archaea and in Methylobacterium extorquens. [Unknown function, Enzymes of unknown specificity] 324
16665 272930 TIGR00145 TIGR00145 FTR1 family protein. A characterized member from yeast acts as oxidase-coupled high affinity iron transporter. Note that the apparent member from E. coli K12-MG1655 has a frameshift by homology with member sequences from other species. [Unknown function, General] 283
16666 161732 TIGR00147 TIGR00147 lipid kinase, YegS/Rv2252/BmrU family. The E. coli member of this family, YegS has been purified and shown to have phosphatidylglycerol kinase activity. The member from M. tuberculosis, Rv2252, has diacylglycerol kinase activity. BmrU from B. subtilis is in an operon with multidrug efflux transporter Bmr, but is uncharacterized. [Unknown function, Enzymes of unknown specificity] 293
16667 129252 TIGR00148 TIGR00148 UbiD family decarboxylase. The member of this family in E. coli is UbiD, 3-octaprenyl-4-hydroxybenzoate carboxy-lyase. The family described by this model, however, is broad enough that it is likely to contain several different decarboxylases. Found in bacteria, archaea, and yeast, with two members in A. fulgidus. No homologs were detected besides those classified as orthologs. The member from H. pylori has a C-terminal extension of just over 100 residues that is shared in part by the Aquifex aeolicus homolog. [Unknown function, General] 438
16668 129253 TIGR00149 TIGR00149_YjbQ secondary thiamine-phosphate synthase enzyme. Members of this protein family have been studied extensively by crystallography. Members from several different species have been shown to have sufficient thiamin phosphate synthase activity (EC 2.5.1.3) to complement thiE mutants. However, it is presumed that this is a secondary activity, and the primary function of this enzyme remains unknown. [Unknown function, Enzymes of unknown specificity] 132
16669 129254 TIGR00150 T6A_YjeE tRNA threonylcarbamoyl adenosine modification protein YjeE. This protein family belongs to a four-gene system responsible for the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family have a conserved nucleotide-binding motif GXXGXGKT and a nucleotide-binding fold. Member protein YjeE of Haemophilus influenzae (HI0065) was shown to have (weak) ATPase activity. [Protein synthesis, tRNA and rRNA base modification] 133
16670 129255 TIGR00151 ispF 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase. Members of this protein family are 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, the IspF protein of the deoxyxylulose (non-mevalonate) pathway of IPP biosynthesis. This protein occurs as an IspDF bifunctional fusion protein in about 20 percent of bacterial genomes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 155
16671 272931 TIGR00152 TIGR00152 dephospho-CoA kinase. This model produces scores in the range of 0-25 bits against adenylate, guanylate, uridine, and thymidylate kinases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 190
16672 272932 TIGR00153 TIGR00153 TIGR00153 family protein. An apparent homolog with a suggested function is Pit accessory protein from Sinorhizobium meliloti, which may be involved in phosphate (Pi) transport. [Hypothetical proteins, Conserved] 216
16673 188029 TIGR00154 ispE 4-diphosphocytidyl-2C-methyl-D-erythritol kinase. Members of this family of GHMP kinases were previously designated as conserved hypothetical protein YchB or as isopentenyl monophosphate kinase. It is now known, in tomato and E. coli, to encode 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, an enzyme of the deoxyxylulose phosphate pathway of terpenoid biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 294
16674 272933 TIGR00155 pqiA_fam integral membrane protein, PqiA family. This family consists of uncharacterized predicted integral membrane proteins found, so far, only in the Proteobacteria. Of two members in E. coli, one is induced by paraquat and is designated PqiA, paraquat-inducible protein A. [Unknown function, General] 403
16675 129260 TIGR00156 TIGR00156 TIGR00156 family protein. As of the last revision, this family consists only of two proteins from Escherichia coli and one from the related species Haemophilus influenzae. [Hypothetical proteins, Conserved] 126
16676 272934 TIGR00157 TIGR00157 ribosome small subunit-dependent GTPase A. Members of this protein family were designated YjeQ and are now designated RsgA (ribosome small subunit-dependent GTPase A). The strongest motif in the alignment of these proteins is GXSGVGKS[ST], a classic P-loop for nucleotide binding. This protein has been shown to cleave GTP and remain bound to GDP. A role as a regulator of translation has been suggested. The Aquifex aeolicus ortholog is split into consecutive open reading frames. Consequently, this model was built in fragment mode (-f option). [Protein synthesis, Translation factors] 245
16677 129262 TIGR00158 L9 ribosomal protein L9. Ribosomal protein L9 appears to be universal in, but restricted to, eubacteria and chloroplast. [Protein synthesis, Ribosomal proteins: synthesis and modification] 148
16678 129263 TIGR00159 TIGR00159 TIGR00159 family protein. These proteins have no detectable global or local homology to any protein of known function. Members are restricted to the bacteria and found broadly in lineages other than the Proteobacteria. [Hypothetical proteins, Conserved] 211
16679 272935 TIGR00160 MGSA methylglyoxal synthase. Methylglyoxal synthase (MGS) generates methylglyoxal (MG), a toxic metabolite (that may also be a regulatory metabolite and) that is detoxified, principally, through a pathway involving glutathione and glyoxalase I. Totemeyer et al. propose that, during a loss of control over carbon flux, with accumulation of phosphorylated sugars and depletion of phosphate, as might happen during a rapid shift to a richer medium, MGS aids the cell by converting some dihydroxyacetone phosphate (DHAP) to MG and phosphate. This is therefore an alternative to triosephosphate isomerase and the remainder of the glycolytic pathway for the disposal of DHAP during the stress of a sudden increase in available sugars. [Energy metabolism, Other] 143
16680 129265 TIGR00161 TIGR00161 TIGR00161 family protein. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in the Archaea. This ortholog set includes MJ0106 from Methanococcus jannaschii and AF1251 from Archaeoglobus fulgidus, but not MJ1210 or AF0525. [Hypothetical proteins, Conserved] 238
16681 129266 TIGR00162 TIGR00162 TIGR00162 family protein. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in but are universal among the Archaea. This ortholog set includes MJ1210 from Methanococcus jannaschii and AF0525 from Archaeoglobus fulgidus while excluding MJ0106 and AF1251. [Hypothetical proteins, Conserved] 188
16682 272936 TIGR00163 PS_decarb phosphatidylserine decarboxylase precursor. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein. A closely related family, possibly also active as phosphatidylserine decarboxylase, falls under model TIGR00164. [Fatty acid and phospholipid metabolism, Biosynthesis] 238
16683 129268 TIGR00164 PS_decarb_rel phosphatidylserine decarboxylase precursor-related protein. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein. This protein has many regions of homology to known phosphatidylserine decarboxylases, including the Gly-Ser motif for chain cleavage and active site generation, but has a shorter amino end and a number of deletions along the length of the alignment to the phosphatidylserine decarboxylases. It is unclear whether this protein is a form of phosphatidylserine decarboxylase or is a related enzyme. It is found in Neisseria gonorrhoeae, Mycobacterium tuberculosis, and several archaeal species, all of which lack known phosphatidylserine decarboxylase. [Unknown function, General] 189
16684 272937 TIGR00165 S18 ribosomal protein S18. This ribosomal small subunit protein is found in all eubacteria so far, as well as in chloroplasts. YER050C from Saccharomyces cerevisiae and a related protein from Caenorhabditis elegans appear to be homologous and may represent mitochondrial forms. The trusted cutoff is set high enough that these two candidate S18 proteins are not categorized automatically. [Protein synthesis, Ribosomal proteins: synthesis and modification] 70
16685 129270 TIGR00166 S6 ribosomal protein S6. The ribosomal protein S6 ortholog family, including yeast MRP17, shows more than two-fold length variation from 95 residues in Bacillus subtilis to 215 in Mycoplasma pneumoniae. This length variation comes primarily from poorly conserved C-terminal extensions that are particularly long in the Mycoplasmas. MRP17 protein is a component of the small ribosomal subunit in mitochondria, and is shown here to be an ortholog of S6. [Protein synthesis, Ribosomal proteins: synthesis and modification] 93
16686 272938 TIGR00167 cbbA ketose-bisphosphate aldolase. This model is under revision. Proteins found by this model include fructose-bisphosphate and tagatose-bisphosphate aldolase. [Energy metabolism, Glycolysis/gluconeogenesis] 288
16687 129272 TIGR00168 infC translation initiation factor IF-3. infC uses abnormal initiation codons such as AUA, AUC, and CUG, which render its expression particularly sensitive to excess of its gene product IF-3, thereby regulating its own expression. [Protein synthesis, Translation factors] 165
16688 272939 TIGR00169 leuB 3-isopropylmalate dehydrogenase. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the dimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. Among these decarboxylating dehydrogenases of hydroxyacids, overall sequence homology indicates evolutionary history rather than actual substrate or cofactor specificity, which may be toggled experimentally by replacement of just a few amino acids. 3-isopropylmalate dehydrogenase is an NAD-dependent enzyme and should have a sequence resembling HGSAPDI around residue 340. The substrate binding loop should include a sequence resembling E[KQR]X(0,1)LLXXR around residue 115. Other contacts of importance are known from crystallography but not detailed here. This model will not find all isopropylmalate dehydrogenases; the enzyme from Sulfolobus sp. strain 7 is more similar to mitochondrial NAD-dependent isocitrate dehydrogenases than to other known isopropylmalate dehydrogenases and was omitted to improve the specificity of the model. It scores below the cutoff and below some enzymes known not to be isopropylmalate dehydrogenase. [Amino acid biosynthesis, Pyruvate family] 346
16689 272940 TIGR00170 leuC 3-isopropylmalate dehydratase, large subunit. Members of this family are 3-isopropylmalate dehydratase, large subunit, or the large subunit domain of single-chain forms. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. These homologs are now described by a separate model of subfamily (rather than equivalog) homology type, and the priors and cutoffs for this model have been changed to focus this equivalog family more narrowly. [Amino acid biosynthesis, Pyruvate family] 465
16690 129275 TIGR00171 leuD 3-isopropylmalate dehydratase, small subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. The candidate archaeal leuD proteins are not included in the seed alignment for this model and score below the trusted cutoff. [Amino acid biosynthesis, Pyruvate family] 188
16691 129276 TIGR00172 maf MAF protein. This nonessential gene causes inhibition of septation when overexpressed. A member of the family is found in the Archaeon Pyrococcus horikoshii and another in the round worm Caenorhabditis elegans. [Cellular processes, Cell division] 183
16692 272941 TIGR00173 menD 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylic-acid synthase. MenD was thought until recently to act as SHCHC synthase, but has recently been shown to act instead as SEPHCHC synthase. Conversion of SEPHCHC into SHCHC and pyruvate may occur spontaneously but is catalyzed efficiently, at least in some organisms, by MenH (see TIGR03695). 2-oxoglutarate decarboxylase/SHCHC synthase (menD) is a thiamine pyrophosphate enzyme involved in menaquinone biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 432
16693 213512 TIGR00174 miaA tRNA dimethylallyltransferase. Alternate names include delta(2)-isopentenylpyrophosphate transferase, IPP transferase, 2-methylthio-N6-isopentyladenosine tRNA modification enzyme. Catalyzes the first step in the modification of an adenosine near the anticodon to 2-methylthio-N6-isopentyladenosine. Understanding of substrate specificity has changed. [Protein synthesis, tRNA and rRNA base modification] 287
16694 272942 TIGR00175 mito_nad_idh isocitrate dehydrogenase, NAD-dependent, mitochondrial type. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. Mitochondrial NAD-dependent isocitrate dehydrogenases (IDH) resemble prokaryotic NADP-dependent IDH and 3-isopropylmalate dehydrogenase (an NAD-dependent enzyme) more closely than they resemble eukaryotic NADP-dependent IDH. The mitochondrial NAD-dependent isocitrate dehydrogenase is believed to be an alpha(2)-beta-gamma heterotetramer. All subunits are homologous and found by this model. The NADP-dependent IDH of Thermus aquaticus thermophilus strain HB8 resembles these NAD-dependent IDH, except for the residues involved in cofactor specificity, much more closely than it resembles other prokaryotic NADP-dependent IDH, including that of Thermus aquaticus strain YT1. [Energy metabolism, TCA cycle] 333
16695 272943 TIGR00176 mobB molybdopterin-guanine dinucleotide biosynthesis protein MobB. This molybdenum cofactor biosynthesis enzyme is similar to the urease accessory protein UreG and to the hydrogenase accessory protein HypB, both GTP hydrolases involved in loading nickel into the metallocenters of their respective target enzymes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 155
16696 272944 TIGR00177 molyb_syn molybdenum cofactor synthesis domain. The Drosophila protein cinnamon, the Arabidopsis protein cnx1, and rat protein gephyrin each have one domain like MoeA and one like MoaB and Mog. These domains are, however, distantly related to each other, as captured by this model. Gephyrin is unusual in that it seems to be a tubulin-binding neuroprotein involved in the clustering of both glycine receptors and GABA receptors, rather than a protein of molybdenum cofactor biosynthesis. 148
16697 129282 TIGR00178 monomer_idh isocitrate dehydrogenase, NADP-dependent, monomeric type. The monomeric type of isocitrate dehydrogenase has been found so far in a small number of species, including Azotobacter vinelandii, Corynebacterium glutamicum, Rhodomicrobium vannielii, and Neisseria meningitidis. It is NADP-specific. [Energy metabolism, TCA cycle] 741
16698 272945 TIGR00179 murB UDP-N-acetylenolpyruvoylglucosamine reductase. This model describes MurB, UDP-N-acetylenolpyruvoylglucosamine reductase, which is also called UDP-N-acetylmuramate dehydrogenase. It is part of the pathway for the biosynthesis of the UDP-N-acetylmuramoyl-pentapeptide that is a precursor of bacterial peptidoglycan. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 284
16699 272946 TIGR00180 parB_part ParB/RepB/Spo0J family partition protein. This model represents the most well-conserved core of a set of chromosomal and plasmid partition proteins related to ParB, including Spo0J, RepB, and SopB. Spo0J has been shown to bind a specific DNA sequence that, when introduced into a plasmid, can serve as partition site. Study of RepB, which has nicking-closing activity, suggests that it forms a transient protein-DNA covalent intermediate during the strand transfer reaction. 187
16700 272947 TIGR00181 pepF oligoendopeptidase F. This family represents the oligoendopeptidase F clade of the larger M3 or thimet (for thiol-dependent metallopeptidase) oligopeptidase family. Lactococcus lactis PepF hydrolyzed peptides of 7 and 17 amino acids with fairly broad specificity. The homolog of lactococcal PepF in group B Streptococcus was named PepB, with the name difference reflecting a difference in species of origin rather than activity; substrate profiles were quite similar. Differences in substrate specificity should be expected in other species. The gene is duplicated in Lactococcus lactis on the plasmid that bears it. A shortened second copy is found in Bacillus subtilis. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 591
16701 129286 TIGR00182 plsX fatty acid/phospholipid synthesis protein PlsX. This protein of fatty acid/phospholipid biosynthesis, called PlsX after the member in Streptococcus pneumoniae, is proposed to be a phosphate acyltransferase that partners with PlsY (TIGR00023) in a two-step 1-acylglycerol-3-phosphate biosynthesis pathway alternative to the one-step PlsB (EC 2.3.1.15) pathway. [Fatty acid and phospholipid metabolism, Biosynthesis] 322
16702 272948 TIGR00183 prok_nadp_idh isocitrate dehydrogenase, NADP-dependent, prokaryotic type. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site. Prokaryotic NADP-dependent isocitrate dehydrogenases resemble their NAD-dependent counterparts and 3-isopropylmalate dehydrogenase (an NAD-dependent enzyme) more closely than they resemble eukaryotic NADP-dependent isocitrate dehydrogenases. [Energy metabolism, TCA cycle] 416
16703 272949 TIGR00184 purA adenylosuccinate synthase. Alternate name IMP--aspartate ligase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 425
16704 211559 TIGR00185 tRNA_yibK_trmL tRNA (cytidine(34)-2'-O)-methyltransferase. TrmL (previously YibK) is responsible for 2'-O-methylation at tRNA(Leu) position 34. [Protein synthesis, tRNA and rRNA base modification] 153
16705 129290 TIGR00186 rRNA_methyl_3 rRNA methylase, putative, group 3. this is part of the trmH (spoU) family of rRNA methylases [Protein synthesis, tRNA and rRNA base modification] 237
16706 272950 TIGR00187 ribE riboflavin synthase, alpha subunit. This protein family consists almost entirely of two lumazine-binding domains, described in the family Lum_binding from Pfam. The model generates lower scores against other proteins that also have two lumazine-binding domains, including some involved in bioluminescence. The name ribE was selected, from among alternatives including ribB and ribC, to match the usage in EcoCyc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 200
16707 211560 TIGR00188 rnpA ribonuclease P protein component, eubacterial. This peptide is the protein component of a ribonucleoprotein that cleaves the leader sequence from each tRNA precursor to leave the mature 5'-terminus. The catalytic site is in the RNA component, M1 RNA. The yeast mitochondrial RNase P protein component gene RPM2 has no obvious sequence similarity to rnpA, but resembles eukaryotic nuclear RNase P instead. [Transcription, RNA processing] 111
16708 272951 TIGR00189 tesB acyl-CoA thioesterase II. Function: hydrolyzes a broad range of acyl-CoA thioesters. Physiological function is not known. Subunit: homotetramer. [Fatty acid and phospholipid metabolism, Biosynthesis] 271
16709 129294 TIGR00190 thiC phosphomethylpyrimidine synthase. The thiC ortholog is designated thiA in Bacillus subtilis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 423
16710 129295 TIGR00191 thrB homoserine kinase. Homoserine kinase is part of the threonine biosynthetic pathway. Homoserine kinase is a member of the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase) and shares with them an amino-terminal domain probably related to ATP binding. P. aeruginosa homoserine kinase seems not to be homologous (see PROSITE:PDOC0054) [Amino acid biosynthesis, Aspartate family] 302
16711 129296 TIGR00192 urease_beta urease, beta subunit. In a number of species, including B.subtilis, Synechocystis, and Haemophilus influenzae, urease subunits beta and gamma are encoded as separate polypeptides. In Helicobacter pylori UreA and in the fission yeast Schizosaccharomyces pombe, beta subunit-like sequence follows gamma subunit-like sequence in a single chain; the fission yeast protein contains additional C-terminal regions. [Central intermediary metabolism, Nitrogen metabolism] 101
16712 272952 TIGR00193 urease_gam urease, gamma subunit. In a number of species, including B.subtilis, Synechocystis, and Haemophilus influenzae, urease subunits beta and gamma are encoded as separate polypeptides. In Helicobacter pylori UreA and in the fission yeast Schizosaccharomyces pombe, beta subunit-like sequence follows gamma subunit-like sequence in a single chain; the fission yeast protein contains additional C-terminal regions. Nomenclature for the various subunits of urease in Helicobacter differs from nomenclature in most other species. [Central intermediary metabolism, Nitrogen metabolism] 102
16713 272953 TIGR00194 uvrC excinuclease ABC, C subunit. This family consists of the DNA repair enzyme UvrC, an ABC excinuclease subunit which interacts with the UvrA/UvrB complex to excise UV-damaged nucleotide segments. [DNA metabolism, DNA replication, recombination, and repair] 574
16714 272954 TIGR00195 exoDNase_III exodeoxyribonuclease III. The model brings in reverse transcriptases at scores below 50, model also contains eukaryotic apurinic/apyrimidinic endonucleases which group in the same family [DNA metabolism, DNA replication, recombination, and repair] 254
16715 272955 TIGR00196 yjeF_cterm yjeF C-terminal region, hydroxyethylthiazole kinase-related. E. coli yjeF has full-length orthologs in a number of species, all of unknown function. However, yeast YNL200C is homologous and corresponds to the N-terminal region while yeast YKL151C and B. subtilis yxkO correspond to this C-terminal region only. The present model may hit hydroxyethylthiazole kinase, an enzyme associated with thiamine biosynthesis. [Unknown function, General] 270
16716 272956 TIGR00197 yjeF_nterm yjeF N-terminal region. The protein region corresponding to this model shows no clear homology to any protein of known function. This model is built on yeast protein YNL200C and the N-terminal regions of E. coli yjeF and its orthologs in various species. The C-terminal region of yjeF and its orthologs shows similarity to hydroxyethylthiazole kinase (thiM) and other enzymes involved in thiamine biosynthesis. Yeast YKL151C and B. subtilis yxkO match the yjeF C-terminal domain but lack this region. [Unknown function, General] 205
16717 272957 TIGR00198 cat_per_HPI catalase/peroxidase HPI. As catalase, this enzyme catalyzes the dismutation of two molecules of hydrogen peroxide to dioxygen and two molecules of water. As a peroxidase, it uses hydrogen peroxide to oxidize donor compounds and produce water. KatG from E. coli is a homotetramer with two non-covalently associated iron protoheme IX groups per tetramer, but the ortholog from Synechococcus sp. is a homodimer with one protoheme. Important sites (numbered according to E. coli KatG) include heme ligands His-106 and His-267 and active site Trp-318. Note that the translation PID:g296476 from accession X71420 from Rhodobacter capsulatus B10 contains extensive frameshift differences from the rest of the orthologous family. [Cellular processes, Detoxification] 716
16718 129303 TIGR00199 PncC_domain amidohydrolase, PncC family. CinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species. Several bacterial species have a protein consisting largely of the C-terminal domain of CinA but lacking the N-terminal domain, including nicotinamide mononucleotide (NMN) deamidase (3.5.1.42) proteins PncC in Shewanella oneidensis and ygaD in E. coli. [DNA metabolism, DNA replication, recombination, and repair] 146
16719 161761 TIGR00200 cinA_nterm competence/damage-inducible protein CinA N-terminal domain. cinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species [DNA metabolism, DNA replication, recombination, and repair] 413
16720 272958 TIGR00201 comF comF family protein. This protein is found in species that do (Bacillus subtilis, Haemophilus influenzae) or do not (E. coli, Borrelia burgdorferi) have described systems for natural transformation with exogenous DNA. It is involved in competence for transformation in Bacillus subtilis. [Cellular processes, DNA transformation] 190
16721 272959 TIGR00202 csrA carbon storage regulator (csrA). Modulates the expression of genes in the glycogen biosynthesis and gluconeogenesis pathways by accelerating the 5'-to-3' degradation of these transcripts through selective RNA binding. The N-terminal end of the sequence (AA 11-45) contains the KH motif which is characteristic of a set of RNA-binding proteins. [Energy metabolism, Glycolysis/gluconeogenesis, Regulatory functions, RNA interactions] 69
16722 129307 TIGR00203 cydB cytochrome d oxidase, subunit II (cydB). part of a two component cytochrome D terminal complex. Terminal reaction in the aerobic respiratory chain. [Energy metabolism, Electron transport] 378
16723 129308 TIGR00204 dxs 1-deoxy-D-xylulose-5-phosphate synthase. DXP synthase is a thiamine diphosphate-dependent enzyme related to transketolase and the pyruvate dehydrogenase E1-beta subunit. By an acyloin condensation of pyruvate with glyceraldehyde 3-phosphate, it produces 1-deoxy-D-xylulose 5-phosphate, a precursor of thiamine diphosphate (TPP), pyridoxal phosphate, and the isoprenoid building block isopentenyl diphosphate (IPP). [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 617
16724 272960 TIGR00205 fliE flagellar hook-basal body complex protein FliE. fliE is a component of the flagellar hook-basal body complex located possibly at (MS-ring)-rod junction. [Cellular processes, Chemotaxis and motility] 108
16725 129310 TIGR00206 fliF flagellar basal-body M-ring protein/flagellar hook-basal body protein (fliF). Component of the M (cytoplasmic associated) ring, one of four rings (L,P,S,M) which make up the flagellar hook-basal body which is a major portion of the flagellar organelle. Although the basic structure of the flagella appears to be similar for all bacteria, additional rings and structures surrounding the basal body have been observed for some bacteria (eg Vibrio cholerae and Treponema pallidum). [Cellular processes, Chemotaxis and motility] 555
16726 272961 TIGR00207 fliG flagellar motor switch protein FliG. The fliG protein along with fliM and fliN interact to form the switch complex of the bacterial flagellar motor located at the base of the basal body. This complex interacts with chemotaxis proteins (e.g. CheY). In addition the complex interacts with other components of the motor that determine the direction of flagellar rotation. The model contains putative members of the fliG family at scores of less than 100 from Agrobacterium radiobacter and Sinorhizobium meliloti as well as fliG-like genes from Treponema pallidum and Borrelia burgdorferi. That is why the suggested cutoff is set at 20 but was set at 100 to construct the family. [Cellular processes, Chemotaxis and motility] 338
16727 188033 TIGR00208 fliS flagellar biosynthetic protein FliS. The function of this protein in flagellar biosynthesis is unknown, but appears to be regulatory. The member of this family in Vibrio parahaemolyticus is designated FlaJ (creating a synonym for FliS) and was shown essential for flagellin biosynthesis. [Cellular processes, Chemotaxis and motility] 124
16728 129313 TIGR00209 galT_1 galactose-1-phosphate uridylyltransferase, family 1. This enzyme is involved in glucose and galactose interconversion. This model describes one of two extremely distantly related branches of the model pfam01087. [Energy metabolism, Sugars] 347
16729 129314 TIGR00210 gltS sodium--glutamate symport carrier (gltS). [Transport and binding proteins, Amino acids, peptides and amines] 398
16730 272962 TIGR00211 glyS glycyl-tRNA synthetase, tetrameric type, beta subunit. The glycyl-tRNA synthetases differ even among the eubacteria in oligomeric structure. In Escherichia coli and most others, it is a heterodimer of two alpha chains and two beta chains, encoded by tandem genes. The genes are similar, but fused, in Chlamydia trachomatis. By contrast, the glycyl-tRNA synthetases of Thermus thermophilus and of archaea and eukaryotes differ considerably; they are homodimeric, mutually similar, and not detected by this model. [Protein synthesis, tRNA aminoacylation] 691
16731 272963 TIGR00212 hemC hydroxymethylbilane synthase. Alternate name hydroxymethylbilane synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 292
16732 129317 TIGR00213 GmhB_yaeD D,D-heptose 1,7-bisphosphate phosphatase. This family of proteins formerly designated yaeD resembles the histidinol phosphatase domain of the bifunctional protein HisB. The member from E. coli has been characterized as D,D-heptose 1,7-bisphosphate phosphatase, GmhB, involved in inner core LPS assembly. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 176
16733 272964 TIGR00214 lipB lipoate-protein ligase B. Involved in lipoate biosynthesis as the main determinant of the lipoyl-protein ligase activity required for lipoylation of enzymes such as alpha-ketoacid dehydrogenases. Involved in activation and re-activation (following denaturation) of lipoyl-protein ligases (calcium ion-dependant process). [Protein fate, Protein modification and repair] 184
16734 129319 TIGR00215 lpxB lipid-A-disaccharide synthase. Lipid-A precursor biosynthesis producing lipid A disaccharide in a condensation reaction. transcribed as part of an operon including lpxA [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 385
16735 272965 TIGR00216 ispH_lytB (E)-4-hydroxy-3-methyl-but-2-enyl pyrophosphate reductase (IPP and DMAPP forming). The IspH protein (previously designated LytB) has now been recognized as the last enzyme in the biosynthesis of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). Escherichia coli LytB protein had been found to regulate the activity of RelA (guanosine 3',5'-bispyrophosphate synthetase I), which in turn controls the level of a regulatory metabolite. It is involved in penicillin tolerance and the stringent response. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 282
16736 129321 TIGR00217 malQ 4-alpha-glucanotransferase. This enzyme is known as amylomaltase and disproportionating enzyme. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 513
16737 272966 TIGR00218 manA mannose-6-phosphate isomerase, class I. The names phosphomannose isomerase and mannose-6-phosphate isomerase are synonymous. This family contains two rather deeply branched groups. One group contains an experimentally determined phosphomannose isomerase of Streptococcus mutans as well as three uncharacterized paralogous proteins of Bacillus subtilis, all at more than 50% identity to each other, plus a more distant homolog from Archaeoglobus fulgidus. The other group contains members from E. coli, budding yeast, Borrelia burgdorferi, etc. [Energy metabolism, Sugars] 302
16738 129323 TIGR00219 mreC rod shape-determining protein MreC. MreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. Cells defective in MreC are round. Species with MreC include many of the Proteobacteria, Gram-positives, and spirochetes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 283
16739 272967 TIGR00220 mscL large conductance mechanosensitive channel protein. Protein encodes a channel which opens in response to a membrane stretch force. Probably serves as an osmotic gauge. Carboxy terminus tends to be more divergent across species with a high degree of sequence conservation found at the N-terminus. [Cellular processes, Adaptations to atypical conditions] 127
16740 272968 TIGR00221 nagA N-acetylglucosamine-6-phosphate deacetylase. [Central intermediary metabolism, Amino sugars] 380
16741 272969 TIGR00222 panB 3-methyl-2-oxobutanoate hydroxymethyltransferase. Members of this family are 3-methyl-2-oxobutanoate hydroxymethyltransferase, the first enzyme of the pantothenate biosynthesis pathway. An alternate name is ketopantoate hydroxymethyltransferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 263
16742 129327 TIGR00223 panD L-aspartate-alpha-decarboxylase. Members of this family are aspartate 1-decarboxylase, the enzyme that makes beta-alanine and CO2 from aspartate. Beta-alanine is then used to make the vitamin pantothenate, from which coenzyme A is made. Aspartate 1-decarboxylase is synthesized as a proenzyme, then cleaved to an alpha (C-terminal) and beta (N-terminal) subunit with a pyruvoyl group. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 126
16743 161774 TIGR00224 pckA phosphoenolpyruvate carboxykinase (ATP). Involved in the gluconeogenesis pathway. It converts oxaloacetic acid to phosphoenolpyruvate using ATP. Enzyme is a monomer. The reaction is also catalysed by phosphoenolpyruvate carboxykinase (GTP) (EC 4.1.1.32) using GTP instead of ATP, described in PROSITE:PDOC00421 [Energy metabolism, Glycolysis/gluconeogenesis] 532
16744 272970 TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Protein fate, Protein modification and repair] 334
16745 129330 TIGR00227 ribD_Cterm riboflavin-specific deaminase C-terminal domain. Eubacterial riboflavin-specific deaminases have a zinc-binding domain recognized by the dCMP_cyt_deam model toward the N-terminus and this domain toward the C-terminus. Yeast HTP reductase, a riboflavin-biosynthetic enzyme, and several archaeal proteins believed related to riboflavin biosynthesis consist only of this domain and lack the dCMP_cyt_deam domain. 216
16746 129331 TIGR00228 ruvC crossover junction endodeoxyribonuclease RuvC. Endonuclease that resolves Holliday junction intermediates in genetic recombination. The active form of the protein is a dimer. Structure studies reveals that the catalytic center, comprised of four acidic residues, lies at the bottom of a cleft that fits a DNA duplex. The model hits a single Synechocystis PCC6803 protein at a score of 30, below the trusted cutoff, that appears orthologous and may act as authentic RuvC. [DNA metabolism, DNA replication, recombination, and repair] 156
16747 272971 TIGR00229 sensory_box PAS domain S-box. The PAS domain was previously described. This sensory box, or S-box domain occupies the central portion of the PAS domain but is more widely distributed. It is often tandemly repeated. Known prosthetic groups bound in the S-box domain include heme in the oxygen sensor FixL, FAD in the redox potential sensor NifL, and a 4-hydroxycinnamyl chromophore in photoactive yellow protein. Proteins containing the domain often contain other regulatory domains such as response regulator or sensor histidine kinase domains. Other S-box proteins include phytochromes and the aryl hydrocarbon receptor nuclear translocator. [Regulatory functions, Small molecule interactions] 124
16748 272972 TIGR00230 sfsA sugar fermentation stimulation protein. probable regulatory factor involved in maltose metabolism contains a putative DNA binding domain. Isolated as a gene which enabled E.coli strain MK2001 to use maltose. [Energy metabolism, Sugars, Regulatory functions, Other] 234
16749 272973 TIGR00231 small_GTP small GTP-binding protein domain. Proteins with a small GTP-binding domain recognized by this model include Ras, RhoA, Rab11, translation elongation factor G, translation initiation factor IF-2, tetracycline resistance protein TetM, CDC42, Era, ADP-ribosylation factors, tdhF, and many others. In some proteins the domain occurs more than once. This model recognizes a large number of small GTP-binding proteins and related domains in larger proteins. Note that the alpha chains of heterotrimeric G proteins are larger proteins in which the NKXD motif is separated from the GxxxxGK[ST] motif (P-loop) by a long insert and are not easily detected by this model. [Unknown function, General] 162
16750 272974 TIGR00232 tktlase_bact transketolase, bacterial and yeast. This model is designed to capture orthologs of bacterial transketolases. The group includes two from the yeast Saccharomyces cerevisiae but excludes dihydroxyacetone synthases (formaldehyde transketolases) from various yeasts and the even more distant mammalian transketolases. Among the family of thiamine diphosphate-dependent enzymes that includes transketolases, dihydroxyacetone synthases, pyruvate dehydrogenase E1-beta subunits, and deoxyxylulose-5-phosphate synthases, mammalian and bacterial transketolases seem not to be orthologous. [Energy metabolism, Pentose phosphate pathway] 653
16751 272975 TIGR00233 trpS tryptophanyl-tRNA synthetase. This model represents tryptophanyl-tRNA synthetase. Some members of the family have a pfam00458 domain amino-terminal to the region described by this model. [Protein synthesis, tRNA aminoacylation] 327
16752 272976 TIGR00234 tyrS tyrosyl-tRNA synthetase. This tyrosyl-tRNA synthetase model starts picking up tryptophanyl-tRNA synthetases at scores of 0 and below. The proteins found by this model have a deep split between two groups. One group contains bacterial and organellar eukaryotic examples. The other contains archaeal and cytosolic eukaryotic examples. [Protein synthesis, tRNA aminoacylation] 378
16753 272977 TIGR00235 udk uridine kinase. Model contains a number of longer eukaryotic proteins and starts bringing in phosphoribulokinase hits at scores of 160 and below [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 207
16754 272978 TIGR00236 wecB UDP-N-acetylglucosamine 2-epimerase. This cytosolic enzyme converts UDP-N-acetyl-D-glucosamine to UDP-N-acetyl-D-mannosamine. In E. coli, this is the first step in the pathway of enterobacterial common antigen biosynthesis. Members of this orthology group have many gene symbols, often reflecting the overall activity of the pathway and/or operon that includes it. Symbols include epsC (exopolysaccharide C) in Burkholderia solanacearum, cap8P (type 8 capsule P) in Staphylococcus aureus, and nfrC in an older designation based on the effects of deletion on phage N4 adsorption. Epimerase activity was also demonstrated in a bifunctional rat enzyme, for which the N-terminal domain appears to be orthologous. The set of proteins found above the suggested cutoff includes E. coli WecB in one of two deeply branched clusters and the rat UDP-N-acetylglucosamine 2-epimerase domain in the other. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 365
16755 272979 TIGR00237 xseA exodeoxyribonuclease VII, large subunit. This family consist of exodeoxyribonuclease VII, large subunit XseA which catalyses exonucleolytic cleavage in either the 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. Exonuclease VII consists of one large subunit and four small subunits. [DNA metabolism, Degradation of DNA] 389
16756 272980 TIGR00238 TIGR00238 KamA family protein. This model represents essentially the whole of E. coli YjeK and of some of its apparent orthologs. YodO in Bacillus subtilis, a family member which is a longer protein by an additional 100 residues, is characterized as a lysine 2,3-aminomutase with iron, sulphide and pyridoxal 5'-phosphate groups. The homolog MJ0634 from M. jannaschii is preceded by nearly 200 C-terminal residues. This family shows similarity to molybdenum cofactor biosynthesis protein MoaA and related proteins. Note that the E. coli homolog was expressed in E. coli and purified and found not to display lysine 2,3-aminomutase activity. Active site residues are found in the 100-residue extension in B. subtilis. Name changed to KamA family protein. [Cellular processes, Adaptations to atypical conditions] 331
16757 161785 TIGR00239 2oxo_dh_E1 2-oxoglutarate dehydrogenase, E1 component. The 2-oxoglutarate dehydrogenase complex consists of this thiamine pyrophosphate-binding subunit (E1), dihydrolipoamide succinyltransferase (E2), and lipoamide dehydrogenase (E3). The E1 ortholog from Corynebacterium glutamicum is unusual in having an N-terminal extension that resembles the dihydrolipoamide succinyltransferase (E2) component of 2-oxoglutarate dehydrogenase. [Energy metabolism, TCA cycle] 929
16758 272981 TIGR00240 ATCase_reg aspartate carbamoyltransferase, regulatory subunit. The presence of this regulatory subunit allows feedback inhibition by CTP on aspartate carbamoyltransferase, the first step in the synthesis of CTP from aspartate. In many species, this regulatory subunit is not present. In Thermotoga maritima, the catalytic and regulatory subunits are encoded by a fused gene and the regulatory region has enough sequence differences to score below the trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 150
16759 129344 TIGR00241 CoA_E_activ CoA-substrate-specific enzyme activase, putative. This domain is found in a set of closely related proteins including the (R)-2-hydroxyglutaryl-CoA dehydratase activase of Acidaminococcus fermentans, in longer proteins from M. jannaschii and M. thermoautotrophicum that share an additional N-terminal domain, in a protein described as a subunit of the benzoyl-CoA reductase of Rhodopseudomonas palustris, and in two repeats of an uncharacterized protein of Aquifex aeolicus. This domain may be involved in generating or regenerating the active sites of enzymes related to (R)-2-hydroxyglutaryl-CoA dehydratase and benzoyl-CoA reductase. 248
16760 129345 TIGR00242 TIGR00242 division/cell wall cluster transcriptional repressor MraZ. Members of this family contain two tandem copies of a domain described by pfam02381. This protein often is found with other genes of the dcw (division cell wall) gene cluster, including mraW, ftsI, murE, murF, ftsW, murG, etc. Recent work shows MraW in E. coli binds an upstream region with three tandem GTGGG repeats separated by 5bp spacers. We find similar sites in other species. [Cellular processes, Cell division, Regulatory functions, DNA interactions] 142
16761 161787 TIGR00243 Dxr 1-deoxy-D-xylulose 5-phosphate reductoisomerase. 1-deoxy-D-xylulose 5-phosphate is converted to 2-C-methyl-D-erythritol 4-phosphate in the presence of NADPH. It is involved in the synthesis of isopentenyl diphosphate (IPP), a basic building block in isoprenoid, thiamin, and pyridoxal biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 389
16762 129347 TIGR00244 TIGR00244 transcriptional regulator NrdR. Members of this almost entirely bacterial family contain an ATP cone domain (pfam03477). There is never more than one member per genome. Common gene symbols given include nrdR, ybaD, ribX and ytcG. The member from Streptomyces coelicolor is found upstream in the operon of the class II oxygen-independent ribonucleotide reductase gene nrdJ and was shown to repress nrdJ expression. Many members of this family are found near genes for riboflavin biosynthesis in Gram-negative bacteria, suggesting a role in that pathway. However, a phylogenetic profiling study associates members of this family with the presence of a palindromic signal with consensus acaCwAtATaTwGtgt, termed the NrdR-box, an upstream element for most operons for ribonucleotide reductase of all three classes in bacterial genomes. [Regulatory functions, DNA interactions] 147
16763 272982 TIGR00245 TIGR00245 TIGR00245 family protein. [Hypothetical proteins, Conserved] 248
16764 129349 TIGR00246 tRNA_RlmH_YbeA rRNA large subunit m3Psi methyltransferase RlmH. This protein, in the SPOUT methyltransferase family, previously designated YbeA in E. coli, was shown to be responsible for a further modification, a methylation, to a pseudouridine base in ribosomal large subunit RNA. [Protein synthesis, tRNA and rRNA base modification] 153
16765 272983 TIGR00247 TIGR00247 conserved hypothetical protein, YceG family. This uncharacterized protein family, found in three of four microbial genomes, virtually always once per genome, includes YceG from Escherichia coli. This protein is encoded next to PabC, 4-amino-4-deoxychorismate lyase, in E. coli and numerous other proteobacteria, but that proximity is not conserved in other lineages. Numerous members of this family have been misannotated as aminodeoxychorismate lyase, apparently because of proximity to PabC. [Hypothetical proteins, Conserved] 342
16766 129351 TIGR00249 sixA phosphohistidine phosphatase SixA. [Regulatory functions, Protein interactions] 152
16767 129352 TIGR00250 RNAse_H_YqgF putative transcription antitermination factor YqgF. This protein family, which exhibits an RNAse H fold in crystal structure, has been proposed as a putative Holliday junction resolvase, an alternate to RuvC. [Unknown function, General] 130
16768 129353 TIGR00251 TIGR00251 TIGR00251 family protein. [Hypothetical proteins, Conserved] 87
16769 129354 TIGR00252 TIGR00252 TIGR00252 family protein. The scores for Mycobacterium tuberculosis and Treponema pallidum are low considering the alignment. [Hypothetical proteins, Conserved] 119
16770 129355 TIGR00253 RNA_bind_YhbY putative RNA-binding protein, YhbY family. A combination of crystal structure, molecular modeling, and bioinformatic data together suggest that members of this family, including YhbY of E. coli, are RNA binding proteins. [Unknown function, General] 95
16771 272984 TIGR00254 GGDEF diguanylate cyclase (GGDEF) domain. The GGDEF domain is named for the motif GG[DE]EF shared by many proteins carrying the domain. There is evidence that the domain has diguanylate cyclase activity. Several proteins carrying this domain also carry domains with functions relating to environmental sensing. These include PleD, a response regulator protein involved in the swarmer-to-stalked cell transition in Caulobacter crescentus, and FixL, a heme-containing oxygen sensor protein. [Regulatory functions, Small molecule interactions, Signal transduction, Other] 165
16772 129357 TIGR00255 TIGR00255 TIGR00255 family protein. The apparent ortholog from Aquifex aeolicus as reported is split into two consecutive reading frames. [Hypothetical proteins, Conserved] 291
16773 129358 TIGR00256 TIGR00256 D-tyrosyl-tRNA(Tyr) deacylase. This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader. [Protein synthesis, tRNA aminoacylation] 145
16774 129359 TIGR00257 IMPACT_YIGZ uncharacterized protein, YigZ family. This uncharacterized protein family includes YigZ, which has been crystallized, from E. coli. YigZ is homologous to the protein product of the mouse IMPACT gene. Crystallography shows a two-domain structure, and the C-terminal domain is suggested to bind nucleic acids. The function is unknown. Note that the ortholog from E. coli was shown fused to the pepQ gene in GenBank entry X54687. This caused occasional misidentification of this protein as pepQ; this family is found in a number of species that lack pepQ. [Unknown function, General] 204
16775 272985 TIGR00258 TIGR00258 inosine/xanthosine triphosphatase. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 163
16776 129361 TIGR00259 thylakoid_BtpA membrane complex biogenesis protein, BtpA family. Members of this family are found in C. elegans, Synechocystis sp., E. coli, and several of the Archaea. Members in Cyanobacteria have been shown to play a role in protein complex biogenesis, and designated BtpA (biogenesis of thylakoid protein). Homologs in non-photosynthetic species, where thylakoid intracytoplasmic membranes are lacking, are likely to act elsewhere in membrane protein biogenesis. [Protein fate, Protein folding and stabilization] 257
16777 272986 TIGR00260 thrC threonine synthase. Involved in threonine biosynthesis, it catalyses the reaction O-PHOSPHO-L-HOMOSERINE + H(2)O = L-THREONINE + ORTHOPHOSPHATE using pyridoxal phosphate as a cofactor. The enzyme is distantly related to the serine/threonine dehydratases, which are also pyridoxal-phosphate dependent enzymes. The pyridoxal-phosphate binding site is a Lys (K) residue present at residue 70 of the model. [Amino acid biosynthesis, Aspartate family] 327
16778 129363 TIGR00261 traB pheromone shutdown-related protein TraB. traB is a plasmid encoded gene that functions in the shutdown of the peptide sex pheromone cPD1 which is produced by the plasmid free recipient cell prior to conjugative transfer in Enterococcus faecalis. Once the recipient acquires the plasmid, production of cPD1 is shut down. The gene product may play another role in the other species in the family. [Unknown function, General] 380
16779 161792 TIGR00262 trpA tryptophan synthase, alpha subunit. Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan. The alpha chain is responsible for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate. In bacteria and plants each domain is found on a separate subunit (alpha and beta chains), while in fungi the two domains are fused together on a single multifunctional protein. The signature pattern for trpA contains three conserved acidic residues. [LIVM]-E-[LIVM]-G-x(2)-[FYC]-[ST]-[DE]-[PA]-[LIVMY]-[AGLI]-[DE]-G and this is located between residues 43-58 of the model. The Sulfolobus solfataricus trpA is known to be quite divergent from other known trpA sequences. [Amino acid biosynthesis, Aromatic amino acid family] 256
16780 272987 TIGR00263 trpB tryptophan synthase, beta subunit. Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan. The beta chain contains the functional domain for the synthesis of tryptophan from indole and serine. The enzyme requires pyridoxal-phosphate as a cofactor. The pyridoxal-P attachment site is contained within the conserved region [LIVM]-x-H-x-G-[STA]-H-K-x-N [K is the pyridoxal-P attachment site] which is present between residues 90-100 of the model. [Amino acid biosynthesis, Aromatic amino acid family] 385
16781 272988 TIGR00264 TIGR00264 alpha-NAC-related protein. This hypothetical protein is found so far only in the Archaea. Its C-terminal domain of about 40 amino acids is homologous to the C-termini of the nascent polypeptide-associated complex alpha chain (alpha-NAC) and its yeast ortholog Egd2p and to the huntingtin-interacting protein HYPK. It shows weaker similarity, possibly through shared structural constraints rather than through homology, with the amino-terminal domain of elongation factor Ts. Alpha-NAC plays a role in preventing nascent polypeptides from binding inappropriately to membrane-targeting apparatus during translation, but is also active as a transcription regulator. [Unknown function, General] 116
16782 272989 TIGR00266 TIGR00266 TIGR00266 family protein. [Hypothetical proteins, Conserved] 222
16783 129368 TIGR00267 TIGR00267 TIGR00267 family protein. This family of uncharacterized proteins shows a low level of similarity (possibly meaningful) to the predicted membrane protein YLR220W, which is involved in calcium homeostasis. It shows no similarity to any other characterized protein. This family is represented in three of the first four completed archaeal genomes, with two members in A. fulgidus. [Hypothetical proteins, Conserved] 169
16784 129369 TIGR00268 TIGR00268 TIGR00268 family protein. The N-terminal region of the model shows similarity to Argininosuccinate synthase proteins using PSI-blast and using the recognize protein identification server. [Hypothetical proteins, Conserved] 252
16785 129370 TIGR00269 TIGR00269 TIGR00269 family protein. [Hypothetical proteins, Conserved] 104
16786 129371 TIGR00270 TIGR00270 TIGR00270 family protein. [Hypothetical proteins, Conserved] 154
16787 129372 TIGR00271 TIGR00271 uncharacterized hydrophobic domain. This domain is in a family of archaeal proteins that includes AF0785 of Archaeoglobus fulgidus and in several eubacterial proteins, including the much longer protein sll1151 from Synechocystis PCC6803. 175
16788 272990 TIGR00272 DPH2 diphthamide biosynthesis protein 2. This protein has been shown in Saccharomyces cerevisiae to be one of several required for the modification of a particular histidine residue of translation elongation factor 2 to diphthamide. This modified site can then become the target for ADP-ribosylation by diphtheria toxin. [Protein fate, Protein modification and repair] 496
16789 129374 TIGR00273 TIGR00273 iron-sulfur cluster-binding protein. Members of this family have a perfect 4Fe-4S binding motif C-x(2)-C-x(2)-C-x(3)-CP followed by either a perfect or imperfect (the first Cys replaced by Ser) second copy. Members probably bind two 4Fe-4S iron-sulfur clusters. [Energy metabolism, Electron transport] 432
16790 272991 TIGR00274 TIGR00274 N-acetylmuramic acid 6-phosphate etherase. This protein, MurQ, is involved in recycling components of the bacterial murein sacculus turned over during cell growth. The cell wall metabolite anhydro-N-acetylmuramic acid (anhMurNAc) is converted by a kinase, AnmK, to MurNAc-phosphate, then converted to N-acetylglucosamine-phosphate by this etherase, called MurQ. This family of proteins is similar to the C-terminal half of a number of vertebrate glucokinase regulator proteins and contains a Prosite pattern which is shared by this group of proteins in a region of local similarity. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 291
16791 272992 TIGR00275 TIGR00275 flavoprotein, HI0933 family. The model when searched with a partial length search brings in proteins with a dinucleotide-binding motif (Rossmann fold) over the initial 40 residues of the model, including oxidoreductases and dehydrogenases. Partially characterized members include an FAD-binding protein from Bacillus cereus and flavoprotein HI0933 from Haemophilus influenzae. [Unknown function, Enzymes of unknown specificity] 400
16792 272993 TIGR00276 TIGR00276 epoxyqueuosine reductase. This model was rebuilt to exclude archaeal homologs, now that there is new information that bacterial members are epoxyqueuosine reductase, QueG, involved in queuosine biosynthesis for tRNA maturation. [Protein synthesis, tRNA and rRNA base modification] 337
16793 272994 TIGR00277 HDIG HDIG domain. This domain is found in a few known nucleotidyltransferases and in a large number of uncharacterized proteins. It contains four widely separated His residues, the second of which is part of an invariant dipeptide His-Asp in a region matched approximately by the motif HDIG. This model may annotate homologous domains in which one or more of the His residues is conserved but misaligned, and some probable false-positive hits. 80
16794 272995 TIGR00278 TIGR00278 putative membrane protein insertion efficiency factor. This model describes a family, YidD, of small, non-essential proteins now suggested to improve YidC-dependent inner membrane protein insertion. A related protein is found in the temperate phage HP1 of Haemophilus influenzae. Annotation of some members of this family as hemolysins appears to represent propagation from an unpublished GenBank submission, L36462, attributed to Aeromonas hydrophila but a close match to E. coli. [Hypothetical proteins, Conserved] 75
16795 129380 TIGR00279 uL16_euk_arch ribosomal protein uL16(L10.e), eukaryotic/archaeal form. This model finds the archaeal and eukaryotic forms of ribosomal protein uL16, previously L10.e. The protein is encoded by multiple loci in some eukaryotes and has been assigned a number of extra-ribosomal functions, some of which will require re-evaluation in the context of identification as a ribosomal protein. L10.e is distantly related to eubacterial ribosomal protein L16. [Protein synthesis, Ribosomal proteins: synthesis and modification] 172
16796 272996 TIGR00280 eL43_euk_arch ribosomal protein eL43. This model finds eukaryotic ribosomal protein eL43 (previously L37a) and its archaeal orthologs. The nomenclature is tricky because eukaryotes have proteins called both L37 and L37a. [Protein synthesis, Ribosomal proteins: synthesis and modification] 92
16797 213521 TIGR00281 TIGR00281 segregation and condensation protein B. Shown to be required for chromosome segregation and condensation in B. subtilis. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins] 186
16798 161802 TIGR00282 TIGR00282 metallophosphoesterase, MG_246/BB_0505 family. A member of this family from Mycoplasma pneumoniae has been crystallized and described as a novel phosphatase. [Unknown function, Enzymes of unknown specificity] 266
16799 161803 TIGR00283 arch_pth2 peptidyl-tRNA hydrolase. This model describes an archaeal/eukaryotic form of peptidyl-tRNA hydrolase. Most bacterial forms are described by TIGR00447. [Protein synthesis, Other] 115
16800 272997 TIGR00284 TIGR00284 dihydropteroate synthase-related protein. This protein has been found so far only in the Archaea, and in particular in those archaea that lack a bacterial-type dihydropteroate synthase. The central region of this protein shows considerable homology to the amino-terminal half of dihydropteroate synthases, while the carboxyl-terminal region shows homology to the small, uncharacterized protein slr0651 of Synechocystis PCC6803. [Unknown function, General] 499
16801 129386 TIGR00285 TIGR00285 DNA-binding protein Alba. Alba has been shown to bind DNA and affect DNA supercoiling in a temperature dependent manner. It is regulated by acetylation (alba = acetylation lowers binding affinity) by the Sir2 protein. Alba is proposed to play a role in establishment or maintenance of chromatin architecture and thereby in transcription repression. This protein appears so far only in the Archaea, but may be universal there. There is a single member in three of the first four completed archaeal genomes, and a second copy in A. fulgidus. In Sulfolobus shibatae there is a tandem second copy that is poorly conserved and scores below the trusted cutoff; all other members of the family are conserved at greater than 50 % pairwise identity. [DNA metabolism, Chromosome-associated proteins] 87
16802 211565 TIGR00286 TIGR00286 arginine decarboxylase, pyruvoyl-dependent. The three copies present in Archaeoglobus fulgidus, one of which is only half-length and excluded from the seed alignment, are very closely related and clearly arose by duplication after the separation from well-studied species. The other completed archaeal genomes each contain a single copy. The lone, weak (below trusted cutoff) hit to a non-archaeal sequence is to an uncharacterized protein of Chlamydia, with the greatest similarity in the amino-terminal half of the model. [Central intermediary metabolism, Polyamine biosynthesis, Energy metabolism, Amino acids and amines] 152
16803 272998 TIGR00287 cas1 CRISPR-associated endonuclease Cas1. This model identifies CRISPR-associated protein Cas1, the most universal CRISPR system protein. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Cas1 is a metal-dependent DNA-specific endonuclease. 323
16804 272999 TIGR00288 TIGR00288 TIGR00288 family protein. This family of orthologs is restricted to but universal among the completed archaeal genomes so far. Eubacterial proteins showing at least local homology include slr1870 from Synechocystis PCC6803 and two proteins from Aquifex aeolicus, none of which is characterized. [Hypothetical proteins, Conserved] 160
16805 129390 TIGR00289 TIGR00289 TIGR00289 family protein. Homologous proteins related to MJ0570 of Methanococcus jannaschii include both the apparent orthologs found by this model above the trusted cutoff, the much longer protein YLR143W from Saccharomyces cerevisiae, and second homologous proteins from Archaeoglobus fulgidus and Pyrococcus horikoshii that appear to represent a second orthologous group. [Hypothetical proteins, Conserved] 222
16806 273000 TIGR00290 MJ0570_dom MJ0570-related uncharacterized domain. Proteins with this uncharacterized domain include two apparent ortholog families in the Archaea, one of which is universal among the first four completed archaeal genomes, and YLR143W, a much longer protein from Saccharomyces cerevisiae. The domain comprises the full length of the archaeal proteins and the first third of the yeast protein. 223
16807 129392 TIGR00291 RNA_SBDS rRNA metabolism protein, SBDS family. This protein family, possibly universal in both archaea and eukaryotes, appears to be involved in (ribosomal) RNA metabolism. Mutations in the human ortholog are associated with Shwachman-Bodian-Diamond syndrome. [Protein synthesis, Other] 231
16808 273001 TIGR00292 TIGR00292 thiazole biosynthesis enzyme. This enzyme is involved in the biosynthesis of the thiamine precursor thiazole, and is repressed by thiamine. This family includes c-thi1, a Citrus gene induced during natural and ethylene induced fruit maturation and is highly homologous to plant and yeast thi genes involved in thiamine biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 254
16809 129394 TIGR00293 TIGR00293 prefoldin, archaeal alpha subunit/eukaryotic subunit 5. Members of this protein family, rich in coiled coil regions, are molecular chaperones in the class of the prefoldin (GimC) alpha subunit. Prefoldin is a hexamer of two alpha and four beta subunits. This protein appears universal in the archaea but is restricted to Aquifex aeolicus among bacteria so far. Eukaryotes have several related proteins; only prefoldin subunit 5, which appeared the most similar to archaeal prefoldin alpha, is included in this model. This model finds a set of small proteins from the Archaea and from Aquifex aeolicus that may represent two orthologous groups. The proteins are predicted to be mostly coiled coil, and the model may have a significant number of hits to proteins that contain coiled coil regions. [Protein fate, Protein folding and stabilization] 126
16810 129395 TIGR00294 TIGR00294 GTP cyclohydrolase, MptA/FolE2 family. This family includes type I GTP cyclohydrolases involved in methanopterin in archaea (MptA) and de novo tetrahydrofolate biosynthesis in bacteria (FolE2). [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 308
16811 129396 TIGR00295 TIGR00295 TIGR00295 family protein. This set of orthologs is narrowly defined, comprising proteins found in three Archaea but not in Pyrococcus horikoshii. The closest homologs are other archaeal proteins that appear to represent distinct orthologous clusters. [Hypothetical proteins, Conserved] 164
16812 273002 TIGR00296 TIGR00296 uncharacterized protein, PH0010 family. Members of this functionally uncharacterized protein family have been crystallized from Pyrococcus horikoshii, Methanosarcina mazei, and Sulfolobus tokodaii. [Unknown function, General] 200
16813 213522 TIGR00297 TIGR00297 TIGR00297 family protein. [Hypothetical proteins, Conserved] 237
16814 273003 TIGR00298 TIGR00298 2-phosphosulfolactate phosphatase. 2-phosphosulfolactate phosphatase catalyzes the dephosphorylation of 2-phospho-3-sulfolactate to form 3-sulfolactate, the second step in coenzyme M biosynthesis. Coenzyme M is the terminal methyl carrier in methanogenesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 216
16815 129400 TIGR00299 TIGR00299 TIGR00299 family protein. Members of this family are found in the Archaea and in several different bacterial lineages. The function is unknown and the genomic context is not well conserved. [Hypothetical proteins, Conserved] 382
16816 129401 TIGR00300 TIGR00300 TIGR00300 family protein. All members of the family come from genome projects. A partial length search brings in two plant lysine-ketoglutarate reductase/saccharopine dehydrogenase bifunctional enzymes hitting the N-terminal region of the family. [Hypothetical proteins, Conserved] 407
16817 129402 TIGR00302 TIGR00302 phosphoribosylformylglycinamidine synthase, purS protein. In species such as Bacillus subtilis in which FGAM synthetase is split into two ORFs purL and purQ, this small protein, previously called yexA, is required for FGAM synthetase activity. Although the article does not make it clear whether this is a subunit or an accessory protein, it is encoded as part of the operon, which suggests stoichiometric amounts and therefore a subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 80
16818 273004 TIGR00303 TIGR00303 TIGR00303 family protein. All current members of the family are from genome projects. [Hypothetical proteins, Conserved] 331
16819 213523 TIGR00304 TIGR00304 TIGR00304 family protein. The member of this family from Pyrococcus horikoshii scores only 13.91 bits, largely because it is at least 15 residues shorter than other members of this family of small proteins and is penalized for not matching to the N-terminal section of the model. Cutoff scores are set so this hit is between noise and trusted cutoffs. [Hypothetical proteins, Conserved] 77
16820 129405 TIGR00305 TIGR00305 putative toxin-antitoxin system toxin component, PIN family. This uncharacterized protein family, part of the PIN domain superfamily, is restricted to bacteria and archaea. A comprehensive in silico study of toxin-antitoxin systems by Makarova, et al. (2009) finds evidence this family represents the toxin-like component of one class of type 2 toxin-antitoxin systems. [Cellular processes, Other, Transcription, Degradation of RNA] 114
16821 273005 TIGR00306 apgM phosphoglycerate mutase (2,3-diphosphoglycerate-independent), archaeal form. Experimentally characterized in archaea as 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This model describes a set of proteins in the Archaea (two each in Methanococcus jannaschii, Methanobacterium thermoautotrophicum, and Archaeoglobus fulgidus) and in Aquifex aeolicus (1 member). [Energy metabolism, Glycolysis/gluconeogenesis] 396
16822 129407 TIGR00307 eS8 ribosomal protein eS8. Archaeal and eukaryotic ribosomal protein S8. This model could easily have been split into two models, one for eukaryotic S8 and one for archaeal S8; eukaryotic forms invariably have an insert of about 80 residues that archaeal forms of S8 do not. [Protein synthesis, Ribosomal proteins: synthesis and modification] 127
16823 273006 TIGR00308 TRM1 tRNA(guanine-26,N2-N2) methyltransferase. This enzyme is responsible for two methylations of a characteristic guanine of most tRNA molecules. The activity has been demonstrated for eukaryotic and archaeal proteins, which are active when expressed in E. coli, a species that lacks this enzyme. At least one Eubacterium, Aquifex aeolicus, has an ortholog, as do all completed archaeal genomes. [Protein synthesis, tRNA and rRNA base modification] 374
16824 129409 TIGR00309 V_ATPase_subD H(+)-transporting ATP synthase, vacuolar type, subunit D. Although this ATPase can run backwards, using a proton gradient to synthesize ATP, the primary biological role is to acidify some compartment, such as yeast vacuole (a lysosomal homolog) or the interior of a prokaryote. [Transport and binding proteins, Cations and iron carrying compounds] 209
16825 273007 TIGR00310 ZPR1_znf ZPR1 zinc finger domain. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction. 192
16826 129411 TIGR00311 aIF-2beta translation initiation factor aIF-2, beta subunit, putative. The trusted cutoff is set high enough to select only archaeal members. The suggested cutoff is set to include most eukaryotic members but largely exclude the related eIF-5. [Protein synthesis, Translation factors] 133
16827 273008 TIGR00312 cbiD cobalamin biosynthesis protein CbiD. This protein has been shown by cloning into E. coli to be required for cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 347
16828 129413 TIGR00313 cobQ cobyric acid synthase CobQ. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 475
16829 129414 TIGR00314 cdhA CO dehydrogenase/acetyl-CoA synthase complex, epsilon subunit. Acetyl-CoA decarbonylase/synthase (ACDS) is a multienzyme complex. Carbon monoxide dehydrogenase is a synonym. The ACDS complex carries out an unusual reaction involving the reversible cleavage and synthesis of acetyl-CoA in methanogens. The model contains the prosite signature for 4Fe-4S ferredoxins [C-x(2)-C-x(2)-C-x(3)-C-[PEG]] between residues 448-462 of the model. [Energy metabolism, Chemoautotrophy] 784
16830 273009 TIGR00315 cdhB CO dehydrogenase/acetyl-CoA synthase complex, epsilon subunit. Nomenclature follows the description for Methanosarcina thermophila. The complex is also found in Archaeoglobus fulgidus, not considered a methanogen, but is otherwise generally associated with methanogenesis. [Energy metabolism, Chemoautotrophy] 165
16831 129416 TIGR00316 cdhC CO dehydrogenase/CO-methylating acetyl-CoA synthase complex, beta subunit. Nomenclature follows the description for Methanosarcina thermophila. The CO-methylating acetyl-CoA synthase is considered the defining enzyme of the Wood-Ljungdahl pathway, used for acetate catabolism by sulfate reducing bacteria but for acetate biosynthesis by acetogenic bacteria such as Moorella thermoacetica (f. Clostridium thermoaceticum). [Energy metabolism, Chemoautotrophy] 458
16832 213524 TIGR00317 cobS cobalamin 5'-phosphate synthase/cobalamin synthase. cobS is involved with cobalamin biosynthesis in part III of cobalamin biosynthesis. The enzyme catalyzes the reactions adenosylcobinamide-GDP + alpha-ribazole-5'-P = adenosylcobalamin-5'-phosphate + GMP and adenosylcobinamide-GDP + alpha-ribazole = adenosylcobalamin + GMP. The protein product is associated with a large complex of proteins and is induced by cobinamide. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 241
16833 273010 TIGR00318 cyaB adenylyl cyclase CyaB, putative. The protein CyaB from Aeromonas hydrophila is a second adenylyl cyclase from that species, as demonstrated by complementation in E. coli and by assay of the enzymatic properties of purified recombinant protein. It has no detectable homology to any other protein of known function, and has several unusual properties, including an optimal temperature of 65 degrees and an optimal pH of 9.5. A cluster of uncharacterized archaeal homologs may be orthologous and serve (under certain circumstances) to produce the regulatory metabolite cyclic AMP (cAMP). [Regulatory functions, Small molecule interactions] 174
16834 200008 TIGR00319 desulf_FeS4 desulfoferrodoxin FeS4 iron-binding domain. This domain is found as essentially the full length of desulforedoxin, a 37-residue homodimeric non-heme iron protein. It is also found as the N-terminal domain of desulfoferrodoxin (rbo), a homodimeric non-heme iron protein with 2 Fe atoms per monomer in different oxidation states. This domain binds the ferric rather than the ferrous Fe of desulfoferrodoxin. Neelaredoxin, a monomeric blue non-heme iron protein, lacks this domain. [Energy metabolism, Electron transport] 33
16835 273011 TIGR00320 dfx_rbo desulfoferrodoxin. The short N-terminal domain contains four conserved Cys for binding of a ferric iron atom, and is homologous to the small protein desulforedoxin; this domain may also be responsible for dimerization. The remainder of the molecule binds a ferrous iron atom and is similar to neelaredoxin, a monomeric blue non-heme iron protein. The homolog from Treponema pallidum scores between the trusted cutoff for orthology and the noise cutoff. Although essentially a full length homolog, it lacks three of the four Cys residues in the N-terminal domain; the domain may have lost ferric binding ability but may have some conserved structural role such as dimerization, or some new function. This protein is described in some articles as rubredoxin oxidoreductase (rbo), and its gene shares an operon with the rubredoxin gene in Desulfovibrio vulgaris Hildenborough. [Energy metabolism, Electron transport] 125
16836 273012 TIGR00321 dhys deoxyhypusine synthase. Deoxyhypusine synthase is responsible for the first step in creating hypusine. Hypusine is a modified amino acid found in eukaryotes and in archaea in their respective forms of initiation factor 5A. Its presence is confirmed in archaeal genera Pyrococcus, Sulfolobus, Halobacterium, and Haloferax, but in an older report was not detected in Methanococcus voltae (J Biol Chem 1987 Dec 5;262(34):16585-9). This family of apparent orthologs has an unusual UPGMA difference tree, in which the members from the archaea M. jannaschii and P. horikoshii cluster with the known eukaryotic deoxyhypusine synthases. Separated by a fairly deep branch, although still strongly related, is a small cluster of proteins from Methanobacterium thermoautotrophicum and Archaeoglobus fulgidus, the latter of which has two. [Protein fate, Protein modification and repair] 301
16837 273013 TIGR00322 diphth2_R diphthamide biosynthesis enzyme Dph1/Dph2 domain. Archaea and Eukaryotes, but not Eubacteria, share the property of having a covalently modified residue, 2'-[3-carboxamido-3-(trimethylammonio)propyl]histidine, as a part of a cytosolic protein. The modified His, termed diphthamide, is part of translation elongation factor EF-2 and is the site for ADP-ribosylation by diphtheria toxin. This model includes both Dph1 and Dph2 from Saccharomyces cerevisiae, although only Dph2 is found in the Archaea (see TIGR03682). Dph2 has been shown to act analogously to the radical SAM (rSAM) family (pfam04055), with 4Fe-4S-assisted cleavage of S-adenosylmethionine to create a free radical, but a different organic radical than in rSAM. 318
16838 211569 TIGR00323 eIF-6 translation initiation factor eIF-6, putative. This model finds translation initiation factor eIF-6 of eukaryotes, which is a ribosome dissociation factor. It also finds a set of apparent archaeal orthologs, slightly shorter proteins not yet shown to act as initiation factors; these probably should be designated as translation initiation factor aIF-6, putative. [Protein synthesis, Translation factors] 216
16839 129424 TIGR00324 endA tRNA-intron lyase. The enzyme catalyses the endonucleolytic cleavage of pre-tRNA at the 5' and 3' splice sites to release the intron and produces two half tRNA molecules bearing 5' hydroxyl and 2', 3'-cyclic phosphate termini. The genes are homologous in Eucarya and Archaea. The two yeast genes have been functionally studied and are two subunits of a heterotetramer enzyme in yeast, the other two subunits of which have no known homologs. [Transcription, RNA processing] 170
16840 273014 TIGR00325 lpxC UDP-3-O-acyl N-acetylglucosamine deacetylase. UDP-3-O-(R-3-hydroxymyristoyl)-GlcNAc deacetylase from E. coli, LpxC, was previously designated EnvA. This enzyme is involved in lipid-A precursor biosynthesis. It is essential for cell viability. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 297
16841 273015 TIGR00326 eubact_ribD riboflavin biosynthesis protein RibD. This model describes the ribD protein as found in Escherichia coli. The N-terminal domain includes the conserved zinc-binding site region captured in the model dCMP_cyt_deam and shared by proteins such as cytosine deaminase, mammalian apolipoprotein B mRNA editing protein, blasticidin-S deaminase, and Bacillus subtilis competence protein comEB. The C-terminal domain is homologous to the full length of yeast HTP reductase, a protein required for riboflavin biosynthesis. A number of archaeal proteins believed related to riboflavin biosynthesis contain only this C-terminal domain and are not found as full-length matches to this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 344
16842 273016 TIGR00327 secE_euk_arch protein translocase SEC61 complex gamma subunit, archaeal and eukaryotic. This model describes archaeal SEC61-like and eukaryotic SEC61 but not bacterial secE proteins, for which a Pfam pfam00584 (SecE) has been created. [Protein fate, Protein and peptide secretion and trafficking] 61
16843 129428 TIGR00328 flhB flagellar biosynthetic protein FlhB. FlhB and its functionally equivalent orthologs, from among a larger superfamily of proteins involved in type III protein export systems, are specifically involved in flagellar protein export. The seed members are restricted and the trusted cutoff is set high such that the proteins gathered by this model play roles specifically related to flagellar structures. Full-length homologs scoring below the trusted cutoff are involved in peptide export but not necessarily in the creation of flagella. [Cellular processes, Chemotaxis and motility] 347
16844 129429 TIGR00329 gcp_kae1 metallohydrolase, glycoprotease/Kae1 family. This subfamily includes the well-studied secreted O-sialoglycoprotein endopeptidase (glycoprotease, EC 3.4.24.57) of Pasteurella haemolytica, a pathogen. A member from Riemerella anatipestifer, associated with cohemolysin activity, likewise is exported without benefit of a classical signal peptide and shows glycoprotease activity on the test substrate glycophorin. However, archaeal members of this subfamily show unrelated activities as demonstrated in Pyrococcus abyssi: DNA binding, iron binding, apurinic endonuclease activity, genomic association with a kinase domain, and no glycoprotease activity. This family thus pulls together a set of proteins as a homology group that appears to be near-universal in life, yet heterogeneous in assayed function between bacteria and archaea. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 305
16845 129430 TIGR00330 glpX fructose-1,6-bisphosphatase, class II. This model represents GlpX, one of three classes of bacterial fructose-1,6-bisphosphatases. This form is homodimeric and Mn2+-dependent, and only very distantly related to the class I fructose-1,6-bisphosphatase, the product of the fbp gene, which is homotetrameric and Mg2+-dependent. A third class is found as one of two types in Bacillus subtilis. In E. coli, GlpX is found in the glpFKX operon together with a glycerol uptake protein and glycerol kinase. [Energy metabolism, Pentose phosphate pathway] 321
16846 273017 TIGR00331 hrcA heat shock gene repressor HrcA. HrcA represses the class I heat shock operons groE and dnaK; overproduction prevents induction of these operons by heat shock while deletion allows constitutive expression even at low temperatures. In Bacillus subtilis, hrcA is the first gene of the dnaK operon and so is itself a heat shock gene. [Regulatory functions, DNA interactions] 337
16847 273018 TIGR00332 neela_ferrous desulfoferrodoxin ferrous iron-binding domain. This domain comprises essentially the full length of neelaredoxin, a monomeric, blue, non-heme iron protein of Desulfovibrio gigas said to bind two iron atoms per monomer with identical spectral properties. Neelaredoxin was shown recently to have significant superoxide dismutase activity. This domain is also found (in a form in which the distance between the motifs H[HWYF]IXW and CN[IL]HGXW is somewhat shorter) as the C-terminal domain of desulfoferrodoxin, which is said to bind a single ferrous iron atom. The N-terminal domain of desulfoferrodoxin is described in a separate model, dfx_rbo (TIGR00320). [Energy metabolism, Electron transport] 106
16848 188042 TIGR00333 nrdI ribonucleoside-diphosphate reductase 2, operon protein nrdI. Ribonucleotide reductases (RNRs) are enzymes that provide the precursors of DNA synthesis. The three characterized classes of RNRs differ by their metal cofactor and their stable organic radical. The exact function of nrdI within the ribonucleotide reductases has not yet been fully characterised. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 127
16849 273019 TIGR00334 5S_RNA_mat_M5 ribonuclease M5. This family of orthologous proteins shows a weak but significant similarity to the central region of the DnaG-type DNA primase. The region of similarity is termed the Toprim (topoisomerase-primase) domain and is also shared by RecR, OLD family nucleases, and type IA and II topoisomerases. [Transcription, RNA processing] 174
16850 273020 TIGR00335 primase_sml DNA primase, eukaryotic-type, small subunit, putative. Archaeal members differ substantially from eukaryotic members and should be considered putative pending experimental evidence. The protein is universal and single copy among completed archaeal and eukaryotic genomes to date. DNA primase creates RNA primers needed for DNA replication. This model is named putative because the assignment is putative for archaeal proteins. Eukaryotic proteins scoring above the trusted cutoff can be considered authentic. [DNA metabolism, DNA replication, recombination, and repair] 297
16851 129436 TIGR00336 pyrE orotate phosphoribosyltransferase. Orotate phosphoribosyltransferase (OPRTase) is involved in the biosynthesis of pyrimidine nucleotides. Alpha-D-ribosyldiphosphate 5-phosphate (PRPP) and orotate are utilized to form pyrophosphate and orotidine 5'-monophosphate (OMP) in the presence of divalent cations, preferably Mg2+. In a number of eukaryotes, this protein is fused to a domain that catalyses the reaction (EC 4.1.1.23). The combined activity of EC 2.4.2.10 and EC 4.1.1.23 is termed uridine 5'-monophosphate synthase. The conserved Lys (K) residue at position 101 of the seed alignment has been proposed as the active site for the enzyme. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 173
16852 273021 TIGR00337 PyrG CTP synthase. CTP synthase is involved in pyrimidine ribonucleotide/ribonucleoside metabolism. The enzyme catalyzes the reaction L-glutamine + H2O + UTP + ATP = CTP + phosphate + ADP + L-glutamate. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. This gene has been found circa 500 bp 5' upstream of enolase in both the beta (Nitrosomonas europaea) and gamma (E. coli) subdivisions of the Proteobacteria (FEMS Microbiol Lett 1998 Aug 1;165(1):153-7). [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 525
16853 273022 TIGR00338 serB phosphoserine phosphatase SerB. Phosphoserine phosphatase catalyzes the reaction 3-phospho-serine + H2O = L-serine + phosphate. It catalyzes the last of three steps in the biosynthesis of serine from D-3-phosphoglycerate. Note that this enzyme acts on free phosphoserine, not on phosphoserine residues of phosphoproteins. [Amino acid biosynthesis, Serine family] 219
16854 273023 TIGR00339 sopT ATP sulphurylase. This enzyme forms adenosine 5'-phosphosulfate (APS) from ATP and free sulfate, the first step in the formation of the activated sulfate donor 3'-phosphoadenylylsulfate (PAPS). In some cases, it is found in a bifunctional protein in which the other domain, APS kinase, catalyzes the second and final step, the phosphorylation of APS to PAPS; the combined ATP sulfurylase/APS kinase may be called PAPS synthase. Members of this family also include the dissimilatory sulfate adenylyltransferase (sat) of the sulfate reducer Archaeoglobus fulgidus. [Central intermediary metabolism, Sulfur metabolism] 383
16855 129440 TIGR00340 zpr1_rel ZPR1-related zinc finger protein. This model describes a strictly archaeal family homologous to the domain duplicated in the eukaryotic zinc-binding protein ZPR1. ZPR1 was shown experimentally to bind approximately two moles of zinc; each copy of the domain contains a putative zinc finger of the form CXXCX(25)CXXC. ZPR1 binds the tyrosine kinase domain of epidermal growth factor receptor, but is displaced by receptor activation and autophosphorylation after which it redistributes in part to the nucleus. The proteins described by this model by analogy may be suggested to play a role in signal transduction. A model ZPR1_znf (TIGR00310) has been created to describe the domain shared by this protein and ZPR1. [Unknown function, General] 163
16856 273024 TIGR00341 TIGR00341 TIGR00341 family protein. This conserved hypothetical protein is found so far only in three archaeal genomes and in Streptomyces coelicolor. It shares a hydrophobic uncharacterized domain (see TIGR00271) of about 180 residues with several eubacterial proteins, including the much longer protein sll1151 of Synechocystis PCC6803. [Hypothetical proteins, Conserved] 325
16857 273025 TIGR00342 TIGR00342 tRNA sulfurtransferase ThiI. Members of this protein family are "ThiI", a sulfurtransferase involved in 4-thiouridine modification of tRNA. This protein often is bifunctional, with genetically separable activities, where the C-terminal rhodanese-like domain (residues 385 to 482 in E. coli ThiI), a domain not included in this model, is sufficient to synthesize the thiazole moiety of thiamine (see TIGR04271). Note that ThiI, because of its role in tRNA modification, may occur in species (such as Mycoplasma genitalium) that lack de novo thiamine biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine, Protein synthesis, tRNA and rRNA base modification] 371
16858 129443 TIGR00343 TIGR00343 pyridoxal 5'-phosphate synthase, synthase subunit Pdx1. This protein had been believed to be a singlet oxygen resistance protein. Subsequent work showed that it is a protein of pyridoxine (vitamin B6) biosynthesis, and that pyridoxine quenches the highly toxic singlet form of oxygen produced by light in the presence of certain chemicals. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 287
16859 273026 TIGR00344 alaS alanine--tRNA ligase. The model describes alanine--tRNA ligase. This enzyme catalyzes the reaction (tRNAala + L-alanine + ATP = L-alanyl-tRNAala + pyrophosphate + AMP). [Protein synthesis, tRNA aminoacylation] 845
16860 273027 TIGR00345 GET3_arsA_TRC40 transport-energizing ATPase, TRC40/GET3/ArsA family. Members of this family are ATPases that energize transport, although with different partner proteins for different functions. Recent findings show that TRC40 (GET3 in yeast) is involved in the insertion of tail-anchored membrane proteins in eukaryotes. A similar function is expected for members of this family in archaea. However, the earliest discovery of a function for this protein family is ArsA, an arsenic resistance protein that partners with ArsB (see pfam02040) for As(III) efflux. [Hypothetical proteins, Conserved] 284
16861 129446 TIGR00346 azlC 4-azaleucine resistance probable transporter AzlC. Overexpression of this gene results in resistance to a leucine analog, 4-azaleucine. The protein has 5 potential transmembrane motifs. It has been inferred, but not experimentally demonstrated, to be part of a branched-chain amino acid transport system. Commonly found in association with azlD. [Transport and binding proteins, Amino acids, peptides and amines] 221
16862 129447 TIGR00347 bioD dethiobiotin synthase. Dethiobiotin synthase is involved in biotin biosynthesis and catalyses the reaction (CO2 + 7,8-diaminononanoate + ATP = dethiobiotin + phosphate + ADP). The enzyme binds ATP (see motif in first 12 residues of the SEED alignment) and requires magnesium as a co-factor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 166
16863 273028 TIGR00348 hsdR type I site-specific deoxyribonuclease, HsdR family. This gene is part of the type I restriction and modification system which is composed of three polypeptides R (restriction endonuclease), M (modification) and S (specificity). This group of enzymes recognizes specific short DNA sequences and has an absolute requirement for ATP (or dATP) and S-adenosyl-L-methionine. They also catalyse the reactions of EC 2.1.1.72 and EC 2.1.1.73, with similar site specificity (J. Mol. Biol. 271 (3), 342-348 (1997)). Members of this family are assumed to differ from each other in DNA site specificity. [DNA metabolism, Restriction/modification] 667
16864 273029 TIGR00350 lytR_cpsA_psr cell envelope-related function transcriptional attenuator common domain. This model describes a domain of unknown function that is found in the predicted extracellular domain of a number of putative membrane-bound proteins. One of these is the protein psr, described as a penicillin binding protein 5 (PBP-5) synthesis repressor. Another is Bacillus subtilis LytR, described as a transcriptional attenuator of itself and the LytABC operon, where LytC is N-acetylmuramoyl-L-alanine amidase. A third is CpsA, a putative regulatory protein involved in exocellular polysaccharide biosynthesis. Besides the region of strong similarity represented by this model, these proteins share the property of having a short putative N-terminal cytoplasmic domain and transmembrane domain forming a signal-anchor. [Regulatory functions, Other] 152
16865 273030 TIGR00351 narI respiratory nitrate reductase, gamma subunit. Involved in anaerobic respiration, the gene product catalyzes the reaction (reduced acceptor + NO3- = Acceptor + nitrite). Another possible role_id for this gene product is in nitrogen fixation (Role_id:160). [Energy metabolism, Anaerobic] 224
16866 129451 TIGR00353 nrfE c-type cytochrome biogenesis protein CcmF. The product of this gene is required for the biogenesis of C-type cytochromes. This gene is thought to have eleven transmembrane helices. Disruption of this gene in Paracoccus denitrificans, encoding a putative transporter, results in formation of an unstable apocytochrome c and deficiency in siderophore production. [Energy metabolism, Electron transport] 576
16867 273031 TIGR00354 polC DNA polymerase, archaeal type II, large subunit. This model represents the large subunit, DP2, of a two subunit novel Archaeal replicative DNA polymerase first characterized for Pyrococcus furiosus. Structure of DP2 appears to be organized as a ~950 residue component separated from a ~300 residue component by a ~150 residue intein. The other subunit, DP1, has sequence similarity to the eukaryotic DNA polymerase delta small subunit. [DNA metabolism, DNA replication, recombination, and repair] 1095
16868 273032 TIGR00355 purH phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase. PurH is bifunctional: IMP cyclohydrolase (EC 3.5.4.10); phosphoribosylaminoimidazolecarboxamide formyltransferase (EC 2.1.2.3). Involved in purine ribonucleotide biosynthesis. The IMP cyclohydrolase activity is in the N-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 511
16869 129454 TIGR00357 TIGR00357 methionine-R-sulfoxide reductase. This model describes a domain found in PilB, a protein important for pilin expression, N-terminal to a domain coextensive with the known peptide methionine sulfoxide reductase (MsrA), a protein repair enzyme, of E. coli. Among the early completed genomes, this module is found if and only if MsrA is also found, whether N-terminal to MsrA (as for Helicobacter pylori), C-terminal (as for Treponema pallidum), or in a separate polypeptide. Although the function of this region is not clear, an auxiliary function to MsrA is suggested. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions] 134
16870 273033 TIGR00358 3_prime_RNase VacB and RNase II family 3'-5' exoribonucleases. This model is defined to identify a pair of paralogous 3-prime exoribonucleases in E. coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterized originally as required for the expression of virulence genes, but is now recognized as the exoribonuclease RNase R (Rnr). Its paralog in E. coli and H. influenzae is designated exoribonuclease II (Rnb). Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cutoff to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3-prime exoribonucleases. [Transcription, Degradation of RNA] 654
16871 273034 TIGR00359 cello_pts_IIC phosphotransferase system, cellobiose specific, IIC component. The family consists of the cellobiose specific form of the phosphotransferase system (PTS), IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 423
16872 273035 TIGR00360 ComEC_N-term ComEC/Rec2-related protein. The related model ComEC_Rec2 (TIGR00361) describes a set of proteins of ~ 700-800 residues, one each from a number of different species, of which most can become competent for natural transformation with exogenous DNA. The best-studied examples are ComEC from Bacillus subtilis and Rec-2 from Haemophilus influenzae, where the protein appears to form part of the DNA import structure. This model represents a region found in full-length ComEC/Rec2 and shorter homologs of unknown function from a large number of additional bacterial species, most of which are not known to become competent for transformation (an exception is Helicobacter pylori). [Unknown function, General] 171
16873 273036 TIGR00361 ComEC_Rec2 DNA internalization-related competence protein ComEC/Rec2. Apparent orthologs are found in 5 species so far (Haemophilus influenzae, Escherichia coli, Bacillus subtilis, Neisseria gonorrhoeae, Streptococcus pneumoniae), of which all but E. coli are model systems for the study of competence for natural transformation. This protein is a predicted multiple membrane-spanning protein likely to be involved in DNA internalization. In a large number of bacterial species not known to exhibit competence, this protein is replaced by a half-length N-terminal homolog of unknown function, modelled by the related model ComEC_N-term. The role for this protein in species that are not naturally transformable is unknown. [Cellular processes, DNA transformation] 662
16874 273037 TIGR00362 DnaA chromosomal replication initiator protein DnaA. DnaA is involved in DNA biosynthesis and the initiation of chromosome replication, and can also be a transcription regulator. The C-terminal of the family hits the pfam bacterial DnaA (bac_dnaA) domain family. For a review, see Kaguni (2006). [DNA metabolism, DNA replication, recombination, and repair] 437
16875 129460 TIGR00363 TIGR00363 lipoprotein, YaeC family. This family of putative lipoproteins contains a consensus site for lipoprotein signal sequence cleavage. Included in this family is the E. coli hypothetical protein yaeC. About half of the proteins between the noise and trusted cutoffs contain the consensus lipoprotein signature and may belong to this family. [Cell envelope, Other] 258
16876 129461 TIGR00364 TIGR00364 queuosine biosynthesis protein QueC. Members of this protein family are QueC, involved in synthesizing pre-Q0 from GTP en route to tRNA modification with queuosine. This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti, the gene was designated exsB, possibly because of polar effects on exsA expression in a shared polycistronic mRNA. In Arthrobacter viscosus, the homologous gene was designated ALU1 and was associated with an aluminum tolerance phenotype. [Unknown function, General] 201
16877 188046 TIGR00365 TIGR00365 monothiol glutaredoxin, Grx4 family. The gene for the member of this glutaredoxin family in E. coli, originally designated ydhD, is now designated grxD. Its protein, Grx4, is a monothiol glutaredoxin similar to Grx5 of yeast, which is involved in iron-sulfur cluster formation. [Energy metabolism, Electron transport] 97
16878 273038 TIGR00366 TIGR00366 TIGR00366 family protein. [Hypothetical proteins, Conserved] 438
16879 273039 TIGR00367 TIGR00367 K+-dependent Na+/Ca+ exchanger related-protein. This model models a family of bacterial and archaeal proteins that is homologous, except for lacking a central region of ~ 250 amino acids and an N-terminal region of > 100 residues, to a functionally proven potassium-dependent sodium-calcium exchanger of the rat. [Unknown function, General] 307
16880 129465 TIGR00368 TIGR00368 Mg chelatase-related protein. The N-terminal end matches very strongly a pfam Mg_chelatase domain. [Unknown function, General] 499
16881 161843 TIGR00369 unchar_dom_1 uncharacterized domain 1. Most proteins containing this domain consist almost entirely of a single copy of this domain. A protein from C. elegans consists of two tandem copies of the domain. The domain is also found as the N-terminal region of an apparent initiation factor eIF-2B alpha subunit of Aquifex aeolicus. The function of the domain is unknown. 117
16882 129467 TIGR00370 TIGR00370 sensor histidine kinase inhibitor, KipI family. [Hypothetical proteins, Conserved] 202
16883 273040 TIGR00372 cas4 CRISPR-associated protein Cas4. This model represents a family of proteins associated with CRISPR repeats in a wide set of prokaryotic genomes. The scope of this model has been broadened since it was first built to describe an archaeal subset only. The function of the protein is undefined. Distantly related proteins, excluded from this model, include ORFs from Mycobacteriophage D29 and Sulfolobus islandicus filamentous virus and a region of the Schizosaccharomyces pombe DNA replication helicase Dna2p. 178
16884 129469 TIGR00373 TIGR00373 transcription factor E. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain. [Transcription, Transcription factors] 158
16885 129470 TIGR00374 TIGR00374 conserved hypothetical protein. This model is built on a superfamily of proteins in the Archaea and in Aquifex aeolicus. The authenticity of homology can be seen in the presence of motifs in the alignment that include residues relatively rare among these sequences, even though the alignment includes long regions of low-complexity hydrophobic sequences. One apparent fusion protein contains a Glycos_transf_2 region in the N-terminal half of the protein and a region homologous to this superfamily in the C-terminal region. [Unknown function, General] 319
16886 161657 TIGR00375 TIGR00375 TIGR00375 family protein. The member of this family from Methanococcus jannaschii, MJ0043, is considerably longer and appears to contain an intein N-terminal to the region of homology. [Hypothetical proteins, Conserved] 374
16887 273041 TIGR00376 TIGR00376 DNA helicase, putative. The gene product may represent a DNA helicase. Eukaryotic members of this family have been characterized as binding certain single-stranded G-rich DNA sequences (GGGGT and GGGCT). A number of related proteins are characterized as helicases. [DNA metabolism, DNA replication, recombination, and repair] 636
16888 273042 TIGR00377 ant_ant_sig anti-anti-sigma factor. This superfamily includes small (105-125 residue) proteins related to SpoIIAA of Bacillus subtilis, an anti-anti-sigma factor. SpoIIAA can bind to and inhibit the anti-sigma F factor SpoIIAB. Also, it can be phosphorylated by SpoIIAB on a Ser residue at position 59 of the seed alignment. A similar arrangement is inferred for RsbV, an anti-anti-sigma factor for sigma B. This Ser is fairly well conserved within a motif resembling MXS[STA]G[VIL]X[VIL][VILF] among homologous known or predicted anti-anti-sigma factors. Regions similar to SpoIIAA and apparently homologous, but differing considerably near the phosphorylated Ser of SpoIIAA, appear in a single copy in several longer proteins. [Regulatory functions, Protein interactions] 108
16889 273043 TIGR00378 cax calcium/proton exchanger (cax). [Transport and binding proteins, Cations and iron carrying compounds] 349
16890 273044 TIGR00379 cobB cobyrinic acid a,c-diamide synthase. This model describes cobyrinic acid a,c-diamide synthase, the cobB (cbiA in Salmonella) protein of cobalamin biosynthesis. It is responsible for the amidation of carboxylic groups at positions A and C of either cobyrinic acid or hydrogenobyrinic acid. NH(2) groups are provided by glutamine and one molecule of ATP is hydrolyzed for each amidation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 449
16891 273045 TIGR00380 cobD cobalamin biosynthesis protein CobD. This protein is involved in cobalamin (vitamin B12) biosynthesis and porphyrin biosynthesis. It converts cobyric acid to cobinamide by the addition of aminopropanol on the F carboxylic group. It is part of the cob operon. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 305
16892 273046 TIGR00381 cdhD CO dehydrogenase/acetyl-CoA synthase, delta subunit. This is the small subunit of a heterodimer which catalyzes the reaction CO + H2O + Acceptor = CO2 + Reduced acceptor and is involved in the synthesis of acetyl-CoA from CO2 and H2. [Energy metabolism, Chemoautotrophy] 389
16893 273047 TIGR00382 clpX endopeptidase Clp ATP-binding regulatory subunit (clpX). A member of the ATP-dependent proteases, ClpX has ATP-dependent chaperone activity and is required for specific ATP-dependent proteolytic activities expressed by ClpPX. The gene is also found to be involved in stress tolerance in Bacillus subtilis and is essential for the efficient acquisition of genes specifying type IA and IB restriction. [Protein fate, Protein folding and stabilization, Protein fate, Degradation of proteins, peptides, and glycopeptides] 413
16894 273048 TIGR00383 corA magnesium Mg(2+) and cobalt Co(2+) transport protein (corA). The article in Microb Comp Genomics 1998;3(3):151-69 discusses this family and suggests that some members may have functions other than Mg2+ transport. [Transport and binding proteins, Cations and iron carrying compounds] 318
16895 273049 TIGR00384 dhsB succinate dehydrogenase and fumarate reductase iron-sulfur protein. Succinate dehydrogenase and fumarate reductase are reverse directions of the same enzymatic interconversion, succinate + FAD+ = fumarate + FADH2 (EC 1.3.11.1). In E. coli, the forward and reverse reactions are catalyzed by distinct complexes: fumarate reductase operates under anaerobic conditions and succinate dehydrogenase operates under aerobic conditions. This model also describes a region of the B subunit of a cytosolic archaeal fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle] 220
16896 129481 TIGR00385 dsbE periplasmic protein thiol:disulfide oxidoreductases, DsbE subfamily. Involved in the biogenesis of c-type cytochromes as well as in disulfide bond formation in some periplasmic proteins. [Protein fate, Protein folding and stabilization] 173
16897 273050 TIGR00387 glcD glycolate oxidase, subunit GlcD. This protein, the glycolate oxidase GlcD subunit, is similar in sequence to that of several D-lactate dehydrogenases, including that of E. coli. The glycolate oxidase has been found to have some D-lactate dehydrogenase activity. [Energy metabolism, Other] 413
16898 129483 TIGR00388 glyQ glycyl-tRNA synthetase, tetrameric type, alpha subunit. This tetrameric form of glycyl-tRNA synthetase (2 alpha, 2 beta) is found in the majority of completed eubacterial genomes, with the two genes fused in a few species. A substantially different homodimeric form (not recognized by this model) replaces this form in the Archaea, animals, yeasts, and some eubacteria. [Protein synthesis, tRNA aminoacylation] 293
16899 273051 TIGR00389 glyS_dimeric glycyl-tRNA synthetase, dimeric type. This model describes a glycyl-tRNA synthetase distinct from the two alpha and two beta chains of the tetrameric E. coli glycyl-tRNA synthetase. This enzyme is a homodimeric class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes His, Ser, Pro, and this set of glycyl-tRNA synthetases. [Protein synthesis, tRNA aminoacylation] 551
16900 273052 TIGR00390 hslU ATP-dependent protease HslVU, ATPase subunit. This model represents the ATPase subunit of HslVU, while the proteasome-related peptidase subunit is HslV. Residues 54-61 of the model contain a P-loop ATP-binding motif. Cys-287 of E. coli (position 308 in the seed alignment) is Ser in other members of the seed alignment. [Protein fate, Protein folding and stabilization] 441
16901 273053 TIGR00391 hydA hydrogenase (NiFe) small subunit (hydA). Also called hupA, hydA, hupS, hoxK, or vhtG. Involved in hydrogenase reactions performing different specific functions in different species, e.g. (EC 1.12.2.1) in Desulfovibrio gigas, (EC 1.12.99.3) in Wolinella succinogenes, (EC 1.18.99.1) in E. coli and a number of other species, and (EC 1.12.99.-) in the archaea. [Energy metabolism, Electron transport] 365
16902 273054 TIGR00392 ileS isoleucyl-tRNA synthetase. The isoleucyl tRNA synthetase (IleS) is a class I amino acyl-tRNA ligase and is particularly closely related to the valyl tRNA synthetase. This model may recognize IleS from every species, including eukaryotic cytosolic and mitochondrial forms. [Protein synthesis, tRNA aminoacylation] 861
16903 129488 TIGR00393 kpsF KpsF/GutQ family protein. This model describes a number of closely related proteins with the phosphosugar-binding domain SIS (Sugar ISomerase) followed by two copies of the CBS (named after Cystathionine Beta Synthase) domain. One is GutQ, a protein of the glucitol operon. Another is KpsF, a virulence factor involved in capsular polysialic acid biosynthesis in some pathogenic strains of E. coli. [Energy metabolism, Sugars] 268
16904 273055 TIGR00394 lac_pts_IIC phosphotransferase system, lactose specific, IIC component. This family of proteins models the IIC domain of the phosphotransferase system (PTS) for lactose. The IIC domain catalyzes the transfer of a phosphoryl group from the IIB domain to lactose. When the IIC component and IIB components are in the same polypeptide chain they are designated IIBC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 412
16905 273056 TIGR00395 leuS_arch leucyl-tRNA synthetase, archaeal and cytosolic family. The leucyl-tRNA synthetases belong to two families so broadly different that they are represented by separate models. This model includes both archaeal and cytosolic eukaryotic leucyl-tRNA synthetases; the eubacterial and mitochondrial forms differ so substantially that some other tRNA ligases score higher by this model than does any eubacterial LeuS. [Protein synthesis, tRNA aminoacylation] 938
16906 273057 TIGR00396 leuS_bact leucyl-tRNA synthetase, eubacterial and mitochondrial family. The leucyl-tRNA synthetases belong to two families so broadly different that they are represented by separate models. This model includes both eubacterial and mitochondrial leucyl-tRNA synthetases. It generates higher scores for some valyl-tRNA synthetases than for any archaeal or eukaryotic cytosolic leucyl-tRNA synthetase. Note that the enzyme from Aquifex aeolicus is split into alpha and beta chains; neither chain is long enough to score above the trusted cutoff, but the alpha chain scores well above the noise cutoff. The beta chain must be found by a model and search designed for partial length matches. [Protein synthesis, tRNA aminoacylation] 842
16907 129492 TIGR00397 mauM_napG MauM/NapG family ferredoxin-type protein. MauM is involved in methylamine utilization. NapG is associated with nitrate reductase activity. The two proteins are highly similar. [Energy metabolism, Electron transport] 213
16908 273058 TIGR00398 metG methionine--tRNA ligase. The methionyl-tRNA synthetase (metG) is a class I amino acyl-tRNA ligase. This model appears to recognize the methionyl-tRNA synthetase of every species, including eukaryotic cytosolic and mitochondrial forms. The UPGMA difference tree calculated after search and alignment according to this model shows an unusual deep split between two families of MetG. One family contains forms from the Archaea, yeast cytosol, spirochetes, and E. coli, among others. The other family includes forms from yeast mitochondrion, Synechocystis sp., Bacillus subtilis, the Mycoplasmas, Aquifex aeolicus, and Helicobacter pylori. The E. coli enzyme is homodimeric, although monomeric forms can be prepared that are fully active. Activity of this enzyme in bacteria includes aminoacylation of fMet-tRNA with Met; subsequent formylation of the Met to fMet is catalyzed by a separate enzyme. Note that the protein from Aquifex aeolicus is split into an alpha (large) and beta (small) subunit; this model does not include the C-terminal region corresponding to the beta chain. [Protein synthesis, tRNA aminoacylation] 530
16909 273059 TIGR00399 metG_C_term methionyl-tRNA synthetase C-terminal region/beta chain. The methionyl-tRNA synthetase (metG) is a class I amino acyl-tRNA ligase. This model describes a region of the methionyl-tRNA synthetase that is present at the C-terminus of MetG in some species (E. coli, B. subtilis, Thermotoga maritima, Methanobacterium thermoautotrophicum), and as a separate beta chain in Aquifex aeolicus. It is absent in a number of other species (e.g. Mycoplasma genitalium, Mycobacterium tuberculosis), while Pyrococcus horikoshii has both a full length MetG and a second protein homologous to the beta chain only. Proteins hit by this model should be called methionyl-tRNA synthetase beta chain if and only if the model metG hits a separate protein not also hit by this model. [Protein synthesis, tRNA aminoacylation] 137
16910 129495 TIGR00400 mgtE Mg2+ transporter (mgtE). This family of prokaryotic proteins models a class of Mg++ transporter first described in Bacillus firmus. May form a homodimer. [Transport and binding proteins, Cations and iron carrying compounds] 449
16911 129496 TIGR00401 msrA methionine-S-sulfoxide reductase. This model describes peptide methionine sulfoxide reductase (MsrA), a repair enzyme for proteins that have been inactivated by oxidation. The enzyme from E. coli is coextensive with this model and has enzymatic activity. However, in all completed genomes in which this module is present, a second protein module, described in TIGR00357, is also found, and in several cases as part of the same polypeptide chain: N-terminal to this module in Helicobacter pylori and Haemophilus influenzae (as in PilB of Neisseria gonorrhoeae) but C-terminal to it in Treponema pallidum. PilB, containing both domains, has been shown to be important for the expression of adhesins in certain pathogens. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions] 149
16912 273060 TIGR00402 napF ferredoxin-type protein NapF. The gene codes for a ferredoxin-type cytosolic protein, NapF, of the periplasmic nitrate reductase system, as in Escherichia coli. NapF interacts with the catalytic subunit, NapA, and may be an accessory protein for NapA maturation. [Energy metabolism, Electron transport] 101
16913 129498 TIGR00403 ndhI NADH-plastoquinone oxidoreductase subunit I protein. [Energy metabolism, Electron transport] 183
16914 129499 TIGR00405 KOW_elon_Spt5 transcription elongation factor Spt5. This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein. 145
16915 273061 TIGR00406 prmA ribosomal protein L11 methyltransferase. Ribosomal protein L11 methyltransferase is an S-adenosyl-L-methionine-dependent methyltransferase required for the modification of ribosomal protein L11. This protein is found in bacteria and (with a probable transit peptide) in Arabidopsis. [Protein synthesis, Ribosomal proteins: synthesis and modification] 288
16916 161862 TIGR00407 proA gamma-glutamyl phosphate reductase. The related model TIGR01092 describes a full-length fusion protein delta l-pyrroline-5-carboxylate synthetase that includes a gamma-glutamyl phosphate reductase region as described by this model. Alternate name: glutamate-5-semialdehyde dehydrogenase. The prosite motif begins at residue 332 of the seed alignment although not all of the members of the family exactly obey the motif. [Amino acid biosynthesis, Glutamate family] 398
16917 273062 TIGR00408 proS_fam_I prolyl-tRNA synthetase, family I. Prolyl-tRNA synthetase is a class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent families. This family includes the archaeal enzyme, the Pro-specific domain of a human multifunctional tRNA ligase, and the enzyme from the spirochete Borrelia burgdorferi. The other family includes enzymes from Escherichia coli, Bacillus subtilis, Synechocystis PCC6803, and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae. [Protein synthesis, tRNA aminoacylation] 472
16918 273063 TIGR00409 proS_fam_II prolyl-tRNA synthetase, family II. Prolyl-tRNA synthetase is a class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent groups. This group includes enzymes from Escherichia coli, Bacillus subtilis, Aquifex aeolicus, the spirochete Treponema pallidum, Synechocystis PCC6803, and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae. The other group includes the Pro-specific domain of a human multifunctional tRNA ligase and the prolyl-tRNA synthetases from the Archaea, the Mycoplasmas, and the spirochete Borrelia burgdorferi. [Protein synthesis, tRNA aminoacylation] 568
16919 273064 TIGR00410 lacE PTS system, lactose/cellobiose family IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family of proteins consists of both the cellobiose specific and the lactose specific forms of the phosphotransferase system (PTS) IIC component. The IIC domain catalyzes the transfer of a phosphoryl group from the IIB domain to the substrate. When the IIC component and IIB components are in the same polypeptide chain they are designated IIBC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 423
16920 129505 TIGR00411 redox_disulf_1 small redox-active disulfide protein 1. This protein is homologous to a family of proteins that includes thioredoxins, glutaredoxins, protein-disulfide isomerases, and others, some of which have several such domains. The sequence of this protein at the redox-active disulfide site, CPYC, matches glutaredoxins rather than thioredoxins, although its overall sequence seems closer to thioredoxins. It is suggested to be a ribonucleotide-reducing system component distinct from thioredoxin or glutaredoxin. [Unknown function, General] 82
16921 129506 TIGR00412 redox_disulf_2 small redox-active disulfide protein 2. This small protein is found in three archaeal species so far (Methanococcus jannaschii, Archaeoglobus fulgidus, and Methanobacterium thermoautotrophicum) as well as in Anabaena PCC7120. It is homologous to thioredoxins, glutaredoxins, and protein disulfide isomerases, and shares with them a redox-active disulfide. The redox active disulfide region CXXC motif resembles neither thioredoxin nor glutaredoxin. A closely related protein found in the same three Archaea, described by redox_disulf_1, has a glutaredoxin-like CP[YH]C sequence; it has been characterized in functional assays as redox-active but unlikely to be a thioredoxin or glutaredoxin. [Unknown function, General] 76
16922 273065 TIGR00413 rlpA rare lipoprotein A. This is a family of prokaryotic proteins with unknown function. Lipoprotein annotation based on the presence of consensus lipoprotein signal sequence. Included in this family is the E. coli putative lipoprotein rlpA. [Cell envelope, Other] 208
16923 273066 TIGR00414 serS seryl-tRNA synthetase. This model represents the seryl-tRNA synthetase found in most organisms. This protein is a class II tRNA synthetase, and is recognized by the pfam model tRNA-synt_2b. The seryl-tRNA synthetases of two archaeal species, Methanococcus jannaschii and Methanobacterium thermoautotrophicum, differ considerably and are included in a different model. [Protein synthesis, tRNA aminoacylation] 418
16924 129509 TIGR00415 serS_MJ seryl-tRNA synthetase, Methanococcus jannaschii family. The seryl-tRNA synthetases from a few of the Archaea, represented by this model, are very different from the set of mutually more closely related seryl-tRNA synthetases from Eubacteria, Eukaryotes, and other Archaea. Although distantly homologous, the present set differs enough not to be recognized by the pfam model tRNA-synt_2b that recognizes the remainder of seryl-tRNA synthetases among other class II amino-acyl tRNA synthetases. [Protein synthesis, tRNA aminoacylation] 520
16925 273067 TIGR00416 sms DNA repair protein RadA. The gene product is a probable ATP-dependent protease involved in both DNA repair and degradation of proteins, peptides, glycopeptides. Also known as sms. Residues 11-28 of the SEED alignment contain a putative Zn binding domain. Residues 110-117 of the seed contain a putative ATP binding site, both documented in Haemophilus (SP:P45266) and in Listeria monocytogenes (SP:Q48761). For E. coli see J. BACTERIOL. 178:5045-5048 (1996). [DNA metabolism, DNA replication, recombination, and repair] 454
16926 188048 TIGR00417 speE spermidine synthase. The SpeE subunit of spermidine synthase catalyzes the reaction (putrescine + S-adenosylmethioninamine = spermidine + 5'-methylthioadenosine) and is involved in polyamine biosynthesis and in the biosynthesis of spermidine from arginine. The region between residues 77 and 120 of the seed alignment is thought to be involved in binding to decarboxylated SAM. [Central intermediary metabolism, Polyamine biosynthesis] 271
16927 273068 TIGR00418 thrS threonyl-tRNA synthetase. This model represents the threonyl-tRNA synthetase found in most organisms. This protein is a class II tRNA synthetase, and is recognized by the pfam model tRNA-synt_2b. Note that B. subtilis has closely related isozymes thrS and thrZ. The N-terminal regions are quite dissimilar between archaeal and eubacterial forms, while some eukaryotic forms are missing sequence there altogether. [Protein synthesis, tRNA aminoacylation] 563
16928 129513 TIGR00419 tim triosephosphate isomerase. Triosephosphate isomerase (tim/TPIA) is the glycolytic enzyme that catalyzes the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. The active site of the enzyme is located between residues 240-258 of the model ([AV]-Y-E-P-[LIVM]-W-[SA]-I-G-T-[GK]) with E being the active site residue. There is a slight deviation from this sequence within the archaeal members of this family. [Energy metabolism, Glycolysis/gluconeogenesis] 205
16929 273069 TIGR00420 trmU tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase. tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (trmU, asuE, or mnmA) is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine (mnm5s2U34) present in the wobble position of some tRNAs. This enzyme appears not to occur in the Archaea. [Protein synthesis, tRNA and rRNA base modification] 352
16930 129515 TIGR00421 ubiX_pad UbiX family flavin prenyltransferase. UbiX partners with UbiD for decarboxylation of the 3-octaprenyl-4-hydroxybenzoate precursor during ubiquinone biosynthesis, but the role of UbiX is as a flavin prenyltransferase that provides a cofactor UbiD requires. In E. coli, the protein UbiX (3-octaprenyl-4-hydroxybenzoate carboxy-lyase) has been shown to be involved in the third step of ubiquinone biosynthesis, the reaction [3-octaprenyl-4-hydroxybenzoate = 2-octaprenylphenol + CO2]. The knockout of the homologous protein in yeast confers sensitivity to phenylacrylic acid. Members are not restricted to ubiquinone-synthesizing species. This family represents a distinct clade within the flavoprotein family of pfam02441. 181
16931 273070 TIGR00422 valS valyl-tRNA synthetase. The valyl-tRNA synthetase (ValS) is a class I amino acyl-tRNA ligase and is particularly closely related to the isoleucyl tRNA synthetase. [Protein synthesis, tRNA aminoacylation] 861
16932 273071 TIGR00423 TIGR00423 radical SAM domain protein, CofH subfamily. This protein family includes the CofH protein of coenzyme F(420) biosynthesis from Methanocaldococcus jannaschii, but appears to hit genomes more broadly than just the subset that make coenzyme F(420), so that narrower group is being built as a separate family. [Hypothetical proteins, Conserved] 309
16933 273072 TIGR00424 APS_reduc 5'-adenylylsulfate reductase, thioredoxin-independent. This enzyme, involved in the assimilation of inorganic sulfate, is closely related to the thioredoxin-dependent PAPS reductase of Bacteria (CysH) and Saccharomyces cerevisiae. However, it has its own C-terminal thioredoxin-like domain and is not thioredoxin-dependent. Also, it has a substrate preference for 5'-adenylylsulfate (APS) over 3'-phosphoadenylylsulfate (PAPS) so the pathway does not require an APS kinase (CysC) to convert APS to PAPS. Arabidopsis thaliana appears to have three isozymes, all able to complement E. coli CysH mutants (even in backgrounds lacking thioredoxin or APS kinase) but likely localized to different compartments in Arabidopsis. [Central intermediary metabolism, Sulfur metabolism] 463
16934 273073 TIGR00425 CBF5 rRNA pseudouridine synthase, putative. This family, found in archaea and eukaryotes, includes the only archaeal proteins markedly similar to bacterial TruB, the tRNA pseudouridine 55 synthase. However, among two related yeast proteins, the archaeal set matches yeast YLR175w far better than YNL292w. The first, termed centromere/microtubule binding protein 5 (CBF5), is an apparent rRNA pseudouridine synthase, while the second is the exclusive tRNA pseudouridine 55 synthase for both cytosolic and mitochondrial compartments. It is unclear whether archaeal proteins found by this model modify tRNA, rRNA, or both. [Protein synthesis, tRNA and rRNA base modification] 322
16935 129520 TIGR00426 TIGR00426 competence protein ComEA helix-hairpin-helix repeat region. Members of the subfamily recognized by this model include competence protein ComEA and closely related proteins from a number of species that exhibit competence for transformation by exogenous DNA, including Streptococcus pneumoniae, Bacillus subtilis, Neisseria meningitidis, and Haemophilus influenzae. This model represents a region of two tandem copies of a helix-hairpin-helix domain (pfam00633), each about 30 residues in length. Limited sequence similarity can be found among some members of this family N-terminal to the region covered by this model. [Cellular processes, DNA transformation] 69
16936 129521 TIGR00427 TIGR00427 membrane protein, MarC family. MarC is a protein that spans the plasma membrane multiple times and once was thought to be a multiple antibiotic resistance protein. The function for this family is unknown. [Unknown function, General] 201
16937 129522 TIGR00430 Q_tRNA_tgt tRNA-guanine transglycosylase. This tRNA-guanine transglycosylase (tgt) catalyzes an exchange for the guanine base at position 34 of many tRNAs; this nucleotide is subsequently modified to queuosine. The Archaea have a closely related enzyme that catalyzes a base exchange for guanine at position 15 in some tRNAs, a site that is subsequently converted to the archaeal-specific modified base archaeosine (7-formamidino-7-deazaguanosine), while Archaeoglobus fulgidus has both enzymes. [Protein synthesis, tRNA and rRNA base modification] 368
16938 129523 TIGR00431 TruB tRNA pseudouridine(55) synthase. TruB, the tRNA pseudouridine 55 synthase, converts uracil to pseudouridine in the T loop of most tRNAs in all three domains of life. This model is built on a seed alignment of bacterial proteins only. Saccharomyces cerevisiae protein YNL292w (Pus4) has been shown to be the pseudouridine 55 synthase of both cytosolic and mitochondrial compartments, active at no other position on tRNA and the only enzyme active at that position in the species. A distinct yeast protein YLR175w, (centromere/microtubule-binding protein CBF5) is an rRNA pseudouridine synthase, and the archaeal set is much more similar to CBF5 than to Pus4. It is unclear whether the archaeal proteins found by this model are tRNA pseudouridine 55 synthases like TruB, rRNA pseudouridine synthases like CBF5, or (as suggested by the absence of paralogs in the Archaea) both. CBF5 likely has additional, eukaryotic-specific functions. The trusted cutoff is set above the scores for the archaeal homologs of unknown function, so yeast Pus4p scores between trusted and noise. [Protein synthesis, tRNA and rRNA base modification] 209
16939 273074 TIGR00432 arcsn_tRNA_tgt tRNA-guanine(15) transglycosylase. This tRNA-guanine transglycosylase (tgt) differs from the tgt of E. coli and other Bacteria in the site of action and the modification that results. It exchanges 7-cyano-7-deazaguanine (preQ0) with guanine at position 15 of archaeal tRNA; this nucleotide is subsequently converted to archaeosine, found exclusively in the Archaea. This enzyme from Haloferax volcanii has been purified, characterized, and partially sequenced and is the basis for identifying this family. In contrast, bacterial tgt (TIGR00430) catalyzes the exchange of preQ0 or preQ1 for the guanine base at position 34; this nucleotide is subsequently modified to queuosine. Archaeoglobus fulgidus has both enzymes, while some other Archaea have just this one. [Protein synthesis, tRNA and rRNA base modification] 540
16940 273075 TIGR00433 bioB biotin synthase. Catalyzes the last step of the biotin biosynthesis pathway. All members of the seed alignment are in the immediate gene neighborhood of a bioA gene. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 296
16941 129526 TIGR00434 cysH phosphoadenylyl-sulfate reductase (thioredoxin). This enzyme, involved in the assimilation of inorganic sulfate, is designated cysH in Bacteria and MET16 in Saccharomyces cerevisiae. Synonyms include phosphoadenosine phosphosulfate reductase, PAPS reductase, and PAPS reductase, thioredoxin-dependent. In a reaction requiring reduced thioredoxin and NADPH, it converts 3(prime)-phosphoadenylylsulfate (PAPS) to sulfite and adenosine 3(prime),5(prime) diphosphate (PAP). A related family of plant enzymes, scoring below the trusted cutoff, differs in having a thioredoxin-like C-terminal domain, not requiring thioredoxin, and in having a preference for 5(prime)-adenylylsulfate (APS) over PAPS. [Central intermediary metabolism, Sulfur metabolism] 212
16942 273076 TIGR00435 cysS cysteinyl-tRNA synthetase. This model finds the cysteinyl-tRNA synthetase from most but not from all species. The enzyme from one archaeal species, Archaeoglobus fulgidus, is found but the equivalent enzymes from some other Archaea, including Methanococcus jannaschii, are not found, although biochemical evidence suggests that tRNA(Cys) in these species are charged directly with Cys rather than through a misacylation and correction pathway as for tRNA(Gln). [Protein synthesis, tRNA aminoacylation] 464
16943 129528 TIGR00436 era GTP-binding protein Era. Era is an essential GTPase in Escherichia coli and many other bacteria. It plays a role in ribosome biogenesis. Few bacteria lack this protein. [Protein synthesis, Other] 270
16944 273077 TIGR00437 feoB ferrous iron transporter FeoB. FeoB (773 amino acids in E. coli), a cytoplasmic membrane protein required for iron(II) uptake, is encoded in an operon with FeoA (75 amino acids), which is also required, and is regulated by Fur. There appear to be two copies in Archaeoglobus fulgidus and Clostridium acetobutylicum. [Transport and binding proteins, Cations and iron carrying compounds] 591
16945 273078 TIGR00438 rrmJ cell division protein FtsJ. Methylates the 23S rRNA. Previously known as cell division protein ftsJ.// Trusted cutoff too high? [SS 10/1/04] [Protein synthesis, tRNA and rRNA base modification] 188
16946 129531 TIGR00439 ftsX putative protein insertion permease FtsX. FtsX is an integral membrane protein encoded in the same operon as signal recognition particle docking protein FtsY and FtsE. It belongs to a family of predicted permeases and may play a role in the insertion of proteins required for potassium transport, cell division, and other activities. FtsE is a hydrophilic nucleotide-binding protein that associates with the inner membrane by means of association with FtsX. [Cellular processes, Cell division, Protein fate, Protein and peptide secretion and trafficking] 309
16947 273079 TIGR00440 glnS glutaminyl-tRNA synthetase. This protein is a relatively rare aminoacyl-tRNA synthetase, found in the cytosolic compartment of eukaryotes, in E. coli and a number of other Gram-negative Bacteria, and in Deinococcus radiodurans. In contrast, the pathway to Gln-tRNA in mitochondria, Archaea, Gram-positive Bacteria, and a number of other lineages is by misacylation with Glu followed by transamidation to correct the aminoacylation to Gln. This enzyme is a class I tRNA synthetase (hit by the pfam model tRNA-synt_1c) and is quite closely related to glutamyl-tRNA synthetases. [Protein synthesis, tRNA aminoacylation] 522
16948 129533 TIGR00441 gmhA phosphoheptose isomerase. This model describes phosphoheptose isomerase. Because a closely related paralog in Escherichia coli differs in function (DnaA initiator-associating protein diaA), this model has been rebuilt with a high stringency, and is likely to miss many true examples for phosphoheptose isomerase. Involved in lipopolysaccharide biosynthesis, it may have a role in virulence in Haemophilus ducreyi. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 154
16949 273080 TIGR00442 hisS histidyl-tRNA synthetase. This model finds a histidyl-tRNA synthetase in every completed genome. Apparent second copies from Bacillus subtilis, Synechocystis sp., and Aquifex aeolicus are slightly shorter, more closely related to each other than to other hisS proteins, and actually serve as regulatory subunits for an enzyme of histidine biosynthesis. They were excluded from the seed alignment and score much lower than do single copy histidyl-tRNA synthetases of other genomes not included in the seed alignment. These putative second copies of HisS score below the trusted cutoff. The regulatory protein kinase GCN2 of Saccharomyces cerevisiae (YDR283c), and related proteins from other species designated eIF-2 alpha kinase, have a domain closely related to histidyl-tRNA synthetase that may serve to detect and respond to uncharged tRNA(his), an indicator of amino acid starvation; these regulatory proteins are not orthologous and so score below the noise cutoff. [Protein synthesis, tRNA aminoacylation] 404
16950 273081 TIGR00443 hisZ_biosyn_reg ATP phosphoribosyltransferase, regulatory subunit. Apparent second copies of histidyl-tRNA synthetase, found in Bacillus subtilis, Synechocystis sp., Aquifex aeolicus, and others, are in fact a regulatory subunit of ATP phosphoribosyltransferase, and usually encoded by a gene adjacent to that encoding the catalytic subunit. [Amino acid biosynthesis, Histidine family] 313
16951 273082 TIGR00444 mazG MazG family protein. This family of prokaryotic proteins has no known function. It includes the uncharacterized protein MazG in E. coli. [Unknown function, General] 248
16952 161884 TIGR00445 mraY phospho-N-acetylmuramoyl-pentapeptide-transferase. Involved in peptidoglycan biosynthesis, the enzyme catalyzes the first of the lipid cycle reactions. Also known as Muramoyl-Pentapeptide Transferase (murX). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 321
16953 188051 TIGR00446 nop2p NOL1/NOP2/sun family putative RNA methylase. [Protein synthesis, tRNA and rRNA base modification] 264
16954 213531 TIGR00447 pth aminoacyl-tRNA hydrolase. The natural substrate for this enzyme may be peptidyl-tRNAs that drop off the ribosome during protein synthesis. Peptidyl-tRNA hydrolase is a bacterial protein; YHR189W from Saccharomyces cerevisiae appears to be orthologous and likely has the same function. [Protein synthesis, Other] 188
16955 129540 TIGR00448 rpoE DNA-directed RNA polymerase (rpoE), archaeal and eukaryotic form. This family seems to be confined to archaeal and eukaryotic taxa, and its members are quite dissimilar to E. coli rpoE. [Transcription, DNA-dependent RNA polymerase] 179
16956 129541 TIGR00449 tgt_general tRNA-guanine family transglycosylase. Different tRNA-guanine transglycosylases catalyze different tRNA base modifications. Two guanine base substitutions by different enzymes described by the model are involved in generating queuosine at position 34 in bacterial tRNAs and archaeosine at position 15 in archaeal tRNAs. This model is designed for fragment searching, so the superfamily is used loosely. [Protein synthesis, tRNA and rRNA base modification] 367
16957 273083 TIGR00450 mnmE_trmE_thdF tRNA modification GTPase TrmE. TrmE, also called MnmE and previously designated ThdF (thiophene and furan oxidation protein), is a GTPase involved in tRNA modification to create 5-methylaminomethyl-2-thiouridine in the wobble position of some tRNAs. This protein and GidA form an alpha2/beta2 heterotetramer. [Protein synthesis, tRNA and rRNA base modification] 442
16958 129543 TIGR00451 unchar_dom_2 uncharacterized domain 2. This uncharacterized domain is found in a number of enzymes and uncharacterized proteins, often at the C-terminus. It is found in some but not all members of a family of related tRNA-guanine transglycosylases (tgt), which exchange a guanine base for some modified base without breaking the phosphodiester backbone of the tRNA. It is also found in rRNA pseudouridine synthase, another enzyme of RNA base modification not otherwise homologous to tgt. It is found, again at the C-terminus, in two putative glutamate 5-kinases. It is also found in a family of small, uncharacterized archaeal proteins consisting mostly of this domain. 107
16959 273084 TIGR00452 TIGR00452 tRNA (mo5U34)-methyltransferase. This model describes CmoB, the enzyme tRNA (mo5U34)-methyltransferase involved in tRNA wobble base modification. [Unknown function, Enzymes of unknown specificity] 316
16960 213532 TIGR00453 ispD 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase. Members of this protein family are 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, the IspD protein of the deoxyxylulose pathway of IPP biosynthesis. In about twenty percent of bacterial genomes, this protein occurs as IspDF, a bifunctional fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 217
16961 200016 TIGR00454 TIGR00454 TIGR00454 family protein. At this time this gene appears to be present only in Archaea. [Hypothetical proteins, Conserved] 175
16962 129547 TIGR00455 apsK adenylyl-sulfate kinase. This protein, adenylylsulfate kinase, is often found as a fusion protein with sulfate adenylyltransferase. Important residue (active site in E.coli) is residue 100 of the seed alignment. [Central intermediary metabolism, Sulfur metabolism] 184
16963 273085 TIGR00456 argS arginyl-tRNA synthetase. This model recognizes arginyl-tRNA synthetase in every completed genome to date. An interesting feature of the alignment of all arginyl-tRNA synthetases is a fairly deep split between two families. One family includes archaeal, eukaryotic and organellar, spirochete, E. coli, and Synechocystis sp. The second, sharing a deletion of about 25 residues in the central region relative to the first, includes Bacillus subtilis, Aquifex aeolicus, the Mycoplasmas and Mycobacteria, and the Gram-negative bacterium Helicobacter pylori. [Protein synthesis, tRNA aminoacylation] 563
16964 273086 TIGR00457 asnS asparaginyl-tRNA synthetase. In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, asnS, represents asparaginyl-tRNA synthetases from the three domains of life. Some species lack this enzyme and charge tRNA(asn) by misacylation with Asp, followed by transamidation of Asp to Asn. [Protein synthesis, tRNA aminoacylation] 453
16965 273087 TIGR00458 aspS_nondisc nondiscriminating aspartyl-tRNA synthetase. In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, aspS_arch, represents aspartyl-tRNA synthetases from the eukaryotic cytosol and from the Archaea. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA(asn) is subsequently transamidated to Asn-tRNA(asn). [Protein synthesis, tRNA aminoacylation] 428
16966 211576 TIGR00459 aspS_bact aspartyl-tRNA synthetase, bacterial type. Aspartate--tRNA ligases in this family may be discriminating (6.1.1.12) or nondiscriminating (6.1.1.23). In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, aspS_bact, represents aspartyl-tRNA synthetases from the Bacteria and from mitochondria. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA(asn) is subsequently transamidated to Asn-tRNA(asn). This model generates very low scores for the archaeal type of aspS and for asnS; scores between the trusted and noise cutoffs represent fragmentary sequences. [Protein synthesis, tRNA aminoacylation] 583
16967 273088 TIGR00460 fmt methionyl-tRNA formyltransferase. The top-scoring characterized proteins other than methionyl-tRNA formyltransferase (fmt) itself are formyltetrahydrofolate dehydrogenases. The mitochondrial methionyl-tRNA formyltransferases are so divergent that, in a multiple alignment of bacterial fmt, mitochondrial fmt, and formyltetrahydrofolate dehydrogenases, the mitochondrial fmt appears the most different. However, because both bacterial and mitochondrial fmt are included in the seed alignment, all credible fmt sequences score higher than any non-fmt sequence. This enzyme modifies Met on initiator tRNA to f-Met. [Protein synthesis, tRNA aminoacylation] 313
16968 273089 TIGR00461 gcvP glycine dehydrogenase (decarboxylating). This apparently ubiquitous enzyme is found in bacterial, mammalian and plant sources. The enzyme catalyzes the reaction: GLYCINE + LIPOYLPROTEIN = S-AMINOMETHYL-DIHYDROLIPOYLPROTEIN + CO2. It is part of the glycine decarboxylase multienzyme complex (GDC) consisting of four proteins P, H, L and T. Active site in E.coli is located as the (K) residues at position 713 of the SEED alignment. [Energy metabolism, Amino acids and amines] 939
16969 273090 TIGR00462 genX EF-P lysine aminoacylase GenX. Many Gram-negative bacteria have a protein closely homologous to the C-terminal region of lysyl-tRNA synthetase (LysS). Multiple sequence alignment of these proteins with the homologous regions of collected LysS proteins shows that these proteins form a distinct set rather than just similar truncations of LysS. The protein is termed GenX after its designation in E. coli. Interestingly, genX often is located near a homolog of lysine-2,3-aminomutase. Its function is unknown. [Unknown function, General] 290
16970 273091 TIGR00463 gltX_arch glutamyl-tRNA synthetase, archaeal and eukaryotic family. The glutamyl-tRNA synthetases of the eukaryotic cytosol and of the Archaea are more similar to glutaminyl-tRNA synthetases than to bacterial glutamyl-tRNA synthetases. This model models just the eukaryotic cytosolic and archaeal forms of the enzyme. In some eukaryotes, the glutamyl-tRNA synthetase is part of a longer, multifunctional aminoacyl-tRNA ligase. In many species, the charging of tRNA(gln) proceeds first through misacylation with Glu and then transamidation. For this reason, glutamyl-tRNA synthetases, including all known archaeal enzymes (as of 2010) may act on both tRNA(gln) and tRNA(glu). [Protein synthesis, tRNA aminoacylation] 556
16971 273092 TIGR00464 gltX_bact glutamyl-tRNA synthetase, bacterial family. The glutamyl-tRNA synthetases of the eukaryotic cytosol and of the Archaea are more similar to glutaminyl-tRNA synthetases than to bacterial glutamyl-tRNA synthetases. This model models just the bacterial and mitochondrial forms of the enzyme. In many species, the charging of tRNA(gln) proceeds first through misacylation with Glu and then transamidation. For this reason, glutamyl-tRNA synthetases may act on both tRNA(gln) and tRNA(glu). This model is highly specific. Proteins with positive scores below the trusted cutoff may be fragments rather than full-length sequences. [Protein synthesis, tRNA aminoacylation] 470
16972 273093 TIGR00465 ilvC ketol-acid reductoisomerase. This is the second enzyme in the parallel isoleucine-valine biosynthetic pathway [Amino acid biosynthesis, Pyruvate family] 314
16973 129558 TIGR00466 kdsB 3-deoxy-D-manno-octulosonate cytidylyltransferase. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 238
16974 273094 TIGR00467 lysS_arch lysyl-tRNA synthetase, archaeal and spirochete. This model represents the lysyl-tRNA synthetases that are class I amino-acyl tRNA synthetases. It includes archaeal and spirochete examples of the enzyme. All other known examples are class IIc amino-acyl tRNA synthetases and seem to form a separate orthologous set. [Protein synthesis, tRNA aminoacylation] 515
16975 273095 TIGR00468 pheS phenylalanyl-tRNA synthetase, alpha subunit. Most phenylalanyl-tRNA synthetases are heterodimeric, with 2 alpha (pheS) and 2 beta (pheT) subunits. This model describes the alpha subunit, which shows some similarity to class II aminoacyl-tRNA ligases. Mitochondrial phenylalanyl-tRNA synthetase is a single polypeptide chain, active as a monomer, and similar to this chain rather than to the beta chain, but excluded from this model. An interesting feature of the alignment of all sequences captured by this model is a deep split between non-spirochete bacterial examples and all other examples; supporting this split is a relative deletion of about 50 residues in the former set between two motifs well conserved throughout the alignment. [Protein synthesis, tRNA aminoacylation] 293
16976 129561 TIGR00469 pheS_mito phenylalanyl-tRNA synthetase, mitochondrial. Unlike all other known phenylalanyl-tRNA synthetases, the mitochondrial form demonstrated from yeast is monomeric. It is similar to but longer than the alpha subunit (PheS) of the alpha 2 beta 2 form found in Bacteria, Archaea, and eukaryotes, and shares the characteristic motifs of class II aminoacyl-tRNA ligases. This model models the experimental example from Saccharomyces cerevisiae (designated MSF1) and its orthologs from other eukaryotic species. [Protein synthesis, tRNA aminoacylation] 460
16977 129562 TIGR00470 sepS O-phosphoserine--tRNA ligase. This family of archaeal proteins resembles known phenylalanyl-tRNA synthetase alpha chains. Recently, it was shown to act in a proposed pathway of tRNA(Cys) indirect aminoacylation, resulting in Cys biosynthesis from O-phosphoserine, in certain archaea. It charges tRNA(Cys) with O-phosphoserine. The pscS gene product converts the phosphoserine to Cys. [Amino acid biosynthesis, Serine family, Protein synthesis, tRNA aminoacylation] 533
16978 273096 TIGR00471 pheT_arch phenylalanyl-tRNA synthetase, beta subunit. Every known example of the phenylalanyl-tRNA synthetase, except the monomeric form of mitochondrial, is an alpha 2 beta 2 heterotetramer. The beta subunits break into two subfamilies that are considerably different in sequence, length, and pattern of gaps. This model represents the subfamily that includes the beta subunit from eukaryotic cytosol, the Archaea, and spirochetes. [Protein synthesis, tRNA aminoacylation] 551
16979 273097 TIGR00472 pheT_bact phenylalanyl-tRNA synthetase, beta subunit, non-spirochete bacterial. Every known example of the phenylalanyl-tRNA synthetase, except the monomeric form of mitochondrial, is an alpha 2 beta 2 heterotetramer. The beta subunits break into two subfamilies that are considerably different in sequence, length, and pattern of gaps. This model represents the subfamily that includes the beta subunit from Bacteria other than spirochetes, as well as a chloroplast-encoded form from Porphyra purpurea. The chloroplast-derived sequence is considerably shorter at the amino end. [Protein synthesis, tRNA aminoacylation] 797
16980 273098 TIGR00473 pssA CDP-diacylglycerol--serine O-phosphatidyltransferase. This enzyme, CDP-diacylglycerol--serine O-phosphatidyltransferase, is involved in phospholipid biosynthesis catalyzing the reaction CDP-diacylglycerol + L-serine = CMP + L-1-phosphatidylserine. Members of this family do not bear any significant sequence similarity to the corresponding E.coli protein. [Fatty acid and phospholipid metabolism, Biosynthesis] 151
16981 273099 TIGR00474 selA L-seryl-tRNA(Sec) selenium transferase. In bacteria, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes SelA. This model excludes homologs that appear to differ in function from Frankia alni, Helicobacter pylori, Methanococcus jannaschii and other archaea, and so on. [Protein synthesis, tRNA aminoacylation] 454
16982 129567 TIGR00475 selB selenocysteine-specific elongation factor SelB. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes the elongation factor SelB, a close homolog of EF-Tu. It may function by replacing EF-Tu. A C-terminal domain not found in EF-Tu is in all SelB sequences in the seed alignment except that from Methanococcus jannaschii. This model does not find an equivalent protein for eukaryotes. [Protein synthesis, Translation factors] 581
16983 273100 TIGR00476 selD selenium donor protein. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes SelD, known as selenophosphate synthetase, selenium donor protein, and selenide,water dikinase. SelD provides reduced selenium for the selenium transferase SelA. This protein itself contains selenocysteine in many species; any sequence scoring well but not aligning to the beginning of the model is likely to have a selenocysteine residue incorrectly interpreted as a stop codon upstream of the given sequence. The SelD protein also provides selenophosphate for the enzyme tRNA 2-selenouridine synthase, which catalyzes a tRNA base modification. It also contributes to selenium incorporation by selenium-dependent molybdenum hydroxylases (SDMH), in genomes with the marker TIGR03309. All genomes with SelD should make selenocysteine, selenouridine, SDMH, or some combination. 301
16984 188054 TIGR00477 tehB tellurite resistance protein TehB. Part of a tellurite-reducing operon tehA and tehB [Cellular processes, Toxin production and resistance] 195
16985 129570 TIGR00478 tly TlyA family rRNA methyltransferase/putative hemolysin. Members of this family include TlyA from Mycobacterium tuberculosis, an rRNA methylase whose modifications are necessary to confer sensitivity to ribosome-targeting antibiotics capreomycin and viomycin. Homology supports identification as a methyltransferase. However, a parallel literature persists in calling some members hemolysins. Hemolysins are exotoxins that attack blood cell membranes and cause cell rupture, often by forming a pore in the membrane. A recent study (2013) on SCO1782 from Streptomyces coelicolor shows hemolysin activity as earlier described for a homolog from the spirochete Serpula (Treponema) hyodysenteriae and one from Mycobacterium tuberculosis. [Unknown function, General] 228
16986 129571 TIGR00479 rumA 23S rRNA (uracil-5-)-methyltransferase RumA. This protein family was first proposed to be RNA methyltransferases by homology to the TrmA family. The member from E. coli has now been shown to act as the 23S RNA methyltransferase for the conserved U1939. The gene is now designated rumA and was previously designated ygcA. [Protein synthesis, tRNA and rRNA base modification] 431
16987 129572 TIGR00481 TIGR00481 Raf kinase inhibitor-like protein, YbhB/YbcL family. [Unknown function, General] 141
16988 273101 TIGR00482 TIGR00482 nicotinate (nicotinamide) nucleotide adenylyltransferase. This model represents the predominant bacterial/eukaryotic adenylyltransferase for nicotinamide-nucleotide, its deamido form nicotinate nucleotide, or both. The first activity, nicotinamide-nucleotide adenylyltransferase (EC 2.7.7.1), synthesizes NAD by the salvage pathway, while the second, nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18) synthesizes the immediate precursor of NAD by the de novo pathway. In E. coli, NadD activity is biased toward the de novo pathway while salvage activity is channeled through the multifunctional NadR protein, but this division of labor may be exceptional. The given name of this model, nicotinate (nicotinamide) nucleotide adenylyltransferase, reflects the lack of absolute specificity with respect to substrate amidation state in most species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 193
16989 129574 TIGR00483 EF-1_alpha translation elongation factor EF-1 alpha. This model represents the counterpart of bacterial EF-Tu for the Archaea (aEF-1 alpha) and Eukaryotes (eEF-1 alpha). The trusted cutoff is set fairly high so that incomplete sequences will score between suggested and trusted cutoff levels. [Protein synthesis, Translation factors] 426
16990 129575 TIGR00484 EF-G translation elongation factor EF-G. After peptide bond formation, this elongation factor of bacteria and organelles catalyzes the translocation of the tRNA-mRNA complex, with its attached nascent polypeptide chain, from the A-site to the P-site of the ribosome. Every completed bacterial genome has at least one copy, but some species have additional EF-G-like proteins. The closest homolog to canonical (e.g. E. coli) EF-G in the spirochetes clusters as if it is derived from mitochondrial forms, while a more distant second copy is also present. Synechocystis PCC6803 has a few proteins more closely related to EF-G than to any other characterized protein. Two of these resemble E. coli EF-G more closely than does the best match from the spirochetes; it may be that both function as authentic EF-G. [Protein synthesis, Translation factors] 689
16991 129576 TIGR00485 EF-Tu translation elongation factor TU. This model models orthologs of translation elongation factor EF-Tu in bacteria, mitochondria, and chloroplasts, one of several GTP-binding translation factors found by the more general pfam model GTP_EFTU. The eukaryotic counterpart, eukaryotic translation elongation factor 1 (eEF-1 alpha), is excluded from this model. EF-Tu is one of the most abundant proteins in bacteria, as well as one of the most highly conserved, and in a number of species the gene is duplicated with identical function. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors. Transfer RNA is carried to the ribosome in these complexes for protein translation. [Protein synthesis, Translation factors] 394
16992 213534 TIGR00486 YbgI_SA1388 dinuclear metal center protein, YbgI/SA1388 family. The characterization of this family of uncharacterized proteins as orthologous is tentative. Members are found in all three domains of life. Several members (from Bacillus subtilis, Listeria monocytogenes, and Mycobacterium tuberculosis - all classified as Firmicutes within the Eubacteria) share a long insert relative to other members. [Unknown function, General] 249
16993 273102 TIGR00487 IF-2 translation initiation factor IF-2. This model discriminates eubacterial (and mitochondrial) translation initiation factor 2 (IF-2), encoded by the infB gene in bacteria, from similar proteins in the Archaea and Eukaryotes. In the bacteria and in organelles, the initiator tRNA is charged with N-formyl-Met instead of Met. This translation factor acts in delivering the initiator tRNA to the ribosome. It is one of a number of GTP-binding translation factors recognized by the pfam model GTP_EFTU. [Protein synthesis, Translation factors] 587
16994 273103 TIGR00488 TIGR00488 putative HD superfamily hydrolase of NAD metabolism. The function of this protein family is unknown. Members of this family of uncharacterized proteins from the Mycoplasmas are longer at the amino end, fused to a region of nicotinamide nucleotide adenylyltransferase, an NAD salvage biosynthesis enzyme. Members are putative metal-dependent phosphohydrolases for NAD metabolism. [Unknown function, Enzymes of unknown specificity] 158
16995 129580 TIGR00489 aEF-1_beta translation elongation factor aEF-1 beta. This model describes the archaeal translation elongation factor aEF-1 beta. The member from Sulfolobus solfataricus was demonstrated experimentally. It is a dimer that catalyzes the exchange of GDP for GTP on aEF-1 alpha. [Protein synthesis, Translation factors] 88
16996 129581 TIGR00490 aEF-2 translation elongation factor aEF-2. This model represents archaeal elongation factor 2, a protein more similar to eukaryotic EF-2 than to bacterial EF-G, both in sequence similarity and in sharing with eukaryotes the property of having a diphthamide (modified His) residue at a conserved position. The diphthamide can be ADP-ribosylated by diphtheria toxin in the presence of NAD. [Protein synthesis, Translation factors] 720
16997 273104 TIGR00491 aIF-2 translation initiation factor aIF-2/yIF-2. This model describes archaeal and eukaryotic orthologs of bacterial IF-2. Like IF-2, it helps convey the initiator tRNA to the ribosome, although the initiator is N-formyl-Met in bacteria and Met here. This protein is not closely related to the subunits of eIF-2 of eukaryotes, which is also involved in the initiation of translation. The aIF-2 of Methanococcus jannaschii contains a large intein interrupting a region of very strongly conserved sequence very near the amino end; the alignment generated by this model does not correctly align the sequences from Methanococcus jannaschii and Pyrococcus horikoshii in this region. [Protein synthesis, Translation factors] 591
16998 129583 TIGR00492 alr alanine racemase. This enzyme interconverts L-alanine and D-alanine. Its primary function is to generate D-alanine for cell wall formation. With D-alanine-D-alanine ligase, it makes up the D-alanine branch of the peptidoglycan biosynthetic route. It is a monomer with one pyridoxal phosphate per subunit. In E. coli, the ortholog is duplicated so that a second isozyme, DadX, is present. DadX, a paralog of the biosynthetic Alr, is induced by D- or L-alanine and is involved in catabolism. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 367
16999 188055 TIGR00493 clpP ATP-dependent Clp endopeptidase, proteolytic subunit ClpP. This model for the proteolytic subunit ClpP has been rebuilt to a higher stringency. In every bacterial genome with the ClpXP machine, a ClpP protein will be found that scores well with this model. In general, this ClpP member will be encoded adjacent to the clpX gene, as were all examples used in the seed alignment. A large fraction of genomes have one or more additional ClpP paralogs, sometimes encoded nearby and sometimes elsewhere. The stringency of the trusted cutoff used here excludes the more divergent ClpP paralogs from being called authentic ClpP by this model. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 192
17000 129585 TIGR00494 crcB protein CrcB. The role of this protein is uncharacterized, but phenotypes associated with overproduction include resistance to camphor, suppression of a mukB chromosomal partition mutant, and chromosome condensation, together suggesting a function related to chromosome folding. [Unknown function, General] 117
17001 273105 TIGR00495 crvDNA_42K 42K curved DNA binding protein. Proteins identified by this model have been identified in a number of species as a nuclear (but not nucleolar) protein with a cell cycle dependence. Various names given to members of this family have included cell cycle protein p38-2G4, DNA-binding protein GBP16, and proliferation-associated protein 1. This protein is closely related to methionine aminopeptidase, a cobalt-binding protein. [Unknown function, General] 390
17002 129587 TIGR00496 frr ribosome recycling factor. This model finds only eubacterial proteins. Mitochondrial and/or chloroplast forms might be expected but are not currently known. This protein was previously called ribosome releasing factor. By releasing ribosomes from mRNA at the end of protein biosynthesis, it prevents inappropriate translation from 3-prime regions of the mRNA and frees the ribosome for new rounds of translation. EGAD|53116|YHR038W is part of the frr superfamily. [Protein synthesis, Translation factors] 176
17003 211578 TIGR00497 hsdM type I restriction system adenine methylase (hsdM). Function: methylation of specific adenine residues; required for both restriction and modification activities. The ECOR124/3 I enzyme recognizes 5'GAA(N7)RTCG. for E.coli see (J. Mol. Biol. 257: 960-969 (1996)). [DNA metabolism, Restriction/modification] 501
17004 273106 TIGR00498 lexA SOS regulatory protein LexA. LexA acts as a homodimer to repress a number of genes involved in the response to DNA damage (SOS response), including itself and RecA. RecA, in the presence of single-stranded DNA, acts as a co-protease to activate a latent autolytic protease activity (EC 3.4.21.88) of LexA, where the active site Ser is part of LexA. The autolytic cleavage site is an Ala-Gly bond in LexA (at position 84-85 in E. coli LexA; this sequence is replaced by Gly-Gly in Synechocystis). The cleavage leads to derepression of the SOS regulon and eventually to DNA repair. LexA in Bacillus subtilis is called DinR. LexA is much less broadly distributed than RecA. [DNA metabolism, DNA replication, recombination, and repair, Regulatory functions, DNA interactions] 199
17005 273107 TIGR00499 lysS_bact lysyl-tRNA synthetase, eukaryotic and non-spirochete bacterial. This model represents the lysyl-tRNA synthetases that are class II amino-acyl tRNA synthetases. It includes all eukaryotic and most bacterial examples of the enzyme, but not archaeal or spirochete forms. [Protein synthesis, tRNA aminoacylation] 493
17006 129591 TIGR00500 met_pdase_I methionine aminopeptidase, type I. Methionine aminopeptidase is a cobalt-binding enzyme. Bacterial and organellar examples (type I) differ from eukaryotic and archaeal (type II) examples in lacking a region of approximately 60 amino acids between the 4th and 5th cobalt-binding ligands. This model describes type I. The role of this protein in general is to produce the mature form of cytosolic proteins by removing the N-terminal methionine. [Protein fate, Protein modification and repair] 247
17007 129592 TIGR00501 met_pdase_II methionine aminopeptidase, type II. Methionine aminopeptidase (map) is a cobalt-binding enzyme. Bacterial and organellar examples (type I) differ from eukaryotic and archaeal (type II) examples in lacking a region of approximately 60 amino acids between the 4th and 5th cobalt-binding ligands. The role of this protein in general is to produce the mature amino end of cytosolic proteins by removing the N-terminal methionine. This model describes type II, among which the eukaryotic members typically have an N-terminal extension not present in archaeal members. It can act cotranslationally. The enzyme from rat has been shown to associate with translation initiation factor 2 (IF-2) and may have a role in translational regulation. [Protein fate, Protein modification and repair] 295
17008 129593 TIGR00502 nagB glucosamine-6-phosphate isomerase. The set of proteins recognized by this model includes a closely related pair from Bacillus subtilis, one of which is uncharacterized but included as a member of the orthologous set. [Central intermediary metabolism, Amino sugars] 259
17009 129594 TIGR00503 prfC peptide chain release factor 3. This translation releasing factor, RF-3 (prfC) was originally described as stop codon-independent, in contrast to peptide chain release factor 1 (RF-1, prfA) and RF-2 (prfB). RF-1 and RF-2 are closely related to each other, while RF-3 is similar to elongation factors EF-Tu and EF-G; RF-1 is active at UAA and UAG and RF-2 is active at UAA and UGA. More recently, RF-3 was shown to be active primarily at UGA stop codons in E. coli. All bacteria and organelles have RF-1. The Mycoplasmas and organelles, which translate UGA as Trp rather than as a stop codon, lack RF-2. RF-3, in contrast, seems to be rare among bacteria and is found so far only in Escherichia coli and some other gamma subdivision Proteobacteria, in Synechocystis PCC6803, and in Staphylococcus aureus. [Protein synthesis, Translation factors] 527
17010 129595 TIGR00504 pyro_pdase pyroglutamyl-peptidase I. Alternate names include pyroglutamate aminopeptidase, pyrrolidone-carboxylate peptidase, and 5-oxoprolyl-peptidase. It removes pyroglutamate (pyrrolidone-carboxylate, a modified glutamine) that can otherwise block hydrolysis of a polypeptide at the amino end, and so can be extremely useful in the biochemical studies of proteins. The biological role in the various species in which it is found is not fully understood. The enzyme appears to be a homodimer. It does not closely resemble any other peptidases. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 212
17011 129596 TIGR00505 ribA GTP cyclohydrolase II. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. The function of archaeal members of the family has not been demonstrated and is assigned tentatively. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 191
17012 273108 TIGR00506 ribB 3,4-dihydroxy-2-butanone 4-phosphate synthase. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 199
17013 161904 TIGR00507 aroE shikimate dehydrogenase. This model finds proteins from prokaryotes and functionally equivalent domains from larger, multifunctional proteins of fungi and plants. Below the trusted cutoff of 180, but above the noise cutoff of 20, are the putative shikimate dehydrogenases of Thermotoga maritima and Mycobacterium tuberculosis, and uncharacterized paralogs of shikimate dehydrogenase from E. coli and H. influenzae. The related enzyme quinate 5-dehydrogenase scores below the noise cutoff. A neighbor-joining tree, constructed with quinate 5-dehydrogenases as the outgroup, shows the Chlamydial homolog as clustering among the shikimate dehydrogenases, although the sequence is unusual in the degree of sequence divergence and the presence of an additional N-terminal domain. [Amino acid biosynthesis, Aromatic amino acid family] 270
17014 273109 TIGR00508 bioA adenosylmethionine-8-amino-7-oxononanoate transaminase. All members of the seed alignment have been demonstrated experimentally to act as EC 2.6.1.62, an enzyme in the biotin biosynthetic pathway. Alternate names include 7,8-diaminopelargonic acid aminotransferase, DAPA aminotransferase, and adenosylmethionine-8-amino-7-oxononanoate aminotransferase. The gene symbol is bioA in E. coli and BIO3 in S. cerevisiae. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 421
17015 273110 TIGR00509 bisC_fam molybdopterin guanine dinucleotide-containing S/N-oxide reductases. This enzyme family shares sequence similarity and a requirement for a molybdenum cofactor as the only prosthetic group. The form of the cofactor is a single molybdenum atom coordinated by two molybdopterin guanine dinucleotide molecules. Members of the family include biotin sulfoxide reductase, dimethylsulfoxide reductase, and trimethylamine-N-oxide reductase, although a single member may show all those activities and related activities; it may not be possible to resolve the primary function for members of this family by sequence comparison alone. A number of similar molybdoproteins in which the N-terminal region contains a CXXXC motif and may bind an iron-sulfur cluster are excluded from this set, including formate dehydrogenases and nitrate reductases. Also excluded is the A chain of a heteromeric, anaerobic DMSO reductase, which also contains the CXXXC motif. 770
17016 273111 TIGR00510 lipA lipoate synthase. This enzyme is an iron-sulfur protein. It is localized to mitochondria in yeast and Arabidopsis. It generates lipoic acid, a thiol antioxidant that is linked to a specific Lys as prosthetic group for the pyruvate and alpha-ketoglutarate dehydrogenase complexes and the glycine-cleavage system. The family shows strong sequence conservation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Lipoate] 302
17017 188057 TIGR00511 ribulose_e2b2 ribose-1,5-bisphosphate isomerase, e2b2 family. The delineation of this family was based originally, in part, on a discussion and neighbor-joining phylogenetic study by Kyrpides and Woese of archaeal and other proteins homologous to the alpha, beta, and delta subunits of eukaryotic initiation factor 2B (eIF-2B), a five-subunit molecule that catalyzes GTP recycling for eIF-2. Recently, Sato, et al. assigned the function ribulose-1,5 bisphosphate isomerase. [Energy metabolism, Other] 301
17018 273112 TIGR00512 salvage_mtnA S-methyl-5-thioribose-1-phosphate isomerase. The delineation of this family was based in part on a discussion and neighbor-joining phylogenetic study, by Kyrpides and Woese, of archaeal and other proteins homologous to the alpha, beta, and delta subunits of eukaryotic initiation factor 2B (eIF-2B), a five-subunit molecule that catalyzes GTP recycling for eIF-2. This clade is now recognized to include the methionine salvage pathway enzyme MtnA. [Amino acid biosynthesis, Aspartate family] 335
17019 273113 TIGR00513 accA acetyl-CoA carboxylase, carboxyl transferase, alpha subunit. The enzyme acetyl-CoA carboxylase contains a biotin carboxyl carrier protein or domain, a biotin carboxylase, and a carboxyl transferase. This model represents the alpha chain of the carboxyl transferase for cases in which the architecture of the protein is as in E. coli, in which the carboxyltransferase portion consists of two non-identical subunits, alpha and beta. [Fatty acid and phospholipid metabolism, Biosynthesis] 316
17020 129605 TIGR00514 accC acetyl-CoA carboxylase, biotin carboxylase subunit. This model represents the biotin carboxylase subunit found usually as a component of acetyl-CoA carboxylase. Acetyl-CoA carboxylase is designated EC 6.4.1.2 and this component, biotin carboxylase, has its own designation, EC 6.3.4.14. Homologous domains are found in eukaryotic forms of acetyl-CoA carboxylase and in a number of other carboxylases (e.g. pyruvate carboxylase), but seed members and trusted cutoff are selected so as to exclude these. In some systems, the biotin carboxyl carrier protein and this protein (biotin carboxylase) may be shared by different carboxyltransferases. However, this model is not intended to identify the biotin carboxylase domain of propionyl-coA carboxylase. The model should hit the full length of proteins, except for chloroplast transit peptides in plants. If it hits a domain only of a longer protein, there may be a problem with the identification. [Fatty acid and phospholipid metabolism, Biosynthesis] 449
17021 129606 TIGR00515 accD acetyl-CoA carboxylase, carboxyl transferase, beta subunit. The enzyme acetyl-CoA carboxylase contains a biotin carboxyl carrier protein or domain, a biotin carboxylase, and a carboxyl transferase. This model represents the beta chain of the carboxyl transferase for cases in which the architecture of the protein is as in E. coli, in which the carboxyltransferase portion consists of two non-identical subunits, alpha and beta. [Fatty acid and phospholipid metabolism, Biosynthesis] 285
17022 273114 TIGR00516 acpS holo-[acyl-carrier-protein] synthase. Formerly dpj. This enzyme adds the prosthetic group, phosphopantetheine, to the acyl carrier protein (ACP) apo-enzyme to generate the holo-enzyme. Related phosphopantetheine--protein transferases also exist. There is an orthologous domain in eukaryotic proteins. [Fatty acid and phospholipid metabolism, Biosynthesis] 121
17023 213536 TIGR00517 acyl_carrier acyl carrier protein. This small protein has phosphopantetheine covalently bound to a Ser residue. It acts as a carrier of the growing fatty acid chain, which is bound to the prosthetic group, during fatty acid biosynthesis. Homologous phosphopantetheine-binding domains are found in longer proteins. Acyl carrier proteins scoring above the noise cutoff but below the trusted cutoff may be specialized versions. These include those involved in mycolic acid biosynthesis in the Mycobacteria, lipid A biosynthesis in Rhizobium, actinorhodin polyketide synthesis in Streptomyces coelicolor, etc. This protein is not found in the Archaea. Gene name acpP. S (Ser) at position 37 in the seed alignment, in the motif DSLD, is the phosphopantetheine attachment site. [Fatty acid and phospholipid metabolism, Biosynthesis] 77
17024 129609 TIGR00518 alaDH alanine dehydrogenase. The family of known L-alanine dehydrogenases (EC 1.4.1.1) includes representatives from the Proteobacteria, Firmicutes, Cyanobacteria, and Actinobacteria, all with about 50 % identity or better. An outlier to this group in both sequence and gap pattern is the homolog from Helicobacter pylori, an epsilon division Proteobacteria, which must be considered a putative alanine dehydrogenase. In Mycobacterium smegmatis and M. tuberculosis, the enzyme doubles as a glycine dehydrogenase (1.4.1.10), running in the reverse direction (glyoxylate amination to glycine, with conversion of NADH to NAD+). Related proteins include saccharopine dehydrogenase and the N-terminal half of the NAD(P) transhydrogenase alpha subunit. All of these related proteins bind NAD and/or NADP. [Energy metabolism, Amino acids and amines] 370
17025 129610 TIGR00519 asnASE_I L-asparaginase, type I. Two related families of asparaginase are designated type I and type II according to the terminology in E. coli, which has both: L-asparaginase I is a low-affinity enzyme found in the cytoplasm, while L-asparaginase II is a high-affinity secreted enzyme synthesized with a cleavable signal sequence. This model describes L-asparaginases related to type I of E. coli. Archaeal putative asparaginases are of this type but contain an extra ~ 80 residues in a conserved N-terminal region. These archaeal homologs are included in this model. 336
17026 273115 TIGR00520 asnASE_II L-asparaginase, type II. Two related families of asparaginase (L-asparagine amidohydrolase, EC 3.5.1.1) are designated type I and type II according to the terminology in E. coli, which has both: L-asparaginase I is a low-affinity enzyme found in the cytoplasm, while L-asparaginase II is a high-affinity periplasmic enzyme synthesized with a cleavable signal sequence. This model describes L-asparaginases related to type II of E. coli. Both the cytoplasmic and the cell wall asparaginases of Saccharomyces cerevisiae belong to this set. Members of this set from Acinetobacter glutaminasificans and Pseudomonas fluorescens are described as having both glutaminase and asparaginase activities. All members are homotetrameric. [Energy metabolism, Amino acids and amines] 349
17027 273116 TIGR00521 coaBC_dfp phosphopantothenoylcysteine decarboxylase / phosphopantothenate--cysteine ligase. This model represents a bifunctional enzyme that catalyzes the second and third steps (cysteine ligation, EC 6.3.2.5, and decarboxylation, EC 4.1.1.36) in the biosynthesis of coenzyme A (CoA) from pantothenate in bacteria. In early descriptions of this flavoprotein, a ts mutation in one region of the protein appeared to cause a defect in DNA metabolism rather than an increased need for the pantothenate precursor beta-alanine. This protein was then called dfp, for DNA/pantothenate metabolism flavoprotein. The authors responsible for detecting phosphopantothenate--cysteine ligase activity suggest renaming this bifunctional protein coaBC for its role in CoA biosynthesis. This enzyme contains the FMN cofactor, but no FAD or pyruvoyl group. The amino-terminal region contains the phosphopantothenoylcysteine decarboxylase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 391
17028 273117 TIGR00522 dph5 diphthine synthase. Alternate name: diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase. This protein participates in the modification of a specific His of elongation factor 2 of eukaryotes and Archaea to diphthamide. The protein was characterized in Saccharomyces cerevisiae and designated DPH5. [Protein fate, Protein modification and repair] 257
17029 273118 TIGR00523 eIF-1A eukaryotic/archaeal initiation factor 1A. Recommended nomenclature: eIF-1A for eukaryotes, aIF-1A for Archaea. Also called eIF-4C. [Protein synthesis, Translation factors] 98
17030 273119 TIGR00524 eIF-2B_rel eIF-2B alpha/beta/delta-related uncharacterized proteins. This model, eIF-2B_rel, describes half of a superfamily, where the other half consists of eukaryotic translation initiation factor 2B (eIF-2B) subunits alpha, beta, and delta. It is unclear whether the eIF-2B_rel set is monophyletic, or whether they are all more closely related to each other than to any eIF-2B subunit because the eIF-2B clade is highly derived. Members of this branch of the family are all uncharacterized with respect to function and are found in the Archaea, Bacteria, and Eukarya, although a number are described as putative translation initiation factor components. Proteins found by eIF-2B_rel include at least three clades, including a set of uncharacterized eukaryotic proteins, a set found in some but not all Archaea, and a set universal so far among the Archaea and closely related to several uncharacterized bacterial proteins. [Unknown function, General] 303
17031 213537 TIGR00525 folB dihydroneopterin aldolase. This model describes a bacterial dihydroneopterin aldolase, shown to form homo-octamers in E. coli. The equivalent activity is catalyzed by domains of larger folate biosynthesis proteins in other systems. The closely related paralogous enzyme in E. coli, dihydroneopterin triphosphate epimerase, which is also homo-octameric, and dihydroneopterin aldolase domains of larger proteins, score below the trusted cutoff but may score well above the noise cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 116
17032 273120 TIGR00526 folB_dom FolB domain. Two paralogous genes of E. coli, folB (dihydroneopterin aldolase) and folX (d-erythro-7,8-dihydroneopterin triphosphate epimerase) are homologous to each other and homo-octameric. In Pneumocystis carinii, a multifunctional enzyme of folate synthesis has an N-terminal region active as dihydroneopterin aldolase. This region consists of two tandem sequences each homologous to folB and forms tetramers. 118
17033 200024 TIGR00527 gcvH glycine cleavage system H protein. This model represents the glycine cleavage system H protein, which shuttles the methylamine group of glycine from the P protein to the T protein. The mature protein is about 130 residues long and contains a lipoyl group covalently bound to a conserved Lys residue. The genome of Aquifex aeolicus contains one protein scoring above the trusted cutoff and clustering with other bacterial H proteins, and four more proteins clustering together and scoring below the trusted cutoff; it seems doubtful that all of these homologs are authentic H protein. The Chlamydial homolog of H protein is nearly as divergent as the Aquifex outgroup, is not accompanied by P and T proteins, is not included in the seed alignment, and consequently also scores below the trusted cutoff. [Energy metabolism, Amino acids and amines] 128
17034 273121 TIGR00528 gcvT glycine cleavage system T protein. The glycine cleavage system T protein (GcvT) is also known as aminomethyltransferase (EC 2.1.2.10). It works with the H protein (GcvH), the P protein (GcvP), and lipoamide dehydrogenase. The reported sequence of the member from Aquifex aeolicus starts about 50 residues downstream of the start of other members of the family (perhaps in error); it scores below the trusted cutoff. Eukaryotic forms are mitochondrial and have an N-terminal transit peptide. [Energy metabolism, Amino acids and amines] 362
17035 273122 TIGR00529 AF0261 integral membrane protein, TIGR00529 family. This protein is predicted to have 10 transmembrane regions. Members of this family are found so far in the Archaea (Archaeoglobus fulgidus and Pyrococcus horikoshii) and in a bacterial thermophile, Thermotoga maritima. In Pyrococcus, the gene is located between nadA and nadB, two components of an enzyme involved in de novo synthesis of NAD. By PSI-BLAST, this family shows similarity (but not necessarily homology) to gluconate permease and other transport proteins. [Hypothetical proteins, Conserved] 387
17036 129621 TIGR00530 AGP_acyltrn 1-acyl-sn-glycerol-3-phosphate acyltransferases. This model describes the core homologous region of a collection of related proteins, several of which are known to act as 1-acyl-sn-glycerol-3-phosphate acyltransferases (EC 2.3.1.51). Proteins scoring above the trusted cutoff are likely to have the same general activity. However, there is variation among characterized members as to whether the acyl group can be donated by acyl carrier protein or coenzyme A, and in the length and saturation of the donated acyl group. 1-acyl-sn-glycerol-3-phosphate acyltransferase is also called 1-AGP acyltransferase, lysophosphatidic acid acyltransferase, and LPA acyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis] 130
17037 273123 TIGR00531 BCCP acetyl-CoA carboxylase, biotin carboxyl carrier protein. This model is designed to identify biotin carboxyl carrier protein as a peptide of acetyl-CoA carboxylase. Scoring below the trusted cutoff is a related protein encoded in a region associated with polyketide synthesis in the prokaryote Saccharopolyspora hirsuta, and a reported chloroplast-encoded biotin carboxyl carrier protein that may be highly derived from the last common ancestral sequence. Scoring below the noise cutoff are biotin carboxyl carrier domains of other enzymes such as pyruvate carboxylase. The gene name is accB or fabE. [Fatty acid and phospholipid metabolism, Biosynthesis] 155
17038 129623 TIGR00532 HMG_CoA_R_NAD hydroxymethylglutaryl-CoA reductase, degradative. Most known examples of hydroxymethylglutaryl-CoA reductase are NADP-dependent (EC 1.1.1.34) from eukaryotes and archaea, involved in the biosynthesis of mevalonate from 3-hydroxy-3-methylglutaryl-CoA. This model, in contrast, is built from the two examples in completed genomes of sequences closely related to the degradative, NAD-dependent hydroxymethylglutaryl-CoA reductase of Pseudomonas mevalonii, a bacterium that can use mevalonate as its sole carbon source. [Energy metabolism, Other] 393
17039 129624 TIGR00533 HMG_CoA_R_NADP 3-hydroxy-3-methylglutaryl Coenzyme A reductase, hydroxymethylglutaryl-CoA reductase (NADP). This model represents archaeal examples of the enzyme hydroxymethylglutaryl-CoA reductase (NADP) (EC 1.1.1.34) and the catalytic domain of eukaryotic examples, which also contain a hydrophobic N-terminal domain. This enzyme synthesizes mevalonate, a precursor of isopentenyl pyrophosphate (IPP), a building block for the synthesis of cholesterol, isoprenoids, and other molecules. A related hydroxymethylglutaryl-CoA reductase, typified by an example from Pseudomonas mevalonii, is NAD-dependent and catabolic. [Central intermediary metabolism, Other] 402
17040 213538 TIGR00534 OpcA glucose-6-phosphate dehydrogenase assembly protein OpcA. The opcA gene is found immediately downstream of zwf, the glucose-6-phosphate dehydrogenase (G6PDH) gene, in a number of species, including Mycobacterium tuberculosis, Streptomyces coelicolor, Nostoc punctiforme, and Synechococcus sp. PCC 7942. In the latter, disruption of opcA was shown to block assembly of G6PDH into active oligomeric forms. [Protein fate, Protein folding and stabilization] 311
17041 273124 TIGR00535 SAM_DCase S-adenosylmethionine decarboxylase proenzyme, eukaryotic form. This enzyme is a key regulatory enzyme of the polyamine synthetic pathway. This protein is a pyruvoyl-dependent enzyme. The proenzyme is cleaved at a Ser residue that becomes a pyruvoyl group active site. [Central intermediary metabolism, Polyamine biosynthesis] 334
17042 273125 TIGR00536 hemK_fam HemK family putative methylases. The gene hemK from E. coli was found to contribute to heme biosynthesis and originally suggested to be protoporphyrinogen oxidase. Functional analysis of the nearest homolog in Saccharomyces cerevisiae, YNL063w, finds it is not protoporphyrinogen oxidase and sequence analysis suggests that HemK homologs have S-adenosyl-methionine-dependent methyltransferase activity (Medline 99237242). Homologs are found, usually in a single copy, in nearly all completed genomes, but varying somewhat in apparent domain architecture. Both E. coli and H. influenzae have two members rather than one. The members from the Mycoplasmas have an additional C-terminal domain. [Protein fate, Protein modification and repair] 284
17043 129628 TIGR00537 hemK_rel_arch HemK-related putative methylase. The gene hemK from E. coli was found to contribute to heme biosynthesis and originally suggested to be protoporphyrinogen oxidase. Functional analysis of the nearest homolog in Saccharomyces cerevisiae, YNL063w, finds it is not protoporphyrinogen oxidase and sequence analysis suggests that HemK homologs have S-adenosyl-methionine-dependent methyltransferase activity (Medline 99237242). Homologs are found, usually in a single copy, in nearly all completed genomes, but varying somewhat in apparent domain architecture. This model represents an archaeal and eukaryotic protein family that lacks an N-terminal domain found in HemK and its eubacterial homologs. It is found in a single copy in the first six completed archaeal and eukaryotic genomes. [Unknown function, Enzymes of unknown specificity] 179
17044 129629 TIGR00538 hemN oxygen-independent coproporphyrinogen III oxidase. This model represents HemN, the oxygen-independent coproporphyrinogen III oxidase that replaces HemF function under anaerobic conditions. Several species, including E. coli, Helicobacter pylori, and Aquifex aeolicus, have both a member of this family and a member of another, closely related family for which there is no evidence of coproporphyrinogen III oxidase activity. Members of this family have a perfectly conserved motif PYRT[SC]YP in a region N-terminal to the region of homology with the related uncharacterized protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 455
17045 129630 TIGR00539 hemN_rel putative oxygen-independent coproporphyrinogen III oxidase. Experimentally determined examples of oxygen-independent coproporphyrinogen III oxidase, an enzyme that replaces HemF function under anaerobic conditions, belong to a family of proteins described by the model hemN. This model, hemN_rel, models a closely related protein, shorter at the amino end and lacking the region containing the motif PYRT[SC]YP found in members of the hemN family. Several species, including E. coli, Helicobacter pylori, Aquifex aeolicus, and Chlamydia trachomatis, have members of both this family and the E. coli hemN family. The member of this family from Bacillus subtilis was shown to complement an hemF/hemN double mutant of Salmonella typhimurium and to prevent accumulation of coproporphyrinogen III under anaerobic conditions, but the exact role of this protein is still uncertain. It is found in a number of species that do not synthesize heme de novo. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 360
17046 273126 TIGR00540 TPR_hemY_coli heme biosynthesis-associated TPR protein. Members of this protein family are uncharacterized tetratricopeptide repeat (TPR) proteins invariably found in heme biosynthesis gene clusters. The absence of any invariant residues other than Ala argues against this protein serving as an enzyme per se. The gene symbol hemY assigned in E. coli is unfortunate in that an unrelated protein, protoporphyrinogen oxidase (HemG in E. coli) is designated HemY in Bacillus subtilis. [Unknown function, General] 367
17047 129632 TIGR00541 hisDCase_pyru histidine decarboxylase, pyruvoyl type. This enzyme converts histidine to histamine in a single step by catalyzing the release of CO2. This type is synthesized as an inactive single chain precursor, then cleaved into two chains. The Ser at the new N-terminus at the cleavage site is converted to a pyruvoyl group essential for activity. This type of histidine decarboxylase is known so far only in some Gram-positive bacteria, where it may play a role in amino acid catabolism. There is also a pyridoxal phosphate type histidine decarboxylase, as found in human, where histamine is a biologically active amine. [Energy metabolism, Amino acids and amines] 310
17048 129633 TIGR00542 hxl6Piso_put hexulose-6-phosphate isomerase, putative. This family shows similarity by PSI-BLAST to other isomerases. Putative identification as hexulose-6-phosphate isomerase is reported in Swiss-Prot, attributing a discussion in Genome Sci. Technol. 1:53-75(1996). This family is conserved at better than 40 % identity among the four known examples from three species: Escherichia coli (SgbU and SgaU), Haemophilus influenzae, and Mycoplasma pneumoniae. The rarity of the family, high level of conservation, and proposed catabolic role suggests lateral transfer may be a part of the evolutionary history of this protein. [Energy metabolism, Sugars] 279
17049 273127 TIGR00543 isochor_syn isochorismate synthases. This enzyme interconverts chorismate and isochorismate. In E. coli, different loci encode isochorismate synthases for the pathways of menaquinone biosynthesis and enterobactin biosynthesis (via salicylate) and fail to complement each other. Among isochorismate synthases, the N-terminal domain is poorly conserved. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 351
17050 273128 TIGR00544 lgt prolipoprotein diacylglyceryl transferase. The conversion of lipoprotein precursors into lipoproteins consists of three steps. First, the enzyme described by this model transfers a diacylglyceryl moiety from phosphatidylglycerol to the side chain of a Cys that will become the new N-terminus. Second, the signal peptide is removed by signal peptidase II. Finally, the free amino group of the new N-terminal Cys is acylated by apolipoprotein N-acyltransferase. [Protein fate, Protein modification and repair] 277
17051 161920 TIGR00545 lipoyltrans lipoyltransferase and lipoate-protein ligase. One member of this group of proteins is bovine lipoyltransferase, which transfers the lipoyl group from lipoyl-AMP to the specific Lys of lipoate-dependent enzymes. However, it does not first activate lipoic acid with ATP to create lipoyl-AMP and pyrophosphate. Another member of this group, lipoate-protein ligase A from E. coli, catalyzes both the activation and the transfer of lipoate. Homology between the two is full-length, except for the bovine mitochondrial targeting signal, but is strongest toward the N-terminus. [Protein fate, Protein modification and repair] 324
17052 273129 TIGR00546 lnt apolipoprotein N-acyltransferase. This enzyme transfers the acyl group to lipoproteins in the lgt/lsp/lnt system which is found broadly in bacteria but not in archaea. This model represents one component of the "lipoprotein lgt/lsp/lnt system" genome property. [Protein fate, Protein modification and repair] 391
17053 129638 TIGR00547 lolA periplasmic chaperone LolA. This protein, LolA, is known so far only in the gamma and beta subdivisions of the Proteobacteria. The major outer lipoprotein (Lpp) of E. coli is released from the inner membrane as a complex with this chaperone in an energy-requiring process, and is then delivered to LolB for insertion into the outer membrane. LolA is involved in the delivery of lipoproteins generally, rather than just Lpp, and is an essential protein in E. coli, unlike Lpp itself. [Protein fate, Protein and peptide secretion and trafficking] 204
17054 129639 TIGR00548 lolB outer membrane lipoprotein LolB. This protein, LolB, is known so far only in the gamma and beta subdivisions of the Proteobacteria. It is a processed, lipid-modified outer membrane protein. It is required in E. coli for insertion of the major outer lipoprotein (Lpp) into the outer membrane. Lpp is transferred to LolB from the carrier protein LolA in the periplasm. Previously, this protein was thought to play a role in 5-aminolevulinic acid synthesis and was designated HemM. [Protein fate, Protein and peptide secretion and trafficking] 202
17055 273130 TIGR00549 mevalon_kin mevalonate kinase. This model represents mevalonate kinase, the third step in the mevalonate pathway of isopentenyl pyrophosphate (IPP) biosynthesis. IPP is a common intermediate for a number of pathways including cholesterol biosynthesis. This model covers enzymes from eukaryotes, archaea and bacteria. The related enzyme from the same pathway, phosphomevalonate kinase, serves as an outgroup for this clade. Paracoccus exhibits two genes within the phosphomevalonate/mevalonate kinase family, one of which falls between trusted and noise cutoffs of this model. The degree of divergence is high, but if the trees created from this model are correct, the proper names of these genes have been swapped. [Central intermediary metabolism, Other] 273
17056 129641 TIGR00550 nadA quinolinate synthetase complex, A subunit. This protein, termed NadA, plays a role in the synthesis of pyridine, a precursor to NAD. The quinolinate synthetase complex consists of A protein (this protein) and B protein. B protein converts L-aspartate to iminoaspartate, an unstable reaction product which in the absence of A protein is spontaneously hydrolyzed to form oxaloacetate. The A protein, NadA, converts iminoaspartate to quinolinate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 310
17057 273131 TIGR00551 nadB L-aspartate oxidase. L-aspartate oxidase is the B protein, NadB, of the quinolinate synthetase complex. Quinolinate synthetase makes a precursor of the pyridine nucleotide portion of NAD. This model identifies proteins that cluster as L-aspartate oxidase (a flavoprotein difficult to separate from the set of closely related flavoprotein subunits of succinate dehydrogenase and fumarate reductase) by both UPGMA and neighbor-joining trees. The most distant protein accepted as an L-aspartate oxidase (NadB), that from Pyrococcus horikoshii, not only clusters with other NadB but is just one gene away from NadA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 489
17058 273132 TIGR00552 nadE NAD+ synthetase. NAD+ synthetase is a nearly ubiquitous enzyme for the final step in the biosynthesis of the essential cofactor NAD. The member of this family from Bacillus subtilis is a strictly NH(3)-dependent NAD(+) synthetase of 272 amino acids. Proteins consisting only of the domain modeled here may be named as NH3-dependent NAD+ synthetase. Amidotransferase activity may reside in a separate protein, or not be present. Some other members of the family, such as from Mycobacterium tuberculosis, are considerably longer, contain an apparent amidotransferase domain, and show glutamine-dependent as well as NH(3)-dependent activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 250
17059 273133 TIGR00553 pabB aminodeoxychorismate synthase, component I, bacterial clade. Members of this family, aminodeoxychorismate synthase, component I (PabB), were designated para-aminobenzoate synthase component I until it was recognized that PabC, a lyase, completes the pathway of PABA synthesis. This family is closely related to anthranilate synthase component I (trpE), and both act on chorismate. The clade of PabB enzymes represented by this model includes sequences from Gram-positive and alpha and gamma Proteobacteria as well as Chlorobium, Nostoc, Fusobacterium and Arabidopsis. A closely related clade of fungal PabB enzymes is identified by TIGR01823, while another bacterial clade of potential PabB enzymes is more closely related to TrpE (TIGR01824). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 328
17060 273134 TIGR00554 panK_bact pantothenate kinase, bacterial type. Shown to be a homodimer in E. coli. This enzyme catalyzes the rate-limiting step in the biosynthesis of coenzyme A. It is very well conserved from E. coli to B. subtilis, but differs considerably from known eukaryotic forms, described in a separate model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 290
17061 273135 TIGR00555 panK_eukar pantothenate kinase, eukaryotic/staphylococcal type. This model describes a eukaryotic form of pantothenate kinase, characterized from the fungus Aspergillus nidulans and with similar forms known in several other eukaryotes. It also includes forms from several Gram-positive bacteria suggested to have originated from the eukaryotic form by lateral transfer. It differs in a number of biochemical properties (such as inhibition by acetyl-CoA) from most bacterial CoaA and lacks sequence similarity. This enzyme is the key regulatory step in the biosynthesis of coenzyme A (CoA). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 296
17062 273136 TIGR00556 pantethn_trn phosphopantetheine--protein transferase domain. This model describes a domain active in transferring the phosphopantetheine prosthetic group to its attachment site on enzymes and carrier proteins. Many members of this family are small proteins that act on the acyl carrier protein involved in fatty acid biosynthesis. Some members are domains of larger proteins involved in specialized pathways for the synthesis of unusual molecules including polyketides, atypical fatty acids, and antibiotics. [Protein fate, Protein modification and repair] 128
17063 273137 TIGR00557 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase. This model represents PdxA, an NAD+-dependent 4-hydroxythreonine 4-phosphate dehydrogenase (EC 1.1.1.262) active in pyridoxal phosphate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 320
17064 273138 TIGR00558 pdxH pyridoxamine-phosphate oxidase. This model is similar to Pyridox_oxidase from Pfam but is designed to find only true pyridoxamine-phosphate oxidase and to ignore the related protein PhzG involved in phenazine biosynthesis. This protein from E. coli was characterized as a homodimer with two FMN per dimer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 190
17065 188064 TIGR00559 pdxJ pyridoxine 5'-phosphate synthase. PdxJ is required in the biosynthesis of pyridoxine (vitamin B6), a precursor to the enzyme cofactor pyridoxal phosphate. ECOCYC describes the predicted reaction equation as 1-amino-propan-2-one-3-phosphate + deoxyxylulose-5-phosphate = pyridoxine-5'-phosphate. The product of that reaction is oxidized by PdxH to pyridoxal 5'-phosphate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 236
17066 273139 TIGR00560 pgsA CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. Alternate names: phosphatidylglycerophosphate synthase; glycerophosphate phosphatidyltransferase; PGP synthase. A number of related enzymes are quite similar in both sequence and catalytic activity, including Saccharomyces cerevisiae YDL142c, now known to be a cardiolipin synthase. There may be problems with incorrect transitive annotation of near homologs as authentic CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis] 182
17067 273140 TIGR00561 pntA NAD(P) transhydrogenase, alpha subunit. This integral membrane protein is the alpha subunit of alpha 2 beta 2 tetramer that couples the proton transport across the membrane to the reversible transfer of hydride ion equivalents between NAD and NADP. An alternate name is pyridine nucleotide transhydrogenase alpha subunit. The N-terminal region is homologous to alanine dehydrogenase. In some species, such as Rhodospirillum rubrum, the alpha chain is replaced by two shorter chains, both with some homology to the full-length alpha chain modeled here. These score below the trusted cutoff. [Energy metabolism, Electron transport] 512
17068 213540 TIGR00562 proto_IX_ox protoporphyrinogen oxidase. This enzyme oxidizes protoporphyrinogen IX to protoporphyrin IX, a precursor of heme and chlorophyll. Bacillus subtilis HemY also has coproporphyrinogen III to coproporphyrin III oxidase activity in a heterologous expression system, although the role for this activity in vivo is unclear. This protein is a flavoprotein and has a beta-alpha-beta dinucleotide binding motif near the amino end. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 462
17069 273141 TIGR00563 rsmB 16S rRNA (cytosine(967)-C(5))-methyltransferase. This protein is also known as sun protein. The reading frame was originally interpreted as two reading frames, fmu and fmv. The recombinant protein from E. coli was shown to methylate only C967 of small subunit (16S) ribosomal RNA and to produce only m5C at that position. The seed alignment is built from bacterial sequences only. Eukaryotic homologs include Nop2, a protein required for processing pre-rRNA, that is likely also a rRNA methyltransferase, although the fine specificity may differ. Cutoff scores are set to avoid treating archaeal and eukaryotic homologs automatically as functionally equivalent, although they may have very similar roles. [Protein synthesis, tRNA and rRNA base modification] 426
17070 273142 TIGR00564 trpE_most anthranilate synthase component I, non-proteobacterial lineages. This enzyme resembles some other chorismate-binding enzymes, including para-aminobenzoate synthase (pabB) and isochorismate synthase. There is a fairly deep split between two sets, seen in the pattern of gaps as well as in amino acid sequence differences. Archaeal enzymes have been excluded from this model (and are now found in TIGR01820) as have a clade of enzymes which constitute a TrpE paralog which may have PabB activity (TIGR01824). This allows the B. subtilis paralog which has been shown to have PabB activity to score below trusted to this model. This model contains sequences from gram-positive bacteria, certain proteobacteria, cyanobacteria, plants, fungi and assorted other bacteria. A second family of TrpE enzymes is modelled by TIGR00565. The breaking of the TrpE family into these diverse models allows for the separation of the models for the related enzyme, PabB. [Amino acid biosynthesis, Aromatic amino acid family] 454
17071 273143 TIGR00565 trpE_proteo anthranilate synthase component I, proteobacterial subset. This enzyme resembles some other chorismate-binding enzymes, including para-aminobenzoate synthase (pabB) and isochorismate synthase. There is a fairly deep split between two sets, seen in the pattern of gaps as well as in amino acid sequence differences. This group includes proteobacteria such as E. coli and Helicobacter pylori but also the gram-positive organism Corynebacterium glutamicum. The second group includes eukaryotes, archaea, and most other bacterial lineages; sequences from the second group may resemble pabB more closely than other trpE from this group. [Amino acid biosynthesis, Aromatic amino acid family] 498
17072 273144 TIGR00566 trpG_papA glutamine amidotransferase of anthranilate synthase or aminodeoxychorismate synthase. This model describes the glutamine amidotransferase domain or peptide of the tryptophan-biosynthetic pathway enzyme anthranilate synthase or of the folate biosynthetic pathway enzyme para-aminobenzoate synthase. In at least one case, a single polypeptide from Bacillus subtilis was shown to have both functions. This model covers a subset of the sequences described by the Pfam model GATase. 188
17073 273145 TIGR00567 3mg DNA-3-methyladenine glycosylase. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). All proteins in this family for which the function is known are involved in the base excision repair of alkylation damage to DNA. The exact specificity of the type of alkylation damage repaired by each of these varies somewhat between species. Substrates include 3-methyl adenine, 7-methyl-guanine, and 3-methyl-guanine. [DNA metabolism, DNA replication, recombination, and repair] 192
17074 129659 TIGR00568 alkb DNA alkylation damage repair protein AlkB. Proteins in this family have an as of yet undetermined function in the repair of alkylation damage to DNA. Alignment and family designation based on phylogenomic analysis of Jonathan A. Eisen (PhD Thesis, Stanford University, 1999). [DNA metabolism, DNA replication, recombination, and repair] 169
17075 129660 TIGR00569 ccl1 cyclin ccl1. All proteins in this family for which functions are known are cyclins that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 305
17076 129661 TIGR00570 cdk7 CDK-activating kinase assembly factor MAT1. All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 309
17077 273146 TIGR00571 dam DNA adenine methylase (dam). All proteins in this family for which functions are known are DNA-adenine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The DNA adenine methylase (dam) of E. coli and related species is instrumental in distinguishing the newly synthesized strand during DNA replication for methylation-directed mismatch repair. This family includes several phage methylases and a number of different restriction enzyme chromosomal site-specific modification systems. [DNA metabolism, DNA replication, recombination, and repair] 267
17078 129663 TIGR00573 dnaq exonuclease, DNA polymerase III, epsilon subunit family. All proteins in this family for which functions are known are components of the DNA polymerase III complex (epsilon subunit). There is, however, an outgroup that includes paralogs in some gamma-proteobacteria and the n-terminal region of DinG from some low GC gram positive bacteria. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, Degradation of DNA] 217
17079 273147 TIGR00574 dnl1 DNA ligase I, ATP-dependent (dnl1). All proteins in this family with known functions are ATP-dependent DNA ligases. Functions include DNA repair, DNA replication, and DNA recombination (or any process requiring ligation of two single-stranded DNA sections). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 514
17080 273148 TIGR00575 dnlj DNA ligase, NAD-dependent. All proteins in this family with known functions are NAD-dependent DNA ligases. Functions of these proteins include DNA repair, DNA replication, and DNA recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The member of this family from Treponema pallidum differs in having three rather than just one copy of the BRCT (BRCA1 C Terminus) domain (pfam00533) at the C-terminus. It is included in the seed. [DNA metabolism, DNA replication, recombination, and repair] 652
17081 273149 TIGR00576 dut deoxyuridine 5'-triphosphate nucleotidohydrolase (dut). The main function of these proteins is in maintaining the levels of dUTP in the cell to prevent dUTP incorporation into DNA during DNA replication. Pol proteins in viruses are very similar to this protein family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Changed role from 132 to 123. RTD [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 142
17082 273150 TIGR00577 fpg DNA-formamidopyrimidine glycosylase. All proteins in the FPG family with known functions are FAPY-DNA glycosylases that function in base excision repair. Homologous to endonuclease VIII (nei). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 272
17083 273151 TIGR00578 ku70 ATP-dependent DNA helicase II, 70 kDa subunit (ku70). Proteins in this family are involved in non-homologous end joining, a process used for the repair of double stranded DNA breaks. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Cutoff does not detect the putative ku70 homologs in yeast. [DNA metabolism, DNA replication, recombination, and repair] 586
17084 273152 TIGR00580 mfd transcription-repair coupling factor (mfd). All proteins in this family for which functions are known are DNA-dependent ATPases that function in transcription-coupled DNA repair, in which the transcribed strand of actively transcribed genes is repaired at a higher rate than non-transcribed regions of the genome and than the non-transcribed strand of the same gene. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). This family is closely related to the RecG and UvrB families. [DNA metabolism, DNA replication, recombination, and repair] 926
17085 129670 TIGR00581 moaC molybdenum cofactor biosynthesis protein MoaC. MoaC catalyzes an early step in molybdenum cofactor biosynthesis in E. coli. The Arabidopsis homolog Cnx3 complements MoaC deficiency in E. coli. Eukaryotic members of this family branch within the bacterial branch, with the archaeal members as an apparent outgroup. This protein is absent in a number of the pathogens with smaller genomes, including Mycoplasmas, Chlamydias, and spirochetes, but is found in most other complete genomes to date. The homolog from Synechocystis sp. is fused to a MobA-homologous region and is an outlier to all other bacterial forms by both neighbor-joining and UPGMA analyses. Members of this family are well-conserved. The seed for this model excludes both archaeal sequences and the most divergent bacterial sequences, but still finds all candidate MoaC sequences easily between trusted and noise cutoffs. We suggest that sequences branching outside the set that contains all seed members be regarded only as putative functional equivalents of MoaC unless and until a member of the archaeal outgroup is shown to have equivalent function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 147
17086 273153 TIGR00583 mre11 DNA repair protein (mre11). All proteins in this family for which functions are known are subunits of a nuclease complex made up of multiple proteins including MRE11 and RAD50 homologs. The functions of this nuclease complex include recombinational repair and non-homologous end joining. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The proteins in this family are distantly related to proteins in the SbcCD complex of bacteria. [DNA metabolism, DNA replication, recombination, and repair] 405
17087 273154 TIGR00584 mug mismatch-specific thymine-DNA glycosylase (mug). All proteins in this family for which functions are known are G-T or G-U mismatch glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Used 2pf model. [DNA metabolism, DNA replication, recombination, and repair] 328
17088 273155 TIGR00585 mutl DNA mismatch repair protein MutL. All proteins in this family for which the functions are known are involved in the process of generalized mismatch repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 312
17089 200031 TIGR00586 mutt mutator mutT protein. All proteins in this family for which functions are known are involved in repairing oxidative damage to dGTP (they are 8-oxo-dGTPases). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Lowering the threshold picks up members of MutT superfamily well. [DNA metabolism, DNA replication, recombination, and repair] 128
17090 273156 TIGR00587 nfo apurinic endonuclease (APN1). All proteins in this family for which functions are known are 5' AP endonucleases that are used in base excision repair and the repair of abasic sites in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 274
17091 211589 TIGR00588 ogg 8-oxoguanine DNA-glycosylase (ogg). All proteins in this family for which functions are known are 8-oxo-guanine DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). This family is distantly related to the Nth-MutY superfamily. [DNA metabolism, DNA replication, recombination, and repair] 310
17092 273157 TIGR00589 ogt O-6-methylguanine DNA methyltransferase. All proteins in this family for which functions are known are alkyl-DNA transferases which remove alkyl groups from DNA as part of alkylation DNA repair. Some of the proteins in this family are also transcription regulators and have a distinct transcription regulatory domain. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 80
17093 273158 TIGR00590 pcna proliferating cell nuclear antigen (pcna). All proteins in this family for which functions are known form sliding DNA clamps that are used in DNA replication processes. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 259
17094 129679 TIGR00591 phr2 photolyase PhrII. All proteins in this family for which functions are known are DNA-photolyases used for the direct repair of UV irradiation induced DNA damage. Some repair 6-4 photoproducts while others repair cyclobutane pyrimidine dimers. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 454
17095 273159 TIGR00592 pol2 DNA polymerase (pol2). All proteins in this superfamily for which functions are known are DNA polymerases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 1172
17096 273160 TIGR00593 pola DNA polymerase I. All proteins in this family for which functions are known are DNA polymerases. Many also have an exonuclease motif. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 887
17097 273161 TIGR00594 polc DNA-directed DNA polymerase III (polc). All proteins in this family for which functions are known are DNA polymerases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 1022
17098 273162 TIGR00595 priA primosomal protein N'. All proteins in this family for which functions are known are components of the primosome which is involved in replication, repair, and recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 505
17099 273163 TIGR00596 rad1 DNA repair protein (rad1). All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombinational repair in some species. Most Archaeal species also have homologs of these genes, but the function of these Archaeal genes is not known, so we have set our cutoff to only pick up the eukaryotic genes. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford Universit [DNA metabolism, DNA replication, recombination, and repair] 814
17100 129685 TIGR00597 rad10 DNA repair protein rad10. All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 112
17101 273164 TIGR00598 rad14 DNA repair protein. All proteins in this family for which functions are known are used for the recognition of DNA damage as part of nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 172
17102 273165 TIGR00599 rad18 DNA repair protein rad18. All proteins in this family for which functions are known are involved in nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 397
17103 273166 TIGR00600 rad2 DNA excision repair protein (rad2). All proteins in this family for which functions are known are flap endonucleases that generate the 3' incision next to DNA damage as part of nucleotide excision repair. This family is related to many other flap endonuclease families including the fen1 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 1034
17104 273167 TIGR00601 rad23 UV excision repair protein Rad23. All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. In humans, Rad23 complexes with the XPC protein. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 378
17105 129690 TIGR00602 rad24 checkpoint protein rad24. All proteins in this family for which functions are known are involved in DNA damage tolerance (likely cell cycle checkpoints). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 637
17106 273168 TIGR00603 rad25 DNA repair helicase rad25. All proteins in this family for which functions are known are DNA-DNA helicases used for the initiation of nucleotide excision repair and transcription as part of the TFIIH complex. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 732
17107 273169 TIGR00604 rad3 DNA repair helicase (rad3). All proteins in this family for which functions are known are DNA-DNA helicases that function in the initiation of transcription and nucleotide excision repair as part of the TFIIH complex. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 705
17108 273170 TIGR00605 rad4 DNA repair protein rad4. All proteins in this family for which functions are known are involved in targeting nucleotide excision repair to specific regions of the genome. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 713
17109 129694 TIGR00606 rad50 rad50. All proteins in this family for which functions are known are involved in recombination, recombinational repair, and/or non-homologous end joining. They are components of an exonuclease complex with MRE11 homologs. This family is distantly related to the SbcC family of bacterial proteins. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). 1311
17110 129695 TIGR00607 rad52 recombination protein rad52. All proteins in this family for which functions are known are involved in recombination and recombination repair. Their exact biochemical activity is not yet known. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 161
17111 273171 TIGR00608 radc DNA repair protein radc. The genes in this family for which the functions are known have an as yet poorly defined role in determining sensitivity to DNA damaging agents such as UV irradiation. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 218
17112 273172 TIGR00609 recB exodeoxyribonuclease V, beta subunit. The RecBCD holoenzyme is a multifunctional nuclease with potent ATP-dependent exodeoxyribonuclease activity. Ejection of RecD, as occurs at chi recombinational hotspots, cripples exonuclease activity in favor of recombinagenic helicase activity. All proteins in this family for which functions are known are DNA-DNA helicases that are used as part of an exonuclease-helicase complex (made up of RecBCD homologs) that function to generate substrates for the initiation of recombination and recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 1087
17113 273173 TIGR00611 recf recF protein. All proteins in this family for which functions are known are DNA binding proteins that assist the filamentation of RecA onto DNA for the initiation of recombination or recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 365
17114 273174 TIGR00612 ispG_gcpE 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase. This protein of previously unknown biochemical function has now been identified as an enzyme of the non-mevalonate pathway of IPP biosynthesis. Chlamydial members of the family have a long insert. The family is largely restricted to Bacteria, where it is widely but not universally distributed. No homology can be detected between the GcpE family and other proteins. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 346
17115 273175 TIGR00613 reco DNA repair protein RecO. All proteins in this family for which functions are known are DNA binding proteins that are involved in the initiation of recombination or recombinational repair. [DNA metabolism, DNA replication, recombination, and repair] 241
17116 129701 TIGR00614 recQ_fam ATP-dependent DNA helicase, RecQ family. All proteins in this family for which functions are known are 3'-5' DNA-DNA helicases. These proteins are used for recombination, recombinational repair, and possibly maintenance of chromosome stability. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 470
17117 273176 TIGR00615 recR recombination protein RecR. All proteins in this family for which functions are known are involved in the initiation of recombination and recombinational repair. RecF is also required. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 195
17118 129703 TIGR00616 rect recombinase, phage RecT family. All proteins in this family for which functions are known bind ssDNA and are involved in the pairing of homologous DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). RecT and homologs are found in prophage regions of bacterial genomes. RecT works with a partner protein, RecE. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 241
17119 273177 TIGR00617 rpa1 replication factor-a protein 1 (rpa1). All proteins in this family for which functions are known are part of a multiprotein complex made up of homologs of RPA1, RPA2 and RPA3 that bind ssDNA and function in the recognition of DNA damage for nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 608
17120 129705 TIGR00618 sbcc exonuclease SbcC. All proteins in this family for which functions are known are part of an exonuclease complex with sbcD homologs. This complex is involved in the initiation of recombination to regulate the levels of palindromic sequences in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 1042
17121 273178 TIGR00619 sbcd exonuclease SbcD. All proteins in this family for which functions are known are double-stranded DNA exonucleases (as part of a complex with SbcC homologs). This complex functions in the initiation of recombination and recombinational repair and is particularly important in regulating the stability of DNA sections that can form secondary structures. This family is likely homologous to the MRE11 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 253
17122 273179 TIGR00621 ssb single stranded DNA-binding protein (ssb). All proteins in this family for which functions are known are single-stranded DNA binding proteins that function in many processes including transcription, repair, replication and recombination. Members encoded between genes for ribosomal proteins S6 and S18 should be annotated as primosomal protein N (PriB). Forms in gamma-proteobacteria are much shorter and poorly recognized by this model. Additional members of this family include phage proteins. Eukaryotic members are organellar proteins. [DNA metabolism, DNA replication, recombination, and repair] 164
17123 129709 TIGR00622 ssl1 transcription factor ssl1. All proteins in this family for which functions are known are components of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 112
17124 129710 TIGR00623 sula cell division inhibitor SulA. All proteins in this family for which the functions are known are cell division inhibitors. In E. coli, SulA is one of the SOS regulated genes. [DNA metabolism, DNA replication, recombination, and repair] 168
17125 129711 TIGR00624 tag DNA-3-methyladenine glycosylase I. All proteins in this family are alkylation DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 179
17126 273180 TIGR00625 tfb2 Transcription factor tfb2. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 448
17127 273181 TIGR00627 tfb4 transcription factor tfb4. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 279
17128 273182 TIGR00628 ung uracil-DNA glycosylase. All proteins in this family for which functions are known are uracil-DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 211
17129 273183 TIGR00629 uvde UV damage endonuclease UvdE. All proteins in this family for which functions are known are UV dimer endonucleases that function in an alternative nucleotide excision repair process. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 312
17130 273184 TIGR00630 uvra excinuclease ABC, A subunit. This family is a member of the ABC transporter superfamily of proteins of which all members for which functions are known except the UvrA proteins are involved in the transport of material through membranes. UvrA orthologs are involved in the recognition of DNA damage as a step in nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 925
17131 273185 TIGR00631 uvrb excinuclease ABC, B subunit. All proteins in this family for which functions are known are DNA helicases that function in the nucleotide excision repair and are endonucleases that make the 3' incision next to DNA damage. They are part of a pathway requiring UvrA, UvrB, UvrC, and UvrD homologs. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 655
17132 211591 TIGR00632 vsr DNA mismatch endonuclease Vsr. All proteins in this family for which functions are known are G:T mismatch endonucleases that function in a specialized mismatch repair process used usually to repair G:T mismatches in specific sections of the genome. This family was based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Members of this family typically are found near to a DNA cytosine methyltransferase. [DNA metabolism, DNA replication, recombination, and repair] 117
17133 273186 TIGR00633 xth exodeoxyribonuclease III (xth). All proteins in this family for which functions are known are 5' AP endonucleases that function in base excision repair and the repair of abasic sites in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 254
17134 273187 TIGR00634 recN DNA repair protein RecN. All proteins in this family for which functions are known are ATP binding proteins involved in the initiation of recombination and recombinational repair. [DNA metabolism, DNA replication, recombination, and repair] 563
17135 129721 TIGR00635 ruvB Holliday junction DNA helicase, RuvB subunit. All proteins in this family for which functions are known are 5'-3' DNA helicases that, as part of a complex with RuvA homologs serve as a 5'-3' Holliday junction helicase. RuvA specifically binds Holliday junctions as a sandwich of two tetramers and maintains the configuration of the junction. It forms a complex with two hexameric rings of RuvB, the subunit that contains helicase activity. The complex drives ATP-dependent branch migration of the Holliday junction recombination intermediate. The endonuclease RuvC resolves junctions. [DNA metabolism, DNA replication, recombination, and repair] 305
17136 213544 TIGR00636 PduO_Nterm ATP:cob(I)alamin adenosyltransferase. This model represents an ATP:cob(I)alamin adenosyltransferase family corresponding to the N-terminal half of Salmonella PduO, a 1,2-propanediol utilization protein that probably is bifunctional. PduO represents one of at least three families of ATP:corrinoid adenosyltransferase: others are CobA (which partially complements PduO) and EutT. It was not clear originally whether ATP:cob(I)alamin adenosyltransferase activity resides in the N-terminal region of PduO, modeled here, but this has now become clear from the characterization of MeaD from Methylobacterium extorquens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 171
17137 273188 TIGR00637 ModE_repress ModE molybdate transport repressor domain. ModE is a molybdate-activated repressor of the molybdate transport operon in E. coli. It consists of the domain represented by this model and two tandem copies of mop-like domain, where Mop proteins are a family of 68-residue molybdenum-pterin binding proteins of Clostridium pasteurianum. This model also represents the full length of a pair of archaeal proteins that lack Mop-like domains. PSI-BLAST analysis shows similarity to helix-turn-helix regulatory proteins. [Regulatory functions, Other] 99
17138 273189 TIGR00638 Mop molybdenum-pterin binding domain. This model describes a multigene family of molybdenum-pterin binding proteins of about 70 amino acids in Clostridium pasteurianum, as a tandemly-repeated domain C-terminal to an unrelated domain in ModE, a molybdate transport gene repressor of E. coli, and in single or tandemly paired domains in several related proteins. [Transport and binding proteins, Anions] 69
17139 161973 TIGR00639 PurN phosphoribosylglycinamide formyltransferase, formyltetrahydrofolate-dependent. This model describes phosphoribosylglycinamide formyltransferase (GAR transformylase), one of several proteins in formyl_transf (pfam00551). This enzyme uses formyl tetrahydrofolate as a formyl group donor to produce 5'-phosphoribosyl-N-formylglycinamide. PurT, a different GAR transformylase, uses ATP and formate rather than formyl tetrahydrofolate. Experimental proof includes complementation of E. coli purN mutants by orthologs from vertebrates (where it is a domain of a multifunctional protein), Bacillus subtilis, and Arabidopsis. No archaeal example was detected. In phylogenetic analyses, the member from Saccharomyces cerevisiae shows a long branch length but membership in the family, while the formyltetrahydrofolate deformylases form a closely related outgroup. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 190
17140 129726 TIGR00640 acid_CoA_mut_C methylmalonyl-CoA mutase C-terminal domain. Methylmalonyl-CoA mutase (EC 5.4.99.2) catalyzes a reversible isomerization between L-methylmalonyl-CoA and succinyl-CoA. The enzyme uses an adenosylcobalamin cofactor. It may be a homodimer, as in mitochondrion, or a heterodimer with partially homologous beta chain that does not bind the adenosylcobalamin cofactor, as in Propionibacterium freudenreichii. The most similar archaeal sequences are separate chains, such as AF2215 and AF2219 of Archaeoglobus fulgidus, that correspond roughly to the first 500 and last 130 residues, respectively of known methylmalonyl-CoA mutases. This model describes the C-terminal domain subfamily. In a neighbor-joining tree (methylaspartate mutase S chain as the outgroup), AF2219 branches with a coenzyme B12-dependent enzyme known not to be 5.4.99.2. 132
17141 273190 TIGR00641 acid_CoA_mut_N methylmalonyl-CoA mutase N-terminal domain. Methylmalonyl-CoA mutase (EC 5.4.99.2) catalyzes a reversible isomerization between L-methylmalonyl-CoA and succinyl-CoA. The enzyme uses an adenosylcobalamin cofactor. It may be a homodimer, as in mitochondrion, or a heterodimer with partially homologous beta chain that does not bind the adenosylcobalamin cofactor, as in Propionibacterium freudenreichii. The most similar archaeal sequences are separate chains, such as AF2215 and AF2219 of Archaeoglobus fulgidus, that correspond roughly to the first 500 and last 130 residues, respectively of known methylmalonyl-CoA mutases. This model describes the N-terminal domain subfamily. In a neighbor-joining tree, AF2215 branches with a bacterial isobutyryl-CoA mutase, which is also the same length. Scoring between the noise and trusted cutoffs are the non-catalytic, partially homologous beta chains from certain heterodimeric examples of 5.4.99.2. 526
17142 273191 TIGR00642 mmCoA_mut_beta methylmalonyl-CoA mutase, heterodimeric type, beta chain. The adenosylcobalamin-binding, catalytic chain of methylmalonyl-CoA mutase may form homodimers, as in mitochondrion and E. coli, or heterodimers with a shorter, homologous chain that does not bind adenosylcobalamin. This model describes this non-catalytic beta chain, as found in the enzyme from Propionibacterium freudenreichii, for which the 3-dimensional structure has been solved. [Central intermediary metabolism, Other] 619
17143 273192 TIGR00643 recG ATP-dependent DNA helicase RecG. [DNA metabolism, DNA replication, recombination, and repair] 630
17144 273193 TIGR00644 recJ single-stranded-DNA-specific exonuclease RecJ. All proteins in this family are 5'-3' single-strand DNA exonucleases. These proteins are used in some aspects of mismatch repair, recombination, and recombinational repair. [DNA metabolism, DNA replication, recombination, and repair] 539
17145 129731 TIGR00645 HI0507 TIGR00645 family protein. This conserved hypothetical protein with four predicted transmembrane regions is found in E. coli, Haemophilus influenzae, and Helicobacter pylori, among completed genomes. A similar protein from Aquifex aeolicus appears to share a central region of homology and a similar overall arrangement of hydrophobic stretches, and forms a bidirectional best hit with several members of the seed alignment. However, it is uncertain whether the observed similarity represents full-length homology and/or equivalent function, and so it is excluded from the seed and scores below the trusted cutoff. [Hypothetical proteins, Conserved] 167
17146 129732 TIGR00646 MG010 DNA primase-related protein. The DNA primase DnaG of E. coli and its apparent orthologs in other eubacterial species are approximately 600 residues in length. Within this set, a conspicuous outlier in percent identity, as seen in a UPGMA difference tree, is the branch containing the Mycoplasmas. This lineage is also unique in containing the small, DNA primase-related protein modelled here, which is homologous to the central third of DNA primase. Several small regions of sequence similarity specifically to Mycoplasma sequences rather than to all DnaG homologs suggests that the divergence of this protein from DnaG post-dated the separation of bacterial lineages. The function of this DNA primase-related protein is unknown. [Unknown function, General] 218
17147 273194 TIGR00647 DNA_bind_WhiA DNA-binding protein WhiA. This family describes a DNA-binding protein widely conserved in Gram-positive bacteria, and occasionally occurring elsewhere, such as in Thermotoga. It is associated with cell division, and in sporulating organisms with sporulation. [Cellular processes, Cell division] 304
17148 129734 TIGR00648 recU recombination protein U. The Bacillus protein has been shown to be required for DNA recombination and repair. RJD 11/20/00 [DNA metabolism, DNA replication, recombination, and repair] 169
17149 273195 TIGR00649 MG423 beta-CASP ribonuclease, RNase J family. This family of metalloenzymes includes RNase J1 and RNase J2, involved in mRNA degradation in a wide range of organisms. [Transcription, Degradation of RNA] 422
17150 273196 TIGR00651 pta phosphate acetyltransferase. Alternate name: phosphotransacetylase. Model contains a gene from E. coli coding for ethanolamine utilization protein (euti) and also contains similarity to malate oxidoreductases [Central intermediary metabolism, Other, Energy metabolism, Fermentation] 303
17151 273197 TIGR00652 DapF diaminopimelate epimerase. [Amino acid biosynthesis, Aspartate family] 267
17152 273198 TIGR00653 GlnA glutamine synthetase, type I. Alternate name: glutamate--ammonia ligase. This model represents the dodecameric form, which can be subdivided into 1-alpha and 1-beta forms. The phylogeny of the 1-alpha and 1-beta forms appears polyphyletic. E. coli, Synechocystis PCC6803, Aquifex aeolicus, and the crenarcheon Sulfolobus acidocaldarius have form 1-beta, while Bacillus subtilis, Thermotoga maritima, and various euryarchaea have form 1-alpha. The 1-beta dodecamer from the crenarcheon Sulfolobus acidocaldarius differs from that in E. coli in that it is not regulated by adenylylation. [Amino acid biosynthesis, Glutamate family] 459
17153 129739 TIGR00654 PhzF_family phenazine biosynthesis protein PhzF family. Members of this family show a distant global similarity to diaminopimelate epimerases, which can be taken as the outgroup. One member of this family has been shown to act as an enzyme in the biosynthesis of the antibiotic phenazine in Pseudomonas aureofaciens. The function in other species is unclear. [Cellular processes, Toxin production and resistance] 297
17154 273199 TIGR00655 PurU formyltetrahydrofolate deformylase. This model describes formyltetrahydrofolate deformylases. The enzyme is a homohexamer. Sequences from a related formyl tetrahydrofolate-specific enzyme, phosphoribosylglycinamide formyltransferase, serve as an outgroup for phylogenetic analysis. Putative members of this family, scoring below the trusted cutoff, include a sequence from Rhodobacter capsulatus that lacks an otherwise conserved C-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 280
17155 273200 TIGR00656 asp_kin_monofn aspartate kinase, monofunctional class. This model describes a subclass of aspartate kinases. These are mostly Lys-sensitive and not fused to homoserine dehydrogenase, unlike some Thr-sensitive and Met-sensitive forms. Homoserine dehydrogenase is part of Thr and Met but not Lys biosynthetic pathways. Aspartate kinase catalyzes a first step in the biosynthesis from Asp of Lys (and its precursor diaminopimelate), Met, and Thr. In E. coli, a distinct isozyme is inhibited by each of the three amino acid products. The Met-sensitive (I) and Thr-sensitive (II) forms are bifunctional enzymes fused to homoserine dehydrogenases and form homotetramers, while the Lys-sensitive form (III) is a monofunctional homodimer. The Lys-sensitive enzyme of Bacillus subtilis resembles the E. coli form but is an alpha 2/beta 2 heterotetramer, where the beta subunit is translated from an in-phase alternative initiator at Met-246. The protein slr0657 from Synechocystis PCC6803 is extended by a duplication of the C-terminal region corresponding to the beta chain. Incorporation of a second copy of the C-terminal domain may be quite common in this subgroup of aspartokinases. [Amino acid biosynthesis, Aspartate family] 400
17156 273201 TIGR00657 asp_kinases aspartate kinase. Aspartate kinase catalyzes a first step in the biosynthesis from Asp of Lys (and its precursor diaminopimelate), Met, and Thr. In E. coli, a distinct isozyme is inhibited by each of the three amino acid products. The Met-sensitive (I) and Thr-sensitive (II) forms are bifunctional enzymes fused to homoserine dehydrogenases and form homotetramers, while the Lys-sensitive form (III) is a monofunctional homodimer. The Lys-sensitive enzyme of Bacillus subtilis resembles the E. coli form but is an alpha 2/beta 2 heterotetramer, where the beta subunit is translated from an in-phase alternative initiator at Met-246. This may be a feature of a number of closely related forms, including a paralog from B. subtilis. [Amino acid biosynthesis, Aspartate family] 441
17157 129743 TIGR00658 orni_carb_tr ornithine carbamoyltransferase. This family of ornithine carbamoyltransferases (OTCase) is in a superfamily with the related enzyme aspartate carbamoyltransferase. Most known examples are anabolic, playing a role in arginine biosynthesis, but some are catabolic. Most OTCases are homotrimers, but the homotrimers are organized into dodecamers built from four trimers in at least two species; the catabolic OTCase of Pseudomonas aeruginosa is allosterically regulated, while OTCase of the extreme thermophile Pyrococcus furiosus shows both allostery and thermophily. [Amino acid biosynthesis, Glutamate family] 304
17158 273202 TIGR00659 TIGR00659 TIGR00659 family protein. Members of this small but broadly distributed (Gram-positive, Gram-negative, and Archaeal) family appear to have multiple transmembrane segments. The function is unknown. A homolog, LrgB of Staphylococcus aureus, in the same small superfamily but in an outgroup to this subfamily, is regulated by LytSR and is suggested to act as a murein hydrolase. Of the three paralogous proteins in B. subtilis, one is a full length member of this family, one lacks the C-terminal 60 residues and has an additional 128 N-terminal residues but branches within the family in a phylogenetic tree, and one is closely related to LrgB and part of the outgroup. [Hypothetical proteins, Conserved] 226
17159 273203 TIGR00661 MJ1255 conserved hypothetical protein. This model represents nearly the full length of MJ1255 from Methanococcus jannaschii and of an unpublished protein from Vibrio cholerae, as well as the C-terminal half of a protein from Methanobacterium thermoautotrophicum. A small region (~50 amino acids) within the domain appears related to a family of sugar transferases. [Hypothetical proteins, Conserved] 321
17160 273204 TIGR00663 dnan DNA polymerase III, beta subunit. All proteins in this family for which functions are known are components of the DNA polymerase III complex (beta subunit). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 367
17161 273205 TIGR00664 DNA_III_psi DNA polymerase III, psi subunit. This small subunit of the DNA polymerase III holoenzyme in E. coli and related species appears to have a narrow taxonomic distribution. It is not found so far outside the gamma subdivision proteobacteria. [DNA metabolism, DNA replication, recombination, and repair] 124
17162 273206 TIGR00665 DnaB replicative DNA helicase. This model describes the helicase DnaB, a homohexameric protein required for DNA replication. The homohexamer can form a ring around a single strand of DNA near a replication fork. An intein of > 400 residues is found at a conserved location in DnaB of Synechocystis PCC6803, Rhodothermus marinus (both experimentally confirmed), and Mycobacterium tuberculosis. The intein removes itself by a self-splicing reaction. The seed alignment contains inteins so that the model built from the seed alignment will model a low cost at common intein insertion sites. [DNA metabolism, DNA replication, recombination, and repair] 432
17163 200042 TIGR00666 PBP4 D-alanyl-D-alanine carboxypeptidase, serine-type, PBP4 family. In E. coli, this protein is known as penicillin binding protein 4 (dacB). A signal sequence is cleaved from a precursor form. The protein is described as periplasmic in E. coli (Gram-negative) and extracellular in Actinomadura R39 (Gram-positive). Unlike some other proteins with similar activity, it does not perform transpeptidation. It is not essential for viability. This family is related to class A beta-lactamases. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 333
17164 273207 TIGR00667 aat leucyl/phenylalanyl-tRNA--protein transferase. The N-terminal residue controls the biological half-life of many proteins via the N-end rule pathway. This enzyme transfers a Leu or Phe to the amino end of certain proteins to enable degradation. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 185
17165 273208 TIGR00668 apaH bis(5'-nucleosyl)-tetraphosphatase (symmetrical). Diadenosine 5',5"'-P1,P4-tetraphosphate (Ap4A) is a regulatory metabolite of stress conditions. It is hydrolyzed to two ADP by this enzyme. Alternate names include diadenosine-tetraphosphatase and Ap4A hydrolase. [Cellular processes, Adaptations to atypical conditions] 279
17166 129752 TIGR00669 asnA aspartate--ammonia ligase, AsnA-type. This model represents one of two non-homologous forms of aspartate--ammonia ligase (asparagine synthetase) found in E. coli. This type is also found in Haemophilus influenzae, Treponema pallidum and Lactobacillus delbrueckii, but appears to have a very limited distribution. The fact that the protein from the H. influenzae is more than 70 % identical to that from the spirochete Treponema pallidum, but less than 65 % identical to that from the closely related E. coli, strongly suggests lateral transfer. [Amino acid biosynthesis, Aspartate family] 330
17167 273209 TIGR00670 asp_carb_tr aspartate carbamoyltransferase. Aspartate transcarbamylase (ATCase) is an alternate name. PyrB encodes the catalytic chain of aspartate carbamoyltransferase, an enzyme of pyrimidine biosynthesis, which organizes into trimers. In some species, including E. coli and the Archaea but excluding Bacillus subtilis, a regulatory subunit PyrI is also present in an allosterically regulated hexameric holoenzyme. Several molecular weight classes of ATCase are described in MEDLINE:96303527 and often vary within taxa. PyrB and PyrI are fused in Thermotoga maritima. Ornithine carbamoyltransferases are in the same superfamily and form an outgroup. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 301
17168 273210 TIGR00671 baf pantothenate kinase, type III. This model describes a family of proteins found in a single copy in at least ten different early completed bacterial genomes. The only characterized member of the family is Bvg accessory factor (Baf), a protein required, in addition to the regulatory operon bvgAS, for heterologous transcription of the Bordetella pertussis toxin operon (ptx) in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 243
17169 129755 TIGR00672 cdh CDP-diacylglycerol diphosphatase, bacterial type. This model finds only bacterial examples of CDP-diacylglycerol pyrophosphatase. The member from Mycobacterium tuberculosis, the only non-proteobacterial example, is only tentatively identified and scores below the trusted cutoff. No homology is detected to functionally similar mammalian enzymes. Alternate names for this enzyme include CDP-diglyceride hydrolase and CDP-diacylglycerol hydrolase. [Fatty acid and phospholipid metabolism, Biosynthesis] 250
17170 213549 TIGR00673 cynS cyanase. Alternate names include cyanate C-N-lyase, cyanate hydratase, and cyanate hydrolase. [Cellular processes, Detoxification] 150
17171 129757 TIGR00674 dapA 4-hydroxy-tetrahydrodipicolinate synthase. Members of this family are 4-hydroxy-tetrahydrodipicolinate synthase, previously (incorrectly) called dihydrodipicolinate synthase. It is a homotetrameric enzyme of lysine biosynthesis. E. coli has several paralogs closely related to dihydrodipicolinate synthase (DapA), as well as the more distant N-acetylneuraminate lyase. In Pyrococcus horikoshii, the bidirectional best hit with E. coli is to an uncharacterized paralog of DapA, not DapA itself, and it is omitted from the seed. The putative members from the Chlamydias (pathogens with a parasitic metabolism) are easily the most divergent members of the multiple alignment. [Amino acid biosynthesis, Aspartate family] 285
17172 273211 TIGR00675 dcm DNA-methyltransferase (dcm). All proteins in this family for which functions are known are DNA-cytosine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 315
17173 273212 TIGR00676 fadh2 5,10-methylenetetrahydrofolate reductase, prokaryotic form. The enzyme activities methylenetetrahydrofolate reductase (EC 1.5.1.20) and 5,10-methylenetetrahydrofolate reductase (FADH) (EC 1.7.99.5) differ in that 1.5.1.20 (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while 1.7.99.5 (assigned in many bacteria) is flexible with respect to the acceptor; both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as 1.5.1.20 and 1.7.99.5, this model describes the subset of proteins found in bacteria, and currently designated 1.7.99.5. This protein is an FAD-containing flavoprotein. [Amino acid biosynthesis, Aspartate family] 272
17174 129760 TIGR00677 fadh2_euk methylenetetrahydrofolate reductase, eukaryotic type. The enzyme activities methylenetetrahydrofolate reductase (EC 1.5.1.20) and 5,10-methylenetetrahydrofolate reductase (FADH) (EC 1.7.99.5) differ in that 1.5.1.20 (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while 1.7.99.5 (assigned in many bacteria) is flexible with respect to the acceptor; both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as 1.5.1.20 and 1.7.99.5, this model describes the subset of proteins found in eukaryotes and designated 1.5.1.20. This protein is an FAD-containing flavoprotein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 281
17175 273213 TIGR00678 holB DNA polymerase III, delta' subunit. This model describes the N-terminal half of the delta' subunit of DNA polymerase III. Delta' is homologous to the gamma and tau subunits, which form an outgroup for phylogenetic comparison. The gamma/tau branch of the tree is much more tightly conserved than the delta' branch, and some members of that branch score more highly against this model than some proteins classified as delta'. The noise cutoff is set to detect weakly scoring delta' subunits rather than to exclude gamma/tau subunits. At position 126-127 of the seed alignment, this family lacks the HM motif of gamma/tau; at 132 it has a near-invariant A vs. an invariant F in gamma/tau. [DNA metabolism, DNA replication, recombination, and repair] 188
17176 273214 TIGR00679 hpr-ser Hpr(Ser) kinase/phosphatase. Members of this family are the bifunctional enzyme, HPr kinase/phosphatase. All members of the seed alignment (n=57) have a gene tightly clustered with a gene for the phosphocarrier protein HPr, its target. [Regulatory functions, Protein interactions, Signal transduction, PTS] 300
17177 273215 TIGR00680 kdpA K+-transporting ATPase, KdpA. Kdp is a high affinity ATP-driven K+ transport system in Escherichia coli. It is composed of three membrane-bound subunits, KdpA, KdpB and KdpC and one small peptide, KdpF. KdpA is the K+-transporting subunit of this complex. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [medline:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [medline:9858692]. [Transport and binding proteins, Cations and iron carrying compounds] 563
17178 129764 TIGR00681 kdpC K+-transporting ATPase, C subunit. This chain has a single predicted transmembrane region near the amino end. It is part of a K+-transport ATPase that contains two other membrane-bound subunits, KdpA and KdpB, and a small subunit KdpF. KdpA is the K+-translocating subunit, KdpB the ATP-hydrolyzing subunit. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [MEDLINE:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [MEDLINE:9858692]. [Transport and binding proteins, Cations and iron carrying compounds] 187
17179 273216 TIGR00682 lpxK tetraacyldisaccharide 4'-kinase. Also called lipid-A 4'-kinase. This essential gene encodes an enzyme in the pathway of lipid A biosynthesis in Gram-negative organisms. A single copy of this protein is found in Gram-negative bacteria. PSI-BLAST converges on this set of apparent orthologs without identifying any other homologs. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 311
17180 273217 TIGR00683 nanA N-acetylneuraminate lyase. N-acetylneuraminate lyase is also known as N-acetylneuraminic acid aldolase, sialic acid aldolase, or sialate lyase. It is an intracellular enzyme. The structure of this homotetrameric enzyme related to dihydrodipicolinate synthase is known. In Clostridium tertium, the enzyme appears to be in an operon with a secreted sialidase that releases sialic acid from host sialoglycoconjugates. In several E. coli strains, however, this enzyme is responsible for N-acetyl-D-neuraminic acid synthesis for capsule production by condensing N-acetyl-D-mannosamine and pyruvate. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Central intermediary metabolism, Amino sugars] 293
17181 273218 TIGR00684 narJ nitrate reductase molybdenum cofactor assembly chaperone. This protein is termed NarJ in most species that have a single copy, and has been called the delta subunit of nitrate reductase. However, although it is required for correct assembly of active enzyme, it dissociates and is not part of the enzyme. Two hits to this model are found each in E. coli and in Mycobacterium tuberculosis, but in each case duplication to create paralogs appears to be recent. The NarX protein of Mycobacterium tuberculosis includes one of these paralogs as a domain, fused to structural domains of nitrate reductases before and after the NarJ-homologous region. [Protein fate, Protein folding and stabilization] 152
17182 273219 TIGR00685 T6PP trehalose-phosphatase. Trehalose, a neutral disaccharide of two glucose residues, is an important osmolyte for desiccation and/or salt tolerance in a number of prokaryotic and eukaryotic species, including E. coli, Saccharomyces cerevisiae, and Arabidopsis thaliana. Many bacteria also utilize trehalose in the synthesis of trehalolipids, specialized cell wall constituents believed to be involved in the uptake of hydrophobic substances. Trehalose dimycolate (TDM, cord factor) and related substances are important constituents of the mycobacterial waxy coat and responsible for various clinically important immunological interactions with host organism. This enzyme, trehalose-phosphatase, removes a phosphate group in the final step of trehalose biosynthesis. The trehalose-phosphatase from Saccharomyces cerevisiae is fused to the synthase. At least 18 distinct sequences from Arabidopsis have been identified, roughly half of these are of the fungal type, with a fused synthase and half are like the bacterial members having only the phosphatase domain. It has been suggested that trehalose is being used in Arabidopsis as a regulatory molecule in development and possibly other processes. [Cellular processes, Adaptations to atypical conditions] 244
17183 273220 TIGR00686 phnA alkylphosphonate utilization operon protein PhnA. The protein family includes an uncharacterized member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterized phosphonoacetate hydrolase designated PhnA by Kulakova, et al. (2001, 1997). [Unknown function, General] 109
17184 273221 TIGR00687 pyridox_kin pyridoxal kinase. E. coli has an enzyme PdxK that acts in vitro as a pyridoxine/pyridoxal/pyridoxamine kinase, but mutants lacking PdxK activity retain a specific pyridoxal kinase, PdxY. PdxY acts in the salvage pathway of pyridoxal 5'-phosphate biosynthesis. Mammalian forms of pyridoxal kinase are more similar to PdxY than to PdxK. The PdxK isozyme is omitted from the seed alignment but scores above the trusted cutoff. ThiD and related proteins form an outgroup. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 287
17185 129771 TIGR00688 rarD rarD protein. This uncharacterized protein is predicted to have many membrane-spanning domains. [Transport and binding proteins, Unknown substrate] 256
17186 129772 TIGR00689 rpiB_lacA_lacB sugar-phosphate isomerase, RpiB/LacA/LacB family. Proteins of known function in this family act as sugar (pentose and/or hexose)-phosphate isomerases, including the LacA and LacB subunits of galactose-6-phosphate isomerases from Gram-positive bacteria and RpiB. RpiB is the second ribose phosphate isomerase of E. coli. It lacks homology to RpiA, its inducer is unknown (but is not ribose), and it can be replaced by the homologous galactose-6-phosphate isomerase of Streptococcus mutans, all of which suggests that the ribose phosphate isomerase activity of RpiB is a secondary function. On the other hand, there appear to be a significant number of species which contain rpiB, lack rpiA and seem to require rpi activity in order to complete the pentose phosphate pathway. 144
17187 188073 TIGR00690 rpoZ DNA-directed RNA polymerase, omega subunit. This small component of highly purified E. coli RNA polymerase is not required for transcription, but acts in assembly and is present in stoichiometric amounts. The trusted cutoff excludes archaeal homologs but captures some organellar sequences. [Transcription, DNA-dependent RNA polymerase] 60
17188 213552 TIGR00691 spoT_relA (p)ppGpp synthetase, RelA/SpoT family. The functions of E. coli RelA and SpoT differ somewhat. RelA (EC 2.7.6.5) produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT (EC 3.1.7.2) degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species. [Cellular processes, Adaptations to atypical conditions] 683
17189 129775 TIGR00692 tdh L-threonine 3-dehydrogenase. This protein is a tetrameric, zinc-binding, NAD-dependent enzyme of threonine catabolism. Closely related proteins include sorbitol dehydrogenase, xylitol dehydrogenase, and benzyl alcohol dehydrogenase. Eukaryotic examples of this enzyme have been demonstrated experimentally but do not appear in database search results. E. coli His-90 modulates substrate specificity and is believed part of the active site. [Energy metabolism, Amino acids and amines] 340
17190 273222 TIGR00693 thiE thiamine-phosphate diphosphorylase. This model represents the thiamine-phosphate pyrophosphorylase, ThiE, of a number of bacteria, and N-terminal domains of bifunctional thiamine proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe, in which the C-terminal domain corresponds to the bacterial hydroxyethylthiazole kinase (EC 2.7.1.50), ThiM. This model includes ThiE from Bacillus subtilis but excludes its paralog, the regulatory protein TenI (SP:P25053), and neighbors of TenI. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 196
17191 188074 TIGR00694 thiM hydroxyethylthiazole kinase. This model represents the hydroxyethylthiazole kinase, ThiM, of a number of bacteria, and C-terminal domains of bifunctional thiamine biosynthesis proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe, in which the N-terminal domain corresponds to the bacterial thiamine-phosphate pyrophosphorylase (EC 2.5.1.3), ThiE. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 249
17192 129778 TIGR00695 uxuA mannonate dehydratase. This Fe2+-requiring enzyme plays a role in D-glucuronate catabolism in Escherichia coli. Mannonate dehydratase converts D-mannonate to 2-dehydro-3-deoxy-D-gluconate. An apparent equivalog is found in a glucuronate utilization operon in Bacillus stearothermophilus T-6. [Energy metabolism, Sugars] 394
17193 129779 TIGR00696 wecG_tagA_cpsF bacterial polymer biosynthesis proteins, WecB/TagA/CpsF family. The WecG member of this superfamily, believed to be UDP-N-acetyl-D-mannosaminuronic acid transferase, plays a role in enterobacterial common antigen (eca) synthesis in Escherichia coli. Another family member, the Bacillus subtilis TagA protein, is involved in the biosynthesis of the cell wall polymer poly(glycerol phosphate). The third family member, CpsF, CMP-N-acetylneuraminic acid synthetase has a role in the capsular polysaccharide biosynthesis pathway. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 177
17194 129780 TIGR00697 TIGR00697 conserved hypothetical integral membrane protein. All known members of this family are proteins of 210-250 amino acids in length. Conserved regions of hydrophobicity suggest that all members of the family are integral membrane proteins. [Hypothetical proteins, Conserved] 202
17195 129781 TIGR00698 TIGR00698 conserved hypothetical integral membrane protein. Members of this family are found so far only in one archaeal species, Archaeoglobus fulgidus, and in two related bacterial species, Haemophilus influenzae and Escherichia coli. It has 9 GES predicted transmembrane regions at conserved locations in all members. These proteins have a molecular weight of approximately 35 to 38 kDa. [Hypothetical proteins, Conserved] 335
17196 129782 TIGR00699 GABAtrns_euk 4-aminobutyrate aminotransferase, eukaryotic type. This enzyme is a class III pyridoxal-phosphate-dependent aminotransferase. This model describes known eukaryotic examples of the enzyme. The degree of sequence difference between this set and known bacterial examples is greater than the distance between either set and the most similar enzyme with distinct function, and so separate models are built for prokaryotic and eukaryotic sets. Alternate names include GABA transaminase, gamma-amino-N-butyrate transaminase, and beta-alanine--oxoglutarate aminotransferase. [Central intermediary metabolism, Other] 464
17197 129783 TIGR00700 GABAtrnsam 4-aminobutyrate aminotransferase, prokaryotic type. This enzyme is a class III pyridoxal-phosphate-dependent aminotransferase. This model describes known bacterial examples of the enzyme. The best archaeal matches are presumed but not trusted to have the equivalent function. The degree of sequence difference between this set and known eukaryotic (mitochondrial) examples is greater than the distance to some proteins known to have different functions, and so separate models are built for prokaryotic and eukaryotic sets. E. coli has two isozymes. Alternate names include GABA transaminase, gamma-amino-N-butyrate transaminase, and beta-alanine--oxoglutarate aminotransferase. [Central intermediary metabolism, Other] 420
17198 273223 TIGR00701 TIGR00701 TIGR00701 family protein. It appears this conserved hypothetical integral membrane protein is found only in gram negative bacteria. Completed genomes that include a member of this family include Rickettsia prowazekii, Synechocystis sp. PCC6803, and Helicobacter pylori. These proteins have 3 (Helicobacter pylori) to 5 (Synechocystis sp. PCC 6803) GES predicted transmembrane regions. Most members have 4 GES predicted transmembrane regions. [Hypothetical proteins, Conserved] 142
17199 273224 TIGR00702 TIGR00702 YcaO-type kinase domain. This protein family includes YcaO and homologs that can phosphorylate a peptide amide backbone (rather than side chains), as during heterocycle-forming modifications during maturation of the TOMM class (Thiazole/Oxazole-Modified Microcins) of bacteriocins. However, YcaO domain proteins also occur in contexts that do not suggest peptide modification. [Hypothetical proteins, Conserved] 377
17200 129786 TIGR00703 TIGR00703 TIGR00703 family protein. The function of this family is unknown. These proteins are from 222 to 233 residues in length, lack hydrophobic stretches, and are found so far only in thermophiles. [Hypothetical proteins, Conserved] 223
17201 273225 TIGR00704 NaPi_cotrn_rel Na/Pi-cotransporter. This model describes essentially the full length of an uncharacterized protein from Bacillus subtilis and corresponding lengths of longer proteins from E. coli and Treponema pallidum. PSI-BLAST analysis converges to demonstrate homology to one other group of proteins, type II sodium/phosphate (Na/Pi) cotransporters. A well-conserved repeated domain in this family, approximately 60 residues in length, is also repeated in the Na/Pi cotransporters, although with greater spacing between the repeats. The two families share additional homology in the region after the first repeat, share the property of having extensive hydrophobic regions, and may be similar in function. [Transport and binding proteins, Cations and iron carrying compounds] 308
17202 273226 TIGR00705 SppA_67K signal peptide peptidase SppA, 67K type. This model represents the signal peptide peptidase A (SppA, protease IV) as found in E. coli, Treponema pallidum, Mycobacterium leprae, and several other species, in which it has a molecular mass around 67 kDa and a duplication such that the N-terminal half shares extensive homology with the C-terminal half. This enzyme was shown in E. coli to form homotetramers. E. coli SohB, which is most closely homologous to the C-terminal duplication of SppA, is predicted to perform a similar function of small peptide degradation, but in the periplasm. Many prokaryotes have a single SppA/SohB homolog that may perform the function of either or both. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 584
17203 273227 TIGR00706 SppA_dom signal peptide peptidase SppA, 36K type. The related but duplicated, double-length protein SppA (protease IV) of E. coli was shown experimentally to degrade signal peptides as are released by protein processing and secretion. This protein shows stronger homology to the C-terminal region of SppA than to the N-terminal domain or to the related putative protease SohB. The member of this family from Bacillus subtilis was shown to have properties consistent with a role in degrading signal peptides after cleavage from precursor proteins, although it was not demonstrated conclusively. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 208
17204 273228 TIGR00707 argD transaminase, acetylornithine/succinylornithine family. This family of proteins, for which ornithine aminotransferases form an outgroup, consists mostly of proteins designated acetylornithine aminotransferase. However, the two very closely related members from E. coli are assigned different enzymatic activities. One is acetylornithine aminotransferase (EC 2.6.1.11), ArgD, an enzyme of arginine biosynthesis, while another is succinylornithine aminotransferase, an enzyme of the arginine succinyltransferase pathway, an ammonia-generating pathway of arginine catabolism (See MEDLINE:98361920). Members of this family may also act on ornithine, like ornithine aminotransferase (EC 2.6.1.13) (see MEDLINE:90337349) and on succinyldiaminopimelate, like N-succinyldiaminopimelate-aminotransferase (EC 2.6.1.17, DapC, an enzyme of lysine biosynthesis) (see MEDLINE:99175097) 379
17205 129791 TIGR00708 cobA cob(I)alamin adenosyltransferase. Alternate name: corrinoid adenosyltransferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 173
17206 129792 TIGR00709 dat 2,4-diaminobutyrate 4-transaminases. This family consists of L-diaminobutyric acid transaminases. This general designation covers both 2.6.1.76 (diaminobutyrate-2-oxoglutarate transaminase, which uses glutamate as the amino donor in DABA biosynthesis), and 2.6.1.46 (diaminobutyrate--pyruvate transaminase, which uses alanine as the amino donor). Most members with known function are 2.6.1.76, and at least some annotations as 2.6.1.46 in current databases at time of model revision are incorrect. A distinct branch of this family contains examples of 2.6.1.76 nearly all of which are involved in ectoine biosynthesis. A related enzyme is 4-aminobutyrate aminotransferase (EC 2.6.1.19), also called GABA transaminase. These enzymes all are pyridoxal phosphate-containing class III aminotransferases. [Central intermediary metabolism, Other] 442
17207 273229 TIGR00710 efflux_Bcr_CflA drug resistance transporter, Bcr/CflA subfamily. This subfamily of drug efflux proteins, a part of the major facilitator family, is predicted to have 12 membrane-spanning regions. Members with known activity include Bcr (bicyclomycin resistance protein) in E. coli, Flor (chloramphenicol and florfenicol resistance) in Salmonella typhimurium DT104, and CmlA (chloramphenicol resistance) in Pseudomonas sp. plasmid R1033. 385
17208 129794 TIGR00711 efflux_EmrB drug resistance transporter, EmrB/QacA subfamily. This subfamily of drug efflux proteins, a part of the major facilitator family, is predicted to have 14 potential membrane-spanning regions. Members with known activities include EmrB (multiple drug resistance efflux pump) in E. coli, FarB (antibacterial fatty acid resistance) in Neisseria gonorrhoeae, TcmA (tetracenomycin C resistance) in Streptomyces glaucescens, etc. In most cases, the efflux pump is described as having a second component encoded in the same operon, such as EmrA of E. coli. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Other] 485
17209 129795 TIGR00712 glpT glycerol-3-phosphate transporter. This model describes a very hydrophobic protein, predicted to span the membrane at least 8 times. The two members confirmed experimentally as glycerol-3-phosphate transporters, from E. coli and B. subtilis, share more than 50 % amino acid identity. Proteins of the hexose phosphate and phosphoglycerate transport systems are also quite similar. [Transport and binding proteins, Other] 438
17210 273230 TIGR00713 hemL glutamate-1-semialdehyde-2,1-aminomutase. This enzyme, glutamate-1-semialdehyde-2,1-aminomutase (glutamate-1-semialdehyde aminotransferase, GSA aminotransferase), contains a pyridoxal phosphate attached at a Lys residue at position 283 of the seed alignment. It is in the family of class III aminotransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 423
17211 211601 TIGR00714 hscB Fe-S protein assembly co-chaperone HscB. This model describes the small subunit, Hsc20 (20K heat shock cognate protein) of a pair of proteins Hsc66-Hsc20, related to the DnaK-DnaJ heat shock proteins, which also serve as molecular chaperones. Hsc20, unlike DnaJ, appears not to have chaperone activity on its own, but to act solely as a regulatory subunit for Hsc66 (i.e., to be a co-chaperone). The gene for Hsc20 in E. coli, hscB, is not induced by heat shock. [Protein fate, Protein folding and stabilization] 155
17212 273231 TIGR00715 precor6x_red precorrin-6x reductase. This enzyme catalyzes a step in cobalamin biosynthesis. It has been identified experimentally in Pseudomonas denitrificans and has been shown to be part of cobalamin biosynthetic operons in several other species. This enzyme was found to be a monomer by gel filtration. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 256
17213 129799 TIGR00716 rnhC ribonuclease HIII. This enzyme cleaves RNA from DNA-RNA hybrids. Two types of ribonuclease H in Bacillus subtilis, RNase HII (rnhB) and RNase HIII (rnhC), are both known experimentally and are quite similar to each other. The only RNase H homolog in the Mycoplasmas resembles rnhC. Archaeal forms resemble HII more closely than HIII. This model describes bacterial RNase HIII. [DNA metabolism, DNA replication, recombination, and repair] 284
17214 273232 TIGR00717 rpsA ribosomal protein S1. This model describes ribosomal protein S1, RpsA. This protein is found in most bacterial genomes in a single copy, but is not present in the Mycoplasmas. It is heterogeneous with respect to the number of repeats of the S1 RNA binding domain described by pfam00575: six repeats in E. coli and most other bacteria, four in Bacillus subtilis and some other species. rpsA is an essential gene in E. coli but not in B. subtilis. It is associated with the cytidylate kinase gene cmk in many species, and fused to it in Treponema pallidum. RpsA is proposed (Medline:97323001) to assist in mRNA degradation. This model provides trusted hits to most long form (6 repeat) examples of RpsA. Among homologs with only four repeats are some to which other (perhaps secondary) functions have been assigned. [Protein synthesis, Ribosomal proteins: synthesis and modification] 516
17215 129801 TIGR00718 sda_alpha L-serine dehydratase, iron-sulfur-dependent, alpha subunit. This enzyme is also called serine deaminase. L-serine dehydratase converts serine into pyruvate in the gluconeogenesis pathway from serine. This model describes the alpha chain of an iron-sulfur-dependent L-serine dehydratase, found in Bacillus subtilis. A fairly deep split in a UPGMA tree separates members of this family of alpha chains from the homologous region of single chain forms such as found in Escherichia coli. This family of enzymes is not homologous to the pyridoxal phosphate-dependent threonine deaminases and eukaryotic serine deaminases. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis] 294
17216 129802 TIGR00719 sda_beta L-serine dehydratase, iron-sulfur-dependent, beta subunit. This enzyme is also called serine deaminase. This model describes the beta chain of an iron-sulfur-dependent L-serine dehydratase, as in Bacillus subtilis. A fairly deep split in a UPGMA tree separates members of this family of beta chains from the homologous region of single chain forms such as found in E. coli. This family of enzymes is not homologous to the pyridoxal phosphate-dependent threonine deaminases and eukaryotic serine deaminases. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis] 208
17217 273233 TIGR00720 sda_mono L-serine dehydratase, iron-sulfur-dependent, single chain form. This enzyme is also called serine deaminase and L-serine dehydratase 1. L-serine ammonia-lyase converts serine into pyruvate in the gluconeogenesis pathway from serine. This enzyme is comprised of a single chain in Escherichia coli, Mycobacterium tuberculosis, and several other species, but has separate alpha and beta chains in Bacillus subtilis and related species. The beta and alpha chains are homologous to the N-terminal and C-terminal regions, respectively, but are rather deeply branched in a UPGMA tree. This enzyme requires iron and dithiothreitol for activation in vitro, and is a predicted 4Fe-4S protein. Escherichia coli and Pseudomonas aeruginosa have two copies of this protein. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis] 450
17218 129804 TIGR00721 tfx DNA-binding protein, Tfx family. PSI-BLAST starting with one member of this family converges with significant hits only to other members of the family, which is restricted to the Archaea. Homology is strongest in the helix-turn-helix-containing N-terminal region. Tfx from Methanobacterium thermoautotrophicum is associated with the operon for molybdenum formyl-methanofuran dehydrogenase and binds a DNA sequence near its promoter. [Regulatory functions, DNA interactions] 137
17219 273234 TIGR00722 ttdA_fumA_fumB hydro-lyases, Fe-S type, tartrate/fumarate subfamily, alpha region. A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of archaeal proteins in this subfamily has not been established. 273
17220 129806 TIGR00723 ttdB_fumA_fumB hydro-lyases, Fe-S type, tartrate/fumarate subfamily, beta region. A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of archaeal proteins in this subfamily has not been established. 168
17221 129807 TIGR00724 urea_amlyse_rel biotin-dependent carboxylase uncharacterized domain. Urea amidolyase of Saccharomyces cerevisiae is a 1835 amino acid protein with an amidase domain, a biotin/lipoyl cofactor attachment domain, a carbamoyl-phosphate synthase L chain-like domain, and uncharacterized regions. It has both urea carboxylase and allophanate hydrolase activities. This model models a domain that represents uncharacterized prokaryotic proteins of about 300 amino acids, regions of prokaryotic urea carboxylase and of the urea carboxylase region of yeast urea amidolyase, and regions of other biotin-containing proteins. [Unknown function, General] 314
17222 129808 TIGR00725 TIGR00725 TIGR00725 family protein. This model represents one branch of a subfamily of uncharacterized proteins. Both PSI-BLAST and weak hits by this model show a low level of similarity and suggest an evolutionary relationship of the subfamily to the DprA/Smf family of DNA-processing proteins involved in chromosomal transformation with foreign DNA. Both Aquifex aeolicus and Mycobacterium leprae have one member in each of two branches of this subfamily, suggesting the branches may have distinct functions. This family is one of several families within the scope of pfam03641, several members of which are annotated as lysine decarboxylases. That larger family, and the branch described by this model, have a well-conserved motif PGGXGTXXE. [Hypothetical proteins, Conserved] 159
17223 273235 TIGR00726 TIGR00726 YfiH family protein. PSI-BLAST converges on members of this family of uncharacterized bacterial proteins and shows no significant similarity to any characterized protein. No completed genome to date has two members. Members of the family have been crystallized but the function is unknown. [Unknown function, General] 221
17224 129810 TIGR00727 ISP4_OPT small oligopeptide transporter, OPT family. This model represents a family of transporters of small oligopeptides, demonstrated experimentally in three different species of yeast. A set of related proteins from the plant Arabidopsis thaliana forms an outgroup to the yeast set by neighbor joining analysis but is remarkably well conserved and is predicted here to have equivalent function. [Transport and binding proteins, Amino acids, peptides and amines] 681
17225 273236 TIGR00728 OPT_sfam oligopeptide transporter, OPT superfamily. This superfamily has two main branches. One branch contains a tetrapeptide transporter demonstrated experimentally in three different species of yeast. The other family contains EspB of Myxococcus xanthus, a protein required for normal rather than delayed sporulation after cellular aggregation; its role is unknown but is compatible with transport of a signalling molecule. Homology between the two branches of the superfamily is seen most easily at the ends of the protein. The central regions are poorly conserved within each branch and may not be homologous between branches. 657
17226 129812 TIGR00729 TIGR00729 ribonuclease H, mammalian HI/archaeal HII subfamily. This enzyme cleaves RNA from DNA-RNA hybrids. Archaeal members of this subfamily of RNase H are designated RNase HII and one has been shown to be active as a monomer. A member from Homo sapiens was characterized as RNase HI, large subunit. [DNA metabolism, DNA replication, recombination, and repair] 206
17227 129813 TIGR00730 TIGR00730 TIGR00730 family protein. This model represents one branch of a subfamily of proteins of unknown function. Both PSI-BLAST and weak hits by this model show a low level of similarity to and suggest an evolutionary relationship of the subfamily to the DprA/Smf family of DNA-processing proteins involved in chromosomal transformation with foreign DNA. Both Aquifex aeolicus and Mycobacterium leprae have one member in each of two branches of this subfamily, suggesting that the branches may have distinct functions. [Hypothetical proteins, Conserved] 178
17228 273237 TIGR00731 bL25_bact_ctc ribosomal protein bL25, Ctc-form. This model models a family of proteins with full-length homology to the general stress protein Ctc of Bacillus subtilis, a mesophile, and ribosomal protein TL5 of Thermus thermophilus, a thermophile. Ribosomal protein L25 of Escherichia coli and H. influenzae appear to be orthologous but consist only of the N-terminal half of Ctc and TL5. Both short (L25-like) and full-length (CTC-like) members of this family bind the E-loop of bacterial 5S rRNA. This protein appears to be restricted to bacteria and organelles, and consists of at most one copy per prokaryotic genome. Ctc of Bacillus subtilis has now been localized to ribosomes and can be viewed as the long form, or Ctc form, of L25. The C-terminal domain of sll1824, an apparent L25 of Synechocystis PCC6803, matches the N-terminal domain of this family. Examples of L25 and Ctc are not separated by a UPGMA tree built on the region of shared homology. [Protein synthesis, Ribosomal proteins: synthesis and modification] 176
17229 273238 TIGR00732 dprA DNA protecting protein DprA. Disruption of this gene in both Haemophilus influenzae and Helicobacter pylori drastically reduces the efficiency of transformation with exogenous DNA, but with different levels of effect on chromosomal (linear) and plasmid (circular) DNA. This difference suggests that DprA is not active in recombination, and it has been shown not to affect DNA binding, leaving the intermediate step in natural transformation, DNA processing. In Strep. pneumoniae, inactivation of dprA had no effect on the uptake of DNA. All of these data indicated that DprA is required at a later stage in transformation. Subsequently DprA and RecA were both shown in S. pneumoniae to be required to protect incoming ssDNA from immediate degradation. The role of DprA in non-transformable species is not known. The gene symbol smf was assigned in E. coli, but without assignment of function. [Cellular processes, DNA transformation] 220
17230 273239 TIGR00733 TIGR00733 putative oligopeptide transporter, OPT family. This protein represents a small family of integral membrane proteins from Gram-negative bacteria, a Gram-positive bacterium, and an archaeal species. Members of this family contain 15 to 18 GES predicted transmembrane regions, and this family has extensive homology to a family of yeast tetrapeptide transporters, including isp4 (Schizosaccharomyces pombe) and Opt1 (Candida albicans). EspB, an apparent equivalog from Myxococcus xanthus, shares an operon with a two component system regulatory protein, and is required for the normal timing of sporulation after the aggregation of cells. This is consistent with a role in transporting oligopeptides as signals across the membrane. [Transport and binding proteins, Amino acids, peptides and amines] 591
17231 273240 TIGR00734 hisAF_rel hisA/hisF family protein. This model models a family of proteins found so far in three archaeal species: Methanobacterium thermoautotrophicum, Methanococcus jannaschii, and Archaeoglobus fulgidus. This protein is homologous to phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase (HisA) and, with lower similarity, to the cyclase HisF, both of which are enzymes of histidine biosynthesis. Each species with this protein also encodes HisA. The function of this protein is unknown. [Unknown function, General] 221
17232 273241 TIGR00735 hisF imidazoleglycerol phosphate synthase, cyclase subunit. [Amino acid biosynthesis, Histidine family] 254
17233 129819 TIGR00736 nifR3_rel_arch TIM-barrel protein, putative. Members of this family show a distant relationship by PSI-BLAST to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. At least two closely related but well-separable families among the bacteria, the nifR3/yhdG family and the yjbN family, share a more distant relationship to this family of shorter, exclusively archaeal proteins. [Unknown function, General] 231
17234 129820 TIGR00737 nifR3_yhdG putative TIM-barrel protein, nifR3 family. This model represents one branch of COG0042 (Predicted TIM-barrel enzymes, possibly dehydrogenases, nifR3 family). This branch includes NifR3 itself, from Rhodobacter capsulatus. It excludes a broadly distributed but more sparsely populated subfamily that contains sll0926 from Synechocystis PCC6803, HI0634 from Haemophilus influenzae, and BB0225 from Borrelia burgdorferi. It also excludes a shorter and more distant archaeal subfamily. The function of nifR3, a member of this family, is unknown, but it is found in an operon with nitrogen-sensing two component regulators in Rhodobacter capsulatus. Members of this family show a distant relationship to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. [Unknown function, General] 319
17235 273242 TIGR00738 rrf2_super Rrf2 family protein. This model represents a superfamily of probable transcriptional regulators. One member, RRF2 of Desulfovibrio vulgaris, is an apparent regulatory protein according to experimental evidence (MEDLINE:97293189). The N-terminal region appears related to the DNA-binding biotin repressor region of the bifunctional protein BirA, according to results after three rounds of PSI-BLAST with a fairly high stringency. [Unknown function, General] 132
17236 273243 TIGR00739 yajC preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminal region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas. [Protein fate, Protein and peptide secretion and trafficking] 84
17237 273244 TIGR00740 TIGR00740 tRNA (cmo5U34)-methyltransferase. This tRNA methyltransferase is involved, together with cmoB, in preparing the uridine-5-oxyacetic acid (cmo5U) at position 34. [Unknown function, Enzymes of unknown specificity] 239
17238 129824 TIGR00741 yfiA ribosomal subunit interface protein. This model includes a small protein encoded by one of two genes, both downstream of the gene rpoN for sigma 54, whose deletion leads to increased expression from sigma 54-dependent promoters. It also includes the N-terminal half of a light-repressed protein LtrA of Synechococcus PCC 7002 and the N-terminal region (after removal of the transit peptide) of a larger plastid-specific ribosomal protein of spinach. The member of this family from E. coli is now recognized as a protein at the interface between ribosomal large and small subunits, with about 1/3 as many copies per cell as the number of ribosomes. [Protein synthesis, Translation factors] 95
17239 129825 TIGR00742 yjbN tRNA dihydrouridine synthase A. This model represents one branch of COG0042 (Predicted TIM-barrel enzymes, possibly dehydrogenases, nifR3 family). It represents a distinct subset by a set of shared unique motifs, a conserved pattern of insertions/deletions relative to other nifR3 homologs, and by subclustering based on cross-genome bidirectional best hits. Members are found in species as diverse as the proteobacteria, a spirochete, a cyanobacterium, and Deinococcus radiodurans. NifR3 itself, a protein of unknown function associated with nitrogen regulation in Rhodobacter capsulatus, is not a member of this branch. Members of this family show a distant relationship to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. [Protein synthesis, tRNA and rRNA base modification] 318
17240 273245 TIGR00743 TIGR00743 conserved hypothetical protein. These small proteins are approximately 100 amino acids in length and appear to be found only in gamma proteobacteria. The function of this protein family is unknown. [Hypothetical proteins, Conserved] 95
17241 273246 TIGR00744 ROK_glcA_fam ROK family protein (putative glucokinase). This model models one branch of the ROK superfamily of proteins. The three members of the seed alignment for this model all have experimental evidence for activity as glucokinase, but the set of related proteins is crowded with paralogs of different or unknown function. Proteins scoring above the trusted_cutoff will show strong similarity to at least one known glucokinase and may be designated as putative glucokinases. However, definitive identification of glucokinases should be done only with extreme caution. [Unknown function, General] 318
17242 273247 TIGR00745 apbA_panE 2-dehydropantoate 2-reductase. This model describes enzymes that perform as 2-dehydropantoate 2-reductase, one of four enzymes required for the de novo biosynthesis of pantothenate (vitamin B5) from Asp and 2-oxoisovalerate. Although few members of the seed alignment are characterized experimentally, nearly all from complete genomes are found in a genome-wide (but not local) context of all three other pantothenate-biosynthetic enzymes (TIGR00222, TIGR00018, TIGR00223). The gene encoding this enzyme is designated apbA in Salmonella typhimurium and panE in Escherichia coli; this protein functions as a monomer and functions in the alternative pyrimidine biosynthetic, or APB, pathway, used to synthesize the pyrimidine moiety of thiamine. Note, synthesis of the pyrimidine moiety of thiamine occurs either via the first five steps in de novo purine biosynthesis, which uses the pur gene products, or through the APB pathway. Note that this family includes both NADH and NADPH-dependent enzymes, and enzymes with broad specificity, such as a D-mandelate dehydrogenase that is also a 2-dehydropantoate 2-reductase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 293
17243 273248 TIGR00746 arcC carbamate kinase. In most species, carbamate kinase works in arginine catabolism and consumes carbamoyl phosphate to convert ADP into ATP. In the pathway in Pyrococcus furiosus, the enzyme acts instead to generate carbamoyl phosphate. The seed alignment for this model includes experimentally confirmed examples from a set of phylogenetically distinct species. In a neighbor-joining tree constructed from an alignment of candidate carbamate kinases and several acetylglutamate kinases, the latter group forms a clear outgroup which roots the tree of carbamate kinase-like proteins. This analysis suggests that in E. coli, the ArcC paralog YqeA may be a second isozyme, while the paralog YahI branches as an outlier and is less likely to be an authentic carbamate kinase. The homolog from Mycoplasma pneumoniae likewise branches outside the set containing known carbamate kinases and also scores below the trusted cutoff. [Energy metabolism, Amino acids and amines] 310
17244 273249 TIGR00747 fabH 3-oxoacyl-(acyl-carrier-protein) synthase III. FabH in general initiates elongation in type II fatty acid synthase systems found in bacteria and plants. The two members of this subfamily from Bacillus subtilis differ from each other, and from FabH from E. coli, in acyl group specificity. Active site residues include Cys112, His244 and Asn274 of E. coli FabH. Cys-112 is the site of acyl group attachment. [Fatty acid and phospholipid metabolism, Biosynthesis] 318
17245 273250 TIGR00748 HMG_CoA_syn_Arc hydroxymethylglutaryl-CoA synthase, putative. This family of archaeal proteins shows considerable homology and identical active site residues to the bacterial hydroxymethylglutaryl-CoA synthase (HMG-CoA synthase, modeled by TIGR01835) which is the second step in the mevalonate pathway of IPP biosynthesis. An enzyme from Pseudomonas fluorescens involved in the biosynthesis of the polyketide diacetyl-phloroglucinol is more closely related, but lacks the active site residues. In each of the genomes containing a member of this family there is no other recognized HMG-CoA synthase, although other elements of the mevalonate pathway are in evidence. The only archaeon currently sequenced which lacks a homolog in this pathway is Halobacterium, which _does_ contain a separate HMG-CoA synthase. Thus, although there is no experimental evidence supporting this name, the bioinformatics-based conclusion appears to be sound. [Fatty acid and phospholipid metabolism, Biosynthesis] 347
17246 129832 TIGR00749 glk glucokinase, proteobacterial type. This model represents glucokinase of E. coli and close homologs, mostly from other proteobacteria, presumed to have equivalent function. This glucokinase is more closely related to a number of uncharacterized paralogs than to the glucokinase glcK (formerly yqgR) of Bacillus subtilis and its closest homologs, so the two sets are represented by separate models. [Energy metabolism, Glycolysis/gluconeogenesis] 316
17247 129833 TIGR00750 lao LAO/AO transport system ATPase. In E. coli, mutation of this kinase blocks phosphorylation of two transporter system periplasmic binding proteins and consequently inhibits those transporters. This kinase is also found in Gram-positive bacteria, archaea, and the roundworm C. elegans. It may have a more general, but still unknown function. Mutations have also been found that do not phosphorylate the periplasmic binding proteins, yet still allow transport. The ATPase activity of this protein seems to be necessary, however. [Transport and binding proteins, Amino acids, peptides and amines, Regulatory functions, Protein interactions] 300
17248 129834 TIGR00751 menA 1,4-dihydroxy-2-naphthoate octaprenyltransferase. This membrane-associated enzyme converts 1,4-dihydroxy-2-naphthoic acid (DHNA) to demethylmenaquinone, a step in menaquinone biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 284
17249 273251 TIGR00752 slp outer membrane lipoprotein, Slp family. Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli (which also contains a close paralog), Haemophilus influenzae, Pasteurella multocida, and Vibrio cholerae. The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from Escherichia coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation. [Cell envelope, Other] 182
17250 129836 TIGR00753 undec_PP_bacA undecaprenyl-diphosphatase UppP. This is a family of small, highly hydrophobic proteins. Overexpression of this protein in Escherichia coli is associated with bacitracin resistance, and the protein was originally proposed to be an undecaprenol kinase and called bacA. It is now known to be an undecaprenyl pyrophosphate phosphatase (EC 3.6.1.27) and is renamed UppP. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 255
17251 162022 TIGR00754 bfr bacterioferritin. Bacterioferritin, predominantly an iron-storage protein restricted to Bacteria, has also been designated cytochrome b1 and cytochrome b-557. Bacterioferritin is a homomultimer in most species. In Neisseria gonorrhoeae, Synechocystis PCC6803, Magnetospirillum magnetotacticum, and Pseudomonas aeruginosa, two types of subunit are found in a heteromultimeric complex, with each species having one member of each type. At present, both types of subunit are included in this single model. [Transport and binding proteins, Cations and iron carrying compounds] 157
17252 273252 TIGR00755 ksgA ribosomal RNA small subunit methyltransferase A. In both E. coli and Saccharomyces cerevisiae, this protein is responsible for the dimethylation of two adjacent adenosine residues in a conserved hairpin of 16S rRNA in bacteria, 18S rRNA in eukaryotes. This adjacent dimethylation is the only rRNA modification shared by bacteria and eukaryotes. A single member of this family is present in each of the first 20 completed microbial genomes. This protein is essential in yeast, but not in E. coli, where its deletion leads to resistance to the antibiotic kasugamycin. Alternate name: S-adenosylmethionine--6-N',N'-adenosyl (rRNA) dimethyltransferase [Protein synthesis, tRNA and rRNA base modification] 254
17253 273253 TIGR00756 PPR pentatricopeptide repeat domain (PPR motif). This model describes a domain called the PPR motif, or pentatricopeptide repeat. Its consensus sequence is 35 positions long and typically is found in four or more tandem copies. This family is strongly represented in plant proteins, particularly those sorted to chloroplasts or mitochondria. The pfam01535, domain of unknown function DUF17, consists of 6 copies of this repeat. This family has a similar consensus to the TPR domain (tetratricopeptide), pfam00515, a 33-residue repeat. It is predicted to form a pair of antiparallel helices similar to that of TPR. 35
17254 273254 TIGR00757 RNaseEG ribonuclease, Rne/Rng family. This model describes ribonuclease G (formerly CafA, cytoplasmic axial filament protein A), the N-terminal domain of ribonuclease E in which ribonuclease activity resides, and related proteins. In E. coli, both RNase E and RNase G have been shown to play a role in the maturation of the 5' end of 16S RNA. The C-terminal half of RNase E (excluded from the seed alignment for this model) lacks ribonuclease activity but participates in mRNA degradation by organizing the degradosome. [Transcription, Degradation of RNA] 414
17255 129841 TIGR00758 UDG_fam4 uracil-DNA glycosylase, family 4. This well-conserved family of proteins is about 200 residues in length and homologous to the N-terminus of the DNA polymerase of phage SPO1 of Bacillus subtilis. The member from Thermus thermophilus HB8 is known to act as uracil-DNA glycosylase, an enzyme of DNA base excision repair. Its appearance as a domain of phage DNA polymerases could be consistent with uracil-DNA glycosylase activity. [DNA metabolism, DNA replication, recombination, and repair] 173
17256 273255 TIGR00759 aceE pyruvate dehydrogenase E1 component, homodimeric type. Most members of this family are pyruvate dehydrogenase complex, E1 component. Note: this family was classified as subfamily rather than equivalog because it includes a counterexample from Pseudomonas putida, MdeB, that is active as an E1 component of an alpha-ketoglutarate dehydrogenase complex rather than a pyruvate dehydrogenase complex. The second pyruvate dehydrogenase complex E1 protein from Alcaligenes eutrophus, PdhE, complements an aceE mutant of E. coli but is not part of a pyruvate dehydrogenase complex operon, is more similar to the Pseudomonas putida MdeB than to E. coli AceE, and may also have a different primary specificity. 885
17257 129843 TIGR00760 araD L-ribulose-5-phosphate 4-epimerase. E. coli has two genes, sgaE and sgbE (YiaS), that are very close homologs of araD, the established L-ribulose-5-phosphate 4-epimerase of E. coli. SgbE, part of an operon for L-xylulose metabolism, also has L-ribulose-5-phosphate 4-epimerase activity; L-xylulose-5-phosphate may be converted into L-ribulose-5-phosphate by another product of that operon. The homolog to this family from Mycobacterium smegmatis is flanked by putative araB and araA genes, consistent with it also being araD. [Energy metabolism, Sugars] 231
17258 273256 TIGR00761 argB acetylglutamate kinase. This model describes N-acetylglutamate kinases (ArgB) of many prokaryotes and the N-acetylglutamate kinase domains of multifunctional proteins from yeasts. This enzyme is the second step in the "acetylated" ornithine biosynthesis pathway. A related group of enzymes representing the first step of the pathway contain a homologous domain and are excluded from this model. [Amino acid biosynthesis, Glutamate family] 231
17259 273257 TIGR00762 DegV EDD domain protein, DegV family. This family of proteins is related to DegV of Bacillus subtilis and includes paralogous sets in several species (B. subtilis, Deinococcus radiodurans, Mycoplasma pneumoniae) that are closer in percent identity to each other than to most homologs from other species. This suggests both recent paralogy and diversity of function. DegV itself is encoded immediately downstream of DegU, a transcriptional regulator of degradation, but is itself uncharacterized. Crystallography suggested a lipid-binding site, while comparison of the crystal structure to dihydroxyacetone kinase and to a mannose transporter EIIA domain suggests a conserved domain, EDD, with phosphotransferase activity. [Unknown function, General] 275
17260 273258 TIGR00763 lon endopeptidase La. This protein, the ATP-dependent serine endopeptidase La, is induced by heat shock and other stresses in E. coli, B. subtilis, and other species. The yeast member, designated PIM1, is located in the mitochondrial matrix, required for mitochondrial function, and also induced by heat shock. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 775
17261 273259 TIGR00764 lon_rel lon-related putative ATP-dependent protease. This model represents a set of proteins with extensive C-terminal homology to the ATP-dependent protease La, product of the lon gene of E. coli. The model is based on a seed alignment containing only archaeal members, but several bacterial proteins match the model well. Because several species, including Thermotoga maritima and Treponema pallidum, contain both a close homolog of the lon protease and nearly full-length homolog of the members of this family, we suggest there may also be a functional division between the two families. Members of this family from Pyrococcus horikoshii and Pyrococcus abyssi each contain a predicted intein. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 608
17262 273260 TIGR00765 yihY_not_rbn YihY family inner membrane protein. Initial identification of members of this protein family was based on characterization of the yihY gene product as ribonuclease BN in Escherichia coli. This identification has been withdrawn, as the group now finds the homolog in E. coli of RNase Z is the true ribonuclease BN rather than a strict functional equivalent of RNase Z. Members of this subfamily include the largely uncharacterized BrkB (Bordetella resist killing by serum B) from Bordetella pertussis. Some members have an additional C-terminal domain. Paralogs from E. coli (yhjD) and Mycobacterium tuberculosis (Rv3335c) are part of a smaller, related subfamily that form their own cluster. [Unknown function, General] 259
17263 188082 TIGR00766 TIGR00766 inner membrane protein YhjD. This family, including YhjD in E. coli, is a conserved inner membrane protein homologous to YihY, which in turn was incorrectly assigned to be ribonuclease BN. Thus, any suggestion this family is similar to ribonucleases should be removed. [Transcription, Degradation of RNA] 263
17264 162030 TIGR00767 rho transcription termination factor Rho. This RNA helicase, the transcription termination factor Rho, occurs in nearly all bacteria but is missing from the Cyanobacteria, the Mollicutes (Mycoplasmas), and various Lactobacillales including Streptococcus. It is also missing, of course, from the Archaea, which also lack Nus factors. Members of this family from Micrococcus luteus, Mycobacterium tuberculosis, and related species have a related but highly variable long, highly charged insert near the amino end. Members of this family differ in the specificity of RNA binding. [Transcription, Transcription factors] 415
17265 273261 TIGR00768 rimK_fam alpha-L-glutamate ligase, RimK family. This family, related to bacterial glutathione synthetases, contains at least three different alpha-L-glutamate ligases. One is RimK, as in E. coli, which adds additional Glu residues to the native Glu-Glu C-terminus of ribosomal protein S6, but not to Lys-Glu mutants. Most species with a member of this subfamily lack an S6 homolog ending in Glu-Glu, however. Members in Methanococcus jannaschii act instead as a tetrahydromethanopterin:alpha-l-glutamate ligase (MJ0620) and a gamma-F420-2:alpha-l-glutamate ligase (MJ1001). 276
17266 273262 TIGR00769 AAA ADP/ATP carrier protein family. These proteins are members of the ATP:ADP Antiporter (AAA) Family (TC 2.A.12), which consists of nucleotide transporters that have 12 GES predicted transmembrane regions. One protein from Rickettsia prowazekii functions to take up ATP from the eukaryotic cell cytoplasm into the bacterium in exchange for ADP. Five AAA family paralogues are encoded within the genome of R. prowazekii. This organism transports UMP and GMP but not CMP, and it seems likely that one or more of the AAA family paralogues are responsible. The genome of Chlamydia trachomatis encodes two AAA family members, Npt1 and Npt2, which catalyse ATP/ADP exchange and GTP, CTP, ATP and UTP uptake probably employing a proton symport mechanism. Two homologous adenylate translocators of Arabidopsis thaliana are postulated to be localized to the intracellular plastid membrane where they function as ATP importers. [Transport and binding proteins, Nucleosides, purines and pyrimidines] 472
17267 273263 TIGR00770 Dcu anaerobic c4-dicarboxylate membrane transporter family protein. These proteins are members of the C4-Dicarboxylate Uptake (Dcu) Family (TC 2.A.13). Most proteins in this family have 12 GES predicted transmembrane regions; however one member has 10 experimentally determined transmembrane regions with both the N- and C-termini localized to the periplasm. The two Escherichia coli proteins, DcuA and DcuB, transport aspartate, malate, fumarate and succinate, and function as antiporters with any two of these substrates. Since DcuA is encoded in an operon with the gene for aspartase, and DcuB is encoded in an operon with the gene for fumarase, their physiological functions may be to catalyze aspartate:fumarate and fumarate:malate exchange during the anaerobic utilization of aspartate and fumarate, respectively. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 430
17268 129854 TIGR00771 DcuC c4-dicarboxylate anaerobic carrier family protein. These proteins are members of the C4-dicarboxylate Uptake C (DcuC) Family (TC 2.A.61). The only functionally characterized member of this family is the anaerobic C4-dicarboxylate transporter (DcuC) of Escherichia coli. DcuC has 12 GES predicted transmembrane regions, is induced only under anaerobic conditions, and is not repressed by glucose. It may therefore function as a succinate efflux system during anaerobic glucose fermentation. However, when overexpressed, it can replace either DcuA or DcuB in catalyzing fumarate-succinate exchange and fumarate uptake. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 388
17269 273264 TIGR00773 NhaA Na+/H+ antiporter NhaA. These proteins are members of the NhaA Na+:H+ Antiporter (NhaA) Family (TC 2.A.33). The Escherichia coli NhaA protein probably functions in the regulation of the internal pH when the external pH is alkaline. It also uses the H+ gradient to expel Na+ from the cell. Its activity is highly pH dependent. Only the E. coli protein is functionally and structurally well characterized. [Transport and binding proteins, Cations and iron carrying compounds] 373
17270 129856 TIGR00774 NhaB Na+/H+ antiporter NhaB. These proteins are members of the NhaB Na+:H+ Antiporter (NhaB) Family (TC 2.A.34). The only characterised member of this family is the Escherichia coli NhaB protein, which has 12 GES predicted transmembrane regions, and catalyses sodium/proton exchange. Unlike NhaA this activity is not pH dependent. [Transport and binding proteins, Cations and iron carrying compounds] 515
17271 129857 TIGR00775 NhaD Na+/H+ antiporter, NhaD family. These proteins are members of the NhaD Na+:H+ Antiporter (NhaD) Family (TC 2.A.62). A single member of the NhaD family has been characterized. This protein is the NhaD protein of Vibrio parahaemolyticus which has 12 GES predicted transmembrane regions. It has been shown to catalyze Na+/H+ antiport, but Li+ can also be a substrate. [Transport and binding proteins, Cations and iron carrying compounds] 420
17272 273265 TIGR00776 RhaT RhaT L-rhamnose-proton symporter family protein. These proteins are members of the L-Rhamnose Symporter (RhaT) Family (TC 2.A.7). This family includes two characterized members, both of which function as L-rhamnose:H+ symporters and have 10 GES predicted transmembrane domains. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 290
17273 129859 TIGR00777 ahpD alkylhydroperoxidase, AhpD family. Members of this family are alkylhydroperoxidases, which catalyze the reduction of peroxides to their corresponding alcohols via oxidation of cysteine residues. In these alkylhydroperoxidases, the cysteines are located in a conserved -CXXC- motif located towards the COOH terminus. In Mycobacterium tuberculosis, two non-homologous alkylhydroperoxidases, AhpD and AhpC, are found in the same operon. [Cellular processes, Detoxification] 177
17274 273266 TIGR00778 ahpD_dom alkylhydroperoxidase AhpD family core domain. This model represents a 51-residue core region of homology among a family of mostly uncharacterized proteins of 110 to 227 amino acids. Most members of this family contain the motif EXXXXXX[SA]XXXXXC[VIL]XCXXXH. Members of the family include the alkylhydroperoxidase AhpD of Mycobacterium tuberculosis, a macrophage infectivity potentiator peptide of Legionella pneumophila, and an uncharacterized peptide in the tetrachloroethene reductive dehalogenase operon of Dehalospirillum multivorans. We suggest that many peptides containing this domain may have alkylhydroperoxidase or related antioxidant activity. [Unknown function, General] 50
17275 129861 TIGR00779 cad cadmium resistance transporter (or sequestration) family protein. These proteins are members of the Cadmium Resistance (CadD) Family (TC 2.A.77). To date, this family of proteins has only been found in Gram-positive bacteria. The CadD family includes several closely related Staphylococcal proteins reported to function in cadmium resistance. Members are predicted to span the membrane five times; the mechanism of resistance is believed to be export but has also been suggested to be binding and sequestration in the membrane. Closely related but outside the scope of this model is another staphylococcal protein that has been reported to possibly function in quaternary ammonium ion export. Still more distant are other members of the broader LysE family (see Vrljic. et al, ). [Transport and binding proteins, Amino acids, peptides and amines] 193
17276 129862 TIGR00780 ccoN cytochrome c oxidase, cbb3-type, subunit I. This model represents the largest subunit, I, of the cbb3-type cytochrome c oxidase, with two protohemes and copper. It shows strong homology to subunits of other types of cytochrome oxidases. Species with this type, all from the Proteobacteria so far, include Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Rhodobacter sphaeroides, Rhizobium leguminosarum, and others. Gene symbols ccoN and fixN are synonymous. [Energy metabolism, Electron transport] 474
17277 129863 TIGR00781 ccoO cytochrome c oxidase, cbb3-type, subunit II. This model describes the monoheme subunit of the cbb3-type cytochrome oxidase, found in a subset of Proteobacterial species. Species having this protein also have CcoN (subunit I, containing copper and two heme groups), CcoP (subunit III, containing two hemes), and CcoQ (essential for incorporation of the prosthetic groups). [Energy metabolism, Electron transport] 232
17278 129864 TIGR00782 ccoP cytochrome c oxidase, cbb3-type, subunit III. This model describes a di-heme subunit of approximately 26 kDa of the cbb3 type copper and heme-containing cytochrome oxidase. [Energy metabolism, Electron transport] 285
17279 129865 TIGR00783 ccs citrate carrier protein, CCS family. These proteins are members of the Citrate:Cation Symporter (CCS) Family (TC 2.A.24). These proteins have 12 GES predicted transmembrane regions. Most members of the CCS family catalyze citrate uptake with either Na+ or H+ as the cotransported cation. However, one member is specific for L-malate and probably functions by a proton symport mechanism. [Unclassified, Role category not yet assigned] 347
17280 162036 TIGR00784 citMHS citrate transporter, CitMHS family. This family includes two characterized citrate/proton symporters from Bacillus subtilis. CitM transports citrate complexed to Mg2+, while the CitH apparently transports citrate without Mg2+. The family also includes uncharacterized transporters, including a third paralog in Bacillus subtilis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 431
17281 273267 TIGR00785 dass anion transporter. The Divalent Anion:Na+ Symporter (DASS) Family (TC 2.A.47) Functionally characterized proteins of the DASS family transport (1) organic di- and tricarboxylates of the Krebs Cycle as well as dicarboxylate amino acid, (2) inorganic sulfate and (3) phosphate. The animal NaDC-1 cotransports 3 Na+ with each dicarboxylate. Protonated tricarboxylates are also cotransported with 3 Na+. [Transport and binding proteins, Anions, Transport and binding proteins, Cations and iron carrying compounds] 444
17282 129868 TIGR00786 dctM TRAP transporter, DctM subunit. The Tripartite ATP-independent Periplasmic Transporter (TRAP-T) Family (TC 2.A.56)- DctM subunit TRAP-T family permeases generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. Only one member of the family has been both sequenced and functionally characterized. This system is the DctPQM system of Rhodobacter capsulatus (Forward et al., 1997). DctP is a periplasmic dicarboxylate (malate, fumarate, succinate) binding receptor that is biochemically well-characterized. DctQ is an integral cytoplasmic membrane protein with 4 putative transmembrane a-helical spanners (TMSs). DctM is a second integral cytoplasmic membrane protein with 12 putative TMSs. These proteins have been shown to be both necessary and sufficient for the proton motive force-dependent uptake of dicarboxylates into R. capsulatus. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 405
17283 129869 TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptor for TRAP family transporters. Others are the SnoM family (see pfam03480) and TAXI (TRAP-associated extracytoplasmic immunogenic) family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 257
17284 273268 TIGR00788 fbt folate/biopterin transporter. The Folate-Biopterin Transporter (FBT) Family (TC 2.A.71) The only functionally characterized members of the family are from protozoa and include FT1, the major folate transporter in Leishmania, and BT1, the Leishmania biopterin/folate transporter. A related protein in Trypanosoma brucei, ESAGIO, shows weak folate/biopterin transport activity. [Cell envelope, Other] 468
17285 129871 TIGR00789 flhB_rel flhB C-terminus-related protein. This model describes a short protein (80-93 residues) homologous to the C-terminus of the flagellar biosynthetic protein FlhB. It is found so far only in species that also have FlhB. In a phylogenetic tree based on alignment of both this family and the homologous region of FlhB and its homologs, the members of this family form a monophyletic set. [Unknown function, General] 82
17286 273269 TIGR00790 fnt formate/nitrite transporter. The Formate-Nitrite Transporter (FNT) Family (TC 2.A.44) The prokaryotic proteins of the FNT family probably function in the transport of the structurally related compounds, formate and nitrite. The homologous yeast protein may function as a short chain aliphatic carboxylate H+ symporter, transporting formate, acetate and propionate, and functioning primarily as an acetate uptake permease. The putative formate efflux transporters (FocA) of bacteria associated with pyruvate-formate lyase (pfl) comprise cluster I; the putative formate uptake permeases (FdhC) of bacteria and archaea associated with formate dehydrogenase comprise cluster II; the putative nitrite uptake permeases (NirC) of bacteria comprise cluster III, and the single yeast protein, the putative acetate:H+ symporter alone comprises cluster IV. The energy coupling mechanisms for proteins of the FNT family have not been extensively characterized. HCO2-, CH3CO2- and NO2- uptakes are probably coupled to H+ symport. HCO2- efflux may be driven by the membrane potential by a uniport mechanism or by H+ antiport. [Transport and binding proteins, Anions] 239
17287 129873 TIGR00791 gntP gluconate transporter. This family includes known gluconate transporters of E. coli and Bacillus species as well as an idonate transporter from E. coli. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 440
17288 273270 TIGR00792 gph sugar (Glycoside-Pentoside-Hexuronide) transporter. The Glycoside-Pentoside-Hexuronide (GPH):Cation Symporter Family (TC 2.A.2) GPH:cation symporters catalyze uptake of sugars in symport with a monovalent cation (H+ or Na+). Members of this family include transporters for melibiose, lactose, raffinose, glucuronides, pentosides and isoprimeverose. Mutants of two groups of these symporters (the melibiose permeases of enteric bacteria, and the lactose permease of Streptococcus thermophilus) have been isolated in which altered cation specificity is observed or in which sugar transport is uncoupled from cation symport (i.e., uniport is catalyzed). The various members of the family can use Na+, H+ or Li+, Na+ or Li+, H+ or Li+, or only H+ as the symported cation. All of these proteins possess twelve putative transmembrane a-helical spanners. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 437
17289 273271 TIGR00793 kdgT 2-keto-3-deoxygluconate transporter. This family includes the characterized 2-Keto-3-Deoxygluconate transporters from Bacillus subtilis and Erwinia chrysanthemi. There are homologs of this protein found in both gram-positive and gram-negative bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 314
17290 129876 TIGR00794 kup potassium uptake protein. Proteins of the KUP family include the KUP (TrkD) protein of E. coli, a partially sequenced ORF from Lactococcus lactis, high affinity K+ uptake systems (Hak1) of the yeast Debaryomyces occidentalis as well as the fungus, Neurospora crassa, and several homologues in plants. While the E. coli KUP protein is assumed to be a secondary transporter, and uptake is blocked by protonophores such as CCCP (but not arsenate), the energy coupling mechanism has not been defined. However, the N. crassa protein has been shown to be a K+:H+ symporter, establishing that the KUP family consists of secondary carriers. The plant high affinity (20mM) K+ transporter can complement K+ uptake defects in E. coli. [Transport and binding proteins, Cations and iron carrying compounds] 688
17291 162041 TIGR00795 lctP L-lactate transport. The Lactate Permease (LctP) Family (TC 2.A.14) The only characterized member of this family, from E. coli, appears to catalyze lactate:H+ uptake. Members of this family have 12 probable TMS. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 530
17292 273272 TIGR00796 livcs branched-chain amino acid uptake carrier. The Branched Chain Amino Acid:Cation Symporter (LIVCS) Family (TC 2.A.26) Characterized members of this family transport all three of the branched chain aliphatic amino acids (leucine (L), isoleucine (I) and valine (V)). They function by a Na+ or H+ symport mechanism and display 12 putative transmembrane helical spanners. [Transport and binding proteins, Amino acids, peptides and amines] 378
17293 273273 TIGR00797 matE putative efflux protein, MATE family. The Multi Antimicrobial Extrusion (MATE) Family (TC 2.A.66) The MATE family consists of probable efflux proteins including a functionally characterized multi drug efflux system from Vibrio parahaemolyticus, a putative ethionine resistance protein of Saccharomyces cerevisiae, and the functionally uncharacterized DNA damage-inducible protein F (DinF) of E. coli. These proteins have 12 probable TMS. [Transport and binding proteins, Other] 342
17294 129880 TIGR00798 mtc tricarboxylate carrier. The MTC family consists of a limited number of homologues, all from eukaryotes. A single member of the family has been functionally characterized, the tricarboxylate carrier from rat liver mitochondria. The rat liver mitochondrial tricarboxylate carrier has been reported to transport citrate, cis-aconitate, threo-D-isocitrate, D- and L-tartrate, malate, succinate and phosphoenolpyruvate. It presumably functions by a proton symport mechanism. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 318
17295 273274 TIGR00799 mtp Golgi 4-transmembrane spanning transporter. The proteins of the MET family have 4 TMS regions and are located in late endosomal or lysosomal membranes. Substrates of the mouse MTP transporter include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores and steroid hormones. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other compounds. Drug sensitivity by mouse MET was regulated by compounds that inhibit lysosomal function, interface with intracellular cholesterol transport, or modulate the multidrug resistance phenotype of mammalian cells. Thus, MET family members may compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and steroid hormone responses. [Transport and binding proteins, Unknown substrate] 258
17296 273275 TIGR00800 ncs1 NCS1 nucleoside transporter family. The Nucleobase:Cation Symporter-1 (NCS1) Family (TC 2.A.39) The NCS1 family consists of bacterial and yeast transporters for nucleobases including purines and pyrimidines. Members of this family possess twelve putative transmembrane a-helical spanners (TMSs). At least some of them have been shown to function in uptake by substrate:H+ symport mechanism. [Transport and binding proteins, Nucleosides, purines and pyrimidines] 442
17297 273276 TIGR00801 ncs2 uracil-xanthine permease. The Nucleobase:Cation Symporter-2 (NCS2) Family (TC 2.A.40) Most of the functionally characterized members of the NCS2 family are transporters specific for nucleobases including both purines and pyrimidines. However, two closely related rat members of the family, SVCT1 and SVCT2, localized to different tissues of the body, cotransport L-ascorbate and Na+ with a high degree of specificity and high affinity for the vitamin. The NCS2 family appears to be distantly related to the NCS1 family (TC #2.A.39). [Transport and binding proteins, Nucleosides, purines and pyrimidines] 412
17298 273277 TIGR00802 nico high-affinity nickel-transporter, HoxN/HupN/NixA family. This family is found in both Gram-negative and Gram-positive bacteria. The functionally characterized members of the family catalyze uptake of either Ni2+ or Co2+ in a proton motive force-dependent process. Topological analyses with the HoxN Ni2+ transporter of Ralstonia eutropha (Alcaligenes eutrophus) suggest that it possesses 8 TMSs with its N- and C-termini in the cytoplasm. [Transport and binding proteins, Cations and iron carrying compounds] 280
17299 129885 TIGR00803 nst UDP-galactose transporter. The 10-12 TMS Nucleotide Sugar Transporters (TC 2.A.7.10) Nucleotide-sugar transporters (NSTs) are found in the Golgi apparatus and the endoplasmic reticulum of eukaryotic cells. Members of the family have been sequenced from yeast, protozoans and animals. Animals such as C. elegans possess many of these transporters. Humans have at least two closely related isoforms of the UDP-galactose:UMP exchange transporter. NSTs generally appear to function by antiport mechanisms, exchanging a nucleotide-sugar for a nucleotide. Thus, CMP-sialic acid is exchanged for CMP; GDP-mannose is preferentially exchanged for GMP, and UDP-galactose and UDP-N-acetylglucosamine are exchanged for UMP (or possibly UDP). Other nucleotide sugars (e.g., GDP-fucose, UDP-xylose, UDP-glucose, UDP-N-acetylgalactosamine, etc.) may also be transported in exchange for various nucleotides, but their transporters have not been molecularly characterized. Each compound appears to be translocated by its own transport protein. Transport allows the compound, synthesized in the cytoplasm, to be exported to the lumen of the Golgi apparatus or the endoplasmic reticulum where it is used for the synthesis of glycoproteins and glycolipids. 222
17300 273278 TIGR00804 nupC nucleoside transporter. The Concentrative Nucleoside Transporter (CNT) Family (TC 2.A.41) Members of the CNT family mediate nucleoside uptake. In bacteria they are energized by H+ symport, but in mammals they are energized by Na+ symport. The different transporters exhibit differing specificities for nucleosides. The E. coli NupC permease transports all nucleosides (both ribo- and deoxyribonucleosides) except hypoxanthine and guanine nucleosides. The B. subtilis NupC is specific for pyrimidine nucleosides (cytidine and uridine and the corresponding deoxyribonucleosides). The mammalian permease members of the CNT family also exhibit differing specificities. Thus, rats possess at least two NupC homologues, one specific for both purine and pyrimidine nucleosides and one specific for purine nucleosides. At least three paralogues have been characterized from humans. One human homologue (CNT1) transports pyrimidine nucleosides and adenosine, but deoxyadenosine and guanosine are poor substrates of this permease. Another (CNT2) is selective for purine nucleosides. Alteration of just a few amino acyl residues in TMSs 7 and 8 interconverts their specificities. [Transport and binding proteins, Nucleosides, purines and pyrimidines] 401
17301 273279 TIGR00805 oat sodium-independent organic anion transporter. The Organo Anion Transporter (OAT) Family (TC 2.A.60) Proteins of the OAT family catalyze the Na+-independent facilitated transport of organic anions such as bromosulfobromophthalein and prostaglandins as well as conjugated and unconjugated bile acids (taurocholate and cholate, respectively). These transporters have been characterized in mammals, but homologues are present in C. elegans and A. thaliana. Some of the mammalian proteins exhibit a high degree of tissue specificity. For example, the rat OAT is found at high levels in liver and kidney and at lower levels in other tissues. These proteins possess 10-12 putative a-helical transmembrane spanners. They may catalyze electrogenic anion uniport or anion exchange. 632
17302 129888 TIGR00806 rfc RFC reduced folate carrier. The Reduced Folate Carrier (RFC) Family (TC 2.A.48) Members of the RFC family mediate the uptake of folate, reduced folate, derivatives of reduced folate and the drug, methotrexate. Proteins of the RFC family are so far restricted to animals. RFC proteins possess 12 putative transmembrane a-helical spanners (TMSs) and evidence for a 12 TMS topology has been published for the human RFC. The RFC transporters appear to transport reduced folate by an energy-dependent, pH-dependent, Na+-independent mechanism. Folate:H+ symport, folate:OH- antiport and folate:anion antiport mechanisms have been proposed, but the energetic mechanism is not well defined. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 511
17303 129889 TIGR00807 malonate_madL malonate transporter, MadL subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 125
17304 129890 TIGR00808 malonate_madM malonate transporter, MadM subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 254
17305 213561 TIGR00809 secB protein-export chaperone SecB. This protein acts as an export-specific cytosolic chaperone. It binds the mature region of pre-proteins destined for secretion, prevents aggregation, and delivers them to SecA. This protein is tetrameric in E. coli. The archaeal Methanococcus jannaschii homolog MJ0357 has been shown to share many properties, including chaperone-like activity, and scores between trusted and noise. [Protein fate, Protein and peptide secretion and trafficking] 140
17306 273280 TIGR00810 secG protein translocase, SecG subunit. This family of proteins forms a complex with SecY and SecE. SecA then recruits the SecYEG complex to form an active protein translocation channel. [Protein fate, Protein and peptide secretion and trafficking] 73
17307 273281 TIGR00811 sit silicon transporter. Marine diatoms such as Cylindrotheca fusiformis encode at least six silicon transport protein homologues which exhibit similar size and topology. One characterized member of the family (Sit1) functions in the energy-dependent uptake of either Silicic acid [Si(OH)4] or Silicate [Si(OH)3O-] by a Na+ symport mechanism. The system is found in marine diatoms which make their "glass houses" out of silicon. [Transport and binding proteins, Other] 545
17308 273282 TIGR00813 sss transporter, SSS family. The Solute:Sodium Symporter (SSS) Family (TC 2.A.21) Members of the SSS family catalyze solute:Na+ symport. The solutes transported may be sugars, amino acids, nucleosides, inositols, vitamins, urea or anions, depending on the system. Members of the SSS family have been identified in bacteria, archaea and animals, and all functionally well characterized members catalyze solute uptake via Na+ symport. Proteins of the SSS generally share a core of 13 TMSs, but different members of the family may have different numbers of TMSs. A 13 TMS topology with a periplasmic N-terminus and a cytoplasmic C-terminus has been experimentally determined for the proline:Na+ symporter, PutP, of E. coli. [Transport and binding proteins, Cations and iron carrying compounds] 407
17309 273283 TIGR00814 stp serine transporter. The Hydroxy/Aromatic Amino Acid Permease (HAAAP) Family- serine/threonine subfamily (TC 2.A.42.2) The HAAAP family includes well characterized aromatic amino acid:H+ symport permeases and hydroxy amino acid permeases. This subfamily is specific for hydroxy amino acid transporters and includes the serine permease, SdaC, of E. coli, and the threonine permease, TdcC, of E. coli. //added GO terms, none available for ser/thr specifically [SS 2/6/05] [Transport and binding proteins, Amino acids, peptides and amines] 397
17310 273284 TIGR00815 sulP high affinity sulphate transporter 1. The SulP family is a large and ubiquitous family with over 30 sequenced members derived from bacteria, fungi, plants and animals. Many organisms including Bacillus subtilis, Synechocystis sp, Saccharomyces cerevisiae, Arabidopsis thaliana and Caenorhabditis elegans possess multiple SulP family paralogues. Many of these proteins are functionally characterized, and all are sulfate uptake transporters. Some transport their substrate with high affinities, while others transport it with relatively low affinities. Most function by SO42-:H+ symport, but SO42-:HCO3- antiport has been reported for the rat protein (spP45380). The bacterial proteins vary in size from 434 residues to 566 residues with one exception, a Mycobacterium tuberculosis protein with 784 residues. The eukaryotic proteins vary in size from 611 residues to 893 residues with one exception, a protein designated "early nodulin 70 protein" from Glycine max which is reported to be of 485 residues. Thus, the eukaryotic proteins are almost without exception larger than the prokaryotic proteins. These proteins exhibit 10-13 putative transmembrane a-helical spanners (TMSs) depending on the protein. The phylogenetic tree for the SulP family reveals five principal branches. Three of these are bacterial specific as follows: one bears a single protein from M. tuberculosis; a second bears two proteins, one from M. tuberculosis, the other from Synechocystis sp, and the third bears all remaining prokaryotic proteins. The remaining two clusters bear only eukaryotic proteins with the animal proteins all localized to one branch and the plant and fungal proteins localized to the other. The generalized transport reactions catalyzed by SulP family proteins are: (1) SO42- (out) + nH+ (out) --> SO42- (in) + nH+ (in). (2) SO42- (out) + nHCO3- (in) --> SO42- (in) + nHCO3- (out). [Transport and binding proteins, Anions] 552
17311 273285 TIGR00816 tdt C4-dicarboxylate transporter/malic acid transport protein. The Tellurite-Resistance/Dicarboxylate Transporter (TDT) Family (TC 2.A.16) Two members of the TDT family have been functionally characterized. One is the TehA protein of E. coli which has been implicated in resistance to tellurite; the other is the Mae1 protein of S. pombe which functions in the uptake of malate and other dicarboxylates by a proton symport mechanism. These proteins exhibit 10 putative transmembrane a-helical spanners (TMSs). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 320
17312 129898 TIGR00817 tpt Tpt phosphate/phosphoenolpyruvate translocator. The 6-8 TMS Triose-phosphate Transporter (TPT) Family (TC 2.A.7.9) Functionally characterized members of the TPT family are derived from the inner envelope membranes of chloroplasts and nongreen plastids of plants. However, homologues are also present in yeast. Saccharomyces cerevisiae has three functionally uncharacterized TPT paralogues encoded within its genome. Under normal physiological conditions, chloroplast TPTs mediate a strict antiport of substrates, frequently exchanging an organic three carbon compound phosphate ester for inorganic phosphate (Pi). Normally, a triose-phosphate, 3-phosphoglycerate, or another phosphorylated C3 compound made in the chloroplast during photosynthesis, exits the organelle into the cytoplasm of the plant cell in exchange for Pi. However, experiments with reconstituted translocator in artificial membranes indicate that transport can also occur by a channel-like uniport mechanism with up to 10-fold higher transport rates. Channel opening may be induced by a membrane potential of large magnitude and/or by high substrate concentrations. Nongreen plastid and chloroplast carriers, such as those from maize endosperm and root membranes, mediate transport of C3 compounds phosphorylated at carbon atom 2, particularly phosphoenolpyruvate, in exchange for Pi. These are the phosphoenolpyruvate:Pi antiporters (PPT). Glucose-6-P has also been shown to be a substrate of some plastid translocators (GPT). The three types of proteins (TPT, PPT and GPT) are divergent in sequence as well as substrate specificity, but their substrate specificities overlap. [Hypothetical proteins, Conserved] 302
17313 273286 TIGR00819 ydaH p-Aminobenzoyl-glutamate transporter family. The p-Aminobenzoyl-glutamate transporter family includes two transporters, the AbgT (YdaH) protein of E. coli and MtrF of Neisseria gonorrhoeae. AbgT is apparently cryptic in wild type cells, but when expressed on a high copy number plasmid, or when expressed at higher levels due to mutation, it allows utilization of p-aminobenzoyl-glutamate as a source of p-aminobenzoate for p-aminobenzoate auxotrophs. p-Aminobenzoate is a constituent of and a precursor for the biosynthesis of folic acid. [Hypothetical proteins, Conserved] 524
17314 273287 TIGR00820 zip ZIP zinc/iron transport family. The Zinc (Zn2+)-Iron (Fe2+) Permease (ZIP) Family (TC 2.A.5) Members of the ZIP family consist of proteins with eight putative transmembrane spanners. They are derived from animals, plants and yeast. They comprise a diverse family, with several paralogues in any one organism (e.g., at least five in Caenorhabditis elegans, at least five in Arabidopsis thaliana and two in Saccharomyces cerevisiae). The two S. cerevisiae proteins, Zrt1 and Zrt2, both probably transport Zn2+ with high specificity, but Zrt1 transports Zn2+ with ten-fold higher affinity than Zrt2. Some members of the ZIP family have been shown to transport Zn2+ while others transport Fe2+, and at least one transports a range of metal ions. The energy source for transport has not been characterized, but these systems probably function as secondary carriers. [Transport and binding proteins, Cations and iron carrying compounds] 324
17315 129901 TIGR00821 EII-GUT PTS system, glucitol/sorbitol-specific, IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific transporters, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 181
17316 129902 TIGR00822 EII-Sor PTS system, mannose/fructose/sorbose family, IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man (PTS splinter group) family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the sorbose-specific IIC subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 265
17317 129903 TIGR00823 EIIA-LAC phosphotransferase system enzyme II, lactose-specific, factor III. The PTS Lactose-N,N'-Diacetylchitobiose-b-glucoside (Lac) Family (TC 4.A.3) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Lac family includes several sequenced lactose (b-galactoside) permeases of Gram-positive bacteria as well as the E. coli N,N'-diacetylchitobiose (Chb) permease which can transport aromatic b-glucosides and cellobiose as well as the chitin disaccharide, Chb, but only Chb induces expression of the chb operon. While the Lac permeases consist of two polypeptide chains (IIA and IICB), the Chb permease of E. coli consists of three (IIA, IIB and IIC). In B. subtilis, a PTS permease similar to the Chb permease of E. coli is believed to transport lichenan (a b-1,3;1,4 glucan) degradation products, oligosaccharides of 2-4 glucose units. This model is specific for the IIA subunit of the Lac PTS family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 99
17318 129904 TIGR00824 EIIA-man PTS system, mannose/fructose/sorbose family, IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIA components. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 116
17319 129905 TIGR00825 EIIBC-GUT PTS system, glucitol/sorbitol-specific, IIBC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIBC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 331
17320 273288 TIGR00826 EIIB_glc PTS system, glucose-like IIB component. The PTS Glucose-Glucoside (Glc) Family (TC 4.A.1) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the E. coli PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli . Most, but not all Scr permeases of other bacteria also lack a IIA domain. This model is specific for the IIB domain of the Glc family PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 88
17321 129907 TIGR00827 EIIC-GAT PTS system, galactitol-specific IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The only characterized member of this family of PTS transporters is the E. coli galactitol transporter. Gat family PTS systems typically have 3 components: IIA, IIB and IIC. This family is specific for the IIC component of the PTS Gat family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 407
17322 129908 TIGR00828 EIID-AGA PTS system, mannose/fructose/sorbose family, IID component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IID subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 271
17323 129909 TIGR00829 FRU PTS system, fructose-specific, IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several PTS components of unknown specificities. The fructose components of this family phosphorylate fructose on the 1-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This family is specific for the IIB domain of the fructose PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 85
17324 273289 TIGR00830 PTBA PTS system, glucose subfamily, IIA component. These are part of the PTS Glucose-Glucoside (Glc) SuperFamily. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). The IIA, IIB and IIC domains of all of the permeases listed below are demonstrably homologous. These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. The three-dimensional structures of the IIA and IIB domains of the E. coli glucose permease have been elucidated. IIAglc has a complex b-sandwich structure while IIBglc is a split ab-sandwich with a topology unrelated to the split ab-sandwich structure of HPr. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 121
17325 129911 TIGR00831 a_cpa1 Na+/H+ antiporter, bacterial form. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the bacterial members of this family. [Transport and binding proteins, Cations and iron carrying compounds] 525
17326 213563 TIGR00832 acr3 arsenical-resistance protein. The Arsenical Resistance-3 (ACR3) Family (TC 2.A.59) The first protein of the ACR3 family functionally characterized was the ACR3 protein of Saccharomyces cerevisiae. It is present in the yeast plasma membrane and pumps arsenite out of the cell in response to the pmf. Similar proteins are found in bacteria, often as part of a four gene operon with a regulatory protein ArsR, a protein of unknown function ArsH, and an arsenate reductase that converts arsenate to arsenite to facilitate transport. [Cellular processes, Detoxification, Transport and binding proteins, Anions] 328
17327 129913 TIGR00833 actII Transport protein. The Resistance-Nodulation-Cell Division (RND) Superfamily- MmpL sub family (TC 2.A.6.5) Characterized members of the RND superfamily all probably catalyze substrate efflux via an H+ antiport mechanism. These proteins are found ubiquitously in bacteria, archaea and eukaryotes. This sub-family includes the S. coelicolor ActII3 protein, which may play a role in drug resistance, and the M. tuberculosis MmpL7 protein, which catalyzes export of an outer membrane lipid, phthiocerol dimycocerosate. [Transport and binding proteins, Unknown substrate] 910
17328 273290 TIGR00834 ae anion exchange protein. The Anion Exchanger (AE) Family (TC 2.A.31) Characterized protein members of the AE family are found only in animals. They preferentially catalyze anion exchange (antiport) reactions, typically acting as HCO3-:Cl- antiporters, but also transporting a range of other inorganic and organic anions. Additionally, renal Na+:HCO3- cotransporters have been found to be members of the AE family. They catalyze the reabsorption of HCO3- in the renal proximal tubule. [Transport and binding proteins, Anions] 900
17329 273291 TIGR00835 agcS amino acid carrier protein. The Alanine or Glycine: Cation Symporter (AGCS) Family (TC 2.A.25) Members of the AGCS family transport alanine and/or glycine in symport with Na+ and/or H+. 425
17330 273292 TIGR00836 amt ammonium transporter. The Ammonium Transporter (Amt) Family (TC 2.A.49) All functionally characterized members of the Amt family are ammonia or ammonium uptake transporters. Some, but not others, also transport methylammonium. The mechanism of energy coupling, if any, to methyl-NH2 or NH3 uptake by the AmtB protein of E. coli is not entirely clear. NH4+ uniport driven by the pmf, energy independent NH3 facilitation, and NH4+/K+ antiport have been proposed as possible transport mechanisms. In Corynebacterium glutamicum and Arabidopsis thaliana, uptake via the Amt1 homologues of AmtB has been reported to be driven by the pmf. [Transport and binding proteins, Cations and iron carrying compounds] 403
17331 273293 TIGR00837 araaP aromatic amino acid transport protein. The Hydroxy/Aromatic Amino Acid Permease (HAAAP) Family- tyrosine/tryptophan subfamily (TC 2.A.42.1) The HAAAP family includes well characterized aromatic amino acid:H+ symport permeases and hydroxy amino acid permeases. This subfamily is specific for aromatic amino acid transporters and includes the tyrosine permease, TyrP, of E. coli, and the tryptophan transporters TnaB and Mtr of E. coli. [Transport and binding proteins, Amino acids, peptides and amines] 381
17332 129918 TIGR00838 argH argininosuccinate lyase. This model describes argininosuccinate lyase, but may include examples of avian delta crystallins, in which argininosuccinate lyase activity may or may not be present and the biological role is to provide the optically clear cellular protein of the eye lens. [Amino acid biosynthesis, Glutamate family] 455
17333 213564 TIGR00839 aspA aspartate ammonia-lyase. This enzyme, aspartate ammonia-lyase, shows local homology to a number of other lyases, as modeled by pfam00206. Fumarate hydratase scores as high as 570 bits against this model. [Energy metabolism, Amino acids and amines] 468
17334 273294 TIGR00840 b_cpa1 sodium/hydrogen exchanger 3. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the eukaryotic members of this family. [Transport and binding proteins, Cations and iron carrying compounds] 559
17335 188087 TIGR00841 bass bile acid transporter. The Bile Acid:Na+ Symporter (BASS) Family (TC 2.A.28) Functionally characterized members of the BASS family catalyze Na+:bile acid symport. These systems have been identified in intestinal, liver and kidney tissues of animals. These symporters exhibit broad specificity, taking up a variety of non bile organic compounds as well as taurocholate and other bile salts. Functionally uncharacterised homologues are found in plants, yeast, archaea and bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 286
17336 213565 TIGR00842 bcct choline/carnitine/betaine transport. The Betaine/Carnitine/Choline Transporter (BCCT) Family (TC 2.A.15) Proteins of the BCCT family share the common functional feature of transporting molecules with a quaternary ammonium group [R-N+(CH3)3]. The BCCT family includes transporters for carnitine, choline and glycine betaine. BCCT transporters have 12 putative TMS, and are energized by pmf-driven proton symport. Some of these permeases exhibit osmosensory and osmoregulatory properties inherent to their polypeptide chains. [Transport and binding proteins, Other] 452
17337 129923 TIGR00843 benE benzoate transporter. The benzoate transporter family contains only a single characterised member, the benzoate transporter of Acinetobacter calcoaceticus, which functions as a benzoate/proton symporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 395
17338 273295 TIGR00844 c_cpa1 na(+)/h(+) antiporter. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the fungal members of this family. [Transport and binding proteins, Cations and iron carrying compounds] 810
17339 273296 TIGR00845 caca sodium/calcium exchanger 1. The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19) Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the CaCA family. [Transport and binding proteins, Other] 928
17340 273297 TIGR00846 caca2 calcium/proton exchanger. The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19) Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is generated from the calcium ion/proton exchangers of the CaCA family. [Transport and binding proteins, Cations and iron carrying compounds] 363
17341 129927 TIGR00847 ccoS cytochrome oxidase maturation protein, cbb3-type. CcoS from Rhodobacter capsulatus has been shown essential for incorporation of redox-active prosthetic groups (heme, Cu) into cytochrome cbb(3) oxidase. FixS of Bradyrhizobium japonicum appears to have the same function. Members of this family are found so far in organisms with a cbb3-type cytochrome oxidase, including Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Caulobacter crescentus, Bradyrhizobium japonicum, and Rhodobacter capsulatus. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair] 51
17342 273298 TIGR00848 fruA PTS system, fructose subfamily, IIA component. 4.A.2 The PTS Fructose-Mannitol (Fru) Family Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The fructose permeases of this family phosphorylate fructose on the 1-position. Those of family 4.6 phosphorylate fructose on the 6-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This model is specific for the IIA domain of the fructose PTS transporters. Also similar to the Enzyme IIA Fru subunits of the PTS, but included in TIGR01419 rather than this model, is enzyme IIA Ntr (nitrogen), also called PtsN, found in E. coli and other organisms, which may play a solely regulatory role. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 129
17343 129929 TIGR00849 gutA PTS system, glucitol/sorbitol-specific IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family consists only of glucitol-specific transporters, which occur both in Gram-negative and Gram-positive bacteria. The system in E. coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 121
17344 129930 TIGR00851 mtlA PTS system, mannitol-specific IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This family is specific for the IIC domain of the mannitol PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 338
17345 273299 TIGR00852 pts-Glc PTS system, maltose and glucose-specific subfamily, IIC component. The PTS Glucose-Glucoside (Glc) Family (TC 4.A.1) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the E. coli PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli . Most, but not all Scr permeases of other bacteria also lack a IIA domain. This model is specific for the IIC domain of the Glc family PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 289
17346 273300 TIGR00853 pts-lac PTS system, lactose/cellobiose family IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Lac family includes several sequenced lactose (b-galactoside) permeases of Gram-positive bacteria as well as those in E. coli. While the Lac family usually consists of two polypeptide components, IIA and IICB, the Chb permease of E. coli consists of three: IIA, IIB and IIC. This family is specific for the IIB subunit of the Lac PTS family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 95
17347 129933 TIGR00854 pts-sorbose PTS system, mannose/fructose/sorbose family, IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIB components of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 151
17348 273301 TIGR00855 L12 ribosomal protein L7/L12. Ribosomal proteins L7 and L12 are synonymous except for post-translational modification of the N-terminal amino acid. This model resembles pfam00542 but matches the full length of prokaryotic and organellar proteins rather than just the C-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification] 123
17349 129935 TIGR00856 pyrC_dimer dihydroorotase, homodimeric type. This homodimeric form of dihydroorotase is less common in microbial genomes than a related dihydroorotase that appears in a complex with aspartyltranscarbamoylase or as a homologous domain in multifunctional proteins of pyrimidine biosynthesis in higher eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 341
17350 273302 TIGR00857 pyrC_multi dihydroorotase, multifunctional complex type. In contrast to the homodimeric type of dihydroorotase found in E. coli, this class tends to appear in a large, multifunctional complex with aspartate transcarbamoylase. Homologous domains appear in multifunctional proteins of higher eukaryotes. In some species, including Pseudomonas putida and P. aeruginosa, this protein is inactive but is required as a non-catalytic subunit of aspartate transcarbamoylase (ATCase). In these species, a second, active dihydroorotase is also present. The seed for this model does not include any example of the dihydroorotase domain of eukaryotic multidomain pyrimidine synthesis proteins. All proteins described by this model should represent active and inactive dihydroorotase per se and functionally equivalent domains of multifunctional proteins from higher eukaryotes, but exclude related proteins such as allantoinase. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 411
17351 273303 TIGR00858 bioF 8-amino-7-oxononanoate synthase. 7-keto-8-aminopelargonic acid synthetase is an alternate name. This model represents 8-amino-7-oxononanoate synthase, the BioF protein of biotin biosynthesis. This model is based on a careful phylogenetic analysis to separate members of this family from 2-amino-3-ketobutyrate and other related pyridoxal phosphate-dependent enzymes. In several species, including Staphylococcus and Coxiella, a candidate 8-amino-7-oxononanoate synthase is confirmed by location in the midst of a biotin biosynthesis operon but scores below the trusted cutoff of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 360
17352 273304 TIGR00859 ENaC sodium channel transporter. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06) The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the vertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds] 595
17353 273305 TIGR00860 LIC Cation transporter family protein. The Ligand-gated Ion Channel (LIC) Family of Neurotransmitter Receptors (TC 1.A.9) Members of the LIC family of ionotropic neurotransmitter receptors are found only in vertebrate and invertebrate animals. They exhibit receptor specificity for (1) acetylcholine, (2) serotonin, (3) glycine, (4) glutamate and (5) g-aminobutyric acid (GABA). All of these receptor channels are probably hetero- or homopentameric. The best characterized are the nicotinic acetylcholine receptors which are pentameric channels of a2bgd subunit composition. All subunits are homologous. The three dimensional structures of the protein complex in both the open and closed configurations have been solved at 0.9 nm resolution. The channel protein complexes of the LIC family preferentially transport cations or anions depending on the channel (e.g., the acetylcholine receptors are cation selective while glycine receptors are anion selective). [Transport and binding proteins, Cations and iron carrying compounds] 459
17354 273306 TIGR00861 MIP MIP family channel proteins. The Major Intrinsic Protein (MIP) Family (TC 1.A.8) The MIP family is large and diverse, possessing over 100 members that all form transmembrane channels. These channel proteins function in water, small carbohydrate (e.g., glycerol), urea, NH3, CO2 and possibly ion transport by an energy independent mechanism. They are found ubiquitously in bacteria, archaea and eukaryotes. The MIP family contains two major groups of channels: aquaporins and glycerol facilitators. The known aquaporins cluster loosely together as do the known glycerol facilitators. MIP family proteins are believed to form aqueous pores that selectively allow passive transport of their solute(s) across the membrane with minimal apparent recognition. Aquaporins selectively transport water (but not glycerol) while glycerol facilitators selectively transport glycerol but not water. Some aquaporins can transport NH3 and CO2. Glycerol facilitators function as solute nonspecific channels, and may transport glycerol, dihydroxyacetone, propanediol, urea and other small neutral molecules in physiologically important processes. Some members of the family, including the yeast FPS protein (TC #1.A.8.5.1) and tobacco NtTIPA may transport both water and small solutes. [Transport and binding proteins, Unknown substrate] 216
17355 129941 TIGR00862 O-ClC intracellular chloride channel protein. The Organellar Chloride Channel (O-ClC) Family (TC 1.A.12) Proteins of the O-ClC family are voltage-sensitive chloride channels found in intracellular membranes but not the plasma membranes of animal cells. They are found in human nuclear membranes, and the bovine protein targets to the microsomes, but not the plasma membrane, when expressed in Xenopus laevis oocytes. These proteins are thought to function in the regulation of the membrane potential and in transepithelial ion absorption and secretion in the kidney. [Transport and binding proteins, Anions] 236
17356 273307 TIGR00863 P2X cation transporter protein. ATP-gated Cation Channel (ACC) Family (TC 1.A.7) Members of the ACC family (also called P2X receptors) respond to ATP, a functional neurotransmitter released by exocytosis from many types of neurons. These channels, which function at neuron-neuron and neuron-smooth muscle junctions, may play roles in the control of blood pressure and pain sensation. They may also function in lymphocyte and platelet physiology. They are found only in animals. ACC channels are probably hetero- or homomultimers and transport small monovalent cations (Me+). Some also transport Ca2+; a few also transport small metabolites. [Transport and binding proteins, Cations and iron carrying compounds] 372
17357 188093 TIGR00864 PCC polycystin cation channel protein. The Polycystin Cation Channel (PCC) Family (TC 1.A.5) Polycystin is a huge protein of 4303aas. Its repeated leucine-rich (LRR) segment is found in many proteins. It contains 16 polycystic kidney disease (PKD) domains, one LDL-receptor class A domain, one C-type lectin family domain, and 16-18 putative TMSs in positions between residues 2200 and 4100. Polycystin-L has been shown to be a cation (Na+, K+ and Ca2+) channel that is activated by Ca2+. Two members of the PCC family (polycystin 1 and 2) are mutated in autosomal dominant polycystic kidney disease, and polycystin-L is deleted in mice with renal and retinal defects. Note: this model is restricted to the amino half. 2740
17358 273308 TIGR00865 bcl-2 apoptosis regulator. The Bcl-2 (Bcl-2) Family (TC 1.A.21) The Bcl-2 family consists of the apoptosis regulator, Bcl-X, and its homologues. Bcl-X is a dominant regulator of programmed cell death in mammalian cells. The long form (Bcl-X(L)) displays cell death repressor activity, but the short isoform (Bcl-X(S)) and the b-isoform (Bcl-Xb) promote cell death. Bcl-X(L), Bcl-X(S) and Bcl-Xb are three isoforms derived by alternative RNA splicing. Bcl-X(S) forms heterodimers with Bcl-2. Homologues of Bcl-X include the Bax (rat; 192 aas; spQ63690) and Bak (mouse; 208 aas; spO08734) proteins which also influence apoptosis. Using isolated mitochondria, recombinant Bax and Bak have been shown to induce Dy loss, swelling and cytochrome c release. All of these changes are dependent on Ca2+ and are prevented by cyclosporin A and bongkrekic acid, both of which are known to close permeability transition pores (megachannels). Coimmunoprecipitation studies revealed that Bax and Bak interact with VDAC to form permeability transition pores. Thus, even though they can form channels in artificial membranes at acidic pH, proapoptotic Bcl-2 family proteins (including Bax and Bak) probably induce the mitochondrial permeability transition and cytochrome c release by interacting with permeability transition pores, the most important component for pore formation of which is VDAC. [Regulatory functions, Other] 213
17359 273309 TIGR00867 deg-1 degenerin. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06) The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the invertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds] 600
17360 129946 TIGR00868 hCaCC calcium-activated chloride channel protein 1. The Epithelial Chloride Channel (E-ClC) Family (TC 1.A.13) Mammals have multiple isoforms of epithelial chloride channel proteins. The first member of this family to be characterized was a respiratory epithelium, Ca2+-regulated, chloride channel protein isolated from bovine tracheal apical membranes. It was biochemically characterized as a 140 kDa complex. The purified complex when reconstituted in a planar lipid bilayer behaved as an anion-selective channel. It was regulated by Ca2+ via a calmodulin kinase II-dependent mechanism. When the cRNA was injected into Xenopus oocytes, an outward rectifying, DIDS-sensitive, anion conductance was measured. A related gene, Lu-ECAM, was cloned from the bovine aortic endothelial cell line, BAEC. It is expressed in the lung and spleen but not in the trachea. Homologues are found in several mammals, and at least three paralogues (hCaCC-1-3) are present in humans, each with different tissue distributions. [Transport and binding proteins, Anions] 863
17361 273310 TIGR00869 sec62 protein translocation protein, Sec62 family. Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions. [Transport and binding proteins, Amino acids, peptides and amines] 232
17362 273311 TIGR00870 trp transient-receptor-potential calcium channel protein. The Transient Receptor Potential Ca2+ Channel (TRP-CC) Family (TC 1.A.4) The TRP-CC family has also been called the store-operated calcium channel (SOC) family. The prototypical members include the Drosophila retinal proteins TRP and TRPL (Montell and Rubin, 1989; Hardie and Minke, 1993). SOC members of the family mediate the entry of extracellular Ca2+ into cells in response to depletion of intracellular Ca2+ stores (Clapham, 1996) and agonist stimulated production of inositol-1,4,5 trisphosphate (IP3). One member of the TRP-CC family, mammalian Htrp3, has been shown to form a tight complex with the IP3 receptor (TC #1.A.3.2.1). This interaction is apparently required for IP3 to stimulate Ca2+ release via Htrp3. The vanilloid receptor subtype 1 (VR1), which is the receptor for capsaicin (the "hot" ingredient in chili peppers) and serves as a heat-activated ion channel in the pain pathway (Caterina et al., 1997), is also a member of this family. The stretch-inhibitable non-selective cation channel (SIC) is identical to the vanilloid receptor throughout all of its first 700 residues, but it exhibits a different sequence in its last 100 residues. VR1 and SIC transport monovalent cations as well as Ca2+. VR1 is about 10x more permeable to Ca2+ than to monovalent ions. Ca2+ overload probably causes cell death after chronic exposure to capsaicin (McCleskey and Gold, 1999). [Transport and binding proteins, Cations and iron carrying compounds] 743
17363 273312 TIGR00871 zwf glucose-6-phosphate 1-dehydrogenase. This enzyme (EC 1.1.1.49) acts on glucose 6-phosphate and reduces NADP(+). An alternate name appearing in the literature for the human enzyme, based on a slower activity with beta-D-glucose, is glucose 1-dehydrogenase (EC 1.1.1.47), but that name more properly describes a subfamily of the short chain dehydrogenases/reductases family. This is a well-studied enzyme family, with sequences available from well over 50 species. The trusted cutoff is set above the score for the Drosophila melanogaster CG7140 gene product, a homolog of unknown function. G6PD homologs from the bacteria Aquifex aeolicus and Helicobacter pylori lack several motifs well conserved in most other members, were omitted from the seed alignment, and score well below the trusted cutoff. [Energy metabolism, Pentose phosphate pathway] 487
17364 273313 TIGR00872 gnd_rel 6-phosphogluconate dehydrogenase (decarboxylating). This family resembles a larger family (gnd) of bacterial and eukaryotic 6-phosphogluconate dehydrogenases but differs from it by a deep split in a UPGMA similarity clustering tree and the lack of a central region of about 140 residues. Among complete genomes, it is found in Bacillus subtilis and Mycobacterium tuberculosis, both of which also contain gnd, and in Aquifex aeolicus. The protein from Methylobacillus flagellatus KT has been characterized as a decarboxylating 6-phosphogluconate dehydrogenase as part of an unusual formaldehyde oxidation cycle. In some sequenced organisms members of this family are the sole 6-phosphogluconate dehydrogenase present and are probably active in the pentose phosphate cycle. [Energy metabolism, Pentose phosphate pathway] 298
17365 273314 TIGR00873 gnd 6-phosphogluconate dehydrogenase (decarboxylating). This model does not specify whether the cofactor is NADP only (EC 1.1.1.44), NAD only, or both. The model does not assign an EC number for that reason. [Energy metabolism, Pentose phosphate pathway] 467
17366 162081 TIGR00874 talAB transaldolase. This family includes the majority of known and predicted transaldolase sequences, including E. coli TalA and TalB. It excludes two other families. The first includes E. coli transaldolase-like protein TalC. The second family includes the putative transaldolases of Helicobacter pylori and Mycobacterium tuberculosis. [Energy metabolism, Pentose phosphate pathway] 317
17367 129953 TIGR00875 fsa_talC_mipB fructose-6-phosphate aldolase, TalC/MipB family. This model represents a family that includes the E. coli transaldolase homologs TalC and MipB, both shown to be fructose-6-phosphate aldolases rather than transaldolases as previously thought. It is related to but distinct from the transaldolase family of E. coli TalA and TalB. The member from Bacillus subtilis becomes phosphorylated during early stationary phase but not during exponential growth. [Energy metabolism, Pentose phosphate pathway] 213
17368 129954 TIGR00876 tal_mycobact transaldolase, mycobacterial type. This model describes one of three related but easily separable families of known and putative transaldolases. This family and the family typified by E. coli TalA and TalB both contain experimentally verified examples. [Energy metabolism, Pentose phosphate pathway] 350
17369 273315 TIGR00877 purD phosphoribosylamine--glycine ligase. Alternate name: glycinamide ribonucleotide synthetase (GARS). This enzyme appears as a monofunctional protein in prokaryotes but as part of a larger, multidomain protein in eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 422
17370 273316 TIGR00878 purM phosphoribosylaminoimidazole synthetase. Alternate name: phosphoribosylformylglycinamidine cyclo-ligase; AIRS; AIR synthase This enzyme is found as a homodimeric monofunctional protein in prokaryotes and as part of a larger, multifunctional protein, sometimes with two copies of this enzyme in tandem, in eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 332
17371 273317 TIGR00879 SP MFS transporter, sugar porter (SP) family. This model represents the sugar porter subfamily of the major facilitator superfamily (pfam00083). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 481
17372 273318 TIGR00880 2_A_01_02 Multidrug resistance protein. 141
17373 273319 TIGR00881 2A0104 phosphoglycerate transporter family protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 379
17374 211613 TIGR00882 2A0105 oligosaccharide:H+ symporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 396
17375 273320 TIGR00883 2A0106 metabolite-proton symporter. This model represents the metabolite:H+ symport subfamily of the major facilitator superfamily (pfam00083), including citrate-H+ symporters, dicarboxylate:H+ symporters, the proline/glycine-betaine transporter ProP, etc. [Transport and binding proteins, Unknown substrate] 394
17376 273321 TIGR00884 guaA_Cterm GMP synthase (glutamine-hydrolyzing), C-terminal domain or B subunit. This protein of purine de novo biosynthesis is well-conserved. However, it appears to split into two separate polypeptide chains in most of the Archaea. This C-terminal region would be the larger subunit [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 311
17377 211614 TIGR00885 fucP L-fucose:H+ symporter permease. This family describes the L-fucose permease in bacteria. L-fucose (6-deoxy-L-galactose) is a monosaccharide found in glycoproteins and cell wall polysaccharides. L-fucose is used in bacteria through an inducible pathway mediated by at least four enzymes: a permease, isomerase, kinase and an aldolase which are encoded by fucP, fucI, fucK, fucA respectively. The fuc genes belong to a regulon comprising four linked operons: fucO, fucA, fucPIK and fucR. The positive regulator is encoded by fucR, whose protein responds to fuculose-1-phosphate, which acts as an effector. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 408
17378 273322 TIGR00886 2A0108 nitrite extrusion protein (nitrite facilitator). [Transport and binding proteins, Anions] 354
17379 129965 TIGR00887 2A0109 phosphate:H+ symporter. This model represents the phosphate uptake symporter subfamily of the major facilitator superfamily (pfam00083). [Transport and binding proteins, Anions] 502
17380 129966 TIGR00888 guaA_Nterm GMP synthase (glutamine-hydrolyzing), N-terminal domain or A subunit. This protein of purine de novo biosynthesis is well-conserved. However, it appears to split into two separate polypeptide chains in most of the Archaea. This N-terminal region would be the smaller subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 188
17381 129967 TIGR00889 2A0110 nucleoside transporter. This family of proteins transports nucleosides at a high affinity. The transport mechanism is driven by proton motive force. This family includes nucleoside permease NupG and xanthosine permease from E. coli. [Transport and binding proteins, Nucleosides, purines and pyrimidines] 418
17382 273323 TIGR00890 2A0111 oxalate/formate antiporter family transporter. This subfamily belongs to the major facilitator family. Members include the oxalate/formate antiporter of Oxalobacter formigenes, where one substrate is decarboxylated in the cytosol into the other to consume a proton and drive an ion gradient. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 377
17383 273324 TIGR00891 2A0112 putative sialic acid transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 405
17384 273325 TIGR00892 2A0113 monocarboxylate transporter 1. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 455
17385 273326 TIGR00893 2A0114 D-galactonate transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 399
17386 129972 TIGR00894 2A0114euk Na(+)-dependent inorganic phosphate cotransporter. [Transport and binding proteins, Anions] 465
17387 273327 TIGR00895 2A0115 benzoate transport. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 398
17388 129974 TIGR00896 CynX cyanate transporter. This family of proteins is involved in active transport of cyanate. The cyanate transporter in E. coli is used to transport cyanate into the cell so it can be metabolized into ammonia and bicarbonate. This process is used to overcome the toxicity of environmental cyanate. [Transport and binding proteins, Other] 355
17389 162096 TIGR00897 2A0118 polyol permease family. This family of proteins includes the ribitol and D-arabinitol transporters from Klebsiella pneumoniae and the alpha-ketoglutarate permease from Bacillus subtilis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 402
17390 273328 TIGR00898 2A0119 cation transport protein. [Transport and binding proteins, Cations and iron carrying compounds] 505
17391 129977 TIGR00899 2A0120 sugar efflux transporter. This family of proteins is an efflux system for lactose, glucose, aromatic glucosides and galactosides, cellobiose, maltose, a-methyl glucoside and other sugar compounds. They are found in both Gram-negative and Gram-positive bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 375
17392 162098 TIGR00900 2A0121 H+ Antiporter protein. [Transport and binding proteins, Cations and iron carrying compounds] 365
17393 273329 TIGR00901 2A0125 AmpG-like permease. [Cellular processes, Adaptations to atypical conditions] 356
17394 129980 TIGR00902 2A0127 phenyl proprionate permease family protein. This family of proteins is involved in the uptake of 3-phenylpropionic acid. This uptake mechanism is for the metabolism of phenylpropanoid compounds and plays an important role in the natural degradative cycle of these aromatic molecules. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 382
17395 129981 TIGR00903 2A0129 major facilitator 4 family protein. This family of proteins are uncharacterized proteins from archaea. This family includes proteins from Archaeoglobus fulgidus and Aeropyrum pernix. [Transport and binding proteins, Other] 368
17396 129982 TIGR00904 mreB cell shape determining protein, MreB/Mrl family. MreB (mecillinam resistance) in E. coli (also called envB) and the paralogous pair MreB and Mrl of Bacillus subtilis have all been shown to help determine cell shape. This protein is present in a wide variety of bacteria, including spirochetes, but is missing from the Mycoplasmas and from Gram-positive cocci. Most completed bacterial genomes have a single member of this family. In some species it is an essential gene. A close homolog is found in the Archaeon Methanobacterium thermoautotrophicum, and a more distant homolog in Archaeoglobus fulgidus. The family is related to cell division protein FtsA and heat shock protein DnaK. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 333
17397 129983 TIGR00905 2A0302 transporter, basic amino acid/polyamine antiporter (APA) family. This family includes several families of antiporters that, rather commonly, are encoded next to decarboxylases that convert one of the antiporter substrates into the other. This arrangement allows a cycle that can remove protons from the cytoplasm and thereby protect against acidic conditions. [Transport and binding proteins, Amino acids, peptides and amines] 473
17398 273330 TIGR00906 2A0303 cationic amino acid transport permease. [Transport and binding proteins, Amino acids, peptides and amines] 557
17399 273331 TIGR00907 2A0304 amino acid permease (GABA permease). [Transport and binding proteins, Amino acids, peptides and amines] 482
17400 129986 TIGR00908 2A0305 ethanolamine permease. The three genes used as the seed for this model (from Burkholderia pseudomallei, Pseudomonas aeruginosa and Clostridium acetobutylicum) are all adjacent to genes for the catabolism of ethanolamine. Most if not all of the hits to this model have a similar arrangement of genes. This group is a member of the Amino Acid-Polyamine-Organocation (APC) Superfamily. [Transport and binding proteins, Amino acids, peptides and amines] 442
17401 129987 TIGR00909 2A0306 amino acid transporter. [Transport and binding proteins, Amino acids, peptides and amines] 429
17402 129988 TIGR00910 2A0307_GadC glutamate:gamma-aminobutyrate antiporter. Lowered cutoffs from 1000/500 to 800/300, promoted from subfamily to equivalog, and put into a Genome Property DHH 9/1/2009 [Transport and binding proteins, Amino acids, peptides and amines] 507
17403 273332 TIGR00911 2A0308 L-type amino acid transporter. [Transport and binding proteins, Amino acids, peptides and amines] 501
17404 273333 TIGR00912 2A0309 spore germination protein (amino acid permease). This model describes spore germination protein GerKB and paralogs from Bacillus subtilis, Clostridium tetani, and other known or predicted endospore-forming members of the Firmicutes (low-GC Gram positive bacteria). Members show some similarity to amino acid permeases. [Transport and binding proteins, Amino acids, peptides and amines] 359
17405 273334 TIGR00913 2A0310 amino acid permease (yeast). [Transport and binding proteins, Amino acids, peptides and amines] 478
17406 129992 TIGR00914 2A0601 heavy metal efflux pump, CzcA family. This model represents a family of H+/heavy metal cation antiporters. This family is one of several subfamilies within the scope of pfam00873. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds] 1051
17407 273335 TIGR00915 2A0602 The (Largely Gram-negative Bacterial) Hydrophobe/Amphiphile Efflux-1 (HAE1) Family. Proteins scoring above the trusted cutoff (1000) form a tight clade within the RND (Resistance-Nodulation-Cell Division) superfamily. Proteins scoring greater than the noise cutoff (100) appear to form a larger clade, cleanly separated from more distant homologs that include cadmium/zinc/cobalt resistance transporters. This family is one of several subfamilies within the scope of pfam00873. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Unknown substrate] 1044
17408 273336 TIGR00916 2A0604s01 protein-export membrane protein, SecD/SecF family. The SecA, SecB, SecD, SecE, SecF, SecG and SecY proteins form the protein translocation apparatus in prokaryotes. This family is specific for the SecD and SecF proteins. [Protein fate, Protein and peptide secretion and trafficking] 192
17409 273337 TIGR00917 2A060601 Niemann-Pick C type protein family. The model describes Niemann-Pick C type protein in eukaryotes. The defective protein has been associated with Niemann-Pick disease which is described in humans as autosomal recessive lipidosis. It is characterized by the lysosomal accumulation of unestrified cholesterol. It is an integral membrane protein, which indicates that this protein is most likely involved in cholesterol transport or acts as some component of cholesterol homeostasis. [Transport and binding proteins, Other] 1205
17410 273338 TIGR00918 2A060602 The Eukaryotic (Putative) Sterol Transporter (EST) Family. 1145
17411 273339 TIGR00920 2A060605 3-hydroxy-3-methylglutaryl-coenzyme A reductase. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 886
17412 273340 TIGR00921 2A067 The (Largely Archaeal Putative) Hydrophobe/Amphiphile Efflux-3 (HAE3) Family. Characterized members of the RND superfamily all probably catalyze substrate efflux via an H+ antiport mechanism. These proteins are found ubiquitously in bacteria, archaea and eukaryotes. They fall into seven phylogenetic families, this family (2.A.6.7) consists of uncharacterised putative transporters, largely in the Archaea. [Transport and binding proteins, Unknown substrate] 719
17413 273341 TIGR00922 nusG transcription termination/antitermination factor NusG. NusG proteins are transcription factors which are apparently universal in prokaryotes (archaea and eukaryotes have homologs that may have related functions). The essential components of these factors include an N-terminal RNP-like (ribonucleoprotein) domain and a C-terminal KOW motif (pfam00467) believed to be a nucleic acid binding domain. In E. coli, NusA has been shown to interact with RNA polymerase and termination factor Rho. This model covers a wide variety of bacterial species but excludes mycoplasmas which are covered by a separate model (TIGR01956). The function of all of these NusG proteins is likely to be the same at the level of interaction with RNA and other protein factors to affect termination; however different species may utilize NusG towards different processes and in combination with different suites of affector proteins. In E. coli, NusG promotes rho-dependent termination. It is an essential gene. In Streptomyces virginiae and related species, an additional N-terminal sequence is also present and is suggested to play a role in butyrolactone-mediated autoregulation. In Thermotoga maritima, NusG has a long insert, fails to substitute for E. coli NusG (with or without the long insert), is a large 0.7 % of total cellular protein, and has a general, sequence non-specific DNA and RNA binding activity that blocks ethidium staining, yet permits transcription. Archaeal proteins once termed NusG share the KOW domain but are actually a ribosomal protein corresponding to L24p in bacterial and L26e in eukaryotes (TIGR00405). [Transcription, Transcription factors] 172
17414 273342 TIGR00924 yjdL_sub1_fam amino acid/peptide transporter (Peptide:H+ symporter), bacterial. The model describes proton-dependent oligopeptide transporters in bacteria. This model is restricted in its range in recognizing bacterial proton-dependent oligopeptide transporters, although they are found in yeast, plants and animals. They function by proton symport in a 1:1 stoichiometry, which is variable in different species. All of them are predicted to contain 12 transmembrane domains, for which limited experimental evidence exists. [Transport and binding proteins, Amino acids, peptides and amines] 475
17415 273343 TIGR00926 2A1704 Peptide:H+ symporter (also transports b-lactam antibiotics, the antitumor agent, bestatin, and various protease inhibitors). [Transport and binding proteins, Amino acids, peptides and amines] 654
17416 273344 TIGR00927 2A1904 K+-dependent Na+/Ca+ exchanger. [Transport and binding proteins, Cations and iron carrying compounds] 1096
17417 273345 TIGR00928 purB adenylosuccinate lyase. This family consists of adenylosuccinate lyase, the enzyme that catalyzes step 8 in the purine biosynthesis pathway for de novo synthesis of IMP and also the final reaction in the two-step sequence from IMP to AMP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 435
17418 273346 TIGR00929 VirB4_CagE type IV secretion/conjugal transfer ATPase, VirB4 family. Type IV secretion systems are found in Gram-negative pathogens. They export proteins, DNA, or complexes in different systems and are related to plasmid conjugation systems. This model represents related ATPases that include VirB4 in Agrobacterium tumefaciens (DNA export) CagE in Helicobacter pylori (protein export) and plasmid TraB (conjugation). 785
17419 273347 TIGR00930 2a30 K-Cl cotransporter. [Transport and binding proteins, Other] 953
17420 188097 TIGR00931 antiport_nhaC Na+/H+ antiporter NhaC. A single member of the NhaC family, a protein from Bacillus firmus, has been functionally characterized. It is involved in pH homeostasis and sodium extrusion. Members of the NhaC family are found in both Gram-negative bacteria and Gram-positive bacteria. Intriguingly, archaeal homolog ArcD (just outside boundaries of family) has been identified as an arginine/ornithine antiporter. [Transport and binding proteins, Cations and iron carrying compounds] 454
17421 273348 TIGR00932 2a37 transporter, monovalent cation:proton antiporter-2 (CPA2) family. [Transport and binding proteins, Cations and iron carrying compounds] 273
17422 273349 TIGR00933 2a38 potassium uptake protein, TrkH family. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. [Transport and binding proteins, Cations and iron carrying compounds] 391
17423 130009 TIGR00934 2a38euk potassium uptake protein, Trk family. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. This family is specific for the eukaryotic Trk system. [Transport and binding proteins, Cations and iron carrying compounds] 800
17424 213571 TIGR00935 2a45 arsenite/antimonite efflux pump membrane protein. Members of this protein family are ArsB, a highly hydrophobic integral membrane protein involved in transport processes used to protect cells from arsenite (or antimonite). Members of the seed alignment were selected by adjacency to the ATPase subunit ArsA that energizes the transport. [Cellular processes, Detoxification, Transport and binding proteins, Other] 426
17425 213572 TIGR00936 ahcY adenosylhomocysteinase. This enzyme hydrolyzes adenosylhomocysteine as part of a cycle for the regeneration of the methyl donor S-adenosylmethionine. Species that lack this enzyme are likely to have adenosylhomocysteine nucleosidase (EC 3.2.2.9), an enzyme which also acts as 5'-methyladenosine nucleosidase (see TIGR01704). [Energy metabolism, Amino acids and amines] 407
17426 273350 TIGR00937 2A51 chromate transporter, chromate ion transporter (CHR) family. Members of this family probably act as chromate transporters, and are found in Pseudomonas aeruginosa, Alcaligenes eutrophus, Vibrio cholerae, Bacillus subtilis, Ochrobactrum tritici, cyanobacteria and archaea. The protein reduces chromate accumulation and is essential for chromate resistance. [Transport and binding proteins, Anions] 368
17427 273351 TIGR00938 thrB_alt homoserine kinase, Neisseria type. Homoserine kinase is required in the biosynthesis of threonine from aspartate. The member of this family from Pseudomonas aeruginosa was shown by direct assay and complementation to act specifically as a homoserine kinase. [Amino acid biosynthesis, Aspartate family] 307
17428 273352 TIGR00939 2a57 Equilibrative Nucleoside Transporter (ENT). [Transport and binding proteins, Nucleosides, purines and pyrimidines] 437
17429 273353 TIGR00940 2a6301s01 monovalent cation:proton antiporter. This family of proteins consists of bacterial multicomponent K+:H+ and Na+:H+ antiporters. The best characterized systems are the PhaABCDEFG system of Rhizobium meliloti which functions in pH adaptation and as a K+ efflux system and the MnhABCDEFG system of Staphylococcus aureus which functions as a Na+:H+ antiporter. [Transport and binding proteins, Cations and iron carrying compounds] 793
17430 273354 TIGR00941 2a6301s03 Multicomponent Na+:H+ antiporter, MnhC subunit. [Transport and binding proteins, Cations and iron carrying compounds] 104
17431 130017 TIGR00942 2a6301s05 Monovalent Cation (K+ or Na+):Proton Antiporter-3 (CPA3) subfamily. [Transport and binding proteins, Cations and iron carrying compounds] 144
17432 130018 TIGR00943 2a6301s02 monovalent cation:proton antiporter. This family of proteins consists of bacterial multicomponent K+:H+ and Na+:H+ antiporters. The best characterized systems are the PhaABCDEFG system of Rhizobium meliloti which functions in pH adaptation and as a K+ efflux system and the MnhABCDEFG system of Staphylococcus aureus which functions as a Na+:H+ antiporter. This family is specific for the phaB and mnhB proteins. [Transport and binding proteins, Cations and iron carrying compounds] 107
17433 130019 TIGR00944 2a6301s04 Multicomponent K+:H+ antiporter. [Transport and binding proteins, Cations and iron carrying compounds] 463
17434 273355 TIGR00945 tatC Twin arginine targeting (Tat) protein translocase TatC. This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR01912) represents the archaeal clade of this family. TatC is often found in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409). [Protein fate, Protein and peptide secretion and trafficking] 215
17435 273356 TIGR00946 2a69 The Auxin Efflux Carrier (AEC) Family. [Transport and binding proteins, Other] 321
17436 273357 TIGR00947 2A73 putative bicarbonate transporter, IctB family. This family of proteins is suggested to transport inorganic carbon (HCO3-), based on the phenotype of a mutant of IctB in Synechococcus sp. strain PCC 7942. Bicarbonate uptake is used by many photosynthetic organisms including cyanobacteria. These organisms are able to concentrate CO2/HCO3- against a greater than ten-fold concentration gradient. Cyanobacteria may have several such carriers operating with different efficiencies. Note that homology to various O-antigen ligases, with possible implications for mutant cell envelope structure, might allow alternatives to the interpretation of IctB as a bicarbonate transport protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 425
17437 130023 TIGR00948 2a75 L-lysine exporter. [Transport and binding proteins, Amino acids, peptides and amines] 177
17438 273358 TIGR00949 2A76 The Resistance to Homoserine/Threonine (RhtB) Family protein. [Transport and binding proteins, Amino acids, peptides and amines] 185
17439 273359 TIGR00950 2A78 Carboxylate/Amino Acid/Amine Transporter. [Transport and binding proteins, Amino acids, peptides and amines] 260
17440 130026 TIGR00951 2A43 Lysosomal Cystine Transporter. [Transport and binding proteins, Amino acids, peptides and amines] 220
17441 130027 TIGR00952 S15_bact ribosomal protein S15, bacterial/organelle. This model is built to recognize specifically bacterial, chloroplast, and mitochondrial ribosomal protein S15. The homologous proteins of Archaea and Eukarya are designated S13. [Protein synthesis, Ribosomal proteins: synthesis and modification] 86
17442 273360 TIGR00954 3a01203 Peroxysomal Fatty Acyl CoA Transporter (FAT) Family protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 659
17443 273361 TIGR00955 3a01204 The Eye Pigment Precursor Transporter (EPP) Family protein. [Transport and binding proteins, Other] 617
17444 273362 TIGR00956 3a01205 Pleiotropic Drug Resistance (PDR) Family protein. [Transport and binding proteins, Other] 1394
17445 188098 TIGR00957 MRP_assoc_pro multi drug resistance-associated protein (MRP). This model describes multi drug resistance-associated protein (MRP) in eukaryotes. The multidrug resistance-associated protein is an integral membrane protein that causes multidrug resistance when overexpressed in mammalian cells. It belongs to ABC transporter superfamily. The protein topology and function were experimentally demonstrated by epitope tagging and immunofluorescence. Insertion of tags in the critical regions associated with drug efflux abrogated its function. The C-terminal domain seems to be highly conserved. [Transport and binding proteins, Other] 1522
17446 273363 TIGR00958 3a01208 Conjugate Transporter-2 (CT2) Family protein. [Transport and binding proteins, Other] 711
17447 273364 TIGR00959 ffh signal recognition particle protein. This model represents Ffh (Fifty-Four Homolog), the protein component that forms the bacterial (and organellar) signal recognition particle together with a 4.5S RNA. Ffh is a GTPase homologous to eukaryotic SRP54 and also to the GTPase FtsY (TIGR00064) that is the receptor for the signal recognition particle. [Protein fate, Protein and peptide secretion and trafficking] 428
17448 273365 TIGR00962 atpA proton translocating ATP synthase, F1 alpha subunit. The sequences of ATP synthase F1 alpha and beta subunits are related and both contain a nucleotide-binding site for ATP and ADP. They have a common amino terminal domain but vary at the C-terminus. The beta chain has catalytic activity, while the alpha chain is a regulatory subunit. The alpha-subunit contains a highly conserved adenine-specific noncatalytic nucleotide-binding domain. The conserved amino acid sequence is Gly-X-X-X-X-Gly-Lys. Proton translocating ATP synthase F1, alpha subunit is homologous to proton translocating ATP synthase archaeal/vacuolar(V1), B subunit. [Energy metabolism, ATP-proton motive force interconversion] 501
17449 273366 TIGR00963 secA preprotein translocase, SecA subunit. The proteins SecA-F and SecY, not all of which are necessary, comprise the standard prokaryotic protein translocation apparatus. Other, specialized translocation systems also exist but are not as broadly distributed. This model describes SecA, an essential member of the apparatus. This model excludes SecA2 of the accessory secretory system. [Protein fate, Protein and peptide secretion and trafficking] 742
17450 273367 TIGR00964 secE_bact preprotein translocase, SecE subunit, bacterial. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC. [Protein fate, Protein and peptide secretion and trafficking] 55
17451 130038 TIGR00965 dapD 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. Alternate name: tetrahydrodipicolinate N-succinyltransferase. The closely related TabB protein of Pseudomonas syringae (pv. tabaci), SP|P31852|TABB_PSESZ, appears to act in the biosynthesis of tabtoxin rather than lysine. The trusted cutoff is set high enough to exclude this gene. Sequences below trusted also include a version of this enzyme which apparently utilizes acetate rather than succinate (EC: 2.3.1.89). [Amino acid biosynthesis, Aspartate family] 269
17452 273368 TIGR00966 3a0501s07 protein-export membrane protein SecF. This bacterial protein is always found with the homologous protein-export membrane protein SecD. In numerous lineages, this protein occurs as a SecDF fusion protein. [Protein fate, Protein and peptide secretion and trafficking] 246
17453 273369 TIGR00967 3a0501s007 preprotein translocase, SecY subunit. Members of this protein family are the SecY component of the SecYEG translocon, or protein translocation pore, which is driven by the ATPase SecA. This model does not discriminate bacterial from archaeal forms. [Protein fate, Protein and peptide secretion and trafficking] 410
17454 130041 TIGR00968 3a0106s01 sulfate ABC transporter, ATP-binding protein. [Transport and binding proteins, Anions] 237
17455 273370 TIGR00969 3a0106s02 sulfate ABC transporter, permease protein. This model describes a subfamily of both CysT and CysW, paralogous and generally tandemly encoded permease proteins of the sulfate ABC transporter. [Transport and binding proteins, Anions] 271
17456 273371 TIGR00970 leuA_yeast 2-isopropylmalate synthase, yeast type. A larger family of homologous proteins includes homocitrate synthase, distinct lineages of 2-isopropylmalate synthase, several distinct, uncharacterized, orthologous sets in the Archaea, and other related enzymes. This model describes a family of 2-isopropylmalate synthases as found in yeasts and in a minority of studied bacteria. [Amino acid biosynthesis, Pyruvate family] 564
17457 130044 TIGR00971 3a0106s03 sulfate/thiosulfate-binding protein. This model describes binding proteins functionally associated with the sulfate ABC transporter. In the model bacterium E. coli, two different members work with the same transporter; mutation analysis says each enables the uptake of both sulfate and thiosulfate. In many species, a single binding protein is found, and may be referred to in general terms as a sulfate ABC transporter sulfate-binding protein. [Transport and binding proteins, Anions] 315
17458 273372 TIGR00972 3a0107s01c2 phosphate ABC transporter, ATP-binding protein. This model represents the ATP-binding protein of a family of ABC transporters for inorganic phosphate. In the model species Escherichia coli, a constitutive transporter for inorganic phosphate, with low affinity, is also present. The high affinity transporter that includes this polypeptide is induced when extracellular phosphate concentrations are low. The proteins most similar to the members of this family but not included appear to be amino acid transporters. [Transport and binding proteins, Anions] 247
17459 130046 TIGR00973 leuA_bact 2-isopropylmalate synthase, bacterial type. This is the first enzyme of leucine biosynthesis. A larger family of homologous proteins includes homocitrate synthase, distinct lineages of 2-isopropylmalate synthase, several distinct, uncharacterized, orthologous sets in the Archaea, and other related enzymes. This model describes a family of 2-isopropylmalate synthases found primarily in Bacteria. The homologous families in the Archaea may represent isozymes and/or related enzymes. [Amino acid biosynthesis, Pyruvate family] 494
17460 273373 TIGR00974 3a0107s02c phosphate ABC transporter, permease protein PstA. This model describes PstA, one of a pair of permease proteins in the ABC (high affinity) phosphate transporter. In a number of species, this permease is fused with the PstC protein (TIGR02138). In the model bacterium Escherichia coli, this transport system is induced when the concentration of extracellular inorganic phosphate is low. A constitutive, lower affinity transporter operates otherwise. [Transport and binding proteins, Anions] 271
17461 273374 TIGR00975 3a0107s03 phosphate ABC transporter, phosphate-binding protein. This family represents one type of (periplasmic, in Gram-negative bacteria) phosphate-binding protein found in phosphate ABC (ATP-binding cassette) transporters. This protein is accompanied, generally in the same operon, by an ATP binding protein and (usually) two permease proteins. [Transport and binding proteins, Anions] 313
17462 273375 TIGR00976 /NonD putative hydrolase, CocE/NonD family. This model represents a protein subfamily that includes the cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). This family shows extensive, low-level similarity to a family of xaa-pro dipeptidyl-peptidases, and local similarity by PSI-BLAST to many other hydrolases. [Unknown function, Enzymes of unknown specificity] 550
17463 130050 TIGR00977 citramal_synth citramalate synthase. This model includes GSU1798 and is now known to represent citramalate synthase. Members are related to 2-isopropylmalate synthases and homocitrate synthases but phylogenetically distinct. The role is isoleucine biosynthesis, the first dedicated step. [Unknown function, General] 526
17464 273376 TIGR00978 asd_EA aspartate-semialdehyde dehydrogenase (non-peptidoglycan organisms). Two closely related families of aspartate-semialdehyde dehydrogenase are found. They differ by a deep split in phylogenetic and percent identity trees and in gap patterns. Separate models are built for the two types in order to exclude the USG-1 protein, found in several species, which is specifically related to the Bacillus subtilis type of aspartate-semialdehyde dehydrogenase. Members of this type are found primarily in organisms that lack peptidoglycan. [Amino acid biosynthesis, Aspartate family] 341
17465 130052 TIGR00979 fumC_II fumarate hydratase, class II. Putative fumarases from several species (Mycobacterium tuberculosis, Streptomyces coelicolor, Pseudomonas aeruginosa) branch deeply, although within the same branch of a phylogenetic tree rooted by aspartate ammonia-lyase sequences, and score between the trusted and noise cutoffs. [Energy metabolism, TCA cycle] 458
17466 130053 TIGR00980 3a0801so1tim17 mitochondrial import inner membrane translocase subunit tim17. [Transport and binding proteins, Amino acids, peptides and amines] 170
17467 130054 TIGR00981 rpsL_bact ribosomal protein S12, bacterial/organelle. This model recognizes ribosomal protein S12 of Bacteria, mitochondria, and chloroplasts. The homologous ribosomal proteins of Archaea and Eukarya, termed S23 in Eukarya and S12 or S23 in Archaea, score below the trusted cutoff. [Protein synthesis, Ribosomal proteins: synthesis and modification] 124
17468 273377 TIGR00982 uS12_E_A ribosomal protein uS12, eukaryotic/archaeal form. This model represents eukaryotic and archaeal forms of ribosomal protein uS12. This protein was known previously as S23 in eukaryotes and as either S12 or S23 in the Archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification] 139
17469 130056 TIGR00983 3a0801s02tim23 mitochondrial import inner membrane translocase subunit tim23. [Transport and binding proteins, Amino acids, peptides and amines] 149
17470 130057 TIGR00984 3a0801s03tim44 mitochondrial import inner membrane, translocase subunit. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tim proteins. [Transport and binding proteins, Amino acids, peptides and amines] 378
17471 273378 TIGR00985 3a0801s04tom mitochondrial import receptor subunit translocase of outer membrane 20 kDa subunit. [Transport and binding proteins, Amino acids, peptides and amines] 148
17472 273379 TIGR00986 3a0801s05tom22 mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom22 proteins. [Transport and binding proteins, Amino acids, peptides and amines] 145
17473 130060 TIGR00987 himA integration host factor, alpha subunit. This protein forms a site-specific DNA-binding heterodimer with the integration host factor beta subunit. It is closely related to the DNA-binding protein HU. [DNA metabolism, DNA replication, recombination, and repair] 96
17474 130061 TIGR00988 hip integration host factor, beta subunit. This protein forms a site-specific DNA-binding heterodimer with the homologous integration host factor alpha subunit. It is closely related to the DNA-binding protein HU. [DNA metabolism, DNA replication, recombination, and repair] 94
17475 130062 TIGR00989 3a0801s07tom40 mitochondrial import receptor subunit Tom40. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom40 proteins. [Transport and binding proteins, Amino acids, peptides and amines] 161
17476 273380 TIGR00990 3a0801s09 mitochondrial precursor proteins import receptor (72 kDa mitochondrial outermembrane protein) (mitochondrial import receptor for the ADP/ATP carrier) (translocase of outermembrane tom70). [Transport and binding proteins, Amino acids, peptides and amines] 615
17477 130064 TIGR00991 3a0901s02IAP34 GTP-binding protein (Chloroplast Envelope Protein Translocase). [Transport and binding proteins, Nucleosides, purines and pyrimidines] 313
17478 130065 TIGR00992 3a0901s03IAP75 chloroplast envelope protein translocase, IAP75 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the TOC IAP75 protein. [Transport and binding proteins, Amino acids, peptides and amines] 718
17479 273381 TIGR00993 3a0901s04IAP86 chloroplast protein import component Toc86/159, G and M domains. The long precursor of the 86K protein originally described is proposed to have three domains. The N-terminal A-domain is acidic, repetitive, weakly conserved, readily removed by proteolysis during chloroplast isolation, and not required for protein translocation. The other domains are designated G (GTPase) and M (membrane anchor); this family includes most of the G domain and all of M. [Transport and binding proteins, Amino acids, peptides and amines] 763
17480 273382 TIGR00994 3a0901s05TIC20 chloroplast protein import component, Tic20 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic20 protein. [Transport and binding proteins, Amino acids, peptides and amines] 267
17481 273383 TIGR00995 3a0901s06TIC22 chloroplast protein import component, Tic22 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic22 protein. [Transport and binding proteins, Amino acids, peptides and amines] 270
17482 273384 TIGR00996 Mtu_fam_mce virulence factor Mce family protein. Members of this paralogous family are found as six tandem homologous proteins in the same orientation per cassette, in four separate cassettes in Mycobacterium tuberculosis. The six members of each cassette represent six subfamilies. One subfamily includes the protein mce (mycobacterial cell entry), a virulence protein required for invasion of non-phagocytic cells. [Cellular processes, Pathogenesis] 291
17483 130070 TIGR00997 ispZ intracellular septation protein A. This partially characterized protein, whose absence can cause a cell division defect in an intracellularly replicating bacterium, is so far found only in the Proteobacteria. [Cellular processes, Cell division] 178
17484 273385 TIGR00998 8a0101 efflux pump membrane protein (multidrug resistance protein A). [Transport and binding proteins, Other] 334
17485 273386 TIGR00999 8a0102 Membrane Fusion Protein cluster 2 (function with RND porters). [Transport and binding proteins, Other] 265
17486 273387 TIGR01000 bacteriocin_acc bacteriocin secretion accessory protein. This family represents an accessory protein that works with the bacteriocin maturation and ABC transport secretion protein described by TIGR01193. [Transport and binding proteins, Other] 457
17487 130074 TIGR01001 metA homoserine O-succinyltransferase. The apparent equivalog from Bacillus subtilis is broken into two tandem reading frames. [Amino acid biosynthesis, Aspartate family] 300
17488 273388 TIGR01002 hlyII beta-channel forming cytolysin. This family of cytolytic pore-forming proteins includes alpha toxin and leukocidin F and S subunits from Staphylococcus aureus, hemolysin II of Bacillus cereus, and related toxins. [Cellular processes, Toxin production and resistance] 312
17489 273389 TIGR01003 PTS_HPr_family Phosphotransferase System HPr (HPr) Family. The HPr family are bacterial proteins (or domains of proteins) which function in phosphoryl transfer system (PTS) systems. They include energy-coupling components which catalyze sugar uptake via a group translocation mechanism. The functions of most of these proteins are not known, but they presumably function in PTS-related regulatory capacities. All seed members are stand-alone HPr proteins, although the model also recognizes HPr domains of PTS fusion proteins. This family includes the related NPr protein. [Signal transduction, PTS] 82
17490 273390 TIGR01004 PulS_OutS lipoprotein, PulS/OutS family. This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae, the OutS protein of Erwinia chrysanthemi and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. [Transport and binding proteins, Amino acids, peptides and amines] 128
17491 273391 TIGR01005 eps_transp_fam exopolysaccharide transport protein family. The model describes the exopolysaccharide transport protein family in bacteria. The transport protein is part of a large genetic locus which is associated with exopolysaccharide (EPS) biosynthesis. Detailed molecular characterization and gene fusion analysis revealed that at least seven gene products are involved in the overall regulation, which, among other things, includes exopolysaccharide biosynthesis, the property of conferring virulence, and exopolysaccharide export. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 764
17492 130079 TIGR01006 polys_exp_MPA1 polysaccharide export protein, MPA1 family, Gram-positive type. This family contains members from Low GC Gram-positive bacteria; they are proposed to have a function in the export of complex polysaccharides. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 226
17493 273392 TIGR01007 eps_fam capsular exopolysaccharide family. This model describes the capsular exopolysaccharide proteins in bacteria. The exopolysaccharide gene cluster consists of several genes which encode a number of proteins which regulate the exopolysaccharide biosynthesis (EPS). At least 13 genes espA to espM in Streptococcus species seem to direct the EPS proteins and all of which share high homology. Functional roles were characterized by gene disruption experiments which resulted in exopolysaccharide-deficient phenotypes. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 204
17494 273393 TIGR01008 uS3_euk_arch ribosomal protein uS3, eukaryotic/archaeal type. This model describes ribosomal protein S3 of the eukaryotic cytosol and of the archaea. TIGRFAMs model TIGR01009 describes the bacterial/organellar type, although the organellar types have a different architecture with long insertions and may score poorly. [Protein synthesis, Ribosomal proteins: synthesis and modification] 195
17495 130082 TIGR01009 rpsC_bact ribosomal protein S3, bacterial type. This model describes the bacterial type of ribosomal protein S3. Chloroplast and mitochondrial forms have large, variable inserts between conserved N-terminal and C-terminal domains. This model recognizes all bacterial forms and many chloroplast forms above the trusted cutoff score. TIGRFAMs model TIGR01008 describes S3 of the eukaryotic cytosol and of the archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification] 211
17496 130083 TIGR01010 BexC_CtrB_KpsE polysaccharide export inner-membrane protein, BexC/CtrB/KpsE family. This family contains gamma proteobacterial proteins involved in capsule polysaccharide export. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 362
17497 130084 TIGR01011 rpsB_bact ribosomal protein S2, bacterial type. This model describes the bacterial, ribosomal, and chloroplast forms of ribosomal protein S2. TIGR01012 describes the archaeal and cytosolic forms. [Protein synthesis, Ribosomal proteins: synthesis and modification] 225
17498 273394 TIGR01012 uS2_euk_arch ribosomal protein uS2, eukaryotic/archaeal form. This model describes the ribosomal protein of the cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea. TIGR01011 describes the related protein of organelles and bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification] 196
17499 162157 TIGR01013 2a58 Phosphate:Na+ Symporter (PNaS) Family. [Transport and binding proteins, Cations and iron carrying compounds] 456
17500 273395 TIGR01015 hmgA homogentisate 1,2-dioxygenase. Missing in human disease alkaptonuria. [Energy metabolism, Amino acids and amines] 429
17501 273396 TIGR01016 sucCoAbeta succinyl-CoA synthetase, beta subunit. This model is designated subfamily because it does not discriminate the ADP-forming enzyme (EC 6.2.1.5) from the GDP-forming (EC 6.2.1.4) enzyme. The N-terminal half is described by the CoA-ligases model (pfam00549). The C-terminal half is described by the ATP-grasp model (pfam02222). This family contains a split seen both in a maximum parsimony tree (which ignores gaps) and in the gap pattern near position 85 of the seed alignment. Eukaryotic and most bacterial sequences are longer and contain a region similar to TXQTXXXG. Sequences from Deinococcus radiodurans, Mycobacterium tuberculosis, Streptomyces coelicolor, and the Archaea are 6 amino acids shorter in that region and contain a motif resembling [KR]G [Energy metabolism, TCA cycle] 386
17502 200066 TIGR01017 rpsD_bact ribosomal protein S4, bacterial/organelle type. This model finds organelle (chloroplast and mitochondrial) ribosomal protein S4 as well as bacterial ribosomal protein S4. [Protein synthesis, Ribosomal proteins: synthesis and modification] 200
17503 273397 TIGR01018 uS4_arch ribosomal protein uS4, eukaryotic/archaeal type. This model finds eukaryotic ribosomal protein S9 as well as archaeal ribosomal protein S4. [Protein synthesis, Ribosomal proteins: synthesis and modification] 162
17504 130091 TIGR01019 sucCoAalpha succinyl-CoA synthetase, alpha subunit. This model describes succinyl-CoA synthetase alpha subunits but does not discriminate between GTP-specific and ATP-specific reactions. The model is designated as subfamily rather than equivalog for that reason. ATP citrate lyases appear to form an outgroup. [Energy metabolism, TCA cycle] 286
17505 273398 TIGR01020 uS5_euk_arch ribosomal protein uS5, eukaryotic/archaeal form. This model finds eukaryotic ribosomal protein uS5 (previously S2 in yeast and human) as well as archaeal ribosomal protein uS5. [Protein synthesis, Ribosomal proteins: synthesis and modification] 212
17506 130093 TIGR01021 rpsE_bact ribosomal protein S5, bacterial/organelle type. This model finds chloroplast ribosomal protein S5 as well as bacterial ribosomal protein S5. A candidate mitochondrial form (Saccharomyces cerevisiae YBR251W and its homolog) differs substantially and is not included in this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 154
17507 130094 TIGR01022 rpmJ_bact ribosomal protein L36, bacterial type. Proteins found by this model occur exclusively in bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification] 37
17508 273399 TIGR01023 rpmG_bact ribosomal protein L33, bacterial type. This model describes bacterial ribosomal protein L33 and its chloroplast and mitochondrial equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification] 54
17509 130096 TIGR01024 rplS_bact ribosomal protein L19, bacterial type. This model describes bacterial ribosomal protein L19 and its chloroplast equivalent. Putative mitochondrial L19 are found in several species (but not Saccharomyces cerevisiae) and score between trusted and noise cutoffs. [Protein synthesis, Ribosomal proteins: synthesis and modification] 113
17510 273400 TIGR01025 uS19_arch ribosomal protein uS19, eukaryotic/archaeal form. This model represents eukaryotic ribosomal protein uS19 (previously S15) and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15. [Protein synthesis, Ribosomal proteins: synthesis and modification] 135
17511 273401 TIGR01026 fliI_yscN ATPase, FliI/YscN family. This family of ATPases demonstrates extensive homology with ATP synthase F1, beta subunit. It is a mixture of members with two different protein functions. The first group is exemplified by Salmonella typhimurium FliI protein. It is needed for flagellar assembly, its ATPase activity is required for flagellation, and it may be involved in a specialized protein export pathway that proceeds without signal peptide cleavage. The second group of proteins function in the export of virulence proteins; exemplified by Yersinia sp. YscN protein an ATPase involved in the type III secretory pathway for the antihost Yops proteins. [Energy metabolism, ATP-proton motive force interconversion] 440
17512 162163 TIGR01027 proB glutamate 5-kinase. Bacterial ProB proteins hit the full length of this model, but the ProB-like domain of delta 1-pyrroline-5-carboxylate synthetase does not hit the C-terminal 100 residues of this model. The noise cutoff is set low enough to hit delta 1-pyrroline-5-carboxylate synthetase and other partial matches to this family. [Amino acid biosynthesis, Glutamate family] 363
17513 273402 TIGR01028 uS7_euk_arch ribosomal protein uS7, eukaryotic/archaeal. This model describes the members from the eukaryotic cytosol and the Archaea of the family that includes ribosomal protein uS7 (previously S5 in yeast and human). A separate model describes bacterial and organellar S7. [Protein synthesis, Ribosomal proteins: synthesis and modification] 186
17514 273403 TIGR01029 rpsG_bact ribosomal protein S7, bacterial/organelle. This model describes the bacterial and organellar branch of the ribosomal protein S7 family (includes prokaryotic S7 and eukaryotic S5). The eukaryotic and archaeal branch is described by model TIGR01028. [Protein synthesis, Ribosomal proteins: synthesis and modification] 154
17515 130102 TIGR01030 rpmH_bact ribosomal protein L34, bacterial type. This model describes the bacterial protein L34 and its equivalents in organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification] 44
17516 273404 TIGR01031 rpmF_bact ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification] 55
17517 130104 TIGR01032 rplT_bact ribosomal protein L20. This model describes bacterial ribosomal protein L20 and its chloroplast equivalent. This protein binds directly to 23s ribosomal RNA and is necessary for the in vitro assembly process of the 50s ribosomal subunit. It is not involved in the protein synthesizing functions of that subunit. GO process changed accordingly (SS 5/09/03) [Protein synthesis, Ribosomal proteins: synthesis and modification] 113
17518 273405 TIGR01033 TIGR01033 DNA-binding regulatory protein, YebC/PmpR family. This model describes a minimally characterized protein family, restricted to bacteria excepting for some eukaryotic sequences that have possible transit peptides. YebC from E. coli is crystallized, and PA0964 from Pseudomonas aeruginosa has been shown to be a sequence-specific DNA-binding regulatory protein. In silico analysis suggests a role in Holliday junction resolution. [Regulatory functions, DNA interactions] 238
17519 273406 TIGR01034 metK S-adenosylmethionine synthetase. Tandem isozymes of this S-adenosylmethionine synthetase in E. coli are designated MetK and MetX. [Central intermediary metabolism, Other] 377
17520 273407 TIGR01035 hemA glutamyl-tRNA reductase. This enzyme, together with glutamate-1-semialdehyde-2,1-aminomutase (TIGR00713), leads to the production of delta-amino-levulinic acid from Glu-tRNA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 417
17521 273408 TIGR01036 pyrD_sub2 dihydroorotate dehydrogenase, subfamily 2. This model describes enzyme protein dihydroorotate dehydrogenase exclusively for subfamily 2. It includes members from bacteria, yeast, plants etc. The subfamilies 1 and 2 share extensive homology, particularly toward the C-terminus. This subfamily has a longer N-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 335
17522 130109 TIGR01037 pyrD_sub1_fam dihydroorotate dehydrogenase (subfamily 1) family protein. This family includes subfamily 1 dihydroorotate dehydrogenases while excluding the closely related subfamily 2 (TIGR01036). This family also includes a number of uncharacterized proteins and a domain of dihydropyrimidine dehydrogenase. The uncharacterized proteins might all be dihydroorotate dehydrogenase. 300
17523 273409 TIGR01038 uL22_arch_euk ribosomal protein uL22, eukaryotic/archaeal form. This model describes the ribosomal protein uL22 of the eukaryotic cytosol and of the Archaea, previously designated as L17, L22, and L23. The corresponding bacterial form of uL22 is described by a separate model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 148
17524 211621 TIGR01039 atpD ATP synthase, F1 beta subunit. The sequences of ATP synthase F1 alpha and beta subunits are related and both contain a nucleotide-binding site for ATP and ADP. They have a common amino terminal domain but vary at the C-terminus. The beta chain has catalytic activity, while the alpha chain is a regulatory subunit. Proton translocating ATP synthase, F1 beta subunit is homologous to proton translocating ATP synthase archaeal/vacuolar(V1), A subunit. [Energy metabolism, ATP-proton motive force interconversion] 461
17525 273410 TIGR01040 V-ATPase_V1_B V-type (H+)-ATPase V1, B subunit. This models eukaryotic vacuolar (H+)-ATPase that is responsible for acidifying cellular compartments. This enzyme shares extensive sequence similarity with archaeal ATP synthase. [Transport and binding proteins, Cations and iron carrying compounds] 466
17526 200071 TIGR01041 ATP_syn_B_arch ATP synthase archaeal, B subunit. Archaeal ATP synthase shares extensive sequence similarity with eukaryotic and prokaryotic V-type (H+)-ATPases. [Energy metabolism, ATP-proton motive force interconversion] 458
17527 273411 TIGR01042 V-ATPase_V1_A V-type (H+)-ATPase V1, A subunit. This models eukaryotic vacuolar (H+)-ATPase that is responsible for acidifying cellular compartments. This enzyme shares extensive sequence similarity with archaeal ATP synthase. [Transport and binding proteins, Cations and iron carrying compounds] 591
17528 130115 TIGR01043 ATP_syn_A_arch ATP synthase archaeal, A subunit. Archaeal ATP synthase shares extensive sequence similarity with eukaryotic and prokaryotic V-type (H+)-ATPases. [Energy metabolism, ATP-proton motive force interconversion] 578
17529 130116 TIGR01044 rplV_bact ribosomal protein L22, bacterial type. This model describes bacterial and chloroplast ribosomal protein L22. [Protein synthesis, Ribosomal proteins: synthesis and modification] 103
17530 273412 TIGR01045 RPE1 Rickettsial palindromic element RPE1 domain. This model describes protein translations of the first family described, RPE1, of Rickettsia palindromic elements (RPE). In Rickettsia conorii, 19 copies are found within protein coding regions, where they encode an insert relative to homologs from other species but do not disrupt the reading frame. Insertion is always in the same reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else. 46
17531 273413 TIGR01046 uS10_euk_arch ribosomal protein uS10, eukaryotic/archaeal. This model describes the archaeal ribosomal protein uS10 and its equivalents (previously called S20) in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification] 99
17532 273414 TIGR01047 nspC carboxynorspermidine decarboxylase. This protein is related to diaminopimelate decarboxylase. It is the last enzyme in norspermidine biosynthesis by an unusual pathway shown in Vibrio alginolyticus. [Central intermediary metabolism, Polyamine biosynthesis] 379
17533 273415 TIGR01048 lysA diaminopimelate decarboxylase. This family consists of diaminopimelate decarboxylase, an enzyme which catalyzes the conversion of diaminopimelic acid into lysine during the last step of lysine biosynthesis. [Amino acid biosynthesis, Aspartate family] 414
17534 130121 TIGR01049 rpsJ_bact ribosomal protein S10, bacterial/organelle. This model describes bacterial 30S ribosomal protein S10. In species that have a transcription antitermination complex, or N utilization substance, with NusA, NusB, NusG, and NusE, this ribosomal protein is responsible for NusE activity. Included in the family are one member each from Saccharomyces cerevisiae and Schizosaccharomyces pombe. These proteins lack an N-terminal mitochondrial transit peptide but contain additional sequence C-terminal to the ribosomal S10 protein region. [Protein synthesis, Ribosomal proteins: synthesis and modification] 99
17535 130122 TIGR01050 rpsS_bact ribosomal protein S19, bacterial/organelle. The homologous protein of the eukaryotic cytosol and of the Archaea may be designated S15 or S19. [Protein synthesis, Ribosomal proteins: synthesis and modification] 92
17536 273416 TIGR01051 topA_bact DNA topoisomerase I, bacterial. This model describes DNA topoisomerase I among the members of bacteria. DNA topoisomerase I transiently cleaves one DNA strand and thus relaxes negatively supercoiled DNA during replication, transcription and recombination events. [DNA metabolism, DNA replication, recombination, and repair] 610
17537 273417 TIGR01052 top6b DNA topoisomerase VI, B subunit. This model describes DNA topoisomerase VI, an archaeal type II DNA topoisomerase (DNA gyrase). [DNA metabolism, DNA replication, recombination, and repair] 488
17538 130125 TIGR01053 LSD1 zinc finger domain, LSD1 subclass. This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC 31
17539 273418 TIGR01054 rgy reverse gyrase. This model describes reverse gyrase, found in both archaeal and bacterial thermophiles. This enzyme, a fusion of a type I topoisomerase domain and a helicase domain, introduces positive supercoiling to increase the melting temperature of DNA double strands. Generally, these gyrases are encoded as a single polypeptide. An exception was found in Methanopyrus kandleri, where enzyme is split within the topoisomerase domain, yielding a heterodimer of gene products designated RgyB and RgyA. [DNA metabolism, DNA replication, recombination, and repair] 1171
17540 130127 TIGR01055 parE_Gneg DNA topoisomerase IV, B subunit, proteobacterial. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. This protein is active as an alpha(2)beta(2) heterotetramer. [DNA metabolism, DNA replication, recombination, and repair] 625
17541 273419 TIGR01056 topB DNA topoisomerase III, bacteria and conjugative plasmid. This model describes topoisomerase III from bacteria and its equivalents encoded on plasmids. The gene is designated topB if found in the bacterial chromosome, traE on conjugative plasmid RP4, etc. These enzymes are involved in the control of DNA topology. DNA topoisomerase III belongs to the type I topoisomerases, which are ATP-independent. [DNA metabolism, DNA replication, recombination, and repair] 660
17542 273420 TIGR01057 topA_arch DNA topoisomerase I, archaeal. This model describes topoisomerase I from archaea. These enzymes are involved in the control of DNA topology. DNA topoisomerase I belongs to the type I topoisomerases, which are ATP-independent. [DNA metabolism, DNA replication, recombination, and repair] 618
17543 130130 TIGR01058 parE_Gpos DNA topoisomerase IV, B subunit, Gram-positive. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair] 637
17544 273421 TIGR01059 gyrB DNA gyrase, B subunit. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. Proteins scoring above the noise cutoff for this model and below the trusted cutoff for topoisomerase IV models probably should be designated GyrB. [DNA metabolism, DNA replication, recombination, and repair] 654
17545 213580 TIGR01060 eno phosphopyruvate hydratase. Alternate name: enolase [Energy metabolism, Glycolysis/gluconeogenesis] 425
17546 273422 TIGR01061 parC_Gpos DNA topoisomerase IV, A subunit, Gram-positive. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair] 738
17547 130134 TIGR01062 parC_Gneg DNA topoisomerase IV, A subunit, proteobacterial. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair] 735
17548 273423 TIGR01063 gyrA DNA gyrase, A subunit. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. [DNA metabolism, DNA replication, recombination, and repair] 800
17549 273424 TIGR01064 pyruv_kin pyruvate kinase. This enzyme is a homotetramer. Some forms are active only in the presence of fructose-1,6-bisphosphate or similar phosphorylated sugars. [Energy metabolism, Glycolysis/gluconeogenesis] 472
17550 273425 TIGR01065 hlyIII channel protein, hemolysin III family. This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens and Drosophila. In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role. 204
17551 162186 TIGR01066 rplM_bact ribosomal protein L13, bacterial type. This model distinguishes ribosomal protein L13 of bacteria and organelles from its eukaryotic and archaeal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification] 140
17552 273426 TIGR01067 rplN_bact ribosomal protein L14, bacterial/organelle. This model distinguishes bacterial and most organellar examples of ribosomal protein L14 from all archaeal and eukaryotic forms. [Protein synthesis, Ribosomal proteins: synthesis and modification] 122
17553 200072 TIGR01068 thioredoxin thioredoxin. Several proteins, such as protein disulfide isomerase, have two or more copies of a domain closely related to thioredoxin. This model is designed to recognize authentic thioredoxin, a small protein that should be hit exactly once by this model. Any protein that hits once with a score greater than the second (per domain) trusted cutoff may be taken as thioredoxin. [Energy metabolism, Electron transport] 101
17554 130141 TIGR01069 mutS2 MutS2 family protein. Function of MutS2 is unknown. It should not be considered a DNA mismatch repair protein. It is likely a DNA mismatch binding protein of unknown cellular function. [DNA metabolism, Other] 771
17555 273427 TIGR01070 mutS1 DNA mismatch repair protein MutS. [DNA metabolism, DNA replication, recombination, and repair] 840
17556 273428 TIGR01071 rplO_bact ribosomal protein L15, bacterial/organelle. [Protein synthesis, Ribosomal proteins: synthesis and modification] 145
17557 162190 TIGR01072 murA UDP-N-acetylglucosamine 1-carboxyvinyltransferase. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 416
17558 273429 TIGR01073 pcrA ATP-dependent DNA helicase PcrA. Designed to identify pcrA members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair] 726
17559 130146 TIGR01074 rep ATP-dependent DNA helicase Rep. Designed to identify rep members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair] 664
17560 130147 TIGR01075 uvrD DNA helicase II. Designed to identify uvrD members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair] 715
17561 130148 TIGR01076 sortase_fam LPXTG-site transpeptidase (sortase) family protein. This family includes Staphylococcus aureus sortase, a transpeptidase that attaches surface proteins by the Thr of an LPXTG motif to the cell wall. It also includes a protein required for correct assembly of an LPXTG-containing fimbrial protein, a set of homologous proteins from Streptococcus pneumoniae, in which LPXTG proteins are common. However, related proteins are found in Bacillus subtilis and Methanobacterium thermoautotrophicum, in which LPXTG-mediated cell wall attachment is not known. [Cell envelope, Other, Protein fate, Protein and peptide secretion and trafficking] 136
17562 162192 TIGR01077 L13_A_E ribosomal protein uL13, archaeal/eukaryotic form. This model represents ribosomal protein of L13 from the Archaea and from the eukaryotic cytosol. Bacterial and organellar forms are represented by model TIGR01066. [Protein synthesis, Ribosomal proteins: synthesis and modification] 142
17563 273430 TIGR01078 arcA arginine deiminase. Arginine deiminase is the first enzyme of the arginine deiminase pathway of arginine degradation. [Energy metabolism, Amino acids and amines] 405
17564 273431 TIGR01079 rplX_bact ribosomal protein L24, bacterial/organelle. This model recognizes bacterial and organellar forms of ribosomal protein L24. It excludes eukaryotic and archaeal forms, designated L26 in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification] 102
17565 273432 TIGR01080 rplX_A_E ribosomal protein uL24, archaeal/eukaryotic form. This model represents the archaeal and eukaryotic branch of the ribosomal protein L24p/L26e family. Bacterial and organellar forms are represented by related model TIGR01079. [Protein synthesis, Ribosomal proteins: synthesis and modification] 114
17566 130153 TIGR01081 mpl UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase. Alternate name: murein tripeptide ligase [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 448
17567 273433 TIGR01082 murC UDP-N-acetylmuramate--L-alanine ligase. This model describes the MurC protein in bacterial peptidoglycan (murein) biosynthesis. In a few species (Mycobacterium leprae, the Chlamydia), the amino acid may be L-serine or glycine instead of L-alanine. A related protein, UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase (murein tripeptide ligase) is described by model TIGR01081. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 448
17568 273434 TIGR01083 nth endonuclease III. This equivalog model identifies nth members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. [DNA metabolism, DNA replication, recombination, and repair] 192
17569 130156 TIGR01084 mutY A/G-specific adenine glycosylase. This equivalog model identifies mutY members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. [DNA metabolism, DNA replication, recombination, and repair] 275
17570 273435 TIGR01085 murE UDP-N-acetylmuramyl-tripeptide synthetase. Most members of this family are EC 6.3.2.13, UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,6-diaminopimelate ligase. An exception is Staphylococcus aureus, in which diaminopimelate is replaced by lysine in the peptidoglycan and MurE is EC 6.3.2.7. The Mycobacteria, part of the closest neighboring branch outside of the low-GC Gram-positive bacteria, use diaminopimelate. A close homolog, scoring just below the trusted cutoff, is found (with introns) in Arabidopsis thaliana. Its role is unknown. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 464
17571 188107 TIGR01086 fucA L-fuculose phosphate aldolase. Members of this family are L-fuculose phosphate aldolase from various Proteobacteria, encoded in fucose utilization operons. Homologs in other bacteria given similar annotation but scoring below the trusted cutoff may share extensive sequence similarity but are not experimentally characterized and are not found in apparent fucose utilization operons; we consider their annotation as L-fuculose phosphate aldolase to be tenuous. This model has been narrowed in scope from the previous version. [Energy metabolism, Sugars] 214
17572 273436 TIGR01087 murD UDP-N-acetylmuramoylalanine--D-glutamate ligase. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 433
17573 130160 TIGR01088 aroQ 3-dehydroquinate dehydratase, type II. This model specifies the type II enzyme. The type I enzyme, often found as part of a multifunctional protein, is described by TIGR01093. [Amino acid biosynthesis, Aromatic amino acid family] 141
17574 130161 TIGR01089 fucI L-fucose isomerase. This enzyme catalyzes the first step in fucose metabolism, and has been characterized in Escherichia coli and Bacteroides thetaiotaomicron. [Energy metabolism, Sugars] 587
17575 273437 TIGR01090 apt adenine phosphoribosyltransferase. A phylogenetic analysis suggested omitting the bi-directional best hit homologs from the spirochetes from the seed for this model and making only tentative predictions of adenine phosphoribosyltransferase function for this lineage. The trusted cutoff score is made high for this reason. Most proteins scoring between the trusted and noise cutoffs are likely to act as adenine phosphotransferase. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 169
17576 273438 TIGR01091 upp uracil phosphoribosyltransferase. A fairly deep split in phylogenetic and UPGMA trees separates this mostly prokaryotic set of uracil phosphoribosyltransferases from a mostly eukaryotic set that includes uracil phosphoribosyltransferase, uridine kinases, and other, uncharacterized proteins. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 207
17577 130164 TIGR01092 P5CS delta 1-pyrroline-5-carboxylate synthetase. This protein contains a glutamate 5-kinase (ProB, EC 2.7.2.11) region followed by a gamma-glutamyl phosphate reductase (ProA, EC 1.2.1.41) region. [Amino acid biosynthesis, Glutamate family] 715
17578 273439 TIGR01093 aroD 3-dehydroquinate dehydratase, type I. This model detects 3-dehydroquinate dehydratase, type I, either as a monofunctional protein or as a domain of a larger, multifunctional protein. It is often found fused to shikimate 5-dehydrogenase (EC 1.1.1.25), and sometimes additional domains. Type II 3-dehydroquinate dehydratase, designated AroQ, is described by the model TIGR01088. [Amino acid biosynthesis, Aromatic amino acid family] 229
17579 273440 TIGR01096 3A0103s03R lysine-arginine-ornithine-binding periplasmic protein. [Transport and binding proteins, Amino acids, peptides and amines] 250
17580 273441 TIGR01097 PhnE phosphonate ABC transporter, permease protein PhnE. Phosphonates are a class of compound analogous to organic phosphates, but in which the C-O-P linkage is replaced by a direct, stable C-P bond. Some bacteria can utilize phosphonates as a source of phosphorus. This family consists of permease proteins of known or predicted phosphonate ABC transporters. Often this protein is found as a duplicated pair, occasionally as a fused pair. Certain "second" copies score in between the trusted and noise cutoff and should be considered true hits (by context). [Transport and binding proteins, Anions] 250
17581 273442 TIGR01098 3A0109s03R phosphate/phosphite/phosphonate ABC transporter, periplasmic binding protein. Phosphonates are a varied class of phosphorus-containing organic compound in which a direct C-P bond is found, rather than a C-O-P linkage of the phosphorus through an oxygen atom. They may be toxic but also may be used as sources of phosphorus and energy by various bacteria. Phosphonate utilization systems typically are encoded in 14 or more genes, including a three gene ABC transporter. This family includes the periplasmic binding protein component of ABC transporters for phosphonates as well as other, related binding components for closely related substances such as phosphate and phosphite. A number of members of this family are found in genomic contexts with components of selenium metabolic processes suggestive of a role in selenate or other selenium-compound transport. A subset of this model in which nearly all members exhibit genomic context with elements of phosphonate metabolism, particularly the C-P lyase system (GenProp0232) has been built (TIGR03431) as an equivalog. Nevertheless, there are members of this subfamily (TIGR01098) which show up sporadically on a phylogenetic tree that also show phosphonate context and are most likely competent to transport phosphonates. [Transport and binding proteins, Anions] 254
17582 273443 TIGR01099 galU UTP--glucose-1-phosphate uridylyltransferase. Built to distinguish between the highly similar genes galU and galF. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 260
17583 130170 TIGR01100 V_ATP_synt_C vacuolar ATP synthase 16 kDa proteolipid subunit. This model describes the vacuolar ATP synthase 16 kDa proteolipid subunit in eukaryotes and includes members from diverse groups, e.g., fungi, plants, parasites etc. The principal role of V-ATPases is the acidification of intracellular compartments of eukaryotic cells. [Transport and binding proteins, Cations and iron carrying compounds] 108
17584 130171 TIGR01101 V_ATP_synt_F vacuolar ATP synthase F subunit. This model describes the vacuolar ATP synthase F subunit (14 kDa subunit) in eukaryotes. In some archaeal species this protein subunit is referred to as the G subunit. [Transport and binding proteins, Cations and iron carrying compounds] 115
17585 130172 TIGR01102 yscR type III secretion apparatus protein, YscR/HrcR family. This model identifies the generic virulence translocation proteins in bacteria. It derives its name:'Yop' from Yersinia enterocolitica species, where this virulence protein was identified. In bacterial pathogenesis, Yop effector proteins are translocated into the eukaryotic cells. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 202
17586 211625 TIGR01103 fliP flagellar biosynthetic protein FliP. This model describes bacterial flagellar biogenesis protein fliP, which is one of the genes in motility locus on the bacterial chromosome that is involved in structure and function of bacterial flagellum. It was demonstrated that mutants in fliP locus were non-flagellated and non-motile, while revertants were flagellated and motile. [Cellular processes, Chemotaxis and motility] 197
17587 273444 TIGR01104 V_PPase vacuolar-type H(+)-translocating pyrophosphatase. This model describes proton pyrophosphatases from eukaryotes (predominantly plants), archaea and bacteria. It is an integral membrane protein and is suggested to have about 15 membrane spanning domains. Proton translocating inorganic pyrophosphatase, like H(+)-ATPase, acidifies the vacuoles and is pivotal to the vacuolar secondary active transport systems in plants. [Transport and binding proteins, Cations and iron carrying compounds] 695
17588 130175 TIGR01105 galF UTP-glucose-1-phosphate uridylyltransferase, non-catalytic GalF subunit. GalF is a non-catalytic subunit of the UTP-glucose pyrophosphorylase modulating the enzyme activity to increase the formation of UDP-glucose [Regulatory functions, Protein interactions] 297
17589 273445 TIGR01106 ATPase-IIC_X-K sodium or proton efflux -- potassium uptake antiporter, P-type ATPase, alpha subunit. This model describes the P-type ATPases responsible for the exchange of either protons or sodium ions for potassium ions across the plasma membranes of eukaryotes. Unlike most other P-type ATPases, members of this subfamily require a beta subunit for activity. This model encompasses eukaryotes and consists of two functional types, a Na/K antiporter found widely distributed in eukaryotes and a H/K antiporter found only in vertebrates. The Na+ or H+/K+ antiporter P-type ATPases have been characterized as Type IIC based on a published phylogenetic analysis. Sequences from Blastocladiella emersonii (GP|6636502, GP|6636502 and PIR|T43025), C. elegans (GP|2315419, GP|6671808 and PIR|T31763) and Drosophila melanogaster (GP|7291424) score below trusted cutoff, apparently due to long branch length (excessive divergence from the last common ancestor) as evidenced by a phylogenetic tree. Experimental evidence is needed to determine whether these sequences represent ATPases with conserved function. Aside from fragments, other sequences between trusted and noise appear to be bacterial ATPases of unclear lineage, but most likely calcium pumps. [Energy metabolism, ATP-proton motive force interconversion] 997
17590 273446 TIGR01107 Na_K_ATPase_bet Sodium Potassium ATPase beta subunit. This model describes the Na+/K+ ATPase beta subunit in eukaryotes. Na+/K+ ATPase(also called Sodium-Potassium pump) is intimately associated with the plasma membrane. It couples the energy released by the hydrolysis of ATP to extrude 3 Na+ ions, with the concomitant uptake of 2K+ ions, against their ionic gradients. [Transport and binding proteins, Cations and iron carrying compounds] 290
17591 273447 TIGR01108 oadA oxaloacetate decarboxylase alpha subunit. This model describes the bacterial oxaloacetate decarboxylase alpha subunit and its equivalents in archaea. The oxaloacetate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases that are present in bacteria and archaea. It is a multi-subunit enzyme consisting of a peripheral alpha-subunit and integral membrane subunits beta and gamma. The energy released by the decarboxylation reaction of oxaloacetate is coupled to Na+ ion pumping across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other] 582
17592 130179 TIGR01109 Na_pump_decarbB sodium ion-translocating decarboxylase, beta subunit. This model describes the beta subunits of sodium pump decarboxylases that include oxaloacetate decarboxylase, methylmalonyl-CoA decarboxylase, and glutaconyl-CoA decarboxylase. Beta and gamma-subunits are integral membrane proteins, while alpha is membrane bound. Catalytically, the energy released by the decarboxylation reaction is coupled to the extrusion of Na+ ions across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other] 354
17593 273448 TIGR01110 mdcA malonate decarboxylase, alpha subunit. This model describes malonate decarboxylase alpha subunit, from both the water-soluble form as found in Klebsiella pneumoniae and the form coupled to sodium ion pumping in Malonomonas rubra. Malonate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases. Essentially, it couples the energy derived from decarboxylation of a carboxylic acid substrate to move Na+ ion across the bilayer. Functional malonate decarboxylase is a multi-subunit protein. The alpha subunit enzymatically performs the transfer of malonate (substrate) to an acyl carrier protein subunit for subsequent decarboxylation, hence the name: acetyl-S-acyl carrier protein:malonate carrier protein-SH transferase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other] 543
17594 130181 TIGR01111 mtrA N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit A. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit A in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase (encoded by subunit A) is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds] 238
17595 273449 TIGR01112 mtrD N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit D. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit D in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis] 223
17596 130183 TIGR01113 mtrE N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit E. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit E in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis] 283
17597 273450 TIGR01114 mtrH N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit H. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit H in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Methanogenesis] 314
17598 273451 TIGR01115 pufM photosynthetic reaction center M subunit. This model describes the photosynthetic reaction center M subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 305
17599 273452 TIGR01116 ATPase-IIA1_Ca sarco/endoplasmic reticulum calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the endoplasmic reticulum membrane of eukaryotes, and is of particular importance in the sarcoplasmic reticulum of skeletal and cardiac muscle in vertebrates. These pumps transfer Ca2+ from the cytoplasm to the lumen of the endoplasmic reticulum. In humans and mice, at least, there are multiple isoforms of the SERCA pump with overlapping but not redundant functions. Defects in SERCA isoforms are associated with diseases in humans. The calcium P-type ATPases have been characterized as Type IIA based on a phylogenetic analysis which distinguishes this group from the Type IIB PMCA calcium pump modelled by TIGR01517. A separate analysis divides Type IIA into sub-types, SERCA and PMR1, the latter of which is modelled by TIGR01522. [Transport and binding proteins, Cations and iron carrying compounds] 917
17600 130187 TIGR01117 mmdA methylmalonyl-CoA decarboxylase alpha subunit. This model describes methylmalonyl-CoA decarboxylase alpha subunit in archaea and bacteria. Methylmalonyl-CoA decarboxylase Na+ pump is a representative of a class of Na+ transport decarboxylases that couple the energy derived from decarboxylation of carboxylic acid substrates to the extrusion of Na+ ions across the membrane. [Energy metabolism, ATP-proton motive force interconversion, Energy metabolism, Fermentation, Transport and binding proteins, Cations and iron carrying compounds] 512
17601 130188 TIGR01118 lacA galactose-6-phosphate isomerase, LacA subunit. This family contains members from low GC gram-positive bacteria. Galactose-6-phosphate isomerase is involved in lactose catabolism by the tagatose-6-phosphate pathway. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 141
17602 130189 TIGR01119 lacB galactose-6-phosphate isomerase, LacB subunit. This family contains four members from low GC gram-positive bacteria. Galactose-6-phosphate isomerase is involved in lactose catabolism by the tagatose-6-phosphate pathway. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 171
17603 130190 TIGR01120 rpiB ribose 5-phosphate isomerase B. Involved in the non-oxidative branch of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway] 143
17604 130191 TIGR01121 D_amino_aminoT D-amino acid aminotransferase. This enzyme is a homodimer. The pyridoxal phosphate attachment site is the Lys at position 146 of the seed alignment, in the motif Cys-Asp-Ile-Lys-Ser-Leu-Asn. Specificity is broad for various D-amino acids, and differs among members of the family; the family is designated equivalog, but with this caveat attached. [Energy metabolism, Amino acids and amines] 276
17605 130192 TIGR01122 ilvE_I branched-chain amino acid aminotransferase, group I. Among the class IV aminotransferases are two phylogenetically separable groups of branched-chain amino acid aminotransferase (IlvE). The last common ancestor of the two lineages appears also to have given rise to a family of D-amino acid aminotransferases (DAAT). This model represents the IlvE family more strongly similar to the DAAT family. [Amino acid biosynthesis, Pyruvate family] 298
17606 233278 TIGR01123 ilvE_II branched-chain amino acid aminotransferase, group II. Among the class IV aminotransferases are two phylogenetically separable groups of branched-chain amino acid aminotransferase (IlvE). The last common ancestor of the two lineages appears also to have given rise to a family of D-amino acid aminotransferases (DAAT). This model represents the IlvE family less similar to the DAAT family. [Amino acid biosynthesis, Pyruvate family] 313
17607 130194 TIGR01124 ilvA_2Cterm threonine ammonia-lyase, biosynthetic, long form. This model describes a form of threonine ammonia-lyase, a pyridoxal-phosphate dependent enzyme, with two copies of the threonine dehydratase C-terminal domain (pfam00585). Members with known function participate in isoleucine biosynthesis and are inhibited by isoleucine. Alternate name: threonine deaminase, threonine dehydratase. Forms scoring between the trusted and noise cutoff tend to branch with this subgroup of threonine ammonia-lyase phylogenetically but have only a single copy of the C-terminal domain. [Amino acid biosynthesis, Pyruvate family] 499
17608 273453 TIGR01125 TIGR01125 ribosomal protein S12 methylthiotransferase RimO. Members of this protein are the methylthiotransferase RimO, which modifies a conserved Asp residue in ribosomal protein S12. This clade of radical SAM family proteins is closely related to the tRNA modification bifunctional enzyme MiaB (see TIGR01574), and it catalyzes the same two types of reactions: a radical-mechanism sulfur insertion, and a methylation of the inserted sulfur. This clade spans alpha and gamma proteobacteria, cyano bacteria, Deinococcus, Porphyromonas, Aquifex, Helicobacter, Campylobacter, Thermotoga, Chlamydia, Streptococcus coelicolor and Clostridium, but does not include most other gram positive bacteria, archaea or eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification] 426
17609 273454 TIGR01126 pdi_dom protein disulfide-isomerase domain. This model describes a domain of eukaryotic protein disulfide isomerases, generally found in two copies. The high cutoff for total score reflects the expectation of finding both copies. The domain is similar to thioredoxin but the redox-active disulfide region motif is APWCGHCK. [Protein fate, Protein folding and stabilization] 102
17610 130197 TIGR01127 ilvA_1Cterm threonine ammonia-lyase, medium form. A form of threonine dehydratase with two copies of the C-terminal domain pfam00585 is described by TIGR01124. This model describes a phylogenetically distinct form with a single copy of pfam00585. This form branches with the catabolic threonine dehydratase of E. coli; many members are designated as catabolic for this reason. However, the catabolic form lacks any pfam00585 domain. Many members of this model are found in species with other Ile biosynthetic enzymes. [Amino acid biosynthesis, Pyruvate family] 380
17611 273455 TIGR01128 holA DNA polymerase III, delta subunit. DNA polymerase III delta (holA) and delta prime (holB) subunits are distinct proteins encoded by separate genes. The delta prime subunit (holB) exhibits sequence homology to the tau and gamma subunits (dnaX), but the delta subunit (holA) does not demonstrate this same homology with dnaX. The delta, delta prime, gamma, chi and psi subunits form the gamma complex subassembly of DNA polymerase III holoenzyme, which couples ATP to assemble the ring-shaped beta subunit around DNA forming a DNA sliding clamp. [DNA metabolism, DNA replication, recombination, and repair] 302
17612 273456 TIGR01129 secD protein-export membrane protein SecD. Members of this family are highly variable in length immediately after the well-conserved motif LGLGLXGG at the amino-terminal end of this model. Archaeal homologs are not included in the seed and score between the trusted and noise cutoffs. SecD from Mycobacterium tuberculosis has a long Pro-rich insert. [Protein fate, Protein and peptide secretion and trafficking] 397
17613 273457 TIGR01130 ER_PDI_fam protein disulfide isomerase, eukaryotic. This model represents eukaryotic protein disulfide isomerases retained in the endoplasmic reticulum (ER) and closely related forms. Some members have been assigned alternative or additional functions such as prolyl 4-hydroxylase and dolichyl-diphosphooligosaccharide-protein glycotransferase. Members of this family have at least two protein-disulfide domains, each similar to thioredoxin but with the redox-active disulfide in the motif PWCGHCK, and an ER retention signal at the extreme C-terminus (KDEL, HDEL, and similar motifs). 462
17614 273458 TIGR01131 ATP_synt_6_or_A ATP synthase subunit 6 (eukaryotes), also subunit A (prokaryotes). Bacterial forms should be designated ATP synthase, F0 subunit A; eukaryotic (chloroplast and mitochondrial) forms should be designated ATP synthase, F0 subunit 6. The F1/F0 ATP synthase is a multisubunit, membrane associated enzyme found in bacteria, mitochondria and chloroplasts. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. Individual subunits in each of these clusters are named differently in prokaryotes and in organelles, e.g., mitochondria and chloroplast. The bacterial equivalent of subunit 6 is named subunit 'A'. It has been shown that protons are conducted through this subunit. Typically, deprotonation and reprotonation of the acidic amino acid side-chains are implicated in the process. [Energy metabolism, ATP-proton motive force interconversion] 226
17615 273459 TIGR01132 pgm phosphoglucomutase, alpha-D-glucose phosphate-specific. This enzyme interconverts alpha-D-glucose-1-P and alpha-D-glucose-6-P. [Energy metabolism, Sugars] 543
17616 273460 TIGR01133 murG undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase. RM 8449890 RT The final step of peptidoglycan subunit assembly in Escherichia coli occurs in the cytoplasm. RA Bupp K, van Heijenoort J. RL J Bacteriol 1993 Mar;175(6):1841-3 [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 348
17617 273461 TIGR01134 purF amidophosphoribosyltransferase. Alternate name: glutamine phosphoribosylpyrophosphate (PRPP) amidotransferase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 442
17618 273462 TIGR01135 glmS glucosamine--fructose-6-phosphate aminotransferase (isomerizing). The member from Methanococcus jannaschii contains an intein. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars] 607
17619 273463 TIGR01136 cysKM cysteine synthase. This model discriminates cysteine synthases (EC 2.5.1.47) (both CysK and CysM) from cystathionine beta-synthase, a protein found primarily in eukaryotes and carrying a C-terminal CBS domain lacking from this protein. Bacterial proteins lacking the CBS domain but otherwise showing resemblance to cystathionine beta-synthases and considerable phylogenetic distance from known cysteine synthases were excluded from the seed and score below the trusted cutoff. [Amino acid biosynthesis, Serine family] 299
17620 273464 TIGR01137 cysta_beta cystathionine beta-synthase. Members of this family closely resemble cysteine synthase but contain an additional C-terminal CBS domain. The function of any bacterial member included in this family is proposed but not proven. [Amino acid biosynthesis, Serine family] 455
17621 130208 TIGR01138 cysM cysteine synthase B. CysM differs from CysK in that it can also use thiosulfate instead of sulfide, to produce cysteine thiosulfonate instead of cysteine. Alternate name: O-acetylserine (thiol)-lyase [Amino acid biosynthesis, Serine family] 290
17622 273465 TIGR01139 cysK cysteine synthase A. This model distinguishes cysteine synthase A (CysK) from cysteine synthase B (CysM). CysM differs in having a broader specificity that also allows the use of thiosulfate to produce cysteine thiosulfonate. [Amino acid biosynthesis, Serine family] 298
17623 273466 TIGR01140 L_thr_O3P_dcar L-threonine-O-3-phosphate decarboxylase. This family contains pyridoxal phosphate-binding class II aminotransferases (see pfamAM:pfam00222) closely related to, yet distinct from, histidinol-phosphate aminotransferase (HisC). It is found in cobalamin biosynthesis operons in Salmonella typhimurium and Bacillus halodurans (each of which also has HisC) and has been shown to have L-threonine-O-3-phosphate decarboxylase activity in Salmonella. Although the gene symbol cobD was assigned in Salmonella, cobD in other contexts refers to a different cobalamin biosynthesis enzyme, modeled by pfam03186 and called cbiB in Salmonella. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 330
17624 273467 TIGR01141 hisC histidinol-phosphate aminotransferase. Alternate names: histidinol-phosphate transaminase; imidazole acetol-phosphate transaminase Histidinol-phosphate aminotransferase is a pyridoxal-phosphate dependent enzyme. [Amino acid biosynthesis, Histidine family] 350
17625 130212 TIGR01142 purT phosphoribosylglycinamide formyltransferase 2. This enzyme is an alternative to PurN (TIGR00639) [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 380
17626 273468 TIGR01143 murF UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase. This family consists of the strictly bacterial MurF gene of peptidoglycan biosynthesis. This enzyme is almost always UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanyl ligase, but in a few species, MurE adds lysine rather than diaminopimelate. This enzyme acts on the product from MurE activity, and so is also subfamily rather than equivalog. Staphylococcus aureus is an example of a species in which this MurF protein would differ. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 417
17627 130214 TIGR01144 ATP_synt_b ATP synthase, F0 subunit b. This model describes the F1/F0 ATP synthase b subunit in bacteria only. Scoring just below the trusted cutoff are the N-terminal domains of Mycobacterial b/delta fusion proteins and a subunit from an archaeon, Methanosarcina barkeri, in which the ATP synthase homolog differs in architecture and is not experimentally confirmed. This model helps resolve b from the related b' subunit. Within the family is an example from a sodium-translocating rather than proton-translocating ATP synthase. [Energy metabolism, ATP-proton motive force interconversion] 147
17628 130215 TIGR01145 ATP_synt_delta ATP synthase, F1 delta subunit. This model describes the ATP synthase delta subunit in bacteria, mitochondria, and chloroplasts. It is sometimes called OSCP for Oligomycin Sensitivity Conferring Protein. F1/F0-ATP synthase is a multisubunit, membrane associated enzyme found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. Delta subunit belongs to the F1 cluster or sector and functionally implicated in the overall stability of the complex. Expression of truncated forms of this subunit results in low ATPase activity. [Energy metabolism, ATP-proton motive force interconversion] 172
17629 273469 TIGR01146 ATPsyn_F1gamma ATP synthase, F1 gamma subunit. This model describes the ATP synthase gamma subunit in bacteria and its equivalents in organelles, namely, mitochondria and chloroplast. F1/F0-ATP synthase is a multisubunit, membrane associated enzyme found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. The gamma subunit is part of the F1 cluster. Surrounding the gamma subunit in a cylinder-like structure are three alpha and three beta subunits in an alternating fashion. This is the central catalytic unit whose different conformations permit the binding of ADP and inorganic phosphate and release of ATP. [Energy metabolism, ATP-proton motive force interconversion] 286
17630 130217 TIGR01147 V_ATP_synt_G vacuolar ATP synthase, subunit G. This model describes the vacuolar ATP synthase G subunit in eukaryotes and includes members from diverse groups e.g., fungi, plants, parasites etc. V-ATPases are multi-subunit enzymes composed of two functional domains: A transmembrane Vo domain and a peripheral catalytic domain V1. The G subunit is one of the subunits of the catalytic domain. V-ATPases are responsible for the acidification of endosomes and lysosomes, which are part of the central vacuolar system. [Energy metabolism, ATP-proton motive force interconversion] 113
17631 130218 TIGR01148 mtrC N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit C. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit C in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Other] 265
17632 130219 TIGR01149 mtrG N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit G. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit G in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Other] 70
17633 273470 TIGR01150 puhA photosynthetic reaction center, subunit H, bacterial. This model describes the photosynthetic reaction center H subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately, the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is an organic acid rather than water. Much of our current functional understanding of photosynthesis comes from the structural determination and spectroscopic studies on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 252
17634 130221 TIGR01151 psbA photosystem II, DI subunit (also called Q(B)). This model describes the Photosystem II, DI subunit (also called Q(B)) in bacterial and its equivalents in chloroplast of algae and higher plants. Photosystem II is many ways functionally equivalent to bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phyloquinones etc. These cofactors are intimately associated with the polypeptides, which principally including subunits DI, DII, Cyt.b, Cyt.f and iron-sulphur protein. Together they participate in the electron transfer reactions that lead to the net production of the reducting equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates(C6H1206). Phosystem II operates during oxygenic photosynthesis and principal electron donor is H2O. Although no structural data is presently available, a huge body of literature exits that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 360
17635 130222 TIGR01152 psbD Photosystem II, DII subunit (also called Q(A)). This model describes the Photosystem II, DII subunit (also called Q(A)) in bacterial and its equivalents in chloroplast of algae and higher plants. Photosystem II is in many ways functionally equivalent to bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phyloquinones etc. These cofactors are intimately associated with the polypeptides, which principally including subunits DI, DII, Cyt.b, Cyt.f and iron-sulphur protein. Together they participate in the electron transfer reactions that lead to the net production of the reducting equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates(C6H1206). Phosystem II operates during oxygenic photosynthesis and principal electron donor is H2O. Although no high resolution X-ray structural data is presently available, recently a 3D structure of the supercomplex has been described by cryo-electron microscopy. Besides a huge body of literature exits that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 352
17636 213589 TIGR01153 psbC photosystem II 44 kDa subunit reaction center protein (also called P6 protein, CP43), bacterial and chloroplast. This model describes the Photosystem II, 44 kDa subunit (also called P6 protein, CP43) in bacteria and its equivalents in chloroplasts of algae and higher plants. Photosystem II is in many ways functionally equivalent to the bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phylloquinones etc. These cofactors are intimately associated with the polypeptides, which principally include subunits 44 kDa protein, DI, DII, Cyt.b, Cyt.f, iron-sulphur protein and others. Functionally, the 44 kDa subunit is implicated in chlorophyll binding. Together they participate in the electron transfer reactions that lead to the net production of the reducing equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates (C6H12O6). Photosystem II operates during oxygenic photosynthesis and the principal electron donor is H2O. Although no high resolution X-ray structural data is presently available, recently a 3D structure of the supercomplex has been described by cryo-electron microscopy. Besides, a huge body of literature exists that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 432
17637 130224 TIGR01156 cytb6/f_IV cytochrome b6/f complex subunit IV. This model describes the subunit IV of the cytochrome b6/f complex. The cyt b6/f complex is central to the functions of the oxygenic photosynthetic electron transport in cyanobacteria and its equivalents in algae and higher plants. Energetically, on the redox scale the cytb6/f complex is placed below the other components - Q(A); Q(B) of the photosystem II in the Z-scheme, along the pathway of the electron transport. The complex is made of the following subunits: cytochrome f; cytochrome b6; Rieske 2Fe-2S; and subunits IV; V; VI; VII. Subunit IV is one of the principal subunits for the binding of the redox prosthetic groups. Each monomer of the complex contains a molecule of chlorophyll a and beta-carotene. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 159
17638 130225 TIGR01157 pufL photosynthetic reaction center L subunit. This model describes the photosynthetic reaction center L subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of the electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 239
17639 273471 TIGR01158 SUI1_rel translation initiation factor SUI1, putative, prokaryotic. This family of archaeal and bacterial proteins is homologous to the eukaryotic translation initiation factor SUI1 involved in directing the ribosome to the proper start site of translation by functioning in concert with eIF-2 and the initiator tRNA-Met. [Protein synthesis, Translation factors] 101
17640 273472 TIGR01159 DRP1 density-regulated protein DRP1. This protein family shows weak but suggestive similarity to translation initiation factor SUI1 and its prokaryotic homologs. 173
17641 130228 TIGR01160 SUI1_MOF2 translation initiation factor SUI1, eukaryotic. Alternate name: MOF2. A similar protein family (see TIGRFAMs model TIGR01158) is found in prokaryotes. The human protein complements a yeast SUI1 mutation. [Protein synthesis, Translation factors] 110
17642 273473 TIGR01161 purK phosphoribosylaminoimidazole carboxylase, PurK protein. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurK, N5-carboxyaminoimidazole ribonucleotide synthetase, which hydrolyzes ATP and converts AIR to N5-CAIR. PurE converts N5-CAIR to CAIR. In the presence of high concentrations of bicarbonate, PurE is reported able to convert AIR to CAIR directly and without ATP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 352
17643 273474 TIGR01162 purE phosphoribosylaminoimidazole carboxylase, PurE protein. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurE, an N5-CAIR mutase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 156
17644 273475 TIGR01163 rpe ribulose-phosphate 3-epimerase. This family consists of Ribulose-phosphate 3-epimerase, also known as pentose-5-phosphate 3-epimerase (PPE). PPE converts D-ribulose 5-phosphate into D-xylulose 5-phosphate in Calvin's reductive pentose phosphate cycle. It has been found in a wide range of bacteria, archaebacteria, fungi and plants. [Energy metabolism, Pentose phosphate pathway] 210
17645 273476 TIGR01164 rplP_bact ribosomal protein L16, bacterial/organelle. This model describes bacterial and organellar ribosomal protein L16. The homologous protein of the eukaryotic cytosol is designated L10 [Protein synthesis, Ribosomal proteins: synthesis and modification] 125
17646 273477 TIGR01165 cbiN cobalt transport protein. This model describes the cobalt transporter in bacteria and its equivalents in archaea. It principally functions in the ion uptake mechanism. It is a multisubunit transporter with two integral membrane proteins and two closely associated cytoplasmic subunits. This transporter belongs to the ABC transporter superfamily (ABC stands for ATP Binding Cassette). This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions from the external milieu and the other group which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both the uptake and the efflux processes. [Transport and binding proteins, Cations and iron carrying compounds] 91
17647 130234 TIGR01166 cbiO cobalt transport protein ATP-binding subunit. This model describes the ATP binding subunit of the multisubunit cobalt transporter in bacteria and its equivalents in archaea. The model is restricted to the ATP binding subunit that is a part of the cobalt transporter, which belongs to the ABC transporter superfamily (ATP Binding Cassette). The model excludes ATP binding subunits that are associated with other transporters belonging to the ABC transporter superfamily. This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions from the external milieu and the other group which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both the uptake and the efflux processes. [Transport and binding proteins, Cations and iron carrying compounds] 190
17648 273478 TIGR01167 LPXTG_anchor LPXTG-motif cell wall anchor domain. This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other] 34
17649 273479 TIGR01168 YSIRK_signal Gram-positive signal peptide, YSIRK family. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus. 39
17650 211630 TIGR01169 rplA_bact ribosomal protein L1, bacterial/chloroplast. This model describes bacterial (and chloroplast) ribosomal protein L1. The apparent mitochondrial L1 is sufficiently diverged to be the subject of a separate model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 227
17651 273480 TIGR01170 rplA_mito ribosomal protein uL1, mitochondrial. This model represents the mitochondrial homolog of bacterial ribosomal protein L1. Unlike chloroplast L1, this form was not sufficiently similar to bacterial forms to include in a single bacterial/organellar L1 model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 141
17652 273481 TIGR01171 rplB_bact ribosomal protein L2, bacterial/organellar. This model distinguishes bacterial and organellar ribosomal protein L2 from its counterparts in the archaea and in the eukaryotic cytosol. Plant mitochondrial examples tend to have long, variable inserts. [Protein synthesis, Ribosomal proteins: synthesis and modification] 273
17653 200082 TIGR01172 cysE serine O-acetyltransferase. Cysteine biosynthesis [Amino acid biosynthesis, Serine family] 162
17654 273482 TIGR01173 glmU UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. This protein is a bifunctional enzyme, GlmU, which catalyzes the last two reactions in the four-step pathway of UDP-N-acetylglucosamine biosynthesis from fructose-6-phosphate. Its reaction product is required for peptidoglycan biosynthesis, LPS biosynthesis in species with LPS, and certain other processes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Central intermediary metabolism, Amino sugars] 451
17655 273483 TIGR01174 ftsA cell division protein FtsA. This bacterial cell division protein interacts with FtsZ, the bacterial homolog of tubulin. It is an ATP-binding protein and shows structural similarities to actin and heat shock cognate protein 70. [Cellular processes, Cell division] 371
17656 273484 TIGR01175 pilM type IV pilus assembly protein PilM. This protein is required for the assembly of the type IV fimbria in Pseudomonas aeruginosa responsible for twitching motility, and for a similar pilus-like structure in Synechocystis. It is also found in species such as Deinococcus described as having natural transformation (for which a type IV pilus-like structure is proposed) but not fimbria. 348
17657 273485 TIGR01176 fum_red_Fp fumarate reductase (quinol), flavoprotein subunit. The terms succinate dehydrogenase and fumarate reductase may be used interchangeably in certain systems. However, a number of species have distinct complexes, with the fumarate reductase active under anaerobic conditions. This model represents the fumarate reductase flavoprotein subunit from several such species in which a distinct succinate dehydrogenase is also found. Not all bona fide fumarate reductases will be found by this model. 580
17658 273486 TIGR01177 TIGR01177 putative methyltransferase, TIGR01177 family. This family of probable methyltransferases is found exclusively in the Archaea. [Hypothetical proteins, Conserved] 329
17659 130246 TIGR01178 ade adenine deaminase. The family described by this model includes an experimentally characterized adenine deaminase of Bacillus subtilis. It also includes a member from Methanobacterium thermoautotrophicum, in which adenine deaminase activity has been detected. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 552
17660 273487 TIGR01179 galE UDP-glucose-4-epimerase GalE. Alternate name: UDPgalactose 4-epimerase. This enzyme interconverts UDP-glucose and UDP-galactose. A set of related proteins, some of which are tentatively identified as UDP-glucose-4-epimerase in Thermotoga maritima, Bacillus halodurans, and several archaea, but deeply branched from this set and lacking experimental evidence, are excluded from this model and described by a separate model. [Energy metabolism, Sugars] 328
17661 273488 TIGR01180 aman2_put alpha-1,2-mannosidase, putative. The identification of members of this family as putative alpha-1,2-mannosidases is based on an unpublished characterization of the aman2 gene in Bacillus sp. M-90 by Maruyama,Y., Nakajima,M. and Nakajima,T. (Genbank accession BAA76709, pid g4587313). Most members of this family appear to have signal sequences. Members from the dental pathogen Porphyromonas gingivalis have been described as immunoreactive with periodontitis patient serum. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 750
17662 273489 TIGR01181 dTDP_gluc_dehyt dTDP-glucose 4,6-dehydratase. This protein is related to UDP-glucose 4-epimerase (GalE) and likewise has an NAD cofactor. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 317
17663 273490 TIGR01182 eda Entner-Doudoroff aldolase. 2-dehydro-3-deoxyphosphogluconate aldolase (EC 4.1.2.14) is an enzyme of the Entner-Doudoroff pathway. This aldolase has another function, 4-hydroxy-2-oxoglutarate aldolase (EC 4.1.3.16) shown experimentally in Escherichia coli and Pseudomonas putida [Amino acid biosynthesis, Glutamate family, Energy metabolism, Entner-Doudoroff] 204
17664 130251 TIGR01183 ntrB nitrate ABC transporter, permease protein. This model describes the nitrate transport permease in bacteria. This is gene product of ntrB. The nitrate transport permease is the integral membrane component of the nitrate transport system and belongs to the ATP-binding cassette (ABC) superfamily. At least in photosynthetic bacteria nitrate assimilation is aided by other proteins derived from the operon which among others include products of ntrA, ntrB, ntrC, ntrD, narB. Functionally ntrC and ntrD resemble the ATP binding components of the binding protein-dependent transport systems. Mutational studies have shown that ntrB and ntrC are mandatory for nitrate accumulation. Nitrate reductase is encoded by narB. [Transport and binding proteins, Anions] 202
17665 130252 TIGR01184 ntrCD nitrate transport ATP-binding subunits C and D. This model describes the ATP binding subunits of nitrate transport in bacteria and archaea. This protein belongs to the ATP-binding cassette (ABC) superfamily. It is thought that the two subunits encoded by ntrC and ntrD form the binding surface for interaction with ATP. This model is restricted to identifying the ATP binding subunits associated with nitrate transport. Nitrate assimilation is aided by other proteins derived from the operon which among others include products of ntrA - a regulatory protein; ntrB - a hydrophobic transmembrane permease and narB - a reductase. [Transport and binding proteins, Anions, Transport and binding proteins, Other] 230
17666 130253 TIGR01185 devC DevC protein. This model describes a predicted membrane subunit, DevC, of an ABC transporter known so far from two species of cyanobacteria. Some experimental data from mutational analysis suggest that this protein along with DevA and DevB encoded in the same operon may be involved in the transport/export of glycolipids. [Transport and binding proteins, Other] 380
17667 130254 TIGR01186 proV glycine betaine/L-proline transport ATP binding subunit. This model describes the glycine betaine/L-proline ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system is: an ATPase or ATP binding subunit; an integral membrane protein; and a hydrophilic polypeptide, which likely functions as a substrate binding protein. Functionally, this transport system is involved in osmoregulation. Under conditions of stress, the organism recruits these transport systems to accumulate glycine betaine and other solutes which offer osmo-protection. It has been demonstrated that glycine betaine uptake is accompanied by symport with sodium ions. The locus has been named variously as proU or opuA. A gene library from L. lactis functionally complements an E. coli proU mutant. The complementing locus is similar to an opuA locus in B. subtilis. This clarifies the differences in nomenclature. [Transport and binding proteins, Amino acids, peptides and amines] 363
17668 162242 TIGR01187 potA spermidine/putrescine ABC transporter ATP-binding subunit. This model describes spermidine/putrescine ABC transporter, ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system is: an ATPase or ATP binding subunit; an integral membrane protein; and a hydrophilic polypeptide, which likely functions as a substrate binding protein. Polyamines like spermidine and putrescine play vital roles in cell proliferation, differentiation, and ion homeostasis. The concentration of polyamines within the cell is regulated by biosynthesis, degradation and transport (uptake and efflux included). [Transport and binding proteins, Amino acids, peptides and amines] 325
17669 130256 TIGR01188 drrA daunorubicin resistance ABC transporter ATP-binding subunit. This model describes daunorubicin resistance ABC transporter, ATP binding subunit in bacteria and archaea. This model is restricted in its scope to preferentially recognize the ATP binding subunit associated with efflux of the drug, daunorubicin. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system is: an ATPase or ATP binding subunit; an integral membrane protein; and a hydrophilic polypeptide, which likely functions as a substrate binding protein. In eukaryotes, proteins of similar function include P-glycoproteins, multidrug resistance proteins etc. [Transport and binding proteins, Other] 302
17670 273491 TIGR01189 ccmA heme ABC exporter, ATP-binding protein CcmA. This model describes the cyt c biogenesis protein encoded by ccmA in bacteria. An exception is an Arabidopsis protein, which is quite likely encoded by an organelle. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome C. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of heme moiety, the secretion of the apoprotein and the covalent attachment of the heme with the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome c. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other] 198
17671 200083 TIGR01190 ccmB heme exporter protein CcmB. This model describes the cyt c biogenesis protein encoded by ccmB in bacteria. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome C. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of heme moiety, the secretion of the apoprotein and the covalent attachment of the heme with the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome C. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other] 211
17672 273492 TIGR01191 ccmC heme exporter protein CcmC. This model describes the cyt c biogenesis protein encoded by ccmC in bacteria. It must be noted that plant proteins from Arabidopsis, Triticum and Pisum were recognizable in the clade. Quite likely they are of organellar origin. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome C. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of heme moiety, the secretion of the apoprotein and the covalent attachment of the heme with the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in the heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome c. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other] 184
17673 130260 TIGR01192 chvA glucan exporter ATP-binding protein. This model describes glucan exporter ATP binding protein in bacteria. It belongs to the larger ABC transporter superfamily with the characteristic ATP binding motif. In general, this protein is in some ways implicated in osmoregulation and suggested to participate in the export of glucan from the cytoplasm to periplasm. The cyclic beta-1,2-glucan in the bacterial periplasmic space is suggested to confer the property of high osmolarity. It has also been demonstrated that mutants in this locus have lost functions of virulence and motility. It is unclear as to how virulence and osmoadaptation are related. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 585
17674 130261 TIGR01193 bacteriocin_ABC ABC-type bacteriocin transporter. This model describes ABC-type bacteriocin transporter. The amino terminal domain (pfam03412) processes the N-terminal leader peptide from the bacteriocin while C-terminal domains resemble ABC transporter membrane protein and ATP-binding cassette domain. In general, bacteriocins are agents which are responsible for killing or inhibiting the closely related species or even different strains of the same species. Bacteriocins are usually encoded by bacterial plasmids. Bacteriocins are named after the species and hence in literature one encounters various names e.g., leucocin from Leuconostoc gelidum; pediocin from Pediococcus acidilactici; sakacin from Lactobacillus sake etc. [Protein fate, Protein and peptide secretion and trafficking, Protein fate, Protein modification and repair, Transport and binding proteins, Other] 708
17675 130262 TIGR01194 cyc_pep_trnsptr cyclic peptide transporter. This model describes cyclic peptide transporter in bacteria. Bacteria have elaborate pathways for the production of toxins and secondary metabolites. Many such compounds, including syringomycin and pyoverdine are synthesized on non-ribosomal templates consisting of a multienzyme complex. On several occasions the proteins of the complex and transporter protein are present on the same operon. Often times these compounds cross the biological membrane by specific transporters. Syringomycin is an amphipathic, cyclic lipodepsipeptide that, when inserted into the host, causes formation of channels permeable to a variety of cations. On the other hand, pyoverdine is a cyclic octa-peptidyl dihydroxyquinoline, which is efficient in sequestering iron for uptake. [Transport and binding proteins, Amino acids, peptides and amines, Transport and binding proteins, Other] 555
17676 273493 TIGR01195 oadG_fam sodium pump decarboxylases, gamma subunit. This model finds the subfamily of distantly related, low complexity, hydrophobic small subunits of several related sodium ion-pumping decarboxylases. These include oxaloacetate decarboxylase gamma subunit and methylmalonyl-CoA decarboxylase delta subunit. Most sequences scoring between the noise and trusted cutoffs are eukaryotic sodium channel proteins. 82
17677 130264 TIGR01196 edd 6-phosphogluconate dehydratase. A close homolog, designated MocB (mannityl opine catabolism), is found in a mannopine catabolism region of a plasmid of Agrobacterium tumefaciens. However, it is not essential for mannopine catabolism, branches within the cluster of 6-phosphogluconate dehydratases (with a short branch length) in a tree rooted by the presence of other dehydratases. It may represent an authentic 6-phosphogluconate dehydratase, redundant with the chromosomal copy shown to exist in plasmid-cured strains. This model includes mocB above the trusted cutoff, although the designation is somewhat tenuous. [Energy metabolism, Entner-Doudoroff] 601
17678 162246 TIGR01197 nramp NRAMP (natural resistance-associated macrophage protein) metal ion transporters. This model describes the Nramp metal ion transporter family. Historically, in mammals these proteins have been functionally characterized as proteins involved in the host pathogen resistance, hence the name - NRAMP. At least two isoforms Nramp1 and Nramp2 have been identified. However the exact mechanism of pathogen resistance was unclear, until it was demonstrated by expression cloning and electrophysiological techniques that this protein was a metal ion transporter. It was also independently demonstrated that a microcytic anemia (mk) locus in mouse, encodes a metal ion transporter (DCT1 or Nramp2). The transporter has a broad range of substrate specificity that include Fe+2, Zn+2, Mn+2, Co+2, Cd+2, Cu+2, Ni+2 and Pb+2. The uptake of these metal ions is coupled to proton symport. Metal ions are essential cofactors in a number of biological process including, oxidative phosphorylation, gene regulation and metal ion homeostasis. Nramp1 could confer resistance to infection in one of the two ways. (1) The uptake of Fe+2 can produce toxic hydroxyl radicals via Fenton reaction killing the pathogens in phagosomes or (2) Deplete the metal ion pools in the phagosome and deprive the pathogens of metal ions, which is critical for its survival. [Transport and binding proteins, Cations and iron carrying compounds] 390
17679 273494 TIGR01198 pgl 6-phosphogluconolactonase. This enzyme of the pentose phosphate pathway is often found as a part of a multifunctional protein with [Energy metabolism, Pentose phosphate pathway] 233
17680 273495 TIGR01200 GLPGLI GLPGLI family protein. This protein family was first noted as a paralogous set in Porphyromonas gingivalis, but it is more widely distributed among the Bacteroidetes. The protein family is now renamed GLPGLI after its best-conserved motif. 226
17681 273496 TIGR01201 HU_rel DNA-binding protein, histone-like, putative. This model describes a set of proteins related to but longer than DNA-binding protein HU. Its distinctive domain architecture compared to HU and related histone-like DNA-binding proteins justifies the designation as superfamily. Members include, so far, one from Bacteroides fragilis, a gut bacterium, and ten from Porphyromonas gingivalis, an oral anaerobe. [DNA metabolism, Chromosome-associated proteins] 145
17682 130269 TIGR01202 bchC 2-desacetyl-2-hydroxyethyl bacteriochlorophyllide A dehydrogenase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 308
17683 273497 TIGR01203 HGPRTase hypoxanthine phosphoribosyltransferase. Alternate name: hypoxanthine-guanine phosphoribosyltransferase. Sequence differences as small as a single residue can affect whether members of this family act on hypoxanthine and guanine or hypoxanthine only. The designation of this model as equivalog reflects hypoxanthine specificity and does not reflect whether or not guanine can replace hypoxanthine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 166
17684 130271 TIGR01204 bioW 6-carboxyhexanoate--CoA ligase. Alternate name: pimeloyl-CoA synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 232
17685 273498 TIGR01205 D_ala_D_alaTIGR D-alanine--D-alanine ligase. This model describes D-Ala--D-Ala ligase, an enzyme that makes a required precursor of the bacterial cell wall. It also describes some closely related proteins responsible for resistance to glycopeptide antibiotics such as vancomycin. The mechanism of glycopeptide antibiotic resistance involves the production of D-alanine-D-lactate (VanA and VanB families) or D-alanine-D-serine (VanC). The seed alignment contains only chromosomally encoded D-ala--D-ala ligases, but a number of antibiotic resistance proteins score above the trusted cutoff of this model. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 315
17686 273499 TIGR01206 lysW lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is found also in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway. [Amino acid biosynthesis, Aspartate family] 54
17687 130274 TIGR01207 rmlA glucose-1-phosphate thymidylyltransferase, short form. Alternate name: dTDP-D-glucose synthase homotetramer This model describes a tightly conserved but broadly distributed subfamily (here designated as short form) of known and putative bacterial glucose-1-phosphate thymidylyltransferases. It is well characterized in several species as the first of four enzymes involved in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 286
17688 273500 TIGR01208 rmlA_long glucose-1-phosphate thymidylyltransferase, long form. The family of known and putative glucose-1-phosphate thymidylyltransferases (also called dTDP-glucose synthase) shows a deep split into a short form (see TIGR01207) and a long form described by this model. The homotetrameric short form is found in numerous bacterial species that incorporate dTDP-L-rhamnose, which it helps synthesize, into the cell wall. It is subject to feedback inhibition. This form, in contrast, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced. Alternate name: dTDP-D-glucose synthase 353
17689 273501 TIGR01209 TIGR01209 RNA ligase, Pab1020 family. Members of this family are found, so far, in a single copy per genome and largely in thermophiles, of which only Aquifex aeolicus is bacterial rather than archaeal. PSI-BLAST converges after a single iteration to the whole of this family and reveals no convincing similarity to any other protein. The member protein Pab1020 has been characterized as an RNA ligase with circularization activity. [Transcription, RNA processing] 374
17690 273502 TIGR01210 TIGR01210 radical SAM enzyme, TIGR01210 family. This family of exclusively archaeal radical SAM enzymes has no characterized close homologs. [Hypothetical proteins, Conserved] 313
17691 273503 TIGR01211 ELP3 radical SAM enzyme/protein acetyltransferase, ELP3 family. This family includes elongator complex protein 3 (ELP3) from eukaryotes and related proteins from other lineages. ELP3 is a component of the RNA polymerase II holoenzyme. It has an N-terminal radical SAM domain and C-terminal GNAT acetyltransferase domain. Members of this family are found in eukaryotes, archaea, and a few bacteria (e.g. Atopobium sp). The activity discovered first was an acetyltransferase modification at the N-termini of all four core histones, shown in vitro in eukaryotes. More recently, the radical SAM domain was shown to play a role in zygotic paternal genome demethylation. Family TIGR01212, widespread in prokaryotes, lacks the GNAT acetyltransferase domain but shares extensive sequence similarity with this family (TIGR01211). [Transcription, DNA-dependent RNA polymerase] 522
17692 130279 TIGR01212 TIGR01212 radical SAM protein, TIGR01212 family. Members of this family are apparent radical-SAM enzymes, related to the N-terminal region of the bifunctional ELP3, whose C-terminal region is part of the elongator complex and appears to acetylate histones and other proteins. ELP3 binds S-adenosylmethionine (SAM) and was recently shown to be involved in a DNA demethylation process in eukaryotes. Close sequence similarity of this family (which lacks the GNAT family acetyltransferase domain) to the ELP3 N-terminal region and a strong match to the pfam04055 support identification of this family as radical SAM despite the atypical spacing between first and second Cys residues in the 4Fe4S-binding motif. [Unknown function, Enzymes of unknown specificity] 302
17693 273504 TIGR01213 pseudo_Pus10arc tRNA pseudouridine(54/55) synthase. Members of this family show twilight-zone similarity to several predicted RNA pseudouridine synthases. All trusted members of this family are archaeal. Several eukaryotic homologs lack N-terminal homology including two CXXC motifs. [Hypothetical proteins, Conserved] 388
17694 273505 TIGR01214 rmlD dTDP-4-dehydrorhamnose reductase. This enzyme catalyzes the last of 4 steps in making dTDP-rhamnose, a precursor of LPS core antigen, O-antigen, etc. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 287
17695 188120 TIGR01215 minE cell division topological specificity factor MinE. This protein is involved in the process of cell division. This protein prevents the proteins MinC and MinD from inhibiting cell division at internal sites, but allows inhibition at polar sites. This allows for correct cell division at the proper sites. [Cellular processes, Cell division] 81
17696 273506 TIGR01216 ATP_synt_epsi ATP synthase, F1 epsilon subunit (delta in mitochondria). This model describes one of the five types of subunits in the F1 part of F1/F0 ATP synthases. Members of this family are designated epsilon in bacterial and chloroplast systems but designated delta in mitochondria, where the counterpart of the bacterial delta subunit is designated OSCP. In a few cases (Propionigenium modestum, Acetobacterium woodii) scoring above the trusted cutoff and designated here as exceptions, Na+ replaces H+ for translocation. [Energy metabolism, ATP-proton motive force interconversion] 130
17697 273507 TIGR01217 ac_ac_CoA_syn acetoacetyl-CoA synthase. This enzyme catalyzes the first step of the mevalonate pathway of IPP biosynthesis. Most bacteria do not use this pathway, but rather the deoxyxylulose pathway. [Central intermediary metabolism, Other] 652
17698 273508 TIGR01218 Gpos_tandem_5TM tandem five-transmembrane protein. Members of this family of proteins, with average length of 210, have no invariant residues but five predicted transmembrane segments. Strangely, most members occur in groups of consecutive paralogous genes. A striking example is a set of eleven encoded consecutively, head-to-tail, in Staphylococcus aureus strain COL. 207
17699 273509 TIGR01219 Pmev_kin_ERG8 phosphomevalonate kinase, ERG8-type, eukaryotic branch. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents plant and fungal forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other] 454
17700 130287 TIGR01220 Pmev_kin_Gr_pos phosphomevalonate kinase, ERG8-type, Gram-positive branch. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents the low GC Gram-positive organism forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other] 358
17701 273510 TIGR01221 rmlC dTDP-4-dehydrorhamnose 3,5-epimerase. This enzyme participates in the biosynthesis of dTDP-L-rhamnose, often as a precursor to LPS O-antigen [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 176
17702 273511 TIGR01222 minC septum site-determining protein MinC. The minC protein assists in correct placement of the septum for cell division by inhibiting septum formation at other sites. Homologs from Deinococcus, Synechocystis PCC 6803, and Helicobacter pylori do not hit the full length of the model and score between the trusted and noise cutoffs. [Cellular processes, Cell division] 217
17703 130290 TIGR01223 Pmev_kin_anim phosphomevalonate kinase, animal type. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found. One is this type, found in animals. The other is the ERG8 type, found in plants and fungi (TIGR01219) and in Gram-positive bacteria (TIGR01220). [Central intermediary metabolism, Other] 182
17704 273512 TIGR01224 hutI imidazolonepropionase. This enzyme catalyzes the third step in histidine degradation. [Energy metabolism, Amino acids and amines] 377
17705 200086 TIGR01225 hutH histidine ammonia-lyase. This enzyme deaminates histidine to urocanic acid, the first step in histidine degradation. It is closely related to phenylalanine ammonia-lyase. [Energy metabolism, Amino acids and amines] 506
17706 130293 TIGR01226 phe_am_lyase phenylalanine ammonia-lyase. Members of this subfamily of MIO prosthetic group enzymes are phenylalanine ammonia-lyases. They are found, so far, in plants and fungi. From phenylalanine, this enzyme yields cinnamic acid, a precursor of many important plant compounds. This protein shows extensive homology to histidine ammonia-lyase, the first enzyme of a histidine degradation pathway. Note that members of this family from plant species that synthesize taxol are actually phenylalanine aminomutase, and are covered by exception model TIGR04473. 680
17707 273513 TIGR01227 hutG formimidoylglutamase. Formiminoglutamase, the fourth enzyme of histidine degradation, is similar to arginases and agmatinases. It is often encoded near other enzymes of the histidine degradation pathway: histidine ammonia-lyase, urocanate hydratase, and imidazolonepropionase. [Energy metabolism, Amino acids and amines] 307
17708 130295 TIGR01228 hutU urocanate hydratase. This model represents the second of four enzymes involved in the degradation of histidine to glutamate. [Energy metabolism, Amino acids and amines] 545
17709 162262 TIGR01229 rocF_arginase arginase. This model helps resolve arginases from known and putative agmatinases, formiminoglutamases, and other related proteins of unknown specificity. The pathway from arginine to the polyamine putrescine may proceed by hydrolysis to remove urea (arginase) followed by decarboxylation (ornithine decarboxylase), or by decarboxylation first (arginine decarboxylase) followed by removal of urea (agmatinase). 300
17710 273514 TIGR01230 agmatinase agmatinase. Members of this family include known and predicted examples of agmatinase (agmatine ureohydrolase). The seed includes members of archaea, for which no definitive agmatinase sequence has yet been made available. However, archaeal sequences are phylogenetically close to the experimentally verified B. subtilis sequence. One species of Halobacterium has been demonstrated in vitro to produce agmatine from arginine, but no putrescine from ornithine, suggesting that arginine decarboxylase and agmatinase, rather than arginase and ornithine decarboxylase, lead from Arg to polyamine biosynthesis. Note: a history of early misannotation of members of this family is detailed in PUBMED:10931887. 275
17711 273515 TIGR01231 lacC tagatose-6-phosphate kinase. This enzyme is part of the tagatose-6-phosphate pathway of lactose degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 310
17712 130299 TIGR01232 lacD tagatose 1,6-diphosphate aldolase. This family consists of Gram-positive proteins. Tagatose 1,6-diphosphate aldolase is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 325
17713 273516 TIGR01233 lacG 6-phospho-beta-galactosidase. This enzyme is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 467
17714 130301 TIGR01234 L-ribulokinase ribulokinase. This enzyme catalyzes the second step in arabinose catabolism. The most closely related protein subfamily outside the scope of this model includes ribitol kinase from E. coli. [Energy metabolism, Sugars] 536
17715 130302 TIGR01235 pyruv_carbox pyruvate carboxylase. This enzyme plays a role in gluconeogenesis but not glycolysis. [Energy metabolism, Glycolysis/gluconeogenesis] 1143
17716 273517 TIGR01236 D1pyr5carbox1 delta-1-pyrroline-5-carboxylate dehydrogenase, group 1. This model represents one of two related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. The two branches are not as closely related to each other as some aldehyde dehydrogenases are to this branch, and separate models are built for this reason. The enzyme is the second of two in the degradation of proline to glutamate. [Energy metabolism, Amino acids and amines] 532
17717 200087 TIGR01237 D1pyr5carbox2 delta-1-pyrroline-5-carboxylate dehydrogenase, group 2, putative. This enzyme is the second of two in the degradation of proline to glutamate. This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. Members of this branch may be associated with proline dehydrogenase (the other enzyme of the pathway from proline to glutamate) but have not been demonstrated experimentally. The branches are not as closely related to each other as some distinct aldehyde dehydrogenases are to some; separate models were built to let each model describe a set of equivalogs. [Energy metabolism, Amino acids and amines] 511
17718 273518 TIGR01238 D1pyr5carbox3 delta-1-pyrroline-5-carboxylate dehydrogenase (PutA C-terminal domain). This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. Members of this branch are the C-terminal domain of the PutA bifunctional proline dehydrogenase / delta-1-pyrroline-5-carboxylate dehydrogenase. [Energy metabolism, Amino acids and amines] 500
17719 273519 TIGR01239 galT_2 galactose-1-phosphate uridylyltransferase, family 2. This enzyme is involved in glucose and galactose interconversion. This model describes one of two extremely distantly related branches of the model pfam01087 from Pfam. [Energy metabolism, Sugars] 489
17720 130307 TIGR01240 mevDPdecarb diphosphomevalonate decarboxylase. This enzyme catalyzes the last step in the synthesis of isopentenyl diphosphate (IPP) in the mevalonate pathway. Alternate names: mevalonate diphosphate decarboxylase; pyrophosphomevalonate decarboxylase [Central intermediary metabolism, Other] 305
17721 273520 TIGR01241 FtsH_fam ATP-dependent metalloprotease FtsH. HflB(FtsH) is a pleiotropic protein required for correct cell division in bacteria. It has ATP-dependent zinc metalloprotease activity. It was formerly designated cell division protein FtsH. [Cellular processes, Cell division, Protein fate, Degradation of proteins, peptides, and glycopeptides] 495
17722 130309 TIGR01242 26Sp45 26S proteasome subunit P45 family. Many proteins may score above the trusted cutoff because an internal 364
17723 273521 TIGR01243 CDC48 AAA family ATPase, CDC48 subfamily. This subfamily of the AAA family ATPases includes two members each from three archaeal species. It also includes yeast CDC48 (cell division control protein 48) and the human ortholog, transitional endoplasmic reticulum ATPase (valosin-containing protein). These proteins in eukaryotes are involved in the budding and transfer of membrane from the transitional endoplasmic reticulum to the Golgi apparatus. 733
17724 130311 TIGR01244 TIGR01244 TIGR01244 family protein. No member of this family is characterized. The member from Xylella fastidiosa is a longer protein with an N-terminal region described by this model, followed by a metallo-beta-lactamase family domain and an additional C-terminal region. Members scoring above the trusted cutoff are limited to the proteobacteria. [Hypothetical proteins, Conserved] 135
17725 273522 TIGR01245 trpD anthranilate phosphoribosyltransferase. In many widely different species, including E. coli, Thermotoga maritima, and Archaeoglobus fulgidus, this enzymatic domain (anthranilate phosphoribosyltransferase) is found C-terminal to glutamine amidotransferase; the fusion protein is designated anthranilate synthase component II (EC 4.1.3.27) [Amino acid biosynthesis, Aromatic amino acid family] 330
17726 162269 TIGR01246 dapE_proteo succinyl-diaminopimelate desuccinylase, proteobacterial clade. This model describes a proteobacterial subset of succinyl-diaminopimelate desuccinylases. An experimentally confirmed Gram-positive lineage succinyl-diaminopimelate desuccinylase has been described for Corynebacterium glutamicum (SP:Q59284), and a neighbor-joining tree shows the seed members, SP:Q59284, and putative archaeal members such as TrEMBL:O58003 in a single clade. However, the archaeal members differ substantially, share a number of motifs with acetylornithine deacetylases rather than succinyl-diaminopimelate desuccinylases, and are not taken as trusted examples of succinyl-diaminopimelate desuccinylases. This model is limited to proteobacterial members for this reason. [Amino acid biosynthesis, Aspartate family] 370
17727 130314 TIGR01247 drrB daunorubicin resistance ABC transporter membrane protein. This model describes daunorubicin resistance ABC transporter, membrane associated protein in bacteria and archaea. The protein is associated with efflux of the drug daunorubicin. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system is: an ATPase or ATP-binding subunit; an integral membrane protein; and a hydrophilic polypeptide, which likely functions as a substrate-binding protein. In eukaryotes, proteins of similar function include P-glycoproteins, multidrug resistance proteins, etc. [Transport and binding proteins, Other] 236
17728 130315 TIGR01248 drrC daunorubicin resistance protein C. The model describes daunorubicin resistance protein C in bacteria. This protein confers the function of daunorubicin resistance. The protein seems to share strong sequence similarity to UvrA proteins, which are involved in excision repair of DNA. Disruption of the drrC gene increased sensitivity upon exposure to daunorubicin. However, it failed to complement uvrA mutants exposed to UV irradiation. The mechanism by which it confers daunomycin resistance is unclear, but has been suggested to be different from DrrA and DrrB, which are antiporters. [Unclassified, Role category not yet assigned] 152
17729 130316 TIGR01249 pro_imino_pep_1 proline iminopeptidase, Neisseria-type subfamily. This model represents one of two related families of proline iminopeptidase in the alpha/beta fold hydrolase family. The fine specificities of the various members, including both the range of short peptides from which proline can be removed and whether other amino acids such as alanine can be also removed, may vary among members. 306
17730 188121 TIGR01250 pro_imino_pep_2 proline-specific peptidase, Bacillus coagulans-type subfamily. This model describes a subfamily of the alpha/beta fold family of hydrolases. Characterized members include prolinases (Pro-Xaa dipeptidase, EC 3.4.13.8), prolyl aminopeptidases (EC 3.4.11.5), and a leucyl aminopeptidase 289
17731 273523 TIGR01251 ribP_PPkin ribose-phosphate pyrophosphokinase. Alternate name: phosphoribosylpyrophosphate synthetase In some systems, close homologs lacking enzymatic activity exist and perform regulatory functions. The model is designated subfamily rather than equivalog for this reason. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 308
17732 273524 TIGR01252 acetolac_decarb alpha-acetolactate decarboxylase. Pyruvate can be fermented to 2,3-butanediol. It is first converted to alpha-acetolactate by alpha-acetolactate synthase, then decarboxylated to acetoin by this enzyme. Acetoin can be reduced in some species to 2,3-butanediol by acetoin reductase. [Energy metabolism, Fermentation] 232
17733 130320 TIGR01253 thiP thiamine ABC transporter, permease protein. The model describes thiamine ABC transporter, permease protein in bacteria. The protein belongs to the larger ABC transport system. It consists of at least three components: the inner membrane permease; thiamine binding protein; an ATP-binding subunit. It has been experimentally demonstrated that the mutants in the various steps in the de novo synthesis of the thiamine and the biologically active form, namely thiamine pyrophosphate can be exogenously supplemented with thiamine, thiamine monophosphate (TMP) or thiamine pyrophosphate (TPP). [Transport and binding proteins, Other] 519
17734 130321 TIGR01254 sfuA ABC transporter periplasmic binding protein, thiB subfamily. The model describes thiamine ABC transporter, periplasmic protein in bacteria and archaea. The protein belongs to the larger ABC transport system. It consists of at least three components: the thiamine binding periplasmic protein; an inner membrane permease; an ATP-binding subunit. It has been experimentally demonstrated that the mutants in the various steps in the de novo synthesis of the thiamine and the biologically active form, namely thiamine pyrophosphate can be exogenously supplemented with thiamine, thiamine monophosphate (TMP) or thiamine pyrophosphate (TPP). [Transport and binding proteins, Other] 304
17735 273525 TIGR01255 pyr_form_ly_1 formate acetyltransferase 1. Alternate names: pyruvate formate-lyase; formate C-acetyltransferase This enzyme converts formate + acetyl-CoA into pyruvate + CoA. This model describes formate acetyltransferase 1. More distantly related putative formate acetyltransferases have also been identified, including formate acetyltransferase 2 from E. coli, which is excluded from this model. [Energy metabolism, Fermentation] 744
17736 273526 TIGR01256 modA molybdenum ABC transporter, periplasmic molybdate-binding protein. The model describes the molybdate ABC transporter periplasmic binding protein in bacteria and archaea. Several of the periplasmic receptors constitute a diverse class of binding proteins that differ widely in size, sequence and ligand specificity. It has been shown experimentally by radioactive labeling that ModA represents a hydrophilic periplasmic binding protein in gram-negative organisms and its counterpart in gram-positive organisms is a lipoprotein. The other components of the system include the ModB, an integral membrane protein and ModC the ATP-binding subunit. Invariably almost all of them display a common beta/alpha folding motif and have similar tertiary structures consisting of two globular domains. [Transport and binding proteins, Anions] 216
17737 130324 TIGR01257 rim_protein retinal-specific rim ABC transporter. This model describes the photoreceptor protein (rim protein) in eukaryotes. It is a member of the ABC transporter superfamily. Rim protein is a membrane glycoprotein which is localized in the photoreceptor outer segment discs. Mutations in its genetic locus are implicated in recessive Stargardt's disease. [Transport and binding proteins, Other] 2272
17738 213596 TIGR01258 pgm_1 phosphoglycerate mutase, BPG-dependent, family 1. Most members of this family are phosphoglycerate mutase (EC 5.4.2.1). This enzyme interconverts 2-phosphoglycerate and 3-phosphoglycerate. The enzyme is transiently phosphorylated on an active site histidine by 2,3-diphosphoglycerate, which is both substrate and product. Some members of this family have phosphoglycerate mutase as a minor activity and act primarily as a bisphosphoglycerate mutase, interconverting 2,3-diphosphoglycerate and 1,3-diphosphoglycerate (EC 5.4.2.4). This model is designated as a subfamily for this reason. The second and third paralogs in S. cerevisiae are somewhat divergent and apparently inactive (see PUBMED:9544241) but are also part of this subfamily phylogenetically. 245
17739 213597 TIGR01259 comE comEA protein. This model describes the ComEA protein in bacteria. The com E locus is obligatory for bacterial cell competence - the process of internalizing the exogenous added DNA. Lesions in the loci have been variously described for the appearance of competence-related phenotypes and impairment of competence, suggesting their intimate functional role in bacterial transformation. [Cellular processes, DNA transformation] 120
17740 130327 TIGR01260 ATP_synt_c ATP synthase, F0 subunit c. This model describes the subunit c in F1/F0-ATP synthase, a membrane associated multisubunit complex found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. The functional role of subunit c, which is part of the F0 cluster, has been delineated by in vitro reconstitution experiments. Overall, experimental proof exists demonstrating that the electrochemical gradient is converted into a rotational torque that leads to ATP synthesis. [Energy metabolism, ATP-proton motive force interconversion] 58
17741 130328 TIGR01261 hisB_Nterm histidinol-phosphatase. This model describes histidinol phosphatase. All known examples in the scope of this model are bifunctional proteins with a histidinol phosphatase domain followed by an imidazoleglycerol-phosphate dehydratase domain. These enzymatic domains catalyze the ninth and seventh steps, respectively, of histidine biosynthesis. [Amino acid biosynthesis, Histidine family] 161
17742 273527 TIGR01262 maiA maleylacetoacetate isomerase. Maleylacetoacetate isomerase is an enzyme of tyrosine and phenylalanine catabolism. It requires glutathione and belongs by homology to the zeta family of glutathione S-transferases. The enzyme (EC 5.2.1.2) is described as active also on maleylpyruvate, and the example from a Ralstonia sp. catabolic plasmid is described as a maleylpyruvate isomerase involved in gentisate catabolism. [Energy metabolism, Amino acids and amines] 210
17743 273528 TIGR01263 4HPPD 4-hydroxyphenylpyruvate dioxygenase. This protein oxidizes 4-hydroxyphenylpyruvate, a tyrosine and phenylalanine catabolite, to homogentisate. Homogentisate can undergo a further non-enzymatic oxidation and polymerization into brown pigments that protect some bacterial species from light. A similar process occurs spontaneously in blood and is hemolytic. In some bacterial species, this enzyme has been studied as a hemolysin. [Energy metabolism, Amino acids and amines] 352
17744 273529 TIGR01264 tyr_amTase_E tyrosine aminotransferase, eukaryotic. This model describes tyrosine aminotransferase as found in animals and Trypanosoma cruzi. It is the first enzyme of a pathway of tyrosine degradation via homogentisate. Several plant enzymes designated as probable tyrosine aminotransferases are very closely related to an experimentally demonstrated nicotianamine aminotransferase, an enzyme in a siderophore (iron uptake chelator) biosynthesis pathway. These plant sequences are excluded from the model seed and score between the trusted and noise cutoffs. [Energy metabolism, Amino acids and amines] 401
17745 188123 TIGR01265 tyr_nico_aTase tyrosine/nicotianamine family aminotransferase. This subfamily of pyridoxal phosphate-dependent enzymes includes known examples of both tyrosine aminotransferase from animals and nicotianamine aminotransferase from barley. 403
17746 162276 TIGR01266 fum_ac_acetase fumarylacetoacetase. This enzyme catalyzes the final step in the breakdown of tyrosine or phenylalanine to fumarate and acetoacetate. [Energy metabolism, Amino acids and amines] 415
17747 130334 TIGR01267 Phe4hydrox_mono phenylalanine-4-hydroxylase, monomeric form. This model describes the smaller, monomeric form of phenylalanine-4-hydroxylase, as found in a small number of Gram-negative bacteria. The enzyme irreversibly converts phenylalanine to tyrosine and is known to be the rate-limiting step in phenylalanine catabolism in some systems. This family of biopterin and metal-dependent hydroxylases is related to a family of longer, multimeric aromatic amino acid hydroxylases that have additional N-terminal regulatory sequences. These include tyrosine 3-monooxygenase, phenylalanine-4-hydroxylase, and tryptophan 5-monooxygenase. [Energy metabolism, Amino acids and amines] 248
17748 130335 TIGR01268 Phe4hydrox_tetr phenylalanine-4-hydroxylase, tetrameric form. This model describes the larger, tetrameric form of phenylalanine-4-hydroxylase, as found in metazoans. The enzyme irreversibly converts phenylalanine to tyrosine and is known to be the rate-limiting step in phenylalanine catabolism in some systems. It is closely related to metazoan tyrosine 3-monooxygenase and tryptophan 5-monooxygenase, and more distantly to monomeric phenylalanine-4-hydroxylases of some Gram-negative bacteria. The member of this family from Drosophila has been described as having both phenylalanine-4-hydroxylase and tryptophan 5-monooxygenase activity. However, a Drosophila member of the tryptophan 5-monooxygenase clade has subsequently been discovered. 436
17749 130336 TIGR01269 Tyr_3_monoox tyrosine 3-monooxygenase, tetrameric. This model describes tyrosine 3-monooxygenase, a member of the family of tetrameric, biopterin-dependent aromatic amino acid hydroxylases found in metazoans. It is closely related to tetrameric phenylalanine-4-hydroxylase and tryptophan 5-monooxygenase, and more distantly related to the monomeric phenylalanine-4-hydroxylase found in some Gram-negative bacteria. 457
17750 130337 TIGR01270 Trp_5_monoox tryptophan 5-monooxygenase, tetrameric. This model describes tryptophan 5-monooxygenase, a member of the family of tetrameric, biopterin-dependent aromatic amino acid hydroxylases found in metazoans. It is closely related to tetrameric phenylalanine-4-hydroxylase and tyrosine 3-monooxygenase, and more distantly related to the monomeric phenylalanine-4-hydroxylase found in some Gram-negative bacteria. [Energy metabolism, Amino acids and amines] 464
17751 273530 TIGR01271 CFTR_protein cystic fibrosis transmembrane conductor regulator (CFTR). The model describes the cystic fibrosis transmembrane conductor regulator (CFTR) in eukaryotes. The principal role of this protein is chloride ion conductance. The protein is predicted to consist of 12 transmembrane domains. Mutations or lesions in the genetic loci have been linked to the aetiology of asthma, bronchiectasis, chronic obstructive pulmonary disease etc. Disease-causing mutations have been studied by 36Cl efflux assays in vitro cell cultures and electrophysiology, all of which point to the impairment of chloride channel stability and not the biosynthetic processing per se. [Transport and binding proteins, Anions] 1490
17752 273531 TIGR01272 gluP glucose/galactose transporter. This model describes the glucose/galactose transporter in bacteria. This belongs to the larger facilitator superfamily. Disruption of the loci leads to the total loss of glucose or galactose uptake in E. coli. Putative transporters in other bacterial species were isolated by functional complementation, which restored functional activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 310
17753 273532 TIGR01273 speA arginine decarboxylase, biosynthetic. Two alternative pathways can convert arginine to putrescine. One is decarboxylation by this enzyme followed by removal of the urea moiety by agmatinase. In the other, the ureohydrolase (arginase) acts first, followed by ornithine decarboxylase. This pathway leads to spermidine biosynthesis, hence the gene symbol speA. A distinct biodegradative form is also pyridoxal phosphate-dependent but is not similar in sequence. [Central intermediary metabolism, Polyamine biosynthesis] 624
17754 130341 TIGR01274 ACC_deam 1-aminocyclopropane-1-carboxylate deaminase. This pyridoxal phosphate-dependent enzyme degrades 1-aminocyclopropane-1-carboxylate, which in plants is a precursor of the ripening hormone ethylene, to ammonia and alpha-ketobutyrate. This model includes all members of this family for which function has been demonstrated experimentally, but excludes a closely related family often annotated as putative members of this family. [Central intermediary metabolism, Other] 337
17755 273533 TIGR01275 ACC_deam_rel pyridoxal phosphate-dependent enzymes, D-cysteine desulfhydrase family. This model represents a family of pyridoxal phosphate-dependent enzymes closely related to (and often designated as putative examples of) 1-aminocyclopropane-1-carboxylate deaminase. It appears that members of this family include both D-cysteine desulfhydrase (EC 4.4.1.15) and 1-aminocyclopropane-1-carboxylate deaminase (EC 3.5.99.7). 318
17756 130343 TIGR01276 thiB thiamine ABC transporter, periplasmic binding protein. This model finds the thiamine (and thiamine pyrophosphate) ABC transporter periplasmic binding protein ThiB in proteobacteria. Completed genomes having this protein (E. coli, Vibrio cholerae, Haemophilus influenzae) also have the permease ThiP, described by TIGRFAMs equivalog model TIGR01253. [Transport and binding proteins, Other] 309
17757 130344 TIGR01277 thiQ thiamine ABC transporter, ATP-binding protein. This model describes the energy-transducing ATPase subunit ThiQ of the ThiBPQ thiamine (and thiamine pyrophosphate) ABC transporter in several Proteobacteria. This protein is found so far only in Proteobacteria, and is found in complete genomes only if the ThiB and ThiP subunits are also found. [Transport and binding proteins, Other] 213
17758 273534 TIGR01278 DPOR_BchB light-independent protochlorophyllide reductase, B subunit. Alternate name: dark protochlorophyllide reductase. This model describes the B subunit of the dark form protochlorophyllide reductase, a nitrogenase-like enzyme. This subunit shows homology to the nitrogenase molybdenum-iron protein. It catalyzes a step in bacteriochlorophyll biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 511
17759 273535 TIGR01279 DPOR_bchN light-independent protochlorophyllide reductase, N subunit. This model describes the N subunit of the dark form protochlorophyllide reductase, a nitrogenase-like enzyme involved in bacteriochlorophyll biosynthesis. This subunit shows homology to the nitrogenase molybdenum-iron protein NifN. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 407
17760 273536 TIGR01280 xseB exodeoxyribonuclease VII, small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides. [DNA metabolism, Degradation of DNA] 54
17761 130348 TIGR01281 DPOR_bchL light-independent protochlorophyllide reductase, iron-sulfur ATP-binding protein. The BchL peptide (ChlL in chloroplast and cyanobacteria) is an ATP-binding iron-sulfur protein of the dark form protochlorophyllide reductase, an enzyme similar to nitrogenase. This subunit resembles the nitrogenase NifH subunit. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 268
17762 162284 TIGR01282 nifD nitrogenase molybdenum-iron protein alpha chain. Nitrogenase consists of alpha (NifD) and beta (NifK) subunits of the molybdenum-iron protein and an ATP-binding iron-sulfur protein (NifH). This model describes a large clade of NifD proteins, but excludes a lineage that contains putative NifD and NifD homologs from species with vanadium-dependent nitrogenases. [Central intermediary metabolism, Nitrogen fixation] 466
17763 188126 TIGR01283 nifE nitrogenase molybdenum-iron cofactor biosynthesis protein NifE. This protein is part of the NifEN complex involved in biosynthesis of the molybdenum-iron cofactor used by the homologous NifDK complex of nitrogenase. In a few species, the protein is found as a NifEN fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 453
17764 188127 TIGR01284 alt_nitrog_alph nitrogenase alpha chain. This model represents the alpha chains of various forms of the nitrogen-fixing enzyme nitrogenase: vanadium-iron, iron-iron, and molybdenum-iron. Most examples of NifD, the molybdenum-iron type nitrogenase alpha chain, are excluded from this model and described instead by equivalog model TIGR01282. It appears by phylogenetic and UPGMA trees that this model represents a distinct clade of NifD homologs, in which arose several molybdenum-independent forms. [Central intermediary metabolism, Nitrogen fixation] 457
17765 273537 TIGR01285 nifN nitrogenase molybdenum-iron cofactor biosynthesis protein NifN. This protein forms a complex with NifE, and appears as a NifEN fusion protein in some species. NifEN is required for producing the molybdenum-iron cofactor of molybdenum-requiring nitrogenases. NifN is closely related to the nitrogenase molybdenum-iron protein beta chain NifK. This model describes most examples of NifN but excludes some cases, such as the putative NifN of Chlorobium tepidum, for which a separate model may be created. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 432
17766 130353 TIGR01286 nifK nitrogenase molybdenum-iron protein beta chain. This model represents the majority of known sequences of the nitrogenase molybdenum-iron protein beta subunit. A distinct clade in a phylogenetic tree contains molybdenum-iron, vanadium-iron, and iron-iron forms of nitrogenase beta subunit and is excluded from this model. Nitrogenase, also called dinitrogenase, is responsible for nitrogen fixation. Note: the trusted cutoff score has recently been lowered to include an additional family in which the beta subunit is shorter by about 50 amino acids at the N-terminus. In species with the shorter form of the beta subunit, the alpha subunit has a novel insert of similar length. [Central intermediary metabolism, Nitrogen fixation] 515
17767 273538 TIGR01287 nifH nitrogenase iron protein. This model describes nitrogenase (EC 1.18.6.1) iron protein, also called nitrogenase reductase or nitrogenase component II. This model includes molybdenum-iron nitrogenase reductase (nifH), vanadium-iron nitrogenase reductase (vnfH), and iron-iron nitrogenase reductase (anfH). The model excludes the homologous protein from the light-independent protochlorophyllide reductase. [Central intermediary metabolism, Nitrogen fixation] 275
17768 130355 TIGR01288 nodI ATP-binding ABC transporter family nodulation protein NodI. This protein is required for normal nodulation by nitrogen-fixing root nodule bacteria such as Mesorhizobium loti. It is a member of the family of ABC transporter ATP binding proteins and works with NodJ to export a nodulation signal molecule. This model does not recognize the highly divergent NodI from Azorhizobium caulinodans. [Cellular processes, Other, Transport and binding proteins, Other] 303
17769 200089 TIGR01289 LPOR light-dependent protochlorophyllide reductase. This model represents the light-dependent, NADPH-dependent form of protochlorophyllide reductase. It belongs to the short chain alcohol dehydrogenase family, in contrast to the nitrogenase-related light-independent form. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 314
17770 273539 TIGR01290 nifB nitrogenase cofactor biosynthesis protein NifB. This model describes NifB, a protein required for the biosynthesis of the iron-molybdenum (or iron-vanadium) cofactor used by the nitrogen-fixing enzyme nitrogenase. NifB belongs to the radical SAM family, and the FeMo cluster biosynthesis process requires S-adenosylmethionine. Archaeal homologs lack the most C-terminal region and score between the trusted and noise cutoffs of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 442
17771 130358 TIGR01291 nodJ ABC-2 type transporter, NodJ family. Nearly all members of this subfamily are NodJ which, together with NodI (TIGR01288), acts to export a variety of modified carbohydrate molecules as signals to plant hosts to establish root nodules. The seed alignment includes a highly divergent member from Azorhizobium caulinodans that is, nonetheless, associated with nodulation. This model is designated as subfamily in part because not all sequences derived from the last common ancestral sequence of Rhizobium sp. and Azorhizobium caulinodans NodJ are necessarily nodulation proteins. [Cellular processes, Other, Transport and binding proteins, Other] 253
17772 273540 TIGR01292 TRX_reduct thioredoxin-disulfide reductase. This model describes thioredoxin-disulfide reductase, a member of the pyridine nucleotide-disulphide oxidoreductases (pfam00070). [Energy metabolism, Electron transport] 299
17773 213602 TIGR01293 Kv_beta voltage-dependent potassium channel beta subunit, animal. This model describes the conserved core region of the beta subunit of voltage-gated potassium (Kv) channels in animals. Amino-terminal regions differ substantially, in part by alternative splicing, and are not included in the model. Four beta subunits form a complex with four alpha subunit cytoplasmic (T1) regions, and the structure of the complex is solved. The beta subunit belongs to a family of NAD(P)H-dependent aldo-keto reductases, binds NADPH, and couples voltage-gated channel activity to the redox potential of the cell. Plant beta subunits and their closely related bacterial homologs (in Deinococcus radiodurans, Xylella fastidiosa, etc.) appear more closely related to each other than to animal forms. However, the bacterial species lack convincing counterparts of the Kv alpha subunit, and the Kv beta homolog may serve as an enzyme. Cutoffs are set for this model such that yeast and plant forms and bacterial close homologs score between trusted and noise cutoffs. 317
17774 273541 TIGR01294 P_lamban phospholamban. This model represents the short (52 residue) transmembrane phosphoprotein phospholamban. Phospholamban, in its unphosphorylated form, inhibits SERCA2, the cardiac sarcoplasmic reticulum Ca-ATPase. 52
17775 273542 TIGR01295 PedC_BrcD bacteriocin transport accessory protein, putative. This model describes a small family of proteins believed to aid in the export of various class II bacteriocins, which are ribosomally-synthesized, non-lantibiotic bacterial peptide antibiotics. Members of this family are found in operons for pediocin PA-1 from Pediococcus acidilactici and brochocin-C from Brochothrix campestris. 122
17776 273543 TIGR01296 asd_B aspartate-semialdehyde dehydrogenase (peptidoglycan organisms). Two closely related families of aspartate-semialdehyde dehydrogenase are found. They differ by a deep split in phylogenetic and percent identity trees and in gap patterns. This model represents a branch more closely related to the USG-1 protein than to the other aspartate-semialdehyde dehydrogenases represented in model TIGR00978. [Amino acid biosynthesis, Aspartate family] 338
17777 273544 TIGR01297 CDF cation diffusion facilitator family transporter. This model describes a broadly distributed family of transporters, a number of which have been shown to transport divalent cations of cobalt, cadmium and/or zinc. The family has six predicted transmembrane domains. Members of the family are variable in length because of variably sized inserts, often containing low-complexity sequence. [Transport and binding proteins, Cations and iron carrying compounds] 268
17778 188129 TIGR01298 RNaseT ribonuclease T. This model describes ribonuclease T, an enzyme found so far only in gamma-subdivision Proteobacteria such as Escherichia coli and Xylella fastidiosa. Ribonuclease T is homologous to the DNA polymerase III alpha chain. It can liberate AMP from the common C-C-A terminus of uncharged tRNA. It appears also to be involved in RNA maturation. It also acts as a 3' to 5' single-strand DNA-specific exonuclease; it is distinctive for its ability to remove residues near a double-stranded stem. Ribonuclease T is a high copy suppressor in E. coli of a uv-repair defect caused by deletion of three other single-stranded DNA exonucleases. [Transcription, RNA processing] 200
17779 130366 TIGR01299 synapt_SV2 synaptic vesicle protein SV2. This model describes a tightly conserved subfamily of the larger family of sugar (and other) transporters described by pfam00083. Members of this subfamily include closely related forms SV2A and SV2B of synaptic vesicle protein from vertebrates and a more distantly related homolog (below trusted cutoff) from Drosophila melanogaster. Members are predicted to have two sets of six transmembrane helices. 742
17780 130367 TIGR01300 CPA3_mnhG_phaG monovalent cation/proton antiporter, MnhG/PhaG subunit. This model represents a subfamily of small, transmembrane proteins believed to be components of Na+/H+ and K+/H+ antiporters. Members, including proteins designated MnhG from Staphylococcus aureus and PhaG from Rhizobium meliloti, show some similarity to chain L of the NADH dehydrogenase I, which also translocates protons. [Transport and binding proteins, Cations and iron carrying compounds] 97
17781 273545 TIGR01301 GPH_sucrose GPH family sucrose/H+ symporter. This model represents sucrose/proton symporters, found in plants, from the Glycoside-Pentoside-Hexuronide (GPH)/cation symporter family. These proteins are predicted to have 12 transmembrane domains. Members may export sucrose (e.g. SUT1, SUT4) from green parts to the phloem for long-distance transport or import sucrose (e.g. SUT2) to sucrose sinks such as the tap root of the carrot. 477
17782 273546 TIGR01302 IMP_dehydrog inosine-5'-monophosphate dehydrogenase. This model describes IMP dehydrogenase, an enzyme of GMP biosynthesis. This form contains two CBS domains. This model describes a rather tightly conserved cluster of IMP dehydrogenase sequences, many of which are characterized. The model excludes two related families of proteins proposed also to be IMP dehydrogenases, but without characterized members. These related families are the subject of separate models. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 450
17783 130370 TIGR01303 IMP_DH_rel_1 IMP dehydrogenase family protein. This model represents a family of proteins, often annotated as a putative IMP dehydrogenase, related to IMP dehydrogenase and GMP reductase and restricted to the high GC Gram-positive bacteria. All species in which a member is found so far (Corynebacterium glutamicum, Mycobacterium tuberculosis, Streptomyces coelicolor, etc.) also have IMP dehydrogenase as described by TIGRFAMs entry TIGR01302. [Unknown function, General] 475
17784 273547 TIGR01304 IMP_DH_rel_2 IMP dehydrogenase family protein. This model represents a family of proteins, often annotated as a putative IMP dehydrogenase, related to IMP dehydrogenase and GMP reductase. Most species with a member of this family belong to the high GC Gram-positive bacteria, and these also have the IMP dehydrogenase described by TIGRFAMs equivalog model TIGR01302. [Unknown function, General] 369
17785 130372 TIGR01305 GMP_reduct_1 guanosine monophosphate reductase, eukaryotic. A deep split separates two families of GMP reductase. This family includes both eukaryotic and some proteobacterial sequences, while the other family contains other bacterial sequences. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 343
17786 130373 TIGR01306 GMP_reduct_2 guanosine monophosphate reductase, bacterial. A deep split separates two families of GMP reductase. The other (TIGR01305) is found in eukaryotic and some proteobacterial lineages, including E. coli, while this family is found in a variety of bacterial lineages. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 321
17787 130374 TIGR01307 pgm_bpd_ind phosphoglycerate mutase (2,3-diphosphoglycerate-independent). This protein is about double the length of, and devoid of homology to, the form of phosphoglycerate mutase that uses 2,3-bisphosphoglycerate as a cofactor. [Energy metabolism, Glycolysis/gluconeogenesis] 501
17788 130375 TIGR01308 rpmD_bact ribosomal protein L30, bacterial/organelle. This model describes bacterial (and organellar) 50S ribosomal protein L30. Homologous ribosomal proteins of the eukaryotic cytosol and of the archaea differ substantially in architecture, from bacterial L30 and also from each other, and are described by separate models. [Protein synthesis, Ribosomal proteins: synthesis and modification] 55
17789 130376 TIGR01309 uL30_arch 50S ribosomal protein uL30, archaeal form. This model represents the archaeal ribosomal protein similar to longer (~ 250 residue) eukaryotic 60S ribosomal protein L7 and to the much shorter (~ 60 residue) bacterial 50S ribosomal protein L30. Protein naming follows the SwissProt designation as L30P, while the gene symbol rpmD follows TIGR usage. [Protein synthesis, Ribosomal proteins: synthesis and modification] 152
17790 273548 TIGR01310 uL30_euk 60S ribosomal protein uL30, eukaryotic form. This model describes the eukaryotic 60S (cytosolic) ribosomal protein uL30 (previously L7) and paralogs that may or may not also be uL30. Human, Drosophila, and Arabidopsis all have both a typical L7 and an L7-related paralog. This family is designated subfamily rather than equivalog to reflect these uncharacterized paralogs. Members of this family average ~ 250 residues in length, somewhat longer than the archaeal L30P/L7E homolog (~ 155 residues) and much longer than the related bacterial/organellar form (~ 60 residues). 235
17791 273549 TIGR01311 glycerol_kin glycerol kinase. This model describes glycerol kinase, a member of the FGGY family of carbohydrate kinases. [Energy metabolism, Other] 493
17792 273550 TIGR01312 XylB D-xylulose kinase. This model describes D-xylulose kinases, a subfamily of the FGGY family of carbohydrate kinases. The member from Klebsiella pneumoniae, designated DalK, was annotated erroneously in GenBank as D-arabinitol kinase but is authentic D-xylulose kinase. D-xylulose kinase (XylB) generally is found with xylose isomerase (XylA) and acts in xylose utilization. [Energy metabolism, Sugars] 481
17793 273551 TIGR01313 therm_gnt_kin carbohydrate kinase, thermoresistant glucokinase family. This model represents a subfamily of proteins that includes thermoresistant and thermosensitive isozymes of gluconate kinase (gluconokinase) in E. coli and other related proteins; members of this family are often named by similarity to the thermostable isozyme. These proteins show homology to shikimate kinases and adenylate kinases but not to gluconate kinases from the FGGY family of carbohydrate kinases. 163
17794 130381 TIGR01314 gntK_FGGY gluconate kinase, FGGY type. Gluconate is derived from glucose in two steps. This model describes one form of gluconate kinase, belonging to the FGGY family of carbohydrate kinases. Gluconate kinase phosphorylates gluconate for entry into the Entner-Doudoroff pathway. [Energy metabolism, Sugars] 505
17795 273552 TIGR01315 5C_CHO_kinase FGGY-family pentulose kinase. This model represents a subfamily of the FGGY family of carbohydrate kinases. This subfamily is closely related to a set of ribulose kinases, and many members are designated ribitol kinase. However, the member from Klebsiella pneumoniae, from a ribitol catabolism operon, accepts D-ribulose and to a lesser extent D-arabinitol and ribitol (and JW Lengeler, personal communication); its annotation in GenBank as ribitol kinase is imprecise and may have affected public annotation of related proteins. 541
17796 130383 TIGR01316 gltA glutamate synthase (NADPH), homotetrameric. This protein is homologous to the small subunit of NADPH and NADH forms of glutamate synthase as found in eukaryotes and some bacteria. This protein is found in numerous species having no homolog of the glutamate synthase large subunit. The prototype of the family, from Pyrococcus sp. KOD1, was shown to be active as a homotetramer and to require NADPH. [Amino acid biosynthesis, Glutamate family] 449
17797 162300 TIGR01317 GOGAT_sm_gam glutamate synthases, NADH/NADPH, small subunit. This model represents one of three built for the NADPH-dependent or NADH-dependent glutamate synthase (EC 1.4.1.13 and 1.4.1.14, respectively) small subunit or homologous region. TIGR01316 describes a family in several archaeal and deeply branched bacterial lineages of a homotetrameric form for which there is no large subunit. Another model describes glutamate synthase small subunit from gamma and some alpha subdivision Proteobacteria plus paralogs of unknown function. This model describes the small subunit, or homologous region of longer proteins, of eukaryotes, Gram-positive bacteria, cyanobacteria, and some other lineages. All members with known function participate in NADH or NADPH-dependent reactions to interconvert between glutamine plus 2-oxoglutarate and two molecules of glutamate. 485
17798 273553 TIGR01318 gltD_gamma_fam glutamate synthase small subunit family protein, proteobacterial. This model represents one of three built for the NADPH-dependent or NADH-dependent glutamate synthase (EC 1.4.1.13 and 1.4.1.14, respectively) small subunit and homologs. TIGR01317 describes the small subunit (or equivalent region from longer forms) in eukaryotes, Gram-positive bacteria, and some other lineages, both NADH and NADPH-dependent. TIGR01316 describes a protein of similar length, from Archaea and a number of bacterial lineages, that forms glutamate synthase homotetramers without a large subunit. This model describes both glutamate synthase small subunit and closely related paralogs of unknown function from a number of gamma and alpha subdivision Proteobacteria, including E. coli. 467
17799 130386 TIGR01319 glmL_fam conserved hypothetical protein. This small family includes, so far, an uncharacterized protein from E. coli O157:H7 and GlmL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family (pfam01968). 463
17800 130387 TIGR01320 mal_quin_oxido malate:quinone-oxidoreductase. This membrane-associated enzyme is an alternative to the better-known NAD-dependent malate dehydrogenase as part of the TCA cycle. The reduction of a quinone rather than NAD+ makes the reaction essentially irreversible in the direction of malate oxidation to oxaloacetate. Both forms of malate dehydrogenase are active in E. coli; disruption of this form causes less phenotypic change. In some bacteria, this form is the only or the more important malate dehydrogenase. [Energy metabolism, TCA cycle] 483
17801 130388 TIGR01321 TrpR trp operon repressor, proteobacterial. This model represents TrpR, the repressor of the trp operon. It is found so far only in the gamma subdivision of the proteobacteria and in Chlamydia trachomatis. All members belong to species capable of tryptophan biosynthesis. [Amino acid biosynthesis, Aromatic amino acid family, Regulatory functions, DNA interactions] 94
17802 273554 TIGR01322 scrB_fam sucrose-6-phosphate hydrolase. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 445
17803 188130 TIGR01323 nitrile_alph nitrile hydratase, alpha subunit. This model describes both iron- and cobalt-containing nitrile hydratase alpha chains. It excludes the thiocyanate hydrolase gamma subunit of Thiobacillus thioparus, a sequence that appears to have evolved from within the family of nitrile hydratase alpha subunits but which differs by several indels and a more rapid accumulation of point mutations. [Energy metabolism, Amino acids and amines] 189
17804 130391 TIGR01324 cysta_beta_ly_B cystathionine beta-lyase, bacterial. This model represents cystathionine beta-lyase (alternate name: beta-cystathionase), one of several pyridoxal-dependent enzymes of cysteine, methionine, and homocysteine metabolism. This enzyme is involved in the biosynthesis of Met from Cys. [Amino acid biosynthesis, Aspartate family] 377
17805 188131 TIGR01325 O_suc_HS_sulf O-succinylhomoserine sulfhydrylase. This model describes O-succinylhomoserine sulfhydrylase, one of several related pyridoxal phosphate-dependent enzymes of cysteine and methionine metabolism. This enzyme is part of an alternative pathway of homocysteine biosynthesis, a step in methionine biosynthesis. [Amino acid biosynthesis, Aspartate family] 381
17806 273555 TIGR01326 OAH_OAS_sulfhy OAH/OAS sulfhydrylase. This model describes a distinct clade of the Cys/Met metabolism pyridoxal phosphate-dependent enzyme superfamily. Members include examples of OAH/OAS sulfhydrylase, an enzyme with activity both as O-acetylhomoserine (OAH) sulfhydrylase (EC 2.5.1.49) and O-acetylserine (OAS) sulfhydrylase (EC 2.5.1.47). An alternate name for OAH sulfhydrylase is homocysteine synthase. This model is designated subfamily because it may or may not have both activities. [Amino acid biosynthesis, Aspartate family, Amino acid biosynthesis, Serine family] 418
17807 273556 TIGR01327 PGDH D-3-phosphoglycerate dehydrogenase. This model represents a long form of D-3-phosphoglycerate dehydrogenase, the serA gene of one pathway of serine biosynthesis. Shorter forms, scoring between trusted and noise cutoff, include SerA from E. coli. [Amino acid biosynthesis, Serine family] 525
17808 130395 TIGR01328 met_gam_lyase methionine gamma-lyase. This model describes a methionine gamma-lyase subset of a family of PLP-dependent trans-sulfuration enzymes. The member from the parasite Trichomonas vaginalis is described as catalyzing alpha,gamma- and alpha,beta-eliminations and gamma-replacement reactions on methionine, cysteine, and some derivatives. Likewise, the enzyme from Pseudomonas degrades cysteine as well as methionine. [Energy metabolism, Amino acids and amines] 391
17809 273557 TIGR01329 cysta_beta_ly_E cystathionine beta-lyase, eukaryotic. This model represents cystathionine beta-lyase (alternate name: beta-cystathionase), one of several pyridoxal-dependent enzymes of cysteine, methionine, and homocysteine metabolism. This enzyme is involved in the biosynthesis of Met from Cys. 378
17810 273558 TIGR01330 bisphos_HAL2 3'(2'),5'-bisphosphate nucleotidase, HAL2 family. Sulfate is incorporated into 3'-phosphoadenylylsulfate, PAPS, for utilization in pathways such as methionine biosynthesis. Transfer of sulfate from PAPS to an acceptor leaves adenosine 3',5'-bisphosphate, PAP. This model describes a form found in plants of the enzyme 3'(2'),5'-bisphosphate nucleotidase, which removes the 3'-phosphate from PAP to regenerate AMP and help drive the cycle. Sensitivity of this essential enzyme to sodium and other metal ions is responsible for characterization of this enzyme as a salt tolerance protein. Some members of this family are active also as inositol 1-monophosphatase. 353
17811 130398 TIGR01331 bisphos_cysQ 3'(2'),5'-bisphosphate nucleotidase, bacterial. Sulfate is incorporated into 3'-phosphoadenylylsulfate, PAPS, for utilization in pathways such as methionine biosynthesis. Transfer of sulfate from PAPS to an acceptor leaves adenosine 3',5'-bisphosphate, PAP. This model describes a form found in bacteria of the enzyme 3'(2'),5'-bisphosphate nucleotidase, which removes the 3'-phosphate from PAP to regenerate AMP and help drive the cycle. [Central intermediary metabolism, Sulfur metabolism] 249
17812 130399 TIGR01332 cyt_b559_alpha cytochrome b559, alpha subunit. This model describes the alpha subunit of cytochrome b559, about 83 residues in length. The N-terminal half is homologous to the ~ 40-residue beta subunit. Cytochrome b559 is associated with photosystem II. Sequences scoring between trusted and noise cutoffs are fragments. [Energy metabolism, Photosynthesis] 80
17813 130400 TIGR01333 cyt_b559_beta cytochrome b559, beta subunit. This model describes the beta subunit of cytochrome b559, about 40 residues in length. It is homologous to the N-terminal half of the alpha subunit, a protein of about 83 residues. Cytochrome b559 is associated with photosystem II. [Energy metabolism, Photosynthesis] 43
17814 130401 TIGR01334 modD putative molybdenum utilization protein ModD. The gene modD for a member of this family is found with molybdenum transport genes modABC in Rhodobacter capsulatus. However, disruption of modD causes only a 4-fold (rather than 500-fold for modA, modB, modC) change in the external molybdenum concentration required to suppress an alternative nitrogenase. ModD proteins are highly similar to nicotinate-nucleotide pyrophosphorylase (also called quinolinate phosphoribosyltransferase). The function is unknown. [Unknown function, General] 277
17815 130402 TIGR01335 psaA photosystem I core protein PsaA. The core proteins of photosystem I are PsaA and PsaB, homologous integral membrane proteins that form a heterodimer. The heterodimer binds the electron-donating chlorophyll dimer P700, as well as chlorophyll, phylloquinone, and 4Fe-4S electron acceptors. This model describes PsaA only. [Energy metabolism, Photosynthesis] 752
17816 130403 TIGR01336 psaB photosystem I core protein PsaB. The core proteins of photosystem I are PsaA and PsaB, homologous integral membrane proteins that form a heterodimer. The heterodimer binds the electron-donating chlorophyll dimer P700, as well as chlorophyll, phylloquinone, and 4Fe-4S electron acceptors. This model describes PsaB only. [Energy metabolism, Photosynthesis] 734
17817 273559 TIGR01337 apcB allophycocyanin, beta subunit. The alpha and beta subunits of allophycocyanin form heterodimers, six of which associate into larger aggregates as part of the phycobilisome, a light-harvesting complex of phycobiliproteins and linker proteins. This model describes allophycocyanin beta subunit. Other, homologous phycobiliproteins include allophycocyanin alpha chain and the phycocyanin and phycoerythrin alpha and beta chains. [Energy metabolism, Photosynthesis] 167
17818 130405 TIGR01338 phycocy_alpha phycocyanin, alpha subunit. This model describes the phycocyanin alpha subunit. Other, homologous phycobiliproteins of the phycobilisome include phycocyanin beta chain and the allophycocyanin and phycoerythrin alpha and beta chains. This model excludes the closely related phycoerythrocyanin alpha subunit. [Energy metabolism, Photosynthesis] 161
17819 273560 TIGR01339 phycocy_beta phycocyanin, beta subunit. This model describes the phycocyanin beta subunit. Other, homologous phycobiliproteins of the phycobilisome include phycocyanin alpha chain and the allophycocyanin and phycoerythrin alpha and beta chains. This model excludes the closely related phycoerythrocyanin beta subunit. [Energy metabolism, Photosynthesis] 171
17820 273561 TIGR01340 aconitase_mito aconitate hydratase, mitochondrial. This model represents mitochondrial forms of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. [Energy metabolism, TCA cycle] 745
17821 273562 TIGR01341 aconitase_1 aconitate hydratase 1. This model represents one form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. It is found in bacteria, archaea, and eukaryotic cytosol. It has been shown to act also as an iron-responsive element binding protein in animals and may have the same role in other eukaryotes. [Energy metabolism, TCA cycle] 876
17822 130409 TIGR01342 acon_putative aconitate hydratase, putative, Aquifex type. This model represents a small family of proteins homologous to (and likely functionally equivalent to) aconitase 1. Members are found, so far, in the anaerobe Clostridium acetobutylicum, in the microaerophilic, early-branching bacterium Aquifex aeolicus, and in the halophilic archaeon Halobacterium sp. NRC-1. No member is experimentally characterized. [Energy metabolism, TCA cycle] 658
17823 273563 TIGR01343 hacA_fam homoaconitate hydratase family protein. This model represents a subfamily of proteins consisting of aconitase, homoaconitase, 3-isopropylmalate dehydratase, and uncharacterized proteins. The majority of the members of this family have been designated as 3-isopropylmalate dehydratase large subunit (LeuC) in microbial genome annotation, but the only characterized member is Thermus thermophilus homoaconitase, an enzyme of a non-aspartate pathway of Lys biosynthesis. 412
17824 188132 TIGR01344 malate_syn_A malate synthase A. This model represents plant malate synthase and one of two bacterial forms, designated malate synthase A. The distantly related malate synthase G is described by a separate model. This enzyme and isocitrate lyase are the two characteristic enzymes of the glyoxylate shunt. The shunt enables the cell to use acetyl-CoA to generate increased levels of TCA cycle intermediates for biosynthetic pathways such as gluconeogenesis. [Energy metabolism, TCA cycle] 511
17825 130412 TIGR01345 malate_syn_G malate synthase G. This model describes the G isozyme of malate synthase. Isocitrate lyase and malate synthase form the glyoxylate shunt, which generates additional TCA cycle intermediates. [Energy metabolism, TCA cycle] 721
17826 273564 TIGR01346 isocit_lyase isocitrate lyase. Isocitrate lyase and malate synthase are the enzymes of the glyoxylate shunt, a pathway associated with the TCA cycle. [Energy metabolism, TCA cycle] 527
17827 273565 TIGR01347 sucB 2-oxoglutarate dehydrogenase complex dihydrolipoamide succinyltransferase (E2 component). This model describes the TCA cycle 2-oxoglutarate system E2 component, dihydrolipoamide succinyltransferase. It is closely related to the pyruvate dehydrogenase E2 component, dihydrolipoamide acetyltransferase. The seed for this model includes mitochondrial and Gram-negative bacterial forms. Mycobacterial candidates are highly derived and differ in having an extra copy of the lipoyl-binding domain at the N-terminus. They score below the trusted cutoff, but above the noise cutoff and above all examples of dihydrolipoamide acetyltransferase. [Energy metabolism, TCA cycle] 403
17828 273566 TIGR01348 PDHac_trf_long pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase, long form. This model describes a subset of pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase specifically close by both phylogenetic and per cent identity (UPGMA) trees. Members of this set include two or three copies of the lipoyl-binding domain. E. coli AceF is a member of this model, while mitochondrial and some other bacterial forms belong to a separate model. [Energy metabolism, Pyruvate dehydrogenase] 546
17829 273567 TIGR01349 PDHac_trf_mito pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase, long form. This model represents one of several closely related clades of the dihydrolipoamide acetyltransferase subunit of the pyruvate dehydrogenase complex. It includes sequences from mitochondria and from alpha and beta branches of the proteobacteria, as well as from some other bacteria. Sequences from Gram-positive bacteria are not included. The non-enzymatic homolog protein X, which serves as an E3 component binding protein, falls within the clade phylogenetically but is rejected by its low score. [Energy metabolism, Pyruvate dehydrogenase] 436
17830 273568 TIGR01350 lipoamide_DH dihydrolipoamide dehydrogenase. This model describes dihydrolipoamide dehydrogenase, a flavoprotein that acts in a number of ways. It is the E3 component of dehydrogenase complexes for pyruvate, 2-oxoglutarate, 2-oxoisovalerate, and acetoin. It can also serve as the L protein of the glycine cleavage system. This family includes a few members known to have distinct functions (ferric leghemoglobin reductase and NADH:ferredoxin oxidoreductase) but that may be predicted by homology to act as dihydrolipoamide dehydrogenase as well. The motif GGXCXXXGCXP near the N-terminus contains a redox-active disulfide. 460
17831 273569 TIGR01351 adk adenylate kinase. Adenylate kinase (EC 2.7.4.3) converts ATP + AMP to ADP + ADP, that is, uses ATP as a phosphate donor for AMP. Most members of this family are known or believed to be adenylate kinase. However, some members accept other nucleotide triphosphates as donors, may be unable to use ATP, and may fail to complement adenylate kinase mutants. An example of a nucleoside-triphosphate--adenylate kinase (EC 2.7.4.10) is SP|Q9UIJ7, a GTP:AMP phosphotransferase. This family is designated subfamily rather than equivalog for this reason. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 210
17832 273570 TIGR01352 tonB_Cterm TonB family C-terminal domain. This model represents the C-terminal domain of TonB and its homologs. TonB is an energy-transducer for TonB-dependent receptors of Gram-negative bacteria. Most members are designated as TonB or TonB-related proteins, but a few represent the paralogous TolA protein. Several bacteria have up to four TonB paralogs. In nearly every case, a proline-rich repetitive region is found N-terminal to this domain; these low-complexity regions are highly divergent and cannot readily be aligned. The region is suggested to help span the periplasm. [Transport and binding proteins, Cations and iron carrying compounds] 74
17833 273571 TIGR01353 dGTP_triPase deoxyguanosinetriphosphate triphosphohydrolase, putative. dGTP triphosphohydrolase (dgt) releases inorganic triphosphate, an unusual reaction product, from dGTP. Its activity has been called limited to the Enterobacteriaceae, although homologous sequences are detected elsewhere. This finding casts doubt on whether the activity is shared in other species. In several of these other species, the homologous gene is found in an apparent operon with dnaG, the DNA primase gene. The enzyme from E. coli was shown to bind cooperatively to single stranded DNA. The biological role of dgt is unknown. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 381
17834 273572 TIGR01354 cyt_deam_tetra cytidine deaminase, homotetrameric. This small, homotetrameric zinc metalloprotein is found in humans and most bacteria. A related, homodimeric form with a much larger subunit is found in E. coli and in Arabidopsis. Both types may act on deoxycytidine as well as cytidine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 127
17835 273573 TIGR01355 cyt_deam_dimer cytidine deaminase, homodimeric. This homodimeric zinc metalloprotein is found in Arabidopsis and some Proteobacteria. A related, homotetrameric form with a much smaller subunit is found in most bacteria and in animals. Both types may act on deoxycytidine as well as cytidine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 283
17836 273574 TIGR01356 aroA 3-phosphoshikimate 1-carboxyvinyltransferase. This model represents 3-phosphoshikimate-1-carboxyvinyltransferase (aroA), which catalyzes the sixth of seven steps in the shikimate pathway of the biosynthesis of chorismate. Chorismate is the last common precursor of all three aromatic amino acids. Sequences scoring between the trusted and noise cutoffs include fragmentary and aberrant sequences in which generally well-conserved motifs are missing or altered, but no example of a protein known to have a different function. [Amino acid biosynthesis, Aromatic amino acid family] 409
17837 273575 TIGR01357 aroB 3-dehydroquinate synthase. This model represents 3-dehydroquinate synthase, the enzyme catalyzing the second of seven steps in the shikimate pathway of chorismate biosynthesis. Chorismate is the last common intermediate in the biosynthesis of all three aromatic amino acids. [Amino acid biosynthesis, Aromatic amino acid family] 344
17838 130425 TIGR01358 DAHP_synth_II 3-deoxy-7-phosphoheptulonate synthase, class II. This model represents the class II family of 3-deoxy-7-phosphoheptulonate synthase, aka phospho-2-dehydro-3-deoxyheptonate aldolase, as found in plants and some bacteria. It shows some similarity to the class I family found in many bacteria. The enzyme catalyzes the first of 7 steps in the biosynthesis of chorismate, the last common precursor of all three aromatic amino acids. Homologs scoring between trusted and noise cutoff include proteins involved in antibiotic biosynthesis; one example is active as this enzyme, while another acts on an amino analog. [Amino acid biosynthesis, Aromatic amino acid family] 443
17839 273576 TIGR01359 UMP_CMP_kin_fam UMP-CMP kinase family. This subfamily of the adenylate kinase superfamily contains examples of UMP-CMP kinase, as well as other proteins with unknown specificity, some currently designated adenylate kinase. All known members are eukaryotic. 185
17840 130427 TIGR01360 aden_kin_iso1 adenylate kinase, isozyme 1 subfamily. Members of this family are adenylate kinase, EC 2.7.4.3. This clade is found only in eukaryotes and includes human adenylate kinase isozyme 1 (myokinase). Within the adenylate kinase superfamily, this set appears specifically closely related to a subfamily of eukaryotic UMP-CMP kinases (TIGR01359), rather than to the large clade of bacterial, archaeal, and eukaryotic adenylate kinase family members in TIGR01351. 188
17841 273577 TIGR01361 DAHP_synth_Bsub 3-deoxy-7-phosphoheptulonate synthase. This model describes one of at least three types of phospho-2-dehydro-3-deoxyheptonate aldolase (DAHP synthase). This enzyme catalyzes the first of 7 steps in the biosynthesis of chorismate, the last common precursor of all three aromatic amino acids and of PABA, ubiquinone and menaquinone. Some members of this family, including an experimentally characterized member from Bacillus subtilis, are bifunctional, with a chorismate mutase domain N-terminal to this region. The member of this family from Synechocystis PCC 6803, CcmA, was shown to be essential for carboxysome formation. However, no other candidate for this enzyme is present in that species, chorismate biosynthesis does occur, other species having this protein lack carboxysomes but appear to make chorismate, and a requirement of CcmA for carboxysome formation does not prohibit a role in chorismate biosynthesis. [Amino acid biosynthesis, Aromatic amino acid family] 260
17842 130429 TIGR01362 KDO8P_synth 3-deoxy-8-phosphooctulonate synthase. This model describes 3-deoxy-8-phosphooctulonate synthase. Alternate names include 2-dehydro-3-deoxyphosphooctonate aldolase, 3-deoxy-d-manno-octulosonic acid 8-phosphate and KDO-8 phosphate synthetase. It catalyzes the aldol condensation of phosphoenolpyruvate with D-arabinose 5-phosphate: phosphoenolpyruvate + D-arabinose 5-phosphate + H2O = 2-dehydro-3-deoxy-D-octonate 8-phosphate + phosphate. In Gram-negative bacteria, this is the first step in the biosynthesis of 3-deoxy-D-manno-octulosonate, part of the oligosaccharide core of lipopolysaccharide. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 258
17843 273578 TIGR01363 strep_his_triad streptococcal histidine triad protein. This model represents the N-terminal half of a family of Streptococcal proteins that contain a signal peptide and then up to five repeats of a region that includes a His-X-X-His-X-His (histidine triad) motif. Three repeats are found in the seed alignment. Additional copies in more poorly conserved regions may be detected. Members of this family from Streptococcus pneumoniae are suggested to cleave human C3, and the member PhpA has been shown in vaccine studies to be a protective antigen in mice. [Cellular processes, Pathogenesis] 348
17844 130431 TIGR01364 serC_1 phosphoserine aminotransferase. This model represents the common form of the phosphoserine aminotransferase SerC. The phosphoserine aminotransferase of the archaeon Methanosarcina barkeri and putative phosphoserine aminotransferase of Mycobacterium tuberculosis are represented by separate models. All are members of the class V aminotransferases (pfam00266). [Amino acid biosynthesis, Serine family] 349
17845 130432 TIGR01365 serC_2 phosphoserine aminotransferase, Methanosarcina type. This model represents a variant form of the serine biosynthesis enzyme phosphoserine aminotransferase, as found in a small number of distantly related species, including Caulobacter crescentus, Mesorhizobium loti, and the archaeon Methanosarcina barkeri. [Amino acid biosynthesis, Serine family] 374
17846 130433 TIGR01366 serC_3 phosphoserine aminotransferase, putative. This model represents a putative variant form of the serine biosynthesis enzyme phosphoserine aminotransferase, as found in Mycobacterium tuberculosis and related high-GC Gram-positive bacteria. [Amino acid biosynthesis, Serine family] 361
17847 273579 TIGR01367 pyrE_Therm orotate phosphoribosyltransferase, Thermus family. This model represents a distinct clade of orotate phosphoribosyltransferases. Members include the experimentally determined example from Thermus aquaticus and additional examples from Caulobacter crescentus, Helicobacter pylori, Mesorhizobium loti, and related species. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 187
17848 273580 TIGR01368 CPSaseIIsmall carbamoyl-phosphate synthase, small subunit. This model represents the whole of the small chain of the glutamine-dependent form (EC 6.3.5.5) of carbamoyl phosphate synthase, CPSase II. The C-terminal domain has glutamine amidotransferase activity. Note that the sequence from the mammalian urea cycle form has lost the active site Cys, resulting in an ammonia-dependent form, CPSase I (EC 6.3.4.16). CPSases of pyrimidine biosynthesis, arginine biosynthesis, and the urea cycle may be encoded by one or by several genes, depending on the species. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 357
17849 273581 TIGR01369 CPSaseII_lrg carbamoyl-phosphate synthase, large subunit. Carbamoyl-phosphate synthase (CPSase) catalyzes the first committed step in pyrimidine, arginine, and urea biosynthesis. In general, it is a glutamine-dependent enzyme, EC 6.3.5.5, termed CPSase II in eukaryotes. An exception is the mammalian mitochondrial urea-cycle form, CPSase I, in which the glutamine amidotransferase domain active site Cys on the small subunit has been lost, and the enzyme is ammonia-dependent. In both CPSase I and the closely related, glutamine-dependent CPSase III (allosterically activated by acetyl-glutamate) demonstrated in some other vertebrates, the small and large chain regions are fused in a single polypeptide chain. This model represents the large chain of glutamine-hydrolysing carbamoyl-phosphate synthases, or the corresponding regions of larger, multifunctional proteins, as found in all domains of life, and CPSase I forms are considered exceptions within the family. In several thermophilic species (Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Aquifex aeolicus), the large subunit appears split, at different points, into two separate genes. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 1050
17850 273582 TIGR01370 TIGR01370 extracellular protein. Original assignment of this protein family as cysteinyl-tRNA synthetase is controversial, supported by but challenged by and by subsequent discovery of the actual mechanism for synthesizing Cys-tRNA in species where a direct Cys--tRNA ligase was not found. Lingering legacy annotations of members of this family probably should be removed. Evidence against the role includes a signal peptide. This family has been renamed "extracellular protein" to facilitate correction. Members of this family occur in Deinococcus radiodurans (bacterial) and Methanococcus jannaschii (archaeal). A number of homologous but more distantly related proteins are annotated as alpha-1,4 polygalactosaminidases. The function remains unknown. [Unknown function, General] 315
17851 273583 TIGR01371 met_syn_B12ind 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase. This model describes the cobalamin-independent methionine synthase. A family of uncharacterized archaeal proteins is homologous to the C-terminal region of this family. That family is excluded from this model but, along with this family, belongs to pfam01717. [Amino acid biosynthesis, Aspartate family] 750
17852 273584 TIGR01372 soxA sarcosine oxidase, alpha subunit family, heterotetrameric form. This model describes the alpha subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines] 985
17853 273585 TIGR01373 soxB sarcosine oxidase, beta subunit family, heterotetrameric form. This model describes the beta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines] 407
17854 130441 TIGR01374 soxD sarcosine oxidase, delta subunit family, heterotetrameric form. This model describes the delta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines] 84
17855 273586 TIGR01375 soxG sarcosine oxidase, gamma subunit family, heterotetrameric form. This model describes the gamma subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. The gamma subunit is the most divergent between operons of the four subunits. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines] 152
17856 273587 TIGR01376 POMP_repeat Chlamydial polymorphic outer membrane protein repeat. This model represents a repeat region of about 27 residues that appears from twice to over twenty times in Chlamydial polymorphic outer membrane proteins (POMP). Characteristic motifs in the repeat are FXXN and GGAI. Except for a few apparently truncated examples, Chlamydial proteins have this repeat region if and only if they also have the autotransporter beta-domain (pfam03797) at the C-terminus, with Phe as the C-terminal residue. This repeat is observed, but is very rare, outside the Chlamydias. 27
17857 130444 TIGR01377 soxA_mon sarcosine oxidase, monomeric form. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines] 380
17858 273588 TIGR01378 thi_PPkinase thiamine pyrophosphokinase. This model has been revised. Originally, it described strictly eukaryotic thiamine pyrophosphokinase. However, it is now expanded to include also homologous enzymes, apparently functionally equivalent, from species that rely on thiamine pyrophosphokinase rather than thiamine-monophosphate kinase (TIGR01379) to produce the active TPP cofactor. This includes the thiamine pyrophosphokinase from Bacillus subtilis, previously designated YloS. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 205
17859 273589 TIGR01379 thiL thiamine-phosphate kinase. This model describes thiamine-monophosphate kinase, an enzyme that converts thiamine monophosphate into thiamine pyrophosphate (TPP, coenzyme B1), an enzyme cofactor. Thiamine monophosphate may be derived from de novo synthesis or from unphosphorylated thiamine, known as vitamin B1. Proteins scoring between the trusted and noise cutoff for this model include short forms from the Thermoplasmas (which lack the N-terminal region) and a highly derived form from Campylobacter jejuni. Eukaryotes lack this enzyme, and add pyrophosphate from ATP to unphosphorylated thiamine in a single step. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 317
17860 130447 TIGR01380 glut_syn glutathione synthetase, prokaryotic. This model was built using glutathione synthetases found in Gram-negative bacteria. This gene does not appear to be present in genomes of Gram-positive bacteria. Glutathione synthetase has an ATP-binding domain in the COOH terminus and catalyzes the second step in the glutathione biosynthesis pathway: ATP + gamma-L-glutamyl-L-cysteine + glycine = ADP + phosphate + glutathione. Glutathione is a tripeptide that functions as a reductant in many cellular reactions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 312
17861 273590 TIGR01381 E1_like_apg7 E1-like protein-activating enzyme Gsa7p/Apg7p. This model represents a family of eukaryotic proteins found in animals, plants, and yeasts, including Apg7p (YHR171W) from Saccharomyces cerevisiae and GSA7 from Pichia pastoris. Members are about 650 to 700 residues in length and include a central domain of about 150 residues shared with the ThiF/MoeB/HesA family of proteins. A low level of similarity to ubiquitin-activating enzyme E1 is described in a paper on peroxisome autophagy mediated by GSA7, and is the basis of the name ubiquitin activating enzyme E1-like protein. Members of the family appear to be involved in protein lipidation events analogous to ubiquitination and required for membrane fusion events during autophagy. 664
17862 273591 TIGR01382 PfpI intracellular protease, PfpI family. The member of this family from Pyrococcus horikoshii has been solved to 2 Angstrom resolution. It is an ATP-independent intracellular protease that crystallizes as a hexameric ring. Cys-101 is proposed as the active site residue in a catalytic triad with the adjacent His-102 and a Glu residue from an adjacent monomer. A member of this family from Bacillus subtilis, GSP18, has been shown to be expressed in response to several forms of stress. A role in the degradation of small peptides has been suggested. A closely related family consists of the thiamine biosynthesis protein ThiJ and its homologs. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 166
17863 213612 TIGR01383 not_thiJ DJ-1 family protein. This model represents the DJ-1 clade of the so-called ThiJ/PfpI family of proteins. PfpI, represented by a distinct model, is a putative intracellular cysteine protease. DJ-1 is described as an oncogene that acts cooperatively with H-Ras. Many members of the DJ-1 clade are annotated (apparently incorrectly) as ThiJ, a protein of thiamine biosynthesis. However, published reports of ThiJ activity and identification of a ThiJ/ThiD bifunctional protein describe an unrelated locus mapping near ThiM, rather than the DJ-1 homolog of E. coli. The ThiJ designation for this family may be spurious; the cited paper refers to a locus near thiD and thiM in E. coli, unlike the gene represented here. Current public annotation reflects ThiJ/ThiD bifunctional activity, apparently a property of ThiD and not of this locus. [Unknown function, General] 179
17864 130451 TIGR01384 TFS_arch transcription factor S, archaeal. This model describes archaeal transcription factor S, a protein related in size and sequence to certain eukaryotic RNA polymerase small subunits, and in sequence and function to the much larger eukaryotic transcription factor IIS (TFIIS). Although originally suggested to be a subunit of the archaeal RNA polymerase, it elutes separately from active polymerase in gel filtration experiments and acts, like TFIIS, as an induction factor for RNA cleavage by RNA polymerase. There has been an apparent duplication event in the Halobacteriaceae lineage (Haloarcula, Haloferax, Haloquadratum, Halobacterium and Natronomonas). There appears to be a separate duplication in Methanosphaera stadtmanae. [Transcription, Transcription factors] 104
17865 273592 TIGR01385 TFSII transcription elongation factor S-II. This model represents eukaryotic transcription elongation factor S-II. This protein allows stalled RNA transcription complexes to perform a cleavage of the nascent RNA and restart at the newly generated 3-prime end. 299
17866 273593 TIGR01386 cztS_silS_copS heavy metal sensor kinase. Members of this family contain a sensor histidine kinase domain (pfam00512) and a domain found in bacterial signal proteins (pfam00672). This group is separated phylogenetically from related proteins with similar architecture and contains a number of proteins associated with heavy metal resistance efflux systems for copper, silver, cadmium, and/or zinc. 457
17867 130454 TIGR01387 cztR_silR_copR heavy metal response regulator. Members of this family contain a response regulator receiver domain (pfam00072) and an associated transcriptional regulatory region (pfam00486). This group is separated phylogenetically from related proteins with similar architecture and contains a number of proteins associated with heavy metal resistance efflux systems for copper, silver, cadmium, and/or zinc. Most members are encoded by genes adjacent to genes encoding a member of the heavy metal sensor histidine kinase family (TIGRFAMs:TIGR01386), its partner in the two-component response regulator system. [Regulatory functions, DNA interactions] 218
17868 130455 TIGR01388 rnd ribonuclease D. This model describes ribonuclease D, a 3'-exonuclease shown to act on tRNA both in vitro and when overexpressed in vivo. Trusted members of this family are restricted to the Proteobacteria; Aquifex, Mycobacterial, and eukaryotic homologs are not full-length homologs. Ribonuclease D is not essential in E. coli and is deleterious when overexpressed. Its precise biological role is still unknown. [Transcription, RNA processing] 367
17869 273594 TIGR01389 recQ ATP-dependent DNA helicase RecQ. The ATP-dependent DNA helicase RecQ of E. coli is about 600 residues long. This model represents bacterial proteins with a high degree of similarity in domain architecture and in primary sequence to E. coli RecQ. The model excludes eukaryotic and archaeal proteins with RecQ-like regions, as well as more distantly related bacterial helicases related to RecQ. [DNA metabolism, DNA replication, recombination, and repair] 591
17870 130457 TIGR01390 CycNucDiestase 2',3'-cyclic-nucleotide 2'-phosphodiesterase. 2',3'-cyclic-nucleotide 2'-phosphodiesterase is a bifunctional enzyme localized to the periplasm of Gram-negative bacteria. 2',3'-cyclic-nucleotide 2'-phosphodiesters are intermediates formed during the hydrolysis of RNA by the ribonuclease I, which is also found in the periplasm, and other enzymes of the RNAse T2 family. Bacteria are unable to transport 2',3'-cyclic-nucleotides into the cytoplasm. 2',3'-cyclic-nucleotide 2'-phosphodiesterase contains 2 active sites which catalyze the reactions that convert the 2',3'-cyclic-nucleotide into a 3'-nucleotide, which is then converted into nucleoside and phosphate. Both final products can be transported into the cytoplasm. Thus, it has been suggested that 2',3'-cyclic-nucleotide 2'-phosphodiesterase has a 'scavenging' function. Experimental evidence indicates that 2',3'-cyclic-nucleotide 2'-phosphodiesterase enables Yersinia enterocolitica O:8 to grow on 2',3'-cAMP as a sole source of carbon and energy. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 626
17871 273595 TIGR01391 dnaG DNA primase, catalytic core. Members of this family are DNA primase, a ubiquitous bacterial protein. Most members of this family contain nearly two hundred additional residues C-terminal to the region represented here, but conservation between species is poor and the C-terminal region was not included in the seed alignment. This protein contains a CHC2 zinc finger (pfam01807) and a Toprim domain (pfam01751). [DNA metabolism, DNA replication, recombination, and repair] 415
17872 273596 TIGR01392 homoserO_Ac_trn homoserine O-acetyltransferase. This family describes homoserine-O-acetyltransferase, an enzyme of methionine biosynthesis. This model has been rebuilt to identify sequences more broadly, including a number of sequences suggested to be homoserine O-acetyltransferase based on proximity to other Met biosynthesis genes. [Amino acid biosynthesis, Aspartate family] 351
17873 130460 TIGR01393 lepA elongation factor 4. LepA (GUF1 in Saccharomyces), now called elongation factor 4, is a GTP-binding membrane protein related to EF-G and EF-Tu. Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including GUF1 of yeast) originated within the bacterial LepA family. The function is unknown. [Unknown function, General] 595
17874 273597 TIGR01394 TypA_BipA GTP-binding protein TypA/BipA. This bacterial (and Arabidopsis) protein, termed TypA or BipA, a GTP-binding protein, is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways, but the precise function is unknown. [Regulatory functions, Other, Cellular processes, Adaptations to atypical conditions, Protein synthesis, Translation factors] 594
17875 130462 TIGR01395 FlgC flagellar basal-body rod protein FlgC. This model represents FlgC, one of several components of bacterial flagella that share a domain described by pfam00460. FlgC is part of the basal body. [Cellular processes, Chemotaxis and motility] 135
17876 273598 TIGR01396 FlgB flagellar basal-body rod protein FlgB. This model represents FlgB, one of several components of bacterial flagella that share a domain described by pfam00460. FlgB is part of the basal body. [Cellular processes, Chemotaxis and motility] 131
17877 130464 TIGR01397 fliM_switch flagellar motor switch protein FliM. Members of this family are the flagellar motor switch protein FliM. The family excludes FliM homologs that lack an N-terminal region critical to interaction with phosphorylated CheY. One set lacking this N-terminal region is found in Rhizobium meliloti, in which the direction of flagellar rotation is not reversible (i.e. the FliM homolog does not act to reverse the motor direction), and in related species. Another is found in Buchnera, an obligate intracellular endosymbiont with genes for many of the components of the flagellar apparatus, but not, apparently, for flagellin itself. [Cellular processes, Chemotaxis and motility] 320
17878 273599 TIGR01398 FlhA flagellar biosynthesis protein FlhA. This model describes flagellar biosynthesis protein FlhA, one of a large number of genes associated with the biosynthesis of functional bacterial flagella. Homologs of many such proteins, including FlhA, function in type III protein secretion systems. A separate model describes InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc., all of which score below the noise cutoff for this model. [Cellular processes, Chemotaxis and motility] 678
17879 273600 TIGR01399 hrcV type III secretion protein, HrcV family. Members of this family are closely homologous to the flagellar biosynthesis protein FlhA (TIGR01398) and should all participate in type III secretion systems. Examples include InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc. Type III secretion systems resemble flagellar biogenesis systems, and may share the property of translocating special classes of peptides through the membrane. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 677
17880 130467 TIGR01400 fliR flagellar biosynthetic protein FliR. This model recognizes the FliR protein of bacterial flagellar biosynthesis. It distinguishes FliR from the homologous proteins of bacterial type III protein secretion systems, known by names such as YopT, EscT, and HrcT. [Cellular processes, Chemotaxis and motility] 245
17881 130468 TIGR01401 fliR_like_III type III secretion protein SpaR/YscT/HrcT. This model represents members of bacterial type III secretion systems homologous to the flagellar biosynthetic protein FliR (TIGRFAMs:TIGR01400). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 253
17882 130469 TIGR01402 fliQ flagellar biosynthetic protein FliQ. This model describes FliQ, a protein involved in biosynthesis of bacterial flagella. A related family of proteins, excluded from this model, participates in bacterial type III protein secretion systems. [Cellular processes, Chemotaxis and motility] 88
17883 130470 TIGR01403 fliQ_rel_III type III secretion protein, HrpO family. This model represents one of several families of proteins related to bacterial flagellar biosynthesis proteins and involved in bacterial type III protein secretion systems. This family is homologous to, but distinguished from, flagellar biosynthetic protein FliQ. This model may not identify all type III secretion system FliQ homologs. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 81
17884 130471 TIGR01404 FlhB_rel_III type III secretion protein, YscU/HrpY family. This model represents one of several families of proteins related to bacterial flagellar biosynthesis proteins and involved in bacterial type III protein secretion systems. This family is homologous to, but distinguished from, flagellar biosynthetic protein FlhB (TIGRFAMs model TIGR00328). This model may not identify all type III secretion system FlhB homologs. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 342
17885 273601 TIGR01405 polC_Gram_pos DNA polymerase III, alpha chain, Gram-positive type. This model describes a polypeptide chain of DNA polymerase III. Full-length homologs of this protein are restricted to the Gram-positive lineages, including the Mycoplasmas. This protein is designated alpha chain and given the gene symbol polC, but is not a full-length homolog of other polC genes. The N-terminal region of about 200 amino acids is rich in low-complexity sequence, poorly alignable, and not included in this model. [DNA metabolism, DNA replication, recombination, and repair] 1213
17886 130473 TIGR01406 dnaQ_proteo DNA polymerase III, epsilon subunit, Proteobacterial. This model represents DnaQ, the DNA polymerase III epsilon subunit, as found in most Proteobacteria. It consists largely of an exonuclease domain as described in pfam00929. In Gram-positive bacteria, closely related regions are found both in the Gram-positive type DNA polymerase III alpha subunit and as an additional N-terminal domain of a DinG-family helicase. Both are excluded from this model, as are smaller proteins, also outside the Proteobacteria, that are similar in size to the epsilon subunit but as different in sequence as are the epsilon-like regions found in Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair] 225
17887 273602 TIGR01407 dinG_rel DnaQ family exonuclease/DinG family helicase, putative. This model represents a family of proteins in Gram-positive bacteria. The N-terminal region of about 200 amino acids resembles the epsilon subunit of E. coli DNA polymerase III and the homologous region of the Gram-positive type DNA polymerase III alpha subunit. The epsilon subunit contains an exonuclease domain. The remainder of this protein family resembles a predicted ATP-dependent helicase, the DNA damage-inducible protein DinG of E. coli. [DNA metabolism, DNA replication, recombination, and repair] 850
17888 273603 TIGR01408 Ube1 ubiquitin-activating enzyme E1. This model represents the full length, over a thousand amino acids, of a multicopy family of eukaryotic proteins, many of which are designated ubiquitin-activating enzyme E1. Members have two copies of the ThiF family domain (pfam00899), a repeat found in ubiquitin-activating proteins (pfam02134), and other regions. 1006
17889 273604 TIGR01409 TAT_signal_seq Tat (twin-arginine translocation) pathway signal sequence. Proteins assembled with various cofactors or by means of cytosolic molecular chaperones are poor candidates for translocation across the bacterial inner membrane by the standard general secretory (Sec) pathway. This model describes a family of predicted long, non-Sec signal sequences and signal-anchor sequences (uncleaved signal sequences). All contain an absolutely conserved pair of arginine residues, in a motif approximated by (S/T)-R-R-X-F-L-K, followed by a membrane-spanning hydrophobic region. Members with small amino acid side chains at the -1 and -3 positions from the C-terminus of the model should be predicted to be cleaved as are Sec pathway signal sequences. Members are almost exclusively bacterial, although archaeal sequences are also found. A large fraction of the members of this family may have bound redox-active cofactors. [Protein fate, Protein and peptide secretion and trafficking] 29
17890 130477 TIGR01410 tatB twin arginine-targeting protein translocase TatB. This model represents the TatB protein of a Sec-independent system for transporting folded proteins, often with a bound redox cofactor, across the bacterial inner membrane. TatC is the multiple membrane spanning component. TatB, like the related TatA/E proteins, appears to span the membrane one time. The tat system recognizes proteins with an elongated signal sequence containing a conserved R-R in a motif approximated by RRxFLK N-terminal to the transmembrane helix. TIGRFAMs model TIGR01409 describes this twin-Arg signal sequence. A similar system, termed Delta-pH-dependent transport, operates on chloroplast-encoded proteins. [Protein fate, Protein and peptide secretion and trafficking] 80
17891 273605 TIGR01411 tatAE twin arginine-targeting protein translocase, TatA/E family. This model distinguishes TatA/E from the related TatB, but does not distinguish TatA from TatE. The Tat (twin-arginine translocation) system is a Sec-independent exporter for folded proteins, often with a redox cofactor already bound, across the bacterial inner membrane. Functionally equivalent systems are found in the chloroplast and in some archaeal species. The signal peptide recognized by the Tat system is modeled by TIGR01409. [Protein fate, Protein and peptide secretion and trafficking] 47
17892 273606 TIGR01412 tat_substr_1 Tat-translocated enzyme. This model represents a small family of proteins with a typical Tat (twin-arginine translocation) signal sequence, suggesting that the family is exported in a folded state, perhaps with a bound redox cofactor. Members of this family show homology to Dyp, a dye-decolorizing peroxidase from Geotrichum candidum that lacks any typical heme-binding site. 414
17893 273607 TIGR01413 Dyp_perox_fam Dyp-type peroxidase family. A defined member of this superfamily is Dyp, a dye-decolorizing peroxidase that lacks a typical heme-binding region. A distinct, uncharacterized branch (TIGR01412) of this superfamily has a typical twin-arginine dependent signal sequence characteristic of exported proteins with bound redox cofactors. 308
17894 273608 TIGR01414 autotrans_barl outer membrane autotransporter barrel domain. A number of Gram-negative bacterial proteins, mostly found in pathogens and associated with virulence, contain a conserved C-terminal domain that integrates into the outer membrane and enables the N-terminal region to be delivered across the membrane. This C-terminal autotransporter domain is about 400 amino acids in length and includes the aromatic amino acid-rich OMP signal, typically ending with a Phe or Trp residue, at the extreme C-terminus. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 431
17895 273609 TIGR01415 trpB_rel pyridoxal-phosphate dependent TrpB-like enzyme. This model represents a family of pyridoxal-phosphate dependent enzymes (pfam00291) closely related to the beta subunit of tryptophan synthase (TIGR00263). However, the only case in which a member of this family replaces a member of TIGR00263 is in Sulfolobus species which contain two sequences which hit this model, one of which is proximal to the alpha subunit. In every other case so far, either the species appears not to make tryptophan (there is no trp synthase alpha subunit), or a trp synthase beta subunit matching TIGR00263 is also found. [Unknown function, Enzymes of unknown specificity] 419
17896 273610 TIGR01416 Rieske_proteo ubiquinol-cytochrome c reductase, iron-sulfur subunit. This model represents the Proteobacterial and mitochondrial type of the Rieske [2Fe-2S] iron-sulfur as found in ubiquinol-cytochrome c reductase. The model excludes the Rieske iron-sulfur protein as found in the cytochrome b6-f complex of the Cyanobacteria and chloroplasts. Most members of this family have a recognizable twin-arginine translocation (tat) signal sequence (DeltaPh-dependent translocation in chloroplast) for transport across the membrane with the 2Fe-2S group already bound. These signal sequences include a motif resembling RRxFLK before the transmembrane helix. [Energy metabolism, Electron transport] 174
17897 273611 TIGR01417 PTS_I_fam phosphoenolpyruvate-protein phosphotransferase. This model recognizes a distinct clade of phosphoenolpyruvate (PEP)-dependent enzymes. Most members are known or deduced to function as the phosphoenolpyruvate-protein phosphotransferase (or enzyme I) of PTS sugar transport systems. However, some species with both a member of this family and a homolog of the phosphocarrier protein HPr lack a IIC component able to serve as a permease. An HPr homolog designated NPr has been implicated in the regulation of nitrogen assimilation, which demonstrates that not all phosphotransferase system components are associated directly with PTS transport. 565
17898 273612 TIGR01418 PEP_synth phosphoenolpyruvate synthase. Also called pyruvate,water dikinase and PEP synthase. The member from Methanococcus jannaschii contains a large intein. This enzyme generates phosphoenolpyruvate (PEP) from pyruvate, hydrolyzing ATP to AMP and releasing inorganic phosphate in the process. The enzyme shows extensive homology to other enzymes that use PEP as substrate or product. This enzyme may provide PEP for gluconeogenesis, for PTS-type carbohydrate transport systems, or for other processes. [Energy metabolism, Glycolysis/gluconeogenesis] 786
17899 162350 TIGR01419 nitro_reg_IIA PTS IIA-like nitrogen-regulatory protein PtsN. This model describes a full-length protein of about 160 residues closely related to the fructose-specific phosphotransferase (PTS) system IIA component. It is a regulatory protein found only in species with a phosphoenolpyruvate-protein phosphotransferase (enzyme I of PTS systems) and an HPr-like phosphocarrier protein, but not all species have a IIC-like permease. Members of this family are found in Proteobacteria, Chlamydia, and the spirochete Treponema pallidum. [Signal transduction, PTS] 145
17900 273613 TIGR01420 pilT_fam pilus retraction protein PilT. This model represents the PilT subfamily of proteins related to GspE, a protein involved in type II secretion (also called the General Secretion Pathway). PilT is an apparent cytosolic ATPase associated with type IV pilus systems. It is not required for pilin biogenesis, but is required for twitching motility and social gliding behaviors, shown in some species, powered by pilus retraction. Members of this family may be found in some species that lack type IV pili but have related structures for DNA uptake and natural transformation. [Cell envelope, Surface structures, Cellular processes, Chemotaxis and motility] 343
17901 273614 TIGR01421 gluta_reduc_1 glutathione-disulfide reductase, animal/bacterial. The tripeptide glutathione is an important reductant, e.g., for maintaining the cellular thiol/disulfide status and for protecting against reactive oxygen species such as hydrogen peroxide. Glutathione-disulfide reductase regenerates reduced glutathione from oxidized glutathione (glutathione disulfide) + NADPH. This model represents one of two closely related subfamilies of glutathione-disulfide reductase. Both are closely related to trypanothione reductase, and separate models are built so each of the three can describe proteins with conserved function. This model describes glutathione-disulfide reductases of animals, yeast, and a number of animal-resident bacteria. [Energy metabolism, Electron transport] 450
17902 188140 TIGR01422 phosphonatase phosphonoacetaldehyde hydrolase. This enzyme catalyzes the cleavage of the carbon phosphorous bond of a phosphonate. The mechanism depends on the substrate having a carbonyl one carbon away from the cleavage position. This enzyme is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases (pfam00702), and contains a modified version of the conserved catalytic motifs of that superfamily: the first motif is usually DxDx(T/V), here it is DxAxT, and in the third motif the normal conserved lysine is instead an arginine. Additionally, the enzyme contains a unique conserved catalytic lysine (B. cereus pos. 53) which is involved in the binding and activation of the substrate through the formation of a Schiff base. The substrate of this enzyme is the product of 2-aminoethylphosphonate (AEP) transaminase, phosphonoacetaldehyde. This degradation pathway for AEP may be related to its toxic properties which are utilized by microorganisms as a chemical warfare agent. [Central intermediary metabolism, Other] 253
17903 200098 TIGR01423 trypano_reduc trypanothione-disulfide reductase. Trypanothione, a glutathione-modified derivative of spermidine, is (in its reduced form) an important antioxidant found in trypanosomatids (Crithidia, Leishmania, Trypanosoma). This model describes trypanothione reductase, a possible antitrypanosomal drug target closely related to some forms of glutathione reductase. 486
17904 213618 TIGR01424 gluta_reduc_2 glutathione-disulfide reductase, plant. The tripeptide glutathione is an important reductant, e.g., for maintaining the cellular thiol/disulfide status and for protecting against reactive oxygen species such as hydrogen peroxide. Glutathione-disulfide reductase regenerates reduced glutathione from oxidized glutathione (glutathione disulfide) + NADPH. This model represents one of two closely related subfamilies of glutathione-disulfide reductase. Both are closely related to trypanothione reductase, and separate models are built so each of the three can describe proteins with conserved function. This model describes glutathione-disulfide reductases of plants and some bacteria, including cyanobacteria. [Energy metabolism, Electron transport] 446
17905 273615 TIGR01425 SRP54_euk signal recognition particle protein SRP54. This model represents examples from the eukaryotic cytosol of the signal recognition particle protein component, SRP54. This GTP-binding protein is a component of the eukaryotic signal recognition particle, along with several other protein subunits and a 7S RNA. Some species, including Arabidopsis, have several closely related forms. The extreme C-terminal region is glycine-rich and lower in complexity, poorly conserved between species, and excluded from this model. 428
17906 273616 TIGR01426 MGT glycosyltransferase, MGT family. This model describes the MGT (macroside glycosyltransferase) subfamily of the UDP-glucuronosyltransferase family. Members include a number of glucosyl transferases for macrolide antibiotic inactivation, but also include transferases of glucose-related sugars for macrolide antibiotic production. [Cellular processes, Toxin production and resistance] 392
17907 273617 TIGR01427 PTS_IIC_fructo PTS system, fructose subfamily, IIC component. This model represents the IIC component, or IIC region of a IIABC or IIBC polypeptide of a phosphotransferase system for carbohydrate transport. Members of this family belong to the fructose-specific subfamily of the broader family (pfam02378) of PTS IIC proteins. Members should be found as part of the same chain or in the same operon as fructose family IIA (TIGR00848) and IIB (TIGR00829) protein regions. A number of bacterial species have members in two different branches of this subfamily, suggesting some diversity in substrate specificity of its members. 346
17908 130495 TIGR01428 HAD_type_II 2-haloalkanoic acid dehalogenase, type II. Catalyzes the hydrolytic dehalogenation of small L-2-haloalkanoic acids to yield the corresponding D-2-hydroxyalkanoic acids. Belongs to the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases (pfam00702), class (subfamily) I. Note that the Type I HAD enzymes have not yet been fully characterized, but clearly utilize a substantially different catalytic mechanism and are thus unlikely to be related. 198
17909 273618 TIGR01429 AMP_deaminase AMP deaminase. This model describes AMP deaminase, a large, well-conserved eukaryotic protein involved in energy metabolism. Most members of the family have an additional, poorly alignable region of 150 amino acids or more N-terminal to the region included in the model. 611
17910 273619 TIGR01430 aden_deam adenosine deaminase. This family includes the experimentally verified adenosine deaminases of mammals and E. coli. Other members of this family are predicted also to be adenosine deaminase, an enzyme of nucleotide degradation. This family is distantly related to AMP deaminase. 324
17911 273620 TIGR01431 adm_rel adenosine deaminase-related growth factor. Members of this family have been described as secreted proteins with growth factor activity and regions of adenosine deaminase homology in insects, mollusks, and vertebrates. 479
17912 273621 TIGR01432 QOXA cytochrome aa3 quinol oxidase, subunit II. This enzyme catalyzes the oxidation of quinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. This subunit contains two transmembrane helices and a large external domain responsible for the binding and oxidation of quinol. QuoX is (presently) only found in gram positive bacteria of the Bacillus/Staphylococcus group. Like CyoA, the ubiquinol oxidase found in proteobacteria, the residues responsible for the ligation of Cu(a) and cytochrome c (found in the related cyt. c oxidases) are absent. Unlike CyoA, QoxA is in complex with a subunit I which contains cytochromes a similar to the cyt. c oxidases (as opposed to cytochromes b). [Energy metabolism, Electron transport] 226
17913 213620 TIGR01433 CyoA cytochrome o ubiquinol oxidase subunit II. This enzyme catalyzes the oxidation of ubiquinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. Subunit II is responsible for binding and oxidation of the ubiquinone substrate. This sequence is closely related to QoxA, which oxidizes quinol in gram positive bacteria but which is in complex with subunits which utilize cytochromes a in the reduction of molecular oxygen. Slightly more distantly related is subunit II of cytochrome c oxidase which uses cyt. c as the oxidant. [Energy metabolism, Electron transport] 226
17914 213621 TIGR01434 glu_cys_ligase glutamate--cysteine ligase. Alternate name: gamma-glutamylcysteine synthetase. This model represents glutamate--cysteine ligase, an enzyme in the biosynthesis of glutathione (GSH). GSH is one of several low molecular weight cysteine derivatives that can serve to protect against oxidative damage and participate in biosynthetic or detoxification reactions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 512
17915 273622 TIGR01435 glu_cys_lig_rel glutamate--cysteine ligase/glutathione synthase, Streptococcus agalactiae type. This model represents a bifunctional protein family for the biosynthesis of glutathione, and perhaps a range of related gamma-glutamyltripeptides of the form gamma-Glu-Cys-X(aa). The N-terminal region is similar to proteobacterial glutamate-cysteine ligase. The C-terminal region is homologous to cyanophycin synthetase of cyanobacteria and, more distantly, to D-alanine-D-alanine ligases. Members of this family are found in Listeria and Enterococcus, Gram-positive lineages in which glutathione is produced (see PUBMED:8606174), and in Pasteurella multocida, a Proteobacterium. In Clostridium acetobutylicum, adjacent genes include separate proteins rather than a fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 737
17916 130503 TIGR01436 glu_cys_lig_pln glutamate--cysteine ligase, plant type. This model represents one of two highly dissimilar forms of glutamate--cysteine ligase (gamma-glutamylcysteine synthetase), an enzyme of glutathione biosynthesis. The other type is modeled by TIGR01434. This type is found in plants (with a probable transit peptide), root nodule and other bacteria, but not E. coli and closely related species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 446
17917 273623 TIGR01437 selA_rel uncharacterized pyridoxal phosphate-dependent enzyme. This model describes a protein related to a number of pyridoxal phosphate-dependent enzymes, and in particular to selenocysteine synthase (SelA), which converts Ser to selenocysteine on its tRNA. While resembling SelA, this protein is found only in species that have a better candidate SelA or else lack the other genes (selB, selC, and selD) required for selenocysteine incorporation. [Unknown function, Enzymes of unknown specificity] 363
17918 273624 TIGR01438 TGR thioredoxin and glutathione reductase selenoprotein. This homodimeric, FAD-containing member of the pyridine nucleotide disulfide oxidoreductase family contains a C-terminal motif Cys-SeCys-Gly, where SeCys is selenocysteine encoded by TGA (in some sequence reports interpreted as a stop codon). In some members of this subfamily, Cys-SeCys-Gly is replaced by Cys-Cys-Gly. The reach of the selenium atom at the C-term arm of the protein is proposed to allow broad substrate specificity. 484
17919 273625 TIGR01439 lp_hng_hel_AbrB looped-hinge helix DNA binding domain, AbrB family. This DNA-binding domain family includes AbrB, a transition state regulator in Bacillus subtilis, whose DNA-binding domain structure in solution was determined by NMR. The domain binds DNA as a dimer in what is termed a looped-hinge helix fold. Some members of the family have two copies of the domain in tandem. The domain is found usually at the N-terminus of a small protein. This model excludes members of family TIGR02609. [Regulatory functions, DNA interactions] 43
17920 130507 TIGR01440 TIGR01440 TIGR01440 family protein. Members of this family are uncharacterized proteins of about 180 amino acids from the Bacillus/Clostridium group of Gram-positive bacteria, found in no more than one copy per genome. [Hypothetical proteins, Conserved] 172
17921 273626 TIGR01441 GPR GPR endopeptidase. This model describes a tetrameric protease that makes the rate-limiting first cut in the small, acid-soluble spore proteins (SASP) of Bacillus subtilis and related species. The enzyme lacks clear homology to other known proteases. It processes its own amino end before becoming active to cleave SASPs. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination] 358
17922 273627 TIGR01442 SASP_gamma small, acid-soluble spore protein, gamma-type. This model represents a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. Most members of the family have two copies of the spore protease (GPR) cleavage motif, typically EFASE in this family, separating three low-complexity repeats. [Cellular processes, Sporulation and germination] 85
17923 213622 TIGR01443 intein_Cterm intein C-terminal splicing region. This model represents the well-conserved C-terminal region of a large number of inteins. It is based on iterated search results, starting with a curated collection of intein N-terminal splicing regions from InBase, the New England Biolabs Intein Database, as presented on its web site. Inteins are regions encoded within proteins from which they remove themselves after translation in a self-splicing reaction, leaving the remainder of the coding region to form a complete, functional protein as if the intein were never there. Proteins with inteins include RecA, GyrA, ribonucleotide reductase, and others. Most inteins have a central region with putative endonuclease activity. 21
17924 273628 TIGR01444 fkbM_fam methyltransferase, FkbM family. Members of this family are characterized by two well-conserved short regions separated by a region variable in both sequence and length. The first of the two regions is found in a large number of proteins outside this subfamily, a number of which have been characterized as methyltransferases. One member of the present family, FkbM, was shown to be required for a specific methylation in the biosynthesis of the immunosuppressant FK506 in Streptomyces strain MA6548. 143
17925 273629 TIGR01445 intein_Nterm intein N-terminal splicing region. This model is based on iterated search results, starting with a curated collection of intein N-terminal splicing regions from InBase, the New England Biolabs Intein Database, as presented on its web site. It is designed to recognize inteins but not the related region of the sonic hedgehog protein. 81
17926 273630 TIGR01446 DnaD_dom DnaD and phage-associated domain. This model represents the conserved domain of DnaD, part of Bacillus subtilis replication restart primosome, and of a number of phage-associated proteins. Members, both chromosomal and phage-associated, are found in the Bacillus/Clostridium group of Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 73
17927 273631 TIGR01447 recD exodeoxyribonuclease V, alpha subunit. This family describes the exodeoxyribonuclease V alpha subunit, RecD. RecD is part of a RecBCD complex. A related family in the Gram-positive bacteria separates in a phylogenetic tree, has an additional N-terminal extension of about 200 residues, and is not supported as a member of a RecBCD complex by neighboring genes. The related family is consequently described by a different model. [DNA metabolism, DNA replication, recombination, and repair] 582
17928 273632 TIGR01448 recD_rel helicase, putative, RecD/TraA family. This model describes a family similar to RecD, the exodeoxyribonuclease V alpha chain of TIGR01447. Members of this family, however, are not found in a context of RecB and RecC and are longer by about 200 amino acids at the amino end. Chlamydia muridarum has both a member of this family and a RecD. [Unknown function, Enzymes of unknown specificity] 720
17929 130516 TIGR01449 PGP_bact 2-phosphoglycolate phosphatase, prokaryotic. PGP is an essential enzyme in the glycolate salvage pathway in higher organisms (photorespiration in plants). Phosphoglycolate results from the oxidase activity of RubisCO in the Calvin cycle when concentrations of carbon dioxide are low relative to oxygen. In Ralstonia (Alcaligenes) eutropha and Rhodobacter sphaeroides, the PGP gene (CbbZ) is located on an operon along with other Calvin cycle enzymes including RubisCO. The only other pertinent experimental evidence concerns the gene from E. coli. The in vitro activity of the Ralstonia and Escherichia enzymes was determined with crude cell extracts of strains containing PGP on expression plasmids and compared to controls. In E. coli, however, there does not appear to be a functional Calvin cycle (RubisCO is absent), although the E. coli PGP gene (gph) is on the same operon (dam) with ribulose-5-phosphate-3-epimerase (rpe), a gene in the pentose-phosphate pathway (along with other, unrelated genes). The E. coli enzyme is not expressed under normal laboratory conditions; the pathway to which it belongs has not been determined. In fact, the possibility exists, although unlikely, that the E. coli enzyme and others within this equivalog have as their physiological substrate another, closely related molecule. The other seed chosen for this model, from Xylella fastidiosa has no experimental evidence, but is a plant pathogen and thus may obtain phosphoglycolate from its host. This model has been restricted to encompass only proteobacteria as no related PGP has been verified outside of this clade. Sequences from Aquifex aeolicus and Treponema pallidum fall between the trusted and noise cutoffs. Just below the noise cutoff is a gene which is part of the operon for the biosynthesis of the blue pigment, indigoidine, from Erwinia (Pectobacterium) chrysanthemi, a plant pathogen. It does not seem likely, considering the proposed biosynthetic mechanism, that the dephosphorylation of phosphoglycolate or a closely related compound is required. Possibly, this gene is fortuitously located in this operon, or has an indirect relationship to the necessity for the biosynthesis of this compound. Sequences from 11 species have been annotated as PGP or putative PGP but fall below the noise cutoff. None of these have experimental validation. This enzyme is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolase enzymes (pfam00702). [Energy metabolism, Sugars] 213
17930 273633 TIGR01450 recC exodeoxyribonuclease V, gamma subunit. This model describes the gamma subunit of exodeoxyribonuclease V. Species containing this protein should also have the alpha (TIGR01447) and beta (TIGR00609) subunits. Candidates from Borrelia and from the Chlamydias differ dramatically and score between trusted and noise cutoffs. [DNA metabolism, DNA replication, recombination, and repair] 1060
17931 273634 TIGR01451 B_ant_repeat conserved repeat domain. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins. 53
17932 273635 TIGR01452 PGP_euk phosphoglycolate/pyridoxal phosphate phosphatase family. PGP is an essential enzyme in the glycolate salvage pathway in higher organisms (photorespiration in plants). Phosphoglycolate results from the oxidase activity of RubisCO in the Calvin cycle when concentrations of carbon dioxide are low relative to oxygen. In mammals, PGP is found in many tissues, notably in red blood cells where P-glycolate is an important activator of the hydrolysis of 2,3-bisphosphoglycerate, a major modifier of the oxygen affinity of hemoglobin. Pyridoxal phosphate (PLP, Vitamin B6) phosphatase is involved in the degradation of PLP in mammals and is widely distributed in human tissues including erythrocytes. The enzymes described here are members of the Haloacid dehalogenase superfamily of hydrolase enzymes (pfam00702). Unlike the bacterial PGP equivalog (TIGR01449), which is a member of class (subfamily) I, these enzymes are members of class (subfamily) II. These two families have almost certainly arisen from convergent evolution (although these two ancestors may themselves have diverged from a more distant HAD superfamily progenitor). The primary seed sequence for this model comes from Chlamydomonas reinhardtii, a photosynthetic alga. The enzyme has been purified and characterized and these data are fully consistent with the assignment of function as a PGPase involved in photorespiration. The second seed, from Homo sapiens chromosome 22 has been characterized as a pyridoxal phosphatase. Biochemical characterization of partially purified PGP's from various tissues including red blood cells have been performed while one gene for PGP has been localized to chromosome 16p13.3. The sequence used here maps to chromosome 22. There is indeed a related gene on chromosome 16 (and it is expressed, since EST's are found) which shows 46% identity. The chromosome 16 gene is not in evidence in nraa but translated from the genomic sequence. The third seed, from C. elegans, is only supported by sequence similarity. This model is limited to eukaryotic species including S. pombe and S. cerevisiae, although several archaea score between the trusted and noise cutoffs. This model is closely related to a family of bacterial sequences including the E. coli NagD and B. subtilis AraL genes which are characterized by the ability to hydrolyze para-nitrophenylphosphate (pNPPases or NPPases). The Chlamydomonas PGPase d 279
17933 273636 TIGR01453 grpIintron_endo group I intron endonuclease. This model represents one subfamily of endonucleases containing the endo/excinuclease amino terminal domain, pfam01541 at its amino end. A distinct subfamily includes excinuclease abc subunit c (uvrC). Members of pfam01541 are often termed GIY-YIG endonucleases after conserved motifs near the amino end. This subfamily in this model is found in open reading frames of group I introns in both phage and mitochondria. The closely related endonucleases of phage T4: segA, segB, segC, segD and segE, score below the trusted cutoff for the family. 214
17934 130521 TIGR01454 AHBA_synth_RP 3-amino-5-hydroxybenzoic acid synthesis related protein. The enzymes in this equivalog are all located in the operons for the biosynthesis of 3-amino-5-hydroxybenzoic acid (AHBA), which is a precursor of several antibiotics including ansatrienin, naphthomycin, rifamycin and mitomycin. The role that this enzyme plays in this biosynthesis has not been elucidated. This enzyme is a member of the Haloacid dehalogenase superfamily (pfam00702) of aspartate-nucleophile hydrolases. This enzyme is closely related to phosphoglycolate phosphatase (TIGR01449), but it is unclear what purpose a PGPase or PGPase-like activity would serve in these biosyntheses. This model is limited to the Gram positive Actinobacteria. The most closely related enzyme below the noise cutoff is IndB which is involved in the biosynthesis of Indigoidine in Pectobacterium (Erwinia) chrysanthemi, a gamma proteobacterium. This enzyme is similarly related to PGP. In this case, too, it is unclear what role would be played by a PGPase activity. 205
17935 130522 TIGR01455 glmM phosphoglucosamine mutase. This model describes GlmM, phosphoglucosamine mutase, also designated MrsA and YhbF in E. coli, UreC in Helicobacter pylori, and femR315 or FemD in Staphylococcus aureus. It converts glucosamine-6-phosphate to glucosamine-1-phosphate as part of the pathway toward UDP-N-acetylglucosamine for peptidoglycan and lipopolysaccharides. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars] 443
17936 200106 TIGR01456 CECR5 HAD-superfamily class IIA hydrolase, TIGR01456, CECR5. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all eukaryotes. One sequence (GP|13344995) is called "Cat Eye Syndrome critical region protein 5" (CECR5). This gene has been cloned from a pericentromere region of human chromosome 22 believed to be the location of the gene or genes responsible for Cat Eye Syndrome. This is one of a number of candidate genes. The Schizosaccharomyces pombe sequence (EGAD|138276) is annotated as "phosphatidyl synthase," however this is due entirely to a C-terminal region of the protein (outside the region of similarity of this model) which is highly homologous to a family of CDP-alcohol phosphatidyltransferases. (Thus, the annotation of GP|4226073 from C. elegans as similar to phosphatidyl synthase, is a mistake as this gene does not contain the C-terminal portion). The physical connection of the phosphatidyl synthase and the HAD-superfamily hydrolase domain in S. pombe may, however, be an important clue to the substrate for the hydrolases in this equivalog. 321
17937 130524 TIGR01457 HAD-SF-IIA-hyp2 HAD-superfamily subfamily IIA hydrolase, TIGR01457. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all gram positive (low-GC) bacteria. Sequences found in this model are annotated variously as related to NagD or 4-nitrophenyl phosphatase, and this hypothetical equivalog, of all of those within the Class IIA subfamily, is most closely related to the E. coli NagD enzyme and the PGP_euk equivalog (TIGR01452). However, there is presently no evidence that this hypothetical equivalog has the same function as either of those. [Unknown function, Enzymes of unknown specificity] 249
17938 162372 TIGR01458 HAD-SF-IIA-hyp3 HAD-superfamily subfamily IIA hydrolase, TIGR01458. This hypothetical equivalog is a member of the IIA subfamily (TIGR01460) of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. One sequence (GP|10716807) has been annotated as a "phospholysine phosphohistidine inorganic pyrophosphatase," probably in reference to studies on similarly described (but unsequenced) enzymes from bovine and rat tissues. However, the supporting information for this annotation has never been published. [Unknown function, Enzymes of unknown specificity] 257
17939 130526 TIGR01459 HAD-SF-IIA-hyp4 HAD-superfamily class IIA hydrolase, TIGR01459. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all gram negative and primarily alpha proteobacteria. Only one sequence has been annotated as other than "hypothetical." That one, from Brucella, is annotated as related to NagD, but only by sequence similarity and should be treated with some skepticism. (See comments for Class IIA subfamily model) 242
17940 273637 TIGR01460 HAD-SF-IIA Haloacid Dehalogenase Superfamily Class (subfamily) IIA. This model represents one structural subclass of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The classes are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Class I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Class II consists of sequences in which the capping domain is found between the second and third motifs. Class III sequences have no capping domain in either of these positions. The Class IIA capping domain is predicted by PSI-PRED to consist of a mixed alpha-beta fold with the basic pattern: Helix-Helix-Helix-Sheet-Helix-Loop-Sheet-Helix-Sheet-Helix. Presently, this subfamily encompasses a single equivalog model (TIGR01452) for the eukaryotic phosphoglycolate phosphatase, as well as four hypothetical equivalogs covering closely related sequences (TIGR01456 and TIGR01458 in eukaryotes, TIGR01457 in gram positive bacteria and TIGR01459 in gram negative bacteria). The Escherichia coli NagD gene and the Bacillus subtilis AraL gene are members of this subfamily but are not members of any of the presently defined equivalogs within it. NagD is part of the NAG operon responsible for N-acetylglucosamine metabolism. The function of this gene is unknown. Genes from several organisms have been annotated as NagD, or NagD-like. However, without data on the presence of other members of this pathway, (such as in the case of Yersinia pestis) these assignments should not be given great weight. The AraL gene is similar: it is part of the L-arabinose operon, but the function is unknown. A gene from Halobacterium has been annotated as AraL, but no other Ara operon genes have been annotated. Many of the genes in this subfamily have been annotated as "pNPPase" "4-nitrophenyl phosphatase" or "NPPase". These all refer to the same activity versus a common lab test compound used to determine phosphatase activity. There is no evidence that this activity is physiologically relevant. [Unknown function, Enzymes of unknown specificity] 236
17941 130528 TIGR01461 greB transcription elongation factor GreB. The GreA and GreB transcription elongation factors enable the continuation of RNA transcription past template-encoded arresting sites. Among the Proteobacteria, distinct clades of GreA and GreB are found. GreB differs functionally in that it releases larger oligonucleotides. This model describes proteobacterial GreB. [Transcription, Transcription factors] 156
17942 273638 TIGR01462 greA transcription elongation factor GreA. The GreA and GreB transcription elongation factors enable the continuation of RNA transcription past template-encoded arresting sites. Among the Proteobacteria, distinct clades of GreA and GreB are found. GreA differs functionally in that it releases smaller oligonucleotides. Because members of the family outside the Proteobacteria resemble GreA more closely than GreB, the GreB clade (TIGR01461) forms a plausible outgroup and the remainder of the GreA/B family, included in this model, is designated GreA. In the Chlamydias and some spirochetes, the region described by this model is found as the C-terminal region of a much larger protein. [Transcription, Transcription factors] 151
17943 273639 TIGR01463 mtaA_cmuA methyltransferase, MtaA/CmuA family. This subfamily is closely related to, yet is distinct from, uroporphyrinogen decarboxylase (EC 4.1.1.37). It includes two isozymes from Methanosarcina barkeri of methylcobalamin--coenzyme M methyltransferase. It also includes a chloromethane utilization protein, CmuA, which transfers the methyl group of chloromethane to a corrinoid protein. 336
17944 273640 TIGR01464 hemE uroporphyrinogen decarboxylase. This model represents uroporphyrinogen decarboxylase (HemE), which converts uroporphyrinogen III to coproporphyrinogen III. This step takes the pathway toward protoporphyrin IX, a common precursor of both heme and chlorophyll, rather than toward precorrin 2 and its products. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 338
17945 200107 TIGR01465 cobM_cbiF precorrin-4 C11-methyltransferase. This model represents precorrin-4 C11-methyltransferase, one of two methyltransferases commonly referred to as precorrin-3 methylase (the other is precorrin-3B C17-methyltransferase, EC 2.1.1.131). This enzyme participates in the pathway toward the biosynthesis of cobalamin and related products. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 247
17946 273641 TIGR01466 cobJ_cbiH precorrin-3B C17-methyltransferase. This model represents precorrin-3B C17-methyltransferase, one of two methyltransferases commonly referred to as precorrin-3 methylase (the other is precorrin-4 C11-methyltransferase, EC 2.1.1.133). This enzyme participates in the pathway toward the biosynthesis of cobalamin and related products. Members of this family may appear as fusion proteins with other enzymes of cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 239
17947 273642 TIGR01467 cobI_cbiL precorrin-2 C(20)-methyltransferase. This model represents precorrin-2 C(20)-methyltransferase, one of several closely related S-adenosylmethionine-dependent methyltransferases involved in cobalamin (vitamin B12) biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 230
17948 273643 TIGR01469 cobA_cysG_Cterm uroporphyrin-III C-methyltransferase. This model represents enzymes, or enzyme domains, with uroporphyrin-III C-methyltransferase activity. This enzyme catalyzes the first step committed to the biosynthesis of either siroheme or cobalamin (vitamin B12) rather than protoheme (heme). Cobalamin contains cobalt while siroheme contains iron. Siroheme is a cofactor for nitrite and sulfite reductases and therefore plays a role in cysteine biosynthesis; many members of this family are CysG, siroheme synthase, with an additional N-terminal domain and with additional oxidation and iron insertion activities. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 236
17949 130536 TIGR01470 cysG_Nterm siroheme synthase, N-terminal domain. This model represents a subfamily of CysG N-terminal region-related sequences. All sequences in the seed alignment for this model are N-terminal regions of known or predicted siroheme synthases. The C-terminal region of each is uroporphyrin-III C-methyltransferase (EC 2.1.1.107), which catalyzes the first step committed to the biosynthesis of either siroheme or cobalamin (vitamin B12) rather than protoheme (heme). The region represented by this model completes the process of oxidation and iron insertion to yield siroheme. Siroheme is a cofactor for nitrite and sulfite reductases, so siroheme synthase is CysG of cysteine biosynthesis in some organisms. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 205
17950 273644 TIGR01472 gmd GDP-mannose 4,6-dehydratase. Alternate name: GDP-D-mannose dehydratase. This enzyme converts GDP-mannose to GDP-4-dehydro-6-deoxy-D-mannose, the first of three steps for the conversion of GDP-mannose to GDP-fucose in animals, plants, and bacteria. In bacteria, GDP-L-fucose acts as a precursor of surface antigens such as the extracellular polysaccharide colanic acid of E. coli. Excluded from this model are members of the clade that score poorly because of highly derived (phylogenetically long-branch) sequences, e.g. Aneurinibacillus thermoaerophilus Gmd, described as a bifunctional GDP-mannose 4,6-dehydratase/GDP-6-deoxy-D-lyxo-4-hexulose reductase (PUBMED:11096116). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 343
17951 273645 TIGR01473 cyoE_ctaB protoheme IX farnesyltransferase. This model describes protoheme IX farnesyltransferase, also called heme O synthase, an enzyme that creates an intermediate in the biosynthesis of heme A. Prior to the description of its enzymatic function, this protein was often called a cytochrome o ubiquinol oxidase assembly factor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 280
17952 130539 TIGR01474 ubiA_proteo 4-hydroxybenzoate polyprenyl transferase, proteobacterial. This model represents a family of integral membrane proteins that condenses para-hydroxybenzoate with any of several polyprenyldiphosphates. Heterologous expression studies suggest that, for many but not all members, the activity seen (e.g. octaprenyltransferase in E. coli) reflects available host isoprenyl pools rather than enzyme specificity. A fairly deep split by both clustering (UPGMA) and phylogenetics (NJ tree) separates this group (mostly Proteobacterial and mitochondrial), with several characterized members, from another group (mostly archaeal and Gram-positive bacterial) lacking characterized members. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 281
17953 273646 TIGR01475 ubiA_other putative 4-hydroxybenzoate polyprenyltransferase. A fairly deep split separates this polyprenyltransferase subfamily from the set of mitochondrial and proteobacterial 4-hydroxybenzoate polyprenyltransferases, described in TIGR01474. Protoheme IX farnesyltransferase (heme O synthase) (TIGR01473) is more distantly related. Because no species appears to have both this protein and a member of TIGR01474, it is likely that this model represents 4-hydroxybenzoate polyprenyltransferase, a critical enzyme of ubiquinone biosynthesis, in the Archaea, Gram-positive bacteria, Aquifex aeolicus, the Chlamydias, etc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 282
17954 130541 TIGR01476 chlor_syn_BchG bacteriochlorophyll/chlorophyll synthetase. This model describes a subfamily of a large family of polyprenyltransferases (pfam01040) that also includes 4-hydroxybenzoate octaprenyltransferase and protoheme IX farnesyltransferase (heme O synthase). Members of this family are found exclusively in photosynthetic organisms, including a single copy in Arabidopsis thaliana. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 283
17955 273647 TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits. 353
17956 130543 TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits. 295
17957 273648 TIGR01479 GMP_PMI mannose-1-phosphate guanylyltransferase/mannose-6-phosphate isomerase. This enzyme is known to be bifunctional, as both mannose-6-phosphate isomerase (EC 5.3.1.8) (PMI) and mannose-1-phosphate guanylyltransferase (EC 2.7.7.22) in Pseudomonas aeruginosa, Xanthomonas campestris, and Gluconacetobacter xylinus. The literature on the enzyme from E. coli attributes mannose-6-phosphate isomerase activity to an adjacent gene, but the present sequence has not been shown to lack the activity. The PMI domain is C-terminal. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 468
17958 273649 TIGR01480 copper_res_A copper-resistance protein, CopA family. This model represents the CopA copper resistance protein family. CopA is related to laccase (benzenediol:oxygen oxidoreductase) and L-ascorbate oxidase, both copper-containing enzymes. Most members have a typical TAT (twin-arginine translocation) signal sequence with an Arg-Arg pair. Twin-arginine translocation is observed for a large number of periplasmic proteins that cross the inner membrane with metal-containing cofactors already bound. The combination of copper-binding sites and TAT translocation motif suggests a mechanism of resistance by packaging and export. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds] 587
17959 130546 TIGR01481 ccpA catabolite control protein A. Catabolite control protein A is a LacI family global transcriptional regulator found in Gram-positive bacteria. CcpA is involved in repressing carbohydrate utilization genes [ex: alpha-amylase (amyE), acetyl-coenzyme A synthase (acsA)] and in activating genes involved in transporting excess carbon from the cell [ex: acetate kinase (ackA), alpha-acetolactate synthase (alsS)]. Additionally, disruption of CcpA in Bacillus megaterium, Staphylococcus xylosus, Lactobacillus casei and Lactobacillus pentosus also decreases growth rate, which suggests CcpA is involved in the regulation of other metabolic pathways. [Regulatory functions, DNA interactions] 329
17960 273650 TIGR01482 SPP-subfamily sucrose-phosphate phosphatase subfamily. This model includes both the members of the SPP equivalog model (TIGR01485), encompassing plants and cyanobacteria, as well as those archaeal sequences which are the closest relatives (TIGR01487). It remains to be shown whether these archaeal sequences catalyze the same reaction as SPP. 225
17961 273651 TIGR01484 HAD-SF-IIB HAD-superfamily hydrolase, subfamily IIB. This subfamily falls within the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The Class II subfamilies are characterized by a domain that is located between the second and third conserved catalytic motifs of the superfamily domain. The IIB subfamily is distinguished from the IIA subfamily (TIGR01460) by homology and the predicted secondary structure of this domain by PSI-PRED. The IIB subfamily's Class II domain has the following predicted structure: Helix-Sheet-Sheet-(Helix or Sheet)-Helix-Sheet-(variable)-Helix-Sheet-Sheet. The IIB subfamily consists of Trehalose-6-phosphatase (TIGR00685), plant and cyanobacterial Sucrose-phosphatase and a closely related group of bacterial and archaeal sequences, eukaryotic phosphomannomutase (pfam03332), a large subfamily ("Cof-like hydrolases", TIGR00099) containing many closely related bacterial sequences, a hypothetical equivalog containing the E. coli YedP protein, as well as two small clusters containing OMNI|TC0379 and OMNI|SA2196 whose relationship to the other groups is unclear. [Unknown function, Enzymes of unknown specificity] 207
17962 130549 TIGR01485 SPP_plant-cyano sucrose-6F-phosphate phosphohydrolase. This model describes the sucrose phosphate phosphohydrolase from plants and cyanobacteria (SPP). SPP is a member of the Class IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. SPP catalyzes the final step in the biosynthesis of sucrose, a critically important molecule for plants. Sucrose phosphate synthase (SPS), the prior step in the biosynthesis of sucrose, contains a domain which exhibits considerable similarity to SPP albeit without conservation of the catalytic residues. The catalytic machinery of the synthase resides in another domain. It seems likely that the phosphatase-like domain is involved in substrate binding, possibly binding both substrates in a "product-like" orientation prior to ligation by the synthase catalytic domain. 249
17963 130550 TIGR01486 HAD-SF-IIB-MPGP mannosyl-3-phosphoglycerate phosphatase family. This small group of proteins is a member of the IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. Several members of this family from thermophiles (and from Dehalococcoides ethenogenes) are now known to act as mannosyl-3-phosphoglycerate (MPG) phosphatase. In these cases, the enzyme acts after MPG synthase to make the compatible solute mannosylglycerate. We propose that other mesophilic members of this family do not act as mannosyl-3-phosphoglycerate phosphatase. A member of this family is found in Escherichia coli, which appears to lack MPG synthase. Mannosylglycerate is imported in E. coli by phosphoenolpyruvate-dependent transporter (), but it appears the phosphorylation is not on the glycerate moiety, that the phosphorylated import is degraded by an alpha-mannosidase from an adjacent gene, and that E. coli would have no pathway to obtain MPG. 256
17964 273652 TIGR01487 Pglycolate_arch phosphoglycolate phosphatase, TA0175-type. This group of Archaeal sequences, now known to be phosphoglycolate phosphatases, is most closely related to the sucrose-phosphate phosphatases from plants and cyanobacteria (TIGR01485). Together, these two models comprise a subfamily model (TIGR01482). TIGR01482, in turn, is a member of the IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. 215
17965 273653 TIGR01488 HAD-SF-IB Haloacid Dehalogenase superfamily, subfamily IB, phosphoserine phosphatase-like. This model represents a subfamily of the Haloacid Dehalogenase superfamily of aspartate-nucleophile hydrolases. Subfamily IA, B, C and D are distinguished from the rest of the superfamily by the presence of a variable domain between the first and second conserved catalytic motifs. In subfamilies IA and IB, this domain consists of an alpha-helical bundle. It was necessary to model these two subfamilies separately, breaking them at an apparent phylogenetic bifurcation, so that the resulting model(s) are not so broadly defined that members of subfamily III (which lack the variable domain) are included. Subfamily IB includes the enzyme phosphoserine phosphatase (TIGR00338) as well as three hypothetical equivalogs. Many members of these hypothetical equivalogs have been annotated as PSPase-like or PSPase-family proteins. In particular, the hypothetical equivalog which appears to be most closely related to PSPase contains only Archaea (while TIGR00338 contains only eukaryotes and bacteria) of which some are annotated as PSPases. Although this is a reasonable conjecture, none of these sequences has sufficient evidence for this assignment. If such should be found, this model should be retired while the PSPase model should be broadened to include these sequences. [Unknown function, Enzymes of unknown specificity] 177
17966 213629 TIGR01489 DKMTPPase-SF 2,3-diketo-5-methylthio-1-phosphopentane phosphatase. This phosphatase is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. With the exception of OMNI|NTL01BS01361 from B. subtilis and GP|15024582 from Clostridium acetobutylicum, the members of this group are all eukaryotic, spanning metazoa, plants and fungi. The B. subtilis gene (YkrX, renamed MtnX) is part of an operon for the conversion of methylthioribose (MTR) to methionine. It works with the enolase MtnW, a RuBisCO homolog. The combination of MtnW and MtnX achieves the same overall reaction as the enolase-phosphatase MtnC. The function of MtnX was shown by Ashida, et al. (2003) to be 2,3-diketo-5-methylthio-1-phosphopentane phosphatase, rather than 2,3-diketo-5-methylthio-1-phosphopentane phosphatase as proposed earlier. See the Genome Property for methionine salvage for more details. In eukaryotes, methionine salvage from methylthioadenosine also occurs. It seems reasonable that members of this family in eukaryotes fulfill a similar role as in Bacillus. A more specific, equivalog-level model is TIGR03333. Note that SP|P53981 from S. cerevisiae, a member of this family, is annotated as a "probable membrane protein" due to a predicted transmembrane helix. The region in question contains the second of the three conserved HAD superfamily catalytic motifs and thus, considering the fold of the HAD catalytic domain, is unlikely to be a transmembrane region in fact. [Central intermediary metabolism, Other] 188
17967 273654 TIGR01490 HAD-SF-IB-hyp1 HAD-superfamily subfamily IB hydrolase, TIGR01490. This hypothetical equivalog is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The sequences modelled here are all bacterial. The IB subfamily includes the enzyme phosphoserine phosphatase (TIGR00338). Due to this relationship, several of these sequences have been annotated as "phosphoserine phosphatase related proteins," or "Phosphoserine phosphatase-family enzymes." There is presently no evidence that any of the enzymes in this model possess PSPase activity. OMNI|NTL01ML1250 is annotated as a "possible transferase," however this is due to the C-terminal domain found on this sequence which is homologous to a group of glycerol-phosphate acyltransferases (between trusted and noise to TIGR00530). A subset of these sequences including OMNI|CC1962, the Caulobacter crescentus CicA protein cluster together and may represent a separate equivalog. [Unknown function, Enzymes of unknown specificity] 202
17968 273655 TIGR01491 HAD-SF-IB-PSPlk HAD-superfamily, subfamily-IB PSPase-like hydrolase, archaeal. This hypothetical equivalog is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The sequences modelled here are all from archaeal species. The phylogenetically closest group of sequences to these are phosphoserine phosphatases (TIGR00338). There are no known archaeal phosphoserine phosphatases, and no archaea fall within TIGR00338. It is likely, then, that this model represents the archaeal branch of the PSPase equivalog. 201
17969 130556 TIGR01492 CPW_WPC Plasmodium falciparum CPW-WPC domain. This model represents a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. 62
17970 130557 TIGR01493 HAD-SF-IA-v2 Haloacid dehalogenase superfamily, subfamily IA, variant 2 with 3rd motif like haloacid dehalogenase. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called 'capping domain', or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)D, (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 2 (this model) is distinctive of the type II haloacid dehalogenases, and nearly all of the sequences are also part of the HAD, type II equivalog model (TIGR01428). These three variant models were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model. 175
17971 273656 TIGR01494 ATPase_P-type ATPase, P-type (transporting), HAD superfamily, subfamily IC. The P-type ATPases are a large family of trans-membrane transporters acting on charged substances. The distinguishing feature of the family is the formation of a phosphorylated intermediate (aspartyl-phosphate) during the course of the reaction. Another common name for these enzymes is the E1-E2 ATPases based on the two isolable conformations: E1 (unphosphorylated) and E2 (phosphorylated). Generally, P-type ATPases consist of only a single subunit encompassing the ATPase and ion translocation pathway, however, in the case of the potassium (TIGR01497) and sodium/potassium (TIGR01106) varieties, these functions are split between two subunits. Additional small regulatory or stabilizing subunits may also exist in some forms. P-type ATPases are nearly ubiquitous in life and are found in numerous copies in higher organisms (at least 45 in Arabidopsis thaliana, for instance). Phylogenetic analyses have revealed that the P-type ATPase subfamily is divided up into groups based on substrate specificities and this is represented in the various subfamily and equivalog models that have been made: IA (K+) TIGR01497, IB (heavy metals) TIGR01525, IIA1 (SERCA-type Ca++) TIGR01116, IIA2 (PMR1-type Ca++) TIGR01522, IIB (PMCA-type Ca++) TIGR01517, IIC (Na+/K+, H+/K+ antiporters) TIGR01106, IID (fungal-type Na+ and K+) TIGR01523, IIIA (H+) TIGR01647, IIIB (Mg++) TIGR01524, IV (phospholipid, flippase) TIGR01652 and V (unknown specificity) TIGR01657. The crystal structure of one calcium-pumping ATPase and an analysis of the fold of the catalytic domain of the P-type ATPases have been published. These reveal that the catalytic core of these enzymes is a haloacid dehalogenase (HAD)-type aspartate-nucleophile hydrolase. The location of the ATP-binding loop in between the first and second HAD conserved catalytic motifs defines these enzymes as members of subfamily I of the HAD superfamily (see also TIGR01493, TIGR01509, TIGR01549, TIGR01544 and TIGR01545). Based on these classifications, the P-type ATPase _superfamily_ corresponds to the IC subfamily of the HAD superfamily. 545
17972 130559 TIGR01495 ETRAMP Plasmodium ring stage membrane protein ETRAMP. This model describes a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region, and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both P. falciparum and P. yoelii chromosomes 85
17973 273657 TIGR01496 DHPS dihydropteroate synthase. This model represents dihydropteroate synthase, the enzyme that catalyzes the second to last step in folic acid biosynthesis. The gene is usually designated folP (folic acid biosynthesis) or sul (sulfanilamide resistance). This model represents one branch of the family of pterin-binding enzymes (pfam00809) and of a cluster of dihydropteroate synthase and related enzymes (COG0294). Other members of pfam00809 and COG0294 are represented by model TIGR00284. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 257
17974 130561 TIGR01497 kdpB K+-transporting ATPase, B subunit. This model describes the P-type ATPase subunit of the complex responsible for translocating potassium ions across biological membranes in microbes. In E. coli and other species, this complex consists of the proteins KdpA, KdpB, KdpC and KdpF. KdpB is the ATPase subunit, while KdpA is the potassium-ion translocating subunit. The function of KdpC is unclear, although it has been suggested to couple the ATPase subunit to the ion-translocating subunit, while KdpF serves to stabilize the complex. The potassium P-type ATPases have been characterized as Type IA based on a phylogenetic analysis which places this clade closest to the heavy-metal translocating ATPases (Type IB). Others place this clade closer to the Na+/K+ antiporter type (Type IIC) based on physical characteristics. This model is very clear-cut, with a strong break between trusted hits and noise. All members of the seed alignment, from Clostridium, Anabaena and E. coli are in the characterized table. One sequence above trusted, OMNI|NTL01TA01282, is apparently mis-annotated in the primary literature, but properly annotated by TIGR. [Transport and binding proteins, Cations and iron carrying compounds] 675
17975 273658 TIGR01498 folK 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. This model describes the folate biosynthesis enzyme 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase. Alternate names include 6-hydroxymethyl-7,8-dihydropterin diphosphokinase and 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase (HPPK). The extreme C-terminal region, of typically eight to thirty residues, is not included in the model. This enzyme may be found as a fusion protein with other enzymes of folate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 129
17976 273659 TIGR01499 folC folylpolyglutamate synthase/dihydrofolate synthase. This model represents the FolC family of folate pathway proteins. Most examples are bifunctional, active as both folylpolyglutamate synthetase (EC 6.3.2.17) and dihydrofolate synthetase (EC 6.3.2.12). The two activities are similar - ATP + glutamate + dihydropteroate or tetrahydrofolyl-[Glu](n) = ADP + orthophosphate + dihydrofolate or tetrahydrofolyl-[Glu](n+1). A mutation study of the FolC gene of E. coli suggests that both activities belong to the same active site. Because some examples are monofunctional (and these cannot be separated phylogenetically), the model is treated as subfamily, not equivalog. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 397
17977 273660 TIGR01500 sepiapter_red sepiapterin reductase. This model describes sepiapterin reductase, a member of the short chain dehydrogenase/reductase family. The enzyme catalyzes the last step in the biosynthesis of tetrahydrobiopterin. A similar enzyme in Bacillus cereus was isolated for its ability to convert benzil to (S)-benzoin, a property sepiapterin reductase also shares. Cutoff scores for this model are set such that benzil reductase scores between trusted and noise cutoffs. 256
17978 130565 TIGR01501 MthylAspMutase methylaspartate mutase, S subunit. This model represents the S (sigma) subunit of methylaspartate mutase (glutamate mutase), a cobalamin-dependent enzyme that catalyzes the first step in a pathway of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation] 134
17979 213632 TIGR01502 B_methylAsp_ase methylaspartate ammonia-lyase. This model describes methylaspartate ammonia-lyase, also called beta-methylaspartase (EC 4.3.1.2). It follows methylaspartate mutase (composed of S and E subunits) in one of several possible pathways of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation] 408
17980 130567 TIGR01503 MthylAspMut_E methylaspartate mutase, E subunit. This model represents the E (epsilon) subunit of methylaspartate mutase (glutamate mutase), a cobalamin-dependent enzyme that catalyzes the first step in a pathway of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation] 480
17981 213633 TIGR01504 glyox_carbo_lig glyoxylate carboligase. Glyoxylate carboligase, also called tartronate-semialdehyde synthase, releases CO2 while synthesizing a single molecule of tartronate semialdehyde from two molecules of glyoxylate. It is a thiamine pyrophosphate-dependent enzyme, closely related in sequence to the large subunit of acetolactate synthase. In the D-glycerate pathway, part of allantoin degradation in the Enterobacteriaceae, tartronate semialdehyde is converted to D-glycerate and then 3-phosphoglycerate, a product of glycolysis and entry point in the general metabolism. 588
17982 130569 TIGR01505 tartro_sem_red 2-hydroxy-3-oxopropionate reductase. This model represents 2-hydroxy-3-oxopropionate reductase (EC 1.1.1.60), also called tartronate semialdehyde reductase. It follows glyoxylate carboligase and precedes glycerate kinase in D-glycerate pathway of glyoxylate degradation. The eventual product, 3-phosphoglycerate, is an intermediate of glycolysis and is readily metabolized. Tartronic semialdehyde, the substrate of this enzyme, may also come from other pathways, such as D-glucarate catabolism. 291
17983 130570 TIGR01506 ribC_arch riboflavin synthase. This archaeal protein catalyzes the same reaction, the final step in riboflavin biosynthesis, as bacterial riboflavin biosynthesis alpha chain. However, it is more similar in sequence to 6,7-dimethyl-8-ribityllumazine synthase, which catalyzes the previous reaction and which (in bacteria) is called the riboflavin synthase beta chain. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 151
17984 273661 TIGR01507 hopene_cyclase squalene-hopene cyclase. SHC is an essential prokaryotic gene in hopanoid (triterpenoid) biosynthesis. Squalene hopene cyclase, an integral membrane protein, directly cyclizes squalene into hopanoid products. [Fatty acid and phospholipid metabolism, Other] 635
17985 130572 TIGR01508 rib_reduct_arch 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)pyrimidine 1'-reductase, archaeal. This model represents a specific reductase of riboflavin biosynthesis in the Archaea, diaminohydroxyphosphoribosylaminopyrimidine reductase. It should not be confused with bacterial 5-amino-6-(5-phosphoribosylamino)uracil reductase. The intermediate 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)pyrimidine in riboflavin biosynthesis is reduced first, and then deaminated, in both Archaea and Fungi, opposite the order in Bacteria. The subsequent deaminase is not presently known and is not closely homologous to the deaminase domain (3.5.4.26) fused to the reductase domain (1.1.1.193) similar to this protein but found in most bacteria. 210
17986 273662 TIGR01509 HAD-SF-IA-v3 haloacid dehalogenase superfamily, subfamily IA, variant 3 with third motif having DD or ED. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)D, (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 3 (this model) is found in the enzymes beta-phosphoglucomutase (TIGR01990) and deoxyglucose-6-phosphatase, while many other enzymes of subfamily IA exhibit this variant as well as variant 1 (TIGR01549). 
These three variant models were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model. [Unknown function, Enzymes of unknown specificity] 178
17987 273663 TIGR01510 coaD_prev_kdtB pantetheine-phosphate adenylyltransferase, bacterial. This model describes pantetheine-phosphate adenylyltransferase, the penultimate enzyme of coenzyme A (CoA) biosynthesis in bacteria. It does not show any strong homology to eukaryotic enzymes of coenzyme A biosynthesis. This protein was previously designated KdtB and postulated (because of cytidyltransferase homology and proximity to kdtA) to be an enzyme of LPS biosynthesis, a cytidyltransferase for 3-deoxy-D-manno-2-octulosonic acid. However, no activity toward that compound was found with either CTP or ATP. The phylogenetic distribution of this enzyme is more consistent with coenzyme A biosynthesis than with LPS biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 155
17988 273664 TIGR01511 ATPase-IB1_Cu copper-(or silver)-translocating P-type ATPase. This model describes the P-type ATPase primarily responsible for translocating copper ions across biological membranes. These transporters are found in prokaryotes and eukaryotes. This model encompasses those species which pump copper ions out of cells or organelles (efflux pumps such as CopA of Escherichia coli) as well as those which pump the ion into cells or organelles either for the purpose of supporting life in extremely low-copper environments (for example CopA of Enterococcus hirae) or for the specific delivery of copper to a biological complex for which it is a necessary component (for example FixI of Bradyrhizobium japonicum, or CtaA and PacS of Synechocystis). The substrate specificity of these transporters may, to a varying degree, include silver ions (for example, CopA from Archaeoglobus fulgidus). Copper transporters from this family are well known as the genes which are mutated in two human disorders of copper metabolism, Wilson's and Menkes' diseases. The sequences contributing to the seed of this model are all experimentally characterized. The copper P-type ATPases have been characterized as Type IB based on a phylogenetic analysis which combines the copper-translocating ATPases with the cadmium-translocating species. This model and that describing the cadmium-ATPases (TIGR01512) are well separated, and thus we further type the copper-ATPases as IB1 (and the cadmium-ATPases as IB2). Several sequences which have not been characterized experimentally fall just below the cutoffs for both of these models (SP|Q9CCL1 from Mycobacterium leprae, GP|13816263 from Sulfolobus solfataricus, OMNI|NTL01CJ01098 from Campylobacter jejuni, OMNI|NTL01HS01687 from Halobacterium sp., GP|6899169 from Ureaplasma urealyticum and OMNI|HP1503 from Helicobacter pylori). 
Accession PIR|A29576 from Enterococcus faecalis scores very high against this model, but yet is annotated as an "H+/K+ exchanging ATPase". BLAST of this sequence does not hit anything else annotated in this way. This error may come from the characterization paper published in 1987. Accession GP|7415611 from Saccharomyces cerevisiae appears to be mis-annotated as a cadmium resistance protein. Accession OMNI|NTL01HS00542 from Halobacterium which scores above trusted for this model is annotated as "molybdenum-binding protein" although no evidence can be found for this classification. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds] 562
17989 273665 TIGR01512 ATPase-IB2_Cd heavy metal-(Cd/Co/Hg/Pb/Zn)-translocating P-type ATPase. This model describes the P-type ATPase primarily responsible for translocating cadmium ions (and other closely-related divalent heavy metals such as cobalt, mercury, lead and zinc) across biological membranes. These transporters are found in prokaryotes and plants. Experimentally characterized members of the seed alignment include: SP|P37617 from E. coli, SP|Q10866 from Mycobacterium tuberculosis and SP|Q59998 from Synechocystis PCC6803. The cadmium P-type ATPases have been characterized as Type IB based on a phylogenetic analysis which combines the copper-translocating ATPases with the cadmium-translocating species. This model and that describing the copper-ATPases (TIGR01511) are well separated, and thus we further type the copper-ATPases as IB1 and the cadmium-ATPases as IB2. Several sequences which have not been characterized experimentally fall just below trusted cutoff for both of these models (SP|Q9CCL1 from Mycobacterium leprae, GP|13816263 from Sulfolobus solfataricus, OMNI|NTL01CJ01098 from Campylobacter jejuni, OMNI|NTL01HS01687 from Halobacterium sp., GP|6899169 from Ureaplasma urealyticum and OMNI|HP1503 from Helicobacter pylori). [Transport and binding proteins, Cations and iron carrying compounds] 550
17990 273666 TIGR01513 NAPRTase_put putative nicotinate phosphoribosyltransferase. A deep split separates two related families of proteins, one of which includes experimentally characterized examples of nicotinate phosphoribosyltransferase, the first enzyme of NAD salvage biosynthesis. This model represents the other family. Members have a different (longer) spacing of several key motifs and have an additional C-terminal domain of up to 100 residues. One argument suggesting that this family represents the same enzyme is that no species has a member of both families. Another is that the gene encoding this protein is located near other NAD salvage biosynthesis genes in Nostoc and in at least four different Gram-positive bacteria. NAD and NADP are ubiquitous in life. Most members of this family are Gram-positive bacteria. An additional set of mutually closely related archaeal sequences score between the trusted and noise cutoffs. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 443
17991 130578 TIGR01514 NAPRTase nicotinate phosphoribosyltransferase. This model represents nicotinate phosphoribosyltransferase, the first enzyme in the salvage pathway of NAD biosynthesis from nicotinate (niacin). Members are primarily proteobacterial but also include yeasts and Methanosarcina acetivorans. A related family, apparently non-overlapping in species distribution, is TIGR01513. Members of that family differ substantially in sequence and have a long C-terminal extension missing from this family, but are proposed also to act as nicotinate phosphoribosyltransferase (see model TIGR01513). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 394
17992 273667 TIGR01515 branching_enzym alpha-1,4-glucan:alpha-1,4-glucan 6-glycosyltransferase. This model describes the glycogen branching enzymes which are responsible for the transfer of chains of approx. 7 alpha(1--4)-linked glucosyl residues to other similar chains (in new alpha(1--6) linkages) in the biosynthesis of glycogen. This enzyme is a member of the broader amylase family of starch hydrolases which fold as (beta/alpha)8 barrels, the so-called TIM-barrel structure. All of the sequences comprising the seed of this model have been experimentally characterized. This model encompasses both bacterial and eukaryotic species. No archaea have this enzyme, although Aquifex aeolicus does. Two species, Bacillus thuringiensis and Clostridium perfringens have two sequences each which are annotated as amylases. These annotations are apparently in error. GP|18143720 from C. perfringens, for instance, contains the note "674 aa, similar to gp:A14658_1 amylase (1,4-alpha-glucan branching enzyme (EC 2.4.1.18) ) from Bacillus thuringiensis (648 aa); 51.1% identity in 632 aa overlap." A branching enzyme from Porphyromonas gingivalis, OMNI|PG1793, appears to be more closely related to the eukaryotic species (across a deep phylogenetic split) and may represent an instance of lateral transfer from this species' host. A sequence from Arabidopsis thaliana, GP|9294564, scores just above trusted, but appears either to contain corrupt sequence or, more likely, to be a pseudogene as some of the conserved catalytic residues common to the alpha amylase family are not conserved here. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 618
17993 273668 TIGR01517 ATPase-IIB_Ca plasma-membrane calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the plasma membrane of eukaryotes, out of the cell. In some organisms, this type of pump may also be found in vacuolar membranes. In humans and mice, at least, there are multiple isoforms of the PMCA pump with overlapping but not redundant functions. Accordingly, there are no human diseases linked to PMCA defects, although alterations of PMCA function do elicit physiological effects. The calcium P-type ATPases have been characterized as Type IIB based on a phylogenetic analysis which distinguishes this group from the Type IIA SERCA calcium pump. A separate analysis divides Type IIA into sub-types (SERCA and PMR1) which are represented by two corresponding models (TIGR01116 and TIGR01522). This model is well separated from those. 956
17994 130581 TIGR01518 g3p_cytidyltrns glycerol-3-phosphate cytidylyltransferase. This model describes glycerol-3-phosphate cytidyltransferase, also called CDP-glycerol pyrophosphorylase. A closely related protein assigned a different function experimentally is a human ethanolamine-phosphate cytidylyltransferase (EC 2.7.7.14). Glycerol-3-phosphate cytidyltransferase acts in pathways of teichoic acid biosynthesis. Teichoic acids are substituted polymers, linked by phosphodiester bonds, of glycerol, ribitol, etc. An example is poly(glycerol phosphate), the major teichoic acid of the Bacillus subtilis cell wall. Most but not all species encoding proteins in this family are Gram-positive bacteria. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 125
17995 130582 TIGR01519 plasmod_dom_1 Plasmodium falciparum uncharacterized domain. This model represents an uncharacterized domain present in roughly eight hypothetical proteins of the malaria parasite Plasmodium falciparum. 70
17996 130583 TIGR01520 FruBisAldo_II_A fructose-bisphosphate aldolase, class II, yeast/E. coli subtype. Members of this family are class II examples of the glycolytic enzyme fructose-bisphosphate aldolase (FBA). This model represents one of two deeply split, architecturally distinct clades of the family that includes class II fructose-bisphosphate aldolases, tagatose-bisphosphate aldolases, and related uncharacterized proteins. This family is well-conserved and includes characterized FBA from Saccharomyces cerevisiae, Escherichia coli, and Corynebacterium glutamicum. Proteins outside the scope of this model may also be designated as class II fructose-bisphosphate aldolases, but are well separated in an alignment-based phylogenetic tree. [Energy metabolism, Glycolysis/gluconeogenesis] 357
17997 130584 TIGR01521 FruBisAldo_II_B fructose-bisphosphate aldolase, class II, Calvin cycle subtype. Members of this family are class II examples of the enzyme fructose-bisphosphate aldolase, an enzyme both of glycolysis and (in the opposite direction) of the Calvin cycle of CO2 fixation. A deep split separates the tightly conserved yeast/E. coli/Mycobacterium subtype (all species lacking the Calvin cycle) represented by model TIGR01520 from a broader group of aldolases that includes both tagatose- and fructose-bisphosphate aldolases. This model represents a distinct, elongated, very well conserved subtype within the latter group. Most species with this aldolase subtype have the Calvin cycle. 347
17998 130585 TIGR01522 ATPase-IIA2_Ca golgi membrane calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the golgi membrane of fungi and animals, and is of particular importance in the sarcoplasmic reticulum of skeletal and cardiac muscle in vertebrates. The calcium P-type ATPases have been characterized as Type IIA based on a phylogenetic analysis which distinguishes this group from the Type IIB PMCA calcium pump modelled by TIGR01517. A separate analysis divides Type IIA into sub-types, SERCA and PMR1, the former of which is modelled by TIGR01116. 884
17999 130586 TIGR01523 ATPase-IID_K-Na potassium and/or sodium efflux P-type ATPase, fungal-type. Initially described as a calcium efflux ATPase, more recent work has shown that the S. pombe CTA3 gene is in fact a potassium ion efflux pump. This model describes the clade of fungal P-type ATPases responsible for potassium and sodium efflux. The degree to which these pumps show preference for sodium or potassium varies. This group of ATPases has been classified by phylogenetic analysis as type IID. The Leishmania sequence (GP|3192903), which falls between trusted and noise in this model, may very well turn out to be an active potassium pump. 1053
18000 130587 TIGR01524 ATPase-IIIB_Mg magnesium-translocating P-type ATPase. This model describes the magnesium translocating P-type ATPase found in a limited number of bacterial species and best described in Salmonella typhimurium, which contains two isoforms. These transporters are active in low external Mg2+ concentrations and pump the ion into the cytoplasm. The magnesium ATPases have been classified as type IIIB by a phylogenetic analysis. [Transport and binding proteins, Cations and iron carrying compounds] 867
18001 273669 TIGR01525 ATPase-IB_hvy heavy metal translocating P-type ATPase. This model encompasses two equivalog models for the copper and cadmium-type heavy metal transporting P-type ATPases (TIGR01511 and TIGR01512) as well as those species which score ambiguously between both models. For more comments and references, see the files on TIGR01511 and 01512. 558
18002 273670 TIGR01526 nadR_NMN_Atrans nicotinamide-nucleotide adenylyltransferase, NadR type. The NadR protein of E. coli and closely related bacteria is both enzyme and regulatory protein. The first 60 or so amino acids, N-terminal to the region covered by this model, is a DNA-binding helix-turn-helix domain (pfam01381) responsible for repressing the nadAB genes of NAD de novo biosynthesis. The NadR homologs in Mycobacterium tuberculosis, Haemophilus influenzae, and others appear to lack the repressor domain. NadR has recently been shown to act as an enzyme of the salvage pathway of NAD biosynthesis, nicotinamide-nucleotide adenylyltransferase; members of this family are presumed to share this activity. E. coli NadR has also been found to regulate the import of its substrate, nicotinamide ribonucleotide, but it is not known if the other members of this model share that activity. 325
18003 273671 TIGR01527 arch_NMN_Atrans nicotinamide-nucleotide adenylyltransferase. This model describes a family of archaeal proteins with the activity of the NAD salvage biosynthesis enzyme nicotinamide-nucleotide adenylyltransferase (EC 2.7.7.1). In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18), an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a lower-scoring paralog, uncharacterized with respect to activity, is also present. These score between trusted and noise cutoffs. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 165
18004 273672 TIGR01528 NMN_trans_PnuC nicotinamide mononucleotide transporter PnuC. The PnuC protein of E. coli is a membrane protein responsible for nicotinamide mononucleotide transport, subject to regulation by interaction with the NadR (also called NadI) protein (see TIGR01526). This model defines a region corresponding to most of the length of PnuC, found primarily in pathogens. The extreme N- and C-terminal regions are poorly conserved and not included in the alignment and model. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 189
18005 130592 TIGR01529 argR_whole arginine repressor. This model includes most members of the arginine-responsive transcriptional regulator family ArgR. This hexameric protein binds DNA at its amino end to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbor-joining tree, some of these paralogous sequences show long branches and differ significantly in an otherwise well-conserved C-terminal region motif GT[VIL][AC]GDDT. These paralogs are excluded from the seed and score in the gray zone of this model, between trusted and noise cutoffs. [Amino acid biosynthesis, Glutamate family, Regulatory functions, DNA interactions] 146
18006 211667 TIGR01530 nadN NAD pyrophosphatase/5'-nucleotidase NadN. This model describes NadN of Haemophilus influenzae and a small number of close homologs in pathogenic, Gram-negative bacteria. NadN is a periplasmic enzyme that cleaves NAD (nicotinamide adenine dinucleotide) to NMN (nicotinamide mononucleotide) and AMP. The NMN must be converted by a 5'-nucleotidase to nicotinamide riboside for import. NadN belongs to a large family of 5'-nucleotidases and has NMN 5'-nucleotidase activity for NMN, AMP, etc. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 545
18007 273673 TIGR01531 glyc_debranch glycogen debranching enzyme. Glycogen debranching enzyme possesses two different catalytic activities; oligo-1,4-->1,4-glucantransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). Site directed mutagenesis studies in S. cerevisiae indicate that the transferase and glucosidase activities are independent and located in different regions of the polypeptide chain. Proteins in this model belong to the larger alpha-amylase family. The model covers eukaryotic proteins with a seed composed of human, nematode and yeast sequences. The yeast seed sequence is well characterized. The model is quite rigorous; either a query sequence yields a large bit score or it fails to hit the model altogether. There doesn't appear to be any middle ground. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 1464
18008 130595 TIGR01532 E4PD_g-proteo erythrose-4-phosphate dehydrogenase. This model represents the small clade of dehydrogenases in gamma-proteobacteria which utilize NAD+ to oxidize erythrose-4-phosphate (E4P) to 4-phospho-erythronate, a precursor for the de novo synthesis of pyridoxine via 4-hydroxythreonine and D-1-deoxyxylulose. This enzyme activity appears to have evolved from glyceraldehyde-3-phosphate dehydrogenase, whose substrate differs only in the lack of one carbon relative to E4P. Accordingly, this model is very close to the corresponding models for GAPDH, and those sequences which hit above trusted here invariably hit between trusted and noise to the GAPDH model (TIGR01534). Similarly, it may be found that there are species outside of the gamma proteobacteria which synthesize pyridoxine and have more than one apparent GAPDH gene of which one may have E4PD activity - this may necessitate a readjustment of these models. Alternatively, some of the GAPDH enzymes may prove to be bifunctional in certain species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 325
18009 273674 TIGR01533 lipo_e_P4 5'-nucleotidase, lipoprotein e(P4) family. This model represents a set of bacterial lipoproteins belonging to a larger acid phosphatase family (pfam03767), which in turn belongs to the haloacid dehalogenase (HAD) superfamily of aspartate-dependent hydrolases. Members are found on the outer membrane of Gram-negative bacteria and the cytoplasmic membrane of Gram-positive bacteria. Most members have classic lipoprotein signal sequences. A critical role of this 5'-nucleotidase in Haemophilus influenzae is the degradation of external riboside in order to allow transport into the cell. An earlier suggested role in hemin transport is no longer current. This enzyme may also have other physiologically significant roles. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 266
18010 273675 TIGR01534 GAPDH-I glyceraldehyde-3-phosphate dehydrogenase, type I. This model represents glyceraldehyde-3-phosphate dehydrogenase (GAPDH), the enzyme responsible for the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. Forms exist which utilize NAD (EC 1.2.1.12), NADP (EC 1.2.1.13) or either (1.2.1.59). In some species, NAD- and NADP- utilizing forms exist, generally being responsible for reactions in the anabolic and catabolic directions respectively. Two Pfam models cover the two functional domains of this protein; pfam00044 represents the N-terminal NAD(P)-binding domain and pfam02800 represents the C-terminal catalytic domain. An additional form of gap gene is found in gamma proteobacteria and is responsible for the conversion of erythrose-4-phosphate (E4P) to 4-phospho-erythronate in the biosynthesis of pyridoxine. This pathway of pyridoxine biosynthesis appears to be limited, however, to a relatively small number of bacterial species although it is prevalent among the gamma-proteobacteria. This enzyme is described by TIGR01532. These sequences generally score between trusted and noise to this GAPDH model due to the close evolutionary relationship. There exists the possibility that some forms of GAPDH may be bifunctional and act on E4P in species which make pyridoxine via hydroxythreonine and lack a separate E4PDH enzyme (for instance, the GAPDH from Bacillus stearothermophilus has been shown to possess a limited E4PD activity as well as a robust GAPDH activity). There are a great number of sequences in the databases which score between trusted and noise to this model, nearly all of them due to fragmentary sequences. It seems that study of this gene has been carried out in many species utilizing PCR probes which exclude the extreme ends of the consensus used to define this model. 
The noise level is set relative not to E4PD, but the next closest outliers, the class II GAPDH's (found in archaea, TIGR01546) and aspartate semialdehyde dehydrogenase (ASADH, TIGR01296) both of which have highest-scoring hits around -225 to the prior model. [Energy metabolism, Glycolysis/gluconeogenesis] 326
18011 130598 TIGR01535 glucan_glucosid glucan 1,4-alpha-glucosidase. Glucan 1,4-alpha-glucosidase catalyzes the hydrolysis of terminal 1,4-linked alpha-D-glucose residues from non-reducing ends of polysaccharides, releasing a beta-D-glucose monomer. Some forms of this enzyme can hydrolyze terminal 1,6- and 1,3-alpha-D-glucosidic bonds in polysaccharides as well. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 648
18012 273676 TIGR01536 asn_synth_AEB asparagine synthase (glutamine-hydrolyzing). This model describes the glutamine-hydrolysing asparagine synthase. A poorly conserved C-terminal extension was removed from the model. Bacterial members of the family tend to have a long, poorly conserved insert lacking from archaeal and eukaryotic sequences. Multiple isozymes have been demonstrated, such as in Bacillus subtilis. Long-branch members of the phylogenetic tree (which typically were also second or third candidate members from their genomes) were removed from the seed alignment and score below trusted cutoff. [Amino acid biosynthesis, Aspartate family] 466
18013 273677 TIGR01537 portal_HK97 phage portal protein, HK97 family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions] 342
18014 273678 TIGR01538 portal_SPP1 phage portal protein, SPP1 family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions] 412
18015 273679 TIGR01539 portal_lambda phage portal protein, lambda family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions] 458
18016 273680 TIGR01540 portal_PBSX phage portal protein, PBSX family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. This family shows clear homology to TIGR01537. The alignment for this group was trimmed of poorly alignable N-terminal sequence of about 50 residues and of C-terminal regions present in some but not all members of up to 180 residues. [Mobile and extrachromosomal element functions, Prophage functions] 320
18017 273681 TIGR01541 tape_meas_lam_C phage tail tape measure protein, lambda family. This model represents a relatively well-conserved region near the C-terminus of the tape measure protein of a lambda and related phage. This protein, which controls phage tail length, is typically about 1000 residues in length. Both low-complexity sequence and insertion/deletion events appear common in this family. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behavior is attributed to proteins from distantly related or unrelated families in other phage. [Mobile and extrachromosomal element functions, Prophage functions] 332
18018 130605 TIGR01542 A118_put_portal phage portal protein, putative, A118 family. This model represents a family of phage minor structural proteins. The protein is suggested to be the head-tail connector, or portal protein, on the basis of its position in the phage gene order, its presence in mature phage, its size, and its conservation across a number of complete genomes of tailed phage that lack other candidate portal proteins. Several other known portal protein families lack clear homology to this family and to each other. [Mobile and extrachromosomal element functions, Prophage functions] 476
18019 273682 TIGR01543 proheadase_HK97 phage prohead protease, HK97 family. This model describes the prohead protease of HK97 and related phage. It is generally encoded next to the gene for the capsid protein that it processes, and in some cases may be fused to it. This family does not show similarity to the prohead protease of phage T4 (see pfam03420). [Mobile and extrachromosomal element functions, Prophage functions, Protein fate, Other] 145
18020 273683 TIGR01544 HAD-SF-IE haloacid dehalogenase superfamily, subfamily IE hydrolase, TIGR01544. This model represents a small group of metazoan sequences. The sequences from mouse are annotated as Pyrimidine 5'-nucleotidases, apparently in reference to HSPC233, the human homolog. However, no such annotation can currently be found for this gene. This group of sequences was found during searches for members of the haloacid dehalogenase (HAD) superfamily. All of the conserved catalytic motifs are found. The placement of the variable domain between motifs 1 and 2 indicates membership in subfamily I of the superfamily, but these sequences are sufficiently different from any of the branches (IA, TIGR01493, TIGR01509, TIGR01549; IB, TIGR01488; IC, TIGR01494; ID, TIGR01658; IF, TIGR01545) of that subfamily as to constitute a separate branch to now be called IE. Considering that the closest identifiable hit outside of the noise range is to a phosphoserine phosphatase, this group may be considered to be most closely allied to subfamily IB. 283
18021 130608 TIGR01545 YfhB_g-proteo haloacid dehalogenase superfamily, subfamily IF hydrolase, YfhB. This model describes a clade of sequences limited to the gamma proteobacteria. This group is a member of the haloacid dehalogenase (HAD) superfamily of aspartate-dependent hydrolases and all of the conserved catalytic motifs are present. Although structurally similar to subfamily IA in that the variable domain is predicted to consist of five consecutive alpha helices (by PSI-PRED), it is sufficiently divergent to warrant being regarded as a separate sub-family (IF). The gene name comes from the E. coli gene. There is currently no information regarding the function of this gene. 210
18022 130609 TIGR01546 GAPDH-II_archae glyceraldehyde-3-phosphate dehydrogenase, type II. This model describes the type II glyceraldehyde-3-phosphate dehydrogenases which are limited to archaea. These enzymes catalyze the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. In archaea, either NAD or NADP may be utilized as the cofactor. The class I GAPDH's from bacteria and eukaryotes are covered by TIGR01534. All of the members of the seed are characterized. See, for instance. This model is very solid, there are no species falling between trusted and noise at this time. The closest relatives scoring in the noise are the class I GAPDH's. 333
18023 273684 TIGR01547 phage_term_2 phage terminase, large subunit, PBSX family. This model detects members of a highly divergent family of the large subunit of phage terminase. All members are encoded by phage genomes or within prophage regions of bacterial genomes. This is a distinct family from pfam03354. [Mobile and extrachromosomal element functions, Prophage functions] 394
18024 273685 TIGR01548 HAD-SF-IA-hyp1 haloacid dehalogenase superfamily, subfamily IA hydrolase, TIGR01548. This model represents a small and phylogenetically curious clade of sequences. Sequences are found from Halobacterium (an archaeon), Nostoc and Synechococcus (cyanobacteria) and Phytophthora (a stramenopile eukaryote). These appear to be members of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases by general homology and the conservation of all of the recognized catalytic motifs. The variable domain is found in between motifs 1 and 2, indicating membership in subfamily I, and phylogeny and prediction of the alpha helical nature of the variable domain (by PSI-PRED) indicate membership in subfamily IA. All but the Halobacterium sequence currently found are annotated as "Imidazoleglycerol-phosphate dehydratase"; however, the source of the annotation could not be traced and significant homology could not be found between any of these sequences and known IGPD's. 197
18025 273686 TIGR01549 HAD-SF-IA-v1 haloacid dehalogenase superfamily, subfamily IA, variant 1 with third motif having Dx(3-4)D or Dx(3-4)E. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)(D/E), (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 1 (this model) is found in the enzymes phosphoglycolate phosphatase (TIGR01449) and enolase-phosphatase. These three variant models (see also TIGR01493 and TIGR01509) were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model. [Unknown function, Enzymes of unknown specificity] 164
18026 273687 TIGR01550 DOC_P1 death-on-curing family protein. The characterized member of this family is the death-on-curing (DOC) protein of phage P1. It is part of a two protein operon with prevents-host-death (phd) that forms an addiction module. DOC lacks homology to analogous addiction module post-segregational killing proteins involved in plasmid maintenance. These modules work as a combination of a long lived poison (e.g. this protein) and a more abundant but shorter lived antidote. Members of this family have a well-conserved central motif HxFx[ND][AG]NKR. A similar region, with K replaced by G, is found in the huntingtin interacting protein (HYPE) family. [Unknown function, General] 121
18027 233464 TIGR01551 major_capsid_P2 phage major capsid protein, P2 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage P2 and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions] 327
18028 273688 TIGR01552 phd_fam prevent-host-death family protein. This model recognizes a region of about 55 amino acids toward the N-terminal end of bacterial proteins of about 85 amino acids in length. The best-characterized member is prevent-host-death (phd) of bacteriophage P1, the antidote partner of death-on-curing (doc) (TIGR01550) in an addiction module. Addiction modules prevent plasmid curing by killing the host cell as the longer-lived killing protein persists while the gene for the shorter-lived antidote is lost. Note, however, that relatively few members of this family appear to be plasmid or phage-encoded. Also, there is little overlap, except for phage P1 itself, of species with this family and with the doc family. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 52
18029 273689 TIGR01553 formate-DH-alph formate dehydrogenase-N alpha subunit. This model describes a subset of formate dehydrogenase alpha chains found mainly in proteobacteria but also in Aquifex. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits of 32 and 20 kDa. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The electrons are utilized mainly in the nitrate respiration by nitrate reductase. In E. coli and Salmonella, there are two forms of the formate dehydrogenase, one induced by nitrate which is strictly anaerobic (fdn), and one induced during the transition from aerobic to anaerobic growth (fdo). This subunit is one of only three proteins in E. coli which contain selenocysteine. This model is well-defined, with a large, unpopulated trusted/noise gap. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 1009
18030 273690 TIGR01554 major_cap_HK97 phage major capsid protein, HK97 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage HK97, phi-105, P27, and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions] 386
18031 130618 TIGR01555 phge_rel_HI1409 phage-related protein, HI1409 family. This model describes an uncharacterized family of proteins found in prophage regions of a number of bacterial genomes, including Haemophilus influenzae, Xylella fastidiosa, Salmonella typhi, and Enterococcus faecalis. Distantly related proteins can be found in the prophage-bearing plasmids of Borrelia burgdorferi. [Mobile and extrachromosomal element functions, Prophage functions] 404
18032 130619 TIGR01556 rhamnosyltran L-rhamnosyltransferase. This model subfamily is comprised of gamma proteobacteria whose proteins function as L-rhamnosyltransferases in the synthesis of their respective surface polysaccharides. Rhamnolipids are glycolipids containing mono- or di- L-rhamnose molecules. Rhamnolipid synthesis occurs by sequential glycosyltransferase reactions involving two distinct rhamnosyltransferase enzymes. In P. aeruginosa, the synthesis of mono-rhamnolipids is catalyzed by rhamnosyltransferase 1, and proceeds by a glycosyltransfer reaction catalyzed by rhamnosyltransferase 2 to yield di-rhamnolipids. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 281
18033 130620 TIGR01557 myb_SHAQKYF myb-like DNA-binding domain, SHAQKYF class. This model describes a DNA-binding domain restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain described by pfam00249. It is distinguished in part by a well-conserved motif SH[AL]QKY[RF] at the C-terminal end of the motif. 57
18034 273691 TIGR01558 sm_term_P27 phage terminase, small subunit, putative, P27 family. This model describes a distinct family of phage (and integrated prophage) putative terminase small subunit. Members tend to be adjacent to the phage terminase large subunit gene. [Mobile and extrachromosomal element functions, Prophage functions] 116
18035 188157 TIGR01559 squal_synth farnesyl-diphosphate farnesyltransferase. This model describes farnesyl-diphosphate farnesyltransferase, also known as squalene synthase, as found in eukaryotes. This family is related to phytoene synthases. Tentatively identified archaeal homologs (excluded from this model) lack the C-terminal predicted transmembrane region universally conserved among members of this family. 337
18036 130623 TIGR01560 put_DNA_pack uncharacterized phage protein (possible DNA packaging). This model describes a small (~ 100 amino acids) protein found in phage and in putative prophage regions of a number of bacterial genomes. Members have been annotated in some cases as a possible DNA packaging protein, but the source of this annotation was not traced during construction of this model. [Mobile and extrachromosomal element functions, Prophage functions] 91
18037 130624 TIGR01561 gde_arch glycogen debranching enzyme, archaeal type, putative. The seed for this model is composed of two uncharacterized archaeal proteins from Methanosarcina acetivorans and Sulfolobus solfataricus. Trusted cutoff is set so that essentially only archaeal members hit the model. The notable exceptions to archaeal membership are the Gram positive Clostridium perfringens which scores much better than some other archaea and the Cyanobacterium Nostoc sp. which scores just above the trusted cutoff. Noise cutoff is set to exclude the characterized eukaryotic glycogen debranching enzyme in S. cerevisiae. These cutoffs leave the prokaryotes Porphyromonas gingivalis and Deinococcus radiodurans below trusted but above noise. Multiple alignments including these last two species exhibit sequence divergence which may suggest a subtly different function for these prokaryotic proteins. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 575
18038 130625 TIGR01562 FdhE formate dehydrogenase accessory protein FdhE. This model describes an accessory protein required for the assembly of formate dehydrogenase of certain proteobacteria although not present in the final complex. The exact nature of the function of FdhE in the assembly of the complex is unknown, but considering the presence of selenocysteine, molybdopterin, iron-sulfur clusters and cytochrome b556, it is likely to have something to do with the insertion of cofactors. The only sequence scoring between trusted and noise is that from Aquifex aeolicus, which shows certain structural differences from the proteobacterial forms in the alignment. However it is notable that A. aeolicus also has a sequence scoring above trusted to the alpha subunit of formate dehydrogenase (TIGR01553). 305
18039 273692 TIGR01563 gp16_SPP1 phage head-tail adaptor, putative, SPP1 family. This family describes a small protein of about 100 amino acids found in bacteriophage and in bacterial prophage regions. Examples include gp9 of phage HK022 and gp16 of phage SPP1. This minor structural protein is suggested to be a head-tail adaptor protein (although the source of this annotation was not traced during construction of this model). [Mobile and extrachromosomal element functions, Prophage functions] 101
18040 273693 TIGR01564 S_layer_MJ S-layer protein, MJ0822 family. This model represents one of several families of proteins associated with the formation of prokaryotic S-layers. Members of this family are found in archaeal species, including Pyrococcus horikoshii (split into two tandem reading frames), Methanococcus jannaschii, and related species. Some local similarity can be found to other S-layer protein families. [Cell envelope, Surface structures] 571
18041 130628 TIGR01565 homeo_ZF_HD homeobox domain, ZF-HD class. This model represents a class of homeobox domain that differs substantially from the typical homeobox domain described in pfam00046. It is found in both C4 and C3 plants. 58
18042 130629 TIGR01566 ZF_HD_prot_N ZF-HD homeobox protein Cys/His-rich dimerization domain. This model describes a 54-residue domain found in the N-terminal region of plant proteins, the vast majority of which contain a ZF-HD class homeobox domain toward the C-terminus. The region between the two domains typically is rich in low complexity sequence. The companion ZF-HD homeobox domain is described in model TIGR01565. 53
18043 273694 TIGR01567 S_layer_rel_Mac S-layer family duplication domain. This model represents a sequence region found tandemly duplicated in two proven archaeal S-layer glycoproteins, MA0829 from Methanosarcina acetivorans C2A and MM1976 from Methanosarcina mazei Go1, as well as in several paralogs of those S-layer proteins from both species. Members of the family show regions of local similarity to another known family of archaeal S-layer proteins described by model TIGR01564. Some members of this family, including the proven S-layer proteins, have the archaeosortase A target motif, PGF-CTERM (TIGR04126), at the protein C-terminus. [Cell envelope, Surface structures] 256
18044 130631 TIGR01568 A_thal_3678 uncharacterized plant-specific domain TIGR01568. This model describes an uncharacterized domain of about 70 residues found exclusively in plants, generally toward the C-terminus of proteins of 200 to 350 amino acids in length. At least 14 such proteins are found in Arabidopsis thaliana. Other regions of these proteins tend to consist largely of low-complexity sequence. 66
18045 273695 TIGR01569 A_tha_TIGR01569 plant integral membrane protein TIGR01569. This model describes a region of ~160 residues found exclusively in plant proteins, generally as the near complete length of the protein. At least 24 different members are found in Arabidopsis thaliana. Members have four predicted transmembrane regions, the last of which is preceded by an invariant CXXXXX[FY]C motif. The family is not functionally characterized. 154
18046 273696 TIGR01570 A_thal_3588 uncharacterized plant-specific domain TIGR01570. This model represents a region of about 170 amino acids found at the C-terminus of a family of plant proteins. These proteins typically have additional highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterized protein. At least 12 distinct members are found in Arabidopsis thaliana. 161
18047 273697 TIGR01571 A_thal_Cys_rich uncharacterized Cys-rich domain. This model describes an uncharacterized domain of about 100 residues. It is common in plants but found also in Homo sapiens, Dictyostelium, and Leishmania; at least 12 distinct members are found in Arabidopsis. Most members of this family contain more than 10 per cent Cys, but no Cys residue is invariant across the family. 104
18048 273698 TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL. 265
18049 273699 TIGR01573 cas2 CRISPR-associated endonuclease Cas2. This model describes most members of the family of Cas2, one of the first four protein families found to mark prokaryotic genomes that contain multiple CRISPR elements. CRISPR systems protect against invasive nucleic acid sequences, including phage. Cas2 proteins have been characterized as either endoribonuclease (for ssRNA) or endodeoxyribonuclease (for dsDNA), depending on the system to which the Cas2 belongs. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. The cas genes usually are found near the repeats. A distinct branch of the Cas2 family shows a very low level of sequence identity and is modeled by TIGR01873 instead of by this model (TIGR01573). 95
18050 273700 TIGR01574 miaB-methiolase tRNA-N(6)-(isopentenyl)adenosine-37 thiotransferase enzyme MiaB. This model represents homologs of the MiaB enzyme responsible for the modification of the isopentenylated adenine-37 base of most bacterial and eukaryotic tRNAs that read codons beginning with uracil (all except tRNA(I,V) Ser). Adenine-37 is next to the anticodon on the 3' side in these tRNA's, and lack of modification at this site leads to an increased spontaneous mutation frequency. Isopentenylated A-37 is modified by methylthiolation at position 2, either by MiaB alone or in concert with a separate methylase yet to be discovered (MiaC?). MiaB contains a 4Fe-4S cluster which is labile under oxidizing conditions. Additionally, the sequence is homologous (via PSI-BLAST searches) to the biotin synthetase, BioB, which utilizes both an iron-sulfur cluster and S-adenosyl methionine (SAM) to generate a radical which is responsible for initiating the insertion of sulfur into the substrate. It is reasonable to surmise that the methyl group of SAM becomes the methyl group of the product, but this has not been shown, and the possibility of a separate methylase exists. This equivalog is a member of a subfamily (TIGR00089) which contains several other hypothetical equivalogs which are all probably enzymes with similar function acting on different substrates. These enzymes contain a TRAM domain (pfam01938) which is believed to be responsible for binding to tRNAs. Hits to this model span all major groups of bacteria and eukaryotes, but not archaea, which are known to lack this particular tRNA modification. The enzyme from Thermotoga maritima has been cloned, expressed, spectroscopically characterized and shown to complement the E. coli MiaB enzyme. [Protein synthesis, tRNA and rRNA base modification] 438
18051 273701 TIGR01575 rimI ribosomal-protein-alanine acetyltransferase. Members of this model belong to the GCN5-related N-acetyltransferase (GNAT) superfamily. This model covers prokaryotes and the archaea. The seed contains a characterized accession for Gram negative E. coli. An untraceable characterized accession (PIR|S66013) for Gram positive B. subtilis scores well (205.0) in the full alignment. Characterized members are lacking in the archaea. Noise cutoff (72.4) was set to exclude M. loti paralog of rimI. Trusted cutoff (80.0) was set at next highest scoring member in the mini-database. [Protein synthesis, Ribosomal proteins: synthesis and modification] 131
18052 273702 TIGR01577 oligosac_amyl oligosaccharide amylase. The name of this type of amylase is based on the characterization of a glucoamylase family enzyme from Thermoactinomyces vulgaris. The T. vulgaris enzyme was expressed in E. coli and, like other glucoamylases, it releases beta-D-glucose from starch. However, unlike previously characterized glucoamylases, this T. vulgaris amylase hydrolyzes maltooligosaccharides (maltotetraose, maltose) more efficiently than starch (1), indicating this enzyme belongs to a class of glucoamylase-type enzymes with oligosaccharide-metabolizing activity. 616
18053 273703 TIGR01578 MiaB-like-B MiaB-like tRNA modifying enzyme, archaeal-type. This clade of sequences is closely related to MiaB, a modifier of isopentenylated adenosine-37 of certain eukaryotic and bacterial tRNAs (see TIGR01574). Sequence alignments suggest that this equivalog performs the same chemical transformation as MiaB, perhaps on a different (or differently modified) tRNA base substrate. This clade is a member of a subfamily (TIGR00089) and spans the archaea and eukaryotes. The only archaeal miaB-like genes are in this clade, while eukaryotes have sequences described by this model as well as ones falling within the scope of the MiaB equivalog model. [Protein synthesis, tRNA and rRNA base modification] 420
18054 273704 TIGR01579 MiaB-like-C MiaB-like tRNA modifying enzyme. This clade of sequences is closely related to MiaB, a modifier of isopentenylated adenosine-37 of certain eukaryotic and bacterial tRNAs (see TIGR01574). Sequence alignments suggest that this equivalog performs the same chemical transformation as MiaB, perhaps on a different (or differently modified) tRNA base substrate. This clade is a member of a subfamily (TIGR00089) and spans low GC Gram positive bacteria, alpha and epsilon proteobacteria, Campylobacter, Porphyromonas, Aquifex, Thermotoga, Chlamydia, Treponema and Fusobacterium. [Protein synthesis, tRNA and rRNA base modification] 414
18055 162434 TIGR01580 narG respiratory nitrate reductase, alpha subunit. The Nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has an alpha, beta and 2 gamma subunits. The alpha and beta subunits have catalytic activity, and the gamma subunits, which are b-type cytochromes, attach the enzyme to the membrane, receive electrons from the quinone pool and transfer them to the beta subunit. This model is specific for the alpha subunit for nitrate reductase I (narG) and nitrate reductase II (narZ) for gram positive and gram negative bacteria. A few thermophiles and archaea also match the model. The seed members used to make the model include Nitrate reductases from Pseudomonas fluorescens (GP:11344601), E. coli (SP:P09152) and B. subtilis (SP:P42175). All seed members are experimentally characterized. Some unpublished nitrate reductases, which are shorter sequences and probably fragments, fall in between the noise and trusted cutoffs. Pfam models pfam00384 (Molybdopterin oxidoreductase) and pfam01568 (Molybdopterin dinucleotide binding domain) will also match the nitrate reductase, alpha subunit. [Energy metabolism, Anaerobic] 1235
18056 130643 TIGR01581 Mo_ABC_porter NifC-like ABC-type porter. This model describes a clade of ABC porter genes with relatively weak homology compared to its neighbor clades, the molybdate (TIGR02141) and sulfate (TIGR00969) porters. Neighbor-Joining, PAM-distance phylogenetic trees support the separation of the clades in this way. Included in this group is a gene designated NifC in Clostridium pasteurianum. It would be reasonable to presume that NifC acts as a molybdate porter since the most common form of nitrogenase is a molybdoenzyme. Several other sequences falling within the scope of this model are annotated as molybdate porters and one, from Halobacterium, is annotated as a sulfate porter. There is presently no experimental evidence to support annotations with this degree of specificity. 225
18057 273705 TIGR01582 FDH-beta formate dehydrogenase, beta subunit, Fe-S containing. This model represents the beta subunit of the gamma-proteobacterial formate dehydrogenase. This subunit contains four 4Fe-4S clusters and is involved in transmitting electrons from the alpha subunit (TIGR01553) at the periplasmic space to the gamma subunit which spans the cytoplasmic membrane. In addition to the gamma proteobacteria, a sequence from Aquifex aeolicus falls within the scope of this model. This appears to be the case for the alpha, gamma and epsilon (accessory protein TIGR01562) chains as well. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 283
18058 130645 TIGR01583 formate-DH-gamm formate dehydrogenase, gamma subunit. This model represents the gamma chain of the gamma proteobacteria (and Aquifex aeolicus) formate dehydrogenase. This subunit is integral to the cytoplasmic membrane, consisting of 4 transmembrane helices, and receives electrons from the beta subunit. The entire E. coli formate dehydrogenase N (nitrate-inducible form) has been crystallized. The gamma subunit contains two cytochromes, heme b(P) and heme b(C) near the periplasmic and cytoplasmic sides of the membrane respectively. The electron acceptor quinone binds at the cytoplasmic heme histidine ligand. NiFe-hydrogenase and thiosulfate reductase contain homologous gamma subunits, and these can be found scoring in the noise of this model. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 204
18059 130646 TIGR01584 citF citrate lyase, alpha subunit. This is a model of the alpha subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The alpha subunit catalyzes the reaction Acetyl-CoA + citrate = acetate + (3S)-citryl-CoA. The seed contains an experimentally characterized member from Lactococcus lactis subsp. lactis. The model covers both Gram positive and Gram negative bacteria. It is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation] 492
18060 273706 TIGR01586 yopT_cys_prot cysteine protease domain, YopT-type. The model represents a cysteine protease domain found in proteins of bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs, and additional members may exist that are detected by this model. YopT, a virulence effector protein of Yersinia pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors. [Cellular processes, Pathogenesis] 196
18061 273707 TIGR01587 cas3_core CRISPR-associated helicase Cas3. This model represents the highly conserved core region of an alignment of Cas3, a protein found in association with CRISPR repeat elements in a broad range of bacteria and archaea. Cas3 appears to be a helicase, with regions found by pfam00270 (DEAD/DEAH box helicase) and pfam00271 (Helicase conserved C-terminal domain). Some but not all members have an N-terminal HD domain region (pfam01966) that is not included within this model. 359
18062 130649 TIGR01588 citE citrate lyase, beta subunit. This is a model of the beta subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The beta subunit catalyzes the reaction (3S)-citryl-CoA = acetyl-CoA + oxaloacetate. The seed contains an experimentally characterized member from Leuconostoc mesenteroides. The model covers a wide range of Gram positive bacteria. For Gram negative bacteria, it appears that only gamma proteobacteria hit this model. The model is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in-between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation] 288
18063 130650 TIGR01589 A_thal_3526 uncharacterized plant-specific domain TIGR01589. This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa. 57
18064 130651 TIGR01590 yir-bir-cir_Pla yir/bir/cir-family of variant antigens, Plasmodium-specific. This model represents a large paralogous family of variant antigens from several Plasmodium species (P. yoelii, P. berghei and P. chabaudi). The seed was generated from a list of ORFs in P. yoelii containing a paralogous domain as defined by an algorithm implemented at TIFR. The list was aligned and reduced to six sequences approximating the most divergent clades present in the data set. The model only hits genes previously characterized as yir, bir, or cir genes above the trusted cutoff. In between trusted and noise is one gene from P. vivax (vir25) which has been characterized as a distant relative of the yir/bir/cir family. The vir family appears to be present in 600-1000 copies per haploid genome and is preferentially located in the sub-telomeric regions of the chromosomes. The genomic data for yoelii is consistent with this observation. It is not believed that there are any orthologs of this family in P. falciparum. 199
18065 130652 TIGR01591 Fdh-alpha formate dehydrogenase, alpha subunit, archaeal-type. This model describes a subset of formate dehydrogenase alpha chains found mainly in archaea but also in alpha and gamma proteobacteria and a small number of gram positive bacteria. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The enzyme's purpose is to allow growth on formate in some circumstances and, in the case of FdhH in gamma proteobacteria, to pass electrons to hydrogenase (by which process acid is neutralized). This model is well-defined, with only a single fragmentary sequence falling between trusted and noise. The alpha subunit of a version of nitrate reductase is closely related. 671
18066 130653 TIGR01592 holin_SPP1 holin, SPP1 family. This model represents one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others. [Mobile and extrachromosomal element functions, Prophage functions] 75
18067 273708 TIGR01593 holin_tox_secr toxin secretion/phage lysis holin. This model describes one of the many mutually dissimilar families of holins, phage proteins that act together with lytic enzymes in bacterial lysis. This family includes, besides phage holins, the protein TcdE/UtxA involved in toxin secretion in Clostridium difficile and related species. [Protein fate, Protein and peptide secretion and trafficking, Mobile and extrachromosomal element functions, Prophage functions] 128
18068 273709 TIGR01594 holin_lambda phage holin, lambda family. This model represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. [Mobile and extrachromosomal element functions, Prophage functions] 107
18069 273710 TIGR01595 cas_CT1132 CRISPR-associated protein, CT1132 family. This protein is found in at least five widely species that contain CRISPR loci. Four cas (CRISPR-associated) proteins that are widely distributed and found near the CRISPR repeats. This protein is found exclusively next to other cas proteins. Its function is unknown. 281
18070 273711 TIGR01596 cas3_HD CRISPR-associated endonuclease Cas3-HD. CRISPR/Cas systems are widespread, mobile systems for host defense against invasive elements such as phage. In these systems, Cas3 designates one of the core proteins shared widely by multiple types of CRISPR/Cas system. This model represents an HD-like endonuclease that occurs either separately or as the N-terminal region of Cas3, the helicase-containing CRISPR-associated protein. 176
18071 130658 TIGR01597 PYST-B Plasmodium yoelii subtelomeric family PYST-B. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. 255
18072 130659 TIGR01598 holin_phiLC3 holin, phage phi LC3 family. Phage proteins for bacterial lysis typically include a membrane-disrupting protein, or holin, and one or more cell wall degrading enzymes that reach the cell wall because of holin action. Holins are found in a large number of mutually non-homologous families. [Mobile and extrachromosomal element functions, Prophage functions] 78
18073 273712 TIGR01599 PYST-A Plasmodium yoelii subtelomeric family PYST-A. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. Members of this family are expressed in both the Sporozoite and Gametozoite life stages. A single high-scoring gene was identified in the complete genome of P. falciparum as well as a single gene from P. chabaudi from GenBank which were included in the seed. There are no obvious homologs to these genes in any non-Plasmodium organism. These observations suggest an expansion of this family in yoelii from a common Plasmodium ancestor gene (present in a single copy in falciparum). 208
18074 130661 TIGR01600 phage_tail_L lambda-like phage minor tail protein L. This model detects members of the family of phage lambda minor tail protein L. This model was built as a fragment model to allow detection of fragmentary sequences, as might be found in cryptic prophage regions. [Mobile and extrachromosomal element functions, Prophage functions] 225
18075 213640 TIGR01601 PYST-C1 Plasmodium yoelii subtelomeric domain PYST-C1. This model represents the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604). 82
18076 130663 TIGR01602 PY-rept-46 Plasmodium yoelii repeat of length 46. This repeat is found in only 2 genes in Plasmodium yoelii, in each of these genes it is repeated 9 times. It is found in no other organism. 46
18077 273713 TIGR01603 maj_tail_phi13 phage major tail protein, phi13 family. This model describes a set of proteins that share low levels of sequence similarity but similar lengths and similar patterns of charged, hydrophobic, and Gly/Pro residues. All members (except one attributed to mouse embryo cDNA) belong to phage of Gram-positive bacteria. Several are identified as phage major tail proteins. Some members of this family have additional C-terminal regions of about 100 residues not included in this model. [Mobile and extrachromosomal element functions, Prophage functions] 190
18078 130665 TIGR01604 PYST-C2 Plasmodium yoelii subtelomeric domain PYST-C2. This model represents a domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The genes found by this model often are associated with an N-terminal yoelii-specific domain such as PYST-C1 (TIGR01601). 150
18079 130666 TIGR01605 PYST-D Plasmodium yoelii subtelomeric family PYST-D. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. These genes are generally very short (ca. 50 residues). There are no obvious homologs to these genes in any other organism. 55
18080 200119 TIGR01606 holin_BlyA holin, BlyA family. This family represents BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage. This protein was previously proposed to be a hemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C-terminus. [Mobile and extrachromosomal element functions, Prophage functions] 63
18081 162444 TIGR01607 PST-A Plasmodium subtelomeric family (PST-A). This model represents a paralogous family of genes in Plasmodium falciparum and Plasmodium yoelii, which are closely related to various phospholipases and lysophospholipases of plants as well as generally being related to the alpha/beta-fold superfamily of hydrolases. These genes are preferentially located in the subtelomeric regions of the chromosomes of both P. falciparum and P. yoelii. 332
18082 130669 TIGR01608 citD citrate lyase acyl carrier protein. This is a model of the acyl carrier protein (aka gamma subunit) of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The acyl carrier protein covalently binds the coenzyme of citrate lyase. The seed contains an experimentally characterized member from Leuconostoc mesenteroides. The model covers a wide range of Gram positive bacteria. For Gram negative bacteria, it appears that only gamma proteobacteria hit this model. The model is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in-between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation] 92
18083 273714 TIGR01609 PF_unchar_267 Plasmodium falciparum uncharacterized protein TIGR01609. This model represents a family of at least four proteins in Plasmodium falciparum. An interesting feature is five perfectly conserved Trp residues. 146
18084 273715 TIGR01610 phage_O_Nterm phage replication protein O, N-terminal domain. This model represents the N-terminal region of the phage lambda replication protein O and homologous regions of other phage proteins. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 95
18085 130672 TIGR01611 tail_tube phage contractile tail tube protein, P2 family. The tails of some phage are contractile. This model represents the tail tube, or tail core, protein of the contractile tail of phage P2, and homologous proteins from additional phage. [Mobile and extrachromosomal element functions, Prophage functions] 168
18086 130673 TIGR01612 235kDa-fam reticulocyte binding/rhoptry protein. This model represents a group of paralogous families in plasmodium species alternately annotated as reticulocyte binding protein, 235-kDa family protein and rhoptry protein. Rhoptry protein is localized on the cell surface and is extremely large (although apparently lacking in repeat structure) and is important for the process of invasion of the RBCs by the parasite. These proteins are found in P. falciparum, P. vivax and P. yoelii. 2757
18087 273716 TIGR01613 primase_Cterm phage/plasmid primase, P4 family, C-terminal domain. This model represents a clade within a larger family of proteins from viruses of bacteria and animals. Members of this family are found in phage and plasmids of bacteria and archaea only. The model describes a domain of about 300 residues, found generally toward the protein C-terminus. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 304
18088 273717 TIGR01614 PME_inhib pectinesterase inhibitor domain. This model describes a plant domain of about 200 amino acids, characterized by four conserved Cys residues, shown in a pectinesterase inhibitor from Kiwi to form two disulfide bonds: first to second and third to fourth. Roughly half the members of this family have the region described by this model followed immediately by a pectinesterase domain, pfam01095. This suggests that the pairing of the enzymatic domain and its inhibitor reflects a conserved regulatory mechanism for this enzyme family. 178
18089 273718 TIGR01615 A_thal_3542 uncharacterized plant-specific domain TIGR01615. This model represents a domain found toward the C-terminus of a number of uncharacterized plant proteins. The domain is strongly conserved (greater than 30 % sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence. 131
18090 273719 TIGR01616 nitro_assoc nitrogenase-associated protein. This model describes a small family of uncharacterized proteins found so far in alpha and gamma proteobacteria and in Nostoc sp. PCC 7120, a cyanobacterium. The gene for this protein is associated with nitrogenase genes. This family shows sequence similarity to TIGR00014, a glutaredoxin-dependent arsenate reductase that converts arsenate to arsenite for disposal. This family is one of several included in pfam03960. [Unknown function, General] 126
18091 273720 TIGR01617 arsC_related transcriptional regulator, Spx/MgsR family. This model represents a portion of the proteins within the larger set covered by pfam03960. That larger family includes a glutaredoxin-dependent arsenate reductase (TIGR00014). Characterized members of this family include Spx and MgsR from Bacillus subtilis. Spx is a global regulator for response to thiol-specific oxidative stress. It interacts with RNA polymerase. MgsR (modulator of the general stress response, also called YqgZ) provides a second level of regulation for more than a third of the proteins in the B. subtilis general stress regulon controlled by Sigma-B. [Regulatory functions, DNA interactions] 117
18092 130679 TIGR01618 phage_P_loop phage nucleotide-binding protein. This model represents an uncharacterized family of proteins from a number of phage of Gram-positive bacteria. This protein contains a P-loop motif, G/A-X-X-G-X-G-K-T near its amino end. The function of this protein is unknown. [Mobile and extrachromosomal element functions, Prophage functions] 220
18093 130680 TIGR01619 hyp_HI0040 TIGR01619 family protein. This model represents a hypothetical equivalog of gamma proteobacteria, includes HI0040. These sequences do not have any similarity to known proteins by PSI-BLAST. 249
18094 130681 TIGR01620 hyp_HI0043 TIGR01620 family protein. This model includes putative membrane proteins from alpha and gamma proteobacteria, each making up their own clade. The two clades have less than 25% identity between them. We could not find support for the assignment to the sequence from Brucella (OMNI|NTL01BM0951) of being a GTP-binding protein. 289
18095 130682 TIGR01621 RluA-like pseudouridine synthase Rlu family protein, TIGR01621. This model represents a clade of sequences within the pseudouridine synthase superfamily (pfam00849). The superfamily includes E. coli proteins: RluA, RluB, RluC, RluD, and RsuA. The sequences modeled here are most closely related to RluA. Neisseria, among those species hitting this model, does not appear to have an RluA homolog. It is presumed that these sequences function as pseudouridine synthases, although perhaps with different specificity. [Protein synthesis, tRNA and rRNA base modification] 217
18096 273721 TIGR01622 SF-CC1 splicing factor, CC1-like family. This model represents a subfamily of RNA splicing factors including the Pad-1 protein (N. crassa), CAPER (M. musculus) and CC1.3 (H.sapiens). These proteins are characterized by an N-terminal arginine-rich, low complexity domain followed by three (or in the case of 4 H. sapiens paralogs, two) RNA recognition domains (rrm: pfam00706). These splicing factors are closely related to the U2AF splicing factor family (TIGR01642). A homologous gene from Plasmodium falciparum was identified in the course of the analysis of that genome at TIGR and was included in the seed. 494
18097 130684 TIGR01623 put_zinc_LRP1 putative zinc finger domain, LRP1 type. This model represents a putative zinc finger domain found in plants. Arabidopsis thaliana has at least 10 distinct members. Proteins containing this domain, including LRP1, generally share the same size, about 300 amino acids, and architecture. This 43-residue domain, and a more C-terminal companion domain of similar size, appear as tightly conserved islands of sequence similarity. The remainder consists largely of low-complexity sequence. Several animal proteins have regions with matching patterns of Cys, Gly, and His residues. These are not included in the model but score between trusted and noise cutoffs. 43
18098 273722 TIGR01624 LRP1_Cterm LRP1 C-terminal domain. This model represents a tightly conserved small domain found in LRP1 and related plant proteins. This family also contains a well-conserved putative zinc finger domain (TIGR01623). The rest of the sequence of most members consists of highly divergent, low-complexity sequence. 50
18099 130686 TIGR01625 YidE_YbjL_dupl AspT/YidE/YbjL antiporter duplication domain. This model represents a domain that is duplicated in the aspartate-alanine antiporter AspT, as well as HI0035 of Haemophilus influenzae, YidE and YbjL of E. coli, and a number of other known or putative transporters. Member proteins may have 0, 1, or 2 copies of TrkA potassium uptake domain pfam02080 between the duplications. The domain contains several apparent transmembrane regions and is proposed here to act in transport. [Transport and binding proteins, Unknown substrate] 154
18100 130687 TIGR01626 ytfJ_HI0045 conserved hypothetical protein YtfJ-family, TIGR01626. This model represents sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ. 184
18101 130688 TIGR01627 A_thal_3515 uncharacterized plant-specific domain TIGR01627. This model represents an uncharacterized domain found in both Arabidopsis thaliana (at least 10 copies) and Oryza sativa. Most member proteins have only a short stretch of sequence N-terminal to this domain, but one has a long N-terminal extension that includes a protein kinase domain (pfam00069). 225
18102 130689 TIGR01628 PABP-1234 polyadenylate binding protein, human types 1, 2, 3, 4 family. These eukaryotic proteins recognize the poly-A of mRNA and consists of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNA's from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens which are expressed in testis, platelets, broadly expressed and of unknown tissue range. 562
18103 273723 TIGR01629 rep_II_X phage/plasmid replication protein, gene II/X family. This model represents a family of phage and plasmid replication proteins. In bacteriophage IKe and related phage, the full-length protein is designated gene II protein. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein. Members of this family also include plasmid replication proteins. This model is built as a fragment model to better detect translations from alternate initiators and other fragments relative to full length gene II protein. [Mobile and extrachromosomal element functions, Prophage functions, Mobile and extrachromosomal element functions, Plasmid functions] 345
18104 130691 TIGR01630 psiM2_ORF9 phage uncharacterized protein (putative large terminase), C-terminal domain. This model represents the C-terminal region of a set of phage proteins typically about 400-500 amino acids in length, although some members are considerably shorter. An article on Methanobacterium phage Psi-M2 calls the member from that phage, ORF9, a putative large terminase subunit, and ORF8 a candidate terminase small subunit. Most proteins in this family have an apparent P-loop nucleotide-binding sequence toward the N-terminus. [Mobile and extrachromosomal element functions, Prophage functions] 142
18105 273724 TIGR01631 Trypano_RHS trypanosome RHS (retrotransposon hot spot) family. This model describes full-length and part-length members of the RHS (retrotransposon hot spot) family in Trypanosoma brucei and Trypanosoma cruzi. Members of this family are frequently interrupted by non-LTR retrotransposons inserted at exactly the same relative position. 760
18106 233500 TIGR01632 L11_bact 50S ribosomal protein uL11, bacterial form. This model represents bacterial, chloroplast, and most mitochondrial forms of 50S ribosomal protein L11. [Protein synthesis, Ribosomal proteins: synthesis and modification] 140
18107 188159 TIGR01633 phi3626_gp14_N putative phage tail component, N-terminal domain. This model represents the best-conserved region of about 125 amino acids, toward the N-terminus, of a family of proteins from temperate phage of a number of Gram-positive bacteria. These phage proteins range in length from 230 to 525 amino acids. [Mobile and extrachromosomal element functions, Prophage functions] 124
18108 130695 TIGR01634 tail_P2_I phage tail protein, P2 protein I family. This model represents the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria. This model is built as a fragment model and identifies some phage tail proteins with strong but local similarity to members of the seed alignment. [Mobile and extrachromosomal element functions, Prophage functions] 139
18109 130696 TIGR01635 tail_comp_S phage virion morphogenesis (putative tail completion) protein. This model describes protein S of phage P2, suggested experimentally to act in tail completion and stable head joining, and related proteins from a number of phage. [Mobile and extrachromosomal element functions, Prophage functions] 144
18110 130697 TIGR01636 phage_rinA phage transcriptional activator, RinA family. This model represents a family of phage proteins, including RinA, a transcriptional activator in staphylococcal phage phi 11. This family shows similarity to ArpU, a phage-related putative autolysin regulator, and to some sporulation-specific sigma factors. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions] 134
18111 273725 TIGR01637 phage_arpU phage transcriptional regulator, ArpU family. This model represents a family of phage proteins, including ArpU, called a putative autolysin regulatory protein. ArpU was described as a regulator of cellular muramidase-2 of Enterococcus hirae but appears to have been cloned from a prophage. This family appears related to the RinA family of bacteriophage transcriptional activators and to some sporulation-specific sigma factors. We propose that this is a phage transcriptional activator family. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions] 132
18112 130699 TIGR01638 Atha_cystat_rel Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam00031. Cystatins may help plants resist attack by insects. 92
18113 130700 TIGR01639 P_fal_TIGR01639 Plasmodium falciparum uncharacterized domain TIGR01639. This model represents a conserved sequence region of about 60 amino acids found in over 40 predicted proteins of Plasmodium falciparum. It is not found elsewhere, including closely related species such as Plasmodium yoelii. No member of this family is characterized. 61
18114 273726 TIGR01640 F_box_assoc_1 F-box protein interaction domain. This model describes a large family of plant domains, with several hundred members in Arabidopsis thaliana. Most examples are found C-terminal to an F-box (pfam00646), a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain. 230
18115 213641 TIGR01641 phageSPP1_gp7 phage putative head morphogenesis protein, SPP1 gp7 family. This model describes a region of about 110 amino acids found exclusively in phage-related proteins, internally or toward the C-terminus. One member, gp7 of phage SPP1, appears involved in head morphogenesis. [Mobile and extrachromosomal element functions, Prophage functions] 108
18116 273727 TIGR01642 U2AF_lg U2 snRNP auxiliary factor, large subunit, splicing factor. These splicing factors consist of an N-terminal arginine-rich low complexity domain followed by three tandem RNA recognition motifs (pfam00076). The well-characterized members of this family are auxiliary components of the U2 small nuclear ribonucleoprotein splicing factor (U2AF). These proteins are closely related to the CC1-like subfamily of splicing factors (TIGR01622). Members of this subfamily are found in plants, metazoa and fungi. 509
18117 273728 TIGR01643 YD_repeat_2x YD repeat (two copies). This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin. 42
18118 273729 TIGR01644 phage_P2_V phage baseplate assembly protein V. This model describes a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. [Mobile and extrachromosomal element functions, Prophage functions] 190
18119 130706 TIGR01645 half-pint poly-U binding splicing factor, half-pint family. The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA. 612
18120 273730 TIGR01646 vgr_GE Rhs element Vgr protein. This model represents the Vgr family of proteins, associated with some classes of Rhs elements. This model does not include a large octapeptide repeat region, VGXXXXXX, found in the Vgr of Rhs classes G and E. 483
18121 273731 TIGR01647 ATPase-IIIA_H plasma-membrane proton-efflux P-type ATPase. This model describes the plasma membrane proton efflux P-type ATPase found in plants, fungi, protozoa, slime molds and archaea. The best studied representative is from yeast. 754
18122 273732 TIGR01648 hnRNP-R-Q heterogeneous nuclear ribonucleoprotein R, Q family. Sequences in this subfamily include the human heterogeneous nuclear ribonucleoproteins (hnRNP) R, Q, and APOBEC-1 complementation factor (aka APOBEC-1 stimulating protein). These proteins contain three RNA recognition domains (rrm: pfam00076) and a somewhat variable C-terminal domain. 578
18123 273733 TIGR01649 hnRNP-L_PTB hnRNP-L/PTB/hephaestus splicing factor family. Included in this family of heterogeneous ribonucleoproteins are PTB (polypyrimidine tract binding protein) and hnRNP-L. These proteins contain four RNA recognition motifs (rrm: pfam00067). 481
18124 130711 TIGR01650 PD_CobS cobaltochelatase, CobS subunit. This model describes Pseudomonas denitrificans CobS gene product, which is a cobalt chelatase subunit that functions in cobalamin biosynthesis. Cobalamin (vitamin B12) can be synthesized via several pathways, including an aerobic pathway (found in Pseudomonas denitrificans) and an anaerobic pathway (found in P. shermanii and Salmonella typhimurium). These pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Confusion regarding the functions of enzymes found in the aerobic vs. anaerobic pathways has arisen because nonhomologous genes in these different pathways were given the same gene symbols. Thus, cobS in the aerobic pathway (P. denitrificans) is not a homolog of cobS in the anaerobic pathway (S. typhimurium). It should be noted that E. coli synthesizes cobalamin only when it is supplied with the precursor cobinamide, which is a complex intermediate. Additionally, all E. coli cobalamin synthesis genes (cobU, cobS and cobT) were named after their Salmonella typhimurium homologs which function in the anaerobic cobalamin synthesis pathway. This model describes the aerobic cobalamin pathway Pseudomonas denitrificans CobS gene product, which is a cobalt chelatase subunit, with a MW ~37 kDa. The aerobic pathway cobalt chelatase is a heterotrimeric, ATP-dependent enzyme that catalyzes cobalt insertion during cobalamin biosynthesis. The other two subunits are the P. denitrificans CobT (TIGR01651) and CobN (pfam02514 CobN/Magnesium Chelatase) proteins. To avoid potential confusion with the nonhomologous Salmonella typhimurium/E.coli cobS gene product, the P. denitrificans gene symbol is not used in the name of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 327
18125 130712 TIGR01651 CobT cobaltochelatase, CobT subunit. This model describes Pseudomonas denitrificans CobT gene product, which is a cobalt chelatase subunit that functions in cobalamin biosynthesis. Cobalamin (vitamin B12) can be synthesized via several pathways, including an aerobic pathway (found in Pseudomonas denitrificans) and an anaerobic pathway (found in P. shermanii and Salmonella typhimurium). These pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Confusion regarding the functions of enzymes found in the aerobic vs. anaerobic pathways has arisen because nonhomologous genes in these different pathways were given the same gene symbols. Thus, cobT in the aerobic pathway (P. denitrificans) is not a homolog of cobT in the anaerobic pathway (S. typhimurium). It should be noted that E. coli synthesizes cobalamin only when it is supplied with the precursor cobinamide, which is a complex intermediate. Additionally, all E. coli cobalamin synthesis genes (cobU, cobS and cobT) were named after their Salmonella typhimurium homologs which function in the anaerobic cobalamin synthesis pathway. This model describes the aerobic cobalamin pathway Pseudomonas denitrificans CobT gene product, which is a cobalt chelatase subunit, with a MW ~70 kDa. The aerobic pathway cobalt chelatase is a heterotrimeric, ATP-dependent enzyme that catalyzes cobalt insertion during cobalamin biosynthesis. The other two subunits are the P. denitrificans CobS (TIGR01650) and CobN (pfam02514 CobN/Magnesium Chelatase) proteins. To avoid potential confusion with the nonhomologous Salmonella typhimurium/E.coli cobT gene product, the P. denitrificans gene symbol is not used in the name of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 600
18126 273734 TIGR01652 ATPase-Plipid phospholipid-translocating P-type ATPase, flippase. This model describes the P-type ATPase responsible for transporting phospholipids from one leaflet of bilayer membranes to the other. These ATPases are found only in eukaryotes. 1057
18127 273735 TIGR01653 lactococcin_972 bacteriocin, lactococcin 972 family. This model represents bacteriocins related to lactococcin 972. Members tend to be found in association with a seven transmembrane putative immunity protein. [Cellular processes, Toxin production and resistance] 92
18128 273736 TIGR01654 bact_immun_7tm bacteriocin-associated integral membrane (putative immunity) protein. This model represents a family of integral membrane proteins, most of which are about 650 residues in size and predicted to span the membrane seven times. Nearly half of the members of this family are found in association with a member of the lactococcin 972 family of bacteriocins (TIGR01653). Others may be associated with uncharacterized proteins that may also act as bacteriocins. Although this protein is suggested to be an immunity protein, and the bacteriocin is suggested to be exported by a Sec-dependent process, the role of this protein is unclear. [Cellular processes, Toxin production and resistance] 679
18129 130716 TIGR01655 yxeA_fam conserved hypothetical protein TIGR01655. This model represents a family of small (about 115 amino acids) uncharacterized proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members. [Hypothetical proteins, Conserved] 114
18130 273737 TIGR01656 Histidinol-ppas histidinol-phosphate phosphatase family domain. This domain is found in authentic histidinol-phosphate phosphatases which are sometimes found as stand-alone entities and sometimes as fusions with imidazoleglycerol-phosphate dehydratase (TIGR01261). Additionally, a family of proteins including YaeD from E. coli (TIGR00213) and various other proteins are closely related but may not have the same substrate specificity. This domain is a member of the haloacid-dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. This superfamily is distinguished by the presence of three motifs: an N-terminal motif containing the nucleophilic aspartate, a central motif containing a conserved serine or threonine, and a C-terminal motif containing a conserved lysine (or arginine) and conserved aspartates. More specifically, the domain modelled here is a member of subfamily III of the HAD-superfamily by virtue of lacking a "capping" domain in either of the two common positions, between motifs 1 and 2, or between motifs 2 and 3. 147
18131 273738 TIGR01657 P-ATPase-V P-type ATPase of unknown pump specificity (type V). These P-type ATPases form a distinct clade but the substrate of their pumping activity has yet to be determined. This clade has been designated type V in. 1054
18132 273739 TIGR01658 EYA-cons_domain eyes absent protein conserved domain. This domain is common to all eyes absent (EYA) homologs. Metazoan EYA's also contain a variable N-terminal domain consisting largely of low-complexity sequences. 274
18133 273740 TIGR01659 sex-lethal sex-lethal family splicing factor. This model describes the sex-lethal family of splicing factors found in Dipteran insects. The sex-lethal phenotype, however, may be limited to the Melanogasters and closely related species. In Drosophila the protein acts as an inhibitor of splicing. This subfamily is most closely related to the ELAV/HUD subfamily of splicing factors (TIGR01661). 346
18134 211677 TIGR01660 narH nitrate reductase, beta subunit. The Nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has an alpha, beta and 2 gamma subunits. The alpha and beta subunits have catalytic activity, and the gamma subunits attach the enzyme to the membrane and are b-type cytochromes that receive electrons from the quinone pool and transfer them to the beta subunit. This model is specific for the beta subunit for nitrate reductase I (narH) and nitrate reductase II (narY) for gram positive and gram negative bacteria. A few thermophiles and archaea also match the model. The seed members used in this model are all experimentally characterized and include the following: SP:P11349 and SP:P19318, both E. coli (NarH and NarY respectively), SP:P42176 from B. subtilis, GP:11344602 from Pseudomonas fluorescens, GP:541762 from Paracoccus denitrificans, and GP:18413622 from Halomonas halodenitrificans. This model also matches Pfam pfam00037 for 4Fe-4S binding domain. [Energy metabolism, Anaerobic] 492
18135 273741 TIGR01661 ELAV_HUD_SF ELAV/HuD family splicing factor. This model describes the ELAV/HuD subfamily of splicing factors found in metazoa. HuD stands for the human paraneoplastic encephalomyelitis antigen D of which there are 4 variants in human. ELAV stands for the Drosophila Embryonic lethal abnormal visual protein. ELAV-like splicing factors are also known in human as HuB (ELAV-like protein 2), HuC (ELAV-like protein 3, Paraneoplastic cerebellar degeneration-associated antigen) and HuR (ELAV-like protein 1). These genes are most closely related to the sex-lethal subfamily of splicing factors found in Dipteran insects (TIGR01659). These proteins contain 3 RNA-recognition motifs (rrm: pfam00076). 352
18136 273742 TIGR01662 HAD-SF-IIIA HAD-superfamily hydrolase, subfamily IIIA. This subfamily falls within the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The Class III subfamilies are characterized by the lack of any domains located either between the first and second conserved catalytic motifs (as in the Class I subfamilies, TIGR01493, TIGR01509, TIGR01488 and TIGR01494) or between the second and third conserved catalytic motifs (as in the Class II subfamilies, TIGR01460 and TIGR01484) of the superfamily domain. The IIIA subfamily contains five major clades: histidinol-phosphatase (TIGR01261) and histidinol-phosphatase-related protein (TIGR00213) which together form a subfamily (TIGR01656), DNA 3'-phosphatase (TIGR01663, TIGR01664), YqeG (TIGR01668) and YrbI (TIGR01670). In the case of histidinol phosphatase and PNK-3'-phosphatase, this model represents a domain of a bifunctional system. In the histidinol phosphatase HisB, a C-terminal domain is an imidazoleglycerol-phosphate dehydratase which catalyzes a related step in histidine biosynthesis. In PNK-3'-phosphatase, N- and C-terminal domains constitute the polynucleotide kinase and DNA-binding components of the enzyme. [Unknown function, Enzymes of unknown specificity] 135
18137 130724 TIGR01663 PNK-3'Pase polynucleotide 5'-kinase 3'-phosphatase. This model represents the metazoan 5'-polynucleotide-kinase-3'-phosphatase, PNKP, which is believed to be involved in repair of oxidative DNA damage. Removal of 3' phosphates is essential for the further processing of the break by DNA polymerases. The central phosphatase domain is a member of the IIIA subfamily (TIGR01662) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. As is common in this superfamily, the enzyme is magnesium dependent. A difference between this enzyme and other HAD-superfamily phosphatases is in the third conserved catalytic motif which usually contains two conserved aspartate residues believed to be involved in binding the magnesium ion. Here, the second aspartate is replaced by a conserved arginine residue which may indicate an interaction with the phosphate backbone of the substrate. Very close relatives of this domain are also found separate from the N- and C-terminal domains seen here, as in the 3'-phosphatase found in plants. The larger family of these domains is described by TIGR01664. Outside of the phosphatase domain is a P-loop ATP-binding motif associated with the kinase activity. The entry for the mouse homolog appears to be missing a large piece of sequence corresponding to the first conserved catalytic motif of the phosphatase domain as well as the conserved threonine of the second motif. Either this is a sequencing artifact or this may represent a pseudo- or non-functional gene. Note that the EC number for the kinase function is: 2.7.1.78 526
18138 211680 TIGR01664 DNA-3'-Pase DNA 3'-phosphatase. This model represents a family of proteins and protein domains which catalyze the dephosphorylation of DNA 3'-phosphates. It is believed that this activity is important for the repair of single-strand breaks in DNA caused by radiation or oxidative damage. This domain is often (TIGR01663), but not always linked to a DNA 5'-kinase domain. The central phosphatase domain is a member of the IIIA subfamily (TIGR01662) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. As is common in this superfamily, the enzyme is magnesium dependent. A difference between this enzyme and other HAD-superfamily phosphatases is in the third conserved catalytic motif which usually contains two conserved aspartate residues believed to be involved in binding the magnesium ion. Here, the second aspartate is usually replaced by an arginine residue which may indicate an interaction with the phosphate backbone of the substrate. Alternatively, there is an additional conserved aspartate downstream of the usual site which may indicate a slightly different fold in this region. 166
18139 273743 TIGR01665 put_anti_recept phage minor structural protein, N-terminal region. This model represents the conserved N-terminal region, typically from about residue 25 to about residue 350, of a family of uncharacterized phage proteins 500 to 1700 residues in length. [Mobile and extrachromosomal element functions, Prophage functions] 317
18140 130727 TIGR01666 YCCS TIGR01666 family membrane protein. This model represents a clade of sequences from gamma and beta proteobacteria. These proteins are >700 amino acids long and many have been annotated as putative membrane proteins. The gene from Salmonella has been annotated as a putative efflux transporter. The gene from E. coli has the name yccS. [Cell envelope, Other] 704
18141 130728 TIGR01667 YCCS_YHJK integral membrane protein, YccS/YhfK family. This model represents two clades of putative transmembrane proteins including the E. coli YccS and YhfK proteins. The YccS hypothetical equivalog (TIGR01666) is found in beta and gamma proteobacteria, while the smaller YhfK group is only found in E. coli, Salmonella and Yersinia. TMHMM on the 19 hits to this model shows a consensus of 11 transmembrane helices separated into two clusters, an N-terminal cluster of 6 and a central cluster of 5. This would indicate two non-membrane domains, one on each side of the membrane. 701
18142 273744 TIGR01668 YqeG_hyp_ppase HAD superfamily (subfamily IIIA) phosphatase, TIGR01668. This family of hypothetical proteins is a member of the IIIA subfamily of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of this subfamily (TIGR01662) and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. This family consists of sequences from fungi, plants, cyanobacteria, gram-positive bacteria and Deinococcus. There is presently no characterization of any sequence in this family. 170
18143 273745 TIGR01669 phage_XkdX phage uncharacterized protein, XkdX family. This model represents a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes. [Mobile and extrachromosomal element functions, Prophage functions] 45
18144 130731 TIGR01670 KdsC-phosphatas 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase, YrbI family. This family of proteins is a member of the IIIA subfamily of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of this subfamily (TIGR01662) and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. One member of this family, the YrbI protein from H. influenzae has been cloned, expressed, purified and found to be an active 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase. Furthermore, its crystal structure has been determined. This family consists of sequences from beta, gamma and epsilon proteobacteria, Aquifex, Fusobacterium, Porphyromonas and Methanosarcina. The Methanosarcina sequence is distinctive in that it is linked to an N-terminal cytidylyltransferase domain (pfam02348) and is annotated as acylneuraminate cytidylyltransferase. This may give some clue as to the function of these phosphatases. Several eukaryotic sequences scoring between trusted and noise are also closely related to this function such as the CMP-N-acetylneuraminic acid synthetase from mouse, but in these cases the phosphatase domain is clearly inactive as many of the active site residues are not conserved. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 154
18145 273746 TIGR01671 phage_TIGR01671 phage uncharacterized protein TIGR01671. This model represents an uncharacterized, well-conserved family of proteins found in bacteriophage and prophage regions of Gram-positive bacteria. [Mobile and extrachromosomal element functions, Prophage functions, Hypothetical proteins, Conserved] 151
18146 273747 TIGR01672 AphA HAD superfamily (subfamily IIIB) phosphatase, TIGR01672. This family of proteins is a member of the IIIB subfamily (pfam03767) of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of subfamily III and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. The AphA gene from E. coli has been characterized and shown to be an active phosphatase enzyme. This family has been previously described as the "class B non-specific bacterial acid phosphatase" (B-NSAP) family, where it is noted that the enzyme is secreted and has a broad substrate range. The possibility exists, however, that the enzyme is specific for an as yet undefined substrate. Supporting evidence for the inclusion in the HAD superfamily, whose phosphatase members are magnesium dependent, is the inhibition by EDTA and calcium ions, and stimulation by magnesium ion. 237
18147 130734 TIGR01673 holin_LLH phage holin, LL-H family. This model represents a putative phage holin from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is small (about 100 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. [Mobile and extrachromosomal element functions, Prophage functions] 108
18148 273748 TIGR01674 phage_lambda_G phage minor tail protein G. This model describes a family of bacteriophage proteins including G of phage lambda. This protein has been described as undergoing a translational frameshift at a Gly-Lys dipeptide near the C-terminus of protein G from phage lambda, with about 4 % efficiency, to produce tail assembly protein G-T. The Lys of the Gly-Lys pair is the conserved second-to-last residue of seed alignment for this family. [Mobile and extrachromosomal element functions, Prophage functions] 138
18149 273749 TIGR01675 plant-AP plant acid phosphatase. This model represents a family of acid phosphatase from plants which are most closely related to the (so called) class B non-specific acid phosphatase OlpA (TIGR01533, which is believed to be a 5'-nucleotide phosphatase) and somewhat more distantly to another class B phosphatase, AphA (TIGR01672). Together these three clades define a subfamily (pfam03767) which corresponds to the IIIB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. It has been reported that the best substrate for this enzyme that could be found was purine 5'-nucleoside phosphates. This is in concordance with the assignment of the H. influenzae hel protein (from TIGR01533) as a 5'-nucleotidase, however there is presently no other evidence to support this specific function for these plant phosphatases. Many genes from this family have been annotated as vegetative storage proteins due to their close homology with these earlier-characterized gene products, which are highly expressed in leaves. There are significant differences however, including expression levels and distribution. The most important difference is the lack in authentic VSPs of the nucleophilic aspartate residue, which is instead replaced by serine, glycine or asparagine. Thus these proteins can not be expected to be active phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP. This model explicitly excludes the VSPs which lack the nucleophilic aspartate. The possibility exists, however, that some members of this family may, while containing all of the conserved HAD-superfamily catalytic residues, lack activity and have a function related to the function of the VSPs rather than the acid phosphatases. 228
18150 130737 TIGR01676 GLDHase galactonolactone dehydrogenase. This model represents L-Galactono-gamma-lactone dehydrogenase (EC 1.3.2.3). This enzyme catalyzes the final step in ascorbic acid biosynthesis in higher plants. This protein is homologous to ascorbic acid biosynthesis enzymes of other species: L-gulono-gamma-lactone oxidase in rat and L-galactono-gamma-lactone oxidase in yeast. All three covalently bind the cofactor FAD. 541
18151 273750 TIGR01677 pln_FAD_oxido plant-specific FAD-dependent oxidoreductase. This model represents an uncharacterized plant-specific family of FAD-dependent oxidoreductases. At least seven distinct members are found in Arabidopsis thaliana. The family shows considerable sequence similarity to three different enzymes of ascorbic acid biosynthesis: L-galactono-1,4-lactone dehydrogenase (EC 1.3.2.3) from higher plants, D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae, and L-gulonolactone oxidase (EC 1.1.3.8) from mouse, as well as to a bacterial sorbitol oxidase. The class of compound acted on by members of this family is unknown. 557
18152 273751 TIGR01678 FAD_lactone_ox sugar 1,4-lactone oxidases. This model represents a family of at least two different sugar 1,4 lactone oxidases, both involved in synthesizing ascorbic acid or a derivative. These include L-gulonolactone oxidase (EC 1.1.3.8) from rat and D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae. Members are proposed to have the cofactor FAD covalently bound at a site specified by Prosite motif PS00862; OX2_COVAL_FAD; 1. 438
18153 130740 TIGR01679 bact_FAD_ox FAD-linked oxidoreductase. This model represents a family of bacterial oxidoreductases with covalently linked FAD, closely related to two different eukaryotic oxidases, L-gulonolactone oxidase (EC 1.1.3.8) from rat and D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae. 419
18154 130741 TIGR01680 Veg_Stor_Prot vegetative storage protein. The proteins represented by this model are close relatives of the plant acid phosphatases (TIGR01675) and are limited to members of the Phaseoleae including Glycine max (soybean) and Phaseolus vulgaris (kidney bean). These proteins are highly expressed in the leaves of repeatedly depodded plants. VSP differs most strikingly from the acid phosphatases in the lack of the conserved nucleophilic aspartate residue in the N-terminus; thus, they should be inactive as phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP. 275
18155 273752 TIGR01681 HAD-SF-IIIC HAD-superfamily phosphatase, subfamily IIIC. This model represents the IIIC subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. Subfamily III (also including IIIA - TIGR01662 and IIIB - pfam03767) contains sequences which do not contain either of the insert domains (between the 1st and 2nd conserved catalytic motifs, subfamily I - TIGR01493, TIGR01509, TIGR01549, TIGR01488, TIGR01494, TIGR01658, TIGR01544 and TIGR01545, or between the 2nd and 3rd, subfamily II - TIGR01460 and TIGR01484). Subfamily IIIC contains five relatively distantly related clades: a family of viral proteins (TIGR01684), a family of eukaryotic proteins called MDP-1 and a family of archaeal proteins most closely related to MDP-1 (TIGR01685), a family of bacteria including the Streptomyces FkbH protein (TIGR01686), and a small clade including the Pasteurella BcbF and EcbF proteins. The overall lack of species overlap among these clades may indicate a conserved function, but the degree of divergence between the clades and the differences in architecture outside of the domain in some clades warn against such a conclusion. No member of this subfamily is characterized with respect to function, however the MDP-1 protein is a characterized phosphatase. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC. 128
18156 273753 TIGR01682 moaD molybdopterin converting factor, subunit 1, non-archaeal. This model describes MoaD. It excludes archaeal homologs, since many Archaea have two MoaD-like proteins, suggesting two different functions. pfam02597 describes both the thiamine biosynthesis protein ThiS and this protein, MoaD, a subunit (together with MoaE, pfam02391) of the molybdopterin converting factor. Both ThiS and MoaD are involved in sulfur transfer reactions. Distribution of this family appears limited to species that also have a member of pfam02391, but a number of Archaea have two different members, suggesting functionally distinct subtypes. The C-terminal Gly-Gly of this model is critical to function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 80
18157 273754 TIGR01683 thiS thiamine biosynthesis protein ThiS. This model represents ThiS, a small, ubiquitin-like thiamine biosynthesis protein related to MoaD, a molybdenum cofactor biosynthesis protein. Both proteins are involved in sulfur transfer. ThiS has a conserved Gly-Gly C-terminus that is modified, in reactions requiring ThiI, ThiF, IscS, and a sulfur atom from Cys, into the thiocarboxylate that provides the sulfur for thiazole biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 64
18158 273755 TIGR01684 viral_ppase viral phosphatase. This model represents a family of viral proteins of unknown function. These proteins are members, however, of the IIIC (TIGR01681) subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. All characterized members of the III subfamilies (IIIA, TIGR01662; IIIB, pfam03767) are phosphatases, including MDP-1, a member of subfamily IIIC (TIGR01681). No member of this subfamily is characterized with respect to particular function. All of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC. These proteins also include an N-terminal domain (ca. 125 aas) that is unique to this clade. 301
18159 273756 TIGR01685 MDP-1 magnesium-dependent phosphatase-1. This model represents two closely related clades of sequences from eukaryotes and archaea. The mouse enzyme has been characterized as a phosphatase and has been positively identified as a member of the haloacid dehalogenase (HAD) superfamily by site-directed mutagenesis of the active site residues. 174
18160 273757 TIGR01686 FkbH FkbH-like domain. This model describes a domain of a family of proteins of unknown overall function. One of these, however, is a modular polyketide synthase 4800 amino acids in length from Streptomyces avermitilis in which this domain is the C-terminal segment. By contrast, the FkbH protein from Streptomyces hygroscopicus apparently contains only this domain. The remaining members of the family all contain an additional N-terminal domain of between 200 and 275 amino acids which show less than 20% identity to one another. It seems likely then that these proteins are involved in disparate functions, probably the biosynthesis of different natural products. For instance, the FkbH gene is found in a gene cluster believed to be responsible for the biosynthesis of unusual "PKS extender units" in the ascomycin pathway. This domain is composed of two parts, the first of which is a member of subfamily IIIC (TIGR01681) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in this domain. The C-terminal portion of this domain is unique to this family (by BLAST). 320
18161 273758 TIGR01687 moaD_arch MoaD family protein, archaeal. Members of this family appear to be archaeal versions of MoaD, subunit 1 of molybdopterin converting factor. This model has been split from the bacterial/eukaryotic equivalog model TIGR01682 because the presence of two members of this family in a substantial number of archaeal species suggests that roles might not be interchangeable. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 88
18162 130749 TIGR01688 dltC D-alanine--poly(phosphoribitol) ligase, subunit 2. This protein is part of the teichoic acid operon in gram-positive organisms. Gram positive organisms incorporate teichoic acid in their cell walls, and in the fatty acid residues of the glycolipid component of the outer layer of the cytoplasmic membrane. This gene, dltC, encodes the alanyl carrier protein. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 73
18163 273759 TIGR01689 EcbF-BcbF capsule biosynthesis phosphatase. This model describes a small family of highly conserved proteins (>60% ID). Two of these, BcbF and EcbF of Pasteurella multocida are believed to be part of the capsule polysaccharide biosynthesis machinery because they are cotranscribed from a locus devoted to that purpose. In Pasteurella there are six different variant capsules (A-F), and these proteins are found only in B and E. The other two species in which this gene is (currently) found are both also pathogenic. These proteins are also members of the IIIC (TIGR01681) subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in this subfamily. Due to the likelihood that the substrates of these enzymes are different depending on the nature of the particular polysaccharides associated with each species, this model has been classified as a subfamily despite the close homology. 126
18164 162489 TIGR01690 ICE_RAQPRD integrative conjugative element protein, RAQPRD family. This model represents a small family of proteins about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 94
18165 273760 TIGR01691 enolase-ppase 2,3-diketo-5-methylthio-1-phosphopentane phosphatase. This enzyme is the enolase-phosphatase of methionine salvage, a pathway that regenerates methionine from methylthioadenosine (MTA). Adenosylmethionine (AdoMet) is a donor of different moieties for various processes, including methylation reactions. Use of AdoMet for spermidine biosynthesis, which leads to polyamine biosynthesis, leaves MTA as a by-product that must be cleared. In Bacillus subtilis and related species, this single protein is replaced by separate enzymes with enolase and phosphatase activities. [Central intermediary metabolism, Sulfur metabolism] 220
18166 130753 TIGR01692 HIBADH 3-hydroxyisobutyrate dehydrogenase. 3-hydroxyisobutyrate dehydrogenase is an enzyme that catalyzes the NAD+-dependent oxidation of 3-hydroxyisobutyrate to methylmalonate semialdehyde of the valine catabolism pathway. In Pseudomonas aeruginosa, 3-hydroxyisobutyrate dehydrogenase (mmsB) is co-induced with methylmalonate-semialdehyde dehydrogenase (mmsA) when grown on medium containing valine as the sole carbon source. The positive transcriptional regulator of this operon (mmsR) is located upstream of these genes and has been identified as a member of the XylS/AraC family of transcriptional regulators. 3-hydroxyisobutyrate dehydrogenase shares high sequence homology to the characterized 3-hydroxyisobutyrate dehydrogenase from rat liver with conservation of proposed NAD+ binding residues at the N-terminus (G-8,10,13,24 and D-31). This enzyme belongs to the 3-hydroxyacid dehydrogenase family, sharing a common evolutionary origin and enzymatic mechanism with 6-phosphogluconate. HIBADH exhibits sequence similarity to the NAD binding domain of 6-phosphogluconate dehydrogenase above trusted (pfam03446). [Energy metabolism, Amino acids and amines] 288
18167 273761 TIGR01693 UTase_glnD [Protein-PII] uridylyltransferase. This model describes GlnD, the uridylyltransferase/uridylyl-removing enzyme for the nitrogen regulatory protein PII. Not all homologs of PII share the property of uridylyltransferase modification on the characteristic Tyr residue (see Prosite pattern PS00496 and document PDOC00439), but the modification site is preserved in the PII homolog of all species with a member of this family. [Central intermediary metabolism, Nitrogen metabolism, Regulatory functions, Protein interactions] 850
18168 273762 TIGR01694 MTAP 5'-deoxy-5'-methylthioadenosine phosphorylase. This model represents the methylthioadenosine phosphorylase found in metazoa, cyanobacteria and a limited number of archaea such as Sulfolobus, Aeropyrum, Pyrobaculum, Pyrococcus, and Thermoplasma. This enzyme is responsible for the first step in the methionine salvage pathway after the transfer of the amino acid moiety from S-adenosylmethionine. The enzyme from human is well-characterized including a crystal structure. A misleading characterization is found for a Sulfolobus solfataricus enzyme, which is called a MTAP. In fact, as uncovered by the genome sequence of S. solfataricus, there are at least two nucleotide phosphorylases and the one found in the MTAP clade is not the one annotated as such. The sequence in this clade has not been isolated but is likely to be the authentic SsMTAP as it displays all of the conserved active site residues found in the human enzyme. This explains the finding that the characterized enzyme has greater efficiency towards the purines inosine, guanosine and adenosine over MTA. In fact, this mis-naming of this enzyme has been carried forward to several publications including a crystal structure. In between the trusted and noise cutoffs are: 1) several archaeal sequences which appear to contain several residues characteristic of phosphorylases which act on guanosine or inosine (according to the crystal structure of MTAP and alignments). In any case, these residues are not conserved. 2) sequences from Mycobacterium tuberculosis and Streptomyces coelicolor which have better, although not perfect retention of the active site residues, but considering the general observation that bacteria utilize the MTA/SAH nucleotidase enzyme and a kinase to do this reaction, these have been excluded pending stronger evidence of their function, and 3) a sequence from Drosophila which appears to be a recent divergence (long branch in neighbor-joining trees) and lacks some of the conserved active site residues. [Central intermediary metabolism, Other, Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 241
18169 273763 TIGR01695 murJ_mviN murein biosynthesis integral membrane protein MurJ. This model represents MurJ (previously MviN), a family of integral membrane proteins predicted to have ten or more transmembrane regions. Members have been suggested to act as a lipid II flippase, translocating a precursor of murein. However, it appears FtsW has that activity. Flippase activity for MurJ has not been shown. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 502
18170 162494 TIGR01696 deoB phosphopentomutase. This protein is involved in the purine and pyrimidine salvage pathway. It catalyzes the conversion of D-ribose 1-phosphate to D-ribose 5-phosphate and the conversion of 2-deoxy-D-ribose 1-phosphate to 2-deoxy-D-ribose 5-phosphate. The seed members of this protein are characterized deoB proteins from E. coli (SP:P07651) and Bacillus (SP:P46353). This model matches pfam01676 for Metalloenzyme superfamily. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 381
18171 130758 TIGR01697 PNPH-PUNA-XAPA inosine/guanosine/xanthosine phosphorylase family. This model is a subset of the subfamily represented by pfam00896 (phosphorylase family 2). This model excludes the methylthioadenosine phosphorylases (MTAP, TIGR01694) which are believed to play a specific role in the recycling of methionine from methylthioadenosine. In this subfamily are found three clades of purine phosphorylases based on a neighbor-joining tree using the MTAP family as an outgroup. The highest-branching clade (TIGR01698) consists of a group of sequences from both gram positive and gram negative bacteria which have been annotated as purine nucleotide phosphorylases but have not been further characterized as to substrate specificity. Of the two remaining clades, one, xanthosine phosphorylase (XAPA, TIGR01699), is limited to certain gamma proteobacteria and constitutes a special purine phosphorylase found in a specialized operon for xanthosine catabolism. The enzyme also acts on the same purines (inosine and guanosine) as the other characterized members of this subfamily, but is only induced when xanthosine must be degraded. The remaining and largest clade consists of purine nucleotide phosphorylases (PNPH, TIGR01700) from metazoa and bacteria which act primarily on guanosine and inosine (and do not act on adenosine). Sequences from Clostridium (GP:15025051) and Thermotoga (OMNI:TM1596) fall between these last two clades and are uncharacterized with respect to substrate range and operon. 248
18172 130759 TIGR01698 PUNP purine nucleotide phosphorylase. This clade of purine nucleotide phosphorylases has not been experimentally characterized but is assigned based on strong sequence homology. Closely related clades act on inosine and guanosine (PNPH, TIGR01700), and xanthosine, inosine and guanosine (XAPA, TIGR01699); neither of these acts on adenosine. A more distantly related clade (MTAP, TIGR01694) acts on methylthioadenosine. 237
18173 130760 TIGR01699 XAPA xanthosine phosphorylase. This model represents a small clade of purine nucleotide phosphorylases found in certain gamma proteobacteria. The gene is part of an operon for the degradation of xanthosine and is induced by xanthosine. The enzyme is also capable of acting on inosine and guanosine (but not adenosine) in a manner similar to those other phosphorylases to which it is closely related (TIGR01698, TIGR01700). 248
18174 273764 TIGR01700 PNPH purine nucleoside phosphorylase I, inosine and guanosine-specific. This model represents a family of bacterial and metazoan purine phosphorylases acting primarily on inosine and guanosine and not acting on adenosine. PNP-I refers to the nomenclature from Bacillus stearothermophilus where PNP-II refers to the nucleotidase acting on adenosine as the primary substrate. The bacterial enzymes (PUNA) are typified by the Bacillus PupG protein, which is involved in the metabolism of nucleosides as a carbon source. Several metazoan enzymes (PNPH) are well characterized including the human and bovine enzymes which have been crystallized. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 249
18175 273765 TIGR01701 Fdhalpha-like oxidoreductase alpha (molybdopterin) subunit. This model represents a well-defined clade of oxidoreductase alpha subunits most closely related to a group of formate dehydrogenases including the E. coli FdhH protein (TIGR01591). These alpha subunits contain a molybdopterin cofactor and generally associate with two other subunits which contain iron-sulfur clusters and cytochromes. The particular subunits with which this enzyme interacts and the substrate which is reduced is unknown at this time. In Ralstonia, the gene is associated with the cbb operon, but is not essential for CO2 fixation. 743
18176 130763 TIGR01702 CO_DH_cata carbon-monoxide dehydrogenase, catalytic subunit. This model represents the carbon-monoxide dehydrogenase catalytic subunit. This protein is related to prismane (also called hybrid cluster protein), a complex whose activity is not yet fully described; the two share similar sets of ligands to unusual metal-containing clusters. 621
18177 130764 TIGR01703 hybrid_clust hydroxylamine reductase. This model represents a family of proteins containing an unusual 4Fe-2S-2O hybrid cluster. Earlier reports had proposed a 6Fe-6S prismane cluster. This subfamily is heterogeneous with respect to the presence or absence of a region of about 100 amino acids not far from the N-terminus of the protein. Members have been described as monomeric. The general function is unknown, although members from E. coli and several other species have hydroxylamine reductase activity. Members are found in various bacteria, in Archaea, and in several parasitic eukaryotes: Giardia intestinalis, Trichomonas vaginalis, and Entamoeba histolytica. [Cellular processes, Detoxification, Energy metabolism, Amino acids and amines] 522
18178 130765 TIGR01704 MTA/SAH-Nsdase 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase. This model represents the enzyme 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase which acts on its two substrates at the same active site. This enzyme is involved in the recycling of the components of S-adenosylmethionine after it has donated one of its two non-ribose sulfur ligands to an acceptor. In the case of 5'-methylthioadenosine this represents the first step of the methionine salvage pathway in bacteria. This enzyme is widely distributed in bacteria, especially those that lack adenosylhomocysteinase (EC 3.3.1.1). One clade of bacteria including Agrobacterium, Mesorhizobium, Sinorhizobium and Brucella includes sequences annotated as MTA/SAH nucleotidase, but differs significantly in homology and has no independent experimental evidence. There are homologs of this enzyme in plants, some of which score between trusted and noise cutoffs here, but there is no experimental evidence to validate this function at this time. [Central intermediary metabolism, Other, Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 228
18179 130766 TIGR01705 MTA/SAH-nuc-hyp 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase, putative. This model represents the enzyme 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase which acts on its two substrates at the same active site. This clade of sequences is sufficiently distinct from the characterized proteins, which form the seed of TIGR01704 as to cast some doubt on the accuracy of annotations based on sequence similarity alone. This enzyme is involved in the recycling of the components of S-adenosylmethionine after it has donated one of its two non-ribose sulfur ligands to an acceptor. In the case of 5'-methylthioadenosine this represents the first step of the methionine salvage pathway in bacteria. This enzyme is widely distributed in bacteria. 212
18180 273766 TIGR01706 NAPA periplasmic nitrate reductase, large subunit. This model represents the large subunit of a family of nitrate reductases found in proteobacteria which are localized to the periplasm. This subunit binds molybdopterin and contains a twin-arginine motif at the N-terminus. The protein associates with NapB, a soluble heme-containing protein and NapC, a membrane-bound cytochrome c. The periplasmic nitrate reductases are not involved in the assimilation of nitrogen, and are not directly involved in the formation of electrochemical gradients (i.e. respiration) either. Rather, the purpose of this enzyme is either dissimilatory (i.e. to dispose of excess reductive equivalents) or indirectly respiratory by virtue of the consumption of electrons derived from NADH via the proton translocating NADH dehydrogenase. The enzymes from Alcaligenes eutrophus and Paracoccus pantotrophus have been characterized. In E. coli (as well as other organisms) this gene is part of a large nitrate reduction operon (napFDAGHBC). [Energy metabolism, Aerobic, Energy metabolism, Electron transport, Central intermediary metabolism, Nitrogen metabolism] 830
18181 273767 TIGR01707 gspI type II secretion system protein I. This model represents GspI, one of two proteins highly conserved at their N-termini and described by pfam02501 but easily separable phylogenetically. The other is GspJ. Both GspI and GspJ are proteins of the type II secretion pathway, or main terminal branch of the general secretion pathway. This pathway carries proteins across the outer membrane. Note that proteins of type II secretion are cryptic in E. coli K-12 - present but not yet demonstrated to act on any target. 101
18182 130769 TIGR01708 typeII_sec_gspH type II secretion system protein H. This model represents GspH, protein H of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 143
18183 273768 TIGR01709 typeII_sec_gspL type II secretion system protein L. This model represents GspL, protein L of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking] 384
18184 130771 TIGR01710 typeII_sec_gspG type II secretion system protein G. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 134
18185 130772 TIGR01711 gspJ type II secretion system protein J. This model represents GspJ, one of two proteins highly conserved at their N-termini and described by pfam02501 but easily separable phylogenetically. The other is GspI. Both GspI and GspJ are proteins of the type II secretion pathway, or main terminal branch of the general secretion pathway. This pathway carries proteins across the outer membrane. Note that proteins of type II secretion are cryptic in E. coli K-12 - present but not yet demonstrated to act on any target. 192
18186 273769 TIGR01712 phage_N6A_met phage N-6-adenine-methyltransferase. This model is a fragment-mode model for a phage-borne DNA N-6-adenine-methyltransferase. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification] 166
18187 273770 TIGR01713 typeII_sec_gspC type II secretion system protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking] 259
18188 130775 TIGR01714 phage_rep_org_N phage replisome organizer, putative, N-terminal region. This model represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region covered by this model is N-terminal to the low-complexity region. [Mobile and extrachromosomal element functions, Prophage functions] 119
18189 273771 TIGR01715 phage_lam_T phage tail assembly protein T. This model represents a translation of the T gene in phage lambda and related phage. A translational frameshift from the upstream gene G into the frame of T produces a minor protein gpG-T, essential in tail assembly but not found in the mature virion. [Mobile and extrachromosomal element functions, Prophage functions] 95
18190 273772 TIGR01716 RGG_Cterm transcriptional activator, Rgg/GadR/MutR family, C-terminal domain. This model describes the whole, except for a 60 residue N-terminal helix-turn-helix DNA-binding domain (pfam01381) of the family of proteins related to the transcriptional regulator Rgg, also called RopB. Rgg is required for secretion of several proteins, including a cysteine proteinase associated with virulence. GadR is a positive regulator of a glutamate-dependent acid resistance mechanism. MutR is a transcriptional activator for mutacin biosynthesis genes in Streptococcus mutans. This family appears restricted to the low-GC Gram-positive bacteria, including at least eight members in Lactococcus lactis. [Regulatory functions, DNA interactions] 220
18191 273773 TIGR01717 AMP-nucleosdse AMP nucleosidase. This model represents the AMP nucleosidase from proteobacteria but also including a sequence from Corynebacterium, a gram-positive organism. The enzyme from E. coli has been the most thoroughly studied. 477
18192 130779 TIGR01718 Uridine-psphlse uridine phosphorylase. This model represents a family of bacterial and archaeal uridine phosphorylases unrelated to the mammalian enzymes of the same name. The E. coli, Salmonella and Klebsiella genes have been characterized. Sequences from Clostridium, Streptomyces, Treponema, Halobacterium and Pyrobaculum were included above trusted on the basis of sequence homology and a PAM-based neighbor-joining tree. A clade including second sequences from Halobacterium and Vibrio was somewhat more distantly related and may represent a slightly different substrate specificity - these were placed below the noise cutoff. More distantly related is a clade of archaeal sequences which is as related to the DeoD family of inosine phosphorylases (TIGR00107) as they are to these uridine phosphorylases. This clade includes a characterized protein from Sulfolobus solfataricus which has been mis-named as a methylthioadenosine phosphorylase, but which acts on inosine and guanosine - it is unclear whether uridine has been evaluated as a substrate. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 245
18193 130780 TIGR01719 euk_UDPppase uridine phosphorylase. This model represents a clade of mainly eukaryotic uridine phosphorylases. Genes from human and mouse have been characterized. This enzyme is a member of the PNP/UDP subfamily (pfam01048) and is closely related to the bacterial uridine (TIGR01718) and inosine (TIGR00107) phosphorylase equivalogs. In addition to the eukaryotes, a gene from Mycobacterium leprae is included in this equivalog and may have resulted from lateral gene transfer. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 287
18194 273774 TIGR01720 NRPS-para261 non-ribosomal peptide synthase domain TIGR01720. This domain appears to be located immediately downstream from a condensation domain (pfam00668), and is followed primarily by the end of the molecule or another condensation domain (in a few cases it is followed by pfam00501, an AMP-binding module). The converse is not true, pfam00668 domains are not always followed by this domain. This implicates this domain in possible post-condensation modification events. This model is 171 amino acids long and contains three very highly conserved regions. At the N-terminus is a nearly invariant lysine (position 11) followed by xxxRxxPxxGxGYG in which the proline and the first glycine are invariant. This is followed approximately 22 residues later by the motif FNYLG. Near the C-terminus of the domain is the sequence TxSD where the serine and aspartate are nearly invariant. 153
18195 130782 TIGR01721 AMN-like AMP nucleosidase, putative. The sequences in the clade represented by this model are most closely related to the AMP nucleosidase found in TIGR01717. These sequences are found only in Chlamydia and Porphyromonas and differ sufficiently from the characterized AMP nucleosidase to put some doubt on assignment of this name. 266
18196 130783 TIGR01722 MMSDH methylmalonic acid semialdehyde dehydrogenase. Involved in valine catabolism, methylmalonate-semialdehyde dehydrogenase catalyzes the irreversible NAD+- and CoA-dependent oxidative decarboxylation of methylmalonate semialdehyde to propionyl-CoA. Methylmalonate-semialdehyde dehydrogenase has been characterized in both prokaryotes and eukaryotes, functioning as a mammalian tetramer and a bacterial homodimer. Although similar in monomeric molecular mass and enzymatic activity, the N-terminal sequence in P. aeruginosa does not correspond with the N-terminal sequence predicted for rat liver. Sequence homology to a variety of prokaryotic and eukaryotic aldehyde dehydrogenases places MMSDH in the aldehyde dehydrogenase (NAD+) superfamily (pfam00171), making MMSDH's CoA requirement unique among known ALDHs. Methylmalonate semialdehyde dehydrogenase is closely related to betaine aldehyde dehydrogenase, 2-hydroxymuconic semialdehyde dehydrogenase, and class 1 and 2 aldehyde dehydrogenase. In Bacillus, a protein highly homologous to methylmalonic acid semialdehyde dehydrogenase groups out from the main MMSDH clade with Listeria and Sulfolobus. This Bacillus protein has been suggested to be located in an iol operon and/or involved in myo-inositol catabolism, converting malonic semialdehyde to acetyl CoA and CO2. The preceding enzymes responsible for valine catabolism are present in Bacillus, Listeria, and Sulfolobus. [Energy metabolism, Amino acids and amines] 477
18197 130784 TIGR01723 hmd_TIGR 5,10-methenyltetrahydromethanopterin hydrogenase. This model represents a clade of authenticated coenzyme N(5),N(10)-methenyltetrahydromethanopterin reductases. This enzyme does not use F420. This enzyme acts in methanogenesis and as such is restricted to methanogenic archaeal species. This clade is one of two clades in pfam03201. [Energy metabolism, Methanogenesis] 340
18198 130785 TIGR01724 hmd_rel H2-forming N(5),N(10)-methenyltetrahydromethanopterin dehydrogenase-related protein. This model represents a sister clade to the authenticated coenzyme F420-dependent N(5),N(10)-methenyltetrahydromethanopterin reductase (HMD) of TIGR01723. Two members, designated HmdII and HmdIII, are found. Members are restricted to methanogens, but the function is unknown. [Unknown function, Enzymes of unknown specificity] 341
18199 273775 TIGR01725 phge_HK97_gp10 phage protein, HK97 gp10 family. This model represents an uncharacterized, highly divergent bacteriophage family. The family includes gp10 from HK022 and HK97. It appears related to TIGR01635, a phage morphogenesis family believed to be involved in tail completion. [Mobile and extrachromosomal element functions, Prophage functions] 119
18200 130787 TIGR01726 HEQRo_perm_3TM amine acid ABC transporter, permease protein, 3-TM region, His/Glu/Gln/Arg/opine family. This model represents one of several classes of multiple membrane spanning regions found immediately N-terminal to the domain described by pfam00528, binding-protein-dependent transport systems inner membrane component. The region covered by this model generally is predicted to contain three transmembrane helices. Substrate specificities attributed to members of this family include histidine, arginine, glutamine, glutamate, and (in Agrobacterium) the opines octopine and nopaline. [Transport and binding proteins, Amino acids, peptides and amines] 99
18201 213647 TIGR01727 oligo_HPY oligopeptide/dipeptide ABC transporter, ATP-binding protein, C-terminal domain. This model represents a domain found in the C-terminal regions of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid. [Transport and binding proteins, Amino acids, peptides and amines] 87
18202 130789 TIGR01728 SsuA_fam ABC transporter, substrate-binding protein, aliphatic sulfonates family. Members of this family are substrate-binding periplasmic proteins of ABC transporters. This subfamily includes SsuA, a member of a transporter operon needed to obtain sulfur from aliphatic sulfonates. Related proteins outside the scope of this model include taurine (NH2-CH2-CH2-SO3H) binding proteins, the probable sulfate ester binding protein AtsR, and the probable aromatic sulfonate binding protein AsfC. All these families make sulfur available when Cys and sulfate levels are low. Please note that phylogenetic analysis by neighbor-joining suggests that a number of sequences belonging to this family have been excluded because of scoring lower than taurine-binding proteins. [Transport and binding proteins, Other] 288
18203 130790 TIGR01729 taurine_ABC_bnd taurine ABC transporter, periplasmic binding protein. This model identifies a cluster of ABC transporter periplasmic substrate binding proteins, apparently specific for taurine. Transport systems for taurine (NH2-CH2-CH2-SO3H), sulfonates, and sulfate esters import sulfur when sulfate levels are low. The most closely related proteins outside this family are putative aliphatic sulfonate binding proteins (TIGR01728). 300
18204 273776 TIGR01730 RND_mfp RND family efflux transporter, MFP subunit. This model represents the MFP (membrane fusion protein) component of the RND family of transporters. RND refers to Resistance, Nodulation, and cell Division. It is, in part, a subfamily of pfam00529 (Pfam release 7.5) but hits substantial numbers of proteins missed by that model. The related HlyD secretion protein, for which pfam00529 is named, is outside the scope of this model. Attributed functions imply outward transport. These functions include nodulation, acriflavin resistance, heavy metal efflux, and multidrug resistance proteins. Most members of this family are found in Gram-negative bacteria. The proposed function of MFP proteins is to bring the inner and outer membranes together and enable transport to the outside of the outer membrane. Note, however, that a few members of this family are found in Gram-positive bacteria, where there is no outer membrane. [Transport and binding proteins, Unknown substrate] 322
18205 273777 TIGR01731 fil_hemag_20aa adhesin HecA family 20-residue repeat (two copies). This model represents two copies of a 20-residue repeat found in Bordetella pertussis filamentous hemagglutinin family of adhesins. This family includes extremely long proteins from a number of plant and animal pathogens. 40
18206 273778 TIGR01732 tiny_TM_bacill conserved hypothetical tiny transmembrane protein. This model represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV. [Hypothetical proteins, Conserved] 26
18207 273779 TIGR01733 AA-adenyl-dom amino acid adenylation domain. This model represents a domain responsible for the specific recognition of amino acids and activation as adenylyl amino acids. The reaction catalyzed is aa + ATP -> aa-AMP + PPi. These domains are usually found as components of multi-domain non-ribosomal peptide synthetases and are usually called "A-domains" in that context. A-domains are almost invariably followed by "T-domains" (thiolation domains, pfam00550) to which the amino acid adenylate is transferred as a thiol-ester to a bound pantetheine cofactor with the release of AMP (these are also called peptide carrier proteins, or PCPs). When the A-domain does not represent the first module (corresponding to the first amino acid in the product molecule) it is usually preceded by a "C-domain" (condensation domain, pfam00668) which catalyzes the ligation of two amino acid thiol-esters from neighboring modules. This domain is a subset of the AMP-binding domain found in Pfam (pfam00501) which also hits substrate--CoA ligases and luciferases. Sequences scoring in between trusted and noise for this model may be ambiguous as to whether they activate amino acids or other molecules lacking an alpha amino group. 409
18208 273780 TIGR01734 D-ala-DACP-lig D-alanine--poly(phosphoribitol) ligase, subunit 1. This model represents the enzyme (also called D-alanine-D-alanyl carrier protein ligase) which activates D-alanine as an adenylate via the reaction D-ala + ATP -> D-ala-AMP + PPi, and further catalyzes the condensation of the amino acid adenylate with the D-alanyl carrier protein (D-ala-ACP). The D-alanine is then further transferred to teichoic acid in the biosynthesis of lipoteichoic acid (LTA) and wall teichoic acid (WTA) in gram positive bacteria, both polysaccharides. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 502
18209 188163 TIGR01735 FGAM_synt phosphoribosylformylglycinamidine synthase, single chain form. This model represents a single-molecule form of phosphoribosylformylglycinamidine synthase, also called FGAM synthase, an enzyme of purine de novo biosynthesis. This form is found mostly in eukaryotes and Proteobacteria. In Bacillus subtilis PurL (FGAM synthase II) and PurQ (FGAM synthase I), homologous to different parts of this model, perform the equivalent function; the unrelated small protein PurS is also required and may be a third subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 1310
18210 273781 TIGR01736 FGAM_synth_II phosphoribosylformylglycinamidine synthase II. Phosphoribosylformylglycinamidine synthase is a single, long polypeptide in most Proteobacteria and eukaryotes. Three proteins are required in Bacillus subtilis and many other species. This is the longest of the three and is designated PurL, phosphoribosylformylglycinamidine synthase II, or FGAM synthase II. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 715
18211 273782 TIGR01737 FGAM_synth_I phosphoribosylformylglycinamidine synthase I. In some species, phosphoribosylformylglycinamidine synthase is composed of a single polypeptide chain. This model describes the PurQ protein of Bacillus subtilis (where PurL, PurQ, and PurS are required for phosphoribosylformylglycinamidine synthase activity) and functionally equivalent proteins from other bacteria and archaea. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 227
18212 273783 TIGR01738 bioH pimelyl-[acyl-carrier protein] methyl ester esterase. This CoA-binding enzyme is required for the production of pimeloyl-coenzyme A, the substrate of the BioF protein early in the biosynthesis of biotin. Its exact function is unknown, but is proposed in ref 2. This enzyme belongs to the alpha/beta hydrolase fold family (pfam00561). Members of this family are restricted to the Proteobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 245
18213 273784 TIGR01739 tegu_FGAM_synt herpesvirus tegument protein/v-FGAM-synthase. This model describes a family of large proteins of herpesviruses. The protein is described variably as tegument protein or phosphoribosylformylglycinamidine synthase (FGAM-synthase). Most of the length of the protein shows homology to eukaryotic FGAM-synthase. Functional characterizations were not verified during construction of this model. 1202
18214 273785 TIGR01740 pyrF orotidine 5'-phosphate decarboxylase, subfamily 1. This model represents orotidine 5'-monophosphate decarboxylase, the PyrF protein of pyrimidine nucleotide biosynthesis. In many eukaryotes, the region hit by this model is part of a multifunctional protein. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 214
18215 130802 TIGR01741 staph_tand_hypo conserved hypothetical protein. This model represents a tandem array of 10 proteins in Staphylococcus aureus and the C-terminal region of one protein each in Bacillus subtilis and Bacillus halodurans. 157
18216 273786 TIGR01742 SA_tandem_lipo Staphylococcus tandem lipoproteins. Members of this family are predicted lipoproteins (mostly), found in Staphylococcus aureus in several different tandem clusters in pathogenicity islands. Members are also found, clustered, in Staphylococcus epidermidis. 255
18217 130804 TIGR01743 purR_Bsub pur operon repressor, Bacillus subtilis type. This model represents the purine operon repressor PurR of low-GC Gram-positive bacteria. This homodimeric repressor contains a large region homologous to phosphoribosyltransferases and is inhibited by 5-phosphoribosyl 1-pyrophosphate. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis, Regulatory functions, DNA interactions] 268
18218 130805 TIGR01744 XPRTase xanthine phosphoribosyltransferase. This model represents a xanthine-specific phosphoribosyltransferase of Bacillus subtilis and closely related proteins from other species, mostly from other Gram-positive bacteria. The adjacent gene is a xanthine transporter; B. subtilis can import xanthine for the purine salvage pathway or for catabolism to obtain nitrogen. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides] 191
18219 130806 TIGR01745 asd_gamma aspartate-semialdehyde dehydrogenase, gamma-proteobacterial. [Amino acid biosynthesis, Aspartate family] 366
18220 273787 TIGR01746 Thioester-redct thioester reductase domain. This model includes the terminal domain from the fungal alpha aminoadipate reductase enzyme (also known as aminoadipate semialdehyde dehydrogenase) which is involved in the biosynthesis of lysine, as well as the reductase-containing component of the myxochelin biosynthetic gene cluster, MxcG. The mechanism of reduction involves activation of the substrate by adenylation and transfer to a covalently-linked pantetheine cofactor as a thioester. This thioester is then reduced to give an aldehyde (thus releasing the product) and a regenerated pantetheine thiol. (In myxochelin biosynthesis this aldehyde is further reduced to an alcohol or converted to an amine by an aminotransferase.) This is a fundamentally different reaction than beta-ketoreductase domains of polyketide synthases which act at a carbonyl two carbons removed from the thioester and forms an alcohol as a product. This domain is invariably found at the C-terminus of the proteins which contain it (presumably because it results in the release of the product). The majority of hits to this model are non-ribosomal peptide synthetases in which this domain is similarly located proximal to a thiolation domain (pfam00550). In some cases this domain is found at the end of a polyketide synthetase enzyme, but is unlike ketoreductase domains which are found before the thiolase domains. Exceptions to this observed relationship with the thiolase domain include three proteins which consist of stand-alone reductase domains (GP|466833 from M. leprae, GP|435954 from Anabaena and OMNI|NTL02SC1199 from Strep. coelicolor) and one protein (OMNI|NTL01NS2636 from Nostoc) which contains N-terminal homology with a small group of hypothetical proteins but no evidence of a thiolation domain next to the putative reductase domain. Below the noise cutoff to this model are proteins containing more distantly related ketoreductase and dehydratase/epimerase domains. It has been suggested that a NADP-binding motif can be found in the N-terminal portion of this domain that may form a Rossmann-type fold. 367
18221 130808 TIGR01747 diampropi_NH3ly diaminopropionate ammonia-lyase family. This small subfamily includes diaminopropionate ammonia-lyase from Salmonella typhimurium and a small number of close homologs, about 50 % identical in sequence. The enzyme is a pyridoxal phosphate-binding homodimer homologous to threonine dehydratase (threonine deaminase). [Energy metabolism, Other] 376
18222 130809 TIGR01748 rhaA L-rhamnose isomerase. This enzyme interconverts L-rhamnose and L-rhamnulose. In some species, including E. coli, this is the first step in rhamnose catabolism. Sequential steps are catalyzed by rhamnulose kinase (rhaB), then rhamnulose-1-phosphate aldolase (rhaD) to yield glycerone phosphate and (S)-lactaldehyde. Characterization of this family is based on members in E. coli and Salmonella. [Energy metabolism, Sugars] 414
18223 130810 TIGR01749 fabA beta-hydroxyacyl-[acyl carrier protein] dehydratase FabA. This enzyme, FabA, shows overlapping substrate specificity with FabZ with regard to chain length in fatty acid biosynthesis. It is commonly designated 3-hydroxydecanoyl-[acyl-carrier-protein] dehydratase (EC 4.2.1.60) as if it were specific for that chain length, but its specificity is broader; it is active even in the initiation of fatty acid biosynthesis. This enzyme can also isomerize trans-2-decenoyl-ACP to cis-3-decenoyl-ACP to bypass reduction by FabI and instead allow biosynthesis of unsaturated fatty acids. FabA cannot elongate unsaturated fatty acids. [Fatty acid and phospholipid metabolism, Biosynthesis] 169
18224 130811 TIGR01750 fabZ beta-hydroxyacyl-[acyl carrier protein] dehydratase FabZ. This enzyme, FabZ, shows overlapping substrate specificity with FabA with regard to chain length in fatty acid biosynthesis. FabZ works preferentially on shorter chains and is often designated (3R)-hydroxymyristoyl-[acyl carrier protein] dehydratase, although its actual specificity is broader. Unlike FabA, FabZ does not function as an isomerase and cannot initiate unsaturated fatty acid biosynthesis. However, only FabZ can act during the elongation of unsaturated fatty acid chains. [Fatty acid and phospholipid metabolism, Biosynthesis] 140
18225 188164 TIGR01751 crot-CoA-red crotonyl-CoA carboxylase/reductase. The enzyme represented by this model can convert crotonyl-CoA to butyryl-CoA (crotonyl-CoA reductase activity), but more importantly, in the presence of CO2, generates (2S)-ethylmalonyl-CoA. In serine cycle methylotrophic bacteria this enzyme is involved in the process of acetyl-CoA to glyoxylate. In other bacteria the enzyme is used to produce extender units for incorporation into polyketides such as tylosin from Streptomyces fradiae and coronatine from Pseudomonas syringae. 398
18226 273788 TIGR01752 flav_long flavodoxin, long chain. Flavodoxins are small redox-active proteins with a flavin mononucleotide (FMN) prosthetic group. They can act in nitrogen fixation by nitrogenase, in sulfite reduction, and light-dependent NADP+ reduction during photosynthesis, among other roles. This model describes the long chain type, typical for nitrogen fixation but associated with pyruvate formate-lyase activation and cobalamin-dependent methionine synthase activity in E. coli. [Energy metabolism, Electron transport] 167
18227 273789 TIGR01753 flav_short flavodoxin, short chain. Flavodoxins are small redox-active proteins with a flavin mononucleotide (FMN) prosthetic group. They can act in nitrogen fixation by nitrogenase, in sulfite reduction, and light-dependent NADP+ reduction during photosynthesis, among other roles. This model describes the short chain type. Many of these are involved in sulfite reduction. [Energy metabolism, Electron transport] 140
18228 130815 TIGR01754 flav_RNR ribonucleotide reductase-associated flavodoxin, putative. This model represents a family of proteins found immediately downstream of ribonucleotide reductase genes in Xylella fastidiosa and some Gram-positive bacteria. It appears to be a highly divergent flavodoxin of the short chain type, more like the flavodoxins of the sulfate-reducing genus Desulfovibrio than like the NifF flavodoxins associated with nitrogen fixation. 140
18229 130816 TIGR01755 flav_wrbA NAD(P)H:quinone oxidoreductase, type IV. This model represents a protein, WrbA, related to and slightly larger than flavodoxin. It was just shown, in E. coli and Archaeoglobus fulgidus (and previously for some eukaryotic homologs) to act as a fourth type of NAD(P)H:quinone oxidoreductase. In E. coli, this protein was earlier reported to be produced during stationary phase, bind to the trp repressor, and make trp operon repression more efficient. WrbA does not interact with the trp operator by itself. Members are found in species in which homologs of the E. coli trp operon repressor TrpR (SP:P03032) are not detected. [Energy metabolism, Electron transport] 197
18230 130817 TIGR01756 LDH_protist lactate dehydrogenase. This model represents a family of protist lactate dehydrogenases which have apparently evolved from a recent protist malate dehydrogenase ancestor. Lactate dehydrogenase converts the hydroxyl at C-2 of lactate to a carbonyl in the product, pyruvate. The preference of this enzyme for NAD or NADP has not been determined. A critical residue in malate dehydrogenase, arginine-91 (T. vaginalis numbering) has been mutated to a leucine, eliminating the positive charge which complemented the carboxylate in malate which is absent in lactate. Several other more subtle changes are proposed to make the active site smaller to accommodate the less bulky lactate molecule. 313
18231 130818 TIGR01757 Malate-DH_plant malate dehydrogenase, NADP-dependent. This model represents the NADP-dependent malate dehydrogenase found in plants, mosses and green algae and localized to the chloroplast. Malate dehydrogenase converts oxaloacetate into malate, a critical step in the C4 cycle which allows circumvention of the effects of photorespiration. Malate is subsequently transported from the chloroplast to the cytoplasm (and then to the bundle sheath cells in C4 plants). The plant and moss enzymes are light regulated via cysteine disulfide bonds. The enzyme from Sorghum has been crystallized. 387
18232 130819 TIGR01758 MDH_euk_cyt malate dehydrogenase, NAD-dependent. This model represents the NAD-dependent cytosolic malate dehydrogenase from eukaryotes. The enzyme from pig has been studied by X-ray crystallography. 324
18233 130820 TIGR01759 MalateDH-SF1 malate dehydrogenase. This model represents a family of malate dehydrogenases in bacteria and eukaryotes which utilize either NAD or NADP depending on the species and context. MDH interconverts malate and oxaloacetate and is a part of the citric acid cycle as well as the C4 cycle in certain photosynthetic organisms. 323
18234 273790 TIGR01760 tape_meas_TP901 phage tail tape measure protein, TP901 family, core region. This model represents a reasonably well conserved core region of a family of phage tail proteins. The member from phage TP901-1 was characterized as a tail length tape measure protein in that a shortened form of the protein leads to phage with proportionately shorter tails. [Mobile and extrachromosomal element functions, Prophage functions] 350
18235 273791 TIGR01761 thiaz-red thiazolinyl imide reductase. This reductase is found associated with gene clusters for the biosynthesis of various non-ribosomal peptide derived natural products in which cysteine is cyclized to a thiazoline ring containing an imide double bond. Examples include yersiniabactin (irp3/YbtU, GP|21959262) and pyochelin (PchG, GP|4325022). 344
18236 130823 TIGR01762 chlorin-enz chlorinating enzyme. This model represents a group of highly homologous enzymes related to dioxygenases which chlorinate amino acid methyl groups. BarB1 and BarB2 are proposed to trichlorinate one of the methyl groups of a leucine residue in the biosynthesis of barbamide in the cyanobacterium Lyngbya majuscula. SyrB2 is proposed to chlorinate the methyl group of threonine in the biosynthesis of syringomycin in Pseudomonas syringae. CmaB is proposed to chlorinate the beta-methyl group of alloisoleucine in the process of ring closure in the biosynthesis of coronamic acid, a component of coronatine also in Pseudomonas syringae. 288
18237 273792 TIGR01763 MalateDH_bact malate dehydrogenase, NAD-dependent. This enzyme converts malate into oxaloacetate in the citric acid cycle. The critical residues which discriminate malate dehydrogenase from lactate dehydrogenase have been characterized, and have been used to set the cutoffs for this model. Sequences showing [aflimv][ap]R[rk]pgM[st] and [ltv][ilm]gGhgd were kept above trusted, while those in which the capitalized residues in the patterns were found to be Q, E and E were kept below the noise cutoff. Some sequences in the grey zone have been annotated as malate dehydrogenases, but none have been characterized. Phylogenetically, a clade of sequences from eukaryotes such as Toxoplasma and Plasmodium which include a characterized lactate dehydrogenase and show ambiguous critical residue patterns appears to be more closely related to these bacterial sequences than other eukaryotic sequences. These are relatively long branch and have been excluded from the model. All other sequences falling below trusted appear to be phylogenetically outside of the clade including the trusted hits. The annotation of Botryococcus braunii as lactate dehydrogenase appears to be in error. This was initially annotated as MDH by Swiss-Prot and then changed. The rationale for either of these annotations is not traceable. [Energy metabolism, TCA cycle] 305
18238 200128 TIGR01764 excise DNA binding domain, excisionase family. An excisionase, or Xis protein, is a small protein that binds and promotes excisive recombination; it is not enzymatically active. This model represents a number of putative excisionases and related proteins from temperate phage, plasmids, and transposons, as well as DNA binding domains of other proteins, such as a DNA modification methylase. This model identifies mostly small proteins and N-terminal regions of large proteins, but some proteins appear to have two copies. This domain appears similar, in both sequence and predicted secondary structure (PSIPRED) to the MerR family of transcriptional regulators (pfam00376). [Unknown function, General] 49
18239 130826 TIGR01765 tspaseT_teng_N transposase, putative, N-terminal domain. This model represents the N-terminal region of a family of putative transposases found in the largest copy number in Thermoanaerobacter tengcongensis. The three homologs in Bacillus anthracis are each split into two ORFs and this model represents the upstream ORF. [Mobile and extrachromosomal element functions, Transposon functions] 73
18240 273793 TIGR01766 tspaseT_teng_C transposase, IS605 OrfB family, central region. This model represents a region of sequence similarity between a family of putative transposases of Thermoanaerobacter tengcongensis, smaller related proteins from Bacillus anthracis, putative transposases described by pfam01385, and other proteins. [Mobile and extrachromosomal element functions, Transposon functions] 82
18241 130828 TIGR01767 MTRK S-methyl-5-thioribose kinase. This enzyme, S-methyl-5-thioribose kinase (MtnK) is involved in the methionine salvage pathway in certain bacteria. 370
18242 273794 TIGR01768 GGGP-family geranylgeranylglyceryl phosphate synthase family protein. This model represents a family of sequences including geranylgeranylglyceryl phosphate synthase which catalyzes the first committed step in the synthesis of ether-linked membrane lipids in archaea. The clade of bacterial sequences may have the same function or a closely related function. This model supersedes TIGR00265, which has been retired. 223
18243 130830 TIGR01769 GGGP phosphoglycerol geranylgeranyltransferase. This model represents geranylgeranylglyceryl phosphate synthase which catalyzes the first committed step in the synthesis of ether-linked membrane lipids in archaea. The active enzyme is reported to be a homopentamer in Methanobacterium thermoautotrophicum but is reported to be a homodimer in Thermoplasma acidophilum. 205
18244 273795 TIGR01770 NDH_I_N proton-translocating NADH-quinone oxidoreductase, chain N. This model describes the 14th (based on E. coli) structural gene, N, of bacterial and chloroplast energy-transducing NADH (or NADPH) dehydrogenases. This model does not describe any subunit of the mitochondrial complex I (for which the subunit composition is very different), nor NADH dehydrogenases that are not coupled to ion transport. The Enzyme Commission designation 1.6.5.3, for NADH dehydrogenase (ubiquinone), is applied broadly, perhaps unfortunately, even if the quinone is menaquinone (Thermus, Mycobacterium) or plastoquinone (chloroplast). For chloroplast members, the name NADH-plastoquinone oxidoreductase is used for the complex and this protein is designated as subunit 2 or B. This model also includes a subunit of a related complex in the archaeal methanogen, Methanosarcina mazei, in which F420H2 replaces NADH and 2-hydroxyphenazine replaces the quinone. [Energy metabolism, Electron transport] 468
18245 273796 TIGR01771 L-LDH-NAD L-lactate dehydrogenase. This model represents the NAD-dependent L-lactate dehydrogenases from bacteria and eukaryotes. This enzyme functions as the final step in anaerobic glycolysis. Although lactate dehydrogenases have in some cases been mistaken for malate dehydrogenases due to the similarity of these two substrates and the apparent ease with which evolution can toggle these activities, critical residues have been identified which can discriminate between the two activities. At the time of the creation of this model no hits above the trusted cutoff contained critical residues typical of malate dehydrogenases. [Energy metabolism, Anaerobic, Energy metabolism, Glycolysis/gluconeogenesis] 299
18246 130833 TIGR01772 MDH_euk_gproteo malate dehydrogenase, NAD-dependent. This model represents the NAD-dependent malate dehydrogenase found in eukaryotes and certain gamma proteobacteria. The enzyme is involved in the citric acid cycle as well as the glyoxylate cycle. Several isoforms exist in eukaryotes. In S. cerevisiae, for example, there are cytoplasmic, mitochondrial and peroxisomal forms. Although malate dehydrogenases have in some cases been mistaken for lactate dehydrogenases due to the similarity of these two substrates and the apparent ease with which evolution can toggle these activities, critical residues have been identified which can discriminate between the two activities. At the time of the creation of this model no hits above the trusted cutoff contained critical residues typical of lactate dehydrogenases. [Energy metabolism, TCA cycle] 312
18247 273797 TIGR01773 GABAperm gamma-aminobutyrate permease. GABA permease (gabP) catalyzes the translocation of 4-aminobutyrate (GABA) across the plasma membrane, with homologues expressed in Gram-negative and Gram-positive organisms. This permease is a highly hydrophobic transmembrane protein consisting of 12 transmembrane domains with hydrophilic N- and C-terminal ends. Induced by nitrogen-limited culture conditions in both Escherichia coli and Bacillus subtilis, gabP is an energy dependent transport system stimulated by membrane potential and has been observed adjacent and distant from other GABA degradation proteins. GabP is highly homologous to amino acid permeases from B. subtilis, E. coli, as well as to other members of the amino acid permease family (pfam00324). A member of the APC (amine-polyamine-choline) transporter superfamily, GABA permease possesses a "consensus amphipathic region" (CAR) found to be evolutionarily conserved within this transport family. This amphipathic region is located between helix 8 and cytoplasmic loop 8-9, forming a potential channel domain and suggested to play a significant role in ligand recognition and translocation. Unique to GABA permeases, a conserved cysteine residue (CYS-300, E. coli) located at the beginning of the amphipathic domain, has been determined to be critical for catalytic specificity. [Transport and binding proteins, Amino acids, peptides and amines] 452
18248 273798 TIGR01774 PFL2-3 glycyl radical enzyme, PFL2/glycerol dehydratase family. This family previously was designated pyruvate formate-lyase, but it now appears that members include the B12-independent glycerol dehydratase. Therefore, the functional definition of the family is being broadened. This family includes the PflF and PflD proteins of E. coli, described as isoforms of pyruvate-formate lyase found in a limited number of additional species. PFL catalyzes the reaction pyruvate + CoA -> acetyl-CoA + formate, which is a step in the fermentation of glucose. 786
18249 273799 TIGR01776 TonB-tbp-lbp TonB-dependent lactoferrin and transferrin receptors. This family of TonB-dependent receptors are responsible for import of iron from the mammalian iron carriers lactoferrin and transferrin across the outer membrane. These receptors are found only in bacteria which can infect mammals such as Moraxella, Mannheimia, Neisseria, Actinobacillus, Pasteurella, Haemophilus and Histophilus species. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins] 932
18250 273800 TIGR01777 yfcH TIGR01777 family protein. This model represents a clade of proteins of unknown function including the E. coli yfcH protein. [Hypothetical proteins, Conserved] 291
18251 273801 TIGR01778 TonB-copper TonB-dependent copper receptor. This model represents a family of proteobacterial TonB-dependent outer membrane receptor/transporters which bind and translocate copper ions. Two characterized members of this family exist, outer membrane protein C (OprC) from Pseudomonas aeruginosa and NosA from Pseudomonas stutzeri which is responsible for providing copper for the copper-containing N2O reductase. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins] 636
18252 273802 TIGR01779 TonB-B12 TonB-dependent vitamin B12 receptor. This model represents the TonB-dependent outer membrane receptor found in gamma proteobacteria responsible for translocating the cobalt-containing vitamin B12 (cobalamin). [Transport and binding proteins, Other, Transport and binding proteins, Porins] 614
18253 188167 TIGR01780 SSADH succinate-semialdehyde dehydrogenase. Succinic semialdehyde dehydrogenase is one of three enzymes constituting 4-aminobutyrate (GABA) degradation in both prokaryotes and eukaryotes, catalyzing the (NAD(P)+)-dependent catabolism reaction of succinic semialdehyde to succinate for metabolism by the citric acid cycle. The EC number depends on the cofactor: 1.2.1.24 for NAD only, 1.2.1.79 for NADP only, and 1.2.1.16 if both can be used. In Escherichia coli, succinic semialdehyde dehydrogenase is located in an unidirectionally transcribed gene cluster encoding enzymes for GABA degradation and is suggested to be cotranscribed with succinic semialdehyde transaminase from a common promoter upstream of SSADH. Similar gene arrangements can be found in characterized Ralstonia eutropha and the genome analysis of Bacillus subtilis. Prokaryotic succinic semialdehyde dehydrogenases (1.2.1.16) share high sequence homology to characterized succinic semialdehyde dehydrogenases from rat and human (1.2.1.24), exhibiting conservation of proposed cofactor binding residues, and putative active sites (G-237 & G-242, C-293 & G-259 respectively of rat SSADH). Eukaryotic SSADH enzymes exclusively utilize NAD+ as a cofactor, exhibiting little to no NADP+ activity, while an NADP+ preference has been detected in prokaryotes in addition to both NADP+- and NAD+-dependencies as in E. coli, Pseudomonas, and Klebsiella pneumoniae. The function of this alternative SSADH currently is unknown, but has been suggested to play a possible role in 4-hydroxyphenylacetic degradation. Just outside the scope of this model, are several sequences belonging to clades scoring between trusted and noise. These sequences may be actual SSADH enzymes, but lack sufficiently close characterized homologs to make a definitive assignment at this time. SSADH enzyme belongs to the aldehyde dehydrogenase family (pfam00171), sharing a common evolutionary origin and enzymatic mechanism with lactaldehyde dehydrogenase. Like in lactaldehyde dehydrogenase and succinate semialdehyde dehydrogenase, the mammalian catalytic glutamic acid and cysteine residues are conserved in all the enzymes of this family (PS00687, PS00070). [Central intermediary metabolism, Other] 448
18254 273803 TIGR01781 Trep_dent_lipo Treponema denticola clustered lipoprotein. This model represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown. 412
18255 273804 TIGR01782 TonB-Xanth-Caul TonB-dependent receptor. This model represents a family of TonB-dependent outer-membrane receptors which are found mainly in Xanthomonas and Caulobacter. These appear to represent the expansion of a paralogous family in that the 22 X. axonopodis (21 in X. campestris) and 18 C. crescentus sequences are more closely related to each other than any of the many TonB-dependent receptors found in other species. In fact, the Crescentus and Xanthomonas sequences are inseparable on a phylogenetic tree using a PAM-weighted neighbor-joining method, indicating that one of the two genera may have acquired this set of receptors from the other. The mechanism by which this family is shared between Xanthomonas, a gamma proteobacterial plant pathogen and Caulobacter, an alpha proteobacterial aquatic organism is unclear. [Transport and binding proteins, Porins] 845
18256 273805 TIGR01783 TonB-siderophor TonB-dependent siderophore receptor. This subfamily model encompasses a wide variety of TonB-dependent outer membrane siderophore receptors. It has no overlap with TonB receptors known to transport other substances, but is likely incomplete due to lack of characterizations. It is likely that genuine siderophore receptors will be identified which score below the noise cutoff to this model at which point the model should be updated. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins] 651
18257 273806 TIGR01784 T_den_put_tspse conserved hypothetical protein (putative transposase or invertase). Several lines of evidence suggest that members of this family (loaded as a fragment mode model to find part-length matches) are associated with transposition, inversion, or recombination. Members are found in small numbers of genomes, but in large copy numbers in many of those species, including over 30 full length and fragmentary members in Treponema denticola. The strongest similarities are usually within rather than between species. PSI-BLAST shows similarity to proteins designated as possible transposases, DNA invertases (resolvases), and recombinases. In the oral pathogenic spirochete Treponema denticola, full-length members are often found near transporters or other membrane proteins. This family includes members of the putative transposase family pfam04754. 270
18258 273807 TIGR01785 TonB-hemin TonB-dependent heme/hemoglobin receptor family protein. This model represents the TonB-dependent outer membrane heme/hemoglobin receptor/transporter found in bacteria which live in contact with animals (which contain hemoglobin or other heme-bearing globins) or legumes (which contain leghemoglobin). Some species having hits to this model such as Nostoc, Caulobacter and Chlorobium do not have an obvious source of hemoglobin-like proteins in their biological niche and so the possibility exists that they act on some other substance. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins] 665
18259 273808 TIGR01786 TonB-hemlactrns TonB-dependent hemoglobin/transferrin/lactoferrin receptor family protein. This model represents a family of TonB-dependent outer membrane receptor/transporters acting on iron-containing proteins such as hemoglobin, transferrin and lactoferrin. Two subfamily models with a narrower scope are contained within this model, the heme/hemoglobin receptor family protein model (TIGR01785) and the transferrin/lactoferrin receptor family model (TIGR01776). Accessions which score above trusted to this model while not scoring above trusted to the more specific models are most likely to be hemoglobin transporters. Nearly all of the species containing trusted hits to this model have access to hemoglobin, transferrin or lactoferrin or related proteins in their biological niche. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins] 715
18260 273809 TIGR01787 squalene_cyclas squalene/oxidosqualene cyclases. This family of enzymes catalyzes the cyclization of the triterpenes squalene or 2-3-oxidosqualene to a variety of products including hopene, lanosterol, cycloartenol, amyrin, lupeol, and isomultiflorenol. 621
18261 130848 TIGR01788 Glu-decarb-GAD glutamate decarboxylase. This model represents the pyridoxal phosphate-dependent glutamate (alpha) decarboxylase found in bacteria (low and high-GC gram positive, proteobacteria and cyanobacteria), plants, fungi and at least one archaeon (Methanosarcina). The product of the enzyme is gamma-aminobutyrate (GABA). 431
18262 130849 TIGR01789 lycopene_cycl lycopene cyclase. This model represents a family of bacterial lycopene cyclases catalyzing the transformation of lycopene to carotene. These enzymes are found in a limited spectrum of alpha and gamma proteobacteria as well as Flavobacterium. 370
18263 130850 TIGR01790 carotene-cycl lycopene cyclase family protein. This family includes lycopene beta and epsilon cyclases (which form beta and delta carotene, respectively) from bacteria and plants as well as the plant capsanthin/capsorubin and neoxanthin cyclases which appear to have evolved from the plant lycopene cyclases. The plant lycopene epsilon cyclases also transform neurosporene to alpha zeacarotene. 388
18264 130851 TIGR01791 CM_archaeal chorismate mutase, archaeal type. This model represents a clade of archaeal chorismate mutases. Chorismate mutase catalyzes the conversion of chorismate into prephenate which is subsequently converted into either phenylalanine or tyrosine. In Sulfolobus this gene is found as a fusion with prephenate dehydrogenase (although the non-TIGR annotation contains a typographical error indicating it as a dehydratase OMNI|NTL02SS0274) which is the next enzyme in the tyrosine biosynthesis pathway. The Archaeoglobus gene contains an N-terminal prephenate dehydrogenase domain and a C-terminal prephenate dehydratase domain followed by a regulatory amino acid-binding ACT domain. The Thermoplasma volcanium gene is adjacent to prephenate dehydratase. [Amino acid biosynthesis, Aromatic amino acid family] 83
18265 273810 TIGR01792 urease_alph urease, alpha subunit. This model describes the urease alpha subunit UreC (designated beta or B chain, UreB in Helicobacter species). Accessory proteins for incorporation of the nickel cofactor are usually found in addition to the urease alpha, beta, and gamma subunits. The trusted cutoff is set above the scores of many reported fragments and of a putative second urease alpha chain in Streptomyces coelicolor. [Central intermediary metabolism, Nitrogen metabolism] 567
18266 130853 TIGR01793 cit_synth_euk citrate (Si)-synthase, eukaryotic. This model includes both mitochondrial and peroxisomal forms of citrate synthase. Citrate synthase is the entry point to the TCA cycle from acetyl-CoA. Peroxisomal forms, such as SP:P08679 from yeast (recognized by the C-terminal targeting motif SKL) act in the glyoxylate cycle. Eukaryotic homologs excluded by the high trusted cutoff of this model include a Tetrahymena thermophila citrate synthase that doubles as a filament protein, a putative citrate synthase from Plasmodium falciparum (no TCA cycle), and a methylcitrate synthase from Aspergillus nidulans. 427
18267 130854 TIGR01795 CM_mono_cladeE monofunctional chorismate mutase, alpha proteobacterial type. This model represents a small clade of monofunctional (non-fused) chorismate mutases spanning alpha proteobacteria and two actinobacter gram positive species. The alpha proteobacterial members are trusted because the pathways of CM are evident and there is only one plausible CM in the genome. In S. coelicolor, however, there is another apparent monofunctional CM. [Amino acid biosynthesis, Aromatic amino acid family] 94
18268 130855 TIGR01796 CM_mono_aroH monofunctional chorismate mutase, gram positive type, clade 1. This model represents a family of monofunctional (non-fused) chorismate mutases from gram positive bacteria (Firmicutes) and cyanobacteria. Trusted members of the family are found in operons with other enzymes of the chorismate pathways, both up- and downstream of CM (Listeria, Bacillus, Oceanobacillus) or are the sole CM in the genome where the other members of the chorismate pathways are found elsewhere in the genome (Nostoc, Thermosynechococcus). [Amino acid biosynthesis, Aromatic amino acid family] 117
18269 130856 TIGR01797 CM_P_1 chorismate mutase domain of proteobacterial P-protein, clade 1. This model represents the chorismate mutase domain of the gamma and beta proteobacterial "P-protein" which contains an N-terminal chorismate mutase domain and a C-terminal prephenate dehydratase domain. [Amino acid biosynthesis, Aromatic amino acid family] 83
18270 273811 TIGR01798 cit_synth_I citrate synthase I (hexameric type). This model describes one of several distinct but closely homologous classes of citrate synthase, the protein that brings carbon (from acetyl-CoA) into the TCA cycle. This form, class I, is known to be hexameric and allosterically inhibited by NADH in Escherichia coli, Acinetobacter anitratum, Azotobacter vinelandii, Pseudomonas aeruginosa, etc. In most species with a class I citrate synthase, a dimeric class II isozyme is found. The class II enzyme may act primarily on propionyl-CoA to make 2-methylcitrate or be bifunctional, may be found among propionate utilization enzymes, and may be constitutive or induced by propionate. Some members of this model group as class I enzymes, and may be hexameric, but have shown regulatory properties more like class II enzymes. [Energy metabolism, TCA cycle] 412
18271 130858 TIGR01799 CM_T chorismate mutase domain of T-protein. This model represents the chorismate mutase domain of the gamma proteobacterial "T-protein" which consists of an N-terminal chorismate mutase domain and a C-terminal prephenate dehydrogenase domain. [Amino acid biosynthesis, Aromatic amino acid family] 83
18272 130859 TIGR01800 cit_synth_II 2-methylcitrate synthase/citrate synthase II. Members of this family are dimeric enzymes with activity as 2-methylcitrate synthase, citrate synthase, or both. Many Gram-negative species have a hexameric citrate synthase, termed citrate synthase I (TIGR01798). Members of this family (TIGR01800) appear as a second citrate synthase isozyme but typically are associated with propionate metabolism and synthesize 2-methylcitrate from propionyl-CoA; citrate synthase activity may be incidental. A number of species, including Thermoplasma acidophilum, Pyrococcus furiosus, and the Antarctic bacterium DS2-3R have a bifunctional member of this family as the only citrate synthase isozyme. 368
18273 130860 TIGR01801 CM_A chorismate mutase domain of gram positive AroA protein. This model represents a small clade of chorismate mutase domains N-terminally fused to the first enzyme in the chorismate pathway, 2-dehydro-3-deoxyphosphoheptanoate aldolase (DAHP synthetase, AroA) which are found in some gram positive species and Deinococcus. Only in Deinococcus, where this domain is the sole CM domain in the genome can a trusted assignment of function be made. In the other species there is at least one other trusted CM domain present. The similarity between the Deinococcus gene and the others in this clade is sufficiently strong (~44% identity), that the whole clade can be trusted to be functional. The possibility exists, however, that in the gram positive species the fusion to the first enzyme in the pathway has evolved a separate, regulatory role. [Amino acid biosynthesis, Aromatic amino acid family] 102
18274 273812 TIGR01802 CM_pl-yst monofunctional chorismate mutase, eukaryotic type. This model represents the plant and yeast (plastidic) chorismate mutase. These CM's are distinct from other forms by the presence of an extended regulatory domain. [Amino acid biosynthesis, Aromatic amino acid family] 246
18275 130862 TIGR01803 CM-like chorismate mutase related enzymes. This subfamily includes two enzymes which are variants on the mechanism of chorismate mutase and are likely to have evolved from an ancestral chorismate mutase enzyme. 4-amino-4-deoxy-chorismate mutase produces amino-deoxy-prephenate which is subsequently converted to para-dimethylamino-phenylalanine, a component of the natural product pristinamycin. Isochorismate-pyruvate lyase presumably catalyzes the same type of 2+2+2 cyclo-rearrangement as chorismate mutase, but acting on isochorismate, this results in two broken bonds instead of one broken and one made. The product of this reaction is salicylate (2-hydroxy-benzoate) which is also incorporated into various natural products. 82
18276 200131 TIGR01804 BADH betaine-aldehyde dehydrogenase. Under osmotic stress, betaine aldehyde dehydrogenase oxidizes glycine betaine aldehyde into the osmoprotectant glycine betaine, via the second of two oxidation steps from exogenously supplied choline or betaine aldehyde. This choline-glycine betaine synthesis pathway can be found in gram-positive and gram-negative bacteria. In Escherichia coli, betaine aldehyde dehydrogenase (betB) is osmotically co-induced with choline dehydrogenase (betA) in the presence of choline. These dehydrogenases are located in a betaine gene cluster with the upstream choline transporter (betT) and transcriptional regulator (betI). Similar to E.coli, betaine synthesis in Staphylococcus xylosus is also influenced by osmotic stress and the presence of choline with genes localized in a functionally equivalent gene cluster. Organization of the betaine gene cluster in Sinorhizobium meliloti and Bacillus subtilis differs from that of E.coli by the absence of upstream choline transporter and transcriptional regulator homologues. Additionally, B.subtilis co-expresses a type II alcohol dehydrogenase with betaine aldehyde dehydrogenase instead of choline dehydrogenase as in E.coli, St.xylosus, and Si.meliloti. Betaine aldehyde dehydrogenase is a member of the aldehyde dehydrogenase family (pfam00171). [Cellular processes, Adaptations to atypical conditions] 467
18277 130864 TIGR01805 CM_mono_grmpos monofunctional chorismate mutase, gram positive-type, clade 2. This model represents a clade of chorismate mutase proteins/domains from gram positive species. The sequence from Enterococcus is fused to the C-terminus of an apparent acetyltransferase, and the sequence from Clostridium acetobutylicum (but not perfringens) is fused to the N-terminus of shikimate-5-dehydrogenase, another enzyme of the chorismate pathway. All the other members of this clade are mono-functional. Members of this clade from Streptococcus and Lactococcus have been found which represent the sole chorismate mutase domain in their respective genomes which also exhibit evidence of the enzymes of both the upstream and downstream branches of the chorismate pathways. [Amino acid biosynthesis, Aromatic amino acid family] 81
18278 130865 TIGR01806 CM_mono2 chorismate mutase, putative. This model represents a clade of probable chorismate mutases from alpha, beta and gamma proteobacteria as well as Mycobacterium tuberculosis and a clade of nematodes. Although the most likely function for the enzymes represented by this model is as a chorismate mutase, in no species are these enzymes the sole chorismate mutase in the genome. Also, in no case are these enzymes located in a region of the genome proximal to any other enzymes involved in chorismate pathways. Although the Pantoea enzyme has been shown to complement a CM-free mutant of E. coli, this was also shown to be the case with isochorismate-pyruvate lyase which only has a secondary (non-physiologically relevant) chorismate mutase activity. This enzyme is believed to be a homodimer and be localized to the periplasm. [Amino acid biosynthesis, Aromatic amino acid family] 114
18279 130866 TIGR01807 CM_P2 chorismate mutase domain of proteobacterial P-protein, clade 2. This model represents one of two separate clades of the chorismate mutase domain of the gamma and beta and epsilon proteobacterial "P-protein" which contains an N-terminal chorismate mutase domain and a C-terminal prephenate dehydratase domain. It is also found in Aquifex aeolicus. [Amino acid biosynthesis, Aromatic amino acid family] 76
18280 130867 TIGR01808 CM_M_hiGC-arch monofunctional chorismate mutase, high GC gram positive type. This model represents the monofunctional chorismate mutase from high GC gram-positive bacteria and archaea. Trusted annotations from Corynebacterium and Pyrococcus are apparently the sole chorismate mutase enzymes in their respective genomes. This is coupled with the presence in those genomes of the enzymes of the chorismate pathways both up- and downstream of chorismate mutase. [Amino acid biosynthesis, Aromatic amino acid family] 74
18281 273813 TIGR01809 Shik-DH-AROM shikimate-5-dehydrogenase, fungal AROM-type. This model represents a clade of shikimate-5-dehydrogenases found in Corynebacterium, Mycobacteria and fungi. The fungal sequences are pentafunctional proteins known as AroM which contain the central five of the seven steps in the chorismate biosynthesis pathway. The Corynebacterium and Mycobacterial sequences represent the sole shikimate-5-dehydrogenases in species which otherwise have every enzyme of the chorismate biosynthesis pathway. [Amino acid biosynthesis, Aromatic amino acid family] 282
18282 273814 TIGR01810 betA choline dehydrogenase. Choline dehydrogenase catalyzes the conversion of exogenously supplied choline into the intermediate glycine betaine aldehyde, as part of a two-step oxidative reaction leading to the formation of osmoprotectant betaine. This enzymatic system can be found in both gram-positive and gram-negative bacteria. As in Escherichia coli, Staphylococcus xylosus, and Sinorhizobium meliloti, this enzyme is found associated in a transcriptionally co-induced gene cluster with betaine aldehyde dehydrogenase, the second catalytic enzyme in this reaction. Other gram-positive organisms have been shown to employ a different enzymatic system, utilizing a soluble choline oxidase or type III alcohol dehydrogenase instead of choline dehydrogenase. This enzyme is a member of the GMC oxidoreductase family (pfam00732 and pfam05199), sharing a common evolutionary origin and enzymatic reaction with alcohol dehydrogenase. Outgrouping from this model, Caulobacter crescentus shares sequence homology with choline dehydrogenase, yet other genes participating in this enzymatic reaction have not currently been identified. [Cellular processes, Adaptations to atypical conditions] 532
18283 130870 TIGR01811 sdhA_Bsu succinate dehydrogenase or fumarate reductase, flavoprotein subunit, Bacillus subtilis subgroup. This model represents the succinate dehydrogenase flavoprotein subunit as found in the low-GC Gram-positive bacteria and a few other lineages. This enzyme may act in a complete or partial TCA cycle, or act in the opposite direction as fumarate reductase. In some but not all species, succinate dehydrogenase and fumarate reductase may be encoded as separate isozymes. [Energy metabolism, TCA cycle] 603
18284 273815 TIGR01812 sdhA_frdA_Gneg succinate dehydrogenase or fumarate reductase, flavoprotein subunit, Gram-negative/mitochondrial subgroup. This model represents the succinate dehydrogenase flavoprotein subunit as found in Gram-negative bacteria, mitochondria, and some Archaea. Mitochondrial forms interact with ubiquinone and are designated EC 1.3.5.1, but can be degraded to 1.3.99.1. Some isozymes in E. coli and other species run primarily in the opposite direction and are designated fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle] 566
18285 273816 TIGR01813 flavo_cyto_c flavocytochrome c. This model describes a family of redox proteins related to the succinate dehydrogenases and fumarate reductases of E. coli, mitochondria, and other well-characterized systems. A member of this family from Shewanella frigidimarina NCIMB400 is characterized as a water-soluble periplasmic protein with four heme groups, a non-covalently bound FAD, and essentially unidirectional fumarate reductase activity. At least seven distinct members of this family are found in Shewanella oneidensis, a species able to use a wide variety of pathways for respiration. [Energy metabolism, Electron transport] 439
18286 130873 TIGR01814 kynureninase kynureninase. This model describes kynureninase, a pyridoxal-phosphate enzyme. Kynurenine is a Trp breakdown product and a precursor for NAD. In Chlamydia psittaci, an obligate intracellular pathogen, kynureninase makes anthranilate, a Trp precursor, from kynurenine. This counters the tryptophan hydrolysis that occurs in the host cell in response to the pathogen. [Energy metabolism, Amino acids and amines] 406
18287 130874 TIGR01815 TrpE-clade3 anthranilate synthase, alpha proteobacterial clade. This model represents a small clade of anthranilate synthases from alpha proteobacteria and Nostoc (a cyanobacterium). This enzyme is the first step in the pathway for the biosynthesis of tryptophan from chorismate. [Amino acid biosynthesis, Aromatic amino acid family] 717
18288 130875 TIGR01816 sdhA_forward succinate dehydrogenase, flavoprotein subunit, E. coli/mitochondrial subgroup. Succinate dehydrogenase and fumarate reductase are homologous enzymes reversible in principle but favored under different circumstances. This model represents a narrowly defined clade of the succinate dehydrogenase flavoprotein subunit as found in mitochondria, in Rickettsia, in E. coli and other Proteobacteria, and in a few other lineages. However, this model excludes all known fumarate reductases. It also excludes putative succinate dehydrogenases that appear to have diverged before the split between E. coli succinate dehydrogenase and fumarate reductase. [Energy metabolism, TCA cycle] 565
18289 273817 TIGR01817 nifA Nif-specific regulatory protein. This model represents NifA, a DNA-binding regulatory protein for nitrogen fixation. The model produces scores between the trusted and noise cutoffs for a well-described NifA homolog in Aquifex aeolicus (which lacks nitrogenase), for transcriptional activators of alternative nitrogenases (VFe or FeFe instead of MoFe), and truncated forms. [Central intermediary metabolism, Nitrogen fixation, Regulatory functions, DNA interactions] 534
18290 273818 TIGR01818 ntrC nitrogen regulation protein NR(I). This model represents NtrC, a DNA-binding response regulator that is phosphorylated by NtrB and interacts with sigma-54. NtrC usually controls the expression of glutamine synthase, GlnA, and may be called GlnL, GlnG, etc. [Central intermediary metabolism, Nitrogen metabolism, Regulatory functions, DNA interactions, Signal transduction, Two-component systems] 463
18291 130878 TIGR01819 F420_cofD 2-phospho-L-lactate transferase. This model represents LPPG:Fo 2-phospho-L-lactate transferase, which catalyses the fourth step in the biosynthesis of coenzyme F420, a flavin derivative found in methanogens, the Mycobacteria, and several other lineages. This enzyme is characterized so far in Methanococcus jannaschii but appears restricted to F420-containing species and is predicted to carry out the same function in these other species. The clade represented by this model is one of two major divisions of proteins in pfam01933. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 297
18292 273819 TIGR01820 TrpE-arch anthranilate synthase component I, archaeal clade. This model represents an archaeal clade of anthranilate synthase component I enzymes. This enzyme is responsible for the first step of tryptophan biosynthesis from chorismate. The Sulfolobus enzyme has been reported to be part of a gene cluster for Trp biosynthesis [Amino acid biosynthesis, Aromatic amino acid family] 435
18293 273820 TIGR01821 5aminolev_synth 5-aminolevulinic acid synthase. This model represents 5-aminolevulinic acid synthase, an enzyme for one of two routes to the heme precursor 5-aminolevulinate. The protein is a pyridoxal phosphate-dependent enzyme related to 2-amino-3-ketobutyrate CoA transferase and 8-amino-7-oxononanoate synthase. This enzyme appears restricted to the alpha Proteobacteria and mitochondrial derivatives. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 402
18294 130881 TIGR01822 2am3keto_CoA glycine C-acetyltransferase. This model represents a narrowly defined clade of animal and bacterial (almost exclusively Proteobacterial) 2-amino-3-ketobutyrate--CoA ligase, now called glycine C-acetyltransferase. This enzyme can act in threonine catabolism. The closest homolog from Bacillus subtilis, and sequences like it, may be functionally equivalent but were not included in the model because of difficulty in finding reports of function. [Energy metabolism, Amino acids and amines] 393
18295 273821 TIGR01823 PabB-fungal aminodeoxychorismate synthase, fungal clade. This model represents the fungal clade of a para-aminobenzoate synthesis enzyme, aminodeoxychorismate synthase, which acts on chorismate in a pathway that yields PABA, a precursor of folate. 742
18296 130883 TIGR01824 PabB-clade2 aminodeoxychorismate synthase, component I, clade 2. This clade of sequences is more closely related to TrpE (anthranilate synthase, TIGR00564/TIGR01820/TIGR00565) than to the better characterized group of PabB enzymes (TIGR00553/TIGR01823). This clade includes one characterized enzyme from Lactococcus and the conserved function across the clade is supported by these pieces of evidence: 1) all genomes with a member in this clade also have a separate TrpE gene, 2) none of these genomes contain an apparent PabB from any of the other PabB clades, 3) none of these sequences are found in a region of the genome in association with other Trp biosynthesis genes, 4) all of these genomes apparently contain most if not all of the steps of the folate biosynthetic pathway (for which PABA is a precursor). Many of the sequences hit by this model are annotated as TrpE enzymes; however, we believe that all members of this clade are, in fact, PabB. The sequences from Bacillus halodurans and subtilis which score below the trusted cutoff for this model are also likely to be PabB enzymes, but are too closely related to TrpE to be separated at this time. 355
18297 130884 TIGR01825 gly_Cac_T_rel pyridoxal phosphate-dependent acyltransferase, putative. This model represents an enzyme subfamily related to three known enzymes; it appears closest to glycine C-acetyltransferase, shows no overlap with it in species distribution, and may share that function. The three closely related enzymes are glycine C-acetyltransferase (2-amino-3-ketobutyrate coenzyme A ligase), 5-aminolevulinic acid synthase, and 8-amino-7-oxononanoate synthase. All transfer the R-group (acetyl, succinyl, or 6-carboxyhexanoyl) from coenzyme A to an amino acid (Gly, Gly, Ala, respectively), with release of CO2 for the latter two reactions. 385
18298 211689 TIGR01826 CofD_related conserved hypothetical protein, cofD-related. This model represents a subfamily of conserved hypothetical proteins that forms a sister group to the family of CofD (TIGR01819), LPPG:Fo 2-phospho-L-lactate transferase, an enzyme of coenzyme F420 biosynthesis. Both this family and TIGR01819 are within the scope of pfam01933. [Hypothetical proteins, Conserved] 310
18299 130886 TIGR01827 gatC_rel Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase, subunit C, putative. This model represents a small family related to GatC, the third subunit of an enzyme for completing the charging of tRNA(Gln) by amidating the Glu-tRNA(Gln). The few known archaea that contain a member of this family appear to produce Asn-tRNA(Asn) by an analogous amidotransferase reaction. This protein is proposed to substitute for GatC in the charging of both tRNAs. 73
18300 273822 TIGR01828 pyru_phos_dikin pyruvate, phosphate dikinase. This model represents pyruvate,phosphate dikinase, also called pyruvate,orthophosphate dikinase. It is similar in sequence to other PEP-utilizing enzymes. [Energy metabolism, Other] 856
18301 273823 TIGR01829 AcAcCoA_reduct acetoacetyl-CoA reductase. This model represents acetoacetyl-CoA reductase, a member of the family of short-chain-alcohol dehydrogenases. Note that, despite the precision implied by the enzyme name, the reaction of EC 1.1.1.36 is defined more generally as (R)-3-hydroxyacyl-CoA + NADP+ = 3-oxoacyl-CoA + NADPH. Members of this family may act in the biosynthesis of poly-beta-hydroxybutyrate (e.g. Rhizobium meliloti) and related poly-beta-hydroxyalkanoates. Note that the member of this family from Azospirillum brasilense, designated NodG, appears to lack acetoacetyl-CoA reductase activity and to act instead in the production of nodulation factor. This family is downgraded to subfamily for this NodG. Other proteins designated NodG, as from Rhizobium, belong to related but distinct protein families. 242
18302 273824 TIGR01830 3oxo_ACP_reduc 3-oxoacyl-(acyl-carrier-protein) reductase. This model represents 3-oxoacyl-[ACP] reductase, also called 3-ketoacyl-acyl carrier protein reductase, an enzyme of fatty acid biosynthesis. [Fatty acid and phospholipid metabolism, Biosynthesis] 239
18303 273825 TIGR01831 fabG_rel 3-oxoacyl-(acyl-carrier-protein) reductase, putative. This model represents a small, very well conserved family of proteins closely related to the FabG family, TIGR01830, and possibly equal in function. In all completed genomes with a member of this family, a FabG in TIGR01830 is also found. [Fatty acid and phospholipid metabolism, Biosynthesis] 239
18304 188170 TIGR01832 kduD 2-deoxy-D-gluconate 3-dehydrogenase. This model describes 2-deoxy-D-gluconate 3-dehydrogenase (also called 2-keto-3-deoxygluconate oxidoreductase), a member of the family of short-chain-alcohol dehydrogenases (pfam00106). This protein has been characterized in Erwinia chrysanthemi as an enzyme of pectin degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 248
18305 273826 TIGR01833 HMG-CoA-S_euk 3-hydroxy-3-methylglutaryl-CoA-synthase, eukaryotic clade. Hydroxymethylglutaryl(HMG)-CoA synthase is the first step of isopentenyl pyrophosphate (IPP) biosynthesis via the mevalonate pathway. This pathway is found mainly in eukaryotes, but also in archaea and some bacteria. This model is specific for eukaryotes. 457
18306 273827 TIGR01834 PHA_synth_III_E poly(R)-hydroxyalkanoic acid synthase, class III, PhaE subunit. This model represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species. This model must be designated subfamily to indicate the heterogeneity of PHAs. [Cellular processes, Adaptations to atypical conditions, Fatty acid and phospholipid metabolism, Biosynthesis] 320
18307 213655 TIGR01835 HMG-CoA-S_prok 3-hydroxy-3-methylglutaryl CoA synthase, prokaryotic clade. This clade of hydroxymethylglutaryl-CoA (HMG-CoA) synthases is found in a limited spectrum of mostly gram-positive bacteria which make isopentenyl pyrophosphate (IPP) via the mevalonate pathway. This pathway is found primarily in eukaryotes and archaea, but the bacterial homologs are distinct, having apparently diverged after being laterally transferred from an early eukaryote. HMG-CoA synthase is the first step in the pathway and joins acetyl-CoA with acetoacetyl-CoA with the release of one molecule of CoA. The Borrelia sequence may have resulted from a separate lateral transfer event. 379
18308 130895 TIGR01836 PHA_synth_III_C poly(R)-hydroxyalkanoic acid synthase, class III, PhaC subunit. This model represents the PhaC subunit of a heterodimeric form of polyhydroxyalkanoic acid (PHA) synthase. Excepting the PhaC of Bacillus megaterium (which needs PhaR), all members require PhaE (TIGR01834) for activity and are designated class III. This enzyme builds ester polymers for carbon and energy storage that accumulate in inclusions, and both this enzyme and the depolymerase associate with the inclusions. Class III enzymes polymerize short-chain-length hydroxyalkanoates. [Fatty acid and phospholipid metabolism, Biosynthesis] 350
18309 130896 TIGR01837 PHA_granule_1 poly(hydroxyalkanoate) granule-associated protein. This model describes a domain found in some proteins associated with polyhydroxyalkanoate (PHA) granules in a subset of species that have PHA inclusion granules. Included are two tandem proteins of Pseudomonas oleovorans, PhaI and PhaF, and their homologs in related species. PhaF proteins have a low-complexity C-terminal region with repeats similar to AAAKP. [Fatty acid and phospholipid metabolism, Biosynthesis] 118
18310 213656 TIGR01838 PHA_synth_I poly(R)-hydroxyalkanoic acid synthase, class I. This model represents the class I subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs with three to five carbons in the hydroxyacyl backbone into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of cell. [Fatty acid and phospholipid metabolism, Biosynthesis] 532
18311 130898 TIGR01839 PHA_synth_II poly(R)-hydroxyalkanoic acid synthase, class II. This model represents the class II subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs, typically with six to fourteen carbons in the hydroxyacyl backbone into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of cell. [Fatty acid and phospholipid metabolism, Biosynthesis] 560
18312 273828 TIGR01840 esterase_phb esterase, PHB depolymerase family. This model describes a subfamily among lipases of the ab-hydrolase family. This subfamily includes bacterial depolymerases for poly(3-hydroxybutyrate) (PHB) and related polyhydroxyalkanoates (PHA), as well as acetyl xylan esterases, feruloyl esterases, and others from fungi. [Fatty acid and phospholipid metabolism, Degradation] 212
18313 130900 TIGR01841 phasin phasin family protein. This model describes a family of small proteins found associated with inclusions in bacterial cells. Most associate with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB). These are designated granule-associated proteins or phasins; the member from Rhodospirillum rubrum is an activator of polyhydroxybutyrate (PHB) degradation. However, the member from Magnetospirillum sp. AMB-1 is called a magnetic particle membrane-specific GTPase. 88
18314 200134 TIGR01842 type_I_sec_PrtD type I secretion system ABC transporter, PrtD family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. [Protein fate, Protein and peptide secretion and trafficking] 544
18315 130902 TIGR01843 type_I_hlyD type I secretion membrane fusion protein, HlyD family. Type I secretion is an ABC transport process that exports proteins, without cleavage of any signal sequence, from the cytosol to extracellular medium across both inner and outer membranes. The secretion signal is found in the C-terminus of the transported protein. This model represents the adaptor protein between the ATP-binding cassette (ABC) protein of the inner membrane and the outer membrane protein, and is called the membrane fusion protein. This model selects a subfamily closely related to HlyD; it is defined narrowly and excludes, for example, colicin V secretion protein CvaA and multidrug efflux proteins. [Protein fate, Protein and peptide secretion and trafficking] 423
18316 273829 TIGR01844 type_I_sec_TolC type I secretion outer membrane protein, TolC family. Members of this model are outer membrane proteins from the TolC subfamily within the RND (Resistance-Nodulation-cell Division) efflux systems. These proteins, unlike the NodT subfamily, appear not to be lipoproteins. All are believed to participate in type I protein secretion, an ABC transporter system for protein secretion without cleavage of a signal sequence, although they may, like TolC, participate also in the efflux of smaller molecules as well. This family includes the well-documented examples TolC (E. coli), PrtF (Erwinia), and AprF (Pseudomonas aeruginosa). [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Porins] 415
18317 273830 TIGR01845 outer_NodT efflux transporter, outer membrane factor (OMF) lipoprotein, NodT family. Members of this model comprise a subfamily of the Outer Membrane Factor (TCDB 1.B.17) porins. OMF proteins operate in conjunction with a primary transporter of the RND, MFS, ABC, or PET systems, and a MFP (membrane fusion protein) to transport substrates across membranes. The complex thus formed allows transport (export) of various solutes (heavy metal cations; drugs, oligosaccharides, proteins, etc.) across the two envelopes of the Gram-negative bacterial cell envelope in a single energy-coupled step. Current data suggest that the OMF (and not the MFP) is largely responsible for the formation of both the trans-outer membrane and trans-periplasmic channels. The roles played by the MFP have yet to be determined. [Cellular processes, Detoxification, Transport and binding proteins, Porins] 460
18318 273831 TIGR01846 type_I_sec_HlyB type I secretion system ABC transporter, HlyB family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. [Protein fate, Protein and peptide secretion and trafficking] 694
18319 130906 TIGR01847 bacteriocin_sig bacteriocin-type signal sequence. Bacteriocins are bacterial peptide products toxic to closely related bacteria. This model represents the N-terminal region up to the GG cleavage motif. Processing to remove this bacteriocin leader peptide occurs together with export by an ABC transporter. Note: because this model is so small (15 amino acids), it may have many spurious high-scoring matches to unrelated proteins, even with fairly stringent cutoff scores. The most likely true positives are small proteins of Gram-positive bacteria, matching regions that start within the first 15 amino acids, and encoded near bacteriocin transport family proteins (TIGR01000, TIGR01193). 15
18320 130907 TIGR01848 PHA_reg_PhaR polyhydroxyalkanoate synthesis repressor PhaR. Poly-B-hydroxyalkanoates are lipidlike carbon/energy storage polymers found in granular inclusions. PhaR is a regulatory protein found in general near other proteins associated with polyhydroxyalkanoate (PHA) granule biosynthesis and utilization. It is found to be a DNA-binding homotetramer that is also capable of binding short chain hydroxyalkanoic acids and PHA granules. PhaR may regulate the expression of itself, of the phasins that coat granules, and of enzymes that direct carbon flux into polymers stored in granules. The C-terminal region is poorly conserved in this family and is not part of this model.//GO terms added 12/6/04 [SS] [Fatty acid and phospholipid metabolism, Biosynthesis, Regulatory functions, DNA interactions] 107
18321 130908 TIGR01849 PHB_depoly_PhaZ polyhydroxyalkanoate depolymerase, intracellular. This model represents an intracellular depolymerase for polyhydroxyalkanoate (PHA), a carbon and energy storing polyester that accumulates in granules in many bacterial species when carbon sources are abundant but other nutrients are limiting. This family is named for PHAs generally, rather than polyhydroxybutyrate (PHB) specifically as in Ralstonia eutropha H16, to avoid overcalling chemical specificity in other species. Note that this family lacks the classic GXSXG lipase motif and instead shows weak similarity to some [Fatty acid and phospholipid metabolism, Degradation] 406
18322 273832 TIGR01850 argC N-acetyl-gamma-glutamyl-phosphate reductase, common form. This model represents the more common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and the gap architecture in a multiple sequence alignment. Bacterial members of this family tend to be found within Arg biosynthesis operons. [Amino acid biosynthesis, Glutamate family] 346
18323 273833 TIGR01851 argC_other N-acetyl-gamma-glutamyl-phosphate reductase, uncommon form. This model represents the less common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and gap architecture in a multiple sequence alignment. [Amino acid biosynthesis, Glutamate family] 310
18324 188173 TIGR01852 lipid_A_lpxA acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase. This model describes LpxA, an enzyme for the biosynthesis of lipid A, a component of lipopolysaccharide (LPS) in the outer membrane outer leaflet of most Gram-negative bacteria. Some differences are found between lipid A of different species, but this protein represents the first step (from UDP-N-acetyl-D-glucosamine) and appears to be conserved in function. Proteins from this family contain many copies of the bacterial transferase hexapeptide repeat (pfam00132). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 254
18325 273834 TIGR01853 lipid_A_lpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase LpxD. This model describes LpxD, an enzyme for the biosynthesis of lipid A, a component of lipopolysaccharide (LPS) in the outer membrane outer leaflet of most Gram-negative bacteria. Some differences are found between lipid A of different species. This protein represents the third step from UDP-N-acetyl-D-glucosamine. The group added at this step generally is 14:0(3-OH) (myristate) but may vary; in Aquifex it appears to be 16:0(3-OH) (palmitate). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 324
18326 273835 TIGR01854 lipid_A_lpxH UDP-2,3-diacylglucosamine diphosphatase. This model represents LpxH, UDP-2,3-diacylglucosamine hydrolase, an essential enzyme in E. coli that catalyzes the fourth step in lipid A biosynthesis. Note that Pseudomonas aeruginosa has both a member of this family that shares this function and a more distant homolog, designated LpxH2, that does not. Many species that produce lipid A lack an lpxH gene in this family; some of those species have an lpxH2 gene instead, although its function is unknown. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 231
18327 273836 TIGR01855 IMP_synth_hisH imidazole glycerol phosphate synthase, glutamine amidotransferase subunit. This model represents the glutamine amidotransferase subunit (or domain, in eukaryotic systems) of imidazole glycerol phosphate synthase. This subunit catalyzes step 5 of histidine biosynthesis from PRPP. The other subunit, the cyclase, catalyzes step 6. [Amino acid biosynthesis, Histidine family] 196
18328 273837 TIGR01856 hisJ_fam histidinol phosphate phosphatase, HisJ family. This model represents the histidinol phosphate phosphatase HisJ of Bacillus subtilis, and related proteins from a number of species within a larger family of phosphatases in the PHP hydrolase family. HisJ catalyzes the penultimate step of histidine biosynthesis but shows no homology to the functionally equivalent sequence in E. coli, a domain of the bifunctional HisB protein. Note, however, that many species have two members and that Clostridium perfringens, predicted not to make histidine, has five members of this family; this family is designated subfamily rather than equivalog to indicate that members may not all act as HisJ. 253
18329 130916 TIGR01857 FGAM-synthase phosphoribosylformylglycinamidine synthase, clade II. This model represents a single-molecule form of phosphoribosylformylglycinamidine synthase, also called FGAM synthase, an enzyme of purine de novo biosynthesis. This model represents a second clade of these enzymes found in Clostridia, Bifidobacteria and Streptococcus species. This enzyme performs the fourth step in IMP biosynthesis (the precursor of all purines) from PRPP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 1239
18330 130917 TIGR01858 tag_bisphos_ald class II aldolase, tagatose bisphosphate family. This model describes tagatose-1,6-bisphosphate aldolases, and perhaps other closely related class II aldolases. This tetrameric, Zn2+-dependent enzyme is related to the class II fructose bisphosphate aldolase; fructose 1,6-bisphosphate and tagatose 1,6 bisphosphate differ only in chirality at C4. 282
18331 130918 TIGR01859 fruc_bis_ald_ fructose-1,6-bisphosphate aldolase, class II, various bacterial and amitochondriate protist. This model represents one of several subtypes of the class II fructose-1,6-bisphosphate aldolase, an enzyme of glycolysis. The subtypes are split into several models to allow separation of a family of tagatose bisphosphate aldolases. This form is found in Gram-positive bacteria, a variety of Gram-negative, and in amitochondriate protists. The class II enzymes share homology with tagatose bisphosphate aldolase but not with class I aldolase. [Energy metabolism, Glycolysis/gluconeogenesis] 282
18332 130919 TIGR01860 VNFD nitrogenase vanadium-iron protein, alpha chain. This model represents the alpha chain of the vanadium-containing component of the vanadium-iron nitrogenase compound I. The complex also includes a second alpha chain, two beta chains and two delta chains. Compound I interacts with compound II, also known as the iron protein, which transfers electrons to compound I where the catalysis occurs. [Central intermediary metabolism, Nitrogen fixation] 461
18333 130920 TIGR01861 ANFD nitrogenase iron-iron protein, alpha chain. This model represents the all-iron variant of the nitrogenase component I alpha chain. Molybdenum-iron and vanadium iron forms are also found. The complete complex contains two alpha chains, two beta chains and two delta chains. The component I associates with component II also known as the iron protein which serves to provide electrons for component I. [Central intermediary metabolism, Nitrogen fixation] 513
18334 273838 TIGR01862 N2-ase-Ialpha nitrogenase component I, alpha chain. This model represents the alpha chain of all three varieties (Mo-Fe, V-Fe, and Fe-Fe) of component I of nitrogenase. [Central intermediary metabolism, Nitrogen fixation] 443
18335 273839 TIGR01863 cas_Csd1 CRISPR-associated protein Cas8c/Csd1, subtype I-C/DVULG. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein that tends to be found near CRISPR repeats of the DVULG subtype of CRISPR/Cas locus. We designate this family Csd1 (CRISPR/Cas Subtype DVULG protein 1). The species range for this subtype, so far, is exclusively bacterial and mesophilic, although CRISPR loci in general are particularly common among archaea and thermophilic bacteria. In a few species (Xanthomonas axonopodis pv. citri str. 306 and Streptococcus mutans UA159), homology to this protein family is split across two tandem genes; the trusted cutoff to this family is set low enough to capture at least the longer of the two. 584
18336 273840 TIGR01865 cas_Csn1 CRISPR subtype II/NMENI RNA-guided endonuclease Cas9/Csn1. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas locus. The species range so far for this protein is animal pathogens and commensals only. 805
18337 273841 TIGR01866 cas_Csn2 CRISPR type II-A/NMEMI-associated protein Csn2. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas loci. The species range so far for this subtype is animal pathogens and commensals only. This protein is present in some but not all NMENI CRISPR/Cas loci. 222
18338 273842 TIGR01868 casD_Cas5e CRISPR-associated protein Cas5/CasD, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is part of the ECOLI subtype CRISPR/Cas locus, and now characterized as part of the CASCADE complex of that system. It shares a small N-terminal homology region with members of several other CRISPR/Cas subtypes, and we view the families that share this region as being Cas5. 216
18339 273843 TIGR01869 casC_Cse4 CRISPR-associated protein Cas7/Cse4/CasC, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum and is part of the Ecoli subtype of CRISPR/Cas locus. It is designated Cse4, for CRISPR/Cas Subtype Ecoli protein 4. 325
18340 273844 TIGR01870 cas_TM1810_Csm2 CRISPR type III-A/MTUBE-associated protein Csm2. These proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. This model represents the C-terminal domain of a minor family of CRISPR-associated protein from the Mtube subtype of CRISPR/Cas locus. The family is designated Csm2, for CRISPR/Cas Subtype Mtube Protein 2. 97
18341 233610 TIGR01873 cas_CT1978 CRISPR-associated endoribonuclease Cas2, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor branch of the Cas2 family of CRISPR-associated endonuclease, whereas most Cas2 proteins are modeled instead by TIGR01573. This form of Cas2 is characteristic for the Ecoli subtype of CRISPR/Cas locus. 87
18342 273845 TIGR01874 cas_cas5a CRISPR-associated protein Cas5, subtype I-A/APERN. This model represents a minor family of CRISPR-associated (Cas) protein. These proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. This family belongs to a set of several Cas proteins, one each for a number of different CRISPR/Cas subtypes, that share a region of N-terminal sequence similarity modeled by TIGR02593. The family is designated Cas5a, for CRISPR-associated protein Cas5, Apern subtype. 172
18343 273846 TIGR01875 cas_MJ0381 CRISPR-associated autoregulator DevR family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This model represents one such family, represented by MJ0381 of Methanococcus jannaschii. This family includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development. [Regulatory functions, DNA interactions] 237
18344 273847 TIGR01876 cas_Cas5d CRISPR-associated protein Cas5, subtype I-C/DVULG. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum. This family belongs to a set of several Cas protein families, one each for a number of different CRISPR/Cas subtypes, that share a region of N-terminal sequence similarity modeled by TIGR02593. This family represents the Dvulg subtype of CRISPR/Cas locus. 203
18345 273848 TIGR01877 cas_cas6 CRISPR-associated endoribonuclease Cas6. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This broadly distributed, highly divergent Cas family is now characterized as an endoribonuclease that generates guide RNAs for host defense against phage and other invaders. The family contains a C-terminal motif GXGXXXXXGXG, where each X between two Gly is hydrophobic and the spacer XXXXX contains (usually) one Arg or Lys. The seed alignment for the current version of this model has gappy columns removed. Members of this protein family are found associated with several different CRISPR/cas system subtypes, and consequently we designate this family Cas6. 199
18346 273849 TIGR01878 cas_Csa5 CRISPR type I-A/APERN-associated protein Csa5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor family of Cas protein found in the (all archaeal) APERN subtype of CRISPR/Cas locus, so the family is designated Csa5, for CRISPR/Cas Subtype Protein 5. 97
18347 200138 TIGR01879 hydantase amidase, hydantoinase/carbamoylase family. Enzymes in this subfamily hydrolyze the amide bonds of compounds containing carbamoyl groups or hydantoin rings. These enzymes are members of the broader family of amidases represented by pfam01546. 400
18348 273850 TIGR01880 Ac-peptdase-euk N-acyl-L-amino-acid amidohydrolase. This model represents a family of eukaryotic N-acyl-L-amino-acid amidohydrolases active on fatty acid and acetyl amides of L-amino acids. 400
18349 273851 TIGR01881 cas_Cmr5 CRISPR type III-B/RAMP module-associated protein Cmr5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species as part of the 6-gene CRISPR RAMP module. 127
18350 130937 TIGR01882 peptidase-T peptidase T. This model represents a tripeptide aminopeptidase known as Peptidase T, which has a substrate preference for hydrophobic peptides. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 410
18351 162579 TIGR01883 PepT-like peptidase T-like protein. This model represents a clade of enzymes closely related to Peptidase T, an aminotripeptidase found in bacteria. This clade consists of gram positive bacteria of which several additionally contain a Peptidase T gene. 361
18352 273852 TIGR01884 cas_HTH CRISPR locus-related DNA-binding protein. Most but not all examples of this family are associated with CRISPR loci, a combination of DNA repeats and characteristic proteins encoded near the repeat cluster. The C-terminal region of this protein is homologous to DNA-binding helix-turn-helix domains with predicted transcriptional regulatory activity. [Regulatory functions, DNA interactions] 203
18353 273853 TIGR01885 Orn_aminotrans ornithine aminotransferase. This model describes the final step in the biosynthesis of ornithine from glutamate via the non-acetylated pathway. Ornithine amino transferase takes L-glutamate 5-semialdehyde and makes it into ornithine, which is used in the urea cycle, as well as in the biosynthesis of arginine. This model includes low-GC bacteria and eukaryotic species. The genes from two species are annotated as putative acetylornithine aminotransferases - one from Porphyromonas gingivalis (OMNI|PG1271), and the other from Staphylococcus aureus (OMNI|SA0170). After homology searching using BLAST it was determined that these two sequences were most closely related to ornithine aminotransferases. This model's seed includes one characterized hit, from Bacillus subtilis (SP|P38021). 401
18354 130941 TIGR01886 dipeptidase dipeptidase PepV. This model represents a small clade of dipeptidase enzymes which are members of the larger M25 subfamily of metalloproteases. Two characterized enzymes are included in the seed. One, from Lactococcus lactis has been shown to act on a wide range of dipeptides, but not larger peptides. The enzyme from Lactobacillus delbrueckii was originally characterized as a Xaa-His dipeptidase, specifically a carnosinase (beta-Ala-His) by complementation of an E. coli mutant. Further study, including the crystallization of the enzyme, has shown it to also be a non-specific dipeptidase. This group also includes enzymes from Streptococcus and Enterococcus. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 466
18355 273854 TIGR01887 dipeptidaselike dipeptidase, putative. This model represents a clade of probable zinc dipeptidases, closely related to the characterized non-specific dipeptidase, PepV. Many enzymes in this clade have been given names including the terms "Xaa-His" and "carnosinase" due to the early mis-characterization of the Lactobacillus delbrueckii PepV enzyme. These names are likely too specific. 447
18356 273855 TIGR01888 cas_cmr3 CRISPR type III-B/RAMP module-associated protein Cmr3. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family is found in at least ten different archaeal and bacterial species as part of the CRISPR RAMP module but is not a member of the RAMP superfamily itself. A typical example is TM1793 from Thermotoga maritima. 333
18357 130944 TIGR01889 Staph_reg_Sar staphylococcal accessory regulator family. This model represents a family of transcriptional regulatory proteins in Staphylococcus aureus and Staphylococcus epidermidis. Some members contain two tandem copies of this region. This family is related to the MarR transcriptional regulator family described by pfam01047. [Regulatory functions, DNA interactions] 109
18358 273856 TIGR01890 N-Ac-Glu-synth amino-acid N-acetyltransferase. This model represents a clade of amino-acid N-acetyltransferases acting mainly on glutamate in the first step of the "acetylated" ornithine biosynthesis pathway. For this reason it is also called N-acetylglutamate synthase. The enzyme may also act on aspartate. [Amino acid biosynthesis, Glutamate family] 429
18359 273857 TIGR01891 amidohydrolases amidohydrolase. This model represents a subfamily of amidohydrolases which are a subset of those sequences detected by pfam01546. Included within this group are hydrolases of hippurate (N-benzylglycine), indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. These hydrolases are of the carboxypeptidase-type, most likely utilizing a zinc ion in the active site. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 363
18360 130947 TIGR01892 AcOrn-deacetyl acetylornithine deacetylase (ArgE). This model represents a clade of acetylornithine deacetylases from proteobacteria. This enzyme is the final step of the "acetylated" ornithine biosynthesis pathway. The enzyme is closely related to dapE, succinyl-diaminopimelate desuccinylase, and outside of this clade annotation is very inaccurate as to which function should be ascribed to genes. [Amino acid biosynthesis, Glutamate family] 364
18361 273858 TIGR01893 aa-his-dipept Xaa-His dipeptidase. This model represents a clade of dipeptidase enzymes, many of which are specific for carnosine (beta-alanyl-histidine). This enzyme is found broadly in bacteria and at least one archaeon (Methanosarcina). In most species there is only one sequence hitting this model, while Bacteroides thetaiotaomicron, Chlorobium tepidum and Clostridium perfringens have two each and Fusobacterium nucleatum has three. These may indicate that there is a broader substrate range than just carnosine in these (and other) species. 8/19/03 GO terms added [SS] 477
18362 273859 TIGR01894 cas_TM1795_cmr1 CRISPR type III-B/RAMP module RAMP protein Cmr1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents the region of strongest conservation, the N-terminal half, of one such family, represented by TM1795 from Thermotoga maritima. This protein is the first of a set of six genes, mostly from the RAMP superfamily, that we designated the CRISPR-associated RAMP module. 154
18363 273860 TIGR01895 cas_Cas5t CRISPR-associated protein Cas5, subtype I-B/TNEAP. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family is represented by TM1800 from Thermotoga maritima. It is related to TIGR01868 (CRISPR-associated protein, CT1976 family). 215
18364 273861 TIGR01896 cas_AF1879 CRISPR-associated protein Cas4/Csa1, subtype I-A/APERN. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model describes a particularly strongly conserved family found so far only in the APERN subtype of CRISPR/Cas loci and represented by AF1879 from Archaeoglobus fulgidus. This family has four perfectly preserved Cys residues. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa1, for CRISPR/Cas Subtype Protein 1. 271
18365 273862 TIGR01897 cas_MJ1666 CRISPR-associated protein, MJ1666 family. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model describes a Cas protein about 400 residues in length, found mostly in the Archaea but also in Aquifex. 410
18366 213662 TIGR01898 cas_TM1791_cmr6 CRISPR type III-B/RAMP module RAMP protein Cmr6. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family, represented by TM1791 of Thermotoga maritima, is designated Cmr6 [sic], for CRISPR/Cas Ramp Module protein 6. This family is both closely related to and frequently encoded next to the TM1792 family of Cas proteins described by TIGR01867. The two proteins are fused in an example from Methanopyrus kandleri. 176
18367 273863 TIGR01899 cas_TM1807_csm5 CRISPR type III-A/MTUBE-associated RAMP protein Csm5. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm5, for CRISPR/cas Subtype Mtube, protein 5. 365
18368 273864 TIGR01900 dapE-gram_pos succinyl-diaminopimelate desuccinylase. This model represents a clade of succinyl-diaminopimelate desuccinylases from actinobacteria (high-GC gram positives), delta-proteobacteria and aquificales and is based on the characterization of the enzyme from Corynebacterium glutamicum. This enzyme is involved in the biosynthesis of lysine, and is related to the enzyme acetylornithine deacetylase and other amidases and peptidases found within pfam01546. Other sequences included in the seed of this model were assessed to confirm that 1) the related genes DapC (succinyl-diaminopimelate transaminase) and DapD (2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase) are also found in the genome, 2) each is found only once in those genomes, 3) the lysine biosynthesis pathway is complete and 4) the direct (TIGR03540 or TIGR03542) or acetylated (GenProp0787) aminotransferase pathways are absent in these genomes. Additionally, a number of the seed members are observed adjacent to either DapC or DapD (often as a divergon with a putative promoter site between them). [Amino acid biosynthesis, Aspartate family] 351
18369 273865 TIGR01901 adhes_NPXG filamentous hemagglutinin family N-terminal domain. This model represents a conserved domain found near the N-terminus of a number of large, repetitive bacterial proteins, including many proteins of over 2500 amino acids. Members generally have a signal sequence, then an intervening region, then the region described by this model. Following this region, proteins typically have regions rich in repeats but may show no homology between the repeats of one member and the repeats of another. A number of the members of this family have been designated adhesins, filamentous haemagglutinins, heme/hemopexin-binding protein, etc. 79
18370 130957 TIGR01902 dapE-lys-deAc N-acetyl-ornithine/N-acetyl-lysine deacetylase. This clade of mainly archaeal and related bacterial species contains two characterized enzymes, a deacetylase with specificity for both N-acetyl-ornithine and N-acetyl-lysine from Thermus, which is found within a lysine biosynthesis operon, and a fusion protein with acetyl-glutamate kinase (an enzyme of ornithine biosynthesis) from Lactobacillus. It is possible that all of the sequences within this clade have dual specificity, or that a mix of specificities have evolved within this clade. 336
18371 273866 TIGR01903 cas5_csm4 CRISPR type III-A/MTUBE-associated RAMP protein Csm4. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm4, for CRISPR/cas Subtype Mtube, protein 4. 297
18372 273867 TIGR01904 GSu_C4xC__C2xCH Geobacter sulfurreducens CxxxxCH...CXXCH domain. This domain occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches ProSite motif PS00190, the cytochrome c family heme-binding site signature, suggesting 42
18373 213663 TIGR01905 paired_CXXCH_1 doubled CXXCH domain. This model represents a domain of about 41 amino acids that contains, among other motifs, two copies of the motif CXXCH associated with heme binding. Almost every member of this family has at least three copies of this domain (at least six copies of CXXCH) and is predicted to be a high molecular weight c-type cytochrome. Members are found mostly in species of Shewanella, Geobacter, and Vibrio. 41
18374 273868 TIGR01906 integ_TIGR01906 integral membrane protein TIGR01906. This model represents a family of highly hydrophobic, uncharacterized predicted integral membrane proteins found almost entirely in low-GC Gram-positive bacteria, although a member is also found in the early-branching bacterium Aquifex aeolicus. 207
18375 273869 TIGR01907 casE_Cse3 CRISPR-associated protein Cas6/Cse3/CasE, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by CT1974 from Chlorobium tepidum, is found in the Ecoli subtype of CRISPR/Cas regions and is designated Cse3 (CRISPR/Cas Subtype Ecoli protein 3). The representative of this family from Thermus thermophilus HB8 (TTHB192) has been crystallized and found to have a structure consisting of two domains with opposing parallel beta-sheets known as a beta-sheet platform. This structure is similar to those found in the Sex-lethal protein and poly(A)-binding protein. This structure is consistent with an RNA-binding function. 206
18376 162595 TIGR01908 cas_CXXC_CXXC CRISPR-associated protein Cas8b1/Cst1, subtype I-B/TNEAP. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This (revised) model describes a conserved region from an otherwise highly divergent protein found in the Tneap subtype of CRISPR/Cas regions. This Cys-rich region features two motifs of CXXC. 309
18377 213664 TIGR01909 C_GCAxxG_C_C C_GCAxxG_C_C family probable redox protein. This model represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters. 120
18378 273870 TIGR01910 DapE-ArgE acetylornithine deacetylase or succinyl-diaminopimelate desuccinylase. This group of sequences contains annotations for both acetylornithine deacetylase and succinyl-diaminopimelate desuccinylase, but does not contain any members with experimental characterization. Bacillus, Staphylococcus and Sulfolobus species contain multiple hits to this subfamily and each may have a separate activity. Determining which is which must await further laboratory research. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 375
18379 188182 TIGR01911 HesB_rel_seleno HesB-like selenoprotein. This model represents a family of small proteins related to HesB and its close homologs, which are likely to be involved in iron-sulfur cluster assembly (See TIGR00049 and pfam01521). Several members are selenoproteins, with a TGA codon and Sec residue that aligns to the conserved Cys of the HesB domain. A variable Cys/Ser/Gly-rich C-terminal region is not included in the seed alignment and model. [Unknown function, General] 92
18380 162597 TIGR01912 TatC-Arch Twin arginine targeting (Tat) protein translocase TatC, Archaeal clade. This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR00945) represents the bacterial clade of this family. TatC is often found (in bacteria) in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409). 237
18381 273871 TIGR01913 bet_lambda phage recombination protein Bet. This model represents the phage recombination protein Bet from a number of phage, including phage lambda. All members of this family are found in phage genomes or in putative prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions] 180
18382 273872 TIGR01914 cas_Csa4 CRISPR-associated protein Cas8a2/Csa4, subtype I-A/APERN. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein that tends to be found near CRISPR repeats. The species range for this family, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa4, for CRISPR/Cas Subtype Protein 4. 354
18383 273873 TIGR01915 npdG NADPH-dependent F420 reductase. This model represents a subset of a parent family described by pfam03807. Unlike the parent family, members of this family are found only in species with evidence of coenzyme F420. All members of this family are believed to act as NADPH-dependent F420 reductase. [Energy metabolism, Electron transport] 219
18384 273874 TIGR01916 F420_cofE coenzyme F420-0:L-glutamate ligase. This model represents an enzyme of coenzyme F(420) biosynthesis, as catalyzed by MJ0768 of Methanococcus jannaschii and by the N-terminal half of FbiB of Mycobacterium bovis strain BCG. Note that only two glutamates are ligated in M. jannaschii, but five to six in the Mycobacterium lineage. In M. jannaschii, CofE catalyzes the GTP-dependent addition of two L-glutamates. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 243
18385 130972 TIGR01917 gly_red_sel_B glycine reductase, selenoprotein B. Glycine reductase is a complex with two selenoprotein subunits, A and B. This model represents the glycine reductase selenoprotein B. Closely related to it, but excluded from this model, are selenoprotein B subunits of betaine reductase and sarcosine reductase. All contain selenocysteine incorporated during translation at a specific UGA codon. 431
18386 130973 TIGR01918 various_sel_PB selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family. This model represents selenoprotein B of glycine reductase, sarcosine reductase, betaine reductase, D-proline reductase, and perhaps others. This model is built in fragment mode to assist in recognizing fragmentary translations. All members are expected to contain an internal TGA codon, encoding selenocysteine, which may be misinterpreted as a stop codon. 431
18387 273875 TIGR01919 hisA-trpF 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase/N-(5'phosphoribosyl)anthranilate isomerase. This model represents a bifunctional protein possessing both hisA (1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase) and trpF (N-(5'phosphoribosyl)anthranilate isomerase) activities. Thus, it is involved in both the histidine and tryptophan biosynthetic pathways. Enzymes with this property have been described only in the Actinobacteria (High-GC gram-positive). The enzyme is closely related to the monofunctional HisA proteins (TIGR00007) and in Actinobacteria, the classical monofunctional TrpF is generally absent. 243
18388 273876 TIGR01920 Shik_kin_archae shikimate kinase. This model represents the shikimate kinase (SK) gene found in archaea which is only distantly related to homoserine kinase (thrB) and not at all to the bacterial SK enzyme. The SK from M. jannaschii has been overexpressed in E. coli and characterized. SK catalyzes the fifth step of the biosynthesis of chorismate from D-erythrose-4-phosphate and phosphoenolpyruvate. [Amino acid biosynthesis, Aromatic amino acid family] 261
18389 273877 TIGR01921 DAP-DH diaminopimelate dehydrogenase. This model represents the diaminopimelate dehydrogenase enzyme which provides an alternate (shortcut) route of lysine biosynthesis in Corynebacterium, Bacteroides, Porphyromonas and scattered other species. The enzyme from Corynebacterium glutamicum has been crystallized and characterized. 324
18390 273878 TIGR01922 purO_arch IMP cyclohydrolase. This model represents IMP cyclohydrolase, the final step in the biosynthesis of inosine monophosphate (IMP) in archaea. In bacteria this step is catalyzed by a bifunctional enzyme (purH). 199
18391 162605 TIGR01923 menE O-succinylbenzoate-CoA ligase. This model represents an enzyme, O-succinylbenzoate-CoA ligase, which is involved in the fourth step of the menaquinone biosynthesis pathway. O-succinylbenzoate-CoA ligase, together with menB (naphthoate synthase), takes 2-succinylbenzoate and converts it into 1,4-di-hydroxy-2-naphthoate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 436
18392 273879 TIGR01924 rsbW_low_gc serine-protein kinase RsbW. This model describes the anti-sigma B factor also known as serine-protein kinase RsbW. Sigma B controls the general stress regulon in B. subtilis and is activated by cell stresses such as stationary phase and heat shock. RsbW binds to sigma B and prevents formation of the transcription complex at the promoter. RsbV (anti-anti-sigma factor) binds to RsbW to inhibit association with sigma B; however, RsbW can phosphorylate RsbV, causing dissociation of the RsbV/RsbW complex. Low ATP level or environmental stress causes the dephosphorylation of RsbV. 159
18393 130980 TIGR01925 spIIAB anti-sigma F factor. This model describes the SpoIIAB anti-sigma F factor. Sigma F regulates spore development in B. subtilis. SpoIIAB binds to sigma F, preventing formation of the transcription complex at the promoter. SpoIIAA (anti-anti-sigma F factor) binds to SpoIIAB to inhibit association with sigma F; however, SpoIIAB can phosphorylate SpoIIAA, causing dissociation of the SpoIIAA/B complex. The SpoIIE phosphatase dephosphorylates SpoIIAA. [Regulatory functions, Protein interactions, Cellular processes, Sporulation and germination] 137
18394 130981 TIGR01926 peroxid_rel uncharacterized peroxidase-related enzyme. This protein family with length of about 200 amino acids. One member, from Myxococcus xanthus, is a selenoprotein, with an otherwise conserved Cys replaced by Sec. This family is drawn narrowly enough to suggest that These proteins contain a domain described by TIGR00778, with a CxxCxxxHxxxxxxxG motif. Some members of that family are known to act as peroxidases or correlate with resistance to oxidative stress. 177
18395 273880 TIGR01927 menC_gamma/gm+ o-succinylbenzoate synthase. This model describes the enzyme o-succinylbenzoic acid synthetase (menC) that is involved in one of the steps of the menaquinone biosynthesis pathway. It takes SHCHC and makes it into 2-succinylbenzoate. Included in this model are gamma proteobacteria and archaea. Many of the com-names of the proteins identified by the model are identified as O-succinylbenzoyl-CoA synthase in error. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 307
18396 213667 TIGR01928 menC_lowGC/arch o-succinylbenzoate synthase. This model describes the enzyme o-succinylbenzoic acid synthetase (menC) that is involved in one of the steps of the menaquinone biosynthesis pathway. It takes SHCHC and makes it into 2-succinylbenzoate. Included in this model are low GC gram positive bacteria and archaea. Also included in the seed and in the model are enzymes with the com-name of N-acylamino acid racemase (or the more general term, racemase / racemase family), which refers to the enzyme's industrial application as racemases, and not to its biological function as o-succinylbenzoic acid synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 324
18397 200143 TIGR01929 menB naphthoate synthase (dihydroxynaphthoic acid synthetase). This model represents an enzyme, naphthoate synthase (dihydroxynaphthoic acid synthetase), which is involved in the fifth step of the menaquinone biosynthesis pathway. Together with o-succinylbenzoate-CoA ligase (menE: TIGR01923), this enzyme takes 2-succinylbenzoate and converts it into 1,4-di-hydroxy-2-naphthoate. Included above the trusted cutoff are two enzymes from Arabidopsis thaliana and one from Staphylococcus aureus which are identified as putative enoyl-CoA hydratase/isomerases. These enzymes group with the naphthoate synthases when building a tree and when doing BLAST searches. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 259
18398 273881 TIGR01930 AcCoA-C-Actrans acetyl-CoA acetyltransferases. This model represents a large family of enzymes which catalyze the thiolysis of a linear fatty acid CoA (or acetoacetyl-CoA) using a second CoA molecule to produce acetyl-CoA and a CoA-ester product two carbons shorter (or, alternatively, the condensation of two molecules of acetyl-CoA to produce acetoacetyl-CoA and CoA). This enzyme is also known as "thiolase", "3-ketoacyl-CoA thiolase", "beta-ketothiolase" and "Fatty oxidation complex beta subunit". When catalyzing the degradative reaction on fatty acids the corresponding EC number is 2.3.1.16. The condensation reaction corresponds to 2.3.1.9. Note that the enzymes which catalyze the condensation are generally not involved in fatty acid biosynthesis, which is carried out by a decarboxylating condensation of acetyl and malonyl esters of acyl carrier proteins. Rather, this activity may produce acetoacetyl-CoA for pathways such as IPP biosynthesis in the absence of sufficient fatty acid oxidation. [Fatty acid and phospholipid metabolism, Other] 385
18399 273882 TIGR01931 cysJ sulfite reductase [NADPH] flavoprotein, alpha-component. This model describes an NADPH-dependent sulfite reductase flavoprotein subunit. Most members of this family are found in Cys biosynthesis gene clusters. The closest homologs below the trusted cutoff are designated as subunits of nitrate reductase. 597
18400 273883 TIGR01932 hflC HflC protein. HflK and HflC are paralogs encoded by tandem genes in Proteobacteria, spirochetes, and some other bacterial lineages. The HflKC complex is anchored in the membrane and exposed to the periplasm. The complex is not active as a protease, but rather binds to and appears to modulate the ATP-dependent protease FtsH. The overall function of HflKC is not fully described. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Regulatory functions, Protein interactions] 317
18401 130988 TIGR01933 hflK HflK protein. HflK and HflC are paralogs encoded by tandem genes in Proteobacteria, spirochetes, and some other bacterial lineages. The HflKC complex is anchored in the membrane and exposed to the periplasm. The complex is not active as a protease, but rather binds to and appears to modulate the ATP-dependent protease FtsH. The overall function of HflKC is not fully described. Regulation of FtsH by HflKC appears to be negative (SS 8/27/03). 261
18402 273884 TIGR01934 MenG_MenH_UbiE ubiquinone/menaquinone biosynthesis methyltransferases. This model represents a family of methyltransferases involved in the biosynthesis of menaquinone and ubiquinone. Some members such as the UbiE enzyme from E. coli are believed to act in both pathways, while others may act in only the menaquinone pathway. These methyltransferases are members of the UbiE/CoQ family of methyltransferases (pfam01209) which also contains ubiquinone methyltransferases and other methyltransferases. Members of this clade include a wide distribution of bacteria and eukaryotes, but no archaea. An outgroup for this clade is provided by the phosphatidylethanolamine methyltransferase (EC 2.1.1.17) from Rhodobacter sphaeroides. Note that a number of non-orthologous genes which are members of pfam03737 have been erroneously annotated as MenG methyltransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 223
18403 130990 TIGR01935 NOT-MenG RraA family. The E. coli member of this family has been characterized as a regulator of RNase E and its crystal structure has been analyzed. This model was initially classified as a "hypothetical equivalog" expressing the tentative hypothesis that all members might have the same function as the E. coli enzyme. Considering the second clade of enterobacterial sequences within this family, that appears to be less tenable. The function of these sequences outside of the narrow RraA equivalog model (TIGR02998) remains obscure. All of these were initially annotated as MenG, AKA S-adenosylmethionine: 2-demethylmenaquinone methyltransferase (EC 2.1.-.-). See the references characterizing this as a case of transitive annotation error in the case of the E. coli protein. [Unknown function, General] 150
18404 273885 TIGR01936 nqrA NADH:ubiquinone oxidoreductase, Na(+)-translocating, A subunit. This model represents the NqrA subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 447
18405 130992 TIGR01937 nqrB NADH:ubiquinone oxidoreductase, Na(+)-translocating, B subunit. This model represents the NqrB subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 413
18406 273886 TIGR01938 nqrC NADH:ubiquinone oxidoreductase, Na(+)-translocating, C subunit. This model represents the NqrC subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 251
18407 130994 TIGR01939 nqrD NADH:ubiquinone oxidoreductase, Na(+)-translocating, D subunit. This model represents the NqrD subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 207
18408 130995 TIGR01940 nqrE NADH:ubiquinone oxidoreductase, Na(+)-translocating, E subunit. This model represents the NqrE subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 200
18409 130996 TIGR01941 nqrF NADH:ubiquinone oxidoreductase, Na(+)-translocating, F subunit. This model represents the NqrF subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 405
18410 130997 TIGR01942 pcnB poly(A) polymerase. This model describes the pcnB family of poly(A) polymerases (also known as plasmid copy number protein). These enzymes sequentially add adenosine nucleotides to the 3' end of RNAs, targeting them for degradation by the cell. This was originally described for anti-sense RNAs, but was later demonstrated for mRNAs as well. Members of this family are as yet limited to the gamma- and beta-proteobacteria, with putative members in the Chlamydiaceae and spirochetes. This family has homology to tRNA nucleotidyltransferase (cca). 410
18411 130998 TIGR01943 rnfA electron transport complex, RnfABCDGE type, A subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the A subunit. [Energy metabolism, Electron transport] 190
18412 273887 TIGR01944 rnfB electron transport complex, RnfABCDGE type, B subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the B subunit. [Energy metabolism, Electron transport] 165
18413 273888 TIGR01945 rnfC electron transport complex, RnfABCDGE type, C subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the C subunit. [Energy metabolism, Electron transport] 435
18414 131001 TIGR01946 rnfD electron transport complex, RnfABCDGE type, D subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the D subunit. [Energy metabolism, Electron transport] 327
18415 273889 TIGR01947 rnfG electron transport complex, RnfABCDGE type, G subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the G subunit. [Energy metabolism, Electron transport] 186
18416 162619 TIGR01948 rnfE electron transport complex, RnfABCDGE type, E subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the E subunit. [Energy metabolism, Electron transport] 196
18417 273890 TIGR01949 AroFGH_arch predicted phospho-2-dehydro-3-deoxyheptonate aldolase. This model represents a clade of sequences related to fructose-bisphosphate aldolase (class I, included within pfam01791). The members of this clade appear to be phospho-2-dehydro-3-deoxyheptonate aldolases. This enzyme is the first step of the chorismate biosynthesis pathway. Evidence for this assignment is based on gene clustering and phylogenetic profiling. A group of species lack members of the three other types of phospho-2-dehydro-3-deoxyheptonate aldolase (represented by TIGR00034, TIGR01358 and TIGR01361), and also apparently lack the well-known forms of step 2 (3-dehydroquinate synthase), but contain all other steps of the pathway: Desulfovibrio, Aquifex, Archaeoglobus, Halobacterium, Methanopyrus, Methanococcus and Methanobacterium. The clade of sequences represented here is limited strictly to this group of organisms. In Desulfovibrio, Aquifex, Archaeoglobus, Halobacterium and Methanosarcina the genes found by this model are clustered with other genes from the chorismate, phenylalanine, tyrosine and tryptophan biosynthesis pathways. In addition, these genes in Desulfovibrio, Archaeoglobus, Halobacterium, Methanosarcina and Methanopyrus are adjacent to a gene which hits pfam01959 which also has the property of having members only in those species which lack steps 1 and 2. Together these two genes appear to perform the synthesis of 3-dehydroquinate. It is presumed that the substrates and the chemical transformations involved are identical, but this has not yet been proven experimentally. 258
18418 131005 TIGR01950 SoxR redox-sensitive transcriptional activator SoxR. SoxR is a MerR-family homodimeric transcription factor with a 2Fe-2S cluster in each monomer. The motif CIGCGCxxxxxC is conserved. Oxidation of the iron-sulfur cluster activates SoxR. The physiological role in E. coli is response to oxidative stress. It is activated by superoxide, singlet oxygen, nitric oxide (NO), and hydrogen peroxide. In E. coli, SoxR increases expression of transcription factor SoxS; different downstream targets may exist in other species. [Cellular processes, Detoxification, Regulatory functions, DNA interactions] 142
18419 273891 TIGR01951 nusB transcription antitermination factor NusB. A transcription antitermination complex active in many bacteria was designated N-utilization substance (Nus) in E. coli because of its interaction with phage lambda protein N. This model represents NusB. Other components are NusA and NusG. NusE is, in fact, ribosomal protein S10. [Transcription, Transcription factors] 129
18420 273892 TIGR01952 nusA_arch NusA family KH domain protein, archaeal. This model represents a family of archaeal proteins found in a single copy per genome. It contains two KH domains (pfam00013) and is most closely related to the central region of bacterial NusA, a transcription termination factor named for its interaction with phage lambda protein N in E. coli. The proteins required for antitermination by N include NusA, NusB, nusE (ribosomal protein S10), and nusG. This system, on the whole, appears not to be present in the Archaea. 141
18421 273893 TIGR01953 NusA transcription termination factor NusA. This model describes NusA, or N utilization substance protein A, a bacterial transcription termination factor. It binds to RNA polymerase alpha subunit and promotes termination at certain RNA hairpin structures. It is named for the interaction in E. coli of phage lambda antitermination protein N with the N-utilization substance, consisting of NusA, NusB, NusE (ribosomal protein S10), and nusG. This model represents a region of NusA shared in all bacterial forms, and including an S1 (pfam00575) and a KH (pfam00013) RNA binding domains. Proteobacterial forms have an additional C-terminal region, not included in this model, with two repeats of 50-residue domain rich in acidic amino acids. [Transcription, Transcription factors] 341
18422 273894 TIGR01954 nusA_Cterm_rpt transcription termination factor NusA, C-terminal duplication. NusA is a bacterial transcription termination factor. It is named for its interaction with phage lambda protein N, as part of the N utilization substance. Some members of the NusA family have a long C-terminal extension. This model represents an acidic 50-residue region found in two copies toward the C-terminus of most Proteobacterial NusA proteins, spaced about 26 residues apart. Analogous C-terminal extensions in some other bacterial lineages lack apparent homology but appear similarly acidic. [Transcription, Transcription factors] 50
18423 131010 TIGR01955 RfaH transcription elongation factor/antiterminator RfaH. This model represents the transcription elongation factor/antiterminator, RfaH. This protein is most closely related to the transcriptional termination/antitermination protein NusG (TIGR00922) and contains the KOW motif (pfam00467). This protein appears to be limited to the gamma proteobacteria. In E. coli, this gene appears to control the expression of haemolysin, sex factor and lipopolysaccharide genes. [Transcription, Transcription factors] 159
18424 273895 TIGR01956 NusG_myco NusG family protein. This model represents a family of Mycoplasma proteins orthologous to the bacterial transcription termination/antitermination factor NusG. These sequences from Mycoplasma are notably diverged (long branches in a Neighbor-joining phylogenetic tree) from the bacterial species. And although NusA and ribosomal protein S10 (NusE) appear to be present, NusB may be absent in Mycoplasmas calling into question whether these species have a functional Nus system including this family as a member. 258
18425 273896 TIGR01957 nuoB_fam NADH-quinone oxidoreductase, B subunit. This model describes the B chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoB. The quinone is plastoquinone in Synechocystis (where the chain is designated K) and in chloroplast, where NADH may be replaced by NADPH. In the methanogenic archaeal genus Methanosarcina, NADH is replaced by F420H2. [Energy metabolism, Electron transport] 145
18426 131013 TIGR01958 nuoE_fam NADH-quinone oxidoreductase, E subunit. This model describes the E chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoE. This model does not identify proteins from chloroplast and cyanobacteria. [Energy metabolism, Electron transport] 148
18427 131014 TIGR01959 nuoF_fam NADH-quinone oxidoreductase, F subunit. This model describes the F chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoF. This family does not have any members in chloroplast or cyanobacteria, where the quinone may be plastoquinone and NADH may be replaced by NADPH, nor in Methanosarcina, where NADH is replaced by F420H2. [Energy metabolism, Electron transport] 411
18428 131015 TIGR01960 ndhF3_CO2 NAD(P)H dehydrogenase, subunit NdhF3 family. This family represents a subfamily of NAD(P)H dehydrogenase subunit 5, or ndhF. It is restricted to two paralogs in each completed cyanobacterial genome, in which several subtypes of ndhF are found. Included in this family is NdhF3, shown to play a role in high-affinity CO2 uptake in Synechococcus sp. PCC7002. In all cases, neighboring genes include a paralog of ndhD but do not include other NAD(P)H dehydrogenase subunits. Instead, genes related to CO2 uptake tend to be found nearby. 606
18429 273897 TIGR01961 NuoC_fam NADH (or F420H2) dehydrogenase, subunit C. This model describes the C subunit of the NADH dehydrogenase complex I in bacteria, as well as many instances of the corresponding mitochondrial subunit (NADH dehydrogenase subunit 9) and of the F420H2 dehydrogenase in Methanosarcina. Complex I contains subunits designated A-N. This C subunit often occurs as a fusion protein with the D subunit. This model excludes the NAD(P)H and plastoquinone-dependent form of chloroplasts and [Energy metabolism, Electron transport] 121
18430 273898 TIGR01962 NuoD NADH dehydrogenase I, D subunit. This model recognizes specifically the D subunit of NADH dehydrogenase I complex. It excludes the related chain of NAD(P)H-quinone oxidoreductases from chloroplast and Synechocystis, where the quinone may be plastoquinone rather than ubiquinone. This subunit often appears as a C/D fusion. [Energy metabolism, Electron transport] 386
18431 211705 TIGR01963 PHB_DH 3-hydroxybutyrate dehydrogenase. This model represents a subfamily of the short chain dehydrogenases. Characterized members so far are 3-hydroxybutyrate dehydrogenases and are found in species that accumulate ester polymers called polyhydroxyalkanoic acids (PHAs) under certain conditions. Several members of the family are from species not known to accumulate PHAs, including Oceanobacillus iheyensis and Bacillus subtilis. However, polymer formation is not required for there to be a role for 3-hydroxybutyrate dehydrogenase; it may be that members of this family have the same function in those species. 255
18432 213671 TIGR01964 chpXY CO2 hydration protein. This small family of proteins includes paralogs ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations. [Energy metabolism, Photosynthesis] 367
18433 273899 TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion. 99
18434 131021 TIGR01966 RNasePH ribonuclease PH. This bacterial enzyme, ribonuclease PH, performs the final 3'-trimming and modification of tRNA precursors. This model is restricted absolutely to bacteria. Related families outside the model include proteins described as probable exosome complex exonucleases (rRNA processing) and polyribonucleotide nucleotidyltransferases (mRNA degradation). The most divergent member within the family is RNase PH from Deinococcus radiodurans. [Transcription, RNA processing] 236
18435 273900 TIGR01967 DEAH_box_HrpA RNA helicase HrpA. This model represents HrpA, one of two related but uncharacterized DEAH-box ATP-dependent helicases in many Proteobacteria and a few high-GC Gram-positive bacteria. HrpA is about 1300 amino acids long, while its paralog HrpB, also uncharacterized, is about 800 amino acids long. Related characterized eukaryotic proteins are RNA helicases associated with pre-mRNA processing. The HrpA/B homolog from Borrelia is 500 amino acids shorter but appears to be derived from HrpA rather than HrpB. [Unknown function, Enzymes of unknown specificity] 1283
18436 131023 TIGR01968 minD_bact septum site-determining protein MinD. This model describes the bacterial and chloroplast form of MinD, a multifunctional cell division protein that guides correct placement of the septum. The homologous archaeal MinD proteins, with many archaeal genomes having two or more forms, are described by a separate model. [Cellular processes, Cell division] 261
18437 131024 TIGR01969 minD_arch cell division ATPase MinD, archaeal. This model represents the archaeal branch of the MinD family. MinD, a weak ATPase, works in bacteria with MinC as a generalized cell division inhibitor and, through interaction with MinE, prevents septum placement at inappropriate sites. Often several members of this family are found in archaeal genomes, and the function is uncharacterized. More distantly related proteins include ParA chromosome partitioning proteins. The exact roles of the various archaeal MinD homologs are unknown. 251
18438 273901 TIGR01970 DEAH_box_HrpB ATP-dependent helicase HrpB. This model represents HrpB, one of two related but uncharacterized DEAH-box ATP-dependent helicases in many Proteobacteria, but also in a few species of other lineages. The member from Rhizobium meliloti has been designated HelO. HrpB is typically about 800 residues in length, while its paralog HrpA (TIGR01967), also uncharacterized, is about 1300 amino acids long. Related characterized eukaryotic proteins are RNA helicases associated with pre-mRNA processing. [Unknown function, Enzymes of unknown specificity] 819
18439 273902 TIGR01971 NuoI NADH-quinone oxidoreductase, chain I. This model represents the I subunit (one of 14: A->N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This model excludes "I" subunits from the closely related F420H2 dehydrogenase and formate hydrogenlyase complexes. [Energy metabolism, Electron transport] 122
18440 273903 TIGR01972 NDH_I_M proton-translocating NADH-quinone oxidoreductase, chain M. This model describes the 13th (based on E. coli) structural gene, M, of bacterial NADH dehydrogenase I, as well as chain 4 of the corresponding mitochondrial complex I and of the chloroplast NAD(P)H dehydrogenase complex. [Energy metabolism, Electron transport] 481
18441 273904 TIGR01973 NuoG NADH-quinone oxidoreductase, chain G. This model represents the G subunit (one of 14: A->N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This model excludes related subunits from formate dehydrogenase complexes. [Energy metabolism, Electron transport] 602
18442 273905 TIGR01974 NDH_I_L proton-translocating NADH-quinone oxidoreductase, chain L. This model describes the 12th (based on E. coli) structural gene, L, of bacterial NADH dehydrogenase I, as well as chain 5 of the corresponding mitochondrial complex I and subunit 5 (or F) of the chloroplast NAD(P)H-plastoquinone dehydrogenase complex. [Energy metabolism, Electron transport] 609
18443 131030 TIGR01975 isoAsp_dipep isoaspartyl dipeptidase IadA. The L-isoaspartyl derivative of Asp arises non-enzymatically over time as a form of protein damage. In this isomerization, the connectivity of the polypeptide changes to pass through the beta-carboxyl of the side chain. Much but not all of this damage can be repaired by protein-L-isoaspartate (D-aspartate) O-methyltransferase. This model describes the isoaspartyl dipeptidase IadA, apparently one of two such enzymes in E. coli, an enzyme that degrades isoaspartyl dipeptides and may unblock degradation of proteins that cannot be repaired. This model also describes closely related proteins from other species (e.g. Clostridium perfringens, Thermoanaerobacter tengcongensis) that we assume to be equivalent in function. This family shows homology to dihydroorotases. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 389
18444 273906 TIGR01976 am_tr_V_VC1184 cysteine desulfurase family protein, VC1184 subfamily. This model describes a subfamily of probable pyridoxal phosphate-dependent enzymes in the aminotransferase class V family (pfam00266). The most closely related characterized proteins are active as cysteine desulfurases, selenocysteine lyases, or both; some are involved in FeS cofactor biosynthesis and are designated NifS. An active site Cys residue present in those sequences, in motifs resembling GHHC or GSAC, is not found in this family. The function of members of this family is unknown, but seems unlikely to be that of an aminotransferase. [Unknown function, Enzymes of unknown specificity] 397
18445 131032 TIGR01977 am_tr_V_EF2568 cysteine desulfurase family protein. This model describes a subfamily of probable pyridoxal phosphate-dependent enzymes in the aminotransferase class V family. Related families contain members active as cysteine desulfurases, selenocysteine lyases, or both. The members of this family form a distinct clade and all are shorter at the N-terminus. The function of this subfamily is unknown. [Unknown function, Enzymes of unknown specificity] 376
18446 273907 TIGR01978 sufC FeS assembly ATPase SufC. SufC is part of the SUF system, shown in E. coli to consist of six proteins and believed to act in Fe-S cluster formation during oxidative stress. SufC forms a complex with SufB and SufD. SufC belongs to the ATP-binding cassette transporter family (pfam00005) but is no longer thought to be part of a transporter. The complex is reported as cytosolic or associated with the membrane. The SUF system also includes a cysteine desulfurase (SufS, enhanced by SufE) and a probable iron-sulfur cluster assembly scaffold protein, SufA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 243
18447 131034 TIGR01979 sufS cysteine desulfurases, SufS family. This model represents a subfamily of NifS-related cysteine desulfurases involved in FeS cluster formation needed for nitrogen fixation among other vital functions. Many cysteine desulfurases are also active as selenocysteine lyase and/or cysteine sulfinate desulfinase. This subfamily is associated with the six-gene SUF system described in E. coli and Erwinia as an FeS cluster formation system during oxidative stress. The active site Cys in this subfamily resembles GHHC with one or both His conserved. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 403
18448 131035 TIGR01980 sufB FeS assembly protein SufB. This protein, SufB, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 448
18449 273908 TIGR01981 sufD FeS assembly protein SufD. This protein, SufD, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. SufB and SufD are homologous. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 275
18450 273909 TIGR01982 UbiB 2-polyprenylphenol 6-hydroxylase. This model represents the enzyme (UbiB) which catalyzes the first hydroxylation step in the ubiquinone biosynthetic pathway in bacteria. It is believed that the reaction is 2-polyprenylphenol -> 6-hydroxy-2-polyprenylphenol. This model finds hits primarily in the proteobacteria. The gene is also known as AarF in certain species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 437
18451 273910 TIGR01983 UbiG ubiquinone biosynthesis O-methyltransferase. This model represents an O-methyltransferase believed to act at two points in the ubiquinone biosynthetic pathway in bacteria (UbiG) and fungi (COQ3). A separate methylase (MenG/UbiE) catalyzes the single C-methylation step. The most commonly used names for genes in this family do not indicate whether this gene is an O-methyl, or C-methyl transferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 224
18452 273911 TIGR01984 UbiH 2-polyprenyl-6-methoxyphenol 4-hydroxylase. This model represents the FAD-dependent monooxygenase responsible for the second hydroxylation step in the aerobic ubiquinone biosynthetic pathway. The scope of this model is limited to the proteobacteria. This family is closely related to the UbiF hydroxylase which catalyzes the final hydroxylation step. The enzyme has also been named VisB due to a mutant VISible light sensitive phenotype. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 382
18453 131040 TIGR01985 phasin_2 phasin. This model represents a family of granule-associated proteins (phasins) which are part of the polyhydroxyalkanoate synthesis machinery. This family is based on a pair of characterized genes from Methylobacterium extorquens. Members of the seed for this model all contain the rest of the components believed to be essential for this system (see the "polyhydroxyalkanoic acid synthesis" property in the GenPropDB). Members of this family score below trusted to another phasin model, TIGR01841, and together may represent a subfamily or broader equivalog. 112
18454 273912 TIGR01986 glut_syn_euk glutathione synthetase, eukaryotic. This model represents the eukaryotic glutathione synthetase, which shows little resemblance to the analogous enzyme of Gram-negative bacteria (TIGR01380). In the Kinetoplastida, trypanothione replaces glutathione, but can be made from glutathione; a sequence from Leishmania is not included in the seed, is highly divergent, and therefore scores between the trusted and noise cutoffs. 472
18455 131042 TIGR01987 HI0074 nucleotidyltransferase substrate binding protein, HI0074 family. The member of this family from Haemophilus influenzae, HI0074, has been shown by crystal structure to resemble nucleotidyltransferase substrate binding proteins. It forms a complex with HI0073, encoded by the adjacent gene and containing a nucleotidyltransferase nucleotide binding domain (pfam01909). 123
18456 273913 TIGR01988 Ubi-OHases Ubiquinone biosynthesis hydroxylase, UbiH/UbiF/VisC/COQ6 family. This model represents a family of FAD-dependent hydroxylases (monooxygenases) which are all believed to act in the aerobic ubiquinone biosynthesis pathway. A separate set of hydroxylases, as yet undiscovered, are believed to be active under anaerobic conditions. In E. coli three enzyme activities have been described, UbiB (which acts first at position 6, see TIGR01982), UbiH (which acts at position 4) and UbiF (which acts at position 5). UbiH and UbiF are similar to one another and form the basis of this subfamily. Interestingly, E. coli contains another hydroxylase gene, called visC, that is highly similar to UbiF, adjacent to UbiH and, when mutated, results in a phenotype similar to that of UbiH (which has also been named visB). Several other species appear to have three homologs in this family, although they assort themselves differently on phylogenetic trees (e.g. Xylella and Mesorhizobium) making it difficult to ascribe a specific activity to each one. Eukaryotes appear to have only a single homolog in this subfamily (COQ6) which complements UbiH, but also possess a non-orthologous gene, COQ7 which complements UbiF. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 385
18457 273914 TIGR01989 COQ6 ubiquinone biosynthesis monooxygenase COQ6. This model represents the monooxygenase responsible for the 4-hydroxylation of the phenol ring in the aerobic biosynthesis of ubiquinone. 437
18458 213672 TIGR01990 bPGM beta-phosphoglucomutase. This model represents the beta-phosphoglucomutase enzyme which catalyzes the interconversion of beta-D-glucose-1-phosphate and beta-D-glucose-6-phosphate. The 6-phosphate is capable of non-enzymatic anomerization (alpha <-> beta) while the 1-phosphate is not. A separate enzyme is responsible for the isomerization of the alpha anomers. Beta-D-glucose-1-phosphate results from the phosphorolysis of maltose (2.4.1.8), trehalose (2.4.1.64) or trehalose-6-phosphate (2.4.1.216). Alternatively, these reactions can be run in the synthetic direction to create the disaccharides. All sequenced genomes which contain a member of this family also appear to contain at least one putative maltose or trehalose phosphorylase. Three species, Lactococcus, Enterococcus and Neisseria appear to contain a pair of paralogous beta-PGM's. Beta-phosphoglucomutase is a member of the haloacid dehalogenase superfamily of hydrolase enzymes. These enzymes are characterized by a series of three catalytic motifs positioned within an alpha-beta (Rossmann) fold. beta-PGM contains an inserted alpha helical domain in between the first and second conserved motifs and thus is a member of subfamily IA of the superfamily. The third catalytic motif comes in three variants, the third of which, containing a conserved DD or ED, is the only one found here as well as in several other related enzymes (TIGR01509). The enzyme from L. lactis has been extensively characterized including a remarkable crystal structure which traps the pentacoordinate transition state. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 185
18459 273915 TIGR01991 HscA Fe-S protein assembly chaperone HscA. The Heat Shock Cognate proteins HscA and HscB act together as chaperones. HscA resembles DnaK but belongs in a separate clade. The apparent function is to aid assembly of iron-sulfur cluster proteins. Homologs from Buchnera and Wolbachia are clearly in the same clade but are highly derived and score lower than some examples of DnaK. [Protein fate, Protein folding and stabilization] 599
18460 273916 TIGR01992 PTS-IIBC-Tre PTS system, trehalose-specific IIBC component. This model represents the fused enzyme II B and C components of the trehalose-specific PTS sugar transporter system. Trehalose is converted to trehalose-6-phosphate in the process of translocation into the cell. These transporters lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). The exceptions to this rule are Staphylococci and Streptococci which contain their own A domain as a C-terminal fusion. This family is closely related to the sucrose transporting PTS IIBC enzymes and the B and C domains of each are described by subfamily-domain level TIGRFAMs models (TIGR00826 and TIGR00852, respectively). In E. coli, B. subtilis and P. fluorescens the presence of this gene is associated with the presence of trehalase which degrades T6P to glucose and glucose-6-P. Trehalose may also be transported (in Salmonella) via the mannose PTS or galactose permease systems, or (in Sinorhizobium, Thermococcus and Sulfolobus, for instance) by ABC transporters. 462
18461 273917 TIGR01993 Pyr-5-nucltdase pyrimidine 5'-nucleotidase. This family of proteins includes the SDT1/SSM1 gene from yeast which has been shown to code for a pyrimidine (UMP/CMP) 5'nucleotidase. The family spans plants, fungi and a small number of bacteria. These enzymes are members of the haloacid dehalogenase (HAD) superfamily of hydrolases, specifically the IA subfamily (variant 3, TIGR01509). 183
18462 273918 TIGR01994 SUF_scaf_2 SUF system FeS assembly protein, NifU family. Three iron-sulfur cluster assembly systems are known so far. ISC is broadly distributed while NIF tends to be associated with nitrogenase in nitrogen-fixing bacteria. The most recently described is SUF, believed to be important to maintain the function during aerobic stress of enzymes with labile Fe-S clusters. It is fairly widely distributed. This family represents one of two different proteins proposed to act as a scaffold on which the Fe-S cluster is built and from which it is transferred. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 137
18463 273919 TIGR01995 PTS-II-ABC-beta PTS system, beta-glucoside-specific IIABC component. This model represents a family of PTS enzyme II proteins in which all three domains are found in the same polypeptide chain and which appear to have a broad specificity for beta-glucosides including salicin (beta-D-glucose-1-salicylate) and arbutin (Hydroquinone-O-beta-D-glucopyranoside). These are distinct from the closely related sucrose-specific and trehalose-specific PTS transporters. 610
18464 131051 TIGR01996 PTS-II-BC-sucr PTS system, sucrose-specific IIBC component. This model represents the fused enzyme II B and C components of the sucrose-specific PTS sugar transporter system. Sucrose is converted to sucrose-6-phosphate in the process of translocation into the cell. Some of these transporters lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). The exceptions to this rule are Staphylococci, Streptococci, Lactococci, Lactobacilli, etc. which contain their own A domain as a C-terminal fusion. This family is closely related to the trehalose transporting PTS IIBC enzymes and the B and C domains of each are described by subfamily-domain level TIGRFAMs models (TIGR00826 and TIGR00852, respectively). 461
18465 131052 TIGR01997 sufA_proteo FeS assembly scaffold SufA. This model represents the SufA protein of the SUF system of iron-sulfur cluster biosynthesis. This system performs FeS biosynthesis even during oxidative stress and tends to be absent in obligate anaerobic and microaerophilic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 107
18466 273920 TIGR01998 PTS-II-BC-nag PTS system, N-acetylglucosamine-specific IIBC component. This model represents the combined B and C domains of the PTS transport system enzyme II specific for N-acetylglucosamine transport. Many of the genes in this family also include an A domain as part of the same polypeptide and thus should be given the name "PTS system, N-acetylglucosamine-specific IIABC component". This family is most closely related to the glucose-specific PTS enzymes. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 475
18467 188192 TIGR01999 iscU FeS cluster assembly scaffold IscU. This model represents IscU, a homolog of the N-terminal region of NifU, an Fe-S cluster assembly protein found mostly in nitrogen-fixing bacteria. IscU is considered part of the IscSUA-hscAB-fdx system of Fe-S assembly, whereas NifU is found in nitrogenase-containing (nitrogen-fixing) species. A NifU-type protein is also found in Helicobacter and Campylobacter. IscU and NifU are considered scaffold proteins on which Fe-S clusters are assembled before transfer to apoproteins. This model excludes true NifU proteins as in Klebsiella pneumoniae and Anabaena sp. as well as archaeal homologs. It includes largely proteobacterial and eukaryotic forms. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 124
18468 273921 TIGR02000 NifU_proper Fe-S cluster assembly protein NifU. Three different but partially homologous Fe-S cluster assembly systems have been described: Isc, Suf, and Nif. The latter is associated with donation of an Fe-S cluster to nitrogenase in a number of nitrogen-fixing species. NifU, described here, consists of an N-terminal domain (pfam01592) and a C-terminal domain (pfam01106). Homologs with an equivalent domain architecture from Helicobacter and Campylobacter, however, are excluded from this model by a high trusted cutoff. The model, therefore, is specific for NifU involved in nitrogenase maturation. The related model TIGR01999 homologous to the N-terminus of this model describes IscU from the Isc system as in E. coli, Saccharomyces cerevisiae, and Homo sapiens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 290
18469 273922 TIGR02001 gcw_chp conserved hypothetical protein, proteobacterial. This model represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis, Ralstonia solanacearum, and Colwellia psychrerythraea, usually as part of a paralogous family. The function is unknown. 243
18470 273923 TIGR02002 PTS-II-BC-glcB PTS system, glucose-specific IIBC component. This model represents the combined B and C domains of the PTS transport system enzyme II specific for glucose transport. Many of the genes in this family also include an A domain as part of the same polypeptide and thus should be given the name "PTS system, glucose-specific IIABC component" while the B. subtilis enzyme also contains an enzyme III domain which appears to act independently of the enzyme II domains. This family is most closely related to the N-acetylglucosamine-specific PTS enzymes (TIGR01998). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 502
18471 131058 TIGR02003 PTS-II-BC-unk1 PTS system, IIBC component. This model represents a family of fused B and C components of PTS enzyme II. This clade is a member of a larger family which contains enzyme II's specific for a variety of sugars including glucose (TIGR02002) and N-acetylglucosamine (TIGR01998). None of the members of this clade have been experimentally characterized. This clade includes sequences from Streptococcus and Enterococcus which also include a C-terminal A domain as well as Bacillus and Clostridium which do not. In nearly all cases, these species also contain an authentic glucose-specific PTS transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 548
18472 273924 TIGR02004 PTS-IIBC-malX PTS system, maltose and glucose-specific IIBC component. This model represents a family of PTS enzyme II fused B and C components including and most closely related to the MalX maltose and glucose-specific transporter of E. coli. A pair of paralogous genes from E. coli strain CFT073 score between trusted and noise and may have diverged sufficiently to have an altered substrate specificity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 517
18473 273925 TIGR02005 PTS-IIBC-alpha PTS system, alpha-glucoside-specific IIBC component. This model represents a family of fused PTS enzyme II B and C domains. A gene from Clostridium has been partially characterized as a maltose transporter, while genes from Fusobacterium and Klebsiella have been proposed to transport the five non-standard isomers of sucrose. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 524
18474 131061 TIGR02006 IscS cysteine desulfurase IscS. This model represents IscS, one of several cysteine desulfurases from a larger protein family designated (misleadingly, in this case) class V aminotransferases. IscS is one of at least 6 enzymes characteristic of the IscSUA-hscAB-fdx system of iron-sulfur cluster assembly. Scoring almost as well as proteobacterial sequences included in the model are mitochondrial cysteine desulfurases, apparently from an analogous system in eukaryotes. The sulfur, taken from cysteine, may be used in other systems as well, such as tRNA base modification and biosynthesis of other cofactors. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Protein synthesis, tRNA and rRNA base modification] 402
18475 131062 TIGR02007 fdx_isc ferredoxin, 2Fe-2S type, ISC system. This family consists of proteobacterial ferredoxins associated with and essential to the ISC system of 2Fe-2S cluster assembly. This family is closely related to (but excludes) eukaryotic (mitochondrial) adrenodoxins, which are ferredoxins involved in electron transfer to P450 cytochromes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 110
18476 273926 TIGR02008 fdx_plant ferredoxin [2Fe-2S]. This model represents single domain 2Fe-2S (also called plant type) ferredoxins. In general, these occur as single domain proteins or with a chloroplast transit peptide. Species tend to be photosynthetic, but several forms may occur in one species and individually may not be associated with photosynthesis. Halobacterial forms differ somewhat in architecture; they score between trusted and noise cutoffs. Sequences scoring below the noise cutoff tend to be ferredoxin-related domains of larger proteins. 97
18477 213673 TIGR02009 PGMB-YQAB-SF beta-phosphoglucomutase family hydrolase. This subfamily model groups together three clades: the characterized beta-phosphoglucomutases (including those from E.coli, B.subtilis and L.lactis, TIGR01990), a clade of putative bPGM's from mycobacteria and a clade including the uncharacterized E.coli and H.influenzae yqaB genes which may prove to be beta-mutases of a related 1-phosphosugar. All of these are members of the larger Haloacid dehalogenase (HAD) subfamily IA and include the "variant 3" glu-asp version of the third conserved HAD domain (TIGR01509). 185
18478 273927 TIGR02010 IscR iron-sulfur cluster assembly transcription factor IscR. This model describes IscR, an iron-sulfur binding transcription factor of the ISC iron-sulfur cluster assembly system. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions] 135
18479 213674 TIGR02011 IscA iron-sulfur cluster assembly protein IscA. This model represents the IscA component of the ISC system for iron-sulfur cluster assembly. The ISC system consists of IscRASU, HscAB and an Isc-specific ferredoxin. IscA previously was believed to act as a scaffold and now is seen as an iron donor protein. This clade is limited to the proteobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 105
18480 162659 TIGR02012 tigrfam_recA protein RecA. This model describes orthologs of the recA protein. RecA promotes hybridization of homologous regions of DNA. A segment of ssDNA can be hybridized to another ssDNA region, or to a dsDNA region. ATP is hydrolyzed in the process. Part of the SOS response, it is regulated by LexA via autocatalytic cleavage. [DNA metabolism, DNA replication, recombination, and repair] 321
18481 273928 TIGR02013 rpoB DNA-directed RNA polymerase, beta subunit. This model describes orthologs of the beta subunit of Bacterial RNA polymerase. The core enzyme consists of two alpha chains, one beta chain, and one beta' subunit. [Transcription, DNA-dependent RNA polymerase] 1065
18482 131069 TIGR02014 BchZ chlorophyllide reductase subunit Z. This model represents the Z subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. [Energy metabolism, Photosynthesis] 468
18483 131070 TIGR02015 BchY chlorophyllide reductase subunit Y. This model represents the Y subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. [Energy metabolism, Photosynthesis] 422
18484 273929 TIGR02016 BchX chlorophyllide reductase iron protein subunit X. This model represents the X subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. This subunit is homologous to the nitrogenase component II, or "iron" protein. [Energy metabolism, Photosynthesis] 296
18485 131072 TIGR02017 hutG_amidohyd N-formylglutamate amidohydrolase. In some species, histidine is converted via urocanate and then formimino-L-glutamate to glutamate in four steps, where the fourth step is conversion of N-formimino-L-glutamate to L-glutamate and formamide. In others, that pathway from formimino-L-glutamate may differ, with the next enzyme being formiminoglutamate hydrolase (HutF) yielding N-formyl-L-glutamate. This model represents the enzyme N-formylglutamate deformylase, also called N-formylglutamate amidohydrolase, which then produces glutamate. [Energy metabolism, Amino acids and amines] 263
18486 188194 TIGR02018 his_ut_repres histidine utilization repressor, proteobacterial. This model represents a proteobacterial histidine utilization repressor. It is usually found clustered with the enzymes HutUHIG so that it can regulate its own expression as well. A number of species have several paralogs and may fine-tune the regulation according to levels of degradation intermediates such as urocanate. This family belongs to the larger GntR family of transcriptional regulators. [Energy metabolism, Amino acids and amines, Regulatory functions, DNA interactions] 230
18487 131074 TIGR02019 BchJ bacteriochlorophyll 4-vinyl reductase. This model represents the component of bacteriochlorophyll synthetase responsible for reduction of the B-ring pendant ethylene (4-vinyl) group. It appears that this step must precede the reduction of ring D, at least by the "dark" protochlorophyllide reductase enzymes BchN, BchB and BchL. This family appears to be present in photosynthetic bacteria except for the cyanobacterial clade. Cyanobacteria must use a non-orthologous gene to carry out this required step for the biosynthesis of both bacteriochlorophyll and chlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 188
18488 131075 TIGR02020 BchF 2-vinyl bacteriochlorophyllide hydratase. This model represents the enzyme responsible for the first step in the modification of the ring A vinyl group of chlorophyllide a which (in part) distinguishes chlorophyll from bacteriochlorophyll. This enzyme is apparently absent from cyanobacteria (which do not use bacteriochlorophyll). [Energy metabolism, Photosynthesis] 145
18489 273930 TIGR02021 BchM-ChlM magnesium protoporphyrin O-methyltransferase. This model represents the S-adenosylmethionine-dependent O-methyltransferase responsible for methylation of magnesium protoporphyrin IX. This step is essential for the biosynthesis of both chlorophyll and bacteriochlorophyll. This model encompasses two closely related clades, from cyanobacteria (and plants) where it is called ChlM and other photosynthetic bacteria where it is known as BchM. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 219
18490 273931 TIGR02022 hutF formiminoglutamate deiminase. In some species, histidine utilization goes via urocanate to glutamate in four steps, the last being removal of formamide. This model describes an alternate fourth step, formiminoglutamate hydrolase, which leads to N-formyl-L-glutamate. This product may be acted on by formylglutamate amidohydrolase (TIGR02017) and bypass glutamate as a product during its degradation. Alternatively, removal of formate (by EC 3.5.1.68) would yield glutamate. [Energy metabolism, Amino acids and amines] 454
18491 273932 TIGR02023 BchP-ChlP geranylgeranyl reductase. This model represents a group of geranylgeranyl reductases specific for the biosyntheses of bacteriochlorophyll and chlorophyll. It is unclear whether the processes of isoprenoid ligation to the chlorin ring and reduction of the geranylgeranyl chain to a phytyl chain are necessarily ordered the same way in all species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 388
18492 131079 TIGR02024 FtcD glutamate formiminotransferase. This model represents the tetrahydrofolate (THF) dependent glutamate formiminotransferase involved in the histidine utilization pathway. This enzyme interconverts L-glutamate and N-formimino-L-glutamate. The enzyme is bifunctional as it also catalyzes the cyclodeaminase reaction on N-formimino-THF, converting it to 5,10-methenyl-THF and releasing ammonia - part of the process of regenerating THF. This model covers enzymes from metazoa as well as gram-positive bacteria and archaea. In humans, deficiency of this enzyme results in a disease phenotype. The crystal structure of the enzyme has been studied in the context of the catalytic mechanism. [Energy metabolism, Amino acids and amines] 298
18493 273933 TIGR02025 BchH magnesium chelatase, H subunit. This model represents the H subunit of the magnesium chelatase complex responsible for magnesium insertion into the protoporphyrin IX ring in the biosynthesis of both chlorophyll and bacteriochlorophyll. In chlorophyll-utilizing species, this gene is known as ChlH, while in bacteriochlorophyll-utilizing species it is called BchH. Subunit H is the largest (~140kDa) of the three subunits (the others being BchD/ChlD and BchI/ChlI), and is known to bind protoporphyrin IX. Subunit H is homologous to the CobN subunit of cobaltochelatase and by analogy with that enzyme, subunit H is believed to also bind the magnesium ion which is inserted into the ring. In conjunction with the hydrolysis of ATP by subunits I and D, a conformation change is believed to happen in subunit H causing the magnesium ion insertion into the distorted protoporphyrin ring. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 1224
18494 131081 TIGR02026 BchE magnesium-protoporphyrin IX monomethyl ester anaerobic oxidative cyclase. This model represents the cobalamin-dependent oxidative cyclase, a radical SAM enzyme responsible for forming the distinctive E-ring of the chlorin ring system under anaerobic conditions. This step is essential in the biosynthesis of both bacteriochlorophyll and chlorophyll under anaerobic conditions (a separate enzyme, AcsF, acts under aerobic conditions). This model identifies two clades of sequences, one from photosynthetic, non-cyanobacterial bacteria and another including Synechocystis and several non-photosynthetic bacteria. The function of the Synechocystis gene is supported by gene clustering with other photosynthetic genes, so the purpose of the gene in the non-photosynthetic bacteria is uncertain. Note that homologs of this gene are not found in plants which rely solely on the aerobic cyclase. 497
18495 273934 TIGR02027 rpoA DNA-directed RNA polymerase, alpha subunit, bacterial and chloroplast-type. This family consists of the bacterial (and chloroplast) DNA-directed RNA polymerase alpha subunit, encoded by the rpoA gene. The RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. The amino terminal domain is involved in dimerizing and assembling the other RNA polymerase subunits into a transcriptionally active enzyme. The carboxy-terminal domain contains determinants for interaction with DNA and with transcriptional activator proteins. [Transcription, DNA-dependent RNA polymerase] 297
18496 131083 TIGR02028 ChlP geranylgeranyl reductase. This model represents the reductase which reduces the geranylgeranyl group to the phytyl group in the side chain of chlorophyll. It is unclear whether the enzyme has a preference for acting before or after the attachment of the side chain to chlorophyllide a by chlorophyll synthase. This clade is restricted to plants and cyanobacteria to separate it from the homologues which act in the biosynthesis of bacteriochlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 398
18497 131084 TIGR02029 AcsF magnesium-protoporphyrin IX monomethyl ester aerobic oxidative cyclase. This model represents the oxidative cyclase responsible for forming the distinctive E-ring of the chlorin ring system under aerobic conditions. This enzyme is believed to utilize a binuclear iron center and molecular oxygen. There are two isoforms of this enzyme in some plants and cyanobacteria which are differentially regulated based on the levels of copper and oxygen. This step is essential in the biosynthesis of both bacteriochlorophyll and chlorophyll under aerobic conditions (a separate enzyme, BchE, acts under anaerobic conditions). This enzyme is found in plants, cyanobacteria and other photosynthetic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 337
18498 131085 TIGR02030 BchI-ChlI magnesium chelatase ATPase subunit I. This model represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 337
18499 273935 TIGR02031 BchD-ChlD magnesium chelatase ATPase subunit D. This model represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (TIGR02030), this subunit is not found in archaea. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 589
18500 273936 TIGR02032 GG-red-SF geranylgeranyl reductase family. This model represents a subfamily which includes geranylgeranyl reductases involved in chlorophyll and bacteriochlorophyll biosynthesis as well as other related enzymes which may also act on geranylgeranyl groups or related substrates. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 295
18501 273937 TIGR02033 D-hydantoinase D-hydantoinase. This model represents the D-hydantoinase (dihydropyrimidinase) which primarily converts 5,6-dihydrouracil to 3-ureidopropanoate but also acts on dihydrothymine and hydantoin. The enzyme is a metalloenzyme. 454
18502 213679 TIGR02034 CysN sulfate adenylyltransferase, large subunit. Metabolic assimilation of sulfur from inorganic sulfate requires sulfate activation by coupling to a nucleoside, for the production of high-energy nucleoside phosphosulfates. This pathway appears to be similar in all prokaryotic organisms. Activation is first achieved through sulfation of sulfate with ATP by sulfate adenylyltransferase (ATP sulfurylase) to produce 5'-phosphosulfate (APS), coupled by GTP hydrolysis. Subsequently, APS is phosphorylated by an APS kinase to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS). In Escherichia coli, ATP sulfurylase is a heterodimer composed of two subunits encoded by cysD and cysN, with APS kinase encoded by cysC. These genes are located in a unidirectionally transcribed gene cluster, and have been shown to be required for the synthesis of sulfur-containing amino acids. Homologous to this E. coli activation pathway are nodPQH gene products found among members of the Rhizobiaceae family. These gene products have been shown to exhibit ATP sulfurase and APS kinase activity, yet are involved in Nod factor sulfation, and sulfation of other macromolecules. With members of the Rhizobiaceae family, nodQ often appears as a fusion of cysN (large subunit of ATP sulfurase) and cysC (APS kinase). [Central intermediary metabolism, Sulfur metabolism] 406
18503 211710 TIGR02035 D_Ser_am_lyase D-serine ammonia-lyase. This family consists of D-serine ammonia-lyase (EC 4.3.1.18), a pyridoxal-phosphate enzyme that converts D-serine to pyruvate and NH3. This enzyme is also called D-serine dehydratase and D-serine deaminase and was previously designated EC 4.2.1.14. It is homologous to an enzyme that acts on threonine and may itself act weakly on threonine. [Energy metabolism, Amino acids and amines] 431
18504 131091 TIGR02036 dsdC D-serine deaminase transcriptional activator. This family, part of the LysR family of transcriptional regulators, activates transcription of the gene for D-serine deaminase, dsdA. Trusted members of this family so far are found adjacent to dsdA and only in Gammaproteobacteria, including E. coli, Vibrio cholerae, and Colwellia psychrerythraea. [Regulatory functions, DNA interactions] 302
18505 273938 TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures. [Protein fate, Protein folding and stabilization, Protein fate, Degradation of proteins, peptides, and glycopeptides] 428
18506 273939 TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Regulatory functions, Protein interactions] 351
18507 131094 TIGR02039 CysD sulfate adenylyltransferase, small subunit. Metabolic assimilation of sulfur from inorganic sulfate requires sulfate activation by coupling to a nucleoside, for the production of high-energy nucleoside phosphosulfates. This pathway appears to be similar in all prokaryotic organisms. Activation is first achieved through sulfation of sulfate with ATP by sulfate adenylyltransferase (ATP sulfurylase) to produce 5'-phosphosulfate (APS), coupled by GTP hydrolysis. Subsequently, APS is phosphorylated by an APS kinase to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS). In Escherichia coli, ATP sulfurylase is a heterodimer composed of two subunits encoded by cysD and cysN, with APS kinase encoded by cysC. These genes are located in a unidirectionally transcribed gene cluster, and have been shown to be required for the synthesis of sulfur-containing amino acids. Homologous to this E. coli activation pathway are nodPQH gene products found among members of the Rhizobiaceae family. These gene products have been shown to exhibit ATP sulfurase and APS kinase activity, yet are involved in Nod factor sulfation, and sulfation of other macromolecules. [Central intermediary metabolism, Sulfur metabolism] 294
18508 273940 TIGR02040 PpsR-CrtJ transcriptional regulator PpsR. This model represents the transcriptional regulator PpsR which is strictly associated with photosynthetic proteobacteria and found in photosynthetic operons. PpsR has been reported to be a repressor. These proteins contain a Helix-Turn-Helix motif of the "fis" type (pfam02954). [Energy metabolism, Photosynthesis, Regulatory functions, DNA interactions] 442
18509 273941 TIGR02041 CysI sulfite reductase (NADPH) hemoprotein, beta-component. Sulfite reductase (NADPH) catalyzes a six electron reduction of sulfite to sulfide in prokaryotic organisms. It is a complex oligomeric enzyme composed of two different peptides with a subunit composition of alpha(8)-beta(4). The alpha component, encoded by cysJ, is a flavoprotein containing both FMN and FAD, while the beta component, encoded by cysI, is a siroheme, iron-sulfur protein. In Salmonella typhimurium and Escherichia coli, both the alpha and beta subunits of sulfite reductase are located in a unidirectional gene cluster along with phosphoadenosine phosphosulfate reductase, which catalyzes a two step reduction of PAPS to give free sulfite. In cyanobacteria and plant species, sulfite reductase ferredoxin (EC 1.8.7.1) catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism] 541
18510 131097 TIGR02042 sir ferredoxin-sulfite reductase. Distantly related to the iron-sulfur hemoprotein of sulfite reductase (NADPH) found in Proteobacteria and Eubacteria, sulfite reductase (ferredoxin) is a cyanobacterial and plant monomeric enzyme that also catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism] 577
18511 131098 TIGR02043 ZntR Zn(II)-responsive transcriptional regulator. This model represents the zinc and cadmium (II) responsive transcriptional activator of the gamma proteobacterial zinc efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-Cys-X(8-9)-Cys, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Regulatory functions, DNA interactions] 131
18512 131099 TIGR02044 CueR Cu(I)-responsive transcriptional regulator. This model represents the copper-, silver- and gold- (I) responsive transcriptional activator of the gamma proteobacterial copper efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X7-Cys. This family also lacks a conserved cysteine at the N-terminal end of the dimerization helix which is required for the binding of divalent metals such as zinc; here it is replaced by a serine residue. [Regulatory functions, DNA interactions] 127
18513 131100 TIGR02045 P_fruct_ADP ADP-specific phosphofructokinase. Phosphofructokinase is a key enzyme of glycolysis. The phosphate group donor for different subtypes of phosphofructokinase can be ATP, ADP, or pyrophosphate. This family consists of ADP-dependent phosphofructokinases. Members are more similar to ADP-dependent glucokinases (excluded from this family) than to other phosphofructokinases. [Energy metabolism, Glycolysis/gluconeogenesis] 446
18514 131101 TIGR02046 sdhC_b558_fam succinate dehydrogenase (or fumarate reductase) cytochrome b subunit, b558 family. This family consists of the succinate dehydrogenase subunit C of Bacillus subtilis, designated cytochrome b-558, and related sequences that include a fumarate reductase subunit C. This subfamily is only weakly similar to the main group of succinate dehydrogenase cytochrome b subunits described by pfam01127, so that some members score above the gathering threshold and some do not. [Energy metabolism, TCA cycle] 214
18515 131102 TIGR02047 CadR-PbrR Cd(II)/Pb(II)-responsive transcriptional regulator. This model represents the cadmium(II) and/or lead(II) responsive transcriptional activator of the proteobacterial metal efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X(6-9)-Cys, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Regulatory functions, DNA interactions] 127
18516 131103 TIGR02048 gshA_cyano glutamate--cysteine ligase, cyanobacterial, putative. This family consists of proteins believed (see Copley SD, Dhillon JK, 2002) to be the glutamate--cysteine ligases of several cyanobacteria, which are known to make glutathione. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 376
18517 273942 TIGR02049 gshA_ferroox glutamate--cysteine ligase, T. ferrooxidans family. This family consists of a rare family of glutamate--cysteine ligases, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 403
18518 273943 TIGR02050 gshA_cyan_rel carboxylate-amine ligase, YbdK family. This family represents a division of a larger family, the other branch of which is predicted to act as glutamate--cysteine ligase (the first of two enzymes in glutathione biosynthesis) in the cyanobacteria. Species containing this protein, however, are generally not believed to make glutathione, and the function is unknown. 287
18519 131106 TIGR02051 MerR Hg(II)-responsive transcriptional regulator. This model represents the mercury (II) responsive transcriptional activator of the mer organomercurial resistance operon. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X(8)-Cys-Pro, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Cellular processes, Detoxification, Regulatory functions, DNA interactions] 124
18520 131107 TIGR02052 MerP mercuric transport protein periplasmic component. This model represents the periplasmic mercury (II) binding protein of the bacterial mercury detoxification system which passes mercuric ion to the MerT transporter for subsequent reduction to Hg(0) by the mercuric reductase MerA. MerP contains a distinctive GMTCXXC motif associated with metal binding. MerP is related to a larger family of metal binding proteins (pfam00403). [Cellular processes, Detoxification] 92
18521 273944 TIGR02053 MerA mercury(II) reductase. This model represents the mercuric reductase found in the mer operon for the detoxification of mercury compounds. MerA is a FAD-containing flavoprotein which reduces Hg(II) to Hg(0) utilizing NADPH. [Cellular processes, Detoxification] 463
18522 131109 TIGR02054 MerD mercuric resistance transcriptional repressor protein MerD. This model represents a transcriptional repressor protein of the MerR family (pfam00376) whose expression is regulated by the mercury-sensitive transcriptional activator, MerR. MerD has been shown to repress the transcription of the mer operon. [Cellular processes, Detoxification] 120
18523 273945 TIGR02055 APS_reductase thioredoxin-dependent adenylylsulfate APS reductase. This model describes recently identified adenosine 5'-phosphosulfate (APS) reductase activity found in sulfate-assimilatory prokaryotes, thus separating it from the traditionally described phosphoadenosine 5'-phosphosulfate (PAPS) reductases found in bacteria and fungi. Homologous to PAPS reductase in enterobacteria, cyanobacteria, and yeast, APS reductase here clusters with, and demonstrates greater homology to plant APS reductase. Additionally, the presence of two conserved C-terminal motifs (CCXXRKXXPL & SXGCXXCT) distinguishes APS substrate specificity and serves as a FeS cluster. [Central intermediary metabolism, Sulfur metabolism] 191
18524 131111 TIGR02056 ChlG chlorophyll synthase, ChlG. This model represents the strictly cyanobacterial and plant-specific chlorophyll synthase ChlG. ChlG is the enzyme (esterase) which attaches the side chain moiety onto chlorophyllide a. Both geranylgeranyl and phytyl pyrophosphates are substrates to varying degrees in enzymes from different sources. Thus, ChlG may act as the final or penultimate step in chlorophyll biosynthesis (along with the geranylgeranyl reductase, ChlP). [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 306
18525 131112 TIGR02057 PAPS_reductase phosphoadenosine phosphosulfate reductase, thioredoxin dependent. Requiring thioredoxin as an electron donor, phosphoadenosine phosphosulfate reductase catalyzes the reduction of 3'-phosphoadenylylsulfate (PAPS) to sulfite and phospho-adenosine-phosphate (PAP). Found in enterobacteria, cyanobacteria, and yeast, PAPS reductase is related to a group of plant (TIGR00424) and bacterial (TIGR02055) enzymes preferring 5'-adenylylsulfate (APS) over PAPS as a substrate for reduction to sulfite. [Central intermediary metabolism, Sulfur metabolism] 226
18526 131113 TIGR02058 lin0512_fam conserved hypothetical protein. This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N-terminus. [Hypothetical proteins, Conserved] 116
18527 131114 TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (>10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803. 101
18528 131115 TIGR02060 aprB adenosine phosphosulphate reductase, beta subunit. During dissimilatory sulfate reduction and sulfur oxidation, adenylylsulfate (APS) reductase catalyzes reversibly the two-electron reduction of APS to sulfite and AMP. Found in several bacterial lineages and in Archaeoglobales, APS reductase is a heterodimer composed of an alpha subunit containing a noncovalently bound FAD, and a beta subunit containing two [4Fe-4S] clusters. Described by this model is the beta subunit of APS reductase, sharing common evolutionary origin with other iron-sulfur cluster-binding proteins. [Central intermediary metabolism, Sulfur metabolism] 132
18529 273946 TIGR02061 aprA adenosine phosphosulphate reductase, alpha subunit. During dissimilatory sulfate reduction or sulfur oxidation, adenylylsulfate (APS) reductase catalyzes reversibly the two-electron reduction of APS to sulfite and AMP. Found in several bacterial lineages and in Archaeoglobales, APS reductase is a heterodimer composed of an alpha subunit containing a noncovalently bound FAD, and a beta subunit containing two [4Fe-4S] clusters. Described by this model is the alpha subunit of APS reductase, sharing common evolutionary origin with fumarate reductase/succinate dehydrogenase flavoproteins. [Central intermediary metabolism, Sulfur metabolism] 614
18530 131117 TIGR02062 RNase_B exoribonuclease II. This family consists of exoribonuclease II, the product of the rnb gene, as found in a number of gamma proteobacteria. In Escherichia coli, it is one of eight different exoribonucleases. It is involved in mRNA degradation and tRNA precursor end processing. [Transcription, Degradation of RNA] 639
18531 273947 TIGR02063 RNase_R ribonuclease R. This family consists of an exoribonuclease, ribonuclease R, also called VacB. It is one of the eight exoribonucleases reported in E. coli and is broadly distributed throughout the bacteria. In E. coli, double mutants of this protein and polynucleotide phosphorylase are not viable. Scoring between trusted and noise cutoffs to the model are shorter, divergent forms from the Chlamydiae, and divergent forms from the Campylobacterales (including Helicobacter pylori) and Leptospira interrogans. [Transcription, Degradation of RNA] 709
18532 273948 TIGR02064 dsrA sulfite reductase, dissimilatory-type alpha subunit. Dissimilatory sulfite reductase catalyzes the six-electron reduction of sulfite to sulfide, as the terminal reaction in dissimilatory sulfate reduction. It remains unclear however, whether trithionate and thiosulfate serve as intermediate compounds to sulfide, or as end products of sulfite reduction. Sulfite reductase is a multisubunit enzyme composed of dimers of either alpha/beta or alpha/beta/gamma subunits, each containing a siroheme and iron sulfur cluster prosthetic center. Found in sulfate-reducing bacteria, these genes are commonly located in an unidirectional gene cluster. This model describes the alpha subunit of sulfite reductase. [Central intermediary metabolism, Sulfur metabolism] 402
18533 131120 TIGR02065 ECX1 archaeal exosome-like complex exonuclease 1. This family contains the archaeal protein orthologous to the eukaryotic exosome protein Rrp41. It is somewhat more distantly related to the bacterial protein ribonuclease PH. An exosome-like complex has been demonstrated experimentally for the Archaea in Sulfolobus solfataricus, so members of this family are designated exosome complex exonuclease 1, after usage in SwissProt. [Transcription, Degradation of RNA] 230
18534 131121 TIGR02066 dsrB sulfite reductase, dissimilatory-type beta subunit. Dissimilatory sulfite reductase catalyzes the six-electron reduction of sulfite to sulfide, as the terminal reaction in dissimilatory sulfate reduction. It remains unclear however, whether trithionate and thiosulfate serve as intermediate compounds to sulfide, or as end products of sulfite reduction. Sulfite reductase is a multisubunit enzyme composed of dimers of either alpha/beta or alpha/beta/gamma subunits, each containing a siroheme and iron sulfur cluster prosthetic center. Found in sulfate-reducing bacteria, these genes are commonly located in an unidirectional gene cluster. This model describes the beta subunit of sulfite reductase. [Central intermediary metabolism, Sulfur metabolism] 341
18535 273949 TIGR02067 his_9_HisN histidinol-phosphatase, inositol monophosphatase family. This subfamily belongs to the inositol monophosphatase family (pfam00459). The members of this family consist of no more than one per species and are found only in species in which histidine is synthesized de novo but no histidinol phosphatase can be found in either of the two described families (TIGR01261, TIGR01856). In at least one species, the member of this family is found near known histidine biosynthesis genes. The role as histidinol-phosphatase was first proven in Corynebacterium glutamicum. [Amino acid biosynthesis, Histidine family] 251
18536 273950 TIGR02068 cya_phycin_syn cyanophycin synthetase. Cyanophycin is an insoluble storage polymer for carbon, nitrogen, and energy, found in most Cyanobacteria. The polymer has a backbone of L-aspartic acid, with most Asp side chain carboxyl groups attached to L-arginine. The polymer is made by this enzyme, cyanophycin synthetase, and degraded by cyanophycinase. Heterologously expressed cyanophycin synthetase in E. coli produces a closely related, water-soluble polymer with some Arg replaced by Lys. It is unclear whether enzymes that produce soluble cyanophycin-like polymers in vivo in non-Cyanobacterial species should be designated as cyanophycin synthetase itself or as a related enzyme. This model makes the designation as cyanophycin synthetase. Cyanophycin synthesis is analogous to polyhydroxyalkanoic acid (PHA) biosynthesis, except that PHA polymers lack nitrogen and may be made under nitrogen-limiting conditions. [Cellular processes, Biosynthesis of natural products] 864
18537 131124 TIGR02069 cyanophycinase cyanophycinase. This model describes both cytosolic and extracellular cyanophycinases. The former are part of a system in many Cyanobacteria and a few other species of generating and later utilizing a storage polymer for nitrogen, carbon, and energy, called cyanophycin. The latter are found in species such as Pseudomonas anguilliseptica that can use external cyanophycin. The polymer has a backbone of L-aspartic acid, with most Asp side chain carboxyl groups attached to L-arginine. [Energy metabolism, Other] 250
18538 273951 TIGR02070 mono_pep_trsgly monofunctional biosynthetic peptidoglycan transglycosylase. This family is one of the transglycosylases involved in the late stages of peptidoglycan biosynthesis. Members tend to be small, about 240 amino acids in length, and consist almost entirely of a domain described by pfam00912 for transglycosylases. Species with this protein will have several other transglycosylases as well. All species with this protein are Proteobacteria that produce murein (peptidoglycan). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 224
18539 273952 TIGR02071 PBP_1b penicillin-binding protein 1B. Bacteria that synthesize a cell wall of peptidoglycan (murein) generally have several transglycosylases and transpeptidases for the task. This family consists of a particular bifunctional transglycosylase/transpeptidase in E. coli and other Proteobacteria, designated penicillin-binding protein 1B. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 730
18540 273953 TIGR02072 BioC malonyl-acyl carrier protein O-methyltransferase BioC. This enzyme, which is found in biotin biosynthetic gene clusters in proteobacteria, firmicutes, green-sulfur bacteria, fusobacterium and bacteroides, carries out an enzymatic step prior to the formation of pimeloyl-CoA, namely O-methylation of the malonyl group preferentially while on acyl carrier protein. The enzyme is recognizable as a methyltransferase by homology. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin] 240
18541 273954 TIGR02073 PBP_1c penicillin-binding protein 1C. This subfamily of the penicillin binding proteins includes the member from E. coli designated penicillin-binding protein 1C. Members have both transglycosylase and transpeptidase domains and are involved in forming cross-links in the late stages of peptidoglycan biosynthesis. All members of this subfamily are presumed to have the same basic function. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 727
18542 273955 TIGR02074 PBP_1a_fam penicillin-binding protein, 1A family. Bacteria that synthesize a cell wall of peptidoglycan (murein) generally have several transglycosylases and transpeptidases for the task. This family consists of bifunctional transglycosylase/transpeptidase penicillin-binding proteins (PBP). In the Proteobacteria, this family includes PBP 1A but not the paralogous PBP 1B (TIGR02071). This family also includes related proteins, often designated PBP 1A, from other bacterial lineages. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 531
18543 213681 TIGR02075 pyrH_bact uridylate kinase. This protein, also called UMP kinase, converts UMP to UDP by adding a phosphate from ATP. It is the first step in pyrimidine biosynthesis. GTP is an allosteric activator. In a large fraction of all bacterial genomes, the gene tends to be located immediately downstream of elongation factor Ts and upstream of ribosome recycling factor. A related protein family, believed to be equivalent in function and found in the archaea and in spirochetes, is described by a separate model, TIGR02076. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 232
18544 273956 TIGR02076 pyrH_arch uridylate kinase, putative. This family consists of the archaeal and spirochete proteins most closely related to bacterial uridylate kinases (TIGR02075), an enzyme involved in pyrimidine biosynthesis. Members are likely, but not known, to be functionally equivalent to their bacterial counterparts. However, substantial sequence differences suggest that regulatory mechanisms may be different; the bacterial form is allosterically regulated by GTP. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 221
18545 200156 TIGR02077 thr_lead_pep thr operon leader peptide. This family consists of examples of the threonine biosynthesis (thr) operon leader peptide, also called the thr operon attenuator. The small gene for this peptide is often missed in genome annotation. It should be looked for in genomes of the proteobacteria, immediately upstream of genes for threonine biosynthesis, typically aspartokinase I/homoserine dehydrogenase, homoserine kinase, and threonine synthase. Transcription of the rest of the Thr operon is attenuated (mostly turned off) unless the ribosome pauses during a stretch of the leader sequence rich in both Ile (made from Thr) and in Thr itself because of the scarcity of those amino acids at the time. The leader peptide itself, once made, may have no role other than to be degraded. Similar systems exist for some other amino acid biosynthetic operons, such as Trp. [Amino acid biosynthesis, Aspartate family] 24
18546 131133 TIGR02078 AspKin_pair Pyrococcus aspartate kinase subunit, putative. This family consists of proteins restricted to and found as paralogous pairs (typically close together) in species of Pyrococcus, a hyperthermophilic archaeal genus. Members are always found close to other genes of threonine biosynthesis and appear to represent the Pyrococcal form of aspartate kinase. Alignment to aspartokinase III from E. coli shows that 300 N-terminal and 20 C-terminal amino acids are homologous, but the form in Pyrococcus lacks ~ 100 amino acids in between. [Amino acid biosynthesis, Aspartate family] 327
18547 273957 TIGR02079 THD1 threonine dehydratase. This model represents threonine dehydratase, the first step in the pathway converting threonine into isoleucine. At least two other clades of biosynthetic threonine dehydratases have been characterized by models (TIGR01124 and TIGR01127). Those sequences described by this model are exclusively found in species containing the rest of the isoleucine pathway and which are generally lacking in members of those other two clades of threonine dehydratases. Members of this clade are also often gene clustered with other elements of the isoleucine pathway. [Amino acid biosynthesis, Pyruvate family] 409
18548 131135 TIGR02080 O_succ_thio_ly O-succinylhomoserine (thiol)-lyase. This family consists of O-succinylhomoserine (thiol)-lyase, one of three different enzymes designated cystathionine gamma-synthase and involved in methionine biosynthesis. In all three cases, sulfur is added by transsulfuration from Cys to yield cystathionine rather than by a sulfhydrylation step that uses H2S directly and bypasses cystathionine. [Amino acid biosynthesis, Aspartate family] 382
18549 273958 TIGR02081 metW methionine biosynthesis protein MetW. This protein is found alongside MetX, the enzyme that acylates homoserine as a first step toward methionine biosynthesis, in many species. It appears to act in methionine biosynthesis but is not fully characterized. [Amino acid biosynthesis, Aspartate family] 194
18550 273959 TIGR02082 metH 5-methyltetrahydrofolate--homocysteine methyltransferase. This family represents 5-methyltetrahydrofolate--homocysteine methyltransferase (EC 2.1.1.13), one of at least three different enzymes able to convert homocysteine to methionine by transferring a methyl group on to the sulfur atom. It is also called the vitamin B12(or cobalamine)-dependent methionine synthase. Other methionine synthases include 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase (MetE, EC 2.1.1.14, the cobalamin-independent methionine synthase) and betaine-homocysteine methyltransferase. [Amino acid biosynthesis, Aspartate family] 1181
18551 131138 TIGR02083 LEU2 3-isopropylmalate dehydratase, large subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. These homologs are described by a separate model of subfamily (rather than equivalog) homology type (TIGR01343). This model along with TIGR00170 describe clades which consist only of LeuC sequences. Here, the genes from Pyrococcus furiosus, Clostridium acetobutylicum, Thermotoga maritima and others are gene clustered with related genes from the leucine biosynthesis pathway. [Amino acid biosynthesis, Pyruvate family] 419
18552 131139 TIGR02084 leud 3-isopropylmalate dehydratase, small subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. The members of the seed for this model are those sequences which are gene clustered with other genes involved in leucine biosynthesis and include some archaea. [Amino acid biosynthesis, Pyruvate family] 156
18553 131140 TIGR02085 meth_trns_rumB 23S rRNA (uracil-5-)-methyltransferase RumB. This family consists of RNA methyltransferases designated RumB, formerly YbjF. Members act on 23S rRNA U747 and the equivalent position in other proteobacterial species. This family is homologous to the other 23S rRNA methyltransferase RumA and to the tRNA methyltransferase TrmA. [Protein synthesis, tRNA and rRNA base modification] 374
18554 273960 TIGR02086 IPMI_arch 3-isopropylmalate dehydratase, large subunit. This subfamily is a subset of the larger HacA family (Homoaconitate hydratase family, TIGR01343) and is most closely related to the 3-isopropylmalate dehydratase, large subunits which form TIGR00170. This subfamily includes the members of TIGR01343 which are gene clustered with other genes of leucine biosynthesis. The rest of the subfamily includes mainly archaeal species which exhibit two hits to this model. In these cases it is possible that one or the other of the hits does not have a 3-isopropylmalate dehydratase activity but rather one of the other related aconitase-like activities. 413
18555 273961 TIGR02087 LEUD_arch 3-isopropylmalate dehydratase, small subunit. This subfamily is most closely related to the 3-isopropylmalate dehydratase, small subunits which form TIGR00171. This subfamily includes the members of TIGR02084 which are gene clustered with other genes of leucine biosynthesis. The rest of the subfamily includes mainly archaeal species which exhibit two hits to this model. In these cases it is possible that one or the other of the hits does not have a 3-isopropylmalate dehydratase activity but rather one of the other related aconitase-like activities. 154
18556 273962 TIGR02088 LEU3_arch isopropylmalate/isohomocitrate dehydrogenases. This model represents a group of archaeal decarboxylating dehydrogenases which include the leucine biosynthesis enzyme 3-isopropylmalate dehydrogenase (LeuB, LEU3) and the methanogenic cofactor CoB biosynthesis enzyme isohomocitrate dehydrogenase (AksF). Both of these have been characterized in Methanococcus jannaschii. Non-methanogenic archaea have only one hit to this model and presumably this is LeuB, although phylogenetic trees cannot establish which gene is which in the methanogens. The AksF gene is capable of acting on isohomocitrate, iso(homo)2-citrate and iso(homo)3-citrate in the successive elongation cycles of coenzyme B (7-mercaptoheptanoyl-threonine phosphate). This family is closely related to both the LeuB genes found in TIGR00169 and the mitochondrial eukaryotic isocitrate dehydratases found in TIGR00175. All of these are included within the broader subfamily model, pfam00180. 322
18557 273963 TIGR02089 TTC tartrate dehydrogenase. Tartrate dehydrogenase catalyzes the oxidation of both meso- and (+)-tartrate as well as a D-malate. These enzymes are closely related to the 3-isopropylmalate and isohomocitrate dehydrogenases found in TIGR00169 and TIGR02088, respectively. [Energy metabolism, Other] 352
18558 273964 TIGR02090 LEU1_arch isopropylmalate/citramalate/homocitrate synthases. Methanogenic archaea contain three closely related homologs of the 2-isopropylmalate synthases (LeuA) represented by TIGR00973. Two of these in Methanococcus jannaschii (MJ1392 - CimA; MJ0503 - AksA) have been characterized as catalyzing alternative reactions leaving the third (MJ1195) as the presumptive LeuA enzyme. CimA is citramalate (2-methylmalate) synthase which condenses acetyl-CoA with pyruvate. This enzyme is believed to be involved in the biosynthesis of isoleucine in methanogens and possibly other species lacking threonine dehydratase. AksA is a homocitrate synthase which also produces (homo)2-citrate and (homo)3-citrate in the biosynthesis of Coenzyme B which is restricted solely to methanogenic archaea. Methanogens, then, should and apparently do contain all three of these enzymes. Unfortunately, phylogenetic trees do not resolve into three unambiguous clades, making assignment of function to particular genes problematic. Other archaea which lack a threonine dehydratase (mainly Euryarchaeota) should contain both a CimA and a LeuA gene. This is true of, for example, Archaeoglobus fulgidus, but not for the Pyrococci which have none in this clade, but one in TIGR00973 and one in TIGR00977 which may fulfill these roles. Other species which have only one hit to this model and lack threonine dehydratase are very likely LeuA enzymes. 363
18559 273965 TIGR02091 glgC glucose-1-phosphate adenylyltransferase. This enzyme, glucose-1-phosphate adenylyltransferase, is also called ADP-glucose pyrophosphorylase. The plant form is an alpha2,beta2 heterodimer, allosterically regulated in plants. Both subunits are homologous and included in this model. In bacteria, both homomeric forms of GlgC and more active heterodimers of GlgC and GlgD have been described. This model describes the GlgC subunit only. This enzyme appears in variants of glycogen synthesis pathways that use ADP-glucose, rather than UDP-glucose as in animals. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 361
18560 273966 TIGR02092 glgD glucose-1-phosphate adenylyltransferase, GlgD subunit. This family is GlgD, an apparent regulatory protein that appears in an alpha2/beta2 heterotetramer with GlgC (glucose-1-phosphate adenylyltransferase, TIGR02091) in a subset of bacteria that use GlgC for glycogen biosynthesis. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 369
18561 273967 TIGR02093 P_ylase glycogen/starch/alpha-glucan phosphorylases. This family consists of phosphorylases. Members use phosphate to break alpha 1,4 linkages between pairs of glucose residues at the end of long glucose polymers, releasing alpha-D-glucose 1-phosphate. The nomenclature convention is to preface the name according to the natural substrate, as in glycogen phosphorylase, starch phosphorylase, maltodextrin phosphorylase, etc. Name differences among these substrates reflect differences in patterns of branching with alpha 1,6 linkages. Members include allosterically regulated and unregulated forms. A related family, TIGR02094, contains examples known to act well on particularly small alpha 1,4 glucans, as may be found after import from exogenous sources. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 794
18562 273968 TIGR02094 more_P_ylases alpha-glucan phosphorylases. This family consists of known phosphorylases, and homologs believed to share the function of using inorganic phosphate to cleave an alpha 1,4 linkage between the terminal glucose residue and the rest of the polymer (maltodextrin, glycogen, etc.). The name of the glucose storage polymer substrate, and therefore the name of this enzyme, depends on the chain lengths and branching patterns. A number of the members of this family have been shown to operate on small maltodextrins, as may be obtained by utilization of exogenous sources. This family represents a distinct clade from the related family modeled by TIGR02093/pfam00343. 601
18563 273969 TIGR02095 glgA glycogen/starch synthase, ADP-glucose type. This family consists of glycogen (or starch) synthases that use ADP-glucose (EC 2.4.1.21), rather than UDP-glucose (EC 2.4.1.11) as in animals, as the glucose donor. This enzyme is found in bacteria and plants. Whether the name given is glycogen synthase or starch synthase depends on context, and therefore on substrate. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 473
18564 273970 TIGR02096 TIGR02096 conserved hypothetical protein, steroid delta-isomerase-related. This family of proteins, about 135 amino acids in length, is largely restricted to the Proteobacteria. This family and a delta5-3-ketosteroid isomerase from Pseudomonas testosteroni appear homologous, especially toward their respective N-termini. Members, therefore, probably are enzymes. 129
18565 131152 TIGR02097 yccV hemimethylated DNA binding domain. This model describes the small protein from E. coli YccV and its homologs in other Proteobacteria. YccV is now described as a hemimethylated DNA binding protein. The model also describes a domain in longer eukaryotic proteins. 101
18566 131153 TIGR02098 MJ0042_CXXC MJ0042 family finger-like domain. This domain contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N-terminus of prokaryotic proteins. One partially characterized gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility. 38
18567 273971 TIGR02099 TIGR02099 TIGR02099 family protein. This model describes a family of long proteins, over 1250 amino acids in length and present in the Proteobacteria. The degree of sequence similarity is low between sequences from different genera. Apparent membrane-spanning regions at the N-terminus and C-terminus suggest the protein is inserted into (or exported through) the membrane. [Hypothetical proteins, Conserved] 1260
18568 131155 TIGR02100 glgX_debranch glycogen debranching enzyme GlgX. This family consists of the GlgX protein from the E. coli glycogen operon and probable equivalogs from other prokaryotic species. GlgX is not required for glycogen biosynthesis, but instead acts as a debranching enzyme for glycogen catabolism. This model distinguishes GlgX from pullulanases and other related proteins that also operate on alpha-1,6-glycosidic linkages. In the wide band between the trusted and noise cutoffs are functionally similar enzymes, mostly from plants, that act similarly but usually are termed isoamylase. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 688
18569 273972 TIGR02101 IpaC_SipC type III secretion target, IpaC/SipC family. This model represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri (SP:P18012) and SipC from Salmonella typhimurium (GB:AAA75170.1). Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins. [Cellular processes, Pathogenesis] 317
18570 273973 TIGR02102 pullulan_Gpos pullulanase, extracellular, Gram-positive. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. In contrast, a glycogen debranching enzyme such as GlgX, homologous to this family, can release glucose at alpha-1,6 linkages from glycogen first subjected to limit degradation by phosphorylase. Characterized members of this family include a surface-located pullulanase from Streptococcus pneumoniae () and an extracellular bifunctional amylase/pullulanase with C-terminal pullulanase activity (. 1111
18571 273974 TIGR02103 pullul_strch alpha-1,6-glucosidases, pullulanase-type. Members of this protein family include secreted (or membrane-anchored) pullulanases of Gram-negative bacteria and pullulanase-type starch debranching enzymes of plants. Both enzymes hydrolyze alpha-1,6 glycosidic linkages. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. This family is closely homologous to, but architecturally different from, the pullulanases of Gram-positive bacteria (TIGR02102). [Energy metabolism, Biosynthesis and degradation of polysaccharides] 898
18572 273975 TIGR02104 pulA_typeI pullulanase, type I. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. This family consists of pullulanases related to the subfamilies described in TIGR02102 and TIGR02103 but having a different domain architecture with shorter sequences. Members are called type I pullulanases. 605
18573 131160 TIGR02105 III_needle type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 72
18574 211715 TIGR02106 cyd_oper_ybgT cyd operon protein YbgT. This model describes a very small (as short as 33 amino acids) protein of unknown function, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It begins with an aromatic motif MWYFXW and appears to contain a membrane-spanning helix. This protein appears to be restricted to the Proteobacteria and exist in a single copy only. We suggest it may be a membrane subunit of the terminal oxidase. The family is named after the E. coli member YbgT (SP|P56100). This model excludes the apparently related protein YccB (SP|P24244). [Energy metabolism, Electron transport] 30
18575 273976 TIGR02107 PQQ_syn_pqqA coenzyme PQQ precursor peptide PqqA. This model describes a very small protein, coenzyme PQQ biosynthesis protein A, which is smaller than 25 amino acids in many species. It is proposed to serve as a peptide precursor of coenzyme pyrrolo-quinoline-quinone (PQQ), with Glu and Tyr of a conserved motif Glu-Xxx-Xxx-Xxx-Tyr becoming part of the product. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 21
18576 273977 TIGR02108 PQQ_syn_pqqB coenzyme PQQ biosynthesis protein B. This model describes coenzyme PQQ biosynthesis protein B, a gene required for the biosynthesis of pyrrolo-quinoline-quinone (coenzyme PQQ). PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. Note that this gene appears to be required for PQQ biosynthesis in Methylobacterium extorquens (under the name pqqG) and in Klebsiella pneumoniae but that the equivalent pqqV in Acinetobacter calcoaceticus is not necessary for heterologous expression of PQQ biosynthesis in E. coli. Based on this latter finding, it is suggested (Goosen, et al. 1989) that PqqB might be a transporter or a PQQ-dependent enzyme rather than a PQQ biosynthesis enzyme. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 302
18577 162708 TIGR02109 PQQ_syn_pqqE coenzyme PQQ biosynthesis enzyme PqqE. This model describes coenzyme PQQ biosynthesis protein E, a prototypical peptide-cyclizing radical SAM enzyme. It links a Tyr to a Glu as the first step in the biosynthesis of pyrrolo-quinoline-quinone (coenzyme PQQ) from the precursor peptide PqqA. PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 358
18578 273978 TIGR02110 PQQ_syn_pqqF coenzyme PQQ biosynthesis probable peptidase PqqF. In a subset of species that make coenzyme PQQ (pyrrolo-quinoline-quinone), this probable peptidase is found in the PQQ biosynthesis region and is thought to act as a protease on PqqA (TIGR02107), a probable peptide precursor of the coenzyme. PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 697
18579 131166 TIGR02111 PQQ_syn_pqqC coenzyme PQQ biosynthesis protein C. This model describes the coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein PqqC. In contrast to the broader model pfam05312, this model does not include related proteins likely to be functionally distinct from PqqC, such as homologs found in the Chlamydias. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 239
18580 131167 TIGR02112 cyd_oper_ybgE cyd operon protein YbgE. This model describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria. [Energy metabolism, Electron transport] 93
18581 131168 TIGR02113 coaC_strep phosphopantothenoylcysteine decarboxylase, streptococcal. In most bacteria, a single bifunctional protein catalyses phosphopantothenoylcysteine decarboxylase and phosphopantothenate--cysteine ligase activities, sequential steps in coenzyme A biosynthesis (see TIGR00521). These activities reside in separate proteins encoded by tandem genes in some bacterial lineages. This model describes proteins from the genera Streptococcus and Enterococcus homologous to the N-terminal region of TIGR00521, corresponding to phosphopantothenoylcysteine decarboxylase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 177
18582 131169 TIGR02114 coaB_strep phosphopantothenate--cysteine ligase, streptococcal. In most bacteria, a single bifunctional protein catalyses phosphopantothenoylcysteine decarboxylase and phosphopantothenate--cysteine ligase activities, sequential steps in coenzyme A biosynthesis (see TIGR00521). These activities reside in separate proteins encoded by tandem genes in some bacterial lineages. This model describes proteins from the genera Streptococcus and Enterococcus homologous to the C-terminal region of TIGR00521, corresponding to phosphopantothenate--cysteine ligase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 227
18583 131170 TIGR02115 potass_kdpF K+-transporting ATPase, KdpF subunit. This model describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (TIGR00680). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation. [Transport and binding proteins, Cations and iron carrying compounds] 24
18584 131171 TIGR02116 toxin_Txe_YoeB toxin-antitoxin system, toxin component, Txe/YoeB family. The Axe-Txe pair in Enterococcus faecium and the homologous YefM-YoeB pair in Escherichia coli have been shown to act as an antitoxin-toxin pair. This model describes the toxin component. Nearly every example found is next to an identifiable antitoxin, as indicated by match to models TIGR01552 and/or pfam02604. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 80
18585 273979 TIGR02117 chp_urease_rgn conserved hypothetical protein. This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species. [Hypothetical proteins, Conserved] 208
18586 131173 TIGR02118 TIGR02118 conserved hypothetical protein. This model represents a small family of proteins of unknown function, each about 105 amino acids in length. Conserved sites in the multiple alignment include a pair of aromatic residues, a histidine, and an aspartate. [Hypothetical proteins, Conserved] 100
18587 131174 TIGR02119 panF sodium/pantothenate symporter. Pantothenate (vitamin B5) is a precursor of coenzyme A and is made from aspartate and 2-oxoisovalerate in most bacteria with completed genome sequences. However, some pathogens must import pantothenate. This model describes PanF, a sodium/pantothenate symporter, from a larger family of Sodium/substrate symporters (pfam00474). Several species that have this transporter appear to lack all enzymes of pantothenate biosynthesis, namely Haemophilus influenzae, Pasteurella multocida, Fusobacterium nucleatum, and Borrelia burgdorferi. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A, Transport and binding proteins, Other] 471
18588 273980 TIGR02120 GspF type II secretion system protein F. This membrane protein is a component of the terminal branch complex of the general secretion pathway (GSP), also known as the "Type II" secretion pathway. The GSP transports proteins (generally virulence-associated cell wall hydrolases) across the outer membrane of the bacterial cell. Transport across the inner membrane is often, but not exclusively, handled by the Sec system. This model was constructed from the broader subfamily model, pfam00482 which includes components of pilin complexes (PilC) as well as other related genes. GspF is nearly always gene clustered with other GSP subunits. Some genes from Xylella and Xanthomonas strains score below the trusted cutoff due to excessive divergence from the family such that a sequence from Deinococcus which does not appear to be GspF scores higher. [Protein fate, Protein and peptide secretion and trafficking] 399
18589 273981 TIGR02121 Na_Pro_sym sodium/proline symporter. This family consists of the sodium/proline symporter (proline permease) from a number of Gram-negative and Gram-positive bacteria and from the archaeal genus Methanosarcina. Using the related pantothenate permease as an outgroup, candidate sequences from Bifidobacterium longum and several from archaea are found to be outside the clade defined by known proline permeases. These sequences, scoring between 570 and -40, define the range between trusted and noise cutoff scores. [Transport and binding proteins, Amino acids, peptides and amines] 487
18590 273982 TIGR02122 TRAP_TAXI TRAP transporter solute receptor, TAXI family. This family is one of at least three major families of extracytoplasmic solute receptor (ESR) for TRAP (Tripartite ATP-independent Periplasmic Transporter) transporters. The others are the DctP (TIGR00787) and SmoM (pfam03480) families. These transporters are secondary (driven by an ion gradient) but composed of three polypeptides, although in some species the 4-TM and 12-TM integral membrane proteins are fused. Substrates for this transporter family are not fully characterized but, besides C4 dicarboxylates, may include mannitol and other compounds. [Transport and binding proteins, Unknown substrate] 320
18591 273983 TIGR02123 TRAP_fused TRAP transporter, 4TM/12TM fusion protein. In some species, the 12-transmembrane spanning and 4-transmembrane spanning components of tripartite ATP-independent periplasmic (TRAP)-type transporters are fused. This model describes such transporters, found in the Archaea and in Bacteria. [Transport and binding proteins, Unknown substrate] 614
18592 273984 TIGR02124 hypE hydrogenase expression/formation protein HypE. This family contains HypE (or HupE), a protein required for expression of catalytically active hydrogenase in many systems. It appears to be an accessory protein involved in maturation rather than a regulatory protein involved in expression. HypE shows considerable homology to the thiamine-monophosphate kinase ThiL (TIGR01379) and other enzymes. 320
18593 273985 TIGR02125 CytB-hydogenase Ni/Fe-hydrogenase, b-type cytochrome subunit. This model describes a family of cytochrome b proteins which appear to be specific for nickel-iron hydrogenase complexes. Every genome which contains a member of this family possesses a Ni/Fe hydrogenase according to Genome Properties (GenProp0177), and most are gene clustered with other hydrogenase components. Some Ni/Fe hydrogenase-containing species lack a member of this family but contain other CytB homologs (pfam01292) which may substitute for it. 211
18594 273986 TIGR02126 phgtail_TP901_1 phage major tail protein, TP901-1 family. This family includes the members of pfam06199 but is broader. Characterized members are major tail proteins from various phage, including lactococcal temperate bacteriophage TP901-1. [Mobile and extrachromosomal element functions, Prophage functions] 136
18595 273987 TIGR02127 pyrF_sub2 orotidine 5'-phosphate decarboxylase, subfamily 2. This model represents orotidine 5'-monophosphate decarboxylase, the PyrF protein of pyrimidine nucleotide biosynthesis. See TIGR01740 for a related but distinct subfamily of the same enzyme. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis] 261
18596 273988 TIGR02128 G6PI_arch bifunctional phosphoglucose/phosphomannose isomerase. This bifunctional isomerase is a member of the larger PGI superfamily and only distantly related to other glucose-6-phosphate isomerases. The family is limited to the archaea. 308
18597 162719 TIGR02129 hisA_euk phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase, eukaryotic type. This enzyme acts in the biosynthesis of histidine and has been characterized in S. cerevisiae and Arabidopsis where it complements the E. coli HisA gene. In eukaryotes the gene is known as HIS6. In bacteria, this gene is found in Fibrobacter succinogenes, presumably due to lateral gene transfer from plants in the rumen gut. [Amino acid biosynthesis, Histidine family] 253
18598 131185 TIGR02130 dapB_plant dihydrodipicolinate reductase. This narrow family includes genes from Arabidopsis and Fibrobacter succinogenes (which probably received the gene from a plant via lateral gene transfer). The sequences are distantly related to the dihydrodipicolinate reductases from archaea. In Fibrobacter this gene is the only candidate DHPR in the genome. [Amino acid biosynthesis, Aspartate family] 275
18599 131186 TIGR02131 phaP_Bmeg polyhydroxyalkanoic acid inclusion protein PhaP. This model describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage (see McCool,G.J. and Cannon,M.C, 1999). 165
18600 131187 TIGR02132 phaR_Bmeg polyhydroxyalkanoic acid synthase, PhaR subunit. This model describes a protein, PhaR, localized to polyhydroxyalkanoic acid (PHA) inclusion granules in Bacillus cereus and related species. PhaR is required for PHA biosynthesis along with PhaC and may be a regulatory subunit. 189
18601 273989 TIGR02133 RPI_actino ribose 5-phosphate isomerase. This family is a member of the RpiB/LacA/LacB subfamily (TIGR00689) but lies outside the RpiB equivalog (TIGR01120) which is also a member of that subfamily. Ribose 5-phosphate isomerase is an essential enzyme of the pentose phosphate pathway; a pathway that appears to be present in the actinobacteria. The only candidates for ribose 5-phosphate isomerase in the Actinobacteria are members of this family. 148
18602 131189 TIGR02134 transald_staph transaldolase. This small family of proteins is a member of the transaldolase subfamily represented by pfam00923. Coxiella and Staphylococcus lack members of the known transaldolase equivalog families and appear to require a transaldolase activity for completion of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway] 236
18603 273990 TIGR02135 phoU_full phosphate transport system regulatory protein PhoU. This model describes PhoU, a regulatory protein of unknown mechanism for high-affinity phosphate ABC transporter systems. The protein consists of two copies of the domain described by pfam01895. Deletion of PhoU activates constitutive expression of the phosphate ABC transporter and allows phosphate transport, but causes a growth defect and so likely has some second function. [Regulatory functions, Other, Transport and binding proteins, Anions] 212
18604 273991 TIGR02136 ptsS_2 phosphate binding protein. Members of this family are phosphate-binding proteins. Most are found in phosphate ABC-transporter operons, but some are found in phosphate regulatory operons. This model separates members of the current family from the phosphate ABC transporter phosphate binding protein described by TIGRFAMs model TIGR00975. [Transport and binding proteins, Anions] 287
18605 162723 TIGR02137 HSK-PSP phosphoserine phosphatase/homoserine phosphotransferase bifunctional protein. This protein has been characterized as both a phosphoserine phosphatase and a phosphoserine:homoserine phosphotransferase. In Pseudomonas aeruginosa, where the characterization was done, a second phosphoserine phosphatase (SerB) and a second homoserine kinase (thrB) are found, but in Fibrobacter succinogenes neither is present. This enzyme is a member of the haloacid dehalogenase (HAD) superfamily, specifically part of subfamily IB by virtue of the presence of an alpha helical domain in between motifs I and II of the HAD domain. The closest homologs to this family are monofunctional phosphoserine phosphatases (TIGR00338). 203
18606 273992 TIGR02138 phosphate_pstC phosphate ABC transporter, permease protein PstC. The typical operon for the high affinity inorganic phosphate ABC transporter encodes an ATP-binding protein, a phosphate-binding protein, and two permease proteins. This family consists of one of the two permease proteins, PstC, which is homologous to PstA (TIGR00974). In the model bacterium Escherichia coli, this transport system is induced when the concentration of extracellular inorganic phosphate is low. A constitutive, lower affinity transporter operates otherwise. [Transport and binding proteins, Anions] 295
18607 131194 TIGR02139 permease_CysT sulfate ABC transporter, permease protein CysT. This model represents CysT, one of two homologous, tandem permeases in the sulfate ABC transporter system; the other is CysW (TIGR02140). The sulfate transporter has been described in E. coli as transporting sulfate, thiosulfate, selenate, and selenite. Sulfate transporters may also transport molybdate ion if a specific molybdate transporter is not present. [Transport and binding proteins, Anions] 265
18608 162725 TIGR02140 permease_CysW sulfate ABC transporter, permease protein CysW. This model represents CysW, one of two homologous, tandem permeases in the sulfate ABC transporter system; the other is CysT (TIGR02139). The sulfate transporter has been described in E. coli as transporting sulfate, thiosulfate, selenate, and selenite. Sulfate transporters may also transport molybdate ion if a specific molybdate transporter is not present. [Transport and binding proteins, Anions] 261
18609 273993 TIGR02141 modB_ABC molybdate ABC transporter, permease protein. This model describes the permease protein, ModB, of the molybdate ABC transporter. This system has been characterized in E. coli, Staphylococcus carnosus, Rhodobacter capsulatus and Azotobacter vinelandii. Molybdate is chemically similar to sulfate, thiosulfate, and selenate. These related substrates, and sometimes molybdate itself, can be transported by the homologous sulfate receptor. Some apparent molybdenum transport operons include a permease related to this ModB, although less similar than some sulfate permease proteins and not included in this model. [Transport and binding proteins, Anions] 208
18610 131197 TIGR02142 modC_ABC molybdenum ABC transporter, ATP-binding protein. This model represents the ATP-binding cassette (ABC) protein of the three subunit molybdate ABC transporter. The three proteins of this complex are homologous to proteins of the sulfate ABC transporter. Molybdenum may be used in nitrogenases of nitrogen-fixing bacteria and in molybdopterin cofactors. In some cases, molybdate may be transported by a sulfate transporter rather than by a specific molybdate transporter. [Transport and binding proteins, Anions] 354
18611 131198 TIGR02143 trmA_only tRNA (uracil(54)-C(5))-methyltransferase. This family consists exclusively of proteins believed to act as tRNA (uracil-5-)-methyltransferase. All members so far are proteobacterial. The seed alignment was taken directly from pfam05958 in Pfam 12.0, but higher cutoffs are used to select only functionally equivalent proteins. Homologous proteins excluded by the higher cutoff scores of this model include other uracil methyltransferases, such as RumA, active on rRNA. [Protein synthesis, tRNA and rRNA base modification] 353
18612 273994 TIGR02144 LysX_arch Lysine biosynthesis enzyme LysX. The family of proteins found in this equivalog includes the characterized LysX from Thermus thermophilus, which is part of a well-organized lysine biosynthesis gene cluster. LysX is believed to carry out an ATP-dependent acylation of the amino group of alpha-aminoadipate in the prokaryotic version of the fungal AAA lysine biosynthesis pathway. No species having a sequence in this equivalog contains the elements of the more common diaminopimelate lysine biosynthesis pathway, and none has been shown to be a lysine auxotroph. These sequences have mainly received the name of the related enzyme, "ribosomal protein S6 modification protein RimK". RimK has been characterized in E. coli, and acts by ATP-dependent condensation of S6 with glutamate residues. 280
18613 273995 TIGR02145 Fib_succ_major Fibrobacter succinogenes major paralogous domain. This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron. [Cell envelope, Other] 171
18614 162728 TIGR02146 LysS_fung_arch homocitrate synthase. This model includes the yeast LYS21 gene which carries out the first step of the alpha-aminoadipate (AAA) lysine biosynthesis pathway. A related pathway is found in Thermus thermophilus. This enzyme is closely related to 2-isopropylmalate synthase (LeuA) and citramalate synthase (CimA), both of which are present in the euryarchaeota. Some archaea have a separate homocitrate synthase (AksA) which also synthesizes longer homocitrate analogs. 344
18615 273996 TIGR02147 Fsuc_second TIGR02147 family protein. This family consists of the 40 members of a paralogous protein family in the rumen anaerobe Fibrobacter succinogenes S85 and a smaller number in Bdellovibrio bacteriovorus HD100. Member proteins are about 270 residues long and appear to lack signal sequences and transmembrane helices. The only perfectly conserved residue is a glycine in an otherwise poorly conserved region, suggesting members are not enzymes. The family is not characterized. [Hypothetical proteins, Conserved] 271
18616 162730 TIGR02148 Fibro_Slime fibro-slime domain. This model represents a conserved region of about 90 amino acids, shared in at least 4 distinct large putative proteins from the slime mold Dictyostelium discoideum and 10 proteins from the rumen bacterium Fibrobacter succinogenes, and in no other species so far. We propose here the name fibro-slime domain. 90
18617 273997 TIGR02149 glgA_Coryne glycogen synthase, Corynebacterium family. This model describes Corynebacterium glutamicum GlgA and closely related proteins in several other species. This enzyme is required for glycogen biosynthesis and appears to replace the distantly related TIGR02095 family of ADP-glucose type glycogen synthase in Corynebacterium glutamicum, Mycobacterium tuberculosis, Bifidobacterium longum, and Streptomyces coelicolor. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 388
18618 273998 TIGR02150 IPP_isom_1 isopentenyl-diphosphate delta-isomerase, type 1. This model represents type 1 of two non-homologous families of the enzyme isopentenyl-diphosphate delta-isomerase (IPP isomerase). IPP is an essential building block for many compounds, including enzyme cofactors, sterols, and prenyl groups. This enzyme interconverts isopentenyl diphosphate and dimethylallyl diphosphate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 158
18619 273999 TIGR02151 IPP_isom_2 isopentenyl-diphosphate delta-isomerase, type 2. Isopentenyl-diphosphate delta-isomerase (IPP isomerase) interconverts isopentenyl diphosphate and dimethylallyl diphosphate. This model represents the type 2 enzyme. FMN, NADPH, and Mg2+ are required by this form, which lacks homology to the type 1 enzyme (TIGR02150). IPP is a precursor to many compounds, including enzyme cofactors, sterols, and isoprenoids. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 333
18620 274000 TIGR02152 D_ribokin_bact ribokinase. This model describes ribokinase, an enzyme catalyzing the first step in ribose catabolism. The rbsK gene encoding ribokinase typically is found with ribose transport genes. Ribokinase belongs to the carbohydrate kinase pfkB family (pfam00294). In the wide gulf between the current trusted (360 bit) and noise (100 bit) cutoffs are a number of sequences, few of which are clustered with predicted ribose transport genes but many of which are currently annotated as if having ribokinase activity. Most likely some have this function and others do not. [Energy metabolism, Sugars] 293
18621 274001 TIGR02153 gatD_arch glutamyl-tRNA(Gln) amidotransferase, subunit D. This peptide is found only in the Archaea. It is part of a heterodimer, with GatE (TIGR00134), that acts as an amidotransferase on misacylated Glu-tRNA(Gln) to produce Gln-tRNA(Gln). The analogous amidotransferase found in bacteria is the GatABC system, although GatABC homologs in the Archaea appear to act instead on Asp-tRNA(Asn). [Protein synthesis, tRNA aminoacylation] 404
18622 131209 TIGR02154 PhoB phosphate regulon transcriptional regulatory protein PhoB. PhoB is a DNA-binding response regulator protein acting with PhoR in a 2-component system responding to phosphate ion. PhoB acts as a positive regulator of gene expression for phosphate-related genes such as phoA, phoS, phoE and ugpAB as well as itself. It is often found proximal to genes for the high-affinity phosphate ABC transporter (pstSCAB; GenProp0190) and presumably regulates these as well. [Regulatory functions, DNA interactions, Signal transduction, Two-component systems] 226
18623 131210 TIGR02155 PA_CoA_ligase phenylacetate-CoA ligase. Phenylacetate-CoA ligase (PA-CoA ligase) catalyzes the first step in aromatic catabolism of phenylacetic acid (PA) into phenylacetyl-CoA (PA-CoA). Often located in a conserved gene cluster with enzymes involved in phenylacetic acid activation (paaG/H/I/J), phenylacetate-CoA ligase has been found among the proteobacteria as well as in gram positive prokaryotes. In the B-subclass proteobacterium Azoarcus evansii, phenylacetate-CoA ligase has been shown to be induced under aerobic and anaerobic growth conditions. It remains unclear however, whether this induction is due to the same enzyme or to another isoenzyme restricted to specific anaerobic growth conditions. [Energy metabolism, Other] 422
18624 131211 TIGR02156 PA_CoA_Oxy1 phenylacetate-CoA oxygenase, PaaG subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 289
18625 274002 TIGR02157 PA_CoA_Oxy2 phenylacetate-CoA oxygenase, PaaH subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 90
18626 131213 TIGR02158 PA_CoA_Oxy3 phenylacetate-CoA oxygenase, PaaI subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 237
18627 131214 TIGR02159 PA_CoA_Oxy4 phenylacetate-CoA oxygenase, PaaJ subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 146
18628 131215 TIGR02160 PA_CoA_Oxy5 phenylacetate-CoA oxygenase/reductase, PaaK subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 352
18629 131216 TIGR02161 napC_nirT periplasmic nitrate (or nitrite) reductase c-type cytochrome, NapC/NirT family. Nearly every member of this subfamily is NapC, a predicted membrane-anchored four-heme c-type cytochrome that forms one component of the periplasmic nitrate reductase along with NapA, NapB, NapD, NapE, and NapF subunits. A single known exception at this time is NirT, which is instead a component of a nitrite reductase. This family excludes TorC subunits of trimethylamine N-oxide (TMAO) reductases. 185
18630 274003 TIGR02162 torC trimethylamine-N-oxide reductase c-type cytochrome TorC. This family consists of TorC, a pentahemic c-type cytochrome subunit of periplasmic reductases for trimethylamine-N-oxide (TMAO). The N-terminal half is closely related to tetrahemic NapC (or NirT) subunits of periplasmic nitrate (or nitrite) reductases; some species have both TMAO and nitrate reductase complexes. 386
18631 274004 TIGR02163 napH_ ferredoxin-type protein, NapH/MauN family. Most members of this family are the NapH protein, found next to NapG, in operons that encode the periplasmic nitrate reductase. Some species with this reductase lack NapC but accomplish electron transfer to NapAB in some other manner, likely to involve NapH, NapG, and/or some other protein. A few members of this family are designated MauN and are found in methylamine utilization operons in species that appear to lack a periplasmic nitrate reductase. 255
18632 131219 TIGR02164 torA trimethylamine-N-oxide reductase TorA. This very narrowly defined family represents TorA, part of a family of related molybdoenzymes that include biotin sulfoxide reductases, dimethyl sulfoxide reductases, and at least two different subfamilies of trimethylamine-N-oxide reductases. A single enzyme from the larger family may have more than one activity. TorA typically is located in the periplasm, has a Tat (twin-arginine translocation)-dependent signal sequence, and is encoded in a torCAD operon. 822
18633 274005 TIGR02165 cas5_6_GSU0054 CRISPR-associated protein GSU0054/csb2, Dpsyc system. This model represents a CRISPR-associated protein from the Dpsyc subtype (a type I-C variant), named for Desulfotalea psychrophila LSv54. CRISPR systems confer resistance in prokaryotes to invasive DNA or RNA, including phage and plasmids. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins. 484
18634 274006 TIGR02166 dmsA_ynfE anaerobic dimethyl sulfoxide reductase, A subunit, DmsA/YnfE family. Members of this family include known and probable dimethyl sulfoxide reductase (DMSO reductase) A chains. In E. coli, dmsA encodes the canonical anaerobic DMSO reductase A chain. The paralog ynfE, as part of ynfFGH expressed from a multicopy plasmid, could complement a dmsABC deletion, suggesting a similar function and some overlap in specificity, although YnfE could not substitute for DmsA in a mixed complex. 797
18635 274007 TIGR02167 Liste_lipo_26 bacterial surface protein 26-residue repeat. This model describes a tandem peptide repeat sequence of 25 or 26 residues, found in predicted surface proteins (often lipoproteins) from Listeria monocytogenes, L. innocua, Enterococcus faecalis, Lactobacillus plantarum, Mycoplasma mycoides, Helicobacter hepaticus, and other species. 26
18636 274008 TIGR02168 SMC_prok_B chromosome segregation protein SMC, common bacterial type. SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins] 1179
18637 274009 TIGR02169 SMC_prok_A chromosome segregation protein SMC, primarily archaeal type. SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins] 1164
18638 274010 TIGR02170 thyX thymidylate synthase, flavin-dependent. Two forms of microbial thymidylate synthase are known: ThyA (2.1.1.45) and ThyX (2.1.1.148). This model describes ThyX, a homotetrameric flavoprotein. Both enzymes convert dUMP to dTMP. Under oxygen-limiting conditions, thyX can complement a thyA mutation. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 209
18639 274011 TIGR02171 Fb_sc_TIGR02171 Fibrobacter succinogenes paralogous family TIGR02171. This model describes a paralogous family of the rumen bacterium Fibrobacter succinogenes. Eleven members are found in Fibrobacter succinogenes S85, averaging over 900 amino acids in length. More than half are predicted lipoproteins. The function is unknown. 912
18640 162743 TIGR02172 Fb_sc_TIGR02172 Fibrobacter succinogenes paralogous family TIGR02172. This model describes a paralogous family of five proteins, likely to be enzymes, in the rumen bacterium Fibrobacter succinogenes S85. Members show homology to proteins described by pfam01636, a phosphotransferase enzyme family associated with resistance to aminoglycoside antibiotics. 226
18641 274012 TIGR02173 cyt_kin_arch cytidylate kinase, putative. Proteins in this family are believed to be cytidylate kinase. Members of this family are found in the archaea and in spirochaetes, and differ considerably from the common bacterial form of cytidylate kinase described by TIGR00017. 171
18642 274013 TIGR02174 CXXU_selWTH selT/selW/selH selenoprotein domain. This model represents a domain found in both bacteria and animals, including animal proteins SelT, SelW, and SelH, all of which are selenoproteins. In a CXXC motif near the N-terminus of the domain, selenocysteine may replace the second Cys. Proteins with this domain may include an insert of about 70 amino acids. This model is broader than the current SelW model pfam05169 in Pfam. 73
18643 274014 TIGR02175 PorC_KorC 2-oxoacid:acceptor oxidoreductase, gamma subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins (H. pylori) used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes the gamma subunit. In Pyrococcus furiosus, enzymes active on pyruvate and 2-ketoisovalerate share a common gamma subunit. 177
18644 131231 TIGR02176 pyruv_ox_red pyruvate:ferredoxin (flavodoxin) oxidoreductase, homodimeric. This model represents a single chain form of pyruvate:ferredoxin (or flavodoxin) oxidoreductase. This enzyme may transfer electrons to nitrogenase in nitrogen-fixing species. Portions of this protein are homologous to gamma subunit of the four subunit pyruvate:ferredoxin (flavodoxin) oxidoreductase. 1165
18645 274015 TIGR02177 PorB_KorB 2-oxoacid:acceptor oxidoreductase, beta subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes a subfamily of beta subunits, representing mostly pyruvate and 2-ketoisovalerate specific enzymes. 287
18646 131233 TIGR02178 yeiP elongation factor P-like protein YeiP. This model represents the family of Escherichia coli protein YeiP, a close homolog of elongation factor P (TIGR00038) and probably itself a translation factor. Members of this family are found only in some Gammaproteobacteria, including E. coli and Vibrio cholerae. [Protein synthesis, Translation factors] 186
18647 131234 TIGR02179 PorD_KorD 2-oxoacid:acceptor oxidoreductase, delta subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes a subfamily of delta subunits, representing mostly pyruvate, 2-ketoisovalerate, and 2-oxoglutarate specific enzymes. The delta subunit is the smallest and resembles ferredoxins. 78
18648 274016 TIGR02180 GRX_euk Glutaredoxin. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model represents eukaryotic glutaredoxins and includes sequences from fungi, plants and metazoans as well as viruses. 83
18649 274017 TIGR02181 GRX_bact Glutaredoxin, GrxC family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This family of glutaredoxins includes the E. coli protein GrxC (Grx3) which appears to have a secondary role in reducing ribonucleotide reductase (in the absence of GrxA) possibly indicating a role in the reduction of other protein disulfides. [Energy metabolism, Electron transport] 79
18650 274018 TIGR02182 GRXB Glutaredoxin, GrxB family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model includes the highly abundant E. coli GrxB (Grx2) glutaredoxin which is notably longer than either GrxA or GrxC. Unlike the other two E. coli glutaredoxins, GrxB appears to be unable to reduce ribonucleotide reductase, and may have more to do with resistance to redox stress. [Energy metabolism, Electron transport] 209
18651 131238 TIGR02183 GRXA Glutaredoxin, GrxA family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model includes the E. coli glutaredoxin GrxA which appears to have primary responsibility for the reduction of ribonucleotide reductase. 86
18652 213689 TIGR02184 Myco_arth_vir_N Mycoplasma virulence family signal region. This model represents the N-terminal region, including a probable signal sequence or signal anchor which in most instances has four consecutive Lys residues before the hydrophobic stretch, of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. 33
18653 274019 TIGR02185 Trep_Strep putative ECF transporter S component, Trep_Strep family. This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6. [Transport and binding proteins, Unknown substrate] 189
18654 274020 TIGR02186 alph_Pro_TM conserved hypothetical protein. This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome. 261
18655 274021 TIGR02187 GlrX_arch Glutaredoxin-like domain protein. This family of archaeal proteins contains a C-terminal domain with homology to bacterial and eukaryotic glutaredoxins, including a CPYC motif. There is an N-terminal domain which has even more distant homology to glutaredoxins. The name "glutaredoxin" may be inappropriate in the sense of working in tandem with glutathione and glutathione reductase which may not be present in the archaea. The overall domain structure appears to be related to bacterial alkylhydroperoxide reductases, but the homology may be distant enough that the function of this family is wholly different. 215
18656 274022 TIGR02188 Ac_CoA_lig_AcsA acetate--CoA ligase. This model describes acetate-CoA ligase (EC 6.2.1.1), also called acetyl-CoA synthetase and acetyl-activating enzyme. It catalyzes the reaction ATP + acetate + CoA = AMP + diphosphate + acetyl-CoA and belongs to the family of AMP-binding enzymes described by pfam00501. 626
18657 274023 TIGR02189 GlrX-like_plant Glutaredoxin-like family. This family of glutaredoxin-like proteins is apparently limited to plants. Multiple isoforms are found in A. thaliana and O. sativa. 99
18658 131245 TIGR02190 GlrX-dom Glutaredoxin-family domain. This C-terminal domain with homology to glutaredoxin is fused to an N-terminal peroxiredoxin-like domain. 79
18659 274024 TIGR02191 RNaseIII ribonuclease III, bacterial. This family consists of bacterial examples of ribonuclease III. This enzyme cleaves double-stranded rRNA. It is involved in processing ribosomal RNA precursors. It is found even in minimal genomes such as Mycoplasma genitalium and Buchnera aphidicola, and in some cases has been shown to be an essential gene. These bacterial proteins contain a double-stranded RNA binding motif (pfam00035) and a ribonuclease III domain (pfam00636). Eukaryotic homologs tend to be much longer proteins with additional domains, localized to the nucleus, and not included in this family. [Transcription, RNA processing] 220
18660 131247 TIGR02192 HtrL_YibB protein YibB. The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein. [Hypothetical proteins, Conserved] 270
18661 274025 TIGR02193 heptsyl_trn_I lipopolysaccharide heptosyltransferase I. This family consists of examples of ADP-heptose:LPS heptosyltransferase I, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 319
18662 131249 TIGR02194 GlrX_NrdH Glutaredoxin-like protein NrdH. NrdH-redoxin is a representative of a class of small redox proteins that contain a conserved CXXC motif and are characterized by a glutaredoxin-like amino acid sequence and thioredoxin-like activity profile. Unlike the other glutaredoxins to which it is most closely related, NrdH apparently does not interact with glutathione/glutathione reductase, but rather with thioredoxin reductase to catalyze the reduction of ribonucleotide reductase. 72
18663 274026 TIGR02195 heptsyl_trn_II lipopolysaccharide heptosyltransferase II. This family consists of examples of ADP-heptose:LPS heptosyltransferase II, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 334
18664 274027 TIGR02196 GlrX_YruB Glutaredoxin-like protein, YruB-family. This glutaredoxin-like protein family contains the conserved CxxC motif and includes the Clostridium pasteurianum protein YruB which has been cloned from a rubredoxin operon. Somewhat related to NrdH, it is unknown whether this protein actually interacts with glutathione/glutathione reductase, or, like NrdH, some other reductant system. 74
18665 274028 TIGR02197 heptose_epim ADP-L-glycero-D-manno-heptose-6-epimerase. This family consists of examples of ADP-L-glycero-D-mannoheptose-6-epimerase, an enzyme involved in biosynthesis of the inner core of lipopolysaccharide (LPS) for Gram-negative bacteria. This enzyme is homologous to UDP-glucose 4-epimerase (TIGR01179) and belongs to the NAD dependent epimerase/dehydratase family (pfam01370). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 314
18666 274029 TIGR02198 rfaE_dom_I rfaE bifunctional protein, domain I. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in E. coli, and separate proteins in some other genomes. The longer, N-terminal domain I (this family) is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (TIGR02199) adds ADP to yield ADP-D-glycero-D-manno-heptose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 315
18667 131254 TIGR02199 rfaE_dom_II rfaE bifunctional protein, domain II. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in E. coli, and separate proteins in some other genomes. Domain I (TIGR02198) is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (this family) adds ADP to yield ADP-D-glycero-D-manno-heptose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 144
18668 131255 TIGR02200 GlrX_actino Glutaredoxin-like protein. This family of glutaredoxin-like proteins is limited to the Actinobacteria and contains the conserved CxxC motif. 77
18669 131256 TIGR02201 heptsyl_trn_III lipopolysaccharide heptosyltransferase III, putative. This family consists of examples of the putative ADP-heptose:LPS heptosyltransferase III, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. This enzyme may be less widely distributed than heptosyltransferases I and II. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 344
18670 131257 TIGR02202 Ehrlichia_rpt Ehrlichia chaffeensis immunodominant surface protein repeat. This model represents 77 residues of an 80 amino acid (240 nucleotide) tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen. 77
18671 131258 TIGR02203 MsbA_lipidA lipid A export permease/ATP-binding protein MsbA. This family consists of a single polypeptide chain transporter in the ATP-binding cassette (ABC) transporter family, MsbA, which exports lipid A. It may also act in multidrug resistance. Lipid A, a part of lipopolysaccharide, is found in the outer leaflet of the outer membrane of most Gram-negative bacteria. Members of this family are restricted to the Proteobacteria (although lipid A is more broadly distributed) and often are clustered with lipid A biosynthesis genes. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 571
18672 131259 TIGR02204 MsbA_rel ABC transporter, permease/ATP-binding protein. This protein is related to a Proteobacterial ATP transporter that exports lipid A and to eukaryotic P-glycoproteins. 576
18673 274030 TIGR02205 septum_zipA cell division protein ZipA. This model represents the full length of bacterial cell division protein ZipA. The N-terminal hydrophobic stretch is an uncleaved signal-anchor sequence. This is followed by an unconserved, variable length, low complexity region, and then a conserved C-terminal region of about 140 amino acids (see pfam04354) that interacts with the tubulin-like cell division protein FtsZ. [Cellular processes, Cell division] 284
18674 131261 TIGR02206 intg_mem_TP0381 conserved hypothetical integral membrane protein TIGR02206. This model represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc. 222
18675 274031 TIGR02207 lipid_A_htrB lipid A biosynthesis lauroyl (or palmitoleoyl) acyltransferase. This model represents a narrow clade of acyltransferases, nearly all of which transfer a lauroyl group to KDO2-lipid IV-A, a lipid A precursor; these proteins are termed lipid A biosynthesis lauroyl acyltransferase, HtrB. An exception is a closely related paralog of E. coli HtrB, LpxP, which acts in cold shock conditions by transferring a palmitoleoyl rather than lauroyl group to the lipid A precursor. Members of this family are homologous to the family of acyltransferases responsible for the next step in lipid A biosynthesis. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 303
18676 274032 TIGR02208 lipid_A_msbB lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase. This family consists of MsbB in E. coli and closely related proteins in other species. MsbB is homologous to HtrB (TIGR02207) and acts immediately after it in the biosynthesis of KDO-2 lipid A (also called Re LPS and Re endotoxin). These two enzymes act after creation of KDO-2 lipid IV-A by addition of the KDO sugars. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 305
18677 131264 TIGR02209 ftsL_broad cell division protein FtsL. This model represents FtsL, both forms similar to that in E. coli and similar to that in B. subtilis. FtsL is one of the later proteins active in cell division septum formation. FtsL is small, low in complexity, and highly divergent. The scope of this model is broader than that of the pfam04999.3 for FtsL, as this one includes FtsL from Bacillus subtilis and related species. [Cellular processes, Cell division] 85
18678 274033 TIGR02210 rodA_shape rod shape-determining protein RodA. This protein is a member of the FtsW/RodA/SpoVE family (pfam01098). It is found only in species with rod (or spiral) shapes. In many species, mutation of rodA has been shown to correlate with loss of the normal rod shape. Note that RodA homologs are found, scoring below the cutoffs for this model, in a number of both rod-shaped and coccoid bacteria, including four proteins in Bacillus anthracis, for example. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division] 352
18679 131266 TIGR02211 LolD_lipo_ex lipoprotein releasing system, ATP-binding protein. This model represents LolD, a member of the ABC transporter family (pfam00005). LolD is involved in localization of lipoproteins in some bacteria. It works with a transmembrane protein LolC, which in some species is a paralogous pair LolC and LolE. Depending on the residue immediately following the new, modified N-terminal Cys residue, the nascent lipoprotein may be carried further by LolA and LolB to the outer membrane, or remain at the inner membrane. The top scoring proteins excluded by this model include homologs from the archaeal genus Methanosarcina. [Protein fate, Protein and peptide secretion and trafficking] 221
18680 274034 TIGR02212 lolCE lipoprotein releasing system, transmembrane protein, LolC/E family. This model describes the LolC protein, and its paralog LolE found in some species. These proteins are homologous to permease proteins of ABC transporters. In some species, two paralogs occur, designated LolC and LolE. In others, a single form is found and tends to be designated LolC. [Protein fate, Protein and peptide secretion and trafficking] 411
18681 131268 TIGR02213 lolE_release lipoprotein releasing system, transmembrane protein LolE. This protein is part of an unusual ABC transporter complex that releases lipoproteins from the periplasmic side of the bacterial inner membrane, rather than transport any substrate across the inner membrane. In some species, the permease-like transmembrane protein is represented by two paralogs, LolC and LolE, both in the LolCDE complex. This family consists of LolE, as found in E. coli and related species. [Protein fate, Protein and peptide secretion and trafficking] 411
18682 131269 TIGR02214 spoVD_pbp stage V sporulation protein D. This model describes the spoVD subfamily of homologs of the cell division protein FtsI, a penicillin binding protein. This subfamily is restricted to Bacillus subtilis and related Gram-positive species with known or suspected endospore formation capability. In these species, the functional equivalent of FtsI is designated PBP-2B, a paralog of spoVD. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination] 636
18683 274035 TIGR02215 phage_chp_gp8 phage conserved hypothetical protein, phiE125 gp8 family. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. Members of this family show some similarity to members of pfam05135, a putative DNA packaging protein family. [Mobile and extrachromosomal element functions, Prophage functions] 188
18684 274036 TIGR02216 phage_TIGR02216 phage conserved hypothetical protein. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. [Mobile and extrachromosomal element functions, Prophage functions] 58
18685 274037 TIGR02217 chp_TIGR02217 TIGR02217 family protein. This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes. [Mobile and extrachromosomal element functions, Prophage functions] 210
18686 274038 TIGR02218 phg_TIGR02218 phage conserved hypothetical protein BR0599. This model describes a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions. [Mobile and extrachromosomal element functions, Prophage functions] 229
18687 131274 TIGR02219 phage_NlpC_fam putative phage cell wall peptidase, NlpC/P60 family. Members of this family show sequence similarity to members of the NlpC/P60 family described by pfam00877 and by Anantharaman and Aravind (). The NlpC/P60 family includes a number of characterized bacterial cell wall hydrolases. Members of this related family are all found in prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions] 134
18688 274039 TIGR02220 phg_TIGR02220 phage conserved hypothetical protein, C-terminal domain. This model represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown. [Mobile and extrachromosomal element functions, Prophage functions] 77
18689 274040 TIGR02221 cas_TM1812 CRISPR-associated protein, TM1812 family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family, represented by TM1812 of Thermotoga maritima, is found also in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, a large plasmid of Synechocystis sp. PCC 6803, and Fibrobacter succinogenes S85. 218
18690 131277 TIGR02222 chap_CsaA export-related chaperone protein CsaA. This model describes Bacillus subtilis CsaA, an export-related chaperone that interacts with the Sec system, and related proteins from a number of other bacteria and archaea. The crystal structure is known for the homodimer from Thermus thermophilus. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking] 107
18691 274041 TIGR02223 ftsN cell division protein FtsN. FtsN is a poorly conserved protein active in cell division in a number of Proteobacteria. The N-terminal 30 residue region tends to be Lys/Arg-rich, and is followed by a membrane-spanning region. This is followed by an acidic low-complexity region of variable length and a well-conserved C-terminal domain of two tandem regions matched by pfam05036 (Sporulation related repeat), found in several cell division and sporulation proteins. The role of FtsN as a suppressor for other cell division mutations is poorly understood; it may involve cell wall hydrolysis. [Cellular processes, Cell division] 298
18692 274042 TIGR02224 recomb_XerC tyrosine recombinase XerC. The phage integrase family describes a number of recombinases with tyrosine active sites that transiently bind covalently to DNA. Many are associated with mobile DNA elements, including phage, transposons, and phase variation loci. This model represents XerC, one of two closely related chromosomal proteins along with XerD (TIGR02225). XerC and XerD are site-specific recombinases which help resolve chromosome dimers to monomers for cell division after DNA replication. In species with a large chromosome and homologs of XerC on other replicons, the chromosomal copy was preferred for building this model. This model does not detect all XerC, as some apparent XerC examples score in the gray zone between trusted (450) and noise (410) cutoffs, along with some XerD examples. XerC and XerD interact with cell division protein FtsK. [DNA metabolism, DNA replication, recombination, and repair] 295
18693 274043 TIGR02225 recomb_XerD tyrosine recombinase XerD. The phage integrase family describes a number of recombinases with tyrosine active sites that transiently bind covalently to DNA. Many are associated with mobile DNA elements, including phage, transposons, and phase variation loci. This model represents XerD, one of two closely related chromosomal proteins along with XerC (TIGR02224). XerC and XerD are site-specific recombinases which help resolve chromosome dimers to monomers for cell division after DNA replication. In species with a large chromosome and with homologs of XerD on other replicons, the chromosomal copy was preferred for building this model. This model does not detect all XerD, as some apparent XerD examples score below the trusted and noise cutoff scores. XerC and XerD interact with cell division protein FtsK. [DNA metabolism, DNA replication, recombination, and repair] 291
18694 131281 TIGR02226 two_anch N-terminal double-transmembrane domain. This model represents a prokaryotic N-terminal region of about 80 amino acids. The predicted membrane topology by TMHMM puts the N-terminus outside and spans the membrane twice, with a cytosolic region of about 25 amino acids between the two transmembrane regions. Member proteins tend to be between 600 and 1000 amino acids in length. [Hypothetical proteins, Domain] 82
18695 274044 TIGR02227 sigpep_I_bact signal peptidase I, bacterial type. This model represents signal peptidase I from most bacteria. Eukaryotic sequences are likely organellar. Several bacteria have multiple paralogs, but these represent isozymes of signal peptidase I. Virtually all known bacteria may be presumed to A related model finds a similar protein in many archaea and a few bacteria, as well as a microsomal (endoplasmic reticulum) protein in eukaryotes. [Protein fate, Protein and peptide secretion and trafficking] 142
18696 131283 TIGR02228 sigpep_I_arch signal peptidase I, archaeal type. This model represents signal peptidase I from most archaea, a subunit of the eukaryotic endoplasmic reticulum signal peptidase I complex, and an apparent signal peptidase I from a small number of bacteria. It is related to but does not overlap in hits with TIGR02227, the bacterial and mitochondrial signal peptidase I. 158
18697 131284 TIGR02229 caa3_sub_IV caa(3)-type oxidase, subunit IV. This model represents a small set of proteins with weak similarity to the sequences in pfam03626, which describes the cytochrome C oxidase subunit IV. [Energy metabolism, Electron transport] 92
18698 131285 TIGR02230 ATPase_gene1 F0F1-ATPase subunit, putative. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. [Energy metabolism, ATP-proton motive force interconversion] 100
18699 274045 TIGR02231 TIGR02231 conserved hypothetical protein. This family consists of proteins over 500 amino acids long in Caenorhabditis elegans and several bacteria (Pseudomonas aeruginosa, Nostoc sp. PCC 7120, Leptospira interrogans, etc.). The function is unknown. 525
18700 200169 TIGR02232 myxo_disulf_rpt Myxococcus cysteine-rich repeat. This model represents a sequence region shared between several proteins of Myxococcus xanthus DK 1622 and some eukaryotic proteins that include human pappalysin-1 (SP|Q13219). The region of about 40 amino acids contains several conserved Cys residues presumed to form disulfide bonds. The region appears in up to 13 repeats in Myxococcus. 38
18701 274046 TIGR02234 trp_oprn_chp trp region conserved hypothetical membrane protein. Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis. 202
18702 131289 TIGR02235 menA_cyano-plnt 1,4-dihydroxy-2-naphthoate phytyltransferase. This family of phytyltransferases, found in plants and cyanobacteria, is involved in the biosynthesis of phylloquinone (Vitamin K1). Phylloquinone is a critical component of photosystem I. The closely related MenA enzyme from bacteria transfers a prenyl group (which only differs in the saturation of the isoprenyl groups) in the biosynthesis of menaquinone. Activity towards both substrates in certain organisms should be considered a possibility. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 285
18703 131290 TIGR02236 recomb_radA DNA repair and recombination protein RadA. This family consists exclusively of archaeal RadA protein, a homolog of bacterial RecA (TIGR02012), eukaryotic RAD51 (TIGR02239), and archaeal RadB (TIGR02237). This protein is involved in DNA repair and recombination. The member from Pyrococcus horikoshii contains an intein. [DNA metabolism, DNA replication, recombination, and repair] 310
18704 274047 TIGR02237 recomb_radB DNA repair and recombination protein RadB. This family consists exclusively of archaeal RadB protein, a homolog of bacterial RecA (TIGR02012), eukaryotic RAD51 (TIGR02239) and DMC1 (TIGR02238), and archaeal RadA (TIGR02236). 209
18705 131292 TIGR02238 recomb_DMC1 meiotic recombinase Dmc1. This model describes DMC1, a subfamily of a larger family of DNA repair and recombination proteins. It is eukaryotic only and most closely related to eukaryotic RAD51. It also resembles archaeal RadA (TIGR02236) and RadB (TIGR02237) and bacterial RecA (TIGR02012). It has been characterized for human as a recombinase active only in meiosis. 313
18706 274048 TIGR02239 recomb_RAD51 DNA repair protein RAD51. This eukaryotic sequence family consists of RAD51, a protein involved in DNA homologous recombination and repair. It is similar in sequence to the exclusively meiotic recombinase DMC1 (TIGR02238), to archaeal families RadA (TIGR02236) and RadB (TIGR02237), and to bacterial RecA (TIGR02012). 316
18707 131294 TIGR02240 PHA_depoly_arom poly(3-hydroxyalkanoate) depolymerase. This family consists of the polyhydroxyalkanoic acid (PHA) depolymerase of Pseudomonas oleovorans, Pseudomonas putida BM01, and related species. This enzyme is part of polyester storage and mobilization system as in many bacteria. However, species containing this enzyme are unusual in their capacity to produce aromatic polyesters when grown on carbon sources such as benzoic acid or phenylacetic acid. [Energy metabolism, Other] 276
18708 274049 TIGR02241 TIGR02241 conserved hypothetical phage tail region protein. This family consists of uncharacterized proteins. All members so far represent bacterial genes found in apparent phage or otherwise laterally transferred regions of the chromosome. Tentatively identified neighboring proteins tend to be phage tail region proteins. In some species, including Photorhabdus luminescens TTO1, several members of this family may be encoded near each other. 140
18709 274050 TIGR02242 tail_TIGR02242 phage tail protein domain. This model describes a region of sequence similarity shared by a number of uncharacterized proteins in bacterial genomes, including Geobacter sulfurreducens PCA, Mesorhizobium loti, Streptomyces coelicolor A3(2), Gloeobacter violaceus PCC 7421, and Myxococcus xanthus. In all cases, the genomic region resembles a phage tail region, based on tentative identifications of neighboring genes. A region of this domain resembles a region of TIGR01634, another phage tail protein model. [Mobile and extrachromosomal element functions, Prophage functions] 130
18710 274051 TIGR02243 TIGR02243 putative baseplate assembly protein. This family consists of a large, conserved hypothetical protein in phage tail-like regions of at least six bacterial genomes: Gloeobacter violaceus PCC 7421, Geobacter sulfurreducens PCA, Streptomyces coelicolor A3(2), Streptomyces avermitilis MA-4680, Mesorhizobium loti, and Myxococcus xanthus. The C-terminal region is identified by the broader model pfam04865 as related to baseplate protein J from phage P2, but that relationship is not observed directly. [Mobile and extrachromosomal element functions, Prophage functions] 656
18711 274052 TIGR02244 HAD-IG-Ncltidse HAD superfamily (subfamily IG) hydrolase, 5'-nucleotidase. This model includes a 5'-nucleotidase specific for purines (IMP and GMP). These enzymes are members of the Haloacid Dehalogenase (HAD) superfamily. HAD members are recognized by three short motifs {hhhhDxDx(T/V)}, {hhhh(T/S)}, and either {hhhh(D/E)(D/E)x(3-4)(G/N)} or {hhhh(G/N)(D/E)x(3-4)(D/E)} (where "h" stands for a hydrophobic residue). Crystal structures of many HAD enzymes have verified PSI-PRED predictions of secondary structural elements which show each of the "hhhh" sequences of the motifs as part of beta sheets. This subfamily of enzymes is part of "Subfamily I" of the HAD superfamily by virtue of a "cap" domain in between motifs 1 and 2. This subfamily's cap domain has a different predicted secondary structure than all other known HAD enzymes and thus has been designated "subfamily IG". This domain appears to consist of a mixed alpha/beta fold. A Pfam model (pfam05761) detects an identical range of sequences above the trusted cutoff, but does not model the N-terminal motif 1 region. A TIGRFAMs model (TIGR01993) represents a (putative) family of _pyrimidine_ 5'-nucleotidases which are also subfamily I HAD's, which should not be confused with the current model. 343
18712 131299 TIGR02245 HAD_IIID1 HAD-superfamily subfamily IIID hydrolase, TIGR02245. This family of sequences appears to belong to the Haloacid Dehalogenase (HAD) superfamily of enzymes by virtue of the presence of three catalytic domains, in this case: LLVLD(ILV)D(YH)T, I(VMG)IWS, and (DN)(VC)K(PA)Lx{15-17}T(IL)(MH)(FV)DD(IL)(GRS)(RK)N. Since this family has no large "cap" domain between motifs 1 and 2 or between 2 and 3, it is formally a "class III" HAD. 195
18713 274053 TIGR02246 TIGR02246 conserved hypothetical protein. This family consists of uncharacterized proteins found in a number of genera and species, including Streptomyces, Xanthomonas, Oceanobacillus iheyensis, Caulobacter crescentus CB15, and Xylella fastidiosa. The function is unknown. 128
18714 274054 TIGR02247 HAD-1A3-hyp epoxide hydrolase N-terminal domain-like phosphatase. This model represents a small clade of sequences including C. elegans and mammalian sequences as well as a small number of bacteria. In eukaryotes, this domain exists as an N-terminal fusion to the soluble epoxide hydrolase enzyme and has recently been shown to be an active phosphatase, although the nature of the biological substrate is unclear. These appear to be members of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases by general homology and the conservation of all of the recognized catalytic motifs (although the first motif is unusual in the replacement of the more common aspartate with glycine...). The variable domain is found in between motifs 1 and 2, indicating membership in subfamily I and phylogeny and prediction of the alpha helical nature of the variable domain (by PSI-PRED) indicate membership in subfamily IA. 211
18715 131302 TIGR02248 mutH_TIGR DNA mismatch repair endonuclease MutH. This family consists exclusively of MutH, an endonuclease in some Proteobacteria that is activated by MutS1 and MutL for methylation-directed mismatch repair. [DNA metabolism, DNA replication, recombination, and repair] 217
18716 131303 TIGR02249 integrase_gron integron integrase. Members of this family are integrases associated with integrons (and super-integrons), which are systems for incorporating and expressing cassettes of laterally transferred DNA. Incorporation occurs at an attI site. A super-integron, as in Vibrio sp., may include over 100 cassettes. This family belongs to the phage integrase family (pfam00589) that also includes recombinases XerC (TIGR02224) and XerD (TIGR02225), which are bacterial housekeeping proteins. Within this family of integron integrases, some are designated by class, e.g. IntI4, a class 4 integron integrase from Vibrio cholerae N16961. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Other] 315
18717 131304 TIGR02250 FCP1_euk FCP1-like phosphatase, phosphatase domain. This model represents the phosphatase domain of the human RNA polymerase II subunit A C-terminal domain phosphatase (FCP1) and closely related phosphatases from eukaryotes including plants, fungi, and slime mold. This domain is a member of the haloacid dehalogenase (HAD) superfamily by virtue of a conserved set of three catalytic motifs and a conserved fold as predicted by PSIPRED. The third motif in this family is distinctive (hhhhDDppphW). This domain is classified as a "Class III" HAD, since there is no large "cap" domain found between motifs 1 and 2 or motifs 2 and 3. This domain is related to domains found in the human NLI interacting factor-like phosphatases, and together both are detected by the pfam03031. 156
18718 274055 TIGR02251 HIF-SF_euk Dullard-like phosphatase domain. This model represents the putative phosphatase domain of a family of eukaryotic proteins including "Dullard", and the NLI interacting factor (NIF)-like phosphatases. This domain is a member of the haloacid dehalogenase (HAD) superfamily by virtue of a conserved set of three catalytic motifs and a conserved fold as predicted by PSIPRED. The third motif in this family is distinctive (hhhhDNxPxxa) and apparently lacking the last aspartate. This domain is classified as a "Class III" HAD, since there is no large "cap" domain found between motifs 1 and 2 or motifs 2 and 3. This domain is related to domains found in FCP1-like phosphatases (TIGR02250), and together both are detected by the pfam03031. 162
18719 274056 TIGR02252 DREG-2 REG-2-like, HAD superfamily (subfamily IA) hydrolase. This family of proteins includes uncharacterized sequences from eukaryotes, cyanobacteria and Leptospira as well as the DREG-2 protein from Drosophila melanogaster which has been identified as a rhythmically (diurnally) regulated gene. This family is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called 'capping domain', or the absence of such a domain. This family is a member of subfamily 1A in which the cap domain consists of a predicted alpha helical bundle found in between the first and second catalytic motifs. A distinctive feature of this family is a conserved tandem pair of tryptophan residues in the cap domain. The most divergent sequences included within the scope of this model are from plants and have "FW" at this position instead. Most likely, these sequences, like the vast majority of HAD sequences, represent phosphatase enzymes. 203
18720 274057 TIGR02253 CTE7 HAD superfamily (subfamily IA) hydrolase, TIGR02253. This family of sequences from archaea and metazoans includes the human uncharacterized protein CTE7. Pyrococcus species appear to have three different forms of this enzyme, so it is unclear whether all members of this family have the same function. This family is a member of the haloacid dehalogenase (HAD) superfamily of hydrolases which are characterized by three conserved sequence motifs. By virtue of an alpha helical domain in-between the first and second conserved motif, this family is a member of subfamily IA (TIGR01549). 221
18721 162788 TIGR02254 YjjG/YfnB noncanonical pyrimidine nucleotidase, YjjG family. This HAD superfamily family includes YjjG from E. coli and YfnB from B. subtilis. YjjG has been shown to act as a house-cleaning enzyme, cleaving nucleotides with non-canonical nucleotide bases. This family is a member of the haloacid dehalogenase (HAD) superfamily of hydrolases which are characterized by three conserved sequence motifs. By virtue of an alpha helical domain in-between the first and second conserved motif, this family is a member of subfamily IA (TIGR01549). 224
18722 131309 TIGR02256 ICE_VC0181 integrative and conjugative element protein, VC0181 family. This uncharacterized protein is found in several Proteobacteria, among them Rhizobium sp. NGR234, Vibrio cholerae, Myxococcus xanthus, and E. coli strain ECOR31. In the latter, it is part of an integrative and conjugative element that is readily induced to excise and circularize. 131
18723 131310 TIGR02257 cobalto_cobN cobaltochelatase, CobN subunit. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 1122
18724 274058 TIGR02258 2_5_ligase 2'-5' RNA ligase. This protein family consists of bacterial and archaeal proteins with two tandem copies of Pfam domain pfam02834. Members for which activity has been measured perform a reversible, ATP-independent 2'-5'-ligation of what is presumably a non-physiological substrate: half-tRNA splice intermediates from an intron-containing yeast tRNA. The physiological substrate(s) in prokaryotes may include small 2'-5'-link-containing oligonucleotides, perhaps with regulatory or biosynthetic roles. [Transcription, RNA processing] 179
18725 131312 TIGR02259 benz_CoA_red_A benzoyl-CoA reductase, bcr type, subunit A. This model describes A, or gamma, subunit of the bcr type of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This family shows strong sequence similarity to the 2-hydroxyglutaryl-CoA dehydratase alpha chain and to subunits of different types of benzoyl-CoA reductase (such as the bzd type). 432
18726 131313 TIGR02260 benz_CoA_red_B benzoyl-CoA reductase, bcr type, subunit B. This model describes B, or beta, subunit of the bcr type of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. 413
18727 131314 TIGR02261 benz_CoA_red_D benzoyl-CoA reductase, bcr type, subunit D. This model describes the D subunit of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This family shows sequence similarity to the A subunit (TIGR02259) and to the 2-hydroxyglutaryl-CoA dehydratase alpha chain. 262
18728 274059 TIGR02262 benz_CoA_lig benzoate-CoA ligase family. Characterized members of this protein family include benzoate-CoA ligase, 4-hydroxybenzoate-CoA ligase, 2-aminobenzoate-CoA ligase, etc. Members are related to fatty acid and acetate CoA ligases. 505
18729 131316 TIGR02263 benz_CoA_red_C benzoyl-CoA reductase, subunit C. This model describes C subunit of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This enzyme acts under anaerobic conditions. 380
18730 131317 TIGR02264 gmx_para_CXXCG Myxococcus xanthus double-CXXCG motif paralogous family. This family consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment. 237
18731 131318 TIGR02265 Mxa_TIGR02265 Myxococcales-restricted protein, TIGR02265 family. This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622. Members are about 200 amino acids in length. No other homologs are known; the function is unknown. 179
18732 274060 TIGR02266 gmx_TIGR02266 Myxococcus xanthus paralogous domain TIGR02266. This domain is related to Type IV pilus assembly protein PilZ (pfam07238). It is found in at least 12 copies in Myxococcus xanthus DK 1622. 96
18733 131320 TIGR02267 TIGR02267 DUSAM domain. This family consists of at least eight paralogs in Myxococcus xanthus and six in Stigmatella aurantiaca DW4/3-1, both members of Myxococcales order within the Deltaproteobacteria. The function is unknown. Some member proteins consist of two copies of the domain. This domain is hereby named DUSAM, DUplication in Stigmatella And Myxococcus. 123
18734 131321 TIGR02268 TIGR02268 Myxococcus xanthus paralogous family TIGR02268. This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown. 295
18735 131322 TIGR02269 TIGR02269 Myxococcus xanthus paralogous lipoprotein family TIGR02269. This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown. 211
18736 131323 TIGR02270 TIGR02270 conserved hypothetical protein. Members are found in Myxococcus xanthus (six members), Geobacter sulfurreducens, and Pseudomonas aeruginosa; a short protein homologous to the N-terminal region is found in Mesorhizobium loti. All sequences are from Proteobacteria. The function is unknown. [Hypothetical proteins, Conserved] 410
18737 131324 TIGR02271 TIGR02271 conserved domain. This model describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction center complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known. 115
18738 131325 TIGR02272 gentisate_1_2 gentisate 1,2-dioxygenase. This family consists of gentisate 1,2-dioxygenases. This ring-opening enzyme acts in salicylate degradation that goes via gentisate rather than via catechol. It converts gentisate to maleylpyruvate. Some putative gentisate 1,2-dioxygenases are excluded by a relatively high trusted cutoff score because they are too closely related to known examples of 1-hydroxy-2-naphthoate dioxygenase. Therefore some homologs may be bona fide gentisate 1,2-dioxygenases even if they score below the given cutoffs. 335
18739 274061 TIGR02273 16S_RimM 16S rRNA processing protein RimM. This family consists of the bacterial protein RimM (YfjA, 21K), a 30S ribosomal subunit-binding protein implicated in 16S ribosomal RNA processing. It has been partially characterized in Escherichia coli and is found with other translation-associated genes such as trmD. It is broadly distributed among bacteria, including some minimal genomes such as the aphid endosymbiont Buchnera aphidicola. The protein contains a PRC-barrel domain that it shares with other protein families (pfam05239) and a unique domain (pfam01782). This model describes the full-length protein. A member from Arabidopsis (plant) has additional N-terminal sequence likely to represent a chloroplast transit peptide. [Transcription, RNA processing] 165
18740 274062 TIGR02274 dCTP_deam deoxycytidine triphosphate deaminase. Members of this family include the Escherichia coli monofunctional deoxycytidine triphosphate deaminase (dCTP deaminase) and a Methanocaldococcus jannaschii bifunctional dCTP deaminase (3.5.4.13)/dUTP diphosphatase (EC 3.6.1.23), which has the EC number 3.5.4.30 for the overall operation. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 179
18741 274063 TIGR02275 DHB_AMP_lig 2,3-dihydroxybenzoate-AMP ligase. Proteins in this family belong to the AMP-binding enzyme family (pfam00501). Members activate 2,3-dihydroxybenzoate (DHB) by ligation of AMP from ATP with the release of pyrophosphate; many are involved in synthesis of siderophores such as enterobactin, vibriobactin, vulnibactin, etc. The most closely related protein believed to differ in function activates salicylate rather than DHB. [Transport and binding proteins, Cations and iron carrying compounds] 526
18742 213697 TIGR02276 beta_rpt_yvtn 40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. Archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (pfam00400). 42
18743 274064 TIGR02277 PaaX_trns_reg phenylacetic acid degradation operon negative regulatory protein PaaX. This transcriptional regulator is always found in association with operons believed to be involved in the degradation of phenylacetic acid. The gene product has been shown to bind to the promoter sites and repress their transcription. [Regulatory functions, DNA interactions] 280
18744 131331 TIGR02278 PaaN-DH phenylacetic acid degradation protein paaN. This enzyme is proposed to act in the ring-opening step of phenylacetic acid degradation which follows ligation of the acid with coenzyme A (by PaaF) and hydroxylation by a multicomponent non-heme iron hydroxylase complex (PaaGHIJK). Gene symbols have been standardized in. This enzyme is related to aldehyde dehydrogenases and has domains which are members of the pfam00171 and pfam01575 families. This family includes paaN genes from Pseudomonas, Sinorhizobium, Rhodopseudomonas, Escherichia, Deinococcus and Corynebacterium. Another homology family (TIGR02288) includes several other species. 663
18745 188207 TIGR02279 PaaC-3OHAcCoADH 3-hydroxyacyl-CoA dehydrogenase PaaC. This 3-hydroxyacyl-CoA dehydrogenase is involved in the degradation of phenylacetic acid, presumably in steps following the opening of the phenyl ring. The sequences included in this model are all found in apparent operons with other related genes such as paaA, paaB, paaD, paaE, paaF and paaN. Some genomes contain these other genes without an apparent paaC in the same operon - possibly in these cases a different dehydrogenase involved in fatty acid degradation may fill in the needed activity. This enzyme has domains which are members of the pfam02737 and pfam00725 families. 503
18746 274065 TIGR02280 PaaB1 phenylacetate degradation probable enoyl-CoA hydratase paaB. This family of proteins is found within apparent operons for the degradation of phenylacetic acid. These proteins contain the enoyl-CoA hydratase domain as detected by pfam00378. This activity is consistent with current hypotheses for the degradation pathway, which involve the ligation of phenylacetate with coenzyme A (paaF), hydroxylation (paaGHIJK), ring-opening (paaN) and degradation of the resulting fatty acid-like compound to a Krebs cycle intermediate (paaABCDE). 257
18747 131334 TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria). [Protein fate, Degradation of proteins, peptides, and glycopeptides] 121
18748 274066 TIGR02282 MltB lytic murein transglycosylase B. This family consists of lytic murein transglycosylases (murein hydrolases) in the family of MltB, which is a membrane-bound lipoprotein in Escherichia coli. The N-terminal lipoprotein modification motif is conserved in about half the members of this family. The term Slt35 describes a naturally occurring soluble fragment of MltB. Members of this family never contain the putative peptidoglycan binding domain described by pfam01471, which is associated with several classes of bacterial cell wall lytic enzymes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 290
18749 274067 TIGR02283 MltB_2 lytic murein transglycosylase. Members of this family are closely related to the MltB family lytic murein transglycosylases described by TIGR02282 and are likewise all proteobacterial, although that family and this one form clearly distinct clades. Several species have one member of each family. Many members of this family (unlike the MltB family) contain an additional C-terminal domain, a putative peptidoglycan binding domain (pfam01471), not included in region described by this model. Many sequences appear to contain N-terminal lipoprotein attachment sites, as does E. coli MltB in TIGR02282. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 300
18750 131337 TIGR02284 TIGR02284 conserved hypothetical protein. Members of this protein family are found mostly in the Proteobacteria, although one member is found in the marine planctomycete Pirellula sp. strain 1. The function is unknown. 139
18751 131338 TIGR02285 TIGR02285 conserved hypothetical protein. Members of this family are found in several Proteobacteria, including Pseudomonas putida KT2440, Bdellovibrio bacteriovorus HD100 (three members), Aeromonas hydrophila, and Chromobacterium violaceum ATCC 12472. The function is unknown. [Hypothetical proteins, Conserved] 268
18752 131339 TIGR02286 PaaD phenylacetic acid degradation protein PaaD. This member of the domain family TIGR00369 (which is, in turn, a member of the pfam03061 thioesterase superfamily) is nearly always found adjacent to other genes of the phenylacetic acid degradation pathway. Its function is currently unknown, but a role as a thioesterase is not inconsistent with the proposed overall pathway. Sequences scoring between trusted and noise include those from archaea and other species not known to catabolize phenylacetic acid and which are not adjacent to other genes potentially involved with such a pathway. 114
18753 131340 TIGR02287 PaaY phenylacetic acid degradation protein PaaY. Members of this family are located next to other genes organized into apparent operons for phenylacetic acid degradation. PaaY is located near the end of these gene clusters and often next to PaaX, a transcriptional regulator. [Energy metabolism, Other] 192
18754 131341 TIGR02288 PaaN_2 phenylacetic acid degradation protein paaN. This enzyme is proposed to act in the ring-opening step of phenylacetic acid degradation which follows ligation of the acid with coenzyme A (by PaaF) and hydroxylation by a multicomponent non-heme iron hydroxylase complex (PaaGHIJK). Gene symbols have been standardized in. This enzyme is related to aldehyde dehydrogenases and has a domain which is a member of the pfam00171 family. This family includes sequences from Burkholderia, Bordetella, Streptomyces. Other PaaN enzymes are represented by a separate model, TIGR02278. 551
18755 274068 TIGR02289 M3_not_pepF oligoendopeptidase, M3 family. This family consists of probable oligoendopeptidases in the M3 family, related to lactococcal PepF and group B streptococcal PepB (TIGR00181) but in a distinct clade with considerable sequence differences. The likely substrate is small peptides and not whole proteins, as with PepF, but members are not characterized and the activity profile may differ. Several bacteria have both a member of this family and a member of the PepF family. 549
18756 274069 TIGR02290 M3_fam_3 oligoendopeptidase, pepF/M3 family. The M3 family of metallopeptidases contains several distinct clades. Oligoendopeptidase F as characterized in Lactococcus, the functionally equivalent oligoendopeptidase B of group B Streptococcus, and closely related sequences are described by TIGR00181. The present family is quite similar but forms a distinct clade, and a number of species have one member of each. A greater sequence difference separates members of TIGR02289, probable oligoendopeptidases of the M3 family that probably should not be designated PepF. 587
18757 274070 TIGR02291 rimK_rel_E_lig alpha-L-glutamate ligase-related protein. Members of this protein family contain a region of homology to the RimK family of alpha-L-glutamate ligases (TIGR00768), various members of which modify the Glu-Glu C-terminus of ribosomal protein S6, or tetrahydromethanopterin, or a form of coenzyme F420 derivative. Members of this family are found so far in various Vibrio and Pseudomonas species and some other gamma and beta Proteobacteria. The function is unknown. 317
18758 274071 TIGR02292 ygfB_yecA yecA family protein. This family resembles pfam03695 (version pfam03695.3), uncharacterised protein family UPF0149, but is broader in scope and includes additional proteins. It includes E. coli proteins YgfB and YecA. The function of this family of proteins is unknown. The crystal structure is known for the member from Haemophilus influenzae (Ygfb, HI0817). [Unknown function, General] 150
18759 131346 TIGR02293 TAS_TIGR02293 putative toxin-antitoxin system antitoxin component, TIGR02293 family. Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. This family was proposed by Makarova, et al. (2009) to be the antitoxin component of a new class of type 2 toxin-antitoxin system, or addiction module. [Cellular processes, Other] 133
18760 274072 TIGR02294 nickel_nikA nickel ABC transporter, nickel/metallophore periplasmic binding protein. Members of this family are periplasmic nickel-binding proteins of nickel ABC transporters. Most appear to be lipoproteins. This protein was previously (circa 2003) thought to mediate binding to nickel through water molecules, but is now thought to involve a chelating organic molecule, perhaps butane-1,2,4-tricarboxylate, acting as a metallophore. [Transport and binding proteins, Cations and iron carrying compounds] 500
18761 213698 TIGR02295 HpaD 3,4-dihydroxyphenylacetate 2,3-dioxygenase. This enzyme catalyzes the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. 4-hydroxyphenylacetate arises from the degradation of tyrosine. The substrate, 3,4-dihydroxyphenylacetate (homoprotocatechuate) arises from the action of a hydroxylase on 4-hydroxyphenylacetate. The aromatic ring is opened by this dioxygenase exo to the 3,4-diol resulting in 2-hydroxy-5-carboxymethylmuconate semialdehyde. The enzyme from Bacillus brevis contains manganese. 294
18762 131349 TIGR02296 HpaC 4-hydroxyphenylacetate 3-monooxygenase, reductase component. This model identifies the reductase component (HpaC) of 4-hydroxyphenylacetate 3-monooxygenase. This enzyme catalyzes the first step (hydroxylation at the 3-position) in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. 4-hydroxyphenylacetate arises from the degradation of tyrosine. These reductases catalyze the reduction of free flavins by NADPH. The flavin is then utilized by the large subunit of the monooxygenase. 154
18763 131350 TIGR02297 HpaA 4-hydroxyphenylacetate catabolism regulatory protein HpaA. This putative transcriptional regulator, which contains both the substrate-binding, dimerization domain (pfam02311) and the helix-turn-helix DNA-binding domain (pfam00165) of the AraC family, is located proximal to genes of the 4-hydroxyphenylacetate catabolism pathway. 287
18764 131351 TIGR02298 HpaD_Fe 3,4-dihydroxyphenylacetate 2,3-dioxygenase. This enzyme catalyzes the ring-opening step in the degradation of 4-hydroxyphenylacetate. 282
18765 131352 TIGR02299 HpaE 5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase. This model represents the dehydrogenase responsible for the conversion of 5-carboxymethyl-2-hydroxymuconate semialdehyde to 5-carboxymethyl-2-hydroxymuconate (a tricarboxylic acid). This is the step in the degradation of 4-hydroxyphenylacetic acid via homoprotocatechuate following the oxidative opening of the aromatic ring. 488
18766 131353 TIGR02300 FYDLN_acid TIGR02300 family protein. Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown. 129
18767 131354 TIGR02301 TIGR02301 TIGR02301 family protein. Members of this uncharacterized protein family are found in a number of alphaProteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown. 121
18768 274073 TIGR02302 aProt_lowcomp TIGR02302 family protein. Members of this family are long (~850 residue) bacterial proteins from the alpha Proteobacteria. Each has 2-3 predicted transmembrane helices near the N-terminus and a long C-terminal region that includes stretches of Gln/Gly-rich low complexity sequence, predicted by TMHMM to be outside the membrane. In Bradyrhizobium japonicum, two tandem reading frames are together homologous to the single members found in other species; the cutoff scores are set low enough that the longer scores above the trusted cutoff and the shorter above the noise cutoff for this model. 851
18769 131356 TIGR02303 HpaG-C-term 4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase, C-terminal subunit. This model represents one of two subunits/domains of the bifunctional isomerase/decarboxylase involved in 4-hydroxyphenylacetate degradation. In E. coli and some other species this enzyme is encoded by a single polypeptide containing both this domain and the closely related N-terminal domain (TIGR02305). In other species such as Pasteurella multocida these domains are found as two separate proteins (usually as tandem genes). Together, these domains carry out the decarboxylation of 5-oxopent-3-ene-1,2,5-tricarboxylic acid (OPET) to 2-hydroxy-2,4-diene-1,7-dioate (HHDD) and the subsequent isomerization to 2-oxohept-3-ene-1,7-dioate (OHED). 245
18770 274074 TIGR02304 aden_form_hyp putative adenylate-forming enzyme. Members of this family form a distinct clade within a larger family of proteins that also includes coenzyme F390 synthetase, an enzyme known in Methanobacterium thermoautotrophicum and a few other methanogenic archaea. That enzyme adenylates coenzyme F420 to F390, a reversible process, during oxygen stress. Other informative homologies include domains of the non-ribosomal peptide synthetases involved in activation by adenylation. The family defined by this model is likely to be of an adenylate-forming enzyme related to but distinct from coenzyme F390 synthetase. 430
18771 131358 TIGR02305 HpaG-N-term 4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase, N-terminal subunit. This model represents one of two subunits/domains of the bifunctional isomerase/decarboxylase involved in 4-hydroxyphenylacetate degradation. In E. coli and some other species this enzyme is encoded by a single polypeptide containing both this domain and the closely related C-terminal domain (TIGR02303). In other species such as Pasteurella multocida these domains are found as two separate proteins (usually as tandem genes). Together, these domains carry out the decarboxylation of 5-oxopent-3-ene-1,2,5-tricarboxylic acid (OPET) to 2-hydroxy-2,4-diene-1,7-dioate (HHDD) and the subsequent isomerization to 2-oxohept-3-ene-1,7-dioate (OHED). 205
18772 274075 TIGR02306 RNA_lig_DRB0094 RNA ligase, DRB0094 family. The member of this family from Deinococcus radiodurans, a species that withstands and recovers from extensive radiation or desiccation damage, is an apparent RNA ligase. It repairs RNA strand breaks in nicked DNA:RNA and RNA:RNA but not DNA:DNA duplexes. It has adenylyltransferase activity associated with the C-terminal domain. Related proteins also in this family are found in Streptomyces avermitilis MA-4680 and in bacteriophage 44RR2.8t. The phage example is unsurprising since one mechanism of host cell defense against phage is cleavage and inactivation of certain tRNA molecules. A fungal sequence from Neurospora crassa scores between trusted and noise cutoffs and may be similar in function. 341
18773 274076 TIGR02307 RNA_lig_RNL2 RNA ligase, Rnl2 family. Members of this family ligate (seal breaks in) RNA. Members so far include phage proteins that can counteract a host defense of cleavage of specific tRNA molecules, trypanosome ligases involved in RNA editing, but no prokaryotic host proteins. [Mobile and extrachromosomal element functions, Prophage functions, Transcription, RNA processing] 325
18774 213699 TIGR02308 RNA_lig_T4_1 RNA ligase, T4 RnlA family. Members of this family are phage proteins with ATP-dependent RNA ligase activity. Host defense to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. [Mobile and extrachromosomal element functions, Prophage functions, Transcription, RNA processing] 374
18775 131362 TIGR02309 HpaB-1 4-hydroxyphenylacetate 3-monooxygenase, oxygenase component. The gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Deinococcus, Thermus and Oceanobacillus. Phylogenetic trees support inclusion of the Bacillus halodurans sequence above trusted although the complete 4-hydroxyphenylacetic acid degradation pathway may not exist in that organism. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241). 477
18776 213700 TIGR02310 HpaB-2 4-hydroxyphenylacetate 3-monooxygenase, oxygenase component. The gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Shigella, Photorhabdus and Pasteurella. The family represented by this model is narrowly limited to gammaproteobacteria to exclude other aromatic hydroxylases involved in various secondary metabolic pathways. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241). 519
18777 131364 TIGR02311 HpaI 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase. This model represents the aldolase which performs the final step unique to the 4-hydroxyphenylacetic acid catabolism pathway in which 2,4-dihydroxyhept-2-ene-1,7-dioic acid is split into pyruvate and succinate-semialdehyde. The gene for this enzyme is generally found adjacent to other genes for this pathway organized into an operon. 249
18778 131365 TIGR02312 HpaH 2-oxo-hepta-3-ene-1,7-dioic acid hydratase. This model represents the enzyme which hydrates the double bond of 2-oxo-hepta-3-ene-1,7-dioic acid to form 4-hydroxy-2-oxo-heptane-1,7-dioic acid in the catabolism of 4-hydroxyphenylacetic acid. The gene for this enzyme is generally found adjacent to other genes of this pathway in an apparent operon. 267
18779 131366 TIGR02313 HpaI-NOT-DapA 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase. This model represents a subset of the DapA (dihydrodipicolinate synthase) family which has apparently evolved a separate function. The product of DapA, dihydrodipicolinate, results from the non-enzymatic cyclization and dehydration of 6-amino-2,4-dihydroxyhept-2-ene-1,7-dioic acid, which is different from the substrate of this reaction only in the presence of the amino group. In the absence of this amino group, and running the reaction in the opposite direction, the reaction corresponds to the HpaI aldolase component of the 4-hydroxyphenylacetic acid catabolism pathway (see TIGR02311). At present, this variant of DapA is found only in Oceanobacillus iheyensis HTE831 and Thermus thermophilus HB27. In both of these cases, one or more other DapA genes can be found and the one identified by this model is part of an operon for 4-hydroxyphenylacetic acid catabolism. 294
18780 131367 TIGR02314 ABC_MetN D-methionine ABC transporter, ATP-binding protein. Members of this family are the ATP-binding protein of the D-methionine ABC transporter complex. Known members belong to the Proteobacteria. 343
18781 131368 TIGR02315 ABC_phnC phosphonate ABC transporter, ATP-binding protein. Phosphonates are a class of phosphorus-containing organic compound with a stable direct C-P bond rather than a C-O-P linkage. A number of bacterial species have operons, typically about 14 genes in size, with genes for ATP-dependent transport of phosphonates, degradation, and regulation of the expression of the system. Members of this protein family are the ATP-binding cassette component of tripartite ABC transporters of phosphonates. [Transport and binding proteins, Anions] 243
18782 131369 TIGR02316 propion_prpE propionate--CoA ligase. This family contains one of three readily separable clades of proteins in the group of acetate and propionate--CoA ligases. Characterized members of this family act on propionate. From propionyl-CoA, there is a cyclic degradation pathway: it is ligated by PrpC to the TCA cycle intermediate oxaloacetate, acted upon further by PrpD and an aconitase, then cleaved by PrpB to pyruvate and the TCA cycle intermediate succinate. 628
18783 131370 TIGR02317 prpB methylisocitrate lyase. Members of this family are methylisocitrate lyase, also called (2S,3R)-3-hydroxybutane-1,2,3-tricarboxylate pyruvate-lyase. This enzyme acts in propionate metabolism. It cleaves a carbon-carbon bond to convert 2-methylisocitrate to pyruvate plus succinate. Some members of this family have been annotated, incorrectly it seems, as the related protein carboxyphosphoenolpyruvate phosphomutase, which is involved in synthesizing the antibiotic bialaphos in Streptomyces hygroscopicus. 285
18784 131371 TIGR02318 phosphono_phnM phosphonate metabolism protein PhnM. This family consists of proteins in the PhnM family. PhnM is a protein associated with phosphonate utilization in a number of bacterial species. In Pseudomonas stutzeri WM88, a protein that is part of a system for the oxidation of phosphites (another form of reduced phosphorus compound) scores between trusted and noise cutoffs. [Energy metabolism, Other] 376
18785 131372 TIGR02319 CPEP_Pphonmut carboxyvinyl-carboxyphosphonate phosphorylmutase. This family consists of carboxyvinyl-carboxyphosphonate phosphorylmutase (CPEP phosphonomutase), an unusual enzyme involved in the biosynthesis of the antibiotic bialaphos. So far, it is known only in that pathway and only in Streptomyces hygroscopicus. Some related proteins annotated as being functionally equivalent are likely misannotated examples of methylisocitrate lyase, an enzyme of propionate utilization. 294
18786 274077 TIGR02320 PEP_mutase phosphoenolpyruvate mutase. This family consists of examples of phosphoenolpyruvate phosphomutase, an enzyme that creates a C-P bond as the first step in the biosynthesis of natural products including antibiotics like bialaphos and phosphinothricin in Streptomyces species, phosphonate-modified molecules such as the polysaccharide B of Bacteroides fragilis, the phosphonolipids of Tetrahymena pyriformis, the glycosylinositolphospholipids of Trypanosoma cruzi. This gene generally occurs in prokaryotic organisms adjacent to the gene for phosphonopyruvate decarboxylase (aepY). Since the PEP phosphomutase reaction favors the substrate PEP energetically, the decarboxylase is required to drive the reaction in the direction of phosphonate production. Most often an aminotransferase (aepZ) is also present which leads to the production of the most common phosphonate compound, 2-aminoethylphosphonate (AEP). A closely related enzyme, phosphonopyruvate hydrolase from Variovorax sp. Pal2, is excluded from this model. 284
18787 131374 TIGR02321 Pphn_pyruv_hyd phosphonopyruvate hydrolase. This family consists of phosphonopyruvate hydrolase, an enzyme closely related to phosphoenolpyruvate phosphomutase. It cleaves the direct C-P bond of phosphonopyruvate. The characterized example is from Variovorax sp. Pal2. 290
18788 274078 TIGR02322 phosphon_PhnN phosphonate metabolism protein/1,5-bisphosphokinase (PRPP-forming) PhnN. Members of this family resemble PhnN of phosphonate utilization operons, where different such operons confer the ability to use somewhat different profiles of C-P bond-containing compounds, including phosphites as well as phosphonates. PhnN in E. coli shows considerable homology to guanylate kinases (EC 2.7.4.8), and has actually been shown to act as a ribose 1,5-bisphosphokinase (PRPP forming). This suggests an analogous kinase reaction for phosphonate metabolism, converting 5-phosphoalpha-1-(methylphosphono)ribose to methylphosphono-PRPP. [Central intermediary metabolism, Phosphorus compounds] 179
18789 188208 TIGR02323 CP_lyasePhnK phosphonate C-P lyase system protein PhnK. Members of this family are the PhnK protein of C-P lyase systems for utilization of phosphonates. These systems resemble phosphonatase-based systems in having a three component ABC transporter, where TIGR01097 is the permease, TIGR01098 is the phosphonates binding protein, and TIGR02315 is the ATP-binding cassette (ABC) protein. They differ, however, in having, typically, ten or more additional genes, many of which are believed to form a membrane-associated complex. This protein (PhnK) and the adjacent-encoded PhnL resemble transporter ATP-binding proteins but are suggested, based on mutagenesis studies, to be part of this complex rather than part of a transporter per se. [Central intermediary metabolism, Phosphorus compounds] 253
18790 131377 TIGR02324 CP_lyasePhnL phosphonate C-P lyase system protein PhnL. Members of this family are the PhnL protein of C-P lyase systems for utilization of phosphonates. These systems resemble phosphonatase-based systems in having a three component ABC transporter, where TIGR01097 is the permease, TIGR01098 is the phosphonates binding protein, and TIGR02315 is the ATP-binding cassette (ABC) protein. They differ, however, in having, typically, ten or more additional genes, many of which are believed to form a membrane-associated C-P lyase complex. This protein (PhnL) and the adjacent-encoded PhnK (TIGR02323) resemble transporter ATP-binding proteins but are suggested, based on mutagenesis studies, to be part of this C-P lyase complex rather than part of a transporter per se. 224
18791 131378 TIGR02325 C_P_lyase_phnF phosphonates metabolism transcriptional regulator PhnF. All members of the seed alignment for this family are predicted helix-turn-helix transcriptional regulatory proteins of the broader gntR family and are found associated with genes for the import and degradation of phosphonates and/or related compounds (e.g. phosphonites) with a direct C-P bond. [Transport and binding proteins, Anions, Regulatory functions, DNA interactions] 238
18792 131379 TIGR02326 transamin_PhnW 2-aminoethylphosphonate--pyruvate transaminase. Members of this family are 2-aminoethylphosphonate--pyruvate transaminase. This enzyme acts on the most common type of naturally occurring phosphonate. It interconverts 2-aminoethylphosphonate plus pyruvate with 2-phosphonoacetaldehyde plus alanine. The enzyme phosphonoacetaldehyde hydrolase (EC 3.11.1.1), usually encoded by an adjacent gene, then cleaves the C-P bond of phosphonoacetaldehyde, adding water to yield acetaldehyde plus inorganic phosphate. Species with this pathway generally have an identified phosphonate ABC transporter but do not also have the multisubunit C-P lyase complex as found in Escherichia coli. [Central intermediary metabolism, Phosphorus compounds] 363
18793 131380 TIGR02327 int_mem_ywzB conserved hypothetical integral membrane protein. Members of this protein family are small, typically about 80 residues in length, and are highly hydrophobic. The gene is found so far only in a subset of the Firmicutes in association with genes of the ATP synthase F1 complex or NADH-quinone oxidoreductase. This family includes ywzB from Bacillus subtilis; pfam06612 describes the same family as Protein of unknown function DUF1146. 68
18794 131381 TIGR02328 TIGR02328 conserved hypothetical protein. Members of this protein family are found in a small number of taxonomically well separated species, yet are strongly conserved, suggesting lateral gene transfer. Members are found in Treponema denticola, Clostridium acetobutylicum, and several of the Firmicutes. The function of this protein is unknown. [Hypothetical proteins, Conserved] 120
18795 274079 TIGR02329 propionate_PrpR propionate catabolism operon regulatory protein PrpR. At least five distinct pathways exist for the catabolism of propionate by way of propionyl-CoA. Members of this family represent the transcriptional regulatory protein PrpR, whose gene is found in most cases divergently transcribed from an operon for the methylcitric acid cycle of propionate catabolism. 2-methylcitric acid, a catabolite by this pathway, is a coactivator of PrpR. [Regulatory functions, DNA interactions] 526
18796 131383 TIGR02330 prpD 2-methylcitrate dehydratase. Members of this family are bacterial proteins known or predicted to act as 2-methylcitrate dehydratase, an enzyme involved in the methylcitrate cycle of propionate catabolism. A related clade of archaeal proteins that may or may not be functionally equivalent is reserved for a future model and is excluded from this family. The PrpD enzyme of E. coli is responsible for the minor aconitase activity (AcnC) not accounted for by AcnA and AcnB. 468
18797 131384 TIGR02331 rib_alpha Rib/alpha/Esp surface antigen repeat. Sequences in this family are tandem repeats of about 79 amino acids, present in up to 14 copies in a protein and highly identical, even at the DNA level, within each protein. Sequences with these repeats are found in the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis, and related proteins of Lactobacillus. The repeat lacks Cys residues. Most members of this protein family also have the cell wall anchor motif LPXTG shared by many staphylococcal and streptococcal surface antigens. 80
18798 131385 TIGR02332 HpaX 4-hydroxyphenylacetate permease. This protein is a part of the Major Facilitator Superfamily (pfam07690). Members of this family are found in a number of proteobacterial genomes, but only in the context of having genes for 4-hydroxyphenylacetate (4-HPA) degradation. The protein is characterized by Prieto, et al. as 4-hydroxyphenylacetate permease in E. coli, where 3-HPA and 3,4-dihydroxyphenylacetate are shown to competitively inhibit 4-HPA transport and therefore also interact specifically. 412
18799 131386 TIGR02333 2met_isocit_dHY 2-methylisocitrate dehydratase, Fe/S-dependent. Members of this family appear in an operon for the degradation of propionyl-CoA via 2-methylcitrate. This family is homologous to aconitases A and B and appears to act as 2-methylisocitrate dehydratase, the enzyme after PrpD and before PrpB. In Escherichia coli, which lacks a member of this family, 2-methylisocitrate dehydratase activity was traced to aconitase B (TIGR00117). 858
18800 131387 TIGR02334 prpF probable AcnD-accessory protein PrpF. The 2-methylcitrate cycle is one of at least five degradation pathways for propionate via propionyl-CoA. Degradation of propionate toward pyruvate consumes oxaloacetate and releases succinate. Oxidation of succinate back into oxaloacetate by the TCA cycle makes the 2-methylcitrate pathway a cycle. This family consists of PrpF, an incompletely characterized protein that appears to be an essential accessory protein for the Fe/S-dependent 2-methylisocitrate dehydratase AcnD (TIGR02333). This protein is related to but distinct from FldA (part of pfam04303), a putative fluorene degradation protein of Sphingomonas sp. LB126. [Energy metabolism, Fermentation] 390
18801 131388 TIGR02335 hydr_PhnA phosphonoacetate hydrolase. This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile than the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by broad spectrum C-P lyase encoded by a large operon or by specific enzymes for some of the more common phosphonates in nature. This family represents an enzyme from the latter category. It may be found encoded near genes for phosphonate transport and for other specific phosphonatases. 408
18802 213701 TIGR02336 TIGR02336 1,3-beta-galactosyl-N-acetylhexosamine phosphorylase. Members of this family are found in phylogenetically diverse bacteria, including Clostridium perfringens (in the Firmicutes), Bifidobacterium longum and Propionibacterium acnes (in the Actinobacteria), and Vibrio vulnificus (in the Proteobacteria), most of which occur as mammalian pathogens or commensals. The nominal activity, 1,3-beta-galactosyl-N-acetylhexosamine phosphorylase (EC 2.4.1.211), varies somewhat from instance to instance in relative rates for closely related substrates. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 719
18803 188209 TIGR02337 HpaR homoprotocatechuate degradation operon regulator, HpaR. This Helix-Turn-Helix transcriptional regulator is a member of the MarR family (pfam01047) and is found in association with operons for the degradation of 4-hydroxyphenylacetic acid via homoprotocatechuate. 118
18804 131391 TIGR02338 gimC_beta prefoldin, beta subunit, archaeal. Chaperonins are cytosolic, ATP-dependent molecular chaperones, with a conserved toroidal architecture, that assist in the folding of nascent and/or denatured polypeptide chains. The group I chaperonin system consists of GroEL and GroES, and is found (usually) in bacteria and organelles of bacterial origin. The group II chaperonin system, called the thermosome in Archaea and TRiC or CCT in the Eukaryota, is structurally similar but only distantly related. Prefoldin, also called GimC, is a complex in Archaea and Eukaryota, that works with group II chaperonins. Members of this protein family are the archaeal clade of the beta class of prefoldin subunit. Closely related, but outside the scope of this family are the eukaryotic beta-class prefoldin subunits, Gim-1,3,4 and 6. The alpha class prefoldin subunits are more distantly related. 110
18805 274080 TIGR02339 thermosome_arch thermosome, various subunits, archaeal. Thermosome is the name given to the archaeal rather than eukaryotic form of the group II chaperonin (counterpart to the group I chaperonin, GroEL/GroES, in bacteria), a toroidal, ATP-dependent molecular chaperone that assists in the folding or refolding of nascent or denatured proteins. Various homologous subunits, one to five per archaeal genome, may be designated alpha, beta, etc., but phylogenetic analysis does not show distinct alpha subunit and beta subunit lineages traceable to ancient paralogs. [Protein fate, Protein folding and stabilization] 519
18806 274081 TIGR02340 chap_CCT_alpha T-complex protein 1, alpha subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT alpha chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 536
18807 274082 TIGR02341 chap_CCT_beta T-complex protein 1, beta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT beta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 519
18808 274083 TIGR02342 chap_CCT_delta T-complex protein 1, delta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT delta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 517
18809 274084 TIGR02343 chap_CCT_epsi T-complex protein 1, epsilon subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT epsilon chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 532
18810 274085 TIGR02344 chap_CCT_gamma T-complex protein 1, gamma subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT gamma chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 524
18811 274086 TIGR02345 chap_CCT_eta T-complex protein 1, eta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT eta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 523
18812 274087 TIGR02346 chap_CCT_theta T-complex protein 1, theta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT theta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 531
18813 274088 TIGR02347 chap_CCT_zeta T-complex protein 1, zeta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT zeta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes. 531
18814 274089 TIGR02348 GroEL chaperonin GroL. This family consists of GroEL, the larger subunit of the GroEL/GroES cytosolic chaperonin. It is found in bacteria, organelles derived from bacteria, and occasionally in the Archaea. The bacterial GroEL/GroES group I chaperonin is replaced by a group II chaperonin, usually called the thermosome in the Archaeota and CCT (chaperone-containing TCP) in the Eukaryota. GroEL, thermosome subunits, and CCT subunits all fall under the scope of pfam00118. [Protein fate, Protein folding and stabilization] 524
18815 274090 TIGR02349 DnaJ_bact chaperone protein DnaJ. This model represents bacterial forms of DnaJ, part of the DnaK-DnaJ-GrpE chaperone system. The three components typically are encoded by consecutive genes. DnaJ homologs occur in many genomes, typically not near DnaK and GrpE-like genes; most such genes are not included by this family. Eukaryotic (mitochondrial and chloroplast) forms are not included in the scope of this family. 354
18816 274091 TIGR02350 prok_dnaK chaperone protein DnaK. Members of this family are the chaperone DnaK, of the DnaK-DnaJ-GrpE chaperone system. All members of the seed alignment were taken from completely sequenced bacterial or archaeal genomes and (except for the Mycoplasma sequence) found clustered with other genes of this system. This model excludes DnaK homologs that are not DnaK itself, such as the heat shock cognate protein HscA (TIGR01991). However, it is not designed to distinguish among DnaK paralogs in eukaryotes. Note that a number of dnaK genes have shadow ORFs in the same reverse (relative to dnaK) reading frame, a few of which have been assigned glutamate dehydrogenase activity. The significance of this observation is unclear; lengths of such shadow ORFs are highly variable as if the presumptive protein product is not conserved. [Protein fate, Protein folding and stabilization] 595
18817 131404 TIGR02351 thiH thiazole biosynthesis protein ThiH. Members of this protein family are the ThiH protein of thiamine biosynthesis, a homolog of the BioB protein of biotin biosynthesis. Genes for this protein generally are found in operons with other thiamin biosynthesis genes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 366
18818 274092 TIGR02352 thiamin_ThiO glycine oxidase ThiO. This family consists of the homotetrameric, FAD-dependent glycine oxidase ThiO, from species such as Bacillus subtilis that use glycine in thiamine biosynthesis. In general, members of this family will not be found in species such as E. coli that instead use tyrosine and the ThiH protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 337
18819 274093 TIGR02353 NRPS_term_dom non-ribosomal peptide synthetase terminal domain of unknown function. This domain is found exclusively in non-ribosomal peptide synthetases and always as the final domain in the polypeptide. This domain is roughly 700 amino acids in size and is found in polypeptides roughly twice that size. 695
18820 162819 TIGR02354 thiF_fam2 thiamine biosynthesis protein ThiF, family 2. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the E. coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the divergent clade of putative ThiF proteins such as found in Campylobacter. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 200
18821 131408 TIGR02355 moeB molybdopterin synthase sulfurylase MoeB. This model describes the molybdopterin biosynthesis protein MoeB in E. coli and related species. The enzyme covalently modifies the molybdopterin synthase MoaD by sulfurylation. This enzyme is closely related to ThiF, a thiamine biosynthesis enzyme that modifies ThiS by an analogous adenylation. Both MoeB and ThiF belong to the HesA/MoeB/ThiF family (pfam00899). [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 240
18822 274094 TIGR02356 adenyl_thiF thiazole biosynthesis adenylyltransferase ThiF, E. coli subfamily. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the E. coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the more widely distributed clade of ThiF proteins such as found in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 202
18823 274095 TIGR02357 ECF_ThiT_YuaJ energy-coupled thiamine transporter ThiT. Members of this protein family have been assigned as thiamine transporters by a phylogenomic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicated thiamine uptake is coupled to proton translocation. However, a more recent comprehensive study of energy-coupled factor (ECF) transport suggests this protein is the S (substrate capture) component of an ECF system, meaning it is energized by ATP. Previously YuaJ, but renamed ThiT. [Transport and binding proteins, Other] 173
18824 274096 TIGR02358 thia_cytX putative hydroxymethylpyrimidine transporter CytX. On the basis of a phylogenomic study of thiamine biosynthetic, salvage, and transporter genes and a highly conserved RNA element THI, this protein family has been identified as a probable transporter of hydroxymethylpyrimidine (HMP), the phosphorylated (by ThiD) form of which gets joined (by ThiE) to hydroxyethylthiazole phosphate to make thiamine phosphate. [Transport and binding proteins, Nucleosides, purines and pyrimidines, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 386
18825 131412 TIGR02359 thiW energy coupling factor transporter S component ThiW. Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 160
18826 131413 TIGR02360 pbenz_hydroxyl 4-hydroxybenzoate 3-monooxygenase. Members of this family are the enzyme 4-hydroxybenzoate 3-monooxygenase, also called p-hydroxybenzoate hydroxylase. It converts 4-hydroxybenzoate + NADPH + molecular oxygen to protocatechuate + NADP+ + water. It contains monooxygenase (pfam01360) and FAD binding (pfam01494) domains. Pathways that contain this enzyme include the protocatechuate 4,5-degradation pathway. [Energy metabolism, Other] 390
18827 274097 TIGR02361 dak_ATP dihydroxyacetone kinase, ATP-dependent. This family consists of examples of the form of dihydroxyacetone kinase (also called glycerone kinase) that uses ATP (2.7.1.29) as the phosphate donor, rather than a phosphoprotein as in E. coli. This form is composed of a single chain with separable domains homologous to the K and L subunits of the E. coli enzyme, and is found in yeasts and other eukaryotes and in some bacteria, including Citrobacter freundii. The member from tomato has been shown to phosphorylate dihydroxyacetone, 3,4-dihydroxy-2-butanone, and some other aldoses and ketoses. 574
18828 213706 TIGR02362 dhaK1b probable dihydroxyacetone kinase DhaK1b subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form with a phosphoprotein donor related to PTS transport proteins. This family represents a protein, unique to the Firmicutes (low GC Gram-positives), that appears to be a divergent second copy of the K subunit of that complex; its gene is always found in operons with the other three proteins of the complex. 326
18829 274098 TIGR02363 dhaK1 dihydroxyacetone kinase, DhaK subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the DhaK subunit of the latter type of dihydroxyacetone kinase, but it specifically excludes the DhaK paralog DhaK2 (TIGR02362) found in the same operon as DhaK and DhaK in the Firmicutes. 329
18830 131417 TIGR02364 dha_pts dihydroxyacetone kinase, phosphotransfer subunit. In E. coli and many other bacteria, unlike the yeasts and a few bacteria such as Citrobacter freundii, the dihydroxyacetone kinase (also called glycerone kinase) transfers a phosphate from a phosphoprotein rather than from ATP and contains multiple subunits. This protein, which resembles proteins of PTS transport systems, is found with its gene adjacent to 125
18831 274099 TIGR02365 dha_L_ycgS dihydroxyacetone kinase, phosphoprotein-dependent, L subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the subunit homologous to the E. coli YcgS subunit. 194
18832 274100 TIGR02366 DHAK_reg probable dihydroxyacetone kinase regulator. The seed alignment for this family was built from a set of closely related uncharacterized proteins associated with operons for the type of bacterial dihydroxyacetone kinase that transfers PEP-derived phosphate from a phosphoprotein, as in phosphotransferase system transport, rather than from ATP. Members have a TetR transcriptional regulator domain (pfam00440) at the N-terminus and sequence homology throughout. 176
18833 188213 TIGR02367 PylS_Cterm pyrrolysyl-tRNA synthetase, C-terminal region. PylS is the enzyme responsible for charging the pyrrolysine tRNA, PylT, by ligating a free molecule of pyrrolysine. Pyrrolysine is encoded at an in-frame UAG (amber) at least in several corrinoid-dependent methyltransferases of the archaeal genera Methanosarcina and Methanococcoides, such as trimethylamine methyltransferase. This protein occurs as a fusion protein in Methanosarcina but as split genes in Desulfitobacterium hafniense and other bacteria. [Protein synthesis, tRNA aminoacylation] 242
18834 131421 TIGR02368 dimeth_PyL dimethylamine:corrinoid methyltransferase. This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of M. acetivorans, M. barkeri, and M. mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence. 466
18835 131422 TIGR02369 trimeth_pyl trimethylamine:corrinoid methyltransferase. This model represents a distinct subfamily of pfam06253. All members here are trimethylamine:corrinoid methyltransferases that contain a critical pyrrolysine residue incorporated during translation via a special tRNA for a TAG (amber) codon. Known members so far are from the genus Methanosarcina. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with dimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates trimethylamine, leaving dimethylamine, and methylates the prosthetic group of its small cognate corrinoid protein, MttC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence. 489
18836 131423 TIGR02370 pyl_corrinoid methyltransferase cognate corrinoid proteins, Methanosarcina family. This model describes a subfamily of the B12 binding domain (pfam02607, pfam02310) proteins. Members of the seed alignment include corrinoid proteins specific to four different, mutually non-homologous enzymes of the genus Methanosarcina. Three of the four cognate enzymes (trimethylamine, dimethylamine, and monomethylamine methyltransferases) all have the unusual, ribosomally incorporated amino acid pyrrolysine at the active site. All act in systems in which a methyl group is transferred to the corrinoid protein to create methylcobalamin, from which the methyl group is later transferred elsewhere. 197
18837 131424 TIGR02371 ala_DH_arch alanine dehydrogenase, Archaeoglobus fulgidus type. This enzyme, a homolog of bacterial ornithine cyclodeaminases and marsupial mu-crystallins, is a homodimeric, NAD-dependent alanine dehydrogenase found in Archaeoglobus fulgidus and several other Archaea. For a number of close homologs, scoring between trusted and noise cutoffs, it is not clear at present what is the enzymatic activity. 325
18838 131425 TIGR02372 4_coum_CoA_lig 4-coumarate--CoA ligase, photoactive yellow protein activation family. This model represents the 4-coumarate--CoA ligase associated with biosynthesis of the 4-hydroxy cinnamyl (also called 4-coumaroyl) chromophore covalently linked to a Cys residue in photoactive yellow protein of Rhodobacter spp. and 386
18839 131426 TIGR02373 photo_yellow photoactive yellow protein. Members of this family are photoactive yellow protein, a cytosolic, 14-kDa light-sensing protein which has a 4-hydroxycinnamyl (p-coumaric acid) chromophore covalently linked to a Cys residue. The enzyme 4-coumarate--CoA ligase as described by TIGR02372 is required for its biosynthesis. The modified Cys is in a PAS (pfam00989) domain, frequently found in signal transducing proteins. Members are known in alpha and gamma Proteobacteria that include Rhodobacter capsulatus, Halorhodospira halophila, Rhodospirillum centenum, etc. 124
18840 162827 TIGR02374 nitri_red_nirB nitrite reductase [NAD(P)H], large subunit. [Central intermediary metabolism, Nitrogen metabolism] 785
18841 131428 TIGR02375 pseudoazurin pseudoazurin. Pseudoazurin, also called cupredoxin, is a small, blue periplasmic protein with a single bound copper atom. Pseudoazurin is related to plastocyanins. Several examples of pseudoazurin are encoded by a neighboring gene for, or have been shown to transfer electrons to, copper-containing nitrite reductases (TIGR02376) of the same species. [Energy metabolism, Electron transport] 116
18842 131429 TIGR02376 Cu_nitrite_red nitrite reductase, copper-containing. This family consists of copper-type nitrite reductase. It reduces nitrite to nitric oxide, the first step in denitrification. [Central intermediary metabolism, Nitrogen metabolism] 311
18843 131430 TIGR02377 MocE_fam_FeS Rieske [2Fe-2S] domain protein, MocE subfamily. This model describes a subfamily of the Rieske-like [2Fe-2S] family of ferredoxins that includes MocE, part of the rhizopine (3-O-methyl-scyllo-inosamine) catabolic cluster in Rhizobium. Members of this family are related to, yet distinct from, the small subunit of nitrite reductase [NAD(P)H]. 101
18844 131431 TIGR02378 nirD_assim_sml nitrite reductase [NAD(P)H], small subunit. This model describes NirD, the small subunit of nitrite reductase [NAD(P)H] (the assimilatory nitrite reductase), which associates with NirB, the large subunit (TIGR02374). In a few bacteria such as Klebsiella pneumoniae and in Fungi, the two regions are fused. [Central intermediary metabolism, Nitrogen metabolism] 105
18845 131432 TIGR02379 ECA_wecE TDP-4-keto-6-deoxy-D-glucose transaminase. This family consists of TDP-4-keto-6-deoxy-D-glucose transaminases, the WecE (formerly RffA) protein of enterobacterial common antigen (ECA) biosynthesis, from enterobacteria. It also includes closely matching sequence from species not expected to make ECA, but which contain other genes for the biosynthesis of TDP-4-keto-6-deoxy-D-Glc, an intermediate in the biosynthesis of other compounds as well and the substrate of WecA. This family belongs to the DegT/DnrJ/EryC1/StrS aminotransferase family (pfam01041). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 376
18846 131433 TIGR02380 ECA_wecA undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphatetransferase. Members of this family are the WecA enzyme of enterobacterial common antigen biosynthesis, undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphatetransferase. This family represents one narrow clade, and closely related sequences outside this clade may represent enzymes that catalyze the same specific reaction, but in the context of different pathways. A His-rich motif in a cytosolic loop of this integral membrane protein, shown critical to enzymatic activity for WecA is variously present or absent in the clade that includes Bacillus subtilis TagO teichoic acid biosynthesis enzyme, which may catalyze the same reaction as WecA. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 346
18847 131434 TIGR02381 cspD cold shock domain protein CspD. This model represents what appears to be a phylogenetically distinct clade, containing E. coli CspD (SP|P24245) and related proteobacterial proteins within the larger family of cold shock domain proteins described by pfam00313. The gene symbol cspD may have been used independently for other subfamilies of cold shock domain proteins, such as for B. subtilis CspD. These proteins typically are shorter than 70 amino acids. In E. coli, CspD is a stress response protein induced in stationary phase. This homodimer binds single-stranded DNA and appears to inhibit DNA replication. [DNA metabolism, DNA replication, recombination, and repair, Cellular processes, Adaptations to atypical conditions] 68
18848 131435 TIGR02382 wecD_rffC TDP-D-fucosamine acetyltransferase. This model represents the WecD protein (formerly RffC) for the biosynthesis of enterobacterial common antigen (ECA), an outer leaflet, outer membrane glycolipid with a trisaccharide repeat unit. WecD is a member of the GNAT family of acetyltransferases (pfam00583). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 191
18849 274101 TIGR02383 Hfq RNA chaperone Hfq. This model represents the RNA-binding pleiotropic regulator Hfq, a small, Sm-like protein of bacteria. It helps pair regulatory noncoding RNAs with complementary mRNA target regions. It enhances the elongation of poly(A) tails on mRNA. It appears also to protect RNase E recognition sites (A/U-rich sequences with adjacent stem-loop structures) from cleavage. Being pleiotropic, it differs in some of its activities in different species. Hfq binds the non-coding regulatory RNA DsrA (see Rfam RF00014) in the few species known to have it: Escherichia coli, Shigella flexneri, Salmonella spp. In Azorhizobium caulinodans, an hfq mutant is unable to express nifA, and Hfq is called NrfA, for nif regulatory factor. The name hfq reflects phenomenology as a host factor for phage Q-beta RNA replication. [Regulatory functions, Other] 61
18850 274102 TIGR02384 RelB_DinJ addiction module antitoxin, RelB/DinJ family. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. The resulting model appears to describe a narrower set of proteins than pfam04221, although many in the scope of this model are not obviously paired with toxin proteins. Several toxin/antitoxin pairs may occur in a single species. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 84
18851 211740 TIGR02385 RelE_StbE addiction module toxin, RelE/StbE family. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all are found adjacent to RelB/DinJ family antitoxin genes (TIGR02384), as are most genes found by the resulting model. StbE from Morganella morganii plasmid R485 shows typical behaviour for an addiction module toxin. It cannot be cloned without its partner (the antitoxin), whereas its partner cannot confer plasmid stability without StbE. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 88
18852 274103 TIGR02386 rpoC_TIGR DNA-directed RNA polymerase, beta' subunit, predominant form. Bacteria have a single DNA-directed RNA polymerase, with required subunits that include alpha, beta, and beta-prime. This model describes the predominant architecture of the beta-prime subunit in most bacteria. Among bacterial sequences, this model excludes mostly those from the cyanobacteria, where RpoC is replaced by two tandem genes homologous to it but also encoding an additional domain. [Transcription, DNA-dependent RNA polymerase] 1140
18853 131440 TIGR02387 rpoC1_cyan DNA-directed RNA polymerase, gamma subunit. The RNA polymerase gamma subunit, encoded by the rpoC1 gene, is found in cyanobacteria and corresponds to the N-terminal region of the beta' subunit, encoded by rpoC, in other bacteria. The equivalent subunit in plastids and chloroplasts is designated beta', while the product of the rpoC2 gene is designated beta''. 619
18854 274104 TIGR02388 rpoC2_cyan DNA-directed RNA polymerase, beta'' subunit. The family consists of the product of the rpoC2 gene, a subunit of DNA-directed RNA polymerase of cyanobacteria and chloroplasts. RpoC2 corresponds largely to the C-terminal region of the RpoC (the beta' subunit) of other bacteria. Members of this family are designated beta'' in chloroplasts/plastids, and beta' (confusingly) in Cyanobacteria, where RpoC1 is called beta' in chloroplasts/plastids and gamma in Cyanobacteria. We prefer to name this family beta'', after its organellar members, to emphasize that RpoC1 and RpoC2 together replace RpoC in other bacteria. [Transcription, DNA-dependent RNA polymerase] 1227
18855 274105 TIGR02389 RNA_pol_rpoA2 DNA-directed RNA polymerase, subunit A''. This family consists of the archaeal A'' subunit of the DNA-directed RNA polymerase. The example from Methanocaldococcus jannaschii contains an intein. [Transcription, DNA-dependent RNA polymerase] 367
18856 274106 TIGR02390 RNA_pol_rpoA1 DNA-directed RNA polymerase subunit A'. This family consists of the archaeal A' subunit of the DNA-directed RNA polymerase. The example from Methanocaldococcus jannaschii contains an intein. 868
18857 162834 TIGR02391 hypoth_ymh TIGR02391 family protein. This family consists of a relatively rare (~ 8 occurrences per 200 genomes) prokaryotic protein family. Genes for members appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. The function is unknown. [Hypothetical proteins, Domain] 125
18858 274107 TIGR02392 rpoH_proteo alternative sigma factor RpoH. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoH and further restricted to the Proteobacteria. This protein may be called sigma-32, sigma factor H, heat shock sigma factor, and alternative sigma factor RpoH. Note that in some species the single locus rpoH may be replaced by two or more differentially regulated stress response sigma factors. [Cellular processes, Adaptations to atypical conditions, Transcription, Transcription factors] 270
18859 274108 TIGR02393 RpoD_Cterm RNA polymerase sigma factor RpoD, C-terminal domain. This model represents the well-conserved C-terminal region of the major, essential sigma factor of most bacteria. Members of this clade show considerable variability in domain architecture and molecular weight, as well as in nomenclature: RpoD in E. coli and other Proteobacteria, SigA in Bacillus subtilis and many other Gram-positive bacteria, HrdB in Streptomyces, MysA in Mycobacterium smegmatis, etc. [Transcription, Transcription factors] 238
18860 131447 TIGR02394 rpoS_proteo RNA polymerase sigma factor RpoS. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoS (also called sigma-38, KatF, etc.), found only in Proteobacteria. This sigma factor is induced in stationary phase (in response to the stress of nutrient limitation) and becomes the second principal sigma factor at that time. RpoS is a member of the larger Sigma-70 subfamily (TIGR02937) and most closely related to RpoD (TIGR02393). [Cellular processes, Adaptations to atypical conditions, Transcription, Transcription factors] 285
18861 274109 TIGR02395 rpoN_sigma RNA polymerase sigma-54 factor. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called sigma-54, or RpoN (unrelated to sigma 70-type factors such as RpoD/SigA). RpoN is responsible for enhancer-dependent transcription, and its presence characteristically is associated with varied panels of activators, most of which are enhancer-binding proteins (but see Brahmachary, et al.). RpoN may be responsible for transcription of nitrogen fixation genes, flagellins, pilins, etc., and synonyms for the gene symbol rpoN, such as ntrA, reflect these observations. [Transcription, Transcription factors] 429
18862 274110 TIGR02396 diverge_rpsU rpsU-divergently transcribed protein. This uncharacterized protein is found in a number of Alphaproteobacteria and, with N-terminal regions long enough to be transit peptides, in eukaryotes. This phylogeny suggests mitochondrial derivation. In several Alphaproteobacteria, the gene for this protein is encoded divergently from rpsU, the gene for ribosomal protein S21. S21 is unusual in being encoded outside the usual long ribosomal protein operons, but rather in contexts that suggest regulation of the initiation of protein translation. [Unknown function, General] 185
18863 274111 TIGR02397 dnaX_nterm DNA polymerase III, subunit gamma and tau. This model represents the well-conserved first ~ 365 amino acids of the translation of the dnaX gene. The full-length product of the dnaX gene in the model bacterium E. coli is the DNA polymerase III tau subunit. A translational frameshift leads to early termination and a truncated protein subunit gamma, about 1/3 shorter than tau and present in roughly equal amounts. This frameshift mechanism is not necessarily universal for species with DNA polymerase III but appears conserved in the extreme thermophile Thermus thermophilus. [DNA metabolism, DNA replication, recombination, and repair] 355
18864 131451 TIGR02398 gluc_glyc_Psyn glucosylglycerol-phosphate synthase. Glucosylglycerol-phosphate synthase catalyzes the key step in the biosynthesis of the osmolyte glucosylglycerol. It is known in several cyanobacteria and in Pseudomonas anguilliseptica. The enzyme is closely related to the alpha,alpha-trehalose-phosphate synthase, likewise involved in osmolyte biosynthesis, of E. coli and many other bacteria. A close homolog from Xanthomonas campestris is excluded from this model and scores between trusted and noise. 487
18865 131452 TIGR02399 salt_tol_Pase glucosylglycerol 3-phosphatase. Proteins in this family are glucosylglycerol-phosphate phosphatase, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol. 389
18866 274112 TIGR02400 trehalose_OtsA alpha,alpha-trehalose-phosphate synthase [UDP-forming]. This enzyme catalyzes the key, penultimate step in biosynthesis of trehalose, a compatible solute made as an osmoprotectant in some species in all three domains of life. The gene symbol OtsA stands for osmotically regulated trehalose synthesis A. Trehalose helps protect against both osmotic and thermal stresses, and is made from two glucose subunits. This model excludes glucosylglycerol-phosphate synthase, an enzyme of an analogous osmoprotectant system in many cyanobacterial strains. This model does not identify archaeal examples, as they are more divergent than glucosylglycerol-phosphate synthase. Sequences that score in the gray zone between the trusted and noise cutoffs include a number of yeast multidomain proteins in which the N-terminal domain may be functionally equivalent to this family. The gray zone also includes the OtsA of Corynebacterium glutamicum (and related species), shown to be responsible for synthesis of only trace amounts of trehalose while the majority is synthesized by the TreYZ pathway; the significance of OtsA in this species is unclear (see Wolf, et al.). [Cellular processes, Adaptations to atypical conditions] 456
18867 274113 TIGR02401 trehalose_TreY malto-oligosyltrehalose synthase. This enzyme, formally named (1->4)-alpha-D-glucan 1-alpha-D-glucosylmutase, is the TreY enzyme of the TreYZ pathway of trehalose biosynthesis, an alternative to the OtsAB pathway. Trehalose may be incorporated into more complex compounds but is best known as a compatible solute. It is one of the most effective osmoprotectants, and unlike the various betaines does not require nitrogen for its synthesis. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 825
18868 274114 TIGR02402 trehalose_TreZ malto-oligosyltrehalose trehalohydrolase. Members of this family are the trehalose biosynthetic enzyme malto-oligosyltrehalose trehalohydrolase, formally known as 4-alpha-D-{(1->4)-alpha-D-glucano}trehalose trehalohydrolase (EC 3.2.1.141). It is the TreZ protein of the TreYZ pathway for trehalose biosynthesis, an alternative to the OtsAB system. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 544
18869 274115 TIGR02403 trehalose_treC alpha,alpha-phosphotrehalase. Trehalose is a glucose disaccharide that serves in many biological systems as a compatible solute for protection against hyperosmotic and thermal stress. This family describes trehalose-6-phosphate hydrolase, product of the treC (or treA) gene, which is often found together with a trehalose uptake transporter and a trehalose operon repressor. 543
18870 274116 TIGR02404 trehalos_R_Bsub trehalose operon repressor, B. subtilis-type. This family consists of repressors of the GntR family typically associated with trehalose utilization operons. Trehalose is imported as trehalose-6-phosphate and then hydrolyzed by alpha,alpha-phosphotrehalase to glucose and glucose-6-P. This family includes repressors mostly from Gram-positive lineages and does not include the TreR from E. coli. [Regulatory functions, DNA interactions] 233
18871 131458 TIGR02405 trehalos_R_Ecol trehalose operon repressor, proteobacterial. This family consists of repressors of the LacI family typically associated with trehalose utilization operons. Trehalose is imported as trehalose-6-phosphate and then hydrolyzed by alpha,alpha-phosphotrehalase to glucose and glucose-6-P. This family includes repressors mostly from Gammaproteobacteria and does not include the GntR family TreR of Bacillus subtilis. [Regulatory functions, DNA interactions] 311
18872 131459 TIGR02406 ectoine_EctA diaminobutyrate acetyltransferase. This enzyme family is the EctA of ectoine biosynthesis. Ectoine is a compatible solute, analogous to trehalose, betaines, etc., found often in halotolerant organisms. EctA is L-2,4-diaminobutyric acid acetyltransferase, also called DABA acetyltransferase. [Cellular processes, Adaptations to atypical conditions] 157
18873 274117 TIGR02407 ectoine_ectB diaminobutyrate--2-oxoglutarate aminotransferase. Members of this family of class III pyridoxal-phosphate-dependent aminotransferases are diaminobutyrate--2-oxoglutarate aminotransferase (EC 2.6.1.76) that catalyze the first step in ectoine biosynthesis from L-aspartate beta-semialdehyde. This family is readily separated phylogenetically from enzymes with the same substrate and product but involved in other processes such as siderophore (SP|Q9Z3R2) or 1,3-diaminopropane (SP|P44951) biosynthesis. The family TIGR00709 previously included both groups but has now been revised to exclude the ectoine biosynthesis proteins of this family. Ectoine is a compatible solute particularly effective in conferring salt tolerance. [Cellular processes, Adaptations to atypical conditions] 412
18874 131461 TIGR02408 ectoine_ThpD ectoine hydroxylase. Both ectoine and hydroxyectoine are compatible solutes that serve as protectants against osmotic and thermal stresses. A number of genomes synthesize ectoine. This enzyme allows conversion of ectoine to hydroxyectoine, which may be more effective for some purposes, and is found in a subset of ectoine-producing organisms. 277
18875 274118 TIGR02409 carnitine_bodg gamma-butyrobetaine hydroxylase. Members of this protein family are gamma-butyrobetaine hydroxylase, both bacterial and eukaryotic. This enzyme catalyzes the last step in the conversion of lysine to carnitine. Carnitine can serve as a compatible solute in bacteria and also participates in fatty acid metabolism. 366
18876 274119 TIGR02410 carnitine_TMLD trimethyllysine dioxygenase. Members of this family with known function act as trimethyllysine dioxygenase, an enzyme in the pathway for carnitine biosynthesis from lysine. This enzyme is homologous to gamma-butyrobetaine,2-oxoglutarate dioxygenase, which catalyzes the last step in carnitine biosynthesis. Members of this family appear to be eukaryotic only. 362
18877 274120 TIGR02411 leuko_A4_hydro leukotriene A-4 hydrolase/aminopeptidase. Members of this family represent a distinctive subset within the zinc metallopeptidase family M1 (pfam01433). The majority of the members of pfam01433 are aminopeptidases, but the sequences in this family for which the function is known are leukotriene A-4 hydrolase. A dual epoxide hydrolase and aminopeptidase activity at the same active site is indicated. The physiological substrate for aminopeptidase activity is not known. 602
18878 274121 TIGR02412 pepN_strep_liv aminopeptidase N, Streptomyces lividans type. This family is a subset of the members of the zinc metallopeptidase family M1 (pfam01433), with a single member characterized in Streptomyces lividans 66 and designated aminopeptidase N. The spectrum of activity may differ somewhat from the aminopeptidase N clade of E. coli and most other Proteobacteria, well separated phylogenetically within the M1 family. The M1 family also includes leukotriene A-4 hydrolase/aminopeptidase (with a bifunctional active site). 831
18879 131466 TIGR02413 Bac_small_yrzI Bacillus tandem small hypothetical protein. Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, but arrays of six in tandem in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small, acid-soluble spore proteins (SASP). A role in sporulation is possible. 46
18880 274122 TIGR02414 pepN_proteo aminopeptidase N, Escherichia coli type. The M1 family of zinc metallopeptidases contains a number of distinct, well-separated clades of proteins with aminopeptidase activity. Several are designated aminopeptidase N, EC 3.4.11.2, after the Escherichia coli enzyme, suggesting a similar activity profile (see SP|P04825 for a description of catalytic activity). This family consists of all aminopeptidases closely related to E. coli PepN and presumed to have similar (not identical) function. Nearly all are found in Proteobacteria, but members are found also in Cyanobacteria, plants, and apicomplexan parasites. This family differs greatly in sequence from the family of aminopeptidases typified by Streptomyces lividans PepN (TIGR02412), from the membrane bound aminopeptidase N family in animals, etc. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 863
18881 131468 TIGR02415 23BDH acetoin reductases. One member of this family, as characterized in Klebsiella terrigena, is described as able to interconvert acetoin + NADH with meso-2,3-butanediol + NAD(+). It is also described as capable of irreversible reduction of diacetyl with NADH to acetoin. Blomqvist, et al. decline to specify either EC 1.1.1.4 which is (R,R)-butanediol dehydrogenase, or EC 1.1.1.5, which is acetoin dehydrogenase without a specified stereochemistry, for this enzyme. This enzyme is a homotetramer in the family of short chain dehydrogenases (pfam00106). Another member of this family, from Corynebacterium glutamicum, is called L-2,3-butanediol dehydrogenase. [Energy metabolism, Fermentation] 254
18882 131469 TIGR02416 CO_dehy_Mo_lg carbon-monoxide dehydrogenase, large subunit. This model represents the large subunits of a group of carbon-monoxide dehydrogenases that include molybdenum as part of the enzymatic cofactor. There are various forms of carbon-monoxide dehydrogenase; Silicibacter pomeroyi DSS-3, for example, has two forms. Note that, at least in some species, the active site Cys is modified with a selenium attached to (rather than replacing) the sulfur atom. This is termed selanylcysteine, and created post-translationally, in contrast to selenocysteine incorporation during translation as for many other selenoproteins. [Energy metabolism, Other] 770
18883 131470 TIGR02417 fruct_sucro_rep D-fructose-responsive transcription factor. Members of this family belong to the lacI helix-turn-helix family (pfam00356) of DNA-binding transcriptional regulators. All members are from the proteobacteria. Characterized members act as positive and negative transcriptional regulators of fructose and sucrose transport and metabolism. Sucrose is a disaccharide composed of fructose and glucose; D-fructose-1-phosphate rather than an intact sucrose moiety has been shown to act as the inducer. [Regulatory functions, DNA interactions] 327
18884 131471 TIGR02418 acolac_catab acetolactate synthase, catabolic. Acetolactate synthase (EC 2.2.1.6) combines two molecules of pyruvate to yield 2-acetolactate with the release of CO2. This reaction may be involved in either valine biosynthesis (biosynthetic) or conversion of pyruvate to acetoin and possibly to 2,3-butanediol (catabolic). The biosynthetic type, described by TIGR00118, is also capable of forming acetohydroxybutyrate from pyruvate and 2-oxobutyrate for isoleucine biosynthesis. The family described here, part of the same larger family of thiamine pyrophosphate-dependent enzymes (pfam00205, pfam02776), is the catabolic form, generally found in species with acetolactate decarboxylase and usually found in the same operon. The model may not encompass all catabolic acetolactate synthases, but rather one particular clade in the larger TPP-dependent enzyme family. [Energy metabolism, Fermentation] 539
18885 274123 TIGR02419 C4_traR_proteo phage/conjugal plasmid C-4 type zinc finger protein, TraR family. Members of this family are putative C4-type zinc finger proteins found almost exclusively in prophage regions, actual phage, or conjugal transfer regions of the Proteobacteria. This small protein (about 70 amino acids) appears homologous to but is smaller than DksA (DnaK suppressor protein), found to be critical for regulating transcription of ribosomal RNA. [Mobile and extrachromosomal element functions, Prophage functions] 63
18886 274124 TIGR02420 dksA RNA polymerase-binding protein DksA. The model that is the basis for this family describes a small, pleiotropic protein, DksA (DnaK suppressor A), originally named as a multicopy suppressor of temperature sensitivity of dnaKJ mutants. DksA mutants are defective in quorum sensing, virulence, etc. DksA is now understood to bind RNA polymerase directly and modulate its response to small molecules to control the level of transcription of rRNA. Nearly all members of this family are in the Proteobacteria. Whether the closest homologs outside the Proteobacteria function equivalently is unknown. The low value set for the noise cutoff allows identification of possible DksA proteins from outside the proteobacteria. TIGR02419 describes a closely related family of short sequences usually found in prophage regions of proteobacterial genomes or in known phage. [Transcription, Transcription factors, Regulatory functions, Small molecule interactions] 110
18887 274125 TIGR02421 QEGLA conserved hypothetical protein. Members of this family include a possible metal-binding motif HEXXXH and, nearby, a perfectly conserved motif QEGLA. All members belong to the Proteobacteria, including Agrobacterium tumefaciens and several species of Vibrio and Pseudomonas, and are found in only one copy per chromosome (Vibrio vulnificus, with two chromosomes, has two). The function is unknown. 366
18888 131475 TIGR02422 protocat_beta protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA. [Energy metabolism, Other] 220
18889 274126 TIGR02423 protocat_alph protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA. [Energy metabolism, Other] 193
18890 274127 TIGR02424 TF_pcaQ pca operon transcription factor PcaQ. Members of this family are LysR-family transcription factors associated with operons for catabolism of protocatechuate. Members occur only in Proteobacteria. [Energy metabolism, Other, Regulatory functions, DNA interactions] 300
18891 131478 TIGR02425 decarb_PcaC 4-carboxymuconolactone decarboxylase. Members of this family are 4-carboxymuconolactone decarboxylase, which catalyzes the third step in the catabolism of protocatechuate (and therefore the fourth step in the catabolism of para-hydroxybenzoate, of 3-hydroxybenzoate, of vanillate, etc.). Most members of this family are encoded within protocatechuate catabolism operons. This protein is sometimes found as a fusion protein with other enzymes of the pathway, as in Rhodococcus opacus, Streptomyces avermitilis, and Caulobacter crescentus. [Energy metabolism, Other] 123
18892 274128 TIGR02426 protocat_pcaB 3-carboxy-cis,cis-muconate cycloisomerase. Members of this family are 3-carboxy-cis,cis-muconate cycloisomerase, the enzyme that catalyzes the second step in the protocatechuate degradation to beta-ketoadipate and then to succinyl-CoA and acetyl-CoA. 4-hydroxybenzoate, 3-hydroxybenzoate, and vanillate all can be converted in one step to protocatechuate. All members of the seed alignment for this model were chosen from within protocatechuate degradation operons of at least three genes of the pathway, from genomes with the complete pathway through beta-ketoadipate. [Energy metabolism, Other] 338
18893 131480 TIGR02427 protocat_pcaD 3-oxoadipate enol-lactonase. Members of this family are 3-oxoadipate enol-lactonase. Note that the substrate is known as 3-oxoadipate enol-lactone, 2-oxo-2,3-dihydrofuran-5-acetate, 4,5-Dihydro-5-oxofuran-2-acetate, and 5-oxo-4,5-dihydrofuran-2-acetate. The enzyme catalyzes the fourth step in the protocatechuate degradation to beta-ketoadipate and then to succinyl-CoA and acetyl-CoA. 4-hydroxybenzoate, 3-hydroxybenzoate, and vanillate all can be converted in one step to protocatechuate. This enzyme also acts in catechol degradation. In genomes that catabolize both catechol and protocatechuate, two forms of this enzyme may be found. All members of the seed alignment for this model were chosen from within protocatechuate degradation operons of at least three genes of the pathway, from genomes with the complete pathway through beta-ketoadipate. [Energy metabolism, Other] 251
18894 188219 TIGR02428 pcaJ_scoB_fam 3-oxoacid CoA-transferase, B subunit. Various members of this family are characterized as the B subunits of succinyl-CoA:3-ketoacid-CoA transferase (EC 2.8.3.5), beta-ketoadipate:succinyl-CoA transferase (EC 2.8.3.6), acetyl-CoA:acetoacetate CoA transferase (EC 2.8.3.8), and butyrate-acetoacetate CoA-transferase (EC 2.8.3.9). This represents a very distinct clade with strong sequence conservation within the larger family defined by pfam01144. The A subunit represents a different clade in pfam01144. 207
18895 131482 TIGR02429 pcaI_scoA_fam 3-oxoacid CoA-transferase, A subunit. Various members of this family are characterized as the A subunits of succinyl-CoA:3-ketoacid-CoA transferase (EC 2.8.3.5), beta-ketoadipate:succinyl-CoA transferase (EC 2.8.3.6), acetyl-CoA:acetoacetate CoA transferase (EC 2.8.3.8), and butyrate-acetoacetate CoA-transferase (EC 2.8.3.9). This represents a very distinct clade with strong sequence conservation within the larger family defined by pfam01144. The B subunit represents a different clade in pfam01144, described by TIGR02428. The two are found in general as tandem genes and occasionally as a fusion. 222
18896 131483 TIGR02430 pcaF 3-oxoadipyl-CoA thiolase. Members of this family are designated beta-ketoadipyl CoA thiolase, an enzyme that acts at the end of pathways for the degradation of protocatechuate (from benzoate and related compounds) and of phenylacetic acid. 400
18897 131484 TIGR02431 pcaR_pcaU beta-ketoadipate pathway transcriptional regulators, PcaR/PcaU/PobR family. Member of this family are IclR-type transcriptional regulators with similar DNA binding sites, able to bind at least three different metabolites related to protocatechuate metabolism. Beta-ketoadipate is the inducer for PcaR, p-hydroxybenzoate for PobR, and protocatechuate for PcaU. [Regulatory functions, DNA interactions] 248
18898 274129 TIGR02432 lysidine_TilS_N tRNA(Ile)-lysidine synthetase, N-terminal domain. The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) vs. AUG (Met) and UGA (stop) vs. UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This family describes a protein domain found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain archictecture of this protein family is variable; some, including characterized proteins of E. coli and B. subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family ( pfam01171). It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer). [Protein synthesis, tRNA and rRNA base modification] 189
18899 274130 TIGR02433 lysidine_TilS_C tRNA(Ile)-lysidine synthetase, C-terminal domain. TIGRFAMs model TIGR02432 describes the family of the N-terminal domain of tRNA(Ile)-lysidine synthetase. This family (TIGR02433) describes a small C-terminal domain of about 50 residues present in about half the members of family TIGR02432, and in no other protein. Characterized examples of tRNA(Ile)-lysidine synthetase from E. coli and Bacillus subtilis both contain this domain. [Protein synthesis, tRNA and rRNA base modification] 47
18900 131487 TIGR02434 CobF precorrin-6A synthase (deacetylating). In the aerobic cobalamin biosynthesis pathway, four enzymes are involved in the conversion of precorrin-3A to precorrin-6A. The first of the four steps is carried out by EC 1.14.13.83, precorrin-3B synthase (CobG), yielding precorrin-3B as the product. This is followed by three methylation reactions, which introduce a methyl group at C-17 (CobJ; EC 2.1.1.131), C-11 (CobM; EC 2.1.1.133) and C-1 (CobF; EC 2.1.1.152) of the macrocycle, giving rise to precorrin-4, precorrin-5 and precorrin-6A, respectively. This model identifies CobF in High GC gram positive, alphaproteobacteria and pseudomonas-related species. 249
18901 274131 TIGR02435 CobG precorrin-3B synthase. An iron-sulfur protein. An oxygen atom from dioxygen is incorporated into the macrocycle at C-20. In the aerobic cobalamin biosynthesis pathway, four enzymes are involved in the conversion of precorrin-3A to precorrin-6A. The first of the four steps is carried out by EC 1.14.13.83, precorrin-3B synthase (CobG), yielding precorrin-3B as the product. This is followed by three methylation reactions, which introduce a methyl group at C-17 (CobJ; EC 2.1.1.131), C-11 (CobM; EC 2.1.1.133) and C-1 (CobF; EC 2.1.1.152) of the macrocycle, giving rise to precorrin-4, precorrin-5 and precorrin-6A, respectively. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 390
18902 274132 TIGR02436 TIGR02436 four helix bundle protein. This family describes a protein of unknown function whose structure is a bundle of four long alpha helices. Some of the first members of this family were found encoded in the (atypically large) intervening sequence (IVS) of Leptospira 23S RNA, a region often present in the rRNA gene and removed during rRNA processing without re-ligation. However, this location is not conserved, and naming this protein as a 23S RNA protein is both confusing and inaccurate. 108
18903 131490 TIGR02437 FadB fatty oxidation complex, alpha subunit FadB. Members represent alpha subunit of multifunctional enzyme complex of the fatty acid degradation cycle. Activities include: enoyl-CoA hydratase (EC 4.2.1.17), dodecenoyl-CoA delta-isomerase activity (EC 5.3.3.8), 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35), 3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3). A representative is E. coli FadB (SP:P21177). This model excludes the FadJ family represented by SP:P77399. [Fatty acid and phospholipid metabolism, Degradation] 714
18904 274133 TIGR02438 catachol_actin catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol. 281
18905 274134 TIGR02439 catechol_proteo catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol. [Energy metabolism, Other] 285
18906 131493 TIGR02440 FadJ fatty oxidation complex, alpha subunit FadJ. Members represent alpha subunit of multifunctional enzyme complex of the fatty acid degradation cycle. Plays a minor role in aerobic beta-oxidation of fatty acids. FadJI complex is necessary for anaerobic growth on short-chain acids with nitrate as an electron acceptor. Activities include: enoyl-CoA hydratase (EC 4.2.1.17), 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35), 3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3). A representative is E. coli FadJ (aka YfcX) (SP:P77399). This model excludes the FadB of TIGR02437 equivalog model. [Fatty acid and phospholipid metabolism, Degradation] 699
18907 131494 TIGR02441 fa_ox_alpha_mit fatty acid oxidation complex, alpha subunit, mitochondrial. Members represent alpha subunit of mitochondrial multifunctional fatty acid degradation enzyme complex. Subunit activities include: enoyl-CoA hydratase (EC 4.2.1.17) & 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35). Some characterization in human (SP:P40939), pig (SP:Q29554), and rat (SP:Q64428). The beta subunit has activity: acetyl-CoA C-acyltransferase (EC 2.3.1.16). 737
18908 274135 TIGR02442 Cob-chelat-sub cobaltochelatase subunit. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (pfam06180). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis (TIGR02257/TIGR01650/TIGR01651). A number of genomes (actinobacteria, cyanobacteria, betaproteobacteria and pseudomonads) which apparently biosynthesize B12, encode a cobN gene but are demonstrably lacking cobS and cobT. These genomes do, however contain a homolog (modelled here) of the magnesium chelatase subunits BchI/BchD family. Aside from the cyanobacteria (which have a separate magnesium chelatase trimer), these species do not make chlorins, so do not have any use for a magnesium chelatase. Furthermore, in nearly all cases the members of this family are proximal to either CobN itself or other genes involved in cobalt transport or B12 biosynthesis. 633
18909 131496 TIGR02443 TIGR02443 conserved hypothetical metal-binding protein. Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various Proteobacteria. 59
18910 274136 TIGR02444 TIGR02444 TIGR02444 family protein. Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various Proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model. [Hypothetical proteins, Conserved] 116
18911 131498 TIGR02445 fadA fatty oxidation complex, beta subunit FadA. This subunit of the FadBA complex has acetyl-CoA C-acyltransferase (EC 2.3.1.16) activity, and is also known as beta-ketothiolase and fatty oxidation complex, beta subunit. This protein is almost always located adjacent to FadB (TIGR02437). The FadBA complex is the major complex active for beta-oxidation of fatty acids in E. coli. [Fatty acid and phospholipid metabolism, Degradation] 385
18912 131499 TIGR02446 FadI fatty oxidation complex, beta subunit FadI. This subunit of the FadJI complex has acetyl-CoA C-acyltransferase (EC 2.3.1.16) activity, and is also known as beta-ketothiolase and fatty oxidation complex, beta subunit, and YfcY. This protein is almost always located adjacent to FadJ (TIGR02440). The FadJI complex is needed for anaerobic beta-oxidation of short-chain fatty acids in E. coli. [Fatty acid and phospholipid metabolism, Degradation] 430
18913 131500 TIGR02447 yiiD_Cterm thioesterase domain, putative. This family consists of a broadly distributed uncharacterized domain found often as a standalone protein. The member from Shewanella oneidensis, PDB|1T82_A (Forouhar, et al., unpublished) is described from crystallography work as a putative thioesterase. About half of the members of this family are fused to an Acetyltransf_1 domain (pfam00583). The function of this protein is unknown. 138
18914 274137 TIGR02448 TIGR02448 conserved hypothetical protein. This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1, and also in Pseudomonas putida KT2440, four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown. 101
18915 131502 TIGR02449 TIGR02449 TIGR02449 family protein. Members of this family are small proteins, typically 73 amino acids in length, with single copies in each of several Proteobacteria, including Xylella fastidiosa, Pseudomonas aeruginosa, and Xanthomonas campestris. The function is unknown. 65
18916 131503 TIGR02450 TIGR02450 tryptophan-rich conserved hypothetical protein. Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo. 61
18917 131504 TIGR02451 anti_sig_ChrR anti-sigma factor, putative, ChrR family. The member of this family from Rhodobacter sphaeroides has been shown both to form a complex with sigma(E) and to negatively regulate tetrapyrrole biosynthesis. This protein likely contains (at least) two distinct functional domains; several smaller homologs (excluded by the model) show homology only to the C-terminal, including a motif PxHxHxGxE. [Regulatory functions, Other] 215
18918 131505 TIGR02452 TIGR02452 TIGR02452 family protein. Members of this uncharacterized protein family are found in Streptomyces, Nostoc sp. PCC 7120, Clostridium acetobutylicum, Lactobacillus johnsonii NCC 533, Deinococcus radiodurans, and Pirellula sp. for a broad but sparse phylogenetic distribution that at least suggests lateral gene transfer. 266
18919 274138 TIGR02453 TIGR02453 TIGR02453 family protein. Members of this family are widely (though sparsely) distributed bacterial proteins about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown. In several fungi, this model identifies a conserved region of a longer protein. Therefore, it may be incorrect to speculate that all members share a common function. 217
18920 274139 TIGR02454 ECF_T_CbiQ cobalt ECF transporter T component CbiQ. This model represents the CbiQ component of the cobalt-specific ECF-type transporter. CbiQ is now recognized as the T component of energy-coupling factor (ECF)-type transporters. The S component confers specificity (CbiM-N for cobalt systems), while CbiO is the ABC-family ATPase. In general, proteins found by this model reside next to the other putative subunits of the complex, identified as CbiN, CbiO, or CbiM. Note that the designation of cobalt transporter has been spread excessively among ECF system transporters with many other specificities. [Transport and binding proteins, Cations and iron carrying compounds] 198
18921 131508 TIGR02455 TreS_stutzeri trehalose synthase, Pseudomonas stutzeri type. Trehalose synthase catalyzes a one-step conversion of maltose to trehalose. This is an alternative to the OtsAB and TreYZ pathways. This family includes a characterized example from Pseudomonas stutzeri plus very closely related sequences from other Pseudomonads. Cutoff scores are set to find a more distantly related sequence from Desulfovibrio vulgaris, likely to be functionally equivalent, between trusted and noise limits. [Energy metabolism, Biosynthesis and degradation of polysaccharides, Cellular processes, Adaptations to atypical conditions] 688
18922 274140 TIGR02456 treS_nterm trehalose synthase. Trehalose synthase interconverts maltose and alpha,alpha-trehalose by transglucosylation. This is one of at least three mechanisms for biosynthesis of trehalose, an important and widespread compatible solute. However, it is not driven by phosphate activation of sugars and its physiological role may tend toward trehalose degradation. This view is accentuated by numerous examples of fusion to a probable maltokinase domain. The sequence region described by this model is found both as the whole of a trehalose synthase and as the N-terminal region of a larger fusion protein that includes trehalose synthase activity. Several of these fused trehalose synthases have a domain homologous to proteins with maltokinase activity from Actinoplanes missouriensis and Streptomyces coelicolor. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 539
18923 274141 TIGR02457 TreS_Cterm trehalose synthase-fused probable maltokinase. Three pathways exist for the biosynthesis of trehalose, an osmoprotectant that in some species is also a precursor of certain cell wall glycolipids. Trehalose synthase, TreS, can interconvert maltose and trehalose, but while the equilibrium may favor trehalose, physiological concentrations of trehalose may be much greater than that of maltose and TreS may act largely in its degradation. This model describes a domain found only as a C-terminal fusion to TreS proteins. The most closely related proteins outside this family, Pep2 of Streptomyces coelicolor and Mak1 of Actinoplanes missouriensis, have known maltokinase activity. We suggest this domain acts as a maltokinase and helps drive conversion of trehalose to maltose. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 528
18924 274142 TIGR02458 CbtA cobalt transporter subunit CbtA (proposed). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five trans-membrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site. 225
18925 131512 TIGR02459 CbtB cobalt transporter subunit CbtB (proposed). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional trans-membrane segments. 60
18926 162866 TIGR02460 osmo_MPGsynth mannosyl-3-phosphoglycerate synthase. This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP) comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus, this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase. 381
18927 131514 TIGR02461 osmo_MPG_phos mannosyl-3-phosphoglycerate phosphatase. Members of this family are mannosyl-3-phosphoglycerate phosphatase (EC 3.1.3.70). It acts sequentially after mannosyl-3-phosphoglycerate synthase (EC 2.4.1.217) in a two-step pathway of biosynthesis of the compatible solute mannosylglycerate, a typical osmolyte of thermophiles. 225
18928 274143 TIGR02462 pyranose_ox pyranose oxidase. Pyranose oxidase (also called glucose 2-oxidase) converts D-glucose and molecular oxygen to 2-dehydro-D-glucose and hydrogen peroxide. Peroxide production is believed to be important to the wood rot fungi in which this enzyme is found for lignin degradation. 547
18929 131516 TIGR02463 MPGP_rel mannosyl-3-phosphoglycerate phosphatase-related protein. This family consists of members of the HAD superfamily, subfamily IIB. All members are closely related to mannosyl-3-phosphoglycerate phosphatase, the second enzyme in a two-step pathway for biosynthesis of mannosylglycerate, a compatible solute present in some thermophiles and in Dehalococcoides ethenogenes. However, members of this family are separable in a neighbor-joining tree constructed from a multiple sequence alignment and are found only in mesophiles that lack the companion mannosyl-3-phosphoglycerate synthase (TIGR02460). Members of this family are likely to act on a compound related to yet distinct from mannosyl-3-phosphoglycerate. [Unknown function, General] 221
18930 274144 TIGR02464 ribofla_fusion conserved hypothetical protein, ribA/ribD-fused. This model describes a sequence region that occurs in at least three different polypeptide contexts. It is found fused to GTP cyclohydrolase II, the RibA of riboflavin biosynthesis (TIGR00505), as in Vibrio vulnificus. It is found fused to riboflavin biosynthesis protein RibD (TIGR00326) in rice and Arabidopsis. It occurs as a standalone protein in a number of bacterial species in varied contexts, including single gene operons and bacteriophage genomes. The member from E. coli currently is named YbiA. The function(s) of members of this family is unknown. 153
18931 131518 TIGR02465 chlorocat_1_2 chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds. 246
18932 274145 TIGR02466 TIGR02466 conserved hypothetical protein. This family consists of uncharacterized proteins in Caulobacter crescentus CB15, Bdellovibrio bacteriovorus HD100, Synechococcus sp. WH 8102, Silicibacter pomeroyi DSS-3, and Hyphomonas neptunium ATCC 15444. The context of nearby genes differs substantially between members and does not point to any specific biological role. [Hypothetical proteins, Conserved] 201
18933 274146 TIGR02467 CbiE precorrin-6y C5,15-methyltransferase (decarboxylating), CbiE subunit. This model recognizes the CbiE methylase which is responsible, in part (along with CbiT), for methylating precorrin-6y (or cobalt-precorrin-6y) at both the 5 and 15 positions as well as the concomitant decarboxylation at C-12. In many organisms, this protein is fused to the CbiT subunit. The fused protein, when found in organisms catalyzing the oxidative version of the cobalamin biosynthesis pathway, is called CobL. 204
18934 274147 TIGR02468 sucrsPsyn_pln sucrose phosphate synthase/possible sucrose phosphate phosphatase, plant. Members of this family are sucrose-phosphate synthases of plants. This enzyme is known to exist in multigene families in several species of both monocots and dicots. The N-terminal domain is the glucosyltransferase domain. Members of this family also have a variable linker region and a C-terminal domain that resembles sucrose phosphate phosphatase (SPP) (EC 3.1.3.24) (see TIGR01485), the next and final enzyme of sucrose biosynthesis. The SPP-like domain likely serves a binding and not a catalytic function, as the reported SPP is always encoded by a distinct protein. 1050
18935 274148 TIGR02469 CbiT precorrin-6Y C5,15-methyltransferase (decarboxylating), CbiT subunit. This model recognizes the CbiT methylase which is responsible, in part (along with CbiE), for methylating precorrin-6y (or cobalt-precorrin-6y) at both the 5 and 15 positions as well as the concomitant decarboxylation at C-12. In many organisms, this protein is fused to the CbiE subunit. The fused protein, when found in organisms catalyzing the oxidative version of the cobalamin biosynthesis pathway, is called CobL. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 124
18936 274149 TIGR02470 sucr_synth sucrose synthase. This model represents sucrose synthase, an enzyme that, despite its name, generally uses rather than produces sucrose. Sucrose plus UDP (or ADP) becomes D-fructose plus UDP-glucose (or ADP-glucose), which is then available for cell wall (or starch) biosynthesis. The enzyme is homologous to sucrose phosphate synthase, which catalyzes the penultimate step in sucrose synthesis. Sucrose synthase is found, so far, exclusively in plants and cyanobacteria. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 784
18937 131524 TIGR02471 sucr_syn_bact_C sucrose-phosphate synthase, sucrose phosphatase-like domain, bacterial. Sucrose phosphate synthase (SPS) and sucrose phosphate phosphatase (SPP) are the last two enzymes of sucrose biosynthesis. In cyanobacteria and plants, the C-terminal region of most or all versions of SPS has a domain homologous to the known SPP. This domain may serve a binding or regulatory rather than catalytic function. Sequences in this family are bacterial C-terminal regions found in all but two of the putative bacterial sucrose phosphate synthases described by TIGR02472. 236
18938 131525 TIGR02472 sucr_P_syn_N sucrose-phosphate synthase, putative, glycosyltransferase domain. This family consists of the N-terminal regions, or in some cases the entirety, of bacterial proteins closely related to plant sucrose-phosphate synthases (SPS). The C-terminal domain (TIGR02471), found with most members of this family, resembles both bona fide plant sucrose-phosphate phosphatases (SPP) and the SPP-like domain of plant SPS. At least two members of this family lack the SPP-like domain, which may have binding or regulatory rather than enzymatic activity by analogy to plant SPS. This enzyme produces sucrose 6-phosphate and UDP from UDP-glucose and D-fructose 6-phosphate, and may be encoded near the gene for fructokinase. 439
18939 131526 TIGR02473 flagell_FliJ flagellar export protein FliJ. Members of this family are the FliJ protein found, in nearly every case, in the midst of other flagellar biosynthesis genes in bacterial genomes. Typically the fliJ gene is found adjacent to the gene for the flagellum-specific ATPase FliI. Sequences scoring in the gray zone between trusted and noise cutoffs include both probable FliJ proteins and components of bacterial type III secretion systems. 141
18940 274150 TIGR02474 pec_lyase pectate lyase, PelA/Pel-15E family. Members of this family are isozymes of pectate lyase (EC 4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 290
18941 274151 TIGR02475 CobW cobalamin biosynthesis protein CobW. The family of proteins identified by this model is generally found proximal to the trimeric cobaltochelatase subunit CobN which is essential for vitamin B12 (cobalamin) biosynthesis. The protein contains a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. A broader CobW family is delineated by two Pfam models which identify the N- and C-terminal domains (pfam02492 and pfam07683). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 341
18942 162875 TIGR02476 BluB 5,6-dimethylbenzimidazole synthase. A previously published hypothesis that BluB, involved in cobalamin biosynthesis, is EC 1.16.8.1 (cob(II)yrinic acid a,c-diamide reductase) is now contradicted by newer work ascribing a role in 5,6-dimethylbenzimidazole (DMB) biosynthesis. The BluB protein is related to the nitroreductase family (pfam0881). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 205
18943 131530 TIGR02477 PFKA_PPi diphosphate--fructose-6-phosphate 1-phosphotransferase. Diphosphate--fructose-6-phosphate 1-phosphotransferase catalyzes the addition of phosphate from diphosphate (PPi) to fructose 6-phosphate to give fructose 1,6-bisphosphate (EC 2.7.1.90). The enzyme is also known as pyrophosphate-dependent phosphofructokinase. The usage of PPi-dependent enzymes in glycolysis presumably frees up ATP for other processes. TIGR02482 represents the ATP-dependent 6-phosphofructokinase enzyme contained within pfam00365: Phosphofructokinase. This model hits primarily bacterial, plant alpha, and plant beta sequences. [Energy metabolism, Glycolysis/gluconeogenesis] 539
18944 274152 TIGR02478 6PF1K_euk 6-phosphofructokinase, eukaryotic type. Members of this family are eukaryotic (with one exception) ATP-dependent 6-phosphofructokinases (EC 2.7.1.11) in which two tandem copies of the phosphofructokinase are found. Members are found, often including several isozymes, in animals and fungi and in the bacterium Propionibacterium acnes KPA171202 (a human skin commensal). 746
18945 274153 TIGR02479 FliA_WhiG RNA polymerase sigma factor, FliA/WhiG family. Most members of this family are the flagellar operon sigma factor FliA, controlling transcription of bacterial flagellar genes by RNA polymerase. An exception is the sigma factor WhiG in the genus Streptomyces, involved in the production of sporulating aerial mycelium. 224
18946 131533 TIGR02480 fliN flagellar motor switch protein FliN. Proteins that consist largely of the domain described by this model for this protein family can be designated flagellar motor switch protein FliN. Longer proteins in which this region is a C-terminal domain typically are designated FliY. More distantly related sequences, outside the scope of this family, are associated with type III secretion and include the surface presentation of antigens protein SpaO required for invasion of host cells by Salmonella enterica. [Cellular processes, Chemotaxis and motility] 77
18947 274154 TIGR02481 hemeryth_dom hemerythrin-like metal-binding domain. This model describes both members of the hemerythrin (TIGR00058) family of marine invertebrates and a broader collection of bacterial and archaeal homologs. Many of the latter group are multidomain proteins with signal-transducing domains such as the GGDEF diguanylate cyclase domain (TIGR00254, pfam00990) and methyl-accepting chemotaxis protein signaling domain (pfam00015). Most hemerythrins are oxygen-carriers with a bound non-heme iron, but at least one example is a cadmium-binding protein, apparently with a role in sequestering toxic metals rather than in binding oxygen. Patterns of conserved residues suggest that all prokaryotic instances of this domain bind iron or another heavy metal, but the exact function is unknown. Not surprisingly, the prokaryote with the most instances of this domain is Magnetococcus sp. MC-1, a magnetotactic bacterium. 126
18948 213713 TIGR02482 PFKA_ATP 6-phosphofructokinase. 6-phosphofructokinase (EC 2.7.1.11) catalyzes the addition of phosphate from ATP to fructose 6-phosphate to give fructose 1,6-bisphosphate. This represents a key control step in glycolysis. This model hits bacterial ATP-dependent 6-phosphofructokinases which lack a beta-hairpin loop present in TIGR02483 family members. TIGR02483 contains members that are ATP-dependent as well as members that are pyrophosphate-dependent. TIGR02477 represents the pyrophosphate-dependent phosphofructokinase, diphosphate--fructose-6-phosphate 1-phosphotransferase (EC 2.7.1.90). [Energy metabolism, Glycolysis/gluconeogenesis] 301
18949 274155 TIGR02483 PFK_mixed phosphofructokinase. Members of this family that are characterized, save one, are phosphofructokinases dependent on pyrophosphate (EC 2.7.1.90) rather than ATP (EC 2.7.1.11). The exception is one of three phosphofructokinases from Streptomyces coelicolor. Family members are both bacterial and archaeal. [Energy metabolism, Glycolysis/gluconeogenesis] 324
18950 274156 TIGR02484 CitB CitB domain protein. This model identifies proteins of two distinct names which may or may not have two distinct functions. CitB has been identified in Salmonella and E. coli as the signal transduction component of a two-component system for citrate in which CitA acts as a citrate transporter. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the C-terminal domain of the R. capsulatus CobZ, which, in most other species, exists as a separate gene adjacent to CobZ. 372
18951 274157 TIGR02485 CobZ_N-term precorrin 3B synthase CobZ. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the N-terminal portion of the R. capsulatus gene which, in other species exists as a separate protein. The C-terminal portion is homologous to the 2-component signal transduction system protein CitB (TIGR02484). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 432
18952 274158 TIGR02486 RDH reductive dehalogenase. This model represents a family of corrin and 8-iron Fe-S cluster-containing reductive dehalogenases found primarily in halorespiring microorganisms such as Dehalococcoides ethenogenes, which contains as many as 17 enzymes of this type with varying substrate ranges. One example of a characterized species is the tetrachloroethene reductive dehalogenase (1.97.1.8) which also acts on trichloroethene converting it to dichloroethene. 314
18953 274159 TIGR02487 NrdD anaerobic ribonucleoside-triphosphate reductase. This model represents the oxygen-sensitive (anaerobic, class III) ribonucleotide reductase. The mechanism of the enzyme involves a glycine-centered radical, a C-terminal zinc binding site, and a set of conserved active site cysteines and asparagines. This enzyme requires an activating component, NrdG, a radical-SAM domain containing enzyme (TIGR02491). Together the two form an alpha-2/beta-2 heterodimer. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 579
18954 131541 TIGR02488 flgG_G_neg flagellar basal-body rod protein FlgG, Gram-negative bacteria. This family consists of the FlgG protein of the flagellar apparatus in the Proteobacteria and spirochetes. [Cellular processes, Chemotaxis and motility] 259
18955 131542 TIGR02489 flgE_epsilon flagellar hook protein FlgE, epsilon proteobacterial. Members of this family are flagellar hook proteins, designated FlgE, as found in the epsilon subdivision of the Proteobacteria (Helicobacter, Wolinella, and Campylobacter). These proteins differ significantly in architecture from proteins designated FlgE in other lineages; the N-terminal and C-terminal domains are homologous, but members of this family only contain a large central domain that is surface-exposed and variable between strains. 719
18956 274160 TIGR02490 flgF flagellar basal-body rod protein FlgF. Members of this protein family are FlgF, one of several homologous flagellar basal-body rod proteins in bacteria. [Cellular processes, Chemotaxis and motility] 89
18957 274161 TIGR02491 NrdG anaerobic ribonucleoside-triphosphate reductase activating protein. This enzyme is a member of the radical-SAM family (pfam04055) and utilizes S-adenosyl methionine, an iron-sulfur cluster and a reductant (dihydroflavodoxin) to produce a glycine-centered radical in the class III (anaerobic) ribonucleotide triphosphate reductase (NrdD, TIGR02487). The two components form an alpha-2/beta-2 heterodimer. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism, Protein fate, Protein modification and repair] 154
18958 274162 TIGR02492 flgK_ends flagellar hook-associated protein FlgK. The flagellar hook-associated protein FlgK of bacterial flagella has conserved N- and C-terminal domains. The central region is highly variable in length and sequence, and often contains substantial runs of low-complexity sequence. This model is built from an alignment of FlgK sequences with the central region excised. Note that several other proteins of the flagellar apparatus also are homologous in the N- and C-terminal regions to FlgK, but are excluded from this model. [Cellular processes, Chemotaxis and motility] 323
18959 131546 TIGR02493 PFLA pyruvate formate-lyase 1-activating enzyme. An iron-sulfur protein with a radical-SAM domain (pfam04055). A single glycine residue in EC 2.3.1.54, formate C-acetyltransferase (formate-pyruvate lyase), is oxidized to the corresponding radical by transfer of H from its CH2 to AdoMet with concomitant cleavage of the latter. The reaction requires Fe2+. The first stage is reduction of the AdoMet to give methionine and the 5'-deoxyadenosin-5-yl radical, which then abstracts a hydrogen radical from the glycine residue. [Energy metabolism, Anaerobic, Protein fate, Protein modification and repair] 235
18960 274163 TIGR02494 PFLE_PFLC glycyl-radical enzyme activating protein. This subset of the radical-SAM family (pfam04055) includes a number of probable activating proteins acting on different enzymes all requiring an amino-acid-centered radical. The closest relatives to this family are the pyruvate-formate lyase activating enzyme (PflA, 1.97.1.4, TIGR02493) and the anaerobic ribonucleotide reductase activating enzyme (TIGR02491). Included within this subfamily are activators of hydroxyphenyl acetate decarboxylase (HdpA), benzylsuccinate synthase (BssD), glycerol dehydratase (DhaB2) as well as enzymes annotated in E. coli as activators of different isozymes of pyruvate-formate lyase (PFLC and PFLE); however, these appear to lack characterization and may activate enzymes with distinctive functions. Most of the sequence-level variability between these forms is concentrated within an N-terminal domain which follows a conserved group of three cysteines and contains a variable pattern of 0 to 8 additional cysteines. 295
18961 274164 TIGR02495 NrdG2 anaerobic ribonucleoside-triphosphate reductase activating protein. This enzyme is a member of the radical-SAM family (pfam04055). It is often gene clustered with the class III (anaerobic) ribonucleotide triphosphate reductase (NrdD, TIGR02487) and presumably fulfills the identical function as NrdG, which utilizes S-adenosyl methionine, an iron-sulfur cluster and a reductant (dihydroflavodoxin) to produce a glycine-centered radical in NrdD. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism, Protein fate, Protein modification and repair] 192
18962 131549 TIGR02497 yscI_hrpB_dom type III secretion apparatus protein, YscI/HrpB, C-terminal domain. This model represents the conserved C-terminal domain of a protein conserved across species in the bacterial type III secretion apparatus. This protein is designated YscI (Yop proteins translocation protein I) in Yersinia and HrpB (hypersensitivity response and pathogenicity protein B) in plant pathogens such as Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 39
18963 131550 TIGR02498 type_III_ssaH type III secretion system protein, SsaH family. This family describes a small protein, always smaller than 100 amino acids, encoded in pathogenicity islands for bacterial type III secretion systems in various strains of Yersinia, Salmonella, and enteropathogenic E. coli, as well as Chromobacterium violaceum and Citrobacter rodentium. Although strictly associated with type III secretion systems, this protein seems not yet to have been characterized as part of the apparatus or as an effector protein. [Cellular processes, Pathogenesis] 79
18964 274165 TIGR02499 HrpE_YscL_not type III secretion apparatus protein, HrpE/YscL family. This model is related to pfam06188, but is broader. pfam06188 describes HrpE-like proteins, components of bacterial type III secretion systems primarily in bacteria that infect plants. This model includes also the homologous proteins of animal pathogens, such as YscL of Yersinia pestis. This model excludes the related protein FliH of the bacterial flagellar apparatus (see pfam02108) [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 166
18965 274166 TIGR02500 type_III_yscD type III secretion apparatus protein, YscD/HrpQ family. This family represents a conserved protein of bacterial type III secretion systems. Gene symbols are variable from species to species. Members are designated YscD in Yersinia, HrpQ in Pseudomonas syringae, and EscD in enteropathogenic Escherichia coli. In the Chlamydiae, this model describes the C-terminal 400 residues of a longer protein. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 410
18966 274167 TIGR02501 type_III_yscE type III secretion system protein, YseE family. Members of this family are found exclusively in type III secretion apparatus gene clusters in bacteria. Those bacteria with a protein from this family tend to target animal cells, as does Yersinia pestis. This protein is small (about 70 amino acids) and not well characterized. [Cellular processes, Pathogenesis] 67
18967 131554 TIGR02502 type_III_YscX type III secretion protein, YscX family. Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 121
18968 131555 TIGR02503 type_III_SycN type III secretion chaperone SycN. Members of this protein family are part of the machinery of bacterial type III secretion in a number of bacteria that target animal cells. In the well-studied system from Yersinia, a complex of this protein (SycN) and YscB (pfam07329) acts as a chaperone for the export of YopN. YopN then acts to control effector protein secretion, in response to calcium levels, so that secretion occurs only after contact with the targeted eukaryotic cell. [Protein fate, Protein folding and stabilization, Cellular processes, Pathogenesis] 119
18969 274168 TIGR02504 NrdJ_Z ribonucleoside-diphosphate reductase, adenosylcobalamin-dependent. This model represents a group of adenosylcobalamin(B12)-dependent ribonucleotide reductases (Class II RNRs) related to the characterized species from Pyrococcus, Thermoplasma, Corynebacterium, and Deinococcus. RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. This model identifies genes in a wide range of deeply branching bacteria. All are structurally related to the class I (non-heme iron dependent) RNRs. In most species this gene is known as NrdJ, while in mycobacteria it is called NrdZ. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 575
18970 274169 TIGR02505 RTPR ribonucleoside-triphosphate reductase, adenosylcobalamin-dependent. This model represents a group of adenosylcobalamin(B12)-dependent ribonucleotide reductases (RNR) related to the characterized species from Lactococcus leichmannii. RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. This model identifies NrdJ enzymes only in cyanobacteria, lactococcus and certain bacteriophage. A separate model (TIGR02504) identifies a larger group of divergent B12-dependent RNR's. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 713
18971 274170 TIGR02506 NrdE_NrdA ribonucleoside-diphosphate reductase, alpha subunit. This model represents the alpha (large) chain of the class I ribonucleotide reductase (RNR). RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. Class I RNR's generate the required radical (on tyrosine) via a "non-heme" iron cofactor which resides in the beta (small) subunit. The alpha subunit contains the catalytic and allosteric regulatory sites. The mechanism of this enzyme requires molecular oxygen. E. coli contains two versions of this enzyme which are regulated independently (NrdAB and NrdEF, where NrdA and NrdE are the large chains). Most organisms contain only one, but the application of the gene symbols NrdA and NrdE is somewhat arbitrary. This model identifies RNR's in diverse clades of bacteria, eukaryotes as well as numerous DNA viruses and phage. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 617
18972 131559 TIGR02507 MtrF tetrahydromethanopterin S-methyltransferase, F subunit. This small protein (MtrF) is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. 65
18973 131560 TIGR02508 type_III_yscG type III secretion protein, YscG family. YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins) in Yersinia. This family consists of YscG of Yersinia, and functionally equivalent type III secretion machinery protein in other species: AscG in Aeromonas, LscG in Photorhabdus luminescens, etc. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 115
18974 131561 TIGR02509 type_III_yopR type III secretion effector, YopR family. Members of this family are type III secretion system effectors, named differently in different species and designated YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yops protein is unusual in that it is released extracellularly rather than injected directly into the target cell as are most Yops. [Cellular processes, Pathogenesis] 131
18975 188230 TIGR02510 NrdE-prime ribonucleoside-diphosphate reductase, alpha chain. This model represents a small clade of ribonucleoside-diphosphate reductase, alpha chains which are sufficiently divergent from the usual Class I RNR alpha chains (NrdE or NrdA, TIGR02506) as to warrant their own model. The genes from Thermus thermophilus, Dichelobacter and Salinibacter are adjacent to the usual RNR beta chain. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 548
18976 131563 TIGR02511 type_III_tyeA type III secretion effector delivery regulator, TyeA family. Members of this family include both small proteins, about 90 amino acids, in which this model covers the whole, and longer proteins of about 360 residues which match in the C-terminal region. The longer proteins (HrpJ) have N-terminal regions that match pfam07201. Members of this family belong to bacterial type III secretion systems, and include TyeA from the well-studied Yersinia systems. TyeA appears involved in calcium-responsive regulation of the delivery of type III effectors. 79
18977 274171 TIGR02512 FeFe_hydrog_A [FeFe] hydrogenase, group A. This model describes iron-only hydrogenases of anaerobic and microaerophilic bacteria and protozoa. This model is narrower, and covers a longer stretch of sequence, than pfam02906. This family represents a division among families that belong to pfam02906, which also includes proteins such as nuclear prelamin A recognition factor in animals. Note that this family shows some heterogeneity in terms of periplasmic, cytosolic, or hydrogenosome location, NAD or NADP dependence, and overall protein length. 374
18978 131565 TIGR02513 type_III_yscB type III secretion system chaperone, YscB family. Members of this family include YscB of Yersinia and functionally equivalent (but differently named) proteins from type III secretion systems of other pathogens that affect animal cells. YscB acts, along with SycN (TIGR02503), as a chaperone for YopN, a key part of a complex that regulates type III secretion so it responds to contact with the eukaryotic target cell. 139
18979 274172 TIGR02514 type_III_yscP type III secretion system needle length determinant. Members of this family include YscP of the Yersinia type III secretion system and equivalent proteins in other animal pathogen bacterial type III secretion systems. The model describes the conserved C-terminal region. N-terminal regions are poorly conserved and variable in length with some low-complexity sequence. 129
18980 274173 TIGR02515 IV_pilus_PilQ type IV pilus secretin (or competence protein) PilQ. A number of proteins homologous to PilQ are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). Members of this family include PilQ itself, which is a component of the type IV pilus structure, from a number of species. In Haemophilus influenzae, the member of this family is associated with competence for transformation with exogenous DNA rather than with formation of a type IV pilus; the surface structure required for competence may be considered an unusual, incomplete type IV pilus structure. [Cell envelope, Surface structures] 418
18981 274174 TIGR02516 type_III_yscC type III secretion outer membrane pore, YscC/HrcC family. A number of proteins homologous to the type IV pilus secretin PilQ (TIGR02515) are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). The clade described by this model contains the outer membrane pore proteins of bacterial type III secretion systems, typified by YscC for animal pathogens and HrcC for plant pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 462
18982 274175 TIGR02517 type_II_gspD type II secretion system protein D. In Gram-negative bacteria, proteins that have first crossed the inner membrane by Sec-dependent protein transport can be exported across the outer membrane by type II secretion, also called the main terminal branch of the general secretion pathway. Members of this family are general secretion pathway protein D. In Yersinia enterocolitica, a second member of this family is part of a novel second type II secretion system specifically associated with virulence. This family is closely homologous to the type IV pilus outer membrane secretin PilQ (TIGR02515) and to the type III secretion system pore YscC/HrcC (TIGR02516). [Protein fate, Protein and peptide secretion and trafficking] 594
18983 131570 TIGR02518 EutH_ACDH acetaldehyde dehydrogenase (acetylating). 488
18984 274176 TIGR02519 pilus_MshL pilus (MSHA type) biogenesis protein MshL. Members of this family are predicted secretins, that is, outer membrane pore proteins associated with delivery of proteins from periplasm to the outside of the cell. Related families include GspD of type II secretion (TIGR02517), the YscC/HrcC family from type III secretion (TIGR02516), and the PilQ secretin of type IV pilus formation (TIGR02515). Members of this family are found in gene clusters associated with MSHA (mannose-sensitive hemagglutinin) and related pili, and appear to be the secretin of this pilus system. [Cell envelope, Surface structures] 290
18985 274177 TIGR02520 pilus_B_mal_scr type IVB pilus formation outer membrane protein, R64 PilN family. Several related protein families encode outer membrane pore proteins for type II secretion, type III secretion, and type IV pilus formation. This protein family appears to encode a secretin for pilus formation, although it is quite different from PilQ. Members include the PilN lipoprotein of the plasmid R64 thin pilus, a type IV pilus. Scoring between the trusted and noise cutoffs are examples of bundle-forming pilus B (bfpB). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking] 497
18986 131573 TIGR02521 type_IV_pilW type IV pilus biogenesis/stability protein PilW. Members of this family are designated PilF and PilW. This outer membrane protein is required both for pilus stability and for pilus function such as adherence to human cells. Members of this family contain copies of the TPR (tetratricopeptide repeat) domain. 234
18987 274178 TIGR02522 pilus_cpaD pilus (Caulobacter type) biogenesis lipoprotein CpaD. This family consists of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function is not known. [Cell envelope, Surface structures] 198
18988 131575 TIGR02523 type_IV_pilV type IV pilus modification protein PilV. Pilus systems categorized as type IV pilins differ greatly from one another, with some showing greater similarity to type II or type III secretion systems than to each other. Members of this protein family represent the PilV protein of type IV pilus systems as found in Pseudomonas aeruginosa PAO1, Pseudomonas syringae DC3000, Neisseria meningitidis MC58, Xylella fastidiosa 9a5c, etc. [Cell envelope, Surface structures, Protein fate, Protein modification and repair] 139
18989 131576 TIGR02524 dot_icm_DotB Dot/Icm secretion system ATPase DotB. Members of this protein family are the DotB component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This family is most closely related to TraJ proteins of plasmid transfer, rather than to proteins of other type IV secretion systems. 358
18990 131577 TIGR02525 plasmid_TraJ plasmid transfer ATPase TraJ. Members of this protein family are predicted ATPases associated with plasmid transfer loci in bacteria. This family is most similar to the DotB ATPase of a type-IV secretion-like system of obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii (TIGR02524). [Mobile and extrachromosomal element functions, Plasmid functions] 372
18991 131578 TIGR02526 eut_PduT PduT-like ethanolamine utilization protein. This gene shows up in ethanolamine utilization operons in which a proteinaceous coat organelle is also encoded. It is closely related to the PduT protein in propane-diol operons with the same structure. 182
18992 274179 TIGR02527 dot_icm_IcmQ Dot/Icm secretion system protein IcmQ. Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation. 179
18993 131580 TIGR02528 EutP ethanolamine utilization protein, EutP. This protein is found within operons which code for polyhedral organelles containing the enzyme ethanolamine ammonia lyase. The function of this gene is unknown, although the presence of an N-terminal GxxGxGK motif implies a GTP-binding site. [Energy metabolism, Amino acids and amines] 142
18994 274180 TIGR02529 EutJ ethanolamine utilization protein EutJ family protein. 239
18995 274181 TIGR02530 flg_new flagellar operon protein. Members of this family are found in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus. The specific molecular function is unknown. [Cellular processes, Chemotaxis and motility] 96
18996 188231 TIGR02531 yecD_yerC TrpR-related protein YerC/YecD. This model represents a protein subfamily found mostly in the Firmicutes (Bacillus and allies). This family is similar in sequence to the trp operon repressor TrpR described by TIGR01321, and represents a distinct clade within the broader family described by pfam01371. At least one species, Xylella fastidiosa, in the Proteobacteria, has a member of both this family and TIGR01321. Several genomes with a member of this family do not synthesize tryptophan, and members of this family should not be considered trp operon repressors without new evidence. [Unknown function, General] 87
18997 274182 TIGR02532 IV_pilin_GFxxxE prepilin-type N-terminal cleavage/methylation domain. This model describes many but not all examples of the N-terminal region of bacterial proteins that resemble type IV pilins at their N-terminus, with a cleavage site G^FxxxE followed by a hydrophobic stretch. The new N-terminal residue, usually Phe, is methylated. Separate domains of the prepilin peptidase appear responsible for cleavage and methylation. Proteins with this N-terminal region include type IV pilins and other components of pilus biogenesis, competence proteins, and type II secretion proteins. Typically several proteins in a single operon have this N-terminal domain. The N-terminal cleavage and methylation site is described by PROSITE motif PS00409 as [KRHEQSTAG]-G-[FYLIVM]-[ST]-[LT]-[LIVP]-E-[LIVMFWSTAG](14). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking] 24
18998 131585 TIGR02533 type_II_gspE type II secretion system protein E. This family describes GspE, the E protein of the type II secretion system, also called the main terminal branch of the general secretion pathway. This model separates GspE from the PilB protein of type IV pilin biosynthesis. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 486
18999 162905 TIGR02534 mucon_cyclo muconate and chloromuconate cycloisomerases. This model encompasses muconate cycloisomerase (EC 5.5.1.1) and chloromuconate cycloisomerase (EC 5.5.1.7), enzymes that often overlap in specificity. It excludes more distantly related proteins such as mandelate racemase (5.1.2.2). 368
19000 274183 TIGR02535 hyp_Hser_kinase proposed homoserine kinase. The genes in this family are largely adjacent to genes involved in the biosynthesis of threonine (aspartate kinase, homoserine dehydrogenase and threonine synthase) in genomes which are lacking any other known homoserine kinase, and in which the presence of a homoserine kinase would indicate a complete pathway for the biosynthesis of threonine. These genes are a member of the (now subfamily, formerly equivalog) TIGR00306 model describing the archaeal form of 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. All of these are members of a superfamily (pfam01676) of metalloenzyme also including phosphopentomutase alkaline phosphatases and sulfatases. The proposal that this family encodes a kinase is based on analogy to phosphomutases which are intramolecular phosphotransferases. A mutase active site could evolve to bring together homoserine and a phosphate donor such as phosphoenolpyruvate resulting in a kinase activity. 396
19001 274184 TIGR02536 eut_hyp ethanolamine utilization protein. This family of proteins is found in operons for the polyhedral organelle-based degradation of ethanolamine. This family is not found in proteobacterial species which otherwise have the same suite of genes in the eut operon. Proteobacteria have two genes that are not found in non-proteobacteria which may complement this gene's function, a phosphotransacetylase (pfam01515) and the EutJ protein (TIGR02529) of unknown function. 207
19002 274185 TIGR02537 arch_flag_Nterm archaeal flagellin N-terminal-like domain. This model describes a hydrophobic N-terminal sequence of archaeal flagellins and other archaeal proteins. The sequence is directly analogous to bacterial sequences recognized by TIGR02532, which has a cleavage motif resembling G^FxxxE followed by strongly hydrophobic sequence. Such sequences are recognized for cleavage and methylation, and include pilins and other pilus components and competence and type II secretion proteins. In the present family, the E is not conserved and the sequence differs enough that there is no overlap between this family and TIGR02532. 24
19003 274186 TIGR02538 type_IV_pilB type IV-A pilus assembly ATPase PilB. This model describes a protein of type IV pilus biogenesis designated PilB in Pseudomonas aeruginosa but PilF in Neisseria gonorrhoeae; the more common usage, reflected here, is PilB. This protein is an ATPase involved in protein export for pilin assembly and is closely related to GspE (TIGR02533) of type II secretion, also called the main terminal branch of the general secretion pathway. Note that type IV pilus systems are often divided into type IV-A and IV-B, with the latter group including bundle-forming pilus, mannose-sensitive hemagglutinin, etc. Members of this family are found in type IV-A systems. [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking] 564
19004 274187 TIGR02539 SepCysS O-phospho-L-seryl-tRNA:Cys-tRNA synthase. Aminoacylation of tRNA(Cys) with Cys, and cysteine biosynthesis in the process, happens in Methanocaldococcus jannaschii and several other archaea by misacylation of tRNA(Cys) with O-phosphoserine (Sep), followed by modification of the phosphoserine to cysteine. In some species, direct tRNA-cys aminoacylation also occurs but this pathway is required for Cys biosynthesis. Members of this protein family catalyze the second step in this two-step pathway, using pyridoxal phosphate and a sulfur donor to synthesize Cys from Sep while attached to the tRNA. 369
19005 131592 TIGR02540 gpx7 putative glutathione peroxidase Gpx7. This model represents one of several families of known and probable glutathione peroxidases. This family is restricted to animals and designated GPX7. 153
19006 274188 TIGR02541 flagell_FlgJ flagellar rod assembly protein/muramidase FlgJ. The N-terminal region of this protein acts directly in flagellar rod assembly. The C-terminal region is a flagellum-specific muramidase (peptidoglycan hydrolase) required for formation of the outer membrane L ring. 294
19007 211749 TIGR02542 T_forsyth_147 TANFOR domain. The longest predicted protein in Tannerella forsythia (Bacteroides forsythus) ATCC 43037 is over 3000 residues long and lacks homology to other known proteins. Immediately after the signal sequence are four tandem repeats, approximately 147 residues long. This model describes that repeat, plus homologous single copy N-terminal domains in other large bacterial proteins. We designate this region the TANFOR domain. Many proteins with this domain also have fibronectin type III domains. 145
19008 274189 TIGR02543 List_Bact_rpt Listeria/Bacteroides repeat. This model describes a conserved core region, about 43 residues in length, of at least two families of tandem repeats. These include 78-residue repeats from 2 to 15 in number, in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few bacteria. [Unknown function, General] 43
19009 274190 TIGR02544 III_secr_YscJ type III secretion apparatus lipoprotein, YscJ/HrcJ family. All members of this protein family are predicted lipoproteins with a conserved Cys near the N-terminus for cleavage and modification, and are part of known or predicted type III secretion systems. Members are found in both plant and animal pathogens, including the obligately intracellular chlamydial species and (non-pathogenic) root nodule bacteria. The most closely related proteins outside this family are examples of the flagellar M-ring protein FliF. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 193
19010 274191 TIGR02546 III_secr_ATP type III secretion apparatus H+-transporting two-sector ATPase. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 422
19011 274192 TIGR02547 casA_cse1 CRISPR type I-E/ECOLI-associated protein CasA/Cse1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by CT1972 from Chlorobium tepidum, is found in Ecoli subtype CRISPR/Cas regions of many bacteria, most of which are mesophiles, and not in Archaea. It is designated Cse1. 502
19012 274193 TIGR02548 casB_cse2 CRISPR type I-E/ECOLI-associated protein CasB/Cse2. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family is found in Ecoli subtype CRISPR/Cas regions of many bacteria, most of which are mesophiles, and not in Archaea. It was designated Cse2 originally, and renamed CasB based on its characterization in the CASCADE complex. 160
19013 274194 TIGR02549 CRISPR_DxTHG CRISPR-associated DxTHG motif protein. This model describes a short region highly conserved between two otherwise substantially different CRISPR-associated (cas) proteins, TIGR02221 and TIGR01987. This region includes the motif [VIL]-D-x-[ST]-H-[GS]. 21
19014 274195 TIGR02550 flagell_flgL flagellar hook-associated protein 3. This protein family consists of flagellar hook-associated proteins designated FlgL (or HAP3) encoded in bacterial flagellar operons. An N-terminal region of about 150 residues and a C-terminal region of about 85 residues are conserved. Members show considerable length heterogeneity between these two well-conserved terminal regions; the seed alignment has 486 columns, 393 of which are represented in the model, while members of this family are from 287 to over 500 residues in length. This model distinguishes FlgL from the flagellin gene product FliC. [Cellular processes, Chemotaxis and motility] 306
19015 274196 TIGR02551 SpaO_YscQ type III secretion system apparatus protein YscQ/HrcQ. Genes in this family are found in type III secretion operons. The gene (YscQ) in Yersinia is essential for YOPs secretion, while SpaO in Shigella is involved in the Surface Presentation of Antigens apparatus found on the virulence plasmid, and HrcQ is involved in the Harpin secretory system in organisms like Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 298
19016 274197 TIGR02552 LcrH_SycD type III secretion low calcium response chaperone LcrH/SycD. Genes in this family are found in type III secretion operons. LcrH, from Yersinia, is believed to have a regulatory function in the low-calcium response of the secretion system. The same protein is also known as SycD (SYC = Specific Yop Chaperone) for its chaperone role. In Pseudomonas, where the homolog is known as PcrH, the chaperone role has been demonstrated and the regulatory role appears to be absent. SycD/LcrH contains three central tetratricopeptide-like repeats that are predicted to fold into an all-alpha-helical array. 135
19017 274198 TIGR02553 SipD_IpaD_SspD type III effector protein IpaD/SipD/SspD. These proteins are found within type III secretion operons and have been shown to be secreted by that system. 313
19018 131605 TIGR02554 PrgH type III secretion system protein PrgH/EprH. In Salmonella, this gene is part of a four-gene operon PrgHIJK and in general is found in type III secretion operons. PrgH has been shown to be required for secretion, as well as being a structural component of the needle complex. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 389
19019 131606 TIGR02555 OrgA_MxiK type III secretion apparatus protein OrgA/MxiK. This gene is found in type III secretion operons and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the secretion needle complex. 185
19020 274199 TIGR02556 cas_TM1802 CRISPR-associated protein, TM1802 family. This minor cas protein is found in CRISPR/cas regions of at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial. 555
19021 274200 TIGR02557 HpaP type III secretion protein HpaP. This family of genes is always found in type III secretion operons, although its function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence. 201
19022 131609 TIGR02558 HrpB2 type III secretion protein HrpB2. This family of genes is found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia. 124
19023 131610 TIGR02559 HrpB7 type III secretion protein HrpB7. This family of genes is found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia. 158
19024 131611 TIGR02560 HrpB4 type III secretion protein HrpB4. This family of genes is always found in type III secretion operons in a limited number of species including Burkholderia, Xanthomonas and Ralstonia. 210
19025 131612 TIGR02561 HrpB1_HrpK type III secretion protein HrpB1/HrpK. This gene is found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia. 153
19026 274201 TIGR02562 cas3_yersinia CRISPR-associated helicase Cas3, subtype I-F/YPEST. The helicase in many CRISPR-associated (cas) gene clusters is designated Cas3, and most Cas3 proteins are described by model TIGR01587. Members of this family are considerably larger, show a number of motifs in common with TIGR01587 sequences, and replace Cas3 in some CRISPR/cas loci in a number of Proteobacteria, including Yersinia pestis, Chromobacterium violaceum, Erwinia carotovora subsp. atroseptica SCRI1043, Photorhabdus luminescens subsp. laumondii TTO1, Legionella pneumophila, etc. 1110
19027 274202 TIGR02563 cas_Csy4 CRISPR-associated endoribonuclease Cas6/Csy4, subtype I-F/YPEST. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4. 185
19028 274203 TIGR02564 cas_Csy1 CRISPR type I-F/YPEST-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1. 384
19029 274204 TIGR02565 cas_Csy2 CRISPR type I-F/YPEST-associated protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2. 296
19030 274205 TIGR02566 cas_Csy3 CRISPR type I-F/YPEST-associated protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3. 341
19031 131618 TIGR02567 YscW type III secretion system chaperone YscW. This family of proteins is found within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC (TIGR02516). YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC. 124
19032 274206 TIGR02568 LcrE type III secretion regulator YopN/LcrE/InvE/MxiC. This protein is found in type III secretion operons and, in Yersinia, is localized to the cell surface and is involved in the Low-Calcium Response (LCR), possibly by sensing the calcium concentration. In Salmonella, the gene is known as InvE and is believed to perform an essential role in the secretion process and interacts with the proteins SipBCD and SicA.//Altered name to reflect regulatory role. Added GO and role IDs . Negative regulation of type III secretion in Y pestis is mediated in part by a multiprotein complex that has been proposed to act as a physical impediment to type III secretion by blocking the entrance to the secretion apparatus prior to contact with mammalian cells. This complex is composed of YopN, its heterodimeric secretion chaperone SycN-YscB, and TyeA. 3[SS 6/3/05] [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 240
19033 131620 TIGR02569 TIGR02569_actnb TIGR02569 family protein. This protein family is found, so far, only in Actinobacteria, including at least five species of Mycobacterium, three of Corynebacterium, and Nocardia farcinica, always in a single copy per genome. The function is unknown. [Hypothetical proteins, Conserved] 272
19034 274207 TIGR02570 cas7_GSU0053 CRISPR-associated protein GSU0053/csb1, Dpsyc system. Members of this family are found in association with CRISPR repeats and other CRISPR-associated (cas) genes in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria). This CRISPR/Cas type is designated Dpsych. 172
19035 131622 TIGR02571 ComEB ComE operon protein 2. This protein is found in the ComE operon for "late competence" as characterized in B. subtilis. Proteins in this family contain homology to a cytidine/deoxycytidine deaminase domain family (pfam00383), and may carry out this activity. 151
19036 131623 TIGR02572 LcrR type III secretion system regulator LcrR. This protein is found in type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR). [Protein fate, Protein and peptide secretion and trafficking] 139
19037 131624 TIGR02573 LcrG_PcrG type III secretion protein LcrG. This protein is found in type III secretion operons, along with LcrR, H and V. Also known as PcrG in Pseudomonas, the protein is believed to make a 1:1 complex with PcrV (LcrV). Mutants of LcrG cause premature secretion of effector proteins into the medium. 90
19038 131625 TIGR02574 stabl_TIGR02574 putative addiction module component, TIGR02574 family. Members of this family are bacterial proteins, typically are about 75 amino acids long, always found as part of a pair (at least) of two small genes. The other in the pair always belongs to a subfamily of the larger family pfam05016 (although not necessarily scoring above the designated cutoff), which contains plasmid stabilization proteins. It is likely that this protein and its pfam05016 member partner comprise some form of addiction module, although these gene pairs usually are found on the bacterial main chromosome. [Mobile and extrachromosomal element functions, Other] 63
19039 274208 TIGR02577 cas_TM1794_Cmr2 CRISPR-associated protein Cas10/Cmr2, subtype III-B. This model represents a Cmr2 family of the CRISPR-associated RAMP module, a set of six genes recurrently found together in prokaryotic genomes. This gene cluster is found only in species with CRISPR repeats, usually near the repeats themselves. Because most of the six (but not this family) contain RAMP domains, and because its appearance in a genome appears to depend on other CRISPR-associated Cas genes, the set is designated the CRISPR RAMP module. This protein, typified by TM1794 from Thermotoga maritima, is designated Cmr2, for CRISPR RAMP Module protein 2. 483
19040 274209 TIGR02578 cas_TM1811_Csm1 CRISPR-associated protein Cas10/Csm1, subtype III-A/MTUBE. The family is designated Csm2, for CRISPR/Cas Subtype Mtube Protein 2. A typical example is TM1811 from Thermotoga maritima. CRISPR are Clustered Regularly Interspaced Short Palindromic Repeats. This protein family belongs to a conserved gene cluster regularly found near CRISPR repeats. 648
19041 131628 TIGR02579 cas_csx3 CRISPR-associated protein, Csx3 family. Members of this family are found encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3). 83
19042 274210 TIGR02580 cas_RAMP_Cmr4 CRISPR type III-B/RAMP module RAMP protein Cmr4. This model represents a CRISPR-associated protein from the family that includes TM1792 of Thermotoga maritima. This family is part of the broad RAMP superfamily (pfam03787) collection of CRISPR-associated proteins. It is the fourth of a recurring set of six proteins, four of which are in the RAMP superfamily, that we designate the CRISPR RAMP module. 280
19043 274211 TIGR02581 cas_cyan_RAMP CRISPR-associated RAMP protein, SSO1426 family. Members of this CRISPR-associated (cas) gene family are found in the RAMP-2 subtype of CRISPR/cas locus and designated TM1809 family. 217
19044 274212 TIGR02582 cas7_TM1809 CRISPR type III-A/MTUBE-associated RAMP protein Csm3. Members of this CRISPR-associated (cas) gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm3, for CRISPR/cas Subtype Mtube, protein 3. 204
19045 274213 TIGR02583 DevR_archaea CRISPR-associated protein Cas7/Csa2, subtype I-A/APERN. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This model represents one such family, typified by MJ0381 of Methanococcus jannaschii. This archaeal clade is a member of the DevR family (TIGR01875) which includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa2, for CRISPR/Cas Subtype Protein 2. 285
19046 274214 TIGR02584 cas_NE0113 CRISPR-associated protein, NE0113 family. Members of this minor CRISPR-associated (Cas) protein family are found in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum. 209
19047 274215 TIGR02585 cas_Cst2_DevR CRISPR-associated protein Cas7/Cst2/DevR, subtype I-B/TNEAP. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This clade is a member of the DevR family (TIGR01875) and includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development. 310
19048 131635 TIGR02586 cas5_cmx5_devS CRISPR-associated protein Cas5/DevS, subtype MYXAN. This model represents DevS of Myxococcus xanthus and related proteins of Leptospira interrogans and Gemmata obscuriglobus. This protein is encoded in a cluster of CRISPR-associated (cas) genes, and in the special case of Myxococcus xanthus has taken on a role in the control of fruiting body development. CRISPRs are clustered, regularly interspaced short palindromic repeats. This protein family is related to models TIGR01868, TIGR01895, and TIGR01876. 188
19049 131636 TIGR02587 TIGR02587 putative integral membrane protein TIGR02587. Members of this family are found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. This family, as defined, includes some members of COG4711 but is narrower and strictly bacterial. Members appear to span the membrane seven times. [Cell envelope, Other] 271
19050 131637 TIGR02588 TIGR02588 TIGR02588 family protein. The function of this protein is unknown. It is always found as part of a two-gene operon with TIGR02587, a protein that appears to span the membrane seven times. It is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus, so far, all of which are bacterial. [Hypothetical proteins, Conserved] 122
19051 274216 TIGR02589 cas_Csd2 CRISPR-associated protein Cas7/Csd2, subtype I-C/DVULG. This model represents one of two closely related CRISPR-associated proteins that belong to the larger family of TIGR01595. Members are the Csd2 protein of the Dvulg subtype of CRISPR/cas system. CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. The related model is TIGR02590, the Csh2 protein of the Hmari CRISPR subtype. 284
19052 274217 TIGR02590 cas_Csh2 CRISPR-associated protein Cas7/Csh2, subtype I-B/HMARI. This model represents one of two closely related CRISPR-associated proteins that belong to the larger family of TIGR01595. Members are the Csh2 protein of the Hmari subtype of CRISPR/cas system. CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. The related model is TIGR02589, the Csd2 protein of the Dvulg CRISPR subtype. 286
19053 188234 TIGR02591 cas_Csh1 CRISPR-associated protein Cas8b/Csh1, subtype I-B/HMARI. This domain is found in the C-terminal 2/3 of a family of CRISPR associated proteins of the Hmari subtype. Except for the two sequences from halophilic archaea this domain contains a pair of CXXC motifs. 393
19054 131641 TIGR02592 cas_Cas5h CRISPR-associated protein Cas5, subtype I-B/HMARI. This is a CRISPR-associated protein unique to the hmari subtype of cas genes and CRISPR repeat, which is the only subtype present in Haloarcula marismortui ATCC 43049. The hmari type, though uncommon, is also found in the Aquificae, Thermotogae, Firmicutes, and Dictyoglomi. 241
19055 274218 TIGR02593 CRISPR_cas5 CRISPR-associated protein Cas5, N-terminal domain. This model represents a shared N-terminal domain, about 43 amino acids in length, common to a number of related protein families each of which is associated with a distinct subtype of CRISPR/cas system, where CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeat and Cas is an abbreviation for CRISPR-associated. Members of this family are widely distributed enough that we designated the family Cas5. Homology appears remote, or absent, between the more C-terminal regions of different subfamilies of these proteins, which typically are 210 to 265 amino acids in total length. Cas5 proteins of six different CRISPR/cas subtypes so far defined are described by respective full-length models TIGR01868, TIGR01876, TIGR01895, TIGR01874, TIGR02586, and TIGR02592. The best characterized protein in this family is DevS of Myxococcus xanthus, a Cas protein that appears to participate in a species-specific developmental pathway. 42
19056 131643 TIGR02594 TIGR02594 TIGR02594 family protein. Members of this protein family known so far are restricted to the bacteria, and for the most part to the proteobacteria. The function is unknown. 129
19057 131644 TIGR02595 PEP_exosort PEP-CTERM protein-sorting domain. This model describes a 25-residue domain that includes a near-invariant Pro-Glu-Pro (PEP) motif, a thirteen residue strongly hydrophobic sequence likely to span the membrane, and a five-residue strongly basic motif that often contains four Arg residues. In nearly every case, this motif is found within nine residues, and usually within five residues, of the extreme C-terminus of the protein. Proteins with this motif typically have signal sequences at the N-terminus. This region appears many times per genome or not at all, and co-occurs in genomes with a proposed protein-sorting integral membrane protein we designate exosortase (see TIGR02602). PEP-CTERM proteins frequently are poorly conserved, Ser/Thr-rich proteins and may become extensively modified proteinaceous constituents of extracellular material in bacterial biofilms. [Cell envelope, Surface structures] 24
19058 274219 TIGR02596 TIGR02596 Verru_Chthon cassette protein D. This model describes a nearly twenty member protein family in Verrucomicrobium spinosum and a somewhat smaller paralogous family in Chthoniobacter flavus. All members share a type IV pilin-like N-terminal leader sequence (TIGR02532). These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures] 195
19059 274220 TIGR02597 TIGR02597 TIGR02597 family protein. This model describes a paralogous family with at least ten members in Verrucomicrobium spinosum. Two additional predicted proteins match more weakly and score between the trusted and noise cutoffs, while a third contains a point mutation. Eleven of the thirteen genes are found in a single tandem array. 361
19060 131647 TIGR02598 TIGR02598 Verru_Chthon cassette protein B. This family consists of sets of paralogous proteins in Verrucomicrobium spinosum and Chthoniobacter flavus. All members contain the prepilin-type N-terminal cleavage/methylation domain (TIGR02532) at the N-terminus. The mature protein would be about 150 amino acids long. These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures] 151
19061 274221 TIGR02599 TIGR02599 Verru_Chthon cassette protein C. This family consists of sets of paralogous proteins in Verrucomicrobium spinosum and Chthoniobacter flavus. All members contain the prepilin-type N-terminal cleavage/methylation domain (TIGR02532) at the N-terminus. The mature protein would be about 350 amino acids long. These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures] 339
19062 274222 TIGR02600 Verru_Chthon_A Verru_Chthon cassette protein A. In Verrucomicrobium spinosum and Chthoniobacter flavus, a four-gene operon that includes proteins with an N-terminal signal sequence for cleavage and methylation recurs many times. Each operon is likely to encode a membrane complex, the function of which is unknown. This model represents a long protein from this putative membrane complex, with members averaging about 1300 amino acids. The N-terminal region includes an apparent signal sequence. The function is unknown. Most cassettes are adjacent to an unusually large protein with both an outer membrane autotransporter region and PEP-CTERM putative protein-sorting motif. [Cell envelope, Surface structures] 1265
19063 274223 TIGR02601 autotrns_rpt autotransporter-associated beta strand repeat. This model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797, TIGR01414). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. pfam05594 is based on a longer, much more poorly conserved multiple sequence alignment and hits some of the same proteins as this model with some overlap between the hit regions of the two models. It describes these repeats as likely to have a beta-helical structure. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 32
19064 131651 TIGR02602 8TM_EpsH exosortase. This family is designated exosortase, and it is the predicted protein-sorting transpeptidase for the PEP-CTERM protein-sorting signal of many biofilm-producing Gram-negative bacteria. This system is analogous to the sortase/LPXTG system found mostly in Gram-positive bacteria. Members of this family are integral membrane proteins with eight predicted transmembrane helices in common, and with a triad of invariant residues that matches the catalytic triad of sortases. Some members of this family have long trailing sequences past the region described by this model, which in other species is a separate protein EpsI. This model does not include the region of the first predicted transmembrane region. The only partially characterized member is EpsH of Methylobacillus sp. 12S, part of a locus associated with biosynthesis of the exopolysaccharide methanolan but itself not involved in polysaccharide biosynthesis. [Protein fate, Protein and peptide secretion and trafficking] 241
19065 274224 TIGR02603 CxxCH_TIGR02603 putative heme-binding domain, Pirellula/Verrucomicrobium type. This model represents a domain limited to very few species but expanded into large paralogous families in some species that contain it. We find it in over 20 copies each in Pirellula sp. strain 1 (phylum Planctomycetes) and Verrucomicrobium spinosum DSM 4136 (phylum Verrucomicrobia), and no matches above trusted cutoff in any other species so far. This domain, about 140 amino acids long, contains an absolutely conserved motif CxxCH, the cytochrome c family heme-binding site signature (PS00190). 133
19066 274225 TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) (SP|Q44091) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs. 367
19067 274226 TIGR02605 CxxC_CxxC_SSSS putative regulatory protein, FmdB family. This model represents a region of about 50 amino acids found in a number of small proteins in a wide range of bacteria. The region begins usually with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One member of this family has been noted as a putative regulatory protein, designated FmdB (SP:Q50229, ). Most members of this family have a C-terminal region containing highly degenerate sequence, such as SSTSESTKSSGSSGSSGSSESKASGSTEKSTSSTTAAAAV in Mycobacterium tuberculosis and VAVGGSAPAPSPAPRAGGGGGGCCGGGCCG in Streptomyces avermitilis. These low complexity regions, which are not included in the model, resemble low-complexity C-terminal regions of some heterocycle-containing bacteriocin precursors. [Regulatory functions, DNA interactions] 52
19068 274227 TIGR02606 antidote_CC2985 putative addiction module antidote protein, CC2985 family. This bacterial protein family has a very similar seed alignment to that of pfam03693 but is a more stringent model with higher cutoff scores. Proteins that score above the trusted cutoff to this model almost invariably are found adjacent to a ParE family protein (pfam05016), where ParE is the killing partner of an addiction module for plasmid stabilization. Members of this family, therefore, are putative addiction module antidote proteins. Some are encoded on plasmids or in prophage regions, but others appear chromosomal. A genome may contain several identical copies, such as the four in Magnetococcus sp. MC-1. This family is named for one member, CC2985 of Caulobacter crescentus CB15. [Cellular processes, Other, Mobile and extrachromosomal element functions, Plasmid functions] 69
19069 274228 TIGR02607 antidote_HigA addiction module antidote protein, HigA family. Members of this family form a distinct clade within the larger family HTH_3 of helix-turn-helix proteins, described by pfam01381. Members of this clade are strictly bacterial and nearly always shorter than 110 amino acids. This family includes the characterized member HigA, without which the killer protein HigB cannot be cloned. The hig (host inhibition of growth) system is noted to be unusual in that the killer protein is encoded by the upstream member of the gene pair. [Regulatory functions, DNA interactions, Regulatory functions, Protein interactions, Mobile and extrachromosomal element functions, Other] 78
19070 274229 TIGR02608 delta_60_rpt delta-60 repeat domain. This domain occurs in tandem repeats, as many as 13, in proteins from Bdellovibrio bacteriovorus, Azotobacter vinelandii, Geobacter sulfurreducens, Pirellula sp. 1, Myxococcus xanthus, and others, many of which are Deltaproteobacteria. The periodicity of the repeat ranges from about 57 to 61 amino acids, and a core region of about 54 is represented by this model and seed alignment. 54
19071 274230 TIGR02609 doc_partner putative addiction module antidote. Members of this protein family are putative addiction module antidote proteins that appear recurringly in two-gene operons with members of the Doc (death-on-curing) family TIGR01550. Members of this family contain a SpoVT/AbrB-like domain (pfam04014). Note that the gene pairs with a member of this family tend to be found on bacterial chromosomes, not on plasmids. [Mobile and extrachromosomal element functions, Other] 74
19072 131659 TIGR02610 PHA_gran_rgn putative polyhydroxyalkanoic acid system protein. All members of this family are encoded by genes polyhydroxyalkanoic acid (PHA) biosynthesis and utilization genes, including proteins at found at the surface of PHA granules. Examples so far are found in the Pseudomonales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria. 91
19073 131660 TIGR02611 TIGR02611 TIGR02611 family protein. Members of this family are Actinobacterial putative proteins of about 150 amino acids in length with three apparent transmembrane helix and an unusual motif with consensus sequence PGPGW. [Hypothetical proteins, Conserved] 121
19074 274231 TIGR02612 mob_myst_A mobile mystery protein A. Members of this protein family are found in mobization-related contexts more often than not, including within a CRISPR-associated gene region in Geobacter sulfurreducens PCA, and on plasmids in Agrobacterium tumefaciens and Coxiella burnetii, always together with mobile mystery protein B, a member of the Fic protein family (pfam02661). This protein is encoded by the upstream member of the gene pair and belongs to a family of helix-turn-helix DNA binding proteins (pfam01381). [Unknown function, General] 150
19075 131662 TIGR02613 mob_myst_B mobile mystery protein B. Members of this protein family, which we designate mobile mystery protein B, are found in mobization-related contexts more often than not, including within a CRISPR-associated gene region in Geobacter sulfurreducens PCA, and on plasmids in Agrobacterium tumefaciens and Coxiella burnetii, always together with mobile mystery protein A (TIGR02612), a member of the family of helix-turn-helix DNA binding proteins (pfam01381). This protein is encoded by the downstream member of the gene pair and belongs to the Fic protein family (pfam02661), where Fic (filamentation induced by cAMP) is a regulator of cell division. The characteristics of having a two-gene operon in a varied context and often on plasmids, with one member affecting cell division and the other able to bind DNA, suggests similarity to addiction modules. 186
19076 274232 TIGR02614 ftsW cell division protein FtsW. This family consists of FtsW, an integral membrane protein with ten transmembrane segments. In general, it is one of two paralogs involved in peptidoglycan biosynthesis, the other being RodA, and is essential for cell division. All members of the seed alignment for this model are encoded in operons for the biosynthesis of UDP-N-acetylmuramoyl-pentapeptide, a precursor of murein (peptidoglycan). The FtsW designation is not used in endospore-forming bacterial (e.g. Bacillus subtilis), where the member of this family is designated SpoVE and three or more RodA/FtsW/SpoVE family paralogs are present. SpoVE acts in spore cortex formation and is dispensible for growth. Biological rolls for FtsW in cell division include recruitment of penicillin-binding protein 3 to the division site. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division] 356
19077 131664 TIGR02615 spoVE stage V sporulation protein E. This model represents an exception within the members of the FtsW model TIGR02614. This exception occurs only in endospore-forming genera such as Bacillus, Geobacillus, and Oceanobacillus. Like FtsW, members are found in a peptidoglycan operon context, but in these genera they are part of a larger set of paralogs (not just the pair FtsW and RodA) and are required specifically for sporulation, not for viability. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination] 354
19078 274233 TIGR02616 tnaC_leader tryptophanase leader peptide. Members of this family are the apparent leader peptides of tryptophanase operons in Escherichia coli, Vibrio cholerae, Photobacterium profundum, Haemophilus influenzae type b, and related species. All members of the seed alignment are example ORFs upstream of tryptophanase, with a start codon, a conserved single Trp residue, and several other conserved residues. It is suggested (Konan KV and Yanofsky C) that the nascent peptide interacts with the ribosome once (if) the ribosome reaches the stop codon. Note that this model describes a much broader set (and shorter protein region) than pfam08053. [Energy metabolism, Amino acids and amines, Transcription, Other] 22
19079 131666 TIGR02617 tnaA_trp_ase tryptophanase, leader peptide-associated. Members of this family belong to the beta-eliminating lyase family (pfam01212) and act as tryptophanase (L-tryptophan indole-lyase). The tryptophanases of this family, as a rule, are found with a tryptophanase leader peptide (TnaC) encoded upstream. Both tryptophanases (4.1.99.1) and tyrosine phenol-lyases (EC 4.1.99.2) are found between trusted and noise cutoffs, but this model captures nearly all tryptophanases for which the leader peptide gene tnaC can be found upstream. [Energy metabolism, Amino acids and amines] 467
19080 131667 TIGR02618 tyr_phenol_ly tyrosine phenol-lyase. This model describes a group of tyrosine phenol-lyase (4.1.99.2) (beta-tyrosinase), a pyridoxal-phosphate enzyme closely related to tryptophanase (4.1.99.1) (see model TIGR02617). Both belong to the beta-eliminating lyase family (pfam01212) [Energy metabolism, Amino acids and amines] 450
19081 274234 TIGR02619 TIGR02619 putative CRISPR-associated protein, APE2256 family. This model represents a conserved domain of about 150 amino acids found in at least five archaeal species and three bacterial species, exclusively in species with CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the member of this family is in the vicinity of a CRISPR/Cas locus. 149
19082 200203 TIGR02620 cas_VVA1548 putative CRISPR-associated protein, VVA1548 family. This model represents a conserved domain of about 95 amino acids exclusively in species with CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species with members so far (Vibrio vulnificus YJ016, Mannheimia succiniciproducens MBEL55E, and Nitrosomonas europaea ATCC 19718), but not in the archaeon Methanothermobacter thermautotrophicus str. Delta H, the gene for this protein is in the midst of a cluster of Cas protein genes near CRISPR repeats. 93
19083 274235 TIGR02621 cas3_GSU0051 CRISPR-associated helicase Cas3, subtype Dpsyc. This model describes a CRISPR-associated putative DEAH-box helicase, or Cas3, of a subtype found in Actinomyces naeslundii MG1, Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, and Desulfotalea psychrophila. This protein includes both DEAH and HD motifs. 862
19084 274236 TIGR02622 CDP_4_6_dhtase CDP-glucose 4,6-dehydratase. Members of this protein family are CDP-glucose 4,6-dehydratase from a variety of Gram-negative and Gram-positive bacteria. Members typically are encoded next to a gene that encodes a glucose-1-phosphate cytidylyltransferase, which produces the substrate, CDP-D-glucose, used by this enzyme to produce CDP-4-keto-6-deoxyglucose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 349
19085 131672 TIGR02623 G1P_cyt_trans glucose-1-phosphate cytidylyltransferase. Members of this family are the enzyme glucose-1-phosphate cytidylyltransferase, also called CDP-glucose pyrophosphorylase, the product of the rfbF gene. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 254
19086 131673 TIGR02624 rhamnu_1P_ald rhamnulose-1-phosphate aldolase. Members of this family are the enzyme RhaD, rhamnulose-1-phosphate aldolase. 270
19087 131674 TIGR02625 YiiL_rotase L-rhamnose mutarotase. Members of this protein family are rhamnose mutarotase from Escherichia coli, previously designated YiiL as an uncharacterized protein, and close homologs also associated with rhamnose dissimilation operons in other bacterial genomes. Mutarotase is a term for an epimerase that changes optical activity. This enzyme was shown experimentally to interconvert alpha and beta stereoisomers of the pyranose form of L-rhamnose. The crystal structure of this small (104 amino acid) protein shows a locally asymmetric dimer with active site residues of His, Tyr, and Trp. [Energy metabolism, Sugars] 102
19088 274237 TIGR02627 rhamnulo_kin rhamnulokinase. This model describes rhamnulokinase, an enzyme that catalyzes the second step in rhamnose catabolism. 454
19089 131676 TIGR02628 fuculo_kin_coli L-fuculokinase. Members of this family are L-fuculokinase, from the clade that includes the L-fuculokinase of Escherichia coli. This enzyme catalyzes the second step in fucose catabolism. This family belongs to FGGY family of carbohydrate kinases (pfam02782, pfam00370). It is encoded by the kinase (K) gene of the fucose (fuc) operon. [Energy metabolism, Sugars] 465
19090 131677 TIGR02629 L_rham_iso_rhiz L-rhamnose catabolism isomerase, Pseudomonas stutzeri subtype. Members of this family are isomerases in the pathway of L-rhamnose catabolism as found in Pseudomonas stutzeri and in a number of the Rhizobiales. This family differs from the L-rhamnose isomerases of Escherichia coli (see TIGR01748). This enzyme catalyzes the isomerization step in rhamnose catabolism. Genetic evidence in Rhizobium leguminosarum bv. trifolii suggests phosphorylation occurs first, then isomerization of the phosphorylated sugar, but characterization of the recombinant enzyme from Pseudomonas 412
19091 274238 TIGR02630 xylose_isom_A xylose isomerase. Members of this family are the enzyme xylose isomerase (5.3.1.5), which interconverts D-xylose and D-xylulose. [Energy metabolism, Sugars] 434
19092 131679 TIGR02631 xylA_Arthro xylose isomerase, Arthrobacter type. This model describes a D-xylose isomerase that is also active as a D-glucose isomerase. It is tetrameric and dependent on a divalent cation Mg2+, Co2+ or Mn2+ as characterized in Arthrobacter. Members of this family differ substantially from the D-xylose isomerases of family TIGR02630. 382
19093 131680 TIGR02632 RhaD_aldol-ADH rhamnulose-1-phosphate aldolase/alcohol dehydrogenase. 676
19094 131681 TIGR02633 xylG D-xylose ABC transporter, ATP-binding protein. Several bacterial species have the enzymes xylose isomerase and xylulokinase for xylose utilization. Members of this protein family are the ATP-binding cassette (ABC) subunit of the known or predicted high-affinity xylose ABC transporter for xylose import. These genes, which closely resemble other sugar transport ABC transporter genes, typically are encoded near xylose utilization enzymes and regulatory proteins. Note that this form of the transporter contains two copies of the ABC transporter domain (pfam00005). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 500
19095 188237 TIGR02634 xylF D-xylose ABC transporter, substrate-binding protein. Members of this family are periplasmic (when in Gram-negative bacteria) binding proteins for D-xylose import by a high-affinity ATP-binding cassette (ABC) transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 302
19096 274239 TIGR02635 RhaI_grampos L-rhamnose isomerase, Streptomyces subtype. This clade of sequences is closely related to the L-rhamnose isomerases found in Pseudomonas stutzeri and in a number of the Rhizobiales (TIGR02629). The genes of the family represented here are found in similar genomic contexts which contain genes apparently involved in rhamnose catabolism such as rhamnulose-1-phosphate aldolase (TIGR02632), sugar kinases, and sugar transporters. [Energy metabolism, Sugars] 378
19097 274240 TIGR02636 galM_Leloir galactose mutarotase. Members of this protein family act as galactose mutarotase (D-galactose 1-epimerase) and participate in the Leloir pathway for galactose/glucose interconversion. All members of the seed alignment for this model are found in gene clusters with other enzymes of the Leloir pathway. This enzyme family belongs to the aldose 1-epimerase family, described by pfam01263. However, the enzyme described as aldose 1-epimerase itself (EC 5.1.3.3) is called broadly specific for D-glucose, L-arabinose, D-xylose, D-galactose, maltose and lactose. The restricted genome context for genes in this family suggests members should act primarily on D-galactose. 336
19098 131685 TIGR02637 RhaS rhamnose ABC transporter, rhamnose-binding protein. This sugar-binding component of ABC transporter complexes is found in rhamnose catabolism operon contexts. Mutation of this gene in Rhizobium leguminosarum abolishes rhamnose transport and prevents growth on rhamnose as a carbon source. 302
19099 131686 TIGR02638 lactal_redase lactaldehyde reductase. This clade of genes encoding iron-containing alcohol dehydrogenase (pfam00465) proteins is generally found in apparent operons for the catabolism of rhamnose or fucose. Catabolism of both of these monosaccharides results in lactaldehyde which is reduced by this enzyme to 1,2 propanediol. This protein is alternatively known by the name 1,2 propanediol oxidoreductase. This enzyme is active under anaerobic conditions in E. coli while being inactivated by reactive oxygen species under aerobic conditions. Under aerobic conditions the lactaldehyde product of rhamnose and fucose catabolism is believed to be oxidized to lactate by a separate enzyme, lactaldehyde dehydrogenase. [Energy metabolism, Sugars] 379
19100 274241 TIGR02639 ClpA ATP-dependent Clp protease ATP-binding subunit clpA. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 730
19101 131688 TIGR02640 gas_vesic_GvpN gas vesicle protein GvpN. Members of this family are the GvpN protein associated with the production of gas vesicles produced in some prokaryotes to give cells buoyancy. This family belongs to a larger family of ATPases (pfam07728). [Cellular processes, Other] 262
19102 274242 TIGR02641 gvpC_cyan_rpt gas vesicle protein GvpC repeat. This model describes a 33-amino acid repeated domain in bacterial versions of the gas vesicle protein GvpC, a structural protein less abundant than GvpA. [Cellular processes, Other] 33
19103 274243 TIGR02642 phage_xxxx uncharacterized phage protein. This uncharacterized protein is found in prophage regions of Shewanella oneidensis MR-1, Vibrio vulnificus YJ016, Yersinia pseudotuberculosis IP 32953, and Aeromonas hydrophila ATCC7966. It appears to have regions of sequence similarity to phage lambda antitermination protein Q. [Mobile and extrachromosomal element functions, Prophage functions] 186
19104 131691 TIGR02643 T_phosphoryl thymidine phosphorylase. Thymidine phosphorylase (alternate name: pyrimidine phosphorylase), EC 2.4.2.4, is the designation for the enzyme of E. coli and other Proteobacteria involved in (deoxy)nucleotide degradation. It often occurs in an operon with a deoxyribose-phosphate aldolase, phosphopentomutase and a purine nucleoside phosphorylase. In many other lineages, the corresponding enzyme is designated pyrimidine-nucleoside phosphorylase (EC 2.4.2.2); the naming convention imposed by this model represents standard literature practice. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 437
19105 274244 TIGR02644 Y_phosphoryl pyrimidine-nucleoside phosphorylase. In general, members of this protein family are designated pyrimidine-nucleoside phosphorylase, enzyme family EC 2.4.2.2, as in Bacillus subtilis, and more narrowly as the enzyme family EC 2.4.2.4, thymidine phosphorylase (alternate name: pyrimidine phosphorylase), as in Escherichia coli. The set of proteins encompassed by this model is designated subfamily rather than equivalog for this reason; the protein name from this model should be used when TIGR02643 does not score above trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 405
19106 274245 TIGR02645 ARCH_P_rylase putative thymidine phosphorylase. Members of this family are closely related to characterized examples of thymidine phosphorylase (EC 2.4.2.4) and pyrimidine nucleoside phosphorylase (EC 2.4.2.2). Most examples are found in the archaea, but other examples occur in Legionella pneumophila str. Paris and Rhodopseudomonas palustris CGA009. 493
19107 131694 TIGR02646 TIGR02646 TIGR02646 family protein. Members of this uncharacterized protein family are found exclusively in bacteria. Neighboring genes in various genomes are also uncharacterized or may be annotated as similar to restriction system proteins. [Hypothetical proteins, Conserved] 144
19108 131695 TIGR02647 DNA TIGR02647 family protein. Members of this family are found, so far, only in the Gammaproteobacteria. The function is unknown. The location on the chromosome usually is not far from housekeeping genes rather than in what is clearly, say, a prophage region. Some members have been annotated in public databases as DNA-binding protein inhibitor Id-2-related protein, putative transcriptional regulator, or hypothetical DNA binding protein. [Hypothetical proteins, Conserved] 77
19109 131696 TIGR02648 rep_term_tus DNA replication terminus site-binding protein. Members of this protein family are found on the main chromosomes of a number of the Gammaproteobacteria; this model excludes related plasmid proteins, which score between trusted and noise cutoffs. This protein, DNA replication terminus site-binding protein, binds specific DNA sites near the replication terminus to arrest the DNA replication fork. [DNA metabolism, DNA replication, recombination, and repair] 300
19110 131697 TIGR02649 true_RNase_BN ribonuclease BN. Members of this protein family are ribonuclease BN of Escherichia coli K-12 and closely related proteins believed to be equivalent in function. Note that E. coli appears to lack RNase Z per se, and this protein of E. coli appears orthologous to (but not functionally equivalent to) RNase Z of Bacillus subtilis and various other species. Meanwhile, the yihY gene product of E. coli previously was incorrectly identified as RNase BN. [Transcription, RNA processing] 303
19111 188239 TIGR02650 RNase_Z_T_toga ribonuclease Z, Thermotoga type. Members of this protein family are ribonuclease Z as found in the genus Thermotoga, where the enzyme cleaves after the CCA, in contrast to the activities characterized for other enzymes also designated ribonuclease Z. In other systems, cleavage occurs 5-prime to the location of the CCA sequence, and CCA is added subsequently. A species may lack ribonuclease Z if all tRNA genes encode the CCA sequence, or if the CCA is exposed by exonuclease activity rather than endonuclease activity. Note that members of this sequence family differ considerably from the majority of RNase Z sequences. [Transcription, RNA processing] 277
19112 274246 TIGR02651 RNase_Z ribonuclease Z. Processing of the 3-prime end of tRNA precursors may be the result of endonuclease or exonuclease activity, and differs in different species. Members of this family are ribonuclease Z, a tRNA 3-prime endonuclease that processes tRNAs to prepare for addition of CCA. In species where all tRNA sequences already have the CCA tail, such as E. coli, the need for such an enzyme is unclear. Proteins similar to the E. coli enzyme, matched by TIGRFAMs model TIGR02649, are designated ribonuclease BN. [Transcription, RNA processing] 299
19113 131700 TIGR02652 TIGR02652 TIGR02652 family protein. Members of this family of conserved hypothetical proteins are found, so far, only in the Cyanobacteria. Members are about 170 amino acids long and share a motif CxxCx(14)CxxH near the amino end. [Hypothetical proteins, Conserved] 163
19114 131701 TIGR02653 Lon_rel_chp conserved hypothetical protein. This model describes a protein family of unknown function, about 690 residues in length, in which some members show C-terminal sequence similarity to pfam05362, which is the Lon protease C-terminal proteolytic domain, from MEROPS family S16. However, the annotated catalytic sites of E. coli Lon protease are not conserved in members of this family. Members have a motif GP[RK][GS]TGKS, similar to the ATP-binding P-loop motif GxxGxGK[ST]. [Hypothetical proteins, Conserved] 675
19115 211759 TIGR02654 circ_KaiB circadian clock protein KaiB. Members of this protein family are the circadian clock protein KaiB of Cyanobacteria, encoded in the circadian clock gene cluster kaiABC. KaiB has homologs of unknown function in some Archaea and Proteobacteria, and has paralogs of unknown function in some Cyanobacteria. KaiB forms homodimers, homotetramers, and multimeric complexes with KaiA and/or KaiC. [Cellular processes, Other] 87
19116 131703 TIGR02655 circ_KaiC circadian clock protein KaiC. Members of this family are the circadian clock protein KaiC, part of the kaiABC operon that controls circadian rhythm. It may be universal in Cyanobacteria. Each member has two copies of the KaiC domain (pfam06745), which is also found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor. [Cellular processes, Other] 484
19117 274247 TIGR02656 cyanin_plasto plastocyanin. Members of this family are plastocyanin, a blue copper protein related to pseudoazurin, halocyanin, amicyanin, etc. This protein, located in the thylakoid lumen, performs electron transport to photosystem I in Cyanobacteria and chloroplasts. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 99
19118 131705 TIGR02657 amicyanin amicyanin. Members of this family are amicyanin, a type I blue copper protein that accepts electrons from the tryptophan tryptophylquinone (TTQ) cofactor of the methylamine dehydrogenase light chain and then transfers them to the heme group of cytochrome c-551i. Amicyanin, methylamine dehydrogenase, and cytochrome c-551i are periplasmic and form a complex. This system has been studied primarily in Paracoccus denitrificans and Methylobacterium extorquens. Related type I blue copper proteins include plastocyanin, pseudoazurin, halocyanin, etc. [Energy metabolism, Electron transport] 83
19119 131706 TIGR02658 TTQ_MADH_Hv methylamine dehydrogenase (amicyanin) heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines] 352
19120 131707 TIGR02659 TTQ_MADH_Lt methylamine dehydrogenase (amicyanin) light chain. This family consists of the light chain of methylamine dehydrogenase, a periplasmic enzyme. This subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from Trp-114 and Trp-165 of the precursor, numbered according to the sequence from Paracoccus denitrificans. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines] 186
19121 274248 TIGR02660 nifV_homocitr homocitrate synthase NifV. This family consists of the NifV clade of homocitrate synthases, most of which are found in operons for nitrogen fixation. Members are closely homologous to enzymes that include 2-isopropylmalate synthase, (R)-citramalate synthase, and homocitrate synthases associated with other processes. The homocitrate made by this enzyme becomes a part of the iron-molybdenum cofactor of nitrogenase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 365
19122 131709 TIGR02661 MauD methylamine dehydrogenase accessory protein MauD. This protein, MauD, appears critical to proper formation of the small subunit of methylamine dehydrogenase, which has both an unusual tryptophan tryptophylquinone cofactor and multiple disulfide bonds. MauD shares sequence similarity, including a CPxC motif, with a number of thiol:disulfide interchange proteins. In MauD mutants, the small subunit apparently does not form properly and is rapidly degraded. [Protein fate, Protein folding and stabilization, Energy metabolism, Amino acids and amines] 189
19123 131710 TIGR02662 dinitro_DRAG ADP-ribosyl-[dinitrogen reductase] hydrolase. Members of this family are the enzyme ADP-ribosyl-[dinitrogen reductase] hydrolase (EC 3.2.2.24), better known as Dinitrogenase Reductase Activating Glycohydrolase, DRAG. This enzyme reverses a regulatory inactivation of dinitrogen reductase caused by the action of NAD(+)--dinitrogen-reductase ADP-D-ribosyltransferase (EC 2.4.2.37) (DRAT). This enzyme is restricted to nitrogen-fixing bacteria and belongs to the larger family of ADP-ribosylglycohydrolases described by pfam03747. [Central intermediary metabolism, Nitrogen fixation] 287
19124 131711 TIGR02663 nifX nitrogen fixation protein NifX. Members of this family are NifX proteins encoded within operons for nitrogen fixation in a number of bacteria. NifX, NafY, and the C-terminal region of NifB all belong to family pfam02579 and are involved in MoFe cofactor biosynthesis. NifX is a nitrogenase accessory protein with a role in expression of the MoFe cofactor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation] 119
19125 131712 TIGR02664 nitr_red_assoc conserved hypothetical protein. Most members of this protein family are found in the Cyanobacteria, and these mostly near nitrate reductase genes and molybdopterin biosynthesis genes. We note that molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. This protein is sometimes annotated as nitrate reductase-associated protein. Its function is unknown. 145
19126 274249 TIGR02665 molyb_mobA molybdenum cofactor guanylyltransferase, proteobacterial. In many molybdopterin-containing enzymes, including nitrate reductase and dimethylsulfoxide reductase, the cofactor is molybdopterin-guanine dinucleotide. The family described here contains MobA, molybdenum cofactor guanylyltransferase, from the Proteobacteria only. MobA can reconstitute molybdopterin-guanine dinucleotide biosynthesis without the product of the neighboring gene MobB. The probable MobA proteins of other lineages differ sufficiently that they are not included in scope of this family. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 186
19127 274250 TIGR02666 moaA molybdenum cofactor biosynthesis protein A, bacterial. The model for this family describes molybdenum cofactor biosynthesis protein A, or MoaA, as found in bacteria. It does not include the family of probable functional equivalent proteins from the archaea. MoaA works together with MoaC to synthesize precursor Z from guanine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 334
19128 131715 TIGR02667 moaB_proteo molybdenum cofactor biosynthesis protein B, proteobacterial. This model represents the MoaB protein molybdopterin biosynthesis regions in Proteobacteria. This crystallized but incompletely characterized protein is thought to be involved in, though not required for, early steps in molybdopterin biosynthesis. It may bind a molybdopterin precursor. A distinctive conserved motif PCN near the C-terminus helps distinguish this clade from other homologs, including sets of proteins designated MogA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 163
19129 274251 TIGR02668 moaA_archaeal probable molybdenum cofactor biosynthesis protein A, archaeal. This model describes an archaeal family related, and predicted to be functionally equivalent, to molybdenum cofactor biosynthesis protein A (MoaA) of bacteria (see TIGR02666). [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin] 302
19130 274252 TIGR02669 SpoIID_LytB SpoIID/LytB domain. This model describes a domain found typically in two or three proteins per genome in Cyanobacteria and Firmicutes, and sporadically in other genomes. One member is SpoIID of Bacillus subtilis. Another in B. subtilis is the C-terminal half of LytB, encoded immediately upstream of an amidase, the autolysin LytC, to which its N-terminus is homologous. Gene neighborhoods are not well conserved for members of this family, as many, such as SpoIID, are monocistronic. One early modelling-based study suggests a DNA-binding role for SpoIID, but the function of this domain is unknown. [Unknown function, General] 267
19131 274253 TIGR02670 cas_csx8 CRISPR-associated protein Cas8a1/Csx8, subtype I. In three genomes so far, a member of this protein appears in the midst of a CRISPR-associated (cas) gene operon, immediately upstream of a member of family TIGR01875 (CRISPR-associated autoregulator, DevR family). The genomes so far are Nocardia farcinica IFM10152, Clostridium perfringens SM101, and Clostridium tetani E88. 441
19132 131719 TIGR02671 cas_csx9 CRISPR-associated protein Cas8a2/Csx9, subtype I-A/APERN. Members of this family, so far, are archaeal proteins found in CRISPR-associated (cas) gene regions. So far, this rare cas protein is found in only three genomes: Pyrococcus horikoshii shinkaj OT3, Pyrococcus abyssi GE5, and Thermococcus kodakarensis KOD1. In each case it is found immediately upstream of cas3 in loci that resemble the Apern type but lack Csa1 and Csa4 genes. 377
19133 131720 TIGR02672 cas_csm6 CRISPR type III-A/MTUBE-associated protein Csm6. Members of this family are found in CRISPR-associated (cas) gene regions in Streptococcus thermophilus CNRZ1066, Staphylococcus epidermidis RP62A, and Mycobacterium tuberculosis (strains CDC1551 and H37Rv), as part of Mtube-type CRISPR/Cas systems. CRISPR is a widespread form of direct repeat found in archaea and bacteria, with distinctive subtypes each of which has a characteristic sporadic distribution. 362
19134 131721 TIGR02673 FtsE cell division ATP-binding protein FtsE. This model describes FtsE, a member of the ABC transporter ATP-binding protein family. This protein, and its permease partner FtsX, localize to the division site. In a number of species, the ftsEX gene pair is located next to FtsY, the signal recognition particle-docking protein. [Cellular processes, Cell division] 214
19135 131722 TIGR02674 cas_cyan_RAMP_2 CRISPR-associated RAMP protein, Csx10 family. CRISPR is a widespread repeat family in prokaryotes. At least 45 different protein families occur in prokaryotes only when these repeats are present. This family, a minor CRISPR-associated protein family, seems largely restricted to the Cyanobacteria. It belongs to the RAMP superfamily (pfam03787). 393
19136 131723 TIGR02675 tape_meas_nterm tape measure domain. Proteins containing this domain are strictly bacterial, including bacteriophage and prophage regions of bacterial genomes. Most members are 800 to 1800 amino acids long, making them among the longest predicted proteins of their respective phage genomes, where they are encoded in tail protein regions. This roughly 80-residue domain described here usually begins between residue 100 and 250. Many members are known or predicted to act as phage tail tape measure proteins, a minor tail component that regulates tail length. 75
19137 131724 TIGR02677 TIGR02677 TIGR02677 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved] 494
19138 274254 TIGR02678 TIGR02678 TIGR02678 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved] 375
19139 274255 TIGR02679 TIGR02679 TIGR02679 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved] 385
19140 274256 TIGR02680 TIGR02680 TIGR02680 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). Proteins in this family average over 1400 amino acids in length. [Hypothetical proteins, Conserved] 1353
19141 131728 TIGR02681 phage_pRha phage regulatory protein, rha family. Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that pRha is a phage regulatory protein. [Mobile and extrachromosomal element functions, Prophage functions] 108
19142 274257 TIGR02682 cas_csx11 CRISPR-associated protein, Csx11 family. Members of this uncommon, sporadically distributed protein family are large (>900 amino acids) and strictly associated, so far, with CRISPR-associated (Cas) gene clusters. Nearby Cas genes always include members of the RAMP superfamily and the six-gene CRISPR-associated RAMP module. Species in which it is found, so far, include three archaea (Methanosarcina mazei, M. barkeri and Methanobacterium thermoautotrophicum) and two bacteria (Thermodesulfovibrio yellowstonii DSM 11347 and Sulfurihydrogenibium azorense). 918
19143 162974 TIGR02683 upstrm_HI1419 putative addiction module killer protein. Members of this strictly bacterial protein family are small, at roughly 100 amino acids. The gene is almost invariably the upstream member of a gene pair, where the downstream member is a predicted DNA-binding protein from a clade within Pfam helix-turn-helix family pfam01381. These gene pairs, when found on the bacterial chromosome, often are located with prophage regions, but also in both integrated plasmid regions and near housekeeping genes. Analysis suggests that the gene pair may serve as an addiction module. 95
19144 188241 TIGR02684 dnstrm_HI1420 probable addiction module antidote protein. Members of this strictly bacterial protein family are small, at roughly 100 amino acids. The gene is almost invariably the downstream member of a gene pair. It is a predicted DNA-binding protein from a clade within Pfam helix-turn-helix family pfam01381. These gene pairs, when found on the bacterial chromosome, are located often with prophage regions, but also both in integrated plasmid regions and in housekeeping gene regions. Analysis suggests that the gene pair may serve as an addiction module. [Mobile and extrachromosomal element functions, Other] 89
19145 131732 TIGR02685 pter_reduc_Leis pteridine reductase. Pteridine reductase is an enzyme used by trypanosomatids (including Trypanosoma cruzi and Leishmania major) to obtain reduced pteridines by salvage rather than biosynthetic pathways. Enzymes in T. cruzi described as pteridine reductase 1 (PTR1) and pteridine reductase 2 (PTR2) have different activity profiles. PTR1 is more active with fully oxidized biopterin and folate than with reduced forms, while PTR2 reduces dihydrobiopterin and dihydrofolate but not oxidized pteridines. T. cruzi PTR1 and PTR2 are more similar to each other in sequence than either is to the pteridine reductase of Leishmania major, and all are included in this family. 267
19146 274258 TIGR02686 relax_trwC conjugative relaxase domain, TrwC/TraI family. This domain is in the N-terminal (relaxase) region of TrwC, a relaxase-helicase that acts in plasmid R388 conjugation. The relaxase domain has DNA cleavage and strand transfer activities. Plasmid transfer protein TraI is also a member of this domain family. Members of this family on bacterial chromosomes typically are found near other genes typical of conjugative plasmids and appear to mark integrated plasmids. [Mobile and extrachromosomal element functions, Plasmid functions] 283
19147 274259 TIGR02687 TIGR02687 TIGR02687 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 880 amino acids in length. This protein is repeatedly found upstream of another uncharacterized protein of about 470 amino acids in length, modeled by TIGR02688. 844
19148 131735 TIGR02688 TIGR02688 TIGR02688 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 470 amino acids in length. Several members of this family appear in public databases with annotation as ATP-dependent protease La, despite the lack of similarity to families TIGR00763 (ATP-dependent protease La) or pfam02190 (ATP-dependent protease La (LON) domain). This protein is repeatedly found downstream of another uncharacterized protein of about 880 amino acids in length, described by model TIGR02687. [Hypothetical proteins, Conserved] 449
19149 131736 TIGR02689 ars_reduc_gluta arsenate reductase, glutathione/glutaredoxin type. Members of this protein family represent a novel form of arsenate reductase, using glutathione and glutaredoxin rather than thioredoxin for reducing equivalents as do some homologous arsenate reductases. An example of this type is Synechocystis sp. strain PCC 6803 slr0946, and of the latter type (excluded from this model) is Staphylococcus aureus plasmid pI258 ArsC. Both are among the subset of arsenate reductases that belong to the low-molecular-weight protein-tyrosine phosphatase superfamily. [Cellular processes, Detoxification] 126
19150 274260 TIGR02690 resist_ArsH arsenical resistance protein ArsH. Members of this protein family occur in arsenate resistance operons that include at least two different types of arsenate reductase. ArsH is not required for arsenate resistance in some systems. This family belongs to the larger family of NADPH-dependent FMN reductases (pfam03358). The function of ArsH is not known. [Cellular processes, Detoxification] 219
19151 131738 TIGR02691 arsC_pI258_fam arsenate reductase (thioredoxin). This family describes the well-studied thioredoxin-dependent arsenate reductase of Staphylococcus aureus plasmid pI258 and other mechanistically similar arsenate reductases. The mechanism involves an intramolecular disulfide bond cascade, and aligned members of this family have four absolutely conserved Cys residues. This group of arsenate reductases belongs to the low-molecular weight protein-tyrosine phosphatase family (pfam01451), as does a group of glutathione/glutaredoxin type arsenate reductases (TIGR02689). At least two other, non-homologous groups of arsenate reductases involved in arsenical resistance are also known. This enzyme reduces arsenate to arsenite, which may be more toxic but which is more easily exported. [Cellular processes, Detoxification] 129
19152 131739 TIGR02692 tRNA_CCA_actino tRNA adenylyltransferase. The enzyme tRNA adenylyltransferase, also called tRNA-nucleotidyltransferase and CCA-adding enzyme, can add or repair the required CCA triplet at the 3'-end of tRNA molecules. Genes encoding tRNA include the CCA tail in some but not all bacteria, and this enzyme may be required for viability. Members of this family represent a distinct clade within the larger family pfam01743 (tRNA nucleotidyltransferase/poly(A) polymerase family protein). The example from Streptomyces coelicolor was shown to act as a CCA-adding enzyme and not as a poly(A) polymerase. [Protein synthesis, tRNA and rRNA base modification] 466
19153 274261 TIGR02693 arsenite_ox_L arsenite oxidase, large subunit. This model represents the large subunit of an arsenite oxidase complex. The small subunit is a Rieske protein. Homologs to both large and small subunits that score in the gray zone between the set trusted and noise bit score cutoffs for the respective models are found in Aeropyrum pernix K1 and in Sulfolobus tokodaii str. 7. This enzyme acts in energy metabolism by arsenite oxidation, rather than detoxification by reduction of arsenate to arsenite prior to export. [Energy metabolism, Electron transport] 806
19154 131741 TIGR02694 arsenite_ox_S arsenite oxidase, small subunit. This model represents the small subunit of an arsenite oxidase complex. It is a Rieske protein and appears to rely on the Tat (twin-arginine translocation) system to cross the membrane. Although this enzyme could run in the direction of arsenate reduction to arsenite in principle, the relevant biological function is arsenite oxidation for energy metabolism, not arsenic resistance. Homologs to both large (TIGR02693) and small subunits that score in the gray zone between the set trusted and noise bit score cutoffs for the respective models are found in Aeropyrum pernix K1 and in Sulfolobus tokodaii str. 7. [Energy metabolism, Electron transport] 129
19155 131742 TIGR02695 azurin azurin. Azurin is a blue copper-binding protein in the plastocyanin/azurin family (see pfam00127). It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The most closely related copper-binding proteins to this family are auracyanins, as in Chloroflexus aurantiacus, which have similar redox activities. [Energy metabolism, Electron transport] 125
19156 131743 TIGR02696 pppGpp_PNP guanosine pentaphosphate synthetase I/polynucleotide phosphorylase. Sohlberg, et al. present characterization of two proteins from Streptomyces coelicolor. The protein in this family was shown to have poly(A) polymerase activity and may be responsible for polyadenylating RNA in this species. Reference 2 showed that a nearly identical plasmid-encoded protein from Streptomyces antibioticus is a bifunctional enzyme that acts also as a guanosine pentaphosphate synthetase. 719
19157 131744 TIGR02697 WPE_wolbac Wolbachia palindromic element (WPE) domain. This domain conceptually resembles TIGR01045, the Rickettsial palindromic element (RPE) domain. In both cases, a protein-coding palindromic element spreads through a genome, inserting usually in protein-coding regions. The additional protein coding sequence is thought to allow function of the host protein because of location in surface-exposed regions of the protein structure. Note that this model appears to work better in fragment mode. [Mobile and extrachromosomal element functions, Other] 36
19158 131745 TIGR02698 CopY_TcrY copper transport repressor, CopY/TcrY family. This family includes metal-fist type transcriptional repressors of copper transport systems such as copYZAB of Enterococcus hirae and tcrYAZB (transferable copper resistance) of an Enterococcus faecium plasmid. High levels of copper can displace zinc and prevent binding by the repressor, activating efflux by copper resistance transporters. The most closely related proteins excluded by this model are antibiotic resistance regulators including the methicillin resistance regulatory protein MecI. [Transport and binding proteins, Cations and iron carrying compounds, Regulatory functions, DNA interactions] 130
19159 131746 TIGR02699 archaeo_AfpA archaeoflavoprotein AfpA. The prototypical member of this archaeal protein family is AF1518 from Archaeoglobus fulgidus. This homodimer with two non-covalently bound FMN cofactors can receive electrons from ferredoxin, but not from a number of other electron donors such as NADH or rubredoxin. It can then donate electrons to various reductases. [Energy metabolism, Electron transport] 174
19160 131747 TIGR02700 flavo_MJ0208 archaeoflavoprotein, MJ0208 family. This model describes one of two paralogous families of archaeoflavoprotein. The other, described by TIGR02699 and typified by the partially characterized AF1518 of Archaeoglobus fulgidus, is a homodimeric FMN-containing flavoprotein that accepts electrons from ferredoxin and can transfer them to various oxidoreductases. The function of this protein family is unknown. [Unknown function, General] 234
19161 274262 TIGR02701 shell_carb_anhy carboxysome shell carbonic anhydrase. This model describes a carboxysome shell protein that proves to be a novel class, designated epsilon, of carbonic anhydrase. It tends to be encoded near genes for RuBisCo and for other carboxysome shell proteins. [Central intermediary metabolism, One-carbon metabolism] 450
19162 131749 TIGR02702 SufR_cyano iron-sulfur cluster biosynthesis transcriptional regulator SufR. All members of this cyanobacterial protein family are the transcriptional regulator SufR and regulate the SUF system, which makes possible iron-sulfur cluster biosynthesis despite exposure to oxygen. In all cases, the sufR gene is encoded near SUF system genes but in the opposite direction. This DNA-binding protein belongs to the DeoR family of helix-loop-helix proteins. All members also have a probable metal-binding motif C-X(12)-C-X(13)-C-X(14)-C near the C-terminus. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions] 203
19163 131750 TIGR02703 carboxysome_A carboxysome peptide A. This model distinguishes one of two closely related paralogs encoded by nearby genes in the carboxysome operons of a number of cyanobacteria and chemoautotrophic bacteria. More distantly related proteins, also belonging to pfam03319, participate in other types of shell such as the ethanolamine degradation organelle. [Central intermediary metabolism, One-carbon metabolism] 81
19164 131751 TIGR02704 carboxysome_B carboxysome peptide B. This model distinguishes one of two closely related paralogs encoded by nearby genes in the carboxysome operons of a number of cyanobacteria and chemoautotrophic bacteria. More distantly related proteins, also belonging to pfam03319, participate in other types of shell such as the ethanolamine degradation organelle. [Central intermediary metabolism, One-carbon metabolism] 80
19165 131752 TIGR02705 nudix_YtkD nucleoside triphosphatase YtkD. The functional assignment to the proteins of this family is contentious, with papers disagreeing in both interpretation and enzyme assay results. This protein belongs to the nudix family and shares some sequence identity with E. coli MutT but appears not to be functionally interchangeable with it. [DNA metabolism, DNA replication, recombination, and repair] 156
19166 162980 TIGR02706 P_butyryltrans phosphate butyryltransferase. Members of this family are phosphate butyryltransferase, also called phosphotransbutyrylase. In general, this enzyme is found in butyrate-producing anaerobic bacteria, encoded next to the gene for butyrate kinase. Together, these two enzymes represent what may be the less common of two pathways for butyrate production from butyryl-CoA. The alternative is transfer of the CoA group to acetate by butyryl-CoA:acetate CoA transferase. Cutoffs for this model are set such that the homolog from Thermotoga maritima, whose activity on butyryl-CoA is only 30% of its activity with acetyl-CoA, scores in the zone between trusted and noise cutoffs. [Energy metabolism, Fermentation] 294
19167 162981 TIGR02707 butyr_kinase butyrate kinase. This model represents an enzyme family in which members are designated either butyrate kinase or branched-chain carboxylic acid kinase. The EC designation 2.7.2.7 describes an enzyme with relatively broad specificity; gene products whose context suggests a role in metabolism of aliphatic amino acids are likely to act as branched-chain carboxylic acid kinase. The gene typically found adjacent, ptb (phosphate butyryltransferase), likewise encodes an enzyme that may have a broad specificity that includes a role in aliphatic amino acid catabolism. [Energy metabolism, Fermentation] 351
19168 131755 TIGR02708 L_lactate_ox L-lactate oxidase. Members of this protein family oxidize L-lactate to pyruvate, reducing molecular oxygen to hydrogen peroxide. The enzyme is known in Aerococcus viridans, Streptococcus iniae, and some strains of Streptococcus pyogenes where it appears to contribute to virulence. [Energy metabolism, Other] 367
19169 131756 TIGR02709 branched_ptb branched-chain phosphotransacylase. This model distinguishes branched-chain phosphotransacylases like that of Enterococcus faecalis from closely related subfamilies of phosphate butyryltransferase (EC 2.3.1.19) (TIGR02706) and phosphate acetyltransferase (EC 2.3.1.8) (TIGR00651). Members of this family and of TIGR02706 show considerable crossreactivity, and the occurrence of a member of either family near an apparent leucine dehydrogenase will suggest activity on branched chain-acyl-CoA compounds. [Energy metabolism, Amino acids and amines] 271
19170 274263 TIGR02710 TIGR02710 CRISPR-associated protein, TIGR02710 family. Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Archaea), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia). 380
19171 131758 TIGR02711 symport_actP cation/acetate symporter ActP. Members of this family belong to the Sodium:solute symporter family. Both members of this family and other close homologs tend to be encoded next to a member of pfam04341, a set of uncharacterized membrane proteins. The characterized member from E. coli is encoded near and cotranscribed with the acetyl coenzyme A synthetase (acs) gene. Proximity to an acs gene was used as one criterion for determining the trusted cutoff for this model. Closely related proteins may differ in function and are excluded by the high cutoffs of this model; members of the family of phenylacetic acid transporter PhaJ can score as high as 1011 bits. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 549
19172 274264 TIGR02712 urea_carbox urea carboxylase. Members of this family are ATP-dependent urea carboxylase, including characterized members from Oleomonas sagaranensis (alpha class Proteobacterium) and yeasts such as Saccharomyces cerevisiae. The allophanate hydrolase domain of the yeast enzyme is not included in this model and is represented by an adjacent gene in Oleomonas sagaranensis. The fusion of urea carboxylase and allophanate hydrolase is designated urea amidolyase. The enzyme from Oleomonas sagaranensis was shown to be highly active on acetamide and formamide as well as urea. [Central intermediary metabolism, Nitrogen metabolism] 1201
19173 274265 TIGR02713 allophanate_hyd allophanate hydrolase. Allophanate hydrolase catalyzes the second reaction in an ATP-dependent two-step degradation of urea to ammonia and CO2, following the action of the biotin-containing urea carboxylase. The yeast enzyme, a fusion of allophanate hydrolase to urea carboxylase, is designated urea amidolyase. [Central intermediary metabolism, Nitrogen metabolism] 561
19174 274266 TIGR02714 amido_AtzD_TrzD ring-opening amidohydrolases. Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC 3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC 3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring-opening of barbituric acid to ureidomalonic acid (see Soong, et al., ). 366
19175 274267 TIGR02715 amido_AtzE amidohydrolase, AtzE family. Members of this protein family are aminohydrolases related to, but distinct from, glutamyl-tRNA(Gln) amidotransferase subunit A. The best characterized member is the biuret hydrolase of Pseudomonas sp. ADP, which hydrolyzes ammonia from the three-nitrogen compound biuret to yield allophanate. Allophanate is also an intermediate in urea degradation by the urea carboxylase/allophanate hydrolase pathway, an alternative to urease. [Unknown function, Enzymes of unknown specificity] 452
19176 131763 TIGR02716 C20_methyl_CrtF C-20 methyltransferase BchU. Members of this protein family are the S-adenosylmethionine-dependent C-20 methyltransferase BchU, part of the pathway of bacteriochlorophyll c production in photosynthetic green sulfur bacteria. The position modified by this enzyme represents the difference between bacteriochlorophylls c and d; strains lacking this protein can only produce bacteriochlorophyll d. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 306
19177 131764 TIGR02717 AcCoA-syn-alpha acetyl coenzyme A synthetase (ADP forming), alpha domain. Although technically reversible, it is believed that this group of ADP-dependent acetyl-CoA synthetases (ACS) acts in the direction of acetate and ATP production in the organisms in which it has been characterized. In most species this protein exists as a fused alpha-beta domain polypeptide. In Pyrococcus and related species, however, the domains exist as separate polypeptides. This model represents the alpha (N-terminal) domain. In Pyrococcus and related species there appears to have been the development of a paralogous family such that four other proteins are close relatives. In reference, one of these (along with its beta-domain partner) was characterized as ACS-II showing specificity for phenylacetyl-CoA. This model has been constructed to exclude these non-ACS-I paralogs. This may result in new, authentic ACS-I sequences falling below the trusted cutoff. 447
19178 131765 TIGR02718 sider_RhtX_FptX siderophore transporter, RhtX/FptX family. RhtX from Sinorhizobium meliloti 2011 and FptX from Pseudomonas aeruginosa appear to be single polypeptide transporters, from the major facilitator family (see pfam07690) for import of siderophores as a means to import iron. This function was suggested by proximity to siderophore biosynthesis genes and then confirmed by study of knockout and heterologous expression phenotypes. [Transport and binding proteins, Cations and iron carrying compounds] 390
19179 131766 TIGR02719 repress_PhaQ poly-beta-hydroxybutyrate-responsive repressor. Members of this family are transcriptional regulatory proteins found in the vicinity of poly-beta-hydroxybutyrate (PHB) operons in several species of Bacillus. This protein appears to have repressor activity modulated by PHB itself. This protein belongs to the larger PadR family (see pfam03551). [Regulatory functions, DNA interactions] 138
19180 213733 TIGR02720 pyruv_oxi_spxB pyruvate oxidase. Members of this family are examples of pyruvate oxidase (EC 1.2.3.3), an enzyme with FAD and TPP as cofactors that catalyzes the reaction pyruvate + phosphate + O2 + H2O = acetyl phosphate + CO2 + H2O2. It should not be confused with pyruvate dehydrogenase [cytochrome] (EC 1.2.2.2) as in E. coli PoxB, although the E. coli enzyme is closely homologous and has pyruvate oxidase as an alternate name. [Energy metabolism, Aerobic] 575
19181 274268 TIGR02721 ycfN_thiK thiamine kinase. Members of this family are the ycfN gene product of Escherichia coli, now identified as the salvage enzyme thiamine kinase (thiK), and additional proteobacterial homologs taken to be orthologs with equivalent function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 256
19182 274269 TIGR02722 lp_ uncharacterized proteobacterial lipoprotein. Members of this protein family are restricted to the Proteobacteria, and all are predicted lipoproteins. In genomes that contain the thiK gene for the salvage enzyme thiamin kinase, the member of this family is encoded nearby. [Cell envelope, Other] 189
19183 131770 TIGR02723 phenyl_P_alpha phenylphosphate carboxylase, alpha subunit. Members of this protein family are the alpha subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This alpha subunit is homologous to the beta subunit and, more broadly, to UbiD family decarboxylases. [Energy metabolism, Anaerobic] 485
19184 131771 TIGR02724 phenyl_P_beta phenylphosphate carboxylase, beta subunit. Members of this protein family are the beta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This beta subunit is homologous to the alpha subunit and, more broadly, to UbiD family decarboxylases. 472
19185 131772 TIGR02725 phenyl_P_gamma phenylphosphate carboxylase, gamma subunit. Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs. 84
19186 131773 TIGR02726 phenyl_P_delta phenylphosphate carboxylase, delta subunit. Members of this protein family are the delta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This delta subunit belongs to HAD family hydrolases. [Energy metabolism, Anaerobic] 169
19187 274270 TIGR02727 MTHFS_bact 5,10-methenyltetrahydrofolate synthetase. This enzyme, 5,10-methenyltetrahydrofolate synthetase, is also called 5-formyltetrahydrofolate cycloligase. Function of bacterial proteins in this family was inferred originally from the known activity of eukaryotic homologs. Recently, activity was shown explicitly for the member from Mycoplasma pneumoniae. Members of this family from alpha- and gamma-proteobacteria, designated ygfA, are often found in an operon with 6S structural RNA, and show a similar pattern of high expression during stationary phase. The function may be to deplete folate to slow 1-carbon biosynthetic metabolism. [Central intermediary metabolism, One-carbon metabolism] 179
19188 131775 TIGR02728 spore_gerQ spore coat protein GerQ. Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase. [Cellular processes, Sporulation and germination] 82
19189 274271 TIGR02729 Obg_CgtA Obg family GTPase CgtA. This model describes a universal, mostly one-gene-per-genome GTP-binding protein that associates with ribosomal subunits and appears to play a role in ribosomal RNA maturation. This GTPase, related to the nucleolar protein Obg, is designated CgtA in bacteria. Mutations in this gene are pleiotropic, but it appears that effects on cellular functions such as chromosome partition may be secondary to the effect on ribosome structure. Recent work done in Vibrio cholerae shows an essential role in the stringent response, in which RelA-dependent ability to synthesize the alarmone ppGpp is required for deletion of this GTPase to be lethal. [Protein synthesis, Other] 328
19190 131777 TIGR02730 carot_isom carotene isomerase. Members of this family, including sll0033 (crtH) of Synechocystis sp. PCC 6803, catalyze a cis-trans isomerization of carotenes to the all-trans lycopene, a reaction that can also occur non-enzymatically in light through photoisomerization. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 493
19191 131778 TIGR02731 phytoene_desat phytoene desaturase. Plants and cyanobacteria (and, supposedly, Chlorobium tepidum) have a conserved pathway from two molecules of geranylgeranyl-PP to one of all-trans-lycopene. Members of this family are the enzyme phytoene desaturase (also called phytoene dehydrogenase). This model does not include the region of the chloroplast transit peptide in plants. A closely related family, excluded by this model, is zeta-carotene desaturase, another enzyme in the same pathway. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 453
19192 131779 TIGR02732 zeta_caro_desat 9,9'-di-cis-zeta-carotene desaturase. Carotene 7,8-desaturase, also called zeta-carotene desaturase, catalyzes multiple steps in the pathway from geranylgeranyl-PP to all-trans-lycopene in plants and cyanobacteria. A similar enzyme and pathway is found in the green sulfur bacterium Chlorobium tepidum. 474
19193 274272 TIGR02733 desat_CrtD C-3',4' desaturase CrtD. Members of this family are slr1293, a carotenoid biosynthesis protein which was shown to be the C-3',4' desaturase (CrtD) of myxoxanthophyll biosynthesis in Synechocystis sp. strain PCC 6803, and close homologs (presumed to be functionally equivalent) from other cyanobacteria, where myxoxanthophyll biosynthesis is either known or expected. This enzyme can act on neurosporene and so presumably catalyzes the first step that is committed to myxoxanthophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 492
19194 274273 TIGR02734 crtI_fam phytoene desaturase. Phytoene is converted to lycopene by desaturation at four (two symmetrical pairs of) sites. This is achieved by two enzymes (crtP and crtQ) in cyanobacteria (Gloeobacter being an exception) and plants, but by a single enzyme in most other bacteria and in fungi. This single enzyme is called the bacterial-type phytoene desaturase, or CrtI. Most members of this family, part of the larger pfam01593, which also contains amino oxidases, are CrtI itself; it is likely that all members act on either phytoene or on related compounds such as dehydrosqualene, for carotenoid biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 495
19195 131782 TIGR02735 purC_vibrio phosphoribosylaminoimidazole-succinocarboxamide synthase, Vibrio type. Members of this protein family appear to represent a novel form of phosphoribosylaminoimidazole-succinocarboxamide synthase (SAICAR synthetase), significantly different in sequence and gap pattern from a form (see TIGR00081) shared by a broad range of bacteria and eukaryotes. Members of this family are found within the gammaproteobacteria in the genera Vibrio, Shewanella, and Colwellia, and also (reported as a fragment) in the primitive eukaryote Guillardia theta. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 365
19196 131783 TIGR02736 cbb3_Q_epsi cytochrome c oxidase, cbb3-type, CcoQ subunit, epsilon-Proteobacterial. Members of this protein family are restricted to the epsilon branch of the Proteobacteria. All members are found in operons containing the other three structural subunits of the cbb3 type of cytochrome c oxidase. These small proteins show remote sequence similarity to the CcoQ subunit in other cytochrome c oxidase systems, so this family is assumed to represent the epsilonproteobacterial variant of CcoQ. [Energy metabolism, Electron transport] 56
19197 131784 TIGR02737 caa3_CtaG cytochrome c oxidase assembly factor CtaG. Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as in Bacillus subtilis. 281
19198 131785 TIGR02738 TrbB type-F conjugative transfer system pilin assembly thiol-disulfide isomerase TrbB. This protein is part of a large group of proteins involved in conjugative transfer of plasmid DNA, specifically the F-type system. This protein has been predicted to contain a thioredoxin fold, contains a conserved pair of cysteines and has been shown to function as a thiol disulfide isomerase by complementation of an E. coli DsbA defect. The protein is believed to be involved in pilin assembly. The protein is closely related to TraF (TIGR02739) which is somewhat longer, lacks the cysteine motif and is apparently not functional as a disulfide bond isomerase. 153
19199 274274 TIGR02739 TraF type-F conjugative transfer system pilin assembly protein TraF. This protein is part of a large group of proteins involved in conjugative transfer of plasmid DNA, specifically the F-type system. This protein has been predicted to contain a thioredoxin fold and has been shown to be localized to the periplasm. Unlike the related protein TrbB (TIGR02738), TraF does not contain a conserved pair of cysteines and has been shown not to function as a thiol disulfide isomerase by complementation of an E. coli DsbA defect. The protein is believed to be involved in pilin assembly. Even more closely related than TrbB is a clade of genes (TIGR02740) which do contain the CXXC motif, but it is unclear whether these genes are involved in type-F conjugation systems per se. 256
19200 274275 TIGR02740 TraF-like TraF-like protein. This protein is related to the F-type conjugation system pilus assembly proteins TraF (TIGR02739) and TrbB (TIGR02738), both of which exhibit a thioredoxin fold. The protein represented by this model has the same length and architecture as TraF, but lacks the CXXC-motif found in TrbB and believed to be responsible for the disulfide isomerase activity of that protein. 271
19201 131788 TIGR02741 TraQ type-F conjugative transfer system pilin chaperone TraQ. This protein makes a specific interaction with the pilin (TraA) protein to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly. 80
19202 131789 TIGR02742 TrbC_Ftype type-F conjugative transfer system pilin assembly protein TrbC. This protein is an essential component of the F-type conjugative pilus assembly system for the transfer of plasmid DNA. The N-terminal portion of these proteins is heterogeneous and is not covered by this model. 130
19203 274276 TIGR02743 TraW type-F conjugative transfer system protein TraW. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm. 202
19204 274277 TIGR02744 TrbI_Ftype type-F conjugative transfer system protein TrbI. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm. 112
19205 274278 TIGR02745 ccoG_rdxA_fixG cytochrome c oxidase accessory protein FixG. Members of this ferredoxin-like protein family are found exclusively in species with an operon encoding the cbb3 type of cytochrome c oxidase (cco-cbb3), and near the cco-cbb3 operon in about half the cases. The cco-cbb3 is found in a variety of proteobacteria and almost nowhere else, and is associated with oxygen use under microaerobic conditions. Some (but not all) of these proteobacteria are also nitrogen-fixing, hence the gene symbol fixG. FixG was shown essential for functional cco-cbb3 expression in Bradyrhizobium japonicum. 434
19206 274279 TIGR02746 TraC-F-type type-IV secretion system protein TraC. The protein family described here is common among the F, P and I-like type IV secretion systems. Gene symbols include TraC (F-type), TrbE/VirB4 (P-type) and TraU (I-type). The protein contains the Walker A and B motifs and so is a putative nucleotide triphosphatase. 797
19207 274280 TIGR02747 TraV type IV conjugative transfer system lipoprotein TraV. The TraV protein is a component of conjugative type IV secretion systems. TraV is an outer membrane lipoprotein and is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half. 144
19208 131795 TIGR02748 GerC3_HepT heptaprenyl diphosphate synthase component II. Members of this family are component II of the heterodimeric heptaprenyl diphosphate synthase. The trusted cutoff was set such that all members identified are encoded near to a recognizable gene for component I (in pfam07307). This enzyme acts in menaquinone-7 isoprenoid side chain biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 319
19209 131796 TIGR02749 prenyl_cyano solanesyl diphosphate synthase. Members of this family all are from cyanobacteria or plastid-containing eukaryotes. A member from Arabidopsis (where both plastoquinone and ubiquinone contain the C(45) prenyl moiety) was characterized by heterologous expression as a solanesyl diphosphate synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 322
19210 274281 TIGR02750 TraN_Ftype type-F conjugative transfer system mating-pair stabilization protein TraN. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilization (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein. 572
19211 131798 TIGR02751 PEPCase_arch phosphoenolpyruvate carboxylase, archaeal type. This family is the archaeal-type phosphoenolpyruvate carboxylase, although not every host species is archaeal. These sequences bear little resemblance to the bacterial/eukaryotic type. The members from Sulfolobus solfataricus and Methanothermobacter thermautotrophicus were verified experimentally, while the activity is known to be present in a number of other archaea. [Energy metabolism, Other] 506
19212 131799 TIGR02752 MenG_heptapren demethylmenaquinone methyltransferase. MenG is a generic term for a methyltransferase that catalyzes the last step in menaquinone biosynthesis; the exact enzymatic activity differs for different MenG because the menaquinone differ in their prenoid side chains in different species. Members of this MenG protein family are 2-heptaprenyl-1,4-naphthoquinone methyltransferase, and are found together in operons with the two subunits of the heptaprenyl diphosphate synthase in Bacillus subtilis and related species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 231
19213 131800 TIGR02753 sodN superoxide dismutase, Ni. This superoxide dismutase uses nickel, rather than iron, manganese, copper, or zinc. Its gene is always accompanied by a gene for a required protease. 145
19214 274282 TIGR02754 sod_Ni_protease nickel-type superoxide dismutase maturation protease. Members of this protein family are apparent proteases encoded adjacent to the genes for a nickel-type superoxide dismutase. This family belongs to the same larger family (see pfam00717) as signal peptidase I, an unusual serine protease suggested to have a Ser/Lys catalytic dyad. [Cellular processes, Detoxification, Protein fate, Protein modification and repair] 90
19215 131802 TIGR02755 TraX_Ftype type-F conjugative transfer system pilin acetylase TraX. TraX is responsible for the acetylation of the F-pilin TraA during conjugative plasmid transfer. The purpose of this acetylation is unclear, but the reported transcriptional regulation of TraX may indicate that it is involved in the process of pilus extension/retraction. 224
19216 274283 TIGR02756 TraK_Ftype type-F conjugative transfer system secretin TraK. The TraK protein is predicted to interact with the TraV and TraB proteins as part of the scaffold which extends from the inner membrane, through the periplasm to the cell envelope and through which the F-type conjugative pilus passes. TraK is homologous to the P-type IV secretion system protein TrbG, the Ti-type protein VirB9 and the I-type TraN protein. The protein is related to the secretin family especially the HrcC subgroup of the type III secretion system. The protein is hypothesized to oligomerize to form a ring structure akin to other secretins. 232
19217 274284 TIGR02757 TIGR02757 TIGR02757 family protein. Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation. [Hypothetical proteins, Conserved] 229
19218 131805 TIGR02758 TraA_TIGR type IV conjugative transfer system pilin TraA. TraA is the single structural subunit of the pilus found in type IV conjugative transfer systems. This family is generally found in gammaproteobacteria. The pilins show considerable heterogeneity among the different conjugative plasmid types. All of them however contain an N-terminal part which is cleaved off by a leader peptidase (LepB, or similar) to result in a 68-78 amino acid product. Pilins may be further processed by acetylation (in F-like systems by the TraX protein) or by cyclization (in P-like systems by the TraF protein). 93
19219 131806 TIGR02759 TraD_Ftype type IV conjugative transfer system coupling protein TraD. The TraD protein performs an essential coupling function in conjugative type IV secretion systems. This protein sits at the inner membrane in contact with the assembled pilus and its scaffold as well as the relaxosome-plasmid DNA complex (through TraM). 566
19220 274285 TIGR02760 TraI_TIGR conjugative transfer relaxase protein TraI. This protein is a component of the relaxosome complex. In the process of conjugative plasmid transfer the relaxosome binds to the plasmid at the oriT (origin of transfer) site. The relaxase protein TraI mediates the single-strand nicking and ATP-dependent unwinding (relaxation, helicase activity) of the plasmid molecule. These two activities reside in separate domains of the protein. 1960
19221 163004 TIGR02761 TraE_TIGR type IV conjugative transfer system protein TraE. TraE is a component of type IV secretion systems involved in conjugative transfer of plasmid DNA. The function of the TraE protein is unknown. 181
19222 274286 TIGR02762 TraL_TIGR type IV conjugative transfer system protein TraL. This protein is part of the type IV secretion system for conjugative plasmid transfer. The function of the TraL protein is unknown. [Cellular processes, Conjugation] 94
19223 131810 TIGR02763 chlamy_scaf chlamydiaphage internal scaffolding protein. Members of this protein family are encoded by genes in chlamydiaphage such as Chp2, viruses with around eight genes that infect obligately intracellular bacterial pathogens of the genus Chlamydia. This protein, initially designated VP3 (as if a structural protein of mature viral particles), is displaced from procapsids as DNA is packaged, and therefore is described as a scaffolding protein. [Mobile and extrachromosomal element functions, Prophage functions] 114
19224 274287 TIGR02764 spore_ybaN_pdaB polysaccharide deacetylase family sporulation protein PdaB. This model describes the YbaN protein family, also called PdaB and SpoVIE, of Gram-positive bacteria. Although ybaN null mutants have only a mild sporulation defect, ybaN/ytrI double mutants show drastically reduced sporulation efficiencies. This synthetic defect suggests the role of this sigmaE-controlled gene in sporulation had been masked by functional redundancy. Members of this family are homologous to a characterized polysaccharide deacetylase; the exact function of this protein family is unknown. [Cellular processes, Sporulation and germination] 191
19225 274288 TIGR02765 crypto_DASH cryptochrome, DASH family. Photolyases and cryptochromes are related flavoproteins. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes do not repair DNA and are presumed to act instead in some other (possibly unknown) process such as entraining circadian rhythms. This model describes the cryptochrome DASH subfamily, one of at least five major subfamilies, which is found in plants, animals, marine bacteria, etc. Members of this family bind both folate and FAD. They may show weak photolyase activity in vitro but have not been shown to affect DNA repair in vivo. Rather, DASH family cryptochromes have been shown to bind RNA (Vibrio cholerae VC1814), or DNA, and seem likely to act in light-responsive regulatory processes. [Cellular processes, Adaptations to atypical conditions] 429
19226 131813 TIGR02766 crypt_chrom_pln cryptochrome, plant family. At least five major families of cryptochromes and photolyases share FAD cofactor binding, sequence homology, and the ability to react to short wavelengths of visible light. Photolyases are responsible for light-dependent DNA repair by removal of two types of uv-induced DNA dimerizations. Cryptochromes have other functions, often regulatory and often largely unknown, which may include circadian clock entrainment and control of development. Members of this subfamily are known so far only in plants; they may show some photolyase activity in vitro but appear mostly to be regulatory proteins that respond to blue light. 475
19227 131814 TIGR02767 TraG-Ti Ti-type conjugative transfer system protein TraG. This protein is found in the Agrobacterium tumefaciens Ti plasmid tra region responsible for conjugative transfer of the entire plasmid among Agrobacterium strains. The protein is distantly related to the F-type conjugation system TraG protein. Both of these systems are examples of type IV secretion systems. 623
19228 274289 TIGR02768 TraA_Ti Ti-type conjugative transfer relaxase TraA. This protein contains domains distinctive of a single strand exonuclease (N-terminus, MobA/MobL, pfam03389) as well as a helicase domain (central region, homologous to the corresponding region of the F-type relaxase TraI, TIGR02760). This protein likely fills the same role as TraI(F), nicking (at the oriT site) and unwinding the coiled plasmid prior to conjugative transfer. 744
19229 131816 TIGR02769 nickel_nikE nickel import ATP-binding protein NikE. This family represents the NikE subunit of a multisubunit nickel import ABC transporter complex. Nickel, once imported, may be used in urease and in certain classes of hydrogenase and superoxide dismutase. [Transport and binding proteins, Cations and iron carrying compounds] 265
19230 131817 TIGR02770 nickel_nikD nickel import ATP-binding protein NikD. This family represents the NikD subunit of a multisubunit nickel import ABC transporter complex. Nickel, once imported, may be used in urease and in certain classes of hydrogenase and superoxide dismutase. NikD and NikE are homologous. [Transport and binding proteins, Cations and iron carrying compounds] 230
19231 131818 TIGR02771 TraF_Ti conjugative transfer signal peptidase TraF. This protein is found in apparent operons encoding elements of conjugative transfer systems. This family is homologous to a broader family of signal (leader) peptidases such as lepB. This family is present in both Ti-type and I-type conjugative systems. 171
19232 274290 TIGR02772 Ku_bact Ku protein, prokaryotic. Members of this protein family are Ku proteins of non-homologous end joining (NHEJ) DNA repair in bacteria and in at least one member of the archaea (Archaeoglobus fulgidus). Most members are encoded by a gene adjacent to the gene for the DNA ligase that completes the repair. The NHEJ system is broadly but rather sparsely distributed, being present in about one fifth of the first 250 completed prokaryotic genomes. A few species (e.g. Archaeoglobus fulgidus and Bradyrhizobium japonicum) have multiple copies that appear to represent recent paralogous family expansion. [DNA metabolism, DNA replication, recombination, and repair] 258
19233 213736 TIGR02773 addB_Gpos helicase-exonuclease AddAB, AddB subunit. DNA repair is accomplished by several different systems in prokaryotes. Recombinational repair of double-stranded DNA breaks involves the RecBCD pathway in some lineages, and AddAB (also called RexAB) in others. The AddA protein is conserved between the firmicutes and the alphaproteobacteria, while the partner protein is not. Nevertheless, the partner is designated AddB in both systems. This model describes the AddB protein as found in Bacillus subtilis and related species. Although the RexB protein of Streptococcus and Lactococcus is considered to be orthologous, functionally equivalent, and merely named differently, all members of this protein family have a P-loop nucleotide binding motif GxxGxGK[ST] at the N-terminus, unlike RexB proteins, and a CxxCxxxxxC motif at the C-terminus, both of which may be relevant to function. [DNA metabolism, DNA replication, recombination, and repair] 1160
19234 274291 TIGR02774 rexB_recomb ATP-dependent nuclease subunit B. DNA repair is accomplished by several different systems in prokaryotes. Recombinational repair of double-stranded DNA breaks involves the RecBCD pathway in some lineages, and AddAB (also called RexAB) in others. The AddA protein is conserved between the firmicutes and the alphaproteobacteria, while the partner protein is not. The partner may be designated AddB, as in Bacillus and in alphaproteobacteria, or RexB as in Streptococcus and Lactococcus. Note, however, that RexB proteins lack an N-terminal GxxGxGK[ST] ATP-binding motif found in Bacillus subtilis and related species, and this difference may be important; this model represents specifically RexB proteins as found in Streptococcus and Lactococcus. [DNA metabolism, DNA replication, recombination, and repair] 1076
19235 274292 TIGR02775 TrbG_Ti P-type conjugative transfer protein TrbG. The TrbG protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbG is a homolog of the F-type TraK protein (which is believed to be an outer membrane pore-forming secretin, TIGR02756) as well as the vir system VirB9 protein. [Cellular processes, Conjugation] 206
19236 274293 TIGR02776 NHEJ_ligase_prk DNA ligase D. Members of this protein family are DNA ligases involved in the repair of DNA double-stranded breaks by non-homologous end joining (NHEJ). The system of the bacterial Ku protein (TIGR02772) plus this DNA ligase is seen in about 20% of bacterial genomes to date and at least one archaeon (Archaeoglobus fulgidus). This model describes a central and a C-terminal domain. These two domains may be permuted, as in genus Mycobacterium, or divided into tandem ORFs, and therefore not be identified by this model. An additional N-terminal 3'-phosphoesterase (PE) domain present in some but not all examples of this ligase is not included in the seed alignment for this model; it only represents the central ATP-dependent ligase domain and the C-terminal polymerase domain. Most examples of genes for this ligase are adjacent to the gene for Ku. [DNA metabolism, DNA replication, recombination, and repair] 552
19237 131824 TIGR02777 LigD_PE_dom DNA ligase D, 3'-phosphoesterase domain. Most sequences in this family are the 3'-phosphoesterase domain of a multidomain, multifunctional DNA ligase, LigD, involved, along with bacterial Ku protein, in non-homologous end joining, the less common of two general mechanisms of repairing double-stranded breaks in DNA sequences. LigD is variable in architecture, as it lacks this domain in Bacillus subtilis, is permuted in Mycobacterium tuberculosis, and occasionally is encoded by tandem ORFs rather than as a multifunctional protein. In a few species (Dehalococcoides ethenogenes and the archaeal genus Methanosarcina), sequences corresponding to the ligase and polymerase domains of LigD are not found, and the role of this protein is unclear. [DNA metabolism, DNA replication, recombination, and repair] 156
19238 274294 TIGR02778 ligD_pol DNA ligase D, polymerase domain. DNA repair of double-stranded breaks by non-homologous end joining (NHEJ) is accomplished by a two-protein system that is present in a minority of prokaryotes. One component is the Ku protein (see TIGR02772), which binds DNA ends. The other is a DNA ligase, a protein that is a multidomain polypeptide in most of those bacteria that have NHEJ, a permuted polypeptide in Mycobacterium tuberculosis and a few other species, and the product of tandem genes in some other bacteria. This model represents the polymerase domain. 245
19239 274295 TIGR02779 NHEJ_ligase_lig DNA ligase D, ligase domain. DNA repair of double-stranded breaks by non-homologous end joining (NHEJ) is accomplished by a two-protein system that is present in a minority of prokaryotes. One component is the Ku protein (see TIGR02772), which binds DNA ends. The other is a DNA ligase, a protein that is a multidomain polypeptide in most of those bacteria that have NHEJ, a permuted polypeptide in Mycobacterium tuberculosis and a few other species, and the product of tandem genes in some other bacteria. This model represents the ligase domain. 298
19240 131827 TIGR02780 TrbJ_Ti P-type conjugative transfer protein TrbJ. The TrbJ protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbJ is a homolog of the F-type TraE protein (which is believed to be an inner membrane pore-forming protein, TIGR02761) as well as the vir system VirB5 protein. 246
19241 274296 TIGR02781 VirB9 P-type conjugative transfer protein VirB9. The VirB9 protein is found in the vir locus of Agrobacterium Ti plasmids where it is involved in a type IV secretion system. VirB9 is a homolog of the F-type conjugative transfer system TraK protein (which is believed to be an outer membrane pore-forming secretin, TIGR02756) as well as the Ti system TrbG protein. [Cellular processes, Conjugation] 243
19242 274297 TIGR02782 TrbB_P P-type conjugative transfer ATPase TrbB. The TrbB protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbB is a homolog of the vir system VirB11 ATPase, and the Flp pilus system ATPase TadA. [Cellular processes, Conjugation] 299
19243 131830 TIGR02783 TrbL_P P-type conjugative transfer protein TrbL. The TrbL protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbL is a homolog of the F-type TraG protein (which is believed to be a mating pair stabilization pore-forming protein, pfam07916) as well as the vir system VirB6 protein. [Cellular processes, Conjugation] 298
19244 274298 TIGR02784 addA_alphas double-strand break repair helicase AddA, alphaproteobacterial type. AddAB, also called RexAB, substitutes for RecBCD in several bacterial lineages. These DNA recombination proteins act before synapse and are particularly important for DNA repair of double-stranded breaks by homologous recombination. The term AddAB is used broadly, with AddA homologous between the alphaproteobacteria (as modeled here) and the Firmicutes, while the partner AddB proteins show no strong homology across the two groups of species. [DNA metabolism, DNA replication, recombination, and repair] 1135
19245 274299 TIGR02785 addA_Gpos helicase-exonuclease AddAB, AddA subunit, Firmicutes type. AddAB, also called RexAB, substitutes for RecBCD in several bacterial lineages. These DNA recombination proteins act before synapse and are particularly important for DNA repair of double-stranded breaks by homologous recombination. The term AddAB is used broadly, with AddA homologous between the Firmicutes (as modeled here) and the alphaproteobacteria, while the partner AddB proteins show no strong homology across the two groups of species. [DNA metabolism, DNA replication, recombination, and repair] 1230
19246 274300 TIGR02786 addB_alphas double-strand break repair protein AddB, alphaproteobacterial type. AddAB is a system well described in the Firmicutes as a replacement for RecBCD in many prokaryotes for the repair of double stranded break DNA damage. More recently, a distantly related gene pair conserved in many alphaproteobacteria was shown also to function in double-stranded break repair in Rhizobium etli. This family consists of AddB proteins of the alphaproteobacterial type. [DNA metabolism, DNA replication, recombination, and repair] 1021
19247 131834 TIGR02787 codY_Gpos GTP-sensing transcriptional pleiotropic repressor CodY. This model represents the full length of CodY, a pleiotropic repressor in Bacillus subtilis and other Firmicutes (low-GC Gram-positive bacteria) that responds to intracellular levels of GTP and branched chain amino acids. The C-terminal helix-turn-helix DNA-binding region is modeled by pfam08222 in Pfam. [Regulatory functions, DNA interactions] 251
19248 274301 TIGR02788 VirB11 P-type DNA transfer ATPase VirB11. The VirB11 protein is found in the vir locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for DNA transfer. VirB11 is believed to be an ATPase. VirB11 is a homolog of the P-like conjugation system TrbB protein and the Flp pilus system protein TadA. 308
19249 131836 TIGR02789 nickel_nikB nickel ABC transporter, permease subunit NikB. This family consists of the NikB family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit NikC. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. [Transport and binding proteins, Cations and iron carrying compounds] 314
19250 131837 TIGR02790 nickel_nikC nickel ABC transporter, permease subunit NikC. This family consists of the NikC family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit NikB. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. [Transport and binding proteins, Cations and iron carrying compounds] 258
19251 274302 TIGR02791 VirB5 P-type DNA transfer protein VirB5. The VirB5 protein is involved in the type IV DNA secretion systems typified by the Agrobacterium Ti plasmid vir system where it interacts with several other proteins essential for proper pilus formation. VirB5 is homologous to the IncN (N-type) conjugation system protein TraC as well as the P-type protein TrbJ and the F-type protein TraE. 220
19252 131839 TIGR02792 PCA_ligA protocatechuate 4,5-dioxygenase, alpha subunit. Protocatechuate (PCA) 4,5-dioxygenase is the first enzyme in the PCA 4,5-cleavage pathway that is an alternative to PCA 3,4-cleavage and PCA 2,3 cleavage pathways. PCA is an intermediate in the breakdown of lignin (hence the gene symbol ligA) and other compounds. Members of this family are the alpha chain of PCA 4,5-dioxygenase, or the equivalent domain of a fusion protein. [Energy metabolism, Aerobic] 117
19253 131840 TIGR02793 nikR nickel-responsive transcriptional regulator NikR. Three members of the seed for this model, from Escherichia coli, Pseudomonas putida, and Brucella melitensis, are found associated with a nickel ABC transporter operon that acts to import nickel for use as a cofactor in urease or hydrogenase. These proteins, with characterized nickel-binding and DNA-binding domains, act as nickel-responsive transcriptional regulators. In the larger family of full-length homologs, most others both lack proximity to the nickel ABC transporter operon and form a separate clade. Several of the homologs not within the scope of this model, but rather scoring between the trusted and noise cutoffs, have been shown to bind nickel, copper, or both, and to regulate genes in response to nickel. [Regulatory functions, DNA interactions] 129
19254 274303 TIGR02794 tolA_full TolA protein. TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cutoffs are based largely on conserved operon structure. //The Tol-Pal complex is required for maintaining outer membrane integrity. Also involved in transport (uptake) of colicins and filamentous DNA, and implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins ompC, phoE and lamB. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 346
19255 188247 TIGR02795 tol_pal_ybgF tol-pal system protein YbgF. Members of this protein family are the product of one of seven genes regularly clustered in operons to encode the proteins of the tol-pal system, which is critical for maintaining the integrity of the bacterial outer membrane. The gene for this periplasmic protein has been designated orf2 and ybgF. All members of the seed alignment were from unique tol-pal gene regions from completed bacterial genomes. The architecture of this protein is a signal sequence, a low-complexity region usually rich in Asn and Gln, a well-conserved region with tandem repeats that resemble the tetratricopeptide (TPR) repeat, involved in protein-protein interaction. 117
19256 131843 TIGR02796 tolQ TolQ protein. TolQ is one of the essential components of the Tol-Pal system. Together with TolR, it harnesses protonmotive force to energize TolA, which spans the periplasm to reach the complex of TolB and Pal at the outer membrane. The tol-pal system proves to be important for maintaining outer membrane integrity. Gene pairs similar to the TolQ and TolR gene pair often number several per genome, but this model describes specifically TolQ per se, as found in tol-pal operons. A close homolog, excluded from this model, is ExbB of the ExbB/ExbD/TonB protein complex, which powers transport of siderophores and vitamin B12 across the bacterial outer membrane. The Tol-Pal system is exploited by colicin and filamentous phage DNA to enter the cell. It is also implicated in pathogenesis in several bacterial species. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 215
19257 131844 TIGR02797 exbB tonB-system energizer ExbB. This model describes ExbB proteins, part of the MotA/TolQ/ExbB protein family. The paired proteins MotA and MotB, TolQ and TolR, and ExbB and ExbD harness the proton-motive force to drive the flagellar motor, energize the Tol-Pal system, or energize TonB, respectively. Tol-Pal and TonB are both active at the outer membrane. Genomes may have many different TonB-dependent receptors, of which many of those characterized are involved in siderophore transport across the outer membrane. [Transport and binding proteins, Cations and iron carrying compounds] 211
19258 131845 TIGR02798 ligK_PcmE 4-carboxy-4-hydroxy-2-oxoadipate aldolase/oxaloacetate decarboxylase. Members of this protein family are 4-carboxy-4-hydroxy-2-oxoadipate aldolase, also called 4-oxalocitramalate aldolase. This enzyme of the protocatechuate 4,5-cleavage pathway converts its substrate to pyruvate plus oxaloacetate. Protocatechuate is an intermediate in many pathways for degrading aromatic compounds, including lignin, fluorene, etc. Hara, et al. showed the LigK gene was not only a 4-carboxy-4-hydroxy-2-oxoadipate aldolase but also the enzyme of the following step, oxaloacetate decarboxylase. 222
19259 274304 TIGR02799 thio_ybgC tol-pal system-associated acyl-CoA thioesterase. The tol-pal system consists of five critical genes. Inner membrane proteins TolQ and TolR convert proton-motive force to energy that is transduced through TolA to an outer membrane complex of TolB and Pal. The system is known to be required to maintain outer membrane integrity. In a system with several homologous parts, ExbB and ExbD transduce energy through TonB to a variety of outer membrane proteins, many of which are siderophore receptors. The tol-pal system therefore may also be involved in transport. This family consists of a protein nearly always found in operons with the genes of the tol-pal system. The significance of this thioesterase to the tol-pal system is unclear, but either of two observations may be relevant. First, Pal, or peptidoglycan-associated lipoprotein, has a conserved N-terminal cleavage and acylation that makes it a lipoprotein. Second, the tol-pal system is implicated not only in the import of certain organics but also in the maintenance of outer membrane integrity (by an unknown mechanism). 126
19260 274305 TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. Members of this protein family are the TolB periplasmic protein of Gram-negative bacteria. TolB is part of the Tol-Pal (peptidoglycan-associated lipoprotein) multiprotein complex, comprising five envelope proteins, TolQ, TolR, TolA, TolB and Pal, which form two complexes. The TolQ, TolR and TolA inner-membrane proteins interact via their transmembrane domains. The {beta}-propeller domain of the periplasmic protein TolB is responsible for its interaction with Pal. TolB also interacts with the outer-membrane peptidoglycan-associated proteins Lpp and OmpA. TolA undergoes a conformational change in response to changes in the proton-motive force, and interacts with Pal in an energy-dependent manner. The C-terminal periplasmic domain of TolA also interacts with the N-terminal domain of TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggests that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 417
19261 274306 TIGR02801 tolR TolR protein. The model describes the inner membrane protein TolR, part of the TolR/TolQ complex that transduces energy from the proton-motive force, through TolA, to an outer membrane complex made up of TolB and Pal (peptidoglycan-associated lipoprotein). The complex is required to maintain outer membrane integrity, and defects may cause a defect in the import of some organic compounds in addition to the resulting morphologic. While several gene pairs homologous to tolR and tolQ may be found in a single genome, the scope of this model is set to favor finding only bona fide TolR, supported by operon structure as well as by score. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 129
19262 274307 TIGR02802 Pal_lipo peptidoglycan-associated lipoprotein. Members of this protein family are Pal (also called OprL), the Peptidoglycan-Associated Lipoprotein of the Tol-Pal system. The system appears to be involved both in the maintenance of outer membrane integrity and in the import of certain organic molecules as nutrients. Members of this family contain a hydrophobic lipoprotein signal sequence, a conserved N-terminal cleavage and modification site, a poorly conserved low-complexity region, together comprising about 65 amino acids, and a well-conserved C-terminal domain. The seed alignment for this model includes only the conserved C-terminal domain. 104
19263 131850 TIGR02803 ExbD_1 TonB system transport protein ExbD, group 1. Members of this family are Gram-negative bacterial inner membrane proteins, generally designated ExbD, related to the TolR family modeled by TIGRFAMs TIGR02801. Members always are encoded next to a protein designated ExbB (TIGR02797), related to the TolQ family modeled by TIGRFAMs TIGR02796. ExbD and ExbB together form a proton channel through which they can harness the proton-motive force to energize TonB, which in turn energizes TonB-dependent receptors in the outer membrane. TonB-dependent receptors with known specificity tend to import siderophores or vitamin B12. A TonB system and Tol-Pal system often will co-exist in a single bacterial genome. 122
19264 131851 TIGR02804 ExbD_2 TonB system transport protein ExbD, group 2. Members of this family are Gram-negative bacterial inner membrane proteins, generally designated ExbD, related to the TolR family modeled by TIGRFAMs TIGR02801. Members always are encoded next to a protein designated ExbB (TIGR02797), related to the TolQ family modeled by TIGRFAMs TIGR02796. ExbD and ExbB together form a proton channel through which they can harness the proton-motive force to energize TonB, which in turn energizes TonB-dependent receptors in the outer membrane. TonB-dependent receptors with known specificity tend to import siderophores or vitamin B12. A TonB system and Tol-Pal system often will co-exist in a single bacterial genome. 121
19265 131852 TIGR02805 exbB2 tonB-system energizer ExbB, group 2. Members of this protein family appear to be the ExbB protein of an ExbBD proton-transporting membrane complex that, by means of TonB, energizes transport by TonB-dependent receptors. Note that this family represents one of at least two distinct groups of TolQ homologs designated ExbB - see also TIGR02797. Each group associates with a distinct group of ExbD proteins, and a single species may have two ExbB/ExbD/TonB systems. [Transport and binding proteins, Cations and iron carrying compounds] 138
19266 131853 TIGR02806 clostrip clostripain. Clostripain is a cysteine protease characterized from Clostridium histolyticum, and also known from Clostridium perfringens. It is a heterodimer processed from a single precursor polypeptide, specific for Arg-|-Xaa peptide bonds. The older term alpha-clostripain refers to the most active, most reduced form, rather than to the product of one of several different genes. Clostripain belongs to the peptidase family C11, or clostripain family (see pfam03415). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Pathogenesis] 476
19267 274308 TIGR02807 cas6_cmx6 CRISPR-associated protein Cas6, subtype MYXAN. Members of this protein family resemble the Cas6 proteins described by TIGR01877 in having a C-terminal motif GXGXXXXXGXG, where the single X of each GXG is hydrophobic and the spacer XXXXX has at least one Lys or Arg. Examples are found in cas gene operons of CRISPR regions in Anabaena variabilis ATCC 29413, Leptospira interrogans, Gemmata obscuriglobus UQM 2246, and twice in Myxococcus xanthus DK 1622. Oddly, an orphan member is found in Thiobacillus denitrificans ATCC 25259, whose genome does not seem to contain other evidence of CRISPR repeats or cas genes. 190
19268 131855 TIGR02808 short_TIGR02808 TIGR02808 family protein. This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC7966. [Hypothetical proteins, Conserved] 42
19269 131856 TIGR02809 phasin_3 phasin family protein. Members of this protein family are encoded in polyhydroxyalkanoic acid storage system regions in Vibrio, Photobacterium profundum SS9, Acinetobacter sp., Aeromonas hydrophila, and several species of Vibrio. Members appear distantly related to the phasin family proteins modeled by TIGR01841 and TIGR01985. 110
19270 274309 TIGR02810 agaZ_gatZ D-tagatose-bisphosphate aldolase, class II, non-catalytic subunit. Aldolases specific for D-tagatose-bisphosphate occur in distinct pathways in Escherichia coli and other bacteria, one for the degradation of galactitol (formerly dulcitol) and one for degradation of N-acetyl-galactosamine and D-galactosamine. This family represents a protein of both systems that behaves as a non-catalytic subunit of D-tagatose-bisphosphate aldolase, required both for full activity and for good stability of the aldolase. Note that members of this protein family appear in public databases annotated as putative tagatose 6-phosphate kinases, possibly in error. [Energy metabolism, Sugars] 420
19271 274310 TIGR02811 formate_TAT formate dehydrogenase region TAT target. Members of this uncharacterized protein family are all small, extending 70 or fewer residues from their respective likely start codons. All have the twin-arginine-dependent transport (TAT) signal sequence at the N-terminus and a conserved 20-residue C-terminal region that includes the motif Y-[HRK]-X-[TS]-X-H-[IV]-X-X-[YF]-Y. The TAT signal sequence suggests a bound cofactor. All members are encoded near genes for subunits of formate dehydrogenase, and may themselves be a subunit or accessory protein. [Unknown function, General] 66
19272 163028 TIGR02812 fadR_gamma fatty acid metabolism transcriptional regulator FadR. Members of this family are FadR, a transcriptional regulator of fatty acid metabolism, including both biosynthesis and beta-oxidation. It is found exclusively in a subset of Gammaproteobacteria, with strictly one copy per genome. It has an N-terminal DNA-binding domain and a less well conserved C-terminal long chain acyl-CoA-binding domain. FadR from this family heterologously expressed in Escherichia coli show differences in regulatory response and fatty acid binding profiles. The family is nevertheless designated equivalog, as all member proteins have at least nominally the same function. [Fatty acid and phospholipid metabolism, Biosynthesis, Fatty acid and phospholipid metabolism, Degradation, Regulatory functions, DNA interactions] 235
19273 274311 TIGR02813 omega_3_PfaA polyketide-type polyunsaturated fatty acid synthase PfaA. Members of the seed for this alignment are involved in omega-3 polyunsaturated fatty acid biosynthesis, such as the protein PfaA from the eicosapentaenoic acid biosynthesis operon in Photobacterium profundum strain SS9. PfaA is encoded together with PfaB, PfaC, and PfaD, and the functions of the individual polypeptides have not yet been described. More distant homologs of PfaA, also included with the reach of this model, appear to be involved in polyketide-like biosynthetic mechanisms of polyunsaturated fatty acid biosynthesis, an alternative to the more familiar iterated mechanism of chain extension and desaturation, and in most cases are encoded near genes for homologs of PfaB, PfaC, and/or PfaD. 2582
19274 274312 TIGR02814 pfaD_fam PfaD family protein. The protein PfaD is part of a four-gene locus, similar to polyketide biosynthesis systems, responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. Several other members of the seed alignment for this model are found in loci presumed to act in polyketide biosyntheses per se. 444
19275 131862 TIGR02815 agaS_fam putative sugar isomerase, AgaS family. Some members of this protein family are found in regions associated with N-acetyl-galactosamine and galactosamine utilization and are suggested to be isomerases. 372
19276 131863 TIGR02816 pfaB_fam PfaB family protein. The protein PfaB is part of a four-gene locus, similar to polyketide biosynthesis systems, responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. The fairly permissive trusted cutoff set for this model allows detection of homologs encoded near homologs to other proteins of the locus: PfaA, PfaC, and/or PfaD. The likely role in every case is either polyunsaturated fatty acid or polyketide biosynthesis. 538
19277 274313 TIGR02817 adh_fam_1 zinc-binding alcohol dehydrogenase family protein. Members of this model form a distinct subset of the larger family of oxidoreductases that includes zinc-binding alcohol dehydrogenases and NADPH:quinone reductases (pfam00107). While some current members of this family carry designations as putative alginate lyase, it seems no sequence with a direct characterization as such is detected by this model. [Energy metabolism, Fermentation] 336
19278 131865 TIGR02818 adh_III_F_hyde S-(hydroxymethyl)glutathione dehydrogenase/class III alcohol dehydrogenase. The members of this protein family show dual function. First, they remove formaldehyde, a toxic metabolite, by acting as S-(hydroxymethyl)glutathione dehydrogenase (1.1.1.284). S-(hydroxymethyl)glutathione can form spontaneously from formaldehyde and glutathione, and so this enzyme previously was designated glutathione-dependent formaldehyde dehydrogenase. These same proteins are also designated alcohol dehydrogenase (EC 1.1.1.1) of class III, for activities that do not require glutathione; they tend to show poor activity for ethanol among their various substrate alcohols. [Cellular processes, Detoxification, Energy metabolism, Fermentation] 368
19279 274314 TIGR02819 fdhA_non_GSH formaldehyde dehydrogenase, glutathione-independent. Members of this family represent a distinct clade within the larger family of zinc-dependent dehydrogenases of medium chain alcohols, a family that also includes the so-called glutathione-dependent formaldehyde dehydrogenase. Members of this protein family have a tightly bound NAD that can act as a true cofactor, rather than a cosubstrate in dehydrogenase reactions, in dismutase reactions for some aldehydes. The name given to this family, however, is formaldehyde dehydrogenase, glutathione-independent. [Central intermediary metabolism, One-carbon metabolism] 393
19280 131867 TIGR02820 formald_GSH S-(hydroxymethyl)glutathione synthase. The formation of S-(hydroxymethyl)glutathione from glutathione and formaldehyde occurs naturally, but this enzyme speeds its formation in some species as part of a pathway of formaldehyde detoxification. [Cellular processes, Detoxification, Central intermediary metabolism, One-carbon metabolism] 182
19281 131868 TIGR02821 fghA_ester_D S-formylglutathione hydrolase. This model describes a protein family from bacteria, yeast, and human, with a conserved critical role in formaldehyde detoxification as S-formylglutathione hydrolase (EC 3.1.2.12). Members in eukaryotes such as the human protein are better known as esterase D (EC 3.1.1.1), an enzyme with broad specificity, although S-formylglutathione hydrolase has now been demonstrated as well. [Cellular processes, Detoxification] 275
19282 131869 TIGR02822 adh_fam_2 zinc-binding alcohol dehydrogenase family protein. Members of this model form a distinct subset of the larger family of oxidoreductases that includes zinc-binding alcohol dehydrogenases and NADPH:quinone reductases (pfam00107). The gene neighborhood of members of this family is not conserved and it appears that no members are characterized. The sequence of the family includes 6 invariant cysteine residues and one invariant histidine. [Energy metabolism, Fermentation] 329
19283 274315 TIGR02823 oxido_YhdH putative quinone oxidoreductase, YhdH/YhfP family. This model represents a subfamily of pfam00107 as defined by Pfam, a superfamily in which some members are zinc-binding medium-chain alcohol dehydrogenases while others are quinone oxidoreductases with no bound zinc. This subfamily includes proteins studied crystallographically for insight into function: YhdH from Escherichia coli and YhfP from Bacillus subtilis. Members bind NADPH or NAD, but not zinc. [Unknown function, Enzymes of unknown specificity] 323
19284 274316 TIGR02824 quinone_pig3 putative NAD(P)H quinone oxidoreductase, PIG3 family. Members of this family are putative quinone oxidoreductases that belong to the broader superfamily (modeled by Pfam pfam00107) of zinc-dependent alcohol (of medium chain length) dehydrogenases and quinone oxidoreductases. The alignment shows no motif of conserved Cys residues as are found in zinc-binding members of the superfamily, and members are likely to be quinone oxidoreductases instead. A member of this family in Homo sapiens, PIG3, is induced by p53 but is otherwise uncharacterized. [Unknown function, Enzymes of unknown specificity] 325
19285 131872 TIGR02825 B4_12hDH leukotriene B4 12-hydroxydehydrogenase/15-oxo-prostaglandin 13-reductase. Leukotriene B4 12-hydroxydehydrogenase is an NADP-dependent enzyme of arachidonic acid metabolism, responsible for converting leukotriene B4 to the much less active metabolite 12-oxo-leukotriene B4. The BRENDA database lists leukotriene B4 12-hydroxydehydrogenase as one of the synonyms of 2-alkenal reductase (EC 1.3.1.74), while 1.3.1.48 is 15-oxoprostaglandin 13-reductase. 325
19286 274317 TIGR02826 RNR_activ_nrdG3 anaerobic ribonucleoside-triphosphate reductase activating protein. Members of this family represent a set of radical SAM enzymes related to, yet architecturally different from, the activating protein for the glycine radical-containing, oxygen-sensitive ribonucleoside-triphosphate reductase (RNR) as described in model TIGR02491. Members of this family are found paired with members of a similarly divergent set of anaerobic ribonucleoside-triphosphate reductases. Identification of this protein as an RNR activating protein is partly from pairing with a candidate RNR. It is further supported by our finding that upstream of these operons are examples of a conserved regulatory element (described by Rodionov and Gelfand) that is found in nearly all bacteria and that occurs specifically upstream of operons for all three classes of RNR genes. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 147
19287 274318 TIGR02827 RNR_anaer_Bdell anaerobic ribonucleoside-triphosphate reductase. Members of this family belong to the class III anaerobic ribonucleoside-triphosphate reductases (RNR). These glycine-radical-containing enzymes are oxygen-sensitive and operate under anaerobic conditions. The genes for this family are paired with genes for an activating protein that creates a glycine radical. Members of this family, though related, fall outside the scope of TIGR02487, a functionally equivalent protein set; no genome has members in both families. Identification as RNR is supported by gene pairing with the activating protein, lack of other anaerobic RNR, and presence of an upstream regulatory element strongly conserved upstream of most RNR operons. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 595
19288 131875 TIGR02828 TIGR02828 putative membrane fusion protein. Members of this family show similarity to the members of TIGR00999, the membrane fusion protein (MFP) cluster 2 family, which is linked to RND transport systems. [Transport and binding proteins, Unknown substrate] 188
19289 213743 TIGR02829 spore_III_AE stage III sporulation protein AE. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AE. [Cellular processes, Sporulation and germination] 381
19290 274319 TIGR02830 spore_III_AG stage III sporulation protein AG. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AG. [Cellular processes, Sporulation and germination] 186
19291 131878 TIGR02831 spo_II_M stage II sporulation protein M. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This predicted integral membrane protein is designated stage II sporulation protein M. [Cellular processes, Sporulation and germination] 200
19292 131879 TIGR02832 spo_yunB sporulation protein YunB. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. Mutation of this sigma E-regulated gene, designated yunB, has been shown to cause a sporulation defect. [Cellular processes, Sporulation and germination] 204
19293 131880 TIGR02833 spore_III_AB stage III sporulation protein AB. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage III sporulation protein AB. [Cellular processes, Sporulation and germination] 170
19294 274320 TIGR02834 spo_ytxC putative sporulation protein YtxC. This uncharacterized protein is part of a panel of proteins conserved in all known endospore-forming Firmicutes (low-GC Gram-positive bacteria), including Carboxydothermus hydrogenoformans, and nowhere else. [Cellular processes, Sporulation and germination] 276
19295 131882 TIGR02835 spore_sigmaE RNA polymerase sigma-E factor. Members of this family comprise the Firmicutes lineage endospore formation-specific sigma factor SigE, also called SpoIIGB and sigma-29. As characterized in Bacillus subtilis, this protein is synthesized as a precursor, specifically in the mother cell compartment, and must be cleaved by the SpoIIGA protein to be made active. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 234
19296 131883 TIGR02836 spore_IV_A stage IV sporulation protein A. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. [Cellular processes, Sporulation and germination] 492
19297 131884 TIGR02837 spore_II_R stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage II sporulation protein R. [Cellular processes, Sporulation and germination] 168
19298 131885 TIGR02838 spore_V_AC stage V sporulation protein AC. This model describes stage V sporulation protein AC, a paralog of stage V sporulation protein AE. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAC have a stage V sporulation defect. [Cellular processes, Sporulation and germination] 141
19299 131886 TIGR02839 spore_V_AE stage V sporulation protein AE. This model describes stage V sporulation protein AE, a paralog of stage V sporulation protein AC. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAE have a stage V sporulation defect. [Cellular processes, Sporulation and germination] 114
19300 274321 TIGR02840 spore_YtaF putative sporulation protein YtaF. This protein family was identified, at the time of the publication of the Carboxydothermus hydrogenoformans genome, as having a phylogenetic profile that exactly matches the subset of the Firmicutes capable of forming endospores. The species include Bacillus anthracis, Clostridium tetani, Thermoanaerobacter tengcongensis, Geobacillus kaustophilus, etc. This protein, previously named YtaF, is therefore a putative sporulation protein. [Cellular processes, Sporulation and germination] 206
19301 131888 TIGR02841 spore_YyaC putative sporulation protein YyaC. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, also called YyaC, is a member of that panel and is otherwise uncharacterized. The second round of PSI-BLAST shows many similarities to the germination protease GPR, which is found in exactly the same set of organisms and has a known role in the sporulation/germination process. [Cellular processes, Sporulation and germination] 140
19302 131889 TIGR02842 CyoC cytochrome o ubiquinol oxidase, subunit III. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport] 180
19303 131890 TIGR02843 CyoB cytochrome o ubiquinol oxidase, subunit I. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport] 646
19304 131891 TIGR02844 spore_III_D sporulation transcriptional regulator SpoIIID. Members of this protein family are the transcriptional regulator SpoIIID, or stage III sporulation protein D. It is present in genomes if and only if the species is capable of endospore formation as occurs in the model species Bacillus subtilis. SpoIIID is a DNA binding protein that, in B. subtilis, downregulates many genes but also turns on ten genes. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination] 80
19305 188254 TIGR02845 spore_V_AD stage V sporulation protein AD. Bacillus and Clostridium species contain about 10 % dipicolinic acid (pyridine-2,6-dicarboxylic acid) by weight. This protein family, SpoVAD, belongs to the spoVA operon that is suggested to act in the transport of dipicolinic acid (DPA) from the mother cell, where DPA is synthesized, to the forespore, a process essential to sporulation. Members of this protein family are found, so far, in exactly those species believed capable of endospore formation. [Cellular processes, Sporulation and germination] 327
19306 131893 TIGR02846 spore_sigmaK RNA polymerase sigma-K factor. The sporulation-specific transcription factor sigma-K (also called sigma-27) is expressed in the mother cell compartment of endospore-forming bacteria such as Bacillus subtilis. Like its close homolog sigma-E (sigma-29) (see TIGR02835), also specific to the mother cell compartment, it must be activated by a proteolytic cleavage. Note that in Bacillus subtilis (and apparently also Clostridium tetani), but not in other endospore forming species such as Bacillus anthracis, the sigK gene is generated by a non-germline (mother cell only) chromosomal rearrangement that recombines coding regions for the N-terminal and C-terminal regions of sigma-K. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 227
19307 131894 TIGR02847 CyoD cytochrome o ubiquinol oxidase subunit IV. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport] 96
19308 131895 TIGR02848 spore_III_AC stage III sporulation protein AC. Members of this protein family are designated SpoIIIAC, part of the spoIIIA operon of sporulation genes whose mutant phenotype is linked to sporulation stage III. Members of this family are encoded by the genome of a species if and only if that species is capable of endospore formation, as in Bacillus subtilis. The molecular function of this small, probable integral membrane protein is unknown. [Cellular processes, Sporulation and germination] 64
19309 131896 TIGR02849 spore_III_AD stage III sporulation protein AD. Members of this family are the uncharacterized protein SpoIIIAD, part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. Note that the start sites of members of this family as annotated tend to be variable; quite a few members have apparent homologous protein-coding regions continuing upstream of the first available start codon. The length of the alignment and the scoring cutoff thresholds for the model have been set to try to detect all valid members of the family, even if annotation of the start site begins too far downstream. [Cellular processes, Sporulation and germination] 101
19310 131897 TIGR02850 spore_sigG RNA polymerase sigma-G factor. Members of this family comprise the Firmicutes lineage endospore formation-specific sigma factor SigG. It is also designated stage III sporulation protein G (SpoIIIG). This protein is rather closely related to sigma-F (SpoIIAC), another sporulation sigma factor. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 254
19311 131898 TIGR02851 spore_V_T stage V sporulation protein T. Members of this protein family are the stage V sporulation protein T (SpoVT), a protein of the sporulation/germination program in Bacillus subtilis and related species. The amino-terminal 50 amino acids are nearly perfectly conserved across all endospore-forming bacteria. SpoVT is a DNA-binding transcriptional regulator related to AbrB (See pfam04014). [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination] 180
19312 188255 TIGR02852 spore_dpaB dipicolinic acid synthetase, B subunit. Members of this family represent the B subunit of dipicolinic acid synthetase, an enzyme that synthesizes a small molecule that appears to confer heat stability to bacterial endospores such as those of Bacillus subtilis. The A and B subunits are together in what was originally designated the spoVF locus for stage V of endospore formation. [Cellular processes, Sporulation and germination] 187
19313 131900 TIGR02853 spore_dpaA dipicolinic acid synthetase, A subunit. This predicted Rossmann fold-containing protein is the A subunit of dipicolinic acid synthetase as found in most, though not all, endospore-forming low-GC Gram-positive bacteria; it is absent in Clostridium. The B subunit is represented by TIGR02852. This protein is also known as SpoVFA. [Cellular processes, Sporulation and germination] 287
19314 131901 TIGR02854 spore_II_GA sigma-E processing peptidase SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria. [Cellular processes, Sporulation and germination] 288
19315 274322 TIGR02855 spore_yabG sporulation peptidase YabG. Members of this family are the protein YabG, demonstrated for Bacillus subtilis to be an endopeptidase able to release N-terminal peptides from a number of sporulation proteins, including CotT, CotF, and SpoIVA. It appears to be expressed under control of sigma-K. [Cellular processes, Sporulation and germination] 283
19316 131903 TIGR02856 spore_yqfC sporulation protein YqfC. This small protein, designated YqfC in Bacillus subtilis, is both restricted to and universal in sporulating species of the Firmicutes, such as Bacillus subtilis and Clostridium perfringens. It is part of the sigma(E)-controlled regulon, and its mutation leads to a sporulation defect. [Cellular processes, Sporulation and germination] 85
19317 274323 TIGR02857 CydD thiol reductant ABC exporter, CydD subunit. The gene pair cydCD encodes an ABC-family transporter in which each gene contains an N-terminal membrane-spanning domain (pfam00664) and a C-terminal ATP-binding domain (pfam00005). In E. coli these genes were discovered as mutants which caused the terminal heme-copper oxidase complex cytochrome bd to fail to assemble. Recent work has shown that the transporter is involved in export of redox-active thiol compounds such as cysteine and glutathione. The linkage to assembly of the cytochrome bd complex is further supported by the conserved operon structure found outside the gammaproteobacteria (cydABCD) containing both the transporter and oxidase gene components. The genes used as the seed members for this model are all either found in the gammaproteobacterial context or the CydABCD context. All members of this family scoring above trusted at the time of its creation were from genomes which encode a cytochrome bd complex. Unfortunately, the gene symbol nomenclature adopted based on this operon in B. subtilis assigns cydC to the third gene in the operon where this gene is actually homologous to the E. coli cydD gene. We have chosen to name all homologs in this family in accordance with the precedence of publication of the E. coli name, CydD. 529
19318 274324 TIGR02858 spore_III_AA stage III sporulation protein AA. Members of this protein family are the stage III sporulation protein AA, encoded by one of several genes in the spoIIIA locus. It seems that this protein is found in a species if and only if that species is capable of endospore formation. [Cellular processes, Sporulation and germination] 270
19319 131906 TIGR02859 spore_sigH RNA polymerase sigma-H factor. Members of this protein family are RNA polymerase sigma-H factor for sporulation in endospore-forming bacteria. This protein is also called Sigma-30 and SigH. Although rather close homologs in Listeria score well against this model, Listeria does not form spores and the role of the related sigma factor in that genus is in doubt. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 198
19320 274325 TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else. [Cellular processes, Sporulation and germination] 402
19321 274326 TIGR02861 SASP_H small acid-soluble spore protein, H-type. This model is derived from pfam08141 but has been expanded to include in the seed corresponding proteins from three species of Clostridium. Members of this family should occur only in endospore-forming bacteria, typically with two members per genome, but may be absent from the genomes of some endospore-forming bacteria. SspH (previously designated YfjU) was shown to be expressed specifically in spores of Bacillus subtilis. [Cellular processes, Sporulation and germination] 58
19322 163046 TIGR02862 spore_BofA pro-sigmaK processing inhibitor BofA. Members of this protein family are found only in endospore-forming bacteria, such as Bacillus subtilis and Clostridium tetani. Among such bacteria, it appears only Symbiobacterium thermophilum lacks a member of this family. The protein, designated BofA, is an integral membrane protein that regulates the proteolytic activation of the RNA polymerase sigma factor K. [Cellular processes, Sporulation and germination] 83
19323 131910 TIGR02863 spore_sspJ small, acid-soluble spore protein, SspJ. New small, acid-soluble proteins unique to spores of Bacillus subtilis. [Cellular processes, Sporulation and germination] 47
19324 274327 TIGR02864 spore_sspO small, acid-soluble spore protein O. This model represents a minor (low-abundance) spore protein, designated SspO. It is found in a very limited subset of the already small group of endospore-forming bacteria, but these species include Oceanobacillus iheyensis, Geobacillus kaustophilus, Bacillus subtilis, B. halodurans, and B. cereus. This protein was previously called CotK. [Cellular processes, Sporulation and germination] 50
19325 274328 TIGR02865 spore_II_E stage II sporulation protein E. Stage II sporulation protein E (SpoIIE) is a multiple membrane spanning protein with two separable functions. It plays a role in the switch to polar cell division during sporulation. By means of its protein phosphatase activity, located in the C-terminal region, it activates sigma-F. All proteins that score above the trusted cutoff to this model are found in endospore-forming Gram-positive bacteria. Surprisingly, a sequence from the Cyanobacterium-like (and presumably non-spore-forming) photosynthesizer Heliobacillus mobilis is homologous, and scores between the trusted and noise cutoffs. [Cellular processes, Sporulation and germination] 764
19326 274329 TIGR02866 CoxB cytochrome c oxidase, subunit II. Cytochrome c oxidase is the terminal electron acceptor of mitochondria (and one of several possible acceptors in prokaryotes) in the electron transport chain of aerobic respiration. The enzyme couples the oxidation of reduced cytochrome c with the reduction of molecular oxygen to water. This process results in the pumping of four protons across the membrane which are used in the proton gradient powered synthesis of ATP. The oxidase contains two heme a cofactors and three copper atoms as well as other bound ions. [Energy metabolism, Electron transport] 199
19327 274330 TIGR02867 spore_II_P stage II sporulation protein P. Stage II sporulation protein P is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIIP, along with SpoIIM and SpoIID, is one of three major proteins involved in engulfment of the forespore by the mother cell. This protein family is named for the single member in Bacillus subtilis, although most sporulating bacteria have two members. [Cellular processes, Sporulation and germination] 196
19328 274331 TIGR02868 CydC thiol reductant ABC exporter, CydC subunit. The gene pair cydCD encodes an ABC-family transporter in which each gene contains an N-terminal membrane-spanning domain (pfam00664) and a C-terminal ATP-binding domain (pfam00005). In E. coli these genes were discovered as mutants which caused the terminal heme-copper oxidase complex cytochrome bd to fail to assemble. Recent work has shown that the transporter is involved in export of redox-active thiol compounds such as cysteine and glutathione. The linkage to assembly of the cytochrome bd complex is further supported by the conserved operon structure found outside the gammaproteobacteria (cydABCD) containing both the transporter and oxidase gene components. The genes used as the seed members for this model are all either found in the gammaproteobacterial context or the CydABCD context. All members of this family scoring above the trusted cutoff at the time of its creation were from genomes which encode a cytochrome bd complex. 530
19329 213747 TIGR02869 spore_SleB spore cortex-lytic enzyme. Members of this protein family are the spore cortex-lytic enzyme SleB from Bacillus subtilis and other Gram-positive, endospore-forming bacterial species. This protein is stored in an inactive form in the spore and activated during germination. [Cellular processes, Sporulation and germination] 200
19330 274332 TIGR02870 spore_II_D stage II sporulation protein D. Stage II sporulation protein D (SpoIID) is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIID, along with SpoIIM and SpoIIP, is one of three major proteins involved in engulfment of the forespore by the mother cell. [Cellular processes, Sporulation and germination] 338
19331 274333 TIGR02871 spore_ylbJ sporulation integral membrane protein YlbJ. Members of this protein family are found exclusively in Firmicutes (low-GC Gram-positive bacteria) and are known from studies in Bacillus subtilis to be part of the sigma-E regulon. Mutation leads to a sporulation defect, confirming that members of this protein family, YlbJ, are sporulation proteins. This protein appears to be universal among endospore-forming bacteria, but is encoded by a pair of ORFs distant from each other in Symbiobacterium thermophilum IAM14863. [Cellular processes, Sporulation and germination] 362
19332 274334 TIGR02872 spore_ytvI sporulation integral membrane protein YtvI. Three lines of evidence show this protein to be involved in sporulation. First, it is under control of a sporulation-specific sigma factor, sigma-E. Second, mutation leads to a sporulation defect. Third, it is found in exactly those genomes whose bacteria are capable of sporulation, except for being absent in Clostridium acetobutylicum ATCC824. This protein has extensive hydrophobic regions and is likely an integral membrane protein. [Cellular processes, Sporulation and germination] 341
19333 131920 TIGR02873 spore_ylxY probable sporulation protein, polysaccharide deacetylase family. Members of this protein family are most closely related to TIGR02764, a subset of polysaccharide deacetylase family proteins found in a species if and only if the species forms endospores like those of Bacillus subtilis or Clostridium tetani. This family is likewise restricted to spore-formers, but is not universal among them in having sequences with full-length matches to the model. [Energy metabolism, Biosynthesis and degradation of polysaccharides, Cellular processes, Sporulation and germination] 268
19334 131921 TIGR02874 spore_ytfJ sporulation protein YtfJ. Members of this protein family, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if and only if the species is capable of endospore formation. YtfJ was confirmed in spores of Bacillus subtilis; it appears to be expressed in the forespore under control of SigF. [Cellular processes, Sporulation and germination] 125
19335 131922 TIGR02875 spore_0_A sporulation transcription factor Spo0A. Spo0A, the stage 0 sporulation protein A, is a transcription factor critical for the initiation of sporulation. It contains a response regulator receiver domain (pfam00072). In Bacillus subtilis, it works together with response regulator Spo0F and the phosphotransferase Spo0B, both of which are missing from at least some sporulating species and thus not part of the endospore forming bacteria minimal gene set. Spo0A, however, is universal among endospore-forming species. [Cellular processes, Sporulation and germination] 262
19336 274335 TIGR02876 spore_yqfD sporulation protein YqfD. YqfD is part of the sigma-E regulon in the sporulation program of endospore-forming Gram-positive bacteria. Mutation results in a sporulation defect in Bacillus subtilis. Members are found in all currently known endospore-forming bacteria, including the genera Bacillus, Symbiobacterium, Carboxydothermus, Clostridium, and Thermoanaerobacter. [Cellular processes, Sporulation and germination] 382
19337 274336 TIGR02877 spore_yhbH sporulation protein YhbH. This protein family, typified by YhbH in Bacillus subtilis, is found in nearly every endospore-forming bacterium and in no other genome (but note that the trusted cutoff score is set high to exclude a single high-scoring sequence from Nitrosococcus oceani ATCC 19707, which is classified in the Gammaproteobacteria). The gene in Bacillus subtilis was shown to be in the regulon of the sporulation sigma factor, sigma-E, and its mutation was shown to create a sporulation defect. [Cellular processes, Sporulation and germination] 371
19338 131925 TIGR02878 spore_ypjB sporulation protein YpjB. Members of this protein family, YpjB, are restricted to a subset of endospore-forming bacteria, including Bacillus species but not Clostridium or some others. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon, where sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect. This protein family is not, however, a part of the endospore formation minimal gene set. [Cellular processes, Sporulation and germination] 233
19339 200217 TIGR02880 cbbX_cfxQ probable Rubisco expression protein CbbX. Proteins in this family are now designated CbbX. Some previously were CfxQ (carbon fixation Q). Its gene is often found immediately downstream of the Rubisco large and small chain genes, and it is suggested to be necessary for Rubisco expression. CbbX has been shown to be necessary for photoautotrophic growth. This protein belongs to the larger family of pfam00004, ATPase family Associated with various cellular Activities. Within that larger family, members of this family are most closely related to the stage V sporulation protein K, or SpoVK, in endospore-forming bacteria such as Bacillus subtilis. 284
19340 163057 TIGR02881 spore_V_K stage V sporulation protein K. Members of this protein family are the stage V sporulation protein K (SpoVK), a close homolog of the Rubisco expression protein CbbX (TIGR02880) and a member of the ATPase family associated with various cellular activities (pfam00004). Members are strictly limited to bacterial endospore-forming species, but are not universal in this group and are missing from the Clostridium group. [Cellular processes, Sporulation and germination] 261
19341 131928 TIGR02882 QoxB cytochrome aa3 quinol oxidase, subunit I. This family (QoxB) encodes subunit I of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport] 643
19342 274337 TIGR02883 spore_cwlD N-acetylmuramoyl-L-alanine amidase CwlD. Members of this protein family are the CwlD family of N-acetylmuramoyl-L-alanine amidase. This family has been called the germination-specific N-acetylmuramoyl-L-alanine amidase. CwlD is required, along with the putative deacetylase PdaA, to make muramic delta-lactam, a novel peptidoglycan constituent found only in spores. CwlD mutants show a germination defect. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination] 189
19343 131930 TIGR02884 spore_pdaA delta-lactam-biosynthetic de-N-acetylase. Muramic delta-lactam is an unusual constituent of peptidoglycan, found only in bacterial spores in the peptidoglycan wall, or spore cortex. The proteins in this family are PdaA (yfjS), a member of a larger family of polysaccharide deacetylases, and are specifically involved in delta-lactam biosynthesis. PdaA acts immediately after CwlD, an N-acetylmuramoyl-L-alanine amidase, and performs a de-N-acetylation. PdaA may also perform the following transpeptidation for lactam ring formation, as heterologous expression in E. coli of CwlD and PdaA together is sufficient for delta-lactam production. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination] 224
19344 131931 TIGR02885 spore_sigF RNA polymerase sigma-F factor. Members of this protein family are the RNA polymerase sigma factor F. Sigma-F is specifically and universally a component of the Firmicutes lineage endospore formation program, and is expressed in the forespore to turn on expression of dozens of genes. It is closely homologous to sigma-G, which is also expressed in the forespore. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 231
19345 131932 TIGR02886 spore_II_AA anti-sigma F factor antagonist. The anti-sigma F factor antagonist, also called stage II sporulation protein AA, is a protein universal among endospore-forming bacteria, all of which belong to the Firmicutes. [Regulatory functions, Protein interactions, Cellular processes, Sporulation and germination] 106
19346 274338 TIGR02887 spore_ger_x_C germination protein, Ger(x)C family. Members of this protein family are restricted to endospore-forming members of the Firmicutes lineage of bacteria, including the genera Bacillus, Clostridium, Thermoanaerobacter, Carboxydothermus, etc. Members are nearly all predicted lipoproteins and belong to probable transport operons, some of which have been characterized as crucial to germination in response to alanine. Members typically have been given gene symbols gerKC, gerAC, gerYC, etc. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Sporulation and germination] 371
19347 274339 TIGR02888 spore_YlmC_YmxH sporulation protein, YlmC/YmxH family. Members of this family belong to the broader family of PRC-barrel domain proteins (see pfam05239), but are found only in endospore-forming bacteria of the Firmicutes lineage. Most such species have exactly two members of this family and all have at least one; the function is unknown. One of two members from Bacillus subtilis, YmxH, is strongly induced by the mother cell-specific sigma-E factor. [Cellular processes, Sporulation and germination] 76
19348 131935 TIGR02889 spore_YpeB germination protein YpeB. Members of this family are YpeB, a protein usually encoded with the putative spore-cortex-lytic enzyme SleB and required, together with SleB, for normal germination. This family is restricted to endospore-forming species in the Firmicutes lineage of bacteria, and found in all such species to date except Clostridium perfringens. The matching phenotypes of mutants in SleB (called a lytic transglycosylase) and YpeB suggest that YpeB is necessary to allow SleB to function. [Cellular processes, Sporulation and germination] 435
19349 274340 TIGR02890 bacill_yteA regulatory protein, yteA family. Members of this predicted regulatory protein are found only in endospore-forming members of the Firmicutes group of bacteria, and in nearly every such species; Clostridium perfringens seems to be an exception. The member from Bacillus subtilis, the model system for the study of the sporulation program, has been designated both yteA and yzwB. Some (but not all) members of this family show a strong sequence match to Pfam family pfam01258 the C4-type zinc finger protein, DksA/TraR family, but only one of the four key Cys residues is conserved. All members of this protein family share an additional C-terminal domain. Smaller proteins from the proteobacteria with just the N-terminal domain, including DksA and DksA2 are RNA polymerase-binding regulatory proteins even if the Zn-binding site is not conserved. [Unknown function, General] 159
19350 213748 TIGR02891 CtaD_CoxA cytochrome c oxidase, subunit I. This large family represents subunit I's (CtaD, CoxA, CaaA) of cytochrome c oxidases of bacterial origin. Cytochrome c oxidase is the component of the respiratory chain that catalyzes the reduction of oxygen to water. Subunits I-III form the functional core of the enzyme complex. Subunit I is the catalytic subunit of the enzyme. Electrons originating in cytochrome c are transferred via the copper A center of subunit II and heme a of subunit I to the bimetallic center formed by heme a3 and copper B. This cytochrome c oxidase shows proton pump activity across the membrane in addition to the electron transfer. In the bacilli an apparent split (paralogism) has created a sister clade (TIGR02882) encoding subunits (QoxA) of the aa3-type quinone oxidase complex which reacts directly with quinones, bypassing the interaction with soluble cytochrome c. This model attempts to exclude these sequences, placing them between the trusted and noise cutoffs. These families, as well as archaeal and eukaryotic cytochrome c subunit I's are included within the superfamily model, pfam00115. [Energy metabolism, Electron transport] 499
19351 274341 TIGR02892 spore_yabP sporulation protein YabP. Members of this protein family are the YabP protein of the bacterial sporulation program, as found in Bacillus subtilis, Clostridium tetani, and other spore-forming members of the Firmicutes. In Bacillus subtilis, a yabP single mutant appears to sporulate and germinate normally, but the gene is in an operon with yabQ (essential for formation of the spore cortex), is near-universal among endospore-forming bacteria, and is found nowhere else. It is likely, therefore, that YabP does have a function in sporulation or germination, one that is either unappreciated or partially redundant with that of another protein. [Cellular processes, Sporulation and germination] 85
19352 131939 TIGR02893 spore_yabQ spore cortex biosynthesis protein YabQ. YabQ, a protein predicted to span the membrane several times, is found in exactly those genomes whose species perform sporulation in the style of Bacillus subtilis, Clostridium tetani, and others of the Firmicutes. Mutation of this sigma(E)-dependent gene blocks development of the spore cortex. The length of the C-terminal region, including some hydrophobic regions, is rather variable between members. [Cellular processes, Sporulation and germination] 130
19353 274342 TIGR02894 DNA_bind_RsfA transcription factor, RsfA family. In a subset of endospore-forming members of the Firmicutes, members of this protein family are found, several to a genome. Two very strongly conserved sequence regions are separated by a highly variable linker region. Much of the linker region was excised from the seed alignment for this model. A characterized member is the prespore-specific transcription factor RsfA from Bacillus subtilis, previously called YwfN, which is controlled by sigma factor F and seems to fine-tune expression of some genes in the sigma-F regulon. A paralog in Bacillus subtilis is designated YlbO. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination] 161
19354 131941 TIGR02895 spore_sigI RNA polymerase sigma-I factor. Members of this sigma factor protein family are strictly limited to endospore-forming species in the Firmicutes lineage of bacteria, but are not universally present among such species. Sigma-I was shown to be induced by heat shock in Bacillus subtilis and is suggested by its phylogenetic profile to be connected to the program of sporulation. [Transcription, Transcription factors, Cellular processes, Sporulation and germination] 218
19355 131942 TIGR02896 spore_III_AF stage III sporulation protein AF. This family represents the stage III sporulation protein AF of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of this protein is poorly conserved, so only the N-terminal region, which includes two predicted transmembrane domains, is included in the seed alignment. [Cellular processes, Sporulation and germination] 106
19356 131943 TIGR02897 QoxC cytochrome aa3 quinol oxidase, subunit III. This family (QoxC) encodes subunit III of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport] 190
19357 131944 TIGR02898 spore_YhcN_YlaJ sporulation lipoprotein, YhcN/YlaJ family. YhcN and YlaJ are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic 40-residue C-terminal domain that is not included in the seed alignment for this model. A portion of the low-complexity region between the lipoprotein signal sequence and the main conserved region of the protein family was also excised from the seed alignment. [Cellular processes, Sporulation and germination] 158
19358 131945 TIGR02899 spore_safA spore coat assembly protein SafA. SafA (YrbB) of Bacillus subtilis is a protein found at the interface of the spore cortex and spore coat, and is dependent on SpoVID for its localization. This model is based on the N-terminal LysM (lysin motif) domain (see Pfam model pfam01476) of SafA and, from several other spore-forming species, the protein with the most similar N-terminal region. However, this set of proteins differs greatly C-terminal to the LysM domain; blocks of 12-residue and 13-residue repeats are found in the Bacillus cereus group, tandem LysM domains in Thermoanaerobacter tengcongensis MB4, etc. in which one of which is found in most examples of endospore-forming bacteria. [Cellular processes, Sporulation and germination] 44
19359 274343 TIGR02900 spore_V_B stage V sporulation protein B. SpoVB is the stage V sporulation protein B of the bacterial endospore formation program in Bacillus subtilis and various other Firmicutes. It is nearly universal among endospore-formers. Paralogs with rather high sequence similarity to SpoVB exist, including YkvU in B. subtilis and a number of proteins in the genus Clostridium. [Cellular processes, Sporulation and germination] 488
19360 200218 TIGR02901 QoxD cytochrome aa3 quinol oxidase, subunit IV. This family (QoxD) encodes subunit IV of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport] 94
19361 131948 TIGR02902 spore_lonB ATP-dependent protease LonB. Members of this protein family are LonB, a paralog of the ATP-dependent protease La (LonA, TIGR00763). LonB proteins are found strictly, and almost universally, in endospore-forming bacteria. This protease was shown, in Bacillus subtilis, to be expressed specifically in the forespore, during sporulation, under control of sigma(F). The lonB gene, despite location immediately upstream of lonA, was shown to be monocistronic. LonB appears able to act on sigma(H) for post-translation control, but lonB mutation did not produce an obvious sporulation defect under the conditions tested. Note that additional paralogs of LonA and LonB occur in the Clostridium lineage and this model selects only one per species as the protein that corresponds to LonB in B. subtilis. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination] 531
19362 274344 TIGR02903 spore_lon_C ATP-dependent protease, Lon family. Members of this protein family resemble the widely distributed ATP-dependent protease La, also called Lon and LonA. It resembles even more closely LonB, which is a LonA paralog found in genomes if and only if the species is capable of endospore formation (as in Bacillus subtilis, Clostridium tetani, and select other members of the Firmicutes) and expressed specifically in the forespore compartment. Members of this family are restricted to a subset of spore-forming species, and are very likely to participate in the program of endospore formation. We propose the designation LonC. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination] 615
19363 274345 TIGR02904 spore_ysxE spore coat protein YsxE. Members of this family are homologs of the Bacillus subtilis spore coat protein CotS. Members of this family, designated YsxE, are found only in the family Bacillaceae, from among the endospore-forming members of the Firmicutes branch of the Bacteria. As a rule, the ysxE gene is found immediately downstream of spoVID, a gene necessary for spore coat assembly. The protein has been shown to be part of the spore coat. [Cellular processes, Sporulation and germination] 309
19364 131951 TIGR02905 spore_yutH spore coat protein YutH. Members of this family are homologs of the Bacillus subtilis spore coat protein CotS. Members of this family, designated YutH, are found only in the family Bacillaceae from among the endospore-forming members of the Firmicutes branch of the Bacteria. [Cellular processes, Sporulation and germination] 313
19365 131952 TIGR02906 spore_CotS spore coat protein, CotS family. Members of this family include the spore coat proteins CotS and YtaA from Bacillus subtilis and, from other endospore-forming bacteria, homologs that are more closely related to these two than to the spore coat proteins YutH and YsxE. The CotS family is more broadly distributed than YutH or YsxE, but still is not universal among spore-formers. [Cellular processes, Sporulation and germination] 313
19366 274346 TIGR02907 spore_VI_D stage VI sporulation protein D. SpoVID, the stage VI sporulation protein D, is restricted to endospore-forming members of the bacteria, all of which are found among the Firmicutes. It is widely distributed but not quite universal in this group. Between well-conserved N-terminal and C-terminal domains is a poorly conserved, low-complexity region of variable length, rich enough in glutamic acid to cause spurious BLAST search results unless a filter is used. The seed alignment for this model was trimmed, in effect, by choosing member sequences in which these regions are relatively short. SpoVID is involved in spore coat assembly by the mother cell compartment late in the process of sporulation. [Cellular processes, Sporulation and germination] 338
19367 131954 TIGR02908 CoxD_Bacillus cytochrome c oxidase, subunit IVB. This model represents a small clade of cytochrome oxidase subunit IV's found in the Bacilli. [Energy metabolism, Electron transport] 110
19368 131955 TIGR02909 spore_YkwD uncharacterized protein, YkwD family. Members of this protein family represent a subset of those belonging to pfam00188 (SCP-like extracellular protein). Based on current cutoffs for this model, all member proteins are found in Bacteria capable of endospore formation. Members include a named but uncharacterized protein, YkwD of Bacillus subtilis. Only the C-terminal region is well-conserved and is included in the seed alignment for this model. Three members of this family have an N-terminal domain homologous to the spore coat assembly protein SafA. 127
19369 131956 TIGR02910 sulfite_red_A sulfite reductase, subunit A. Members of this protein family include the A subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substrates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism] 334
19370 131957 TIGR02911 sulfite_red_B sulfite reductase, subunit B. Members of this protein family include the B subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. [Central intermediary metabolism, Sulfur metabolism] 261
19371 131958 TIGR02912 sulfite_red_C sulfite reductase, subunit C. Members of this protein family include the C subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substrates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism] 314
19372 131959 TIGR02913 HAF_rpt probable extracellular repeat, HAF family. The model for this family detects a homology domain of about 40 amino acids. Member proteins always have at least two tandem copies and as many as seven. The spacing between repeats as defined here usually is four residues exactly. This repeat is named for a tripeptide motif HAF found in most members. Some member proteins are found in species with no outer membrane (archaea and Gram-positive bacteria) while others have C-terminal autotransporter domains that suggest that the repeat region is transported across the outer membrane. This domain seems likely to be an extracellular protein repeat. 39
19373 274347 TIGR02914 EpsI_fam EpsI family protein. In Methylobacillus sp strain 12S, EpsI is encoded immediately downstream of the multiple-membrane-spanning putative transporter EpsH, and is predicted to be a periplasmic protein involved in, but not required for, expression of the exopolysaccharide methanolan. In a number of other species, protein homologous to EpsI is encoded either next to EpsH or, more often, combined in a fused gene. We have proposed renaming EpsH, or the EpsHI fusion protein, to exosortase, based on its phylogenetic association with the PEP-CTERM proposed protein targeting signal. [Transport and binding proteins, Unknown substrate] 174
19374 274348 TIGR02915 PEP_resp_reg PEP-CTERM-box response regulator transcription factor. Members of this protein family share full-length homology with (but do not include) the acetoacetate metabolism regulatory protein AtoC (see SP|Q06065). These proteins have a Fis family DNA binding sequence (pfam02954), a response regulator receiver domain (pfam00072), and sigma-54 interaction domain (pfam00158). [Regulatory functions, DNA interactions] 445
19375 274349 TIGR02916 PEP_his_kin putative PEP-CTERM system histidine kinase. Members of this protein family have a novel N-terminal domain, a single predicted membrane-spanning helix, and a predicted cytosolic histidine kinase domain. We designate this protein PrsK, and its companion DNA-binding response regulator protein (TIGR02915) PrsR. These predicted signal-transducing proteins appear to enable enhancer-dependent transcriptional activation. The prsK gene is often associated with exopolysaccharide biosynthesis genes. [Protein fate, Protein and peptide secretion and trafficking, Signal transduction, Two-component systems] 679
19376 274350 TIGR02917 PEP_TPR_lipo putative PEP-CTERM system TPR-repeat lipoprotein. This protein family occurs strictly within a subset of Gram-negative bacterial species with the proposed PEP-CTERM/exosortase system, analogous to the LPXTG/sortase system common in Gram-positive bacteria. This protein occurs in a species if and only if a transmembrane histidine kinase (TIGR02916) and a DNA-binding response regulator (TIGR02915) also occur. The presence of tetratricopeptide repeats (TPR) suggests protein-protein interaction, possibly for the regulation of PEP-CTERM protein expression, since many PEP-CTERM proteins in these genomes are preceded by a proposed DNA binding site for the response regulator. 899
19377 131964 TIGR02918 TIGR02918 accessory Sec system glycosylation protein GtfA. Members of this protein family are found only in Gram-positive bacteria of the Firmicutes lineage, including several species of Staphylococcus, Streptococcus, and Lactobacillus. Members are associated with glycosylation of serine-rich glycoproteins exported by the accessory Sec system. [Protein fate, Protein modification and repair] 500
19378 274351 TIGR02919 TIGR02919 accessory Sec system glycosyltransferase GtfB. Members of this protein family are found only in Gram-positive bacteria of the Firmicutes lineage, including several species of Staphylococcus, Streptococcus, and Lactobacillus. [Protein fate, Protein modification and repair] 438
19379 131966 TIGR02920 acc_sec_Y2 accessory Sec system translocase SecY2. Members of this family are restricted to the Firmicutes lineage (low-GC Gram-positive bacteria) and appear to be paralogous to, and much more divergent than, the preprotein translocase SecY. Members include the SecY2 protein of the accessory Sec system in Streptococcus gordonii, involved in export of the highly glycosylated platelet-binding protein GspB. [Protein fate, Protein and peptide secretion and trafficking] 395
19380 131967 TIGR02921 PEP_integral PEP-CTERM family integral membrane protein. Members of this protein family, found in eighteen genera so far, have a PEP-CTERM sequence at the carboxyl-terminus (see model TIGR02595), but are unusual among PEP-CTERM proteins in having multiple predicted transmembrane segments. The function is unknown. It is proposed that an exosortase (see TIGR02602), recognizes and cleaves PEP-CTERM proteins in a manner analogous to the cleavage of LPXTG proteins by sortase (see Haft, et al., 2006). In at least six species, a gene encoding what appears to be a dedicated (single target) exosortase is adjacent. In that subset, the PEP-CTERM motif takes the form VPEPxxWxL. 952
19381 131968 TIGR02922 TIGR02922 TIGR02922 family protein. Two members of this family are found in Colwellia psychrerythraea 34H and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by TIGR02595. [Hypothetical proteins, Conserved] 67
19382 274352 TIGR02923 AhaC ATP synthase A1, C subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The C subunit is part of the hydrophilic A1 "stalk" complex (AhaABCDEFG), which is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex. 343
19383 274353 TIGR02924 ICDH_alpha isocitrate dehydrogenase. This family of mainly alphaproteobacterial enzymes is a member of the isocitrate/isopropylmalate dehydrogenase superfamily described by pfam00180. Every member of the seed of this model appears to have a TCA cycle lacking only a determined isocitrate dehydrogenase. The precise identity of the cofactor (NADH -- 1.1.1.41 vs. NADPH -- 1.1.1.42) is unclear. [Energy metabolism, TCA cycle] 473
19384 131971 TIGR02925 cis_trans_EpsD peptidyl-prolyl cis-trans isomerase, EpsD family. Members of this family belong to the peptidyl-prolyl cis-trans isomerase family and are found in loci associated with exopolysaccharide biosynthesis. All members are encoded near a homolog of EpsH, as detected by TIGR02602. 232
19385 131972 TIGR02926 AhaH ATP synthase archaeal, H subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The hydrophilic A1 "stalk" complex (AhaABCDEFG) is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex. It is unclear precisely where AhaH fits into these complexes. 85
19386 200219 TIGR02927 SucB_Actino 2-oxoglutarate dehydrogenase, E2 component, dihydrolipoamide succinyltransferase. This model represents an Actinobacterial clade of E2 enzyme, a component of the 2-oxoglutarate dehydrogenase complex involved in the TCA cycle. These proteins have multiple domains including the catalytic domain (pfam00198), one or two biotin domains (pfam00364) and an E3-component binding domain (pfam02817). 579
19387 274354 TIGR02928 TIGR02928 orc1/cdc6 family replication initiation protein. Members of this protein family are found exclusively in the archaea. This set of DNA binding proteins shows homology to the origin recognition complex subunit 1/cell division control protein 6 family in eukaryotes. Several members may be found in a genome and interact with each other. [DNA metabolism, DNA replication, recombination, and repair] 365
19388 131975 TIGR02929 anfG_nitrog Fe-only nitrogenase, delta subunit. Nitrogenase, also called dinitrogenase, is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, AnfG, represents the delta subunit of the Fe-only alternative nitrogenase. It is homologous to VnfG, the delta subunit of the V-containing (vanadium) nitrogenase. [Central intermediary metabolism, Nitrogen fixation] 109
19389 131976 TIGR02930 vnfG_nitrog V-containing nitrogenase, delta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfG, represents the delta subunit of the V-containing (vanadium) alternative nitrogenase. It is homologous to AnfG, the delta subunit of the Fe-only nitrogenase. [Central intermediary metabolism, Nitrogen fixation] 109
19390 131977 TIGR02931 anfK_nitrog Fe-only nitrogenase, beta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, AnfK, represents the beta subunit of the iron-only alternative nitrogenase. It is homologous to NifK and VnfK, of the molybdenum-containing and the vanadium (V)-containing types, respectively. [Central intermediary metabolism, Nitrogen fixation] 461
19391 131978 TIGR02932 vnfK_nitrog V-containing nitrogenase, beta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfK, represents the beta subunit of the vanadium (V)-containing alternative nitrogenase. It is homologous to NifK and AnfK, of the molybdenum-containing and the iron (Fe)-only types, respectively. [Central intermediary metabolism, Nitrogen fixation] 457
19392 131979 TIGR02933 nifM_nitrog nitrogen fixation protein NifM. Members of this protein family, found in a subset of nitrogen-fixing bacteria, are the nitrogen fixation protein NifM. NifM, homologous to peptidyl-prolyl cis-trans isomerases, appears to be an accessory protein for NifH, the Fe protein, also called component II or dinitrogenase reductase, of nitrogenase. [Central intermediary metabolism, Nitrogen fixation] 256
19393 274355 TIGR02934 nifT_nitrog probable nitrogen fixation protein FixT. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein. 68
19394 131981 TIGR02935 TIGR02935 probable nitrogen fixation protein. Members of this protein family, called DUF269 by pfam03270, are strictly limited to nitrogen-fixing species, although not universal among them. The gene typically is found next to the nifX gene (see TIGRFAMs model TIGR02663). [Central intermediary metabolism, Nitrogen fixation] 140
19395 274356 TIGR02936 fdxN_nitrog ferredoxin III, nif-specific. Members of this family are homodimeric ferredoxins from nitrogen fixation regions of many nitrogen-fixing bacteria. As characterized in Rhodobacter capsulatus, these proteins are homodimeric, with two 4Fe-4S clusters bound per monomer. Although nif-specific, this protein family is not universal, as other nitrogenase systems may substitute flavodoxins, or different types of ferredoxin. [Central intermediary metabolism, Nitrogen fixation] 91
19396 274357 TIGR02937 sigma70-ECF RNA polymerase sigma factor, sigma-70 family. This model encompasses all varieties of the sigma-70 type sigma factors including the ECF subfamily. A number of sigma factors have names with a different number than 70 (i.e. sigma-38), but in fact, all except for the Sigma-54 family (TIGR02395) are included within this family. Several Pfam models hit segments of these sequences including Sigma-70 region 2 (pfam04542) and Sigma-70, region 4 (pfam04545), but not always above their respective trusted cutoffs. 158
19397 131984 TIGR02938 nifL_nitrog nitrogen fixation negative regulator NifL. NifL is a modulator of the nitrogen fixation positive regulator protein NifA, and is therefore a negative regulator. It binds NifA. NifA and NifL are encoded by adjacent genes. [Central intermediary metabolism, Nitrogen fixation, Regulatory functions, Protein interactions] 494
19398 131985 TIGR02939 RpoE_Sigma70 RNA polymerase sigma factor RpoE. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoE. This protein may be called sigma-24, sigma-E factor, sigma-H factor, fecI-like sigma factor or alternative sigma factor AlgU. 190
19399 274358 TIGR02940 anfO_nitrog Fe-only nitrogenase accessory protein AnfO. Members of this protein family, called Anf1 in Rhodobacter capsulatus and AnfO in Azotobacter vinelandii, are found only in species with the Fe-only nitrogenase and are encoded immediately downstream of the structural genes in the above named species. 214
19400 131987 TIGR02941 Sigma_B RNA polymerase sigma-B factor. This sigma factor is restricted to certain lineages of the order Bacillales including Staphylococcus, Listeria, and Bacillus. 255
19401 274359 TIGR02943 Sig70_famx1 RNA polymerase sigma-70 factor, TIGR02943 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. 188
19402 131989 TIGR02944 suf_reg_Xantho FeS assembly SUF system regulator, gammaproteobacterial. The SUF system is an oxygen-resistant iron-sulfur cluster assembly system found in both aerobes and facultative anaerobes. Its presence appears to be a marker of oxygen tolerance; strict anaerobes and microaerophiles tend to have different FeS cluster biosynthesis systems. Members of this protein family belong to the rrf2 family of transcriptional regulators and are found, typically, as the first gene of a SUF operon. It is found only in a subset of genomes that encode the SUF system, including the genus Xanthomonas. The conserved location suggests an autoregulatory role. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions] 130
19403 131990 TIGR02945 SUF_assoc FeS assembly SUF system protein. Members of this family belong to the broader pfam01883, or Domain of Unknown Function DUF59. Many members of DUF59 are candidate ring hydroxylating complex subunits. However, members of the narrower family defined here all are found in genomes that carry the FeS assembly SUF system. For 70 % of these species, the member of this protein family is found as part of the SUF locus, usually immediately downstream of the sufS gene. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 99
19404 274360 TIGR02946 acyl_WS_DGAT acyltransferase, WS/DGAT/MGAT. This bacteria-specific protein family includes a characterized, homodimeric, broad specificity acyltransferase from Acinetobacter sp. strain ADP1, active as wax ester synthase, as acyl coenzyme A:diacylglycerol acyltransferase, and as acyl-CoA:monoacylglycerol acyltransferase. [Unknown function, Enzymes of unknown specificity] 446
19405 131992 TIGR02947 SigH_actino RNA polymerase sigma-70 factor, TIGR02947 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and (with the exception of a paralog in Thermobifida fusca YX) one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria and each gene examined is followed by an anti-sigma factor in an apparent operon. 193
19406 131993 TIGR02948 SigW_bacill RNA polymerase sigma-W factor. This sigma factor is restricted to certain lineages of the order Bacillales. 187
19407 188261 TIGR02949 anti_SigH_actin anti-sigma factor, TIGR02949 family. This group of anti-sigma factors are associated in an apparent operon with a family of sigma-70 family sigma factors (TIGR02947). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria. [Transcription, Transcription factors] 84
19408 274361 TIGR02950 SigM_subfam RNA polymerase sigma factor, SigM family. This family of RNA polymerase sigma factors is a member of the Sigma-70 subfamily (TIGR02937) and is restricted to certain lineages of the order Bacillales. This family encompasses at least two distinct sigma factors as two proteins are found in each of B. anthracis, B. subtilis subsp. subtilis str. 168, and B. lichiniformis (although these are not apparently the same two in each). One of these is designated as SigM in B. subtilis (Swiss_Prot: 154
19409 131996 TIGR02951 DMSO_dmsB DMSO reductase, iron-sulfur subunit. This family consists of the iron-sulfur subunit, or chain B, of an enzyme called the anaerobic dimethyl sulfoxide reductase. Chains A and B are catalytic, while chain C is a membrane anchor. 161
19410 131997 TIGR02952 Sig70_famx2 RNA polymerase sigma-70 factor, TIGR02952 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is found in a limited number of Gram-positive bacterial lineages. 170
19411 131998 TIGR02953 penta_MxKDx pentapeptide MXKDX repeat protein. Members of this protein family are small bacterial proteins, each with an N-terminal signal sequence followed by up to 11 imperfect repeats of a pentapeptide. The pentapeptide repeat usually follows the form Met-Xaa-Lys-Asp-Xaa. 75
19412 213752 TIGR02954 Sig70_famx3 RNA polymerase sigma-70 factor, TIGR02954 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is found in certain Bacillus and Clostridium species. 169
19413 132000 TIGR02955 TMAO_TorT TMAO reductase system periplasmic protein TorT. Members of this family are the periplasmic protein TorT which, together with the TorS/TorR histidine kinase/response regulator system, regulates expression of the torCAD operon for trimethylamine N-oxide reductase (TMAO reductase). It appears to bind an inducer for TMAO reductase, and shows homology to a periplasmic D-ribose binding protein. 295
19414 274362 TIGR02956 TMAO_torS TMAO reductase system sensor TorS. This protein, TorS, is part of a regulatory system for the torCAD operon that encodes the pterin molybdenum cofactor-containing enzyme trimethylamine-N-oxide (TMAO) reductase (TorA), a cognate chaperone (TorD), and a penta-haem cytochrome (TorC). TorS works together with the inducer-binding protein TorT and the response regulator TorR. TorS contains histidine kinase ATPase (pfam02518), HAMP (pfam00672), phosphoacceptor (pfam00512), and phosphotransfer (pfam01627) domains and a response regulator receiver domain (pfam00072). [Signal transduction, Two-component systems] 968
19415 132002 TIGR02957 SigX4 RNA polymerase sigma-70 factor, TIGR02957 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building and bidirectional best hits, to represent a conserved family. This family is found in a limited number of bacterial lineages. This family includes apparent paralogous expansion in Streptomyces coelicolor A3(2), and multiple copies in Mycobacterium smegmatis MC2, Streptomyces avermitilis MA-4680 and Nocardia farcinica IFM10152. 281
19416 132004 TIGR02959 SigZ RNA polymerase sigma factor, SigZ family. This family of RNA polymerase sigma factors is a member of the Sigma-70 subfamily (TIGR02937). One of these is designated as SigZ in B. subtilis (Swiss_Prot: SIGZ_BACSU). Interestingly, this group has a very sporadic distribution, B. subtilis, for instance, being the only sequenced strain of Bacilli with a member. Dechloromonas aromatica RCB appears to have two of these sigma factors. A member appears on a plasmid found in Photobacterium profundum SS9 and Vibrio fischeri ES114 (where a second one is chromosomally encoded). 170
19417 132005 TIGR02960 SigX5 RNA polymerase sigma-70 factor, TIGR02960 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. 324
19418 274363 TIGR02961 allantoicase allantoicase. Members of this family are the enzyme allantoicase (EC 3.5.3.4), also called allantoate amidinohydrolase. This enzyme hydrolyzes allantoate to (S)-ureidoglycolate and urea; it can also degrade (R)-ureidoglycolate to glyoxylate and urea. Allantoinase (EC 3.5.2.5) hydrolyzes (S)-allantoin (a xanthine metabolite, via urate) to allantoate. Allantoate can then be degraded either by this enzyme, allantoicase, or by allantoate deiminase (EC 3.5.3.9). Members of the seed alignment for this model were taken from BRENDA. Proteins in this family contain two copies of the allantoicase repeat (pfam03561). A different but similarly named enzyme, allantoate amidohydrolase (EC 3.5.3.9), simultaneously breaks down the urea to ammonia and carbon dioxide. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other] 322
19419 274364 TIGR02962 hdxy_isourate hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 112
19420 274365 TIGR02963 xanthine_xdhA xanthine dehydrogenase, small subunit. Members of this protein family are the small subunit (or, in eukaryotes, the N-terminal domain) of xanthine dehydrogenase, an enzyme of purine catabolism via urate. The small subunit contains both an FAD and a 2Fe-2S cofactor. Aldehyde oxidase (retinal oxidase) appears to have arisen as a neofunctionalization among xanthine dehydrogenases in eukaryotes and [Purines, pyrimidines, nucleosides, and nucleotides, Other] 467
19421 274366 TIGR02964 xanthine_xdhC xanthine dehydrogenase accessory protein XdhC. Members of this protein family are the accessory protein XdhC for insertion of the molybdenum cofactor into the xanthine dehydrogenase large chain, XdhB, in bacteria. This protein is not part of the mature xanthine dehydrogenase. Xanthine dehydrogenase is an enzyme for purine catabolism, from other purines to xanthine to urate to further breakdown products. [Protein fate, Protein folding and stabilization, Purines, pyrimidines, nucleosides, and nucleotides, Other] 246
19422 274367 TIGR02965 xanthine_xdhB xanthine dehydrogenase, molybdopterin binding subunit. Members of the protein family are the molybdopterin-containing large subunit (or, in eukaryotes, the molybdopterin-binding domain) of xanthine dehydrogenase, an enzyme that reduces the purine pool by catabolizing xanthine to urate. This model is based primarily on bacterial sequences; it does not manage to include all eukaryotic xanthine dehydrogenases and thereby discriminate them from the closely related enzyme aldehyde dehydrogenase. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 758
19423 274368 TIGR02966 phoR_proteo phosphate regulon sensor kinase PhoR. Members of this protein family are the regulatory histidine kinase PhoR associated with the phosphate ABC transporter in most Proteobacteria. Related proteins from Gram-positive organisms are not included in this model. The phoR gene usually is adjacent to the response regulator phoB gene (TIGR02154). [Signal transduction, Two-component systems] 333
19424 132012 TIGR02967 guan_deamin guanine deaminase. This model describes guanine deaminase, which hydrolyzes guanine to xanthine and ammonia. Xanthine can then be converted to urate by xanthine dehydrogenase, and urate subsequently degraded. In some bacteria, the guanine deaminase gene is found near the xdhABC genes for xanthine dehydrogenase. Non-homologous forms of guanine deaminase also exist, as well as distantly related forms outside the scope of this model. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 401
19425 274369 TIGR02968 succ_dehyd_anc succinate dehydrogenase, hydrophobic membrane anchor protein. In E. coli and many other bacteria, two small, hydrophobic, mutually homologous subunits of succinate dehydrogenase, a TCA cycle enzyme, are SdhC and SdhD. This family is the SdhD, the hydrophobic membrane anchor protein. SdhC is apocytochrome b558, which also plays a role in anchoring the complex. [Energy metabolism, TCA cycle] 105
19426 132014 TIGR02969 mam_aldehyde_ox aldehyde oxidase. Members of this family are mammalian aldehyde oxidase (EC 1.2.3.1) isozymes, closely related to xanthine dehydrogenase/oxidase. 1330
19427 274370 TIGR02970 succ_dehyd_cytB succinate dehydrogenase, cytochrome b556 subunit. In E. coli and many other bacteria, two small, hydrophobic, mutually homologous subunits of succinate dehydrogenase, a TCA cycle enzyme, are SdhC and SdhD. This family is the SdhC, the cytochrome b subunit, called b556 in bacteria and b560 in mitochondria. SdhD (see TIGR02968) is called the hydrophobic membrane anchor subunit, although both SdhC and SdhD participate in anchoring the complex. In some bacteria, this cytochrome b subunit is replaced by a member of the cytochrome b558 family (see TIGR02046). [Energy metabolism, TCA cycle] 120
19428 213754 TIGR02971 heterocyst_DevB ABC exporter membrane fusion protein, DevB family. Members of this protein family are found mostly in the Cyanobacteria, but also in the Planctomycetes. DevB from Anabaena sp. strain PCC 7120 is partially characterized as a membrane fusion protein of the DevBCA ABC exporter, probably a glycolipid exporter, required for heterocyst formation. Most Cyanobacteria have one member only, but Nostoc sp. PCC 7120 has seven members. 327
19429 132017 TIGR02972 TMAO_torE trimethylamine N-oxide reductase system, TorE protein. Members of this small, apparent transmembrane protein are designated TorE and occur in operons for the trimethylamine N-oxide (TMAO) reductase system. Members are closely related to the NapE protein of the related periplasmic nitrate reductase system. It may be that TorE is an integral membrane subunit of a complex with the reductase TorA. [Energy metabolism, Anaerobic] 47
19430 132018 TIGR02973 nitrate_rd_NapE periplasmic nitrate reductase, NapE protein. NapE, homologous to TorE (TIGR02972), is a membrane protein of unknown function that is part of the periplasmic nitrate reductase system; it may be part of the enzyme complex. The periplasmic nitrate reductase allows for nitrate respiration in anaerobic conditions. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 42
19431 274371 TIGR02974 phageshock_pspF psp operon transcriptional activator PspF. Members of this protein family are PspF, the sigma-54-dependent transcriptional activator of the phage shock protein (psp) operon, in Escherichia coli and numerous other species. The psp operon is induced by a number of stress conditions, including heat shock, ethanol, and filamentous phage infection. Changed com_name to adhere to TIGR role notes conventions. 09/15/06 - DMH [Regulatory functions, DNA interactions] 329
19432 132020 TIGR02975 phageshock_pspG phage shock protein G. This protein previously was designated yjbO in E. coli. It is found only in genomes that have the phage shock operon (psp), but only rarely is encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins, and heat shock. [Cellular processes, Adaptations to atypical conditions] 64
19433 132021 TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response. [Cellular processes, Adaptations to atypical conditions] 75
19434 274372 TIGR02977 phageshock_pspA phage shock protein A. Members of this family are the phage shock protein PspA, from the phage shock operon. This is a narrower family than the set of PspA and its homologs, sometimes several in a genome, as described by pfam04012. PspA appears to maintain the protonmotive force under stress conditions that include overexpression of certain phage secretins, heat shock, ethanol, and protein export defects. [Cellular processes, Adaptations to atypical conditions] 219
19435 132023 TIGR02978 phageshock_pspC phage shock protein C. All members of this protein family are the phage shock protein PspC. These proteins contain a PspC domain, as do other members of the larger family of proteins described by pfam04024. The phage shock regulon is restricted to the Proteobacteria and somewhat sparsely distributed there. It is expressed, under positive control of a sigma-54-dependent transcription factor, PspF, which binds and is modulated by PspA. Stresses that induce the psp regulon include phage secretin overexpression, ethanol, heat shock, and protein export defects. [Cellular processes, Adaptations to atypical conditions] 121
19436 132024 TIGR02979 phageshock_pspD phage shock protein PspD. Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown. [Cellular processes, Adaptations to atypical conditions] 59
19437 274373 TIGR02980 SigBFG RNA polymerase sigma-70 factor, sigma-B/F/G subfamily. This group of similar sigma-70 factors includes clades found in Bacilli (including the sporulation factors SigF:TIGR02885 and SigG:TIGR02850 as well as SigB:TIGR02941), and the high GC gram positive bacteria (Actinobacteria) where a variable number of them are found depending on the lineage. 227
19438 132026 TIGR02981 phageshock_pspE phage shock operon rhodanese PspE. Members of this very narrowly defined protein family are proteins active as rhodanese (EC 2.8.1.1) and found in the extended variants of the phage shock protein (psp operon) in Escherichia coli and a few closely related species. Note that the designation phage shock protein PspE has been applied, incorrectly, in many instances where the genome lacks the phage shock regulon entirely. 101
19439 274374 TIGR02982 heterocyst_DevA ABC exporter ATP-binding subunit, DevA family. Members of this protein family are found mostly in the Cyanobacteria, but also in the Planctomycetes. Cyanobacterial examples are involved in heterocyst formation, by which some fraction of members of the colony undergo a developmental change and become capable of nitrogen fixation. The DevBCA proteins are thought to mediate export of either heterocyst-specific glycolipids or an enzyme essential for formation of the laminated layer found in heterocysts. 220
19440 132028 TIGR02983 SigE-fam_strep RNA polymerase sigma-70 factor, sigma-E family. This group of similar sigma-70 factors includes the sigE factor from Streptomyces coelicolor. The family appears to include a paralogous expansion in the Streptomycetes lineage, while related Actinomycetales have at most two representatives. 162
19441 274375 TIGR02984 Sig-70_plancto1 RNA polymerase sigma-70 factor, Planctomycetaceae-specific subfamily 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are apparently found only in the Planctomycetaceae family including the genera Gemmata and Pirellula (in which seven sequences are found). 189
19442 274376 TIGR02985 Sig70_bacteroi1 RNA polymerase sigma-70 factor, Bacteroides expansion family 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found primarily in the genus Bacteroides. This family appears to have resulted from a lineage-specific expansion as B. thetaiotaomicron VPI-5482, Bacteroides forsythus ATCC 43037, Bacteroides fragilis YCH46 and Bacteroides fragilis NCTC 9343 contain 25, 12, 24 and 23 members, respectively. There are currently only two known members of this family outside of the Bacteroides, in Rhodopseudomonas and Bradyrhizobium. 161
19443 132031 TIGR02986 restrict_Alw26I type II restriction endonuclease, Alw26I/Eco31I/Esp3I family. Members of this family are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of three members are GGTCTC, CGTCTC, and the shared subsequence GTCTC. [DNA metabolism, Restriction/modification] 424
19444 274377 TIGR02987 met_A_Alw26 type II restriction m6 adenine DNA methyltransferase, Alw26I/Eco31I/Esp3I family. Members of this family are the m6-adenine DNA methyltransferase protein, or domain of a fusion protein that also carries m5 cytosine methyltransferase activity, of type II restriction systems of the Alw26I/Eco31I/Esp3I family. A methyltransferase of this family is always accompanied by a type II restriction endonuclease from the Alw26I/Eco31I/Esp3I family (TIGR02986) and by an adenine-specific modification methyltransferase. Members of this family are unusual in that regions of similarity to homologs outside this family are circularly permuted. [DNA metabolism, Restriction/modification] 524
19445 274378 TIGR02988 YaaA_near_RecF S4 domain protein YaaA. This small protein has a single S4 domain (pfam01479), as do bacterial ribosomal protein S4, some pseudouridine synthases, and tyrosyl-tRNA synthetases. The S4 domain may bind RNA. Members of this protein family are found almost exclusively in the Firmicutes, and almost invariably just a few nucleotides upstream of the gene for the DNA replication and repair protein RecF. The few members of this family that are not near recF are found instead near dnaA and/or dnaN, the usual neighbors of recF, near the origin of replication. The conserved location suggests a possible role in replication in the Firmicutes lineage. [DNA metabolism, DNA replication, recombination, and repair] 59
19446 274379 TIGR02989 Sig-70_gvs1 RNA polymerase sigma-70 factor, Rhodopirellula/Verrucomicrobium family. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are abundantly found in the species Rhodopirellula baltica (11), and Verrucomicrobium spinosum (16) and to a lesser extent in Gemmata obscuriglobus (2). 159
19447 132035 TIGR02990 ectoine_eutA ectoine utilization protein EutA. Members of this protein family are EutA, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti and Silicibacter pomeroyi. It is missing from two other species with the other ectoine transport and utilization genes: Pseudomonas putida and Agrobacterium tumefaciens. 239
19448 132036 TIGR02991 ectoine_eutB ectoine utilization protein EutB. Members of this protein family are EutB, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. Members of this family resemble threonine dehydratases. 317
19449 132037 TIGR02992 ectoine_eutC ectoine utilization protein EutC. Members of this protein family are EutC, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. This family belongs to the ornithine cyclodeaminase/mu-crystallin family (pfam02423). 326
19450 274380 TIGR02993 ectoine_eutD ectoine utilization protein EutD. Members of this family are putative peptidases or hydrolases similar to Xaa-Pro aminopeptidase (pfam00557). They belong to ectoine utilization operons, as found in Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. The exact function is unknown. 391
19451 132039 TIGR02994 ectoine_eutE ectoine utilization protein EutE. Members of this family, part of the succinylglutamate desuccinylase / aspartoacylase family (pfam04952), belong to ectoine utilization operons, as found in Sinorhizobium meliloti 1021 (where the operon is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. 325
19452 132040 TIGR02995 ectoine_ehuB ectoine/hydroxyectoine ABC transporter solute-binding protein. Members of this family are the extracellular solute-binding proteins of ABC transporters that closely resemble amino acid transporters. The member from Sinorhizobium meliloti is involved in ectoine uptake, both for osmoprotection and for catabolism. All other members of the seed alignment are found associated with ectoine catabolic genes. [Transport and binding proteins, Amino acids, peptides and amines] 275
19453 274381 TIGR02996 rpt_mate_G_obs repeat-companion domain TIGR02996. This model describes an abundant paralogous domain of Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. The domain also occurs, although rarely, in Myxococcus xanthus DK 1622 and related species. Most member proteins have extensive repeats similar to the leucine-rich repeat, or another repeat class or region of low-complexity sequence. This domain is not repeated, and in Gemmata is usually found at the protein N-terminus. 42
19454 274382 TIGR02997 Sig70-cyanoRpoD RNA polymerase sigma factor, cyanobacterial RpoD-like family. This family includes a number of closely related sigma-70 (TIGR02937) factors in the cyanobacteria. All appear most closely related to the essential sigma-70 factor RpoD, and some score above trusted to the RpoD C-terminal domain model (TIGR02393). 298
19455 132043 TIGR02998 RraA_entero regulator of ribonuclease activity A. This family includes a number of closely related sequences from certain enterobacteria. The E. coli member of this family has been characterized as a regulator of RNase E and its crystal structure has been analyzed. The broader subfamily which includes this equivalog, TIGR01935, was initially classified as a "hypothetical equivalog" with the name "regulator of ribonuclease activity A" based on the same evidence for this model. It now appears that, considering the second group of enterobacterial sequences within TIGR01935, the functional assignment is unsupported. THIS PROTEIN IS _NOT_ MenG, AKA S-adenosylmethionine: 2-demethylmenaquinone methyltransferase (EC 2.1.-.-). See the references characterizing this as a case of transitive annotation error. [Transcription, Degradation of RNA, Regulatory functions, Protein interactions] 161
19456 132044 TIGR02999 Sig-70_X6 RNA polymerase sigma factor, TIGR02999 family. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found in a variety of species including Rhodopirellula baltica which encodes a paralogous group of five. 183
19457 274383 TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown. 75
19458 188267 TIGR03001 Sig-70_gmx1 RNA polymerase sigma-70 factor, Myxococcales family 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found in multiple copies in the order Myxococcales. This model supersedes TIGR02233, which has now been retired. 244
19459 274384 TIGR03002 outer_YhbN_LptA lipopolysaccharide transport periplasmic protein LptA. Members of this protein family include LptA (previously called YhbN). It was shown to be an essential protein in E. coli, implicated in cell envelope integrity, and to play a role in the delivery of LPS to the outer leaflet of the outer membrane. It works with LptB (formerly yhbG), a homolog of ABC transporter ATP-binding proteins, encoded by an adjacent gene. Numerous homologs in other Proteobacteria are found in a conserved location near lipopolysaccharide inner core biosynthesis genes. This family is related to organic solvent tolerance protein (OstA), though distantly. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 142
19460 132048 TIGR03003 ectoine_ehuD ectoine/hydroxyectoine ABC transporter, permease protein EhuD. Members of this family are presumed to act as permease subunits of ectoine ABC transporters. Operons containing this gene also contain the other genes of the ABC transporter and typically are found next to either ectoine utilization or ectoine biosynthesis operons. 212
19461 132049 TIGR03004 ectoine_ehuC ectoine/hydroxyectoine ABC transporter, permease protein EhuC. Members of this family are presumed to act as permease subunits of ectoine ABC transporters. Operons containing this gene also contain the other genes of the ABC transporter and typically are found next to either ectoine utilization or ectoine biosynthesis operons. Permease subunits EhuC and EhuD are homologous. 214
19462 132050 TIGR03005 ectoine_ehuA ectoine/hydroxyectoine ABC transporter, ATP-binding protein. Members of this family are the ATP-binding protein of a conserved four-gene ABC transporter operon found next to ectoine utilization operons and ectoine biosynthesis operons. Ectoine is a compatible solute that protects enzymes from high osmolarity. It is released by some species in response to hypoosmotic shock, and it is taken up by a number of bacteria as a compatible solute or for consumption. This family shows strong sequence similarity to a number of amino acid ABC transporter ATP-binding proteins. 252
19463 274385 TIGR03006 pepcterm_polyde polysaccharide deacetylase family protein, PEP-CTERM locus subfamily. Members of this protein family belong to the family of polysaccharide deacetylases (pfam01522). All are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria, and are found near the epsH homolog that is the putative exosortase gene. The highest scoring homologs below the trusted cutoff for this model are found in several species of Methanosarcina, an archaeal genus. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 271
19464 274386 TIGR03007 pepcterm_ChnLen polysaccharide chain length determinant protein, PEP-CTERM locus subfamily. Members of this protein family belong to the family of polysaccharide chain length determinant proteins (pfam02706). All are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria, and are found near the epsH homolog that is the putative exosortase gene. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 498
19465 163100 TIGR03008 pepcterm_CAAX CAAX prenyl protease-related protein. The CAAX prenyl protease, in eukaryotes, catalyzes three covalent modifications, including cleavage and acylation, at the C-terminus of certain proteins in a process connected to protein sorting. This family describes a bacterial protein family homologous to one domain of the CAAX-processing enzyme. Members of this protein family are found in genomes that carry a predicted protein sorting system, PEP-CTERM/exosortase, usually in the vicinity of the EpsH homolog that is the hallmark of the system. The function of this protein is unknown, but it may relate to protein modification. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 222
19466 274387 TIGR03009 plancto_dom_2 Planctomycetes uncharacterized domain TIGR03009. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to four proteins per genome. The function is unknown. [Hypothetical proteins, Conserved] 210
19467 200229 TIGR03010 sulf_tusC_dsrF sulfur relay protein TusC/DsrF. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification] 116
19468 274388 TIGR03011 sulf_tusB_dsrH sulfur relay protein TusB/DsrH. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification] 94
19469 274389 TIGR03012 sulf_tusD_dsrE sulfur relay protein TusD/DsrE. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification] 127
19470 274390 TIGR03013 EpsB_2 sugar transferase, PEP-CTERM system associated. Members of this protein family belong to the family of bacterial sugar transferases (pfam02397). Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria (notable exceptions appear to include Magnetococcus sp. MC-1 and Myxococcus xanthus DK 1622). These genes are generally found near one or more of the PrsK, PrsR or PrsT genes that have been related to the PEP-CTERM system by phylogenetic profiling methods. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species. These proteins are homologs of the EpsB protein found in Methylobacillus sp. strain 12S, which is also associated with a PEP-CTERM system, but of a distinct type. A name which appears attached to a number of genes (by transitive annotation) in this family is "undecaprenyl-phosphate galactose phosphotransferase", which comes from relatively distant characterized enterobacterial homologs, and is considerably more specific than warranted from the currently available evidence. 442
19471 132059 TIGR03014 EpsL exopolysaccharide biosynthesis operon protein EpsL. The epsL gene is described as a component of the methanolan exopolysaccharide biosynthesis operon in Methylobacillus sp. strain 12S, although no other information regarding its possible function is suggested. Homologs of this gene are found in several other exopolysaccharide operons in a small number of species. These operons contain a subset of the methanolan operon genes by homology and synteny, including the epsH gene which is proposed to act as an "exosortase" directing proteins with a C-terminal tag (PEP-CTERM) to the exopolysaccharide layer. Each of the genomes in which these genes and epsL are found also encodes genes with these C-terminal tags. 381
19472 132060 TIGR03015 pepcterm_ATPase putative secretion ATPase, PEP-CTERM locus subfamily. Members of this protein family are marked as probable ATPases by the nucleotide binding P-loop motif GXXGXGKTT, a motif DEAQ similar to the DEAD/H box of helicases, and extensive homology to ATPases of MSHA-type pilus systems and to GspA proteins associated with type II protein secretion systems. [Protein fate, Protein and peptide secretion and trafficking] 269
19473 274391 TIGR03016 pepcterm_hypo_1 uncharacterized protein, PEP-CTERM system associated. Members of this protein family are found predominantly in exopolysaccharide biosynthesis operons marked by the presence of the EpsH-family putative exosortase and presence in the genome of the PEP-CTERM protein sorting signal. Members of this family may be distantly related to the EpsL family modeled in TIGR03014. 431
19474 132062 TIGR03017 EpsF chain length determinant protein EpsF. Sequences in this family of proteins are members of the chain length determinant family (pfam02706) which includes the wzc protein from E. coli. This family of proteins is homologous to the EpsF protein of the methanolan biosynthesis operon of Methylobacillus species strain 12S. The distribution of this protein appears to be restricted to a subset of exopolysaccharide operons containing a syntenic grouping of genes including a variant of the EpsH exosortase protein. Exosortase has been proposed to be involved in the targeting and processing of proteins containing the PEP-CTERM domain to the exopolysaccharide layer. 444
19475 274392 TIGR03018 pepcterm_TyrKin exopolysaccharide/PEP-CTERM locus tyrosine autokinase. Members of this protein family are related to a known protein-tyrosine autokinase and to numerous homologs from exopolysaccharide biosynthesis region proteins, many of which are designated as chain length determinants. Most members of this family contain a short region, immediately C-terminal to the region modeled here, with an abundance of Tyr residues. These C-terminal tyrosine residues are likely to be autophosphorylation sites. Some members of this family are fusion proteins. 207
19476 132064 TIGR03019 pepcterm_femAB FemAB-related protein, PEP-CTERM system-associated. Members of this protein family are found always as part of extended exopolysaccharide biosynthesis loci in bacteria. In nearly every case, these loci contain determinants for the processing of the PEP-CTERM proposed C-terminal protein sorting signal. This family shows remote, local sequence similarity to the FemAB protein family (see pfam02388), whose members [Unknown function, General] 330
19477 274393 TIGR03020 EpsA transcriptional regulator EpsA. Proteins in this family include a C-terminal LuxR transcriptional regulator domain (pfam00196). These proteins are positioned proximal to either EpsH-containing exopolysaccharide biosynthesis operons of the Methylobacillus type, or the associated PEP-CTERM-containing genes. 247
19478 274394 TIGR03021 pilP_fam type IV pilus biogenesis protein PilP. Members of this protein family are found in type IV pilus biogenesis loci and include proteins designated PilP. [Cell envelope, Surface structures] 118
19479 274395 TIGR03022 WbaP_sugtrans Undecaprenyl-phosphate galactose phosphotransferase, WbaP. The WbaP (formerly RfbP) protein has been characterized as the first enzyme in O-antigen biosynthesis in Salmonella typhimurium. The enzyme transfers galactose from UDP-galactose to a polyprenyl carrier (utilizing the highly conserved C-terminal sugar transferase domain, pfam02397) a reaction which takes place at the cytoplasmic face of the inner membrane. The N-terminal hydrophobic domain is then believed to facilitate the "flippase" function of transferring the liposaccharide unit from the cytoplasmic face to the periplasmic face of the inner membrane. This model includes the enterobacterial enzymes, where the function is presumed to be identical to the S. typhimurium enzyme as well as a somewhat broader group which are likely to catalyze the same or highly similar reactions based on a phylogenetic tree-building analysis of the broader sugar transferase family. Most of these genes are found within large operons dedicated to the production of complex exopolysaccharides such as the enterobacterial O-antigen. The most likely heterogeneity would be in the precise nature of the sugar molecule transferred. 456
19480 274396 TIGR03023 WcaJ_sugtrans Undecaprenyl-phosphate glucose phosphotransferase. This family of proteins encompasses the E. coli WcaJ protein involved in colanic acid biosynthesis, the Methylobacillus EpsB protein involved in methanolan biosynthesis, as well as the GumD protein involved in the biosynthesis of xanthan. All of these are closely related to the well-characterized WbaP (formerly RfbP) protein, which is the first enzyme in O-antigen biosynthesis in Salmonella typhimurium. The enzyme transfers galactose from UDP-galactose (NOTE: not glucose) to a polyprenyl carrier (utilizing the highly conserved C-terminal sugar transferase domain, pfam02397) a reaction which takes place at the cytoplasmic face of the inner membrane. The N-terminal hydrophobic domain is then believed to facilitate the "flippase" function of transferring the liposaccharide unit from the cytoplasmic face to the periplasmic face of the inner membrane. Most of these genes are found within large operons dedicated to the production of complex exopolysaccharides such as the enterobacterial O-antigen. Colanic acid biosynthesis utilizes a glucose-undecaprenyl carrier, knockout of EpsB abolishes incorporation of UDP-glucose into the lipid phase, and the C-terminal portion of GumD has been shown to be responsible for the glucosyl-1-transferase activity. 450
19481 274397 TIGR03024 arch_PEF_CTERM PEF-CTERM protein sorting domain. This domain, distantly related to the PEP-Cterm domain described in model TIGR02595, is found in Methanosarcina mazei in four different proteins, as well as in other archaea such as Methanococcoides burtonii. Several proteins with this domain have their genes only a short distance from archaeosortase C, a proposed integral membrane transpeptidase. This family should exclude members of the PEFG-CTERM domain family (TIGR04296), specific to the Thaumarchaeota. 25
19482 274398 TIGR03025 EPS_sugtrans exopolysaccharide biosynthesis polyprenyl glycosylphosphotransferase. Members of this family are generally found near other genes involved in the biosynthesis of a variety of exopolysaccharides. These proteins consist of two fused domains, an N-terminal hydrophobic domain of generally low conservation and a highly conserved C-terminal sugar transferase domain (pfam02397). Characterized and partially characterized members of this subfamily include Salmonella WbaP (originally RfbP), E. coli WcaJ, Methylobacillus EpsB, Xanthomonas GumD, Vibrio CpsA, Erwinia AmsG, Group B Streptococcus CpsE (originally CpsD), and Streptococcus suis Cps2E. Each of these is believed to act in transferring the sugar from, for instance, UDP-glucose or UDP-galactose, to a lipid carrier such as undecaprenyl phosphate as the first (priming) step in the synthesis of an oligosaccharide "block". This function is encoded in the C-terminal domain. The liposaccharide is believed to be subsequently transferred through a "flippase" function from the cytoplasmic to the periplasmic face of the inner membrane by the N-terminal domain. Certain closely related transferase enzymes, such as Sinorhizobium ExoY and Lactococcus EpsD, lack the N-terminal domain and are not found by this model. 445
19483 274399 TIGR03026 NDP-sugDHase nucleotide sugar dehydrogenase. Enzymes in this family catalyze the NAD-dependent alcohol-to-acid oxidation of nucleotide-linked sugars. Examples include UDP-glucose 6-dehydrogenase (1.1.1.22), GDP-mannose 6-dehydrogenase (1.1.1.132), UDP-N-acetylglucosamine 6-dehydrogenase (1.1.1.136), UDP-N-acetyl-D-galactosaminuronic acid dehydrogenase, and UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase. These enzymes are most often involved in the biosynthesis of polysaccharides and are often found in operons devoted to that purpose. All of these enzymes contain three Pfam domains, pfam03721, pfam00984, and pfam03720 for the N-terminal, central, and C-terminal regions respectively. 409
19484 132072 TIGR03027 pepcterm_export putative polysaccharide export protein, PEP-CTERM system-associated. This protein family belongs to the larger set of polysaccharide biosynthesis/export proteins described by pfam02563. Members of this family are variable in either containing or lacking a 78-residue insert, but appear to fall within a single clade, nevertheless, where the regions in which the gene is found encode components of the PEP-CTERM/EpsH proposed exosortase protein sorting system. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 165
19485 132073 TIGR03028 EpsE polysaccharide export protein EpsE. Sequences in this family of proteins are members of a polysaccharide export protein family (pfam02563) which includes the wza protein from E. coli. This family of proteins is homologous to the EpsE protein of the methanolan biosynthesis operon of Methylobacillus species strain 12S. The distribution of this protein appears to be restricted to a subset of exopolysaccharide operons containing a syntenic grouping of genes including a variant of the EpsH exosortase protein. Exosortase has been proposed to be involved in the targeting and processing of proteins containing the PEP-CTERM domain to the exopolysaccharide layer. 239
19486 132074 TIGR03029 EpsG chain length determinant protein tyrosine kinase EpsG. The proteins in this family are homologs of the EpsG protein found in Methylobacillus strain 12S and are generally found in operons with other Eps homologs. The protein is believed to function as the protein tyrosine kinase component of the chain length regulator (along with the transmembrane component EpsF). 274
19487 274400 TIGR03030 CelA cellulose synthase catalytic subunit (UDP-forming). Cellulose synthase catalyzes the beta-1,4 polymerization of glucose residues in the formation of cellulose. In bacteria, the substrate is UDP-glucose. The synthase consists of two subunits (or domains in the frequent cases where it is encoded as a single polypeptide), the catalytic domain modelled here and the regulatory domain (pfam03170). The regulatory domain binds the allosteric activator cyclic di-GMP. The protein is membrane-associated and probably assembles into multimers such that the individual cellulose strands can self-assemble into multi-strand fibrils. 713
19488 274401 TIGR03031 cas_csx12 CRISPR system subtype II-B RNA-guided endonuclease Cas9/Csx12. Members of this family of CRISPR-associated (cas) protein are found, so far, in CRISPR/cas loci in Wolinella succinogenes DSM 1740, Legionella pneumophila str. Paris, and Francisella tularensis, where the last probably is an example of a degenerate CRISPR locus, having neither repeats nor a functional Cas1. The characteristic repeat length is 37 base pairs and period is about 72. One region of this large protein shows sequence similarity to pfam01844, HNH endonuclease. 802
19489 274402 TIGR03032 TIGR03032 TIGR03032 family protein. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown. [Hypothetical proteins, Conserved] 335
19490 200235 TIGR03033 phage_rel_nuc putative phage-type endonuclease. Members of this protein family are found often in phage genomes and in prokaryotic genomes in uncharacterized regions that resemble integrated prophage regions. 153
19491 132079 TIGR03034 TIGR03034 conserved hypothetical protein. Members of this protein family have been found in several species of gammaproteobacteria, including Yersinia pestis and Y. pseudotuberculosis, Xylella fastidiosa, and Escherichia coli UTI89. As many as five members can be found in a single genome. The function is unknown. [Hypothetical proteins, Conserved] 274
19492 132080 TIGR03035 trp_arylform arylformamidase. One of several pathways of tryptophan degradation is as follows: tryptophan 2,3-dioxygenase (1.13.11.11) uses O2 to convert Trp to L-formylkynurenine. Arylformamidase (3.5.1.9) hydrolyzes the product to L-kynurenine and formate. Kynureninase (3.7.1.3) hydrolyzes L-kynurenine to anthranilate plus alanine. Members of the seed alignment for this model are bacterial predicted metal-dependent hydrolases. All are supported as arylformamidase (3.5.1.9) by an operon structure in which kynureninase and/or tryptophan 2,3-dioxygenase genes are adjacent. The members from Bacillus cereus, Pseudomonas aeruginosa and Ralstonia metallidurans were characterized. An example from Pseudomonas fluorescens is given the gene symbol qbsH instead of kynB because of its role in quinolobactin biosynthesis, which begins with tryptophan. All members of this family should be arylformamidase (3.5.1.9). [Energy metabolism, Amino acids and amines] 206
19493 188272 TIGR03036 trp_2_3_diox tryptophan 2,3-dioxygenase. Members of this family are tryptophan 2,3-dioxygenase, as confirmed by several experimental characterizations, and by conserved operon structure for many of the other members. This enzyme represents the first of a two-step degradation to L-kynurenine, and a three-step pathway (via kynurenine) to anthranilate plus alanine. [Energy metabolism, Amino acids and amines] 264
19494 132082 TIGR03037 anthran_nbaC 3-hydroxyanthranilate 3,4-dioxygenase. Members of this protein family, from both bacteria and eukaryotes, are the enzyme 3-hydroxyanthranilate 3,4-dioxygenase. This enzyme acts on the tryptophan metabolite 3-hydroxyanthranilate and produces 2-amino-3-carboxymuconate semialdehyde, which can rearrange spontaneously to quinolinic acid and feed into nicotinamide biosynthesis, or undergo further enzymatic degradation. 159
19495 274403 TIGR03038 PS_II_psbM photosystem II reaction center protein PsbM. Members of this protein family are the photosystem II reaction center M protein, product of the psbM gene, in Cyanobacteria and their derived organelles in plants. This model resembles pfam05151 but has cutoffs set to avoid false-positive matches to similar (not necessarily homologous) sequences in species that are not photosynthetic. [Energy metabolism, Photosynthesis] 33
19496 163117 TIGR03039 PS_II_CP47 photosystem II chlorophyll-binding protein CP47. [Energy metabolism, Photosynthesis] 504
19497 213761 TIGR03041 PS_antenn_a_b chlorophyll a/b binding light-harvesting protein. This model represents a family of proteins from the Cyanobacteria, closely homologous to and yet distinct from PsbC, a chlorophyll a antenna protein of photosystem II. Members are not universally present in Cyanobacteria, while the family has several members per genome in Prochlorococcus marinus, with seven members in a strain adapted to low light conditions. These antenna proteins may deliver light energy to photosystem I and/or photosystem II. [Energy metabolism, Photosynthesis] 321
19498 274404 TIGR03042 PS_II_psbQ_bact photosystem II protein PsbQ. This protein, through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacterial photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacterial protein PsbQ by homology. [Energy metabolism, Photosynthesis] 142
19499 274405 TIGR03043 PS_II_psbZ photosystem II core protein PsbZ. PsbZ is a core protein of photosystem II in thylakoid-containing Cyanobacteria and plant chloroplasts. The original Chlamydomonas gene symbol, ycf9, is a synonym. PsbZ controls the interaction of the reaction center core with the light-harvesting antenna. [Energy metabolism, Photosynthesis] 58
19500 274406 TIGR03044 PS_II_psb27 photosystem II protein Psb27. Members of this family are the Psb27 protein of the cyanobacterial photosynthetic supracomplex, photosystem II. Although most protein components of both cyanobacterial and chloroplast versions of photosystem II are closely related and described together by single models, this family is strictly bacterial. Some uncharacterized proteins with highly divergent sequences, from Arabidopsis, score between trusted and noise cutoffs for this model but are not at this time assigned as functionally equivalent photosystem II proteins. [Energy metabolism, Photosynthesis] 135
19501 274407 TIGR03045 PS_II_C550 cytochrome c-550. Members of this protein family are cytochrome c-550, the PsbV extrinsic protein of photosystem II, from both Cyanobacteria and chloroplasts. A paralog to this protein, PsbV2, is found in some species in addition to PsbV itself. [Energy metabolism, Photosynthesis] 159
19502 274408 TIGR03046 PS_II_psbV2 photosystem II cytochrome PsbV2. Members of this protein family are PsbV2, a protein closely related to cytochrome c-550 (PsbV), a protein important to the water-splitting and oxygen-evolving activity of photosystem II. Mutant studies in Thermosynechococcus elongatus showed PsbV2 can partially replace PsbV, from which it appears to have arisen first by duplication, then by intergenic recombination with a different gene. [Energy metabolism, Photosynthesis] 155
19503 274409 TIGR03047 PS_II_psb28 photosystem II reaction center protein Psb28. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW. [Energy metabolism, Photosynthesis] 108
19504 132092 TIGR03048 PS_I_psaC photosystem I iron-sulfur protein PsaC. Members of this family are PsaC, an essential component of photosystem I (PS-I) reaction center in Cyanobacteria and chloroplasts. This small protein, about 80 amino acids in length, contains two copies of the ferredoxin-like 4Fe-4S binding site (pfam00037) and therefore eight conserved Cys residues. This protein is also called photosystem I subunit VII. [Energy metabolism, Photosynthesis] 80
19505 274410 TIGR03049 PS_I_psaK photosystem I reaction center subunit PsaK. Members of this protein family are the PsaK of the photosystem I reaction center. Photosystems I and II occur together in the same sets of organisms. Photosystem I uses light energy to transfer electrons from plastocyanin to ferredoxin, while photosystem II uses light energy to split water and releases molecular oxygen. [Energy metabolism, Photosynthesis] 81
19506 188274 TIGR03050 PS_I_psaK_plant photosystem I reaction center PsaK, plant form. This protein family is based on a model that separates the photosystem I PsaK subunit of chloroplasts from chloroplast PsaG protein and from Cyanobacterial PsaK, both of which show sequence similarity. 83
19507 132095 TIGR03051 PS_I_psaG_plant photosystem I reaction center subunit V, chloroplast. 88
19508 132096 TIGR03052 PS_I_psaI photosystem I reaction center subunit VIII. Members of this protein family are PsaI, subunit VIII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. [Energy metabolism, Photosynthesis] 31
19509 274411 TIGR03053 PS_I_psaM photosystem I reaction center subunit XII. Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. The seed alignment for this model includes sequences from pfam07465 and additional sequences, as from Prochlorococcus. [Energy metabolism, Photosynthesis] 29
19510 213764 TIGR03054 photo_alph_chp1 putative photosynthetic complex assembly protein. In twenty or so anoxygenic photosynthetic alpha-Proteobacteria known so far, a gene for a member of this protein family is present and is found in the vicinity of puhA, which encodes a component of the photosynthetic reaction center, and other genes associated with photosynthesis. This protein family is suggested, consequently, as a probable assembly factor for the photosynthetic reaction center, but it seems its actual function has not yet been demonstrated. [Energy metabolism, Photosynthesis] 135
19511 188275 TIGR03055 photo_alph_chp2 putative photosynthetic complex assembly protein 2. This uncharacterized protein family was identified, by the method of partial phylogenetic profiling, as having a matching phylogenetic distribution to that of the photosynthetic reaction center of the alpha-proteobacterial type. It is nearly always encoded near other photosynthesis-related genes, including puhA. [Energy metabolism, Photosynthesis] 245
19512 132100 TIGR03056 bchO_mg_che_rel putative magnesium chelatase accessory protein. Members of this family belong to the alpha/beta fold family hydrolases (pfam00561). Members are found in bacterial genomes if and only if they encode anoxygenic photosynthetic systems similar to that of Rhodobacter capsulatus and other alpha-Proteobacteria. Members often are encoded in the same operon as subunits of the protoporphyrin IX magnesium chelatase, and were once designated BchO. No literature supports a role as an actual subunit of magnesium chelatase, but an accessory role is possible, as suggested by its placement and by its probable hydrolase activity. [Energy metabolism, Photosynthesis] 278
19513 274412 TIGR03057 xxxLxxG_by_4 X-X-X-Leu-X-X-Gly heptad repeats. This model represents a 28-column alignment, comprising four tandem sets of seven residues each, in which the fourth residue tends to be Leu and the seventh tends to be Gly in each set. This heptad periodicity, corresponding to two turns of an alpha helix, suggests alpha-helical structure; in many proteins this 28-region model hits many times in tandem. Arrangement of these sequences on a helical wheel would show a strict alternation of Leu and Gly residues on one side of the helix, that is, an extremely bulky side chain alternating with the virtual absence of one. This suggests an extended zippering of one alpha helix to another, analogous to the shorter leucine zippers found in many dimerizing transcription factors. Proteins in which these heptad repeats occur often have higher order repeats of a unit comprised of several heptads. 28
19514 132102 TIGR03058 rpt_csmH chlorosome envelope protein H repeat. CsmH, as studied in Chlorobium tepidum, is one of at least ten surface-exposed proteins of the chlorosome, a bacteriochlorophyll-rich structure with a lipid-protein envelope. CsmH typically contains three copies of a repeated sequence, represented by this model. [Energy metabolism, Photosynthesis] 27
19515 132103 TIGR03059 psaOeuk photosystem I protein PsaO. Members of this family are the PsaO protein of photosystem I. This protein is found in chloroplasts but not in Cyanobacteria. 82
19516 213765 TIGR03060 PS_II_psb29 photosystem II biogenesis protein Psp29. Psp29, originally designated sll1414 in Synechocystis 6803, is found universally in Cyanobacteria and in Arabidopsis. It was isolated and partially sequenced from purified photosystem II (PS II) in Synechocystis. While its function is unknown, mutant studies show an impairment in photosystem II biogenesis and/or stability, rather than in PS II core function. [Energy metabolism, Photosynthesis] 214
19517 274413 TIGR03061 pip_yhgE_Nterm YhgE/Pip N-terminal domain. This family contains the N-terminal domain of a family of multiple membrane-spanning proteins of Gram-positive bacteria. One member was shown to be a host protein essential for phage infection, so many members of this family are called "phage infection protein". A separate model, TIGR03062, represents the conserved C-terminal domain. The domains are separated by regions highly variable in both length and sequence, often containing extended heptad repeats as described in model TIGR03057. 164
19518 274414 TIGR03062 pip_yhgE_Cterm YhgE/Pip C-terminal domain. This family contains the C-terminal domain of a family of multiple membrane-spanning proteins of Gram-positive bacteria. One member was shown to be a host protein essential for phage infection, so many members of this family are called "phage infection protein". A separate model, TIGR03061, represents the conserved N-terminal domain. The domains are separated by regions highly variable in both length and sequence, often containing extended heptad repeats as described in model TIGR03057. 208
19519 213766 TIGR03063 srtB_target sortase B cell surface sorting signal. Two different classes of sorting signal, both analogous to the sortase A signal LPXTG, may be recognized by the sortase SrtB. These are given as NXZTN and NPKXZ. Proteins sorted by this class of sortase are less common than the sortase A and LPXTG system. This model describes a number of cell surface protein C-terminal regions from Gram-positive bacteria that appear to be sortase B (SrtB) sorting signals. 29
19520 211782 TIGR03064 sortase_srtB sortase, SrtB family. Members of this transpeptidase family are, in most cases, designated sortase B, product of the srtB gene. This protein shows only distant similarity to the sortase A family, for which there may be several members in a single bacterial genome. Typical SrtB substrate motifs include NAKTN, NPKSS, etc, and otherwise resemble the LPXTG sorting signals recognized by sortase A proteins. [Cell envelope, Other, Protein fate, Protein and peptide secretion and trafficking] 232
19521 132109 TIGR03065 srtB_sig_QVPTGV sortase B signal domain, QVPTGV class. This model represents a boutique (unusual) sorting signal, recognized by a member of the sortase SrtB family rather than by the housekeeping sortase, SrtA. 32
19522 132110 TIGR03066 Gem_osc_para_1 Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species. [Hypothetical proteins, Conserved] 111
19523 274415 TIGR03067 Planc_TIGR03067 Planctomycetes uncharacterized domain TIGR03067. This domain occurs in several species, mostly from the Planctomycetes division of the bacteria. It is expanded into a paralogous family of at least twenty-five members in Gemmata obscuriglobus UQM 2246. This family appears related to TIGR03066, which also is expanded into a large paralogous family in Gemmata obscuriglobus. [Unknown function, General] 107
19524 132112 TIGR03068 srtB_sig_NPQTN sortase B signal domain, NPQTN class. This model represents one of the boutique (rare) sortase signals, recognized by sortase B (SrtB) rather than by the housekeeping-type SrtA class sortase. This sequence, beginning NPQTN, shows little similarity to several other SrtB substrates. 33
19525 132113 TIGR03069 PS_II_S4 photosystem II S4 domain protein. Members of this protein family are about 265 residues long and each contains an S4 RNA-binding domain of about 48 residues. The member from the Cyanobacterium, Synechocystis sp. PCC 6803, was detected as a novel polypeptide in a highly purified preparation of active photosystem II (Kashino, et al., 2002). The phylogenetic distribution, including Cyanobacteria and Arabidopsis, supports a role in photosystem II, although the high bit score cutoffs for this model reflect similar sequences in non-photosynthetic organisms such as Carboxydothermus hydrogenoformans, a Gram-positive bacterium. [Energy metabolism, Photosynthesis] 257
19526 213767 TIGR03070 couple_hipB transcriptional regulator, y4mF family. Members of this family belong to a clade of helix-turn-helix DNA-binding proteins, among the larger family pfam01381 (HTH_3; Helix-turn-helix). Members are similar in sequence to the HipB protein of E. coli. Genes for members of the seed alignment for this protein family were found to be closely linked to genes encoding proteins related to HipA. The HipBA operon appears to have some features in common with toxin-antitoxin post-segregational killing systems. [Regulatory functions, DNA interactions] 58
19527 274416 TIGR03071 couple_hipA HipA N-terminal domain. Although Pfam models pfam07805 and pfam07804 currently are called HipA-like N-terminal domain and HipA-like C-terminal domain, respectively, those models hit the central and C-terminal regions of E. coli HipA but not the N-terminal region. This model hits the N-terminal region of HipA and its homologs, and also identifies proteins that lack match regions for pfam07804 and pfam07805. 101
19528 213768 TIGR03072 release_prfH putative peptide chain release factor H. Members of this protein family are bacterial proteins homologous to peptide chain release factors 1 (RF-1, product of the prfA gene), and 2 (RF-2, product of the prfB gene). The member from Escherichia coli K-12, designated prfH, appears to be a pseudogene. This class I release factor is always found as the downstream gene of a two-gene operon. [Protein synthesis, Translation factors] 200
19529 274417 TIGR03073 release_rtcB release factor H-coupled RtcB family protein. Members of this family are related to RtcB. RtcB is a protein of known structure but unknown function that often is encoded near RNA cyclase and therefore is suggested to be a tRNA or mRNA processing enzyme. This family of RtcB-like proteins is encoded upstream of, and apparently is translationally coupled to, the putative peptide chain release factor RF-H (TIGR03072), product of the prfH gene. Note that a large deletion at the junction between this gene and the prfH gene in Escherichia coli K-12 marks both as probable pseudogenes. [Protein synthesis, Other] 356
19530 274418 TIGR03074 PQQ_membr_DH membrane-bound PQQ-dependent dehydrogenase, glucose/quinate/shikimate family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Members of this family have several predicted transmembrane helices in the N-terminal region, and include the quinoprotein glucose dehydrogenase (EC 1.1.5.2) of Escherichia coli and the quinate/shikimate dehydrogenase of Acinetobacter sp. ADP1 (EC 1.1.99.25). Sequences closely related except for the absence of the N-terminal hydrophobic region, scoring in the gray zone between the trusted and noise cutoffs, include PQQ-dependent glycerol (EC 1.1.99.22) and other polyol (sugar alcohol) dehydrogenases. 764
19531 274419 TIGR03075 PQQ_enz_alc_DH PQQ-dependent dehydrogenase, methanol/ethanol family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Genes in this family often are found adjacent to the PQQ biosynthesis genes themselves. An unusual, strained disulfide bond between adjacent Cys residues contributes to PQQ-binding, as does a Trp residue that is part of a PQQ enzyme repeat (see pfam01011). Characterized members include the dehydrogenase subunit of a membrane-anchored, three subunit alcohol (ethanol) dehydrogenase of Gluconobacter suboxydans, a homodimeric ethanol dehydrogenase in Pseudomonas aeruginosa, and the large subunit of an alpha2/beta2 heterotetrameric methanol dehydrogenase in Methylobacterium extorquens. 527
19532 213771 TIGR03076 near_not_gcvH Chlamydial GcvH-like protein upstream region protein. The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins but have a single homolog of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. The protein family modeled here is observed so far only in the Chlamydiae, always as part of a two-gene operon, upstream of the homolog of GcvH. Its function is unknown. [Unknown function, General] 686
19533 132121 TIGR03077 not_gcvH glycine cleavage protein H-like protein, Chlamydial. The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins but have a single homolog of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. The protein family modeled here is the Chlamydial GcvH homolog, so far always seen as part of a two-gene operon, downstream of a member of the uncharacterized protein family TIGR03076. The function of this protein is unknown. 110
19534 274420 TIGR03078 CH4_NH3mon_ox_C methane monooxygenase/ammonia monooxygenase, subunit C. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit C of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria. 231
19535 132123 TIGR03079 CH4_NH3mon_ox_B methane monooxygenase/ammonia monooxygenase, subunit B. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria. 399
19536 132124 TIGR03080 CH4_NH3mon_ox_A methane monooxygenase/ammonia monooxygenase, subunit A. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit A of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria. 243
19537 213772 TIGR03081 metmalonyl_epim methylmalonyl-CoA epimerase. Members of this protein family are the enzyme methylmalonyl-CoA epimerase (EC 5.1.99.1), also called methylmalonyl-CoA racemase. This enzyme converts (2R)-methylmalonyl-CoA to (2S)-methylmalonyl-CoA, which is then a substrate for methylmalonyl-CoA mutase (TIGR00642). It is known in bacteria, archaea, and as a mitochondrial protein in animals. It is closely related to lactoylglutathione lyase (TIGR00068), which is also called glyoxalase I, and is also a homodimer. 128
19538 274421 TIGR03082 Gneg_AbrB_dup membrane protein AbrB duplication. The model describes a hydrophobic sequence region that is duplicated to form the AbrB protein of Escherichia coli (not to be confused with a Bacillus subtilis protein with the same gene symbol). In some species, notably the Cyanobacteria and Thermus thermophilus, proteins consist of a single copy rather than two copies. The member from Pseudomonas putida, PP_1415, was suggested to be an ammonia monooxygenase characteristic of heterotrophic nitrifiers, based on an experimental indication of such activity in the organism and a glimmer of local sequence similarity between parts of P. putida protein and an instance of the AmoA protein from Nitrosomonas europaea; we do not believe the sequence similarity to be meaningful. The member from E. coli (b0715, ybgN) appears to be the largely uncharacterized AbrB (aidB regulator) protein of E. coli cited in Volkert, et al. (PMID 8002588), although we did not manage to trace the origin of association of the article to the sequence. 156
19539 274422 TIGR03083 TIGR03083 uncharacterized Actinobacterial protein TIGR03083. This protein family pulls together several groups of proteins, each very different from the others. They share in common three conserved regions. The first is a region of about 38 amino acids, nearly always at the N-terminus of a protein. This region has a bulky hydrophobic residue, usually Trp, at position 29, and a His residue at position 37 that is invariant, so far, in over 150 instances. The second conserved region has a motif [DE]xxxHxxD. The third conserved region contains a hydrophobic patch and a well-conserved Arg residue. Most examples are found in the Actinobacteria, including the genera Mycobacterium, Corynebacterium, Streptomyces, Nocardia, Frankia, etc. The pattern of near-invariant residues against a backdrop of extreme sequence divergence suggests enzymatic activity and conservation of active site residues. 202
19540 274423 TIGR03084 TIGR03084 TIGR03084 family protein. This family, like pfam07398, belongs to the larger set of probable enzymes modeled by TIGRFAMs family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized. [Hypothetical proteins, Conserved] 253
19541 132129 TIGR03085 TIGR03085 TIGR03085 family protein. This family, like pfam07398 and TIGRFAMs family TIGR03084, belongs to the larger set of probable enzymes defined in family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized. [Hypothetical proteins, Conserved] 199
19542 274424 TIGR03086 TIGR03086 TIGR03086 family protein. This family, like pfam07398 and TIGRFAMs family TIGR03084, belongs to the larger set of probable enzymes defined in family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized. 180
19543 274425 TIGR03087 stp1 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate. 397
19544 132132 TIGR03088 stp2 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate. 374
19545 274426 TIGR03089 TIGR03089 TIGR03089 family protein. This protein family is found, so far, only in the Actinobacteria (Streptomyces, Mycobacterium, Corynebacterium, Nocardia, Propionibacterium, etc.) and never more than one to a genome. Members show twilight-level sequence similarity to the family of AMP-binding enzymes described by pfam00501. 228
19546 163133 TIGR03090 SASP_tlp small, acid-soluble spore protein tlp. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. Although previously designated tlp (thioredoxin-like protein), the B. subtilis protein was shown to be a minor small acid-soluble spore protein SASP, unique to spores. The motif E[VIL]XDE near the C-terminus probably represents a germination protease cleavage site. [Cellular processes, Sporulation and germination] 70
19547 274427 TIGR03091 SASP_sspK small, acid-soluble spore protein K. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspK. [Cellular processes, Sporulation and germination] 32
19548 132136 TIGR03092 SASP_sspI small, acid-soluble spore protein I. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspI. The gene in Bacillus subtilis previously was designated ysfA. [Cellular processes, Sporulation and germination] 65
19549 132137 TIGR03093 SASP_sspL small, acid-soluble spore protein L. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspL. [Cellular processes, Sporulation and germination] 36
19550 132138 TIGR03094 sulfo_cyanin sulfocyanin. Members of this family are blue-copper redox proteins designated sulfocyanin, from the archaeal genera Sulfolobus, Ferroplasma, and Picrophilus. The most closely related proteins characterized as functionally different are the rusticyanins. [Energy metabolism, Electron transport] 195
19551 132139 TIGR03095 rusti_cyanin rusticyanin. Rusticyanin is a blue copper protein, described in an obligate acidophilic chemolithoautotroph, Acidithiobacillus ferrooxidans, as an electron transfer protein. It can constitute up to 5 percent of protein in cells grown on Fe(II) and is thought to be part of an electron chain for Fe(II) oxidation, with two c-type cytochromes, an aa3-type cytochrome oxidase, and O2 as terminal electron acceptor. It is rather closely related to sulfocyanin (TIGR03094). [Energy metabolism, Electron transport] 148
19552 132140 TIGR03096 nitroso_cyanin nitrosocyanin. Nitrosocyanin, as described from the obligate chemolithoautotroph Nitrosomonas europaea, is a red copper protein of unknown function with sequence similarity to a number of blue copper redox proteins. [Energy metabolism, Electron transport] 135
19553 132141 TIGR03097 PEP_O_lig_1 probable O-glycosylation ligase, exosortase A-associated. These proteins are members of the O-antigen polymerase (wzy) family described by pfam04932. This group is associated with genomes and usually genomic contexts containing elements of the exosortase/PEP-CTERM protein export system, specifically the type 1 variety of this system described by the Genome Property, GenProp0652. 402
19554 211788 TIGR03098 ligase_PEP_1 acyl-CoA ligase (AMP-forming), exosortase A-associated. This group of proteins contains an AMP-binding domain (pfam00501) associated with acyl CoA-ligases. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present next to a decarboxylase enzyme. A number of sequences from Burkholderia species also hit this model, but the genomic context is obviously different. The hypothesis of a constant substrate for this family is only strong where the exosortase context is present. 517
19555 132143 TIGR03099 dCO2ase_PEP1 pyridoxal-dependent decarboxylase, exosortase A system-associated. The sequences in this family contain the pyridoxal binding domain (pfam02784) and C-terminal sheet domain (pfam00278) of a family of Pyridoxal-dependent decarboxylases. Characterized enzymes in this family decarboxylate substrates such as ornithine, diaminopimelate and arginine. The genes of the family modeled here, with the exception of those observed in certain Burkholderia species, are all found in the context of exopolysaccharide biosynthesis loci containing the exosortase/PEP-CTERM protein sorting system. More specifically, these are characteristic of the type 1 exosortase system represented by the Genome Property GenProp0652. The substrate of these enzymes may be a precursor of the carrier or linker which is hypothesized to release the PEP-CTERM protein from the exosortase enzyme. These enzymes are apparently most closely related to the diaminopimelate decarboxylase modeled by TIGR01048 which may suggest a similarity (or identity) of substrate. 398
19556 132144 TIGR03100 hydr1_PEP exosortase A system-associated hydrolase 1. This group of proteins are members of the alpha/beta hydrolase superfamily. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present in the vicinity of a second, relatively unrelated enzyme (ortholog 2, TIGR03101) of the same superfamily. 274
19557 274428 TIGR03101 hydr2_PEP exosortase A system-associated hydrolase 2. This group of proteins are members of the alpha/beta hydrolase superfamily. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present in the vicinity of a second, relatively unrelated enzyme (ortholog 1, TIGR03100) of the same superfamily. 266
19558 274429 TIGR03102 halo_cynanin halocyanin domain. Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronobacterium pharaonis. This model represents a domain duplicated in some halocyanins, while appearing once in others. This domain includes the characteristic copper ligand residues. This family does not include plastocyanins, and does not include certain divergent paralogs of halocyanin. 115
19559 132147 TIGR03103 trio_acet_GNAT GNAT-family acetyltransferase TIGR03103. Members of this protein family belong to the GNAT family of acetyltransferases. Each is part of a conserved three-gene cassette sparsely distributed across at least twenty different species known so far, including alpha, beta, and gamma Proteobacteria, Mycobacterium, and Prosthecochloris, which is a member of the Chlorobi. The other two members of the cassette are a probable protease and an asparagine synthetase family protein. 547
19560 274430 TIGR03104 trio_amidotrans asparagine synthase family amidotransferase. Members of this protein family are closely related to several isoforms of asparagine synthetase (glutamine amidotransferase) and typically have been given this name in genome annotation to date. Each is part of a conserved three-gene cassette sparsely distributed across at least twenty different species known so far, including alpha, beta, and gamma Proteobacteria, Mycobacterium, and Prosthecochloris, which is a member of the Chlorobi. The other two members of the cassette are a probable protease and a member of the GNAT family of acetyltransferases. 589
19561 274431 TIGR03105 gln_synth_III glutamine synthetase, type III. This family consists of the type III isozyme of glutamine synthetase, originally described in Rhizobium meliloti, where types I and II also occur. 435
19562 132150 TIGR03106 trio_M42_hydro hydrolase, peptidase M42 family. This model describes a subfamily of MEROPS peptidase family M42, a glutamyl aminopeptidase family that also includes the cellulase CelM from Clostridium thermocellum and deblocking aminopeptidases that can remove acylated amino acids. Members of this family occur in a three gene cassette with an amidotransferase (TIGR03104) in the asparagine synthase (glutamine-hydrolyzing) family, and a probable acetyltransferase (TIGR03103) in the GNAT family. 343
19563 132151 TIGR03107 glu_aminopep glutamyl aminopeptidase. This model represents the M42.001 clade within MEROPS family M42. M42 includes glutamyl aminopeptidase as in the present model, deblocking aminopeptidases as from Pyrococcus horikoshii and related species, and endo-1,4-beta-glucanase (cellulase M) as from Clostridium thermocellum. The current family includes [Protein fate, Degradation of proteins, peptides, and glycopeptides] 350
19564 132152 TIGR03108 eps_aminotran_1 exosortase A system-associated amidotransferase 1. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. This model represents a distinct clade among a set of amidotransferases largely annotated (not necessarily accurately) as glutamine-hydrolyzing asparagine synthases. Members of this clade are essentially restricted to the characteristic exopolysaccharide (EPS) regions that contain the exosortase 1 gene (xrtA), in genomes that also have numbers of PEP-CTERM domain (TIGR02595) proteins. 628
19565 274432 TIGR03109 exosortase_1 exosortase A. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. We designate this, the most common type so far, exosortase 1. We propose the gene symbol xrtA, analogous to srtA for the most common type of sortase in Gram-positive bacteria. 267
19566 188282 TIGR03110 exosort_Gpos exosortase family protein XrtG. Members of this protein family are found in a modest number of non-pathogenic Gram-positive bacteria, including three species of Lactococcus and three paralogs in Clostridium acetobutylicum. This protein appears related to the conserved core region of a family of proposed transpeptidases, exosortase (previously EpsH), thought to act on PEP-CTERM proteins. Members of the seed alignment include all exosortase proposed active site residues. However, in contrast to canonical exosortase (TIGR02602) and archaeal (TIGR03762), and cyanobacterial (TIGR03763) variants, this family has not yet been matched to a cognate PEP-CTERM-like sorting signal. This protein is assigned the gene symbol XrtG (eXosoRTase family protein of Gram-positives). 187
19567 132155 TIGR03111 glyc2_xrt_Gpos1 putative glycosyltransferase, exosortase G-associated. Members of this protein family are probable glycosyltransferases of family 2, whose genes are near those for the exosortase homolog XrtG (TIGR03110), which is restricted to Gram-positive bacteria. Other genes in the conserved gene neighborhood include a 6-pyruvoyl tetrahydropterin synthase homolog (TIGR03112) and an uncharacterized integral membrane protein (TIGR03766). 439
19568 132156 TIGR03112 6_pyr_pter_rel 6-pyruvoyl tetrahydropterin synthase-related domain. Members of this family are small proteins, or small domains of larger proteins, that occur in certain Firmicutes in the same regions as members of families TIGR03110 and TIGR03111. Members of TIGR03110 resemble exosortase, a proposed protein sorting transpeptidase (see TIGR02602). TIGR03111 represents a small clade among the group 2 glycosyltransferases. Members of the current protein family resemble eukaryotic known and prokaryotic predicted 6-pyruvoyl tetrahydropterin synthases. 113
19569 274433 TIGR03113 exosortase_2 exosortase B. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. We designate this relatively uncommon proteobacterial type to be type 2. We propose the gene symbol xrtB. Most species encountered so far with xrtB also contain xrtA (TIGR03109). 268
19570 274434 TIGR03114 cas8u_csf1 CRISPR type AFERR-associated protein Csf1. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf1 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies closest to the repeats. 202
19571 274435 TIGR03115 cas7_csf2 CRISPR type IV/AFERR-associated protein Csf2. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf2 (CRISPR/cas Subtype as in A. ferrooxidans protein 2), as it lies second closest to the repeats. 344
19572 132160 TIGR03116 cas5_csf3 CRISPR type IV/AFERR-associated protein Csf3. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf3 (CRISPR/cas Subtype as in A. ferrooxidans protein 3), as it lies third closest to the repeats. 214
19573 274436 TIGR03117 cas_csf4 CRISPR type AFERR-associated DEAD/DEAH-box helicase Csf4. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf4 (CRISPR/cas Subtype as in A. ferrooxidans protein 4), as it lies farthest (fourth closest) from the repeats in the A. ferrooxidans genome. 636
19574 132162 TIGR03118 PEPCTERM_chp_1 TIGR03118 family protein. This model describes an uncharacterized conserved hypothetical protein. Members are found with the C-terminal putative exosortase interaction domain, PEP-CTERM, in Nitrosospira multiformis, Rhodoferax ferrireducens, Solibacter usitatus Ellin6076, and Acidobacteria bacterium Ellin345. It is found without the PEP-CTERM domain in several other species, including Burkholderia ambifaria, Gloeobacter violaceus PCC 7421, and three copies in the Acanthamoeba polyphaga mimivirus. [Hypothetical proteins, Conserved] 336
19575 132163 TIGR03119 one_C_fhcD formylmethanofuran--tetrahydromethanopterin N-formyltransferase. Members of this protein family are the FhcD protein of tetrahydromethanopterin (H4MPT)-dependent C-1 carrier metabolism. In the archaea, FhcD is designated formylmethanofuran--tetrahydromethanopterin N-formyltransferase, while in bacteria it is commonly designated as formyltransferase/hydrolase complex subunit D. FhcD is essential for one-carbon metabolism in at least three groups of prokaryotes: methanogenic archaea, sulfate-reducing archaea, and methylotrophic bacteria. [Central intermediary metabolism, One-carbon metabolism] 287
19576 274437 TIGR03120 one_C_mch methenyltetrahydromethanopterin cyclohydrolase. Members of this protein family are the enzyme methenyltetrahydromethanopterin cyclohydrolase, a key enzyme for tetrahydromethanopterin (H4MPT)-linked C1 transfer metabolism. [Central intermediary metabolism, One-carbon metabolism] 313
19577 274438 TIGR03121 one_C_dehyd_A formylmethanofuran dehydrogenase subunit A. Members of this largely archaeal protein family are subunit A of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit A. Note that this model does not distinguish tungsten (FwdA) from molybdenum-containing (FmdA) forms of this enzyme; a single gene from this family is expressed constitutively in Methanobacterium thermoautotrophicum, which has both tungsten and molybdenum forms and may work interchangeably. 556
19578 274439 TIGR03122 one_C_dehyd_C formylmethanofuran dehydrogenase subunit C. Members of this largely archaeal protein family are subunit C of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit C. Note that this model does not distinguish tungsten (FwdC) from molybdenum-containing (FmdC) forms of this enzyme. 257
19579 163144 TIGR03123 one_C_unchar_1 probable H4MPT-linked C1 transfer pathway protein. This protein family was identified, by the method of partial phylogenetic profiling, as related to the use of tetrahydromethanopterin (H4MPT) as a C-1 carrier. Characteristic markers of the H4MPT-linked C1 transfer pathway include formylmethanofuran dehydrogenase subunits, methenyltetrahydromethanopterin cyclohydrolase, etc. Tetrahydromethanopterin, a tetrahydrofolate analog, occurs in methanogenic archaea, bacterial methanotrophs, planctomycetes, and a few other lineages. [Central intermediary metabolism, One-carbon metabolism] 318
19580 163145 TIGR03124 citrate_citX holo-ACP synthase CitX. Members of this protein family are the CitX protein, or CitX domain of the CitXG bifunctional protein, of the citrate lyase system. CitX transfers the prosthetic group 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA to the citrate lyase gamma chain, an acyl carrier protein. This enzyme may be designated holo-ACP synthase, holo-citrate lyase synthase, or apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. In a few genera, including Haemophilus, this protein occurs as a fusion protein with CitG (2.7.8.25), an enzyme involved in prosthetic group biosynthesis. This CitX family is easily separated from the holo-ACP synthases of other enzyme systems. [Energy metabolism, Fermentation, Protein fate, Protein modification and repair] 165
19581 132169 TIGR03125 citrate_citG triphosphoribosyl-dephospho-CoA synthase CitG. Triphosphoribosyl-dephospho-CoA is transferred to, and becomes the prosthetic group of, the respective acyl carrier protein subunits of both citrate lyase and malonate decarboxylase. Members of this protein family are triphosphoribosyl-dephospho-CoA synthases specifically from citrate lyase systems. This protein sometimes occurs as a fusion protein with CitX, the phosphoribosyl-dephospho-CoA transferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Fermentation] 275
19582 132170 TIGR03126 one_C_fae formaldehyde-activating enzyme. This family consists of formaldehyde-activating enzyme, or the corresponding domain of longer, bifunctional proteins. It links formaldehyde to the C1 carrier tetrahydromethanopterin (H4MPT), an analog of tetrahydrofolate, and is common among species with H4MPT. The ribulose monophosphate (RuMP) pathway, which removes the toxic metabolite formaldehyde by assimilation, runs in the opposite direction in some species to produce ribulose 5-phosphate for nucleotide biosynthesis, leaving formaldehyde as an additional metabolite. In these species, formaldehyde activating enzyme may occur as a fusion protein with D-arabino 3-hexulose 6-phosphate formaldehyde lyase from the RuMP pathway. 160
19583 132171 TIGR03127 RuMP_HxlB 6-phospho 3-hexuloisomerase. Members of this protein family are 6-phospho 3-hexuloisomerase (PHI), or the PHI domain of a fusion protein. This enzyme is part of the ribulose monophosphate (RuMP) pathway, which in one direction removes the toxic metabolite formaldehyde by assimilation into fructose-6-phosphate. In the other direction, in species lacking a complete pentose phosphate pathway, the RuMP pathway yields ribulose-5-phosphate, necessary for nucleotide biosynthesis, at the cost of also yielding formaldehyde. These latter species usually have a formaldehyde-activating enzyme to attach formaldehyde to the C1 carrier tetrahydromethanopterin. 179
19584 132172 TIGR03128 RuMP_HxlA 3-hexulose-6-phosphate synthase. Members of this protein family are 3-hexulose-6-phosphate synthase (HPS), or the HPS domain of a fusion protein. This enzyme is part of the ribulose monophosphate (RuMP) pathway, which in one direction removes the toxic metabolite formaldehyde by assimilation into fructose-6-phosphate. In the other direction, in species lacking a complete pentose phosphate pathway, the RuMP pathway yields ribulose-5-phosphate, necessary for nucleotide biosynthesis, at the cost of also yielding formaldehyde. These latter species usually have a formaldehyde-activating enzyme to attach formaldehyde to the C1 carrier tetrahydromethanopterin. In these species, the enzyme is viewed as a lyase rather than a synthase and is called D-arabino 3-hexulose 6-phosphate formaldehyde lyase. Note that there is some overlap in specificity with the Escherichia coli enzyme 3-keto-L-gulonate 6-phosphate decarboxylase. 206
19585 132173 TIGR03129 one_C_dehyd_B formylmethanofuran dehydrogenase subunit B. Members of this largely archaeal protein family are subunit B of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit B. Note that this model does not distinguish tungsten (FwdB) from molybdenum-containing (FmdB) forms of this enzyme. 421
19586 188283 TIGR03130 malonate_delta malonate decarboxylase acyl carrier protein. Members of this protein family are the acyl carrier protein, also called the delta subunit, of malonate decarboxylase. This subunit has the same covalently bound prosthetic group, derived from and similar to coenzyme A, as does citrate lyase, although this protein and the acyl carrier protein of citrate lyase do not show significant sequence similarity. Both malonyl and acetyl groups are transferred to the prosthetic group for catalysis. 98
19587 132175 TIGR03131 malonate_mdcH malonate decarboxylase, epsilon subunit. Members of this protein family are the epsilon subunit of malonate decarboxylase. This subunit has malonyl-CoA/dephospho-CoA acyltransferase activity. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. The epsilon subunit is closely related to the malonyl CoA-acyl carrier protein (ACP) transacylase family described by TIGR00128, but acts on an ACP subunit of malonate decarboxylase that has an unusual coenzyme A derivative as its prosthetic group. 295
19588 274440 TIGR03132 malonate_mdcB triphosphoribosyl-dephospho-CoA synthase MdcB. This protein acts in cofactor biosynthesis, preparing the coenzyme A derivative that becomes attached to the malonate decarboxylase acyl carrier protein (or delta subunit). The closely related protein CitG of citrate lyase produces the same molecule, but the two families are nonetheless readily separated. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 272
19589 188285 TIGR03133 malonate_beta biotin-independent malonate decarboxylase, beta subunit. Members of this protein family are the beta subunit of malonate decarboxylase. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. In the malonate decarboxylase complex, the beta subunit appears to act as a malonyl-CoA decarboxylase. 274
19590 274441 TIGR03134 malonate_gamma biotin-independent malonate decarboxylase, gamma subunit. Members of this protein family are the gamma subunit of malonate decarboxylase. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. In the malonate decarboxylase complex, the beta subunit appears to act as a malonyl-CoA decarboxylase, while the gamma subunit appears either to mediate subunit interaction or to act as a co-decarboxylase with the beta subunit. The beta and gamma subunits exhibit some local sequence similarity. 238
19591 274442 TIGR03135 malonate_mdcG malonate decarboxylase holo-[acyl-carrier-protein] synthase. Malonate decarboxylase, like citrate lyase, has a unique acyl carrier protein subunit with a prosthetic group derived from, and distinct from, coenzyme A. Members of this protein family are the phosphoribosyl-dephospho-CoA transferase specific to the malonate decarboxylase system. This enzyme can also be designated holo-ACP synthase (2.7.7.61). The corresponding component of the citrate lyase system, CitX, shows little or no sequence similarity to this family. [Energy metabolism, Other] 202
19592 188287 TIGR03136 malonate_biotin Na+-transporting malonate decarboxylase, carboxybiotin decarboxylase subunit. Malonate decarboxylase can be a soluble enzyme, or a sodium ion-translocating enzyme with additional membrane-bound components. Members of this protein family are integral membrane proteins required to couple decarboxylation to sodium ion export. This family belongs to a broader family, TIGR01109, of sodium ion-translocating decarboxylase beta subunits. [Transport and binding proteins, Cations and iron carrying compounds] 399
19593 211789 TIGR03137 AhpC peroxiredoxin. This peroxiredoxin (AhpC, alkylhydroperoxide reductase subunit C) is one subunit of a two-subunit complex with subunit F (TIGR03140). Usually these are found as an apparent operon. The gene has been characterized in Bacteroides fragilis, where it is important in oxidative stress defense. This gene contains two invariant cysteine residues, one near the N-terminus and one near the C-terminus, each followed immediately by a proline residue. [Cellular processes, Detoxification, Cellular processes, Adaptations to atypical conditions] 187
19594 274443 TIGR03138 QueF 7-cyano-7-deazaguanine reductase. This enzyme catalyzes the 4-electron reduction of the cyano group of 7-cyano-7-deazaguanine (preQ0) to an amine. Although related to a large family of GTP cyclohydrolases (pfam01227), the relationship is structural and not germane to the catalytic mechanism. This model represents the longer, gram-negative version of the enzyme as found in E. coli. The enzymatic step represents the first point at which the biosynthesis of queuosine in bacteria and eukaryotes is distinguished from the biosynthesis of archaeosine in archaea. [Transcription, RNA processing] 275
19595 213775 TIGR03139 QueF-II 7-cyano-7-deazaguanine reductase. This enzyme catalyzes the 4-electron reduction of the cyano group of 7-cyano-7-deazaguanine (preQ0) to an amine. Although related to a large family of GTP cyclohydrolases (pfam01227), the relationship is structural and not germane to the catalytic mechanism. This model represents the shorter, gram-positive version of the enzyme as found in B. subtilis. The enzymatic step represents the first point at which the biosynthesis of queuosine in bacteria and eukaryotes is distinguished from the biosynthesis of archaeosine in archaea. 115
19596 274444 TIGR03140 AhpF alkyl hydroperoxide reductase subunit F. This enzyme is the partner of the peroxiredoxin (alkyl hydroperoxide reductase) AhpC which contains the peroxide-reactive cysteine. AhpF contains the reductant (NAD(P)H) binding domain (pfam00070) and presumably acts to resolve the disulfide which forms after oxidation of the active site cysteine in AhpC. This protein contains two paired conserved cysteine motifs, CxxCP and CxHCDGP. [Cellular processes, Detoxification, Cellular processes, Adaptations to atypical conditions] 515
19597 274445 TIGR03141 cytochro_ccmD heme exporter protein CcmD. The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins. 45
19598 274446 TIGR03142 cytochro_ccmI cytochrome c-type biogenesis protein CcmI. This TPR repeat-containing protein is the CcmI protein (also called CycH) of c-type cytochrome biogenesis. CcmI is thought to act as an apo-cytochrome c chaperone. This model describes the N-terminal region of the protein, Members of this protein family [Protein fate, Protein folding and stabilization, Energy metabolism, Electron transport] 117
19599 132187 TIGR03143 AhpF_homolog putative alkyl hydroperoxide reductase F subunit. This family of thioredoxin reductase homologs is found adjacent to alkylhydroperoxide reductase C subunit predominantly in cases where there is only one C subunit in the genome and that genome is lacking the F subunit partner (also a thioredoxin reductase homolog) that is usually found (TIGR03140). 555
19600 274447 TIGR03144 cytochr_II_ccsB cytochrome c-type biogenesis protein CcsB. Members of this protein family represent one of two essential proteins of system II for c-type cytochrome biogenesis. Additional proteins tend to be part of the system but can be replaced by chemical reductants such as dithiothreitol. This protein is designated CcsB in Bordetella pertussis and some other bacteria, resC in Bacillus (where there is additional N-terminal sequence), and CcsA in chloroplast. We use the CcsB designation here. Member sequences show regions of strong sequence conservation and variable-length, poorly conserved regions in between; sparsely filled columns were removed from the seed alignment prior to model construction. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair] 245
19601 274448 TIGR03145 cyt_nit_nrfE cytochrome c nitrate reductase biogenesis protein NrfE. Members of this protein family closely resemble the CcmF protein of the CcmABCDEFGH system, or system I, for c-type cytochrome biogenesis (GenProp0678). Members are found, as a rule, next to closely related paralogs of CcmG and CcmH and always located near other genes associated with the cytochrome c nitrite reductase enzyme complex. As a rule, members are found in species that also encode bona fide members of the CcmF, CcmG, and CcmH families. 614
19602 132190 TIGR03146 cyt_nit_nrfB cytochrome c nitrite reductase, pentaheme subunit. Members of this protein family contain five copies of the CXXCH heme-binding motif, and are the NrfB component of the multisubunit enzyme, cytochrome c nitrite reductase. [Energy metabolism, Electron transport] 145
19603 274449 TIGR03147 cyt_nit_nrfF cytochrome c nitrite reductase, accessory protein NrfF. [Energy metabolism, Electron transport] 126
19604 274450 TIGR03148 cyt_nit_nrfD cytochrome c nitrite reductase, NrfD subunit. Members of this protein family are NrfD, a highly hydrophobic protein encoded in the nrf operon, which encodes cytochrome c nitrite reductase. This multiple heme-containing enzyme can reduce nitrite to ammonia. Members belong to a broader Pfam protein family, pfam03916, which also contains an NrfD-related subunit of polysulphide reductase. [Energy metabolism, Electron transport] 316
19605 274451 TIGR03149 cyt_nit_nrfC cytochrome c nitrite reductase, Fe-S protein. Members of this protein family are the Fe-S protein, NrfC, of a cytochrome c nitrite reductase system for which the pentaheme cytochrome c protein, NrfB (family TIGR03146) is an unambiguous marker. Members of this protein family show similarity to other ferredoxin-like proteins, including a subunit of a polysulfide reductase. [Energy metabolism, Electron transport] 225
19606 274452 TIGR03150 fabF beta-ketoacyl-acyl-carrier-protein synthase II. 3-oxoacyl-[acyl-carrier-protein] synthase 2 (KAS-II, FabF) is involved in the condensation step of fatty acid biosynthesis in which the malonyl donor group is decarboxylated and the resulting carbanion used to attack and extend the acyl group attached to the acyl carrier protein. Most genomes encoding fatty acid biosynthesis contain a number of condensing enzymes, often of all three types: 1, 2 and 3. Synthase 2 is mechanistically related to synthase 1 (KAS-I, FabB) containing a number of absolutely conserved catalytic residues in common. This model is based primarily on genes which are found in apparent operons with other essential genes of fatty acid biosynthesis (GenProp0681). The large gap between the trusted cutoff and the noise cutoff contains many genes which are not found adjacent to genes of the fatty acid pathway in genomes that often also contain a better hit to this model. These genes may be involved in other processes such as polyketide biosyntheses. Some genomes contain more than one above-trusted hit to this model which may result from recent paralogous expansions. Second hits to this model which are not next to other fatty acid biosynthesis genes may be involved in other processes. FabB sequences should fall well below the noise cutoff of this model. [Fatty acid and phospholipid metabolism, Biosynthesis] 407
19607 132195 TIGR03151 enACPred_II putative enoyl-[acyl-carrier-protein] reductase II. This oxidoreductase of the 2-nitropropane dioxygenase family (pfam03060) is commonly found in apparent operons with genes involved in fatty acid biosynthesis. Furthermore, this genomic context generally includes the fabG 3-oxoacyl-[ACP] reductase and lacks the fabI enoyl-[ACP] reductase. 307
19608 200248 TIGR03152 cyto_c552_HCOOH formate-dependent cytochrome c nitrite reductase, c552 subunit. Members of this protein family are cytochrome c552, a component of cytochrome c nitrite reductase, which is known more formally as nitrite reductase (cytochrome; ammonia-forming) (EC 1.7.2.2). Nitrate can be reduced by several enzymes. EC 1.7.2.2 reduces nitrite all the way to ammonia, rather than to ammonium hydroxide (nitrite reductase (NAD(P)H), EC 1.7.1.4) or nitric oxide (nitrite reductase (NO-forming), EC 1.7.2.1). Some examples of EC 1.7.2.2 occur in a seven gene system that enables formate-dependent nitrite reduction, but is also found in simpler contexts. Members of this protein family, however, belong to the formate-dependent system. [Energy metabolism, Electron transport] 439
19609 274453 TIGR03153 cytochr_NrfH cytochrome c nitrite reductase, small subunit. Members of this protein family are NrfH, a tetraheme cytochrome c. NrfH is the cytochrome c nitrite reductase small subunit, and forms a heterodimer with NrfA, the catalytic subunit. While NrfA can act as a monomer, NrfH can bind to and anchor NrfA in the membrane and enables electron transfer to NrfA from quinones. [Energy metabolism, Electron transport] 135
19610 132198 TIGR03154 sulfolob_CbsA cytochrome b558/566, subunit A. Members of this protein family are CbsA, one subunit of a highly glycosylated, heterodimeric, mono-heme cytochrome b558/566, found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota. 465
19611 274454 TIGR03155 sulfolob_CbsB cytochrome b558/566, subunit B. Members of this protein family are CbsB, one subunit of a highly glycosylated, heterodimeric, mono-heme cytochrome b558/566, found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota. 302
19612 274455 TIGR03156 GTP_HflX GTP-binding protein HflX. This protein family is one of a number of homologous small, well-conserved GTP-binding proteins with pleiotropic effects. Bacterial members are designated HflX, following the naming convention in Escherichia coli where HflX is encoded immediately downstream of the RNA chaperone Hfq, and immediately upstream of HflKC, a membrane-associated protease pair with an important housekeeping function. Over large numbers of other bacterial genomes, the pairing with hfq is more significant than with hflK and hflC. The gene from Homo sapiens in this family has been named PGPL (pseudoautosomal GTP-binding protein-like). [Unknown function, General] 351
19613 274456 TIGR03157 cas_Csc2 CRISPR type I-D/CYANO-associated protein Csc2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc2 for CRISPR/Cas Subtype Cyano protein 2, as it is often the second gene upstream of the core cas genes, cas3-cas4-cas1-cas2. 282
19614 274457 TIGR03158 cas3_cyano CRISPR-associated helicase Cas3, subtype CYANO. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. It contains helicase motifs and appears to represent the Cas3 protein of the Cyano subtype of CRISPR/Cas system. 357
19615 274458 TIGR03159 cas_Csc1 CRISPR type I-D/CYANO-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc1 for CRISPR/Cas Subtype Cyano protein 1, as it is often the first gene upstream of the core cas genes, cas3-cas4-cas1-cas2. 225
19616 274459 TIGR03160 cobT_DBIPRT nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase. Members of this family are nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase, an enzyme of cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 333
19617 274460 TIGR03161 ribazole_CobZ alpha-ribazole phosphatase CobZ. Sequences in the seed alignment were the experimentally characterized CobZ of the methanogenic archaeon Methanosarcina mazei, and other archaeal proteins found similarly next to or very near to other cobalamin biosynthesis genes. CobZ replaces the alpha-ribazole-phosphate phosphatase (EC 3.1.3.73) called CobC in analogous bacterial pathways for cobalamin biosynthesis under anaerobic conditions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 139
19618 274461 TIGR03162 ribazole_cobC alpha-ribazole phosphatase. Members of this protein family include the known CobC protein of Salmonella and Escherichia coli species, and homologous proteins found in cobalamin biosynthesis regions in other bacteria. This protein is alpha-ribazole phosphatase (EC 3.1.3.73) and, like many phosphatases, can be closely related in sequence to other phosphatases with different functions. Close homologs excluded from this model include proteins with duplications, so this model is built in -g mode to suppress hits to those proteins. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 177
19619 132208 TIGR03164 UHCUDC OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. 157
19620 274462 TIGR03165 F1F0_chp_2 F1/F0 ATPase, Methanosarcina type, subunit 2. Members of this protein family are uncharacterized, highly hydrophobic proteins encoded in the middle of apparent F1/F0 ATPase operons. We note, however, that this protein is both broadly and sparsely distributed. It is found in only about two percent of microbial genomes sequenced, with the first ten examples found coming from the Euryarchaeota, Chlorobia, Betaproteobacteria, Deltaproteobacteria, and Planctomycetes. In most of these species, the surrounding operon appears to represent a second F1/F0 ATPase system, and the member proteins belong to subfamilies with the same phylogenetic distribution as the current protein family. 83
19621 132210 TIGR03166 alt_F1F0_F1_eps alternate F1F0 ATPase, F1 subunit epsilon. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 epsilon subunit of this apparent second ATP synthase. 122
19622 274463 TIGR03167 tRNA_sel_U_synt tRNA 2-selenouridine synthase. The Escherichia coli YbbB protein was shown to encode a selenophosphate-dependent tRNA 2-selenouridine synthase, essential for modification of some tRNAs to replace a sulfur atom with selenium. This enzyme works with SelD, the selenium donor protein, which also acts in selenocysteine incorporation. Although the members of this protein family show a fairly deep split, sequences from both sides of the split are supported by co-occurrence with, and often proximity to, the selD gene. [Protein synthesis, tRNA and rRNA base modification] 311
19623 274464 TIGR03168 1-PFK hexose kinase, 1-phosphofructokinase family. This family consists largely of 1-phosphofructokinases, but also includes tagatose-6-kinases and 6-phosphofructokinases. 303
19624 274465 TIGR03169 Nterm_to_SelD pyridine nucleotide-disulfide oxidoreductase family protein. Members of this protein family include N-terminal sequence regions of (probable) bifunctional proteins whose C-terminal sequences are SelD, or selenide,water dikinase, the selenium donor protein necessary for selenium incorporation into protein (as selenocysteine), tRNA (as 2-selenouridine), or both. However, some members of this family occur in species that do not show selenium incorporation, and the function of this protein family is unknown. 364
19625 274466 TIGR03170 flgA_cterm flagella basal body P-ring formation protein FlgA. This model describes a conserved C-terminal region of the flagellar basal body P-ring formation protein FlgA. This sequence region contains a SAF domain, now described by pfam08666. [Cellular processes, Chemotaxis and motility] 122
19626 132215 TIGR03171 soxL2 Rieske iron-sulfur protein SoxL2. This iron-sulfur protein is found in a contiguous genomic region with subunits of cytochrome b558/566 in several archaeal species, and appears to be part of a cytochrome bc1-analogous system. 321
19627 274467 TIGR03172 TIGR03172 probable selenium-dependent hydroxylase accessory protein YqeC. This uncharacterized protein family includes YqeC from Escherichia coli. A phylogenetic profiling analysis shows correlation with SelD, the selenium donor protein, even in species where SelD contributes to neither selenocysteine nor selenouridine biosynthesis. Instead, this family, and families TIGR03309 and TIGR03310 appear to mark selenium-dependent molybdenum hydroxylase maturation systems. [Unknown function, General] 210
19628 274468 TIGR03173 pbuX xanthine permease. All the seed members of this model are observed adjacent to genes for either xanthine phosphoribosyltransferase (for the conversion of xanthine to guanine, GenProp0696) or genes for the conversion of xanthine to urate and its concomitant catabolism (GenProp0640, GenProp0688, GenProp0686 and GenProp0687). A number of sequences scoring higher than trusted to this model are found in different genomic contexts, and the possibility exists that these transport related compounds in addition to or instead of xanthine itself. The outgroup to this family consists of sequences which are characterized as uracil permeases or are adjacent to established uracil phosphoribosyltransferases. 406
19629 274469 TIGR03174 cas_Csc3 CRISPR type I-D/CYANO-associated protein Csc3/Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc3 for CRISPR/Cas Subtype Cyano protein 3, as it is often the third gene upstream of the core cas genes, cas3-cas4-cas1-cas2. 953
19630 274470 TIGR03175 AllD ureidoglycolate dehydrogenase. This enzyme converts ureidoglycolate to oxalureate in the non-urea-forming catabolism of allantoin (GenProp0687). The pathway has been characterized in E. coli and is observed in the genomes of Enterococcus faecalis and Bacillus licheniformis. 349
19631 274471 TIGR03176 AllC allantoate amidohydrolase. This enzyme catalyzes the breakdown of allantoate, first to ureidoglycine by hydrolysis and then decarboxylation of one of the two equivalent ureido groups. Ureidoglycine then spontaneously exchanges ammonia for water resulting in ureidoglycolate. This enzyme is an alternative to allantoicase (3.5.3.4) which releases urea. [Central intermediary metabolism, Nitrogen metabolism] 406
19632 274472 TIGR03177 pilus_cpaB Flp pilus assembly protein CpaB. Members of this protein family are the CpaB protein of Flp-type pilus assembly. Similar proteins include the FlgA protein of bacterial flagellum biosynthesis. 261
19633 163175 TIGR03178 allantoinase allantoinase. This enzyme carries out the first step in the degradation of allantoin, a ring-opening hydrolysis. The seed members of this model are all in the vicinity of other genes involved in the processes of xanthine/urate/allantoin catabolism. Although not included in the seed, many eukaryotic homologs of this family are included above the trusted cutoff. Below the noise cutoff are related hydantoinases. 443
19634 188295 TIGR03180 UraD_2 OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. This model is a separate (but related) clade from that represented by TIGR03164. This model places a second homolog in Streptomyces species (which is not in the vicinity of other urate catabolism associated genes) below the trusted cutoff. 158
19635 213783 TIGR03181 PDH_E1_alph_x pyruvate dehydrogenase E1 component, alpha subunit. Members of this protein family are the alpha subunit of the E1 component of pyruvate dehydrogenase (PDH). This model represents one branch of a larger family that includes E1-alpha proteins from 2-oxoisovalerate dehydrogenase, acetoin dehydrogenase, another PDH clade, etc. [Energy metabolism, Pyruvate dehydrogenase] 341
19636 274473 TIGR03182 PDH_E1_alph_y pyruvate dehydrogenase E1 component, alpha subunit. Members of this protein family are the alpha subunit of the E1 component of pyruvate dehydrogenase (PDH). This model represents one branch of a larger family that includes E1-alpha proteins from 2-oxoisovalerate dehydrogenase, acetoin dehydrogenase, another PDH clade, etc. [Energy metabolism, Pyruvate dehydrogenase] 315
19637 163177 TIGR03183 DNA_S_dndC putative sulfurtransferase DndC. Members of this protein family are the DndC protein from the dnd (degradation during electrophoresis) operon. The dnd phenotype reflects a sulfur-containing modification to DNA. This operon is sparsely and sporadically distributed among bacteria; among the first eight examples are members from the Actinobacteria, Firmicutes, Gammaproteobacteria, and Cyanobacteria. DndC is suggested to be a sulfurtransferase. [DNA metabolism, Restriction/modification] 447
19638 274474 TIGR03184 DNA_S_dndE DNA sulfur modification protein DndE. This model describes the DndE protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is a putative carboxylase homologous to NCAIR synthetases. [DNA metabolism, Restriction/modification] 105
19639 274475 TIGR03185 DNA_S_dndD DNA sulfur modification protein DndD. This model describes the DndD protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndD is described as a putative ATPase. The small number of examples known so far include species from among the Firmicutes, Actinomycetes, Proteobacteria, and Cyanobacteria. [DNA metabolism, Restriction/modification] 650
19640 132230 TIGR03186 AKGDH_not_PDH alpha-ketoglutarate dehydrogenase. Several bacterial species have a paralog to the homodimeric form of the pyruvate dehydrogenase E1 component (see model TIGR00759), often encoded next to the L-methionine gamma-lyase gene (mdeA). The member from a strain of Pseudomonas putida was shown to act on alpha-ketobutyrate, which is produced by MdeA. This model serves as an exception model to TIGR00759, as other proteins hitting TIGR00759 should be identified as the pyruvate dehydrogenase E1 component. 889
19641 274476 TIGR03187 DGQHR DGQHR domain. This highly divergent, uncharacterized domain has several absolutely conserved residues, including a QR pair and FxxxN motif. Its most striking feature, however, is a near invariant pentapeptide motif DGQHR. Several different subfamilies occur specifically as a part of DNA phosphorothioation systems, previously called DND (DNA instability during electrophoresis), while others (e.g. CPS_2936) occur in other contexts suggestive of lateral gene transfer (sporadic distribution of helicase-containing cassettes). The region described by this model is about 280 amino acids in length; additional sequences show local sequence similarity. 272
19642 274477 TIGR03188 histidine_hisI phosphoribosyl-ATP pyrophosphohydrolase. This enzyme, phosphoribosyl-ATP pyrophosphohydrolase, catalyses the second step in the histidine biosynthesis pathway. It often occurs as a fusion protein. This model has a somewhat narrower scope than pfam01503, as some paralogs that appear to be functionally distinct are excluded from this model. [Amino acid biosynthesis, Histidine family] 84
19643 132233 TIGR03189 dienoyl_CoA_hyt cyclohexa-1,5-dienecarbonyl-CoA hydratase. This enzyme, cyclohexa-1,5-dienecarbonyl-CoA hydratase, also called dienoyl-CoA hydratase, acts on the product of benzoyl-CoA reductase (EC 1.3.99.15). Benzoyl-CoA is a common intermediate in the degradation of many aromatic compounds, and this enzyme is part of an anaerobic pathway for dearomatization and degradation. 251
19644 132234 TIGR03190 benz_CoA_bzdN benzoyl-CoA reductase, bzd-type, N subunit. Members of this family are the N subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation for a number of different aromatic compounds, such as phenol and toluene. 377
19645 132235 TIGR03191 benz_CoA_bzdO benzoyl-CoA reductase, bzd-type, O subunit. Members of this family are the O subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation for a number of different aromatic compounds, such as phenol and toluene. 430
19646 132236 TIGR03192 benz_CoA_bzdQ benzoyl-CoA reductase, bzd-type, Q subunit. Members of this family are the Q subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation for a number of different aromatic compounds, such as phenol and toluene. 293
19647 132237 TIGR03193 4hydroxCoAred 4-hydroxybenzoyl-CoA reductase, gamma subunit. 4-hydroxybenzoyl-CoA reductase converts 4-hydroxybenzoyl-CoA to benzoyl-CoA, a common intermediate in the degradation of aromatic compounds. This protein family represents the gamma chain of this three-subunit enzyme. 148
19648 132238 TIGR03194 4hydrxCoA_A 4-hydroxybenzoyl-CoA reductase, alpha subunit. This model represents the largest chain, alpha, of the enzyme 4-hydroxybenzoyl-CoA reductase. In species capable of degrading various aromatic compounds by way of benzoyl-CoA, this enzyme can convert 4-hydroxybenzoyl-CoA to benzoyl-CoA. 746
19649 132239 TIGR03195 4hydrxCoA_B 4-hydroxybenzoyl-CoA reductase, beta subunit. This model represents the second largest chain, beta, of the enzyme 4-hydroxybenzoyl-CoA reductase. In species capable of degrading various aromatic compounds by way of benzoyl-CoA, this enzyme can convert 4-hydroxybenzoyl-CoA to benzoyl-CoA. 321
19650 132240 TIGR03196 pucD xanthine dehydrogenase D subunit. This gene has been characterized in B. subtilis as the molybdopterin binding-subunit of xanthine dehydrogenase (pucD), acting in conjunction with pucC, the FAD-binding subunit and pucE, the FeS-binding subunit. The more common XDH complex (GenProp0640) includes the xdhB gene which is related to pucD. It appears that most of the relatives of pucD outside of this narrow clade are involved in other processes as they are found in unrelated genomic contexts, contain the more common XDH complex and/or do not appear to process purines to allantoin. 768
19651 274478 TIGR03197 MnmC_Cterm tRNA U-34 5-methylaminomethyl-2-thiouridine biosynthesis protein MnmC, C-terminal domain. In Escherichia coli, the protein previously designated YfcK is now identified as the bifunctional enzyme MnmC. It acts, following the action of the heterotetramer of GidA and MnmE, in the modification of U-34 of certain tRNA to 5-methylaminomethyl-2-thiouridine (mnm5s2U). In other bacteria, the corresponding proteins are usually, but not always, found as a single polypeptide chain, and occasionally as the product of tandem genes. This model represents the C-terminal region of the multifunctional protein. [Protein synthesis, tRNA and rRNA base modification] 381
19652 132242 TIGR03198 pucE xanthine dehydrogenase E subunit. This gene has been characterized in B. subtilis as the Iron-sulfur cluster binding-subunit of xanthine dehydrogenase (pucE), acting in conjunction with pucC, the FAD-binding subunit and pucD, the molybdopterin binding subunit. The more common XDH complex (GenProp0640) includes the xdhA gene as the Fe-S cluster binding component. 151
19653 274479 TIGR03199 pucC xanthine dehydrogenase C subunit. This gene has been characterized in B. subtilis as the FAD binding-subunit of xanthine dehydrogenase (pucC), acting in conjunction with pucD, the molybdopterin-binding subunit and pucE, the FeS-binding subunit. 264
19654 132244 TIGR03200 dearomat_oah 6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase. Members of this protein family are 6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase, a ring-hydrolyzing enzyme in the anaerobic metabolism of aromatic compounds by way of benzoyl-CoA, as seen in Thauera aromatica, Geobacter metallireducens, and Azoarcus sp. Note that Rhodopseudomonas palustris uses a different pathway to perform a similar degradation of benzoyl-CoA to 3-hydroxypimelyl-CoA. 360
19655 132245 TIGR03201 dearomat_had 6-hydroxycyclohex-1-ene-1-carbonyl-CoA dehydrogenase. Members of this protein family are 6-hydroxycyclohex-1-ene-1-carbonyl-CoA dehydrogenase, an enzyme in the anaerobic metabolism of aromatic compounds by way of benzoyl-CoA, as seen in Thauera aromatica, Geobacter metallireducens, and Azoarcus sp. The experimentally characterized form from T. aromatica uses only NAD+, not NADP+. Note that Rhodopseudomonas palustris uses a different pathway to perform a similar degradation of benzoyl-CoA to 3-hydroxypimelyl-CoA. 349
19656 132246 TIGR03202 pucB xanthine dehydrogenase accessory protein pucB. In Bacillus subtilis the expression of this protein, located in an operon with the structural subunits of xanthine dehydrogenase, has been found to be essential for XDH activity. Some members of this family appear to have a distant relationship to the MobA protein involved in molybdopterin biosynthesis, although this may be coincidental. 190
19657 132247 TIGR03203 pimD_small pimeloyl-CoA dehydrogenase, small subunit. Members of this protein family are the PimD proteins of species such as Rhodopseudomonas palustris and Bradyrhizobium japonicum. The pimFABCDE operon encodes proteins for the metabolism of straight chain dicarboxylates of seven to fourteen carbons. Especially relevant is pimeloyl-CoA, basis of the gene symbol, as it is a catabolite of benzoyl-CoA degradation, which occurs in Rhodopseudomonas palustris. 378
19658 132248 TIGR03204 pimC_large pimeloyl-CoA dehydrogenase, large subunit. Members of this protein family are the PimC proteins of species such as Rhodopseudomonas palustris and Bradyrhizobium japonicum. The pimFABCDE operon encodes proteins for the metabolism of straight chain dicarboxylates of seven to fourteen carbons. Especially relevant is pimeloyl-CoA, basis of the gene symbol, as it is a catabolite of benzoyl-CoA degradation, which occurs in Rhodopseudomonas palustris. 395
19659 132249 TIGR03205 pimA dicarboxylate--CoA ligase PimA. PimA, a member of a large family of acyl-CoA ligases, is found in a characteristic operon pimFABCDE for the metabolism of pimelate and related compounds. It is found, so far, in Bradyrhizobium japonicum and several strains of Rhodopseudomonas palustris. PimA from R. palustris was shown to be active as a CoA ligase for C(7) to C(14) dicarboxylates and fatty acids. 541
19660 132250 TIGR03206 benzo_BadH 2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. Members of this protein family are the enzyme 2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. The enzymatic properties were confirmed experimentally in Rhodopseudomonas palustris; the enzyme is homotetrameric, and not sensitive to oxygen. This enzyme is part of a proposed pathway for degradation of benzoyl-CoA to 3-hydroxypimeloyl-CoA that differs from the analogous pathway in Thauera aromatica. It also may occur in degradation of the non-aromatic compound cyclohexane-1-carboxylate. 250
19661 132251 TIGR03207 cyc_hxne_CoA_dh cyclohexanecarboxyl-CoA dehydrogenase. Cyclohex-1-ene-1-carboxyl-CoA is an intermediate in the anaerobic degradation of benzoyl-CoA derived from various aromatic compounds, in Rhodopseudomonas palustris but not Thauera aromatica. The aliphatic compound cyclohexanecarboxylate can be converted to the same intermediate in two steps. The first step is its ligation to coenzyme A. The second is the action of this enzyme, cyclohexanecarboxyl-CoA dehydrogenase. 372
19662 132252 TIGR03208 cyc_hxne_CoA_lg cyclohexanecarboxylate-CoA ligase. Members of this protein family are cyclohexanecarboxylate-CoA ligase. This enzyme prepares the aliphatic ring compound, cyclohexanecarboxylate, for dehydrogenation and then degradation by a pathway also used in benzoyl-CoA degradation in Rhodopseudomonas palustris. 538
19663 132253 TIGR03209 P21_Cbot clostridium toxin-associated regulator BotR. Clostridium botulinum neurotoxin production is regulated by a regulatory sigma-70 protein, BotR transcription regulator. Similarly, tetanus toxin production of Clostridium tetani is regulated by TetR which is a very close relative of BotR. Both BotR and TetR are members of the TIGR02937 subfamily of sigma-70 RNA polymerase sigma factors. Functional complementation experiments have been done for botR and tetR in a highly transformable strain of Clostridium perfringens host cells to assess functional interchangeability of sigma factors and it has been confirmed that they are interchangeable in vivo. 142
19664 132254 TIGR03210 badI 2-ketocyclohexanecarboxyl-CoA hydrolase. Members of this protein family are 2-ketocyclohexanecarboxyl-CoA hydrolase, a ring-opening enzyme that acts in catabolism of molecules such as benzoyl-CoA and cyclohexane carboxylate. It converts 2-ketocyclohexanecarboxyl-CoA to pimelyl-CoA. It is not sensitive to oxygen. 256
19665 274480 TIGR03211 catechol_2_3 catechol 2,3 dioxygenase. Members of this family all are enzymes active as catechol 2,3 dioxygenase (1.13.11.2), although some members have highly significant activity on catechol derivatives such as 3-methylcatechol, 3-chlorocatechol, and 4-chlorocatechol (see Mars, et al.). This enzyme is also called metapyrocatechase, as it performs a meta-cleavage (an extradiol ring cleavage), in contrast to the ortho-cleavage (intradiol ring cleavage) performed by catechol 1,2-dioxygenase (EC 1.13.11.1), also called pyrocatechase. [Energy metabolism, Other] 303
19666 211797 TIGR03212 uraD_N-term-dom putative urate catabolism protein. This model represents a protein that is predominantly found just upstream of the UraD protein (OHCU decarboxylase) and in a number of instances as an N-terminal fusion with it. UraD itself catalyzes the last step in the catabolism of urate to allantoate. The function of this protein is presently unknown. It shows homology with the pfam01522 polysaccharide deacetylase domain family. 297
19667 132257 TIGR03213 23dbph12diox 2,3-dihydroxybiphenyl 1,2-dioxygenase. Members of this protein family all have activity as 2,3-dihydroxybiphenyl 1,2-dioxygenase, the third enzyme of a pathway for biphenyl degradation. Many of the extradiol ring-cleaving dioxygenases, to which these proteins belong, act on a range of related substrates. Note that some members of this family may be found in operons for toluene or naphthalene degradation, where other activities of the same enzyme may be more significant; the trusted cutoff for this model is set relatively high to exclude most such instances. [Energy metabolism, Other] 286
19668 200251 TIGR03214 ura-cupin putative allantoin catabolism protein. This model represents a protein containing a tandem arrangement of cupin domains (N-terminal part of pfam07883 and C-terminal more distantly related to pfam00190). This protein is found in the vicinity of genes involved in the catabolism of allantoin, a breakdown product of urate and sometimes of urate itself. The distribution of pathway components in the genomes in which this family is observed suggests that the function is linked to the allantoate catabolism to glyoxylate pathway (GenProp0686) since it is sometimes found in genomes lacking any elements of the xanthine-to-allantoin pathways (e.g. in Enterococcus faecalis). 252
19669 132259 TIGR03215 ac_ald_DH_ac acetaldehyde dehydrogenase (acetylating). Members of this protein family are acetaldehyde dehydrogenase (acetylating), EC 1.2.1.10. This enzyme oxidizes acetaldehyde, using NAD(+), and attaches coenzyme A (CoA), yielding acetyl-CoA. It occurs as a late step in the meta-cleavage pathways of a variety of compounds, including catechol, biphenyl, toluene, salicylate, etc. 285
19670 132260 TIGR03216 OH_muco_semi_DH 2-hydroxymuconic semialdehyde dehydrogenase. Members of this protein family are 2-hydroxymuconic semialdehyde dehydrogenase. Many aromatic compounds are catabolized by way of catechol, via the meta-cleavage pathway, to pyruvate and acetyl-CoA. This enzyme performs the second of seven steps in that pathway for catechol degradation. [Energy metabolism, Other] 481
19671 274481 TIGR03217 4OH_2_O_val_ald 4-hydroxy-2-oxovalerate aldolase. Members of this protein family are 4-hydroxy-2-oxovalerate aldolase, also called 4-hydroxy-2-ketovalerate aldolase and 2-oxo-4-hydroxypentanoate aldolase. This enzyme, part of the pathway for the meta-cleavage of catechol, produces pyruvate and acetaldehyde. Acetaldehyde is then converted by acetaldehyde dehydrogenase (acylating) (DmpF; EC 1.2.1.10) to acetyl-CoA. The two enzymes are tightly associated. [Energy metabolism, Other] 333
19672 132262 TIGR03218 catechol_dmpH 4-oxalocrotonate decarboxylase. Members of this protein family are 4-oxalocrotonate decarboxylase. Note that this protein, as characterized (indirectly) in Pseudomonas sp. strain CF600, was inactive except when coexpressed with DmpE, 2-oxopent-4-enoate hydratase, a homologous protein from the same operon. Both of these enzymes are active in the degradation of catechol, a common intermediate in the degradation of aromatic compounds such as benzoate, toluene, phenol, dimethylphenol (dmp), salicylate, etc. [Energy metabolism, Other] 263
19673 274482 TIGR03219 salicylate_mono salicylate 1-monooxygenase. Members of this protein family are salicylate 1-monooxygenase, also called salicylate hydroxylase. This enzyme converts salicylate to catechol, which is a common intermediate in the degradation of a number of aromatic compounds (phenol, toluene, benzoate, etc.). The gene for this protein may occur in catechol degradation genes, such as those of the meta-cleavage pathway. 414
19674 132264 TIGR03220 catechol_dmpE 2-oxopent-4-enoate hydratase. Members of this protein family are 2-oxopent-4-enoate hydratase, which is also called 2-hydroxypent-2,4-dienoate hydratase. It is closely related to another gene found in the same operon, 4-oxalocrotonate decarboxylase, with which it interacts closely. 255
19675 213786 TIGR03221 muco_delta muconolactone delta-isomerase. Members of this protein family are muconolactone delta-isomerase (EC 5.3.3.4), the CatC protein of the ortho cleavage pathway for metabolizing aromatic compounds by way of catechol. [Energy metabolism, Other] 90
19676 213787 TIGR03222 benzo_boxC benzoyl-CoA-dihydrodiol lyase. In the presence of O2, the benzoyl-CoA oxygenase/reductase BoxAB converts benzoyl-CoA to 2,3-dihydro-2,3-dihydroxybenzoyl-CoA. Members of this family, BoxC, homologous to enoyl-CoA hydratases/isomerases, hydrolyze this compound to 3,4-dehydroadipyl-CoA semialdehyde + HCOOH. 546
19677 274483 TIGR03223 Phn_opern_protn putative phosphonate metabolism protein. This family of proteins is observed in the vicinity of other characterized genes involved in the catabolism of phosphonates via the C-P lyase system (GenProp0232); its function is unknown. These proteins are members of the somewhat broader pfam06299 model "Protein of unknown function (DUF1045)" which contains proteins found in a different genomic context as well. 228
19678 132268 TIGR03224 benzo_boxA benzoyl-CoA oxygenase/reductase, BoxA protein. Members of this protein family are BoxA, the A component of the BoxAB benzoyl-CoA oxygenase/reductase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. BoxA is a homodimeric iron-sulphur-flavoprotein and acts as an NADPH-dependent reductase for BoxB. [Energy metabolism, Other] 411
19679 200253 TIGR03225 benzo_boxB benzoyl-CoA oxygenase, B subunit. Members of this protein family are BoxB, the B subunit of benzoyl-CoA oxygenase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. [Energy metabolism, Other] 471
19680 274484 TIGR03226 PhnU 2-aminoethylphosphonate ABC transporter, permease protein. This ABC transporter permease (membrane-spanning) component is found in a region of the Salmonella typhimurium LT2 genome responsible for the catabolism of 2-aminoethylphosphonate via the phnWX pathway (GenProp0238). 288
19681 132271 TIGR03227 PhnS 2-aminoethylphosphonate ABC transporter, periplasmic 2-aminoethylphosphonate binding protein. This ABC transporter periplasmic substrate binding protein component is found in a region of the Salmonella typhimurium LT2 genome responsible for the catabolism of 2-aminoethylphosphonate via the phnWX pathway (GenProp0238). The protein contains a match to pfam01547 for the "Bacterial extracellular solute-binding protein" domain. 367
19682 132272 TIGR03228 anthran_1_2_A anthranilate 1,2-dioxygenase, large subunit. Anthranilate (2-aminobenzoate) is an intermediate of tryptophan (Trp) biosynthesis and degradation. Members of this family are the large subunit of anthranilate 1,2-dioxygenase, which acts in Trp degradation by converting anthranilate to catechol. Closely related paralogs typically are the benzoate 1,2-dioxygenase large subunit, among the larger set of ring-hydroxylating dioxygenases. [Energy metabolism, Amino acids and amines] 438
19683 132273 TIGR03229 benzo_1_2_benA benzoate 1,2-dioxygenase, large subunit. Benzoate 1,2-dioxygenase (EC 1.14.12.10) belongs to the larger family of aromatic ring-hydroxylating dioxygenases. Members of this family all act on benzoate, but may have additional activities on various benzoate analogs. This model describes the large subunit. Between the trusted and noise cutoffs are similar enzymes, likely to act on benzoate but perhaps best identified according to some other activity, such as 2-chlorobenzoate 1,2-dioxygenase (1.14.12.13). [Energy metabolism, Other] 433
19684 132274 TIGR03230 lipo_lipase lipoprotein lipase. Members of this protein family are lipoprotein lipase (EC 3.1.1.34), a eukaryotic triacylglycerol lipase active in plasma and similar to pancreatic and hepatic triacylglycerol lipases (EC 3.1.1.3). It is also called clearing factor. It cleaves chylomicron and VLDL triacylglycerols; it also has phospholipase A-1 activity. 442
19685 132275 TIGR03231 anthran_1_2_B anthranilate 1,2-dioxygenase, small subunit. Anthranilate (2-aminobenzoate) is an intermediate of tryptophan (Trp) biosynthesis and degradation. Members of this family are the small subunit of anthranilate 1,2-dioxygenase, which acts in Trp degradation by converting anthranilate to catechol. Closely related paralogs typically are the benzoate 1,2-dioxygenase small subunit, among the larger set of ring-hydroxylating dioxygenases. [Energy metabolism, Amino acids and amines] 155
19686 132276 TIGR03232 benzo_1_2_benB benzoate 1,2-dioxygenase, small subunit. Benzoate 1,2-dioxygenase (EC 1.14.12.10) belongs to the larger family of aromatic ring-hydroxylating dioxygenases. Members of this family should all act on benzoate, but several have additional known activities on various benzoate analogs. Some members actually may be named more suitably according to such an alternate activity, such as 2-chlorobenzoate 1,2-dioxygenase (1.14.12.13). 155
19687 163189 TIGR03233 DNA_S_dndB DNA sulfur modification protein DndB. This model describes the DndB protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndB is described as a putative ATPase. [DNA metabolism, Restriction/modification] 355
19688 163190 TIGR03234 OH-pyruv-isom hydroxypyruvate isomerase. This enzyme interconverts tartronate semi-aldehyde (TSA, aka 2-hydroxy 3-oxopropionate) and hydroxypyruvate. The E. coli enzyme has been characterized and found to be specific for TSA, contain no cofactors, and have a rather high Km for hydroxypyruvate of 12.5 mM. The gene is often found in association with glyoxalate carboligase (which produces TSA), but has been shown to have no effect on growth on glyoxalate when knocked out. This is consistent with the fact that the gene for tartronate semialdehyde reductase (glxR) is also associated and may have primary responsibility for the catabolism of TSA. 254
19689 163191 TIGR03235 DNA_S_dndA cysteine desulfurase DndA. This model describes DndA, a protein related to IscS and part of a larger family of cysteine desulfurases. It is encoded, typically, divergently from a conserved, sparsely distributed operon for sulfur modification of DNA. This modification system is designated dnd, after the phenotype of DNA degradation during electrophoresis. The system is sporadically distributed in bacteria, much like some restriction enzyme operons. DndB is described as a putative ATPase. [DNA metabolism, Restriction/modification] 353
19690 274485 TIGR03236 dnd_assoc_1 DNA phosphorothioation-dependent restriction protein DptG. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification] 363
19691 132281 TIGR03237 dnd_assoc_2 DNA phosphorothioation-dependent restriction protein DptH. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification] 1256
19692 132282 TIGR03238 dnd_assoc_3 DNA phosphorothioation-dependent restriction protein DptF. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification] 504
19693 132283 TIGR03239 GarL 2-dehydro-3-deoxyglucarate aldolase. In E. coli this enzyme (GarL) 2-dehydro-3-deoxyglucarate aldolase acts in the catabolism of several sugars including D-galactarate, D-glucarate and L-idarate. In fact, 5-dehydro-4-deoxy-D-glucarate aldolase is a synonym for this enzyme as it is unclear in the literature whether the enzyme acts on only one of these or, as seems likely, has no preference. (Despite the apparent large difference in substrate structure indicated by their names, 2-DH-3DO- and 5-DH-4DO-glucarate differ only by the chirality of the most central hydroxyl-bearing carbon and is alternately named 2-DH-3DO-galactarate.) The reported product of D-galactarate dehydratase (4.2.1.42) is the 5DH-4DO-glucarate isomer and this enzyme is found proximal to the aldolase in many genomes (GenProp0714) where no epimerase is known. Similarly, the product of D-glucarate dehydratase (4.2.1.40) is again the 5-DH-4DO isomer, so the provenance of the 2-DH-3DO-glucarate isomer for which this enzyme is named is unclear. 249
19694 274486 TIGR03240 arg_catab_astD succinylglutamate-semialdehyde dehydrogenase. Members of this protein family are succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71), the fourth enzyme in the arginine succinyltransferase (AST) pathway for arginine catabolism. [Energy metabolism, Amino acids and amines] 484
19695 132285 TIGR03241 arg_catab_astB succinylarginine dihydrolase. Members of this family are succinylarginine dihydrolase (EC 3.5.3.23), the second of five enzymes in the arginine succinyltransferase (AST) pathway. [Energy metabolism, Amino acids and amines] 443
19696 132286 TIGR03242 arg_catab_astE succinylglutamate desuccinylase. Members of this protein family are succinylglutamate desuccinylase, the fifth and final enzyme of the arginine succinyltransferase (AST) pathway for arginine catabolism. This model excludes the related protein aspartoacylase. [Energy metabolism, Amino acids and amines] 319
19697 274487 TIGR03243 arg_catab_AOST arginine and ornithine succinyltransferase subunits. In many bacteria, the sole member of this protein family is arginine N-succinyltransferase (EC 2.3.1.109), the AstA protein of the arginine succinyltransferase (ast) pathway. However, in Pseudomonas aeruginosa and several other species, a tandem gene pair encodes alpha and beta subunits of a heterodimer that is designated arginine and ornithine succinyltransferase (AOST). 335
19698 274488 TIGR03244 arg_catab_AstA arginine N-succinyltransferase. In many bacteria, the arginine succinyltransferase (ast) pathway operon consists of five genes, including this protein, arginine N-succinyltransferase (EC 2.3.1.109). In a few species, such as Pseudomonas aeruginosa, the member of this family is encoded adjacent to a paralog, and the two polypeptides form a heterodimeric enzyme, active on both arginine and ornithine. In such species, this polypeptide may be treated as the beta subunit of an enzyme that may be named either arginine N-succinyltransferase (AST) or arginine and ornithine N-succinyltransferase (AOST). [Energy metabolism, Amino acids and amines] 336
19699 274489 TIGR03245 arg_AOST_alph arginine/ornithine succinyltransferase, alpha subunit. In some bacteria, including Pseudomonas aeruginosa, the astA gene (arginine N-succinyltransferase) is replaced by tandem paralogs that form a heterodimer. This heterodimer from P. aeruginosa is characterized as arginine and ornithine N-2 succinyltransferase (AOST). Members of this protein family represent the less widespread paralog, designated AruI, or arginine/ornithine succinyltransferase, alpha subunit. 336
19700 274490 TIGR03246 arg_catab_astC succinylornithine transaminase family. Members of the seed alignment for this protein family are the enzyme succinylornithine transaminase (EC 2.6.1.81), which catalyzes the third of five steps in the arginine succinyltransferase (AST) pathway, an ammonia-releasing pathway of arginine degradation. All seed alignment sequences are found within arginine succinyltransferase operons, and all proteins that score above 820.0 bits should function as succinylornithine transaminase. However, a number of sequences extremely closely related in sequence, found in different genomic contexts, are likely to act in different biological processes and may act on different substrates. This model is designated subfamily rather than equivalog, pending further consideration, for this reason. [Energy metabolism, Amino acids and amines] 397
19701 211799 TIGR03247 glucar-dehydr glucarate dehydratase. Glucarate dehydratase converts D-glucarate (and L-idarate, a stereoisomer) to 5-dehydro-4-deoxyglucarate which is subsequently acted on by GarL, tartronate semialdehyde reductase and glycerate kinase (GenProp0716). The E. coli enzyme has been well-characterized. 441
19702 274491 TIGR03248 galactar-dH20 galactarate dehydratase. Galactarate dehydratase converts D-galactarate to 5-dehydro-4-deoxyglucarate which is subsequently acted on by GarL, tartronate semialdehyde reductase and glycerate kinase (GenProp0714). 506
19703 132293 TIGR03249 KdgD 5-dehydro-4-deoxyglucarate dehydratase. 5-dehydro-4-deoxyglucarate dehydratase not only catalyzes the dehydration of the substrate (diol to ketone + water), but causes the decarboxylation of the intermediate product to yield 2-oxoglutarate semialdehyde (2,5-dioxopentanoate). The gene for the enzyme is usually observed in the vicinity of transporters and dehydratases handling D-galactarate and D-gluconate as well as aldehyde dehydrogenases which convert the product to alpha-ketoglutarate. 296
19704 132294 TIGR03250 PhnAcAld_DH putative phosphonoacetaldehyde dehydrogenase. This family of genes are members of the pfam00171 NAD-dependent aldehyde dehydrogenase family. These genes are observed in Ralstonia eutropha JMP134, Sinorhizobium meliloti 1021, Burkholderia mallei ATCC 23344, Burkholderia thailandensis E264, Burkholderia cenocepacia AU 1054, Burkholderia pseudomallei K96243 and 1710b, Burkholderia xenovorans LB400, Burkholderia sp. 383 and Polaromonas sp. JS666 in close proximity to the PhnW gene (TIGR02326) encoding 2-aminoethyl phosphonate aminotransferase (which generates phosphonoacetaldehyde) and PhnA (TIGR02335) encoding phosphonoacetate hydrolase (not to be confused with the alkylphosphonate utilization operon protein PhnA modeled by TIGR00686). Additionally, transporters believed to be specific for 2-aminoethyl phosphonate are often present. PhnW is, in other organisms, coupled with PhnX (TIGR01422) for the degradation of phosphonoacetaldehyde (GenProp0238), but PhnX is apparently absent in each of the organisms containing this aldehyde reductase. PhnA, characterized in a strain of Pseudomonas fluorescens that has not yet been genome sequenced, is only rarely found outside of the PhnW and aldehyde dehydrogenase context, for instance in Rhodopseudomonas and Bordetella bronchiseptica, where it is adjacent to transporters presumably specific for the import of phosphonoacetate. It seems reasonably certain then, that this enzyme catalyzes the NAD-dependent oxidation of phosphonoacetaldehyde to phosphonoacetate, bridging the metabolic gap between PhnW and PhnA. We propose the name phosphonoacetaldehyde dehydrogenase and the gene symbol PhnY for this enzyme. 472
19705 274492 TIGR03251 LAT_fam L-lysine 6-transaminase. Characterized members of this protein family are L-lysine 6-transaminase, also called lysine epsilon-aminotransferase (LAT). The immediate product of the reaction of this enzyme on lysine, 2-aminoadipate 6-semialdehyde, becomes 1-piperideine 6-carboxylate, or P6C. This product may be converted subsequently to pipecolate or alpha-aminoadipate, lysine catabolites that may be precursors of certain secondary metabolites. 431
19706 132296 TIGR03252 TIGR03252 uncharacterized HhH-GPD family protein. This model describes a small, well-conserved bacterial protein family. Its sequence largely consists of a domain, HhH-GPD, found in a variety of related base excision DNA repair enzymes (see pfam00730). [DNA metabolism, DNA replication, recombination, and repair] 177
19707 211800 TIGR03253 oxalate_frc formyl-CoA transferase. This enzyme, formyl-CoA transferase, transfers coenzyme A from formyl-CoA to oxalate. It forms a pathway, together with oxalyl-CoA decarboxylase, for oxalate degradation; decarboxylation by the latter gene regenerates formyl-CoA. The two enzymes typically are encoded by a two-gene operon. [Cellular processes, Detoxification] 415
19708 132298 TIGR03254 oxalate_oxc oxalyl-CoA decarboxylase. In a number of bacteria, including Oxalobacter formigenes from the human gut, a two-gene operon of oxc (oxalyl-CoA decarboxylase) and frc (formyl-CoA transferase) encodes a system for degrading and therefore detoxifying oxalate. Members of this family are the thiamine pyrophosphate (TPP)-containing enzyme oxalyl-CoA decarboxylase. [Cellular processes, Detoxification] 554
19709 132299 TIGR03255 PhnV 2-aminoethylphosphonate ABC transport system, membrane component PhnV. This membrane component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate. 272
19710 132300 TIGR03256 met_CoM_red_alp methyl-coenzyme M reductase, alpha subunit. Members of this protein family are the alpha subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis] 548
19711 132301 TIGR03257 met_CoM_red_bet methyl-coenzyme M reductase, beta subunit. Members of this protein family are the beta subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis] 433
19712 132302 TIGR03258 PhnT 2-aminoethylphosphonate ABC transport system, ATP-binding component PhnT. This ATP-binding component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate. 362
19713 132303 TIGR03259 met_CoM_red_gam methyl-coenzyme M reductase, gamma subunit. Members of this protein family are the gamma subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis] 244
19714 274493 TIGR03260 met_CoM_red_D methyl-coenzyme M reductase operon protein D. Members of this protein family are protein D, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. Proteins in this family are expressed at much lower levels than the methyl-coenzyme M reductase itself and have been shown to form at least transient associations. The precise function is unknown. [Energy metabolism, Methanogenesis] 150
19715 274494 TIGR03261 phnS2 putative 2-aminoethylphosphonate ABC transporter, periplasmic 2-aminoethylphosphonate-binding protein. This ABC transporter extracellular solute-binding protein is found in a number of genomes in operon-like contexts strongly suggesting a substrate specificity for 2-aminoethylphosphonate (2-AEP). The characterized PhnSTUV system is absent in the genomes in which this system is found. These genomes encode systems for the catabolism of 2-AEP, making the need for a 2-AEP-specific transporter likely. [Transport and binding proteins, Amino acids, peptides and amines] 334
19716 274495 TIGR03262 PhnU2 putative 2-aminoethylphosphonate ABC transporter, permease protein. [Transport and binding proteins, Amino acids, peptides and amines] 546
19717 213788 TIGR03263 guanyl_kin guanylate kinase. Members of this family are the enzyme guanylate kinase, also called GMP kinase. This enzyme transfers a phosphate from ATP to GMP, yielding ADP and GDP. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions] 179
19718 132308 TIGR03264 met_CoM_red_C methyl-coenzyme M reductase I operon protein C. Members of this protein family are protein C, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this protein occurs only in operons of type I. The precise function is unknown. [Energy metabolism, Methanogenesis] 194
19719 274496 TIGR03265 PhnT2 putative 2-aminoethylphosphonate ABC transporter, ATP-binding protein. This ABC transporter ATP-binding protein is found in a number of genomes in operon-like contexts strongly suggesting a substrate specificity for 2-aminoethylphosphonate (2-AEP). The characterized PhnSTUV system is absent in the genomes in which this system is found. These genomes encode systems for the catabolism of 2-AEP, making the need for a 2-AEP-specific transporter likely. [Transport and binding proteins, Amino acids, peptides and amines] 353
19720 132310 TIGR03266 methan_mark_1 putative methanogenesis marker protein 1. Members of this protein family represent a distinct clade among the larger set of proteins that belong to families TIGR00702 and pfam02624. Proteins from this clade are found in genome sequence if and only if the species sequenced is one of the methanogens. All methanogens belong to the archaea; some but not all of those sequenced are hyperthermophiles. This protein family was detected by the method of partial phylogenetic profiling (see Haft, et al., 2006). 376
19721 132311 TIGR03267 methan_mark_2 putative methanogenesis marker protein 2. A single member of this protein family is found in each of the first ten complete genome sequences of archaeal methanogens, and nowhere else. Sequence similarity to various bacterial proteins is reflected in Pfam models pfam00586 and pfam02769, AIR synthase related protein N-terminal and C-terminal domains, respectively. The functions of proteins in this family are unknown, but their role is likely one essential to methanogenesis. [Energy metabolism, Methanogenesis] 323
19722 132312 TIGR03268 methan_mark_3 putative methanogenesis marker protein 3. A single member of this protein family is found in each of the first ten complete genome sequences of archaeal methanogens, and nowhere else. This protein family was detected by the method of partial phylogenetic profiling (see Haft, et al., 2006). The functions of proteins in this family are unknown, but their role is likely one essential to methanogenesis. [Energy metabolism, Methanogenesis] 503
19723 132313 TIGR03269 met_CoM_red_A2 methyl coenzyme M reductase system, component A2. The enzyme that catalyzes the final step in methanogenesis, methyl coenzyme M reductase, contains alpha, beta, and gamma chains. In older literature, the complex of alpha, beta, and gamma chains was termed component C, while this single chain protein was termed methyl coenzyme M reductase system component A2. [Energy metabolism, Methanogenesis] 520
19724 274497 TIGR03270 methan_mark_4 putative methanogen marker protein 4. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely linked to it. Some members have been suggested to be methyltransferases, based on the proximity of their genes to genes of the multi-subunit complex, N5-methyl-tetrahydromethanopterin--coenzyme M methyltransferase. That context is not conserved, however. The family shows similarity to various phosphate acyltransferases. [Energy metabolism, Methanogenesis] 202
19725 132315 TIGR03271 methan_mark_5 putative methanogenesis marker protein 5. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 142
19726 274498 TIGR03272 methan_mark_6 putative methanogenesis marker protein 6. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 132
19727 132317 TIGR03274 methan_mark_7 putative methanogenesis marker protein 7. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. 302
19728 132318 TIGR03275 methan_mark_8 putative methanogenesis marker protein 8. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. 259
19729 132319 TIGR03276 Phn-HD phosphonate degradation operons associated HDIG domain protein. This small clade of proteins are found adjacent to other genes implicated in the catabolism of phosphonates. They are members of the TIGR00277 domain family and contain a series of five invariant histidines (the domain in general has only four). 179
19730 132320 TIGR03277 methan_mark_9 putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins. 109
19731 132321 TIGR03278 methan_mark_10 methanogenesis marker radical SAM protein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. It is a radical SAM enzyme by homology. The exact function is unknown, but likely is linked to methanogenesis. In most genomes, the member of this family is encoded by a gene next to, and divergently transcribed from, the methyl coenzyme M reductase operon. 404
19732 132322 TIGR03279 cyano_FeS_chp putative radical SAM enzyme, TIGR03279 family. Members of this protein family are predicted radical SAM enzymes of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis. 433
19733 213789 TIGR03280 methan_mark_11 methanogenesis imperfect marker protein 11. The first twenty-nine completed genomes with a member of this protein family include twenty-eight archaeal methanogens and one other related archaeon, Ferroglobus placidus DSM 10642. The exact function is unknown, but the protein likely belongs to a system usually tightly linked to methanogenesis. 292
19734 132324 TIGR03281 methan_mark_12 putative methanogenesis marker protein 12. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 326
19735 274499 TIGR03282 methan_mark_13 putative methanogenesis marker 13 metalloprotein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This metal cluster-binding family is related to nitrogenase structural protein NifD and accessory protein NifE, among others. [Energy metabolism, Methanogenesis] 352
19736 132326 TIGR03283 thy_syn_methano thymidylate synthase, methanogen type. Thymidylate synthase makes dTMP for DNA synthesis, and is among the most widely distributed of all enzymes. Members of this protein family are encoded within a completed genome sequence if and only if that species is one of the methanogenic archaea. In these species, tetrahydromethanopterin replaces tetrahydrofolate. The member from Methanobacterium thermoautotrophicum was shown to behave as a thymidylate synthase based on similar side reactions (the exchange of a characteristic proton with water), although the full reaction was not reconstituted. Partial sequence data showed no similarity to known thymidylate synthases simply because the region sequenced was from a distinctive N-terminal region not found in other thymidylate synthases. Members of this protein family appear, therefore, to be a novel, tetrahydromethanopterin-dependent thymidylate synthase. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 199
19737 213790 TIGR03284 thym_sym thymidylate synthase. Members of this protein family are thymidylate synthase, an enzyme that produces dTMP from dUMP. In prokaryotes, its gene usually is found close to that for dihydrofolate reductase, and in some systems the two enzymes are found as a fusion protein. This model excludes a set of related proteins (TIGR03283) that appears to replace this family in archaeal methanogens, where tetrahydrofolate is replaced by tetrahydromethanopterin. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 295
19738 274500 TIGR03285 methan_mark_14 putative methanogenesis marker protein 14. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 445
19739 132329 TIGR03286 methan_mark_15 putative methanogenesis marker protein 15. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. Related proteins include the BadF/BadG/BcrA/BcrD ATPase family (pfam01869), which includes an activator for (R)-2-hydroxyglutaryl-CoA dehydratase. [Energy metabolism, Methanogenesis] 404
19740 274501 TIGR03287 methan_mark_16 putative methanogenesis marker 16 metalloprotein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This protein is predicted to bind FeS clusters, based on the presence of two copies of the Fer4 domain (pfam00037), with each copy having four Cys residues invariant across all members. [Energy metabolism, Methanogenesis] 391
19741 132331 TIGR03288 CoB_CoM_SS_B CoB--CoM heterodisulfide reductase, subunit B. Members of this protein family are subunit B of the CoB--CoM heterodisulfide reductase, or simply heterodisulfide reductase, found in methanogenic archaea. Some archaea species have two copies, HdrB1 and HdrB2. 290
19742 274502 TIGR03289 frhB coenzyme F420 hydrogenase, subunit beta. This model represents that clade of F420-dependent hydrogenases (FRH) beta subunits found exclusively and universally in methanogenic archaea. The N- and C-terminal domains of this protein are modelled by pfam04422 and pfam04423 respectively. 275
19743 132333 TIGR03290 CoB_CoM_SS_C CoB--CoM heterodisulfide reductase, subunit C. The last step in methanogenesis leaves two coenzymes of methanogenesis, CoM and CoB, linked by a disulfide bond. Members of this protein family are the C subunit of the enzyme that reduces the heterodisulfide to CoB-SH and CoM-SH. Similar enzyme complex subunits are found in various other species, but likely act on a different substrate. 144
19744 274503 TIGR03291 methan_mark_17 putative methanogenesis marker protein 17. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 185
19745 274504 TIGR03292 PhnH_redo phosphonate C-P lyase system protein PhnH. PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam05845.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context. 174
19746 274505 TIGR03293 PhnG_redo phosphonate C-P lyase system protein PhnG. PhnG is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam06754.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context. 144
19747 132337 TIGR03294 FrhG coenzyme F420 hydrogenase, subunit gamma. This model represents that clade of F420-dependent hydrogenases (FRH) gamma subunits found exclusively and universally in methanogenic archaea. This protein contains two 4Fe-4S cluster binding domains (pfam00037) and scores above the trusted cutoff to model pfam01058 for the "NADH ubiquinone oxidoreductase, 20 Kd subunit" family. 228
19748 274506 TIGR03295 frhA coenzyme F420 hydrogenase, subunit alpha. This model represents that clade of F420-dependent hydrogenases (FRH) alpha subunits found exclusively and universally in methanogenic archaea. This protein is a member of the Nickel-dependent hydrogenase superfamily represented by Pfam model, pfam00374. 408
19749 274507 TIGR03296 M6dom_TIGR03296 M6 family metalloprotease domain. This model describes a metalloproteinase domain, with a characteristic HExxH motif. Examples of this domain are found in proteins in the family of immune inhibitor A, which cleaves antibacterial peptides, and in other, only distantly related proteases. This model is built to be broader and more inclusive than pfam05547. 285
19750 274508 TIGR03297 Ppyr-DeCO2ase phosphonopyruvate decarboxylase. This family consists of examples of phosphonopyruvate decarboxylase, an enzyme that produces phosphonoacetaldehyde (Pald), the second step in the biosynthesis of phosphonate-containing compounds. Since the preceding enzymatic step, PEP phosphomutase (AepX, TIGR02320) favors the substrate PEP energetically, the decarboxylase is required to drive the reaction in the direction of phosphonate production. Pald is a precursor of natural products including antibiotics like bialaphos and phosphinothricin in Streptomyces species, phosphonate-modified molecules such as the polysaccharide B of Bacteroides fragilis, the phosphonolipids of Tetrahymena pyriformis, and the glycosylinositolphospholipids of Trypanosoma cruzi. This gene generally occurs in prokaryotic organisms adjacent to the gene for AepX. Most often an aminotransferase (aepZ) is also present which leads to the production of the most common phosphonate compound, 2-aminoethylphosphonate (AEP). 361
19751 274509 TIGR03298 argP transcriptional regulator, ArgP family. ArgP used to be known as IciA. ArgP is a positive regulator of argK. It is a negative autoregulator in presence of arginine. It competes with DnaA for oriC iteron (13-mer) binding. It activates dnaA and nrd transcription. It has been demonstrated to be part of the pho regulon. ArgP mutants convey canavanine (an L-arginine structural homolog) sensitivity. [Cellular processes, Toxin production and resistance, DNA metabolism, DNA replication, recombination, and repair, Regulatory functions, DNA interactions] 292
19752 274510 TIGR03299 LGT_TIGR03299 phage/plasmid-like protein TIGR03299. Members of this uncharacterized protein family are found in various Mycobacterium phage genomes, in Streptomyces coelicolor plasmid SCP1, and in bacterial genomes near various markers that suggest lateral gene transfer. The function is unknown. [Mobile and extrachromosomal element functions, Other] 309
19753 274511 TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ. [Protein fate, Protein and peptide secretion and trafficking] 377
19754 274512 TIGR03301 PhnW-AepZ 2-aminoethylphosphonate aminotransferase. This family includes a number of 2-aminoethylphosphonate aminotransferases, some of which are indicated to operate in the catabolism of 2-aminoethylphosphonate (AEP) and others which are involved in the biosynthesis of the same compound. The catabolic enzyme (PhnW) is known to use pyruvate:alanine as the transfer partner and is modeled by the equivalog-level model (TIGR02326). The PhnW family is apparently a branch of a larger tree including genes (AepZ) adjacent to others responsible for the biosynthesis of phosphonoacetaldehyde. The identity of the transfer partner is unknown for these enzymes and considering the reversed flux compared to PhnW, it may very well be different. 355
19755 274513 TIGR03302 OM_YfiO outer membrane assembly lipoprotein YfiO. Members of this protein family include YfiO, a near-essential protein of the outer membrane, part of a complex involved in protein insertion into the bacterial outer membrane. Many proteins in this family are annotated as ComL, based on the involvement of this protein in natural transformation with exogenous DNA in Neisseria gonorrhoeae. This protein family shows sequence similarity to, but is distinct from, the tol-pal system protein YbgF (TIGR02795). [Protein fate, Protein and peptide secretion and trafficking] 235
19756 274514 TIGR03303 OM_YaeT outer membrane protein assembly complex, YaeT protein. Members of this protein family are the YaeT protein of the YaeT/YfiO complex for assembling proteins into the outer membrane of Gram-negative bacteria. This protein is similar in sequence and function to a non-essential paralog, YtfM, that is also in the Omp85 family. Members of this family typically have five tandem copies of the surface antigen variable number repeat (pfam07244), followed by an outer membrane beta-barrel domain (pfam01103), while the YtfM family typically has a single pfam07244 repeat. [Protein fate, Protein and peptide secretion and trafficking] 741
19757 274515 TIGR03304 OMP85_target outer membrane insertion C-terminal signal. This hidden Markov model detects a 10-residue targeting sequence common to beta-barrel outer membrane proteins (OMP) that rely on Omp85-like proteins for insertion into the outer membrane. Hits should be trusted if they include the last amino acid of a protein sequence that occurs in Gram-negative bacteria. It has been noted that Omp85 target sequences differ somewhat by species, while this model works generally for most Proteobacteria. 10
19758 132348 TIGR03305 alt_F1F0_F1_bet alternate F1F0 ATPase, F1 subunit beta. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 beta subunit of this apparent second ATP synthase. 449
19759 132349 TIGR03306 altF1_A alternate F1F0 ATPase, F0 subunit A. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit A of this apparent second ATP synthase. 217
19760 163212 TIGR03307 PhnP phosphonate metabolism protein PhnP. This family of proteins is found in operons encoding phosphonate C-P lyase systems, as observed in E. coli, and is a member of the metallo-beta-lactamase superfamily (pfam00753). As defined by this model, all instances of this protein are associated with the C-P lyase, but not all genomes containing the C-P lyase system contain phnP. 238
19761 132351 TIGR03308 phn_thr-fam phosphonate metabolism protein, transferase hexapeptide repeat family. This family of proteins contains copies of the Bacterial transferase hexapeptide repeat family (pfam00132) and is only found in operons encoding the phosphonate C-P lyase system (GenProp0232). Many C-P lyase operons, however, lack a homolog of this protein. 204
19762 132352 TIGR03309 matur_yqeB selenium-dependent molybdenum hydroxylase system protein, YqeB family. Members of this protein family are probable accessory proteins for the biosynthesis of enzymes with labile selenium-containing centers, different from selenocysteine-containing proteins. 256
19763 274516 TIGR03310 matur_MocA_YgfJ molybdenum cofactor cytidylyltransferase. Members of this protein family include MocA, which transfers cytosine from CTP to molybdopterin during molybdopterin cytosine dinucleotide (MCD) cofactor biosynthesis. It is distantly related to MobA, the GTP:molybdopterin guanylyltransferase. The MocA family is particularly closely related in phylogenetic distribution to other markers of selenium-dependent molybdenum hydroxylases (SDMH), suggesting most SDMH must use MCD rather than molybdopterin guanine dinucleotide. 188
19764 132354 TIGR03311 Se_dep_XDH selenium-dependent xanthine dehydrogenase. Members of this protein family resemble conventional xanthine dehydrogenase enzymes, which depend on molybdenum cofactor - molybdopterin bound to molybdate with two sulfur atoms as ligands. But all members of this family occur in species that contain markers for the biosynthesis of enzymes with a selenium-containing form of molybdenum cofactor. The member of this family from Enterococcus faecalis has been shown to act as a xanthine dehydrogenase, and its activity is dependent on SelD (selenophosphate synthase), selenium, and molybdenum. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 848
19765 132355 TIGR03312 Se_sel_red_FAD probable selenate reductase, FAD-binding subunit. This protein is suggested by Bebien, et al., to be the FAD-binding subunit of a molybdopterin-containing selenate reductase. Our comparative genomics suggests it to be a subunit of a selenium-dependent molybdenum hydroxylase for an unknown substrate. 257
19766 132356 TIGR03313 Se_sel_red_Mo probable selenate reductase, molybdenum-binding subunit. Our comparative genomics suggests this protein family to be a subunit of a selenium-dependent molybdenum hydroxylase, although the substrate is not specified. This protein is suggested by Bebien, et al., to be the molybdenum-binding subunit of a molybdopterin-containing selenate reductase. Xi, et al., however, show that mutation of this gene in E. coli conferred sensitivity to adenine, suggesting a defect in purine interconversion. This finding, plus homology of nearby genes in a 23-gene purine catabolism region in E. coli to xanthine dehydrogenase subunits, suggests xanthine dehydrogenase activity. 951
19767 132357 TIGR03314 Se_ssnA putative selenium metabolism protein SsnA. Members of this protein family are found exclusively in genomes that contain a putative set of labile selenium-dependent enzyme accessory proteins as well as homologs of a labile selenium-dependent purine hydroxylase. A mutant in this gene in Escherichia coli had improved stationary phase viability. The function is unknown. 441
19768 132358 TIGR03315 Se_ygfK putative selenate reductase, YgfK subunit. Members of this protein family are YgfK, predicted to be one subunit of a three-subunit, molybdopterin-containing selenate reductase. This enzyme is found, typically, in genomic regions associated with xanthine dehydrogenase homologs predicted to belong to the selenium-dependent molybdenum hydroxylases (SDMH). Therefore, the selenate reductase is suggested to play a role in furnishing selenide for SelD, the selenophosphate synthase. 1012
19769 274517 TIGR03316 ygeW knotted carbamoyltransferase YgeW. Members of this protein family include the ygeW gene product of Escherichia coli. The function is unknown. Members show homology to ornithine carbamoyltransferase (TIGR00658) and aspartate carbamoyltransferase (carbamoyltransferase), and therefore may belong to the carbamoyltransferases in function. Members often are found in a large, conserved genomic region associated with selenium-dependent molybdenum hydroxylases. 391
19770 274518 TIGR03317 ygfZ_signature folate-binding protein YgfZ. YgfZ is a protein from Escherichia coli, homologous to the glycine cleavage system T protein, or aminomethyltransferase, GcvT (TIGR00528). Homologs of YgfZ other than members of the GcvT family share a well-conserved signature region that includes the motif, KGCYxGQE. Elsewhere, sequence divergence and length variation are substantial. Members of this family are mostly bacterial, largely absent from the Firmicutes and otherwise usually present. A few eukaryotic examples are found among the Apicomplexa, and a few archaeal sequences are found. Two functions implicated for this folate-binding protein are RNA modification (a function likely to be conserved) and replication initiation (a function likely to be highly variable). Many members of this family are, at the time of construction of this model, misnamed as the glycine cleavage system T protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 67
19771 132361 TIGR03318 YdfZ_fam putative selenium-binding protein YdfZ. This small protein has a very limited distribution, being found so far only among some gamma-Proteobacteria. The member from Escherichia coli was shown to bind selenium in the absence of a working SelD-dependent selenium incorporation system. Note that while the E. coli member contains a single Cys residue, a likely selenium binding site, some other members of this protein family contain two Cys residues or none. [Unknown function, General] 65
19772 188306 TIGR03319 RNase_Y ribonuclease Y. Members of this family are RNase Y, an endoribonuclease. The member from Bacillus subtilis, YmdA, has been shown to be involved in turnover of yitJ riboswitch. [Transcription, Degradation of RNA] 514
19773 132364 TIGR03321 alt_F1F0_F0_B alternate F1F0 ATPase, F0 subunit B. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit B of this apparent second ATP synthase. 246
19774 132365 TIGR03322 alt_F1F0_F0_C alternate F1F0 ATPase, F0 subunit C. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit C of this apparent second ATP synthase. 86
19775 274519 TIGR03323 alt_F1F0_F1_gam alternate F1F0 ATPase, F1 subunit gamma. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 gamma subunit of this apparent second ATP synthase. 285
19776 132367 TIGR03324 alt_F1F0_F1_al alternate F1F0 ATPase, F1 subunit alpha. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 alpha subunit of this apparent second ATP synthase. 497
19777 132368 TIGR03325 BphB_TodD cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase. Members of this family occur as the BphB protein of biphenyl catabolism and as the TodD protein of toluene catabolism. Members catalyze the second step in each pathway and proved interchangeable when tested; the first and fourth enzymes in each pathway confer metabolic specificity. In the context of biphenyl degradation, the enzyme acts as cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase (EC 1.3.1.56), while in toluene degradation it acts as cis-toluene dihydrodiol dehydrogenase. 262
19778 188307 TIGR03326 rubisco_III ribulose bisphosphate carboxylase, type III. Members of this protein family are the archaeal, single chain, type III form of ribulose bisphosphate carboxylase, or RuBisCO. Members act in a three-step pathway for conversion of the sugar moiety of AMP to two molecules of 3-phosphoglycerate. Many of these species use ADP-dependent sugar kinases, which form AMP, for glycolysis. [Energy metabolism, Sugars] 411
19779 274520 TIGR03327 AMP_phos AMP phosphorylase. This enzyme family is found, so far, strictly in the Archaea, and only in those with a type III Rubisco enzyme. Most of the members previously were annotated as thymidine phosphorylase, or DeoA. The AMP metabolized by this enzyme may be produced by ADP-dependent sugar kinases. 494
19780 274521 TIGR03328 salvage_mtnB methylthioribulose-1-phosphate dehydratase. Members of this family are the methylthioribulose-1-phosphate dehydratase of the methionine salvage pathway. This pathway allows methylthioadenosine, left over from polyamine biosynthesis, to be recycled to methionine. [Amino acid biosynthesis, Aspartate family] 192
19781 274522 TIGR03329 Phn_aa_oxid putative aminophosphonate oxidoreductase. This clade of sequences are members of the pfam01266 family of FAD-dependent oxidoreductases. Characterized proteins within this family include glycerol-3-phosphate dehydrogenase (1.1.99.5), sarcosine oxidase beta subunit (1.5.3.1) and a number of deaminating amino acid oxidases (1.4.-.-). These genes have been consistently observed in a genomic context including genes for the import and catabolism of 2-aminoethylphosphonate (AEP). If the substrate of this oxidoreductase is AEP itself, then it is probably acting in the manner of a deaminating oxidase, resulting in the same product (phosphonoacetaldehyde) as the transaminase PhnW (TIGR02326), but releasing ammonia instead of coupling to pyruvate:alanine. Alternatively, it is reasonable to suppose that the various ABC cassette transporters which are also associated with these loci allow the import of phosphonates closely related to AEP which may not be substrates for PhnW. 460
19782 274523 TIGR03330 SAM_DCase_Bsu S-adenosylmethionine decarboxylase proenzyme, Bacillus form. Members of this protein family are the single chain precursor of the two chains of the mature S-adenosylmethionine decarboxylase as found in Methanocaldococcus jannaschii, Bacillus subtilis, and a wide range of other species. It differs substantially in architecture from the form as found in Escherichia coli, and lacks any extended homology to the eukaryotic form (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis] 112
19783 274524 TIGR03331 SAM_DCase_Eco S-adenosylmethionine decarboxylase proenzyme, Escherichia coli form. Members of this protein family are the single chain precursor of the S-adenosylmethionine decarboxylase as found in Escherichia coli. This form shows a substantially different architecture from the form shared by the Archaea, Bacillus, and many other species (TIGR03330). It shows little or no similarity to the form found in eukaryotes (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis] 259
19784 132375 TIGR03332 salvage_mtnW 2,3-diketo-5-methylthiopentyl-1-phosphate enolase. Members of this family are the methionine salvage pathway enzyme 2,3-diketo-5-methylthiopentyl-1-phosphate enolase, a homolog of RuBisCO. This protein family seems restricted to Bacillus subtilis and close relatives, where two separate proteins carry the enolase and phosphatase activities that in other species occur in a single protein, MtnC (TIGR01691). [Amino acid biosynthesis, Aspartate family, Central intermediary metabolism, Sulfur metabolism] 407
19785 213797 TIGR03333 salvage_mtnX 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase. Members of this family are the methionine salvage enzyme MtnX, a member of the HAD-superfamily hydrolases, subfamily IB (see TIGR01488). Members are found in Bacillus subtilis and related species, paired with MtnW (TIGR03332). In most species that recycle methionine from methylthioadenosine, the single protein MtnC replaces the MtnW/MtnX pair. In B. subtilis, mtnX was first known as ykrX. [Amino acid biosynthesis, Aspartate family, Central intermediary metabolism, Sulfur metabolism] 214
19786 274525 TIGR03334 IOR_beta indolepyruvate ferredoxin oxidoreductase, beta subunit. This model represents the beta subunit of indolepyruvate ferredoxin oxidoreductase, an alpha(2)/beta(2) tetramer, as found in Pyrococcus furiosus and Methanobacterium thermoautotrophicum. Cofactors for the tetramer include TPP, 4Fe4S, and 3Fe-4S. It shows considerable sequence similarity to subunits of several other ketoacid oxidoreductases. 189
19787 132378 TIGR03335 F390_ftsA coenzyme F390 synthetase. This enzyme, characterized in Methanobacterium thermoautotrophicum and found in several other methanogens, modifies coenzyme F420 by ligation of AMP (or GMP) from ATP (or GTP). On F420, it activates an aromatic hydroxyl group, which is unusual chemistry for an adenylyltransferase. This enzyme name has been attached to numbers of uncharacterized genes likely to instead act as phenylacetate CoA ligase, based on proximity to predicted indolepyruvate ferredoxin oxidoreductase (1.2.7.8) genes. The enzyme acts during transient exposure of the organism to oxygen. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 445
19788 274526 TIGR03336 IOR_alpha indolepyruvate ferredoxin oxidoreductase, alpha subunit. Indolepyruvate ferredoxin oxidoreductase (IOR) is an alpha 2/beta 2 tetramer related to ketoacid oxidoreductases for pyruvate (1.2.7.1, POR), 2-ketoglutarate (1.2.7.3, KOR), and 2-oxoisovalerate (1.2.7.7, VOR). These multi-subunit enzymes typically are found in anaerobes and are inactivated by oxygen. IOR in Pyrococcus acts in fermentation of all three aromatic amino acids, following removal of the amino group by transamination. In Methanococcus maripaludis, by contrast, IOR acts in the opposite direction, in pathways of amino acid biosynthesis from phenylacetate, indoleacetate, and p-hydroxyphenylacetate. In M. maripaludis and many other species, iorA and iorB are found next to an apparent phenylacetate-CoA ligase. 595
19789 132380 TIGR03337 phnR phosphonate utilization transcriptional regulator PhnR. This family of proteins are members of the GntR family (pfam00392) containing an N-terminal helix-turn-helix (HTH) motif. This clade is found adjacent to or inside of operons for the degradation of 2-aminoethylphosphonate (AEP) in Salmonella, Vibrio, Aeromonas hydrophila, Hahella chejuensis and Psychromonas ingrahamii. [Regulatory functions, DNA interactions] 231
19790 132381 TIGR03338 phnR_burk phosphonate utilization associated transcriptional regulator. This family of proteins are members of the GntR family (pfam00392) containing an N-terminal helix-turn-helix (HTH) motif. This clade is found adjacent to or inside of operons for the degradation of 2-aminoethylphosphonate (AEP) in Polaromonas, Burkholderia, Ralstonia and Verminephrobacter. 212
19791 132382 TIGR03339 phn_lysR aminoethylphosphonate catabolism associated LysR family transcriptional regulator. This group of sequences represents a number of related clades with numerous examples of members adjacent to operons for the degradation of 2-aminoethylphosphonate (AEP) in Pseudomonas, Ralstonia, Bordetella and Burkholderia species. These are transcriptional regulators of the LysR family which contain a helix-turn-helix (HTH) domain (pfam00126) and a periplasmic substrate-binding protein-like domain (pfam03466). [Regulatory functions, DNA interactions] 279
19792 274527 TIGR03340 phn_DUF6 phosphonate utilization associated putative membrane protein. This family of hydrophobic proteins has some homology to families of integral membrane proteins such as (pfam00892) and may be a permease. It occurs in the vicinity of various types of operons for the catabolism of phosphonates in Vibrio, Pseudomonas, Polaromonas and Thiomicrospira. 281
19793 132384 TIGR03341 YhgI_GntY IscR-regulated protein YhgI. IscR (TIGR02010) is an iron-sulfur cluster-binding transcriptional regulator (see Genome Property GenProp0138). Members of this protein family include YhgI, whose expression is under control of IscR, and show sequence similarity to IscA, a known protein of iron-sulfur cluster biosynthesis. These two lines of evidence strongly suggest a role as an iron-sulfur cluster biosynthesis protein. An older study designated this protein GntY and suggested a role for it and for the product of an adjacent gene, based on complementation studies, in gluconate utilization. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 190
19794 213798 TIGR03342 dsrC_tusE_dsvC sulfur relay protein, TusE/DsrC/DsvC family. Members of this protein family may be described as TusE, a partner to TusBCD in a sulfur relay system for 2-thiouridine biosynthesis, a tRNA base modification process. Other members are DsrC, a functionally similar protein in species where the sulfur relay system exists primarily for sulfur metabolism rather than tRNA base modification. Some members of this family are known explicitly as the gamma subunit of sulfite reductases. 108
19795 132386 TIGR03343 biphenyl_bphD 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate hydrolase. Members of this family are 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate hydrolase, or HOPD hydrolase, the BphD protein of biphenyl degradation. BphD acts on the product of ring meta-cleavage by BphC. Many species carrying bphC and bphD are capable of degrading polychlorinated biphenyls as well as biphenyl itself. 282
19796 213799 TIGR03344 VI_effect_Hcp1 type VI secretion system effector, Hcp1 family. This family includes Hcp1 (hemolysin coregulated protein 1), an exported, homohexameric ring-forming virulence protein from Pseudomonas aeruginosa. Hcp1 lacks a conventional signal sequence and is instead exported by means of the type VI secretion system, encoded by a pathogenicity cluster of a class previously designated IAHP (IcmF-associated homologous protein). Homologs of Hcp1, in this protein family, are found in various bacteria of which most but not all are known pathogens. Pathogens may have multiple members of this family, with three to ten in Erwinia carotovora, Yersinia pestis, uropathogenic Escherichia coli, and the insect pathogen Photorhabdus luminescens. [Cellular processes, Pathogenesis] 166
19797 274528 TIGR03345 VI_ClpV1 type VI secretion ATPase, ClpV1 family. Members of this protein family are homologs of ClpB, an ATPase associated with chaperone-related functions. These ClpB homologs, designated ClpV1, are a key component of the bacterial pathogenicity-associated type VI secretion system. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 852
19798 274529 TIGR03346 chaperone_ClpB ATP-dependent chaperone ClpB. Members of this protein family are the bacterial ATP-dependent chaperone ClpB. This protein belongs to the AAA family, ATPases associated with various cellular activities (pfam00004). This molecular chaperone does not act as a protease, but rather serves to disaggregate misfolded and aggregated proteins. [Protein fate, Protein folding and stabilization] 850
19799 274530 TIGR03347 VI_chp_1 type VI secretion protein, VC_A0111 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 300
19800 274531 TIGR03348 VI_IcmF type VI secretion protein IcmF. Members of this protein family are IcmF homologs and tend to be associated with type VI secretion systems. [Cellular processes, Pathogenesis] 1169
19801 274532 TIGR03349 IV_VI_DotU type IV / VI secretion system protein, DotU family. At least two families of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family (TIGR03348). The other is the family described by this model. Members include DotU from the Legionella pneumophila type IV secretion system. Many of the members of this protein family from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 183
19802 274533 TIGR03350 type_VI_ompA type VI secretion system peptidoglycan-associated domain. The flagellar motor protein MotB, the Gram-negative bacterial outer membrane protein OmpA (with an N-terminal outer membrane beta barrel domain) share a C-terminal peptidoglycan-associating homology region. This model describes a domain found fused to type VI secretion system homologs of the type IV system protein DotU (see model TIGR03349), with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 137
19803 274534 TIGR03351 PhnX-like phosphonatase-like hydrolase. This clade of sequences are the closest homologs to the PhnX enzyme, phosphonoacetaldehyde (Pald) hydrolase (phosphonatase, TIGR01422). This phosphonatase-like enzyme and PhnX itself are members of the haloacid dehalogenase (HAD) superfamily (pfam00702) having a number of distinctive features that set them apart from typical HAD enzymes. The typical HAD N-terminal motif DxDx(T/V) here is DxAGT and the usual conserved lysine prior to the C-terminal motif is instead an arginine. Also distinctive of phosphonatase, and particular to its bi-catalytic mechanism is a conserved lysine in the variable "cap" domain. This lysine forms a Schiff base with the aldehyde of phosphonoacetaldehyde, providing, through the resulting positive charge, a polarization of the C-P bond necessary for cleavage as well as a route to the initial product of cleavage, an ene-amine. The conservation of these elements in this phosphonatase-like enzyme suggests that the substrate is also, like Pald, a 2-oxo-ethylphosphonate. Despite this, the genomic context of members of this family is quite distinct from that of PhnX, which is almost invariably associated with the 2-aminoethylphosphonate transaminase PhnW (TIGR02326), the source of the substrate Pald. Members of this clade are never associated with PhnW, but rather associate with families of FAD-dependent oxidoreductases related to deaminating amino acid oxidases (pfam01266) as well as zinc-dependent dehydrogenases (pfam00107). Notably, family members from Arthrobacter aurescens TC1 and Nocardia farcinica IFM 10152 are adjacent to the PhnCDE ABC cassette phosphonates transporter (GenProp0236) typically found in association with the phosphonates C-P lyase system (GenProp0232). These observations suggest two possibilities. First, the substrate for this enzyme family is also Pald, the non-association with PhnW notwithstanding. Alternatively, the substrate is something very closely related such as hydroxyphosphonoacetaldehyde (Hpald). Hpald could come from oxidative deamination of 1-hydroxy-2-aminoethylphosphonate (HAEP) by the associated oxidase. HAEP would not be a substrate for PhnW due to its high specificity for AEP. HAEP has been shown to be a constituent of the sphingophosphonolipid of Bacteriovorax stolpii, and presumably has other natural sources. If Hpald is the substrate, the product would be glycolaldehyde (hydroxyacetaldehyde), and the associated alcohol dehydrogenase may serve to convert this to glycol. 220
19804 274535 TIGR03352 VI_chp_3 type VI secretion lipoprotein, VC_A0113 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 130
19805 274536 TIGR03353 VI_chp_4 type VI secretion protein, VC_A0114 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 439
19806 274537 TIGR03354 VI_FHA type VI secretion system FHA domain protein. Members of this protein family are FHA (forkhead-associated) domain-containing proteins that are part of type VI secretion loci in a considerable number of bacteria, most of which are known pathogens. Species include Pseudomonas aeruginosa PAO1, Aeromonas hydrophila, Yersinia pestis, Burkholderia mallei, etc. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 396
19807 274538 TIGR03355 VI_chp_2 type VI secretion protein, EvpB/VC_A0108 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 473
19808 274539 TIGR03356 BGL beta-galactosidase. 426
19809 274540 TIGR03357 VI_zyme type VI secretion system lysozyme-like protein. The description for Pfam family pfam04965 cites acidic lysozyme activity for some phage-encoded members. This family represents a different subgroup of the proteins from pfam04965, where all members are associated with bacterial type VI secretion system genomic contexts. 133
19810 132401 TIGR03358 VI_chp_5 type VI secretion protein, VC_A0107 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 159
19811 274541 TIGR03359 VI_chp_6 type VI secretion protein, VC_A0110 family. This protein family is associated with type VI secretion in a number of pathogenic bacteria. Mutation is associated with impaired virulence, such as impaired infection of plants by Rhizobium leguminosarum. 598
19812 132403 TIGR03360 VI_minor_1 type VI secretion-associated protein, VC_A0118 family. Members of this protein family, including VC_A0118 from Vibrio cholerae El Tor N16961, are restricted to a subset of bacteria with the type VI secretion system, and are encoded among the type VI-associated pathogenicity islands. However, many species with type VI secretion lack a member of this family. This lack suggests that members of this family may be targets rather than components of the type VI secretion system. 185
19813 274542 TIGR03361 VI_Rhs_Vgr type VI secretion system Vgr family protein. Members of this protein family belong to the Rhs element Vgr protein family (see TIGR01646), but furthermore all are found in genomes with type VI secretion loci. However, members of this protein family, although recognizably correlated to type VI secretion according to the partial phylogenetic profiling algorithm, are often found far from the type VI secretion locus. 513
19814 274543 TIGR03362 VI_chp_7 type VI secretion-associated protein, VC_A0119 family. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 301
19815 274544 TIGR03363 VI_chp_8 type VI secretion-associated protein, ImpA family. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). 353
19816 132407 TIGR03364 HpnW_proposed FAD dependent oxidoreductase TIGR03364. This clade of FAD dependent oxidoreductases (members of the pfam01266 family) is syntenically associated with a family of proposed phosphonatase-like enzymes (TIGR03351) and is also found (less frequently) in association with phosphonate transporter components. A likely role for this enzyme involves the oxidative deamination of an aminophosphonate differing slightly from 2-aminoethylphosphonate, possibly 1-hydroxy-2-aminoethylphosphonate (see the comments for TIGR03351). Many members of the larger FAD dependent oxidoreductase family act as amino acid oxidative deaminases. 365
19817 274545 TIGR03365 Bsubt_queE 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE. This uncharacterized enzyme, designated QueE, participates in the biosynthesis, from GTP, of 7-cyano-7-deazaguanosine, also called preQ0 because in many species it is a precursor of queuosine. In most Archaea, it is instead the precursor of a different tRNA modified base, archaeosine. [Protein synthesis, tRNA and rRNA base modification] 238
19818 274546 TIGR03366 HpnZ_proposed putative phosphonate catabolism associated alcohol dehydrogenase. This clade of zinc-binding alcohol dehydrogenases (members of pfam00107) are repeatedly associated with genes proposed to be involved with the catabolism of phosphonate compounds. 280
19819 274547 TIGR03367 queuosine_QueD queuosine biosynthesis protein QueD. Members of this protein family, closely related to eukaryotic 6-pyruvoyl tetrahydrobiopterin synthase enzymes, are the QueD protein of queuosine biosynthesis. Queuosine is a hypermodified base in the wobble position of tRNAs for Tyr, His, Asp, and Asn in many species. This modification, although widespread, appears not to be important for viability. The queuosine precursor made by this enzyme may be converted instead to archaeosine as in some Archaea. [Protein synthesis, tRNA and rRNA base modification] 89
19820 132411 TIGR03368 cellulose_yhjU cellulose synthase operon protein YhjU. This protein was identified by the partial phylogenetic profiling algorithm as part of the system for cellulose biosynthesis in bacteria, and in fact is found in cellulose biosynthesis gene regions. The protein was designated YhjU in Salmonella enteritidis, where disruption of its gene disrupts cellulose biosynthesis and biofilm formation. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 518
19821 274548 TIGR03369 cellulose_bcsE cellulose biosynthesis protein BcsE. This protein, called BcsE (bacterial cellulose synthase E) or YhjS, is required for cellulose biosynthesis in Salmonella enteritidis. Its role in this process across multiple bacterial species is implied by the partial phylogenetic profiling algorithm. Members are found in the vicinity of other cellulose biosynthesis genes. The model does not include a much less well-conserved N-terminal region about 150 amino acids in length for most members. Solano, et al. suggest this protein acts as a protease. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 322
19822 132413 TIGR03370 VPLPA-CTERM VPLPA-CTERM protein sorting domain. A probable protein export sorting signal, PEP-CTERM, was described by Haft, et al. It is predicted to interact with a putative transpeptidase we designate exosortase. This model describes a variant with conserved motif VPLPA, rather than VPEP. It appears to be the recognition sequence for exosortase D (TIGR04152). This variant is found prominently in two members of the Rhodobacterales, namely Jannaschia sp. CCS1 and Roseobacter denitrificans OCh 114. One interesting member protein has a full-length duplication and therefore two copies of this putative sorting domain. 25
19823 274549 TIGR03371 cellulose_yhjQ cellulose synthase operon protein YhjQ. Members of this family are the YhjQ protein, found immediately upstream of bacterial cellulose synthase (bcs) genes in a broad range of bacteria, including both copies of the bcs locus in Klebsiella pneumoniae. In several species it is seen clearly as part of the bcs operon. It is identified as a probable component of the bacterial cellulose metabolic process not only by gene location, but also by partial phylogenetic profiling, or Haft-Selengut algorithm, based on a bacterial cellulose biosynthesis genome property profile. Cellulose plays an important role in biofilm formation and structural integrity in some bacteria. Mutants in yhjQ in Escherichia coli show altered morphology and growth, but the function of YhjQ has not yet been determined. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 246
19824 132415 TIGR03372 putres_am_tran putrescine aminotransferase. Members of this family are putrescine aminotransferase, as found in Escherichia coli, Erwinia carotovora subsp. atroseptica, and closely related species. This pyridoxal phosphate enzyme, as characterized in E. coli, can act also on cadaverine and, more weakly, spermidine. [Central intermediary metabolism, Polyamine biosynthesis] 442
19825 132416 TIGR03373 VI_minor_4 type VI secretion-associated protein, BMA_A0400 family. Members of this protein family are found exclusively, although not universally, in bacterial species that possess a type VI secretion system. Genes are found in type VI secretion-associated gene clusters. The specific function is unknown. This model represents the rather well-conserved amino-terminal domain of a protein family in which carboxy-terminal regions, when present, show little conservation. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 136
19826 132417 TIGR03374 ABALDH 1-pyrroline dehydrogenase. Members of this protein family are 1-pyrroline dehydrogenase (1.5.1.35), also called gamma-aminobutyraldehyde dehydrogenase. This enzyme can follow putrescine transaminase (EC 2.6.1.82) for a two-step conversion of putrescine to gamma-aminobutyric acid (GABA). The member from Escherichia coli is characterized as a homotetramer that binds one NADH per monomer. This enzyme belongs to the medium-chain aldehyde dehydrogenases, and is quite similar in sequence to the betaine aldehyde dehydrogenase (EC 1.2.1.8) family. 472
19827 274550 TIGR03375 type_I_sec_LssB type I secretion system ATPase, LssB family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. This model is related to models TIGR01842 and TIGR01846, and to bacteriocin ABC transporters that cleave their substrates during export. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 694
19828 274551 TIGR03376 glycerol3P_DH glycerol-3-phosphate dehydrogenase (NAD(+)). Members of this protein family are the eukaryotic enzyme, glycerol-3-phosphate dehydrogenase (NAD(+)) (EC 1.1.1.8). Enzymatic activity for 1.1.1.8 is defined as sn-glycerol 3-phosphate + NAD(+) = glycerone phosphate + NADH. Note the very similar reactions of enzymes defined as EC 1.1.1.94 and 1.1.99.5, assigned to families of proteins in the bacteria. 342
19829 274552 TIGR03377 glycerol3P_GlpA glycerol-3-phosphate dehydrogenase, anaerobic, A subunit. Members of this protein family are the A subunit, product of the glpA gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic] 516
19830 213807 TIGR03378 glycerol3P_GlpB glycerol-3-phosphate dehydrogenase, anaerobic, B subunit. Members of this protein family are the B subunit, product of the glpB gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic] 419
19831 132422 TIGR03379 glycerol3P_GlpC glycerol-3-phosphate dehydrogenase, anaerobic, C subunit. Members of this protein family are the membrane-anchoring, non-catalytic C subunit, product of the glpC gene, of a three-subunit, FAD-dependent, anaerobic glycerol-3-phosphate dehydrogenase. GlpC lacks classical hydrophobic transmembrane helices; Cole, et al. suggest interaction with the membrane may involve amphipathic helices. GlpC has conserved Cys-containing motifs suggestive of iron-sulfur binding. This complex is found mostly in Escherichia coli and closely related species. [Energy metabolism, Anaerobic] 397
19832 132423 TIGR03380 agmatine_aguA agmatine deiminase. Members of this family are agmatine deiminase (3.5.3.12), as characterized in Pseudomonas aeruginosa and plants. Related deiminases include the peptidyl-arginine deiminase (3.5.3.15) as found in Porphyromonas gingivalis. [Central intermediary metabolism, Polyamine biosynthesis] 357
19833 274553 TIGR03381 agmatine_aguB N-carbamoylputrescine amidase. Members of this family are N-carbamoylputrescine amidase (3.5.1.53). Bacterial genes are designated AguB. The AguAB pathway replaces SpeB for conversion of agmatine to putrescine in two steps rather than one. [Central intermediary metabolism, Polyamine biosynthesis] 279
19834 213808 TIGR03382 GC_trans_RRR Myxococcales GC_trans_RRR domain. The domain described here is small (about 30 amino acids), hydrophobic, only moderately conserved, and similar to numerous other transmembrane helix-containing sequence regions from convergent evolution. This domain is found, once per protein but in many proteins per genome in several bacteria of the order Myxococcales. It begins with a signature Gly-Cys motif. Its other features, including a hydrophobic transmembrane helix, Arg-rich cluster, and location at the protein C-terminus, resemble the PEP-CTERM proposed protein targeting domain. 27
19835 274554 TIGR03383 urate_oxi urate oxidase. Members of this protein family are urate oxidase, also called uricase. This protein contains two copies of the domain described by the uricase model pfam01014. In animals, this enzyme has been lost from primates and birds. [Central intermediary metabolism, Other] 282
19836 132427 TIGR03384 betaine_BetI transcriptional repressor BetI. BetI is a DNA-binding transcriptional repressor of the bet (betaine) regulon. In sequence, it is related to TetR (pfam00440). Choline, through BetI, induces the expression of the betaine biosynthesis genes betA and betB by derepression. The choline porter gene betT is also part of this regulon in Escherichia coli. Note that a different transcriptional regulator, ArcA, controls the expression of bet regulon genes in response to oxygen, as BetA is an oxygen-dependent enzyme. [Regulatory functions, DNA interactions] 189
19837 163244 TIGR03385 CoA_CoA_reduc CoA-disulfide reductase. Members of this protein family are CoA-disulfide reductase (EC 1.8.1.14), as characterized in Staphylococcus aureus, Pyrococcus horikoshii, and Borrelia burgdorferi, and inferred in several other species on the basis of high levels of CoA and an absence of glutathione as a protective thiol. [Cellular processes, Detoxification] 427
19838 274555 TIGR03388 ascorbase L-ascorbate oxidase, plant type. Members of this protein family are the copper-containing enzyme L-ascorbate oxidase (EC 1.10.3.3), also called ascorbase. This family is found in flowering plants, and shows greater sequence similarity to a family of laccases (EC 1.10.3.2) from plants than to other known ascorbate oxidases. 541
19839 274556 TIGR03389 laccase laccase, plant. Members of this protein family include the copper-containing enzyme laccase (EC 1.10.3.2), often several from a single plant species, and additional, uncharacterized, closely related plant proteins termed laccase-like multicopper oxidases. This protein family shows considerable sequence similarity to the L-ascorbate oxidase (EC 1.10.3.3) family. Laccases are enzymes of rather broad specificity, and classification of all proteins scoring above the trusted cutoff of this model as laccases may be appropriate. 539
19840 132431 TIGR03390 ascorbOXfungal L-ascorbate oxidase, fungal type. This model describes a family of fungal ascorbate oxidases, within a larger family of multicopper oxidases that also includes plant ascorbate oxidases (TIGR03388), plant laccases and laccase-like proteins (TIGR03389), and related proteins. The member from Acremonium sp. HI-25 is characterized. 538
19841 274557 TIGR03391 FeS_syn_CsdE cysteine desulfurase, sulfur acceptor subunit CsdE. Members of this protein family are CsdE, formerly called YgdK. This protein, found as a paralog to SufE in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, works together and physically interacts with CsdA (a paralog of SufS). CsdA has cysteine desulfurase activity that is enhanced by this protein (CsdE), in which Cys-61 (numbered as in E. coli) is a sulfur acceptor site. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 139
19842 274558 TIGR03392 FeS_syn_CsdA cysteine desulfurase, catalytic subunit CsdA. Members of this protein family are CsdA. This protein, found in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, and related to SufS, works together with and physically interacts with CsdE (a paralog of SufE). CsdA has cysteine desulfurase activity that is enhanced by CsdE, a sulfur acceptor protein. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 398
19843 274559 TIGR03393 indolpyr_decarb indolepyruvate decarboxylase, Erwinia family. A family of closely related, thiamine pyrophosphate-dependent enzymes includes indolepyruvate decarboxylase (EC 4.1.1.74), phenylpyruvate decarboxylase (EC 4.1.1.43), pyruvate decarboxylase (EC 4.1.1.1), branched-chain alpha-ketoacid decarboxylase, etc. Members of this group of homologs may overlap in specificity. Within the larger family, this model represents a clade of bacterial indolepyruvate decarboxylases, part of a pathway for biosynthesis of the plant hormone indole-3-acetic acid. Typically, these species interact with plants, as pathogens or as beneficial, root-associated bacteria. [Central intermediary metabolism, Other] 539
19844 274560 TIGR03394 indol_phenyl_DC indolepyruvate/phenylpyruvate decarboxylase, Azospirillum family. A family of closely related, thiamine pyrophosphate-dependent enzymes includes indolepyruvate decarboxylase (EC 4.1.1.74), phenylpyruvate decarboxylase (EC 4.1.1.43), pyruvate decarboxylase (EC 4.1.1.1), branched-chain alpha-ketoacid decarboxylase, etc. Members of this group of homologs may overlap in specificity. This model represents a clade that includes an Azospirillum brasilense member active as both phenylpyruvate decarboxylase and indolepyruvate decarboxylase. 535
19845 132436 TIGR03395 sphingomy sphingomyelin phosphodiesterase. Members of this family are bacterial proteins that act as sphingomyelin phosphodiesterase (EC 3.1.4.12), also called sphingomyelinase. Some members of this family have been shown to act as hemolysins. [Cellular processes, Pathogenesis] 283
19846 274561 TIGR03396 PC_PLC phospholipase C, phosphocholine-specific, Pseudomonas-type. Members of this protein family are bacterial, phosphatidylcholine-hydrolyzing phospholipase C enzymes, with a characteristic domain architecture as found in hemolytic (PlcH) and nonhemolytic (PlcN) secreted enzymes of Pseudomonas aeruginosa. PlcH hydrolyzes phosphatidylcholine to diacylglycerol and phosphocholine, but unlike PlcN can also hydrolyze sphingomyelin to ceramide (N-acylsphingosine) and phosphocholine. Members of this family share the twin-arginine signal sequence for Sec-independent transport across the plasma membrane. PlcH is secreted as a heterodimer with a small chaperone, PlcR, encoded immediately downstream. [Cellular processes, Pathogenesis] 689
19847 274562 TIGR03397 acid_phos_Burk acid phosphatase, Burkholderia-type. A member of this family, AcpA from Burkholderia mallei, has been characterized as a surface-bound glycoprotein with acid phosphatase activity, as can be shown with the colorigenic substrate 5-bromo-4-chloro-3-indolyl phosphate. This family shares regions of sequence similarity with phosphocholine-preferring phospholipase C enzymes (TIGR03396) from many of the same species. 483
19848 132439 TIGR03398 plc_access_R phospholipase C accessory protein PlcR. The class of microbial phosphocholine-preferring phospholipase C enzymes described by model TIGR03396 has two members in Pseudomonas aeruginosa, one of which (PlcH) is hemolytic and can hydrolyze sphingomyelin as well as phosphatidylcholine. This model describes PlcR, an accessory protein for PlcH with which it forms a heterodimer. The member of the family from P. aeruginosa, although not the members from various Burkholderia species, is encoded immediately downstream of phospholipase C. 141
19849 274563 TIGR03399 RNA_3prim_cycl RNA 3'-phosphate cyclase. Members of this protein family are RNA 3'-phosphate cyclase (6.5.1.4), an enzyme whose function is conserved from E. coli to human. The modification this enzyme performs enables certain RNA ligations to occur, although the full biological role for this enzyme is not fully described. This model separates this enzyme from a related protein, present only in eukaryotes, localized to the nucleolus, and involved in ribosomal modification. [Transcription, RNA processing] 326
19850 274564 TIGR03400 18S_RNA_Rcl1p 18S rRNA biogenesis protein RCL1. Members of this strictly eukaryotic protein family are not RNA 3'-phosphate cyclase (6.5.1.4), but rather a homolog with a distinct function, found in the nucleolus and required for ribosomal RNA processing. Homo sapiens has both a member of this RCL (RNA terminal phosphate cyclase like) family and EC 6.5.1.4, while Saccharomyces has a member of this family only. 360
19851 188314 TIGR03401 cyanamide_fam HD domain protein, cyanamide hydratase family. Members of this protein family are known, so far, in the Ascomycota, a branch of the Fungi, and contain an HD domain (pfam01966), found typically in various metal-dependent phosphohydrolases. The only characterized member of this family, from the soil fungus Myrothecium verrucaria, is cyanamide hydratase (EC 4.2.1.69), a zinc-containing homohexamer that adds water to the fertilizer cyanamide (NCNH2), a nitrile compound, to produce urea (NH2-CO-NH2). Homologs are likely to be nitrile hydratases. 228
19852 132443 TIGR03402 FeS_nifS cysteine desulfurase NifS. Members of this protein family are NifS, one of several related families of cysteine desulfurase involved in iron-sulfur (FeS) cluster biosynthesis. NifS is part of the NIF system, usually associated with other nif genes involved in nitrogenase expression and nitrogen fixation. The protein family is given a fairly broad interpretation here. It includes a clade nearly always found in extended nitrogen fixation genomic regions, plus a second clade more closely related to the first than to IscS and also part of NifS-like/NifU-like systems. This model does not extend to a more distantly related clade found in the epsilon proteobacteria such as Helicobacter pylori, also named NifS in the literature, built instead in TIGR03403. 379
19853 132444 TIGR03403 nifS_epsilon cysteine desulfurase, NifS family, epsilon proteobacteria type. Members of this family are the NifS-like cysteine desulfurase of the epsilon division of the Proteobacteria, similar to the NifS protein of nitrogen-fixing bacteria. Like NifS, and unlike IscS, this protein is found as part of a system of just two proteins, a cysteine desulfurase and a scaffold, for iron-sulfur cluster biosynthesis. This protein is called NifS by Olsen, et al., so we use this designation. 382
19854 274565 TIGR03404 bicupin_oxalic bicupin, oxalate decarboxylase family. Members of this protein family are defined as bicupins as they have two copies of the cupin domain (pfam00190). Two different known activities for members of this family are oxalate decarboxylase (EC 4.1.1.2) and oxalate oxidase (EC 1.2.3.4), although the latter activity has more often been found in distantly related monocupin (germin) proteins. 367
19855 274566 TIGR03405 Phn_Fe-ADH phosphonate metabolism-associated iron-containing alcohol dehydrogenase. This small clade of iron-containing alcohol dehydrogenases of the pfam00465 family is found in genomic contexts indicating a role in the metabolism of phosphonates. In Delftia acidovorans SPH-1, the gene ZP_01580650.1 is adjacent to and running in the same direction as ZP_01580649.1 encoding the enzyme phosphonatase (PhnX, TIGR01422). Upstream are also found genes encoding components of a phosphonate ABC transport complex. In Ralstonia eutropha H16 and Verminephrobacter eiseniae EF01-2 the dehydrogenase is followed by a homolog of the PhnB gene, a putative phosphonate-specific MFS-type transporter. In Azoarcus BH72 the gene is preceded by Phosphoenolpyruvate phosphomutase (aepX) and a putative phosphonopyruvate decarboxylase (aepY), two genes involved in the biosynthesis of phosphonoacetaldehyde (Pald). Usually these two are accompanied by a specific transaminase, AepZ, which converts Pald to 2-aminoethylphosphonate (2-AEP). 2-hydroxyethylphosphonate (2-HEP), the presumed product of the reaction of Pald with an alcohol dehydrogenase, is a biologically novel but reasonable analog of 2-AEP and may be a constituent of as-yet undescribed natural products. In the case of Azoarcus, downstream of the dehydrogenase is a CDP-glycerol:glycerophosphate transferase homolog that may indicate the existence of a pathway for 2-HEP-derived phosphonolipid biosynthesis. 355
19856 213809 TIGR03406 FeS_long_SufT probable FeS assembly SUF system protein SufT. The function is unknown for this protein family, but members are found almost always in operons for the SUF system of iron-sulfur cluster biosynthesis. The SUF system is present elsewhere on the chromosome for those few species where SUF genes are not adjacent. This family shares this property of association with the SUF system with a related family, TIGR02945. TIGR02945 consists largely of a DUF59 domain (see pfam01883), while this protein is about double the length, with a unique N-terminal domain and DUF59 C-terminal domain. A location immediately downstream of the cysteine desulfurase gene sufS in many contexts suggests the gene symbol sufT. Note that some other homologs of this family and of TIGR02945, but no actual members of this family, are found in operons associated with phenylacetic acid (or other ring-hydroxylating) degradation pathways. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 174
19857 132448 TIGR03407 urea_ABC_UrtA urea ABC transporter, urea binding protein. Members of this protein family are ABC transporter substrate-binding proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. Members of this protein family tend to have the twin-arginine signal for Sec-independent transport across the plasma membrane. [Transport and binding proteins, Amino acids, peptides and amines] 359
19858 132449 TIGR03408 urea_trans_UrtC urea ABC transporter, permease protein UrtC. Members of this protein family are ABC transporter permease proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines] 313
19859 200272 TIGR03409 urea_trans_UrtB urea ABC transporter, permease protein UrtB. Members of this protein family are ABC transporter permease proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines] 291
19860 274567 TIGR03410 urea_trans_UrtE urea ABC transporter, ATP-binding protein UrtE. Members of this protein family are ABC transporter ATP-binding subunits associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines] 230
19861 274568 TIGR03411 urea_trans_UrtD urea ABC transporter, ATP-binding protein UrtD. Members of this protein family are ABC transporter ATP-binding subunits associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines] 242
19862 132453 TIGR03412 iscX_yfhJ FeS assembly protein IscX. Members of this protein family are YfhJ, a protein of the ISC system for iron-sulfur cluster assembly. Other genes in the system include iscSUA, hscBA, and fdx. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 63
19863 274569 TIGR03413 GSH_gloB hydroxyacylglutathione hydrolase. Members of this protein family are hydroxyacylglutathione hydrolase, a detoxification enzyme known as glyoxalase II. It follows lactoylglutathione lyase, or glyoxalase I, and acts to remove the toxic metabolite methylglyoxal and related compounds. This protein belongs to the broader metallo-beta-lactamase family (pfam00753). [Cellular processes, Detoxification] 248
19864 188316 TIGR03414 ABC_choline_bnd choline ABC transporter, periplasmic binding protein. Partial phylogenetic profiling vs. the genome property of glycine betaine biosynthesis from choline consistently reveals a member of this ABC transporter periplasmic binding protein as the best match, save for the betaine biosynthesis enzymes themselves. Genomes often carry several paralogs, one encoded together with the permease and ATP-binding components and another encoded next to a choline-sulfatase gene, suggesting that different members of this protein family interact with shared components and give some flexibility in substrate. Of two members from Sinorhizobium meliloti 1021, one designated ChoX has been shown experimentally to bind choline (though not various related compounds such as betaine) and to be required for about 60% of choline uptake. Members of this protein family have an invariant Cys residue near the N-terminus and likely are lipoproteins. [Transport and binding proteins, Amino acids, peptides and amines] 290
19865 188317 TIGR03415 ABC_choXWV_ATP choline ABC transporter, ATP-binding protein. Members of this protein family are the ATP-binding subunit of a three-protein transporter. This family belongs, more broadly, to the family of proline and glycine-betaine transporters, but members have been identified by direct characterization and by bioinformatic means as choline transporters. Many species have several closely-related members of this family, probably with variable abilities to act additionally on related quaternary amines. [Transport and binding proteins, Amino acids, peptides and amines] 382
19866 188318 TIGR03416 ABC_choXWV_perm choline ABC transporter, permease protein. 267
19867 274570 TIGR03417 chol_sulfatase choline-sulfatase. 500
19868 188320 TIGR03418 chol_sulf_TF putative choline sulfate-utilization transcription factor. Members of this protein family are transcription factors of the LysR family. Their genes typically are divergently transcribed from choline-sulfatase genes. That enzyme makes choline, a precursor to the osmoprotectant glycine-betaine, available by hydrolysis of choline sulfate. 291
19869 132460 TIGR03419 NifU_clost FeS cluster assembly scaffold protein NifU, Clostridium type. NifU and NifS form a pair of iron-sulfur (FeS) cluster biosynthesis proteins much simpler than the ISC and SUF systems. Members of this protein family are a distinct group of NifU-like proteins, found always next to a NifS-like protein and restricted to species that lack a SUF system. Typically, NIF systems service a smaller number of FeS-containing proteins than do ISC or SUF. Members of this particular branch typically are found, almost half the time, near the mnmA gene, involved in the carboxymethylaminomethyl modification of U34 in some tRNAs (see GenProp0704). While other NifU proteins are associated with nitrogen fixation, this family is not. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 121
19870 274571 TIGR03420 DnaA_homol_Hda DnaA regulatory inactivator Hda. Members of this protein family are Hda (Homologous to DnaA). These proteins are about half the length of DnaA and homologous over the length of Hda. In the model species Escherichia coli, the initiation of DNA replication requires DnaA bound to ATP rather than ADP; Hda helps facilitate the conversion of DnaA-ATP to DnaA-ADP. [DNA metabolism, DNA replication, recombination, and repair] 226
19871 274572 TIGR03421 FeS_CyaY iron donor protein CyaY. Members of this protein family are the iron-sulfur cluster (FeS) metabolism protein CyaY, a homolog of eukaryotic frataxin. ISC is one of several bacterial systems for FeS assembly; we find by Partial Phylogenetic Profiling vs. the ISC system that CyaY most likely works with the ISC system for FeS cluster biosynthesis. A study of cyaY mutants in Salmonella enterica bears this out. Although the trusted cutoff is set low enough to include eukaryotic frataxin sequences, a narrower, exception-type model (TIGR03422) identifies members of that specific set. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 102
19872 132463 TIGR03422 mito_frataxin frataxin. Frataxin is a mitochondrial protein, mutation of which leads to the disease Friedreich's ataxia. Its orthologs are widely distributed in the bacteria, associated with the ISC system for iron-sulfur cluster assembly, and designated CyaY. This exception-type model allows those examples of frataxin per se that score above the trusted cutoff to the CyaY equivalog-type model (TIGR03421) to be named appropriately. 97
19873 274573 TIGR03423 pbp2_mrdA penicillin-binding protein 2. Members of this protein family are penicillin-binding protein 2 (PBP-2), a protein whose gene (designated pbpA or mrdA) generally is found next to the gene for RodA, a protein required for rod (bacillus) shape in many bacteria. PBP-2 acts as a transpeptidase for cell elongation (hence, rod-shape). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 592
19874 132465 TIGR03424 urea_degr_1 urea carboxylase-associated protein 1. A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes. 198
19875 163257 TIGR03425 urea_degr_2 urea carboxylase-associated protein 2. A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes. 233
19876 274574 TIGR03426 shape_MreD rod shape-determining protein MreD. Members of this protein family are the MreD protein of bacterial cell shape determination. Most rod-shaped bacteria depend on MreB and RodA to achieve either a rod shape or some other non-spherical morphology such as coil or stalk formation. MreD is encoded in an operon with MreB, and often with RodA and PBP-2 as well. It is highly hydrophobic (therefore somewhat low-complexity) and highly divergent, and therefore sometimes tricky to discover by homology, but this model finds most examples. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 152
19877 213811 TIGR03427 ABC_peri_uca ABC transporter periplasmic binding protein, urea carboxylase region. Members of this family are ABC transporter periplasmic binding proteins associated with the urea carboxylase/allophanate hydrolase pathway, an alternative to urease for urea degradation. The protein is restricted to bacteria with the pathway, with its gene close to the urea carboxylase and allophanate hydrolase genes. The substrate for this transporter therefore is likely to be urea or a compound from which urea is easily derived. [Transport and binding proteins, Unknown substrate] 328
19878 132469 TIGR03428 ureacarb_perm permease, urea carboxylase system. A number of bacteria obtain nitrogen by a biotin- and ATP-dependent urea degradation system distinct from urease. The two characterized proteins of this system are the enzymes urea carboxylase and allophanate hydrolase, but other, uncharacterized proteins co-occur as genes encoded nearby in multiple organisms. This family includes predicted permeases of the amino acid permease family, likely to transport either urea or a compound from which urea is derived. It is found so far only in Actinobacteria, whereas a number of other species with the urea carboxylase have an adjacent ABC transporter operon. 475
19879 274575 TIGR03429 arom_pren_DMATS aromatic prenyltransferase, DMATS type. Members of this protein family are mostly fungal enzymes of secondary metabolite production. Characterized or partially characterized members include several examples of dimethylallyltryptophan synthase, a brevianamide F prenyltransferase, LtxC from lyngbyatoxin biosynthesis, and a probable dimethylallyl tyrosine synthase. 405
19880 132471 TIGR03430 trp_dimet_allyl tryptophan dimethylallyltransferase. Members of this family are the enzyme tryptophan dimethylallyltransferase (EC 2.5.1.34), a distinct clade within a larger group of aromatic prenyltransferases that may act on trp-containing cyclic dipeptides, or on tyrosine or other related substrates. Tryptophan dimethylallyltransferase and related enzymes typically are of fungal origin and are involved in the biosynthesis of secondary metabolites such as ergot alkaloids. 419
19881 132472 TIGR03431 PhnD phosphonate ABC transporter, periplasmic phosphonate binding protein. This model is a subset of the broader subfamily of phosphate/phosphonate binding protein ABC transporter components, TIGR01098. In this model all members of the seed have support from genomic context for association with pathways for the metabolism of phosphonates, particularly the C-P lyase system, GenProp0232. This model includes the characterized phnD gene from E. coli. Note that this model does not identify all phnD-subfamily genes with evident phosphonate context, but all sequences above the trusted cutoff may be inferred to bind phosphonate compounds even in the absence of such context. Furthermore, there is ample evidence to suggest that many other members of the TIGR01098 subfamily have a different primary function. 288
19882 163260 TIGR03432 yjhG_yagF putative dehydratase, YjhG/YagF family. This homolog of dihydroxy-acid dehydratases has an odd, sparse distribution. Members are found in two Acidobacteria, two Planctomycetes, Bacillus clausii KSM-K16, and (in two copies each) in strains K12-MG1655 and W3110 of Escherichia coli. The local context is not well conserved, but a few members are adjacent to homologs of the gluconate:H+ symporter (see TIGR00791). [Unknown function, Enzymes of unknown specificity] 640
19883 163261 TIGR03433 padR_acidobact transcriptional regulator, Acidobacterial, PadR-family. Members of this protein family are putative transcriptional regulators of the PadR family, as found in species of the Acidobacteria. This family of proteins has expanded greatly in this lineage, where it regularly is found in the vicinity of a putative transporter protein. [Regulatory functions, DNA interactions] 100
19884 274576 TIGR03434 ADOP Acidobacterial duplicated orphan permease. Members of this protein family are found, so far, only in three species of Acidobacteria, namely Acidobacteria bacterium Ellin345, Acidobacterium capsulatum ATCC 51196, and Solibacter usitatus Ellin6076, where they form large paralogous families. Each protein contains two copies of a domain called the efflux ABC transporter permease protein (pfam02687). However, unlike other members of that family (including LolC, FtsX, and MacB), genes for these proteins are essentially never found fused or adjacent to ABC transporter ATP-binding protein (pfam00005) genes. We name this family ADOP, for Acidobacterial Duplicated Orphan Permease, to reflect the restricted lineage, internal duplication, lack of associated ATP-binding cassette proteins, and permease homology. The function is unknown. 803
19885 132476 TIGR03435 Soli_TIGR03435 soil-associated protein, TIGR03435 family. Bacterial reference strains encoding members of this protein family are all isolated from soil. These include 39 members from Solibacter usitatus Ellin6076, 27 from Acidobacterium sp. MP5ACTX8 (both Acidobacteria), and four from Pedosphaera parvula Ellin514 (Verrucomicrobia). The family is well-diversified, with few pairs showing greater than 50 % pairwise identity. A few members are fused to Peptidase_M56 domains (see pfam05569), to Sigma70_r2 domains (see pfam04542), or have a duplication of this domain. 237
19886 274577 TIGR03436 acidobact_VWFA VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196. 296
19887 274578 TIGR03437 Soli_cterm Solibacter uncharacterized C-terminal domain. This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria. 215
19888 274579 TIGR03438 egtD_ergothio dimethylhistidine N-methyltransferase. This model represents a distinct set of uncharacterized proteins found in the bacteria. Analysis by PSI-BLAST shows remote sequence homology to methyltransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 301
19889 274580 TIGR03439 methyl_EasF probable methyltransferase domain, EasF family. This model represents an uncharacterized domain of about 300 amino acids with homology to S-adenosylmethionine-dependent methyltransferases. Proteins with this domain are exclusively fungal. A few, such as EasF from Neotyphodium lolii, are associated with the biosynthesis of ergot alkaloids, a class of fungal secondary metabolites. EasF may, in fact, be the AdoMet:dimethylallyltryptophan N-methyltransferase, the enzyme that follows tryptophan dimethylallyltransferase (DMATS) in ergot alkaloid biosynthesis. Several other members of this family, including mug158 (meiotically up-regulated gene 158 protein) from Schizosaccharomyces pombe, contain an additional uncharacterized domain DUF323 (pfam03781). 319
19890 274581 TIGR03440 egtB_TIGR03440 ergothioneine biosynthesis protein EgtB. Members of this family include EgtB, an enzyme of ergothioneine biosynthesis, as found in numerous Actinobacteria. Characterized homologs to this family include a formylglycine-generating enzyme that serves as a maturase for an aerobic sulfatase (cf. the radical SAM enzymes that serve as anaerobic sulfatase maturases). [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 406
19891 163267 TIGR03441 urea_trans_yut urea transporter, Yersinia type. Members of this protein family are bacterial urea transporters, found not only in species that contain urease, but adjacent to the urease operon. It was characterized in Yersinia pseudotuberculosis. Members are homologous to eukaryotic members of solute carrier family 14, a family that includes urea transporters, and to bacterial proteins in species with no detectable urea degradation system. [Transport and binding proteins, Other] 292
19892 132483 TIGR03442 TIGR03442 ergothioneine biosynthesis protein EgtC. Members of this strictly bacterial protein family show similarity to class II glutamine amidotransferases (see pfam00310). They are distinguished by appearing in a genome context with, and usually adjacent to or between, members of families TIGR03438 (an uncharacterized methyltransferase) and TIGR03440 (an uncharacterized protein). [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 251
19893 274582 TIGR03443 alpha_am_amid L-aminoadipate-semialdehyde dehydrogenase. Members of this protein family are L-aminoadipate-semialdehyde dehydrogenase (EC 1.2.1.31), product of the LYS2 gene. It is also called alpha-aminoadipate reductase. In fungi, lysine is synthesized via aminoadipate. Currently, all members of this family are fungal. 1389
19894 274583 TIGR03444 EgtA_Cys_ligase ergothioneine biosynthesis glutamate--cysteine ligase EgtA. Members of this bacterial protein family, EgtA, resemble the glutamate--cysteine ligase of the two-step pathway for glutathione (GSH) biosynthesis, but instead are involved in the biosynthesis of the histidine-derived thiol, ergothioneine (EGT). Successful in vitro reconstitution of EGT biosynthesis using EgtBCDE and gamma-L-glutamyl-L-cysteine suggests that this enzyme is a bona fide glutamate--cysteine ligase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 390
19895 274584 TIGR03445 mycothiol_MshB N-acetyl-1-D-myo-inositol-2-amino-2-deoxy-alpha-D-glucopyranoside deacetylase. Members of this protein family are N-acetyl-1-D-myo-inositol-2-amino-2-deoxy-alpha-D-glucopyranoside deacetylase, also called 1D-myo-inosityl-2-acetamido-2-deoxy-alpha-D-glucopyranoside deacetylase, the MshB protein of mycothiol biosynthesis in Mycobacterium tuberculosis and related species. [Cellular processes, Detoxification] 284
19896 132487 TIGR03446 mycothiol_Mca mycothiol conjugate amidase Mca. Mycobacterium tuberculosis, Corynebacterium glutamicum, and related species use the thiol mycothiol in place of glutathione. This enzyme, homologous to the (dispensable) MshB enzyme of mycothiol biosynthesis, is described as an amidase that acts on conjugates to mycothiol. It is a detoxification enzyme. [Cellular processes, Detoxification] 283
19897 132488 TIGR03447 mycothiol_MshC cysteine--1-D-myo-inosityl 2-amino-2-deoxy-alpha-D-glucopyranoside ligase. Members of this protein family are MshC, l-cysteine:1-D-myo-inosityl 2-amino-2-deoxy-alpha-D-glucopyranoside ligase, an enzyme that uses ATP to ligate a Cys residue to a mycothiol precursor molecule, in the second to last step in mycothiol biosynthesis. This enzyme shows considerable homology to Cys--tRNA ligases, and many instances are misannotated as such. Mycothiol is found in Mycobacterium tuberculosis, Corynebacterium glutamicum, Streptomyces coelicolor, and various other members of the Actinobacteria. Mycothiol is an analog to glutathione. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 411
19898 132489 TIGR03448 mycothiol_MshD mycothiol synthase. Members of this family are MshD, the acetyltransferase that catalyzes the final step of mycothiol biosynthesis in various members of the Actinomycetes. Mycothiol replaces glutathione in these species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 292
19899 132490 TIGR03449 mycothiol_MshA D-inositol-3-phosphate glycosyltransferase. Members of this protein family, found exclusively in the Actinobacteria, are MshA, the glycosyltransferase of mycothiol biosynthesis. Mycothiol replaces glutathione in these species. 405
19900 132491 TIGR03450 mycothiol_INO1 inositol 1-phosphate synthase, Actinobacterial type. This enzyme, inositol 1-phosphate synthase as found in Actinobacteria, produces an essential precursor for several different products, including mycothiol, which is a glutathione analog, and phosphatidylinositol, which is a phospholipid. 351
19901 132492 TIGR03451 mycoS_dep_FDH S-(hydroxymethyl)mycothiol dehydrogenase. Members of this protein family are mycothiol-dependent formaldehyde dehydrogenase (EC 1.2.1.66). This protein is found, so far, only in the Actinobacteria (Mycobacterium sp., Streptomyces sp., Corynebacterium sp., and related species), where mycothiol replaces glutathione. [Cellular processes, Detoxification] 358
19902 132493 TIGR03452 mycothione_red mycothione reductase. Mycothiol, a glutathione analog in Mycobacterium tuberculosis and related species, can form a disulfide-linked dimer called mycothione. This enzyme can reduce mycothione to regenerate two mycothiol molecules. The enzyme shows some sequence similarity to glutathione-disulfide reductase, trypanothione-disulfide reductase, and dihydrolipoamide dehydrogenase. The characterized protein from M. tuberculosis, a homodimer, has FAD as a cofactor, one per monomer, and uses NADPH as a substrate. 452
19903 274585 TIGR03453 partition_RepA plasmid partitioning protein RepA. Members of this family are the RepA (or ParA) protein involved in replicon partitioning. All known examples occur in bacterial species with two or more replicons, on a plasmid or the smaller chromosome. Note that an apparent exception may be seen as a pseudomolecule from assembly of an incompletely sequenced genome. Members of this family belong to a larger family that also includes the enzyme cobyrinic acid a,c-diamide synthase, but assignment of that name to members of this family would be in error. [Mobile and extrachromosomal element functions, Plasmid functions] 387
19904 274586 TIGR03454 partition_RepB plasmid partitioning protein RepB. Members of this family are the RepB protein involved in replicon partitioning. RepB is found, in general, as part of a repABC operon in plasmids and small chromosomes, separate from the main chromosome, in various bacteria. This model describes a rather narrow clade of proteins; it should be noted that additional homologs scoring below the trusted cutoff have very similar functions, although they may be named differently. [Mobile and extrachromosomal element functions, Plasmid functions] 325
19905 274587 TIGR03455 HisG_C-term ATP phosphoribosyltransferase, C-terminal domain. This domain corresponds to the C-terminal third of the HisG protein. It is absent in many lineages. 92
19906 132497 TIGR03457 sulphoacet_xsc sulfoacetaldehyde acetyltransferase. Members of this protein family are sulfoacetaldehyde acetyltransferase, an enzyme of taurine utilization. Taurine, or 2-aminoethanesulfonate, can be used by bacteria as a source of carbon, nitrogen, and sulfur. [Central intermediary metabolism, Other] 579
19907 274588 TIGR03458 YgfH_subfam succinate CoA transferase. This family of CoA transferases includes enzymes catalyzing at least two related but distinct activities. The E. coli YgfH protein has been characterized as a propionyl-CoA:succinate CoA transferase where it appears to be involved in a pathway for the decarboxylation of succinate to propionate. The Clostridium kluyveri CAT1 protein has been characterized as an acetyl-CoA:succinate CoA transferase and is believed to be involved in anaerobic succinate degradation. The propionate:succinate transferase activity has been reported in the propionic acid fermentation of Propionibacterium species, where it is distinct from the coupled activities of distinct nucleotide-triphosphate dependent succinate and propionate/acetate CoA transferases (as inferred from activity in the absence of NTPs). The family represented by this model includes a member from Propionibacterium acnes KPA171202 which is likely to be responsible for this activity. A closely related clade not included in this family are the Ach1p proteins of fungi which are acetyl-CoA hydrolases. This name has been applied to many of the proteins represented by this model, possibly erroneously. 485
19908 274589 TIGR03459 crt_membr carotene biosynthesis associated membrane protein. This model represents a family of hydrophobic and presumed membrane proteins found in the Actinobacteria. The genes encoding these proteins are syntenically associated with (found proximal to) genes of carotene biosynthesis usually including phytoene synthase (crtB), phytoene dehydrogenase (crtI) and geranylgeranyl pyrophosphate synthase (ispA). 456
19909 132500 TIGR03460 crt_membr_arch carotene biosynthesis associated membrane protein. 232
19910 132501 TIGR03461 pabC_Proteo aminodeoxychorismate lyase. Members of this protein family are aminodeoxychorismate lyase (ADC lyase), EC 4.1.3.38, the PabC protein of PABA biosynthesis. PABA (para-aminobenzoate) is a precursor of folate, needed for de novo purine biosynthesis. This enzyme is a pyridoxal-phosphate-binding protein in the class IV aminotransferase family (pfam01063). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 261
19911 274590 TIGR03462 CarR_dom_SF lycopene cyclase domain. This domain is often repeated twice within the same polypeptide, as is observed in Archaea, Thermus, Sphingobacteria and Fungi. In the fungal sequences, this tandem domain pair is observed as the N-terminal half of a bifunctional protein, where it has been characterized as a lycopene beta-cyclase and the C-terminal half is a phytoene synthetase. In Myxococcus and Actinobacterial genomes this domain appears as a single polypeptide, tandemly repeated and usually in a genomic context consistent with a role in carotenoid biosynthesis. It is unclear whether any of the sequences in this family truly encode lycopene epsilon cyclases. However, a number are annotated as such. The domain is generally hydrophobic with a number of predicted membrane spanning segments and contains a distinctive motif (hPhEEhhhhhh). In certain sequences one of either the proline or glutamates may vary, but always one of the tandem pair appears to match this canonical sequence exactly. 89
19912 274591 TIGR03463 osq_cycl 2,3-oxidosqualene cyclase. This model identifies 2,3-oxidosqualene cyclases from Stigmatella aurantiaca which produces cycloartenol, and Gemmata obscuriglobus and Methylococcus capsulatus, which each produce the closely related sterol, lanosterol. 634
19913 274592 TIGR03464 HpnC squalene synthase HpnC. This family of genes are members of a superfamily (pfam00494) of phytoene and squalene synthases which catalyze the head-to-head condensation of polyisoprene pyrophosphates. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. In the organisms Zymomonas mobilis and Bradyrhizobium japonicum these genes have been characterized as squalene synthases (farnesyl-pyrophosphate ligases). Often, these genes appear in tandem with the HpnD gene which appears to have resulted from an ancient gene duplication event. Presumably these proteins form a heteromeric complex, but this has not yet been experimentally demonstrated. 266
19914 163278 TIGR03465 HpnD squalene synthase HpnD. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. In the organisms Zymomonas mobilis and Bradyrhizobium japonicum these genes have been characterized as squalene synthases (farnesyl-pyrophosphate ligases). Often, these genes appear in tandem with the HpnC gene which appears to have resulted from an ancient gene duplication event. Presumably these proteins form a heteromeric complex, but this has not yet been experimentally demonstrated. 266
19915 163279 TIGR03466 HpnA hopanoid-associated sugar epimerase. The sequences in this family are members of the pfam01370 superfamily of NAD-dependent epimerases and dehydratases typically acting on nucleotide-sugar substrates. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of hopene, the cyclization product of the polyisoprenoid squalene. This gene and its association with hopene biosynthesis in Zymomonas mobilis has been noted in the literature where the gene symbol hpnA was assigned. Hopanoids are known to be components of the plasma membrane and to have polar sugar head groups in Z. mobilis and other species. 328
19916 274593 TIGR03467 HpnE squalene-associated FAD-dependent desaturase. The sequences in this family are members of the pfam01593 superfamily of flavin-containing amine oxidases which include the phytoene desaturases. These sequences also include a FAD-dependent oxidoreductase domain, pfam01266. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of squalene, the condensation product of the polyisoprenoid farnesyl pyrophosphate. This gene and its association with hopene biosynthesis in Zymomonas mobilis has been noted in the literature where the gene symbol hpnE was assigned. This gene is also found in contexts where the downstream conversion of squalene to hopenes is not evident. The precise nature of the reaction catalyzed by this enzyme is unknown at this time. 419
19917 274594 TIGR03468 HpnG hopanoid-associated phosphorylase. The sequences in this family are members of the pfam01048 family of phosphorylases typically acting on nucleotide-sugar substrates. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of hopene, the cyclization product of the polyisoprenoid squalene. This gene is adjacent to the genes HpnA-E and squalene-hopene cyclase (which would be HpnF) in Zymomonas mobilis and their association with hopene biosynthesis has been noted in the literature. Extending the gene symbol sequence, we suggest the symbol HpnG for the product of this gene. Hopanoids are known to be components of the plasma membrane and to have polar sugar head groups in Z. mobilis and other species. 212
19918 213815 TIGR03469 HpnB hopene-associated glycosyltransferase HpnB. This family of genes include a glycosyl transferase, group 2 domain (pfam00535) which are responsible, generally for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. Indeed, the members of this family appear to never be found in a genome lacking squalene-hopene cyclase (SHC), although not all genomes encoding SHC have this glycosyl transferase. In the organism Zymomonas mobilis the linkage of this gene to hopanoid biosynthesis has been noted and the gene named HpnB. Hopanoids are known to feature polar glycosyl head groups in many organisms. 384
19919 274595 TIGR03470 HpnH hopanoid biosynthesis associated radical SAM protein HpnH. The sequences represented by this model are members of the radical SAM superfamily of enzymes (pfam04055). These enzymes utilize an iron-sulfur redox cluster and S-adenosylmethionine to carry out diverse radical mediated reactions. The members of this clade are frequently found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. The linkage between SHC and this radical SAM enzyme is strong; one is nearly always observed in the same genome where the other is found. A hopanoid biosynthesis locus was described in Zymomonas mobilis consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and this radical SAM enzyme (ZMO0874) which we name here HpnH. Granted, in Z. mobilis, HpnH is in a convergent orientation with respect to HpnA-G, but one gene beyond HpnH and running in the same convergent direction is IspH (ZMO0875, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase), an essential enzyme of IPP biosynthesis and therefore essential for the biosynthesis of hopanoids. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift. 318
19920 132511 TIGR03471 HpnJ hopanoid biosynthesis associated radical SAM protein HpnJ. The sequences represented by this model are members of the radical SAM superfamily of enzymes (pfam04055). These enzymes utilize an iron-sulfur redox cluster and S-adenosylmethionine to carry out diverse radical mediated reactions. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0975) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4901) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnJ) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and another radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift. 472
19921 132512 TIGR03472 HpnI hopanoid biosynthesis associated glycosyl transferase protein HpnI. This family of genes include a glycosyl transferase, group 2 domain (pfam00535) which are responsible, generally for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0974) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4902) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnI) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and another radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. Hopanoids are known to feature polar glycosyl head groups in many organisms. 373
19922 132513 TIGR03473 HpnK hopanoid biosynthesis associated protein HpnK. The sequences represented by this model are members of the pfam04794 "YdjC-like" family of uncharacterized proteins. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0976) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4902) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnK) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and a radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. 283
19923 132514 TIGR03474 incFII_RepA incFII family plasmid replication initiator RepA. Members of this protein family are the plasmid replication initiator RepA of incFII (plasmid incompatibility group F-II) plasmids. R1 and R100 are plasmids in this group. Immediately upstream of repA is found tap, a leader peptide of about 24 amino acids, often not assigned as a gene in annotated plasmid sequences. Note that other, non-homologous plasmid replication proteins share the gene symbol (repA) and similar names (plasmid replication protein RepA). 275
19924 132515 TIGR03475 tap_IncFII_lead RepA leader peptide Tap. This protein is a translated leader peptide that acts in the regulation of the expression of the plasmid replication protein RepA in incF2 group plasmids. [Mobile and extrachromosomal element functions, Plasmid functions] 25
19925 274596 TIGR03476 HpnL putative membrane protein. This family of hydrophobic proteins is observed in two distinct contexts. It is primarily found in the presence of genes for the biosynthesis and elaboration of hopene where we assign the gene symbol HpnL. In a subset of the genomes containing HpnL a second, often plasmid-encoded, homolog is observed in a context implying the biosynthesis of 2-aminoethylphosphonate head-group containing lipids. 318
19926 274597 TIGR03477 DMSO_red_II_gam DMSO reductase family type II enzyme, heme b subunit. This model represents a heme b-binding subunit, typically called the gamma subunit, of various proteins that also contain a molybdopterin subunit and an iron-sulfur protein. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase and ethylbenzene dehydrogenase. [Energy metabolism, Electron transport] 206
19927 132518 TIGR03478 DMSO_red_II_bet DMSO reductase family type II enzyme, iron-sulfur subunit. This model represents the iron-sulfur subunit, typically called the beta subunit, of various proteins that also contain a molybdopterin subunit and a heme b subunit. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase and ethylbenzene dehydrogenase. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 321
19928 132519 TIGR03479 DMSO_red_II_alp DMSO reductase family type II enzyme, molybdopterin subunit. This model represents the molybdopterin subunit, typically called the alpha subunit, of various proteins that also contain an iron-sulfur subunit and a heme b subunit. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase, ethylbenzene dehydrogenase, and an archaeal respiratory nitrate reductase. This alpha subunit has a twin-arginine translocation (TAT) signal for Sec-independent translocation across the plasma membrane. 912
19929 274598 TIGR03480 HpnN hopanoid biosynthesis associated RND transporter like protein HpnN. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins appear to be related to the RND family of export proteins, particularly the hydrophobe/amphiphile efflux-3 (HAE3) family represented by TIGR00921. 862
19930 274599 TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se. 198
19931 274600 TIGR03482 DMSO_red_II_cha DMSO reductase family type II enzyme chaperone. Type II members of the DMSO reductase family are heterotrimeric proteins with bis(molybdopterin guanine dinucleotide)Mo, iron-sulfur, and heme b prosthetic groups bound by the alpha, beta, and gamma subunits respectively. Members of this protein family are not part of the mature protein, although they are the product of a fourth clustered gene. Proteins in this family are interpreted as a chaperone, analogous to NarJ of nitrate reductases. 197
19932 274601 TIGR03483 FtsZ_alphas_C cell division protein FtsZ, alphaProteobacterial C-terminal extension. This model describes a domain found as a C-terminal extension to the cell division protein FtsZ in many but not all members of the alphaProteobacteria. [Cellular processes, Cell division] 121
19933 132524 TIGR03485 cas_csx13_N CRISPR-associated protein Cas8a1/Csx13, MYXAN subtype. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the N-terminal region or upstream gene; the C-terminal region is described by TIGR03486. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA. 316
19934 132525 TIGR03486 cas_csx13_C CRISPR-associated protein Cas8a1/Csx13, MYXAN subtype, C-terminal region. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the C-terminal region or downstream gene; the N-terminal region is modeled by TIGR03485. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA. 152
19935 132526 TIGR03487 cas_csp2 CRISPR-associated protein Cas8c/Csp2, subtype PGING. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype. 489
19936 132527 TIGR03488 cas_Cas5p CRISPR-associated protein Cas5, subtype PGING. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype, but shows some sequence similarity to genes of the Cas5 type (see TIGR02593). 237
19937 132528 TIGR03489 cas_csp1 CRISPR-associated protein Cas7/Csp1, subtype PGING. Members of this protein family are Csp1, a CRISPR-associated (cas) gene marker for the Pging subtype of CRISPR/cas system, as found in Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This protein belongs to the family of DevR (TIGR01875), a regulator of development in Myxococcus xanthus located in a cas gene region. A different branch of the DevR family, Cst2 (TIGR02585), is a marker for the Tneap subtype of CRISPR/cas system. 292
19938 274602 TIGR03490 Mycoplas_LppA mycoides cluster lipoprotein, LppA/P72 family. Members of this protein family occur in Mycoplasma mycoides, Mycoplasma hyopneumoniae, and related Mycoplasmas in small paralogous families that may also include truncated forms and/or pseudogenes. Members are predicted lipoproteins with a conserved signal peptidase II processing and lipid attachment site. Note that the name for certain characterized members, p72, reflects an anomalous apparent molecular weight, given a theoretical MW of about 61 kDa. 541
19939 274603 TIGR03491 TIGR03491 RecB family nuclease, putative, TM0106 family. Members of this uncharacterized protein family are found broadly but sporadically among bacteria. The N-terminal region is homologous to the Cas4 protein of CRISPR systems, although this protein family shows no signs of association with CRISPR repeats. 457
19940 274604 TIGR03492 TIGR03492 conserved hypothetical protein. This protein family is restricted to the Cyanobacteria, in one or two copies, save for instances in the genus Deinococcus. This protein shows some sequence similarity, especially toward the C-terminus, to lipid-A-disaccharide synthase (TIGR00215 or pfam02684). The function is unknown. 396
19941 274605 TIGR03493 cellullose_BcsF cellulose biosynthesis operon protein BcsF/YhjT. Members of this protein family are found invariably together with genes of bacterial cellulose biosynthesis, and are presumed to be involved in the process. Members average about 63 amino acids in length and are not uncharacterized. The gene has been designated both YhjT and BcsF (bacterial cellulose synthesis F). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 62
19942 132533 TIGR03494 salicyl_syn salicylate synthase. Members of this protein family are salicylate synthases, bifunctional enzymes that make salicylate, in two steps, from chorismate. Members are homologous to anthranilate synthase component I from Trp biosynthesis. Members typically are found in gene regions associated with siderophore or other secondary metabolite biosynthesis. 425
19943 274606 TIGR03495 phage_LysB phage lysis regulatory protein, LysB family. Members of this protein family are phage lysis regulatory protein, including the well-studied protein LysB (lysis protein B) of Enterobacteria phage P2. For members of this family, genes are found in phage or in prophage regions of bacterial genomes, typically near a phage lysozyme or phage holin. 135
19944 274607 TIGR03496 FliI_clade1 flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively. [Cellular processes, Chemotaxis and motility] 411
19945 274608 TIGR03497 FliI_clade2 flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively. [Cellular processes, Chemotaxis and motility] 413
19946 163293 TIGR03498 FliI_clade3 flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively. 418
19947 274609 TIGR03499 FlhF flagellar biosynthetic protein FlhF. [Cellular processes, Chemotaxis and motility] 282
19948 274610 TIGR03500 FliO_TIGR flagellar biosynthetic protein FliO. This short protein found in flagellar biosynthesis operons contains a highly hydrophobic N-terminal sequence followed generally by two basic amino acids. This region is reminiscent of but distinct from the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ" but phylogenetic tree building supports a single FliO family. 69
19949 274611 TIGR03501 GlyGly_CTERM GlyGly-CTERM domain. This homology domain, GlyGly-CTERM, shares a species distribution with rhombosortase (TIGR03902), a subfamily of rhomboid-like intramembrane serine proteases. It is probably a recognition sequence for protein sorting and then cleavage by rhombosortase. Shewanella species have the largest number of target proteins per genome, up to thirteen. The domain occurs at the extreme carboxyl-terminus of a diverse set of proteins, most of which are enzymes with conventional signal sequences and with hydrolytic activities: nucleases, proteases, agarases, etc. The agarase AgaA from Vibrio sp. strain JT0107 is secreted into the medium, while the same protein heterologously expressed in E. coli is retained in the cell fraction. This suggests cleavage and release in species with this domain. Both this suggestion and the chemical structure of the domain (motif, hydrophobic predicted transmembrane helix, cluster of basic residues) closely parallel those of the LPXTG/sortase system and the PEP-CTERM/exosortase(EpsH) system. For this reason, the putative processing enzyme is designated rhombosortase. 22
19950 274612 TIGR03502 lipase_Pla1_cef extracellular lipase, Pla-1/cef family. Members of this protein family are bacterial lipoproteins largely from the Gammaproteobacteria. Characterized members are expressed extracellularly and have esterase activity. Members include the lipase Pla-1 from Aeromonas hydrophila (AF092033) and CHO cell elongation factor (cef) from Vibrio hollisae. 792
19951 274613 TIGR03503 TIGR03503 TIGR03503 family protein. This set of conserved hypothetical proteins has a phylogenetic range that closely matches that of TIGR03501, a putative C-terminal protein targeting signal. 374
19952 274614 TIGR03504 FimV_Cterm FimV C-terminal domain. This protein is found at the extreme C-terminus of FimV from Pseudomonas aeruginosa, and of TspA of Neisseria meningitidis. Disruption of the former blocks twitching motility from type IV pili; Semmler, et al. suggest a role in peptidoglycan layer remodelling required by type IV fimbrial systems. 44
19953 274615 TIGR03505 FimV_core FimV N-terminal domain. This region is found at, or about 200 amino acids from, the N-terminus of FimV from Pseudomonas aeruginosa, TspA of Neisseria meningitidis, and related proteins. Disruption of FimV blocks twitching motility from type IV pili; Semmler, et al. suggest a role for this family in peptidoglycan layer remodelling required by type IV fimbrial systems. Most but not all members of this protein family have a C-terminal region recognized by TIGR03504. In between is a highly variable, often repeat-filled region rich in the negatively charged amino acids Asp and Glu. 74
19954 274616 TIGR03506 FlgEFG_subfam flagellar hook-basal body protein. This model encompasses three closely related flagellar proteins usually denoted FlgE, FlgF and FlgG. The names have often been mis-assigned, however. Three equivalog models, TIGR02489, TIGR02490 and TIGR02488, respectively, separate the individual forms into three genome-context consistent groups. The major differences between these genes are architectural, with variable central sections between relatively conserved N- and C-terminal domains. More distantly related are two other flagellar apparatus families, FlgC (TIGR01395) which consists of little else but the N- and C-terminal domains and FlgK (TIGR02492) with a substantial but different central domain. 231
19955 274617 TIGR03507 decahem_SO1788 decaheme c-type cytochrome, OmcA/MtrC family. The protein SO_1778 (MtrC) of Shewanella oneidensis MR-1, and its paralog SO_1779 (OmcA), with which it interacts, are large decaheme proteins, about 900 amino acids in length, involved in the use of manganese [Mn(III/IV)] and iron [Fe(III)] as terminal electron acceptors. This model represents these and similar decaheme proteins, found also in Rhodoferax ferrireducens DSM 15236, Aeromonas hydrophila ATCC7966, and a few other bacterial species. [Energy metabolism, Electron transport] 659
19956 274618 TIGR03508 decahem_SO decaheme c-type cytochrome, DmsE family. Members of this family are small, decaheme c-type cytochromes, related to DmsE of Shewanella oneidensis MR-1, which has been shown to be part of an anaerobic dimethyl sulfoxide reductase. 258
19957 274619 TIGR03509 OMP_MtrB_PioB decaheme-associated outer membrane protein, MtrB/PioB family. Members of this protein family are integral proteins of the bacterial outer membrane, associated with multiheme c-type cytochromes involved in electron transfer. The MtrB protein of Shewanella oneidensis MR-1 (SO1776) has been shown to form a complex with 1:1:1 stoichiometry with the small, periplasmic decaheme cytochrome MtrA and large, surface-exposed decaheme cytochrome MtrC. [Energy metabolism, Electron transport] 649
19958 274620 TIGR03510 XapX XapX domain. This model describes an uncharacterized small, hydrophobic protein of about 50 amino acids, found between the xapB and xapR genes of the E. coli xanthosine utilization system, and homologous regions in other small proteins, such as the N-terminal region of DUF1427 (pfam07235). We name this domain XapX, as it comprises the full length of the protein encoded between the genes for the well-studied XapB and XapR proteins. [Unknown function, General] 49
19959 274621 TIGR03511 GldH_lipo gliding motility-associated lipoprotein GldH. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility. [Cellular processes, Chemotaxis and motility] 156
19960 132551 TIGR03512 GldD_lipo gliding motility-associated lipoprotein GldD. Members of this protein family are found in a number of Bacteroidetes lineage bacteria, including both species such as Flavobacterium johnsoniae, which possess a poorly understood form of rapid gliding motility, and other species which apparently do not. Mutation of GldD blocks both this motility and chitin utilization in the model species, Flavobacterium johnsoniae. [Cellular processes, Chemotaxis and motility] 186
19961 274622 TIGR03513 GldL_gliding gliding motility-associated protein GldL. This protein family, GldL, is named for the member from Flavobacterium johnsoniae, which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. However, members are found also in several members of the Bacteroidetes that appear not to be motile. [Cellular processes, Chemotaxis and motility] 202
19962 274623 TIGR03514 GldB_lipo gliding motility-associated lipoprotein GldB. 319
19963 132554 TIGR03515 GldC gliding motility-associated protein GldC. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldC is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldC do not abolish the gliding phenotype but do impair it. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 108
19964 132555 TIGR03516 ppisom_GldI peptidyl-prolyl isomerase, gliding motility-associated. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldI is a FKBP-type peptidyl-prolyl cis-trans isomerase (pfam00254) linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockout of this gene abolishes the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. This family is only found in Bacteroidetes containing the suite of genes proposed to confer the gliding motility phenotype. 177
19965 274624 TIGR03517 GldM_gliding gliding motility-associated protein GldM. This protein family, GldM, is named for the member from Flavobacterium johnsoniae, which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. However, members are found also in several members of the Bacteroidetes that appear not to be motile. The best conserved region, toward the N-terminus, is centered on a highly hydrophobic probable transmembrane helix. Two paralogs are found in Cytophaga hutchinsonii. 523
19966 132557 TIGR03518 ABC_perm_GldF gliding motility-associated ABC transporter permease protein GldF. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldF is believed to be an ABC transporter permease protein (along with ATP-binding subunit, GldA and a substrate-binding subunit, GldG) and is linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldF abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 240
19967 274625 TIGR03519 T9SS_PorP_fam type IX secretion system membrane protein, PorP/SprF family. This model describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that have type IX secretion systems (T9SS) and exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility. 291
19968 274626 TIGR03520 GldE gliding motility-associated protein GldE. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldE is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. GldE was discovered because of its adjacency to GldD in F. johnsoniae. Overexpression of GldE partially suppresses the effects of a GldB point mutant, suggesting that GldB and GldE interact. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility and in fact some do not appear to express the gliding phenotype. 407
19969 274627 TIGR03521 GldG gliding-associated putative ABC transporter substrate-binding component GldG. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldG is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldG abolish the gliding phenotype. GldG, along with GldA and GldF are believed to compose an ABC transporter and are observed as an operon. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 552
19970 132561 TIGR03522 GldA_ABC_ATP gliding motility-associated ABC transporter ATP-binding subunit GldA. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldA is an ABC transporter ATP-binding protein (pfam00005) linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldA abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 301
19971 274628 TIGR03523 GldN gliding motility associated protein GldN. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldN is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldN abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family also include those which are not believed to express the gliding phenotype, such as Prevotella intermedia and Porphyromonas gingivalis. 280
19972 132563 TIGR03524 GldJ gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. There is a GldJ homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 559
19973 274629 TIGR03525 GldK gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. There is a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 450
19974 132565 TIGR03526 selenium_YgeY putative selenium metabolism hydrolase. SelD, selenophosphate synthase, is the selenium donor protein for both selenocysteine and selenouridine biosynthesis systems, but it occurs also in a few prokaryotes that have neither of those pathways. The method of partial phylogenetic profiling, starting from such orphan-selD genomes, identifies this protein as one of those most strongly correlated to SelD occurrence. Its distribution is also well correlated with that of family TIGR03309, a putative accessory protein of labile selenium (non-selenocysteine) enzyme maturation. This family includes the uncharacterized YgeY of Escherichia coli, and belongs to a larger family of metalloenzymes in which some are known peptidases, others enzymes of different types. 395
19975 274630 TIGR03527 selenium_YedF selenium metabolism protein YedF. Members of this protein family are about 200 amino acids in size, and include the uncharacterized YedF protein of Escherichia coli. This family shares an N-terminal domain, modeled by pfam01206, with the sulfurtransferase TusA (also called SirA). The C-terminal domain includes a typical redox-active disulfide motif, CGXC. This protein family is found only among those genomes that also carry the selenium donor protein SelD, and its connection to selenium metabolism is indicated by the method of partial phylogenetic profiling vs. SelD. Its gene typically is found next to selD. Members of this family are found even when selenocysteine and selenouridine biosynthesis pathways are, except for SelD, completely absent, as in Enterococcus faecalis. Its role in selenium metabolism is unclear, but may include either detoxification or a role in labile selenoprotein biosynthesis. 194
19976 274631 TIGR03528 2_3_DAP_am_ly diaminopropionate ammonia-lyase. Members of this protein family are the homodimeric, pyridoxal phosphate enzyme diaminopropionate ammonia-lyase, which adds water to remove two amino groups, leaving pyruvate. 396
19977 274632 TIGR03529 GldK_short gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. This model represents a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture than that found in Flavobacterium johnsoniae and related species (represented by TIGR03525). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 344
19978 132569 TIGR03530 GldJ_short gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. This model represents the GldJ homolog in Cytophaga hutchinsonii and several other species which is of shorter architecture than that found in Flavobacterium johnsoniae and is represented by a separate model (TIGR03524). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 402
19979 211833 TIGR03531 selenium_SpcS O-phosphoseryl-tRNA(Sec) selenium transferase. In the archaea and eukaryotes, the conversion of the mischarged serine to selenocysteine (Sec) on its tRNA is accomplished in two steps. This enzyme, O-phosphoseryl-tRNA(Sec) selenium transferase, acts second, after a phosphorylation step catalyzed by a homolog of the bacterial SelA protein. [Protein synthesis, tRNA aminoacylation] 444
19980 132571 TIGR03532 DapD_Ac 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-acetyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. Alternate name: tetrahydrodipicolinate N-acetyltransferase. Note that IUBMB lists this alternate name as the accepted name. Unfortunately, the related succinyl transferase acting on the same substrate (EC:2.3.1.117, TIGR00695) uses the opposite standard. We have decided to give these two enzymes names which more clearly indicate that they act on the same substrate. 231
19981 274633 TIGR03533 L3_gln_methyl protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific. Members of this protein family methylate ribosomal protein L3 on a glutamine side chain. This family is related to HemK, a protein-glutamine methyltransferase for peptide chain release factors. [Protein synthesis, Ribosomal proteins: synthesis and modification] 284
19982 274634 TIGR03534 RF_mod_PrmC protein-(glutamine-N5) methyltransferase, release factor-specific. Members of this protein family are HemK (PrmC), a protein once thought to be involved in heme biosynthesis but now recognized to be a protein-glutamine methyltransferase that modifies the peptide chain release factors. All members of the seed alignment are encoded next to the release factor 1 gene (prfA) and confirmed by phylogenetic analysis. SIMBAL analysis (manuscript in prep.) shows the motif [LIV]PRx[DE]TE (in Escherichia coli, IPRPDTE) confers specificity for the release factors rather than for ribosomal protein L3. [Protein fate, Protein modification and repair] 250
19983 274635 TIGR03535 DapD_actino 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. This model represents a clade of the enzyme specific to Actinobacteria. Alternate name: tetrahydrodipicolinate N-succinyltransferase. 319
19984 211834 TIGR03536 DapD_gpp 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase (DapD) is involved in the succinylated branch of the "lysine biosynthesis via diaminopimelate (DAP)" pathway (GenProp0125). This model represents a clade of DapD sequences most closely related to the actinobacterial DapD family represented by the TIGR03535 model. All of the genes evaluated for the seed of this model are found in genomes where the downstream desuccinylase is present, but known DapD genes are absent. Additionally, many of the genes identified by this model are found proximal to genes involved in this lysine biosynthesis pathway. 341
19985 274636 TIGR03537 DapC succinyldiaminopimelate transaminase. The four sequences which make up the seed for this model are not closely related, although they are all members of the pfam00155 family of aminotransferases and are more closely related to each other than to anything else. Additionally, all of them are found in the vicinity of genes involved in the biosynthesis of lysine via the diaminopimelate pathway (GenProp0125), although this amounts to a separation of 12 genes in the case of Sulfurihydrogenibium azorense Az-Fu1. None of these genomes contain another strong candidate for this role in the pathway. Note: the detailed information included in the EC:2.6.1.17 record includes the assertions that the enzyme uses the pyridoxal pyrophosphate cofactor, which is consistent with the pfam00155 family, and the assertion that the amino group donor is L-glutamate, which is undetermined for the sequences in this clade. 350
19986 274637 TIGR03538 DapC_gpp succinyldiaminopimelate transaminase. This family of succinyldiaminopimelate transaminases (DapC) includes the experimentally characterized enzyme from Bordetella pertussis. The majority of genes in this family are proximal to genes encoding components of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). 393
19987 132578 TIGR03539 DapC_actino succinyldiaminopimelate transaminase. This family of actinobacterial succinyldiaminopimelate transaminase enzymes (DapC) are members of the pfam00155 superfamily. Many of these genes appear adjacent to other genes encoding enzymes of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). 357
19988 274638 TIGR03540 DapC_direct LL-diaminopimelate aminotransferase. This clade of the pfam00155 superfamily of aminotransferases includes several which are adjacent to elements of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). Every member of this clade is from a genome which possesses most of the lysine biosynthesis pathway but lacks any of the known aminotransferases, succinylases, desuccinylases, acetylases or deacetylases typical of the acylated versions of this pathway nor do they have the direct, NADPH-dependent enzyme (ddh). Although there is no experimental characterization of any of the sequences in this clade, a direct pathway is known in plants and Chlamydia and the clade containing the Chlamydia gene is a neighboring one in the same pfam00155 superfamily so it seems quite reasonable that these enzymes catalyze the same transformation. 383
19989 132580 TIGR03541 reg_near_HchA LuxR family transcriptional regulatory, chaperone HchA-associated. Members of this protein family belong to the LuxR transcriptional regulator family, and contain both autoinducer binding (pfam03472) and transcriptional regulator (pfam00196) domains. Members, however, occur only in a few members of the Gammaproteobacteria that have the chaperone/aminopeptidase HchA, and are always encoded by the adjacent gene. 232
19990 163316 TIGR03542 DAPAT_plant LL-diaminopimelate aminotransferase. This clade of the pfam00155 superfamily of aminotransferases includes several which are adjacent to elements of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). This clade includes characterized species in plants and Chlamydia. Every member of this clade is from a genome which possesses most of the lysine biosynthesis pathway but lacks any of the known succinylases, desuccinylases, acetylases or deacetylases typical of the acylated versions of this pathway nor do they have the direct, NADPH-dependent enzyme (ddh). 402
19991 188337 TIGR03543 divI1A_rptt_fam DivIVA domain repeat protein. Members of this protein family contain two full and two partial repeats of a domain found at the N-terminus of Bacillus subtilis cell-division initiation protein DivIVA. The portion repeated four times in these proteins includes the motif GYxxxxVD. 178
19992 274639 TIGR03544 DivI1A_domain DivIVA domain. This model describes a domain found in Bacillus subtilis cell division initiation protein DivIVA, and homologs, toward the N-terminus. It is also found as a repeated domain in certain other proteins, including family TIGR03543. 34
19993 274640 TIGR03545 TIGR03545 TIGR03545 family protein. This model represents a relatively rare but broadly distributed uncharacterized protein family, distributed in 1-2 percent of bacterial genomes, all of which have outer membranes. In many of these genomes, it is part of a two-gene pair. 555
19994 200289 TIGR03546 TIGR03546 TIGR03546 family protein. Members of this family are uncharacterized proteins, usually encoded by a gene adjacent to a member of family TIGR03545, which is also uncharacterized. 154
19995 274641 TIGR03547 muta_rot_YjhT mutarotase, YjhT family. Members of this protein family contain multiple copies of the beta-propeller-forming Kelch repeat. All are full-length homologs to YjhT of Escherichia coli, which has been identified as a mutarotase for sialic acid. This protein improves bacterial ability to obtain host sialic acid, and thus serves as a virulence factor. Some bacteria carry what appears to be a cyclically permuted homolog of this protein. 346
19996 274642 TIGR03548 mutarot_permut cyclically-permuted mutarotase family protein. Members of this protein family show essentially full-length homology, cyclically permuted, to YjhT from Escherichia coli. YjhT was shown to act as a mutarotase for sialic acid, and by this ability to be able to act as a virulence factor. Members of the YjhT family (TIGR03547) and this cyclically-permuted family have multiple repeats of the beta-propeller-forming Kelch repeat. 331
19997 132588 TIGR03549 TIGR03549 YcaO domain protein. This family consists of remarkably well-conserved proteins from gamma and beta Proteobacteria, heavily skewed towards organisms of marine environments. Its gene neighborhood is not conserved. This family has an OsmC-like N-terminal domain. It shares a YcaO domain, frequently associated with ATP-dependent cyclodehydration for peptide modification. The function is unknown. Fifteen of the first sixteen members of this family are from selenouridine-positive genomes, but this correlation may not be meaningful. 718
19998 132589 TIGR03550 F420_cofG 7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase, CofG subunit. This model represents either a subunit or a domain, depending on whether or not the genes are fused, of a bifunctional protein that completes the synthesis of 7,8-didemethyl-8-hydroxy-5-deazariboflavin, or FO. FO is the chromophore of coenzyme F(420), involved in methanogenesis in methanogenic archaea but found in certain other lineages as well. The chromophore also occurs as a cofactor in DNA photolyases in Cyanobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 322
19999 132590 TIGR03551 F420_cofH 7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase, CofH subunit. This enzyme, together with CofG, completes the biosynthesis of 7,8-didemethyl-8-hydroxy-5-deazariboflavin, the chromophore of coenzyme F420. The chromophore is also used in cyanobacterial DNA photolyases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 343
20000 274643 TIGR03552 F420_cofC 2-phospho-L-lactate guanylyltransferase. Members of this protein family are the CofC enzyme of coenzyme F420 biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 195
20001 132592 TIGR03553 F420_FbiB_CTERM F420 biosynthesis protein FbiB, C-terminal domain. Coenzyme F420 differs between the Archaea and the Actinobacteria, where the numbers of glutamate residues attached are 2 (Archaea) or 5-6 (Mycobacterium). The enzyme in the Archaea is homologous to the N-terminal domain of FbiB from Mycobacterium bovis, and is responsible for glutamate ligation. Therefore it seems likely that the C-terminal domain of FbiB modeled here, is involved in additional glutamate ligation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 194
20002 213827 TIGR03554 F420_G6P_DH glucose-6-phosphate dehydrogenase (coenzyme-F420). This family consists of the F420-dependent glucose-6-phosphate dehydrogenase of Mycobacterium and Nocardia. It shows homology to several other F420-dependent enzymes rather than to the NAD or NADP-dependent glucose-6-phosphate dehydrogenases. [Energy metabolism, Pentose phosphate pathway] 331
20003 132594 TIGR03555 F420_mer 5,10-methylenetetrahydromethanopterin reductase. Members of this protein family are 5,10-methylenetetrahydromethanopterin reductase, an F420-dependent enzyme of methanogenesis. It is restricted to the Archaea. [Energy metabolism, Methanogenesis] 325
20004 274644 TIGR03556 photolyase_8HDF deoxyribodipyrimidine photo-lyase, 8-HDF type. This model describes a narrow clade of cyanobacterial deoxyribodipyrimidine photo-lyase. This group, in contrast to several closely related proteins, uses a chromophore that, in other lineages is modified further to become coenzyme F420. This chromophore is called 8-HDF in most articles on the DNA photolyase and FO in most literature on coenzyme F420. [DNA metabolism, DNA replication, recombination, and repair] 471
20005 274645 TIGR03557 F420_G6P_family F420-dependent oxidoreductase, G6PDH family. Members of this protein family include F420-dependent glucose-6-phosphate dehydrogenases (TIGR03554) and related proteins. All members of this family come from species that synthesize coenzyme F420, with the exception of those that belong to TIGR03885, a clade within this family in which cofactor binding may instead be directed to FMN. [Unknown function, Enzymes of unknown specificity] 316
20006 274646 TIGR03558 oxido_grp_1 luciferase family oxidoreductase, group 1. The Pfam domain family pfam00296 is named for luciferase-like monooxygenases, but the family also contains several coenzyme F420-dependent enzymes. This protein family represents a well-resolved clade within family pfam00296 and shows no restriction to coenzyme F420-positive species, unlike some other clades within pfam00296. [Unknown function, Enzymes of unknown specificity] 323
20007 274647 TIGR03559 F420_Rv3520c probable F420-dependent oxidoreductase, Rv3520c family. Members of this protein family are predicted to be oxidoreductases dependent on coenzyme F420. The family includes a single member in Mycobacterium tuberculosis (Rv3520c/MT3621) but four in Mycobacterium smegmatis. Prediction that this family is F420-dependent is based primarily on Partial Phylogenetic Profiling vs. F420 biosynthesis. [Unknown function, Enzymes of unknown specificity] 325
20008 274648 TIGR03560 F420_Rv1855c probable F420-dependent oxidoreductase, Rv1855c family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes one such subfamily, exemplified by Rv1855c from Mycobacterium tuberculosis. [Unknown function, Enzymes of unknown specificity] 227
20009 274649 TIGR03561 organ_hyd_perox peroxiredoxin, Ohr subfamily. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein Ohr, or organic hydroperoxide resistance protein. [Cellular processes, Detoxification] 134
20010 274650 TIGR03562 osmo_induc_OsmC peroxiredoxin, OsmC subfamily. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein OsmC, or osmotically induced protein C. The member from Thermus thermophilus was shown to have hydroperoxide peroxidase activity. In many species, this protein is induced by stress and helps resist oxidative stress. [Cellular processes, Detoxification] 135
20011 132602 TIGR03563 perox_SACOL1771 peroxiredoxin, SACOL1771 subfamily. This protein family belongs to the OsmC/Ohr family (pfam02566, OsmC-like protein) of peroxiredoxins. 138
20012 274651 TIGR03564 F420_MSMEG_4879 F420-dependent oxidoreductase, MSMEG_4879 family. Coenzyme F420 is produced by methanogenic archaea, a number of the Actinomycetes (including Mycobacterium tuberculosis), and rare members of other lineages. The resulting information-rich phylogenetic profile identifies candidate F420-dependent oxidoreductases within the family of luciferase-like enzymes (pfam00296), where the species range for the subfamily encompasses many F420-positive genomes without straying beyond. This family is uncharacterized, and named for member MSMEG_4879 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity] 265
20013 274652 TIGR03565 alk_sulf_monoox alkanesulfonate monooxygenase, FMNH(2)-dependent. Members of this protein family are monooxygenases that catalyze desulfonation of aliphatic sulfonates such as methane sulfonate. This enzyme uses reduced FMN, although various other members of the same luciferase-like monooxygenase family (pfam00296) are F420-dependent enzymes. [Central intermediary metabolism, Sulfur metabolism] 346
20014 211840 TIGR03566 FMN_reduc_MsuE FMN reductase, MsuE subfamily. Members of this protein family use NAD(P)H to reduce FMN and regenerate FMNH2. Members include the NADH-dependent enzyme MsuE from Pseudomonas aeruginosa, which serves as a partner to an FMNH2-dependent alkanesulfonate monooxygenase. The NADP-dependent enzyme from E. coli is outside the scope of this model. 174
20015 274653 TIGR03567 FMN_reduc_SsuE FMN reductase, SsuE family. Members of this protein family use NAD(P)H to reduce FMN and regenerate FMNH2. Members include the homodimeric, NAD(P)H-dependent enzyme SsuE from Escherichia coli, which serves as a partner to an FMNH2-dependent alkanesulfonate monooxygenase. It is induced by sulfate starvation. The NADH-dependent enzyme MsuE from Pseudomonas aeruginosa is outside the scope of this model (see model TIGR03566). [Central intermediary metabolism, Sulfur metabolism] 171
20016 274654 TIGR03568 NeuC_NnaA UDP-N-acetyl-D-glucosamine 2-epimerase, UDP-hydrolysing. This family of enzymes catalyzes the combined epimerization and UDP-hydrolysis of UDP-N-acetylglucosamine to N-acetylmannosamine. This is in contrast to the related enzyme WecB (TIGR00236) which retains the UDP moiety. NeuC acts in concert with NeuA and NeuB to synthesize CMP-N5-acetyl-neuraminate. 364
20017 274655 TIGR03569 NeuB_NnaB N-acetylneuraminate synthase. This family is a subset of the pfam03102 and is believed to include only authentic NeuB N-acetylneuraminate (sialic acid) synthase enzymes. The majority of the genes identified by this model are observed adjacent to both the NeuA and NeuC genes which together effect the biosynthesis of CMP-N-acetylneuraminate from UDP-N-acetylglucosamine. 329
20018 274656 TIGR03570 NeuD_NnaD sugar O-acyltransferase, sialic acid O-acetyltransferase NeuD family. This family of proteins includes the characterized NeuD sialic acid O-acetyltransferase enzymes from E. coli and Streptococcus agalactiae (group B strep). These two are quite closely related to one another, so extension of this annotation to other members of the family is unsupported without additional independent evidence. The neuD gene is often observed in close proximity to the neuABC genes for the biosynthesis of CMP-N-acetylneuraminic acid (CMP-sialic acid), and NeuD sequences from these organisms were used to construct the seed for this model. Nevertheless, there are numerous instances of sequences identified by this model which are observed in a different genomic context (although almost universally in exopolysaccharide biosynthesis-related loci), as well as in genomes for which the biosynthesis of sialic acid (SA) is undemonstrated. Even in the cases where the association with SA biosynthesis is strong, it is unclear in the literature whether the biological substrate is SA itself, CMP-SA, or a polymer containing SA. Similarly, it is unclear to what extent the enzyme has a preference for acetylation at the 7, 8 or 9 positions. In the absence of evidence of association with SA, members of this family may be involved with the acetylation of differing sugar substrates, or possibly the delivery of alternative acyl groups. The closest related sequences to this family (and those used to root the phylogenetic tree constructed to create this model) are believed to be succinyltransferases involved in lysine biosynthesis. These proteins contain repeats of the bacterial transferase hexapeptide (pfam00132), although often these do not register above the trusted cutoff. 201
20019 274657 TIGR03571 lucif_BA3436 luciferase-type oxidoreductase, BA3436 family. This family is a distinct subgroup among members of the luciferase monooxygenase domain family. The larger family contains both FMN-binding enzymes (luciferase, alkane monooxygenase) and F420-binding enzymes (methylenetetrahydromethanopterin reductase, secondary alcohol dehydrogenase, glucose-6-phosphate dehydrogenase). Although some members of the domain family bind coenzyme F420 rather than FMN, members of this family are from species that lack the genes for F420 biosynthesis. A crystal structure, but not function, is known (but unpublished) for the member from Bacillus cereus, PDB|2B81. [Unknown function, Enzymes of unknown specificity] 298
20020 132611 TIGR03572 WbuZ glycosyl amidation-associated protein WbuZ. This clade of sequences is highly similar to the HisF protein, but generally represents the second HisF homolog in the genome where the other is an authentic HisF observed in the context of a complete histidine biosynthesis operon. The similarity between these WbuZ sequences and true HisFs is such that often the closest match by BLAST of a WbuZ is a HisF. Only by making a multiple sequence alignment is the homology relationship among the WbuZ sequences made apparent. WbuZ genes are invariably observed in the presence of a homolog of the HisH protein (designated WbuY) and a proposed N-acetyl sugar amidotransferase designated WbuX in E. coli, IfnA in P. aeruginosa and PseA in C. jejuni. Similarly, this trio of genes is invariably found in the context of saccharide biosynthesis loci. It has been shown that the WbuYZ homologs are not essential components of the activity expressed by WbuX, leading to the proposal that these two proteins provide ammonium ions to the amidotransferase when these are in low concentration. WbuY (like HisH) is proposed to act as a glutaminase to release ammonium. In histidine biosynthesis this is also dispensable in the presence of exogenous ammonium ion. HisH and HisF form a complex such that the ammonium ion is passed directly to HisF where it is used in an amidation reaction causing a subsequent cleavage and cyclization. In the case of WbuYZ, the ammonium ion would be passed from WbuY to WbuZ. WbuZ, being non-essential and so similar to HisF that a sugar substrate is unlikely, would function instead as an ammonium channel to the WbuX protein which does the enzymatic work. 232
20021 274658 TIGR03573 WbuX N-acetyl sugar amidotransferase. This enzyme has been implicated in the formation of the acetamido moiety (sugar-NC(=NH)CH3) which is found on some exopolysaccharides and is positively charged at neutral pH. The reaction involves ligation of ammonia with a sugar N-acetyl group, displacing water. In E. coli (O145 strain) and Pseudomonas aeruginosa (O12 strain) this gene is known as wbuX and ifnA respectively and likely acts on sialic acid. In Campylobacter jejuni, the gene is known as pseA and acts on pseudaminic acid in the process of flagellin glycosylation. In other Pseudomonas strains and various organisms it is unclear what the identity of the sugar substrate is, and in fact, the phylogenetic tree of this family sports a considerably deep branching suggestive of possible major differences in substrate structure. Nevertheless, the family is characterized by a conserved tetracysteine motif (CxxC.....[GN]xCxxC) possibly indicative of a metal binding site, as well as an invariable contextual association with homologs of the HisH and HisF proteins known as WbuY and WbuZ, respectively. These two proteins are believed to supply the enzyme with ammonium by hydrolysis of glutamine and delivery through an ammonium conduit. 343
20022 132613 TIGR03574 selen_PSTK L-seryl-tRNA(Sec) kinase, archaeal. Members of this protein family are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents archaeal proteins with this activity. [Protein synthesis, tRNA aminoacylation] 249
20023 188340 TIGR03575 selen_PSTK_euk L-seryl-tRNA(Sec) kinase, eukaryotic. Members of this protein family are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents eukaryotic proteins with this activity. 340
20024 213830 TIGR03576 pyridox_MJ0158 pyridoxal phosphate enzyme, MJ0158 family. Members of this archaeal protein family are pyridoxal phosphate enzymes of unknown function. Sequence similarity to SelA, a bacterial enzyme of selenocysteine biosynthesis, has led to some members being misannotated as functionally equivalent, but selenocysteine is made on tRNA in Archaea by a two-step process that does not involve a SelA homolog. [Unknown function, Enzymes of unknown specificity] 346
20025 132616 TIGR03577 EF_0830 conserved hypothetical protein EF_0830/AHA_3911. Members of this family of small (about 120 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 115
20026 213831 TIGR03578 EF_0831 conserved hypothetical protein EF_0831/AHA_3912. Members of this family of small (about 100 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 96
20027 132618 TIGR03579 EF_0833 conserved hypothetical protein EF_0833/AHA_3914. Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. 209
20028 274659 TIGR03580 EF_0832 conserved hypothetical protein EF_0832/AHA_3913. Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Unknown function, General] 246
20029 188342 TIGR03581 EF_0839 conserved hypothetical protein EF_0839/AHA_3917. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 227
20030 132621 TIGR03582 EF_0829 PRD domain protein EF_0829/AHA_3910. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. This protein contains a PRD domain (see pfam00874). The function is unknown. [Unknown function, General] 107
20031 132622 TIGR03583 EF_0837 probable amidohydrolase EF_0837/AHA_3915. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. These proteins resemble aminohydrolases (see pfam01979), including dihydroorotases. The function is unknown. [Hypothetical proteins, Conserved] 365
20032 274660 TIGR03584 PseF pseudaminic acid cytidylyltransferase. The sequences in this family include the pfam02348 (cytidyltransferase) domain and are homologous to the NeuA protein responsible for the transfer of CMP to neuraminic acid. According to, this gene is responsible for the transfer of CMP to the structurally related sugar, pseudaminic acid which is observed as a component of sugar modifications of flagellin in Campylobacter species. This gene is commonly observed in apparent operons with other genes responsible for the biosynthesis of pseudaminic acid and as a component of flagellar and exopolysaccharide biosynthesis loci. 222
20033 274661 TIGR03585 PseH UDP-4-amino-4,6-dideoxy-N-acetyl-beta-L-altrosamine N-acetyltransferase. Sequences in this family are members of the pfam00583 (GNAT) superfamily of acetyltransferases and are proposed to perform an N-acetylation step in the process of pseudaminic acid biosynthesis in Campylobacter species. This gene is commonly observed in apparent operons with other genes responsible for the biosynthesis of pseudaminic acid and as a component of flagellar and exopolysaccharide biosynthesis loci. Significantly, many genomes containing other components of this pathway lack this gene, indicating that some other N-acetyl transferases may be involved and/or the step is optional, resulting in a non-acetylated pseudaminic acid variant sugar. 152
20034 163337 TIGR03586 PseI pseudaminic acid synthase. Members of this family are included within the larger pfam03102 (NeuB) family. NeuB itself (TIGR03569) is involved in the biosynthesis of neuraminic acid by the condensation of phosphoenolpyruvate (PEP) with N-Acetyl-D-Mannosamine. In an analogous reaction, this enzyme, PseI, condenses PEP with 6-deoxy-beta-L-AltNAc4NAc to generate pseudaminic acid. 327
20035 132626 TIGR03587 Pse_Me-ase pseudaminic acid biosynthesis-associated methylase. Members of this small clade are methyltransferases of the pfam08241 family and are observed within operons for the biosynthesis of pseudaminic acid, a component of exopolysaccharide and flagellin glycosyl modifications. Notable among these genomes is Pseudomonas fluorescens PfO-1. Possibly one of the two hydroxyl groups of pseudaminic acid, at positions 4 and 8, is converted to a methoxy group by this enzyme. 204
20036 274662 TIGR03588 PseC UDP-4-amino-4,6-dideoxy-N-acetyl-beta-L-altrosamine transaminase. This family of enzymes are aminotransferases of the pfam01041 family involved in the biosynthesis of pseudaminic acid. They convert UDP-4-keto-6-deoxy-N-acetylglucosamine into UDP-4-amino-4,6-dideoxy-N-acetylgalactose. Pseudaminic acid has a role in surface polysaccharide in Pseudomonas as well as in the modification of flagellin in Campylobacter and Helicobacter species. 380
20037 132628 TIGR03589 PseB UDP-N-acetylglucosamine 4,6-dehydratase (inverting). This enzyme catalyzes the first step in the biosynthesis of pseudaminic acid, the conversion of UDP-N-acetylglucosamine to UDP-4-keto-6-deoxy-N-acetylglucosamine. These sequences are members of the broader pfam01073 (3-beta hydroxysteroid dehydrogenase/isomerase family) family. 324
20038 274663 TIGR03590 PseG UDP-2,4-diacetamido-2,4,6-trideoxy-beta-L-altropyranose hydrolase. This protein is found in association with enzymes involved in the biosynthesis of pseudaminic acid, a component of polysaccharide in certain Pseudomonas strains as well as a modification of flagellin in Campylobacter and Helicobacter. The role of this protein is unclear, although it may participate in N-acetylation in conjunction with, or in the absence of PseH (TIGR03585) as it often scores above the trusted cutoff to pfam00583 representing a family of acetyltransferases. 279
20039 274664 TIGR03591 polynuc_phos polyribonucleotide nucleotidyltransferase. Members of this protein family are polyribonucleotide nucleotidyltransferase, also called polynucleotide phosphorylase. Some members have been shown also to have additional functions as guanosine pentaphosphate synthetase and as poly(A) polymerase (see model TIGR02696 for an exception clade, within this family). [Transcription, Degradation of RNA] 688
20040 274665 TIGR03592 yidC_oxa1_cterm membrane protein insertase, YidC/Oxa1 family, C-terminal domain. This model describes full-length from some species, and the C-terminal region only from other species, of the YidC/Oxa1 family of proteins. This domain appears to be universal among bacteria (although absent from Archaea). The well-characterized YidC protein from Escherichia coli and its close homologs contain a large N-terminal periplasmic domain in addition to the region modeled here. [Protein fate, Protein and peptide secretion and trafficking] 179
20041 274666 TIGR03593 yidC_nterm membrane protein insertase, YidC/Oxa1 family, N-terminal domain. Essentially all bacteria have a member of the YidC family, whose C-terminal domain is modeled by TIGR03592. The two copies found in endospore-forming bacteria such as Bacillus subtilis appear redundant during vegetative growth, although the member designated spoIIIJ (stage III sporulation protein J) has a distinct role in spore formation. YidC, its mitochondrial homolog Oxa1, and its chloroplast homolog direct insertion into the bacterial/organellar inner (or only) membrane. This model describes an N-terminal sequence region, including a large periplasmic domain lacking in YidC members from Gram-positive species. The multifunctional YidC protein acts both with and independently of the Sec system. [Protein fate, Protein and peptide secretion and trafficking] 366
20042 274667 TIGR03594 GTPase_EngA ribosome-associated GTPase EngA. EngA (YfgK, Der) is a ribosome-associated essential GTPase with a duplication of its GTP-binding domain. It is broadly to universally distributed among bacteria. It appears to function in ribosome biogenesis or stability. [Protein synthesis, Other] 428
20043 274668 TIGR03595 Obg_CgtA_exten Obg family GTPase CgtA, C-terminal extension. CgtA (see model TIGR02729) is a broadly conserved member of the obg family of GTPases associated with ribosome maturation. This model represents a unique C-terminal domain found in some but not all sequences of CgtA. This region is preceded, and may be followed, by a region of low-complexity sequence. 69
20044 274669 TIGR03596 GTPase_YlqF ribosome biogenesis GTP-binding protein YlqF. Members of this protein family are GTP-binding proteins involved in ribosome biogenesis, including the essential YlqF protein of Bacillus subtilis. They are related to Era, EngA, and other GTPases of ribosome biogenesis, but are circularly permuted. This family is not universal, and is not present in Escherichia coli, and so is not as well studied as some other GTPases. This model is built for bacterial members. [Protein synthesis, Other] 276
20045 213834 TIGR03597 GTPase_YqeH ribosome biogenesis GTPase YqeH. This family describes YqeH, a member of a larger family of GTPases involved in ribosome biogenesis. Like YlqF, it shows a circular permutation relative to GTPases EngA (in which the GTPase domain is duplicated), Era, and others. Members of this protein family are found in a relatively small number of bacterial species, including Bacillus subtilis but not Escherichia coli. [Protein synthesis, Other] 360
20046 274670 TIGR03598 GTPase_YsxC ribosome biogenesis GTP-binding protein YsxC/EngB. Members of this protein family are GTPases associated with ribosome biogenesis, typified by YsxC from Bacillus subtilis. The family is widely but not universally distributed among bacteria. Members commonly are called EngB based on homology to EngA, one of several other GTPases of ribosome biogenesis. Cutoffs as set find essentially all bacterial members, but also identify large numbers of eukaryotic (probably organellar) sequences. This protein is found in about 80 percent of bacterial genomes. [Protein synthesis, Other] 179
20047 274671 TIGR03599 YloV DAK2 domain fusion protein YloV. This model describes a protein family that contains an N-terminal DAK2 domain (pfam02734), so named because of similarity to the dihydroxyacetone kinase family. The GTP-binding protein CgtA (a member of the obg family) is a bacterial GTPase associated with ribosome biogenesis, and it has a characteristic extension (TIGR03595) in certain lineages. The protein family described here was found, by the method of partial phylogenetic profiling, to have a phylogenetic distribution strongly correlated to that of TIGR03595. This correlation implies some form of functional coupling. 530
20048 274672 TIGR03600 phage_DnaB phage replicative helicase, DnaB family, HK022 subfamily. Members of this family are phage (or prophage-region) homologs of the bacterial homohexameric replicative helicase DnaB. Some phage may rely on host DnaB, while others encode their own versions. This model describes the largest phage-specific clade among the close homologs of DnaB, but there are, of course, other DnaB homologs from phage that fall outside the scope of this model. [Mobile and extrachromosomal element functions, Prophage functions] 420
20049 274673 TIGR03601 B_an_ocin bacteriocin, heterocycloanthracin/sonorensin family. Numerous bacteria encode systems for producing bacteriocins by extensive modification of ribosomally produced precursors. Members of the TOMM class (thiazole/oxazole-modified microcins) are recognizable by association with cyclodehydratase (and often dehydrogenase) maturation proteins. This family consists of a special subclass, the heterocycloanthracin family, that share a homologous leader peptide region and then a repeat region with Cys as every third residue. In Bacillus anthracis and Bacillus cereus, the RiPP (ribosomally translated and post-translationally modified natural product) precursor is encoded far from its maturase genes, and every strain has the system. In other species (e.g. B. licheniformis, B. sonorensis), precursor and maturase genes are close together. Sonorensin, from B. sonorensis MT93, was shown to have broad spectrum antimicrobial activity, affecting Gram-positive and Gram-negative bacteria. [Cellular processes, Toxin production and resistance] 88
20050 132641 TIGR03602 streptolysinS bacteriocin protoxin, streptolysin S family. Members of this family are bacteriocin precursors. These small, ribosomally produced polypeptide precursors are extensively processed post-translationally. This family belongs to a class of heterocycle-containing bacteriocins, including streptolysin S from Streptococcus pyogenes, and related bacteriocins from Streptococcus iniae and Clostridium botulinum. Streptolysin S is hemolytic. Bacteriocin genes in general are small and highly diverse, with odd sequence composition, and are easily missed by many gene-finding programs. [Cellular processes, Toxin production and resistance] 56
20051 200298 TIGR03603 cyclo_dehy_ocin thiazole/oxazole-forming peptide maturase, SagC family component. Members of this protein family include enzymes related to SagC, a protein involved in thiazole/oxazole cyclodehydration modifications during biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Recent evidence suggests that the YcaO/SagD-like component, not this component, performs an ATP-dependent cyclodehydration. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins. Note that this model does not find all possible examples of bacteriocin biosynthesis cyclodehydratases, and in particular misses the E. coli plasmid protein McbB of microcin B17 biosynthesis. [Cellular processes, Pathogenesis] 319
20052 274674 TIGR03604 TOMM_cyclo_SagD thiazole/oxazole-forming peptide maturase, SagD family component. Members of this protein family include enzymes related to SagD, previously referred to as a scaffold or docking protein involved in the biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Newer evidence describes an enzymatic activity, an ATP-dependent cyclodehydration reaction, previously ascribed to the SagC component. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins. 377
20053 188352 TIGR03605 antibiot_sagB SagB-type dehydrogenase domain. SagB of Streptococcus pyogenes participates in the maturation of streptolysin S from a ribosomally produced precursor polypeptide. Chemically similar systems operate on highly diverse sets of bacteriocin precursors in numerous other bacteria. This model describes a domain within SagB and homologous regions from other proteins, many of which appear to be involved in biosynthesis of secondary metabolites. While some substrates may be intermediates in non-ribosomal peptide syntheses, others are involved in heterocycle-containing bacteriocin biosynthesis, and can be found near SagC-like (see TIGR03603, cyclodehydratase) and SagD-like (see TIGR03604, "docking") proteins. Members of this domain family are heterogeneous in length, as many have a partial second copy of the domain represented here. The incomplete second domain scores below the cutoffs to this model in most cases. 173
20054 274675 TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis. 454
20055 274676 TIGR03607 TIGR03607 patatin-related protein. This bacterial protein family contains an N-terminal patatin domain, where patatins are plant storage proteins capable of phospholipase activity (see pfam01734). Regions of strong sequence conservation are separated by regions of significant sequence and length variability. Members of the family are distributed sporadically among bacteria. The function is unknown. [Unknown function, General] 738
20056 188353 TIGR03608 L_ocin_972_ABC putative bacteriocin export ABC transporter, lactococcin 972 group. A gene pair with a fairly wide distribution consists of a polypeptide related to the lactococcin 972 (see TIGR01653) and a multiple-membrane-spanning putative immunity protein (see TIGR01654). This model represents a small clade within the ABC transporters that regularly are found adjacent to these bacteriocin system gene pairs and likely serve as export proteins. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Unknown substrate] 206
20057 132648 TIGR03609 S_layer_CsaB polysaccharide pyruvyl transferase CsaB. The CsaB protein (cell surface anchoring B) of Bacillus anthracis adds a pyruvoyl group to peptidoglycan-associated polysaccharide. This addition is required for proteins with an S-layer homology domain (pfam00395) to bind. Within the larger group of proteins described by pfam04230, this model represents a distinct clade that nearly exactly follows the phylogenetic distribution of the S-layer homology domain (pfam00395). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking] 298
20058 274677 TIGR03610 RutC pyrimidine utilization protein C. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the endoribonuclease L-PSP family defined by pfam01042. 127
20059 211851 TIGR03611 RutD pyrimidine utilization protein D. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the hydrolase, alpha/beta fold family defined by pfam00067. 248
20060 163355 TIGR03612 RutA pyrimidine utilization protein A. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the luciferase family defined by pfam00296 and is likely an FMN-dependent monooxygenase. [Unknown function, Enzymes of unknown specificity] 355
20061 274678 TIGR03613 RutR pyrimidine utilization regulatory protein R. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the TetR family of transcriptional regulators defined by the N-terminal model pfam00440 and the C-terminal model pfam08362 (YcdC-like protein, C-terminal region). 202
20062 163356 TIGR03614 RutB pyrimidine utilization protein B. 226
20063 132654 TIGR03615 RutF pyrimidine utilization flavin reductase protein F. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the flavin reductase family defined by pfam01613. Presumably, this protein recycles the flavin of the RutA luciferase-like oxidoreductase. 156
20064 132655 TIGR03616 RutG pyrimidine utilization transport protein G. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the uracil-xanthine permease family defined by TIGR00801, as well as the Nucleobase:Cation Symporter-2 (NCS2) Family (TC 2.A.40). 429
20065 132656 TIGR03617 F420_MSMEG_2256 probable F420-dependent oxidoreductase, MSMEG_2256 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes one such subfamily, exemplified by MSMEG_2256 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity] 318
20066 274679 TIGR03618 Rv1155_F420 PPOX class probable F420-dependent enzyme. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomyces, make F420. The Partial Phylogenetic Profiling algorithm identifies members of this protein family as high-scoring proteins to the F420 biosynthesis profile. A member of this family, Rv1155, was crystallized after expression in Escherichia coli, which does not synthesize F420; the crystal structure was shown to resemble FMN-binding proteins, but with a recognizable empty cleft corresponding to, yet differing profoundly from, the FMN site of pyridoxine 5'-phosphate oxidase. We propose that this protein family consists of F420-binding enzymes. [Unknown function, Enzymes of unknown specificity] 126
20067 274680 TIGR03619 F420_Rv2161c probable F420-dependent oxidoreductase, Rv2161c family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a domain found in a distinctive subset of bacterial luciferase homologs, found only in F420-biosynthesizing members of the Actinobacteria. [Unknown function, Enzymes of unknown specificity] 246
20068 274681 TIGR03620 F420_MSMEG_4141 probable F420-dependent oxidoreductase, MSMEG_4141 family. Members of this protein family, related to F420-dependent oxidoreductases within the larger family of bacterial luciferase (an FMN-dependent enzyme), occur only within the small subset of species that synthesize F420. Most such proteins are from members of the Actinobacteria, but at least one species, Sphingomonas wittichii, belongs to the Alphaproteobacteria. [Unknown function, Enzymes of unknown specificity] 278
20069 200301 TIGR03621 F420_MSMEG_2516 probable F420-dependent oxidoreductase, MSMEG_2516 family. Coenzyme F420 is produced by methanogenic archaea, a number of the Actinomycetes (including Mycobacterium tuberculosis), and rare members of other lineages. The resulting information-rich phylogenetic profile identifies candidate F420-dependent oxidoreductases within the family of luciferase-like enzymes (pfam00296), where the species range for the subfamily encompasses many F420-positive genomes without straying beyond. This family is uncharacterized, and named for member MSMEG_2516 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity] 295
20070 132661 TIGR03622 urea_t_UrtB_arc urea ABC transporter, permease protein UrtB. Members of this protein family are ABC transporter permease subunits restricted to the Archaea. Several lines of evidence suggest this protein is functionally analogous, as well as homologous, to the UrtB subunit of the Corynebacterium glutamicum urea transporter. All members of the operon show sequence similarity to urea transport subunits, the gene is located near the urease structural subunits in two of three species, and partial phylogenetic profiling identifies this permease subunit as closely matching the profile of urea utilization. 283
20071 274682 TIGR03623 TIGR03623 probable DNA repair protein. Members of this protein family are bacterial proteins of about 900 amino acids in length. Members show extended homology to proteins in TIGR02786, the AddB protein of double-strand break repair via homologous recombination. Members of this family, therefore, may be DNA repair proteins. 874
20072 274683 TIGR03624 TIGR03624 putative hydrolase. Members of this protein family have a phylogenetic distribution skewed toward the Actinobacteria (high GC Gram-positive bacteria), but with a few members occurring in the Archaea and Chloroflexi. The function is unknown. [Unknown function, General] 346
20073 274684 TIGR03625 L3_bact 50S ribosomal protein uL3, bacterial form. This model describes the bacterial (and mitochondrial and chloroplast) class of ribosomal protein L3. A separate model describes the archaeal form, where both belong to pfam00297. The name is phrased to meet the needs of bacterial genome annotation. Organellar forms typically will have transit peptides, N-terminal to the region modeled here. 202
20074 274685 TIGR03626 L3_arch ribosomal protein uL3, archaeal form. This model describes exclusively the archaeal class of ribosomal protein L3. A separate model (TIGR03625) describes the bacterial/organelle form, and both belong to pfam00297. Eukaryotic proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 330
20075 132666 TIGR03627 uS9_arch ribosomal protein uS9, archaeal form. This model describes exclusively the archaeal ribosomal protein S9P. Homologous eukaryotic and bacterial ribosomal proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 130
20076 274686 TIGR03628 arch_S11P ribosomal protein uS11P, archaeal form. This model describes exclusively the archaeal ribosomal protein S11P. It excludes homologous ribosomal proteins S14 from eukaryotes and S11 from bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification] 117
20077 213839 TIGR03629 uS13_arch ribosomal protein uS13, archaeal form. This model describes exclusively the archaeal ribosomal protein S13P. It excludes the homologous eukaryotic 40S ribosomal protein S18 and bacterial 30S ribosomal protein S13. [Protein synthesis, Ribosomal proteins: synthesis and modification] 144
20078 274687 TIGR03630 uS17_arch ribosomal protein uS17, archaeal form. This model describes exclusively the archaeal ribosomal protein S17P. It excludes the homologous ribosomal protein S17 from bacteria, and is not intended for use on eukaryotic sequences, where some instances of ribosomal proteins S11 score above the trusted cutoff. [Protein synthesis, Ribosomal proteins: synthesis and modification] 102
20079 274688 TIGR03631 uS13_bact ribosomal protein uS13, bacterial form. This model describes bacterial ribosomal protein S13, to the exclusion of the homologous archaeal S13P and eukaryotic ribosomal protein S18. This model identifies some (but not all) instances of chloroplast and mitochondrial S13, which is of bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification] 113
20080 274689 TIGR03632 uS11_bact ribosomal protein uS11, bacterial form. This model describes the bacterial 30S ribosomal protein S11. Cutoffs are set such that the model excludes archaeal and eukaryotic ribosomal proteins, but many chloroplast and mitochondrial equivalents of S11 are detected. [Protein synthesis, Ribosomal proteins: synthesis and modification] 108
20081 163366 TIGR03633 arc_protsome_A proteasome endopeptidase complex, archaeal, alpha subunit. This protein family describes the archaeal proteasome alpha subunit, homologous to both the beta subunit and to the alpha and beta subunits of eukaryotic proteasome subunits. This family is universal in the first 29 complete archaeal genomes but occasionally is duplicated. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 224
20082 274690 TIGR03634 arc_protsome_B proteasome endopeptidase complex, archaeal, beta subunit. This protein family describes the archaeal proteasome beta subunit, homologous to both the alpha subunit and to the alpha and beta subunits of eukaryotic proteasome subunits. This family is universal in the first 29 complete archaeal genomes but occasionally is duplicated. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 185
20083 274691 TIGR03635 uS17_bact ribosomal protein uS17, bacterial form. This model describes the bacterial ribosomal small subunit protein S17, while excluding cytosolic eukaryotic homologs and archaeal homologs. The model finds many, but not all, chloroplast and mitochondrial counterparts to bacterial S17. [Protein synthesis, Ribosomal proteins: synthesis and modification] 72
20084 274692 TIGR03636 uL23_arch ribosomal protein uL23, archaeal form. This model describes the archaeal ribosomal protein L23P and rigorously excludes the bacterial counterpart L23. In order to capture every known instance of archaeal L23P, the trusted cutoff is set lower than a few of the highest scoring eukaryotic cytosolic ribosomal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification] 77
20085 132676 TIGR03637 cas1_YPEST CRISPR-associated endonuclease Cas1, subtype I-F/YPEST. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the YPEST subtype of CRISPR/Cas system. 307
20086 274693 TIGR03638 cas1_ECOLI CRISPR-associated endonuclease Cas1, subtype I-E/ECOLI. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the ECOLI subtype of CRISPR/Cas system. 268
20087 274694 TIGR03639 cas1_NMENI CRISPR-associated endonuclease Cas1, subtype II/NMENI. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 variant of the NMENI subtype of CRISPR/Cas system. 278
20088 188360 TIGR03640 cas1_DVULG CRISPR-associated endonuclease Cas1, subtype I-C/DVULG. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the DVULG subtype of CRISPR/Cas system. 340
20089 274695 TIGR03641 cas1_HMARI CRISPR-associated endonuclease Cas1, subtype I-B/HMARI/TNEAP. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes a Cas1 subgroup that includes Cas1 proteins of the related HMARI and TNEAP subtypes of CRISPR/Cas system. 320
20090 274696 TIGR03642 cas_csx14 CRISPR-associated protein, Csx14 family. This model describes an N-terminal protein sequence domain strictly associated with CRISPR and CRISPR-associated protein systems. This model and TIGR02584 identify two separate clades from a larger homology domain family, both CRISPR-associated, while other homologs are found that may not be. Members are found in bacteria that include Pelotomaculum thermopropionicum SI, Thermoanaerobacter tengcongensis MB4, and Roseiflexus sp. RS-1, and in archaea that include Thermoplasma volcanium, Picrophilus torridus, and Methanospirillum hungatei. The molecular function is unknown. 124
20091 132682 TIGR03643 TIGR03643 TIGR03643 family protein. This model describes an uncharacterized bacterial protein family. Members average about 90 amino acids in length with several well-conserved uncommon amino acids (Trp, Met). The majority of species are marine bacteria. Few species have more than one copy, but Vibrio cholerae El Tor N16961 has three identical copies. [Hypothetical proteins, Conserved] 72
20092 274697 TIGR03644 marine_trans_1 probable ammonium transporter, marine subtype. Members of this protein family are a well-conserved subclass of putative ammonium transporters, belonging to the much broader set of ammonium/methylammonium transporters described by TIGR00836. Species with this transporter tend to be marine bacteria. Partial phylogenetic profiling (PPP) picks a member of this protein family as the single best-scoring protein vs. a reference profile for the marine environment Genome Property for a large number of different query genomes. This finding by PPP suggests that this transporter family represents an important adaptation to the marine environment. 404
20093 132684 TIGR03645 glyox_marine lactoylglutathione lyase family protein. Members of this protein family share homology with lactoylglutathione lyase (glyoxalase I) and are found mainly in marine members of the gammaproteobacteria, including CPS_0532 from Colwellia psychrerythraea 34H. This family excludes a well-separated, more narrowly distributed paralogous family, exemplified by CPS_3492 from C. psychrerythraea. The function of this protein family is unknown. 162
20094 132685 TIGR03646 YtoQ_fam YtoQ family protein. Members of this family are uncharacterized proteins, including YtoQ from Bacillus subtilis. This family shows some sequence similarity to a family of nucleoside 2-deoxyribosyltransferases (COG3613 as iterated through CDD), but sufficiently remote that PSI-BLAST starting from YtoQ and exploring outwards does not discover the relationship. 144
20095 132686 TIGR03647 Na_symport_sm putative solute:sodium symporter small subunit. Members of this family are highly hydrophobic bacterial proteins of about 90 amino acids in length. Members usually are found immediately upstream of (sometimes fused to) a member of the solute:sodium symporter family, and therefore are a putative sodium:solute symporter small subunit. Members tend to be found in aquatic species, especially those from marine or other high salt environments. [Transport and binding proteins, Unknown substrate] 77
20096 274698 TIGR03648 Na_symport_lg probable sodium:solute symporter, VC_2705 subfamily. This family belongs to a larger family of transporters of the sodium:solute symporter superfamily, TC 2.A.21. Members of this strictly bacterial protein subfamily are found almost invariably immediately downstream from a member of family TIGR03647. Occasionally, the two genes are fused. 552
20097 274699 TIGR03649 ergot_EASG ergot alkaloid biosynthesis protein, AFUA_2G17970 family. This family consists of fungal proteins of unknown function associated with secondary metabolite biosynthesis, such as that of ergot alkaloids, including ergovaline. Nomenclature differs because gene order differs - this is EasG in Neotyphodium lolii but is designated ergot alkaloid biosynthetic protein A in several other fungi. 285
20098 132689 TIGR03650 violacein_E violacein biosynthesis enzyme VioE. This enzyme catalyzes the third step in violacein biosynthesis from a pair of Trp residues, as in Chromobacterium violaceum, but the first step that distinguishes that pathway from staurosporine (an indolocarbazole antibiotic) biosynthesis. [Cellular processes, Toxin production and resistance] 184
20099 274700 TIGR03651 circ_ocin_uber circular bacteriocin, circularin A/uberolysin family. Circular bacteriocins are antibiotic proteins made by ribosomal translation of a precursor molecule, followed by cleavage and circularization. Members of this subclass of the circular bacteriocins include circularin A from Clostridium beijerinckii, bacteriocin AS-48 from Enterococcus faecalis, uberolysin from Streptococcus uberis, and carnocyclin A from Carnobacterium maltaromaticum. The mature circularized peptides average about 70 amino acids in size. [Cellular processes, Toxin production and resistance] 73
20100 274701 TIGR03652 FeS_repair_RIC iron-sulfur cluster repair di-iron protein. Members of this protein family, designated variously as YftE, NorA, DrnN, and NipC, are di-iron proteins involved in the repair of iron-sulfur clusters. Previously assigned names reflect pleiotropic effects of damage from NO or other oxidative stress when this protein is mutated. The suggested name now is RIC, for Repair of Iron Centers. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 216
20101 274702 TIGR03653 uL6_arch ribosomal protein uL6, archaeal form. Members of this protein family are the archaeal form of ribosomal protein uL6 (previously L9 in yeast and human). The top-scoring proteins not selected by this model are eukaryotic cytosolic uL6. Bacterial ribosomal protein L6 scores lower and is described by a distinct model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 170
20102 274703 TIGR03654 L6_bact ribosomal protein L6, bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification] 175
20103 274704 TIGR03655 anti_R_Lar restriction alleviation protein, Lar family. Restriction alleviation proteins provide a countermeasure to host cell restriction enzyme defense against foreign DNA such as phage or plasmids. This family consists of homologs to the phage antirestriction protein Lar, and most members belong to phage genomes or prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification] 53
20104 274705 TIGR03656 IsdC heme uptake protein IsdC. Isd proteins are iron-regulated surface proteins found in Bacillus, Staphylococcus and Listeria species and are responsible for heme scavenging from hemoproteins. The IsdC protein consists of an N-terminal hydrophobic signal sequence, a central NEAT (NEAr Transporter, pfam05031) domain, which confers the ability to bind heme, and a C-terminal SrtB processing signal which targets the protein to the cell wall. IsdC is believed to make a direct contact with, and transfer heme to, the heme-binding component (IsdE) of an ABC transporter in the cytoplasmic membrane, and to receive heme from other NEAT-containing heme-binding proteins also localized in the cell wall. 217
20105 213844 TIGR03657 IsdB heme uptake protein IsdB. Isd proteins are iron-regulated surface proteins found in Bacillus, Staphylococcus and Listeria species and are responsible for heme scavenging from hemoproteins. The IsdB protein is only observed in Staphylococcus and consists of an N-terminal hydrophobic signal sequence, a pair of tandem NEAT (NEAr Transporter, pfam05031) domains, which confers the ability to bind heme, and a C-terminal sortase processing signal which targets the protein to the cell wall. IsdB is believed to make a direct contact with methemoglobin facilitating transfer of heme to IsdB. The heme is then transferred to other cell wall-bound NEAT domain proteins such as IsdA and IsdC. 644
20106 132697 TIGR03658 IsdH_HarA haptoglobin-binding heme uptake protein HarA. HarA is a heme-binding NEAT-domain (NEAr Transporter, pfam05031) protein which has been shown to bind to the haptoglobin-hemoglobin complex in order to extract heme from it. HarA has also been reported to bind hemoglobin directly. HarA (also known as IsdH) contains three NEAT domains as well as a sortase A C-terminal signal for localization to the cell wall. The heme bound at the third of these NEAT domains has been shown to be transferred to the IsdA protein also localized at the cell wall, presumably through an additional specific protein-protein interaction. Haptoglobin is a hemoglobin carrier protein involved in scavenging hemoglobin in the blood following red blood cell lysis and targeting it to the liver. 895
20107 274706 TIGR03659 IsdE heme ABC transporter, heme-binding protein isdE. This family of ABC substrate-binding proteins is observed primarily in close proximity with proteins localized to the cell wall and bearing the NEAT (NEAr Transporter, pfam05031) heme-binding domain. IsdE has been shown to bind heme and is involved in the process of scavenging heme for the purpose of obtaining iron. 289
20108 132699 TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion. [Cellular processes, Pathogenesis] 137
20109 274707 TIGR03661 T1SS_VCA0849 type I secretion C-terminal target domain (VC_A0849 subclass). This model represents a C-terminal domain associated with secretion by type 1 secretion systems (T1SS). Members of this subclass do not include the RtxA toxin of Vibrio cholerae and its homologs, although the two classes of proteins share large size, occurrence in genomes with T1SS, regions with long tandem repeats, and regions with the glycine-rich repeat modeled by pfam00353. [Cellular processes, Pathogenesis] 88
20110 274708 TIGR03662 Chlor_Arch_YYY Chlor_Arch_YYY domain. Members of this highly hydrophobic probable integral membrane family belong to two classes. In one, a single copy of the region covered by this model represents essentially the full length of a strongly hydrophobic protein of about 700 to 900 residues (variable because of long inserts in some). The domain architecture of the other class consists of an additional N-terminal region, two copies of the region represented by this model, and three to four repeats of TPR, or tetratricopeptide repeat. The unusual species range includes several Archaea, several Chloroflexi, and Clostridium phytofermentans. An unusual motif YYYxG is present, and we suggest the name Chlor_Arch_YYY protein. The function is unknown. 723
20111 274709 TIGR03663 TIGR03663 TIGR03663 family protein. Members of this protein family, uncommon and rather sporadically distributed, are found almost always in the same genomes as members of family TIGR03662, and frequently as a nearby gene. Members show some N-terminal sequence similarity with pfam02366, dolichyl-phosphate-mannose-protein mannosyltransferase. The few invariant residues in this family, found toward the N-terminus, include a dipeptide DE, a tripeptide HGP, and two different Arg residues. Up to three members may be found in a genome. The function is unknown. 439
20112 274710 TIGR03664 fut_nucase futalosine hydrolase. This enzyme catalyzes the conversion of futalosine to de-hypoxanthine futalosine in a pathway for the biosynthesis of menaquinone distinct from the pathway observed in E. coli. 222
20113 274711 TIGR03665 arCOG04150 arCOG04150 universal archaeal KH domain protein. This family of proteins is universal among the 41 archaeal genomes analyzed, and is not observed outside of the archaea. The proteins contain a single KH domain (pfam00013) which is likely to confer the ability to bind RNA. 172
20114 274712 TIGR03666 Rv2061_F420 PPOX class probable F420-dependent enzyme, Rv2061 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity] 132
20115 132706 TIGR03667 Rv3369 PPOX class probable F420-dependent enzyme, Rv3369 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity] 130
20116 132707 TIGR03668 Rv0121_F420 PPOX class probable F420-dependent enzyme, Rv0121 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity] 141
20117 132708 TIGR03669 urea_ABC_arch urea ABC transporter, substrate-binding protein, archaeal type. Members of this protein family are identified as the substrate-binding protein of a urea ABC transport system by similarity to a known urea transporter from Corynebacterium glutamicum, operon structure, proximity of its operons to urease (urea-utilization protein) operons, and by Partial Phylogenetic Profiling vs. urea utilization. [Transport and binding proteins, Amino acids, peptides and amines] 374
20118 274713 TIGR03670 rpoB_arch DNA-directed RNA polymerase subunit B. This model represents the archaeal version of DNA-directed RNA polymerase subunit B (rpoB) and is observed in all archaeal genomes. 599
20119 274714 TIGR03671 cca_archaeal CCA-adding enzyme. 408
20120 274715 TIGR03672 rpl4p_arch 50S ribosomal protein uL4, archaeal form. One of the primary rRNA binding proteins, this protein initially binds near the 5'-end of the 23S rRNA. It is important during the early stages of 50S assembly. It makes multiple contacts with different domains of the 23S rRNA in the assembled 50S subunit and ribosome. 251
20121 274716 TIGR03673 uL14_arch 50S ribosomal protein uL14, archaeal form. Part of the 50S ribosomal subunit. Forms a cluster with proteins L3 and L24e, part of which may contact the 16S rRNA in 2 intersubunit bridges. 131
20122 274717 TIGR03674 fen_arch flap structure-specific endonuclease. Endonuclease that cleaves the 5'-overhanging flap structure that is generated by displacement synthesis when DNA polymerase encounters the 5'-end of a downstream Okazaki fragment. Has 5'-endo-/exonuclease and 5'-pseudo-Y-endonuclease activities. Cleaves the junction between single and double-stranded regions of flap DNA 338
20123 274718 TIGR03675 arCOG00543 arCOG00543 universal archaeal KH-domain/beta-lactamase-domain protein. This family of proteins is universal in the archaea and consists of an N-terminal type-1 KH-domain (pfam00013) and a central beta-lactamase-domain (pfam00753) with a C-terminal motif associated with RNA metabolism (pfam07521). KH-domains are associated with RNA-binding, so taken together, this protein is a likely metal-dependent RNase. This family was defined as arCOG01782. 630
20124 274719 TIGR03676 aRF1/eRF1 peptide chain release factor 1, archaeal and eukaryotic forms. Directs the termination of nascent peptide synthesis (translation) in response to the termination codons UAA, UAG and UGA. This model identifies both archaeal (aRF1) and eukaryotic (eRF1) forms of the protein. Also known as translation termination factor 1. [Protein synthesis, Translation factors] 403
20125 188367 TIGR03677 eL8_ribo ribosomal protein eL8, archaeal form. This model specifically identifies the archaeal version of the large ribosomal complex protein eL8, previously designated L8 in yeast and L7Ae in the archaea. The family is a narrower version of the pfam01248 model which also recognizes the L30 protein. 117
20126 163391 TIGR03678 het_cyc_patell bacteriocin leader peptide, microcyclamide/patellamide family. This model represents a conserved N-terminal region shared by microcyclamide and patellamide bacteriocin precursors. These bacteriocin precursors are associated with heterocyclization. Related precursors are found in family TIGR04446. 34
20127 188368 TIGR03679 arCOG00187 arCOG00187 universal archaeal metal-binding-domain/4Fe-4S-binding-domain containing ABC transporter, ATP-binding protein. This protein consists of an N-terminal possible metal-binding domain (pfam04068) followed by a 4Fe-4S cluster binding domain (pfam00037) followed by a C-terminal ABC transporter, ATP-binding domain (pfam00005). This combination of N-terminal domains is observed in the RNase L inhibitor, RLI. This model has the same scope as an archaeal COG (arCOG00187) and is found in all completely sequenced archaea and does not recognize any known non-archaeal genes. 218
20128 274720 TIGR03680 eif2g_arch translation initiation factor 2 subunit gamma. This model represents the archaeal translation initiation factor 2 subunit gamma and is found in all known archaea. eIF-2 functions in the early steps of protein synthesis by forming a ternary complex with GTP and initiator tRNA. 406
20129 274721 TIGR03682 arCOG04112 diphthamide biosynthesis enzyme Dph2. Members of this family are the archaeal protein Dph2, members of the universal archaeal protein family designated arCOG04112. The chemical function of this protein is analogous to the radical SAM family (pfam04055), although the sequence is not homologous. The chemistry involves [4Fe-4S]-aided formation of a 3-amino-3-carboxypropyl radical rather than the canonical 5'-deoxyadenosyl radical of the radical SAM family. 308
20130 274722 TIGR03683 A-tRNA_syn_arch alanyl-tRNA synthetase. This family of alanyl-tRNA synthetases is limited to the archaea, and is a subset of those sequences identified by the model pfam07973 covering the second additional domain (SAD) of alanyl and threonyl tRNA synthetases. 902
20131 274723 TIGR03684 arCOG00985 arCOG04150 universal archaeal PUA-domain protein. This universal archaeal protein contains a domain possibly associated with RNA binding (pfam01472, TIGR00451). 150
20132 274724 TIGR03685 ribo_P1_arch 50S ribosomal protein P1. This model represents P1, the L12P protein of the large (50S) subunit of the archaeal ribosome. 105
20133 274725 TIGR03686 pupylate_PafA Pup--protein ligase. Members of this family are the Pup--protein ligase PafA (proteasome accessory factor A), a protein shown to regulate steady-state levels of certain proteasome targets in Mycobacterium tuberculosis. Iyer, et al (2008) first suggested that PafA is the ligase for Pup, a ubiquitin analog attached to an epsilon-amino group of a Lys side-chain to direct the target to the proteasome. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 453
20134 200311 TIGR03687 pupylate_cterm ubiquitin-like protein Pup. Members of this protein family are Pup, a small protein whose ligation to target proteins steers them toward degradation. This protein family occurs in a number of bacteria, especially Actinobacteria such as Mycobacterium tuberculosis, that possess an archaeal-type proteasome. All members of this protein family known during model construction end with the C-terminal motif [FY][VI]QKGG[QE]. Ligation is thought to occur between the C-terminal COOH of Pup and an epsilon-amino group of a Lys on the target protein. The N-terminal half of this protein is poorly conserved and not represented in the seed alignment. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 33
20135 274726 TIGR03688 pupylate_PafA2 proteasome accessory factor PafA2. This protein family is paralogous to (and distinct from) the PafA (proteasome accessory factor) first described in Mycobacterium tuberculosis (see TIGR03686). Members of both this family and TIGR03686 itself tend to cluster with each other, with the ubiquitin analog Pup (TIGR03687) associated with targeting to the proteasome, and with proteasome subunits themselves. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 485
20136 200312 TIGR03689 pup_AAA proteasome ATPase. In the Actinobacteria, as shown for Mycobacterium tuberculosis, some proteins are modified by ligation between an epsilon-amino group of a lysine side chain and the C-terminal carboxylate of the ubiquitin-like protein Pup. This modification leads to protein degradation by the archaeal-like proteasome found in the Actinobacteria. Members of this protein family belong to the AAA family of ATPases and tend to be clustered with the genes for Pup, the Pup ligase PafA, and structural components of the proteasome. This protein forms hexameric rings with ATPase activity. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 512
20137 163402 TIGR03690 20S_bact_beta proteasome, beta subunit, bacterial type. Members of this family are the beta subunit of the 20S proteasome as found in Actinobacteria such as Mycobacterium, Rhodococcus, and Streptomyces. In Streptomyces, maturation during proteasome assembly was shown to remove a 53-amino acid propeptide. Most of the length of the propeptide is not included in this model. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 219
20138 163403 TIGR03691 20S_bact_alpha proteasome, alpha subunit, bacterial type. Members of this family are the alpha subunit of the 20S proteasome as found in Actinobacteria such as Mycobacterium, Rhodococcus, and Streptomyces. In most Actinobacteria (an exception is Propionibacterium acnes), the proteasome is accompanied by a system of tagging proteins for degradation with Pup. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 228
20139 274727 TIGR03692 ATP_dep_HslV ATP-dependent protease HslVU, peptidase subunit. The ATP-dependent protease HslVU is a complex of hexameric HslU, active as a protein-unfolding ATPase, and dodecameric HslV, the catalytic threonine protease. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 171
20140 163405 TIGR03693 ocin_ThiF_like putative thiazole-containing bacteriocin maturation protein. Members of this protein family are found in a three-gene operon in Bacillus anthracis and related Bacillus species, where the other two genes are clearly identified with maturation of a putative thiazole-containing bacteriocin precursor. While there is no detectable pairwise sequence similarity between members of this family and the proposed cyclodehydratases such as SagC of Streptococcus pyogenes (see family TIGR03603), both families show similarity through PSI-BLAST to ThiF, a protein involved in biosynthesis of the thiazole moiety for thiamine biosynthesis. This family, therefore, may contribute to cyclodehydratase function in heterocycle-containing bacteriocin biosyntheses. In Bacillus licheniformis ATCC 14580, the bacteriocin precursor gene is adjacent to the gene for this protein. [Cellular processes, Toxin production and resistance] 637
20141 274728 TIGR03694 exosort_acyl N-acyl amino acid synthase, PEP-CTERM/exosortase system-associated. Members of this protein family are restricted to bacterial species with the PEP-CTERM/exosortase system predicted to act in exopolysaccharide-associated protein targeting. PSI-BLAST and CDD reveal relationships to the acyltransferase family that includes N-acyl-L-homoserine lactone synthetase, and recent work shows long-chain N-acyl amino acid biosynthesis activity. Several members of this family may be found in a single genome. These acyltransferases may produce a quorum signalling molecule or may contribute to chemical modifications in exopolysaccharide and biofilm structural material production. 241
20142 274729 TIGR03695 menH_SHCHC 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase. This protein catalyzes the formation of SHCHC, or (1 R,6 R)-2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate, by elimination of pyruvate from 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate (SEPHCHC). Note that SHCHC synthase activity previously was attributed to MenD, which in fact is SEPHCHC synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 252
20143 274730 TIGR03696 Rhs_assc_core RHS repeat-associated core domain. This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain. 77
20144 163409 TIGR03697 NtcA_cyano global nitrogen regulator NtcA, cyanobacterial. Members of this protein family, found in the cyanobacteria, are the global nitrogen regulator NtcA. This DNA-binding transcriptional regulator is required for expressing many different ammonia-repressible genes. The consensus NtcA-binding site is G T A N(8)T A C. [Regulatory functions, DNA interactions] 193
20145 163410 TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 107
20146 274731 TIGR03699 menaquin_MqnC dehypoxanthine futalosine cyclase. Members of this protein family are involved in menaquinone biosynthesis by an alternate pathway via futalosine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 340
20147 213851 TIGR03700 mena_SCO4494 putative menaquinone biosynthesis radical SAM enzyme, SCO4494 family. Members of this protein family appear to be involved in menaquinone biosynthesis by an alternate pathway via futalosine, based on close phylogenetic correlation with known markers of the futalosine pathway, gene clustering in many organisms, and paralogy with the SCO4550 protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 351
20148 163413 TIGR03701 mena_SCO4490 menaquinone biosynthesis decarboxylase, SCO4490 family. Members of this protein family are putative decarboxylases involved in a late stage of the alternative pathway for menaquinone, via futalosine, as in Streptomyces coelicolor and Helicobacter pylori. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 433
20149 163414 TIGR03702 lip_kinase_YegS lipid kinase YegS. Members of this protein family are designated YegS, an apparent lipid kinase family in the Proteobacteria. Bakali, et al. report phosphatidylglycerol kinase activity for the member from Escherichia coli, but refrain from calling that activity synonymous with its biological role. Note that a broader, subfamily-type model (TIGR00147), includes this family but also multiple paralogs in some species and varied functions. [Unknown function, Enzymes of unknown specificity] 293
20150 274732 TIGR03703 plsB glycerol-3-phosphate O-acyltransferase. Members of this protein family are PlsB, glycerol-3-phosphate O-acyltransferase, present in E. coli and numerous related species. In many bacteria, PlsB is not found, and appears to be replaced by a two enzyme system for 1-acyl-glycerol-3-phosphate biosynthesis, the PlsX/Y system. [Fatty acid and phospholipid metabolism, Biosynthesis] 799
20151 274733 TIGR03704 PrmC_rel_meth putative protein-(glutamine-N5) methyltransferase, unknown substrate-specific. This protein family is closely related to two different families of protein-(glutamine-N5) methyltransferase. The first is PrmB, which modifies ribosomal protein L3 in some bacteria. The second is PrmC (HemK), which modifies peptide chain release factors 1 and 2 in most bacteria and also in eukaryotes. The glutamine side chain-binding motif NPPY shared by PrmB and PrmC is N[VAT]PY in this family. The protein substrate is unknown. [Protein synthesis, Ribosomal proteins: synthesis and modification] 251
20152 274734 TIGR03705 poly_P_kin polyphosphate kinase 1. Members of this protein family are the enzyme polyphosphate kinase 1 (PPK1). This family is found in many prokaryotes and also in Dictyostelium. Sequences in the seed alignment were taken from prokaryotic consecutive two-gene pairs in which the other gene encodes an exopolyphosphatase. It synthesizes polyphosphate from the terminal phosphate of ATP but not GTP, in contrast to PPK2. [Central intermediary metabolism, Phosphorus compounds] 672
20153 274735 TIGR03706 exo_poly_only exopolyphosphatase. It appears that a single enzyme may act as both exopolyphosphatase (Ppx) and guanosine pentaphosphate phosphohydrolase (GppA) in a number of species. Members of the seed alignment used to define this exception-level model are encoded adjacent to a polyphosphate kinase 1 gene, and the trusted cutoff is set high enough (425) that no genome has a second hit. Therefore all members may be presumed to at least share exopolyphosphatase activity, and may lack GppA activity. GppA acts in the stringent response. [Central intermediary metabolism, Phosphorus compounds] 300
20154 213852 TIGR03707 PPK2_P_aer polyphosphate kinase 2, PA0141 family. Members of this protein family are designated polyphosphate kinase 2 (PPK2) after the characterized protein in Pseudomonas aeruginosa. This family comprises one of three well-separated clades in the larger family described by pfam03976. PA0141 from this family has been shown capable of operating in reverse, with GDP preferred (over ADP) as a substrate, producing GTP (or ATP) by transfer of a phosphate residue from polyphosphate. Most species with a member of this family also encode a polyphosphate kinase 1 (PPK1). [Central intermediary metabolism, Phosphorus compounds] 230
20155 274736 TIGR03708 poly_P_AMP_trns polyphosphate:AMP phosphotransferase. Members of this protein family contain a domain duplication. The characterized member from Acinetobacter johnsonii is polyphosphate:AMP phosphotransferase (PAP), which can transfer the terminal phosphate from poly(P) to AMP, yielding ADP. In the opposite direction, this enzyme can synthesize poly(P). Each domain of this protein family is homologous to polyphosphate kinase, an enzyme that can run in the forward direction to extend a polyphosphate chain with a new terminal phosphate from ATP, or in reverse to make ATP (or GTP) from ADP (or GDP). [Central intermediary metabolism, Phosphorus compounds] 493
20156 274737 TIGR03709 PPK2_rel_1 polyphosphate:nucleotide phosphotransferase, PPK2 family. Members of this protein family belong to the polyphosphate kinase 2 (PPK2) family, which is not related in sequence to PPK1. While PPK1 tends to act in the biosynthesis of polyphosphate, or poly(P), members of the PPK2 family tend to use the terminal phosphate of poly(P) to regenerate ATP or GTP from the corresponding nucleoside diphosphate, or ADP from AMP as is the case with polyphosphate:AMP phosphotransferase (PAP). Members of this protein family most likely transfer the terminal phosphate between poly(P) and some nucleotide, but it is not clear which. [Central intermediary metabolism, Phosphorus compounds] 264
20157 274738 TIGR03710 OAFO_sf 2-oxoacid:acceptor oxidoreductase, alpha subunit. This family of proteins contains a C-terminal thiamine diphosphate (TPP) binding domain typical of flavodoxin/ferredoxin oxidoreductases (pfam01855) as well as an N-terminal domain similar to the gamma subunit of the same group of oxidoreductases (pfam01558). The genes represented by this model are always found in association with a neighboring gene for a beta subunit (TIGR02177) which also occurs in a 4-subunit (alpha/beta/gamma/ferredoxin) version of the system. This alpha/gamma plus beta structure was used to define the set of sequences to include in this model. This pair of genes is not consistently observed in proximity to any electron acceptor genes, but is found next to putative ferredoxins or ferredoxin-domain proteins in Azoarcus sp. EbN1, Bradyrhizobium japonicum USDA 110, Frankia sp. CcI3, Rhodoferax ferrireducens DSM 15236, Rhodopseudomonas palustris BisB5, Os, Sphingomonas wittichii RW1 and Streptomyces clavuligerus. Other potential acceptors are also sporadically observed in close proximity including ferritin-like proteins, rubrerythrin, peroxiredoxin and a variety of other flavin and iron-sulfur cluster-containing proteins. The phylogenetic distribution of this family encompasses archaea, a number of deeply-branching bacterial clades and only a small number of firmicutes and proteobacteria. The enzyme from Sulfolobus has been characterized with respect to its substrate specificity, which is described as wide, encompassing various 2-oxoacids such as 2-oxoglutarate, 2-oxobutyrate and pyruvate. The enzyme from Hydrogenobacter thermophilus has been shown to have a high specificity towards 2-oxoglutarate and is one of the key enzymes in the reverse TCA cycle in this organism. Furthermore, considering its binding of coenzyme A, it can be reasonably inferred that the product of the reaction is succinyl-CoA.
The genes for this enzyme in Prevotella intermedia 17, Persephonella marina EX-H1 and Picrophilus torridus DSM 9790 are in close proximity to a variety of TCA cycle genes. Persephonella marina and P. torridus are believed to encode complete TCA cycles, and none of these contains the lipoate-based 2-oxoglutarate dehydrogenase (E1/E2/E3) system. That system is presumed to be replaced by this one. In fact, the lipoate system is absent in most organisms possessing a member of this family, providing additional circumstantial evidence that many of these enzymes are capable of acting as 2-oxoglutarate dehydrogenases and 562
20158 163423 TIGR03711 acc_sec_asp3 accessory Sec system protein Asp3. This protein is designated Asp3 because, along with SecY2, SecA2, and other proteins it is part of the accessory Sec system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 135
20159 274739 TIGR03712 acc_sec_asp2 accessory Sec system protein Asp2. This protein is designated Asp2 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 511
20160 274740 TIGR03713 acc_sec_asp1 accessory Sec system protein Asp1. This protein is designated Asp1 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 519
20161 163426 TIGR03714 secA2 accessory Sec system translocase SecA2. Members of this protein family are homologous to SecA and part of the accessory Sec system. This system, including both five core proteins for export and a variable number of proteins for glycosylation, operates in certain Gram-positive pathogens for the maturation and delivery of serine-rich glycoproteins such as the cell surface glycoprotein GspB in Streptococcus gordonii. [Protein fate, Protein and peptide secretion and trafficking] 762
20162 274741 TIGR03715 KxYKxGKxW KxYKxGKxW signal peptide. This model describes a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK and PEP-CTERM forms of signal peptide. This domain tends to occur on long, low-complexity (usually Serine-rich and heavily glycosylated) proteins of the Firmicutes, and (as with YSIRK) the majority of these proteins have the LPXTG cell wall-anchoring motif at the C-terminus. 23
20163 274742 TIGR03716 R_switched_YkoY integral membrane protein, YkoY family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family often are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains proteins YceF and YkoY from Bacillus subtilis. A transport function is proposed. 215
20164 163429 TIGR03717 R_switched_YjbE integral membrane protein, YjbE family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family commonly are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains protein YjbE from Bacillus subtilis. A transport function is proposed. 176
20165 274743 TIGR03718 R_switched_Alx integral membrane protein, TerC family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family often are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains TerC itself from Alcaligenes sp. plasmid IncHI2 pMER610 and from Proteus mirabilis. It also contains the alkaline-inducible E. coli protein Alx, which unlike the two TerC examples is preceded by a yybP-ykoY leader. 302
20166 274744 TIGR03719 ABC_ABC_ChvD ATP-binding cassette protein, ChvD family. Members of this protein family have two copies of the ABC transporter ATP-binding cassette, but are found outside the common ABC transporter operon structure that features integral membrane permease proteins and substrate-binding proteins encoded next to the ATP-binding cassette (ABC domain) protein. The member protein ChvD from Agrobacterium tumefaciens was identified as both a candidate to interact with VirB8, based on yeast two-hybrid analysis, and as an apparent regulator of VirG. The general function of this protein family is unknown. 552
20167 274745 TIGR03720 exospor_lead exosporium leader peptide. This domain is found as a leader peptide in at least two proteins targeted to the exosporium, a structure that occurs as the outermost layer of Bacillus anthracis, B. cereus, and B. thuringiensis spores. The exosporium consists of a basal layer and a nap of hair-like filaments. BclA, the major protein of the nap filaments, is targeted there by this leader peptide. [Cellular processes, Sporulation and germination] 19
20168 274746 TIGR03721 exospore_TM BclB C-terminal domain. This domain occurs as the C-terminal region in a number of proteins that have extensive collagen-like triple helix repeat regions. Member domains are predicted by TmHMM to have four or five transmembrane helices. Members are found mostly in the Firmicutes, but also in Acanthamoeba polyphaga mimivirus. Members include spore surface glycoprotein BclB from Bacillus anthracis, a protein of the exosporium. The exosporium is an additional outermost spore layer, lacking in B. subtilis and most other spore formers, consisting of a basal layer and, above it, a nap of fine filaments. 165
20169 274747 TIGR03722 arch_KAE1 universal archaeal protein Kae1. This family represents the archaeal protein Kae1. Its partner Bud32 is fused with it in about half of the known archaeal genomes. The pair, which appears universal in the archaea, corresponds to EKC/KEOPS complex in eukaryotes. A recent characterization of the member from Pyrococcus abyssi, as an iron-binding, atypical DNA-binding protein with an apurinic lyase activity, challenges the common annotation of close homologs as O-sialoglycoprotein endopeptidase. The latter annotation is based on a characterized protein from the bacterium Pasteurella haemolytica. [DNA metabolism, DNA replication, recombination, and repair] 322
20170 274748 TIGR03723 T6A_TsaD_YgjD tRNA threonylcarbamoyl adenosine modification protein TsaD. This model represents bacterial members of a protein family that is widely distributed. In a few pathogenic species, the protein is exported in a way that may represent an exceptional secondary function. This model plus companion (archaeal) model TIGR03722 together span the prokaryotic member sequences of TIGR00329, a protein family that appears universal in life, and whose broad function is unknown. A member of TIGR03722 has been characterized as a DNA-binding protein with apurinic endopeptidase activity. In contrast, the rare characterized members of the present family show O-sialoglycoprotein endopeptidase (EC. 3.4.24.57) activity after export. These include glycoprotease (gcp) from Pasteurella haemolytica A1 and a cohemolysin from Riemerella anatipestifer (GB|AAG39646.1). The member from Staphylococcus aureus is essential and is related to cell wall dynamics and the modulation of autolysis, but members are also found in the Mycoplasmas (which lack a cell wall). A reasonable hypothesis is that virulence-related activities after export are secondary to a bacterial domain-wide unknown function. [Protein synthesis, tRNA and rRNA base modification] 313
20171 274749 TIGR03724 arch_bud32 Kae1-associated kinase Bud32. Members of this protein family are the Bud32 protein associated with Kae1 (kinase-associated endopeptidase 1) in the Archaea. In many Archaeal genomes, Kae1 and Bud32 are fused. The complex is homologous to the Kae1 and Bud32 subunits of the eukaryotic KEOPS complex, an apparently ancient protein kinase-containing molecular machine. [Unknown function, General] 199
20172 274750 TIGR03725 T6A_YeaZ tRNA threonylcarbamoyl adenosine modification protein YeaZ. This family describes a protein family, YeaZ, now associated with the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family may occur as fusions with ygjD (previously gcp) or the ribosomal protein N-acetyltransferase rimI, and are frequently encoded next to rimI. [Protein synthesis, tRNA and rRNA base modification] 204
20173 274751 TIGR03726 strep_RK_lipo putative cross-wall-targeting lipoprotein signal. The YSIRK signal domain targets proteins to the cross-wall, or septum, of dividing Gram-positive bacteria. Lipoprotein signal motifs direct a characteristic N-terminal cleavage and lipid modification for membrane anchoring. This Streptococcal-only signal peptide variant appears to be a hybrid between the two, likely directing protein targeting of nascent surface lipoproteins to the cross-wall. Nearly all members of this family have the characteristic LPXTG cell wall anchor signal at the C-terminus. 34
20174 274752 TIGR03727 urea_t_UrtC_arc urea ABC transporter, permease protein UrtC, archaeal type. Members of this protein family are ABC transporter permease subunits restricted to the Archaea. Several lines of evidence suggest this protein is functionally analogous, as well as homologous, to the UrtC subunit of the Corynebacterium glutamicum urea transporter. All members of the operon show sequence similarity to urea transport subunits, the gene is located near the urease structural subunits in two of three species, and partial phylogenetic profiling identifies this permease subunit as closely matching the profile of urea utilization. 369
20175 163440 TIGR03728 glyco_access_1 glycosyltransferase, SP_1767 family. Members of this protein family are putative glycosyltransferases. Some members are found close to genes for the accessory secretory (SecA2) system, and are suggested by Partial Phylogenetic Profiling to correlate with SecA2 systems. Glycosylation, therefore, may occur in the cytosol prior to secretion. 265
20176 163441 TIGR03729 acc_ester putative phosphoesterase. Members of this protein family belong to the larger family pfam00149 (calcineurin-like phosphoesterase), a family largely defined by small motifs of metal-chelating residues. The subfamily in this model shows a good but imperfect co-occurrence in species with domain TIGR03715 that defines a novel class of signal peptide typical of the accessory secretory system. 239
20177 163442 TIGR03730 tungstate_WtpA tungstate ABC transporter binding protein WtpA. Members of this protein family are tungstate (and, more weakly, molybdate) binding proteins of tungstate(/molybdate) ABC transporters, as first characterized in Pyrococcus furiosus. Model seed members and cutoffs, pending experimental evidence for more distant homologs, were chosen such that this model identifies select archaeal proteins, excluding weaker archaeal and all bacterial homologs. Note that this family is homologous to molybdate transporters, and that at least one other family of tungstate transporter binding protein, TupA, also exists. 273
20178 274753 TIGR03731 lantibio_gallid lantibiotic, gallidermin/nisin family. Members of this family are lantibiotic precursors in the family that includes gallidermin, nisin, mutacin, epidermin, and streptin. [Cellular processes, Toxin production and resistance] 48
20179 274754 TIGR03732 lanti_perm_MutE lantibiotic protection ABC transporter permease subunit, MutE/EpiE family. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. In most species, this subunit is paralogous to an adjacent gene, modeled separately. 241
20180 163445 TIGR03733 lanti_perm_MutG lantibiotic protection ABC transporter permease subunit, MutG family. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. In most species, this subunit is paralogous to an adjacent gene modeled separately by TIGR03732, while in some species only one subunit is found. 248
20181 274755 TIGR03734 PRTRC_parB PRTRC system ParB family protein. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is the member related to ParB, and is designated PRTRC system ParB family protein. 554
20182 163447 TIGR03735 PRTRC_A PRTRC system protein A. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated protein A. 192
20183 163448 TIGR03736 PRTRC_ThiF PRTRC system ThiF family protein. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This family is the PRTRC system ThiF family protein. 244
20184 274756 TIGR03737 PRTRC_B PRTRC system protein B. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein B. 228
20185 163450 TIGR03738 PRTRC_C PRTRC system protein C. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein C. 66
20186 274757 TIGR03739 PRTRC_D PRTRC system protein D. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein D. The gray zone, between trusted and noise, includes proteins found in the same genomes as other proteins of the PRTRC systems, but not in the same contiguous gene region. 320
20187 163452 TIGR03740 galliderm_ABC gallidermin-class lantibiotic protection ABC transporter, ATP-binding subunit. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family describes the ATP-binding subunit of a gallidermin/epidermin class lantibiotic protection transporter. It is largely restricted to gallidermin-family lantibiotic biosynthesis and export cassettes, but also occurs in orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. 223
20188 274758 TIGR03741 PRTRC_E PRTRC system protein E. A novel genetic system characterized by six or seven major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family averages about 150 amino acids in length, but the last third contains low-complexity sequence that complicates sequence comparisons. This model does not include the low-complexity region. 104
20189 274759 TIGR03742 PRTRC_F PRTRC system protein F. A novel genetic system characterized by seven (usually) major proteins, including a ParB homolog and a ThiF homolog, is commonly found on plasmids or in bacterial chromosomal regions near phage, plasmid, or transposon markers. It is most common among the beta Proteobacteria. We designate the system PRTRC, or ParB-Related,ThiF-Related Cassette. This protein family is designated protein F. It is the most divergent of the families. 342
20190 274760 TIGR03743 SXT_TraD conjugative coupling factor TraD, SXT/TOL subfamily. Members of this protein family are the putative conjugative coupling factor, TraD (or TraG), rather distantly related to the well-characterized TraD of the F plasmid. Members are associated with conjugative-transposon-like mobile genetic elements of the class that includes SXT, an antibiotic resistance transfer element in some Vibrio cholerae strains. [Mobile and extrachromosomal element functions, Other] 634
20191 274761 TIGR03744 traC_PFL_4706 conjugative transfer ATPase, PFL_4706 family. Members of this protein family are predicted ATP-binding proteins apparently associated with DNA conjugal transfer. Members are found both in plasmids and in bacterial chromosomal regions that appear to derive from integrative elements such as conjugative transposons. More distant homologs, outside the scope of this family, include type IV secretion/conjugal transfer proteins such as TraC, VirB4 and TrsE. The granularity of this protein family definition is chosen so as to represent one distinctive clade and act as a marker through which to define and recognize the class of mobile element it serves. [Mobile and extrachromosomal element functions, Plasmid functions] 893
20192 274762 TIGR03745 conj_TIGR03745 integrating conjugative element membrane protein, PFL_4702 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 105
20193 163458 TIGR03746 conj_TIGR03746 integrating conjugative element protein, PFL_4703 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 202
20194 163459 TIGR03747 conj_TIGR03747 integrating conjugative element membrane protein, PFL_4697 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 233
20195 163460 TIGR03748 conj_PilL conjugative transfer region protein, TIGR03748 family. This model describes the conserved N-terminal region of a variable length protein family associated with laterally transferred regions flanked by markers of conjugative plasmid integration and/or transposition. Most members of the family have the lipoprotein signal peptide motif. A member of the family from a pathogenicity island in Salmonella enterica serovar Dublin strain was designated PilL for nomenclature consistency with a neighboring gene for the pilin structural protein PilS. However, the species distribution of this protein family tracks much better with markers of conjugal transfer than with markers of PilS-like pilin structure. [Mobile and extrachromosomal element functions, Plasmid functions] 105
20196 163461 TIGR03749 conj_TIGR03749 integrating conjugative element protein, PFL_4704 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 257
20197 274763 TIGR03750 conj_TIGR03750 conjugative transfer region protein, TIGR03750 family. Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 111
20198 274764 TIGR03751 conj_TIGR03751 conjugative transfer region lipoprotein, TIGR03751 family. Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 124
20199 274765 TIGR03752 conj_TIGR03752 integrating conjugative element protein, PFL_4705 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida toluene catabolic TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 472
20200 274766 TIGR03753 blh_monoox beta-carotene 15,15'-monooxygenase, Brp/Blh family. This integral membrane protein family includes Brp (bacterio-opsin related protein) and Blh (Brp-like protein). Bacteriorhodopsin is a light-driven proton pump with a covalently bound retinal cofactor that appears to be derived from beta-carotene. Blh has been shown to cleave beta-carotene to produce two all-trans retinal molecules. Mammalian enzymes with similar enzymatic function are not multiple membrane spanning proteins and are not homologous. 259
20201 274767 TIGR03754 conj_TOL_TraD conjugative coupling factor TraD, TOL family. Members of this protein family are assigned by homology to the TraD family of conjugative coupling factor. This particular clade serves as a marker for an extended gene region that occurs occasionally on plasmids, including the toluene catabolism TOL plasmid. More commonly, the gene region is chromosomal, flanked by various markers of conjugative transfer and insertion. 643
20202 274768 TIGR03755 conj_TIGR03755 integrating conjugative element protein, PFL_4711 family. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 418
20203 274769 TIGR03756 conj_TIGR03756 integrating conjugative element protein, PFL_4710 family. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 297
20204 163469 TIGR03757 conj_TIGR03757 integrating conjugative element protein, PFL_4709 family. Members of this protein family belong to extended genomic regions that appear to be spread by conjugative transfer. [Mobile and extrachromosomal element functions, Plasmid functions] 113
20205 163470 TIGR03758 conj_TIGR03758 integrating conjugative element protein, PFL_4701 family. Members of this family of small, hydrophobic proteins are found occasionally on plasmids such as the Pseudomonas putida TOL (toluene catabolic) plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 65
20206 274770 TIGR03759 conj_TIGR03759 integrating conjugative element protein, PFL_4693 family. Members of this protein family, such as model protein PFL_4693 from Pseudomonas fluorescens Pf-5, belong to extended genomic regions that appear to be spread by conjugative transfer. Most members have a predicted N-terminal signal sequence. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 200
20207 163472 TIGR03760 ICE_TraI_Pfluor integrating conjugative element relaxase, PFL_4751 family. Members of this protein family are the TraI putative relaxases required for transfer by a subclass of integrating conjugative elements (ICE) as found in Pseudomonas fluorescens Pf-5, and understood from study of two related ICE, SXT and R391. This model represents the N-terminal domain. Note that no homology is detected to the similarly named TraI relaxase of the F plasmid. 218
20208 274771 TIGR03761 ICE_PFL4669 integrating conjugative element protein, PFL_4669 family. Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens. 216
20209 274772 TIGR03762 archaeo_artC archaeosortase C, PEF-CTERM variant. Members of this family are archaeal homologs to bacterial PEP-CTERM-sorting protein exosortase (TIGR02602). Members of this family are found in species with an archaeal variant sorting motif, PEF-CTERM (TIGR03024). Members are found in the thermoacidophilic Aciduliprofundum boonei, the mesophilic psychromethanogens Methanosarcina mazei and Methanococcoides burtonii, and in Ferroglobus placidus DSM 10642. [Protein fate, Protein and peptide secretion and trafficking] 274
20210 163475 TIGR03763 cyanoexo_CrtA cyanoexosortase A. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci and/or different taxonomic lineages. We designate this relatively divergent cyanobacterial type to be type 3. We propose the gene symbol xrtC. This type coexists with a TIGR02602-recognized form in Nostoc sp. PCC 7120. [Protein fate, Protein and peptide secretion and trafficking] 260
20211 213858 TIGR03764 ICE_PFGI_1_parB integrating conjugative element, PFGI_1 class, ParB family protein. Members of this protein family carry the ParB-type nuclease domain and are found in integrating conjugative elements (ICE) in the same class as PFGI-1 of Pseudomonas fluorescens Pf-5. 258
20212 274773 TIGR03765 ICE_PFL_4695 integrating conjugative element protein, PFL_4695 family. This model describes a protein family exemplified by PFL_4695 of Pseudomonas fluorescens Pf-5. Full-length proteins in this family show some architectural variety, but this model represents a conserved domain. Most or all member proteins belong to laterally transferred chromosomal islands called integrative conjugative elements, or ICE. 105
20213 274774 TIGR03766 TIGR03766 conserved hypothetical integral membrane protein. Models TIGR03110, TIGR03111, and TIGR03112 describe a three-gene system found in several Gram-positive bacteria, where TIGR03110 (XrtG) is distantly related to a putative transpeptidase, exosortase (TIGR02602). This model describes a small clade that correlates by both gene clustering and phyletic pattern, although imperfectly, to the three gene system. Both this narrow clade, and the larger set of full-length homologous integral membrane proteins, have an especially well-conserved region near the C-terminus with an invariant tyrosine. The function is unknown. 483
20214 213859 TIGR03767 P_acnes_RR metallophosphoesterase, PPA1498 family. This model describes a small collection of probable metallophosphoesterases, related to pfam00149 but with long inserts separating some of the shared motifs such that the homology is apparent only through multiple sequence alignment. Members of this protein family, in general, have a Sec-independent TAT (twin-arginine translocation) signal sequence, N-terminal to the region represented by this model. Members include YP_056203.1 from Propionibacterium acnes KPA171202. 496
20215 163480 TIGR03768 RPA4764 metallophosphoesterase, RPA4764 family. This model describes a small collection of probable metallophosphoesterases, related to pfam00149. Members of this protein family usually have a Sec-independent TAT (twin-arginine translocation) signal sequence, N-terminal to the region represented by this model. This model and TIGR03767 divide a narrow clade of pfam00149-related enzymes. 492
20216 274775 TIGR03769 P_ac_wall_RPT actinobacterial surface-anchored protein domain. This model describes a repeat domain that occurs one to three times in Actinobacterial proteins, some of which have LPXTG-type sortase recognition motifs for covalent attachment to the Gram-positive cell wall. Where it occurs with duplication in an LPXTG-anchored protein, it tends to be adjacent to the substrate-binding protein of the gene trio of an ABC transporter system, where that substrate-binding protein has a single copy of this same domain. This arrangement suggests a substrate-binding relay system, with the LPXTG protein acting as a substrate receptor. 41
20217 163482 TIGR03770 anch_rpt_perm anchored repeat-type ABC transporter, permease subunit. This protein family is the permease subunit of binding protein-dependent ABC transporter complex that strictly co-occurs with TIGR03769. TIGRFAMs model TIGR03769 describes a protein domain that occurs singly or as one of up to three repeats in proteins of a number of Actinobacteria, including Propionibacterium acnes KPA171202. The TIGR03769 domain occurs both in the adjacent gene for the substrate-binding protein and in additional (often nearby) proteins, often with LPXTG-like sortase recognition signals. Homologous permease subunits outside the scope of this family include manganese transporter MntB in Synechocystis sp. PCC 6803 and chelated iron transporter subunits. The function of this transporter complex is unknown. [Transport and binding proteins, Unknown substrate] 270
20218 163483 TIGR03771 anch_rpt_ABC anchored repeat-type ABC transporter, ATP-binding subunit. This protein family is the ATP-binding cassette subunit of binding protein-dependent ABC transporter complex that strictly co-occurs with TIGR03769. TIGRFAMs model TIGR03769 describes a protein domain that occurs singly or as one of up to three repeats in proteins of a number of Actinobacteria, including Propionibacterium acnes KPA171202. The TIGR03769 domain occurs both in an adjacent gene for the substrate-binding protein and in additional (often nearby) proteins, often with LPXTG-like sortase recognition signals. Homologous ATP-binding subunits outside the scope of this family include manganese transporter MntA in Synechocystis sp. PCC 6803 and chelated iron transporter subunits. The function of this transporter complex is unknown. [Transport and binding proteins, Unknown substrate] 223
20219 163484 TIGR03772 anch_rpt_subst anchored repeat ABC transporter, substrate-binding protein. Members of this protein family are ABC transporter permease subunits as identified by pfam00950, but additionally contain the Actinobacterial insert domain described by TIGR03769. Some homologs (lacking the insert) have been described as transporters of manganese or of chelated iron. Members of this family typically are found along with an ATP-binding cassette protein, a permease, and an LPXTG-anchored protein with two or three copies of the TIGR03769 insert that occurs just once in this protein family. [Transport and binding proteins, Unknown substrate] 479
20220 274776 TIGR03773 anch_rpt_wall putative ABC transporter-associated repeat protein. Members of this protein family occur in genomes that contain a three-gene ABC transporter operon associated with the presence of domain TIGR03769. That domain occurs as a single-copy insert in the substrate-binding protein, and occurs in two or more copies in members of this protein family. Members of this family typically are encoded adjacent to the said transporter operon and may serve as a substrate receptor. 513
20221 163486 TIGR03774 RPE2 Rickettsial palindromic element RPE2 domain. This model describes protein translations of a second family, RPE2, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else. 35
20222 163487 TIGR03775 RPE3 Rickettsial palindromic element RPE3 domain. This model describes protein translations of a second family, RPE3, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else. 43
20223 163488 TIGR03776 RPE5 Rickettsial palindromic element RPE5 domain. This model describes protein translations of a family, RPE5, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes an additional coding region that does not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else. 43
20224 274777 TIGR03777 RPE4 Rickettsial palindromic element RPE4 domain. This model describes protein translations of a family, RPE4, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes an additional coding region that does not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else. 32
20225 274778 TIGR03778 VPDSG_CTERM VPDSG-CTERM protein sorting domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria. 24
20226 274779 TIGR03779 Bac_Flav_CT_M Bacteroides conjugative transposon TraM protein. Members of this protein family are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. [Cellular processes, DNA transformation] 410
20227 163492 TIGR03780 Bac_Flav_CT_N Bacteroides conjugative transposon TraN protein. Members of this family are the TraN protein encoded by transfer region genes of conjugative transposons of Bacteroides. The family is related to conjugative transfer proteins VirB9 and TrbG of Agrobacterium Ti plasmids. [Cellular processes, DNA transformation] 285
20228 200324 TIGR03781 Bac_Flav_CT_K Bacteroides conjugative transposon TraK protein. Members of this protein family are designated TraK and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. PSI-BLAST reveals a distant relationship to proteins TrbF and VirB8 in Proteobacterial conjugal transfer systems. [Cellular processes, DNA transformation] 202
20229 274780 TIGR03782 Bac_Flav_CT_J Bacteroides conjugative transposon TraJ protein. Members of this protein family are designated TraJ and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. This family is related to conjugation system proteins in the Proteobacteria, including TrbL of Agrobacterium Ti plasmids and VirB6. [Cellular processes, DNA transformation] 323
20230 163495 TIGR03783 Bac_Flav_CT_G Bacteroides conjugation system ATPase, TraG family. Members of this family include the predicted ATPase, TraG, encoded by transfer region genes of conjugative transposons of Bacteroides, such as CTnDOT, found on the main chromosome. Members also include TraG homologs borne on plasmids in Bacteroides. The protein family is related to the conjugative transfer system ATPase VirB4. [Cellular processes, DNA transformation] 829
20231 163496 TIGR03784 marine_sortase sortase, marine proteobacterial type. Members of this protein family are sortase enzymes, cysteine transpeptidases involved in protein sorting activities. Members of this family tend to be found in proteobacteria, rather than in Gram-positive bacteria where sortases attach proteins to the Gram-positive cell wall or participate in pilin cross-linking. Many species with this sortase appear to contain a signal target sequence, a protein with a Vault protein inter-alpha-trypsin domain (pfam08487) and a von Willebrand factor type A domain (pfam00092), encoded by an adjacent gene. These sortases are designated subfamily 6 according to Comfort and Clubb (2004). 174
20232 163497 TIGR03785 marine_sort_HK proteobacterial dedicated sortase system histidine kinase. This histidine kinase protein is paired with an adjacent response regulator (TIGR03787) gene. It co-occurs with a variant sortase enzyme (TIGR03784), usually in the same gene neighborhood, in proteobacterial species most of which are marine, and with an LPXTG motif-containing sortase target conserved protein (TIGR03788). Sortases and LPXTG proteins are far more common in Gram-positive bacteria, where sortase systems mediate attachment to the cell wall or cross-linking of pilin structures. We give this predicted sensor histidine kinase the gene symbol psdS, for Proteobacterial Dedicated Sortase system Sensor histidine kinase. 703
20233 274781 TIGR03786 strep_pil_rpt streptococcal pilin isopeptide linkage domain. This model describes a domain that occurs once in the major pilin of Streptococcus pyogenes, Spy0128, but in higher copy numbers in other streptococcal proteins. The domain occurs nine times in a surface-anchored protein of Bifidobacterium longum. All members of this family have LPXTG-type sortase target sequences. The S. pyogenes major pilin has been shown to undergo isopeptide bond cross-linking, mediated by sortases, that are critical to maintaining pilus structural integrity. One such Lys-to-Asn isopeptide bond is to a near-invariant Asn near the C-terminal end of this domain (column 81 of the seed alignment). A Glu in the S. pyogenes major pilin (column 25 of the seed alignment), invariant as Glu or Gln, is described as catalytic for isopeptide bond formation. 63
20234 163499 TIGR03787 marine_sort_RR proteobacterial dedicated sortase system response regulator. This model describes a family of DNA-binding response regulator proteins, associated with an adjacent histidine kinase (TIGR03785) to form a two-component system. This system co-occurs with, and often is adjacent to, a proteobacterial variant form of the protein sorting transpeptidase called sortase (TIGR03784), and a single target protein for the sortase. We give this protein the gene symbol pdsR, for Proteobacterial Dedicated Sortase system Response regulator. 227
20235 274782 TIGR03788 marine_srt_targ marine proteobacterial sortase target protein. Members of this protein family are restricted to the Proteobacteria. Each contains a C-terminal sortase-recognition motif, transmembrane domain, and basic residues cluster at the C-terminus, and is encoded adjacent to a sortase gene. This protein is frequently the only sortase target in its genome, which is as unusual as its occurrence in Gram-negative rather than Gram-positive genomes. Many bacteria with this system are marine. In addition to the LPXTG signal, members carry a vault protein inter-alpha-trypsin inhibitor domain (pfam08487) and a von Willebrand factor type A domain (pfam00092). 596
20236 274783 TIGR03789 pdsO proteobacterial sortase system peptidoglycan-associated protein. A newly defined histidine kinase (TIGR03785) and response regulator (TIGR03787) gene pair occurs exclusively in Proteobacteria, mostly of marine origin, nearly all of which contain a subfamily 6 sortase (TIGR03784) and its single dedicated target protein (TIGR03788) adjacent to the sortase. This protein family shows up only in those species with the histidine kinase/response regulator gene pair, and often adjacent to that pair. It belongs to the pfam00691 domain family, which is the peptidoglycan-associated region of flagellar motor protein MotB, OmpA (whose N-terminal region forms an outer membrane beta barrel), and peptidoglycan-associated lipoprotein Pal. Its function is unknown. We assign the gene symbol pdsO, for Proteobacterial Dedicated Sortase system OmpA family protein. [Protein fate, Protein and peptide secretion and trafficking] 243
20237 274784 TIGR03790 TIGR03790 TIGR03790 family protein. Despite a broad and sporadic distribution (Cyanobacteria, Verrucomicrobia, Acidobacteria, beta and delta Proteobacteria, and Planctomycetes), this uncharacterized protein family occurs only among the roughly 8 percent of prokaryotic species that carry homologs of the integral membrane protein exosortase (see TIGR02602), a proposed protein-sorting system transpeptidase. 322
20238 163503 TIGR03791 TTQ_mauG tryptophan tryptophylquinone biosynthesis enzyme MauG. Members of this protein family are the tryptophan tryptophylquinone biosynthesis (TTQ) enzyme MauG, as found in Methylobacterium extorquens and related species. This protein is required to complete the maturation of the TTQ cofactor in the methylamine dehydrogenase light (beta) chain. 291
20239 274785 TIGR03792 TIGR03792 uncharacterized cyanobacterial protein, TIGR03792 family. Members of this family are found, no more than one to a genome, exclusively in (but not universal to) the Cyanobacteria. These proteins are small, 100-150 amino acids. The function is unknown. [Unknown function, General] 90
20240 274786 TIGR03793 TOMM_pelo NHLP leader peptide domain. This model represents a domain that is conserved among a large number of putative ribosomal natural product (RNP) precursors, including the thiazole/oxazole-modified microcins (TOMMs). As a leader peptide domain, likely to be removed from the mature product, this domain is unusual in several ways. First, it is longer than most previously described RNP leader peptides. Second, most of the domain is homologous to nitrile hydratase alpha subunits. Finally, it appears that this domain correlates with a specific family of cleavage/export proteins while members undergo modifications by different classes of peptide maturase, including cyclodehydratases, lantibiotic synthases, and radical SAM peptide maturases. This family is expanded especially in Pelotomaculum thermopropionicum SI. [Cellular processes, Biosynthesis of natural products] 77
20241 274787 TIGR03794 NHLM_micro_HlyD NHLM bacteriocin system secretion protein. Members of this protein family are homologs of the HlyD membrane fusion protein of type I secretion systems. Their occurrence in prokaryotic genomes is associated with the occurrence of a novel class of microcin (small bacteriocins) with a leader peptide region related to nitrile hydratase. We designate the class of bacteriocin as Nitrile Hydratase Leader Microcin, or NHLM. This family, therefore, is designated as NHLM bacteriocin system secretion protein. Some but not all NHLM-class putative microcins belong to the TOMM (thiazole/oxazole modified microcin) class as assessed by the presence of the scaffolding protein and/or cyclodehydratase in the same gene clusters. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products] 421
20242 163507 TIGR03795 RNP_Burkhold ribosomal natural product, two-chain TOMM family. Members of this protein family are found sparsely, mostly in members of the genus Burkholderia. Members often occur as tandem homologous genes, such as BMA_0021 and BMA_0022 in Burkholderia mallei ATCC 23344, or else have a duplication. The genes regularly are encoded near a cyclodehydrogenase/docking protein fusion protein of TOMM (thiazole/oxazole-modified microcins) biosynthetic clusters, suggesting a role in bacteriocin biosynthesis. The role of the putative natural product is unknown, but function as a two-chain bacteriocin is suggested. [Cellular processes, Biosynthesis of natural products] 114
20243 274788 TIGR03796 NHLM_micro_ABC1 NHLM bacteriocin system ABC transporter, peptidase/ATP-binding protein. This protein describes a multidomain ABC transporter subunit that is one of three protein families associated with some regularity with a distinctive family of putative bacteriocins. It includes a bacteriocin-processing peptidase domain at the N-terminus. Model TIGR03793 describes a conserved propeptide region for this bacteriocin family, unusual because it shows obvious homology to a region of the enzyme nitrile hydratase up to the classic Gly-Gly cleavage motif. This family is therefore predicted to be a subunit of a bacteriocin processing and export system characteristic to this system that we designate NHLM, Nitrile Hydratase Leader Microcin. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products] 710
20244 274789 TIGR03797 NHLM_micro_ABC2 NHLM bacteriocin system ABC transporter, ATP-binding protein. Members of this protein family are ABC transporter ATP-binding subunits, part of a three-gene putative bacteriocin transport operon. The other subunits include another ATP-binding subunit (TIGR03796), which has an N-terminal leader sequence cleavage domain, and an HlyD homolog (TIGR03794). In a number of genomes, members of protein families related to nitrile hydratase alpha subunit or to nif11 have undergone paralogous family expansions, with members possessing a putative bacteriocin cleavage region ending with a classic Gly-Gly motif. Those sets of putative bacteriocins, members of this protein family and its partners TIGR03794 and TIGR03796, and cyclodehydratase/docking scaffold fusion proteins of thiazole/oxazole biosynthesis frequently show correlated species distribution and co-clustering within many of those genomes. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products] 686
20245 274790 TIGR03798 ocin_TIGR03798 nif11-like leader peptide domain. This model describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus. [Cellular processes, Biosynthesis of natural products] 64
20246 274791 TIGR03799 NOD_PanD_pyr putative pyridoxal-dependent aspartate 1-decarboxylase. This enzyme is proposed here to be a form of aspartate 1-decarboxylase, pyridoxal-dependent, that represents a non-orthologous displacement to the more widely distributed pyruvoyl-dependent form (TIGR00223). Aspartate 1-decarboxylase makes beta-alanine, used usually in pantothenate biosynthesis, by decarboxylation from aspartate. A number of species with the PanB and PanC enzymes, however, lack PanD. This protein family occurs in a number of Proteobacteria that lack PanD. This enzyme family appears to be a pyridoxal-dependent enzyme (see pfam00282). The family was identified by Partial Phylogenetic Profiling; members in Geobacter sulfurreducens, G. metallireducens, and Pseudoalteromonas atlantica are clustered with the genes for PanB and PanC. We suggest the gene symbol panP (pantothenate biosynthesis enzyme, Pyridoxal-dependent). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A] 522
20247 274792 TIGR03800 PLP_synth_Pdx2 pyridoxal 5'-phosphate synthase, glutaminase subunit Pdx2. Pyridoxal 5'-phosphate (PLP) is synthesized by the PdxA/PdxJ pathway in some species (mostly within the gamma subdivision of the proteobacteria) and by the Pdx1/Pdx2 pathway in most other organisms. This family describes Pdx2, the glutaminase subunit of the PLP synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 184
20248 163513 TIGR03801 asp_4_decarbox aspartate 4-decarboxylase. This enzyme, aspartate 4-decarboxylase (EC 4.1.1.12), removes the side-chain carboxylate from L-aspartate, converting it to L-alanine plus carbon dioxide. It is a PLP-dependent enzyme, homologous to aspartate aminotransferase (EC 2.6.1.1). [Energy metabolism, Amino acids and amines] 521
20249 274793 TIGR03802 Asp_Ala_antiprt aspartate-alanine antiporter. All members of the seed alignment for this model are aspartate-alanine anti-transporters (AspT) encoded next to the gene for aspartate 4-decarboxylase (AspD), which converts aspartate to alanine, releasing CO2. The exchange of Asp for Ala is electrogenic, so the AspD/AspT system confers a proton-motive force. This transporter contains two copies of the AspT/YidE/YbjL antiporter duplication domain (TIGR01625). 562
20250 274794 TIGR03803 Gloeo_Verruco Gloeo_Verruco repeat. This model describes a rare protein repeat, found so far in two species of Verrucomicrobia (Chthoniobacter flavus and Verrucomicrobium spinosum) and in four different proteins of Gloeobacter violaceus PCC7421. In the Verrucomicrobial species, the repeat region is followed by a PEP-CTERM protein-sorting signal, suggesting an extracellular location. 34
20251 274795 TIGR03804 para_beta_helix parallel beta-helix repeat (two copies). This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the beta-strand repeats that stack in a right-handed parallel beta-helix in the periplasmic C-5 mannuronan epimerase, AlgG, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel beta-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences. 44
20252 163517 TIGR03805 beta_helix_1 parallel beta-helix repeat-containing protein. Members of this protein family contain a tandem pair of beta-helix repeats (see TIGR03804). Each repeat is expected to consist of three beta strands that form a single turn as they form a right-handed helix of stacked beta-structure. Member proteins occur regularly in two-gene pairs along with another uncharacterized protein family; both protein families exhibit either lipoprotein or regular signal peptides, suggesting transit through the plasma membrane, and the two may be fused. The function of the pair is unknown. [Unknown function, General] 314
20253 163518 TIGR03806 chp_HNE_0200 conserved hypothetical protein, HNE_0200 family. The model TIGR03805 describes an uncharacterized protein family that contains repeats associated with the formation of a right-handed helical stack of parallel beta strands, homologous to those found in a number of carbohydrate-binding proteins and sugar hydrolases. This model describes another uncharacterized protein family, found in the same species as TIGR03805 member proteins, usually as the adjacent gene or in a fusion protein. An example is HNE_0200 from Hyphomonas neptunium ATCC 15444. Sometimes two members of this family are with a single member of TIGR03805. The function is unknown. [Hypothetical proteins, Conserved] 317
20254 213864 TIGR03807 RR_fam_repeat putative cofactor-binding repeat. This model describes a small repeat found in a family of proteins that crosses the plasma membrane by twin-arginine translocation, which usually signifies the presence of a bound cofactor. This repeat shows similarity to the beta-helical repeat, in which three beta-strands per repeat wind once per repeat around in a right-handed helical stack of parallel beta structure. 27
20255 163520 TIGR03808 RR_plus_rpt_1 twin-arg-translocated uncharacterized repeat protein. Members of this protein family have a Sec-independent twin-arginine translocation (TAT) signal sequence, which enables transfer of proteins folded around prosthetic groups across the plasma membrane. These proteins have four copies of a repeat of about 23 amino acids that resembles the beta-helix repeat. Beta-helix refers to a structural motif in which successive beta strands wind around to stack parallel in a right-handed helix, as in AlgG and related enzymes of carbohydrate metabolism. The twin-arginine motif suggests that members of this protein family bind some unknown cofactor. 455
20256 163521 TIGR03809 TIGR03809 TIGR03809 family protein. This protein family contains proteins with a median length of about 175, including a strongly conserved N-terminal region of about 55 amino acids, a conserved extreme C-terminal region of about 15 amino acids, and highly variable sequence in between the two. Members are found invariably with a member of family TIGR03808. 168
20257 163522 TIGR03810 arg_ornith_anti arginine-ornithine antiporter. Members of this protein family are the arginine/ornithine antiporter, ArcD. This exchanger of ornithine for arginine occurs in a system with arginine deiminase, ornithine carbamoyltransferase, and carbamate kinase, which together turn arginine to ornithine with the generation of ATP and release of CO2. [Transport and binding proteins, Amino acids, peptides and amines] 468
20258 163523 TIGR03811 tyr_de_CO2_Ent tyrosine decarboxylase, Enterococcus type. This model represents tyrosine decarboxylases in the family of the Enterococcus faecalis enzyme Tdc. These enzymes often are encoded next to tyrosine/tyramine antiporter, together comprising a system in which tyrosine decarboxylation can protect against exposure to acid conditions. This clade differs from the archaeal tyrosine decarboxylases associated with methanofuran biosynthesis. [Cellular processes, Adaptations to atypical conditions] 608
20259 274796 TIGR03812 tyr_de_CO2_Arch tyrosine decarboxylase MnfA. Members of this protein family are the archaeal form, MnfA, of tyrosine decarboxylase, and are involved in methanofuran biosynthesis. Members show clear homology to the Enterococcus form, Tdc, that is involved in tyrosine decarboxylation for resistance to acidic conditions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 373
20260 163525 TIGR03813 put_Glu_GABA_T putative glutamate/gamma-aminobutyrate antiporter. Members of this protein family are putative glutamate/gamma-aminobutyrate antiporters. Each member of the seed alignment is found adjacent to a glutamate decarboxylase, which converts glutamate (Glu) to gamma-aminobutyrate (GABA). However, the majority belong to genome contexts with a glutaminase (converts Gln to Glu) as well as the decarboxylase that converts Glu to GABA. The specificity of the transporter remains uncertain. 474
20261 274797 TIGR03814 Gln_ase glutaminase A. This family describes the enzyme glutaminase, from a larger family that includes serine-dependent beta-lactamases and penicillin-binding proteins. Many bacteria have two isozymes. This model is based on selected known glutaminases and their homologs within prokaryotes, with the exclusion of highly-derived (long branch) and architecturally varied homologs, so as to achieve conservative assignments. A sharp drop in scores occurs below 250, and cutoffs are set accordingly. The enzyme converts glutamine to glutamate, with the release of ammonia. Members tend to be described as glutaminase A (glsA), where B (glsB) is unknown and may not be homologous (as in Rhizobium etli). Some species have two isozymes that may both be designated A (GlsA1 and GlsA2). [Energy metabolism, Amino acids and amines] 300
20262 274798 TIGR03815 CpaE_hom_Actino helicase/secretion neighborhood CpaE-like protein. Members of this protein family belong to the MinD/ParA family of P-loop NTPases, and in particular show homology to the CpaE family of pilus assembly proteins. Nearly all members are found, not only in a gene context consistent with pilus biogenesis or a pilus-like secretion apparatus, but also near a DEAD/DEAH-box helicase, suggesting an involvement in DNA transfer activity. The model describes a clade restricted to the Actinobacteria. 322
20263 274799 TIGR03816 tadE_like_DECH helicase/secretion neighborhood TadE-like protein. Members of this small, highly hydrophobic protein family occur in a pilus/secretion-like region that usually is next to an uncharacterized DEAH-box helicase, in Actinobacteria. Members show sequence similarity to the TadE-like family described by pfam07811. The function is unknown. [Unknown function, General] 109
20264 274800 TIGR03817 DECH_helic helicase/secretion neighborhood putative DEAH-box helicase. A conserved gene neighborhood widely spread in the Actinobacteria contains this uncharacterized DEAH-box family helicase encoded convergently towards an operon of genes for proteins homologous to type II secretion and pilus formation proteins. The context suggests that this helicase may play a role in conjugal transfer of DNA. 742
20265 274801 TIGR03818 MotA1 flagellar motor stator protein MotA. The MotA protein, along with its partner MotB, comprise the stator complex of the bacterial flagellar motor. MotAB span the cytoplasmic membrane and undergo conformational changes powered by the translocation of protons. These conformational changes in turn are communicated to the rotor assembly, producing torque. This model represents one family of MotA proteins which are often not identified by the "transporter, MotA/TolQ/ExbB proton channel family" model, pfam01618. 282
20266 200328 TIGR03819 heli_sec_ATPase helicase/secretion neighborhood ATPase. Members of this protein family comprise a distinct clade of putative ATPase associated with an integral membrane complex likely to act in pilus formation, secretion, or conjugal transfer. The association of most members with a nearby gene for a DEAH-box helicase suggests a role in conjugal transfer. 340
20267 163532 TIGR03820 lys_2_3_AblA lysine-2,3-aminomutase. This model describes lysine-2,3-aminomutase as found along with beta-lysine acetyltransferase in a two-enzyme pathway for making the compatible solute N-epsilon-acetyl-beta-lysine. This compatible solute, or osmolyte, is known to protect a number of methanogenic archaea against salt stress. The trusted cutoff distinguishes a tight clade with essentially full-length homology from additional homologs that are shorter or highly diverged in the C-terminal region. All members of this family have the radical SAM motif CXXXCXXC, while some but not all have a second copy of the motif in the C-terminal region. 417
20268 163533 TIGR03821 EFP_modif_epmB EF-P beta-lysylation protein EpmB. Members of this radical SAM protein subfamily, including yjeK in E. coli, form a distinctive clade, homologous to lysine-2,3-aminomutase of Bacillus, Clostridium, and methanogenic archaea. Members of this family are found in E. coli, Buchnera, Yersinia, etc. The gene symbol is now reassigned as EpmB (Elongation factor P Modification B). [Protein fate, Protein modification and repair] 321
20269 163534 TIGR03822 AblA_like_2 lysine-2,3-aminomutase-related protein. Members of this protein family form a distinctive clade, homologous to lysine-2,3-aminomutase (of Bacillus, Clostridium, and methanogenic archaea) and likely similar in function. Members of this family are found in Rhodopseudomonas, Caulobacter crescentus, Bradyrhizobium, etc. 321
20270 163535 TIGR03823 FliZ flagellar regulatory protein FliZ. FliZ is involved in the regulation of flagellar assembly and possibly also the down-regulation of the motile phenotype. FliZ interacts with the flagellar translational activator FlhCD complex. 168
20271 274802 TIGR03824 FlgM_jcvi flagellar biosynthesis anti-sigma factor FlgM. FlgM interacts with and inhibits the alternative sigma factor sigma(28) FliA. The C-terminus of FlgM contains the sigma(28)-binding domain. 95
20272 274803 TIGR03825 FliH_bacil flagellar assembly protein FliH. This bacillus clade of FliH proteins is not found by the Pfam FliH model pfam02108, but is closely related to the sequences identified by that model. Sequences identified by this model are observed in flagellar operons in an analogous position relative to other flagellar operon genes. 255
20273 163538 TIGR03826 YvyF flagellar operon protein TIGR03826. This gene is found in flagellar operons of Bacillus-related organisms. Its function has not been determined and an official gene symbol has not been assigned, although the gene is designated yvyF in B. subtilis. A tentative assignment as a regulator is suggested in the NCBI record GI:16080597. 137
20274 163539 TIGR03827 GNAT_ablB putative beta-lysine N-acetyltransferase. Members of this protein family are GNAT family acetyltransferases, based on a seed alignment in which every member is associated with a lysine 2,3-aminomutase family protein, usually as the adjacent gene. This family includes AblB, the enzyme beta-lysine acetyltransferase that completes the two-step synthesis of the osmolyte (compatible solute) N-epsilon-acetyl-beta-lysine; all members of the family may have this function. Note that N-epsilon-acetyl-beta-lysine has been observed only in methanogenic archaea (e.g. Methanosarcina) but that this model, paired with TIGR03820, suggests a much broader distribution. 266
20275 274804 TIGR03828 pfkB 1-phosphofructokinase. This enzyme acts in concert with the fructose-specific phosphotransferase system (PTS) which imports fructose as fructose-1-phosphate. The action of 1-phosphofructokinase results in beta-D-fructose-1,6-bisphosphate and is an entry point into glycolysis (GenProp0688). 304
20276 163541 TIGR03829 YokU_near_AblA uncharacterized protein, YokU family. Members of this protein family occur in various species of the genus Bacillus, always next to the gene (kamA or ablA) for lysine 2,3-aminomutase. Members have a pair of CXXC motifs, and share homology to the amino-terminal region of a family of putative transcription factors for which the C-terminal is modeled by pfam01381, a helix-turn-helix domain model. This family, however, is shorter and lacks the helix-turn-helix region. The function of this protein family is unknown, but a regulatory role in compatible solute biosynthesis is suggested by local genome context. [Unknown function, General] 89
20277 274805 TIGR03830 CxxCG_CxxCG_HTH putative zinc finger/helix-turn-helix protein, YgiT family. This model describes a family of predicted regulatory proteins with a conserved zinc finger/HTH architecture. The amino-terminal region contains a novel domain, featuring two CXXC motifs and occurring in a number of small bacterial proteins as well as in the present family. The carboxyl-terminal region consists of a helix-turn-helix domain, modeled by pfam01381. The predicted function is DNA binding and transcriptional regulation. 127
20278 274806 TIGR03831 YgiT_finger YgiT-type zinc finger domain. This domain model describes a small domain with two copies of a putative zinc-binding motif CXXC (usually CXXCG). Most member proteins consist largely of this domain or else carry an additional C-terminal helix-turn-helix domain, resembling that of the phage protein Cro and modeled by pfam01381. 46
20279 163544 TIGR03832 Tyr_2_3_mutase tyrosine 2,3-aminomutase. Members of this protein family are tyrosine 2,3-aminomutase. It is variable from member to member as to whether the (R)-beta-Tyr or (S)-beta-Tyr is the preferred product from L-Tyr. This enzyme tends to occur in secondary metabolite biosynthesis systems, as in the production of chondramides in Chondromyces crocatus. This class of enzyme has a prosthetic group, MIO (4-methylideneimidazol-5-one), that forms posttranslationally from an Ala-Ser-Gly motif. 507
20280 274807 TIGR03833 TIGR03833 conserved hypothetical protein. A pair of adjacent genes, ablAB (acetyl-beta-lysine biosynthesis) encodes lysine 2,3-aminomutase and beta-lysine acetyltransferase in methanogenic archaea. Homologous pairs, possibly with identical function, occur in a wide range of species, including Bacillus subtilis. This model describes a conserved hypothetical protein, small in size, with a phylogenetic distribution moderately well correlated to that of the acetyltransferase family. This protein family is also described as DUF2196 and COG4895. The function is unknown. [Hypothetical proteins, Conserved] 62
20281 213869 TIGR03834 EAGR_box EAGR box. The EAGR box (Enriched in Aromatic and Glycine Residues) is found in three different proteins of the Mycoplasma genitalium terminal organelle, which acts in both cytadherence and gliding motility. The presence of this domain in a genome predicts the Mycoplasma-type terminal organelle structure, gliding motility, and cytadherence. The EAGR box may occur from one to nine times in a protein. 28
20282 274808 TIGR03835 termin_org_DnaJ terminal organelle assembly protein TopJ. This model describes TopJ (MG_200, CbpA), a DnaJ homolog and probable assembly protein of the Mycoplasma terminal organelle. The terminal organelle is involved in both cytadherence and gliding motility. [Cellular processes, Chemotaxis and motility] 871
20283 163548 TIGR03836 termin_org_HMW1 cytadherence high molecular weight protein 1 N-terminal region. This model describes the N-terminal region of the Mycoplasma cytadherence protein HMW1, up to but not including the first EAGR box domain. The apparent orthologs in different Mycoplasma species differ profoundly in architecture C-terminally to the region described here. 82
20284 274809 TIGR03837 EarP Elongation-Factor P (EF-P) rhamnosyltransferase EarP. This model describes a conserved protein that typically is encoded next to the gene efp for translation elongation factor P. 371
20285 274810 TIGR03838 queuosine_YadB glutamyl-queuosine tRNA(Asp) synthetase. This protein resembles a shortened glutamyl-tRNA ligase, but its purpose is to modify tRNA(Asp) at a queuosine position in the anticodon rather than to charge a tRNA with its cognate amino acid. [Protein synthesis, tRNA and rRNA base modification] 271
20286 274811 TIGR03839 termin_org_P1 adhesin P1. Members of this protein family are the major adhesin of the Mycoplasma terminal organelle. The protein is called adhesin P1, cytadhesin P1, P140, attachment protein, and MgPa, with locus names MG191 in Mycoplasma genitalium and MPN141 in M. pneumoniae. A conserved C-terminal region is shared by additional paralogs in M. pneumoniae and M. gallisepticum, as well as by the member of this family. [Cell envelope, Surface structures, Cellular processes, Pathogenesis] 1425
20287 213871 TIGR03840 TMPT_Se_Te thiopurine S-methyltransferase, Se/Te detoxification family. Members of this family are thiopurine S-methyltransferase from a branch in which at least some member proteins can perform selenium methylation as a means to detoxify selenium, or perform a related detoxification of tellurium. Note that the EC number definition does not specify a particular thiopurine, but rather represents a class of activity. 213
20288 274812 TIGR03841 F420_Rv3093c probable F420-dependent oxidoreductase, Rv3093c family. This model describes a small family of enzymes in the bacterial luciferase-like monooxygenase family, which includes F420-dependent enzymes such as N5,N10-methylenetetrahydromethanopterin reductase as well as FMN-dependent enzymes. All members of this family are from species that produce coenzyme F420; SIMBAL analysis suggests that members of this family bind F420 rather than FMN. [Unknown function, Enzymes of unknown specificity] 301
20289 163554 TIGR03842 F420_CPS_4043 F420-dependent oxidoreductase, CPS_4043 family. This model represents a family of putative F420-dependent oxidoreductases, fairly closely related to 5,10-methylenetetrahydromethanopterin reductase (mer, TIGR03555), both within the bacterial luciferase-like monooxygenase (LLM) family. A fairly deep split (to about 40% sequence identity) in the present family separates a strictly Actinobacterial clade from an alpha/beta/gamma-proteobacterial clade, in which the member is often the only apparent F420-dependent LLM family member. The specific function, and whether Actinobacterial and Proteobacterial clades differ in function, are unknown. [Unknown function, Enzymes of unknown specificity] 330
20290 274813 TIGR03843 TIGR03843 conserved hypothetical protein. This model represents a protein family largely restricted to the Actinobacteria (high-GC Gram-positives), although it is also found in the Chloroflexi. Distant similarity to the phosphatidylinositol 3- and 4-kinase is suggested by the matching of some members to pfam00454. 226
20291 163556 TIGR03844 cysteate_syn cysteate synthase. Members of this family are cysteate synthase, an enzyme of an alternate pathway to sulfopyruvate, a precursor of coenzyme M. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 398
20292 163557 TIGR03845 sulfopyru_alph sulfopyruvate decarboxylase, alpha subunit. This model represents the alpha subunit, or the N-terminal region, of sulfopyruvate decarboxylase, an enzyme of coenzyme M biosynthesis. Coenzyme M is found almost exclusively in the methanogenic archaea. However, the enzyme also occurs in Roseovarius nubinhibens ISM in a degradative pathway, where the resulting sulfoacetaldehyde is desulfonated to acetyl phosphate, then converted to acetyl-CoA (see ). [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 157
20293 274814 TIGR03846 sulfopy_beta sulfopyruvate decarboxylase, beta subunit. Nearly every member of this protein family is the beta subunit, or else the C-terminal region, of sulfopyruvate decarboxylase, in an archaeal species capable of coenzyme M biosynthesis. However, the enzyme also occurs in Roseovarius nubinhibens ISM in a degradative pathway, where the resulting sulfoacetaldehyde is desulfonated to acetyl phosphate, then converted to acetyl-CoA (see ). 181
20294 213872 TIGR03847 TIGR03847 conserved hypothetical protein. The conserved hypothetical protein described here occurs as part of the trio of uncharacterized proteins common in the Actinobacteria. 177
20295 163560 TIGR03848 MSMEG_4193 probable phosphomutase, MSMEG_4193 family. A three-gene system broadly conserved among the Actinobacteria includes MSMEG_4193 and homologs, a subgroup among the larger phosphoglycerate mutase family protein (pfam00300). Another member of the trio is a probable kinase, related to phosphatidylinositol kinases; that context supports the hypothesis that this protein acts as a phosphomutase. 204
20296 163561 TIGR03849 arch_ComA phosphosulfolactate synthase. This model finds the ComA (Coenzyme M biosynthesis A) protein, phosphosulfolactate synthase, in methanogenic archaea. The ComABC pathway is one of at least two pathways to the intermediate sulfopyruvate. Coenzyme M occurs rarely and sporadically outside of the archaea, as for epoxide metabolism in Xanthobacter autotrophicus Py2, but candidate phosphosulfolactate synthases from that and other species fall below the cutoff and outside the scope of this model. This model deliberately is narrower in scope than pfam02679. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 237
20297 274815 TIGR03850 bind_CPR_0540 carbohydrate ABC transporter substrate-binding protein, CPR_0540 family. Members of this protein family are the substrate-binding protein of a predicted carbohydrate transporter operon, together with permease subunits of ABC transporter homology families. This substrate-binding protein frequently co-occurs in genomes with a family of disaccharide phosphorylases, TIGR02336, suggesting that the molecule transported will include beta-D-galactopyranosyl-(1->3)-N-acetyl-D-glucosamine and related carbohydrates. Members of this family occur sporadically, strain by strain, often in species with a human host association, including Propionibacterium acnes, Clostridium perfringens, and Bacillus cereus. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 437
20298 274816 TIGR03851 chitin_NgcE carbohydrate ABC transporter, N-acetylglucosamine/diacetylchitobiose-binding protein. Members of this protein family are the substrate-binding protein, a lipid-anchored protein of Gram-positive bacteria in all examples found so far, that include NgcE of the chitin-degrader, Streptomyces olivaceoviridis, and close homologs from other species likely to share the same function. NgcE binds both N-acetylglucosamine and the chitin dimer, N,N'-diacetylchitobiose. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 450
20299 163564 TIGR03852 sucrose_gtfA sucrose phosphorylase. In the forward direction, this enzyme uses phosphate to cleave sucrose into D-fructose + alpha-D-glucose 1-phosphate. Characterized representatives from Streptococcus mutans and Bifidobacterium adolescentis represent well-separated branches of a molecular phylogenetic tree. In S. mutans, the region including this gene has been associated with neighboring transporter genes and multiple sugar metabolism. 470
20300 163565 TIGR03853 matur_matur probable metal-binding protein. This model describes a family of small cytosolic proteins, about 80 amino acids in length, in which the eight invariant residues include three His residues and two Cys residues. Two pairs of these invariant residues occur in motifs HxH (where x is A or G) and CxH, both of which suggest metal-binding activity. This protein family was identified by searching with a phylogenetic profile based on an anaerobic sulfatase-maturase enzyme, which contains multiple 4Fe-4S clusters. The linkages by phylogenetic profiling and by iron-sulfur cluster-related motifs together suggest this protein may be an accessory protein to certain maturases in sulfatase/maturase systems. 77
20301 163566 TIGR03854 F420_MSMEG_3544 probable F420-dependent oxidoreductase, MSMEG_3544 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a small family, closely related to other such families in the putative F420-binding region, exemplified by MSMEG_3544 in Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity] 290
20302 163567 TIGR03855 NAD_NadX aspartate dehydrogenase. Members of this protein family are L-aspartate dehydrogenase, as shown for the NADP-dependent enzyme TM_1643 of Thermotoga maritima. Members lack homology to NadB, the aspartate oxidase (EC 1.4.3.16) of most mesophilic bacteria (described by TIGR00551), which this enzyme replaces in the generation of oxaloacetate from aspartate for the NAD biosynthetic pathway. All members of the seed alignment are found adjacent to other genes of NAD biosynthesis, although other uses of L-aspartate dehydrogenase may occur. 229
20303 213873 TIGR03856 F420_MSMEG_2906 probable F420-dependent oxidoreductase, MSMEG_2906 family. This model describes a small family of enzymes in the bacterial luciferase-like monooxygenase family, which includes F420-dependent enzymes such as N5,N10-methylenetetrahydromethanopterin reductase as well as FMN-dependent enzymes. All members of this family are from species that produce coenzyme F420; SIMBAL analysis suggests that members of this family bind F420 rather than FMN. [Unknown function, Enzymes of unknown specificity] 249
20304 213874 TIGR03857 F420_MSMEG_2249 probable F420-dependent oxidoreductase, MSMEG_2249 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a distinctive subfamily, found only in F420-biosynthesizing members of the Actinobacteria, of the bacterial luciferase-like monooxygenase (LLM) superfamily. [Unknown function, Enzymes of unknown specificity] 329
20305 274817 TIGR03858 LLM_2I7G probable oxidoreductase, LLM family. This model describes a highly conserved, somewhat broadly distributed family within the luciferase-like monooxygenase (LLM) superfamily. Most members are from species incapable of synthesizing coenzyme F420, bound by some members of the LLM superfamily. Members, therefore, are more likely to use FMN as a cofactor. 337
20306 274818 TIGR03859 PQQ_PqqD coenzyme PQQ biosynthesis protein PqqD. This model identifies PqqD, a protein involved in the final steps of the biosynthesis of pyrroloquinoline quinone, coenzyme PQQ. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 81
20307 274819 TIGR03860 FMN_nitrolo FMN-dependent oxidoreductase, nitrilotriacetate monooxygenase family. This model represents a distinctive clade, in which all characterized members are FMN-binding, within the larger family of luciferase-like monooxygenases (LLM), among which there are both FMN- and F420-binding enzymes. A well-characterized member is nitrilotriacetate monooxygenase from Aminobacter aminovorans (Chelatobacter heintzii), where nitrilotriacetate is a chelating agent used in detergents. [Unknown function, Enzymes of unknown specificity] 422
20308 163573 TIGR03861 phenyl_ABC_PedC alcohol ABC transporter, permease protein. Members of this protein family, part of a larger class of efflux-type ABC transport permease proteins, are found exclusively in genomic contexts with pyrroloquinoline-quinone (PQQ) biosynthesis enzymes and/or PQQ-dependent alcohol dehydrogenases, such as the phenylethanol dehydrogenase PedE of Pseudomonas putida U. Members include PedC, an apparent phenylethanol transport protein whose suggested role is efflux to limit intracellular concentrations of toxic metabolites during phenylethanol catalysis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 253
20309 274820 TIGR03862 flavo_PP4765 uncharacterized flavoprotein, PP_4765 family. This model describes a sharply distinctive clade of proteins within the larger family of flavoproteins described by pfam03486 and TIGRFAMs model TIGR00275. The function is unknown. 376
20310 274821 TIGR03863 PQQ_ABC_bind ABC transporter, substrate binding protein, PQQ-dependent alcohol dehydrogenase system. Members of this protein family are putative substrate-binding proteins of an ABC transporter family that associates, in gene neighborhood and phylogenomic profile, with pyrroloquinoline-quinone (PQQ)-dependent degradation of certain alcohols, such as 2-phenylethanol in Pseudomonas putida U. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 347
20311 274822 TIGR03864 PQQ_ABC_ATP ABC transporter, ATP-binding subunit, PQQ-dependent alcohol dehydrogenase system. Members of this protein family are the ATP-binding subunit of an ABC transporter system that is associated with PQQ biosynthesis and PQQ-dependent alcohol dehydrogenases. While this family shows homology to several efflux ABC transporter subunits, the presence of a periplasmic substrate-binding protein and association with systems for catabolism of alcohols suggests a role in import rather than detoxification. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 236
20312 274823 TIGR03865 PQQ_CXXCW PQQ-dependent catabolism-associated CXXCW motif protein. Members of this protein family have a CXXXCW motif, consistent with a possible role in redox cofactor binding. This protein family shows strong relationships by phylogenetic profiling and conserved gene neighborhoods with a transport system for alcohols metabolized by PQQ-dependent enzymes. 162
20313 274824 TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol. 310
20314 274825 TIGR03867 MprA_tail MprA protease C-terminal rhombosortase-interaction domain. This model describes the Ralstonia lineage variant of the GlyGly-CTERM domain (TIGR03501), a predicted target for protein sorting and cleavage by rhombosortase, a member of the family of rhomboid proteases. Note that some MprA family proteases are full-length homologs except for the lack of this domain. All members of the present family are predicted serine proteases. 27
20315 274826 TIGR03868 F420-O_ABCperi proposed F420-0 ABC transporter, periplasmic F420-0 binding protein. This small clade of ABC-type transporter periplasmic binding protein components is found as a three gene cassette along with a permease (TIGR03869) and an ATPase (TIGR03873). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this periplasmic binding protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter. 287
20316 163581 TIGR03869 F420-0_ABCperm proposed F420-0 ABC transporter, permease protein. This small clade of ABC-type transporter permease protein components is found as a three gene cassette along with a periplasmic substrate-binding protein (TIGR03868) and an ATPase (TIGR03873). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this permease protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter. 325
20317 274827 TIGR03870 ABC_MoxJ methanol oxidation system protein MoxJ. This predicted periplasmic protein, called MoxJ or MxaJ, is required for methanol oxidation in Methylobacterium extorquens. Two differing lines of evidence suggest two different roles. Forming one view, homology suggests it is the substrate-binding protein of an ABC transporter associated with methanol oxidation. The gene, furthermore, is found regularly in genomes with, and only two or three genes away from, a corresponding permease and ATP-binding cassette gene pair. The other view is that this protein is an accessory factor or additional subunit of methanol dehydrogenase itself. Mutational studies show a dependence on this protein for expression of the PQQ-dependent, two-subunit methanol dehydrogenase (MxaF and MxaI) in Methylobacterium extorquens, as if it is a chaperone for enzyme assembly or a third subunit. A homologous N-terminal sequence was found in Paracoccus denitrificans as a 32Kd third subunit. This protein may, in fact, be both: a component of a periplasmic enzyme that converts methanol to formaldehyde and a component of an ABC transporter that delivers the resulting formaldehyde to the cell's interior. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Energy metabolism, Other] 246
20318 274828 TIGR03871 ABC_peri_MoxJ_2 quinoprotein dehydrogenase-associated probable ABC transporter substrate-binding protein. This protein family, a sister family to TIGR03870, is found more broadly. It occurs in a range of PQQ-biosynthesizing species, not just in known methanotrophs. Interpretation of evidence by homology and by direct experimental work suggests two different roles. By homology, this family appears to be the periplasmic substrate-binding protein of an ABC transport family. However, mutational studies and direct characterization for some sequences related to this family suggest this family may act as a maturation chaperone or additional subunit of a methanol dehydrogenase-like enzyme. 232
20319 274829 TIGR03872 cytochrome_MoxG cytochrome c(L), periplasmic. This model describes a periplasmic c-type cytochrome that serves as the primary electron acceptor for the quinoprotein methanol dehydrogenase, a PQQ enzyme. The member from Paracoccus denitrificans is also characterized as an electron acceptor for methylamine dehydrogenase, a tryptophan tryptophylquinone enzyme. This protein is called cytochrome c(L) in methylotrophic bacteria such as Methylobacterium extorquens, but c551i in Paracoccus denitrificans. [Energy metabolism, Electron transport] 133
20320 163585 TIGR03873 F420-0_ABC_ATP proposed F420-0 ABC transporter, ATP-binding protein. This small clade of ABC-type transporter ATP-binding protein components is found as a three gene cassette along with a periplasmic substrate-binding protein (TIGR03868) and a permease (TIGR03869). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this ATP-binding protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter. 256
20321 163586 TIGR03874 4cys_cytochr c-type cytochrome, methanol metabolism-related. This family represents a c-type cytochrome related to (but excluding) cytochrome c-555 of Methylococcus capsulatus. Members contain four invariant Cys residues, including two from a heme-binding motif shared with c-555, and two others. 143
20322 163587 TIGR03875 RNA_lig_partner RNA ligase partner, MJ_0950 family. This uncharacterized protein family is found almost perfectly in the same set of genomes as the Pab1020 family described by model TIGR01209. These pairs are found mostly in Archaea, but also in a few bacteria (e.g. Alkalilimnicola ehrlichei MLHE-1, Aquifex aeolicus). While the partner protein has been described as a homodimeric ligase that has RNA circularization activity, the function of this protein (also called UPF0278) is unknown. 206
20323 274830 TIGR03876 cas_csaX CRISPR type I-A/APERN-associated protein CsaX. This family comprises a minor CRISPR-associated protein family. It occurs only in the context of the (strictly archaeal) Apern subtype of CRISPR/Cas system, and is further restricted to the Sulfolobales, including Metallosphaera sedula DSM 5348 and multiple species of the genus Sulfolobus. 281
20324 163589 TIGR03877 thermo_KaiC_1 KaiC domain protein, Ph0284 family. Members of this family contain a single copy of the KaiC domain (pfam06745) that occurs in two copies of the circadian clock protein kinase KaiC itself. Members occur primarily in thermophilic archaea and in Thermotoga. 237
20325 274831 TIGR03878 thermo_KaiC_2 KaiC domain protein, AF_0795 family. This KaiC domain-containing protein family occurs sporadically across a broad taxonomic range (Euryarchaeota, Aquificae, Dictyoglomi, Epsilonproteobacteria, and Firmicutes), but exclusively in thermophiles. 259
20326 163591 TIGR03879 near_KaiC_dom probable regulatory domain. This model describes a common domain shared by two different families of proteins, each of which occurs regularly next to its corresponding partner family, a probable regulatory protein with homology to KaiC. By implication, this protein family likely is also involved in sensory transduction and/or regulation. 73
20327 163592 TIGR03880 KaiC_arch_3 KaiC domain protein, AF_0351 family. This model represents a rather narrowly distributed archaeal protein family in which members have a single copy of the KaiC domain. This stands in contrast to the circadian clock protein KaiC itself, with two copies of the domain. Members are expected to have weak ATPase activity, by homology to the autokinase/autophosphorylase KaiC itself. 224
20328 163593 TIGR03881 KaiC_arch_4 KaiC domain protein, PAE1156 family. Members of this protein family are archaeal single-domain KaiC-related proteins, homologous to the Cyanobacterial circadian clock cycle protein KaiC, an autokinase/autophosphorylase that has two copies of the domain. 229
20329 274832 TIGR03882 cyclo_dehyd_2 bacteriocin biosynthesis cyclodehydratase domain. This model describes a ThiF-like domain of a fusion protein found in clusters associated with the production of TOMMs (thiazole/oxazole-modified microcins), small bacteriocins with characteristic heterocycle modifications. This domain is presumed to act as a cyclodehydratase, as do members of the SagC family modeled by TIGR03603. 164
20330 274833 TIGR03883 DUF2342_F420 uncharacterized protein, coenzyme F420 biosynthesis associated. A phylogenetic tree of the DUF2342 family (TIGR03624) consists of two major branches. One of these branches, modeled here, is observed almost entirely to be found in coenzyme F420 biosynthesizing species of the Actinobacterial, Chloroflexi and Archaeal lineages. The few organisms having genes within this family and lacking F420 biosynthesis may either have an undiscovered F420 transporter, or may represent F420-to-FMN revertants. This family includes a Chloroflexus aurantiacus protein whose crystal structure has been determined (PDB:3CMN_A). This has been annotated as a putative hydrolase, but the support for that assertion is untraceable. There is no cofactor present in the structure. 346
20331 163596 TIGR03884 sel_bind_Methan selenium-binding protein. This model describes a homopentameric selenium-binding protein with a suggested role in selenium transport and delivery to selenophosphate synthase, the SelD protein. This protein family is closely related to pfam01906, but is shorter because of several deleted regions. It is restricted to the archaeal genus Methanococcus. 74
20332 274834 TIGR03885 flavin_revert probable non-F420 flavinoid oxidoreductase. This model represents a clade of proteins within the larger subfamily TIGR03557. The parent model includes the F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554) and many other proteins. Excepting the members of this family, all members of TIGR03557 occur in species capable of synthesizing coenzyme F420. All members of the seed alignment for this model are from species that lack F420 biosynthesis. It is suggested that members of this family bind FMN, or FO, or a novel flavinoid cofactor, but not F420 per se. [Unknown function, Enzymes of unknown specificity] 315
20333 188401 TIGR03886 lyase_spl_fam spore photoproduct lyase family protein. This uncharacterized radical SAM domain protein occurs rarely and sporadically in species that include select Alphaproteobacteria and Actinobacteria, and in Deinococcus deserti VCD115. It is a distant but full-length homolog to the Bacillus subtilis spore photoproduct lyase (spl), which monomerizes thymine dimers created as DNA damage by UV radiation. 346
20334 188402 TIGR03887 thiocyan_alph thiocyanate hydrolase, gamma subunit. Members of this family are the gamma subunit of thiocyanate hydrolase. This family is closely related to the nitrile hydratase, alpha subunit (TIGR01323). 200
20335 274835 TIGR03888 nitrile_beta nitrile hydratase, beta subunit. Members of this protein family are the beta subunit of nitrile hydratase. The alpha subunit is represented by model TIGR01323. While nitrile hydratase is given the specific EC number 4.2.1.84, nitriles are a class of compounds, and one genome may carry more than one nitrile hydratase. The enzyme occurs in both non-heme iron and non-corrin cobalt forms. [Energy metabolism, Amino acids and amines] 223
20336 188404 TIGR03889 nitrile_acc nitrile hydratase accessory protein. Members of this protein family are found in operons with the alpha and beta subunits of nitrile hydratase, an enzyme with Fe(III) or Co(III) at the active site, and appear to be accessory proteins for maturation or activation of the enzyme. This protein is homologous to the beta subunit (see TIGR03888). 74
20337 188405 TIGR03890 nif11_cupin nif11 domain/cupin domain protein. Members of this protein family occur exclusively in the Cyanobacteria and contain both a nif11 and a cupin domain. The function is unknown. 171
20338 274836 TIGR03891 thiopep_ocin thiopeptide-type bacteriocin biosynthesis domain. This domain occurs within longer proteins that contain lantibiotic dehydratase domains (see pfam04737 and pfam04738), and as single-domain proteins in bacteriocin biosynthesis genomic contexts. Three named genes in this family, SioK in Streptomyces sioyaensis, TsrD in Streptomyces laurentii, and NosD in Streptomyces actuosus, all occur in regions associated with thiopeptide biosynthesis. [Cellular processes, Toxin production and resistance] 263
20339 200334 TIGR03892 thiopep_precurs thiazolylpeptide-type bacteriocin precursor. Members of this protein family are the precursors of a family of small bacteriocins (i.e. microcins) with thiopeptide type modifications, a highly modified subclass of heterocycle-containing peptide antibiotics. Members tend to be found clustered in genomes with proteins recognized by TIGR03891 and proteins/domains annotated as lantibiotic dehydratase (pfam04737, pfam04738), and with a cyclodehydratase/docking protein fusion protein characteristic of heterocycle formation. The seed alignment includes both an N-terminal leader peptide region and a C-terminal low-complexity region consisting mostly of Cys and Ser residues. Members with known function block translation by inhibiting translation factor activity. [Cellular processes, Toxin production and resistance] 43
20340 274837 TIGR03893 lant_SP_1948 type 2 lantibiotic, SP_1948 family. This model recognizes a number of type 2 lantibiotic-type bacteriocins, related to but distinct from the family that includes lichenicidin and mersacidin. Sequence similarity among members consists largely of a 20-residue block of conserved sequence that covers most of the leader peptide region, absent from the mature lantibiotic. This is followed by a region with characteristic composition for lantibiotic precursor regions, rich in Ser and Thr and including a near-invariant Cys near or at the C-terminus, involved in cyclization. Members of this family typically are shorter than 70 amino acids. [Cellular processes, Toxin production and resistance] 61
20341 188409 TIGR03894 chp_P_marinus_1 conserved hypothetical protein, TIGR03894 family. This protein family is restricted to the Prochlorococcus and Synechococcus lineages of the Cyanobacteria, and is sporadic in those lineages. Members average 100 amino acids in length, including a 30-residue, highly polar, low complexity region sandwiched between an N-terminal region of about 60 residues and a C-terminal [KR]VVR[KR]RS motif, both well-conserved. The function is unknown. [Hypothetical proteins, Conserved] 95
20342 274838 TIGR03895 protease_PatA cyanobactin maturation protease, PatA/PatG family. This model describes a protease domain associated with the maturation of various members of the cyanobactin family of ribosomally produced, heavily modified bioactive metabolites. Members include the PatA protein and C-terminal domain of the PatG protein of Prochloron didemni, TenA and a region of TenG from Nostoc spongiaeforme var. tenue, etc. 602
20343 274839 TIGR03896 cyc_nuc_ocin bacteriocin-type transport-associated protein. Members of this protein family are uncharacterized and contain two copies of the cyclic nucleotide-binding domain pfam00027. Members are restricted to select cyanobacteria but are found regularly in association with a transport operon that, in turn, is associated with the production of putative bacteriocins. The models describing the transport operon are TIGR03794, TIGR03796, and TIGR03797. 317
20344 274840 TIGR03897 lanti_2_LanM type 2 lantibiotic biosynthesis protein LanM. Members of this family are known generally as LanM, a multifunctional enzyme of lantibiotic biosynthesis. This catalysis by LanM distinguishes the type 2 lantibiotics, such as mersacidin, cinnamycin, and lichenicidin, from LanBC-produced type 1 lantibiotics such as nisin and subtilin. The N-terminal domain contains regions associated with Ser and Thr dehydration. The C-terminal region contains a pfam05147 domain, which catalyzes the formation of the lanthionine bridge. [Cellular processes, Toxin production and resistance] 931
20345 274841 TIGR03898 lanti_MRSA_kill type 2 lantibiotic, mersacidin/lichenicidin family. This model recognizes a number of type 2 lantibiotic-type bacteriocins, including mersacidin and lichenicidin. Members often are found as gene pairs encoding two-chain bacteriocins. Maturation is accomplished, at least in part, by a LanM-type enzyme (TIGR03897). This model describes only the leader peptide region. [Cellular processes, Toxin production and resistance] 44
20346 274842 TIGR03899 TIGR03899 TIGR03899 family protein. Members of this protein family are conserved hypothetical proteins with a limited species distribution within the Gammaproteobacteria. It is common in the genera Vibrio and Shewanella, and in this resembles the C-terminal domain and putative protein sorting motif TIGR03501. This model, by design, does not extend to all homologs, but rather represents a particular clade. 250
20347 274843 TIGR03900 prc_long_Delta putative carboxyl-terminal-processing protease, deltaproteobacterial. This model describes a multidomain protein of about 1070 residues, restricted to the order Myxococcales in the Deltaproteobacteria. Members contain a PDZ domain (pfam00595), an S41 family peptidase domain (pfam03572), and an SH3 domain (pfam06347). A core region of this family, including PDZ and S41 regions, is described by TIGR00225, C-terminal processing peptidase, which recognizes the Prc protease. The species distribution of this family approximates that of the largely Deltaproteobacterial C-terminal putative protein-sorting domain, TIGR03901, analogous to LPXTG and PEP-CTERM, but the co-occurrence may reflect shared restriction to the Myxococcales rather than a substrate/target relationship. 973
20348 274844 TIGR03901 MYXO-CTERM MYXO-CTERM domain. This model describes MYXO-CTERM, a C-terminal putative protein sorting domain, analogous to LPXTG (TIGR01167) and PEP-CTERM (TIGR02595). It is restricted to the Myxococcales, a division of the Deltaproteobacteria, with over 60 members occurring in Plesiocystis pacifica SIR-1. An example protein is TraA, involved in outer membrane exchange (lipids and proteins) through which one strain of Myxococcus can repair a mobility defect in another. The trusted cutoff for this model is set artificially high to avoid false positives, and consequently only about half of all members are recognized. 31
20349 274845 TIGR03902 rhom_GG_sort rhomboid family GlyGly-CTERM serine protease. This model describes a rhomboid-like intramembrane serine protease. Its species distribution closely matches model TIGR03501, GlyGly-CTERM, which describes a protein targeting domain analogous to LPXTG and PEP-CTERM. In a number of species (Ralstonia eutropha, R. metallidurans, R. solanacearum, Marinobacter aquaeolei, etc.) with just one GlyGly-CTERM protein (i.e., a dedicated system), the rhombosortase and GlyGly-CTERM genes are adjacent. 154
20350 274846 TIGR03903 TOMM_kin_cyc TOMM system kinase/cyclase fusion protein. This model represents proteins of 1350 in length, in multiple species of Burkholderia, in Acidovorax avenae subsp. citrulli AAC00-1 and Delftia acidovorans SPH-1, and in multiple copies in Sorangium cellulosum, in genomic neighborhoods that include a cyclodehydratase/docking scaffold fusion protein (TIGR03882) and a member of the thiazole/oxazole modified metabolite (TOMM) precursor family TIGR03795. It has a kinase domain in the N-terminal 300 amino acids, followed by a cyclase homology domain, followed by regions without named domain definitions. It is a probable bacteriocin-like metabolite biosynthesis protein. [Cellular processes, Toxin production and resistance] 1266
20351 274847 TIGR03904 SAM_YgiQ uncharacterized radical SAM protein YgiQ. Members of this family are fairly widespread uncharacterized radical SAM family proteins, many of which are designated YgiQ. [Unknown function, Enzymes of unknown specificity] 559
20352 188420 TIGR03905 TIGR03905_4_Cys uncharacterized protein TIGR03905. This model describes a family of conserved hypothetical proteins of small size, typically ~85 residues, with four invariant Cys residues. This small protein is distantly homologous to a C-terminal domain found in proteins identified by N-terminal homology as ribonucleotide reductases. The rare and sporadic distribution of this protein family falls mostly within the subset of bacterial genomes containing the uncharacterized radical SAM protein modeled by TIGR03904. [Unknown function, General] 78
20353 274848 TIGR03906 quino_hemo_SAM quinohemoprotein amine dehydrogenase maturation protein. Members of this protein family are radical SAM enzymes responsible for post-translational modifications to the gamma subunit of quinohemoprotein amine dehydrogenases. Ono, et al. () suggest that this protein is responsible for intrapeptidyl thioether cross-linking rather than cysteine tryptophylquinone biogenesis in the gamma subunit. [Protein fate, Protein modification and repair] 467
20354 211887 TIGR03907 QH_beta quinohemoprotein amine dehydrogenase, beta subunit. Quinohemoprotein amine dehydrogenase is a three subunit enzyme with both a heme group and a cysteine tryptophylquinone group derived by post-translational modification of the gamma subunit. This model describes the beta subunit. This enzyme catalyzes oxidative deamination of primary aliphatic and aromatic amines (). [Energy metabolism, Amino acids and amines] 338
20355 274849 TIGR03908 QH_alpha quinohemoprotein amine dehydrogenase, alpha subunit. Quinohemoprotein amine dehydrogenase is a three subunit enzyme with both a heme group and a cysteine tryptophylquinone group derived by post-translational modification of the gamma subunit. This model describes the alpha subunit. This enzyme catalyzes oxidative deamination of primary aliphatic and aromatic amines (). [Energy metabolism, Amino acids and amines] 510
20356 188424 TIGR03909 pyrrolys_PylC pyrrolysine biosynthesis protein PylC. This protein is PylC, part of a three-gene cassette that is sufficient to direct the biosynthesis of pyrrolysine, the twenty-second amino acid, incorporated in some species at a UAG canonical stop codon. [Amino acid biosynthesis, Other] 374
20357 188425 TIGR03910 pyrrolys_PylB pyrrolysine biosynthesis radical SAM protein. This model describes a radical SAM protein, PylB, that is part of the three-gene cassette sufficient for the biosynthesis of pyrrolysine (the twenty-second amino acid) when expressed heterologously in E. coli. The pyrrolysine is then ligated to its own tRNA and incorporated at special UAG codons. [Amino acid biosynthesis, Other] 347
20358 188426 TIGR03911 pyrrolys_PylD pyrrolysine biosynthesis protein PylD. This protein is PylD, part of a three-gene cassette that is sufficient to direct the biosynthesis of pyrrolysine, the twenty-second amino acid, incorporated in some species at a UAG canonical stop codon. [Amino acid biosynthesis, Other] 266
20359 188427 TIGR03912 PylS_Nterm pyrrolysyl-tRNA synthetase, N-terminal region. PylS is the enzyme responsible for charging the pyrrolysine tRNA, PylT, by ligating a free molecule of pyrrolysine. Pyrrolysine is encoded at an in-frame UAG (amber) at least in several corrinoid-dependent methyltransferases of the archaeal genera Methanosarcina and Methanococcoides, such as trimethylamine methyltransferase. This protein occurs as a fusion protein in Methanosarcina but as split genes in Desulfitobacterium hafniense and other bacteria. This model describes the small, N-terminal region. [Protein synthesis, tRNA aminoacylation] 89
20360 188428 TIGR03913 rad_SAM_trio Y_X(10)_GDL-associated radical SAM protein. This narrowly distributed protein family contains an N-terminal radical SAM domain. It occurs in Pseudomonas fluorescens Pf0-1, Ralstonia solanacearum, and numerous species and strains of Burkholderia. Members always occur next to a trio of three mutually homologous genes, all of which contain the domain pfam08898 as the whole of the protein (about 60 amino acids) or as the C-terminal domain. The function is unknown, but the fact that all phylogenetically correlated proteins are mutually homologous with prominent invariant motifs (an invariant tyrosine and a GDL motif) and as small as 60 amino acids suggests that post-translational modification of pfam08898 domain-containing proteins may be its function. This view is supported by closer homology to the PqqE radical SAM protein involved in PQQ biosynthesis from the PqqA precursor peptide than to other characterized radical SAM proteins. [Unknown function, Enzymes of unknown specificity] 477
20361 274850 TIGR03914 UDG_fam_dom uracil-DNA glycosylase family domain. This model represents a clade within the uracil-DNA glycosylase superfamily. Among characterized proteins, it most closely resembles the Thermus thermophilus uracil-DNA glycosylase TTUDGA, which acts on uracil (deamidated cytosine) in both single-stranded DNA and U/G pairs of double-stranded DNA. This domain may occur either as a stand-alone protein or as the C-terminal domain of a fusion with another domain that always pairs with a particular radical-SAM family protein. 230
20362 274851 TIGR03915 SAM_7_link_chp probable DNA metabolism protein. This model represents a conserved hypothetical protein that almost invariably pairs with an uncharacterized radical SAM protein. The pair occurs in about twenty percent of completed prokaryotic genomes. About forty percent of the members of this family occur as fusion proteins, where the C-terminal domain belongs to the uracil-DNA glycosylase family, a DNA repair family (because uracil in DNA is deamidated cytosine). The linkage by gene clustering and correlated species distribution to a radical SAM protein, and by gene fusion to a DNA repair protein family, suggests a role in DNA modification and/or repair. 241
20363 188431 TIGR03916 rSAM_link_UDG putative DNA modification/repair radical SAM protein. This uncharacterized protein of about 400 amino acids in length contains a radical SAM protein in the N-terminal half. Members are present in about twenty percent of prokaryotic genomes, always paired with a member of the conserved hypothetical protein TIGR03915. Roughly forty percent of the members of that family exist as fusions with a uracil-DNA glycosylase-like region, TIGR03914. In DNA, uracil results from deamidation of cytosine, forming U/G mismatches that lead to mutation, and so uracil-DNA glycosylase is a DNA repair enzyme. This indirect connection, and the recurring role of radical SAM proteins in modification chemistries, suggest that this protein may act in DNA modification, repair, or both. [Unknown function, Enzymes of unknown specificity] 415
20364 274852 TIGR03917 Frankia_40_dom Frankia-40 domain. This model describes a paralogous domain of length 40, restricted to smaller proteins of the genus Frankia, a member of the Actinobacteria. The function is unknown. 40
20365 274853 TIGR03918 GTP_HydF [FeFe] hydrogenase H-cluster maturation GTPase HydF. This model describes the family of the [Fe] hydrogenase maturation protein HydF as characterized in Chlamydomonas reinhardtii and found, in an operon with radical SAM proteins HydE and HydG, in numerous bacteria. It has GTPase activity, can bind a 4Fe-4S cluster, and is essential for hydrogenase activity. [Protein fate, Protein modification and repair] 391
20366 274854 TIGR03919 T7SS_EccB type VII secretion protein EccB, Actinobacterial. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccB1, EccB2, etc. This model does not identify functionally related proteins in the Firmicutes such as Staphylococcus aureus and Bacillus anthracis. [Protein fate, Protein and peptide secretion and trafficking] 456
20367 274855 TIGR03920 T7SS_EccD type VII secretion integral membrane protein EccD. Members of this family are EccD, a component of actinobacterial type VII secretion systems (T7SS) with ten to eleven predicted transmembrane helix regions. [Protein fate, Protein and peptide secretion and trafficking] 453
20368 274856 TIGR03921 T7SS_mycosin type VII secretion-associated serine protease mycosin. Members of this family are subtilisin-related serine proteases, found strictly in the Actinobacteria and associated with type VII secretion operons. The designation mycosin is used for members from Mycobacterium. [Protein fate, Protein and peptide secretion and trafficking, Protein fate, Protein modification and repair] 350
20369 188437 TIGR03922 T7SS_EccA type VII secretion AAA-ATPase EccA. This model represents the AAA family ATPase, EccA, of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccA1, EccA2, etc. [Protein fate, Protein and peptide secretion and trafficking] 557
20370 274857 TIGR03923 T7SS_EccE type VII secretion protein EccE. This model represents the transmembrane protein EccE of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccE1, EccE2, etc. This model represents a conserved core region, and many members have 200 or more additional C-terminal residues. [Protein fate, Protein and peptide secretion and trafficking] 341
20371 274858 TIGR03924 T7SS_EccC_a type VII secretion protein EccCa. This model represents the N-terminal domain or EccCa subunit of the type VII secretion protein EccC as found in the Actinobacteria. Type VII secretion is defined more broadly as including secretion systems for ESAT-6-like proteins in the Firmicutes as well as in the Actinobacteria, but this family does not show close homologs in the Firmicutes. [Protein fate, Protein and peptide secretion and trafficking] 658
20372 274859 TIGR03925 T7SS_EccC_b type VII secretion protein EccCb. This model represents the C-terminal domain or EccCb subunit of the type VII secretion protein EccC as found in the Actinobacteria. Type VII secretion is defined more broadly as including secretion systems for ESAT-6-like proteins in the Firmicutes as well as in the Actinobacteria, but this family does not show close homologs in the Firmicutes. [Protein fate, Protein and peptide secretion and trafficking] 566
20373 188441 TIGR03926 T7_EssB type VII secretion protein EssB. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This protein is designated YukC in Bacillus subtilis and EssB in Staphylococcus aureus. [Protein fate, Protein and peptide secretion and trafficking] 377
20374 200340 TIGR03927 T7SS_EssA_Firm type VII secretion protein EssA. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions. [Protein fate, Protein and peptide secretion and trafficking] 150
20375 274860 TIGR03928 T7_EssCb_Firm type VII secretion protein EssC, C-terminal domain. This model describes the C-terminal domain, or longer subunit, of the Firmicutes type VII secretion protein EssC. This protein (homologous to EccC in Actinobacteria) and the WXG100 target proteins are the only homologous parts of type VII secretion between Firmicutes and Actinobacteria. [Protein fate, Protein and peptide secretion and trafficking] 1296
20376 274861 TIGR03929 T7_esaA_Nterm type VII secretion protein EsaA, N-terminal domain. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This model represents the conserved N-terminal domain. 193
20377 274862 TIGR03930 WXG100_ESAT6 WXG100 family type VII secretion target. Members of this protein family include secretion targets for the two main variants of type VII secretion systems (T7SS), one found in the Actinobacteria, one found in the Firmicutes. This model was derived through iteration from pfam06013. The best characterized member of this family is ESAT-6 from Mycobacterium tuberculosis. Members of this family usually are ~100 amino acids in length but occasionally have a long C-terminal extension. 90
20378 274863 TIGR03931 T7SS_Rv3446c type VII secretion-associated protein, Rv3446c family, C-terminal domain. Members of this protein family occur as part of the ESX-4 cluster of type VII secretion system (T7SS) proteins in Mycobacterium tuberculosis and in similar T7SS clusters in other Actinobacteria genera, including Corynebacterium, Nocardia, Rhodococcus, and Saccharopolyspora. This model describes the better-conserved C-terminal region. [Protein fate, Protein and peptide secretion and trafficking] 172
20379 188447 TIGR03932 PIA_icaD intracellular adhesion protein D. Members of this protein family are IcaD (intracellular adhesion protein D), which with catalytic subunit IcaA forms an N-acetylglucosaminyltransferase. In the absence of IcaC, this enzyme forms N-acetylglucosamine oligomers up to 20 in length. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 88
20380 188448 TIGR03933 PIA_icaB intercellular adhesin biosynthesis polysaccharide N-deacetylase. A common motif in bacterial biosynthesis of polysaccharide for export is modification that follows polymerization. This model describes a subfamily of polysaccharide N-deacetylases that acts on poly-beta-1,6-N-acetyl-D-glucosamine as produced by Staphylococcus epidermidis and S. aureus. The end product in these species is designated polysaccharide intercellular adhesin (PIA), and this gene designated icaB (intercellular adhesion protein B). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Cellular processes, Pathogenesis] 245
20381 274864 TIGR03934 TQXA_dom TQXA domain. This model describes a domain of about 40 residues with an invariant TQ dipeptide in an almost invariant TQxA[VI]W motif. This domain occurs in surface-expressed proteins of Gram-positive bacteria, many of which are anchored by LPXTG-containing sortase target domains. Numerous members of this family have domains pfam05738 (Cna protein B-type domain) and pfam08341 (fibronectin-binding protein signal sequence). 42
20382 188450 TIGR03935 fragilysin fragilysin. Members of this family are fragilysin, the Bacteroides fragilis enterotoxin. This enzyme is a Zn metalloprotease. Three distinct subtypes included in this family all are produced by enterotoxigenic (by definition) strains of Bacteroides fragilis. [Cellular processes, Pathogenesis] 386
20383 274865 TIGR03936 sam_1_link_chp radical SAM-linked protein. This model describes an uncharacterized protein encoded adjacent to, or as a fusion protein with, an uncharacterized radical SAM protein. 208
20384 274866 TIGR03937 PgaC_IcaA poly-beta-1,6 N-acetyl-D-glucosamine synthase. Members of this protein family are biofilm-forming enzymes that polymerize N-acetyl-D-glucosamine residues in beta(1,6) linkage. One named member is IcaA (intercellular adhesin protein A), an enzyme that acts (with aid of subunit IcaD) in Polysaccharide Intercellular Adhesin (PIA) biosynthesis in Staphylococcus epidermidis. The homologous member in E. coli is designated PgaC. Members are often encoded next to a polysaccharide deacetylase and involved in biofilm formation. Note that chitin, although also made from N-acetylglucosamine, is formed with beta-1,4 linkages. 407
20385 274867 TIGR03938 deacetyl_PgaB poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB. Two well-characterized systems produce polysaccharide based on N-acetyl-D-glucosamine in straight chains with beta-1,6 linkages. These are encoded by the icaADBC operon in Staphylococcus species, where the system is designated polysaccharide intercellular adhesin (PIA), and the pgaABCD operon in Gram-negative bacteria such as E. coli. Both systems include a putative polysaccharide deacetylase. The PgaB protein, described here, contains an additional domain lacking from its Gram-positive counterpart IcaB (TIGR03933). Deacetylation by this protein appears necessary to allow export through the porin PgaA. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 619
20386 274868 TIGR03939 PGA_TPR_OMP poly-beta-1,6 N-acetyl-D-glucosamine export porin PgaA. Members of this protein family are the poly-beta-1,6 N-acetyl-D-glucosamine (PGA) export porin PgaA of Gram-negative bacteria. There is no counterpart in the poly-beta-1,6 N-acetyl-D-glucosamine biosynthesis systems of Gram-positive bacteria such as Staphylococcus epidermidis. The PGA polysaccharide adhesin is a critical determinant of biofilm formation. The conserved C-terminal domain of this outer membrane protein is preceded by a variable number of TPR repeats. 800
20387 188455 TIGR03940 PGA_PgaD poly-beta-1,6-N-acetyl-D-glucosamine biosynthesis protein PgaD. Members of this protein family are PgaD, essential to the production of poly-beta-1,6-N-acetyl-D-glucosamine (PGA). This cytoplasmic membrane protein appears to be an auxiliary subunit to the PGA synthase, PgaC (TIGR03937). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 141
20388 274869 TIGR03941 tRNA_deam_assoc putative tRNA adenosine deaminase-associated protein. This model describes a protein family about 200 amino acids in length with only five invariant residues, including an Arg, a Ser-Asp pair, and two Gly residues. Members always are found exclusively in Actinobacteria, and always adjacent to homologs of TadA, a tRNA-specific adenosine deaminase from Escherichia coli. Homology, phyletic pattern, and gene neighborhood together suggest a housekeeping function in tRNA metabolism. [Unknown function, General] 154
20389 188457 TIGR03942 sulfatase_rSAM anaerobic sulfatase-maturating enzyme. Members of this protein family are radical SAM family enzymes, maturases that prepare the oxygen-sensitive radical required in the active site of anaerobic sulfatases. This maturase role has led to many misleading legacy annotations suggesting that this enzyme maturase is instead a sulfatase regulatory protein. All members of the seed alignment are radical SAM enzymes encoded next to or near an anaerobic sulfatase. Note that a single genome may encode more than one sulfatase/maturase pair. [Protein fate, Protein modification and repair] 363
20390 274870 TIGR03943 TIGR03943 TIGR03943 family protein. Members of this family occur in gene pairs with members of pfam03773. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region. 219
20391 274871 TIGR03944 dehyd_SbnB_fam 2,3-diaminopropionate biosynthesis protein SbnB. Members of this protein family are probable NAD-dependent dehydrogenases related to the alanine dehydrogenase of Archaeoglobus fulgidus (see TIGR02371 and PDB structure 1OMO) and more distantly to ornithine cyclodeaminase. Members include the staphylobactin biosynthesis protein SbnB and tend to occur in contexts suggesting non-ribosomal peptide synthesis, always adjacent to (occasionally fused with) a pyridoxal phosphate-dependent enzyme, SbnA. The pair appears to provide 2,3-diaminopropionate for biosynthesis of siderophores or other secondary metabolites. [Cellular processes, Biosynthesis of natural products] 327
20392 274872 TIGR03945 PLP_SbnA_fam 2,3-diaminopropionate biosynthesis protein SbnA. Members of this family include SbnA, a protein of the staphyloferrin B biosynthesis operon of Staphylococcus aureus. SbnA and SbnB together appear to synthesize 2,3-diaminopropionate, a precursor of certain siderophores and other secondary metabolites. SbnA is a pyridoxal phosphate-dependent enzyme. [Cellular processes, Biosynthesis of natural products] 304
20393 188461 TIGR03946 viomycin_VioC arginine beta-hydroxylase, Fe(II)/alpha-ketoglutarate-dependent. Members of this protein family are L-arginine beta-hydroxylase, members of a broader family of enzymes dependent on Fe(II), alpha-ketoglutarate, and molecular oxygen. Enzymes in the broader family but excluded by this model include clavaminate synthase, taurine dioxygenase, and prolyl-4-hydroxylase. [Cellular processes, Biosynthesis of natural products] 333
20394 188462 TIGR03947 viomycin_VioD capreomycidine synthase. Members of this family are the enzyme capreomycidine synthase, which performs the second of two steps in the biosynthesis of 2S,3R-capreomycidine from arginine. Capreomycidine is an unusual amino acid used by non-ribosomal peptide synthases (NRPS) to make the tuberactinomycin class of peptide antibiotic. The best characterized member is VioD from the biosynthetic pathway for viomycin. [Cellular processes, Biosynthesis of natural products] 359
20395 188463 TIGR03948 butyr_acet_CoA butyryl-CoA:acetate CoA-transferase. This enzyme represents one of at least two mechanisms for reclaiming CoA from butyryl-CoA at the end of butyrate biosynthesis (an important process performed by some colonic bacteria), namely transfer of CoA to acetate. An alternate mechanism transfers the butyrate onto inorganic phosphate, after which butyrate kinase transfers the phosphate onto ADP, creating ATP. [Energy metabolism, Fermentation] 445
20396 274873 TIGR03949 bact_IIb_cerein class IIb bacteriocin, lactobin A/cerein 7B family. Members of this protein family are described variably as bacteriocins per se, one chain of a two-chain bacteriocin, or bacteriocin enhancer proteins. All members of the seed alignment occur in paired gene contexts with another member of the same protein family. This family includes bacteriocins that appear not to undergo post-translational modification, other than cleavage at a Gly-Gly motif coupled to sec-independent export. For many members, the N-terminal bacteriocin cleavage motif region is recognized by TIGR01847. C-terminal to the cleavage motif, these proteins are hydrophobic and low in complexity, consistent with pore-forming activity as a mechanism of bacteriocin action. 45
20397 274874 TIGR03950 sidero_Fe_reduc siderophore ferric iron reductase, AHA_1954 family. Members of this protein family are 2Fe-2S cluster binding proteins, found regularly in the context of siderophore transporters. Members are distantly related to FhuF from E. coli, a ferric iron reductase linked to removal of iron from hydroxamate-type siderophores (). [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds] 223
20398 274875 TIGR03951 Fe_III_red_FhuF siderophore-iron reductase FhuF. Members of this protein family, including FhuF of E. coli, are siderophore ferric iron reductases that appear to play a role in iron removal from certain hydroxamate-type siderophores, including coprogen, ferrichrome, ferrioxamine B, and aerobactin. Genes occur regularly in siderophore transport and/or biosynthesis clusters. The C-terminus includes four Cys residues in a C-C-10(X)-C-X-X-C motif that binds a 2Fe-2S cluster. Family TIGR03950 is similar, especially in the C-terminal region, but likely acts on a different panel of siderophores. [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds] 182
20399 274876 TIGR03952 metzin_BF0631 zinc-dependent metalloproteinase lipoprotein, BF0631 family. Members of this protein family are zinc-dependent metalloproteinases, related to ulilysin and other members of the pappalysin family. Members occur as predicted lipoproteins and occur mostly in the genera Bacteroides and Prevotella. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 351
20400 274877 TIGR03953 rplD_bact 50S ribosomal protein L4, bacterial/organelle. Members of this protein family are ribosomal protein L4. This model recognizes bacterial and most organellar forms, but excludes homologs from the eukaryotic cytoplasm and from archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification] 188
20401 274878 TIGR03954 integ_memb_HG integral membrane protein. This model describes a strictly bacterial integral membrane domain of about 85 residues in length. It occurs in proteins that on rare occasions are fused to transporter domains such as the major facilitator superfamily domain. Of three invariant residues, two occur as a His-Gly dipeptide in the middle of three predicted transmembrane helices. [Unknown function, General] 85
20402 274879 TIGR03955 rSAM_HydG [FeFe] hydrogenase H-cluster radical SAM maturase HydG. This model describes the radical SAM protein HydG. It is part of an enzyme metallocenter maturation system, working together with GTP-binding protein HydF and another radical SAM enzyme, HydE, in H-cluster maturation in [FeFe] hydrogenases. [Protein fate, Protein modification and repair] 471
20403 274880 TIGR03956 rSAM_HydE [FeFe] hydrogenase H-cluster radical SAM maturase HydE. This model describes the radical SAM protein HydE, one of a pair of radical SAM proteins, along with GTP-binding protein HydF, for maturation of [Fe] hydrogenase in Chlamydomonas reinhardtii and numerous bacteria. [Protein fate, Protein modification and repair] 340
20404 188472 TIGR03957 rSAM_HmdB 5,10-methenyltetrahydromethanopterin hydrogenase cofactor biosynthesis protein HmdB. Members of this archaeal protein family are HmdB, a partially characterized radical SAM protein with an unusual CX5CX2C motif. Its gene flanks the H2-forming methylene-H4-methanopterin dehydrogenase gene hmdA, found in hydrogenotrophic methanogens. HmdB appears to act in biosynthesis of the novel cofactor of HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis] 317
20405 274881 TIGR03958 monoFe_hyd_HmdC 5,10-methenyltetrahydromethanopterin hydrogenase cofactor biosynthesis protein HmdC. Members of this protein family are HmdC, whose gene regularly occurs in the context of genes for HmdA (5,10-methenyltetrahydromethanopterin hydrogenase) and the radical SAM protein HmdB involved in biosynthesis of the HmdA cofactor. Bioinformatics suggests this protein, a homolog of eukaryotic fibrillarin, may be involved in biosynthesis of the guanylyl pyridinol cofactor in HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis] 505
20406 274882 TIGR03959 hyd_TM1266 putative iron-only hydrogenase system regulator. Members of this protein family occur as part of a system for producing iron-only hydrogenases, dependent on radical SAM proteins HydE and HydG and GTPase HydF. One member of this family, TM_1266 from Thermotoga maritima, has a known crystal structure. The small size, about 80 residues, and a distant relationship to the nickel regulator NikR of the CopG transcriptional regulator family suggest a role as a transcription factor. [Regulatory functions, DNA interactions] 76
20407 188475 TIGR03960 rSAM_fuse_unch radical SAM family uncharacterized protein. This model describes a radical SAM protein, or protein region, regularly found paired with or fused to a region described by TIGR03936. PSI-BLAST analysis of TIGR03936 suggests a relationship to the tRNA pseudouridine synthase TruA, suggesting that this system may act in RNA modification. [Unknown function, Enzymes of unknown specificity] 605
20408 188476 TIGR03961 rSAM_PTO1314 archaeal radical SAM protein, PTO1314 family. Members of this protein family average about 340 residues in length, with a radical SAM domain in the N-terminal 200 residues. The taxonomic distribution is restricted to non-methanogenic archaea, including Picrophilus torridus (locus PTO1314), Sulfolobus sp., Thermoplasma sp., and Metallosphaera sedula. The gene neighborhood is not conserved, and the function of this family is unknown. [Unknown function, Enzymes of unknown specificity] 332
20409 188477 TIGR03962 mycofact_rSAM mycofactocin radical SAM maturase. Members of this family are uncharacterized radical SAM proteins from Mycobacterium tuberculosis and many other Actinobacteria, as well as some deltaproteobacteria (e.g. Geobacter uraniireducens), firmicutes (Pelotomaculum thermopropionicum and Desulfotomaculum acetoxidans), and Chloroflexi (Thermomicrobium roseum DSM 5159 and Sphaerobacter thermophilus DSM 20745). They resemble several characterized radical SAM enzymes of peptide modification (PqqE, AlbA), and are always found next to the proposed target, TIGR03969, the putative mycofactocin precursor. [Unknown function, Enzymes of unknown specificity] 339
20410 188478 TIGR03963 rSAM_QueE_Clost putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE, clostridial. Members of this radical SAM domain protein family appear to be the Clostridial form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in preQ0 operons in species that lack members of related protein family TIGR03365. [Protein synthesis, tRNA and rRNA base modification] 219
20411 274883 TIGR03964 mycofact_creat mycofactocin system creatininase family protein. Members of this protein family are uncharacterized Actinobacterial proteins, with homology to creatinine amidohydrolase from Pseudomonas. Members occur only in the context of the mycofactocin system. [Unknown function, Enzymes of unknown specificity] 228
20412 274884 TIGR03965 mycofact_glyco mycofactocin system glycosyltransferase. Members of this protein family are putative glycosyltransferases, members of pfam00535 (glycosyl transferase family 2). Members appear mostly in the Actinobacteria, where they appear to be part of a system for converting a precursor peptide (TIGR03969) into a novel redox carrier designated mycofactocin. A radical SAM enzyme, TIGR03962, is proposed to be a key maturase for mycofactocin. 466
20413 274885 TIGR03966 actino_HemFlav heme/flavin dehydrogenase, mycofactocin system. Members of this protein family possess an N-terminal heme-binding domain and C-terminal flavodehydrogenase domain, and share homology to yeast flavocytochrome b2, to E. coli L-lactate dehydrogenase [cytochrome], to (S)-mandelate dehydrogenase, etc. This enzyme appears only in the context of the mycofactocin system. Interestingly, it is absent from the four species detected so far with mycofactocin but without an F420 biosynthesis system. 385
20414 274886 TIGR03967 mycofact_MftB putative mycofactocin binding protein MftB. Families TIGR03969 and TIGR03962 describe, respectively, the putative mycofactocin precursor and its cognate radical SAM peptide maturase. This small protein family appears in the same sporadically distributed cassette and may serve as a scaffolding protein during mycofactocin maturation or as a carrier protein for the mature product, a putative novel redox carrier. A feature of mycofactocin-encoding genomes is co-clustering with sets of NAD-binding oxidoreductases in which the NAD is not exchangeable. Therefore it is proposed that mature mycofactocin, bound by a member of this family as a carrier protein, docks with the nicotinoprotein to allow electron transfer. Mediation of electron transfer through this system would define a segregated redox pool. [Unknown function, General] 81
20415 188483 TIGR03968 mycofact_TetR mycofactocin system transcriptional regulator. Members of this family are TetR family putative transcriptional regulators that occur in genome contexts near proteins of the mycofactocin system. These include the precursor peptide (TIGR03969), a radical SAM peptide maturase (TIGR03962), and a putative carrier protein (TIGR03967). [Regulatory functions, DNA interactions] 190
20416 274887 TIGR03969 mycofactocin mycofactocin precursor. Members of this protein family occur in Mycobacterium tuberculosis and many other Actinobacteria, as well as some delta-Proteobacteria (e.g. Geobacter uraniireducens), Firmicutes (Pelotomaculum thermopropionicum and Desulfotomaculum acetoxidans), and Chloroflexi (Thermomicrobium roseum DSM 5159 and Sphaerobacter thermophilus DSM 20745). Members sometimes are missed during gene model identification but always occur in the vicinity of radical SAM (rSAM) enzyme TIGR03962, which resembles several rSAM enzymes of peptide maturation (PqqE, AlbA). Species with this protein always carry members of unusual clades of nicotinoproteins that are restricted to mycofactocin-containing species and in which the NAD, when studied, has appeared non-exchangeable. It is proposed that the mature form of mycofactocin is a novel redox carrier for a segregated redox pool. 23
20417 274888 TIGR03970 Rv0697 dehydrogenase, Rv0697 family. This model describes a set of dehydrogenases belonging to the glucose-methanol-choline oxidoreductase (GMC oxidoreductase) family. Members of the present family are restricted to Actinobacterial genome contexts containing also members of families TIGR03962 and TIGR03969 (the mycofactocin system), and are proposed to be uniform in function. 487
20418 274889 TIGR03971 SDR_subfam_1 SDR family mycofactocin-dependent oxidoreductase. Members of this protein subfamily are putative oxidoreductases belonging to the larger SDR family. All members occur in genomes that encode a cassette for the biosynthesis of mycofactocin, a proposed electron carrier of a novel redox pool. Characterized members of this family are described as NDMA-dependent, meaning that a blue aniline dye serving as an artificial electron acceptor is required for members of this family to cycle in vitro, since the bound NAD residue is not exchangeable. See EC 1.1.99.36. [Unknown function, Enzymes of unknown specificity] 270
20419 274890 TIGR03972 rSAM_TYW1 wyosine biosynthesis protein TYW1. Members of this protein family are the archaeal protein TYW1, a radical SAM protein that catalyzes the second step in creating the wye-bases, wyosine and derivatives such as wybutosine, for tRNA base modification. [Protein synthesis, tRNA and rRNA base modification] 297
20420 274891 TIGR03973 six_Cys_in_45 six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein, family TIGR03974, that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase. 43
20421 274892 TIGR03974 rSAM_six_Cys SCIFF radical SAM maturase. Members of this protein family are predicted radical SAM enzymes universally associated with Six Cysteines in Forty-Five protein, or SCIFF (family TIGR03973), a predicted ribosomal natural product precursor that is nearly universal in the class Clostridia. Similarity of this family to radical SAM maturases (PqqE and subtilosin A maturase) found in the vicinity of other peptide precursors suggests this protein is the SCIFF radical SAM maturase. [Cellular processes, Biosynthesis of natural products] 451
20422 274893 TIGR03975 rSAM_ocin_1 ribosomal peptide maturation radical SAM protein 1. Models TIGR03793 and TIGR03798 describe bacteriocin precursor families that often occur in large paralogous families and are subject to various modifications, including by LanM family lantibiotic synthases and by cyclodehydratases. This model represents a radical SAM protein family that regularly occurs in the context of these bacteriocins, and may occur where other familiar peptide modification enzymes are absent. [Cellular processes, Toxin production and resistance] 606
20423 274894 TIGR03976 chp_LLNDYxLRE His-Xaa-Ser system protein HxsD. This rare conserved hypothetical protein of small size occurs exclusively, and perhaps universally, in the context of a pair of (uncharacterized) radical SAM proteins, TIGR03977 and TIGR03978. Many members of this family have invariant motifs LYW and LLNDYxLRE, but PSI-BLAST starting from family members well below 20 % pairwise sequence identity to this group eventually brings in the entire family as modeled here. The family TIGR03979 represents the fourth regularly conserved member of this system. 90
20424 274895 TIGR03977 rSAM_pair_HxsC His-Xaa-Ser system radical SAM maturase HxsC. This model describes the downstream member, HxsC, of a pair of uncharacterized radical SAM proteins, regularly found in the context of a small protein with four or more repeats of the tripeptide His-Xaa-Ser (HXS). This enzyme appears to be part of a peptide modification system. 292
20425 274896 TIGR03978 rSAM_paired_1 His-Xaa-Ser system radical SAM maturase HxsB. This model describes the upstream member, HxsB, of a pair of uncharacterized radical SAM proteins, regularly found in the context of a small protein with four or more repeats of the tripeptide His-Xaa-Ser (HXS). This enzyme appears to be part of a peptide modification system. 466
20426 274897 TIGR03979 His_Ser_Rich His-Xaa-Ser repeat protein HxsA. Members of this protein family share two defining regions. One is a histidine/serine-rich cluster, typically H-R-S-H-S-S-H-R-S-H-S-S-H. Members are found always in the context of a pair of radical SAM proteins, HxsB and HxsC, and a fourth protein HxsD. The system is predicted to perform peptide modifications, likely in the His-Xaa-Ser region, to produce some uncharacterized natural product. 186
20427 274898 TIGR03980 prismane_assoc hybrid cluster protein-associated redox disulfide domain. Members of this protein family resemble the domain of unknown function DUF1858 described by pfam08984, but all members contain an apparent redox-active disulfide. In at least one member protein, a cysteine in the CXXC motif is substituted by a selenocysteine. Most member proteins consist of this domain only, but a few members are fused to or adjacent to members of the hybrid-cluster (prismane) family or the nitrite/sulfite reductase family. [Energy metabolism, Electron transport] 58
20428 188496 TIGR03981 SAM_quin_mod His-Xaa-Ser system putative quinone modification maturase. One clue for the interpretation of this protein family is homology to the MauG protein (see TIGR03791) involved in the tryptophan tryptophylquinone post-translational modification of methylamine dehydrogenase light (beta) chain. The other is occurrence only in a five gene context in which two members are radical SAM proteins (TIGR03977 and TIGR03978) also likely involved in post-translational modification. 411
20429 188497 TIGR03982 TIGR03982 His-Xaa-Ser system protein, TIGR03982 family. Members of this rare protein family occur in the presence of TIGR03981 and TIGR03979, which in turn occur only in the context of radical SAM protein families TIGR03977 and TIGR03978. The function is unknown. 117
20430 274899 TIGR03983 cas1_MYXAN CRISPR-associated endonuclease Cas1, subtype MYXAN. Members of this protein family are the Cas1 endonuclease, or Cas1 domain in Cas4/Cas1 fusion proteins, of the MYXAN subtype of CRISPR/Cas systems. These systems typically feature repeats and spacers each about 36 base pairs in length. Species with this type of CRISPR system include Myxococcus xanthus, Cyanothece sp., Leptospira interrogans, Sorangium cellulosum, Anabaena variabilis ATCC 29413, etc. 347
20431 274900 TIGR03984 TIGR03984 CRISPR-associated protein, TIGR03984 family. Members of this protein family are found exclusively in CRISPR-containing organisms, in operon contexts with RAMP (repeat-associated mystery protein) proteins also linked to CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). 147
20432 274901 TIGR03985 TIGR03985 CRISPR-associated protein, TIGR03985 family. Members of this protein family belong to CRISPR-associated (Cas) gene clusters. The majority of members are Cyanobacterial. 248
20433 274902 TIGR03986 TIGR03986 CRISPR-associated protein. Members of this protein family, part of the larger RAMP family, are found exclusively in species with CRISPR systems, in local contexts containing other RAMP (Repeat-Associated Mystery Proteins). 562
20434 274903 TIGR03987 TIGR03987 TIGR03987 family protein. Conserved hypothetical protein 120
20435 274904 TIGR03988 antisig_RsrA mycothiol system anti-sigma-R factor. Members of this family are the anti-sigma-R factor RsrA, which contains a CXXC motif as a thiol-disulphide redox switch. It interacts with sigma-R. It regulates and is regulated by the mycothiol system, which occurs in many actinomycetes. [Transcription, Transcription factors] 77
20436 274905 TIGR03989 Rxyl_3153 NDMA-dependent alcohol dehydrogenase, Rxyl_3153 family. This model describes a clade within the family pfam00107 of zinc-binding dehydrogenases. The family pfam00107 contains class III alcohol dehydrogenases, including enzymes designated S-(hydroxymethyl)glutathione dehydrogenase and NAD/mycothiol-dependent formaldehyde dehydrogenase. Members of the current family occur only in species that contain the very small protein mycofactocin (TIGR03969), a possible cofactor precursor, and radical SAM protein TIGR03962. We name this family for Rxyl_3153, where the lone member of the family co-clusters with these markers in Rubrobacter xylanophilus. [Unknown function, Enzymes of unknown specificity] 369
20437 274906 TIGR03990 Arch_GlmM phosphoglucosamine mutase. The MMP1680 protein from Methanococcus maripaludis has been characterized as the archaeal protein responsible for the second step of UDP-GlcNAc biosynthesis. This GlmM protein catalyzes the conversion of glucosamine-6-phosphate to glucosamine-1-phosphate. The first-characterized bacterial GlmM protein is modeled by TIGR01455. These two families are members of the larger phosphoglucomutase/phosphomannomutase family (characterized by three domains: pfam02878, pfam02879 and pfam02880), but are not nearest neighbors to each other. This model also includes a number of sequences from non-archaea in the Bacteroides, Chlorobi, Chloroflexi, Planctomycetes and Spirochaetes lineages. Evidence supporting their inclusion in this equivalog as having the same activity comes from genomic context and phylogenetic profiling. A large number of these organisms are known to produce exo-polysaccharide and yet only appeared to contain the GlmS enzyme of the GlmSMU pathway for UDP-GlcNAc biosynthesis (GenProp0750). In some organisms including Leptospira, this archaeal GlmM is found adjacent to the GlmS as well as a putative GlmU non-orthologous homolog. Phylogenetic profiling of the GlmS-only pattern using PPP identifies members of this archaeal GlmM family as the highest-scoring result. [Central intermediary metabolism, Amino sugars] 443
20438 274907 TIGR03991 alt_bact_glmU UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. The MJ_1101 protein from Methanococcus jannaschii has been characterized as the GlmU enzyme catalyzing the final two steps of UDP-GlcNAc biosynthesis. Homologs of this enzyme are identified in a number of bacterial organisms and modeled here. A number of these are observed in proximity to the GlmS and GlmM genes, and phylogenetic profiling by PPP identifies the LEPBI_I0518 gene in Leptospira biflexa as a likely Glm-system candidate. Multiple sequence alignments of these bacterial homologs with their archaeal counterparts reveal significant structural differences, necessitating the construction of separate models. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars] 337
20439 274908 TIGR03992 Arch_glmU UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. The MJ_1101 protein from Methanococcus jannaschii has been characterized as the GlmU enzyme catalyzing the final two steps of UDP-GlcNAc biosynthesis. Many of the genes identified by this model are in proximity to the GlmS and GlmM genes and are also presumed to be GlmU. However, some archaeal genomes contain multiple closely-related homologs from this family and it is not clear what the substrate specificity is for each of them. 393
20440 274909 TIGR03993 hydrog_HybE [NiFe] hydrogenase assembly chaperone, HybE family. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain. 143
20441 274910 TIGR03994 rSAM_HemZ coproporphyrinogen dehydrogenase HemZ. Members of this radical SAM protein family are HemZ, a protein involved in coproporphyrinogen III decarboxylation. Alternative names for this enzyme (EC 1.3.99.22) include coproporphyrinogen dehydrogenase and oxygen-independent coproporphyrinogen III oxidase. The family is related to, but distinct from HemN, and in Bacillus subtilis was shown to be connected to peroxide stress and catalase formation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 401
20442 274911 TIGR03995 target_X_rSAM putative rSAM target protein, CGCGG family. Members of this family of small proteins, approx. 100 amino acids in length, co-occur with a subfamily of radical SAM protein in several species in the Halobacteria and in Bacillus. The radical SAM protein belongs to a branch in which most characterized members act on peptide substrates. The lack of homology of this family to any known enzyme and the distinctive C-terminal region motif, with the common modification target residue Cys flanked by sterically permissive Gly residues, suggest that members are modification targets of the radical SAM protein. 84
20443 188511 TIGR03996 mycofact_OYE_1 mycofactocin system FadH/OYE family oxidoreductase 1. The yeast protein called old yellow enzyme and FadH from Escherichia coli (2,4-dienoyl CoA reductase) are enzymes with 4Fe-4S, FMN, and FAD prosthetic groups, and interact with NADPH as well as substrate. Members of this related protein family occur in the vicinity of the putative mycofactocin biosynthesis operon in a number of Actinobacteria such as Frankia sp. and Rhodococcus sp. The function of this oxidoreductase is unknown. 633
20444 274912 TIGR03997 mycofact_OYE_2 mycofactocin system FadH/OYE family oxidoreductase 2. The yeast protein called old yellow enzyme and FadH from Escherichia coli (2,4-dienoyl CoA reductase) are enzymes with 4Fe-4S, FMN, and FAD prosthetic groups, and interact with NADPH as well as substrate. Members of this related protein family occur in the vicinity of the putative mycofactocin biosynthesis operon in a number of Actinobacteria such as Frankia sp. and Rhodococcus sp., in Pelotomaculum thermopropionicum SI (Firmicutes), and in Geobacter uraniireducens Rf4 (Deltaproteobacteria). The function of this oxidoreductase is unknown. 644
20445 274913 TIGR03998 thiol_BshC bacillithiol biosynthesis cysteine-adding enzyme BshC. Members of this protein family are BshC, an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 528
20446 274914 TIGR03999 thiol_BshA N-acetyl-alpha-D-glucosaminyl L-malate synthase BshA. Members of this protein family are BshA, a glycosyltransferase required for bacillithiol biosynthesis. This enzyme combines UDP-GlcNAc and L-malate to form N-acetyl-alpha-D-glucosaminyl L-malate. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 374
20447 188515 TIGR04000 thiol_BshB2 bacillithiol biosynthesis deacetylase BshB2. Members of this protein family are BshB2 (YojG), an enzyme of bacillithiol biosynthesis; either BshB1 (YpjG) or BshB2 must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 217
20448 274915 TIGR04001 thiol_BshB1 bacillithiol biosynthesis deacetylase BshB1. Members of this protein family are BshB1 (YpjG), an enzyme of bacillithiol biosynthesis; either BshB1 or BshB2 (YojG) must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 226
20449 188517 TIGR04002 TIGR04002 TIGR04002 family protein. TIGR04002 family proteins, a division within DUF1393 (pfam07155), occur strictly as part of a tandem gene pair with an uncharacterized radical SAM protein. [Unknown function, General] 151
20450 188518 TIGR04003 rSAM_BssD [benzylsuccinate synthase]-activating enzyme. Members of this radical SAM protein family are [benzylsuccinate synthase]-activating enzyme, a glycyl radical active site-creating enzyme related to [pyruvate formate-lyase]-activating enzyme and additional uncharacterized homologs activating additional glycyl radical-containing enzymes. [Protein fate, Protein modification and repair] 314
20451 188519 TIGR04004 WcaM colanic acid biosynthesis protein WcaM. This protein of uncharacterized function is the final gene in the conserved colanic acid biosynthesis cluster observed in Enterobacteriaceae. 464
20452 188520 TIGR04005 wcaL colanic acid biosynthesis glycosyltransferase WcaL. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 406
20453 188521 TIGR04006 wcaK colanic acid biosynthesis pyruvyl transferase WcaK. This gene is the pyruvyl transferase involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 426
20454 188522 TIGR04007 wcaI colanic acid biosynthesis glycosyl transferase WcaI. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 407
20455 188523 TIGR04008 WcaF colanic acid biosynthesis acetyltransferase WcaF. This gene is one of the acetyltransferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. This acetyltransferase is believed to catalyze the addition of the acetyl group that is attached through an O linkage to the first fucosyl residue of the colanic acid repetitive unit (E unit). 180
20456 188524 TIGR04009 wcaE colanic acid biosynthesis glycosyl transferase WcaE. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 248
20457 188525 TIGR04010 WcaD putative colanic acid polymerase WcaD. This membrane protein is believed to function as the colanic acid repeating unit polymerase (in an analogous fashion to wzy proteins in O-antigen polymerization). 404
20458 274916 TIGR04011 poly_gGlu_PgsC poly-gamma-glutamate biosynthesis protein PgsC/CapC. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsC, covering both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsC binds tightly to PgsB, which has been shown to have poly-gamma-glutamate synthase activity. [Cell envelope, Other] 132
20459 188527 TIGR04012 poly_gGlu_PgsB poly-gamma-glutamate synthase PgsB/CapB. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsB, a nomenclature that covers both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsB has been shown to have poly-gamma-glutamate synthase activity by itself but is bound tightly by PgsC (TIGR04011). [Cell envelope, Other] 366
20460 274917 TIGR04013 B12_SAM_MJ_1487 B12-binding domain/radical SAM domain protein, MJ_1487 family. Members of this family have both a B12 binding homology domain (pfam02310) and a radical SAM domain (pfam04055), and occur only once per genome. Some species with members of this family have a related protein with similar domain architecture. This protein occurs largely in archaeal methanogens but also in a few bacteria, including Thermotoga maritima and Myxococcus xanthus. [Unknown function, Enzymes of unknown specificity] 383
20461 274918 TIGR04014 B12_SAM_MJ_0865 B12-binding domain/radical SAM domain protein, MJ_0865 family. Members of this family have both a B12 binding homology domain (pfam02310) and a radical SAM domain (pfam04055), and occur only once per genome. This protein occurs so far only in methanogenic archaea. Some species with members of this family have a related protein with similar domain architecture (see TIGR04013). [Unknown function, Enzymes of unknown specificity] 434
20462 274919 TIGR04015 WcaC colanic acid biosynthesis glycosyl transferase WcaC. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 405
20463 188531 TIGR04016 WcaB colanic acid biosynthesis acetyltransferase WcaB. This gene is one of the acetyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 146
20464 274920 TIGR04017 WcaA colanic acid biosynthesis glycosyl transferase WcaA. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. 279
20465 188533 TIGR04018 Bthiol_YpdA putative bacillithiol system oxidoreductase, YpdA family. Members of this protein family, including YpdA from Bacillus subtilis, are apparent oxidoreductases present only in species with an active bacillithiol system. They have been suggested actually to be thiol disulfide oxidoreductases (TDOR), although the evidence is incomplete. [Unknown function, Enzymes of unknown specificity] 316
20466 274921 TIGR04019 B_thiol_YtxJ bacillithiol system protein YtxJ. Members of this protein family, including YtxJ from Bacillus subtilis, occur in species that encode proteins for synthesizing bacillithiol. The protein is described as thioredoxin-like, while another bacillithiol-associated protein, YpdA (TIGR04018), is described as thioredoxin reductase-like. [Unknown function, Enzymes of unknown specificity] 105
20467 274922 TIGR04020 seco_metab_LLM natural product biosynthesis luciferase-like monooxygenase domain. This model describes a subfamily within the bacterial luciferase-like monooxygenase (LLM) family that regularly occurs within large non-ribosomal protein synthases/polyketide synthases, but also as small proteins. The LLM family includes members that bind either FMN or F420, and FMN is more likely in this case because many members are from species that lack F420 biosynthesis capability. An example member is the MupA protein of mupirocin biosynthesis in Pseudomonas fluorescens NCIMB 10586. 341
20468 274923 TIGR04021 LLM_DMSO2_sfnG dimethyl sulfone monooxygenase SfnG. This family of FMNH2-dependent members of the luciferase-like monooxygenase (LLM) family includes SfnG, a monooxygenase that converts dimethylsulphone (DMSO2) to methanesulphonate. This step can be followed immediately by methanesulfonate sulfonatase (an alkanesulfonate monooxygenase - see TIGR03565) for the FMNH2-dependent conversion to an inorganic form. [Central intermediary metabolism, Sulfur metabolism] 350
20469 274924 TIGR04022 sulfur_SfnB sulfur acquisition oxidoreductase, SfnB family. Members of this protein family belong to the greater family of acyl-CoA dehydrogenases. This family includes the sulfate starvation induced protein SfnB of Pseudomonas putida strain DS1, which is both encoded nearby to and phylogenetically closely correlated with the dimethyl sulphone monooxygenase SfnG. This family shows considerable sequence similarity to the Rhodococcus dibenzothiophene desulfurization enzyme DszC, although that enzyme falls outside of the scope of this family. [Central intermediary metabolism, Sulfur metabolism] 391
20470 274925 TIGR04023 PPOX_MSMEG_5819 PPOX class F420-dependent enzyme, MSMEG_5819/OxyR family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. This subfamily within the PPOX family occurs in at least 19 distinct species of F420 producers and is likely to bind F420 rather than FMN. The member OxyR was shown to use F420 to catalyze a C5a-C11a reduction in oxytetracycline biosynthesis. [Unknown function, Enzymes of unknown specificity] 115
20471 188539 TIGR04024 F420_NP1902A coenzyme F420-dependent oxidoreductase, NP1902A family. This subfamily of the luciferase-like monooxygenases is restricted to the order Halobacteriales. SIMBAL analysis strongly suggests this oxidoreductase binds coenzyme F420 rather than FMN. Occasional annotations of members of this family as N5,N10-methylenetetrahydromethanopterin reductase appear to represent overly aggressive transfer of annotation. [Unknown function, Enzymes of unknown specificity] 330
20472 274926 TIGR04025 PPOX_FMN_DR2398 PPOX class probable FMN-dependent enzyme, DR_2398 family. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. This subfamily consists of proteins mostly from species that lack the capability to synthesize F420, and therefore most likely all bind FMN. 197
20473 274927 TIGR04026 PPOX_FMN_cyano PPOX class probable FMN-dependent enzyme, alr4036 family. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. The subfamily described here is widespread in Cyanobacteria and plants, and is named for alr4036 from Nostoc sp. PCC 7120. The family consists mostly of proteins from species that lack the capability to synthesize F420, so it is probable that all members bind FMN rather than F420. [Unknown function, Enzymes of unknown specificity] 185
20474 274928 TIGR04027 LLM_KPN_01858 putative FMN-dependent luciferase-like monooxygenase, KPN_01858 family. This protein family consists of luciferase-like monooxygenases (LLM), and includes KPN_01858 from Klebsiella pneumoniae as a representative member. Most are from species that lack F420 biosynthesis, so the family is likely to bind FMN as its cofactor. This family is closely associated with a binding protein-dependent ABC transporter, suggesting a role in catabolism. [Unknown function, Enzymes of unknown specificity] 326
20475 274929 TIGR04028 SBP_KPN_01854 ABC transporter substrate binding protein, KPN_01854 family. Members of this protein family are ABC transporter substrate-binding proteins related to KPN_01854 from Klebsiella pneumoniae, and occur in both Gram-positive and Gram-negative species. This transport protein family is closely associated with a putative FMN-dependent luciferase-like monooxygenase of unknown function (TIGR04027), as well as with the other proteins of its transporter complex. [Transport and binding proteins, Unknown substrate] 509
20476 213885 TIGR04029 CMD_Avi_7170 CMD domain protein, Avi_7170 family. Sequences in this family occur as the N-terminal domain of a fusion protein with a C-terminal peroxidase-like protein, or as a discrete protein encoded next to a peroxidase-like protein. The two partners regularly are encoded near to, and in the same genomes as, a putative FMN-dependent luciferase-like monooxygenase (LLM) (TIGR04027), and an ABC transporter in which TIGR04028 models the substrate-binding protein. CDD identifies this family as falling within the CMD superfamily that includes carboxymuconolactone decarboxylase. 174
20477 188545 TIGR04030 perox_Avi_7169 alkylhydroperoxidase domain protein, Avi_7169 family. Members of this family represent a narrow clade that falls within a family of alkylhydroperoxidase-related proteins, fused to or adjacent to a sequence described by TIGR04029. These two partners occur almost always in the context of a putative FMN-dependent luciferase-like monooxygenase (LLM) (TIGR04027), and an ABC transporter in which TIGR04028 models the substrate-binding protein. 185
20478 188546 TIGR04031 Htur_1727_fam rSAM-partnered protein, Htur_1727 family. Members of this protein family show homology to the putative PaaH (or PaaB) subunit of the phenylacetate-CoA oxygenase complex. However, all members are found in the Halobacteriales in the vicinity of a radical SAM protein homologous to the PqqE protein of pyrroloquinoline quinone (PQQ) biosynthesis. Members are well-conserved to about residue 75, but then become low-complexity and hypervariable. 71
20479 274930 TIGR04032 toxin_SdpC antimicrobial peptide, SdpC family. This protein family contains the antimicrobial peptide SdpC, used in cannibalistic killing by Bacillus subtilis, and related sequences in species as distant as Myxococcus xanthus from the Deltaproteobacteria. A conserved gene neighborhood includes proteins associated with immunity. 172
20480 274931 TIGR04033 export_SdpB antimicrobial peptide system protein, SdpB family. Members of this protein family resemble SdpB (Sporulation Delaying Protein B), an integral membrane protein associated with production of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus. 276
20481 274932 TIGR04034 export_SdpA antimicrobial peptide system protein, SdpA family. Members of this protein family resemble SdpA (Sporulation Delaying Protein A), a protein associated with production and export of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus, Stigmatella aurantiaca DW4/3-1, Streptomyces sp. ACTE, etc. 156
20482 274933 TIGR04035 glucan_65_rpt glucan-binding repeat. This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosylhydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473. 62
20483 274934 TIGR04036 LLM_CE1758_fam putative luciferase-like monooxygenase, FMN-dependent, CE1758 family. This tightly conserved subfamily of the bacterial luciferase-like monooxygenase (LLM) family, with members showing > 60 % pairwise sequence identity, includes proteins from both species with and species without the ability to make coenzyme F420. Therefore, the likely cofactor is FMN rather than F420. The presence of three members in Kineococcus radiotolerans SRS30216 and two in Saccharopolyspora erythraea NRRL 2338 suggest closely related (subfamily) rather than exactly conserved (equivalog) function. Gene neighborhoods around members are not conserved. [Unknown function, Enzymes of unknown specificity] 355
20484 274935 TIGR04037 LLM_duo_CE1759 LLM-partnered FMN reductase, CE1759 family. This family represents a distinct clade within pfam03358. That family includes enzymes such as the NADH-dependent FMN reductase MsuE. Members of the present family regularly co-occur in genomes, typically as gene pairs, with members of TIGR04036, a probable FMN-dependent member of the bacterial luciferase-like monooxygenase (LLM) family. At least one member, RF|YP_001509627.1 from Frankia sp. EAN1pec, is fused to the LLM protein. The function of these gene pairs is unknown. 198
20485 274936 TIGR04038 tatD_link_rSAM radical SAM protein, TatD family-associated. Members of this family are radical SAM proteins found in about 5 percent of microbial genomes. A portion occur as gene fusions with, or adjacent to, members of the TatD family of hydrolases (pfam01026). The TatD family may have several paralogs per genome, including TatD itself from E. coli (a soluble protein not actually part of the twin-arginine translocation complex), which appears to act in quality control for TAT, directing turnover of misfolded TAT substrates. The functions of TatD family hydrolases in general (other than TatD itself, which may be exceptional within its larger family), and of this radical SAM domain protein modeled here, are unknown. 191
20486 188554 TIGR04039 MXAN_0977_Heme2 di-heme enzyme, MXAN_0977 family. This model describes a subfamily of di-heme proteins related to the di-heme cytochrome c peroxidase and to MauG (methylamine utilization G), an enzyme that performs a tryptophan tryptophylquinone modification to the methylamine dehydrogenase light chain. 336
20487 274937 TIGR04040 glycyl_YjjI glycine radical enzyme, YjjI family. Members of this family are homologs to enzymes known to undergo activation by a radical SAM protein to create an active site glycyl radical. This family appears to be activated by the YjjW radical SAM protein, usually encoded by an adjacent gene. [Unknown function, Enzymes of unknown specificity] 497
20488 274938 TIGR04041 activase_YjjW glycine radical enzyme activase, YjjW family. Members of this family are radical SAM enzymes, designated YjjW in E. coli, that are paired with and appear to activate a glycyl radical enzyme of unknown function, designated YjjI. This activase and its target are found in Clostridial species as well as E. coli and cousins. Members of this family may be misannotated as pyruvate formate lyase activating enzyme. [Protein fate, Protein modification and repair] 276
20489 274939 TIGR04042 MSMEG_0570_fam MSMEG_0570 family protein. This small protein, about 90 residues long, has no detectable homologs outside the set used to characterize this model. Member proteins serve as markers for an eight-gene region whose overall function is unknown. One member of the cluster is a radical SAM protein with some similarity to enzymes of cofactor biosynthesis, another a glycosyltransferase, several hydrolases or oxidoreductases, and several unknown. 90
20490 274940 TIGR04043 rSAM_MSMEG_0568 radical SAM protein, MSMEG_0568 family. Members of this protein family are radical SAM proteins related to MSMEG_0568 from Mycobacterium smegmatis. Members occur within 8-gene operons in species as diverse as M. smegmatis, Rhizobium leguminosarum, Synechococcus elongatus, and Sorangium cellulosum. The function of the operon is unknown, but similarity of MSMEG_0568 to some cofactor biosynthesis radical SAM proteins suggests a similar biosynthetic function. [Unknown function, Enzymes of unknown specificity] 354
20491 188559 TIGR04044 MSMEG_0572_fam MSMEG_0572 family protein. This model describes a family of proteins with remote similarity to the DsrE/DsrF-like family (see pfam02635). All members are found in a context of at least seven genes that includes a radical SAM protein, suggesting biosynthesis. The system is sparsely but broadly distributed in bacteria, including Actinobacteria, Proteobacteria, and Cyanobacteria. 159
20492 274941 TIGR04045 MSMEG_0567_GNAT putative N-acetyltransferase, MSMEG_0567 N-terminal domain family. Members of this family belong to the GNAT family (pfam00583), in which numerous characterized examples, though not all, are shown to be N-acetyltransferases or to interact with acetyl-CoA. This family occurs in a sparsely distributed biosynthetic cluster that occurs in Actinobacteria, Cyanobacteria, and Proteobacteria. 153
20493 274942 TIGR04046 MSMEG_0569_nitr flavin-dependent oxidoreductase, MSMEG_0569 family. Members of this protein family belong to a conserved seven-gene biosynthetic cluster found sparsely in Cyanobacteria, Proteobacteria, and Actinobacteria. Distant homologies to characterized proteins suggest that members are enzymes dependent on a flavinoid cofactor. 400
20494 274943 TIGR04047 MSMEG_0565_glyc glycosyltransferase, MSMEG_0565 family. A conserved gene cluster found sporadically from Actinobacteria to Proteobacteria to Cyanobacteria features a radical SAM protein, an N-acetyltransferase, an oxidoreductase, and two additional proteins whose functional classes are unclear. The metabolic role of the cluster is probably biosynthetic. This glycosyltransferase, named from member MSMEG_0565 from Mycobacterium smegmatis, occurs in most but not all instances of the cluster. [Unknown function, Enzymes of unknown specificity] 373
20495 188563 TIGR04048 nitrile_sll0784 putative nitrilase, sll0784 family. This family represents a subfamily of C-N bond-cleaving hydrolases (see pfam00795). Members occur as part of a cluster of genes in a probable biosynthetic cluster that contains a radical SAM protein, an N-acetyltransferase, a flavoprotein, several proteins of unknown function, and usually a glycosyltransferase. Members are closely related to a characterized aliphatic nitrilase from Rhodopseudomonas rhodochrous J1, for which an active site Cys was found at position 165. [Unknown function, Enzymes of unknown specificity] 301
20496 188564 TIGR04049 AIR_rel_sll0787 AIR synthase-related protein, sll0787 family. Members of this family include sll0787 from Synechocystis sp. PCC 6803 and resemble the C-terminal region of MSMEG_0567 from Mycobacterium smegmatis, where the N-terminal is a GNAT family N-acetyltransferase. The conserved cluster is found broadly (Cyanobacteria, Proteobacteria, Actinobacteria) in about 8 percent of genomes and appears to be biosynthetic. The product is unknown. [Unknown function, Enzymes of unknown specificity] 316
20497 274944 TIGR04050 MSMEG_0567_Cter AIR synthase-related protein, MSMEG_0567 C-terminal family. Members of this family include the C-terminal region of MSMEG_0567 from Mycobacterium smegmatis, where the N-terminal is a GNAT family N-acetyltransferase, and resemble the full length of sll0787 from Synechocystis sp. PCC 6803. The conserved cluster that contains these is found broadly (Cyanobacteria, Proteobacteria, Actinobacteria) in about 8 percent of genomes and appears to be biosynthetic. The product is unknown. [Unknown function, Enzymes of unknown specificity] 296
20498 188566 TIGR04051 rSAM_NirJ heme d1 biosynthesis radical SAM protein NirJ. Heme d1 occurs in the cytochrome cd1 subunit of nitrite reductase in species such as Pseudomonas stutzeri. NirJ is a radical SAM protein involved in its biosynthesis. In a number of species, distinct genes NirJ1 and NirJ2 are found in similar genomic regions; this model describes authentic NirJ from genomes with NirJ only. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 354
20499 188567 TIGR04052 AZL_007920_fam AZL_007920/MXAN_0976 family protein. Members of this rare protein family regularly occur next to a member of the MXAN_0977 subfamily (TIGR04039) of the di-heme cytochrome c peroxidase/MauG family (pfam03150). MauG itself (TIGR03791) is a protein modification enzyme responsible for the tryptophan tryptophylquinone (TTQ) modification involved in methylamine dehydrogenase activation. All members of this family have a motif of four spaced invariant Cys residues, while additional homologs outside the scope of this family lack the four Cys residues. 206
20500 274945 TIGR04053 sam_11 radical SAM protein, BA_1875 family. Members of this subfamily of the radical SAM domain superfamily show closer sequence relationships to peptide-modifying proteins of bacteriocin and PQQ biosynthesis than to other characterized radical SAM proteins. Within this subfamily, targets are likely to be diverse. [Unknown function, Enzymes of unknown specificity] 365
20501 274946 TIGR04054 rSAM_NirJ1 putative heme d1 biosynthesis radical SAM protein NirJ1. Members of this radical SAM protein subfamily, designated NirJ1, occur in genomic contexts with a paralog NirJ2 and with other nitrite reductase operon genes associated with heme d1 biosynthesis, as in Heliobacillus mobilis and Heliophilum fasciatum. NirJ1 is presumed by bioinformatics analysis (Xiong, et al.) to be a heme d1 biosynthesis protein by context, perhaps involved in conversions of acetate groups to methyl groups in conversion from uroporphyrinogen III. A very closely related protein, involved in alternative heme b biosynthesis, occurs in Desulfovibrio and in methanogens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 387
20502 274947 TIGR04055 rSAM_NirJ2 putative heme d1 biosynthesis radical SAM protein NirJ2. Members of this radical SAM protein subfamily, designated NirJ2, occur in genomic contexts with a paralog NirJ1 and with other nitrite reductase operon genes associated with heme d1 biosynthesis, as in Heliobacillus mobilis and Heliophilum fasciatum. NirJ2 is presumed by bioinformatics analysis (Xiong, et al.) to be a heme d1 biosynthesis protein by context. This model has been redone (2014) to remove the branch (TIGR04545) that included DVU_0855, from a similar pathway for heme b biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 326
20503 274948 TIGR04056 OMP_RagA_SusC TonB-linked outer membrane protein, SusC/RagA family. This model describes a distinctive clade among the TonB-linked outer membrane proteins (OMP). Members of this family are restricted to the Bacteroidetes lineage (except for Gemmatimonas aurantiaca T-27 from the novel phylum Gemmatimonadetes) and occur in high copy numbers, with over 100 members from Bacteroides thetaiotaomicron VPI-5482 alone. Published descriptions of members of this family are available for RagA from Porphyromonas gingivalis, SusC from Bacteroides thetaiotaomicron, and OmpW from Bacteroides caccae. Members form pairs with members of the SusD/RagB family (pfam07980). Transporter complexes including these outer membrane proteins are likely to import large degradation products of proteins (e.g. RagA) or carbohydrates (e.g. SusC) as nutrients, rather than siderophores. [Transport and binding proteins, Unknown substrate] 981
20504 274949 TIGR04057 SusC_RagA_signa TonB-dependent outer membrane receptor, SusC/RagA subfamily, signature region. This model describes a 31-residue signature region of the SusC/RagA family of outer membrane proteins from the Bacteroidetes. While many TonB-dependent outer membrane receptors are associated with siderophore import, this family seems to include generalized nutrient receptors that may convey fairly large oligomers of protein or carbohydrate. This family occurs in high copy numbers in the most abundant species of the human gut microbiome. 31
20505 188573 TIGR04058 AcACP_reductase long-chain fatty acyl-ACP reductase (aldehyde-forming). This enzyme, found in cyanobacteria, reduces a long-chain (mainly C16 or C18) fatty acyl ACP ester to its corresponding fatty aldehyde, releasing the acyl carrier protein (ACP). NADPH or NADH is the reductant for this reaction. This enzyme may be distantly related to the short-chain dehydrogenase or reductase (SDR) family (pfam00106). The purpose of this reaction is in the first step of alkane biosynthesis (GenProp0942). [Central intermediary metabolism, Other] 339
20506 274950 TIGR04059 Ald_deCOase long-chain fatty aldehyde decarbonylase. This cyanobacterial family of fatty aldehyde decarbonylases acts on mainly C16 and C18 substrates to form hydrocarbons and carbon monoxide. Note that the corresponding EC number (4.1.99.5) dating from 1989 refers to a nonorthologous Pisum sativum enzyme that acts on C18 and longer chains and attaches the overly narrow name octadecanal decarbonylase. [Central intermediary metabolism, Other] 220
20507 274951 TIGR04060 formate_focA formate transporter FocA. FocA (formate channel A) forms a pentameric formate-selective channel through the plasma membrane. The focA gene is largely restricted to Proteobacteria and occurs adjacent to genes for pyruvate formate lyase (PFL) and the PFL activase, a radical SAM protein. FocA is homologous to a nitrite transport protein, NirC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 267
20508 274952 TIGR04061 AZL_007950_fam AZL_007950 family protein. This set of proteins includes PP_3335 from Pseudomonas putida, a protein of unknown function, and AZL_007950, a member of a putative biosynthetic cluster from Azospirillum sp. B510. 164
20509 274953 TIGR04062 dnd_assoc_4 dnd system-associated protein 4. A DNA sulfur modification system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. Members of this protein family are strictly limited to species with the dnd operon, and are found close to the dnd operon on the chromosomes of species such as Nostoc sp. PCC 7120, Geobacter uraniireducens Rf4, and Roseobacter denitrificans OCh 114. [DNA metabolism, Restriction/modification] 151
20510 274954 TIGR04063 stp3 PEP-CTERM/exosortase A-associated glycosyltransferase, Daro_2409 family. PEP-CTERM/exosortase is a protein-sorting system associated with exopolysaccharide production. Members of this protein family are group 1 glycosyltransferases (see pfam00534) in which the overwhelming majority occur in species with the EpsH1 form of exosortase (see TIGR03109), and usually co-clustered with the exosortase. A typical member is Daro_2409 from Dechloromonas aromatica RCB. 397
20511 274955 TIGR04064 rSAM_nif11 nif11-like peptide radical SAM maturase. Members of this family are radical SAM enzymes that occur co-clustered with nif11-related ribosomal natural product (RNP) precursors described by TIGRFAMs model TIGR03798. Homology within the bacteriocin family reflects largely constraints on the leader peptide, tied to processes such as cleavage and export, and members associate with various families of maturation enzyme. The gene symbol assigned is nlpM, for Nif11-class Leader Peptide family Radical SAM Maturase. [Cellular processes, Toxin production and resistance] 458
20512 274956 TIGR04065 ocin_CLI_3235 putative bacteriocin precursor, CLI_3235 family. Members of this protein family are Cys-rich putative bacteriocin precursor peptides restricted to the Clostridia but found in multiple species with up to three per genome. They are found next to a CLI_3234 family radical SAM protein that may perform post-translational modification. This model describes approximately 35 residues starting from the N-terminus. Precursor peptides average about 50 amino acids in length. [Cellular processes, Toxin production and resistance] 34
20513 274957 TIGR04066 nat_prod_clost peptide maturation system protein, TIGR04066 family. Members of this protein family occur in various Clostridial genomes, always in the context of a short peptide and a radical SAM protein predicted to modify the short peptide. PSI-BLAST analysis suggests a sequence relationship to archaeal proteins designated as subunits of an H+-transporting two-sector ATPase. The modified peptide is likely to be a bacteriocin, and this protein is a candidate to act in either maturation or immunity. 361
20514 188582 TIGR04067 oc_CLOSPO_01332 putative bacteriocin precursor, CLOSPO_01332 family. Members of this protein family are Cys-rich putative bacteriocin precursor peptides found in a few strains of Clostridium and Anaerococcus. This family is related to the family of CLI_3235 (TIGR04065). Members of both families are found next to a CLI_3234 family radical SAM protein that appears to perform post-translational modification. 59
20515 274958 TIGR04068 rSAM_ocin_clost Cys-rich peptide radical SAM maturase CcpM. Members of this family are radical SAM enzymes that occur next to clostridial Cys-rich predicted bacteriocin (or other class of ribosomal natural product) precursors (see families TIGR04065 and TIGR04067). They include a TIGR04085 C-terminal additional 4Fe4S cluster-binding domain that is associated with peptide modification by radical SAM enzymes, and they are proposed to be ribosomal natural product maturases. The gene symbol ccpM is assigned, for Clostridial Cys-rich Peptide Maturase. [Cellular processes, Toxin production and resistance] 459
20516 274959 TIGR04069 ocin_ACP_rel peptide maturation system acyl carrier-related protein. Both PSI-BLAST and large numbers of noise-level HMM hits show a relationship between this family and the phosphopantetheine attachment site domain modeled by pfam00550. That domain includes acyl carrier proteins (ACP) and features an essentially invariant serine residue that is the attachment site for the phosphopantetheine prosthetic group. In this family, the corresponding residue is not Ser and is not conserved. Members are found in genomic contexts associated with a small Cys-rich peptide and a radical SAM protein we predict modifies the peptide. [Cellular processes, Toxin production and resistance] 77
20517 213890 TIGR04070 photo_TT_lyase spore photoproduct lyase. DNA damage to bacterial spores from ultraviolet light accumulates in the form of 5-thyminyl-5,6-dihydrothymine, spore photoproduct. The damage is repaired by spore photoproduct lyase, a member of the radical SAM family of enzymes. The score of this model is set to restrict itself to spore-forming members of the Firmicutes, but additional homologs scoring below the trusted cutoff tend to occur in radioresistant organisms (e.g. Kineococcus radiotolerans) and may be functionally equivalent. A related family in the Mycobacterium lineage is described by family TIGR03886, and may or may not be equivalent in function. [DNA metabolism, DNA replication, recombination, and repair, Cellular processes, Sporulation and germination] 338
20518 274960 TIGR04071 methanobac_OB3b methanobactin precursor, Mb-OB3b family. Methanobactins are siderophore-like copper-chelating natural products with considerable variety from species to species. The 11-residue methanobactin of Methylosinus trichosporium OB3b is derived from a 30-residue precursor. A very similar 31-residue precursor is found in the rice endophyte Azospirillum sp. B510, which has not yet been shown to produce a methanobactin. This model models the shared region of the first 25 amino acids, including a Cys-Gly-Ser motif. 30
20519 188587 TIGR04072 rSAM_ladder_B12 lipid biosynthesis B12-binding/radical SAM protein. Members of this protein family occur in conserved genomic contexts highly suggestive of lipid biosynthesis, including an island shared between Kuenenia stuttgartiensis, which produces ladderanes, and Desulfotalea psychrophila, which produces a different kind of unusual polyunsaturated hydrocarbon. 151
20520 274961 TIGR04073 exo_TIGR04073 putative exosortase-associated protein, TIGR04073 family. Members of this protein family are found in beta, gamma, and delta proteobacteria, and in the verrucomicrobia. Twenty-two of twenty-four species encountered contain the PEP-CTERM/exosortase system for modulating extracellular polysaccharide biosynthesis production, suggesting a role in protein sorting. The N-terminal signal sequence is divergent and not included in the model. PSI-BLAST and HMM searches suggest a distant sequence relationship between a region of this protein of about 100 amino acids and a corresponding region of the very large eukaryotic protein vps13, associated with vacuolar protein sorting in yeast. 75
20521 274962 TIGR04074 bacter_Hen1 3' terminal RNA ribose 2'-O-methyltransferase Hen1. Members of this protein family are bacterial Hen1, a 3' terminal RNA ribose 2'-O-methyltransferase that acts in bacterial RNA repair. All members of the seed alignment belong to a cassette with the RNA repair enzyme polynucleotide kinase-phosphatase (Pnkp). Chemically similar Hen1 in eukaryotes acts instead on small regulatory RNAs. [Transcription, RNA processing, Protein synthesis, tRNA and rRNA base modification] 462
20522 274963 TIGR04075 bacter_Pnkp polynucleotide kinase-phosphatase. Members of this protein family are the bacterial polynucleotide kinase-phosphatase (Pnkp) whose genes occur paired with genes for the 3' terminal RNA ribose 2'-O-methyltransferase Hen1. All members of the seed alignment belong to a cassette with the Hen1. The pair acts in bacterial RNA repair. This enzyme performs end-healing reactions on broken RNA, preparing for the RNA ligase to close the break. The working hypothesis is that the combination of Pnkp (RNA repair) and Hen1 (RNA modification) serves to first repair RNA damage from ribotoxins and then perform a modification that prevents the damage from recurring. [Transcription, RNA processing] 851
20523 274964 TIGR04076 TIGR04076 TIGR04076 family protein. Members of this protein family are uncharacterized. The only invariant residue, and one of three other residues better than 90 percent conserved are both Cys. Phylogenetic profiling results and occasional fusion genes suggest a role for members of this family in redox reactions or iron cluster metabolism. Species occasionally have two or three copies. 89
20524 188592 TIGR04077 expor_sig_YdyF exported signaling peptide, YydF/SAG_2028 family. This family describes a rare family of small proteins, about 50 residues in length, that includes YydF from Bacillus subtilis and SAG_2028 from Streptococcus agalactiae 2603V/R. Mutational analysis and genomic context show that members of this family likely are modified by a (variably present) radical SAM enzyme, are exported by an ABC transporter, and serve as signaling peptides. The member from Bacillus subtilis induces the LiaRS two-component system. [Regulatory functions, Protein interactions] 49
20525 188593 TIGR04078 rSAM_yydG peptide modification radical SAM enzyme, YydG family. Members of this radical SAM protein family for peptide modification occur only in the context of members of family TIGR04077, which average about 50 amino acids in length. In Bacillus subtilis, this protein (YydG) appears to act on its cognate target peptide (YydF) prior to its export, and result in the creation of a signaling molecule that induces the LiaRS two-component system. [Regulatory functions, Protein interactions] 309
20526 188594 TIGR04079 phero_cyc_pep KxxxW-cyclized secreted peptide. Members of this family are short precursor peptides in which the mature form undergoes a cyclization between a Lys and a Trp four residues away. The modification enzyme appears to be an adjacent encoded radical SAM protein. Genomes encoding this system include Streptococcus thermophilus LMD-9 and Lactococcus lactis subsp. cremoris MG1363, among others. [Cellular processes, Biosynthesis of natural products] 23
20527 188595 TIGR04080 rSAM_pep_cyc KxxxW cyclic peptide radical SAM maturase. Members of this family are radical SAM enzymes that appear to perform a cyclization on an adjacent cognate peptide from family TIGR04079. Genomes with the complete system include Streptococcus thermophilus LMD-9 and Lactococcus lactis subsp. cremoris MG1363, among others. The gene symbol assigned is kwcM, for KxxxW Cyclic peptide Maturase. [Protein fate, Protein modification and repair] 440
20528 188596 TIGR04081 selen_ocin radical SAM modification target peptide, selenobiotic family. Members of this protein family are small peptides found in the vicinity of a peptide modification-type radical SAM protein family. Multiple members of this protein family occur in species with a selenocysteine incorporation system and have a TGA stop codon at a position that aligns with cysteine residues from other homologs. This finding strongly suggests that GSU_1558 and similar members of the family are selenopeptides. The selenocysteine insertion sequence (SECIS) finder bSECISearch finds two homologous SECIS elements for two TGA codons in the extension of GSU_1558. Meanwhile, the pairing with the radical SAM enzyme suggests additional modification. 37
20529 274965 TIGR04082 rSAM_for_selen selenobiotic family peptide radical SAM maturase. Members of this protein family are radical SAM (rSAM) enzymes similar in sequence to others with known or postulated roles in peptide modification, and regularly found adjacent to members of the GSU_1558 peptide family described by model TIGR04081. GSU_1558 and several other members of that family appear to be selenoproteins, hence the term selenobiotic. 516
20530 274966 TIGR04083 rSAM_pep_methan putative peptide-modifying radical SAM enzyme, Mhun_1560 family. Members of this family are radical SAM enzymes, homologous to a variety of other peptide-modifying radical SAM, and found primarily in methanogenic archaea. 376
20531 274967 TIGR04084 rSAM_AF0577 putative peptide-modifying radical SAM enzyme, AF0577 family. This radical SAM family contains a C-terminal region motif CXXCX5CX3C that is found in PqqE and other radical SAM enzymes that act on peptide substrates. Members of this family are found primarily in the Archaea, but also several eukaryotes (Trichomonas vaginalis G3, Entamoeba dispar SAW760, Giardia intestinalis ATCC 50581, etc.). The function is unknown. 347
20532 274968 TIGR04085 rSAM_more_4Fe4S radical SAM additional 4Fe4S-binding SPASM domain. This domain contains regions binding additional 4Fe4S clusters found in various radical SAM proteins C-terminal to the domain described by model pfam04055. Radical SAM enzymes with this domain tend to be involved in protein modification, including anaerobic sulfatase maturation proteins, a quinohemoprotein amine dehydrogenase biogenesis protein, the Pep1357-cyclizing radical SAM enzyme, and various bacteriocin biosynthesis proteins. The motif CxxCxxxxxCxxxC is nearly invariant for members of this family, although PqqE has a variant form. We name this domain SPASM for Subtilosin, PQQ, Anaerobic Sulfatase, and Mycofactocin. 93
20533 274969 TIGR04086 TIGR04086_membr putative membrane protein, TIGR04086 family. Members of this family of strongly hydrophobic putative transmembrane proteins average about 125 amino acids in length and occur mostly, but not exclusively, in the Firmicutes. Members are quite diverse in sequence. The function is unknown. 115
20534 274970 TIGR04087 YqxM_for_SipW YqxM protein. Members of this protein family, including the partially characterized YqxM of Bacillus subtilis, are always found adjacent to a variant form, SipW, of signal peptidase, and are targets for this signal peptidase, as is the biofilm protein constituent TasA. The function may always be associated with biofilm formation. In one instance, this protein is fused with the SipW signal peptidase. 186
20535 274971 TIGR04088 cognate_SipW SipW-cognate class signal peptide. This model describes a protein N-terminal domain found regularly in proteins encoded near a variant form of signal peptidase I such as the SipW protein of Bacillus subtilis. Many though not all members are homologs of camelysin (a casein-cleaving metalloprotease) and TasA (CotN), a metalloprotease that is secreted, along with extracellular polysaccharide (EPS), to be the major protein constituent of the Bacillus subtilis biofilm matrix. Sequencing from several known TasA/CotN proteins shows the cleavage location to be near the center of the alignment and typical of type I signal peptidases, with small residues at -3 and -1. This domain, therefore, appears to be a special subclass of signal peptide. 34
20536 274972 TIGR04089 exp_by_SipW_III alternate signal-mediated exported protein, RER_14450 family. Members of this Actinobacterial protein family contain the cognate signal peptide domain, modeled by TIGR04088, for the variant SipW form of the signal peptidase I family. The remainder of this protein, however, differs from families such as Peptidase_M73 (pfam12389) and YqxM (TIGR04087) that share the same signal peptide domain. Some additional homologs to this family lack full-length homology and are excluded by the trusted cutoff as set. The two known targets for export by the SipW signal peptidase in Bacillus subtilis act in producing biofilm matrix material. 179
20537 274973 TIGR04090 exp_by_SipW_IV alternate signal-mediated exported protein, CPF_0494 family. Members of this largely Clostridial protein family contain the cognate signal peptide domain, modeled by TIGR04088, for the variant SipW form of the signal peptidase I family. The remainder of this protein, however, differs from families such as Peptidase_M73 (pfam12389) and YqxM (TIGR04087) that share the same signal peptide domain. Some additional homologs to this family lack full-length homology and are excluded by the trusted cutoff as set. The two known targets for export by the SipW signal peptidase in Bacillus subtilis act in producing biofilm matrix material. Members include CPF_0494, adjacent to SipW homolog CPF_0493. 244
20538 274974 TIGR04091 LTA_dltB D-alanyl-lipoteichoic acid biosynthesis protein DltB. Members of this protein family are DltB, part of a four-gene operon for D-alanyl-lipoteichoic acid biosynthesis that is present in the vast majority of low-GC Gram-positive organisms. This protein may be involved in transport of D-alanine across the plasma membrane. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 380
20539 274975 TIGR04092 LTA_DltD D-alanyl-lipoteichoic acid biosynthesis protein DltD. Members of this protein family are DltD, part of the DltABCD system widely distributed in the Firmicutes for D-alanylation of lipoteichoic acids. The most common form of LTA, as in Staphylococcus aureus, has a backbone of polyglycerolphosphate. 384
20540 274976 TIGR04093 cas1_CYANO CRISPR-associated endonuclease Cas1, subtype CYANO. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes a clade of Cas1 limited to the CYANO subtype of CRISPR/Cas system and most often the type found there. 323
20541 274977 TIGR04094 adjacent_YSIRK YSIRK-targeted surface antigen transcriptional regulator. Bacteria whose genomes encode only one protein with the YSIRK variant form of signal peptide (TIGR01168) were examined for conserved genes near that one tagged protein. This protein is found adjacent to various classes of repetitive or low-complexity YSIRK proteins (whether unique in genome or not), in a range of species (Enterococcus faecalis X98, Ruminococcus torques, Coprobacillus sp. D7, Lysinibacillus fusiformis ZC1, Streptococcus equi subsp. equi 4047, etc). The affiliated YSIRK proteins include Streptococcal protective antigen (see ) and proteins with the Rib/alpha/Esp surface antigen repeat (see TIGR02331). The last quarter of this protein has an AraC family helix-turn-helix (HTH) transcriptional regulator domain. 383
20542 274978 TIGR04095 dnd_restrict_1 DNA phosphorothioation system restriction enzyme. The DNA phosphorothioate modification system dnd (DNA instability during electrophoresis) recently has been shown to provide a modification essential to a restriction system. This protein family was detected by Partial Phylogenetic Profiling as linked to dnd, and its members usually are clustered with the dndABCDE genes. 451
20543 274979 TIGR04096 dnd_rel_methyl DNA phosphorothioation-associated putative methyltransferase. Members of this protein family show distant local sequence similarity to a number of S-adenosyl-methionine-dependent methyltransferases. The family is identified by Partial Phylogenetic Profiling as closely tied to the DNA phosphorothioation system (dnd), and members are found adjacent to dnd genes in at least 13 species (Streptomyces lividans TK24, Shewanella frigidimarina NCIMB 400, Mycobacterium abscessus ATCC 19977, Nostoc punctiforme PCC 73102, Vibrio fischeri MJ11, etc.). The DNA phosphorothioation enables a novel form of restriction enzyme activity. Most members of this family appear in species with the DNA phosphorothioation system. [DNA metabolism, Restriction/modification] 478
20544 274980 TIGR04098 biosyn_clust_1 biosynthesis cluster domain. Radical SAM family TIGR04043 is a marker for a widespread eight-gene probable biosynthetic cluster of unknown function. This protein family describes a domain that occurs as an additional protein for some of those clusters, but also as the N-terminal domain of large, multidomain polyketide synthases and in other contexts. 270
20545 274981 TIGR04099 biosn_Pnap_2097 probable biosynthetic protein, Pnap_2097 family. Radical SAM family TIGR04043 is a marker for a widespread eight-gene probable biosynthetic cluster of unknown function. This protein family occurs only in the context of TIGR04043 member-containing biosynthetic clusters, although in a minority of such clusters. This protein family belongs to the TIGR04098 domain family, which also includes N-terminal domains of several probable polyketide synthases. A role in biosynthetic processes is suspected. 258
20546 188615 TIGR04100 rSAM_pair_X radical SAM enzyme, TIGR04100 family. Members of this protein family are radical SAM enzymes that appear paired with members of TIGR04002, a family of small (~170 residue), mostly hydrophobic proteins. This family of radical SAM enzymes belongs to a larger family TIGR04038, in which some members show regularly in contexts with TatD. 197
20547 200352 TIGR04101 CCGSCS CCGSCS motif protein. This protein family, with average protein length about 58 residues, occurs in several marine bacteria, such as Shewanella benthica KT99, Marinobacter sp. ELB17, and Photobacterium profundum 3TCK. The striking feature is a C-terminal motif CCGSCS, which (perhaps coincidentally) resembles conserved core motif [LC]CGSC shared by two methanobactin precursors (see TIGR04071). There is no detectable conserved gene region for these proteins. 59
20548 200353 TIGR04102 SWIM_PBPRA1643 SWIM/SEC-C metal-binding motif protein, PBPRA1643 family. Members of this protein family have a SWIM, or SEC-C, domain (see pfam02810), a 21-amino acid putative Zn-binding domain that is shared with SecA, plant MuDR transposases, etc. This small protein family of unknown function occurs primarily in marine bacteria. 108
20549 200354 TIGR04103 rSAM_nif11_3 nif11-class peptide radical SAM maturase 3. Members of this protein family are peptide-modifying radical SAM enzymes, with a C-terminal additional 4Fe-4S cluster binding domain like many other peptide-modifying radical SAM enzymes. This form occurs primarily in the genera Cyanothece and Nostoc. 412
20550 274982 TIGR04104 cxxc_20_cxxc cxxc_20_cxxc protein. This small, uncommon, poorly conserved protein is found primarily in the Firmicutes. It features a pair of CxxC motifs separated by about 20 amino acids, followed by a highly hydrophobic region of about 45 amino acids. It has no conserved gene neighborhood, and its function is unknown. 94
20551 274983 TIGR04105 FeFe_hydrog_B1 [FeFe] hydrogenase, group B1/B3. See for descriptions of different groups. 462
20552 274984 TIGR04106 cas8c_GSU0052 CRISPR-associated protein GSU0052/csb3, Dpsyc system. This model describes a CRISPR-associated (cas) protein unique to the Dpsyc subtype (named for Desulfotalea psychrophila), a variant type I-C subtype, although not universal to that subtype. Members of this family occur in CRISPR loci of Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, Rhodospirillum centenum SW, Planctomyces limnophilus DSM 3776, and Methylosinus trichosporium OB3b. 282
20553 274985 TIGR04107 rSAM_HutW putative heme utilization radical SAM enzyme HutW. HutW is a radical SAM enzyme closely related to HemN, the heme biosynthetic oxygen-independent coproporphyrinogen oxidase. It belongs to operons associated with heme uptake and utilization in Vibrio cholerae and related species, but neither it nor HutX has been shown to be needed, as is HutZ, for heme utilization. HutW failed to complement a Salmonella enterica hemN mutant (), suggesting a related but distinct activity. Some members of this family are fused to hutX. 420
20554 274986 TIGR04108 HutX putative heme utilization carrier protein HutX. Members of this protein family are HutX, found paired with HutW in some heme utilization loci although not shown directly to be necessary for heme utilization. This protein is homologous to the heme carrier protein HemS, while its partner HutW is homologous to (but does not complement) HemN, the radical SAM enzyme oxygen-independent coproporphyrinogen III oxidase involved in heme biosynthesis. 154
20555 200360 TIGR04109 heme_ox_HugZ heme oxygenase, HugZ family. Members of this protein family are HugZ, a class of heme oxygenase that belongs to the PPOX family (pfam01243) and lacks homology to the HmuO family (pfam01126). Characterized members of this family include HP0318 from Helicobacter pylori and CJ1613c from Campylobacter jejuni. This enzyme releases iron during the conversion of heme to biliverdin. 243
20556 200361 TIGR04110 heme_HutZ heme utilization protein HutZ. Members of this family are heme utilization proteins, typically designated HutZ. They are members of the PPOX family (pfam01243) and, except for the lack of an N-terminal extension, are closely related to one form of heme oxidase (1.14.99.3), HugZ (TIGR04109). Members typically are found in a three-gene operon with radical SAM enzyme HutW and a protein of unknown function, HutX. 168
20557 274987 TIGR04111 BcepMu_gp16 phage-associated protein, BcepMu gp16 family. Members of this protein family occur in Burkholderia phage BcepMu, Pseudomonas phage B3, and Burkholderia phage KS10, and many bacterial putative prophage regions. The member from Burkholderia phage BcepMu is named gp16. Homology suggests DNA-binding activity. [Mobile and extrachromosomal element functions, Prophage functions] 55
20558 274988 TIGR04112 seleno_YedE putative selenium metabolism protein, YedE family. For 79 of the first 80 reference genomes in which a member of this protein family, YedE, is found, a selenium utilization system is found, spread over a broad taxonomic range (Firmicutes, spirochetes, delta-proteobacteria, Fusobacteria, Bacteroides, etc.). This family is less widespread than YedF, also involved in selenium metabolism. 337
20559 274989 TIGR04113 cas_csx17 CRISPR-associated protein Csx17, subtype Dpsyc. Members of this protein family are found exclusively in CRISPR-associated (cas) type I system gene clusters of the Dpsyc subtype. Markers for that type include a variant form of cas3 (model TIGR02621) and the GSU0054-like protein family (model TIGR02165). This family occurs in less than half of known Dpsyc clusters. 703
20560 274990 TIGR04114 tSAM_targ_Cxxx modification target Cys-rich repeat. This model describes a cysteine-rich repeat found in a number of bacterial putative radical-SAM modified natural product precursors. A substantial fraction of members of this family have been missed during gene-finding. A true hit to the model must exceed both TC on the whole and trusted cutoff 2 for at least one domain, to avoid false-positives from African swine fever virus proteins. 18
20561 200366 TIGR04115 rSAM_Cxxx_rpt radical SAM peptide maturase, CXXX-repeat target family. Members of this radical SAM domain protein are predicted peptide maturases, similar to PqqE, AlbA, the mycofactocin radical SAM maturase, and many others that share the peptide modification radical SAM protein C-terminal additional 4Fe4S-binding domain (TIGR04085). Members co-occur with a protein of unknown function that may be a chaperone or immunity protein and with a peptide that may have twelve or more cysteines occurring regularly spaced every fourth residue. These Cys residues tend to be flanked by residues with small side chains that provide minimal steric hindrance to crosslink formation by the radical SAM enzyme as in the subtilosin A system. 359
20562 274991 TIGR04116 CXXX_rpt_assoc CXXX repeat peptide modification system protein. Members of this protein family occur strictly in the presence of a peptide modification radical SAM enzyme described by model TIGR04115 and some small peptide in which, for a stretch, every fourth amino acid is Cys. Cysteine residues usually are flanked by residues with sterically small side chains, as with many radical SAM-modified peptides. Many of the latter are recognized by model TIGR04114. 90
20563 274992 TIGR04117 Syntroph_Cxxx Syntrophus aciditrophicus Cys-Xaa-Xaa-Xaa repeat radical SAM target protein. This model represents a paralogous family, in Syntrophus aciditrophicus SB, of peptides with a conserved N-terminal region followed by ten to seventeen direct repeats of the sequence CXXX (see repeats model TIGR04114). The N-terminal region includes a hydrophobic patch that is not shared by most members of family TIGR04114. 95
20564 200369 TIGR04118 Cxxx_AC3_0185 modification target Cys-rich peptide, AC3_0185 family. Radical SAM enzyme family TIGR04115 is paired with a number of short peptides with multiple tandem repeats of Cys-Xaa-Xaa-Xaa (see TIGR04114). This family represents a peptide family with a TIGR04114-like region, although the repeat region is relatively short in this group. 46
20565 200370 TIGR04119 CXXX_matur CXXX repeat peptide maturase. This model describes a peptide maturase that works with, usually fused to, a radical SAM enzyme in a system that modifies peptides with multiple tandem repeats of CXXX sequences. This protein includes an iron-sulfur cluster binding region associated with peptide modification as described in domain model TIGR04085. 210
20566 274993 TIGR04120 DNA_lig_bact DNA ligase, ATP-dependent, PP_1105 family. This model describes a family of ATP-dependent DNA ligases present in about 12 % of prokaryotic genomes. It occurs as part of a four-gene system with an exonuclease, a helicase and a phosphoesterase, with all four genes clustered or at least the first two and last two paired. This family resembles DNA ligase I (see TIGR00574 and pfam01068), and its presumed function may be in DNA repair, replication, or recombination. 526
20567 274994 TIGR04121 DEXH_lig_assoc DEXH box helicase, DNA ligase-associated. Members of this protein family are DEAD/DEAH box helicases found associated with a bacterial ATP-dependent DNA ligase, part of a four-gene system that occurs in about 12 % of prokaryotic reference genomes. The actual motif in this family is DE[VILW]H. 804
20568 274995 TIGR04122 Xnuc_lig_assoc putative exonuclease, DNA ligase-associated. Members of this protein family frequently are found annotated as a putative exonuclease involved in mRNA processing. This protein is found, exclusively in bacteria, associated with three other proteins: an ATP-dependent DNA ligase, a helicase, and a putative phosphoesterase. 326
20569 274996 TIGR04123 P_estr_lig_assc metallophosphoesterase, DNA ligase-associated. Members of this protein family are an uncharacterized putative metallophosphoesterase associated with a DNA ligase, a helicase, and a putative exonuclease. It may play a role in DNA repair. Its system is present in about 12 % of prokaryotic reference genomes. 208
20570 274997 TIGR04124 archaeo_artE archaeosortase family protein ArtE. This protein family is related to the predicted protein-sorting transpeptidase Exosortase (EpsH), with the Cys, Arg, and His putative active site residues preserved, but it is strictly archaeal and is not associated with any known PEP-CTERM-like target sequence. The immediate gene neighborhood in most genomes suggests RNA (methylase, cyclase) and cofactor (thiamine pyrophosphate) metabolism. The function is unknown. It is designated archaeosortase family protein ArtE. 153
20571 274998 TIGR04125 exosort_PGF_TRM archaeosortase A, PGF-CTERM-specific. This family is an archaeal variant of the (normally bacterial) putative protein-sorting integral membrane protein exosortase, hence archaeosortase. In species with a member of this family, its PGF-CTERM cognate sequence (TIGR04126) occurs at the C-termini of two to over fifty proteins per genome. Those target proteins may not share homology to each other in regions N-terminal to the PGF-CTERM region. 262
20572 274999 TIGR04126 PGF_CTERM PGF-CTERM archaeal protein-sorting signal. This model describes a strictly archaeal putative protein-sorting motif, PGF-CTERM. It is the (predicted) recognition sequence for an exosortase homolog, archaeosortase (TIGR04125). In some archaea, up to fifty proteins have this domain as their C-terminal region, usually preceded by a Thr-rich region likely to be heavily glycosylated. The removal of this sorting signal may be associated with a C-terminal prenyl group modification in the halobacterial major cell surface glycoprotein, an S-layer protein. 28
20573 275000 TIGR04127 flavo_near_exo exosortase F-associated protein. Members of this protein family are always found next to an exosortase/archaeosortase-like protein, and occur so far only in the flavobacteria, within the Bacteroidetes. Members do not have an obvious PEP-CTERM-like C-terminal protein sorting domain. 136
20574 275001 TIGR04128 exoso_Fjoh_1448 exosortase family protein XrtF. Members of this protein family are exosortase-related proteins found always in association with a member of family TIGR04127, a small, hydrophobic, uncharacterized protein limited to the Bacteroidetes. Exosortases are proposed transpeptidases with a cysteine active site (3.4.22.-), but usually are associated with specific C-terminal target motifs (PEP-CTERM, PEF-CTERM, PGF-CTERM, etc). 174
20575 275002 TIGR04129 CxxH_BA5709 CxxH/CxxC protein, BA_5709 family. Members of this protein family occur exclusively in the Firmicutes, in at least 50 different species. Members average about 55 residues in length, and four of the five invariant or nearly invariant residues occur in motifs CxxH and CxxC. The function is unknown. 49
20576 275003 TIGR04130 FnlA UDP-N-acetylglucosamine 4,6-dehydratase/5-epimerase. The FnlA enzyme is the first step in the biosynthesis of UDP-FucNAc from UDP-GlcNAc in E. coli (along with FnlB and FnlC). The proteins identified by this model include FnlA homologs in the O-antigen clusters of O4, O25, O26, O29 (Shigella D11), O118, O145 and O172 serotype strains, all of which produce O-antigens containing FucNAc (or the further modified FucNAm). A homolog from Pseudomonas aeruginosa serotype O11, WbjB, also involved in the biosynthesis of UDP-FucNAc has been characterized and is now believed to carry out both the initial 4,6-dehydratase reaction and the subsequent epimerization of the resulting methyl group at C-5. A phylogenetic tree of related sequences shows a distinct clade of enzymes involved in the biosynthesis of UDP-QuiNAc (Qui=quinovosamine). This clade appears to be descended from the common ancestor of the Pseudomonas and E. coli fucose-biosynthesis enzymes. It has been hypothesized that the first step in the biosynthesis of these two compounds may be the same, and thus that these enzymes all have the same function. At present, lacking sufficient confirmation of this, the current model trusted cutoff only covers the tree segment surrounding the E. coli genes. The clades containing the Pseudomonas and QuiNAc biosynthesis enzymes score above the noise cutoff. Immediately below the noise cutoff are enzymes involved in the biosynthesis of UDP-RhaNAc (Rha=rhamnose), which again may or may not produce the same product. 337
20577 275004 TIGR04131 Bac_Flav_CTERM gliding motility-associated C-terminal domain. This model describes a protein homology domain unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility. Proteins with this domain frequently pair with members of family TIGR03519, whether one such pair or many occur in a genome. More than 30 members may occur in one genome. 85
20578 275005 TIGR04132 intra_fol_E_lig putative folate metabolism gamma-glutamate ligase. This protein family is related to CofE, a gamma-glutamyl ligase of coenzyme F420 biosynthesis. However, it occurs in a different gamma-glutamyl ligase context, polyglutamylated tetrahydrofolate biosynthesis-like regions in two widely separated lineages that both occur as intracellular bacteria - Chlamydia and Wolbachia. 241
20579 200384 TIGR04133 rSAM_w_lipo radical SAM enzyme, rSAM/lipoprotein system. Members of this protein family are radical SAM enzymes with an additional 4Fe4S cluster-binding C-terminal domain (TIGR04085) shared with PqqE and many other peptide and protein-modifying radical SAM enzymes. All members occur in the context of a predicted lipoprotein that usually is encoded by an adjacent gene. 350
20580 200385 TIGR04134 lipo_with_rSAM putative lipoprotein, rSAM/lipoprotein system. Members of this family are Bacteroidetes lineage putative lipoproteins that always occur in pairs with a radical SAM enzyme, TIGR04133, from a branch of the radical SAM superfamily in which many members perform peptide or protein modifications. In some members, the region distal to the Cys of the putative lipoprotein cleavage motif is duplicated. 150
20581 200386 TIGR04135 FibroRuminTarg Cys-rich radical SAM target, FibroRumin family. Members of this protein family are cysteine-rich small peptides, about 52 amino acids long, that are proposed targets for modification by a radical SAM enzyme. Known occurrences are as tandem gene pairs in Fibrobacter succinogenes subsp. succinogenes S85 (missed gene calls) and in Ruminococcus albus 8. 52
20582 200387 TIGR04136 rSAM_FibroRumin radical SAM peptide maturase, FibroRumin system. Members of this protein family are radical SAM enzymes proposed to act on small, Cys-rich peptides encoded by tandem gene pairs. Members occur in Fibrobacter succinogenes subsp. succinogenes S85 (genes for their target peptides missed) and in Ruminococcus albus 8. This enzyme family is similar in sequence to the SCIFF (Six Cysteines in Forty-Five) system maturase (TIGR03974). 458
20583 275006 TIGR04137 Chlam_Ver_rRNA Chlam_Verruc_Plancto small basic protein. Members of this protein family are commonly found next to markers of rRNA processing such as YbeY. They are extremely lineage-restricted, in the Planctomycetes and Chlamydiae/Verrucomicrobia group. Since classification is based on rRNA molecular phylogeny, this provides additional support for a role in rRNA metabolism. This small protein, about 50 amino acids in length, is rich in basic residues, a third line of support for rRNA interaction. 50
20584 275007 TIGR04138 Plancto_Ver_chp Verruc_Plancto-restricted protein. Members of this protein family are extremely lineage-restricted, occurring exclusively in the Planctomycetes and Chlamydiae/Verrucomicrobia group, although not in Chlamydia itself. The function is unknown; the lack of invariant residues other than a single Phe suggests an ancient, conserved, non-enzymatic role. 122
20585 200390 TIGR04139 CxxCx5CxxC_targ putative peptide modification target, TIGR04139 family. This model describes a rare family of small putative polypeptides, including three encoded in tandem in Sphingobacterium spiritivorum ATCC 33300, in the vicinity of a TIGR04085 protein. This pairing is conserved in Chryseobacterium gleum ATCC 35910, Kordia algicida OT-1, and other species. TIGR04085 describes a C-terminal additional 4Fe4S-binding domain in PqqE and other radical SAM enzymes that seems to be a marker for peptide modification, and the family modeled here is a candidate modified peptide precursor. 66
20586 275008 TIGR04140 chp_AF_0576 TIGR04140 family protein. This model represents an uncharacterized small archaeal protein. 66
20587 275009 TIGR04141 TIGR04141 sporadically distributed protein, TIGR04141 family. This model describes a sporadically distributed conserved hypothetical protein in which complete members average over 500 amino acids in length, although matching sequences frequently are truncated or broken into tandem ORFs. Regular co-clustering with known markers of mobility (integrases, transposases, phage proteins, restriction enzymes, etc.) suggests this family also is part of the mobilome. The function is unknown. 516
20588 200393 TIGR04142 PCisTranLspir putative peptidyl-prolyl cis-trans isomerase, LIC12922 family. Members of this protein family have a known crystal structure (3NRK) showing similarity to the peptidyl-prolyl cis-trans isomerase SurA. Members are found in Leptospira species next to an uncharacterized radical SAM enzyme and a cytidylyltransferase family protein. 315
20589 200394 TIGR04143 VPxxxP_CTERM VPXXXP-CTERM protein sorting domain. This C-terminal protein sorting domain is detected, so far, in Methanohalophilus mahii DSM 5219 (five members) and Methanohalobium evestigatum Z-7303 (nine members). This domain resembles the PEP-CTERM, PEF-CTERM, and PGF-CTERM domains of other exosortase/archaeosortase systems. Member proteins co-cluster with a variant member of the exosortase/archaeosortase protein family, and represent a boutique second sorting system in these species. 25
20590 200395 TIGR04144 archaeo_VPXXXP archaeosortase B, VPXXXP-CTERM-specific. Members of this protein family are found so far in Methanohalophilus mahii DSM 5219 and Methanohalobium evestigatum Z-7303, along with five and nine proteins, respectively, with the VPXXXP-CTERM protein sorting signal (TIGR04143). In these species, this boutique system represents a second exosortase/archaeosortase-type system. 156
20591 200396 TIGR04145 Firmicu_CTERM Firmicu-CTERM domain. This C-terminal domain is found only in the Firmicutes, where its presence is sporadically distributed. Proteins with this domain are most conserved in the C-terminal region, where the pattern of ending with a transmembrane domain resembles both the LPXTG (sortase target) and PEP-CTERM (exosortase target) domain structures. However, members occur exclusively in the presence of an exosortase-like protein XrtG (TIGR03110), a putative glycosyltransferase (TIGR03111), and a 6-pyruvoyl tetrahydropterin synthase-related protein (TIGR03112). 45
20592 275010 TIGR04146 GGGPS_Afulg phosphoglycerol geranylgeranyltransferase. This enzyme, known also as GGGP synthase and GGGPS, catalyzes the stereospecific first step in the biosynthesis of the characteristic membrane diether lipids of archaea. Interestingly, the closest homologs outside this family are not the functionally equivalent enzymes of other archaea, but rather functionally distinct bacterial enzymes. 221
20593 275011 TIGR04147 GGGPS_Halobact phosphoglycerol geranylgeranyltransferase, putative. In most archaea, phosphoglycerol geranylgeranyltransferase (EC 2.5.1.41), also known as GGGP synthase and GGGPS, catalyzes the stereospecific first step in the biosynthesis of their characteristic membrane diether lipids. However, some groups of archaeal GGGPS homologs are more closely related to certain bacterial proteins than to each other. This family represents the putative GGGPS family as found in the Halobacteria. 229
20594 200399 TIGR04148 GG_samocin_CFB radical SAM peptide maturase, GG-Bacteroidales family. Members of this protein family are radical SAM enzymes (pfam04055) with the additional C-terminal region (TIGR04085) that is frequently a marker of peptide modification. Many members of this family are found in the vicinity of one or several ORFs encoding short polypeptides with a Gly-Gly motif (common for bacteriocin leader peptide cleavage), followed by a Cys-rich patch and then poorly conserved sequences. 411
20595 275012 TIGR04149 GG_sam_targ_CFB natural product precursor, GG-Bacteroidales family. Sequences in this protein domain family include a leader peptide region, up to and including a Gly-Gly cleavage motif, and about 15 additional residues, usually Cys-rich, from a family of predicted ribosomal natural product precursors. Many of these are associated with peptide-modifying radical SAM enzymes. The core region, up through the diglycine motif, resembles and contains some overlapping hits with the bacteriocin precursor leader peptide region modeled by TIGR01847, but is longer with an extreme N-terminal region with consensus sequence MKKLKKLKL. [Cellular processes, Biosynthesis of natural products] 43
20596 275013 TIGR04150 pseudo_rSAM_GG pseudo-rSAM protein, GG-Bacteroidales system. Many peptide-modifying radical SAM enzymes have two 4Fe4S-binding regions, an N-terminal one recognized by Pfam radical SAM domain-defining model pfam04055 and a C-terminal one recognized by TIGR04085. Members of this protein family occur in cassettes with such a radical SAM family (TIGR04148) and with a peptide modification target (TIGR04149). Surprisingly, members of this family show full-length homology to each other, with several scoring at least borderline hits to both pfam04055 and TIGR04085, and yet differ in the presence/absence of a signature CX(3)CX(2)CX(9)C motif. Instead, members are best-conserved in the TIGR04085-like C-terminal region. Therefore, this protein family is designated a pseudo-radical-SAM protein, which likely works in partnership with a TIGR04148 family protein. 407
20597 200402 TIGR04151 exosort_VPDSG exosortase C, VPDSG-CTERM-specific. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (). This model describes the exosortase subtype specific for the VPDSG-CTERM variant (TIGR03778) of PEP-CTERM. Systems are found, so far, in Verrucomicrobiae bacterium DG1235 (twenty) and bacterium Ellin514 (two). This system may coexist with other system variants. [Protein fate, Protein and peptide secretion and trafficking] 309
20598 275014 TIGR04152 exosort_VPLPA exosortase D, VPLPA-CTERM-specific. This model describes a variant sub class, exosortase D, of protein sorting enzyme (see parent exosortase model TIGR02602), specific for the VPLPA-CTERM variant (TIGR03370) of the PEP-CTERM protein sorting signal. [Protein fate, Protein and peptide secretion and trafficking] 486
20599 275015 TIGR04153 cyanosortA_assc cyanosortase A-associated protein. Members of this protein family are found exclusively in the Cyanobacteria, usually encoded next to and in at least one case fused to a gene encoding cyanoexosortase A. Note that family TIGR04533 shows a similar relationship to cyanoexosortase B (TIGR04156), and no EpsI is found. 186
20600 275016 TIGR04154 archaeo_STT3 oligosaccharyl transferase, archaeosortase A system-associated. Members of this protein family occur, one to three members per genome, in the same species of Euryarchaeota as contain the predicted protein-sorting enzyme archaeosortase (TIGR04125) and its cognate protein-sorting signal PGF-CTERM (TIGR04126). 817
20601 275017 TIGR04155 cyano_PEP PEP-CTERM protein sorting domain, cyanobacterial subclass. This domain model describes a subclass within family TIGR02595 of PEP-CTERM protein sorting signals associated with bacterial exosortases. This subclass is restricted to Cyanobacteria, including the genera Cyanothece, Nostoc, Trichodesmium, Lyngbya, Arthrospira, etc. This PEP-CTERM subclass features strongly conserved residues within the transmembrane region, including a Gx4GxG motif. Model TIGR03763 describes a corresponding cyanobacterial form of exosortase found in most species with this domain. 25
20602 275018 TIGR04156 cyanoexo_CrtB cyanoexosortase B. This model describes a cyanobacterial-restricted form of exosortase, associated with a PEP-CTERM domain subclass described in model TIGR04155. This is one of two such cyanoexosortases, either of which is sufficient to accompany TIGR04155 family members. The other cyanoexosortase is TIGR03763. [Protein fate, Protein and peptide secretion and trafficking] 280
20603 275019 TIGR04157 glyco_rSAM_CFB glycosyltransferase, GG-Bacteroidales peptide system. Members of this protein family are predicted glycosyltransferases that occur in conserved gene neighborhoods in various members of the Bacteroidales. These neighborhoods feature a radical SAM enzyme predicted to act in peptide modification (family TIGR04148), peptides from family TIGR04149 with a characteristic GG cleavage motif, and several other proteins. 405
20604 275020 TIGR04158 rSAM_MIA_synth 3-methyl-2-indolic acid synthase. Members are a radical SAM enzyme that converts L-Trp to 3-methyl-2-indolic acid through a complex rearrangement. This enzyme is closest to ThiH, which also does a complex rearrangement, among other characterized radical SAM enzymes. 368
20605 200410 TIGR04159 methbact_MbnB methanobactin biosynthesis cassette protein MbnB. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor. The model excludes some close homologs from species where no similar precursor can be found. 91
20606 275021 TIGR04160 methbact_MbnC methanobactin biosynthesis cassette protein MbnC. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor of the Mb-OB3b type. The model excludes several Pseudomonas proteins whose function is unknown, which likewise are in model TIGR04061, but which diverge toward the C-terminus. 89
20607 275022 TIGR04161 VPEID-CTERM VPEID-CTERM protein sorting domain. Proteins belonging to this family are small, 80 to 120 residues, including a signal peptide, a central low-complexity region, and this roughly 31-amino acid extreme C-terminal region. Members occur paired with a variant form of exosortase. Species include Ruegeria sp., Phaeobacter gallaeciensis, Roseovarius nubinhibens ISM, and two in Methylobacter tundripaludum. 31
20608 200413 TIGR04162 exo_VPEID exosortase E/protease, VPEID-CTERM system. Members of this protein family are fusion proteins of exosortase (N-terminal) and a CAAX prenyl protease domain (C-terminal). Members are restricted to the alpha Proteobacteria. The variant C-terminal protein sequence VPEID-CTERM occurs only in these species, often adjacent. 519
20609 200414 TIGR04163 rSAM_cobopep peptide-modifying radical SAM enzyme CbpB. Members of this family are radical SAM enzymes that modify a short peptide encoded by an upstream gene. A role in metal chelation is suggested. 428
20610 200415 TIGR04164 cobo_pep modified peptide precursor CbpA. Members of this family are short peptides predicted to reach mature form after modification by a radical SAM enzyme (TIGR04163). 25
20611 275023 TIGR04165 methano_modCys Cys-rich peptide, TIGR04165 family. Members of this small peptide family occur strictly in a subset of archaeal methanogens. Members have four invariant Cys residues in two Cys-Xaa-Xaa-Cys-Gly motifs and may have other Cys residues as well. At least two members occur next to family TIGR04083 radical SAM enzymes predicted to act in peptide or protein modification. 50
20612 275024 TIGR04166 methano_MtrB tetrahydromethanopterin S-methyltransferase, subunit B. Members of this protein family are the MtrB protein of the tetrahydromethanopterin S-methyltransferase complex. This system is universal in archaeal methanogens. [Energy metabolism, Methanogenesis] 95
20613 275025 TIGR04167 rSAM_SeCys radical SAM/Cys-rich domain protein. Members of this protein family have an N-terminal radical SAM domain (pfam04055) and a C-terminal pfam12345 domain. The C-terminal region has several conserved Cys residues, one of which is replaced by selenocysteine in at least five bacterial reference genomes. 303
20614 275026 TIGR04168 TIGR04168 TIGR04168 family protein. Members of this uncharacterized protein family are restricted, in 49 of 50 genomes, to organisms with a family TIGR04167 radical SAM protein, which occasionally is a selenoprotein. 269
20615 200420 TIGR04169 perox_w_seleSAM alkylhydroperoxidase/carboxymuconolactone decarboxylase family protein. Members of this family are usually annotated as putative carboxymuconolactone decarboxylases, are related also to alkylhydroperoxidase AhpD, and contain a peroxidase-like Cys-X-X-Cys putative redox-active disulfide. All members occur in genomes with a radical SAM protein of family TIGR04167, which occasionally are selenoproteins. 109
20616 211905 TIGR04170 RNR_1b_NrdE ribonucleoside-diphosphate reductase, class 1b, alpha subunit. Members of this family are NrdE, the alpha subunit of class 1b ribonucleotide reductase. This form uses a dimanganese moiety associated with a tyrosine radical to reduce the cellular requirement for iron. 698
20617 275027 TIGR04171 RNR_1b_NrdF ribonucleoside-diphosphate reductase, class 1b, beta subunit. Members of this family are NrdF, the beta subunit of class 1b ribonucleotide reductase. This form uses a dimanganese moiety associated with a tyrosine radical to reduce the cellular requirement for iron. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism] 313
20618 275028 TIGR04172 DGQHR_dnd_1 DNA phosphorothioation-associated DGQHR protein 1. The DND system produces a phosphorothioation modification to DNA, replacing a non-bridging oxygen of a phosphate group with sulfur. The modification causes DNA degradation during electrophoresis in Tris buffer. This protein, like DndB (TIGR03233), contains a DGQHR domain (TIGR03187), which also occurs in several contexts that suggest lateral transfer rather than DNA phosphorothioation-dependent restriction. 378
20619 200424 TIGR04173 PIP_CTERM PIP-CTERM protein sorting domain. Proteins closely related to MJ_1469.1 from Methanocaldococcus jannaschii DSM 2661 are designated archaeosortase D (ArtD). ArtD appears to be a dedicated protein-sorting enzyme with a single target, a PKD domain (pfam00801) repeat protein encoded by adjacent gene. This model describes the C-terminal putative protein-sorting region structurally similar to PEP-CTERM (TIGR02602) and found only on these methanogen PKD domain proteins. 27
20620 275029 TIGR04174 IPTL_CTERM IPTL-CTERM protein sorting domain. This model describes a variant form of the PEP-CTERM C-terminal protein-sorting domain, with a consensus motif IPTL replacing the more typical VPEP. A majority of these sequences have a WG (Trp-Gly) motif at positions 7-8 of the domain. Species with multiple (up to 15) copies of this domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3. 27
20621 200426 TIGR04175 archaeo_artD archaeosortase D. This model describes archaeosortase D, one of several strictly archaeal subfamilies related to exosortase, the bacterial protein-sorting putative transpeptidase (see TIGR02602). ArtD is found in the genus Methanocaldococcus. Its predicted target, encoded by an adjacent gene, has a C-terminal VPIP motif-containing region (TIGR04173) likely to be its recognition site. 150
20622 275030 TIGR04176 MarR_EPS EPS-associated transcriptional regulator, MarR family. Members of this family of MarR-family transcriptional regulators are associated with long genomic loci consisting of genes encoding enzymes for the biosynthesis of exopolysaccharides. These genes include glycosyl transferases, sugar modifying enzymes (epimerases, isomerases, methyltransferases, aminotransferases, etc.), and exopolysaccharide polymerases (wzx, wzy). In Leptospira interrogans, borgpetersenii and biflexa, this gene is observed first in unidirectional EPS biosynthesis loci as long as 90 genes. MarR genes (pfam01407) are known to bind to DNA regions with palindromic or pseudopalindromic sequences as homodimers, and to bind small molecules as triggers for conformational changes controlling on/off states. 105
20623 200428 TIGR04177 exosort_XrtH exosortase H, IPTLxxWG-CTERM-specific. This model describes exosortase subfamily H, for which most cognate recognition sequences are found by the IPTLxxWG-CTERM model TIGR04174. Species with this exosortase and multiple (up to 15) copies of the target domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3. [Protein fate, Protein and peptide secretion and trafficking] 158
20624 275031 TIGR04178 exo_archaeo exosortase/archaeosortase family protein. This model represents the most conserved region of the multitransmembrane protein family of exosortases and archaeosortases. The region includes nearly invariant motifs at the ends of three predicted transmembrane helices on the extracytoplasmic face: a Cys (often Cys-Xaa-Gly), Asn-Xaa-Xaa-Arg, and His. This model is much broader than the bacterial exosortase model (TIGR02602), and has an intended scope similar to (or broader than) pfam09721. 97
20625 275032 TIGR04179 rhombo_lipo rhombotail lipoprotein. Members of this protein family are probable lipoproteins. Nearly every member ends with a C-terminal region consisting of a glycine-rich probable cleavage site, a hydrophobic probable transmembrane helix, and a cluster of basic residues, as described in putative protein sorting region model TIGR03501. Furthermore, members tend to be encoded next to a rhomboid family protease, called rhombosortase (TIGR03902) predicted to perform a C-terminal cleavage. 258
20626 275033 TIGR04180 EDH_00030 NAD dependent epimerase/dehydratase, LLPSF_EDH_00030 family. This clade within the NAD dependent epimerase/dehydratase superfamily (pfam01370) is characterized by inclusion of its members within a cassette of seven distinctive enzymes. These include four genes homologous to the elements of the neuraminic (sialic) acid biosynthesis cluster (NeuABCD), an aminotransferase and a nucleotidyltransferase in addition to the epimerase/dehydratase. Together it is very likely that these enzymes direct the biosynthesis of a nine-carbon sugar analogous to CMP-neuraminic acid. These seven genes form the core of the cassette, although they are often accompanied by additional genes that may further modify the product sugar. Although this cassette is widely distributed in bacteria, the family nomenclature arises from the instance in Leptospira interrogans serovar Lai, str. 56601, where it appears as the 30th gene in the 91-gene lipopolysaccharide biosynthesis cluster. 297
20627 275034 TIGR04181 NHT_00031 aminotransferase, LLPSF_NHT_00031 family. This clade of aminotransferases is a member of the pfam01041 (DegT/DnrJ/EryC1/StrS) superfamily. The family is named after the instance in Leptospira interrogans serovar Lai, str. 56601, where it is the 31st gene in the 91-gene lipopolysaccharide biosynthesis locus. Members of this family are generally found within a subcluster of seven or more genes including an epimerase/dehydratase, four genes homologous to the elements of the neuraminic (sialic) acid biosynthesis cluster (NeuABCD) and a nucleotidyl transferase. Together it is very likely that these enzymes direct the biosynthesis of a nine-carbon sugar analogous to CMP-neuraminic acid. These seven genes form the core of the cassette, although they are often accompanied by additional genes that may further modify the product sugar. 359
20628 275035 TIGR04182 glyco_TIGR04182 glycosyltransferase, TIGR04182 family. Members of this family are glycosyltransferases restricted to the archaea. All but two members are from species with the PGF-CTERM/archaeosortase A system, a proposed maturation system for exported, glycosylated proteins as are found often in S-layers. 293
20629 275036 TIGR04183 Por_Secre_tail Por secretion system C-terminal sorting domain. Species that include Porphyromonas gingivalis, Fibrobacter succinogenes, Flavobacterium johnsoniae, Cytophaga hutchinsonii, Gramella forsetii, Prevotella intermedia, and Salinibacter ruber average twenty or more copies of a C-terminal domain, represented by this model, associated with sorting to the outer membrane and covalent modification. 72
20630 275037 TIGR04184 ATPgraspMvdD ATP-grasp ribosomal peptide maturase, MvdD family. The pair of ATP-grasp proteins MvdD and MvdC (microviridin D and C), as well as an acetyltransferase, produce microviridin K, an example of a RiPP (ribosomally synthesized and posttranslationally modified peptide). Microviridins are peptidase inhibitors. 321
20631 275038 TIGR04185 ATPgraspMvdC ATP-grasp ribosomal peptide maturase, MvdC family. The pair of ATP-grasp proteins MvdD and MvdC (microviridin D and C), as well as an acetyltransferase, produce microviridin K, an example of a RiPP (ribosomally synthesized and posttranslationally modified peptide). Microviridins are peptidase inhibitors. This family includes MvdC and corresponding members of similar cassettes. 318
20632 275039 TIGR04186 GRASP_targ putative ATP-grasp target RiPP. A RiPP is a ribosomally produced, post-translationally modified peptide. This family regularly occurs next to ATP-grasp enzymes related to those of microviridin maturation and next to a methyltransferase. 72
20633 275040 TIGR04187 GRASP_SAV_5884 ATP-grasp ribosomal peptide maturase, SAV_5884 family. Members of this protein family are ATP-grasp ligase family enzymes that regularly occur in a context with a methyltransferase and a putative ribosomally translated post-translationally modified peptide precursor. Because of this conserved gene neighborhood and close sequence similarity to ATP-grasp enzymes from microviridin/marinostatin biosynthesis cassettes, this enzyme is suggested also to serve as a peptide maturase. 312
20634 275041 TIGR04188 methyltr_grsp methyltransferase, ATP-grasp peptide maturase system. Members of this protein family are predicted SAM-dependent methyltransferases that regularly occur in the context of a putative peptide modification ATP-grasp enzyme (TIGR04187, related to enzymes of microviridin maturation) and a putative ribosomal peptide modification target (TIGR04186). 363
20635 275042 TIGR04189 surface_SprA cell surface protein SprA. SprA is a cell surface protein widely distributed in the Bacteroidetes lineage. In Flavobacterium johnsoniae, a species that shows gliding motility, mutation disrupts gliding. 2315
20636 211913 TIGR04190 B12_SAM_Ta0216 B12-binding domain/radical SAM domain protein, Ta0216 family. Members of this family are enzymes with an N-terminal B12-binding domain and central radical SAM domain. Families TIGR03975, TIGR04013 and TIGR04014 exhibit a similar architecture, which may be associated with lipid metabolism. 553
20637 275043 TIGR04191 YphP_YqiW putative bacilliredoxin, YphP/YqiW family. This protein family is one of several observed in species that express bacillithiol, an analog of glutathione and mycothiol. Rather than being involved in bacillithiol biosynthesis, members are likely to act in bacillithiol-dependent processes. A suggested term is bacilliredoxin (a glutaredoxin-like thiol-dependent oxidoreductase), and a suggested role of YphP is de-bacillithiolation - removing bacillithiol that became linked to protein thiols under oxidative stress. An older description of YphP as a disulphide isomerase therefore may be wrong. 136
20638 275044 TIGR04192 GRASP_w_spasm ATP-GRASP peptide maturase, grasp-with-spasm system. Members of this protein family are ATP-GRASP proteins that occur in a peptide maturation cassette with a SPASM domain protein. SPASM (TIGR04085) usually occurs as a C-terminal extension to radical SAM enzymes that act as peptide maturases, although it can occur independently. 318
20639 211916 TIGR04193 SPASM_w_grasp SPASM domain peptide maturase, grasp-with-spasm system. A 4Fe-4S-binding C-terminal domain is shared by radical SAM maturases for Subtilosin A (S), PQQ (P), Anaerobic sulfatases (AS), and mycofactocin (M), hence SPASM. Radical SAM proteins with SPASM tend to be peptide maturases. All members of this family, like some members of the quasi-rSAM family TIGR04105, lack the 4Fe-4S cluster of the radical SAM domain (pfam04055) in the N-terminal region. Members of this family occur with an ATP-GRASP family protein, known as a possible maturase from microviridin biosynthetic clusters. Systems occur in Microscilla marina ATCC 23134, Kordia algicida OT-1, Sphingobacterium spiritivorum ATCC 33300, etc. 342
20640 211917 TIGR04194 grasp_w_spasm_A grasp-with-spasm leader A domain. This model describes the leader peptide domain, ending in a Gly-Gly cleavage motif, for a post-ribosomal natural product (PRNP) precursor. The corresponding modification enzymes include an ATP-GRASP enzyme and a SPASM-domain protein, related to the C-terminal region of numerous peptide-modification radical SAM enzymes. 28
20641 211918 TIGR04195 S_glycosyl_SunS peptide S-glycosyltransferase, SunS family. Members of this family include SunS, the S-glycosyltransferase that transfers a sugar (substrate is variable in reconstitution assays) onto the precursor of the glycopeptide sublancin, which once was thought to be a lantibiotic. 422
20642 211919 TIGR04196 glycopep_SunS glycopeptide, sublancin family. Members of this family, including sublancin, are post-ribosomal natural products (PRNP) with an S-linked glycosylation. Sublancin itself also has two disulfide bonds. A related gene cluster in Bacillus cereus E33L includes the four Cys involved in the disulfide cluster but lacks the region with the glycosylated Cys, and has been excluded. 80
20643 275045 TIGR04197 T7SS_SACOL2603 type VII secretion effector, SACOL2603 family. Members of this protein family are similar in length and sequence (although remotely) to the WXG100 family of type VII secretion system (T7SS) targets, described by family TIGR03930. Phylogenetic profiling shows that members of this family are similarly restricted to species with T7SS, marking this family as a related set of T7SS effectors. Members include SACOL2603 from Staphylococcus aureus subsp. aureus COL. Oddly, members of family pfam10824 (DUF2580), which appears also to be related, seem not to be tied to T7SS. 85
20644 275046 TIGR04198 paramyx_RNAcap mRNA capping enzyme, paramyxovirus family. This model represents a common C-terminal region shared by paramyxovirus-like RNA-dependent RNA polymerases (see pfam00946). Polymerase proteins described by these two models are often called L protein (large polymerase protein). Capping of mRNA requires RNA triphosphatase and guanylyl transferase activities, demonstrated for the rinderpest virus L protein and at least partially localized to the region of this model. 893
20645 275047 TIGR04199 exosort_xrtJ exosortase J. Exosortase J occurs as a three-member paralogous family in Acidobacterium sp. MP5ACTX8. It contains an N-terminal exosortase/archaeosortase domain and a novel C-terminal domain comprising about half of total protein length. The presumptive target, found as an adjacent gene for two of the three paralogs, consists of a possible lipoprotein signal peptide followed almost immediately by a C-terminal region with some PEP-CTERM-like characteristics. 522
20646 211923 TIGR04200 targ_of_XrtJ XrtJ-associated TM-motif-TM protein. This model represents essentially the full length, ~60 residues, of a two-gene paralogous family from Acidobacterium sp. MP5ACTX8. Sequences consist of an N-terminal signal sequence ending in a GC motif, suggestive of the lipoprotein signal sequence, followed immediately by a C-terminal domain sequence with characteristics PEP-CTERM-like sequences, including a PExP motif and a transmembrane helix. Both members occur next to the novel exosortase variant, XrtJ, which contains a novel C-terminal domain. 62
20647 275048 TIGR04201 Myxo_Cys_RPT Cys-rich repeat, Myxococcales-type. This repeat is restricted to the Myxococcales, a division of the deltaproteobacteria. It occurs in several surface proteins, and may form a stalk region. The repeat averages about 21 amino acids in length with four or five Cys, three of which are nearly invariant. 22
20648 275049 TIGR04202 capSnatchArena RNA endonuclease, cap-snatching, arenavirus family. This model describes a shared signature region from an RNA endonuclease region associated with cap-snatching for mRNA production by RNA viruses. This domain usually is part of a multifunctional protein, the L protein responsible for RNA-dependent RNA polymerase activity. Cap-snatching is a viral alternative to synthesizing a eukaryotic-like mRNA cap itself. 61
20649 275050 TIGR04203 RPT_S_cricet Streptococcal surface-anchored protein repeat, S. criceti family. This model describes a repeat sequence that occurs primarily in LPXTG-anchored Streptococcus surface proteins, although it does occur elsewhere. It can comprise a major fraction of the length of repeat proteins that exceed 2000 in length. 38
20650 275051 TIGR04204 MAST_ArtA_sort MAST domain. This model describes a domain (or in most cases the full length) of archaeal surface proteins that are putative targets for C-terminal processing by archaeosortase A (TIGR04125). Most members of this family belong to proteins encoded by tandem genes in the genus Methanosarcina. The putative processing signal, PGF-CTERM (TIGR04126), included within the domain definition, takes a variant form, with consensus motif PAF instead of PGF. We suggest the name MAST domain: Methanosarcina Archaeosortase-Sorted Tandem gene family domain. 182
20651 275052 TIGR04205 classIII_w_PIP class III signal peptide protein, archaeosortase D/PIP-CTERM system. Members of this protein family are short proteins that consist largely of the archaeal class III signal peptide (see pfam04021). Members are encoded in a gene cassette between archaeosortase D (TIGR04175) and its PIP-CTERM target protein (TIGR04173). 67
20652 275053 TIGR04206 near_ArtA TIGR04206 family protein. Members of this integral membrane protein family are found exclusively in halophilic archaea. In at least three species (Haloarcula marismortui, Haloquadratum walsbyi, and Haloferax volcanii), members are found in the gene neighborhood of archaeosortase A, suggesting a role in protein sorting. 139
20653 275054 TIGR04207 halo_sig_pep surface glycoprotein signal peptide. This N-terminal homology domain appears to be a specialized class of signal peptide. It occurs mostly in the halophilic archaea, primarily on proteins with the C-terminal PGF-CTERM domain, including the S-layer-forming major surface glycoprotein of several species. The PGF-CTERM domain is the putative archaeosortase A recognition sequence. However, this N-terminal domain occurs also in several archaeal proteins that lack PGF-CTERM, and occurs in bacteria on a protein from Clostridium leptum DSM 753. 30
20654 275055 TIGR04209 sarcinarray sarcinarray family protein. Members of this protein family are exclusive to archaea, probably all of which have S-layer surface protein arrays. All member proteins have an N-terminal signal sequence. The majority of known members belong to codirectional tandem arrays in the genus Methanosarcina (nine in M. barkeri str. Fusaro). Nearly all members have an additional 50 residues, (trimmed from the seed alignment for this model), consisting of low-complexity sequence rich in E,N,Q,T,S, and P, followed by a variant (PAF) form of the PGF-CTERM putative archaeal surface glycoprotein sorting signal. The coined name, sarcinarray family protein, evokes the predicted archaeal surface layer localization, the taxonomic bias of known members, and the tandem organization of most members. 144
20655 211933 TIGR04210 bunya_NSm bunyavirus nonstructural protein NSm. This model describes a protein region that is cleaved from a bunyavirus polyprotein to become the nonstructural protein NSm (encoded by the M segment). It is flanked by glycoprotein GP2 and glycoprotein GP1. 173
20656 275056 TIGR04211 SH3_and_anchor SH3 domain protein. Members of this protein family have a signal peptide, a strongly conserved SH3 domain, a variable region, and then a C-terminal hydrophobic transmembrane alpha helix region. 198
20657 275057 TIGR04212 GlyGly_RbtA Acinetobacter rhombotarget A. Members of this protein family are found, so far, exclusively in the genus Acinetobacter. Members average just over 600 amino acids in length, including a 22-amino acid C-terminal putative protein sorting recognition sequence, GlyGly-CTERM (TIGR03501). The GlyGly-CTERM signal always co-occurs with a subfamily of the rhomboid family intramembrane serine proteases called rhombosortase (TIGR03902). Members occur paired with a second rhombosortase target, with which it also shares an N-terminal motif CSLREA. This protein is designated Acinetobacter rhombotarget A (rbtA). 605
20658 275058 TIGR04213 PGF_pre_PGF PGF-pre-PGF domain. This domain occurs in archaeal species. Most domains in this family end with a motif PGF, after which the member sequences change in character to low-complexity sequence (usually Thr-rich) for about 40 residues. The low complexity region usually is followed by a PGF-CTERM domain (TIGR04126), which we suggest is the recognition sequence for archaeosortase A (TIGR04125), a putative protein-sorting transpeptidase. The similarity between the PGF motif in this domain and in the PGF-CTERM domain is highly suggestive. 153
20659 211937 TIGR04214 CSLREA_Nterm CSLREA domain. This model describes an N-terminal region, with a motif CSLREA, shared by tandem genes in Acinetobacter that both have the GlyGly-CTERM putative protein-sorting domain. Many proteins with this domain are putative outer membrane proteins (OMPs) with predicted beta strand-forming repeats. 27
20660 275059 TIGR04215 choice_anch_A choice-of-anchor A domain. This domain may occur as essentially the full length of a protein, except for an N-terminal sequence and a C-terminal protein-sorting signal such as PEP-CTERM or LPXTG. Most often, the putative surface protein is longer and contains repetitive sequence regions. This is one of very few domains for which both anchoring domains occur, and designated choice-of-anchor A domain. The best characterized member is Bacillus anthracis protein BA0871, a collagen-binding protein with five CNA-family protein B-type repeats toward the C-terminus and an LPXTG cell wall attachment motif. 249
20661 275060 TIGR04216 halo_surf_glyco major cell surface glycoprotein. Members of this family are the S-layer-forming halobacterial major cell surface glycoprotein. The highest scores below model cutoffs are fragmentary paralogs to actual members of the family. Modifications include N-linked and O-linked glycosylation, a C-terminal diphytanylglyceryl modification, and probable cleavage of the PGF-CTERM tail. 763
20662 211940 TIGR04217 archae_ser_T archaetidylserine synthase. The activity CDP-2,3-di-O-geranylgeranyl-sn-glycerol:L-serine O-archaetidyltransferase (archaetidylserine synthase) was demonstrated experimentally in Methanothermobacter thermautotrophicus. Members represent an exception within the broader family (TIGR00473) of CDP-diacylglycerol-serine O-phosphatidyltransferases. 221
20663 211941 TIGR04218 TOMM_plantaz ribosomal natural product, plantazolicin-class. Members of this protein family are precursors of TOMMs, that is, thiazole/oxazole-modified microcins. Members are about 42 residues in length, have a C-terminal region of extremely low complexity rich in Ser, and are often missed by ab initio gene callers. The plantazolicin from Bacillus amyloliquefaciens FZB42 is a peptide antibiotic effective against Bacillus anthracis. 41
20664 275061 TIGR04219 OMP_w_GlyGly outer membrane protein. Members of this protein family are outer membrane proteins (OMP), as can be seen by their homology to YfaZ protein (see ) and by the OMP targeting region at the C-terminus, including a C-terminal Phe residue. Members of this protein family are found in the great majority of genomes with the GlyGly-CTERM protein sorting signal and the rhombosortase putative sorting enzyme, although the relationship may be fortuitous. 233
20665 211943 TIGR04220 patB_acyB_mcaB cyanobactin biosynthesis protein, PatB/AcyB/McaB family. Members of this protein family are small (~ 80 amino acids) and occur in biosynthesis clusters for cyanobactins, a type of ribosomal natural product, thiazole/oxazole-modified microcin (TOMM). The function of this protein family is unknown, and the recognized cyanobactin precursors (e.g. microcyclamides and patellamides) are encoded by a different protein (see TIGR03678). In this protein family, however, a core region of about 62 amino acids (modeled) is followed by a hypervariable region of 5 to 23 amino acids, with hallmarks of possible cyclodehydratase modification sites. The hallmarks include Cys residues flanked by Gly, and variable length Ser-rich tripeptide repeats. Further, members of this family were shown dispensable for patellamide biosynthesis, and two may occur in a cluster. Therefore, this family may represent a precursor of another type of ribosomal natural product. 61
20666 275062 TIGR04221 SecA2_Mycobac accessory Sec system translocase SecA2, Actinobacterial type. Members of this family are the SecA2 subunit of the Mycobacterial type of accessory secretory system. This family is quite different from SecA2 of the Staph/Strep type (TIGR03714). 762
20667 275063 TIGR04222 near_uncomplex TIGR04222 domain. The majority of the proteins with a domain as described by this model have an extreme C-terminal sequence that consists of extremely low-complexity sequence, rich in Ser or in Gly interspersed with Cys. That C-terminal region resembles ribosomal natural product precursors, although there is no evidence that C-terminal regions of these proteins undergo any modification or have any such function. 227
20668 275064 TIGR04223 quorum_AgrD cyclic lactone autoinducer peptide. Members of this family of short peptides are precursors to thiolactone (unless Cys is replaced by Ser) cyclic autoinducer peptides, used in quorum-sensing systems in Gram-positive bacteria. The best characterized is the AgrD precursor, processed by the AgrB protein. Nearby proteins regularly encountered include a histidine kinase and a response regulator. This model is related to pfam05931 but is newer and currently broader in scope. 37
20669 275065 TIGR04224 ser_adhes_Nterm serine-rich repeat adhesion glycoprotein AST domain. This model describes a definitive conserved N-terminal domain shared by Streptococcal serine-rich adhesion glycoproteins. These highly repetitive proteins may exceed 4000 amino acids in length, consisting largely of long regions in which every second amino acid is Ser. Members of this family, if sequenced completely and assigned the correct start site, begin with a KxYKxGKxW motif region (see TIGR03715) and end with an LPXTG motif region (see TIGR01167). Members are exported by the accessory secretory system (SecA2 and SecY2). They are highly variable among the Streptococci and may help determine host ranges for pathogenesis. 50
20670 275066 TIGR04225 CshA_fibril_rpt CshA-type fibril repeat. Many proteins with this repeat are LPXTG-anchored surface proteins of Firmicutes species, but the repeat occurs more broadly. Members include CshA from Streptococcus gordonii. 103
20671 275067 TIGR04226 RrgB_K2N_iso_D2 fimbrial isopeptide formation D2 domain. The Streptococcus pneumoniae pilus backbone protein, RrgB, has three tandem domains with Lys-to-Asn isopeptide bonds, but these three regions are extremely divergent in sequence. This model represents the homology domain family of the D2 domain. It occurs just once in many surface proteins but up to twenty times in some pilin subunit proteins. Three of every four members have the typical Gram-positive C-terminal motif, LPXTG, although in many cases this motif may be involved in pilin subunit cross-linking rather than cell wall attachment. Proteins with this domain include fimbrial proteins with lectin-like adhesion functions, and the majority of characterized members are involved in surface adhesion to host structures. 124
20672 211950 TIGR04227 zmp_18_rpt zinc metalloproteinase 18-residue repeat. This model describes a short (18-amino acid) tandem repeat that occurs variable numbers of times in zinc metalloproteinase C (zmpC) homologs in various species of Streptococcus. This repeat occurs, oddly, as an interruption in a region of tandem repeats of another type. 18
20673 275068 TIGR04228 isopep_sspB_C2 adhesin isopeptide-forming domain, sspB-C2 type. This domain has a conserved Lys (position 3 in seed alignment) and Asn at 177 that form an intramolecular isopeptide bond. The Asp (or Glu) at position 59 173
20674 275069 TIGR04229 geopeptide putative radical SAM-modified peptide. This family of short peptides occurs near radical SAM/SPASM domain proteins and is proposed to be modified by that enzyme. 23
20675 275070 TIGR04230 seadorna_VP11 seadornavirus VP11 protein. This protein family occurs in the seadornavirus virus group, with designations VP11 in Banna virus, and VP12 in Kadipiro virus and Liao ning virus. The function has not been assigned. 175
20676 275071 TIGR04231 seadorna_VP5 seadornavirus VP5 protein. This protein family occurs in the seadornavirus virus group, with designations VP5 in Banna virus, and VP6 in Kadipiro virus and Liao ning virus. The function is unassigned. 505
20677 211955 TIGR04232 seadorna_VP3 seadornavirus VP3 protein. Members of this protein family are VP3 proteins in the seadornavirus group. Sequences show sequence similarity to methyltransferases. 731
20678 211956 TIGR04233 seadorna_VP8 seadornavirus VP8 protein. This protein family occurs in the seadornavirus virus group, with designations VP8 in Banna virus, and VP9 in Kadipiro virus and Liao ning virus. The function has not been assigned. 291
20679 275072 TIGR04234 seadorna_RNAP seadornavirus RNA-directed RNA polymerase. Members of this protein family are the seadornavirus VP1 protein, the RNA-directed RNA polymerase. 1144
20680 211958 TIGR04235 seadorna_VP4 seadornavirus VP4 protein. This protein family occurs in the seadornavirus virus group, with designation VP4 in Banna virus, Kadipiro virus, and Liao ning virus. Although this family has been suggested to resemble methyltransferases, members show apparent N-terminal sequence similarity to the outer capsid protein VP5 of the orbivirus group, such as bluetongue virus, which also belong to the Reoviridae. 618
20681 275073 TIGR04236 seadorna_VP2 seadornavirus VP2 protein. This protein family occurs in the seadornavirus virus group, with the designation VP2 in Banna virus, Kadipiro virus, and Liao ning virus. 953
20682 211960 TIGR04237 seadorna_VP9 seadornavirus/coltivirus VP9 protein. This model, broader than related pfam08978, describes proteins VP9 in Coltivirus, and proteins with various designations in the seadornavirus group: VP9 in Banna virus, VP10 in Liao ning virus, and VP11 in Kadipiro virus. 280
20683 275074 TIGR04238 seadorna_dsRNA seadornavirus double-stranded RNA-binding protein. This protein family occurs in the seadornavirus virus group, with an N-terminal domain for binding double-stranded RNA, is designated VP12 in Banna virus, VP8 in Kadipiro virus, and VP11 in Liao ning virus. 201
20684 275075 TIGR04239 rhombo_GlpG rhomboid family protease GlpG. GlpG in E. coli is a rhomboid family intramembrane serine protease that has been extensively characterized as a proxy for rhomboid family proteases in animals. It efficiently cleaves eukaryote-derived model substrates. This multiple membrane-spanning protein excludes inappropriate substrates from access to its cleavage site, and shows activity against truncated versions, but not full-length versions, of the E. coli multidrug transporter MdfA. This finding suggests a housekeeping function in removing faulty proteins. In contrast, several eukaryotic rhomboid family proteases release peptide hormones for signaling functions, and the Shewanella and Vibrio protein rhombosortase appears to be part of a protein-sorting system, cleaving a C-terminal anchoring helix domain. 270
20685 213897 TIGR04240 flavi_E_stem flavivirus envelope glycoprotein E, stem/anchor domain. This model describes the C-terminal domain, containing a stem region followed by two transmembrane anchor domains, of the envelope protein E. This protein is cleaved from the large flavivirus polyprotein, which yields three structural and seven nonstructural proteins. 97
20686 211964 TIGR04241 adenoE3CR1rpt mastadenovirus E3 CR1-alpha-1. This domain occurs only in the adenovirus E3 region CR1-alpha-1 protein. It may occur once, twice, or three times. 81
20687 275076 TIGR04242 nodulat_NodC chitooligosaccharide synthase NodC. Members of this family are NodC, an N-acetylglucosaminyltransferase involved in the production of nodulation factors through which rhizobia establish symbioses with leguminous plants. 395
20688 211966 TIGR04243 nodulat_NodB chitooligosaccharide deacetylase NodB. Nodulation factors are lipooligosaccharide signalling molecules produced by rhizobia, the symbiotic nitrogen-fixing bacteria that form nodules in plants. These Nod factor systems have the NodABC genes in common but differ subtly in what they produce, which affects host range. NodB is a chitooligosaccharide deacetylase. 197
20689 275077 TIGR04244 nitrous_NosZ_RR nitrous-oxide reductase, TAT-dependent. Members of this family are the nitrous-oxide reductase structural protein, NosZ, with an N-terminal twin-arginine translocation (TAT) signal sequence (see TIGR01409). The TAT system replaces the Sec system for export of proteins with bound cofactor. 627
20690 211968 TIGR04245 nodulat_NodA N-acyltransferase NodA. Nodulation factors are lipo-chitooligosaccharides made by nitrogen-fixing bacteria as a signal to plant hosts. Nod factors differ slightly from system to system and serve as host range determinants. Because the N-acyl group varies from one NodA to another, the family is treated as a subfamily, but all members of this family belong to NodABC systems. 193
20691 275078 TIGR04246 nitrous_NosZ_Gp nitrous-oxide reductase, Sec-dependent. This model represents the nitrous-oxide reductase protein NosZ as characterized in Geobacillus thermodenitrificans. In contrast to the related form in Pseudomonas stutzeri, this version lacks a recognizable twin-arginine translocation (TAT) signal at the N-terminus. Consequently, its accessory protein may differ. Some members of this family have an additional cytochrome c-like domain at the C-terminus. 578
20692 275079 TIGR04247 NosD_copper_fam nitrous oxide reductase family maturation protein NosD. Members of this family include NosD, a repetitive periplasmic protein required for the maturation of the copper-containing enzyme nitrous-oxide reductase. NosD appears to be part of a complex with NosF (an ABC transporter family ATP-binding protein) and NosY (a six-helix transmembrane protein in the ABC-2 permease family). However, NosDFY-like complexes appear to occur also in species whose copper requiring enzymes are something other than nitrous-oxide reductase. 377
20693 211971 TIGR04248 SCM_PqqD_rel SynChlorMet cassette protein ScmD. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Members of this family are the PqqD-like protein. 84
20694 275080 TIGR04249 SCM_chp_ScmC SynChlorMet cassette protein ScmC. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Members of this family are designated ScmC. 292
20695 211973 TIGR04250 SCM_rSAM_ScmE SynChlorMet cassette radical SAM/SPASM protein ScmE. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Of the two PqqE homologs of the cassette, this family is the closer in sequence. 358
20696 211974 TIGR04251 SCM_rSAM_ScmF SynChlorMet cassette radical SAM/SPASM protein ScmF. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Of the two PqqE homologs of the cassette, this family is the more distant in sequence. 353
20697 211975 TIGR04252 SCM_precur_ScmA SynChlorMet cassette protein ScmA. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. This model identifies a conserved open reading frame that was identified as a predicted gene in only one of those species (Chlorobium), but that may represent the ribosomally produced peptide precursor of the system. As with most other radical SAM enzyme-modified ribosomal natural products, these polypeptides are Cys-rich in the C-terminal half. 49
20698 211976 TIGR04253 mesacon_CoA_iso mesaconyl-CoA isomerase. Members of this protein family belong by homology to the family of CoA transferases. However, the characterized member from Chloroflexus aurantiacus appears to perform an intramolecular transfer, making it an isomerase. The enzyme converts mesaconyl-C1-CoA to mesaconyl-C4-CoA as part of the bicyclic 3-hydroxypropionate pathway for carbon fixation. 403
20699 275081 TIGR04254 OpituPEPCTERM_1 putative globular PEP-CTERM protein. Representatives of this family include a 13-member paralogous family of proteins about 215 amino acids in length from the termite gut bacterium Opitutaceae bacterium TAV2, a member of the Verrucomicrobia. The signal peptide (N-terminal) and PEP-CTERM putative protein sorting signal (C-terminal) are not included in the seed alignment. Conserved residues such as an invariant Arg and a lack of conspicuous low-complexity sequence suggest a globular structure and possible enzymatic activity. Members average about thirty percent sequence identity overall, but over seventy percent in the PEP-CTERM region. The function of this family is unknown. 136
20700 275082 TIGR04255 sporadTIGR04255 TIGR04255 family protein. Members of this uncharacterized protein family are found broadly but sporadically among bacteria and archaea, including members of the genera Mycobacterium, Nostoc, Acinetobacter, Planctomyces, Geobacter, Streptomyces, Methanospirillum, etc. The function is unknown. 249
20701 275083 TIGR04256 GxxExxY GxxExxY protein. Members of this protein family average about 130 residues in length and include an almost perfectly conserved motif GxxExxY. Members occur in a wide range of prokaryotes, including Proteobacteria, Verrucomicrobia, Cyanobacteria, Bacteroidetes, Archaea, etc. 116
20702 275084 TIGR04257 nanowire_3heme c(7)-type cytochrome triheme domain. This domain binds three hemes, and itself occurs as a repeating unit. It occurs, for instance, four times in the dodecaheme c-type cytochrome protein GSU_1996, whose crystal structure shows elongation and a nanowire-like arrangement of twelve hemes that could function in extracellular electron transport processes. 75
20703 275085 TIGR04258 4helix_suffix four helix bundle suffix domain. This domain occurs as a suffix domain to some members of the much broader protein family TIGR02436, a few of whose other members are encoded within intervening sequences of bacterial 23S ribosomal RNA. Some proteins with this domain, in turn, are followed by a predicted DNA topoisomerase type C4 zinc finger. 49
20704 275086 TIGR04259 oxa_formateAnti oxalate/formate antiporter. This model represents a subgroup of the more broadly defined model TIGR00890, which in turn belongs to the Major Facilitator transporter family. Seed members for this family include the known oxalate/formate antiporter of Oxalobacter formigenes, as well as transporter subunits co-clustered with the two genes of a system that decarboxylates oxalate into formate. In many of these cassettes, two subunits are found rather than one, suggesting the antiporter is sometimes homodimeric, sometimes heterodimeric. 405
20705 275087 TIGR04260 Cyano_gly_rpt rSAM-associated Gly-rich repeat protein. Members of this protein family average 125 in length, roughly half of which is the repetitive and extremely Gly-rich C-terminal region. Virtually all members occur in the Cyanobacteria, in a neighborhood that includes a radical SAM/SPASM domain, often a marker of peptide modification systems. 119
20706 211984 TIGR04261 rSAM_GlyRichRpt radical SAM/SPASM domain protein, GRRM system. Members of this protein family are radical SAM/SPASM domain proteins (see pfam04055 and TIGR04085) related to anaerobic sulfatase maturating enzymes and the peptide modification enzyme PqqE. Members are found primarily in Cyanobacteria adjacent to a short protein, ~150 residues, in which the last ~60 residues tend to be repetitive and highly glycine-rich (see TIGR04260). The arrangement suggests modifications to the repetitive C-terminal region by this radical SAM domain enzyme, but the purpose of this system on the whole is unknown. 363
20707 275088 TIGR04262 orph_peri_GRRM extracellular substrate-binding orphan protein, GRRM family. This subfamily belongs to bacterial extracellular solute-binding protein family 3 (pfam00497). In that family, most members are ABC transporter periplasmic substrate-binding proteins. However, members of the present subfamily are orphans in the sense of being adjacent to neither ABC transporter ATP-binding proteins nor permease subunits. Instead, most members are encoded next to the two signature proteins of the proposed Glycine-Rich Repeat Modification (GRRM) system, a radical SAM/SPASM protein GrrM (TIGR04261) and the Gly-rich repeat protein itself GrrA (TIGR04260). 257
20708 275089 TIGR04263 SasC_Mrp_aggreg SasC/Mrp/FmtB intercellular aggregation domain. This domain, about 375 amino acids long on average, occurs only in Staphylococcus and Streptococcus. It occurs as a non-repetitive N-terminal domain of LPXTG-anchored surface proteins, including SasC, Mrp, and FmtB. This region in SasC was shown to be involved in cell aggregation and biofilm formation, which may explain the methicillin resistance seen for Mrp and FmtB. 366
20709 275090 TIGR04264 hyperosmo_Ebh hyperosmolarity resistance protein Ebh, N-terminal domain. Staphylococcal protein Ebh (extracellular matrix-binding protein homolog) is a giant protein, sometimes over 10,000 amino acids long as reported. This model describes a non-repetitive amino-terminal domain of about 2400 amino acids. 2354
20710 211988 TIGR04265 bac_cardiolipin cardiolipin synthase. This model is based on experimentally characterized bacterial cardiolipin synthases (cls) from E. coli, Staphylococcus aureus (two), and Bacillus pseudofirmus OF4. This model describes just one of several homologous but non-orthologous forms of cls. The cutoff score is set arbitrarily high to avoid false-positives. Note that there are two enzymatic activities called cardiolipin synthase. This model represents type 1, which does not rely on a CDP-linked donor, but instead does a reversible transfer of a phosphatidyl group from one phosphatidylglycerol molecule to another. 483
20711 211989 TIGR04266 NDMA_methanol NDMA-dependent methanol dehydrogenase. Members of this family belong to the iron-dependent alcohol dehydrogenase family (see pfam00465). The NADP(H) cofactor is bound too tightly for exchange (although non-covalently), so enzymatic activity depends on a second substrate or electron carrier. The radical SAM-modified natural product mycofactocin is proposed to fill this role. In Rhodococcus erythropolis N9T-4, a role was shown for this protein in CO2 fixation during extreme oligotrophic (or possibly chemoautotrophic) growth. 420
20712 275091 TIGR04267 mod_HExxH HEXXH motif domain. Some proteins with this domain toward the C-terminus have an N-terminal region with a radical SAM domain (pfam04055) and a SPASM domain (TIGR04085), a combination frequently associated with peptide modification. All seed alignment members, and all family members that are not fused to a radical SAM domain, have a motif HEXXH that suggests metalloprotease activity. A role in peptide or protein maturation is suggested. 399
20713 275092 TIGR04268 FxSxx-COOH FXSXX-COOH protein. Members of this family are very short (~60 residue) polypeptides, among which the fifth and third to last residues are nearly always Phe and Ser, respectively. Because members occur in a conserved context with a putative peptide-modifying radical SAM/SPASM domain protein, we suggest that members of this family may be the modification target. The gene symbol fxsA reflects both the FXA motif and the proposed role as a ribosomal natural product. 44
20714 275093 TIGR04269 SAM_SPASM_FxsB radical SAM/SPASM domain protein, FxsB family. This model describes a radical SAM (pfam04055)/SPASM domain (TIGR04085) fusion subfamily distinct from PqqE, MftC, anaerobic sulfatase maturases, and other peptide maturases. The combined region described in this model can itself be fused to another domain, such as TIGR04267, or stand alone. Members occurring in the same cassette as a member of family TIGR04268 should be designated FxsB. 363
20715 211993 TIGR04270 Rama_corrin_act methylamine methyltransferase corrinoid protein reductive activase. Members of this family occur as paralogs in species capable of generating methane from mono-, di-, and tri-methylamine. Members include RamA (Reductive Activation of Methyltransfer, Amines) from Methanosarcina barkeri MS (DSM 800). Member proteins have two C-terminal motifs with four Cys each, likely to bind one 4Fe-4S cluster per motif. 535
20716 275094 TIGR04271 ThiI_C_thiazole thiazole biosynthesis domain. The ThiI protein of Escherichia coli is a bifunctional protein in which most of the length of the protein is responsible for sulfurtransferase activity in 4-thiouridine modification to tRNA (EC 2.8.1.4 - see model TIGR00342). This rhodanese-like C-terminal domain, by itself, is able to synthesize the thiazole moiety during thiamin biosynthesis. Note that the invariant Cys residue in this domain is unusual in being required for both activities of the bifunctional ThiI protein. 101
20717 275095 TIGR04272 cxxc_cxxc_Mbark CxxC-x17-CxxC domain. This domain, with a pair of CXXC motifs separated by 17 amino acids, is a candidate zinc finger domain based on these motifs. Some proteins have two copies of the domain, while others are fused to another probable zinc-binding domain, described by pfam13451. 37
20718 275096 TIGR04273 Y_sulf_Ax21 sulfation-dependent quorum factor, Ax21 family. This family consists of proteins closely related to Ax21 (Activator of XA21-mediated immunity), a protein that is secreted by a type I secretion system (RaxABC), and that appears to be sulfated on an N-terminal region tyrosine in a motif LSYN. Ax21 acts in a quorum-sensing system. Homologous peptide-mediated quorum-sensing systems appear to exist in other species, such as the emerging opportunistic pathogen Stenotrophomonas maltophilia. Intriguingly, the rice genome encodes a receptor (XA21) for this protein that triggers innate immunity. [Cellular processes, Pathogenesis] 186
20719 211997 TIGR04274 hypoxanDNAglyco hypoxanthine-DNA glycosylase. Members of this protein family represent family 6 of the uracil-DNA glycosylase superfamily, where the five previously described families all act as uracil-DNA glycosylase (EC 3.2.2.27) per se. This family, instead, acts as a hypoxanthine-DNA glycosylase, where hypoxanthine results from deamination of adenine. Activity was shown directly for members from Methanosarcina barkeri and Methanosarcina acetivorans. 150
20720 275097 TIGR04275 beta_prop_Msarc beta propeller repeat, Methanosarcina surface protein type. This model describes a repeat region found mostly in cell surface proteins of various methanogens. Methanosarcina barkeri, for example, has twenty such proteins, often with either seven or fourteen repeats. These repeats resemble the beta propeller repeats of the TolB periplasmic protein of Gram-negative bacteria, part of a complex associated with various functions including biopolymer transport (see TIGR02800). 40
20721 275098 TIGR04276 FxsC_Cterm FxsC C-terminal domain. This model describes a sequence region found regularly as the C-terminal domain of a protein (where the N-terminal domain resembles a TIR domain - see pfam13676) in the vicinity of a proposed peptide-modifying radical SAM/SPASM domain protein, FxsB (TIGR04269). 196
20722 212000 TIGR04277 squa_tetra_cyc squalene--tetrahymanol cyclase. This enzyme, also called squalene--tetrahymanol cyclase, occurs in a small number of eukaryotes, some of them anaerobic. The pathway can occur under anaerobic conditions, and the product is thought to replace sterols, letting organisms with this compound build membrane suitable for performing phagocytosis. 624
20723 212001 TIGR04278 viperin antiviral radical SAM protein viperin. Viperin (Virus Inhibitory Protein, ER-associated, Interferon-inducible) is a radical SAM enzyme found in human and other vertebrates. It is both induced by interferon and demonstrably active in blocking replication by several types of virus, apparently by modifying lipid chemistries in lipid droplets and membrane rafts. 347
20724 275099 TIGR04279 TIGR04279 TIGR04279 methanogen extracellular domain. This domain, with length just over 300 amino acids, occurs in predicted extracellular proteins in a number of methanogens, in one to three proteins per genome. The aromatic residue tyrosine, comprising about five percent of the amino acid composition, is overrepresented among the most highly conserved columns of the multiple sequence alignment. The three members of this family in Methanosarcina barkeri occur all within a six-gene region. 316
20725 275100 TIGR04280 geopep_mat_rSAM putative geopeptide radical SAM maturase. This family is the radical SAM/SPASM domain putative peptide maturase for geopeptide, described by model TIGR04229. The SPASM domain (see model TIGR04085) frequently marks peptide-modifying radical SAM enzymes. 428
20726 275101 TIGR04281 peripla_PGF_1 putative ABC transporter PGF-CTERM-modified substrate-binding protein. Members of this archaeal protein family resemble periplasmic substrate-binding proteins of ABC transporters and appear in gene neighborhoods with permease and ATP-binding cassette proteins. Notably, essentially all members also have the PGF-CTERM putative protein-sorting domain at the C-terminus, while more distant homologs (excluded by the trusted cutoff) instead have what appear to be lipoprotein signal peptides at the N-terminus. 330
20727 275102 TIGR04282 glyco_like_cofC transferase 1, rSAM/selenodomain-associated. Members of this protein family show strongly correlated phylogenetic distribution, and in most cases co-clustering, with an unusual radical SAM enzyme (TIGR04167) whose C-terminal pfam12345 domain often contains a selenocysteine residue. Other members of the conserved gene neighborhood include another putative glycosyltransferase, an alkylhydroperoxidase family protein (TIGR04169), and a phosphoesterase family protein (TIGR04168). The cassette is likely to be biosynthetic but its exact function is unknown. [Unknown function, Enzymes of unknown specificity] 189
20728 275103 TIGR04283 glyco_like_mftF transferase 2, rSAM/selenodomain-associated. This enzyme may transfer a nucleotide, or its sugar moiety, as part of a biosynthetic pathway. Other proposed members of the pathway include another transferase (TIGR04282), a phosphoesterase, and a radical SAM enzyme (TIGR04167) whose C-terminal domain (pfam12345) frequently contains a selenocysteine. [Unknown function, Enzymes of unknown specificity] 220
20729 275104 TIGR04284 aldehy_Rv0768 aldehyde dehydrogenase, Rv0768 family. This family describes a branch of the aldehyde dehydrogenase (NAD) family (see pfam00171) that includes Rv0768 from Mycobacterium tuberculosis. All members of this family belong to species predicted to synthesize mycofactocin, suggesting that this enzyme or another upstream or downstream in the same pathway might be mycofactocin-dependent. However, the taxonomic range of this family is not nearly broad enough to make that relationship conclusive. [Unknown function, Enzymes of unknown specificity] 480
20730 275105 TIGR04285 nucleoid_noc nucleoid occlusion protein. This model describes nucleoid occlusion protein, a close homolog to ParB chromosome partitioning proteins including Spo0J in Bacillus subtilis. Its gene often is located near the gene for the Spo0J ortholog. This protein binds a specific DNA sequence and blocks cytokinesis from happening until chromosome segregation is complete. 255
20731 275106 TIGR04286 MSEP-CTERM MSEP-CTERM protein. Members of this protein family average over 900 residues in length and appear to have multiple membrane-spanning helices in the N-terminal half. The extreme C-terminal region consists of a motif with consensus sequence MSEP, then a transmembrane alpha helix, then a short region with several basic residues. This region, hereby dubbed MSEP-CTERM, resembles other putative sorting signals associated with the archaeosortase/exosortase protein family (see TIGR04178). Genes for all members of this family are found next to a gene for exosortase K. 920
20732 213900 TIGR04287 exosort_XrtK exosortase K. Members of this protein family are exosortase K, a bacterial branch of the archaeosortase/exosortase family of protein-processing enzymes (see TIGR04178). All members of the seed alignment are encoded next to a member of family TIGR04286, which has the putative processing signal MSEP-CTERM (see family TIGR04286) at the extreme C-terminus. 163
20733 213901 TIGR04288 CGP_CTERM CGP-CTERM domain. This domain has an essentially invariant motif, Cys-Gly-Pro, followed by a highly hydrophobic transmembrane domain, always at the protein C-terminus. It occurs, so far, strictly in the family Thermococcaceae (includes Thermococcus and Pyrococcus) within the Euryarchaeota. It occurs in ten proteins per genome on average, and proteins with the domain may lack similarity elsewhere. The presumed sorting/processing protein, for which this domain contains the recognition sequence, is unknown, but it is unlikely to be a member of the exosortase/archaeosortase family. The Cys residue suggests a lipid modification. Upstream of this domain, most member proteins have an extremely Thr-rich sequence, suggesting archaeal surface protein O-linked glycosylation. 20
20734 275107 TIGR04289 heavy_Cys eight-cysteine-cluster domain. In this domain of about 50 residues, eight of twelve invariant residues are Cys. Proteins with this domain tend to have N-terminal signal sequences, suggesting an extracytoplasmic location for this domain. 52
20735 275108 TIGR04290 meth_Rta_06860 methyltransferase, Rta_06860 family. Members of this family are methyltransferases that mark a widely distributed large conserved gene neighborhood of unknown function. It appears most common in soil and rhizosphere bacteria. 226
20736 275109 TIGR04291 arsen_driv_ArsA arsenical pump-driving ATPase. The broader family (TIGR00345) to which the current family belongs consists of transport-energizing ATPases, including the TRC40/GET3 family involved in post-translational insertion of protein C-terminal transmembrane anchors into membranes from the cytosolic face. This family, however, is restricted to ATPases that energize pumps that export arsenite (or antimonite). 566
20737 275110 TIGR04292 heavy_Cys_CGP heavy-Cys/CGP-CTERM domain protein. Members of this protein family are restricted to the Pyrococcus and Thermococcus genera of the archaea. Member proteins have a C-terminal, Cys-containing predicted surface anchor domain, where the Cys may be the site of cleavage and lipid attachment (see domain TIGR04288). Members also contain a region crowded with 10 invariant Cys in 60 residues (see domain TIGR04289), possible ligands to some redox cofactor. Note that a sorting motif is CGP. Previously, the motif was named incorrectly as GCP-CTERM in this model due to a typographical error. 373
20738 213906 TIGR04293 archaeo_artF archaeosortase family protein ArtF. Members of this protein family, ArtF, belong to the archaeosortase/exosortase family, in which many members associate with specific protein C-terminal putative protein sorting domains (exosortase A with PEP-CTERM, archaeosortase A with PGF-CTERM, etc.). This subgroup is observed in Thermococcus gammatolerans EJ3 and Thermococcus sp. AM4, but the gene neighborhood is not conserved. The cognate sequence to ArtF is unknown, but should not be ICGP-CTERM (model TIGR04288), found also in many Pyrococcus species that lack any archaeosortase family member. 166
20739 213907 TIGR04294 pre_pil_HX9DG prepilin-type processing-associated H-X9-DG domain. This model describes a region of ~16 residues found typically about 30 residues away from the C-terminus of large numbers of proteins in the Planctomycetes, Lentisphaerae, and Verrucomicrobia, on proteins with a prepilin-type N-terminal cleavage/methylation domain (see TIGR02532). The motif H-X(9)-D-G is nearly invariant. Single genomes may encode over 200 such proteins. 16
20740 275111 TIGR04295 B12_rSAM_oligo B12-binding domain/radical SAM domain protein, rhizo-twelve system. A variety of bacteria, including multiple species of Bradyrhizobium, Mesorhizobium, and Methylobacterium, have a typically twelve-gene cassette (hence the designation rhizo-twelve) for the biosynthesis of some unknown oligosaccharide. This family is a B12-binding domain/radical SAM domain protein found in roughly half of these cassettes, but nowhere else. 422
20741 275112 TIGR04296 PEFG-CTERM PEFG-CTERM domain. This putative protein sorting/processing domain occurs about ten times per genome in members of the Thaumarchaeota. Its putative handling protein, a member of the archaeosortase/exosortase protein family, is exceptional in having a Ser rather than Cys at the putative active site. The highly conserved motif resembles the PEF-CTERM protein sorting domain of family TIGR03024, but membership does not overlap. 30
20742 213910 TIGR04297 thauma_sortase thaumarchaeosortase. This member of the archaeosortase/exosortase family occurs exclusively in the Thaumarchaeota, where the corresponding proposed sorting signal is PEFG-CTERM (see model TIGR04296). This family is unusual in that the suspected active site residue, Cys in every other defined subfamily of archaeosortases and exosortases, is replaced by Ser. 307
20743 213911 TIGR04298 his_histam_anti histidine-histamine antiporter. Members of this protein family are antiporters that exchange histidine with histamine, product of histidine decarboxylation. A system consisting of this protein, and a histidine decarboxylase encoded by an adjacent gene, creates a decarboxylation/antiport proton-motive cycle that provides a transient resistance to acidic conditions. 429
20744 213912 TIGR04299 antiport_PotE putrescine-ornithine antiporter. Members of this protein family are putrescine-ornithine antiporters. They work together with an enzyme that decarboxylates ornithine to putrescine. This two-gene system has the net effect of removing a proton from the cytosol, providing transient resistance to acid conditions. 430
20745 213913 TIGR04300 exosort_XrtM exosortase family protein XrtM. Members of this family, part of the larger exosortase/archaeosortase family, are known from five related cassettes of genes in Methylomonas methanica MC09, a gammaproteobacterial methanotroph. Each xrtM gene occurs near a large YD repeat (see TIGR01643) protein of 1500-2500 residues and a small, uncharacterized protein of about 200 residues. No PEP-CTERM-like recognition sequence has been identified, so this protein is designated as exosortase family, but not necessarily a functional exosortase. 150
20746 275113 TIGR04301 ODC_inducible ornithine decarboxylase SpeF. Members of this family are known or trusted examples of ornithine decarboxylase, all encoded in the immediate vicinity of an ornithine-putrescine antiporter. Decarboxylation of ornithine to putrescine, followed by exchange of a putrescine for a new ornithine, is a proton-motive cycle that can be induced by low pH and protect a bacterium against transient exposure to acidic conditions. 719
20747 213915 TIGR04302 geo_PqqD_fam GeoRSP system PqqD family protein. Members of this PqqD-related family so far occur only in the genus Geobacter, always together with a PqqE-like radical SAM domain/SPASM domain protein and a second SPASM domain protein with traces of a degenerate radical SAM domain. The extended gene region includes a high-molecular-weight cytochrome c family protein. Besides authentic PqqD (TIGR03859), another example of a PqqD family protein occurs in the SynChlorMet cassette, again with two PqqE-like proteins. The system is named GeoRSP for its prevalence in Geobacter, its Radical SAM protein, its SPASM domain protein, and its PqqD family protein. 102
20748 213916 TIGR04303 GeoRSP_rSAM GeoRSP system radical SAM/SPASM protein. Members of this family are radical SAM/SPASM domain proteins from a cassette restricted to the genus Geobacter. Genes always found adjacent include a non-radical SAM protein with a closely related SPASM domain and a short stretch of N-terminal homology as well to this family, and also a PqqD-like protein. The three-gene cassette is designated GeoRSP for the genus Geobacter, this radical SAM protein, the SPASM domain protein, and the PqqD family protein. 325
20749 213917 TIGR04304 GeoRSP_SPASM GeoRSP system SPASM domain protein. Members of this protein family are encoded by one of two consecutive genes for SPASM domain proteins. The two are closely homologous in the SPASM domain regions, and also in a small N-terminal region, but the other family (TIGR04303) has an intact radical SAM domain (pfam04055) that this "quasi-rSAM" protein lacks. A PqqD-family protein, TIGR04302, is always adjacent. 293
20750 275114 TIGR04305 fol_rel_CADD putative folate metabolism protein, CADD family. This protein family, related to but outside the family of PqqC proteins involved in PQQ biosynthesis, includes the well-studied Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), which can induce apoptosis in a host cell. Other members of this family occur in Rickettsia and Wolbachia, unrelated in terms of phylogeny (both are alphaproteobacteria) but similar in living intracellularly. Local gene context in these species, although not in Trichodesmium or Nitrosomonas eutropha, suggests a role in folate metabolism, and some species with this protein lack FolE but have other folate synthesis proteins. 212
20751 213919 TIGR04306 salvage_TenA thiaminase II. The TenA protein of Bacillus subtilis and Staphylococcus aureus, and the C-terminal region of trifunctional protein Thi20p from Saccharomyces cerevisiae, perform cleavages on thiamine and related compounds to produce 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP), a substrate of a salvage pathway for thiamine biosynthesis. The gene symbol tenA, for Transcription ENhancement A, reflects a misleading early characterization as a regulatory protein. This family is related to PqqC from the PQQ biosynthesis system (see TIGR02111), heme oxygenase (pfam01126), and CADD (Chlamydia protein Associating with Death Domains), a putative folate metabolism enzyme (see TIGR04305). 208
20752 213920 TIGR04307 ProTailRpt proline-rich tail region repeat. This model describes a proline-rich tandem repeat of about 24 residues found in C-terminal regions of Gram-positive surface proteins with LPXTG sequences for processing and cell surface attachment by sortase. 23
20753 275115 TIGR04308 repeat_SSSPR51 surface protein repeat SSSPR-51. This repeat domain is designated SSRS51, Streptococcal and Staphylococcal Surface Protein Repeat of size 51. These repeats are homologous to the listerial repeats of pfam13461, but shorter on average by about 8 amino acids. Up to twelve tandem repeats can occur, on some of the longest proteins of their respective species. Nearly all member proteins carry the C-terminal sortase target sequence, LPXTG, recognizable by model TIGR01167. The repeat structure and probable surface location suggest a possible adhesion function. A protein with this class of repeats may have other classes as well. 48
20754 213922 TIGR04309 amanitin amanitin/phalloidin family toxin. Members of this family are ribosomally produced precursors of toxins produced by several mushrooms. These precursors undergo extensive post-translational modification to become amatoxins (e.g. alpha-amanitin) and phallotoxins (e.g. phalloidin). 33
20755 213923 TIGR04310 pantocin_A_pre pantocin A family RiPP. Members of this family are ribosomally-synthesized and posttranslationally-modified peptide (RiPP) precursors about 30 amino acids in length encoded in the vicinity of PaaA and PaaB homologs. Members include PaaP from Pantoea agglomerans, whose central tripeptide EEN appears to be the source of the mature product, pantocin A. Note, however, that the corresponding residues in Photobacterium sp. SKA34 and Photobacterium asymbiotica are EEK rather than EEN. This family, therefore, resembles the PQQ precursor PqqA as a peptide precursor of an extremely small mature product. 29
20756 275116 TIGR04311 rSAM_Geo_metal putative metalloenzyme radical SAM/SPASM domain maturase. This model describes a family of radical SAM/SPASM enzymes largely from the deltaproteobacteria. The family is most closely related to a radical SAM enzyme family found regularly in archaea in the vicinity of tungsten-containing oxidoreductases. A single member of the family in archaea may correspond to multiple tungsten enzymes, e.g. five in Pyrococcus furiosus. Therefore, the lack of a conserved gene neighborhood for members of this family in deltaproteobacteria suggests members may be involved in the maturation of multiple metalloenzymes. 423
20757 275117 TIGR04312 choice_anch_B choice-of-anchor B domain. This domain, about 385 amino acids long, can have either of at least two types of C-terminal sorting signal. Members from Shewanella and allies have the rhombosortase target domain GlyGly-CTERM (TIGR03501), while members of the Bacteroidetes have the Por secretion system C-terminal domain (TIGR04183). Most other members lack any C-terminal extension, but in most of those, the normal signal sequence is replaced by a lipoprotein signal sequence. Member sequences show a region of local similarity to the LVIVD repeat sequence (pfam08309). 364
20758 275118 TIGR04313 aro_clust_Mycop aromatic cluster surface protein. Members of this family are absolutely restricted to the Mollicutes (Mycoplasma and Ureaplasma). All have a signal peptide, usually of the lipoprotein type, suggesting surface expression. Most members have lengths of about 280 residues but some members have a nearly full-length duplication. The most nearly invariant residue, a Trp, is part of a strongly conserved 9-residue motif, [ND]-W-[LY]-[WF]-X-[LF]-X-N-[LI], where X usually is hydrophobic. Because the hydrophobic six-residue core of this motif almost always contains three to four aromatic residues, we name this family aromatic cluster surface protein. Multiple paralogs may occur in a given Mycoplasma, usually clustered on the genome. 293
20759 213927 TIGR04314 methano7heme methanogenesis multiheme c-type cytochrome. Members of this protein family are multiheme cytochrome c proteins of Methanosarcina acetivorans C2A and several other archaeal methanogens. All members have N-terminal signal peptides and are presumed to act in electron transfer reactions associated with methanogenesis. Putative heme-binding motifs include five (or six) CXXCH motifs, a CXXXCH motif, and a CXXXXCH motif. These proteins show multiple regions of local homology, in the same order, with multiheme cytochrome c proteins such as octaheme tetrathionate reductase from Shewanella. 494
20760 275119 TIGR04315 octaheme_Shew octaheme c-type cytochrome, tetrathionate reductase family. Members of this protein family bind heme covalently and contain eight (at least) CXXCH heme-binding motifs. A characterized member is the respiratory enzyme octaheme tetrathionate reductase from Shewanella. 430
20761 275120 TIGR04316 dhbA_paeA 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase. Members of this family are 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase (EC 1.3.1.28), the third enzyme in the biosynthesis of 2,3-dihydroxybenzoic acid (DHB) from chorismate. The first two enzymes are isochorismate synthase (EC 5.4.4.2) and isochorismatase (EC 3.3.2.1). Synthesis is often followed by adenylation by the enzyme DHBA-AMP ligase (EC 2.7.7.58) to activate (DHB) for a non-ribosomal peptide synthetase. 250
20762 275121 TIGR04317 W_rSAM_matur tungsten cofactor oxidoreductase radical SAM maturase. Members of this family are radical SAM enzymes involved in the maturation of tungsten (W)-containing cofactors in the enzymes aldehyde ferredoxin oxidoreductase, formaldehyde ferredoxin oxidoreductase, and others, and tend to be encoded by an adjacent gene. 349
20763 275122 TIGR04318 lacto_ODC_hypo putative ornithine decarboxylase. In at least ten species of Lactobacillus, this close homolog to known ornithine decarboxylase occurs in a three-gene neighborhood, along with an amino acid permease family transporter and pyridoxal phosphate-dependent enzyme from the cystathionine gamma-lyase family. Species include L. acidophilus, L. amylovorus, L. crispatus, L. delbrueckii, L. farciminis, L. helveticus, L. johnsonii, etc. The combination of a decarboxylase with an antiporter in a two-gene system suggests a decarboxylation/antiport proton-motive cycle for transient resistance to acidic conditions. The substrate for this decarboxylase might be ornithine but is unknown. 695
20764 275123 TIGR04319 SerAla_Lrha_rpt surface protein repeat Ser-Ala-175. This serine and alanine-rich surface protein repeat, about 175 amino acids long, occurs up to nine times in surface proteins of some Lactobacillus strains, particularly in Lactobacillus rhamnosus. Member proteins have the N-terminal variant signal sequence described by TIGR03715 and C-terminal LPXTG signals for surface attachment by sortase. 175
20765 275124 TIGR04320 Surf_Exclu_PgrA SEC10/PgrA surface exclusion domain. This model describes a conserved domain found in surface proteins of a number of Firmicutes. Many members have LPXTG C-terminal anchoring motifs and a substantial number have the KxYKxGKxW putative sorting signal at the N-terminus. The tetracycline resistance plasmid pCF10 in Enterococcus faecalis promotes conjugal plasmid transfer in response to sex pheromones, but PgrA/Sec10 encoded by that plasmid, a member of this family, specifically inhibits the ability of cells to receive homologous plasmids. The phenomenon is called surface exclusion. 356
20766 275125 TIGR04321 spiroSPASM spiro-SPASM protein. This three-domain protein is restricted to the spirochetes and widely distributed (excepting Borrelia). It has a conserved C-terminal SPASM domain, a 4Fe-4S binding domain shared by a number of peptide-modifying and heme-modifying radical SAM proteins. It has a central radical SAM domain, although half the members have lost the signature 4Fe-4S-binding Cys residues, fail to register with the radical SAM domain definition of pfam04055, and must be considered pseudo-SAM proteins. PSI-BLAST shows a relationship between the N-terminal domain and various predicted glycosyltransferases (e.g. Bacillus subtilis SpsF) and cytidyltransferases. In some Treponema species, this protein appears to split into two tandem genes. 508
20767 275126 TIGR04322 rSAM_QueE_Ecoli putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE. Members of this radical SAM domain protein family appear to be the E. coli form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in species that lack known forms of QueE but usually are not found in queuosine biosynthesis operons. Members of this family tend to form bi-directional best hit matches to members of known (TIGR03365) and putative (TIGR03963) QueE families from other lineages. 215
20768 213936 TIGR04323 SpoChoClust_1 sporadic carbohydrate cluster protein, LIC12192 family. Members of this uncharacterized protein family mark a rare but widely distributed carbohydrate biosynthesis cluster found sporadically in genera Bradyrhizobium, Leptospira, Magnetospirillum, Oscillatoria, Prochlorococcus, etc. 122
20769 275127 TIGR04324 SpoChoClust_2 sporadic carbohydrate cluster 2OG-Fe(II) oxygenase. This family, related to streptomycin biosynthesis protein StrG and to phytanoyl-CoA dioxygenase, belongs to the 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily, which includes not just dioxygenases, but also some chlorinating enzymes involved in natural product biosynthesis. Most members of this family are adjacent to a member of TIGR04323, and occur in a larger carbohydrate biosynthesis cluster found sporadically in genera Bradyrhizobium, Leptospira, Magnetospirillum, Oscillatoria, Prochlorococcus, etc. 248
20770 275128 TIGR04325 MTase_LIC12133 putative methyltransferase, LIC12133 family. Members of this family tend to occur next to glycosyltransferases and other characteristic enzymes of O-antigen biosynthetic regions. The founding member is LIC12133 from Leptospira interrogans serovar Copenhageni. PSI-BLAST reveals distant homology to known SAM-dependent methyltransferases, as in pfam13489. 235
20771 275129 TIGR04326 O_ant_LIC13510 surface carbohydrate biosynthesis protein, LIC13510 family. This uncharacterized, rare protein occurs in the highly variable O-antigen region of some strains of Leptospira, as well as strains of and serves as a phylogenetic marker for the likely presence of six additional proteins, including an activated sugar-nucleotidyltransferase, an activated sugar epimerase and a dehydratase, an aldolase, and a DegT family aminotransferase. The patterns suggests a role in preparing a novel sugar for O-antigen incorporation. 602
20772 275130 TIGR04327 OMP_LA_2444 outer membrane protein, LA_2444/LA_4059 family. Members of this family are predicted outer membrane proteins, apparently restricted to the Leptospiraceae (Leptospira and Leptonema). 291
20773 213941 TIGR04328 cas4_PREFRAN CRISPR-associated protein Cas4, subtype PREFRAN. Members of this family are the Cas4 protein of a novel CRISPR subtype, PREFRAN, found in Prevotella bryantii B14, Prevotella disiens FB035-09AN, Francisella tularensis subsp. novicida, Francisella philomiragia, Butyrivibrio proteoclasticus B316, Helcococcus kunzii ATCC 51366, etc. 178
20774 213942 TIGR04329 cas1_PREFRAN CRISPR-associated endonuclease Cas1, subtype PREFRAN. Members of this family are the Cas1 endonuclease of a novel CRISPR subtype, PREFRAN, found in Prevotella bryantii B14, Prevotella disiens FB035-09AN, Francisella tularensis subsp. novicida, Francisella philomiragia, Butyrivibrio proteoclasticus B316, Helcococcus kunzii ATCC 51366, etc. 317
20775 275131 TIGR04330 cas_Cpf1 CRISPR-associated protein Cpf1, subtype PREFRAN. This family is the long protein of a novel CRISPR subtype, PREFRAN, which is most common in Prevotella and Francisella, although widely distributed. The PREFRAN type has Cas1, Cas2, and Cas4, but lacks the helicase Cas3 and endonuclease Cas3-HD. 1286
20776 275132 TIGR04331 o_ant_LIC12162 putative transferase, LIC12162 family. This protein family shows C-terminal sequence similarity to various surface carbohydrate biosynthesis enzymes: spore coat polysaccharide biosynthesis protein SpsB, UDP-N-acetyl-D-glucosamine 2-epimerase, lipid A disaccharide synthetase LpxB, etc. It may occur in O-antigen biosynthesis regions. 585
20777 275133 TIGR04332 gamma_Glu_sys poly-gamma-glutamate system protein. Poly(gamma-glutamic acid), or PGA, is an extracellular structural polymer found in Bacillus subtilis and a number of other species. PGA is sometimes capsule-forming, sometimes secretory, and may be produced by Gram-positive (single plasma membrane) and Gram-negative (inner and outer membranes), so export and/or attachment machinery may differ from system to system. Members of this family occur in a subset of PGA operons, in lineages that include Francisella, Leptospira, Treponema, Thermotoga, Fusobacterium, and Clostridium, among others. Because gene symbols pgsWXYZ are not yet in use, we suggest pgsW, as one of a series of poly-gamma-glutamate synthesis auxiliary proteins. 307
20778 213946 TIGR04333 Clo7Bot_mod_Cys Cys-rich peptide, Clo7bot family. Members of this protein family range in size from 34 to 53 residues, including from four to seven Cys residues. Multiple strains of Clostridium botulinum show seven tandem members upstream of a radical SAM/SPASM domain protein likely to act as a ribosomal natural product maturase. By analogy to subtilosin A, the Cys residues are likely targets for modifications that may introduce new crosslinks. Across multiple strains of Clostridium botulinum and C. sporogenes, the adjacent radical SAM enzyme is nearly invariant. 34
20779 213947 TIGR04334 rSAM_Clo7bot radical SAM/SPASM domain Clo7bot peptide maturase. In multiple strains of Clostridium botulinum, this single radical SAM domain protein occurs next to a tandem array of seven homologous Cys-rich small peptides (see TIGR04333). Because this radical SAM enzyme contains the SPASM domain, associated with peptide modification, it is proposed to modify all seven C. botulinum targets, hence the name Clo7bot for this system. Suggested gene symbol is ctpM (Clostridial Tandem Peptide Maturase). [Protein fate, Protein modification and repair] 440
20780 275134 TIGR04335 AmmeMemoSam_A AmmeMemoRadiSam system protein A. Members of this protein family belong to the same domain family as AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Members of the present family occur as part of a three gene system with a homolog of the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility), and an uncharacterized radical SAM enzyme. 174
20781 275135 TIGR04336 AmmeMemoSam_B AmmeMemoRadiSam system protein B. Members of this protein family belong to the same domain family as the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility). Members of the present family occur as part of a three gene system with an uncharacterized radical SAM enzyme and a homolog of the mammalian protein AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Memo in humans has protein-protein interaction activity with binding of phosphorylated Tyr, but members of this family may be active as enzymes, as suggested by homology to a class of nonheme iron dioxygenases. 269
20782 275136 TIGR04337 AmmeMemoSam_rS AmmeMemoRadiSam system radical SAM enzyme. Members of this protein family are uncharacterized radical SAM enzymes that occur in a prokaryotic three-gene system along with homologs of mammalian proteins Memo (Mediator of ErbB2-driven cell MOtility) and AMMECR1 (Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis). Among radical SAM enzymes that have been experimentally characterized, the most closely related in sequence include activases of pyruvate formate-lyase and of benzylsuccinate synthase. 349
20783 275137 TIGR04338 HEXXH_Rv0185 putative metallohydrolase, TIGR04338 family. This protein family is restricted to the Actinomycetales, including Mycobacterium, Rhodococcus, Nocardia, Gordonii, and others. The invariant motif HEXXH, at the core of the best conserved region in the protein, suggests metallohydrolase activity, as does local sequence similarity in this region to other metallohydrolases. 159
20784 213952 TIGR04339 PQQ_MSMEG_3727 Actinobacterial PQQ system protein. Members of this protein family are restricted to members of the Actinobacteria (Mycobacterium smegmatis, Streptomyces hygroscopicus, Geodermatophilus obscurus, Pseudonocardia dioxanivorans, Saccharomonospora marina, etc) that synthesize PQQ. This small protein, 155 amino acids long on average, is found regularly next to a much larger protein, a PQQ-dependent oxidoreductase, and might be a companion subunit or an accessory protein such as chaperone involved in cofactor insertion. 151
20785 213953 TIGR04340 rSAM_ACGX radical SAM/SPASM domain protein, ACGX system. Members of this protein family are radical SAM/SPASM domain proteins likely to be involved in the modification of small, Cys-rich peptides. Members of the family of proposed target sequences, TIGR04341, average 75 amino acids in length and average six instances of the motif ACGX, where X is A, S, or T. 341
20786 213954 TIGR04341 target_ACGX ACGX-repeat peptide. Members of this family average 75 amino acids in length and average six instances of the motif ACGX, where X is A, S, or T. Members are proposed target sequences for modification by adjacent radical SAM/SPASM domain proteins (family TIGR04340). Cys residues adjacent to Gly residues are common as proposed sites for modification by radical SAM enzymes. 57
20787 275138 TIGR04342 EXLDI EXLDI protein. The most conserved region in this protein family is the C-terminal pentapeptide, with motif ExLDI. Members from the Firmicutes average about 120 amino acids in length, while members from the Actinobacteria have an additional 45-residue amino-terminal segment not included in the model. In it is suggested that the member from Streptococcus mutans UA159, and its homologs, participate in bacteriocin production, export, or immunity. 124
20788 275139 TIGR04343 egtE_PLP_lyase ergothioneine biosynthesis PLP-dependent enzyme EgtE. Members of this protein family are the pyridoxal phosphate-dependent enzyme EgtE, which catalyzes the final step in the biosynthesis of ergothioneine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 370
20789 275140 TIGR04344 ovoA_Nterm 5-histidylcysteine sulfoxide synthase. Ovothiol A is N1-methyl-4-mercaptohistidine. In the absence of S-adenosylmethionine, a methyl donor, the intermediate produced is 4-mercaptohistidine. In both Erwinia tasmaniensis and Trypanosoma cruzi, a protein occurs with 5-histidylcysteine sulfoxide synthase activity, but these two enzymes and most homologs share an additional C-terminal methyltransferase domain. Thus OvoA may be a bifunctional enzyme with 5-histidylcysteine sulfoxide synthase and 4-mercaptohistidine N1-methyltransferase activity. This model describes the 5-histidylcysteine sulfoxide synthase domain, a homolog of the ergothioneine biosynthesis protein EgtB. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 442
20790 275141 TIGR04345 ovoA_Cterm putative 4-mercaptohistidine N1-methyltransferase. Ovothiol A is N1-methyl-4-mercaptohistidine. In the absence of S-adenosylmethionine, a methyl donor, the intermediate produced is 4-mercaptohistidine. In both Erwinia tasmaniensis and Trypanosoma cruzi, a protein occurs with 5-histidylcysteine sulfoxide synthase activity, but these two enzymes and most homologs share an additional C-terminal methyltransferase domain. Thus OvoA may be a bifunctional enzyme with 5-histidylcysteine sulfoxide synthase and 4-mercaptohistidine N1-methyltransferase activity. This model describes the C-terminal putative 4-mercaptohistidine N1-methyltransferase domain. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 242
20791 275142 TIGR04346 DotA_TraY conjugal transfer/type IV secretion protein DotA/TraY. Members of this protein family include transfer protein TraY of IncI1 plasmid R64 and DotA (defect in organelle trafficking A) of Legionella pneumophila. 652
20792 275143 TIGR04347 pseudo_SAM_Halo pseudo-rSAM protein/SPASM domain protein. Members of this family all have a C-terminal SPASM domain (see model TIGR04085), a region usually found as a C-terminal second 4Fe-4S domain of radical SAM domain (see pfam04055) proteins. A majority of rSAM/SPASM proteins modify ribosomally produced peptides. In a few members of this family, the key Cys residues of the radical SAM domain have been lost, making this a pseudo-rSAM family. Members of this family are restricted so far to Haloarchaea, always occur next to a member of family TIGR04031, and are often accompanied by another rSAM/SPASM domain protein. The function of this two or three gene cassette is unknown. 390
20793 275144 TIGR04348 TIGR04348 putative glycosyltransferase, TIGR04348 family. This putative glycosyltransferase is found in marine bacteria such as Marinobacter and soil bacteria such as Anaeromyxobacter, but does not seem to occur in known pathogenic bacteria. 310
20794 275145 TIGR04349 rSAM_QueE_gams putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE, gammaproteobacterial type. Members of this radical SAM domain protein family appear to be a form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in preQ0 operons in species that lack members of related protein family TIGR03365. 210
20795 275146 TIGR04350 C_S_lyase_PatB putative C-S lyase. Members of this subfamily are probable C-S lyases from a family of pyridoxal phosphate-dependent enzymes that tend to be (mis)annotated as probable aminotransferases. One member is PatB of Bacillus subtilis, a proven C-S-lyase. Another is the virulence factor cystalysin from Treponema denticola, whose hemolysin activity may stem from H2S production. Members of the seed alignment occur next to examples of the enzyme 5-histidylcysteine sulfoxide synthase, from ovothiol A biosynthesis, and would be expected to perform a C-S cleavage of 5-histidylcysteine sulfoxide to leave 1-methyl-4-mercaptohistidine (ovothiol A). 384
20796 213964 TIGR04351 TOMM_nitrile_2 putative TOMM peptide. Members of this family of short peptides average about 110 amino acids in length, with greatest variability in the last thirty. The conserved region resembles the alpha subunit of nitrile hydratase, as with the NHLP leader peptide domain (TIGR03793), and members usually are found near a cyclodehydratase (maturase) enzyme, marking these as like thiazole/oxazole-modified microcins (TOMM), but these precursor forms lack the GlyGly cleavage motif that marks the clear end of a leader peptide region. Genomes with this system include Streptomyces clavuligerus ATCC 27064, Verrucosispora maris AB-18-032, and Kitasatospora setae KM-6054. 48
20797 275147 TIGR04352 HprK_rel_A HprK-related kinase A. A number of protein families resemble HPr kinase (see TIGR00679) but do not belong to that system. They include this family, which appears instead to be the marker for a different type of gene neighborhood, in which one of the conserved neighboring proteins resembles (but is distinct from) PqqD. 280
20798 275148 TIGR04353 PqqD_rel_X PqqD family protein, HPr-rel-A system. Members of this protein family show distant homology to PqqD, and belong to a three-gene cassette that includes the HPr kinase related protein family of TIGR04352. The role of the cassette, and of this protein, are unknown. 73
20799 275149 TIGR04354 amphi-Trp amphi-Trp domain. This domain usually comprises most of the span of bacterial or archaeal proteins with a length of about 90 amino acids. Some members, however, are extended by one or two copies of domain pfam07411 in the C-terminal region. No residue in this domain is invariant. A striking feature of this domain is a C-terminal region that alternates strongly charged with strongly hydrophobic residues and usually ends with a Trp residue, e.g. LEIEIEW or FEIKVRW, suggesting an amphipathic beta strand structure. We suggest the name amphi-Trp for this domain. Some members of this family occur regularly in genomic contexts that include putative kinases of unknown specificity related to (but distinct from) HPr kinase, a Ser-specific protein kinase. The function is unknown. 67
20800 275150 TIGR04355 HprK_rel_B HprK-related kinase B. Members of this protein family resemble (and often are misannotated as) HprK, the serine kinase/phosphatase of the phosphocarrier protein HPr. However, members do not occur with an HPr homolog, but instead as part of a distinctive gene cassette of unknown function. 351
20801 213969 TIGR04356 grasp_GAK ATP-grasp enzyme, GAK system. Members of this family are ATP-grasp family enzymes related to a number of characterized glutamate ligases, including the ribosomal protein S6 modification enzyme RimK. This group belongs to a conserved gene neighborhood that also features an HPr kinase-related protein (see TIGR04355). We assign this system the initial designation GAK, for Grasp (this ATP-grasp family enzyme), Amphipathic (for the member of family TIGR04354, designated Amphi-Trp), and Kinase, for the HPr-kinase homolog TIGR04355. 287
20802 275151 TIGR04357 CofD_rel_GAK CofD-related protein, GAK system. Members of this family are distantly related to CofD, the enzyme LPPG:FO 2-phospho-L-lactate transferase, involved in coenzyme F420 biosynthesis. This family appears to belong to a biosynthesis cassette of unknown function. 368
20803 275152 TIGR04358 XXXCH_domain XXXCH domain. Members of this family show C-terminal sequence similarity, perhaps indicating distant homology, to cytochrome c-prime (see pfam01322). However, the motif CxxCH is replaced by xxxCH. Genes for this protein occur in a sporadically distributed genome context, largely in deltaProteobacteria, in which an ATP-grasp family glutamate ligase homolog and a CofD (LPPG:FO 2-phospho-L-lactate transferase) homolog suggest a novel biosynthesis. 90
20804 275153 TIGR04359 TrbK_RP4 entry exclusion lipoprotein TrbK. The characterized model example of TrbK, from incompatibility group P (IncP) plasmid RP4, is an N-terminally processed lipoprotein, localized to the periplasmic face of the plasma membrane. TrbK prevents entry through conjugation by other IncP plasmids. Unrelated, uncharacterized proteins encoded in equivalent positions in other plasmid P-type conjugative transfer regions (e.g. TIGR04360) may have analogous functions. [Mobile and extrachromosomal element functions, Plasmid functions] 66
20805 275154 TIGR04360 other_trbK conjugative transfer region protein TrbK. Members of this family regularly are encoded between the TrbJ and TrbL proteins essential for P-type conjugal transfer, and therefore are designated TrbK. Positional analogy to family TIGR04359 (the entry exclusion lipoprotein TrbK of IncP plasmid RP4), which is a lipoprotein and not homologous, suggests this protein may also be involved in entry exclusion. Members of this family are small, with a non-lipoprotein signal peptide and a conserved disulfide bond. [Mobile and extrachromosomal element functions, Plasmid functions] 74
20806 275155 TIGR04361 TrbK_Ti entry exclusion protein TrbK, Ti-type. Members of this family are encoded between the genes for TrbJ and TrbL of P-type plasmid conjugal transfer systems, and therefore are TrbK, a member of a guild of unrelated TrbK protein families. The similarly located TrbK of plasmid RP4 (family TIGR04359) functions in entry exclusion, and the current family may as well, despite lacking any detectable homology. Members of this family include TrbK of the Ti plasmid from Agrobacterium, shown not to be required for transfer, which would be consistent with a role in entry exclusion rather than transfer itself. Li et al. cite unpublished results that showed an entry exclusion function for TrbK of the Ti plasmid. This small protein shares close C-terminal sequence homology to the much longer protein encoded by the neighboring gene TrbJ. [Mobile and extrachromosomal element functions, Plasmid functions] 62
20807 275156 TIGR04362 choice_anch_C choice-of-anchor C domain. This family describes an extracellular bacterial domain that occurs on a number of proteins with PEP-CTERM (exosortase recognition site) sequences at the C-terminus, as well as some with an apparent alternate anchor sequence. Note that related pfam04862 (DUF642), as of release 26, is double the length of this model because it has two tandem regions homologous to this domain. pfam04862, in turn, belongs to a Pfam clan called the galactose-binding domain-like superfamily. 157
20808 275157 TIGR04363 LD_lanti_pre FxLD family lantipeptide. Members of this protein family occur with a cassette of lanthionine-type peptide modification enzymes. Members are small (about 60 amino acids long), rich in Cys, and variable in copy number per genome (from one to three). These features suggest that members of this family are modified to become lantipeptides, although not necessarily a lantibiotic. There is no GlyGly cleavage motif to separate a leader peptide from core region. The considerable abundance in Streptomyces and relatively strong conservation hints at a non-antibiotic function. The motif FxLD in the N-terminal region is nearly invariant. 37
20809 275158 TIGR04364 methyltran_FxLD methyltransferase, FxLD system. Members of this family occur regularly in the vicinity of lantibiotic biosynthesis enzymes and their probable target, the FxLD family of putative ribosomal natural product precursor (TIGR04363). Members resemble protein-L-isoaspartate O-methyltransferase (TIGR00080) and a predicted methyltransferase, TIGR04188, of another putative peptide modification system. 394
20810 213978 TIGR04365 spare_glycyl autonomous glycyl radical cofactor GrcA. This small protein, previously designated YfiD in E. coli, is closely homologous to pyruvate formate-lyase (PFL) in a region surrounding the stable glycyl radical that is prepared by the action of pyruvate formate-lyase activase, a radical SAM enzyme. When damage at the site of this radical breaks the main chain of PFL, this protein acts as a spare part that reintroduces the needed stable glycyl radical. Cutoffs for this model are set to exclude a set of closely related phage proteins that appear to have a corresponding function. 124
20811 275159 TIGR04366 cupin_WbuC cupin fold metalloprotein, WbuC family. Members of this family show sequence similarity to cupin fold proteins (see pfam07883), including conserved His residues likely to serve as metal-binding ligands. Many members occur in bacterial O-antigen biosynthesis regions. Some members have acquired the gene symbol wbuC (e.g. Jarvis, et al, 2011), but publications using this term do not ascribe a function. 132
20812 275160 TIGR04367 HpnR_B12_rSAM hopanoid C-3 methylase HpnR. Members of this family are a B12-binding domain/radical SAM domain protein required for 3-methylhopanoid production. Activity was confirmed by mutant phenotype by disrupting this gene in Methylococcus capsulatus strain Bath. This protein family should only occur in genomes that encode a squalene-hopene cyclase (see TIGR01507). [Fatty acid and phospholipid metabolism, Biosynthesis] 490
20813 275161 TIGR04368 Glu_2_3_NH3_mut glutamate 2,3-aminomutase. Members of this family are glutamate 2,3-aminomutase, a radical SAM enzyme with a pyridoxal phosphate group. It is closely related to lysine 2,3-aminomutase, but distinguished by architecture (longer N-terminal region, shorter C-terminal region) and replacement of key lysine-binding residues Asp293 and Asp330 (inferred from the crystal structure) by glutamate-binding residues Lys and Asn. Activity was demonstrated for sequences from Clostridium difficile, Thermoanaerobacter tengcongensis MB4, and Syntrophomonas wolfei str. Goettingen. The action of this enzyme creates beta-glutamate, an osmolyte. [Cellular processes, Adaptations to atypical conditions] 404
20814 275162 TIGR04369 fusion_not_SelD oxidoreductase/SelD-related fusion protein. Some selenium donor proteins (selenide, water dikinase, product of the selD gene, model TIGR00476) are fusion proteins with an N-terminal extension described by model TIGR03169. Members of this family have a C-terminal region similar to yet outside the scope of the SelD model, fused to an N-terminal region similar to but outside the scope of TIGR03169. 702
20815 275163 TIGR04370 glyco_rpt_poly oligosaccharide repeat unit polymerase. Members of this subfamily of highly hydrophobic proteins, with few highly conserved residues, all may act to polymerize the oligosaccharide repeat units of surface polysaccharides, including O-antigen in Gram-negative bacteria such as Leptospira (assign gene symbol wzy) and capsular polysaccharide in Gram-positive bacteria such as Streptococcus. O-antigen biosynthesis enzymes produce a repeat unit, usually an oligosaccharide, which itself is polymerized. O-antigen polymerase, usually designated Wzy. This family bears homology to the O-antigen ligase WaaL, but known examples of WaaL fall outside the bounds defined here. This model is much broader than pfam14296. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 392
20816 275164 TIGR04371 methyltran_NanM putative sugar O-methyltransferase. Members of this family appear to be SAM-dependent O-methyltransferases acting on sugars, based on iterated sequence searches and gene context. Members occur in Leptospira O-antigen regions, as well as NanM from the biosynthesis cluster for nanchangmycin, which produces 4-O-methyl-L-rhodinose as an intermediate. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 273
20817 275165 TIGR04372 glycosyl_04372 putative glycosyltransferase, TIGR04372 family. This domain occurs in proteins of various lengths, in contexts that include O-antigen biosynthesis regions of various Leptospira species. Hits to this model and PSI-BLAST analysis suggest distant sequence similarity to family 9 glycosyltransferases (pfam01075), including ADP-heptose:LPS heptosyltransferase (RfaF), an enzyme involved in LPS inner core region biosynthesis. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 205
20818 275166 TIGR04373 egtB_X_signatur EgtB-related enzyme signature domain. This model represents a signature C-terminal region of a distinct clade in the EgtB subfamily, other members of which participate in ergothioneine biosynthesis. 50
20819 275167 TIGR04374 small_w_EgtBD hercynine metabolism small protein. Hercynine is the betaine (trimethylated amino group) form of histidine. This small protein occurs in a conserved four-gene cyanobacterial cassette along with EgtD, the methyltransferase that converts histidine to hercynine, and an EgtB homolog as in ergothioneine biosynthesis, likely to attach some thiol through its sulfur to the imidazole ring. 73
20820 275168 TIGR04375 cyano_w_EgtBD hercynine metabolism protein. Hercynine is the betaine (trimethylated amino group) form of histidine. This protein occurs in a conserved four-gene cyanobacterial cassette along with EgtD, the methyltransferase that converts histidine to hercynine as in ergothioneine biosynthesis, an EgtB homolog that is likely to attach some thiol (e.g. gamma-glutamyl-cysteine) through its sulfur to the hercynine imidazole ring, and a small protein of unknown function (TIGR04374). Members are distantly related to phage shock protein A (PspA). 154
20821 275169 TIGR04376 TIGR04376 TIGR04376 family protein. Members of this protein family resemble TIGR04375 and, more distantly, phage shock protein A (PspA). Members are restricted to the Cyanobacteria. 189
20822 275170 TIGR04377 myo_inos_iolD 3,5/4-trihydroxycyclohexa-1,2-dione hydrolase. Members of this protein family, 3,5/4-trihydroxycyclohexa-1,2-dione hydrolase (iolD), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. IolD hydrolyzes the cyclic molecule 3D-(3,5/4)-trihydroxycyclohexane-1,2-dione to yield 5-deoxy-D-glucuronic acid. TPP is a cofactor. [Energy metabolism, Sugars] 615
20823 275171 TIGR04378 myo_inos_iolB 5-deoxy-glucuronate isomerase. Members of this protein family, 5-deoxy-glucuronate isomerase (iolB), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. [Energy metabolism, Sugars] 247
20824 275172 TIGR04379 myo_inos_iolE myo-inosose-2 dehydratase. Members of this family include the enzyme myo-inosose-2 dehydratase, product of the gene iolE, as found in inositol utilization cassettes in many species. [Energy metabolism, Sugars] 290
20825 275173 TIGR04380 myo_inos_iolG inositol 2-dehydrogenase. All members of the seed alignment for this model are known or predicted inositol 2-dehydrogenase sequences co-clustered with other enzymes for catabolism of myo-inositol or closely related compounds. Inositol 2-dehydrogenase catalyzes the first step in inositol catabolism. Members of this family may vary somewhat in their ranges of acceptable substrates and some may act on analogs to myo-inositol rather than myo-inositol per se. [Energy metabolism, Sugars] 330
20826 275174 TIGR04381 HTH_TypR TyrR family helix-turn-helix domain. This model describes the C-terminal DNA-binding helix-turn-helix domain of several regulators of aromatic amino acid metabolism. Examples include TyrR in Escherichia coli and PhhR in Pseudomonas putida. Most members of this family have a sigma-54 interaction domain. [Regulatory functions, DNA interactions] 49
20827 275175 TIGR04382 myo_inos_iolC_N 5-dehydro-2-deoxygluconokinase. All members of the seed alignment for this model are translated from the iolC gene of known or putative inositol catabolism operons. Members with characterized function are 5-dehydro-2-deoxygluconokinase, the enzyme catalyzing the fifth step in degradation from myo-inositol or closely related compounds. Note that many members of this family are fusion proteins with an additional C-terminal domain, of unknown function, described by pfam09863. [Energy metabolism, Sugars] 309
20828 275176 TIGR04383 acidic_w_LPXTA processed acidic surface protein. Members of this family are acidic surface proteins with an N-terminal signal peptide and a variant C-terminal sortase recognition sequence, LPXTA rather than LPXTG. The N-terminal region past the signal peptide is repeated a second or third time in many members of this family. Members occur in Firmicutes, encoded next to a dedicated sortase related to SrtC that assembles pilins, suggesting that this protein serves a structural rather than enzymatic role. Processing by the neighboring sortase may result in polymerization as well as surface attachment. [Cell envelope, Surface structures] 316
20829 275177 TIGR04384 putr_carbamoyl putrescine carbamoyltransferase. Members of this family are putrescine carbamoyltransferase (EC 2.1.3.6). There is some overlapping specificity with ornithine carbamoyltransferase (EC 2.1.3.3). The gene regularly is found next to agmatine deiminase and a carbamate kinase, suggesting a conserved catabolic agmatine deiminase pathway. [Energy metabolism, Amino acids and amines] 330
20830 275178 TIGR04385 B12_rSAM_cofa1 putative variant cofactor biosynthesis B12-binding domain/radical SAM domain protein 1. Members of this protein family are one of two tandem B12-binding domain/radical SAM domain proteins that occur in a genome context with a pair of homologs to ThiC (phosphomethylpyrimidine synthase, EC 4.1.99.17), an enzyme that performs a complex rearrangement involved in thiamin biosynthesis, and a putative CobT (nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase), an enzyme of cobalamin biosynthesis. 438
20831 275179 TIGR04386 ThiC_like_1 ThiC-like protein 1. Members of this protein family closely resemble ThiC, an enzyme that performs a complex rearrangement during thiamin biosynthesis, but instead occur as one of two adjacent additional paralogs to bona fide ThiC, in a conserved gene neighborhood with a pair of B12 binding domain/radical SAM domain proteins. Members of the ThiC family are non-canonical radical SAM enzymes, using a C-terminal Cys-rich motif to ligand a 4Fe-4S cluster that cleaves S-adenosylmethionine (SAM), but that sequence region does not belong to pfam04055. 426
20832 275180 TIGR04387 capsid_maj_N4 major capsid protein, N4-gp56 family. Members of this family are phage major capsid proteins as found in phage N4 (a double-stranded DNA virus) plus many additional lytic phage and integrated prophage regions. [Mobile and extrachromosomal element functions, Prophage functions] 315
20833 275181 TIGR04388 Lepto_longest putative large structural protein. Members of this family are restricted so far to the lineage Leptospira, where they may be the longest protein encoded by the genome. Two or three paralogs are often found. The seed alignment for this model includes sequences with significant length variability, and stops adjacent to an intein feature most full-length members of this family share. Oddly, members closely related in sequence up to the start of the intein (see TIGR01445) usually show very little sequence similarity C-terminal to the end of the intein (see TIGR01443). [Unknown function, General] 1134
20834 275182 TIGR04389 Lepto_lipo_1 lipoprotein, Leptospiral tandem type. Members of this family are lipoproteins restricted (so far) to the genus Leptospira, sometimes with several paralogs clustered with each other, such as four in a row (out of six) in Leptospira interrogans str. UI 13372. The tandem set may be co-clustered with a putative structural protein that is usually the longest encoded by the leptospiral genome (and that often is an intein-containing protein). [Cell envelope, Other] 201
20835 275183 TIGR04390 OMP_YaiO_dom outer membrane protein, YaiO family. Members of this family share a domain of bacterial outer membrane beta barrel, up to the protein C-terminal residue (usually Phe or Trp). The member YaiO was shown experimentally to be localized to the outer membrane. [Unknown function, General] 230
20836 275184 TIGR04391 CcmD_alt_fam CcmD family protein. Members of this protein family are small (typically less than 50 amino acids in length), with the first half highly hydrophobic like transmembrane alpha helices and containing a nearly invariant tyrosine residue. Members from the Desulfovibrionales appear in the position of ccmD of system I c-type cytochrome biogenesis operons (see pfam04995). This family and pfam04995 appear very similar in sequence properties, but the very low level of actual sequence identity makes it unclear that the similarity reflects homology per se. 36
20837 275185 TIGR04392 haoB_nitrify hydroxylamine oxidation protein HaoB. Members of this family occur as the HaoB (hydroxylamine oxidation B) protein encoded next to the homotrimeric HaoA, hydroxylamine oxidoreductase, a protein with eight heme groups. It appears all species with this enzyme are nitrifying bacteria. 312
20838 275186 TIGR04393 rpt_T5SS_PEPC T5SS/PEP-CTERM-associated repeat. This model describes a repeat about 50 amino acids in length, appearing sometimes more than ten times in tandem in a single protein. Most proteins with this repeat have a C-terminal autotransporter domain (TIGR01414, pfam03797) and/or an N-terminal type V secretion system signal peptide (pfam13018), while others instead have a C-terminal PEP-CTERM domain (TIGR02595). 49
20839 275187 TIGR04394 choline_CutC choline trimethylamine-lyase. Members of this family, homologs to pyruvate formate-lyases and benzylsuccinate synthases, are glycine radical enzymes that appear to act as choline TMA-lyase, that is, to perform a C-N bond cleavage turning choline into trimethylamine (TMA) plus acetaldehyde. The gene symbol is cutC, for choline utilization. The activase, CutD, is a radical SAM enzyme. [Energy metabolism, Amino acids and amines] 789
20840 275188 TIGR04395 cutC_activ_rSAM choline TMA-lyase-activating enzyme. Members of this family are CutD, a radical enzyme that serves as an activase for choline TMA-lyase, CutC. CutC is a glycyl radical enzyme related to pyruvate formate-lyase, and this enzyme, CutD, is related to pyruvate formate-lyase activase. [Energy metabolism, Amino acids and amines] 309
20841 275189 TIGR04396 surf_polysacc surface carbohydrate biosynthesis protein. This model describes an uncharacterized homology region found broadly in proteins of surface carbohydrate biosynthesis regions. This family shows distant homology to regions of family TIGR04326, of spore coat polysaccharide biosynthesis protein SpsB from Bacillus subtilis, etc. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 321
20842 275190 TIGR04397 SecA2_Bac_anthr accessory Sec system translocase SecA2, Bacillus type. Members of this family always occur in genomes with the preprotein translocase SecA (TIGR00963) and closely resemble it, hence the designation SecA2. However, this appears to mark a different type of accessory Sec system SecA2 (TIGR03714) from the serine-rich glycoprotein type found in Staphylococcus and Streptococcus, and the actinobacterial SecA2 (TIGR04221). This type occurs in species including Bacillus anthracis, Geobacillus thermoglucosidasius, Solibacillus silvestris, etc. [Protein fate, Protein and peptide secretion and trafficking] 774
20843 275191 TIGR04398 SLAP_DUP SLAP domain. This domain is duplicated in SlaP (S-layer assembly protein), a partner of SecA2 in the Bacillus anthracis type of accessory Sec system (see TIGR04397). The domain is found, either once or twice, in additional Firmicutes species. 125
20844 275192 TIGR04399 acc_Sec_SLAP accessory Sec system S-layer assembly protein. Members of this family, designated S-layer assembly protein (SlaP), occur next to a Bacillus anthracis-type accessory Sec system SecA2. Members have two tandem copies of a duplicated domain (TIGR04398) that may also occur in other contexts. SlaP is found both free in the cytoplasm and membrane-associated. SecA2 and SlaP appear to work together to modify Sec for efficient S-layer secretion. [Protein fate, Protein and peptide secretion and trafficking] 288
20845 275193 TIGR04400 RK_trnsloc_Pase Arg-Lys translocation region protein phosphatase. The Sec-independent protein export system TAT, or twin-arginine translocation, is unusual in Leptospira, with Lys replacing Arg in the second position of the twin-Arg motif. This protein, restricted to Leptospira and showing distant homology to the phosphoserine phosphatases RsbU and SpoIIE, is always encoded immediately downstream of the tatC gene and appears to be part of the variant TAT system. It lacks a TAT signal itself, and so is more likely to be part of the Sec-independent translocation machinery than to be a substrate. The suggested symbol is rktP, for RK-Translocation Phosphatase. [Protein fate, Protein and peptide secretion and trafficking] 358
20846 275194 TIGR04401 TAT_Cys_rich twin-arginine translocation signal/Cys-rich four helix bundle protein. Members of this family average about 150 amino acids in length, beginning with a twin-arginine translocation signal sequence, then a His-rich spacer region, followed by a ~105-residue region in which thirteen positions are nearly invariant Cys residues. CDD (Conserved Domain Database) assigns members of this family to clan cl13994, the DUF326 superfamily, based on homology to PA2107 from Pseudomonas aeruginosa. PA2107 is a cysteine-rich four helical bundle protein, with solved structure PDB:3KAW. 150
20847 275195 TIGR04402 mob_CxxC_CxxC mobilome CxxCx(11)CxxC protein. Members of this family share twin CxxC motifs near the C-terminus, suggesting a DNA- or RNA-binding activity. The spacing between CxxC motifs is variable, from 11 to 16 amino acids. Members in general occur on plasmids or near other markers of lateral gene transfer (transposases, integrases, endonucleases, etc). 186
20848 275196 TIGR04403 rSAM_skfB sporulation killing factor system radical SAM maturase. Members of this family are a radical SAM enzyme of post-translational modification of ribosomally translated peptides. In Bacillus subtilis, the enzyme SkfB creates a sactipeptide (sulfur-to-alpha-carbon) crosslink of Cys-4 to Met-12 of the mature form of sporulation killing factor (SkfA). In Paenibacillus larvae subsp. larvae B-3650, the Met is replaced by Leu, so the modification must be different. SkfB has 2 4Fe-4S clusters, one in its radical SAM domain (pfam04055) and one in a region that somewhat resembles the SPASM domain (TIGR04085). 402
20849 275197 TIGR04404 RiPP_SkfA sporulation killing factor. Members of this family are ribosomally synthesized and post-translationally modified peptide natural products, modified by sulfur-to-alpha-carbon cross-link introduced by a radical SAM enzyme, SkfB (TIGR04403). 53
20850 275198 TIGR04405 SkfF sporulation killing factor system integral membrane protein. Members of this family, SkfF, occur only in cassettes of the sporulation killing factor system. This protein has multiple membrane-spanning domains and is encoded next to a protein with ATP-binding cassette (ABC) homology (SkfE), suggesting ABC transporter permease activity for this protein. 472
20851 275199 TIGR04406 LPS_export_lptB LPS export ABC transporter ATP-binding protein. Members of this family are LptB, the ATP-binding cassette protein of an ABC transporter involved in lipopolysaccharide export. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 239
20852 275200 TIGR04407 LptF_YjgP LPS export ABC transporter permease LptF. Members of this family are LptF, one of two homologous, tandem-encoded permease genes of an export ATP transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptG (TIGR04408). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 346
20853 275201 TIGR04408 LptG_lptG LPS export ABC transporter permease LptG. Members of this family are LptG, one of two homologous, tandem-encoded permease genes of an export ATP transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptF (TIGR04407). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 351
20854 275202 TIGR04409 LptC_YrbK LPS export ABC transporter periplasmic protein LptC. Members of this family are LptC, a periplasmic protein tethered to the inner membrane in the lipopolysaccharide (LPS) exporter transenvelope complex (Lpt), which is responsible for conducting LPS to the outer leaflet of the outer membrane in most Gram-negative bacteria. LptC is homologous to LptA, another member of the transenvelope complex. Genes lptC and lptA are often adjacent. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 180
20855 275203 TIGR04410 Spiro_T2SS_lipo type II secretion system-associated lipoprotein. Members of this family occur only in spirochetes (Leptospira, Leptonema, Turneriella), as part of a type II secretion system (T2SS) cassette. Properly extended gene models always include an N-terminal signal sequence ending with a Cys residue, suggesting this small protein (about 100 amino acids) is a lipoprotein. 117
20856 275204 TIGR04411 T2SS_GspN_Lepto type II secretion system protein N, Leptospira/Geobacter-type. Members of this family are the N (or GspN) protein of type II secretion systems (T2SS) as found in Leptospira, Geobacter, Myxococcus, and several other genera. Sequence similarity to GspN as found in, say, Gammaproteobacteria (see pfam01203) is extremely remote. [Protein fate, Protein and peptide secretion and trafficking] 279
20857 275205 TIGR04412 T2SS_GspM_XcpZ type II secretion system protein M, XcpZ-type. Members of this family are a variant form of the type II secretion system (T2SS) protein M, GspM, as found in several species of Pseudomonas. Members, including XcpZ, are short relatives to most members of pfam04612 (as of release 26.0). 126
20858 275206 TIGR04413 MYXAN_cmx8 CRISPR type MYXAN-associated protein Cmx8. Members of this family occur only in CRISPR/cas loci of the MYXAN type, but are present in a minority of such systems. This protein appears to replace the MYXAN system Cas8a1/Csx13 protein (TIGR03485/TIGR03486), compared to which it shows similar length and composition but only about 12 percent sequence identity. 528
20859 275207 TIGR04414 hepto_Aah_TibC autotransporter strand-loop-strand O-heptosyltransferase. Both Aah (autotransporter adhesin heptosyltransferase) and TibC (tib is enterotoxigenic invasion locus B) are protein O-heptosyltransferases that act on multiple sites from repeat regions of proteins exported by autotransporters. Aah glycosylates AIDA, or autotransporter adhesin involved in diffuse adherence, TibC acts on TibA, but TibC can replace Aah. [Protein fate, Protein modification and repair] 374
20860 275208 TIGR04415 O_hepto_targRPT autotransporter passenger strand-loop-strand repeat. This model describes two tandem copies of a strand-loop-strand repeat that occurs often in type V secretion system (T5SS). These repeats usually occur in the passenger domain of the classical monomeric autotransporter. Proteins with this repeat often are encoded next to a member of family TIGR04414, the Aah/TibC family O-heptosyltransferase, and may be glycosylated in regions with this repeat. 38
20861 275209 TIGR04416 group_II_RT_mat group II intron reverse transcriptase/maturase. Members of this protein family are multifunctional proteins encoded in most examples of bacterial group II introns. These group II introns are mobile selfish genetic elements, often with multiple highly identical copies per genome. Member proteins have an N-terminal reverse transcriptase (RNA-directed DNA polymerase) domain (pfam00078) followed by an RNA-binding maturase domain (pfam08388). Some members of this family may have an additional C-terminal DNA endonuclease domain that this model does not cover. A region of the group II intron ribozyme structure should be detectable nearby on the genome by Rfam model RF00029. [Mobile and extrachromosomal element functions, Other] 354
20862 275210 TIGR04417 PFTS_polysacc polysaccharide biosynthesis PFTS motif protein. Members of this protein family are found in O-antigen biosynthesis loci in Leptospira, two tandem homologs in a polysaccharide biosynthesis region in the archaeon Methanoregula formicicum, in Rhizobium leguminosarum bv. trifolii WSM2297, etc. Members are more strongly conserved in the C-terminal region, where an invariant sequence PFTS is found. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 519
20863 275211 TIGR04418 PriB_gamma primosomal replication protein PriB. Members of this protein family are primosomal replication protein N (PriB), virtually always encoded between rpsF (ribosomal protein S6) and rpsR (ribosomal protein S18). Note that only this short form, as found primarily in the gamma-Proteobacteria, of the single-stranded DNA binding protein family (see model TIGR00621) may partner with PriA and be involved in priming for re-initiation of replication. [DNA metabolism, DNA replication, recombination, and repair] 96
20864 275212 TIGR04419 no_iron_rSAM HemN-related non-iron pseudo-SAM protein PsgB. Members of this protein family are related to radical SAM enzymes HemN (oxygen-independent coproporphyrinogen III oxidase) and HutW (a putative heme utilization enzyme) but lack the signature CxxxCxxC motif for 4Fe-4S binding. Members occur exclusively in Borrelia, which appears to live without iron, as the only radical SAM enzyme homolog in any Borrelia genome. We designate this enzyme PsgB (Pseudo-SAM, Genus Borrelia). 378
20865 275213 TIGR04420 Sec_Non_Glob Sec region non-globular protein. Members of this family occur only in the genus Leptospira, always encoded between genes for the YajC and SecD components of the Sec preprotein translocase. Sequences have an N-terminal signal peptide and a C-terminal transmembrane segment. Between these are regions of non-globular, low-complexity sequence including Lys-rich and Ser/Thr/Asn/Glu-rich regions. 241
20866 275214 TIGR04421 FemAB_IMCC1989 FemAB family protein. Members of this family are FemA/FemB family proteins from a cassette that some Leptospira have in their O-antigen biosynthesis regions and share with a cassette from gamma proteobacterium IMCC1989. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 345
20867 275215 TIGR04422 PLP_IMCC1989 putative PLP-dependent aminotransferase. Members of this family are PLP-dependent enzymes, probable aminotransferases of the DegT/DnrJ/EryC1/StrS family. Members occur in the O-antigen biosynthesis regions of some Leptospira and in a related cassette in gamma proteobacterium IMCC1989. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 315
20868 275216 TIGR04423 casT3_TIGR04423 CRISPR type III-associated protein, TIGR04423 family. Members of this protein family occur only in species with CRISPR systems, in the context of type III systems that resemble type III-A (MTUBE) and III-B (the RAMP module). It occurs in several species of Prevotella, Helicobacter, Campylobacter, and Bacteroides. 124
20869 275217 TIGR04424 metallo_McbB McbB family protein. This family includes the partially characterized zinc metalloprotein McbB, part of the maturase system for microcin B17, a thiazole/oxazole-modified microcin (TOMM). Other members of this family belong to very different gene neighborhoods. The Cys residues that act as zinc ligands are conserved in most members, but for rare members are jointly absent. 286
20870 275218 TIGR04425 P_lya_rel_AroB putative sugar phosphate phospholyase (cyclizing). Members of this family tend to be found in O-antigen biosynthesis regions. Members frequently are misidentified as the closely related 3-dehydroquinate synthase, AroB (see family TIGR01357), the phospholyase that converts 3-deoxy-D-arabino-hept-2-ulosonate 7-phosphate to 3-dehydroquinate during chorismate biosynthesis. Most bacteria with this enzyme have a true AroB in a chorismate biosynthesis gene cluster. [Biosynthesis of cofactors, prosthetic groups, and carriers, Lipoate] 348
20871 275219 TIGR04426 rSAM_desII TDP-4-amino-4,6-dideoxy-D-glucose deaminase. Members of this protein family, including DesII, are radical SAM enzymes that deaminate TDP-4-amino-4,6-dideoxy-D-glucose to TDP-3-keto-4,6-dideoxy-D-glucose. This is the fourth step of the six step pathway in Streptomyces venezuelae for synthesizing D-desosamine, or 3-(dimethylamino)-3,4,6-trideoxyglucose, a precursor for many macrolide antibiotics. 468
20872 275220 TIGR04427 PLP_DesI dTDP-4-dehydro-6-deoxyglucose aminotransferase. Members of this family are pyridoxal phosphate-dependent aminotransferases that convert TDP-4-keto-6-deoxy-D-glucose to the 4-amino sugar form, TDP-4-amino-4,6-dideoxy-D-glucose. In Streptomyces venezuelae, this enzyme is designated DesI, catalyzing the third of six steps in the biosynthesis of TDP-D-desosamine, a component of a number of different macrolide antibiotics made by that organism. Related proteins, scoring below the trusted cutoff, include sugar aminotransferases in O-antigen biosynthesis regions. 390
20873 275221 TIGR04428 B12_rSAM_trp_MT tryptophan 2-C-methyltransferase. Members of this family are the B12-binding domain/radical SAM domain enzyme tryptophan methyltransferase, named TsrT in the cassette for thiostrepton biosynthesis. Thiostrepton and related thiopeptides are synthesized by extensive modification of a ribosomally translated product, but this enzyme is involved in a pathway that converts a free Trp residue to a quinaldic acid moiety before it is appended. [Cellular processes, Toxin production and resistance] 545
20874 275222 TIGR04429 Phr_nterm Phr family secreted Rap phosphatase inhibitor. Phr peptides are short peptides, best conserved in their amino-terminal regions, that are almost always encoded immediately downstream of a Rap phosphatase. A portion of the Phr peptide is secreted, enters another cell, and forms a quorum-sensing system by inhibiting its Rap phosphatase partner. The set of Phr peptides recognized by this N-terminal region model is disjoint from the PhrC/PhrF set recognized by pfam11131. [Regulatory functions, Protein interactions] 28
20875 275223 TIGR04430 OM_asym_MlaD outer membrane lipid asymmetry maintenance protein MlaD. Members of this protein family are the MlaD (maintenance of Lipid Asymmetry D) protein of an ABC transport system that seems to remove phospholipid from the outer leaflet of the Gram-negative bacterial outer membrane (OM), leaving only lipopolysaccharide in the outer leaflet. The Mla locus has long been associated with toluene tolerance, consistent with the proposed role in retrograde transport of phospholipid and therefore with maintaining the integrity of the OM as a protective barrier. 146
20876 275224 TIGR04431 N6_acetyl_AAC6 aminoglycoside N(6')-acetyltransferase, AacA4 family. Members of this family are the aacA4 type of aminoglycoside N(6')-acetyltransferase (EC 2.3.1.82), an enzyme that modifies and inactivates aminoglycoside antibiotics such as kanamycin, neomycin, tobramycin, and amikacin. Members are regularly spread among pathogens into integron, transposon, and plasmid loci, with recombination often happening within the protein-coding region. Most of the region amino-terminal to the recombination site or sites was removed from this model. [Cellular processes, Toxin production and resistance] 184
20877 275225 TIGR04432 rSAM_Cfr 23S rRNA (adenine(2503)-C(8))-methyltransferase Cfr. This model identifies Cfr, a 23S rRNA methyltransferase, EC 2.1.1.224, responsible for a transmissible form of chloramphenicol/florfenicol resistance. It is closely related to RlmN (see TIGR00048), an adenine C2-methyltransferase that acts at the same site where Cfr acts as a C8-methyltransferase. [Protein synthesis, tRNA and rRNA base modification] 341
20878 275226 TIGR04433 UrcA_uranyl UrcA family protein. Members of this family feature an N-terminal signal sequence, small size, and two invariant Cys residues, 10-20 residues apart. One member of this family, UrcA from the aerobic bacterium Caulobacter crescentus, is expressed so highly in response to uranium, but not other heavy metals, that a genetically engineered strain expressing green fluorescent protein in place of UrcA serves as a biodetector for micromolar uranyl ion. Caulobacter crescentus tolerates high levels of U(VI) exposure by mineralizing the uranium. UrcA and its homologs may participate in such a process. 91
20879 275227 TIGR04434 rSAM_Pput_1520 B12-binding domain/radical SAM domain protein, Pput_1520 family. Members of this family are radical SAM domain (pfam04055) enzymes with an N-terminal B12-binding domain (pfam02310), as is fairly common for radical SAM enzymes with lipid substrates. However, both domains as found in this family seem to be long-branch. The function is unknown, but in all cases a PLP-dependent enzyme (a cysteine desulfurase homolog) is found nearby. 515
20880 275228 TIGR04435 restrict_AAA_1 restriction system-associated AAA family ATPase. Members of this family are AAA family ATPases by homology. They occur regularly in a conserved gene neighborhood with the restriction (R), modification (M), and specificity (S) proteins of an apparent type I restriction enzyme system, plus one additional uncharacterized protein. It is not clear whether members of this family contribute to restriction per se, or to another process such as transfer of mobile elements. 555
20881 275229 TIGR04436 SpoChoClust_3 putative 2OG-Fe(II) oxygenase. This family, related to streptomycin biosynthesis protein StrG and to phytanoyl-CoA dioxygenase, belongs to the 2-oxoglutarate and Fe(II)-dependent oxygenase clan, which includes not just dioxygenases such as phytanoyl-CoA dioxygenase PhyH, but also some chlorinating enzymes involved in natural product biosynthesis. Members of this family occur so far only in O-antigen biosynthesis regions of select Leptospira, and include an ~80 residue additional C-terminal region shared by no other proteins. 300
20882 275230 TIGR04437 WaaZ_KDO_III Kdo-III transferase WaaZ. Members of this family are WaaZ, or Kdo-III transferase. This enzyme, present in some strains of E. coli and its allies but not others, performs a non-stoichiometric addition of a third 3-deoxy-D-manno-oct-2-ulosonic acid (KDO-III) onto some fraction of KDO-II in the lipopolysaccharide (LPS) inner core. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 252
20883 275231 TIGR04438 small_Trp_rich small Trp-rich protein. Members of this bacterial protein family average 80 residues in length, and average nearly 6 Trp residues (two of which are invariant) in the first 45, which are strongly hydrophobic. Past this region, the protein is highly charged, with large numbers of Lys, Arg, Asp, and Glu residues. Members usually are divergently transcribed from a gene encoding a c-type cytochrome. 76
20884 275232 TIGR04439 histamin_N_OH putative histamine N-monooxygenase. Members of this family are involved in synthesizing N-hydroxyhistamine as a precursor to acinetobactin, a siderophore found in Acinetobacter baumannii. Assuming histidine is first decarboxylated to histamine, then hydroxylated, members of this family are histamine N-monooxygenase. The putative histidine decarboxylase is found in the same biosynthetic cluster. 431
20885 275233 TIGR04440 glyco_TIGR04440 glycosyltransferase domain. This model describes a putative glycotransferase domain, related to the group 2 family glycosyltransferases of pfam00535. 215
20886 275234 TIGR04441 lept_O_ant_chp1 surface carbohydrate biosynthesis protein. Members of this protein family occur only in a subset of Leptospira species, and in those species occur only in O-antigen biosynthesis regions. Members average about 375 amino acids in length. The function is unknown. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 373
20887 275235 TIGR04442 TIGR04442 TIGR04442 family protein. Members of this family occur exclusively in certain Deltaproteobacteria, including Geobacter and Pelobacter. The function is unknown. 608
20888 275236 TIGR04443 F420_CofF coenzyme gamma-F420-2:alpha-L-glutamate ligase. Members of this family are the CofF, a RimK-related glutamate ligase that caps the gamma-glutamyl tail of the archaeal form of coenzyme F420, a hydride carrier. This enzyme does not appear in bacterial species, such as Mycobacterium tuberculosis, that make F420 with a different type of tail. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 283
20889 275237 TIGR04444 chori_FkbO_Hyg5 chorismatase, FkbO/Hyg5 family. Members of this family include two chemically distinct enzymes that cleave pyruvate from chorismate. FkbO and RapK convert chorismate to pyruvate plus (4R,5R)-4,5-dihydroxycyclohexa-1,5-dienecarboxylic acid (DCDC). HygB and Bra8 convert chorismate to pyruvate, water, and 3-hydroxybenzoate. These two enzymes are closely related, and lack homology to chorismate lyase, EC 4.1.3.40, which converts chorismate to pyruvate plus 4-hydroxybenzoate. 311
20890 275238 TIGR04445 preny_LynF_TruF peptide O-prenyltransferase, LynF/TruF/PatF family. Members of this family are prenyltransferases that modify mostly, perhaps exclusively, ribosomally derived cyclic peptides. Note that in some cases, the natural product becomes C-prenylated through a non-enzymatic rearrangement after initial O-prenylation. Prenylated products carry names such as prenylagaramide, anacyclamide, and piricyclamide. 282
20891 275239 TIGR04446 pren_cyc_PirE prenylated cyclic peptide, anacyclamide/piricyclamide family. Members of this protein family occur primarily in Cyanobacteria. They average about 50 residues in length and are the ribosomally translated precursors of peptide natural products whose modifications include cleavage, cyclization, and prenylation. Sequences are well-conserved in the N-terminal region. They are nearly invariant over the last eight residues, but hypervariable just before that stretch. A related family, often in a similar genome context, is TIGR03678. 48
20892 275240 TIGR04447 PatC_TenC_TruC cyanobactin cluster PatC/TenC/TruC protein. Members of this family usually are small proteins (PatC, TenC, TruC) of unknown function in cyanobactin (prenylated cyclic peptide) biosynthesis clusters, where a different small protein is a known cyanobactin precursor (patellamide, anacyclamide, piricyclamide, etc). They may instead be the C-terminal domain of a longer protein that otherwise consists mostly of lectin-like or VWF type A domains, in similar context. Similar to the cyanobactin precursors, members of this family have two very strongly conserved regions separated by a hypervariable region, suggesting these proteins may undergo a similar maturation. 34
20893 275241 TIGR04448 creatininase creatininase. Members of this family are creatininase (EC 3.5.2.10), an amidohydrolase that interconverts creatinine + H(2)O with creatine. It should not be confused with creatinase (EC 3.5.3.3), which hydrolyzes creatine to sarcosine plus urea. [Central intermediary metabolism, Nitrogen metabolism] 246
20894 275242 TIGR04449 halocin_C8_dom halocin C8-like bacteriocin domain. This model describes the 76-amino acid C-terminal domain of the halocin C8 precursor that actually becomes the mature bacteriocin halocin C8 after export and cleavage, as well as homologous C-terminal regions from many other archaea. Surprisingly, this Cys-rich region occurs also in many strains of Staphylococcus epidermidis. Gene regions do not provide evidence for post-translational modification other than cleavage. Halocin C8 is active against a broad range of archaea; the region N-terminal to the bacteriocin domain modeled here serves as the immunity protein for the cell secreting the bacteriocin. [Cellular processes, Toxin production and resistance] 69
20895 275243 TIGR04450 Gpos_C8_like putative immunity protein/bacteriocin. This model describes full-length proteins of Gram-positive bacteria that consist of an N-terminal signal peptide, a central region of unknown function, and a Cys-rich C-terminal region. In both the overall architecture and the apparent weak homology of the C-terminal region itself, these proteins resemble archaeal proteins such as the halocin C8 precursor. In that precursor, the C-terminal region is a bacteriocin but the N-terminal region functions as the immunity protein. The related family of halocin C8-like bacteriocins and their bacterial homologs are recognized by model TIGR04449. 234
20896 275244 TIGR04451 lanti_SCO0268 lantipeptide, SCO0268 family. Members of this family are putative lantipeptide (most likely lantibiotic) precursors, about 53 amino acids long, found in a lantibiotic-type biosynthetic cluster in several species of Streptomyces (S. coelicolor, S. griseoflavus, S. ambofaciens, etc.). This family is described in AntiSMASH as Streptomyces PEQAXS motif lantipeptide precursor. 53
20897 275245 TIGR04452 Lepto_Lipo_YY_C small lipoprotein, LA_3946 family. Members of this family are small lipoproteins that occur in at least nineteen species of Leptospira, but not so far anywhere outside that genus. Notable features include the putative lipoprotein modification Cys and an additional Cys near the C-terminus, both invariant, plus a well-conserved (although not invariant) Tyr-Tyr pair. From one to four paralogs occur in most Leptospira species. Members include LA_3946. 115
20898 275246 TIGR04453 Lepto_8Cys Cys-rich protein, LA_1312 family. Members of this protein family occur, so far, exclusively in the genus Leptospira. Members, although small (about 90 amino acids), have a predicted signal peptide followed by a region with eight invariant Cys residues. Some members have an additional Cys in the signal peptide region that suggests handling as a lipoprotein, but this Cys is not well conserved. 85
20899 275247 TIGR04454 Lepto_4Cys small lipoprotein, LB_250 family. Members of this family average about 92 amino acids in length, including an apparent lipoprotein signal peptide and a mature portion with four additional invariant Cys residues. This family is universal, so far, across at least twenty species of Leptospira but unknown outside the genus. 92
20900 275248 TIGR04455 lipo_MXAN_6521 probable lipoprotein, LA_1396/MXAN_6521 family. Members of this family are predicted lipoproteins, near 200 residues in length, found in multiple species of Leptospira (a spirochete) but also in the Myxococcales branch of the Deltaproteobacteria (Myxococcus xanthus, Chondromyces apiculatus, Cystobacter fuscus, Corallococcus coralloides), an uncommon mix of species. In both lineages, the N-terminal region appears to be a lipoprotein signal peptide. 191
20901 275249 TIGR04456 LruC_dom LruC domain. This domain is abundant in the Leptospira, in Bacteroides, and in Vibrio (three widely separated lineages). Most members have plausible lipoprotein signal peptides, including lipoprotein LruC from Leptospira interrogans and BACOVA_00967, from Bacteroides ovatus, with a solved crystal structure. Note that the C-terminal region of pfam13448 (length 83) matches the N-terminal region of some members of this domain (length 243). 242
20902 275250 TIGR04457 lipo_LenA_lepto lipoprotein LenA. Members of this family are LenA (Leptospira Endostatin-like protein A), found in pathogenic and intermediate species of Leptospira but not in saprophytes. LenA binds plasminogen, laminin, and complement regulator factor H. Behavior during outer membrane solubilization by low concentrations of Triton X-114 and conservation in all family members of an apparent lipoprotein signal sequence, with invariant Cys residue, strongly suggest that LenA is a lipoprotein. Just outside this family is a full-length homolog found in another spirochete, Turneriella parva DSM 21527. 224
20903 275251 TIGR04458 CYP450_TxtE 4-nitrotryptophan synthase. Members of this family are cytochrome P450 enzymes that convert L-tryptophan into L-4-nitrotryptophan. In thaxtomin gene clusters, this enzyme (TxtE) uses nitric oxide (NO) derived from arginine by the nitric oxide synthase TxtD, and O2, to perform the tryptophan nitration. L-4-nitrotryptophan is then used as a non-proteinogenic amino acid by non-ribosomal peptide synthases (NRPS). 403
20904 275252 TIGR04459 TisB_tox type I toxin-antitoxin system toxin TisB. Members of this family are the TisB toxin protein of a type I toxin-antitoxin system, meaning that the antitoxin is a non-coding RNA. TisB is induced by some types of stress (SOS, ciprofloxacin) and appears to induce a persister state by dissipating transmembrane potential, thus depleting ATP. Persister cells, unable to grow until the persister state changes, survive a number of challenges, including exposure to most antibiotics. [Cellular processes, Adaptations to atypical conditions] 29
20905 275253 TIGR04460 endura_MppR enduracididine biosynthesis enzyme MppR. Members of this family are MppR, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppR belongs to the acetoacetate decarboxylase-like superfamily. MppR catalyzes an aldol condensation and a dehydration, not a decarboxylation. 257
20906 275254 TIGR04461 endura_MppQ enduracididine biosynthesis enzyme MppQ. Members of this family are MppQ, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppQ is a PLP-dependent enzyme, predicted by homology to be an aminotransferase. 392
20907 275255 TIGR04462 endura_MppP enduracididine biosynthesis enzyme MppP. Members of this family are MppP, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppP is a PLP-dependent enzyme, predicted by homology to be an aminotransferase. 289
20908 275256 TIGR04463 rSAM_vs_C_rich radical SAM/SPASM domain protein maturase. Members of this family are probable protein/peptide-modifying radical SAM/SPASM domain proteins. The majority of members of this family seem to target Cys-rich repetitive regions of large proteins rather than of bacteriocin-sized small precursors. This arrangement suggests the modification target may be multifunctional, with the C-terminal domain behaving like a bacteriocin but other parts of the same precursor serving an immunity function, as occurs for the halocin C8 precursor. 439
20909 275257 TIGR04464 chaper_lep LipL41-expression chaperone Lep. Members of this protein family are Lep, an outer membrane lipoprotein LipL41-binding protein that appears to function as a chaperone important to its expression. LipL41 is the third most abundant lipoprotein in the pathogen Leptospira interrogans, but is found in saprophytic Leptospira species as well and is not essential for virulence. 114
20910 275258 TIGR04465 ArgArg_F420 TAT-translocated F420-dependent dehydrogenase, FGD2 family. Members of this family are F420-binding enzymes with a proven functional N-terminal twin-arginine translocation (TAT) signal. Members are homologous to the cytosolic F420-dependent glucose-6-phosphate dehydrogenase but do not share the same function. 364
20911 275259 TIGR04466 rSAM_BlsE cytosylglucuronate decarboxylase. BlsE, part of the blasticidin S biosynthetic pathway, is a radical SAM enzyme that performs a decarboxylation at C5 of the glucoside residue. MilG in mildiomycin biosynthesis is equivalent. This enzyme follows CGA synthase and makes the pyranoside core moiety of a class of peptidyl nucleoside antibiotics. [Cellular processes, Toxin production and resistance] 327
20912 275260 TIGR04467 CGA_synthase hydroxymethylcytosylglucuronate/cytosylglucuronate synthase. Members of this family synthesize cytosylglucuronate (or hydroxymethylcytosylglucuronate) from UDP-glucuronate and free cytosine (or hydroxymethylcytosine). This reaction is followed by a decarboxylation at C5 of the glucoside residue. The net reaction makes the pyranoside core moiety of a class of peptidyl nucleoside antibiotics. 386
20913 275261 TIGR04468 arg_2_3_am_muta arginine 2,3-aminomutase. Members of this family are arginine 2,3-aminomutase, a radical SAM enzyme more closely related to lysine 2,3-aminomutase than to glutamate 2,3-aminomutase. The enzyme makes L-beta-arginine, sometimes in the context of antibiotic biosynthesis (blasticidin S, mildiomycin, etc). Activity is proven in Streptomyces griseochromogenes, which makes blasticidin S. 351
20914 275262 TIGR04469 CGA_synth_rel CGA synthase-related protein. Members of this family are related to cytosylglucuronate (CGA) synthase (TIGR04467), and found in the same clusters as CGA synthase and CGA decarboxylase. These clusters produce peptidyl nucleoside antibiotics with a pyranoside core moiety, found in a number of Streptomyces species. Removal of the S. griseochromogenes member of this family, BlsF, from a heterologous expression system caused an increase, not blockage, of blasticidin S. [Cellular processes, Toxin production and resistance] 297
20915 275263 TIGR04470 rSAM_mob_pairB radical SAM mobile pair protein B. Members of this family are the downstream member (B) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown. 285
20916 275264 TIGR04471 rSAM_mob_pairA radical SAM mobile pair protein A. Members of this family are the upstream member (A) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown. 220
20917 275265 TIGR04472 reg_rSAM_mob mobile rSAM pair MarR family regulator. A number of human microbiome species from both the Firmicutes and the spirochete genus Treponema share a gene cassette encoding a pair of radical SAM enzymes and, in most cases, this MarR family transcriptional regulator as well. Sequence identity can exceed 96 percent, suggesting recent lateral transfer. These observations suggest the cassette confers resistance to a toxic compound such as an antibiotic. 141
20918 275266 TIGR04473 taxol_Phe_23mut phenylalanine aminomutase (L-beta-phenylalanine forming). Members of this family are the phenylalanine aminomutase known from taxol biosynthesis. This enzyme has the MIO prosthetic group (4-methylideneimidazole-5-one), derived from an Ala-Ser-Gly motif. Other MIO enzymes include Phe, Tyr, and His ammonia-lyases. This model serves as an exception to overrule assignments by equivalog model TIGR01226 for phenylalanine ammonia-lyase. 687
20919 275267 TIGR04474 tcm_partner three-Cys-motif partner protein. Members of this family occur regularly as a partner to a member of family pfam07505, which has been called a phage protein but which seems to occur also in other contexts. Members average about 400 residues in length, but the conserved region covered by the model averages 260 residues and excludes the C-terminus. Conserved motifs suggest enzymatic activity. Note that its frequent partner protein (see pfam07505) has a three-cysteine motif that resembles the Cx3CxxC motif of radical SAM proteins, and that in one branch (see TIGR04471) actually becomes Cx3CxxC. We suggest the name three-Cys-motif partner protein (tcmP), and renaming pfam07505 to three-Cys-motif family protein. 262
20920 275268 TIGR04475 Phe_D_beta_mut phenylalanine aminomutase (D-beta-phenylalanine forming). Members of this family have the MIO prosthetic group (4-methylideneimidazole-5-one), derived from an Ala-Ser-Gly motif. Other MIO enzymes include Phe, Tyr, and His ammonia-lyases. The member from Pantoea agglomerans, and probably all members, convert Phe to D-beta-phenylalanine (EC 5.4.3.11). By contrast, members of family TIGR04473 convert Phe to L-beta-phenylalanine (5.4.3.10), as in taxol biosynthesis. 510
20921 275269 TIGR04476 exosort_XrtN exosortase N. Members of this family are exosortase N (xrtN), a bacterial exosortase variant whose single target, encoded always by an adjacent gene, belongs to the vault protein inter-alpha-trypsin family. This system occurs in a number of spirochete (Leptospira) and Bacteroidetes species. 403
20922 275270 TIGR04477 sorted_by_XrtN XrtN system VIT domain protein. Members of this subfamily average about 850 amino acids in length, ending with a variant form of PEP-CTERM sorting signal. Members have a VIT (vault protein inter-alpha-trypsin inhibitor heavy chain) domain (pfam08487). Other bacterial subfamilies of VIT domain proteins have members with either GlyGly-CTERM or LPXTG C-terminal sorting signals. Members of this subfamily occur only in context next to a protein sorting/processing enzyme, exosortase N (XrtN). These subsystems occur both among the Bacteroidetes and in the spirochete genus Leptospira. 822
20923 275271 TIGR04478 rSAM_YfkAB radical SAM/CxCxxxxC motif protein YfkAB. Members of this highly conserved family in some Firmicutes have an N-terminal radical SAM domain (pfam04055) and a C-terminal pfam08756 domain with a CxCxxxxC motif that suggests binding to an additional metallocluster. It appears all correct sequences in this family are about 370 amino acids in length, containing the YfkA and YfkB regions originally reported as separate ORFs in Bacillus subtilis. Partial Phylogenetic Profiling shows occurrences almost exclusively in the Bacilli, with very few examples of either lateral transfer or gene loss. The essentially monophyletic distribution suggests a housekeeping function. Members have no well-conserved gene neighborhood. The function is unknown. [Unknown function, Enzymes of unknown specificity] 363
20924 275272 TIGR04479 bcpD_PhpK_rSAM radical SAM P-methyltransferase, PhpK family. Characterized members of this family are B12-binding domain/radical SAM domain enzymes that use methylcobalamin as a methyl donor to methylate a phosphorous atom during the biosynthesis of natural products such as bialaphos and phosalacine. These syntheses create an extremely rare C-P-C bond. All members of the seed alignment derive from genomic regions that include a non-ribosomal peptide synthase. Note that a single organism, Pelosinus fermentans JBW45 from Cr(VI)-contaminated groundwater, has eight additional homologs of unknown function that score between trusted and noise cutoffs of this model. [Cellular processes, Toxin production and resistance] 504
20925 275273 TIGR04480 D_pro_red_PrdA D-proline reductase (dithiol), PrdA proprotein. Members of this family are the PrdA proprotein. The polypeptide undergoes an autocatalytic cleavage that creates two subunits, one with a Cys-derived pyruvoyl moiety critical for activity. The D-proline reductase complex also contains a subunit derived from the prdB gene. The complex acts on D-proline, which may be supplied from L-proline by an active proline racemase encoded nearby. 587
20926 275274 TIGR04481 PR_assoc_PrdC proline reductase-associated electron transfer protein PrdC. Members of this family are encoded near the prdA and prdB genes for proteins of the proline reductase complex, are induced by proline, and are designated PrdC by Bouillaut, et al. Some members are selenoproteins (at two different positions), as is PrdB. Members are homologous to, but distinct from, electron transport protein RnfC. 425
20927 275275 TIGR04482 D_pro_red_PrdD proline reductase cluster protein PrdD. Members of this family are PrdD, encoded in the proline reductase gene cluster. Members are closely homologous to PrdA, which cleaves during maturation to create two of the subunits of the proline reductase complex, one of which has a Cys-derived pyruvoyl active site. 246
20928 275276 TIGR04483 D_pro_red_PrdB D-proline reductase (dithiol), PrdB protein. Members of this family form the PrdB subunit, usually a selenoprotein, in the D-proline reductase complex. The usual pathway is conversion of L-proline to D-proline by a racemase, then use of D-proline as an electron acceptor coupled to ATP generation under anaerobic conditions. 238
20929 275277 TIGR04484 thiosulf_SoxA sulfur oxidation c-type cytochrome SoxA. Members of this family are SoxA, a c-type cytochrome with a CxxCH motif, part of a heterodimer with SoxX. SoxXA, SoxYZ, and SoxB contribute to thiosulfate oxidation to sulfate. 211
20930 275278 TIGR04485 thiosulf_SoxX sulfur oxidation c-type cytochrome SoxX. Members of this family are SoxX, a c-type cytochrome with a CxxCH motif, part of a heterodimer with SoxA. SoxXA, SoxYZ, and SoxB contribute to thiosulfate oxidation to sulfate. 78
20931 275279 TIGR04486 thiosulf_SoxB thiosulfohydrolase SoxB. SoxB, a di-manganese(II) enzyme, works with SoxYZ and the c-type cytochrome SoxXA in oxidation of thiosulfate to sulfate. 556
20932 275280 TIGR04487 SoxY_para_1 SoxY-related AACIE arm protein. Members of this family are paralogs to the authentic thiosulfate oxidation system protein SoxY. True SoxY end with the sequence GGCG(G), the swinging arm in which the Cys residue covalently binds the inorganic sulfur moiety. In this family, members end with a different swinging arm sequence, [AS]AC[IVT]E. The few species with a member of this family always have authentic SoxY. 145
20933 275281 TIGR04488 SoxY_true_GGCGG thiosulfate oxidation carrier protein SoxY. Members of this family are bona fide SoxY, the sulfate carrier protein with the GGCGG or GGCG swinging arm C-terminal sequence. The Cys in the swinging arm is the residue to which the inorganic sulfate moiety becomes covalently attached. In some species, a closely related paralog occurs as well (TIGR04487), with a swinging arm sequence AACIE. More distantly related forms have an additional C-terminal SoxZ-related domain. All members are periplasmic and have an N-terminal twin-arginine translocation (TAT) signal sequence. 148
20934 275282 TIGR04489 exosort_XrtO exosortase O. Members of this protein family are a variant form of exosortase, XrtO, with a dedicated target typically encoded by the adjacent gene. Members have a unique C-terminal extension very different from EpsI (TIGR02914), the extension that many exosortases have. The targets of XrtO all are members of family TIGR02921, which describes a PEP-CTERM protein about 950 residues long, found in more than 15 genera so far. These PEP-CTERM proteins are unusually hydrophobic in stretches, suggesting an integral membrane location, which is unusual. About one third of the members of TIGR02921 are in genomes with this protein, exosortase O, always encoded by an adjacent gene. Genomes include Synechocystis sp. PCC 7509, Xenococcus sp. PCC 7305, Pleurocapsa sp. PCC 7327, Microcoleus vaginatus, Hahella chejuensis, Vibrio azureus NBRC 104587, etc. 450
20935 275283 TIGR04490 SoxZ_true thiosulfate oxidation carrier complex protein SoxZ. SoxZ forms a heterodimer with SoxY, the subunit that forms a covalent bond with a sulfur moiety during thiosulfate oxidation to sulfate. Note that virtually all proteins that have a SoxY domain fused to a SoxZ domain are functionally distinct and not involved in thiosulfate oxidation. 95
20936 275284 TIGR04491 reactive_PduG diol dehydratase reactivase alpha subunit. Members of this family are the alpha (large) subunit of the alpha-2/beta-2 tetrameric enzyme that reactivates B12-dependent trimeric diol dehydratases (1,2-propanediol dehydratase, glycerol dehydratase). Note that the beta subunit of the reactivase is homologous to the beta (medium) subunit of the diol dehydratase. The reactivase catalyzes the exchange of chemically inactivated B12 for active B12. This model excludes homologs from Mycobacterium (e.g. M. smegmatis), where several paralogous forms of the dehydratase occur and are exceptional also by not being found in a carboxysome-like microcompartment. 598
20937 275285 TIGR04492 VioB iminophenyl-pyruvate dimer synthase VioB. Following the action of a flavin-dependent L-amino acid oxidase that converts L-tryptophan to indole-3-pyruvate imine, the enzymes VioB (this family), RebD, and StaD can ligate two molecules, forming a coupled iminophenyl-pyruvate dimer. In the violacein biosynthesis pathway, this compound is acted on by VioE before it cyclizes spontaneously to chromopyrrolic acid. In the pathways of homolog StaD (staurosporine), chromopyrrolate is formed, and the enzyme is referred to as chromopyrrolate synthase. RebD is very similar to StaD, but acts on chlorinated Trp-derived molecules. [Cellular processes, Biosynthesis of natural products] 1003
20938 275286 TIGR04493 microcomp_PduM microcompartment protein PduM. Members of this family are PduM, a protein essential for forming functional microcompartments in which a trimeric B12-dependent enzyme acts as a dehydratase for 1,2-propanediol (Salmonella enterica) or glycerol (Lactobacillus reuteri). 153
20939 275287 TIGR04494 c550_PedF cytochrome c-550 PedF. Members of this family are c-type cytochromes with some remote similarity to the sulfur oxidation cytochrome SoxX. 133
20940 275288 TIGR04495 RiPP_XyeA putative rSAM-modified RiPP, XyeA family. Members of this family are short polypeptides with a conserved GG motif as found in bacteriocins that are cleaved upon export. Each gene occurs immediately upstream of a gene for a peptide-modifying radical SAM/SPASM domain protein. The system is designated Xye for genera Xenorhabdus, Yersinia, and Erwinia, hence XyeA for the precursor peptide. The vicinity will also contain a transport protein with a C39 family peptidase domain (see pfam03412), characteristic of GG motif cleave-on-export systems. The function of this RiPP family is unknown. 52
20941 275289 TIGR04496 rSAM_XyeB radical SAM/SPASM domain peptide maturase, XyeB family. Members of this family are radical SAM/SPASM domain enzymes associated with maturation of the XyeA family of GG-motif containing RiPP (Ribosomally synthesized, Post-translationally modified Peptide) natural products. 385
20942 275290 TIGR04497 GRASP_targ_2 putative ATP-grasp target RiPP. Members of this small family are putative RiPP (Ribosomally translated, Post-translationally modified Peptides) precursors, modified by RimK-like ATP-grasp proteins. Members are encoded near both an ATP-grasp protein and C39 peptidase domain-containing transporter. Members are short polypeptides that contain the GG motif expected for cleavage on export. 46
20943 275291 TIGR04498 AbiV_defense abortive infection protein, AbiV family. This family includes AbiV (abortive infection system V) from Lactococcus lactis, a phage resistance protein that causes certain phage infections to fail to lead to successful phage replication. Abortive infection mechanisms differ greatly. AbiV interacts directly with the protein SaV in phage p2 and blocks translation of phage proteins. 141
20944 275292 TIGR04499 abortive_AbiA abortive infection protein, AbiA family. Members of this protein family average about 650 amino acids in length, with an N-terminal region related to reverse transcriptases. The only characterized member is AbiA, with reported activity as an abortive infection protein for phage defense in Lactococcus lactis and (heterologously) in Streptococcus thermophilus. 615
20945 275293 TIGR04500 PpiC_rel_mature putative peptide maturation system protein. Members of this protein family have a novel N-terminal sequence region. Close homologs to the C-terminal region of this protein score well to PpiC-type peptidyl-prolyl cis-trans isomerase models (see pfam00639), yet no sequence within the family scores well to such models, suggesting origin within a branch of the PpiC family but subsequent neofunctionalization with a rapid change of sequence. The genome context for members always includes an ATP-grasp enzyme associated with peptide modification and a short polypeptide likely to be the modification target. 337
20946 275294 TIGR04501 microcomp_PduB microcompartment protein PduB. Members of this family are PduB, a protein of bacterial microcompartments for coenzyme B(12)-dependent utilization of 1,2-propanediol (hence pdu) or glycerol. The most closely related protein in ethanolamine utilization microcompartments is EutL (TIGR04502). 225
20947 275295 TIGR04502 microcomp_EutL microcompartment protein EutL. Members of this family are EutL, a protein of bacterial microcompartments for ethanolamine utilization (eut). The most closely related protein in microcompartments for utilization of 1,2-propanediol (hence pdu) or glycerol is PduB (TIGR04501). 214
20948 275296 TIGR04503 mft_etfB electron transfer flavoprotein, mycofactocin-associated. Members of this small protein family are putative electron transfer flavoproteins, related to FixA from E. coli and EtfB from Methylophilus methylotrophus but clearly forming a distinctive clade. All members occur in species with a mycofactocin system. We have proposed that mycofactocin is a redox carrier synthesized from a ribosomally translated peptide with aid from a radical SAM enzyme, analogous to PQQ. 290
20949 275297 TIGR04504 SDR_subfam_2 SDR family mycofactocin-dependent oxidoreductase. Members of this protein subfamily are putative oxidoreductases belonging to the larger SDR family. All members occur in genomes that encode a cassette for the biosynthesis of mycofactocin, a proposed electron carrier of a novel redox pool. Characterized members of this family are described as NDMA-dependent, meaning that a blue aniline dye serving as an artificial electron acceptor is required for members of this family to cycle in vitro, since the bound NAD residue is not exchangeable. This family resembles TIGR03971 most closely in the N-terminal region, consistent with the published hypothesis of NAD interaction with mycofactocin. See EC 1.1.99.36. [Unknown function, Enzymes of unknown specificity] 259
20950 275298 TIGR04505 PtsS_plasma phosphate ABC transporter substrate-binding protein. Members of this family are the substrate-binding protein of the phosphate ABC transporter as found in Mollicutes genera such as Mycoplasma, Mesoplasma, and Spiroplasma. The most similar sequences outside this family are PtsS in family TIGR02136, but sequence architecture differs considerably. Members of this family are never lipoproteins. 328
20951 275299 TIGR04506 F_threo_transal fluorothreonine transaldolase. Members of this family are fluorothreonine transaldolase, an enzyme involved in biosynthesis of 4-fluorothreonine, one of the few known naturally occurring organofluorine compounds. [Cellular processes, Biosynthesis of natural products] 609
20952 275300 TIGR04507 fluorinase adenosyl-fluoride synthase. Members of this family are fluorinase (adenosyl-fluoride synthase, EC 2.5.1.63), an enzyme involved in the first committed step in the biosynthesis of at least two different organofluorine compounds. Few organofluorine natural products are known. Related enzymes include chlorinases (EC 2.5.1.94) that lack fluorinase activity, although a fluorinase may show chlorinase activity. [Cellular processes, Biosynthesis of natural products] 285
20953 275301 TIGR04508 queE_Cx14CxxC 7-carboxy-7-deazaguanine synthase, Cx14CxxC type. In the pathway of 7-cyano-7-deazaguanine (preQ0) biosynthesis, the radical SAM enzyme QueE is quite variable. This model describes a variant form in which the three-Cys motif that binds the signature 4Fe-4S cluster takes the form Cx14CxxC, as in Burkholderia multivorans ATCC 17616. The crystal structure is known. 208
20954 275302 TIGR04509 mod_pep_NH_fam putative modified peptide. Members of this family average 110 residues in length, with strong N-terminal homology to both the nitrile hydratase (NH) alpha subunit and family TIGR03793 of NH-related ribosomally translated natural product precursors. A neighboring gene resembles SagB, the dehydrogenase of many thiazole and oxazole modified peptide systems, supporting the hypothesis that members of this family are post-translationally modified. 85
20955 275303 TIGR04510 mod_pep_cyc putative peptide modification system cyclase. Members of this family show homology to mononucleotidyl cyclases and to tetratricopeptide repeat (TPR) proteins. Members occur next to two other markers of ribosomal peptide modification systems. One is a dehydrogenase related to SagB proteins from thiazole/oxazole modification systems. The other is the putative precursor, related to the nitrile hydratase-related leader peptide (NHLP) and nitrile hydratase alpha subunit families. These systems occur in many species of Xanthomonas and Stenotrophomonas, among others. 814
20956 275304 TIGR04511 SagB_rel_DH_2 putative peptide maturation dehydrogenase. Members resemble the peptide maturation dehydrogenase SagB of thiazole and oxazole modification systems, and occur in what appears to be a new type of peptide modification system. One adjacent marker is a new type of nitrile hydratase alpha subunit-related putative precursor, TIGR04509, distantly related to the NHLP leader peptide family TIGR03793. Another is a large protein, TIGR04510, with regions similar to adenylate cyclases and TPR proteins. 380
20957 275305 TIGR04512 Mycopla_NOT_gsn STREFT protein. Members of this family occur strictly in the genus Mycoplasma, average 1050 in length with little length variability, have an N-terminal signal sequence, and exhibit no detectable sequence similarity to any characterized protein. Up to four tandem copies occur in some Mycoplasma (e.g. M. putrefaciens KS1). Incorrect inclusion of a 57-amino acid stretch of one family member in pfam08178, for a helix-turn-helix transcriptional regulator in several E. coli phage, has caused many members of this family to be annotated, in error, as GnsA/GnsB family proteins. We suggest the name STREFT (Secreted Thousand Residue Frequently Tandem) protein as a distinctive name to spread and replace the incorrect GnsA/GnsB designation. [Unknown function, General] 1041
20958 275306 TIGR04513 VPAMP_CTERM VPAMP-CTERM protein sorting domain. This domain is found as the extreme C-terminal region of four extracellular protein (mostly protease) precursor sequences in Chthoniobacter flavus Ellin428, first representative sequenced from the Spartobacteria class of phylum Verrucomicrobia. This domain contains the cognate signal for one of four exosortase family enzymes in the C. flavus genome, and coexists with the more common PEP-CTERM domain, found on more than 50 proteins. 28
20959 275307 TIGR04514 GWxTD_dom GWxTD domain. This domain, about 100 amino acids long, occurs in Actinobacteria and other little-studied Gram-negative bacteria, Sec-dependent proteins 300-800 in length. The domain is rich in aromatic residues, with Trp, Tyr, or Phe as the majority amino acid in ten of the twenty-four most-conserved residue positions. 105
20960 275308 TIGR04515 P450_rel_GT_act P450-derived glycosyltransferase activator. Members of this family resemble cytochrome P450 by homology, but lack a critical heme-binding Cys residue. Members in general are encoded next to a glycosyltransferase gene in a natural products biosynthesis cluster, physically interact with it, and help the glycosyltransferase achieve high specificity while retaining high activity. Many members of this family assist in the attachment of a sugar moiety to a natural product such as a polyketide. 384
20961 275309 TIGR04516 glycosyl_450act glycosyltransferase, activator-dependent family. Many biosynthesis clusters for secondary metabolites feature a glycosyltransferase gene next to a P450 homolog, often with the P450 lacking a critical heme-binding Cys. These P450-derived sequences seem to be allosteric activators of glycosyltransferases such as the member of this family. This model describes a set of related glycosyltransferases, many of which can be recognized as activator-dependent from genomic context. 418
20962 275310 TIGR04517 rSAM_PoyD radical SAM family RiPP maturation amino acid epimerase. This model describes PoyD and its homologs. These are divergent putative radical SAM enzymes, with the classical CxxxCxxC motif but with few members approaching the cutoff score of pfam04055. PoyD appears responsible for catalyzing a unidirectional L-to-D epimerization of 18 of the 48 residues in the core peptide of the first characterized polytheonamide. The RiPP (ribosomally translated natural product) precursor, and peptides encoded near many other members of this family, belong to the nitrile hydratase leader peptide (NHLP) family. 445
20963 275311 TIGR04518 ECF_S_folT_fam ECF transporter S component, folate family. Members of this model are the multiple membrane-spanning S (specificity) component of ECF (energy coupling factor) type uptake transporters. All seed members were found in the vicinity of the bifunctional enzyme folC, involved in making active cofactor from imported folate. However, some species have multiple members of this family, suggesting some diversity of function. [Transport and binding proteins, Unknown substrate] 162
20964 275312 TIGR04519 MoCo_extend_TAT MoCo/4Fe-4S cofactor protein extended Tat translocation domain. This model describes a forty-five residue domain in which the last six residues represent the start of a TAT (Twin-Arginine Translocation) sorting signal. TAT allows proteins already folded, with cofactor already bound, to transit the membrane and reach the periplasm with the ability to perform redox or other cofactor-dependent activities. TAT signals are not normally seen so far from a well-supported start site. Member proteins may all be mutually homologous, with both a molybdenum cofactor-binding domain and a 4Fe-4S dicluster-binding domain. 43
20965 275313 TIGR04520 ECF_ATPase_1 energy-coupling factor transporter ATPase. Members of this family are ATP-binding cassette (ABC) proteins by homology, but belong to energy coupling factor (ECF) transport systems. The architecture in general is two ATPase subunits (or a double-length fusion protein), a T component, and a substrate capture (S) component that is highly variable, and may be interchangeable in genomes with only one T component. This model identifies many but not all examples of the upstream member of the pair of ECF ATPases in Firmicutes and Mollicutes. [Transport and binding proteins, Unknown substrate] 268
20966 275314 TIGR04521 ECF_ATPase_2 energy-coupling factor transporter ATPase. Members of this family are ATP-binding cassette (ABC) proteins by homology, but belong to energy coupling factor (ECF) transport systems. The architecture in general is two ATPase subunits (or a double-length fusion protein), a T component, and a substrate capture (S) component that is highly variable, and may be interchangeable in genomes with only one T component. This model identifies many but not all examples of the downstream member of the pair of ECF ATPases in Firmicutes and Mollicutes. [Transport and binding proteins, Unknown substrate] 277
20967 275315 TIGR04522 EcfS_MSC_0063 putative energy coupling factor transporter S component, MSC_0063 family. This family of proteins is restricted to the Mollicutes (including Mycoplasma, Spiroplasma, and Ureaplasma). Members belong to a superfamily of multiple membrane-spanning proteins, among which those with assigned activities function as the S component (the specificity component) of ECF transporters. However, members fail to score better than the trusted cutoffs to previously built models for S component proteins (see pfam07155). [Transport and binding proteins, Unknown substrate] 151
20968 275316 TIGR04523 Mplasa_alph_rch helix-rich Mycoplasma protein. Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown. 745
20969 275317 TIGR04524 mycoplas_M_dom IgG-blocking virulence domain. This model defines a domain restricted to Mycoplasma and Ureaplasma proteins. Members include protein M of Mycoplasma genitalium, MG_281, a virulence protein that binds the IgG light chain to block the binding of antibody to antigen. The crystal structure of the protein M antibody-binding region is solved (PDB|4NZR), and includes this homology domain. Full-length homologs to MG_281 are known in a few other Mycoplasma species, but this model's seed alignment demonstrates distant homology to many additional proteins with a much wider distribution across the Mollicutes. Member proteins include paralogous families in some species, such as MCAP_0345, MCAP_0347, MCAP_0349, and MCAP_0351 in Mycoplasma capricolum. [Cellular processes, Pathogenesis] 251
20970 275318 TIGR04525 prot_M_MG281 IgG-blocking protein M. Members of this family, including MG_281 of Mycoplasma genitalium, bind conserved regions of the IgG light chain sequences, blocking IgG's normal function of antigen-specific binding. It is therefore an important virulence protein. Members of this family are found also in Mycoplasma pneumoniae, M. penetrans, M. gallisepticum, and M. iowae. Model TIGR04524 describes a region within this protein that is shared by many additional Mycoplasma and Ureaplasma proteins. [Cellular processes, Pathogenesis] 526
20971 275319 TIGR04526 predic_Ig_block putative immunoglobulin-blocking virulence protein. Members of this family are putative virulence proteins of Mycoplasma and Ureaplasma species. Members share a region of sequence similarity (see TIGR04524) with protein M, a Mycoplasma genitalium protein that binds a conserved light chain region of IgG and blocks its protective function of antigen-specific binding. The seed alignment for this model includes an N-terminal signal-anchor domain and a proline-rich linker domain, and a C-terminal extension, in addition to the protein M-like domain recognized by TIGR04524. [Cellular processes, Pathogenesis] 692
20972 275320 TIGR04527 mycoplas_twoTM two transmembrane protein. Members of this family are uncharacterized proteins from the genus Mycoplasma, typically about 260 amino acids long, with a hydrophobic predicted transmembrane alpha helix toward each end. Often two family members are encoded in tandem, e.g. MG_279 and MG_280 from Mycoplasma genitalium. 245
20973 275321 TIGR04528 acido_non_PQQ acido-empty-quinoprotein group A. Members of this family closely resemble quinoproteins and quinohemoproteins such as PQQ-dependent methanol, glucose, and shikimate dehydrogenases, but restricted to species of Acidobacteria unable to synthesize PQQ. Seven members occur in candidatus Solibacter usitatus Ellin6076, eleven in Acidobacteriaceae bacterium KBS 96, etc. Members have N-terminal signal sequences. They lack the pair of adjacent Cys residues, involved in electron transfer, typical for family TIGR03075, and they lack CxxCH motifs for cytochrome c-like heme-binding. What cofactor these paralogous families of enzymes might use is unclear. 491
20974 275322 TIGR04529 MTB_hemophore hemophore-related protein, Rv0203/Rv1174c family. Members of this family occur as paralogs in most Mycobacterium strains, including 2 in M. tuberculosis, 6 in M. avium, and 9 in M. smegmatis. Members have a cleaved N-terminal signal peptide and exactly two Cys residues in the mature protein, both at invariant positions. The best characterized member, Rv0203, is a hemophore, that is, a secreted polypeptide that binds heme and delivers it to a transport system for import. Hemophores are protein analogs of siderophores, natural products that chelate non-heme iron and deliver it to receptors for transport. The unrelated HasA family of hemophores has been described in Gram-negative bacteria such as Yersinia pestis and Pseudomonas aeruginosa. [Transport and binding proteins, Other] 77
20975 275323 TIGR04530 hemophoreRv0203 hemophore, mycobacterial-type. Members of this family, including Rv0203 from Mycobacterium tuberculosis, are secreted heme-binding proteins used in heme acquisition. Such proteins are called hemophores. Members have a cleavable N-terminal signal peptide, and a mature region just over 100 amino acids long with a pair of invariant Cys residues. An unrelated hemophore, HasA, occurs in Gram-negative pathogens such as Yersinia pestis. [Transport and binding proteins, Other] 113
20976 275324 TIGR04531 nonproteo_OH putative nonproteinogenic amino acid hydroxylase. This extremely rare protein family, a branch of the 2-oxoglutarate dependent oxygenase family related to proline 3-hydroxylase, appears only in natural product biosynthetic clusters that include nonribosomal peptide synthases. One member is PlyP from the polyoxypeptin A cluster, suggested to hydroxylate 3-methylproline. Another, GetF from the GE81112 biosynthetic gene cluster, is proposed to hydroxylate pipecolic acid. 277
20977 275325 TIGR04532 PT_fungal_PKS iterative type I PKS product template domain. Sequences found by this model are the so-called product template (PT) domain of various fungal iterative type I polyketide synthases. This domain resembles pfam14765, designated polyketide synthase dehydratase by Pfam, but members of that family are primarily bacterial, where type I PKS are predominantly modular, not iterative. The dehydratase active site residues well-conserved in pfam14765 (His in the first hot dog domain, Asp in the second hot dog domain) seem well conserved in this family also. 324
20978 275326 TIGR04533 cyanosortB_assc cyanoexosortase B-associated protein. Members of this protein family are found exclusively in the Cyanobacteria, usually encoded next to a gene encoding cyanoexosortase B (TIGR04156). This relationship resembles the association of the unrelated protein family TIGR04153 with cyanoexosortase A (TIGR03763), and of most exosortases with EpsI. 221
20979 275327 TIGR04534 ELWxxDGT_rpt ELWxxDGT repeat. This model describes a protein repeat with a well-conserved motif ELWxxDGT, and a periodicity of about 48. A single protein may have as many as 18 repeats. It may consist nearly entirely of this repeat, or may have other repeats as well (e.g. hyalin repeat). It is most common in the Deltaproteobacteria. 47
20980 275328 TIGR04535 ferrit_encaps ferritin-like protein. Two families of proteins are known to be encoded, with some regularity, next to the gene for encapsulin (pfam04454), which surrounds the target protein to form a prokaryotic proteinaceous organelle. One is the family of enzymes that includes Dyp-type peroxidases. The other is this family, with a resemblance to bacterioferritins. Encapsulin-associated forms of the proteins in these two families have a necessary C-terminal motif that resembles DGSL[SGN]IGSL[KR]. Members of this family that include the last columns of the model in the hit region (and are encoded next to the encapsulin gene) should be designated encapsulin-associated ferritin-like protein. 113
20981 275329 TIGR04536 geobac_encap encapsulated protein. Members of this family are lineage-restricted uncharacterized proteins found mostly in Brevibacillus and Geobacillus. Members are encoded next to the gene for encapsulin (which once was called a bacteriocin), and have the C-terminal motif for associating with encapsulin. [Unknown function, Enzymes of unknown specificity] 194
20982 275330 TIGR04537 encap_target encapsulation C-terminal sorting signal. This model describes a diverse region of extremely small size (11 residues), so unavoidably there are both false-positives and false-negatives. All true hits should occur in proteins encoded next to an encapsulating protein (see pfam04454), and should occur near the extreme C-terminus. Families previously known to have this domain on some members to mediate encapsulation include dye-decolorizing peroxidases and a ferritin-like protein, but (as this model helps show) there are others, including some hemerythrin family proteins and novel family TIGR04536. 11
20983 275331 TIGR04538 P450_cycloAA_1 cytochrome P450, cyclodipeptide synthase-associated. Members of this subfamily are cytochrome P450 enzymes that occur next to tRNA-dependent cyclodipeptide synthases. This group does NOT include CYP121 (Rv2275) from Mycobacterium tuberculosis, adjacent to the cyclodityrosine synthetase Rv2276. 395
20984 275332 TIGR04539 tRNA_cyclodipep tRNA-dependent cyclodipeptide synthase. Members of this family take two aminoacylated tRNA molecules and produce a cyclic dipeptide with two peptide bonds. This enzyme therefore produces a type of nonribosomal peptide, but by a mechanism entirely different from the typical non-ribosomal peptide synthase (NRPS) that relies on adenylation to activate amino acids. Three characterized members of this family are the cyclodityrosine synthase of Mycobacterium tuberculosis (an essential gene), a cyclo(L-Phe-L-Leu) synthase from Streptomyces noursei involved in natural product biosynthesis, and cyclodileucine synthase YvmC from Bacillus licheniformis. Many cyclodipeptide synthases are found next to a cytochrome P450 that further modifies the product. 220
20985 275333 TIGR04540 CLB_0814_fam conserved hypothetical protein. Members of this family are conserved hypothetical proteins in a narrow range of species. In Clostridium botulinum A ATCC 19397, the gene occurs immediately after a five gene operon for biosynthesis of the natural product bacimethrin, a thiamin antivitamin antibiotic. 73
20986 275334 TIGR04541 thiaminase_BcmE thiamine pyridinylase. Members of this family are thiamine pyridinylase (EC 2.5.1.2), also called thiaminase I. Most examples of this secreted, thiamine-degrading enzyme are encoded with a cluster for biosynthesis of the thiamine antivitamin bacimethrin. [Cellular processes, Biosynthesis of natural products] 381
20987 275335 TIGR04542 GMC_mycofac_2 GMC family mycofactocin-associated oxidoreductase. This model describes a set of dehydrogenases belonging to the glucose-methanol-choline oxidoreductase (GMC oxidoreductase) family. Members of the present family are restricted to the bacterial genus Gordonia, and seem to replace the related family TIGR03970, which occurs in Actinobacteria generally but not in the genus Gordonia. Members of both this family and TIGR03970 are associated with the mycofactocin biosynthesis operon in Actinobacteria. [Unknown function, Enzymes of unknown specificity] 425
20988 275336 TIGR04543 ketoArg_3Met 2-ketoarginine methyltransferase. This SAM-dependent C-methyltransferase performs the middle step of a three step conversion from arginine to beta-methylarginine. It performs a C-methylation at position 3 of 5-guanidino-2-oxopentanoic acid (keto-arginine). An aminotransferase converts arginine to 5-guanidino-2-oxopentanoic acid, and later converts 5-guanidino-3-methyl-2-oxopentanoic acid to beta-methylarginine. 331
20989 275337 TIGR04544 3metArgNH2trans beta-methylarginine biosynthesis bifunctional aminotransferase. Members of this family are the bifunctional aminotransferase that catalyzes the first and third steps in the three-step conversion of arginine to beta-methylarginine. It first converts arginine to 2-ketoarginine, then converts 3-methyl-2-ketoarginine to 3-methylarginine. All members of the seed alignment are encoded next to a 2-ketoarginine methyltransferase (EC 2.1.1.243). 366
20990 275338 TIGR04545 rSAM_ahbD_hemeb heme b synthase. Members of this family are AhbD (alternative heme biosynthetic protein D), a radical SAM enzyme in sulfate-reducing bacteria and methanogens that performs the last decarboxylations to synthesize heme b from Fe-coproporphyrin III. Members include DVU_0855, previously included in error in TIGR04055, the NirJ2 family thought to be involved in heme d1 biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 339
20991 275339 TIGR04546 rSAM_ahbC_deAc 12,18-didecarboxysiroheme deacetylase. This model describes one of a pair of radical SAM enzymes involved in the alternative heme biosynthesis (ahb) pathway for heme b biosynthesis from siroheme. This anaerobic pathway occurs in sulfate-reducing bacteria and methanogens. A very similar pair of radical SAM enzymes (TIGR04054, TIGR04055) is involved in heme d1 biosynthesis in species such as Heliobacillus mobilis and Heliophilum fasciatum. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 390
20992 275340 TIGR04547 Mollicu_LP MOLPALP family lipoprotein. Members of this family are surface lipoproteins, about 900 amino acids long on average, found only in the Mollicutes (Mycoplasma, Entomoplasma, Acholeplasma, Mesoplasma, Spiroplasma). Paralogs occur, such as MCAP_0360, MCAP_0361, and MCAP_0362 in Mycoplasma capricolum. This family shares significant N-terminal sequence similarity with STREFT (Secreted Thousand Residue Frequently Tandem), described by model TIGR04512; several members of the STREFT family have been misannotated as GnsA/GnsB family proteins. For proteins in this family, we suggest the name MOLPALP (Mollicutes Paralogous Lipoprotein) family lipoprotein. [Cell envelope, Surface structures] 805
20993 275341 TIGR04548 DnaD_Mollicutes DnaD family protein, Mollicutes type. This model describes the full length of a family of proteins in the Mollicutes (Mycoplasma, Spiroplasma, Mesoplasma, etc.) homologous to the N-terminal region of DnaD from Bacillus subtilis. [DNA metabolism, DNA replication, recombination, and repair] 178
20994 275342 TIGR04549 LP_HExxH_w_tonB substrate import-associated zinc metallohydrolase lipoprotein. Members of this family are lipoproteins with the typical zinc metallohydrolase HExxH motif and additional similarities to a better-documented zinc peptidase family, pfam06167. The seed alignment begins immediately after the lipoprotein motif Cys residue. Up to five members of this protein family occur per genome, in the context of certain gene pairs related to RagA and RagB, or to SusC and SusD. Those gene pairs, like the present family, are restricted to the Bacteroidetes, may number up to 100 pairs per genome, and are linked to TonB-dependent uptake of biopolymer-derived nutrients such as glycans. A possible function for this lipoprotein is to hydrolyse larger molecules to prepare substrates for import and utilization. [Unknown function, Enzymes of unknown specificity] 261
20995 275343 TIGR04550 sMetMonox_MmoD soluble methane monooxygenase-binding protein MmoD. Members of this family are MmoD, a protein that binds the soluble (as opposed to the membrane-bound, copper-rich, particulate) methane monooxygenase and may regulate its activity. Recent work suggests that MmoD, together with methanobactin, acts as a copper switch to regulate which enzyme form is produced. 64
20996 275344 TIGR04551 TIGR04551 TIGR04551 family protein. Members of this family are proteins of unknown function, about 620 amino acids in length, and universal in but restricted to the Myxococcales, an order within the Deltaproteobacteria with at least 15 sequenced genomes as of 8/2014. The most closely related homologs outside the Myxococcales show localized homology only and display sharply lower scores. Relatively few protein families (roughly 20) could be built to have a comparable restriction to the Myxococcales. The putative protein sorting signal MYXO-CTERM (TIGR03901) appears so far universal in but restricted to the Myxococcales, making the present family a candidate to be involved in recognizing and processing proteins with that signal. 523
20997 275345 TIGR04552 TIGR04552 TIGR04552 family protein. Members of this family are bacterial proteins, roughly 400 amino acids in length. Most members belong to the Deltaproteobacteria. All members of the Myxococcales, an order within the Deltaproteobacteria, have a member. The arrangement of conserved residues into invariant motifs suggests enzymatic activity. The function is unknown. 353
20998 275346 TIGR04553 ABC_peri_selen putative selenate ABC transporter periplasmic binding protein. Members of this family are ABC transporter periplasmic binding proteins and represent one clade within a larger family that includes phosphate, phosphite, and phosphonate transporters. All members of the seed alignment occur near a gene for SelD, the selenium-activating protein needed to make selenocysteine or selenouridine. Context therefore suggests members should be able to transport selenate, although transporting other substrates as well (e.g. phosphonates) is possible. This model has no overlap with TIGR03431, whose members are found regularly with phosphonate catabolism operons. 266
20999 275347 TIGR04554 3TM_mycoplas three transmembrane helix protein. Members of this rare family are small, highly hydrophobic, and restricted so far to the genus Mycoplasma, where it appears not to be essential. All members have three hydrophobic transmembrane helical segments. 105
21000 275348 TIGR04555 sulfite_DH_soxC sulfite dehydrogenase. Members of this family are the sulfite dehydrogenase SoxC. All members have a twin-arginine translocation (TAT) signal for secretion of proteins with bound cofactor across the plasma membrane. 408
21001 275349 TIGR04556 PKS_assoc polyketide synthase-associated domain. This model describes a rare domain found as the N-terminal region of a number of dinoflagellate-specific proteins that resemble type I polyketide synthases. 228
21002 275350 TIGR04557 fuse_rel_SoxYZ quinoprotein dehydrogenase-associated SoxYZ-like carrier. Members of this family are fusion proteins, with the N-terminal region similar to the sulfur oxidation protein SoxY (TIGR04488) and the C-terminal region similar to sulfur oxidation protein SoxZ (TIGR04490). Members occur exclusively in species with PQQ-dependent enzymes that have a Cys-Cys motif (TIGR03075) for electron transfer to c550 family cytochrome. By homology to the sulfur moiety-binding subunit SoxY, we predict the conserved Cys in the Gly-Gly-Cys motif binds some unknown adduct. 225
21003 275351 TIGR04558 SoxH_rel_PQQ_1 quinoprotein relay system zinc metallohydrolase 1. By homology, members are Zn metallohydrolases in the same family as the SoxH protein associated with sulfate metabolism, Bacillus cereus beta-lactamase II (see PDB:1bc2), and, more distantly, hydroxyacylglutathione hydrolase (glyoxalase II). All members occur in genomes with both PQQ biosynthesis and a PQQ-dependent (quinoprotein) dehydrogenase that has a motif of two consecutive Cys residues (see TIGR03075). The Cys-Cys motif is associated with electron transfer by specialized cytochromes such as c551. All these genomes also include a fusion protein (TIGR04557) whose domains resemble SoxY and SoxZ from thiosulfate oxidation. A conserved Cys in this fusion protein aligns to the Cys residue in SoxY that carries sulfur cycle intermediates. In many genomes, the genes for PQQ biosynthesis enzymes, PQQ-dependent enzymes, their associated cytochromes, and members of this family are clustered. Note that one to three closely related Zn metallohydrolases may occur; this family represents a specific clade among them. [Unknown function, Enzymes of unknown specificity] 285
21004 275352 TIGR04559 SoxH_rel_PQQ_2 quinoprotein relay system zinc metallohydrolase 2. By homology, members are Zn metallohydrolases in the same family as the SoxH protein associated with sulfate metabolism, Bacillus cereus beta-lactamase II (see PDB:1bc2), and, more distantly, hydroxyacylglutathione hydrolase (glyoxalase II). All members occur in genomes with both PQQ biosynthesis and a PQQ-dependent (quinoprotein) dehydrogenase that has a motif of two consecutive Cys residues (see TIGR03075). The Cys-Cys motif is associated with electron transfer by specialized cytochromes such as c551. All these genomes also include a fusion protein (TIGR04557) whose domains resemble SoxY and SoxZ from thiosulfate oxidation. A conserved Cys in this fusion protein aligns to the Cys residue in SoxY that carries sulfur cycle intermediates. In many genomes, the genes for PQQ biosynthesis enzymes, PQQ-dependent enzymes, their associated cytochromes, and members of this family are clustered. Note that one to three closely related Zn metallohydrolases may occur; this family represents a specific clade among them. Some members of this family have a short additional N-terminal domain with four conserved Cys residues. [Unknown function, Enzymes of unknown specificity] 283
21005 275353 TIGR04560 ribo_THX ribosomal small subunit protein bTHX. Members of this family are the lineage-specific bacterial ribosomal small subunit protein bTHX (previously THX), originally shown to exist in the genus Thermus. The protein is conserved for the first 26 amino acids, past which some members continue with additional sequence, often repetitive or low-complexity. This model also finds eukaryotic organelle forms, which have additional N-terminal transit peptides. [Protein synthesis, Ribosomal proteins: synthesis and modification] 26
21006 275354 TIGR04561 membra_charge integral membrane protein. Members of this protein are short (about 85-residue), low-complexity sequences of unknown function, with a highly hydrophobic N-terminal region of about 40 amino acids followed by a charged (Asp, Glu, Lys, and Arg-rich), sometimes repetitive C-terminal region. Members occur exclusively among the Mollicutes (Mycoplasma, Mesoplasma, Acholeplasma, Spiroplasma, Entomoplasma). The gene neighborhood of this protein is not conserved. 82
21007 275355 TIGR04562 TIGR04562 TIGR04562 family protein. Members of this family are proteins of unknown function, about 400 amino acids in length. Members are universal among the Myxococcales (a branch of the Deltaproteobacteria) and occur sporadically elsewhere. [Unknown function, General] 355
21008 275356 TIGR04563 TIGR04563 MXAN_4361/MXAN_4362 family small protein. Members of this family are small proteins that appear to be restricted to and yet universal in the Myxococcales. The function is unknown. Members include two tandem loci in Myxococcus xanthus DK 1622, MXAN_4361 and MXAN_4362, although members are not tandem in other Myxococcales. 53
21009 275357 TIGR04564 Synergist_CTERM Synergist-CTERM protein sorting domain. This model identifies a C-terminal domain of about 27 residues whose features are 1) a short Gly/Ser-rich region that ends in an invariant Gly-Cys motif, 2) a highly hydrophobic probable transmembrane alpha helix with a nearly invariant Pro near the end, and 3) a cluster of basic residues (Arg, Lys), and then the end of the protein. This domain occurs, so far, only in species of Synergistetes (Dethiosulfovibrio peptidovorans, Aminiphilus circumscriptus, Aminomonas paucivorans, Fretibacterium fastidiosum, Cloacibacillus evryensis, Synergistes jonesii, etc). This region closely resembles the MYXO-CTERM region of the Myxococcales, a division of the Deltaproteobacteria (see TIGR03901), but that domain lacks the conserved Pro, frequently has two Cys residues instead of one, and most importantly, has a spacer region separating the Gly-Cys motif from the transmembrane segment. As with MYXO-CTERM, the enzyme presumed to recognize and cleave the sorting signal is not known. The lack of a spacer region between motif and TM segment suggests the presumed protease is located largely within the membrane, like rhombosortase and archaeosortase, rather than merely tethered to it like sortase. 26
21010 275358 TIGR04565 OMP_myx_plus outer membrane beta-barrel protein. Members of this family are outer membrane beta-barrel proteins, as inferred by distant homologies to other families (e.g. pfam13505) and by the concentration of aromatic residues, especially Phe, in the OMP signal region, which is flush with the C-terminus in some members, but followed by a few residues in others. Members have variable insertions and deletions, affecting scores, so this model does not cleanly separate all members from all non-members. Members are common in the Myxococcales, with five occurring in Myxococcus xanthus DK 1622. 157
21011 275359 TIGR04566 myxo_TraA_Nterm outer membrane exchange protein TraA, N-terminal region. In Myxococcus xanthus, the protein pair TraA (MXAN_6895) and TraB (MXAN_6898) are required for contact-dependent exchange of outer membrane proteins. The C-terminal half of TraA consists largely of Cys-rich tandem repeats. This model describes the N-terminal region of TraA, and related protein MXAN_4924. This region is suggested to be similar to the lectin PA14. Members of this family are restricted to a subset of the Myxococcales, and so have a narrower species distribution than the MYXO-CTERM putative protein sorting signal (TIGR03901), which is universal in the Myxococcales. Note that TIGR04201 matches at least seven repeats in the C-terminal region of TraA. [Protein fate, Protein and peptide secretion and trafficking] 240
21012 275360 TIGR04567 RNAP_delt_lowGC DNA-directed RNA polymerase delta subunit. Members of this family are the RNA polymerase delta subunit, as found in the Firmicutes and the Mollicutes. All members of the seed alignment have an extended C-terminal low-complexity region, consisting largely of Asp and Glu, that is not included in the model. Proteins giving borderline scores should be checked to confirm a similar acidic C-terminal domain. [Transcription, DNA-dependent RNA polymerase] 83
21013 275361 TIGR04568 arch_SelU_Nterm selenouridine synthase, SelU N-terminal-like subunit. This protein is involved in biosynthesis of a selenonucleotide, probably 2-selenouridine, in tRNA of some archaea, such as Methanococcus maripaludis. This protein resembles the N-terminal region of bacterial SelU, and its partner protein resembles the C-terminal region. [Protein synthesis, tRNA and rRNA base modification] 215
21014 275362 TIGR04569 arch_SelU_Cterm selenouridine synthase, SelU C-terminal-like subunit. This protein is involved in biosynthesis of a selenonucleotide, probably 2-selenouridine, in tRNA of some archaea, such as Methanococcus maripaludis. This protein resembles the C-terminal region of bacterial SelU, and its partner protein resembles the N-terminal region. [Protein synthesis, tRNA and rRNA base modification] 217
21015 275363 TIGR04570 mollicut_2TM small integral membrane protein. Members of this extremely rare protein family occur in Mycoplasma mycoides and two species of Spiroplasma. The protein is small and hydrophobic with two predicted transmembrane (TM) regions. [Unknown function, General] 87
21016 275364 TIGR04571 LmtA_Leptospira lipid A Kdo2 1-phosphate O-methyltransferase. This family describes LmtA, which methylates a phosphate on the Kdo2 sugar of lipid A. The model is classified as exception (more specific than equivalog) to reflect that its scope is limited to the genus Leptospira, whereas homologs with matching activity might exist more broadly. Members of this family belong to the broader family of pfam04191, phospholipid methyltransferase, which includes a characterized yeast enzyme that acts on a range of unsaturated phospholipids. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 252
21017 237975 cd00001 PTS_IIB_man PTS_IIB, PTS system, Mannose/sorbose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. The active site histidine receives a phosphate group from the IIA subunit and transfers it to the substrate. 151
21018 237976 cd00002 YbaK_deacylase This CD includes cysteinyl-tRNA(Pro) deacylases from Haemophilus influenzae and Escherichia coli and other related bacterial proteins. These trans-acting, single-domain proteins are homologs of ProX and also the cis-acting prolyl-tRNA synthetase (ProRS) inserted (INS) editing domain. The bacterial amino acid trans-editing enzyme YbaK is a deacylase that hydrolyzes cysteinyl-tRNA(Pro)'s mischarged by prolyl-tRNA synthetase. YbaK also hydrolyzes glycyl-tRNA's, alanyl-tRNA's, seryl-tRNA's, and prolyl-tRNA's. YbaK is homologous to the INS domain of prolyl-tRNA synthetase (ProRS) as well as the trans-editing enzyme ProX of Aeropyrum pernix which hydrolyzes alanyl-tRNA's and glycyl-tRNA's. 152
21019 237977 cd00003 PNPsynthase Pyridoxine 5'-phosphate (PNP) synthase domain; pyridoxal 5'-phosphate is the active form of vitamin B6 that acts as an essential, ubiquitous coenzyme in amino acid metabolism. In bacteria, formation of pyridoxine 5'-phosphate is a step in the biosynthesis of vitamin B6. PNP synthase, a homooctameric enzyme, catalyzes the final step in PNP biosynthesis, the condensation of 1-amino-acetone 3-phosphate and 1-deoxy-D-xylulose 5-phosphate. PNP synthase adopts a TIM barrel topology, intersubunit contacts are mediated by three ''extra'' helices, generating a tetramer of symmetric dimers with shared active sites; the open state has been proposed to accept substrates and to release products, while most of the catalytic events are likely to occur in the closed state; a hydrophilic channel running through the center of the barrel was identified as the essential structural feature that enables PNP synthase to release water molecules produced during the reaction from the closed, solvent-shielded active site. 234
21020 320674 cd00004 Sortase Sortase domain. Sortases are cysteine transpeptidases, mainly found in Gram-positive bacteria, which either anchor surface proteins to peptidoglycans of the bacterial cell wall envelope or link proteins together to form pili by working alone, or in concert with other enzymes. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes based on sequence, membrane topology, genomic positioning, and cleavage site preference. The different classes are called class A to F sortases. Most Gram-positive bacteria contain more than one sortase and it is thought that the different sortases attach different surface protein classes. The typical eight-stranded beta-barrel fold is observed in all known sortases, along with the conserved catalytic triad consisting of cysteine, histidine and arginine residues. Some sortases contain an N-terminal signal peptide only and the C-terminus serves as a membrane anchor, which represents a type I membrane topology, with the N-terminal enzymatic portion projecting towards the bacterial surface and the C-terminal end residing in the cytoplasm. Other sortases adopt a type II membrane topology, with the N-terminal hydrophobic segment inside the cytoplasm and the C-terminal enzymatic portion located across the plasma membrane. The N-terminus functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. Sortases are also present in some Gram-negative and Archaebacterial species, but the functions of these enzymes are unknown. 125
21021 187674 cd00005 CBM9_like_1 DOMON-like type 9 carbohydrate binding module of xylanases. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. The CBM9 domain frequently occurs in tandem repeats; members found in this subfamily typically co-occur with glycosyl hydrolase family 10 domains and are annotated as endo-1,4-beta-xylanases. CBM9 from Thermotoga maritima xylanase 10A is reported to have specificity for polysaccharide reducing ends. 185
21022 237978 cd00006 PTS_IIA_man PTS_IIA, PTS system, mannose/sorbose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. IIA subunits receive phosphoryl groups from HPr and transfer them to IIB subunits, which in turn phosphorylate the substrate. 122
21023 350199 cd00008 PIN_53EXO-like FEN-like PIN domains of the 5'-3' exonucleases of DNA polymerase I, bacteriophage T4 RNase H and T5-5' nucleases, and homologs. PIN (PilT N terminus) domains of the 5'-3' exonucleases (53EXO) of multi-domain DNA polymerase I and single domain protein homologs, as well as, the PIN domains of bacteriophage T5-5'nuclease (T5FEN or 5'-3'exonuclease), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar nucleases are included in this family. The 53EXO of DNA polymerase I recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The T5-5'nuclease is a 5'-3'exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'- 3'exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 158
21024 99707 cd00009 AAA The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases. 151
21025 237980 cd00010 AAI_LTSS AAI_LTSS: Alpha-Amylase Inhibitors (AAI), Lipid Transfer (LT) and Seed Storage (SS) Protein family; a protein family unique to higher plants that includes cereal-type alpha-amylase inhibitors, lipid transfer proteins, seed storage proteins, and similar proteins. Proteins in this family are known to play important roles in defending plants from insects and pathogens, lipid transport between intracellular membranes, and nutrient storage. Many proteins of this family have been identified as allergens in humans. These proteins contain a common pattern of eight cysteines that form four disulfide bridges. 63
21026 153270 cd00011 BAR_Arfaptin_like The Bin/Amphiphysin/Rvs (BAR) domain of Arfaptin-like proteins, a dimerization module that binds and bends membranes. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization, lipid binding and curvature sensing module present in Arfaptins, PICK1, ICA69, and similar proteins. Arfaptins are ubiquitously expressed proteins implicated in mediating cross-talk between Rac, a member of the Rho family GTPases, and Arf (ADP-ribosylation factor) small GTPases. Arfaptins bind to GTP-bound Arf1, Arf5, and Arf6, with strongest binding to GTP-Arf1. Arfaptins also bind to Rac-GTP and Rac-GDP with similar affinities. The Arfs are thought to bind to the same surface as Rac, and their binding is mutually exclusive. Protein Interacting with C Kinase 1 (PICK1) plays a key role in the trafficking of AMPA receptors, which are critical for regulating synaptic strength and may be important in cellular processes involved in learning and memory. Islet cell autoantigen 69-kDa (ICA69) is a diabetes-associated autoantigen that is involved in membrane trafficking at the Golgi complex in neurosecretory cells. ICA69 associates with PICK1 through their BAR domains to form a heterodimer which is involved in regulating the synaptic targeting and surface expression of AMPA receptors. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 203
21027 212657 cd00012 NBD_sugar-kinase_HSP70_actin Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily. This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure. 185
21028 200435 cd00013 ADF_gelsolin Actin depolymerization factor/cofilin- and gelsolin-like domains. Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 97
21029 409031 cd00014 CH_SF calponin homology (CH) domain superfamily. CH domains are actin filament (F-actin) binding motifs, which may be present as a single copy or in tandem repeats (which increase binding affinity). They either function as autonomous actin binding motifs or serve a regulatory function. CH domains are found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, as well as proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav). 103
21030 237982 cd00015 ALBUMIN Albumin domain, contains five or six internal disulphide bonds; albuminoid superfamily includes alpha-fetoprotein which binds various cations, fatty acids and bilirubin; vitamin D-binding protein which binds to vitamin D, its metabolites, and fatty acids; alpha-albumin which binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs; and afamin of which little is known; these belong to a multigene family with highly conserved intron/exon organization and encoded protein structures; evolutionary comparisons strongly support vitamin D-binding protein as the original gene in this group with subsequent local duplications generating the remaining genes in the cluster 185
21031 293732 cd00016 ALP_like alkaline phosphatases and sulfatases. This family includes alkaline phosphatases and sulfatases. Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Both alkaline phosphatase and sulfatase are essential for human metabolism. Deficiency of an individual enzyme causes genetic disease. 237
21032 237984 cd00017 ANATO Anaphylatoxin homologous domain; C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to repeats in fibulins. 70
21033 237985 cd00018 AP2 DNA-binding domain found in transcription regulators in plants such as APETALA2 and EREBP (ethylene responsive element binding protein). In EREBPs the domain specifically binds to the 11 bp GCC box of the ethylene response element (ERE), a promoter element essential for ethylene responsiveness. EREBPs and the C-repeat binding factor CBF1, which is involved in stress response, contain a single copy of the AP2 domain. APETALA2-like proteins, which play a role in plant development, contain two copies. 61
21034 237986 cd00019 AP2Ec AP endonuclease family 2; These endonucleases play a role in DNA repair. They cleave phosphodiester bonds at apurinic or apyrimidinic sites; the alignment also contains hexulose-6-phosphate isomerases, enzymes that catalyze the epimerization of D-arabino-6-hexulose 3-phosphate to D-fructose 6-phosphate, via cleaving the phosphoester bond with the sugar. 279
21035 380813 cd00021 Bbox_SF B-box-type zinc finger superfamily. The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interactions. Based on different consensus sequences and the spacing of the 7-8 zinc-binding residues, B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2). 39
21036 237989 cd00022 BIR Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger. 69
21037 237990 cd00023 BBI Bowman-Birk type proteinase inhibitor (BBI); family of plant serine protease inhibitors that block trypsin or chymotrypsin. They are either single-headed (one reactive site, one inactive site, present mainly in monocotyledonous seeds) or double-headed (two reactive sites, present mainly in dicotyledonous seeds). 55
21038 349274 cd00024 CD_CSD CHROMO (CHRomatin Organization MOdifier) domains and chromo shadow domains. Members of this group are chromodomains or chromo shadow domains; these are SH3-fold-beta-barrel domains of the chromo-like superfamily. Chromodomains lack the first strand of the SH3-fold-beta-barrel; this first strand is altered by insertion in the chromo shadow domains. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. Chromodomain-containing proteins include: i) those having an N-terminal chromodomain followed by a related chromo shadow domain, such as Drosophila and human heterochromatin protein Su(var)205 (HP1), and mammalian modifier 1 and 2; ii) those having a single chromodomain, such as Drosophila protein Polycomb (Pc), mammalian modifier 3, human Mi-2 autoantigen, and several yeast and Caenorhabditis elegans proteins of unknown function; iii) those having paired tandem chromodomains, such as mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1; and iv) elongation factor eEF3, a member of the ATP-binding cassette (ABC) family of proteins, that serves an essential function in the translation cycle of fungi. eEF3 is a soluble factor lacking a transmembrane domain and having two ABC domains arranged in tandem, with a unique chromodomain inserted within the ABC2 domain. 50
21039 237992 cd00025 BPI1 BPI/LBP/CETP N-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 223
21040 237993 cd00026 BPI2 BPI/LBP/CETP C-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 200
21041 349339 cd00027 BRCT C-terminal domain of the breast cancer suppressor protein (BRCA1) and related domains. The BRCT (BRCA1 C-terminus) domain is found within many DNA damage repair and cell cycle checkpoint proteins. BRCT domains interact with each other forming homo/hetero BRCT multimers, but are also involved in BRCT-non-BRCT interactions and interactions within DNA strand breaks. BRCT tandem repeats bind to phosphopeptides; it has been shown that the repeats in human BRCA1 bind specifically to pS-X-X-F motifs, mediating the interaction between BRCA1 and the DNA helicase BACH1, or BRCA1 and CtIP, a transcriptional corepressor. It is assumed that BRCT repeats play similar roles in many signaling pathways associated with the response to DNA damage. 68
21042 237995 cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses. 116
21043 410341 cd00029 C1 protein kinase C conserved region 1 (C1 domain) superfamily. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains. It contains the motif HX12CX2CXnCX2CX4HX2CX7C, where C and H are cysteine and histidine, respectively; X represents other residues; and n is either 13 or 14. C1 has a globular fold with two separate Zn(2+)-binding sites. It was originally discovered as lipid-binding modules in protein kinase C (PKC) isoforms. C1 domains that bind and respond to phorbol esters (PE) and diacylglycerol (DAG) are referred to as typical, and those that do not respond to PE and DAG are deemed atypical. A C1 domain may also be referred to as PKC or non-PKC C1, based on the parent protein's activity. Most C1 domain-containing non-PKC proteins act as lipid kinases and scaffolds, except PKD which acts as a protein kinase. PKC C1 domains play roles in membrane translocation and activation of the enzyme. 50
21044 175973 cd00030 C2 C2 domain. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 102
21045 206635 cd00031 CA_like Cadherin repeat-like domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan. 98
21046 237997 cd00032 CASc Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs. 243
21047 153056 cd00033 CCP Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system. SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function. 57
21048 349275 cd00034 CSD chromo shadow domain. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 52
21049 211311 cd00035 ChtBD1 Hevein or type 1 chitin binding domain. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 39
21050 213175 cd00036 ChtBD3 Chitin/cellulose binding domains of chitinase and related enzymes. This group contains proteins related to the cellulose-binding domain of Erwinia chrysanthemi endoglucanase Z (EGZ) and Serratia marcescens chitinase B (ChiB). Gram negative plant parasite Erwinia chrysanthemi produces a variety of depolymerizing enzymes to metabolize pectin and cellulose on the host plant. Cellulase EGZ has a modular structure, with an N-terminal catalytic domain linked to a C-terminal cellulose-binding domain (CBD). CBD mediates the secretion activity of EGZ. Chitinases allow certain bacteria to utilize chitin as an energy source. Typically, non-plant chitinases are of the glycosidase family 18. Bacillus circulans Glycosidase ChiA1 hydrolyzes chitin and is comprised of several domains: the C-terminal chitin binding domain, an N-terminal catalytic domain, and 2 fibronectin type III-like domains. Bacillus circulans WL-12 ChiA1 facilitates invasion of fungal cell walls. The ChiA1 chitin binding domain is required for the specific recognition of insoluble chitin. Although topologically and structurally related, ChiA1 lacks the characteristic aromatic residues of Erwinia chrysanthemi endoglucanase Z (CBD(EGZ)). Streptomyces griseus Chitinase C is a family 19 chitinase, and consists of an N-terminal chitin binding domain and a C-terminal chitin-catalytic domain that effects degradation. ChiC contains the characteristic chitin-binding aromatic residues. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source. 40
21051 153057 cd00037 CLECT C-type lectin (CTL)/C-type lectin-like (CTLD) domain. CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model. 116
21052 237999 cd00038 CAP_ED effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domains similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels 115
21053 119409 cd00039 COLIPASE Colipase; a stoichiometric cofactor for pancreatic lipase, allowing the enzyme to anchor itself to the water-lipid interface and stabilizing the active enzyme conformation 90
21054 238000 cd00040 CSF2 Granulocyte Macrophage Colony Stimulating Factor (GM-CSF) is a member of the large family of polypeptide growth factors called cytokines. It stimulates a wide variety of hematopoietic and nonhematopoietic cell types via binding to members of the cytokine receptor family, mainly the GM-CSF receptor. 121
21055 238001 cd00041 CUB CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast. 113
21056 238002 cd00042 CY Substituted updates: Jan 30, 2002 105
21057 410207 cd00043 CYCLIN_SF Cyclin box fold superfamily. The cyclin box is a protein binding domain that functions in cell-cycle and transcriptional control. It is about 100 amino acids in length, composed of five helices, and is present in cyclins, transcription initiation factor IIB (TFIIB), and retinoblastoma tumour suppressor protein (Rb). Cyclins consist of 8 classes of cell cycle regulators that function as regulatory subunits of cyclin-dependent kinases (CDKs), which are serine/threonine kinases. The catalytic activities of CDKs are modulated not only by their interactions with cyclins but also by CDK inhibitors (CKIs). CDKs, cyclins and CKIs play key roles in transcription, epigenetic regulation, metabolism, stem cell self-renewal, neuronal functions, and spermatogenesis. TFIIB is a transcription factor that binds the TATA box. Members in this superfamily contain one or two copies of the cyclin box. 82
21058 238004 cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. 315
21059 260016 cd00045 DED Death Effector Domain: a protein-protein interaction domain. Death Effector Domains comprise a subfamily of the Death Domain (DD) superfamily. DED-containing proteins include Fas-Associated via Death Domain (FADD), Astrocyte phosphoprotein PEA-15, the initiator caspases (caspase-8 and -10), and FLICE-inhibitory protein (FLIP), among others. These proteins are prominent components of the programmed cell death (apoptosis) pathway. Some members also have non-apoptotic functions such as regulation of insulin signaling (DEDD and PEA15) and cell cycle progression (DEDD). DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. 77
21060 350668 cd00046 SF2-N N-terminal DEAD/H-box helicase domain of superfamily 2 helicases. The DEAD/H-like superfamily 2 helicases comprise a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This N-terminal domain contains the ATP-binding region. 146
21061 350343 cd00047 PTPc catalytic domain of protein tyrosine phosphatases. Protein tyrosine phosphatases (PTP, EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG, and are characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active. 200
21062 380679 cd00048 DSRM_SF double-stranded RNA binding motif (DSRM) superfamily. DSRM (also known as dsRBM) is a 65-70 amino acid domain that adopts an alpha-beta-beta-beta-alpha fold. It is not sequence specific, but highly specific for double-stranded RNAs (dsRNAs) of various origin and structure. The DSRM domains are found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila Staufen protein, E. coli RNase III, RNase H1, and dsRNA dependent adenosine deaminases. They are involved in numerous cellular mechanisms ranging from localization and transport of messenger RNAs, through maturation and degradation of RNAs, to viral response and signal transduction. Some members harbor tandem DSRMs that act in small RNA biogenesis. 57
21063 199811 cd00049 MH1 N-terminal Mad Homology 1 (MH1) domain. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain. 121
21064 199819 cd00050 MH2 C-terminal Mad Homology 2 (MH2) domain. The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers. 170
21065 238008 cd00051 EFh EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers. 63
21066 238009 cd00052 EH Eps15 homology domain; found in proteins implicated in endocytosis, vesicle transport, and signal transduction. The alignment contains a pair of EF-hand motifs, typically one of them is canonical and binds to Ca2+, while the other may not bind to Ca2+. A hydrophobic binding pocket is formed by residues from both EF-hand motifs. The EH domain binds to proteins containing NPF (class I), [WF]W or SWG (class II), or H[TS]F (class III) sequence motifs. 67
21067 238010 cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF); present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium. 36
21068 238011 cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements. 38
21069 238012 cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies 50
21070 238013 cd00056 ENDO3c endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases 158
21071 238014 cd00057 FA58C Substituted updates: Jan 31, 2002 143
21072 238015 cd00058 FGF Acidic and basic fibroblast growth factor family; FGFs are mitogens, which stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell surface FGF receptors. Upon binding to FGF, the receptors dimerize and their intracellular tyrosine kinase domains become active. FGFs have internal pseudo-threefold symmetry (beta-trefoil topology). 123
21073 410788 cd00059 FH_FOX Forkhead (FH) domain found in Forkhead box (FOX) family of transcription factors and similar proteins. The FOX family comprises diverse tissue- and cell type-specific transcription factors with an evolutionary conserved "Forkhead (FH)" or "winged helix" DNA-binding domain. FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. The structure of the FH domain contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. They participate in a variety of cellular processes, such as cell cycle progression, proliferation, differentiation, migration, metabolism, and DNA damage response. Their expression can be regulated by multiple factors, and they can act as co-activators and/or transcriptional repressors. Fifty FOX-encoding genes in humans have been categorized into 19 subfamilies based on protein sequence homology (FOXA to FOXS). 75
21074 238017 cd00060 FHA Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation). 102
21075 238018 cd00061 FN1 Fibronectin type 1 domain, approximately 40 residue long with two conserved disulfide bridges. FN1 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices. Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. FN1 domains also found in coagulation factor XII, HGF activator, and tissue-type plasminogen activator. In tissue plasminogen activator, FN1 domains may form functional fibrin-binding units with EGF-like domains C-terminal to FN1. 43
21076 238019 cd00062 FN2 Fibronectin Type II domain: FN2 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices. Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. Fibronectin is composed of 3 types of modules, FN1, FN2 and FN3. The collagen binding domain contains four FN1 and two FN2 repeats. 48
21077 238020 cd00063 FN3 Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases. 93
21078 238021 cd00064 FU Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. 49
21079 277249 cd00065 FYVE_like_SF FYVE domain like superfamily. FYVE domain is a 60-80 residue double zinc finger motif-containing module named after the four proteins, Fab1, YOTB, Vac1, and EEA1. The canonical FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P, also termed PI3P)-binding site. They are found in many membrane trafficking regulators, including EEA1, Hrs, Vac1p, Vps27p, and FENS-1, which locate to early endosomes, specifically bind PtdIns3P, and play important roles in vesicular traffic and in signal transduction. Some proteins, such as rabphilin-3A and alpha-Rab3-interacting molecules (RIMs), are also involved in membrane trafficking and bind to members of the Rab subfamily of GTP hydrolases. However, they contain FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences. At this point, they may not bind to phosphoinositides. In addition, this superfamily also contains the third group of proteins, caspase-associated ring proteins CARP1 and CARP2. They do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which are distinguished from other FYVE-type proteins. Moreover, these proteins have an altered sequence in the basic ligand binding patch and lack the WxxD motif that is conserved only in phosphoinositide binding FYVE domains. Thus they constitute a family of unique FYVE-type domains called FYVE-like domains. The FYVE domain is structurally similar to the RING domain and the PHD finger. This superfamily also includes ADDz zinc finger domain, which is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. 52
21080 206639 cd00066 G-alpha Alpha subunit of G proteins (guanine nucleotide binding). The alpha subunit of G proteins contains the guanine nucleotide binding site. The heterotrimeric GNP-binding proteins are signal transducers that communicate signals from many hormones, neurotransmitters, chemokines, and autocrine and paracrine factors. Extracellular signals are received by receptors, which activate the G proteins, which in turn route the signals to several distinct intracellular signaling pathways. The alpha subunit of G proteins is a weak GTPase. In the resting state, heterotrimeric G proteins are associated at the cytosolic face of the plasma membrane and the alpha subunit binds to GDP. Upon activation by a receptor, GDP is replaced with GTP, and the G-alpha/GTP complex dissociates from the beta and gamma subunits. This results in activation of downstream signaling pathways, such as cAMP synthesis by adenylyl cyclase, which is terminated when GTP is hydrolyzed and the heterotrimers reconstitute. 315
21081 238023 cd00067 GAL4 GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain; found in transcription regulators like GAL4. Domain consists of two helices organized around a Zn(2)Cys(6) motif; Binds to sequences containing 2 DNA half sites comprised of 3-5 C/G combinations 36
21082 238024 cd00068 GGL G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors 57
21083 200450 cd00069 GHB_like Glycoprotein hormone beta chain homologues. This family of cystine-knot hormones includes the beta chains of gonadotropins, thyrotropins, follitropins, choriogonadotropins and more. The members are reproductive hormones that consist of two glycosylated chains (alpha and beta), which form a tightly bound dimer. 96
21084 238025 cd00070 GLECT Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation. 127
21085 238026 cd00071 GMPK Guanosine monophosphate kinase (GMPK, EC 2.7.4.8), also known as guanylate kinase (GKase), catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to guanosine monophosphate (GMP) to yield adenosine diphosphate (ADP) and guanosine diphosphate (GDP). It plays an essential role in the biosynthesis of guanosine triphosphate (GTP). This enzyme is also important for the activation of some antiviral and anticancer agents, such as acyclovir, ganciclovir, carbovir, and thiopurines. 137
21086 238027 cd00072 GYF GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species. 57
21087 238028 cd00073 H15 linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy-terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber 88
21088 238029 cd00074 H2A Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins. 115
21089 340391 cd00075 HATPase Histidine kinase-like ATPase domain. This superfamily includes the histidine kinase-like ATPase (HATPase) domains of several ATP-binding proteins such as histidine kinase, DNA gyrase B, topoisomerases, heat shock protein 90 (HSP90), phytochrome-like ATPases and DNA mismatch repair proteins. Domains belonging to this superfamily are also referred to as GHKL (gyrase, heat-shock protein 90, histidine kinase, MutL) ATPase domains. 102
21090 238031 cd00076 H4 Histone H4, one of the four histones, along with H2A, H2B and H3, which forms the eukaryotic nucleosome core; along with H3, it plays a central role in nucleosome formation; histones bind to DNA and wrap the genetic material into "beads on a string" in which DNA (the string) is wrapped around small blobs of histones (the beads) at regular intervals; play a role in the inheritance of specialized chromosome structures and the control of gene activity; defects in the establishment of proper chromosome structure by histones may activate or silence genes aberrantly and thus lead to disease; the sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution 85
21091 238032 cd00077 HDc Metal dependent phosphohydrolases with conserved 'HD' motif 145
21092 238033 cd00078 HECTc HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains. 352
21093 188616 cd00080 H3TH_StructSpec-5'-nucleases H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination. The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i.e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 71
21094 238035 cd00081 Hint Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here. 136
21095 119399 cd00082 HisKA Histidine Kinase A (dimerization/phosphoacceptor) domain; Histidine Kinase A dimers are formed through parallel association of 2 domains creating 4-helix bundles; usually these domains contain a conserved His residue and are activated via trans-autophosphorylation by the catalytic domain of the histidine kinase. They subsequently transfer the phosphoryl group to the Asp acceptor residue of a response regulator protein. Two-component signalling systems, consisting of a histidine protein kinase that senses a signal input and a response regulator that mediates the output, are ancient and evolutionarily conserved signaling mechanisms in prokaryotes and eukaryotes. 65
21096 381392 cd00083 bHLH_SF basic Helix Loop Helix (bHLH) domain superfamily. bHLH proteins are transcriptional regulators that are found in organisms from yeast to humans. Members of the bHLH superfamily have two highly conserved and functionally distinct regions. The basic part is at the amino end of the bHLH that may bind DNA to a consensus hexanucleotide sequence known as the E box (CANNTG). Different families of bHLH proteins recognize different E-box consensus sequences. At the carboxyl-terminal end of the region is the HLH region that interacts with other proteins to form homo- and heterodimers. bHLH proteins function as a diverse set of regulatory factors because they recognize different DNA sequences and dimerize with different proteins. The bHLH proteins can be divided to cell-type specific and widely expressed proteins. The cell-type specific members of bHLH superfamily are involved in cell-fate determination and act in neurogenesis, cardiogenesis, myogenesis, and hematopoiesis. 46
21097 238037 cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions. 66
21098 238038 cd00085 HNHc HNH nucleases; HNH endonuclease signature which is found in viral, prokaryotic, and eukaryotic proteins. The alignment includes members of the large group of homing endonucleases, yeast intron 1 protein, MutS, as well as bacterial colicins, pyocins, and anaredoxins. 57
21099 238039 cd00086 homeodomain Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner. 59
21100 238040 cd00087 FReD Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation. 215
21101 238041 cd00088 HPT Histidine Phosphotransfer domain, involved in signalling through two-component systems in which an autophosphorylating histidine protein kinase serves as a phosphoryl donor to a response regulator protein; the response regulator protein is modulated by phosphorylation and dephosphorylation of a conserved aspartic acid residue; two-component proteins are abundant in most eubacteria; In E. coli there are 62 two-component proteins involved in a variety of processes such as chemotaxis, osmoregulation, metabolism and transport; also present in both Gram positive and Gram negative pathogenic bacteria where they regulate basic housekeeping functions and control expression of toxins and other proteins important for pathogenesis; in archaea and eukaryotes, two-component pathways constitute a very small number of all signaling systems; in fungi they mediate environmental stress responses and, in pathogenic yeast, hyphal development. In Dictyostelium and in plants, they are involved in important processes such as osmoregulation, cell growth, and differentiation; to date two-component proteins have not been identified in animals; in most prokaryotic systems, the output response is effected directly by the RR, which functions as a transcription factor while in eukaryotic systems, two-component proteins are found at the beginning of signaling pathways where they interface with more conventional eukaryotic signaling strategies such as MAP kinase and cyclic nucleotide cascades 94
21102 212008 cd00089 HR1 Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases. The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region. 68
21103 238042 cd00090 HTH_ARSR Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors. ARSR subfamily of helix-turn-helix bacterial transcription regulatory proteins (winged helix topology). Includes several proteins that appear to dissociate from DNA in the presence of metal ions. 78
21104 238043 cd00091 NUC DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exist as monomers and homodimers. 241
21105 238044 cd00092 HTH_CRP helix_turn_helix, cAMP Regulatory protein C-terminus; DNA binding domain of prokaryotic regulatory proteins belonging to the catabolite activator protein family. 67
21106 238045 cd00093 HTH_XRE Helix-turn-helix XRE-family like proteins. Prokaryotic DNA binding proteins belonging to the xenobiotic response element family of transcriptional regulators. 58
21107 238046 cd00094 HX Hemopexin-like repeats; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metalloproteinase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. 194
21108 238047 cd00095 IFab Interferon alpha, beta. Includes also interferon omega and tau. Different from interferon gamma family. Type I interferons (alpha, beta) belong to the larger helical cytokine superfamily, which includes growth hormones, interleukins, several colony-stimulating factors and several other regulatory molecules. All function as regulators of cellular activity by interacting with cell-surface receptors and activating various signalling pathways. Interferons produce antiviral and antiproliferative responses in cells. Receptor specificity determines function of the various members of the family. 152
21109 409353 cd00096 Ig Immunoglobulin domain. The members here are composed of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, including T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, including butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. Ig superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Typically, the V-set domains have A, B, E, and D strands in one sheet and A', G, F, C, C' and C" in the other. The structures in C1-set are smaller than those in the V-set; they have one beta sheet that is formed by strands A, B, E, and D and the other by strands G, F, C, and C'. Moreover, a C1-set Ig domain contains a short C' strand (three residues) and lacks A' and C" strand. Unlike other Ig domain sets, C2-set structures do not have a D strand. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 70
21110 409354 cd00098 IgC1 Immunoglobulin Constant-1 (C1)-set domain. The members here are composed of C1-set domains, classical Ig-like domains resembling the antibody constant domain. Members of the IgC1 family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, while the IgC domain is involved in oligomerization and molecular interactions. The structures in C1-set are smaller than those in the V-set; they have one beta sheet that is formed by strands A, B, E, and D and the other by strands G, F, C, and C'. 95
21111 409355 cd00099 IgV Immunoglobulin variable domain (IgV). The members here are composed of the immunoglobulin variable domain (IgV). The IgV family contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology, and are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. Ig superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Typically, the V-set domains have A, B, E, and D strands in one sheet and A', G, F, C, C', and C" strands in the other. 111
21112 238048 cd00100 IL1 Interleukin-1 homologs; Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. This family also contains interleukin-1 receptor antagonists (inhibitors). 144
21113 238049 cd00101 IlGF_like Insulin/insulin-like growth factor/relaxin family; insulin family of proteins. Members include a number of active peptides which are evolutionarily related including insulin, relaxin, prorelaxin, insulin-like growth factors I and II, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 41
21114 238050 cd00102 IPT Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobulin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers. 89
21115 238051 cd00103 IRF Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. 107
21116 238052 cd00104 KAZAL_FS Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD. 41
21117 411802 cd00105 KH-I K homology (KH) RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include an N-terminal extension and type I KH domains (e.g. hnRNP K) contain a C-terminal extension. Some KH-I superfamily members contain a divergent KH domain that lacks the RNA-binding GXXG motif. Some others have a mutated GXXG motif which may or may not have nucleic acid binding ability. 63
21118 276812 cd00106 KISc Kinesin motor domain. Kinesin motor domain. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type), in some it is found in the middle (M-type), or C-terminal (C-type). N-type and M-type kinesins are (+) end-directed motors, while C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 326
21119 238055 cd00107 Knot1 The "knottin" fold is stable cysteine-rich scaffold, in which one disulfide bridge crosses the macrocycle made by two other disulfide bridges and the connecting backbone segments. Members include plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins, and arthropod defensins. 33
21120 238056 cd00108 KR Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. 83
21121 238057 cd00109 KU BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 54
21122 238058 cd00110 LamG Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules. 151
21123 238059 cd00111 Trefoil P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs. 44
21124 238060 cd00112 LDLa Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure 35
21125 238061 cd00113 PLAT PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates. 116
21126 238062 cd00114 LIGANc NAD+ dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD(+) as a cofactor, but using the same basic reaction mechanism. The enzyme reacts with the cofactor to form a phosphoamide-linked AMP with the amino group of a conserved Lysine in the KXDG motif, and subsequently transfers it to the DNA substrate to yield adenylated DNA. This alignment contains members of the NAD+ dependent subfamily only. 307
21127 319970 cd00115 LMWP Low molecular weight phosphatase family. Substituted updates: Aug 22, 2001 137
21128 238064 cd00116 LRR_RI Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1). 319
21129 238065 cd00117 LU Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins. 79
21130 212030 cd00118 LysM Lysin Motif is a small domain involved in binding peptidoglycan. LysM, a small globular domain with approximately 40 amino acids, is a widespread protein module involved in binding peptidoglycan in bacteria and chitin in eukaryotes. The domain was originally identified in enzymes that degrade bacterial cell walls, but proteins involved in many other biological functions also contain this domain. It has been reported that the LysM domain functions as a signal for specific plant-bacteria recognition in bacterial pathogenesis. Many of these enzymes are modular and are composed of catalytic units linked to one or several repeats of LysM domains. LysM domains are found in bacteria and eukaryotes. 45
21131 340357 cd00119 LYZ C-type lysozyme and alpha-lactalbumin. C-type lysozyme (chicken or conventional type, 1,4-beta-N-acetylmuramidase) and alpha-lactalbumin (lactose synthase B protein, LA). They have a close evolutionary relationship and similar tertiary structure, however, functionally they are quite different. Lysozymes have primarily bacteriolytic function; hydrolysis of peptidoglycans of prokaryotic cell walls and transglycosylation. LA is a calcium-binding metalloprotein that is expressed exclusively in the mammary gland during lactation. LA is the regulatory subunit of the enzyme lactose synthase. The association of LA with the catalytic component of lactose synthase, galactosyltransferase, alters the acceptor substrate specificity of this glycosyltransferase, facilitating biosynthesis of lactose. Some lysozymes have evolved into digestive enzymes, both in mammals and invertebrates. 122
21132 238067 cd00120 MADS MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 59
21133 238068 cd00121 MATH MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains. 126
21134 238069 cd00122 MBD MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family. 62
21135 238070 cd00123 DmpA_OAT DmpA/OAT superfamily; composed of L-aminopeptidase D-amidase/D-esterase (DmpA), ornithine acetyltransferase (OAT) and similar proteins. DmpA is an aminopeptidase that releases N-terminal D and L amino acids from peptide substrates. This group represents one of the rare aminopeptidases that are not metalloenzymes. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine. OAT catalyzes the first and fifth steps in arginine biosynthesis, coupling acetylation of glutamate with deacetylation of N-acetylornithine, which allows recycling of the acetyl group in the arginine biosynthetic pathway. The superfamily also contains an enzyme, endo-type 6-aminohexanoate-oligomer hydrolase, that has been shown to be involved in nylon degradation. Proteins in this superfamily undergo autocatalytic cleavage of an inactive precursor at the site immediately upstream to the catalytic nucleophile. 286
21136 276950 cd00124 MYSc Myosin motor domain superfamily. Myosin motor domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 633
21137 153091 cd00125 PLA2c PLA2c: Phospholipase A2, a family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. 115
21138 238072 cd00126 PAH Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY), peptide YY (PYY), and pancreatic polypeptide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers 36
21139 350200 cd00128 PIN_FEN1_EXO1-like FEN-like PIN domains of Flap endonuclease-1 (FEN1)-like and exonuclease-1 (EXO1)-like nucleases, structure-specific, divalent-metal-ion dependent, 5' nucleases. PIN (PilT N terminus) domain of Flap endonuclease-1 (FEN1) and exonuclease-1 (EXO1)-like nucleases: FEN1, EXO1, Mkt1, Gap endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 162
21140 238074 cd00129 PAN_APPLE PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 80
21141 238075 cd00130 PAS PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. 103
21142 238076 cd00131 PAX Paired Box domain 128
21143 238077 cd00132 CRIB PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules. 42
21144 99904 cd00133 PTS_IIB PTS_IIB: subunit IIB of enzyme II (EII) is the central energy-coupling domain of the phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In the multienzyme PTS complex, EII is a carbohydrate-specific permease consisting of two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. The PTS is found only in bacteria, where it catalyzes the transport and phosphorylation of numerous monosaccharides, disaccharides, polyols, amino sugars, and other sugar derivatives. The four proteins (domains) forming the PTS phosphorylation cascade (EI, HPr, EIIA, and EIIB), can phosphorylate or interact with numerous non-PTS proteins thereby regulating their activity. 84
21145 238079 cd00135 PDGF Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family domain; PDGF is a potent activator for cells of mesenchymal origin; PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer; VEGF is a potent mitogen in embryonic and somatic angiogenesis with a unique specificity for vascular endothelial cells; VEGF forms homodimers and exists in 4 different isoforms; overall, the VEGF monomer resembles that of PDGF, but its N-terminal segment is helical rather than extended; the cysteine knot motif is a common feature of this domain 86
21146 238080 cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein. 70
21147 176497 cd00137 PI-PLCc Catalytic domain of prokaryotic and eukaryotic phosphoinositide-specific phospholipase C. This subfamily corresponds to the catalytic domain present in prokaryotic and eukaryotic phosphoinositide-specific phospholipase C (PI-PLC), which is a ubiquitous enzyme catalyzing the cleavage of the sn3-phosphodiester bond in the membrane phosphoinositides (phosphatidylinositol, PI; Phosphatidylinositol-4-phosphate, PIP; phosphatidylinositol 4,5-bisphosphate, PIP2) to yield inositol phosphates (inositol monophosphate, InsP; inositol diphosphate, InsP2; inositol trisphosphate, InsP3) and diacylglycerol (DAG). The higher eukaryotic PI-PLCs (EC 3.1.4.11) have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. They play a critical role in most signal transduction pathways, controlling numerous cellular events, such as cell growth, proliferation, excitation and secretion. These PI-PLCs strictly require Ca2+ for their catalytic activity. They display a clear preference towards the hydrolysis of the more highly phosphorylated PI-analogues, PIP2 and PIP, to generate two important second messengers, InsP3 and DAG. InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. In contrast, bacterial PI-PLCs contain a single catalytic domain. Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. They participate in Ca2+-independent PI metabolism. They are characterized as phosphatidylinositol-specific phospholipase C (EC 4.6.1.13) that selectively hydrolyze PI, not PIP or PIP2.
The TIM-barrel type catalytic domain in bacterial PI-PLCs is very similar to the one in eukaryotic PI-PLCs, in which the catalytic domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. The catalytic mechanism of both prokaryotic and eukaryotic PI-PLCs is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes a distinctly different type of eukaryotic PLC, glycosylphosphatidylinositol-specific phospholipase C (GPI-PLC), an integral membrane protein characterized in the protozoan parasite Trypanosoma brucei. T. brucei GPI-PLC hydrolyzes the GPI-anchor on the variant specific glycoprotein (VSG), releasing dimyristyl glycerol (DMG), which may help the protozoan evade the host's immune system. It does not require Ca2+ for its activity and is more closely related to bacterial PI-PLCs, but not mammalian PI-PLCs. 274
21148 197200 cd00138 PLDc_SF Catalytic domain of phospholipase D superfamily proteins. Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. 
The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 119
21149 340436 cd00139 PIPKc Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family. The Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family includes phosphatidylinositol 5-phosphate 4-kinases (PIP5Ks) and similar proteins. PIP5Ks catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. The family includes type I and II PIP5Ks (-alpha, -beta, and -gamma) kinases. Signalling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. 253
21150 238082 cd00140 beta_clamp Beta clamp domain. The beta subunit (processivity factor) of DNA polymerase III holoenzyme, referred to as the beta clamp, forms a ring shaped dimer that encircles dsDNA (sliding clamp) in bacteria. The beta-clamp is structurally similar to the trimeric ring formed by PCNA (found in eukaryotes and archaea) and the processivity factor (found in bacteriophages T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. 365
21151 143386 cd00141 NT_POLXc Nucleotidyltransferase (NT) domain of family X DNA Polymerases. X family polymerases fill in short gaps during DNA repair. They are relatively inaccurate enzymes and play roles in base excision repair, in non-homologous end joining (NHEJ) which acts mainly to repair damage due to ionizing radiation, and in V(D)J recombination. This family includes eukaryotic Pol beta, Pol lambda, Pol mu, and terminal deoxyribonucleotidyl transferase (TdT). Pol beta and Pol lambda are primarily DNA template-dependent polymerases. TdT is a DNA template-independent polymerase. Pol mu has both template dependent and template independent activities. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These three carboxylate residues are fairly well conserved in this family. 307
21152 270621 cd00142 PI3Kc_like Catalytic domain of Phosphoinositide 3-kinase and similar proteins. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRAPP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control. The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 216
21153 238083 cd00143 PP2Cc Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. 254
21154 277316 cd00144 MPP_PPP_family phosphoprotein phosphatases of the metallophosphatase superfamily, metallophosphatase domain. The PPP (phosphoprotein phosphatase) family is one of two known protein phosphatase families specific for serine and threonine. This family includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archaeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 229
21155 99912 cd00145 POLBc DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain are crucial for binding to divalent metal ion and are suggested to be important for polymerase catalysis. 323
21156 238084 cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases. 81
21157 132835 cd00147 cPLA2_like Cytosolic phospholipase A2, catalytic domain; hydrolyses arachidonyl phospholipids. Catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. Group IV cPLA2 includes six intracellular enzymes: cPLA2alpha, cPLA2beta, cPLA2gamma, cPLA2delta, cPLA2epsilon, and cPLA2zeta. 438
21158 238085 cd00148 PROF Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway. 127
21159 119410 cd00150 PlantTI Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges. 27
21160 238086 cd00152 PTX Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers. 201
21161 340449 cd00153 RA_RalGDS_like Ras-associating (RA) domain of RalGDS family. The RalGDS family RA domains can interact with activated Ras and may function as effectors for other Ras family. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes and is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. The RalGDS family includes RalGDS, RGL, RGL2/Rlf and RGL3. All family members have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal RA domain. The RA domain mediates the GTP-dependent interaction with Ras and Ras-related proteins. 88
21162 206640 cd00154 Rab Ras-related in brain (Rab) family of small guanosine triphosphatases (GTPases). Rab GTPases form the largest family within the Ras superfamily. There are at least 60 Rab genes in the human genome, and a number of Rab GTPases are conserved from yeast to humans. Rab GTPases are small, monomeric proteins that function as molecular switches to regulate vesicle trafficking pathways. The different Rab GTPases are localized to the cytosolic face of specific intracellular membranes, where they regulate distinct steps in membrane traffic pathways. In the GTP-bound form, Rab GTPases recruit specific sets of effector proteins onto membranes. Through their effectors, Rab GTPases regulate vesicle formation, actin- and tubulin-dependent vesicle movement, and membrane fusion. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which mask C-terminal lipid binding and promote cytosolic localization. While most unicellular organisms possess 5-20 Rab members, several have been found to possess 60 or more Rabs; for many of these Rab isoforms, homologous proteins are not found in other organisms. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Since crystal structures often lack C-terminal residues, the lipid modification site is not available for annotation in many of the CDs in the hierarchy, but is included where possible. 159
21163 238087 cd00155 RasGEF Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors. 237
21164 381085 cd00156 REC phosphoacceptor receiver (REC) domain of response regulators (RRs) and pseudo response regulators (PRRs). Two-component systems (TCSs) involving a sensor and a response regulator are used by bacteria to adapt to changing environments. Processes regulated by two-component systems in bacteria include sporulation, pathogenicity, virulence, chemotaxis, and membrane transport. Response regulators (RRs) share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. Response regulators regulate transcription, post-transcription or post-translation, or have functions such as methylesterases, adenylate or diguanylate cyclase, c-di-GMP-specific phosphodiesterases, histidine kinases, serine/threonine protein kinases, and protein phosphatases, depending on their output domains. The functions of some output domains are still unknown. TCSs are found in all three domains of life - bacteria, archaea, and eukaryotes; however, the presence and abundance of particular RRs vary between the lineages. Archaea encode very few RRs with DNA-binding output domains; most are stand-alone REC domains. Among eukaryotes, TCSs are found primarily in protozoa, fungi, algae, and green plants. REC domains function as phosphorylation-mediated switches within RRs, but some also transfer phosphoryl groups in multistep phosphorelays. 99
21165 206641 cd00157 Rho Ras homology family (Rho) of small guanosine triphosphatases (GTPases). Members of the Rho (Ras homology) family include RhoA, Cdc42, Rac, Rnd, Wrch1, RhoBTB, and Rop. There are 22 human Rho family members identified currently. These proteins are all involved in the reorganization of the actin cytoskeleton in response to external stimuli. They also have roles in cell transformation by Ras in cytokinesis, in focal adhesion formation and in the stimulation of stress-activated kinase. These various functions are controlled through distinct effector proteins and mediated through a GTP-binding/GTPase cycle involving three classes of regulating proteins: GAPs (GTPase-activating proteins), GEFs (guanine nucleotide exchange factors), and GDIs (guanine nucleotide dissociation inhibitors). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Since crystal structures often lack C-terminal residues, this feature is not available for annotation in many of the CDs in the hierarchy. 171
21166 238089 cd00158 RHOD Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins. 89
21167 238090 cd00159 RhoGAP RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins. 169
21168 238091 cd00160 RhoGEF Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 181
21169 238092 cd00161 RICIN Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure. 124
21170 319361 cd00162 RING_Ubox The superfamily of RING finger (Really Interesting New Gene) domain and U-box domain. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized as two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have different Cys/His pattern. Some lack a single Cys or His residues at typical Zn ligand positions. Especially, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can indeed chelate Zn in a RING finger as well. C4C4-, C3HC3D-, C2H2C4-, and C3HC5-type RING fingers are closely related to RING-HC finger. In contrast, C4HC3- (RING-CH alias RINGv), C3H3C2-, C3H2C2D-, C3DHC3-, and C4HC2H-type RING fingers are close to RING-H2 finger. However, not all RING finger-containing proteins display regular RING finger features, and the RING finger family has turned out to be multifarious. The degenerated RING fingers from Siz/PIAS RING (SP-RING) family proteins and sporulation protein RMD5, are characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues. They bind only one Zn2+ ion. On the other hand, the RING fingers of the human APC11 and RBX1 proteins can bind a third Zn atom since they harbor four additional Zn ligands. U-box is a modified form of the RING finger domain that lacks metal chelating Cys and His. It resembles the cross-brace RING structure consisting of three beta-sheets and a single alpha-helix, which would be stabilized by salt bridges instead of chelated metal ions. U-box proteins are widely distributed among eukaryotic organisms and show a higher prevalence in plants than in other organisms. 
RING finger/U-box-containing proteins are a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serve as scaffolds for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates. 40
21171 119386 cd00163 RNase_A RNase A family, or Pancreatic RNases family; includes vertebrate RNase homologs to the bovine pancreatic ribonuclease A (RNase A). Many of these enzymes have special biological activities; for example, some stimulate the development of vascular endothelial cells, dendritic cells, and neurons, are cytotoxic/anti-tumoral and/or anti-pathogenic. RNase A is involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. The catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups. The RNase A family proteins have a conserved catalytic triad (two histidines and one lysine); recently some family members lacking the catalytic residues have been identified. They also share three or four disulfide bonds. The most conserved disulfide bonds are close to the N and C termini and contribute most significantly to the conformational stability. 8 RNase A homologs had initially been identified in the human genome, pancreatic RNase (RNase 1), Eosinophil Derived Neurotoxin (EDN/RNASE 2), Eosinophil Cationic Protein (ECP/RNase 3), RNase 4, Angiogenin (RNase 5), RNase 6 or k6, the skin derived RNase (RNase 7) and RNase 8. These eight human genes are all located in a cluster on chromosome 14. Recent genomic analysis has extended the family to 13 sequences. However, only the first eight identified human RNases, which are referred to as "canonical" RNases, contain the catalytic residues required for RNase A activity. The new genes corresponding to RNases 9-13 are also located in the same chromosome cluster and seem to be related to male-reproductive functions. RNases 9-13 have the characteristic disulfide bridge pattern but are unlikely to share RNase activity.
The RNase A family most likely started off in vertebrates as a host-defense protein, and comparative analysis in mammals and birds indicates that the family may have originated from a RNase 5-like gene. This hypothesis is supported by the fact that only RNase 5-like RNases have been reported outside the mammalian class. The RNase 5 group would therefore be the most ancient form of this family, and all other members would have arisen during mammalian evolution. 119
21172 238094 cd00164 S1_like S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold. 65
21173 238095 cd00165 S4 S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes. 70
21174 238096 cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA. 45
21175 349397 cd00168 CAP CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. The CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain, also called SCP (sperm-coating glycoprotein), is found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs (cysteine-rich secretory proteins), which combine the CAP/SCP domain with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. CAP/SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal CAP/SCP domains. 128
21176 238098 cd00169 Chemokine Chemokine: small cytokines, including a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; distinguished from other cytokines by their receptors, which are G-protein coupled receptors; divided into 4 subfamilies based on the arrangement of the two N-terminal cysteines; some members can bind multiple receptors and many chemokine receptors can bind more than one chemokine; this redundancy allows precise control in stimulating the immune system and in contributing to the homeostasis of a cell; when expressed inappropriately, chemokines play a role in autoimmune diseases, vascular irregularities, graft rejection, neoplasia, and allergies; exist as monomers, dimers and multimers, but are believed to function as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine_CXC (cd00273), Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for chemokine subgroups. 59
21177 238099 cd00170 SEC14 Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. 157
21178 238100 cd00171 Sec7 Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity. 185
21179 381000 cd00172 serpin SERine Proteinase INhibitors (serpin) family. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 365
21180 198173 cd00173 SH2 Src homology 2 (SH2) domain. In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others. 79
21181 212690 cd00174 SH3 Src Homology 3 domain superfamily. Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction. 51
21182 238102 cd00175 SNc Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds. 129
21183 238103 cd00176 SPEC Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here 213
21184 176851 cd00177 START Lipid-binding START domain of mammalian STARD1-STARD15 and related proteins. This family includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and related domains, such as the START domain of the Arabidopsis homeobox protein GLABRA 2. The mammalian STARDs are grouped into 8 subfamilies. This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some members of this family, specific lipids that bind in this pocket are known; these include cholesterol (STARD1/STARD3/ STARD4/STARD5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2/ STARD7/STARD10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). The START domain is found either alone or in association with other domains. Mammalian STARDs participate in the control of various cellular processes including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease. The Arabidopsis homeobox protein GLABRA 2 suppresses root hair formation in hairless epidermal root cells. 193
21185 238104 cd00178 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors. 172
21186 238105 cd00179 SynN Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain 151
21187 270622 cd00180 PKc Catalytic domain of Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. PKs make up a large family of serine/threonine kinases (STKs), protein tyrosine kinases (PTKs), and dual-specificity PKs that phosphorylate both serine/threonine and tyrosine residues of target proteins. The majority of protein phosphorylation occurs on serine residues while only 1% occurs on tyrosine residues. Protein phosphorylation is a mechanism by which a wide variety of cellular proteins, such as enzymes and membrane channels, are reversibly regulated in response to certain stimuli. PKs often function as components of signal transduction pathways in which one kinase activates a second kinase, which in turn, may act on other kinases; this sequential action transmits a signal from the cell surface to target proteins, which results in cellular responses. The PK family is one of the largest known protein families with more than 100 homologous yeast enzymes and more than 500 human proteins. A fraction of PK family members are pseudokinases that lack crucial residues for catalytic activity. The multiplicity of kinases allows for specific regulation according to substrate, tissue distribution, and cellular localization. PKs regulate many cellular processes including proliferation, division, differentiation, motility, survival, metabolism, cell-cycle progression, cytoskeletal rearrangement, immunity, and neuronal functions. Many kinases are implicated in the development of various human diseases including different types of cancer. The PK family is part of a larger superfamily that includes the catalytic domains of RIO kinases, aminoglycoside phosphotransferase, choline kinase, phosphoinositide 3-kinase (PI3K), and actin-fragmin kinase. 215
21188 206638 cd00181 Tar_Tsr_LBD ligand binding domain of Tar- and Tsr-related chemoreceptors. E.coli Tar (taxis to aspartate and repellents) and Tsr (taxis to serine and repellents) are homologous chemoreceptors that have a high specificity for aspartate and serine, respectively. Both are homodimeric receptors and contain an N-terminal periplasmic ligand binding domain, a transmembrane region, a HAMP domain and a C-terminal cytosolic signaling domain. 129
21189 410312 cd00182 T-box DNA-binding domain of the T-box transcription factor family. The T-box family is an ancient family of transcription factors which plays a multitude of diverse functions throughout development. The founding member of the family is Brachyury (also known as TBXT, or T). Members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. The T-box factors in Caenorhabditis elegans have evolved very differently than those in other organisms; its genome contains 22 T-box genes which encode factors which are diverse in DNA-binding specificity, function and sequence, and only 3 of these factors fall into the conserved T-box subfamilies. 176
21190 238107 cd00183 TFIIS_I N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme 76
21191 238108 cd00184 TNF Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or heterotrimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer, with ligand trimerization a requisite for receptor binding. 137
21192 276900 cd00185 TNFRSF Tumor necrosis factor receptor superfamily (TNFRSF). Members of TNFR superfamily (TNFRSF) interactions with TNF superfamily (TNFSF) ligands (TNFL) control key cellular processes such as differentiation, proliferation, apoptosis, and cell growth. Dysregulation of these pathways has been shown to result in a wide range of pathological conditions, including autoimmune diseases, inflammation, cancer, and viral infection. There are 29 very diverse family members of TNFRSF reported in humans: 22 are type I transmembrane receptors (single pass with the N terminus on extracellular side of the cell membrane) and have a clear signal peptide; the remaining 7 members are either type III transmembrane receptors (single pass with the N terminus on extracellular side of the membrane but no signal sequence; TNFR13B, TNFR13C, TNFR17, and XEDAR), or attached to the membrane via a glycosylphosphatidylinositol (GPI) linker (TNFR10C), or secreted as soluble receptors (TNFR11B and TNFR6B). All TNFRs contain relatively short cysteine-rich domains (CRDs) in the ectodomain, and are involved in interaction with the TNF homology domain (THD) of their ligands. TNFRs often have multiple CRDs (between one and six), with the most frequent configurations of three or four copies; most CRDs possess three disulfide bridges, but could have between one and four. Localized or genome-wide duplication and evolution of the TNFRSF members appear to have paralleled the emergence of the adaptive immune system; teleosts (i.e. ray-finned, bony fish), which possess an immune system with B and T cells, possess primary and secondary lymphoid organs, and are capable of adaptive responses to pathogens also display several characteristics that are different from the mammalian immune system, making teleost TNFSF orthologs and paralogs of interest to better understand immune system evolution and the immunological pathways elicited to pathogens. 87
21193 238110 cd00186 TOP1Ac DNA Topoisomerase, subtype IA; DNA-binding, ATP-binding and catalytic domain of bacterial DNA topoisomerases I and III, and eukaryotic DNA topoisomerase III and eubacterial and archaeal reverse gyrases. Topoisomerases cleave single or double stranded DNA and then rejoin the broken phosphodiester backbone. Proposed catalytic mechanism of single stranded DNA cleavage is by phosphoryl transfer through a tyrosine nucleophile using acid/base catalysis. Tyr is activated by a nearby group (not yet identified) acting as a general base for nucleophilic attack on the 5' phosphate of the scissile bond. Arg and Lys stabilize the pentavalent transition state. Glu then acts as a proton donor for the leaving 3'-oxygen, upon cleavage of the scissile strand. 381
21194 238111 cd00187 TOP4c DNA Topoisomerase, subtype IIA; domain A'; bacterial DNA topoisomerase IV (C subunit, ParC), bacterial DNA gyrases (A subunit, GyrA), mammalian DNA topoisomerases II. DNA topoisomerases are essential enzymes that regulate the conformational changes in DNA topology by catalysing the concerted breakage and rejoining of DNA strands during normal cellular growth. 445
21195 173773 cd00188 TOPRIM Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases, and as a general acid in strand cleavage by topoisomerases and nucleases. The DxD motif may coordinate Mg2+, a cofactor required for full catalytic function. 83
21196 238113 cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. 232
21197 238114 cd00191 TY Thyroglobulin type I repeats; the N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases. 66
21198 270623 cd00192 PTKc Catalytic domain of Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. They can be classified into receptor and non-receptor tyr kinases. PTKs play important roles in many cellular processes including lymphocyte activation, epithelium growth and maintenance, metabolism control, organogenesis regulation, survival, proliferation, differentiation, migration, adhesion, motility, and morphogenesis. Receptor tyr kinases (RTKs) are integral membrane proteins which contain an extracellular ligand-binding region, a transmembrane segment, and an intracellular tyr kinase domain. RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain, leading to intracellular signaling. Some RTKs are orphan receptors with no known ligands. Non-receptor (or cytoplasmic) tyr kinases are distributed in different intracellular compartments and are usually multi-domain proteins containing a catalytic tyr kinase domain as well as various regulatory domains such as SH3 and SH2. PTKs are usually autoinhibited and require a mechanism for activation. In many PTKs, the phosphorylation of tyr residues in the activation loop is essential for optimal activity. Aberrant expression of PTKs is associated with many development abnormalities and cancers. The PTK family is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
21199 277192 cd00193 SNARE SNARE motif. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, Qb- and Qc-SNAREs are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 54
21200 270455 cd00194 UBA_like_SF UBA domain-like superfamily. The ubiquitin-associated (UBA) domain-like superfamily contains alpha-helical structural homology ubiquitin-binding domains, including UBA domains and coupling of ubiquitin conjugation to endoplasmic reticulum degradation (CUE) domains which share a common three-helical bundle architecture. UBA domains are commonly occurring sequence motifs found in proteins involved in ubiquitin-mediated proteolysis. They contribute to ubiquitin (Ub) binding or ubiquitin-like (UbL) domain binding. However, some kinds of UBA domains can only bind the UbL domain, but not the Ub domain. UBA domains are normally comprised of compact three-helix bundles which contain a conserved GF/Y-loop. They can bind polyubiquitin with high affinity. They also bind monoubiquitin and other proteins. Most UBA domain-containing proteins have one UBA domain, but some harbor two or three UBA domains. CUE domain containing proteins are characterized by an FP and a di-leucine-like sequence and bind to monoubiquitin with varying affinities. Some higher eukaryotic CUE domain proteins do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains. This superfamily also includes many UBA-like domains found in AMP-activated protein kinase (AMPK) related kinases, the NXF family of mRNA nuclear export factors, elongation factor Ts (EF-Ts), nascent polypeptide-associated complex subunit alpha (NACA) and similar proteins. Although many UBA-like domains may have a conserved TG but not GF/Y-loop, they still show a high level of structural and sequence similarity with three-helical ubiquitin binding domains. 28
21201 238117 cd00195 UBCc Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD. 141
21202 340450 cd00196 Ubiquitin_like_fold Beta-grasp ubiquitin-like fold. Ubiquitin is a protein modifier that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. The ubiquitination process comprises a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. Ubiquitin-like proteins have similar ubiquitin beta-grasp fold and attach to other proteins in a ubiquitin-like manner but with biochemically distinct roles. Ubiquitin and ubiquitin-like proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Some other ubiquitin-like domains have adaptor roles in ubiquitin-signaling by mediating protein-protein interaction. In addition to Ubiquitin-like (Ubl) domain, Ras-associating (RA) domain, F0/F1 sub-domain of FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, TGS (ThrRS, GTPase and SpoT) domain, Ras-binding domain (RBD), Ubiquitin regulatory domain X (UBX), Dublecortin-like domain, and RING finger- and WD40-associated ubiquitin-like (RAWUL) domain have beta-grasp ubiquitin-like folds, and are included in this superfamily. 68
21203 340764 cd00197 VHS_ENTH_ANTH VHS, ENTH and ANTH domain superfamily. This superfamily is composed of proteins containing a VHS, CID, ENTH, or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The CTD-Interacting Domain (CID) is present in several RNA-processing factors and binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase II (RNAP II or Pol II). The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-Terminal Homology (ANTH) domain. VHS, ENTH, and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 115
21204 238119 cd00198 vWFA Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all A domains. 161
21205 238120 cd00199 WAP whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease. 60
21206 238121 cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment. 289
21207 238122 cd00201 WW Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 31
21208 238123 cd00202 ZnF_GATA Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C 54
21209 238124 cd00203 ZnMc Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease. 167
21210 119412 cd00205 rhv_like Picornavirus capsid protein domain_like. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids composed of 60 copies each of 4 virus encoded proteins; alignment includes picornaviridae, like poliovirus, hepatitis A virus, rhinovirus, foot-and-mouth disease virus and encephalomyocarditis virus; common structure is an 8-stranded beta sandwich 178
21211 119411 cd00206 snake_toxin Snake toxin domain, present in short and long neurotoxins, cytotoxins and short toxins, and in other miscellaneous venom peptides. The toxin acts by binding to the nicotinic acetylcholine receptors in the postsynaptic membrane of skeletal muscles and preventing the binding of acetylcholine, thereby blocking the excitation of muscles. This domain contains 60-75 amino acids that are fixed by 4-5 disulfide bridges and is nearly all beta sheet; it exists as either monomers or dimers. 64
21212 238126 cd00207 fer2 2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis, and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin. 84
21213 100038 cd00208 LbetaH Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms. 78
21214 238127 cd00209 DHFR Dihydrofolate reductase (DHFR). Reduces 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate with NADPH as a cofactor. This is an essential step in the biosynthesis of deoxythymidine phosphate since 5,6,7,8-tetrahydrofolate is required to regenerate 5,10-methylenetetrahydrofolate which is then utilized by thymidylate synthase. Inhibition of DHFR interrupts thymidylate synthesis and DNA replication; inhibitors of DHFR (such as Methotrexate) are used in cancer chemotherapy. 5,6,7,8-tetrahydrofolate is also involved in glycine, serine, and threonine metabolism and aminoacyl-tRNA biosynthesis. 158
21215 238128 cd00210 PTS_IIA_glc PTS_IIA, PTS system, glucose/sucrose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 124
21216 238129 cd00211 PTS_IIA_fru PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 136
21217 238130 cd00212 PTS_IIB_glc PTS_IIB, PTS system, glucose/sucrose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 78
21218 238131 cd00213 S-100 S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse modes of interaction with different targets. In addition to S100 proteins (such as S100A1, -3, -4, -6, -7, -10, -11, and -13), this group includes the "fused" gene family, a group of calcium binding S100-related proteins. The "fused" gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These "fused" gene proteins contain an N-terminal sequence with two Ca-binding EF-hand motifs, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinarily high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins. 88
21219 238132 cd00214 Calpain_III Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains. 150
21220 238133 cd00215 PTS_IIA_lac PTS_IIA, PTS system, lactose/cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. This family of proteins normally function as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation. 97
21221 199833 cd00216 PQQ_DH_like PQQ-dependent dehydrogenases and related proteins. This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ. 434
21222 271174 cd00217 INT_Flp_C Flp Tyrosine-based site-specific recombinases (also called integrases), C-terminal catalytic domain. Yeast Flp-like recombinases mediate the amplification of the 2 micron circular plasmid copy number by catalyzing the intra-molecular recombination between two inverted repeats during replication. They belong to the DNA breaking-rejoining enzyme superfamily, which also includes prokaryotic tyrosine recombinases and type IB topoisomerases. These enzymes share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism. Flp-like recombinases are almost exclusively found in yeast and are highly diverged in sequence from the prokaryotic tyrosine recombinases. They cleave their target DNA in trans with a composite active site in which the catalytic tyrosine is provided by a protomer bound to a site other than the one being cleaved. Thus each active site within Flp complexes is assembled by domain swapping and contains catalytic residues from two different monomers. Two DNA segments are synapsed by the tetrameric enzyme, carrying the nucleophilic tyrosine in each active site with only two of the four monomers active at a given time. The catalytic domain is linked through a flexible loop to the N-terminal domain, which is largely responsible for non-specific DNA binding and isomerization. Its overall fold is similar to the SAM domain fold also found in the N-terminal domains of lambda integrase and XerD recombinase. 410
21223 132995 cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately. The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43). 223
21224 119405 cd00219 ToxGAP GTPase-activating protein (GAP) domain found in bacterial cytotoxins, ExoS, SptP, and YopE. Part of protein secretion system; stimulates Rac1-dependent cytoskeletal changes that promote bacterial internalization. 120
21225 238135 cd00220 VMO-I Vitelline membrane outer layer protein I (VMO-I) domain, VMO-I is one of the proteins found in the outer layer of the vitelline membrane of poultry eggs; VMO-I, lysozyme, and VMO-II are tightly bound to ovomucin; this complex forms the backbone of the outer layer; VMO-I has three distinct internal repeats; all three repeats are used to define the domain here; VMO-I has recently been shown to synthesize N-acetylchito-oligosaccharides from N-acetylglucosamine; may be a carbohydrate-binding protein; member of the beta-prism-fold family 177
21226 411702 cd00221 Vsr Very short patch repair endonuclease. Very short patch repair (Vsr) is an endonuclease functioning in DNA repair that recognizes damaged DNA and cleaves the phosphodiester backbone. Vsr endonucleases have a common endonuclease topology that has been tailored for recognition of TG mismatches. It belongs to a superfamily of nucleases including archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 133
21227 212461 cd00222 CollagenBindB Repeat unit of collagen-binding protein domain B. The collagen-binding protein mediates bacterial adherence to collagen; the primary sequence has a non-repetitive, collagen-binding A region, followed by instances of this B region repetitive unit. The B region has one to four 23 kDa repeat units (B1-B4), which have been suggested to serve as 'stalks' that project the A region from the bacterial surface and thus facilitate bacterial adherence to collagen. Each B repeat unit has two highly similar domains (D1 and D2) placed side-by-side; both D1 and D2 are included in this model. They exhibit a unique inverse IgG-like domain fold. 92
21228 173774 cd00223 TOPRIM_TopoIIB_SPO TOPRIM_TopoIIB_SPO: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IIB family of DNA topoisomerases and Spo11. This subgroup contains proteins similar to Sulfolobus shibatae topoisomerase VI (TopoVI) and Saccharomyces cerevisiae meiotic recombination factor Spo11. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. TopoVI enzymes are heterotetramers found in archaea and plants. Spo11 plays a role in generating the double strand breaks that initiate homologous recombination during meiosis. S. shibatae TopoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DxD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 160
21229 238137 cd00224 Mog1 homolog to Ran-Binding Protein Mog1p; binds to the small GTPase Ran, which plays an important role in nuclear import. Binding is independent of Ran's nucleotide state (RanGTP/RanGDP) 173
21230 119406 cd00225 API3 Ascaris pepsin inhibitor-3 (API3); protein inhibitor that reversibly inhibits aspartic proteinase cathepsin E, and gastric enzymes pepsin and gastricsin. 159
21231 238138 cd00226 PRCH Photosynthetic reaction center (RC) complex, subunit H; RC is an integral membrane protein-pigment complex which catalyzes light-induced reduction of ubiquinone to ubiquinol, generating a transmembrane electrochemical gradient of protons used to produce ATP by ATP synthase. Subunit H is positioned mainly in the cytoplasm with one transmembrane alpha helix. Provides proton transfer pathway (water channels) connecting the terminal quinone electron acceptor of RC, to the aqueous phase. Found in photosynthetic bacteria: alpha, beta, and gamma proteobacteria. 246
21232 238139 cd00227 CPT Chloramphenicol (Cm) phosphotransferase (CPT). Cm-inactivating enzyme; modifies the primary (C-3) hydroxyl of the antibiotic. Related structurally to shikimate kinase II. 175
21233 238140 cd00228 eu-GS Eukaryotic Glutathione Synthetase (eu-GS); catalyses the production of glutathione from gamma-glutamylcysteine and glycine in an ATP-dependent manner. Belongs to the ATP-grasp superfamily. 471
21234 238141 cd00229 SGNH_hydrolase SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 187
21235 238142 cd00231 ZipA ZipA C-terminal domain. ZipA, a membrane-anchored protein, is one of at least nine essential gene products necessary for assembly of the septal ring which mediates cell division in E.coli. ZipA and FtsA directly bind FtsZ, a homolog of eukaryotic tubulins, at the prospective division site, followed by the sequential addition of FtsK, FtsQ, FtsL, FtsW, FtsI, and FtsN. ZipA contains three domains: a short N-terminal membrane-anchored domain, a central P/Q domain that is rich in proline and glutamine and a C-terminal domain, which comprises almost half the protein. 130
21236 350855 cd00232 HemeO-like heme oxygenase. Heme oxygenase (HO, EC 1.14.14.18) catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. This family serves a variety of specific needs in different branches of life: in vertebrates, HO plays a role in heme homeostasis and oxidative stress response, and cellular signaling in mammals that include isoforms HO-1 and HO-2; in photosynthetic organisms including cyanobacteria, algae, and higher plants, biliverdin is used for photosynthetic pigment formation or light-sensing; and, in pathogenic bacteria, HO is part of a pathway for iron acquisition from host heme and heme products. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology. 201
21237 238144 cd00233 VIP2 VIP2; A family of actin-ADP-ribosylating toxins. A member of the Bacillus-produced vegetative insecticidal proteins (VIPs) possesses high specificity against the major insect pest, corn rootworms, and belongs to a class of binary toxins and regulators of biological pathways distinct from classical A-B toxins. A novel family of insecticidal ADP-ribosyltransferases was isolated from Bacillus cereus during vegetative growth, where VIP1 likely targets insect cells and VIP2 ribosylates actin. VIP2 shares significant sequence similarity with enzymatic components of other binary toxins, Clostridium botulinum C2 toxin, C. perfringens iota toxin, C. spiroforme toxin and C. difficile toxin. 201
21238 119413 cd00235 TLP-20 Telokin-like protein-20 (TLP-20) domain; a baculovirus protein that shares some antigenic similarities to the smooth muscle protein telokin, a kinase-related protein 108
21239 238145 cd00236 FinO_conjug_rep FinO bacterial conjugation repressor domain; the basic protein FinO is part of the two component FinOP system which is responsible for repressing bacterial conjugation; the FinOP system represses the transfer (tra) operon of the F-plasmid which encodes the proteins responsible for conjugative transfer of this plasmid from host to recipient Escherichia coli cells; antisense RNA, FinP is thought to interact with traJ mRNA to occlude its ribosome binding site, blocking traJ translation and thereby inhibiting transcription of the tra operon; FinO protects FinP against degradation by binding to FinP and sterically blocking the cellular endonuclease RNase E; FinO also binds to the complementary stem-loop structures in traJ mRNA and promotes duplex formation between FinP and traJ RNA in vitro; this domain contains two independent RNA binding regions 146
21240 107218 cd00237 p23 p23 binds heat shock protein (Hsp)90 and participates in the folding of a number of Hsp90 clients, including the progesterone receptor. p23 also has a passive chaperoning activity and in addition may participate in prostaglandin synthesis. 106
21241 238146 cd00238 ERp29c ERp29 and ERp38, C-terminal domain; composed of the protein disulfide isomerase (PDI)-like proteins ERp29 and ERp38. ERp29 (also called ERp28) is a ubiquitous endoplasmic reticulum (ER)-resident protein expressed in high levels in secretory cells. It contains a redox inactive TRX-like domain at the N-terminus. The expression profile of ERp29 suggests a role in secretory protein production, distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex and is essential in regulating the secretion of thyroglobulin. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase. ERp38 is a P5-like protein, first isolated from alfalfa (the cDNA clone was named G1), which contains two redox active TRX domains at the N-terminus, like human P5. However, unlike human P5, ERp38 also contains a C-terminal domain with homology to the C-terminal domain of ERp29. It may be a glucose-regulated protein. The function of the all-helical C-terminal domain of ERp29 and ERp38 remains unclear. The C-terminal domain of Wind is thought to provide a distinct site required for interaction with its substrate, Pipe. 93
21242 119414 cd00239 PapG_CBD PapG carbohydrate / receptor binding domain (CBD); PapG, the adhesin of the P-pili, is situated at the tip, mediating the attachment of uropathogenic Escherichia coli to the uroepithelium of the human kidney; PapG has a two-domain architecture: a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus (C-terminal pilin region). The carbohydrate-binding domain interacts with the receptor glycan. There are 3 PapG alleles, class I-III, which bind to different receptor isotypes, GbO3, GbO4, and GbO5, respectively. 194
21243 238147 cd00240 TFIIFa Transcription initiation factor IIF, alpha subunit, N-terminal region of RAP74. Subunit of transcription initiation complex involved in initiation, elongation and promoter escape. Tetramer of 2 alpha and 2 beta TFIIF subunits interacts directly with RNA polymerase II. TFIIF inhibits non-specific transcription initiation by PolII and recruits the polymerase to the preinitiation complex on promoter DNA for site-specific transcription initiation. The PolII/TFIIF-complex attaches through direct interactions of TFIIF with promoter DNA, TFIIB and the TAF250 subunit of TFIID, and provides scaffolding for addition of TFIIE and TFIIH. Together with TFIIE, TFIIF participates in DNA strand separation (open complex formation). N-terminal domains of RAP30 and RAP74 co-fold to form a single core structure, a triple barrel heterodimer, and has pseudo-2-fold symmetry. 162
21244 187675 cd00241 DOMON_like Domon-like ligand-binding domains. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases. 158
21245 153074 cd00242 Ecotin Protease Inhibitor Ecotin; homodimeric protease inhibitor. Protease Inhibitor Ecotin; homodimeric protease inhibitor which binds two chymotrypsin-like serine proteases to form a heterotetramer. Found in bacterial periplasm. Inhibits a broad range of serine proteases including collagenase, trypsin, chymotrypsin, elastase, and factor Xa but not thrombin. Inhibition mechanism involves binding at two different protease contact sites: the primary and secondary binding sites. Primary site loops of ecotin bind to the active site of target proteases in a substrate-like manner with the P1 residue in ecotin mimicking the interactions of a canonical P1 substrate residue. 136
21246 119415 cd00243 Lysin-Sp18 Sp18 and Lysin from Archaeogastropoda (marine mollusks of the families Haliotidae and Trochidae) sperm. Both proteins play an important role in fertilization: sp18 mediates fusion between sperm and egg cell membrane, lysin dissolves the vitelline envelope surrounding the egg; they are believed to be a product of gene duplication and subsequent divergence. 122
21247 238148 cd00244 AlgLyase Alginate Lyase A1-III; enzymatically depolymerizes alginate, a complex copolymer of beta-D-mannuronate and alpha-L-guluronate, by cleaving the beta-(1,4) glycosidic bond. 339
21248 238149 cd00245 Glm_e Coenzyme B12-dependent glutamate mutase epsilon subunit-like family; contains proteins similar to Clostridium cochlearium glutamate mutase (Glm) and Streptomyces tendae Tu901 NikV. Glm catalyzes a carbon-skeleton rearrangement of L-glutamate to L-threo-3-methylaspartate. The first step in the catalysis is a homolytic cleavage of the Co-C bond of the coenzyme B12 cofactor to generate a 5'-deoxyadenosyl radical. This radical then initiates the rearrangement reaction. C. cochlearium Glm is a sigma2epsilon2 heterotetramer. Glm plays a role in glutamate fermentation in Clostridium sp. and in members of the family Enterobacteriaceae, and in the synthesis of the lipopeptide antibiotic friulimicin in Actinoplanes friuliensis. S. tendae Tu901 glutamate mutase-like proteins NikU and NikV participate in the synthesis of the peptidyl nucleoside antibiotic nikkomycin. NikU and NikV proteins have sequence similarity to Clostridium Glm sigma and epsilon components respectively, and may catalyze the rearrangement of 2-oxoglutaric acid to 2-keto-3-methylsuccinic acid during nikkomycin synthesis. 428
21249 238150 cd00246 RabGEF Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport. 103
21250 238151 cd00247 Endostatin-like Endostatin-like domain; the angiogenesis inhibitor endostatin is a C-terminal fragment of collagen XV/XVIII, a proteoglycan/collagen found in vessel walls and basement membranes; this domain has a compact globular fold similar to that of C-type lectins; endostatin XVIII is monomeric and contains a heparin-binding epitope and zinc binding sites while endostatin XV is trimeric and contains neither of these sites; the generation of endostatin or endostatin-like collagen XV/XVIII fragments is catalyzed by proteolytic enzymes within the protease-sensitive hinge region of the C-terminal domain; endostatin inhibits endothelial cell migration in vitro and appears to be highly effective in murine in vivo studies 171
21251 238152 cd00248 Mth938-like Mth938-like domain. The members of this family include: Mth938, 2P1, Xcr35, Rpa2829, and several uncharacterized sequences. Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. This protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer. 2P1 is a partially characterized nuclear protein which is homologous to E3-3 from rat and known to be alternately spliced. Xcr35 and Rpa2829 are hypothetical proteins of unknown function from the Xanthomonas campestris and Rhodopseudomonas palustris genomes, respectively, for which the crystal structures have been determined. 109
21252 238153 cd00249 AGE AGE domain; N-acyl-D-glucosamine 2-epimerase domain; Responsible for intermediate epimerization during biosynthesis of N-acetylneuraminic acid. Catalytic mechanism is believed to be via nucleotide elimination and readdition and is ATP modulated. AGE is structurally and mechanistically distinct from the other four types of epimerases. The AGE domain monomer is composed of an alpha(6)/alpha(6)-barrel, the structure of which is also found in glucoamylase and cellulase. The active form is a homodimer. The alignment also contains subtype III mannose 6-phosphate isomerases. 384
21253 238154 cd00250 CAS_like Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2 262
21254 119403 cd00251 Mth_Ecto The ectodomain of Methuselah (Mth); Mth mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage; The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family; Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. 176
21255 320009 cd00252 EFh_SPARC_EC EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC)-like proteins. The SPARC protein family represents a diverse group of proteins that share a follistatin-like (FS) domain and an extracellular calcium-binding (EC) domain with two EF-hand motifs. It includes SPARC (for secreted protein acidic and rich in cysteine, also termed osteonectin/ON, or basement-membrane protein 40/BM-40), SPARC-like protein 1 (for secreted protein, acidic and rich in cysteines-like 1/ SPARCL1, also termed high endothelial venule protein/Hevi, or MAST 9, or SC-1, or RAGS-1, or QR1, or ECM 2), testicans 1, 2, and 3 (also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycans, or SPOCK), secreted modular calcium-binding protein SMOC-1 (also termed SPARC-related modular calcium-binding protein 1) and SMOC-2 (also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2/SMAP-2), follistatin-related protein 1 (FRP-1, also termed follistatin-like protein 1/fstl-1, TSC-36/Flik, TGF-beta inducible protein). The SPARC proteins have been implicated in modulating cell interaction with the extracellular milieu, including regulation of extracellular matrix assembly and deposition, counter-adhesion, effects on extracellular protease activity, and modulation of growth factor/cytokine signaling pathways, as well as in development and disease. 107
21256 238156 cd00253 PL_Passenger_AT Pertactin-like passenger domains (virulence factors) of autotransporter proteins of the type V secretion system. Autotransporters are proteins used by Gram-negative bacteria to transport proteins across their outer membranes. The C-terminal (beta) domain of autotransporters forms a pore in the outer membrane through which the N-terminal passenger domain is transported. Following transport, the passenger domain is generally cleaved by an outer membrane protease with the passenger domain either remaining in contact with the surface via a noncovalent interaction with the beta domain or cleaved to release a soluble protein. These proteins are highly diverse and perform a variety of functions that promote virulence, including catalyzing proteolysis, serving as an adhesin, mediating actin-promoted motility, or serving as a cytotoxin. Proteins in this family share similarity in the C-terminal region of the passenger domain as seen in the pertactin structure P.69, a Bordetella pertussis agglutinogen responsible for human pertussis. The P.69 protein consists of a 16-stranded parallel beta-helix with a V-shaped cross-section, and is one of the largest beta-helices known to date. 186
21257 381594 cd00254 LT-like lytic transglycosylase(LT)-like domain. Members include the soluble and insoluble membrane-bound LTs in bacteria and LTs in bacteriophage lambda. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. 111
21258 238158 cd00255 nidG2 Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, like collagen, perlecan, laminin, and has a potential role in the assembly and connection of networks. Nidogen consists of 3 globular domains (G1-G3), G3 is the laminin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria. 224
21259 238159 cd00256 VATPase_H VATPase_H, regulatory vacuolar ATP synthase subunit H (Vma13p); activation component of the peripheral V1 complex of V-ATPase, a heteromultimeric enzyme which uses ATP to actively transport protons into organelles and extracellular compartments. The topology is that of a superhelical spiral, in part the geometry is similar to superhelices composed of armadillo repeat motifs, as found in importins for example. 429
21260 238160 cd00257 Fascin Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF) 119
21261 238161 cd00258 GM2-AP GM2 activator protein (GM2-AP) is a non-enzymatic lysosomal protein that acts as cofactor in the sequential degradation of gangliosides. GM2A is an essential cofactor for beta-hexosaminidase A (Hex A) in the enzymatic hydrolysis of GM2 ganglioside to GM3. Mutation of the gene results in the AB variant of Tay-Sachs disease. GM2-AP and similar proteins belong to the ML domain family. 162
21262 119404 cd00259 STNV STNV domain; satellite tobacco necrosis virus (STNV) are small plant viruses which are completely dependent on the presence of a specific helper virus, TNV, for their replication; 60 identical subunits, this domain is one of them; form an icosahedral shell around a single RNA molecule. Half of the RNA codes for the coat protein with the other half being non-coding. The STNV domain has a "Swiss roll" Greek key topology with its two 4-stranded antiparallel beta sheets 195
21263 271229 cd00260 Sialidase sialidases/neuraminidases. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates as well as playing roles in pathogenesis, bacterial nutrition and cellular interactions. They have a six-bladed beta-propeller fold. This hierarchy includes eubacterial, eukaryotic, and viral sialidases. 361
21264 238163 cd00261 AAI_SS AAI_SS: Alpha-Amylase Inhibitors (AAIs) and Seed Storage (SS) Protein subfamily; composed of cereal-type AAIs and SS proteins. They are mainly present in the seeds of a variety of plants. AAIs play an important role in the natural defenses of plants against insects and pathogens such as fungi, bacteria and viruses. AAIs impede the digestion of plant starch and proteins by inhibiting digestive alpha-amylases and proteinases. Also included in this subfamily are SS proteins such as 2S albumin, gamma-gliadin, napin, and prolamin. These AAIs and SS proteins are also known allergens in humans. 110
21265 238164 cd00264 BPI BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria. Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 208
21266 238165 cd00265 MADS_MEF2_like MEF2 (myocyte enhancer factor 2)-like/Type II subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from SRF-like/Type I subgroup mainly in position of the alpha helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 77
21267 238166 cd00266 MADS_SRF_like SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from the MEF-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 83
21268 213179 cd00267 ABC_ATPase ATP-binding cassette transporter nucleotide-binding domain. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 157
21269 350669 cd00268 DEADc DEAD-box helicase domain of DEAD box helicases. DEAD-box helicases comprise a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 196
21270 238168 cd00270 MATH_TRAF_C Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link cell surface TNFRs and receptors of the interleukin-1/Toll-like family to downstream kinase signaling cascades which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. There are at least six mammalian and three Drosophila proteins containing TRAF domains. The mammalian TRAFs display varying expression profiles, indicating independent and cell type-specific regulation. They display distinct, as well as overlapping functions and interactions with receptors. Most TRAFs, except TRAF1, share N-terminal homology and contain a RING domain, multiple zinc finger domains, and a TRAF domain. TRAFs form homo- and heterotrimers through their TRAF domains. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 149
21271 238169 cd00271 Chemokine_C Chemokine_C, C or lymphotactin subgroup, 1 of 4 subgroup designations of chemokines based on the arrangement of two N-terminal, conserved cysteine residues. Most of the known chemokines (cd00169) belong to either the CC (cd00272) or CXC (cd00273) subclass. The two other subclasses each have a single known member: fractalkine for the CX3C (cd00274) class and lymphotactin for the C (cd00271) class. Chemokine_Cs differ structurally since they contain only one of the two disulfide bridges that are conserved in all other chemokines and they possess a unique C-terminal extension, which is required for biological activity and thought to play a role in receptor binding. Lymphotactin, a mediator of mucosal immunity, has been found to chemoattract neutrophils and B cells through the XCR1 receptor and thought to be a factor in acute allograft rejection and inflammatory bowel disease. 72
21272 238170 cd00272 Chemokine_CC Chemokine_CC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; some members (e.g. 2HCC) contain an additional disulfide bond which is thought to compensate for the highly conserved Trp missing in these; chemotactic for monocytes, macrophages, eosinophils, basophils, and T cells, but not neutrophils; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses; a subgroup of CC, identified by an N-terminal DCCL motif (Exodus-1, Exodus-2, and Exodus-3), has been shown to inhibit specific types of human cancer cell growth in a mouse model. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups, and Chemokine_CC_DCCL for the DCCL subgroup of this CD. 57
21273 238171 cd00273 Chemokine_CXC Chemokine_CXC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; many members contain an RCxC motif which may be a general requirement for binding to CXC chemokine receptors; those with the ELR motif are chemotactic for neutrophils and have been shown to be angiogenic, while those without the motif act on T and B cells, and are typically angiostatic; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups. 64
21274 238172 cd00274 Chemokine_CX3C Chemokine_CX3C: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteines; differ structurally from the other subgroups in that they are attached to a membrane-spanning domain via a mucin-like stalk and can be proteolytically cleaved to a freely diffusible form; chemotactic for T cells, monocytes, and natural killer cells; function as monomers and are found only in vertebrates and a few viruses; currently only fractalkine (sometimes called neurotactin) has been identified as a member of this subfamily; the primary source of fractalkine is neurons, and they exhibit cell adhesion and chemoattractive properties in the central nervous system. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_CC (cd00272), and Chemokine_C (cd00271) for the additional chemokine subgroups. 76
21275 175974 cd00275 C2_PLC_like C2 domain present in Phosphoinositide-specific phospholipases C (PLC). PLCs are involved in the hydrolysis of phosphatidylinositol-4,5-bisphosphate (PIP2) to d-myo-inositol-1,4,5-trisphosphate (1,4,5-IP3) and sn-1,2-diacylglycerol (DAG). 1,4,5-IP3 and DAG are second messengers in eukaryotic signal transduction cascades. PLC is composed of a N-terminal PH domain followed by a series of EF hands, a catalytic TIM barrel and a C-terminal C2 domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-II topology. 128
21276 175975 cd00276 C2B_Synaptotagmin C2 domain second repeat present in Synaptotagmin. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. There are several classes of Synaptotagmins. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 134
21277 238173 cd00279 YlxR Ylxr homologs; group of conserved hypothetical bacterial proteins of unknown function; structure revealed putative RNA binding cleft; proteins are encoded by an operon that includes other proteins involved in transcription and/or translation 79
21278 238174 cd00280 TRFH Telomeric Repeat binding Factor or TTAGGG Repeat binding Factor, central (dimerization) domain Homology; TRFH. Telomeres are protein/DNA complexes that make up the physical ends of eukaryotic linear chromosomes and are essential for chromosome stability, protecting the chromosome ends from degradation and end-to-end fusion. Proteins TRF1, TRF2 and Taz1 bind telomeric DNA and are also involved in recruiting interacting proteins, TIN2, and Rap1, to the telomeres. It has also been demonstrated that PARP1 associates with TRF2 and is capable of poly(ADP-ribosyl)ation of TRF2, which affects binding of TRF2 to telomeric DNA. TRF1, TRF2 and Taz1 proteins contain three functional domains: an N-terminal acidic domain, a central TRF-specific/dimerization domain, and a C-terminal DNA binding domain with a single Myb-like repeat. Homodimerization, a prerequisite to DNA binding, results in the juxtaposition of two Myb DNA binding domains. 200
21279 176449 cd00281 DAP_dppA Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases. 265
21280 238175 cd00283 GIY-YIG_Cterm GIYX(10-11)YIG family of class I homing endonucleases C-terminus (GIY-YIG_Cterm). Homing endonucleases promote the mobility of intron or intein by recognizing and cleaving a homologous allele that lacks the sequence. They catalyze a double-strand break in the DNA near the insertion site of that element to facilitate homing at that site. Class I homing endonucleases are sorted into four families based on the presence of these motifs in their respective N-termini: LAGLIDADG, His-Cys box, HNH, and GIY-YIG. This CD contains several but not all members of the GIY-YIG family. The C-terminus of GIY-YIG is a DNA-binding domain which is separated from the N-terminus by a long, flexible linker. The DNA-binding domain consists of a minor-groove binding alpha-helix, and a helix-turn-helix. Some also contain a zinc finger (i.e. I-TevI) which is not required for DNA binding or catalysis, but is a component of the linker and directs the catalytic domain to cleave the homing site at a fixed distance from the intron insertion site. 113
21281 238176 cd00284 Cytochrome_b_N Cytochrome b (N-terminus)/b6/petB: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal portion of cytochrome b is described in a separate CD. 200
21282 276954 cd00286 Tubulin_FtsZ_Cetz-like Tubulin protein family of FtsZ and CetZ-like. This family includes tubulin alpha-, beta-, gamma-, delta-, epsilon, and zeta-tubulins as well as FtsZ and CetZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts. A recent study found that CetZ proteins, formerly annotated FtsZ type 2, are not required for cell division, whereas FtsZ proteins play an important role. Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics. The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development. 332
21283 238177 cd00287 ribokinase_pfkB_like ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all of which are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely: monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g. THZ kinase). 196
21284 238178 cd00288 Pyruvate_Kinase Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer. 480
21285 238179 cd00290 cytochrome_b_C Cytochrome b(C-terminus)/b6/petD: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal domain is involved in forming the ubiquinol/ubiquinone binding sites, but not the heme binding sites. The N-terminal portion of cytochrome b, which contains both heme binding sites, is described in a separate CD. 147
21286 238180 cd00291 SirA_YedF_YeeD SirA, YedF, and YeeD. Two-layered alpha/beta sandwich domain. SirA (also known as UvrY, and YhhP) belongs to a family of bacterial two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 69
21287 238181 cd00292 EF1B Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain. EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction. 88
21288 238182 cd00293 USP_Like Usp: Universal stress protein family. The universal stress protein Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae Usp reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, although Usp lacks ATP-binding activity. 130
21289 238183 cd00295 RNA_Cyclase RNA 3' phosphate cyclase domain - RNA phosphate cyclases are enzymes that catalyze the ATP-dependent conversion of 3'-phosphate at the end of RNA into a 2',3'-cyclic phosphodiester bond. The enzymes are conserved in eucaryotes, bacteria and archaea. The exact biological role of this enzyme is unknown, but it has been proposed that it is likely to function in cellular RNA metabolism and processing. RNA phosphate cyclase has been characterized in human (with at least three isozymes), and E. coli, and it seems to be taxonomically widespread. The crystal structure of RNA phosphate cyclase shows that it consists of two domains. The larger domain contains three repeats of a fold originally identified in the bacterial translation initiation factor IF3. 338
21290 238184 cd00296 SIR2 SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines. 222
21291 107219 cd00298 ACD_sHsps_p23-like This domain family includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this family is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR). 80
21292 198286 cd00299 GST_C_family C-terminal, alpha helical domain of the Glutathione S-transferase family. Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases. 100
21293 133418 cd00300 LDH_like L-lactate dehydrogenase-like enzymes. Members of this subfamily are tetrameric NAD-dependent 2-hydroxycarboxylate dehydrogenases including LDHs, L-2-hydroxyisocaproate dehydrogenases (L-HicDH), and LDH-like malate dehydrogenases (MDH). Dehydrogenases catalyze the conversion of carbonyl compounds to alcohols or amino acids. LDHs catalyze the last step of glycolysis in which pyruvate is converted to L-lactate. Vertebrate LDHs are non-allosteric, but some bacterial LDHs are activated by an allosteric effector such as fructose-1,6-bisphosphate. L-HicDH catalyzes the conversion of a variety of 2-oxo carboxylic acids with medium-sized aliphatic or aromatic side chains. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like subfamily is part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 300
21294 381182 cd00301 lipocalin_FABP lipocalin/cytosolic fatty acid-binding protein family. Lipocalins are diverse, mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules as well as membrane bound-receptors. They have a large beta-barrel ligand-binding cavity. Members include retinol-binding protein, retinoic acid-binding protein, complement protein C8 gamma, Can f 2, apolipoprotein D, extracellular fatty acid-binding protein, beta-lactoglobulin, odorant-binding protein, and bacterial lipocalin Blc. Lipocalins are involved in many important processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty acid-binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and are involved in protection and shuttling of fatty acids within the cell, and in acquisition and removal of fatty acids from intracellular sites. 109
21295 410651 cd00302 cytochrome_P450 cytochrome P450 (CYP) superfamily. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs with > 40% sequence identity are members of the same family. There are approximately 2250 CYP families: mammals, insects, plants, fungi, bacteria, and archaea have around 18, 208, 277, 805, 591, and 14 families, respectively. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. CYPs use a variety of redox partners, such as the eukaryotic diflavin enzyme NADPH-cytochrome P450 oxidoreductase and the bacterial/mitochondrial NAD(P)H-ferredoxin reductase and ferredoxin partners. Some CYPs are naturally linked to their redox partners and others have evolved to bypass requirements for redox partners, and instead react directly with hydrogen peroxide or NAD(P)H to facilitate oxidative or reductive catalysis. 391
21296 133136 cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 92
21297 238185 cd00304 RT_like RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs. 98
21298 238186 cd00305 Cu-Zn_Superoxide_Dismutase Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]. 144
21299 173787 cd00306 Peptidases_S8_S53 Peptidase domain in the S8 and S53 families. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values. 241
21300 238187 cd00307 RuBisCO_small_like Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit and related proteins. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. This superfamily also contains specific proteins from cyanobacteria. CcmM plays a role in a CO2 concentrating mechanism, which cyanobacteria need to overcome the low specificity of their Rubisco and fusions to Rubisco activase, a type of chaperone, which promotes and maintains the catalytic activity of Rubisco. CcmM contains an N-terminal carbonic anhydrase fused to four copies of the Rubisco-small subunit domain. 84
21301 238188 cd00308 enolase_like Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase. 229
21302 238189 cd00309 chaperonin_type_I_II chaperonin families, type I and type II. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 464
21303 349411 cd00310 ATP-synt_Fo_a_6 ATP synthase Fo complex, subunit 6 (eukaryotes) and subunit a (prokaryotes). Bacterial forms are designated as ATP synthase, Fo complex, subunit a; eukaryotic (chloroplast and mitochondrial) forms are designated as ATP synthase, Fo complex, subunit 6. The F-ATP synthases (also called FoF1-ATPases) consist of two structural domains: F1 (factor one) complex containing the soluble catalytic core, and Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts or in the plasma membranes of bacteria. F-ATP synthase has also been found in the archaea Methanosarcina acetivorans. F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. 156
21304 238190 cd00311 TIM Triosephosphate isomerase (TIM) is a glycolytic enzyme that catalyzes the interconversion of dihydroxyacetone phosphate and D-glyceraldehyde-3-phosphate. The reaction is very efficient and requires neither cofactors nor metal ions. TIM, usually homodimeric, but in some organisms tetrameric, is ubiquitous and conserved in function across eukaryotes, bacteria and archaea. 242
21305 238191 cd00312 Esterase_lipase Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate. 493
21306 349412 cd00313 ATP-synt_Fo_Vo_Ao_c ATP synthase, membrane-bound Fo/Vo/Ao complexes, subunit c. Subunit c of the Fo/Vo/Ao complex is the main transmembrane subunit of F-, V- or A-type family of ATP synthases with rotary motors. These ion-transporting rotary ATP synthases are composed of two linked multi-subunit complexes: the F1, V1, and A1 complexes contains three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao (oligomycin sensitive) complex that forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. The A-ATP synthase (AoA1-ATPases) is exclusively found in archaea and function like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes. 65
21307 173823 cd00314 plant_peroxidase_like Heme-dependent peroxidases similar to plant peroxidases. Along with animal peroxidases, these enzymes belong to a group of peroxidases containing a heme prosthetic group (ferriprotoporphyrin IX), which catalyzes a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. The plant peroxidase-like superfamily is found in all three kingdoms of life and carries out a variety of biosynthetic and degradative functions. Several sub-families can be identified. Class I includes intracellular peroxidases present in fungi, plants, archaea and bacteria, called catalase-peroxidases, that can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. Catalase-peroxidases are typically comprised of two homologous domains that probably arose via a single gene duplication event. Class II includes ligninase and other extracellular fungal peroxidases, while class III is comprised of classic extracellular plant peroxidases, like horseradish peroxidase. 255
21308 238192 cd00315 Cyt_C5_DNA_methylase Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. 275
21309 238193 cd00316 Oxidoreductase_nitrogenase The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. This group contains both alpha and beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase) and, both subunits of Protochlorophyllide (Pchlide) reductase and chlorophyllide (chlide) reductase. The nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). The most widespread and best characterized nitrogenase is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers whose alpha and beta subunits are similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster from which, electrons are transferred to the P-cluster of the MoFe and in turn, to FeMoCo at the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. Pchlide reductase and chlide reductase participate in the Mg-branch of the tetrapyrrole biosynthetic pathway. Pchlide reductase catalyzes the reduction of the D-ring of Pchlide during the synthesis of chlorophylls (Chl) and bacteriochlorophylls (BChl). Chlide-a reductase catalyzes the reduction of the B-ring of Chlide-a during the synthesis of BChl-a. The Pchlide reductase NB complex is an N2B2 heterotetramer resembling nitrogenase FeMo, N and B proteins are homologous to the FeMo alpha and beta subunits respectively. The NB complex may serve as a catalytic site for Pchlide reduction and, the ZY complex as a site of chlide reduction, similar to MoFe for nitrogen reduction. 399
21310 238194 cd00317 cyclophilin cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit a peptidylprolyl cis- trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing. 146
21311 238195 cd00318 Phosphoglycerate_kinase Phosphoglycerate kinase (PGK) is a monomeric enzyme which catalyzes the transfer of the high-energy phosphate group of 1,3-bisphosphoglycerate to ADP, forming ATP and 3-phosphoglycerate. This reaction represents the first of the two substrate-level phosphorylation events in the glycolytic pathway. Substrate-level phosphorylation is defined as production of ATP by a process, which is catalyzed by water-soluble enzymes in the cytosol; not involving membranes and ion gradients. 397
21312 238196 cd00319 Ribosomal_S12_like Ribosomal protein S12-like family; composed of prokaryotic 30S ribosomal protein S12, eukaryotic 40S ribosomal protein S23 and similar proteins. S12 and S23 are located at the interface of the large and small ribosomal subunits, adjacent to the decoding center. They play an important role in translocation during the peptide elongation step of protein synthesis. They are also involved in important RNA and protein interactions. Ribosomal protein S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. S23 interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes translocation. Mutations in S12 and S23 have been found to affect translational accuracy. Antibiotics such as streptomycin may also bind S12/S23 and cause the ribosome to misread the genetic code. 95
21313 238197 cd00320 cpn10 Chaperonin 10 Kd subunit (cpn10 or GroES); Cpn10 cooperates with chaperonin 60 (cpn60 or GroEL), an ATPase, to assist the folding and assembly of proteins and is found in eubacterial cytosol, as well as in the matrix of mitochondria and chloroplasts. It forms heptameric rings with a dome-like structure, forming a lid to the large cavity of the tetradecameric cpn60 cylinder and thereby tightly regulating release and binding of proteins to the cpn60 surface. 93
21314 238198 cd00321 SO_family_Moco Sulfite oxidase (SO) family, molybdopterin binding domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 156
21315 99778 cd00322 FNR_like Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H). 223
21316 271245 cd00323 uS7 Ribosomal protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 130
21317 381595 cd00325 chitinase_GH19 Glycoside hydrolase family 19, chitinase domain. Chitinases are enzymes that catalyze the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Glycoside hydrolase family 19 chitinases are found primarily in plants (classes I, III, and IV), but some are found in bacteria. Class I and II chitinases are similar in their catalytic domains. Class I chitinases have an N-terminal cysteine-rich, chitin-binding domain which is separated from the catalytic domain by a proline and glycine-rich hinge region. Class II chitinases lack both the chitin-binding domain and the hinge region. Class IV chitinases are similar to class I chitinases, but they are smaller in size due to certain deletions. Despite lacking any significant sequence homology with lysozymes, structural analysis reveals that family 19 chitinases, together with family 46 chitosanases, are similar to several lysozymes including those from T4-phage and from goose. The structures reveal that the different enzyme groups arose from a common ancestor glycohydrolase antecedent to the prokaryotic/eukaryotic divergence. 224
21318 238200 cd00326 alpha_CA Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer. 227
21319 238201 cd00327 cond_enzymes Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products. 254
21320 163705 cd00328 catalase Catalase heme-binding enzyme. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Most catalases exist as tetramers of 65KD subunits containing a protoheme IX group buried deep inside the structure. In eukaryotic cells, catalases are located in peroxisomes. 433
21321 238202 cd00329 TopoII_MutL_Trans MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families. 107
21322 153075 cd00330 phosphagen_kinases Phosphagen (guanidino) kinases. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized. 236
21323 238203 cd00331 IGPS Indole-3-glycerol phosphate synthase (IGPS); an enzyme in the tryptophan biosynthetic pathway, catalyzing the ring closure reaction of 1-(o-carboxyphenylamino)-1-deoxyribulose-5-phosphate (CdRP) to indole-3-glycerol phosphate (IGP), accompanied by the release of carbon dioxide and water. IGPS is active as a separate monomer in most organisms, but is also found fused to other enzymes as part of a bifunctional or multifunctional enzyme involved in tryptophan biosynthesis. 217
21324 176460 cd00332 PAL-HAL Phenylalanine ammonia-lyase (PAL) and histidine ammonia-lyase (HAL). PAL and HAL are members of the Lyase class I_like superfamily of enzymes that, catalyze similar beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. PAL, present in plants and fungi, catalyzes the conversion of L-phenylalanine to E-cinnamic acid. HAL, found in several bacteria and animals, catalyzes the conversion of L-histidine to E-urocanic acid. Both PAL and HAL contain the cofactor 3, 5-dihydro-5-methylidene-4H-imidazol-4-one (MIO) which is formed by autocatalytic excision/cyclization of the internal tripeptide, Ala-Ser-Gly. PAL is being explored as enzyme substitution therapy for Phenylketonuria (PKU), a disorder which involves an inability to metabolize phenylalanine. HAL failure in humans results in the disease histidinemia. 444
21325 238204 cd00333 MIP Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms. 228
21326 238205 cd00336 Ribosomal_L22 Ribosomal protein L22/L17e. L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit. It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA. L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes. L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome. 105
21327 238206 cd00338 Ser_Recombinase Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain. These enzymes perform site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and serine recombinase. Serine recombinases demonstrate functional versatility and include resolvases, invertases, integrases, and transposases. Resolvases and invertases (i.e. Tn3, gamma-delta, Tn5044 resolvases, Gin and Hin invertases) in this family contain a C-terminal DNA binding domain and comprise a major phylogenic group. Also included are phage- and bacterial-encoded recombinases such as phiC31 integrase, SpoIVCA excisionase, and Tn4451 TnpX transposase. These integrases and transposases have larger C-terminal domains compared to resolvases/invertases and are referred to as large serine recombinases. Also belonging to this family are proteins with N-terminal DNA binding domains similar to IS607- and IS1535-transposases from Helicobacter and Mycobacterium. 137
21328 238207 cd00340 GSH_Peroxidase Glutathione (GSH) peroxidase family; tetrameric selenoenzymes that catalyze the reduction of a variety of hydroperoxides including lipid peroxidases, using GSH as a specific electron donor substrate. GSH peroxidase contains one selenocysteine residue per subunit, which is involved in catalysis. Different isoenzymes are known in mammals, which are involved in protection against reactive oxygen species, redox regulation of many metabolic processes, peroxinitrite scavenging, and modulation of inflammatory processes. 152
21329 238208 cd00342 gram_neg_porins Porins form aqueous channels for the diffusion of small hydrophilic molecules across the outer membrane. Individual 16-strand anti-parallel beta-barrels form a central pore, and trimerize through mainly hydrophobic interactions at the interface. Trimers are stabilized by hydrophilic clamping of Loop L2. Loop 3 bends into the pore, creating an elliptical constriction of about 7 x 11A, large enough to allow passage of a glucose molecule without steric hindrance. Removal of the C-terminal residue (usually F) destabilizes the trimer and removal of the 16th beta-sheet abolishes trimerization. Unlike typical membrane proteins, porins lack long hydrophobic stretches. Short turns are found at the smooth, periplasmic end, longer irregular loops are found at the rough, extracellular end. C-terminal residue forms salt bridge with N-terminus. 329
21330 188629 cd00344 FBP_aldolase_I Fructose-bisphosphate aldolase class I. Fructose-bisphosphate aldolase class I. Fructose-1,6-bisphosphate aldolase is an enzyme of the glycolytic and gluconeogenic pathways found in vertebrates, plants, and bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin. 328
21331 238209 cd00347 Flavin_utilizing_monoxygenases Flavin-utilizing monoxygenases 90
21332 100101 cd00349 Ribosomal_L11 Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation. 131
21333 238210 cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity. 33
21334 238211 cd00351 TS_Pyrimidine_HMase Thymidylate synthase and pyrimidine hydroxymethylase: Thymidylate synthase (TS) and deoxycytidylate hydroxymethylase (dCMP-HMase) are homologs that catalyze analogous alkylation of C5 of pyrimidine nucleotides. Both enzymes are involved in the biosynthesis of DNA precursors and are active as homodimers. However, they exhibit distinct pyrimidine base specificities and differ in the details of their catalyzed reactions. TS is biologically ubiquitous and catalyzes the conversion of dUMP and methylene-tetrahydrofolate (CH2THF) to dTMP and dihydrofolate (DHF). It also acts as a regulator of its own expression by binding and inactivating its own RNA. Due to its key role in the de novo pathway for thymidylate synthesis and, hence, DNA synthesis, it is one of the most conserved enzymes across species and phyla. TS is a well-recognized target for anticancer chemotherapy, as well as a valuable new target against infectious diseases. Interestingly, in several protozoa, a single polypeptide chain codes for both, dihydrofolate reductase (DHFR) and thymidylate synthase (TS), forming a bifunctional enzyme (DHFR-TS), possibly through gene fusion at a single evolutionary point. DHFR-TS is also active as a dimer. Virus encoded dCMP-HMase catalyzes the reversible conversion of dCMP and CH2THF to hydroxymethyl-dCMP and THF. This family also includes dUMP hydroxymethylase, which is encoded by several bacteriophages that infect Bacillus subtilis, for their own protection against the host restriction system, and contain hydroxymethyl-dUMP instead of dTMP in their DNA. 215
21335 238212 cd00352 Gn_AT_II Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer. 220
21336 238213 cd00353 Ribosomal_S15p_S13e Ribosomal protein S15 (prokaryotic)_S13 (eukaryotic) binds the central domain of 16S rRNA and is required for assembly of the small ribosomal subunit and for intersubunit association, thus representing a key element in the assembly of the whole ribosome. S15 also plays an important autoregulatory role by binding and preventing its own mRNA from being translated. S15 has a predominantly alpha-helical fold that is highly structured except for the N-terminal alpha helix. 80
21337 238214 cd00354 FBPase Fructose-1,6-bisphosphatase, an enzyme that catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway. The alignment model also includes chloroplastic FBPases and sedoheptulose-1,7-bisphosphatases that play a role in the pentose phosphate pathway (Calvin cycle). 315
21338 100098 cd00355 Ribosomal_L30_like Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea. 53
21339 238215 cd00361 arom_aa_hydroxylase Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers. 221
21340 238216 cd00363 PFK Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to the PFK family that includes ATP- and pyrophosphate (PPi)-dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers. 338
21341 153080 cd00365 HMG-CoA_reductase Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR) is a tightly regulated enzyme, which catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals, this is the rate-limiting committed step in cholesterol biosynthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. There are two classes of HMGR: class I enzymes which are found predominantly in eukaryotes and contain N-terminal membrane regions and class II enzymes which are found primarily in prokaryotes and are soluble as they lack the membrane region. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. While the prokaryotic enzyme is a homodimer, the eukaryotic enzyme is a homotetramer. 376
21342 212658 cd00366 FGGY FGGY family of carbohydrate kinases. This family is predominantly composed of glycerol kinase (GK) and similar carbohydrate kinases including rhamnulokinase (RhuK), xylulokinase (XK), gluconokinase (GntK), ribulokinase (RBK), and fuculokinase (FK). These enzymes catalyze the transfer of a phosphate group, usually from ATP, to their carbohydrate substrates. The monomer of FGGY proteins contains two large domains, which are separated by a deep cleft that forms the active site. One domain is primarily involved in sugar substrate binding, and the other is mainly responsible for ATP binding. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Substrate-induced conformational changes and a divalent cation may be required for the catalytic activity. 435
21343 238217 cd00367 PTS-HPr_like Histidine-containing phosphocarrier protein (HPr)-like proteins. HPr is a central component of the bacterial phosphoenolpyruvate sugar phosphotransferase system (PTS). The PTS catalyses the phosphorylation of sugar substrates during their translocation across the cell membrane. The phosphoryl group from phosphoenolpyruvate is transferred to HPr by enzyme I (EI). Phospho-HPr then transfers the phosphoryl group to one of several sugar-specific phosphoprotein intermediates. The conserved histidine in the N-terminus of HPr serves as an acceptor for the phosphoryl group of EI. In addition to the phosphotransferase proteins HPr and E1, this family also includes the closely related Carbon Catabolite Repressor (CCR) proteins which use the same phosphorylation mechanism and interact with transcriptional regulators to control expression of genes coding for utilization of less favored carbon sources. 77
21344 238218 cd00368 Molybdopterin-Binding Molybdopterin-Binding (MopB) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. 374
21345 238219 cd00371 HMA Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. The HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone have been associated with Menkes/Wilson disease due to binding of multiple copper ions. 63
21346 238220 cd00374 RNase_T2 Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. 195
21347 238221 cd00375 Urease_alpha Urease alpha-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, fungi and plants. Their primary role is to allow the use of external and internally generated urea as a nitrogen source. The enzyme consists of 3 subunits, alpha, beta and gamma, which can be fused and present on a single protein chain and which in turn forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 567
21348 119340 cd00377 ICL_PEPM Members of the ICL/PEPM enzyme family catalyze either P-C or C-C bond formation/cleavage. Known members are phosphoenolpyruvate mutase (PEPM), phosphonopyruvate hydrolase (PPH), carboxyPEP mutase (CPEP mutase), oxaloacetate hydrolase (OAH), isocitrate lyase (ICL), and 2-methylisocitrate lyase (MICL). Isocitrate lyase (ICL) catalyzes the conversion of isocitrate to succinate and glyoxylate, the first committed step in the glyoxylate pathway. This carbon-conserving pathway is present in most prokaryotes, lower eukaryotes and plants, but has not been observed in vertebrates. PEP mutase (PEPM) turns phosphoenolpyruvate (PEP) into phosphonopyruvate (P-pyr), an important intermediate in the formation of organophosphonates, which function as antibiotics or play a role in pathogenesis or signaling. P-pyr can be hydrolyzed by phosphonopyruvate hydrolase (PPH) to form pyruvate and phosphate. Oxaloacetate acetylhydrolase (OAH) catalyzes the hydrolytic cleavage of oxaloacetate to form acetate and oxalate, an important pathway to produce oxalate in filamentous fungi. 2-methylisocitrate lyase (MICL) cleaves 2-methylisocitrate to pyruvate and succinate, part of the methylcitrate cycle for the alpha-oxidation of propionate. 243
21349 99733 cd00378 SHMT Serine-glycine hydroxymethyltransferase (SHMT). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). SHMT carries out interconversion of serine and glycine; it catalyzes the transfer of hydroxymethyl group of N5, N10-methylene tetrahydrofolate to glycine resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers; the mammalian enzyme forms a homotetramer comprising four pyridoxal phosphate-bound active sites. 402
21350 238222 cd00379 Ribosomal_L10_P0 Ribosomal protein L10 family; composed of the large subunit ribosomal protein called L10 in bacteria, P0 in eukaryotes, and L10e in archaea, as well as uncharacterized P0-like eukaryotic proteins. In all three kingdoms, L10 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2. 155
21351 240504 cd00380 KOW KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese). KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues. 49
21352 238223 cd00381 IMPDH IMPDH: The catalytic domain of the inosine monophosphate dehydrogenase. IMPDH catalyzes the NAD-dependent oxidation of inosine 5'-monophosphate (IMP) to xanthosine 5' monophosphate (XMP). It is a rate-limiting step in the de novo synthesis of the guanine nucleotides. There is often a CBS domain inserted in the middle of this domain, which is proposed to play a regulatory role. IMPDH is a key enzyme in the regulation of cell proliferation and differentiation. It has been identified as an attractive target for developing chemotherapeutic agents. 325
21353 238224 cd00382 beta_CA Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 119
21354 294013 cd00383 trans_reg_C DNA-binding effector domain of two-component system response regulators. Bacteria and some eukaryotes use two-component signal transduction systems to detect and respond to changes in the environment. The system consists of a sensor histidine kinase and a response regulator. The former autophosphorylates a histidine residue on detecting an external stimulus. The phosphate is then transferred to an invariant aspartate residue in a highly conserved receiver domain of the response regulator. Phosphorylation activates a variable effector domain of the response regulator, which triggers the cellular response. This C-terminal effector domain belongs to the winged helix-turn-helix family of transcriptional regulators and contains DNA and RNA polymerase binding sites. Several dimers or monomers bind head to tail to small tandem repeats upstream of the genes. The RNA polymerase binding sites interact with the alpha or sigma subunit of RNA polymerase. 89
21355 238226 cd00384 ALAD_PBGS Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. They either contain a cysteine-rich zinc binding site (consensus DXCXCX(Y/F)X3G(H/Q)CG) or an aspartate-rich magnesium binding site (consensus DXALDX(Y/F)X3G(H/Q)DG). The cysteine-rich zinc binding site appears more common. Most members represented by this model also have a second allosteric magnesium binding site (consensus RX~164DX~65EXXXD, missing in a eukaryotic subfamily with cysteine-rich zinc binding site). 314
21356 173830 cd00385 Isoprenoid_Biosyn_C1 Isoprenoid Biosynthesis enzymes, Class 1. Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD. 243
21357 238227 cd00386 Heme_Cu_Oxidase_III_like Heme-copper oxidase subunit III. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. This group additionally contains proteins which are fusions between subunits I and III, such as Sulfolobus acidocaldarius SoxM, a subunit of the SoxM terminal oxidase complex. It also includes NorE which has been speculated to be a subunit of nitric oxide reductase. Some archaebacterial cytochrome oxidases lack subunit III. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. 183
21358 100102 cd00387 Ribosomal_L7_L12 Ribosomal protein L7/L12. Ribosomal protein L7/L12 refers to the large ribosomal subunit proteins L7 and L12, which are identical except that L7 is acetylated at the N terminus. It is a component of the L7/L12 stalk, which is located at the surface of the ribosome. The stalk base consists of a portion of the 23S rRNA and ribosomal proteins L11 and L10. An extended C-terminal helix of L10 provides the binding site for L7/L12. L7/L12 consists of two domains joined by a flexible hinge, with the helical N-terminal domain (NTD) forming pairs of homodimers that bind to the extended helix of L10. It is the only multimeric ribosomal component, with either four or six copies per ribosome that occur as two or three dimers bound to the L10 helix. L7/L12 is the only ribosomal protein that does not interact directly with rRNA, but instead has indirect interactions through L10. The globular C-terminal domains of L7/L12 are highly mobile. They are exposed to the cytoplasm and contain binding sites for other molecules. Initiation factors, elongation factors, and release factors are known to interact with the L7/L12 stalk during their GTP-dependent cycles. The binding site for the factors EF-Tu and EF-G comprises L7/L12, L10, L11, the L11-binding region of 23S rRNA, and the sarcin-ricin loop of 23S rRNA. Removal of L7/L12 has minimal effect on factor binding and it has been proposed that L7/L12 induces the catalytically active conformation of EF-Tu and EF-G, thereby stimulating the GTPase activity of both factors. In eukaryotes, the proteins that perform the equivalent function to L7/L12 are called P1 and P2, which do not share sequence similarity with L7/L12. However, a bacterial L7/L12 homolog is found in some eukaryotes, in mitochondria and chloroplasts. In archaea, the protein equivalent to L7/L12 is called aL12 or L12p, but it is closer in sequence to P1 and P2 than to L7/L12. 127
21359 238228 cd00389 microbial_RNases microbial_RNases. Ribonucleases (RNases) cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. The alignment contains fungal RNases (U2, T1, F1, Th, Pb, N1, and Ms) and bacterial RNases (barnase, binase, RNase Sa), the majority of which are guanyl specific and fungal ribotoxins. 71
21360 238229 cd00390 Urease_gamma Urease gamma-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 96
21361 238230 cd00392 Ribosomal_L13 Ribosomal protein L13. Protein L13, a large ribosomal subunit protein, is one of five proteins required for an early folding intermediate of 23S rRNA in the assembly of the large subunit. L13 is situated on the bottom of the large subunit, near the polypeptide exit site. It interacts with proteins L3 and L6, and forms an extensive network of interactions with 23S rRNA. L13 has been identified as a homolog of the human breast basic conserved protein 1 (BBC1), a protein identified through its increased expression in breast cancer. L13 expression is also upregulated in a variety of human gastrointestinal cancers, suggesting it may play a role in the etiology of a variety of human malignancies. 114
21362 132923 cd00394 Clp_protease_like Caseinolytic protease (ClpP) is an ATP-dependent protease. Clp protease (caseinolytic protease; ClpP; endopeptidase Clp; Peptidase S14; ATP-dependent protease, ClpAP)-like enzymes are highly conserved serine proteases and belong to the ClpP/Crotonase superfamily. Included in this family are Clp proteases that are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. The functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP. The active site consists of the triad Ser, His and Asp, preferring hydrophobic or non-polar residues at P1 or P1' positions. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Another family included in this class of enzymes is the signal peptide peptidase A (SppA; S49) which is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Mutagenesis studies suggest that the catalytic center of SppA comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain. Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain. The third family included in this hierarchy is nodulation formation efficiency D (NfeD) which is a membrane-bound Clp-class protease and only found in bacteria and archaea. Majority of the NfeD genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named stomatin operon partner protein (STOPP). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 from Pyrococcus horikoshii has been shown to possess serine protease activity having a Ser-Lys catalytic dyad. 161
21363 173893 cd00395 Tyr_Trp_RS_core catalytic core domain of tyrosinyl-tRNA and tryptophanyl-tRNA synthetase. Tyrosinyl-tRNA synthetase (TyrRS)/Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. These enzymes attach Tyr or Trp, respectively, to the appropriate tRNA. These class I enzymes are homodimers, which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 273
21364 100027 cd00396 PurM-like AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain. 222
21365 271175 cd00397 DNA_BRE_C DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine based site-specific recombinases (integrases) that share the same fold in their catalytic domain containing conserved active site residues. The best-studied members of this diverse superfamily include Human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase, and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 167
21366 238232 cd00398 Aldolase_II Class II Aldolase and Adducin head (N-terminal) domain. Aldolases are ubiquitous enzymes catalyzing central steps of carbohydrate metabolism. Based on enzymatic mechanisms, this superfamily has been divided into two distinct classes (Class I and II). Class II enzymes are further divided into two sub-classes A and B. This family includes class II A aldolases and adducins, which have not been ascribed any enzymatic function. Members of this class are primarily bacterial and eukaryotic in origin and include L-fuculose-1-phosphate, L-rhamnulose-1-phosphate aldolases and L-ribulose-5-phosphate 4-epimerases. They all share the ability to promote carbon-carbon bond cleavage and stabilize enolate intermediates using divalent cations. 209
21367 259843 cd00399 RNAP_largest_subunit_N Largest subunit of RNA polymerase (RNAP), N-terminal domain. This region represents the N-terminal domain of the largest subunit of RNA polymerase (RNAP). RNAP is a large multi-protein complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei; RNAP I transcribes the ribosomal RNA precursor, RNAP II the mRNA precursor, and RNAP III the 5S and tRNA genes. A single distinct RNAP complex is found in prokaryotes and archaea, respectively, which may be responsible for the synthesis of all RNAs. Structure studies reveal that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shaped structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. All RNAPs are metalloenzymes. At least one Mg2+ ion is bound in the catalytic center. In addition, all cellular RNAPs contain several tightly bound zinc ions to different subunits that vary between RNAPs from prokaryotic to eukaryotic lineages. This domain represents the N-terminal region of the largest subunit of RNAP, and includes part of the active site. In archaea and some of the photosynthetic organisms or cellular organelles, however, this domain exists as a separate subunit. 528
21368 238233 cd00400 Voltage_gated_ClC CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function. 383
21369 240619 cd00401 SAHH S-Adenosylhomocysteine Hydrolase, NAD-binding and catalytic domains. S-adenosyl-L-homocysteine hydrolase (SAHH, AdoHycase) catalyzes the hydrolysis of S-adenosyl-L-homocysteine (AdoHyc) to form adenosine (Ado) and homocysteine (Hcy). The equilibrium lies far on the side of AdoHyc synthesis, but in nature the removal of Ado and Hcy is sufficiently fast, so that the net reaction is in the direction of hydrolysis. Since AdoHyc is a potent inhibitor of S-adenosyl-L-methionine dependent methyltransferases, AdoHycase plays a critical role in the modulation of the activity of various methyltransferases. The enzyme forms homotetramers, with each monomer binding one molecule of NAD+. 402
21370 293928 cd00402 Riboflavin_synthase_like Riboflavin synthase and similar proteins. Riboflavin synthase catalyzes the dismutation of two molecules of 6,7-dimethyl-8-(1'-D-ribityl)-lumazine (DMRL) to yield riboflavin (vitamin B2) and 4-ribitylamino-5-amino-2,6-dihydroxypyrimidine (RAADP). Riboflavin synthase is a homotrimer and the catalysis does not require any cofactors. Active sites are located between pairs of monomers, but only one active site catalyzes a reaction, the other two sites are inactive. Humans do not produce riboflavin synthase, and thus it is a good target for antimicrobial agents. This family also includes lumazine protein (LumP) from bioluminescent bacteria. LumP serves as an optical transponder in bioluminescence emission. 185
21371 238235 cd00403 Ribosomal_L1 Ribosomal protein L1. The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA. In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. L1 is one of the largest ribosomal proteins. It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site. 208
21372 238236 cd00404 Aconitase_swivel Aconitase swivel domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid. 88
21373 238237 cd00405 PRAI Phosphoribosylanthranilate isomerase (PRAI) catalyzes the fourth step of the tryptophan biosynthesis, the conversion of N-(5'- phosphoribosyl)-anthranilate (PRA) to 1-(o-carboxyphenylamino)- 1-deoxyribulose 5-phosphate (CdRP). Most PRAIs are monomeric, monofunctional and thermolabile, but in some thermophile organisms PRAI is dimeric for reasons of stability and in others it is fused to other components of the tryptophan biosynthesis pathway to form multifunctional enzymes. 203
21374 238238 cd00407 Urease_beta Urease beta-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 101
21375 188630 cd00408 DHDPS-like Dihydrodipicolinate synthase family. Dihydrodipicolinate synthase family. A member of the class I aldolases, which use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold. The dihydrodipicolinate synthase family comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways and includes such proteins as N-acetylneuraminate lyase, MosA protein, 5-keto-4-deoxy-glucarate dehydratase, trans-o-hydroxybenzylidenepyruvate hydratase-aldolase, trans-2'-carboxybenzalpyruvate hydratase-aldolase, and 2-keto-3-deoxy- gluconate aldolase. The family is also referred to as the N-acetylneuraminate lyase (NAL) family. 281
21376 199205 cd00411 L-asparaginase_like Bacterial L-asparaginases and related enzymes. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase. 320
21377 238239 cd00412 pyrophosphatase Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphates (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers. 155
21378 185683 cd00413 Glyco_hydrolase_16 glycosyl hydrolase family 16. The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues. 210
21379 185672 cd00418 GlxRS_core catalytic core domain of glutamyl-tRNA and glutaminyl-tRNA synthetase. Glutamyl-tRNA synthetase (GluRS)/Glutaminyl-tRNA synthetase (GlnRS) catalytic core domain. These enzymes attach Glu or Gln, respectively, to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea, cellular organelles, and some bacteria lack GlnRS. In these cases, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. The discriminating form of GluRS differs from GlnRS and the non-discriminating form of GluRS in their C-terminal anti-codon binding domains. 230
21380 238240 cd00419 Ferrochelatase_C Ferrochelatase, C-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus. 135
21381 238241 cd00421 intradiol_dioxygenase Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers. 146
21382 238242 cd00423 Pterin_binding Pterin binding enzymes. This family includes dihydropteroate synthase (DHPS) and cobalamin-dependent methyltransferases such as methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH). DHPS, a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS. Sulfonamide drugs, which are substrate analogs of pABA, target DHPS. Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl- cob(III)amide intermediate. These include MeTr, a functional heterodimer, and the folate binding domain of MetH. 258
21383 176453 cd00424 PolY Y-family of DNA polymerases. Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions. Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases. Most TLS polymerases are members of the Y-family, including Pol eta, Pol kappa/IV, Pol iota, Rev1, and Pol V, which is found exclusively in bacteria. In eukaryotes, the B-family polymerase Pol zeta also functions as a TLS polymerase. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion. Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention. 343
21384 238243 cd00427 Ribosomal_L29_HIP Ribosomal L29 protein/HIP. L29 is a protein of the large ribosomal Subunit. A homolog, called heparin/heparan sulfate interacting protein (HIP), has also been identified in mammals. L29 is located on the surface of the large ribosomal subunit, where it participates in forming a protein ring that surrounds the polypeptide exit channel, providing structural support for the ribosome. L29 is involved in forming the translocon binding site, along with L19, L22, L23, L24, and L31e. In addition, L29 and L23 form the interaction site for trigger factor (TF) on the ribosomal surface, adjacent to the exit tunnel. L29 forms numerous interactions with L23 and with the 23S rRNA. In some eukaryotes, L29 is referred to as L35, which is distinct from L35 found in bacteria and some eukaryotes (primarily plastids and mitochondria). The mammalian homolog, HIP, is found on the surface of many tissues and cell lines. It is believed to play a role in cell adhesion and modulation of blood coagulation. It has also been shown to inhibit apoptosis in cancer cells. 57
21385 238244 cd00429 RPE Ribulose-5-phosphate 3-epimerase (RPE). This enzyme catalyses the interconversion of D-ribulose 5-phosphate (Ru5P) into D-xylulose 5-phosphate, as part of the Calvin cycle (reductive pentose phosphate pathway) in chloroplasts and in the oxidative pentose phosphate pathway. In the Calvin cycle Ru5P is phosphorylated by phosphoribulose kinase to ribulose-1,5-bisphosphate, which in turn is used by RubisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) to incorporate CO2 as the central step in carbohydrate synthesis. 211
21386 143481 cd00430 PLPDE_III_AR Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Alanine Racemase. This family includes predominantly bacterial alanine racemases (AR), some serine racemases (SerRac), and putative bifunctional enzymes containing N-terminal UDP-N-acetylmuramoyl-tripeptide:D-alanyl-D-alanine ligase (murF) and C-terminal AR domains. These proteins are fold type III PLP-dependent enzymes that play essential roles in peptidoglycan biosynthesis. AR catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. SerRac converts L-serine into its D-enantiomer (D-serine) for peptidoglycan synthesis. murF catalyzes the addition of D-Ala-D-Ala to UDPMurNAc-tripeptide, the final step in the synthesis of the cytoplasmic precursor of bacterial cell wall peptidoglycan. Members of this family contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. AR and other members of this family require dimer formation and the presence of the PLP cofactor for catalytic activity. Fungal ARs and eukaryotic serine racemases, which are fold types I and II PLP-dependent enzymes respectively, are excluded from this family. 367
21387 238245 cd00431 cysteine_hydrolases Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function. 161
21388 238246 cd00432 Ribosomal_L18_L5e Ribosomal L18/L5e: L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified. 103
21389 238247 cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants. 468
21390 238248 cd00435 ACBP Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as an intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. 85
21391 350155 cd00436 UP_TbUP-like uridine phosphorylases similar to Trypanosoma brucei UP. Uridine phosphorylase (UP) catalyzes the reversible phosphorolysis of uracil ribosides and analogous compounds to their respective nucleobases and ribose 1-phosphate. Trypanosoma brucei UP has a high specificity for uracil-containing (deoxy)nucleosides, and may function as a dimer. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-1 family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC. 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 282
21392 380337 cd00438 cupin_RmlC RmlC carbohydrate epimerase, involved in dTDP-L-rhamnose production. RmlC (deoxythymidine diphosphate (dTDP)-4-keto-6-deoxy-D-hexulose 3, 5-epimerase or dTDP-6-deoxy-D-xylo-4-hexulose 3,5-epimerase; also known as RfbC) is a carbohydrate epimerase involved in the production of dTDP-L-rhamnose, a precursor of the bacterial cell wall constituent, L-rhamnose. L-Rhamnose (6-deoxy-l-mannose) plays an important role in the cell-wall structure of many bacterial species. It has been found to contribute to the virulence of several species, including the Gram-negative Salmonella enterica and Vibrio cholerae, where it is present as a part of the O-antigen, and is essential for the growth of Gram-positive bacteria such as Streptococcus pyogenes. RmlC converts dTDP-6-deoxy-D-xylo-4-hexulose to dTDP-6-deoxy-L-xylo-hexulose by catalyzing the epimerization of the 5-methyl and 3-hydroxyl groups of hexulose, the third of four steps in the dTDP-L-rhamnose biosynthetic pathway. RmlC belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 168
21393 188631 cd00439 Transaldolase Transaldolase. Transaldolase. Enzymes found in the non-oxidative branch of the pentose phosphate pathway, that catalyze the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, who are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates. 252
21394 381596 cd00442 Lyz-like lysozyme-like domains. This family contains several members, including soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, chitosanases, and pesticin. Typical members are involved in the hydrolysis of beta-1,4- linked polysaccharides. 59
21395 238250 cd00443 ADA_AMPD Adenosine/AMP deaminase. Adenosine deaminases (ADAs) are present in pro- and eukaryotic organisms and catalyze the zinc dependent irreversible deamination of adenosine nucleosides to inosine nucleosides and ammonia. The eukaryotic AMP deaminase catalyzes a similar reaction leading to the hydrolytic removal of an amino group at the 6 position of the adenine nucleotide ring, a branch point in the adenylate catabolic pathway. 305
21396 238251 cd00445 Uricase Urate oxidase (UO, uricase) is a peroxisomal enzyme that catalyzes the oxidation of uric acid to allantoin in most fish, amphibian, and mammalian species. The enzymatic process involves catalyzing the oxidative opening of the purine ring during the purine degradation pathway. In humans and certain other primates, however, the enzyme has been lost by some unknown mechanism. Each monomer contains two instances of this domain. Its functional form is a homotetramer for most species, though there are reports that some may form heterotetramers based on a few biochemical studies. 286
21397 271355 cd00446 GrpE nucleotide exchange factor GrpE. GrpE is the adenine nucleotide exchange factor of DnaK (Hsp70)-type ATPases. In bacteria, the DnaK-DnaJ-GrpE (KJE) chaperone system functions at the fulcrum of protein homeostasis. GrpE participates actively in response to heat shock by preventing aggregation of stress-denatured proteins; unfolded proteins initially bind to DnaJ, the J-domain ATPase-activating protein (Hsp40 family), whereupon DnaK hydrolyzes its bound ATP, resulting in a stable complex. The GrpE dimer binds to the ATPase domain of Hsp70 catalyzing the dissociation of ADP, which enables rebinding of ATP, one step in the Hsp70 reaction cycle in protein folding. In eukaryotes, only the mitochondrial Hsp70, not the cytosolic form, is GrpE dependent. Over-expression of Hsp70 molecular chaperones is important in suppressing toxicity of aberrantly folded proteins that occur in Alzheimer's disease (AD), Parkinson's disease (PD), amyotrophic lateral sclerosis, as well as several polyQ-diseases such as Huntington's disease and ataxias. 136
21398 238253 cd00447 NusB_Sun RNA binding domain of NusB (N protein-Utilization Substance B) and Sun (also known as RrmB or Fmu) proteins. This family includes two orthologous groups exemplified by the transcription termination factor NusB and the N-terminal domain of the rRNA-specific 5-methylcytidine transferase (m5C-methyltransferase) Sun. The NusB protein plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. The m5C-methyltransferase Sun shares the N-terminal non-catalytic RNA-binding domain with NusB. 129
21399 100004 cd00448 YjgF_YER057c_UK114_family YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 107
21400 238254 cd00449 PLPDE_IV PyridoxaL 5'-Phosphate Dependent Enzymes class IV (PLPDE_IV). This D-amino acid superfamily, one of five classes of PLPDE, consists of branched-chain amino acid aminotransferases (BCAT), D-amino acid transferases (DAAT), and 4-amino-4-deoxychorismate lyases (ADCL). BCAT catalyzes the reversible transamination reaction between the L-branched-chain amino and alpha-keto acids. DAAT catalyzes the synthesis of D-glutamic acid and D-alanine, and ADCL converts 4-amino-4-deoxychorismate to p-aminobenzoate and pyruvate. Except for a few enzymes, i. e., Escherichia coli and Salmonella BCATs, which are homohexamers arranged as a double trimer, the class IV PLPDEs are homodimers. Homodimer formation is required for catalytic activity. 256
21401 212095 cd00451 GH38N_AMII_euk N-terminal catalytic domain of eukaryotic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). The family corresponds to a group of eukaryotic class II alpha-mannosidases (AlphaMII), which contain Golgi alpha-mannosidases II (GMII), the major broad specificity lysosomal alpha-mannosidases (LAM, MAN2B1), the novel core-specific lysosomal alpha 1,6-mannosidases (Epman, MAN2B2), and similar proteins. GMII catalyzes the hydrolysis of both the terminal alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2 (GlcNAc, N-acetylglucosamine), which is the committed step of complex N-glycan synthesis. LAM is a broad specificity exoglycosidase hydrolyzing all known alpha 1,2-, alpha 1,3-, and alpha 1,6-mannosidic linkages from numerous high mannose type oligosaccharides. Different from LAM, Epman can efficiently cleave only the alpha 1,6-linked mannose residue from (Man)3GlcNAc, but not (Man)3(GlcNAc)2 or other larger high mannose oligosaccharides, in the core of N-linked glycans. Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. 258
21402 188632 cd00452 KDPG_aldolase KDPG and KHG aldolase. KDPG and KHG aldolase. This family belongs to the class I aldolases whose reaction mechanism involves Schiff base formation between a substrate carbonyl and lysine residue in the active site. 2-keto-3-deoxy-6-phosphogluconate (KDPG) aldolase, is best known for its role in the Entner-Doudoroff pathway of bacteria, where it catalyzes the reversible cleavage of KDPG to pyruvate and glyceraldehyde-3-phosphate. 2-keto-4-hydroxyglutarate (KHG) aldolase, which has enzymatic specificity toward glyoxylate, forming KHG in the presence of pyruvate, and is capable of regulating glyoxylate levels in the glyoxylate bypass, an alternate pathway when bacteria are grown on acetate carbon sources. 190
21403 238255 cd00453 FTBP_aldolase_II Fructose/tagatose-bisphosphate aldolase class II. This family includes fructose-1,6-bisphosphate (FBP) and tagatose 1,6-bisphosphate (TBP) aldolases. FBP-aldolase is homodimeric and used in gluconeogenesis and glycolysis; the enzyme controls the condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to yield fructose-1,6-bisphosphate. TBP-aldolase is tetrameric and produces tagatose-1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. Although structurally similar, the class I aldolases use a different mechanism and are believed to have an independent evolutionary origin. 340
21404 381253 cd00454 TrHb1_N truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 1 (N). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). Typical of the TrHb1s (N) group is a protein matrix tunnel. It includes a Mycobacterium tuberculosis TrHb1, Mt-trHbN, which is encoded by the glbN gene. Mt-trHbN is expressed during the Mycobacterium stationary phase, and plays a specific defense role against nitrosative stress. The cyanobacterium Synechococcus sp. PCC 7002 TrHb1 GlbN, is constitutively expressed, and likely also protects cells from reactive nitrogen species. 111
21405 238257 cd00455 nuc_hydro nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax, and pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. Nucleoside hydrolases are of interest as a target for antiprotozoan drugs, as no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans, and parasitic protozoans lack de novo purine synthesis, relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment. 295
21406 176642 cd00457 PEBP PhosphatidylEthanolamine-Binding Protein (PEBP) domain. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer). 159
21407 238258 cd00458 SugarP_isomerase SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway. 169
21408 132901 cd00460 RNAP_RPB11_RPB3 RPB11 and RPB3 subunits of RNA polymerase. The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts. 86
21409 238259 cd00462 PTH Peptidyl-tRNA hydrolase (PTH) is a monomeric protein that cleaves the ester bond linking the nascent peptide and tRNA when peptidyl-tRNA is released prematurely from the ribosome. This ensures that peptidyl-tRNAs produced through abortion of translation are recycled into tRNAs, and is essential for cell viability. This group also contains chloroplast RNA splicing 2 (CRS2), which is a closely related nuclear-encoded protein required for the splicing of nine group II introns in chloroplasts. 171
21410 199209 cd00463 Ribosomal_L31e Eukaryotic/archaeal ribosomal protein L31. Ribosomal protein L31e, which is present in archaea and eukaryotes, binds the 23S rRNA and is one of six protein components encircling the polypeptide exit tunnel. It is a component of the eukaryotic 60S (large) ribosomal subunit, and the archaeal 50S (large) ribosomal subunit. 83
21411 238260 cd00464 SK Shikimate kinase (SK) is the fifth enzyme in the shikimate pathway, a seven-step biosynthetic pathway which converts erythrose-4-phosphate to chorismic acid, found in bacteria, fungi and plants. Chorismic acid is an important intermediate in the synthesis of aromatic compounds, such as aromatic amino acids, p-aminobenzoic acid, folate and ubiquinone. Shikimate kinase catalyses the phosphorylation of the 3-hydroxyl group of shikimic acid using ATP. 154
21412 238261 cd00465 URO-D_CIMS_like The URO-D_CIMS_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases, as well as cobalamine (B12) independent methionine synthases. Despite their sequence similarities, members of this family have clearly different functions. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for ability of methanogenic organisms to use other compounds than carbon dioxide for reduction to methane, and methionine synthases transfer a methyl group from a folate cofactor to L-homocysteine in a reaction requiring zinc. 306
21413 238262 cd00466 DHQase_II Dehydroquinase (DHQase), type II. Dehydroquinase (or 3-dehydroquinate dehydratase) catalyzes the reversible dehydration of 3-dehydroquinate to form 3-dehydroshikimate. This reaction is part of two metabolic pathways: the biosynthetic shikimate pathway and the catabolic quinate pathway. There are two types of DHQases, which are distinct from each other in amino acid sequence and three-dimensional structure. Type I enzymes usually catalyze the biosynthetic reaction using a syn elimination mechanism. In contrast, type II enzymes, found in the quinate pathway of fungi and in the shikimate pathway of many bacteria, are dodecameric enzymes that employ an anti elimination reaction mechanism. 140
21414 238263 cd00468 HIT_like HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups. 86
21415 238264 cd00470 PTPS 6-pyruvoyl tetrahydropterin synthase (PTPS). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is PTPS which catalyzes the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin. The functional enzyme is a hexamer of identical subunits. 135
21416 100103 cd00472 Ribosomal_L24e_L24 Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site. L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination. L24 may be an important protein in eukaryotic reproduction: in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist. Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30. 54
21417 275385 cd00473 bS6 Bacterial ribosomal protein S6. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18. 91
21418 211317 cd00474 eIF1_SUI1_like Eukaryotic initiation factor 1 and related proteins. Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear. 78
21419 259850 cd00475 Cis_IPPS Cis (Z)-Isoprenyl Diphosphate Synthases. Cis (Z)-Isoprenyl Diphosphate Synthases (cis-IPPS) catalyze the successive 1'-4 condensation of the isopentenyl diphosphate (IPP) molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans- FPP. In prokaryotes, the cis-IPPS, undecaprenyl diphosphate synthase (UPP synthase), catalyzes the formation of the carrier lipid UPP in bacterial cell wall peptidoglycan biosynthesis. Similarly, in eukaryotes, the cis-IPPS, dehydrodolichyl diphosphate (dedol-PP) synthase catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate. cis-IPPS form homodimers and are mechanistically and structurally distinct from trans-IPPS, which lack the DDXXD motifs, yet require Mg2+ for activity. 219
21420 133468 cd00476 SAICAR_synt 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. SAICAR synthetase (the PurC gene product) catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate. 230
21421 349750 cd00477 FTHFS formyltetrahydrofolate synthetase. Formyltetrahydrofolate synthetase (FTHFS) catalyzes the ATP-dependent activation of formate ion via its addition to the N10 position of tetrahydrofolate. FTHFS is a highly expressed key enzyme in both the Wood-Ljungdahl pathway of autotrophic CO2 fixation (acetogenesis) and the glycine synthase/reductase pathways of purinolysis. The key physiological role of this enzyme in acetogens is to catalyze the formylation of tetrahydrofolate, an initial step in the reduction of carbon dioxide and other one-carbon precursors to acetate. In purinolytic organisms, the enzymatic reaction is reversed, liberating formate from 10-formyltetrahydrofolate with concurrent production of ATP. 540
21422 238267 cd00480 malate_synt Malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 511
21423 238268 cd00481 Ribosomal_L19e Ribosomal protein L19e. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 145
21424 238269 cd00483 HPPK 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is HPPK which catalyzes pyrophosphoryl transfer from ATP to 6-hydroxymethyl-7,8-dihydropterin (HP). The functional enzyme is a monomer. Mammals lack many of the enzymes in the folate pathway, including HPPK. 128
21425 238270 cd00484 PEPCK_ATP Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the ATP-dependent groups. 508
21426 238271 cd00487 Pep_deformylase Polypeptide or peptide deformylase; a family of metalloenzymes that catalyzes the removal of the N-terminal formyl group in a growing polypeptide chain following translation initiation during protein synthesis in prokaryotes. These enzymes utilize Fe(II) as the catalytic metal ion, which can be replaced with a nickel or cobalt ion with no loss of activity. There are two types of peptide deformylases, types I and II, which differ in structure only in the outer surface of the domain. Because these enzymes are essential only in prokaryotes (although eukaryotic gene sequences have been found), they are a target for a new class of antibacterial agents. 141
21427 238272 cd00488 PCD_DCoH PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoiddihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas. 75
21428 238273 cd00489 Barstar_like Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it, thus inhibiting its potentially lethal RNase activity inside the cell. Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase. 85
21429 119402 cd00490 Met_repressor_MetJ Met Repressor, MetJ. MetJ is a bacterial regulatory protein that uses S-adenosylmethionine (SAM) as a corepressor to regulate the production of methionine. MetJ binds arrays of two to five adjacent copies of an eight base-pair 'metbox' sequence. MetJ forms sufficiently strong interactions with the sugar-phosphate backbone to accommodate sequence variation in natural operators. However, it is very sensitive to particular base changes in the operator. MetJ exists as a homodimer. 103
21430 238274 cd00491 4Oxalocrotonate_Tautomerase 4-Oxalocrotonate Tautomerase: Catalyzes the isomerization of unsaturated ketones. The structure is a homohexamer that is arranged as a trimer of dimers. The hexamer contains six active sites, each formed by residues from three monomers, two from one dimer and the third from a neighboring monomer. Each monomer is a beta-alpha-beta fold with two small beta strands at the C-terminus that fold back on themselves. A pair of monomers form a dimer with two-fold symmetry, consisting of a 4-stranded beta sheet with two helices on one side and two additional small beta strands at each end. The dimers are assembled around a 3-fold axis of rotation to form a hexamer, with the short beta strands from each dimer contacting the neighboring dimers. 58
21431 238275 cd00493 FabA_FabZ FabA/Z, beta-hydroxyacyl-acyl carrier protein (ACP)-dehydratases: One of several distinct enzyme types of the dissociative, type II, fatty acid synthase system (found in bacteria and plants) required to complete successive cycles of fatty acid elongation. The third step of the elongation cycle, the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, is catalyzed by FabA or FabZ. FabA is bifunctional and catalyzes an additional isomerization reaction of trans-2-acyl-ACP to cis-3-acyl-ACP, an essential reaction to unsaturated fatty acid synthesis. FabZ is the primary dehydratase that participates in the elongation cycles of saturated as well as unsaturated fatty acid biosynthesis, whereas FabA is more active in the dehydration of beta-hydroxydecanoyl-ACP. The FabA structure is homodimeric with two independent active sites located at the dimer interface. 131
21432 270213 cd00494 PBP2_HMBS Hydroxymethylbilane synthase possesses the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, vitamin B12 and related macrocycles. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This family includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 274
21433 198379 cd00495 Ribosomal_L25_TL5_CTC Ribosomal L25/TL5/CTC N-terminal 5S rRNA binding domain. L25 is a single-domain protein, homologous to the N-terminal domain of TL5 and CTC. CTC is a known stress protein, and proteins of this family are believed to have two functions, acting as both ribosomal and stress proteins. In Escherichia coli, cells deleted for L25 were found to be viable; however, these cells grew slowly and had impaired protein synthesis capability. In Bacillus subtilis, CTC is induced under stress conditions and located in the ribosome; it has been proposed that CTC may be necessary for accurate translation under stress conditions. Ribosomal_L25_TL5_CTC is mostly found in bacteria, with a few exceptions such as plants or stramenopiles. Due to its limited taxonomic diversity and the viability of cells deleted for L25, this protein is not believed to be necessary for ribosomal assembly. Eukaryotes contain a protein called L25, which is not homologous to bacterial L25, but rather to bacterial L23. 90
21434 238277 cd00496 PheRS_alpha_core Phenylalanyl-tRNA synthetase (PheRS) alpha chain catalytic core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure and the presence of three characteristic sequence motifs. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA, PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. PheRS is an alpha-2/ beta-2 tetramer. 218
21435 211322 cd00497 PseudoU_synth_TruA_like Pseudouridine synthase, TruA family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruA, Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus3p, Caenorhabditis elegans Pus1p and human PUS1. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae PUS1 catalyzes the formation of psi34 and psi36 in the intron containing tRNAIle, psi35 in the intron containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs, and psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition S. cerevisiae PUS1 makes psi 26, 65 and 67. C. elegans Pus1p does not modify psi44 in U2 snRNA. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Psi44 in U2 snRNA, and psi38 and psi39 in tRNAs, are highly phylogenetically conserved. Psi 26,27,28,34,35,36,65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 215
21436 238278 cd00498 Hsp33 Heat shock protein 33 (Hsp33): Cytosolic protein that acts as a molecular chaperone under oxidative conditions. In normal (reducing) cytosolic conditions, four conserved Cys residues are coordinated by a Zn ion. Under oxidative stress (such as heat shock), the Cys are reversibly oxidized to disulfide bonds, which causes the chaperone activity to be turned on. Hsp33 is homodimeric in its functional form. 275
21437 238279 cd00501 Peptidase_C15 Pyroglutamyl peptidase (PGP) type I, also known as pyrrolidone carboxyl peptidase (pcp) type I: Enzymes responsible for cleaving pyroglutamate (pGlu) from the N-terminal end of specialized proteins. The N-terminal pGlu protects these proteins from proteolysis by other proteases until the pGlu is removed by a PGP. PGPs are cysteine proteases with a Cys-His-Glu/Asp catalytic triad. Type I PGPs are found in a wide variety of prokaryotes and eukaryotes. It is not clear whether the functional form is a monomer, a homodimer, or a homotetramer. 194
21438 188633 cd00502 DHQase_I Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase). Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase). Catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate to produce dehydroshikimate. Dehydroquinase is the third enzyme in the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Type I DHQase exists as a homodimer. Type II 3-dehydroquinase also catalyzes the same overall reaction, but is unrelated in terms of sequence and structure, and utilizes a completely different reaction mechanism. 225
21439 238280 cd00503 Frataxin Frataxin is a nuclear-encoded mitochondrial protein implicated in Friedreich's ataxia (FRDA), a human autosomal recessive neurodegenerative disease; Frataxin is found in eukaryotes and in purple bacteria; lack of frataxin causes iron to accumulate in the mitochondrial matrix suggesting that frataxin is involved in mitochondrial iron homeostasis and possibly in iron transport; the domain has an alpha-beta fold consisting of two helices flanking an antiparallel beta sheet. 105
21440 238281 cd00504 GXGXG GXGXG domain. This domain of unknown function is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS), in subunit C of tungsten formylmethanofuran dehydrogenase (FwdC) and in subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC). It is also found in a primarily archaeal group of proteins predicted to encode part of the large subunit of GltS. It is characterized by a repeated GXXGXXXG motif. GltS is a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate. It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites occur in other domains within the protein or are encoded by separate genes, and are not present in the domain in this CD. FwdC and FmdC are reversible ion pumps that catalyze the formylation and deformylation of methanofuran in hyperthermophiles and bacteria. They require the presence of either tungsten (FwdC) or molybdenum (FmdC). The specific function of this domain also remains unidentified in the formylmethanofuran dehydrogenases. 149
21441 132996 cd00505 Glyco_transf_8 Members of glycosyltransferase family 8 (GT-8) are involved in lipopolysaccharide biosynthesis and glycogen synthesis. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. GT-8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase, and N-acetylglucosaminyltransferase. GT-8 enzymes contain a conserved DXD motif which is essential in the coordination of a catalytic divalent cation, most commonly Mn2+. 246
21442 211323 cd00506 PseudoU_synth_TruB_like Pseudouridine synthase, TruB family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruB, Saccharomyces cerevisiae Pus4, M. tuberculosis TruB, S. cerevisiae Cbf5 and human dyskerin. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli TruB, M. tuberculosis TruB and S. cerevisiae Pus4 make psi55 in the T loop of tRNAs. Pus4 catalyses the formation of psi55 in both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. Mutations in human dyskerin cause X-linked dyskeratosis congenita. 210
21443 238282 cd00508 MopB_CT_Fdh-Nap-like This CD includes formate dehydrogenases (Fdh) H and N; nitrate reductases, Nap and Nas; and other related proteins. Formate dehydrogenase H is a component of the anaerobic formate hydrogen lyase complex and catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. Formate dehydrogenase N (alpha subunit) is the major electron donor to the bacterial nitrate respiratory chain and nitrate reductases, Nap and Nas, catalyze the reduction of nitrate to nitrite. This CD (MopB_CT_Fdh-Nap-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 120
21444 238283 cd00512 MM_CoA_mutase Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM)-like family; contains proteins similar to MCM, and the large subunit of Streptomyces coenzyme B12-dependent isobutyryl-CoA mutase (ICM). MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include Propionbacterium shermanni MCM during propionic acid fermentation, E.coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. P. shermanni and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers, with both subunits being homologous members of this family. It has been shown for P. shermanni MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E.coli MCMs are also homodimers. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA (intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis). In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism. 399
21445 238284 cd00513 Ribosomal_L32_L32e Ribosomal_L32_L32e: L32 is a protein from the large subunit that contains a surface-exposed globular domain and a finger-like projection that extends into the RNA core to stabilize the tertiary structure. L32 does not appear to play a role in forming the A (aminoacyl), P (peptidyl) or E (exit) sites of the ribosome, but does interact with 23S rRNA, which has a "kink-turn" secondary structure motif. L32 is overexpressed in human prostate cancer and has been identified as a stably expressed housekeeping gene in macrophages of human chronic obstructive pulmonary disease (COPD) patients. In Schizosaccharomyces pombe, L32 has also been suggested to play a role as a transcriptional regulator in the nucleus. Found in archaea and eukaryotes, this protein is known as L32 in eukaryotes and L32e in archaea. 107
21446 238285 cd00515 HAM1 NTPase/HAM1. This family consists of the HAM1 protein and pyrophosphate-releasing xanthosine/inosine triphosphatase. HAM1 protects the cell against mutagenesis by the base analog 6-N-hydroxylaminopurine (HAP) in E. coli and S. cerevisiae. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides such as XTP to XMP and ITP to IMP, but not the standard nucleotides, in the presence of Mg or Mn ions. The enzyme exists as a homodimer. The HAM1 protein may be acting as an NTPase by hydrolyzing the HAP triphosphate. 183
21447 238286 cd00516 PRTase_typeII Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA), quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway. 281
21448 173895 cd00517 ATPS ATP-sulfurylase. ATP-sulfurylase (ATPS), also known as sulfate adenylate transferase, catalyzes the transfer of an adenylyl group from ATP to sulfate, forming adenosine 5'-phosphosulfate (APS). This reaction is generally accompanied by a further reaction, catalyzed by APS kinase, in which APS is phosphorylated to yield 3'-phospho-APS (PAPS). In some organisms the APS kinase is a separate protein, while in others it is incorporated with ATP sulfurylase in a bifunctional enzyme that catalyzes both reactions. In bifunctional proteins, the domain that performs the kinase activity can be attached at the N-terminal end of the sulfurylase unit or at the C-terminal end, depending on the organism. While the reaction is ubiquitous among organisms, the physiological role of the reaction varies. In some organisms it is used to generate APS from sulfate and ATP, while in others it proceeds in the opposite direction to generate ATP from APS and pyrophosphate. ATP sulfurylase can be a monomer, a homodimer, or a homo-oligomer, depending on the organism. ATPS belongs to a large superfamily of nucleotidyltransferases that includes pantothenate synthetase (PanC), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain. 353
21449 99872 cd00518 H2MP Hydrogenase specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). These enzymes belong to the peptidase family M52. Maturation of [FeNi] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HypE, the large subunit of hydrogenase 3. This cleavage is nickel dependent. This CD also includes such hydrogenase-processing proteins as HydD, HupW, and HoxW, as well as proteins of the F420-reducing hydrogenase of methanogens (e.g., FrcD). Also included is the Pyrococcus furiosus FrxA protein, a bifunctional endopeptidase/sulfhydrogenase found in NADP-reducing hyperthermophiles. The Pyrococcus FrxA is not related to those found in Helicobacter pylori. 139
21450 238287 cd00519 Lipase_3 Lipase (class 3). Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser-His-Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 229
21451 238288 cd00520 RRF Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species. 179
21452 213981 cd00522 Hemerythrin-like Hemerythrin family. Hemerythrin (Hr) and related proteins are found in bacteria, archaea and eukaryotes. They are non-heme diiron oxygen transport proteins. In addition to oxygen transport, members are involved in cadmium fixation and host anti-bacterial defense. They have the same "four alpha helix bundle" motif and similar active site structures. Some members, like Hr, form oligomers, the octameric form being most prevalent, while others are monomeric. 103
21453 411703 cd00523 Holliday_junction_resolvase Holliday junction resolvase. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. HJRs occur in archaea, bacteria, and in the mitochondria of certain eukaryotes; however, this CD includes only the archaeal HJRs. The bacterial and archaeal HJRs perform a similar function but differ in both sequence and structure. Structural similarity does, however, exist between the archaeal HJRs and type II restriction endonucleases, such as EcoRV, BglII, and Fok, and this similarity includes their active site configurations. 126
21454 238290 cd00524 SORL Superoxide reductase-like (SORL) domain; present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 86
21455 238291 cd00525 AE_Prim_S_like AE_Prim_S_like: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. The replication machineries of A/Es are distinct from that of bacteria. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. In addition to its catalytic role in replication, eukaryotic DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. Pfu41 and Pfu46 comprise the primase complex of the archaeon Pyrococcus furiosus; these proteins have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41. Also found in this group is the primase-polymerase (primpol) domain of replicases from archaeal plasmids including the ORF904 protein of pRN1 from Sulfolobus islandicus (pRN1 primpol). The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. The pRN1 primpol primase activity prefers dNTPs to rNTPs; however incorporation of dNTPs requires rNTP as cofactor. This group also includes the Pol domain of bacterial LigD proteins such as Mycobacterium tuberculosis (Mt)LigD. MtLigD contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. LigD Pol plays a role in non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB) in vivo, perhaps by filling in short 5'-overhangs with ribonucleotides; the filled in termini would be sealed by the associated LigD ligase domain. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro. 136
21456 238292 cd00527 IF6 Ribosome anti-association factor IF6 binds the large ribosomal subunit and prevents the two subunits from associating during translation initiation. IF6 comprises a family of translation factors that includes both eukaryotic (eIF6) and archaeal (aIF6) members. All members of this family have a conserved pentameric fold referred to as a beta/alpha propeller. The eukaryotic IF6 members have a moderately conserved C-terminal extension which is not required for ribosomal binding, and may have an alternative function. 220
21457 238293 cd00528 MoaC MoaC family. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 136
21458 340812 cd00529 RuvC_like Crossover junction endodeoxyribonuclease RuvC and similar proteins. The RuvC-like family consists of bacterial RuvC, fungal Cruciform cutting endonuclease 1 (CCE1), and bacterial YqgF. RuvC and CCE1 are Holliday junction resolvases (HJRs), endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. RuvC is part of the RuvABC pathway in Escherichia coli and other Gram-negative bacteria that is involved in processing Holliday junctions, which are formed by the reciprocal exchange of strands between two DNA duplexes. CCE1 is a HJR specific for 4-way junctions; it is involved in the maintenance of mitochondrial DNA. Escherichia coli YqgF has been shown to act as a pre-16S rRNA nuclease, presumably as a monomer. It is involved in the processing of pre-16S rRNA during ribosome maturation. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi. RuvC and its orthologs are homodimers and display structural similarity to RNase H and Hsp70. 117
21459 238295 cd00530 PTE Phosphotriesterase (PTE) catalyzes the hydrolysis of organophosphate nerve agents, including the chemical warfare agents VX, soman, and sarin as well as the insecticide paraoxon. PTE exists as a homodimer with one active site per monomer. The active site is located next to a binuclear metal center, at the C-terminal end of a TIM alpha-beta barrel motif. The native enzyme contains two zinc ions at the active site; however, these can be replaced with other metals such as cobalt, cadmium, nickel or manganese, and the enzyme remains active. 293
21460 238296 cd00531 NTF2_like Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes. 124
21461 238297 cd00532 MGS-like MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a carboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site. 112
21462 238298 cd00534 DHNA_DHNTPE Dihydroneopterin aldolase (DHNA) and 7,8-dihydroneopterin triphosphate epimerase domain (DHNTPE); these enzymes have been designated folB and folX, respectively. Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is DHNA, which catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. Though it is known that DHNTPE catalyzes the epimerization of dihydroneopterin triphosphate to dihydromonapterin triphosphate, the biological role of this enzyme is still unclear. It is hypothesized that it is not an essential protein since a folX knockout in E. coli has a normal phenotype and folX is not present in H. influenzae. In addition, both enzymes have been shown to be able to compensate for the other's activity, albeit at slower reaction rates. The functional enzyme for both is an octamer of identical subunits. Mammals lack many of the enzymes in the folate pathway, including DHNA and DHNTPE. 118
21463 238299 cd00537 MTHFR Methylenetetrahydrofolate reductase (MTHFR). 5,10-Methylenetetrahydrofolate is reduced to 5-methyltetrahydrofolate by methylenetetrahydrofolate reductase, a cytoplasmic, NAD(P)-dependent enzyme. 5-methyltetrahydrofolate is utilized by methionine synthase to convert homocysteine to methionine. The enzymatic mechanism is a ping-pong bi-bi mechanism, in which NAD(P)+ release precedes the binding of methylenetetrahydrofolate and the acceptor is free FAD. The family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from prokaryotes and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The bacterial enzyme is a homotetramer and NADH is the preferred reductant while the eukaryotic enzyme is a homodimer and NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease. 274
21464 238300 cd00538 PA PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2. 126
21465 238301 cd00539 MCR_gamma Methyl-coenzyme M reductase (MCR) gamma subunit. MCR catalyzes the terminal step of methane formation in the energy metabolism of all methanogenic archaea, in which methyl-coenzyme M and coenzyme B are converted to methane and the heterodisulfide of coenzyme M and coenzyme B (CoM-S-S-CoB). MCR is a dimer of trimers, each of which consists of one alpha, one beta, and one gamma subunit, with two identical active sites containing nickel porphinoid factor 430 (F430). 246
21466 187726 cd00540 AAG Alkyladenine DNA glycosylase catalyzes the first step in base excision repair. Alkyladenine DNA glycosylase (AAG), also known as 3-methyladenine DNA glycosylase, catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. AAG bends DNA by intercalating between the base pairs, causing the damaged base to flip out of the double helix and into the enzyme active site for cleavage. Although AAG represents one of six DNA glycosylase classes, it lacks the helix-hairpin-helix active site motif associated with other BER glycosylases and is structurally distinct from them. 187
21467 238302 cd00541 OMPLA The outer membrane phospholipase A (OMPLA) is an integral membrane enzyme that catalyses the hydrolysis of acylester bonds in phospholipids using calcium as a cofactor. The enzyme has a fold of transmembrane beta-barrels and is widespread among Gram-negative bacteria, both in pathogens and nonpathogens. In pathogenic bacteria such as Campylobacter coli and Helicobacter pylori OMPLA is involved in pathogenesis and virulence. In nonpathogenic bacteria the physiological function of OMPLA is less clear. The Escherichia coli enzyme is involved in the secretion of bacteriocins, antibacterial peptides that are produced in order to survive under starvation conditions. The enzyme activity of OMPLA is strictly regulated to prevent uncontrolled breakdown of the surrounding phospholipids. The activity of OMPLA can be induced by membrane perturbation and concurs with dimerization of the enzyme. 231
21468 238303 cd00542 Ntn_PVA Penicillin V acylase (PVA), also known as conjugated bile salt acid hydrolase (CBAH), catalyzes the hydrolysis of penicillin V to yield 6-amino penicillanic acid (6-APA), a key intermediate of semisynthetic penicillins. PVA has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which PVA belongs. This nucleophilic cysteine is exposed by post-translational processing of the PVA precursor. PVA forms a homotetramer. 303
21469 410861 cd00544 CobU Adenosylcobinamide kinase / adenosylcobinamide phosphate guanyltransferase (CobU). CobU is a bacterial bifunctional cobalamin biosynthesis enzyme which displays adenosylcobinamide kinase and adenosylcobinamide phosphate guanyltransferase activity and is a key participant in the final stages of cobalamin biosynthesis where it is involved in nucleotide loop assembly. CobU is a homotrimer which functions both as a kinase and as a nucleotidyl transferase. It phosphorylates adenosylcobinamide to form adenosylcobinamide phosphate (using a variety of nucleotides as the phosphate donor) and then adds GMP to adenosylcobinamide phosphate to form adenosylcobinamide-GDP, specifically using GTP. 166
21470 238305 cd00545 MCH Methenyltetrahydromethanopterin (methenyl-H4MPT) cyclohydrolase (MCH). MCH is a cytoplasmic enzyme that has been identified in methanogenic archaea, sulfate-reducing archaea, and methylotrophic bacteria. It catalyzes the reversible formation of N(5),N(10)-methenyltetrahydromethanopterin (methenyl-H4MPT+) from N(5)-formyltetrahydromethanopterin (formyl-H4MPT), in the third step of the reaction to reduce CO2 to CH4. The protein functions as a homodimer or homotrimer, depending on the organism. 312
21471 238306 cd00546 QFR_TypeD_subunitC Quinol:fumarate reductase (QFR) Type D subfamily, 15kD hydrophobic subunit C; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits. 124
21472 238307 cd00547 QFR_TypeD_subunitD Quinol:fumarate reductase (QFR) Type D subfamily, 13kD hydrophobic subunit D; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits. 115
21473 349426 cd00548 NrfA-like cytochrome c nitrite reductase and similar proteins. This family contains cytochrome c nitrite reductase (also known as cytochrome c552, or NrfA) and similar proteins. The pentaheme enzyme NrfA catalyzes the electron reduction of nitrite to ammonia in the nitrogen cycle. This enzyme can also transform nitrogen monoxide and hydroxylamine, two potential bound reaction intermediates, into ammonia. It is a homodimer, with each monomer containing four classical CXXCH type heme-binding sites along with an alternative CXXCK heme-binding motif, which is important for catalysis. This family also includes octaheme nitrite reductase (TvNiR) from the haloalkaliphilic bacterium Thioalkalivibrio paradoxus which catalyzes the reduction of nitrite and hydroxylamine to ammonia as well as the reduction of sulfite to sulfide. 370
21474 200451 cd00551 AmyAc_family Alpha amylase catalytic domain family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 260
21475 238308 cd00552 RaiA RaiA ("ribosome-associated inhibitor A"), also known as Protein Y (PY), YfiA, and SpotY, is a stress-response protein that binds the ribosomal subunit interface and arrests translation by interfering with aminoacyl-tRNA binding to the ribosomal A site. RaiA is also thought to counteract miscoding at the A site, thus reducing translation errors. The RaiA fold structurally resembles the double-stranded RNA-binding domain (dsRBD). 93
21476 238309 cd00553 NAD_synthase NAD+ synthase is a homodimer, which catalyzes the final step in de novo nicotinamide adenine dinucleotide (NAD+) biosynthesis, an amide transfer from either ammonia or glutamine to nicotinic acid adenine dinucleotide (NaAD). The conversion of NaAD to NAD+ occurs via an NAD-adenylate intermediate and requires ATP and Mg2+. The intermediate is subsequently cleaved into NAD+ and AMP. In many prokaryotes, such as E. coli, NAD synthetase consists of a single domain and is strictly ammonia dependent. In contrast, eukaryotes and other prokaryotes have an additional N-terminal amidohydrolase domain that prefers glutamine. Interestingly, NAD+ synthases in these prokaryotes can also utilize ammonia as an amide source. 248
21477 100025 cd00554 MECDP_synthase MECDP_synthase (2-C-methyl-D-erythritol-2,4-cyclodiphosphate synthase), encoded by the ispF gene, catalyzes the formation of 2-C-methyl-D-erythritol 2,4-cyclodiphosphate (MEC) in the non-mevalonate deoxyxylulose (DOXP) pathway for isoprenoid biosynthesis. This pathway is present in bacteria, plants and some protozoa but is distinct from that used by mammals and Archaea. MECDP_synthase forms a homotrimer, carrying three active sites, each of which is formed in a cleft between pairs of subunits. 153
21478 238310 cd00555 Maf Nucleotide binding protein Maf. Maf has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea, but homologs in B. subtilis and S. cerevisiae are nonessential for cell division. Maf has been predicted to be a nucleotide- or nucleic acid-binding protein with structural similarity to the hypoxanthine/xanthine NTP pyrophosphatase Ham1 from Methanococcus jannaschii, RNase H from Escherichia coli, and some other nucleotide or RNA-binding proteins. 180
21479 238311 cd00556 Thioesterase_II Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 99
21480 238312 cd00557 Translocase_SecB Preprotein translocase subunit SecB. SecB is a cytoplasmic component of the multisubunit membrane-bound enzyme termed Sec protein translocase, which is the main constituent of the General Secretory (type II) Pathway involved in translocation of nascent polypeptides across the cytoplasmic membrane. SecB has been shown to function as export-specific molecular chaperone that selectively binds preproteins, maintains them in a translocation competent state and delivers them to SecA, the membrane-bound ATPase, that drives the translocation reaction. In solution, SecB exists as homotetramer, which is organized as a dimer of dimers. 131
21481 238313 cd00559 Cyanase_C Cyanase C-terminal domain. Cyanase (Cyanate lyase) is responsible for the hydrolysis of cyanate. It catalyzes the reaction of cyanate with bicarbonate to produce ammonia and carbon dioxide. This allows organisms that possess the enzyme to overcome the toxicity of environmental cyanate and to use cyanate as a source of nitrogen for growth. This enzyme is a homodecamer, formed by five dimers. Each monomer is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain. 69
21482 185673 cd00560 PanC Pantoate-beta-alanine ligase. PanC Pantoate-beta-alanine ligase, also known as pantothenate synthase, catalyzes the formation of pantothenate from pantoate and beta-alanine. PanC belongs to a large superfamily of nucleotidyltransferases that includes ATP sulfurylase (ATPS), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain. 277
21483 410862 cd00561 CobA_ACA CobA-type ATP:corrinoid adenosyltransferase. CobA-ATP:corrinoid adenosyltransferase is one of three sequence and structurally unrelated groups of ATP:corrinoid adenosyltransferase, PduO and EutT being the other two. CobA has been shown to be involved in cobalamin (vitamin B12) biosynthesis and scavenging of incomplete corrinoids. This enzyme is a homodimer, which catalyzes the adenosylation reaction: ATP + cob(I)alamin + H2O 170
21484 238315 cd00562 NifX_NifB This CD represents a family of iron-molybdenum cluster-binding proteins that includes NifB, NifX, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea). 102
21485 238316 cd00563 Dtyr_deacylase D-Tyrosyl-tRNAtyr deacylases; a class of tRNA-dependent hydrolases which are capable of hydrolyzing the ester bond of D-Tyrosyl-tRNA reducing the level of cellular D-Tyrosine while recycling the peptidyl-tRNA; found in bacteria and in eukaryotes but not in archaea; beta barrel-like fold structure; forms homodimers in which two surface cavities serve as the active site for tRNA binding 145
21486 238317 cd00564 TMP_TenI Thiamine monophosphate synthase (TMP synthase)/TenI. TMP synthase catalyzes an important step in the thiamine biosynthesis pathway, the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate. TenI is an enzymatically inactive regulatory protein involved in the regulation of several extracellular enzymes. This superfamily also contains other enzymatically inactive proteins with unknown functions. 196
21487 340451 cd00565 Ubl_ThiS ubiquitin-like (Ubl) domain found in sulfur carrier protein ThiS. ThiS, also termed Thiamine biosynthesis protein (ThiaminS), is a sulfur carrier protein involved in thiamin biosynthesis in prokaryotes. It has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub), and is activated in an ATP-dependent manner by sulfurtransferases, similar to the activation mechanism of Ub-activating enzyme E1. ThiS has common evolutionary origin with Ub-related protein modifiers in eukaryotes, a beta-grasp fold as Ub, and is closely related to proteins MoaD and Urm1. 64
21488 173838 cd00567 ACAD Acyl-CoA dehydrogenase. Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC). 327
21489 238318 cd00568 TPP_enzymes Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes. 168
21490 259851 cd00569 HTH_Hin_like Helix-turn-helix domain of Hin and related proteins. This domain model summarizes a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a C-terminal DNA-binding domain (HTH_Hin) and a catalytic domain. 42
21491 238319 cd00570 GST_N_family Glutathione S-transferase (GST) family, N-terminal domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK subfamily, a member of the DsbA family). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxin 2 and stringent starvation protein A. 71
21492 238320 cd00571 UreE UreE urease accessory protein. UreE is a metallochaperone assisting the insertion of a Ni2+ ion in the active site of urease, an important step in the in vivo assembly of urease, an enzyme that hydrolyses urea into ammonia and carbamic acid. The C-terminal region of UreE contains a histidine rich nickel binding site. 136
21493 238321 cd00575 NOS_oxygenase Nitric oxide synthase (NOS) produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3) . Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain which binds to the substrate L-Arg, zinc, and to the cofactors heme and 5.6.7.8-(6R)-tetrahydrobiopterin (BH4) . Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADH, FAD and FMN. While prokaryotes can produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS, a few prokaryotes also have a NOS which consists solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate. 356
21494 153083 cd00576 RNR_PFL Ribonucleotide reductase and Pyruvate formate lyase. Ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL) are believed to have diverged from a common ancestor. They have a structurally similar ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs use a diiron-tyrosyl radical while Class II RNRs use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. PFL, an essential enzyme in anaerobic bacteria, catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate in a mechanism that uses a glycyl radical. 401
21495 238322 cd00577 PCNA Proliferating Cell Nuclear Antigen (PCNA) domain found in eukaryotes and archaea. These polymerase processivity factors play a role in DNA replication and repair. PCNA encircles duplex DNA in its central cavity, providing a DNA-bound platform for the attachment of the polymerase. The trimeric PCNA ring is structurally similar to the dimeric ring formed by the DNA polymerase processivity factors in bacteria (beta subunit DNA polymerase III holoenzyme) and in bacteriophages (catalytic subunits in T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. PCNA is also involved with proteins involved in cell cycle processes such as DNA repair and apoptosis. Many of these proteins contain a highly conserved motif known as the PIP-box (PCNA interacting protein box) which contains the sequence Qxx[LIM]xxF[FY]. 248
21496 238323 cd00578 L-fuc_L-ara-isomerases L-fucose isomerase (FucIase) and L-arabinose isomerase (AI) family; composed of FucIase, AI and similar proteins. FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose in glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in various oligo- and polysaccharides in mammals, bacteria and plants. AI catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion to D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute. 452
21497 238324 cd00580 CHMI 5-carboxymethyl-2-hydroxymuconate isomerase (CHMI) is a trimeric enzyme catalyzing the isomerization of the unsaturated ketone 5-(carboxymethyl)-2-hydroxymuconate to 5-(carboxymethyl)-2-oxo-3-hexene-1,6-dionate. This is one step in the homoprotocatechuate pathway, one of the microbial meta-fission pathways that degrade aromatic carbon sources to citric acid cycle intermediates. Despite the structural similarity of CHMI with 4-oxalocrotonate tautomerase (4-OT) and macrophage migration inhibitory factor (MIF), there is no significant sequence similarity among these protein families, and therefore, they are not combined in one hierarchy. 113
21498 238325 cd00581 QFR_TypeB_TM Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and rhodoquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The Type B enzyme from Desulfovibrio gigas is capable of fumarate reduction and succinate oxidation. 206
21499 411704 cd00583 MutH-like Restriction endonuclease MutH and similar endonucleases. MutH is a 28kD endonuclease involved in methyl-directed DNA mismatch repair in gram negative bacteria. MutH is both sequence-specific and methylation-specific, introducing a nick in the unmethylated strand of a hemi-methylated d(GATC) DNA duplex. MutH is homologous to the type II restriction endonuclease Sau3AI which also recognizes the d(GATC) sequence however, Sau3AI cleaves both strands regardless of their methylation state. The active form of MutH is monomeric while that of Sau3AI is homodimeric. In addition to MutH, MutS, involved in mismatch recognition, and MutL, involved in mediating the interactions between MutH and MutS, are essential in initiating mismatch repair in Escherichia coli. 208
21500 238327 cd00584 Prefoldin_alpha Prefoldin alpha subunit; Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils. 129
21501 238328 cd00585 Peptidase_C1B Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure with the active sites imbedded in the central channel. The bacterial homolog of BH, called pepC, is a cysteine aminopeptidase possessing broad specificity. Although its crystal structure has not been solved, biochemical analysis shows that pepC also forms a hexamer. 437
21502 238329 cd00586 4HBT 4-hydroxybenzoyl-CoA thioesterase (4HBT). Catalyzes the final step in the 4-chlorobenzoate degradation pathway in which 4-chlorobenzoate is converted to 4-hydroxybenzoate in certain soil-dwelling bacteria. 4HBT forms a homotetramer with four active sites. There is no evidence to suggest that 4HBT is related to the type I thioesterases functioning in primary or secondary metabolic pathways. Each subunit of the 4HBT tetramer adopts a so-called hot-dog fold similar to those of beta-hydroxydecanoyl-ACP dehydratase, (R)-specific enoyl-CoA hydratase, and type II thioesterase (TEII). 110
21503 238330 cd00587 HCP_like The HCP family of iron-sulfur proteins includes hybrid cluster protein (HCP), acetyl-CoA synthase (ACS), and carbon monoxide dehydrogenase (CODH), all of which contain [Fe4-S4] metal clusters at their active sites. These proteins have a conserved alpha-beta Rossmann fold domain. HCP, formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown. Acetyl-CoA synthase (ACS) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide and CoA. 258
21504 238331 cd00588 CheW_like CheW-like domain. CheW proteins are part of the chemotaxis signalling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in the chemotaxis associated histidine kinase CheA binds to CheW, suggesting that these domains can interact with each other. 136
21505 409669 cd00590 RRM_SF RNA recognition motif (RRM) superfamily. RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). 72
21506 259852 cd00591 HU_IHF DNA sequence specific (IHF) and non-specific (HU) domains. This family includes integration host factor (IHF) and HU, also called type II DNA-binding proteins (DNABII), which are small dimeric proteins that specifically bind the DNA minor groove, inducing large bends in the DNA and serving as architectural factors in a variety of cellular processes such as recombination, initiation of replication/transcription and gene regulation. IHF binds DNA in a sequence specific manner while HU displays little or no sequence preference. IHF homologs are usually heterodimers, while HU homologs are typically homodimers (except HU heterodimers from E. coli and other enterobacteria). HU is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). Bacillus phage SPO1-encoded transcription factor 1 (TF1) is another related type II DNA-binding protein. Like IHF, TF1 binds DNA specifically and bends DNA sharply. 85
21507 133378 cd00592 HTH_MerR-like Helix-Turn-Helix DNA binding domain of MerR-like transcription regulators. Helix-turn-helix (HTH) MerR-like transcription regulator, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 100
21508 238333 cd00593 RIBOc RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archaeal ribonuclease III (RNAse III) proteins. RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase III's participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNA's (components of the spliceosome). In eukaryotes RNase III or RNaseIII like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing. 133
21509 238334 cd00594 KU Ku-core domain; includes the central DNA-binding beta-barrels, polypeptide rings, and the C-terminal arm of Ku proteins. The Ku protein consists of two tightly associated homologous subunits, Ku70 and Ku80, and was originally identified as an autoantigen recognized by the sera of patients with an autoimmune disease. In eukaryotes, the Ku heterodimer contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by non-homologous end-joining. The bacterial Ku homologs do not contain the conserved N-terminal extension that is present in the eukaryotic Ku protein. 272
21510 238335 cd00595 NDPk Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved. 133
21511 349427 cd00596 Peptidase_M14_like M14 family of metallocarboxypeptidases and related proteins. The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 216
21512 119349 cd00598 GH18_chitinase-like The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model. 210
21513 119373 cd00599 GH25_muramidase Endo-N-acetylmuramidases (muramidases) are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. This family of muramidases contains a glycosyl hydrolase family 25 (GH25) catalytic domain and is found in bacteria, fungi, slime molds, round worms, protozoans and bacteriophages. The bacteriophage members are referred to as endolysins which are involved in lysing the host cell at the end of the replication cycle to allow release of mature phage particles. Endolysins are typically modular enzymes consisting of a catalytically active domain that hydrolyzes the peptidoglycan cell wall and a cell wall-binding domain that anchors the protein to the cell wall. Endolysins generally have narrow substrate specificities with either intra-species or intra-genus bacteriolytic activity. 186
21514 212462 cd00600 Sm_like Sm and related proteins. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 63
21515 238336 cd00602 IPT_TF IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated Tcells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region. 101
21516 238337 cd00603 IPT_PCSR IPT domain of Plexins and Cell Surface Receptors (PCSR) and related proteins. This subgroup contains IPT domains of plexins, receptors, like the plasminogen-related growth factor receptors, the hepatocyte growth factor-scatter factors, and the macrophage-stimulating receptors and of fibrocystin. Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT_PCSR domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 90
21517 238338 cd00604 IPT_CGTD IPT domain (domain D) of cyclodextrin glycosyltransferase (CGTase) and similar enzymes. These enzymes are involved in the enzymatic hydrolysis of alpha-1,4 linkages of starch polymers and belong to the glycosyl hydrolase family 13. Most consist of three domains (A,B,C) but CGTase is more complex and has two additional domains (D,E). The function of the IPT/D domain is unknown. 81
21518 238339 cd00606 fungal_RNase fungal type ribonuclease. Ribonucleases (RNAses) cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. The members of this CD belong to the superfamily of microbial ribonucleases which are predominantly guanyl specific nucleases. Guanyl specific RNAses are endonucleases which split RNA phosphodiester bonds at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNAses to give guanosine-3'-phosphate. The alignment also contains ribotoxins, a fungal group of cytotoxins, specifically cleaving the sarcin/ricin loop (SRL) structure of the 23-28S rRNA and therefore being very potent inhibitors of protein synthesis. 100
21519 238340 cd00607 RNase_Sa RNase_Sa. Ribonucleases first isolated from Streptomyces aureofaciens. In general, ribonucleases cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. RNAse Sa is a guanylate specific endoribonuclease which belongs to the superfamily of microbial ribonucleases. Typical of this sub-family, the enzyme hydrolyses the phosphodiester bonds of RNA at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNAses to give guanosine-3'-phosphate. 95
21520 238341 cd00608 GalT Galactose-1-phosphate uridyl transferase (GalT): This enzyme plays a key role in galactose metabolism by catalysing the transfer of a uridine 5'-phosphoryl group from UDP-glucose to galactose 1-phosphate. The structure of E. coli GalT reveals that the enzyme contains two identical subunits. It also demonstrates that the active site is formed by amino acid residues from both subunits of the dimer. 329
21521 99734 cd00609 AAT_like Aspartate aminotransferase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). Pyridoxal phosphate combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. The major groups in this CD corresponds to Aspartate aminotransferase a, b and c, Tyrosine, Alanine, Aromatic-amino-acid, Glutamine phenylpyruvate, 1-Aminocyclopropane-1-carboxylate synthase, Histidinol-phosphate, gene products of malY and cobC, Valine-pyruvate aminotransferase and Rhizopine catabolism regulatory protein. 350
21522 99735 cd00610 OAT_like Acetyl ornithine aminotransferase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to ornithine aminotransferase, acetylornithine aminotransferase, alanine-glyoxylate aminotransferase, dialkylglycine decarboxylase, 4-aminobutyrate aminotransferase, beta-alanine-pyruvate aminotransferase, adenosylmethionine-8-amino-7-oxononanoate aminotransferase, and glutamate-1-semialdehyde 2,1-aminomutase. All the enzymes belonging to this family act on basic amino acids and their derivatives and are involved in transamination or decarboxylation. 413
21523 99736 cd00611 PSAT_like Phosphoserine aminotransferase (PSAT) family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major group in this CD corresponds to phosphoserine aminotransferase (PSAT). PSAT is active as a dimer and catalyzes the conversion of phosphohydroxypyruvate to phosphoserine. 355
21524 99737 cd00613 GDC-P Glycine cleavage system P-protein, alpha- and beta-subunits. This family consists of Glycine cleavage system P-proteins EC:1.4.4.2 from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex EC:2.1.2.10 (GDC) also annotated as glycine cleavage system or glycine synthase. GDC consists of four proteins P, H, L and T. The reaction catalysed by this protein is: Glycine + lipoylprotein <=> S-aminomethyldihydrolipoylprotein + CO2. Alpha-beta-type dimers associate to form an alpha(2)beta(2) tetramer, where the alpha- and beta-subunits are structurally similar and appear to have arisen by gene duplication and subsequent divergence with a loss of one active site. The members of this CD are widely dispersed among all three forms of cellular life. 398
21525 99738 cd00614 CGS_like CGS_like: Cystathionine gamma-synthase is a PLP dependent enzyme and catalyzes the committed step of methionine biosynthesis. This pathway is unique to microorganisms and plants, rendering the enzyme an attractive target for the development of antimicrobials and herbicides. This subgroup also includes cystathionine gamma-lyases (CGL), O-acetylhomoserine sulfhydrylases and O-acetylhomoserine thiol lyases. CGL's are very similar to CGS's. Members of this group are widely distributed among all three forms of life. 369
21526 99739 cd00615 Orn_deC_like Ornithine decarboxylase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to ornithine decarboxylase (ODC), arginine decarboxylase (ADC) and lysine decarboxylase (LDC). ODC is a dodecamer composed of six homodimers and catalyzes the decarboxylation of ornithine. ADC catalyzes the decarboxylation of arginine and LDC catalyzes the decarboxylation of lysine. Members of this family are widely found in all three forms of life. 294
21527 99740 cd00616 AHBA_syn 3-amino-5-hydroxybenzoic acid synthase family (AHBA_syn). AHBA_syn family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The members of this CD are involved in various biosynthetic pathways for secondary metabolites. Some well studied proteins in this CD are AHBA_synthase, protein product of pleiotropic regulatory gene degT, Arnb aminotransferase and pilin glycosylation protein. The prototype of this family, the AHBA_synthase, is a dimeric PLP dependent enzyme. AHBA_syn is the terminal enzyme of 3-amino-5-hydroxybenzoic acid (AHBA) formation which is involved in the biosynthesis of ansamycin antibiotics, including rifamycin B. Some members of this CD are involved in 4-amino-6-deoxy-monosaccharide D-perosamine synthesis. Perosamine is an important element in the glycosylation of several cell products, such as antibiotics and lipopolysaccharides of gram-positive and gram-negative bacteria. The pilin glycosylation protein encoded by gene pglA, is a galactosyltransferase involved in pilin glycosylation. Additionally, this CD consists of ArnB (PmrH) aminotransferase, a 4-amino-4-deoxy-L-arabinose lipopolysaccharide-modifying enzyme. This CD also consists of several predicted pyridoxal phosphate-dependent enzymes apparently involved in regulation of cell wall biogenesis. The catalytic lysine which is present in all characterized PLP dependent enzymes is replaced by histidine in some members of this CD. 352
21528 99741 cd00617 Tnase_like Tryptophanase family (Tnase). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to tryptophanase (Tnase) and tyrosine phenol-lyase (TPL). Tnase and TPL are active as tetramers and catalyze beta-elimination reactions. Tnase catalyzes degradation of L-tryptophan to yield indole, pyruvate and ammonia and TPL catalyzes degradation of L-tyrosine to yield phenol, pyruvate and ammonia. 431
21529 153092 cd00618 PLA2_like PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. 83
21530 238342 cd00619 Terminator_NusB Transcription termination factor NusB (N protein-Utilization Substance B). NusB plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. 130
21531 238343 cd00620 Methyltransferase_Sun N-terminal RNA binding domain of the methyltransferase Sun. The rRNA-specific 5-methylcytidine transferase Sun, also known as RrmB or Fmu, shares the RNA-binding non-catalytic domain with the transcription termination factor NusB. The precise biological role of this domain in Sun is unknown, although it is likely to be involved in sequence-specific RNA binding. The C-terminal methyltransferase domain of Sun has been shown to catalyze formation of m5C at position 967 of 16S rRNA in Escherichia coli. 126
21532 143482 cd00622 PLPDE_III_ODC Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Ornithine Decarboxylase. This subfamily is composed mainly of eukaryotic ornithine decarboxylases (ODC, EC 4.1.1.17) and ODC-like enzymes from prokaryotes represented by Vibrio vulnificus Lysine/Ornithine decarboxylase. These are fold type III PLP-dependent enzymes that differ from most bacterial ODCs which are fold type I PLP-dependent enzymes. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. Members of this subfamily contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity. Also members of this subfamily are proteins with homology to ODC that do not possess any catalytic activity: the Antizyme inhibitor (AZI) and ODC-paralogue (ODC-p). AZI binds to the regulatory protein Antizyme with a higher affinity than ODC and prevents ODC degradation. ODC-p is a novel ODC-like protein, present only in mammals, that is specifically expressed in the brain and testes. ODC-p may function as a tissue-specific antizyme inhibitory protein. 362
21533 238344 cd00625 ArsB_NhaD_permease Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump. 396
21534 132719 cd00630 RNAP_largest_subunit_C Largest subunit of RNA polymerase (RNAP), C-terminal domain. RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria. 158
21535 238345 cd00632 Prefoldin_beta Prefoldin beta; Prefoldin is a hexameric molecular chaperone complex, composed of two evolutionarily related subunits (alpha and beta), which are found in both eukaryotes and archaea. Prefoldin binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The hexameric structure consists of a double beta barrel assembly with six protruding coiled-coils. The alpha prefoldin subunits have two beta hairpin structures while the beta prefoldin subunits (this CD) have only one hairpin that is most similar to the second hairpin of the alpha subunit. The prefoldin hexamer consists of two alpha and four beta subunits and is assembled from the beta hairpins of all six subunits. The alpha subunits initially dimerize providing a structural nucleus for the assembly of the beta subunits. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. 105
21536 238346 cd00633 Secretoglobin Secretoglobins are relatively small, secreted, disulphide-bridged dimeric proteins with encoding genes sharing substantial sequence similarity. Their family subunits may be grouped into five subfamilies, A-E. Uteroglobin (subfamily A), which is identical to Clara cell protein (CC10), forms a globular shaped homodimer with a large hydrophobic pocket located between the two dimers. The uteroglobin monomer structure is composed of four alpha helices that do not form a canonical four helix-bundle motif but rather a boomerang-shaped structure in which helices H1, H3, and H4 are able to bind a homodimeric partner. The hydrophobic pocket binds steroids, particularly progesterone, with high specificity. However, the true biological function of uteroglobin is poorly understood. In mammals, uteroglobin has immunosuppressive and anti-inflammatory properties through the inhibition of phospholipase A2. The other four main subfamilies of secretoglobins are found in heterodimeric combinations, with B and C subfamilies disulphide-bridged to the E and D subfamilies, respectively. [See review by Laukaitis C.M. & Karn R.C. (2005). Biological Journal of the Linnean Society 84, 493]. These include rat prostatic steroid-binding protein (PBP or prostatein), human mammaglobin (or heteroglobin), lipophilins, major cat allergen Fel dI, the hamster Harderian gland proteins and mouse salivary androgen-binding protein (ABP). Example of such a heterodimer: ABPalpha-like sequences are closely related to cat Fel dI chain 1, whereas ABPbeta-gamma-like sequences are closely related to Fel dI chain 2. Thus, the heterodimeric structure of ABPalpha-beta and ABPalpha-gamma is recapitulated by the sequence-similar Fel dI chains 1 and 2. This conservation of primary and quaternary structure indicates that the genome of the eutherian common ancestor of cats, rodents, and primates contained a similar gene pair. 67
21537 143483 cd00635 PLPDE_III_YBL036c_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, YBL036c-like proteins. This family contains mostly uncharacterized proteins, widely distributed among eukaryotes, bacteria and archaea, that bear similarity to the yeast hypothetical protein YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. YBL036c is a single domain monomeric protein with a typical TIM barrel fold. It binds the PLP cofactor and has been shown to exhibit amino acid racemase activity. The YBL036c structure is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. The lack of a second domain in YBL036c may explain limited D- to L-alanine racemase or non-specific racemase activity. 222
21538 238347 cd00636 TroA-like Helical backbone metal receptor (TroA-like domain). These proteins have been shown to function in the ABC transport of ferric siderophores and metal ions such as Mn2+, Fe3+, Cu2+ and/or Zn2+. Their ligand binding site is formed in the interface between two globular domains linked by a single helix. Many of these proteins also possess a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). The TroA-like proteins differ in their fold and ligand-binding mechanism from the PBPI and PBPII proteins, but are structurally similar, however, to the beta-subunit of the nitrogenase molybdenum-iron protein MoFe. Most TroA-like proteins are encoded by ABC-type operons and appear to function as periplasmic components of ABC transporters in metal ion uptake. 148
21539 410626 cd00637 7tm_classA_rhodopsin-like rhodopsin receptor-like class A family of the seven-transmembrane G protein-coupled receptor superfamily. Class A rhodopsin-like receptors constitute about 90% of all GPCRs. The class A GPCRs include the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. Based on sequence similarity, GPCRs can be divided into six major classes: class A (rhodopsin-like family), class B (Methuselah-like, adhesion and secretin-like receptor family), class C (metabotropic glutamate receptor family), class D (fungal mating pheromone receptors), class E (cAMP receptor family), and class F (frizzled/smoothened receptor family). Nearly 800 human GPCR genes have been identified and are involved in essentially all major physiological processes. Approximately 40% of clinically marketed drugs mediate their effects through modulation of GPCR function for the treatment of a variety of human diseases including bacterial infections. 275
21540 107202 cd00640 Trp-synth-beta_II Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine. 244
21541 238348 cd00641 GTP_cyclohydro2 GTP cyclohydrolase II (RibA). GTP cyclohydrolase II catalyzes the conversion of GTP to 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5' phosphate, formate, pyrophosphate (APy), and GMP in the biosynthetic pathway of riboflavin. Riboflavin is the precursor molecule for the synthesis of the coenzymes flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential to cell metabolism. The enzyme is present in plants and numerous pathogenic bacteria, especially gram negative organisms, who are dependent on endogenous synthesis of the vitamin because they lack an appropriate uptake system. For animals and humans, which lack this biosynthetic pathway, riboflavin is the essential vitamin B2. GTP cyclohydrolase II requires magnesium ions for activity and has a bound catalytic zinc. The functionally active form is thought to be a homodimer. A paralogous protein is encoded in the genome of Streptomyces coelicolor, which converts GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy), an activity that has otherwise been reported for unrelated GTP cyclohydrolases III. 193
21542 238349 cd00642 GTP_cyclohydro1 GTP cyclohydrolase I (GTP-CH-I) catalyzes the conversion of GTP into dihydroneopterin triphosphate. The enzyme product is the precursor of tetrahydrofolate in eubacteria, fungi, and plants and of the folate analogs in methanogenic bacteria. In vertebrates and insects it is the biosynthetic precursor of tetrahydrobiopterin (BH4) which is involved in the formation of catecholamines, nitric oxide, and the stimulation of T lymphocytes. The biosynthetic reaction of BH4 is controlled by a regulatory protein GFRP which mediates feedback inhibition of GTP-CH-I by BH4. This inhibition is reversed by phenylalanine. The decameric GTP-CH-I forms a complex with two pentameric GFRP in the presence of phenylalanine or a combination of GTP and BH4, respectively. 185
21543 153081 cd00643 HMG-CoA_reductase_classI Class I hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class I enzyme, homotetramer. Catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals this is the rate limiting committed step in cholesterol biosynthesis. Class I enzymes are found predominantly in eukaryotes and contain N-terminal membrane regions. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. 403
21544 153082 cd00644 HMG-CoA_reductase_classII Class II hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class II, prokaryotic enzyme is a homodimer. Class II enzymes are found primarily in prokaryotes and Archaeoglobus fulgidus and are soluble as they lack the membrane region. Enzymes catalyze the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. Human and bacterial HMGR differ in their active site architecture. 417
21545 238350 cd00645 AsnA Asparagine synthetase (aspartate-ammonia ligase) (AsnA) catalyses the conversion of L-aspartate to L-asparagine in the presence of ATP and ammonia. AsnA is a homodimeric enzyme which is structurally similar to the catalytic core domain of class II aminoacyl-tRNA synthetases. Ammonia-dependent AsnA is not homologous to the glutamine-dependent asparagine synthetase AsnB. 309
21546 270214 cd00648 Periplasmic_Binding_Protein_Type_2 Type 2 periplasmic binding fold superfamily. This evolutionary model and hierarchy represent the ligand-binding domains found in solute binding proteins that serve as initial receptors in transport, signal transduction and channel gating. The PBP2 proteins share the same architecture as periplasmic binding proteins type 1 (PBP1), but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The origin of the PBP module can be traced across distant phyla, including eukaryotes, archaebacteria, and prokaryotes. The majority of PBP2 proteins are involved in the uptake of a variety of soluble substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction. The substrate binding domain of the LysR transcriptional regulators and the oligopeptide-like transport systems also contain the type 2 periplasmic binding fold and thus they are significantly homologous to that of the PBP2; however, these two families are grouped into a separate hierarchy of the PBP2 superfamily due to the large number of protein sequences. 196
21547 173824 cd00649 catalase_peroxidase_1 N-terminal catalytic domain of catalase-peroxidases. This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Catalase-peroxidases can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. These enzymes are found in many archaeal and bacterial organisms, where they neutralize potentially lethal hydrogen peroxide molecules generated during photosynthesis or stationary phase. Along with related intracellular fungal and plant peroxidases, catalase-peroxidases belong to class I of the plant peroxidase superfamily. Unlike the eukaryotic enzymes, they are typically comprised of two homologous domains that probably arose via a single gene duplication event. The heme binding motif is present only in the N-terminal domain; the function of the C-terminal domain is not clear. 409
21548 133419 cd00650 LDH_MDH_like NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family. Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 263
21549 238351 cd00651 TFold Tunnelling fold (T-fold). The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA), GTP cyclohydrolase I (GTPCH-1), 6-pyruvoyl tetrahydropterin synthetase (PTPS), and uricase (UO, uroate/urate oxidase). They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and a lot of conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to make up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1. Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel. 122
21550 238352 cd00652 TBP_TLF TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2, TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP. 174
21551 238353 cd00653 RNA_pol_B_RPB2 RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family; in each case they are the two largest subunits. The clamp is a mobile structure that grips DNA during elongation. 866
21552 238354 cd00655 RNAP_Rpb7_N_like RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor. The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex. 80
21553 259791 cd00656 Zn-ribbon C-terminal zinc ribbon domain of RNA polymerase intrinsic transcript cleavage subunit. The homologous C-terminal zinc ribbon domains of subunits A12.2, Rpb9, and C11 in RNA Polymerases (Pol) I, II, and III, respectively are required for intrinsic transcript cleavage. TFS is a related archaeal protein that is involved in RNA cleavage by archaeal polymerase. These proteins have two zinc-binding beta-ribbon domains, N-terminal zinc ribbon (N-ribbon) and C-terminal zinc ribbon (C-ribbon). Transcription Factor IIS (TFIIS) domain III is homologous to the C-ribbon domain that stimulates the weak cleavage activity of Rpb9 for Pol II. 45
21554 153097 cd00657 Ferritin_like Ferritin-like superfamily of diiron-containing four-helix-bundle proteins. Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF). 130
21555 271176 cd00659 Topo_IB_C DNA topoisomerase IB, C-terminal catalytic domain. Topoisomerase I promotes the relaxation of both positive and negative DNA superhelical tension by introducing a transient single-stranded break in duplex DNA. This function is vital for the processes of replication, transcription, and recombination. Unlike Topo IA enzymes, Topo IB enzymes do not require a single-stranded region of DNA or metal ions for their function. The type IB family of DNA topoisomerases includes eukaryotic nuclear topoisomerase I, topoisomerases of poxviruses, and bacterial versions of Topo IB. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their C-terminal catalytic domain and the overall reaction mechanism with tyrosine recombinases. The C-terminal catalytic domain in topoisomerases is linked to a divergent N-terminal domain that shows no sequence or structure similarity to the N-terminal domains of tyrosine recombinases. 210
21556 238356 cd00660 Topoisomer_IB_N Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I play putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 215
21557 238357 cd00667 ring_hydroxylating_dioxygenases_beta Ring hydroxylating dioxygenase beta subunit. This subunit has a similar structure to NTF-2, Ketosteroid isomerase and scytalone dehydratase. The degradation of aromatic compounds by aerobic bacteria frequently begins with the dihydroxylation of the substrate by nonheme iron-containing dioxygenases. These enzymes consist of two or three soluble proteins that interact to form an electron-transport chain that transfers electrons from reduced nucleotides (NADH) via flavin and [2Fe-2S] redox centers to a terminal dioxygenase. Aromatic-ring-hydroxylating dioxygenases oxidize aromatic hydrocarbons and related compounds to cis-arene diols. These enzymes utilize a mononuclear non-heme iron center to catalyze the addition of dioxygen to their respective substrates. The active site of these enzymes, however, is in the alpha subunit. No functional role other than a structural one has been attributed to the beta subunit. 160
21558 185674 cd00668 Ile_Leu_Val_MetRS_core catalytic core domain of isoleucyl, leucyl, valyl and methionyl tRNA synthetases. Catalytic core domain of isoleucyl, leucyl, valyl and methionyl tRNA synthetases. These class I enzymes are all monomers. However, in some species, MetRS functions as a homodimer, as a result of an additional C-terminal domain. These enzymes aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. Enzymes in this subfamily share an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. MetRS has a significantly shorter insertion, which lacks the editing function. 312
21559 238358 cd00669 Asp_Lys_Asn_RS_core Asp_Lys_Asn_tRNA synthetase class II core domain. This domain is the core catalytic domain of class II aminoacyl-tRNA synthetases of the subgroup containing aspartyl, lysyl, and asparaginyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. Nearly all class II tRNA synthetases are dimers and enzymes in this subgroup are homodimers. These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. 269
21560 238359 cd00670 Gly_His_Pro_Ser_Thr_tRS_core Gly_His_Pro_Ser_Thr_tRNA synthetase class II core domain. This domain is the core catalytic domain of tRNA synthetases of the subgroup containing glycyl, histidyl, prolyl, seryl and threonyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. These enzymes belong to class II aminoacyl-tRNA synthetases (aaRS) based upon their structure and the presence of three characteristic sequence motifs in the core domain. This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase and at the N-terminus of the ATP phosphoribosyltransferase accessory subunit, HisZ and the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). Most class II tRNA synthetases are dimers, with this subgroup consisting of mostly homodimers. These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. 235
21561 185675 cd00671 ArgRS_core catalytic core domain of arginyl-tRNA synthetases. Arginyl tRNA synthetase (ArgRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. There are at least three subgroups of ArgRS. One type contains both characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. The second subtype lacks the KMSKS motif; however, it has a lysine N-terminal to the HIGH motif, which serves as the functional counterpart to the second lysine of the KMSKS motif. A third group, which is found primarily in archaea and a few bacteria, lacks both the KMSKS motif and the HIGH loop lysine. 212
21562 173899 cd00672 CysRS_core catalytic core domain of cysteinyl tRNA synthetase. Cysteinyl tRNA synthetase (CysRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 213
21563 238360 cd00673 AlaRS_core Alanyl-tRNA synthetase (AlaRS) class II core catalytic domain. AlaRS is a homodimer. It is responsible for the attachment of alanine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its predicted structure and the presence of three characteristic sequence motifs. 232
21564 173900 cd00674 LysRS_core_class_I catalytic core domain of class I lysyl tRNA synthetase. Class I lysyl tRNA synthetase (LysRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. The class I LysRS is found only in archaea and some bacteria and has evolved separately from class II LysRS, as the two do not share structural or sequence similarity. 353
21565 238361 cd00677 S15_NS1_EPRS_RNA-bind S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NS1) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism. 46
21566 176852 cd00680 RHO_alpha_C C-terminal catalytic domain of the oxygenase alpha subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC), and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this family include the alpha subunits of Pseudomonas resinovorans strain CA10 anthranilate 1,2-dioxygenase, Stenotrophomonas maltophilia dicamba O-demethylase, Ralstonia sp. U2 salicylate-5-hydroxylase, Cycloclasticus sp. strain A5 polycyclic aromatic hydrocarbon dioxygenase, toluene 2,3-dioxygenase from Pseudomonas putida F1, dioxin dioxygenase of Sphingomonas sp. Strain RW1, plant choline monooxygenase, and the polycyclic aromatic hydrocarbon (PAH)-degrading ring-hydroxylating dioxygenase from Sphingomonas CHY-1. This group also includes the C-terminal catalytic domains of MupW, part of the mupirocin biosynthetic gene cluster in Pseudomonas fluorescens, and Pseudomonas aeruginosa GbcA (glycine betaine catabolism A). This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 188
21567 173831 cd00683 Trans_IPPS_HH Trans-Isoprenyl Diphosphate Synthases, head-to-head. These trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) catalyze a head-to-head (HH) (1'-1) condensation reaction. This CD includes squalene and phytoene synthases which catalyze the 1'-1 condensation of two 15-carbon (farnesyl) and 20-carbon (geranylgeranyl) isoprenyl diphosphates, respectively. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DXXXD) located on opposite walls. These residues mediate binding of prenyl phosphates. A two-step reaction has been proposed for squalene synthase (farnesyl-diphosphate farnesyltransferase) in which two molecules of FPP react to form a stable cyclopropylcarbinyl diphosphate intermediate, and then the intermediate undergoes heterolysis, isomerization, and reduction with NADPH to form squalene, a precursor of cholesterol. The carotenoid biosynthesis enzyme, phytoene synthase (CrtB), catalyzes the condensation reaction of two molecules of geranylgeranyl diphosphate to produce phytoene, a precursor of beta-carotene. These enzymes produce the triterpene and tetraterpene precursors for many diverse sterol and carotenoid end products and are widely distributed among eukarya, bacteria, and archaea. 265
21568 173832 cd00684 Terpene_cyclase_plant_C1 Plant Terpene Cyclases, Class 1. This CD includes a diverse group of monomeric plant terpene cyclases (Tpsa-Tpsf) that convert the acyclic isoprenoid diphosphates, geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP) into cyclic monoterpenes, sesquiterpenes, or diterpenes, respectively; a few form acyclic species. Terpenoid cyclases are soluble enzymes localized to the cytosol (sesquiterpene synthases) or plastids (mono- and diterpene synthases). All monoterpene and diterpene synthases have strict substrate specificity; however, some sesquiterpene synthases can accept both FPP and GPP. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl diphosphates, via bridging Mg2+ ions (K+ preferred by gymnosperm cyclases), inducing conformational changes such that an N-terminal region forms a cap over the catalytic core. Loss of diphosphate from the enzyme-bound substrate (GPP, FPP, or GGPP) results in an allylic carbocation that electrophilically attacks a double bond further down the terpene chain to effect the first ring closure. Unlike monoterpene, sesquiterpene, and macrocyclic diterpene synthases, which undergo substrate ionization by diphosphate ester scission, Tpsc-like diterpene synthases catalyze cyclization reactions by an initial protonation step producing a copalyl diphosphate intermediate. These enzymes lack the aspartate-rich sequences mentioned above. Most diterpene synthases have an N-terminal, internal element (approx 210 aa) whose function is unknown. 542
21569 173833 cd00685 Trans_IPPS_HT Trans-Isoprenyl Diphosphate Synthases, head-to-tail. These trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) catalyze head-to-tail (HT) (1'-4) condensation reactions. This CD includes all-trans (E)-isoprenyl diphosphate synthases which synthesize various chain length (C10, C15, C20, C25, C30, C35, C40, C45, and C50) linear isoprenyl diphosphates from precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). They catalyze the successive 1'-4 condensation of the 5-carbon IPP to allylic substrates geranyl-, farnesyl-, or geranylgeranyl-diphosphate. Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DDXX(XX)D) located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, protecting and stabilizing reactive carbocation intermediates. Farnesyl diphosphate synthases produce the precursors of steroids, cholesterol, sesquiterpenes, farnesylated proteins, heme, and vitamin K12; and geranylgeranyl diphosphate and longer chain synthases produce the precursors of carotenoids, retinoids, diterpenes, geranylgeranylated chlorophylls, ubiquinone, and archaeal ether linked lipids. Isoprenyl diphosphate synthases are widely distributed among archaea, bacteria, and eukarya. 259
21570 173834 cd00686 Terpene_cyclase_cis_trans_C1 Cis, Trans, Terpene Cyclases, Class 1. This CD includes the terpenoid cyclase, trichodiene synthase, which catalyzes the cyclization of farnesyl diphosphate (FPP) to trichodiene using a cis-trans pathway, and is the first committed step in the biosynthesis of trichothecene toxins and antibiotics. As with other enzymes with the 'terpenoid synthase fold', this enzyme has two conserved metal binding motifs that coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP. Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function as homodimers and are found in several genera of fungi. 357
21571 173835 cd00687 Terpene_cyclase_nonplant_C1 Non-plant Terpene Cyclases, Class 1. This CD includes terpenoid cyclases such as pentalenene synthase and aristolochene synthase which, using an all-trans pathway, catalyze the ionization of farnesyl diphosphate, followed by the formation of a macrocyclic intermediate by bond formation between C1 with either C10 (aristolochene synthase) or C11 (pentalenene synthase), resulting in production of tricyclic hydrocarbon pentalenene or bicyclic hydrocarbon aristolochene. As with other enzymes with the 'terpenoid synthase fold', they have two conserved metal binding motifs, proposed to coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP to the enzymes. Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function in the monomeric form and are found in fungi, bacteria and Dictyostelium. 303
21572 238362 cd00688 ISOPREN_C2_like This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP), and the C3, C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY); these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to, and may modulate, the activity of placental protein-14 in T-cell growth and cytokine production, thereby protecting the allogeneic fetus from attack by the maternal immune system. 300
21573 173825 cd00691 ascorbate_peroxidase Ascorbate peroxidases and cytochrome C peroxidases. Ascorbate peroxidases are a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Along with related catalase-peroxidases, ascorbate peroxidases belong to class I of the plant superfamily. Ascorbate peroxidases are found in the chloroplasts and/or cytosol of algae and plants, where they have been shown to control the concentration of lethal hydrogen peroxide molecules. The yeast cytochrome c peroxidase is a divergent member of the family; it forms a complex with cytochrome c to catalyze the reduction of hydrogen peroxide to water. 253
21574 173826 cd00692 ligninase Ligninase and other manganese-dependent fungal peroxidases. Ligninases and related extracellular fungal peroxidases belong to class II of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Class II peroxidases are fungal glycoproteins that have been implicated in the oxidative breakdown of lignin, the main cell wall component of woody plants. They contain four conserved disulphide bridges and two conserved calcium binding sites. 328
21575 173827 cd00693 secretory_peroxidase Horseradish peroxidase and related secretory plant peroxidases. Secretory peroxidases belong to class III of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Class III peroxidases are found in the extracellular space or in the vacuole in plants where they have been implicated in hydrogen peroxide detoxification, auxin catabolism and lignin biosynthesis, and stress response. Class III peroxidases contain four conserved disulphide bridges and two conserved calcium binding sites. 298
21576 133420 cd00704 MDH Malate dehydrogenase. Malate dehydrogenase (MDH) is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. MDHs belong to the NAD-dependent, lactate dehydrogenase (LDH)-like, 2-hydroxycarboxylate dehydrogenase family, which also includes the GH4 family of glycoside hydrolases. They are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 323
21577 238363 cd00707 Pancreat_lipase_like Pancreatic lipase-like enzymes. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 275
21578 100039 cd00710 LbH_gamma_CA Gamma carbonic anhydrases (CA): Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism, involving the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three distinct groups of carbonic anhydrases - alpha, beta and gamma - which show no significant sequence identity or structural similarity. Gamma CAs are homotrimeric enzymes, with each subunit containing a left-handed parallel beta helix (LbH) structural domain. 167
21579 238364 cd00712 AsnB Glutamine amidotransferases class-II (GATase) asparagine synthase_B type. Asparagine synthetase B catalyses the ATP-dependent conversion of aspartate to asparagine. This enzyme is a homodimer, with each monomer composed of a glutaminase domain and a synthetase domain. The N-terminal glutaminase domain hydrolyzes glutamine to glutamic acid and ammonia. 220
21580 238365 cd00713 GltS Glutamine amidotransferases class-II (Gn-AT), glutamate synthase (GltS)-type. GltS is a homodimer that synthesizes L-glutamate from 2-oxoglutarate and L-glutamine, an important step in ammonia assimilation in bacteria, cyanobacteria and plants. The N-terminal glutaminase domain catalyzes the hydrolysis of glutamine to glutamic acid and ammonia, and has a fold similar to that of other glutamine amidotransferases such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), and beta lactam synthetase (beta-LS), as well as the Ntn hydrolase folds of the proteasomal alpha and beta subunits. 413
21581 238366 cd00714 GFAT Glutamine amidotransferases class-II (Gn-AT)_GFAT-type. This domain is found at the N-terminus of glucosamine-6P synthase (GlmS, or GFAT in humans). The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. In humans, GFAT catalyzes the first and rate-limiting step of hexosamine metabolism, the conversion of D-fructose-6P (Fru6P) into D-glucosamine-6P using L-glutamine as a nitrogen source. The end product of this pathway, UDP-N-acetyl glucosamine, is a major building block of the bacterial peptidoglycan and fungal chitin. 215
21582 238367 cd00715 GPATase_N Glutamine amidotransferases class-II (GN-AT)_GPAT-type. This domain is found at the N-terminus of glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase). The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. GPATase crystallizes as a homotetramer, but can also exist as a homodimer. 252
21583 153076 cd00716 creatine_kinase_like Phosphagen (guanidino) kinases such as creatine kinase and similar enzymes. Eukaryotic creatine kinase-like phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK), which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CKs are found as tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial, cytosolic, and flagellar) isoforms. Mitochondrial and cytoplasmic CKs are dimeric or octameric, while the flagellar isoforms are trimers with three CD domains fused as a single protein chain. CKs are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK, one of the most studied members of this family, this model also represents other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK), and echinoderm arginine kinase (AK). 357
21584 238368 cd00717 URO-D Uroporphyrinogen decarboxylase (URO-D) is a dimeric cytosolic enzyme that decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, without requiring any prosthetic groups or cofactors. This reaction is located at the branching point of the tetrapyrrole biosynthetic pathway, leading to the biosynthesis of heme, chlorophyll or bacteriochlorophyll. URO-D deficiency is responsible for the human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic porphyria (HEP). 335
21585 198380 cd00719 GIY-YIG_SF GIY-YIG nuclease domain superfamily. The GIY-YIG nuclease domain superfamily includes a large and diverse group of proteins involved in many cellular processes, such as class I homing GIY-YIG family endonucleases, prokaryotic nucleotide excision repair proteins UvrC and Cho, type II restriction enzymes, the endonuclease/reverse transcriptase of eukaryotic retrotransposable elements, and a family of eukaryotic enzymes that repair stalled replication forks. All of these members contain a conserved GIY-YIG nuclease domain that may serve as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. By combining with different specificity, targeting, or other domains, the GIY-YIG nucleases may perform different functions. 69
21586 238369 cd00727 malate_synt_A Malate synthase A (MSA), present in some bacteria, plants and fungi. Prokaryotic MSAs tend to be monomeric, whereas eukaryotic enzymes are homomultimers. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 511
21587 238370 cd00728 malate_synt_G Malate synthase G (MSG), monomeric enzyme present in some bacteria. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 712
21588 238371 cd00729 rubredoxin_SM Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity. 34
21589 238372 cd00730 rubredoxin Rubredoxin; nonheme iron binding domains containing a [Fe(SCys)4] center. Rubredoxins are small nonheme iron proteins. The iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc. They are believed to be involved in electron transfer. 50
21590 238373 cd00731 CheA_reg CheA regulatory domain; CheA is a histidine protein kinase present in bacteria and archaea. Activated by the chemotaxis receptor, a histidine phosphoryl group from CheA is passed directly to an aspartate in the response regulator CheY. This signalling mechanism is modulated by the methyl accepting chemotaxis proteins (MCPs). MCPs form a highly interconnected, tightly packed array within the membrane that is organized, at least in part, through interactions with CheW and CheA. The CheA regulatory domain belongs to the family of CheW_like proteins and has been proposed to mediate interaction with the kinase regulator CheW. 132
21591 238374 cd00732 CheW CheW, a small regulator protein, unique to the chemotaxis signalling in prokaryotes and archaea. CheW interacts with the histidine kinase CheA, most likely with the related regulatory domain of CheA. CheW is proposed to form signalling arrays together with CheA and the methyl-accepting chemotaxis proteins (MCPs), which are involved in response modulation. 140
21592 238375 cd00733 GlyRS_alpha_core Class II Glycyl-tRNA synthetase (GlyRS) alpha subunit core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes and in arabidopsis. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. This alignment contains only sequences from the GlyRS form which heterotetramerizes. The homodimer form of GlyRS is in a different family of class II aaRS. Class II assignment is based upon structure and the presence of three characteristic sequence motifs. 279
21593 381597 cd00735 T4-like_lys bacteriophage T4-like lysozymes. Bacteriophage T4-like lysozymes hydrolyze the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc) in peptidoglycan heteropolymers of prokaryotic cell walls. Members include a variety of bacteriophages (T4, RB49, RB69, Aeh1), as well as Dictyostelium. 146
21594 381598 cd00736 lambda_lys-like Bacteriophage lambda lysozyme and similar proteins. Lysozyme from bacteriophage lambda hydrolyzes the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc), as do other lysozymes. However, unlike other lysozymes, bacteriophage lambda does not produce a reducing end upon cleavage of the peptidoglycan, but rather uses the 6-OH of the same MurNAc residue to produce a 1,6-anhydromuramic acid terminal residue and is therefore a lytic transglycosylase. An identical 1,6-anhydro bond is formed in bacterial peptidoglycans by the action of the lytic transglycosylases of E. coli, though they differ structurally. 141
21595 381599 cd00737 lyz_endolysin_autolysin endolysin and autolysin. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan. 136
21596 238379 cd00738 HGTP_anticodon HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). 94
21597 238380 cd00739 DHPS DHPS subgroup of Pterin binding enzymes. DHPS (dihydropteroate synthase), a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS. Sulfonamide drugs, which are substrate analogs of pABA, target DHPS. 257
21598 238381 cd00740 MeTr MeTr subgroup of pterin binding enzymes. This family includes cobalamin-dependent methyltransferases such as methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH). Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl-cob(III)amide intermediate. These include MeTr, a functional heterodimer, and the folate binding domain of MetH. 252
21599 238382 cd00741 Lipase Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 153
21600 381183 cd00742 FABP intracellular fatty acid-binding protein family. Members of this family are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner. They protect and shuttle fatty acids within the cell and are involved in acquisition and removal of fatty acids from intracellular sites. They include cellular retinol-binding proteins (CRBPs) which participate in the cellular uptake of vitamin A in the form of free retinol, cellular retinoic acid-binding proteins (CRABPs) which participate in the metabolism of vitamin A and retinoic acid, and bind all trans retinoic acid, but not retinol, and FABPs similar to FABP3 which plays an important role in fatty acid transportation, cell growth, cell signaling, and gene transcription. 129
21601 381184 cd00743 lipocalin_RBP_like retinol-binding protein 4 and similar proteins. Retinol-Binding Protein 4 (RBP4) is a plasma protein that transports retinol (vitamin A) from the liver stores to the peripheral tissues. The RBP4-retinol complex interacts with transthyretin (TTR - transports thyroxine and retinol) which protects it from renal excretion. In addition to retinol, other endogenous and synthetic retinoids bind RBP4, including all-trans and 13-cis retinoic acid, retinyl acetate, N-(ethyl)retinamide, and fenretinide. This group also includes purpurin, a retinol-specific protein that plays a role in neural retina cell adhesion during development of the chicken retina; it also binds retinol and may participate in retinol transport in the retina. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 171
21602 238383 cd00751 thiolase Thiolases are ubiquitous enzymes that catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. They are found in prokaryotes and eukaryotes (cytosol, microbodies and mitochondria). There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways. 386
21603 340452 cd00754 Ubl_MoaD ubiquitin-like (Ubl) domain found in molybdenum cofactor biosynthesis protein D (MoaD) and similar proteins. MoaD, also termed molybdopterin synthase sulfur carrier subunit, or MPT synthase subunit 1, or MPT synthase small subunit, or molybdopterin-converting factor small subunit, or molybdopterin-converting factor subunit 1, is a conserved small sulfur carrier protein that has beta-grasp ubiquitin-like (Ubl) fold involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor of a diverse group of redox enzymes. MoaD is activated in an ATP-dependent manner by sulfurtransferases similar to the activation mechanism of ubiquitin-activating enzyme E1. 79
21604 238384 cd00755 YgdL_like Family of activating enzymes (E1) of ubiquitin-like proteins related to the E. coli hypothetical protein ygdL. The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown. 231
21605 238385 cd00756 MoaE MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits. 124
21606 238386 cd00757 ThiF_MoeB_HesA_family ThiF_MoeB_HesA. Family of E1-like enzymes involved in molybdopterin and thiamine biosynthesis. The common reaction mechanism catalyzed by MoeB and ThiF, like other E1 enzymes, begins with a nucleophilic attack of the C-terminal carboxylate of MoaD and ThiS, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS. MoeB, as the MPT synthase (MoaE/MoaD complex) sulfurase, is involved in the biosynthesis of the molybdenum cofactor, a derivative of the tricyclic pterin, molybdopterin (MPT). ThiF catalyzes the adenylation of ThiS, as part of the biosynthesis pathway of thiamin pyrophosphate (vitamin B1). 228
21607 238387 cd00758 MoCF_BD MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin. 133
21608 132997 cd00761 Glyco_tranf_GTA_type Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold. Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities. 156
21609 133442 cd00762 NAD_bind_malic_enz NAD(P) binding domain of malic enzyme. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+. ME has been found in all organisms and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 254
21610 238388 cd00763 Bacterial_PFK Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include bacterial ATP-dependent phosphofructokinases. These are allosterically regulated homotetramers; the subunits are of about 320 amino acids. 317
21611 238389 cd00764 Eukaryotic_PFK Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include eukaryotic ATP-dependent phosphofructokinases. These have evolved from the bacterial PFKs by gene duplication and fusion events and exhibit complex allosteric behavior. 762
21612 238390 cd00765 Pyrophosphate_PFK Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include pyrophosphate-dependent phosphofructokinases. These are found in bacteria as well as plants. These may be dimeric nonallosteric enzymes as in bacteria or allosteric heterotetramers as in plants. 550
21613 238391 cd00768 class_II_aaRS-like_core Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ. 211
21614 238392 cd00769 PheRS_beta_core Phenylalanyl-tRNA synthetase (PheRS) beta chain core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA, PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. PheRS is an alpha-2/beta-2 tetramer. While the alpha chain contains a catalytic core domain, the beta chain has a non-catalytic core domain. 198
21615 238393 cd00770 SerRS_core Seryl-tRNA synthetase (SerRS) class II core catalytic domain. SerRS is responsible for the attachment of serine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. SerRS is a homodimer. 297
21616 238394 cd00771 ThrRS_core Threonyl-tRNA synthetase (ThrRS) class II core catalytic domain. ThrRS is a homodimer. It is responsible for the attachment of threonine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. 298
21617 238395 cd00772 ProRS_core Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. 264
21618 238396 cd00773 HisRS-like_core Class II Histidinyl-tRNA synthetase (HisRS)-like catalytic core domain. HisRS is a homodimer. It is responsible for the attachment of histidine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase and at the N-terminus of the ATP phosphoribosyltransferase accessory subunit, HisZ. HisZ along with HisG catalyze the first reaction in histidine biosynthesis. HisZ is found only in a subset of bacteria and differs from HisRS in lacking a C-terminal anti-codon binding domain. 261
21619 238397 cd00774 GlyRS-like_core Glycyl-tRNA synthetase (GlyRS)-like class II core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP binding and hydrolysis. This alignment contains only sequences from the GlyRS form which homodimerizes. The heterotetramer glyQ is in a different family of class II aaRS. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. This domain is also found at the N-terminus of the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). Pol gamma b stimulates processive DNA synthesis and is functional as a homodimer, which can associate with the catalytic subunit Pol gamma alpha to form a heterotrimer. Despite significant structural and sequence similarity with GlyRS, Pol gamma b lacks conservation of several class II functional residues. 254
21620 238398 cd00775 LysRS_core Lys_tRNA synthetase (LysRS) class II core domain. Class II LysRS is a dimer which attaches a lysine to the 3' OH group of ribose of the appropriate tRNA. Its assignment to class II aaRS is based upon its structure and the presence of three characteristic sequence motifs in the core domain. It is found in eukaryotes as well as some prokaryotes and archaea. However, LysRS belongs to class I aaRS's in some prokaryotes and archaea. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. 329
21621 238399 cd00776 AsxRS_core Asx tRNA synthetase (AspRS/AsnRS) class II core domain. Assignment to class II aminoacyl-tRNA synthetases (aaRS) is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This family includes AsnRS as well as a subgroup of AspRS. AsnRS and AspRS are homodimers, which attach either asparagine or aspartate to the 3' OH group of ribose of the appropriate tRNA. While archaea lack AsnRS, they possess a non-discriminating AspRS, which can mischarge tRNA(Asn) with Asp. Subsequently, a tRNA-dependent aspartate amidotransferase converts the bound aspartate to asparagine. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. 322
21622 238400 cd00777 AspRS_core Asp tRNA synthetase (aspRS) class II core domain. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. AspRS is a homodimer, which attaches a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. AspRS in this family differ from those found in the AsxRS family by a GAD insert in the core domain. 280
21623 238401 cd00778 ProRS_core_arch_euk Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from archaea, the cytoplasm of eukaryotes and some bacteria. 261
21624 238402 cd00779 ProRS_core_prok Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from prokaryotes and from the mitochondria of eukaryotes. 255
21625 238403 cd00780 NTF2 Nuclear transport factor 2 (NTF2) domain plays an important role in the trafficking of macromolecules, ions and small molecules between the cytoplasm and nucleus. This bi-directional transport of macromolecules across the nuclear envelope requires many soluble factors that includes GDP-binding protein Ran (RanGDP). RanGDP is required for both import and export of proteins and poly(A) RNA. RanGDP also has been implicated in cell cycle control, specifically in mitotic spindle assembly. In interphase cells, RanGDP is predominately nuclear and thought to be GTP bound, but it is also present in the cytoplasm, probably in the GDP-bound state. NTF2 mediates the nuclear import of RanGDP. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. 119
21626 238404 cd00781 ketosteroid_isomerase ketosteroid isomerase: Many biological reactions proceed by enzymatic cleavage of a C-H bond adjacent to a carbonyl or a carboxyl group, leading to an enol or an enolate intermediate that is subsequently re-protonated at the same or an adjacent carbon. Ketosteroid isomerases are important members of this class of enzymes; they are among the most proficient of all enzymes known and have served as a paradigm for enzymatic enolizations since their discovery in 1954. This CD includes members of this class that catalyze the isomerization of various beta,gamma-unsaturated isomers at nearly a diffusion-controlled rate. These enzymes are widely distributed in bacteria. 122
21627 238405 cd00782 MutL_Trans MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to human MLH1, hPMS2, hPMS1, hMLH3 and E. coli MutL. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome. Mutation in hMLH1 accounts for a large fraction of HNPCC families. There is no convincing evidence to support hPMS1 having a role in HNPCC predisposition. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC. It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP. The MutS(ATP)-MutL ternary complex thus formed then recruits the latent endonuclease MutH. 122
21628 238406 cd00786 cytidine_deaminase-like Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. 96
21629 238407 cd00788 KU70 Ku-core domain, Ku70 subfamily; Ku70 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in the nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity. 287
21630 238408 cd00789 KU_like Ku-core domain, Ku-like subfamily; composed of prokaryotic homologs of the eukaryotic DNA binding protein Ku. The alignment includes the core domain shared by the prokaryotic YkoV-like proteins and the eukaryotic Ku70 and Ku80. The prokaryotic Ku homologs are predicted to form homodimers. It is proposed that the Ku homologs are functionally associated with ATP-dependent DNA ligase and the eukaryotic-type primase, probably as components of a double-strand break repair system. 256
21631 238409 cd00794 NOS_oxygenase_prok Nitric oxide synthase (NOS) prokaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. Nitric oxide synthases are homodimers. Most prokaryotes produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS. However, a few prokaryotes also have a NOS, consisting solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate. 353
21632 238410 cd00795 NOS_oxygenase_euk Nitric oxide synthase (NOS) eukaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain, which binds to the substrate L-Arg, zinc, and to the cofactors heme and 5,6,7,8-(6R)-tetrahydrobiopterin (BH4). Eukaryotic NOS's also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADPH, FAD and FMN. 412
21633 271177 cd00796 INT_Rci_Hp1_C Shufflon-specific DNA recombinase Rci and Bacteriophage Hp1_like integrase, C-terminal catalytic domain. Rci protein is a tyrosine recombinase specifically involved in Shufflon type of DNA rearrangement in bacteria. The shufflon of plasmid R64 consists of four invertible DNA segments which are separated and flanked by seven 19-bp repeat sequences. RCI recombinase facilitates site-specific recombination between any inverted repeats, which results in an inversion of the DNA segment(s) either independently or in groups. HP1 integrase promotes site-specific recombination of the HP1 genome into that of Haemophilus influenzae. Bacteriophage Hp1_like integrases are tyrosine based site specific recombinases. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 162
21634 271178 cd00797 INT_RitB_C_like C-terminal catalytic domain of recombinase RitB, a component of the recombinase trio. Recombinases belonging to the RitA (also known as pAE1 due to its presence in the deletion prone region of plasmid pAE1 of Alcaligenes eutrophus H1), RitB, and RitC families are associated in a complex referred to as a Recombinase in Trio (RIT) element. These RIT elements consist of three adjacent and unidirectional overlapping genes, one from each family (ritABC in order of transcription). All three integrases contain a catalytic motif, suggesting that they are all active enzymes. However, their specific roles are not yet fully understood. All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. 198
21635 271179 cd00798 INT_XerDC_C XerD and XerC integrases, C-terminal catalytic domains. XerDC-like integrases are involved in the site-specific integration and excision of lysogenic bacteriophage genomes, transposition of conjugative transposons, termination of chromosomal replication, and stable plasmid inheritance. They share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism with the DNA breaking-rejoining enzyme superfamily. In Escherichia coli, the Xer site-specific recombination system acts to convert dimeric chromosomes, which are formed by homologous recombination, to monomers. Two related recombinases, XerC and XerD, bind cooperatively to a recombination site present in the E. coli chromosome. Each recombinase catalyzes the exchange of one pair of DNA strands in a reaction that proceeds through a Holliday junction intermediate. These enzymes can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves, and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites. 172
21636 271180 cd00799 INT_Cre_C C-terminal catalytic domain of Cre recombinase (also called integrase). Cre-like recombinases are tyrosine based site specific recombinases. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The bacteriophage P1 Cre recombinase maintains the circular phage replicon in a monomeric state by catalyzing a site-specific recombination between two loxP sites. The catalytic core domain of Cre recombinase is linked to a more divergent helical N-terminal domain, which interacts primarily with the DNA major groove proximal to the crossover region. 188
21637 271181 cd00800 INT_Lambda_C C-terminal catalytic domain of Lambda integrase, a tyrosine-based site-specific recombinase. Lambda-type integrases catalyze site-specific integration and excision of temperate bacteriophages and other mobile genetic elements to and from the bacterial host chromosome. They are tyrosine-based site-specific recombinases and belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The phage lambda integrase can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites. 161
21638 271182 cd00801 INT_P4_C Bacteriophage P4 integrase, C-terminal catalytic domain. P4-like integrases are found in temperate bacteriophages, integrative plasmids, pathogenicity and symbiosis islands, and other mobile genetic elements. The P4 integrase mediates integrative and excisive site-specific recombination between two sites, called attachment sites, located on the phage genome and the bacterial chromosome. The phage attachment site is often found adjacent to the integrase gene, while the host attachment sites are typically situated near tRNA genes. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 180
21639 173901 cd00802 class_I_aaRS_core catalytic core domain of class I amino acyl-tRNA synthetase. Class I amino acyl-tRNA synthetase (aaRS) catalytic core domain. These enzymes are mostly monomers which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 143
21640 173902 cd00805 TyrRS_core catalytic core domain of tyrosinyl-tRNA synthetase. Tyrosinyl-tRNA synthetase (TyrRS) catalytic core domain. TyrRS is a homodimer which attaches Tyr to the appropriate tRNA. TyrRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 269
21641 173903 cd00806 TrpRS_core catalytic core domain of tryptophanyl-tRNA synthetase. Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. TrpRS is a homodimer which attaches Trp to the appropriate tRNA. TrpRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 280
21642 185676 cd00807 GlnRS_core catalytic core domain of glutaminyl-tRNA synthetase. Glutaminyl-tRNA synthetase (GlnRS) catalytic core domain. These enzymes attach Gln to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. GlnRS contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea and most bacteria lack GlnRS. In these organisms, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. 238
21643 173905 cd00808 GluRS_core catalytic core domain of discriminating glutamyl-tRNA synthetase. Discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. The discriminating form of GluRS is only found in bacteria and cellular organelles. GluRS is a monomer that attaches Glu to the appropriate tRNA. Like other class I tRNA synthetases, GluRS aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 239
21644 173906 cd00812 LeuRS_core catalytic core domain of leucyl-tRNA synthetases. Leucyl tRNA synthetase (LeuRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. In Aquifex aeolicus, the gene encoding LeuRS is split in two, just before the KMSKS motif. Consequently, LeuRS is a heterodimer, which likely superimposes with the LeuRS monomer found in most other organisms. LeuRS has an insertion in the core domain, which is subject to both deletions and rearrangements and thus differs between prokaryotic LeuRS and archaeal/eukaryotic LeuRS. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 314
21645 173907 cd00814 MetRS_core catalytic core domain of methioninyl-tRNA synthetases. Methionine tRNA synthetase (MetRS) catalytic core domain. This class I enzyme aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. MetRS, which consists of the core domain and an anti-codon binding domain, functions as a monomer. However, in some species the anti-codon binding domain is followed by an EMAP domain. In this case, MetRS functions as a homodimer. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. As a result of a deletion event, MetRS has a significantly shorter core domain insertion than IleRS, ValRS, and LeuRS. Consequently, the MetRS insertion lacks the editing function. 319
21646 185677 cd00817 ValRS_core catalytic core domain of valyl-tRNA synthetases. Valine amino-acyl tRNA synthetase (ValRS) catalytic core domain. This enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. ValRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 382
21647 173909 cd00818 IleRS_core catalytic core domain of isoleucyl-tRNA synthetases. Isoleucine amino-acyl tRNA synthetases (IleRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' of the appropriate tRNA. The core domain is based on the Rossman fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. IleRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 338
21648 238417 cd00819 PEPCK_GTP Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the GTP-dependent group. 579
21649 238418 cd00820 PEPCK_HprK Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP- and GTP-dependent groups). HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions. 107
21650 275388 cd00821 PH Pleckstrin homology (PH) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 92
21651 238419 cd00822 TopoII_Trans_DNA_gyrase TopoIIA_Trans_DNA_gyrase: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to the B subunits of E. coli DNA gyrase and E. coli Topoisomerase IV which are heterodimers composed of two subunits. The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes. All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/decatenate duplex rings. E. coli DNA gyrase is a heterodimer composed of two subunits. E. coli DNA gyrase B subunit is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. 172
21652 238420 cd00823 TopoIIB_Trans TopoIIB_Trans: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIB family of DNA topoisomerases similar to Sulfolobus shibatae topoisomerase VI (topoVI). The sole representative of the Type IIB family is topo VI. Topo VI enzymes are heterotetramers found in archaea and plants. S. shibatae topoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. 151
21653 238421 cd00825 decarbox_cond_enzymes decarboxylating condensing enzymes; Family of enzymes that catalyze the formation of a new carbon-carbon bond by a decarboxylating Claisen-like condensation reaction. Members are involved in the synthesis of fatty acids and polyketides, a diverse group of natural products. Both pathways are an iterative series of additions of small carbon units, usually acetate, to a nascent acyl group. There are 2 classes of decarboxylating condensing enzymes, which can be distinguished by sequence similarity, type of active site residues and type of primer units (acetyl CoA or acyl carrier protein (ACP) linked units). 332
21654 238422 cd00826 nondecarbox_cond_enzymes nondecarboxylating condensing enzymes; In general, thiolases catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways. 393
21655 238423 cd00827 init_cond_enzymes "initiating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type III and polyketide synthases, type III, which include chalcone synthase and related enzymes. They are characterized by the utilization of CoA substrate primers, as well as the nature of their active site residues. 324
21656 238424 cd00828 elong_cond_enzymes "elongating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type I and II and polyketide synthases. They are characterized by the utilization of acyl carrier protein (ACP) thioesters as primer substrates, as well as the nature of their active site residues. 407
21657 238425 cd00829 SCP-x_thiolase Thiolase domain associated with sterol carrier protein (SCP)-x isoform and related proteins; SCP-2 has multiple roles in intracellular lipid circulation and metabolism. The N-terminal presequence in the SCP-x isoform represents a peroxisomal 3-ketoacyl-CoA thiolase specific for branched-chain acyl CoAs, which is proteolytically cleaved from the sterol carrier protein. 375
21658 238426 cd00830 KAS_III Ketoacyl-acyl carrier protein synthase III (KASIII) initiates the elongation in type II fatty acid synthase systems. It is found in bacteria and plants. Elongation of fatty acids in the type II systems occurs by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP. KASIII initiates this process by specifically using acetyl-CoA over acyl-CoA. 320
21659 238427 cd00831 CHS_like Chalcone and stilbene synthases; plant-specific polyketide synthases (PKS) and related enzymes, also called type III PKSs. PKS generate an array of different products, dependent on the nature of the starter molecule. They share a common chemical strategy: after the starter molecule is loaded onto the active site cysteine, a carboxylative condensation reaction extends the polyketide chain. Plant-specific PKS are dimeric iterative PKSs, using coenzyme A esters to deliver substrate to the active site, but they differ in the choice of starter molecule and the number of condensation reactions. 361
21660 238428 cd00832 CLF Chain-length factor (CLF) is a factor required for polyketide chain initiation of aromatic antibiotic-producing polyketide synthases (PKSs) of filamentous bacteria. CLFs have been shown to have decarboxylase activity towards malonyl-acyl carrier protein (ACP). CLFs are similar to other elongation ketosynthase domains, but their active site cysteine is replaced by a conserved glutamine. 399
21661 238429 cd00833 PKS polyketide synthases (PKSs) polymerize simple fatty acids into a large variety of different products, called polyketides, by successive decarboxylating Claisen condensations. PKSs can be divided into 2 groups, modular type I PKSs consisting of one or more large multifunctional proteins and iterative type II PKSs, complexes of several monofunctional subunits. 421
21662 238430 cd00834 KAS_I_II Beta-ketoacyl-acyl carrier protein (ACP) synthase (KAS), type I and II. KASs are responsible for the elongation steps in fatty acid biosynthesis. KASIII catalyses the initial condensation and KAS I and II catalyze further elongation steps by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP. 406
21663 269907 cd00835 RanBD_family Ran-binding domain. The RanBD is present in RanBP1, RanBP2, RanBP3, Nuc2, and Nuc50. Most of these proteins have a single RanBD, with the exception of RanBP2 which has 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The Ran-binding domain is found in multiple copies in Nuclear pore complex proteins. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The RanBD proteins of the nuclear pore complex (NPC): nucleoporin 1 (NUP1), NUP2, NUP61, and Nuclear Pore complex Protein 9 (npp-9) are present in the parent, but specific models were not made due to lineage. To date there have been no reports of inositol phosphate or phosphoinositide binding by Ran-binding proteins. 118
21664 275389 cd00836 FERM_C-lobe FERM domain C-lobe. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 93
21665 269909 cd00837 EVH1_family EVH1 (Drosophila Enabled (Ena)/Vasodilator-stimulated phosphoprotein (VASP) homology 1) domain. The EVH1 domains are part of the PH domain superfamily. EVH1 subfamilies include Enabled/VASP, Homer/Vesl, WASP, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 103
21666 277317 cd00838 MPP_superfamily metallophosphatase superfamily, metallophosphatase domain. Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 130
21667 277318 cd00839 MPP_PAPs purple acid phosphatases of the metallophosphatase superfamily, metallophosphatase domain. Purple acid phosphatases (PAPs) belong to a diverse family of binuclear metallohydrolases that have been identified and characterized in plants, animals, and fungi. PAPs contain a binuclear metal center and their characteristic pink or purple color derives from a charge-transfer transition between a tyrosine residue and a chromophoric ferric ion within the binuclear center. PAPs catalyze the hydrolysis of a wide range of activated phosphoric acid mono- and di-esters and anhydrides. PAPs are distinguished from the other phosphatases by their insensitivity to L-(+) tartrate inhibition and are therefore also known as tartrate resistant acid phosphatases (TRAPs). While only a few copies of PAP-like genes are present in mammalian and fungal genomes, multiple copies are present in plant genomes. PAPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 296
21668 277319 cd00840 MPP_Mre11_N Mre11 nuclease, N-terminal metallophosphatase domain. Mre11 (also known as SbcD in Escherichia coli) is a subunit of the MRX protein complex. This complex includes: Mre11, Rad50, and Xrs2/Nbs1, and plays a vital role in several nuclear processes including DNA double-strand break repair, telomere length maintenance, cell cycle checkpoint control, and meiotic recombination, in eukaryotes. During double-strand break repair, the MRX complex is required to hold the two ends of a broken chromosome together. In vitro studies show that Mre11 has 3'-5' exonuclease activity on dsDNA templates and endonuclease activity on dsDNA and ssDNA templates. In addition to the N-terminal phosphatase domain, the eukaryotic MRE11 members of this family have a C-terminal DNA binding domain (not included in this alignment model). MRE11-like proteins are found in prokaryotes and archaea as well as in eukaryotes. Mre11 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 186
21669 277320 cd00841 MPP_YfcE Escherichia coli YfcE and related proteins, metallophosphatase domain. YfcE is a manganese-dependent metallophosphatase, found in bacteria and archaea, that cleaves bis-p-nitrophenyl phosphate, thymidine 5'-monophosphate-p-nitrophenyl ester, and p-nitrophenyl phosphorylcholine, but is unable to hydrolyze 2',3' or 3',5' cyclic nucleic phosphodiesters, and various phosphomonoesters, including p-nitrophenyl phosphate. This family also includes the Bacillus subtilis YsnB and Methanococcus jannaschii MJ0936 proteins. This domain family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 156
21670 277321 cd00842 MPP_ASMase acid sphingomyelinase and related proteins, metallophosphatase domain. Acid sphingomyelinase (ASMase) is a ubiquitously expressed phosphodiesterase which hydrolyzes sphingomyelin in acid pH conditions to form ceramide, a bioactive second messenger, as part of the sphingomyelin signaling pathway. ASMase is localized at the noncytosolic leaflet of biomembranes (for example the luminal leaflet of endosomes, lysosomes and phagosomes, and the extracellular leaflet of plasma membranes). ASMase-deficient humans develop Niemann-Pick disease. This disease is characterized by lysosomal storage of sphingomyelin in all tissues. Although ASMase-deficient mice are resistant to stress-induced apoptosis, they have greater susceptibility to bacterial infection. The latter correlates with defective phagolysosomal fusion and antibacterial killing activity in ASMase-deficient macrophages. ASMase belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 294
21671 277322 cd00844 MPP_Dbr1_N Dbr1 RNA lariat debranching enzyme, N-terminal metallophosphatase domain. Dbr1 is an RNA lariat debranching enzyme that hydrolyzes 2'-5' phosphodiester bonds at the branch points of excised intron lariats. This alignment model represents the N-terminal metallophosphatase domain of Dbr1. This domain belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 271
21672 277323 cd00845 MPP_UshA_N_like Escherichia coli UshA-like family, N-terminal metallophosphatase domain. This family includes the bacterial enzyme UshA, and related enzymes including SoxB, CpdB, YhcR, and CD73. All members have a similar domain architecture which includes an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 255
21673 238431 cd00851 MTH1175 This uncharacterized conserved protein belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX, NifB, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea). 103
21674 238432 cd00852 NifB NifB belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme as part of nitrogen fixation in bacteria. This domain is sometimes found fused to an N-terminal domain (the Radical SAM domain) in nifB-like proteins. 106
21675 238433 cd00853 NifX NifX belongs to a family of iron-molybdenum cluster-binding proteins that includes NifB and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. The protein is part of the nitrogen fixation gene cluster in nitrogen-fixing bacteria and has sequence similarity to other members of the cluster. 102
21676 238434 cd00854 NagA N-acetylglucosamine-6-phosphate deacetylase, NagA, catalyzes the hydrolysis of the N-acetyl group of N-acetyl-glucosamine-6-phosphate (GlcNAc-6-P) to glucosamine 6-phosphate and acetate. This is the first committed step in the biosynthetic pathway to amino-sugar-nucleotides, which are needed for cell wall peptidoglycan and teichoic acid biosynthesis. Deacetylation of N-acetylglucosamine is also important in lipopolysaccharide synthesis and cell wall recycling. 374
21677 349487 cd00855 SWIB-MDM2 SWIB/MDM2 domain family. The SWIB/MDM2 protein domain, short for SWI/SNF complex B/MDM2, has been found in both SWI/SNF complex B (SWIB) and the negative regulator of the p53 tumor suppressor MDM2, which are homologous and share a common fold. The SWIB domain is a conserved region found within proteins in the SWI/SNF (SWItch/Sucrose Non-Fermentable) family of complexes. SWI/SNF complex proteins display helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). MDM2 is an inhibitor of the p53 tumor suppressor. It binds to the transactivation domain and down-regulates the ability of p53 to activate transcription. This family corresponds to the SWIB domain and the p53 binding domain of MDM2. 69
21678 238435 cd00858 GlyRS_anticodon GlyRS Glycyl-anticodon binding domain. GlyRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 121
21679 238436 cd00859 HisRS_anticodon HisRS Histidyl-anticodon binding domain. HisRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 91
21680 238437 cd00860 ThrRS_anticodon ThrRS Threonyl-anticodon binding domain. ThrRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 91
21681 238438 cd00861 ProRS_anticodon_short ProRS Prolyl-anticodon binding domain, short version found predominantly in bacteria. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 94
21682 238439 cd00862 ProRS_anticodon_zinc ProRS Prolyl-anticodon binding domain, long version found predominantly in eukaryotes and archaea. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only, and an additional C-terminal zinc-binding domain specific to this subfamily of aaRSs. 202
21683 238440 cd00864 PI3Ka Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture. 152
21684 176643 cd00865 PEBP_bact_arch PhosphatidylEthanolamine-Binding Protein (PEBP) domain present in bacteria and archaea. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). The members in this subgroup are present in bacteria and archaea. Members here include Escherichia coli YBHB and YBCL which are thought to regulate protein phosphorylation as well as Sulfolobus solfataricus SsCEI which inhibits serine proteases alpha-chymotrypsin and elastase. Although their overall structures are similar, the members of the PEBP family have very different substrates and oligomerization states (monomer/dimer/tetramer). In a few of the bacterial members present here the dimerization interface is proposed to form the ligand binding site, unlike in other PEBP members. 150
21685 176644 cd00866 PEBP_euk PhosphatidylEthanolamine-Binding Protein (PEBP) domain present in eukaryotes. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). The members in this subgroup are present in eukaryotes. Members here include those in plants such as Arabidopsis thaliana FLOWERING LOCUS T (FT) and TERMINAL FLOWER1 (TFL1), which function as a promoter and a repressor of the floral transition, respectively, as well as the mammalian Raf kinase inhibitory protein (RKIP), which inhibits MAP kinase (Raf-MEK-ERK), G protein-coupled receptor (GPCR) kinase and NFkappaB signaling cascades. Although their overall structures are similar, the members of the PEBP family have very different substrates and oligomerization states (monomer/dimer/tetramer). 154
21686 173836 cd00867 Trans_IPPS Trans-Isoprenyl Diphosphate Synthases. Trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) of class 1 isoprenoid biosynthesis enzymes which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, diterpenes, ubiquinone, and archaeal ether linked lipids; and are widely distributed among archaea, bacteria, and eukarya. The enzymes in this family share the same 'isoprenoid synthase fold' and include the head-to-tail (HT) IPPS which catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, cis-IPPS are not included in this CD. 236
21687 173837 cd00868 Terpene_cyclase_C1 Terpene cyclases, Class 1. Terpene cyclases, Class 1 (C1) of the class 1 family of isoprenoid biosynthesis enzymes, which share the 'isoprenoid synthase fold' and convert linear, all-trans, isoprenoids, geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate into numerous cyclic forms of monoterpenes, diterpenes, and sesquiterpenes. Also included in this CD are the cis-trans terpene cyclases such as trichodiene synthase. The class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD. Taxonomic distribution includes bacteria, fungi and plants. 284
21688 238441 cd00869 PI3Ka_II Phosphoinositide 3-kinase (PI3K) class II, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, class II PI3-kinases phosphorylate phosphoinositol (PtdIns), PtdIns(4)-phosphate, but not PtdIns(4,5)-bisphosphate. They are larger, having a C2 domain at the C-terminus. 169
21689 238442 cd00870 PI3Ka_III Phosphoinositide 3-kinase (PI3K) class III, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, PI3Ks class III phosphorylate phosphoinositol (PtdIns) only. The prototypical PI3K class III, yeast Vps34, is involved in trafficking proteins from Golgi to the vacuole. 166
21690 238443 cd00871 PI4Ka Phosphoinositide 4-kinase (PI4K), accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. PI4K phosphorylates the hydroxyl group at position 4 on the inositol ring of phosphoinositide, the first committed step in the phosphatidylinositol cycle. 175
21691 238444 cd00872 PI3Ka_I Phosphoinositide 3-kinase (PI3K) class I, accessory domain; PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, PI3K class I prefer phosphoinositol (4,5)-bisphosphate as a substrate. Mammalian members interact with active Ras. They form heterodimers with adapter molecules linking them to different signaling pathways. 171
21692 238445 cd00873 KU80 Ku-core domain, Ku80 subfamily; Ku80 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity. 300
21693 238446 cd00874 RNA_Cyclase_Class_II RNA 3' phosphate cyclase domain (class II). These proteins function as RNA cyclase to catalyze the ATP-dependent conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of an RNA molecule. A conserved catalytic histidine residue is found in all members of this subfamily. 326
21694 238447 cd00875 RNA_Cyclase_Class_I RNA 3' phosphate cyclase domain (class I). This subfamily of cyclase-like proteins is encoded in eukaryotic genomes. They lack a conserved catalytic histidine residue required for cyclase activity, so probably do not function as cyclases. They are believed to play a role in ribosomal RNA processing and assembly. 341
21695 206642 cd00876 Ras Rat sarcoma (Ras) family of small guanosine triphosphatases (GTPases). The Ras family of the Ras superfamily includes classical N-Ras, H-Ras, and K-Ras, as well as R-Ras, Rap, Ral, Rheb, Rhes, ARHI, RERG, Rin/Rit, RSR1, RRP22, Ras2, Ras-dva, and RGK proteins. Ras proteins regulate cell growth, proliferation and differentiation. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterized are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 160
21696 206643 cd00877 Ran Ras-related nuclear proteins (Ran)/TC4 family of small GTPases. Ran GTPase is involved in diverse biological functions, such as nuclear transport, spindle formation during mitosis, DNA replication, and cell division. Among the Ras superfamily, Ran is a unique small G protein. It does not have a lipid modification motif at the C-terminus to bind to the membrane, which is often observed within the Ras superfamily. Ran may therefore interact with a wide range of proteins in various intracellular locations. Like other GTPases, Ran exists in GTP- and GDP-bound conformations that interact differently with effectors. Conversion between these forms and the assembly or disassembly of effector complexes requires the interaction of regulator proteins. The intrinsic GTPase activity of Ran is very low, but it is greatly stimulated by a GTPase-activating protein (RanGAP1) located in the cytoplasm. By contrast, RCC1, a guanine nucleotide exchange factor that generates RanGTP, is bound to chromatin and confined to the nucleus. Ran itself is mobile and is actively imported into the nucleus by a mechanism involving NTF-2. Together with the compartmentalization of its regulators, this is thought to produce a relatively high concentration of RanGTP in the nucleus. 166
21697 206644 cd00878 Arf_Arl ADP-ribosylation factor(Arf)/Arf-like (Arl) small GTPases. Arf (ADP-ribosylation factor)/Arl (Arf-like) small GTPases. Arf proteins are activators of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. Arfs are N-terminally myristoylated. Members of the Arf family are regulators of vesicle formation in intracellular traffic that interact reversibly with membranes of the secretory and endocytic compartments in a GTP-dependent manner. They depart from other small GTP-binding proteins by a unique structural device, interswitch toggle, that implements front-back communication from N-terminus to the nucleotide binding site. Arf-like (Arl) proteins are close relatives of the Arf, but only Arl1 has been shown to function in membrane traffic like the Arf proteins. Arl2 has an unrelated function in the folding of native tubulin, and Arl4 may function in the nucleus. Most other Arf family proteins are so far relatively poorly characterized. Thus, despite their significant sequence homologies, Arf family proteins may regulate unrelated functions. 158
21698 206645 cd00879 Sar1 Sar1 is an essential component of COPII vesicle coats. Sar1 is an essential component of COPII vesicle coats involved in export of cargo from the ER. The GTPase activity of Sar1 functions as a molecular switch to control protein-protein and protein-lipid interactions that direct vesicle budding from the ER. Activation of the GDP to the GTP-bound form of Sar1 involves the membrane-associated guanine nucleotide exchange factor (GEF) Sec12. Sar1 is unlike all Ras superfamily GTPases that use either myristoyl or prenyl groups to direct membrane association and function, in that Sar1 lacks such modification. Instead, Sar1 contains a unique nine-amino-acid N-terminal extension. This extension contains an evolutionarily conserved cluster of bulky hydrophobic amino acids, referred to as the Sar1-N-terminal activation recruitment (STAR) motif. The STAR motif mediates the recruitment of Sar1 to ER membranes and facilitates its interaction with mammalian Sec12 GEF leading to activation. 191
21699 206646 cd00880 Era_like E. coli Ras-like protein (Era)-like GTPase. The Era (E. coli Ras-like protein)-like family includes several distinct subfamilies (TrmE/ThdF, FeoB, YihA (EngB), Era, and EngA/YfgK) that generally show sequence conservation in the region between the Walker A and B motifs (G1 and G3 box motifs), to the exclusion of other GTPases. TrmE is ubiquitous in bacteria and is a widespread mitochondrial protein in eukaryotes, but is absent from archaea. The yeast member of TrmE family, MSS1, is involved in mitochondrial translation; bacterial members are often present in translation-related operons. FeoB represents an unusual adaptation of GTPases for high-affinity iron (II) transport. YihA (EngB) family of GTPases is typified by the E. coli YihA, which is an essential protein involved in cell division control. Era is characterized by a distinct derivative of the KH domain (the pseudo-KH domain) which is located C-terminal to the GTPase domain. EngA and its orthologs are composed of two GTPase domains and, since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. 161
21700 206647 cd00881 GTP_translation_factor GTP translation factor family primarily contains translation initiation, elongation and release factors. The GTP translation factor family consists primarily of translation initiation, elongation, and release factors, which play specific roles in protein translation. In addition, the family includes Snu114p, a component of the U5 small nuclear riboprotein particle which is a component of the spliceosome and is involved in excision of introns, TetM, a tetracycline resistance gene that protects the ribosome from tetracycline binding, and the unusual subfamily CysN/ATPS, which has an unrelated function (ATP sulfurylase) acquired through lateral transfer of the EF1-alpha gene and development of a new function. 183
21701 206648 cd00882 Ras_like_GTPase Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases). Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions. 161
21702 238448 cd00883 beta_CA_cladeA Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 182
21703 238449 cd00884 beta_CA_cladeB Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 190
21704 238450 cd00885 cinA Competence-damaged protein. CinA is the first gene in the competence-inducible (cin) operon and is thought to be specifically required at some stage in the process of transformation. This domain is closely related to a domain, found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, where the domain is presumed to bind molybdopterin. 170
21705 238451 cd00886 MogA_MoaB MogA_MoaB family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF) an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea, and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT). MogA, together with MoeA, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein. The mammalian homolog gephyrin is a MogA-MoeA fusion protein, that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes. In contrast, MoaB shows high similarity to MogA, but little is known about its physiological role. All well studied members of this family form highly stable trimers. 152
21706 238452 cd00887 MoeA MoeA family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF), an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT). MoeA, together with MoaB, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein. The mammalian homolog gephyrin is a MogA-MoeA fusion protein, that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes. 394
21707 238453 cd00890 Prefoldin Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils. 129
21708 270624 cd00891 PI3Kc Catalytic domain of Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class II PI3Ks comprise three catalytic isoforms that do not associate with any regulatory subunits. They selectively use PtdIns as a substrate to produce PtdIns(3)P. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 334
21709 270625 cd00892 PIKKc_ATR Catalytic domain of Ataxia telangiectasia and Rad3-related proteins. ATR is also referred to as Mei-41 (Drosophila), Esr1/Mec1p (Saccharomyces cerevisiae), Rad3 (Schizosaccharomyces pombe), and FRAP-related protein (human). ATR contains a UME domain of unknown function, a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. Together with its downstream effector kinase, Chk1, ATR plays a central role in regulating the replication checkpoint. ATR stabilizes replication forks by promoting the association of DNA polymerases with the fork. Preventing fork collapse is essential in preserving genomic integrity. ATR also plays a role in normal cell growth and in response to DNA damage. ATR is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The ATR catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 237
21710 270626 cd00893 PI4Kc_III Catalytic domain of Type III Phosphoinositide 4-kinase. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. There are two types of PI4Ks, types II and III. Type II PI4Ks lack the characteristic catalytic kinase domain present in PI3Ks and type III PI4Ks, and are excluded from this family. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 286
21711 270627 cd00894 PI3Kc_IB_gamma Catalytic domain of Class IB Phosphoinositide 3-kinase gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kgamma signaling controls diverse immune and vascular functions including cell recruitment, mast cell activation, platelet aggregation, and smooth muscle contractility. It associates with one of two regulatory subunits, p101 and p84, and is activated by G-protein-coupled receptors (GPCRs) by direct binding to their betagamma subunits. It contains an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 367
21712 119421 cd00895 PI3Kc_C2_beta Catalytic domain of Class II Phosphoinositide 3-kinase beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II beta isoform, PI3K-C2beta, contributes to the migration and survival of cancer cells. It regulates Rac activity and impacts membrane ruffling, cell motility, and cadherin-mediated cell-cell adhesion. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 354
21713 270628 cd00896 PI3Kc_III Catalytic domain of Class III Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. Class III PI3Ks, also called Vps34 (vacuolar protein sorting 34), contain an N-terminal lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They phosphorylate only the substrate PtdIns. They interact with a regulatory subunit, Vps15, to form a membrane-associated complex. Class III PI3Ks are involved in protein and vesicular trafficking and sorting, autophagy, trimeric G-protein signaling, and phagocytosis. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 346
21714 132998 cd00897 UGPase_euk Eukaryotic UGPase catalyses the synthesis of UDP-Glucose. UGPase (UDP-Glucose Pyrophosphorylase) catalyzes the reversible production of UDP-Glucose and pyrophosphate (PPi) from Glucose-1-phosphate and UTP. UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UGPase is found in both prokaryotes and eukaryotes. Interestingly, while the prokaryotic and eukaryotic forms of UGPase catalyze the same reaction, they share low sequence similarity. This family consists of mainly eukaryotic UTP-glucose-1-phosphate uridylyltransferases. 300
21715 132999 cd00899 b4GalT Beta-4-Galactosyltransferase is involved in the formation of the poly-N-acetyllactosamine core structures present in glycoproteins and glycosphingolipids. Beta-4-Galactosyltransferase transfers galactose from uridine diphosphogalactose to the terminal beta-N-acetylglucosamine residues, hereby forming the poly-N-acetyllactosamine core structures present in glycoproteins and glycosphingolipids. At least seven homologous beta-4-galactosyltransferase isoforms have been identified that use different types of glycoproteins and glycolipids as substrates. Of the seven identified members of the beta-1,4-galactosyltransferase subfamily (beta1,4-Gal-T1 to -T7), b1,4-Gal-T1 is most characterized (biochemically). It is a Golgi-resident type II membrane enzyme with a cytoplasmic domain, membrane spanning region, and a stem region and catalytic domain facing the lumen. 219
21716 275390 cd00900 PH-like Pleckstrin homology-like domain. The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins. 89
21717 153098 cd00904 Ferritin Ferritin iron storage proteins. Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. Bacterial non-heme ferritins are composed only of H chains. Fe(II) oxidation in the H/M subunits takes place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues. 160
21718 153099 cd00907 Bacterioferritin Bacterioferritin, ferritin-like diiron-binding domain. Bacterioferritins, also known as cytochrome b1, are members of a broad superfamily of ferritin-like diiron-carboxylate proteins. Similar to ferritin in architecture, Bfr forms an oligomer of 24 subunits that assembles to form a hollow sphere with 432 symmetry. Up to 12 heme cofactor groups (iron protoporphyrin IX or coproporphyrin III) are bound between dimer pairs. The role of the heme is unknown, although it may be involved in mediating iron-core reduction and iron release. Each subunit is composed of a four-helix bundle which carries a diiron ferroxidase center; it is here that initial oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of ferric-hydroxyphosphate at the core. Some bacterioferritins are composed of two subunit types, one conferring heme-binding ability (alpha) and the other (beta) bestowing ferroxidase activity. 153
21719 238454 cd00912 ML The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids. 127
21720 238455 cd00913 PCD_DCoH_subfamily_a PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). 76
21721 238456 cd00914 PCD_DCoH_subfamily_b PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas. 76
21722 238457 cd00915 MD-1_MD-2 MD-1 and MD-2 are cofactors required for LPS signaling through cell surface receptors. MD-2 and its binding partner, Toll-like receptor 4 (TLR4), are essential for the innate immune responses of mammalian cells to bacterial lipopolysaccharide (LPS); MD-2 directly binds the lipid A moiety of LPS. The TLR4-like receptor, RP105, which mediates LPS-induced lymphocyte proliferation, interacts with MD-1; MD-1 enhances RP105-mediated LPS-induced growth of B cells. These proteins belong to the ML domain family. 130
21723 238458 cd00916 Npc2_like Niemann-Pick type C2 (Npc2) is a lysosomal protein in which a mutation in the gene causes a rare form of Niemann-Pick type C disease, an autosomal recessive lipid storage disorder characterized by accumulation of low-density lipoprotein-derived cholesterol in lysosomes. Although Npc2 is known to bind cholesterol, the function of this protein is unknown. These proteins belong to the ML domain family. 123
21724 238459 cd00917 PG-PI_TP The phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP) has been shown to bind phosphatidylglycerol and phosphatidylinositol, but the biological significance of this is still obscure. These proteins belong to the ML domain family. 122
21725 238460 cd00918 Der-p2_like Several group 2 allergen proteins belong to the ML domain family. They include Dermatophagoides pteronyssinus, group 2 (Der p 2) and D. farinae, group 2 (Der f 2) allergens. These house dust mites cause heavy atopic diseases such as asthma and dermatitis. Although the allergenic properties of these proteins have been well characterized, their biological function in mites is unknown. 120
21726 238461 cd00919 Heme_Cu_Oxidase_I Heme-copper oxidase subunit I. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. Membership in the superfamily is defined by subunit I, which contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center. Only subunit I is common to the entire superfamily. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from the electron donor on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I of cytochrome c oxidase (CcO) and ubiquinol oxidase. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electron transfer occurs in two segments: from the electron donor to the low-spin heme, and from the low-spin heme to the binuclear center. The first segment can be a multi-step process and varies among the different families, while the second segment, a direct transfer, is consistent throughout the superfamily. 463
21727 259860 cd00920 Cupredoxin Cupredoxin superfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins, having an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. The majority of family members contain multiple cupredoxin domain repeats: ceruloplasmin and the coagulation factors V/VIII have six repeats; laccase, ascorbate oxidase, spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase, and the periplasmic domain of cytochrome c oxidase subunit II. 110
21728 238462 cd00922 Cyt_c_Oxidase_IV Cytochrome c oxidase subunit IV. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit IV is the largest of the nuclear-encoded subunits. It binds ATP at the matrix side, leading to an allosteric inhibition of enzyme activity at high intramitochondrial ATP/ADP ratios. In mammals, subunit IV has a lung-specific isoform and a ubiquitously expressed isoform. 136
21729 238463 cd00923 Cyt_c_Oxidase_Va Cytochrome c oxidase subunit Va. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lack a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios. 103
21730 238464 cd00924 Cyt_c_Oxidase_Vb Cytochrome c oxidase subunit Vb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Vb is one of three mammalian subunits that lack a transmembrane region. Subunit Vb is located on the matrix side of the membrane and binds the regulatory subunit of protein kinase A. The abnormally extended conformation is stable only in the CcO assembly. 97
21731 238465 cd00925 Cyt_c_Oxidase_VIa Cytochrome c oxidase subunit VIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIa is expressed in two tissue-specific isoforms in mammals but not fish. VIa-H is the heart and skeletal muscle isoform; VIa-L is the liver or non-muscle isoform. Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate. 86
21732 238466 cd00926 Cyt_c_Oxidase_VIb Cytochrome c oxidase subunit VIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIb is one of three mammalian subunits that lack a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex. 75
21733 238467 cd00927 Cyt_c_Oxidase_VIc Cytochrome c oxidase subunit VIc. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIc subunit is found only in eukaryotes and its specific function remains unclear. It has been reported that the relative concentrations of some nuclear encoded CcO subunits, including subunit VIc, compared to those of the mitochondrial encoded subunits, are altered significantly during the progression of prostate cancer. 70
21734 238468 cd00928 Cyt_c_Oxidase_VIIa Cytochrome c oxidase subunit VIIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIIa has two tissue-specific isoforms that are expressed in a developmental manner. VIIa-H is expressed in heart and skeletal muscle but not smooth muscle. VIIa-L is expressed in liver and non-muscle tissues. 55
21735 238469 cd00929 Cyt_c_Oxidase_VIIc Cytochrome c oxidase subunit VIIc. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIc subunit is found only in eukaryotes and its specific function remains unclear. Peroxide inactivation of bovine CcO coincides with the direct oxidation of tryptophan (W19) within subunit VIIc, along with other structural changes in other subunits. 46
21736 238470 cd00930 Cyt_c_Oxidase_VIII Cytochrome c oxidase subunit VIII. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIII is the smallest of the nuclear-encoded subunits. It exists in muscle-specific and non-muscle-specific isoforms that are differently expressed in different species, suggesting species-specific regulation of energy metabolism. 43
21737 238471 cd00933 barnase Barnase, a member of the family of homologous microbial ribonucleases, catalyses the cleavage of single-stranded RNA via a two-step mechanism thought to be similar to that of pancreatic ribonuclease. The mechanism involves a transesterification to give a 2', 3'-cyclic phosphate intermediate, followed by hydrolysis to yield a 3' nucleotide. The active site residues His and Glu act as general acid-base groups during catalysis, while the Arg and Lys residues are important in binding the reactive phosphate, the latter probably binding the phosphate in the transition state. Barstar, a small 89 residue intracellular protein is a natural inhibitor of Barnase. 107
21738 269911 cd00934 PTB Phosphotyrosine-binding (PTB) PH-like fold. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. 120
21739 238472 cd00935 GlyRS_RNA GlyRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of GlyRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 51
21740 238473 cd00936 WEPRS_RNA WEPRS_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in multiple copies in eukaryotic bifunctional glutamyl-prolyl-tRNA synthetases (EPRS) in a region that separates the N-terminal glutamyl-tRNA synthetase (GluRS) from the C-terminal prolyl-tRNA synthetase (ProRS). It is also found at the N-terminus of vertebrate tryptophanyl-tRNA synthetases (TrpRS). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 50
21741 238474 cd00938 HisRS_RNA HisRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of HisRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 45
21742 238475 cd00939 MetRS_RNA MetRS_RNA binding domain. This short RNA-binding domain is found at the C-terminus of MetRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is repeated in Drosophila MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 45
21743 411993 cd00941 FokI_N N-terminal DNA recognition domain of restriction endonuclease FokI and similar proteins. Restriction endonuclease FokI (EC 3.1.21.4), also called R.FokI, or endonuclease FokI, is a type IIS restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. FokI recognizes the double-stranded sequence 5'-GGATG-3'/3'-CATCC-5' and cleaves 14 bases after G-1 and 13 bases before C-1, respectively. It contains an N-terminal DNA recognition domain and a C-terminal endonuclease domain. This model describes the DNA recognition domain. The family also includes endonuclease StsI, a type IIS restriction endonuclease found in Streptococcus sanguinis 54. It recognizes the same sequence as FokI but cleaves at different positions. 373
21744 411705 cd00942 BamHI-like Restriction endonuclease BamHI and similar proteins. Restriction endonuclease BamHI (EC 3.1.21.4), also termed R.BamHI, or endonuclease BamHI, is a type II restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. BamHI recognizes the double-stranded sequence GGATCC and cleaves after G-1. It shows a striking structural resemblance to EcoRI, but the two enzymes lack sequence similarity. The family also includes a BamHI isoschizomer, OkrAI endonuclease, which recognizes and cleaves the same DNA sequence (TATGGATCCATA) as BamHI. However, OkrAI does not have the equivalent of N- and C-terminal helices of BamHI, and it has higher star activity compared to BamHI. 196
21745 411706 cd00943 EcoRI-like Restriction endonuclease EcoRI and similar proteins. Restriction endonuclease EcoRI (EC 3.1.21.4), also termed R.EcoRI, or endonuclease EcoRI, is a type II restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. EcoRI recognizes the double-stranded sequence GAATTC and cleaves after G-1. The family also includes an EcoRI isoschizomer, RsrI endonuclease, which also catalyzes the cleavage of duplex DNA and oligodeoxyribonucleotides between the first two residues of the sequence GAATTC. RsrI differs from EcoRI in its N-terminal amino acid sequence, susceptibility to inhibition by antibodies, sensitivity to N-ethylmaleimide, isoelectric point, state of aggregation at high concentrations, temperature lability, and conditions for optimal reaction. It displays a reduction of specificity ("star activity") under conditions that also relax the specificity of EcoRI. 254
21746 188634 cd00945 Aldolase_Class_I Class I aldolases. Class I aldolases. The class I aldolases use an active-site lysine which stabilizes reaction intermediates via Schiff base formation, and have a TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin. 201
21747 238476 cd00946 FBP_aldolase_IIA Class II Type A, Fructose-1,6-bisphosphate (FBP) aldolases. The enzyme catalyses the zinc-dependent, reversible aldol condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to form fructose-1,6-bisphosphate. FBP aldolase is homodimeric and used in gluconeogenesis and glycolysis. The type A and type B Class II FBPA's differ in the presence and absence of distinct indels in the sequence that result in differing loop lengths in the structures. 345
21748 238477 cd00947 TBP_aldolase_IIB Tagatose-1,6-bisphosphate (TBP) aldolase and related Type B Class II aldolases. TBP aldolase is a tetrameric class II aldolase that catalyzes the reversible condensation of dihydroxyacetone phosphate with glyceraldehyde 3-phosphate to produce tagatose 1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. The type A and type B Class II FBPA's differ in the presence and absence of distinct indels in the sequence that result in differing loop lengths in the structures. 276
21749 188635 cd00948 FBP_aldolase_I_a Fructose-1,6-bisphosphate aldolase. Fructose-1,6-bisphosphate aldolase. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). This family includes proteins found in vertebrates, plants, and bacterial plant pathogens. Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. 330
21750 188636 cd00949 FBP_aldolase_I_bact Fructose-1,6-bisphosphate aldolase found in gram +/- bacteria. Fructose-1,6-bisphosphate aldolase found in gram +/- bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. 292
21751 188637 cd00950 DHDPS Dihydrodipicolinate synthase (DHDPS). Dihydrodipicolinate synthase (DHDPS) is a key enzyme in lysine biosynthesis. It catalyzes the aldol condensation of L-aspartate-beta-semialdehyde and pyruvate to dihydrodipicolinic acid via a Schiff base formation between pyruvate and a lysine residue. The functional enzyme is a homotetramer consisting of a dimer of dimers. DHDPS is a member of the dihydrodipicolinate synthase family that comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways. 284
21752 188638 cd00951 KDGDH 5-dehydro-4-deoxyglucarate dehydratase, also called 5-keto-4-deoxy-glucarate dehydratase (KDGDH). 5-dehydro-4-deoxyglucarate dehydratase, also called 5-keto-4-deoxy-glucarate dehydratase (KDGDH), which is a member of the dihydrodipicolinate synthase (DHDPS) family that comprises several pyruvate-dependent class I aldolases. The enzyme is involved in glucarate metabolism, and its mechanism presumably involves a Schiff-base intermediate similar to members of the DHDPS family. While in the case of Pseudomonas sp. 5-dehydro-4-deoxy-D-glucarate is degraded by KDGDH to 2,5-dioxopentanoate, in certain species of Enterobacteriaceae it is degraded instead to pyruvate and glycerate. 289
21753 188639 cd00952 CHBPH_aldolase Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase (HBPHA) and trans-2'-carboxybenzalpyruvate hydratase-aldolase (CBPHA). Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase (HBPHA) and trans-2'-carboxybenzalpyruvate hydratase-aldolase (CBPHA). HBPHA catalyzes the conversion of HBP to salicylaldehyde and pyruvate. This reaction is part of the degradative pathways for naphthalene and naphthalenesulfonates by bacteria. CBPHA is homologous to HBPHA and catalyzes the cleavage of CBP to 2-carboxybenzaldehyde and pyruvate during the degradation of phenanthrene. They are members of the DHDPS family of Schiff-base-dependent class I aldolases. 309
21754 188640 cd00953 KDG_aldolase KDG (2-keto-3-deoxygluconate) aldolases found in archaea. KDG (2-keto-3-deoxygluconate) aldolases found in archaea. This subfamily of enzymes is adapted for high thermostability and shows specificity for non-phosphorylated substrates. The enzyme catalyses the reversible aldol cleavage of 2-keto-3-deoxygluconate to pyruvate and glyceraldehyde, the third step of a modified non-phosphorylated Entner-Doudoroff pathway of glucose oxidation. KDG aldolase shows no significant sequence similarity to microbial 2-keto-3-deoxyphosphogluconate (KDPG) aldolases, and the enzyme shows no activity with glyceraldehyde 3-phosphate as substrate. The enzyme is a tetramer and a member of the DHDPS family of Schiff-base-dependent class I aldolases. 279
21755 188641 cd00954 NAL N-Acetylneuraminic acid aldolase, also called N-acetylneuraminate lyase (NAL). N-Acetylneuraminic acid aldolase, also called N-acetylneuraminate lyase (NAL), which catalyses the reversible aldol reaction of N-acetyl-D-mannosamine and pyruvate to give N-acetyl-D-neuraminic acid (D-sialic acid). It has a widespread application as a biocatalyst for the synthesis of sialic acid and its derivatives. This enzyme has been shown to be quite specific for pyruvate as the donor, but flexible to a variety of D- and, to some extent, L-hexoses and pentoses as acceptor substrates. NAL is a member of the dihydrodipicolinate synthase family that comprises several pyruvate-dependent class I aldolases. 288
21756 188642 cd00955 Transaldolase_like Transaldolase-like proteins from plants and bacteria. Transaldolase-like proteins from plants and bacteria. Transaldolase, found in the non-oxidative branch of the pentose phosphate pathway, catalyzes the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, which are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates. 338
21757 188643 cd00956 Transaldolase_FSA Transaldolase-like fructose-6-phosphate aldolases (FSA) found in bacteria and archaea. Transaldolase-like fructose-6-phosphate aldolases (FSA) found in bacteria and archaea, which are members of the MipB/TalC subfamily of class I aldolases. FSA catalyze an aldol cleavage of fructose 6-phosphate and do not utilize fructose, fructose 1-phosphate, fructose 1,6-bisphosphate, or dihydroxyacetone phosphate. The enzymes belong to the transaldolase family that serves in transfer reactions in the pentose phosphate cycle, and are more distantly related to fructose 1,6-bisphosphate aldolase. 211
21758 188644 cd00957 Transaldolase_TalAB Transaldolases including both TalA and TalB. Transaldolases including both TalA and TalB. The enzyme catalyses the reversible transfer of a dihydroxyacetone moiety derived from fructose-6-phosphate to erythrose-4-phosphate, yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. The catalytic mechanism is similar to other class I aldolases. The enzyme is found in the non-oxidative branch of the pentose phosphate pathway and forms a dimer in solution. 313
21759 188645 cd00958 DhnA Class I fructose-1,6-bisphosphate (FBP) aldolases of the archaeal type (DhnA homologs). Class I fructose-1,6-bisphosphate (FBP) aldolases of the archaeal type (DhnA homologs) found in bacteria and archaea. Catalysis of the enzymes proceeds via a Schiff-base mechanism like other class I aldolases, although this subfamily is clearly divergent based on sequence similarity to other class I and class II (metal dependent) aldolase subfamilies. 235
21760 188646 cd00959 DeoC 2-deoxyribose-5-phosphate aldolase (DERA) of the DeoC family. 2-deoxyribose-5-phosphate aldolase (DERA) of the DeoC family. DERA belongs to the class I aldolases and catalyzes a reversible aldol reaction between acetaldehyde and glyceraldehyde 3-phosphate to generate 2-deoxyribose 5-phosphate. DERA is unique in catalyzing the aldol reaction between two aldehydes, and its broad substrate specificity confers considerable utility as a biocatalyst, offering an environmentally benign alternative to chiral transition metal catalysis of the asymmetric aldol reaction. 203
21761 238478 cd00974 DSRD Desulforedoxin (DSRD) domain; a small non-heme iron domain present in the desulforedoxin (rubredoxin oxidoreductase) and desulfoferrodoxin proteins of some archaeal and bacterial methanogens and sulfate/sulfur reducers. Desulforedoxin is a small, single-domain homodimeric protein; each subunit contains an iron atom bound to four cysteinyl sulfur atoms, Fe(S-Cys)4, in a distorted tetrahedral coordination. Its metal center is similar to that found in rubredoxin type proteins. Desulforedoxin is regarded as a potential redox partner for rubredoxin. Desulfoferrodoxin forms a homodimeric protein, with each protomer comprised of two domains, the N-terminal DSRD domain and C-terminal superoxide reductase-like (SORL) domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 34
21762 381600 cd00978 chitosanase_GH46 chitosanase belonging to the glycosyl hydrolase 46 family. This family is composed of the chitosanase enzymes, which hydrolyze chitosan, a biopolymer of beta (1,4)-linked-D-glucosamine (GlcN) residues produced by partial or full deacetylation of chitin. Chitosanases play a role in defense against pathogens such as fungi and are found in microorganisms, fungi, viruses, and plants. Microbial chitosanases can be divided into 3 subclasses based on the specificity of the cleavage positions for partially acetylated chitosan. Subclass I chitosanases such as N174 can split GlcN-GlcN and GlcNAc-GlcN linkages, whereas subclass II chitosanases such as Bacillus sp. no. 7-M can cleave only GlcN-GlcN linkages. Subclass III chitosanases such as MH-K1 chitosanase are the most versatile and can split both GlcN-GlcN and GlcN-GlcNAc linkages. 222
21763 238480 cd00980 FwdC/FmdC FwdC/FmdC. This domain of unknown function is found in the subunit C of formylmethanofuran dehydrogenase, an enzyme that catalyzes the first step in methane formation from CO2 in methanogenic archaea, hyperthermophiles and bacteria. There are two isoenzymes, a tungsten-containing isoenzyme (Fwd) and a molybdenum-containing isoenzyme (Fmd). The subunits C of both isoenzymes (FwdC/FmdC) are characterized by a repeated GXXGXXXG motif. 203
21764 238481 cd00981 arch_gltB Archaeal-type gltB domain. This domain shares sequence similarity with a region of unknown function found in the large subunit of glutamate synthase, which is encoded by gltB and found in most bacteria and eukaryotes. It is predicted to be homologous to the C-terminal domain of glutamate synthase based upon sequence similarity coupled with genome organization data, showing that this domain is found in a gene cluster with other domains of Glts, which are annotated. This domain is found primarily in archaea, but is also present in a few bacteria, likely as a result of lateral gene transfer. 232
21765 238482 cd00982 gltB_C gltb_C. This domain is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS). GltS encodes a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate. It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites appear to occur in other domains within the protein, and not the domain in this CD. This particular domain has no known function, but it likely has a structural role as it interacts with the amidotransferase and FMN-binding domains of gltS. 251
21766 410863 cd00983 RecA recombinase A. RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response. RecA couples ATP hydrolysis to DNA strand exchange. 235
21767 410864 cd00984 DnaB_C C-terminal domain of DnaB helicase. DnaB helicase C-terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis. 256
21768 238485 cd00985 Maf_Ham1 Maf_Ham1. Maf, a nucleotide binding protein, has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides, such as hypoxanthine/xanthine NTP, but not standard nucleotides. 131
21769 238486 cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 79
21770 238487 cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 90
21771 238488 cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 85
21772 238489 cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 79
21773 238490 cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 80
21774 238491 cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 79
21775 238492 cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases. 82
21776 270215 cd00993 PBP2_ModA_like Substrate binding domain of molybdate-binding proteins, the type 2 periplasmic binding protein fold. Molybdate binding domain ModA. Molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has revealed a binding site for molybdate and tungstate where the central metal atom is in a hexacoordinate configuration. This octahedral geometry was rather unexpected. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 225
21777 270216 cd00994 PBP2_GlnH Glutamine binding domain of ABC-type transporter; the type 2 periplasmic binding protein fold. This periplasmic substrate-binding component serves as an initial receptor in the ABC transport of glutamine in bacteria and eukaryota. GlnH belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 218
21778 173853 cd00995 PBP2_NikA_DppA_OppA_like The substrate-binding domain of an ABC-type nickel/oligopeptide-like import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain of nickel/dipeptide/oligopeptide transport systems, which function in the import of nickel and peptides, and other closely related proteins. The oligopeptide-binding protein OppA is a periplasmic component of an ATP-binding cassette (ABC) transport system OppABCDF consisting of five subunits: two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Similar to the ABC-type dipeptide and oligopeptide import systems, nickel transporter is comprised of five subunits NikABCDE: the two pore-forming integral inner membrane proteins NikB and NikC; the two inner membrane-associated proteins with ATPase activity NikD and NikE; and the periplasmic nickel binding NikA, which is the initial nickel receptor that controls the chemotactic response away from nickel. Most other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand binding domains of ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 466
21779 270217 cd00996 PBP2_AatB_like Polar amino acids-binding domain of ATP-binding cassette transporter-like systems that belong to the type 2 periplasmic binding fold protein superfamily. This subfamily includes periplasmic binding domain of ATP-binding cassette transporter-like systems that serve as initial receptors in the ABC transport of amino acids and their derivatives in eubacteria. After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The Abp proteins belong to the PBPI superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 227
21780 270218 cd00997 PBP2_GluR0 Bacterial GluR0 ligand-binding domain; the type 2 periplasmic binding protein fold. Glutamate receptor domain GluR0. These domains are found in the GluR0 proteins that have been shown to function as prokaryotic L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. The GluR0 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 218
21781 270219 cd00998 PBP2_iGluR_ligand_binding The ligand-binding domain of ionotropic glutamate receptor family, a member of the periplasmic binding protein type II superfamily. This subfamily represents the ligand-binding domain of ionotropic glutamate receptors. iGluRs are heterotetrameric ion channels that comprise three functionally distinct subtypes based on their pharmacology and structural similarities: AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid), NMDA (N-methyl-D-aspartate), and kainate receptors. All three types of channels are also activated by the physiological neurotransmitter, glutamate. iGluRs are concentrated at postsynaptic sites, where they exert a variety of different functions. While this ligand-binding domain of iGluRs is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain belongs to the periplasmic-binding fold type I. 243
21782 270220 cd00999 PBP2_ArtJ The solute binding domain of ArtJ protein, a member of the type 2 periplasmic binding fold protein superfamily. An arginine-binding protein found in Chlamydiae trachomatis (CT-ArtJ) and pneumoniae (CPn-ArtJ) and its closely related proteins. CT- and CPn-ArtJ are shown to have different immunogenic properties despite a high sequence similarity. The ArtJ proteins display the type 2 periplasmic binding fold organized in two alpha-beta domains with arginine-binding region at their interface. 223
21783 270221 cd01000 PBP2_Cys_DEBP_like Substrate-binding domain of cysteine- and aspartate/glutamate-binding proteins; the type 2 periplasmic-binding protein fold. This family comprises of the periplasmic-binding protein component of ABC transporters specific for cysteine and carboxylic amino acids, as well as their closely related proteins. The cysteine and aspartate-glutamate binding domains belong to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 228
21784 270222 cd01001 PBP2_HisJ_LAO_like Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporters and related proteins; the type 2 periplasmic-binding protein fold. This family comprises the periplasmic substrate-binding proteins, including the lysine-, arginine-, ornithine-binding protein (LAO) and the histidine-binding protein (HisJ), which serve as initial receptors for active transport. HisJ and LAO proteins belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 228
21785 270223 cd01002 PBP2_Ehub_like Substrate binding domain of ectoine/hydroxyectoine specific ABC transport system; the type 2 periplasmic binding protein fold. This family represents the periplasmic substrate-binding component of ABC transport systems that are involved in uptake of osmoprotectants (also termed compatible solutes) such as ectoine and hydroxyectoine. To counteract the efflux of water, bacteria and archaea accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 242
21786 270224 cd01003 PBP2_YckB Substrate binding domain of an ABC cystine transporter; the type 2 periplasmic binding protein fold. Periplasmic cystine-binding domain (YckB) of an ATP-binding cassette (ABC) transporter from Bacillus subtilis and its related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 229
21787 270225 cd01004 PBP2_MidA_like Mimosine binding domain of ABC-type transporter MidA and similar proteins; the type 2 periplasmic binding protein fold. This subgroup includes the periplasmic binding component of ABC transporter involved in uptake of mimosine MidA and its similar proteins. This periplasmic binding domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 230
21788 270226 cd01005 PBP2_CysP Substrate binding domain of an active sulfate transporter, a member of the type 2 periplasmic binding fold superfamily. This family contains sulfate binding domain of CysP proteins that serve as initial receptors in the ABC transport of sulfate and thiosulfate in eubacteria. After binding the ligand, CysP interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The CysP proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 307
21789 270227 cd01006 PBP2_phosphate_binding Substrate binding domain of ABC-type phosphate transporter, a member of the type 2 periplasmic-binding fold superfamily. This phosphate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 253
21790 270228 cd01007 PBP2_BvgS_HisK_like The type 2 periplasmic ligand-binding protein domain of the sensor-kinase BvgS and histidine kinase receptors, and related proteins. This family comprises the periplasmic sensor domain of the two-component sensor-kinase systems, such as the sensor protein BvgS of Bordetella pertussis and histidine kinase receptors (HisK), and uncharacterized related proteins. Typically, the two-component system consists of a membrane spanning sensor-kinase and a cytoplasmic response regulator. It serves as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. The N-terminal sensing domain of the sensor kinase detects extracellular signals, such as small molecule ligands and ions, which then modulate the catalytic activity of the cytoplasmic kinase domain through a phosphorylation cascade. The periplasmic sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 220
21791 270229 cd01008 PBP2_NrtA_SsuA_CpmA_like Substrate binding domain of ABC-type nitrate/sulfonate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily. This family represents the periplasmic binding proteins involved in nitrate, alkanesulfonate, and bicarbonate transport. These domains are found in eubacterial periplasmic-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates. Other closest homologs involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB) are also included in this family. After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. These binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 212
21792 270230 cd01009 PBP2_YfhD_N The solute binding domain of YfhD proteins, a member of the type 2 periplasmic binding fold protein superfamily. This subfamily includes the solute binding domain YfhD_N. These domains are found in the YfhD proteins that are predicted to function as lytic transglycosylases that cleave the glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine in peptidoglycan, while the YfhD_N domain might act as an auxiliary or regulatory subunit. In addition to periplasmic solute binding domain, they have an SLT domain, typically found in soluble lytic transglycosylases, and a C-terminal low complexity domain. The YfhD proteins might have been recruited to create localized cell wall openings required for transport of large substrates such as DNA. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 223
21793 238493 cd01011 nicotinamidase Nicotinamidase/pyrazinamidase (PZase). Nicotinamidase, a ubiquitous enzyme in prokaryotes, converts nicotinamide to nicotinic acid (niacin) and ammonia, which in turn can be recycled to make nicotinamide adenine dinucleotide (NAD). The same enzyme is also called pyrazinamidase, because it converts the tuberculosis drug pyrazinamide (PZA) into its active form pyrazinoic acid (POA). 196
21794 238494 cd01012 YcaC_related YcaC related amidohydrolases; E. coli YcaC is a homooctameric hydrolase with unknown specificity. Despite its weak sequence similarity, it is structurally related to other amidohydrolases and shares conserved active site residues with them. The multimerisation interface does not seem to be conserved in all members. 157
21795 238495 cd01013 isochorismatase Isochorismatase, also known as 2,3 dihydro-2,3 dihydroxybenzoate synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of a vinyl ether, an uncommon reaction in biological systems. Isochorismatase is part of the phenazine biosynthesis pathway. Phenazines are antimicrobial compounds that provide the competitive advantage for certain bacteria. 203
21796 238496 cd01014 nicotinamidase_related Nicotinamidase-related amidohydrolases. Cysteine hydrolases of unknown function that share the catalytic triad with other amidohydrolases, like nicotinamidase, which converts nicotinamide to nicotinic acid and ammonia. 155
21797 238497 cd01015 CSHase N-carbamoylsarcosine amidohydrolase (CSHase) hydrolyzes N-carbamoylsarcosine to sarcosine, carbon dioxide and ammonia. CSHase is involved in one of the two alternative pathways for creatinine degradation to glycine in microorganisms. This CSHase-containing pathway degrades creatinine via N-methylhydantoin, N-carbamoylsarcosine and sarcosine to glycine. Enzymes of this pathway are used in the diagnosis of renal dysfunction, for determining creatinine levels in urine and serum. 179
21798 238498 cd01016 TroA Metal binding protein TroA. These proteins have been shown to function as initial receptors in ABC transport of Zn2+ and possibly Fe3+ in many eubacterial species. The TroA proteins belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 276
21799 238499 cd01017 AdcA Metal binding protein AdcA. These proteins have been shown to function in the ABC uptake of Zn2+ and Mn2+ and in competence for genetic transformation and adhesion. The AdcA proteins belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a long alpha helix and they bind their ligand in the cleft between these domains. In addition, many of these proteins have a low complexity region containing metal binding histidine-rich motif (repetitive HDH sequence). 282
21800 238500 cd01018 ZntC Metal binding protein ZntC. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a long alpha helix and bind their specific ligands in the cleft between these domains. In addition, many of these proteins possess a metal-binding histidine-rich motif (repetitive HDH sequence). 266
21801 238501 cd01019 ZnuA Zinc binding protein ZnuA. These proteins have been shown to function as initial receptors in the ABC uptake of Zn2+. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a single helix and bind their specific ligands in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 286
21802 238502 cd01020 TroA_b Metal binding protein TroA_b. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 264
21803 381601 cd01021 GEWL Goose egg-white lysozyme. Eukaryotic goose-type or G-type lysozyme (goose egg-white lysozyme; GEWL) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc). Mammals have two lysozymes. This family corresponds to human and mouse lysozyme G-like protein 2. 174
21804 212096 cd01022 GH57N_like N-terminal catalytic domain of heat stable retaining glycoside hydrolase family 57. Glycoside hydrolase family 57(GH57) is a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. 313
21805 173775 cd01025 TOPRIM_recR TOPRIM_recR: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in Escherichia coli RecR. RecR participates in the RecFOR pathway of homologous recombinational repair in prokaryotes. This pathway provides a single-stranded DNA molecule coated with RecA to allow invasion of a homologous molecule. The RecFOR system directs the loading of RecA onto gapped DNA coated with SSB protein. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). In RecR sequences this glutamate in the first turn of the TOPRIM domain is semiconserved, whereas the DXD motif is not conserved. 112
21806 173776 cd01026 TOPRIM_OLD TOPRIM_OLD: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in bacterial and archaeal nucleases of the OLD (overcome lysogenization defect) family. The bacteriophage P2 OLD protein, which has DNase as well as RNase activity, consists of an N-terminal ABC-type ATPase domain and a C-terminal Toprim domain; the nuclease activity of OLD is stimulated by ATP, though the ATPase activity is not DNA-dependent. Functional details on OLD are scant and further experimentation is required to define the relationship between the ATPase and Toprim nuclease domains. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general acid in strand cleavage by nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 97
21807 173777 cd01027 TOPRIM_RNase_M5_like TOPRIM_RNase_M5_like: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in Ribonuclease M5 (RNase M5) and other small primase-like proteins from bacteria and archaea. RNase M5 catalyzes the maturation of 5S rRNA in low G+C Gram-positive bacteria. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 81
21808 173778 cd01028 TOPRIM_TopoIA TOPRIM_TopoIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IA family of DNA topoisomerases (TopoIA). This subgroup contains proteins similar to the Type I DNA topoisomerases: E. coli topoisomerases I and III, eukaryotic topoisomerase III, and ATP-dependent reverse gyrase found in archaea and thermophilic bacteria. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA. These enzymes cleave one strand of the DNA duplex, covalently link to the 5' phosphoryl end of the DNA break and allow the other strand of the duplex to pass through the gap. Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 142
21809 173779 cd01029 TOPRIM_primases TOPRIM_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of bacterial DnaG-type primases and their homologs. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. The prototypical bacterial primase, Escherichia coli DnaG, is a single-subunit enzyme. 79
21810 173780 cd01030 TOPRIM_TopoIIA_like TOPRIM_TopoIIA_like: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 115
21811 238504 cd01031 EriC ClC chloride channel EriC. This domain is found in the EriC chloride transporters that mediate the extreme acid resistance response in eubacteria and archaea. This response allows bacteria to survive in the acidic environments by decarboxylation-linked proton utilization. As shown for Escherichia coli EriC, these channels can counterbalance the electric current produced by the outwardly directed virtual proton pump linked to amino acid decarboxylation. The EriC proteins belong to the ClC superfamily of chloride ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism. The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge. In Escherichia coli EriC, a glutamate residue that protrudes into the pore is thought to participate in gating by binding to a Cl- ion site within the selectivity filter. 402
21812 238505 cd01033 ClC_like Putative ClC chloride channel. Clc proteins are putative halogen ion (Cl-, Br- and I-) transporters found in eubacteria. They belong to the ClC superfamily of halogen ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism. This superfamily lacks any structural or sequence similarity to other known ion channels and exhibits unique properties of ion permeation and gating. The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge. 388
21813 238506 cd01034 EriC_like ClC chloride channel family. These protein sequences, closely related to the ClC EriC family, are putative halogen ion (Cl-, Br- and I-) transport proteins found in eubacteria. They belong to the ClC superfamily of chloride ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism. This superfamily lacks any structural or sequence similarity to other known ion channels and exhibits unique properties of ion permeation and gating. The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge. 390
21814 238507 cd01036 ClC_euk Chloride channel, ClC. These domains are found in the eukaryotic halogen ion (Cl-, Br- and I-) channel proteins that perform a variety of functions including cell volume regulation, membrane potential stabilization, charge compensation necessary for the acidification of intracellular organelles, signal transduction and transepithelial transport. They are also involved in many pathophysiological processes and are responsible for a number of human diseases. These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. Some proteins possess long C-terminal cytoplasmic regions containing two CBS (cystathionine beta synthase) domains of putative regulatory function. 416
21815 411707 cd01037 PDDEXK_nuclease-like PDDEXK family nucleases. Superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 83
21816 411708 cd01038 Endonuclease_DUF559 Putative endonuclease. Domain of unknown function 559 (DUF559) is a putative endonuclease of unknown function that belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 97
21817 381254 cd01040 Mb-like myoglobin-like; M family globin domain. This family includes chimeric (FHbs/flavohemoglobins) and single-domain globins: FHbs, Ngbs/neuroglobins, Cygb/cytoglobins, GbE/avian eye specific globin E, GbX/globin X, amphibian GbY/globin Y, Mb/myoglobin, HbA/hemoglobin-alpha, HbB/hemoglobin-beta, SDgbs/single-domain globins related to FHbs, and Adgb/androglobin. The M family exhibits the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments (named A through H). In Adgbs, the globin domain is split into two: helices C-H are followed by helices A-B and the two parts are separated by the IQ motif. Although rearranged, the globin domain of most Adgbs contains a number of conserved residues which play critical roles in heme-coordination and gas ligand binding. Adgbs have been omitted from this A-H helix cd. 133
21818 153100 cd01041 Rubrerythrin Rubrerythrin, ferritin-like diiron-binding domain. Rubrerythrin domain is a nonheme iron binding domain found in many air-sensitive bacteria and archaea and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The homodimeric rubrerythrin protein contains a binuclear metal center located within a four helix bundle. Many, but not all, rubrerythrin proteins have a second domain with a rubredoxin-like hexacoordinated iron center. Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system but its function is still poorly understood. 134
21819 153101 cd01042 DMQH Demethoxyubiquinone hydroxylase, ferritin-like diiron-binding domain. Demethoxyubiquinone hydroxylases (DMQH) are members of the ferritin-like, diiron-carboxylate family which are present in eukaryotes (the CLK-1/CAT5 family) and prokaryotes (the Coq7 family). DMQH participates in one of the last steps of ubiquinone biosynthesis and is responsible for DMQ hydroxylation, resulting in the formation of hydroxyubiquinone, a precursor of ubiquinone. CLK-1 is a mitochondrial inner membrane protein and Coq7 is a proposed interfacial integral membrane protein. Mutations in the Caenorhabditis elegans gene clk-1 affect biological timing and extend longevity. The conserved residues of a diiron center are present in this domain. 165
21820 153102 cd01043 DPS DPS protein, ferritin-like diiron-binding domain. DPS (DNA Protecting protein under Starved conditions) domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Some DPS proteins nonspecifically bind DNA, protecting it from cleavage caused by reactive oxygen species such as the hydroxyl radicals produced during oxidation of Fe(II) by hydrogen peroxide. These proteins assemble into dodecameric structures, some form DPS-DNA co-crystalline complexes, and possess iron and H2O2 detoxification capabilities. Expression of DPS is induced by oxidative or nutritional stress, including metal ion starvation. Members of the DPS family are homopolymers formed by 12 four-helix bundle subunits that assemble with 23 symmetry into a hollow shell. The DPS ferroxidase site is unusual in that it is not located in a four-helix bundle as in ferritin, but is shared by 2-fold symmetry-related subunits providing the iron ligands. Many DPS sequences (e.g., E. coli) display an N-terminal extension of variable length that contains two or three positively charged lysine residues that extends into the solvent and is thought to play an important role in the stabilization of the complex with DNA. DPS Listeria Flp, Bacillus anthracis Dlp-1 and Dlp-2, and Helicobacter pylori HP-NAP which lack the N-terminal extension, do not bind DNA. DPS proteins from Helicobacter pylori, Treponema pallidum, and Borrelia burgdorferi are highly immunogenic. 139
21821 153103 cd01044 Ferritin_CCC1_N Ferritin-CCC1, N-terminal ferritin-like diiron-binding domain. Ferritin-like N-terminal domain present in an uncharacterized family of proteins found in bacteria and archaea. These proteins also have a C-terminal CCC1-like transmembrane domain and are thought to be involved in iron and/or manganese transport. This domain has the conserved residues of a diiron center found in other ferritin-like proteins. 125
21822 153104 cd01045 Ferritin_like_AB Uncharacterized family of ferritin-like proteins found in archaea and bacteria. Ferritin-like domain found in archaea and bacteria (Ferritin_like_AB). This uncharacterized domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins whose function is unknown. This family includes unknown or hypothetical proteins which were sequenced from mostly anaerobic or microaerophilic metal-metabolizing and/or nitrogen-fixing microbes. The family includes sequences from ferric-, sulfate-, and arsenic-reducing bacteria, Geobacter, Magnetospirillum, Desulfovibrio, and Desulfitobacterium. Also included are several nitrogen-fixing endosymbiotic bacteria, Rhizobium, Mesorhizobium, and Bradyrhizobium; also phototrophic purple nonsulfur bacteria, Rhodobacter and Rhodopseudomonas, as well as, obligate thermophiles, Thermotoga, Thermoanaerobacter, and Pyrococcus. The conserved residues of a diiron center are present in this uncharacterized domain. 139
21823 153105 cd01046 Rubrerythrin_like rubrerythrin-like, diiron-binding domain. Rubrerythrin-like domain, similar to rubrerythrin, a nonheme iron binding domain found in many air-sensitive bacteria and archaea, and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system. The rubrerythrin protein has two domains, a binuclear metal center located within a four-helix bundle of the rubrerythrin domain, and a rubredoxin domain. The Rubrerythrin-like domains in this CD are singular domains (no C-terminus rubredoxin domain) and are phylogenetically distinct from rubrerythrin domains of rubrerythrin-rubredoxin proteins. 123
21824 153106 cd01047 ACSF Aerobic Cyclase System Fe-containing subunit (ACSF), ferritin-like diiron-binding domain. Aerobic Cyclase System, Fe-containing subunit (ACSF) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrivivax gelatinosus acsF codes for a conserved, putative binuclear iron-cluster-containing protein involved in aerobic oxidative cyclization of Mg-protoporphyrin IX monomethyl ester. AcsF and homologs have a leucine zipper and two copies of the conserved glutamate and histidine residues predicted to act as ligands for iron in the Ex(29-35)DExRH motifs. Several homologs of AcsF are found in a wide range of photosynthetic organisms, including Chlamydomonas reinhardtii Crd1 and Pharbitis nil PNZIP, suggesting that this aerobic oxidative cyclization mechanism is conserved from bacteria to plants. 323
21825 153107 cd01048 Ferritin_like_AB2 Uncharacterized family of ferritin-like proteins found in archaea and bacteria. Ferritin-like domain found in archaea and bacteria, subgroup 2 (Ferritin_like_AB2). This uncharacterized domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins whose function is unknown. The conserved residues of a diiron center are present within the putative active site. 135
21826 153108 cd01049 RNRR2 Ribonucleotide Reductase, R2/beta subunit, ferritin-like diiron-binding domain. Ribonucleotide Reductase, R2/beta subunit (RNRR2) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The RNR protein catalyzes the conversion of ribonucleotides to deoxyribonucleotides and is found in all eukaryotes, many prokaryotes, several viruses, and few archaea. The catalytically active form of RNR is a proposed alpha2-beta2 tetramer. The homodimeric alpha subunit (R1) contains the active site and redox active cysteines as well as the allosteric binding sites. The beta subunit (R2) contains a diiron cluster that, in its reduced state, reacts with dioxygen to form a stable tyrosyl radical and a diiron(III) cluster. This essential tyrosyl radical is proposed to generate a thiyl radical, located on a cysteine residue in the R1 active site that initiates ribonucleotide reduction. The beta subunit is composed of 10-13 helices, the 8 longest helices form an alpha-helical bundle; some have 2 additional beta strands. Yeast is unique in that it assembles both homodimers and heterodimers of RNRR2. The yeast heterodimer, Y2Y4, contains R2 (Y2) and a R2 homolog (Y4) that lacks the diiron center and is proposed to only assist in cofactor assembly, and perhaps stabilize R1 (Y1) in its active conformation. 288
21827 153109 cd01050 Acyl_ACP_Desat Acyl ACP desaturase, ferritin-like diiron-binding domain. Acyl-Acyl Carrier Protein Desaturase (Acyl_ACP_Desat) is a mu-oxo-bridged diiron-carboxylate enzyme, which belongs to a broad superfamily of ferritin-like proteins and catalyzes the NADPH and O2-dependent formation of a cis-double bond in acyl-ACPs. Acyl-ACP desaturases are found in higher plants and a few bacterial species (Mycobacterium tuberculosis, M. leprae, M. avium and Streptomyces avermitilis, S. coelicolor). In plants, Acyl-ACP desaturase is a plastid-localized, covalently ACP linked, soluble desaturase that introduces the first double bond into saturated fatty acids, resulting in the corresponding monounsaturated fatty acid. Members of this class of soluble desaturases are specific for a particular substrate chain length and introduce the double bond between specific carbon atoms. For example, delta 9 stearoyl-ACP is specific for stearic acid and introduces a double bond between carbon 9 and 10 to yield oleic acid in the ACP-bound form. The enzymatic reaction requires molecular oxygen, NAD(P)H, NAD(P)H ferredoxin oxido-reductase and ferredoxin. The enzyme is active in the homodimeric form; the monomer consists mainly of alpha-helices with the catalytic diiron center buried within a four-helix bundle. Integral membrane fatty acid desaturases that introduce double bonds into fatty acid chains, acyl-CoA desaturases of animals, yeasts, and fungi, and acyl-lipid desaturases of cyanobacteria and higher plants, are distinct from soluble acyl-ACP desaturases, lack diiron centers, and are not included in this CD. 297
21828 153110 cd01051 Mn_catalase Manganese catalase, ferritin-like diiron-binding domain. Manganese (Mn) catalase is a member of a broad superfamily of ferritin-like diiron enzymes. While many diiron enzymes catalyze dioxygen-dependent reactions, manganese catalase performs peroxide-dependent oxidation-reduction. Catalases are important antioxidant metalloenzymes that catalyze disproportionation of hydrogen peroxide, forming dioxygen and water. Manganese catalase, a nonheme type II catalase, contains a binuclear manganese cluster that catalyzes the redox dismutation of hydrogen peroxide, interconverting between dimanganese(II) [(2,2)] and dimanganese(III) [(3,3)] oxidation states during turnover. Mn catalases are found in a broad range of microorganisms in microaerophilic environments, including the mesophilic lactic acid bacteria (e.g., Lactobacillus plantarum) and bacterial and archaeal thermophiles (e.g., Thermus thermophilus and Pyrobaculum caldifontis). L. plantarum and T. thermophilus holoenzymes are homohexameric structures; each subunit contains a dimanganese active site. The manganese ions are linked by a mu 1,3-bridging glutamate carboxylate and two mu-bridging solvent oxygens that electronically couple the metal centers. Several members of this CD lack the C-terminal strands that pack against the neighboring catalytic domains as seen in L. plantarum. One such sequence, Bacillus subtilis CotJC, is known to be a component of the inner spore coat that interacts with spore coat protein, CotJA. It has been suggested that CotJC could modulate the degree of Mn SodA-dependent cross-linking of an outer coat component, or the two enzymes could serve to protect specific cellular structures during the developmental process. 156
21829 153111 cd01052 DPSL DPS-like protein, ferritin-like diiron-binding domain. DPSL (DPS-like). DPSL is a phylogenetically distinct class within the ferritin-like superfamily, and similar in many ways to the DPS (DNA Protecting protein under Starved conditions) proteins. Like DPS, these proteins are expressed in response to oxidative stress, form dodecameric cage-like particles, preferentially utilize hydrogen peroxide in the controlled oxidation of iron, and possess a short N-terminal extension implicated in stabilizing cellular DNA. This domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. These proteins are distantly related to bacterial ferritins which assemble 24 monomers, each of which have a four-helix bundle with a fifth shorter helix at the C terminus and a diiron (ferroxidase) center. Ferritins contain a center where oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of iron in the ferric form. Many of the conserved residues of a diiron center are present in this domain. 148
21830 153112 cd01053 AOX Alternative oxidase, ferritin-like diiron-binding domain. Alternative oxidase (AOX) is a mitochondrial ubiquinol oxidase found in plants and some fungi and protists. AOX is a member of the ferritin-like diiron-carboxylate superfamily. The plant mitochondrial protein alternative oxidase catalyses dioxygen dependent ubiquinol oxidation to yield ubiquinone and water. AOX is a cyanide-resistant, salicylhydroxamic acid-sensitive oxidase that transfers electrons from ubiquinol to oxygen, bypassing the cytochrome chain. AOX has been proposed to contain a hydroxo-bridged diiron center within a four-helix bundle and a proximal redox-active tyrosine residue. AOX is proposed to be peripherally associated with the matrix side of the inner mitochondrial membrane. Fungal and protozoan AOXs generally exist as monomers. In plants, AOX is dimeric. Pyruvate is an allosteric activator of plant AOX involved in the reversible inactivation of the enzyme through the formation of an intermolecular disulfide bridge between monomeric subunits. The enzyme is non-proton-motive and does not contribute to the conservation of energy. The heat that dissipates from AOX activity is used in thermogenic plants to volatilize primary amines to attract pollinating insects. Other functions have been proposed: i) that the alternative oxidase allows Krebs-cycle turnover when the energy charge of the cell is high, and ii) that the enzyme protects against oxidative stress. The expression of AOX is induced when plants are exposed to a variety of stresses including chilling, pathogen attack, senescence and fruit ripening. 168
21831 153113 cd01055 Nonheme_Ferritin nonheme-containing ferritins. Nonheme Ferritin domain, found in archaea and bacteria, is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The ferritin protein shell is composed of 24 protein subunits arranged in 432 symmetry. Each protein subunit, a four-helix bundle with a fifth short terminal helix, contains a dinuclear ferroxidase center (H type). Unique to this group of proteins is a third metal site in the ferroxidase center. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite. 156
21832 153114 cd01056 Euk_Ferritin eukaryotic ferritins. Eukaryotic Ferritin (Euk_Ferritin) domain. Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. Fe(II) oxidation in the H/M subunits takes place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues. 161
21833 153115 cd01057 AAMH_A Aromatic and Alkene Monooxygenase Hydroxylase, subunit A, ferritin-like diiron-binding domain. Aromatic and Alkene Monooxygenase Hydroxylases, subunit A (AAMH_A). Subunit A of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases are members of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domain of the alpha- and noncatalytic beta-subunits possess nearly identical folds, however, the beta-subunit lacks critical diiron ligands and a C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkanes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD. 465
21834 153116 cd01058 AAMH_B Aromatic and Alkene Monooxygenase Hydroxylase, subunit B, ferritin-like diiron-binding domain. Aromatic and Alkene Monooxygenase Hydroxylases, subunit B (AAMH_B). Subunit B (beta) of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases are members of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domain of the alpha- and noncatalytic beta-subunits possess nearly identical folds; the beta-subunit lacks the C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkanes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD. 304
21835 153121 cd01059 CCC1_like CCC1-related family of proteins. CCC1_like: This protein family includes the proteins related to CCC1, a yeast vacuole transmembrane protein responsible for the iron and manganese transport from the cytosol into vacuole. It also includes the proteins similar to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. 143
21836 238511 cd01060 Membrane-FADS-like The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of unactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase. 122
21837 238512 cd01061 RNase_T2_euk Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'-terminal ends of the scissile bond, respectively. This CD includes the eukaryotic RNase T2 family members. 195
21838 238513 cd01062 RNase_T2_prok Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'-terminal ends of the scissile bond, respectively. This CD includes the prokaryotic RNase T2 family members. 184
21839 133443 cd01065 NAD_bind_Shikimate_DH NAD(P) binding domain of Shikimate dehydrogenase. Shikimate dehydrogenase (DH) is an amino acid DH family member. The shikimate pathway links metabolism of carbohydrates to de novo biosynthesis of aromatic amino acids, quinones and folate. It is essential in plants, bacteria, and fungi but absent in mammals, thus making enzymes involved in this pathway ideal targets for broad spectrum antibiotics and herbicides. Shikimate DH catalyzes the reduction of 3-dehydroshikimate to shikimate using the cofactor NADH. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DHs, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 155
21840 238514 cd01066 APP_MetAP A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation. 207
21841 381255 cd01067 Globin-like Globin-like protein superfamily. This globin-like domain superfamily contains a wide variety of all-helical proteins that bind porphyrins, phycobilins, and other non-heme cofactors, and play various roles in all three kingdoms of life, including sensors or transporters of oxygen. It includes the M/myoglobin-like, S/sensor globin, and T/truncated globin (TrHb) families, and the phycobiliproteins (PBPs). The M family includes chimeric (FHbs/flavohemoglobins) and single-domain globins: FHbs, Ngbs/neuroglobins, Cygb/cytoglobins, GbE/avian eye specific globin E, GbX/globin X, amphibian GbY/globin Y, Mb/myoglobin, HbA/hemoglobin-alpha, HbB/hemoglobin-beta, SDgbs/single-domain globins related to FHbs, and Adgb/androglobin. The S family includes GCS/globin-coupled sensors, Pgbs/protoglobins, and SSDgbs/sensor single domain globins. The T family is classified into three main groups: TrHb1s (N), TrHb2s (O) and TrHb3s (P). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments (named A through H). For M family Adgbs, this globin domain is permuted, such that C-H are followed by A-B. The T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. PBPs bind the linear tetrapyrrole chromophore, phycobilin, a prosthetic group chemically and metabolically related to iron protoporphyrin IX/protoheme. Examples of other globin-like domains which bind non-heme cofactors include those of the Bacillus anthracis sporulation inhibitors pXO1-118 and pXO2-61 which bind fatty acid and halide in vitro, and the globin-like domain of Bacillus subtilis RsbRA which is presumed to channel sensory input to the C-terminal sulfate transporter/anti-sigma factor antagonist (STAT) domain. RsbRA is a component of the sigma B-activating stressosome, and a regulator of the RNA polymerase sigma factor subunit sigma (B). 119
21842 381256 cd01068 globin_sensor Globin sensor domain of globin-coupled-sensors (GCSs), protoglobins (Pgbs), and sensor single-domain globins (SSDgbs); S family. This family includes sensor domains which bind porphyrins and other non-heme cofactors. GCSs have an N-terminal sensor domain coupled to a functional domain. For heme-bound oxygen sensing/binding globin domains, O2 binds to/dissociates from the heme iron complex inducing a structural change in the sensor domain, which is then transduced to the functional domain, switching on (or off) the function of the latter. Functional domains include DGC/GGDEF, EAL, histidine kinase, MCP, PAS, and GAF domains. Characterized members include Bacillus subtilis heme-based aerotaxis transducer (HemAT-Bs) which has a sensor domain coupled to an MCP domain. HemAT-Bs mediates an aerophilic response, and may control the movement direction of bacteria and archaea. Its MCP domain interacts with the CheA histidine kinase, a component of the CheA/CheY signal transduction system that regulates the rotational direction of flagellar motors. Another GCS having the sensor domain coupled to an MCP domain is Caulobacter crescentus McpB. McpB is encoded by a gene which lies adjacent to the major chemotaxis operon. Like McpA (encoded on this operon), McpB has three potential methylation sites, a C-terminal CheBR docking motif, and a motif needed for proteolysis via a ClpX-dependent pathway during the swarmer-to-stalked cell transition. Also included is Geobacter sulfurreducens GCS, a GCS of unknown function, in which the sensor domain is coupled to a transmembrane signal-transduction domain. Pgbs are single-domain globins of unknown function. Methanosarcina acetivorans Pgb is dimeric and has an N-terminal extension, which together with other Pgb-specific loops, buries the heme within the protein; small ligand molecules gain access to the heme via two orthogonal apolar tunnels. Pgbs and other single-domain globins can function as sensors when coupled to an appropriate regulator domain. 146
21843 270231 cd01069 PBP2_PheC Cyclohexadienyl dehydratase, a member of the type 2 periplasmic binding fold protein superfamily. This subfamily includes cyclohexadienyl dehydratase PheC. These proteins catalyze the decarboxylation of prephenate to phenylpyruvate in the alternative phenylalanine biosynthesis pathway in some proteobacteria and archaea. The PheC proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. Since the PheC proteins are so similar to periplasmic binding proteins (PBPs), it is evolutionarily plausible that several pre-existing PBP proteins might have been recruited to perform the enzymatic function. 232
21844 270232 cd01071 PBP2_PhnD_like Substrate binding domain of phosphonate uptake system-like, a member of the type 2 periplasmic-binding fold superfamily. This family includes alkylphosphonate binding domain PhnD. These domains are found in PhnD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. PhnD is the periplasmic binding component of an ABC-type phosphonate uptake system (PhnCDE) that recognizes and binds phosphonate. PhnD belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. The PBP2 have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 253
21845 270233 cd01072 PBP2_SMa0082_like The substrate-binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Sinorhizobium meliloti and its related proteins. The putative SMa0082-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 238
21846 133444 cd01075 NAD_bind_Leu_Phe_Val_DH NAD(P) binding domain of leucine dehydrogenase, phenylalanine dehydrogenase, and valine dehydrogenase. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. For example, leucine DH catalyzes the reversible oxidative deamination of L-leucine and several other straight or branched chain amino acids to the corresponding 2-oxoacid derivative. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 200
21847 133445 cd01076 NAD_bind_1_Glu_DH NAD(P) binding domain of glutamate dehydrogenase, subgroup 1. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. Glutamate DH is a multidomain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 227
21848 133446 cd01078 NAD_bind_H4MPT_DH NADP binding domain of methylene tetrahydromethanopterin dehydrogenase. Methylene Tetrahydromethanopterin Dehydrogenase (H4MPT DH) NADP binding domain. NADP-dependent H4MPT DH catalyzes the dehydrogenation of methylene-H4MPT and methylene-tetrahydrofolate (H4F) with NADP+ as cofactor. H4F and H4MPT are both cofactors that carry the one-carbon units between the formyl and methyl oxidation level. H4F and H4MPT are structurally analogous to each other with respect to the pterin moiety, but each has a distinct side chain. H4MPT is present only in anaerobic methanogenic archaea and aerobic methylotrophic proteobacteria. H4MPT seems to have evolved independently from H4F and functions as a distinct carrier in C1 metabolism. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 194
21849 133447 cd01079 NAD_bind_m-THF_DH NAD binding domain of methylene-tetrahydrofolate dehydrogenase. The NAD-binding domain of methylene-tetrahydrofolate dehydrogenase (m-THF DH). M-THF is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of metabolic intermediate m-THF to methenyl-THF requires the enzyme m-THF DH. M-THF DH is a component of an unusual monofunctional enzyme; in eukaryotes, m-THF DH is typically found as part of a multifunctional protein. NADP-dependent m-THF DHs in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, monofunctional DH, as well as bifunctional DH/cyclohydrolase, are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. This family contains only the monofunctional DHs from S. cerevisiae and certain bacteria. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 197
21850 133448 cd01080 NAD_bind_m-THF_DH_Cyclohyd NADP binding domain of methylene-tetrahydrofolate dehydrogenase/cyclohydrolase. NADP binding domain of the Methylene-Tetrahydrofolate Dehydrogenase/cyclohydrolase (m-THF DH/cyclohydrolase) bifunctional enzyme. Tetrahydrofolate is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of metabolic intermediate m-THF to methenyl-THF requires the enzyme m-THF DH. In addition, most DHs also have an associated cyclohydrolase activity which catalyzes its hydrolysis to N10-formyltetrahydrofolate. m-THF DH is typically found as part of a multifunctional protein in eukaryotes. NADP-dependent m-THF DHs in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, monofunctional DH, as well as bifunctional DH/cyclohydrolase, are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. This family contains the bifunctional DH/cyclohydrolase. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. 168
21851 185695 cd01081 Aldose_epim aldose 1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen. 284
21852 238517 cd01083 GAG_Lyase Glycosaminoglycan (GAG) polysaccharide lyase family. This family consists of a group of secreted bacterial lyase enzymes capable of acting on glycosaminoglycans, such as hyaluronan and chondroitin, in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. These are broad-specificity glycosaminoglycan lyases which recognize uronyl residues in polysaccharides and cleave their glycosidic bonds via a beta-elimination reaction to form a double bond between C-4 and C-5 of the non-reducing terminal uronyl residues of released products. Substrates include chondroitin, chondroitin 4-sulfate, chondroitin 6-sulfate, and hyaluronic acid. Family members include chondroitin AC lyase, chondroitin ABC lyase, xanthan lyase, and hyaluronate lyase. 693
21853 238518 cd01085 APP X-Prolyl Aminopeptidase 2. E.C. 3.4.11.9. Also known as X-Pro aminopeptidase, proline aminopeptidase, aminopeptidase P, and aminoacylproline aminopeptidase. Catalyses release of any N-terminal amino acid, including proline, that is linked with proline, even from a dipeptide or tripeptide. 224
21854 238519 cd01086 MetAP1 Methionine Aminopeptidase 1. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and Peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides. 238
21855 238520 cd01087 Prolidase Prolidase. E.C. 3.4.13.9. Also known as Xaa-Pro dipeptidase, X-Pro dipeptidase, proline dipeptidase, imidodipeptidase, peptidase D, gamma-peptidase. Catalyses hydrolysis of Xaa-Pro dipeptides; also acts on aminoacyl-hydroxyproline analogs. No action on Pro-Pro. 243
21856 238521 cd01088 MetAP2 Methionine Aminopeptidase 2. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides. 291
21857 238522 cd01089 PA2G4-like Related to aminopeptidase M, this family contains proliferation-associated protein 2G4. Family members have been implicated in cell cycle control. 228
21858 238523 cd01090 Creatinase Creatine amidinohydrolase. E.C. 3.5.3.3. Hydrolyzes creatine to sarcosine and urea. 228
21859 238524 cd01091 CDC68-like Related to aminopeptidase P and aminopeptidase M, a member of this domain family is present in cell division control protein 68, a transcription factor. 243
21860 238525 cd01092 APP-like Similar to Prolidase and Aminopeptidase P. The members of this subfamily presumably catalyse hydrolysis of Xaa-Pro dipeptides and/or release of any N-terminal amino acid, including proline, that is linked with proline. 208
21861 238526 cd01093 CRIB_PAK_like PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. This subgroup of CRIB/PBD-domains is found N-terminal of Serine/Threonine kinase domains in PAK and PAK-like proteins. 46
21862 238527 cd01094 Alkanesulfonate_monoxygenase Alkanesulfonate monoxygenase is the monoxygenase of a two-component system that catalyzes the conversion of alkanesulfonates to the corresponding aldehyde and sulfite. Alkanesulfonate monoxygenase (SsuD) has an absolute requirement for reduced flavin mononucleotide (FMNH2), which is provided by the NADPH-dependent FMN oxidoreductase (SsuE). 244
21863 238528 cd01095 Nitrilotriacetate_monoxgenase Nitrilotriacetate monooxygenase oxidizes nitrilotriacetate utilizing reduced flavin mononucleotide (FMNH2) and oxygen. The FMNH2 is provided by an NADH:flavin mononucleotide (FMN) oxidoreductase that uses NADH to reduce FMN to FMNH2. 358
21864 238529 cd01096 Alkanal_monooxygenase Alkanal monooxygenases are flavin monooxygenases. Molecular oxygen is activated by reaction with reduced flavin mononucleotide (FMNH2) and reacts with an aldehyde to yield the carboxylic acid, oxidized flavin (FMN) and blue-green light. Bacterial luciferases are heterodimers made of alpha and beta subunits which are homologous. The single active center is on the alpha subunit. The alpha subunit has a stretch of 30 amino acid residues that is not present in the beta subunit. The beta subunit does not contain the active site and is required for the formation of the fully active heterodimer. The beta subunit does not contribute anything directly to the active site. Its role is probably to stabilize the high quantum yield conformation of the alpha subunit through interactions across the subunit interface. 315
21865 238530 cd01097 Tetrahydromethanopterin_reductase N5,N10-methylenetetrahydromethanopterin reductase (Mer) catalyzes the reduction of N5,N10-methylenetetrahydromethanopterin with reduced coenzyme F420 to N5-methyltetrahydromethanopterin and oxidized coenzyme F420. 202
21866 238531 cd01098 PAN_AP_plant Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization. 84
21867 238532 cd01099 PAN_AP_HGF Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 80
21868 238533 cd01100 APPLE_Factor_XI_like Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 73
21869 238534 cd01102 Link_Domain The link domain is a hyaluronan (HA)-binding domain. It functions to mediate adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It is found in the CD44 receptor and in human TSG-6. TSG-6 is the protein product of the tumor necrosis factor-stimulated gene-6. TSG-6 has a strong anti-inflammatory effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. This group also contains the link domains of the chondroitin sulfate proteoglycan core proteins (CSPG) including aggrecan, versican, neurocan, and brevican and the link domains of the vertebrate HAPLN (HA and proteoglycan binding link) protein family. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates in which other CSPGs substitute for aggrecan might contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. TSG-6 contains a single link module which supports high affinity binding with HA. The functional HA-binding domain of CD44 is an extended domain comprised of a link module flanked with N- and C-extensions. These extensions are essential for folding and functional activity. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of the CSPG aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) which contains link modules 3 and 4 which lack HA-binding activity. HAPLNs contain two contiguous link modules. 92
21870 133379 cd01104 HTH_MlrA-CarA Helix-Turn-Helix DNA binding domain of the transcription regulators MlrA and CarA. Helix-turn-helix (HTH) transcription regulator MlrA (merR-like regulator A), N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. Its close homolog, CarA from Myxococcus xanthus, is involved in activation of the carotenoid biosynthesis genes by light. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA- and CarA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins. 68
21871 133380 cd01105 HTH_GlnR-like Helix-Turn-Helix DNA binding domain of GlnR-like transcription regulators. Helix-turn-helix (HTH) transcription regulator GlnR and related proteins, N-terminal domain. The GlnR and TnrA (also known as ScgR) proteins have been shown to regulate expression of glutamine synthetase as well as several genes involved in nitrogen metabolism. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 88
21872 133381 cd01106 HTH_TipAL-Mta Helix-Turn-Helix DNA binding domain of the transcription regulators TipAL, Mta, and SkgA. Helix-turn-helix (HTH) TipAL, Mta, and SkgA transcription regulators, and related proteins, N-terminal domain. TipAL regulates resistance to and activation by numerous cyclic thiopeptide antibiotics, such as thiostrepton. Mta is a global transcriptional regulator; the N-terminal DNA-binding domain of Mta interacts directly with the promoters of mta, bmr, blt, and ydfK, and induces transcription of these multidrug-efflux transport genes. SkgA has been shown to control stationary-phase expression of catalase-peroxidase in Caulobacter crescentus. These proteins are comprised of distinct domains that harbor an N-terminal active (DNA-binding) site and a regulatory (effector-binding) site. The conserved N-terminal domain of these transcription regulators contains winged HTH motifs that mediate DNA binding. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Unique to this family, is a TipAL-like, lineage specific Bacilli subgroup, which has five conserved cysteines in the C-terminus of the protein. 103
21873 133382 cd01107 HTH_BmrR Helix-Turn-Helix DNA binding domain of the BmrR transcription regulator. Helix-turn-helix (HTH) multidrug-efflux transporter transcription regulator, BmrR and YdfL of Bacillus subtilis, and related proteins; N-terminal domain. Bmr is a membrane protein which causes the efflux of a variety of toxic substances and antibiotics. BmrR is comprised of two distinct domains that harbor a regulatory (effector-binding) site and an active (DNA-binding) site. The conserved N-terminal domain contains a winged HTH motif that mediates DNA binding, while the C-terminal domain binds coactivating, toxic compounds. BmrR shares the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 108
21874 133383 cd01108 HTH_CueR Helix-Turn-Helix DNA binding domain of CueR-like transcription regulators. Helix-turn-helix (HTH) transcription regulators CueR and ActP, copper efflux regulators. In Bacillus subtilis, copper induced CueR regulates the copZA operon, preventing copper toxicity. In Rhizobium leguminosarum, ActP controls copper homeostasis; it detects cytoplasmic copper stress and activates transcription in response to increasing copper concentrations. These proteins are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain winged HTH motifs that mediate DNA binding, while the C-terminal domains have two conserved cysteines that define a monovalent copper ion binding site. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 127
21875 133384 cd01109 HTH_YyaN Helix-Turn-Helix DNA binding domain of the MerR-like transcription regulators YyaN and YraB. Putative helix-turn-helix (HTH) MerR-like transcription regulators of Bacillus subtilis, YyaN and YraB, and related proteins; N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 113
21876 133385 cd01110 HTH_SoxR Helix-Turn-Helix DNA binding domain of the SoxR transcription regulator. Helix-turn-helix (HTH) transcriptional regulator SoxR. The global regulator, SoxR, up-regulates gene expression of another transcription activator, SoxS, which directly stimulates the oxidative stress regulon genes in E. coli. The soxRS response renders the bacterial cell resistant to superoxide-generating agents, macrophage-generated nitric oxide, organic solvents, and antibiotics. The SoxR proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the unusually long spacer between the -35 and -10 promoter elements. They also harbor a regulatory C-terminal domain containing an iron-sulfur center. 139
21877 133386 cd01111 HTH_MerD Helix-Turn-Helix DNA binding domain of the MerD transcription regulator. Helix-turn-helix (HTH) transcription regulator MerD. The putative secondary regulator of mercury resistance (mer) operons, MerD, has been shown to down-regulate the expression of this operon in gram-negative bacteria. It binds to the same operator DNA as MerR that activates transcription of the operon in the presence of mercury ions. The MerD protein shares the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily, which promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are conserved and contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 107
21878 238535 cd01115 SLC13_permease Permease SLC13 (solute carrier 13). The sodium/dicarboxylate cotransporter NaDC-1 has been shown to translocate Krebs cycle intermediates such as succinate, citrate, and alpha-ketoglutarate across plasma membranes in rabbit, human, and rat kidney. It is related to renal and intestinal Na+/sulfate cotransporters and a few putative bacterial permeases. The SLC13-type proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium and various anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease is composed of 8-13 transmembrane helices. 382
21879 238536 cd01116 P_permease Permease P (pink-eyed dilution). Mutations in the human melanosomal P gene were responsible for classic phenotype of oculocutaneous albinism type 2 (OCA2). Although the precise function of the P protein is unknown, it was predicted to regulate the intraorganelle pH, together with the ATP-driven proton pump. It shows significant sequence similarity to the Na+/H+ antiporter NhaD from Vibrio parahaemolyticus. Both proteins belong to ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease contains 8-13 transmembrane domains. 413
21880 238537 cd01117 YbiR_permease Putative anion permease YbiR. Based on sequence similarity, YbiR proteins are predicted to function as anion translocating permeases in eubacteria, archaea and plants. They belong to ArsB/NhaD superfamily of permeases that have been shown to translocate sodium, sulfate, arsenite and organic anions. A typical ArsB/NhaD permease is composed of 8-13 transmembrane domains. 384
21881 238538 cd01118 ArsB_permease Anion permease ArsB. These permeases have been shown to export arsenate and antimonite in eubacteria and archaea. A typical ArsB permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump (ArsAB). The ArsAB complex is similar in many ways to ATP-binding cassette transporters, which have two groups of six transmembrane-spanning helical segments and two nucleotide-binding domains. The ArsB proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life. 416
21882 238539 cd01119 Chemokine_CC_DCCL Chemokine_CC_DCCL: subgroup of the Chemokine_CC subgroup based on the presence of a DCCL motif involving the two N-terminal cysteine residues; includes a number of small inducible cytokines capable of reversibly inhibiting normal hematopoietic progenitor proliferation by blocking progression through the cell cycle; DCCL subgroup contains Exodus-1 (also known as CCL20, MIP-3alpha, LARC, ST38 (mouse)), Exodus-2 (also known as CCL21, SLC, 6-Ckine, TCA4, CKbeta9), and Exodus-3 (also known as CCL-19, ELC, MIP-3beta, CKbeta11). Exodus-3 was shown to inhibit the growth of human breast cancer cells in vivo in a mouse model; Exodus-1, -2, and -3 were all shown to significantly inhibit chronic myelogenous leukemia progenitor cell proliferation; Exodus-2 and -3 show potent immunotherapeutic activity toward solid tumors; chemotactic for T cells, B cells, dendritic cells, macrophage progenitor cells, and NK cells; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates. See CDs: Chemokine_CC (cd00272) for the entire CC subgroup, Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups. 61
21883 410865 cd01120 RecA-like_superfamily RecA-like_NTPases. RecA-like NTPases. This superfamily includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 119
21884 410866 cd01121 RadA_SMS_N bacterial RadA DNA repair protein. Sms or bacterial RadA is a DNA repair protein that plays a role in recombination and recombinational repair of DNA damaged by UV radiation, X-rays, and chemical agents and is responsible for the stabilization or processing of branched DNA molecules. 268
21885 410867 cd01122 Twinkle_C C-terminal domain of Twinkle. Twinkle (T7 gp4-like protein with intramitochondrial nucleoid localization, also known as C10orf2, PEO1, SCA8, ATXN8, IOSCA, PEOA3 or SANDO) is a homohexameric DNA helicase that unwinds short stretches of double-stranded DNA in the 5' to 3' direction and, along with mitochondrial single-stranded DNA binding protein and mtDNA polymerase gamma, is thought to play a key role in mtDNA replication. Mutations in the human gene cause infantile onset spinocerebellar ataxia (IOSCA) and progressive external ophthalmoplegia (PEO) and are also associated with several mitochondrial depletion syndromes. This group also contains viral GP4-like and related bacterial helicases. 266
21886 410868 cd01123 Rad51_DMC1_archRadA recombinase Rad51, DMC1, and archaeal RadA. This group of recombinases includes the eukaryotic proteins RAD51, RAD55/57 and the meiosis-specific protein DMC1, and the archaeal protein RadA. They are closely related to the bacterial RecA group. Rad51 proteins catalyze a recombination reaction similar to that of RecA, using ATP-dependent DNA binding activity and a DNA-dependent ATPase. However, this reaction is less efficient and requires accessory proteins such as RAD55/57. 234
21887 410869 cd01124 KaiC-like Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 222
21888 410870 cd01125 RepA_RSF1010_like Hexameric Replicative Helicase RepA of plasmid RSF1010 and related proteins. This family includes the homo-hexameric replicative helicase RepA encoded by plasmid RSF1010. RSF1010 is found in most Gram-negative bacteria and some Gram-positive bacteria. The RepA protein of plasmid RSF1010 is a 5'-3' DNA helicase which can utilize ATP, dATP, GTP and dGTP (and CTP and dCTP to a lesser extent). 238
21889 410871 cd01127 TrwB_TraG_TraD_VirD4 TrwB/TraG/TraD/VirD4 family of bacterial conjugation proteins. The TraG/TraD/VirD4 family are bacterial conjugation proteins involved in type IV secretion (T4S) systems, versatile bacterial secretion systems mediating transport of protein and/or DNA. They are present in gram-negative and gram-positive bacteria, as well as archaea. They form hexameric rings and belong to the RecA-like NTPases superfamily, which also includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. 144
21890 410872 cd01128 rho_factor_C C-terminal ATP binding domain of transcription termination factor rho. Transcription termination factor rho is a bacterial ATP-dependent RNA/DNA helicase. It is a homohexamer. Each monomer consists of an N-terminal oligonucleotide/oligosaccharide binding fold (OB-fold) domain which binds cytosine-rich nucleotides, and a C-terminal ATP binding domain. This alignment is of the C-terminal ATP binding domain. 249
21891 410873 cd01129 PulE-GspE-like PulE-GspE family. PulE and General secretory pathway protein GspE are ATPases of the type II secretory pathway, the main terminal branch of the general secretory pathway (GSP). PulE is a cytoplasmic protein of the GSP, which contains an ATP binding site and a tetracysteine motif. This subgroup also includes PilB, a type IV pilus assembly ATPase, DotB, an ATPase of the type IVb secretion system, also known as the dot/icm system, Escherichia coli IncI plasmid-encoded conjugative transfer ATPase TraJ, and HofB. 159
21892 410874 cd01130 VirB11-like_ATPase Type IV secretory pathway component VirB11-like. Type IV secretory pathway component VirB11, and related ATPases. The homohexamer, VirB11 is one of eleven Vir (virulence) proteins, which are required for T-pilus biogenesis and virulence in the transfer of T-DNA from the bacterial Ti (tumor-inducing)-plasmid into plant cells. The pilus is a fibrous cell surface organelle, which mediates adhesion between bacteria during conjugative transfer or between bacteria and host eukaryotic cells during infection. VirB11-related ATPases include Sulfolobus acidocaldarius FlaI, which plays key roles in archaellum (archaeal flagellum) assembly and motility functions, and the pilus assembly proteins CpaF/TadA and TrbB. This alignment contains the C-terminal domain, which is the ATPase. 177
21893 410875 cd01131 PilT Pilus retraction ATPase PilT. Pilus retraction ATPase PilT is a nucleotide-binding protein responsible for the retraction of type IV pili, likely by pili disassembly. This retraction provides the force required for travel of bacteria in low water environments by a mechanism known as twitching motility. 223
21894 410876 cd01132 F1-ATPase_alpha_CD F1 ATP synthase alpha subunit, central domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The mitochondrial extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating alpha and beta subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton-translocating domain. 274
21895 410877 cd01133 F1-ATPase_beta_CD F1 ATP synthase beta subunit, central domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The mitochondrial extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating alpha and beta subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton-translocating domain. 277
21896 410878 cd01134 V_A-ATPase_A V/A-type ATP synthase catalytic subunit A. V/A-type ATP synthase catalytic subunit A. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. Vacuolar (V-type) ATPases play major roles in endomembrane and plasma membrane proton transport in eukaryotes. They are found in multiple intracellular membranes including vacuoles, endosomes, lysosomes, Golgi-derived vesicles, secretory vesicles, as well as the plasma membrane. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase). A similar protein is also found in a few bacteria. 288
21897 410879 cd01135 V_A-ATPase_B V/A-type ATP synthase subunit B. V/A-type ATP synthase (non-catalytic) subunit B. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. Vacuolar (V-type) ATPases play major roles in endomembrane and plasma membrane proton transport in eukaryotes. They are found in multiple intracellular membranes including vacuoles, endosomes, lysosomes, Golgi-derived vesicles, secretory vesicles, as well as the plasma membrane. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase). A similar protein is also found in a few bacteria. This subfamily consists of the non-catalytic beta subunit. 282
21898 410880 cd01136 ATPase_flagellum-secretory_path_III Flagellum-specific ATPase/type III secretory pathway virulence-related protein. Flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The bacterial flagellar motor is similar to the F0F1-ATPase, in that they both are proton-driven rotary molecular devices. However, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway. 265
21899 238557 cd01137 PsaA Metal binding protein PsaA. These proteins have been shown to function as initial receptors in ABC transport of Mn2+ and as surface adhesins in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 287
21900 238558 cd01138 FeuA Periplasmic binding protein FeuA. These proteins are predicted to function as initial receptors in ABC transport of metal ions in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 248
21901 238559 cd01139 TroA_f Periplasmic binding protein TroA_f. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 342
21902 238560 cd01140 FatB Siderophore binding protein FatB. These proteins have been shown to function as ABC-type initial receptors in the siderophore-mediated iron uptake in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 270
21903 238561 cd01141 TroA_d Periplasmic binding protein TroA_d. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 186
21904 238562 cd01142 TroA_e Periplasmic binding protein TroA_e. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 289
21905 238563 cd01143 YvrC Periplasmic binding protein YvrC. These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria and archaea. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 195
21906 238564 cd01144 BtuF Cobalamin binding protein BtuF. These proteins have been shown to function as initial receptors in ABC transport of vitamin B12 (cobalamin) in eubacterial and some archaeal species. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 245
21907 238565 cd01145 TroA_c Periplasmic binding protein TroA_c. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 203
21908 238566 cd01146 FhuD Fe3+-siderophore binding domain FhuD. These proteins have been shown to function as initial receptors in ABC transport of Fe3+-siderophores in many eubacterial species. They belong to the TroA-like superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA-like protein is comprised of two globular subdomains connected by a long alpha helix and binds its specific ligands in the cleft between these domains. 256
21909 238567 cd01147 HemV-2 Metal binding protein HemV-2. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 262
21910 238568 cd01148 TroA_a Metal binding protein TroA_a. These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 284
21911 238569 cd01149 HutB Hemin binding protein HutB. These proteins have been shown to function as initial receptors in ABC transport of hemin and hemoproteins in many eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 235
21912 173839 cd01150 AXO Peroxisomal acyl-CoA oxidase. Peroxisomal acyl-CoA oxidases (AXO) catalyze the first step in the peroxisomal fatty acid beta-oxidation, the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. In a second oxidative half-reaction, the reduced FAD is reoxidized by molecular oxygen. AXO is generally a homodimer, but it has been reported to form a different type of oligomer in yeast. There are several subtypes of AXO's, based on substrate specificity. Palmitoyl-CoA oxidase acts on straight-chain fatty acids and prostanoids; whereas, the closely related Trihydroxycoprostanoyl-CoA oxidase has the greatest activity for 2-methyl branched side chains of bile precursors. Pristanoyl-CoA oxidase acts on 2-methyl branched fatty acids. AXO has an additional domain, C-terminal to the region with similarity to acyl-CoA dehydrogenases, which is included in this alignment. 610
21913 173840 cd01151 GCD Glutaryl-CoA dehydrogenase. Glutaryl-CoA dehydrogenase (GCD). GCD is an acyl-CoA dehydrogenase, which catalyzes the oxidative decarboxylation of glutaryl-CoA to crotonyl-CoA and carbon dioxide in the catabolism of lysine, hydroxylysine, and tryptophan. It uses electron transfer flavoprotein (ETF) as an electron acceptor. GCD is a homotetramer. GCD deficiency leads to a severe neurological disorder in humans. 386
21914 173841 cd01152 ACAD_fadE6_17_26 Putative acyl-CoA dehydrogenases similar to fadE6, fadE17, and fadE26. Putative acyl-CoA dehydrogenases (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha, beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACD's are generally homotetramers and have an active site glutamate at a conserved position. 380
21915 173842 cd01153 ACAD_fadE5 Putative acyl-CoA dehydrogenases similar to fadE5. Putative acyl-CoA dehydrogenase (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACD's are generally homotetramers and have an active site glutamate at a conserved position. 407
21916 173843 cd01154 AidB Proteins involved in DNA damage response, similar to the AidB gene product. AidB is one of several genes involved in the SOS adaptive response to DNA alkylation damage, whose expression is activated by the Ada protein. Its function has not been entirely elucidated; however, it is similar in sequence and function to acyl-CoA dehydrogenases. It has been proposed that aidB directly destroys DNA alkylating agents such as nitrosoguanidines (nitrosated amides) or their reaction intermediates. 418
21917 173844 cd01155 ACAD_FadE2 Acyl-CoA dehydrogenases similar to fadE2. FadE2-like Acyl-CoA dehydrogenase (ACAD). Acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACAD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. ACAD's are generally homotetramers and have an active site glutamate at a conserved position. 394
21918 173845 cd01156 IVD Isovaleryl-CoA dehydrogenase. Isovaleryl-CoA dehydrogenase (IVD) is an acyl-CoA dehydrogenase, which catalyzes the third step in leucine catabolism, the conversion of isovaleryl-CoA (3-methylbutyryl-CoA) into 3-methylcrotonyl-CoA. IVD is a homotetramer and has the greatest affinity for small branched chain substrates. 376
21919 173846 cd01157 MCAD Medium chain acyl-CoA dehydrogenase. MCADs are mitochondrial beta-oxidation enzymes, which catalyze the alpha,beta dehydrogenation of the corresponding medium chain acyl-CoA by FAD, which becomes reduced. The reduced form of MCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. MCAD is a homotetramer. 378
21920 173847 cd01158 SCAD_SBCAD Short chain acyl-CoA dehydrogenases and eukaryotic short/branched chain acyl-CoA dehydrogenases. Short chain acyl-CoA dehydrogenase (SCAD). SCAD is a mitochondrial beta-oxidation enzyme. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of SCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. This subgroup also contains the eukaryotic short/branched chain acyl-CoA dehydrogenase (SBCAD), the bacterial butyryl-CoA dehydrogenase (BCAD) and 2-methylbutyryl-CoA dehydrogenase, which is involved in isoleucine catabolism. These enzymes are homotetramers. 373
21921 173848 cd01159 NcnH Naphthocyclinone hydroxylase. Naphthocyclinone is an aromatic polyketide and an antibiotic, which is active against Gram-positive bacteria. Polyketides are secondary metabolites, which have important biological functions such as antitumor, immunosuppressive or antibiotic activities. NcnH is a hydroxylase involved in the biosynthesis of naphthocyclinone and possibly other polyketides. 370
21922 173849 cd01160 LCAD Long chain acyl-CoA dehydrogenase. LCAD is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some prokaryotes. It catalyzes the alpha, beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of LCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. LCAD acts as a homodimer. 372
21923 173850 cd01161 VLCAD Very long chain acyl-CoA dehydrogenase. VLCAD is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some bacteria. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. VLCAD acts as a homodimer. 409
21924 173851 cd01162 IBD Isobutyryl-CoA dehydrogenase. Isobutyryl-CoA dehydrogenase (IBD) catalyzes the alpha, beta- dehydrogenation of short branched chain acyl-CoA intermediates in valine catabolism. It is predicted to be a homotetramer. 375
21925 173852 cd01163 DszC Dibenzothiophene (DBT) desulfurization enzyme C. DszC is a flavin reductase dependent enzyme, which catalyzes the first two steps of DBT desulfurization in mesophilic bacteria. DszC converts DBT to DBT-sulfoxide, which is then converted to DBT-sulfone. Bacteria with this enzyme are candidates for the removal of organic sulfur compounds from fossil fuels, which pollute the environment. An equivalent enzyme tdsC, is found in thermophilic bacteria. This alignment also contains a closely related uncharacterized subgroup. 377
21926 238570 cd01164 FruK_PfkB_like 1-phosphofructokinase (FruK), minor 6-phosphofructokinase (pfkB) and related sugar kinases. FruK plays an important role in the predominant pathway for fructose utilisation. This group also contains tagatose-6-phosphate kinase, an enzyme of the tagatose 6-phosphate pathway, which is responsible for breakdown of the galactose moiety during lactose metabolism by bacteria such as L. lactis. 289
21927 349496 cd01165 BTB_POZ BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain superfamily. Proteins in this superfamily are characterized by the presence of a common protein-protein interaction motif of about 100 amino acids, known as the BTB/POZ domain. Members include transcription factors, oncogenic proteins, ion channel proteins, and potassium channel tetramerization domain (KCTD) proteins. They have been identified in poxviruses and many eukaryotes, and have diverse functions, such as transcriptional regulation, chromatin remodeling, protein degradation and cytoskeletal regulation. Many BTB/POZ proteins contain one or two additional domains, such as kelch repeats, zinc-finger domains, FYVE (Fab1, YOTB, Vac1, and EEA1) fingers, or ankyrin repeats, among others. These special additional domains or interaction partners provide unique characteristics and functions to BTB/POZ proteins. In ion channel proteins and KCTD proteins, the BTB/POZ domain is also called the tetramerization (T1) domain. 79
21928 238571 cd01166 KdgK 2-keto-3-deoxygluconate kinase (KdgK) phosphorylates 2-keto-3-deoxygluconate (KDG) to form 2-keto-3-deoxy-6-phosphogluconate (KDGP). KDG is the common intermediate that allows organisms to channel D-glucuronate and/or D-galacturonate into glycolysis and therefore use polymers such as pectin and xylan as carbon sources. 294
21929 238572 cd01167 bac_FRK Fructokinases (FRKs) mainly from bacteria and plants are enzymes with high specificity for fructose, as are all FRKs, but they catalyze the conversion of fructose to fructose-6-phosphate, which is an entry point into glycolysis via conversion into glucose-6-phosphate. This is in contrast to FRKs [or ketohexokinases (KHKs)] from mammals and halophilic archaebacteria, which phosphorylate fructose to fructose-1-phosphate. 295
21930 238573 cd01168 adenosine_kinase Adenosine kinase (AK) catalyzes the phosphorylation of ribofuranosyl-containing nucleoside analogues at the 5'-hydroxyl using ATP or GTP as the phosphate donor. The physiological function of AK is associated with the regulation of extracellular adenosine levels and the preservation of intracellular adenylate pools. Adenosine kinase is involved in the purine salvage pathway. 312
21931 238574 cd01169 HMPP_kinase 4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate kinase (HMPP-kinase) catalyzes two consecutive phosphorylation steps in the thiamine phosphate biosynthesis pathway, leading to the synthesis of vitamin B1. The first step is the phosphorylation of the hydroxyl group of HMP to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate (HMP-P) and then the phosphorylation of HMP-P to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine pyrophosphate (HMP-PP), which is the substrate for the thiamine synthase coupling reaction. 242
21932 238575 cd01170 THZ_kinase 4-methyl-5-beta-hydroxyethylthiazole (Thz) kinase catalyzes the phosphorylation of the hydroxyl group of Thz, a reaction that allows cells to recycle Thz into the thiamine biosynthesis pathway, as an alternative to its synthesis from cysteine, tyrosine and 1-deoxy-D-xylulose-5-phosphate. 242
21933 238576 cd01171 YXKO-related B. subtilis YXKO protein of unknown function and related proteins. Based on the conservation of the ATP binding site, the substrate binding site and the Mg2+ binding site, and on structural homology, this group is a member of the ribokinase-like superfamily. 254
21934 238577 cd01172 RfaE_like RfaE encodes a bifunctional ADP-heptose synthase involved in the biosynthesis of the lipopolysaccharide (LPS) core precursor ADP-L-glycero-D-manno-heptose. LPS plays an important role in maintaining the structural integrity of the bacterial outer membrane of gram-negative bacteria. RfaE consists of two domains, a sugar kinase domain, represented here, and a domain belonging to the cytidylyltransferase superfamily. 304
21935 238578 cd01173 pyridoxal_pyridoxamine_kinase Pyridoxal kinase plays a key role in the synthesis of the active coenzyme pyridoxal-5'-phosphate (PLP), by catalyzing the phosphorylation of the precursor vitamin B6 in the presence of Zn2+ and ATP. Mammals are unable to synthesize PLP de novo and require its precursors in the form of vitamin B6 (pyridoxal, pyridoxine, and pyridoxamine) from their diet. Pyridoxal kinase encoding genes are also found in many other species including yeast and bacteria. 254
21936 238579 cd01174 ribokinase Ribokinase catalyses the phosphorylation of ribose to ribose-5-phosphate using ATP. This reaction is the first step in the ribose metabolism. It traps ribose within the cell after uptake and also prepares the sugar for use in the synthesis of nucleotides and histidine, and for entry into the pentose phosphate pathway. Ribokinase is dimeric in solution. 292
21937 238580 cd01175 IPT_COE IPT domain of the COE family (Col/Olf-1/EBF) of non-basic, helix-loop-helix (HLH)-containing transcription factors. COE family proteins are all transcription factors and play an important role in a variety of developmental processes. Mouse EBF is involved in the regulation of the early stages of B-cell differentiation, Drosophila collier is a regulator of head patterning, and a related protein in Xenopus is involved in primary neurogenesis. All COE family members have a well conserved DNA binding domain that contains an atypical Zn finger motif. The function of the IPT domain is unknown. 85
21938 238581 cd01176 IPT_RBP-Jkappa IPT domain of the recombination signal Jkappa binding protein (RBP-Jkappa). RBP-J kappa was initially considered to be involved in V(D)J recombination because of its DNA binding specificity and structural similarity to site-specific recombinases known as the integrase family. Further studies indicated that RBP-J kappa functions as a repressor of transcription, via destabilization of the general transcription factor IID and recruitment of histone deacetylase complexes. 97
21939 238582 cd01177 IPT_NFkappaB IPT domain of the transcription factor NFkappaB and related transcription factors. NFkappaB is considered a central regulator of stress responses, activated by different stressful conditions, including physical stress, oxidative stress, and exposure to certain chemicals. NFkappaB blocks apoptosis in several cell types, which gives it an important role in cell proliferation and differentiation. 102
21940 238583 cd01178 IPT_NFAT IPT domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes mainly involved in cell-cell interaction. 101
21941 238584 cd01179 IPT_plexin_repeat2 Second repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present, preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 85
21942 238585 cd01180 IPT_plexin_repeat1 First repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present, preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 94
21943 238586 cd01181 IPT_plexin_repeat3 Third repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present, preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 99
21944 271183 cd01182 INT_RitC_C_like C-terminal catalytic domain of recombinase RitC, a component of the recombinase trio. Recombinases belonging to the RitA (also known as pAE1 due to its presence in the deletion prone region of plasmid pAE1 of Alcaligenes eutrophus H1), RitB, and RitC families are associated in a complex referred to as a Recombinase in Trio (RIT) element. These RIT elements consist of three adjacent and unidirectional overlapping genes, one from each family (ritABC in order of transcription). All three integrases contain a catalytic motif, suggesting that they are all active enzymes. However, their specific roles are not yet fully understood. All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. 186
21945 271184 cd01184 INT_C_like_1 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 180
21946 271185 cd01185 INTN1_C_like Integrase IntN1 of Bacteroides mobilizable transposon NBU1 and similar proteins, C-terminal catalytic domain. IntN1 is a tyrosine recombinase for the integration and excision of Bacteroides mobilizable transposon NBU1 from the host chromosome. IntN1 does not require strict homology between the recombining sites seen with other tyrosine recombinases. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 161
21947 271186 cd01186 INT_tnpA_C_Tn554 Putative Transposase A from transposon Tn554, C-terminal catalytic domain. This family includes putative Transposase A from transposon Tn554. It belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 184
21948 271187 cd01187 INT_tnpB_C_Tn554 Putative Transposase B from transposon Tn554, C-terminal catalytic domain. This family includes putative Transposase B from transposon Tn554. It belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 142
21949 271188 cd01188 INT_RitA_C_like C-terminal catalytic domain of recombinase RitA, a component of the recombinase trio. Recombinases RitA (also known as pAE1), RitB, and RitC are encoded by three adjacent and overlapping genes. Collectively they are known as the Recombinase in Trio (RIT). This RitA family includes various bacterial integrases and integrases from the deletion-prone region of plasmid pAE1 of Alcaligenes eutrophus H1. All three integrases contain a catalytic motif, suggesting that they are all active enzymes. However, their specific roles are not fully understood. All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 179
21950 271189 cd01189 INT_ICEBs1_C_like C-terminal catalytic domain of integrases from bacterial phages and conjugate transposons. This family of tyrosine based site-specific integrases has origins in bacterial phages and conjugate transposons. One member is the integrase from Bacillus subtilis conjugative transposon ICEBs1. ICEBs1 can be excised and transferred to various recipients in response to DNA damage or high concentrations of potential mating partners. The family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 147
21951 271190 cd01190 INT_StrepXerD_C_like Putative XerD in Streptococcus pneumoniae and similar proteins, C-terminal catalytic domain. This family includes a putative XerD recombinase in Streptococcus pneumoniae and similar tyrosine recombinases. However, the members of this family contain active site motifs that differ from those of the XerD from Escherichia coli. E. coli XerD and homologous enzymes show four conserved amino acids R-H-R-H that are spaced along the C-terminal domain. The putative S. pneumoniae XerD contains three unique replacements at the conserved positions resulting in L-Q-R-L. Severe growth defects in a loss-of-function xerD mutant demonstrate an important in vivo function of the S. pneumoniae XerD protein. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 150
21952 271191 cd01191 INT_C_like_2 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 176
21953 271192 cd01192 INT_C_like_3 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 178
21954 271193 cd01193 INT_IntI_C Integron integrase and similar proteins, C-terminal catalytic domain. Integron integrases mediate site-specific DNA recombination between a proximal primary site (attI) and a secondary target site (attC) found within mobile gene cassettes encoding resistance or virulence factors. Unlike other site specific recombinases, the attC sites lack sequence conservation. Integron integrase exhibits broader DNA specificity by recognizing the non-conserved attC sites. The structure shows that DNA target site recognition is not dependent on canonical DNA but on the position of two flipped-out bases that interact in cis and in trans with the integrase. Integron-integrases are present in many naturally occurring mobile elements, including transposons and conjugative plasmids. Vibrio, Shewanella, Xanthomonas, and Pseudomonas species harbor chromosomal super-integrons. All integron-integrases carry large inserts unlike the TnpF ermF-like proteins also seen in this group. 176
21955 271194 cd01194 INT_C_like_4 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 174
21956 271195 cd01195 INT_C_like_5 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 170
21957 271196 cd01196 INT_C_like_6 Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 183
21958 271197 cd01197 INT_FimBE_like FimB and FimE and related proteins, integrase/recombinases. This CD includes proteins similar to E. coli FimE and FimB and Proteus mirabilis MrpI. FimB and FimE are the regulatory proteins during expression of type 1 fimbriae in Escherichia coli. The FimB and FimE proteins direct the phase switch into the 'on' and 'off' position. MrpI is the regulatory protein of Proteus mirabilis fimbriae expression. This family belongs to the integrase/recombinase superfamily. 181
21959 238605 cd01200 WHEPGMRS_RNA EPRS-like_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in three copies in the mammalian bifunctional EPRS in a region that separates the N-terminal GluRS from the C-terminal ProRS. In the Drosophila EPRS, this domain is repeated six times. It is found at the N-terminus of TrpRS, HisRS and GlyRS and at the C-terminus of MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 42
21960 275391 cd01201 PH_BEACH Pleckstrin homology domain in BEACH domain containing proteins. The BEACH domain is present in several eukaryotic proteins: CHS, neurobeachin (Nbea), LRBA (also called BGL, beige-like, or CDC4L), FAN, KIAA1607, and LvsA-LvsF. CHS is a rare, autosomal recessive disorder that can cause severe immunodeficiency and albinism in mammals and beige is the name for the CHS disease in mice. The CHS disease is associated with the presence of giant, perinuclear vesicles (lysosomes, melanosomes, and others) and CHS protein is thought to play an important role in the fusion, fission, or trafficking of these vesicles. All BEACH proteins contain the following domains: PH, BEACH, and WD40. The WD40 domain is involved in mediating protein-protein interactions involved in targeting proteins to subcellular compartments. The combined PH-BEACH motifs may present a single continuous structural unit involved in protein binding. Some members have an additional N-terminal Laminin G-like (LamG) domains Ca++ mediated receptors or an additional C-terminal FYVE zinc-binding domain which targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 112
21961 269913 cd01202 PTB_FRS2 Fibroblast growth factor receptor substrate 2 phosphotyrosine-binding domain. FRS2 (also called Suc1-associated neurotrophic factor (SNT)-induced tyrosine-phosphorylated target) proteins are membrane-anchored adaptor proteins. They are composed of an N-terminal myristoylation site followed by a phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a C-terminal effector domain containing multiple tyrosine and serine/threonine phosphorylation sites. The FRS2/SNT proteins show increased tyrosine phosphorylation by activated receptors, such as fibroblast growth factor receptor (FGFR) and TrkA, recruit SH2 domain containing proteins such as Grb2, and mediate signals from activated receptors to a variety of downstream pathways. The PTB domains of the SNT proteins directly interact with the canonical NPXpY motif of TrkA in a phosphorylation-dependent manner, whereas they directly bind to the juxtamembrane region of FGFR in a phosphorylation-independent manner. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup. 92
21962 269914 cd01203 PTB_DOK1_DOK2_DOK3 Downstream of tyrosine kinase 1, 2, and 3 proteins phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup. 99
21963 269915 cd01204 PTB_IRS Insulin receptor substrate phosphotyrosine-binding domain (PTBi). Insulin receptor substrate (IRS) molecules are mediators in insulin signaling and play a role in maintaining basic cellular functions such as growth and metabolism. They act as docking proteins between the insulin receptor and a complex network of intracellular signaling molecules containing Src homology 2 (SH2) domains. Four members (IRS-1, IRS-2, IRS-3, IRS-4) of this family have been identified that differ as to tissue distribution, subcellular localization, developmental expression, binding to the insulin receptor, and interaction with SH2 domain-containing proteins. IRS molecules have an N-terminal PH domain, followed by an IRS-like PTB domain which has a PH-like fold. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup. 106
21964 269916 cd01205 EVH1_WASP-like WASP family proteins EVH1 domain. The Wiskott-Aldrich Syndrome Protein (WASP; also called Bee1p) and its homolog N (neuronal)-WASP are signal transduction proteins that promote actin polymerization in response to upstream intracellular signals. WAS is an X-linked recessive disease, characterized by eczema, immunodeficiency, and thrombocytopenia. The majority of patients with WAS, or a milder version of the disorder, X-linked thrombocytopenia (XLT), have point mutations in the EVH1 domain of WASP. WASP is an actin regulatory protein consisting of an N-terminal EVH1 domain called WH1 which binds LPPPEP peptides, a basic region (B), a GTP binding domain (GBD), a proline rich region, a WH2 domain, and a verprolin-cofilin-acidic motif (VCA) which activates the actin-related protein (Arp)2/3 actin nucleating complex. The B, GBD, and the proline-rich region are involved in autoinhibitory interactions that repress or block the activity of the VCA. Yeast members lack the GTP binding domain. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 101
21965 269917 cd01206 EVH1_Homer_Vesl Homer/Vesl family proteins EVH1 domain. Homer/Vesl proteins are synaptic scaffolding proteins, required for long-term potentiation, a form of synaptic plasticity thought to underlie memory formation. They contains an N-terminal EVH1 domain and bind to both neurotransmitter receptors, such as the metabotropic group 1 glutamate receptor (mGluR) and to other scaffolding proteins via PPXXF motifs, in order to target them to the synaptic junction. These mGluRs possess a long C-terminal intracellular tail that may be important for subcellular localization of the receptor. The C-terminus is also the site of binding by the immediate early gene (IEG), Homer 1a. In contrast to Homer 1a, other Homer members additionally encode a C-terminal coiled-coil (CC) domain and form multivalent complexes that bind group 1 mGluRs. Homer 1a competes with constitutively expressed CC-Homers to modify the association of group 1 mGluRs with CC-Homer complexes. Since Homer proteins are strikingly enriched at the postsynaptic density (PSD), these observations suggest a role for the Homer family in regulating synaptic metabotropic receptor function. PSD-Zip45 (also named Homer 1c/Vesl-1L) has an EVH1 domain with a longer alpha-helix and its linking part included in the conserved region of Homer 1 (CRH1) interacts with the EVH1 domain of the neighbour CRH1 molecule in the crystal, suggesting that the EVH1 domain recognizes the PPXXF motif found in the binding partners, and the SPLTP sequence (P-motif) in the linking region of the CRH1. The two types of binding are partly overlapped in the EVH1 domain, implying a mechanism to regulate multimerization of Homer 1 family proteins. Homer 2 and Homer 3 are negative regulators of T cell activation. They bind the nuclear factor of activated T cells (NFAT) and compete with calcineurin binding. NFAT plays a critical role in calcium-dependent signaling in other cell types, including muscle and neurons. 
Homer-NFAT binding is also antagonized by active serine-threonine kinase AKT, enhancing TCR signaling via calcineurin-dependent dephosphorylation of NFAT resulting in changes in cytokine expression and an increase in effector-memory T cell populations in Homer-deficient mice. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 109
21966 269918 cd01207 EVH1_Ena_VASP-like Enabled/VASP family EVH1 domain. The Ena/VASP family includes proteins such as vasodilator-stimulated phosphoprotein (VASP), enabled gene product from Drosophila (Ena), mammalian enabled (Mena) and Ena/VASP-Like protein (EVL), which localize to focal adhesions and to sites of actin filament dynamics. These proteins share a common modular organization with highly conserved N- and C-terminal domains, termed Ena/VASP homology domains 1 and 2 (EVH1 and EVH2), that are separated by a central proline-rich domain. The EVH1 domain binds to other proteins at proline rich sequences. The majority of Ena-VASP type EVH1 domains recognize FPPPP motifs such as in the focal adhesion proteins zyxin and vinculin, and the ActA surface protein of Listeria monocytogenes; however, the LIM3 domain of Tes lacks the FPPPP motif but still binds the EVH1 domain of Mena. It has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. EVH2 mediates oligomerization within the family. The proline-rich region binds SH3 and WW domains as well as profilin, a protein that regulates actin filament dynamics. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 108
21967 269919 cd01208 PTB_X11 X11-like Phosphotyrosine-binding (PTB) domain. The function of the neuronal protein X11 is unknown to date. X11 has a PTB domain followed by two PDZ domains. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 161
21968 269920 cd01209 PTB_Shc Shc-like phosphotyrosine-binding (PTB) domain. Shc is a substrate for receptor tyrosine kinases, which can interact with phosphoproteins at NPXY motifs. Shc contains a PTB domain followed by an SH2 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Shc-like subgroup. 170
21969 269921 cd01210 PTB_EPS8 Epidermal growth factor receptor kinase substrate (EPS8)-like Phosphotyrosine-binding (PTB) domain. EPS8 is a regulator of Rac signaling. It consists of a PTB and an SH3 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 131
21970 269922 cd01211 PTB_Rab6GAP GTPase activating protein for Rab 6 Phosphotyrosine-binding (PTB) domain. GAPCenA is a centrosome-associated GTPase activating protein (GAP) for Rab 6. It consists of an N-terminal PTB domain and a C-terminal TBC domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 129
21971 269923 cd01212 PTB_JIP JNK-interacting protein-like (JIP) Phosphotyrosine-binding (PTB) domain. JIP is a mitogen-activated protein kinase scaffold protein. JIP consists of a C-terminal SH3 domain, followed by a PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 149
21972 269924 cd01213 PTB_tensin Tensin Phosphotyrosine-binding (PTB) domain. Tensin is a focal adhesion protein, which contains a C-terminal SH2 domain followed by a PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 136
21973 269925 cd01214 PTB_FAM43A Family with sequence similarity 43, member A (FAM43A) Phosphotyrosine-binding (PTB) domain. The function of FAM43A is currently unknown. Human FAM43A is located on chromosome 3 at location 3q29. It encodes a 3182 base pair mRNA which possesses one Pleckstrin homology-like domain. The mRNA translates into LOC131583, a hydrophilic protein that is predicted to localize in the nucleus. The FAM43A gene is conserved through a broad range of vertebrates. It is highly conserved from chimpanzees to zebrafish. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. 125
21974 269926 cd01215 PTB_Dab Disabled (Dab) Phosphotyrosine-binding domain. Dab is a cytosolic adaptor protein, which binds to the cytoplasmic tails of lipoprotein receptors, such as ApoER2 and VLDLR, via its PTB domain. The Dab PTB domain has a preference for unphosphorylated tyrosine within an NPxY motif. Additionally, the Dab PTB domain, which is structurally similar to PH domains, binds to phosphatidylinositol 4,5-bisphosphate in a manner characteristic of phosphoinositide binding PH domains. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 147
21975 241252 cd01217 PTB_CG12581 CG12581 Phosphotyrosine-binding (PTB) domain. The functions of CG12581 and its related proteins are unknown to date. Members here contain a single N-terminal PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 166
21976 269927 cd01218 PH_Phafin2-like Phafin2 (also called EAPF, FLJ13187, ZFYVE18 or PLEKHF2) Pleckstrin Homology (PH) domain. Phafin2 is differentially expressed in the liver cancer cell and regulates the structure and function of the endosomes through Rab5-dependent processes. Phafin2 modulates the cell's response to extracellular stimulation by modulating the receptor density on the cell surface. Phafin2 contains a PH domain and a FYVE domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
21977 275392 cd01219 PH1_FGD1 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 1, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles in regulating cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
21978 269928 cd01220 PH1_FARP1-like FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins Pleckstrin Homology (PH) domain, repeat 1. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FARP6 (also called Zinc finger FYVE domain-containing protein 24). They are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. Little is known about FARP1 and FARP6, though FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. FARP1 and FARP2 are composed of a N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. FARP6 is composed of Dbl-homology (DH), and two C-terminal PH domains separated by a FYVE domain. This hierarchy contains the first PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
21979 269929 cd01221 PH_ephexin Ephexin Pleckstrin homology (PH) domain. Ephexin-1 (also called NGEF/neuronal guanine nucleotide exchange factor) plays a role in the homeostatic modulation of presynaptic neurotransmitter release. Specific functions are still unknown for Ephexin-2 (also called RhoGEF19) and Ephexin-3 (also called Rho guanine nucleotide exchange factor 5/RhoGEF5, Transforming immortalized mammary oncogene/p60 TIM, and NGEF/neuronalGEF). Ephexin-4 (also called RhoGEF16) acts downstream of EphA2 to promote ligand-independent breast cancer cell migration and invasion toward epidermal growth factor through activation of RhoG. This in turn results in the activation of RhoG which recruits ELMO2 and Dock4 to form a complex with EphA2 at the tips of cortactin-rich protrusions in migrating breast cancer cells. Ephexin-5 is the specific GEF for RhoA activation and the regulation of vascular smooth muscle contractility. It interacts with EPHA4. The members of the Ephexin family contain a RhoGEF (DH) domain followed by a PH domain and an SH3 domain. The ephexin PH domain is believed to act with the DH domain in mediating protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 131
21980 269930 cd01223 PH_Vav Vav pleckstrin homology (PH) domain. Vav acts as a guanosine nucleotide exchange factor (GEF) for Rho/Rac proteins. Vav proteins control processes including T cell activation, phagocytosis, and migration of cells. The Vav subgroup of Dbl GEFs consists of three family members (Vav1, Vav2, and Vav3) in mammals. Vav1 is preferentially expressed in the hematopoietic system, while Vav2 and Vav3 are described by broader expression patterns. Mammalian Vav proteins consist of a calponin homology (CH) domain, an acidic region, a catalytic Dbl homology (DH) domain, a PH domain, a zinc finger cysteine rich domain (C1/CRD), and an SH2 domain, flanked by two SH3 domains. In invertebrates such as Drosophila and C. elegans, Vav is missing the N-terminal SH3 domain. The DH domain is involved in RhoGTPase recognition and selectivity and stimulates the reorganization of the switch regions for GDP/GTP exchange. The PH domain is implicated in directing membrane localization, allosteric regulation of guanine nucleotide exchange activity, and as a phospholipid-dependent regulator of GEF activity. Vavs bind RhoGTPases including Rac1, RhoA, RhoG, and Cdc42, while other members of the GEF family are specific for a single RhoGTPase. This promiscuity is thought to be a result of its CRD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 127
21981 269931 cd01224 PH_Collybistin_ASEF Collybistin/APC-stimulated guanine nucleotide exchange factor pleckstrin homology (PH) domain. Collybistin (also called PEM2) is homologous to the Dbl proteins ASEF (also called ARHGEF4/RhoGEF4) and SPATA13 (Spermatogenesis-associated protein 13; also called ASEF2). It activates CDC42 specifically and not any other Rho-family GTPases. Collybistin consists of an SH3 domain, followed by a RhoGEF/DH and PH domain. In Dbl proteins, the DH and PH domains catalyze the exchange of GDP for GTP in Rho GTPases, allowing them to signal to downstream effectors. It induces submembrane clustering of the receptor-associated peripheral membrane protein gephyrin, which is thought to form a scaffold underneath the postsynaptic membrane linking receptors to the cytoskeleton. It also acts as a tumor suppressor that links adenomatous polyposis coli (APC) protein, a negative regulator of the Wnt signaling pathway and promotes the phosphorylation and degradation of beta-catenin, to Cdc42. Autoinhibition of collybistin is accomplished by the binding of its SH3 domain with both the RhoGEF and PH domains to block access of Cdc42 to the GTPase-binding site. Inactivation promotes cancer progression. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 138
21982 269932 cd01225 PH_Cool_Pix Cloned out of library/PAK-interactive exchange factor pleckstrin homology (PH) domain. There are two forms of Pix proteins: alpha Pix (also called Rho guanine nucleotide exchange factor (GEF) 6/90Cool-2) and beta Pix (GEF7/p85Cool-1). betaPix contains an N-terminal SH3 domain, a RhoGEF/DH domain, a PH domain, a GIT1 binding domain (GBD), and a C-terminal coiled-coil (CC) domain. alphaPix differs in that it contains a calponin homology (CH) domain, which interacts with beta-parvin, N-terminal to the SH3 domain. alphaPix is an exchange factor for Rac1 and Cdc42 and mediates Pak activation on cell adhesion to fibronectin. Mutations in alphaPix can cause X-linked mental retardation. alphaPix also interacts with Huntington's disease protein (htt), and enhances the aggregation of mutant htt (muthtt) by facilitating SDS-soluble muthtt-muthtt interactions. The DH-PH domain of a Pix was required for its binding to htt. In the majority of Rho GEF proteins, the DH-PH domain is responsible for the exchange activity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
21983 269933 cd01226 PH_RalBD_exo84 Exocyst complex 84-kDa subunit Ral-binding domain/Pleckstrin Homology (PH) domain. The Sec6/8 complex, also called the exocyst complex, forms an octameric protein (Sec3, Sec5, Sec6, Sec8, Sec10, Sec15, Exo70 and Exo84) involved in the tethering of secretory vesicles to specific regions on the plasma membrane. The regulation of Sec6/8 complex differs between mammals and yeast. Mammalian Exo84 and Sec5 are effector targets for active Ral GTPases which are not present in yeast. Ral GTPases are members of the Ras superfamily, and as such cycle between an active GTP-bound state and an inactive GDP-bound state. The Exo84 Ral-binding domain adopts a PH domain fold. Mammalian Exo84 and Sec5 competitively bind to active RalA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 115
21984 269934 cd01227 PH_Dbs DBL's big sister protein pleckstrin homology (PH) domain. Dbs (also called MCF2-transforming sequence-like protein 2) is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a rhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 126
21985 269935 cd01228 PH_BCR-related Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. ABR, a related smaller protein, is structurally similar to BCR, but lacks the N-terminal kinase domain and has GAP activity for both Rac and Cdc42. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 166
21986 269936 cd01229 PH_Ect2 Epithelial cell transforming 2 (Ect2) pleckstrin homology (PH) domain. Ect2, a mammalian ortholog of Drosophila pebble, plays a role in neuronal differentiation and brain development. Pebble and Ect2 have been identified as Rho-family guanine nucleotide exchange factors (GEF) that mediate activation of Rho during cytokinesis, but are proposed to play slightly different roles. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 180
21987 269937 cd01230 PH1_Tiam1_2 T-lymphoma invasion and metastasis 1 and 2 Pleckstrin Homology (PH) domain, N-terminal domain. Tiam1 activates Rac GTPases to induce membrane ruffling and cell motility while Tiam2 (also called STEF (SIF (still life) and Tiam1 like-exchange factor)) contributes to neurite growth. Tiam1/2 are Dbl-family GEFs that possess a Dbl(DH) domain with a PH domain in tandem. The DH-PH domain catalyzes the GDP/GTP exchange reaction in the GTPase cycle, facilitating the switch between inactive GDP-bound and active GTP-bound states. Tiam1/2 possess two PH domains, which are often referred to as PHn and PHc domains. The DH-PH tandem domain is made up of the PHc domain while the PHn is part of a novel N-terminal PHCCEx domain which is made up of the PHn domain, a coiled coil region (CC), and an extra region (Ex). PHCCEx mediates binding to plasma membranes and signalling proteins in the activation of Rac GTPases. The PH domain resembles the beta-spectrin PH domain, suggesting non-canonical phosphatidylinositol binding. CC and Ex form a positively charged surface for protein binding. There are 2 motifs in Tiam1/2-interacting proteins that bind to the PHCCEx domain: Motif-I in CD44, ephrinBs, and the NMDA receptor and Motif-II in Par3 and JIP2. Neither of these fall in the PHn domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 127
21988 269938 cd01231 PH_SH2B_family SH2B adapter protein 1, 2, and 3 Pleckstrin homology (PH) domain. SH2B family/APS proteins are a family of intracellular adaptor proteins that influence a variety of signaling pathways mediated by Janus kinase (JAK) and receptor tyrosine kinases (RTKs) including receptors for insulin, insulin-like growth factor-1, Janus kinase 2 (Jak2), platelet derived growth factor, fibroblast growth factor and nerve growth factor. They function in glucose homeostasis, energy metabolism, hematopoiesis and reproduction. Mutations in human SH2B orthologs are associated with metabolic dysregulation and obesity. There are several SH2B members in mammals: SH2B1 (splice variants: SH2B1alpha, SH2B1beta, SH2B1gamma, and SH2B1delta), SH2B2 (APS) and SH2B3 (Lnk). They contain a PH domain, a SH2 domain, a proline rich region, multiple consensus sites for tyrosine and serine/threonine phosphorylation and a highly conserved c-Cbl recognition motif. These domains function as protein-protein interaction motifs which allow SH2B proteins to integrate and transduce intracellular signals from multiple signaling networks in the absence of intrinsic catalytic activity. SH2B proteins bind via their SH2 domains to phosphotyrosine residues within the intracellular tails of several activated RTKs thereby contributing to receptor activation. SH2B proteins have been shown to interact with insulin receptor substrates IRS1 and IRS2, Grb2, Shc and c-Cbl which may or may not require RTK-stimulated tyrosine phosphorylation of SH2B, positively and negatively regulating RTK signaling. Understanding the physiological functions of SH2B proteins in mammals has been complicated by the presence of multiple SH2B isoforms and conflicting data. Both SH2-Bbeta and APS associate with JAKs, but the former facilitates JAK/STAT signaling while the latter inhibits it. 
Lnk plays a role in cell growth and proliferation with mutations resulting in growth reduction, developmental delay and female sterility. Recently, Drosophila Lnk has been shown to be an important regulator of the insulin/insulin-like growth factor (IGF)-1 signaling (IIS) pathway during growth. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 115
21989 269939 cd01233 PH_KIFIA_KIFIB KIFIA and KIFIB protein pleckstrin homology (PH) domain. The kinesin-3 family motors KIFIA (Caenorhabditis elegans homolog unc-104) and KIFIB transport synaptic vesicle precursors that contain synaptic vesicle proteins, such as synaptophysin, synaptotagmin and the small GTPase RAB3A, but they do not transport organelles that contain plasma membrane proteins. They have an N-terminal motor domain, followed by a coiled-coil domain, and a C-terminal PH domain. KIF1A adopts a monomeric form in vitro, but acts as a processive dimer in vivo. KIF1B has alternatively spliced isoforms distinguished by the presence or absence of insertion sequences in the conserved amino-terminal region of the protein; this results in their different motor activities. KIF1A and KIF1B bind to RAB3 proteins through the adaptor protein mitogen-activated protein kinase (MAPK)-activating death domain (MADD; also called DENN), which was first identified as a RAB3 guanine nucleotide exchange factor (GEF). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
21990 269940 cd01234 PH_CADPS Ca2+-dependent activator protein (also called CAPS) Pleckstrin homology (PH) domain. CADPS/CAPS consists of two members, CAPS1 which regulates catecholamine release from neuroendocrine cells and CAPS2 which is involved in the release of two neurotrophins, brain-derived neurotrophic factor (BDNF) and neurotrophin-3 (NT-3) from cerebellar granule cells. CADPS plays an important role in vesicle exocytosis in neurons and endocrine cells where it functions to prime the exocytic machinery for Ca2+-triggered fusion. Priming involves the assembly of trans SNARE complexes. The initial interaction of vesicles with target membranes is mediated by diverse stage-specific tethering factors or multi-subunit tethering complexes. CADPS and Munc13 proteins are proposed to be the functional homologs of the stage-specific tethering factors that prime membrane fusion. Interestingly, regions in the C-terminal half of CADPS are similar to the C-terminal region of Munc13-1 that was reported to bind syntaxin-1. CADPS has independent interactions with each of the SNARE proteins (Q-SNARE and R-SNARE) required for vesicle fusion. CADPS interacts with Q-SNARE proteins syntaxin-1 (H3 SNARE) and SNAP-25 (SN1) and might promote Q-SNARE heterodimer formation. Through its N-terminal R-SNARE VAMP-2 interactions, CADPS bound to heterodimeric Q-SNARE complexes could be involved in catalyzing the zippering of VAMP-2 into recipient complexes. It also contains a central PH domain that binds to phosphoinositide 4,5 bisphosphate containing liposomes. Membrane association may also be mediated by binding to phosphatidylserine via general electrostatic interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. 
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 122
21991 269941 cd01235 PH_Sbf1_hMTMR5 Set binding factor 1 (also called Human MTMR5) Pleckstrin Homology (PH) domain. Sbf1 is a myotubularin-related pseudo-phosphatase. Both Sbf1 and myotubularin interact with the SET domains of Hrx and other epigenetic regulatory proteins, but Sbf1 lacks phosphatase activity due to several amino acid changes in its structurally preserved catalytic pocket. It contains pleckstrin (PH), GEF, and myotubularin homology domains that are thought to be responsible for signaling and growth control. Sbf1 functions as an inhibitor of cellular growth. The N-terminal GEF homology domain serves to inhibit the transforming effects of Sbf1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
21992 269942 cd01236 PH_RIP Rho-Interacting Protein Pleckstrin homology (PH) domain. RIP1-RhoGDI2 was obtained in a screen for proteins that bind to wild-type RhoA. RIP2, RIP3, and RIP4 were isolated from cDNA libraries with constitutively active V14RhoA (containing the C190R mutation). RIP2 represents a novel GDP/GTP exchange factor (RhoGEF), while RIP3 (p116Rip) and RIP4 are thought to be structural proteins. RhoGEF contains a Dbl(DH)/PH region, a zinc finger motif, a leucine-rich domain, and a coiled-coil region. The last 2 domains are thought to be involved in mediating protein-protein interactions. RIP3 is a negative regulator of RhoA signaling that inhibits, either directly or indirectly, RhoA-stimulated actomyosin contractility. In plants RIP3 is localized at microtubules and interacts with the kinesin-13 family member AtKinesin-13A, suggesting a role for RIP3 in microtubule reorganization and a possible function in Rho proteins of plants (ROP)-regulated polar growth. It has a PH domain, two proline-rich regions which are putative binding sites for SH3 domains, and a COOH-terminal coiled-coil region which overlaps with the RhoA-binding region. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 136
21993 269943 cd01237 PH_fermitin Fermitin family pleckstrin homology (PH) domain. Fermitin functions as a mediator of integrin inside-out signalling. The recruitment of Fermitin proteins and Talin to the membrane mediates the terminal event of integrin signalling, via interaction with integrin beta subunits. Fermitin has a FERM domain interrupted with a pleckstrin homology (PH) domain. Fermitin family homologs (Fermt1, 2, and 3, also known as Kindlins) are each encoded by a different gene. In mammalian studies, Fermt1 is generally expressed in epithelial cells, Fermt2 is expressed in muscle tissues, and Fermt3 is expressed in hematopoietic lineages. Specifically Fermt2 is expressed in smooth and striated muscle tissues in mice and in the somites (a trunk muscle precursor) and neural crest in Xenopus embryos. As such it has been proposed that Fermt2 plays a role in cardiomyocyte and neural crest differentiation. Expression of mammalian Fermt3 is associated with hematopoietic lineages: the anterior ventral blood islands, vitelline veins, and early myeloid cells. In Xenopus embryos, this expression also includes the notochord and cement gland. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
21994 269944 cd01238 PH_Btk Bruton's tyrosine kinase pleckstrin homology (PH) domain. Btk is a member of the Tec family of cytoplasmic protein tyrosine kinases that includes BMX, IL2-inducible T-cell kinase (Itk) and Tec. Btk plays a role in the maturation of B cells. Tec proteins generally have an N-terminal PH domain, followed by a Tec homology (TH) domain, a SH3 domain, a SH2 domain and a kinase domain. The Btk PH domain binds phosphatidylinositol 3,4,5-trisphosphate and responds to signalling via phosphatidylinositol 3-kinase. The PH domain is also involved in membrane anchoring which is confirmed by the discovery of a mutation of a critical arginine residue in the BTK PH domain. This results in severe immunodeficiency known as X-linked agammaglobulinemia (XLA) in humans and a related disorder in mice. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 140
21995 269945 cd01239 PH_PKD Protein kinase D (PKD/PKCmu) pleckstrin homology (PH) domain. The Protein Kinase D family is composed of three members, PKD1 (PKCmu), PKD2 and PKD3 (PKCnu). Like the C-type protein kinases (PKCs), PKDs are activated by diacylglycerol (DAG). They are involved in vesicular transport, cell proliferation, survival, migration and immune responses. PKD consists of tandem C1 domains, followed by a PH domain and a kinase domain. While the PKD PH domain has not been shown to bind phosphorylated inositol lipids and is not required for membrane translocation, it is required for nuclear export. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 127
21996 269946 cd01240 PH_GRK2_subgroup G Protein-Coupled Receptor Kinase 2 subgroup pleckstrin homology (PH) domain. GRKs are a family of serine-threonine kinases which phosphorylate activated G-protein coupled receptors leading to the release of the previously bound heterotrimeric G protein agonist and thus signal termination. There are seven mammalian GRKs (GRK1-7) grouped into three subfamilies: GRK1 (GRK1 and 7), GRK2 (GRK2 and 3), and GRK4 (GRK4-6). GRKs have three functional components: an N-terminal Regulators of G-protein signaling (RGS) which interacts with the seven-trans-membrane helical receptor protein and/or other membrane targets, a central catalytic protein kinase C (PKc) domain, and a C-terminal section containing an autophosphorylation region and a variable region that mediates membrane association. In both GRK2 (also known as beta-adrenergic receptor kinase-1) and GRK3 (beta-adrenergic receptor kinase-2), the C-terminal variable region contains a PH domain which gives binding specificity to Gbetagamma proteins. The GRK2 PH domain has an extended C-terminal helix, which mediates interactions with G beta gamma subunits. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 118
21997 269947 cd01241 PH_PKB Protein Kinase B-like pleckstrin homology (PH) domain. PKB (also called Akt), a member of the AGC kinase family, is a phosphatidylinositol 3'-kinase (PI3K)-dependent Ser/Thr kinase which alters the activity of the targeted protein. The name AGC is based on the three proteins that it is most similar to: cAMP-dependent protein kinase 1 (PKA; also known as PKAC), cGMP-dependent protein kinase (PKG; also known as CGK1) and protein kinase C (PKC). Human Akt has three isoforms derived from distinct genes: Akt1/PKBalpha, Akt2/PKBbeta, and Akt3/PKBgamma. All Akts have an N-terminal PH domain with an activating Thr phosphorylation site, a kinase domain, and a short C-terminal regulatory tail with an activating Ser phosphorylation site. The PH domain recruits Akt to the plasma membrane by binding to phosphoinositides (PtdIns-3,4-P2) and is required for activation. The phosphorylation of Akt at its Thr and Ser phosphorylation sites leads to increased Akt activity toward forkhead transcription factors, the mammalian target of rapamycin (mTOR), and the Bcl-xL/Bcl-2-associated death promoter (BAD), all of which possess a consensus motif R-X-R-XX-ST-B (X = amino acid, B = bulky hydrophobic residue) for Akt phosphorylation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
21998 269948 cd01242 PH_ROCK Rho-associated coiled-coil containing protein kinase pleckstrin homology (PH) domain. ROCK is a serine/threonine kinase that binds GTP-Rho. It consists of a kinase domain, a coiled coil region and a PH domain. The ROCK PH domain is interrupted by a C1 domain. ROCK plays a role in cellular functions, such as contraction, adhesion, migration, and proliferation and in the regulation of apoptosis. There are two ROCK isoforms, ROCK1 and ROCK2. In ROCK2 the Rho Binding Domain (RBD) and the PH domain work together in membrane localization with RBD receiving the RhoA signal and the PH domain receiving the phospholipid signal. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
21999 269949 cd01243 PH_MRCK MRCK (myotonic dystrophy-related Cdc42-binding kinase) pleckstrin homology (PH) domain. MRCK is thought to be a coincidence detector of signaling by Cdc42 and phosphoinositides. It has been shown to promote cytoskeletal reorganization, which affects many biological processes. There are 2 members of this family: MRCKalpha and MRCKbeta. MRCK consists of a serine/threonine kinase domain, a cysteine rich (C1) region, a PH domain and a p21 binding motif. The MRCK PH domain is responsible for its targeting to cell to cell junctions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 135
22000 269950 cd01244 PH_GAP1-like RAS p21 protein activator (GTPase activating protein) family pleckstrin homology (PH) domain. RASAL1, GAP1(m), GAP1(IP4BP), and CAPRI are all members of the GAP1 family of GTPase-activating proteins. They contain N-terminal SH2-SH3-SH2 domains, followed by two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. They act as suppressors of RAS, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
22001 269951 cd01247 PH_FAPP1_FAPP2 Four phosphate adaptor protein 1 and 2 Pleckstrin homology (PH) domain. Human FAPP1 (also called PLEKHA3/Pleckstrin homology domain-containing, family A member 3) regulates secretory transport from the trans-Golgi network to the plasma membrane. It is recruited through binding of PH domain to phosphatidylinositol 4-phosphate (PtdIns(4)P) and a small GTPase ADP-ribosylation factor 1 (ARF1). These two binding sites have little overlap, allowing the FAPP1 PH domain to associate with both ligands simultaneously and independently. FAPP1 has an N-terminal PH domain followed by a short proline-rich region. FAPP1 is a member of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), and Goodpasture antigen binding protein (GPBP). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from the Golgi, and recognize both PI lipids and ARF proteins. FAPP2 (also called PLEKHA8/Pleckstrin homology domain-containing, family A member 8), a member of the Glycolipid lipid transfer protein (GLTP) family, has an N-terminal PH domain that targets the TGN and a C-terminal GLTP domain. FAPP2 functions to traffic glucosylceramide (GlcCer) which is made in the Golgi. Its interaction with vesicle-associated membrane protein-associated protein (VAP) could be a means of regulation. Some FAPP2s share the FFAT-like motifs found in GLTP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
22002 269952 cd01248 PH_PLC_ELMO1 Phospholipase C and Engulfment and cell motility protein 1 pleckstrin homology domain. The C-terminal region of ELMO1, the PH domain and Pro-rich sequences, binds the SH3-containing region of DOCK2, forming an intermolecular five-helix bundle allowing for DOCK-mediated Rac1 activation. ELMO1, a mammalian homolog of C. elegans CED-12, contains an N-terminal RhoG-binding region, an ELMO domain, a PH domain, and a C-terminal sequence with three PxxP motifs. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). All PLCs, except for PLCzeta, have a PH domain which is for the most part N-terminally located, though lipid binding specificity is not conserved between them. In addition, PLC gamma contains a split PH domain within its catalytic domain that is separated by 2 SH2 domains and a single SH3 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
22003 269953 cd01249 BAR-PH_GRAF_family GTPase Regulator Associated with Focal adhesion and related proteins Pleckstrin homology (PH) domain. This hierarchy contains GRAF family members: OPHN1/oligophrenin1, GRAF1 (also called ARHGAP26/Rho GTPase activating protein 26), GRAF2 (also called ARHGAP10/ARHGAP42), AK057372, and LOC129897, all of which are members of the APPL family. OPHN1 is a RhoGAP involved in X-linked mental retardation, epilepsy, rostral ventricular enlargement, and cerebellar hypoplasia. Affected individuals have morphological abnormalities of their brain with enlargement of the cerebral ventricles and cerebellar hypoplasia. OPHN1 negatively regulates RhoA, Cdc42, and Rac1 in neuronal and non-neuronal cells. GRAF1 sculpts the endocytic membranes of the CLIC/GEEC (clathrin-independent carriers/GPI-enriched early endosomal compartments) endocytic pathway. It strongly interacts with dynamin, and inhibition of dynamin abolishes CLIC/GEEC endocytosis. GRAF2, GRAF3 and oligophrenin are likely to play similar roles during clathrin-independent endocytic events. GRAF1 mutations are linked to leukaemia. All members are composed of an N-terminal BAR-PH domain, followed by a RhoGAP domain, a proline rich region, and a C-terminal SH3 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
22004 241281 cd01250 PH_AGAP Arf-GAP with GTPase, ANK repeat and PH domain-containing protein Pleckstrin homology (PH) domain. AGAP proteins (also called centaurin gamma; PIKE/Phosphatidylinositol-3-kinase enhancer) reside mainly in the nucleus and are known to activate phosphoinositide 3-kinase, a key regulator of cell proliferation, motility and vesicular trafficking. There are 3 isoforms of AGAP (PIKE-A, PIKE-L, and PIKE-S), the longest of which, PIKE-L, consists of N-terminal proline rich domains (PRDs), followed by a GTPase domain, a split PH domain (PHN and PHC), an ArfGAP domain and two ankyrin repeats. PIKE-S terminates after the PHN domain and PIKE-A is missing the PRD region. Centaurin binds phosphatidylinositol (3,4,5)P3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
22005 241282 cd01251 PH2_ADAP ArfGAP with dual PH domains Pleckstrin homology (PH) domain, repeat 2. ADAP (also called centaurin alpha) is a phosphatidylinositide binding protein consisting of an N-terminal ArfGAP domain and two PH domains. In response to growth factor activation, PI3K phosphorylates phosphatidylinositol 4,5-bisphosphate to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 1 is recruited to the plasma membrane following growth factor stimulation by specific binding of its PH domain to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 2 is constitutively bound to the plasma membrane since it binds phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 3,4,5-trisphosphate with equal affinity. This cd contains the second PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
22006 269954 cd01252 PH_GRP1-like General Receptor for Phosphoinositides-1-like Pleckstrin homology (PH) domain. GRP1/cytohesin3 and the related proteins ARNO (ARF nucleotide-binding site opener)/cytohesin-2 and cytohesin-1 are ARF exchange factors that contain a pleckstrin homology (PH) domain thought to target these proteins to cell membranes through binding polyphosphoinositides. The PH domains of all three proteins exhibit relatively high affinity for PtdIns(3,4,5)P3. Within the Grp1 family, diglycine (2G) and triglycine (3G) splice variants, differing only in the number of glycine residues in the PH domain, strongly influence the affinity and specificity for phosphoinositides. The 2G variants selectively bind PtdIns(3,4,5)P3 with high affinity; the 3G variants bind PtdIns(3,4,5)P3 with about 30-fold lower affinity and require the polybasic region for plasma membrane targeting. These ARF-GEFs share a common, tripartite structure consisting of an N-terminal coiled-coil domain, a central domain with homology to the yeast protein Sec7, a PH domain, and a C-terminal polybasic region. The Sec7 domain is autoinhibited by conserved elements proximal to the PH domain. GRP1 binds to the DNA binding domain of certain nuclear receptors (TRalpha, TRbeta, AR, ER, but not RXR), and can repress thyroid hormone receptor (TR)-mediated transactivation by decreasing TR-complex formation on thyroid hormone response elements. ARNO promotes sequential activation of Arf6, Cdc42 and Rac1 and insulin secretion. Cytohesins act as PI 3-kinase effectors mediating biological responses including cell spreading and adhesion, chemotaxis, protein trafficking, and cytoskeletal rearrangements, only some of which appear to depend on their ability to activate ARFs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. 
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 119
22007 269955 cd01253 PH_ARHGAP21-like ARHGAP21 and related proteins pleckstrin homology (PH) domain. ARHGAP family genes encode Rho/Rac/Cdc42-like GTPase activating proteins with a RhoGAP domain. These proteins function as GTPase-activating proteins (GAPs) for RHOA and CDC42. ARHGAP21 controls the Arp2/3 complex and F-actin dynamics at the Golgi complex by regulating the activity of the small GTPase Cdc42. It is recruited to the Golgi by the GTPase ARF1 through its PH domain and its helical motif. It is also required for CTNNA1 recruitment to adherens junctions. ARHGAP21 and its related proteins all contain a PH domain and a RhoGAP domain. Some of the members have additional N-terminal domains including PDZ, SH3, and SPEC. The ARHGAP21 PH domain interacts with the GTP-bound forms of both ARF1 and ARF6 ARF-binding domain/ArfBD. The members here include: ARHGAP15, ARHGAP21, and ARHGAP23. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 113
22008 269956 cd01254 PH_PLD Phospholipase D pleckstrin homology (PH) domain. PLD hydrolyzes phosphatidylcholine to phosphatidic acid (PtdOH), which can bind target proteins. PLD contains a PH domain, a PX domain and four conserved PLD signature domains. The PLD PH domain is specific for bisphosphorylated inositides. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 136
22009 269957 cd01255 PH2_Tiam1_2 T-lymphoma invasion and metastasis 1 and 2 Pleckstrin Homology (PH) domain, C-terminal domain. Tiam1 activates Rac GTPases to induce membrane ruffling and cell motility while Tiam2 (also called STEF (SIF (still life) and Tiam1 like-exchange factor)) contributes to neurite growth. Tiam1/2 are Dbl-family GEFs that possess a Dbl homology (DH) domain with a PH domain in tandem. The DH-PH domain catalyzes the GDP/GTP exchange reaction in the GTPase cycle, facilitating the switch between inactive GDP-bound and active GTP-bound states. The DH domain of Tiam1 interacts with Switch regions 1 and 2 of Rac1, which blocks magnesium binding, and GDP is released. Tiam1/2 possess two PH domains, which are often referred to as PHn and PHc domains. The DH-PH tandem domain is made up of the PHc domain while the PHn is part of a novel N-terminal PHCCEx domain which is made up of the PHn domain, a coiled coil region (CC), and an extra region (Ex). PHCCEx mediates binding to plasma membranes and signalling proteins in the activation of Rac GTPases. The PH domain resembles the beta-spectrin PH domain, suggesting non-canonical phosphatidylinositol binding. CC and Ex form a positively charged surface for protein binding. There are 2 motifs in Tiam1/2-interacting proteins that bind to the PHCCEx domain: Motif-I in CD44, ephrinBs, and the NMDA receptor and Motif-II in Par3 and JIP2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 172
22010 269958 cd01256 PH_dynamin Dynamin pleckstrin homology (PH) domain. Dynamin is a GTPase that regulates endocytic vesicle formation. It has an N-terminal GTPase domain, followed by a PH domain, a GTPase effector domain and a C-terminal proline arginine rich domain. Dynamin-like proteins, which are found in metazoa, plants and yeast have the same domain architecture as dynamin, but lack the PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 112
22011 269959 cd01257 PH_IRS Insulin receptor substrate (IRS) pleckstrin homology (PH) domain. Insulin receptor substrate (IRS) molecules are mediators in insulin signaling and play a role in maintaining basic cellular functions such as growth and metabolism. They act as docking proteins between the insulin receptor and a complex network of intracellular signaling molecules containing Src homology 2 (SH2) domains. Four members (IRS-1, IRS-2, IRS-3, IRS-4) of this family have been identified that differ as to tissue distribution, subcellular localization, developmental expression, binding to the insulin receptor, and interaction with SH2 domain-containing proteins. IRS molecules have an N-terminal PH domain, followed by an IRS-like PTB domain which has a PH-like fold. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
22012 269960 cd01258 PHsplit_syntrophin Syntrophin Split Pleckstrin homology (PH) domain. Syntrophins are scaffold proteins that associate with the Duchenne muscular dystrophy protein dystrophin and the dystrophin-related proteins, utrophin and dystrobrevin, to form the dystrophin glycoprotein complex (DGC). There are 5 members (alpha, beta1, beta2, gamma1, and gamma2), all of which contain a split (also called joined) PH domain and a PDZ domain (PHN-PDZ-PHC). The split PH domain of alpha-syntrophin adopts a canonical PH domain fold and together with PDZ forms a supramodule functioning synergistically in binding to inositol phospholipids. The alpha-syntrophin PH-PDZ supramodule showed strong binding to phosphoinositides PI(3,5)P2 and PI(5)P, modest binding to PI(3,4)P2 and PI(4,5)P2, and weak binding to PI(3)P, PI(4)P, and PI(3,4,5)P. There are a large number of signaling proteins that bind to the PDZ domain of syntrophins: nitric oxide synthase (nNOS), aquaporin-4, voltage-gated sodium channels, potassium channels, serine/threonine protein kinases, and the ATP-binding cassette transporter A1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 89
22013 269961 cd01259 PH_APBB1IP Amyloid beta (A4) Precursor protein-Binding, family B, member 1 Interacting Protein pleckstrin homology (PH) domain. APBB1IP consists of a Ras-associated (RA) domain, a PH domain, a family-specific BPS region, and a C-terminal SH2 domain. Grb7, Grb10 and Grb14 are paralogs that are also present in this hierarchy. These adapter proteins bind a variety of receptor tyrosine kinases, including the insulin and insulin-like growth factor-1 (IGF1) receptors. Grb10 and Grb14 are important tissue-specific negative regulators of insulin and IGF1 signaling and may contribute to type 2 (non-insulin-dependent) diabetes in humans. RA-PH functions as a single structural unit and is dimerized via a helical extension of the PH domain. The PH domains here are proposed to bind phosphoinositides non-canonically and are unlikely to bind an activated GTPase. The tandem RA-PH domains are present in a second adapter-protein family, MRL proteins, Caenorhabditis elegans protein MIG-10, the mammalian proteins RIAM and lamellipodin and the Drosophila melanogaster protein Pico, all of which are Ena/VASP-binding proteins involved in actin-cytoskeleton rearrangement. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 124
22014 269962 cd01260 PH_CNK_mammalian-like Connector enhancer of KSR (Kinase suppressor of ras) (CNK) pleckstrin homology (PH) domain. CNK family members function as protein scaffolds, regulating the activity and the subcellular localization of RAS activated RAF. There is a single CNK protein present in Drosophila and Caenorhabditis elegans, in contrast to mammals, which have 3 CNK proteins (CNK1, CNK2, and CNK3). All of the CNK members contain a sterile alpha motif (SAM), a conserved region in CNK (CRIC) domain, and a PSD-95/DLG-1/ZO-1 (PDZ) domain, and, with the exception of CNK3, a PH domain. A CNK2 splice variant, CNK2A, also has a PDZ domain-binding motif at its C terminus, and Drosophila CNK (D-CNK) also has a domain known as the Raf-interacting region (RIR) that mediates binding of the Drosophila Raf kinase. This cd contains CNKs from mammals, chickens, amphibians, fish, and crustacea. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
22015 269963 cd01261 PH_SOS Son of Sevenless (SOS) Pleckstrin homology (PH) domain. SOS is a Ras guanine nucleotide exchange factor. SOS is thought to transmit signals from activated receptor tyrosine kinases to the Ras signaling pathway. SOS contains a histone domain, Dbl-homology (DH), a PH domain, Rem domain, Cdc25 domain, and a Grb2 binding domain. The SOS PH domain binds to phosphatidylinositol-4,5-bisphosphate (PIP2) and phosphatidic acid (PA). SOS is dependent on Ras binding to the allosteric site via its histone domain for both a lower level of activity (Ras GDP) and maximal activity (Ras GTP). The DH domain blocks the allosteric Ras binding site in SOS. The PH domain is closely associated with the DH domain and the action of the DH-PH unit gates a reciprocal interaction between Ras and SOS. The C-terminal proline-rich domain of SOS binds to the adapter protein Grb2 which localizes the Sos protein to the plasma membrane and diminishes the negative effect of the C-terminal domain on the guanine nucleotide exchange activity of the CDC25-homology domain of SOS. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
22016 241293 cd01262 PH_PDK1 3-Phosphoinositide dependent protein kinase 1 (PDK1) pleckstrin homology (PH) domain. PDK1 plays an important role in insulin and growth factor signalling cascades. It phosphorylates and activates many AGC (cAMP-dependent, cGMP-dependent, protein kinase C (PKC)) family of protein kinases members, including protein kinase B (PKB, also known as Akt), p70 ribosomal S6-kinase (S6K), serum and glucocorticoid responsive kinase (SGK), p90 ribosomal S6 kinase (RSK), and PKC. PDK1 contains an N-terminal serine/threonine kinase domain followed by a PH domain. Following binding of the PH domain to PtdIns(3,4,5)P3 and PtdIns(3,4)P2, PDK1 activates these enzymes by phosphorylating a Ser/Thr residue in their activation loop. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
22017 269964 cd01263 PH_anillin Anillin Pleckstrin homology (PH) domain. Anillin (Rhotekin/RTKN; also called PLEKHK/Pleckstrin homology domain-containing family K) is an actin binding protein involved in cytokinesis. It interacts with GTP-bound Rho proteins and results in the inhibition of their GTPase activity. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Anillin proteins have an N-terminal HRI domain/ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain that binds small GTPases from the Rho family. The C-terminal PH domain helps target anillin to ectopic septin containing foci. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 121
22018 269965 cd01264 PH_MELT_VEPH1 Melted pleckstrin homology (PH) domain. The melted protein (also called Ventricular zone expressed PH domain-containing protein homolog 1) is expressed in the developing central nervous system of vertebrates. It contains a single C-terminal PH domain that is required for membrane targeting. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
22019 269966 cd01265 PH_TBC1D2A TBC1 domain family member 2A pleckstrin homology (PH) domain. TBC1D2A (also called PARIS-1/Prostate antigen recognized and identified by SEREX 1 and ARMUS) contains a PH domain and a TBC-type GTPase catalytic domain. TBC1D2A integrates signaling between Arf6, Rac1, and Rab7 during junction disassembly. Activated Rac1 recruits TBC1D2A to locally inactivate Rab7 via its C-terminal TBC/RabGAP domain and facilitate E-cadherin degradation in lysosomes. The TBC1D2A PH domain mediates localization at cell-cell contacts and coprecipitates with cadherin complexes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
22020 241297 cd01266 PH_Gab1_Gab2 Grb2-associated binding proteins 1 and 2 pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. The members in this cd include the Gab1 and Gab2 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
22021 241298 cd01268 PTB_Numb Numb Phosphotyrosine-binding (PTB) domain. Numb is a membrane associated adaptor protein which plays critical roles in cell fate determination. Numb proteins are involved in control of asymmetric cell division and cell fate choice, endocytosis, cell adhesion, cell migration, ubiquitination of specific substrates and a number of signaling pathways (Notch, Hedgehog, p53). Mutations in Numb play a critical role in disease (cancer). Numb has an N-terminal PTB domain and a C-terminal NumbF domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 135
22022 269967 cd01269 PTB_TBC1D1_like TBC1 domain family member 1 and related proteins Phosphotyrosine-binding (PTB) domain. The TBC1D1-like members here include TBC1D1, TBC1D4 (also called Akt substrate of 160 kDa or AS160), and pollux (PLX), a calmodulin-binding protein, and are thought to have a role in regulating cell growth and differentiation. These proteins are thought to function as GTPase-activating protein for Rab family protein(s). They may play a role in the cell cycle and differentiation of various tissues. They all contain an N-terminal PTB domain, a calmodulin CBD domain, and a C-terminal TBC domain which is thought to be a GTPase activator protein of Rab-like small GTPases. Recently, TBC1D1 and TBC1D4 were recognized to potentially link the proximal signalling of insulin and/or exercise with GLUT4. TBC1D4 is thought to be involved in contraction-stimulated glucose uptake, but TBC1D4-independent mechanisms (potentially involving TBC1D1) are likely to be essential for most of the contraction's effect. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 143
22023 269968 cd01270 PTB_CAPON-like Carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase protein (CAPON) Phosphotyrosine-binding (PTB) domain. CAPON (also known as Nitric oxide synthase 1 adaptor protein, NOS1AP) encodes a cytosolic protein that binds to the signaling molecule, neuronal NOS (nNOS). It contains an N-terminal PTB domain that binds to the small monomeric G protein, Dexras1 and a C-terminal PDZ-binding domain that mediates interactions with nNOS. Included in this cd are C. elegans proteins dystrobrevin, DYB-1, which controls neurotransmitter release and muscle Ca(2+) transients by localizing BK channels and DYstrophin-like phenotype and CAPON related, DYC-1, which is functionally related to dystrophin homolog, DYS-1. Mutations in the dystrophin gene cause Duchenne muscular dystrophy. DYS-1 shares sequence similarity, including key motifs, with their mammalian counterparts. These CAPON-like proteins all have a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 179
22024 269969 cd01271 PTB2_Fe65 Fe65 C-terminal Phosphotyrosine-binding (PTB) domain. The neuronal adaptor protein Fe65 is involved in brain development, Alzheimer disease amyloid precursor protein (APP) signaling, and proteolytic processing of APP. It contains three protein-protein interaction domains, one WW domain, and a unique tandem array of phosphotyrosine-binding (PTB) domains. The C-terminal PTB domain is responsible for APP binding. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 127
22025 269970 cd01272 PTB1_Fe65 Fe65 N-terminal Phosphotyrosine-binding (PTB) domain. The neuronal adaptor protein Fe65 is involved in brain development, Alzheimer disease amyloid precursor protein (APP) signaling, and proteolytic processing of APP. It contains three protein-protein interaction domains, one WW domain, and a unique tandem array of phosphotyrosine-binding (PTB) domains. The N-terminal PTB domain was shown to interact with a variety of proteins, including the low density lipoprotein receptor-related protein (LRP-1), the ApoEr2 receptor, and the histone acetyltransferase Tip60. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 138
22026 269971 cd01273 PTB_CED-6 Cell death protein 6 homolog (CED-6/GULP1) Phosphotyrosine-binding (PTB) domain. CED6 (also known as GULP1: engulfment adaptor PTB domain containing 1) is an adaptor protein involved in the specific recognition and engulfment of apoptotic cells. CED6 has been shown to interact with the cytoplasmic tail of another protein involved in the engulfment of apoptotic cells, CED1. CED6 has a C-terminal PTB domain, which can bind to NPXY motifs. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 144
22027 269972 cd01274 PTB_Anks Ankyrin repeat and sterile alpha motif (SAM) domain-containing (Anks) protein family Phosphotyrosine-binding (PTB) domain. Both AIDA-1b (AbetaPP intracellular domain-associated protein 1b) and Odin (also known as ankyrin repeat and sterile alpha motif domain-containing 1A; ANKS1A) belong to the Anks protein family. Both of these family members interact with the EphA8 receptor. Anks members consist of ankyrin repeats, a SAM domain and a C-terminal PTB domain which is crucial for interaction with the juxtamembrane (JM) region of EphA8. PTB domains are classified into three groups, namely, phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains of which the Anks PTB is a member. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 146
22028 238606 cd01275 FHIT FHIT (fragile histidine family): FHIT proteins, related to the HIT family, carry a motif HxHxH/Qxx (x is a hydrophobic amino acid). On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified into three branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Fhit plays a very important role in the development of tumours. In fact, Fhit deletions are among the earliest and most frequent genetic alterations in the development of tumours. 126
22029 238607 cd01276 PKCI_related Protein Kinase C Interacting protein related (PKCI): PKCI and related proteins belong to the ubiquitous HIT family of hydrolases that act on alpha-phosphates of ribonucleotides. The members of this subgroup have a conserved HxHxHxx motif (x is a hydrophobic residue) that is a signature for this family. No enzymatic activity has been reported, however, for PKCI and its related members. 104
22030 238608 cd01277 HINT_subgroup HINT (histidine triad nucleotide-binding protein) subgroup: Members of this CD belong to the superfamily of histidine triad hydrolases that act on alpha-phosphate of ribonucleotides. This subgroup includes members from all three forms of cellular life. Although the biochemical function has not been characterised for many of the members of this subgroup, the proteins from Yeast have been shown to be involved in secretion, peroxisome formation and gene expression. 103
22031 238609 cd01278 aprataxin_related aprataxin related: Aprataxin, a HINT family hydrolase is mutated in ataxia oculomotor apraxia syndrome. All the members of this subgroup have the conserved HxHxHxx (where x is a hydrophobic residue) signature motif. Members of this subgroup are predominantly eukaryotic in origin. 104
22032 133387 cd01279 HTH_HspR-like Helix-Turn-Helix DNA binding domain of HspR-like transcription regulators. Helix-turn-helix (HTH) transcription regulator HspR and related proteins, N-terminal domain. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 98
22033 133388 cd01282 HTH_MerR-like_sg3 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 3). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 112
22034 238610 cd01283 cytidine_deaminase Cytidine deaminase zinc-binding domain. These enzymes are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. Cytidine deaminases catalyze the deamination of cytidine to uridine and are important in the pyrimidine salvage pathway in many cell types, from bacteria to humans. This family also includes the apoBec proteins, which are a mammal specific expansion of RNA editing enzymes, and the closely related phorbolins, and the AID (activation-induced) enzymes. 112
22035 238611 cd01284 Riboflavin_deaminase-reductase Riboflavin-specific deaminase. Riboflavin biosynthesis protein RibD (Diaminohydroxyphosphoribosylaminopyrimidine deaminase) catalyzes the deamination of 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate, which is an intermediate step in the biosynthesis of riboflavin. The ribG gene of Bacillus subtilis and the ribD gene of E. coli are bifunctional and contain this deaminase domain and a reductase domain which catalyzes the subsequent reduction of the ribosyl side chain. 115
22036 238612 cd01285 nucleoside_deaminase Nucleoside deaminases include adenosine, guanine and cytosine deaminases. These enzymes are Zn dependent and catalyze the deamination of nucleosides. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. The functional enzyme is a homodimer. Cytosine deaminase catalyzes the deamination of cytosine to uracil and ammonia and is a member of the pyrimidine salvage pathway. Cytosine deaminase is found in bacteria and fungi but is not present in mammals; for this reason, the enzyme is currently of interest for antimicrobial drug design and gene therapy applications against tumors. Some members of this family are tRNA-specific adenosine deaminases that generate inosine at the first position of their anticodon (position 34) of specific tRNAs; this modification is thought to enlarge the codon recognition capacity during protein synthesis. Other members of the family are guanine deaminases which deaminate guanine to xanthine as part of the utilization of guanine as a nitrogen source. 109
22037 238613 cd01286 deoxycytidylate_deaminase Deoxycytidylate deaminase domain. Deoxycytidylate deaminase catalyzes the deamination of dCMP to dUMP, providing the nucleotide substrate for thymidylate synthase. The enzyme binds Zn++, which is required for catalytic activity. The activity of the enzyme is allosterically regulated by the ratio of dCTP to dTTP not only in eukaryotic cells but also in T-even phage-infected Escherichia coli, with dCTP acting as an activator and dTTP as an inhibitor. 131
22038 238614 cd01287 FabA FabA, beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase: Bacterial protein of the type II, fatty acid synthase system that binds ACP and catalyzes both dehydration and isomerization reactions, apparently in the same active site. The FabA structure is a homodimer with two independent active sites located at the dimer interface. Each active site is tunnel-shaped and completely inaccessible to solvent. No metal ions or cofactors are required for ligand binding or catalysis. 150
22039 238615 cd01288 FabZ FabZ is a 17kD beta-hydroxyacyl-acyl carrier protein (ACP) dehydratase that primarily catalyzes the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, the third step in the elongation phase of the bacterial/ plastid, type II, fatty-acid biosynthesis pathway. 131
22040 238616 cd01289 FabA_like Domain of unknown function, appears to be related to a diverse group of beta-hydroxydecanoyl ACP dehydratases (FabA) and beta-hydroxyacyl ACP dehydratases (FabZ). This group appears to lack the conserved active site histidine of FabA and FabZ. 138
22041 211324 cd01291 PseudoU_synth Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E. coli RluF has only been detected in E. coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 87
22042 238617 cd01292 metallo-dependent_hydrolases Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase, dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others. 275
22043 238618 cd01293 Bact_CD Bacterial cytosine deaminase and related metal-dependent hydrolases. Cytosine deaminases (CDs) catalyze the deamination of cytosine, producing uracil and ammonia. They play an important role in pyrimidine salvage. CDs are present in prokaryotes and fungi, but not mammalian cells. The bacterial enzymes, but not the fungal enzymes, are related to the adenosine deaminases (ADA). The bacterial enzymes are iron dependent and hexameric. 398
22044 238619 cd01294 DHOase Dihydroorotase (DHOase) catalyzes the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in the pyrimidine biosynthesis. In contrast to the large polyfunctional CAD proteins of higher organisms, this group of DHOases is monofunctional and mainly dimeric. 335
22045 238620 cd01295 AdeC Adenine deaminase (AdeC) directly deaminates adenine to form hypoxanthine. This reaction is part of one of the adenine salvage pathways, as well as the degradation pathway. It is important for adenine utilization as a purine, as well as a nitrogen source in bacteria and archaea. 422
22046 238621 cd01296 Imidazolone-5PH Imidazolonepropionase/imidazolone-5-propionate hydrolase (Imidazolone-5PH) catalyzes the third step in the histidine degradation pathway, the hydrolysis of (S)-3-(5-oxo-4,5-dihydro-3H-imidazol-4-yl)propanoate to N-formimidoyl-L-glutamate. In bacteria, the enzyme is part of histidine utilization (hut) operon. 371
22047 238622 cd01297 D-aminoacylase D-aminoacylases (N-acyl-D-Amino acid amidohydrolases) catalyze the hydrolysis of N-acyl-D-amino acids to produce the corresponding D-amino acids, which are used as intermediates in the synthesis of pesticides, bioactive peptides, and antibiotics. 415
22048 238623 cd01298 ATZ_TRZ_like TRZ/ATZ family contains enzymes from the atrazine degradation pathway and related hydrolases. Atrazine, a chlorinated herbicide, can be catabolized by a variety of different bacteria. The first three steps of the atrazine dehalogenation pathway are catalyzed by atrazine chlorohydrolase (AtzA), hydroxyatrazine ethylaminohydrolase (AtzB), and N-isopropylammelide N-isopropylaminohydrolase (AtzC). All three enzymes belong to the superfamily of metal dependent hydrolases. AtzA and AtzB, besides other related enzymes, are represented in this CD. 411
22049 238624 cd01299 Met_dep_hydrolase_A Metallo-dependent hydrolases, subgroup A is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 342
22050 238625 cd01300 YtcJ_like YtcJ_like metal dependent amidohydrolases. YtcJ is a Bacillus subtilis ORF of unknown function. The Arabidopsis homolog LAF3 has been identified as a factor required for phytochrome A signalling. 479
22051 238626 cd01301 rDP_like renal dipeptidase (rDP), best studied in mammals and also called membrane or microsomal dipeptidase, is a membrane-bound glycoprotein hydrolyzing dipeptides and is involved in hydrolytic metabolism of penem and carbapenem beta-lactam antibiotics. Although the biological function of the enzyme is still unknown, it has been suggested to play a role in the renal glutathione metabolism. 309
22052 238627 cd01302 Cyclic_amidohydrolases Cyclic amidohydrolases, including hydantoinase, dihydropyrimidinase, allantoinase, and dihydroorotase, are involved in the metabolism of pyrimidines and purines, sharing the property of hydrolyzing the cyclic amide bond of each substrate to the corresponding N-carbamyl amino acids. Allantoinases catalyze the degradation of purines, while dihydropyrimidinases and hydantoinases, a microbial counterpart of dihydropyrimidinase, are involved in pyrimidine degradation. Dihydroorotase participates in the de novo synthesis of pyrimidines. 337
22053 238628 cd01303 GDEase Guanine deaminase (GDEase). Guanine deaminase is an aminohydrolase responsible for the conversion of guanine to xanthine and ammonia, the first step to utilize guanine as a nitrogen source. This reaction also removes the guanine base from the pool and therefore can play a role in the regulation of cellular GTP and the guanylate nucleotide pool. 429
22054 238629 cd01304 FMDH_A Formylmethanofuran dehydrogenase (FMDH) subunit A; Methanogenic bacteria and archaea derive the energy for autotrophic growth from methanogenesis, the reduction of CO2 with molecular hydrogen as the electron donor. FMDH catalyzes the first step in methanogenesis, the formyl-methanofuran synthesis. In this step, CO2 is bound to methanofuran and subsequently reduced to the formyl state with electrons derived from hydrogen. 541
22055 238630 cd01305 archeal_chlorohydrolases Predicted chlorohydrolases. These metallo-dependent hydrolases from archaea are part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. They have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. Some members of this subgroup are predicted to be chlorohydrolases. 263
22056 238631 cd01306 PhnM PhnM is believed to be a subunit of the membrane associated C-P lyase complex. C-P lyase is thought to catalyze the direct cleavage of unactivated C-P bonds to yield inorganic phosphate and the corresponding hydrocarbons. It is responsible for cleavage of alkylphosphonates, which are utilized as sole phosphorus sources by many bacteria. 325
22057 238632 cd01307 Met_dep_hydrolase_B Metallo-dependent hydrolases, subgroup B is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 338
22058 238633 cd01308 Isoaspartyl-dipeptidase Isoaspartyl dipeptidase hydrolyzes the beta-L-isoaspartyl linkages in dipeptides, as part of the degradative pathway to eliminate proteins with beta-L-isoaspartyl peptide bonds, bonds whereby the beta-group of an aspartate forms the peptide link with the amino group of the following amino acid. Formation of this bond is a spontaneous nonenzymatic reaction in nature and can profoundly affect the function of the protein. Isoaspartyl dipeptidase is an octameric enzyme that contains a binuclear zinc center in the active site of each subunit and shows a strong preference for hydrolyzing Asp-Leu dipeptides. 387
22059 238634 cd01309 Met_dep_hydrolase_C Metallo-dependent hydrolases, subgroup C is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 359
22060 238635 cd01310 TatD_DNAse TatD like proteins; E.coli TatD is a cytoplasmic protein, shown to have magnesium dependent DNase activity. 251
22061 238636 cd01311 PDC_hydrolase 2-pyrone-4,6-dicarboxylic acid (PDC) hydrolase hydrolyzes PDC to yield 4-oxalomesaconic acid (OMA) or its tautomer, 4-carboxy-2-hydroxymuconic acid (CHM). This reaction is part of the protocatechuate (PCA) 4,5-cleavage pathway. PCA is one of the most important intermediate metabolites in the bacterial pathways for various phenolic compounds, including lignin, which is the most abundant aromatic material in nature. 263
22062 238637 cd01312 Met_dep_hydrolase_D Metallo-dependent hydrolases, subgroup D is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 381
22063 238638 cd01313 Met_dep_hydrolase_E Metallo-dependent hydrolases, subgroup E is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 418
22064 238639 cd01314 D-HYD D-hydantoinases (D-HYD) also called dihydropyrimidases (DHPase) and related proteins; DHPases are a family of enzymes that catalyze the reversible hydrolytic ring opening of the amide bond in five- or six-membered cyclic diamides, like dihydropyrimidine or hydantoin. The hydrolysis of dihydropyrimidines is the second step of reductive catabolism of pyrimidines in human. The hydrolysis of 5-substituted hydantoins in microorganisms leads to enantiomerically pure N-carbamyl amino acids, which are used for the production of antibiotics, peptide hormones, pyrethroids, and pesticides. HYDs are classified depending on their stereoselectivity. This family also includes collapsin response regulators (CRMPs), cytosolic proteins involved in neuronal differentiation and axonal guidance which have strong homology to DHPases, but lack most of the active site residues. 447
22065 238640 cd01315 L-HYD_ALN L-Hydantoinases (L-HYDs) and Allantoinase (ALN); L-Hydantoinases are a member of the dihydropyrimidinase family, which catalyzes the reversible hydrolytic ring opening of dihydropyrimidines and hydantoins (five-membered cyclic diamides used in biotechnology). But L-HYDs differ by having an L-enantio specificity and by lacking activity on possible natural substrates such as dihydropyrimidines. Allantoinase catalyzes the hydrolytic cleavage of the five-member ring of allantoin (5-ureidohydantoin) to form allantoic acid. 447
22066 238641 cd01316 CAD_DHOase The eukaryotic CAD protein is a trifunctional enzyme of carbamoylphosphate synthetase-aspartate transcarbamoylase-dihydroorotase, which catalyzes the first three steps of de novo pyrimidine nucleotide biosynthesis. Dihydroorotase (DHOase) catalyzes the third step, the reversible interconversion of carbamoyl aspartate to dihydroorotate. 344
22067 238642 cd01317 DHOase_IIa Dihydroorotase (DHOase), subgroup IIa; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This subgroup also contains proteins that lack the active site, like unc-33, a C.elegans protein involved in axon growth. 374
22068 238643 cd01318 DHOase_IIb Dihydroorotase (DHOase), subgroup IIb; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This group contains the archaeal members of the DHOase family. 361
22069 238644 cd01319 AMPD AMP deaminase (AMPD) catalyzes the hydrolytic deamination of adenosine monophosphate (AMP) at position 6 of the adenine nucleotide ring. AMPD is a diverse and highly regulated eukaryotic key enzyme of the adenylate catabolic pathway. 496
22070 238645 cd01320 ADA Adenosine deaminase (ADA) is a monomeric zinc dependent enzyme which catalyzes the irreversible hydrolytic deamination of both adenosine, as well as desoxyadenosine, to ammonia and inosine or desoxyinosine, respectively. ADA plays an important role in the purine pathway. Low, as well as high levels of ADA activity have been linked to several diseases. 325
22071 238646 cd01321 ADGF Adenosine deaminase-related growth factors (ADGF), a novel family of secreted growth-factors with sequence similarity to adenosine deaminase. 345
22072 238647 cd01324 cbb3_Oxidase_CcoQ Cytochrome cbb oxidase CcoQ. Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Found exclusively in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I. Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center. ccoQ, the fourth subunit, is a single transmembrane helix protein. It has been shown to protect the core complex from proteolytic degradation by serine proteases. See cd00919, cd01322, or cd01323 for more information on cbb3 oxidase. 48
22073 238648 cd01327 KAZAL_PSTI Kazal-type pancreatic secretory trypsin inhibitors (PSTI) and related proteins, including the second domain of the ovomucoid turkey inhibitor and the C-terminal domain of the esophagus cancer-related gene-2 protein (ECRG-2), are members of the superfamily of kazal-type proteinase inhibitors and follistatin-like proteins. 45
22074 238649 cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins. 86
22075 238650 cd01330 KAZAL_SLC21 The kazal-type serine protease inhibitor domain has been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The KAZAL_SLC21 domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins. 54
22076 176461 cd01334 Lyase_I Lyase class I family; a group of proteins which catalyze similar beta-elimination reactions. The Lyase class I family contains class II fumarase, aspartase, adenylosuccinate lyase (ASL), argininosuccinate lyase (ASAL), prokaryotic-type 3-carboxy-cis,cis-muconate cycloisomerase (pCMLE), and related proteins. It belongs to the Lyase_I superfamily. Proteins of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. 325
22077 100105 cd01335 Radical_SAM Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin. 204
22078 133421 cd01336 MDH_cytoplasmic_cytosolic Cytoplasmic and cytosolic Malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are eukaryotic MDHs localized to the cytoplasm and cytosol. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 325
22079 133422 cd01337 MDH_glyoxysomal_mitochondrial Glyoxysomal and mitochondrial malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are localized to the glycosome and mitochondria. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 310
22080 133423 cd01338 MDH_choloroplast_like Chloroplast-like malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are bacterial MDHs, and plant MDHs localized to the chloroplasts. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 322
22081 133424 cd01339 LDH-like_MDH L-lactate dehydrogenase-like malate dehydrogenase proteins. Members of this subfamily have an LDH-like structure and an MDH enzymatic activity. Some members, like MJ0490 from Methanococcus jannaschii, exhibit both MDH and LDH activities. Tetrameric MDHs, including those from phototrophic bacteria, are more similar to LDHs than to other MDHs. LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 300
22082 238651 cd01341 ADP_ribosyl ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 137
22083 293888 cd01342 Translation_Factor_II_like Domain II of Elongation factor Tu (EF-Tu)-like proteins. Elongation factor Tu consists of three structural domains. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E. coli IF-2 to 30S subunits. 80
22084 238653 cd01343 PL1_Passenger_AT Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 1, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of Neisseria and Haemophilus IgA1 proteases, SPATEs (serine protease autotransporters secreted by Enterobacteriaceae), Bordetella pertacins, and nonprotease autotransporters, TibA and similar AIDA-like proteins. 233
22085 238654 cd01344 PL2_Passenger_AT Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 2, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of the nonprotease autotransporters, Ag43, AIDA-1 and IcsA, as well as, the less characterized ShdA, MisL, and BapA autotransporters. 188
22086 238655 cd01345 OM_channels Porin superfamily. These outer membrane channels share a beta-barrel structure that differs in strand and shear number. Classical (gram-negative) porins are non-specific channels for small hydrophilic molecules and form 16 beta-stranded barrels (16,20), which associate as trimers. Maltoporin-like channels have specificities for various sugars and form 18 beta-stranded barrels (18,22), which associate as trimers. Ligand-gated protein channels cooperate with a TonB associated inner membrane complex to actively transport ligands via the proton motive force and they form monomeric, (22,24) barrels. The 150-200 N-terminal residues form a plug that blocks the channel from the periplasmic end. 253
22087 238656 cd01346 Maltoporin-like The Maltoporin-like channels (LamB porin) form a trimeric structure which facilitates the diffusion of maltodextrins and other sugars across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an 18-strand antiparallel beta-barrel (18,22). Loop 3 folds into the core to constrict pore size. Long irregular loops are found on the extracellular side, while short turns are in the periplasm. Tightly-bound water molecules are found in the eyelet of the passage, and only substrates that can displace and replace the broken hydrogen bonds are likely to enter the pore. In the MPR structure, loops 4, 6, and 9 have the greatest mobility and are highly variable; these are postulated to attract maltodextrins. 392
22088 238657 cd01347 ligand_gated_channel TonB dependent/Ligand-Gated channels are created by a monomeric 22 strand (22,24) anti-parallel beta-barrel. Ligands apparently bind to the large extracellular loops. The N-terminal 150-200 residues form a plug from the periplasmic end of barrel. Energy (proton-motive force) and TonB-dependent conformational alteration of channel (parts of plug, and loops 7 and 8) allow passage of ligand. FepA residues 12-18 form the TonB box, which mediates the interaction with the TonB-containing inner membrane complex. TonB preferentially interacts with ligand-bound receptors. Transport through the channel may resemble passage through an air lock. In this model, ligand binding leads to closure of the extracellular end of pore, then a TonB-mediated signal facilitates opening of the interior side of pore, deforming the N-terminal plug and allowing passage of the ligand to the periplasm. Such a mechanism would prevent the free diffusion of small molecules through the pore. 635
22089 153129 cd01351 Aconitase Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid. 389
22090 153130 cd01355 AcnX Putative Aconitase X catalytic domain. Putative Aconitase X catalytic domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. They are distantly related to Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution. 389
22091 238658 cd01356 AcnX_swivel Putative Aconitase X swivel domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. They are distantly related to Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution. 123
22092 176462 cd01357 Aspartase Aspartase. This subgroup contains Escherichia coli aspartase (L-aspartate ammonia-lyase), Bacillus aspartase and related proteins. It is a member of the Lyase class I family, which includes both aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid. 450
22093 176463 cd01359 Argininosuccinate_lyase Argininosuccinate lyase (argininosuccinase, ASAL). This group contains ASAL and related proteins. It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASAL is a cytosolic enzyme which catalyzes the reversible breakdown of argininosuccinate to arginine and fumarate during arginine biosynthesis. In ureotelic species ASAL also catalyzes a reaction involved in the production of urea. Included in this group are the major soluble avian eye lens proteins from duck, delta 1 and delta 2 crystallin. Of these two isoforms only delta 2 has retained ASAL activity. These crystallins may have evolved by gene recruitment of ASAL followed by gene duplication. In humans, mutations in ASAL result in the autosomal recessive disorder argininosuccinic aciduria. 435
22094 176464 cd01360 Adenylsuccinate_lyase_1 Adenylsuccinate lyase (ASL)_subgroup 1. This subgroup contains bacterial and archaeal proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in the de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). 387
22095 176465 cd01362 Fumarase_classII Class II fumarases. This subgroup contains Escherichia coli fumarase C, human mitochondrial fumarase, and related proteins. It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle. 455
22096 276814 cd01363 Motor_domain Myosin and Kinesin motor domain. Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Some of the names do not match with what is given in the sequence list. This is because they are based on the current nomenclature by Kollmar/Sebe-Pedros. 170
22097 276815 cd01364 KISc_BimC_Eg5 Kinesin motor domain, BimC/Eg5 spindle pole proteins. Kinesin motor domain, BimC/Eg5 spindle pole proteins, participate in spindle assembly and chromosome segregation during cell division. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 353
22098 276816 cd01365 KISc_KIF1A_KIF1B Kinesin motor domain, KIF1_like proteins. Kinesin motor domain, KIF1_like proteins. KIF1A (Unc104) transports synaptic vesicles to the nerve terminal, KIF1B has been implicated in transport of mitochondria. Both proteins are expressed in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. In contrast to the majority of dimeric kinesins, most KIF1A/Unc104 kinesins are monomeric motors. A lysine-rich loop in KIF1A binds to the negatively charged C-terminus of tubulin and compensates for the lack of a second motor domain, allowing KIF1A to move processively. 361
22099 276817 cd01366 KISc_C_terminal Kinesin motor domain, KIFC2/KIFC3/ncd-like carboxy-terminal kinesins. Kinesin motor domain, KIFC2/KIFC3/ncd-like carboxy-terminal kinesins. Ncd is a spindle motor protein necessary for chromosome segregation in meiosis. KIFC2/KIFC3-like kinesins have been implicated in motility of the Golgi apparatus as well as dendritic and axonal transport in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found at the C-terminus (C-type). C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 329
22100 276818 cd01367 KISc_KIF2_like Kinesin motor domain, KIF2-like group. Kinesin motor domain, KIF2-like group. KIF2 is a protein expressed in neurons, which has been associated with axonal transport and neuron development; alternative splice forms have been implicated in lysosomal translocation. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found in the middle (M-type) of the protein chain. M-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second (KIF2 may be slower). To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 328
22101 276819 cd01368 KISc_KIF23_like Kinesin motor domain, KIF23-like subgroup. Kinesin motor domain, KIF23-like subgroup. Members of this group may play a role in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 345
22102 276820 cd01369 KISc_KHC_KIF5 Kinesin motor domain, kinesin heavy chain (KHC) or KIF5-like subgroup. Kinesin motor domain, kinesin heavy chain (KHC) or KIF5-like subgroup. Members of this group have been associated with organelle transport. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 325
22103 276821 cd01370 KISc_KIP3_like Kinesin motor domain, KIP3-like subgroup. Kinesin motor domain, KIP3-like subgroup. The yeast kinesin KIP3 plays a role in positioning the mitotic spindle. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 345
22104 276822 cd01371 KISc_KIF3 Kinesin motor domain, kinesins II or KIF3_like proteins. Kinesin motor domain, kinesins II or KIF3_like proteins. Subgroup of kinesins, which form heterotrimers composed of 2 kinesins and one non-motor accessory subunit. Kinesins II play important roles in ciliary transport, and have been implicated in neuronal transport, melanosome transport, the secretory pathway, and mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this group the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 334
22105 276823 cd01372 KISc_KIF4 Kinesin motor domain, KIF4-like subfamily. Kinesin motor domain, KIF4-like subfamily. Members of this group seem to perform a variety of functions, and have been implicated in neuronal organelle transport and chromosome segregation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 341
22106 276824 cd01373 KISc_KLP2_like Kinesin motor domain, KIF15-like subgroup. Kinesin motor domain, KIF15-like subgroup. Members of this subgroup seem to play a role in mitosis and meiosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 347
22107 276825 cd01374 KISc_CENP_E Kinesin motor domain, CENP-E/KIP2-like subgroup. Kinesin motor domain, CENP-E/KIP2-like subgroup, involved in chromosome movement and/or spindle elongation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 321
22108 276826 cd01375 KISc_KIF9_like Kinesin motor domain, KIF9-like subgroup. Kinesin motor domain, KIF9-like subgroup; might play a role in cell shape remodeling. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 334
22109 276827 cd01376 KISc_KID_like Kinesin motor domain, KIF22/Kid-like subgroup. Kinesin motor domain, KIF22/Kid-like subgroup. Members of this group might play a role in regulating chromosomal movement along microtubules in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 319
22110 276951 cd01377 MYSc_class_II class II myosins, motor domain. Myosin motor domain in class II myosins. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. Thus, myosin II has two heads. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 662
22111 276829 cd01378 MYSc_Myo1 class I myosin, motor domain. Myosin I generates movement at the leading edge in cell motility, and class I myosins have been implicated in phagocytosis and vesicle transport. Myosin I, an unconventional myosin, does not form dimers. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. There are 5 myosin subclasses, with subclasses c/h, d/g, and a/b having an IQ domain and a TH1 domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 652
22112 276830 cd01379 MYSc_Myo3 class III myosin, motor domain. Myosin III has been shown to play a role in the vision process in insects and in hearing in mammals. Myosin III, an unconventional myosin, does not form dimers. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. They are characterized by an N-terminal protein kinase domain and several IQ domains. Some members also contain WW, SH2, PH, and Y-phosphatase domains. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 633
22113 276831 cd01380 MYSc_Myo5 class V myosin, motor domain. Myo5, also called heavy chain 12, myoxin, are dimeric myosins that transport a variety of intracellular cargo processively along actin filaments, such as melanosomes, synaptic vesicles, vacuoles, and mRNA. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. It also contains a IQ domain and a globular DIL domain. Myosin V is a class of actin-based motor proteins involved in cytoplasmic vesicle transport and anchorage, spindle-pole alignment and mRNA translocation. The protein encoded by this gene is abundant in melanocytes and nerve cells. Mutations in this gene cause Griscelli syndrome type-1 (GS1), Griscelli syndrome type-3 (GS3) and neuroectodermal melanolysosomal disease, or Elejalde disease. Multiple alternatively spliced transcript variants encoding different isoforms have been reported, but the full-length nature of some variants has not been determined. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Note that the Dictyostelium myoVs are not contained in this child group. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 629
22114 276832 cd01381 MYSc_Myo7 class VII myosin, motor domain. These monomeric myosins have been associated with functions in sensory systems such as vision and hearing. Mammalian myosin VII has a tail with 2 MyTH4 domains, 2 FERM domains, and a SH3 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 648
22115 276833 cd01382 MYSc_Myo6 class VI myosin, motor domain. Myosin VI is a monomeric myosin, which moves towards the minus-end of actin filaments, in contrast to most other myosins, which move towards the plus-end of actin filaments. It is thought that myosin VI, unlike plus-end directed myosins, does not use a pure lever arm mechanism, but instead steps with a mechanism analogous to the kinesin neck-linker uncoupling model. It has been implicated in a myriad of functions including: the transport of cytoplasmic organelles, maintenance of normal Golgi morphology, endocytosis, secretion, cell migration, border cell migration during development, and in cancer metastasis playing roles in deafness and retinal development among others. While how this is accomplished is largely unknown there are several interacting proteins that have been identified such as disabled homolog 2 (DAB2), GIPC1, synapse-associated protein 97 (SAP97; also known as DLG1) and optineurin, which have been found to target myosin VI to different cellular compartments. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the minus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 649
22116 276834 cd01383 MYSc_Myo8 class VIII myosin, motor domain. These plant-specific type VIII myosins have been associated with endocytosis, cytokinesis, cell-to-cell coupling and gating at plasmodesmata. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. It also contains IQ domains. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 647
22117 276835 cd01384 MYSc_Myo11 class XI myosin, motor domain. These plant-specific type XI myosins are involved in organelle transport. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 647
22118 276836 cd01385 MYSc_Myo9 class IX myosin, motor domain. Myosin IX is a processive single-headed motor, which might play a role in signalling. It has a N-terminal RA domain, an IQ domain, a C1_1 domain, and a RhoGAP domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 690
22119 276837 cd01386 MYSc_Myo18 class XVIII myosin, motor domain. Many members of this class contain a N-terminal PDZ domain which is commonly found in proteins establishing molecular complexes. The motor domain itself does not exhibit ATPase activity, suggesting that it functions as an actin tether protein. It also has two IQ domains that probably bind light chains or related calmodulins and a C-terminal tail with two sections of coiled-coil domains, which are thought to mediate homodimerization. The function of these myosins is largely unknown. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 689
22120 276838 cd01387 MYSc_Myo15 class XV mammal-like myosin, motor domain. The class XV myosins are monomeric. In vertebrates, myosin XV appears to be expressed in sensory tissue and play a role in hearing. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. C-terminal to the head domain are 2 MyTH4 domains, a FERM domain, and a SH3 domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 657
22121 238684 cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif. 72
22122 238685 cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11. 77
22123 238686 cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions. 66
22124 380477 cd01391 Periplasmic_Binding_Protein_type1 Type 1 periplasmic binding fold superfamily. Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulators and the bacterial periplasmic binding proteins, the ligands are monosaccharides, including lactose, ribose, fructose, xylose, arabinose, galactose/glucose and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold. 280
22125 143331 cd01392 HTH_LacI Helix-turn-helix (HTH) DNA binding domain of the LacI family of transcriptional regulators. HTH-DNA binding domain of the LacI (lactose operon repressor) family of bacterial transcriptional regulators and their putative homologs found in plants. The LacI family has more than 500 members distributed among almost all bacterial species. The monomeric proteins of the LacI family contain common structural features that include a small DNA-binding domain with a helix-turn-helix motif in the N-terminus, a regulatory ligand-binding domain which exhibits the type I periplasmic binding protein fold in the C-terminus for oligomerization and for effector binding, and an approximately 18-amino acid linker connecting these two functional domains. In LacI-like transcriptional regulators, the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. When the C-terminal domain of the LacI family repressor binds its ligand, it undergoes a conformational change which affects the DNA-binding affinity of the repressor. In Escherichia coli, LacI represses transcription by binding with high affinity to the lac operon at a specific operator DNA sequence until it interacts with the physiological inducer allolactose or a non-degradable analog IPTG (isopropyl-beta-D-thiogalactopyranoside). Induction of the repressor lowers its affinity for the operator sequence, thereby allowing transcription of the lac operon structural genes (lacZ, lacY, and lacA). The lac repressor occurs as a tetramer made up of two functional dimers. Thus, two DNA binding domains of a dimer are required to bind the inverted repeat sequences of the operator DNA binding sites. 52
22126 410881 cd01393 RecA-like RecA family. RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response. RecA couples ATP hydrolysis to DNA strand exchange. While prokaryotes have a single RecA protein, eukaryotes have multiple RecA homologs such as Rad51, DMC1 and Rad55/57. Archaea have the RecA-like homologs RadA and RadB. 185
22127 410882 cd01394 archRadB archaeal RadB. The archaeal protein RadB shares similarity with RadA, the archaeal functional homologue to the bacterial RecA. The precise function of RadB is unclear. 216
22128 238689 cd01395 HMT_MBD Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins. 60
22129 238690 cd01396 MeCP2_MBD MeCP2, MBD1, MBD2, MBD3, and MBD4 are members of a protein family that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. 77
22130 238691 cd01397 HAT_MBD Methyl-CpG binding domains (MBD) present in putative chromatin remodelling factor such as BAZ2A; BAZ2A contains a MBD, DDT, PHD-type zinc finger and Bromo domain suggesting that BAZ2A might be associated with histone acetyltransferase (HAT) activity. The Drosophila melanogaster toutatis protein, a putative subunit of the chromatin-remodeling complex, and other such proteins in this group share a similar domain architecture with BAZ2A, as does the Caenorhabditis elegans flectin homolog. 73
22131 238692 cd01398 RPI_A RPI_A: Ribose 5-phosphate isomerase type A (RPI_A) subfamily; RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. This reaction leads to the conversion of phosphosugars into glycolysis intermediates, which are precursors for the synthesis of amino acids, vitamins, nucleotides, and cell wall components. In plants, RPI is part of the Calvin cycle as ribulose 5-phosphate is the carbon dioxide receptor in the first dark reaction of photosynthesis. There are two unrelated types of RPIs (A and B), which catalyze the same reaction; at least one type of RPI is present in an organism. RPI_A is more widely distributed than RPI_B in bacteria, eukaryotes, and archaea. 213
22132 238693 cd01399 GlcN6P_deaminase GlcN6P_deaminase: Glucosamine-6-phosphate (GlcN6P) deaminase subfamily; GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium. The reaction is an aldo-keto isomerization coupled with an amination or deamination. It is the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate (GlcNAc6P). GlcN6P deaminase is a hexameric enzyme that is allosterically activated by GlcNAc6P. 232
22133 238694 cd01400 6PGL 6PGL: 6-Phosphogluconolactonase (6PGL) subfamily; 6PGL catalyzes the second step of the oxidative phase of the pentose phosphate pathway, the hydrolyzation of 6-phosphoglucono-1,5-lactone (delta form) to 6-phosphogluconate. 6PGL is thought to guard against the accumulation of the delta form of the lactone, which may be toxic through its reaction with endogenous cellular nucleophiles. 219
22134 238695 cd01401 PncB_like Nicotinate phosphoribosyltransferase (NAPRTase), related to PncB. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria, archaea and fungi. 377
22135 238696 cd01403 Cyt_c_Oxidase_VIIb Cytochrome C oxidase chain VIIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIb subunit is found only in eukaryotes and its specific function remains unclear. A rare polymorphism of the CcO VIIb gene may be associated with the high risk of nasopharyngeal carcinoma in a Cantonese family. 51
22136 238697 cd01406 SIR2-like Sir2-like: Prokaryotic group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines; and are members of the SIR2 superfamily of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. 242
22137 238698 cd01407 SIR2-fam SIR2 family of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. 218
22138 238699 cd01408 SIRT1 SIRT1: Eukaryotic group (class1) which includes human sirtuins SIRT1-3 and yeast Hst1-4; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The nuclear SIRT1 has been shown to target the p53 tumor suppressor protein for deacetylation to suppress DNA damage, and the cytoplasmic SIRT2 homolog has been shown to target alpha-tubulin for deacetylation for the maintenance of cell integrity. 235
22139 238700 cd01409 SIRT4 SIRT4: Eukaryotic and prokaryotic group (class2) which includes human sirtuin SIRT4 and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 260
22140 238701 cd01410 SIRT7 SIRT7: Eukaryotic and prokaryotic group (class4) which includes human sirtuin SIRT6, SIRT7, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 206
22141 238702 cd01411 SIR2H SIR2H: Uncharacterized prokaryotic Sir2 homologs from several gram positive bacterial species and Fusobacteria; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 225
22142 238703 cd01412 SIRT5_Af1_CobB SIRT5_Af1_CobB: Eukaryotic, archaeal and prokaryotic group (class3) which includes human sirtuin SIRT5, Archaeoglobus fulgidus Sir2-Af1, and E. coli CobB; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. CobB is a bacterial sirtuin that deacetylates acetyl-CoA synthetase at an active site lysine to stimulate its enzymatic activity. 224
22143 238704 cd01413 SIR2_Af2 SIR2_Af2: Archaeal and prokaryotic group which includes Archaeoglobus fulgidus Sir2-Af2, Sulfolobus solfataricus ssSir2, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The Sir2 homolog from the archaeon Sulfolobus solfataricus deacetylates the non-specific DNA protein Alba to mediate transcription repression. 222
22144 133469 cd01414 SAICAR_synt_Sc non-metazoan 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Eukaryotic, bacterial, and archaeal group of SAICAR synthetases represented by the Saccharomyces cerevisiae (Sc) enzyme, mostly absent in metazoans. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate. 279
22145 133470 cd01415 SAICAR_synt_PurC bacterial and archaeal 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. A subfamily of SAICAR synthetases represented by the Thermotoga maritima (Tm) enzyme and E. coli PurC. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate. 230
22146 133471 cd01416 SAICAR_synt_Ade5 Ade5_like 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Eukaryotic group of SAICAR synthetases represented by the Drosophila melanogaster, N-terminal, SAICAR synthetase domain of bifunctional Ade5. The Ade5 gene product (CAIR-SAICARs) catalyzes the sixth and seventh steps of the de novo biosynthesis of purine nucleotides (also reported as seventh and eighth steps). SAICAR synthetase converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate. 252
22147 238705 cd01417 Ribosomal_L19e_E Ribosomal protein L19e, eukaryotic. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 164
22148 238706 cd01418 Ribosomal_L19e_A Ribosomal protein L19e, archaeal. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 145
22149 238707 cd01419 MoaC_A MoaC family, archaeal. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 141
22150 238708 cd01420 MoaC_PE MoaC family, prokaryotic and eukaryotic. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 140
22151 238709 cd01421 IMPCH Inosine monophosphate cyclohydrolase domain. This is the N-terminal domain in the purine biosynthesis pathway protein ATIC (purH). The bifunctional ATIC protein contains a C-terminal ATIC formylase domain that formylates 5-aminoimidazole-4-carboxamide-ribonucleotide. The IMPCH domain then converts the formyl-5-aminoimidazole-4-carboxamide-ribonucleotide to inosine monophosphate. This is the final step in de novo purine production. 187
22152 238710 cd01422 MGS Methylglyoxal synthase catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The first part of the catalytic mechanism is believed to be similar to TIM (triosephosphate isomerase) in that both enzymes utilize DHAP to form an ene-diolate phosphate intermediate. In MGS, the second catalytic step is characterized by the elimination of phosphate and collapse of the enediolate to form methylglyoxal instead of reprotonation to form the isomer glyceraldehyde 3-phosphate, as in TIM. This is the first reaction in the methylglyoxal bypass of the Embden-Meyerhof glycolytic pathway and is believed to provide physiological benefits under non-ideal growth conditions in bacteria. 115
22153 238711 cd01423 MGS_CPS_I_III Methylglyoxal synthase-like domain found in pyr1 and URA1-like carbamoyl phosphate synthetases (CPS), including ammonia-dependent CPS Type I, and glutamine-dependent CPS Type III. These are multidomain proteins, in which MGS is the C-terminal domain. 116
22154 238712 cd01424 MGS_CPS_II Methylglyoxal synthase-like domain from type II glutamine-dependent carbamoyl phosphate synthetase (CPS). CPS, a CarA and CarB heterodimer, catalyzes the production of carbamoyl phosphate which is subsequently employed in the metabolic pathways responsible for the synthesis of pyrimidine nucleotides or arginine. The MGS-like domain is the C-terminal domain of CarB and appears to play a regulatory role in CPS function by binding allosteric effector molecules, including UMP and ornithine. 110
22155 100106 cd01425 RPS2 Ribosomal protein S2 (RPS2), involved in formation of the translation initiation complex, where it might contact the messenger RNA and several components of the ribosome. It has been shown that in Escherichia coli RPS2 is essential for the binding of ribosomal protein S1 to the 30s ribosomal subunit. In humans, most likely in all vertebrates, and perhaps in all metazoans, the protein also functions as the 67 kDa laminin receptor (LAMR1 or 67LR), which is formed from a 37 kDa precursor, and is overexpressed in many tumors. 67LR is a cell surface receptor which interacts with a variety of ligands, laminin-1 and others. It is assumed that the ligand interactions are mediated via the conserved C-terminus, which becomes extracellular as the protein undergoes conformational changes which are not well understood. Specifically, a conserved palindromic motif, LMWWML, may participate in the interactions. 67LR plays essential roles in the adhesion of cells to the basement membrane and subsequent signalling events, and has been linked to several diseases. Some evidence also suggests that the precursor of 67LR, 37LRP is also present in the nucleus in animals, where it appears associated with histones. 193
22156 349738 cd01426 ATP-synt_F1_V1_A1_AB_FliI_N ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, N-terminal domain. The alpha and beta (or A and B) subunits are primarily found in the F1, V1, and A1 complexes of the F-, V- and A-type family of ATPases with rotary motors. These ion-transporting rotary ATPases are composed of two linked multi-subunit complexes: the F1, V1, or A1 complex which contains three copies each of the alpha and beta subunits that form the soluble catalytic core involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao complex which forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. The A-ATP synthases (AoA1-ATPases), a different class of proton-translocating ATP synthases, are found in archaea and function like F-ATP synthases. Structurally, however, the A-ATP synthases are more closely related to the V-ATP synthases (vacuolar VoV1-ATPases), which are a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes. This family also includes the flagellum-specific ATPase/type III secretory pathway virulence-related protein, which shows extensive similarity to the alpha and beta subunits of F1-ATP synthase. 73
22157 319763 cd01427 HAD_like Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others. This superfamily includes a variety of enzymes that catalyze the cleavage of substrate C-Cl, P-C, and P-OP bonds via nucleophilic substitution pathways. All of them use a nucleophilic aspartate in their phosphoryl transfer reaction. They catalyze nucleophilic substitution reactions at phosphorus or carbon centers, using a conserved Asp carboxylate in covalent catalysis. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 106
22158 238713 cd01428 ADK Adenylate kinase (ADK) catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to adenosine monophosphate (AMP) to yield adenosine diphosphate (ADP). This enzyme is required for the biosynthesis of ADP and is essential for homeostasis of adenosine phosphates. 194
22159 349744 cd01429 ATP-synt_F1_V1_A1_AB_FliI_C ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, C-terminal domain. The alpha and beta (also called A and B) subunits are primarily found in the F1, V1, and A1 complexes of F-, V- and A-type family of ATPases with rotary motors. These ion-transporting rotary ATPases are composed of two linked multi-subunit complexes: the F1, V1, and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao complex that forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. The A-ATP synthases (AoA1-ATPases), a different class of proton-translocating ATP synthases, are found in archaea and function like F-ATP synthases. Structurally, however, the A-ATP synthases are more closely related to the V-ATP synthases (vacuolar VoV1-ATPases), which are a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes. This family also includes the flagellum-specific ATPase/type III secretory pathway virulence-related protein, which shows extensive similarity to the alpha and beta subunits of F1-ATP synthase. 70
22160 319764 cd01431 P-type_ATPases ATP-dependent membrane-bound cation and aminophospholipid transporters. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd(2+), and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine. 319
22161 238714 cd01433 Ribosomal_L16_L10e Ribosomal_L16_L10e: L16 is an essential protein in the large ribosomal subunit of bacteria, mitochondria, and chloroplasts. Large subunits that lack L16 are defective in peptidyl transferase activity, peptidyl-tRNA hydrolysis activity, association with the 30S subunit, binding of aminoacyl-tRNA and interaction with antibiotics. L16 is required for the function of elongation factor P (EF-P), a protein involved in peptide bond synthesis through the stimulation of peptidyl transferase activity by the ribosome. Mutations in L16 and the adjoining bases of 23S rRNA confer antibiotic resistance in bacteria, suggesting a role for L16 in the formation of the antibiotic binding site. The GTPase RbgA (YlqF) is essential for the assembly of the large subunit, and it is believed to regulate the incorporation of L16. L10e is the archaeal and eukaryotic cytosolic homolog of bacterial L16. L16 and L10e exhibit structural differences at the N-terminus. 112
22162 238715 cd01434 EFG_mtEFG1_IV EFG_mtEFG1_IV: domains similar to domain IV of the bacterial translational elongation factor (EF) EF-G. Included in this group is a domain of mitochondrial Elongation factor G1 (mtEFG1) proteins homologous to domain IV of EF-G. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group. 116
22163 259844 cd01435 RNAP_I_RPA1_N Largest subunit (RPA1) of eukaryotic RNA polymerase I (RNAP I), N-terminal domain. RPA1 is the largest subunit of the eukaryotic RNA polymerase I (RNAP I). RNAP I is a multi-subunit protein complex responsible for the synthesis of rRNA precursors. RNAP I consists of at least 14 different subunits, the largest being homologous to subunit Rpb1 of yeast RNAP II and subunit beta' of bacterial RNAP. The yeast member of this family is known as Rpb190. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shaped structure. The N-terminal domain of Rpb1, the largest subunit of RNAP II in yeast, forms part of the active site. It makes up the head and core of one clamp, as well as the pore and funnel structures of RNAP II. The strong homology between RPA1 and Rpb1 suggests a similar functional and structural role. 779
22164 238716 cd01436 Dipth_tox_like Mono-ADP-ribosylating toxins catalyze the transfer of ADP_ribose from NAD+ to eukaryotic Elongation Factor 2, halting protein synthesis. A single molecule of delivered toxin is sufficient to kill a cell. These toxins share mono-ADP-ribosylating activity with a variety of bacterial toxins, such as cholera toxin and pertussis toxin. The structural core is homologous to the poly-ADP ribosylating enzymes such as the PARP enzymes and Tankyrase. Diphtheria toxin is encoded by a lysogenic bacteriophage. Both diphtheria toxin and Pseudomonas aeruginosa exotoxin A are multi-domain proteins. These domains provide EF2 ADP_ribosylating, receptor-binding, and intracellular trafficking/transmembrane functions. 147
22165 238717 cd01437 parp_like Poly(ADP-ribose) polymerase (parp) catalytic domain catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. Poly(ADP-ribose)-like polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length through interactions with telomere repeat binding factor 1. 347
22166 238718 cd01438 tankyrase_like Tankyrases interact with the telomere reverse transcriptase complex (TERT). Tankyrase 1 poly-ADP-ribosylates Telomere Repeat Binding Factor 1 (TRF1) while Tankyrase 2 can poly-ADP-ribosylate itself or TRF1. The tankyrases also contain multiple ankyrin repeats that mediate protein-protein interaction (binding TRF1 and insulin-responsive aminopeptidase) and may function as a complex. Overexpression of Tank1 promotes increased telomere length, while overexpressed Tank2 has been shown to promote PARP cleavage-independent cell death (necrosis). 223
22167 238719 cd01439 TCCD_inducible_PARP_like Poly(ADP-ribose) polymerases catalyse the covalent attachment of ADP-ribose units from NAD+ to themselves and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) causes pleiotropic effects in mammalian species through modulating gene expression. TCDD-inducible PARP (TiPARP) is a target of TCDD that may contribute to multiple responses to TCDD by modulating protein function through poly ADP-ribosylation. 121
22168 238720 cd01443 Cdc25_Acr2p Cdc25 enzymes are members of the Rhodanese Homology Domain (RHOD) superfamily. Also included in this CD are eukaryotic arsenate resistance proteins such as Saccharomyces cerevisiae Acr2p and similar proteins. Cdc25 phosphatases activate the cell division kinases throughout the cell cycle progression. Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. The Cdc25 and Acr2p RHOD domains have the signature motif (H/YCxxxxxR). 113
22169 238721 cd01444 GlpE_ST GlpE sulfurtransferase (ST) and homologs are members of the Rhodanese Homology Domain superfamily. Unlike other rhodanese sulfurtransferases, GlpE is a single domain protein but indications are that it functions as a dimer. The active site contains a catalytically active cysteine. 96
22170 238722 cd01445 TST_Repeats Thiosulfate sulfurtransferases (TST) contain 2 copies of the Rhodanese Homology Domain. Only the second repeat contains the catalytically active Cys residue. The role of the 1st repeat is uncertain, but believed to be involved in protein interaction. This CD aligns the 1st and 2nd repeats. 138
22171 238723 cd01446 DSP_MapKP N-terminal regulatory rhodanese domain of dual specificity phosphatases (DSP), such as Mapk Phosphatase. This domain is believed to determine substrate specificity by binding the substrate, such as ERK2, and activating the C-terminal catalytic domain by inducing a conformational change. This domain has homology to the Rhodanese Homology Domain. 132
22172 238724 cd01447 Polysulfide_ST Polysulfide-sulfurtransferase - Rhodanese Homology Domain. This domain is believed to serve as a polysulfide binding and transferase domain in anaerobic gram-negative bacteria, functioning in oxidative phosphorylation with polysulfide-sulfur as a terminal electron acceptor. The active site contains the same conserved cysteine that is the catalytic residue in other Rhodanese Homology Domain proteins. 103
22173 238725 cd01448 TST_Repeat_1 Thiosulfate sulfurtransferase (TST), N-terminal, inactive domain. TST contains 2 copies of the Rhodanese Homology Domain; this is the 1st repeat, which does not contain the catalytically active Cys residue. The role of the 1st repeat is uncertain, but it is believed to be involved in protein interaction. 122
22174 238726 cd01449 TST_Repeat_2 Thiosulfate sulfurtransferase (TST), C-terminal, catalytic domain. TST contains 2 copies of the Rhodanese Homology Domain; this is the second repeat. Only the second repeat contains the catalytically active Cys residue. 118
22175 238727 cd01450 vWFA_subfamily_ECM Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all, A domains. 161
22176 238728 cd01451 vWA_Magnesium_chelatase Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. 178
22177 238729 cd01452 VWA_26S_proteasome_subunit 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP independent peptidase consisting of hydrolyzing activities. One or both ends of CP carry the RP that confers both ubiquitin and ATP dependence to the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to CP for proteolysis. The RP can dissociate into a stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated to be responsible for stabilizing the lid-base association. 187
22178 238730 cd01453 vWA_transcription_factor_IIH_type Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group. 183
22179 238731 cd01454 vWA_norD_type norD type: Denitrifying bacteria contain both membrane bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- ------> NO2- ------> NO -------> N2O ---------> N2. This reaction generally occurs under oxygen limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif. 174
22180 238732 cd01455 vWA_F11C1-5a_type Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the functions of the members of this subgroup. The members of this subgroup are fused to the ancient AAA domain. 191
22181 238733 cd01456 vWA_ywmD_type VWA ywmD type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the members of this subgroup. All members of this subgroup however have a conserved MIDAS motif. 206
22182 238734 cd01457 vWA_ORF176_type VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup are Eubacterial in origin and have a conserved MIDAS motif. Not much is known about the biochemistry of these. 199
22183 238735 cd01458 vWA_ku Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSB) in a preferred orientation. DSB's are repaired by either homologous recombination or non-homologous end joining and facilitate repair by the non-homologous end-joining pathway (NHEJ). The Ku heterodimer is required for accurate process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but may very likely mediate Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif. 218
22184 238736 cd01459 vWA_copine_like VWA Copine: Copines are phospholipid-binding proteins originally identified in Paramecium. They are found in human and orthologues have been found in C. elegans and Arabidopsis thaliana. None have been found in D. melanogaster or S. cerevisiae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes. No functional properties have been assigned to the VWA domains present in copines. The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not totally conserved; in most cases the MIDAS consists of the sequence DxTxS instead of the motif DxSxS that is found in most cases. The C2 domains present in copines mediate phospholipid binding. 254
22185 238737 cd01460 vWA_midasin VWA_Midasin: Midasin is a member of the AAA ATPase family. The proteins of this family are unified by their common architectural organization that is based upon a conserved ATPase domain. The AAA domain of midasin contains six tandem AAA protomers. The AAA domains in midasin are followed by a D/E rich domain that is followed by a VWA domain. The members of this subgroup have a conserved MIDAS motif. The function of this domain is not exactly known although it has been speculated to play a crucial role in midasin function. 266
22186 238738 cd01461 vWA_interalpha_trypsin_inhibitor vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix. 171
22187 238739 cd01462 VWA_YIEM_type VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have a conserved MIDAS motif; however, their biochemical function is not well characterised. 152
22188 238740 cd01463 vWA_VGCC_like VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carries at its N-terminus the VWA and cache domains. The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in either A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. 190
22189 238741 cd01464 vWA_subfamily VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have no assigned function. This subfamily is typified by the presence of a conserved MIDAS motif. 176
22190 238742 cd01465 vWA_subgroup VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the VWA domain in these proteins. The members do have a conserved MIDAS motif. The biochemical function however is not known. 170
22191 238743 cd01466 vWA_C3HC4_type VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup belong to Zinc-finger family as they are found fused to RING finger domains. The MIDAS motif is not conserved in all the members of this family. The function of vWA domains however is not known. 155
22192 238744 cd01467 vWA_BatA_type VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup are bacterial in origin. They are typified by the presence of a MIDAS motif. 180
22193 238745 cd01468 trunk_domain trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins. 239
22194 238746 cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. 177
22195 238747 cd01470 vWA_complement_factors Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. 198
22196 238748 cd01471 vWA_micronemal_protein Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. 186
22197 238749 cd01472 vWA_collagen von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. 164
22198 238750 cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. 192
22199 238751 cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that binds to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. 185
22200 238752 cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. 224
22201 238753 cd01476 VWA_integrin_invertebrates VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain suggesting the involvement of the integrins in the recognition and binding of multi-ligands. 163
22202 238754 cd01477 vWA_F09G8-8_type VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup lack the MIDAS motif. This subgroup is found only in C. elegans and the members identified thus far are always found fused to a C-Lectin type domain. Biochemical function thus far has not been attributed to any of the members of this subgroup. 193
22203 238755 cd01478 Sec23-like Sec23-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec 23 is very similar to Sec24. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup lack the consensus MIDAS motif but have the overall Para-Rossmann type fold that is characteristic of this superfamily. 267
22204 238756 cd01479 Sec24-like Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec 24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily. 244
22205 238757 cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 186
22206 238758 cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 165
22207 238759 cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 164
22208 238760 cd01483 E1_enzyme_family Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS. 143
22209 238761 cd01484 E1-2_like Ubiquitin activating enzyme (E1), repeat 2-like. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are a heterodimer where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the second repeat of Ub-E1. 234
22210 238762 cd01485 E1-1_like Ubiquitin activating enzyme (E1), repeat 1-like. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are a heterodimer where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the first repeat of Ub-E1. 198
22211 238763 cd01486 Apg7 Apg7 is an E1-like protein, that activates two different ubiquitin-like proteins, Apg12 and Apg8, and assigns them to specific E2 enzymes, Apg10 and Apg3, respectively. This leads to the covalent conjugation of Apg8 with phosphatidylethanolamine, an important step in autophagy. Autophagy is a dynamic membrane phenomenon for bulk protein degradation in the lysosome/vacuole. 307
22212 238764 cd01487 E1_ThiF_like E1_ThiF_like. Member of superfamily of activating enzymes (E1) of the ubiquitin-like proteins. The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown. 174
22213 238765 cd01488 Uba3_RUB Ubiquitin activating enzyme (E1) subunit UBA3. UBA3 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin(-like) by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. UBA3 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2. 291
22214 238766 cd01489 Uba2_SUMO Ubiquitin activating enzyme (E1) subunit UBA2. UBA2 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. UBA2 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2. 312
22215 238767 cd01490 Ube1_repeat2 Ubiquitin activating enzyme (E1), repeat 2. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the second repeat of Ub-E1. 435
22216 238768 cd01491 Ube1_repeat1 Ubiquitin activating enzyme (E1), repeat 1. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the first repeat of Ub-E1. 286
22217 238769 cd01492 Aos1_SUMO Ubiquitin activating enzyme (E1) subunit Aos1. Aos1 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. Aos1 contains part of the adenylation domain. 197
22218 238770 cd01493 APPBP1_RUB Ubiquitin activating enzyme (E1) subunit APPBP1. APPBP1 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin(-like) by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. APPBP1 contains part of the adenylation domain. 425
22219 99742 cd01494 AAT_I Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V). 170
22220 275447 cd01513 Translation_factor_III Domain III of Elongation factor (EF) Tu (EF-TU) and related proteins. Elongation factor (EF) EF-Tu participates in the elongation phase during protein biosynthesis on the ribosome. Its functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental findings indicate an essential contribution of domain III to activation of GTP hydrolysis. This domain III, which is distinct from the domain III in EFG and related elongation factors, is found in several eukaryotic translation factors, like peptide chain release factors RF3, elongation factor 1, selenocysteine (Sec)-specific elongation factor, and in GT-1 family of GTPase (GTPBP1). 102
22221 238772 cd01514 Elongation_Factor_C Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archaeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown. 79
22222 238773 cd01515 Arch_FBPase_1 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family (FBPase class IV). These are Mg++ dependent phosphatases. Members in this family may have both fructose-1,6-bisphosphatase and inositol-monophosphatase activity. In hyperthermophilic archaea, inositol monophosphatase is thought to play a role in the biosynthesis of di-myo-inositol-1,1'-phosphate, an osmolyte unique to hyperthermophiles. 257
22223 238774 cd01516 FBPase_glpX Bacterial fructose-1,6-bisphosphatase, glpX-encoded. A dimeric enzyme dependent on Mg(2+). glpX-encoded FBPase (FBPase class II) differs from other members of the inositol-phosphatase superfamily by permutation of secondary structure elements. The core structure around the active site is well preserved. In E. coli, FBPase II is part of the glp regulon, which mediates growth on glycerol or sn-glycerol 3-phosphate as the sole carbon source. 309
22224 238775 cd01517 PAP_phosphatase PAP-phosphatase_like domains. PAP-phosphatase is a member of the inositol monophosphatase family, and catalyses the hydrolysis of 3'-phosphoadenosine-5'-phosphate (PAP) to AMP. In Saccharomyces cerevisiae, HAL2 (MET22) is involved in methionine biosynthesis and provides increased salt tolerance when over-expressed. Bacterial members of this domain family may differ in their substrate specificity and dephosphorylate different targets, as the substrate binding site does not appear to be conserved in that sub-set. 274
22225 238776 cd01518 RHOD_YceA Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YceA, Bacillus subtilis YbfQ, and similar uncharacterized proteins. 101
22226 238777 cd01519 RHOD_HSP67B2 Member of the Rhodanese Homology Domain superfamily. This CD includes the heat shock protein 67B2 of Drosophila melanogaster and other similar proteins, many of which are uncharacterized. 106
22227 238778 cd01520 RHOD_YbbB Member of the Rhodanese Homology Domain superfamily. This CD includes several putative ATP/GTP binding proteins including E. coli YbbB. 128
22228 238779 cd01521 RHOD_PspE2 Member of the Rhodanese Homology Domain superfamily. This CD includes the putative rhodanese-like protein, Psp2, of Yersinia pestis biovar Medievalis and other similar uncharacterized proteins. 110
22229 238780 cd01522 RHOD_1 Member of the Rhodanese Homology Domain superfamily, subgroup 1. This CD includes the putative rhodanese-related sulfurtransferases of several uncharacterized proteins. 117
22230 238781 cd01523 RHOD_Lact_B Member of the Rhodanese Homology Domain superfamily. This CD includes predicted proteins with rhodanese-like domains found N-terminal of the metallo-beta-lactamase domain. 100
22231 238782 cd01524 RHOD_Pyr_redox Member of the Rhodanese Homology Domain superfamily. Included in this CD are the Lactococcus lactis NADH oxidase, Bacillus cereus NADH dehydrogenase, and Bacteroides thetaiotaomicron pyridine nucleotide-disulphide oxidoreductase, and similar rhodanese-like domains found C-terminal of the pyridine nucleotide-disulphide oxidoreductase (Pyr-redox) domain and the Pyr-redox dimerization domain. 90
22232 238783 cd01525 RHOD_Kc Member of the Rhodanese Homology Domain superfamily. Included in this CD are the rhodanese-like domains found C-terminal of the serine/threonine protein kinases catalytic (S_TKc) domain and the Tre-2, BUB2p, Cdc16p (TBC) domain. The putative active site Cys residue is not present in this CD. 105
22233 238784 cd01526 RHOD_ThiF Member of the Rhodanese Homology Domain superfamily. This CD includes several putative molybdopterin synthase sulfurylases including the molybdenum cofactor biosynthetic protein (CnxF) of Aspergillus nidulans and the molybdenum cofactor synthesis protein 3 (MOCS3) of Homo sapiens. These rhodanese-like domains are found C-terminal of the ThiF and MoeZ_MoeB domains. 122
22234 238785 cd01527 RHOD_YgaP Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YgaP, and similar uncharacterized putative rhodanese-related sulfurtransferases. 99
22235 238786 cd01528 RHOD_2 Member of the Rhodanese Homology Domain superfamily, subgroup 2. Subgroup 2 includes uncharacterized putative rhodanese-related domains. 101
22236 238787 cd01529 4RHOD_Repeats Member of the Rhodanese Homology Domain superfamily. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. Only the second and most of the fourth repeats contain the putative catalytic Cys residue. This CD aligns the 1st, 2nd, 3rd, and 4th repeats. 96
22237 238788 cd01530 Cdc25 Cdc25 phosphatases are members of the Rhodanese Homology Domain superfamily. They activate the cell division kinases throughout the cell cycle progression. Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. Cdc25A phosphatase functions to regulate S phase entry and Cdc25B is required for G2/M phase transition of the cell cycle. The Cdc25 domain binds oxyanions at the catalytic site and has the signature motif (H/YCxxxxxR). 121
22238 238789 cd01531 Acr2p Eukaryotic arsenate resistance proteins are members of the Rhodanese Homology Domain superfamily. Included in this CD is the Saccharomyces cerevisiae arsenate reductase protein, Acr2p, and other yeast and plant homologs. 113
22239 238790 cd01532 4RHOD_Repeat_1 Member of the Rhodanese Homology Domain superfamily, repeat 1. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 1st repeat which does not contain the putative catalytic Cys residue. 92
22240 238791 cd01533 4RHOD_Repeat_2 Member of the Rhodanese Homology Domain superfamily, repeat 2. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 2nd repeat which does contain the putative catalytic Cys residue. 109
22241 238792 cd01534 4RHOD_Repeat_3 Member of the Rhodanese Homology Domain superfamily, repeat 3. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 3rd repeat which does not contain the putative catalytic Cys residue. 95
22242 238793 cd01535 4RHOD_Repeat_4 Member of the Rhodanese Homology Domain superfamily, repeat 4. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 4th repeat which, in general, contains the putative catalytic Cys residue. 145
22243 380478 cd01536 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. Periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this family function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea. The sugar binding domain is also homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. Moreover, this periplasmic binding domain, also known as Venus flytrap domain, undergoes transition from an open to a closed conformational state upon the binding of ligands such as lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. This family also includes the periplasmic binding domain of autoinducer-2 (AI-2) receptors such as LsrB and LuxP which are highly homologous to periplasmic pentose/hexose sugar-binding proteins. 268
22244 380479 cd01537 PBP1_repressor_sugar_binding-like Ligand-binding domain of the LacI-GalR family of transcription regulators and the sugar-binding domain of ABC-type transport systems. Ligand-binding domain of the LacI-GalR family of transcription regulators and the sugar-binding domain of ABC-type transport systems, all of which contain the type 1 periplasmic binding protein-like fold. Their specific ligands include lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. The LacI family of proteins consists of transcriptional regulators related to the lac repressor; in general the sugar binding domain in this family binds a sugar, which in turn changes the DNA binding activity of the repressor domain. The core structure of the periplasmic binding proteins is classified into two types and they differ in number and order of beta strands in each domain: type 1, which has six beta strands, and type 2, which has five beta strands. These two distinct structural arrangements may have originated from a common ancestor. 265
22245 380480 cd01538 PBP1_ABC_xylose_binding-like periplasmic xylose-like sugar-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. Periplasmic xylose-like sugar-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 283
22246 380481 cd01539 PBP1_GGBP periplasmic glucose/galactose-binding protein (GGBP) involved in chemotaxis towards, and active transport of, glucose and galactose in various bacterial species. Periplasmic glucose/galactose-binding protein (GGBP) involved in chemotaxis towards, and active transport of, glucose and galactose in various bacterial species. GGBP is a member of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic GGBP is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 302
22247 380482 cd01540 PBP1_arabinose_binding periplasmic L-arabinose-binding protein (ABP), a member of a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Periplasmic L-arabinose-binding protein (ABP), a member of a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. ABP is only involved in transport contrary to other related sugar-binding proteins such as the glucose/galactose-binding protein (GGBP) and the ribose-binding protein (RBP), both of which are involved in chemotaxis as well as transport. The periplasmic ABP consists of two alpha/beta globular domains connected by a three-stranded hinge, a Venus flytrap-like domain, which undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, ABP is homologous to the ligand-binding domain of eukaryotic receptors such as metabotropic glutamate receptor (mGluR) and DNA-binding transcriptional repressors such as LacI and GalR. 294
22248 380483 cd01541 PBP1_AraR ligand-binding domain of DNA transcription repressor specific for arabinose (AraR) which is a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor specific for arabinose (AraR) which is a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of AraR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 274
22249 380484 cd01542 PBP1_TreR-like ligand-binding domain of DNA transcription repressor specific for trehalose (TreR) which is a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor specific for trehalose (TreR) which is a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of TreR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 259
22250 380485 cd01543 PBP1_XylR ligand-binding domain of DNA transcription repressor specific for xylose (XylR). Ligand-binding domain of DNA transcription repressor specific for xylose (XylR), a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of XylR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 265
22251 380486 cd01544 PBP1_GalR ligand-binding domain of DNA transcription repressor GalR which is one of two regulatory proteins involved in galactose transport and metabolism. Ligand-binding domain of DNA transcription repressor GalR which is one of two regulatory proteins involved in galactose transport and metabolism. Transcription of the galactose regulon genes is regulated by Gal iso-repressor (GalS) and Gal repressor (GalR) in different ways, but both repressors recognize the same DNA binding site in the absence of D-galactose. GalR is a dimeric protein like GalS and is exclusively involved in the regulation of galactose permease, the low-affinity galactose transporter. GalS is involved in regulating expression of the high-affinity galactose transporter encoded by the mgl operon. GalS and GalR are members of the LacI-GalR family of transcription regulators and both contain the type 1 periplasmic binding protein-like fold. Hence, they are structurally homologous to the periplasmic sugar-binding domain of ABC-type transport systems. 269
22252 380487 cd01545 PBP1_SalR ligand-binding domain of DNA transcription repressor SalR, a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor SalR, a member of the LacI-GalR family of bacterial transcription regulators. SalR binds to the glucose-based compound salicin, which is chemically related to aspirin. The ligand-binding domain of SalR is structurally homologous to the periplasmic sugar-binding domain of ABC-transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 270
22253 238794 cd01553 EPT_RTPC-like This domain family includes the Enolpyruvate transferase (EPT) family and the RNA 3' phosphate cyclase family (RTPC). These 2 families differ in that EPT is formed by 3 repeats of an alpha-beta structural domain while RTPC has 3 similar repeats with a 4th slightly different domain inserted between the 2nd and 3rd repeat. They evidently share the same active site location, although the catalytic residues differ. 211
22254 238795 cd01554 EPT-like Enol pyruvate transferases family includes EPSP synthases and UDP-N-acetylglucosamine enolpyruvyl transferase. Both enzymes catalyze the reaction of enolpyruvyl transfer. 408
22255 238796 cd01555 UdpNAET UDP-N-acetylglucosamine enolpyruvyl transferase catalyzes enolpyruvyl transfer as part of the first step in the biosynthesis of peptidoglycan, a component of the bacterial cell wall. The reaction is phosphoenolpyruvate + UDP-N-acetyl-D-glucosamine = phosphate + UDP-N-acetyl-3-(1-carboxyvinyl)-D-glucosamine. This enzyme is of interest as a potential target for anti-bacterial agents. The only other known enolpyruvyl transferase is the related 5-enolpyruvylshikimate-3-phosphate synthase. 400
22256 238797 cd01556 EPSP_synthase EPSP synthase domain. 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EC 2.5.1.19) catalyses the reaction between shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP) to form 5-enolpyruvylshikimate-3-phosphate (EPSP), an intermediate in the shikimate pathway leading to aromatic amino acid biosynthesis. The reaction is phosphoenolpyruvate + 3-phosphoshikimate = phosphate + 5-O-(1-carboxyvinyl)-3-phosphoshikimate. It is found in bacteria and plants but not animals. The enzyme is the target of the widely used herbicide glyphosate, which has been shown to occupy the active site. In bacteria and plants, it is a single domain protein, while in fungi, the domain is found as part of a multidomain protein with functions that are all part of the shikimate pathway. 409
22257 238798 cd01557 BCAT_beta_family BCAT_beta_family: Branched-chain aminotransferase catalyses the transamination of the branched-chain amino acids leucine, isoleucine and valine to their respective alpha-keto acids, alpha-ketoisocaproate, alpha-keto-beta-methylvalerate and alpha-ketoisovalerate. The enzyme requires pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze the reaction. It has been found that mammals have two forms of the enzyme, mitochondrial and cytosolic, while bacteria contain only one form of the enzyme. The mitochondrial form plays a significant role in skeletal muscle glutamine and alanine synthesis and in interorgan nitrogen metabolism. Members of this subgroup are widely distributed in all three forms of life. 279
22258 238799 cd01558 D-AAT_like D-Alanine aminotransferase (D-AAT_like): D-amino acid aminotransferase catalyzes transamination between D-amino acids and their respective alpha-keto acids. It plays a major role in the synthesis of bacterial cell wall components like D-alanine and D-glutamate in addition to other D-amino acids. The enzyme like other members of this superfamily requires PLP as a cofactor. Members of this subgroup are found in all three forms of life. 270
22259 238800 cd01559 ADCL_like ADCL_like: 4-Amino-4-deoxychorismate lyase is a member of the fold-type IV of PLP dependent enzymes that converts 4-amino-4-deoxychorismate (ADC) to p-aminobenzoate and pyruvate. Based on the information available from the crystal structure, most members of this subgroup are likely to function as dimers. The enzyme from E. coli, the structure of which is available, is a homodimer that is folded into a small and a larger domain. The coenzyme pyridoxal 5'-phosphate resides at the interface of the two domains that is linked by a flexible loop. Members of this subgroup are found in Eukaryotes and bacteria. 249
22260 107203 cd01560 Thr-synth_2 Threonine synthase catalyzes the final step of threonine biosynthesis. The conversion of O-phosphohomoserine into threonine and inorganic phosphate is pyridoxal 5'-phosphate dependent. The Thr-synth_1 CD includes members from higher plants, cyanobacteria, archaebacteria and eubacterial groups. This CD, Thr-synth_2, includes enzymes from fungi and eubacterial groups, as well as, metazoan threonine synthase-like proteins. 460
22261 107204 cd01561 CBS_like CBS_like: This subgroup includes Cystathionine beta-synthase (CBS) and Cysteine synthase. CBS is a unique heme-containing enzyme that catalyzes a pyridoxal 5'-phosphate (PLP)-dependent condensation of serine and homocysteine to give cystathionine. Deficiency of CBS leads to homocystinuria, an inherited disease of sulfur metabolism characterized by increased levels of the toxic metabolite homocysteine. Cysteine synthase on the other hand catalyzes the last step of cysteine biosynthesis. This subgroup also includes an O-Phosphoserine sulfhydrylase found in hyperthermophilic archaea which produces L-cysteine from sulfide and the more thermostable O-phospho-L-serine. 291
22262 107205 cd01562 Thr-dehyd Threonine dehydratase: The first step in amino acid degradation is the removal of nitrogen. Although the nitrogen atoms of most amino acids are transferred to alpha-ketoglutarate before removal, the alpha-amino group of threonine can be directly converted into NH4+. The direct deamination is catalyzed by threonine dehydratase, in which pyridoxal phosphate (PLP) is the prosthetic group. Threonine dehydratase is widely distributed in all three major phylogenetic divisions. 304
22263 107206 cd01563 Thr-synth_1 Threonine synthase is a pyridoxal phosphate (PLP) dependent enzyme that catalyses the last reaction in the synthesis of threonine from aspartate. It proceeds by converting O-phospho-L-homoserine (OPH) into threonine and inorganic phosphate. In plants, OPH is an intermediate between the methionine and threonine/isoleucine pathways. Thus threonine synthase competes for OPH with cystathionine-gamma-synthase, the first enzyme in the methionine pathway. These enzymes are in general dimers. Members of this CD, Thr-synth_1, are widely distributed in bacteria, archaea and higher plants. 324
22264 238801 cd01567 NAPRTase_PncB Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. 343
22265 238802 cd01568 QPRTase_NadC Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other. 269
22266 238803 cd01569 PBEF_like pre-B-cell colony-enhancing factor (PBEF)-like. The mammalian members of this group of nicotinate phosphoribosyltransferases (NAPRTases) were originally identified as genes whose expression is upregulated upon activation in lymphoid cells. In general, nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in NAD salvage synthesis. 407
22267 238804 cd01570 NAPRTase_A Nicotinate phosphoribosyltransferase (NAPRTase), subgroup A. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria and eukaryota (except fungi). 327
22268 238805 cd01571 NAPRTase_B Nicotinate phosphoribosyltransferase (NAPRTase), subgroup B. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. 302
22269 238806 cd01572 QPRTase Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other. 268
22270 238807 cd01573 modD_like ModD; Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase) present in some modABC operons in bacteria, which are involved in molybdate transport. In general, QPRTases are part of the de novo synthesis pathway of NAD in both prokaryotes and eukaryotes. They catalyse the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. 272
22271 380488 cd01574 PBP1_LacI ligand-binding domain of DNA transcription repressor LacI specific for lactose, a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor LacI specific for lactose, a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of LacI is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 265
22272 380489 cd01575 PBP1_GntR ligand-binding domain of DNA transcription repressor GntR specific for gluconate, a member of the LacI-GalR family of bacterial transcription regulators. This group represents the ligand-binding domain of DNA transcription repressor GntR specific for gluconate, a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of GntR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding, which in turn changes the DNA binding affinity of the repressor. 269
22273 238808 cd01576 AcnB_Swivel Aconitase B swivel domain. Aconitate hydratase B is involved in energy metabolism as part of the TCA cycle. It catalyses the formation of cis-aconitate from citrate. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The domain structure of Aconitase B is different from other Aconitases in that the swivel domain that is found at the N-terminus of the B family is normally found at the C-terminus for other Aconitases. In most members of the family, there is also a HEAT domain before domain 4, which is believed to play a role in protein-protein interaction. 131
22274 238809 cd01577 IPMI_Swivel Aconitase-like swivel domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate 3-isopropylmalate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. 91
22275 238810 cd01578 AcnA_Mitochon_Swivel Mitochondrial aconitase A swivel domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members. 149
22276 238811 cd01579 AcnA_Bact_Swivel Bacterial Aconitase-like swivel domain. Aconitase (aconitate hydratase or citrate hydrolyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. This distinct subfamily is found only in bacteria and archaea. Its exact characteristics are not known. 121
22277 238812 cd01580 AcnA_IRP_Swivel Aconitase A swivel domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. 171
22278 153131 cd01581 AcnB Aconitate hydratase B catalyses the formation of cis-aconitate from citrate as part of the TCA cycle. Aconitase B catalytic domain. Aconitate hydratase B catalyses the formation of cis-aconitate from citrate as part of the TCA cycle. Aconitase has an active (4Fe-4S) and an inactive (3Fe-4S) form. The active cluster is part of the catalytic site that interconverts citrate, cis-aconitate and isocitrate. The domain architecture of aconitase B is different from other aconitases in that the catalytic domain is normally found at the C-terminus for other aconitases, but it is at the N-terminus for the B family. It also has a HEAT domain before domain 4 which plays a role in protein-protein interaction. This alignment is the core domain including domains 1, 2 and 3. 436
22279 153132 cd01582 Homoaconitase Homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase catalytic domain. Homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases. 363
22280 153133 cd01583 IPMI 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate. Aconitase-like catalytic domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate 3-isopropylmalate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes. 382
22281 153134 cd01584 AcnA_Mitochondrial Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Mitochondrial aconitase A catalytic domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediary product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members. 412
22282 153135 cd01585 AcnA_Bact Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Bacterial Aconitase-like catalytic domain. Aconitase (aconitate hydratase or citrate hydrolyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This distinct subfamily is found only in bacteria and Archaea. Its exact characteristics are not known. 380
22283 153136 cd01586 AcnA_IRP Aconitase A catalytic domain. Aconitase A catalytic domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydrolyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes. 404
22284 176466 cd01594 Lyase_I_like Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions. Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively. 231
22285 176467 cd01595 Adenylsuccinate_lyase_like Adenylsuccinate lyase (ASL)_like. This group contains ASL, prokaryotic-type 3-carboxy-cis,cis-muconate cycloisomerase (pCMLE), and related proteins. These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in the de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and; the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). pCMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone, in the beta-ketoadipate pathway. ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting. 381
22286 176468 cd01596 Aspartase_like aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. This group contains aspartase (L-aspartate ammonia-lyase), fumarase class II enzymes, and related proteins. It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle. 450
22287 176469 cd01597 pCLME prokaryotic 3-carboxy-cis,cis-muconate cycloisomerase (CMLE)_like. This subgroup contains pCLME and related proteins, and belongs to the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. CMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone in the beta-ketoadipate pathway. This pathway is responsible for the catabolism of a variety of aromatic compounds into intermediates of the citric cycle in prokaryotic and eukaryotic micro-organisms. 437
22288 176470 cd01598 PurB PurB_like adenylosuccinases (adenylosuccinate lyase, ASL). This subgroup contains EcASL, the product of the purB gene in Escherichia coli, and related proteins. It is a member of the Lyase class I family of the Lyase_I superfamily. Members of the Lyase class I family function as homotetramers to catalyze similar beta-elimination reactions in which a Calpha-N or Calpha-O bond is cleaved with the subsequent release of fumarate as one of the products. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two non-sequential steps in the de novo purine biosynthesis pathway: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and; the conversion of adenylosuccinate (SAMP) into adenosine monophosphate (AMP). 425
22289 259845 cd01609 RNAP_beta'_N Largest subunit (beta') of bacterial DNA-dependent RNA polymerase (RNAP), N-terminal domain. Beta' is the largest subunit of bacterial DNA-dependent RNA polymerase (RNAP). This family also includes the eukaryotic plastid-encoded RNAP beta' subunit. Bacterial RNAP is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. Structure studies suggest that RNA polymerase complexes from different organisms share a crab-claw-shaped structure with two "pincers" defining a central cleft. Beta' and beta, the largest and the second largest subunits of bacterial RNAP, each makes up one pincer and part of the base of the cleft. Beta' contains part of the active site and binds two zinc ions that have a structural role in the formation of the active polymerase. 659
22290 238813 cd01610 PAP2_like PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins. 122
22291 340453 cd01611 Ubl_Autophagy_like ubiquitin-like (Ubl) domain found in autophagy-related ubiquitin-like protein. Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. The autophagy-related ubiquitin-like proteins, such as Saccharomyces cerevisiae Atg8p, undergo a unique ubiquitin-like (Ubl) conjugation, a process essential for autophagosome formation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The ubiquitination process comprises a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. ATG8 family proteins undergo multistep modifications by the E1-like (ubiquitin activating) enzyme ATG7, and the E2-like (ubiquitin conjugating) enzyme ATG3. The mammalian ATG8 family is classified into three subfamilies: i) MAP1LC3 (microtubule associated protein 1 light chain 3) which includes MAP1LC3A, MAP1LC3B, MAP1LC3B2, and MAP1LC3C, ii) GABARAP (GABA type A receptor associated protein) which includes GABARAP, GABARAPL1, and GABARAPL3, and iii) GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa). 84
22292 340454 cd01612 Ubl_ATG12 ubiquitin-like (Ubl) domain found in autophagy-related protein 12 (ATG12). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. The autophagy-related ubiquitin-like (Ubl) proteins such as ATG12 protein have a conserved Ubl fold structure and undergo a unique Ubl conjugation, a process essential for autophagosome formation. ATG12 is conjugated to ATG5 by multistep modifications of the E1-like (ubiquitin activating) enzyme ATG7, and the E2-like (ubiquitin conjugating) enzyme ATG10. The ATG12-ATG5 conjugate facilitates the lipidation of ATG8 and directs its correct subcellular localization. ATG12 is localized at the developing autophagosome. 86
22293 133473 cd01614 EutN_CcmL Ethanolamine utilisation protein and carboxysome structural protein domain family. Beside the Escherichia coli ethanolamine utilization protein EutN and the Synechocystis sp. carboxysome (beta-type) structural protein CcmL, this family also includes alpha-type carboxysome structural proteins CsoS4A and CsoS4B (previously known as OrfA and OrfB), propanediol utilization protein PduN, and some hypothetical homologs from various bacterial microcompartments. The carboxysome, a polyhedral organelle, participates in carbon fixation by sequestering enzymes. It is the prototypical bacterial microcompartment. Its enzymatic components, ribulose bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase (CA), are surrounded by a polyhedral protein shell. Similarly, the ethanolamine utilization (eut) microcompartment, and the 1,2-propanediol utilization (pdu) microcompartment encapsulate the enzymes necessary for the process of cobalamin-dependent ethanolamine degradation, and coenzyme B12-dependent degradation of 1,2-propanediol, respectively, within their polyhedral protein shells. It is interesting that both carboxysome structural proteins CcmL and CsoS4A assemble as pentamers in the crystal structures, which might constitute the twelve pentameric vertices of a regular icosahedral carboxysome. However, the reported EutN structure is hexameric rather than pentameric. The absence of pentamers in Eut microcompartments might lead to less-regular icosahedral shell shapes. Due to the lack of structural evidence, the functional roles of the CsoS4A adjacent paralog, CsoS4B, and propanediol utilization protein PduN are not yet clear. 83
22294 119367 cd01615 CIDE_N CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein. 78
22295 340455 cd01616 TGS TGS (ThrRS, GTPase and SpoT) domain structurally similar to a beta-grasp ubiquitin-like fold. This family includes eukaryotic and some bacterial threonyl-tRNA synthetases (ThrRSs), distinct Obg family GTPases, and guanosine polyphosphate hydrolase (SpoT) and synthetase (RelA), which are involved in stringent response in bacteria, as well as uridine kinase (UDK) from Thermotogales. All family members contain a TGS domain named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs. It is a small domain with a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. The function of the TGS domain remains unclear, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, with a regulatory role. 61
22296 340456 cd01617 DCX Doublecortin-like domain structurally similar to a beta-grasp ubiquitin-like fold. Doublecortin (DCX) is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its DCX protein domains can occur in double tandem or as single DCX repeats. Proteins with DCX tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK). Single DCX repeat proteins are normally localized to actin-rich subcellular structures, or the nucleus such as DCDC2. DCX is not only a unique MAP in terms of structure, it also interacts with multiple additional proteins. Mutations in human DCX genes are associated with abnormal neuronal migration, epilepsy, and mental retardation. 73
22297 240620 cd01619 LDH_like D-Lactate and related Dehydrogenases, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-Hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2 domain structure of formate dehydrogenase. D-HicDH is a NAD-dependent member of the hydroxycarboxylate dehydrogenase family, and shares the Rossmann fold typical of many NAD binding proteins. D-HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. Similar to the structurally distinct L-HicDH, D-HicDH exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. (R)-2-hydroxyglutarate dehydrogenase (HGDH) catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 323
22298 240621 cd01620 Ala_dh_like Alanine dehydrogenase and related dehydrogenases. Alanine dehydrogenase/Transhydrogenase, such as the hexameric L-alanine dehydrogenase of Phormidium lapideum, contain 2 Rossmann fold-like domains linked by an alpha helical region. Related proteins include Saccharopine Dehydrogenase (SDH), bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase enzyme, N(5)-(carboxyethyl)ornithine synthase, and Rubrum transhydrogenase. Alanine dehydrogenase (L-AlaDH) catalyzes the NAD-dependent conversion of pyruvate to L-alanine via reductive amination. Transhydrogenases found in bacterial and inner mitochondrial membranes link NAD(P)(H)-dependent redox reactions to proton translocation. The energy of the proton electrochemical gradient (delta-p), generated by the respiratory electron transport chain, is consumed by transhydrogenase in NAD(P)+ reduction. Transhydrogenase is likely involved in the regulation of the citric acid cycle. Rubrum transhydrogenase has 3 components, dI, dII, and dIII. dII spans the membrane while dI and dIII protrude on the cytoplasmic/matrix side. DI contains 2 domains with Rossmann folds, linked by a long alpha helix, and contains a NAD binding site. Two dI polypeptides (represented in this sub-family) spontaneously form a heterotrimer with one dIII in the absence of dII. In the heterotrimer, both dI chains may bind NAD, but only one is well-ordered. dIII also binds a well-ordered NADP, but in a different orientation than classical Rossmann domains. 317
22299 319765 cd01624 HAD_VSP_like vegetative storage proteins and related proteins, similar to soybean VSPalpha and VSPbeta proteins; belongs to the haloacid dehalogenase-like superfamily. Soybean [Glycine max (L.) Merr.] vegetative storage protein VSPalpha and VSPbeta levels were identified as storage proteins due to their abundance and pattern of expression in plant tissues, they accumulate to almost one-half the amount of soluble leaf protein when soybean plants are continually depodded. They possess acid phosphatase activity which appears to be low compared to several other plant acid phosphatases; it increases in the leaves of depodded soybean plants, but to no more than 0.1% of the total acid phosphatase activity in these leaves. This acid phosphatase activity has maximal activity at pH 5.0 - 5.5, and can liberate Pi from different substrates such as napthyl acid phosphate, carboxyphenyl phosphate, sugar-phosphates, glyceraldehyde 3-phosphate, dihydroxyacetone phosphate, phosphoenolpyruvate, ATP, ADP, PPi, and short chain polyphosphates; they cleave phosphoenolpyruvate, ATP, ADP, PPI, and polyphosphates most efficiently. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Soybean VSPalpha and VSPbeta lack this active site aspartate, other members of this family have this aspartate and may be more active. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 160
22300 319766 cd01625 HAD_PNP polynucleotide 3'-phosphatase domain similar to the phosphatase domain of the bifunctional enzyme polynucleotide 5'-kinase/3'-phosphatase. Polynucleotide 3'-phosphatase (PNP) domain. This domain dephosphorylates single-stranded as well as double-stranded 3'-phospho termini. It is found in bifunctional enzyme polynucleotide kinase/phosphatase (PNKP) which contain both kinase and phosphatase domains. PNKP plays a key role in both base excision repair and non-homologous end-joining DNA repair pathway. DNA strand breaks can result from DNA damage by ionizing radiation and chemical agents, such as alkylating agents or anticancer agents. Such DNA damage often results in DNA strands with 5'-hydroxyl and 3'-phosphate termini. However, the repair of DNA damage by DNA polymerases and ligases requires 5'-phosphate and 3'-hydroxyl termini. PNKP acts as a 5'-kinase/3'-phosphatase to create 5'-phosphate/3'-hydroxyl termini, which are a necessary prerequisite for ligation during repair. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 154
22301 319767 cd01627 HAD_TPP trehalose-phosphate phosphatase similar to Escherichia coli trehalose-6-phosphate phosphatase OtsB and Saccharomyces cerevisiae trehalose-phosphatase TPS2. Trehalose biosynthesis in bacteria is known through three pathways - OtsAB, TreYZ and TreS. The OtsAB pathway, also known as the trehalose 6-phosphate synthase (TPS)/trehalose-6-phosphate phosphatase (TPP) pathway, is the most common route known to be involved in the stress response of Escherichia coli. It involves converting glucose-6-phosphate and UDP-glucose to form trehalose-6-phosphate (T6P), catalyzed by TPS, the product of the otsA gene; this step is followed by the dephosphorylation of T6P to yield trehalose and inorganic phosphate, catalyzed by a specific TPP, the product of the otsB gene. This OtsAB (or TPS/TPP) pathway is also the most common route known to be involved in the stress response of yeast. In Saccharomyces cerevisiae, the corresponding enzymes, TPS1p and TPS2p, form a multimeric synthase complex together with additional regulatory subunits encoded by Tsl1 and Tps3. Trehalose is a common disaccharide accumulated by organisms as a reservation of carbohydrate and in response to unfavorable growth conditions. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 228
22302 319768 cd01629 HAD_EP Enolase-phosphatase similar to human enolase-phosphatase E1 and Xanthomonas oryzae pv. oryzae enolase-phosphatase Xep. Enolase-phosphatase E1 (also called MASA) is a bifunctional enolase-phosphatase which promotes the conversion of 2,3-diketo-5-methylthio-1-phosphopentane to 1,2-dihydroxy-3-keto-5-methylthiopentene anion (an aci-reductone) in the methionine salvage pathway. The catalytic reaction is carried out continuously by enolization and dephosphorylation, and the enolase activity cannot be classified as typical enzymatic enolization. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 204
22303 319769 cd01630 HAD_KDO-like haloacid dehalogenase-like (HAD) hydrolase, similar to Escherichia coli 3-deoxy-D-manno-octulosonate 8-phosphate (KDO 8-P) phosphatase KdsC, and rainbow trout N-acylneuraminate cytidylyltransferase. KDO 8-P phosphatase catalyzes the hydrolysis of KDO 8-P to KDO (3-deoxy-D-manno-octulosonate) and inorganic phosphate and is the last enzyme in the KDO biosynthetic pathway. KDO is an 8-carbon sugar that links the lipid A and polysaccharide moieties of the lipopolysaccharide region in Gram-negative bacteria. An interruption in KDO biosynthesis leads to the accumulation of lipid A precursors and subsequent arrest in cell growth. The KDO biosynthesis pathway involves five sequential enzymatic reactions. This family also includes rainbow trout CMP-sialic acid synthetase which effectively converts both deaminoneuraminic acid (KDN, 2-keto-3-deoxy-D-glycero-D-galacto-nononic acid) and N-acetylneuraminic acid (Neu5Ac) to CMP-KDN and CMP-Neu5Ac, respectively. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 146
22304 340816 cd01635 Glycosyltransferase_GTB-type glycosyltransferase family 1 and related proteins with GTB topology. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 235
22305 238814 cd01636 FIG FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphosphatases (both the major and the glpX-encoded variant) hydrolyze fructose-1,6-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule inositol-1,4,5-trisphosphate. Many of these enzymes are inhibited by Li+. 184
22306 238815 cd01637 IMPase_like Inositol-monophosphatase-like domains. This family of phosphatases is dependent on bivalent metal ions such as Mg++, and many members are inhibited by Li+ (which is thought to displace a bivalent ion in the active site). Substrates include fructose-1,6-bisphosphate, inositol poly- and monophosphates, PAP and PAPS, sedoheptulose-1,7-bisphosphate and probably others. 238
22307 238816 cd01638 CysQ CysQ, a 3'-Phosphoadenosine-5'-phosphosulfate (PAPS) 3'-phosphatase, is a bacterial member of the inositol monophosphatase family. It has been proposed that CysQ helps control intracellular levels of PAPS, which is an intermediate in cysteine biosynthesis (a principal route of sulfur assimilation). 242
22308 238817 cd01639 IMPase IMPase, inositol monophosphatase and related domains. A family of Mg++ dependent phosphatases, inhibited by lithium, many of which may act on inositol monophosphate substrate. They dephosphorylate inositol phosphate to generate inositol, which may be recycled into inositol lipids; in eukaryotes IMPase plays a vital role in intracellular signaling. IMPase is one of the proposed targets of Li+ therapy in manic-depressive illness. This family contains some bacterial members of the inositol monophosphatase family classified as SuhB-like. E. coli SuhB has been suggested to participate in posttranscriptional control of gene expression, and its inositol monophosphatase activity does not appear to be sufficient for its cellular function. It has been proposed that SuhB plays a role in the biosynthesis of phosphatidylinositol in mycobacteria. 244
22309 238818 cd01640 IPPase IPPase; Inositol polyphosphate-1-phosphatase, a member of the Mg++ dependent family of inositol monophosphatase-like domains, hydrolyzes the 1' position phosphate from inositol 1,3,4-trisphosphate and inositol 1,4-bisphosphate. Members in this group may also exhibit 3'-phosphoadenosine 5'-phosphate phosphatase activity, and they all appear to be inhibited by lithium. IPPase is one of the proposed targets of Li+ therapy in manic-depressive illness. 293
22310 238819 cd01641 Bacterial_IMPase_like_1 Predominantly bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. These enzymes may dephosphorylate fructose-1,6-bisphosphate, inositol monophosphate, 3'-phosphoadenosine-5'-phosphate, or similar substrates. 248
22311 238820 cd01642 Arch_FBPase_2 Putative fructose-1,6-bisphosphatase or related enzymes of inositol monophosphatase family. These are Mg++ dependent phosphatases. Members in this family may have fructose-1,6-bisphosphatase and/or inositol-monophosphatase activity. Fructose-1,6-bisphosphatase catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway. 244
22312 238821 cd01643 Bacterial_IMPase_like_2 Bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. These enzymes may dephosphorylate inositol monophosphate or similar substrates. 242
22313 238822 cd01644 RT_pepA17 RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements. 213
22314 238823 cd01645 RT_Rtv RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily contain long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA directed DNA polymerase, and ribonuclease hybrid (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and reverse transcription takes place in the cytoplasm, producing a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs. 213
22315 238824 cd01646 RT_Bac_retron_I RT_Bac_retron_I: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome. 158
22316 238825 cd01647 RT_LTR RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses. 177
22317 238826 cd01648 TERT TERT: Telomerase reverse transcriptase (TERT). Telomerase is a ribonucleoprotein (RNP) that synthesizes telomeric DNA repeats. The telomerase RNA subunit provides the template for synthesis of these repeats. The catalytic subunit of RNP is known as telomerase reverse transcriptase (TERT). The reverse transcriptase (RT) domain is located in the C-terminal region of the TERT polypeptide. Single amino acid substitutions in this region lead to telomere shortening and senescence. Telomerase is an enzyme that, in certain cells, maintains the physical ends of chromosomes (telomeres) during replication. In somatic cells, replication of the lagging strand requires the continual presence of an RNA primer approximately 200 nucleotides upstream, which is complementary to the template strand. Since there is a region of DNA less than 200 base pairs from the end of the chromosome where this is not possible, the chromosome is continually shortened. However, a surplus of repetitive DNA at the chromosome ends protects against the erosion of gene-encoding DNA. Telomerase is not normally expressed in somatic cells. It has been suggested that exogenous TERT may extend the lifespan of, or even immortalize, the cell. However, recent studies have shown that telomerase activity can be induced by a number of oncogenes. Conversely, the oncogene c-myc can be activated in human TERT immortalized cells. Sequence comparisons place the telomerase proteins in the RT family but reveal hallmarks that distinguish them from retroviral and retrotransposon relatives. 119
22318 238827 cd01650 RT_nLTR_like RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA directed DNA polymerase and ribonuclease hybrid (RNase H) activities. 220
22319 238828 cd01651 RT_G2_intron RT_G2_intron: Reverse transcriptases (RTs) with group II intron origin. RT transcribes DNA using RNA as template. Proteins in this subfamily are found in bacterial and mitochondrial group II introns. Their most probable ancestor was a retrotransposable element with both gag-like and pol-like genes. This subfamily of proteins appears to have captured the RT sequences from transposable elements, which lack long terminal repeats (LTRs). 226
22320 153210 cd01653 GATase1 Type 1 glutamine amidotransferase (GATase1)-like domain. Type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, and the A4 beta-galactosidase middle domain. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 115
22321 100099 cd01657 Ribosomal_L7_archeal_euk Ribosomal protein L7, which is found in archaea and eukaryotes but not in prokaryotes, binds domain II of the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L7 members have an N-terminal extension not found in the archaeal L7 orthologs. L7 is closely related to the ribosomal L30 protein found in eukaryotes and prokaryotes. 159
22322 100100 cd01658 Ribosomal_L30 Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea. 54
22323 238829 cd01659 TRX_superfamily Thioredoxin (TRX) superfamily; a large, diverse group of proteins containing a TRX-fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include TRX, protein disulfide isomerase (PDI), tlpA-like, glutaredoxin, NrdH redoxin, and the bacterial Dsb (DsbA, DsbC, DsbG, DsbE, DsbDgamma) protein families. Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins and glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others. 69
22324 238830 cd01660 ba3-like_Oxidase_I ba3-like heme-copper oxidase subunit I. The ba3 family of heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and some archaea which catalyze the reduction of O2 and simultaneously pump protons across the membrane. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. The ba3 family contains oxidases that lack the conserved residues that form the D- and K-pathways in CcO and ubiquinol oxidase. Instead they contain a potential alternative K-pathway. Additional proton channels have been proposed for this family of oxidases but none have been identified definitively. For general information on the heme-copper oxidase superfamily, please see cd00919. 473
22325 238831 cd01661 cbb3_Oxidase_I Cytochrome cbb3 oxidase subunit I. Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Found mainly in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I. Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center. The fourth subunit (ccoQ) has been shown to protect the core complex from proteolytic degradation by serine proteases. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. The polar residues that form the D- and K-pathways in subunit I of other cytochrome c and ubiquinol oxidases are absent in cbb3. The proton pathways remain undefined. A pathway for the transfer of pumped protons beyond the binuclear center also remains undefined. It is believed that electrons are passed from cytochrome c (the electron donor) to the low-spin heme via ccoP and ccoO, respectively, and directly from the low-spin heme to the binuclear center. 493
22326 238832 cd01662 Ubiquinol_Oxidase_I Ubiquinol oxidase subunit I. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits in ubiquinol oxidase varies from two to five. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons from ubiquinol to the binuclear center. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from ubiquinol on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I. It is generally believed that the channels contain water molecules that act as 'proton wires' to transfer the protons. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are believed to be transferred directly from ubiquinol (the electron donor) to the low-spin heme, and directly from the low-spin heme to the binuclear center. 501
22327 238833 cd01663 Cyt_c_Oxidase_I Cytochrome C oxidase subunit I. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, may play a role in assembly or oxygen delivery to the active site. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme (heme a3) and a copper ion (CuB). It also contains a low-spin heme (heme a), believed to participate in the transfer of electrons to the binuclear center. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are transferred from cytochrome c (the electron donor) to heme a via the CuA binuclear site in subunit II, and directly from heme a to the binuclear center. 488
22328 238834 cd01665 Cyt_c_Oxidase_III Cytochrome c oxidase subunit III. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. CcO catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit III contains bound phospholipids in several crystal structures and is proposed to contain a "lipid pool." These phospholipids are believed to be intrinsic constituents similar to cofactors of the enzyme. 243
22329 340457 cd01666 TGS_DRG TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein (DRG) family. DRG-1 and DRG-2 comprise a highly conserved DRG subfamily of GTP-binding proteins found in archaea, plants, fungi and animals. The exact function of DRG proteins is unknown, although phylogenetic and biochemical fraction studies have linked them to translation, differentiation and growth. Their abnormal expressions may trigger cell transformation or cell cycle arrest. DRG-1 and DRG-2 bind to DFRP1 (DRG family regulatory protein 1) and DFRP2, respectively. Both DRG-1 and DRG-2 contain a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as the C-terminal TGS (ThrRS, GTPase and SpoT) domain, which has a predominantly beta-grasp ubiquitin-like fold and may be related to RNA binding. DRG subfamily belongs to the Obg family of GTPases. 77
22330 340458 cd01667 TGS_ThrRS TGS (ThrRS, GTPase and SpoT) domain found in threonyl-tRNA synthetase (ThrRS) and similar proteins. ThrRS, also termed cytoplasmic threonine--tRNA ligase, is a class II aminoacyl-tRNA synthetase (aaRS) that plays an essential role in protein synthesis by catalyzing the aminoacylation of tRNA(Thr), generating aminoacyl-tRNA, and editing misacylation. In addition to its catalytic and anticodon-binding domains, ThrRS has an N-terminal TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs. TGS is a small domain with a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. 65
22331 340459 cd01668 TGS_RSH TGS (ThrRS, GTPase and SpoT) domain found in the RelA/SpoT homolog (RSH) family. The RelA/SpoT homolog (RSH) family consists of long RSH proteins and short RSH proteins. Long RSH proteins have been characterized as containing an N-terminal region and a C-terminal region. The N-terminal region contains a pseudo-hydrolase (inactive-hydrolase) domain and a (p)ppGpp synthetase domain. The C-terminal region contains a ubiquitin-like TGS (ThrRS, GTPase and SpoT) domain, a conserved cysteine domain (CC), helical and ACT (aspartate kinase, chorismate mutase, TyrA domain) domains connected by a linker region. Short RSH proteins have a truncated C-terminal region without ACT domain. The RSH family includes two classes of enzyme: i) monofunctional (p)ppGpp synthetase I, RelA, and ii) bifunctional (p)ppGpp synthetase II/hydrolase, SpoT (also called Rel). Both classes are capable of synthesizing (p)ppGpp but only bifunctional enzymes are capable of (p)ppGpp hydrolysis. SpoT is a ribosome-associated protein that is activated during amino acid starvation and thought to mediate the stringent response. The function of the TGS domain of SpoT is in transcription of survival and virulence genes in response to environmental stress. RelA is an ATP:GTP(GDP) pyrophosphate transferase that is recruited to stalled ribosomes and activated to synthesize (p)ppGpp, which acts as a pleiotropic secondary messenger. 59
22332 340460 cd01669 TGS_MJ1332_like TGS (ThrRS, GTPase and SpoT) domain found in Methanocaldococcus jannaschii uncharacterized GTP-binding protein MJ1332 and similar proteins. This family includes a group of uncharacterized GTP-binding proteins from archaea, which belong to the Obg family of GTPases. The family members contain a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as a C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold. 78
22333 260017 cd01670 Death Death Domain: a protein-protein interaction domain. Death Domains (DDs) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. Structural analysis of DD-DD complexes shows that the domains interact with each other in many different ways. DD-containing proteins serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. In mammals, they are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways. In invertebrates, they are involved in transcriptional regulation of zygotic patterning genes in insect embryogenesis, and are components of the Toll/NF-kappaB pathway, a conserved innate immune pathway in animal cells. 79
22334 260018 cd01671 CARD Caspase activation and recruitment domain: a protein-protein interaction domain. Caspase activation and recruitment domains (CARDs) are death domains (DDs) found associated with caspases. Caspases are aspartate-specific cysteine proteases with functions in apoptosis, immune signaling, inflammation, and host-defense mechanisms. In addition to caspases, proteins containing CARDs include adaptor proteins such as RAIDD, CARD9, and RIG-I-like helicases, which can form multiprotein complexes and play important roles in mediating the signals to induce immune and inflammatory responses. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 79
22335 238835 cd01672 TMPK Thymidine monophosphate kinase (TMPK), also known as thymidylate kinase, catalyzes the phosphorylation of thymidine monophosphate (TMP) to thymidine diphosphate (TDP) utilizing ATP as its preferred phosphoryl donor. TMPK represents the rate-limiting step in either de novo or salvage biosynthesis of thymidine triphosphate (TTP). 200
22336 238836 cd01673 dNK Deoxyribonucleoside kinase (dNK) catalyzes the phosphorylation of deoxyribonucleosides to yield corresponding monophosphates (dNMPs). This family consists of various deoxynucleoside kinases including deoxyribo- cytidine (EC 2.7.1.74), guanosine (EC 2.7.1.113), adenosine (EC 2.7.1.76), and thymidine (EC 2.7.1.21) kinases. They are key enzymes in the salvage of deoxyribonucleosides originating from extra- or intracellular breakdown of DNA. 193
22337 238837 cd01674 Homoaconitase_Swivel Homoaconitase swivel domain. This family includes homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases. This is the swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. 129
22338 153084 cd01675 RNR_III Class III ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits, this model covers the larger subunit which contains the active and allosteric sites. 555
22339 153085 cd01676 RNR_II_monomer Class II ribonucleotide reductase, monomeric form. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class II RNRs are found in bacteria that can live under both aerobic and anaerobic conditions. Many, but not all members of this class, are found to be homodimers. This particular subfamily is found to be active as a monomer. Adenosylcobalamin interacts directly with an active site cysteine to form the reactive cysteine radical. 658
22340 153086 cd01677 PFL2_DhaB_BssA Pyruvate formate lyase 2 and related enzymes. This family includes pyruvate formate lyase 2 (PFL2), B12-independent glycerol dehydratase (DhaB) and the alpha subunit of benzylsuccinate synthase (BssA), all of which have a highly conserved ten-stranded alpha/beta barrel domain, which is similar to those of PFL1 (pyruvate formate lyase 1) and RNR (ribonucleotide reductase). Pyruvate formate lyase catalyzes a key step in anaerobic glycolysis, the conversion of pyruvate and CoenzymeA to formate and acetylCoA. DhaB catalyzes the first step in the conversion of glycerol to 1,3-propanediol while BssA catalyzes the first step in the anaerobic mineralization of both toluene and m-xylene. 781
22341 153087 cd01678 PFL1 Pyruvate formate lyase 1. Pyruvate formate lyase catalyzes a key step in anaerobic glycolysis, the conversion of pyruvate and CoenzymeA to formate and acetylCoA. The PFL mechanism involves an unusual radical cleavage of pyruvate in which two cysteines and one glycine form radicals that are required for catalysis. PFL has a ten-stranded alpha/beta barrel domain that is structurally similar to those of all three ribonucleotide reductase (RNR) classes as well as benzylsuccinate synthase and B12-independent glycerol dehydratase. 738
22342 153088 cd01679 RNR_I Class I ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and many viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophages, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class I RNR is oxygen-dependent and can be subdivided into classes Ia (eukaryotes, prokaryotes, viruses and phages) and Ib (which is found in prokaryotes only). It is a tetrameric enzyme of two alpha and two beta subunits; this model covers the major part of the alpha or large subunit, called R1 in class Ia and R1E in class Ib. 460
22343 238838 cd01680 EFG_like_IV Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance, and an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm. 116
22344 238839 cd01681 aeEF2_snRNP_like_IV This family represents domain IV of archaeal and eukaryotic elongation factor 2 (aeEF-2) and of an evolutionarily conserved U5 snRNP-specific protein. U5 snRNP is a GTP-binding factor closely related to the ribosomal translocase EF-2. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-1 (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm. 177
22345 238840 cd01683 EF2_IV_snRNP EF-2_domain IV_snRNP domain is a part of the 116kD U5-specific protein of the U5 small nuclear ribonucleoprotein (snRNP) particle, an essential component of the spliceosome. The protein is structurally closely related to the eukaryotic translational elongation factor EF2. This domain has also been identified in the 114kD U5-specific protein of Saccharomyces cerevisiae and may play an important role either in the splicing process itself or in the recycling of spliceosomal snRNP. 178
22346 238841 cd01684 Tet_like_IV EF-G_domain IV_RPP domain is a part of the bacterial ribosomal protection protein (RPP) family. RPPs such as tetracycline resistance proteins Tet(M) and Tet(O) mediate tetracycline resistance in both gram-positive and -negative species. Tetracyclines inhibit the accommodation of aminoacyl-tRNA into the ribosomal A site and therefore prevent the addition of new amino acids to the growing polypeptide. RPPs such as Tet(M) confer tetracycline resistance by releasing tetracycline from the ribosome and thereby freeing the ribosome from inhibitory effects of the drug, such that aa-tRNA can bind to the A site and protein synthesis can continue. 115
22347 238842 cd01693 mtEFG2_like_IV mtEF-G2 domain IV. This subfamily is a part of the mitochondrial translational elongation factor, mtEF-G2. Mitochondrial translation is crucial for maintaining mitochondrial function and mutations in this system lead to a breakdown in the respiratory chain-oxidative phosphorylation system and to impaired maintenance of mitochondrial DNA. In complex with GTP, EF-G promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. 120
22348 238843 cd01699 RNA_dep_RNAP RNA_dep_RNAP: RNA-dependent RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage. RdRp catalyzes synthesis of the RNA strand complementary to a given RNA template. RdRps of many viruses are products of processing of polyproteins. Some RdRps consist of one polypeptide chain, and others are complexes of several subunits. The domain organization and the 3D structure of the catalytic center of a wide range of RdRps, including those with a low overall sequence homology, are conserved. The catalytic center is formed by several motifs containing a number of conserved amino acid residues. This subfamily represents the RNA-dependent RNA polymerases from all positive-strand RNA eukaryotic viruses with no DNA stage. 278
22349 176454 cd01700 PolY_Pol_V_umuC umuC subunit of DNA Polymerase V. umuC subunit of Pol V. Pol V is a bacterial translesion synthesis (TLS) polymerase that consists of the heterotrimer of one umuC and two umuD subunits. Translesion synthesis is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Pol V, RecA, single stranded DNA-binding protein, beta sliding clamp, and gamma clamp loading complex are responsible for inducing the SOS response in bacteria to repair UV-induced DNA damage. 344
22350 176455 cd01701 PolY_Rev1 DNA polymerase Rev1. Rev1 is a translesion synthesis (TLS) polymerase found in eukaryotes. Translesion synthesis is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Rev1 has both structural and enzymatic roles. Structurally, it is believed to interact with other nonclassical polymerases and replication machinery to act as a scaffold. Enzymatically, it catalyzes the specific insertion of dCMP opposite abasic sites. Rev1 interacts with the Rev7 subunit of the B-family TLS polymerase Pol zeta (Rev3/Rev7). Rev1 is known to actively promote the introduction of mutations, potentially making it a significant target for cancer treatment. 404
22351 176456 cd01702 PolY_Pol_eta DNA Polymerase eta. Pol eta, also called Rad30A, is a translesion synthesis (TLS) polymerase. Translesion synthesis is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Unlike other Y-family members, Pol eta can efficiently and accurately replicate DNA past UV-induced lesions. Its activity is initiated by two simultaneous interactions: the PIP box in pol eta interacting with PCNA, and the UBZ (ubiquitin-binding zinc finger) in pol eta interacting with monoubiquitin attached to PCNA. Pol eta is more efficient in copying damaged DNA than undamaged DNA and seems to recognize when a lesion has been passed, facilitating a lesion-dependent dissociation from the DNA. 359
22352 176457 cd01703 PolY_Pol_iota DNA Polymerase iota. Pol iota, also called Rad30B, is a translesion synthesis (TLS) polymerase. Translesion synthesis is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Pol iota is thought to be one of the least efficient polymerases, particularly when opposite pyrimidines; it can incorporate the correct nucleotide opposite a purine much more efficiently than opposite a pyrimidine, and prefers to insert guanosine instead of adenosine opposite thymidine. Pol iota is believed to use Hoogsteen rather than Watson-Crick base pairing, which may explain the varying efficiency for different template nucleotides. 379
22353 238844 cd01709 RT_like_1 RT_like_1: A subfamily of reverse transcriptases (RTs). An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs. 346
22354 238845 cd01712 ThiI ThiI is required for thiazole synthesis in the thiamine biosynthesis pathway. It belongs to the Adenosine Nucleotide Hydrolysis superfamily and is predicted to bind adenosine nucleotides. 177
22355 238846 cd01713 PAPS_reductase This domain is found in phosphoadenosine phosphosulphate (PAPS) reductase enzymes or PAPS sulphotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. A highly modified version of the P loop, the fingerprint peptide of mononucleotide-binding proteins, is present in the active site of the protein, which appears to be a positively charged cleft containing a number of conserved arginine and lysine residues. Although PAPS reductase has no ATPase activity, it shows a striking similarity to the structure of the ATP pyrophosphatase (ATP PPase) domain of GMP synthetase, indicating that both enzyme families have evolved from a common ancestral nucleotide-binding fold. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium meliloti which has ATP sulphurylase activity (sulphate adenylate transferase). 173
22356 238847 cd01714 ETF_beta The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit. 202
22357 238848 cd01715 ETF_alpha The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed electrons to ferredoxin. 168
22358 212463 cd01716 Hfq bacterial Hfq-like. Hfq, an abundant, ubiquitous RNA-binding protein, functions as a pleiotropic regulator of RNA metabolism in prokaryotes, required for stabilization of some transcripts and degradation of others. Hfq binds small RNA molecules called riboregulators that modulate the stability or translation efficiency of RNA transcripts. Hfq binds preferentially to unstructured A/U-rich RNA sequences and is similar to the eukaryotic Sm proteins in both sequence and structure. Hfq forms a homo-hexameric ring similar to the heptameric ring of the Sm proteins. 60
22359 212464 cd01717 Sm_B Sm protein B. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 80
22360 212465 cd01718 Sm_E Sm protein E. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit E binds subunits F and G to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. 79
22361 212466 cd01719 Sm_G Sm protein G. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. 70
22362 212467 cd01720 Sm_D2 Sm protein D2. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing B, D3, E, F, and G subunits. 89
22363 212468 cd01721 Sm_D3 Sm protein D3. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. 70
22364 212469 cd01722 Sm_F Sm protein F. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers. 69
22365 212470 cd01723 LSm4 Like-Sm protein 4. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 76
22366 212471 cd01724 Sm_D1 Sm protein D1. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing B, D3, E, F, and G subunits. 92
22367 212472 cd01725 LSm2 Like-Sm protein 2. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 89
22368 212473 cd01726 LSm6 Like-Sm protein 6. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 68
22369 212474 cd01727 LSm8 Like-Sm protein 8. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 91
22370 212475 cd01728 LSm1 Like-Sm protein 1. The eukaryotic LSm proteins (LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. Accumulation of uridylated RNAs in an lsm1 mutant suggests an involvement of the LSm1-7 complex in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm1-7, together with Pat1, are also called the decapping activator. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 74
22371 212476 cd01729 LSm7 Like-Sm protein 7. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 89
22372 212477 cd01730 LSm3 Like-Sm protein 3. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 82
22373 212478 cd01731 archaeal_Sm1 archaeal Sm protein 1. The archaeal Sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal LSm proteins are likely to represent the ancestral Sm domain. 69
22374 212479 cd01732 LSm5 Like-Sm protein 5. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 76
22375 212480 cd01733 LSm10 Like-Sm protein 10. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-membered ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. 78
22376 212481 cd01734 YlxS_C Bacillus subtilis YlxS-like, C-terminal domain. YlxS is a Bacillus subtilis protein of unknown function with two domains that each have an alpha/beta fold. The N-terminal domain is composed of two alpha-helices and a three-stranded beta-sheet, while the C-terminal domain is composed of one alpha-helix and a five-stranded beta-sheet. This CD represents the C-terminal domain which has a fold similar to the Sm fold of proteins like Sm-D3. 72
22377 212482 cd01735 LSm12_N Like-Sm protein 12, N-terminal domain. LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain. 61
22378 212483 cd01736 LSm14_N Like-Sm protein 14, N-terminal domain. LSm14 (also known as RAP55) belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm14 has an uncharacterized C-terminal domain containing a conserved DFDF box. In Xenopus laevis, LSm14 is an oocyte-specific constituent of ribonucleoprotein particles. 74
22379 212484 cd01737 LSm16_N Like-Sm protein 16, N-terminal domain. LSm16 (also known as enhancer of decapping-3 or EDC3) has been shown to be associated with an mRNA-decapping complex Dcp1-Dcp2, required for removal of the 5-prime cap from mRNA prior to its degradation from the 5-prime end. EDC3 is believed to be a scaffold for decapping complex formation. It belongs to a family of Sm-like proteins that associate with RNA to form complexes involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. LSm16 has, in addition to its N-terminal Sm-like domain, a C-terminal YjeF_N-type Rossmann fold domain of unknown function. 65
22380 212485 cd01739 LSm11_M Like-Sm protein 11, middle domain. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2-like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-membered ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. 63
22381 153211 cd01740 GATase1_FGAR_AT Type 1 glutamine amidotransferase (GATase1)-like domain found in Formylglycinamide ribonucleotide amidotransferase. Type 1 glutamine amidotransferase (GATase1)-like domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT). FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. FGAR-AT is a glutamine amidotransferase. Glutamine amidotransferase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. FGAR-AT belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 238
22382 153212 cd01741 GATase1_1 Subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATases) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved. GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 188
22383 153213 cd01742 GATase1_GMP_Synthase Type 1 glutamine amidotransferase (GATase1) domain found in GMP synthetase. Type 1 glutamine amidotransferase (GATase1) domain found in GMP synthetase. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. Glutamine amidotransferase (GATase) activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. GMP synthetase catalyses the amination of the nucleotide precursor xanthosine 5'-monophosphate to form GMP. GMP synthetase belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 181
22384 153214 cd01743 GATase1_Anthranilate_Synthase Type 1 glutamine amidotransferase (GATase1) domain found in Anthranilate synthase. Type 1 glutamine amidotransferase (GATase1) domain found in Anthranilate synthase (ASase). This group contains proteins similar to para-aminobenzoate (PABA) synthase and ASase. These enzymes catalyze similar reactions and produce similar products, PABA and ortho-aminobenzoate (anthranilate). Each enzyme is composed of non-identical subunits: a glutamine amidotransferase subunit (component II) and a subunit that produces an aminobenzoate product (component I). ASase catalyses the synthesis of anthranilate from chorismate and glutamine and is a tetrameric protein comprising two copies each of components I and II. Component II of ASase belongs to the family of triad GATases which hydrolyze glutamine and transfer nascent ammonia between the active sites. In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. PRTase catalyses the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. In E. coli, the first step in the conversion of chorismate to PABA involves two proteins, PabA and PabB, which co-operate to transfer the amide nitrogen of glutamine to chorismate, forming 4-amino-4-deoxychorismate (ADC). PabA acts as a glutamine amidotransferase, supplying an amino group to PabB, which carries out the amination reaction. A third protein, PabC, then mediates elimination of pyruvate and aromatization to give PABA. Several organisms have bipartite proteins containing fused domains homologous to PabA and PabB, commonly called PABA synthases. These hybrid PABA synthases may produce ADC and not PABA. 184
22385 153215 cd01744 GATase1_CPSase Small chain of the glutamine-dependent form of carbamoyl phosphate synthase, CPSase II. This group of sequences represents the small chain of the glutamine-dependent form of carbamoyl phosphate synthase, CPSase II. CPSase II catalyzes the production of carbamoyl phosphate (CP) from bicarbonate, glutamine and two molecules of MgATP. The reaction is believed to proceed by a series of four biochemical reactions involving a minimum of three discrete highly reactive intermediates. The synthesis of CP is critical for the initiation of two separate biosynthetic pathways. In one, CP is coupled to aspartate, its carbon and nitrogen nuclei ultimately incorporated into the aromatic moieties of pyrimidine nucleotides. In the second pathway, CP is condensed with ornithine at the start of the urea cycle and is utilized for the detoxification of ammonia and biosynthesis of arginine. CPSases may be encoded by one or by several genes, depending on the species. The E. coli enzyme is a heterodimer consisting of two polypeptide chains referred to as the small and large subunit. Ammonia, an intermediate during the biosynthesis of carbamoyl phosphate produced by the hydrolysis of glutamine in the small subunit of the enzyme, is delivered via a molecular tunnel to the remotely located carboxyphosphate active site in the large subunit. CPSase IIs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. This group also contains the sequence from the mammalian urea cycle form which has lost the active site Cys, resulting in an ammonia-dependent form, CPSase I. 178
22386 153216 cd01745 GATase1_2 Subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATase) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved. GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 189
22387 153217 cd01746 GATase1_CTP_Synthase Type 1 glutamine amidotransferase (GATase1) domain found in Cytidine Triphosphate Synthetase. Type 1 glutamine amidotransferase (GATase1) domain found in Cytidine Triphosphate Synthetase (CTP). CTP is involved in pyrimidine ribonucleotide/ribonucleoside metabolism. CTPs produce CTP from UTP and glutamine and regulate intracellular CTP levels through interactions with four ribonucleotide triphosphates. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. CTP is derived from UTP in three separate steps involving two active sites. In one active site, the UTP O4 oxygen is activated by Mg-ATP-dependent phosphorylation, followed by displacement of the resulting 4-phosphate moiety by ammonia. At a separate site, ammonia is generated via rate-limiting glutamine hydrolysis (glutaminase) activity. A gated channel that spans between the glutamine hydrolysis and amidoligase active sites provides a path for ammonia diffusion. CTPs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 235
22388 153218 cd01747 GATase1_Glutamyl_Hydrolase Type 1 glutamine amidotransferase (GATase1) domain found in gamma-Glutamyl Hydrolase. Type 1 glutamine amidotransferase (GATase1) domain found in gamma-Glutamyl Hydrolase. gamma-Glutamyl Hydrolase catalyzes the cleavage of the gamma-glutamyl chain of folylpoly-gamma-glutamyl substrates and is a central enzyme in folyl and antifolyl poly-gamma-glutamate metabolism. GATase activity involves the removal of the ammonia group from a glutamate molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. gamma-Glutamyl hydrolases belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 273
22389 153219 cd01748 GATase1_IGP_Synthase Type 1 glutamine amidotransferase (GATase1) domain found in imidazole glycerol phosphate synthase (IGPS). Type 1 glutamine amidotransferase (GATase1) domain found in imidazole glycerol phosphate synthase (IGPS). IGPS incorporates ammonia derived from glutamine into N1-[(5'-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to form 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR) and imidazole glycerol phosphate (IGP). The glutamine amidotransferase domain generates the ammonia nucleophile which is channeled from the glutaminase active site to the PRFAR active site. IGPS belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 198
22390 153220 cd01749 GATase1_PB Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis. Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis. Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamate molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This group contains proteins like Bacillus subtilis YaaE and Plasmodium falciparum Pdx2 which are members of the triad glutamine amidotransferase family and function in a pathway for the biosynthesis of vitamin B6. 183
22391 153221 cd01750 GATase1_CobQ Type 1 glutamine amidotransferase (GATase1) domain found in Cobyric Acid Synthase (CobQ). Type 1 glutamine amidotransferase (GATase1) domain found in Cobyric Acid Synthase (CobQ). CobQ plays a role in cobalamin biosynthesis. CobQ catalyses amidations at positions B, D, E, and G on adenosylcobyrinic A,C-diamide in the biosynthesis of cobalamin. CobQ belongs to the triad family of amidotransferases. Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobQ. 194
22392 238849 cd01751 PLAT_LH2 PLAT/ LH2 domain of plant lipoxygenase related proteins. Lipoxygenases are nonheme, nonsulfur iron dioxygenases that act on lipid substrates containing one or more (Z,Z)-1,4-pentadiene moieties. In plants, the immediate products are involved in defense mechanisms against pathogens and may be precursors of metabolic regulators. The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins. 137
22393 238850 cd01752 PLAT_polycystin PLAT/LH2 domain of polycystin-1 like proteins. Polycystins are a large family of membrane proteins composed of multiple domains, present in fish, invertebrates, mammals, and humans that are widely expressed in various cell types and whose biological functions remain poorly defined. In human, mutations in polycystin-1 (PKD1) and polycystin-2 (PKD2) have been shown to be the cause for autosomal dominant polycystic kidney disease (ADPKD). The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins. 120
22394 238851 cd01753 PLAT_LOX PLAT domain of 12/15-lipoxygenase. As a unique subfamily of the mammalian lipoxygenases, they catalyze enzymatic lipid peroxidation in complex biological structures via direct dioxygenation of phospholipids and cholesterol esters of biomembranes and plasma lipoproteins. Both types of enzymes are cytosolic but need this domain to access their sequestered membrane or micelle bound substrates. 113
22395 238852 cd01754 PLAT_plant_stress PLAT/LH2 domain of plant-specific single domain protein family with unknown function. Many of its members are stress induced. In general, PLAT/LH2 consists of an eight-stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane-bound proteins. 129
22396 238853 cd01755 PLAT_lipase PLAT/ LH2 domain present in connection with a lipase domain. This family contains two major subgroups, the lipoprotein lipase (LPL) and the pancreatic triglyceride lipase. LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs). The central role of triglyceride lipases is in energy production. In general, PLAT/LH2 domain's proposed function is to mediate interaction with lipids or membrane bound proteins. 120
22397 238854 cd01756 PLAT_repeat PLAT/LH2 domain repeats of family of proteins with unknown function. In general, PLAT/LH2 consists of an eight-stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane-bound proteins. 120
22398 238855 cd01757 PLAT_RAB6IP1 PLAT/LH2 domain present in RAB6 interacting protein 1 (Rab6IP1)_like family. PLAT/LH2 domains consist of an eight-stranded beta-barrel. In Rab6IP1 this domain may participate in lipid-mediated modulation of Rab6IP1's function via its generally proposed function of mediating interaction with lipids or membrane-bound proteins. 114
22399 238856 cd01758 PLAT_LPL PLAT/ LH2 domain present in lipoprotein lipase (LPL). LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs) and therefore has a profound influence on triglyceride and high-density lipoprotein (HDL) cholesterol levels in the blood. In general, PLAT/LH2 domain's proposed function is to mediate interaction with lipids or membrane-bound proteins. 137
22400 238857 cd01759 PLAT_PL PLAT/LH2 domain of pancreatic triglyceride lipase. Lipases hydrolyze phospholipids and triglycerides to generate fatty acids for energy production or for storage and to release inositol phosphates that act as second messengers. The central role of triglyceride lipases is in energy production. The proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins. 113
22401 340461 cd01760 RBD Ras-binding domain (RBD), structurally similar to a beta-grasp ubiquitin-like fold. The RBD of the serine/threonine kinase Raf is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. A Raf-like RBD is also present in Regulator of G protein Signaling (RGS12 and RGS14) members of GTPase activating proteins. 71
22402 340462 cd01763 Ubl_SUMO_like ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier (SUMO) and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins, and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. The mammalian SUMOs have four paralogs, SUMO1 through SUMO4, which all regulate different cellular functions by conjugating to different proteins. SUMO2-4 are more closely related to each other than to SUMO1. 72
22403 340463 cd01764 Ubl_Urm1 ubiquitin-like (Ubl) domain found in ubiquitin-related modifier 1 (Urm1). Urm1 acts as a sulfur carrier in the thiolation of eukaryotic tRNA via a mechanism that requires the formation of a thiocarboxylated Urm1, which is similar to that of prokaryotic sulfur carrier proteins such as ThiS and MoaD, containing the beta-grasp ubiquitin-like (Ubl) fold. Urm1 can be covalently conjugated to lysine residues of other proteins through a mechanism involving the E1-like protein Uba4. Urm1 is involved in yeast bioprocesses such as budding, nutrient sensing, high temperature sensitivity, antioxidant stress response and post-translation modification of the elongator subunit. 94
22404 340464 cd01765 FERM_F0_F1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain and F1 sub-domain, found in FERM (Four.1/Ezrin/Radixin/Moesin) family proteins. FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain is present at the N-terminus of a large and diverse group of proteins that mediate linkage of the cytoskeleton to the plasma membrane. FERM-containing proteins are ubiquitous components of the cytocortex and are involved in cell transport, cell structure and signaling functions. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N), which is structurally similar to ubiquitin. 80
22405 340465 cd01766 Ubl_UFM1 ubiquitin-like (Ubl) domain found in ubiquitin fold modifier 1 (UFM1). UFM1 belongs to the ubiquitin-like protein family with similar ubiquitin beta-grasp folds and mechanism of ligation to other proteins. UFM1 is present in nearly all eukaryotic organisms except fungi. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The UFM1 cascade has been implicated in endoplasmic reticulum functions, cell cycle control and cell differentiation. The involvement of the UFM1 cascade in diseases is diverse; reports include its involvement in ischemic heart diseases, diabetes, gastric lesions, schizophrenia, hip dysplasia and cancer. 75
22406 340466 cd01767 UBX Ubiquitin regulatory domain X (UBX) structurally similar to a beta-grasp ubiquitin-like fold. The UBXD family of proteins contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. Members in this family function as cofactors of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. Based on domain composition, UBXD proteins can be divided into two main groups, with and without ubiquitin-associated (UBA) domain. 74
22407 340467 cd01768 RA_FERM_F0_F1_like Ras-associating (RA) domain, FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0/F1 sub-domain, structurally similar to a beta-grasp ubiquitin-like fold. RA domain-containing proteins function by interacting, directly or indirectly, with Ras proteins and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras protein is a small GTPase that is involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA-containing proteins include RalGDS, AF6, RIN, RASSF1, SNX27, CYR1, STE50, and phospholipase C epsilon. The FERM domain is present at the N-terminus of a large and diverse group of proteins that mediate linkage of the cytoskeleton to the plasma membrane. FERM-containing proteins are ubiquitous components of the cytocortex and are involved in cell transport, cell structure and signaling. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, also known as the N-terminal Ubl-like structural domain of the FERM domain (FERM_N), which is structurally similar to Ub. Some FERM domain-containing proteins contain an N-terminal region, which also has the beta-grasp Ub-like fold, precedes the FERM domain and has been referred to as the F0 domain. 110
22408 340468 cd01770 UBX_UBXN2 Ubiquitin regulatory domain X (UBX) found in UBX domain-containing proteins UBXN2A, UBXN2B, NSFL1C/UBXN2C, and similar proteins. This family includes UBX domain-containing proteins UBXN2A, UBXN2B, and NSFL1C/UBXN2C, which contain a SEP (Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47) domain, and a ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold at the C-terminus. UBX domain participates broadly in the regulation of protein degradation. UBXN2A, UBXN2B, and UBXN2C function as the adaptor proteins of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. 71
22409 340469 cd01771 UBX_UBXN3A Ubiquitin regulatory domain X (UBX) found in FAS associated factor 1 (FAF1, also known as UBXN3A) and similar proteins. UBX domain-containing protein 3A (UBXN3A), also termed UBX domain-containing protein 12 (UBXD12), or FAF1, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem ubiquitin-like (Ubl) domains, which show high structural similarity with UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. This family corresponds to UBX domain. 80
22410 340470 cd01772 UBX_UBXN1 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 1 (UBXN1) and similar proteins. UBXN1, also termed SAPK substrate protein 1 (SAKS1), UBA/UBX 33.3 kDa protein (Y33K), or UBXD10, is a widely expressed protein containing an N-terminal ubiquitin-associated (UBA) domain, a coiled-coil region, and a C-terminal ubiquitin-like (Ubl or UBX) domain that has a beta-grasp ubiquitin-like fold without the C-terminal double glycine motif. UBXN1 has been identified as a substrate for stress-activated protein kinases (SAPKs). It binds polyubiquitin and valosin-containing protein (VCP), suggesting a role as an adaptor that directs VCP to polyubiquitinated proteins facilitating its destruction by the proteasome. In addition, UBXN1 specifically binds to Homer2b. It may also interact with ubiquitin (Ub) and be involved in the Ub-proteasome proteolytic pathways. UBXN1 can also associate with autoubiquitinated BRCA1 tumor suppressor and inhibit its enzymatic function through its UBA domains. 81
22411 340471 cd01773 UBX_UBXN7 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 7 (UBXN7) and similar proteins. UBXN7, also termed UBX domain-containing protein 7 (UBXD7), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN7 functions as a ubiquitin-binding adaptor that mediates the interaction between the AAA+ ATPase p97 (also known as VCP or Cdc48) and the transcription factor HIF1-alpha. It binds only to the active, NEDD8- or Rub1-modified form of cullins. In addition to having a UBX domain, UBXD7 contains a ubiquitin-associated (UBA), ubiquitin-associating (UAS), and ubiquitin-interacting motif (UIM) domains. Either UBA or UIM could serve as a docking site for neddylated-cullins. UBA domain is required for binding ubiquitylated-protein substrates, while the UIM motif is responsible for the binding to cullin RING ligases (CRLs), and the UBX domain is essential for p97 binding. 76
22412 340472 cd01774 UBX_UBXN8 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 8 (UBXN8) and similar proteins. UBXN8, also termed reproduction 8 protein (Rep8), or UBX domain-containing protein 6 (UBXD6), or D8S2298E, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN8 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN8 is a transmembrane protein that localizes to the endoplasmic reticulum (ER) membrane with its UBX domain facing the cytoplasm. It facilitates efficient ER-associated degradation (ERAD) by tethering p97 to the ER membrane. 76
22413 340473 cd01775 RA_PHLPP_like Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatases, fungal adenylate cyclase, and similar proteins. PHLPP represents a novel family of Ser/Thr protein phosphatases, which is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP contains a putative Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain. Fungal adenylate cyclase regulates developmental processes such as hyphal growth, biofilm formation, and phenotypic switching. It plays an essential role in regulation of cellular metabolism by catalyzing the synthesis of a second messenger, cAMP. Fungal adenylate cyclase has at least four domains, including an N-terminal adenylate cyclase G-alpha binding domain, a Ras-associating (RA) domain, a middle leucine-rich repeat region, and a catalytic domain. The RA domain of adenylate cyclase post-translationally modifies a small GTPase called Ras, which is involved in cellular signal transduction. The activity of adenylate cyclase is stimulated directly by regulatory proteins (Ras1 and Gpa2), peptidoglycan fragments and carbon dioxide. 99
22414 340474 cd01776 RA_Rin Ras-associating (RA) domain of Ras and Rab interactor (Rin) protein family. Family of Ras-interaction/interference (Rin) proteins, also known as Ras and Rab interactors, is composed of Rin1, Rin2, and Rin3, which have multifunctional domains, including SH2 and proline-rich domains in the N-terminal region, and RH, VPS9, and RA domains in the C-terminal region. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domains of Rin1, Rin2, and Rin3 are well conserved and they all have Ras binding characteristics. 90
22415 340475 cd01777 FERM_F1_SNX27 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 27 (SNX27). SNX27 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. In addition to a PX (Phox homology) domain that regulates its endosomal localization, SNX27 has a unique PDZ (Psd-95/Dlg/ZO1) domain and an atypical FERM (4.1, ezrin, radixin, moesin) domain that both function to bind short peptide sequence motifs in the cytoplasmic domains of the cargo receptors. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 92
22416 340476 cd01778 RA_RASSF1_like Ras-associating (RA) domain found in Ras-association domain family members, RASSF1, RASSF3, and RASSF5. The RASSF family of proteins shares a conserved RalGDS/AF6 Ras association (RA) domain which is located either at the C-terminus (RASSF1-6, the classical group) or at the N-terminus (RASSF7-10). RASSF1-6 contains a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that functions in scaffolding and regulatory interactions. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Classical RASSF members interact either directly or indirectly with activated Ras. Ras proteins are small GTPases that are involved in cellular signal transduction. The classical RASSF proteins seem to modulate some of the growth inhibitory responses mediated by Ras and may serve as tumor suppressor genes. This family contains RASSF1, RASSF3, and RASSF5. 130
22417 340477 cd01779 RA_Myosin-IX Ras-associating (RA) domain found in Myosin-IX. Myosins IX (Myo9) is a class of unique motor proteins with a common structure of an N-terminal extension preceding a myosin head homologous to the Ras-association (RA) domain, a head (motor) domain, a neck with IQ motifs that bind light chains and a C-terminal tail containing a Rho-GTPase activating protein (RhoGAP) domain. The RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. There are two genes for myosins IX in humans, IXa and IXb, that are different in their expression and localization. IXa is expressed abundantly in brain and testis and IXb is expressed abundantly in tissues of the immune system. 97
22418 340478 cd01780 RA2_PLC-epsilon Ras-associating (RA) domain 2 found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Although PLC RA1 and RA2 have homologous ubiquitin-like folds only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. This family corresponds to the second RA domain of PLC-epsilon. 102
22419 340479 cd01781 RA2_Afadin Ras-associating (RA) domain 2 found in Afadin. Afadin, also termed ALL1-fused gene from chromosome 6 protein (AF-6), or canoe, is involved in many fundamental signaling cascades in cells. In addition, it is involved in oncogenesis and metastasis. Afadin has multiple domains: from the N-terminus to the C-terminus it has two Ras-associated (RA) domains, a forkhead-associated domain, a dilute domain, a PDZ domain, three proline-rich domains, and an F-actin binding domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Afadin is abundant at cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. This family corresponds to the second RA domain of afadin. 102
22420 340480 cd01782 RA1_Afadin Ras-associating (RA) domain 1 found in Afadin. Afadin, also termed ALL1-fused gene from chromosome 6 protein (AF-6), or canoe, is involved in many fundamental signaling cascades in cells. In addition, it is involved in oncogenesis and metastasis. Afadin has multiple domains: from the N-terminus to the C-terminus it has two Ras-associated (RA) domains, a forkhead-associated domain, a dilute domain, a PDZ domain, three proline-rich domains, and an F-actin-binding domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Afadin is abundant at cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. This family corresponds to the first RA domain of afadin, which mediates its self-association. 112
22421 340481 cd01783 RA2_DAGK-theta Ras-associating (RA) domain 2 found in diacylglycerol kinase theta (DAGK-theta) and similar proteins. DAGK phosphorylates the second messenger diacylglycerol to phosphatidic acid as part of a protein kinase C pathway. DAGK-theta is characterized as a type V DAGK that has three cysteine-rich domains (all other isoforms have two), a proline/glycine-rich domain at its N-terminus, and a proposed Ras-associating (RA) domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Ten mammalian isoforms of DAGK have been identified to date; these are organized into five categories based on their domain architecture. DAGK-theta also contains a pleckstrin homology (PH) domain. The subcellular localization and the activity of DAGK-theta are regulated in a complex (stimulation- and cell type-dependent) manner. This family corresponds to the second RA domain of DAGK-theta. 95
22422 340482 cd01784 RA_RASSF2_like Ras-associating (RA) domain found in Ras-association domain family members, RASSF2, RASSF4, and RASSF6. The RASSF family of proteins shares a conserved RalGDS/AF6 RA domain either in the C-terminus (RASSF1-6) or N-terminus (RASSF7-10). The classical family members (RASSF1-6) contain a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that functions as scaffolding and regulatory interactions. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Classical RASSF members interact either directly or indirectly with activated Ras. Ras proteins are small GTPases that are involved in cellular signal transduction. The classical RASSF protein family seem to modulate some of the growth inhibitory responses mediated by Ras and may serve as tumor suppressor genes. This family contains RASSF2, RASSF4, and RASSF6. 87
22423 340483 cd01785 RA_PDZ-GEF1 Ras-associating (RA) domain found in PDZ domain-containing guanine nucleotide exchange factor 1 (PDZ-GEF1) and similar proteins. PDZ-GEF1, also termed Rap guanine nucleotide exchange factor 2, or cyclic nucleotide ras GEF (CNrasGEF), or neural RAP guanine nucleotide exchange protein (nRap GEP), or Ras/Rap1-associating GEF-1 (RA-GEF-1), is a Rap-specific guanine nucleotide exchange factor (GEF) that has a PSD-95/DlgA/ZO-1 (PDZ) domain, a RA domain and a region related to a cyclic nucleotide binding domain (RCBD). The RA domain of PDZ-GEF interacts with Rap1 and also contributes to the membrane localization of PDZ-GEF. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 85
22424 340484 cd01786 RA_STE50 Ras-associating (RA) domain found in the fungal adaptor protein STE50. The fungal adaptor protein STE50 is an essential component of three MAPK-mediated signaling pathways that control the mating response, invasive/filamentous growth and osmotolerance (HOG pathway), respectively. STE50 functions in cell signaling between the activated G protein and STE11. The domain architecture of STE50 includes an amino-terminal SAM (sterile alpha motif) domain in addition to the carboxy-terminal ubiquitin-like RA (RAS-associated) domain. RA domain of STE50 interacts with the small GTPase Cdc42p, a member of Rho type of the Ras superfamily. This interaction activates the Ste11p/Ste7p/Kss1p MAP kinase cascade that controls filamentous growth. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 101
22425 340485 cd01787 RA_MRL Ras-associating (RA) domain of Mig10/RIAM/Lpd (MRL) family. MRL proteins share a common structural architecture, including a central structural unit consisting of a Ras-associating (RA) domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. MRL proteins have distinct functions in cell migration and adhesion, signaling, and in cell growth. 85
22426 340486 cd01788 Ubl_ElonginB ubiquitin-like (Ubl) domain found in transcription elongation factor B (Elongin B) and similar proteins. Elongin B, also termed Elongin 18 kDa subunit, or EloB, or RNA polymerase II transcription factor SIII subunit B (SIII p18), is part of an E3 ubiquitin ligase complex called VEC that activates ubiquitination by the E2 ubiquitin-conjugating enzyme Ubc5. VEC is composed of von Hippel-Lindau tumor suppressor protein (pVHL), elongin C, cullin 2, NEDD8, and Rbx1. ElonginB binds elonginC to form the elonginBC complex which is a positive regulator of RNA polymerase II elongation factor Elongin A. The BC complex then binds VHL (von Hippel-Lindau) tumor suppressor protein to form a VCB ternary complex. Elongin B has a ubiquitin-like (Ubl) domain. Ub has a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. 101
22427 340487 cd01789 Ubl_TBCB ubiquitin-like (Ubl) domain found in tubulin-folding cofactor B (TBCB) and similar proteins. TBCB, also termed cytoskeleton-associated protein 1, or cytoskeleton-associated protein CKAPI, or tubulin-specific chaperone B, is one of protein cofactors A through E that is required for the folding of tubulins prior to their incorporation into microtubules and heterodimer assembly. TBCB comprises an N-terminal ubiquitin-like (Ubl) domain and a C-terminal cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain. The Ubl domain of TBCB is essential for proper folding and assembly of tubulin alpha. It has a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. TBC-A through E are necessary for the biogenesis of microtubules and for cell viability. 80
22428 340488 cd01790 Ubl_HERP ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP. HERP is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. The Ubl domain is required for the degradation of HERP itself as well as for HERP-mediated anti-apoptotic effects. HERP is induced by the ER stress response pathway and is involved in improving the balance of folding capacity and protein loads in the ER. There are two types of HERP, HERP1 and HERP2, which are encoded by the HERPUD1 and HERPUD2 genes, respectively. 78
22429 340489 cd01791 Ubl_UBL5 ubiquitin-like (Ubl) domain found in ubiquitin-like protein 5 (UBL5) and similar proteins. UBL5, known as Hub1 in yeast, is an atypical ubiquitin-like (Ubl) post-translational modifier that contains a conserved Ubl domain with a beta-grasp Ubl fold. At the C-terminal end of its Ubl fold is a di-tyrosine motif followed by a single variable residue instead of the characteristic di-glycine found in all other Ubl modifiers, and thus UBL5 does not form covalent conjugates with cellular proteins. The yeast Hub1p binds non-covalently to the HIND element of spliceosomal protein Snu66p (Snu66p is termed SART1 in mammals) and modifies the spliceosome by this unconventional Ubl modifier. In higher eukaryotes, UBL5/Hub1 plays a role in modulating pre-mRNA splicing. It also is required for signaling in the mitochondrial unfolded protein response, through interaction with the transcription factor DVE-1 and upregulation of chaperone genes in response to mitochondrial stress. Moreover, UBL5 functions as a factor that directly binds to and stabilizes FANCI, and promotes the functionality of the Fanconi anemia (FA) DNA repair pathway. 71
22430 340490 cd01792 Ubl1_ISG15 ubiquitin-like (Ubl) domain 1 found in interferon-stimulated gene 15 (ISG15) and similar proteins. ISG15, also termed interferon-induced 15 kDa protein, or interferon-induced 17 kDa protein (IP17), or ubiquitin cross-reactive protein (UCRP), is an antiviral interferon-induced ubiquitin-like protein (Ubl) that upon viral infection, modifies cellular and viral proteins by mechanisms similar to ubiquitination. Although ISG15 has properties similar to those of other Ubl molecules, it is a unique member of the Ubl superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. ISG15 contains two tandem Ubl domains with a beta-grasp Ubl fold. This family corresponds to the first Ubl domain. 75
22431 340491 cd01793 Ubl_FUBI ubiquitin-like (Ubl) domain found in ubiquitin-like protein FUBI and similar proteins. FUBI is a pro-apoptotic regulatory gene FAU encoding ubiquitin-like protein with ribosomal protein S30 as a C-terminal extension. FUBI functions as a tumor suppressor protein that may be involved in the ATP-dependent proteolytic activity of ubiquitin. The N-terminal ubiquitin-like (Ubl) domain of FUBI has the beta-grasp Ubl fold, and it may act as a substitute or an inhibitor of ubiquitin or one of ubiquitin's close relatives UCRP, FAT10, and Nedd8. 74
22432 340492 cd01794 Ubl_UBTD ubiquitin-like (Ubl) domain found in ubiquitin domain-containing proteins UBTD1, UBTD2, and similar proteins. This family represents a group of ubiquitin-like (Ubl) domain-containing proteins evolutionarily conserved and found in metazoa, fungi, and plants. They may regulate the activity and/or specificity of E2 ubiquitin conjugating enzymes belonging to the UBE2D family. Members in this family contain an N-terminal ubiquitin binding domain (UBD) and a C-terminal Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 69
22433 340493 cd01795 Ubl_USP48 ubiquitin-like (Ubl) domain found in ubiquitin-specific-processing protease 48 (USP48) and similar proteins. USP48, also termed USP31, or deubiquitinating enzyme 48, or ubiquitin thioesterase 48, or ubiquitin carboxyl-terminal hydrolase 48, belongs to the ubiquitin specific protease (USP) family that is one of at least seven deubiquitylating enzyme (DUB) families capable of deconjugating ubiquitin (Ub) and ubiquitin-like (Ubl) adducts. While the USP proteins have a conserved catalytic core domain, USP48 differs in its domain architecture. It contains an N-terminal USP domain, three DUSP (domain present in ubiquitin-specific protease) domains, and a C-terminal Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. USP48 is a deubiquitinating enzyme that interacts with TNF receptor-associated factor 2 (TRAF2) and has been implicated in activation of nuclear factor-kappaB (NF-kappaB). Moreover, as a nuclear deubiquitinase regulated by casein kinase 2 (CK2), USP48 controls the ubiquitin/proteasome-system (UPS)-dependent turnover of activated NF-kappaB/RelA in the nucleus together with the COP9 signalosome, suggesting a role of USP48 in a timely control of immune responses. 99
22434 340494 cd01796 Ubl_Ddi1_like ubiquitin-like (Ubl) domain found in the eukaryotic Ddi1 family. The eukaryotic Ddi1 family, including yeast aspartyl protease DNA-damage inducible 1 (Ddi1) and Ddi1-like proteins from vertebrates and other eukaryotes, has been characterized by containing an N-terminal ubiquitin-like (Ubl) domain and a conserved retroviral aspartyl-protease-like domain (RVP) that is important in cell-cycle control. Yeast Ddi1 and many family members also contain a C-terminal ubiquitin-association (UBA) domain, however, Ddi1-like proteins from all vertebrates lack the UBA domain. Ddi1, also termed v-SNARE-master 1 (Vsm1), is an ubiquitin receptor involved in the cell cycle and late secretory pathway in Saccharomyces cerevisiae. It functions as an UBA-Ubl shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. For instance, Ddi1 plays an essential role in the final stages of proteasomal degradation of Ho endonuclease and of its cognate FBP, Ufo1. Moreover, Ddi1 and its associated protein Rad23p play a cooperative role as negative regulators in yeast PHO pathway. This family also includes mammalian regulatory solute carrier protein family 1 member 1 (RSC1A1), also termed transporter regulator RS1 (RS1), which mediates transcriptional and post-transcriptional regulation of Na(+)-D-glucose cotransporter SGLT1. Ddi1-like proteins play a significant role in cell cycle control, growth control, and trafficking in yeast and may play a crucial role in embryogenesis in higher eukaryotes. 73
22435 340495 cd01797 Ubl_UHRF ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing proteins, UHRF1 and UHRF2, and similar proteins. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma, gastric cancer, esophageal squamous cell carcinoma, colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a set- and ring-associated (SRA) domain, and a C-terminal RING finger. 74
22436 340496 cd01798 Ubl_parkin ubiquitin-like (Ubl) domain found in parkin and similar proteins. Parkin, also termed Parkinson juvenile disease protein 2, is an RBR-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins, such as BCL2, SYT11, CCNE1, GPR37, RHOT1/MIRO1, MFN1, MFN2, STUB1, SNCAIP, SEPT5, TOMM20, USP30, ZNF746 and AIMP2. It mediates monoubiquitination as well as Lys-6-, Lys-11-, Lys-48- and Lys-63-linked polyubiquitination of substrates depending on the context. Parkin may enhance cell viability and protect dopaminergic neurons from oxidative stress-mediated death by regulating mitochondrial function. It also limits the production of reactive oxygen species (ROS), and regulates cyclin-E during neuronal apoptosis. Moreover, parkin displays a ubiquitin ligase-independent function in transcriptional repression of p53. Parkin contains an N-terminal ubiquitin-like (Ubl) domain and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. 74
22437 340497 cd01799 Ubl_HOIL1 ubiquitin-like (Ubl) domain found in heme-oxidized IRP2 ubiquitin ligase 1 (HOIL-1) and similar proteins. HOIL-1, also termed RBCK1, or HOIL-1L, or RanBP-type and C3HC4-type zinc finger-containing protein 1, HBV-associated factor 4, or Hepatitis B virus X-associated protein 4, or RING finger protein 54 (RNF54), or ubiquitin-conjugating enzyme 7-interacting protein 3, or UbcM4-interacting protein 28 (UIP28), together with E3 ubiquitin-protein ligase RNF31 (also known as HOIP) and SHANK-associated RH domain interacting protein (SHARPIN), forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis through conjugation of linear polyubiquitin chains to NF-kappaB essential modulator (also known as NEMO or IKBKG). HOIL-1 plays a crucial role in TNF-alpha-mediated NF-kappaB activation. It also functions as an ubiquitin-protein ligase E3 that interacts with not only PKCbeta but also PKCzeta. It can recognize heme-oxidized IRP2 (iron regulatory protein 2) and is thought to affect the turnover of oxidatively damaged proteins. HOIL-1 contains an N-terminal ubiquitin-like (UBL) domain and an Npl4 zinc-finger (NZF) domain, which regulate the interaction with the LUBAC subunit RNF31 and ubiquitin, respectively. The NZF domain belongs to RanBP2-type zinc finger (zf-RanBP2) domain superfamily. In addition, HOIL-1 has an RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. 81
22438 340498 cd01800 Ubl_SF3a120 ubiquitin-like (Ubl) domain found in splicing factor 3A 120kDa subunit (SF3a120) and similar proteins. Mammalian splicing factor SF3a consists of three subunits of 60, 66, and 120 kDa and functions early during pre-mRNA splicing by converting the U2 snRNP to its active form. The 120kDa subunit SF3a120, also termed splicing factor 3A subunit 1 (SF3A1), or spliceosome-associated protein 114 (SAP114), is the U2 snRNP-specific protein that is critical for spliceosome assembly and normal splicing events. During splicing, SF3a120, together with the U2 snRNP and other proteins, are recruited to the 3' splicing site to generate the splicing complex A after the recognition of the 3' splicing site. SF3a120 contains two N-terminal SWAP (suppressor-of-white-apricot) domains, referred to collectively as the SURP module, as well as a C-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 84
22439 340499 cd01801 Ubl_TECR_like ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase (TECR) and similar proteins. This family includes TECR and many TECR-like proteins, such as TECRL. TECR, also termed very-long-chain enoyl-CoA reductase, or synaptic glycoprotein SC2, or TER, or GPSN2, is a synaptic glycoprotein that catalyzes the fourth reaction in the synthesis of very long-chain fatty acids (VLCFA) which is the reduction step of the microsomal fatty acyl-elongation process. Diseases involving perturbations to normal synthesis and degradation of VLCFA (e.g. adrenoleukodystrophy and Zellweger syndrome) have significant neurological consequences. The mammalian TECR P182L mutation causes nonsyndromic mental retardation. Deletion of the yeast TECR (TSC13) homolog is lethal. TECR contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain. TECRL, also termed steroid 5-alpha-reductase 2-like 2 protein (SRD5A2L2), is associated with life-threatening inherited arrhythmias displaying features of both long QT syndrome (LQTS) and catecholaminergic polymorphic ventricular tachycardia (CPVT). Both TECR and TECRL contain an N-terminal Ubl domain with a beta-grasp Ubl fold, and a C-terminal catalytic domain. 77
22440 340500 cd01802 Ubl_ZFAND4 ubiquitin-like (Ubl) domain found in AN1-type zinc finger protein 4 (ZFAND4) and similar proteins. ZFAND4, also termed AN1-type zinc finger and ubiquitin domain-containing protein-like 1 (ANUBL1), may function as an oncogene that promotes proliferation and regulates relevant tumor suppressor genes in gastric cancer, suggesting a role in gastric cancer initiation and progression. ZFAND4 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal AN1-type zinc finger. Unlike ubiquitin polyproteins and most ubiquitin fusion proteins, the N-terminal Ubl domain of ZFAND4 does not undergo proteolytic processing. 74
22441 340501 cd01803 Ubl_ubiquitin ubiquitin-like (Ubl) domain found in ubiquitin. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. Ubiquitin-like (Ubl) proteins have similar ubiquitin beta-grasp fold and attach to other proteins in a Ubl manner but with biochemically distinct roles. Ubiquitin (Ub) and Ubl proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Ub includes Ubq/RPL40e and Ubq/RPS27a fusions as well as homopolymeric multiubiquitin protein chains. 76
22442 340502 cd01804 Ubl_midnolin ubiquitin-like (Ubl) domain found in midnolin and similar proteins. Midnolin, also termed midbrain nucleolar protein, is a nucleolar protein that may be involved in regulation of genes related to neurogenesis in the nucleolus. It is strongly expressed at the mesencephalon (midbrain) of the embryo in day 12.5 (E12.5) mice and its expression is developmentally regulated. Midnolin plays a role in cellular signaling of adult tissues and regulates glucokinase enzyme activity in pancreatic beta cells. It can also control development via regulation of mRNA transport in cells. Midnolin contains an N-terminal conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 70
22443 340503 cd01805 Ubl_Rad23 ubiquitin-like (Ubl) domain found in the Rad23 protein family. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry an ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. The Ubl domain is responsible for the binding to proteasome. The UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates, which suggests Rad23 proteins might be involved in certain pathways of Ub metabolism. Both the Ubl domain and the XPC-binding domain are necessary for efficient NER function of Rad23 proteins. 72
22444 340504 cd01806 Ubl_NEDD8 ubiquitin-like (Ubl) domain found in neural precursor cell expressed developmentally down-regulated protein 8 (NEDD8) and similar proteins. NEDD8, also termed Neddylin, or RELATED TO UBIQUITIN (RUB/Rub1p) in plants and yeast, is a ubiquitin-like protein that conjugates to nuclear proteins in a manner analogous to ubiquitination and sentrinization. It modifies a family of molecular scaffold proteins called cullins that are responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. NEDD8 deamidation and its inhibition of Cullin-RING ubiquitin ligases (CRLs) activity are responsible for Cycle-inhibiting factor (Cif)/Cif homolog in Burkholderia pseudomallei (CHBP)-induced cytopathic effect. NEDD8 contains a single conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Polyubiquitination signals for a diverse set of cellular events via different isopeptide linkages formed between the C terminus of one ubiquitin (Ub) and the epsilon-amine of K6, K11, K27, K29, K33, K48, or K63 of a second Ub. The Ubl NEDD8 contains many of the same lysines (K6, K11, K27, K33, K48) as Ub, where K27 has a role (other than conjugation) in the mechanism of protein neddylation. 74
22445 340505 cd01807 Ubl_UBL4A_like ubiquitin-like (Ubl) domain found in ubiquitin-like proteins UBL4A and similar proteins. UBL4A, also termed GdX, is a ubiquitously expressed ubiquitin-like (Ubl) protein that forms a complex with partner proteins and participates in the protein processing through endoplasmic reticulum (ER), acting as a chaperone. As a key component of the BCL2-associated athanogene 6 (BAG6) chaperone complex, UBL4A plays a role in mediating DNA damage signaling and cell death. UBL4A also regulates insulin-induced Akt plasma membrane translocation through promotion of Arp2/3-dependent actin branching. Moreover, UBL4A specifically stabilizes the TC45/STAT3 association and promotes dephosphorylation of STAT3 to repress tumorigenesis. UBL4B is testis-specific, and encoded by an X-derived retrogene Ubl4b, which is specifically expressed in post-meiotic germ cells in mammals. As a germ cell-specific cytoplasmic protein, UBL4B is not present in somatic cells. Moreover, UBL4B is present in elongated spermatids, but not in spermatocytes and round spermatids, suggesting its function is restricted to late spermiogenesis. The function of UBL4A may be compensated by either UBL4B or other Ubl proteins in normal conditions. Both UBL4A and UBL4B contain a conserved Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 72
22446 340506 cd01808 Ubl_PLICs ubiquitin-like (Ubl) domain found in eukaryotic protein linking integrin-associated protein (IAP, also known as CD47) with cytoskeleton (PLIC) proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like (Ubl) Dsk2 protein, PLIC-1 (also termed ubiquilin-1), PLIC-2 (also termed ubiquilin-2, or Chap1), PLIC-3 (also termed ubiquilin-3) and PLIC-4 (also termed ubiquilin-4, ataxin-1 interacting ubiquitin-like protein, A1Up, connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin (Ub)-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. PLIC-1 regulates the function of the thrombospondin receptor CD47 and G protein signaling. It plays a role in TLR4-mediated signaling through interacting with the Toll/interleukin-1 receptor (TIR) domain of TLR4. It also inhibits the TLR3-Trif antiviral pathway by reducing the abundance of Trif. Moreover, PLIC-1 binds to gamma-aminobutyric acid receptors (GABAARs) and modulates the Ub-dependent, proteasomal degradation of GABAARs. Furthermore, PLIC-1 acts as a molecular chaperone regulating amyloid precursor protein (APP) biosynthesis, trafficking, and degradation by stimulating K63-linked polyubiquitination of lysine 688 in the APP intracellular domain. In addition, PLIC-1 is involved in the protein aggregation-stress pathway via associating with the Ub-interacting motif (UIM) proteins ataxin 3, HSJ1a, and epidermal growth factor substrate 15 (EPS15). PLIC-2 is a protein that binds the ATPase domain of the HSP70-like Stch protein. It functions as a negative regulator of G protein-coupled receptor (GPCR) endocytosis. It is also involved in amyotrophic lateral sclerosis (ALS)-related dementia. PLIC-3 is encoded by UbiquilinN3, a testis-specific gene. It shows high sequence similarity with the Xenopus protein XDRP1, a nuclear phosphoprotein that binds to the N-terminus of cyclin A and inhibits Ca2+-induced degradation of cyclin A, but not cyclin B. PLIC-4 is an ubiquitin-like (Ubl) nuclear protein that interacts with ataxin-1 and further links ataxin-1 with the chaperone and Ub-proteasome pathways. It also binds to the non-ubiquitinated gap junction protein connexin43 (Cx43) and regulates the turnover of Cx43 through the proteasomal pathway. PLIC proteins contain an N-terminal Ubl domain that is responsible for the binding of Ub-interacting motifs (UIMs) expressed by proteasomes and endocytic adaptors, and a C-terminal Ub-associated (UBA) domain that interacts with Ub chains present on proteins destined for proteasomal degradation. In addition, mammalian PLIC2 proteins have an extra collagen-like motif region, which is absent in other PLIC proteins and the yeast Dsk2 protein. 73
22447 340507 cd01809 Ubl_BAG6 ubiquitin-like (Ubl) domain found in BCL2-associated athanogene 6 (BAG6) and similar proteins. BAG6, also termed large proline-rich protein BAG6, or BAG family molecular chaperone regulator 6, or HLA-B-associated transcript 3 (Bat3), or protein Scythe, or protein G3, is a nucleo-cytoplasmic shuttling chaperone protein that is highly conserved in eukaryotes. It functions in two distinct biological pathways, ubiquitin-mediated protein degradation of defective polypeptides and tail-anchored transmembrane protein biogenesis in mammals. BAG6 is a component of the heterotrimeric BAG6 sortase complex composed of BAG6, transmembrane recognition complex 35 (TRC35) and ubiquitin-like protein 4A (UBL4A). The BAG6 complex together with the cochaperone small, glutamine-rich, tetratricopeptide repeat-containing, protein alpha (SGTA) plays a role in the biogenesis of tail-anchored membrane proteins and subsequently shown to regulate the ubiquitination and proteasomal degradation of mislocalized proteins. Moreover, BAG6 acts as an apoptotic regulator that binds reaper, a potent apoptotic inducer. BAG6/reaper is thought to signal apoptosis, in part through regulating the folding and activity of apoptotic signaling molecules. It is also likely a key regulator of the molecular chaperone Heat Shock Protein A2 (HSPA2) stability/function in human germ cells. Furthermore, aspartyl protease-mediated cleavage of BAG6 is necessary for autophagy and fungal resistance in plants. BAG6 contains a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which provides a platform for discriminating substrates with shorter hydrophobicity stretches as a signal for defective proteins. 71
22448 340508 cd01810 Ubl2_ISG15 ubiquitin-like (Ubl) domain 2 found in interferon-stimulated gene 15 (ISG15) and similar proteins. ISG15, also termed interferon-induced 15 kDa protein, or interferon-induced 17 kDa protein (IP17), or ubiquitin cross-reactive protein (UCRP), is an antiviral interferon-induced ubiquitin-like protein that, upon viral infection, modifies cellular and viral proteins by mechanisms similar to ubiquitination. Although ISG15 has properties similar to those of other ubiquitin-like (Ubl) molecules, it is a unique member of the Ubl superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. ISG15 contains two tandem Ubl domains with a beta-grasp Ubl fold. This family corresponds to the second Ubl domain. 74
22449 340509 cd01811 Ubl1_OASL ubiquitin-like (Ubl) domain 1 found in 2'-5'-oligoadenylate synthase-like protein (OASL) and similar proteins. OASL, also termed 2'-5'-OAS-related protein (2'-5'-OAS-RP), or 59 kDa 2'-5'-oligoadenylate synthase-like protein, or thyroid receptor-interacting protein 14, or TR-interacting protein 14 (TRIP-14), or p59 OASL (p59OASL), is an interferon (IFN)-induced antiviral protein that plays an important role in the IFNs-mediated antiviral signaling pathway. It inhibits respiratory syncytial virus replication and is targeted by the viral nonstructural protein 1 (NS1). It also displays antiviral activity against encephalomyocarditis virus (EMCV) and hepatitis C virus (HCV) via an alternative antiviral pathway independent of RNase L. Moreover, OASL does not have 2'-5'-OAS activity, but can bind double-stranded RNA (dsRNA) to enhance RIG-I signaling. OASL belongs to the 2'-5' oligoadenylate synthase (OAS) family. While each member of this family has a conserved N-terminal OAS catalytic domain, only OASL has two tandem C-terminal ubiquitin-like (Ubl) repeats, which are required for its antiviral activity. This family corresponds to the first Ubl domain. 75
22450 340510 cd01812 Ubl_BAG1 ubiquitin-like (Ubl) domain found in BAG family molecular chaperone regulator 1 (BAG1) and similar proteins. BAG1, also termed Bcl-2-associated athanogene 1, or HAP, is a multifunctional protein involved in a variety of cellular functions such as apoptosis, transcription, and proliferative pathways, as well as in cell signaling and differentiation. It delivers chaperone-recognized unfolded substrates to the proteasome for degradation. BAG1 functions as a co-chaperone for Hsp70/Hsc70 to increase Hsp70 foldase activity. It also suppresses apoptosis and enhances neuronal differentiation. As an anti-apoptotic factor, BAG1 interacts with tau and regulates its proteasomal degradation. It also binds to BCR-ABL with a high affinity, and directly routes immature BCR-ABL for proteasomal degradation. It acts as a potential therapeutic target in Parkinson's disease. It also modulates huntingtin toxicity, aggregation, degradation, and subcellular distribution, suggesting a role in Huntington's disease. There are at least four isoforms of Bag1 protein that are formed by alternative initiation of translation within a common mRNA. BAG1 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and a C-terminal BAG domain. 77
22451 340511 cd01813 Ubl_UBLCP1 ubiquitin-like (Ubl) domain found in ubiquitin-like domain-containing CTD phosphatase 1 (UBLCP1) and similar proteins. UBLCP1 is a 26S proteasome phosphatase that regulates nuclear proteasome activity. It is localized in the nucleus and it contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which directly interacts with the proteasome. Knockdown of UBLCP1 in cells promotes 26S proteasome assembly and selectively enhances nuclear proteasome activity. UBLCP1 may also play a role in the regulation of phosphorylation state of RNA polymerase II C-terminal domain, a key event during mRNA metabolism. 74
22452 340512 cd01814 Ubl_MUBs_plant ubiquitin-like (Ubl) domain found in plant membrane-anchored ubiquitin-fold proteins (MUBs). The plant MUBs belong to a family of ubiquitin-fold proteins that are plasma membrane-anchored by prenylation. They may serve as docking sites to facilitate the association of specific E2s to the plasma membrane. MUBs contain a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. 89
22453 340513 cd01815 Ubl_UBL7 ubiquitin-like (Ubl) domain found in ubiquitin-like protein 7 (UBL7) and similar proteins. UBL7, also termed bone marrow stromal cell ubiquitin-like (Ubl) protein (BMSC-UbP), or ubiquitin-like protein SB132, is a novel Ubl protein that may play roles in regulation of bone marrow stromal cell (BMSC) function or cell differentiation via an evocator-associated and cell-specific pattern. UBL7 contains an N-terminal Ubl domain with a beta-grasp Ubl fold, and a C-terminal ubiquitin-associated (UBA) domain. The Ubl domain interacts with 26S proteasome-dependent degradation, and the UBA domain links cellular processes and the ubiquitin system. 92
22454 340514 cd01816 RBD_RAF Ras-binding domain (RBD) found in RAF family serine/threonine kinases. The RAF family includes three RAF serine/threonine kinases ARAF, BRAF, and RAF1/CRAF. These are encoded by proto-oncogenes, and activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. 71
22455 340515 cd01817 RBD1_RGS12_like Ras-binding domain (RBD) 1 of regulator of G protein signaling 12 (RGS12) and similar proteins. Regulator of G protein signaling (RGS) proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. This RGS12-like subfamily is composed of RGS12 and RGS14, with multidomain architectures including a RGS domain, two tandem Ras-binding domains (RBDs), and a second Galpha interacting domain, the GoLoco motif. The RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. 70
22456 132836 cd01819 Patatin_and_cPLA2 Patatins and Phospholipases. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4), which hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. 155
22457 238858 cd01820 PAF_acetylesterase_like PAF_acetylhydrolase (PAF-AH)_like subfamily of SGNH-hydrolases. Platelet-activating factor (PAF) and PAF-AH are key players in inflammation and in atherosclerosis. PAF-AH is a calcium independent phospholipase A2 which exhibits strong substrate specificity towards PAF, hydrolyzing an acetyl ester at the sn-2 position. PAF-AH also degrades a family of oxidized PAF-like phospholipids with short sn-2 residues. In addition, PAF and PAF-AH are associated with neural migration and mammalian reproduction. 214
22458 238859 cd01821 Rhamnogalacturan_acetylesterase_like Rhamnogalacturan_acetylesterase_like subgroup of SGNH-hydrolases. Rhamnogalacturan acetylesterase removes acetyl esters from rhamnogalacturonan substrates, and renders them susceptible to degradation by rhamnogalacturonases. Rhamnogalacturonans are highly branched regions in pectic polysaccharides, consisting of repeating -(1,2)-L-Rha-(1,4)-D-GalUA disaccharide units, with many rhamnose residues substituted by neutral oligosaccharides such as arabinans, galactans and arabinogalactans. Extracellular enzymes participating in the degradation of plant cell wall polymers, such as Rhamnogalacturonan acetylesterase, would typically be found in saprophytic and plant pathogenic fungi and bacteria. 198
22459 238860 cd01822 Lysophospholipase_L1_like Lysophospholipase L1-like subgroup of SGNH-hydrolases. The best characterized member in this family is TesA, an E. coli periplasmic protein with thioesterase, esterase, arylesterase, protease and lysophospholipase activity. 177
22460 238861 cd01823 SEST_like SEST_like. A family of secreted SGNH-hydrolases similar to Streptomyces scabies esterase (SEST), a causal agent of the potato scab disease, which hydrolyzes a specific ester bond in suberin, a plant lipid. The tertiary fold of this enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 259
22461 238862 cd01824 Phospholipase_B_like Phospholipase-B_like. This subgroup of the SGNH-family of lipolytic enzymes may have both esterase and phospholipase-A/lysophospholipase activity. Its members may be involved in the conversion of phosphatidylcholine to fatty acids and glycerophosphocholine, perhaps in the context of dietary lipid uptake. Members may be membrane proteins. The tertiary fold of the SGNH-hydrolases is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases. 288
22462 238863 cd01825 SGNH_hydrolase_peri1 SGNH_peri1; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 189
22463 238864 cd01826 acyloxyacyl_hydrolase_like Acyloxyacyl-hydrolase like subfamily of the SGNH-hydrolase family. Acyloxyacyl-hydrolase is a leukocyte-secreted enzyme that deacetylates bacterial lipopolysaccharides. 305
22464 238865 cd01827 sialate_O-acetylesterase_like1 sialate O-acetylesterase_like family of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 188
22465 238866 cd01828 sialate_O-acetylesterase_like2 sialate_O-acetylesterase_like subfamily of the SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 169
22466 238867 cd01829 SGNH_hydrolase_peri2 SGNH_peri2; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 200
22467 238868 cd01830 XynE_like SGNH_hydrolase subfamily, similar to the putative arylesterase/acylhydrolase from the rumen anaerobe Prevotella bryantii XynE. The P. bryantii XynE gene is located in a xylanase gene cluster. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 204
22468 238869 cd01831 Endoglucanase_E_like Endoglucanase E-like members of the SGNH hydrolase family; Endoglucanase E catalyzes the endohydrolysis of 1,4-beta-glucosidic linkages in cellulose, lichenin and cereal beta-D-glucans. 169
22469 238870 cd01832 SGNH_hydrolase_like_1 Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. Myxobacterial members of this subfamily have been reported to be involved in adventurous gliding motility. 185
22470 238871 cd01833 XynB_like SGNH_hydrolase subfamily, similar to Ruminococcus flavefaciens XynB. Most likely a secreted hydrolase with xylanase activity. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 157
22471 238872 cd01834 SGNH_hydrolase_like_2 SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 191
22472 238873 cd01835 SGNH_hydrolase_like_3 SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 193
22473 238874 cd01836 FeeA_FeeB_like SGNH_hydrolase subfamily, FeeA, FeeB and similar esterases/lipases. FeeA and FeeB are part of a biosynthetic gene cluster and may participate in the biosynthesis of long-chain N-acyltyrosines by providing saturated and unsaturated fatty acids, which in turn are loaded onto the acyl carrier protein FeeL. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 191
22474 238875 cd01837 SGNH_plant_lipase_like SGNH_plant_lipase_like, a plant specific subfamily of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 315
22475 238876 cd01838 Isoamyl_acetate_hydrolase_like Isoamyl-acetate hydrolyzing esterase-like proteins. SGNH_hydrolase subfamily similar to the Saccharomyces cerevisiae IAH1. IAH1 may be the major esterase that hydrolyses isoamyl acetate in sake mash. The SGNH-family of hydrolases is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 199
22476 238877 cd01839 SGNH_arylesterase_like SGNH_hydrolase subfamily, similar to arylesterase (7-aminocephalosporanic acid-deacetylating enzyme) of A. tumefaciens. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 208
22477 238878 cd01840 SGNH_hydrolase_yrhL_like yrhL-like subfamily of SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Most members of this sub-family appear to co-occur with N-terminal acyltransferase domains. Might be involved in lipid metabolism. 150
22478 238879 cd01841 NnaC_like NnaC (CMP-NeuNAc synthetase) _like subfamily of SGNH_hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases. E. coli NnaC appears to be involved in polysaccharide synthesis. 174
22479 238880 cd01842 SGNH_hydrolase_like_5 SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 183
22480 238881 cd01844 SGNH_hydrolase_like_6 SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 177
22481 238882 cd01846 fatty_acyltransferase_like Fatty acyltransferase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Might catalyze fatty acid transfer between phosphatidylcholine and sterols. 270
22482 238883 cd01847 Triacylglycerol_lipase_like Triacylglycerol lipase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Members of this subfamily might hydrolyze triacylglycerol into diacylglycerol and fatty acid anions. 281
22483 206746 cd01849 YlqF_related_GTPase Circularly permuted YlqF-related GTPases. These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases. 146
22484 206649 cd01850 CDC_Septin CDC/Septin GTPase family. Septins are a conserved family of GTP-binding proteins associated with diverse processes in dividing and non-dividing cells. They were first discovered in the budding yeast S. cerevisiae as a set of genes (CDC3, CDC10, CDC11 and CDC12) required for normal bud morphology. Septins are also present in metazoan cells, where they are required for cytokinesis in some systems, and implicated in a variety of other processes involving organization of the cell cortex and exocytosis. In humans, 12 septin genes generate dozens of polypeptides, many of which comprise heterooligomeric complexes. Since septin mutants are commonly defective in cytokinesis and formation of the neck filaments/septin rings, septins have been considered to be the primary constituents of the neck filaments. Septins belong to the GTPase superfamily for their conserved GTPase motifs and enzymatic activities. 275
22485 206650 cd01851 GBP Guanylate-binding protein (GBP) family (N-terminal domain). Guanylate-binding protein (GBP), N-terminal domain. Guanylate-binding proteins (GBPs) define a group of proteins that are synthesized after activation of the cell by interferons. The biochemical properties of GBPs are clearly different from those of Ras-like and heterotrimeric GTP-binding proteins. They bind guanine nucleotides with low affinity (micromolar range), are stable in their absence and have a high turnover GTPase. In addition to binding GDP/GTP, they have the unique ability to bind GMP with equal affinity and hydrolyze GTP not only to GDP, but also to GMP. Furthermore, two unique regions around the base and the phosphate-binding areas, the guanine and the phosphate caps, respectively, give the nucleotide-binding site a unique appearance not found in the canonical GTP-binding proteins. The phosphate cap, which constitutes the region analogous to switch I, completely shields the phosphate-binding site from solvent such that a potential GTPase-activating protein (GAP) cannot approach. 224
22486 206651 cd01852 AIG1 AvrRpt2-Induced Gene 1 (AIG1). This group represents Arabidopsis protein AIG1 (avrRpt2-induced gene 1) that appears to be involved in plant resistance to bacteria. The Arabidopsis disease resistance gene RPS2 is involved in recognition of bacterial pathogens carrying the avirulence gene avrRpt2. AIG1 exhibits RPS2- and avrRpt1-dependent induction early after infection with Pseudomonas syringae carrying avrRpt2. This subfamily also includes IAN-4 protein, which has GTP-binding activity and shares sequence homology with a novel family of putative GTP-binding proteins: the immuno-associated nucleotide (IAN) family. The evolutionary conservation of the IAN family provides a unique example of a plant pathogen response gene conserved in animals. The IAN/IMAP subfamily has been proposed to regulate apoptosis in vertebrates and angiosperm plants, particularly in relation to cancer, diabetes, and infections. The human IAN genes were renamed GIMAP (GTPase of the immunity associated proteins). 201
22487 206652 cd01853 Toc34_like Translocon at the Outer-envelope membrane of Chloroplasts 34-like (Toc34-like). The Toc34-like (Translocon at the Outer-envelope membrane of Chloroplasts) family contains several Toc proteins, including Toc34, Toc33, Toc120, Toc159, Toc86, Toc125, and Toc90. The Toc complex at the outer envelope membrane of chloroplasts is a molecular machine of ~500 kDa that contains a single Toc159 protein, four Toc75 molecules, and four or five copies of Toc34. Toc64 and Toc12 are associated with the translocon, but do not appear to be part of the core complex. The Toc translocon initiates the import of nuclear-encoded preproteins from the cytosol into the organelle. Toc34 and Toc159 are both GTPases, while Toc75 is a beta-barrel integral membrane protein. Toc159 is equally distributed between a soluble cytoplasmic form and a membrane-inserted form, suggesting that assembly of the Toc complex is dynamic. Toc34 and Toc75 act sequentially to mediate docking and insertion of Toc159 resulting in assembly of the functional translocon. 248
22488 206747 cd01854 YjeQ_EngC Ribosomal interacting GTPase YjeQ/EngC, a circularly permuted subfamily of the Ras GTPases. YjeQ (YloQ in Bacillus subtilis) is a ribosomal small subunit-dependent GTPase; hence also known as RsgA. YjeQ is a late-stage ribosomal biogenesis factor involved in the 30S subunit maturation, and it represents a protein family whose members are broadly conserved in bacteria and have been shown to be essential to the growth of E. coli and B. subtilis. Proteins of the YjeQ family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. All YjeQ family proteins display a unique domain architecture, which includes an N-terminal OB-fold RNA-binding domain, the central permuted GTPase domain, and a zinc knuckle-like C-terminal cysteine domain. 211
22489 206748 cd01855 YqeH Circularly permuted YqeH GTPase. YqeH is an essential GTP-binding protein. Depletion of YqeH induces an excess initiation of DNA replication, suggesting that it negatively controls initiation of chromosome replication. The YqeH subfamily is common in eukaryotes and sporadically present in bacteria with probable acquisition by plants from chloroplasts. Proteins of the YqeH family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. 191
22490 206749 cd01856 YlqF Circularly permuted YlqF GTPase. Proteins of the YlqF family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. The YlqF subfamily is represented in all eukaryotes as well as a phylogenetically diverse array of bacteria (including gram-positive bacteria, proteobacteria, Synechocystis, Borrelia, and Thermotoga). 171
22491 206750 cd01857 HSR1_MMR1 A circularly permuted subfamily of the Ras GTPases. Human HSR1 is localized to the human MHC class I region and is highly homologous to a putative GTP-binding protein, MMR1 from mouse. These proteins represent a new subfamily of GTP-binding proteins that has only eukaryote members. This subfamily shows a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with sequence NKXD) are relocated to the N-terminus. 140
22492 206751 cd01858 NGP_1 A novel nucleolar GTP-binding protein, circularly permuted subfamily of the Ras GTPases. Autoantigen NGP-1 (Nucleolar G-protein gene 1) has been shown to localize in the nucleolus and nucleolar organizers in all cell types analyzed, which is indicative of a function in ribosomal assembly. NGP-1 and its homologs show a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with NKXD motif) are relocated to the N terminus. 157
22493 206752 cd01859 MJ1464 An uncharacterized, circularly permuted subfamily of the Ras GTPases. This family represents archaeal GTPase typified by the protein MJ1464 from Methanococcus jannaschii. The members of this family show a circular permutation of the GTPase signature motifs so that C-terminal strands 5, 6, and 7 (strand 6 contains the NKxD motif) are relocated to the N terminus. 157
22494 206653 cd01860 Rab5_related Rab-related GTPase family includes Rab5 and Rab22; regulates early endosome fusion. The Rab5-related subfamily includes Rab5 and Rab22 of mammals, Ypt51/Ypt52/Ypt53 of yeast, and RabF of plants. The members of this subfamily are involved in endocytosis and endocytic-sorting pathways. In mammals, Rab5 GTPases localize to early endosomes and regulate fusion of clathrin-coated vesicles to early endosomes and fusion between early endosomes. In yeast, Ypt51p family members similarly regulate membrane trafficking through prevacuolar compartments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 163
22495 206654 cd01861 Rab6 Rab GTPase family 6 (Rab6). Rab6 is involved in microtubule-dependent transport pathways through the Golgi and from endosomes to the Golgi. Rab6A of mammals is implicated in retrograde transport through the Golgi stack, and is also required for a slow, COPI-independent, retrograde transport pathway from the Golgi to the endoplasmic reticulum (ER). This pathway may allow Golgi residents to be recycled through the ER for scrutiny by ER quality-control systems. Yeast Ypt6p, the homolog of the mammalian Rab6 GTPase, is not essential for cell viability. Ypt6p acts in endosome-to-Golgi, in intra-Golgi retrograde transport, and possibly also in Golgi-to-ER trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 161
22496 206655 cd01862 Rab7 Rab GTPase family 7 (Rab7). Rab7 subfamily. Rab7 is a small Rab GTPase that regulates vesicular traffic from early to late endosomal stages of the endocytic pathway. The yeast Ypt7 and mammalian Rab7 are both involved in transport to the vacuole/lysosome, whereas Ypt7 is also required for homotypic vacuole fusion. Mammalian Rab7 is an essential participant in the autophagic pathway for sequestration and targeting of cytoplasmic components to the lytic compartment. Mammalian Rab7 is also proposed to function as a tumor suppressor. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 172
22497 206656 cd01863 Rab18 Rab GTPase family 18 (Rab18). Rab18 subfamily. Mammalian Rab18 is implicated in endocytic transport and is expressed most highly in polarized epithelial cells. However, trypanosomal Rab, TbRAB18, is upregulated in the BSF (Blood Stream Form) stage and localized predominantly to elements of the Golgi complex. In human and mouse cells, Rab18 has been identified in lipid droplets, organelles that store neutral lipids. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 161
22498 133267 cd01864 Rab19 Rab GTPase family 19 (Rab19). Rab19 subfamily. Rab19 proteins are associated with Golgi stacks. Similarity analysis indicated that Rab41 is closely related to Rab19. However, the function of these Rabs is not yet characterized. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 165
22499 206657 cd01865 Rab3 Rab GTPase family 3 contains Rab3A, Rab3B, Rab3C and Rab3D. The Rab3 subfamily contains Rab3A, Rab3B, Rab3C, and Rab3D. All four isoforms were found in mouse brain and endocrine tissues, with varying levels of expression. Rab3A, Rab3B, and Rab3C localized to synaptic and secretory vesicles; Rab3D was expressed at high levels only in adipose tissue, exocrine glands, and the endocrine pituitary, where it is localized to cytoplasmic secretory granules. Rab3 appears to control Ca2+-regulated exocytosis. The appropriate GDP/GTP exchange cycle of Rab3A is required for Ca2+-regulated exocytosis to occur, and interaction of the GTP-bound form of Rab3A with effector molecule(s) is widely believed to be essential for this process. Functionally, most studies point toward a role for Rab3 in the secretion of hormones and neurotransmitters. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 165
22500 206658 cd01866 Rab2 Rab GTPase family 2 (Rab2). Rab2 is localized on cis-Golgi membranes and interacts with Golgi matrix proteins. Rab2 is also implicated in the maturation of vesicular tubular clusters (VTCs), which are microtubule-associated intermediates in transport between the ER and Golgi apparatus. In plants, Rab2 regulates vesicle trafficking between the ER and the Golgi bodies and is important to pollen tube growth. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 168
22501 206659 cd01867 Rab8_Rab10_Rab13_like Rab GTPase families 8, 10, 13 (Rab8, Rab10, Rab13). Rab8/Sec4/Ypt2 are known or suspected to be involved in post-Golgi transport to the plasma membrane. It is likely that these Rabs have functions that are specific to the mammalian lineage and have no orthologs in plants. Rab8 modulates polarized membrane transport through reorganization of actin and microtubules, induces the formation of new surface extensions, and has an important role in directed membrane transport to cell surfaces. The Ypt2 gene of the fission yeast Schizosaccharomyces pombe encodes a member of the Ypt/Rab family of small GTP-binding proteins, related in sequence to Sec4p of Saccharomyces cerevisiae but closer to mammalian Rab8. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 167
22502 206660 cd01868 Rab11_like Rab GTPase family 11 (Rab11)-like includes Rab11a, Rab11b, and Rab25. Rab11a, Rab11b, and Rab25 are closely related, evolutionarily conserved Rab proteins that are differentially expressed. Rab11a is ubiquitously synthesized, Rab11b is enriched in brain and heart, and Rab25 is only found in epithelia. Rab11/25 proteins seem to regulate recycling pathways from endosomes to the plasma membrane and to the trans-Golgi network. Furthermore, Rab11a is thought to function in the histamine-induced fusion of tubulovesicles containing H+, K+ ATPase with the plasma membrane in gastric parietal cells and in insulin-stimulated insertion of GLUT4 in the plasma membrane of cardiomyocytes. Overexpression of Rab25 has recently been observed in ovarian cancer and breast cancer, and has been correlated with worsened outcomes in both diseases. In addition, Rab25 overexpression has also been observed in prostate cancer, transitional cell carcinoma of the bladder, and invasive breast tumor cells. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 165
22503 206661 cd01869 Rab1_Ypt1 Rab GTPase family 1 includes the yeast homolog Ypt1. Rab1/Ypt1 subfamily. Rab1 is found in every eukaryote and is a key regulatory component for the transport of vesicles from the ER to the Golgi apparatus. Studies on mutations of Ypt1, the yeast homolog of Rab1, showed that this protein is necessary for the budding of vesicles of the ER as well as for their transport to, and fusion with, the Golgi apparatus. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 166
22504 206662 cd01870 RhoA_like Ras homology family A (RhoA)-like includes RhoA, RhoB and RhoC. The RhoA subfamily consists of RhoA, RhoB, and RhoC. RhoA promotes the formation of stress fibers and focal adhesions, regulating cell shape, attachment, and motility. RhoA can bind to multiple effector proteins, thereby triggering different downstream responses. In many cell types, RhoA mediates local assembly of the contractile ring, which is necessary for cytokinesis. RhoA is vital for muscle contraction; in vascular smooth muscle cells, RhoA plays a key role in cell contraction, differentiation, migration, and proliferation. RhoA activities appear to be elaborately regulated in a time- and space-dependent manner to control cytoskeletal changes. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. RhoA and RhoC are observed only in geranylgeranylated forms; however, RhoB can be present in palmitoylated, farnesylated, and geranylgeranylated forms. RhoA and RhoC are highly relevant for tumor progression and invasiveness; however, RhoB has recently been suggested to be a tumor suppressor. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 175
22505 206663 cd01871 Rac1_like Ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1)-like consists of Rac1, Rac2 and Rac3. The Rac1-like subfamily consists of Rac1, Rac2, and Rac3 proteins, plus the splice variant Rac1b that contains a 19-residue insertion near switch II relative to Rac1. While Rac1 is ubiquitously expressed, Rac2 and Rac3 are largely restricted to hematopoietic and neural tissues respectively. Rac1 stimulates the formation of actin lamellipodia and membrane ruffles. It also plays a role in cell-matrix adhesion and cell anoikis. In intestinal epithelial cells, Rac1 is an important regulator of migration and mediates apoptosis. Rac1 is also essential for RhoA-regulated actin stress fiber and focal adhesion complex formation. In leukocytes, Rac1 and Rac2 have distinct roles in regulating cell morphology, migration, and invasion, but are not essential for macrophage migration or chemotaxis. Rac3 has biochemical properties that are closely related to Rac1, such as effector interaction, nucleotide binding, and hydrolysis; Rac2 has a slower nucleotide association and is more efficiently activated by the RacGEF Tiam1. Both Rac1 and Rac3 have been implicated in the regulation of cell migration and invasion in human metastatic breast cancer. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 174
22506 133275 cd01873 RhoBTB RhoBTB protein is an atypical member of the Rho family of small GTPases. Members of the RhoBTB subfamily of Rho GTPases are present in vertebrates, Drosophila, and Dictyostelium. RhoBTB proteins are characterized by a modular organization, consisting of a GTPase domain, a proline rich region, a tandem of two BTB (Broad-Complex, Tramtrack, and Bric a brac) domains, and a C-terminal region of unknown function. RhoBTB proteins may act as docking points for multiple components participating in signal transduction cascades. RhoBTB genes appeared upregulated in some cancer cell lines, suggesting a participation of RhoBTB proteins in the pathogenesis of particular tumors. Note that the Dictyostelium RacA GTPase domain is more closely related to Rac proteins than to RhoBTB proteins, where RacA actually belongs. Thus, the Dictyostelium RacA is not included here. Most Rho proteins contain a lipid modification site at the C-terminus; however, RhoBTB is one of few Rho subfamilies that lack this feature. 195
22507 206664 cd01874 Cdc42 cell division cycle 42 (Cdc42) is a small GTPase of the Rho family. Cdc42 is an essential GTPase that belongs to the Rho family of Ras-like GTPases. These proteins act as molecular switches by responding to exogenous and/or endogenous signals and relaying those signals to activate downstream components of a biological pathway. Cdc42 transduces signals to the actin cytoskeleton to initiate and maintain polarized growth and to mitogen-activated protein morphogenesis. In the budding yeast Saccharomyces cerevisiae, Cdc42 plays an important role in multiple actin-dependent morphogenetic events such as bud emergence, mating-projection formation, and pseudohyphal growth. In mammalian cells, Cdc42 regulates a variety of actin-dependent events and induces the JNK/SAPK protein kinase cascade, which leads to the activation of transcription factors within the nucleus. Cdc42 mediates these processes through interactions with a myriad of downstream effectors, whose number and regulation we are just starting to understand. In addition, Cdc42 has been implicated in a number of human diseases through interactions with its regulators and downstream effectors. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 175
22508 133277 cd01875 RhoG Ras homolog family, member G (RhoG) of small guanosine triphosphatases (GTPases). RhoG is a GTPase with high sequence similarity to members of the Rac subfamily, including the regions involved in effector recognition and binding. However, RhoG does not bind to known Rac1 and Cdc42 effectors, including proteins containing a Cdc42/Rac interacting binding (CRIB) motif. Instead, RhoG interacts directly with Elmo, an upstream regulator of Rac1, in a GTP-dependent manner and forms a ternary complex with Dock180 to induce activation of Rac1. The RhoG-Elmo-Dock180 pathway is required for activation of Rac1 and cell spreading mediated by integrin, as well as for neurite outgrowth induced by nerve growth factor. Thus RhoG activates Rac1 through Elmo and Dock180 to control cell morphology. RhoG has also been shown to play a role in caveolar trafficking and has a novel role in signaling the neutrophil respiratory burst stimulated by G protein-coupled receptor (GPCR) agonists. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 191
22509 206665 cd01876 YihA_EngB YihA (EngB) GTPase family. The YihA (EngB) subfamily of GTPases is typified by the E. coli YihA, an essential protein involved in cell division control. YihA and its orthologs are small proteins that typically contain fewer than 200 amino acid residues and consist of the GTPase domain only (some of the eukaryotic homologs contain an N-terminal extension of about 120 residues that might be involved in organellar targeting). Homologs of yihA are found in most Gram-positive and Gram-negative pathogenic bacteria, with the exception of Mycobacterium tuberculosis. The broad-spectrum nature of YihA and its essentiality for cell viability in bacteria make it an attractive antibacterial target. 170
22510 206666 cd01878 HflX HflX GTPase family. HflX subfamily. A distinct conserved domain with a glycine-rich segment N-terminal of the GTPase domain characterizes the HflX subfamily. The E. coli HflX has been implicated in the control of the lambda cII repressor proteolysis, but the actual biological functions of these GTPases remain unclear. HflX is widespread, but not universally represented in all three superkingdoms. 204
22511 206667 cd01879 FeoB Ferrous iron transport protein B (FeoB) family. Ferrous iron transport protein B (FeoB) subfamily. E. coli has an iron(II) transport system, known as feo, which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus contains a P-loop motif suggesting that iron transport may be ATP dependent. 159
22512 206668 cd01881 Obg_like Obg-like family of GTPases consists of five subfamilies: Obg, DRG, YyaF/YchF, Ygr210, and NOG1. The Obg-like subfamily consists of five well-delimited, ancient subfamilies, namely Obg, DRG, YyaF/YchF, Ygr210, and NOG1. Four of these groups (Obg, DRG, YyaF/YchF, and Ygr210) are characterized by a distinct glycine-rich motif immediately following the Walker B motif (G3 box). Obg/CgtA is an essential gene that is involved in the initiation of sporulation and DNA replication in the bacteria Caulobacter and Bacillus, but its exact molecular role is unknown. Furthermore, several OBG family members possess a C-terminal RNA-binding domain, the TGS domain, which is also present in threonyl-tRNA synthetase and in bacterial guanosine polyphosphatase SpoT. Nog1 is a nucleolar protein that might function in ribosome assembly. The DRG and Nog1 subfamilies are ubiquitous in archaea and eukaryotes, the Ygr210 subfamily is present in archaea and fungi, and the Obg and YyaF/YchF subfamilies are ubiquitous in bacteria and eukaryotes. The Obg/Nog1 and DRG subfamilies appear to form one major branch of the Obg family and the Ygr210 and YchF subfamilies form another branch. No GEFs, GAPs, or GDIs for Obg have been identified. 167
22513 206669 cd01882 BMS1 Bms1, an essential GTPase, promotes assembly of preribosomal RNA processing complexes. Bms1 is an essential, evolutionarily conserved, nucleolar protein. Its depletion interferes with processing of the 35S pre-rRNA at sites A0, A1, and A2, and the formation of 40S subunits. Bms1, the putative endonuclease Rcl1, and the essential U3 small nucleolar RNA form a stable subcomplex that is believed to control an early step in the formation of the 40S subunit. The C-terminal domain of Bms1 contains a GTPase-activating protein (GAP) that functions intramolecularly. It is believed that Rcl1 activates Bms1 by acting as a guanine-nucleotide exchange factor (GEF) to promote GDP/GTP exchange, and that activated (GTP-bound) Bms1 delivers Rcl1 to the preribosomes. 231
22514 206670 cd01883 EF1_alpha Elongation Factor 1-alpha (EF1-alpha) protein family. EF1 is responsible for the GTP-dependent binding of aminoacyl-tRNAs to the ribosomes. EF1 is composed of four subunits: the alpha chain which binds GTP and aminoacyl-tRNAs, the gamma chain that probably plays a role in anchoring the complex to other cellular components and the beta and delta (or beta') chains. This subfamily is the alpha subunit, and represents the counterpart of bacterial EF-Tu for the archaea (aEF1-alpha) and eukaryotes (eEF1-alpha). eEF1-alpha interacts with the actin of the eukaryotic cytoskeleton and may thereby play a role in cellular transformation and apoptosis. EF-Tu can have no such role in bacteria. In humans, the isoform eEF1A2 is overexpressed in 2/3 of breast cancers and has been identified as a putative oncogene. This subfamily also includes Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation in yeast, and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression. 219
22515 206671 cd01884 EF_Tu Elongation Factor Tu (EF-Tu) GTP-binding proteins. EF-Tu subfamily. This subfamily includes orthologs of translation elongation factor EF-Tu in bacteria, mitochondria, and chloroplasts. It is one of several GTP-binding translation factors found in the larger family of GTP-binding elongation factors. The eukaryotic counterpart, eukaryotic translation elongation factor 1 (eEF-1 alpha), is excluded from this family. EF-Tu is one of the most abundant proteins in bacteria, as well as one of the most highly conserved, and in a number of species the gene is duplicated with identical function. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors. Transfer RNA is carried to the ribosome in these complexes for protein translation. 195
22516 206672 cd01885 EF2 Elongation Factor 2 (EF2) in archaea and eukarya. Translocation requires hydrolysis of a molecule of GTP and is mediated by EF-G in bacteria and by eEF2 in eukaryotes. The eukaryotic elongation factor eEF2 is a GTPase involved in the translocation of the peptidyl-tRNA from the A site to the P site on the ribosome. The 95-kDa protein is highly conserved, with 60% amino acid sequence identity between the human and yeast proteins. Two major mechanisms are known to regulate protein elongation and both involve eEF2. First, eEF2 can be modulated by reversible phosphorylation. Increased levels of phosphorylated eEF2 reduce elongation rates presumably because phosphorylated eEF2 fails to bind the ribosomes. Treatment of mammalian cells with agents that raise the cytoplasmic Ca2+ and cAMP levels reduces elongation rates by activating the kinase responsible for phosphorylating eEF2. In contrast, treatment of cells with insulin increases elongation rates by promoting eEF2 dephosphorylation. Second, the protein can be post-translationally modified by ADP-ribosylation. Various bacterial toxins perform this reaction after modification of a specific histidine residue to diphthamide, but there is evidence for endogenous ADP ribosylase activity. Similar to the bacterial toxins, it is presumed that modification by the endogenous enzyme also inhibits eEF2 activity. 218
22517 206673 cd01886 EF-G Elongation factor G (EF-G) family involved in both the elongation and ribosome recycling phases of protein synthesis. Translocation is mediated by EF-G (also called translocase). The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains both eukaryotic and bacterial members. 270
22518 206674 cd01887 IF2_eIF5B Initiation Factor 2 (IF2)/ eukaryotic Initiation Factor 5B (eIF5B) family. IF2/eIF5B contribute to ribosomal subunit joining and function as GTPases that are maximally activated by the presence of both ribosomal subunits. As seen in other GTPases, IF2/eIF5B undergoes conformational changes between its GTP- and GDP-bound states. Eukaryotic IF2/eIF5Bs possess three characteristic segments, including a divergent N-terminal region followed by conserved central and C-terminal segments. This core region is conserved among all known eukaryotic and archaeal IF2/eIF5Bs and eubacterial IF2s. 169
22519 206675 cd01888 eIF2_gamma Gamma subunit of initiation factor 2 (eIF2 gamma). eIF2 is a heterotrimeric translation initiation factor that consists of alpha, beta, and gamma subunits. The GTP-bound gamma subunit also binds initiator methionyl-tRNA and delivers it to the 40S ribosomal subunit. Following hydrolysis of GTP to GDP, eIF2:GDP is released from the ribosome. The gamma subunit has no intrinsic GTPase activity, but is stimulated by the GTPase activating protein (GAP) eIF5, and GDP/GTP exchange is stimulated by the guanine nucleotide exchange factor (GEF) eIF2B. eIF2B is a heteropentamer, and the epsilon chain binds eIF2. Both eIF5 and eIF2B-epsilon are known to bind strongly to eIF2-beta, but have also been shown to bind directly to eIF2-gamma. It is possible that eIF2-beta serves simply as a high-affinity docking site for eIF5 and eIF2B-epsilon, or that eIF2-beta serves a regulatory role. eIF2-gamma is found only in eukaryotes and archaea. It is closely related to SelB, the selenocysteine-specific elongation factor from eubacteria. The translational factor components of the ternary complex, IF2 in eubacteria and eIF2 in eukaryotes are not the same protein (despite their unfortunately similar names). Both factors are GTPases; however, eubacterial IF-2 is a single polypeptide, while eIF2 is heterotrimeric. eIF2-gamma is a member of the same family as eubacterial IF2, but the two proteins are only distantly related. This family includes translation initiation, elongation, and release factors. 197
22520 206676 cd01889 SelB_euk SelB, the dedicated elongation factor for delivery of selenocysteinyl-tRNA to the ribosome. SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine charged tRNAsec, which has a UCA anticodon, in an EF-Tu like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli SelB binds GTP, selenocysteyl-tRNAsec and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains eukaryotic SelBs and some from archaea. 192
22521 206677 cd01890 LepA LepA also known as Elongation Factor 4 (EF4). LepA (also known as elongation factor 4, EF4) belongs to the GTPase family and exhibits significant homology to the translation factors EF-G and EF-Tu, indicating its possible involvement in translation and association with the ribosome. LepA is ubiquitous in bacteria and eukaryota (e.g. yeast GUF1p), but is missing from archaea. This pattern of phyletic distribution suggests that LepA evolved through a duplication of the EF-G gene in bacteria, followed by early transfer into the eukaryotic lineage, most likely from the promitochondrial endosymbiont. Yeast GUF1p is not essential and mutant cells did not reveal any marked phenotype. 179
22522 206678 cd01891 TypA_BipA Tyrosine phosphorylated protein A (TypA)/BipA family belongs to ribosome-binding GTPases. BipA is a protein belonging to the ribosome-binding family of GTPases and is widely distributed in bacteria and plants. BipA was originally described as a protein that is induced in Salmonella typhimurium after exposure to bactericidal/permeability-inducing protein (a cationic antimicrobial protein produced by neutrophils), and has since been identified in E. coli as well. The properties thus far described for BipA are related to its role in the process of pathogenesis by enteropathogenic E. coli. It appears to be involved in the regulation of several processes important for infection, including rearrangements of the cytoskeleton of the host, bacterial resistance to host defense peptides, flagellum-mediated cell motility, and expression of K5 capsular genes. It has been proposed that BipA may utilize a novel mechanism to regulate the expression of target genes. In addition, BipA from enteropathogenic E. coli has been shown to be phosphorylated on a tyrosine residue, while BipA from Salmonella and from E. coli K12 strains is not phosphorylated under the conditions assayed. The phosphorylation apparently modifies the rate of nucleotide hydrolysis, with the phosphorylated form showing greatly increased GTPase activity. 194
22523 206679 cd01892 Miro2 Mitochondrial Rho family 2 (Miro2), C-terminal. Miro2 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the putative GTPase domain in the C terminus of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature. 180
22524 206680 cd01893 Miro1 Mitochondrial Rho family 1 (Miro1), N-terminal. Miro1 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the N-terminal GTPase domain of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature. 168
22525 206681 cd01894 EngA1 EngA1 GTPase contains the first domain of EngA. This EngA1 subfamily CD represents the first GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. A recent report suggests that E. coli Der functions in ribosome assembly and stability. 157
22526 206682 cd01895 EngA2 EngA2 GTPase contains the second domain of EngA. This EngA2 subfamily CD represents the second GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. A recent report suggests that E. coli Der functions in ribosome assembly and stability. 174
22527 206683 cd01896 DRG Developmentally Regulated GTP-binding protein (DRG). The developmentally regulated GTP-binding protein (DRG) subfamily is an uncharacterized member of the Obg family, an evolutionary branch of GTPase superfamily proteins. GTPases act as molecular switches regulating diverse cellular processes. DRG2 and DRG1 comprise the DRG subfamily in eukaryotes. In view of their widespread expression in various tissues and high conservation among distantly related species in eukaryotes and archaea, DRG proteins may regulate fundamental cellular processes. It is proposed that the DRG subfamily proteins play their physiological roles through RNA binding. 233
22528 206684 cd01897 NOG Nucleolar GTP-binding protein (NOG). NOG1 is a nucleolar GTP-binding protein present in eukaryotes ranging from trypanosomes to humans. NOG1 is functionally linked to ribosome biogenesis and found in association with the nuclear pore complexes and identified in many preribosomal complexes. Thus, defects in NOG1 can lead to defects in 60S biogenesis. The S. cerevisiae NOG1 gene is essential for cell viability, and mutations in the predicted G motifs abrogate function. It is a member of the ODN family of GTP-binding proteins that also includes the bacterial Obg and DRG proteins. 167
22529 206685 cd01898 Obg Obg GTPase. The Obg nucleotide binding protein subfamily has been implicated in stress response, chromosome partitioning, replication initiation, mycelium development, and sporulation. Obg proteins are among a large group of GTP binding proteins conserved from bacteria to humans. The E. coli homolog, ObgE is believed to function in ribosomal biogenesis. Members of the subfamily contain two equally and highly conserved domains, a C-terminal GTP binding domain and an N-terminal glycine-rich domain. 170
22530 206686 cd01899 Ygr210 Ygr210 GTPase. Ygr210 is a member of Obg-like family and present in archaea and fungi. They are characterized by a distinct glycine-rich motif immediately following the Walker B motif. The Ygr210 and YyaF/YchF subfamilies appear to form one major branch of the Obg-like family. Among eukaryotes, the Ygr210 subfamily is represented only in fungi. These fungal proteins form a tight cluster with their archaeal orthologs, which suggests the possibility of horizontal transfer from archaea to fungi. 318
22531 206687 cd01900 YchF YchF GTPase. YchF is a member of the Obg family, which includes four other subfamilies of GTPases: Obg, DRG, Ygr210, and NOG1. Obg is an essential gene that is involved in DNA replication in C. crescentus and Streptomyces griseus and is associated with the ribosome. Several members of the family, including YchF, possess the TGS domain related to the RNA-binding proteins. Experimental data and genomic analysis suggest that YchF may be part of a nucleoprotein complex and may function as a GTP-dependent translational factor. 274
22532 238884 cd01901 Ntn_hydrolase The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. The N-terminal nucleophile (Ntn) hydrolase superfamily has a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences. 164
22533 238885 cd01902 Ntn_CGH Choloylglycine hydrolase (CGH) is a bile salt-modifying enzyme that hydrolyzes non-peptide carbon-nitrogen bonds in choloylglycine and choloyltaurine, both of which are present in bile. CGH is present in a number of probiotic microbial organisms that inhabit the gut. CGH has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which CGH belongs. 291
22534 238886 cd01903 Ntn_AC_NAAA AC_NAAA This conserved domain includes two closely related proteins, acid ceramidase (AC, also known as N-acylsphingosine amidohydrolase), and N-acylethanolamine-hydrolyzing acid amidase (NAAA). AC catalyzes the hydrolysis of ceramide to sphingosine and fatty acid. Ceramide is required for the biosynthesis of most sphingolipids and plays an important role in many signal transduction pathways by inducing apoptosis and/or arresting cell growth. An inherited deficiency of AC activity leads to the lysosomal storage disorder known as Farber disease. AC is considered a "rheostat" important for maintaining the proper intracellular levels of these lipids since hydrolysis of ceramide is the only source of sphingosine in cells. NAAA is a eukaryotic glycoprotein that hydrolyzes bioactive N-acylethanolamines, including anandamide (an endocannabinoid) and N-palmitoylethanolamine (an anti-inflammatory and neuroprotective substance), to fatty acids and ethanolamine at acidic pH. NAAA shows structural and functional similarity to acid ceramidase, but lacks the ceramide-hydrolyzing activity of AC. 231
22535 238887 cd01906 proteasome_protease_HslV proteasome_protease_HslV. This group contains the eukaryotic proteasome alpha and beta subunits and the prokaryotic protease hslV subunit. Proteasomes are large multimeric self-compartmentalizing proteases, involved in the clearance of misfolded proteins, the breakdown of regulatory proteins, and the processing of proteins such as the preparation of peptides for immune presentation. Two main proteasomal types are distinguished by their different tertiary structures: the eukaryotic/archaeal 20S proteasome and the prokaryotic proteasome-like heat shock protein encoded by heat shock locus V, hslV. The proteasome core particle is a highly conserved cylindrical structure made up of non-identical subunits that have their active sites on the inner walls of a large central cavity. The proteasome subunits of bacteria, archaea, and eukaryotes all share a conserved Ntn (N terminal nucleophile) hydrolase fold and a catalytic mechanism involving an N-terminal nucleophilic threonine that is exposed by post-translational processing of an inactive propeptide. 182
22536 238888 cd01907 GlxB Glutamine amidotransferases class-II (Gn-AT)_GlxB-type. GlxB is a glutamine amidotransferase-like protein of unknown function found in bacteria and archaea. GlxB has a structural fold similar to that of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The GlxB fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 249
22537 238889 cd01908 YafJ Glutamine amidotransferases class-II (Gn-AT)_YafJ-type. YafJ is a glutamine amidotransferase-like protein of unknown function found in prokaryotes, eukaryotes and archaea. YafJ has a conserved structural fold similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The YafJ fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 257
22538 238890 cd01909 betaLS_CarA_N Glutamine amidotransferases class-II (GATase) asparagine synthase_betaLS-type. Carbapenam synthetase (CarA) is an ATP/Mg2+-dependent enzyme that catalyzes the formation of the beta-lactam ring in (5R)-carbapenem-3-carboxylic acid biosynthesis. CarA is homologous to beta-lactam synthetase (beta-LS), which is involved in the biosynthesis of clavulanic acid, a clinically important beta-lactamase inhibitor. CarA and beta-LS each have two distinct domains, an N-terminal Ntn hydrolase domain and a C-terminal synthetase domain, a domain architecture similar to that of the class-B asparagine synthetases (AS-B's). The N-terminal domain of these enzymes hydrolyzes glutamine to glutamate and ammonia. CarA forms a homotetramer while betaLS forms a heterodimer. The N-terminal folds of CarA and beta-LS are similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), and glutamate synthase (GltS). This fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 199
22539 238891 cd01910 Wali7 This domain is present in Wali7, a protein of unknown function, expressed in wheat and induced by aluminum. Wali7 has a single domain similar to the glutamine amidotransferase domain of glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The Wali7 domain is also somewhat similar to the Ntn hydrolase fold of the proteasomal alpha and beta subunits. 224
22540 238892 cd01911 proteasome_alpha proteasome alpha subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 different alpha and 10 different beta proteasome subunit genes while archaea have one of each. 209
22541 238893 cd01912 proteasome_beta proteasome beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 189
22542 238894 cd01913 protease_HslV Protease HslV and the ATPase/chaperone HslU are part of an ATP-dependent proteolytic system that is the prokaryotic homolog of the proteasome. HslV is a dimer of hexamers (a dodecamer) that forms a central proteolytic chamber with active sites on the interior walls of the cavity. HslV shares significant sequence and structural similarity with the proteasomal beta-subunit and both are members of the Ntn-family of hydrolases. HslV has a nucleophilic threonine residue at its N-terminus that is exposed after processing of the propeptide and is directly involved in active site catalysis. 171
22543 238895 cd01914 HCP Hybrid cluster protein (HCP), formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown. HCP has three structural domains, an N-terminal alpha-helical domain, and two similar domains comprising a central beta-sheet flanked by alpha-helices. HCP contains two iron-sulfur clusters, one of which is a [Fe4-S4] cubane cluster similar to that of carbon monoxide dehydrogenase (CODH). The second cluster, referred to as the hybrid cluster, is a hybrid [Fe4-S2-O2] center located at the interface of the three domains. Although the hybrid cluster is buried within the protein, it is accessible through a large hydrophobic cavity. 423
22544 238896 cd01915 CODH Carbon monoxide dehydrogenase (CODH) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA, respectively. CODH has two types of metal clusters, a cubane [Fe4-S4] center (B-cluster) similar to that of hybrid cluster protein (HCP) and a Ni-Fe-S center (C-cluster) where carbon monoxide oxidation occurs. Bifunctional CODH forms a heterotetramer with acetyl-CoA synthase (ACS) consisting of two CODH and two ACS subunits while monofunctional CODH forms a homodimer. Bifunctional CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP), while monofunctional CODH oxidizes carbon monoxide to carbon dioxide. CODH and ACS each have a metal cluster referred to as the C- and A-clusters, respectively. 613
22545 238897 cd01916 ACS_1 Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP). ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains. A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA. 731
22546 238898 cd01917 ACS_2 Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP). ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains. A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA. 287
22547 238899 cd01918 HprK_C HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of Ser-46 of HPr and its dephosphorylation by phosphorolysis. The latter reaction uses inorganic phosphate as substrate and produces pyrophosphate. Phosphoenolpyruvate carboxykinase (PEPCK) and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting these two phosphotransferases have related functions. The HprK/P N-terminal domain is structurally similar to the N-terminal domains of the MurE and MurF amino acid ligases. 149
22548 238900 cd01919 PEPCK Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups). 515
22549 238901 cd01920 cyclophilin_EcCYP_like cyclophilin_EcCYP_like: cyclophilin-type A-like peptidylprolyl cis-trans isomerase (PPIase) domain similar to the cytosolic E. coli cyclophilin A and Streptomyces antibioticus SanCyp18. Compared to the archetypal cyclophilin Human cyclophilin A, these have reduced affinity for cyclosporin A. E. coli cyclophilin A has a similar peptidylprolyl cis-trans isomerase activity to the human cyclophilin A. Most members of this subfamily contain a phenylalanine residue at the position equivalent to Human cyclophilin W121, where a tryptophan has been shown to be important for cyclophilin binding. 155
22550 238902 cd01921 cyclophilin_RRM cyclophilin_RRM: cyclophilin-type peptidylprolyl cis-trans isomerase domain occurring with a C-terminal RNA recognition motif domain (RRM). This subfamily of the cyclophilin domain family contains a number of eukaryotic cyclophilins having the RRM domain including the nuclear proteins: human hCyP-57, Arabidopsis thaliana AtCYP59, Caenorhabditis elegans CeCyP-44 and Paramecium tetraurelia Kin241. The Kin241 protein has been shown to have a role in cell morphogenesis. 166
22551 238903 cd01922 cyclophilin_SpCYP2_like cyclophilin_SpCYP2_like: cyclophilin 2-like peptidylprolyl cis-trans isomerase (PPIase) domain similar to Schizosaccharomyces pombe cyp-2. These proteins bind their respective SNW chromatin binding protein in autologous systems, in a CsA independent manner indicating interaction with a surface outside the PPIase active site. SNW proteins play a basic and broad range role in signaling. 146
22552 238904 cd01923 cyclophilin_RING cyclophilin_RING: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) having a modified RING finger domain. This group includes the nuclear proteins Human hCyP-60 and Caenorhabditis elegans MOG-6 which, compared to the archetypal cyclophilin Human cyclophilin A, exhibit reduced peptidylprolyl cis-trans isomerase activity and lack a residue important for cyclophilin binding. Human hCyP-60 has been shown to physically interact with the proteinase inhibitor peptide eglin c, and C. elegans MOG-6 to physically interact with MEP-1, a nuclear zinc finger protein. MOG-6 has been shown to function in germline sex determination. 159
22553 238905 cd01924 cyclophilin_TLP40_like cyclophilin_TLP40_like: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) similar to the Spinach thylakoid lumen protein TLP40. Compared to the archetypal cyclophilin Human cyclophilin A, these proteins have similar peptidylprolyl cis-trans isomerase activity and reduced affinity for cyclosporin A. Spinach TLP40 has been shown to have a dual function as a folding catalyst and regulator of dephosphorylation. 176
22554 238906 cd01925 cyclophilin_CeCYP16-like cyclophilin_CeCYP16-like: cyclophilin-type peptidylprolyl cis-trans isomerase (PPIase) domain similar to Caenorhabditis elegans cyclophilin 16. C. elegans CeCYP-16, compared to the archetypal cyclophilin Human cyclophilin A, has reduced peptidylprolyl cis-trans isomerase activity, is cyclosporin insensitive, and shows an altered substrate preference favoring hydrophobic, acidic, or amide amino acids. Most members of this subfamily have a glutamate residue in the active site at the position equivalent to a tryptophan (W121 in Human cyclophilin A), which has been shown to be important for cyclophilin binding. 171
22555 238907 cd01926 cyclophilin_ABH_like cyclophilin_ABH_like: Cyclophilin A, B and H-like cyclophilin-type peptidylprolyl cis-trans isomerase (PPIase) domain. This family represents the archetypal cytosolic cyclophilin similar to human cyclophilins A, B and H. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. These enzymes have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. As cyclophilins, Human hCyP-A, human cyclophilin-B (hCyP-19), S. cerevisiae Cpr1 and C. elegans Cyp-3 are inhibited by the immunosuppressive drug cyclosporin A (CsA). CsA binds to the PPIase active site. S. cerevisiae Cpr1 interacts with the Rpd3-Sin3 complex and in addition is a component of the Set3 complex. S. cerevisiae Cpr1 has also been shown to have a role in Zpr1p nuclear transport. Human cyclophilin H associates with the [U4/U6.U5] tri-snRNP particles of the spliceosome. 164
22556 238908 cd01927 cyclophilin_WD40 cyclophilin_WD40: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) having a WD40 domain. This group consists of several hypothetical and putative eukaryotic and bacterial proteins which have a cyclophilin domain and a WD40 domain. Function of the protein is not known. 148
22557 238909 cd01928 Cyclophilin_PPIL3_like Cyclophilin_PPIL3_like. Proteins similar to Human cyclophilin-like peptidylprolyl cis-trans isomerase (PPIL3). Members of this family lack a key residue important for cyclosporin binding: the tryptophan residue corresponding to W121 in human hCyP-18a; most members have a histidine at this position. The exact function of the protein is not known. 153
22558 238910 cd01935 Ntn_CGH_like Choloylglycine hydrolase (CGH)_like. This family of choloylglycine hydrolase-like proteins includes conjugated bile acid hydrolase (CBAH), penicillin V acylase (PVA), acid ceramidase (AC), and N-acylethanolamine-hydrolyzing acid amidase (NAAA) which cleave non-peptide carbon-nitrogen bonds in bile salt constituents. These enzymes have an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which they belong. This nucleophilic cysteine is exposed by post-translational processing of the precursor protein. 229
22559 238911 cd01936 Ntn_CA Cephalosporin acylase (CA) belongs to a family of beta-lactam acylases that includes penicillin G acylase (PGA) and aculeacin A acylase. PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively. While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity to other Ntn's is low. 469
22560 238912 cd01937 ribokinase_group_D Ribokinase-like subgroup D. Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 254
22561 238913 cd01938 ADPGK_ADPPFK ADP-dependent glucokinase (ADPGK) and phosphofructokinase (ADPPFK). ADPGK and ADPPFK are proteins that rely on ADP rather than ATP to donate a phosphoryl group. They are found in certain hyperthermophilic archaea and in higher eukaryotes. A functional ADPGK has been characterized in mouse and is assumed to be desirable during ischemia/hypoxia. ADPGK and ADPPFK contain a large and a small domain with the binding site located in a groove between the domains. Partial domain closing is seen when ADP is bound, and further domain closing is observed when glucose is also bound. The oligomerization state apparently varies depending on the species, with some existing as monomers, some as dimers, and some as tetramers. 445
22562 238914 cd01939 Ketohexokinase Ketohexokinase (fructokinase, KHK) catalyzes the phosphorylation of fructose to fructose-1-phosphate (F1P), the first step in the metabolism of dietary fructose. KHK can also phosphorylate several other furanose sugars. It is found in higher eukaryotes where it is believed to function as a dimer and requires K(+) and ATP to be active. In humans, hepatic KHK deficiency causes fructosuria, a benign inborn error of metabolism. 290
22563 238915 cd01940 Fructoselysine_kinase_like Fructoselysine kinase-like. Fructoselysine is a fructoseamine formed by glycation, a non-enzymatic reaction of glucose with a primary amine followed by an Amadori rearrangement, resulting in a protein that is modified at the amino terminus and at the lysine side chains. Fructoseamines are typically metabolized by fructoseamine-3-kinase, especially in higher eukaryotes. In E. coli, fructoselysine kinase has been shown in vitro to catalyze the phosphorylation of fructoselysine. It is proposed that fructoselysine is released from glycated proteins during human digestion and is partly metabolized by bacteria in the hind gut using a protein such as fructoselysine kinase. This family is found only in bacterial sequences, and its oligomeric state is currently unknown. 264
22564 238916 cd01941 YeiC_kinase_like YeiC-like sugar kinase. Found in eukaryotes and bacteria, YeiC-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 288
22565 238917 cd01942 ribokinase_group_A Ribokinase-like subgroup A. Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 279
22566 238918 cd01943 MAK32 MAK32 kinase. MAK32 is a protein found primarily in fungi that is necessary for the structural stability of L-A particles. The L-A virus particle is a specialized compartment for the transcription and replication of double-stranded RNA, known to infect yeast and other fungi. MAK32 is part of the host machinery used by the virus to multiply. 328
22567 238919 cd01944 YegV_kinase_like YegV-like sugar kinase. Found only in bacteria, YegV-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 289
22568 238920 cd01945 ribokinase_group_B Ribokinase-like subgroup B. Found in bacteria and plants, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 284
22569 238921 cd01946 ribokinase_group_C Ribokinase-like subgroup C. Found only in bacteria, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 277
22570 238922 cd01947 Guanosine_kinase_like Guanosine kinase-like sugar kinases. Found in bacteria and archaea, the guanosine kinase-like group is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 265
22571 238923 cd01948 EAL EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues and is also known as domain of unknown function 2 (DUF2). The EAL domain has been shown to stimulate degradation of a second messenger, cyclic di-GMP, and is a good candidate for a diguanylate phosphodiesterase function. Together with the GGDEF domain, EAL might be involved in regulating cell surface adhesiveness in bacteria. 240
22572 143635 cd01949 GGDEF Diguanylate-cyclase (DGC) or GGDEF domain. Diguanylate-cyclase (DGC) or GGDEF domain: Originally named after a conserved residue pattern, and initially described as a domain of unknown function 1 (DUF1). This domain is widely present in bacteria, linked to a wide range of non-homologous domains in a variety of cell signaling proteins. The domain shows homology to the adenylyl cyclase catalytic domain. This correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Together with the EAL domain, GGDEF might be involved in regulating cell surface adhesion in bacteria. 158
22573 173886 cd01951 lectin_L-type legume lectins. The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 223
22574 238924 cd01958 HPS_like HPS_like: Hydrophobic Protein from Soybean (HPS)-like subfamily; composed of proteins with similarity to HPS, a small hydrophobic protein with unknown function related to cereal-type alpha-amylase inhibitors and lipid transfer proteins. In addition to HPS, members of this subfamily include a hybrid proline-rich protein (HyPRP) from maize, a dark-inducible protein (LeDI-2) from Lithospermum erythrorhizon, maize ZRP3 protein, and rice RcC3 protein. HyPRP is an embryo-specific protein that contains an N-terminal proline-rich domain and a C-terminal HPS-like cysteine-rich domain. It has been suggested that HyPRP may be involved in the stability and defense of the developing embryo. LeDI-2 is a root-specific protein that may be involved in regulating the biosynthesis of shikonin derivatives in L. erythrorhizon. Maize ZRP3 and rice RcC3 are root-specific proteins whose functions are yet to be determined. It has been reported that ZRP3 largely accumulates in a distinct subset of cortical cells. 85
22575 238925 cd01959 nsLTP2 nsLTP2: Non-specific lipid-transfer protein type 2 (nsLTP2) subfamily; Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. In addition to lipid transport and assembly, nsLTPs also play a key role in the defense of plants against pathogens. There are two closely-related types of nsLTPs, types 1 and 2, which differ in protein sequence, molecular weight, and biological properties. nsLTPs contain an internal hydrophobic cavity, which serves as the binding site for lipids. nsLTP2 can bind lipids and sterols. Structure studies of rice nsLTPs show that the plasticity of the hydrophobic cavity is an important factor in ligand binding. The flexibility of the nsLTP2 cavity allows its binding to rigid sterol molecules, whereas nsLTP1 cannot bind sterols despite its larger cavity size. The resulting nsLTP2/sterol complexes may bind to receptors that trigger defense responses. nsLTP2 gene expression has been observed in barley and rice developing seeds, during Zinnia elegans cell differentiation, and under abiotic stress conditions in barley roots. The nsLTP2 of Brassica rapa has also been identified as a potent allergen. 66
22576 238926 cd01960 nsLTP1 nsLTP1: Non-specific lipid-transfer protein type 1 (nsLTP1) subfamily; Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. In addition to lipid transport and assembly, nsLTPs also play a key role in the defense of plants against pathogens. There are two closely-related types of nsLTPs, types 1 and 2, which differ in protein sequence, molecular weight, and biological properties. nsLTPs contain an internal hydrophobic cavity, which serves as the binding site for lipids. The hydrophobic cavity accommodates various fatty acid ligands containing from ten to 18 carbon atoms. In general, the cavity is larger in nsLTP1 than in nsLTP2. nsLTP1 proteins are located in extracellular layers and in vacuolar structures. They may be involved in the formation of cutin layers on plant surfaces by transporting cutin monomers. Many nsLTP1 proteins have been characterized as allergens in humans. 89
22577 238927 cd01965 Nitrogenase_MoFe_beta_like Nitrogenase_MoFe_beta_like: Nitrogenase MoFe protein, beta subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. This group contains the beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase). These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers having alpha and beta subunits similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains, a single [4Fe-4S] cluster from which electrons are transferred to the P-cluster of the MoFe and in turn, to FeMoCo, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during N2-reduction and, ethane as a minor product during acetylene reduction 428
22578 238928 cd01966 Nitrogenase_NifN_1 Nitrogenase_nifN1: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE. NifN and nifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The nifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this nifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco). 417
22579 238929 cd01967 Nitrogenase_MoFe_alpha_like Nitrogenase_MoFe_alpha_like: Nitrogenase MoFe protein, alpha subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. Three genetically distinct types of nitrogenase systems are known to exist: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase). These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). This group contains the alpha subunit of component 1 of all three different forms. The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers having alpha and beta subunits similar to the alpha and beta subunits of MoFe. The role of the delta subunit is unknown. For MoFe, each alphabeta pair of subunits contains one P-cluster (located at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein is a homodimer which contains, a single [4Fe-4S] cluster from which electrons are transferred to the P-cluster of the MoFe and in turn, to FeMoCo the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene The V-nitrogenase differs from the Mo- nitrogenase in that it produces free hydrazine, as a minor product during dinitrogen reduction and, ethane as a minor product during acetylene reduction. 406
22580 238930 cd01968 Nitrogenase_NifE_I Nitrogenase_NifE_I: a subgroup of the NifE subunit of the NifEN complex: NifE forms an alpha2beta2 tetramer with NifN. NifE and NifN are structurally homologous to nitrogenase MoFe protein alpha and beta subunits respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The NifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this NifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco). 410
22581 238931 cd01971 Nitrogenase_VnfN_like Nitrogenase_vnfN_like: VnfN subunit of the VnfEN complex-like. This group in addition to VnfN contains a subset of the beta subunit of the nitrogenase MoFe protein and NifN-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of MoFe protein of the molybdenum(Mo)-nitrogenase. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to NifEN where it is further processed to FeMoco. VnfEN may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of the vanadium-dependent (V)-nitrogenase. NifE and NifN are essential for the Mo-nitrogenase, VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated. 427
22582 238932 cd01972 Nitrogenase_VnfE_like Nitrogenase_VnfE_like: VnfE subunit of the VnfEN complex_like. This group in addition to VnfE contains a subset of the alpha subunit of the nitrogenase MoFe protein and NifE-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of MoFe protein of the molybdenum(Mo)-nitrogenase. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to NifEN where it is further processed to FeMoco. VnfEN may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of the vanadium-dependent (V)-nitrogenase. NifE and NifN are essential for the Mo-nitrogenase, VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated. 426
22583 238933 cd01973 Nitrogenase_VFe_beta_like Nitrogenase_VFe_beta -like: Nitrogenase VFe protein, beta subunit like. This group contains proteins similar to the beta subunits of the VFe protein of the vanadium-dependent (V-) nitrogenase. Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V-nitrogenase there is a molybdenum (Mo)-dependent nitrogenase and an iron only (Fe-) nitrogenase. The Mo-nitrogenase is the most widespread and best characterized of these systems. These systems consist of component 1 (VFe protein, FeFe protein or, MoFe protein respectively) and, component 2 (Fe protein). MoFe is an alpha2beta2 tetramer, V-and Fe- nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein which has a practically identical structure in all three systems, it contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during dinitrogen reduction and, ethane as a minor product during acetylene reduction. 454
22584 238934 cd01974 Nitrogenase_MoFe_beta Nitrogenase_MoFe_beta: Nitrogenase MoFe protein, beta subunit. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The Molybdenum (Mo-) nitrogenase is the most widespread and best characterized of these systems. Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2). MoFe is an alpha2beta2 tetramer. This group contains the beta subunit of the MoFe protein. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction. 435
22585 238935 cd01976 Nitrogenase_MoFe_alpha Nitrogenase_MoFe_alpha_II: Nitrogenase MoFe protein, alpha subunit. A group of proteins similar to the alpha subunit of the MoFe protein of the molybdenum (Mo-) nitrogenase. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The Mo-nitrogenase is the most widespread and best characterized of these systems. Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2). MoFe is an alpha2beta2 tetramer. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction. 421
22586 238936 cd01977 Nitrogenase_VFe_alpha Nitrogenase_VFe_alpha -like: Nitrogenase VFe protein, alpha subunit like. This group contains proteins similar to the alpha subunits of, the VFe protein of the vanadium-dependent (V-) nitrogenase and the FeFe protein of the iron only (Fe-) nitrogenase Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V- and Fe- nitrogenases there is a molybdenum (Mo)-dependent nitrogenase which is the most widespread and best characterized of these systems. These systems consist of component 1 (VFe protein, FeFe protein or, MoFe protein respectively) and, component 2 (Fe protein). MoFe is an alpha2beta2 tetramer, V-and Fe- nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein which has a practically identical structure in all three systems, it contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during dinitrogen reduction and, ethane as a minor product during acetylene reduction. 415
22587 238937 cd01979 Pchlide_reductase_N Pchlide_reductase_N: N protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls. This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR). Angiosperms contain only LPOR; cyanobacteria, algae and gymnosperms contain both DPOR and LPOR; primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the FeMo protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the FeMo alpha and beta subunits respectively. Also in common with nitrogenase, in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction similar to MoFe for nitrogen reduction. 396
22588 238938 cd01980 Chlide_reductase_Y Chlide_reductase_Y: Y subunit of chlorophyllide (chlide) reductase (BchY). Chlide reductase participates in photosynthetic pigment synthesis playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits. 416
22589 238939 cd01981 Pchlide_reductase_B Pchlide_reductase_B: B protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls. This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR). Angiosperms contain only LPOR; cyanobacteria, algae and gymnosperms contain both DPOR and LPOR; primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the FeMo protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the FeMo alpha and beta subunits respectively. Also in common with nitrogenase, in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction similar to MoFe for nitrogen reduction. 430
22590 238940 cd01982 Chlide_reductase_Z Chlide_reductase_Z: Z subunit of chlorophyllide (chlide) reductase (BchZ). Chlide reductase participates in photosynthetic pigment synthesis playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits. 412
22591 349751 cd01983 SIMIBI SIMIBI (signal recognition particle, MinD and BioD)-class NTPases. SIMIBI (after signal recognition particle, MinD, and BioD), consists of signal recognition particle (SRP) GTPases, the assemblage of MinD-like ATPases, which are involved in protein localization, chromosome partitioning, and membrane transport, and a group of metabolic enzymes with kinase or related phosphate transferase activity. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion. 107
22592 238942 cd01984 AANH_like Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds adenosine nucleotide. 86
22593 238943 cd01985 ETF The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein FixB which could play a role in a redox process and feed electrons to ferredoxin. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit. 181
22594 238944 cd01986 Alpha_ANH_like Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases and ATP sulphurylases. The domain forms an alpha/beta/alpha fold which binds the adenosine group. 103
22595 238945 cd01987 USP_OKCHK USP domain is located between the N-terminal sensor domain and C-terminal catalytic domain of this Osmosensitive K+ channel histidine kinase family. The family of KdpD sensor kinase proteins regulates the kdpFABC operon responsible for potassium transport. The USP domain is homologous to the universal stress protein Usp. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. 124
22596 238946 cd01988 Na_H_Antiporter_C The C-terminal domain of a subfamily of Na+/H+ antiporters found in bacteria and archaea. Na+/H+ exchange proteins eject protons from cells, effectively eliminating excess acid from actively metabolising cells. Na+/H+ exchange activity is also crucial for the regulation of cell volume, and for the reabsorption of NaCl across renal, intestinal, and other epithelia. These antiports exchange Na+ for H+ in an electroneutral manner, and this activity is carried out by a family of Na+/H+ exchangers, or NHEs, which are known to be present in both prokaryotic and eukaryotic cells. These exchangers are highly-regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region, or C-terminal domain, has homology with the universal stress protein (Usp) family. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. 132
22597 238947 cd01989 STK_N The N-terminal domain of Eukaryotic Serine Threonine kinases. The Serine Threonine kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common with both serine/threonine and tyrosine protein kinases. The N-terminal domain is homologous to the USP family which has a ATP binding fold. The N-terminal domain is predicted to be involved in ATP binding. 146
22598 238948 cd01990 Alpha_ANH_like_I This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds the adenosine group. This subfamily of proteins probably binds ATP. This domain is about 200 amino acids long with a strongly conserved motif SGGKD at the N terminus. 202
22599 238949 cd01991 Asn_Synthase_B_C The C-terminal domain of Asparagine Synthase B. This domain is always found associated with an N-terminal amidotransferase domain. Family members that contain this domain catalyse the conversion of aspartate to asparagine. Asparagine synthetase B catalyzes the assembly of asparagine from aspartate, Mg(2+)ATP, and glutamine. The three-dimensional architecture of the N-terminal domain of asparagine synthetase B is similar to that observed for glutamine phosphoribosylpyrophosphate amidotransferase while the molecular motif of the C-domain is reminiscent of that observed for GMP synthetase. 269
22600 238950 cd01992 PP-ATPase N-terminal domain of predicted ATPase of the PP-loop family implicated in cell cycle control [Cell division and chromosome partitioning]. This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds the adenosine group. This domain has a strongly conserved motif SGGXD at the N terminus. 185
22601 238951 cd01993 Alpha_ANH_like_II This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 185
22602 238952 cd01994 Alpha_ANH_like_IV This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 194
22603 238953 cd01995 ExsB ExsB is a transcription regulator related protein. It is a subfamily of a Adenosine nucleotide binding superfamily of proteins. This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti, a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA. In Arthrobacter viscosus, the homologous gene is designated ALU1 and is associated with an aluminum tolerance phenotype. The function is unknown 169
22604 238954 cd01996 Alpha_ANH_like_III This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 154
22605 238955 cd01997 GMP_synthase_C The C-terminal domain of GMP synthetase. It contains two subdomains: the ATP pyrophosphatase (ATP-PPase) domain, which lies closer to the N-terminus, and the dimerization domain at the C-terminal end. The ATP-PPase is a twisted, five-stranded parallel beta-sheet sandwiched between helical layers. It has a signature nucleotide-binding motif, or P-loop, at the end of the first beta strand. The dimerization domain is formed by the C-terminal 115 amino acids in prokaryotic proteins. It is adjacent to the ATP-binding site of the ATP-PPase subdomain. The largest differences between the primary sequences of prokaryotic and eukaryotic GMP synthetases map to the dimerization domain. Eukaryotic GMP synthetase has several large insertions relative to prokaryotes. 295
22606 238956 cd01998 tRNA_Me_trans tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs. This family of enzymes is present only in bacteria and eukaryotes. The archaeal counterpart of this enzyme performs the same function, but is completely unrelated in sequence. 349
22607 238957 cd01999 Argininosuccinate_Synthase Argininosuccinate synthase. The Argininosuccinate synthase is a urea cycle enzyme that catalyzes the penultimate step in arginine biosynthesis: the ATP-dependent ligation of citrulline to aspartate to form argininosuccinate, AMP and pyrophosphate . In humans, a defect in the AS gene causes citrullinemia, a genetic disease characterized by severe vomiting spells and mental retardation. AS is a homotetrameric enzyme of chains of about 400 amino-acid residues. An arginine seems to be important for the enzyme's catalytic mechanism. The sequences of AS from various prokaryotes, archaebacteria and eukaryotes show significant similarity 385
22608 238958 cd02000 TPP_E1_PDC_ADC_BCADC Thiamine pyrophosphate (TPP) family, E1 of PDC_ADC_BCADC subfamily, TPP-binding module; composed of proteins similar to the E1 components of the human pyruvate dehydrogenase complex (PDC), the acetoin dehydrogenase complex (ADC) and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC). PDC catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. ADC participates in the breakdown of acetoin while BCADC participates in the breakdown of branched chain amino acids. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate (branched chain 2-oxo acids derived from the transamination of leucine, valine and isoleucine). 293
22609 238959 cd02001 TPP_ComE_PpyrDC Thiamine pyrophosphate (TPP) family, ComE and PpyrDC subfamily, TPP-binding module; composed of proteins similar to sulfopyruvate decarboxylase beta subunit (ComE) and phosphonopyruvate decarboxylase (Ppyr decarboxylase). Methanococcus jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six beta (E) subunits, which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. Ppyr decarboxylase is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. Ppyr decarboxylase and ComDE require TPP and divalent metal cation cofactors. 157
22610 238960 cd02002 TPP_BFDC Thiamine pyrophosphate (TPP) family, BFDC subfamily, TPP-binding module; composed of proteins similar to Pseudomonas putida benzoylformate decarboxylase (BFDC). P. putida BFDC plays a role in the mandelate pathway, catalyzing the conversion of benzoylformate to benzaldehyde and carbon dioxide. This enzyme is dependent on TPP and a divalent metal cation as cofactors. 178
22611 238961 cd02003 TPP_IolD Thiamine pyrophosphate (TPP) family, IolD subfamily, TPP-binding module; composed of proteins similar to Rhizobium leguminosarum bv. viciae IolD. IolD plays an important role in myo-inositol catabolism. 205
22612 238962 cd02004 TPP_BZL_OCoD_HPCL Thiamine pyrophosphate (TPP) family, BZL_OCoD_HPCL subfamily, TPP-binding module; composed of proteins similar to benzaldehyde lyase (BZL), oxalyl-CoA decarboxylase (OCoD) and 2-hydroxyphytanoyl-CoA lyase (2-HPCL). Pseudomonas fluorescens biovar I BZL cleaves the acyloin linkage of benzoin producing 2 molecules of benzaldehyde and enabling the Pseudomonas to grow on benzoin as the sole carbon and energy source. OCoD has a role in the detoxification of oxalate, catalyzing the decarboxylation of oxalyl-CoA to formate. 2-HPCL is a peroxisomal enzyme which plays a role in the alpha-oxidation of 3-methyl-branched fatty acids, catalyzing the cleavage of 2-hydroxy-3-methylacyl-CoA into formyl-CoA and a 2-methyl-branched fatty aldehyde. All these enzymes depend on Mg2+ and TPP for activity. 172
22613 238963 cd02005 TPP_PDC_IPDC Thiamine pyrophosphate (TPP) family, PDC_IPDC subfamily, TPP-binding module; composed of proteins similar to pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC). PDC, a key enzyme in alcoholic fermentation, catalyzes the conversion of pyruvate to acetaldehyde and CO2. It is able to utilize other 2-oxo acids as substrates. In plants and various plant-associated bacteria, IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway, a tryptophan-dependent biosynthetic route to indole-3-acetaldehyde (IAA). IPDC catalyzes the decarboxylation of IPA to IAA. Both PDC and IPDC depend on TPP and Mg2+ as cofactors. 183
22614 238964 cd02006 TPP_Gcl Thiamine pyrophosphate (TPP) family, Gcl subfamily, TPP-binding module; composed of proteins similar to Escherichia coli glyoxylate carboligase (Gcl). E. coli glyoxylate carboligase, plays a key role in glyoxylate metabolism where it catalyzes the condensation of two molecules of glyoxylate to give tartronic semialdehyde and carbon dioxide. This enzyme requires TPP, magnesium ion and FAD as cofactors. 202
22615 238965 cd02007 TPP_DXS Thiamine pyrophosphate (TPP) family, DXS subfamily, TPP-binding module; 1-Deoxy-D-xylulose-5-phosphate synthase (DXS) is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis. Terpenoids are plant natural products with important pharmaceutical activity. DXS catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. The formation of DXP leads to the formation of the terpene precursor IPP (isopentenyl diphosphate) and to the formation of thiamine (vitamin B1) and pyridoxal (vitamin B6). 195
22616 238966 cd02008 TPP_IOR_alpha Thiamine pyrophosphate (TPP) family, IOR-alpha subfamily, TPP-binding module; composed of proteins similar to indolepyruvate ferredoxin oxidoreductase (IOR) alpha subunit. IOR catalyzes the oxidative decarboxylation of arylpyruvates, such as indolepyruvate or phenylpyruvate, which are generated by the transamination of aromatic amino acids, to the corresponding aryl acetyl-CoA. 178
22617 238967 cd02009 TPP_SHCHC_synthase Thiamine pyrophosphate (TPP) family, SHCHC synthase subfamily, TPP-binding module; composed of proteins similar to Escherichia coli 2-succinyl-6-hydroxyl-2,4-cyclohexadiene-1-carboxylic acid (SHCHC) synthase (also called MenD). SHCHC synthase plays a key role in the menaquinone biosynthetic pathway, converting isochorismate and 2-oxoglutarate to SHCHC, pyruvate and carbon dioxide. The enzyme requires TPP and a divalent metal cation for activity. 175
22618 238968 cd02010 TPP_ALS Thiamine pyrophosphate (TPP) family, Acetolactate synthase (ALS) subfamily, TPP-binding module; composed of proteins similar to Klebsiella pneumoniae ALS, a catabolic enzyme required for butanediol fermentation. ALS catalyzes the conversion of 2 molecules of pyruvate to acetolactate and carbon dioxide. ALS does not contain FAD, and requires TPP and a divalent metal cation for activity. 177
22619 238969 cd02011 TPP_PK Thiamine pyrophosphate (TPP) family, Phosphoketolase (PK) subfamily, TPP-binding module; PK catalyzes the conversion of D-xylulose 5-phosphate and phosphate to acetyl phosphate, D-glyceraldehyde-3-phosphate and H2O. This enzyme requires divalent magnesium ions and TPP for activity. 227
22620 238970 cd02012 TPP_TK Thiamine pyrophosphate (TPP) family, Transketolase (TK) subfamily, TPP-binding module; TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. In addition, the enzyme plays a central role in the Calvin cycle in plants. Typically, TKs are homodimers. They require TPP and divalent cations, such as magnesium ions, for activity. 255
22621 238971 cd02013 TPP_Xsc_like Thiamine pyrophosphate (TPP) family, Xsc-like subfamily, TPP-binding module; composed of proteins similar to Alcaligenes defragrans sulfoacetaldehyde acetyltransferase (Xsc). Xsc plays a key role in the degradation of taurine, catalyzing the desulfonation of 2-sulfoacetaldehyde into sulfite and acetyl phosphate. This enzyme requires TPP and divalent metal ions for activity. 196
22622 238972 cd02014 TPP_POX Thiamine pyrophosphate (TPP) family, Pyruvate oxidase (POX) subfamily, TPP-binding module; composed of proteins similar to Lactobacillus plantarum POX, which plays a key role in controlling acetate production under aerobic conditions. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. It requires FAD in addition to TPP and a divalent cation as cofactors. 178
22623 238973 cd02015 TPP_AHAS Thiamine pyrophosphate (TPP) family, Acetohydroxyacid synthase (AHAS) subfamily, TPP-binding module; composed of proteins similar to the large catalytic subunit of AHAS. AHAS catalyzes the condensation of two molecules of pyruvate to give the acetohydroxyacid, 2-acetolactate. 2-Acetolactate is the precursor of the branched chain amino acids, valine and leucine. AHAS also catalyzes the condensation of pyruvate and 2-ketobutyrate to form 2-aceto-2-hydroxybutyrate in isoleucine biosynthesis. In addition to requiring TPP and a divalent metal ion as cofactors, AHAS requires FAD. 186
22624 238974 cd02016 TPP_E1_OGDC_like Thiamine pyrophosphate (TPP) family, E1 of OGDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the 2-oxoglutarate dehydrogenase multienzyme complex (OGDC). OGDC catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA and carbon dioxide, a key reaction of the tricarboxylic acid cycle. 265
22625 238975 cd02017 TPP_E1_EcPDC_like Thiamine pyrophosphate (TPP) family, E1 of E. coli PDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the Escherichia coli pyruvate dehydrogenase multienzyme complex (PDC). PDC catalyzes the oxidative decarboxylation of pyruvate and the subsequent acetylation of coenzyme A to acetyl-CoA. The E1 component of PDC catalyzes the first step of the multistep process, using TPP and a divalent cation as cofactors. E. coli PDC is a homodimeric enzyme. 386
22626 238976 cd02018 TPP_PFOR Thiamine pyrophosphate (TPP) family, Pyruvate ferredoxin/flavodoxin oxidoreductase (PFOR) subfamily, TPP-binding module; PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. PFORs can be homodimeric, heterodimeric, or heterotetrameric, depending on the organism. These enzymes are dependent on TPP and a divalent metal cation as cofactors. 237
22627 238977 cd02019 NK Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate. 69
22628 238978 cd02020 CMPK Cytidine monophosphate kinase (CMPK) catalyzes the reversible phosphorylation of cytidine monophosphate (CMP) to produce cytidine diphosphate (CDP), using ATP as the preferred phosphoryl donor. 147
22629 238979 cd02021 GntK Gluconate kinase (GntK) catalyzes the phosphoryl transfer from ATP to gluconate. The resulting product gluconate-6-phosphate is an important precursor of gluconate metabolism. GntK acts as a dimer composed of two identical subunits. 150
22630 238980 cd02022 DPCK Dephospho-coenzyme A kinase (DPCK, EC 2.7.1.24) catalyzes the phosphorylation of dephosphocoenzyme A (dCoA) to yield CoA, which is the final step in CoA biosynthesis. 179
22631 238981 cd02023 UMPK Uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK), catalyzes the reversible phosphoryl transfer from ATP to uridine or cytidine to yield UMP or CMP. In the pyrimidine nucleotide-salvage pathway, this enzyme combined with nucleoside diphosphate kinases further phosphorylates UMP and CMP to form UTP and CTP. This kinase also catalyzes the phosphorylation of several cytotoxic ribonucleoside analogs such as 5-fluorouridine and cyclopentenyl-cytidine. 198
22632 238982 cd02024 NRK1 Nicotinamide riboside kinase (NRK) is an enzyme involved in the metabolism of nicotinamide adenine dinucleotide (NAD+). This enzyme catalyzes the phosphorylation of nicotinamide riboside (NR) to form nicotinamide mononucleotide (NMN). It defines the NR salvage pathway of NAD+ biosynthesis in addition to the pathways through nicotinic acid mononucleotide (NaMN). This enzyme can also phosphorylate the anticancer drug tiazofurin, which is an analog of nicotinamide riboside. 187
22633 238983 cd02025 PanK Pantothenate kinase (PanK) catalyzes the phosphorylation of pantothenic acid to form 4'-phosphopantothenic acid, which is the first of five steps in the coenzyme A (CoA) biosynthetic pathway. The reaction carried out by this enzyme is a key regulatory point in CoA biosynthesis. 220
22634 238984 cd02026 PRK Phosphoribulokinase (PRK) is an enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. This enzyme catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis. 273
22635 238985 cd02027 APSK Adenosine 5'-phosphosulfate kinase (APSK) catalyzes the phosphorylation of adenosine 5'-phosphosulfate to form 3'-phosphoadenosine 5'-phosphosulfate (PAPS). The end-product PAPS is a biologically "activated" sulfate form important for the assimilation of inorganic sulfate. 149
22636 238986 cd02028 UMPK_like Uridine monophosphate kinase_like (UMPK_like) is a family of proteins highly similar to the uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK). 179
22637 238987 cd02029 PRK_like Phosphoribulokinase-like (PRK-like) is a family of proteins similar to phosphoribulokinase (PRK), the enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. PRK catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis. 277
22638 238988 cd02030 NDUO42 NADH:Ubiquinone oxidoreductase, 42 kDa (NDUO42) is a family of proteins that are highly similar to deoxyribonucleoside kinases (dNK). Members of this family have been identified as one of the subunits of NADH:Ubiquinone oxidoreductase (complex I), a multi-protein complex located in the inner mitochondrial membrane. The main function of the complex is to transport electrons from NADH to ubiquinone, which is accompanied by the translocation of protons from the mitochondrial matrix to the intermembrane space. 219
22639 349752 cd02032 Bchl-like L-subunit of protochlorophyllide reductase. This family of proteins contains BchL and ChlL. Protochlorophyllide reductase catalyzes the reductive formation of chlorophyllide from protochlorophyllide during biosynthesis of chlorophylls and bacteriochlorophylls. Three genes, bchL, bchN and bchB, are involved in light-independent protochlorophyllide reduction in bacteriochlorophyll biosynthesis. In cyanobacteria, algae, and gymnosperms, three similar genes, chlL, chlN and chlB are involved in protochlorophyllide reduction during chlorophylls biosynthesis. BchL/chlL, bchN/chlN and bchB/chlB exhibit significant sequence similarity to the nifH, nifD and nifK subunits of nitrogenase, respectively. Nitrogenase catalyzes the reductive formation of ammonia from dinitrogen. 267
22640 349753 cd02033 BchX X-subunit of protochlorophyllide reductase. Chlorophyllide reductase converts chlorophylls into bacteriochlorophylls by reducing the chlorin B-ring. This family contains the X subunit of this three-subunit enzyme. Sequence and structure similarity between bchX, protochlorophyllide reductase L subunit (bchL and chlL) and nitrogenase Fe protein (nifH gene) suggest their functional similarity. Members of the BchX family serve as the unique electron donors to their respective catalytic subunits (bchN-bchB, bchY-bchZ and nitrogenase component 1). Mechanistically, they hydrolyze ATP and transfer electrons through a Fe4-S4 cluster. 329
22641 349754 cd02034 CooC1 accessory protein CooC1. The accessory protein CooC1, a nickel-binding ATPase, participates in the incorporation of nickel into the complex active site ([Ni-4Fe-4S]) cluster of Ni,Fe-dependent carbon monoxide dehydrogenase (CODH). CODH from Rhodospirillum rubrum catalyzes the reversible oxidation of CO to CO2. CODH contains a nickel-iron-sulfur cluster (C-center) and an iron-sulfur cluster (B-center). CO oxidation occurs at the C-center. Three accessory proteins encoded by cooCTJ genes are involved in nickel incorporation into a nickel site. CooC functions as a nickel insertase that mobilizes nickel to apoCODH using energy released from ATP hydrolysis. CooC is a homodimer and has NTPase activities. Mutation at the P-loop abolishes its function. 249
22642 349755 cd02035 ArsA Arsenical pump-driving ATPase ArsA. ArsA ATPase functions as an efflux pump located on the inner membrane of the cell. This ATP-driven oxyanion pump catalyzes the extrusion of arsenite, antimonite and arsenate. Maintenance of a low intracellular concentration of oxyanion produces resistance to the toxic agents. The pump is composed of two subunits, the catalytic ArsA subunit and the membrane subunit ArsB, which are encoded by arsA and arsB genes, respectively. Arsenic efflux in bacteria is catalyzed by either ArsB alone or by ArsAB complex. The ATP-coupled pump, however, is more efficient. ArsA is composed of two homologous halves, A1 and A2, connected by a short linker sequence. 250
22643 349756 cd02036 MinD septum site-determining protein MinD. Septum site-determining protein MinD is part of the operon MinCDE that determines the site of the formation of a septum at mid-cell, an important part of bacterial cell division. MinC is a nonspecific inhibitor of the septum protein FtsZ. MinE is the suppressor of MinC. MinD plays a pivotal role, selecting the mid-cell over other sites through the activation and regulation of MinC and MinE. MinD is a membrane-associated ATPase, related to nitrogenase iron protein. 236
22644 349757 cd02037 Mrp_NBP35 Mrp/NBP35 ATP-binding protein family. Mrp/NBP35 ATP-binding family proteins are typically iron-sulfur (FeS) cluster scaffolds that function to assemble nascent FeS clusters for transfer to FeS-requiring enzymes. Members include the eukaryotic nucleotide-binding protein 1 (NUBP1) which is a component of the cytosolic iron-sulfur (Fe/S) protein assembly (CIA) machinery and the archaeal [NiFe] hydrogenase maturation protein HypB which is required for nickel insertion into [NiFe] hydrogenase. 213
22645 349758 cd02038 FlhG-like MinD-like ATPase FlhG. FlhG is a member of the SIMIBI superfamily. FlhG (also known as YlxH) is a major determinant for a variety of flagellation patterns. It affects the location and number of bacterial flagella during C-ring assembly. 230
22646 185678 cd02039 cytidylyltransferase_like Cytidylyltransferase-like domain. Cytidylyltransferase-like domain. Many of these proteins are known to use CTP or ATP and release pyrophosphate. Protein families that contain at least one copy of this domain include citrate lyase ligase, pantoate-beta-alanine ligase, glycerol-3-phosphate cytidyltransferase, ADP-heptose synthase, phosphocholine cytidylyltransferase, lipopolysaccharide core biosynthesis protein KdtB, the bifunctional protein NadR, and a number whose function is unknown. 143
22647 349759 cd02040 NifH nitrogenase component II NifH. NifH gene encodes component II (iron protein) of nitrogenase. Nitrogenase is responsible for the biological nitrogen fixation, i.e. reduction of molecular nitrogen to ammonia. Nitrogenase consists of two oxygen-sensitive metallosulfur proteins: the molybdenum-iron (alternatively, vanadium-iron or iron-iron) protein (commonly referred to as component 1), and the iron protein (commonly referred to as component 2). The iron protein is a homodimer, with an Fe4S4 cluster bound between the subunits and two ATP-binding domains. It supplies energy by ATP hydrolysis, and transfers electrons from reduced ferredoxin or flavodoxin to component 1 for the reduction of molecular nitrogen to ammonia. 265
22648 349760 cd02042 ParAB_family partition proteins ParAB family. ParA and ParB of Caulobacter crescentus belong to a conserved family of bacterial proteins implicated in chromosome segregation. ParB binds to DNA sequences adjacent to the origin of replication and localizes to opposite cell poles shortly following the initiation of DNA replication. ParB regulates the ParA ATPase activity by promoting nucleotide exchange in a fashion reminiscent of the exchange factors of eukaryotic G proteins. ADP-bound ParA binds single-stranded DNA, whereas the ATP-bound form dissociates ParB from its DNA binding sites. Increasing the fraction of ParA-ADP in the cell inhibits cell division, suggesting that this simple nucleotide switch may regulate cytokinesis. ParA shares sequence similarity to a conserved and widespread family of ATPases which includes the repA protein of the repABC operon in Rhizobium etli symbiotic plasmid. This operon is involved in the plasmid replication and partition. 130
22649 381001 cd02043 serpinP_plants serpin family P, plant serpins. Plant SERine Proteinase INhibitors (serpins) are potent inhibitors of a range of mammalian serine proteases in vitro, and at least seven serpin genes are expressed in Arabidopsis. Serpins from plants display a wide range of functions including protection of storage protein degradation by exogenous proteases and seed survival within the herbivore digestive tract. Comparison between Arabidopsis AtSerpin1 and other serpins reveals several distinguishing features including a plant-specific insertion between s2B and s3B, with a plant-specific motif YXXGXDXRXF and the presence of a beta-bulge in strand s2C. The conserved Asp-230 and Arg-232 in the motif form a network of hydrogen bonds that stabilizes a loop region, which is otherwise disordered in many other serpin structures. AtSerpin1 is targeted to the secretory pathway and was shown to interact with cysteine protease RD21 (RESPONSIVE TO DESICCATION-21). RD21 accepts peptides and ligates them to the N termini of acceptor proteins so it has been proposed that AtSerpin1 functions to curb this activity. This subgroup corresponds to clade P of the serpin superfamily. In general, serpins exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 382
22650 381002 cd02045 serpinC1_AT3 serpin family C member 1, antithrombin III. Antithrombin III (AT3/ATIII) is a non-vitamin K-dependent serine protease that inhibits coagulation by neutralizing the enzymatic activity of thrombin (factors IIa, IXa, Xa). It is the most important anticoagulant molecule in mammalian circulation systems, controlled by its interaction with the cofactor, heparin, which accelerates its interaction with target proteases. This subgroup corresponds to clade C of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 395
22651 381003 cd02046 serpinH1_CBP1 serpin family H member 1, collagen-binding protein 1. Collagen-binding protein 1 (CBP1, also called heat shock protein 47/hsp47 or colligin), because of its collagen binding ability, is a chaperone specific protein for the correct folding of types I-V procollagen in the endoplasmic reticulum (ER). It is induced under stress conditions through heat shock element-heat shock factor interaction and has been shown to be essential for collagen biosynthesis. Hsp47 transiently binds to procollagen in the ER, dissociates in the cis-Golgi or ER-Golgi intermediate compartment, and is then transported back to the ER via its RDEL retention sequence. Hsp47 recognizes collagenous (Gly-Xaa-Arg) repeats on triple-helical procollagen and can prevent local unfolding and/or aggregate formation of procollagen. Hsp47 is a non-inhibitory member of the SERPIN superfamily and corresponds to clade H. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 382
22652 381004 cd02047 serpinD1_HCF2 serpin family D member 1, Heparin cofactor II. Heparin cofactor II (HCF2/HC-II, also called protease inhibitor leuserpin-2/hLS2) is a protein encoded by the SERPIND1 gene that inhibits thrombin, the final protease of the coagulation cascade. HCII is allosterically activated by binding to cell surface glycosaminoglycans (GAGs). The specificity of HCII for thrombin is conferred by a highly acidic hirudin-like N-terminal tail, which becomes available after GAG binding for interaction with the anion-binding exosite I of thrombin. HCII deficiency can lead to increased thrombin generation and a hypercoagulable state. This subgroup corresponds to clade D of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 449
22653 381005 cd02048 serpinI1_NSP serpin family I member 1, neuroserpin. Neuroserpin (NSP, also called proteinase inhibitor 12/PI-12) is an inhibitory member of the serpin family that reacts preferentially with tissue-type plasminogen activator (tPA). It is located in neurons in regions of the brain where tPA is also found, suggesting that neuroserpin is the selective inhibitor of tPA in the central nervous system (CNS). This subgroup corresponds to clade I of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 372
22654 381006 cd02050 serpinG1_C1-INH serpin family G member 1, plasma proteinase C1 inhibitor. Plasma proteinase C1 inhibitor (C1-INH/C1IN) is a protease inhibitor of the serpin family. It plays a pivotal role in regulating the activation of the classical complement pathway and of the contact system, via regulating bradykinin formation, inhibiting factor XII and kallikrein of the contact system, and via acting on factor XI in the coagulation cascade. This subgroup corresponds to clade G of the serpin superfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 362
22655 381007 cd02051 serpinE1_PAI-1 serpin family E member 1, plasminogen activator inhibitor-1. Plasminogen activator inhibitor-1 (PAI-1/PLANH1, also called endothelial PAI) is the primary, fast-acting inhibitor of plasminogen activators. It is often bound to vitronectin, an abundant component of the extracellular matrix in many tissues. PAI1 deficiency is a rare bleeding disorder that causes excessive or prolonged bleeding due to blood clots being broken down too early. PAI-1 is a member of the serpin superfamily and belongs to clade E. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 374
22656 381008 cd02052 serpinF1_PEDF serpin family F member 1, Pigment epithelium-derived factor (PEDF). Pigment epithelium-derived factor (PEDF, also called capsin or EPC-1) is an extracellular component of the retinal interphotoreceptor matrix, vitreous humor, and aqueous humor of the adult eye. PEDF is a non-inhibitory member of the serpin superfamily. It exhibits neurotrophic, neuroprotective and antiangiogenic properties and is widely expressed in the developing and adult nervous systems. This subgroup corresponds to clade F1 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 373
22657 381009 cd02053 serpinF2_A2AP serpin family F member 2, alpha2-antiplasmin inhibitor. Alpha2-antiplasmin inhibitor (A2AP/API, also called plasmin inhibitor/PLI or alpha-2-antiplasmin) is the primary inhibitor of plasmin, a proteinase that digests fibrin, the main component of blood clots. Alpha2AP forms an inactive 1:1 stoichiometric complex with plasmin. It also rapidly crosslinks to fibrin during blood clotting by activated coagulation factor XIII, and as a consequence fibrin becomes more resistant to fibrinolysis. Therefore alpha2AP is important in modulating the effectiveness and persistence of fibrin with respect to its susceptibility to digestion and removal by plasmin. This subgroup corresponds to clade F2 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 363
22658 381010 cd02054 serpinA8_AGT serpin family A member 8, angiotensinogen. Angiotensinogen (AGT) is part of the renin-angiotensin system (RAS), which plays an important role in blood pressure regulation, renal hemodynamics, as well as fluid and electrolyte homeostasis. It is also involved in normal and abnormal growth processes. The growth promoting actions of angiotensin have been shown in a variety of cells and tissues. This subgroup represents clade A8 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 446
22659 381011 cd02055 serpinA10_PZI serpin family A member 10, protein Z-dependent protease inhibitor. Protein Z-dependent protease inhibitor (ZPI) is a member of the serpin superfamily of proteinase inhibitors (clade A10). ZPI inhibits coagulation factor Xa, dependent on protein Z (PZ), a vitamin K-dependent plasma protein. ZPI also inhibits factor XIa in a process that does not require PZ. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 380
22660 381012 cd02056 serpinA1_A1AT serpin family A member 1, alpha-1-antitrypsin. Alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, proteinase inhibitor/PI, and serum trypsin inhibitor) is a protease inhibitor that belongs to the serpin superfamily. It is encoded in humans by the SERPINA1 gene. When the blood contains inadequate amounts of A1AT or functionally defective A1AT (such as in alpha-1 antitrypsin deficiency), neutrophil elastase is excessively free to break down elastin, degrading the elasticity of the lungs, which results in respiratory complications, such as chronic obstructive pulmonary disease. Normally, A1AT leaves its site of origin, the liver, and joins the systemic circulation; defective A1AT fails to do so, building up in the liver, which results in cirrhosis. This family contains other A1AT-like members of clade A of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 368
22661 381013 cd02057 serpinB5_maspin serpin family B member 5, mammary serine proteinase inhibitor. Mammary serine proteinase inhibitor (maspin, also known as proteinase inhibitor 5/PI5), a member of the serpin superfamily, is related to the ov-serpins, with a multitude of effects on cells and tissues at an assortment of developmental stages. Maspin has tumor suppressing activity against breast and prostate cancer. All true inhibitory serpins rely on an exposed reactive center loop (RCL) to inhibit their target proteinase, in which the proteinase cleaves the RCL and becomes incorporated into a serpin-proteinase complex. Maspin differs from other serpins in that its RCL is necessary for activity, but it is not cleaved or rearranged. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins corresponds to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 375
22662 381014 cd02058 serpinB_MENT-like serpin family B, Myeloid and Erythroid Nuclear Termination stage-specific protein (MENT) and similar proteins. Gallus gallus Myeloid and Erythroid Nuclear Termination stage-specific protein (MENT) is a nonhistone heterochromatin-associated serpin that is an effective inhibitor of cathepsin L as well as the papain-like cysteine proteases cathepsins K, L, and V in vitro. Its reactive center loop, which is essential for chromatin bridging, is able to mediate formation of a loop-sheet oligomer. It also contains an M-loop which contains two critical functional motifs: a classical nuclear localization signal (NLS) that is required for nuclear import and an AT-hook motif that is involved in chromatin and DNA binding. MENT belongs to the clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 406
22663 381015 cd02059 serpinB14_OVA serpin family B member 14, ovalbumin. The chicken protein ovalbumin (OVA3), a storage protein from egg white, lacking a loop insertion mechanism and therefore protease inhibitory activity, is a historical member of the serpin superfamily and the founding member of the subgroup known as ov-serpins (ovalbumin-related serpins). It has several modifications, including N-terminal acetylation, phosphorylation, and glycosylation. Ovalbumin is secreted from the cell, targeted by an internal signal sequence, rather than the N-terminal signal sequence commonly found in other secreted proteins. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 385
22664 380311 cd02062 Nitro_FMN_reductase nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase. 139
22665 185679 cd02064 FAD_synthetase_N FAD synthetase, N-terminal domain of the bifunctional enzyme. FAD synthetase_N. N-terminal domain of the bifunctional riboflavin biosynthesis protein riboflavin kinase/FAD synthetase. These enzymes have both ATP:riboflavin 5'-phosphotransferase and ATP:FMN-adenylyltransferase activities. The N-terminal domain is believed to play a role in the adenylylation reaction of FAD synthetases. The C-terminal domain is thought to have kinase activity. FAD synthetase is present among all kingdoms of life. However, the bifunctional enzyme is not found in mammals, which use separate enzymes for FMN and FAD formation. 180
22666 239016 cd02065 B12-binding_like B12 binding domain (B12-BD). Most of the members bind different cobalamid derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. This domain is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. Not all members of this family contain the conserved binding motif. 125
22667 239017 cd02066 GRX_family Glutaredoxin (GRX) family; composed of GRX, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, as well as E. coli GRX1 and GRX3, which are members of this family. E. coli GRX2, however, is a 24-kDa protein that belongs to the GSH S-transferase (GST) family. 72
22668 239018 cd02067 B12-binding B12 binding domain (B12-BD). This domain binds different cobalamid derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. It is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. 119
22669 239019 cd02068 radical_SAM_B12_BD B12 binding domain_like associated with radical SAM domain. This domain shows similarity with B12 (adenosylcobamide) binding domains found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase, but it lacks the signature motif Asp-X-His-X-X-Gly, which contains the histidine that acts as a cobalt ligand. The function of this domain remains unclear. 127
22670 239020 cd02069 methionine_synthase_B12_BD B12 binding domain of methionine synthase. This domain binds methylcobalamin, which it uses as an intermediate methyl carrier from methyltetrahydrofolate (CH3H4folate) to homocysteine (Hcy). 213
22671 239021 cd02070 corrinoid_protein_B12-BD B12 binding domain of corrinoid proteins. A family of small methanogenic corrinoid proteins that bind methyl-Co(III) 5-hydroxybenzimidazolylcobamide as a cofactor. They play a role in methanogenesis from trimethylamine, dimethylamine or monomethylamine, which is initiated by a series of corrinoid-dependent methyltransferases. 201
22672 239022 cd02071 MM_CoA_mut_B12_BD methylmalonyl CoA mutase B12 binding domain. This domain binds to B12 (adenosylcobamide), which initiates the conversion of succinyl CoA and methylmalonyl CoA by forming an adenosyl radical, which then undergoes a rearrangement exchanging a hydrogen atom with a group attached to a neighboring carbon atom. This family is present in both mammals and bacteria. Bacterial members are heterodimers and involved in the fermentation of pyruvate to propionate. Mammalian members are homodimers and responsible for the conversion of odd-chain fatty acids and branched-chain amino acids via propionyl CoA to succinyl CoA for further degradation. 122
22673 239023 cd02072 Glm_B12_BD B12 binding domain of glutamate mutase (Glm). Glutamate mutase catalyzes the interconversion of (S)-glutamate and (2S,3S)-3-methylaspartate. The rearrangement reaction is initiated by the extraction of a hydrogen from the protein-bound substrate by a 5'-deoxyadenosyl radical, which is generated by the homolytic cleavage of the organometallic bond of the cofactor B12. Glm is a heterotetrameric molecule consisting of two alpha and two epsilon polypeptide chains. 128
22674 319770 cd02073 P-type_ATPase_APLT_Dnf-like Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Dnf1-3p, Drs2p, and human ATP8A2, -10D, -11B, -11C. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. Yeast Dnf1 and Dnf2 mediate the transport of phosphatidylethanolamine, phosphatidylserine, and phosphatidylcholine from the outer to the inner leaflet of the plasma membrane. This subfamily includes mammalian flippases such as ATP11C which may selectively transport PS and PE from the outer leaflet of the plasma membrane to the inner leaflet. It also includes Arabidopsis phospholipid flippases including ALA1, and Caenorhabditis elegans flippases, including TAT-1; the latter has been shown to facilitate the inward transport of phosphatidylserine. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 836
22675 319771 cd02076 P-type_ATPase_H plant and fungal plasma membrane H(+)-ATPases, and related bacterial and archaeal putative H(+)-ATPases. This subfamily includes eukaryotic plasma membrane H(+)-ATPase which transports H(+) from the cytosol to the extracellular space, thus energizing the plasma membrane for the uptake of ions and nutrients, and is expressed in plants and fungi. This H(+)-ATPase consists of four domains: a transmembrane domain and three cytosolic domains: nucleotide-binding domain, phosphorylation domain and actuator domain, and belongs to the P-type ATPase type III subfamily. This subfamily also includes the putative P-type H(+)-ATPase, MJ1226p of the anaerobic hyperthermophilic archaeon Methanococcus jannaschii. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 781
22676 319772 cd02077 P-type_ATPase_Mg magnesium transporting ATPase (MgtA), similar to Escherichia coli MgtA and Salmonella typhimurium MgtA. MgtA is a membrane protein which actively transports Mg(2+) into the cytosol with its electro-chemical gradient rather than against the gradient as other cation transporters do. It may act both as a transporter and as a sensor for Mg(2+). In Salmonella typhimurium and Escherichia coli, the two-component system PhoQ/PhoP regulates the transcription of the mgtA gene by sensing Mg(2+) concentrations in the periplasm. MgtA is activated by cardiolipin and is highly sensitive to free magnesium in vitro. It consists of a transmembrane domain and three cytosolic domains: nucleotide-binding domain, phosphorylation domain and actuator domain, and belongs to the P-type ATPase type III subfamily. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 768
22677 319773 cd02078 P-type_ATPase_K potassium-transporting ATPase ATP-binding subunit, KdpB, a subunit of the prokaryotic high-affinity potassium uptake system KdpFABC; similar to Escherichia coli KdpB. KdpFABC is a prokaryotic high-affinity potassium uptake system. It is expressed under K(+) limiting conditions when the other potassium transport systems are not able to provide a sufficient flow of K(+) into the bacteria. The KdpB subunit represents the catalytic subunit performing ATP hydrolysis. KdpB is comprised of four domains: the transmembrane domain, the nucleotide-binding domain, the phosphorylation domain, and the actuator domain. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 667
22678 319774 cd02079 P-type_ATPase_HM P-type heavy metal-transporting ATPase. Heavy metal-transporting ATPases (Type IB ATPases) transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. These ATPases include mammalian copper-transporting ATPases, ATP7A and ATP7B, Bacillus subtilis CadA which transports cadmium, zinc and cobalt out of the cell, Bacillus subtilis ZosA/PfeT which transports copper, and perhaps also zinc and ferrous iron, Archaeoglobus fulgidus CopA and CopB, Staphylococcus aureus plasmid pI258 CadA, a cadmium-efflux ATPase, and Escherichia coli ZntA which is selective for Pb(2+), Zn(2+), and Cd(2+). The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This family belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 617
22679 319775 cd02080 P-type_ATPase_cation P-type cation-transporting ATPase similar to Exiguobacterium aurantiacum Mna, an Na(+)-ATPase, and Synechocystis sp. PCC 6803 PMA1, a putative Ca(2+)-ATPase. This subfamily includes the P-type Na(+)-ATPase of an alkaliphilic bacterium Exiguobacterium aurantiacum Mna and cyanobacterium Synechocystis sp. PCC 6803 PMA1, a cation-transporting ATPase which may translocate calcium. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 819
22680 319776 cd02081 P-type_ATPase_Ca_PMCA-like animal plasma membrane Ca(2+)-ATPases (PMCA), similar to human ATP2B1-4/PMCA1-4, and related Ca(2+)-ATPases including Saccharomyces cerevisiae vacuolar PMC1. Animal PMCAs function to export Ca(2+) from cells and play a role in regulating Ca(2+) signals following stimulus induction and in preventing calcium toxicity. Many PMCA pump variants exist due to alternative splicing of transcripts. PMCAs are regulated by the binding of calmodulin or by kinase-mediated phosphorylation. Saccharomyces cerevisiae vacuolar transporter Pmc1p facilitates the accumulation of Ca(2+) into vacuoles. Pmc1p is not regulated by direct calmodulin binding but responds to the calmodulin/calcineurin-signaling pathway and is controlled by the transcription factor complex Tcn1p/Crz1p. Similarly, the expression of the gene for Dictyostelium discoideum Ca(2+)-ATPase PAT1, patA, is under the control of a calcineurin-dependent transcription factor. Plant vacuolar Ca(2+)-ATPases are regulated by direct calmodulin binding. Plant Ca(2+)-ATPases are present at various cellular locations including the plasma membrane, endoplasmic reticulum, chloroplast and vacuole. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 721
22681 319777 cd02082 P-type_ATPase_cation P-type cation-transporting ATPases, similar to human ATPase type 13A1-A4 (ATP13A1-A4) proteins and Saccharomyces cerevisiae Ypk9p and Spf1p. Saccharomyces cerevisiae Ypk9p localizes to the yeast vacuole and may play a role in sequestering heavy metal ions; its deletion confers sensitivity for growth for cadmium, manganese, nickel or selenium. Saccharomyces cerevisiae Spf1p may mediate manganese transport into the endoplasmic reticulum. Human ATP13A2 (PARK9/CLN12) is a lysosomal transporter with zinc as the possible substrate. Mutation in the ATP13A2 gene has been linked to Parkinson's disease and Kufor-Rakeb syndrome, and to neuronal ceroid lipofuscinoses. ATP13A3/AFURS1 is a candidate gene for oculo auriculo vertebral spectrum (OAVS), being one of nine genes included in a 3q29 microduplication in a patient with OAVS. Mutation in the human ATP13A4 may be involved in a speech-language disorder. The expression of ATP13A1 has been followed during mouse development; ATP13A1 transcript expression showed an increase as development progressed, with the highest expression at the peak of neurogenesis. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 786
22682 319778 cd02083 P-type_ATPase_SERCA sarco/endoplasmic reticulum Ca(2+)-ATPase (SERCA), similar to mammalian ATP2A1-3/SERCA1-3. SERCA is a transmembrane Ca(2+)-ATPase and a major regulator of Ca(2+) homeostasis and contractility in cardiac and skeletal muscle. It re-sequesters cytoplasmic Ca(2+) to the sarco/endoplasmic reticulum store, thereby also terminating Ca(2+)-induced signaling such as in muscle contraction. Three genes (ATP2A1-3/SERCA1-3) encode SERCA pumps in mammals; further isoforms exist due to alternative splicing of transcripts. The activity of SERCA is regulated by two small membrane proteins called phospholamban and sarcolipin. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 979
22683 319779 cd02085 P-type_ATPase_SPCA Golgi-associated secretory pathway Ca(2+) transport ATPases, similar to human ATPase secretory pathway Ca(2+) transporting 1/hSPCA1 and Saccharomyces cerevisiae Ca(2+)/Mn(2+)-transporting P-type ATPase, Pmr1p. SPCAs are Ca(2+) pumps important for the Golgi-associated secretion pathway; in addition, some function as Mn(2+) pumps in Mn(2+) detoxification. Saccharomyces cerevisiae Pmr1p is a high affinity Ca(2+)/Mn(2+) ATPase which transports Ca(2+) and Mn(2+) from the cytoplasm into the Golgi. Pmr1p also contributes to Cd(2+) detoxification. This subfamily includes human SPCA1 and SPCA2, encoded by the ATP2C1 and ATP2C2 genes; autosomal dominant Hailey-Hailey disease is caused by mutations in the human ATP2C1 gene. It also includes Strongylocentrotus purpuratus testis secretory pathway calcium transporting ATPase SPCA which plays an important role in fertilization. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 804
22684 319780 cd02086 P-type_ATPase_Na_ENA fungal-type Na(+)-ATPase, similar to the plasma membrane sodium transporters Saccharomyces cerevisiae Ena1p, Ena2p and Ustilago maydis Ena1, and the endoplasmic reticulum sodium transporter Ustilago maydis Ena2. Fungal-type Na(+)-ATPase (also called ENA ATPases). This subfamily includes the Saccharomyces cerevisiae plasma membrane transporters: Na(+)/Li(+)-exporting ATPase Ena1p which may also extrude K(+), and Na(+)-exporting P-type ATPase Ena2p. It also includes Ustilago maydis plasma membrane Ena1, a K(+)/Na(+)-ATPase whose chief role is to pump Na(+) and K(+) out of the cytoplasm, especially at high pH values, and endoplasmic reticulum Ena2 ATPase which mediates Na(+) or K(+) fluxes in the ER or in other endomembranes. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 920
22685 319781 cd02089 P-type_ATPase_Ca_prok prokaryotic P-type Ca(2+)-ATPase similar to Synechococcus elongatus sp. strain PCC 7942 PacL and Listeria monocytogenes LMCA1. Ca(2+) transport ATPase is a plasma membrane protein which pumps Ca(2+) ion out of the cytoplasm. This prokaryotic subfamily includes the Ca(2+)-ATPase Synechococcus elongatus PacL, Listeria monocytogenes Ca(2+)-ATPase 1 (LMCA1) which has a low Ca(2+) affinity and a high pH optimum (pH about 9) and may remove Ca(2+) from the microorganism in environmental conditions when e.g. stressed by high Ca(2+) and alkaline pH, and the Bacillus subtilis putative P-type Ca(2+)-transport ATPase encoded by the yloB gene, which is expressed during sporulation. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 674
22686 319782 cd02092 P-type_ATPase_FixI-like Rhizobium meliloti FixI and related proteins; belongs to P-type heavy metal-transporting ATPase subfamily. FixI may be a pump of a specific cation involved in symbiotic nitrogen fixation. The Rhizobium fixI gene is part of an operon conserved among rhizobia, fixGHIS. FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG, an iron-sulfur protein. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 605
22687 319783 cd02094 P-type_ATPase_Cu-like P-type heavy metal-transporting ATPase, similar to human copper-transporting ATPases, ATP7A and ATP7B. The mammalian copper-transporting P-type ATPases, ATP7A and ATP7B, are key molecules required for the regulation and maintenance of copper homeostasis. Menkes and Wilson diseases are caused by mutation in ATP7A and ATP7B respectively. This subfamily includes other copper-transporting ATPases such as: Bacillus subtilis CopA, Archaeoglobus fulgidus CopA, and Saccharomyces cerevisiae Ccc2p. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 647
22688 259797 cd02106 SPFH_like core domain of the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons, and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome. 110
22689 239025 cd02107 YedY_like_Moco YedY_like molybdopterin cofactor (Moco) binding domain, a subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. Escherichia coli YedY has been proposed to form a heterodimer, consisting of a soluble catalytic subunit termed YedY, which is likely membrane-anchored by a heme-containing trans-membrane subunit YedZ. Preliminary results indicate that YedY may represent a new type of membrane-associated bacterial reductase. Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 218
22690 239026 cd02108 bact_SO_family_Moco bacterial subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. This domain is found in a variety of oxidoreductases. Common features of all known members of this family, like sulfite oxidase and nitrate reductase, are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown. 185
22691 239027 cd02109 arch_bact_SO_family_Moco bacterial and archaeal members of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown. 180
22692 239028 cd02110 SO_family_Moco_dimer Subgroup of sulfite oxidase (SO) family molybdopterin binding domains that contains a conserved dimerization domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). 317
22693 239029 cd02111 eukary_SO_Moco molybdopterin binding domain of sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 365
22694 239030 cd02112 eukary_NR_Moco molybdopterin binding domain of eukaryotic nitrate reductase (NR). Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. Eukaryotic assimilatory nitrate reductases are cytosolic homodimeric enzymes with three prosthetic groups, flavin adenine dinucleotide (FAD), cytochrome b557, and Mo cofactor, which are located in three functional domains. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 386
22695 239031 cd02113 bact_SoxC_Moco bacterial SoxC is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. SoxC is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SoxD, a small c-type heme containing subunit, it forms a heterotetrameric sulfite dehydrogenase. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 326
22696 239032 cd02114 bact_SorA_Moco sulfite:cytochrome c oxidoreductase subunit A (SorA), molybdopterin binding domain. SorA is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SorB, a small c-type heme containing subunit, it forms a heterodimer. It is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 367
22697 239033 cd02115 AAK Amino Acid Kinases (AAK) superfamily, catalytic domain; present in such enzymes like N-acetylglutamate kinase (NAGK), carbamate kinase (CK), aspartokinase (AK), glutamate-5-kinase (G5K) and UMP kinase (UMPK). The AAK superfamily includes kinases that phosphorylate a variety of amino acid substrates. These kinases catalyze the formation of phosphoric anhydrides, generally with a carboxylate, and use ATP as the source of the phosphoryl group; are involved in amino acid biosynthesis. Some of these kinases control the process via allosteric feed-back inhibition. 248
22698 153139 cd02116 ACT ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes, these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singularly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times. 60
22699 349761 cd02117 NifH-like NifH family. This family contains the NifH (iron protein) of nitrogenase, L subunit (BchL/ChlL) of the protochlorophyllide reductase, and the BchX subunit of the Chlorophyllide reductase. Members of this family use energy from ATP hydrolysis and transfer electrons through a Fe4-S4 cluster to other subunit for substrate reduction 266
22700 239035 cd02120 PA_subtilisin_like PA_subtilisin_like: Protease-associated domain containing subtilisin-like proteases. This group contains various PA domain-containing subtilisin-like proteases including melon cucumisin, Arabidopsis thaliana Ara12, a nodule specific serine protease from Alnus glutinosa ag12, members of the tomato P69 family, and tomato LeSBT2. These proteins belong to the peptidase S8 family. Cucumisin from the juice of melon fruits is a thermostable serine peptidase, with a broad substrate specificity for oligopeptides and proteins. A. thaliana Ara12 is a thermostable, extracellular serine protease, found chiefly in silique tissue and stem tissue. Ara12 is stimulated by Ca2+ ions. A. glutinosa ag12 is expressed at high levels in the nodules, and at low levels in the shoot tips; it is implicated in both symbiotic and non-symbiotic processes in plant development. The tomato P69 protease family is comprised of various protein isoforms of approximately 69KDa. These isoforms accumulate extracellularly. Some of the P69 genes are tightly regulated in a tissue specific fashion, and by environmental and developmental signals. For example: infection with avirulent bacteria activates transcription of the genes for the P69 B and C isoforms, the P69 E transcript was detected only in roots, and the P69F transcript only in hydathodes. The Tomato LeSBT2 subtilase transcript was not detected in flowers and roots, but was present in cotyledons and leaves. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 126
22701 239036 cd02121 PA_GCPII_like PA_GCPII_like: Protease-associated domain containing protein, glutamate carboxypeptidase II (GCPII)-like. This group contains various PA domain-containing proteins similar to GCPII including, GCPIII (NAALADase2) and NAALADase L. These proteins belong to the peptidase M28 family. GCPII is also known as N-acetylated-alpha-linked acidic dipeptidase (NAALDase1), folate hydrolase or prostate-specific membrane antigen (PSMA). GCPII is found in various human tissues including prostate, small intestine, and the central nervous system. In the brain, GCPII is known as NAALDase1, it functions as a NAALDase hydrolyzing the neuropeptide N-acetyl-L-aspartyl-L-glutamate (alpha-NAAG), to release free glutamate. In the small intestine, GCPII releases the terminal glutamate from poly-gamma-glutamated folates. GCPII (PSMA) is a useful cancer marker; its expression is markedly increased in prostate cancer and in tumor-associated neovasculature. GCPIII hydrolyzes alpha-NAAG with a lower efficiency than does GCPII; NAALADase L is not able to hydrolyze alpha-NAAG. The GCPII PA domain (referred to as the apical domain) participates in substrate binding and may act as a protein-protein interaction domain. 220
22702 239037 cd02122 PA_GRAIL_like PA_GRAIL_like: Protease-associated (PA) domain GRAIL-like. This group includes PA domain containing E3 (ubiquitin ligases) similar to human GRAIL (gene related to anergy in lymphocytes) protein. Proteins in this group contain a C3H2C3 RING finger. E3 ubiquitin ligase is part of an enzymic cascade, the end result of which is the ubiquitination of proteins. In this cascade, E1 activates the ubiquitin, the activated ubiquitin is carried by E2, and E3 recognizes the acceptor protein as well as catalyzes the transfer of the activated ubiquitin from E2 to this acceptor. GRAIL, a transmembrane protein localized in the endosomes, controls the development of T cell clonal anergy, and may ubiquitinate membrane-associated targets for T cell activation. GRAIL1 is associated with, and regulated by, two isoforms of otubain 1 (the ubiquitin-specific protease). Additional E3s belonging to this group include human (h)Goliath and Xenopus GREUL1 (Goliath Related E3 Ubiquitin Ligase 1). hGoliath and GRAIL both have the property of self-ubiquitination. hGoliath is expressed in leukocytes; its expression and localization is not modified in leukemia. GREUL1 may play a role in the generation of anterior ectoderm. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 138
22703 239038 cd02123 PA_C_RZF_like PA_C-RZF_like: Protease-associated (PA) domain C_RZF-like. This group includes various PA domain-containing proteins similar to C-RZF (chicken embryo RING zinc finger) protein. These proteins contain a C3H2C3 RING finger. C-RZF is expressed in embryo cells and is restricted mainly to brain and heart, it is localized to both the nucleus and endosomes. Additional C3H2C3 RING finger proteins belonging to this group, include Arabidopsis ReMembR-H2 protein and mouse sperizin. ReMembR-H2 is likely to be an integral membrane protein, and to traffic through the endosomal pathway. Sperizin is expressed in haploid germ cells and localized in the cytoplasm, it may participate in spermatogenesis. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 153
22704 239039 cd02124 PA_PoS1_like PA_PoS1_like: Protease-associated (PA) domain PoS1-like. This group includes various PA domain-containing proteins similar to Pleurotus ostreatus (Po)S1. PoS1, the main extracellular protease in P. ostreatus is a subtilisin-like serine protease belonging to the peptidase S8 family. Ca2+ and Mn2+ both stimulate the protease activity of (Po)S1. Ca2+ protects PoS1 from autolysis. PoS1 is a monomeric glycoprotein, which may play a role in the regulation of laccases in lignin formation. (Po)S1 participates in the degradation of POXA1b, and in the activation of POXA3, (POXA1b and POXA3 are laccase isoenzymes), but its effect may be indirect. The significance of the PA domain to PoS1 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 129
22705 239040 cd02125 PA_VSR PA_VSR: Protease-associated (PA) domain-containing plant vacuolar sorting receptor (VSR). This group includes various PA domain-containing VSRs such as garden pea BP-80, pumpkin PV72, and various Arabidopsis VSRs including AtVSR1. In contrast to most eukaryotes, which only have one or two VSRs, plants have several. This may in part be a reflection of having a more complex vacuolar system with both lytic vacuoles and storage vacuoles. The lytic vacuole is thought to be equivalent to the mammalian lysosome and the yeast vacuole. Pea BP-80 is a type 1 transmembrane protein, involved in the targeting of proteins to the lytic vacuole; it has been suggested that this protein also mediates targeting to the storage vacuole. PV72 and AtVSR1 may mediate transport of seed storage proteins to protein storage vacuoles. The significance of the PA domain to VSRs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 127
22706 239041 cd02126 PA_EDEM3_like PA_EDEM3_like: protease associated domain (PA) domain-containing EDEM3-like proteins. This group contains various PA domain-containing proteins similar to mouse EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein). EDEM3 contains a region, similar to Class I alpha-mannosidases (glycosyl hydrolase family 47), N-terminal to the PA domain. EDEM3 accelerates glycoprotein ERAD (ER-associated degradation). In transfected mammalian cells, overexpression of EDEM3 enhances the mannose trimming from the N-glycans, of a model misfolded protein [alpha1-antitrypsin null (Hong Kong)] as well as, from total glycoproteins. Mannose trimming appears to be involved in the selection of ERAD substrates. EDEM3 has a different specificity of trimming than ER alpha-mannosidase 1. The significance of the PA domain to EDEM3 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 126
22707 239042 cd02127 PA_hPAP21_like PA_hPAP21_like: Protease-associated domain containing proteins like the human secreted glycoprotein hPAP21 (human protease-associated domain-containing protein, 21kDa). This group contains various PA domain-containing proteins similar to hPAP21. Complex N-glycosylation may be required for the secretion of hPAP21. The significance of the PA domain to hPAP21 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 118
22708 239043 cd02128 PA_TfR PA_TfR: Protease-associated domain containing proteins like transferrin receptor (TfR). This group contains various PA domain-containing proteins similar to human TfR1 and TfR2. TfR1 and TfR2 are type II membrane proteins, belonging to the peptidase M28 family. TfR1 is homodimeric, widely expressed, and a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and this complex is internalized. In addition to its role in iron uptake, TfR1 may participate in cell growth and proliferation. TfR2 also binds Tf but with a significantly lower affinity than does TfR1. TfR2 is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis. HFE and TfR2 interact in cells. By one model for serum iron sensing, at low or basal iron concentrations, HFE and TfR1 form a complex at the plasma membrane; at increased Tf, Tf competes with HFE for binding of TfR1, resulting in HFE disassociating from TfR1 and associating with TfR2. The TfR1-TfR2 association might initiate a signal cascade leading to the induction of hepcidin (a small peptide hormone that controls systemic iron levels). Human mutations in TfR2 are associated with a form of hemochromatosis (HFE3). The significance of the PA domain to TfRs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 183
22709 239044 cd02129 PA_hSPPL_like PA_hSPPL_like: Protease-associated domain containing human signal peptide peptidase-like (hSPPL)-like. This group contains various PA domain-containing proteins similar to hSPPL2a and 2b. These SPPLs are GxGD aspartic proteases. SPPL2a is sorted to the late endosomes, SPPL2b to the plasma membrane. In activated dendritic cells, hSPPL2a and 2b catalyze the intramembrane proteolysis of tumor necrosis factor alpha triggering IL-12 production. hSPPL2a and 2b may have a broad substrate spectrum. The significance of the PA domain to these SPPLs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 120
22710 239045 cd02130 PA_ScAPY_like PA_ScAPY_like: Protease-associated domain containing proteins like Saccharomyces cerevisiae aminopeptidase Y (ScAPY). This group contains various PA domain-containing proteins similar to the S. cerevisiae APY, including Trichophyton rubrum leucine aminopeptidase 1 (LAP1). Proteins in this group belong to the peptidase M28 family. ScAPY hydrolyzes amino acid-4-methylcoumaryl-7-amides (MCAs). ScAPY more rapidly hydrolyzes dipeptidyl-MCAs. Hydrolysis of amino acid-MCAs or dipeptides is stimulated by Co2+ while the hydrolysis of dipeptidyl-MCAs, tripeptides, and longer peptides is inhibited by Co2+. ScAPY is vacuolar and is activated by proteolytic processing. LAP1 is a secreted leucine aminopeptidase. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 122
22711 239046 cd02131 PA_hNAALADL2_like PA_hNAALADL2_like: Protease-associated domain containing proteins like human N-acetylated alpha-linked acidic dipeptidase-like 2 protein (hNAALADL2). This group contains various PA domain-containing proteins similar to hNAALADL2. The function of hNAALADL2 is unknown. This gene has been mapped to a chromosomal region associated with Cornelia de Lange syndrome. The significance of the PA domain to hNAALADL2 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 153
22712 239047 cd02132 PA_GO-like PA_GO-like: Protease-associated domain containing proteins like Arabidopsis thaliana growth-on protein GRO10. This group contains various PA domain-containing proteins similar to the functionally uncharacterized Arabidopsis GRO10. The PA domain may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 139
22713 239048 cd02133 PA_C5a_like PA_C5a_like: Protease-associated domain containing proteins like Streptococcus pyogenes C5a peptidase. This group contains various PA domain-containing proteins similar to S. pyogenes C5a, including, i) Vpr, a minor extracellular serine protease from Bacillus subtilis, ii) a large molecular mass collagenolytic protease from Geobacillus collagenovorans MO-1, and iii) PrtS, a cell envelope protease from Streptococcus thermophilus CNRZ 385. Proteins in this group belong to the peptidase S8 family. C5a peptidase is a cell surface serine protease which specifically inactivates C5a [a chemotactic peptide, which attracts polymorphonuclear leukocytes (PMNs)], by cleaving it to release a 7-residue carboxy-terminal fragment which contains the PMN binding site. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 143
22714 411779 cd02134 KH-II_NusA_rpt1 first type II K-homology (KH) RNA-binding domain found in transcription termination/antitermination protein NusA and similar proteins. NusA, also called N utilization substance protein A or transcription termination/antitermination L factor, is an essential multifunctional transcription elongation factor that participates in both transcription termination and antitermination. NusA anti-termination function plays an important role in the expression of ribosomal rrn operons. During transcription of many other genes, NusA-induced RNA polymerase pausing provides a mechanism for synchronizing transcription and translation. In prokaryotes, the N-terminal RNA polymerase-binding domain (NTD) is connected through a flexible hinge helix to three globular domains, the S1 and two K-homology (KH), KH1 and KH2. The KH domains of NusA belong to the type II KH RNA-binding domain superfamily. This model corresponds to the first KH domain of NusA and similar proteins. 76
22715 380312 cd02135 YdjA-like nitroreductase family protein similar to Escherichia coli YdjA. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase YdjA from Escherichia coli. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 162
22716 380313 cd02136 PnbA_NfnB-like nitroreductase similar to Mycobacterium smegmatis NfnB. Members of this family utilize FMN as a cofactor and catalyze reduction of a variety of nitroaromatic compounds, including nitrofurans, nitrobenzenes, nitrophenol, nitrobenzoate and quinones by using either NADH or NADPH as a source of reducing equivalents in an obligatory two-electron transfer mechanism. The enzyme is typically a homodimer. Mycobacterium smegmatis nitroreductase NfnB plays a role in resistance to benzothiazinone. 152
22717 380314 cd02137 MhqN-like nitroreductase family protein similar to the NAD(P)H nitroreductase MhqN. A diverse subfamily of the nitroreductase family containing uncharacterized proteins; includes nitroreductases MhqN, YodC, YdgI, DrgA. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 147
22718 380315 cd02138 TdsD-like nitroreductase similar to Burkholderia pseudomallei TdsD. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to Burkholderia pseudomallei TdsD, may be involved in the processing of organosulfur compounds. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 174
22719 380316 cd02139 nitroreductase nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 165
22720 380317 cd02140 Frm2-like nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 192
22721 380318 cd02142 McbC_SagB-like_oxidoreductase oxidase similar to the microcin B17 processing protein McbC. This family is the oxidase domain of NRPS (non-ribosomal peptide synthetase) and other systems that modify polypeptides by cyclizing a thioester to form a ring. These include EpoB, part of the epothilone biosynthesis pathway; TubD, part of the tubulysin biosynthesis pathway; MtsD, part of the myxothiazol biosynthesis pathway; IndC, part of the indigoidine biosynthesis pathway; and TfxB, part of the trifolitoxin processing pathway. All are FMN-dependent and oxidize the product of the cyclization of thioesters in short polypeptides. 200
22722 380319 cd02143 nitroreductase_FeS-like nitroreductases with an N-terminal iron-sulfur cluster-binding domain. Members of this family utilize FMN as a cofactor. This family may be involved in the reduction of flavin or nitroaromatic compounds via an obligatory two-electron transfer. Nitroreductase is homodimer. Each subunit contains one FMN molecule. 187
22723 380320 cd02144 iodotyrosine_dehalogenase iodotyrosine dehalogenase. Iodotyrosine dehalogenase catalyzes the removal of iodine from the 3, 5 positions of L-tyrosine in thyroid, liver and kidney, using NADPH as electron donor. This enzyme is a homolog of the nitroreductase family. These enzymes are usually homodimers. 192
22724 380321 cd02145 BluB 5,6-dimethylbenzimidazole synthase. BluB catalyzes the O2-dependent conversion of FMNH2 to 5,6-dimethylbenzimidazole (DMB), a component of vitamin B12; it is a subfamily of the nitroreductase family; nitroreductases typically reduce their substrates by using NAD(P)H as electron donor and often use FMN as a cofactor. 196
22725 380322 cd02146 NfsA-like nitroreductase similar to Escherichia coli NfsA. This family contains NADPH-dependent flavin reductase and oxygen-insensitive nitroreductase. These enzymes are homodimeric flavoproteins that contain one FMN per monomer as a cofactor. Flavin reductase catalyzes the reduction of flavin by using NADPH as an electron donor. Oxygen-insensitive nitroreductase, such as NfsA protein in Escherichia coli, catalyzes reduction of nitrocompounds using NADPH as electron donor. 229
22726 380323 cd02148 RutE-like nitroreductase similar to Escherichia coli RutE. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. RutE is involved in the utilization of uracil as the sole nitrogen source; it appears to have the same function as YdfG, which reduces malonic semialdehyde to 3-hydroxypropionic acid. 186
22727 380324 cd02149 NfsB-like nitroreductase similar to Escherichia coli NfsB. NAD(P)H:FMN oxidoreductase family. This domain catalyzes the reduction of flavin, nitrocompound, quinones and azo compounds using NADH or NADPH as an electron donor. The enzyme is a homodimer, and each monomer binds a FMN as co-factor. This family includes FRase I in Vibrio fischeri, which reduces FMN into FMNH2 as part of the bioluminescent reaction. The family also includes oxygen-insensitive nitroreductases that use NADH or NADPH as an electron donor in the ping pong bi bi mechanism. This type of nitroreductase can be used in cancer chemotherapy to activate a range of prodrugs. 156
22728 380325 cd02150 nitroreductase nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 156
22729 380326 cd02151 nitroreductase nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 157
22730 239065 cd02152 OAT Ornithine acetyltransferase (OAT) family; also referred to as ArgJ. OAT catalyzes the first and fifth steps in arginine biosynthesis, coupling acetylation of glutamate with deacetylation of N-acetylornithine, which allows recycling of the acetyl group in the arginine biosynthetic pathway. Members of this family may experience feedback inhibition by L-arginine. The active enzyme is a heterotetramer of two alpha and two beta chains, where the alpha and beta chains are the result of autocatalytic cleavage. OATs found in the clavulanic acid biosynthesis gene cluster catalyze the fifth step only, and may utilize acetyl acceptors other than glutamate. 390
22731 239066 cd02153 tRNA_bindingDomain The tRNA binding domain is also known as the Myf domain in literature. This domain is found in a diverse collection of tRNA binding proteins, including prokaryotic phenylalanyl tRNA synthetases (PheRS), methionyl-tRNA synthetases (MetRS), human tyrosyl-tRNA synthetase (hTyrRS), Saccharomyces cerevisiae Arc1p, Thermus thermophilus CsaA, Aquifex aeolicus Trbp111, human p43 and human EMAP-II. PheRS, MetRS and hTyrRS aminoacylate their cognate tRNAs. Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases. The molecular chaperones Trbp111 and CsaA also contain this domain. CsaA has export related activities; Trbp111 is structure-specific recognizing the L-shape of the tRNA fold. This domain has general tRNA binding properties. In a subset of this family this domain has the added capability of a cytokine. For example the p43 component of the Human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. For homodimeric members of this group which include CsaA, Trbp111 and Escherichia coli MetRS this domain acts as a dimerization domain. 99
22732 173912 cd02156 nt_trans nucleotidyl transferase superfamily. nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain. 105
22733 173914 cd02163 PPAT Phosphopantetheine adenylyltransferase. Phosphopantetheine adenylyltransferase (PPAT). PPAT is an essential enzyme in bacteria, responsible for catalyzing the rate-limiting step in coenzyme A (CoA) biosynthesis. The dinucleotide-binding fold of PPAT is homologous to class I aminoacyl-tRNA synthetases. CoA has been shown to inhibit PPAT and competes with ATP, PhP, and dPCoA. PPAT is a homohexamer in E. coli. 153
22734 173915 cd02164 PPAT_CoAS phosphopantetheine adenylyltransferase domain of eukaryotic and archaeal bifunctional enzymes. The PPAT domain of the bifunctional enzyme with PPAT and DPCK functions. The final two steps of the CoA biosynthesis pathway are catalyzed by phosphopantetheine adenylyltransferase (PPAT) and dephospho-CoA (dPCoA) kinase (DPCK). The PPAT reaction involves the reversible adenylation of 4'-phosphopantetheine to form 3'-dPCoA and PPi, and DPCK catalyses phosphorylation of the 3'-hydroxy group of the ribose moiety of dPCoA. In eukaryotes the two enzymes are part of a large multienzyme complex . Studies in Corynebacterium ammoniagenes suggested that separate enzymes were present, and this was confirmed through identification of the bacterial PPAT/CoAD. 143
22735 185680 cd02165 NMNAT Nicotinamide/nicotinate mononucleotide adenylyltransferase. Nicotinamide/nicotinate mononucleotide (NMN/ NaMN)adenylyltransferase (NMNAT). NMNAT represents the primary bacterial and eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide. It is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. Human NMNAT displays unique dual substrate specificity toward both NMN and NaMN, and can participate in both de novo and salvage pathways of NAD synthesis. 192
22736 173917 cd02166 NMNAT_Archaea Nicotinamide/nicotinate mononucleotide adenylyltransferase, archaeal. This family of archaeal proteins exhibits nicotinamide-nucleotide adenylyltransferase (NMNAT) activity utilizing the salvage pathway to synthesize NAD. In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a number of proteins which are uncharacterized with respect to activity, are also present. 163
22737 173918 cd02167 NMNAT_NadR Nicotinamide/nicotinate mononucleotide adenylyltransferase of bifunctional NadR-like proteins. NMNAT domain of NadR protein. The NadR protein (NadR) is a bifunctional enzyme possessing both NMN adenylyltransferase (NMNAT) and ribosylnicotinamide kinase (RNK) activities. Its function is essential for the growth and survival of H. influenzae and thus may present a new highly specific anti-infectious drug target. The N-terminal domain that hosts the NMNAT activity is closely related to archaeal NMNAT. The bound NAD at the active site of the NMNAT domain reveals several critical interactions between NAD and the protein. The NMNAT domain of hiNadR defines yet another member of the pyridine nucleotide adenylyltransferase 158
22738 173919 cd02168 NMNAT_Nudix Nicotinamide/nicotinate mononucleotide adenylyltransferase of bifunctional proteins, also containing a Nudix hydrolase domain. N-terminal NMNAT (Nicotinamide/nicotinate mononucleotide adenylyltransferase) domain of a novel bifunctional enzyme endowed with NMN adenylyltransferase and Nudix hydrolase activities. This domain is highly homologous to the archaeal NMN adenylyltransferase that catalyzes NAD synthesis from NMN and ATP. NMNAT is an essential enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. The C-terminal domain of this enzyme shares homology with the archaeal ADP-ribose pyrophosphatase, a member of the 'Nudix' hydrolase family. 181
22739 173920 cd02169 Citrate_lyase_ligase Citrate lyase ligase. Citrate lyase ligase, also known as [Citrate (pro-3S)-lyase] ligase, is responsible for acetylation of the (2-(5''-phosphoribosyl)-3'-dephosphocoenzyme-A) prosthetic group of the gamma subunit of citrate lyase, converting the inactive thiol form of this enzyme to the active form. The acetylation of 1 molecule of deacetyl-citrate lyase to enzymatically active citrate lyase requires 6 molecules of ATP. The adenylyltransferase activity of the enzyme involves the formation of AMP and pyrophosphate in the acetylation reaction. 297
22740 173921 cd02170 cytidylyltransferase cytidylyltransferase. The cytidylyltransferase family includes cholinephosphate cytidylyltransferase (CCT), glycerol-3-phosphate cytidylyltransferase, RafE and phosphoethanolamine cytidylyltransferase (ECT). All enzymes catalyze the transfer of a cytidylyl group from CTP to various substrates. 136
22741 173922 cd02171 G3P_Cytidylyltransferase glycerol-3-phosphate cytidylyltransferase. Glycerol-3-phosphate cytidylyltransferase,(CDP-glycerol pyrophosphorylase). Glycerol-3-phosphate cytidyltransferase acts in pathways of teichoic acid biosynthesis. Teichoic acids are substituted polymers, linked by phosphodiester bonds, of glycerol, ribitol, etc. An example is poly(glycerol phosphate), the major teichoic acid of the Bacillus subtilis cell wall. Most, but not all, species encoding proteins in this family are Gram-positive bacteria. A closely related protein assigned a different function experimentally is a human ethanolamine-phosphate cytidylyltransferase. 129
22742 173923 cd02172 RfaE_N N-terminal domain of RfaE. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in Escherichia coli, and separate proteins in other organisms. Domain I is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (this family) adds ADP to yield ADP-D-glycero-D-manno-heptose . 144
22743 173924 cd02173 ECT CTP:phosphoethanolamine cytidylyltransferase (ECT). CTP:phosphoethanolamine cytidylyltransferase (ECT) catalyzes the conversion of phosphoethanolamine to CDP-ethanolamine as part of the CDP-ethanolamine biosynthesis pathway. ECT expression in hepatocytes is localized predominantly to areas of the cytoplasm that are rich in rough endoplasmic reticulum. Several ECTs, including yeast and human ECT, have large repetitive sequences located within their N- and C-termini. 152
22744 173925 cd02174 CCT CTP:phosphocholine cytidylyltransferase. CTP:phosphocholine cytidylyltransferase (CCT) catalyzes the condensation of CTP and phosphocholine to form CDP-choline as the rate-limiting and regulatory step in the CDP-choline pathway. CCT is unique in that its enzymatic activity is regulated by the extent of its association with membrane structures. A current model posits that the elastic stress of the bilayer curvature is sensed by CCT and this governs the degree of membrane association, thus providing a mechanism for both positive and negative regulation of activity. 150
22745 185684 cd02175 GH16_lichenase lichenase, member of glycosyl hydrolase family 16. Lichenase, also known as 1,3-1,4-beta-glucanase, is a member of glycosyl hydrolase family 16, that specifically cleaves 1,4-beta-D-glucosidic bonds in mixed-linked beta glucans that also contain 1,3-beta-D-glucosidic linkages. Natural substrates of beta-glucanase are beta-glucans from grain endosperm cell walls or lichenan from the Islandic moss, Cetraria islandica. This protein is found not only in bacteria but also in anaerobic fungi. This domain includes two seven-stranded antiparallel beta-sheets that are adjacent to one another forming a compact, jellyroll beta-sandwich structure. 212
22746 185685 cd02176 GH16_XET Xyloglucan endotransglycosylase, member of glycosyl hydrolase family 16. Xyloglucan endotransglycosylases (XETs) cleave and religate xyloglucan polymers in plant cell walls via a transglycosylation mechanism. Xyloglucan is a soluble hemicellulose with a backbone of beta-1,4-linked glucose units, partially substituted with alpha-1,6-linked xylopyranose branches. It binds noncovalently to cellulose, cross-linking the adjacent cellulose microfibrils, giving it a key structural role as a matrix polymer. Therefore, XET plays an important role in all plant processes that require cell wall remodeling. 263
22747 185686 cd02177 GH16_kappa_carrageenase Kappa-carrageenase, member of glycosyl hydrolase family 16. Kappa-carrageenase is a glycosyl hydrolase family 16 (GH16) member that hydrolyzes the internal beta-1,4-linkage of kappa-carrageenans, a hydrophilic polysaccharide found in the cell wall of Rhodophyceae, marine red algae. Carrageenans are linear chains of galactose units linked by alternating D-alpha-1,3- and D-beta-1,4-linkages that are additionally modified by a 3,6-anhydro-bridge. Depending on the position and number of sulfate ester modifications they are subdivided into kappa-, iota-, and lambda-carrageenans, kappa being modified once. Carrageenans form thermo-reversible gels widely used for industrial applications. Kappa-carrageenases exist in bacteria belonging to at least three phylogenetically distant branches, including pseudoalteromonas, planctomycetes, and bacteroidetes. This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold. 269
22748 185687 cd02178 GH16_beta_agarase Beta-agarase, member of glycosyl hydrolase family 16. Beta-agarase is a glycosyl hydrolase family 16 (GH16) member that hydrolyzes the internal beta-1,4-linkage of agarose, a hydrophilic polysaccharide found in the cell wall of Rhodophyceae, marine red algae. Agarose is a linear chain of galactose units linked by alternating L-alpha-1,3- and D-beta-1,4-linkages that are additionally modified by a 3,6-anhydro-bridge. Agarose forms thermo-reversible gels that are widely used in the food industry or as a laboratory medium. While beta-agarases are also found in two other families derived from the sequence-based classification of glycosyl hydrolases (GH50 and GH86), the GH16 members are most abundant. This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold. 258
22749 185688 cd02179 GH16_beta_GRP beta-1,3-glucan recognition protein, member of glycosyl hydrolase family 16. Beta-GRP (beta-1,3-glucan recognition protein) is one of several pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complexes with pathogen-associated beta-1,3-glucans and then transduces signals necessary for activation of an appropriate innate immune response. They are present in insects and lack all catalytic residues. This subgroup also contains related proteins of unknown function that still contain the active site. Their structures adopt a jelly roll fold with a deep active site channel harboring the catalytic residues, like those of other glycosyl hydrolase family 16 members. 321
22750 185689 cd02180 GH16_fungal_KRE6_glucanase Saccharomyces cerevisiae KRE6 and related glucanases, member of glycosyl hydrolase family 16. KRE6 is a Saccharomyces cerevisiae glucanase that participates in the synthesis of beta-1,6-glucan, a major structural component of the cell wall. It is a Golgi membrane protein required for normal beta-1,6-glucan levels in the cell wall. KRE6 is closely related to laminarinase, a glycosyl hydrolase family 16 member that hydrolyzes 1,3-beta-D-glucosidic linkages in 1,3-beta-D-glucans such as laminarins, curdlans, paramylons, and pachymans, with very limited action on mixed-link (1,3-1,4-)-beta-D-glucans. 295
22751 185690 cd02181 GH16_fungal_Lam16A_glucanase fungal 1,3(4)-beta-D-glucanases, similar to Phanerochaete chrysosporium laminarinase 16A. Group of fungal 1,3(4)-beta-D-glucanases, similar to Phanerochaete chrysosporium laminarinase 16A. Lam16A belongs to the 'nonspecific' 1,3(4)-beta-glucanase subfamily, although beta-1,6 branching and beta-1,4 bonds specifically define where Lam16A hydrolyzes its substrates, like curdlan (beta-1,3-glucan), lichenin (beta-1,3-1,4-mixed linkage glucan), and laminarin (beta-1,6-branched-1,3-glucan). 293
22752 185691 cd02182 GH16_Strep_laminarinase_like Streptomyces laminarinase-like, member of glycosyl hydrolase family 16. Proteins similar to Streptomyces sioyaensis beta-1,3-glucanase (laminarinase) present in Actinomycetales as well as Pezizomycotina. Laminarinases belong to glycosyl hydrolase family 16 and hydrolyze the glycosidic bond of the 1,3-beta-linked glucan, a major component of fungal and plant cell walls and the structural and storage polysaccharides (laminarin) of marine macro-algae. Members of the GH16 family have a conserved jelly roll fold with an active site channel. 259
22753 185692 cd02183 GH16_fungal_CRH1_transglycosylase glycosylphosphatidylinositol-glucanosyltransferase. Group of fungal GH16 members related to Saccharomyces cerevisiae Crh1p. Crh1p and Crh2p are transglycosylases that are required for the linkage of chitin to beta(1-3)glucose branches of beta(1-6)glucan, an important step in the assembly of new cell wall. Both have been shown to be glycosylphosphatidylinositol (GPI)-anchored. A third homologous protein, Crr1p, functions in the formation of the spore wall. They belong to the family 16 of glycosyl hydrolases that includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues. 203
22754 100026 cd02185 AroH Chorismate mutase (AroH) is one of at least five chorismate-utilizing enzymes present in microorganisms that catalyze the rearrangement of chorismate to prephenic acid, the first committed step in the biosynthesis of aromatic amino acids. In prokaryotes, chorismate mutase may be fused to prephenate dehydratase, prephenate dehydrogenase, or 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) as part of a bifunctional enzyme. The AroH domain forms a homotrimer with three-fold symmetry. 117
22755 276955 cd02186 alpha_tubulin The alpha-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules. The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications. The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. 434
22756 276956 cd02187 beta_tubulin The beta-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules. The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications. The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. 425
22757 276957 cd02188 gamma_tubulin The gamma-tubulin family. Gamma-tubulin is a ubiquitous phylogenetically conserved member of tubulin superfamily. Gamma is a low abundance protein present within the cells in both various types of microtubule-organizing centers and cytoplasmic protein complexes. Gamma-tubulin recruits the alpha/beta-tubulin dimers that form the minus ends of microtubules and is thought to be involved in microtubule nucleation and capping. 430
22758 276958 cd02189 delta_zeta_tubulin-like The delta- and zeta-tubulin families. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. Delta-tubulin plays an essential role in forming the triplet microtubules of centrioles and basal bodies. 433
22759 276959 cd02190 epsilon_tubulin The epsilon-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The epsilon-tubulins which are widespread but not ubiquitous among eukaryotes play a role in basal body/centriole morphogenesis. 449
22760 276960 cd02191 FtsZ_CetZ-like Subfamily of FtsZ and Cell-structure-related euryarchaeota tubulin/FtsZ homolog-like. FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes. CetZ-like proteins are related to tubulin and FtsZ and co-exist with FtsZ in many archaea. However, a recent study found that CetZ proteins (formerly annotated FtsZ type 2) are not required for cell division. Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics. The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development. 308
22761 100028 cd02192 PurM-like3 AIR synthase (PurM) related protein, subgroup 3 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain. 283
22762 100029 cd02193 PurL Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 272
22763 100030 cd02194 ThiL ThiL (Thiamine-monophosphate kinase) plays a dual role in de novo biosynthesis and in salvage of exogenous thiamine. Thiamine salvage occurs in two steps, with thiamine kinase catalyzing the formation of thiamine phosphate, and ThiL catalyzing the conversion of this intermediate to thiamine pyrophosphate. The N-terminal domain of ThiL binds ATP and is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, FGAM synthase and selenophosphate synthetase (SelD). 291
22764 100031 cd02195 SelD Selenophosphate synthetase (SelD) catalyzes the conversion of selenium to selenophosphate which is required by a number of bacterial, archaeal and eukaryotic organisms for synthesis of Secys-tRNA, the precursor of selenocysteine in selenoenzymes. The N-terminal domain of SelD is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, and FGAM synthase and is thought to bind ATP. 287
22765 100032 cd02196 PurM PurM (Aminoimidazole Ribonucleotide [AIR] synthetase), one of eleven enzymes required for purine biosynthesis, catalyzes the conversion of formylglycinamidine ribonucleotide (FGAM) and ATP to AIR, ADP, and Pi, the fifth step in de novo purine biosynthesis. The N-terminal domain of PurM is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP. 297
22766 100033 cd02197 HypE HypE (Hydrogenase expression/formation protein). HypE is involved in Ni-Fe hydrogenase biosynthesis. HypE dehydrates its own carbamoyl moiety in an ATP-dependent process to yield the enzyme thiocyanate. The N-terminal domain of HypE is related to the ATP-binding domains of the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP. 293
22767 100005 cd02198 YjgH_like YjgH belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 111
22768 100006 cd02199 YjgF_YER057c_UK114_like_1 This group of proteins belong to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 142
22769 276961 cd02201 FtsZ_type1 Filamenting temperature sensitive mutant Z, type 1. FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes. FtsZ is capable of polymerizing in a GTP-driven process into structures similar to those formed by tubulin. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 303
22770 276962 cd02202 CetZ_tubulin-like Cell-structure-related euryarchaeota tubulin/FtsZ homologs. CetZ proteins comprise a distinct tubulin/FtsZ family. The crystal structures of CetZ contain the FtsZ/tubulin superfamily fold and its family members have mosaic of tubulin-like and FtsZ-like amino acid residues. However, a recent study found that CetZ proteins (formerly annotated FtsZ type 2) are not required for cell division, whereas FtsZ proteins play an important role. Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics. The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development. 357
22771 100034 cd02203 PurL_repeat1 PurL subunit of the formylglycinamide ribonucleotide amidotransferase (FGAR-AT), first repeat. FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 313
22772 100035 cd02204 PurL_repeat2 PurL subunit of the formylglycinamide ribonucleotide amidotransferase (FGAR-AT), second repeat. FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 264
22773 341358 cd02205 CBS_pair_SF Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains superfamily. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 113
22774 380338 cd02208 cupin_RmlC-like RmlC-like cupin superfamily. This superfamily contains proteins similar to the RmlC (dTDP (deoxythymidine diphosphates)-4-dehydrorhamnose 3,5-epimerase)-like cupins. RmlC is a dTDP-sugar isomerase involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. Cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins. This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin ('cupa' is the Latin term for small barrel). The active site of members of this superfamily is generally located at the center of a conserved barrel and usually includes a metal ion. The different functional classes in this superfamily include single domain bacterial isomerases and epimerases involved in the modification of cell wall carbohydrates, two domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain nuclear transcription factors involved in legume root nodulation. 73
22775 380339 cd02209 cupin_XRE_C XRE (Xenobiotic Response Element) family transcriptional regulators, C-terminal cupin domain. This family contains transcriptional regulators containing an N-terminal XRE (Xenobiotic Response Element) family helix-turn-helix (HTH) DNA-binding domain and a C-terminal cupin domain. Included in this family is Escherichia coli transcription factor SutR (YdcN) that plays a regulatory role in sulfur utilization; it regulates a set of genes involved in the generation of sulfate and its reduction, the synthesis of cysteine, the synthesis of enzymes containing Fe-S as cofactors, and the modification of tRNA with use of sulfur-containing substrates. This family belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 90
22776 380340 cd02210 cupin_BLR2406-like Bradyrhizobium japonicum BLR2406 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLR2406, a Bradyrhizobium japonicum protein of unknown function with a cupin beta barrel domain. Proteins in this subfamily appear to align closest to RmlC carbohydrate epimerase which is involved in dTDP-L-rhamnose production, and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 98
22777 380341 cd02211 cupin_UGlyAH_N (S)-ureidoglycine aminohydrolase and related proteins, N-terminal cupin domain. This family includes the N-terminal cupin domain of (S)-ureidoglycine aminohydrolase (UGlyAH), an enzyme that converts (S)-ureidoglycine into (S)-ureidoglycolate and ammonia, providing the final substrate to the ureide pathway. The ureide pathway has recently been identified as the metabolic route of purine catabolism in plants and some bacteria where, uric acid, which is a major product of the early stage of purine catabolism, is degraded into glyoxylate and ammonia via stepwise reactions by seven different enzymes. Thus, this pathway has a possible physiological role in mobilization of purine ring nitrogen for further assimilation. This enzyme from Arabidopsis thaliana(AtUGlyAH) has been shown to bind a Mn2+ ion, via the C-terminal cupin domain, which acts as a molecular anchor to bind (S)-ureidoglycine, and its binding mode dictates the enantioselectivity of the reaction. The structure of AtUGlyAH shows a bi-cupin fold with a conserved "jelly roll-like" beta-barrel fold and an octameric functional unit. Several structural homologs of UGlyAH, including the Escherichia coli ortholog YlbA (also known as GlxB6), also exhibit similar features. 117
22778 380342 cd02212 cupin_UGlyAH_C (S)-ureidoglycine aminohydrolase and related proteins, C-terminal cupin domain. This family includes the C-terminal cupin domain of (S)-Ureidoglycine aminohydrolase (UGlyAH), an enzyme that converts (S)-ureidoglycine into (S)-ureidoglycolate and ammonia, providing the final substrate to the ureide pathway. The ureide pathway has recently been identified as the metabolic route of purine catabolism in plants and some bacteria where, uric acid, which is a major product of the early stage of purine catabolism, is degraded into glyoxylate and ammonia via stepwise reactions by seven different enzymes. Thus, this pathway has a possible physiological role in mobilization of purine ring nitrogen for further assimilation. This enzyme from Arabidopsis thaliana(AtUGlyAH) has been shown to bind a Mn2+ ion,via the C-terminal cupin domain, which acts as a molecular anchor to bind (S)-ureidoglycine, and its binding mode dictates the enantioselectivity of the reaction. The structure of AtUGlyAH shows a bi-cupin fold with a conserved "jelly roll-like" beta-barrel fold and an octameric functional unit. Several structural homologs of UGlyAH, including the Escherichia coli ortholog YlbA (also known as GlxB6), also exhibit similar features. 92
22779 380343 cd02213 cupin_PMI_typeII_C Phosphomannose isomerase type II, C-terminal cupin domain. This family includes the C-terminal cupin domain of mannose-6-phosphate isomerases (MPIs) which have been classified broadly into two groups, type I and type II, based on domain organization. This family contains type II phosphomannose isomerase (also known as PMI-GDP, phosphomannose isomerase/GDP-D-mannose pyrophosphorylase), a bifunctional enzyme with two domains that catalyze the first and third steps in the GDP-mannose pathway in which fructose 6-phosphate is converted to GDP-D-mannose. The N-terminal domain catalyzes the first and rate-limiting step, the isomerization from D-fructose-6-phosphate to D-mannose-6-phosphate, while the C-terminal cupin domain (represented in this alignment model) converts mannose 1-phosphate to GDP-D-mannose in the final step of the reaction. Although these two domains occur together in one protein in most organisms, they occur as separate proteins in certain cyanobacterial organisms. Also, although type I and type II MPIs have no overall sequence similarity, they share a conserved catalytic motif. 126
22780 380344 cd02214 cupin_MJ1618 Methanocaldococcus jannaschii MJ1618 and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to MJ1618, a Methanocaldococcus jannaschii protein of unknown function with a cupin beta barrel domain. The active site of members of the cupin superfamily is generally located at the center of a conserved barrel and usually includes a metal ion. 100
22781 380345 cd02215 cupin_QDO_N_C quercetinase, N- and C-terminal cupin domains. This family contains quercetinase (also known as quercetin 2,3-dioxygenase, 2,3QD, QDO and YxaG; EC 1.13.11.24), a mononuclear copper-dependent dioxygenase that catalyzes the cleavage of the flavonol quercetin (5,7,3',4'-tetrahydroxyflavonol) heterocyclic ring to produce 2-protocatechuoyl-phloroglucinol carboxylic acid and carbon monoxide. Bacillus subtilis quercetin 2,3-dioxygenase (QDO) is a homodimer that shows oxygenase activity with several divalent metals such as Mn2+, Co2+, Fe2+, and Cu2+, although the preferred one appears to be Mn2+. The dioxygen binds to the metal ion of the Cu-QDO-quercetin complex, yielding a Cu2+-superoxo quercetin radical intermediate, which then forms a Cu2+-alkylperoxo complex which then evolves into endoperoxide intermediate that decomposes to the product. Quercetinase is a bicupin with two tandem cupin beta-barrel domains, both of which are included in this alignment model. The pirins, which also belong to the cupin domain family, have been shown to catalyze a reaction involving quercetin and may have a function similar to that of quercetinase. 122
22782 380346 cd02216 cupin_GDO-like_N gentisate 1,2-dioxygenase, 1-hydroxy-2-naphthoate dioxygenase, and salicylate 1,2-dioxygenase, N-terminal cupin domain. This family includes the N-terminal cupin domains of three closely related bicupin aromatic ring-cleaving dioxygenases: gentisate 1,2-dioxygenase (GDO), salicylate 1,2-dioxygenase (SDO), and 1-hydroxy-2-naphthoate dioxygenase (NDO). GDO catalyzes the cleavage of the gentisate (2,5-dihydroxybenzoate) aromatic ring, a key step in the gentisate degradation pathway allowing soil bacteria to utilize 2,5-xylenol, 3,5-xylenol, and m-cresol as sole carbon and energy sources. NDO catalyzes the cleavage of 1-hydroxy-2-naphthoate as part of the bacterial phenanthrene degradation pathway. SDO is a ring cleavage dioxygenase from Pseudaminobacter salicylatoxidans that oxidizes salicylate to 2-oxohepta-3,5-dienedioic acid via a novel ring fission mechanism. SDO differs from other known GDOs and NDOs in its unique ability to oxidatively cleave many different salicylate, gentisate and 1-hydroxy-2-naphthoate substrates with high catalytic efficiency. The active site of these enzymes is located in the N-terminal domain but could be influenced by changes in the C-terminal domain, which lacks the strictly conserved metal-binding residues found in other cupin domains and is thought to be an inactive vestigial remnant. 108
22783 380347 cd02218 cupin_PGI cupin-type phosphoglucose isomerase. The cupin-type phosphoglucose isomerase (also called cupin-like glucose-6-phosphate isomerase or cPGI; EC 5.3.1.9) family is found in archaea and certain prokaryotes where they catalyze the reversible aldose-ketose isomerization of glucose 6-phosphate (G6P) and fructose 6-phosphate (F6P) as part of a unique variation of the Embden-Meyerhof glycolytic pathway. Cupin-PGIs represent a separate lineage in the evolution of phosphoglucose isomerases. Pyrococcus furiosus phosphoglucose isomerase (PfPGI) has been shown to be a metal-containing enzyme which catalyzes the interconversion of glucose 6-phosphate (G6P) and fructose 6-phosphate (F6P). These domains have a cupin beta-barrel fold capable of homodimerization. 168
22784 380348 cd02219 cupin_YjlB-like Bacillus subtilis YjlB and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to YjlB, a Bacillus subtilis protein of unknown function with a cupin beta barrel fold. The active site of members of the cupin superfamily is generally located at the center of a conserved barrel and usually includes a metal ion. The different functional classes in this superfamily include single domain bacterial isomerases and epimerases involved in the modification of cell wall carbohydrates, two-domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain nuclear transcription factors involved in legume root nodulation. 154
22785 380349 cd02220 cupin_ABP1 auxin-binding protein 1, cupin domain. Auxin-binding protein 1 (ABP1) is a soluble glycoprotein receptor that binds the plant hormone auxin, indole-3-acetic acid (IAA). ABP1 belongs to the ancient and functionally diverse germin/seed storage 7S protein superfamily. It is an important mediator of auxin action in plants and is essential for cell cycle control. Cellular auxin responses typically depend on auxin concentrations that mainly result from intercellular auxin transport and auxin biosynthesis, as well as metabolism. The functional inactivation of ABP1 results in cell cycle arrest, showing that ABP1 plays a critical role in cell cycle regulation, acting at both the G1/S and G2/M checkpoints. ABP1 is ubiquitous among green plants, found mainly within the endoplasmic reticulum (ER) and in smaller quantities at the cell surface associated with the plasma membrane. In Arabidopsis thaliana, ABP1 null mutations result in embryonic lethality while decreased ABP1 expression leads to severe retardation of leaf growth. 151
22786 380350 cd02221 cupin_TM1287-like Thermotoga maritima TM1287 decarboxylase, cupin domain. This family includes bacterial proteins homologous to TM1287 decarboxylase, a Thermotoga maritima manganese-containing cupin thought to catalyze the conversion of oxalate to formate and carbon dioxide, due to its similarity to oxalate decarboxylase (OXDC) from Bacillus subtilis. TM1287 shows a cupin fold with a conserved "jelly roll-like" beta-barrel fold and forms a homodimer. 93
22787 380351 cd02222 cupin_TM1459-like Thermotoga maritima TM1459 and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to Thermotoga maritima TM1459, a manganese-containing cupin that has been shown to cleave C=C bonds in the presence of alkylperoxide as oxidant in vitro. Its biological function is still unknown. This family also includes Halorhodospira halophila Hhal_0468. Structures of these proteins show a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer. 91
22788 380352 cd02223 cupin_Bh2720-like Bacillus halodurans Bh2720 and related proteins, cupin domain. This family includes bacterial, archaeal, and eukaryotic proteins similar to Bh2720, a Bacillus halodurans protein of unknown function with a cupin beta-barrel fold. 98
22789 380353 cd02224 cupin_SPO2919-like Silicibacter pomeroyi SPO2919 and related proteins, uncharacterized sugar phosphate isomerase with a cupin domain. This family includes proteins that are similar to sugar phosphate isomerase SPO2919 from Silicibacter pomeroyi and Afe_0303 from Acidithiobacillus ferrooxidans, but are as yet uncharacterized. Structures of these proteins show a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer. 105
22790 380354 cd02225 cupin_PA3510-like Pseudomonas aeruginosa PA3510 and related proteins, cupin domain. This family includes bacterial proteins homologous to PA3510, a Pseudomonas aeruginosa protein of unknown function with a beta-barrel fold that belongs to the cupin superfamily. 150
22791 380355 cd02226 cupin_YdbB-like Bacillus subtilis YdbB and related proteins, cupin domain. This family includes bacterial proteins homologous to YdbB, a Bacillus subtilis protein of unknown function. It also includes protein Nmb1881 from Neisseria meningitidis, also of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 94
22792 380356 cd02227 cupin_TM1112-like Thermotoga maritima TM1112 and related proteins, cupin domain. This family includes bacterial and plant proteins homologous to TM1112, a Thermotoga maritima protein of unknown function with a cupin beta barrel domain. TM1112 (also known as DUF861) is a subfamily of RmlC-like cupins with a conserved "jelly roll-like" beta-barrel fold; structures indicate that a monomer is the biologically-relevant form. 69
22793 380357 cd02228 cupin_EutQ Clostridium difficile EutQ and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to ethanolamine utilization protein EutQ found in Clostridium difficile, as well as in other bacteria, including the enteric pathogens Salmonella enterica and Enterococcus faecalis. EutQ is encoded by the eutQ gene which is part of the eut (ethanolamine utilization) operon found to be essential during anoxic growth of S. enterica on ethanolamine and tetrathionate. In C. difficile, inability to utilize ethanolamine results in greater virulence and a shorter time to morbidity in the animal model, suggesting that, in contrast to other intestinal pathogens, the metabolism of ethanolamine can delay the onset of disease. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. In contrast to the metal-binding catalytic cupins, the EutQ family does not possess the histidine residues that are responsible for metal coordination in the oxidoreductase and epimerase classes of cupins. 84
22794 380358 cd02230 cupin_HP0902-like Helicobacter pylori HP0902 and related proteins, cupin domain. This family includes prokaryotic and archaeal proteins homologous to HP0902, a functionally uncharacterized protein from Helicobacter pylori and Spy1581, a protein of unknown function from Streptococcus pyogenes. These proteins demonstrate all-beta cupin folds that cannot bind metal ions due to the absence of a metal-binding histidine that is conserved in many metallo-cupins. HP0902 is able to bind bacterial endotoxin lipopolysaccharides (LPS) through its surface-exposed loops, where metal-binding sites are usually found in other metallo-cupins, and thus may have a putative role in H. pylori pathogenicity. 83
22795 380359 cd02231 cupin_BLL6423-like Bradyrhizobium japonicum BLL6423 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLL6423, a Bradyrhizobium japonicum protein of unknown function; it includes a structure of an uncharacterized protein from Novosphingobium aromaticivorans. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 108
22796 380360 cd02232 cupin_ARD acireductone dioxygenase (ARD), cupin domain. Acireductone dioxygenase (ARD; also known as 1,2-dihydroxy-3-keto-5-methylthiopentene dioxygenase) catalyzes the oxidation of 1,2-dihydroxy-3-keto-5-methylthiopentene to yield two different products depending on which active site metal is present (Fe2+ or Ni2+) as part of the methionine salvage pathway. The ARD apo-enzyme, obtained after the metal is removed, is catalytically inactive. The Fe(II)-ARD reaction yields an alpha-keto acid and formic acid, while Ni(II)-ARD instead catalyzes a shunt out of the methionine salvage pathway, yielding methylthiocarboxylic acid, formic acid, and CO. ARD belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 134
22797 380361 cd02233 cupin_HNL-like Granulicella tundricola hydroxynitrile lyase (GtHNL) and related proteins, cupin domain. This family includes archaeal, eukaryotic, and bacterial proteins homologous to hydroxynitrile lyase from Granulicella tundricola (GtHNL), a novel class of HNLs that does not show any sequence or structural similarity to any other HNL and does not contain conserved motifs typical of HNLs. HNLs comprise a diverse group of enzymes that vary in terms of their substrate specificity, enantioselectivity and the need for a co-factor. In plants, they catalyze the reversible cleavage of cyanohydrins, yielding HCN and aldehydes or ketones. Also included in this family is TM1010 from Thermotoga maritima, a protein of unknown function. Some but not all members of this family have N- or C-terminal carboxymuconolactone decarboxylase domains in addition to the cupin domain. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 106
22798 380362 cd02234 cupin_BLR7677-like Bradyrhizobium japonicum BLR7677 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLR7677, a Bradyrhizobium japonicum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 103
22799 380363 cd02235 cupin_BLL4011-like Bradyrhizobium diazoefficiens BLL4011 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLL4011, a Bradyrhizobium diazoefficiens protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 100
22800 380364 cd02236 cupin_CV2614-like Chromobacterium violaceum CV2614 and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to CV2614, a Chromobacterium violaceum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 102
22801 380365 cd02237 cupin_DAD_ChrR 2,4'-Dihydroxyacetophenone dioxygenase (DAD) and anti-sigma factor ChrR, and similar proteins; cupin domain. This family includes the proteins 2,4'-Dihydroxyacetophenone dioxygenase (DAD) and anti-sigma factor ChrR. DAD catalyzes the oxidation of 2,4'-dihydroxyacetophenone to 4-hydroxybenzoate and formate as part of the 4-hydroxyacetophenone catabolic pathway. The enzyme is a homotetramer containing one iron per molecule of enzyme. Anti-sigma factor ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the Rsp extra cytoplasmic function (ECF) sigma factor E (sigmaE). Some ChrR members contain tandem repeats of two distinct homologous functional domains. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 82
22802 380366 cd02238 cupin_KdgF pectin degradation protein KdgF and related proteins, cupin domain. This family includes bacterial and archaeal pectin degradation protein KdgF that catalyzes the linearization of unsaturated uronates from both pectin and alginate, which are polysaccharides found in the cell walls of plants and brown algae, respectively, and represent an important source of carbon. These polysaccharides, mostly consisting of chains of uronates, can be metabolized by bacteria through a pathway of enzymatic steps to the key metabolite 2-keto-3-deoxygluconate (KDG). Pectin degradation is used by many plant-pathogenic bacteria during infection, and also, pectin and alginate can both represent abundant sources of carbohydrate for the production of biofuels. These proteins belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 104
22803 380367 cd02240 cupin_OxDC Oxalate decarboxylase (OxDC), cupin domain. Oxalate decarboxylase (OxDC; EC 4.1.1.2) is a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds and both domains are included in this alignment. Each OxDC cupin domain contains one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 145
22804 380368 cd02241 cupin_OxOx Oxalate oxidase (germin), cupin domain. Oxalate oxidase (OxOx, also known as germin; EC 1.2.3.4) catalyzes the manganese-dependent oxidative decarboxylation of oxalate to carbon dioxide and hydrogen peroxide (H2O2). It is widespread in fungi and various plant tissues and may play a role in plant signaling and defense. This enzyme has been employed in a widely used assay for detecting urinary oxalate levels. Also, the gene encoding OxOx from barley roots has been expressed in oilseed rape in order to provide a defense against externally supplied oxalic acid. In germin, the predominant protein produced during the early phase of wheat germination, it is believed that H2O2 production is employed as a defense mechanism in response to infection by pathogens. Germin is also a marker of growth onset in cell walls in germinating cereals. The H2O2 produced by OxOx, together with the Ca2+ released by degradation of calcium oxalate, are thought to mediate cell wall cross-linking at high concentrations. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 191
22805 380369 cd02242 cupin_11S_legumin_N 11S legumin seed storage globulin, N-terminal cupin domain. This family contains the N-terminal domains of 11S legumin seed storage proteins that supply nutrition for seed germination, such as glycinin and legumin, including many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, Pecan allergen Car i 4, hazelnut nut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). They are synthesized as propeptides in the endoplasmic reticulum and transported to the secretory vesicles as a homotrimer. The propeptides are processed as they are sorted in the secretory vesicles. The homotrimer binds another homotrimer to form a homohexamer with 32-point symmetry formed by a face-to-face stacking of the two trimers. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 209
22806 380370 cd02243 cupin_11S_legumin_C 11S legumin seed storage globulin, C-terminal cupin domain. This family contains the C-terminal domains of 11S legumin seed storage proteins that supply nutrition for seed germination, such as glycinin and legumin, including many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, Pecan allergen Car i 4, hazelnut nut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). They are synthesized as propeptides in the endoplasmic reticulum and transported to the secretory vesicles as a homotrimer. The propeptides are processed as they are sorted in the secretory vesicles. The homotrimer binds another homotrimer to form a homohexamer with 32-point symmetry formed by a face-to-face stacking of the two trimers. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 155
22807 380371 cd02244 cupin_7S_vicilin-like_N 7S vicilin seed storage globulin, N-terminal cupin domain. This family contains the N-terminal domains of plant 7S seed storage proteins such as vicilin, and includes beta-conglycinin, phaseolin, canavalin, conglutin-beta, a chromatin protein in Pisum sativum called P54, and a sucrose binding protein in soybean called SBP. These 7S globulins also include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2, and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The vicilin peptides formed by trypsin or chymotrypsin digestion exhibit antihypertensive effects. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 178
22808 380372 cd02245 cupin_7S_vicilin-like_C 7S vicilin seed storage globulin, C-terminal cupin domain. This family contains the C-terminal domains of plant 7S seed storage proteins such as vicilin, and includes beta-conglycinin, phaseolin, canavalin, conglutin-beta, a chromatin protein in Pisum sativum called P54, and a sucrose binding protein in soybean called SBP. These 7S globulins also include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2, and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The vicilin peptides formed by trypsin or chymotrypsin digestion exhibit antihypertensive effects. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 166
22809 380373 cd02247 cupin_pirin_C pirin, C-terminal cupin domain. This family contains the C-terminal domain of pirin, a nuclear protein that is highly conserved among mammals, plants, fungi, and prokaryotes. It is widely expressed in dot-like subnuclear structures in human tissues such as liver and heart. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. The pirins have been assigned as a subfamily of the cupin superfamily based on structure and sequence similarity. The pirins have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 76
22810 239068 cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to hatch or to evade the host immune system. Mammalian CPs are primarily lysosomal enzymes with the exception of cathepsin W, which is retained in the endoplasmic reticulum. They are responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. In addition to its inhibitory role, the propeptide is required for proper folding of the newly synthesized enzyme and its stabilization in denaturing pH conditions. Residues within the propeptide region also play a role in the transport of the proenzyme to lysosomes or acidified vesicles. Also included in this subfamily are proteins classified as non-peptidase homologs, which lack peptidase activity or have missing active site residues. 210
22811 239069 cd02249 ZZ Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins. 46
22812 239070 cd02252 nylC_like nylC-like family; composed of proteins with similarity to Flavobacterium endo-type 6-aminohexanoate-oligomer hydrolase (EIII), the product of the nylon oligomer degradation gene, nylC. EIII is an amide hydrolase that catalyzes the degradation of highly-polymerized 6-aminohexanoate oligomers. Together with other nylon degradation enzymes, such as 6-aminohexanoate cyclic dimer hydrolase (EI) and 6-aminohexanoate dimer hydrolase (EII), EIII plays a role in the detoxification and biological removal of the synthetic by-products of nylon manufacture. EIII shows sequence similarity to L-aminopeptidase D-amidase/D-esterase (DmpA), an aminopeptidase that releases N-terminal D and L amino acids from peptide substrates. Like DmpA, EIII undergoes autocatalytic cleavage in front of a nucleophile to form a heterodimer. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine. 260
22813 239071 cd02253 DmpA L-Aminopeptidase D-amidase/D-esterase (DmpA) family; DmpA catalyzes the release of N-terminal D and L amino acids from peptide substrates. DmpA is synthesized as a single polypeptide precursor, which is autocatalytically cleaved to the active heterodimeric form. The cleavage results in two polypeptide chains, with one chain containing an N-terminal nucleophile. This group represents one of the rare aminopeptidases that are not metalloenzymes. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine. 339
22814 187736 cd02255 Peptidase_C12 Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1. The ubiquitin C-terminal hydrolase (UCH; ubiquitinyl hydrolase; ubiquitin thiolesterase) family of deubiquitinating enzymes (DUBs) consists of four members to date: UCH-L1, UCH-L3, UCH-L5 (UCH37) and BRCA1-associated protein-1 (BAP1), all containing a conserved catalytic domain with cysteine peptidase activity. UCH-L1 hydrolyzes carboxyl terminal esters and amides of ubiquitin (Ub). Dysfunction of this hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD) and neurofibrillary tangles, linked to Alzheimer's disease (AD). UCH-L1, in its dimeric form, has additional enzymatic activity as a ubiquitin ligase. UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. UCH-L3 can also interact with Lys48-linked Ub dimers to protect it from degradation while inhibiting its hydrolase activity at the same time. UCH-L1 and UCH-L3 are the most closely related of the UCH members. UCH-L5 (UCH37) is involved in the deubiquitinating activity in the 19S proteasome regulatory complex. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. It consists of the N-terminal UCH domain and two predicted nuclear localization signals (NLSs), only one of which is functional. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. There is growing evidence that UCH enzymes and human malignancies are closely correlated. Studies show that UCH enzymes play a crucial role in some signaling pathways and in cell-cycle regulation. 222
22815 239072 cd02257 Peptidase_C19 Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 255
22816 199210 cd02258 Peptidase_C25_N Peptidase C25 family N-terminal domain, found in Arg-gingipain (Rgp), Lys-gingipain (Kgp) and related proteins. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network. 382
22817 239073 cd02259 Peptidase_C39_like Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in all sub-families. 122
22818 187535 cd02266 SDR Short-chain dehydrogenases/reductases (SDR). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase (KR) domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 186
22819 100064 cd02325 R3H R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 59
22820 239074 cd02334 ZZ_dystrophin Zinc finger, ZZ type. Zinc finger present in dystrophin and dystrobrevin. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dystrophin attaches actin filaments to an integral membrane glycoprotein complex in muscle cells. The ZZ domain in dystrophin has been shown to be essential for binding to the membrane protein beta-dystroglycan. 49
22821 239075 cd02335 ZZ_ADA2 Zinc finger, ZZ type. Zinc finger present in ADA2, a putative transcriptional adaptor, and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 49
22822 239076 cd02336 ZZ_RSC8 Zinc finger, ZZ type. Zinc finger present in RSC8 and related proteins. RSC8 is a component of the RSC complex, which is closely related to the SWI/SNF complex and is involved in remodeling chromatin structure. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding. 45
22823 239077 cd02337 ZZ_CBP Zinc finger, ZZ type. Zinc finger present in CBP/p300 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. CREB-binding protein (CBP) is a large multidomain protein that provides binding sites for transcriptional coactivators; the role of the ZZ domain in CBP/p300 is unclear. 41
22824 239078 cd02338 ZZ_PCMF_like Zinc finger, ZZ type. Zinc finger present in potassium channel modulatory factor (PCMF) 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Human potassium channel modulatory factor 1 or FIGC has been shown to possess intrinsic E3 ubiquitin ligase activity and to promote ubiquitination. 49
22825 239079 cd02339 ZZ_Mind_bomb Zinc finger, ZZ type. Zinc finger present in Drosophila Mind bomb (D-mib) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Mind bomb is an E3 ubiquitin ligase that has been shown to regulate signaling by the Notch ligand Delta in Drosophila melanogaster. 45
22826 239080 cd02340 ZZ_NBR1_like Zinc finger, ZZ type. Zinc finger present in Drosophila ref(2)P, NBR1, Human sequestosome 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Drosophila ref(2)P appears to control the multiplication of sigma rhabdovirus. NBR1 (Next to BRCA1 gene 1 protein) interacts with fasciculation and elongation protein zeta-1 (FEZ1) and calcium and integrin binding protein (CIB), and may function in cell signalling pathways. Sequestosome 1 is a phosphotyrosine independent ligand for the Lck SH2 domain and binds noncovalently to ubiquitin via its UBA domain. 43
22827 239081 cd02341 ZZ_ZZZ3 Zinc finger, ZZ type. Zinc finger present in ZZZ3 (ZZ finger containing 3) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 48
22828 239082 cd02342 ZZ_UBA_plant Zinc finger, ZZ type. Zinc finger present in plant ubiquitin-associated (UBA) proteins. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding. 43
22829 239083 cd02343 ZZ_EF Zinc finger, ZZ type. Zinc finger present in proteins with an EF_hand motif. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 48
22830 239084 cd02344 ZZ_HERC2 Zinc finger, ZZ type. Zinc finger present in HERC2 and related proteins. HERC2 is a potential E3 ubiquitin protein ligase and/or guanine nucleotide exchange factor. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 45
22831 239085 cd02345 ZZ_dah Zinc finger, ZZ type. Zinc finger present in Drosophila dah and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dah (discontinuous actin hexagon) is a membrane associated protein essential for cortical furrow formation in Drosophila. 49
22832 411803 cd02393 KH-I_PNPase type I K homology (KH) RNA-binding domain found in polyribonucleotide nucleotidyltransferase (PNPase) and similar proteins. PNPase, also called polynucleotide phosphorylase, is a polyribonucleotide nucleotidyl transferase that degrades mRNA in prokaryotes and plant chloroplasts. It catalyzes the phosphorolysis of single-stranded polyribonucleotides processively in the 3'- to 5'-direction. It is also involved, along with RNase II, in tRNA processing. The C-terminal region of PNPase contains domains homologous to those in other RNA binding proteins: a KH domain and an S1 domain. The model corresponds to the KH domain. 70
22833 411804 cd02394 KH-I_Vigilin_rpt6 sixth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the sixth one. 68
22834 411805 cd02395 KH-I_BBP type I K homology (KH) RNA-binding domain found in yeast branchpoint-bridging protein (BBP) and similar proteins. Yeast BBP, also called mud synthetic-lethal 5 protein, or splicing factor 1, or zinc finger protein BBP, is a mammalian splicing factor SF1 ortholog. It is involved in protein-protein interactions that bridge the 3' and 5' splice-site ends of the intron during the early steps of yeast pre-mRNA splicing. BBP interacts specifically with the pre-mRNA branchpoint sequence UACUAAC. 92
22835 411806 cd02396 KH-I_PCBP_rpt2 second type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 72
22836 239090 cd02406 CRS2 Chloroplast RNA splicing 2 (CRS2) is a nuclear-encoded protein required for the splicing of group II introns in the chloroplast. CRS2 forms stable complexes with two CRS2-associated factors, CAF1 and CAF2, which are required for the splicing of distinct subsets of CRS2-dependent introns. CRS2 is closely related to bacterial peptidyl-tRNA hydrolases (PTH). 191
22837 239091 cd02407 PTH2_family Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes. 115
22838 411780 cd02409 KH-II_SF type II K-homology (KH) RNA-binding domain superfamily. The K-homology (KH) domain binds single-stranded RNA or DNA, and is found in a wide variety of proteins including ribosomal proteins, transcription factors, and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but share a single "minimal KH motif" which is folded into a beta-alpha-alpha-beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include an N-terminal extension while type I KH domains (e.g. hnRNP K) contain a C-terminal extension, connected to the KH motif by variable loops that are different in different KH domains, whether they are type I or type II. KH-II superfamily members contain one or two KH domains, most of which are canonical type II KH domains that have the signature motif GXXG (where X represents any amino acid). The first KH domain found in archaeal cleavage and polyadenylation specificity factors (CPSFs) is a non-canonical type II KH domain that lacks the GXXG motif. Some others have mutated GXXG motifs which may or may not have nucleic acid binding ability. 67
22839 411781 cd02410 KH-II_CPSF_arch_rpt2 second type II K-homology (KH) RNA-binding domain found in archaeal cleavage and polyadenylation specificity factor (CPSF) and similar proteins. The archaeal CPSFs are predicted to be metal-dependent RNases belonging to the beta-CASP family, a subgroup of enzymes within the metallo-beta-lactamase fold. Within the CPSF family, all archaeal genomes contain one member with two N-terminal type II K-homology (KH) domains and one without. This family includes the CPSF homologs from archaea possessing N-terminal KH domains. This model corresponds to the second KH domain of CPSF, which is a canonical type II KH domain that contains the signature motif GXXG (where X represents any amino acid). 76
22840 411782 cd02411 KH-II_30S_S3_arch type II K-homology (KH) RNA-binding domain found in archaeal 30S ribosomal protein S3 and similar proteins. 30S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 30S ribosomal subunit and binds to the lower part of the 30S subunit head. It is believed to interact with mRNA as it threads its way from the latch into the channel. Members of this family are mainly from archaea and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 87
22841 411783 cd02412 KH-II_30S_S3 type II K-homology (KH) RNA-binding domain found in 30S ribosomal protein S3 and similar proteins. 30S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 30S ribosomal subunit and binds to the lower part of the 30S subunit head. It may also bind mRNA in the 70S ribosome, positioning it for translation. S3 protein is believed to interact with mRNA as it threads its way from the latch into the channel. Members of this family contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 108
22842 411784 cd02413 KH-II_40S_S3 type II K-homology (KH) RNA-binding domain found in 40S ribosomal protein S3 and similar proteins. 40S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 40S ribosomal subunit and is involved in translation. It is believed to interact with mRNA as it threads its way from the latch into the channel. 40S ribosomal protein S3 has endonuclease activity and plays a role in repair of damaged DNA. It cleaves phosphodiester bonds of DNAs containing altered bases with broad specificity and cleaves supercoiled DNA more efficiently than relaxed DNA. Members of this family are mainly from eukaryotes and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 91
22843 411785 cd02414 KH-II_Jag type II K-homology (KH) RNA-binding domain found in protein Jag and similar proteins. Protein Jag, also called SpoIIIJ-associated protein, is associated with SpoIIIJ and is necessary for the third stage of sporulation. Members of this family are mainly from bacteria and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 79
22844 239098 cd02417 Peptidase_C39_likeA A sub-family of peptidase C39 which contains Cyclolysin and Hemolysin processing peptidases. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family. 121
22845 239099 cd02418 Peptidase_C39B A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 136
22846 239100 cd02419 Peptidase_C39C A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 127
22847 239101 cd02420 Peptidase_C39D A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 125
22848 239102 cd02421 Peptidase_C39_likeD A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family. 124
22849 239103 cd02423 Peptidase_C39G A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature. 129
22850 239104 cd02424 Peptidase_C39E A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family, which contains Colicin V processing peptidase. 129
22851 239105 cd02425 Peptidase_C39F A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 126
22852 239106 cd02426 Pol_gamma_b_Cterm C-terminal domain of mitochondrial DNA polymerase gamma B subunit, which is required for processivity. Polymerase gamma replicates and repairs mitochondrial DNA. The C-terminal domain of its B subunit is strikingly similar to the anticodon-binding domain of glycyl tRNA synthetase. 128
22853 239107 cd02429 PTH2_like Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes. There is no functional information for this eukaryote-specific subgroup. 116
22854 239108 cd02430 PTH2 Peptidyl-tRNA hydrolase, type 2 (PTH2). Peptidyl-tRNA hydrolase (PTH) activity releases tRNA from the premature translation termination product peptidyl-tRNA, therefore allowing the tRNA and peptide to be reused in protein synthesis. PTH2 is present in archaea and eukaryotes. 115
22855 153122 cd02431 Ferritin_CCC1_C CCC1-related domain of ferritin. Ferritin_CCC1_like_C: The proteins of this family contain two domains. This is the C-terminal domain that is closely related to the CCC1, a vacuole transmembrane protein functioning as an iron and manganese transporter. The N-terminal domain is similar to ferritin-like diiron-carboxylate proteins, which are involved in a variety of iron ion related functions, such as iron storage and regulation, mono-oxygenation, and reactive radical production. This family may be unique to certain bacteria and archaea. 149
22856 153123 cd02432 Nodulin-21_like_1 Nodulin-21 and CCC1-related protein family. Nodulin-21_like_1: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 218
22857 153124 cd02433 Nodulin-21_like_2 Nodulin-21 and CCC1-related protein family. Nodulin-21_like_2: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 234
22858 153125 cd02434 Nodulin-21_like_3 Nodulin-21 and CCC1-related protein family. Nodulin-21_like_3: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 225
22859 153126 cd02435 CCC1 CCC1. CCC1: This domain is present in the CCC1, an iron and manganese transporter of Saccharomyces cerevisiae. CCC1 is a transmembrane protein that is located in the vacuole and transfers the iron and manganese ions from the cytosol to the vacuole. This domain may be unique to certain fungi and plants. 241
22860 153127 cd02436 Nodulin-21 Nodulin-21. Nodulin-21: This is a family of proteins that may be unique to certain plants. The family member in soybean is found to be nodule-specific and is abundant during nodule development. The proteins of this family thus may play a role in symbiotic nitrogen fixation. 152
22861 153128 cd02437 CCC1_like_1 CCC1-related protein family. CCC1_like_1: This is a protein family closely related to CCC1, a family of proteins involved in iron and manganese transport. Yeast CCC1 is a vacuole transmembrane protein responsible for the iron and manganese accumulation in vacuole. 175
22862 143332 cd02439 DMB-PRT_CobT Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT), also called CobT. Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT/CobT, not to be confused with the CobT subunit of cobaltochelatase, which does not belong to this group) catalyzes the synthesis of alpha-ribazole-5'-phosphate, from nicotinate mononucleotide (NAMN) and 5,6-dimethylbenzimidazole (DMB). This function is essential to the anaerobic biosynthesis pathway of cobalamin (vitamin B12), which is the largest and most complex cofactor in a number of enzyme-catalyzed reactions in bacteria, archaea and eukaryotes. Only eubacteria and archaebacteria can synthesize vitamin B12; multicellular organisms have lost this ability during evolution. DMB-PRT/CobT works sequentially with CobC (a phosphatase) to couple the lower ligand of cobalamin to a ribosyl moiety. DMB is the most common lower ligand of cobamides; other lower ligands include adenine, 5-methoxybenzimidazole or phenol. It has been suggested that earlier metabolic or enzymatic steps may control which lower ligand is available to DMB-PRT/CobT. In Salmonella enterica, for example, the lower ligand is DMB under aerobic conditions and adenine or 2-methyladenine under anaerobic conditions. Salmonella enterica DMB-PRT/CobT is a homodimer with two active sites; each active site comprises residues from both monomers. This group includes two distinct subfamilies, one archaeal-like, the other composed of bacterial sequences. 315
22863 100107 cd02440 AdoMet_MTases S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.). 107
22864 133000 cd02503 MobA MobA catalyzes the formation of molybdopterin guanine dinucleotide. The prokaryotic enzyme molybdopterin-guanine dinucleotide biosynthesis protein A (MobA). All mononuclear molybdoenzymes bind molybdenum in complex with an organic cofactor termed molybdopterin (MPT). In many bacteria, including Escherichia coli, molybdopterin can be further modified by attachment of a GMP group to the terminal phosphate of molybdopterin to form molybdopterin guanine dinucleotide (MGD). This GMP attachment step is catalyzed by MobA, by linking a guanosine 5'-phosphate to MPT forming molybdopterin guanine dinucleotide. This reaction requires GTP, MgCl2, and the MPT form of the cofactor. It is a reaction unique to prokaryotes, and therefore may represent a potential drug target. 181
22865 133001 cd02507 eIF-2B_gamma_N_like The N-terminal of eIF-2B_gamma_like is predicted to have glycosyltransferase activity. N-terminal domain of eIF-2B epsilon and gamma, subunits of eukaryotic translation initiators, is a subfamily of glycosyltransferase 2 and is predicted to have glycosyltransferase activity. eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. eIF-2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit epsilon shares sequence similarity with gamma subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex. 216
22866 133002 cd02508 ADP_Glucose_PP ADP-glucose pyrophosphorylase is involved in the biosynthesis of glycogen or starch. ADP-glucose pyrophosphorylase (glucose-1-phosphate adenylyltransferase) catalyzes a very important step in the biosynthesis of alpha 1,4-glucans (glycogen or starch) in bacteria and plants: synthesis of the activated glucosyl donor, ADP-glucose, from glucose-1-phosphate and ATP. ADP-glucose pyrophosphorylase is a tetrameric allosterically regulated enzyme. While a homotetramer in bacteria, in plant chloroplasts and amyloplasts, it is a heterotetramer of two different, yet evolutionarily related, subunits. There are a number of conserved regions in the sequence of bacterial and plant ADP-glucose pyrophosphorylase subunits. It is a subfamily of a very diverse glycosyl transferase family 2. 200
22867 133003 cd02509 GDP-M1P_Guanylyltransferase GDP-M1P_Guanylyltransferase catalyzes the formation of GDP-Mannose. GDP-mannose-1-phosphate guanylyltransferase, also called GDP-mannose pyrophosphorylase (GDP-MP), catalyzes the formation of GDP-Mannose from mannose-1-phosphate and GTP. Mannose is a key monosaccharide for glycosylation of proteins and lipids. GDP-Mannose is the activated donor for mannosylation of various biomolecules. This enzyme is known to be bifunctional, as both mannose-6-phosphate isomerase and mannose-1-phosphate guanylyltransferase. This CD covers the N-terminal GDP-mannose-1-phosphate guanylyltransferase domain, whereas the isomerase function is located at the C-terminal half. GDP-MP is a member of the nucleotidyltransferase family of enzymes. 274
22868 133004 cd02510 pp-GalNAc-T pp-GalNAc-T initiates the formation of mucin-type O-linked glycans. UDP-GalNAc: polypeptide alpha-N-acetylgalactosaminyltransferases (pp-GalNAc-T) initiate the formation of mucin-type, O-linked glycans by catalyzing the transfer of alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to hydroxyl groups of Ser or Thr residues of core proteins to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These enzymes are type II membrane proteins with a GT-A type catalytic domain and a lectin domain located on the lumen side of the Golgi apparatus. In human, there are 15 isozymes of pp-GalNAc-Ts, representing the largest of all glycosyltransferase families. Each isozyme has unique but partially redundant substrate specificity for glycosylation sites on acceptor proteins. 299
22869 133005 cd02511 Beta4Glucosyltransferase UDP-glucose LOS-beta-1,4 glucosyltransferase is required for biosynthesis of lipooligosaccharide. UDP-glucose: lipooligosaccharide (LOS) beta-1-4-glucosyltransferase catalyzes the addition of the first residue, glucose, of the lacto-N-neotetraose structure to HepI of the LOS inner core. LOS is the major constituent of the outer leaflet of the outer membrane of gram-negative bacteria. It consists of a short oligosaccharide chain of variable composition (alpha chain) attached to a branched inner core which is linked in turn to lipid A. Beta 1,4 glucosyltransferase is required to attach the alpha chain to the inner core. 229
22870 133006 cd02513 CMP-NeuAc_Synthase CMP-NeuAc_Synthase activates N-acetylneuraminic acid by adding CMP moiety. CMP-N-acetylneuraminic acid synthetase (CMP-NeuAc synthetase) or acylneuraminate cytidylyltransferase catalyzes the transfer of the CMP moiety of CTP to the anomeric hydroxyl group of NeuAc in the presence of Mg++. It is the second to last step in the sialylation of the oligosaccharide component of glycoconjugates by providing the activated sugar-nucleotide cytidine 5'-monophosphate N-acetylneuraminic acid (CMP-Neu5Ac), the substrate for sialyltransferases. Eukaryotic CMP-NeuAc synthetases are predominantly located in the nucleus. The activated CMP-Neu5Ac diffuses from the nucleus into the cytoplasm. 223
22871 133007 cd02514 GT13_GLCNAC-TI GT13_GLCNAC-TI is involved in an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GLCNAC-T I, GNT-I) transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide, an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localized to the Golgi apparatus. The catalytic domain is located at the C-terminus. These proteins are members of the glycosyltransferase family 13. 334
22872 133008 cd02515 Glyco_transf_6 Glycosyltransferase family 6 comprises enzymes responsible for the production of the human ABO blood group antigens. Glycosyltransferase family 6, GT_6, comprises enzymes with three known activities: alpha-1,3-galactosyltransferase, alpha-1,3 N-acetylgalactosaminyltransferase, and alpha-galactosyltransferase. UDP-galactose:beta-galactosyl alpha-1,3-galactosyltransferase (alpha3GT) catalyzes the transfer of galactose from UDP-alpha-d-galactose into an alpha-1,3 linkage with beta-galactosyl groups in glycoconjugates. The enzyme exists in most mammalian species but is absent from humans, apes, and old world monkeys as a result of the mutational inactivation of the gene. The alpha-1,3 N-acetylgalactosaminyltransferase and alpha-galactosyltransferase are responsible for the production of the human ABO blood group antigens. An N-acetylgalactosaminyltransferase uses a UDP-GalNAc donor to convert the H-antigen acceptor to the A antigen, whereas a galactosyltransferase uses a UDP-galactose donor to convert the H-antigen acceptor to the B antigen. Alpha-1,3 N-acetylgalactosaminyltransferase and alpha-galactosyltransferase differ only in the identity of four critical amino acid residues. 271
22873 133009 cd02516 CDP-ME_synthetase CDP-ME synthetase is involved in mevalonate-independent isoprenoid production. 4-diphosphocytidyl-2-methyl-D-erythritol synthase (CDP-ME), also called 2C-methyl-d-erythritol 4-phosphate cytidylyltransferase catalyzes the third step in the alternative (non-mevalonate) pathway of Isopentenyl diphosphate (IPP) biosynthesis: the formation of 4-diphosphocytidyl-2C-methyl-D-erythritol from CTP and 2C-methyl-D-erythritol 4-phosphate. This mevalonate independent pathway that utilizes pyruvate and glyceraldehyde 3-phosphate as starting materials for production of IPP occurs in a variety of bacteria, archaea and plant cells, but is absent in mammals. Thus, CDP-ME synthetase is an attractive target for the structure-based design of selective antibacterial, herbicidal and antimalarial drugs. 218
22874 133010 cd02517 CMP-KDO-Synthetase CMP-KDO synthetase catalyzes the activation of KDO which is an essential component of the lipopolysaccharide. CMP-KDO Synthetase: 3-Deoxy-D-manno-octulosonate cytidylyltransferase (CMP-KDO synthetase) catalyzes the conversion of CTP and 3-deoxy-D-manno-octulosonate into CMP-3-deoxy-D-manno-octulosonate (CMP-KDO) and pyrophosphate. KDO is an essential component of the lipopolysaccharide found in the outer surface of gram-negative eubacteria. It is also a constituent of the capsular polysaccharides of some gram-negative eubacteria. Its presence in the cell wall polysaccharides of green algae and plant were also discovered. However, they have not been found in yeast and animals. The absence of the enzyme in mammalian cells makes it an attractive target molecule for drug design. 239
22875 133011 cd02518 GT2_SpsF SpsF is a glycosyltransferase implicated in the synthesis of the spore coat. Spore coat polysaccharide biosynthesis protein F (spsF) is a glycosyltransferase implicated in the synthesis of the spore coat in a variety of bacteria challenged by stress such as starvation. The spsF gene is expressed in the late stage of coat development and is responsible for a terminal step in coat formation that involves the glycosylation of the coat. SpsF gene mutation resulted in spores that appeared normal, but the spores tended to aggregate and had abnormal adsorption properties, indicating a surface alteration. 233
22876 133012 cd02520 Glucosylceramide_synthase Glucosylceramide synthase catalyzes the first glycosylation step of glycosphingolipid synthesis. UDP-glucose:N-acylsphingosine D-glucosyltransferase (glucosylceramide synthase or ceramide glucosyltransferase) catalyzes the first glycosylation step of glycosphingolipid synthesis. Its product, glucosylceramide, serves as the core of more than 300 glycosphingolipids (GSL). GSLs are a group of membrane components that have the lipid portion embedded in the outer plasma membrane leaflet and the sugar chains extended to the outer environment. Several lines of evidence suggest the importance of GSLs in various cellular processes such as differentiation, adhesion, proliferation, and cell-cell recognition. In pathogenic fungus Cryptococcus neoformans, glucosylceramide serves as an antigen that elicits an antibody response in patients and it is essential for fungal growth in host extracellular environment. 196
22877 133013 cd02522 GT_2_like_a GT_2_like_a represents a glycosyltransferase family-2 subfamily with unknown function. Glycosyltransferase family 2 (GT-2) subfamily of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 221
22878 133014 cd02523 PC_cytidylyltransferase Phosphocholine cytidylyltransferases catalyze the synthesis of CDP-choline. This family contains proteins similar to prokaryotic phosphocholine (P-cho) cytidylyltransferases. Phosphocholine (PC) cytidylyltransferases catalyze the transfer of a cytidine monophosphate from CTP to phosphocholine to form CDP-choline. PC is the most abundant phospholipid in eukaryotic membranes and it is also important in prokaryotic membranes. For pathogenic prokaryotes, the cell surface PC facilitates the interaction with host surface and induces attachment and invasion. In addition cell wall PC serves as scaffold for a group of choline-binding proteins that are secreted from the cells. Phosphocholine (PC) cytidylyltransferase is a key enzyme in the prokaryotic choline metabolism pathway. It has been hypothesized to consist of a choline transport system, a choline kinase, CTP:phosphocholine cytidylyltransferase, and a choline phosphotransferase that transfers P-Cho from CDP-Cho to either lipoteichoic acid or lipopolysaccharide. 229
22879 133015 cd02524 G1P_cytidylyltransferase G1P_cytidylyltransferase catalyzes the production of CDP-D-Glucose. Alpha-D-Glucose-1-phosphate Cytidylyltransferase catalyzes the production of CDP-D-Glucose from alpha-D-Glucose-1-phosphate and MgCTP as substrate. CDP-D-Glucose is the precursor for synthesizing four of the five naturally occurring 3,6-dideoxy sugars: abequose (3,6-dideoxy-D-Xylo-hexose), ascarylose (3,6-dideoxy-L-arabino-hexose), paratose (3,6-dideoxy-D-ribohexose), and tyvelose (3,6-dideoxy-D-arabino-hexose). Deoxysugars are ubiquitous in nature where they function in a variety of biological processes, including cell adhesion, immune response, determination of ABO blood groups, fertilization, antibiotic function, and microbial pathogenicity. 253
22880 133016 cd02525 Succinoglycan_BP_ExoA ExoA is involved in the biosynthesis of succinoglycan. Succinoglycan Biosynthesis Protein ExoA catalyzes the formation of a beta-1,3 linkage of the second sugar (glucose) of the succinoglycan with the galactose on the lipid carrier. Succinoglycan is an acidic exopolysaccharide that is important for invasion of the nodules. Succinoglycan is a high-molecular-weight polymer composed of repeating octasaccharide units. These units are synthesized on membrane-bound isoprenoid lipid carriers, beginning with galactose followed by seven glucose molecules, and modified by the addition of acetate, succinate, and pyruvate. ExoA is a membrane protein with a transmembrane domain at the C-terminus. 249
22881 133017 cd02526 GT2_RfbF_like RfbF is a putative dTDP-rhamnosyl transferase. Shigella flexneri RfbF protein is a putative dTDP-rhamnosyl transferase. dTDP rhamnosyl transferases of Shigella flexneri add rhamnose sugars to N-acetyl-glucosamine in the O-antigen tetrasaccharide repeat. Lipopolysaccharide O antigens are important virulence determinants for many bacteria. The variations of sugar composition, the sequence of the sugars and the linkages in the O antigen provide structural diversity of the O antigen. 237
22882 133018 cd02537 GT8_Glycogenin Glycogenin belongs to the GT 8 family and initiates the biosynthesis of glycogen. Glycogenin initiates the biosynthesis of glycogen by incorporating glucose residues through a self-glucosylation reaction at a Tyr residue, and then acts as substrate for chain elongation by glycogen synthase and branching enzyme. It contains a conserved DxD motif and an N-terminal beta-alpha-beta Rossmann-like fold that are common to the nucleotide-binding domains of most glycosyltransferases. The DxD motif is essential for coordination of the catalytic divalent cation, most commonly Mn2+. Glycogenin can be classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. It is placed in glycosyltransferase family 8 which includes lipopolysaccharide glucose and galactose transferases and galactinol synthases. 240
22883 133019 cd02538 G1P_TT_short G1P_TT_short is the short form of glucose-1-phosphate thymidylyltransferase. This family is the short form of glucose-1-phosphate thymidylyltransferase. Glucose-1-phosphate thymidylyltransferase catalyses the formation of dTDP-glucose, from dTTP and glucose 1-phosphate. It is the first enzyme in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. There are two forms of Glucose-1-phosphate thymidylyltransferase in bacteria and archaea: short form and long form. The homotetrameric, feedback inhibited short form is found in numerous bacterial species that produce dTDP-L-rhamnose. The long form, which has an extra 50 amino acids at the C-terminus, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced. 240
22884 133020 cd02540 GT2_GlmU_N_bac N-terminal domain of bacterial GlmU. The N-terminal domain of N-Acetylglucosamine-1-phosphate uridyltransferase (GlmU). GlmU is an essential bacterial enzyme with both an acetyltransferase and an uridyltransferase activity which have been mapped to the C-terminal and N-terminal domains, respectively. This family represents the N-terminal uridyltransferase. GlmU performs the last two steps in the synthesis of UDP-N-acetylglucosamine (UDP-GlcNAc), which is an essential precursor in both the peptidoglycan and the lipopolysaccharide metabolic pathways in Gram-positive and Gram-negative bacteria, respectively. 229
22885 133021 cd02541 UGPase_prokaryotic Prokaryotic UGPase catalyses the synthesis of UDP-glucose. Prokaryotic UDP-Glucose Pyrophosphorylase (UGPase) catalyzes a reversible production of UDP-Glucose and pyrophosphate (PPi) from glucose-1-phosphate and UTP. UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UGPase is found in both prokaryotes and eukaryotes. Although prokaryotic and eukaryotic forms of UGPase catalyze the same reaction, they share low sequence similarity. 267
22886 239109 cd02549 Peptidase_C39A A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature or have different domain architectures. 141
22887 211325 cd02550 PseudoU_synth_Rsu_Rlu_like Pseudouridine synthase, Rsu/Rlu family. This group is comprised of eukaryotic, bacterial and archeal proteins similar to eight site specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi) requiring no cofactors. E. coli RluC for example makes psi955, 2504 and 2580 in 23S RNA. Some psi sites such as psi1917 in 23S RNA made by RluD are universally conserved. Other psi sites occur in a more restricted fashion, for example psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in gram negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA. psi2604 in 23S RNA made by RluF has only been detected in E. coli. 154
22888 211326 cd02552 PseudoU_synth_TruD_like Pseudouridine synthase, TruD family. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases similar to Escherichia coli TruD and Saccharomyces cerevisiae Pus7. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruD and S. cerevisiae Pus7 make psi13 in cytoplasmic tRNAs. In addition S. cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA) and psi35 in pre-tRNATyr. Psi35 in U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved. Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35. 232
22889 211327 cd02553 PseudoU_synth_RsuA Pseudouridine synthase, Escherichia coli RsuA like. This group is comprised of eukaryotic and bacterial proteins similar to Escherichia coli RsuA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E.coli RsuA makes psi516 in 16S RNA. Psi at this position is not generally conserved in other organisms. 167
22890 211328 cd02554 PseudoU_synth_RluF Pseudouridine synthase, Escherichia coli RluF like. This group is comprised of bacterial proteins similar to Escherichia coli RluF. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E.coli RluF makes psi2604 in 23S RNA. psi2604 has only been detected in E. coli. It is absent from other eubacteria despite a precursor U at that site and from eukarya and archea which lack a precursor U at that site. 164
22891 211329 cd02555 PSSA_1 Pseudouridine synthase, a subgroup of the RsuA family. This group is comprised of bacterial proteins assigned to the RsuA family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. The RsuA family is comprised of proteins related to Escherichia coli RsuA. 177
22892 211330 cd02556 PseudoU_synth_RluB Pseudouridine synthase, Escherichia coli RluB like. This group is comprised of bacterial and eukaryotic proteins similar to E. coli RluB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli RluB makes psi2605 in 23S RNA. psi2605 has been detected in eubacteria, but not in eukarya and archea despite the presence of a precursor U at that site. 167
22893 211331 cd02557 PseudoU_synth_ScRIB2 Pseudouridine synthases similar to Saccharomyces cerevisiae RIB2. Pseudouridine synthase, Saccharomyces cerevisiae RIB2_like. This group is comprised of eukaryotic and bacterial proteins similar to Saccharomyces cerevisiae RIB2, S. cerevisiae Pus6p and human hRPUDSD2. S. cerevisiae RIB2 displays two distinct catalytic activities. The N-terminal domain of RIB2 is RNA:psi-synthase which makes psi32 on cytoplasmic tRNAs. Psi32 is highly phylogenetically conserved. The C-terminal domain of RIB2 has a DRAP deaminase activity which catalyses the formation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione 5'-phosphate from 2,5-diamino-6-ribitylamino-4(3H)-pyrimidinone 5'-phosphate during riboflavin biosynthesis. S. cerevisiae Pus6p makes the psi31 of cytoplasmic and mitochondrial tRNAs. 213
22894 211332 cd02558 PSRA_1 Pseudouridine synthase, a subgroup of the RluA family. This group is comprised of bacterial proteins assigned to the RluA family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. The RluA family is comprised of proteins related to Escherichia coli RluA. 246
22895 211333 cd02563 PseudoU_synth_TruC tRNA pseudouridine isomerase C. Pseudouridine synthases catalyze the isomerization of specific uridines in a tRNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. TruC makes psi65 in tRNAs. This psi residue is not universally conserved. 223
22896 211334 cd02566 PseudoU_synth_RluE Pseudouridine synthase, Escherichia coli RluE. This group is comprised of bacterial proteins similar to E. coli RluE. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. Escherichia coli RluE makes psi2457 in 23S RNA. psi2457 is not universally conserved. 168
22897 211335 cd02568 PseudoU_synth_PUS1_PUS2 Pseudouridine synthase, PUS1/PUS2 like. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus2p, Caenorhabditis elegans Pus1p and human PUS1. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae Pus1p catalyzes the formation of psi34 and psi36 in the intron-containing tRNAIle, psi35 in the intron-containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs, and psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition S. cerevisiae PUS1 makes psi 26, 65 and 67. C. elegans Pus1p does not modify psi44 in U2 snRNA. Mouse Pus1p makes psi27/28 in pre-tRNASer, tRNAVal and tRNAIle, psi 34/36 in tRNAIle, and psi 32 and potentially 67 in tRNAVal. Psi44 in U2 snRNA and psi32 in tRNAs are highly phylogenetically conserved. Psi 26, 27, 28, 34, 35, 36, 65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 245
22898 211336 cd02569 PseudoU_synth_ScPus3 Pseudouridine synthase, Saccharomyces cerevisiae Pus3 like. This group consists of eukaryotic pseudouridine synthases similar to S. cerevisiae Pus3p, mouse Pus3p, and human PUS2. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Mouse Pus3p has been shown to make psi38, and possibly also psi 39, in tRNAs. Psi38 and psi39 are highly conserved in tRNAs from eubacteria, archea and eukarya. 256
22899 211337 cd02570 PseudoU_synth_EcTruA Eukaryotic and bacterial pseudouridine synthases similar to E. coli TruA. This group consists of eukaryotic and bacterial pseudouridine synthases similar to E. coli TruA, Pseudomonas aeruginosa truA and human pseudouridine synthase-like 1 (PUSL1). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli TruA makes psi38/39 and/or 40 in tRNA. psi38 and psi39 in tRNAs are highly phylogenetically conserved. P. aeruginosa truA is required for induction of type III secretory genes and may act through modifying tRNAs critical for the expression of type III genes or their regulators. 239
22900 211338 cd02572 PseudoU_synth_hDyskerin Pseudouridine synthase, human dyskerin like. This group consists of eukaryotic and archeal pseudouridine synthases similar to human dyskerin, Saccharomyces cerevisiae Cbf5, and Drosophila melanogaster Mfl (minifly protein). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactor is required. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. D. melanogaster mfl hosts, in its fourth intron, a box H/ACA snoRNA gene. In addition dyskerin is likely to have a structural role in the telomerase complex. Mutations in human dyskerin cause X-linked dyskeratosis congenita. Mutations in Drosophila Mfl result in miniflies that suffer abnormalities. 182
22901 211339 cd02573 PseudoU_synth_EcTruB Pseudouridine synthase, Escherichia coli TruB like. This group consists of bacterial pseudouridine synthases similar to E. coli TruB and Mycobacterium tuberculosis TruB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruB and M. tuberculosis TruB make psi55 in the T loop of tRNAs. Psi55 is nearly universally conserved. E. coli TruB is not inhibited by RNA containing 5-fluorouridine. 213
22902 211340 cd02575 PseudoU_synth_EcTruD Pseudouridine synthase, similar to Escherichia coli TruD. This group consists of bacterial pseudouridine synthases similar to Escherichia coli TruD. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruD makes the highly phylogenetically conserved psi13 in tRNAs. 253
22903 211341 cd02576 PseudoU_synth_ScPUS7 Pseudouridine synthase, TruD family. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus7. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). Saccharomyces cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA), psi13 in cytoplasmic tRNAs and psi35 in pre-tRNATyr. Psi35 in yeast U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved. Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35. 371
22904 211342 cd02577 PSTD1 Pseudouridine synthase, a subgroup of the TruD family. This group consists of several hypothetical archeal pseudouridine synthases assigned to the TruD family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). The TruD family is comprised of proteins related to Escherichia coli TruD. 319
22905 259846 cd02582 RNAP_archeal_A' A' subunit of archaeal RNA polymerase (RNAP). A' is the largest subunit of the archaeal RNA polymerase (RNAP). Archaeal RNAP is closely related to RNA polymerases in eukaryotes based on the subunit compositions. Archaeal RNAP is a large multi-protein complex, made up of 11 to 13 subunits, depending on the species, that are responsible for the synthesis of RNA. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shaped structure. The largest eukaryotic RNAP subunit is encoded by two separate archaeal subunits (A' and A'') which correspond to the N- and C-terminal domains of eukaryotic RNAP II Rpb1, respectively. The N-terminal domain of Rpb1 forms part of the active site and includes the head and the core of one clamp as well as the pore and funnel structures of RNAP II. Based on a structural comparison among the archaeal, bacterial and eukaryotic RNAPs the DNA binding channel and the active site are part of A' subunit which is conserved. The strong similarity between subunit A' and the N-terminal domain of Rpb1 suggests a similar functional and structural role for these two proteins. 861
22906 259847 cd02583 RNAP_III_RPC1_N Largest subunit (RPC1) of eukaryotic RNA polymerase III (RNAP III), N-terminal domain. Rpc1 (C160) subunit forms part of the active site region of RNAP III. RNAP III is one of the three distinct classes of nuclear RNAP in eukaryotes that is responsible for the synthesis of tRNAs, 5SrRNA, Alu-RNA, U6 snRNA genes, and some others. RNAP III is the largest nuclear RNA polymerase with 17 subunits. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shaped structure. The N-terminal domain of Rpb1, the largest subunit of RNAP II in yeast, forms part of the active site, making up the head and core of the one clamp, as well as the pore and funnel structures of RNAP II. The strong homology between Rpc1 and Rpb1 suggests a similar functional and structural role. 816
22907 132720 cd02584 RNAP_II_Rpb1_C Largest subunit (Rpb1) of Eukaryotic RNA polymerase II (RNAP II), C-terminal domain. RNA polymerase II (RNAP II) is a large multi-subunit complex responsible for the synthesis of mRNA. RNAP II consists of a 10-subunit core enzyme and a peripheral heterodimer of two subunits. The largest core subunit (Rpb1) of yeast RNAP II is the best characterized member of this family. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure. In yeast, Rpb1 and Rpb2, the largest and the second largest subunits, each makes up one clamp, one jaw, and part of the cleft. Rpb1 interacts with Rpb2 to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The C-terminal domain of Rpb1 makes up part of the foot and jaw structures. 410
22908 319784 cd02585 HAD_PMM phosphomannomutase, similar to human PMM1 and PMM2, Saccharomyces Sec53p, and Arabidopsis thaliana PMM. PMM catalyzes the interconversion of mannose-6-phosphate (M6P) and mannose-1-phosphate (M1P); the conversion of M6P to M1P is an essential step in mannose activation and the biosynthesis of glycoconjugates in all eukaryotes. M1P is the substrate for the synthesis of GDP-mannose, which is an intermediate for protein glycosylation, protein sorting and secretion, and maintaining a functional endomembrane system in eukaryotic cells. Proteins in this family contain a conserved phosphorylated motif DxDx(T/V) shared with some other phosphotransferases. This family contains two human homologs, PMM1 and PMM2; PMM2 deficiency causes congenital disorder of glycosylation type I-a, also known as Jaeken syndrome. PMM1 can also act as glucose-1,6-bisphosphatase in the brain after stimulation with inosine monophosphate; PMM2 on the other hand, is insensitive to IMP and demonstrates low glucose-1,6-bisphosphatase activity. Arabidopsis thaliana PMM converted M1P into M6P and glucose-1-phosphate into glucose-6-phosphate, with the latter reaction being less efficient. Arabidopsis thaliana and Nicotiana benthamiana PMMs are involved in ascorbic acid biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 238
22909 319785 cd02586 HAD_PHN Phosphonoacetaldehyde hydrolase (phosphonatase); similar to Bacillus cereus phosphonatase. Degradation of the ubiquitous natural phosphonate 2-aminoethylphosphonate (AEP) into useable forms of nitrogen, carbon, and phosphorus is a two-step metabolic pathway. The first step, catalyzed by AEP transaminase, involves the transfer of NH3 from AEP to pyruvate, yielding phosphonoacetaldehyde (P-Ald) and alanine. In the second step, phosphonatase catalyzes the hydrolytic P-C bond cleavage of P-Ald to form orthophosphate and acetaldehyde. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 242
22910 319786 cd02587 HAD_5-3dNT 5'(3')-deoxyribonucleotidase. This family includes cytosolic 5'(3')-deoxyribonucleotidase (cdN) and mitochondrial 5'(3')-deoxyribonucleotidase (mdN). cdN and mdN specifically dephosphorylate the deoxyribo form of nucleoside monophosphates, which helps maintain homeostasis of deoxynucleosides required for mitochondrial DNA synthesis. Their preferred substrates are dUMP and dTMP. cdN also dephosphorylates dGMP and dIMP efficiently. They can also dephosphorylate the 5'- or 3'-phosphates of pyrimidine ribonucleotides. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 161
22911 319787 cd02588 HAD_L2-DEX L-2-haloacid dehalogenase. L-2-Haloacid dehalogenase catalyzes the hydrolytic dehalogenation of L-2-haloacids to produce the corresponding D-2-hydroxyacids with an inversion of the C2-configuration. 2-haloacid dehalogenases are of interest for their potential to degrade recalcitrant halogenated environmental pollutants and their use in the synthesis of industrial chemicals. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 216
22912 319788 cd02598 HAD_BPGM beta-phosphoglucomutase, similar to Lactococcus lactis beta-phosphoglucomutase (beta-PGM). Lactococcus lactis beta-PGM catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), forming beta-D-glucose 1,6-(bis)phosphate as an intermediate. In the forward G6P-forming direction, this reaction links polysaccharide phosphorolysis to glycolysis, in the reverse direction, the reaction provides G1P for the biosynthesis of exo-polysaccharides. This subfamily belongs to the beta-phosphoglucomutase-like family whose other members include Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 174
22913 319789 cd02601 HAD_Eya protein tyrosine phosphatase domain of the nuclear transcription factor of Eyes absent (Eya) and related phosphatase domains. Eyes absent (Eya) is a transcriptional coactivator, and an aspartyl-based protein tyrosine phosphatase. Eya and Six operate as a composite transcription factor, within a conserved network of transcription factors called the retinal determination (RD) network. The RD network interacts with a broad variety of signaling pathways to regulate the development and homeostasis of organs and tissues such as eye, muscle, kidney and ear. To date it is not clear what the physiologically relevant substrates of the Eya protein tyrosine phosphatase are, or whether this phosphatase activity plays a role in transcription. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 271
22914 319790 cd02603 HAD_sEH-N_like N-terminal lipase phosphatase domain of human soluble epoxide hydrolase, Escherichia coli YihX/HAD4 alpha-D-glucose 1-phosphate phosphatase, and related domains, may be inactive. This family includes the N-terminal phosphatase domain of human soluble epoxide hydrolase (sEH). sEH is a bifunctional enzyme with two distinct enzyme activities, the C-terminal domain has epoxide hydrolysis activity and the N-terminal domain (Ntermphos), which belongs to this family, has lipid phosphatase activity. The latter prefers mono-phosphate esters, and lysophosphatidic acids (LPAs) are the best natural substrates found to date. In addition this family includes Gallus gallus sEH and Xenopus sEH which appears to lack phosphatase activity, and Escherichia coli YihX/HAD4 which selectively hydrolyzes alpha-Glucose-1-P, phosphatase, has significant phosphatase activity against pyridoxal phosphate, and has low beta phosphoglucomutase activity. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 195
22915 319791 cd02604 HAD_5NT haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to Saccharomyces cerevisiae Phm8p and Sdt1p. This family includes Saccharomyces cerevisiae Phm8p (phosphate metabolism protein 8) and Sdt1p (Suppressor of disruption of TFIIS). Phm8p participates in the ribose salvage pathway; it catalyzes the dephosphorylation of nucleotide monophosphates to nucleosides, and its preferred substrates are the nucleotide monophosphates AMP, GMP, CMP, and UMP. Phm8p is also a lysophosphatidic acid phosphatase, dephosphorylating lysophosphatidic acids (LPAs) to monoacylglycerol in response to phosphate starvation. Sdt1p is a pyrimidine and pyridine-specific 5'-nucleotidase; it is an NMN/NaMN 5'-nucleotidase involved in the production of nicotinamide riboside and nicotinic acid riboside, and is a pyrimidine 5'-nucleotidase with high specificity for UMP and CMP. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 182
22916 319792 cd02605 HAD_SPP sucrose-phosphatase, similar to Synechocystis sp PCC 6803 SPP. Sucrose-phosphatase (SPP; EC 3.1.3.24) catalyzes the dephosphorylation of sucrose-6(F)-phosphate (Suc6P)-the final step in the pathway of sucrose biosynthesis in plants and cyanobacteria. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 245
22917 319793 cd02607 HAD_ThrH_like bifunctional phosphoserine phosphatase/phosphoserine:homoserine phosphotransferase, similar to Pseudomonas aeruginosa ThrH. This family includes Pseudomonas aeruginosa ThrH, which is a dual-activity enzyme having both phosphoserine phosphatase and phosphoserine:homoserine phosphotransferase activities, i.e. it can dephosphorylate phosphoserine, and can transfer phosphate from phosphoserine to homoserine. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 195
22918 319794 cd02608 P-type_ATPase_Na-K_like alpha-subunit of Na(+)/K(+)-ATPases and of gastric H(+)/K(+)-ATPase, similar to the human Na(+)/K(+)-ATPase alpha subunits 1-4. This subfamily includes the alpha subunit of Na(+)/K(+)-ATPase, a heteromeric transmembrane protein composed of an alpha- and beta-subunit and an optional third subunit belonging to the FXYD proteins, which are more tissue-specific regulatory subunits of the enzyme. The alpha-subunit is the catalytic subunit responsible for transport activities of the enzyme. This subfamily includes all four isoforms of the human alpha subunit (alpha1-alpha4, encoded by the ATP1A1-ATP1A4 genes). Na(+)/K(+)-ATPase functions chiefly as an ion pump, hydrolyzing one molecule of ATP to pump three Na(+) out of the cell in exchange for two K(+) entering the cell per pump cycle. In addition Na(+)/K(+)-ATPase acts as a signal transducer. This subfamily also includes Oreochromis mossambicus (tilapia) Na(+)/K(+)-ATPase alpha 1 and alpha 3 subunits, and gastric H(+)/K(+)-ATPase which exchanges hydronium ion with potassium and is responsible for gastric acid secretion. Gastric H(+)/K(+)-ATPase is an alpha,beta-heterodimeric enzyme. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 905
22919 319795 cd02609 P-type_ATPase uncharacterized subfamily of P-type ATPase transporter, similar to uncharacterized Streptococcus pneumoniae exported protein 7, Exp7. This subfamily contains P-type ATPase transporters of unknown function, similar to Streptococcus pneumoniae Exp7. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd(2+), and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine. 661
22920 319796 cd02612 HAD_PGPPase phosphatidylglycerol-phosphate phosphatase, similar to Escherichia coli K-12 phosphatidylglycerol-phosphate phosphatase C. This family includes Escherichia coli K-12 phosphatidylglycerol-phosphate phosphatase C, PgpC (previously named yfhB) which catalyzes the dephosphorylation of phosphatidylglycerol-phosphate (PGP) to phosphatidylglycerol (PG). This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 195
22921 319797 cd02616 HAD_PPase pyrophosphatase similar to Bacillus subtilis PpaX. This family includes Bacillus subtilis PpaX which hydrolyzes pyrophosphate formed during serine-46-phosphorylated HPr (P-Ser-HPr) dephosphorylation by the bifunctional enzyme HPr kinase/phosphorylase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 207
22922 239110 cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues. 223
22923 239111 cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane antibody-mediated interstitial nephritis. It plays a role in renal tubulogenesis and is defective in hereditary tubulointerstitial disorders. TIN-Ag is exclusively expressed in kidney tissues. 236
22924 239112 cd02621 Peptidase_C1A_CathepsinC Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopt the papain fold and form the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are associated with Papillon-Lefevre and Haim-Munk syndromes, rare diseases characterized by hyperkeratosis and early-onset periodontitis. Cathepsin C is widely expressed in many tissues with high levels in lung, kidney and placenta. It is also highly expressed in cytotoxic lymphocytes and mature myeloid cells. 243
22925 100065 cd02636 R3H_sperm-antigen R3H domain of a group of metazoan proteins that is related to the sperm-associated antigen 7. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 61
22926 100066 cd02637 R3H_PARN R3H domain of Poly(A)-specific ribonuclease (PARN). PARN is a poly(A)-specific 3' exonuclease from the RNase D family that, in Xenopus, deadenylates a specific class of maternal mRNAs which results in their translational repression. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA. 65
22927 100067 cd02638 R3H_unknown_1 R3H domain of a group of eukaryotic proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 62
22928 100068 cd02639 R3H_RRM R3H domain of mainly fungal proteins which are associated with an RNA recognition motif (RRM) domain. Present in this group is the RNA-binding post-transcriptional regulator Cip2 (Csx1-interacting protein 2) involved in counteracting Csx1 function. Csx1 plays a central role in controlling gene expression during oxidative stress. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 60
22929 100069 cd02640 R3H_NRF R3H domain of the NF-kappaB-repression factor (NRF). NRF is a nuclear inhibitor of NF-kappaB proteins that can silence the IFNbeta promoter via binding to a negative regulatory element (NRE). Besides R3H, NRF also contains a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 60
22930 100070 cd02641 R3H_Smubp-2_like R3H domain of Smubp-2_like proteins. Smubp-2_like proteins also contain a helicase_like and an AN1-like Zinc finger domain and have been shown to bind single-stranded DNA. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA. 60
22931 100071 cd02642 R3H_encore_like R3H domain of encore-like and DIP1-like proteins. Drosophila encore is involved in the germline exit after four mitotic divisions, by facilitating SCF-ubiquitin-proteasome-dependent proteolysis. Maize DBF1-interactor protein 1 (DIP1) containing an R3H domain is a potential regulator of DBF1 activity in stress responses. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 63
22932 100072 cd02643 R3H_NF-X1 R3H domain of the X1 box binding protein (NF-X1) and related proteins. Human NF-X1 is a transcription factor that regulates the expression of class II major histocompatibility complex (MHC) genes. The Drosophila homolog shuttle craft (STC) has been shown to be a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system, and the yeast homolog FAP1 encodes a dosage suppressor of rapamycin toxicity. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 74
22933 100073 cd02644 R3H_jag R3H domain found in proteins homologous to Bacillus subtilis Jag, which is associated with SpoIIIJ. SpoIIIJ is necessary for the third stage of sporulation. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 67
22934 100074 cd02645 R3H_AAA R3H domain of a group of proteins with unknown function, which also contain an AAA-ATPase (AAA) domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 60
22935 100075 cd02646 R3H_G-patch R3H domain of a group of fungal and plant proteins with unknown function, which also contain a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the R3H domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 58
22936 239113 cd02647 nuc_hydro_TvIAG nuc_hydro_TvIAG: Nucleoside hydrolases similar to the Inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. Nucleoside hydrolases vary in their substrate specificity. This group contains eukaryotic and bacterial proteins similar to the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase (IAG-NH) from T. vivax. T. vivax IAG-NH is of the order of a thousand to ten thousand fold more specific towards the naturally occurring purine nucleosides than towards the pyrimidine nucleosides. 312
22937 239114 cd02648 nuc_hydro_1 NH_1: A subgroup of nucleoside hydrolases. This group contains fungal proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 367
22938 239115 cd02649 nuc_hydro_CeIAG nuc_hydro_CeIAG: Nucleoside hydrolases similar to the inosine-adenosine-guanosine-preferring nucleoside hydrolase from Caenorhabditis elegans. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the purine-preferring nucleoside hydrolase (IAG-NH) from C. elegans and the salivary purine nucleosidase from Aedes aegypti. C. elegans IAG-NH exhibits a high affinity for the substrate analogue p-nitrophenylriboside (p-NPR). 306
22939 239116 cd02650 nuc_hydro_CaPnhB NH_hydro_CaPnhB: A subgroup of nucleoside hydrolases similar to Corynebacterium ammoniagenes Purine/pyrimidine nucleoside hydrolase (pnhB). Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 304
22940 239117 cd02651 nuc_hydro_IU_UC_XIUA nuc_hydro_IU_UC_XIUA: inosine-uridine preferring, xanthosine-inosine-uridine-adenosine-preferring and, uridine-cytidine preferring nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains proteins similar to nucleoside hydrolases which hydrolyze both pyrimidine and purine ribonucleosides: the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the inosine-uridine-xanthosine preferring nucleoside hydrolase RihC from Escherichia coli and the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium. This group also contains proteins similar to the pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases URH1 from Saccharomyces cerevisiae, E. coli RihA and E. coli RihB. E. coli RihA is equally efficient with uridine and cytidine, E. coli RihB prefers cytidine over uridine. S. cerevisiae URH1 prefers uridine over cytidine. 302
22941 239118 cd02652 nuc_hydro_2 NH_2: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 293
22942 239119 cd02653 nuc_hydro_3 NH_3: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 320
22943 239120 cd02654 nuc_hydro_CjNH nuc_hydro_CjNH. Nucleoside hydrolases similar to Campylobacter jejuni nucleoside hydrolase. This group contains eukaryotic and bacterial proteins similar to C. jejuni nucleoside hydrolase. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. C. jejuni nucleoside hydrolase is inactive against natural nucleosides or against common nucleoside analogues. 318
22944 132721 cd02655 RNAP_beta'_C Largest subunit (beta') of Bacterial DNA-dependent RNA polymerase (RNAP), C-terminal domain. Bacterial RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. This family also includes the eukaryotic plastid-encoded RNAP beta" subunit. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure with two pincers defining a central cleft. Beta' and beta, the largest and the second largest subunits of bacterial RNAP, each makes up one pincer and part of the base of the cleft. The C-terminal domain includes a G loop that forms part of the floor of the downstream DNA-binding cavity. The position of the G loop may determine the switch of the bridge helix between flipped-out and normal alpha-helical conformations. 204
22945 239121 cd02656 MIT MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear. 75
22946 239122 cd02657 Peptidase_C19A A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 305
22947 239123 cd02658 Peptidase_C19B A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 311
22948 239124 cd02659 peptidase_C19C A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 334
22949 239125 cd02660 Peptidase_C19D A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 328
22950 239126 cd02661 Peptidase_C19E A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 304
22951 239127 cd02662 Peptidase_C19F A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 240
22952 239128 cd02663 Peptidase_C19G A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 300
22953 239129 cd02664 Peptidase_C19H A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 327
22954 239130 cd02665 Peptidase_C19I A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 228
22955 239131 cd02666 Peptidase_C19J A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 343
22956 239132 cd02667 Peptidase_C19K A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 279
22957 239133 cd02668 Peptidase_C19L A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 324
22958 239134 cd02669 Peptidase_C19M A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 440
22959 239135 cd02670 Peptidase_C19N A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 241
22960 239136 cd02671 Peptidase_C19O A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 332
22961 239137 cd02672 Peptidase_C19P A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 268
22962 239138 cd02673 Peptidase_C19Q A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 245
22963 239139 cd02674 Peptidase_C19R A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 230
22964 259861 cd02675 Ephrin_ectodomain Ectodomain of Ephrins. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands and the five EphB receptors bind to three transmembrane ephrin-B ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrins contain a highly conserved ectodomain for receptor binding, which is characterized by this domain hierarchy. 136
22965 239140 cd02677 MIT_SNX15 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in sorting nexin 15 and related proteins. The molecular function of the MIT domain is unclear. 75
22966 239141 cd02678 MIT_VPS4 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in intracellular protein transport proteins of the AAA-ATPase family. The molecular function of the MIT domain is unclear. 75
22967 239142 cd02679 MIT_spastin MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in the AAA protein spastin, a probable ATPase involved in the assembly or function of nuclear protein complexes; spastins might also be involved in microtubule dynamics. The molecular function of the MIT domain is unclear. 79
22968 239143 cd02680 MIT_calpain7_2 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear. 75
22969 239144 cd02681 MIT_calpain7_1 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear. 76
22970 239145 cd02682 MIT_AAA_Arch MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in mostly archaebacterial AAA-ATPases. The molecular function of the MIT domain is unclear. 75
22971 239146 cd02683 MIT_1 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with unknown function, co-occurring with an as yet undescribed domain. The molecular function of the MIT domain is unclear. 77
22972 239147 cd02684 MIT_2 MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with an N-terminal serine/threonine kinase domain. The molecular function of the MIT domain is unclear. 75
22973 239148 cd02685 MIT_C MIT_C; domain found C-terminal to MIT (contained within Microtubule Interacting and Trafficking molecules) domains, as well as in some bacterial proteins. The function of this domain is unknown. 148
22974 199878 cd02688 E_set Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus. The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase. 82
22975 349868 cd02690 M28 M28 Zn-peptidases include aminopeptidases and carboxypeptidases. Peptidase M28 family (also called aminopeptidase Y family) contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Plasma glutamate carboxypeptidase (PGCP) and glutamate carboxypeptidase II (NAALADase) hydrolyze dipeptides. Several members of the M28 peptidase family have PA domain inserts which may participate in substrate binding and/or in promoting conformational changes, which influence the stability and accessibility of the site to substrate. These include prostate-specific membrane antigen (PSMA), yeast aminopeptidase S (SGAP), human transferrin receptors (TfR1 and TfR2), plasma glutamate carboxypeptidase (PGCP) and several predicted aminopeptidases where relatively little is known about them. Also included in the M28 family are glutaminyl cyclases (QC), which are involved in N-terminal glutamine cyclization of many endocrine peptides. Nicastrin and nicalin belong to this family but lack the amino-acid conservation required for catalytically active aminopeptidases. 202
22976 100036 cd02691 PurM-like2 AIR synthase (PurM) related protein, archaeal subgroup 2 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain. 346
22977 119407 cd02696 MurNAc-LAA N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3.5.1.28) is an autolysin that hydrolyzes the amide bond between N-acetylmuramoyl and L-amino acids in certain cell wall glycopeptides. These proteins are Zn-dependent peptidases with highly conserved residues involved in cation co-ordination. MurNAc-LAA in this family is one of several peptidoglycan hydrolases (PGHs) found in bacterial and bacteriophage or prophage genomes that are involved in the degradation of the peptidoglycan. In Escherichia coli, there are five MurNAc-LAAs present: AmiA, AmiB, AmiC and AmiD that are periplasmic, and AmpD that is cytoplasmic. Three of these (AmiA, AmiB and AmiC) belong to this family, the other two (AmiD and AmpD) do not. E. coli AmiA, AmiB and AmiC play an important role in cleaving the septum to release daughter cells after cell division. In general, bacterial MurNAc-LAAs are members of the bacterial autolytic system and carry a signal peptide in their N-termini that allows their transport across the cytoplasmic membrane. However, the bacteriophage MurNAc-LAAs are endolysins since these phage-encoded enzymes break down bacterial peptidoglycan at the terminal stage of the phage reproduction cycle. As opposed to autolysins, almost all endolysins have no signal peptides and their translocation through the cytoplasmic membrane is thought to proceed with the help of phage-encoded holin proteins. The amidase catalytic module is fused to another functional module (cell wall binding module or CWBM) either at the N- or C-terminus, which is responsible for high affinity binding of the protein to the cell wall. 172
22978 349869 cd02697 M20_like M20 Zn-peptidases include exopeptidases. Peptidase M20 family; uncharacterized subfamily. These hypothetical proteins have been inferred by homology to be exopeptidases: carboxypeptidases, dipeptidases and a specialized aminopeptidase. In general, the peptidase hydrolyzes the late products of protein degradation in order to complete the conversion of proteins to free amino acids. Members of this subfamily may bind metal ions such as zinc. 394
22979 239149 cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response. 239
22980 341048 cd02699 M4_M36 Peptidase M4 family (includes thermolysin, aureolysin, neutral protease and bacillolysin) and Peptidase M36 family (also known as fungalysin). This family includes the peptidases M4 as well as M36, both belonging to the Gluzincin family. The M4 peptidase family includes numerous zinc-dependent metallopeptidases that hydrolyze peptide bonds, such as thermolysin (EC 3.4.24.27), pseudolysin (the extracellular elastase of Pseudomonas aeruginosa), aureolysin (the extracellular metalloproteinase from Staphylococcus aureus), neutral protease from Bacillus cereus, as well as bacillolysin (EC 3.4.24.28). The M36 family, also known as the fungalysin (elastinolytic metalloproteinase) family, includes endopeptidases from pathogenic fungi. Both M4 and M36 families have similar folds and contain the Zn-binding site and the active site HEXXH motif. The eukaryotic M36 and bacterial M4 families of metalloproteases also share a conserved domain in their propeptides called FTP (fungalysin/thermolysin propeptide). 313
22981 259848 cd02733 RNAP_II_RPB1_N Largest subunit (Rpb1) of eukaryotic RNA polymerase II (RNAP II), N-terminal domain. The two largest subunits of RNA polymerase II (RNAP II), Rpb1 and Rpb2, form the active site, DNA entry channel and RNA exit channel. RNAP II is a large multi-subunit complex responsible for the synthesis of mRNA in eukaryotes. RNAP II consists of a 10-subunit core enzyme and a peripheral heterodimer of two subunits. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure. In yeast, Rpb1 and Rpb2, each makes up one clamp, one jaw, and part of the cleft. Rpb1_N contains part of the active site, forms the head and core of the one clamp, and makes up the pore and funnel regions of RNAP II. 751
22982 132722 cd02735 RNAP_I_Rpa1_C Largest subunit (Rpa1) of Eukaryotic RNA polymerase I (RNAP I), C-terminal domain. RNA polymerase I (RNAP I) is a multi-subunit protein complex responsible for the synthesis of rRNA precursor. It consists of at least 14 different subunits, and the largest one is homologous to subunit Rpb1 of yeast RNAP II and subunit beta' of bacterial RNAP. Rpa1 is also known as Rpa190 in yeast. Structure studies suggest that different RNAP complexes share a similar crab-claw-shape structure. The C-terminal domain of Rpb1, the largest subunit of RNAP II, makes up part of the foot and jaw structures of RNAP II. The similarity between this domain and the C-terminal domain of Rpb1, its counterpart in RNAP II, suggests a similar functional and structural role. 309
22983 132723 cd02736 RNAP_III_Rpc1_C Largest subunit (Rpc1) of Eukaryotic RNA polymerase III (RNAP III), C-terminal domain. Eukaryotic RNA polymerase III (RNAP III) is a large multi-subunit complex responsible for the synthesis of tRNAs, 5SrRNA, Alu-RNA, U6 snRNA, among others. Rpc1 is also known as C160 in yeast. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shape structure. The C-terminal domain of Rpb1, the largest subunit of RNAP II, makes up part of the foot and jaw structures of RNAP II. The similarity between this domain and the C-terminal domain of Rpb1, its counterpart in RNAP II, suggests a similar functional and structural role. 300
22984 132724 cd02737 RNAP_IV_NRPD1_C Largest subunit (NRPD1) of Higher plant RNA polymerase IV, C-terminal domain. Higher plants have five multi-subunit nuclear RNA polymerases: RNAP I, RNAP II and RNAP III, which are essential for viability; plus the two isoforms of the non-essential polymerase RNAP IV (IVa and IVb), which specialize in small RNA-mediated gene silencing pathways. RNAP IVa and/or RNAP IVb might be involved in RNA-directed DNA methylation of endogenous repetitive elements, silencing of transgenes, regulation of flowering-time genes, inducible regulation of adjacent gene pairs, and spreading of mobile silencing signals. NRPD1a is the largest subunit of RNAP IVa, whereas NRPD1b is the largest subunit of RNAP IVb. The full subunit compositions of RNAP IVa and RNAP IVb are not known, nor are their templates or enzymatic products. However, it has been shown that RNAP IVa and, to a lesser extent, RNAP IVb are crucial for several RNA-mediated gene silencing phenomena. 381
22985 119331 cd02742 GH20_hexosaminidase Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself. 303
22986 394871 cd02749 Macro_SF macrodomain superfamily. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Macrodomains include the yeast macrodomain Poa1 which is a phosphatase of ADP-ribose-1"-phosphate, a by-product of tRNA splicing. Some macrodomains have ADPr-unrelated binding partners such as the coronavirus SUD-N (N-terminal subdomain) and SUD-M (middle subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). Macrodomains regulate a wide variety of cellular and organismal processes, including DNA damage repair, signal transduction, and immune response. 121
22987 239151 cd02750 MopB_Nitrate-R-NarG-like Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in bacteria and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. Members of the MopB_Nitrate-R-NarG-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 461
22988 239152 cd02751 MopB_DMSOR-like The MopB_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. Members of the MopB_DMSOR-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 609
22989 239153 cd02752 MopB_Formate-Dh-Na-like Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. Members of the MopB_Formate-Dh-Na-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 649
22990 239154 cd02753 MopB_Formate-Dh-H Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a Mo active site region and a [4Fe-4S] center. Members of the MopB_Formate-Dh-H CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 512
22991 239155 cd02754 MopB_Nitrate-R-NapA-like Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. Members of the MopB_Nitrate-R-NapA-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 565
22992 239156 cd02755 MopB_Thiosulfate-R-like The MopB_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. Members of the MopB_Thiosulfate-R-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 454
22993 239157 cd02756 MopB_Arsenite-Ox Arsenite oxidase (Arsenite-Ox) oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin. Arsenite oxidase is a heterodimeric enzyme containing a large and a small subunit. The large catalytic subunit harbors the molybdopterin cofactor and the [3Fe-4S] cluster; and the small subunit belongs to the structural class of the Rieske proteins. The small subunit is not included in this alignment. Members of MopB_Arsenite-Ox CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 676
22994 239158 cd02757 MopB_Arsenate-R This CD includes the respiratory arsenate reductase, As(V), catalytic subunit (ArrA) and other related proteins. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 523
22995 239159 cd02758 MopB_Tetrathionate-Ra The MopB_Tetrathionate-Ra CD contains tetrathionate reductase, subunit A, (TtrA) and other related proteins. The Salmonella enterica tetrathionate reductase catalyses the reduction of trithionate but not sulfur or thiosulfate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 735
22996 239160 cd02759 MopB_Acetylene-hydratase The MopB_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 477
22997 239161 cd02760 MopB_Phenylacetyl-CoA-OR The MopB_Phenylacetyl-CoA-OR CD contains the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), and other related proteins. The phenylacetyl-CoA:acceptor oxidoreductase has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 760
22998 239162 cd02761 MopB_FmdB-FwdB The MopB_FmdB-FwdB CD contains the molybdenum/tungsten formylmethanofuran dehydrogenases, subunit B (FmdB/FwdB), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 415
22999 239163 cd02762 MopB_1 The MopB_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 539
23000 239164 cd02763 MopB_2 The MopB_2 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 679
23001 239165 cd02764 MopB_PHLH The MopB_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding (MopB) proteins. This CD is of the PHLH region homologous to the catalytic molybdopterin-binding subunit of MopB homologs. 524
23002 239166 cd02765 MopB_4 The MopB_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 567
23003 239167 cd02766 MopB_3 The MopB_3 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 501
23004 239168 cd02767 MopB_ydeP The MopB_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 574
23005 239169 cd02768 MopB_NADH-Q-OR-NuoG2 MopB_NADH-Q-OR-NuoG2: The NuoG/Nad11/75-kDa subunit (second domain) of the NADH-quinone oxidoreductase (NADH-Q-OR)/respiratory complex I/NADH dehydrogenase-1 (NDH-1). The NADH-Q-OR is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The atomic structure of complex I is not known and the mechanisms of electron transfer and proton pumping are not established. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Escherichia coli, this subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. The nad11 gene is nuclear-encoded in animals, plants, and fungi, but is still encoded in the mitochondrial genome of some protists. The Nad11/NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain family belongs to the molybdopterin_binding (MopB) superfamily of proteins. Bacterial type II NADH-quinone oxidoreductases and NQR-type sodium-motive NADH-quinone oxidoreductases are not homologs of this domain family. 386
23006 239170 cd02769 MopB_DMSOR-BSOR-TMAOR The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 609
23007 239171 cd02770 MopB_DmsA-EC This CD (MopB_DmsA-EC) includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a predicted N-terminal iron-sulfur [4Fe-4S] cluster binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 617
23008 239172 cd02771 MopB_NDH-1_NuoG2-N7 MopB_NDH-1_NuoG2-N7: The second domain of the NuoG subunit (with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 472
23009 239173 cd02772 MopB_NDH-1_NuoG2 MopB_NDH-1_NuoG2: The second domain of the NuoG subunit of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1), found in beta- and gammaproteobacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 414
23010 239174 cd02773 MopB_Res-Cmplx1_Nad11 MopB_Res_Cmplx1_Nad11: The second domain of the Nad11/75-kDa subunit of the NADH-quinone oxidoreductase/respiratory complex I/NADH dehydrogenase-1 (NDH-1) of eukaryotes and the Nqo3/G subunit of alphaproteobacteria NDH-1. The NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75 kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Paracoccus denitrificans, this subunit is encoded by the nqo3 gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. The Nad11/Nqo3 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 375
23011 239175 cd02774 MopB_Res-Cmplx1_Nad11-M MopB_Res_Cmplx1_Nad11_M: Mitochondrial-encoded NADH-quinone oxidoreductase/respiratory complex I, the second domain of the Nad11/75-kDa subunit of some protists. NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH-quinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. The Nad11 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 366
23012 239176 cd02775 MopB_CT Molybdopterin-Binding, C-terminal (MopB_CT) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. This hierarchy is of the conserved MopB_CT domain present in many, but not all, MopB homologs. 101
23013 239177 cd02776 MopB_CT_Nitrate-R-NarG-like Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. This CD (MopB_CT_Nitrate-R-NarG-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 141
23014 239178 cd02777 MopB_CT_DMSOR-like The MopB_CT_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 127
23015 239179 cd02778 MopB_CT_Thiosulfate-R-like The MopB_CT_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. Also included in this CD is the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), which has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. The MopB_CT_Thiosulfate-R-like CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 123
23016 239180 cd02779 MopB_CT_Arsenite-Ox This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of Arsenite oxidase (Arsenite-Ox) and related proteins. Arsenite oxidase oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin. 115
23017 239181 cd02780 MopB_CT_Tetrathionate_Arsenate-R This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of tetrathionate reductase, subunit A, (TtrA); respiratory arsenate As(V) reductase, catalytic subunit (ArrA); and other related proteins. 143
23018 239182 cd02781 MopB_CT_Acetylene-hydratase The MopB_CT_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 130
23019 239183 cd02782 MopB_CT_1 The MopB_CT_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 129
23020 239184 cd02783 MopB_CT_2 The MopB_CT_2 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 156
23021 239185 cd02784 MopB_CT_PHLH The MopB_CT_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding proteins. This CD is of the PHLH region homologous to the conserved molybdopterin-binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 137
23022 239186 cd02785 MopB_CT_4 The MopB_CT_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 124
23023 239187 cd02786 MopB_CT_3 The MopB_CT_3 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 116
23024 239188 cd02787 MopB_CT_ydeP The MopB_CT_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 112
23025 239189 cd02788 MopB_CT_NDH-1_NuoG2-N7 MopB_CT_NDH-1_NuoG2-N7: C-terminal region of the NuoG-like subunit (of the variant with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain and a C-terminal region (this CD) homologous to the formate dehydrogenase C-terminal molybdopterin_binding (MopB) region. 96
23026 239190 cd02789 MopB_CT_FmdC-FwdD The MopB_FmdC-FwdD CD includes the C-terminus of subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC) and subunit D of tungsten formylmethanofuran dehydrogenase (FwdD), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding superfamily of proteins. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 106
23027 239191 cd02790 MopB_CT_Formate-Dh_H Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a Mo active site region and a [4Fe-4S] center. This CD (MopB_CT_Formate-Dh_H) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 116
23028 239192 cd02791 MopB_CT_Nitrate-R-NapA-like Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. This CD (MopB_CT_Nitrate-R-Nap) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 122
23029 239193 cd02792 MopB_CT_Formate-Dh-Na-like Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. This CD (MopB_CT_Formate-Dh-Na-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 122
23030 239194 cd02793 MopB_CT_DMSOR-BSOR-TMAOR The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 129
23031 239195 cd02794 MopB_CT_DmsA-EC The MopB_CT_DmsA-EC CD includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a predicted N-terminal iron-sulfur [4Fe-4S] cluster binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 121
23032 271143 cd02795 CBM6-CBM35-CBM36_like Carbohydrate Binding Module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules. This alignment model also contains the C-terminal domains of bacterial insecticidal toxins, where they may be involved in determining insect specificity through carbohydrate binding functionality. 124
23033 239196 cd02796 tRNA_bind_bactPheRS tRNA-binding-domain-containing prokaryotic phenylalanyl tRNA synthetase (PheRS) beta chain. PheRSs aminoacylate phenylalanine transfer RNAs (tRNAphe). PheRSs belong structurally to class II aminoacyl tRNA synthetases (aaRSs) but, as they aminoacylate the 2'OH of the terminal ribose of tRNA, they belong functionally to class 1 aaRSs. This domain has general tRNA binding properties and is believed to direct tRNAphe to the active site of the enzyme. 103
23034 239197 cd02798 tRNA_bind_CsaA tRNA-binding-domain-containing CsaA-like proteins. CsaA is a molecular chaperone with export related activities. CsaA has a putative tRNA binding activity. The functional unit of CsaA is a homodimer and this domain acts as a dimerization domain. 107
23035 239198 cd02799 tRNA_bind_EMAP-II_like tRNA-binding-domain-containing EMAP2-like proteins. This family contains a diverse fraction of tRNA binding proteins, including Caenorhabditis elegans methionyl-tRNA synthetase (CeMetRS), human tyrosyl-tRNA synthetase (hTyrRS), Saccharomyces cerevisiae Arc1p, human p43 and EMAP2. CeMetRS and hTyrRS aminoacylate their cognate tRNAs. Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases. This domain has general tRNA binding properties. In a subset of this family this domain has the added capability of a cytokine. For example, the p43 component of the human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine also is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. 105
23036 239199 cd02800 tRNA_bind_EcMetRS_like tRNA-binding-domain-containing Escherichia coli methionyl-tRNA synthetase (EcMetRS)-like proteins. This family includes EcMetRS and Aquifex aeolicus Trbp111 (AaTrbp111). This domain has general tRNA binding properties. MetRS aminoacylates methionine transfer RNAs (tRNAmet). AaTrbp111 is a structure-specific molecular chaperone recognizing the L-shape of the tRNA fold. AaTrbp111 plays a role in nuclear trafficking of tRNAs. The functional unit of EcMetRS and AaTrbp111 is a homodimer; this domain acts as the dimerization domain. 105
23037 239200 cd02801 DUS_like_FMN Dihydrouridine synthase-like (DUS-like) FMN-binding domain. Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. 1VHN, a putative flavin oxidoreductase, has high sequence similarity to DUS. The enzymatic mechanism of 1VHN is not known at present. 231
23038 239201 cd02803 OYE_like_FMN_family Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. 327
23039 239202 cd02808 GltS_FMN Glutamate synthase (GltS) FMN-binding domain. GltS is a complex iron-sulfur flavoprotein that catalyzes the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the plant, yeast and bacterial pathway for ammonia assimilation. It is a multifunctional enzyme that functions through three distinct active centers, carrying out L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. 392
23040 239203 cd02809 alpha_hydroxyacid_oxid_FMN Family of homologous FMN-dependent alpha-hydroxyacid oxidizing enzymes. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). In green plants, glycolate oxidase is one of the key enzymes in photorespiration where it oxidizes glycolate to glyoxylate. LMO catalyzes the oxidation of L-lactate to acetate and carbon dioxide. MDH oxidizes (S)-mandelate to phenylglyoxalate. It is an enzyme in the mandelate pathway that occurs in several strains of Pseudomonas which converts (R)-mandelate to benzoate. 299
23041 239204 cd02810 DHOD_DHPD_FMN Dihydroorotate dehydrogenase (DHOD) and Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively. DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass its homodimeric interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue. 289
23042 239205 cd02811 IDI-2_FMN Isopentenyl-diphosphate:dimethylallyl diphosphate isomerase type 2 (IDI-2) FMN-binding domain. Two types of IDIs have been characterized at present. The long known IDI-1 is only dependent on divalent metals for activity, whereas IDI-2 requires a metal, FMN and NADPH. IDI-2 catalyzes the interconversion of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) in the mevalonate pathway. 326
23043 239206 cd02812 PcrB_like PcrB_like proteins. One member of this family, a protein from Archaeoglobus fulgidus, has been characterized as a (S)-3-O-geranylgeranylglyceryl phosphate synthase (AfGGGPS). AfGGGPS catalyzes the formation of an ether linkage between sn-glycerol-1-phosphate (G1P) and geranylgeranyl diphosphate (GGPP), the committed step in archaeal lipid biosynthesis. Therefore, it has been proposed that PcrB-like proteins are either prenyltransferases or are involved in lipoteichoic acid biosynthesis although the exact function is still unknown. 219
23044 239207 cd02825 PAZ PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain. 115
23045 239208 cd02826 Piwi-like Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism. 393
23046 239209 cd02843 PAZ_dicer_like PAZ domain, dicer_like subfamily. Dicer is an RNAse involved in cleaving dsRNA in the RNA interference pathway. It generates dsRNAs which are approximately 20 bp long (siRNAs), which in turn target hydrolysis of homologous RNAs. PAZ domains are named after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 122
23047 239210 cd02844 PAZ_CAF_like PAZ domain, CAF_like subfamily. CAF (for carpel factory) is a plant homolog of Dicer. CAF has been implicated in flower morphogenesis and in early Arabidopsis development and might function through posttranscriptional regulation of specific mRNA molecules. PAZ domains are named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 135
23048 239211 cd02845 PAZ_piwi_like PAZ domain, Piwi_like subfamily. In multi-cellular organisms, the Piwi protein appears to be essential for the maintenance of germline stem cells. In the Drosophila male germline, Piwi was shown to be involved in the silencing of retrotransposons in the male gametes. The Piwi proteins share their domain architecture with other members of the argonaute family. The PAZ domain has been named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 117
23049 239212 cd02846 PAZ_argonaute_like PAZ domain, argonaute_like subfamily. Argonaute is part of the RNA-induced silencing complex (RISC), and is an endonuclease that plays a key role in the RNA interference pathway. The PAZ domain has been named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 114
23050 199879 cd02847 E_set_Chitobiase_C C-terminal Early set domain associated with the catalytic domain of chitobiase (also called N-acetylglucosaminidase). E or "early" set domains are associated with the catalytic domain of chitobiase at the C-terminus. Chitobiase digests the beta, 1-4 glycosidic bonds of the N-acetylglucosamine (NAG) oligomers found in chitin, an important structural element of fungal cell wall and arthropod exoskeletons. It is thought to proceed through an acid-base reaction mechanism, in which one protein carboxylate acts as the catalytic acid, while the nucleophile is the polar acetamido group of the sugar in a substrate-assisted reaction with retention of the anomeric configuration. The C-terminus of chitobiase may be related to the immunoglobulin and/or fibronectin type III superfamilies. E set domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 62
23051 199880 cd02848 E_set_Chitinase_N N-terminal Early set domain associated with the catalytic domain of chitinase. E or "early" set domains are associated with the catalytic domain of chitinase at the N-terminal end. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and belong to families 18 and 19 of glycosyl hydrolases based on sequence criteria. The N-terminal domain of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 105
23052 199881 cd02850 E_set_Cellulase_N N-terminal Early set domain associated with the catalytic domain of cellulase. E or "early" set domains are associated with the catalytic domain of cellulases at the N-terminal end. Cellulases are O-glycosyl hydrolases (GHs) that hydrolyze beta 1-4 glucosidic bonds in cellulose. They are usually categorized into either exoglucanases, which sequentially release terminal sugar units from the cellulose chain, or endoglucanases, which also attack the chain internally. The N-terminal domain of cellulase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 86
23053 199882 cd02851 E_set_GO_C C-terminal Early set domain associated with the catalytic domain of galactose oxidase. E or "early" set domains are associated with the catalytic domain of galactose oxidase at the C-terminal end. Galactose oxidase is an extracellular monomeric enzyme which catalyzes the stereospecific oxidation of a broad range of primary alcohol substrates and possesses a unique mononuclear copper site essential for catalyzing a two-electron transfer reaction during the oxidation of primary alcohols to corresponding aldehydes. The second redox active center necessary for the reaction was found to be situated at a tyrosine residue. The C-terminal domain of galactose oxidase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 103
23054 199883 cd02853 E_set_MTHase_like_N N-terminal Early set domain associated with the catalytic domain of Maltooligosyl trehalose trehalohydrolase (also called Glycosyltrehalose trehalohydrolase) and similar proteins. E or "early" set domains are associated with the catalytic domain of Maltooligosyl trehalose trehalohydrolase (MTHase) and similar proteins at the N-terminal end. This subfamily also includes bacterial alpha amylases and 1,4-alpha-glucan branching enzymes which are highly similar to MTHase. Maltooligosyl trehalose synthase (MTSase) and MTHase work together to produce trehalose. MTSase is responsible for converting the alpha-1,4-glucosidic linkage to an alpha,alpha-1,1-glucosidic linkage at the reducing end of the maltooligosaccharide through an intramolecular transglucosylation reaction, while MTHase hydrolyzes the penultimate alpha-1,4 linkage of the reducing end, resulting in the release of trehalose. The N-terminal domain of MTHase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 84
23055 199884 cd02854 E_set_GBE_euk_N N-terminal Early set domain associated with the catalytic domain of eukaryotic glycogen branching enzyme (also called 1,4 alpha glucan branching enzyme). This subfamily is composed of predominantly eukaryotic 1,4 alpha glucan branching enzymes, also called glycogen branching enzymes or starch binding enzymes in plants. E or "early" set domains are associated with the catalytic domain of the 1,4 alpha glucan branching enzymes at the N-terminal end. These enzymes catalyze the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage, yielding a non-reducing end oligosaccharide chain, as well as the subsequent attachment of short glucosyl chains to the alpha-1,6 position. Starch is composed of two types of glucan polymer: amylose and amylopectin. Amylose is mainly composed of linear chains of alpha-1,4 linked glucose residues and amylopectin consists of shorter alpha-1,4 linked chains connected by alpha-1,6 linkages. Amylopectin is synthesized from linear chains by starch branching enzyme. The N-terminal domains of the branching enzyme proteins may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 95
23056 199885 cd02855 E_set_GBE_prok_N N-terminal Early set domain associated with the catalytic domain of prokaryotic glycogen branching enzyme. This subfamily is composed of predominantly prokaryotic 1,4 alpha glucan branching enzymes, also called glycogen branching enzymes. E or "early" set domains are associated with the catalytic domain of glycogen branching enzymes at the N-terminal end. Glycogen branching enzyme catalyzes the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage, yielding a non-reducing end oligosaccharide chain, as well as the subsequent attachment of short glucosyl chains to the alpha-1,6 position. By increasing the number of non-reducing ends, glycogen is more reactive to synthesis and digestion as well as being more soluble. The N-terminal domain of the 1,4 alpha glucan branching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 105
23057 199886 cd02856 E_set_GDE_Isoamylase_N N-terminal Early set domain associated with the catalytic domain of Glycogen debranching enzyme and bacterial isoamylase (also called glycogen 6-glucanohydrolase). E or "early" set domains are associated with the catalytic domain of the glycogen debranching enzyme at the N-terminal end. Glycogen debranching enzymes have both 4-alpha-glucanotransferase and amylo-1,6-glucosidase activities. As a transferase, it transfers a segment of the 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. As a glucosidase, it catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. Bacterial isoamylases are also included in this subfamily. Isoamylase is one of the starch-debranching enzymes that catalyze the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen. Isoamylase contains a bound calcium ion, but this is not in the same position as the conserved calcium ion that has been reported in other alpha-amylase family enzymes. The N-terminal domain of glycogen debranching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 130
23058 199887 cd02857 E_set_CDase_PDE_N N-terminal Early set domain associated with the catalytic domain of cyclomaltodextrinase and pullulan-degrading enzymes. E or "early" set domains are associated with the catalytic domain of the cyclomaltodextrinase (CDase) and pullulan-degrading enzymes at the N-terminal end. Members of this subgroup include CDase, maltogenic amylase, and neopullulanase, all of which are capable of hydrolyzing all or two of the following three types of substrates: cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. The N-terminal domain of the CDase and pullulan-degrading enzymes may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 109
23059 199888 cd02858 E_set_Esterase_N N-terminal Early set domain associated with the catalytic domain of esterase. E or "early" set domains are associated with the catalytic domain of esterase at the N-terminal end. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term esterase can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminal domain of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 78
23060 199889 cd02859 E_set_AMPKbeta_like_N N-terminal Early set domain, a glycogen binding domain, associated with the catalytic domain of AMP-activated protein kinase beta subunit. E or "early" set domains are associated with the catalytic domain of AMP-activated protein kinase beta subunit glycogen binding domain at the N-terminal end. AMPK is a metabolic stress sensing protein that senses AMP/ATP and has recently been found to act as a glycogen sensor as well. The protein functions as an alpha-beta-gamma heterotrimer. This N-terminal domain is the glycogen binding domain of the beta subunit. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, and isoamylase. 80
23061 199890 cd02860 E_set_Pullulanase Early set domain associated with the catalytic domain of pullulanase (also called dextrinase and alpha-dextrin endo-1,6-alpha glucosidase). E or "early" set domains are associated with the catalytic domain of pullulanase at either the N-terminal or C-terminal end, and in a few instances at both ends. Pullulanase is an enzyme with activity similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. The E set domain of pullulanase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase. 97
23062 199891 cd02861 E_set_pullulanase_like Early set domain associated with the catalytic domain of pullulanase-like proteins. E or "early" set domains are associated with the catalytic domain of pullulanase at either the N-terminal or C-terminal end, and in a few instances at both ends. Pullulanase (also called dextrinase or alpha-dextrin endo-1,6-alpha glucosidase) is an enzyme with action similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. The E set domain of pullulanase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase. 88
23063 239213 cd02862 NorE_like NorE_like subfamily of heme-copper oxidase subunit III. Heme-copper oxidases include cytochrome c and ubiquinol oxidases. Alcaligenes faecalis norE is found in a gene cluster containing norCB. norCB encodes the cytochrome c and cytochrome b subunits of nitric oxide reductase (NOR). Based on this and on its similarity to subunit III of cytochrome c oxidase (CcO) and ubiquinol oxidase, NorE has been speculated to be a subunit of NOR. 186
23064 239214 cd02863 Ubiquinol_oxidase_III Ubiquinol oxidase subunit III subfamily. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Ubiquinol oxidases feature four subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of bovine CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in bovine CcO. Although not required for catalytic activity, subunit III appears to be involved in assembly of the multimer complex. 186
23065 239215 cd02864 Heme_Cu_Oxidase_III_1 Heme-copper oxidase subunit III subfamily. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. 202
23066 239216 cd02865 Heme_Cu_Oxidase_III_2 Heme-copper oxidase subunit III subfamily. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. 184
23067 211343 cd02866 PseudoU_synth_TruA_Archea Archaeal pseudouridine synthases. This group consists of archaeal pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. This group of proteins makes pseudouridine in tRNAs. 219
23068 211344 cd02867 PseudoU_synth_TruB_4 Pseudouridine synthase homolog 4. This group consists of eukaryotic TruB proteins similar to Saccharomyces cerevisiae Pus4. S. cerevisiae Pus4 makes psi55 in the T loop of both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). 312
23069 211345 cd02868 PseudoU_synth_hTruB2_like Pseudouridine synthase, humanTRUB2_like. This group consists of eukaryotic pseudouridine synthases similar to human TruB pseudouridine synthase homolog 2 (TRUB2). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). 226
23070 211346 cd02869 PseudoU_synth_RluA_like Pseudouridine synthase, RluA family. This group is comprised of eukaryotic, bacterial and archaeal proteins similar to eight site-specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi), requiring no cofactors. E. coli RluC, for example, makes psi955, 2504 and 2580 in 23S RNA. Some psi sites, such as psi1917 in 23S RNA made by RluD, are universally conserved. Other psi sites occur in a more restricted fashion; for example, psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in Gram-negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA. psi2604 in 23S RNA made by RluF has only been detected in E. coli. 185
23071 211347 cd02870 PseudoU_synth_RsuA_like Pseudouridine synthases, RsuA subfamily. Pseudouridine synthases are responsible for the synthesis of pseudouridine from uracil in ribosomal RNA. The RsuA subfamily includes Pseudouridine Synthase similar to Ribosomal small subunit pseudouridine 516 synthase. Most of the proteins in this family are bacterial proteins. 146
23072 119350 cd02871 GH18_chitinase_D-like GH18 domain of Chitinase D (ChiD). ChiD, a chitinase found in Bacillus circulans, hydrolyzes the 1,4-beta-linkages of N-acetylglucosamine in chitin and chitodextrins. The domain architecture of ChiD includes a catalytic glycosyl hydrolase family 18 (GH18) domain, a chitin-binding domain, and a fibronectin type III domain. The chitin-binding and fibronectin type III domains are located either N-terminal or C-terminal to the catalytic domain. This family includes exochitinase Chi36 from Bacillus cereus. 312
23073 119351 cd02872 GH18_chitolectin_chitotriosidase This conserved domain family includes a large number of catalytically inactive chitinase-like lectins (chitolectins) including YKL-39, YKL-40 (HCGP39), YM1, oviductin, and AMCase (acidic mammalian chitinase), as well as catalytically active chitotriosidases. The conserved domain is an eight-stranded alpha/beta barrel fold belonging to the family 18 glycosyl hydrolases. The fold has a pronounced active-site cleft at the C-terminal end of the beta-barrel. The chitolectins lack a key active site glutamate (the proton donor required for hydrolytic activity) but retain highly conserved residues involved in oligosaccharide binding. Chitotriosidase is a chitinolytic enzyme expressed in maturing macrophages, which suggests that it plays a part in antimicrobial defense. Chitotriosidase hydrolyzes chitotriose, as well as colloidal chitin to yield chitobiose and is therefore considered an exochitinase. Chitotriosidase occurs in two major forms, the large form being converted to the small form by either RNA or post-translational processing. Although the small form, containing the chitinase domain alone, is sufficient for the chitinolytic activity, the additional C-terminal chitin-binding domain of the large form plays a role in processing colloidal chitin. The chitotriosidase gene is nonessential in humans, as about 35% of the population are heterozygous and 6% homozygous for an inactivated form of the gene. HCGP39 is a 39-kDa human cartilage glycoprotein thought to play a role in connective tissue remodeling and defense against pathogens. 362
23074 119352 cd02873 GH18_IDGF The IDGFs (imaginal disc growth factors) are a family of growth factors identified in insects that includes at least five members, some of which are encoded by genes in a tight cluster. The IDGFs have an eight-stranded alpha/beta barrel fold and are related to the glycosyl hydrolase family 18 (GH18) chitinases, but they have an amino acid substitution known to abolish chitinase catalytic activity. IDGFs may have evolved from chitinases to gain new functions as growth factors, interacting with cell surface glycoproteins involved in growth-promoting processes. 413
23075 119353 cd02874 GH18_CFLE_spore_hydrolase Cortical fragment-lytic enzyme (CFLE) is a peptidoglycan hydrolase involved in bacterial endospore germination. CFLE is expressed as an inactive preprotein (called SleB) in the forespore compartment of sporulating cells. SleB translocates across the forespore inner membrane and is deposited as a mature enzyme in the cortex layer of the spore. As part of a sensory mechanism capable of initiating germination, CFLE degrades a spore-specific peptidoglycan constituent called muramic-acid delta-lactam that comprises the outer cortex. CFLE has a C-terminal glycosyl hydrolase family 18 (GH18) catalytic domain as well as two N-terminal LysM peptidoglycan-binding domains. In addition to SleB, this family includes YaaH, YdhD, and YvbX from Bacillus subtilis. 313
23076 119354 cd02875 GH18_chitobiase Chitobiase (also known as di-N-acetylchitobiase) is a lysosomal glycosidase that hydrolyzes the reducing-end N-acetylglucosamine from the chitobiose core of oligosaccharides during the ordered degradation of asparagine-linked glycoproteins in eukaryotes. Chitobiase can only do so if the asparagine that joins the oligosaccharide to protein is previously removed by a glycosylasparaginase. Chitobiase is therefore the final step in the lysosomal degradation of the protein/carbohydrate linkage component of asparagine-linked glycoproteins. The catalytic domain of chitobiase is an eight-stranded alpha/beta barrel fold similar to that of other family 18 glycosyl hydrolases such as hevamine and chitotriosidase. 358
23077 119355 cd02876 GH18_SI-CLP Stabilin-1 interacting chitinase-like protein (SI-CLP) is a eukaryotic chitinase-like protein of unknown function that interacts with the endocytic/sorting transmembrane receptor stabilin-1 and is secreted from the lysosome. SI-CLP has a glycosyl hydrolase family 18 (GH18) domain but lacks a chitin-binding domain. The catalytic amino acids of the GH18 domain are not conserved in SI-CLP, similar to the chitolectins YKL-39, YKL-40, and YM1/2. Human SI-CLP is sorted to late endosomes and secretory lysosomes in alternatively activated macrophages. 318
23078 119356 cd02877 GH18_hevamine_XipI_class_III This conserved domain family includes xylanase inhibitor Xip-I, and the class III plant chitinases such as hevamine, concanavalin B, and PPL2, all of which have a glycosyl hydrolase family 18 (GH18) domain. Hevamine is a class III endochitinase that hydrolyzes the linear polysaccharide chains of chitin and peptidoglycan and is important for defense against pathogenic bacteria and fungi. PPL2 (Parkia platycephala lectin 2) is a class III chitinase from Parkia platycephala seeds that hydrolyzes beta(1-4) glycosidic bonds linking 2-acetoamido-2-deoxy-beta-D-glucopyranose units in chitin. 280
23079 119357 cd02878 GH18_zymocin_alpha Zymocin, alpha subunit. Zymocin is a heterotrimeric enzyme that inhibits yeast cell cycle progression. The zymocin alpha subunit has a chitinase activity that is essential for holoenzyme action from the cell exterior while the gamma subunit contains the intracellular toxin responsible for G1 phase cell cycle arrest. The zymocin alpha and beta subunits are thought to act from the cell's exterior by docking to the cell wall-associated chitin, thus mediating gamma-toxin translocation. The alpha subunit has an eight-stranded TIM barrel fold similar to that of family 18 glycosyl hydrolases such as hevamine, chitolectin, and chitobiase. 345
23080 119358 cd02879 GH18_plant_chitinase_class_V The class V plant chitinases have a glycosyl hydrolase family 18 (GH18) domain, but lack the chitin-binding domain present in other GH18 enzymes. The GH18 domain of the class V chitinases has endochitinase activity in some cases and no catalytic activity in others. Included in this family is a lectin found in black locust (Robinia pseudoacacia) bark, which binds chitin but lacks chitinase activity. Also included is a chitinase-related receptor-like kinase (CHRK1) from tobacco (Nicotiana tabacum), with an N-terminal GH18 domain and a C-terminal kinase domain, which is thought to be part of a plant signaling pathway. The GH18 domain of CHRK1 is expressed extracellularly where it binds chitin but lacks chitinase activity. 299
23081 239217 cd02883 Nudix_Hydrolase Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase. 123
23082 239218 cd02885 IPP_Isomerase Isopentenyl diphosphate (IPP) isomerase, a member of the Nudix hydrolase superfamily, is a key enzyme in the isoprenoid biosynthetic pathway. Isoprenoids comprise a large family of natural products including sterols, carotenoids, dolichols and prenylated proteins. These compounds are synthesized from two precursors: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). IPP isomerase catalyzes the interconversion of IPP and DMAPP by a stereoselective antarafacial transposition of hydrogen. The enzyme requires one Mn2+ or Mg2+ ion in its active site to fold into an active conformation and also contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. The metal binding site is present within the active site and plays structural and catalytical roles. IPP isomerase is well represented in several bacteria, archaebacteria and eukaryotes, including fungi, mammals and plants. Despite sequence variations (mainly at the N-terminus), the core structure is highly conserved. 165
23083 153089 cd02888 RNR_II_dimer Class II ribonucleotide reductase, dimeric form. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class II RNRs are found in bacteria that can live under both aerobic and anaerobic conditions. Many, but not all members of this class are found to be homodimers. Adenosylcobalamin interacts directly with an active site cysteine to form the reactive cysteine radical. 464
23084 239219 cd02889 SQCY Squalene cyclase (SQCY) domain; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. Bacterial SQCY catalyzes the conversion of squalene to hopene or diplopterol. Eukaryotic OSQCY transforms the 2,3-epoxide of squalene to compounds such as lanosterol (a metabolic precursor of cholesterol and steroid hormones) in mammals and fungi or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain. This group also contains SQCY-like archaeal sequences and some bacterial SQCYs which lack this minor domain. 348
23085 239220 cd02890 PTase Protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). The protein prenyltransferase family of lipid-modifying enzymes includes protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II). They catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group catalyzing the formation of thioether linkages between the C1 atom of farnesyl (15-carbon by FTase) or geranylgeranyl (20-carbon by GGTase-I, II) isoprenoid lipids and cysteine residues at or near the C-terminus of protein acceptors. FTase and GGTase-I prenylate the cysteine in the terminal sequence, "CAAX"; and GGTase-II prenylates both cysteines in the "CC" (or "CXC") terminal sequence. These enzymes are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II does not recognize its protein acceptor directly but requires Rab to complex with REP (Rab escort protein) before prenylation can occur. These enzymes are found exclusively in eukaryotes. 286
23086 239221 cd02891 A2M_like Proteins similar to alpha2-macroglobulin (alpha (2)-M). Alpha (2)-M is a major carrier protein in serum. It is a broadly specific proteinase inhibitor. The structural thioester of alpha (2)-M is involved in the immobilization and entrapment of proteases. This group contains another broadly specific proteinase inhibitor: pregnancy zone protein (PZP). PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZ bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system. This group also contains C3, C4 and C5 of vertebrate complement. The vertebrate complement is an effector of both the acquired and innate immune systems. The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond. 282
23087 239222 cd02892 SQCY_1 Squalene cyclase (SQCY) domain subgroup 1; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. This group contains bacterial SQCY which catalyzes the conversion of squalene to hopene or diplopterol and eukaryotic OSQCY which transforms the 2,3-epoxide of squalene to compounds such as lanosterol in mammals and fungi or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain. 634
23088 239223 cd02893 FTase Protein farnesyltransferase (FTase)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). FTases are a subgroup of PTase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. These proteins are heterodimers of alpha and beta subunits. Both subunits are required for catalytic activity. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids. Ftase attaches a 15-carbon farnesyl group to the cysteine within the C-terminal CaaX motif of substrate proteins when X is Ala, Met, Ser, Cys or Gln. Protein farnesylation has been shown to play critical roles in a variety of cellular processes including Ras/mitogen activated protein kinase signaling pathways in mammals and, abscisic acid signal transduction in Arabidopsis. 299
23089 239224 cd02894 GGTase-II Geranylgeranyltransferase type II (GGTase-II)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-IIs are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-II). GGTase-II catalyzes alkylation of both cysteine residues in Rab proteins containing carboxy-terminal "CC", "CXCX" or "CXC" motifs. PTases are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II requires an escort protein to bring the substrate protein to the catalytic heterodimer and to escort the geranylgeranylated product to the membrane. 287
23090 239225 cd02895 GGTase-I Geranylgeranyltransferase type I (GGTase-I)-like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-Is are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-I). GGTase-I prenylates the cysteine in the terminal sequence, "CAAX" when X is Leu or Phe. Substrates for GGTase-I include the gamma subunit of neural G-proteins and several Ras-related G-proteins. PTases are heterodimeric with both alpha and beta subunits required for catalytic activity. 307
23091 239226 cd02896 complement_C3_C4_C5 Proteins similar to C3, C4 and C5 of vertebrate complement. The vertebrate complement system, comprised of a large number of distinct plasma proteins, is an effector of both the acquired and innate immune systems. The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond. 297
23092 239227 cd02897 A2M_2 Proteins similar to alpha2-macroglobulin (alpha (2)-M). This group also contains the pregnancy zone protein (PZP). Alpha(2)-M and PZP are broadly specific proteinase inhibitors. Alpha (2)-M is a major carrier protein in serum. The structural thioester of alpha (2)-M, is involved in the immobilization and entrapment of proteases. PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZ bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production contributing to fetal survival. It has been suggested that thioester bond cleavage promotes the binding of PZ and alpha (2)-M to the CD91 receptor clearing them from circulation. 292
23093 239228 cd02899 PLAT_SR Scavenger receptor protein. A subfamily of PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates. This subfamily contains Toxoplasma gondii Scavenger protein TgSR1. 109
23094 394872 cd02900 Macro_Appr_pase macrodomain, Appr-1"-pase family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. The yeast protein Ymx7 and related proteins in this family contain a stand-alone macrodomain and may be specific phosphatases catalyzing the conversion of ADP-ribose-1"-monophosphate (Appr-1"-p) to ADP-ribose. Appr-1"-p is an intermediate in a metabolic pathway involved in pre-tRNA splicing. 195
23095 394873 cd02901 Macro_Poa1p-like macrodomain, Poa1p-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family show similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. Poa1p may play a role in tRNA splicing regulation. 135
23096 394874 cd02903 Macro_BAL-like macrodomain, B-aggressive lymphoma (BAL)-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family show similarity to BAL (B-aggressive lymphoma) proteins, which contain one to three macrodomains. Most BAL family macrodomains belong to this family except for the most N-terminal domain in multiple-domain containing proteins. This family includes the second and third macrodomains of mono-ADP-ribosyltransferase PARP14 (PARP-14, also known as ADP-ribosyltransferase diphtheria toxin-like 8, ATRD8, B aggressive lymphoma protein 2, or BAL2). Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcriptional repressors. 175
23097 394875 cd02904 Macro_H2A-like macrodomain, macroH2A-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family are similar to macroH2A, a variant of the major-type core histone H2A, which contains an N-terminal H2A domain and a C-terminal nonhistone macrodomain. Histone macroH2A is enriched on the inactive X chromosome of mammalian female cells. It does not bind poly ADP-ribose, but does bind the monomeric SirT1 metabolite O-acetyl-ADP-ribose (OAADPR) with high affinity through its macrodomain. This family also includes the ADP-ribose binding macrodomain of the macroH2A variant, macroH2A1.1. The macroH2A1.1 isoform inhibits PARP1-dependent DNA-damage induced chromatin dynamics. The putative ADP-ribose binding pocket of the human macroH2A2 macrodomain exhibits marked structural differences compared with the macroH2A1.1 variant. 188
23098 394876 cd02905 Macro_GDAP2-like macrodomain, GDAP2-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. This family contains proteins similar to human GDAP2, the ganglioside induced differentiation associated protein 2, whose gene is expressed at a higher level in differentiated Neuro2a cells compared with non-differentiated cells. GDAP2 contains an N-terminal macrodomain and a C-terminal Sec14p-like lipid binding domain. It is specifically expressed in brain and testis. 169
23099 394877 cd02907 Macro_Af1521_BAL-like macrodomain, Af1521-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. The macrodomains in this family show similarity to Af1521, a protein from Archaeoglobus fulgidus containing a stand-alone macrodomain. Af1521 binds ADP-ribose and exhibits phosphatase activity toward ADP-ribose-1"-monophosphate (Appr-1"-p). Also included in this family are the N-terminal (or first) macrodomains of BAL (B-aggressive lymphoma) proteins which contain multiple macrodomains, such as the first macrodomain of mono-ADP-ribosyltransferase PARP14 (PARP-14, also known as ADP-ribosyltransferase diphtheria toxin-like 8, ATRD8, B aggressive lymphoma protein 2, or BAL2). Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcriptional repressors. 158
23100 394878 cd02908 Macro_OAADPr_deacetylase macrodomain, O-acetyl-ADP-ribose (OAADPr) family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. This family includes eukaryotic macrodomain proteins such as human MacroD1 and MacroD2, and bacterial proteins such as Escherichia coli YmdB; these have been shown to be O-acetyl-ADP-ribose (OAADPr) deacetylases that efficiently catalyze the hydrolysis of OAADPr to produce ADP-ribose and free acetate. OAADPr is a sirtuin reaction product generated from the NAD+-dependent protein deacetylation reactions and has been implicated as a signaling molecule. By acting on mono-ADP-ribosylated substrates, OAADPr deacetylases may reverse cellular ADP-ribosylation. 166
23101 380374 cd02909 cupin_pirin_N pirin, N-terminal cupin domain. This family contains the N-terminal domain of pirin, a nuclear protein that is highly conserved among mammals, plants, fungi, and prokaryotes. It is widely expressed in dot-like subnuclear structures in human tissues such as liver and heart. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. The pirins have been assigned as a subfamily of the cupin superfamily based on structure and sequence similarity. The pirins have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercitinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold generally capable of homodimerization. 104
23102 380375 cd02910 cupin_Yhhw_N Escherichia coli YhhW and YhaK and related proteins, pirin-like bicupin, N-terminal cupin domain. This family includes the N-terminal cupin domains of YhhW and YhaK, Escherichia coli pirin-like proteins with unknown function. YhhW is structurally similar not only to human pirin but also to quercitin 2,3-dioxygenase (quercitinase). Although the function of YhhW is not completely understood, YhhW and its human ortholog have quercitinase activity and are likely to play an important role in transcription and apoptosis. This N-terminal cupin domain of YhhW has a metal coordination site and is thought to have catalytic activity while the C-terminal cupin-like domain has diverged considerably and has closer alignment with C-terminal pirin. YhaK is found in low abundance in the cytosol of E. coli and is strongly up-regulated by nitroso-glutathione (GSNO). There are major structural differences at the N-terminus of YhaK compared with YhhW; YhaK lacks the canonical cupin metal-binding residues of pirins and may be involved in chloride binding and/or sensing of oxidative stress in enterobacteria. YhaK showed no quercetinase and peroxidase activity; however, reduced YhaK was very sensitive to reactive oxygen species (ROS). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 119
23103 239237 cd02911 arch_FMN Archaeal FMN-binding domain. This family of archaeal proteins is part of the NAD(P)H-dependent flavin oxidoreductase (oxidored) FMN-binding family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. The specific function of this group is unknown. 233
23104 239238 cd02922 FCB2_FMN Flavocytochrome b2 (FCB2) FMN-binding domain. FCB2 (AKA L-lactate:cytochrome c oxidoreductase) is a respiratory enzyme located in the intermembrane space of fungal mitochondria which catalyzes the oxidation of L-lactate to pyruvate. FCB2 also participates in a short electron-transport chain involving cytochrome c and cytochrome oxidase which ultimately directs the reducing equivalents gained from L-lactate oxidation to oxygen, yielding one molecule of ATP for every L-lactate molecule consumed. FCB2 is composed of 2 domains: a C-terminal flavin-binding domain, which includes the active site for lactate oxidation, and an N-terminal b2-cytochrome domain, required for efficient cytochrome c reduction. FCB2 is a homotetramer and contains two noncovalently bound cofactors, FMN and heme, per subunit. 344
23105 239239 cd02929 TMADH_HD_FMN Trimethylamine dehydrogenase (TMADH) and histamine dehydrogenase (HD) FMN-binding domain. TMADH is an iron-sulfur flavoprotein that catalyzes the oxidative demethylation of trimethylamine to form dimethylamine and formaldehyde. The protein forms a symmetrical dimer with each subunit containing one 4Fe-4S cluster and one FMN cofactor. It contains a unique flavin, in the form of a 6-S-cysteinyl FMN which is bent by ~25 degrees along the N5-N10 axis of the flavin isoalloxazine ring. This modification of the conformation of the flavin is thought to facilitate catalysis. The closely related histamine dehydrogenase catalyzes oxidative deamination of histamine. 370
23106 239240 cd02930 DCR_FMN 2,4-dienoyl-CoA reductase (DCR) FMN-binding domain. DCR in E. coli is an iron-sulfur flavoenzyme which contains FMN, FAD, and a 4Fe-4S cluster. It is also a monomer, unlike its eukaryotic counterparts, which form homotetramers and lack the flavin and iron-sulfur cofactors. Metabolism of unsaturated fatty acids requires auxiliary enzymes in addition to those used in beta-oxidation. After a given number of cycles through the beta-oxidation pathway, those unsaturated fatty acyl-CoAs with double bonds at even-numbered carbon positions contain 2-trans, 4-cis double bonds that cannot be modified by enoyl-CoA hydratase. DCR utilizes NADPH to remove the C4-C5 double bond. DCR can catalyze the reduction of both natural fatty acids with cis double bonds, as well as substrates containing trans double bonds. The reaction is initiated by hydride transfer from NADPH to FAD, which in turn transfers electrons, one at a time, to FMN via the 4Fe-4S cluster. The fully reduced FMN provides a hydride ion to the C5 atom of substrate, and Tyr and His are proposed to form a catalytic dyad that protonates the C4 atom of the substrate and completes the reaction. 353
23107 239241 cd02931 ER_like_FMN Enoate reductase (ER)-like FMN-binding domain. Enoate reductase catalyzes the NADH-dependent reduction of carbon-carbon double bonds of several molecules, including nonactivated 2-enoates, alpha,beta-unsaturated aldehydes, cyclic ketones, and methylketones. ERs are similar to 2,4-dienoyl-CoA reductase from E. coli and to the old yellow enzyme from Saccharomyces cerevisiae. 382
23108 239242 cd02932 OYE_YqiM_FMN Old yellow enzyme (OYE) YqjM-like FMN binding domain. YqjM is involved in the oxidative stress response of Bacillus subtilis. Like the other OYE members, each monomer of YqjM contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent. The YqjM enzyme exists as a homotetramer that is assembled as a dimer of catalytically dependent dimers, while other OYE members exist only as monomers or dimers. Moreover, the protein displays a shared active site architecture where an arginine finger at the COOH terminus of one monomer extends into the active site of the adjacent monomer and is directly involved in substrate recognition. Another remarkable difference in the binding of the ligand in YqjM is represented by the contribution of the NH2-terminal tyrosine instead of a COOH-terminal tyrosine in OYE and its homologs. 336
23109 239243 cd02933 OYE_like_FMN Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Members of OYE family include 12-oxophytodienoate reductase, pentaerythritol tetranitrate reductase, morphinone reductase, and related enzymes. 338
23110 239244 cd02940 DHPD_FMN Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain. DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN, and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass the dimer interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue. 299
23111 239245 cd02947 TRX_family TRX family; composed of two groups: Group I, which includes proteins that exclusively encode a TRX domain; and Group II, which are composed of fusion proteins of TRX and additional domains. Group I TRX is a small ancient protein that alters the redox state of target proteins via the reversible oxidation of an active site dithiol, present in a CXXC motif, partially exposed at the protein's surface. TRX reduces protein disulfide bonds, resulting in a disulfide bond at its active site. Oxidized TRX is converted to the active form by TRX reductase, using reducing equivalents derived from either NADPH or ferredoxins. By altering their redox state, TRX regulates the functions of at least 30 target proteins, some of which are enzymes and transcription factors. It also plays an important role in the defense against oxidative stress by directly reducing hydrogen peroxide and certain radicals, and by serving as a reductant for peroxiredoxins. At least two major types of functional TRXs have been reported in most organisms; in eukaryotes, they are located in the cytoplasm and the mitochondria. Higher plants contain more types (at least 20 TRX genes have been detected in the genome of Arabidopsis thaliana), two of which (types f and m) are located in the same compartment, the chloroplast. Also included in the alignment are TRX-like domains which show sequence homology to TRX but do not contain the redox active CXXC motif. Group II proteins, in addition to either a redox active TRX or a TRX-like domain, also contain additional domains, which may or may not possess homology to known proteins. 93
23112 239246 cd02948 TRX_NDPK TRX domain, TRX and NDP-kinase (NDPK) fusion protein family; most members of this group are fusion proteins which contain one redox active TRX domain containing a CXXC motif and three NDPK domains, and are characterized as intermediate chains (ICs) of axonemal outer arm dynein. Dyneins are molecular motors that generate force against microtubules to produce cellular movement, and are divided into two classes: axonemal and cytoplasmic. They are supramolecular complexes consisting of three protein groups classified according to size: dynein heavy, intermediate and light chains. Axonemal dyneins form two structures, the inner and outer arms, which are attached to doublet microtubules throughout the cilia and flagella. The human homolog is the sperm-specific Sptrx-2, presumed to be a component of the human sperm axoneme architecture. Included in this group is another human protein, TRX-like protein 2, a smaller fusion protein containing one TRX and one NDPK domain, which is also associated with microtubular structures. The other members of this group are hypothetical insect proteins containing a TRX domain and outer arm dynein light chains (14 and 16kDa) of Chlamydomonas reinhardtii. Using standard assays, the fusion proteins have shown no TRX enzymatic activity. 102
23113 239247 cd02949 TRX_NTR TRX domain, novel NADPH thioredoxin reductase (NTR) family; composed of fusion proteins found only in oxygenic photosynthetic organisms containing both TRX and NTR domains. The TRX domain functions as a protein disulfide reductase via the reversible oxidation of an active center dithiol present in a CXXC motif, while the NTR domain functions as a reductant to oxidized TRX. The fusion protein is bifunctional, showing both TRX and NTR activities, but it is not an independent NTR/TRX system. In plants, the protein is found exclusively in shoots and mature leaves and is localized in the chloroplast. It is involved in plant protection against oxidative stress. 97
23114 239248 cd02950 TxlA TRX-like protein A (TxlA) family; TxlA was originally isolated from the cyanobacterium Synechococcus. It is found only in oxygenic photosynthetic organisms. TRX is a small enzyme that participates in redox reactions, via the reversible oxidation of an active site dithiol present in a CXXC motif. Disruption of the txlA gene suggests that the protein is involved in the redox regulation of the structure and function of photosynthetic apparatus. The plant homolog (designated as HCF164) is localized in the chloroplast and is involved in the assembly of the cytochrome b6f complex, which takes a central position in photosynthetic electron transport. 142
23115 239249 cd02951 SoxW SoxW family; SoxW is a bacterial periplasmic TRX, containing a redox active CXXC motif, encoded by a genetic locus (sox operon) involved in thiosulfate oxidation. Sulfur bacteria oxidize sulfur compounds to provide reducing equivalents for carbon dioxide fixation during autotrophic growth and the respiratory electron transport chain. It is unclear what the role of SoxW is, since it has been found to be dispensable in the oxidation of thiosulfate to sulfate. SoxW is specifically kept in the reduced state by SoxV, which is essential in thiosulfate oxidation. 125
23116 239250 cd02952 TRP14_like Human TRX-related protein 14 (TRP14)-like family; composed of proteins similar to TRP14, a 14kD cytosolic protein that shows disulfide reductase activity in vitro with a different substrate specificity compared with another human cytosolic protein, TRX1. TRP14 catalyzes the reduction of small disulfide-containing peptides but does not reduce disulfides of ribonucleotide reductase, peroxiredoxin and methionine sulfoxide reductase, which are TRX1 substrates. TRP14 also plays a role in tumor necrosis factor (TNF)-alpha signaling pathways, distinct from that of TRX1. Its depletion promoted TNF-alpha induced activation of c-Jun N-terminal kinase and mitogen-activated protein kinases. 119
23117 239251 cd02953 DsbDgamma DsbD gamma family; DsbD gamma is the C-terminal periplasmic domain of the bacterial protein DsbD. It contains a CXXC motif in a TRX fold and shuttles the reducing potential from the membrane domain (DsbD beta) to the N-terminal periplasmic domain (DsbD alpha). DsbD beta, a transmembrane domain comprising eight helices, acquires its reducing potential from the cytoplasmic thioredoxin. DsbD alpha transfers the acquired reducing potential from DsbD gamma to target proteins such as the periplasmic protein disulphide isomerases, DsbC and DsbG. This flow of reducing potential from the cytoplasm through DsbD allows DsbC and DsbG to act as isomerases in the oxidizing environment of the bacterial periplasm. DsbD also transfers reducing potential from the cytoplasm to specific reductases in the periplasm which are involved in the maturation of cytochromes. 104
23118 239252 cd02954 DIM1 Dim1 family; Dim1 is also referred to as U5 small nuclear ribonucleoprotein particle (snRNP)-specific 15kD protein. It is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 interacts with multiple splicing-associated proteins, suggesting that it functions at multiple control points in the splicing of pre-mRNA as part of a large spliceosomal complex involving many protein-protein interactions. U5 snRNP contains seven core proteins (common to all snRNPs) and nine U5-specific proteins, one of which is Dim1. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif. It is essential for G2/M phase transition, as a consequence of its role in pre-mRNA splicing. 114
23119 239253 cd02955 SSP411 TRX domain, SSP411 protein family; members of this family are highly conserved proteins present in eukaryotes, bacteria and archaea, about 600-800 amino acids in length, which contain a TRX domain with a redox active CXXC motif. The human/rat protein, called SSP411, is specifically expressed in the testis in an age-dependent manner. The SSP411 mRNA is increased during spermiogenesis and is localized in round and elongated spermatids, suggesting a function in fertility regulation. 124
23120 239254 cd02956 ybbN ybbN protein family; ybbN is a hypothetical protein containing a redox-inactive TRX-like domain. Its gene has been sequenced from several gammaproteobacteria and actinobacteria. 96
23121 239255 cd02957 Phd_like Phosducin (Phd)-like family; composed of Phd and Phd-like proteins (PhLP), characterized as cytosolic regulators of G protein functions. Phd and PhLPs specifically bind G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane and impeding G protein-mediated signal transduction by inhibiting the formation of a functional G protein trimer (G protein alphabetagamma). Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated to play an important role in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. Also included in this family is a PhLP characterized as a viral inhibitor of apoptosis (IAP)-associated factor, named VIAF, that functions in caspase activation during apoptosis. 113
23122 239256 cd02958 UAS UAS family; UAS is a domain of unknown function. Most members of this family are uncharacterized proteins with similarity to FAS-associated factor 1 (FAF1) and ETEA because of the presence of a UAS domain N-terminal to a ubiquitin-associated UBX domain. FAF1 is a longer protein, compared to the other members of this family, having additional N-terminal domains, a ubiquitin-associated UBA domain and a nuclear targeting domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. ETEA is the protein product of a highly expressed gene in T-cells and eosinophils of atopic dermatitis patients. The presence of the ubiquitin-associated UBX domain in the proteins of this family suggests the possibility of their involvement in ubiquitination. Recently, FAF1 has been shown to interact with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. Some members of this family are uncharacterized proteins containing only a UAS domain. 114
23123 239257 cd02959 ERp19 Endoplasmic reticulum protein 19 (ERp19) family; ERp19 is also known as ERp18, a protein located in the ER containing one redox active TRX domain. Denaturation studies indicate that the reduced form is more stable than the oxidized form, suggesting that the protein is involved in disulfide bond formation. In vitro, ERp19 has been shown to possess thiol-disulfide oxidase activity which is dependent on the presence of both active site cysteines. Although described as protein disulfide isomerase (PDI)-like, the protein does not complement for PDI activity. ERp19 shows a wide tissue distribution but is most abundant in liver, testis, heart and kidney. 117
23124 239258 cd02960 AGR Anterior Gradient (AGR) family; members of this family are similar to secreted proteins encoded by the cement gland-specific genes XAG-1 and XAG-2, expressed in the anterior region of dorsal ectoderm of Xenopus. They are implicated in the formation of the cement gland and the induction of forebrain fate. The human homologs, hAG-2 and hAG-3, are secreted proteins associated with estrogen-positive breast tumors. Yeast two-hybrid studies identified the metastasis-associated C4.4a protein and dystroglycan as binding partners, indicating possible roles in the development and progression of breast cancer. hAG-2 has also been implicated in prostate cancer. Its gene was cloned as an androgen-inducible gene and it was shown to be overexpressed in prostate cancer cells at the mRNA and protein levels. AGR proteins contain one conserved cysteine corresponding to the first cysteine in the CXXC motif of TRX. They show high sequence similarity to ERp19. 130
23125 239259 cd02961 PDI_a_family Protein Disulfide Isomerase (PDIa) family, redox active TRX domains; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI and PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5, PDIR, ERp46 and the transmembrane PDIs. PDI, ERp57, ERp72, P5, PDIR and ERp46 are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins usually contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and may also contain one or more redox inactive TRX-like (b) domains. Only one a domain is required for the oxidase function but multiple copies are necessary for the isomerase function. The different types of PDIs may show different substrate specificities and tissue-specific expression, or may be induced by stress. PDIs are in their reduced form at steady state and are oxidized to the active form by Ero1, which is localized in the ER through ERp44. Some members of this family also contain a DnaJ domain in addition to the redox active a domains; examples are ERdj5 and Pfj2. Also included in the family is the redox inactive N-terminal TRX-like domain of ERp29. 101
23126 239260 cd02962 TMX2 TMX2 family; composed of proteins similar to human TMX2, a 372-amino acid TRX-related transmembrane protein, identified and characterized through the cloning of its cDNA from a human fetal library. It contains a TRX domain but the redox active CXXC motif is replaced with SXXC. Sequence analysis predicts that TMX2 may be a Type I membrane protein, with its C-terminal half protruding on the luminal side of the endoplasmic reticulum (ER). In addition to the TRX domain, transmembrane region and ER-retention signal, TMX2 also contains a Myb DNA-binding domain repeat signature and a dileucine motif in the tail. 152
23127 239261 cd02963 TRX_DnaJ TRX domain, DnaJ domain containing protein family; composed of uncharacterized proteins of about 500-800 amino acids, containing an N-terminal DnaJ domain followed by one redox active TRX domain. DnaJ is a member of the 40 kDa heat-shock protein (Hsp40) family of molecular chaperones, which regulate the activity of Hsp70s. TRX is involved in the redox regulation of many protein substrates through the reduction of disulfide bonds. TRX has been implicated to catalyse the reduction of Hsp33, a chaperone holdase that binds to unfolded protein intermediates. The presence of DnaJ and TRX domains in members of this family suggests that they could be involved in a redox-regulated chaperone network. 111
23128 239262 cd02964 TryX_like_family Tryparedoxin (TryX)-like family; composed of TryX and related proteins including nucleoredoxin (NRX), rod-derived cone viability factor (RdCVF) and the nematode homolog described as a 16-kD class of TRX. Most members of this family, except RdCVF, are protein disulfide oxidoreductases containing an active site CXXC motif, similar to TRX. 132
23129 239263 cd02965 HyaE HyaE family; HyaE is also called HupG and HoxO. They are proteins serving a critical role in the assembly of multimeric [NiFe] hydrogenases, the enzymes that catalyze the oxidation of molecular hydrogen to enable microorganisms to utilize hydrogen as the sole energy source. The E. coli HyaE protein is a chaperone that specifically interacts with the twin-arginine translocation (Tat) signal peptide of the [NiFe] hydrogenase-1 beta subunit precursor. Tat signal peptides target precursor proteins to the Tat protein export system, which facilitates the transport of fully folded proteins across the inner membrane. HyaE may be involved in regulating the traffic of [NiFe] hydrogenase-1 on the Tat transport pathway. 111
23130 239264 cd02966 TlpA_like_family TlpA-like family; composed of TlpA, ResA, DsbE and similar proteins. TlpA, ResA and DsbE are bacterial protein disulfide reductases with important roles in cytochrome maturation. They are membrane-anchored proteins with a soluble TRX domain containing a CXXC motif located in the periplasm. The TRX domains of this family contain an insert, approximately 25 residues in length, which corresponds to an extra alpha helix and a beta strand when compared with TRX. TlpA catalyzes an essential reaction in the biogenesis of cytochrome aa3, while ResA and DsbE are essential proteins in cytochrome c maturation. Also included in this family are proteins containing a TlpA-like TRX domain with domain architectures similar to E. coli DipZ protein, and the N-terminal TRX domain of PilB protein from Neisseria which acts as a disulfide reductase that can recycle methionine sulfoxide reductases. 116
23131 239265 cd02967 mauD Methylamine utilization (mau) D family; mauD protein is the translation product of the mauD gene found in methylotrophic bacteria, which are able to use methylamine as a sole carbon source and a nitrogen source. mauD is an essential accessory protein for the biosynthesis of methylamine dehydrogenase (MADH), the enzyme that catalyzes the oxidation of methylamine and other primary amines. MADH possesses an alpha2beta2 subunit structure; the alpha subunit is also referred to as the large subunit. Each beta (small) subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group. Accessory proteins are essential for the proper transport of MADH to the periplasm, TTQ synthesis and the formation of several structural disulfide bonds. Bacterial mutants containing an insertion on the mauD gene were unable to grow on methylamine as a sole carbon source, were found to lack the MADH small subunit and had decreased amounts of the MADH large subunit. 114
23132 239266 cd02968 SCO SCO (an acronym for Synthesis of Cytochrome c Oxidase) family; composed of proteins similar to Sco1, a membrane-anchored protein possessing a soluble domain with a TRX fold. Members of this family are required for the proper assembly of cytochrome c oxidase (COX). They contain a metal binding motif, typically CXXXC, which is located in a flexible loop. COX, the terminal enzyme in the respiratory chain, is imbedded in the inner mitochondrial membrane of all eukaryotes and in the plasma membrane of some prokaryotes. It is composed of two subunits, COX I and COX II. It has been proposed that Sco1 specifically delivers copper to the CuA site, a dinuclear copper center, of the COX II subunit. Mutations in human Sco1 and Sco2 cause fatal infantile hepatoencephalomyopathy and cardioencephalomyopathy, respectively. Both disorders are associated with severe COX deficiency in affected tissues. More recently, it has been argued that the redox sensitivity of the copper binding properties of Sco1 implies that it participates in signaling events rather than functioning as a chaperone that transfers copper to COX II. 142
23133 239267 cd02969 PRX_like1 Peroxiredoxin (PRX)-like 1 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a conserved cysteine that aligns to the first cysteine in the CXXC motif of TRX. This does not correspond to the peroxidatic cysteine found in PRXs, which aligns to the second cysteine in the CXXC motif of TRX. In addition, these proteins do not contain the other two conserved residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. 171
23134 239268 cd02970 PRX_like2 Peroxiredoxin (PRX)-like 2 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a CXXC motif, similar to TRX. The second cysteine in the motif corresponds to the peroxidatic cysteine of PRX; however, these proteins do not contain the other two residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. TRXs alter the redox state of target proteins by catalyzing the reduction of their disulfide bonds via the CXXC motif using reducing equivalents derived from either NADPH or ferredoxins. 149
23135 239269 cd02971 PRX_family Peroxiredoxin (PRX) family; composed of the different classes of PRXs including many proteins originally known as bacterioferritin comigratory proteins (BCP), based on their electrophoretic mobility before their function was identified. PRXs are thiol-specific antioxidant (TSA) proteins also known as TRX peroxidases and alkyl hydroperoxide reductase C22 (AhpC) proteins. They confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from TRX, glutathione, trypanothione or AhpF. They are distinct from other peroxidases in that they have no cofactors such as metals or prosthetic groups. The first step of catalysis, common to all PRXs, is the nucleophilic attack by the catalytic cysteine (also known as the peroxidatic cysteine) on the peroxide leading to cleavage of the oxygen-oxygen bond and the formation of a cysteine sulfenic acid intermediate. The second step of the reaction, the resolution of the intermediate, distinguishes the different types of PRXs. The presence or absence of a second cysteine (the resolving cysteine) classifies PRXs as either belonging to the 2-cys or 1-cys type. The resolving cysteine of 2-cys PRXs is either on the same chain (atypical) or on the second chain (typical) of a functional homodimer. Structural and motif analysis of this growing family supports the need for a new classification system. The peroxidase activity of PRXs is regulated in vivo by irreversible cysteine over-oxidation into a sulfinic acid, phosphorylation and limited proteolysis. 140
23136 239270 cd02972 DsbA_family DsbA family; consists of DsbA and DsbA-like proteins, including DsbC, DsbG, glutathione (GSH) S-transferase kappa (GSTK), 2-hydroxychromene-2-carboxylate (HCCA) isomerase, an oxidoreductase (FrnE) presumed to be involved in frenolicin biosynthesis, a 27-kDa outer membrane protein, and similar proteins. Members of this family contain a redox active CXXC motif (except GSTK and HCCA isomerase) imbedded in a TRX fold, and an alpha helical insert of about 75 residues (shorter in DsbC and DsbG) relative to TRX. DsbA is involved in the oxidative protein folding pathway in prokaryotes, catalyzing disulfide bond formation of proteins secreted into the bacterial periplasm. DsbC and DsbG function as protein disulfide isomerases and chaperones to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. 98
23137 239271 cd02973 TRX_GRX_like Thioredoxin (TRX)-Glutaredoxin (GRX)-like family; composed of archaeal and bacterial proteins that show similarity to both TRX and GRX, including the C-terminal TRX-fold subdomain of Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). All members contain a redox-active CXXC motif and may function as PDOs. The archaeal proteins Mj0307 and Mt807 show structures more similar to GRX, but activities more similar to TRX. Some members of the family are similar to PfPDO in that they contain a second CXXC motif located in a second TRX-fold subdomain at the N-terminus; the superimposable N- and C-terminal TRX subdomains form a compact structure. PfPDO is postulated to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI). The C-terminal CXXC motif of PfPDO is required for its oxidase, reductase and isomerase activities. Also included in the family is the C-terminal TRX-fold subdomain of the N-terminal domain (NTD) of bacterial AhpF, which has a similar fold as PfPDO with two TRX-fold subdomains but without the second CXXC motif. 67
23138 239272 cd02974 AhpF_NTD_N Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) family, N-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD forming two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. The N-terminal TRX-fold subdomain of AhpF NTD is redox inactive, but is proposed to contain an important residue that aids in the catalytic function of the redox-active CXXC motif contained in the C-terminal TRX-fold subdomain. 94
23139 239273 cd02975 PfPDO_like_N Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO)-like family, N-terminal TRX-fold subdomain; composed of proteins with similarity to PfPDO, a redox active thermostable protein believed to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI), which are both involved in oxidative protein folding. PfPDO contains two redox active CXXC motifs in two contiguous TRX-fold subdomains. The active site in the N-terminal TRX-fold subdomain is required for isomerase but not for reductase activity of PfPDO. The exclusive presence of PfPDO-like proteins in extremophiles may suggest that they have a special role in adaptation to extreme conditions. 113
23140 239274 cd02976 NrdH NrdH-redoxin (NrdH) family; NrdH is a small monomeric protein with a conserved redox active CXXC motif within a TRX fold, characterized by a glutaredoxin (GRX)-like sequence and TRX-like activity profile. In vitro, it displays protein disulfide reductase activity that is dependent on TRX reductase, not glutathione (GSH). It is part of the NrdHIEF operon, where NrdEF codes for class Ib ribonucleotide reductase (RNR-Ib), an efficient enzyme at low oxygen levels. Under these conditions when GSH is mostly conjugated to spermidine, NrdH can still function and act as a hydrogen donor for RNR-Ib. It has been suggested that the NrdHEF system may be the oldest RNR reducing system, capable of functioning in a microaerophilic environment, where GSH was not yet available. NrdH from Corynebacterium ammoniagenes can form domain-swapped dimers, although it is unknown if this happens in vivo. Domain-swapped dimerization, which results in the blocking of the TRX reductase binding site, could be a mechanism for regulating the oxidation state of the protein. 73
23141 239275 cd02977 ArsC_family Arsenate Reductase (ArsC) family; composed of TRX-fold arsenic reductases and similar proteins including the transcriptional regulator, Spx. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX), through a single catalytic cysteine. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases. Spx is a general regulator that exerts negative and positive control over transcription initiation by binding to the C-terminal domain of the alpha subunit of RNA polymerase. 105
23142 239276 cd02978 KaiB_like KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA. KaiB is an essential protein in maintaining circadian rhythm. It was originally discovered from the cyanobacterium Synechococcus as part of the circadian clock gene cluster, kaiABC. KaiB attenuates KaiA-enhanced KaiC autokinase activity by interacting with KaiA-KaiC complexes in a circadian fashion. KaiB is membrane-associated as well as cytosolic. The amount of membrane-associated protein peaks in the evening (at circadian time (CT) 12-16) while the cytosolic form peaks later (at CT 20). The rhythmic localization of KaiB may function in regulating the formation of Kai complexes. SasA is a sensory histidine kinase which associates with KaiC. Although it is not an essential oscillator component, it is important in enhancing kaiABC expression and is important in metabolic growth control under day/night cycle conditions. SasA contains an N-terminal sensory domain with a TRX fold which is involved in the SasA-KaiC interaction. This domain shows high sequence similarity with KaiB. However, the KaiB structure does not show a classical TRX fold. The N-terminal half of KaiB shares the same beta-alpha-beta topology as TRX, but the topology of its C-terminal half diverges. 72
23143 239277 cd02979 PHOX_C FAD-dependent Phenol hydroxylase (PHOX) family, C-terminal TRX-fold domain; composed of proteins similar to PHOX from the aerobic topsoil yeast Trichosporon cutaneum. PHOX is a flavoprotein monooxygenase that catalyzes the hydroxylation of phenol and simple phenol derivatives in the ortho position with the consumption of NADPH and oxygen. This is the first step in the biodegradation and detoxification of phenolic compounds. PHOX contains three domains. The substrate and FAD/NAD(P) binding sites are contained in the first two domains, which adopt a complicated folding pattern. The third or C-terminal domain contains a TRX fold and is involved in dimerization. The functional unit of PHOX is a dimer, although active tetramers of the recombinant enzyme can be isolated when overproduced in bacteria. 167
23144 239278 cd02980 TRX_Fd_family Thioredoxin (TRX)-like [2Fe-2S] Ferredoxin (Fd) family; composed of [2Fe-2S] Fds with a TRX fold (TRX-like Fds) and proteins containing domains similar to TRX-like Fd including formate dehydrogenases, NAD-reducing hydrogenases and the subunit E of NADH:ubiquinone oxidoreductase (NuoE). TRX-like Fds are soluble low-potential electron carriers containing a single [2Fe-2S] cluster. The exact role of TRX-like Fd is still unclear. It has been suggested that it may be involved in nitrogen fixation. Its homologous domains in large redox enzymes (such as Nuo and hydrogenases) function as electron carriers. 77
23145 239279 cd02981 PDI_b_family Protein Disulfide Isomerase (PDIb) family, redox inactive TRX-like domain b; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57, ERp44 and PDIR. PDI, ERp57 (or ERp60), ERp72 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive b domains are implicated in substrate recognition. 97
23146 239280 cd02982 PDI_b'_family Protein Disulfide Isomerase (PDIb') family, redox inactive TRX-like domain b'; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5 and PDIR. PDI, ERp57, ERp72, P5 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive domains are implicated in substrate recognition with the b' domain serving as the primary substrate binding site. Only the b' domain is necessary for the binding of small peptide substrates. In addition to the b' domain, other domains are required for the binding of larger polypeptide substrates. The b' domain is also implicated in chaperone activity. 103
23147 239281 cd02983 P5_C P5 family, C-terminal redox inactive TRX-like domain; P5 is a protein disulfide isomerase (PDI)-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. The C-terminal domain is likely involved in substrate binding, similar to the b and b' domains of PDI. 130
23148 239282 cd02984 TRX_PICOT TRX domain, PICOT (for PKC-interacting cousin of TRX) subfamily; PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT contains an N-terminal TRX-like domain, which does not contain the catalytic CXXC motif, followed by one to three glutaredoxin domains. The TRX-like domain is required for interaction with PKC theta. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli. 97
23149 239283 cd02985 TRX_CDSP32 TRX family, chloroplastic drought-induced stress protein of 32 kD (CDSP32); CDSP32 is composed of two TRX domains, a C-terminal TRX domain which contains a redox active CXXC motif and an N-terminal TRX-like domain which contains an SXXS sequence instead of the redox active motif. CDSP32 is a stress-inducible TRX, i.e., it acts as a TRX by reducing protein disulfides and is induced by environmental and oxidative stress conditions. It plays a critical role in plastid defense against oxidative damage, a role related to its function as a physiological electron donor to BAS1, a plastidic 2-cys peroxiredoxin. Plants lacking CDSP32 exhibit decreased photosystem II photochemical efficiencies and chlorophyll retention compared to WT controls, as well as an increased proportion of BAS1 in its overoxidized monomeric form. 103
23150 239284 cd02986 DLP Dim1 family, Dim1-like protein (DLP) subfamily; DLP is a novel protein which shares 38% sequence identity to Dim1. Like Dim1, it is also implicated in pre-mRNA splicing and cell cycle progression. DLP is located in the nucleus and has been shown to interact with the U5 small nuclear ribonucleoprotein particle (snRNP)-specific 102kD protein (or Prp6). Dim1 protein, also known as U5 snRNP-specific 15kD protein is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif. 114
23151 239285 cd02987 Phd_like_Phd Phosducin (Phd)-like family, Phd subfamily; Phd is a cytosolic regulator of G protein functions. It specifically binds G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane. This impedes the formation of a functional G protein trimer (G protein alphabetagamma), thereby inhibiting G protein-mediated signal transduction. Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated to play an important role in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. 175
23152 239286 cd02988 Phd_like_VIAF Phosducin (Phd)-like family, Viral inhibitor of apoptosis (IAP)-associated factor (VIAF) subfamily; VIAF is a Phd-like protein that functions in caspase activation during apoptosis. It was identified as an IAP binding protein through a screen of a human B-cell library using a prototype IAP. VIAF lacks a consensus IAP binding motif and while it does not function as an IAP antagonist, it still plays a regulatory role in the complete activation of caspases. VIAF itself is a substrate for IAP-mediated ubiquitination, suggesting that it may be a target of IAPs in the prevention of cell death. The similarity of VIAF to Phd points to a potential role distinct from apoptosis regulation. Phd functions as a cytosolic regulator of G protein by specifically binding to G protein betagamma (Gbg)-subunits. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. 192
23153 239287 cd02989 Phd_like_TxnDC9 Phosducin (Phd)-like family, Thioredoxin (TRX) domain containing protein 9 (TxnDC9) subfamily; composed of predominantly uncharacterized eukaryotic proteins, containing a TRX-like domain without the redox active CXXC motif. The gene name for the human protein is TxnDC9. The two characterized members are described as Phd-like proteins, PLP1 of Saccharomyces cerevisiae and PhLP3 of Dictyostelium discoideum. Gene disruption experiments show that both PLP1 and PhLP3 are non-essential proteins. Unlike Phd and most Phd-like proteins, members of this group do not contain the Phd N-terminal helical domain which is implicated in binding to the G protein betagamma subunit. 113
23154 239288 cd02990 UAS_FAF1 UAS family, FAS-associated factor 1 (FAF1) subfamily; FAF1 contains a UAS domain of unknown function N-terminal to a ubiquitin-associated UBX domain. FAF1 also contains ubiquitin-associated UBA and nuclear targeting domains, N-terminal to the UAS domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. It is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kB (NF-kB) by interfering with the nuclear translocation of the p65 subunit. FAF1 also interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. 136
23155 239289 cd02991 UAS_ETEA UAS family, ETEA subfamily; composed of proteins similar to human ETEA protein, the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. ETEA shows homology to Fas-associated factor 1 (FAF1); both containing UAS and UBX (ubiquitin-associated) domains. Compared to FAF1, however, ETEA lacks the ubiquitin-associated UBA domain and a nuclear targeting domain. The function of ETEA is still unknown. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that ETEA could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis. 116
23156 239290 cd02992 PDI_a_QSOX PDIa family, Quiescin-sulfhydryl oxidase (QSOX) subfamily; QSOX is a eukaryotic protein containing an N-terminal redox active TRX domain, similar to that of PDI, and a small C-terminal flavin adenine dinucleotide (FAD)-binding domain homologous to the yeast ERV1p protein. QSOX oxidizes thiol groups to disulfides like PDI, however, unlike PDI, this oxidation is accompanied by the reduction of oxygen to hydrogen peroxide. QSOX is localized in high concentrations in cells with heavy secretory load and prefers peptides and proteins as substrates, not monothiols like glutathione. Inside the cell, QSOX is found in the endoplasmic reticulum and Golgi. The flow of reducing equivalents in a QSOX-catalyzed reaction goes from the dithiol substrate -> dithiol of the QSOX TRX domain -> dithiols of the QSOX ERV1p domain -> FAD -> oxygen. 114
23157 239291 cd02993 PDI_a_APS_reductase PDIa family, 5'-Adenylylsulfate (APS) reductase subfamily; composed of plant-type APS reductases containing a C-terminal redox active TRX domain and an N-terminal reductase domain which is part of a superfamily that includes N type ATP PPases. APS reductase catalyzes the reduction of activated sulfate to sulfite, a key step in the biosynthesis of sulfur-containing metabolites. Sulfate is first activated by ATP sulfurylase, forming APS, which can be phosphorylated to 3'-phosphoadenosine-5'-phosphosulfate (PAPS). Depending on the organism, either APS or PAPS can be used for sulfate reduction. Prokaryotes and fungi use PAPS, whereas plants use both APS and PAPS. Since plant-type APS reductase uses glutathione (GSH) as its electron donor, the C-terminal domain may function like glutaredoxin, a GSH-dependent member of the TRX superfamily. The flow of reducing equivalents goes from GSH -> C-terminal TRX domain -> N-terminal reductase domain -> APS. Plant-type APS reductase shows no homology to that of dissimilatory sulfate-reducing bacteria, which is an iron-sulfur flavoenzyme. Also included in the alignment is EYE2 from Chlamydomonas reinhardtii, a protein required for eyespot assembly. 109
23158 239292 cd02994 PDI_a_TMX PDIa family, TMX subfamily; composed of proteins similar to the TRX-related human transmembrane protein, TMX. TMX is a type I integral membrane protein; the N-terminal redox active TRX domain is present in the endoplasmic reticulum (ER) lumen while the C-terminus is oriented towards the cytoplasm. It is expressed in many cell types and its active site motif (CPAC) is unique. In vitro, TMX reduces interchain disulfides of insulin and renatures inactive RNase containing incorrect disulfide bonds. The C. elegans homolog, DPY-11, is expressed only in the hypodermis and resides in the cytoplasm. It is required for body and sensory organ morphogenesis. Another uncharacterized TRX-related transmembrane protein, human TMX4, is included in the alignment. The active site sequence of TMX4 is CPSC. 101
23159 239293 cd02995 PDI_a_PDI_a'_C PDIa family, C-terminal TRX domain (a') subfamily; composed of the C-terminal redox active a' domains of PDI, ERp72, ERp57 (or ERp60) and EFP1. PDI, ERp72 and ERp57 are endoplasmic reticulum (ER)-resident eukaryotic proteins involved in oxidative protein folding. They are oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. PDI and ERp57 have the abb'a' domain structure (where a and a' are redox active TRX domains while b and b' are redox inactive TRX-like domains). PDI also contains an acidic region (c domain) after the a' domain that is absent in ERp57. ERp72 has an additional a domain at the N-terminus (a"abb'a' domain structure). ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins, while PDI shows a wider substrate specificity. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. EFP1 is a binding partner protein of thyroid oxidase, which is responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones. 104
23160 239294 cd02996 PDI_a_ERp44 PDIa family, endoplasmic reticulum protein 44 (ERp44) subfamily; ERp44 is an ER-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain, similar to that of PDIa, with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. The CXFS motif in the N-terminal domain allows ERp44 to form stable reversible mixed disulfides with its substrates. Through this activity, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. It also modulates the activity of inositol 1,4,5-trisphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. 108
23161 239295 cd02997 PDI_a_PDIR PDIa family, PDIR subfamily; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. 104
23162 239296 cd02998 PDI_a_ERp38 PDIa family, endoplasmic reticulum protein 38 (ERp38) subfamily; composed of proteins similar to the P5-like protein first isolated from alfalfa, which contains two redox active TRX (a) domains at the N-terminus, like human P5, and a C-terminal domain with homology to the C-terminal domain of ERp29, unlike human P5. The cDNA clone of this protein (named G1) was isolated from an alfalfa cDNA library by screening with human protein disulfide isomerase (PDI) cDNA. The G1 protein is constitutively expressed in all major organs of the plant and its expression is induced by treatment with tunicamycin, indicating that it may be a glucose-regulated protein. The G1 homolog in the eukaryotic social amoeba Dictyostelium discoideum is also described as a P5-like protein, which is located in the endoplasmic reticulum (ER) despite the absence of an ER-retrieval signal. G1 homologs from Aspergillus niger and Neurospora crassa have also been characterized, and are named TIGA and ERp38, respectively. Also included in the alignment is an atypical PDI from Leishmania donovani containing a single a domain, and the C-terminal a domain of a P5-like protein from Entamoeba histolytica. 105
23163 239297 cd02999 PDI_a_ERp44_like PDIa family, endoplasmic reticulum protein 44 (ERp44)-like subfamily; composed of uncharacterized PDI-like eukaryotic proteins containing only one redox active TRX (a) domain with a CXXS motif, similar to ERp44. CXXS is still a redox active motif; however, the mixed disulfide formed with the substrate is more stable than those formed by CXXC motif proteins. PDI-related proteins are usually involved in the oxidative protein folding in the ER by acting as catalysts and folding assistants. ERp44 is involved in thiol-mediated retention in the ER. 100
23164 239298 cd03000 PDI_a_TMX3 PDIa family, TMX3 subfamily; composed of eukaryotic proteins similar to human TMX3, a TRX related transmembrane protein containing one redox active TRX domain at the N-terminus and a classical ER retrieval sequence for type I transmembrane proteins at the C-terminus. The TMX3 transcript is found in a variety of tissues with the highest levels detected in skeletal muscle and the heart. In vitro, TMX3 showed oxidase activity albeit slightly lower than that of protein disulfide isomerase. 104
23165 239299 cd03001 PDI_a_P5 PDIa family, P5 subfamily; composed of eukaryotic proteins similar to human P5, a PDI-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. Some members of this subfamily are P5-like proteins containing only one redox active TRX domain. 103
23166 239300 cd03002 PDI_a_MPD1_like PDI family, MPD1-like subfamily; composed of eukaryotic proteins similar to Saccharomyces cerevisiae MPD1 protein, which contains a single redox active TRX domain located at the N-terminus, and an ER retention signal at the C-terminus indicative of an ER-resident protein. MPD1 has been shown to suppress the maturation defect of carboxypeptidase Y caused by deletion of the yeast PDI1 gene. Other characterized members of this subfamily include the Aspergillus niger prpA protein and Giardia PDI-1. PrpA is non-essential to strain viability, however, its transcript level is induced by heterologous protein expression suggesting a possible role in oxidative protein folding during high protein production. Giardia PDI-1 has the ability to refold scrambled RNase and exhibits transglutaminase activity. 109
23167 239301 cd03003 PDI_a_ERdj5_N PDIa family, N-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is comprised of the first TRX domain of ERdj5 located after the DnaJ domain at the N-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation. 101
23168 239302 cd03004 PDI_a_ERdj5_C PDIa family, C-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is composed of the three TRX domains located at the C-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation. Also included in the alignment is the single complete TRX domain of an uncharacterized protein from Tetraodon nigroviridis, which also contains a DnaJ domain at its N-terminus. 104
23169 239303 cd03005 PDI_a_ERp46 PDIa family, endoplasmic reticulum protein 46 (ERp46) subfamily; ERp46 is an ER-resident protein containing three redox active TRX domains. Yeast complementation studies show that ERp46 can substitute for protein disulfide isomerase (PDI) function in vivo. It has been detected in many tissues, however, transcript and protein levels do not correlate in all tissues, suggesting regulation at a posttranscriptional level. An identical protein, named endoPDI, has been identified as an endothelial PDI that is highly expressed in the endothelium of tumors and hypoxic lesions. It has a protective effect on cells exposed to hypoxia. 102
23170 239304 cd03006 PDI_a_EFP1_N PDIa family, N-terminal EFP1 subfamily; EFP1 is a binding partner protein of thyroid oxidase (ThOX), also called Duox. ThOX proteins are responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones. EFP1 was isolated through a yeast two-hybrid method using the EF-hand fragment of dog Duox1 as a bait. It could be one of the partners in the assembly of a multiprotein complex constituting the thyroid hydrogen peroxide generating system. EFP1 contains two TRX domains related to the redox active TRX domains of protein disulfide isomerase (PDI). This subfamily is composed of the N-terminal TRX domain of EFP1, which contains a CXXS sequence in place of the typical CXXC motif, similar to ERp44. The CXXS motif allows the formation of stable mixed disulfides, crucial for the ER-retention function of ERp44. 113
23171 239305 cd03007 PDI_a_ERp29_N PDIa family, endoplasmic reticulum protein 29 (ERp29) subfamily; ERp29 is a ubiquitous ER-resident protein expressed in high levels in secretory cells. It forms homodimers and higher oligomers in vitro and in vivo. It contains a redox inactive TRX-like domain at the N-terminus, which is homologous to the redox active TRX (a) domains of PDI, and a C-terminal helical domain similar to the C-terminal domain of P5. The expression profile of ERp29 suggests a role in secretory protein production distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase. 116
23172 239306 cd03008 TryX_like_RdCVF Tryparedoxin (TryX)-like family, Rod-derived cone viability factor (RdCVF) subfamily; RdCVF is a thioredoxin (TRX)-like protein specifically expressed in photoreceptors. RdCVF was isolated and identified as a factor that supports cone survival in retinal cultures. Cone photoreceptor loss is responsible for the visual handicap resulting from the inherited disease, retinitis pigmentosa. RdCVF shows 33% similarity to TRX but does not exhibit any detectable thiol oxidoreductase activity. 146
23173 239307 cd03009 TryX_like_TryX_NRX Tryparedoxin (TryX)-like family, TryX and nucleoredoxin (NRX) subfamily; TryX and NRX are thioredoxin (TRX)-like protein disulfide oxidoreductases that alter the redox state of target proteins via the reversible oxidation of an active center CXXC motif. TryX is involved in the regulation of oxidative stress in parasitic trypanosomatids by reducing TryX peroxidase, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. TryX derives reducing equivalents from reduced trypanothione, a polyamine peptide conjugate unique to trypanosomatids, which is regenerated by the NADPH-dependent flavoprotein trypanothione reductase. Vertebrate NRX is a 400-amino acid nuclear protein with one redox active TRX domain containing a CPPC active site motif followed by one redox inactive TRX-like domain. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. Plant NRX, longer than the vertebrate NRX by about 100-200 amino acids, is a nuclear protein containing a redox inactive TRX-like domain between two redox active TRX domains. Both vertebrate and plant NRXs show thiol oxidoreductase activity in vitro. Their localization in the nucleus suggests a role in the redox regulation of nuclear proteins such as transcription factors. 131
23174 239308 cd03010 TlpA_like_DsbE TlpA-like family, DsbE (also known as CcmG and CycY) subfamily; DsbE is a membrane-anchored, periplasmic TRX-like reductase containing a CXXC motif that specifically donates reducing equivalents to apocytochrome c via CcmH, another cytochrome c maturation (Ccm) factor with a redox active CXXC motif. Assembly of cytochrome c requires the ligation of heme to reduced thiols of the apocytochrome. In bacteria, this assembly occurs in the periplasm. The reductase activity of DsbE in the oxidizing environment of the periplasm is crucial in the maturation of cytochrome c. 127
23175 239309 cd03011 TlpA_like_ScsD_MtbDsbE TlpA-like family, suppressor for copper sensitivity D protein (ScsD) and actinobacterial DsbE homolog subfamily; composed of ScsD, the DsbE homolog of Mycobacterium tuberculosis (MtbDsbE) and similar proteins, all containing a redox-active CXXC motif. The Salmonella typhimurium ScsD is a thioredoxin-like protein which confers copper tolerance to copper-sensitive mutants of E. coli. MtbDsbE has been characterized as an oxidase in vitro, catalyzing the disulfide bond formation of substrates like hirudin. The reduced form of MtbDsbE is more stable than its oxidized form, consistent with an oxidase function. This is in contrast to the function of DsbE from gram-negative bacteria which is a specific reductase of apocytochrome c. 123
23176 239310 cd03012 TlpA_like_DipZ_like TlpA-like family, DipZ-like subfamily; composed of uncharacterized proteins containing a TlpA-like TRX domain. Some members show domain architectures similar to that of E. coli DipZ protein (also known as DsbD). The only eukaryotic members of the TlpA family belong to this subfamily. TlpA is a disulfide reductase known to have a crucial role in the biogenesis of cytochrome aa3. 126
23177 239311 cd03013 PRX5_like Peroxiredoxin (PRX) family, PRX5-like subfamily; members are similar to the human protein, PRX5, a homodimeric TRX peroxidase, widely expressed in tissues and found cellularly in mitochondria, peroxisomes and the cytosol. The cellular location of PRX5 suggests that it may have an important antioxidant role in organelles that are major sources of reactive oxygen species (ROS), as well as a role in the control of signal transduction. PRX5 has been shown to reduce hydrogen peroxide, alkyl hydroperoxides and peroxynitrite. As with all other PRXs, the N-terminal peroxidatic cysteine of PRX5 is oxidized into a sulfenic acid intermediate upon reaction with peroxides. Human PRX5 is able to resolve this intermediate by forming an intramolecular disulfide bond with its C-terminal cysteine (the resolving cysteine), which can then be reduced by TRX, just like an atypical 2-cys PRX. This resolving cysteine, however, is not conserved in other members of the subfamily. In such cases, it is assumed that the oxidized cysteine is directly resolved by an external small-molecule or protein reductant, typical of a 1-cys PRX. In the case of the H. influenzae PRX5 hybrid, the resolving glutaredoxin domain is on the same protein chain as PRX. PRX5 homodimers show an A-type interface, similar to atypical 2-cys PRXs. 155
23178 239312 cd03014 PRX_Atyp2cys Peroxiredoxin (PRX) family, Atypical 2-cys PRX subfamily; composed of PRXs containing peroxidatic and resolving cysteines, similar to the homodimeric thiol specific antioxidant (TSA) protein also known as TRX-dependent thiol peroxidase (Tpx). Tpx is a bacterial periplasmic peroxidase which differs from other PRXs in that it shows substrate specificity toward alkyl hydroperoxides over hydrogen peroxide. As with all other PRXs, the peroxidatic cysteine (N-terminal) of Tpx is oxidized into a sulfenic acid intermediate upon reaction with peroxides. Tpx is able to resolve this intermediate by forming an intramolecular disulfide bond with a conserved C-terminal cysteine (the resolving cysteine), which can then be reduced by thioredoxin. This differs from the typical 2-cys PRX which resolves the oxidized cysteine by forming an intermolecular disulfide bond with the resolving cysteine from the other subunit of the homodimer. Atypical 2-cys PRX homodimers have a loop-based interface (A-type for alternate), in contrast with the B-type interface of typical 2-cys and 1-cys PRXs. 143
23179 239313 cd03015 PRX_Typ2cys Peroxiredoxin (PRX) family, Typical 2-Cys PRX subfamily; PRXs are thiol-specific antioxidant (TSA) proteins, which confer a protective role in cells through their peroxidase activity by reducing hydrogen peroxide, peroxynitrite, and organic hydroperoxides. The functional unit of typical 2-cys PRX is a homodimer. A unique intermolecular redox-active disulfide center is utilized for its activity. Upon reaction with peroxides, its peroxidatic cysteine is oxidized into a sulfenic acid intermediate which is resolved by bonding with the resolving cysteine from the other subunit of the homodimer. This intermolecular disulfide bond is then reduced by thioredoxin, tryparedoxin or AhpF. Typical 2-cys PRXs, like 1-cys PRXs, form decamers which are stabilized by reduction of the active site cysteine. Typical 2-cys PRX interacts through beta strands at one edge of the monomer (B-type interface) to form the functional homodimer, and uses an A-type interface (similar to the dimeric interface in atypical 2-cys PRX and PRX5) at the opposite end of the monomer to form the stable decameric (pentamer of dimers) structure. 173
23180 239314 cd03016 PRX_1cys Peroxiredoxin (PRX) family, 1-cys PRX subfamily; composed of PRXs containing only one conserved cysteine, which serves as the peroxidatic cysteine. They are homodimeric thiol-specific antioxidant (TSA) proteins that confer a protective role in cells by reducing and detoxifying hydrogen peroxide, peroxynitrite, and organic hydroperoxides. As with all other PRXs, a cysteine sulfenic acid intermediate is formed upon reaction of 1-cys PRX with its substrates. Having no resolving cysteine, the oxidized enzyme is resolved by an external small-molecule or protein reductant such as thioredoxin or glutaredoxin. Similar to typical 2-cys PRX, 1-cys PRX forms a functional dimeric unit with a B-type interface, as well as a decameric structure which is stabilized in the reduced form of the enzyme. Other oligomeric forms, tetramers and hexamers, have also been reported. Mammalian 1-cys PRX is localized cellularly in the cytosol and is expressed at high levels in brain, eye, testes and lung. The seed-specific plant 1-cys PRXs protect tissues from reactive oxygen species during desiccation and are also called rehydrins. 203
23181 239315 cd03017 PRX_BCP Peroxiredoxin (PRX) family, Bacterioferritin comigratory protein (BCP) subfamily; composed of thioredoxin-dependent thiol peroxidases, widely expressed in pathogenic bacteria, that protect cells against toxicity from reactive oxygen species by reducing and detoxifying hydroperoxides. The protein was named BCP based on its electrophoretic mobility before its function was known. BCP shows substrate selectivity toward fatty acid hydroperoxides rather than hydrogen peroxide or alkyl hydroperoxides. BCP contains the peroxidatic cysteine but appears not to possess a resolving cysteine (some sequences, not all, contain a second cysteine but its role is still unknown). Unlike other PRXs, BCP exists as a monomer. The plant homolog of BCP is PRX Q, which is expressed only in leaves and is cellularly localized in the chloroplasts and the guard cells of stomata. Also included in this subfamily is the fungal nuclear protein, Dot5p (for disrupter of telomere silencing protein 5), which functions as an alkyl-hydroperoxide reductase during post-diauxic growth. 140
23182 239316 cd03018 PRX_AhpE_like Peroxiredoxin (PRX) family, AhpE-like subfamily; composed of proteins similar to Mycobacterium tuberculosis AhpE. AhpE is described as a 1-cys PRX because of the absence of a resolving cysteine. The structure and sequence of AhpE, however, show greater similarity to 2-cys PRXs than 1-cys PRXs. PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. The first step of catalysis is the nucleophilic attack by the peroxidatic cysteine on the peroxide leading to the formation of a cysteine sulfenic acid intermediate. The absence of a resolving cysteine suggests that functional AhpE is regenerated by an external reductant. The solution behavior and crystal structure of AhpE show that it forms dimers and octamers. 149
23183 239317 cd03019 DsbA_DsbA DsbA family, DsbA subfamily; DsbA is a monomeric thiol disulfide oxidoreductase protein containing a redox active CXXC motif imbedded in a TRX fold. It is involved in the oxidative protein folding pathway in prokaryotes, and is the strongest thiol oxidant known, due to the unusual stability of the thiolate anion form of the first cysteine in the CXXC motif. The highly unstable oxidized form of DsbA directly donates disulfide bonds to reduced proteins secreted into the bacterial periplasm. This rapid and unidirectional process helps to catalyze the folding of newly-synthesized polypeptides. To regain catalytic activity, reduced DsbA is then reoxidized by the membrane protein DsbB, which generates its disulfides from oxidized quinones, which in turn are reoxidized by the electron transport chain. 178
23184 239318 cd03020 DsbA_DsbC_DsbG DsbA family, DsbC and DsbG subfamily; V-shaped homodimeric proteins containing a redox active CXXC motif imbedded in a TRX fold. They function as protein disulfide isomerases and chaperones in the bacterial periplasm to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. DsbC and DsbG are kept in their reduced state by the cytoplasmic membrane protein DsbD, which utilizes the TRX/TRX reductase system in the cytosol as a source of reducing equivalents. DsbG differs from DsbC in that it has a more limited substrate specificity, and it may preferentially act later in the folding process to catalyze disulfide rearrangements in folded or partially folded proteins. Also included in the alignment is the predicted protein TrbB, whose gene was sequenced from the enterohemorrhagic E. coli type IV pilus gene cluster, which is required for efficient plasmid transfer. 197
23185 239319 cd03021 DsbA_GSTK DsbA family, Glutathione (GSH) S-transferase Kappa (GSTK) subfamily; GSTK is a member of the GST family of enzymes which catalyzes the transfer of the thiol of GSH to electrophilic substrates. It is specifically located in the mitochondria and peroxisomes, unlike other members of the canonical GST family, which are mainly cytosolic. The biological substrates of GSTK are not yet known. It is presumed to have a protective role during respiration when large amounts of reactive oxygen species are generated. GSTK has the same general fold as DsbA, consisting of a thioredoxin domain interrupted by an alpha-helical domain and its biological unit is a homodimer. GSTK is closely related to the bacterial enzyme, 2-hydroxychromene-2-carboxylate (HCCA) isomerase. It shows little sequence similarity to the other members of the GST family. 209
23186 239320 cd03022 DsbA_HCCA_Iso DsbA family, 2-hydroxychromene-2-carboxylate (HCCA) isomerase subfamily; HCCA isomerase is a glutathione (GSH) dependent enzyme involved in the naphthalene catabolic pathway. It converts HCCA, a hemiketal formed spontaneously after ring cleavage of 1,2-dihydroxynaphthalene by a dioxygenase, into cis-o-hydroxybenzylidenepyruvate (cHBPA). This is the fourth reaction in a six-step pathway that converts naphthalene into salicylate. HCCA isomerase is unique to bacteria that degrade polycyclic aromatic compounds. It is closely related to the eukaryotic protein, GSH transferase kappa (GSTK). 192
23187 239321 cd03023 DsbA_Com1_like DsbA family, Com1-like subfamily; composed of proteins similar to Com1, a 27-kDa outer membrane-associated immunoreactive protein originally found in both acute and chronic disease strains of the pathogenic bacterium Coxiella burnetii. It contains a CXXC motif, assumed to be imbedded in a DsbA-like structure. Its homology to DsbA suggests that the protein is a protein disulfide oxidoreductase. The role of such a protein in pathogenesis is unknown. 154
23188 239322 cd03024 DsbA_FrnE DsbA family, FrnE subfamily; FrnE is a DsbA-like protein containing a CXXC motif. It is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins. 201
23189 239323 cd03025 DsbA_FrnE_like DsbA family, FrnE-like subfamily; composed of uncharacterized proteins containing a CXXC motif with similarity to DsbA and FrnE. FrnE is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins. 193
23190 239324 cd03026 AhpF_NTD_C TRX-GRX-like family, Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) subfamily, C-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which then reduces hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD containing two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. The catalytic CXXC motif of the NTD of AhpF is contained in its C-terminal TRX subdomain. 89
23191 239325 cd03027 GRX_DEP Glutaredoxin (GRX) family, Dishevelled, Egl-10, and Pleckstrin (DEP) subfamily; composed of uncharacterized proteins containing a GRX domain and additional domains DEP and DUF547, both of which have unknown functions. GRX is a glutathione (GSH) dependent reductase containing a redox active CXXC motif in a TRX fold. It has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. By altering the redox state of target proteins, GRX is involved in many cellular functions. 73
23192 239326 cd03028 GRX_PICOT_like Glutaredoxin (GRX) family, PKC-interacting cousin of TRX (PICOT)-like subfamily; composed of PICOT and GRX-PICOT-like proteins. The non-PICOT members of this family contain only the GRX-like domain, whereas PICOT contains an N-terminal TRX-like domain followed by one to three GRX-like domains. It is interesting to note that PICOT from plants contains three repeats of the GRX-like domain, metazoan proteins (except for insect) have two repeats, while fungal sequences contain only one copy of the domain. PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli. Both GRX and TRX domains of PICOT are required for its activity. Characterized non-PICOT members of this family include CXIP1, a CAX-interacting protein in Arabidopsis thaliana, and PfGLP-1, a GRX-like protein from Plasmodium falciparum. 90
23193 239327 cd03029 GRX_hybridPRX5 Glutaredoxin (GRX) family, PRX5 hybrid subfamily; composed of hybrid proteins containing peroxiredoxin (PRX) and GRX domains, which are found in some pathogenic bacteria and cyanobacteria. PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins. PRX-GRX hybrid proteins from Haemophilus influenzae and Neisseria meningitidis exhibit GSH-dependent peroxidase activity. The flow of reducing equivalents in the catalytic cycle of the hybrid protein goes from NADPH -> GSH reductase -> GSH -> GRX domain of hybrid -> PRX domain of hybrid -> peroxide substrate. 72
23194 239328 cd03030 GRX_SH3BGR Glutaredoxin (GRX) family, SH3BGR (SH3 domain binding glutamic acid-rich protein) subfamily; a recently-identified subfamily composed of SH3BGR and similar proteins possessing significant sequence similarity to GRX, but without a redox active CXXC motif. The SH3BGR gene was cloned in an effort to identify genes mapping to chromosome 21, which could be involved in the pathogenesis of congenital heart disease affecting Down syndrome newborns. Several human SH3BGR-like (SH3BGRL) genes have been identified since, mapping to different locations in the chromosome. Of these, SH3BGRL3 was identified as a tumor necrosis factor (TNF) alpha inhibitory protein and was also named TIP-B1. Upregulation of expression of SH3BGRL3 is associated with differentiation. It has been suggested that it functions as a regulator of differentiation-related signal transduction pathways. 92
23195 239329 cd03031 GRX_GRX_like Glutaredoxin (GRX) family, GRX-like domain containing protein subfamily; composed of uncharacterized eukaryotic proteins containing a GRX-like domain having only one conserved cysteine, aligning to the C-terminal cysteine of the CXXC motif of GRXs. This subfamily is predominantly composed of plant proteins. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins via a redox active CXXC motif using a similar dithiol mechanism employed by TRXs. GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. Proteins containing only the C-terminal cysteine are generally redox inactive. 147
23196 239330 cd03032 ArsC_Spx Arsenate Reductase (ArsC) family, Spx subfamily; Spx is a unique RNA polymerase (RNAP)-binding protein present in bacilli and some mollicutes. It inhibits transcription by binding to the C-terminal domain of the alpha subunit of RNAP, disrupting complex formation between RNAP and certain transcriptional activator proteins like ResD and ComA. In response to oxidative stress, Spx can also activate transcription, making it a general regulator that exerts both positive and negative control over transcription initiation. Spx has been shown to exert redox-sensitive transcriptional control over genes like trxA (TRX) and trxB (TRX reductase), genes that function in thiol homeostasis. This redox-sensitive activity is dependent on the presence of a CXXC motif, present in some members of the Spx subfamily, that acts as a thiol/disulfide switch. Spx has also been shown to repress genes in a sulfate-dependent manner independent of the presence of the CXXC motif. 115
23197 239331 cd03033 ArsC_15kD Arsenate Reductase (ArsC) family, 15kD protein subfamily; composed of proteins of unknown function with similarity to thioredoxin-fold arsenic reductases, ArsC. It is encoded by an ORF present in a gene cluster associated with nitrogen fixation that also encodes dinitrogenase reductase ADP-ribosyltransferase (DRAT) and dinitrogenase reductase activating glycohydrolase (DRAG). ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via glutaredoxin, through a single catalytic cysteine. 113
23198 239332 cd03034 ArsC_ArsC Arsenate Reductase (ArsC) family, ArsC subfamily; arsenic reductases similar to that encoded by arsC on the R733 plasmid of Escherichia coli. E. coli ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], the first step in the detoxification of arsenic, using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX). ArsC contains a single catalytic cysteine, within a thioredoxin fold, that forms a covalent thiolate-As(V) intermediate, which is reduced by GRX through a mixed GSH-arsenate intermediate. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases. 112
23199 239333 cd03035 ArsC_Yffb Arsenate Reductase (ArsC) family, Yffb subfamily; Yffb is an uncharacterized bacterial protein encoded by the yffb gene, related to the thioredoxin-fold arsenic reductases, ArsC. The structure of Yffb and the conservation of the catalytic cysteine suggest that it is likely to function as a glutathione (GSH)-dependent thiol reductase. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from GSH via glutaredoxin, through a single catalytic cysteine. 105
23200 239334 cd03036 ArsC_like Arsenate Reductase (ArsC) family, unknown subfamily; uncharacterized proteins containing a CXXC motif with similarity to thioredoxin (TRX)-fold arsenic reductases, ArsC. Proteins containing a redox active CXXC motif like TRX and glutaredoxin (GRX) function as protein disulfide oxidoreductases, altering the redox state of target proteins via the reversible oxidation of the active site dithiol. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via GRX, through a single catalytic cysteine. 111
23201 239335 cd03037 GST_N_GRX2 GST_N family, Glutaredoxin 2 (GRX2) subfamily; composed of bacterial proteins similar to E. coli GRX2, an atypical GRX with a molecular mass of about 24kD, compared with other GRXs which are 9-12kD in size. GRX2 adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses. 71
23202 239336 cd03038 GST_N_etherase_LigE GST_N family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular weight lignins using GSH as the hydrogen donor. This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF. 84
23203 239337 cd03039 GST_N_Sigma_like GST_N family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation and mediation of allergy and inflammation. Other class Sigma members include the class II insect GSTs, S-crystallins from cephalopods and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase and PGD2 synthase activities, and may play an important role in host-parasite interactions. Also members are novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST. 72
23204 239338 cd03040 GST_N_mPGES2 GST_N family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated, and a C-terminal soluble domain with a GST-like structure. 77
23205 239339 cd03041 GST_N_2GST_N GST_N family, 2 repeats of the N-terminal domain of soluble GSTs (2 GST_N) subfamily; composed of uncharacterized proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 77
23206 239340 cd03042 GST_N_Zeta GST_N family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates but display modest GSH peroxidase activity. They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid. 73
23207 239341 cd03043 GST_N_1 GST_N family, unknown subfamily 1; composed of uncharacterized proteins, predominantly from bacteria, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 73
23208 239342 cd03044 GST_N_EF1Bgamma GST_N family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal TRX-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. The yeast EF1Bgamma binds membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression. 75
23209 239343 cd03045 GST_N_Delta_Epsilon GST_N family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress. 74
23210 239344 cd03046 GST_N_GTT1_like GST_N family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals. 76
23211 239345 cd03047 GST_N_2 GST_N family, unknown subfamily 2; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The sequence from Burkholderia cepacia was identified as part of a gene cluster involved in the degradation of 2,4,5-trichlorophenoxyacetic acid. Some GSTs (e.g. Class Zeta and Delta) are known to catalyze dechlorination reactions. 73
23212 239346 cd03048 GST_N_Ure2p_like GST_N family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p and related GSTs. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The N-terminal TRX-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. Characterized GSTs in this subfamily include Aspergillus fumigatus GSTs 1 and 2, and Schizosaccharomyces pombe GST-I. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 81
23213 239347 cd03049 GST_N_3 GST_N family, unknown subfamily 3; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 73
23214 239348 cd03050 GST_N_Theta GST_N family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from aryl or alkyl sulfate esters. 76
23215 239349 cd03051 GST_N_GTT2_like GST_N family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock. 74
23216 239350 cd03052 GST_N_GDAP1 GST_N family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal TRX-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates. 73
23217 239351 cd03053 GST_N_Phi GST_N family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity. 76
23218 239352 cd03054 GST_N_Metaxin GST_N family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities. 72
23219 239353 cd03055 GST_N_Omega GST_N family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases. 89
23220 239354 cd03056 GST_N_4 GST_N family, unknown subfamily 4; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 73
23221 239355 cd03057 GST_N_Beta GST_N family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit limited GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they also bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH. 77
23222 239356 cd03058 GST_N_Tau GST_N family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones. 74
23223 239357 cd03059 GST_N_SspA GST_N family, Stringent starvation protein A (SspA) subfamily; SspA is a RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal TRX-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis. 73
23224 239358 cd03060 GST_N_Omega_like GST_N family, Omega-like subfamily; composed of uncharacterized proteins with similarity to class Omega GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. Like Omega enzymes, proteins in this subfamily contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. 71
23225 239359 cd03061 GST_N_CLIC GST_N family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLIC1-5, p64, parchorin and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division and apoptosis. They can exist in both water-soluble and membrane-bound states, and are found in various vesicles and membranes. Biochemical studies of the C. elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. The structure of soluble human CLIC1 reveals that it is monomeric and it adopts a fold similar to GSTs, containing an N-terminal domain with a TRX fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity; however, in other subfamily members, the second cysteine is not conserved. 91
23226 239360 cd03062 TRX_Fd_Sucrase TRX-like [2Fe-2S] Ferredoxin (Fd) family, Sucrase subfamily; composed of proteins with similarity to a novel plant enzyme, isolated from potato, which contains a Fd-like domain and exhibits sucrolytic activity. The putative active site of the Fd-like domain of the enzyme contains two cysteines and two histidines for possible binding to iron-sulfur clusters, compared to four cysteines present in the active site of Fd. 97
23227 239361 cd03063 TRX_Fd_FDH_beta TRX-like [2Fe-2S] Ferredoxin (Fd) family, NAD-dependent formate dehydrogenase (FDH) beta subunit; composed of proteins similar to the beta subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH beta subunit contains a NADH:ubiquinone oxidoreductase (Nuo) F domain C-terminal to a Fd-like domain without the active site cysteines. The absence of conserved metal-binding residues in the putative active site suggests that members of this subfamily have lost the ability to bind iron-sulfur clusters in the N-terminal Fd-like domain. The C-terminal NuoF domain is a component of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. NuoF contains one [4Fe-4S] cluster and binds NADH and FMN. 92
23228 239362 cd03064 TRX_Fd_NuoE TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily; Nuo, also called respiratory chain Complex 1, is the entry point for electrons into the respiratory chains of bacteria and the mitochondria of eukaryotes. It is a multisubunit complex with at least 14 core subunits. It catalyzes the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane, providing the proton motive force required for energy-consuming processes. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE core subunit, also called the 24 kD subunit of Complex 1. This subfamily also includes formate dehydrogenases, NiFe hydrogenases and NAD-reducing hydrogenases that contain a NuoE domain. A subset of these proteins contains both NuoE and NuoF in a single chain. NuoF, also called the 51 kD subunit of Complex 1, contains one [4Fe-4S] cluster and also binds the NADH substrate and FMN. 80
23229 239363 cd03065 PDI_b_Calsequestrin_N PDIb family, Calsequestrin subfamily, N-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The N-terminal TRX-fold domain (or domain I) mediates front-to-front dimer interaction, an important feature in the formation of calsequestrin polymers. 120
23230 239364 cd03066 PDI_b_Calsequestrin_middle PDIb family, Calsequestrin subfamily, Middle TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. 102
23231 239365 cd03067 PDI_b_PDIR_N PDIb family, PDIR subfamily, N-terminal TRX-like b domain; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. The TRX-like b domain of PDIR is critical for its chaperone activity. 112
23232 239366 cd03068 PDI_b_ERp72 PDIb family, ERp72 subfamily, first redox inactive TRX-like domain b; ERp72 exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp72 contains three redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. Its molecular structure is a"abb'a', compared to the abb'a' structure of PDI. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. Similar to PDI, the b domain of ERp72 is likely involved in binding to substrates. 107
23233 239367 cd03069 PDI_b_ERp57 PDIb family, ERp57 subfamily, first redox inactive TRX-like domain b; ERp57 (or ERp60) exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. Similar to PDI, the b domain of ERp57 is likely involved in binding to substrates. 104
23234 239368 cd03070 PDI_b_ERp44 PDIb family, ERp44 subfamily, first redox inactive TRX-like domain b; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-triphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b domain of ERp44 is likely involved in binding to substrates. 91
23235 239369 cd03071 PDI_b'_NRX PDIb' family, NRX subgroup, redox inactive TRX-like domain b'; composed of vertebrate nucleoredoxins (NRX). NRX is a 400-amino acid nuclear protein with one redox active TRX domain followed by one redox inactive TRX-like domain homologous to the b' domain of PDI. In vitro studies show that NRX has thiol oxidoreductase activity and that it may be involved in the redox regulation of transcription, in a manner different from that of TRX or glutaredoxin. NRX enhances the activation of NF-kB by TNFalpha, as well as PMA-1 induced AP-1 and FK-induced CREB activation. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. The mouse NRX gene is implicated in streptozotocin-induced diabetes. Similar to PDI, the b' domain of NRX is likely involved in substrate recognition. 116
23236 239370 cd03072 PDI_b'_ERp44 PDIb' family, ERp44 subfamily, second redox inactive TRX-like domain b'; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-triphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b' domain of ERp44 is likely involved in substrate recognition and may be the primary binding site. 111
23237 239371 cd03073 PDI_b'_ERp72_ERp57 PDIb' family, ERp72 and ERp57 subfamily, second redox inactive TRX-like domain b'; ERp72 and ERp57 are involved in oxidative protein folding in the ER, like PDI. They exhibit both disulfide oxidase and reductase functions, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides and acting as isomerases to correct any non-native disulfide bonds. They also display chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp72 contains one additional redox-active TRX (a) domain at the N-terminus with a molecular structure of a"abb'a'. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. The b' domain of ERp57 is the primary binding site and is adapted for ER lectin association. Similarly, the b' domain of ERp72 is likely involved in substrate recognition. 111
23238 239372 cd03074 PDI_b'_Calsequestrin_C Protein Disulfide Isomerase (PDIb') family, Calsequestrin subfamily, C-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The C-terminal TRX-fold domain (or domain III) mediates back-to-back dimer interaction and also contributes to the front-to-front dimer interface, both of which are important features in the formation of calsequestrin polymers. 120
23239 239373 cd03075 GST_N_Mu GST_N family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes that can be formed, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GST M1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1), thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and 3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation. 82
23240 239374 cd03076 GST_N_Pi GST_N family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. It has been implicated in the development of multidrug-resistant tumours. 73
23241 239375 cd03077 GST_N_Alpha GST_N family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Alpha subfamily is composed of eukaryotic GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals. 79
23242 239376 cd03078 GST_N_Metaxin1_like GST_N family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins including Tom37 from fungi. Mammalian metaxin (or metaxin 1) and the fungal protein Tom37 are components of preprotein import complexes of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream to the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals. 73
23243 239377 cd03079 GST_N_Metaxin2 GST_N family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury. 74
23244 239378 cd03080 GST_N_Metaxin_like GST_N family, Metaxin subfamily, Metaxin-like proteins; a heterogenous group of proteins, predominantly uncharacterized, with similarity to metaxins and GSTs. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. One characterized member of this subgroup is a novel GST from Rhodococcus with toluene o-monooxygenase and gamma-glutamylcysteine synthetase activities. Also members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase. The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system. 75
23245 239379 cd03081 TRX_Fd_NuoE_FDH_gamma TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, NAD-dependent formate dehydrogenase (FDH) gamma subunit; composed of proteins similar to the gamma subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD+ to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH gamma subunit is closely related to NuoE, which is part of a multisubunit complex (Nuo) catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE. Similarly, the FDH gamma subunit is hypothesized to be involved in an electron transport chain involving other FDH subunits, upon the oxidation of formate. 80
23246 239380 cd03082 TRX_Fd_NuoE_W_FDH_beta TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E family, Tungsten-containing formate dehydrogenase (W-FDH) beta subunit; composed of proteins similar to the W-FDH beta subunit of Methylobacterium extorquens. W-FDH is a heterodimeric NAD-dependent enzyme catalyzing the conversion of formate to carbon dioxide. The beta subunit is a fusion protein containing an N-terminal NuoE domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. Similarly, the beta subunit of W-FDH is most likely involved in the electron transport chain during the NAD-dependent oxidation of formate. 72
23247 239381 cd03083 TRX_Fd_NuoE_hoxF TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, hoxF; composed of proteins similar to the NAD-reducing hydrogenase (hoxS) alpha subunit of Alcaligenes eutrophus H16. HoxS is a cytoplasmic hydrogenase catalyzing the oxidation of molecular hydrogen accompanied by the reduction of NAD. It is composed of four structural subunits encoded by the genes hoxF, hoxU, hoxY and hoxH. The hoxF protein (or alpha subunit) is a fusion protein containing an N-terminal NuoE-like domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. HoxF may be involved in the electron transport chain during the NAD-dependent oxidation of hydrogen through its NuoF domain. The NuoE-like domain of hoxF contains only one conserved cysteine in its putative active site, compared to four cysteines in NuoE, and may have lost the ability to bind [2Fe-2S] clusters. 80
23248 100086 cd03084 phosphohexomutase The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 355
23249 100087 cd03085 PGM1 Phosphoglucomutase 1 (PGM1) catalyzes the bidirectional interconversion of glucose-1-phosphate (G-1-P) and glucose-6-phosphate (G-6-P) via a glucose 1,6-diphosphate intermediate, an important metabolic step in prokaryotes and eukaryotes. In one direction, G-1-P produced from sucrose catabolism is converted to G-6-P, the first intermediate in glycolysis. In the other direction, conversion of G-6-P to G-1-P generates a substrate for synthesis of UDP-glucose which is required for synthesis of a variety of cellular constituents including cell wall polymers and glycoproteins. The PGM1 family also includes a non-enzymatic PGM-related protein (PGM-RP) thought to play a structural role in eukaryotes, as well as pp63/parafusin, a phosphoglycoprotein that plays an important role in calcium-regulated exocytosis in ciliated protozoans. PGM1 belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 548
23250 100088 cd03086 PGM3 PGM3 (phosphoglucomutase 3), also known as PAGM (phosphoacetylglucosamine mutase) and AGM1 (N-acetylglucosamine-phosphate mutase), is an essential enzyme found in eukaryotes that reversibly catalyzes the conversion of GlcNAc-6-phosphate into GlcNAc-1-phosphate as part of the UDP-N-acetylglucosamine (UDP-GlcNAc) biosynthetic pathway. UDP-GlcNAc is an essential metabolite that serves as the biosynthetic precursor of many glycoproteins and mucopolysaccharides. AGM1 is a member of the alpha-D-phosphohexomutase superfamily, which catalyzes the intramolecular phosphoryl transfer of sugar substrates. The alpha-D-phosphohexomutases have four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 513
23251 100089 cd03087 PGM_like1 This archaeal PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 439
23252 100090 cd03088 ManB ManB is a bacterial phosphomannomutase (PMM) that catalyzes the conversion of mannose-6-phosphate to mannose-1-phosphate in the second of three steps in the GDP-mannose pathway, in which GDP-D-mannose is synthesized from fructose-6-phosphate. In Mycobacterium tuberculosis, the causative agent of tuberculosis, PMM is involved in the biosynthesis of mannosylated lipoglycans that participate in the association of mycobacteria with host macrophage phagocytic receptors. ManB belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 459
23253 100091 cd03089 PMM_PGM The phosphomannomutase/phosphoglucomutase (PMM/PGM) bifunctional enzyme catalyzes the reversible conversion of 1-phospho to 6-phospho-sugars (e.g. between mannose-1-phosphate and mannose-6-phosphate or glucose-1-phosphate and glucose-6-phosphate) via a bisphosphorylated sugar intermediate. The reaction involves two phosphoryl transfers, with an intervening 180 degree reorientation of the reaction intermediate during catalysis. Reorientation of the intermediate occurs without dissociation from the active site of the enzyme and is thus a simple example of processivity, as defined by multiple rounds of catalysis without release of substrate. Glucose-6-phosphate and glucose-1-phosphate are known to be utilized for energy metabolism and cell surface construction, respectively. PMM/PGM belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the phosphoglucomutases (PGM1 and PGM2). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 443
23254 349762 cd03108 AdSS adenylosuccinate synthetase. Adenylosuccinate synthetase (AdSS) catalyzes the first step in the de novo biosynthesis of AMP. IMP and L-aspartate are conjugated in a two-step reaction accompanied by the hydrolysis of GTP to GDP in the presence of Mg2+. In the first step, the gamma-phosphate group of GTP is transferred to the 6-oxygen atom of IMP. An aspartate then displaces this 6-phosphate group to form the product adenylosuccinate. Because of its critical role in purine biosynthesis, AdSS is a target of antibiotics, herbicides and antitumor drugs. 316
23255 349763 cd03109 DTBS dethiobiotin synthetase. Dethiobiotin synthetase (DTBS) is the penultimate enzyme in the biotin biosynthesis pathway in Escherichia coli and other microorganisms. The enzyme catalyzes formation of the ureido ring of dethiobiotin from (7R,8S)-7,8-diaminononanoic acid (DAPA) and carbon dioxide. The enzyme utilizes carbon dioxide instead of hydrogen carbonate as substrate and is dependent on ATP and divalent metal ions as cofactors. 189
23256 349764 cd03110 SIMIBI_bact_arch bacterial and archaeal subfamily of SIMIBI. Uncharacterized bacterial and archaeal subfamily of the SIMIBI superfamily. Proteins in this superfamily contain an ATP-binding domain and use energy from hydrolysis of ATP to transfer electrons or ions. The specific function of this family is unknown. 246
23257 349765 cd03111 CpaE-like pilus assembly ATPase CpaE. This protein family consists of proteins similar to the cpaE protein of the Caulobacter pilus assembly and the orf4 protein of Actinobacillus pilus formation gene cluster. The function of these proteins is unknown. The Caulobacter pilus assembly contains 7 genes: pilA, cpaA, cpaB, cpaC, cpaD, cpaE and cpaF. These genes are clustered together on the chromosome. 235
23258 349766 cd03112 CobW-like cobalamin synthesis protein CobW. The function of this protein family is unknown. The amino acid sequence of YjiA protein in E. coli contains several conserved motifs that characterize it as a P-loop GTPase. The YjiA gene is among the genes significantly induced in response to DNA damage caused by mitomycin. The YjiA gene is a homologue of the CobW gene which encodes the cobalamin synthesis protein/P47K. 198
23259 349767 cd03113 CTPS_N N-terminal domain of cytidine 5'-triphosphate synthase. Cytidine 5'-triphosphate synthase (CTPS) is a two-domain protein, which consists of an N-terminal synthetase domain and C-terminal glutaminase domain. The enzymes hydrolyze the amide bond of glutamine to ammonia and glutamate at the glutaminase domains and transfer nascent ammonia to the acceptor substrate at the synthetase domain to form an aminated product. 261
23260 349768 cd03114 MMAA-like methylmalonic aciduria associated protein. Methylmalonyl Co-A mutase-associated GTPase MeaB and its human homolog, methylmalonic aciduria associated protein (MMAA) are metallochaperones that function as a G-protein chaperone that assists AdoCbl cofactor delivery to the methylmalonyl-CoA mutase (MCM) and reactivation of the enzyme during catalysis. A member of the family, Escherichia coli ArgK, was previously thought to be a membrane ATPase which is required for transporting arginine, ornithine and lysine into the cells by the arginine and ornithine (AO system) and lysine, arginine and ornithine (LAO) transport systems. 252
23261 349769 cd03115 SRP_G_like GTPase domain similar to the signal recognition particle subunit 54. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain. 193
23262 349770 cd03116 MobB molybdopterin-guanine dinucleotide biosynthesis protein B. Molybdenum is an essential trace element in the form of molybdenum cofactor (Moco) which is associated with the metabolism of nitrogen, carbon and sulfur by redox active enzymes. In Escherichia coli, the synthesis of Moco involves genes from several loci: moa, mob, mod, moe and mog. The mob locus contains mobA and mobB genes. MobB catalyzes the attachment of the guanine dinucleotide to molybdopterin. 157
23263 239391 cd03117 alpha_CA_IV_XV_like Carbonic anhydrase alpha, CA_IV, CA_XV, like isozymes. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This subgroup, restricted to animals, contains isozyme IV and similar proteins such as mouse CA XV. Isozyme IV is attached to membranes via a glycosylphosphatidylinositol (GPI) tail. In mammals, Isozyme IV plays crucial roles in kidney and lung function, amongst others. This subgroup also contains the dual domain CA from the giant clam, Tridacna gigas. T. gigas CA plays a role in the movement of inorganic carbon from the surrounding seawater to the symbiotic algae found in the clam's tissues. CA XV is expressed in several species but not in humans or chimps. Similar to isozyme CA IV, CA XV attaches to membranes via a GPI tail. 234
23264 239392 cd03118 alpha_CA_V Carbonic anhydrase alpha, CA isozyme V_like subgroup. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme V. CA V is the mitochondrial isozyme, which may play a role in gluconeogenesis and ureagenesis and possibly also in lipogenesis. 236
23265 239393 cd03119 alpha_CA_I_II_III_XIII Carbonic anhydrase alpha, isozymes I, II, and III and XIII. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozymes I, II, and III, which are cytoplasmic enzymes. CA I, for example, is expressed in erythrocytes of many vertebrates; CA II is the most active cytosolic isozyme; while it is expressed nearly ubiquitously, it comprises 95% of the renal carbonic anhydrase and is required for renal acidification; CA III has been implicated in protection from the damaging effect of oxidizing agents in hepatocytes. CA XIII may play important physiological roles in several organs. 259
23266 239394 cd03120 alpha_CARP_VIII Carbonic anhydrase alpha related protein, group VIII. Carbonic anhydrase related proteins (CARPs) are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP VIII may play roles in various biological processes of the central nervous system, and could be involved in protein-protein interactions. CARP VIII has been shown to bind inositol 1,4,5-triphosphate (IP3) receptor type I (IP3RI), reducing the affinity of the receptor for IP3. IP3RI is an intracellular IP3-gated Ca2+ channel located on intracellular Ca2+ stores. IP3RI converts IP3 signaling into Ca2+ signaling thereby participating in a variety of cell functions. 256
23267 239395 cd03121 alpha_CARP_X_XI_like Carbonic anhydrase alpha related protein: groups X, XI and related proteins. This subgroup contains carbonic anhydrase related proteins (CARPs) X and XI, which have been implicated in various biological processes of the central nervous system. CARPs are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP XI plays a role in the development of gastrointestinal stromal tumors. 256
23268 239396 cd03122 alpha_CARP_receptor_like Carbonic anhydrase alpha related protein, receptor_like subfamily. Carbonic anhydrase related proteins (CARPs) are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. This sub-family of carbonic anhydrase-related domains found in tyrosine phosphatase receptors may play a role in cell adhesion. 253
23269 239397 cd03123 alpha_CA_VI_IX_XII_XIV Carbonic anhydrase alpha, isozymes VI, IX, XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are mostly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the secreted CA VI, which is found in saliva, for example, and the membrane proteins CA IX, XII, and XIV. 248
23270 239398 cd03124 alpha_CA_prokaryotic_like Carbonic anhydrase alpha, prokaryotic-like subfamily. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This sub-family includes bacterial carbonic anhydrase alpha, as well as plant enzymes such as tobacco nectarin III and yam dioscorin, and carbonic anhydrases from molluscs, such as nacrein, which are part of the organic matrix layer in shells. Other members of this family may be involved in maintaining pH balance, in facilitating transport of carbon dioxide or carbonic acid, or in sensing carbon dioxide levels in the environment. Dioscorin is the major storage protein of yam tubers and may play a role as an antioxidant. Tobacco Nectarin may play a role in the maintenance of pH and oxidative balance in nectar. Mollusc nacrein may participate in calcium carbonate crystal formation of the nacreous layer. This subfamily also includes three alpha carbonic anhydrases from Chlamydomonas reinhardtii (CAH 1-3). CAHs 1-2 are localized in the periplasmic space. CAH1 facilitates the movement of carbon dioxide across the plasma membrane when the medium is alkaline. CAH3 is localized to the thylakoid lumen and provides CO2 to Rubisco. 216
23271 239399 cd03125 alpha_CA_VI Carbonic anhydrase alpha, isozyme VI. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the secreted CA VI, which is found in saliva. 249
23272 239400 cd03126 alpha_CA_XII_XIV Carbonic anhydrase alpha, isozymes XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane proteins CA XII and XIV. 249
23273 239401 cd03127 tetraspanin_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins. 90
23274 153222 cd03128 GAT_1 Type 1 glutamine amidotransferase (GATase1)-like domain. Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain. 92
23275 153223 cd03129 GAT1_Peptidase_E_like Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E_like proteins. Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E_like proteins. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E, and extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB. In bacteria peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly. Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Peptidase E and cyanophycinases are thought to have a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus peptidase E is developmentally regulated in response to thyroid hormone, and it is thought to play a role in apoptosis during tail reabsorption. 210
23276 153224 cd03130 GATase1_CobB Type 1 glutamine amidotransferase (GATase1) domain found in Cobyrinic Acid a,c-Diamide Synthase. Type 1 glutamine amidotransferase (GATase1) domain found in Cobyrinic Acid a,c-Diamide Synthase. CobB plays a role in cobalamin biosynthesis catalyzing the conversion of cobyrinic acid to cobyrinic acid a,c-diamide. CobB belongs to the triad family of amidotransferases. Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobB. 198
23277 153225 cd03131 GATase1_HTS Type 1 glutamine amidotransferase (GATase1)-like domain found in homoserine trans-succinylase (HTS). Type 1 glutamine amidotransferase (GATase1)-like domain found in homoserine trans-succinylase (HTS). HTS, the first enzyme in methionine biosynthesis in Escherichia coli, transfers a succinyl group from succinyl-CoA to homoserine forming succinyl homoserine. It has been suggested that the succinyl group of succinyl-CoA is initially transferred to an enzyme nucleophile before subsequent transfer to homoserine. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains a reactive cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. It has been proposed that this cys is in the active site of the molecule. However, as succinyl has been found bound to a conserved lysine residue, this conserved cys may play a role in dimer formation. HTS activity is tightly regulated by several mechanisms including feedback inhibition and proteolysis. It represents a critical control point for cell growth and viability. 175
23278 153226 cd03132 GATase1_catalase Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminal of several large catalases. Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminal of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases: Neurospora crassa Catalase-1 and Catalase-3, and Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. Catalase-1 and -3 are homotetrameric, HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HP-II. Catalase-1 is associated with non-growing cells; Catalase-3 is associated with growing conditions. HP-II is produced in stationary phase. Catalase-1 is induced by ethanol and heat shock. Catalase-3 is induced under stress conditions such as hydrogen peroxide, paraquat, cadmium, heat shock, uric acid and nitrate treatment. 142
23279 153227 cd03133 GATase1_ES1 Type 1 glutamine amidotransferase (GATase1)-like domain found in zebrafish ES1. Type 1 glutamine amidotransferase (GATase1)-like domain found in zebrafish ES1. This group includes proteins similar to ES1, Escherichia coli enhancing lycopene biosynthesis protein 2, Azospirillum brasilense iaaC, and human HES1. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains a reactive cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. Zebrafish ES1 is expressed specifically in adult photoreceptor cells and appears to be a cytoplasmic protein. A. brasilense iaaC is involved in controlling IAA biosynthesis. 213
23280 153228 cd03134 GATase1_PfpI_like A type 1 glutamine amidotransferase (GATase1)-like domain found in PfpI from Pyrococcus furiosus. A type 1 glutamine amidotransferase (GATase1)-like domain found in PfpI from Pyrococcus furiosus. This group includes proteins similar to PfpI from P. furiosus and PH1704 from Pyrococcus horikoshii. These enzymes are ATP-independent intracellular proteases and may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For PH1704, it is believed that this Cys together with a different His in one monomer and Glu (from an adjacent monomer) forms a different catalytic triad from the typical GATase1 domain. PfpI is homooligomeric. Protease activity is only found for oligomeric forms of PH1704. 165
23281 153229 cd03135 GATase1_DJ-1 Type 1 glutamine amidotransferase (GATase1)-like domain found in Human DJ-1. Type 1 glutamine amidotransferase (GATase1)-like domain found in Human DJ-1. DJ-1 is involved in multiple physiological processes including cancer, Parkinson's disease and male fertility. It is unclear how DJ-1 functions in these. DJ-1 has been shown to possess chaperone activity. DJ-1 is preferentially expressed in the testis and moderately in other tissues; it is induced together with genes involved in oxidative stress response. The Drosophila homologue (DJ-1A) plays an essential role in oxidative stress response and neuronal maintenance. Inhibition of DJ-1A function through RNAi results in the cellular accumulation of reactive oxygen species, organismal hypersensitivity to oxidative stress, and dysfunction and degeneration of dopaminergic and photoreceptor neurons. DJ-1 lacks enzymatic activity and the catalytic triad of typical GATase1 domains; however, it does contain the highly conserved cysteine located at the nucleophile elbow region typical of these domains. This cysteine has been proposed to be a site of regulation of DJ-1 activity by oxidation. DJ-1 is a dimeric enzyme. 163
23282 153230 cd03136 GATase1_AraC_ArgR_like AraC transcriptional regulators having an N-terminal Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having an N-terminal Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to the Pseudomonas aeruginosa ArgR regulator. ArgR functions in the control of expression of certain genes of arginine biosynthesis and catabolism. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains a reactive cys residue is found in some sequences in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 185
23283 153231 cd03137 GATase1_AraC_1 AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains a reactive cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 187
23284 153232 cd03138 GATase1_AraC_2 AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains a reactive cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 195
23285 153233 cd03139 GATase1_PfpI_2 Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 183
23286 153234 cd03140 GATase1_PfpI_3 Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 170
23287 153235 cd03141 GATase1_Hsp31_like Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein. Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein (EcHsp31). This group includes EcHsp31 and Saccharomyces cerevisiae Ydr533c protein. EcHsp31 has chaperone activity. Ydr533c is upregulated in response to various stress conditions along with the heat shock family. EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For EcHsp31, this Cys together with a different His and an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain. For Ydr533c a catalytic triad forms from the conserved Cys together with a different His and Glu from that of the typical GATase1 domain. Ydr533c protein and EcHsp31 are homodimers. 221
23288 153236 cd03142 GATase1_ThuA Type 1 glutamine amidotransferase (GATase1)-like domain found in Sinorhizobium meliloti Rm1021 ThuA (SmThuA). Type 1 glutamine amidotransferase (GATase1)-like domain found in Sinorhizobium meliloti Rm1021 ThuA (SmThuA). This group includes proteins similar to SmThuA which plays a role in a major pathway for trehalose catabolism. SmThuA is induced by trehalose but not by related structurally similar disaccharides like sucrose or maltose. Proteins in this group lack the catalytic triad of typical GATase1 domains: a His replaces the reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. S. meliloti Rm1021 thuA mutants are impaired in competitive colonization of Medicago sativa roots but are more competitive than the wild-type Rm1021 in infecting alfalfa roots and forming nitrogen-fixing nodules. 215
23289 153237 cd03143 A4_beta-galactosidase_middle_domain A4 beta-galactosidase middle domain: a type 1 glutamine amidotransferase (GATase1)-like domain. A4 beta-galactosidase middle domain: a type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to beta-galactosidase from Thermus thermophilus. Beta-Galactosidase hydrolyzes the beta-1,4-D-galactosidic linkage of lactose, as well as those of related chromogens, o-nitrophenyl-beta-D-galactopyranoside (ONP-Gal) and 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal). This A4 beta-galactosidase middle domain lacks the catalytic triad of typical GATase1 domains. The reactive Cys residue found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow in typical GATase1 domains is not conserved in this group. 154
23290 153238 cd03144 GATase1_ScBLP_like Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Saccharomyces cerevisiae biotin-apoprotein ligase (ScBLP). Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Saccharomyces cerevisiae biotin-apoprotein ligase (ScBLP). Biotin-apoprotein ligase modifies proteins by covalently attaching biotin. ScBLP is known to biotinylate acetyl-CoA carboxylase and pyruvate carboxylase. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, the Cys residue found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow in a typical GATase1 domain is conserved. 114
23291 153239 cd03145 GAT1_cyanophycinase Type 1 glutamine amidotransferase (GATase1)-like domain found in cyanophycinase. Type 1 glutamine amidotransferase (GATase1)-like domain found in cyanophycinase. This group contains proteins similar to the extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB. Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Cyanophycinase is believed to be a serine-type exopeptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. 217
23292 153240 cd03146 GAT1_Peptidase_E Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E. Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E. In bacteria peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus PepE is developmentally regulated in response to thyroid hormone and is thought to play a role in apoptosis during tail reabsorption. 212
23293 153241 cd03147 GATase1_Ydr533c_like Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. This group includes proteins similar to S. cerevisiae Ydr533c. Ydr533c is upregulated in response to various stress conditions along with the heat shock family. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and Glu residue form a different catalytic triad from the typical GATase1 domain. Ydr533c protein is a homodimer. 231
23294 153242 cd03148 GATase1_EcHsp31_like Type 1 glutamine amidotransferase (GATase1)-like domain found in Escherichia coli Hsp31 protein (EcHsp31). Type 1 glutamine amidotransferase (GATase1)-like domain found in Escherichia coli Hsp31 protein (EcHsp31). This group includes proteins similar to EcHsp31. EcHsp31 has chaperone activity. EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain. EcHsp31 is a homodimer. 232
23295 239402 cd03149 alpha_CA_VII Carbonic anhydrase alpha, CA isozyme VII_like subgroup. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme VII. CA VII is the most active cytosolic enzyme after CA II, and may be highly expressed in the brain. Human CA VII may be a target of antiepileptic sulfonamides/sulfamates. 236
23296 239403 cd03150 alpha_CA_IX Carbonic anhydrase alpha, isozyme IX. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are strictly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane protein CA IX. CA IX is functionally implicated in tumor growth and survival. CA IX is mainly present in solid tumors and its expression in normal tissues is limited to the mucosa of the alimentary tract. CA IX is a transmembrane protein with two extracellular domains: carbonic anhydrase and a proteoglycan-like segment mediating cell-cell adhesion. There is evidence for an involvement of the MAPK pathway in the regulation of CA9 expression. 247
23297 239404 cd03151 CD81_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD81_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD81, also referred to as Target for anti-proliferative antigen-1, TAPA-1, is found in virtually all tissues, may be involved in regulation of cell growth and has been described as a member of the CD19/CD21/Leu-13 signal transduction complex identified on B cells (the B-Cell co-receptor). 84
23298 239405 cd03152 CD9_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD9 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD9 is found in virtually all tissues and is potentially involved in developmental processes. It associates with the tetraspanins CD81 and CD63, as well as with some integrins, and has been shown to be involved in a variety of activation, adhesion, and cell motility functions, as well as cell-cell interactions - such as during fertilization. 84
23299 239406 cd03153 PHEMX_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), PHEMX_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Phemx (pan hematopoietic expression) or TSSC6 may play a role in hematopoietic cell function. 87
23300 239407 cd03154 TM4SF3_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF3_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 3 (TM4SF3) or D6.1a and related proteins. D6.1a associates with alpha6beta4 integrin and supports cell motility; it has been ascribed a role in tumor progression and metastasis. 100
23301 239408 cd03155 CD151_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD151_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD151 strongly associates with integrins, especially alpha3beta1, alpha6beta1, alpha7beta1, and alpha6beta4; it may play roles in cell-cell adhesion, cell migration, platelet aggregation, and angiogenesis. For example, CD151 is involved in regulation of migration of neutrophils, endothelial cells, and various tumor cell lines; it associates specifically with laminin-binding integrins and strengthens alpha6beta1 integrin-mediated adhesion to laminin-1; CD151 also specifically attenuates adhesion-dependent activation of Ras and corresponding downstream effects, and is involved in epithelial cell-cell adhesion as a modulator of PKC- and Cdc42-dependent actin cytoskeletal reorganization. 110
23302 239409 cd03156 uroplakin_I_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), uroplakin_I_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Uroplakin Ia and Ib are components of the 16nm protein particles, which are packed hexagonally to form 2D crystals of asymmetric unit membranes, and cover the apical surface of mammalian urothelium, contributing to the urinary bladder's permeability barrier function. Uroplakins Ia and Ib are maturation facilitators. They trigger conformational changes in their single-transmembrane-domain binding partner proteins uroplakin II and IIIa, which in turn may lead to ER-exit, stabilization, and cell-surface expression. 114
23303 239410 cd03157 TM4SF12_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF12_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human transmembrane 4 superfamily member 12 (TM4SF12). 103
23304 239411 cd03158 penumbra_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), penumbra_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Human Penumbra exhibits growth-suppressive activity in vitro and has been associated with myeloid malignancies. 119
23305 239412 cd03159 TM4SF9_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF9_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 9 (TM4SF9) or Tetraspanin-5 and related proteins. TM4SF9 is strongly expressed within the central nervous system, and expression levels appear to correlate with differentiation status of particular neurons, hinting at a role in neuronal maturation. 121
23306 239413 cd03160 CD37_CD82_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD37_CD82_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD37 is a leukocyte-specific protein, and its restricted expression pattern suggests a role in the immune system. A regulatory role in T-cell proliferation has been suggested. CD82 is a metastasis suppressor implicated in biological processes ranging from fusion, adhesion, and migration to apoptosis and alterations of cell morphology. 117
23307 239414 cd03161 TM4SF2_6_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF2_6_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 2 (TM4SF2) or Tspan-7, transmembrane 4 superfamily 6 (TM4SF6) or Tspan-6, and related proteins. TM4SF2 has been identified as involved in some forms of X-linked mental retardation. 104
23308 239415 cd03162 peripherin_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), peripherin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Peripherin, or RDS (retinal degeneration slow), is a glycoprotein expressed in vertebrate photoreceptors, located at the rim of the disc membranes of the photoreceptor outer segments. RDS is thought to play a major role in folding and stacking of the discs. Mutations in RDS have been linked to hereditary retinal dystrophies, which typically exhibit a wide phenotypic spectrum. 143
23309 239416 cd03163 TM4SF8_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF8_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 8 (TM4SF8) or Tspan-3 and related proteins. Tspan-3 has been reported to form a complex with integrin beta1 and OSP/claudin-11, which may be involved in oligodendrocyte proliferation and migration. 105
23310 239417 cd03164 CD53_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD53_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD53 is a tetraspanin of the lymphoid-myeloid lineage and has been implicated in apoptosis protection. It associates with integrin alpha4beta1. Some of the cellular responses modulated by CD53 may be mediated by JNK activation and/or via the AKT pathway. 86
23311 239418 cd03165 NET-5_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), NET-5_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human tetraspan NET-5. 98
23312 239419 cd03166 CD63_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), CD63 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD63 is present in platelets, neutrophils, and endothelial cells, amongst others. In platelets it associates with the integrin alphaIIbbeta3 and may modulate alphaIIbbeta3-dependent cytoskeletal reorganization. 99
23313 239420 cd03167 oculospanin_like_LEL Tetraspanin, extracellular domain or large extracellular loop (LEL), oculospanin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains sequences similar to oculospanin, which is found to be expressed in retinal pigment epithelium, iris, ciliary body, and retinal ganglion cells. 120
23314 153243 cd03169 GATase1_PfpI_1 Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 180
23315 239421 cd03171 SORL_Dfx_classI Superoxide reductase-like (SORL) domain, class I; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 78
23316 239422 cd03172 SORL_classII Superoxide reductase-like (SORL) domain, class II; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers. 104
23317 176264 cd03173 DUF619-like DUF619 domain of various N-acetylglutamate Kinases and N-acetylglutamate Synthases. DUF619-like: This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. This subgroup also includes the DUF619 domain of the FABP N-acetylglutamate kinase (NAGK), the enzyme that catalyzes the second reaction of arginine biosynthesis; the phosphorylation of the gamma-carboxyl group of NAG to produce N-acetylglutamylphosphate (NAGP) which is subsequently converted to ornithine in two more steps. The nuclear-encoded, mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-acetylglutamate phosphate reductase). The DUF619 domain function has yet to be characterized. 98
23318 163674 cd03174 DRE_TIM_metallolyase DRE-TIM metallolyase superfamily. The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 265
23319 198287 cd03177 GST_C_Delta_Epsilon C-terminal, alpha helical domain of Class Delta and Epsilon Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress. 117
23320 198288 cd03178 GST_C_Ure2p_like C-terminal, alpha helical domain of Ure2p and related Glutathione S-transferase-like proteins. Glutathione S-transferase (GST) C-terminal domain family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p, YfcG and YghU from Escherichia coli, and related GST-like proteins. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The N-terminal thioredoxin-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. YfcG and YghU are two of the nine GST homologs in the genome of Escherichia coli. They display very low or no GSH transferase activity, but show very good disulfide bond oxidoreductase activity. YghU also shows modest organic hydroperoxide reductase activity. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 110
23321 198289 cd03180 GST_C_2 C-terminal, alpha helical domain of an unknown subfamily 2 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 2; composed of uncharacterized bacterial proteins, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 110
23322 198290 cd03181 GST_C_EF1Bgamma_like Glutathione S-transferase C-terminal-like, alpha helical domain of the Gamma subunit of Elongation Factor 1B and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. The yeast EF1Bgamma binds to membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression. Also included in this subfamily is the GST_C-like domain at the N-terminus of human valyl-tRNA synthetase (ValRS) and its homologs. Metazoan ValRS forms a stable complex with Elongation Factor-1H (EF-1H), and together, they catalyze consecutive steps in protein biosynthesis, tRNA aminoacylation and its transfer to EF. 123
23323 198291 cd03182 GST_C_GTT2_like C-terminal, alpha helical domain of GTT2-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the Saccharomyces cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock. 116
23324 198292 cd03183 GST_C_Theta C-terminal, alpha helical domain of Class Theta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from the aryl or alkyl sulfate esters. 126
23325 198293 cd03184 GST_C_Omega C-terminal, alpha helical domain of Class Omega Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases. 124
23326 198294 cd03185 GST_C_Tau C-terminal, alpha helical domain of Class Tau Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones. 127
23327 198295 cd03186 GST_C_SspA C-terminal, alpha helical domain of Stringent starvation protein A. Glutathione S-transferase (GST) C-terminal domain family, Stringent starvation protein A (SspA) subfamily; SspA is an RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis. 108
23328 198296 cd03187 GST_C_Phi C-terminal, alpha helical domain of Class Phi Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity. 118
23329 198297 cd03188 GST_C_Beta C-terminal, alpha helical domain of Class Beta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they are involved in the protection against oxidative stress and are able to bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs, contributing to antibiotic resistance. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH. One member of this subfamily is a GST from Burkholderia xenovorans LB400 that is encoded by the bphK gene and is part of the biphenyl catabolic pathway. 113
23330 198298 cd03189 GST_C_GTT1_like C-terminal, alpha helical domain of GTT1-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals. 123
23331 198299 cd03190 GST_C_Omega_like C-terminal, alpha helical domain of Class Omega-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae Omega-like subfamily; composed of three Saccharomyces cerevisiae GST omega-like (Gto) proteins, Gto1p, Gto2p (also known as Extracellular mutant protein 4 or ECM4p), and Gto3p, as well as similar uncharacterized proteins from fungi and bacteria. The three Saccharomyces cerevisiae Gto proteins are omega-class GSTs with low or no GST activity against standard substrates, but have glutaredoxin/thiol oxidoreductase and dehydroascorbate reductase activity through a single cysteine residue in the active site. Gto1p is located in the peroxisomes while Gto2p and Gto3p are cytosolic. The gene encoding Gto2p, called ECM4, is involved in cell surface biosynthesis and architecture. S. cerevisiae ECM4 mutants show increased amounts of the cell wall hexose, N-acetylglucosamine. More recently, global gene expression analysis shows that ECM4 is upregulated during genotoxic conditions and together with the expression profiles of 18 other genes could potentially differentiate between genotoxic and cytotoxic insults in yeast. 142
23332 198300 cd03191 GST_C_Zeta C-terminal, alpha helical domain of Class Zeta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates, but display modest GSH peroxidase activity. They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid. 121
23333 198301 cd03192 GST_C_Sigma_like C-terminal, alpha helical domain of Class Sigma-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi, and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation, and mediation of allergy and inflammation. Other class Sigma-like members include the class II insect GSTs, S-crystallins from cephalopods, nematode-specific GSTs, and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase, and PGD2 synthase activities, and may play an important role in host-parasite interactions. Members also include novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST. 104
23334 198302 cd03193 GST_C_Metaxin C-terminal, alpha helical domain of Metaxin and related proteins. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities. Other members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase. The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system. 88
23335 198303 cd03194 GST_C_3 C-terminal, alpha helical domain of an unknown subfamily 3 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 3; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 115
23336 198304 cd03195 GST_C_4 C-terminal, alpha helical domain of an unknown subfamily 4 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 4; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 114
23337 198305 cd03196 GST_C_5 C-terminal, alpha helical domain of an unknown subfamily 5 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 5; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 115
23338 198306 cd03197 GST_C_mPGES2 C-terminal, alpha helical domain of microsomal Prostaglandin E synthase Type 2. Glutathione S-transferase (GST) C-terminal domain family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH, or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature, and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated and a C-terminal soluble domain with a GST-like structure. The C-terminal GST-like domain contains two structural domains, an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST active site is located in a cleft between the two structural domains. 149
23339 198307 cd03198 GST_C_CLIC C-terminal, alpha helical domain of Chloride Intracellular Channels. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLICs (CLIC1-6 in vertebrates), p64, parchorin, and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. Biochemical studies of the Caenorhabditis elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. CLICs display structural plasticity, with CLIC1 adopting two soluble conformations. The structure of soluble human CLIC1 reveals that it is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity, however, in other subfamily members, the second cysteine is not conserved. 119
23340 198308 cd03199 GST_C_GRX2 C-terminal, alpha helical domain of Glutaredoxin 2. Glutathione S-transferase (GST) C-terminal domain family, Glutaredoxin 2 (GRX2) subfamily; composed of Escherichia coli GRX2 and similar proteins. Escherichia coli GRX2 is an atypical GRX with a molecular mass of about 24kD (most GRXs range from 9-12kD). It adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain, but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses. 128
23341 198309 cd03200 GST_C_AIMP2 Glutathione S-transferase C-terminal-like, alpha helical domain of Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 2. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein (AIMP) 2 subfamily; AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex that functions as a molecular hub to coordinate protein synthesis. There are three AIMPs, named AIMP1-3, which play diverse regulatory roles. AIMP2, also called p38 or JTV-1, contains a C-terminal domain with similarity to the C-terminal alpha helical domain of GSTs. It plays an important role in the control of cell fate via antiproliferative (by enhancing the TGF-beta signal) and proapoptotic (activation of p53 and TNF-alpha) activities. Its roles in the control of cell proliferation and death suggest that it is a potent tumor suppressor. AIMP2 heterozygous mice with lower than normal expression of AIMP2 show high susceptibility to tumorigenesis. AIMP2 is also a substrate of Parkin, an E3 ubiquitin ligase that is involved in the ubiquitylation and proteasomal degradation of its substrates. Mutations in the Parkin gene are found in 50% of patients with autosomal-recessive early-onset parkinsonism. The accumulation of AIMP2, due to impaired Parkin function, may play a role in the pathogenesis of Parkinson's disease. 96
23342 198310 cd03201 GST_C_DHAR C-terminal, alpha helical domain of Dehydroascorbate Reductase. Glutathione S-transferase (GST) C-terminal domain family, Dehydroascorbate Reductase (DHAR) subfamily; composed of plant-specific DHARs, which are monomeric enzymes catalyzing the reduction of DHA into ascorbic acid (AsA) using glutathione as the reductant. DHAR allows plants to recycle oxidized AsA before it is lost. AsA serves as a cofactor of violaxanthin de-epoxidase in the xanthophyll cycle and as an antioxidant in the detoxification of reactive oxygen species. Because AsA is the major reductant in plants, DHAR serves to regulate their redox state. It has been suggested that a significant portion of DHAR activity is plastidic, acting to reduce the large amounts of ascorbate oxidized during hydrogen peroxide scavenging by ascorbate peroxidase. DHAR contains a conserved cysteine in its active site and in addition to its reductase activity, shows thiol transferase activity similar to glutaredoxins. 121
23343 198311 cd03202 GST_C_etherase_LigE C-terminal, alpha helical domain of Beta etherase LigE. Glutathione S-transferase (GST) C-terminal domain family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular weight lignins using GSH as the hydrogen donor. This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 124
23344 198312 cd03203 GST_C_Lambda C-terminal, alpha helical domain of Class Lambda Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Lambda subfamily; composed of plant-specific class Lambda GSTs. GSTs are cytosolic, usually dimeric, proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Lambda subfamily was recently discovered, together with dehydroascorbate reductases (DHARs), as two outlying groups of the GST superfamily in Arabidopsis thaliana, which contain conserved active site cysteines. Characterization of recombinant A. thaliana proteins shows that Lambda class GSTs are monomeric, similar to DHARs. They do not exhibit GSH conjugating or DHAR activities, but are active as thiol transferases, similar to glutaredoxins. Members of this subfamily were originally identified as encoded proteins of the In2-1 gene, which can be induced by treatment with herbicide safeners. 120
23345 198313 cd03204 GST_C_GDAP1_like C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1-like proteins. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1 (GDAP1)-like subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates. 111
23346 198314 cd03205 GST_C_6 C-terminal, alpha helical domain of an unknown subfamily 6 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 6; composed of uncharacterized bacterial proteins with similarity to GSTs, including Pseudomonas fluorescens GST with a known three-dimensional structure. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Though the three-dimensional structure of Pseudomonas fluorescens GST has been determined, there is no information on its functional characterization. 109
23347 198315 cd03206 GST_C_7 C-terminal, alpha helical domain of an unknown subfamily 7 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 7; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 100
23348 198316 cd03207 GST_C_8 C-terminal, alpha helical domain of an unknown subfamily 8 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 8; composed of Agrobacterium tumefaciens GST and other uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The three-dimensional structure of Agrobacterium tumefaciens GST has been determined but there is no information on its functional characterization. 101
23349 198317 cd03208 GST_C_Alpha C-terminal, alpha helical domain of Class Alpha Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Alpha subfamily is composed of vertebrate GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals. 135
23350 198318 cd03209 GST_C_Mu C-terminal, alpha helical domain of Class Mu Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes that can be formed, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GST M1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1) thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and 3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation. 121
23351 198319 cd03210 GST_C_Pi C-terminal, alpha helical domain of Class Pi Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. It has been implicated in the development of multidrug-resistant tumors. 126
23352 198320 cd03211 GST_C_Metaxin2 C-terminal, alpha helical domain of Metaxin 2. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury. 126
23353 198321 cd03212 GST_C_Metaxin1_3 C-terminal, alpha helical domain of Metaxin 1, Metaxin 3, and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins. Mammalian metaxin (or metaxin 1) is a component of the preprotein import complex of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream to the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals. 137
23354 213180 cd03213 ABCG_EPDR Eye pigment and drug resistance transporter subfamily G of the ATP-binding cassette superfamily. ABCG transporters are involved in eye pigment (EP) precursor transport, regulation of lipid-trafficking mechanisms, and pleiotropic drug resistance (DR). DR is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. Compared to other members of the ABC transporter subfamilies, the ABCG transporter family is composed of proteins that have an ATP-binding cassette domain at the N-terminus and a TM (transmembrane) domain at the C-terminus. 194
23355 213181 cd03214 ABC_Iron-Siderophores_B12_Hemin ATP-binding component of iron-siderophores, vitamin B12 and hemin transporters and related proteins. ABC transporters, involved in the uptake of siderophores, heme, and vitamin B12, are widely conserved in bacteria and archaea. Only very few species lack representatives of the siderophore family transporters. The E. coli BtuCD protein is an ABC transporter mediating vitamin B12 uptake. The two ATP-binding cassettes (BtuD) are in close contact with each other, as are the two membrane-spanning subunits (BtuC); this arrangement is distinct from that observed for the E. coli lipid flippase MsbA. The BtuC subunits provide 20 transmembrane helices grouped around a translocation pathway that is closed to the cytoplasm by a gate region, whereas the dimer arrangement of the BtuD subunits resembles the ATP-bound form of the Rad50 DNA repair enzyme. A prominent cytoplasmic loop of BtuC forms the contact region with the ATP-binding cassette and represents a conserved motif among the ABC transporters. 180
23356 213182 cd03215 ABC_Carb_Monos_II Second domain of the ATP-binding cassette component of monosaccharide transport system. This family represents domain II of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, such as pentoses (such as xylose, arabinose, and ribose) and hexoses (such as glucose, galactose, and fructose), that cannot be broken down to simple sugars by hydrolysis. In members of the Carb_Monos family the single hydrophobic gene product forms a homodimer, while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter. 182
23357 213183 cd03216 ABC_Carb_Monos_I First domain of the ATP-binding cassette component of monosaccharide transport system. This family represents domain I of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, such as pentoses and hexoses, that cannot be broken down to simple sugars by hydrolysis. Pentoses include xylose, arabinose, and ribose. Important hexoses include glucose, galactose, and fructose. In members of the Carb_Monos family, the single hydrophobic gene product forms a homodimer while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter. 163
23358 213184 cd03217 ABC_FeS_Assembly ABC-type transport system involved in Fe-S cluster assembly, ATPase component. Biosynthesis of iron-sulfur clusters (Fe-S) depends on multi-protein systems. The SUF system of E. coli and Erwinia chrysanthemi is important for Fe-S biogenesis under stressful conditions. The SUF system is made of six proteins: SufC is an atypical cytoplasmic ABC-ATPase, which forms a complex with SufB and SufD; SufA plays the role of a scaffold protein for assembly of iron-sulfur clusters and delivery to target proteins; SufS is a cysteine desulfurase which mobilizes the sulfur atom from cysteine and provides it to the cluster; SufE has no associated function yet. 200
23359 213185 cd03218 ABC_YhbG ATP-binding cassette component of YhbG transport system. The ABC transporters belonging to the YhbG family are similar to members of the Mj1267_LivG family, which is involved in the transport of branched-chain amino acids. The genes yhbG and yhbN are located in a single operon and may function together in cell envelope during biogenesis. YhbG is the putative ATP-binding cassette component and YhbN is the putative periplasmic-binding protein. Depletion of each gene product leads to growth arrest, irreversible cell damage and loss of viability in E. coli. The YhbG homolog (NtrA) is essential in Rhizobium meliloti, a symbiotic nitrogen-fixing bacterium. 232
23360 213186 cd03219 ABC_Mj1267_LivG_branched ATP-binding cassette component of branched chain amino acids transport system. The Mj1267/LivG ABC transporter subfamily is involved in the transport of the hydrophobic amino acids leucine, isoleucine and valine. MJ1267 is a branched-chain amino acid transporter with 29% similarity to both the LivF and LivG components of the E. coli branched-chain amino acid transporter. MJ1267 contains an insertion from residues 114 to 123 characteristic of LivG (Leucine-Isoleucine-Valine) homologs. The branched-chain amino acid transporter from E. coli comprises a heterodimer of ABCs (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ). 236
23361 213187 cd03220 ABC_KpsT_Wzt ATP-binding cassette component of polysaccharide transport system. The KpsT/Wzt ABC transporter subfamily is involved in extracellular polysaccharide export. Among the variety of membrane-linked or extracellular polysaccharides excreted by bacteria, only capsular polysaccharides, lipopolysaccharides, and teichoic acids have been shown to be exported by ABC transporters. A typical system is made of a conserved integral membrane and an ABC. In addition to these proteins, capsular polysaccharide exporter systems require two 'accessory' proteins to perform their function: a periplasmic (E. coli) or a lipid-anchored outer membrane protein called OMA (Neisseria meningitidis and Haemophilus influenzae) and a cytoplasmic membrane protein MPA2. 224
23362 213188 cd03221 ABCF_EF-3 ATP-binding cassette domain of elongation factor 3, subfamily F. Elongation factor 3 (EF-3) is a cytosolic protein required by fungal ribosomes for in vitro protein synthesis and for in vivo growth. EF-3 stimulates the binding of the EF-1: GTP: aa-tRNA ternary complex to the ribosomal A site by facilitated release of the deacylated tRNA from the E site. The reaction requires ATP hydrolysis. EF-3 contains two ATP nucleotide binding sequence (NBS) motifs. NBSI is sufficient for the intrinsic ATPase activity. NBSII is essential for the ribosome-stimulated functions. 144
23363 213189 cd03222 ABC_RNaseL_inhibitor ATP-binding cassette domain of RNase L inhibitor. The ABC ATPase RNase L inhibitor (RLI) is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins, and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains, which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology. 177
23364 213190 cd03223 ABCD_peroxisomal_ALDP ATP-binding cassette domain of peroxisomal transporter, subfamily D. Peroxisomal ATP-binding cassette transporter (Pat) is involved in the import of very long-chain fatty acids (VLCFA) into the peroxisome. The peroxisomal membrane forms a permeability barrier for a wide variety of metabolites required for and formed during fatty acid beta-oxidation. To communicate with the cytoplasm and mitochondria, peroxisomes need dedicated proteins to transport such hydrophilic molecules across their membranes. X-linked adrenoleukodystrophy (X-ALD) is caused by mutations in the ALD gene, which encodes ALDP (adrenoleukodystrophy protein), a peroxisomal integral membrane protein that is a member of the ATP-binding cassette (ABC) transporter protein family. The disease is characterized by a striking and unpredictable variation in phenotypic expression. Phenotypes include the rapidly progressive childhood cerebral form (CCALD), the milder adult form, adrenomyeloneuropathy (AMN), and variants without neurologic involvement (i.e. asymptomatic). 166
23365 213191 cd03224 ABC_TM1139_LivF_branched ATP-binding cassette domain of branched-chain amino acid transporter. LivF (TM1139) is part of the LIV-I bacterial ABC-type two-component transport system that imports neutral, branched-chain amino acids. The E. coli branched-chain amino acid transporter comprises a heterodimer of ABC transporters (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 222
23366 213192 cd03225 ABC_cobalt_CbiO_domain1 First domain of the ATP-binding cassette component of cobalt transport system. Domain I of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. This ABC transport system of the CbiMNQO family is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways. Most cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems. 211
23367 213193 cd03226 ABC_cobalt_CbiO_domain2 Second domain of the ATP-binding cassette component of cobalt transport system. Domain II of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. The CbiMNQO family ABC transport system is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways. Most cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems. 205
23368 213194 cd03227 ABC_Class2 ATP-binding cassette domain of non-transporter proteins. ABC-type Class 2 contains systems involved in cellular processes other than transport. These families are characterized by the fact that the ABC subunit is made up of duplicated, fused ABC modules (ABC2). No known transmembrane proteins or domains are associated with these proteins. 162
23369 213195 cd03228 ABCC_MRP_Like ATP-binding cassette domain of multidrug resistance protein-like transporters. The MRP (Multidrug Resistance Protein)-like transporters are involved in drug, peptide, and lipid export. They belong to the subfamily C of the ATP-binding cassette (ABC) superfamily of transport proteins. The ABCC subfamily contains transporters with a diverse functional spectrum that includes ion transport, cell surface receptor, and toxin secretion activities. The MRP-like family, similar to all ABC proteins, has a common four-domain core structure constituted by two membrane-spanning domains, each composed of six transmembrane (TM) helices, and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 171
23370 213196 cd03229 ABC_Class3 ATP-binding cassette domain of the binding protein-dependent transport systems. This class is comprised of all BPD (Binding Protein Dependent) systems that are largely represented in archaea and eubacteria and are primarily involved in scavenging solutes from the environment. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 178
23371 213197 cd03230 ABC_DR_subfamily_A ATP-binding cassette domain of the drug resistance transporter and related proteins, subfamily A. This family of ATP-binding proteins belongs to a multi-subunit transporter involved in drug resistance (BcrA and DrrA), nodulation, lipid transport, and lantibiotic immunity. In bacteria and archaea, these transporters usually include an ATP-binding protein and one or two integral membrane proteins. Eukaryotic systems of the ABCA subfamily display ABC domains that are quite similar to this family. The ATP-binding domain shows the highest similarity between all members of the ABC transporter family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 173
23372 213198 cd03231 ABC_CcmA_heme_exporter Cytochrome c biogenesis ATP-binding export protein. CcmA, the ATP-binding component of the bacterial CcmAB transporter. The CCM family is involved in bacterial cytochrome c biogenesis. Cytochrome c maturation in E. coli requires the ccm operon, which encodes eight membrane proteins (CcmABCDEFGH). CcmE is a periplasmic heme chaperone that binds heme covalently and transfers it onto apocytochrome c in the presence of CcmF, CcmG, and CcmH. The CcmAB proteins represent an ABC transporter and the CcmCD proteins participate in heme transfer to CcmE. 201
23373 213199 cd03232 ABCG_PDR_domain2 Second domain of the pleiotropic drug resistance-like (PDR) subfamily G of ATP-binding cassette transporters. The pleiotropic drug resistance (PDR) is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain II of its (ABC-IM)2 organization. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 192
23374 213200 cd03233 ABCG_PDR_domain1 First domain of the pleiotropic drug resistance-like subfamily G of ATP-binding cassette transporters. The pleiotropic drug resistance (PDR) is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain I of its (ABC-IM)2 organization. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 202
23375 213201 cd03234 ABCG_White White pigment protein homolog of ABCG transporter subfamily. The White subfamily represents ABC transporters homologous to the Drosophila white gene, which acts as a dimeric importer for eye pigment precursors. The eye pigmentation of Drosophila is developed from the synthesis and deposition in the cells of red pigments, which are synthesized from guanine, and brown pigments, which are synthesized from tryptophan. The pigment precursors are encoded by the white, brown, and scarlet genes, respectively. Evidence from genetic and biochemical studies suggests that the White and Brown proteins function as heterodimers to import guanine, while the White and Scarlet proteins function to import tryptophan. However, a recent study also suggests that White may be involved in the transport of a metabolite, such as 3-hydroxykynurenine, across intracellular membranes. Mammalian ABC transporters belonging to the White subfamily (ABCG1, ABCG5, and ABCG8) have been shown to be involved in the regulation of lipid-trafficking mechanisms in macrophages, hepatocytes, and intestinal mucosa cells. ABCG1 (ABC8), the human homolog of the Drosophila white gene, is induced in monocyte-derived macrophages during cholesterol influx mediated by acetylated low-density lipoprotein. It is possible that human ABCG1 forms heterodimers with several heterologous partners. 226
23376 213202 cd03235 ABC_Metallic_Cations ATP-binding cassette domain of the metal-type transporters. This family includes transporters involved in the uptake of various metallic cations such as iron, manganese, and zinc. The ATPases of this group of transporters are very similar to members of iron-siderophore uptake family suggesting that they share a common ancestor. The best characterized metal-type ABC transporters are the YfeABCD system of Y. pestis, the SitABCD system of Salmonella enterica serovar Typhimurium, and the SitABCD transporter of Shigella flexneri. Moreover other uncharacterized homologs of these metal-type transporters are mainly found in pathogens like Haemophilus or enteroinvasive E. coli isolates. 213
23377 213203 cd03236 ABC_RNaseL_inhibitor_domain1 The ATP-binding cassette domain 1 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide binding domains which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology. 255
23378 213204 cd03237 ABC_RNaseL_inhibitor_domain2 The ATP-binding cassette domain 2 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology. 246
23379 213205 cd03238 ABC_UvrA ATP-binding cassette domain of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases. 176
23380 213206 cd03239 ABC_SMC_head The SMC head domain belongs to the ATP-binding cassette superfamily. The structural maintenance of chromosomes (SMC) proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and each has five distinct domains: amino- and carboxy-terminal globular domains, which contain sequences characteristic of ATPases, two coiled-coil regions separating the terminal domains, and a central flexible hinge. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair, and epigenetic silencing of gene expression. 178
23381 213207 cd03240 ABC_Rad50 ATP-binding cassette domain of Rad50. The catalytic domains of Rad50 are similar to the ATP-binding cassette of ABC transporters, but are not associated with membrane-spanning domains. The conserved ATP-binding motifs common to Rad50 and the ABC transporter family include the Walker A and Walker B motifs, the Q loop, a histidine residue in the switch region, a D-loop, and a conserved LSGG sequence. This conserved sequence, LSGG, is the most specific and characteristic motif of this family and is thus known as the ABC signature sequence. 204
23382 213208 cd03241 ABC_RecN ATP-binding cassette domain of RecN. RecN ATPase involved in DNA repair; similar to ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 276
23383 213209 cd03242 ABC_RecF ATP-binding cassette domain of RecF. RecF is a recombinational DNA repair ATPase that maintains replication in the presence of DNA damage. When replication is prematurely disrupted by DNA damage, several recF pathway gene products play critical roles processing the arrested replication fork, allowing it to resume and complete its task. This CD represents the nucleotide binding domain of RecF. RecF belongs to a large superfamily of ABC transporters involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases with a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 270
23384 213210 cd03243 ABC_MutS_homologs ATP-binding cassette domain of MutS homologs. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 202
23385 213211 cd03244 ABCC_MRP_domain2 ATP-binding cassette domain 2 of multidrug resistance-associated protein. The ABC subfamily C is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate. 221
23386 213212 cd03245 ABCC_bacteriocin_exporters ATP-binding cassette domain of bacteriocin exporters, subfamily C. Many non-lantibiotic bacteriocins of lactic acid bacteria are produced as precursors which have N-terminal leader peptides that share similarities in amino acid sequence and contain a conserved processing site of two glycine residues in positions -1 and -2. A dedicated ATP-binding cassette (ABC) transporter is responsible for the proteolytic cleavage of the leader peptides and subsequent translocation of the bacteriocins across the cytoplasmic membrane. 220
23387 213213 cd03246 ABCC_Protease_Secretion ATP-binding cassette domain of PrtD, subfamily C. This family represents the ABC component of the protease secretion system PrtD, a 60-kDa integral membrane protein sharing 37% identity with HlyB, the ABC component of the alpha-hemolysin secretion pathway, in the C-terminal domain. They export degradative enzymes by using a type I protein secretion system and lack an N-terminal signal peptide, but contain a C-terminal secretion signal. The Type I secretion apparatus is made up of three components, an ABC transporter, a membrane fusion protein (MFP), and an outer membrane protein (OMP). For the HlyA transporter complex, HlyB (ABC transporter) and HlyD (MFP) reside in the inner membrane of E. coli. The OMP component is TolC, which is thought to interact with the MFP to form a continuous channel across the periplasm from the cytoplasm to the exterior. HlyB belongs to the family of ABC transporters, which are ubiquitous, ATP-dependent transmembrane pumps or channels. The spectrum of transport substrates ranges from inorganic ions, nutrients such as amino acids, sugars, or peptides, hydrophobic drugs, to large polypeptides, such as HlyA. 173
23388 213214 cd03247 ABCC_cytochrome_bd ATP-binding cassette domain of CydCD, subfamily C. The CYD subfamily implicated in cytochrome bd biogenesis. The CydC and CydD proteins are important for the formation of cytochrome bd terminal oxidase of E. coli and it has been proposed that they were necessary for biosynthesis of the cytochrome bd quinol oxidase and for periplasmic c-type cytochromes. CydCD were proposed to determine a heterooligomeric complex important for heme export into the periplasm or to be involved in the maintenance of the proper redox state of the periplasmic space. In Bacillus subtilis, the absence of CydCD does not affect the presence of holo-cytochrome c in the membrane and this observation suggests that CydCD proteins are not involved in the export of heme in this organism. 178
23389 213215 cd03248 ABCC_TAP ATP-binding cassette domain of the Transporter Associated with Antigen Processing, subfamily C. TAP (Transporter Associated with Antigen Processing) is essential for peptide delivery from the cytosol into the lumen of the endoplasmic reticulum (ER), where these peptides are loaded on major histocompatibility complex (MHC) I molecules. Loaded MHC I leave the ER and display their antigenic cargo on the cell surface to cytotoxic T cells. Subsequently, virus-infected or malignantly transformed cells can be eliminated. TAP belongs to the large family of ATP-binding cassette (ABC) transporters, which translocate a vast variety of solutes across membranes. 226
23390 213216 cd03249 ABC_MTABC3_MDL1_MDL2 ATP-binding cassette domain of a mitochondrial protein MTABC3 and related proteins. MTABC3 (also known as ABCB6) is a mitochondrial ATP-binding cassette protein involved in iron homeostasis and one of four ABC transporters expressed in the mitochondrial inner membrane, the other three being MDL1(ABC7), MDL2, and ATM1. In fact, the yeast MDL1 (multidrug resistance-like protein 1) and MDL2 (multidrug resistance-like protein 2) transporters are also included in this CD. MDL1 is an ATP-dependent permease that acts as a high-copy suppressor of ATM1 and is thought to have a role in resistance to oxidative stress. Interestingly, subfamily B is more closely related to the carboxyl-terminal component of subfamily C than the two halves of ABCC molecules are with one another. 238
23391 213217 cd03250 ABCC_MRP_domain1 ATP-binding cassette domain 1 of multidrug resistance-associated protein, subfamily C. This subfamily is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate. 204
23392 213218 cd03251 ABCC_MsbA ATP-binding cassette domain of the bacterial lipid flippase and related proteins, subfamily C. MsbA is an essential ABC transporter, closely related to eukaryotic MDR proteins. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 234
23393 213219 cd03252 ABCC_Hemolysin ATP-binding cassette domain of hemolysin B, subfamily C. The ABC-transporter hemolysin B is a central component of the secretion machinery that translocates the toxin, hemolysin A, in a Sec-independent fashion across both membranes of E. coli. The hemolysin A (HlyA) transport machinery is composed of the ATP-binding cassette (ABC) transporter HlyB located in the inner membrane, hemolysin D (HlyD), also anchored in the inner membrane, and TolC, which resides in the outer membrane. HlyD apparently forms a continuous channel that bridges the entire periplasm, interacting with TolC and HlyB. This arrangement prevents the appearance of periplasmic intermediates of HlyA during substrate transport. Little is known about the molecular details of HlyA transport, but it is evident that ATP-hydrolysis by the ABC-transporter HlyB is a necessary source of energy. 237
23394 213220 cd03253 ABCC_ATM1_transporter ATP-binding cassette domain of iron-sulfur clusters transporter, subfamily C. ATM1 is an ABC transporter that is expressed in the mitochondria. Although the specific function of ATM1 is unknown, its disruption results in the accumulation of excess mitochondrial iron, loss of mitochondrial cytochromes, oxidative damage to mitochondrial DNA, and decreased levels of cytosolic heme proteins. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 236
23395 213221 cd03254 ABCC_Glucan_exporter_like ATP-binding cassette domain of glucan transporter and related proteins, subfamily C. Glucan exporter ATP-binding protein. In A. tumefaciens cyclic beta-1, 2-glucan must be transported into the periplasmic space to exert its action as a virulence factor. This subfamily belongs to the MRP-like family and is involved in drug, peptide, and lipid export. The MRP-like family, similar to all ABC proteins, have a common four-domain core structure constituted by two membrane-spanning domains each composed of six transmembrane (TM) helices and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 229
23396 213222 cd03255 ABC_MJ0796_LolCDE_FtsE ATP-binding cassette domain of the transporters involved in export of lipoprotein and macrolide, and cell division protein. This family is comprised of MJ0796 ATP-binding cassette, macrolide-specific ABC-type efflux carrier (MacAB), and proteins involved in cell division (FtsE), and release of lipoproteins from the cytoplasmic membrane (LolCDE). They are clustered together phylogenetically. MacAB is an exporter that confers resistance to macrolides, while the LolCDE system is not a transporter at all. FtsE null mutants showed filamentous growth and appeared viable on high salt medium only, indicating a role for FtsE in cell division and/or salt transport. The LolCDE complex catalyzes the release of lipoproteins from the cytoplasmic membrane prior to their targeting to the outer membrane. 218
23397 213223 cd03256 ABC_PhnC_transporter ATP-binding cassette domain of the binding protein-dependent phosphonate transport system. Phosphonates are a class of organophosphorus compounds characterized by a chemically stable carbon-to-phosphorus (C-P) bond. Phosphonates are widespread among naturally occurring compounds in all kingdoms of wildlife, but only prokaryotic microorganisms are able to cleave this bond. Certain bacteria such as E. coli can use alkylphosphonates as a phosphorus source. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 241
23398 213224 cd03257 ABC_NikE_OppD_transporters ATP-binding cassette domain of nickel/oligopeptides specific transporters. The ABC transporter subfamily specific for the transport of dipeptides, oligopeptides (OppD), and nickel (NikDE). The NikABCDE system of E. coli belongs to this family and is composed of the periplasmic binding protein NikA, two integral membrane components (NikB and NikC), and two ATPases (NikD and NikE). The NikABCDE transporter is synthesized under anaerobic conditions to meet the increased demand for nickel resulting from hydrogenase synthesis. The molecular mechanism of nickel uptake in many bacteria and most archaea is not known. Many other members of this ABC family are also involved in the uptake of dipeptides and oligopeptides. The oligopeptide transport system (Opp) is a five-component ABC transporter composed of a membrane-anchored substrate binding protein (SRP), OppA, two transmembrane proteins, OppB and OppC, and two ATP-binding domains, OppD and OppF. 228
23399 213225 cd03258 ABC_MetN_methionine_transporter ATP-binding cassette domain of methionine transporter. MetN (also known as YusC) is an ABC-type transporter encoded by metN of the metNPQ operon in Bacillus subtilis that is involved in methionine transport. Other members of this system include the MetP permease and the MetQ substrate binding protein. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 233
23400 213226 cd03259 ABC_Carb_Solutes_like ATP-binding cassette domain of the carbohydrate and solute transporters-like. This family is comprised of proteins involved in the transport of apparently unrelated solutes and proteins specific for di- and oligosaccharides and polyols. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 213
23401 213227 cd03260 ABC_PstB_phosphate_transporter ATP-binding cassette domain of the phosphate transport system. Phosphate uptake is of fundamental importance in the cell physiology of bacteria because phosphate is required as a nutrient. The Pst system of E. coli comprises four distinct subunits encoded by the pstS, pstA, pstB, and pstC genes. The PstS protein is a phosphate-binding protein located in the periplasmic space. PstA and PstC are hydrophobic and they form the transmembrane portion of the Pst system. PstB is the catalytic subunit, which couples the energy of ATP hydrolysis to the import of phosphate across cellular membranes through the Pst system, often referred to as ABC-protein. PstB belongs to one of the largest superfamilies of proteins characterized by a highly conserved adenosine triphosphate (ATP) binding cassette (ABC), which is also a nucleotide binding domain (NBD). 227
23402 213228 cd03261 ABC_Org_Solvent_Resistant ATP-binding cassette transport system involved in resistance to organic solvents. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 235
23403 213229 cd03262 ABC_HisP_GlnQ ATP-binding cassette domain of the histidine and glutamine transporters. HisP and GlnQ are the ATP-binding components of the bacterial periplasmic histidine and glutamine permeases, respectively. Histidine permease is a multi-subunit complex containing the HisQ and HisM integral membrane subunits and two copies of HisP. HisP has properties intermediate between those of integral and peripheral membrane proteins and is accessible from both sides of the membrane, presumably by its interaction with HisQ and HisM. The two HisP subunits form a homodimer within the complex. The domain structure of the amino acid uptake systems is typical for prokaryotic extracellular solute binding protein-dependent uptake systems. All of the amino acid uptake systems also have at least one, and in a few cases, two extracellular solute binding proteins located in the periplasm of Gram-negative bacteria, or attached to the cell membrane of Gram-positive bacteria. The best-studied member of the PAAT (polar amino acid transport) family is the HisJQMP system of S. typhimurium, where HisJ is the extracellular solute binding protein and HisP is the ABC protein. 213
23404 213230 cd03263 ABC_subfamily_A ATP-binding cassette domain of the lipid transporters, subfamily A. The ABCA subfamily mediates the transport of a variety of lipid compounds. Mutations of members of ABCA subfamily are associated with human genetic diseases, such as, familial high-density lipoprotein (HDL) deficiency, neonatal surfactant deficiency, degenerative retinopathies, and congenital keratinization disorders. The ABCA1 protein is involved in disorders of cholesterol transport and high-density lipoprotein (HDL) biosynthesis. The ABCA4 (ABCR) protein transports vitamin A derivatives in the outer segments of photoreceptor cells, and therefore, performs a crucial step in the visual cycle. The ABCA genes are not present in yeast. However, evolutionary studies of ABCA genes indicate that they arose as transporters that subsequently duplicated and that certain sets of ABCA genes were lost in different eukaryotic lineages. 220
23405 213231 cd03264 ABC_drug_resistance_like ABC-type multidrug transport system, ATPase component. The biological function of this family is not well characterized, but its members display ABC domains similar to members of ABCA subfamily. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 211
23406 213232 cd03265 ABC_DrrA Daunorubicin/doxorubicin resistance ATP-binding protein. DrrA is the ATP-binding protein component of a bacterial exporter complex that confers resistance to the antibiotics daunorubicin and doxorubicin. In addition to DrrA, the complex includes an integral membrane protein called DrrB. DrrA belongs to the ABC family of transporters and shares sequence and functional similarities with a protein found in cancer cells called P-glycoprotein. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 220
23407 213233 cd03266 ABC_NatA_sodium_exporter ATP-binding cassette domain of the Na+ transporter. NatA is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the proton-motive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunorubicin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of a single ATP-binding protein and a single integral membrane protein. 218
23408 213234 cd03267 ABC_NatA_like ATP-binding cassette domain of an uncharacterized transporter similar in sequence to NatA. NatA is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the proton-motive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunorubicin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of the single ATP-binding protein and the single integral membrane protein. 236
23409 213235 cd03268 ABC_BcrA_bacitracin_resist ATP-binding cassette domain of the bacitracin-resistance transporter. The BcrA subfamily represents ABC transporters involved in peptide antibiotic resistance. Bacitracin is a dodecapeptide antibiotic produced by B. licheniformis and B. subtilis. The synthesis of bacitracin is non-ribosomally catalyzed by a multi-enzyme complex BcrABC. Bacitracin has potent antibiotic activity against gram-positive bacteria. The inhibition of peptidoglycan biosynthesis is the best characterized bacterial effect of bacitracin. The bacitracin resistance of B. licheniformis is mediated by the ABC transporter Bcr which is composed of two identical BcrA ATP-binding subunits and one each of the integral membrane proteins, BcrB and BcrC. B. subtilis cells carrying bcr genes on high-copy number plasmids develop collateral detergent sensitivity, a similar phenomenon in human cells with overexpressed multi-drug resistance P-glycoprotein. 208
23410 213236 cd03269 ABC_putative_ATPase ATP-binding cassette domain of an uncharacterized transporter. This subgroup is related to the subfamily A transporters involved in drug resistance, nodulation, lipid transport, and bacteriocin and lantibiotic immunity. In eubacteria and archaea, the typical organization consists of one ABC and one or two integral membranes. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 210
23411 213237 cd03270 ABC_UvrA_I ATP-binding cassette domain I of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases. 226
23412 213238 cd03271 ABC_UvrA_II ATP-binding cassette domain II of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins and UvrB having one ATP binding site that is structurally related to that of helicases. 261
23413 213239 cd03272 ABC_SMC3_euk ATP-binding cassette domain of eukaryotic SMC3 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 243
23414 213240 cd03273 ABC_SMC2_euk ATP-binding cassette domain of eukaryotic SMC2 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 251
23415 213241 cd03274 ABC_SMC4_euk ATP-binding cassette domain of eukaryotic SMC4 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 212
23416 213242 cd03275 ABC_SMC1_euk ATP-binding cassette domain of eukaryotic SMC1 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 247
23417 213243 cd03276 ABC_SMC6_euk ATP-binding cassette domain of eukaryotic SMC6 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 198
23418 213244 cd03277 ABC_SMC5_euk ATP-binding cassette domain of eukaryotic SMC5 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 213
23419 213245 cd03278 ABC_SMC_barmotin ATP-binding cassette domain of barmotin, a member of the SMC protein family. Barmotin is a tight junction-associated protein expressed in rat epithelial cells which is thought to have an important regulatory role in tight junction barrier function. Barmotin belongs to the SMC protein family. SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 197
23420 213246 cd03279 ABC_sbcCD ATP-binding cassette domain of sbcCD. SbcCD and other Mre11/Rad50 (MR) complexes are implicated in the metabolism of DNA ends. They cleave ends sealed by hairpin structures and are thought to play a role in removing protein bound to DNA termini. 213
23421 213247 cd03280 ABC_MutS2 ATP-binding cassette domain of MutS2. MutS2 homologs in bacteria and eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 200
23422 213248 cd03281 ABC_MSH5_euk ATP-binding cassette domain of eukaryotic MutS5 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 213
23423 213249 cd03282 ABC_MSH4_euk ATP-binding cassette domain of eukaryotic MutS4 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 204
23424 213250 cd03283 ABC_MutS-like ATP-binding cassette domain of MutS-like homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 199
23425 213251 cd03284 ABC_MutS1 ATP-binding cassette domain of MutS1 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 216
23426 213252 cd03285 ABC_MSH2_euk ATP-binding cassette domain of eukaryotic MutS2 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 222
23427 213253 cd03286 ABC_MSH6_euk ATP-binding cassette domain of eukaryotic MutS6 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 218
23428 213254 cd03287 ABC_MSH3_euk ATP-binding cassette domain of eukaryotic MutS3 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 222
23429 213255 cd03288 ABCC_SUR2 ATP-binding cassette domain 2 of the sulfonylurea receptor SUR. The SUR domain 2. The sulfonylurea receptor SUR is an ATP binding cassette (ABC) protein of the ABCC/MRP family. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity. 257
23430 213256 cd03289 ABCC_CFTR2 ATP-binding cassette domain 2 of CFTR,subfamily C. The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits or at least smaller not-yet functional components of the active whole. In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2. 275
23431 213257 cd03290 ABCC_SUR1_N ATP-binding cassette domain of the sulfonylurea receptor, subfamily C. The SUR domain 1. The sulfonylurea receptor SUR is an ATP transporter of the ABCC/MRP family with tandem ATPase binding domains. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity. 218
23432 213258 cd03291 ABCC_CFTR1 ATP-binding cassette domain of the cystic fibrosis transmembrane regulator, subfamily C. The CFTR subfamily domain 1. The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits, or at least smaller not-yet functional components of the active whole. In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2. 282
23433 213259 cd03292 ABC_FtsE_transporter ATP-binding cassette domain of the cell division transporter. FtsE is a hydrophilic nucleotide-binding protein that binds FtsX to form a heterodimeric ATP-binding cassette (ABC)-type transporter that associates with the bacterial inner membrane. The FtsE/X transporter is thought to be involved in cell division and is important for assembly or stability of the septal ring. 214
23434 213260 cd03293 ABC_NrtD_SsuB_transporters ATP-binding cassette domain of the nitrate and sulfonate transporters. NrtD and SsuB are the ATP-binding subunits of the bacterial ABC-type nitrate and sulfonate transport systems, respectively. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 220
23435 213261 cd03294 ABC_Pro_Gly_Betaine ATP-binding cassette domain of the osmoprotectant proline/glycine betaine uptake system. This family comprises the glycine betaine/L-proline ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 269
23436 213262 cd03295 ABC_OpuCA_Osmoprotection ATP-binding cassette domain of the osmoprotectant transporter. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment. ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 242
23437 213263 cd03296 ABC_CysA_sulfate_importer ATP-binding cassette domain of the sulfate transporter. Part of the ABC transporter complex cysAWTP involved in sulfate import. Responsible for energy coupling to the transport system. The complex is composed of two ATP-binding proteins (cysA), two transmembrane proteins (cysT and cysW), and a solute-binding protein (cysP). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 239
23438 213264 cd03297 ABC_ModC_molybdenum_transporter ATP-binding cassette domain of the molybdenum transport system. ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 214
23439 213265 cd03298 ABC_ThiQ_thiamine_transporter ATP-binding cassette domain of the thiamine transport system. Part of the binding-protein-dependent transport system tbpA-thiPQ for thiamine and TPP. Probably responsible for the translocation of thiamine across the membrane. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 211
23440 213266 cd03299 ABC_ModC_like ATP-binding cassette domain similar to the molybdate transporter. Archaeal protein closely related to ModC. ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 235
23441 213267 cd03300 ABC_PotA_N ATP-binding cassette domain of the polyamine transporter. PotA is an ABC-type transporter and the ATPase component of the spermidine/putrescine-preferential uptake system consisting of PotA, -B, -C, and -D. PotA has two domains with the N-terminal domain containing the ATPase activity and the residues required for homodimerization with PotA and heterodimerization with PotB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 232
23442 213268 cd03301 ABC_MalK_N The N-terminal ATPase domain of the maltose transporter, MalK. ATP binding cassette (ABC) proteins function from bacteria to human, mediating the translocation of substances into and out of cells or organelles. ABC transporters contain two transmembrane-spanning domains (TMDs) or subunits and two nucleotide binding domains (NBDs) or subunits that couple transport to the hydrolysis of ATP. In the maltose transport system, the periplasmic maltose binding protein (MBP) stimulates the ATPase activity of the membrane-associated transporter, which consists of two transmembrane subunits, MalF and MalG, and two copies of the ATP binding subunit, MalK, and becomes tightly bound to the transporter in the catalytic transition state, ensuring that maltose is passed to the transporter as ATP is hydrolyzed. 213
23443 176471 cd03302 Adenylsuccinate_lyase_2 Adenylsuccinate lyase (ASL)_subgroup 2. This subgroup contains mainly eukaryotic proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in the de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and, the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting. 436
23444 239423 cd03307 Mta_CmuA_like MtaA_CmuA_like family. MtaA/CmuA, also MtsA, or methyltransferase 2 (MT2) MT2-A and MT2-M isozymes, are methylcobamide:Coenzyme M methyltransferases, which play a role in metabolic pathways of methane formation from various substrates, such as methylated amines and methanol. Coenzyme M, 2-mercaptoethylsulfonate or CoM, is methylated during methanogenesis in a reaction catalyzed by three proteins. A methyltransferase methylates the corrinoid cofactor, which is bound to a second polypeptide, a corrinoid protein. The methylated corrinoid protein then serves as a substrate for MT2-A and related enzymes, which methylate CoM. 326
23445 239424 cd03308 CmuA_CmuC_like CmuA_CmuC_like: uncharacterized protein family similar to uroporphyrinogen decarboxylase (URO-D) and the methyltransferases CmuA and CmuC. 378
23446 239425 cd03309 CmuC_like CmuC_like. Proteins similar to the putative corrinoid methyltransferase CmuC. Its function has been inferred from sequence similarity to the methyltransferases CmuA and MtaA. Mutants of Methylobacterium sp. disrupted in cmuC and purU appear deficient in some step of chloromethane metabolism. 321
23447 239426 cd03310 CIMS_like CIMS - Cobalamine-independent methionine synthase, or MetE. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers both the N- and C-terminal barrel, and some single-barrel sequences, mostly from Archaea. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 321
23448 239427 cd03311 CIMS_C_terminal_like CIMS - Cobalamine-independent methionine synthase, or MetE, C-terminal domain_like. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the C-terminal barrel, and a few single-barrel sequences most similar to the C-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 332
23449 239428 cd03312 CIMS_N_terminal_like CIMS - Cobalamine-independent methionine synthase, or MetE, N-terminal domain_like. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the N-terminal barrel, and a few single-barrel sequences most similar to the N-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 360
23450 239429 cd03313 enolase Enolase: Enolases are homodimeric enzymes that catalyse the reversible dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate as part of the glycolytic and gluconeogenesis pathways. The reaction is facilitated by the presence of metal ions. 408
23451 239430 cd03314 MAL Methylaspartate ammonia lyase (3-methylaspartase, MAL) is a homodimeric enzyme, catalyzing the magnesium-dependent reversible alpha,beta-elimination of ammonia from L-threo-(2S,3S)-3-methylaspartic acid to mesaconic acid. This reaction is part of the main catabolic pathway for glutamate. MAL belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 369
23452 239431 cd03315 MLE_like Muconate lactonizing enzyme (MLE) like subgroup of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and residues that can function as general acid/base catalysts, a Lys-X-Lys motif and another conserved lysine. Despite these conserved residues, the members of the MLE subgroup, like muconate lactonizing enzyme, o-succinylbenzoate synthase (OSBS) and N-acylamino acid racemase (NAAAR), catalyze different reactions. 265
23453 239432 cd03316 MR_like Mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. Members of the MR subgroup are mandelate racemase, D-glucarate/L-idarate dehydratase (GlucD), D-altronate/D-mannonate dehydratase , D-galactonate dehydratase (GalD) , D-gluconate dehydratase (GlcD), and L-rhamnonate dehydratase (RhamD). 357
23454 239433 cd03317 NAAAR N-acylamino acid racemase (NAAAR), an octameric enzyme that catalyzes the racemization of N-acylamino acids. NAAARs act on a broad range of N-acylamino acids rather than amino acids. Enantiopure amino acids are of industrial interest as chiral building blocks for antibiotics, herbicides, and drugs. NAAAR is a member of the enolase superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 354
23455 239434 cd03318 MLE Muconate Lactonizing Enzyme (MLE), an homooctameric enzyme, catalyses the conversion of cis,cis-muconate (CCM) to muconolactone (ML) in the catechol branch of the beta-ketoadipate pathway. This pathway is used in soil microbes to breakdown lignin-derived aromatics, catechol and protocatechuate, to citric acid cycle intermediates. Some bacterial species are also capable of dehalogenating chloroaromatic compounds by the action of chloromuconate lactonizing enzymes (Cl-MLEs). MLEs are members of the enolase superfamily characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion. 365
23456 239435 cd03319 L-Ala-DL-Glu_epimerase L-Ala-D/L-Glu epimerase catalyzes the epimerization of L-Ala-D/L-Glu and other dipeptides. The genomic context and the substrate specificity of characterized members of this family from E.coli and B.subtilis indicates a possible role in the metabolism of the murein peptide of peptidoglycan, of which L-Ala-D-Glu is a component. L-Ala-D/L-Glu epimerase is a member of the enolase-superfamily, which is characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 316
23457 239436 cd03320 OSBS o-Succinylbenzoate synthase (OSBS) catalyzes the conversion of 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate (SHCHC) to 4-(2'-carboxyphenyl)-4-oxobutyrate (o-succinylbenzoate or OSB), a reaction in the menaquinone biosynthetic pathway. Menaquinone is an essential cofactor for anaerobic growth in eubacteria and some archaea. OSBS belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 263
23458 239437 cd03321 mandelate_racemase Mandelate racemase (MR) catalyzes the Mg2+-dependent 1,1-proton transfer reaction that interconverts the enantiomers of mandelic acid. MR is the first enzyme in the bacterial pathway that converts mandelic acid to benzoic acid and allows this pathway to utilize either enantiomer of mandelate. MR belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 355
23459 239438 cd03322 RspA The starvation sensing protein RspA from E.coli and its homologs are lactonizing enzymes whose putative targets are homoserine lactone (HSL) derivatives. They are part of the mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subfamily share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and catalytic residues, a partially conserved Lys-X-Lys motif and a conserved histidine-aspartate dyad. 361
23460 239439 cd03323 D-glucarate_dehydratase D-Glucarate dehydratase (GlucD) catalyzes the dehydration of both D-glucarate and L-idarate to form 5-keto-4-deoxy-D-glucarate (5-KDG) , the initial reaction of the catabolic pathway for (D)-glucarate. GlucD belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion. 395
23461 239440 cd03324 rTSbeta_L-fuconate_dehydratase Human rTS beta is encoded by the rTS gene which, through alternative RNA splicing, also encodes rTS alpha whose mRNA is complementary to thymidylate synthase mRNA. rTS beta expression is associated with the production of small molecules that appear to mediate the down-regulation of thymidylate synthase protein by a novel intercellular signaling mechanism. A member of this family, from Xanthomonas, has been characterized to be a L-fuconate dehydratase. rTS beta belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 415
23462 239441 cd03325 D-galactonate_dehydratase D-galactonate dehydratase catalyses the dehydration of galactonate to 2-keto-3-deoxygalactonate (KDGal), as part of the D-galactonate nonphosphorolytic catabolic Entner-Doudoroff pathway. D-galactonate dehydratase belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 352
23463 239442 cd03326 MR_like_1 Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 1. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 385
23464 239443 cd03327 MR_like_2 Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 2. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 341
23465 239444 cd03328 MR_like_3 Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 3. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 352
23466 239445 cd03329 MR_like_4 Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 4. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 368
23467 394879 cd03330 Macro_Ttha0132-like Macrodomain, uncharacterized family similar to Thermus thermophilus hypothetical protein Ttha0132. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Macrodomains include the yeast macrodomain Poa1 which is a phosphatase of ADP-ribose-1"-phosphate, a by-product of tRNA splicing. Some macrodomains have ADPr-unrelated binding partners such as the coronavirus SUD-N (N-terminal subdomain) and SUD-M (middle subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). Macrodomains regulate a wide variety of cellular and organismal processes, including DNA damage repair, signal transduction, and immune response. This family is composed of uncharacterized proteins containing a stand-alone macrodomain, similar to Thermus thermophilus hypothetical protein Ttha0132. 147
23468 394880 cd03331 Macro_Poa1p-like_SNF2 macrodomain, Poa1p-like family, SNF2 subfamily. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this subfamily contain a C-terminal macrodomain that shows similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. In addition, they also contain an SNF2 domain, defined by the presence of seven motifs with sequence similarity to DNA helicases. SNF2 proteins have the capacity to use the energy released by their DNA-dependent ATPase activity to stabilize or perturb protein-DNA interactions and play important roles in transcriptional regulation, maintenance of chromosome integrity and DNA repair. 152
23469 239448 cd03332 LMO_FMN L-Lactate 2-monooxygenase (LMO) FMN-binding domain. LMO is a FMN-containing enzyme that catalyzes the conversion of L-lactate and oxygen to acetate, carbon dioxide, and water. LMO is a member of the family of alpha-hydroxy acid oxidases. It is thought to be a homooctamer with two- and four- fold axes in the center of the octamer. 383
23470 239449 cd03333 chaperonin_like chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains. 209
23471 239450 cd03334 Fab1_TCP TCP-1 like domain of the eukaryotic phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinase Fab1. Fab1p is important for vacuole size regulation, presumably by modulating PtdIns(3,5)P2 effector activity. In the human homolog p235/PIKfyve, deletion of this domain leads to loss of catalytic activity. However, no exact function of this domain has been defined. In general, chaperonins are involved in productive folding of proteins. 261
23472 239451 cd03335 TCP1_alpha TCP-1 (CTT or eukaryotic type II) chaperonin family, alpha subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 527
23473 239452 cd03336 TCP1_beta TCP-1 (CTT or eukaryotic type II) chaperonin family, beta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 517
23474 239453 cd03337 TCP1_gamma TCP-1 (CTT or eukaryotic type II) chaperonin family, gamma subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 480
23475 239454 cd03338 TCP1_delta TCP-1 (CTT or eukaryotic type II) chaperonin family, delta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 515
23476 239455 cd03339 TCP1_epsilon TCP-1 (CTT or eukaryotic type II) chaperonin family, epsilon subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 526
23477 239456 cd03340 TCP1_eta TCP-1 (CTT or eukaryotic type II) chaperonin family, eta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 522
23478 239457 cd03341 TCP1_theta TCP-1 (CTT or eukaryotic type II) chaperonin family, theta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 472
23479 239458 cd03342 TCP1_zeta TCP-1 (CTT or eukaryotic type II) chaperonin family, zeta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 484
23480 239459 cd03343 cpn60 cpn60 chaperonin family. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. Archaeal cpn60 (thermosome), together with TF55 from thermophilic bacteria and the eukaryotic cytosol chaperonin (CTT), belong to the type II group of chaperonins. Cpn60 consists of two stacked octameric rings, which are composed of one or two different subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 517
23481 239460 cd03344 GroEL GroEL_like type I chaperonin. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). With the aid of cochaperonin GroES, GroEL encapsulates non-native substrate proteins inside the cavity of the GroEL-ES complex and promotes folding by using energy derived from ATP hydrolysis. 520
23482 239461 cd03345 eu_TyrOH Eukaryotic tyrosine hydroxylase (TyrOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tryptophan hydroxylase (TrpOH). TyrOH catalyzes the conversion of tyrosine to L-dihydroxyphenylalanine (L-DOPA), the rate-limiting step in the biosynthesis of the catecholamines dopamine, noradrenaline, and adrenaline. 298
23483 239462 cd03346 eu_TrpOH Eukaryotic tryptophan hydroxylase (TrpOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tyrosine hydroxylase (TyrOH). TrpOH oxidizes L-tryptophan to 5-hydroxy-L-tryptophan, the rate-limiting step in the biosynthesis of serotonin (5-hydroxytryptamine), a widely distributed hormone and neurotransmitter. 287
23484 239463 cd03347 eu_PheOH Eukaryotic phenylalanine-4-hydroxylase (eu_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic phenylalanine-4-hydroxylase (pro_PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH catalyzes the first and rate-limiting step in the metabolism of the amino acid L-phenylalanine (L-Phe), the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor. The catalytic activity of the tetrameric enzyme is tightly regulated by the binding of L-Phe and BH4 as well as by phosphorylation. Mutations in the human enzyme are linked to a severe variant of phenylketonuria. 306
23485 239464 cd03348 pro_PheOH Prokaryotic phenylalanine-4-hydroxylase (pro_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes the eukaryotic proteins, phenylalanine-4-hydroxylase (eu_PheOH), tyrosine hydroxylase (TyrOH) and tryptophan hydroxylase (TrpOH). PheOH catalyzes the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor. 228
23486 100040 cd03349 LbH_XAT Xenobiotic acyltransferase (XAT): The XAT class of hexapeptide acyltransferases is composed of a large number of microbial enzymes that catalyze the CoA-dependent acetylation of a variety of hydroxyl-bearing acceptors such as chloramphenicol and streptogramin, among others. Members of this class of enzymes include Enterococcus faecium streptogramin A acetyltransferase and Pseudomonas aeruginosa chloramphenicol acetyltransferase. They contain repeated copies of a six-residue hexapeptide repeat sequence motif (X-[STAV]-X-[LIV]-[GAED]-X) and adopt a left-handed parallel beta helix (LbH) structure. The active enzyme is a trimer with CoA and substrate binding sites at the interface of two separate LbH subunits. XATs are implicated in inactivating xenobiotics leading to xenobiotic resistance in patients. 145
23487 100041 cd03350 LbH_THP_succinylT 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate (THDP) N-succinyltransferase (also called THP succinyltransferase): THDP N-succinyltransferase catalyzes the conversion of tetrahydrodipicolinate and succinyl-CoA to N-succinyltetrahydrodipicolinate and CoA. It is the committed step in the succinylase pathway by which bacteria synthesize L-lysine and meso-diaminopimelate, a component of peptidoglycan. The enzyme is homotrimeric and each subunit contains an N-terminal region with alpha helices and hairpin loops, as well as a C-terminal region with a left-handed parallel beta-helix (LbH) structural motif encoded by hexapeptide repeat motifs. 139
23488 100042 cd03351 LbH_UDP-GlcNAc_AT UDP-N-acetylglucosamine O-acyltransferase (UDP-GlcNAc acyltransferase): Proteins in this family catalyze the transfer of (R)-3-hydroxymyristic acid from its acyl carrier protein thioester to UDP-GlcNAc. It is the first enzyme in the lipid A biosynthetic pathway and is also referred to as LpxA. Lipid A is essential for the growth of Escherichia coli and related bacteria. It is also essential for maintaining the integrity of the outer membrane. UDP-GlcNAc acyltransferase is a homotrimer of left-handed parallel beta helix (LbH) subunits. Each subunit contains an N-terminal LbH region with 9 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), and a C-terminal alpha-helical region. 254
23489 100043 cd03352 LbH_LpxD UDP-3-O-acyl-glucosamine N-acyltransferase (LpxD): The enzyme catalyzes the transfer of 3-hydroxymyristic acid or 3-hydroxy-arachidic acid, depending on the organism, from the acyl carrier protein (ACP) to UDP-3-O-acyl-glucosamine to produce UDP-2,3-diacyl-GlcNAc. This constitutes the third step in the lipid A biosynthetic pathway in Gram-negative bacteria. LpxD is a homotrimer, with each subunit consisting of a novel combination of an N-terminal uridine-binding domain, a core lipid-binding left-handed parallel beta helix (LbH) domain, and a C-terminal alpha-helical extension. The LbH domain contains 9 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). 205
23490 100044 cd03353 LbH_GlmU_C N-acetyl-glucosamine-1-phosphate uridyltransferase (GlmU), C-terminal left-handed beta-helix (LbH) acetyltransferase domain: GlmU is also known as UDP-N-acetylglucosamine pyrophosphorylase. It is a bifunctional bacterial enzyme that catalyzes two consecutive steps in the formation of UDP-N-acetylglucosamine (UDP-GlcNAc), an important precursor in bacterial cell wall formation. The two enzymatic activities, uridyltransferase and acetyltransferase, are carried out by two independent domains. The C-terminal LbH domain possesses the acetyltransferase activity. It catalyzes the CoA-dependent acetylation of GlcN-1-phosphate to GlcNAc-1-phosphate. The LbH domain contains 10 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The acetyltransferase active site is located at the interface between two subunits of the active LbH trimer. 193
23491 100045 cd03354 LbH_SAT Serine acetyltransferase (SAT): SAT catalyzes the CoA-dependent acetylation of the side chain hydroxyl group of L-serine to form O-acetylserine, as the first step of a two-step biosynthetic pathway in bacteria and plants leading to the formation of L-cysteine. This reaction represents a key metabolic point of regulation for the cysteine biosynthetic pathway due to its feedback inhibition by cysteine. The enzyme is a 175 kDa homohexamer, composed of a dimer of homotrimers. Each subunit contains an N-terminal alpha helical region and a C-terminal left-handed beta-helix (LbH) subdomain with 5 turns, each containing a hexapeptide repeat motif characteristic of the acyltransferase superfamily of enzymes. The trimer interface mainly involves the C-terminal LbH subdomain while the dimer (of trimers) interface is mediated by the N-terminal alpha helical subdomain. 101
23492 100046 cd03356 LbH_G1P_AT_C_like Left-handed parallel beta-Helix (LbH) domain of a group of proteins with similarity to glucose-1-phosphate adenylyltransferase: Included in this family are glucose-1-phosphate adenylyltransferase, mannose-1-phosphate guanylyltransferase, and the eukaryotic translation initiation factor eIF-2B subunits, epsilon and gamma. Most members of this family contain an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold, followed by a LbH fold domain with at least 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). eIF-2B epsilon contains an additional domain of unknown function at the C-terminus. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 79
23493 100047 cd03357 LbH_MAT_GAT Maltose O-acetyltransferase (MAT) and Galactoside O-acetyltransferase (GAT): MAT and GAT catalyze the CoA-dependent acetylation of the 6-hydroxyl group of their respective sugar substrates. MAT acetylates maltose and glucose exclusively at the C6 position of the nonreducing end glucosyl moiety. GAT specifically acetylates galactopyranosides. Furthermore, MAT shows higher affinity toward artificial substrates containing an alkyl or hydrophobic chain as well as a glucosyl unit. Active MAT and GAT are homotrimers, with each subunit consisting of an N-terminal alpha-helical region and a C-terminal left-handed parallel beta-helix (LbH) subdomain with 6 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). 169
23494 100048 cd03358 LbH_WxcM_N_like WcxM-like, Left-handed parallel beta-Helix (LbH) N-terminal domain: This group is composed of Xanthomonas campestris WcxM and proteins with similarity to the WcxM N-terminal domain. WcxM is thought to be bifunctional, catalyzing both the isomerization and transacetylation reactions of keto-hexoses. It contains an N-terminal LbH domain responsible for the transacetylation function and a C-terminal isomerase domain. The LbH domain contains imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), typical of enzymes with acyltransferase activity. 119
23495 100049 cd03359 LbH_Dynactin_5 Dynactin 5 (or subunit p25); Dynactin is a major component of the activator complex that stimulates dynein-mediated vesicle transport. Dynactin is a heterocomplex of at least eight subunits, including a 150,000-MW protein called Glued, the actin-capping protein Arp1, and dynamitin. In vitro binding experiments show that dynactin enhances dynein-dependent motility, possibly through interaction with microtubules and vesicles. Subunit p25 is part of the pointed-end subcomplex in dynactin that also includes p26, p27, and Arp11. This subcomplex interacts with membranous cargoes. p25 and p27 contain imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), indicating a left-handed parallel beta helix (LbH) structural domain. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 161
23496 100050 cd03360 LbH_AT_putative Putative Acyltransferase (AT), Left-handed parallel beta-Helix (LbH) domain; This group is composed of mostly uncharacterized proteins containing an N-terminal helical subdomain followed by a LbH domain. The alignment contains 6 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. A few members are identified as NeuD, a sialic acid (Sia) O-acetyltransferase that is required for Sia synthesis and surface polysaccharide sialylation. 197
23497 173781 cd03361 TOPRIM_TopoIA_RevGyr TopoIA_RevGyr: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to the ATP-dependent reverse gyrase found in archaea and thermophilic bacteria. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap. Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 170
23498 173782 cd03362 TOPRIM_TopoIA_TopoIII TOPRIM_TopoIA_TopoIII: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to topoisomerase III. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 151
23499 173783 cd03363 TOPRIM_TopoIA_TopoI TOPRIM_TopoIA_TopoI: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to Escherichia coli DNA topoisomerase I. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 123
23500 173784 cd03364 TOPRIM_DnaG_primases TOPRIM_DnaG_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of proteins similar to Escherichia coli DnaG. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. E. coli DnaG is a single subunit enzyme. 79
23501 173785 cd03365 TOPRIM_TopoIIA TOPRIM_TopoIIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 120
23502 173786 cd03366 TOPRIM_TopoIIA_GyrB TOPRIM_TopoIIA_GyrB: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to the Escherichia coli GyrB subunit. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. DNA gyrase is more effective at relaxing supercoils than decatenating DNA. DNA gyrase in addition inserts negative supercoils in the presence of ATP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 114
23503 239465 cd03367 Ribosomal_S23 S12-like family, 40S ribosomal protein S23 subfamily; S23 is located at the interface of the large and small ribosomal subunits of eukaryotes, adjacent to the decoding center. It interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes the translocation of the growing peptidyl-tRNA to the P site to make room for the next aminoacyl-tRNA at the A (acceptor) site. Through its interaction with eEF2, S23 may play an important role in translocation. Also members of this subfamily are the archaeal 30S ribosomal S12 proteins. Prokaryotic S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as control element for the rRNA- and tRNA-driven movements of translocation. S12 and S23 are also implicated in translation accuracy. Antibiotics such as streptomycin bind S12/S23 and cause the ribosome to misread the genetic code. 115
23504 239466 cd03368 Ribosomal_S12 S12-like family, 30S ribosomal protein S12 subfamily; S12 is located at the interface of the large and small ribosomal subunits of prokaryotes, chloroplasts and mitochondria, where it plays an important role in both tRNA and ribosomal subunit interactions. S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. Antibiotics such as streptomycin bind S12 and cause the ribosome to misread the genetic code. 108
23505 213269 cd03369 ABCC_NFT1 ATP-binding cassette domain 2 of NFT1, subfamily C. Domain 2 of NFT1 (New full-length MRP-type transporter 1). NFT1 belongs to the MRP (multidrug resistance-associated protein) family of ABC transporters. Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in multidrug-resistant lung cancer cells in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions such as glutathione, glucuronate, and sulfate. 207
23506 380327 cd03370 nitroreductase uncharacterized nitroreductase family proteins. Nitroreductase family containing Thermus thermophilus NADH oxidase and other, uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 191
23507 239468 cd03371 TPP_PpyrDC Thiamine pyrophosphate (TPP) family, PpyrDC subfamily, TPP-binding module; composed of proteins similar to phosphonopyruvate decarboxylase (PpyrDC) proteins. PpyrDC is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. These proteins require TPP and divalent metal cation cofactors. 188
23508 239469 cd03372 TPP_ComE Thiamine pyrophosphate (TPP) family, ComE subfamily, TPP-binding module; composed of proteins similar to Methanococcus jannaschii sulfopyruvate decarboxylase beta subunit (ComE). M. jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six beta (E) subunits, which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. ComDE requires TPP and divalent metal cation cofactors. 179
23509 239470 cd03375 TPP_OGFOR Thiamine pyrophosphate (TPP family), 2-oxoglutarate ferredoxin oxidoreductase (OGFOR) subfamily, TPP-binding module; OGFOR catalyzes the oxidative decarboxylation of 2-oxo-acids, with ferredoxin acting as an electron acceptor. In the TCA cycle, OGFOR catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA. In the reductive tricarboxylic acid cycle found in the anaerobic autotroph Hydrogenobacter thermophilus, OGFOR catalyzes the reductive carboxylation of succinyl-CoA to produce 2-oxoglutarate. Thauera aromatica OGFOR has been shown to provide reduced ferredoxin to benzoyl-CoA reductase, a key enzyme in the anaerobic metabolism of aromatic compounds. OGFOR is dependent on TPP and a divalent metal cation for activity. 193
23510 239471 cd03376 TPP_PFOR_porB_like Thiamine pyrophosphate (TPP family), PFOR porB-like subfamily, TPP-binding module; composed of proteins similar to the beta subunit (porB) of the Helicobacter pylori four-subunit pyruvate ferredoxin oxidoreductase (PFOR), which are also found in archaea and some hyperthermophilic bacteria. PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The 36-kDa porB subunit contains the binding sites for the cofactors, TPP and a divalent metal cation, which are required for activity. 235
23511 239472 cd03377 TPP_PFOR_PNO Thiamine pyrophosphate (TPP family), PFOR_PNO subfamily, TPP-binding module; composed of proteins similar to the single subunit pyruvate ferredoxin oxidoreductase (PFOR) of Desulfovibrio africanus, present in bacteria and amitochondriate eukaryotes. This subfamily also includes proteins characterized as pyruvate NADP+ oxidoreductase (PNO). These enzymes are dependent on TPP and a divalent metal cation as cofactors. PFOR and PNO catalyze the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The PFOR from cyanobacterium Anabaena (NifJ) is required for the transfer of electrons from pyruvate to flavodoxin, which reduces nitrogenase. The facultative anaerobic mitochondrion of the photosynthetic protist Euglena gracilis oxidizes pyruvate with PNO. 365
23512 239473 cd03378 beta_CA_cladeC Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 154
23513 239474 cd03379 beta_CA_cladeD Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 142
23514 239475 cd03380 PAP2_like_1 PAP2_like_1 proteins, a sub-family of PAP2, containing bacterial acid phosphatase, vanadium chloroperoxidases and vanadium bromoperoxidases. 209
23515 239476 cd03381 PAP2_glucose_6_phosphatase PAP2_like proteins, glucose-6-phosphatase subfamily. Glucose-6-phosphatase converts glucose-6-phosphate into free glucose and is active in the lumen of the endoplasmic reticulum, where it is bound to the membrane. The generation of free glucose is an important control point in metabolism, and stands at the end of gluconeogenesis and the release of glucose from glycogen. Deficiency of glucose-6-phosphatase leads to von Gierke's disease. 235
23516 239477 cd03382 PAP2_dolichyldiphosphatase PAP2_like proteins, dolichyldiphosphatase subfamily. Dolichyldiphosphatase is a membrane-associated protein located in the endoplasmic reticulum and hydrolyzes dolichyl pyrophosphate, as well as dolichylmonophosphate at a low rate. The enzyme is necessary for maintaining proper levels of dolichol-linked oligosaccharides and protein N-glycosylation, and might play a role in re-utilization of the glycosyl carrier lipid for additional rounds of lipid intermediate biosynthesis after its release during protein N-glycosylation reactions. 159
23517 239478 cd03383 PAP2_diacylglycerolkinase PAP2_like proteins, diacylglycerol_kinase like sub-family. In some prokaryotes, PAP2_like phosphatase domains appear fused to E. coli DAGK-like trans-membrane diacylglycerol kinase domains. The cellular function of these architectures remains to be determined. 109
23518 239479 cd03384 PAP2_wunen PAP2, wunen subfamily. Most likely a family of membrane associated phosphatidic acid phosphatases. Wunen is a Drosophila protein expressed in the central nervous system, which provides repellent activity towards primordial germ cells (PGCs), controls the survival of PGCs and is essential in the migration process of these cells towards the somatic gonadal precursors. 150
23519 239480 cd03385 PAP2_BcrC_like PAP2_like proteins, BcrC_like subfamily. Several members of this family have been annotated as bacitracin transport permeases, as it was suspected that they form the permease component of an ABC transporter system. It was shown, however, that BcrC from Bacillus subtilis possesses undecaprenyl pyrophosphate (UPP) phosphatase activity, and it is hypothesized that it competes with bacitracin for UPP, increasing the cell's resistance to bacitracin. 144
23520 239481 cd03386 PAP2_Aur1_like PAP2_like proteins, Aur1_like subfamily. Yeast Aur1p or Ipc1p is necessary for the addition of inositol phosphate to ceramide, an essential step in yeast sphingolipid synthesis, and is the target of several antifungal compounds such as aureobasidin. 186
23521 239482 cd03388 PAP2_SPPase1 PAP2_like proteins, sphingosine-1-phosphatase subfamily. Sphingosine-1-phosphatase is an intracellular enzyme located in the endoplasmic reticulum, which regulates the level of sphingosine-1-phosphate (S1P), a bioactive lipid. S1P acts as a second messenger in the cell, and extracellularly by binding to G-protein coupled receptors of the endothelial differentiation gene family. 151
23522 239483 cd03389 PAP2_lipid_A_1_phosphatase PAP2_like proteins, Lipid A 1-phosphatase subfamily. Lipid A 1-phosphatase, or LpxE from Francisella novicida selectively dephosphorylates lipid A at the 1-position. Lipid A is the membrane-anchor component of lipopolysaccharides (LPS), the major constituents of the outer membrane in many gram-negative bacteria. 186
23523 239484 cd03390 PAP2_containing_1_like PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_1. Most likely membrane-associated phosphatidic acid phosphatases. Plant members of this group are constitutively expressed in many tissues and exhibit both diacylglycerol pyrophosphate phosphatase activity and phosphatidate (PA) phosphatase activity; they may have a more generic housekeeping role in lipid metabolism. 193
23524 239485 cd03391 PAP2_containing_2_like PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_2. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to eukaryota, lacks functional characterization and may act as a membrane-associated phosphatidic acid phosphatase. 159
23525 239486 cd03392 PAP2_like_2 PAP2_like_2 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 182
23526 239487 cd03393 PAP2_like_3 PAP2_like_3 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria and archaea, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 125
23527 239488 cd03394 PAP2_like_5 PAP2_like_5 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 106
23528 239489 cd03395 PAP2_like_4 PAP2_like_4 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 177
23529 239490 cd03396 PAP2_like_6 PAP2_like_6 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which mainly contains bacterial proteins, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 197
23530 239491 cd03397 PAP2_acid_phosphatase PAP2, bacterial acid phosphatase or class A non-specific acid phosphatases. These enzymes catalyze phosphomonoester hydrolysis, with optimal activity in low pH conditions. They are secreted into the periplasmic space, and their physiological role remains to be determined. 232
23531 239492 cd03398 PAP2_haloperoxidase PAP2, haloperoxidase_like subfamily. Haloperoxidases catalyze the oxidation of halides such as bromide or chloride by hydrogen peroxide, which results in subsequent halogenation of organic substrates, or halide-assisted disproportionation of hydrogen peroxide forming dioxygen. They are likely to participate in the biosynthesis of halogenated natural products, such as volatile halogenated hydrocarbons, chiral halogenated terpenes, acetogenins and indoles. 232
23532 259798 cd03399 SPFH_flotillin Flotillin or reggie family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. The flotillin (reggie) like proteins are lipid raft-associated. Individual proteins of this SPFH family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. In addition, microdomains formed from flotillin proteins may be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and interact with a variety of proteins. They may play a role in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. 145
23533 259799 cd03401 SPFH_prohibitin Prohibitin family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prohibitin (a lipid raft-associated integral membrane protein). Individual proteins of the SPFH (band 7) domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. These microdomains, in addition to being stable scaffolds, may also be dynamic units with their own regulatory functions. Prohibitin is a mitochondrial inner-membrane protein which may act as a chaperone for the stabilization of mitochondrial proteins. Human prohibitin forms a hetero-oligomeric complex with Bap-37 (prohibitin 2, an SPFH domain carrying homolog). This complex may protect non-assembled membrane proteins against proteolysis by the m-AAA protease. Prohibitin and Bap-37 yeast homologs have been implicated in yeast longevity and in the maintenance of mitochondrial morphology. 195
23534 259800 cd03402 SPFH_like_u2 Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 231
23535 259801 cd03403 SPFH_stomatin Stomatin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. Stomatin (or band 7) is widely expressed, and highly expressed in red blood cells. It localizes predominantly to the plasma membrane and to intracellular vesicles of the endocytic pathway, where it is present in higher order homo-oligomeric complexes (of between 9 and 12 monomers). Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons, and is implicated in trafficking of Glut1 glucose transporters. This subgroup, found in animals, also contains proteins similar to Caenorhabditis elegans MEC-2. MEC-2 interacts with MEC-4, which is part of the degenerin channel complex required for response to gentle body touch. 202
23536 259802 cd03404 SPFH_HflK High frequency of lysogenization K (HflK) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prokaryotic HflK (High frequency of lysogenization K). Although many members of the SPFH (or band 7) superfamily are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this SPFH domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflK is an integral membrane protein which may localize to the plasma membrane. HflK associates with another SPFH superfamily member (HflC) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 266
23537 259803 cd03405 SPFH_HflC High frequency of lysogenization C (HflC) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prokaryotic HflC (High frequency of lysogenization C). Although many members of the SPFH (or band 7) superfamily are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this SPFH domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflC is an integral membrane protein which may localize to the plasma membrane. HflC associates with another SPFH superfamily member (HflK) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 249
23538 259804 cd03406 SPFH_like_u3 Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 293
23539 259805 cd03407 SPFH_like_u4 Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 269
23540 259806 cd03408 SPFH_like_u1 Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 217
23541 239503 cd03409 Chelatase_Class_II Class II Chelatase: a family of ATP-independent monomeric or homodimeric enzymes that catalyze the insertion of metal into protoporphyrin rings. This family includes protoporphyrin IX ferrochelatase (HemH), sirohydrochlorin ferrochelatase (SirB) and the cobaltochelatases, CbiK and CbiX. HemH and SirB are involved in heme and siroheme biosynthesis, respectively, while the cobaltochelatases are associated with cobalamin biosynthesis. Excluded from this family are the ATP-dependent heterotrimeric chelatases (class I) and the multifunctional homodimeric enzymes with dehydrogenase and chelatase activities (class III). 101
23542 239504 cd03411 Ferrochelatase_N Ferrochelatase, N-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus. 159
23543 239505 cd03412 CbiK_N Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), N-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases and is a homomeric enzyme that does not require ATP for its enzymatic activity. 127
23544 239506 cd03413 CbiK_C Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), C-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases, and is a homomeric enzyme that does not require ATP for its enzymatic activity. 103
23545 239507 cd03414 CbiX_SirB_C Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), C-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both CbiX and SirB are found in a wide range of bacteria. 117
23546 239508 cd03415 CbiX_CbiC Archaeal sirohydrochlorin cobalt chelatase (CbiX) single domain. Proteins in this subgroup contain a single CbiX domain N-terminal to a precorrin-8X methylmutase (CbiC) domain. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, while CbiC catalyzes the conversion of cobalt-precorrin 8 to cobyrinic acid by methyl rearrangement. Both CbiX and CbiC are involved in vitamin B12 biosynthesis. 125
23547 239509 cd03416 CbiX_SirB_N Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), N-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both are found in a wide range of bacteria. This subgroup also contains single domain proteins from archaea and bacteria which may represent the ancestral form of class II chelatases before domain duplication occurred. 101
23548 239510 cd03418 GRX_GRXb_1_3_like Glutaredoxin (GRX) family, GRX bacterial class 1 and 3 (b_1_3)-like subfamily; composed of bacterial GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including E. coli GRX1 and GRX3, which are members of this subfamily. 75
23549 239511 cd03419 GRX_GRXh_1_2_like Glutaredoxin (GRX) family, GRX human class 1 and 2 (h_1_2)-like subfamily; composed of proteins similar to human GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, which are members of this subfamily. Also included in this subfamily are the N-terminal GRX domains of proteins similar to human thioredoxin reductase 1 and 3. 82
23550 239512 cd03420 SirA_RHOD_Pry_redox SirA_RHOD_Pry_redox. SirA-like domain located within a multidomain protein of unknown function. Other domains include RHOD (rhodanese homology domain), and Pry_redox (pyridine nucleotide-disulphide oxidoreductase) as well as a C-terminal domain that corresponds to COG2210. This fold is referred to as a two-layered alpha/beta sandwich, structurally similar to that of translation initiation factor 3. 69
23551 239513 cd03421 SirA_like_N SirA_like_N, a protein of unknown function with an N-terminal SirA-like domain. The SirA, YedF, YeeD protein family is present in bacteria as well as archaea. SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 67
23552 239514 cd03422 YedF YedF is a bacterial SirA-like protein of unknown function. SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 69
23553 239515 cd03423 SirA SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is thought to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 69
23554 239516 cd03424 ADPRase_NUDT5 ADP-ribose pyrophosphatase (ADPRase) catalyzes the hydrolysis of ADP-ribose and a variety of additional ADP-sugar conjugates to AMP and ribose-5-phosphate. Like other members of the Nudix hydrolase superfamily, it requires a divalent cation, such as Mg2+, for its activity. It also contains a highly conserved 23-residue Nudix motif (GX5EX7REUXEEXGU, where U = I, L or V) which functions as a metal binding site/catalytic site. In addition to the Nudix motif, there are additional conserved amino acid residues, distal from the signature sequence, that correlate with substrate specificity. In humans, there are four distinct ADPRase activities, three putative cytosolic enzymes (ADPRase-I, -II, and -Mn) and a single mitochondrial enzyme (ADPRase-m). Human ADPRase-II is also referred to as NUDT5. It lacks the N-terminal target sequence unique to mitochondrial ADPRase. The different cytosolic types are distinguished by their specificities for substrate and specific requirement for metal ions. NUDT5 forms a homodimer. 137
23555 239517 cd03425 MutT_pyrophosphohydrolase The MutT pyrophosphohydrolase is a prototypical Nudix hydrolase that catalyzes the hydrolysis of nucleoside and deoxynucleoside triphosphates (NTPs and dNTPs) by substitution at a beta-phosphorus to yield a nucleotide monophosphate (NMP) and inorganic pyrophosphate (PPi). This enzyme requires two divalent cations for activity; one coordinates the phosphoryl groups of the NTP/dNTP substrate, and the other coordinates to the enzyme. It also contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as metal binding and catalytic site. MutT pyrophosphohydrolase is important in preventing errors in DNA replication by hydrolyzing mutagenic nucleotides such as 8-oxo-dGTP (a product of oxidative damage), which can mispair with template adenine during DNA replication, to guanine nucleotides. 124
23556 239518 cd03426 CoAse Coenzyme A pyrophosphatase (CoAse), a member of the Nudix hydrolase superfamily, functions to catalyze the elimination of oxidized inactive CoA, which can inhibit CoA-utilizing enzymes. The need for CoAses mainly arises under conditions of oxidative stress. CoAse has a conserved Nudix fold and requires a single divalent cation for catalysis. In addition to a signature Nudix motif G[X5]E[X7]REUXEEXGU, where U is Ile, Leu, or Val, CoAse contains an additional motif upstream called the NuCoA motif (LLTXT(SA)X3RX3GX3FPGG) which is postulated to be involved in CoA recognition. CoA plays a central role in lipid metabolism. It is involved in the initial steps of fatty acid synthesis in the cytosol, in the oxidation of fatty acids and the citric acid cycle in the mitochondria, and in the oxidation of long-chain fatty acids in peroxisomes. CoA has the important role of activating fatty acids for further modification into key biological signalling molecules. 157
23557 239519 cd03427 MTH1 MutT homolog-1 (MTH1) is a member of the Nudix hydrolase superfamily. MTH1, the mammalian counterpart of MutT, hydrolyzes oxidized purine nucleoside triphosphates, such as 8-oxo-dGTP and 2-hydroxy-ATP, to monophosphates, thereby preventing the incorporation of such oxygen radicals during replication. This is an important step in the repair mechanism in genomic and mitochondrial DNA. Like other members of the Nudix family, it requires a divalent cation, such as Mg2+ or Mn2+, for activity, and contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. MTH1 is predominantly localized in the cytoplasm and mitochondria. Structurally, this enzyme adopts a similar fold to MutT despite low sequence similarity outside the conserved nudix motif. The most distinctive structural difference between MutT and MTH1 is the presence of a beta-hairpin, which is absent in MutT. This results in a much deeper and narrower substrate binding pocket. Mechanistically, MTH1 contains dual specificity for nucleotides that contain 2-OH-adenine bases and those that contain 8-oxo-guanine bases. 137
23558 239520 cd03428 Ap4A_hydrolase_human_like Diadenosine tetraphosphate (Ap4A) hydrolase is a member of the Nudix hydrolase superfamily. Ap4A hydrolases are well represented in a variety of prokaryotic and eukaryotic organisms. Phylogenetic analysis reveals two distinct subgroups where plant enzymes fall into one subfamily and fungi/animals/archaea enzymes, represented by this subfamily, fall into another. Bacterial enzymes are found in both subfamilies. Ap4A is a potential by-product of aminoacyl tRNA synthesis, and accumulation of Ap4A has been implicated in a range of biological events, such as DNA replication, cellular differentiation, heat shock, metabolic stress, and apoptosis. Ap4A hydrolase cleaves Ap4A asymmetrically into ATP and AMP. It is important in the invasive properties of bacteria and thus presents a potential target for inhibition of such invasive bacteria. Besides the signature nudix motif (G[X5]E[X7]REUXEEXGU, where U is Ile, Leu, or Val) that functions as a metal binding and catalytic site, and a required divalent cation, Ap4A hydrolase is structurally similar to the other members of the nudix superfamily with some degree of variation. Several regions in the sequences are poorly defined and substrate and metal binding sites are only predicted based on kinetic studies. 130
23559 239521 cd03429 NADH_pyrophosphatase NADH pyrophosphatase, a member of the Nudix hydrolase superfamily, catalyzes the cleavage of NADH into reduced nicotinamide mononucleotide (NMNH) and AMP. Like other members of the Nudix family, it requires a divalent cation, such as Mg2+ or Mn2+, for activity. Members of this family are also recognized by the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. A block of 8 conserved amino acids downstream of the nudix motif is thought to give NADH pyrophosphatase its specificity for NADH. NADH pyrophosphatase forms a dimer. 131
23560 239522 cd03430 GDPMH GDP-mannose glycosyl hydrolase (AKA GDP-mannose mannosyl hydrolase (GDPMH)) is a member of the Nudix hydrolase superfamily. This class of enzymes is unique from other members of the superfamily in two aspects. First, it contains a modified Nudix signature sequence. The slight changes to the conserved sequence motif (GX5EX7REUXEEXGU, where U = I, L or V) are believed to contribute to the removal of all magnesium binding sites but one, retaining only the metal site that coordinates the pyrophosphate of the substrate. Secondly, it is not a pyrophosphatase that substitutes at a phosphorus; instead, it hydrolyzes nucleotide sugars such as GDP-mannose to GDP and mannose, cleaving the phosphoglycosyl bond by substituting at a carbon position. GDP-mannose provides mannosyl components for cell wall synthesis and is required for the synthesis of other glycosyl donors (such as GDP-fucose and colitose) for the cell wall. The importance of GDP-sugar hydrolase activities is thus closely related to the regulation of cell wall biosynthesis. Enzymes in this family are believed to regulate the concentration of GDP-mannose and GDP-glucose in the bacterial cell wall. 144
23561 239523 cd03431 DNA_Glycosylase_C DNA glycosylase (MutY in bacteria and hMYH in humans) is responsible for repairing misread A*oxoG residues to C*G by removing the inappropriately paired adenine base from the DNA backbone. It belongs to the Nudix hydrolase superfamily and is important for the repair of various genotoxic lesions. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They are also recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V). However, DNA glycosylase does not seem to contain this signature motif. DNA glycosylase consists of 2 domains: the N-terminal domain contains the catalytic properties of the enzyme and the C-terminal domain affects substrate (oxoG) binding and enzymatic turnover. The C-terminal domain is highly similar to MutT, based on secondary structure and topology, despite low sequence identity. MutT sanitizes the nucleotide precursor pool by hydrolyzing oxo-dGTP to oxo-dGMP and inorganic pyrophosphate. The similarity strongly suggests that the two proteins share a common evolutionary origin. 118
23562 239524 cd03440 hot_dog The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis. 100
23563 239525 cd03441 R_hydratase_like (R)-hydratase [(R)-specific enoyl-CoA hydratase]. Catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway. The structure of the monomer includes a five-strand antiparallel beta-sheet wrapped around a central alpha helix, referred to as a hot dog fold. The active site lies within a substrate-binding tunnel formed by the homodimer. Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and the fatty acid synthase beta subunit. 127
23564 239526 cd03442 BFIT_BACH Brown fat-inducible thioesterase (BFIT). Brain acyl-CoA hydrolase (BACH). These enzymes deacylate long-chain fatty acids by hydrolyzing acyl-CoA thioesters to free fatty acids and CoA-SH. Eukaryotic members of this family are expressed in brain, testis, and brown adipose tissues. The archaeal and eukaryotic members of this family have two tandem copies of the conserved hot dog fold, while most bacterial members have only one copy. 123
23565 239527 cd03443 PaaI_thioesterase PaaI_thioesterase is a tetrameric acyl-CoA thioesterase with a hot dog fold and one of several proteins responsible for phenylacetic acid (PA) degradation in bacteria. Although orthologs of PaaI exist in archaea and eukaryotes, their function has not been determined. Sequence similarity between PaaI, E. coli medium chain acyl-CoA thioesterase II, and human thioesterase III suggests they all belong to the same thioesterase superfamily. The conserved fold present in these thioesterases is referred to as an asymmetric hot dog fold, similar to those of 4-hydroxybenzoyl-CoA thioesterase (4HBT) and the beta-hydroxydecanoyl-ACP dehydratases (FabA/FabZ). 113
23566 239528 cd03444 Thioesterase_II_repeat1 Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 104
23567 239529 cd03445 Thioesterase_II_repeat2 Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 94
23568 239530 cd03446 MaoC_like MaoC_like. Similar to the MaoC (monoamine oxidase C) dehydratase regulatory protein but without the N-terminal PutA domain. This protein family has a hot-dog fold similar to that of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 140
23569 239531 cd03447 FAS_MaoC FAS_MaoC, the MaoC-like hot dog fold of the fatty acid synthase, beta subunit. Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and 17-beta-hydroxysteroid dehydrogenase (HSD). 126
23570 239532 cd03448 HDE_HSD HDE_HSD The R-hydratase-like hot dog fold of the 17-beta-hydroxysteroid dehydrogenase (HSD), and Hydratase-Dehydrogenase-Epimerase (HDE) proteins. Other enzymes with this fold include MaoC dehydratase, and the fatty acid synthase beta subunit. 122
23571 239533 cd03449 R_hydratase (R)-hydratase [(R)-specific enoyl-CoA hydratase] catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway. (R)-hydratase contains a hot-dog fold similar to those of thioesterase II, and beta-hydroxydecanoyl-ACP dehydratase, MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and the fatty acid synthase beta subunit. The active site lies within a substrate-binding tunnel formed by the (R)-hydratase homodimer. A subset of the bacterial (R)-hydratases contain a C-terminal phosphotransacetylase (PTA) domain. 128
23572 239534 cd03450 NodN NodN (nodulation factor N) contains a single hot dog fold similar to those of the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. Rhizobium and related species form nodules on the roots of their legume hosts, a symbiotic process that requires production of Nod factors, which are signal molecules involved in root hair deformation and meristematic cell division. The nodulation gene products, including NodN, are involved in producing the Nod factors, however the role played by NodN is unclear. 149
23573 239535 cd03451 FkbR2 FkbR2 is a Streptomyces hygroscopicus protein with a hot dog fold that belongs to a conserved family of proteins found in prokaryotes and archaea but not in eukaryotes. FkbR2 has sequence similarity to (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. The function of FkbR2 is unknown. 146
23574 239536 cd03452 MaoC_C MaoC_C The C-terminal hot dog fold of the MaoC (monoamine oxidase C) dehydratase regulatory protein. Orthologs of MaoC include PaaZ [Escherichia coli] and PaaN [Pseudomonas putida], which are putative ring-opening enzymes involved in phenylacetic acid degradation. The C-terminal domain of MaoC has sequence similarity to (R)-specific enoyl-CoA hydratase, Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. MaoC also has an N-terminal PutA domain like that found in the E. coli PutA proline dehydrogenase and other members of the aldehyde dehydrogenase family. 142
23575 239537 cd03453 SAV4209_like SAV4209_like. Similar in sequence to the Streptomyces avermitilis SAV4209 protein, with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 127
23576 239538 cd03454 YdeM YdeM is a Bacillus subtilis protein that belongs to a family of prokaryotic proteins of unknown function. YdeM has sequence similarity to the hot-dog fold of (R)-specific enoyl-CoA hydratase. Other enzymes with this fold include the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 140
23577 239539 cd03455 SAV4209 SAV4209 is a Streptomyces avermitilis protein with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. The alpha- and gamma-proteobacterial members of this CD have, in addition to a hot dog fold, an N-terminal extension. 123
23578 239540 cd03457 intradiol_dioxygenase_like Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn2+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown. 188
23579 239541 cd03458 Catechol_intradiol_dioxygenases Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. 256
23580 239542 cd03459 3,4-PCD Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 158
23581 239543 cd03460 1,2-CTD Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. 282
23582 239544 cd03461 1,2-HQD Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes. 277
23583 239545 cd03462 1,2-CCD Chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway. 247
23584 239546 cd03463 3,4-PCD_alpha Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 185
23585 239547 cd03464 3,4-PCD_beta Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 220
23586 239548 cd03465 URO-D_like The URO-D_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for the ability of methanogenic organisms to use compounds other than carbon dioxide for reduction to methane. 330
23587 239549 cd03466 Nitrogenase_NifN_2 Nitrogenase_nifN_2: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE. NifN and nifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The nifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this nifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco). This group also contains the Clostridium fused NifN-NifB protein. 429
23588 239550 cd03467 Rieske Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis. 98
23589 176458 cd03468 PolY_like DNA Polymerase Y-family. Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions. Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion. Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention. 335
23590 239551 cd03469 Rieske_RO_Alpha_N Rieske non-heme iron oxygenase (RO) family, N-terminal Rieske domain of the oxygenase alpha subunit; The RO family comprises a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The oxygenase component may contain alpha and beta subunits, with the beta subunit having a purely structural function. Some oxygenase components contain only an alpha subunit. The oxygenase alpha subunit has two domains, an N-terminal Rieske domain with a [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from the reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Reduced pyridine nucleotide is used as the initial source of two electrons for dioxygen activation. 118
23591 239552 cd03470 Rieske_cytochrome_bc1 Iron-sulfur protein (ISP) component of the bc(1) complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The bc(1) complex is a multisubunit enzyme found in many different organisms including uni- and multi-cellular eukaryotes, plants (in their mitochondria) and bacteria. The cytochrome bc(1) and b6f complexes are central components of the respiratory and photosynthetic electron transport chains, respectively, which carry out similar core electron and proton transfer steps. The bc(1) and b6f complexes share a common core structure of three catalytic subunits: cyt b, the Rieske ISP, and either a cyt c1 in the bc(1) complex or cyt f in the b6f complex, which are arranged in an integral membrane-bound dimeric complex. While the core of the b6f complex is similar to that of the bc(1) complex, the domain arrangement outside the core and the complement of prosthetic groups are strikingly different. 126
23592 239553 cd03471 Rieske_cytochrome_b6f Iron-sulfur protein (ISP) component of the b6f complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The cytochrome b6f complex from Mastigocladus laminosus, a thermophilic cyanobacterium, contains four large subunits, including cytochrome f, cytochrome b6, the Rieske ISP, and subunit IV; as well as four small hydrophobic subunits, PetG, PetL, PetM, and PetN. Rieske ISP, one of the large subunits of the cytochrome bc-type complexes, is involved in respiratory and photosynthetic electron transfer. The core of the chloroplast b6f complex is similar to the analogous respiratory cytochrome bc(1) complex, but the domain arrangement outside the core and the complement of prosthetic groups are strikingly different. 126
23593 239554 cd03472 Rieske_RO_Alpha_BPDO_like Rieske non-heme iron oxygenase (RO) family, Biphenyl dioxygenase (BPDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of BPDO and similar proteins including cumene dioxygenase (CumDO), nitrobenzene dioxygenase (NBDO), alkylbenzene dioxygenase (AkbDO) and dibenzofuran 4,4a-dioxygenase (DFDO). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with a [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. BPDO degrades biphenyls and polychlorinated biphenyls (PCBs) while CumDO degrades cumene (isopropylbenzene), an aromatic hydrocarbon that is intermediate in size between ethylbenzene and biphenyl. NBDO catalyzes the initial reaction in nitrobenzene degradation, oxidizing the aromatic rings of mono- and dinitrotoluenes to form catechol and nitrite. NBDO belongs to the naphthalene subfamily of ROs. AkbDO is involved in alkylbenzene catabolism, converting o-xylene to 2,3- and 3,4-dimethylphenol and ethylbenzene to cis-dihydrodiol. DFDO is involved in dibenzofuran degradation. 128
23594 239555 cd03473 Rieske_CMP_Neu5Ac_hydrolase_N Cytidine monophosphate-N-acetylneuraminic acid (CMP Neu5Ac) hydroxylase family, N-terminal Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. CMP Neu5Ac hydroxylase is the key enzyme for the synthesis of N-glycolylneuraminic acid (NeuGc) from N-acetylneuraminic acid (Neu5Ac). NeuGc and Neu5Ac are members of a family of cell surface sugars called sialic acids. All mammals except humans have both NeuGc variants on their cell surfaces. In humans, the gene encoding CMP Neu5Ac hydroxylase has a mutation within its coding region that abolishes NeuGc production. 107
23595 239556 cd03474 Rieske_T4moC Toluene-4-monooxygenase effector protein complex (T4mo), Rieske ferredoxin subunit; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. T4mo is a four-protein complex that catalyzes the NADH- and O2-dependent hydroxylation of toluene to form p-cresol. T4mo consists of an NADH oxidoreductase (T4moF), a diiron hydroxylase (T4moH), a catalytic effector protein (T4moD), and a Rieske ferredoxin (T4moC). T4moC contains a Rieske domain and functions as an obligate electron carrier between T4moF and T4moH. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes. 108
23596 239557 cd03475 Rieske_SoxF_SoxL SoxF and SoxL family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. SoxF is a subunit of the terminal oxidase supercomplex SoxM in the plasma membrane of Sulfolobus acidocaldarius that combines features of a cytochrome bc(1) complex and a cytochrome. The Rieske domain of SoxF has a 12 residue insertion which is not found in eukaryotic and bacterial Rieske proteins and is thought to influence the redox properties of the iron-sulfur cluster. SoxL is a Rieske protein which may be part of an archaeal bc-complex homologue whose physiological function is still unknown. SoxL has two features not seen in other Rieske proteins; (i) a significantly greater distance between the two cluster-binding sites and (ii) an unexpected Pro -> Asp substitution at one of the cluster binding sites. SoxF and SoxL are found in archaea and in bacteria. 171
23597 239558 cd03476 Rieske_ArOX_small Small subunit of Arsenite oxidase (ArOX) family, Rieske domain; ArOX is a molybdenum/iron protein involved in the detoxification of arsenic, oxidizing it to arsenate. It consists of two subunits, a large subunit similar to members of the DMSO reductase family of molybdenum enzymes and a small subunit with a Rieske-type [2Fe-2S] cluster. The large subunit of ArOX contains the molybdenum site at which the oxidation of arsenite occurs. The small subunit contains a domain homologous to the Rieske domains of the cytochrome bc(1) and cytochrome b6f complexes as well as naphthalene 1,2-dioxygenase. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. 126
23598 239559 cd03477 Rieske_YhfW_C YhfW family, C-terminal Rieske domain; YhfW is a protein of unknown function with an N-terminal DadA-like (glycine/D-amino acid dehydrogenase) domain and a C-terminal Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. It is commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. YhfW is found in bacteria, some eukaryotes and archaea. 91
23599 239560 cd03478 Rieske_AIFL_N AIFL (apoptosis-inducing factor like) family, N-terminal Rieske domain; members of this family show similarity to human AIFL, containing an N-terminal Rieske domain and a C-terminal pyridine nucleotide-disulfide oxidoreductase domain (Pyr_redox). The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. AIFL shares 35% homology with human AIF (apoptosis-inducing factor), mainly in the Pyr_redox domain. AIFL is predominantly localized to the mitochondria. AIFL induces apoptosis in a caspase-dependent manner. 95
23600 239561 cd03479 Rieske_RO_Alpha_PhDO_like Rieske non-heme iron oxygenase (RO) family, Phthalate 4,5-dioxygenase (PhDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of PhDO and similar proteins including 3-chlorobenzoate 3,4-dioxygenase (CBDO), phenoxybenzoate dioxygenase (POB-dioxygenase) and 3-nitrobenzoate oxygenase (MnbA). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PhDO and CBDO are two-component RO systems, containing oxygenase and reductase components. PhDO catalyzes the dihydroxylation of phthalate to form the 4,5-dihydro-cis-dihydrodiol of phthalate (DHD). CBDO, together with CbaC dehydrogenase, converts the environmental pollutant 3CBA to protocatechuate (PCA) and 5-Cl-PCA, which are then metabolized by the chromosomal PCA meta (extradiol) ring fission pathway. POB-dioxygenase catalyzes the initial catabolic step in the angular dioxygenation of phenoxybenzoate, converting mono- and dichlorinated phenoxybenzoates to protocatechuate and chlorophenols. These phenoxybenzoates are metabolic products formed during the degradation of pyrethroid insecticides. 144
23601 239562 cd03480 Rieske_RO_Alpha_PaO Rieske non-heme iron oxygenase (RO) family, Pheophorbide a oxygenase (PaO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO) and ACD1 (accelerated cell death 1). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PaO expression increases upon physical wounding of plant leaves and is thought to catalyze a key step in chlorophyll degradation. The Arabidopsis-accelerated cell death gene ACD1 is involved in oxygenation of PaO. 138
23602 239563 cd03481 TopoIIA_Trans_ScTopoIIA TopoIIA_Trans_ScTopoIIA: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topo IIA. S. cerevisiae Topo IIA is a homodimer encoded by a single gene. The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes. All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/ decatenate duplex rings. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. 153
23603 239564 cd03482 MutL_Trans_MutL MutL_Trans_MutL: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to Escherichia coli MutL. EcMutL belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP. The MutS(ATP)-MutL ternary complex formed, then recruits the latent endonuclease MutH. Prokaryotic MutS and MutL are homodimers. 123
23604 239565 cd03483 MutL_Trans_MLH1 MutL_Trans_MLH1: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH1 (MutL homologue 1). This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking hMLH1 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hMLH1 accounts for a large fraction of HNPCC families. 127
23605 239566 cd03484 MutL_Trans_hPMS_2_like MutL_Trans_hPMS2_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS2 (hPMS2). hPMS2 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to yeast PMS1. The yeast MLH1-PMS1 and the human MLH1-PMS2 heterodimers play a role in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Cells lacking hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome. 142
23606 239567 cd03485 MutL_Trans_hPMS_1_like MutL_Trans_hPMS1_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS1 (hPMS1) and yeast MLH2. hPMS1 and yMLH2 are members of the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. PMS1 forms a heterodimer with MLH1. The MLH1-PMS1 complex functions in meiosis. Loss of yMLH2 results in a small but significant decrease in spore viability and a significant increase in gene conversion frequencies. A role for hMLH1-hPMS1 in DNA mismatch repair has not been established. Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families, however there is no convincing evidence to support hPMS1 having a role in HNPCC predisposition. 132
23607 239568 cd03486 MutL_Trans_MLH3 MutL_Trans_MLH3: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH3 (MutL homologue 3). MLH3 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with MLH3. The MLH1-MLH3 complex plays a role in meiosis. A role for hMLH1-hMLH3 in DNA mismatch repair (MMR) has not been established. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC. 141
23608 239569 cd03487 RT_Bac_retron_II RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome. 214
23609 239570 cd03488 Topoisomer_IB_N_htopoI_like Topoisomer_IB_N_htopoI_like : N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. This family may represent more than one structural domain. 215
23610 239571 cd03489 Topoisomer_IB_N_LdtopoI_like Topoisomer_IB_N_LdtopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topo I play putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 212
23611 239572 cd03490 Topoisomer_IB_N_1 Topoisomer_IB_N_1: A subgroup of the N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB. Topo IB proteins include the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topos I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I have putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 217
23612 239573 cd03493 SQR_QFR_TM Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) family, transmembrane subunits; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQRs may reduce either high or low potential quinones while QFRs oxidize only low potential quinols. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit(s) containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. SQRs and QFRs can be classified into five types (A-E) according to the number of their hydrophobic subunits and heme groups. This classification is consistent with the characteristics and phylogeny of the catalytic and iron-sulfur subunits. Type E proteins, e.g. non-classical archaeal SQRs, contain atypical transmembrane subunits and are not included in this hierarchy. The heme and quinone binding sites reside in the transmembrane subunits. Although succinate oxidation and fumarate reduction are carried out by separate enzymes in most organisms, some bifunctional enzymes that exhibit both SQR and QFR activities exist. 98
23613 239574 cd03494 SQR_TypeC_SdhD Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. E. coli SQR, a member of this subfamily, reduces the high potential quinone, ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. SdhD and SdhC are the two transmembrane proteins of bacterial SQRs. They contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 99
23614 239575 cd03495 SQR_TypeC_SdhD_like Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit-like; composed of predominantly uncharacterized bacterial proteins with similarity to the E. coli SdhD subunit. One characterized protein is the respiratory Complex II SdhD subunit of the only eukaryotic member, Reclinomonas americana. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. It is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. E. coli SQR is classified as Type C SQRs because it contains two transmembrane subunits and one heme group. The SdhD and SdhC subunits are membrane anchor subunits containing heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 100
23615 239576 cd03496 SQR_TypeC_CybS SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Eukaryotic SQRs reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. CybS and CybL are the two transmembrane proteins of eukaryotic SQRs. They contain heme and quinone binding sites. CybS is the eukaryotic homolog of the bacterial SdhD subunit. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the transmembrane subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Mutations in human Complex II result in various physiological disorders including hereditary paraganglioma and pheochromocytoma tumors. The gene encoding for the SdhD subunit is classified as a tumor suppressor gene. 104
23616 239577 cd03497 SQR_TypeB_1_TM Succinate:quinone oxidoreductase (SQR) Type B subfamily 1, transmembrane subunit; composed of proteins similar to Bacillus subtilis SQR. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Bacillus subtilis SQR reduces low potential quinones such as menaquinone. SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside on the transmembrane subunit. The transmembrane subunit of Bacillus subtilis SQR is also called Sdh cytochrome b558 subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 207
23617 239578 cd03498 SQR_TypeB_2_TM Succinate:quinone oxidoreductase (SQR)-like Type B subfamily 2, transmembrane subunit; composed of proteins with similarity to the SQRs of Geobacter metallireducens and Corynebacterium glutamicum. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. C. glutamicum SQR reduces low potential quinones such as menaquinone. SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The transmembrane subunit of members of this subfamily is also called Sdh cytochrome b558 subunit based on the Bacillus subtilis protein. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Proteins in this subfamily from G. metallireducens and G. sulfurreducens are bifunctional enzymes with SQR and QFR activities. 209
23618 239579 cd03499 SQR_TypeC_SdhC Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase C (SdhC) subunit; composed of bacterial SdhC and eukaryotic large cytochrome b binding (CybL) proteins. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this family reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Proteins in this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. The heme and quinone binding sites reside in the transmembrane subunits. The SdhC or CybL protein is one of the two transmembrane subunits of bacterial and eukaryotic SQRs. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 117
23619 239580 cd03500 SQR_TypeA_SdhD_like Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase D (SdhD)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A because they contain two transmembrane subunits as well as two heme groups. Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum. The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 106
23620 239581 cd03501 SQR_TypeA_SdhC_like Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase C (SdhC)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A because they contain two transmembrane subunits as well as two heme groups. Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum. The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 101
23621 239582 cd03505 Delta9-FADS-like The Delta9 Fatty Acid Desaturase (Delta9-FADS)-like CD includes the delta-9 and delta-11 acyl CoA desaturases found in various eukaryotes including vertebrates, insects, higher plants, and fungi. The delta-9 acyl-lipid desaturases are found in a wide range of bacteria. These enzymes play essential roles in fatty acid metabolism and the regulation of cell membrane fluidity. Acyl-CoA desaturases are the enzymes involved in the CoA-bound desaturation of fatty acids. Mammalian stearoyl-CoA delta-9 desaturase is a key enzyme in the biosynthesis of monounsaturated fatty acids, and in yeast, the delta-9 acyl-CoA desaturase (OLE1) reaction accounts for all de novo unsaturated fatty acid production in Saccharomyces cerevisiae. These non-heme, iron-containing, ER membrane-bound enzymes are part of a three-component enzyme system involving cytochrome b5, cytochrome b5 reductase, and the delta-9 fatty acid desaturase. This complex catalyzes the NADH- and oxygen-dependent insertion of a cis double bond between carbons 9 and 10 of the saturated fatty acyl substrates, palmitoyl (16:0)-CoA or stearoyl (18:0)-CoA, yielding the monoenoic products palmitoleic (16:1) or oleic (18:1) acids, respectively. In cyanobacteria, the biosynthesis of unsaturated fatty acids is initiated by delta 9 acyl-lipid desaturase (DesC) which introduces the first double bond at the delta-9 position of a saturated fatty acid that has been esterified to a glycerolipid. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXXH, HXXHH, and H/QXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase. Some eukaryotic (Fungi, Euglenozoa, Mycetozoa, Rhodophyta) desaturase domains have an adjacent C-terminal cytochrome b5-like domain. 178
23622 239583 cd03506 Delta6-FADS-like The Delta6 Fatty Acid Desaturase (Delta6-FADS)-like CD includes the integral-membrane enzymes: delta-4, delta-5, delta-6, delta-8, delta-8-sphingolipid, and delta-11 desaturases found in vertebrates, higher plants, fungi, and bacteria. These desaturases are required for the synthesis of highly unsaturated fatty acids (HUFAs), which are mainly esterified into phospholipids and contribute to maintaining membrane fluidity. While HUFAs may be required for cold tolerance in bacteria, plants and fish, the primary role of HUFAs in mammals is cell signaling. These enzymes are described as front-end desaturases because they introduce a double bond between the pre-existing double bond and the carboxyl (front) end of the fatty acid. Various substrates are involved, with both acyl-coenzyme A (CoA) and acyl-lipid desaturases present in this CD. Acyl-lipid desaturases are localized in the membranes of cyanobacterial thylakoid, plant endoplasmic reticulum (ER), and plastid; and acyl-CoA desaturases are present in ER membrane. ER-bound plant acyl-lipid desaturases and acyl-CoA desaturases require cytochrome b5 as an electron donor. Most of the eukaryotic desaturase domains have an adjacent N-terminal cytochrome b5-like domain. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXH, HXX(X)HH, and Q/HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. 204
23623 239584 cd03507 Delta12-FADS-like The Delta12 Fatty Acid Desaturase (Delta12-FADS)-like CD includes the integral-membrane enzymes, delta-12 acyl-lipid desaturases, oleate 12-hydroxylases, omega3 and omega6 fatty acid desaturases, and other related proteins, found in a wide range of organisms including higher plants, green algae, diatoms, nematodes, fungi, and bacteria. The expression of these proteins appears to be temperature dependent: decreases in temperature result in increased levels of fatty acid desaturation within membrane lipids subsequently altering cell membrane fluidity. An important enzyme for the production of polyunsaturates in plants is the oleate delta-12 desaturase (Arabidopsis FAD2) of the endoplasmic reticulum. This enzyme accepts 1-acyl-2-oleoyl-sn-glycero-3-phosphocholine as substrate and requires NADH:cytochrome b oxidoreductase, cytochrome b, and oxygen for activity. FAD2 converts oleate (18:1) to linoleate (18:2) and is closely related to oleate 12-hydroxylase which catalyzes the hydroxylation of oleate to ricinoleate. Plastid-bound desaturases (Arabidopsis delta-12 desaturase (FAD6), omega-3 desaturase (FAD8), omega-6 desaturase (FAD6)), as well as the cyanobacterial thylakoid-bound FADSs require oxygen, ferredoxin, and ferredoxin oxidoreductase for activity. As in higher plants, the cyanobacteria delta-12 (DesA) and omega-3 (DesB) FADSs desaturate oleate (18:1) to linoleate (18:2) and linoleate (18:2) to linolenate (18:3), respectively. Omega-3 (DesB/FAD8) and omega-6 (DesD/FAD6) desaturases catalyze reactions that introduce a double bond between carbons three and four, and carbons six and seven, respectively, from the methyl end of fatty acids. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homologue, stearoyl CoA desaturase. Mutation of any one of four of these histidines in the Synechocystis delta-12 acyl-lipid desaturase resulted in complete inactivity. 222
23624 239585 cd03508 Delta4-sphingolipid-FADS-like The Delta4-sphingolipid Fatty Acid Desaturase (Delta4-sphingolipid-FADS)-like CD includes the integral-membrane enzymes, dihydroceramide Delta-4 desaturase, involved in the synthesis of sphingosine; and the human membrane fatty acid (lipid) desaturase (MLD), reported to modulate biosynthesis of the epidermal growth factor receptor; and other related proteins. These proteins are found in various eukaryotes including vertebrates, higher plants, and fungi. Studies show that MLD is localized to the endoplasmic reticulum. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. 289
23625 239586 cd03509 DesA_FADS-like Fatty acid desaturase protein family subgroup, a delta-12 acyl-lipid desaturase-like, DesA-like, yet uncharacterized subgroup of membrane fatty acid desaturase proteins found in alpha-, beta-, and gamma-proteobacteria. Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 288
23626 239587 cd03510 Rhizobitoxine-FADS-like This CD includes the dihydrorhizobitoxine fatty acid desaturase (RtxC) characterized in Bradyrhizobium japonicum USDA110, and other related proteins. Dihydrorhizobitoxine desaturase is reported to be involved in the final step of rhizobitoxine biosynthesis. This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 175
23627 239588 cd03511 Rhizopine-oxygenase-like This CD includes the putative hydrocarbon oxygenase, MocD, a bacterial rhizopine (3-O-methyl-scyllo-inosamine, 3-O-MSI) oxygenase, and other related proteins. It has been proposed that MocD, MocE (Rieske-like ferredoxin), and MocF (ferredoxin reductase) under the regulation of MocR, act in concert to form a ferredoxin oxygenase system that demethylates 3-O-MSI to form scyllo-inosamine. This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 285
23628 239589 cd03512 Alkane-hydroxylase Alkane hydroxylase is a bacterial, integral-membrane di-iron enzyme that shares a requirement for iron and oxygen for activity similar to that of the non-heme integral-membrane acyl coenzyme A (CoA) desaturases and acyl lipid desaturases. The alk genes in Pseudomonas oleovorans encode conversion of alkanes to acyl CoA. The alkane omega-hydroxylase (AlkB) system is responsible for the initial oxidation of unactivated alkanes. It is a three-component system comprising a soluble NADH-rubredoxin reductase (AlkT), a soluble rubredoxin (AlkG), and the integral membrane oxygenase (AlkB). AlkB utilizes the oxygen rebound mechanism to hydroxylate alkanes. This mechanism involves homolytic cleavage of the C-H bond by an electrophilic metal-oxo intermediate to generate a substrate-based radical. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. The active site structure of AlkB is not known; however, spectroscopic and genetic evidence points to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals. Like all other members of this superfamily, there are eight conserved histidines seen in the histidine cluster motifs: HXXXH, HXXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. Also included in this CD are terminal alkane hydroxylases (AlkM), xylene monooxygenase hydroxylases (XylM), p-cymene monooxygenase hydroxylases (CymAa), and other related proteins. 314
23629 239590 cd03513 CrtW_beta-carotene-ketolase Beta-carotene ketolase/oxygenase (CrtW, also known as CrtO), the carotenoid astaxanthin biosynthetic enzyme, initially catalyzes the addition of two keto groups to carbons C4 and C4' of beta-carotene. Carotenoids are important natural pigments produced by many microorganisms and plants. Astaxanthin is reported to be an antioxidant, an anti-cancer agent, and an immune system stimulant. A number of bacteria and green algae can convert beta-carotene into astaxanthin by using several ketocarotenoids as intermediates and CrtW and a beta-carotene hydroxylase (CrtZ). CrtW initially converts beta-carotene to canthaxanthin via echinenone, and CrtZ initially mediates the conversion of beta-carotene to zeaxanthin via beta-cryptoxanthin. After a few more intermediates are formed, CrtW and CrtZ act in combination to produce astaxanthin. Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that are capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 225
23630 239591 cd03514 CrtR_beta-carotene-hydroxylase Beta-carotene hydroxylase (CrtR), the carotenoid zeaxanthin biosynthetic enzyme, catalyzes the addition of hydroxyl groups to the beta-ionone rings of beta-carotene to form zeaxanthin and is found in bacteria and red algae. Carotenoids are important natural pigments; zeaxanthin and lutein are the only dietary carotenoids that accumulate in the macular region of the retina and lens. It is proposed that these carotenoids protect ocular tissues against photooxidative damage. CrtR does not show overall amino acid sequence similarity to the beta-carotene hydroxylases similar to CrtZ, an astaxanthin biosynthetic beta-carotene hydroxylase. However, CrtR does show sequence similarity to the green alga, Haematococcus pluvialis, beta-carotene ketolase (CrtW), which converts beta-carotene to canthaxanthin. Sequences of the CrtR_beta-carotene-hydroxylase domain family, as well as the CrtW_beta-carotene-ketolase domain family, appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 207
23631 239592 cd03515 Link_domain_TSG_6_like This is the extracellular link domain of the type found in human TSG-6. The link domain is a hyaluronan (HA)-binding domain. TSG-6 is the protein product of tumor necrosis factor-stimulated gene-6. TSG-6 is up-regulated in inflammatory lesions and in the ovary during ovulation. It has a strong anti-inflammatory and chondroprotective effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. Also included in this group are the stabilins: stabilin-1 (FEEL-1, CLEVER-1) and stabilin-2 (FEEL-2). Stabilin-2 functions as the major liver and lymph node-scavenging receptor for HA and related glycosaminoglycans. Stabilin-2 is a scavenger receptor with a broad range of ligands including advanced glycation end (AGE) products, acetylated low density lipoprotein and procollagen peptides. In contrast, stabilin-1 does not bind HA, but binds acetylated low density lipoprotein and AGEs with lower affinity. As AGEs accumulate in vascular tissues during aging and diabetes, these receptors may be implicated in the pathologies of these states. Both stabilins are present in the early endocytic pathway in hepatic sinusoidal epithelium associating with clathrin/AP-2. Stabilin-1 is expressed in macrophages; stabilin-2 is absent from them. In macrophages, stabilin-1 is involved in trafficking between early/sorting endosomes and the trans-Golgi network. Stabilin-1 has also been implicated in angiogenesis and possibly leucocyte trafficking. Both stabilins bind gram-positive and gram-negative bacteria. TSG-6 and stabilins contain a single link module which supports high affinity binding to HA. 93
23632 239593 cd03516 Link_domain_CD44_like This domain is a hyaluronan (HA)-binding domain. It is found in CD44 receptor and mediates adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It also plays an important role in arteriogenesis. The functional HA-binding domain of CD44 is an extended domain comprising a single link module flanked by N- and C-extensions. These extensions are essential for folding and for functional activity. This group also contains the cell surface retention sequence (CRS) binding protein-1 (CRSBP-1) and lymph vessel endothelial receptor-1 (LYVE-1). CRSBP-1 is a cell surface binding protein for the CRS motif of PDGF-BB (platelet-derived growth factor-BB) and is responsible for the cell surface retention of PDGF-BB in SSV-transformed cells. CRSBP-1 may play a role in autocrine regulation of cell growth mediated by CRS containing growth regulators. LYVE-1 is preferentially expressed on the lymphatic endothelium and is used as a molecular marker for the detection and characterization of lymphatic vessels in tumors. 144
23633 239594 cd03517 Link_domain_CSPGs_modules_1_3 Link_domain_CSPGs_modules_1_3; this extracellular link domain is found in the first and third link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan. In addition, it is found in the first link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. In addition, aggrecan contains a second globular domain (G2) which contains link modules 3 and 4. G2 appears to lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 95
23634 239595 cd03518 Link_domain_HAPLN_module_1 Link_domain_HAPLN_module_1; this link domain is found in the first link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. 95
23635 239596 cd03519 Link_domain_HAPLN_module_2 Link_domain_HAPLN_module_2; this link domain is found in the second link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. 91
23636 239597 cd03520 Link_domain_CSPGs_modules_2_4 Link_domain_CSPGs_modules_2_4; this link domain is found in the second and fourth link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan and in the second link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) having link modules 3 and 4 which lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 96
23637 239598 cd03521 Link_domain_KIAA0527_like Link_domain_KIAA0527_like; this domain is found in the human protein KIAA0527. Sequence-wise, it is highly similar to the link domain. The link domain is a hyaluronan-binding (HA) domain. KIAA0527 contains a single link module. The KIAA0527 gene was originally cloned from human brain tissue. 95
23638 239599 cd03522 MoeA_like MoeA_like. This domain is similar to a domain found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. There this domain is presumed to bind molybdopterin. The exact function of this subgroup is unknown. 312
23639 239600 cd03523 NTR_like NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex. 105
23640 239601 cd03524 RPA2_OBF_family RPA2_OBF_family: A family of oligonucleotide binding (OB) folds with similarity to the OB fold of the single strand (ss) DNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). RPA contains six OB folds, which are involved in ssDNA binding and in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. This family also includes OB folds similar to those found in Escherichia coli SSB, the wedge domain of E. coli RecG (a branched-DNA-specific helicase), E. coli ssDNA specific exodeoxyribonuclease VII large subunit, Pyrococcus abyssi DNA polymerase II (Pol II) small subunit, Sulfolobus solfataricus SSB, and Bacillus subtilis YhaM (a 3'-to-5' exoribonuclease). It also includes the OB folds of breast cancer susceptibility gene 2 protein (BRCA2), Oxytricha nova telomere end binding protein (TEBP), Saccharomyces cerevisiae telomere-binding protein (Cdc13), and human protection of telomeres 1 protein (POT1). 75
23641 239602 cd03526 SQR_QFR_TypeB_TM Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Type B proteins contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. 199
23642 239603 cd03527 RuBisCO_small Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. 99
23643 239604 cd03528 Rieske_RO_ferredoxin Rieske non-heme iron oxygenase (RO) family, Rieske ferredoxin component; composed of the Rieske ferredoxin component of some three-component RO systems including biphenyl dioxygenase (BPDO) and carbazole 1,9a-dioxygenase (CARDO). The RO family comprises a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The ferredoxin component contains either a plant-type or Rieske-type [2Fe-2S] cluster. The Rieske ferredoxin component in this family carries an electron from the RO reductase component to the terminal RO oxygenase component. BPDO degrades biphenyls and polychlorinated biphenyls. BPDO ferredoxin (BphF) has structural features consistent with a minimal and perhaps archetypical Rieske protein in that the insertions that give other Rieske proteins unique structural features are missing. CARDO catalyzes dihydroxylation at the C1 and C9a positions of carbazole. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes. 98
23644 239605 cd03529 Rieske_NirD Assimilatory nitrite reductase (NirD) family, Rieske domain; Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium. Members include bacterial and fungal proteins. The bacterial NirD contains a single Rieske domain while fungal proteins have a C-terminal Rieske domain in addition to several other domains. The fungal NirD is involved in nutrient acquisition, functioning at the soil/fungus interface to control nutrient exchange between the fungus and the host plant. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The Rieske [2Fe-2S] cluster is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In this family, only a few members contain these residues. Other members may have lost the ability to bind the Rieske [2Fe-2S] cluster. 103
23645 239606 cd03530 Rieske_NirD_small_Bacillus Small subunit of nitrite reductase (NirD) family, Rieske domain; composed of proteins similar to the Bacillus subtilis small subunit of assimilatory nitrite reductase containing a Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium. 98
23646 239607 cd03531 Rieske_RO_Alpha_KSH The alignment model represents the N-terminal Rieske iron-sulfur domain of KshA, the oxygenase component of 3-ketosteroid 9-alpha-hydroxylase (KSH). The terminal oxygenase component of KSH is a key enzyme in the microbial steroid degradation pathway, catalyzing the 9 alpha-hydroxylation of 4-androstene-3,17-dione (AD) and 1,4-androstadiene-3,17-dione (ADD). KSH is a two-component class IA monooxygenase, with terminal oxygenase (KshA) and oxygenase reductase (KshB) components. KSH activity has been found in many actinobacterial and proteobacterial genera including Rhodococcus, Nocardia, Arthrobacter, Mycobacterium, and Burkholderia. 115
23647 239608 cd03532 Rieske_RO_Alpha_VanA_DdmC Rieske non-heme iron oxygenase (RO) family, Vanillate-O-demethylase oxygenase (VanA) and dicamba O-demethylase oxygenase (DdmC) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Vanillate-O-demethylase is a heterodimeric enzyme consisting of a terminal oxygenase (VanA) and reductase (VanB) components. This enzyme reductively catalyzes the conversion of vanillate into protocatechuate and formaldehyde. Protocatechuate and vanillate are important intermediate metabolites in the degradation pathway of lignin-derived compounds such as ferulic acid and vanillin by soil microbes. DdmC is the oxygenase component of a three-component dicamba O-demethylase found in Pseudomonas maltophilia that catalyzes the conversion of the widely used herbicide dicamba (2-methoxy-3,6-dichlorobenzoic acid) to DCSA (3,6-dichlorosalicylic acid). 116
23648 239609 cd03535 Rieske_RO_Alpha_NDO Rieske non-heme iron oxygenase (RO) family, Naphthalene 1,2-dioxygenase (NDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. NDO is a three-component RO system consisting of a reductase, a ferredoxin, and a hetero-hexameric alpha-beta subunit oxygenase component. NDO catalyzes the oxidation of naphthalene to cis-(1R,2S)-dihydroxy-1,2-dihydronaphthalene (naphthalene cis-dihydrodiol) with the consumption of O2 and NAD(P)H. NDO has a relaxed substrate specificity and can oxidize almost 100 substrates. Included in its varied activities are the enantiospecific cis-dihydroxylation of polycyclic aromatic hydrocarbons and benzocycloalkenes, benzylic hydroxylation, N- and O-dealkylation, sulfoxidation and desaturation reactions. 123
23649 239610 cd03536 Rieske_RO_Alpha_DTDO This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit (DitA) of diterpenoid dioxygenase (DTDO). DTDO is a novel aromatic-ring-hydroxylating dioxygenase found in Pseudomonas and other proteobacteria that degrades dehydroabietic acid (DhA). Specifically, DitA hydroxylates 7-oxodehydroabietic acid to 7-oxo-11,12-dihydroxy-8,13-abietadien acid. The ditA1 and ditA2 genes encode the alpha and beta subunits of the oxygenase component of DTDO while the ditA3 gene encodes the ferredoxin component of DTDO. The organization of the genes encoding the various diterpenoid dioxygenase components, the phylogenetic distinctiveness of both the alpha subunit and the ferredoxin component, and the unusual iron-sulfur cluster of the ferredoxin all suggest that this enzyme belongs to a new class of aromatic ring-hydroxylating dioxygenases. 123
23650 239611 cd03537 Rieske_RO_Alpha_PrnD This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit of aminopyrrolnitrin oxygenase (PrnD). PrnD is a novel Rieske N-oxygenase that catalyzes the final step in the pyrrolnitrin biosynthetic pathway, the oxidation of the amino group in aminopyrrolnitrin to a nitro group, forming the antibiotic pyrrolnitrin. The biosynthesis of pyrrolnitrin is one of the best examples of enzyme-catalyzed arylamine oxidation. Although arylamine oxygenases are widely distributed within the microbial world and used in a variety of metabolic reactions, PrnD represents one of only two known examples of arylamine oxygenases or N-oxygenases involved in arylnitro group formation, the other being AurF involved in aureothin biosynthesis. 123
23651 239612 cd03538 Rieske_RO_Alpha_AntDO Rieske non-heme iron oxygenase (RO) family, Anthranilate 1,2-dioxygenase (AntDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. AntDO converts anthranilate to catechol, a naturally occurring compound formed through tryptophan degradation and an important intermediate in the metabolism of many N-heterocyclic compounds such as indole, o-nitrobenzoate, carbazole, and quinaldine. 146
23652 239613 cd03539 Rieske_RO_Alpha_S5H This alignment model represents the N-terminal Rieske iron-sulfur domain of the oxygenase alpha subunit (NagG) of salicylate 5-hydroxylase (S5H). S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits. 129
23653 239614 cd03541 Rieske_RO_Alpha_CMO Rieske non-heme iron oxygenase (RO) family, Choline monooxygenase (CMO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. CMO is a novel RO found in certain plants which catalyzes the first step in betaine synthesis. CMO is not found in animals or bacteria. In these organisms, the first step in betaine synthesis is catalyzed by either the membrane-bound choline dehydrogenase (CDH) or the soluble choline oxidase (COX). 118
23654 239615 cd03542 Rieske_RO_Alpha_HBDO Rieske non-heme iron oxygenase (RO) family, 2-Halobenzoate 1,2-dioxygenase (HBDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. HBDO catalyzes the double hydroxylation of 2-halobenzoates with concomitant release of halogenide and carbon dioxide, yielding catechol. 123
23655 239616 cd03545 Rieske_RO_Alpha_OHBDO_like Rieske non-heme iron oxygenase (RO) family, Ortho-halobenzoate-1,2-dioxygenase (OHBDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of OHBDO, salicylate 5-hydroxylase (S5H), terephthalate 1,2-dioxygenase system (TERDOS) and similar proteins. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OHBDO converts 2-chlorobenzoate (2-CBA) to catechol as well as 2,4-dCBA and 2,5-dCBA to 4-chlorocatechol, as part of the chlorobenzoate degradation pathway. Although ortho-substituted chlorobenzoates appear to be particularly recalcitrant to biodegradation, several strains utilize 2-CBA and the dCBA derivatives as a sole carbon and energy source. S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits. TERDOS is present in gram-positive bacteria and proteobacteria where it converts terephthalate (1,4-dicarboxybenzene) to protocatechuate as part of the terephthalate degradation pathway. The oxygenase component of TERDOS, called TerZ, is a hetero-hexamer with 3 alpha (TerZalpha) and 3 beta (TerZbeta) subunits. 150
23656 239617 cd03548 Rieske_RO_Alpha_OMO_CARDO Rieske non-heme iron oxygenase (RO) family, 2-Oxoquinoline 8-monooxygenase (OMO) and Carbazole 1,9a-dioxygenase (CARDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OMO catalyzes the NADH-dependent oxidation of the N-heterocyclic aromatic compound 2-oxoquinoline to 8-hydroxy-2-oxoquinoline, the second step in the bacterial degradation of quinoline. OMO consists of a reductase component (OMR) and an oxygenase component (OMO) that together function to shuttle electrons from the reduced pyridine nucleotide to the active site of OMO, where O2 activation and 2-oxoquinoline hydroxylation occur. CARDO, which contains oxygenase (CARDO-O), ferredoxin (CARDO-F) and ferredoxin reductase (CARDO-R) components, catalyzes the dihydroxylation at the C1 and C9a positions of carbazole. The oxygenase components of OMO and CARDO contain only alpha subunits arranged in a trimeric structure. 136
23657 239618 cd03556 L-fucose_isomerase L-fucose isomerase (FucIase); FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose during glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in blood group determinants as well as in various oligo- and polysaccharides, and glycosides in mammals, bacteria and plants. 584
23658 239619 cd03557 L-arabinose_isomerase L-Arabinose isomerase (AI) catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion into D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute. 484
23659 349787 cd03558 LGIC_ECD extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels (also known as ligand-gated ion channel (LGIC)). This superfamily contains the extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), type-A gamma-aminobutyric acid receptor (GABAAR) and glycine receptor (GlyR). These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. GABAAR and GlyR are anionic channels, both mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. These ligand-gated chloride channels are critical not only for maintaining appropriate neuronal activity, but have long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on the inhibitory half of the Cys-loop receptor family. The ECD contains the ligand binding sites for these receptors. 179
23660 349850 cd03559 LGIC_TM transmembrane domain of Cys-loop neurotransmitter-gated ion channels. This superfamily contains the transmembrane domain of Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), type-A gamma-aminobutyric acid receptor (GABAAR), and glycine receptor (GlyR). These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system where the sign of synaptic connections (excitatory or inhibitory) is determined by the charge of the ions that flow through these channels. In general, channels that conduct positive ions are excitatory, whereas channels that conduct negative ions are inhibitory. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR and GlyR are anionic channels, both mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. These ligand-gated chloride channels are critical not only for maintaining appropriate neuronal activity, but have long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on the inhibitory half of the Cys-loop receptor family. 116
23661 340765 cd03561 VHS VHS (Vps27/Hrs/STAM) domain family. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It has a superhelical structure similar to that of the ARM (Armadillo) repeats and is present at the N-termini of proteins involved in intracellular membrane trafficking. There are four general groups of VHS domain containing proteins based on their association with other domains. The first group consists of proteins of the STAM/EAST/Hbp family, which has the domain composition of VHS-SH3-ITAM. The second consists of proteins with a FYVE domain C-terminal to VHS. The third consists of GGA proteins with a domain composition of VHS-GAT (GGA and TOM)-GAE (Gamma-Adaptin Ear) domain. The fourth consists of proteins with a VHS domain alone or with domains other than those mentioned above. In GGA proteins, VHS domains are involved in cargo recognition in trans-Golgi, thereby having a general membrane targeting/cargo recognition role in vesicular trafficking. 131
23662 340766 cd03562 CID CID (CTD-Interacting Domain) family. The CTD-Interacting Domain (CID) is present in several eukaryotic RNA-processing factors including yeast proteins, Pcf11 and Nrd1, and vertebrate proteins, CTD-associated factors 8 (SCAF8) and Regulation of nuclear pre-mRNA domain-containing proteins (such as RPRD1 and RPRD2). Pcf11 is a conserved and essential subunit of the yeast cleavage factor IA, which is required for polyadenylation-dependent 3'-RNA processing and transcription termination. Nrd1 is implicated in polyadenylation-independent 3'-RNA processing. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 123
23663 340767 cd03564 ANTH_N ANTH (AP180 N-Terminal Homology) domain family, N-terminal region. The ANTH (AP180 N-Terminal Homology) domain family is composed of Adaptor Protein 180 (AP180), Clathrin Assembly Lymphoid Myeloid Leukemia protein (CALM), and similar proteins. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ANTH-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that the ANTH domain is a universal component of the machinery for clathrin-mediated membrane budding. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains. 120
23664 340768 cd03565 VHS_Tom1_like VHS (Vps27/Hrs/STAM) domain of Tom1 subfamily. This subfamily is composed of Tom1 (Target of myb1 - retroviral oncogene) protein, Tom1L1 (Tom1-like1), Tom1L2 (Tom1-like2), and similar proteins. Proteins belonging to this subfamily are characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Yeast do not contain homologous proteins of the Tom1 subfamily, suggesting these proteins have evolved to accommodate more complex cellular processes. Tom1 is essential for the negative regulation of Interleukin-1 and Tumor Necrosis Factor-induced signaling pathways. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. 138
23665 340769 cd03567 VHS_GGA_metazoan VHS (Vps27/Hrs/STAM) domain of metazoan GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) proteins. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. Jawed vertebrates contain as many as three GGA proteins: GGA1, GGA2, and GGA3. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 139
23666 340770 cd03568 VHS_STAM VHS (Vps27/Hrs/STAM) domain of the STAM (Signal Transducing Adaptor Molecule) subfamily. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMs, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. Jawed vertebrates have two STAM subfamily members, STAM1 and STAM2. 132
23667 340771 cd03569 VHS_Hrs VHS (Vps27/Hrs/STAM) domain of Hepatocyte growth factor-regulated tyrosine kinase substrate, Hrs. Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) plays a role in at least three vesicle trafficking events: exocytosis, endocytosis, and endosome to lysosome trafficking. Hrs is involved in promoting rapid recycling of endocytosed signaling receptors to the plasma membrane. Together with STAM or STAM2, it comprises the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery, which functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. Hrs contains an N-terminal VHS domain, which has a superhelical structure similar to the structure of ARM (Armadillo) repeats, a FYVE (Fab1p, YOTB, Vac1p, and EEA1) zinc finger domain, a Double Ubiquitin-Interacting Motif (DUIM), a P(S/T)XP motif that recruits ESCRT-I, a GAT (GGA and TOM) domain, and a short peptide motif near the C-terminus that recruits clathrin. 138
23668 340772 cd03571 ENTH Epsin N-Terminal Homology (ENTH) domain family. The Epsin N-Terminal Homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both inositol phospholipids, with preference for PtdIns(4,5)P2, and proteins, contributing to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 117
23669 340773 cd03572 ENTH_like_Tepsin Epsin N-Terminal Homology (ENTH)-like domain of AP-4 complex accessory subunit Tepsin and similar domains. This family is composed of proteins containing an ENTH-like domain including vertebrate AP-4 complex accessory subunit Tepsin and Arabidopsis thaliana VHS domain-containing protein At3g16270. Tepsin is also called ENTH Domain-containing protein 2 (ENTHD2), Epsin for AP-4, or Tetra-epsin. It associates with the adapter-like complex 4 (AP-4), a heterotetramer composed of two large adaptins (epsilon and beta), a medium adaptin (mu) and a small adaptin (sigma), which forms a non-clathrin coat on vesicles departing the Trans-Golgi Network. The Epsin N-Terminal Homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both inositol phospholipids, with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 119
23670 239629 cd03574 NTR_complement_C345C NTR/C345C domain; The NTR domains that are found in the C-termini of complement C3, C4 and C5, are also called C345C domains. In C5, the domain interacts with various partners during the formation of the membrane attack complex, a fundamental process in the mammalian defense against infection. Its role in complement C3 and C4 is not well understood. 147
23671 239630 cd03575 NTR_WFIKKN NTR domain, WFIKKN subfamily; WFIKKN proteins contain a C-terminal NTR domain and are putative secreted proteins which may be multivalent protease inhibitors that act on serine proteases as well as metalloproteases. Human WFIKKN and a related protein sharing the same domain architecture were observed to have distinct tissue expression patterns. WFIKKN is also referred to as growth and differentiation factor-associated serum protein-1 (GASP-1). It inhibits the activity of mature myostatin, a specific regulator of skeletal muscle mass and a member of the TGFbeta superfamily. 109
23672 239631 cd03576 NTR_PCOLCE NTR domain, PCOLCE subfamily; Procollagen C-endopeptidase enhancers (PCOLCEs) are extracellular matrix proteins that enhance the activity of procollagen C-proteases, by binding to the procollagen I C-peptide. They contain a C-terminal NTR domain, which has been suggested to possess inhibitory functions towards specific serine proteases but not towards metzincins, which are inhibited by the related TIMPs. 124
23673 239632 cd03577 NTR_TIMP_like NTR domain, TIMP-like subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. This group contains domains similar to the TIMP NTR domain, which binds MMPs. Members of this group may or may not function as MMP inhibitors. 116
23674 239633 cd03578 NTR_netrin-4_like NTR domain, Netrin-4-like subfamily; composed of the C-terminal NTR domains of netrin-4 (beta netrin) and similar proteins. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. Netrin-4 is a basement membrane component that is important in neural, kidney and vascular development. It may also be involved in regulating the outgrowth and shape of epithelial cells during lung branching morphogenesis. 111
23675 239634 cd03579 NTR_netrin-1_like NTR domain, Netrin-1-like subfamily; The C-terminal NTR domain of netrins is also called domain C in the context of C. elegans netrin UNC-6. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. These proteins may be chemoattractive to some neurons and chemorepellent for others. In the case of netrin-1, attraction and repulsion responses are mediated by the DCC and UNC-5 receptor families. The biological activities of C. elegans UNC-6, which may either attract or repel migrating cells or axons, are mediated by its different domains. The C-terminal NTR domain of UNC-6 has been shown to inhibit axon branching activity. 115
23676 239635 cd03580 NTR_Sfrp1_like NTR domain, Secreted frizzled-related protein (Sfrp) 1-like subfamily; composed of proteins similar to human Sfrp1, Sfrp2 and Sfrp5. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp1 has been found frequently to be downregulated in breast cancer and is associated with disease progression and poor prognosis. 126
23677 239636 cd03581 NTR_Sfrp3_like NTR domain, Secreted frizzled-related protein (Sfrp) 3-like subfamily; composed of proteins similar to human Sfrp3 and Sfrp4. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp3 may suppress the growth and invasiveness of androgen-independent prostate cancer cells. 111
23678 239637 cd03582 NTR_complement_C5 NTR/C345C domain, complement C5 subfamily; The NTR domain found in complement C5 is also known as C345C because it occurs at the C-terminus of complement C3, C4 and C5. Complement C5 is activated by C5 convertase, which itself is a complex between C3b and C3 convertase. The small cleavage fragment, C5a, is the most important small peptide mediator of inflammation, and the larger active fragment, C5b, initiates late events of complement activation. The NTR/C345C domain is important in the function of C5 as it interacts with enzymes that convert C5 to the active form, C5b. The domain has also been found to bind to complement components C6 and C7, and may specifically interact with their factor I modules. 150
23679 239638 cd03583 NTR_complement_C3 NTR/C345C domain, complement C3 subfamily; The NTR domain found in complement C3 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C3 plays a pivotal role in the activation of the complement systems, as all pathways (classical, alternative, and lectin) result in the processing of C3 by C3 convertase. The larger fragment, activated C3b, contains the NTR/C345C domain and binds covalently, via a reactive thioester, to cell surface carbohydrates including components of bacterial cell walls and immune aggregates. The smaller cleavage product, C3a, acts independently as a diffusible signal to mediate local inflammatory processes. The structure of C3 shows that the NTR/C345C domain is located in an exposed position relative to the rest of the molecule. The function of the domain in complement C3 is poorly understood. 149
23680 239639 cd03584 NTR_complement_C4 NTR/C345C domain, complement C4 subfamily; The NTR domain found in complement C4 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C4 is a key player in the activation of the complement classical pathway. C4 is cleaved by activated C1 to yield C4a anaphylatoxin, and the larger fragment C4b, an essential component of the C3- and C5-convertase enzymes. C4b binds covalently to the surface of pathogens through a reactive thioester. The role of the NTR/C345C domain in C4 (C4b) is unclear. 153
23681 239640 cd03585 NTR_TIMP NTR domain, TIMP subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. The levels of activated membrane-type MMPs, MMPs, and free TIMPs determine the balance between matrix degradation and matrix formation or stabilization. Consequently, TIMPs play roles in processes that require the remodeling and degradation of connective tissue, such as development, morphogenesis, wound healing, as well as in various diseases and pathological states such as tumor cell metastasis, arthritis, and atherosclerosis. Most TIMPs bind to a variety of MMPs. TIMP-1 and TIMP-2 appear to be multifunctional proteins with diverse biological action. They may exhibit growth factor-like activity and can inhibit angiogenesis. TIMP-3 has been implicated in apoptosis. 183
23682 176459 cd03586 PolY_Pol_IV_kappa DNA Polymerase IV/Kappa. Pol IV, also known as Pol kappa, DinB, and Dpo4, is a translesion synthesis (TLS) polymerase. Translesion synthesis is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Known primarily as Pol IV in prokaryotes and Pol kappa in eukaryotes, this polymerase has a propensity for generating frameshift mutations. The eukaryotic Pol kappa differs from Pol IV and Dpo4 by an N-terminal extension of ~75 residues known as the "N-clasp" region. The structure of Pol kappa shows DNA that is almost totally encircled by Pol kappa, with the N-clasp region augmenting the interactions between DNA and the polymerase. Pol kappa is more resistant than Pol eta and Pol iota to bulky guanine adducts and is efficient at catalyzing the incorporation of dCTP. Bacterial pol IV has a higher error rate than other Y-family polymerases. 334
23683 239641 cd03587 SOCS SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 41
23684 153058 cd03588 CLECT_CSPGs C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins. CLECT_CSPGs: C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins (CSPGs) in human and chicken aggrecan, frog brevican, and zebra fish dermacan. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Xenopus brevican is expressed in the notochord and the brain during early embryogenesis. Zebra fish dermacan is expressed in dermal bones and may play a role in dermal bone development. CSPGs do contain LINK domain(s) which bind HA. These LINK domains are considered by one classification system to be a variety of CTLD, but are omitted from this hierarchical classification based on insignificant sequence similarity. 124
23685 153059 cd03589 CLECT_CEL-1_like C-type lectin-like domain (CTLD) of the type found in CEL-1 from Cucumaria echinata and Echinoidin from Anthocidaris crassispina. CLECT_CEL-1_like: C-type lectin-like domain (CTLD) of the type found in CEL-1 from Cucumaria echinata and Echinoidin from Anthocidaris crassispina. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CEL-1 CTLD binds three calcium ions and has a high specificity for N-acetylgalactosamine (GalNAc). CEL-1 exhibits strong cytotoxicity which is inhibited by GalNAc. This protein may play a role as a toxin defending against predation. Echinoidin is found in the coelomic fluid of the sea urchin and is specific for GalBeta1-3GalNAc. Echinoidin has a cell adhesive activity towards human cancer cells which is not mediated through the CTLD. Both CEL-1 and Echinoidin are multimeric proteins comprised of multiple dimers linked by disulfide bonds. 137
23686 153060 cd03590 CLECT_DC-SIGN_like C-type lectin-like domain (CTLD) of the type found in human dendritic cell (DC)-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) and the related receptor, DC-SIGN receptor (DC-SIGNR). CLECT_DC-SIGN_like: C-type lectin-like domain (CTLD) of the type found in human dendritic cell (DC)-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) and the related receptor, DC-SIGN receptor (DC-SIGNR). This group also contains proteins similar to hepatic asialoglycoprotein receptor (ASGP-R) and langerin in human. These proteins are type II membrane proteins with a CTLD ectodomain. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. DC-SIGN is thought to mediate the initial contact between dendritic cells and resting T cells, and may also mediate the rolling of DCs on epithelium. DC-SIGN and DC-SIGNR bind to oligosaccharides present on human tissues, as well as on pathogens including parasites, bacteria, and viruses. DC-SIGN and DC-SIGNR bind to HIV, enhancing viral infection of T cells. DC-SIGN and DC-SIGNR are homotetrameric, and contain four CTLDs stabilized by a coiled coil of alpha helices. The hepatic ASGP-R is an endocytic recycling receptor which binds and internalizes desialylated glycoproteins having terminal galactose or N-acetylgalactosamine residues on their N-linked carbohydrate chains, via the clathrin-coated pit mediated endocytic pathway, and delivers them to lysosomes for degradation. It has been proposed that glycoproteins bearing terminal Sia (sialic acid) alpha2,6GalNAc and Sia alpha2,6Gal are endogenous ligands for ASGP-R and that ASGP-R participates in regulating the relative concentration of serum glycoproteins bearing alpha 2,6-linked Sia. The human ASGP-R is a hetero-oligomer composed of two subunits, both of which are found within this group. Langerin is expressed in a subset of dendritic leukocytes, the Langerhans cells (LC). Langerin induces the formation of Birbeck Granules (BGs) and associates with these BGs following internalization. Langerin binds, in a calcium-dependent manner, to glyco-conjugates containing mannose and related sugars mediating their uptake and degradation. Langerin molecules oligomerize as trimers with three CTLDs held together by a coiled-coil of alpha helices. 126
23687 153061 cd03591 CLECT_collectin_like C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1). CLECT_collectin_like: C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CTLDs of these collectins bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, or apoptotic cells) and mediate functions associated with killing and phagocytosis. MBPs recognize high mannose oligosaccharides in a calcium dependent manner, bind to a broad range of pathogens, and trigger cell killing by activating the complement pathway. MBP also acts directly as an opsonin. SP-A and SP-D in addition to functioning as host defense components, are components of pulmonary surfactant which play a role in surfactant homeostasis. Pulmonary surfactant is a phospholipid-protein complex which reduces the surface tension within the lungs. SP-A binds the major surfactant lipid: dipalmitoylphosphatidylcholine (DPPC). SP-D binds two minor components of surfactant that contain sugar moieties: glucosylceramide and phosphatidylinositol (PI). MBP and SP-A, -D monomers are homotrimers with an N-terminal collagen region and three CTLDs. Multiple homotrimeric units associate to form supramolecular complexes. MBL deficiency results in an increased susceptibility to a large number of different infections and to inflammatory disease, such as rheumatoid arthritis. 114
23688 153062 cd03592 CLECT_selectins_like C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins: P(platelet)-, E(endothelial)-, and L(leukocyte)- selectins (sels). CLECT_selectins_like: C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins: P(platelet)-, E(endothelial)-, and L(leukocyte)- selectins (sels). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. P- E- and L-sels are cell adhesion receptors that mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. L- sel is expressed constitutively on most leukocytes. P-sel is stored in the Weibel-Palade bodies of endothelial cells and in the alpha granules of platelets. E- sels are present on endothelial cells. Following platelet and/or endothelial cell activation P- sel is rapidly translocated to the cell surface and E-sel expression is induced. The initial step in leukocyte migration involves interactions of selectins with fucosylated, sialylated, and sulfated carbohydrate moieties on target ligands displayed on glycoprotein scaffolds on endothelial cells and leucocytes. A major ligand of P- E- and L-sels is PSGL-1 (P-sel glycoprotein ligand). Interactions of E- and P- sels with tumor cells may promote extravasation of cancer cells. Regulation of L-sel and P-sel function includes proteolytic shedding of the most extracellular portion (containing the CTLD) from the cell surface. Increased levels of the soluble form of P-sel in the plasma have been found in a number of diseases including coronary disease and diabetes. E- and P- sel also play roles in the development of synovial inflammation in inflammatory arthritis. Platelet P-sel, but not endothelial P-sel, plays a role in the inflammatory response and neointimal formation after arterial injury. 
Selectins may also function as signal-transducing receptors. 115
23689 153063 cd03593 CLECT_NK_receptors_like C-type lectin-like domain (CTLD) of the type found in natural killer cell receptors (NKRs). CLECT_NK_receptors_like: C-type lectin-like domain (CTLD) of the type found in natural killer cell receptors (NKRs), including proteins similar to oxidized low density lipoprotein (OxLDL) receptor (LOX-1), CD94, CD69, NKG2-A and -D, osteoclast inhibitory lectin (OCIL), dendritic cell-associated C-type lectin-1 (dectin-1), human myeloid inhibitory C-type lectin-like receptor (MICL), mast cell-associated functional antigen (MAFA), killer cell lectin-like receptors: subfamily F, member 1 (KLRF1) and subfamily B, member 1 (KLRB1), and lys49 receptors. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. NKRs are variously associated with activation or inhibition of natural killer (NK) cells. Activating NKRs stimulate cytolysis by NK cells of virally infected or transformed cells; inhibitory NKRs block cytolysis upon recognition of markers of healthy self cells. Most Lys49 receptors are inhibitory; some are stimulatory. OCIL inhibits NK cell function via binding to the receptor NKRP1D. Murine OCIL in addition to inhibiting NK cell function inhibits osteoclast differentiation. MAFA clusters with the type I Fc epsilon receptor (FcepsilonRI) and inhibits the mast cell secretory response to FcepsilonRI stimulus. CD72 is a negative regulator of B cell receptor signaling. NKG2D is an activating receptor for stress-induced antigens; human NKG2D ligands include the stress induced MHC-I homologs, MICA, MICB, and ULBP family of glycoproteins. Several NKRs have a carbohydrate-binding capacity which is not mediated through calcium ions (e.g. OCIL binds a range of high molecular weight sulfated glycosaminoglycans including dextran sulfate, fucoidan, and gamma-carrageenan sugars). Dectin-1 binds fungal beta-glucans and is involved in the innate immune responses to fungal pathogens. 
MAFA binds saccharides having terminal alpha-D mannose residues in a calcium-dependent manner. LOX-1 is the major receptor for OxLDL in endothelial cells and is thought to play a role in the pathology of atherosclerosis. Some NKRs exist as homodimers (e.g. Lys49, NKG2D, CD69, LOX-1) and some as heterodimers (e.g. CD94/NKG2A). Dectin-1 can function as a monomer in vitro. 116
23690 153064 cd03594 CLECT_REG-1_like C-type lectin-like domain (CTLD) of the type found in Human REG-1 (lithostathine), REG-4, and avian eggshell-specific proteins: ansocalcin, structhiocalcin-1(SCA-1), and -2(SCA-2). CLECT_REG-1_like: C-type lectin-like domain (CTLD) of the type found in Human REG-1 (lithostathine), REG-4, and avian eggshell-specific proteins: ansocalcin, structhiocalcin-1(SCA-1), and -2(SCA-2). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. REG-1 is a proliferating factor which participates in various kinds of tissue regeneration including pancreatic beta-cell regeneration, regeneration of intestinal mucosa, regeneration of motor neurons, and perhaps in tissue regeneration of damaged heart. REG-1 may play a role in the pathophysiology of Alzheimer's disease and in the development of gastric cancers. Its expression is correlated with reduced survival from early-stage colorectal cancer. REG-1 also binds and aggregates several bacterial strains from the intestinal flora and it has been suggested that it is involved in the control of the intestinal bacterial ecosystem. Rat lithostathine has calcium carbonate crystal inhibitor activity in vitro. REG-IV is upregulated in pancreatic, gastric, hepatocellular, and prostate adenocarcinomas. REG-IV activates the EGF receptor/Akt/AP-1 signaling pathway in colorectal carcinoma. Ansocalcin, SCA-1 and -2 are found at high concentration in the calcified egg shell layer of goose and ostrich, respectively and tend to form aggregates. Ansocalcin nucleates calcite crystal aggregates in vitro. 129
23691 153065 cd03595 CLECT_chondrolectin_like C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin. CLECT_chondrolectin_like: C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CHODL is predominantly expressed in muscle cells and is associated with T-cell maturation. Various alternatively spliced isoforms of CHODL have been identified. The transmembrane form of CHODL is localized in the ER-Golgi apparatus. Layilin is widely expressed in different cell types. The extracellular CTLD of layilin binds hyaluronan (HA), a major constituent of the extracellular matrix (ECM). The cytoplasmic tail of layilin binds various members of the band 4.1/ERM superfamily (talin, radixin, and merlin). The ERM proteins are cytoskeleton-membrane linker molecules which link actin to receptors in the plasma membrane. Layilin co-localizes with talin in membrane ruffles and may mediate signals from the ECM to the cell cytoskeleton. 149
23692 153066 cd03596 CLECT_tetranectin_like C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF). CLECT_tetranectin_like: C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. TN binds to plasminogen and stimulates activation of plasminogen, playing a key role in the regulation of proteolytic processes. The TN CTLD binds two calcium ions. Its calcium free form binds to various kringle-like protein ligands. Two residues involved in the coordination of calcium are critical for the binding of TN to the fourth kringle (K4) domain of plasminogen (Plg K4). TN binds the kringle 1-4 form of angiostatin (AST K1-4). AST K1-4 is a fragment of Plg, commonly found in cancer tissues. TN inhibits the binding of Plg and AST K1-4 to the extracellular matrix (ECM) of endothelial cells and counteracts the antiproliferative effects of AST K1-4 on these cells. TN also binds the tenth kringle domain of apolipoprotein (a). In addition, TN binds fibrin and complex polysaccharides in a Ca2+ dependent manner. The binding site for complex sulfated polysaccharides is N-terminal to the CTLD. TN is homotrimeric; N-terminal to the CTLD is an alpha helical domain responsible for trimerization of monomeric units. TN may modulate angiogenesis through interactions with angiostatin and coagulation through interaction with fibrin. TN may play a role in myogenesis and in bone development. Mice having a deletion in the TN gene exhibit a kyphotic spine abnormality. TN is a useful prognostic marker of certain cancer types. CLECSF1 is expressed in cartilage tissue, which is primarily extracellular matrix (ECM), and is a candidate for organizing ECM. 
SCGF is strongly expressed in bone marrow and is a cytokine for primitive hematopoietic progenitor cells. 129
23693 153067 cd03597 CLECT_attractin_like C-type lectin-like domain (CTLD) of the type found in human and mouse attractin (AtrN) and attractin-like protein (ALP). CLECT_attractin_like: C-type lectin-like domain (CTLD) of the type found in human and mouse attractin (AtrN) and attractin-like protein (ALP). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Mouse AtrN (the product of the mahogany gene) has been shown to bind Agouti protein and to function in agouti-induced pigmentation and obesity. Mutations in AtrN have also been shown to cause spongiform encephalopathy and hypomyelination in rats and hamsters. The cytoplasmic region of mouse ALP has been shown to bind to melanocortin receptor (MCR4). Signaling through MCR4 plays a role in appetite suppression. Attractin may have therapeutic potential in the treatment of obesity. Human attractin (hAtrN) has been shown to be expressed on activated T cells and released extracellularly. The circulating serum attractin induces the spreading of monocytes that become the focus of the clustering of non-proliferating T cells. 129
23694 153068 cd03598 CLECT_EMBP_like C-type lectin-like domain (CTLD) of the type found in the human proteins, eosinophil major basic protein (EMBP) and prepro major basic protein homolog (MBPH). CLECT_EMBP_like: C-type lectin-like domain (CTLD) of the type found in the human proteins, eosinophil major basic protein (EMBP) and prepro major basic protein homolog (MBPH). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Eosinophils and basophils carry out various functions in allergic, parasitic, and inflammatory diseases. EMBP is stored in eosinophil crystalloid granules and is released upon degranulation. EMBP is also expressed in basophils. The proform of EMBP is expressed in placental X cells and breast tissue and increases significantly during human pregnancy. EMBP has cytotoxic properties and damages bacteria and mammalian cells, in vitro, as well as, helminth parasites. EMBP deposition has been observed in the inflamed tissue of allergy patients in a variety of diseases including asthma, atopic dermatitis, and rhinitis. In addition to its cytotoxic functions, EMBP activates cells and stimulates cytokine production. EMBP has been shown to bind the proteoglycan heparin. The binding site is similar to the carbohydrate binding site of other classical CTLD, such as mannose-binding protein (MBP1), however, heparin binding to EMBP is calcium ion independent. MBPH has reduced potency in cytotoxic and cytostimulatory assays compared with EMBP. 117
23695 153069 cd03599 CLECT_DGCR2_like C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS). CLECT_DGCR2_like: C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. DGS is also known as velo-cardio-facial syndrome (VCFS). DGS is a genetic abnormality that results in malformations of the heart, face, and limbs and is associated with schizophrenia and depressive disorders. DGCR2 is a candidate for involvement in the pathogenesis of DGS since the DGCR2 gene lies within the minimal DGS critical region (MDGRC) of 22q11, which when deleted gives rise to DGS, and the DGCR2 gene is in close proximity to the balanced translocation breakpoint in a DGS patient having a balanced translocation. 153
23696 153070 cd03600 CLECT_thrombomodulin_like C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR. CLECT_thrombomodulin_like: C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. In these thrombomodulin-like proteins the residues involved in coordinating Ca2+ in the classical MBP-A CTLD are not conserved. TM exerts anti-fibrinolytic and anti-inflammatory activity. TM also regulates blood coagulation in the anticoagulant protein C pathway. In this pathway, the procoagulant properties of thrombin (T) are lost when it binds TM. TM also plays a key role in tumor biology. It is expressed on endothelial cells and on several types of tumor cell including squamous cell carcinoma. Loss of TM expression correlates with advanced stage and poor prognosis. Loss of TM function may be associated with arterial or venous thrombosis and with late fetal loss. Soluble molecules of TM retaining the CTLD are detected in human plasma and urine where higher levels indicate injury and/or enhanced turnover of the endothelium. C1qR is expressed on endothelial cells and stem cells. It is also expressed on monocytes and neutrophils, where it is subject to ectodomain shedding. Soluble forms of C1qR retaining the CTLD are detected in human plasma. C1qR modulates the phagocytosis of apoptotic cells in vivo. C1qR-deficient mice are defective in clearance of apoptotic cells in vivo. The cytoplasmic tail of C1qR, C-terminal to the CTLD of CD93, contains a PDZ binding domain which interacts with the PDZ domain-containing adaptor protein, GIPC. The juxtamembrane region of this tail interacts with the ezrin/radixin/moesin family. Endosialin functions in the growth and progression of abdominal tumors and is expressed in the stroma of several tumors. 141
23697 153071 cd03601 CLECT_TC14_like C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm. CLECT_TC14_like: C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. TC14 is homodimeric. The CTLD of TC14 binds D-galactose and D-fucose. TC14 is expressed constitutively by multipotent epithelial and mesenchymal cells and plays a role during budding, in inducing the aggregation of undifferentiated mesenchymal cells to give rise to epithelial forming tissue. TC14-2 and TC14-3 show calcium-dependent galactose binding activity. TC14-3 is a cytostatic factor which blocks cell growth and dedifferentiation of the atrial epithelium during asexual reproduction. It may also act as a differentiation inducing factor. Galactose inhibits the cytostatic activity of TC14-3. The gene for Acorn worm PfG6 is gill-specific; PfG6 may be a secreted protein. 119
23698 153072 cd03602 CLECT_1 C-type lectin (CTL)/C-type lectin-like (CTLD) domain subgroup 1; a subgroup of protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CLECT_1: C-type lectin (CTL)/C-type lectin-like (CTLD) domain subgroup 1; a subgroup of protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations. In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer. 108
23699 153073 cd03603 CLECT_VCBS A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CLECT_VCBS: A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice. Bacterial CTLDs within this group are functionally uncharacterized. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations. In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer. 118
23700 239642 cd03670 ADPRase_NUDT9 ADP-ribose pyrophosphatase (ADPRase) catalyzes the hydrolysis of ADP-ribose to AMP and ribose-5-P. Like other members of the Nudix hydrolase superfamily of enzymes, it is thought to require a divalent cation, such as Mg2+, for its activity. It also contains a 23-residue Nudix motif (GX5EX7REUXEEXGU, where U = I, L or V) which functions as a metal binding site/catalytic site. In addition to the Nudix motif, there are additional conserved amino acid residues, distal from the signature sequence, that correlate with substrate specificity. In humans, there are four distinct ADPRase activities, three putative cytosolic (ADPRase-I, -II, and -Mn) and a single mitochondrial enzyme (ADPRase-m). ADPRase-m is also known as NUDT9. It can be distinguished from the cytosolic ADPRase by an N-terminal target sequence unique to mitochondrial ADPRase. NUDT9 functions as a monomer. 186
23701 239643 cd03671 Ap4A_hydrolase_plant_like Diadenosine tetraphosphate (Ap4A) hydrolase is a member of the Nudix hydrolase superfamily. Members of this family are well represented in a variety of prokaryotic and eukaryotic organisms. Phylogenetic analysis reveals two distinct subgroups where plant enzymes fall into one group (represented by this subfamily) and fungi/animals/archaea enzymes fall into another. Bacterial enzymes are found in both subfamilies. Ap4A is a potential by-product of aminoacyl tRNA synthesis, and accumulation of Ap4A has been implicated in a range of biological events, such as DNA replication, cellular differentiation, heat shock, metabolic stress, and apoptosis. Ap4A hydrolase cleaves Ap4A asymmetrically into ATP and AMP. It is important in the invasive properties of bacteria and thus presents a potential target for the inhibition of such invasive bacteria. Besides the signature nudix motif (G[X5]E[X7]REUXEEXGU where U is Ile, Leu, or Val), Ap4A hydrolase is structurally similar to the other members of the nudix superfamily with some degree of variations. Several regions in the sequences are poorly defined and substrate and metal binding sites are only predicted based on kinetic studies. 147
23702 239644 cd03672 Dcp2p mRNA decapping enzyme 2 (Dcp2p), the catalytic subunit, and Dcp1p are the two components of the decapping enzyme complex. Decapping is a key step in both general and nonsense-mediated 5'->3' mRNA-decay pathways. Dcp2p contains an all-alpha helical N-terminal domain and a C-terminal domain which has the Nudix fold. While decapping is not dependent on the N-terminus of Dcp2p, it does affect its efficiency. Dcp1p binds the N-terminal domain of Dcp2p stimulating the decapping activity of Dcp2p. Decapping permits the degradation of the transcript and is a site of numerous control inputs. It is responsible for nonsense-mediated decay as well as AU-rich element (ARE)-mediated decay. In addition, it may also play a role in the levels of mRNA. Enzymes belonging to the Nudix superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V). 145
23703 239645 cd03673 Ap6A_hydrolase Diadenosine hexaphosphate (Ap6A) hydrolase is a member of the Nudix hydrolase superfamily. Ap6A hydrolase specifically hydrolyzes diadenosine polyphosphates, but not ATP or diadenosine triphosphate, and it generates ATP as the product. Ap6A, the most preferred substrate, hydrolyzes to produce two ATP molecules, which is a novel hydrolysis mode for Ap6A. These results indicate that Ap6A hydrolase is a diadenosine polyphosphate hydrolase. It requires the presence of a divalent cation, such as Mn2+, Mg2+, Zn2+, and Co2+, for activity. Members of the Nudix superfamily are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. 131
23704 239646 cd03674 Nudix_Hydrolase_1 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, U=I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 138
23705 239647 cd03675 Nudix_Hydrolase_2 Contains a crystal structure of the Nudix hydrolase from Nitrosomonas europaea, which has an unknown function. In general, members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 134
23706 239648 cd03676 Nudix_hydrolase_3 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 180
23707 239649 cd03677 MM_CoA_mutase_beta Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Beta subunit-like subfamily; contains bacterial proteins similar to the beta subunit of MCMs from Propionbacterium shermanni and Streptomyces cinnamonensis, which are alpha/beta heterodimers. For P. shermanni MCM, it is known that only the alpha subunit binds coenzyme B12 and substrates. The role of the beta subunit is unclear. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanni MCM during propionic acid fermentation and Streptomyces MCM in polyketide biosynthesis. 424
23708 239650 cd03678 MM_CoA_mutase_1 Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, unknown subfamily 1; composed of uncharacterized bacterial proteins containing a C-terminal MCM domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Members of this subfamily also contain an N-terminal coenzyme B12 binding domain followed by a domain similar to the E. coli ArgK membrane ATPase. 495
23709 239651 cd03679 MM_CoA_mutase_alpha_like Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Alpha subunit-like subfamily; contains proteins similar to the alpha subunit of Propionbacterium shermanni MCM, as well as human and E. coli MCM. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanni MCM during propionic acid fermentation, E.coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. Sinorhizobium meliloti strain SU47 MCM plays a role in the polyhydroxyalkanoate degradation pathway. P. shermanni and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers. It has been shown for P. shermanni MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E.coli MCMs are also homodimers. In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism. 536
23710 239652 cd03680 MM_CoA_mutase_ICM_like Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, isobutyryl-CoA mutase (ICM)-like subfamily; contains archaeal and bacterial proteins similar to the large subunit of Streptomyces cinnamonensis coenzyme B12-dependent ICM. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA, intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis. 538
23711 239653 cd03681 MM_CoA_mutase_MeaA Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, MeaA-like subfamily; contains various methylmalonyl coenzyme A (CoA) mutase (MCM)-like proteins similar to the Streptomyces cinnamonensis MeaA, Methylobacterium extorquens MeaA and Streptomyces collinus B12-dependent mutase. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. S. cinnamonensis MeaA is a putative B12-dependent mutase which provides methylmalonyl-CoA precursors for the biosynthesis of the monensin polyketide via an unknown pathway. S. collinus B12-dependent mutase may be involved in a pathway for acetate assimilation. 407
23712 239654 cd03682 ClC_sycA_like ClC sycA-like chloride channel proteins. This ClC family is present in bacteria, where it facilitates acid resistance in acidic soil. Mutation of this gene (sycA) in Rhizobium tropici CIAT899 causes serious deficiencies in nodule development, nodulation competitiveness, and N2 fixation on Phaseolus vulgaris plants, due to its reduced ability for acid resistance. This family is part of the ClC chloride channel superfamily. These proteins catalyse the selective flow of Cl- ions across cell membranes and Cl-/H+ exchange transport. These proteins share two characteristics that are apparently inherent to the entire ClC chloride channel superfamily: a unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. 378
23713 239655 cd03683 ClC_1_like ClC-1-like chloride channel proteins. This CD includes isoforms ClC-0, ClC-1, ClC-2 and ClC_K. ClC-1 is expressed in skeletal muscle and its mutation leads to both recessively and dominantly-inherited forms of muscle stiffness or myotonia. ClC-K is exclusively expressed in kidney. Similarly, mutation of ClC-K leads to nephrogenic diabetes insipidus in mice and Bartter's syndrome in human. These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. This domain is found in the eukaryotic halogen ion (Cl-, Br- and I-) channel proteins, that perform a variety of functions including cell volume regulation, regulation of intracelluar chloride concentration, membrane potential stabilization, charge compensation necessary for the acidification of intracellular organelles and transepithelial chloride transport. 426
23714 239656 cd03684 ClC_3_like ClC-3-like chloride channel proteins. This CD includes ClC-3, ClC-4, ClC-5 and ClC-Y1. ClC-3 was initially cloned from rat kidney. Expression of ClC-3 produces outwardly-rectifying Cl currents that are inhibited by protein kinase C activation. It has been suggested that ClC-3 may be a ubiquitous swelling-activated Cl channel that has very similar characteristics to those of native volume-regulated Cl currents. The function of ClC-4 is unclear. Studies of human ClC-4 have revealed that it gives rise to Cl currents that rapidly activate at positive voltages, and are sensitive to extracellular pH, with currents decreasing when pH falls below 6.5. ClC-4 is broadly distributed, especially in brain and heart. ClC-5 is predominantly expressed in the kidney, but can be found in the brain and liver. Mutations in the ClC-5 gene cause certain hereditary diseases, including Dent's disease, an X-chromosome linked syndrome characterised by proteinuria, hypercalciuria, and kidney stones (nephrolithiasis), leading to progressive renal failure. These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. This domain is found in the eukaryotic halogen ion (Cl- and I-) channel proteins, that perform a variety of functions including cell volume regulation, the membrane potential stabilization, transepithelial chloride transport and charge compensation necessary for the acidification of intracellular organelles. 445
23715 239657 cd03685 ClC_6_like ClC-6-like chloride channel proteins. This CD includes ClC-6, ClC-7 and ClC-B, C, D in plants. Proteins in this family are ubiquitous in eukaryotes and their functions are unclear. They are expressed in intracellular organelles membranes. This family belongs to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. ClC chloride ion channel superfamily perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, and transepithelial transport in animals. 466
23716 239658 cd03687 Dehydratase_LU Dehydratase large subunit. This family contains the large (alpha) subunit of B12-dependent glycerol dehydratases (GDHs) and B12-dependent diol dehydratases (DDHs). GDH is isofunctional with DDH. These enzymes can each catalyze the conversion of 1,2-propanediol, glycerol, and 1,2-ethanediol to the corresponding aldehydes via a coenzyme B12 (adenosylcobalamin)-dependent radical mechanism. Both enzymes exhibit a subunit composition of alpha2beta2gamma2. The enzymes differ in substrate specificity; glycerol is the preferred substrate for GDH and 1,2-propanediol for DDH. GDH shows almost equal affinity for both (R) and (S)-isomers while DDH prefers the (S) isomer. GDH plays a key role in the dihydroxyacetone (DHA) pathway and DDH in the anaerobic degradation of 1,2-diols. The radical mechanism has been well studied for Klebsiella oxytoca DDH and involves binding of 1,2-propanediol to the enzyme to induce homolytic cleavage of the Co-C5' bond of the coenzyme to form cob(II)alamin and the adenosyl radical. Hydrogen abstraction from the substrate follows producing a substrate generated radical and 5'-deoxyadenosine. Rearrangement to the product radical is then followed by abstraction of a hydrogen atom from 5'-deoxyadenosine to produce the hydrated propionaldehyde and regenerate the adenosyl radical. After the Co-C5' bond is reformed and the hydrated aldehyde dehydrated, the process is complete. GDH has a higher affinity for coenzyme B12 than DDH. Both GDH and DDH are activated by various monovalent cations with K+, NH4+, and Rb+ being the most effective. However, DDH differs from GDH in that it is partially active with Cs+ and Na+. In general, the alpha and beta subunits for both enzymes are on different chains. However, for a subset of the GDHs, alpha and beta subunits appear to be on a single chain. 545
23717 293889 cd03688 eIF2_gamma_II Domain II of the gamma subunit of eukaryotic translation initiation factor 2. This subfamily represents domain II of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in eukaryota and archaea. eIF2 is a G protein that delivers the methionyl initiator tRNA to the small ribosomal subunit and releases it upon GTP hydrolysis after the recognition of the initiation codon. eIF2 is composed of three subunits, alpha, beta and gamma. Subunit gamma shows strongest conservation, and it confers both tRNA binding and GTP/GDP binding. 113
23718 293890 cd03689 RF3_II Domain II of bacterial Release Factor 3. This subfamily represents domain II of bacterial Release Factor 3 (RF3). Termination of protein synthesis by the ribosome requires two release factor (RF) classes. The class II RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. Sequence comparison of class II release factors with elongation factors shows that prokaryotic RF3 is more similar to EF-G whereas eukaryotic eRF3 is more similar to eEF1A, implying that their precise function may differ. 87
23719 293891 cd03690 Tet_II Domain II of ribosomal protection proteins Tet(M) and Tet(O). This subfamily represents domain II of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to domain II of the elongation factors EF-G and EF-2. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. 86
23720 293892 cd03691 BipA_TypA_II Domain II of BipA. BipA (also called TypA) is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. The domain II of BipA shows similarity to the domain II of the elongation factors (EFs) EF-G and EF-Tu. 94
23721 293893 cd03692 mtIF2_IVc C2 subdomain of domain IV in mitochondrial translation initiation factor 2. This model represents the C2 subdomain of domain IV of mitochondrial translation initiation factor 2 (mtIF2) which adopts a beta-barrel fold displaying a high degree of structural similarity with domain II of the translation elongation factor EF-Tu. The C-terminal part of mtIF2 contains the entire fMet-tRNAfMet binding site of IF-2 and is resistant to proteolysis. This C-terminal portion consists of two domains, IF2 C1 and IF2 C2. IF2 C2 has been shown to contain all molecular determinants necessary and sufficient for the recognition and binding of fMet-tRNAfMet. Like IF2 from certain prokaryotes such as Thermus thermophilus, mtIF2 lacks domain II which is thought to be involved in binding of E.coli IF-2 to 30S subunits. 84
23722 293894 cd03693 EF1_alpha_II Domain II of elongation factor 1-alpha. This family represents domain II of elongation factor 1-alpha (EF-1A) that is found in archaea and all eukaryotic lineages. EF-1A is very abundant in the cytosol, where it is involved in the GTP-dependent binding of aminoacyl-tRNAs to the A site of the ribosomes in the second step of translation from mRNAs to proteins. Both domain II of EF-1A and domain IV of IF2/eIF5B have been implicated in recognition of the 3'-ends of tRNA. More than 61% of eukaryotic elongation factor 1A (eEF-1A) in cells is estimated to be associated with actin cytoskeleton. The binding of eEF-1A to actin is a noncanonical function that may link two distinct cellular processes, cytoskeleton organization and gene expression. 91
23723 293895 cd03694 GTPBP_II Domain II of the GTPBP family of GTP binding proteins. This group includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of the protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution. 87
23724 293896 cd03695 CysN_NodQ_II Domain II of the large subunit of ATP sulfurylase. This subfamily represents domain II of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and homologous to the domain II of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction, APS is phosphorylated by an APS kinase (CysC), to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. In Rhizobium meliloti, the heterodimeric complex comprised of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N and C termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, archaea, and eukaryotes use a different ATP sulfurylase, which shows no amino acid sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha. 81
23725 293897 cd03696 SelB_II Domain II of elongation factor SelB. This subfamily represents the domain of elongation factor SelB that is homologous to domain II of EF-Tu. SelB may function by replacing EF-Tu. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. 83
23726 293898 cd03697 EFTU_II Domain II of elongation factor Tu. Elongation factors Tu (EF-Tu) are three-domain GTPases with an essential function in the elongation phase of mRNA translation. The GTPase center of EF-Tu is in the N-terminal domain (domain I), also known as the catalytic or G-domain. The G-domain is composed of about 200 amino acid residues, arranged into a predominantly parallel six-stranded beta-sheet core surrounded by seven alpha helices. Non-catalytic domains II and III are beta-barrels of seven and six, respectively, antiparallel beta-strands that share an extended interface. Both non-catalytic domains are composed of about 100 amino acid residues. EF-Tu proteins exist in two principal conformations: a compact one, EF-Tu*GTP, with tight interfaces between all three domains and a high affinity for aminoacyl-tRNA; and an open one, EF-Tu*GDP, with essentially no G-domain-domain II interactions and a low affinity for aminoacyl-tRNA. EF-Tu has approximately a 100-fold higher affinity for GDP than for GTP. 87
23727 293899 cd03698 eRF3_II_like Domain II of the eukaryotic class II release factor-like proteins. This model represents the domain similar to domain II of the eukaryotic class II release factor (eRF3). In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF-1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. This group also contains proteins similar to S. cerevisiae Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression. 84
23728 293900 cd03699 EF4_II Domain II of Elongation Factor 4 (EF4). Elongation factor 4 (EF4 or LepA) is a highly conserved guanosine triphosphatase found in bacteria and eukaryotic mitochondria and chloroplasts. EF4 functions as a translation factor, which promotes back-translocation of tRNAs on posttranslocational ribosome complexes and competes with elongation factor G for interaction with pretranslocational ribosomes, inhibiting the elongation phase of protein synthesis. 86
23729 293901 cd03700 EF2_snRNP_like_II Domain II of elongation factor 2 and C-terminal domain of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein. This subfamily represents domain II of elongation factor (EF) EF-2 found in eukaryotes and archaea, and the C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of U5-116 kD/Snu114p. 95
23730 293902 cd03701 IF2_IF5B_II Domain II of prokaryotic Initiation Factor 2 and archaeal and eukaryotic Initiation Factor 5. This family represents domain II of prokaryotic Initiation Factor 2 (IF2) and its archaeal and eukaryotic homologue aeIF5B. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli, three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of the 60S ribosomal subunit. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains in EF1A, eEF1A and aeIF2gamma. 96
23731 293903 cd03702 IF2_mtIF2_II Domain II of bacterial and mitochondrial Initiation Factor 2. This family represents domain II of bacterial Initiation Factor 2 (IF2) and its eukaryotic mitochondrial homolog mtIF2. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli, three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Bacterial IF-2 is structurally and functionally related to eukaryotic mitochondrial mtIF-2. 96
23732 293904 cd03703 aeIF5B_II Domain II of archaeal and eukaryotic Initiation Factor 5. This family represents domain II of archaeal and eukaryotic IF5B. aIF5B and eIF5B are homologs of prokaryotic Initiation Factor 2 (IF2). Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of 60S subunits. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains of EF1A, eEF1A and aeIF2gamma. 111
23733 294003 cd03704 eRF3_C_III C-terminal domain of eRF3. This model represents the eEF1alpha-like C-terminal region of eRF3, which is homologous to the domain III of EF-Tu. eRF3 is a GTPase which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. The C-terminal region is responsible for translation termination activity and is essential for viability. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions: N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. 108
23734 294004 cd03705 EF1_alpha_III Domain III of Elongation Factor 1. Eukaryotic elongation factor 1 (EF-1) is responsible for the GTP-dependent binding of aminoacyl-tRNAs to ribosomes. EF-1 is composed of four subunits: the alpha chain, which binds GTP and aminoacyl-tRNAs; the gamma chain that probably plays a role in anchoring the complex to other cellular components; and the beta and delta (or beta') chains. This model represents the alpha subunit, which is the counterpart of bacterial EF-Tu for archaea (aEF-1 alpha) and eukaryotes (eEF-1 alpha). 104
23735 294005 cd03706 mtEFTU_III Domain III of mitochondrial EF-TU (mtEF-TU). mtEF-TU is highly conserved and is 55-60% identical to bacterial EF-TU. The overall structure is similar to that observed in the Escherichia coli and Thermus aquaticus EF-TU. However, compared with that observed in prokaryotic EF-TU, the nucleotide-binding domain (domain I) of mtEF-TU is in a different orientation relative to the rest of the structure. Furthermore, domain III is followed by a short 11-amino acid extension that forms one helical turn. This extension seems to be specific to the mitochondrial factors and has not been observed in any of the prokaryotic factors. 93
23736 294006 cd03707 EFTU_III Domain III of Elongation Factor (EF) Tu. EF-Tu consists of three structural domains, designated I, II, and III. Domain III adopts a beta barrel structure. Domain III is involved in binding to both charged tRNA and to elongation factor Ts (EF-Ts). EF-Ts is the guanine-nucleotide-exchange factor for EF-Tu. EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Crystallographic studies revealed structural similarities ("molecular mimicry") between tertiary structures of EF-G and the EF-Tu-aminoacyl-tRNA ternary complex. Domains III, IV, and V of EF-G mimic the tRNA structure in the EF-Tu ternary complex; domains III, IV and V can be related to the acceptor stem, anticodon helix and T stem of tRNA respectively. 90
23737 294007 cd03708 GTPBP_III Domain III of the GP-1 family of GTPases. This family includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of the protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in the cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution. 87
23738 239680 cd03709 lepA_C lepA_C: This family represents the C-terminal region of LepA, a GTP-binding protein localized in the cytoplasmic membrane. LepA is ubiquitous in Bacteria and Eukaryota (e.g. Saccharomyces cerevisiae GUF1p), but is missing from Archaea. LepA exhibits significant homology to elongation factors (EFs) Tu and G. The function(s) of the proteins in this family are unknown. The N-terminal domain of LepA is homologous to a domain of similar size found in initiation factor 2 (IF2), and in EF-Tu and EF-G (factors required for translation in Escherichia coli). Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including S. cerevisiae GUF1) originated within the bacterial LepA family. LepA has never been observed in archaea, and eukaryl LepA is organellar. LepA is therefore a true bacterial GTPase, found only in the bacterial lineage. 80
23739 239681 cd03710 BipA_TypA_C BipA_TypA_C: a C-terminal portion of BipA or TypA having homology to the C terminal domains of the elongation factors EF-G and EF-2. A member of the ribosome binding GTPase superfamily, BipA is widely distributed in bacteria and plants. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. 79
23740 239682 cd03711 Tet_C Tet_C: C-terminus of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to the C terminal domains of the elongation factors EF-G and EF-2. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. 78
23741 239683 cd03713 EFG_mtEFG_C EFG_mtEFG_C: domains similar to the C-terminal domain of the bacterial translational elongation factor (EF) EF-G. Included in this group is the C-terminus of mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2) proteins. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homologue of mtEFG2, MEF2. 78
23742 239684 cd03714 RT_DIRS1 RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase. 119
23743 239685 cd03715 RT_ZFREV_like RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses. 210
23744 239686 cd03716 SOCS_ASB_like SOCS (suppressors of cytokine signaling) box of ASB (ankyrin repeat and SOCS box) and SSB (SPRY domain-containing SOCS box proteins) protein families. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence of a variable number of repeats. SSB proteins contain a central SPRY domain and a C-terminal SOCS. Recently, it has been shown that all four SSB proteins interact with the MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and that SSB-1, SSB-2, and SSB-4 interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. 42
23745 239687 cd03717 SOCS_SOCS_like SOCS (suppressors of cytokine signaling) box of SOCS-like proteins. The CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. These intracellular proteins regulate the responses of immune cells to cytokines. Identified as negative regulators of the cytokine-JAK-STAT pathway, they seem to play a role in many immunological and pathological processes. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. Related SOCS boxes are also present in Rab40-like proteins and insect proteins of unknown function that also contain a NEUZ (domain in neuralized proteins) domain. 39
23746 239688 cd03718 SOCS_SSB1_4 SOCS (suppressors of cytokine signaling) box of SSB1 and SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 and SSB4 have been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and also interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23747 239689 cd03719 SOCS_SSB2 SOCS (suppressors of cytokine signaling) box of SSB2 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB2 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB2, like SSB4 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23748 239690 cd03720 SOCS_ASB1 SOCS (suppressors of cytokine signaling) box of ASB1-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23749 239691 cd03721 SOCS_ASB2 SOCS (suppressors of cytokine signaling) box of ASB2-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB2 targets specific proteins to destruction by the proteasome in leukemia cells that have been induced to differentiate. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 45
23750 239692 cd03722 SOCS_ASB3 SOCS (suppressors of cytokine signaling) box of ASB3-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB3 has been shown to be a negative regulator of TNF-R2-mediated cellular responses to TNF-alpha by direct targeting of tumor necrosis factor receptor II (TNF-R2) for ubiquitination and proteasome-mediated degradation. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 51
23751 239693 cd03723 SOCS_ASB4_ASB18 SOCS (suppressors of cytokine signaling) box of ASB4 and ASB18 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Asb4 was identified as an imprinted gene in mice. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 48
23752 239694 cd03724 SOCS_ASB5 SOCS (suppressors of cytokine signaling) box of ASB5-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB5 has been implicated in the initiation of arteriogenesis. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23753 239695 cd03725 SOCS_ASB6 SOCS (suppressors of cytokine signaling) box of ASB6-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB6 interacts with the adaptor protein APS and recruits elongin B/C to the insulin receptor signaling complex. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 44
23754 239696 cd03726 SOCS_ASB7 SOCS (suppressors of cytokine signaling) box of ASB7-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 45
23755 239697 cd03727 SOCS_ASB8 SOCS (suppressors of cytokine signaling) box of ASB8-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB8 is highly transcribed in skeletal muscle and in lung carcinoma cell lines. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 43
23756 239698 cd03728 SOCS_ASB_9_11 SOCS (suppressors of cytokine signaling) box of ASB9 and 11 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23757 239699 cd03729 SOCS_ASB13 SOCS (suppressors of cytokine signaling) box of ASB13-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23758 239700 cd03730 SOCS_ASB14 SOCS (suppressors of cytokine signaling) box of ASB14-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 57
23759 239701 cd03731 SOCS_ASB15 SOCS (suppressors of cytokine signaling) box of ASB15-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB15 is expressed predominantly in skeletal muscle and participates in the regulation of protein turnover and muscle cell development by stimulating protein synthesis and regulating differentiation of muscle cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 56
23760 239702 cd03733 SOCS_WSB_SWIP SOCS (suppressors of cytokine signaling) box of WSB/SWiP-like proteins. This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2), and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh), as well as their isoforms WSB-2 and SWiP-2. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 39
23761 239703 cd03734 SOCS_CIS1 SOCS (suppressors of cytokine signaling) box of CIS (cytokine-inducible SH2 protein) 1-like proteins. Together with the SOCS proteins, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. CIS1, like SOCS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. CIS1 binds to cytokine receptors at STAT5-docking sites, which prohibits recruitment of STAT5 to the receptor signaling complex and results in the down-regulation of activation by STAT5. 41
23762 239704 cd03735 SOCS_SOCS1 SOCS (suppressors of cytokine signaling) box of SOCS1-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS1, like CIS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. SOCS1 has a dual function as a direct potent JAK kinase inhibitor and as a component of an E3 ubiquitin-ligase complex recruiting substrates to the protein degradation machinery. 43
23763 239705 cd03736 SOCS_SOCS2 SOCS (suppressors of cytokine signaling) box of SOCS2-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS2 has recently been shown to regulate neuronal differentiation by controlling expression of a neurogenic transcription factor, Neurogenin-1. SOCS2 binds to GH receptors and inhibits the activation of STAT5b induced by GH. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 41
23764 239706 cd03737 SOCS_SOCS3 SOCS (suppressors of cytokine signaling) box of SOCS3-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS3, like CIS1 and SOCS1, is involved in the down-regulation of the JAK/STAT pathway. SOCS3 inhibits JAK activity indirectly through recruitment to the cytokine receptors. SOCS3 has been shown to play an essential role in placental development and a non-essential role in embryo development. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23765 239707 cd03738 SOCS_SOCS4 SOCS (suppressors of cytokine signaling) box of SOCS4-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 56
23766 239708 cd03739 SOCS_SOCS5 SOCS (suppressors of cytokine signaling) box of SOCS5-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS5 inhibits Th2 differentiation by inhibiting IL-4 signaling. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 57
23767 239709 cd03740 SOCS_SOCS6 SOCS (suppressors of cytokine signaling) box of SOCS6-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 41
23768 239710 cd03741 SOCS_SOCS7 SOCS (suppressors of cytokine signaling) box of SOCS7-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS7 is important in the functioning of neuronal cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 49
23769 239711 cd03742 SOCS_Rab40 SOCS (suppressors of cytokine signaling) box of Rab40-like proteins. Rab40 is part of the Rab family of small GTP-binding proteins that form the largest family within the Ras superfamily. Rab proteins regulate vesicular trafficking pathways, behaving as membrane-associated molecular switches. Rab40 is characterized by a SOCS box c-terminal to the GTPase domain. The SOCS boxes interact with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 43
23770 239712 cd03743 SOCS_SSB4 SOCS (suppressors of cytokine signaling) box of SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB4 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB4, like SSB2 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23771 239713 cd03744 SOCS_SSB1 SOCS (suppressors of cytokine signaling) box of SSB1 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), in both the absence and the presence of HGF, and enhances the HGF-MET-induced mitogen-activated protein kinases Erk-transcription factor Elk-1-serum response elements (SRE) pathway. SSB1, like SSB2 and SSB4, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 42
23772 239714 cd03745 SOCS_WSB2_SWIP2 SOCS (suppressors of cytokine signaling) box of WSB2/SWiP2-like proteins. This family consists of WSB-2 (SOCS-box-containing WD-40 protein) and SWiP-2 (SOCS box and WD-repeats in Protein). No functional information is available for WSB2 or SWiP-2, but limited information is available for the isoforms WSB-1 and SWiP-1. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 39
23773 239715 cd03746 SOCS_WSB1_SWIP1 SOCS (suppressors of cytokine signaling) box of WSB1/SWiP1-like proteins. This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2) and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh). The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 40
23774 239716 cd03747 Ntn_PGA_like Penicillin G acylase (PGA) belongs to a family of beta-lactam acylases that includes cephalosporin acylase (CA) and aculeacin A acylase. PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively. While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity is low. 312
23775 239717 cd03748 Ntn_PGA Penicillin G acylase (PGA) is the key enzyme in the industrial production of beta-lactam antibiotics. PGA hydrolyzes the side chain of penicillin G and related beta-lactam antibiotics releasing 6-amino penicillanic acid (6-APA), a building block in the production of semisynthetic penicillins. PGA is widely distributed among microorganisms, including bacteria, yeast and filamentous fungi, but its in vivo role remains unclear. 488
23776 239718 cd03749 proteasome_alpha_type_1 proteasome_alpha_type_1. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 211
23777 239719 cd03750 proteasome_alpha_type_2 proteasome_alpha_type_2. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 227
23778 239720 cd03751 proteasome_alpha_type_3 proteasome_alpha_type_3. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 212
23779 239721 cd03752 proteasome_alpha_type_4 proteasome_alpha_type_4. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 213
23780 239722 cd03753 proteasome_alpha_type_5 proteasome_alpha_type_5. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 213
23781 239723 cd03754 proteasome_alpha_type_6 proteasome_alpha_type_6. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 215
23782 239724 cd03755 proteasome_alpha_type_7 proteasome_alpha_type_7. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 207
23783 239725 cd03756 proteasome_alpha_archeal proteasome_alpha_archeal. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 211
23784 239726 cd03757 proteasome_beta_type_1 proteasome beta type-1 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 212
23785 239727 cd03758 proteasome_beta_type_2 proteasome beta type-2 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 193
23786 239728 cd03759 proteasome_beta_type_3 proteasome beta type-3 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 195
23787 239729 cd03760 proteasome_beta_type_4 proteasome beta type-4 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 197
23788 239730 cd03761 proteasome_beta_type_5 proteasome beta type-5 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 188
23789 239731 cd03762 proteasome_beta_type_6 proteasome beta type-6 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 188
23790 239732 cd03763 proteasome_beta_type_7 proteasome beta type-7 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 189
23791 239733 cd03764 proteasome_beta_archeal Archaeal proteasome, beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme for non-lysosomal protein degradation in both the cytosol and the nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are both members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 188
23792 239734 cd03765 proteasome_beta_bacterial Bacterial proteasome, beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 236
23793 239735 cd03766 Gn_AT_II_novel Gn_AT_II_novel. This asparagine synthase-related domain is present in eukaryotes but its function has not yet been determined. The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer. 181
23794 239736 cd03767 SR_Res_par Serine recombinase (SR) family, Partitioning (par)-Resolvase subfamily, catalytic domain; Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subgroup is composed of proteins similar to the E. coli resolvase found in the par region of the RP4 plasmid, which encodes a highly efficient partitioning system. This protein is part of a complex stabilization system involved in the resolution of plasmid dimers during cell division. Similar to Tn3 and other resolvases, members of this family may contain a C-terminal DNA binding domain. 146
23795 239737 cd03768 SR_ResInv Serine Recombinase (SR) family, Resolvase and Invertase subfamily, catalytic domain; members contain a C-terminal DNA binding domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. Resolvases and invertases affect resolution or inversion and comprise a major phylogenic group. Resolvases (e.g. Tn3, gamma-delta, and Tn5044) normally recombine two sites in direct repeat causing deletion of the DNA between the sites. Invertases (e.g. Gin and Hin) recombine sites in inverted repeat to invert the DNA between the sites. Cointegrate resolution with gamma-delta resolvase requires the formation of a synaptosome of three resolvase dimers bound to each of two res sites on the DNA. Also included in this subfamily are some putative integrases including a sequence from bacteriophage phi-FC1. 126
23796 239738 cd03769 SR_IS607_transposase_like Serine Recombinase (SR) family, IS607-like transposase subfamily, catalytic domain; members contain a DNA binding domain with homology to MerR/SoxR located N-terminal to the catalytic domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subfamily is composed of proteins that catalyze the transposition of insertion sequence (IS) elements such as IS607 from Helicobacter and IS1535 from Mycobacterium, and similar proteins from other bacteria and several archaeal species. IS elements are DNA segments that move to new sites in prokaryotic and eukaryotic genomes causing insertion mutations and gene rearrangements. 134
23797 239739 cd03770 SR_TndX_transposase Serine Recombinase (SR) family, TndX-like transposase subfamily, catalytic domain; composed of large serine recombinases similar to Clostridium TndX and TnpX transposases. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. TndX mediates the excision and circularization of the conjugative transposon Tn5397 from Clostridium difficile. TnpX is responsible for the movement of the nonconjugative chloramphenicol resistance elements of the Tn4451/3 family. Mobile genetic elements such as transposons are important vehicles for the transmission of virulence and antibiotic resistance in many microorganisms. 140
23798 239740 cd03771 MATH_Meprin Meprin family, MATH domain; Meprins are multidomain, highly glycosylated extracellular metalloproteases, which are either anchored to the membrane or secreted into extracellular spaces. They are expressed in renal and intestinal brush border membranes, leukocytes, and cancer cells, and are capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. Meprin proteases are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. Despite their similarity, the two subunits differ in their ability to self-associate, in proteolytic processing during biosynthesis and in substrate specificity. Both subunits are synthesized as membrane spanning proteins; however, the alpha subunit is cleaved during biosynthesis and loses its transmembrane domain. Meprin beta forms homodimers or heterotetramers while meprin alpha oligomerizes into large complexes containing 10-100 subunits. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen. 167
23799 239741 cd03772 MATH_HAUSP Herpesvirus-associated ubiquitin-specific protease (HAUSP, also known as USP7) family, N-terminal MATH (TRAF-like) domain; composed of proteins similar to human HAUSP, an enzyme that specifically catalyzes the deubiquitylation of p53 and MDM2, hence playing an important role in the p53-MDM2 pathway. It contains an N-terminal TRAF-like domain and a C-terminal catalytic protease (C19 family) domain. The tumor suppressor p53 protein is a transcription factor that responds to many cellular stress signals and is regulated primarily through ubiquitylation and subsequent degradation. MDM2 is a RING-finger E3 ubiquitin ligase that promotes p53 ubiquitinylation. p53 and MDM2 bind to the same site in the N-terminal TRAF-like domain of HAUSP in a mutually exclusive manner. HAUSP also interacts with the Epstein-Barr nuclear antigen 1 (EBNA1) protein of the Epstein-Barr virus (EBV), which efficiently immortalizes infected cells predisposing the host to a variety of cancers. EBNA1 plays several important roles in EBV latent infection and cellular transformation. It binds the same pocket as p53 in the HAUSP TRAF-like domain. Through interactions with p53, MDM2 and EBNA1, HAUSP plays a role in cell proliferation, apoptosis and EBV-mediated immortalization. 137
23800 239742 cd03773 MATH_TRIM37 Tripartite motif containing protein 37 (TRIM37) family, MATH domain; TRIM37 is a peroxisomal protein and is a member of the tripartite motif (TRIM) protein subfamily, also known as the RING-B-box-coiled-coil (RBCC) subfamily of zinc-finger proteins. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction and hepatomegaly. TRIM37, similar to other TRIMs, contains a cysteine-rich, zinc-binding RING-finger domain followed by another cysteine-rich zinc-binding domain, the B-box, and a coiled-coil domain. TRIM37 is autoubiquitinated in a RING domain-dependent manner, indicating that it functions as an ubiquitin E3 ligase. In addition to the tripartite motif, TRIM37 also contains a MATH domain C-terminal to the coiled-coil domain. The MATH domain of TRIM37 has been shown to interact with the TRAF domain of six known TRAFs in vitro, however, it is unclear whether this is physiologically relevant. Eleven TRIM37 mutations have been associated with Mulibrey nanism so far. One mutation, Gly322Val, is located in the MATH domain and is the only mutation that does not affect the length of the protein. It results in the incorrect subcellular localization of TRIM37. 132
23801 239743 cd03774 MATH_SPOP Speckle-type POZ protein (SPOP) family, MATH domain; composed of proteins with similarity to human SPOP. SPOP was isolated as a novel antigen recognized by serum from a scleroderma patient, whose overexpression in COS cells results in a discrete speckled pattern in the nuclei. It contains an N-terminal MATH domain and a C-terminal BTB (also called POZ) domain. Together with Cul3, SPOP constitutes an ubiquitin E3 ligase which is able to ubiquitinate the PcG protein BMI1, the variant histone macroH2A1 and the death domain-associated protein Daxx. Therefore, SPOP may be involved in the regulation of these proteins and may play a role in transcriptional regulation, apoptosis and X-chromosome inactivation. Cul3 binds to the BTB domain of SPOP whereas Daxx and the macroH2A1 nonhistone region have been shown to bind to the MATH domain. Both MATH and BTB domains are necessary for the nuclear speckled accumulation of SPOP. There are many proteins, mostly uncharacterized, containing both MATH and BTB domains from C. elegans and plants which are excluded from this family. 139
23802 239744 cd03775 MATH_Ubp21p Ubiquitin-specific protease 21 (Ubp21p) family, MATH domain; composed of fungal proteins with similarity to Ubp21p of fission yeast. Ubp21p is a deubiquitinating enzyme that may be involved in the regulation of the protein kinase Prp4p, which controls the formation of active spliceosomes. Members of this family are similar to human HAUSP (Herpesvirus-associated ubiquitin-specific protease) in that they contain an N-terminal MATH domain and a C-terminal catalytic protease (C19 family) domain. HAUSP is also an ubiquitin-specific protease that specifically catalyzes the deubiquitylation of p53 and MDM2. The MATH domain of HAUSP contains the binding site for p53 and MDM2. Similarly, the MATH domain of members in this family may be involved in substrate binding. 134
23803 239745 cd03776 MATH_TRAF6 Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF6 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF6, including the Drosophila protein DTRAF2. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF6 is the most divergent in its TRAF domain among the mammalian TRAFs. In addition to mediating TNFR family signaling, it is also an essential signaling molecule of the interleukin-1/Toll-like receptor superfamily. Whereas other TRAF molecules display similar and overlapping TNFR-binding specificities, TRAF6 binds completely different sites on receptors such as CD40 and RANK. TRAF6 serves as a molecular bridge between innate and adaptive immunity and plays a central role in osteoimmunology. DTRAF2, as an activator of nuclear factor-kappaB, plays a pivotal role in Drosophila development and innate immunity. TRAF6 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 147
23804 239746 cd03777 MATH_TRAF3 Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF3 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF3 was first described as a molecule that binds the cytoplasmic tail of CD40. However, it is not required for CD40 signaling. More recently, TRAF3 has been identified as a key regulator of type I interferon (IFN) production and the mammalian innate antiviral immunity. It mediates IFN responses in Toll-like receptor (TLR)-dependent as well as TLR-independent viral recognition pathways. It is also a key element in immunological homeostasis through its regulation of the anti-inflammatory cytokine interleukin-10. TRAF3 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 186
23805 239747 cd03778 MATH_TRAF2 Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF2 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF2 associates with the receptors TNFR-1, TNFR-2, RANK (which mediates differentiation and maturation of osteoclasts) and CD40 (which is important for the proliferation and activation of B cells), among others. It regulates distinct pathways that lead to the activation of nuclear factor-kappaB and Jun NH2-terminal kinases. TRAF2 also indirectly associates with death receptors through its interaction with TRADD (TNFR-associated death domain protein). It is involved in regulating oxidative stress or ROS-induced cell death and in the preconditioning of cells by sublethal stress for protection from subsequent injury. TRAF2 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 164
23806 239748 cd03779 MATH_TRAF1 Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF1 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF1 expression is the most restricted among the TRAFs. It is found exclusively in activated lymphocytes, dendritic cells and certain epithelia. TRAF1 associates, directly or indirectly through heterodimerization with TRAF2, with the TNFR family receptors TNFR-2, CD30, RANK, CD40 and LMP1, among others. It also binds the intracellular proteins TRADD, TANK, TRIP, RIP1, RIP2 and FLIP. TRAF1 is unique among the TRAFs in that it lacks a RING domain, which is critical for the activation of nuclear factor-kappaB and Jun NH2-terminal kinase. Studies on TRAF1-deficient mice suggest that TRAF1 has a negative regulatory role in TNFR-mediated signaling events. TRAF1 contains one zinc finger and one TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 147
23807 239749 cd03780 MATH_TRAF5 Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF5 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF5 was identified as an activator of nuclear factor-kappaB and a regulator of lymphotoxin-beta receptor and CD40 signaling. Its interaction with CD40 is indirect, involving hetero-oligomerization with TRAF3. In addition, TRAF5 has been shown to associate with other TNFRs including CD27, CD30, OX40 and GITR (glucocorticoid-induced TNFR). It plays a role in modulating Th2 immune responses (driven by OX40 costimulation) and T-cell activation (triggered by GITR). It is also involved in osteoclastogenesis. TRAF5 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 148
23808 239750 cd03781 MATH_TRAF4 Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF4 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF4, including the Drosophila protein DTRAF1. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF4 is highly expressed during embryogenesis, especially in the central and peripheral nervous system. Studies using TRAF4-deficient mice show that TRAF4 is required for neurogenesis, as well as the development of the trachea and the axial skeleton. In addition, TRAF4 augments nuclear factor-kappaB activation triggered by GITR (glucocorticoid-induced TNFR), a receptor expressed in T-cells, B-cells and macrophages. It also participates in counteracting the signaling mediated by Toll-like receptors through its association with TRAF6 and TRIF. DTRAF1 plays a pivotal role in the development of eye imaginal discs and photosensory neuron arrays in Drosophila. TRAF4 contains a RING finger domain, seven zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 154
23809 239751 cd03782 MATH_Meprin_Beta Meprin family, Beta subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The beta subunit is a type I membrane protein, which forms homodimers or heterotetramers (alpha2beta2 or alpha3beta). Meprin beta shows preference for acidic residues at the P1 and P1' sites of its substrate. Among its best substrates are growth factors and chemokines such as gastrin and osteopontin. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen. 167
23810 239752 cd03783 MATH_Meprin_Alpha Meprin family, Alpha subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The alpha subunit is synthesized as a membrane spanning protein; however, it is cleaved during biosynthesis and loses its transmembrane domain. It oligomerizes into large complexes, containing 10-100 subunits (dimers that associate noncovalently), which are secreted as latent proteases and can move through extracellular spaces in a nondestructive manner. This allows delivery of the concentrated protease to sites containing activating enzymes, such as sites of inflammation, infection or cancerous growth. Meprin alpha shows preference for small or hydrophobic residues at the P1 and P1' sites of its substrate. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen. 167
23811 340817 cd03784 GT1_Gtf-like UDP-glycosyltransferases and similar proteins. This family includes the Gtfs, a group of homologous glycosyltransferases involved in the final stages of the biosynthesis of antibiotics vancomycin and related chloroeremomycin. Gtfs transfer sugar moieties from an activated NDP-sugar donor to the oxidatively cross-linked heptapeptide core of vancomycin group antibiotics. The core structure is important for the bioactivity of the antibiotics. 404
23812 340818 cd03785 GT28_MurG undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase. MurG (EC 2.4.1.227) is an N-acetylglucosaminyltransferase, the last enzyme involved in the intracellular phase of peptidoglycan biosynthesis. It transfers N-acetyl-D-glucosamine (GlcNAc) from UDP-GlcNAc to the C4 hydroxyl of a lipid-linked N-acetylmuramoyl pentapeptide (NAM). The resulting disaccharide is then transported across the cell membrane, where it is polymerized into the NAG-NAM cell-wall repeat structure. MurG belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains, each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 350
23813 340819 cd03786 GTB_UDP-GlcNAc_2-Epimerase UDP-N-acetylglucosamine 2-epimerase and similar proteins. Bacterial members of the UDP-N-Acetylglucosamine (GlcNAc) 2-Epimerase family (EC 5.1.3.14) are known to catalyze the reversible interconversion of UDP-GlcNAc and UDP-N-acetylmannosamine (UDP-ManNAc). The enzyme serves to produce an activated form of ManNAc residues (UDP-ManNAc) for use in the biosynthesis of a variety of cell surface polysaccharides. The mammalian enzyme is bifunctional, catalyzing both the inversion of stereochemistry at C-2 and the hydrolysis of the UDP-sugar linkage to generate free ManNAc. It also catalyzes the phosphorylation of ManNAc to generate ManNAc 6-phosphate, a precursor to sialic acids. In mammals, sialic acids are found at the termini of oligosaccharides in a large variety of cell surface glycoconjugates and are key mediators of cell-cell recognition events. Mutations in human members of this family have been associated with Sialuria, a rare disease caused by the disorders of sialic acid metabolism. This family belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 365
23814 340820 cd03788 GT20_TPS trehalose-6-phosphate synthase. Trehalose-6-Phosphate Synthase (TPS, EC 2.4.1.15) is a glycosyltransferase that catalyses the synthesis of alpha,alpha-1,1-trehalose-6-phosphate from glucose-6-phosphate using a UDP-glucose donor. It is a key enzyme in the trehalose synthesis pathway. Trehalose is a nonreducing disaccharide present in a wide variety of organisms and may serve as a source of energy and carbon. It is characterized most notably in insect, plant, and microbial cells. Its production is often associated with a variety of stress conditions, including desiccation, dehydration, heat, cold, and oxidation. This family represents the catalytic domain of the TPS. Some members of this domain family coexist with a C-terminal trehalose phosphatase domain. 463
23815 340821 cd03789 GT9_LPS_heptosyltransferase lipopolysaccharide heptosyltransferase and similar proteins. Lipopolysaccharide heptosyltransferase (2.4.99.B6) is involved in the biosynthesis of lipooligosaccharide (LOS). Lipopolysaccharide (LPS) is a major component of the outer membrane of gram-negative bacteria. LPS heptosyltransferase transfers heptose molecules from ADP-heptose to 3-deoxy-D-manno-octulosonic acid (KDO), a part of the inner core component of LPS. This family also contains lipopolysaccharide 1,2-N-acetylglucosaminetransferase EC 2.4.1.56 and belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 277
23816 340822 cd03791 GT5_Glycogen_synthase_DULL1-like Glycogen synthase GlgA and similar proteins. This family is most closely related to the GT5 family of glycosyltransferases. Glycogen synthase (EC:2.4.1.21) catalyzes the formation and elongation of the alpha-1,4-glucose backbone using ADP-glucose, the second and key step of glycogen biosynthesis. This family includes starch synthases of plants, such as DULL1 in Zea mays and glycogen synthases of various organisms. 474
23817 340823 cd03792 GT4_trehalose_phosphorylase trehalose phosphorylase and similar proteins. Trehalose phosphorylase (TP) reversibly catalyzes trehalose synthesis and degradation from alpha-glucose-1-phosphate (alpha-Glc-1-P) and glucose. The catalyzing activity includes the phosphorolysis of trehalose, which produces alpha-Glc-1-P and glucose, and the subsequent synthesis of trehalose. This family is most closely related to the GT4 family of glycosyltransferases. 378
23818 340824 cd03793 GT3_GSY2-like glycogen synthase GSY2 and similar proteins. Glycogen synthase, which is most closely related to the GT3 family of glycosyltransferases, catalyzes the transfer of a glucose molecule from UDP-glucose to a terminal branch of a glycogen molecule, a rate-limiting step of glycogen biosynthesis. GSY2, the member of this family in S. cerevisiae, has been shown to possess glycogen synthase activity. 590
23819 340825 cd03794 GT4_WbuB-like Escherichia coli WbuB and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. WbuB in E. coli is involved in the biosynthesis of the O26 O-antigen. It has been proposed to function as an N-acetyl-L-fucosamine (L-FucNAc) transferase. 391
23820 340826 cd03795 GT4_WfcD-like Escherichia coli alpha-1,3-mannosyltransferase WfcD and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP-linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria and eukaryotes. 355
23821 340827 cd03796 GT4_PIG-A-like phosphatidylinositol N-acetylglucosaminyltransferase subunit A and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Phosphatidylinositol glycan-class A (PIG-A), an X-linked gene in humans, is necessary for the synthesis of N-acetylglucosaminyl-phosphatidylinositol, a very early intermediate in glycosyl phosphatidylinositol (GPI)-anchor biosynthesis. The GPI-anchor is an important cellular structure that facilitates the attachment of many proteins to cell surfaces. Somatic mutations in PIG-A have been associated with Paroxysmal Nocturnal Hemoglobinuria (PNH), an acquired hematological disorder. 398
23822 340828 cd03798 GT4_WlbH-like Bordetella parapertussis WlbH and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Staphylococcus aureus CapJ may be involved in capsule polysaccharide biosynthesis. WlbH in Bordetella parapertussis has been shown to be required for the biosynthesis of a trisaccharide that, when attached to the B. pertussis lipopolysaccharide (LPS) core (band B), generates band A LPS. 376
23823 340829 cd03799 GT4_AmsK-like Erwinia amylovora AmsK and similar proteins. This is a family of GT4 glycosyltransferases found specifically in certain bacteria. AmsK in Erwinia amylovora has been reported to be involved in the biosynthesis of amylovoran, an exopolysaccharide acting as a virulence factor. 350
23824 340830 cd03800 GT4_sucrose_synthase sucrose-phosphate synthase and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. The sucrose-phosphate synthases in this family may be unique to plants and photosynthetic bacteria. This enzyme catalyzes the synthesis of sucrose 6-phosphate from fructose 6-phosphate and uridine 5'-diphosphate-glucose, a key regulatory step of sucrose metabolism. The activity of this enzyme is regulated by phosphorylation and moderated by the concentration of various metabolites and light. 398
23825 340831 cd03801 GT4_PimA-like phosphatidyl-myo-inositol mannosyltransferase. This family is most closely related to the GT4 family of glycosyltransferases and named after PimA in Propionibacterium freudenreichii, which is involved in the biosynthesis of phosphatidyl-myo-inositol mannosides (PIM) which are early precursors in the biosynthesis of lipomannans (LM) and lipoarabinomannans (LAM), and catalyzes the addition of a mannosyl residue from GDP-D-mannose (GDP-Man) to the position 2 of the carrier lipid phosphatidyl-myo-inositol (PI) to generate a phosphatidyl-myo-inositol bearing an alpha-1,2-linked mannose residue (PIM1). Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in certain bacteria and archaea. 366
23826 340832 cd03802 GT4_AviGT4-like UDP-Glc:tetrahydrobiopterin alpha-glucosyltransferase and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. aviGT4 in Streptomyces viridochromogenes has been shown to be involved in biosynthesis of oligosaccharide antibiotic avilamycin A. Inactivation of aviGT4 resulted in a mutant that accumulated a novel avilamycin derivative lacking the terminal eurekanate residue. 333
23827 340833 cd03804 GT4_WbaZ-like mannosyltransferase WbaZ and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. WbaZ in Salmonella enterica has been shown to possess mannosyltransferase activity. 356
23828 340834 cd03805 GT4_ALG2-like alpha-1,3/1,6-mannosyltransferase ALG2 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ALG2, a 1,3-mannosyltransferase, in yeast catalyzes the mannosylation of Man(2)GlcNAc(2)-dolichol diphosphate and Man(1)GlcNAc(2)-dolichol diphosphate to form Man(3)GlcNAc(2)-dolichol diphosphate. A deficiency of this enzyme causes an abnormal accumulation of Man1GlcNAc2-PP-dolichol and Man2GlcNAc2-PP-dolichol, which is associated with a type of congenital disorders of glycosylation (CDG), designated CDG-Ii, in humans. 392
23829 340835 cd03806 GT4_ALG11-like alpha-1,2-mannosyltransferase ALG11 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ALG11 in yeast is involved in adding the final 1,2-linked Man to the Man5GlcNAc2-PP-Dol synthesized on the cytosolic face of the ER. The deletion analysis of ALG11 was shown to block the early steps of core biosynthesis that takes place on the cytoplasmic face of the ER and lead to a defect in the assembly of lipid-linked oligosaccharides. 419
23830 340836 cd03807 GT4_WbnK-like Shigella dysenteriae WbnK and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. WbnK in Shigella dysenteriae has been shown to be involved in the type 7 O-antigen biosynthesis. 362
23831 340837 cd03808 GT4_CapM-like capsular polysaccharide biosynthesis glycosyltransferase CapM and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. CapM in Staphylococcus aureus is required for the synthesis of type 1 capsular polysaccharides. 358
23832 340838 cd03809 GT4_MtfB-like glycosyltransferases MtfB, WbpX, and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. MtfB (mannosyltransferase B) in E. coli has been shown to direct the growth of the O9-specific polysaccharide chain. It transfers two mannoses into the position 3 of the previously synthesized polysaccharide. 362
23833 340839 cd03811 GT4_GT28_WabH-like family 4 and family 28 glycosyltransferases similar to Klebsiella WabH. This family is most closely related to the GT1 family of glycosyltransferases. WabH in Klebsiella pneumoniae has been shown to transfer a GlcNAc residue from UDP-GlcNAc onto the acceptor GalUA residue in the cellular outer core. 351
23834 340840 cd03812 GT4_CapH-like capsular polysaccharide biosynthesis glycosyltransferase CapH and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. capH in Staphylococcus aureus has been shown to be required for the biosynthesis of the type 1 capsular polysaccharide (CP1). 357
23835 340841 cd03813 GT4-like glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria, while some of them are also found in Archaea and eukaryotes. 474
23836 340842 cd03814 GT4-like glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases and includes a sequence annotated as alpha-D-mannose-alpha(1-6)phosphatidyl myo-inositol monomannoside transferase from Bacillus halodurans. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria and eukaryotes. 365
23837 340843 cd03816 GT33_ALG1-like chitobiosyldiphosphodolichol beta-mannosyltransferase and similar proteins. This family is most closely related to the GT33 family of glycosyltransferases. The yeast gene ALG1 has been shown to function as a mannosyltransferase that catalyzes the formation of dolichol pyrophosphate (Dol-PP)-GlcNAc2Man from GDP-Man and Dol-PP-Glc-NAc2, and participates in the formation of the lipid-linked precursor oligosaccharide for N-glycosylation. In humans ALG1 has been associated with the congenital disorders of glycosylation (CDG) designated as subtype CDG-Ik. 411
23838 340844 cd03817 GT4_UGDG-like UDP-Glc:1,2-diacylglycerol 3-a-glucosyltransferase and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. UDP-glucose-diacylglycerol glucosyltransferase (EC 2.4.1.337, UGDG; also known as 1,2-diacylglycerol 3-glucosyltransferase) catalyzes the transfer of glucose from UDP-glucose to 1,2-diacylglycerol forming 3-D-glucosyl-1,2-diacylglycerol. 372
23839 340845 cd03818 GT4_ExpC-like Rhizobium meliloti ExpC and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ExpC in Rhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucan (exopolysaccharide II). 396
23840 340846 cd03819 GT4_WavL-like Vibrio cholerae WavL and similar sequences. This family is most closely related to the GT4 family of glycosyltransferases. WavL in Vibrio cholerae has been shown to be involved in the biosynthesis of the lipopolysaccharide core. 345
23841 340847 cd03820 GT4_AmsD-like amylovoran biosynthesis glycosyltransferase AmsD and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. AmsD in Erwinia amylovora has been shown to be involved in the biosynthesis of amylovoran, the acidic exopolysaccharide acting as a virulence factor. This enzyme may be responsible for the formation of galactose alpha-1,6 linkages in amylovoran. 351
23842 340848 cd03821 GT4_Bme6-like Brucella melitensis Bme6 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Bme6 in Brucella melitensis has been shown to be involved in the biosynthesis of a polysaccharide. 377
23843 340849 cd03822 GT4_mannosyltransferase-like mannosyltransferases of glycosyltransferase family 4 and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. ORF704 in E. coli has been shown to be involved in the biosynthesis of O-specific mannose homopolysaccharides. 370
23844 340850 cd03823 GT4_ExpE7-like glycosyltransferase ExpE7 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ExpE7 in Sinorhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucans (exopolysaccharide II). 357
23845 340851 cd03825 GT4_WcaC-like putative colanic acid biosynthesis glycosyl transferase WcaC and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Escherichia coli WcaC has been predicted to function in colanic acid biosynthesis. WcfI in Bacteroides fragilis has been shown to be involved in the capsular polysaccharide biosynthesis. 364
23846 239753 cd03829 Sina Seven in absentia (Sina) protein family, C-terminal substrate binding domain; composed of the Drosophila Sina protein, the mammalian Sina homolog (Siah), the plant protein SINAT5, and similar proteins. Sina, Siah and SINAT5 are RING-containing proteins that function as E3 ubiquitin ligases, acting either as single proteins or as a part of multiprotein complexes. Sina is expressed in many cells in the developing eye but is essential specifically for R7 photoreceptor cell development. Sina cooperates with Phyllopod (Phyl), Ebi and the E2 ubiquitin-conjugating enzyme Ubcd1 to catalyze the ubiquitination and subsequent degradation of Tramtrack (Ttk88); Ttk88 is a transcriptional repressor that blocks photoreceptor differentiation. Similarly, the mammalian homologue Siah1 cooperates with SIP (Siah-interacting protein), Ebi and the adaptor protein Skp1, to target beta-catenin for ubiquitination and degradation via a p53-dependent mechanism. SINAT5 targets NAC1 for ubiquitin-mediated degradation resulting in the downregulation of auxin, a hormone that controls many aspects of plant development. Other targets of Sina family proteins include c-Myb, synaptophysin, group 1 glutamate receptors, promyelocytic leukemia protein, alpha-synuclein, synphilin-1 and alpha-ketoglutarate dehydrogenase, among others. Sina proteins also bind proteins that are not targets for ubiquitination such as Phyl, adenomatous polyposis coli, VAV, BAG-1 and Dab-1. Siah binds to a consensus motif, PXAXVXP, which is present in Siah-binding proteins. Siah is a dimeric protein consisting of an N-terminal RING domain, two zinc finger motifs and a C-terminal substrate-binding domain (SBD); this SBD contains an eight-stranded antiparallel beta-sandwich fold similar to the MATH (meprin and TRAF-C homology) domain. 127
23847 349428 cd03855 M14_ASTE Peptidase M14 Succinylglutamate desuccinylase (ASTE) subfamily. Peptidase M14 Succinylglutamate desuccinylase (ASTE, also known as N-succinyl-L-glutamate amidohydrolase, N2-succinylglutamate desuccinylase, and SGDS; EC 3.5.1.96) belongs to the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily of the M14 family of metallocarboxypeptidases. This group includes succinylglutamate desuccinylase that catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. It hydrolyzes N-succinyl-L-glutamate to succinate and L-glutamate. 239
23848 349429 cd03856 M14_Nna1-like Peptidase M14-like domain of ATP/GTP binding proteins, cytosolic carboxypeptidases and related proteins. Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This subfamily includes the human AGTPBP-1 and AGBL -2, -3, -4, and -5, and the mouse Nna1/CCP-1 and CCP -2 through -6. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a characteristic N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 252
23849 349430 cd03857 M14-like Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 203
23850 349431 cd03858 M14_CP_N-E_like Peptidase M14 carboxypeptidase subfamily N/E-like. Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. 292
23851 349432 cd03859 M14_CPT Peptidase M14 Carboxypeptidase T subfamily. Peptidase M14-like domain of carboxypeptidase (CP) T (CPT), CPT belongs to the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPT has moderate similarity to CPA and CPB, and exhibits dual-substrate specificity by cleaving C-terminal hydrophobic amino acid residues like CPA and C-terminal positively charged residues like CPB. CPA and CPB are M14 family peptidases but do not belong to this CPT group. The substrate specificity difference between CPT and CPA and CPB is ascribed to a few amino acid substitutions at the substrate-binding pocket while the spatial organization of the binding site remains the same as in all Zn-CPs. CPT has increased thermal stability in presence of Ca2+ ions, and two disulfide bridges which give an additional stabilization factor. 292
23852 349433 cd03860 M14_CP_A-B_like Peptidase M14 carboxypeptidase subfamily A/B-like. The Peptidase M14 Carboxypeptidase (CP) A/B subfamily is one of two main M14 CP subfamilies defined by sequence and structural homology, the other being the N/E subfamily. CPs hydrolyze single, C-terminal amino acids from polypeptide chains. They have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. There are nine members in the A/B family: CPA1, CPA2, CPA3, CPA4, CPA5, CPA6, CPB, CPO and CPU. CPA1, CPA2 and CPB are produced by the pancreas. The A forms have slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA3 is found in secretory granules of mast cells and functions in inflammatory processes. CPA4 is detected in hormone-regulated tissues, and is thought to play a role in prostate cancer. CPA5 is present in discrete regions of pituitary and other tissues, and cleaves aliphatic C-terminal residues. CPA6 is highly expressed in embryonic brain and optic muscle, suggesting that it may play a specific role in cell migration and axonal guidance. CPU (also called CPB2) is produced and secreted by the liver as the inactive precursor, PCPU, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). Little is known about CPO but it has been suggested to have specificity for acidic residues. 300
23853 349434 cd03862 M14-like Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 245
23854 349435 cd03863 M14_CPD_II Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain II subgroup. The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans-Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans-Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE); it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop. 296
23855 349436 cd03864 M14_CPN Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase N subgroup. Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates include peptides found in the bloodstream, such as kinins (e.g. bradykinin, kallidin, met-lys-bradykinin), complement anaphylatoxins and creatine kinase MM (CK-MM). By removing just one amino acid, CPN can alter peptide activity and receptor binding. Examples include bradykinin, a nine-residue peptide released from kininogen in response to tissue injury, which is inactivated by CPN; anaphylatoxins, which are regulated by CPN through cleavage and removal of their C-terminal arginines, reducing their biological activities 10-100-fold; and creatine kinase MM, a cytosolic enzyme that catalyzes the reversible transfer of a phosphate group from ATP to creatine, which is regulated by CPN through cleavage of C-terminal lysines. Like the other N/E subfamily members, two surface loops surrounding the active-site groove restrict access to the catalytic center, thus restricting larger protein carboxypeptidase inhibitors from inhibiting CPN. 313
23856 349437 cd03865 M14_CPE Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase E subgroup. Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine peptides that result in obesity and hyperglycemia. Reduced CPE enzyme and receptor activity could underlie abnormal placental phenotypes, given the observation that CPE is down-regulated in enlarged placentas of interspecific hybrid (interspecies hybrid placental dysplasia, IHPD) and cloned mice. 319
23857 349438 cd03866 M14_CPM Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase M subgroup. Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the control of peptide hormones and growth factor activity on the cell surface and in the membrane-localized degradation of extracellular proteins. For example, it hydrolyses the C-terminal arginine of epidermal growth factor (EGF) resulting in des-Arg-EGF which binds to the EGF receptor (EGFR) with an equal or greater affinity than native EGF. CPM is a required processing enzyme that generates specific agonists for the B1 receptor. 289
23858 349439 cd03867 M14_CPZ Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase Z subgroup. Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ), CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling. 315
23859 349440 cd03868 M14_CPD_I Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain I subgroup. The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. This Domain I family contains two contiguous surface cysteines that may become palmitoylated and target the enzyme to membranes, thus regulating intracellular trafficking. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE); it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop. In D. melanogaster, the CPD variant 1B short (DmCPD1Bs) is necessary and sufficient for viability of the fruit fly. 294
23860 349441 cd03869 M14_CPX_like Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase X subgroup. Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX), CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Proteins in this subgroup also contain an N-terminal discoidin domain. The CP domain is important for the function of AEBP1 as a transcriptional repressor. AEBP1 is involved in several biological processes including adipogenesis, macrophage cholesterol homeostasis, and inflammation. In macrophages, AEBP1 promotes the expression of IL-6, TNF-alpha, MCP-1, and iNOS whose expression is tightly regulated by NF-kappaB activity. ACLP, a secreted protein that associates with the extracellular matrix, is essential for abdominal wall development and contributes to dermal wound healing. 322
23861 349442 cd03870 M14_CPA Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase A subgroup. Peptidase M14 Carboxypeptidase (CP) A (CPA) belongs to the A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPA enzymes generally favor hydrophobic residues. A/B subfamily enzymes are normally synthesized as inactive precursors containing preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The procarboxypeptidase A (PCPA) is produced by the exocrine pancreas and stored as a stable zymogen in the pancreatic granules until secretion into the digestive tract occurs. This subfamily includes CPA1, CPA2 and CPA4 forms. Within these A forms, there are slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA4, detected in hormone-regulated tissues, is thought to play a role in prostate cancer. 301
23862 349443 cd03871 M14_CPB Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase B subgroup. Peptidase M14 Carboxypeptidase B (CPB) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Carboxypeptidase B (CPB) enzymes only cleave the basic residues lysine or arginine. A/B subfamily enzymes are normally synthesized as inactive precursors containing a preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The procarboxypeptidase B (PCPB) is produced by the exocrine pancreas and stored as a stable zymogen in the pancreatic granules until secretion into the digestive tract occurs. PCPB has been reported to be a good serum marker for the diagnosis of acute pancreatitis and graft rejection in pancreas transplant recipients. This subfamily also includes thrombin activatable fibrinolysis inhibitor (TAFIa), a carboxypeptidase that stabilizes fibrin clots by removing C-terminal arginines and lysines from partially degraded fibrin. Inhibition of TAFIa stimulates the degradation of fibrin clots and may help in prevention of thrombosis. 300
23863 349444 cd03872 M14_CPA6 Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase A6 subgroup. Carboxypeptidase (CP) A6 (CPA6, also known as CPAH; EC 3.4.17.1), belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPA6 prefers large hydrophobic C-terminal amino acids as well as histidine, while peptides with a penultimate glycine or proline are very poorly cleaved. Several neuropeptides are processed by CPA6, including Met- and Leu-enkephalin, angiotensin I, and neurotensin. CPA6 converts enkephalin and neurotensin into forms known to be inactive toward their receptors, but converts inactive angiotensin I into the biologically active angiotensin II. Thus, CPA6 plays a possible role in the regulation of neuropeptides in the extracellular environment within the olfactory bulb where it is highly expressed. It is also broadly expressed in embryonic tissue, being found in neuronal tissues, bone, skin as well as the lateral rectus eye muscle. A disruption in the CPA6 gene is linked to Duane syndrome, a defect in the abducens nerve/lateral rectus muscle connection. 300
23864 349870 cd03873 Zinc_peptidase_like Zinc peptidases M18, M20, M28, and M42. Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This hierarchy contains zinc peptidases that correspond to the MH clan in the MEROPS database, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a non-specific eukaryotic dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metallo-aminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacyl-peptidase activity (i.e. 
hydrolysis of acylated N-terminal residues). 200
23865 349871 cd03874 M28_PMSA_TfR_like M28 Zn-peptidase Transferrin Receptor-like family. Peptidase M28 family; Transferrin Receptor (TfR) and prostate-specific membrane antigen (PSMA, also called glutamate carboxypeptidase or GCP-II) subfamily. TfR and PSMA are homodimeric type II transmembrane proteins containing three distinct domains: protease-like, apical or protease-associated (PA) and helical domains. The protease-like domain is a large extracellular portion (ectodomain). In TfR, it contains a binding site for the transferrin molecule and has 28% identity to membrane glutamate carboxypeptidase II (mGCP-II or PSMA). The PA domain is inserted between the first and second strands of the central beta sheet in the protease-like domain. TfR1 is widely expressed, and is a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and the complex is then internalized. TfR1 may also participate in cell growth and proliferation. TfR2 binds Tf but with a significantly lower affinity than TfR1. It is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis; in humans, mutations in TfR2 are associated with a form of hemochromatosis (HFE3). PSMA is over-expressed predominantly in prostate cancer (PCa) as well as in the neovasculature of most solid tumors, but not in the vasculature of normal tissues. PSMA is considered a biomarker for PCa and possibly for use as an imaging and therapeutic target. The extracellular domain of PSMA possesses two unique enzymatic functions: N-acetylated, alpha-linked acidic dipeptidase (NAALADase) which cleaves terminal glutamate from the neurodipeptide N-acetyl-aspartyl-glutamate (NAAG), and folate hydrolase (FOLH) which cleaves the terminal glutamates from gamma-linked polyglutamates (carboxypeptidase). 
A mutation in this gene may be associated with impaired intestinal absorption of dietary folates, resulting in low blood folate levels and consequent hyperhomocysteinemia. Expression of this protein in the brain may be involved in a number of pathological conditions associated with glutamate excitotoxicity. This gene likely arose from a duplication event of a nearby chromosomal region. Alternative splicing gives rise to multiple transcript variants. While related in sequence to peptidase M28 GCP-II, TfR lacks the metal ion coordination centers and protease activity. 278
23866 349872 cd03875 M28_Fxna_like M28 Zn-peptidase Endoplasmic reticulum metallopeptidase 1. Peptidase family M28; Endoplasmic reticulum metallopeptidase 1 (ERMP1; Felix-ina, FXNA or Fxna peptidase; KIAA1815) subfamily. ERMP1 is a multi-pass membrane protein located in the endoplasmic reticulum membrane. In humans, Fxna may play a crucial role in processing proteins required for the organization of somatic cells and oocytes into discrete follicular structures, although which proteins are hydrolyzed has not yet been determined. Another member of this subfamily is the 24-kDa vacuolar protein (VP24) which is probably involved in the formation of intravacuolar pigmented globules (cyanoplasts) in highly anthocyanin-containing vacuoles; however, the biological function of the C-terminal region which includes the putative transmembrane metallopeptidase domain is unknown. 307
23867 349873 cd03876 M28_SGAP_like M28 Zn-peptidase Streptomyces griseus aminopeptidase and similar proteins. Peptidase family M28; Streptomyces griseus Aminopeptidase (SGAP, Leucine aminopeptidase (LAP), aminopeptidase S, Mername-AA022 peptidase) subfamily. SGAP is a di-zinc exopeptidase with high preference towards large hydrophobic amino-terminal residues, with Leu being the most efficiently cleaved. It can accommodate all except Pro and Glu residues in the P1' position. It is a monomeric (30 kDa), calcium-activated and calcium-stabilized enzyme; its activation by calcium correlates with substrate specificity and it has thermal stability only in the presence of calcium. Although SGAP contains a calcium binding site, it is not conserved in many members of this subfamily. SGAP is present in the extracellular fluid of S. griseus cultures. 289
23868 349874 cd03877 M28_like M28 Zn-peptidase, many containing a protease-associated (PA) domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins, many of which contain a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. Some proteins in this subfamily are also associated with the PDZ domain, a widespread protein module that has been recruited to serve multiple functions during the course of evolution. 206
23869 349875 cd03879 M28_AAP M28 Zn-peptidase Aeromonas (Vibrio) proteolytica aminopeptidase. Peptidase family M28; Aeromonas (Vibrio) proteolytica aminopeptidase (AAP; leucine aminopeptidase from Vibrio proteolyticus; Bacterial leucyl aminopeptidase; E.C. 3.4.11.10) subfamily. AAP is a small (32kDa), heat stable leucine aminopeptidase and is active as a monomer. Similar forms of the enzyme have been isolated from Escherichia coli and Staphylococcus thermophilus. Leucine aminopeptidases, in general, play important roles in many biological processes such as protein catabolism, hormone degradation, regulation of migration and cell proliferation, as well as HIV infection and proliferation. AAP is a broad-specificity enzyme, utilizing two zinc(II) ions in its active site to remove N-terminal amino acids, with preference for large hydrophobic amino acids in the P1 position of the substrate, Leu being the most efficiently cleaved. It can accommodate all residues, except Pro, Asp and Glu in the P1' position. 286
23870 349876 cd03880 M28_QC_like M28 Zn-peptidase glutaminyl cyclase. Peptidase M28 family, glutaminyl cyclase (QC; EC 2.3.2.5) subfamily. QC is involved in N-terminal glutamine cyclization of many endocrine peptides and is typically abundant in brain tissue. N-terminal glutamine residue cyclization is an important post-translational event in the processing of numerous bioactive proteins, including neuropeptides, hormones, and cytokines during their maturation in the secretory pathway. The N-terminal pGlu protects them from exopeptidase degradation and/or enables them to have proper conformation for binding to their receptors. QCs are highly conserved from yeast to human. In humans, several genetic diseases, such as osteoporosis, appear to result from mutations of the QC gene. N-terminal glutamate cyclization into pyroglutamate (pGlu) is a reaction that may be related to the formation of several plaque-forming peptides, such as amyloid-beta (Abeta) peptides and collagen-like Alzheimer amyloid plaque component, which play a pivotal role in Alzheimer's disease. 305
23871 349877 cd03881 M28_Nicastrin M28 Zn-peptidase nicastrin, a main component of gamma-secretase complex. Peptidase M28 family, nicastrin subfamily. Nicastrin is a main component of the gamma-secretase complex, which also contains presenilin, Pen-2 and Aph-1. Its extracellular domain sequence resembles aminopeptidases, but certain catalytic residues are not conserved. It is mainly localized to the endoplasmic reticulum and Golgi. It is highly glycosylated (Mr 120 kDa) and is essential for substrate recognition of the N-terminus of gamma-secretase substrates derived from APP and Notch. Nicastrin facilitates substrate cleavage by the catalytic presenilin subunit in the gamma-secretase complex. One conserved glutamate is especially important, probably because this residue forms an ion pair with the amino terminus of the substrate. This substrate-binding domain is often called the DAP domain (named after DYIGS, the amino acid stretch that modulates amyloid precursor protein (APP) processing, and Peptidase homologous region). The sequence of the substrate N-terminus is apparently not critical for the interaction, but a free amino group is. Thus, nicastrin can be considered a kind of gatekeeper for the gamma-secretase complex: type I membrane proteins that have not shed their ectodomains cannot interact properly with nicastrin and do not gain access to the active site. Dysfunction of gamma-secretase is thought to cause Alzheimer's disease, with most mutations derived from Alzheimer's disease mapping to the catalytic subunit presenilin 1 (PS1). 519
23872 349878 cd03882 M28_nicalin_like M28 Zn-Peptidase Nicalin, Nicastrin-like protein. Peptidase M28 family, Nicalin (nicastrin-like protein) subfamily. Nicalin is distantly related to Nicastrin, a component of the Alzheimer's disease-associated gamma-secretase, and forms a complex with Nomo (nodal modulator) pM5. Similar to Nicastrin, Nicalin lacks the amino-acid conservation required for catalytically active aminopeptidases. Functional studies in zebrafish embryos and cultured human cells reveal that nicalin and Nomo collaborate to antagonize the Nodal/TGFbeta signaling pathway. Thus, nicastrin and nicalin are both associated with protein complexes involved in cell fate decisions during early embryonic development. 296
23873 349879 cd03883 M28_Pgcp_like M28 Zn-Peptidase Plasma glutamate carboxypeptidase. Peptidase M28 family; Plasma glutamate carboxypeptidase (PGCP; blood plasma glutamate carboxypeptidase; EC 3.4.17.21) subfamily. PGCP is a 56kDa glutamate carboxypeptidase that is mainly produced in mammalian placenta and kidney, the majority of which is thought to be secreted into the bloodstream. Similar proteins are also found in other species, including bacteria. These proteins contain protease-associated (PA) domain inserts between the first and second strands of the central beta sheet in the protease-like domain. The PA domains may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. The exact physiological substrates of PGCP are unknown, although this enzyme may play an important role in the hydrolysis of circulating peptides. Its closest homolog encodes an important brain glutamate carboxypeptidase II (NAALADase) identical to the prostate-specific membrane antigen (PSMA), which serves as a marker for prostatic cancer metastasis. Hypermethylation of PGCP gene has been associated with human bronchial epithelial (HBE) cell immortalization and lung cancer. PGCP also provides an attractive target for serological analysis in hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) patients. 425
23874 349880 cd03884 M20_bAS M20 Peptidase beta-alanine synthase, an amidohydrolase. Peptidase M20 family, beta-alanine synthase (bAS; N-carbamoyl-beta-alanine amidohydrolase and beta-ureidopropionase; EC 3.5.1.6) subfamily. bAS is an amidohydrolase and is the final enzyme in the pyrimidine catabolic pathway, which is involved in the regulation of the cellular pyrimidine pool. bAS catalyzes the irreversible hydrolysis of the N-carbamylated beta-amino acids to beta-alanine or aminoisobutyrate with the release of carbon dioxide and ammonia. Also included in this subfamily is allantoate amidohydrolase (allantoate deiminase), which catalyzes the conversion of allantoate to (S)-ureidoglycolate, one of the crucial alternate steps in purine metabolism. It is possible that these two enzymes arose from the same ancestral peptidase that evolved into two structurally related enzymes with distinct catalytic properties and biochemical roles within the cell. Downstream enzyme (S)-ureidoglycolate amidohydrolase (UAH) is homologous in structure and sequence with AAH and catalyzes the conversion of (S)-ureidoglycolate into glyoxylate, releasing two molecules of ammonia as by-products. Yeast requires beta-alanine as a precursor of pantothenate and coenzyme A biosynthesis, but generates it mostly via degradation of spermine. Disorders in pyrimidine degradation and beta-alanine metabolism caused by beta-ureidopropionase deficiency (UPB1 gene) in humans are normally associated with neurological disorders. 398
23875 349881 cd03885 M20_CPDG2 M20 Peptidase Glutamate carboxypeptidase, a periplasmic enzyme. Peptidase M20 family, Glutamate carboxypeptidase (carboxypeptidase G; carboxypeptidase G1; carboxypeptidase G2; CPDG2; CPG2; Folate hydrolase G2; Pteroylmonoglutamic acid hydrolase G2; Glucarpidase; E.C. 3.4.17.11) subfamily. CPDG2 is a periplasmic enzyme that is synthesized with a signal peptide. It is a dimeric zinc-dependent exopeptidase, with two domains, a catalytic domain, which provides the ligands for the two zinc ions in the active site, and a dimerization domain. CPDG2 cleaves the C-terminal glutamate moiety from a wide range of N-acyl groups, including peptidyl, aminoacyl, benzoyl, benzyloxycarbonyl, folyl, and pteroyl groups to release benzoic acid, phenol, and aniline mustards. It is used clinically to treat methotrexate toxicity by hydrolyzing it to inactive and non-toxic metabolites. It is also proposed for use in antibody-directed enzyme prodrug therapy; for example, glutamate can be cleaved from glutamated benzoyl nitrogen mustards, producing nitrogen mustards with effective cytotoxicity against tumor cells. 362
23876 349882 cd03886 M20_Acy1 M20 Peptidase Aminoacylase 1 family. Peptidase M20 family, Aminoacylase 1 (ACY1; hippuricase; acylase I; amido acid deacylase; IAA-amino acid hydrolase; dehydropeptidase II; N-acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) subfamily. ACY1 is the most abundant of the aminoacylases, a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. It is encoded by the aminoacylase 1 gene (Acy1) on chromosome 3p21 that comprises 15 exons. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney, suggest a role of the enzyme in amino acid metabolism of these organs. Defects in ACY1 are the cause of aminoacylase-1 deficiency (ACY1D), resulting in a metabolic disorder manifesting encephalopathy and psychomotor delay. 371
23877 349883 cd03887 M20_Acy1L2 M20 Peptidase Aminoacylase 1-like protein 2, amidohydrolase family. Peptidase M20 family, Aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase) subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 360
23878 349884 cd03888 M20_PepV M20 Peptidase Xaa-His dipeptidase (PepV) degrades hydrophobic dipeptides. Peptidase M20 family, Peptidase V (Xaa-His dipeptidase; PepV g.p. (Lactobacillus lactis); X-His dipeptidase; beta-Ala-His dipeptidase; carnosinase) subfamily. The PepV group of proteins is widely distributed in lactic acid bacteria. PepV, along with PepT, functions at the end of the proteolytic processing system. PepV is a monomeric metalloenzyme that preferentially degrades hydrophobic dipeptides. The Streptococcus gordonii PepV gene is homologous to the PepV gene family from Lactobacillus and Lactococcus spp. PepV recognizes and fixes the dipeptide backbone, while the side chains are not specifically probed and can vary, rendering it a nonspecific dipeptidase. It has been shown that Lactococcus lactis subspecies lactis (L9) PepV does not hydrolyze dipeptides containing Pro or D-amino acids at the C-terminus, while PepV from Lactobacillus has been shown to have L-carnosine hydrolyzing activity. The mammalian PepV also acts on anserine and homocarnosine (but not on homoanserine), and to a lesser extent on some other aminoacyl-L-histidine dipeptides. Also included is the Staphylococcus aureus metallopeptidase, Sapep, a Mn(2+)-dependent dipeptidase where large interdomain movements could potentially regulate the activity of this enzyme. 449
23879 349885 cd03890 M20_pepD M20 Peptidase D has specificity for beta-alanyl-L-histidine dipeptide. Peptidase M20 family, Peptidase D (PepD, Xaa-His dipeptidase; X-His dipeptidase; aminoacylhistidine dipeptidase; dipeptidase D; Beta-alanyl-histidine dipeptidase; pepD g.p. (Escherichia coli); EC 3.4.13.3) subfamily. PepD is a cytoplasmic enzyme family characterized by its unusual specificity for the dipeptides beta-alanyl-L-histidine (L-carnosine or beta-Ala-His) and gamma-aminobutyryl histidine (L-homocarnosine or gamma-amino-butyl-His). Homocarnosine has been suggested as a precursor for the neurotransmitter gamma-aminobutyric acid (GABA), acting as a GABA reservoir, and may mediate anti-seizure effects of GABAergic therapies. It has also been reported that glucose metabolism could be influenced by L-carnosine. PepD also includes a lid domain that forms a homodimer; however, the physiological function of this extra domain remains unclear. 474
23880 349886 cd03891 M20_DapE_proteobac M20 Peptidase proteobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase. Peptidase M20 family, proteobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase (DapE; aspartyl dipeptidase; succinyl-diaminopimelate desuccinylase) subfamily. DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. It has been shown that DapE is essential for cell growth and proliferation. DapEs have been purified from Escherichia coli and Haemophilus influenzae, while the genes that encode for DapEs have been sequenced from several bacterial sources such as Corynebacterium glutamicum, Helicobacter pylori, Neisseria meningitidis and Mycobacterium tuberculosis. DapE is a small, dimeric enzyme that requires two zinc atoms per molecule for full enzymatic activity. All of the amino acids that function as metal binding ligands are strictly conserved in DapE. 366
23881 349887 cd03892 M20_peptT M20 Peptidase T specifically cleaves tripeptides. Peptidase M20 family, Peptidase T (peptT; tripeptide aminopeptidase; tripeptidase) subfamily. PepT acts only on tripeptide substrates, and is thus called a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein. 400
23882 349888 cd03893 M20_Dipept_like M20 Dipeptidases. Peptidase M20 family, dipeptidase-like subfamily. This group contains a large variety of enzymes, including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), carnosinase, DUG2 type proteins, as well as many proteins inferred by homology to be dipeptidases. These enzymes have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. Substrates of CNDP are varied and not limited to Xaa-His dipeptides. DUG2 proteins contain a metallopeptidase domain and a large N-terminal WD40 repeat region, and are involved in the alternative pathway of glutathione degradation. 426
23883 349889 cd03894 M20_ArgE M20 Peptidase acetylornithine deacetylase. Peptidase M20 family, acetylornithine deacetylase (ArgE, Acetylornithinase, AO, N2-acetyl-L-ornithine amidohydrolase, EC 3.5.1.16) subfamily. ArgE catalyzes the conversion of N-acetylornithine to ornithine, which can then be incorporated into the urea cycle for the final stage of arginine synthesis. The substrate specificity of ArgE is quite broad; several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved. 367
23884 349890 cd03895 M20_ArgE_DapE-like M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial, and have been inferred by homology as being related to both ArgE and DapE. 400
23885 349891 cd03896 M20_PAAh_like M20 Peptidases, Poly(aspartic acid) hydrolase-like proteins. Peptidase M20 family, Poly(aspartic acid) hydrolase (PAA hydrolase)-like subfamily. PAA hydrolase enzymes are involved in alpha,beta-poly(D,L-aspartic acid) (tPAA) biodegradation. PAA is being extensively studied as a replacement for commercial polycarboxylate components since it can be degraded by enzymes from isolated tPAA degrading bacteria. Thus far, two types of PAA degrading bacteria (Sphingomonas sp. KT-1 and Pedobacter sp. KP-2) have been investigated in detail; the former can completely degrade tPAA of low-molecular weights below 5000, while the latter can degrade high molecular weight tPAA to release oligo(aspartic acid) (OAA) as a product, suggesting two kinds of PAA degrading enzymes. It has been shown that PAA hydrolase-1 from Sphingomonas sp. KT-1 hydrolyzes beta,beta-aspartic acid units in tPAA to produce OAA, and it is suggested that PAA hydrolase-2 hydrolyzes OAA to aspartic acid. Also included in this family is Bradyrhizobium 5-nitroanthranilic acid (5NAA)-aminohydrolase (5NAA-A), a biodegradation enzyme that converts 5NAA to 5-nitrosalicylic acid; 5NAA is a metabolite secreted by Streptomyces scabies, the bacterium responsible for potato scab, and metabolized by Bradyrhizobium species strain JS329. 357
23886 175976 cd04009 C2B_Munc13-like C2 domain second repeat in Munc13 (mammalian uncoordinated)-like proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner. Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 is the mammalian homolog, which is expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2 related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 
This cd contains the third C2 repeat, C2C, and has a type-II topology. 133
23887 175977 cd04010 C2B_RasA3 C2 domain second repeat present in RAS p21 protein activator 3 (RasA3). RasA3 are members of GTPase activating protein 1 (GAP1), a Ras-specific GAP, which suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation. RasA3 contains an N-terminal C2 domain, a Ras-GAP domain, a plextrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 148
23888 175978 cd04011 C2B_Ferlin C2 domain second repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 111
23889 175979 cd04012 C2A_PI3K_class_II C2 domain first repeat present in class II phosphatidylinositol 3-kinases (PI3Ks). There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a N-terminal C2 domain, a PIK domain, and a kinase catalytic domain. Unlike class I and class III, class II PI3Ks have additionally a PX domain and a C-terminal C2 domain containing a nuclear localization signal both of which bind phospholipids though in a slightly different fashion. Class II PIK3s act downstream of receptors for growth factors, integrins, and chemokines. PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 171
23890 175980 cd04013 C2_SynGAP_like C2 domain present in Ras GTPase activating protein (GAP) family. SynGAP, GAP1, RasGAP, and neurofibromin are all members of the Ras-specific GAP (GTPase-activating protein) family. SynGAP regulates the MAP kinase signaling pathway and is critical for cognition and synapse function. Mutations in this gene causes mental retardation in humans. SynGAP contains a PH-like domain, a C2 domain, and a Ras-GAP domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 146
23891 175981 cd04014 C2_PKC_epsilon C2 domain in Protein Kinase C (PKC) epsilon. A single C2 domain is found in PKC epsilon. The PKC family of serine/threonine kinases regulates apoptosis, proliferation, migration, motility, chemo-resistance, and differentiation. There are 3 groups: group 1 (alpha, betaI, beta II, gamma) which require phospholipids and calcium, group 2 (delta, epsilon, theta, eta) which do not require calcium for activation, and group 3 (xi, iota/lambda) which are atypical and can be activated in the absence of diacylglycerol and calcium. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-II topology. 132
23892 175982 cd04015 C2_plant_PLD C2 domain present in plant phospholipase D (PLD). PLD hydrolyzes terminal phosphodiester bonds in diester glycerophospholipids resulting in the degradation of phospholipids. In vitro PLD transfers phosphatidic acid to primary alcohols. In plants PLD plays a role in germination, seedling growth, phosphatidylinositol metabolism, and changes in phospholipid composition. There is a single Ca(2+)/phospholipid-binding C2 domain in PLD. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 158
23893 175983 cd04016 C2_Tollip C2 domain present in Toll-interacting protein (Tollip). Tollip is a part of the Interleukin-1 receptor (IL-1R) signaling pathway. Tollip is proposed to link serine/threonine kinase IRAK to IL-1Rs as well as inhibiting phosphorylation of IRAK. There is a single C2 domain present in Tollip. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 121
23894 175984 cd04017 C2D_Ferlin C2 domain fourth repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology. 135
23895 175985 cd04018 C2C_Ferlin C2 domain third repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 151
23896 175986 cd04019 C2C_MCTP_PRT_plant C2 domain third repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane. Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 150
23897 175987 cd04020 C2B_SLP_1-2-3-4 C2 domain second repeat present in Synaptotagmin-like proteins 1-4. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane. Additionally, their C2A domains are both Ca2+ independent, unlike the case in Slp3 and Slp4/granuphilin in which their C2A domains are Ca2+ dependent. It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp3 and Slp4/granuphilin promote dense-core vesicle exocytosis. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 162
23898 175988 cd04021 C2_E3_ubiquitin_ligase C2 domain present in E3 ubiquitin ligase. E3 ubiquitin ligase is part of the ubiquitylation mechanism responsible for controlling surface expression of membrane proteins. The sequential action of several enzymes are involved: ubiquitin-activating enzyme E1, ubiquitin-conjugating enzyme E2, and ubiquitin-protein ligase E3 which is responsible for substrate recognition and promoting the transfer of ubiquitin to the target protein. E3 ubiquitin ligase is composed of an N-terminal C2 domain, 4 WW domains, and a HECTc domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 125
23899 175989 cd04022 C2A_MCTP_PRT_plant C2 domain first repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane. Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 127
23900 175990 cd04024 C2A_Synaptotagmin-like C2 domain first repeat present in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 128
23901 175991 cd04025 C2B_RasA1_RasA4 C2 domain second repeat present in RasA1 and RasA4. RasA1 and RasA4 are GAP1s (GTPase activating protein 1s ), Ras-specific GAP members, which suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation. Both proteins contain two C2 domains, a Ras-GAP domain, a plextrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 123
23902 175992 cd04026 C2_PKC_alpha_gamma C2 domain in Protein Kinase C (PKC) alpha and gamma. A single C2 domain is found in PKC alpha and gamma. The PKC family of serine/threonine kinases regulates apoptosis, proliferation, migration, motility, chemo-resistance, and differentiation. There are 3 groups: group 1(alpha, betaI, beta II, gamma) which require phospholipids and calcium, group 2 (delta, epsilon, theta, eta) which do not require calcium for activation, and group 3 (xi, iota/lambda) which are atypical and can be activated in the absence of diacylglycerol and calcium. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology. 131
23903 175993 cd04027 C2B_Munc13 C2 domain second repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner. Mutations in Unc13 results in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 is the mammalian homolog which are expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3) and are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2 related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 127
23904 175994 cd04028 C2B_RIM1alpha C2 domain second repeat contained in Rab3-interacting molecule (RIM) proteins. RIMs are believed to organize specialized sites of the plasma membrane called active zones. They also play a role in controlling neurotransmitter release, plasticity processes, as well as memory and learning. RIM contains an N-terminal zinc finger domain, a PDZ domain, and two C-terminal C2 domains (C2A, C2B). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology and do not bind Ca2+. 146
23905 175995 cd04029 C2A_SLP-4_5 C2 domain first repeat present in Synaptotagmin-like proteins 4 and 5. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. SHD of Slp (except for the Slp4-SHD) function as a specific Rab27A/B-binding domain. In addition to Slp, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp4/granuphilin promotes dense-core vesicle exocytosis. The C2A domain of Slp4 is Ca2+ dependent. Slp5 mRNA has been shown to be restricted to human placenta and liver suggesting a role in Rab27A-dependent membrane trafficking in specific tissues. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 125
23906 175996 cd04030 C2C_KIAA1228 C2 domain third repeat present in uncharacterized human KIAA1228-like proteins. KIAA proteins are uncharacterized human proteins. They were compiled by the Kazusa mammalian cDNA project which identified more than 2000 human genes. They are identified by 4 digit codes that precede the KIAA designation. Many KIAA genes are still functionally uncharacterized including KIAA1228. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 127
23907 175997 cd04031 C2A_RIM1alpha C2 domain first repeat contained in Rab3-interacting molecule (RIM) proteins. RIMs are believed to organize specialized sites of the plasma membrane called active zones. They also play a role in controlling neurotransmitter release, plasticity processes, as well as memory and learning. RIM contains an N-terminal zinc finger domain, a PDZ domain, and two C-terminal C2 domains (C2A, C2B). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology and do not bind Ca2+. 125
23908 175998 cd04032 C2_Perforin C2 domain of Perforin. Perforin contains a single copy of a C2 domain in its C-terminus and plays a role in lymphocyte-mediated cytotoxicity. Mutations in perforin lead to familial hemophagocytic lymphohistiocytosis type 2. The function of perforin is calcium dependent and the C2 domain is thought to confer this binding to target cell membranes. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 127
23909 175999 cd04033 C2_NEDD4_NEDD4L C2 domain present in the Human neural precursor cell-expressed, developmentally down-regulated 4 (NEDD4) and NEDD4-like (NEDD4L/NEDD42). Nedd4 and Nedd4-2 are two of the nine members of the Human Nedd4 family. All vertebrates appear to have both Nedd4 and Nedd4-2 genes. They are thought to participate in the regulation of epithelial Na+ channel (ENaC) activity. They also have identical specificity for ubiquitin conjugating enzymes (E2). Nedd4 and Nedd4-2 are composed of a C2 domain, 2-4 WW domains, and a ubiquitin ligase Hect domain. Their WW domains can bind PPxY (PY) or LPSY motifs, and in vitro studies suggest that WW3 and WW4 of both proteins bind PY motifs in the key substrates, with WW3 generally exhibiting higher affinity. Most Nedd4 family members, especially Nedd4-2, also have multiple splice variants, which might play different roles in regulating their substrates. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 133
23910 176000 cd04035 C2A_Rabphilin_Doc2 C2 domain first repeat present in Rabphilin and Double C2 domain. Rabphilin is found in neurons and in neuroendocrine cells, while Doc2 is found not only in the brain but also in other tissues, including mast cells, chromaffin cells, and osteoblasts. Rabphilin and Doc2s share highly homologous tandem C2 domains, although their N-terminal structures are completely different: rabphilin contains an N-terminal Rab-binding domain (RBD), whereas Doc2 contains an N-terminal Munc13-1-interacting domain (MID). C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 123
23911 176001 cd04036 C2_cPLA2 C2 domain present in cytosolic PhosphoLipase A2 (cPLA2). A single copy of the C2 domain is present in cPLA2 which releases arachidonic acid from membranes initiating the biosynthesis of potent inflammatory mediators such as prostaglandins, leukotrienes, and platelet-activating factor. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members of this cd have a type-II topology. 119
23912 176002 cd04037 C2E_Ferlin C2 domain fifth repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fifth C2 repeat, C2E, and has a type-II topology. 124
23913 176003 cd04038 C2_ArfGAP C2 domain present in Arf GTPase Activating Proteins (GAP). ArfGAP is a GTPase activating protein which regulates the ADP ribosylation factor Arf, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of Arf is involved in Golgi morphology and is involved in recruiting coat proteins. ArfGAP is responsible for the GDP-bound form of Arf which is necessary for uncoating the membrane and allowing the Golgi to fuse with an acceptor compartment. These proteins contain an N-terminal ArfGAP domain containing the characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-x2-Cys) and C-terminal C2 domain. C2 domains were first identified in Protein Kinase C (PKC). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 145
23914 176004 cd04039 C2_PSD C2 domain present in Phosphatidylserine decarboxylase (PSD). PSD is involved in the biosynthesis of aminophospholipid by converting phosphatidylserine (PtdSer) to phosphatidylethanolamine (PtdEtn). There is a single C2 domain present and it is thought to confer PtdSer binding motif that is common to PKC and synaptotagmin. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 108
23915 176005 cd04040 C2D_Tricalbin-like C2 domain fourth repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology. 115
23916 176006 cd04041 C2A_fungal C2 domain first repeat; fungal group. C2 domains were first identified in Protein Kinase C (PKC). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 111
23917 176007 cd04042 C2A_MCTP_PRT C2 domain first repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane. MCTP is composed of a variable N-terminal sequence, three C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 121
23918 176008 cd04043 C2_Munc13_fungal C2 domain in Munc13 (mammalian uncoordinated) proteins; fungal group. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner. Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 is the mammalian homolog, which is expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2 related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 126
23919 176009 cd04044 C2A_Tricalbin-like C2 domain first repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 124
23920 176010 cd04045 C2C_Tricalbin-like C2 domain third repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 120
23921 176011 cd04046 C2_Calpain C2 domain present in Calpain proteins. A single C2 domain is found in calpains (EC 3.4.22.52, EC 3.4.22.53), calcium-dependent, non-lysosomal cysteine proteases. Calpains are classified as belonging to Clan CA by MEROPS and include six families: C1, C2, C10, C12, C28, and C47. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 126
23922 176012 cd04047 C2B_Copine C2 domain second repeat in Copine. There are 2 copies of the C2 domain present in copine, a protein involved in membrane trafficking, protein-protein interactions, and perhaps even cell division and growth. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 110
23923 176013 cd04048 C2A_Copine C2 domain first repeat in Copine. There are 2 copies of the C2 domain present in copine, a protein involved in membrane trafficking, protein-protein interactions, and perhaps even cell division and growth. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 120
23924 176014 cd04049 C2_putative_Elicitor-responsive_gene C2 domain present in the putative elicitor-responsive gene. In plants, elicitor-responsive proteins are triggered in response to specific elicitor molecules such as glycoproteins, peptides, carbohydrates and lipids. A host of defensive responses are also triggered resulting in localized cell death. Antimicrobial secondary metabolites, such as phytoalexins, or defense-related proteins, including pathogenesis-related (PR) proteins are also produced. There is a single C2 domain present here. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-II topology. 124
23925 176015 cd04050 C2B_Synaptotagmin-like C2 domain second repeat present in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 105
23926 176016 cd04051 C2_SRC2_like C2 domain present in Soybean genes Regulated by Cold 2 (SRC2)-like proteins. SRC2 production is a response to pathogen infiltration. The initial response of increased Ca2+ concentrations is coupled to downstream signal transduction pathways via calcium binding proteins. SRC2 contains a single C2 domain which localizes to the plasma membrane and is involved in Ca2+ dependent protein binding. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 125
23927 176017 cd04052 C2B_Tricalbin-like C2 domain second repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 111
23928 176018 cd04054 C2A_Rasal1_RasA4 C2 domain first repeat present in Rasal1 and RasA4. Rasal1 and RasA4 are both members of GAP1 (GTPase activating protein 1). Rasal1 responds to repetitive Ca2+ signals by associating with the plasma membrane and deactivating Ras. RasA4 suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation. Both of these proteins contain two C2 domains, a Ras-GAP domain, a pleckstrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 121
23929 173788 cd04056 Peptidases_S53 Peptidase domain in the S53 family. Members of the peptidases S53 (sedolisin) family include endopeptidases and exopeptidases sedolisin, kumamolysin, and (PSCP) Pepstatin-insensitive Carboxyl Proteinase. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of Asn in subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. Characterized sedolisins include Kumamolisin, an extracellular calcium-dependent thermostable endopeptidase from Bacillus. The enzyme is synthesized with a 188 amino acid N-terminal preprotein region which is cleaved after the extraction into the extracellular space with low pH. One kumamolysin paralog, kumamolisin-As, is believed to be a collagenase. TPP1 is a serine protease that functions as a tripeptidyl exopeptidase as well as an endopeptidase. Less is known about PSCP from Pseudomonas which is thought to be an aspartic proteinase. 361
23930 173789 cd04059 Peptidases_S8_Protein_convertases_Kexins_Furin-like Peptidase S8 family domain in Protein convertases. Protein convertases, whose members include furins and kexins, are members of the peptidase S8 or Subtilase clan of proteases. They have an Asp/His/Ser catalytic triad that is not homologous to trypsin. Kexins are involved in the activation of peptide hormones, growth factors, and viral proteins. Furin cleaves cell surface vasoactive peptides and proteins involved in cardiovascular tissue remodeling in the TGN, at cell surface, or in endosomes but rarely in the ER. Furin also plays a key role in blood pressure regulation through the activation of transforming growth factor (TGF)-beta. High specificity is seen for cleavage after dibasic (Lys-Arg or Arg-Arg) or multiple basic residues in protein convertases. There is also strong sequence conservation. 297
23931 173790 cd04077 Peptidases_S8_PCSK9_ProteinaseK_like Peptidase S8 family domain in ProteinaseK-like proteins. The peptidase S8 or Subtilase clan of proteases have an Asp/His/Ser catalytic triad that is not homologous to trypsin. This CD contains several members of this clan including: PCSK9 (Proprotein convertase subtilisin/kexin type 9), Proteinase_K, Proteinase_T, and other subtilisin-like serine proteases. PCSK9 posttranslationally regulates hepatic low-density lipoprotein receptors (LDLRs) by binding to LDLRs on the cell surface, leading to their degradation. The binding site of PCSK9 has been localized to the epidermal growth factor-like repeat A (EGF-A) domain of the LDLR. Characterized Proteinases K are secreted endopeptidases with a high degree of sequence conservation. Proteinases K are not substrate-specific and function in a wide variety of species in different pathways. It can hydrolyze keratin and other proteins with subtilisin-like specificity. The number of calcium-binding motifs found in these differs. Proteinase T is a novel proteinase from the fungus Tritirachium album Limber. The amino acid sequence of proteinase T as deduced from the nucleotide sequence is about 56% identical to that of proteinase K. 255
23932 271144 cd04078 CBM36_xylanase-like Carbohydrate Binding Module family 36 (CBM36); appended mainly to glycoside hydrolase family 11 (GH11) domains; xylan binding. This family includes carbohydrate binding module family 36 (CBM36) most of which appear appended to glycoside hydrolase family 11 (GH11) domains. These CBMs are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH11 catalytic modules with their dedicated, insoluble substrates. GH11 domains have xylanase (endo-1,4-beta-xylanase) activity which catalyzes the hydrolysis of beta-1,4 bonds of xylan, the major component of hemicelluloses, to generate xylooligosaccharides and xylose. This family includes XynB from Dictyoglomus thermophilum Rt46B.1 and Xyn11A from Pseudobutyrivibrio xylanivorans Mz5T. Xyn11A is a multicatalytic enzyme with an N-terminal GH11 domain, a CBM36 domain, and a C-terminal putative NodB-like polysaccharide deacetylase which is predicted to be an acetyl esterase involved in debranching activity in the xylan backbone. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. Consistent with its structural and sequence similarity to CBM6, CBM36 binds xylan, but only at binding site I, and in a calcium-dependent manner; the latter suggests its potential application in affinity labeling. 119
23933 271145 cd04079 CBM6_agarase-like Carbohydrate Binding Module 6 (CBM6); appended mainly to glycoside hydrolase (GH) family 16 alpha- and beta agarases. This family includes carbohydrate binding module 6 (CBM6) domains that are appended mainly to glycoside hydrolase (GH) family 16 agarases. These CBM6s are non-catalytic carbohydrate binding domains that facilitate the activity of alpha- and beta-agarase catalytic modules which are involved in the hydrolysis of 1,4-beta-D-galactosidic linkages. These CBM6s bind specifically to the non-reducing end of agarose chains, recognizing only the first repeat of the disaccharide, and directing the appended catalytic modules to areas of the plant cell wall attacked by beta-agarases. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. This family includes three tandem CBM6s from the Saccharophagus degradans agarase Aga86E, and three tandem CBM6s from Vibrio sp. strain PO-303 AgaA; in both these proteins these are appended to a GH16 domain. Vibrio AgaA also contains a Big-2-like protein-protein interaction domain. This family also includes two tandem CBM6s from an endo-type beta-agarase from a deep-sea Microbulbifer-like isolate, which are appended to a GH16 domain, and two of three CBM6s of Alteromonas agarilytica AgaA alpha-agarase, which are appended to a GH96 domain. 134
23934 271146 cd04080 CBM6_cellulase-like Carbohydrate Binding Module 6 (CBM6); appended to glycoside hydrolase (GH) domains, including GH5 (cellulase). This family includes carbohydrate binding module 6 (CBM6) domains that are appended to several glycoside hydrolase (GH) domains, including GH5 (cellulase) and GH16, as well as to coagulation factor 5/8 carbohydrate-binding domains. CBM6s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. The CBM6s are appended to GHs that display a diversity of substrate specificities. For some members of this family information is available about the specific substrates of the appended GH domains. It includes the CBM domains of various enzymes involved in cell wall degradation including, an extracellular beta-1,3-glucanase from Lysobacter enzymogenes encoded by the gluC gene (its catalytic domain belongs to the GH16 family), the tandem CBM domains of Pseudomonas sp. PE2 beta-1,3(4)-glucanase A (its catalytic domain also belongs to GH16), and a family 6 CBM from Cellvibrio mixtus Endoglucanase 5A (CmCBM6) which binds to the beta1,4-beta1,3-mixed linked glucans lichenan, and barley beta-glucan, cello-oligosaccharides, insoluble forms of cellulose, the beta1,3-glucan laminarin, and xylooligosaccharides, and the CBM6 of Fibrobacter succinogenes S85 XynD xylanase, appended to a GH10 domain, and Cellvibrio japonicus Cel5G appended to a GH5 (cellulase) domain. GH5 (cellulase) family includes enzymes with several known activities such as endoglucanase, beta-mannanase, and xylanase, which are involved in the degradation of cellulose and xylans. GH16 family includes enzymes with lichenase, xyloglucan endotransglycosylase (XET), and beta-agarase activities. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. For CmCBM6 it has been shown that these two binding sites have different ligand specificities. 144
23935 271147 cd04081 CBM35_galactosidase-like Carbohydrate Binding Module family 35 (CBM35); appended mainly to enzymes that bind alpha-D-galactose (CBM35-Gal), including glycoside hydrolase (GH) families GH27 and GH43. This family includes carbohydrate binding module family 35 (CBM35); these are non-catalytic carbohydrate binding domains that are appended mainly to enzymes that bind alpha-D-galactose (CBM35-Gal), including glycoside hydrolase (GH) families GH27 and GH43. Examples of proteins which contain CBM35s belonging to this family includes the CBM35 of an exo-beta-1,3-galactanase from Phanerochaete chrysosporium 9 (Pc1,3Gal43A) which is appended to a GH43 domain, and the CBM35 domain of two bifunctional proteins with beta-L-arabinopyranosidase/alpha-D-galactopyranosidase activities from Fusarium oxysporum 12S, Foap1 and Foap2 (Fo/AP1 and Fo/AP2), that are appended to GH27 domains. CBM35s are unique in that they display conserved specificity through extensive sequence similarity but divergent function through their appended catalytic modules. They are known to bind alpha-D-galactose (Gal), mannan (Man), xylan, glucuronic acid (GlcA), a beta-polymer of mannose, and possibly glucans, forming four subfamilies based on general ligand specificities (galacto, urono, manno, and gluco configurations). Some CBM35s bind their ligands in a calcium-dependent manner. In contrast to most CBMs that are generally rigid proteins, CBM35 undergoes significant conformational change upon ligand binding. GH43 includes beta-xylosidases and beta-xylanases, using aryl-glycosides as substrates, while family GH27 includes alpha-galactosidases, alpha-N-acetylgalactosaminidases, and isomaltodextranases. 125
23936 271148 cd04082 CBM35_pectate_lyase-like Carbohydrate Binding Module family 35 (CBM35), pectate lyase-like; appended mainly to enzymes that bind mannan (Man), xylan, glucuronic acid (GlcA) and possibly glucans. This family includes carbohydrate binding module family 35 (CBM35) domains that are non-catalytic carbohydrate binding domains that are appended mainly to enzymes that bind mannan (Man), xylan, glucuronic acid (GlcA) and possibly glucans. Included in this family are CBM35s of pectate lyases, including pectate lyase 10A from Cellvibrio japonicus; these enzymes release delta-4,5-anhydrogalacturonic acid (delta4,5-GalA) from pectin, thus identifying a signature molecule for plant cell wall degradation. CBM35s are unique in that they display conserved specificity through extensive sequence similarity but divergent function through their appended catalytic modules. They are known to bind alpha-D-galactose (Gal), mannan (Man), xylan, glucuronic acid (GlcA), a beta-polymer of mannose, and possibly glucans, forming four subfamilies based on general ligand specificities (galacto, urono, manno, and gluco configurations). In contrast to most CBMs that are generally rigid proteins, CBM35 undergoes significant conformational change upon ligand binding. Some CBM35s bind their ligands in a calcium-dependent manner, especially those binding uronic acids. 124
23937 271149 cd04083 CBM35_Lmo2446-like Carbohydrate Binding Module 35 (CBM35) domains similar to Lmo2446. This family includes carbohydrate binding module 35 (CBM35) domains that are appended to several carbohydrate binding enzymes. Some CBM35 domains belonging to this family are appended to glycoside hydrolase (GH) family domains, including glycoside hydrolase family 31 (GH31), for example the CBM35 domain of Lmo2446, an uncharacterized protein from Listeria monocytogenes EGD-e. These CBM35s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. GH31 has a wide range of hydrolytic activities such as alpha-glucosidase, alpha-xylosidase, 6-alpha-glucosyltransferase, or alpha-1,4-glucan lyase, cleaving a terminal carbohydrate moiety from a substrate that may be a starch or a glycoprotein. Most characterized GH31 enzymes are alpha-glucosidases. 125
23938 271150 cd04084 CBM6_xylanase-like Carbohydrate Binding Module 6 (CBM6); many are appended to glycoside hydrolase (GH) family 11 and GH43 xylanase domains. This family includes carbohydrate binding module 6 (CBM6) domains that are appended mainly to glycoside hydrolase (GH) family domains, including GH3, GH11, and GH43 domains. These CBM6s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. Examples of proteins having CBM6s belonging to this family are Microbispora bispora GghA, a 1,4-beta-D-glucan glucohydrolase (GH3); Clostridium thermocellum xylanase U (GH11), and Penicillium purpurogenum ABF3, a bifunctional alpha-L-arabinofuranosidase/xylobiohydrolase (GH43). GH3 comprises enzymes with activities including beta-glucosidase (hydrolyzes beta-galactosidase) and beta-xylosidase (hydrolyzes 1,4-beta-D-xylosidase). GH11 family comprises enzymes with xylanase (endo-1,4-beta-xylanase) activity which catalyze the hydrolysis of beta-1,4 bonds of xylan, the major component of hemicelluloses, to generate xylooligosaccharides and xylose. GH43 includes beta-xylosidases and beta-xylanases, using aryl-glycosides as substrates. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. 123
23939 271151 cd04085 delta_endotoxin_C delta-endotoxin C-terminal domain may be associated with carbohydrate binding functionality. Delta-endotoxin C-terminal domain (delta endotoxin domain III) is part of the activated region of delta endotoxins, which are insecticidal toxins produced during sporulation by Bacillus species of bacteria. The activated endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain (I) is involved in membrane insertion and pore formation, while the second and third domains (II and III) are involved in receptor binding. Domain III structurally resembles the carbohydrate binding domain 6 (CBM6) and it is possible that insect specificity is determined by protein-protein or protein-carbohydrate interactions mediated by both domains II and III of the toxin. Delta-endotoxins are of great interest for development of new bioinsecticides and in the control of mosquitoes. 152
23940 271152 cd04086 CBM35_mannanase-like Carbohydrate Binding Module 35 (CBM35); appended to several carbohydrate binding enzymes, including several glycoside hydrolase (GH) family 26 mannanase domains. This family includes carbohydrate binding module 35 (CBM35) domains that are appended to several carbohydrate binding enzymes, including periplasmic component of ABC-type sugar transport system involved in carbohydrate transport and metabolism, and several glycoside hydrolase (GH) domains, including GH26. These CBM35s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. Examples of proteins having CBM35s belonging to this family are mannanase A from Clostridium thermocellum (GH26), Man26B from Paenibacillus sp. BME-14 (GH26), and the multifunctional Cel44C-Man26A from Paenibacillus polymyxa GS01 (which has two GH domains, GH44 and GH26). GH26 mainly includes mannan endo-1,4-beta-mannosidase which hydrolyzes 1,4-beta-D-linkages in mannans, galacto-mannans, glucomannans, and galactoglucomannans, but displays little activity towards other plant cell wall polysaccharides. A few proteins belonging to this family have additional CBM3 domains; these CBM3s are not found in the CBM6-CBM35-CBM36_like superfamily. 119
23941 239754 cd04087 PTPA Phosphotyrosyl phosphatase activator (PTPA) is also known as protein phosphatase 2A (PP2A) phosphatase activator. PTPA is an essential, well conserved protein that stimulates the tyrosyl phosphatase activity of PP2A. It also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. Together, PTPA and PP2A constitute an ATPase. It has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA is controlled by the transcription factor Yin Yang1 (YY1). PTPA has been suggested to play a role in the insertion of metals to the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac. 266
23942 293905 cd04088 EFG_mtEFG_II Domain II of bacterial elongation factor G and C-terminal domain of mitochondrial Elongation factors G1 and G2. This family represents the domain II of bacterial Elongation factor G (EF-G) and mitochondrial Elongation factors G1 (mtEFG1) and G2 (mtEFG2). During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. mtEFG1 and mtEFG2 show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homolog of mtEFG2, MEF2. 83
23943 293906 cd04089 eRF3_II Domain II of the eukaryotic class II release factor. In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF-1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. 82
23944 293907 cd04090 EF2_II_snRNP Domain II of the spliceosomal 116kD U5 small nuclear ribonucleoprotein (snRNP) component. This subfamily includes domain II of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain is homologous to domain II of the eukaryotic translational elongation factor EF-2. U5-116 kD is a GTPase which is a component of the spliceosome complex which processes precursor mRNAs to produce mature mRNAs. 94
23945 293908 cd04091 mtEFG1_II_like Domain II of mitochondrial elongation factor G1-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group. 81
23946 293909 cd04092 mtEFG2_II_like Domain II of mitochondrial elongation factor G2-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. No clear phenotype has been found for mutants in the yeast homolog of mtEFG2, MEF2. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG1s are not present in this group. 83
23947 294008 cd04093 HBS1_C_III C-terminal domain of Hsp70 subfamily B suppressor 1 (HBS1). This model represents the C-terminal domain of Hsp70 subfamily B suppressor 1 (HBS1), which is homologous to the domain III of EF-1alpha. This group contains proteins similar to yeast Hbs1, which together with Dom34, promotes the No-go decay (NGD) of mRNA. The NGD targets mRNAs whose elongation stalled for degradation initiated by endonucleolytic cleavage in the vicinity of the stalled ribosome. 109
23948 294009 cd04094 eSelB_III Domain III of eukaryotic and archaeal elongation factor SelB. This model represents the domain III of archaeal and eukaryotic selenocysteine (Sec)-specific eukaryotic elongation factor (eEFSec or eSelB), which is homologous to domain III of EF-Tu. SelB is a specialized translation elongation factor responsible for the co-translational incorporation of selenocysteine into proteins by recoding of a UGA stop codon in the presence of a downstream mRNA hairpin loop, called Sec insertion sequence (SECIS) element. 114
23949 294010 cd04095 CysN_NoDQ_III Domain III of the large subunit of ATP sulfurylase (ATPS). This model represents domain III of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and is homologous to domain III of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit, CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction APS is phosphorylated by an APS kinase (CysC), to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. In Rhizobium meliloti, the heterodimeric complex comprised of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N- and C-termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, archaea, and eukaryotes use a different ATP sulfurylase, which shows no amino acid sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha. 103
23950 239763 cd04096 eEF2_snRNP_like_C eEF2_snRNP_like_C: this family represents a C-terminal domain of eukaryotic elongation factor 2 (eEF-2) and a homologous domain of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and, its yeast counterpart Snu114p. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. 80
23951 239764 cd04097 mtEFG1_C mtEFG1_C: C-terminus of mitochondrial Elongation factor G1 (mtEFG1)-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group. 78
23952 239765 cd04098 eEF2_C_snRNP eEF2_C_snRNP: This family includes a C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and, its yeast counterpart Snu114p. This domain is homologous to the C-terminal domain of the eukaryotic translational elongation factor EF-2. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. 80
23953 239766 cd04100 Asp_Lys_Asn_RS_N Asp_Lys_Asn_RS_N: N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases (AspRS, AsnRS, and LysRS). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Included in this group are archaeal and archaeal-like AspRSs which are non-discriminating and can charge both tRNAAsp and tRNAAsn. E. coli cells have two isoforms of LysRSs (LysS and LysU) encoded by two distinct genes, which are differentially regulated. The cytoplasmic and the mitochondrial isoforms of human LysRS are encoded by a single gene. Yeast cytoplasmic and mitochondrial LysRSs participate in mitochondrial import of cytoplasmic tRNAlysCUU. In addition to their housekeeping role, human LysRS may function as a signaling molecule that activates immune cells. Tomato LysRS may participate in a process possibly connected to oxidative-stress conditions or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1 suggesting a role for LysRS in tRNA packaging. AsnRS is an immunodominant antigen of the filarial nematode Brugia malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages. 85
23954 206688 cd04101 RabL4 Rab GTPase-like family 4 (Rab-like4). RabL4 (Rab-like4) subfamily. RabL4s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL4 lacks a prenylation site at the C-terminus. The specific function of RabL4 remains unknown. 167
23955 206689 cd04102 RabL3 Rab GTPase-like family 3 (Rab-like3). RabL3 (Rab-like3) subfamily. RabL3s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL3 lacks a prenylation site at the C-terminus. The specific function of RabL3 remains unknown. 204
23956 133303 cd04103 Centaurin_gamma Centaurin gamma (CENTG) GTPase. The centaurins (alpha, beta, gamma, and delta) are large, multi-domain proteins that all contain an ArfGAP domain and ankyrin repeats, and in some cases, numerous additional domains. Centaurin gamma contains an additional GTPase domain near its N-terminus. The specific function of this GTPase domain has not been well characterized, but centaurin gamma 2 (CENTG2) may play a role in the development of autism. Centaurin gamma 1 is also called PIKE (phosphatidyl inositol (PI) 3-kinase enhancer) and centaurin gamma 2 is also known as AGAP (ArfGAP protein with a GTPase-like domain, ankyrin repeats and a Pleckstrin homology domain) or GGAP. Three isoforms of PIKE have been identified. PIKE-S (short) and PIKE-L (long) are brain-specific isoforms, with PIKE-S restricted to the nucleus and PIKE-L found in multiple cellular compartments. A third isoform, PIKE-A was identified in human glioblastoma brain cancers and has been found in various tissues. GGAP has been shown to have high GTPase activity due to a direct intramolecular interaction between the N-terminal GTPase domain and the C-terminal ArfGAP domain. In human tissue, AGAP mRNA was detected in skeletal muscle, kidney, placenta, brain, heart, colon, and lung. Reduced expression levels were also observed in the spleen, liver, and small intestine. 158
23957 206690 cd04104 p47_IIGP_like p47 GTPase family includes IGTP, TGTP/Mg21, IRG-47, GTPI, LRG-47, and IIGP1. The p47 GTPase family consists of several highly homologous proteins, including IGTP, TGTP/Mg21, IRG-47, GTPI, LRG-47, and IIGP1. They are found in higher eukaryotes where they play a role in immune resistance against intracellular pathogens. p47 proteins exist at low resting levels in mouse cells, but are strongly induced by Type II interferon (IFN-gamma). IGTP is critical for resistance to Toxoplasma gondii infection and is involved in inhibition of Coxsackievirus-B3-induced apoptosis. TGTP was shown to limit vesicular stomatitis virus (VSV) infection of fibroblasts in vitro. IRG-47 is involved in resistance to T. gondii infection. LRG-47 has been implicated in resistance to T. gondii, Listeria monocytogenes, Leishmania, and mycobacterial infections. IIGP1 has been shown to localize to the ER and to the Golgi membranes in IFN-induced cells and inflamed tissues. In macrophages, IIGP1 interacts with hook3, a microtubule binding protein that participates in the organization of the cis-Golgi compartment. 197
23958 206691 cd04105 SR_beta Signal recognition particle receptor, beta subunit (SR-beta), together with SR-alpha, forms the heterodimeric signal recognition particle (SRP). Signal recognition particle receptor, beta subunit (SR-beta). SR-beta and SR-alpha form the heterodimeric signal recognition particle (SRP or SR) receptor that binds SRP to regulate protein translocation across the ER membrane. Nascent polypeptide chains are synthesized with an N-terminal hydrophobic signal sequence that binds SRP54, a component of the SRP. SRP directs targeting of the ribosome-nascent chain complex (RNC) to the ER membrane via interaction with the SR, which is localized to the ER membrane. The RNC is then transferred to the protein-conducting channel, or translocon, which facilitates polypeptide translation across the ER membrane or integration into the ER membrane. SR-beta is found only in eukaryotes; it is believed to control the release of the signal sequence from SRP54 upon binding of the ribosome to the translocon. High expression of SR-beta has been observed in human colon cancer, suggesting it may play a role in the development of this type of cancer. 202
23959 133306 cd04106 Rab23_like Rab GTPase family 23 (Rab23)-like. Rab23-like subfamily. Rab23 is a member of the Rab family of small GTPases. In mouse, Rab23 has been shown to function as a negative regulator in the sonic hedgehog (Shh) signaling pathway. Rab23 mediates the activity of Gli2 and Gli3, transcription factors that regulate Shh signaling in the spinal cord, primarily by preventing Gli2 activation in the absence of Shh ligand. Rab23 also regulates a step in the cytoplasmic signal transduction pathway that mediates the effect of Smoothened (one of two integral membrane proteins that are essential components of the Shh signaling pathway in vertebrates). In humans, Rab23 is expressed in the retina. Mice contain an isoform that shares 93% sequence identity with the human Rab23 and an alternative splicing isoform that is specific to the brain. This isoform causes the murine open brain phenotype, indicating it may have a role in the development of the central nervous system. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 162
23960 206692 cd04107 Rab32_Rab38 Rab GTPase families 38 (Rab38) and 32 (Rab32). Rab38/Rab32 subfamily. Rab32 and Rab38 are members of the Rab family of small GTPases. Human Rab32 was first identified in platelets but it is expressed in a variety of cell types, where it functions as an A-kinase anchoring protein (AKAP). Rab38 has been shown to be melanocyte-specific. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 201
23961 206693 cd04108 Rab36_Rab34 Rab GTPase families 34 (Rab34) and 36 (Rab36). Rab34/Rab36 subfamily. Rab34, found primarily in the Golgi, interacts with its effector, Rab-interacting lysosomal protein (RILP). This enables its participation in microtubular dynein-dynactin-mediated repositioning of lysosomes from the cell periphery to the Golgi. A Rab34 (Rah) isoform that lacks the consensus GTP-binding region has been identified in mice. This isoform is associated with membrane ruffles and promotes macropinosome formation. Rab36 has been mapped to human chromosome 22q11.2, a region that is homozygously deleted in malignant rhabdoid tumors (MRTs). However, experimental assessments do not implicate Rab36 as a tumor suppressor that would enable tumor formation through a loss-of-function mechanism. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 170
23962 206694 cd04109 Rab28 Rab GTPase family 28 (Rab28). Rab28 subfamily. First identified in maize, Rab28 has been shown to be a late embryogenesis-abundant (Lea) protein that is regulated by the plant hormone abscisic acid (ABA). In Arabidopsis, Rab28 is expressed during embryo development and is generally restricted to provascular tissues in mature embryos. Unlike maize Rab28, it is not ABA-inducible. Characterization of the human Rab28 homolog revealed two isoforms, which differ by a 95-base pair insertion, producing an alternative sequence for the 30 amino acids at the C-terminus. The two human isoforms are presumably the result of alternative splicing. Since they differ at the C-terminus but not in the GTP-binding region, they are predicted to be targeted to different cellular locations. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 213
23963 133310 cd04110 Rab35 Rab GTPase family 35 (Rab35). Rab35 is one of several Rab proteins found to participate in the regulation of osteoclast cells in rats. In addition, Rab35 has been identified as a protein that interacts with nucleophosmin-anaplastic lymphoma kinase (NPM-ALK) in human cells. Overexpression of NPM-ALK is a key oncogenic event in some anaplastic large-cell lymphomas; since Rab35 interacts with NPM-ALK, it may provide a target for cancer treatments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 199
23964 133311 cd04111 Rab39 Rab GTPase family 39 (Rab39). Found in eukaryotes, Rab39 is mainly found in epithelial cell lines, but is distributed widely in various human tissues and cell lines. It is believed to be a novel Rab protein involved in regulating Golgi-associated vesicular transport during cellular endocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 211
23965 206695 cd04112 Rab26 Rab GTPase family 26 (Rab26). Rab26 subfamily. First identified in rat pancreatic acinar cells, Rab26 is believed to play a role in recruiting mature granules to the plasma membrane upon beta-adrenergic stimulation. Rab26 belongs to the Rab functional group III, which are considered key regulators of intracellular vesicle transport during exocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 191
23966 206696 cd04113 Rab4 Rab GTPase family 4 (Rab4). Rab4 subfamily. Rab4 has been implicated in numerous functions within the cell. It helps regulate endocytosis through the sorting, recycling, and degradation of early endosomes. Mammalian Rab4 is involved in the regulation of many surface proteins including G-protein-coupled receptors, transferrin receptor, integrins, and surfactant protein A. Experimental data implicate Rab4 in regulation of the recycling of internalized receptors back to the plasma membrane. It is also believed to influence receptor-mediated antigen processing in B-lymphocytes, in calcium-dependent exocytosis in platelets, in alpha-amylase secretion in pancreatic cells, and in insulin-induced translocation of Glut4 from internal vesicles to the cell surface. Rab4 is known to share effector proteins with Rab5 and Rab11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 161
23967 133314 cd04114 Rab30 Rab GTPase family 30 (Rab30). Rab30 subfamily. Rab30 appears to be associated with the Golgi stack. It is expressed in a wide variety of tissue types and in humans maps to chromosome 11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 169
23968 133315 cd04115 Rab33B_Rab33A Rab GTPase family 33 includes Rab33A and Rab33B. Rab33B/Rab33A subfamily. Rab33B is ubiquitously expressed in mouse tissues and cells, where it is localized to the medial Golgi cisternae. It colocalizes with alpha-mannosidase II. Together with the other cisternal Rabs, Rab6A and Rab6A', it is believed to regulate the Golgi response to stress and is likely a molecular target in stress-activated signaling pathways. Rab33A (previously known as S10) is expressed primarily in the brain and immune system cells. In humans, it is located on the X chromosome at Xq26 and its expression is down-regulated in tuberculosis patients. Experimental evidence suggests that Rab33A is a novel CD8+ T cell factor that likely plays a role in tuberculosis disease processes. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 170
23969 206697 cd04116 Rab9 Rab GTPase family 9 (Rab9). Rab9 is found in late endosomes, together with mannose 6-phosphate receptors (MPRs) and the tail-interacting protein of 47 kD (TIP47). Rab9 is a key mediator of vesicular transport from late endosomes to the trans-Golgi network (TGN) by redirecting the MPRs. Rab9 has been identified as a key component for the replication of several viruses, including HIV1, Ebola, Marburg, and measles, making it a potential target for inhibiting a variety of viruses. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 170
23970 206698 cd04117 Rab15 Rab GTPase family 15 (Rab15). Rab15 colocalizes with the transferrin receptor in early endosome compartments, but not with late endosomal markers. It codistributes with Rab4 and Rab5 on early/sorting endosomes, and with Rab11 on pericentriolar recycling endosomes. It is believed to function as an inhibitory GTPase that regulates distinct steps in early endocytic trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 164
23971 133318 cd04118 Rab24 Rab GTPase family 24 (Rab24). Rab24 is distinct from other Rabs in several ways. It exists primarily in the GTP-bound state, having a low intrinsic GTPase activity; it is not efficiently geranyl-geranylated at the C-terminus; it does not form a detectable complex with Rab GDP-dissociation inhibitors (GDIs); and it has recently been shown to undergo tyrosine phosphorylation when overexpressed in vitro. The specific function of Rab24 still remains unknown. It is found in a transport route between ER-cis-Golgi and late endocytic compartments. It is putatively involved in an autophagic pathway, possibly directing misfolded proteins in the ER to degradative pathways. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 193
23972 133319 cd04119 RJL Rab GTPase family J-like (RabJ-like). RJLs are found in many protists and as chimeras with C-terminal DNAJ domains in deuterostome metazoa. They are not found in plants, fungi, and protostome metazoa, suggesting a horizontal gene transfer between protists and deuterostome metazoa. RJLs lack any known membrane targeting signal and contain a degenerate phosphate/magnesium-binding 3 (PM3) motif, suggesting an impaired ability to hydrolyze GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 168
23973 206699 cd04120 Rab12 Rab GTPase family 12 (Rab12). Rab12 was first identified in canine cells, where it was localized to the Golgi complex. The specific function of Rab12 remains unknown, and inconsistent results about its cellular localization have been reported. More recent studies have identified Rab12 associated with post-Golgi vesicles, or with other small vesicle-like structures but not with the Golgi complex. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 202
23974 133321 cd04121 Rab40 Rab GTPase family 40 (Rab40) contains Rab40a, Rab40b and Rab40c. The Rab40 subfamily contains Rab40a, Rab40b, and Rab40c, which are all highly homologous. In rat, Rab40c is localized to the perinuclear recycling compartment (PRC), and is distributed in a tissue-specific manner, with high expression in brain, heart, kidney, and testis, low expression in lung and liver, and no expression in spleen and skeletal muscle. Rab40c is highly expressed in differentiated oligodendrocytes but minimally expressed in oligodendrocyte progenitors, suggesting a role in the vesicular transport of myelin components. Unlike most other Ras-superfamily proteins, Rab40c was shown to have a much lower affinity for GTP, and an affinity for GDP that is lower than for GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 189
23975 133322 cd04122 Rab14 Rab GTPase family 14 (Rab14). Rab14 GTPases are localized to biosynthetic compartments, including the rough ER, the Golgi complex, and the trans-Golgi network, and to endosomal compartments, including early endosomal vacuoles and associated vesicles. Rab14 is believed to function in both the biosynthetic and recycling pathways between the Golgi and endosomal compartments. Rab14 has also been identified on GLUT4 vesicles, and has been suggested to help regulate GLUT4 translocation. In addition, Rab14 is believed to play a role in the regulation of phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 166
23976 133323 cd04123 Rab21 Rab GTPase family 21 (Rab21). The localization and function of Rab21 are not clearly defined, with conflicting data reported. Rab21 has been reported to localize in the ER in human intestinal epithelial cells, with partial colocalization with alpha-glucosidase, a late endosomal/lysosomal marker. More recently, Rab21 was shown to colocalize with and affect the morphology of early endosomes. In Dictyostelium, GTP-bound Rab21, together with two novel LIM domain proteins, LimF and ChLim, has been shown to regulate phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 162
23977 133324 cd04124 RabL2 Rab GTPase-like family 2 (Rab-like2). RabL2 (Rab-like2) subfamily. RabL2s are novel Rab proteins identified recently which display features that are distinct from other Rabs, and have been termed Rab-like. RabL2 contains RabL2a and RabL2b, two very similar Rab proteins that share > 98% sequence identity in humans. RabL2b maps to the subtelomeric region of chromosome 22q13.3 and RabL2a maps to 2q13, a region that suggests it is also a subtelomeric gene. Both genes are believed to be expressed ubiquitously, suggesting that RabL2s are the first example of duplicated genes in human proximal subtelomeric regions that are both expressed actively. Like other Rab-like proteins, RabL2s lack a prenylation site at the C-terminus. The specific functions of RabL2a and RabL2b remain unknown. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 161
23978 133326 cd04126 Rab20 Rab GTPase family 20 (Rab20). Rab20 is one of several Rab proteins that appear to be restricted in expression to the apical domain of murine polarized epithelial cells. It is expressed on the apical side of polarized kidney tubule and intestinal epithelial cells, and in non-polarized cells. It also localizes to vesico-tubular structures below the apical brush border of renal proximal tubule cells and in the apical region of duodenal epithelial cells. Rab20 has also been shown to colocalize with vacuolar H+-ATPases (V-ATPases) in mouse kidney cells, suggesting a role in the regulation of V-ATPase traffic in specific portions of the nephron. It was also shown to be one of several proteins whose expression is upregulated in human myelodysplastic syndrome (MDS) patients. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 220
23979 206700 cd04127 Rab27A Rab GTPase family 27a (Rab27a). The Rab27a subfamily consists of Rab27a and its highly homologous isoform, Rab27b. Unlike most Rab proteins whose functions remain poorly defined, Rab27a has many known functions. Rab27a has multiple effector proteins, and depending on which effector it binds, Rab27a has different functions as well as tissue distribution and/or cellular localization. Putative functions have been assigned to Rab27a when associated with the effector proteins Slp1, Slp2, Slp3, Slp4, Slp5, DmSlp, rabphilin, Dm/Ce-rabphilin, Slac2-a, Slac2-b, Slac2-c, Noc2, JFC1, and Munc13-4. Rab27a has been associated with several human diseases, including hemophagocytic syndrome (Griscelli syndrome or GS), Hermansky-Pudlak syndrome, and choroideremia. In the case of GS, a rare, autosomal recessive disease, a Rab27a mutation is directly responsible for the disorder. When Rab27a is localized to the secretory granules of pancreatic beta cells, it is believed to mediate glucose-stimulated insulin secretion, making it a potential target for diabetes therapy. When bound to JFC1 in prostate cells, Rab27a is believed to regulate the exocytosis of prostate-specific markers. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 180
23980 206701 cd04128 Spg1 Septum-promoting GTPase (Spg1). Spg1p. Spg1p (septum-promoting GTPase) was first identified in the fission yeast S. pombe, where it regulates septum formation in the septation initiation network (SIN) through the cdc7 protein kinase. Spg1p is an essential gene that localizes to the spindle pole bodies. When GTP-bound, it binds cdc7 and causes it to translocate to spindle poles. Sid4p (septation initiation defective) is required for localization of Spg1p to the spindle pole body, and the ability of Spg1p to promote septum formation from any point in the cell cycle depends on Sid4p. Spg1p is negatively regulated by Byr4 and cdc16, which form a two-component GTPase activating protein (GAP) for Spg1p. The existence of a SIN-related pathway in plants has been proposed. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 182
23981 206702 cd04129 Rho2 Ras homology family 2 (Rho2) of small guanosine triphosphatases (GTPases). Rho2 is a fungal GTPase that plays a role in cell morphogenesis, control of cell wall integrity, control of growth polarity, and maintenance of growth direction. Rho2 activates the protein kinase C homolog Pck2, and Pck2 controls Mok1, the major (1-3) alpha-D-glucan synthase. Together with Rho1 (RhoA), Rho2 regulates the construction of the cell wall. Unlike Rho1, Rho2 is not an essential protein, but its overexpression is lethal. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for proper intracellular localization via membrane attachment. As with other Rho family GTPases, the GDP/GTP cycling is regulated by GEFs (guanine nucleotide exchange factors), GAPs (GTPase-activating proteins) and GDIs (guanine nucleotide dissociation inhibitors). 190
23982 133330 cd04130 Wrch_1 Wnt-1 responsive Cdc42 homolog (Wrch-1) is a Rho family GTPase similar to Cdc42. Wrch-1 (Wnt-1 responsive Cdc42 homolog) is a Rho family GTPase that shares significant sequence and functional similarity with Cdc42. Wrch-1 was first identified in mouse mammary epithelial cells, where its transcription is upregulated in Wnt-1 transformation. Wrch-1 contains N- and C-terminal extensions relative to cdc42, suggesting potential differences in cellular localization and function. The Wrch-1 N-terminal extension contains putative SH3 domain-binding motifs and has been shown to bind the SH3 domain-containing protein Grb2, which increases the level of active Wrch-1 in cells. Unlike Cdc42, which localizes to the cytosol and perinuclear membranes, Wrch-1 localizes extensively with the plasma membrane and endosomes. The membrane association, localization, and biological activity of Wrch-1 indicate an atypical model of regulation distinct from other Rho family GTPases. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 173
23983 206703 cd04131 Rnd Rho family GTPase subfamily Rnd includes Rnd1/Rho6, Rnd2/Rho7, and Rnd3/RhoE/Rho8. The Rnd subfamily contains Rnd1/Rho6, Rnd2/Rho7, and Rnd3/RhoE/Rho8. These novel Rho family proteins have substantial structural differences compared to other Rho members, including N- and C-terminal extensions relative to other Rhos. Rnd3/RhoE is farnesylated at the C-terminal prenylation site, unlike most other Rho proteins that are geranylgeranylated. In addition, Rnd members are unable to hydrolyze GTP and are resistant to GAP activity. They are believed to exist only in the GTP-bound conformation, and are antagonists of RhoA activity. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 176
23984 206704 cd04132 Rho4_like Ras homology family 4 (Rho4) of small guanosine triphosphatases (GTPases)-like. Rho4 is a GTPase that controls septum degradation by regulating secretion of Eng1 or Agn1 during cytokinesis. Rho4 also plays a role in cell morphogenesis. Rho4 regulates septation and cell morphology by controlling the actin cytoskeleton and cytoplasmic microtubules. The localization of Rho4 is modulated by Rdi1, which may function as a GDI, and by Rga9, which is believed to function as a GAP. In S. pombe, both Rho4 deletion and Rho4 overexpression result in a defective cell wall, suggesting a role for Rho4 in maintaining cell wall integrity. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 197
23985 206705 cd04133 Rop_like Rho-related protein from plants (Rop)-like. The Rop (Rho-related protein from plants) subfamily plays a role in diverse cellular processes, including cytoskeletal organization, pollen and vegetative cell growth, hormone responses, stress responses, and pathogen resistance. Rops are able to regulate several downstream pathways to amplify a specific signal by acting as master switches early in the signaling cascade. They transmit a variety of extracellular and intracellular signals. Rops are involved in establishing cell polarity in root-hair development, root-hair elongation, pollen-tube growth, cell-shape formation, responses to hormones such as abscisic acid (ABA) and auxin, responses to abiotic stresses such as oxygen deprivation, and disease resistance and disease susceptibility. An individual Rop can have a unique function or an overlapping function shared with other Rop proteins; in addition, a given Rop-regulated function can be controlled by one or multiple Rop proteins. For example, Rop1, Rop3, and Rop5 are all involved in pollen-tube growth; Rop2 plays a role in response to low-oxygen environments, cell-morphology, and root-hair development; root-hair development is also regulated by Rop4 and Rop6; Rop6 is also responsible for ABA response, and ABA response is also regulated by Rop10. Plants retain some of the regulatory mechanisms that are shared by other members of the Rho family, but have also developed a number of unique modes for regulating Rops. Unique RhoGEFs have been identified that are exclusively active toward Rop proteins, such as those containing the domain PRONE (plant-specific Rop nucleotide exchanger). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 173
23986 206706 cd04134 Rho3 Ras homology family 3 (Rho3) of small guanosine triphosphatases (GTPases). Rho3 is a member of the Rho family found only in fungi. Rho3 is believed to regulate cell polarity by interacting with the diaphanous/formin family protein For3 to control both the actin cytoskeleton and microtubules. Rho3 is also believed to have a direct role in exocytosis that is independent of its role in regulating actin polarity. The function in exocytosis may be two-pronged: first, in the transport of post-Golgi vesicles from the mother cell to the bud, mediated by myosin (Myo2); second, in the docking and fusion of vesicles to the plasma membrane, mediated by an exocyst (Exo70) protein. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 185
23987 206707 cd04135 Tc10 Rho GTPase TC10 (Tc10). TC10 is a Rho family protein that has been shown to induce microspike formation and neurite outgrowth in vitro. Its expression changes dramatically after peripheral nerve injury, suggesting an important role in promoting axonal outgrowth and regeneration. TC10 regulates translocation of insulin-stimulated GLUT4 in adipocytes and has also been shown to bind directly to Golgi COPI coat proteins. GTP-bound TC10 in vitro can bind numerous potential effectors. Depending on its subcellular localization and distinct functional domains, TC10 can differentially regulate two types of filamentous actin in adipocytes. TC10 mRNAs are highly expressed in three types of mouse muscle tissues: leg skeletal muscle, cardiac muscle, and uterus; they were also present in brain, with higher levels in adults than in newborns. TC10 has also been shown to play a role in regulating the expression of cystic fibrosis transmembrane conductance regulator (CFTR) through interactions with CFTR-associated ligand (CAL). The GTP-bound form of TC10 directs the trafficking of CFTR from the juxtanuclear region to the secretory pathway toward the plasma membrane, away from CAL-mediated CFTR degradation in the lysosome. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 174
23988 206708 cd04136 Rap_like Rap-like family consists of Rap1, Rap2 and RSR1. The Rap subfamily consists of Rap1, Rap2, and RSR1. Rap subfamily proteins perform different cellular functions, depending on the isoform and its subcellular localization. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and microsomal membrane of the pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. Rap1 localizes in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap2 is involved in multiple functions, including activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton and activation of the Wnt/beta-catenin signaling pathway in embryonic Xenopus. A number of effector proteins for Rap2 have been identified, including isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK), and the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. RSR1 is the fungal homolog of Rap1 and Rap2. In budding yeasts, it is involved in selecting a site for bud growth, which directs the establishment of cell polarization. The Rho family GTPase Cdc42 and its GEF, Cdc24, then establish an axis of polarized growth. It is believed that Cdc42 interacts directly with RSR1 in vivo. In filamentous fungi such as Ashbya gossypii, RSR1 is a key regulator of polar growth in the hypha. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 164
23989 206709 cd04137 RheB Ras Homolog Enriched in Brain (RheB) is a small GTPase. Rheb (Ras Homolog Enriched in Brain) subfamily. Rheb was initially identified in rat brain, where its expression is elevated by seizures or by long-term potentiation. It is expressed ubiquitously, with elevated levels in muscle and brain. Rheb functions as an important mediator between the tuberous sclerosis complex proteins, TSC1 and TSC2, and the mammalian target of rapamycin (TOR) kinase to stimulate cell growth. TOR kinase regulates cell growth by controlling nutrient availability, growth factors, and the energy status of the cell. TSC1 and TSC2 form a dimeric complex that has tumor suppressor activity, and TSC2 is a GTPase activating protein (GAP) for Rheb. The TSC1/TSC2 complex inhibits the activation of TOR kinase through Rheb. Rheb has also been shown to induce the formation of large cytoplasmic vacuoles in a process that is dependent on the GTPase cycle of Rheb, but independent of the TOR kinase, suggesting Rheb plays a role in endocytic trafficking that leads to cell growth and cell-cycle progression. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 180
23990 133338 cd04138 H_N_K_Ras_like Ras GTPase family containing H-Ras,N-Ras and K-Ras4A/4B. H-Ras/N-Ras/K-Ras subfamily. H-Ras, N-Ras, and K-Ras4A/4B are the prototypical members of the Ras family. These isoforms generate distinct signal outputs despite interacting with a common set of activators and effectors, and are strongly associated with oncogenic progression in tumor initiation. Mutated versions of Ras that are insensitive to GAP stimulation (and are therefore constitutively active) are found in a significant fraction of human cancers. Many Ras guanine nucleotide exchange factors (GEFs) have been identified. They are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active (GTP-bound) Ras interacts with several effector proteins that stimulate a variety of diverse cytoplasmic signaling activities. Some are known to positively mediate the oncogenic properties of Ras, including Raf, phosphatidylinositol 3-kinase (PI3K), RalGEFs, and Tiam1. Others are proposed to play negative regulatory roles in oncogenesis, including RASSF and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 162
23991 206710 cd04139 RalA_RalB Ral (Ras-like) family containing highly homologous RalA and RalB. The Ral (Ras-like) subfamily consists of the highly homologous RalA and RalB. Ral proteins are believed to play a crucial role in tumorigenesis, metastasis, endocytosis, and actin cytoskeleton dynamics. Despite their high sequence similarity (>80% sequence identity), nonoverlapping and opposing functions have been assigned to RalA and RalB in tumor migration. In human bladder and prostate cancer cells, RalB promotes migration while RalA inhibits it. A Ral-specific set of GEFs has been identified that are activated by Ras binding. This RalGEF activity is enhanced by Ras binding to another of its target proteins, phosphatidylinositol 3-kinase (PI3K). Ral effectors include RLIP76/RalBP1, a Rac/cdc42 GAP, and the exocyst (Sec6/8) complex, a heterooctameric protein complex that is involved in tethering vesicles to specific sites on the plasma membrane prior to exocytosis. In rat kidney cells, RalB is required for functional assembly of the exocyst and for localizing the exocyst to the leading edge of migrating cells. In human cancer cells, RalA is required to support anchorage-independent proliferation and RalB is required to suppress apoptosis. RalA has been shown to localize to the plasma membrane while RalB is localized to the intracellular vesicles. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 163
23992 206711 cd04140 ARHI_like A Ras homolog member I (ARHI). ARHI (A Ras homolog member I) is a member of the Ras family with several unique structural and functional properties. ARHI is expressed in normal human ovarian and breast tissue, but its expression is decreased or eliminated in breast and ovarian cancer. ARHI contains an N-terminal extension of 34 residues (human) that is required to retain its tumor suppressive activity. Unlike most other Ras family members, ARHI is maintained in the constitutively active (GTP-bound) state in resting cells and has modest GTPase activity. ARHI inhibits STAT3 (signal transducers and activators of transcription 3), a latent transcription factor whose abnormal activation plays a critical role in oncogenesis. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 165
23993 206712 cd04141 Rit_Rin_Ric Ras-like protein in all tissues (Rit), Ras-like protein in neurons (Rin) and Ras-related protein which interacts with calmodulin (Ric). Rit (Ras-like protein in all tissues), Rin (Ras-like protein in neurons) and Ric (Ras-related protein which interacts with calmodulin) form a subfamily with several unique structural and functional characteristics. These proteins all lack the C-terminal CaaX lipid-binding motif typical of Ras family proteins, and Rin and Ric contain calmodulin-binding domains. Rin, which is expressed only in neurons, induces neurite outgrowth in rat pheochromocytoma cells through its association with calmodulin and its activation of endogenous Rac/cdc42. Rit, which is ubiquitously expressed in mammals, inhibits growth-factor withdrawal-mediated apoptosis and induces neurite extension in pheochromocytoma cells. Rit and Rin are both able to form a ternary complex with PAR6, a cell polarity-regulating protein, and Rac/cdc42. This ternary complex is proposed to have physiological function in processes such as tumorigenesis. Activated Ric is likely to signal in parallel with the Ras pathway or stimulate the Ras pathway at some upstream point, and binding of calmodulin to Ric may negatively regulate Ric activity. 172
23994 133342 cd04142 RRP22 Ras-related protein on chromosome 22 (RRP22) family. RRP22 (Ras-related protein on chromosome 22) subfamily consists of proteins that inhibit cell growth and promote caspase-independent cell death. Unlike most Ras proteins, RRP22 is down-regulated in many human tumor cells due to promoter methylation. RRP22 localizes to the nucleolus in a GTP-dependent manner, suggesting a novel function in modulating transport of nucleolar components. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Like most Ras family proteins, RRP22 is farnesylated. 198
23995 133343 cd04143 Rhes_like Ras homolog enriched in striatum (Rhes) and activator of G-protein signaling 1 (Dexras1/AGS1). This subfamily includes Rhes (Ras homolog enriched in striatum) and Dexras1/AGS1 (activator of G-protein signaling 1). These proteins are homologous, but exhibit significant differences in tissue distribution and subcellular localization. Rhes is found primarily in the striatum of the brain, but is also expressed in other areas of the brain, such as the cerebral cortex, hippocampus, inferior colliculus, and cerebellum. Rhes expression is controlled by thyroid hormones. In rat PC12 cells, Rhes is farnesylated and localizes to the plasma membrane. Rhes binds and activates PI3K, and plays a role in coupling serpentine membrane receptors with heterotrimeric G-protein signaling. Rhes has recently been shown to be reduced under conditions of dopamine supersensitivity and may play a role in determining dopamine receptor sensitivity. Dexras1/AGS1 is a dexamethasone-induced Ras protein that is expressed primarily in the brain, with low expression levels in other tissues. Dexras1 localizes primarily to the cytoplasm, and is a critical regulator of the circadian master clock to photic and nonphotic input. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 247
23996 133344 cd04144 Ras2 Rat sarcoma (Ras) family 2 of small guanosine triphosphatases (GTPases). The Ras2 subfamily, found exclusively in fungi, was first identified in Ustilago maydis. In U. maydis, Ras2 is regulated by Sql2, a protein that is homologous to GEFs (guanine nucleotide exchange factors) of the CDC25 family. Ras2 has been shown to induce filamentous growth, but the signaling cascade through which Ras2 and Sql2 regulate cell morphology is not known. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 190
23997 133345 cd04145 M_R_Ras_like R-Ras2/TC21, M-Ras/R-Ras3. The M-Ras/R-Ras-like subfamily contains R-Ras2/TC21, M-Ras/R-Ras3, and related members of the Ras family. M-Ras is expressed in lympho-hematopoietic cells. It interacts with some of the known Ras effectors, but appears to also have its own effectors. Expression of mutated M-Ras leads to transformation of several types of cell lines, including hematopoietic cells, mammary epithelial cells, and fibroblasts. Overexpression of M-Ras is observed in carcinomas from breast, uterus, thyroid, stomach, colon, kidney, lung, and rectum. In addition, expression of a constitutively active M-Ras mutant in murine bone marrow induces a malignant mast cell leukemia that is distinct from the monocytic leukemia induced by H-Ras. TC21, along with H-Ras, has been shown to regulate the branching morphogenesis of ureteric bud cells in mice. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 164
23998 206713 cd04146 RERG_RasL11_like Ras-related and Estrogen-Regulated Growth inhibitor (RERG) and Ras-like 11 (RasL11)-like families. RERG (Ras-related and Estrogen-Regulated Growth inhibitor) and Ras-like 11 are members of a novel subfamily of Ras that were identified based on their behavior in breast and prostate tumors, respectively. RERG expression was decreased or lost in a significant fraction of primary human breast tumors that lack estrogen receptor and are correlated with poor clinical prognosis. Elevated RERG expression correlated with favorable patient outcome in a breast tumor subtype that is positive for estrogen receptor expression. In contrast to most Ras proteins, RERG overexpression inhibited the growth of breast tumor cells in vitro and in vivo. RasL11 was found to be ubiquitously expressed in human tissue, but down-regulated in prostate tumors. Both RERG and RasL11 lack the C-terminal CaaX prenylation motif, where a = an aliphatic amino acid and X = any amino acid, and are localized primarily in the cytoplasm. Both are believed to have tumor suppressor activity. 166
23999 206714 cd04147 Ras_dva Ras - dorsal-ventral anterior localization (Ras-dva) family. Ras-dva subfamily. Ras-dva (Ras - dorsal-ventral anterior localization) subfamily consists of a set of proteins characterized only in Xenopus laevis, to date. In Xenopus, Ras-dva expression is activated by the transcription factor Otx2 and begins during gastrulation throughout the anterior ectoderm. Ras-dva expression is inhibited in the anterior neural plate by factor Xanf1. Downregulation of Ras-dva results in head development abnormalities through the inhibition of several regulators of the anterior neural plate and folds patterning, including Otx2, BF-1, Xag2, Pax6, Slug, and Sox9. Downregulation of Ras-dva also interferes with the FGF-8a signaling within the anterior ectoderm. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 197
24000 206715 cd04148 RGK Rem, Rem2, Rad, Gem/Kir (RGK) subfamily of Ras GTPases. RGK subfamily. The RGK (Rem, Rem2, Rad, Gem/Kir) subfamily of Ras GTPases are expressed in a tissue-specific manner and are dynamically regulated by transcriptional and posttranscriptional mechanisms in response to environmental cues. RGK proteins bind to the beta subunit of L-type calcium channels, causing functional down-regulation of these voltage-dependent calcium channels, and either termination of calcium-dependent secretion or modulation of electrical conduction and contractile function. Inhibition of L-type calcium channels by Rem2 may provide a mechanism for modulating calcium-triggered exocytosis in hormone-secreting cells, and has been proposed to influence the secretion of insulin in pancreatic beta cells. RGK proteins also interact with and inhibit the Rho/Rho kinase pathway to modulate remodeling of the cytoskeleton. Two characteristics of RGK proteins cited in the literature are N-terminal and C-terminal extensions beyond the GTPase domain typical of Ras superfamily members. The N-terminal extension is not conserved among family members; the C-terminal extension is reported to be conserved among the family and lack the CaaX prenylation motif typical of membrane-associated Ras proteins. However, a putative CaaX motif has been identified in the alignment of the C-terminal residues of this CD. 219
24001 206716 cd04149 Arf6 ADP ribosylation factor 6 (Arf6). Arf6 subfamily. Arf6 (ADP ribosylation factor 6) proteins localize to the plasma membrane, where they perform a wide variety of functions. In its active, GTP-bound form, Arf6 is involved in cell spreading, Rac-induced formation of plasma membrane ruffles, cell migration, wound healing, and Fc-mediated phagocytosis. Arf6 appears to change the actin structure at the plasma membrane by activating Rac, a Rho family protein involved in membrane ruffling. Arf6 is required for and enhances Rac formation of ruffles. Arf6 can regulate dendritic branching in hippocampal neurons, and in yeast it localizes to the growing bud, where it plays a role in polarized growth and bud site selection. In leukocytes, Arf6 is required for chemokine-stimulated migration across endothelial cells. Arf6 also plays a role in down-regulation of beta2-adrenergic receptors and luteinizing hormone receptors by facilitating the release of sequestered arrestin to allow endocytosis. Arf6 is believed to function at multiple sites on the plasma membrane through interaction with a specific set of GEFs, GAPs, and effectors. Arf6 has been implicated in breast cancer and melanoma cell invasion, and in actin remodelling at the invasion site of Chlamydia infection. 168
24002 206717 cd04150 Arf1_5_like ADP-ribosylation factor-1 (Arf1) and ADP-ribosylation factor-5 (Arf5). The Arf1-Arf5-like subfamily contains Arf1, Arf2, Arf3, Arf4, Arf5, and related proteins. Arfs1-5 are soluble proteins that are crucial for assembling coat proteins during vesicle formation. Each contains an N-terminal myristoylated amphipathic helix that is folded into the protein in the GDP-bound state. GDP/GTP exchange exposes the helix, which anchors to the membrane. Following GTP hydrolysis, the helix dissociates from the membrane and folds back into the protein. A general feature of Arf1-5 signaling may be the cooperation of two Arfs at the same site. Arfs1-5 are generally considered to be interchangeable in function and location, but some specific functions have been assigned. Arf1 localizes to the early/cis-Golgi, where it is activated by GBF1 and recruits the coat protein COPI. It also localizes to the trans-Golgi network (TGN), where it is activated by BIG1/BIG2 and recruits the AP1, AP3, AP4, and GGA proteins. Humans, but not rodents and other lower eukaryotes, lack Arf2. Human Arf3 shares 96% sequence identity with Arf1 and is believed to generally function interchangeably with Arf1. Human Arf4 in the activated (GTP-bound) state has been shown to interact with the cytoplasmic domain of epidermal growth factor receptor (EGFR) and mediate the EGF-dependent activation of phospholipase D2 (PLD2), leading to activation of the activator protein 1 (AP-1) transcription factor. Arf4 has also been shown to recognize the C-terminal sorting signal of rhodopsin and regulate its incorporation into specialized post-Golgi rhodopsin transport carriers (RTCs). There is some evidence that Arf5 functions at the early-Golgi and the trans-Golgi to affect Golgi-associated alpha-adaptin homology Arf-binding proteins (GGAs). 159
24003 206718 cd04151 Arl1 ADP ribosylation factor 1 (Arf1). Arl1 subfamily. Arl1 (Arf-like 1) localizes to the Golgi complex, where it is believed to recruit effector proteins to the trans-Golgi network. Like most members of the Arf family, Arl1 is myristoylated at its N-terminal helix and mutation of the myristoylation site disrupts Golgi targeting. In humans, the Golgi-localized proteins golgin-97 and golgin-245 have been identified as Arl1 effectors. Golgins are large coiled-coil proteins found in the Golgi, and these golgins contain a C-terminal GRIP domain, which is the site of Arl1 binding. Additional Arl1 effectors include the GARP (Golgi-associated retrograde protein)/VFT (Vps53) vesicle-tethering complex and Arfaptin 2. Arl1 is not required for exocytosis, but appears necessary for trafficking from the endosomes to the Golgi. In Drosophila zygotes, mutation of Arl1 is lethal, and in the host-bloodstream form of Trypanosoma brucei, Arl1 is essential for viability. 158
24004 206719 cd04152 Arl4_Arl7 Arf-like 4 (Arl4) and 7 (Arl7) GTPases. Arl4 (Arf-like 4) is highly expressed in testicular germ cells, and is found in the nucleus and nucleolus. In mice, Arl4 is developmentally expressed during embryogenesis, and a role in somite formation and central nervous system differentiation has been proposed. Arl7 has been identified as the only Arf/Arl protein to be induced by agonists of liver X-receptor and retinoid X-receptor and by cholesterol loading in human macrophages. Arl7 is proposed to play a role in transport between a perinuclear compartment and the plasma membrane, apparently linked to the ABCA1-mediated cholesterol secretion pathway. Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily. 183
24005 133353 cd04153 Arl5_Arl8 Arf-like 5 (Arl5) and 8 (Arl8) GTPases. Arl5/Arl8 subfamily. Arl5 (Arf-like 5) and Arl8, like Arl4 and Arl7, are localized to the nucleus and nucleolus. Arl5 is developmentally regulated during embryogenesis in mice. Human Arl5 interacts with the heterochromatin protein 1-alpha (HP1alpha), a nonhistone chromosomal protein that is associated with heterochromatin and telomeres, and prevents telomere fusion. Arl5 may also play a role in embryonic nuclear dynamics and/or signaling cascades. Arl8 was identified from a fetal cartilage cDNA library. It is found in brain, heart, lung, cartilage, and kidney. No function has been assigned for Arl8 to date. 174
24006 206720 cd04154 Arl2 Arf-like 2 (Arl2) GTPase. Arl2 (Arf-like 2) GTPases are members of the Arf family that bind GDP and GTP with very low affinity. Unlike most Arf family proteins, Arl2 is not myristoylated at its N-terminal helix. The protein PDE-delta, first identified in photoreceptor rod cells, binds specifically to Arl2 and is structurally very similar to RhoGDI. Despite the high structural similarity between Arl2 and Rho proteins and between PDE-delta and RhoGDI, the interactions between the GTPases and their effectors are very different. In its GTP-bound form, Arl2 interacts with the protein Binder of Arl2 (BART), and the complex is believed to play a role in mitochondrial adenine nucleotide transport. In its GDP-bound form, Arl2 interacts with tubulin-folding cofactor D; this interaction is believed to play a role in regulation of microtubule dynamics that impact the cytoskeleton, cell division, and cytokinesis. 173
24007 206721 cd04155 Arl3 Arf-like 3 (Arl3) GTPase. Arl3 (Arf-like 3) is an Arf family protein that differs from most Arf family members in the N-terminal extension. In its inactive, GDP-bound form, the N-terminal extension forms an elongated loop that is hydrophobically anchored into the membrane surface; however, it has been proposed that this region might form a helix in the GTP-bound form. The delta subunit of the rod-specific cyclic GMP phosphodiesterase type 6 (PDEdelta) is an Arl3 effector. Arl3 binds microtubules in a regulated manner to alter specific aspects of cytokinesis via interactions with retinitis pigmentosa 2 (RP2). It has been proposed that RP2 functions in concert with Arl3 to link the cell membrane and the cytoskeleton in photoreceptors as part of the cell signaling or vesicular transport machinery. In mice, the absence of Arl3 is associated with abnormal epithelial cell proliferation and cyst formation. 174
24008 133356 cd04156 ARLTS1 Arf-like tumor suppressor gene 1 (ARLTS1 or Arl11). ARLTS1 (Arf-like tumor suppressor gene 1), also known as Arl11, is a member of the Arf family of small GTPases that is believed to play a major role in apoptotic signaling. ARLTS1 is widely expressed and functions as a tumor suppressor gene in several human cancers. ARLTS1 is a low-penetrance suppressor that accounts for a small percentage of familial melanoma or familial chronic lymphocytic leukemia (CLL). ARLTS1 inactivation seems to occur most frequently through biallelic down-regulation by hypermethylation of the promoter. In breast cancer, ARLTS1 alterations were typically a combination of a hypomorphic polymorphism plus loss of heterozygosity. In a case of thyroid adenoma, ARLTS1 alterations were polymorphism plus promoter hypermethylation. The nonsense polymorphism Trp149Stop occurs with significantly greater frequency in familial cancer cases than in sporadic cancer cases, and the Cys148Arg polymorphism is associated with an increase in high-risk familial breast cancer. 160
24009 206722 cd04157 Arl6 Arf-like 6 (Arl6) GTPase. Arl6 (Arf-like 6) forms a subfamily of the Arf family of small GTPases. Arl6 expression is limited to the brain and kidney in adult mice, but it is expressed in the neural plate and somites during embryogenesis, suggesting a possible role for Arl6 in early development. Arl6 is also believed to have a role in cilia or flagella function. Several proteins have been identified that bind Arl6, including Arl6 interacting protein (Arl6ip), and SEC61beta, a subunit of the heterotrimeric conducting channel SEC61p. Based on Arl6 binding to these effectors, Arl6 is also proposed to play a role in protein transport, membrane trafficking, or cell signaling during hematopoietic maturation. At least three specific homozygous Arl6 mutations in humans have been found to cause Bardet-Biedl syndrome, a disorder characterized by obesity, retinopathy, polydactyly, renal and cardiac malformations, learning disabilities, and hypogenitalism. Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily. 162
24010 206723 cd04158 ARD1 ADP-ribosylation factor domain protein 1 (ARD1). ARD1 (ADP-ribosylation factor domain protein 1) is an unusual member of the Arf family. In addition to the C-terminal Arf domain, ARD1 has an additional 46-kDa N-terminal domain that contains a RING finger domain, two predicted B-Boxes, and a coiled-coil protein interaction motif. This domain belongs to the TRIM (tripartite motif) or RBCC (RING, B-Box, coiled-coil) family. Like most Arfs, the ARD1 Arf domain lacks detectable GTPase activity. However, unlike most Arfs, the full-length ARD1 protein has significant GTPase activity due to the GAP (GTPase-activating protein) activity exhibited by the 46-kDa N-terminal domain. The GAP domain of ARD1 is specific for its own Arf domain and does not bind other Arfs. The rate of GDP dissociation from the ARD1 Arf domain is slowed by the adjacent 15 amino acids, which act as a GDI (GDP-dissociation inhibitor) domain. ARD1 is ubiquitously expressed in cells and localizes to the Golgi and to the lysosomal membrane. Two Tyr-based motifs in the Arf domain are responsible for Golgi localization, while the GAP domain controls lysosomal localization. 169
24011 206724 cd04159 Arl10_like Arf-like 9 (Arl9) and 10 (Arl10) GTPases. Arl10-like subfamily. Arl9/Arl10 was identified from a human cancer-derived EST dataset. No functional information about the subfamily is available at the current time, but crystal structures of human Arl10b and Arl10c have been solved. 159
24012 206725 cd04160 Arfrp1 Arf-related protein 1 (Arfrp1). Arfrp1 (Arf-related protein 1), formerly known as ARP, is a membrane-associated Arf family member that lacks the N-terminal myristoylation motif. Arfrp1 is mainly associated with the trans-Golgi compartment and the trans-Golgi network, where it regulates the targeting of Arl1 and the GRIP domain-containing proteins, golgin-97 and golgin-245, onto Golgi membranes. It is also involved in the anterograde transport of the vesicular stomatitis virus G protein from the Golgi to the plasma membrane, and in the retrograde transport of TGN38 and Shiga toxin from endosomes to the trans-Golgi network. Arfrp1 also inhibits Arf/Sec7-dependent activation of phospholipase D. Deletion of Arfrp1 in mice causes embryonic lethality at the gastrulation stage and apoptosis of mesodermal cells, indicating its importance in development. 168
24013 133361 cd04161 Arl2l1_Arl13_like Arl2-like protein 1 (Arl2l1) and Arl13. Arl2l1 (Arl2-like protein 1) and Arl13 form a subfamily of the Arf family of small GTPases. Arl2l1 was identified in human cells during a search for the gene(s) responsible for Bardet-Biedl syndrome (BBS). Like Arl6, the identified BBS gene, Arl2l1 is proposed to have cilia-specific functions. Arl13 is found on the X chromosome, but its expression has not been confirmed; it may be a pseudogene. 167
24014 133362 cd04162 Arl9_Arfrp2_like Arf-like 9 (Arl9)/Arfrp2-like GTPase. Arl9/Arfrp2-like subfamily. Arl9 (Arf-like 9) was first identified as part of the Human Cancer Genome Project. It maps to chromosome 4q12 and is sometimes referred to as Arfrp2 (Arf-related protein 2). This is a novel subfamily identified in human cancers that is uncharacterized to date. 164
24015 206726 cd04163 Era E. coli Ras-like protein (Era) is a multifunctional GTPase. Era (E. coli Ras-like protein) is a multifunctional GTPase found in all bacteria except some eubacteria. It binds to the 16S ribosomal RNA (rRNA) of the 30S subunit and appears to play a role in the assembly of the 30S subunit, possibly by chaperoning the 16S rRNA. It also contacts several assembly elements of the 30S subunit. Era couples cell growth with cytokinesis and plays a role in cell division and energy metabolism. Homologs have also been found in eukaryotes. Era contains two domains: the N-terminal GTPase domain and a C-terminal KH domain that is critical for RNA binding. Both domains are important for Era function. Era is functionally able to compensate for deletion of RbfA, a cold-shock adaptation protein that is required for efficient processing of the 16S rRNA. 168
24016 206727 cd04164 trmE trmE is a tRNA modification GTPase. TrmE (MnmE, ThdF, MSS1) is a 3-domain protein found in bacteria and eukaryotes. It controls modification of the uridine at the wobble position (U34) of tRNAs that read codons ending with A or G in the mixed codon family boxes. TrmE contains a GTPase domain that forms a canonical Ras-like fold. It functions as a molecular switch GTPase, and apparently uses a conformational change associated with GTP hydrolysis to promote the tRNA modification reaction, in which the conserved cysteine in the C-terminal domain is thought to function as a catalytic residue. In bacteria that are able to survive in extremely low pH conditions, TrmE regulates glutamate-dependent acid resistance. 159
24017 206728 cd04165 GTPBP1_like GTP binding protein 1 (GTPBP1)-like family includes GTPBP2. Mammalian GTP binding protein 1 (GTPBP1), GTPBP2, and nematode homologs AGP-1 and CGP-1 are GTPases whose specific functions remain unknown. In mouse, GTPBP1 is expressed in macrophages, in smooth muscle cells of various tissues and in some neurons of the cerebral cortex; GTPBP2 tissue distribution appears to overlap that of GTPBP1. In human leukemia and macrophage cell lines, expression of both GTPBP1 and GTPBP2 is enhanced by interferon-gamma (IFN-gamma). The chromosomal location of both genes has been identified in humans, with GTPBP1 located in chromosome 22q12-13.1 and GTPBP2 located in chromosome 6p21-12. Human glioblastoma multiforme (GBM), a highly-malignant astrocytic glioma and the most common cancer in the central nervous system, has been linked to chromosomal deletions and a translocation on chromosome 6. The GBM translocation results in a fusion of GTPBP2 and PTPRZ1, a protein involved in oligodendrocyte differentiation, recovery, and survival. This fusion product may contribute to the onset of GBM. 224
24018 206729 cd04166 CysN_ATPS CysN, together with protein CysD, forms the ATP sulfurylase (ATPS) complex. CysN_ATPS subfamily. CysN, together with protein CysD, forms the ATP sulfurylase (ATPS) complex in some bacteria and lower eukaryotes. ATPS catalyzes the production of adenosine 5'-phosphosulfate (APS) and pyrophosphate (PPi) from ATP and sulfate. CysD, which catalyzes ATP hydrolysis, is a member of the ATP pyrophosphatase (ATP PPase) family. CysN hydrolysis of GTP is required for CysD hydrolysis of ATP; however, CysN hydrolysis of GTP is not dependent on CysD hydrolysis of ATP. CysN is an example of lateral gene transfer followed by acquisition of new function. In many organisms, an ATPS exists which is not GTP-dependent and shares no sequence or structural similarity to CysN. 209
24019 206730 cd04167 Snu114p Snu114p, a spliceosome protein, is a GTPase. Snu114p subfamily. Snu114p is one of several proteins that make up the U5 small nuclear ribonucleoprotein (snRNP) particle. U5 is a component of the spliceosome, which catalyzes the splicing of pre-mRNA to remove introns. Snu114p is homologous to EF-2, but typically contains an additional N-terminal domain not found in EF-2. This protein is part of the GTP translation factor family and the Ras superfamily, characterized by five G-box motifs. 213
24020 206731 cd04168 TetM_like Tet(M)-like family includes Tet(M), Tet(O), Tet(W), and OtrA, containing tetracycline resistant proteins. Tet(M), Tet(O), Tet(W), and OtrA are tetracycline resistance genes found in Gram-positive and Gram-negative bacteria. Tetracyclines inhibit protein synthesis by preventing aminoacyl-tRNA from binding to the ribosomal acceptor site. This subfamily contains tetracycline resistance proteins that function through ribosomal protection and are typically found on mobile genetic elements, such as transposons or plasmids, and are often conjugative. Ribosomal protection proteins are homologous to the elongation factors EF-Tu and EF-G. EF-G and Tet(M) compete for binding on the ribosomes. Tet(M) has a higher affinity than EF-G, suggesting these two proteins may have overlapping binding sites and that Tet(M) must be released before EF-G can bind. Tet(M) and Tet(O) have been shown to have ribosome-dependent GTPase activity. These proteins are part of the GTP translation factor family, which includes EF-G, EF-Tu, EF2, LepA, and SelB. 237
24021 206732 cd04169 RF3 Release Factor 3 (RF3) protein involved in the termination step of translation in bacteria. Peptide chain release factor 3 (RF3) is a protein involved in the termination step of translation in bacteria. Termination occurs when class I release factors (RF1 or RF2) recognize the stop codon at the A-site of the ribosome and activate the release of the nascent polypeptide. The class II release factor RF3 then initiates the release of the class I RF from the ribosome. RF3 binds to the RF/ribosome complex in the inactive (GDP-bound) state. GDP/GTP exchange occurs, followed by the release of the class I RF. Subsequent hydrolysis of GTP to GDP triggers the release of RF3 from the ribosome. RF3 also enhances the efficiency of class I RFs at less preferred stop codons and at stop codons in weak contexts. 268
24022 206733 cd04170 EF-G_bact Elongation factor G (EF-G) family. Translocation is mediated by EF-G (also called translocase). The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains only bacterial members. 268
24023 206734 cd04171 SelB SelB, the dedicated elongation factor for delivery of selenocysteinyl-tRNA to the ribosome. SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine charged tRNAsec, which has a UCA anticodon, in an EF-Tu like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli SelB binds GTP, selenocysteyl-tRNAsec, and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains bacterial SelBs, as well as one from archaea. 170
24024 206735 cd04172 Rnd3_RhoE_Rho8 Rnd3/RhoE/Rho8 GTPases. Rnd3/RhoE/Rho8 subfamily. Rnd3/RhoE/Rho8 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd2/Rho7. Rnd3/RhoE is known to bind the serine-threonine kinase ROCK I. Unphosphorylated Rnd3/RhoE associates primarily with membranes, but ROCK I-phosphorylated Rnd3/RhoE localizes in the cytosol. Phosphorylation of Rnd3/RhoE correlates with its activity in disrupting RhoA-induced stress fibers and inhibiting Ras-induced fibroblast transformation. In cells that lack stress fibers, such as macrophages and monocytes, Rnd3/RhoE induces a redistribution of actin, causing morphological changes in the cell. In addition, Rnd3/RhoE has been shown to inhibit cell cycle progression in G1 phase at a point upstream of the pRb family pocket protein checkpoint. Rnd3/RhoE has also been shown to inhibit Ras- and Raf-induced fibroblast transformation. In mammary epithelial tumor cells, Rnd3/RhoE regulates the assembly of the apical junction complex and tight junction formation. Rnd3/RhoE is underexpressed in prostate cancer cells both in vitro and in vivo; re-expression of Rnd3/RhoE suppresses cell cycle progression and increases apoptosis, suggesting it may play a role in tumor suppression. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 182
24025 206736 cd04173 Rnd2_Rho7 Rnd2/Rho7 GTPases. Rnd2/Rho7 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd3/RhoE/Rho8. Rnd2/Rho7 is transiently expressed in radially migrating cells in the brain while they are within the subventricular zone of the hippocampus and cerebral cortex. These migrating cells typically develop into pyramidal neurons. Cells that exogenously expressed Rnd2/Rho7 failed to migrate to upper layers of the brain, suggesting that Rnd2/Rho7 plays a role in the radial migration and morphological changes of developing pyramidal neurons, and that Rnd2/Rho7 degradation is necessary for proper cellular migration. The Rnd2/Rho7 GEF Rapostlin is found primarily in the brain and together with Rnd2/Rho7 induces dendrite branching. Unlike Rnd1/Rho6 and Rnd3/RhoE/Rho8, which are RhoA antagonists, Rnd2/Rho7 binds the GEF Pragmin and significantly stimulates RhoA activity and Rho-A mediated cell contraction. Rnd2/Rho7 is also found to be expressed in spermatocytes and early spermatids, with male-germ-cell Rac GTPase-activating protein (MgcRacGAP), where it localizes to the Golgi-derived pro-acrosomal vesicle. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 221
24026 206737 cd04174 Rnd1_Rho6 Rnd1/Rho6 GTPases. Rnd1/Rho6 is a member of the novel Rho subfamily Rnd, together with Rnd2/Rho7 and Rnd3/RhoE/Rho8. Rnd1/Rho6 binds GTP but does not hydrolyze it to GDP, indicating that it is constitutively active. In rat, Rnd1/Rho6 is highly expressed in the cerebral cortex and hippocampus during synapse formation, and plays a role in spine formation. Rnd1/Rho6 is also expressed in the liver and in endothelial cells, and is upregulated in uterine myometrial cells during pregnancy. Like Rnd3/RhoE/Rho8, Rnd1/Rho6 is believed to function as an antagonist to RhoA. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 232
24027 133375 cd04175 Rap1 Rap1 family GTPase consists of Rap1a and Rap1b isoforms. The Rap1 subgroup is part of the Rap subfamily of the Ras family. It can be further divided into the Rap1a and Rap1b isoforms. In humans, Rap1a and Rap1b share 95% sequence homology, but are products of two different genes located on chromosomes 1 and 12, respectively. Rap1a is sometimes called smg p21 or Krev1 in the older literature. Rap1 proteins are believed to perform different cellular functions, depending on the isoform, its subcellular localization, and the effector proteins it binds. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and the microsomal membrane of pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. High expression of Rap1 has been observed in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines; interestingly, in the SCCs, the active GTP-bound form localized to the nucleus, while the inactive GDP-bound form localized to the cytoplasm. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap1a, which is stimulated by T-cell receptor (TCR) activation, is a positive regulator of T cells by directing integrin activation and augmenting lymphocyte responses. In murine hippocampal neurons, Rap1b determines which neurite will become the axon and directs the recruitment of Cdc42, which is required for formation of dendrites and axons. In murine platelets, Rap1b is required for normal homeostasis in vivo and is involved in integrin activation. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 164
24028 133376 cd04176 Rap2 Rap2 family GTPase consists of Rap2a, Rap2b, and Rap2c. The Rap2 subgroup is part of the Rap subfamily of the Ras family. It consists of Rap2a, Rap2b, and Rap2c. Both isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK) are putative effectors of Rap2 in mediating the activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton. In human platelets, Rap2 was shown to interact with the cytoskeleton by binding the actin filaments. In embryonic Xenopus development, Rap2 is necessary for the Wnt/beta-catenin signaling pathway. The Rap2 interacting protein 9 (RPIP9) is highly expressed in human breast carcinomas and correlates with a poor prognosis, suggesting a role for Rap2 in breast cancer oncogenesis. Rap2b, but not Rap2a, Rap2c, Rap1a, or Rap1b, is expressed in human red blood cells, where it is believed to be involved in vesiculation. A number of additional effector proteins for Rap2 have been identified, including the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 163
24029 133377 cd04177 RSR1 RSR1/Bud1p family GTPase. RSR1/Bud1p is a member of the Rap subfamily of the Ras family that is found in fungi. In budding yeasts, RSR1 is involved in selecting a site for bud growth on the cell cortex, which directs the establishment of cell polarization. The Rho family GTPase cdc42 and its GEF, cdc24, then establish an axis of polarized growth by organizing the actin cytoskeleton and secretory apparatus at the bud site. It is believed that cdc42 interacts directly with RSR1 in vivo. In filamentous fungi, polar growth occurs at the tips of hyphae and at novel growth sites along the extending hyphae. In Ashbya gossypii, RSR1 is a key regulator of hyphal growth, localizing at the tip region and regulating apical polarization of the actin cytoskeleton. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 168
24030 206753 cd04178 Nucleostemin_like A circularly permuted subfamily of the Ras GTPases. Nucleostemin (NS) is a nucleolar protein that functions as a regulator of cell growth and proliferation in stem cells and in several types of cancer cells, but is not expressed in the differentiated cells of most mammalian adult tissues. NS shuttles between the nucleolus and nucleoplasm bidirectionally at a rate that is fast and independent of cell type. Lowering GTP levels decreases the nucleolar retention of NS, and expression of NS is abruptly down-regulated during differentiation prior to terminal cell division. Found only in eukaryotes, NS consists of an N-terminal basic domain, a coiled-coil domain, a GTP-binding domain, an intermediate domain, and a C-terminal acidic domain. Experimental evidence indicates that NS uses its GTP-binding property as a molecular switch to control the transition between the nucleolus and nucleoplasm, and this process involves interaction between the basic, GTP-binding, and intermediate domains of the protein. 171
24031 133022 cd04179 DPM_DPG-synthase_like DPM_DPG-synthase_like is a member of the Glycosyltransferase 2 superfamily. DPM1 is the catalytic subunit of eukaryotic dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. In higher eukaryotes, the enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. In lower eukaryotes, such as Saccharomyces cerevisiae and Trypanosoma brucei, DPM synthase consists of a single component (Dpm1p and TbDpm1, respectively) that possesses one predicted transmembrane region near the C terminus for anchoring to the ER membrane. In contrast, the Dpm1 homologues of higher eukaryotes, namely fission yeast, fungi, and animals, have no transmembrane region, suggesting the existence of adapter molecules for membrane anchoring. This family also includes bacteria and archaea DPM1_like enzymes. However, the enzyme structure and mechanism of function are not well understood. The UDP-glucose:dolichyl-phosphate glucosyltransferase (DPG_synthase) is a transmembrane-bound enzyme of the endoplasmic reticulum involved in protein N-linked glycosylation. This enzyme catalyzes the transfer of glucose from UDP-glucose to dolichyl phosphate. This protein family belongs to Glycosyltransferase 2 superfamily. 185
24032 133023 cd04180 UGPase_euk_like Eukaryotic UGPase-like includes UGPase and UDPGlcNAc pyrophosphorylase enzymes. This family includes UDP-Glucose Pyrophosphorylase (UGPase) and UDPGlcNAc pyrophosphorylase enzymes. The two enzymes share significant sequence and structure similarity. UDP-Glucose Pyrophosphorylase catalyzes a reversible production of UDP-Glucose and pyrophosphate (PPi) from Glucose-1-phosphate and UTP. UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UDP-N-acetylglucosamine (UDPGlcNAc) pyrophosphorylase (UAP) (also named GlcNAc1P uridyltransferase), catalyzes the reversible conversion of UTP and GlcNAc1P to PPi and UDPGlcNAc, which is a key precursor of N- and O-linked glycosylations and is essential for the synthesis of chitin (a major component of the fungal cell wall) and of the glycosylphosphatidylinositol (GPI) linker anchoring a variety of cell surface proteins to the plasma membrane. In bacteria, UDPGlcNAc represents an essential precursor for both peptidoglycan and lipopolysaccharide biosynthesis. 266
24033 133024 cd04181 NTP_transferase NTP_transferases catalyze the transfer of nucleotides onto phosphosugars. Nucleotidyltransferases transfer nucleotides onto phosphosugars. The enzyme family includes Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase. The products are activated sugars that are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides. 217
24034 133025 cd04182 GT_2_like_f GT_2_like_f is a subfamily of the glycosyltransferase family 2 (GT-2) with unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 186
24035 133026 cd04183 GT2_BcE_like GT2_BcbE_like is likely involved in the biosynthesis of the polysaccharide capsule. GT2_BcbE_like: The bcbE gene is one of the genes in the capsule biosynthetic locus of Pasteurella multocida. Its deduced product is likely involved in the biosynthesis of the polysaccharide capsule, which is found on the surface of a wide range of bacteria. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. 231
24036 133027 cd04184 GT2_RfbC_Mx_like Myxococcus xanthus RfbC like proteins are required for O-antigen biosynthesis. The rfbC gene encodes a predicted protein of 1,276 amino acids, which is required for O-antigen biosynthesis in Myxococcus xanthus. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. 202
24037 133028 cd04185 GT_2_like_b Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 202
24038 133029 cd04186 GT_2_like_c Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 166
24039 133030 cd04187 DPM1_like_bac Bacterial DPM1_like enzymes are related to eukaryotic DPM1. A family of bacterial enzymes related to eukaryotic DPM1. Although the mechanism of the eukaryotic enzyme is well studied, the mechanism of the bacterial enzymes is not well understood. The eukaryotic DPM1 is the catalytic subunit of eukaryotic Dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. The enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. This protein family belongs to Glycosyltransferase 2 superfamily. 181
24040 133031 cd04188 DPG_synthase DPG_synthase is involved in protein N-linked glycosylation. UDP-glucose:dolichyl-phosphate glucosyltransferase (DPG_synthase) is a transmembrane-bound enzyme of the endoplasmic reticulum involved in protein N-linked glycosylation. This enzyme catalyzes the transfer of glucose from UDP-glucose to dolichyl phosphate. 211
24041 133032 cd04189 G1P_TT_long G1P_TT_long represents the long form of glucose-1-phosphate thymidylyltransferase. This family is the long form of Glucose-1-phosphate thymidylyltransferase. Glucose-1-phosphate thymidylyltransferase catalyses the formation of dTDP-glucose from dTTP and glucose 1-phosphate. It is the first enzyme in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. There are two forms of Glucose-1-phosphate thymidylyltransferase in bacteria and archaea: a short form and a long form. The long form, which has an extra 50 amino acids at the C-terminus, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced. The long form enzymes also have a left-handed parallel helix domain at the C-terminus, whereas the short form enzymes do not have this domain. The homotetrameric, feedback inhibited short form is found in numerous bacterial species that produce dTDP-L-rhamnose. 236
24042 133033 cd04190 Chitin_synth_C C-terminal domain of Chitin Synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin. Chitin synthase, also called UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase, catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of GlcNAc residues formed by covalent beta-1,4 linkages. Chitin is an important component of the cell wall of fungi and bacteria and it is synthesized on the cytoplasmic surface of the cell membrane by membrane bound chitin synthases. Studies with fungi have revealed that most of them contain more than one chitin synthase gene. At least five subclasses of chitin synthases have been identified. 244
24043 133034 cd04191 Glucan_BSP_MdoH Glucan_BSP_MdoH catalyzes the elongation of beta-1,2 polyglucose chains of glucan. Periplasmic Glucan Biosynthesis protein MdoH is a glucosyltransferase that catalyzes the elongation of beta-1,2 polyglucose chains of glucan, requiring a beta-glucoside as a primer and UDP-glucose as a substrate. Glucans are composed of 5 to 10 units of glucose forming a highly branched structure, where beta-1,2-linked glucose constitutes a linear backbone to which branches are attached by beta-1,6 linkages. In Escherichia coli, glucans are located in the periplasmic space, functioning as regulator of osmolarity. It is synthesized at a maximum when cells are grown in a medium with low osmolarity. It has been shown to span the cytoplasmic membrane. 254
24044 133035 cd04192 GT_2_like_e Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 229
24045 133036 cd04193 UDPGlcNAc_PPase UDPGlcNAc pyrophosphorylase catalyzes the synthesis of UDPGlcNAc. UDP-N-acetylglucosamine (UDPGlcNAc) pyrophosphorylase (UAP) (also named GlcNAc1P uridyltransferase), catalyzes the reversible conversion of UTP and GlcNAc1P to PPi and UDPGlcNAc. UDP-N-acetylglucosamine (UDPGlcNAc), the activated form of GlcNAc, is a key precursor of N- and O-linked glycosylations. It is essential for the synthesis of chitin (a major component of the fungal cell wall) and of the glycosylphosphatidylinositol (GPI) linker which anchors a variety of cell surface proteins to the plasma membrane. In bacteria, UDPGlcNAc represents an essential precursor for both peptidoglycan and lipopolysaccharide biosynthesis. Human UAP has two isoforms, resulting from alternative splicing of a single gene and differing by the presence or absence of 17 amino acids. UDPGlcNAc pyrophosphorylase shares significant sequence and structure conservation with UDPglucose pyrophosphorylase. 323
24046 133037 cd04194 GT8_A4GalT_like A4GalT_like proteins catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The members of this family of glycosyltransferases catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The enzymes exhibit broad substrate specificities. The known functions found in this family include: Alpha-1,4-galactosyltransferase, LOS-alpha-1,3-D-galactosyltransferase, UDP-glucose:(galactosyl) LPS alpha1,2-glucosyltransferase, UDP-galactose: (glucosyl) LPS alpha1,2-galactosyltransferase, and UDP-glucose:(glucosyl) LPS alpha1,2-glucosyltransferase. Alpha-1,4-galactosyltransferase from N. meningitidis adds an alpha-galactose from UDP-Gal (the donor) to a terminal lactose (the acceptor) of the LOS structure of outer membrane. LOSs are virulence factors that enable the organism to evade the immune system of host cells. In E. coli, the three alpha-1,2-glycosyltransferases, that are involved in the synthesis of the outer core region of the LPS, are all members of this family. The three enzymes share 40 % of sequence identity, but have different sugar donor or acceptor specificities, representing the structural diversity of LPS. 248
24047 133038 cd04195 GT2_AmsE_like GT2_AmsE_like is involved in exopolysaccharide amylovoran biosynthesis. AmsE is a glycosyltransferase involved in exopolysaccharide amylovoran biosynthesis in Erwinia amylovora. Amylovoran is one of the three exopolysaccharides produced by E. amylovora. Amylovoran-deficient mutants are non-pathogenic. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. 201
24048 133039 cd04196 GT_2_like_d Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 214
24049 133040 cd04197 eIF-2B_epsilon_N The N-terminal domain of epsilon subunit of the eIF-2B is a subfamily of glycosyltransferase 2. N-terminal domain of epsilon subunit of the eukaryotic translation initiation factor 2B (eIF-2B): eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. EIF2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit epsilon shares sequence similarity with gamma subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex. 217
24050 133041 cd04198 eIF-2B_gamma_N The N-terminal domain of gamma subunit of the eIF-2B is a subfamily of glycosyltransferase 2. N-terminal domain of gamma subunit of the eukaryotic translation initiation factor 2B (eIF-2B): eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. EIF2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit gamma shares sequence similarity with epsilon subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex. 214
24051 259862 cd04199 CuRO_1_ceruloplasmin_like Cupredoxin domains 1, 3, and 5 of ceruloplasmin and similar proteins. This family includes the first, third, and fifth cupredoxin domains of ceruloplasmin and similar proteins including the first, third and fifth cupredoxin domains of unprocessed coagulation factors V and VIII. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. It functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. Human Factor VIII facilitates blood clotting by acting as a cofactor for factor IXa. Factor VIII and IXa form a complex in the presence of Ca2+ and phospholipids that converts factor X to the activated form Xa. 177
24052 259863 cd04200 CuRO_2_ceruloplasmin_like Cupredoxin domains 2, 4, and 6 of ceruloplasmin and similar proteins. This family includes the second, fourth and sixth cupredoxin domains of ceruloplasmin and similar proteins, including the second, fourth, and sixth cupredoxin domains of unprocessed coagulation factors V and VIII. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. Ceruloplasmin also functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. Human Factor VIII facilitates blood clotting by acting as a cofactor for factor IXa. Factor VIII and IXa form a complex in the presence of Ca2+ and phospholipids that converts factor X to the activated form Xa. 141
24053 259864 cd04201 CuRO_1_CuNIR_like Cupredoxin domain 1 of Copper-containing nitrite reductase and two-domain laccase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center, which serves as the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles two domain nitrite reductase in both sequence homology and structure similarity. It consists of two domains and forms trimers and hence resembles the quaternary structure of nitrite reductases more than that of larger laccases. 120
24054 259865 cd04202 CuRO_D2_2dMcoN_like The second cupredoxin domain of bacterial two domain multicopper oxidase McoN and similar proteins. This family includes bacterial two domain multicopper oxidases (2dMCOs) represented by the McoN from Nitrosomonas europaea. McoN is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. The biological function of McoN has not been characterized. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. 138
24055 259866 cd04203 Cupredoxin_like_3 Uncharacterized subfamily of Cupredoxin. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins, having an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Majority of family members contain multiple cupredoxin domain repeats: ceruloplasmin and coagulation factors V/VIII have six repeats; laccase, ascorbate oxidase, and spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II. Proteins of this uncharacterized subfamily contain a single cupredoxin domain. 84
24056 259867 cd04204 Pseudoazurin_like Small blue copper proteins including pseudocyanin, plastocyanin, halocyanin and amicyanin. The Pseudocyanin-like family of copper-binding proteins (or blue (type 1) copper domain) is a family of small proteins that bind a single copper atom and are characterized by an intense electronic absorption band near 600 nm. Pseudoazurin (PAz) has been identified as an electron donor in the denitrification pathway. For example, PAz acts as an electron donor to cytochrome c peroxidase and N2OR from Paracoccus pantotrophus (Pp), and to the copper containing nitrite reductase (NiR) that catalyzes the second step of denitrification. Plastocyanin is found in cyanobacteria, higher plants, and some algae where it plays a role in photosynthesis. Plastocyanin is responsible for transporting electrons from PSII to PSI. This family also includes halocyanins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis) and amicyanin found in the bacterium Paracoccus denitrificans. 92
24057 259868 cd04205 CuRO_2_LCC_like Cupredoxin domain 2 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 152
24058 259869 cd04206 CuRO_1_LCC_like Cupredoxin domain 1 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) in this family contain three cupredoxin domains. They are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites; Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. Also included in this family are cupredoxin domains 1, 3, and 5 of the 6-domain MCO ceruloplasmin and similar proteins. 120
24059 259870 cd04207 CuRO_3_LCC_like Cupredoxin domain 3 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) in this family contain three cupredoxin domains. They are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites; Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. Also included in this family are cupredoxin domains 2, 4, and 6 of the 6-domain MCO ceruloplasmin and similar proteins. 132
24060 259871 cd04208 CuRO_2_CuNIR Cupredoxin domain 2 of Copper-containing nitrite reductase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center in the protein. The type 2 copper center of a copper nitrite reductase is the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis. 143
24061 259872 cd04210 Cupredoxin_like_1 Uncharacterized Cupredoxin-like subfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins because they have an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Majority of family members contain multiple cupredoxin domain repeats; ceruloplasmin and coagulation factors V/VIII have six repeats; Laccase, ascorbate oxidase, and spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II. 111
24062 259873 cd04211 Cupredoxin_like_2 Uncharacterized Cupredoxin-like subfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins because they have an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Majority of family members contain multiple cupredoxin domain repeats; ceruloplasmin and coagulation factors V/VIII have six repeats; Laccase, ascorbate oxidase, and spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II. 110
24063 259874 cd04212 CuRO_UO_II The cupredoxin domain of Ubiquinol oxidase subunit II. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits in ubiquinol oxidase varies from two to five. Although subunit II of ubiquinol oxidase lacks the binuclear CuA site found in cytochrome c oxidases, the structure is conserved. 99
24064 259875 cd04213 CuRO_CcO_Caa3_II The cupredoxin domain of Caa3 type Cytochrome c oxidase subunit II. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of most bacteria, is a multi-chain transmembrane protein located in the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Caa3 type of CcO Subunit II contains a copper-copper binuclear site called CuA, which is believed to be involved in electron transfer from cytochrome c to the cytochromes a, a3 and CuB active site in subunit I. 103
24065 259876 cd04214 PAD_N N-terminal non-catalytic domain of protein-arginine deiminase. The N-terminal non-catalytic domain of protein-arginine deiminase has a cupredoxin-like fold, but lacks the Cu binding site. PAD (protein-arginine deiminase) and protein L-arginine iminohydrolase catalyze the conversion of protein arginine residues to citrulline residues post-translationally in a process called citrullination. The modification plays crucial regulatory roles in development and cell differentiation. 108
24066 259877 cd04215 Nitrosocyanin Nitrosocyanin (NC) is a mononuclear red copper protein. Nitrosocyanin (NC) is isolated from the ammonia oxidizing bacterium Nitrosomonas europaea. Nitrosocyanin exhibits remote sequence homology to classic blue copper proteins; its spectroscopic and electrochemical properties are different. The structure of NC is a trimer of single domain cupredoxins. Nitrosocyanin may mediate electron transfer. It could have a novel role as a nitric oxide dehydrogenase or a nitric oxide reductase in the oxidation of ammonia. 107
24067 259878 cd04216 Phytocyanin Phytocyanins are plant blue or type I copper proteins. Phytocyanins are plant blue or type I copper proteins. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Phytocyanins are classified into four groups: stellacyanin, plantacyanin, uclacyanin and early nodulin groups. Stellacyanin appears to be associated with the plant cell wall; it may be involved in oxidative reactions to build polymeric material making up the cell wall. Plantacyanin is shown to play a role in reproduction in Arabidopsis. Plantacyanins may also be stress-related proteins and may be involved in plant defense responses. The early nodulin-like protein (OsENODL1) from Oryza sativa is expressed specifically at the late developmental stage of the seeds. 98
24068 259879 cd04217 Cupredoxin_Fibrocystin-L_like Cupredoxin domain of PKHDL1, a homolog of the autosomal recessive polycystic kidney disease protein. One member of this family is Fibrocystin-L, a homolog of the autosomal recessive polycystic kidney disease protein PKHD1. Human fibrocystin-L is predicted to be a large receptor protein (466 kDa) with a signal peptide, a single transmembrane domain and a short cytoplasmic tail. Fibrocystin-L is widely expressed at a low level in most tissues but is up-regulated specifically in T lymphocytes following activation signals. It may play roles in immunity. 86
24069 259880 cd04218 Pseudoazurin Pseudoazurin (Paz) is a type I blue copper electron-transfer protein. Pseudoazurin (PAz) has been identified as an electron donor to the denitrification pathway. For example, PAz acts as an electron donor to cytochrome c peroxidase and N2OR from Paracoccus pantotrophus (Pp), and to the copper containing nitrite reductase (NiR) that catalyzes the second step of denitrification. It has been shown that pseudoazurin dramatically enhances the reaction profile of nitrite reduction by Paracoccus pantotrophus cytochrome cd1 and facilitates release of the product nitric oxide. The ability of this small redox protein to interact with a multitude of structurally different partners has been attributed to the hydrophobic character of the binding surface. 117
24070 259881 cd04219 Plastocyanin Plastocyanin is a type I copper protein and functions in the electron transfer from PSII to PSI. Plastocyanin is a small copper-containing protein found in cyanobacteria, higher plants, and some algae, where it plays a role in photosynthesis. The two photosystems that are primarily responsible for photosynthesis are photosystem I (PSI) and photosystem II (PSII). The flow of electrons begins in PSII, which acts as a proton pump. Plastocyanin is responsible for transporting electrons from PSII to PSI. 97
24071 259882 cd04220 Halocyanin Halocyanin is an archaea blue (type I) copper redox protein. Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis). Halocyanin may serve as a mobile electron carrier at a peripheral membrane protein. The copper-binding domain is present only once in some halocyanins and is duplicated in others. 92
24072 259883 cd04221 MauL Methylamine utilization protein MauL. MauL is one of the products from the methylamine utilization gene cluster in Methylobacterium extorquens AM1. Mutants generated by insertions in mauL were not able to grow on methylamine or any other primary amine as carbon sources. MauL belongs to the blue or type I copper protein family. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. 83
24073 259884 cd04222 CuRO_1_ceruloplasmin The first cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first cupredoxin domain of ceruloplasmin. 183
24074 259885 cd04223 N2OR_C The C-terminal cupredoxin domain of Nitrous-oxide reductase. Nitrous-oxide reductase participates in nitrogen metabolism and catalyzes the last step in dissimilatory nitrate reduction, the two-electron reduction of N2O to N2. It contains copper ions as cofactors in the form of a binuclear CuA center at the site of electron entry and a tetranuclear CuZ centre at the active site. The C-terminus of Nitrous-oxide reductase is a cupredoxin domain. 95
24075 259886 cd04224 CuRO_3_ceruloplasmin The third cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the third cupredoxin domain of ceruloplasmin. 197
24076 259887 cd04225 CuRO_5_ceruloplasmin The fifth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the fifth cupredoxin domain of ceruloplasmin. 171
24077 259888 cd04226 CuRO_1_FV_like The first cupredoxin domain of coagulation factor VIII and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 1 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 165
24078 259889 cd04227 CuRO_3_FVIII_like The third cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 3 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins. 177
24079 259890 cd04228 CuRO_5_FVIII_like The fifth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 5 of unprocessed Factor VIII or the first cupredoxin domain of the light chain of circulating Factor VIII, and similar proteins. 169
24080 259891 cd04229 CuRO_1_Ceruloplasmin_like_1 cupredoxin domain of ceruloplasmin homologs. Uncharacterized subfamily of ceruloplasmin homologous proteins. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. Ceruloplasmin also functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first domain of the triplicated units. 175
24081 259892 cd04230 Sulfocyanin Sulfocyanin is a blue copper protein in archaebacterium Sulfolobus acidocaldarius. Sulfocyanin is a blue copper protein with a putative membrane anchoring hydrophobic motif at the N-terminus. It may substitute for cytochrome C in electron transfer reactions in archaea. 143
24082 259893 cd04231 Rusticyanin Rusticyanin is a cupredoxin in archaea and proteobacteria. Rusticyanin is a copper-containing protein which is involved in electron-transfer. The members of this family are found in archaea and proteobacteria. It is a cupredoxin, or blue-copper protein due to its color. Rusticyanin, extracted from the bacterium Thiobacillus ferrooxidans, is redox active down to pH 2.0 and the acid-stable cytochrome c is the primary acceptor of the electron. This organism can grow on Fe2+ as its sole energy source. Rusticyanin is thought to be a principal component in the iron respiratory electron transport chain of T. ferrooxidans. 127
24083 259894 cd04232 CuRO_1_CueO_FtsP The first Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the first domain. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. FtsP does not contain any copper binding sites. 120
24084 259895 cd04233 Auracyanin Auracyanins A and B and similar proteins. This subfamily includes both auracyanins A and B from the photosynthetic bacterium Chloroflexus aurantiacus and similar proteins. Auracyanins A and B are very similar blue copper proteins with 38% sequence identity and are homologous to the bacterial redox protein Azurin. However, auracyanin A is expressed only when C. aurantiacus cells are grown in light, whereas auracyanin B is expressed in both dark and light conditions. Thus, auracyanin A may function as a redox partner in photosynthesis, while auracyanin B may function in aerobic respiration. 121
24085 239767 cd04234 AAK_AK AAK_AK: Amino Acid Kinase Superfamily (AAK), Aspartokinase (AK); this CD includes the N-terminal catalytic domain of aspartokinase (4-L-aspartate-4-phosphotransferase;). AK is the first enzyme in the biosynthetic pathway of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. It also catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. One mechanism for the regulation of this pathway is by the production of several isoenzymes of aspartokinase with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli, three different aspartokinase isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback-inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas, the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta- subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). 
At least two distinct AK isoenzymes can occur in higher plants, one is a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this CD is the catalytic domain of the Methylomicrobium alcaliphilum ectoine AK, the first enzyme of the ectoine biosynthetic pathway, found in this bacterium, and several other halophilic/halotolerant bacteria. 227
24086 239768 cd04235 AAK_CK AAK_CK: Carbamate kinase (CK) catalyzes both the ATP-phosphorylation of carbamate and carbamoyl phosphate (CP) utilization with the production of ATP from ADP and CP. Both CK (this CD) and nonhomologous CP synthetase synthesize carbamoyl phosphate, an essential precursor of arginine and pyrimidine bases, in the presence of ATP, bicarbonate, and ammonia. CK is a homodimer of 33 kDa subunits and is a member of the Amino Acid Kinase Superfamily (AAK). 308
24087 239769 cd04236 AAK_NAGS-Urea AAK_NAGS-Urea: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the urea cycle found in animals. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate; NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Ureogenic NAGS activity is dependent on the concentration of glutamate (substrate) and arginine (activator). Domain architecture of ureogenic NAGS consists of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal DUF619 domain. Members of this CD belong to the protein superfamily, the Amino Acid Kinase Family (AAKF). 271
24088 239770 cd04237 AAK_NAGS-ABP AAK_NAGS-ABP: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the arginine-biosynthesis pathway (ABP) found in gamma- and beta-proteobacteria and higher plant chloroplasts. Domain architecture of these NAGS consisted of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal NAG synthase, acetyltransferase (ArgA) domain. Both bacterial and plant sequences in this CD have a conserved N-terminal extension; a similar sequence in the NAG kinases of the cyclic arginine-biosynthesis pathway has been implicated in feedback inhibition sensing. Plant sequences also have an N-terminal chloroplast transit peptide and an insert (approx. 70 residues) in the C-terminal region of ArgB. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 280
24089 239771 cd04238 AAK_NAGK-like AAK_NAGK-like: N-Acetyl-L-glutamate kinase (NAGK)-like. Included in this CD are the Escherichia coli and Pseudomonas aeruginosa type NAGKs which catalyze the phosphorylation of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in bacteria and photosynthetic organisms using either the acetylated, noncyclic (NC), or non-acetylated, cyclic (C) route of ornithine biosynthesis. Also included in this CD is a distinct group of uncharacterized (UC) bacterial and archaeal NAGKs. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 256
24090 239772 cd04239 AAK_UMPK-like AAK_UMPK-like: UMP kinase (UMPK)-like, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis. Regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinases of E. coli (Ec) and Pyrococcus furiosus (Pf) are known to function as homohexamers, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial UMPKs have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Also included in this CD are the alpha and beta subunits of the Mo storage protein (MosA and MosB) characterized as an alpha4-beta4 octamer containing an ATP-dependent, polynuclear molybdenum-oxide cluster. These and related sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 229
24091 239773 cd04240 AAK_UC AAK_UC: Uncharacterized (UC) amino acid kinase-like proteins found mainly in archaea and a few bacteria. Sequences in this CD are members of the Amino Acid Kinase (AAK) superfamily. 203
24092 239774 cd04241 AAK_FomA-like AAK_FomA-like: This CD includes a fosfomycin biosynthetic gene product, FomA, and similar proteins found in a wide range of organisms. Together, the fomA and fomB genes in the fosfomycin biosynthetic gene cluster of Streptomyces wedmorensis confer high-level fosfomycin resistance. FomA and FomB proteins converted fosfomycin to fosfomycin monophosphate and fosfomycin diphosphate in the presence of ATP and a magnesium ion, indicating that FomA and FomB catalyzed phosphorylations of fosfomycin and fosfomycin monophosphate, respectively. FomA and related sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 252
24093 239775 cd04242 AAK_G5K_ProB AAK_G5K_ProB: Glutamate-5-kinase (G5K) catalyzes glutamate-dependent ATP cleavage; G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, in the first and controlling step of proline (and, in mammals, ornithine) biosynthesis. G5K is subject to feedback allosteric inhibition by proline or ornithine. In microorganisms and plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia. Microbial G5K generally consists of two domains: a catalytic G5K domain and one PUA (pseudo uridine synthases and archaeosine-specific transglycosylases) domain, and some lack the PUA domain. G5K requires free Mg for activity, it is tetrameric, and it aggregates to higher forms in a proline-dependent way. G5K lacking the PUA domain remains tetrameric, active, and proline-inhibitable, but the Mg requirement and the proline-triggered aggregation are greatly diminished and abolished, respectively, and more proline is needed for inhibition. Although plant and animal G5Ks are part of a bifunctional polypeptide, delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5- phosphate reductase (G5PR; ProA); bacterial and yeast G5Ks are monofunctional single-polypeptide enzymes. In this CD, all three domain architectures are present: G5K, G5K+PUA, and G5K+G5PR. 251
24094 239776 cd04243 AAK_AK-HSDH-like AAK_AK-HSDH-like: Amino Acid Kinase Superfamily (AAK), AK-HSDH-like; this family includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK- homoserine dehydrogenase (HSDH). These aspartokinases are found in such bacteria as E. coli (AKI-HSDHI, ThrA and AKII-HSDHII, MetL) and in higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation. Also included in this CD is the catalytic domain of the aspartokinase (AK) of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. In E. coli, LysC is reported to be a homodimer of 50 kD subunits. Also included in this CD is the catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine. 293
24095 239777 cd04244 AAK_AK-LysC-like AAK_AK-LysC-like: Amino Acid Kinase Superfamily (AAK), AK-LysC-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive AK isoenzyme found in higher plants. The lysine-sensitive AK isoenzyme is a monofunctional protein. It is involved in the overall regulation of the aspartate pathway and can be synergistically inhibited by S-adenosylmethionine. Also included in this CD is an uncharacterized LysC-like AK found in Euryarchaeota and some bacteria. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. 298
24096 239778 cd04245 AAK_AKiii-YclM-BS AAK_AKiii-YclM-BS: Amino Acid Kinase Superfamily (AAK), AKiii-YclM-BS; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In Bacillus subtilis (BS), YclM is reported to be a single polypeptide of 50 kD. The Bacillus subtilis 168 AKIII is induced by lysine and repressed by threonine, and it is synergistically inhibited by lysine and threonine. 288
24097 239779 cd04246 AAK_AK-DapG-like AAK_AK-DapG-like: Amino Acid Kinase Superfamily (AAK), AK-DapG-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional enzymes found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species, as well as, the catalytic AK domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related isoenzymes. In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. The role of the AKI isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. In Corynebacterium glutamicum and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinase isoenzyme types found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. The B. subtilis 168 AKII aspartokinase is also described as tetrameric consisting of two alpha and two beta subunits. 
Some archaeal aspartokinases in this group lack recognizable ACT domains. 239
24098 239780 cd04247 AAK_AK-Hom3 AAK_AK-Hom3: Amino Acid Kinase Superfamily (AAK), AK-Hom3; this CD includes the N-terminal catalytic domain of the aspartokinase HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae and other related AK domains. Aspartokinase, the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single aspartokinase isoenzyme type, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies show that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. 306
24099 239781 cd04248 AAK_AK-Ectoine AAK_AK-Ectoine: Amino Acid Kinase Superfamily (AAK), AK-Ectoine; this CD includes the N-terminal catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and various other halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes' of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. De novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. 304
24100 239782 cd04249 AAK_NAGK-NC AAK_NAGK-NC: N-Acetyl-L-glutamate kinase - noncyclic (NAGK-NC) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis using the acetylated, noncyclic route of ornithine biosynthesis. There are two variants of this pathway. In one, typified by the pathway in Escherichia coli, glutamate is acetylated by acetyl-CoA and acetylornithine is deacylated hydrolytically. In this pathway, feedback inhibition by arginine occurs at the initial acetylation of glutamate and not at the phosphorylation of NAG by NAGK. Homodimeric NAGK-NC are members of the Amino Acid Kinase Superfamily (AAK). 252
24101 239783 cd04250 AAK_NAGK-C AAK_NAGK-C: N-Acetyl-L-glutamate kinase - cyclic (NAGK-C) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in some bacteria and photosynthetic organisms using the non-acetylated, cyclic route of ornithine biosynthesis. In this pathway, glutamate is first N-acetylated and then phosphorylated by NAGK to give phosphoryl NAG, which is converted to NAG-ornithine. There are two variants of this pathway. In one, typified by the pathway in Thermotoga maritima and Pseudomonas aeruginosa, the acetyl group is recycled by reversible transacetylation from acetylornithine to glutamate. The phosphorylation of NAG by NAGK is feedback inhibited by arginine. In photosynthetic organisms, NAGK is the target of the nitrogen-signaling protein PII. Hexameric formation of NAGK domains appears to be essential to both arginine inhibition and NAGK-PII complex formation. NAGK-C are members of the Amino Acid Kinase Superfamily (AAK). 279
24102 239784 cd04251 AAK_NAGK-UC AAK_NAGK-UC: N-Acetyl-L-glutamate kinase - uncharacterized (NAGK-UC). This domain is similar to Escherichia coli and Pseudomonas aeruginosa NAGKs which catalyze the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis. These uncharacterized domain sequences are found in some bacteria (Deinococci and Chloroflexi) and archaea and belong to the Amino Acid Kinase Superfamily (AAK). 257
24103 239785 cd04252 AAK_NAGK-fArgBP AAK_NAGK-fArgBP: N-Acetyl-L-glutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (fArgBP). The nuclear-encoded, mitochondrial polyprotein precursor with an N-terminal NAGK (ArgB) domain (this CD), a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved in the mitochondria into two distinct enzymes (NAGK-DUF619 and NAGPR). Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. This CD also includes some gamma-proteobacteria (Xanthomonas and Xylella) NAG kinases with an N-terminal NAGK (ArgB) domain (this CD) and a C-terminal DUF619 domain. The DUF619 domain is described as a putative distant homolog of the acetyltransferase, ArgA, predicted to function in NAG synthase association in fungi. Eukaryotic sequences have an N-terminal mitochondrial transit peptide. Members of this NAG kinase domain CD belong to the Amino Acid Kinase Superfamily (AAK). 248
24104 239786 cd04253 AAK_UMPK-PyrH-Pf AAK_UMPK-PyrH-Pf: UMP kinase (UMPK)-Pf, the mostly archaeal uridine monophosphate kinase (uridylate kinase) enzymes that catalyze UMP phosphorylation and play a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of Pyrococcus furiosus (Pf) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial UMPKs have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs (this CD) appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 221
24105 239787 cd04254 AAK_UMPK-PyrH-Ec UMP kinase (UMPK)-Ec, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of E. coli (Ec) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial and chloroplast UMPKs (this CD) have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 231
24106 239788 cd04255 AAK_UMPK-MosAB AAK_UMPK-MosAB: This CD includes the alpha and beta subunits of the Mo storage protein (MosA and MosB) which are related to uridine monophosphate kinase (UMPK) enzymes that catalyze the phosphorylation of UMP by ATP, yielding UDP, and playing a key role in pyrimidine nucleotide biosynthesis. The Mo storage protein from the nitrogen-fixing bacterium, Azotobacter vinelandii, is characterized as an alpha4-beta4 octamer containing a polynuclear molybdenum-oxide cluster which is ATP-dependent to bind Mo and pH-dependent to release Mo. These and related bacterial sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 262
24107 239789 cd04256 AAK_P5CS_ProBA AAK_P5CS_ProBA: Glutamate-5-kinase (G5K) domain of the bifunctional delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5- phosphate reductase (G5PR, ProA), the first and second enzyme catalyzing proline (and, in mammals, ornithine) biosynthesis. G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, and is subject to feedback allosteric inhibition by proline or ornithine. In plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia. 284
24108 239790 cd04257 AAK_AK-HSDH AAK_AK-HSDH: Amino Acid Kinase Superfamily (AAK), AK-HSDH; this CD includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - homoserine dehydrogenase (HSDH). These aspartokinases are found in bacteria (E. coli AKI-HSDHI, ThrA and E. coli AKII-HSDHII, MetL) and higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation. 294
24109 239791 cd04258 AAK_AKiii-LysC-EC AAK_AKiii-LysC-EC: Amino Acid Kinase Superfamily (AAK), AKiii-LysC-EC: this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKIII. AKIII is a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In E. coli, LysC is reported to be a homodimer of 50 kD subunits. 292
24110 239792 cd04259 AAK_AK-DapDC AAK_AK-DapDC: Amino Acid Kinase Superfamily (AAK), AK-DapDC; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. Aspartokinase is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine. 295
24111 239793 cd04260 AAK_AKi-DapG-BS AAK_AKi-DapG-BS: Amino Acid Kinase Superfamily (AAK), AKi-DapG; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional class enzyme found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species. In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. 244
24112 239794 cd04261 AAK_AKii-LysC-BS AAK_AKii-LysC-BS: Amino Acid Kinase Superfamily (AAK), AKii; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase isoenzyme type, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In this organism and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides. 239
24113 176265 cd04263 DUF619-NAGK-FABP DUF619 domain of N-acetylglutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway. DUF619-NAGK-FABP: DUF619 domain of N-acetylglutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (FABP). The nuclear-encoded, mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved into two distinct enzymes (NAGK-DUF619 and NAGPR) in the mitochondria. Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. Arg5,6 catalyzes the second reaction of arginine biosynthesis; the phosphorylation of the gamma-carboxyl group of NAG to produce N-acetylglutamylphosphate (NAGP) which is subsequently converted to ornithine in two more steps. It also binds and regulates the promoters of nuclear and mitochondrial genes, and may possibly regulate precursor mRNA metabolism. The DUF619 domain function has yet to be characterized. 98
24114 176266 cd04264 DUF619-NAGS DUF619 domain of various N-acetylglutamate Synthases of the fungal arginine-biosynthetic pathway and urea cycle found in humans and fish. DUF619-NAGS: This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present in C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain function has yet to be characterized. 99
24115 176267 cd04265 DUF619-NAGS-U DUF619 domain of various N-acetylglutamate Synthases (NAGS) of the urea (U) cycle of humans and fish. This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present in C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain function has yet to be characterized. 99
24116 176268 cd04266 DUF619-NAGS-FABP DUF619 domain of N-acetylglutamate Synthase of the fungal arginine-biosynthetic pathway. DUF619-NAGS-FABP: This family includes the DUF619 domain of N-acetylglutamate synthase (NAGS) of the fungal arginine-biosynthetic pathway (FABP). This NAGS (also known as arginine-requiring protein 2 or ARG2) consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. NAGS catalyzes the formation of NAG from acetylcoenzyme A and L-glutamate. The DUF619 domain, yet to be characterized, is predicted to function in NAGS association in fungi. 108
24117 239795 cd04267 ZnMc_ADAM_like Zinc-dependent metalloprotease, ADAM_like or reprolysin_like subgroup. The adamalysin_like or ADAM family of metalloproteases contains proteolytic domains from snake venoms, proteases from the mammalian reproductive tract, and the tumor necrosis factor alpha convertase, TACE. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. 192
24118 239796 cd04268 ZnMc_MMP_like Zinc-dependent metalloprotease, MMP_like subfamily. This group contains matrix metalloproteinases (MMPs), serralysins, and the astacin_like family of proteases. 165
24119 239797 cd04269 ZnMc_adamalysin_II_like Zinc-dependent metalloprotease; adamalysin_II_like subfamily. Adamalysin II is a snake venom zinc endopeptidase. This subfamily contains other snake venom metalloproteinases, as well as membrane-anchored metalloproteases belonging to the ADAM family. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. 194
24120 239798 cd04270 ZnMc_TACE_like Zinc-dependent metalloprotease; TACE_like subfamily. TACE, the tumor-necrosis factor-alpha converting enzyme, releases soluble TNF-alpha from transmembrane pro-TNF-alpha. 244
24121 239799 cd04271 ZnMc_ADAM_fungal Zinc-dependent metalloprotease, ADAM_fungal subgroup. The adamalysin_like or ADAM (A Disintegrin And Metalloprotease) family of metalloproteases are integral membrane proteases acting on a variety of extracellular targets. They are involved in shedding soluble peptides or proteins from the cell surface. This subfamily contains fungal ADAMs, whose precise function has yet to be determined. 228
24122 239800 cd04272 ZnMc_salivary_gland_MPs Zinc-dependent metalloprotease, salivary_gland_MPs. Metalloproteases secreted by the salivary glands of arthropods. 220
24123 239801 cd04273 ZnMc_ADAMTS_like Zinc-dependent metalloprotease, ADAMTS_like subgroup. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. This particular subfamily represents domain architectures that combine ADAM-like metalloproteinases with thrombospondin type-1 repeats. ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) proteinases are inhibited by TIMPs (tissue inhibitors of metalloproteinases), and they play roles in coagulation, angiogenesis, development and progression of arthritis. They hydrolyze the von Willebrand factor precursor and various components of the extracellular matrix. 207
24124 239802 cd04275 ZnMc_pappalysin_like Zinc-dependent metalloprotease, pappalysin_like subfamily. The pregnancy-associated plasma protein A (PAPP-A or pappalysin-1) cleaves insulin-like growth factor-binding proteins 4 and 5, thereby promoting cell growth by releasing bound growth factor. This model includes pappalysins and related metalloprotease domains from all three kingdoms of life. The three-dimensional structure of an archaeal representative, ulilysin, has been solved. 225
24125 239803 cd04276 ZnMc_MMP_like_2 Zinc-dependent metalloprotease; MMP_like sub-family 2. A group of bacterial metalloproteinase domains similar to matrix metalloproteinases and astacin. 197
24126 239804 cd04277 ZnMc_serralysin_like Zinc-dependent metalloprotease, serralysin_like subfamily. Serralysins and related proteases are important virulence factors in pathogenic bacteria. They may be secreted into the medium via a mechanism found in gram-negative bacteria that does not require N-terminal signal sequences which are cleaved after the transmembrane translocation. A calcium-binding domain C-terminal to the metalloprotease domain, which contains multiple tandem repeats of a nine-residue motif including the pattern GGxGxD, and which forms a parallel beta roll, may be involved in the translocation mechanism and/or substrate binding. Serralysin family members may have a broad spectrum of substrates each, including host immunoglobulins, complement proteins, cell matrix and cytoskeletal proteins, as well as antimicrobial peptides. 186
24127 239805 cd04278 ZnMc_MMP Zinc-dependent metalloprotease, matrix metalloproteinase (MMP) sub-family. MMPs are responsible for a great deal of pericellular proteolysis of extracellular matrix and cell surface molecules, playing crucial roles in morphogenesis, cell fate specification, cell migration, tissue repair, tumorigenesis, gain or loss of tissue-specific functions, and apoptosis. In many instances, they are anchored to cell membranes via trans-membrane domains, and their activity is controlled via TIMPs (tissue inhibitors of metalloproteinases). 157
24128 239806 cd04279 ZnMc_MMP_like_1 Zinc-dependent metalloprotease; MMP_like sub-family 1. A group of bacterial, archaeal, and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin. 156
24129 239807 cd04280 ZnMc_astacin_like Zinc-dependent metalloprotease, astacin_like subfamily or peptidase family M12A, a group of zinc-dependent proteolytic enzymes with a HExxH zinc-binding site/active site. Members of this family may have an amino terminal propeptide, which is cleaved to yield the active protease domain, which is consequently always found at the N-terminus in multi-domain architectures. This family includes: astacin, a digestive enzyme from crayfish; meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain; proteins involved in (bone) morphogenesis; tolloid from Drosophila; and the sea urchin SPAN protein, which may also play a role in development. 180
24130 239808 cd04281 ZnMc_BMP1_TLD Zinc-dependent metalloprotease; BMP1/TLD-like subfamily. BMP1 (Bone morphogenetic protein 1) and TLD (tolloid)-like metalloproteases play vital roles in extracellular matrix formation, by cleaving precursor proteins such as enzymes, structural proteins, and proteins involved in the mineralization of the extracellular matrix. The Drosophila protein tolloid and its Xenopus homologue xolloid cleave and inactivate Sog and chordin, respectively, which are inhibitors of Dpp (the Drosophila decapentaplegic gene product) and its homologue BMP4, involved in dorso-ventral patterning. 200
24131 239809 cd04282 ZnMc_meprin Zinc-dependent metalloprotease, meprin_like subfamily. Meprins are membrane-bound or secreted extracellular proteases, which cleave a variety of targets, including peptides such as parathyroid hormone, gastrin, and cholecystokinin, cytokines such as osteopontin, and proteins such as collagen IV, fibronectin, casein and gelatin. Meprins may also be able to release proteins from the cell surface. Closely related meprin alpha- and beta-subunits form homo- and hetero-oligomers; these complexes are found on epithelial cells of the intestine, for example, and are also expressed in certain cancer cells. 230
24132 239810 cd04283 ZnMc_hatching_enzyme Zinc-dependent metalloprotease, hatching enzyme-like subfamily. Hatching enzymes are secreted by teleost embryos to digest the egg envelope or chorion. In some teleosts, the hatching enzyme may be a system consisting of two evolutionary related metalloproteases, high choriolytic enzyme and low choriolytic enzyme (HCE and LCE), which may have different substrate specificities and cooperatively digest the chorion. 182
24133 340852 cd04299 GT35_Glycogen_Phosphorylase-like proteins similar to glycogen phosphorylase. This family is most closely related to the oligosaccharide phosphorylase domain family and other unidentified sequences. Oligosaccharide phosphorylase catalyzes the breakdown of oligosaccharides into glucose-1-phosphate units. They are important allosteric enzymes in carbohydrate metabolism. 776
24134 340853 cd04300 GT35_Glycogen_Phosphorylase glycogen phosphorylase and similar proteins. This is a family of oligosaccharide phosphorylases. It includes yeast and mammalian glycogen phosphorylases, plant starch/glucan phosphorylase, as well as the maltodextrin phosphorylases of bacteria. The members of this family catalyze the breakdown of oligosaccharides into glucose-1-phosphate units. They are important allosteric enzymes in carbohydrate metabolism. The allosteric control mechanisms of yeast and mammalian members of this family are different from that of bacterial members. The members of this family belong to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 795
24135 173926 cd04301 NAT_SF N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate. NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included. 65
24136 319798 cd04302 HAD_5NT haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to the Pseudomonas aeruginosa PA0065. 5'-nucleotidases dephosphorylate nucleoside 5'-monophosphates to nucleosides and inorganic phosphate. Purified Pseudomonas aeruginosa PA0065 displayed high activity toward 5'-UMP and 5'-IMP, significant activity against 5'-XMP and 5'-TMP, and low activity against 5'-CMP. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 209
24137 319799 cd04303 HAD_PGPase phosphoglycolate phosphatase, similar to Synechococcus elongatus phosphoglycolate phosphatase PGP/CbbZ. Phosphoglycolate phosphatase catalyzes the dephosphorylation of phosphoglycolate; its activity requires divalent cations, especially Mg++. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 201
24138 319800 cd04305 HAD_Neu5Ac-Pase_like human N-acetylneuraminate-9-phosphate phosphatase, Escherichia coli house-cleaning phosphatase YjjG, and related phosphatases. N-acetylneuraminate-9- phosphatase (Neu5Ac-9-Pase; E.C. 3.1.3.29) catalyzes the dephosphorylation of N-acylneuraminate 9-phosphate during the synthesis of N-acetylneuraminate; Escherichia coli nucleotide phosphatase YjjG has a broad pyrimidine nucleotide activity spectrum and functions as an in vivo house-cleaning phosphatase for noncanonical pyrimidine nucleotides. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 109
24139 319801 cd04309 HAD_PSP_eu phosphoserine phosphatase eukaryotic-like, similar to human phosphoserine phosphatase. Human PSP, EC 3.1.3.3, catalyzes the third and final step of the L-serine biosynthesis pathway, the Mg2+-dependent hydrolysis of phospho-L-serine to L-serine and inorganic phosphate. L-serine is a precursor for the biosynthesis of glycine. HPSP regulates the levels of glycine and D-serine (converted from L-serine), the putative co-agonists for the glycine site of the NMDA receptor in the brain. Plant 3-PSP catalyzes the conversion of 3-phosphoserine to serine in the last step of the plastidic pathway of serine biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 202
24140 239811 cd04316 ND_PkAspRS_like_N ND_PkAspRS_like_N: N-terminal, anticodon recognition domain of the type found in the homodimeric non-discriminating (ND) Pyrococcus kodakaraensis aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. P. kodakaraensis AspRS is a class 2b aaRS. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. P. kodakaraensis ND-AspRS can charge both tRNAAsp and tRNAAsn. Some of the enzymes in this group may be discriminating, based on the presence of homologs of asparaginyl-tRNA synthetase (AsnRS) in their completed genomes. 108
24141 239812 cd04317 EcAspRS_like_N EcAspRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli aspartyl-tRNA synthetase (AspRS), the human mitochondrial (mt) AspRS-2, the discriminating (D) Thermus thermophilus AspRS-1, and the nondiscriminating (ND) Helicobacter pylori AspRS. These homodimeric enzymes are class2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Human mtAspRS participates in mitochondrial biosynthesis; this enzyme has been shown to charge E. coli native tRNAAsp in addition to in vitro transcribed human mitochondrial tRNAAsp. T. thermophilus is rare among bacteria in having both a D_AspRS and a ND_AspRS. H. pylori ND-AspRS can charge both tRNAAsp and tRNAAsn; it is fractionally more efficient at aminoacylating tRNAAsp over tRNAAsn. The H. pylori genome does not contain AsnRS. 135
24142 239813 cd04318 EcAsnRS_like_N EcAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli asparaginyl-tRNA synthetase (AsnRS) and in Arabidopsis thaliana and Saccharomyces cerevisiae mitochondrial (mt) AsnRS. This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. S. cerevisiae mtAsnRS can charge E. coli tRNA with asparagine. Mutations in the gene for S. cerevisiae mtAsnRS have been found to induce a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system. 82
24143 239814 cd04319 PhAsnRS_like_N PhAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Pyrococcus horikoshii asparaginyl-tRNA synthetase (AsnRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The archaeal enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. 103
24144 239815 cd04320 AspRS_cyto_N AspRS_cyto_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae and human cytoplasmic aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. 102
24145 239816 cd04321 ScAspRS_mt_like_N ScAspRS_mt_like_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae mitochondrial (mt) aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this fungal group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Mutations in the gene for S. cerevisiae mtAspRS result in a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system. 86
24146 239817 cd04322 LysRS_N LysRS_N: N-terminal, anticodon recognition domain of lysyl-tRNA synthetases (LysRS). These enzymes are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Included in this group are E. coli LysS and LysU. These two isoforms of LysRS are encoded by distinct genes which are differently regulated. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Saccharomyces cerevisiae cytoplasmic and mitochondrial LysRSs have been shown to participate in the mitochondrial import of the only nuclear-encoded tRNA of S. cerevisiae (tRNAlysCUU). The gene for human LysRS encodes both the cytoplasmic and the mitochondrial isoforms of LysRS. In addition to their housekeeping role, human lysRS may function as a signaling molecule that activates immune cells and tomato LysRS may participate in a root-specific process possibly connected to conditions of oxidative stress or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1, suggesting a role for LysRS in tRNA packaging. 108
24147 239818 cd04323 AsnRS_cyto_like_N AsnRS_cyto_like_N: N-terminal, anticodon recognition domain of the type found in human and Saccharomyces cerevisiae cytoplasmic asparaginyl-tRNA synthetase (AsnRS), in Brugia malayi AsnRS, and in various putative bacterial AsnRSs. This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. AsnRS is an immunodominant antigen of the filarial nematode B. malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages. 84
24148 239819 cd04327 ZnMc_MMP_like_3 Zinc-dependent metalloprotease; MMP_like sub-family 3. A group of bacterial and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin. 198
24149 239820 cd04328 RNAP_I_Rpa43_N RNAP_I_Rpa43_N: Rpa43, N-terminal ribonucleoprotein (RNP) domain. Rpa43 is a subunit of eukaryotic RNA polymerase (RNAP) I that is homologous to Rpb7 of eukaryotic RNAP II, Rpc25 of eukaryotic RNAP III, and RpoE of archaeal RNAP. Rpa43 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain. Rpa43 heterodimerizes with Rpa14 and this heterodimer has genetic and biochemical characteristics similar to those of the Rpb7/Rpb4 heterodimer of RNAP II. In addition, the Rpa43/Rpa14 heterodimer binds single-stranded RNA, as is the case for the Rpb7/Rpb4 and the archaeal E/F complexes. The position of Rpa43/Rpa14 in the three-dimensional structure of RNAP I is similar to that of Rpb4/Rpb7, which forms an upstream interface between the C-terminal domain of Rpb1 and the transcription factor IIB (TFIIB), recruiting pol II to the pol II promoter. Rpa43 binds Rrn3, an rDNA-specific transcription factor, functionally equivalent to TFIIB, involved in recruiting RNAP I to the pol I promoter. 89
24150 239821 cd04329 RNAP_II_Rpb7_N RNAP_II_Rpb7_N: Rpb7, N-terminal ribonucleoprotein (RNP) domain. Rpb7 is a subunit of eukaryotic RNA polymerase (RNAP) II that is homologous to Rpc25 of RNAP III, RpoE of archaeal RNAP, and Rpa43 of eukaryotic RNAP I. Rpb7 heterodimerizes with Rpb4 and this heterodimer binds the 10-subunit core of RNAP II, forming part of the floor of the DNA-binding cleft. Rpb7 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain, both of which bind single-stranded RNA. Rpb7 is thought to interact with the nascent RNA strand as it exits the RNAP II complex during transcription elongation. The Rpb7/Rpb4 heterodimer is also thought to serve as an upstream interface between the C-terminal domain of Rpb1 and the transcription factor IIB (TFIIB), recruiting pol II to the pol II promoter. 80
24151 239822 cd04330 RNAP_III_Rpc25_N RNAP_III_Rpc25_N: Rpc25, N-terminal ribonucleoprotein (RNP) domain. Rpc25 is a subunit of eukaryotic RNA polymerase (RNAP) III and is homologous to Rpa43 of eukaryotic RNAP I, Rpb7 of eukaryotic RNAP II, and RpoE of archaeal RNAP. Rpc25 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain, both of which are thought to bind single-stranded RNA. Rpc25 heterodimerizes with Rpc17 and plays an important role in transcription initiation. RNAP III transcribes diverse structural and catalytic RNAs including 5S ribosomal RNAs, tRNAs, and a small number of snRNAs involved in RNA and protein synthesis. 80
24152 239823 cd04331 RNAP_E_N RNAP_E_N: RpoE, N-terminal ribonucleoprotein (RNP) domain. RpoE (subunit E) is a subunit of the archaeal RNA polymerase (RNAP) that is homologous to Rpb7 of eukaryotic RNAP II, Rpc25 of eukaryotic RNAP III, and Rpa43 of eukaryotic RNAP I. RpoE heterodimerizes with RpoF, another RNA polymerase subunit. RpoE has an elongated two-domain structure that includes an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain. Both domains of RpoE bind single-stranded RNA. 80
24153 239824 cd04332 YbaK_like YbaK-like. The YbaK family of deacylase domains includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, ProX, and PrdX. The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation. Organisms whose ProRS lacks the INS domain express an INS homolog in trans (e.g. YbaK, ProX, or PrdX). 136
24154 239825 cd04333 ProX_deacylase This CD, composed mainly of bacterial single-domain proteins, includes the Thermus thermophilus (Tt) YbaK-like protein, a homolog of the trans-acting Escherichia coli YbaK Cys-tRNA(Pro) deacylase and the Agrobacterium tumefaciens ProX Ala-tRNA(Pro) deacylase and also the cis-acting prolyl-tRNA synthetase-editing domain (ProRS-INS). While ProX and ProRS-INS hydrolyze misacylated Ala-tRNA(Pro), the E. coli YbaK hydrolyzes misacylated Cys-tRNA(Pro). A few CD members are N-terminal, YbaK-ProX-like domains of an uncharacterized protein with a C-terminal, predicted Fe-S protein domain. 148
24155 239826 cd04334 ProRS-INS INS is an amino acid-editing domain inserted (INS) into the bacterial class II prolyl-tRNA synthetase (ProRS); however, this CD is not exclusively bacterial. It is also found at the N-terminus of the eukaryotic/archaea-like ProRS's of yeasts and single-celled parasites. ProRS catalyzes the attachment of proline to tRNA(Pro); proline is first activated by ATP, and then transferred to the acceptor end of tRNA(Pro). ProRS can inadvertently process noncognate amino acids such as alanine and cysteine, and to avoid such errors, in post-transfer editing, the INS domain deacylates mischarged Ala-tRNA(Pro), thus ensuring the fidelity of translation. Misacylated Cys-tRNA(Pro) is not edited by ProRS. In addition to the INS editing domain, the prokaryote-like ProRS protein contains catalytic and anticodon-binding domains which form a dimeric interface. 160
24156 239827 cd04335 PrdX_deacylase This CD includes bacterial (Agrobacterium tumefaciens and Caulobacter crescentus ProX, and Clostridium sticklandii PrdX) and eukaryotic (Plasmodium falciparum N-terminal ProRS editing domain) sequences. The C. sticklandii PrdX protein, a homolog of the YbaK and ProX proteins, and the prolyl-tRNA synthetase-editing domain (ProRS-INS), specifically hydrolyzes Ala-tRNA(Pro). In this CD, many of the eukaryotic editing domains are N-terminal and cis-acting, expressed from a multidomain ProRS; however, similar to the bacterial PrdX, the mammalian, amphibian, and echinoderm PrdX-like proteins are trans-acting, single-domain proteins. 156
24157 239828 cd04336 YeaK YeaK is an uncharacterized Escherichia coli protein with a YbaK-like domain of unknown function. The YbaK-like domain family includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, and ProX. The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation. Organisms whose ProRS lacks the INS domain express a single-domain INS homolog such as YbaK, ProX, or PrdX which supplies the function of INS in trans. 153
24158 239829 cd04337 Rieske_RO_Alpha_Cao Cao (chlorophyll a oxygenase) is a rieske non-heme iron-sulfur protein located within the plastid-envelope inner and thylakoid membranes, that catalyzes the conversion of chlorophyllide a to chlorophyllide b. CAO is found not only in plants but also in chlorophytes and prochlorophytes. This domain represents the N-terminal rieske domain of the oxygenase alpha subunit. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Cao is closely related to several other plant RO's including Tic 55, a 55 kDa protein associated with protein transport through the inner chloroplast membrane; Ptc 52, a novel 52 kDa protein isolated from chloroplasts; and LLS1/Pao (Lethal-leaf spot 1/pheophorbide a oxygenase). 129
24159 239830 cd04338 Rieske_RO_Alpha_Tic55 Tic55 is a 55kDa LLS1-related non-heme iron oxygenase associated with protein transport through the plant inner chloroplast membrane. This domain represents the N-terminal Rieske domain of the Tic55 oxygenase alpha subunit. Tic55 is closely related to the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO), Ptc52, and ACD1 (accelerated cell death 1). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. 134
24160 239831 cd04365 IlGF_relaxin_like IlGF_like family, relaxin_like subgroup, specific to vertebrates. Members include a number of active peptides including (pro)relaxin, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), and insulin-like peptides 5 (INSL5) and 6 (INSL6). Members of this subgroup are widely expressed in testes (INSL3, INSL6), decidua, placenta, prostate, corpus luteum, brain (various relaxins), GI tract, and kidney (INSL5) where they serve a variety of functions in parturition and development. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 59
24161 239832 cd04366 IlGF_insulin_bombyxin_like IlGF_like family, insulin_bombyxin_like subgroup. Members include a number of peptides including insulin, insulin-like growth factors I and II, insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. With the exception of insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 42
24162 239833 cd04367 IlGF_insulin_like IlGF_like family, insulin_like subgroup, specific to vertebrates. Members include a number of peptides including insulin and insulin-like growth factors I and II, which play a variety of roles in controlling processes such as metabolism, growth and differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, and differentiation. With the exception of the insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 79
24163 239834 cd04368 IlGF IlGF, insulin_like growth factors; specific to vertebrates. Members include a number of peptides including insulin-like growth factors I and II, which play a variety of roles in controlling processes such as growth, differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, proliferation, and differentiation. Typically, the active forms of these peptide hormones are single chains cross-linked by three disulfide bonds. 67
24164 99922 cd04369 Bromodomain Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 99
24165 239835 cd04370 BAH BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains were first described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin recognition complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silent information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions. 123
24166 239836 cd04371 DEP DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein, is necessary and sufficient for receptor interaction. 81
24167 239837 cd04372 RhoGAP_chimaerin RhoGAP_chimaerin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of chimaerins. Chimaerins are a family of phorbolester- and diacylglycerol-responsive GAPs specific for the Rho-like GTPase Rac. Chimaerins exist in two alternative splice forms that each contain a C-terminal GAP domain, and a central C1 domain which binds phorbol esters, inducing a conformational change that activates the protein; one splice form is lacking the N-terminal Src homology-2 (SH2) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 194
24168 239838 cd04373 RhoGAP_p190 RhoGAP_p190: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p190-like proteins. p190, also named RhoGAP5, plays a role in neuritogenesis and axon branch stability. p190 shows a preference for Rho, over Rac and Cdc42, and consists of an N-terminal GTPase domain and a C-terminal GAP domain. The central portion of p190 contains important regulatory phosphorylation sites. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 185
24169 239839 cd04374 RhoGAP_Graf RhoGAP_Graf: GTPase-activator protein (GAP) domain for Rho-like GTPases found in GRAF (GTPase regulator associated with focal adhesion kinase); Graf is a multi-domain protein, containing SH3 and PH domains, that binds focal adhesion kinase and influences cytoskeletal changes mediated by Rho proteins. Graf exhibits GAP activity toward RhoA and Cdc42, but only weakly activates Rac1. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 203
24170 239840 cd04375 RhoGAP_DLC1 RhoGAP_DLC1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of DLC1-like proteins. DLC1 shows in vitro GAP activity towards RhoA and CDC42. Beside its C-terminal GAP domain, DLC1 also contains a SAM (sterile alpha motif) and a START (StAR-related lipid transfer action) domain. DLC1 has tumor suppressor activity in cell culture. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 220
24171 239841 cd04376 RhoGAP_ARHGAP6 RhoGAP_ARHGAP6: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP6-like proteins. ArhGAP6 shows GAP activity towards RhoA, but not towards Cdc42 and Rac1. ArhGAP6 is often deleted in microphthalmia with linear skin defects syndrome (MLS); MLS is a severe X-linked developmental disorder. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 206
24172 239842 cd04377 RhoGAP_myosin_IX RhoGAP_myosin_IX: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in class IX myosins. Class IX myosins contain a characteristic head domain, a neck domain, a tail domain which contains a C6H2-zinc binding motif and a RhoGAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 186
24173 239843 cd04378 RhoGAP_GMIP_PARG1 RhoGAP_GMIP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein) and PARG1 (PTPL1-associated RhoGAP1). GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 203
24174 239844 cd04379 RhoGAP_SYD1 RhoGAP_SYD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in SYD-1_like proteins. Syd-1, first identified and best studied in C.elegans, has been shown to play an important role in neuronal development by specifying axonal properties. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 207
24175 239845 cd04380 RhoGAP_OCRL1 RhoGAP_OCRL1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in OCRL1-like proteins. OCRL1 (oculocerebrorenal syndrome of Lowe 1)-like proteins contain two conserved domains: a central inositol polyphosphate 5-phosphatase domain and a C-terminal Rho GAP domain; this GAP domain lacks the catalytic residue and therefore may be inactive. OCRL-like proteins are type II inositol polyphosphate 5-phosphatases that can hydrolyze lipid PI(4,5)P2 and PI(3,4,5)P3 and soluble Ins(1,4,5)P3 and Ins(1,3,4,5)P4, but their individual specificities vary. The functionality of the RhoGAP domain is still unclear. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 220
24176 239846 cd04381 RhoGap_RalBP1 RhoGap_RalBP1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in RalBP1 proteins, also known as RLIP, RLIP76 or cytocentrin. RalBP1 plays an important role in endocytosis during interphase. During mitosis, RalBP1 transiently associates with the centromere and has been shown to play an essential role in the proper assembly of the mitotic apparatus. RalBP1 is an effector of the Ral GTPase which itself is an effector of Ras. RalBP1 contains a RhoGAP domain, which shows weak activity towards Rac1 and Cdc42, but not towards Ral, and a Ral effector domain binding motif. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 182
24177 239847 cd04382 RhoGAP_MgcRacGAP RhoGAP_MgcRacGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in MgcRacGAP proteins. MgcRacGAP plays an important dual role in cytokinesis: i) it is part of the centralspindlin complex, together with the mitotic kinesin MKLP1, which is critical for the structure of the central spindle by promoting microtubule bundling; ii) after phosphorylation by aurora B, MgcRacGAP becomes an effective regulator of RhoA and plays an important role in the assembly of the contractile ring and the initiation of cytokinesis. MgcRacGAP-like proteins contain an N-terminal C1-like domain, and a C-terminal RhoGAP domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 193
24178 239848 cd04383 RhoGAP_srGAP RhoGAP_srGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in srGAPs. srGAPs are components of the intracellular part of Slit-Robo signalling pathway that is important for axon guidance and cell migration. srGAPs contain an N-terminal FCH domain, a central RhoGAP domain and a C-terminal SH3 domain; this SH3 domain interacts with the intracellular proline-rich-tail of the Roundabout receptor (Robo). This interaction with Robo then activates the rhoGAP domain which in turn inhibits Cdc42 activity. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 188
24179 239849 cd04384 RhoGAP_CdGAP RhoGAP_CdGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of CdGAP-like proteins; CdGAP contains an N-terminal RhoGAP domain and a C-terminal proline-rich region, and it is active on both Cdc42 and Rac1 but not RhoA. CdGAP is recruited to focal adhesions via the interaction with the scaffold protein actopaxin (alpha-parvin). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 195
24180 239850 cd04385 RhoGAP_ARAP RhoGAP_ARAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in ARAPs. ARAPs (also known as centaurin deltas) contain, besides the RhoGAP domain, an Arf GAP domain, ankyrin repeats, and Ras-associating and PH domains. Since their ArfGAP activity is PIP3-dependent, ARAPs are considered integration points for phosphoinositide, Arf and Rho signaling. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 184
24181 239851 cd04386 RhoGAP_nadrin RhoGAP_nadrin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Nadrin-like proteins. Nadrin, also named Rich-1, has been shown to be involved in the regulation of Ca2+-dependent exocytosis in neurons and recently has been implicated in tight junction maintenance in mammalian epithelium. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 203
24182 239852 cd04387 RhoGAP_Bcr RhoGAP_Bcr: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Bcr (breakpoint cluster region protein)-like proteins. Bcr is a multidomain protein with a variety of enzymatic functions. It contains a RhoGAP and a Rho GEF domain, a Ser/Thr kinase domain, an N-terminal oligomerization domain, and a C-terminal PDZ binding domain, in addition to PH and C2 domains. Bcr is a negative regulator of: i) RacGTPase, via the Rho GAP domain, ii) the Ras-Raf-MEK-ERK pathway, via phosphorylation of the Ras binding protein AF-6, and iii) the Wnt signaling pathway through binding beta-catenin. Bcr can form a complex with beta-catenin and Tcf1. The Wnt signaling pathway is involved in cell proliferation, differentiation, and cell renewal. Bcr was discovered as a fusion partner of Abl. The Bcr-Abl fusion is characteristic for a large majority of chronic myelogenous leukemias (CML). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 196
24183 239853 cd04388 RhoGAP_p85 RhoGAP_p85: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in the p85 isoforms of the regulatory subunit of the class IA PI3K (phosphatidylinositol 3'-kinase). This domain is also called Bcr (breakpoint cluster region protein) homology (BH) domain. Class IA PI3Ks are heterodimers, containing a regulatory subunit (p85) and a catalytic subunit (p110) and are activated by growth factor receptor tyrosine kinases (RTKs); this activation is mediated by the p85 subunit. p85 isoforms, alpha and beta, contain a C-terminal p110-binding domain flanked by two SH2 domains, an N-terminal SH3 domain, and a RhoGAP domain flanked by two proline-rich regions. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 200
24184 239854 cd04389 RhoGAP_KIAA1688 RhoGAP_KIAA1688: GTPase-activator protein (GAP) domain for Rho-like GTPases found in KIAA1688-like proteins; KIAA1688 is a protein of unknown function that contains a RhoGAP domain and a myosin tail homology 4 (MyTH4) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 187
24185 239855 cd04390 RhoGAP_ARHGAP22_24_25 RhoGAP_ARHGAP22_24_25: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP22, 24 and 25-like proteins; longer isoforms of these proteins contain an additional N-terminal pleckstrin homology (PH) domain. ARHGAP25 (KIAA0053) has been identified as a GAP for Rac1 and Cdc42. Short isoforms (without the PH domain) of ARHGAP24, called RC-GAP72 and p73RhoGAP, and of ARHGAP22, called p68RacGAP, have been shown to be involved in angiogenesis and endothelial cell capillary formation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 199
24186 239856 cd04391 RhoGAP_ARHGAP18 RhoGAP_ARHGAP18: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP18-like proteins. The function of ArhGAP18 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 216
24187 239857 cd04392 RhoGAP_ARHGAP19 RhoGAP_ARHGAP19: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP19-like proteins. The function of ArhGAP19 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 208
24188 239858 cd04393 RhoGAP_FAM13A1a RhoGAP_FAM13A1a: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of FAM13A1, isoform a-like proteins. The function of FAM13A1a is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 189
24189 239859 cd04394 RhoGAP-ARHGAP11A RhoGAP-ARHGAP11A: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP11A-like proteins. The mouse homolog of human ArhGAP11A has been detected as a gene exclusively expressed in immature ganglion cells, potentially playing a role in retinal development. The exact function of ArhGAP11A is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 202
24190 239860 cd04395 RhoGAP_ARHGAP21 RhoGAP_ARHGAP21: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP21-like proteins. ArhGAP21 is a multi-domain protein, containing RhoGAP, PH and PDZ domains, and is believed to play a role in the organization of the cell-cell junction complex. It has been shown to function as a GAP of Cdc42 and RhoA, and to interact with alpha-catenin and Arf6. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 196
24191 239861 cd04396 RhoGAP_fSAC7_BAG7 RhoGAP_fSAC7_BAG7: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal SAC7 and BAG7-like proteins. Both proteins are GTPase activating proteins of Rho1, but differ functionally in vivo: SAC7, but not BAG7, is involved in the control of Rho1-mediated activation of the PKC-MPK1 pathway. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 225
24192 239862 cd04397 RhoGAP_fLRG1 RhoGAP_fLRG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal LRG1-like proteins. Yeast Lrg1p is required for efficient cell fusion, and mother-daughter cell separation, possibly through acting as a RhoGAP specifically regulating 1,3-beta-glucan synthesis. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 213
24193 239863 cd04398 RhoGAP_fRGD1 RhoGAP_fRGD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD1-like proteins. Yeast Rgd1 is a GAP protein for Rho3 and Rho4 and plays a role in low-pH response. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 192
24194 239864 cd04399 RhoGAP_fRGD2 RhoGAP_fRGD2: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD2-like proteins. Yeast Rgd2 is a GAP protein for Cdc42 and Rho5. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 212
24195 239865 cd04400 RhoGAP_fBEM3 RhoGAP_fBEM3: RhoGAP (GTPase-activator [GAP] protein for Rho-like small GTPases) domain of fungal BEM3-like proteins. Bem3 is a GAP protein of Cdc42, and is specifically involved in the control of the initial assembly of the septin ring in yeast bud formation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 190
24196 239866 cd04401 RhoGAP_fMSB1 RhoGAP_fMSB1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal MSB1-like proteins. Msb1 was originally identified as a multicopy suppressor of temperature sensitive cdc42 mutation. Msb1 is a positive regulator of the Pkc1p-MAPK pathway and 1,3-beta-glucan synthesis, both pathways involve Rho1 regulation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 198
24197 239867 cd04402 RhoGAP_ARHGAP20 RhoGAP_ARHGAP20: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP20-like proteins. ArhGAP20, also known as KIAA1391 and RA-RhoGAP, contains a RhoGAP, a RA, and a PH domain, and ANXL repeats. ArhGAP20 is activated by Rap1 and induces inactivation of Rho, which in turn leads to neurite outgrowth. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 192
24198 239868 cd04403 RhoGAP_ARHGAP27_15_12_9 RhoGAP_ARHGAP27_15_12_9: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP27 (also called CAMGAP1), ARHGAP15, 12 and 9-like proteins; This subgroup of ARHGAPs are multidomain proteins that contain RhoGAP, PH, SH3 and WW domains. Most members that are studied show GAP activity towards Rac1, some additionally show activity towards Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 187
24199 239869 cd04404 RhoGAP-p50rhoGAP RhoGAP-p50rhoGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p50RhoGAP-like proteins; p50RhoGAP, also known as RhoGAP-1, contains a C-terminal RhoGAP domain and an N-terminal Sec14 domain which binds phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3). It is ubiquitously expressed and preferentially active on Cdc42. This subgroup also contains closely related ARHGAP8. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 195
24200 239870 cd04405 RhoGAP_BRCC3-like RhoGAP_BRCC3-like: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of BRCC3-like proteins. This subgroup also contains two groups of closely related proteins, BRCC3 and DEPDC7, which both contain a C-terminal RhoGAP-like domain and an N-terminal DEP (Disheveled, Egl-10, and Pleckstrin) domain. The function(s) of BRCC3 and DEPDC7 are unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 235
24201 239871 cd04406 RhoGAP_myosin_IXA RhoGAP_myosin_IXA: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXA. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 186
24202 239872 cd04407 RhoGAP_myosin_IXB RhoGAP_myosin_IXB: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXB. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 186
24203 239873 cd04408 RhoGAP_GMIP RhoGAP_GMIP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein). GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 200
24204 239874 cd04409 RhoGAP_PARG1 RhoGAP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of PARG1 (PTPL1-associated RhoGAP1). PARG1 was originally cloned as an interaction partner of PTPL1, an intracellular protein-tyrosine phosphatase. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 211
24205 319870 cd04410 DMSOR_beta-like Beta subunit of the DMSO Reductase (DMSOR) family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 136
24206 100108 cd04411 Ribosomal_P1_P2_L12p Ribosomal protein P1, P2, and L12p. Ribosomal proteins P1 and P2 are the eukaryotic proteins that are functionally equivalent to bacterial L7/L12. L12p is the archaeal homolog. Unlike other ribosomal proteins, the archaeal L12p and eukaryotic P1 and P2 do not share sequence similarity with their bacterial counterparts. They are part of the ribosomal stalk (called the L7/L12 stalk in bacteria), along with 28S rRNA and the proteins L11 and P0 in eukaryotes (23S rRNA, L11, and L10e in archaea). In bacterial ribosomes, L7/L12 homodimers bind the extended C-terminal helix of L10 to anchor the L7/L12 molecules to the ribosome. Eukaryotic P1/P2 heterodimers and archaeal L12p homodimers are believed to bind the L10 equivalent proteins, eukaryotic P0 and archaeal L10e, in a similar fashion. P1 and P2 (L12p, L7/L12) are the only proteins in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have two copies each of P1 and P2 (two heterodimers). Bacteria may have four or six copies (two or three homodimers), depending on the species. As in bacteria, the stalk is crucial for binding of initiation, elongation, and release factors in eukaryotes and archaea. 105
24207 239875 cd04412 NDPk7B Nucleoside diphosphate kinase 7 domain B (NDPk7B): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner. 134
24208 239876 cd04413 NDPk_I Nucleoside diphosphate kinase Group I (NDPk_I)-like: NDP kinase domains are present in a large family of structurally and functionally conserved proteins from bacteria to humans that generally catalyze the transfer of gamma-phosphates of a nucleoside triphosphate (NTP) donor onto a nucleoside diphosphate (NDP) acceptor through a phosphohistidine intermediate. The mammalian nm23/NDP kinase gene family can be divided into two distinct groups. The group I genes encode proteins that generally have highly homologous counterparts in other organisms and possess the classic enzymatic activity of a kinase. This group includes vertebrate NDP kinases A-D (Nm23-H1 to -H4), and its counterparts in bacteria, archaea and other eukaryotes. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. They possess the NDP kinase active site motif (NXXH[G/A]SD) and the nine residues that are most essential for catalysis. 130
24209 239877 cd04414 NDPk6 Nucleoside diphosphate kinase 6 (NDP kinase 6, NDPk6, NM23-H6; NME6; Inhibitor of p53-induced apoptosis-alpha, IPIA-alpha): The nm23-H6 gene encoding NDPk6 is expressed mainly in mitochondria, but also found at a lower level in most tissues. NDPk6 has all nine residues considered crucial for enzyme structure and activity, and has been found to have NDP kinase activity. It may play a role in cell growth and cell cycle progression. The nm23-H6 gene locus has been implicated in a variety of malignant tumors. 135
24210 239878 cd04415 NDPk7A Nucleoside diphosphate kinase 7 domain A (NDPk7A): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner. 131
24211 239879 cd04416 NDPk_TX NDP kinase domain of thioredoxin domain-containing proteins (TXNDC3 and TXNDC6): Txl-2 (TXNDC6) and Sptrx-2 (TXNDC3) are fusion proteins of Group II N-terminal thioredoxin domains followed by one or three NDP kinase domains, respectively. Sptrx-2, which has a tissue specific distribution in human testis, has been considered as a member of the nm23 family (nm23-H8) and exhibits a high homology with sea urchin IC1 (intermediate chain-1) protein, a component of the sperm axonemal outer dynein arm complex. Txl-2 is mainly represented in close association with microtubules within tissues with cilia and flagella such as seminiferous epithelium (spermatids) and lung airway epithelium, suggesting possible role in control of microtubule stability and maintenance. 132
24212 239880 cd04418 NDPk5 Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species. It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5). 132
24213 341228 cd04433 AFD_class_I Adenylate forming domain, Class I, also known as the ANL superfamily. This family is known as the ANL (acyl-CoA synthetases, the NRPS adenylation domains, and the Luciferase enzymes) superfamily. It includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain. 336
24214 271198 cd04434 LanC_like Cyclases involved in the biosynthesis of lantibiotics, and similar proteins. LanC is the cyclase enzyme of the lanthionine synthetase. Lanthionine is a lantibiotic, a unique class of peptide antibiotics. They are ribosomally synthesized as a precursor peptide and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans), in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. A related domain is also present in LanM and other pro- and eukaryotic proteins with poorly characterized functions. 351
24215 239882 cd04435 DEP_fRom2 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGEF (GDP/GTP exchange factor) Rom2-like proteins. Rom2-like proteins share a common domain architecture, containing, beside the RhoGEF domain, a DEP, a PH (pleckstrin homology) and a CNH domain. Rom2, a yeast GEF for Rho1 and Rho2, is involved in mediating stress response via the Ras-cAMP pathway and also plays a role in mediating resistance to sphingolipid disturbances. 82
24216 239883 cd04436 DEP_fRgd2 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGAP (GTPase-activator protein) Rgd2-like proteins. Rgd2-like proteins share a common domain architecture, containing, beside the RhoGAP domain, a DEP and a FCH (Fes/CIP4 homology) domain. Yeast Rgd2 is a GAP protein for Cdc42 and Rho5. 84
24217 239884 cd04437 DEP_Epac DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in Epac-like proteins. Epac (exchange proteins directly activated by cAMP) proteins are GEFs (guanine-nucleotide-exchange factors) for the small GTPases, Rap1 and Rap2. They are directly regulated by cyclic AMP, a second messenger that plays a role in the control of diverse cellular processes, such as cell adhesion and insulin secretion. Epac-like proteins share a common domain architecture, containing RasGEF, DEP and CAP-effector (cAMP binding) domains. The DEP domain is involved in membrane localization. 125
24218 239885 cd04438 DEP_dishevelled DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in dishevelled-like proteins. Dishevelled-like proteins play a key role in the transduction of the Wnt signal from the cell surface to the nucleus, which in turn is an important regulatory pathway for cellular development and growth. They contain an N-terminal DIX domain, a central PDZ domain, and a C-terminal DEP domain. 84
24219 239886 cd04439 DEP_1_P-Rex DEP (Dishevelled, Egl-10, and Pleckstrin) domain 1 found in P-Rex-like proteins. The P-Rex family is the guanine-nucleotide exchange factor (GEF) for the small GTPase Rac that contains an N-terminal RhoGEF domain, two DEP and PDZ domains. Rac-GEF activity is stimulated by phosphatidylinositol (3,4,5)-trisphosphate (PtdIns(3,4,5)P3), a lipid second messenger, and by the G beta-gamma subunits of heterotrimeric G proteins. The DEP domains are not involved in mediating these stimuli, but may be of importance for basal and stimulated levels of Rac-GEF activity. 81
24220 239887 cd04440 DEP_2_P-Rex DEP (Dishevelled, Egl-10, and Pleckstrin) domain 2 found in P-Rex-like proteins. The P-Rex family is the guanine-nucleotide exchange factor (GEF) for the small GTPase Rac that contains an N-terminal RhoGEF domain, two DEP and PDZ domains. Rac-GEF activity is stimulated by phosphatidylinositol (3,4,5)-trisphosphate (PtdIns(3,4,5)P3), a lipid second messenger, and by the G beta-gamma subunits of heterotrimeric G proteins. The DEP domains are not involved in mediating these stimuli, but may be of importance for basal and stimulated levels of Rac-GEF activity. 93
24221 239888 cd04441 DEP_2_DEP6 DEP (Dishevelled, Egl-10, and Pleckstrin) domain 2 found in DEP6-like proteins. DEP6 proteins contain two DEP and a PDZ domain. Their function is unknown. 85
24222 239889 cd04442 DEP_1_DEP6 DEP (Dishevelled, Egl-10, and Pleckstrin) domain 1 found in DEP6-like proteins. DEP6 proteins contain two DEP and a PDZ domain. Their function is unknown. 82
24223 239890 cd04443 DEP_GPR155 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in GPR155-like proteins. GPR155-like proteins, also known as PGR22, contain an N-terminal permease domain, a central transmembrane region and a C-terminal DEP domain. They are orphan receptors of the class B G protein-coupled receptors. Their function is unknown. 83
24224 239891 cd04444 DEP_PLEK2 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in pleckstrin 2-like proteins. Pleckstrin 2 is found in a wide variety of cell types, which suggest a more general role in signaling than pleckstrin 1. Pleckstrin-like proteins contain a central DEP domain, flanked by 2 PH (pleckstrin homology) domains. 109
24225 239892 cd04445 DEP_PLEK1 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in pleckstrin 1-like proteins. Pleckstrin 1 plays a role in cell spreading and reorganization of actin cytoskeleton in platelets and leukocytes. Its activity is highly regulated by phosphorylation, mainly by protein kinase C. Pleckstrin-like proteins contain a central DEP domain, flanked by 2 PH (pleckstrin homology) domains. 99
24226 239893 cd04446 DEP_DEPDC4 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in DEPDC4-like proteins. DEPDC4 is a DEP domain containing protein of unknown function. 95
24227 239894 cd04447 DEP_BRCC3 DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in BRCC3-like proteins. BRCC3, also known as DEPDC1B, is a DEP domain containing protein of unknown function. 92
24228 239895 cd04448 DEP_PIKfyve DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGEF (GDP/GTP exchange factor) PIKfyve-like proteins. PIKfyve contains N-terminal Fyve finger and DEP domains, a central chaperonin-like domain and a C-terminal PIPK (phosphatidylinositol phosphate kinase) domain. PIKfyve-like proteins are important phosphatidylinositol (3)-monophosphate (PtdIns(3)P)-5-kinases, producing PtdIns(3,5)P2, which plays a major role in multivesicular body (MVB) sorting and control of retrograde traffic from the vacuole back to the endosome and/or Golgi. PIKfyve itself has been shown to play a role in regulating early-endosome-to-trans-Golgi network (TGN) retrograde trafficking. 81
24229 239896 cd04449 DEP_DEPDC5-like DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in DEPDC5-like proteins. DEPDC5, in human also known as KIAA0645, is a DEP domain containing protein of unknown function. 83
24230 239897 cd04450 DEP_RGS7-like DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in RGS (regulator of G-protein signaling) proteins of the subfamily R7. This subgroup contains RGS7, RGS6, RGS9 and RGS11. They share a common domain architecture, containing, beside the RGS domain, a DEP domain and a GGL (G-protein gamma subunit-like ) domain. RGS proteins are GTPase-activating (GAP) proteins of heterotrimeric G proteins by increasing the rate of GTP hydrolysis of the alpha subunit. The fungal homologs, like yeast Sst2, share a related common domain architecture, containing RGS and DEP domains. Sst2 has been identified as the principal regulator of mating pheromone signaling and recently the DEP domain of Sst2 has been shown to be necessary and sufficient to mediate receptor interaction. 88
24231 239898 cd04451 S1_IF1 S1_IF1: Translation Initiation Factor IF1, S1-like RNA-binding domain. IF1 contains an S1-like RNA-binding domain, which is found in a wide variety of RNA-associated proteins. Translation initiation includes a number of interrelated steps preceding the formation of the first peptide bond. In Escherichia coli, the initiation mechanism requires, in addition to mRNA, fMet-tRNA, and ribosomal subunits, the presence of three additional proteins (initiation factors IF1, IF2, and IF3) and at least one GTP molecule. The three initiation factors influence both the kinetics and the stability of ternary complex formation. IF1 is the smallest of the three factors. IF1 enhances the rate of 70S ribosome subunit association and dissociation and the interaction of 30S ribosomal subunit with IF2 and IF3. It stimulates 30S complex formation. In addition, by binding to the A-site of the 30S ribosomal subunit, IF1 may contribute to the fidelity of the selection of the initiation site of the mRNA. 64
24232 239899 cd04452 S1_IF2_alpha S1_IF2_alpha: The alpha subunit of translation Initiation Factor 2, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Eukaryotic and archaeal Initiation Factor 2 (e- and aIF2, respectively) are heterotrimeric proteins with three subunits (alpha, beta, and gamma). IF2 plays a crucial role in the process of translation initiation. The IF2 gamma subunit contains a GTP-binding site. The IF2 beta and gamma subunits together are thought to be responsible for binding methionyl-initiator tRNA. The ternary complex consisting of IF2, GTP, and the methionyl-initiator tRNA binds to the small subunit of the ribosome, as part of a pre-initiation complex that scans the mRNA to find the AUG start codon. The IF2-bound GTP is hydrolyzed to GDP when the methionyl-initiator tRNA binds the AUG start codon, at which time the IF2 is released with its bound GDP. The large ribosomal subunit then joins with the small subunit to complete the initiation complex, which is competent to begin translation. The IF2a subunit is a major site of control of the translation initiation process, via phosphorylation of a specific serine residue. This alpha subunit is well conserved in eukaryotes and archaea but is not present in bacteria. IF2 is a cold-shock-inducible protein. 76
24233 239900 cd04453 S1_RNase_E S1_RNase_E: RNase E and RNase G, S1-like RNA-binding domain. RNase E is an essential endoribonuclease in the processing and degradation of RNA. In addition to its role in mRNA degradation, RNase E has also been implicated in the processing of rRNA, and the maturation of tRNA, 10Sa RNA and the M1 precursor of RNase P. RNase E associates with PNPase (3' to 5' exonuclease), Rhl B (DEAD-box RNA helicase) and enolase (glycolytic enzyme) to form the RNA degradosome. RNase E tends to cut mRNA within single-stranded regions that are rich in A/U nucleotides. The N-terminal region of RNase E contains the catalytic site. Within the conserved N-terminal domain of RNAse E and RNase G, there is an S1-like subdomain, which is an ancient single-stranded RNA-binding domain. S1 domain is an RNA-binding module originally identified in the ribosomal protein S1. The S1 domain is required for RNA cleavage by RNase E. RNase G is paralogous to RNase E with an N-terminal catalytic domain that is highly homologous to that of RNase E. RNase G not only shares sequence similarity with RNase E, but also functionally overlaps with RNase E. In Escherichia coli, RNase G is involved in the maturation of the 5' end of the 16S rRNA. RNase G plays a secondary role in mRNA decay. 88
24234 239901 cd04454 S1_Rrp4_like S1_Rrp4_like: Rrp4-like, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp4 protein, and Rrp40 and Csl4 proteins, also represented in this group, are subunits of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure". 82
24235 239902 cd04455 S1_NusA S1_NusA: N-utilizing substance A protein (NusA), S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. NusA is a transcription elongation factor containing an N-terminal catalytic domain and three RNA binding domains (RBD's). The RBD's include one S1 domain and two KH domains that form an RNA binding surface. DNA transcription by RNA polymerase (RNAP) includes three phases - initiation, elongation, and termination. During initiation, sigma factors bind RNAP and target RNAP to specific promoters. During elongation, N-utilization substances (NusA, B, E, and G) replace sigma factors and regulate pausing, termination, and antitermination. NusA is cold-shock-inducible. 67
24236 239903 cd04456 S1_IF1A_like S1_IF1A_like: Translation initiation factor IF1A-like, S1-like RNA-binding domain. IF1A is also referred to as eIF1A in eukaryotes and aIF1A in archaea. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1A also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea. 78
24237 239904 cd04457 S1_S28E S1_S28E: S28E, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. S28E protein is a component of the 30S ribosomal subunit. S28E is highly conserved among archaea and eukaryotes. S28E may control precursor RNA splicing and turnover in mRNA maturation process but its function in the ribosome is largely unknown. The structure contains an OB-fold found in many oligosaccharide and nucleic acid binding proteins. This implies that S28E might be involved in protein synthesis. 60
24238 239905 cd04458 CSP_CDS Cold-Shock Protein (CSP) contains an S1-like cold-shock domain (CSD) that is found in eukaryotes, prokaryotes, and archaea. CSP's include the major cold-shock proteins CspA and CspB in bacteria and the eukaryotic gene regulatory factor Y-box protein. CSP expression is up-regulated by an abrupt drop in growth temperature. CSP's are also expressed under normal conditions at lower levels. The function of cold-shock proteins is not fully understood. They preferentially bind poly-pyrimidine regions of single-stranded RNA and DNA. CSP's are thought to bind mRNA and regulate ribosomal translation, mRNA degradation, and the rate of transcription termination. The human Y-box protein, which contains a CSD, regulates transcription and translation of genes that contain the Y-box sequence in their promoters. These specific ssDNA-binding properties of CSD are required for the binding of Y-box protein to the promoter's Y-box sequence, thereby regulating transcription. 65
24239 239906 cd04459 Rho_CSD Rho_CSD: Rho protein cold-shock domain (CSD). Rho protein is a transcription termination factor in most bacteria. In bacteria, there are two distinct mechanisms for mRNA transcription termination. In intrinsic termination, RNA polymerase and nascent mRNA are released from DNA template by an mRNA stem loop structure, which resembles the transcription termination mechanism used by eukaryotic pol III. The second mechanism is mediated by Rho factor. Rho factor terminates transcription by using energy from ATP hydrolysis to forcibly dissociate the transcripts from RNA polymerase. Rho protein contains an N-terminal S1-like domain, which binds single-stranded RNA. Rho has a C-terminal ATPase domain which hydrolyzes ATP to provide energy to strip RNA polymerase and mRNA from the DNA template. Rho functions as a homohexamer. 68
24240 239907 cd04460 S1_RpoE S1_RpoE: RpoE, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. RpoE is subunit E of archaeal RNA polymerase. Archaeal cells contain a single RNA polymerase made up of 12 subunits, which are homologous to the 12 subunits (RPB1-12) of eukaryotic RNA polymerase II. RpoE is homologous to Rpa43 of eukaryotic RNA polymerase I, RPB7 of eukaryotic RNA polymerase II, and Rpc25 of eukaryotic RNA polymerase III. RpoE is composed of two domains, the N-terminal RNP (ribonucleoprotein) domain and the C-terminal S1 domain. This S1 domain binds ssRNA and ssDNA. This family is classified based on the C-terminal S1 domain. The function of RpoE is not fully understood. In eukaryotes, RPB7 and RPB4 form a heterodimer that reversibly associates with the RNA polymerase II core. 99
24241 239908 cd04461 S1_Rrp5_repeat_hs8_sc7 S1_Rrp5_repeat_hs8_sc7: Rrp5 Homo sapiens S1 repeat 8 (hs8) and Saccharomyces cerevisiae S1 repeat 7 (sc7)-like domains. Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in S. cerevisiae Rrp5 and 14 S1 repeats in H. sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 8 and S. cerevisiae S1 repeat 7. Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 83
24242 239909 cd04462 S1_RNAPII_Rpb7 S1_RNAPII_Rpb7: Eukaryotic RNA polymerase II (RNAPII) Rpb7 subunit C-terminal S1 domain. RNAPII is composed of 12 subunits (Rpb1-12). Rpb4 and Rpb7 form a heterodimer that associates with the RNAPII core. Rpb7 is a homolog of the Rpc25 of RNA polymerase III, RpoE of the archaeal RNA polymerase, and Rpa43 of eukaryotic RNA polymerase I. Rpb7 has two domains, an N-terminal ribonucleoprotein (RNP) domain and a C-terminal S1 domain, both of which bind single-stranded RNA. It is possible that the S1 domain interacts with the nascent RNA transcript, assisted by the RNP domain. In yeast, Rpb4/Rpb7 is necessary for promoter-directed transcription initiation. They also play a role in regulating transcription-coupled repair in the Rad26-dependent pathway, in efficient mRNA export, and in transcription termination. 88
24243 239910 cd04463 S1_EF_like S1_EF_like: EF-like, S1-like RNA-binding domain. The EF-like superfamily contains the bacterial translation elongation factor P and its archaeal and eukaryotic homologs, aIF5A and eIF5A. All proteins in this superfamily contain an S1 domain, which binds RNA or single-stranded DNA and often interacts with the ribosome. Hex-1, the S1-like domain of which is also found in this group, is structurally homologous to eIF5A and might have evolved from an ancestral eIF5A through gene duplication. 55
24244 239911 cd04465 S1_RPS1_repeat_ec2_hs2 S1_RPS1_repeat_ec2_hs2: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNA's during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 2 of the Escherichia coli and Homo sapiens RPS1 (ec2 and hs2, respectively). Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 67
24245 239912 cd04466 S1_YloQ_GTPase S1_YloQ_GTPase: YloQ GTPase family (also known as YjeQ and CpgA), S1-like RNA-binding domain. Proteins in the YloQ GTPase family bind the ribosome and have GTPase activity. The precise role of this family is unknown. The protein structure is composed of three domains: an N-terminal S1 domain, a central GTPase domain, and a C-terminal zinc finger domain. This N-terminal S1 domain binds ssRNA. The central GTPase domain contains nucleotide-binding signature motifs: G1 (walker A), G3 (walker B) and G4 motifs. Experiments show that the bacterial YloQ and YjeQ proteins have low intrinsic GTPase activity. The C-terminal zinc-finger domain has structural similarity to a portion of the DNA-repair protein Rad51. This suggests a possible role for this GTPase as a regulator of translation, perhaps as a translation initiation factor. This family is classified based on the N-terminal S1 domain. 68
24246 239913 cd04467 S1_aIF5A S1_aIF5A: Archaeal translation Initiation Factor 5A (aIF5A), S1-like RNA-binding domain. aIF5A is a homolog of eukaryotic eIF5A. IF5A is the only protein known to have the unusual amino acid hypusine. Hypusine is a post-translationally modified lysine and is essential for IF5A function. In yeast, eIF5A interacts with components of the 80S ribosome and translation elongation factors 2 (eEF2) in a hypusine-dependent manner. This C-terminal S1 domain resembles the cold-shock domain which binds RNA. Moreover, IF5A prefers binding to the actively translating ribosome. This evidence suggests that IF5A plays a role in translation elongation instead of translation initiation as previously proposed. 57
24247 239914 cd04468 S1_eIF5A S1_eIF5A: Eukaryotic translation Initiation Factor 5A (eIF5A), S1-like RNA-binding domain. eIF5A is an evolutionarily conserved protein found in eukaryotes. eIF5A is the only protein known to have the unusual amino acid hypusine. Hypusine is essential for eIF5A function and is a post-translationally modified lysine. eIF5A interacts with components of the 80S ribosome and translation elongation factors 2 (eEF2) in a hypusine-dependent manner. This C-terminal S1 domain resembles the oligonucleotides-binding fold (OB fold) which binds RNA. Moreover, eIF5A prefers binding to the actively translating ribosome. This evidence suggests that eIF5A plays a role in translation elongation instead of translation initiation as previously proposed. 69
24248 239915 cd04469 S1_Hex1 S1_Hex1: Hex1, S1-like RNA-binding domain. Hex1 protein is the major component of the Woronin body in filamentous fungi. The Woronin body is a dense vesicle and plays a vital role in filamentous fungi cell integrity. When cell damage occurs, Woronin bodies seal the septal pore to prevent further cytoplasmic bleeding. Hex1 protein self-assembles to form the solid core of the Woronin body vesicle. The Hex1 sequence and structure are similar to eukaryotic initiation factor 5A (eIF5A), suggesting they share a common ancestor during evolution. All members of the EF superfamily to which Hex1 belongs, contain an S1 domain, which has been shown to bind RNA or single-stranded DNA and often interacts with the ribosome. 75
24249 239916 cd04470 S1_EF-P_repeat_1 S1_EF-P_repeat_1: Translation elongation factor P (EF-P), S1-like RNA-binding domain, repeat 1. EF-P stimulates the peptidyltransferase activity in the prokaryotic 70S ribosome. EF-P enhances the synthesis of certain dipeptides with N-formylmethionyl-tRNA and puromycin in vitro. EF-P binds to both the 30S and 50S ribosomal subunits. EF-P binds near the streptomycin binding site of the 16S rRNA in the 30S subunit. EF-P interacts with domains 2 and 5 of the 23S rRNA. The L16 ribosomal protein of the 50S or its N-terminal fragment are required for EF-P mediated peptide bond synthesis, whereas L11, L15, and L7/L12 are not required in this reaction, suggesting that EF-P may function at a different ribosomal site than most other translation factors. EF-P is essential for cell viability and is required for protein synthesis. EF-P is mainly present in bacteria. The EF-P homologs in archaea and eukaryotes are the initiation factors aIF5A and eIF5A, respectively. EF-P has 3 domains (domains I, II, and III). Domains II and III are S1-like domains. This CD includes domain II (the first S1 domain of EF-P). Domains II and III have structural homology to the eIF5A domain C, suggesting that domains II and III evolved by duplication. 61
24250 239917 cd04471 S1_RNase_R S1_RNase_R: RNase R C-terminal S1 domain. RNase R is a processive 3' to 5' exoribonuclease, which is a homolog of RNase II. RNase R degrades RNA with secondary structure having a 3' overhang of at least 7 nucleotides. RNase R and PNPase play an important role in the degradation of RNA with extensive secondary structure, such as rRNA, tRNA, and certain mRNA which contains repetitive extragenic palindromic sequences. The C-terminal S1 domain binds ssRNA. 83
24251 239918 cd04472 S1_PNPase S1_PNPase: Polynucleotide phosphorylase (PNPase), S1-like RNA-binding domain. PNPase is a polyribonucleotide nucleotidyl transferase that degrades mRNA. It is a trimeric multidomain protein. The C-terminus contains the S1 domain which binds ssRNA. This family is classified based on the S1 domain. PNPase nonspecifically removes the 3' nucleotides from mRNA, but is stalled by double-stranded RNA structures such as a stem-loop. Evidence shows that a minimum of 7-10 unpaired nucleotides at the 3' end is required for PNPase degradation. It is suggested that PNPase also dephosphorylates the RNA 5' end. This additional activity may regulate the 5'-dependent activity of RNaseE in vivo. 68
24252 239919 cd04473 S1_RecJ_like S1_RecJ_like: The S1 domain of the archaea-specific RecJ-like exonuclease. The function of this family is not fully understood. In Escherichia coli, RecJ degrades single-stranded DNA in the 5'-3' direction and participates in homologous recombination and mismatch repair. 77
24253 239920 cd04474 RPA1_DBD_A RPA1_DBD_A: A subfamily of OB folds corresponding to the second OB fold, the ssDNA-binding domain (DBD)-A, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-A, RPA1 contains three other OB folds: DBD-B, DBD-C, and RPA1N. The major DNA binding activity of human RPA (hRPA) and Saccharomyces cerevisiae RPA (ScRPA) is associated with DBD-A and DBD-B of RPA1. RPA1 DBD-C is involved in trimerization. The ssDNA-binding mechanism is believed to be multistep and to involve conformational change. Although ScRPA and the hRPA have similar ssDNA-binding properties, they differ functionally. Antibodies to hRPA do not cross-react with ScRPA, and null mutations in the ScRPA subunits are not complemented by corresponding human genes. Also, ScRPA cannot support Simian virus 40 (SV40) DNA replication in vitro, whereas human RPA can. 104
24254 239921 cd04475 RPA1_DBD_B RPA1_DBD_B: A subfamily of OB folds corresponding to the third OB fold, the ssDNA-binding domain (DBD)-B, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-B, RPA1 contains three other OB folds: DBD-A, DBD-C, and RPA1N. The major DNA binding activity of human RPA (hRPA) and Saccharomyces cerevisiae RPA (ScRPA) is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. Although ScRPA and the hRPA have similar ssDNA-binding properties, they differ functionally. Antibodies to hRPA do not cross-react with ScRPA, and null mutations in the ScRPA subunits are not complemented by corresponding human genes. Also, ScRPA cannot support Simian virus 40 (SV40) DNA replication in vitro, whereas human RPA can. 101
24255 239922 cd04476 RPA1_DBD_C RPA1_DBD_C: A subfamily of OB folds corresponding to the C-terminal OB fold, the ssDNA-binding domain (DBD)-C, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-C, RPA1 contains three other OB folds: DBD-A, DBD-B, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in DNA binding and trimerization. It contains two structural insertions not found to date in other OB-folds: a zinc ribbon and a three-helix bundle. RPA1 DBD-C also contains a Cys4-type zinc-binding motif, which plays a role in the ssDNA binding function of this domain. It appears that zinc itself may not be required for ssDNA binding. 166
24256 239923 cd04477 RPA1N RPA1N: A subfamily of OB folds corresponding to the N-terminal OB-fold domain of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). RPA1N is known to specifically interact with the p53 tumor suppressor, DNA polymerase alpha, and transcription factors. In addition to RPA1N, RPA1 contains three other OB folds: ssDNA-binding domain (DBD)-A, DBD-B, and DBD-C. 97
24257 239924 cd04478 RPA2_DBD_D RPA2_DBD_D: A subfamily of OB folds corresponding to the OB fold of the central ssDNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B; RPA2 DBD-D is a weak ssDNA-binding domain. RPA2 DBD-D is also involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. N-terminal to human RPA2 DBD-D is a domain containing all the known phosphorylation sites of RPA. Human RPA2 is phosphorylated in a cell cycle dependent manner in response to DNA damage. RPA2 interacts physically with menin; the gene encoding menin is a tumor suppressor gene disrupted in multiple endocrine neoplasia type I. This subfamily also includes RPA2 from Cryptosporidium parvum (CpRPA2). CpRPA2 is an SSB, which can be phosphorylated by DNA-PK in vitro. 95
24258 239925 cd04479 RPA3 RPA3: A subfamily of OB folds similar to human RPA3 (also called RPA14). RPA3 is the smallest subunit of Replication protein A (RPA). RPA is a nuclear ssDNA binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). RPA3 is believed to have a structural role in assembly of the RPA heterotrimer. 101
24259 239926 cd04480 RPA1_DBD_A_like RPA1_DBD_A_like: A subgroup of uncharacterized plant OB folds with similarity to the second OB fold, the ssDNA-binding domain (DBD)-A, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-A, RPA1 contains three other OB folds: DBD-B, DBD-C, and RPA1N. The major DNA binding activity of RPA is associated with DBD-A and DBD-B of RPA1. RPA1 DBD-C is involved in trimerization. The ssDNA-binding mechanism is believed to be multistep and to involve conformational change. 86
24260 239927 cd04481 RPA1_DBD_B_like RPA1_DBD_B_like: A subgroup of uncharacterized, plant OB folds with similarity to the third OB fold, the ssDNA-binding domain (DBD)-B, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-B, RPA1 contains three other OB folds: DBD-A, DBD-C, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. 106
24261 239928 cd04482 RPA2_OBF_like RPA2_OBF_like: A subgroup of uncharacterized archaeal OB folds with similarity to the OB fold of the central ssDNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B; RPA2 DBD-D is a weak ssDNA-binding domain. RPA2 DBD-D is also involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. N-terminal to human RPA2 DBD-D is a domain containing all the known phosphorylation sites of RPA. Human RPA2 is phosphorylated in a cell cycle dependent manner in response to DNA damage. 91
24262 239929 cd04483 hOBFC1_like hOBFC1_like: A subfamily of OB folds similar to that found in human OB fold containing protein 1 (hOBFC1). Members of this group belong to the Replication protein A subunit 2 (RPA2) family of OB folds. RPA is a nuclear ssDNA binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). The OB fold domain of RPA2 has dual roles in ssDNA binding and trimerization. 92
24263 239930 cd04484 polC_OBF polC_OBF: A subfamily of OB folds corresponding to the N-terminal OB-fold nucleic acid binding domain of Bacillus subtilis type C replicative DNA polymerase III alpha subunit (polC). Replication in B. subtilis and Staphylococcus aureus requires two different polymerases, polC and DnaE. The holoenzyme is thought to include the two different polymerases. At the B. subtilis replication fork, polC appears to be involved in leading strand synthesis and DnaE in lagging strand synthesis. 82
24264 239931 cd04485 DnaE_OBF DnaE_OBF: A subfamily of OB folds corresponding to the C-terminal OB-fold nucleic acid binding domain of Thermus aquaticus and Escherichia coli type C replicative DNA polymerase III alpha subunit (DnaE). The DNA polymerase holoenzyme of E. coli contains two copies of this replicative polymerase, each of which copies a different DNA strand. This group also contains Bacillus subtilis DnaE. Replication in B. subtilis and Staphylococcus aureus requires two different type C polymerases, polC and DnaE, both of which are thought to be included in the DNA polymerase holoenzyme. At the B. subtilis replication fork, polC appears to be involved in leading strand synthesis and DnaE in lagging strand synthesis. 84
24265 239932 cd04486 YhcR_OBF_like YhcR_OBF_like: A subfamily of OB-fold domains similar to the OB folds of Bacillus subtilis YhcR. YhcR is a sugar-nonspecific nuclease, which is active in the presence of Ca2+ and Mn2+. It cleaves RNA endonucleolytically, producing 3'-monophosphate nucleosides. YhcR appears to be the major Ca2+ activated nuclease of B. subtilis. YhcR may be localized in the cell wall. 78
24266 239933 cd04487 RecJ_OBF2_like RecJ_OBF2_like: A subfamily of OB folds corresponding to the second OB fold (OBF2) of archaeal-specific proteins with similarity to eubacterial RecJ. RecJ is an ssDNA-specific exonuclease. Although the overall sequence similarity of these proteins to eubacterial RecJ proteins is marginal, they appear to carry motifs, which have been shown to be essential for nuclease function in Escherichia coli RecJ. In addition to this OB fold, most proteins in this subfamily contain: i) an N-terminal OB fold belonging to a different domain family (the ribosomal S1-like RNA-binding family); and ii) a domain, C-terminal to OBF2, characteristic of DHH family proteins. DHH family proteins include E. coli RecJ, and are predicted to have a phosphoesterase function. 73
24267 239934 cd04488 RecG_wedge_OBF RecG_wedge_OBF: A subfamily of OB folds corresponding to the OB fold found in the N-terminal (wedge) domain of Escherichia coli RecG. RecG is a branched-DNA-specific helicase, which catalyzes the interconversion of a DNA replication fork to a four-stranded (Holliday) junction in vivo and in vitro. This interconversion provides a route to repair stalled forks. The RecG monomer contains three domains. The N-terminal domain is named for its wedge structure, and may provide the specificity of RecG for binding branched-DNA structures. During the reversal of fork to Holliday junction, the wedge domain is fixed at the junction of the fork where the leading and lagging strand duplex arms meet, and is thought to promote the unwinding of the nascent leading and lagging strands. In order to form the Holliday junction, these nascent strands would be annealed, and the parental strands reannealed. The wedge domain may also be a processivity factor of RecG on these branched chain substrates. 75
24268 239935 cd04489 ExoVII_LU_OBF ExoVII_LU_OBF: A subfamily of OB folds corresponding to the N-terminal OB-fold domain of Escherichia coli exodeoxyribonuclease VII (ExoVII) large subunit. E. coli ExoVII is composed of two non-identical subunits. E. coli ExoVII is a single-strand-specific exonuclease which degrades ssDNA from both 3-prime and 5-prime ends. ExoVII plays a role in methyl-directed mismatch repair in vivo. ExoVII may also guard the genome from mutagenesis by removing excess ssDNA, since the build up of ssDNA would lead to SOS induction and PolIV-dependent mutagenesis. 78
24269 239936 cd04490 PolII_SU_OBF PolII_SU_OBF: A subfamily of OB folds corresponding to the OB fold found in Pyrococcus abyssi DNA polymerase II (PolII) small subunit. PolII is a family D DNA polymerase, having a 3-prime to 5-prime exonuclease activity. P. abyssi PolII is heterodimeric. The large subunit appears to be the polymerase, and the small subunit may be the exonuclease. The small subunit contains a calcineurin-like phosphatase superfamily domain C-terminal to this OB-fold domain. 79
24270 239937 cd04491 SoSSB_OBF SoSSB_OBF: A subfamily of OB folds similar to the OB fold of the crenarchaeote Sulfolobus solfataricus single-stranded (ss) DNA-binding protein (SSoSSB). SSoSSB has a single OB fold, and it physically and functionally interacts with RNA polymerase. In vitro, SSoSSB can substitute for the basal transcription factor TBP, stimulating transcription from promoters under conditions in which TBP is limiting, and supporting transcription when TBP is absent. SSoSSB selectively melts the duplex DNA of promoter sequences. It also relieves transcriptional repression by the chromatin Alba. In addition, SSoSSB activates reverse gyrase activity, which involves DNA binding, DNA cleavage, strand passage and ligation. SSoSSB stimulates all these steps in the presence of the chromatin protein, Sul7d. SSoSSB antagonizes the inhibitory effect of Sul7d on reverse gyrase supercoiling activity. It also physically and functionally interacts with Mini-chromosome Maintenance (MCM), stimulating the DNA helicase activity of MCM. 82
24271 239938 cd04492 YhaM_OBF_like YhaM_OBF_like: A subfamily of OB folds similar to that found in Bacillus subtilis YhaM and Staphylococcus aureus cmp-binding factor-1 (SaCBF1). Both these proteins are 3'-to-5' exoribonucleases. YhaM requires Mn2+ or Co2+ for activity and is inactive in the presence of Mg2+. YhaM also has a Mn2+ dependent 3'-to-5' single-stranded DNA exonuclease activity. SaCBF1 is also a double-stranded DNA binding protein, binding specifically to cmp, the replication enhancer found in S. aureus plasmid pT181. Proteins in this group combine an N-terminal OB fold with a C-terminal HD domain. The HD domain is found in metal-dependent phosphohydrolases. 83
24272 239939 cd04493 BRCA2DBD_OB1 BRCA2DBD_OB1: A subfamily of OB folds corresponding to the first OB fold (OB1) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA). BRCA2DBD OB1 binds DNA weakly. 100
24273 239940 cd04494 BRCA2DBD_OB2 BRCA2DBD_OB2: A subfamily of OB folds corresponding to the second OB fold (OB2) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA). 251
24274 239941 cd04495 BRCA2DBD_OB3 BRCA2DBD_OB3: A subfamily of OB folds corresponding to the third OB fold (OB3) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA). 100
24275 239942 cd04496 SSB_OBF SSB_OBF: A subfamily of OB folds similar to the OB fold of ssDNA-binding protein (SSB). SSBs bind with high affinity to ssDNA. They bind to and protect ssDNA intermediates during DNA metabolic pathways. All bacterial and eukaryotic SSBs studied to date oligomerize to bring together four OB folds in their active state. The majority (e.g. Escherichia coli SSB) have a single OB fold per monomer, which oligomerize to form a homotetramer. However, Deinococcus and Thermus SSB proteins have two OB folds per monomer, which oligomerize to form a homodimer. Mycobacterium tuberculosis SSB varies in quaternary structure from E. coli SSB. It forms a dimer of dimers having a unique dimer interface, which lends the protein greater stability. Included in this group are OB folds similar to Escherichia coli PriB. E. coli PriB is homodimeric with each monomer having a single OB fold. It does not appear to form higher order oligomers. PriB is an essential protein for the replication restart at forks that have stalled at sites of DNA damage. It also plays a role in the assembly of primosome during replication initiation at the bacteriophage phiX174 origin. PriB physically interacts with SSB and binds ssDNA with high affinity. 100
24276 239943 cd04497 hPOT1_OB1_like hPOT1_OB1_like: A subfamily of OB folds similar to the first OB fold (OB1) of human protection of telomeres 1 protein (hPOT1), the single OB fold of the N-terminal domain of Schizosaccharomyces pombe POT1 (SpPOT1), and the first OB fold of the N-terminal domain of the alpha subunit (OB1Nalpha) of Oxytricha nova telomere end binding protein (OnTEBP). POT1 proteins recognize single-stranded (ss) 3-prime ends of the telomere. A 3-prime ss overhang is conserved in ciliated protozoa, yeast, and mammals. SpPOT1 is essential for telomere maintenance. It binds specifically to the ss G-rich telomeric sequence (GGTTAC) of S. pombe. hPOT1 binds specifically to ss telomeric DNA repeats ending with the sequence GGTTAG. Deletion of the S. pombe pot1+ gene results in a rapid loss of telomere sequences, chromosome mis-segregation and chromosome circularization. hPOT1 is implicated in telomere length regulation. The hPOT1 monomer consists of two closely connected OB folds (OB1-OB2) which cooperate to bind telomeric ssDNA. OB1 makes more extensive contact with the ssDNA than OB2. OB2 protects the 3' end of the ssDNA. A second OB fold has not been predicted in S. pombe POT1. OnTEBP binds the extreme 3-prime end of telomeric DNA. It is heterodimeric and contains four OB folds - three in the alpha subunit (two in the N-terminal domain and one in the C-terminal domain) and one in the beta subunit. OB1Nalpha, together with the second OB fold of the N-terminal domain of OnTEBP alpha subunit and the beta subunit OB fold, forms a deep cleft that binds ssDNA. 138
24277 239944 cd04498 hPOT1_OB2 hPOT1_OB2: A subfamily of OB folds similar to the second OB fold (OB2) of human protection of telomeres 1 protein (hPOT1). POT1 proteins bind to the single-stranded (ss) 3-prime ends of the telomere. hPOT1 binds specifically to ss telomeric DNA repeats ending with the sequence GGTTAG. The hPOT1 monomer consists of two closely connected OB folds (OB1-OB2) which cooperate to bind telomeric ssDNA. OB1 makes more extensive contact with the ssDNA than OB2. OB2 protects the 3' end of the ssDNA. hPOT1 is implicated in telomere length regulation. 123
24278 239945 cd04501 SGNH_hydrolase_like_4 Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 183
24279 239946 cd04502 SGNH_hydrolase_like_7 Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 171
24280 239947 cd04506 SGNH_hydrolase_YpmR_like Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. This subfamily contains sequences similar to Bacillus YpmR. 204
24281 410449 cd04508 Tudor_SF Tudor domain superfamily. The Tudor domain is a conserved structural domain, originally identified in the Tudor protein of Drosophila, that adopts a beta-barrel-like core structure containing four short beta-strands followed by an alpha-helical region. It binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. Tudor domain-containing proteins may mediate protein-protein interactions required for various DNA-templated biological processes, such as RNA metabolism, as well as histone modification and the DNA damage response. Members of this superfamily contain one or more copies of the Tudor domain. 47
24282 380490 cd04509 PBP1_ABC_transporter_GPCR_C-like Family C of G-protein coupled receptors and their close homologs, the type 1 periplasmic-binding proteins of ATP-binding cassette transporter-like systems. This CD includes members of the family C of G-protein coupled receptors and their close homologs, the type 1 periplasmic-binding proteins of ATP-binding cassette transporter-like systems. The family C GPCR includes glutamate/glycine-gated ion channels such as the NMDA receptor, G-protein-coupled receptors, metabotropic glutamate, GABA-B, calcium sensing, pheromone receptors, and atrial natriuretic peptide-guanylate cyclase receptors. The glutamate receptors that form cation-selective ion channels, iGluR, can be classified into three different subgroups according to their binding-affinity for the agonists NMDA (N-methyl-D-aspartate), AMPA (alpha-amino-3-dihydro-5-methyl-3-oxo-4-isoxazolepropionic acid), and kainate. L-glutamate is a major neurotransmitter in the brain of vertebrates and acts through either mGluRs or iGluRs. mGluRs subunits possess seven transmembrane segments and a large N-terminal extracellular domain. ABC-type leucine-isoleucine-valine binding protein (LIVBP) is a bacterial periplasmic binding protein that has homology with the amino-terminal domain of the glutamate-receptor ion channels (iGluRs). The extracellular regions of iGluRs are made of two PBP-like domains in tandem, a LIVBP-like domain that constitutes the N terminus (included in this model) followed by a domain related to lysine-arginine-ornithine-binding protein (LAOBP) that belongs to the type 2 periplasmic binding fold protein superfamily. The uncharacterized periplasmic components of various ABC-type transport systems are also included in this family. 306
24283 239948 cd04511 Nudix_Hydrolase_4 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, U=I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 130
24284 271334 cd04512 Ntn_Asparaginase_2_like L-Asparaginase type 2-like enzymes of the NTN-hydrolase superfamily. This family includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mixed Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. The family is circularly permuted relative to other NTN-hydrolase families. 249
24285 271335 cd04513 Glycosylasparaginase Glycosylasparaginase and similar proteins. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoproteins. This enzyme is an amidase located inside lysosomes. Mutation of this gene in humans causes a genetic disorder known as aspartylglycosaminuria (AGU). The glycosylasparaginase precursor undergoes autoproteolysis through an N-O or N-S acyl rearrangement of the peptide bond, which leads to the cleavage of a peptide bond between an Asp and a Thr. This proteolysis step generates an exposed N-terminal catalytic threonine and activates the enzyme. 294
24286 271336 cd04514 Taspase1_like Taspase 1 (threonine aspartase 1) and similar proteins. Taspase1 catalyzes the cleavage of the mixed lineage leukemia (MLL) nuclear protein and transcription factor TFIIA. Taspase1 is a threonine aspartase, a member of the Ntn hydrolase superfamily and the type 2 asparaginase family. A threonine residue acts as the active site nucleophile in both endopeptidase and protease activities to cleave polypeptide substrates after an aspartate residue. The Taspase1 proenzyme undergoes autoproteolysis into alpha and beta subunits. The N-terminal residue of the beta subunit is a threonine which is the active catalytic residue. The active enzyme is a heterotetramer. 313
24287 341214 cd04515 Alpha_kinase Alpha kinase family. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 213
24288 239952 cd04516 TBP_eukaryotes eukaryotic TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. 174
24289 239953 cd04517 TLF TBP-like factors (TLF; also called TLP, TRF, TRP), which are found in most metazoans. TLFs and TBPs have well-conserved core domains; however, they only share about 60% similarity. TLFs, like TBPs, interact with TFIIA and TFIIB, which are part of the basal transcription machinery. Yet, in contrast to TBPs, TLFs seem not to interact with the TATA-box and even have a negative effect on the transcription of TATA-containing promoters. Recent results indicate that TLFs are involved in the transcription via TATA-less promoters. 174
24290 239954 cd04518 TBP_archaea archaeal TATA box binding protein (TBP): TBPs are transcription factors present in archaea and eukaryotes, that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. 174
24291 213328 cd04519 RasGAP Ras GTPase Activating Domain. RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator. 256
24292 341359 cd04582 CBS_pair_ABC_OpuCA_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the ABC transporter OpuCA. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 111
24293 341360 cd04583 CBS_pair_ABC_OpuCA_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the ABC transporter OpuCA. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 110
24294 341361 cd04584 CBS_pair_AcuB_like Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the ACT domain. The putative Acetoin Utilization Protein (AcuB) from Vibrio cholerae contains a CBS pair domain. The acetoin utilization protein plays a role in growth and sporulation on acetoin or butanediol for use as a carbon source. Acetoin is an important physiological metabolite excreted by many microorganisms. It is used as an external energy store by a number of fermentative bacteria. Acetoin is produced by the decarboxylation of alpha-acetolactate. Once superior carbon sources are exhausted, and the culture enters stationary phase, acetoin can be utilised in order to maintain the culture density. The conversion of acetoin into acetyl-CoA or 2,3-butanediol is catalysed by the acetoin dehydrogenase complex and acetoin reductase/2,3-butanediol dehydrogenase, respectively. Acetoin utilization proteins, acetylpolyamine amidohydrolases, and histone deacetylases are members of an ancient protein superfamily. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms. They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition these CBS domains are associated with a downstream ACT (aspartate kinase/chorismate mutase/TyrA) domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. 
They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 130
24295 341362 cd04586 CBS_pair_BON_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. BON is a putative phospholipid-binding domain found in a family of osmotic shock protection proteins. It is also found in some secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 137
24296 341363 cd04587 CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT (Nucleotidyltransferase) Pol-beta-like domain, and the DUF294 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain. Members of CAP_ED, include CAP which binds cAMP, FNR (fumarate and nitrate reductase) which uses an iron-sulfur cluster to sense oxygen, and CooA a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A)synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly (A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. 
The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 114
24297 341364 cd04588 CBS_pair_archHTH_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in archaea and associated with helix turn helix domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 111
24298 341365 cd04589 CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT (Nucleotidyltransferase) Pol-beta-like domain, and the DUF294 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain. Members of CAP_ED, include CAP which binds cAMP, FNR (fumarate and nitrate reductase) which uses an iron-sulfur cluster to sense oxygen, and CooA a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A)synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly (A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. 
The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 113
24299 341366 cd04590 CBS_pair_CorC_HlyC_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains the majority of which are associated with the CorC_HlyC domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains the majority of which are associated with the CorC_HlyC domain. CorC_HlyC is a transporter associated domain. This small domain is found in Na+/H+ antiporters, in proteins involved in magnesium and cobalt efflux, and in association with some proteins of unknown function. The function of the CorC_HlyC domain is uncertain but it might be involved in modulating transport of ion substrates. These CBS domains are found in highly conserved proteins that either have unknown function or are purported to be hemolysins, exotoxins involved in lysis of red blood cells in vitro. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 119
24300 341367 cd04591 CBS_pair_voltage-gated_CLC_euk_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in eukaryotes and bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC voltage-gated chloride channel. The CBS pairs here are found in the EriC CIC-type chloride channels in eukaryotes and bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 114
24301 341368 cd04592 CBS_pair_voltage-gated_CLC_euk_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in eukaryotes and bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC voltage-gated chloride channel. The CBS pairs here are found in the EriC CIC-type chloride channels in eukaryotes and bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 128
24302 341369 cd04594 CBS_pair_voltage-gated_CLC_archaea Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in archaea. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC voltage-gated chloride channel. The CBS pairs here are found in the EriC CIC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 107
24303 341370 cd04595 CBS_pair_DHH_polyA_Pol_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DHH and nucleotidyltransferase (NT) domains. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream nucleotidyltransferase (NT) domain of family X DNA polymerases. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 110
24304 341371 cd04596 CBS_pair_DRTGG_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DRTGG domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 108
24305 341372 cd04597 CBS_pair_inorgPPase Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with family II inorganic pyrophosphatase. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a subgroup of family II inorganic pyrophosphatases (PPases) that also contain a DRTGG domain. The homolog from Clostridium has been shown to be inhibited by AMP and activated by a novel effector, diadenosine 5',5-P1,P4-tetraphosphate (AP(4)A), which has been shown to bind to the CBS domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 106
24306 341373 cd04598 CBS_pair_GGDEF_EAL Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 121
24307 341374 cd04599 CBS_pair_GGDEF_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 107
24308 341375 cd04600 CBS_pair_HPP_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. These proteins are integral membrane proteins with four transmembrane spanning helices. The function of these proteins is uncertain, but they are thought to be transporters. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 133
24309 341376 cd04601 CBS_pair_IMPDH Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 110
24310 341377 cd04603 CBS_pair_KefB_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 112
24311 341378 cd04604 CBS_pair_SIS_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the SIS (Sugar ISomerase) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the SIS (Sugar ISomerase) domain in the API [A5P (D-arabinose 5-phosphate) isomerase] protein KpsF/GutQ. These APIs catalyze the conversion of the pentose pathway intermediate D-ribulose 5-phosphate into A5P, a precursor of 3-deoxy-D-manno-octulosonate, which is an integral carbohydrate component of various glycolipids coating the surface of the outer membrane of Gram-negative bacteria, including lipopolysaccharide and many group 2 K-antigen capsules. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 124
24312 341379 cd04605 CBS_pair_arch_MET2_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. Met2 is a key enzyme in the biosynthesis of methionine. It encodes a homoserine transacetylase involved in converting homoserine to O-acetyl homoserine. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 116
24313 341380 cd04606 CBS_pair_Mg_transporter Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the magnesium transporter, MgtE. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE. MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 121
24314 341381 cd04607 CBS_pair_NTP_transferase_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain downstream. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 112
24315 341382 cd04608 CBS_pair_CBS Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream. Cystathionine beta-synthase (CBS) contains, besides the C-terminal regulatory CBS-pair, an N-terminal heme-binding module, followed by a pyridoxal phosphate (PLP) domain, which houses the active site. It is the first enzyme in the transsulfuration pathway, catalyzing the conversion of serine and homocysteine to cystathionine and water. In general, CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 120
24316 341383 cd04610 CBS_pair_ParBc_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 108
24317 341384 cd04611 CBS_pair_GGDEF_PAS_repeat2 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors, repeat 2. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 131
24318 341385 cd04613 CBS_pair_voltage-gated_CLC_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage-gated CLC (chloride channel) in bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the CLC voltage-gated chloride channel. The CBS pairs here are found in the EriC ClC-type chloride channels in bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 119
24319 341386 cd04614 CBS_pair_arch2_repeat2 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 2. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in Inosine monophosphate (IMP) dehydrogenases and related proteins including IMP dehydrogenase IX from Methanothermobacter. IMP dehydrogenase is an essential enzyme in the de novo biosynthesis of Guanosine monophosphate (GMP), catalyzing the NAD-dependent oxidation of IMP to xanthosine monophosphate (XMP). The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 150
24320 341387 cd04617 CBS_pair_CcpN Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains of CcpN repressor. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 125
24321 341388 cd04618 CBS_euAMPK_gamma-like_repeat1 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in AMP-activated protein kinase gamma-like proteins, repeat 1. AMP-activated protein kinase (AMPK) plays multiple roles in the body's overall metabolic balance and response to exercise, nutritional stress, hormonal stimulation, and the glucose-lowering drugs metformin and rosiglitazone. AMPK consists of a catalytic alpha subunit and two non-catalytic subunits, beta and gamma, each with multiple isoforms that form active 1:1:1 heterotrimers. This cd contains 2 tandem repeats of the CBS domains found in the gamma subunits of AMPK. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 138
24322 341389 cd04620 CBS_two-component_sensor_histidine_kinase_repeat1 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins, repeat 1. This cd contains 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins. Two-component regulation is the predominant form of signal recognition and response coupling mechanism used by bacteria to sense and respond to diverse environmental stresses and cues ranging from common environmental stimuli to host signals recognized by pathogens and bacterial cell-cell communication signals. The structures of both sensors and regulators are modular, and numerous variations in domain architecture and composition have evolved to tailor to specific needs in signal perception and signal transduction. The simplest histidine kinase sensors consist of only sensing and kinase domains. The more complex hybrid sensors contain an additional REC domain typical of two-component regulators and in some cases a C-terminal histidine phosphotransferase (HPT) domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 136
24323 341390 cd04622 CBS_pair_HRP1_like CBS pair domain found in Hypoxic Response Protein 1 (HRP1)-like proteins. Mycobacterium tuberculosis adapts to cellular stresses by upregulation of the dormancy survival regulon. Hypoxic response protein 1 (HRP1) is encoded by one of the most strongly upregulated genes in the dormancy survival regulon. HRP1 is a 'CBS-domain-only' protein; however, unlike other CBS-containing proteins, it does not appear to bind AMP. The biological function of the protein remains unclear, but it is thought to contribute to the modulation of the host immune response. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 115
24324 341391 cd04623 CBS_pair_bac_euk Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria and eukaryotes. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 113
24325 341392 cd04629 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 116
24326 341393 cd04630 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 120
24327 341394 cd04631 CBS_archAMPK_gamma-repeat2 CBS pair domains found in archaeal 5'-AMP-activated protein kinase gamma subunit-like proteins. The archaeal gamma subunit of 5'-AMP-activated protein kinase (AMPK) contains four CBS domains in tandem repeats, similar to eukaryotic homologs. AMPK is an important regulator of metabolism and of energy homeostasis. It is a heterotrimeric protein composed of a catalytic serine/threonine kinase subunit (alpha) and two regulatory subunits (beta and gamma). The gamma subunit senses the intracellular energy status by competitively binding AMP and ATP and is believed to be responsible for allosteric regulation of the whole complex. In humans, mutations in the gamma subunit of AMPK are associated with hypertrophic cardiomyopathy, Wolff-Parkinson-White syndrome and glycogen storage in the skeletal muscle. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 130
24328 341395 cd04632 CBS_pair_arch1_repeat2 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 2. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 127
24329 341396 cd04638 CBS_pair_arch2_repeat1 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 1. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 109
24330 341397 cd04639 CBS_pair_peptidase_M50 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in the metalloprotease peptidase M50. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in peptidase M50. Members of the M50 metallopeptidase family include mammalian sterol-regulatory element binding protein (SREBP) site 2 proteases and various hypothetical bacterial homologues. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 120
24331 341398 cd04640 CBS_pair_proteobact Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in proteobacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 133
24332 341399 cd04641 CBS_euAMPK_gamma-like_repeat2 CBS pair domain found in 5'-AMP (adenosine monophosphate)-activated protein kinase. The 5'-AMP (adenosine monophosphate)-activated protein kinase (AMPK) coordinates metabolic function with energy availability by responding to changes in intracellular ATP (adenosine triphosphate) and AMP concentrations. Most of the members of this cd contain two Bateman domains, each of which is composed of a tandem pair of cystathionine beta-synthase (CBS) motifs. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 124
24333 341400 cd04643 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 130
24334 100051 cd04645 LbH_gamma_CA_like Gamma carbonic anhydrase-like: This family is composed of gamma carbonic anhydrase (CA), Ferripyochelin Binding Protein (FBP), E. coli paaY protein, and similar proteins. CAs are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism, involving the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Gamma CAs are trimeric enzymes with a left-handed parallel beta helix (LbH) structural domain. 153
24335 100052 cd04646 LbH_Dynactin_6 Dynactin 6 (or subunit p27): Dynactin is a major component of the activator complex that stimulates dynein-mediated vesicle transport. Dynactin is a heterocomplex of at least eight subunits, including a 150,000-MW protein called Glued, the actin-capping protein Arp1, and dynamitin. In vitro binding experiments show that dynactin enhances dynein-dependent motility, possibly through interaction with microtubules and vesicles. Subunit p27 is part of the pointed-end subcomplex in dynactin that also includes p25, p26, and Arp11. This subcomplex interacts with membranous cargoes. p25 and p27 contain the imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), indicating a left-handed parallel beta helix (LbH) structural domain. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 164
24336 100053 cd04647 LbH_MAT_like Maltose O-acyltransferase (MAT)-like: This family is composed of maltose O-acetyltransferase, galactoside O-acetyltransferase (GAT), xenobiotic acyltransferase (XAT) and similar proteins. MAT and GAT catalyze the CoA-dependent acetylation of the 6-hydroxyl group of their respective sugar substrates. MAT acetylates maltose and glucose exclusively while GAT specifically acetylates galactopyranosides. XAT catalyzes the CoA-dependent acetylation of a variety of hydroxyl-bearing acceptors such as chloramphenicol and streptogramin, among others. XATs are implicated in inactivating xenobiotics leading to xenobiotic resistance in patients. Members of this family contain a left-handed parallel beta-helix (LbH) domain with at least 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). They are trimeric in their active form. 109
24337 100054 cd04649 LbH_THP_succinylT_putative Putative 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate (THDP) N-succinyltransferase (THP succinyltransferase), C-terminal left-handed parallel beta-helix (LbH) domain: This group is composed of mostly uncharacterized proteins containing an N-terminal domain of unknown function and a C-terminal LbH domain with similarity to THP succinyltransferase LbH. THP succinyltransferase catalyzes the conversion of tetrahydrodipicolinate and succinyl-CoA to N-succinyltetrahydrodipicolinate and CoA. It is the committed step in the succinylase pathway by which bacteria synthesize L-lysine and meso-diaminopimelate, a component of peptidoglycan. The enzyme is trimeric and displays the left-handed parallel beta-helix (LbH) structural motif encoded by the hexapeptide repeat motif. 147
24338 100055 cd04650 LbH_FBP Ferripyochelin Binding Protein (FBP): FBP is an outer membrane protein which plays a role in iron acquisition. It binds iron when it is complexed with pyochelin. It adopts the left-handed parallel beta-helix (LbH) structure, and contains imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. Acyltransferase activity has not been observed in this group. 154
24339 100056 cd04651 LbH_G1P_AT_C Glucose-1-phosphate adenylyltransferase, C-terminal Left-handed parallel beta helix (LbH) domain: Glucose-1-phosphate adenylyltransferase is also known as ADP-glucose synthase or ADP-glucose pyrophosphorylase. It catalyzes the first committed and rate-limiting step in starch biosynthesis in plants and glycogen biosynthesis in bacteria. It is the enzymatic site for regulation of storage polysaccharide accumulation in plants and bacteria. The enzyme is a homotetramer, with each subunit containing an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain with at least 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The LbH domain is involved in cooperative allosteric regulation and oligomerization. 104
24340 100057 cd04652 LbH_eIF2B_gamma_C eIF-2B gamma subunit, C-terminal Left-handed parallel beta-Helix (LbH) domain: eIF-2B is a eukaryotic translation initiator, a guanine nucleotide exchange factor (GEF) composed of five different subunits (alpha, beta, gamma, delta and epsilon). eIF2B is important for regenerating GTP-bound eIF2 during the initiation process. This event is obligatory for eIF2 to bind initiator methionyl-tRNA, forming the ternary initiation complex. The eIF-2B gamma subunit contains an N-terminal domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH domain with 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The epsilon and gamma subunits form the catalytic subcomplex of eIF-2B, which binds eIF2 and catalyzes guanine nucleotide exchange. 81
24341 240015 cd04657 Piwi_ago-like Piwi_ago-like: PIWI domain, Argonaute-like subfamily. Argonaute is the central component of the RNA-induced silencing complex (RISC) and related complexes. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. 426
24342 240016 cd04658 Piwi_piwi-like_Euk Piwi_piwi-like_Euk: PIWI domain, Piwi-like subfamily found in eukaryotes. This domain is found in Piwi and closely related proteins, where it is believed to perform a crucial role in germline cells, via RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The mechanism in Piwi is believed to be similar to that in Argonaute, the central component of the RNA-induced silencing complex (RISC). The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. 448
24343 240017 cd04659 Piwi_piwi-like_ProArk Piwi_piwi-like_ProArk: PIWI domain, Piwi-like subfamily found in Archaea and Bacteria. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism. 404
24344 240018 cd04660 nsLTP_like nsLTP_like: Non-specific lipid-transfer protein (nsLTP)-like subfamily; composed of predominantly uncharacterized proteins with similarity to nsLTPs, including Medicago truncatula MtN5, the root-specific Phaseolus vulgaris PVR3, Antirrhinum majus FIL1, and Lilium longiflorum LIM3. Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. The MtN5 gene is induced during root nodule development. FIL1 is thought to be important in petal and stamen formation. The LIM3 gene is induced during the early prophase stage of meiosis in lily microsporocytes. 73
24345 240019 cd04661 MRP_L46 Mitochondrial ribosomal protein L46 (MRP L46) is a component of the large subunit (39S) of the mammalian mitochondrial ribosome and a member of the Nudix hydrolase superfamily. MRPs are thought to be involved in the maintenance of the mitochondrial DNA. In general, members of the Nudix superfamily require a divalent cation, such as Mg2+ or Mn2+, for activity and contain the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. MRP L46 appears to contain a modified nudix motif. 132
24346 240020 cd04662 Nudix_Hydrolase_5 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 126
24347 240021 cd04663 Nudix_Hydrolase_6 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 126
24348 240022 cd04664 Nudix_Hydrolase_7 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 129
24349 240023 cd04665 Nudix_Hydrolase_8 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 118
24350 240024 cd04666 Nudix_Hydrolase_9 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 122
24351 240025 cd04667 Nudix_Hydrolase_10 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 112
24352 240026 cd04669 Nudix_Hydrolase_11 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 121
24353 240027 cd04670 Nudix_Hydrolase_12 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 127
24354 240028 cd04671 Nudix_Hydrolase_13 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 123
24355 240029 cd04672 Nudix_Hydrolase_14 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 123
24356 240030 cd04673 Nudix_Hydrolase_15 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 122
24357 240031 cd04674 Nudix_Hydrolase_16 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 118
24358 240032 cd04676 Nudix_Hydrolase_17 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 129
24359 240033 cd04677 Nudix_Hydrolase_18 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 132
24360 240034 cd04678 Nudix_Hydrolase_19 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 129
24361 240035 cd04679 Nudix_Hydrolase_20 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 125
24362 240036 cd04680 Nudix_Hydrolase_21 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 120
24363 240037 cd04681 Nudix_Hydrolase_22 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 130
24364 240038 cd04682 Nudix_Hydrolase_23 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 122
24365 240039 cd04683 Nudix_Hydrolase_24 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 120
24366 240040 cd04684 Nudix_Hydrolase_25 Contains a crystal structure of the Nudix hydrolase from Enterococcus faecalis, which has an unknown function. In general, members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 128
24367 240041 cd04685 Nudix_Hydrolase_26 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 133
24368 240042 cd04686 Nudix_Hydrolase_27 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 131
24369 240043 cd04687 Nudix_Hydrolase_28 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 128
24370 240044 cd04688 Nudix_Hydrolase_29 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 126
24371 240045 cd04689 Nudix_Hydrolase_30 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 125
24372 240046 cd04690 Nudix_Hydrolase_31 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 118
24373 240047 cd04691 Nudix_Hydrolase_32 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 117
24374 240048 cd04692 Nudix_Hydrolase_33 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 144
24375 240049 cd04693 Nudix_Hydrolase_34 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 127
24376 240050 cd04694 Nudix_Hydrolase_35 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 143
24377 240051 cd04695 Nudix_Hydrolase_36 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 131
24378 240052 cd04696 Nudix_Hydrolase_37 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 125
24379 240053 cd04697 Nudix_Hydrolase_38 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 126
24380 240054 cd04699 Nudix_Hydrolase_39 Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 129
24381 240055 cd04700 DR1025_like DR1025 from Deinococcus radiodurans, a member of the Nudix hydrolase superfamily, shows nucleoside triphosphatase and dinucleoside polyphosphate pyrophosphatase activities. Like other enzymes belonging to this superfamily, it requires a divalent cation, in this case Mg2+, for its activity. It also contains a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. In general, substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 142
24382 271337 cd04701 Asparaginase_2 Bacterial/fungal L-Asparaginase type 2. L-Asparaginase hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzyme undergoes an autoproteolytic cleavage into alpha and beta subunits to expose a threonine residue which becomes the N-terminal residue of the beta subunit. The threonine residue plays a central role in hydrolase activity. Some asparaginases can also hydrolyze L-glutamine and are termed glutaminase-asparaginase. This is a member of the Ntn-hydrolase superfamily, and this subfamily covers mostly bacterial and fungal enzymes. 264
24383 271338 cd04702 ASRGL1_like Metazoan L-Asparaginase type 2. ASRGL1 and similar proteins constitute a subfamily of the L-Asparaginase type 2-like enzymes. The wider family includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2 enzymes. The proenzymes undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. ASRGL1, or asparaginase-like 1, has been cloned from mammalian testis cDNA libraries. It has been identified as a sperm antigen that may induce the production of autoantibodies following obstruction of the male reproductive tract, e.g. vasectomy. 289
24384 271339 cd04703 Asparaginase_2_like_1 Uncharacterized subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. 243
24385 153093 cd04704 PLA2_bee_venom_like PLA2_bee_venom_like: A sub-family of Phospholipase A2, similar to bee venom PLA2. PLA2 is a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. Bee venom PLA2 has fewer conserved disulfide bridges than most canonical PLA2s. 97
24386 153094 cd04705 PLA2_group_III_like PLA2_group_III_like: A sub-family of Phospholipase A2, similar to human group III PLA2. PLA2 is a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. 100
24387 153095 cd04706 PLA2_plant PLA2_plant: Plant-specific sub-family of Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. This sub-family does not appear to have a conserved active site and metal-binding loop. 117
24388 153096 cd04707 otoconin_90 otoconin_90: Phospholipase A2-like domains present in otoconin-90 and otoconin-95, mammal proteins that are principal matrix proteins of calcitic otoconia. Interactions involving otoconin-90 may trigger or constitute key events in otoconia formation. The PLA2-like domains in otoconins may have lost their metal-binding sites. 117
24389 240059 cd04708 BAH_plantDCM_II BAH, or Bromo Adjacent Homology domain, second copy present in DNA (Cytosine-5)-methyltransferases (DCM) from plants. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 202
24390 240060 cd04709 BAH_MTA BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The Metastasis-associated protein MTA1 is part of the NURD (nucleosome remodeling and deacetylating) complex and plays a role in cellular transformation and metastasis. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 164
24391 240061 cd04710 BAH_fungalPHD BAH, or Bromo Adjacent Homology domain, as present in fungal proteins containing PHD domains. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 135
24392 240062 cd04711 BAH_Dnmt1_II BAH, or Bromo Adjacent Homology domain, second copy present in DNA (Cytosine-5)-methyltransferases from Bilateria, Dnmt1 and similar proteins. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 137
24393 240063 cd04712 BAH_DCM_I BAH, or Bromo Adjacent Homology domain, as present in DNA (Cytosine-5)-methyltransferases (DCM) 1. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 130
24394 240064 cd04713 BAH_plant_3 BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 146
24395 240065 cd04714 BAH_BAHCC1 BAH, or Bromo Adjacent Homology domain, as present in mammalian BAHCC1 and similar proteins. BAHCC1 stands for BAH domain and coiled-coil containing 1. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 121
24396 240066 cd04715 BAH_Orc1p_like BAH, or Bromo Adjacent Homology domain, as present in the Schizosaccharomyces pombe homolog of Saccharomyces cerevisiae Orc1p and similar proteins. Orc1 is part of the Yeast Sir1-origin recognition complex. The Orc1p BAH domain functions in epigenetic silencing. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 159
24397 240067 cd04716 BAH_plantDCM_I BAH, or Bromo Adjacent Homology domain, first copy present in DNA (Cytosine-5)-methyltransferases (DCM) from plants. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 122
24398 240068 cd04717 BAH_polybromo BAH, or Bromo Adjacent Homology domain, as present in polybromo and yeast RSC1/2. The human polybromo protein (BAF180) is a component of the SWI/SNF chromatin-remodeling complex PBAF. It is thought that polybromo participates in transcriptional regulation. Saccharomyces cerevisiae RSC1 and RSC2 are part of the 15-subunit nucleosome remodeling RSC complex. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 121
24399 240069 cd04718 BAH_plant_2 BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 148
24400 240070 cd04719 BAH_Orc1p_animal BAH, or Bromo Adjacent Homology domain, as present in animal homologs of Saccharomyces cerevisiae Orc1p. Orc1 is part of the Yeast Sir1-origin recognition complex. The Orc1p BAH domain functions in epigenetic silencing. In vertebrates, a similar ORC protein complex exists, which has been shown to be essential for DNA replication in Xenopus laevis. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 128
24401 240071 cd04720 BAH_Orc1p_Yeast BAH, or Bromo Adjacent Homology domain, as present in Orc1p, which again is part of the Saccharomyces cerevisiae Sir1-origin recognition complex, and as present in Sir3p. The Orc1p BAH domain functions in epigenetic silencing. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 179
24402 240072 cd04721 BAH_plant_1 BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 130
24403 240073 cd04722 TIM_phosphate_binding TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN. 200
24404 240074 cd04723 HisA_HisF Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase (HisA) and the cyclase subunit of imidazoleglycerol phosphate synthase (HisF). The ProFAR isomerase catalyzes the fourth step in histidine biosynthesis, an isomerisation of the aminoaldose moiety of ProFAR to the aminoketose of PRFAR (N-(5'-phospho-D-1'-ribulosylformimino)-5-amino-1-(5''-phospho-ribosyl)-4-imidazolecarboxamide). In bacteria and archaea, ProFAR isomerase is encoded by the HisA gene. The Imidazole glycerol phosphate synthase (IGPS) catalyzes the fifth step of histidine biosynthesis, the formation of the imidazole ring. IGPS converts N1-(5'-phosphoribulosyl)-formimino-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to imidazole glycerol phosphate (ImGP) and 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR). This conversion involves two tightly coupled reactions in distinct active sites of IGPS. The two catalytic domains can be fused, like in fungi and plants, or performed by a heterodimer (HisH-glutaminase and HisF-cyclase), like in bacteria. 233
24405 240075 cd04724 Tryptophan_synthase_alpha Tryptophan synthase (TRPS) alpha subunit (TSA). TRPS is a bifunctional tetrameric enzyme (2 alpha and 2 beta subunits) that catalyzes the last two steps of L-tryptophan biosynthesis. The alpha and beta subunits catalyze two distinct reactions which are both strongly stimulated by the formation of the complex. The alpha subunit catalyzes the cleavage of indole 3-glycerol phosphate (IGP) to indole and d-glyceraldehyde 3-phosphate (G3P). Indole is then channeled to the active site of the beta subunit, a PLP-dependent enzyme that catalyzes a replacement reaction to convert L-serine into L-tryptophan. 242
24406 240076 cd04725 OMP_decarboxylase_like Orotidine 5'-phosphate decarboxylase (ODCase) is a dimeric enzyme that decarboxylates orotidine 5'-monophosphate (OMP) to form uridine 5'-phosphate (UMP), an essential step in the pyrimidine biosynthetic pathway. In mammals, UMP synthase contains two domains: the orotate phosphoribosyltransferase (OPRTase) domain that catalyzes the transfer of phosphoribosyl 5'-pyrophosphate (PRPP) to orotate to form OMP, and the orotidine-5'-phosphate decarboxylase (ODCase) domain that decarboxylates OMP to form UMP. 216
24407 240077 cd04726 KGPDC_HPS 3-Keto-L-gulonate 6-phosphate decarboxylase (KGPDC) and D-arabino-3-hexulose-6-phosphate synthase (HPS). KGPDC catalyzes the formation of L-xylulose 5-phosphate and carbon dioxide from 3-keto-L-gulonate 6-phosphate as part of the anaerobic pathway for L-ascorbate utilization in some eubacteria. HPS catalyzes the formation of D-arabino-3-hexulose-6-phosphate from D-ribulose 5-phosphate and formaldehyde in microorganisms that can use formaldehyde as a carbon source. Both catalyze reactions that involve the Mg2+-assisted formation and stabilization of 1,2-enediolate reaction intermediates. 202
24408 240078 cd04727 pdxS PdxS is a subunit of the pyridoxal 5'-phosphate (PLP) synthase, an important enzyme in deoxyxylulose 5-phosphate (DXP)-independent pathway for de novo biosynthesis of PLP, present in some eubacteria, in archaea, fungi, plants, plasmodia, and some metazoa. Together with PdxT, PdxS forms the PLP synthase, a heteromeric glutamine amidotransferase (GATase), whereby PdxT produces ammonia from glutamine and PdxS combines ammonia with five- and three-carbon phosphosugars to form PLP. PLP is the biologically active form of vitamin B6, an essential cofactor in many biochemical processes. PdxS subunits form two hexameric rings. 283
24409 240079 cd04728 ThiG Thiazole synthase (ThiG) is the tetrameric enzyme that is involved in the formation of the thiazole moiety of thiamin pyrophosphate, an essential ubiquitous cofactor that plays an important role in carbohydrate and amino acid metabolism. ThiG catalyzes the formation of thiazole from 1-deoxy-D-xylulose 5-phosphate (DXP) and dehydroglycine, with the help of the sulfur carrier protein ThiS that carries the sulfur needed for thiazole assembly on its carboxy terminus (ThiS-COSH). 248
24410 240080 cd04729 NanE N-acetylmannosamine-6-phosphate epimerase (NanE) converts N-acetylmannosamine-6-phosphate to N-acetylglucosamine-6-phosphate. This reaction is part of the pathway that allows the usage of sialic acid as a carbohydrate source. Sialic acids are a family of related sugars that are found as a component of glycoproteins, gangliosides, and other sialoglycoconjugates. 219
24411 240081 cd04730 NPD_like 2-Nitropropane dioxygenase (NPD), one of the nitroalkane oxidizing enzyme families, catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPD is a member of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. 236
24412 240082 cd04731 HisF The cyclase subunit of imidazoleglycerol phosphate synthase (HisF). Imidazole glycerol phosphate synthase (IGPS) catalyzes the fifth step of histidine biosynthesis, the formation of the imidazole ring. IGPS converts N1-(5'-phosphoribulosyl)-formimino-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to imidazole glycerol phosphate (ImGP) and 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR). This conversion involves two tightly coupled reactions in distinct active sites of IGPS. The two catalytic domains can be fused, like in fungi and plants, or performed by a heterodimer (HisH-glutaminase and HisF-cyclase), like in bacteria. 243
24413 240083 cd04732 HisA HisA. Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase catalyzes the fourth step in histidine biosynthesis, an isomerisation of the aminoaldose moiety of ProFAR to the aminoketose of PRFAR (N-(5'-phospho-D-1'-ribulosylformimino)-5-amino-1-(5''-phospho-ribosyl)-4-imidazolecarboxamide). In bacteria and archaea, ProFAR isomerase is encoded by the HisA gene. 234
24414 240084 cd04733 OYE_like_2_FMN Old yellow enzyme (OYE)-related FMN binding domain, group 2. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. 338
24415 240085 cd04734 OYE_like_3_FMN Old yellow enzyme (OYE)-related FMN binding domain, group 3. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. One member of this subgroup, the Sinorhizobium meliloti stachydrine utilization protein stcD, has been identified as a putative N-methylproline demethylase. 343
24416 240086 cd04735 OYE_like_4_FMN Old yellow enzyme (OYE)-related FMN binding domain, group 4. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. 353
24417 240087 cd04736 MDH_FMN Mandelate dehydrogenase (MDH)-like FMN-binding domain. MDH is part of a widespread family of homologous FMN-dependent a-hydroxy acid oxidizing enzymes that oxidizes (S)-mandelate to phenylglyoxalate. MDH is an enzyme in the mandelate pathway that occurs in several strains of Pseudomonas which converts (R)-mandelate to benzoate. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). 361
24418 240088 cd04737 LOX_like_FMN L-Lactate oxidase (LOX) FMN-binding domain. LOX is a member of the family of FMN-containing alpha-hydroxyacid oxidases and catalyzes the oxidation of l-lactate using molecular oxygen to generate pyruvate and H2O2. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). 351
24419 240089 cd04738 DHOD_2_like Dihydroorotate dehydrogenase (DHOD) class 2. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences, their cellular location and their natural electron acceptor used to reoxidize the flavin group. Members of class 1 are cytosolic enzymes and multimers, while class 2 enzymes are membrane associated, monomeric and use respiratory quinones as their physiological electron acceptors. 327
24420 240090 cd04739 DHOD_like Dihydroorotate dehydrogenase (DHOD) like proteins. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively. This subgroup has the conserved FMN binding site, but lacks some catalytic residues and may therefore be inactive. 325
24421 240091 cd04740 DHOD_1B_like Dihydroorotate dehydrogenase (DHOD) class 1B FMN-binding domain. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively. 296
24422 240092 cd04741 DHOD_1A_like Dihydroorotate dehydrogenase (DHOD) class 1A FMN-binding domain. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively. 294
24423 240093 cd04742 NPD_FabD 2-Nitropropane dioxygenase (NPD)-like domain, associated with the (acyl-carrier-protein) S-malonyltransferase FabD. NPD is part of the nitroalkane-oxidizing enzyme family that catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPDs are members of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. 418
24424 240094 cd04743 NPD_PKS 2-Nitropropane dioxygenase (NPD)-like domain, associated with polyketide synthases (PKS). NPD is part of the nitroalkane-oxidizing enzyme family that catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPDs are members of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. 320
24425 100058 cd04745 LbH_paaY_like paaY-like: This group is composed of uncharacterized proteins with similarity to the protein product of the E. coli paaY gene, which is part of the paa gene cluster responsible for phenylacetic acid degradation. Proteins in this group are expected to adopt the left-handed parallel beta-helix (LbH) structure. They contain imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Similarity to gamma carbonic anhydrase and Ferripyochelin Binding Protein (FBP) may suggest metal binding capacity. 155
24426 240095 cd04747 OYE_like_5_FMN Old yellow enzyme (OYE)-related FMN binding domain, group 5. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. 361
24427 240096 cd04748 Commd COMM_Domain, a family of domains found at the C-terminus of HCaRG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals. 87
24428 240097 cd04749 Commd1_MURR1 COMM_Domain containing protein 1, also called Murr1. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 174
24429 240098 cd04750 Commd2 COMM_Domain containing protein 2. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 166
24430 240099 cd04751 Commd3 COMM_Domain containing protein 3. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 95
24431 240100 cd04752 Commd4 COMM_Domain containing protein 4. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 174
24432 240101 cd04753 Commd5_HCaRG COMM_Domain containing protein 5, also called HCaRG (hypertension-related, calcium-regulated gene). HCaRG is a nuclear protein that might be involved in cell proliferation; it is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 110
24433 240102 cd04754 Commd6 COMM_Domain containing protein 6. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 86
24434 240103 cd04755 Commd7 COMM_Domain containing protein 7. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 180
24435 240104 cd04756 Commd8 COMM_Domain containing protein 8. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 176
24436 240105 cd04757 Commd9 COMM_Domain containing protein 9. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 108
24437 240106 cd04758 Commd10 COMM_Domain containing protein 10. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 186
24438 212498 cd04759 Rib_hydrolase ADP-ribosyl cyclase, also known as cyclic ADP-ribose hydrolase or CD38. ADP-ribosyl cyclase (EC:3.2.2.5) synthesizes the second messenger cyclic-ADP ribose (cADPR), which in turn releases calcium from internal stores. Mammals possess two membrane proteins, CD38 and BST-1/CD157, which exhibit ADP-ribosyl cyclase activity, as well as intracellular soluble ADP-ribose cyclases. CD38 is involved in differentiation, adhesion, and cell proliferation, and has been implicated in diseases such as AIDS, diabetes, and B-cell chronic lymphocytic leukemia. The extramembrane domain of CD38 acts as a multifunctional enzyme, and can synthesize cADPR from NAD+, hydrolyze NAD+ and cADPR to ADPR, as well as catalyze the exchange of the nicotinamide group of NADP+ with nicotinic acid under acidic conditions, to yield NAADP+ (nicotinic acid-adenine dinucleotide phosphate), a metabolite involved in Ca2+ mobilization from acidic stores. 244
24439 240107 cd04760 BAH_Dnmt1_I BAH, or Bromo Adjacent Homology domain, first copy present in DNA (Cytosine-5)-methyltransferases from Bilateria, Dnmt1 and similar proteins. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions. 124
24440 133389 cd04761 HTH_MerR-SF Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily. Helix-turn-helix (HTH) transcription regulator MerR superfamily, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 49
24441 133390 cd04762 HTH_MerR-trunc Helix-Turn-Helix DNA binding domain of truncated MerR-like proteins. Proteins in this family mostly have a truncated helix-turn-helix (HTH) MerR-like domain. They lack a portion of the C-terminal region, called Wing 2 and the long dimerization helix that is typically present in MerR-like proteins. These truncated domains are found in response regulator receiver (REC) domain proteins (i.e., CheY), cytosine-C5 specific DNA methylases, IS607 transposase-like proteins, and RacA, a bacterial protein that anchors chromosomes to cell poles. 49
24442 133391 cd04763 HTH_MlrA-like Helix-Turn-Helix DNA binding domain of MlrA-like transcription regulators. Helix-turn-helix (HTH) transcription regulator MlrA (merR-like regulator A) and related proteins, N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. Its close homolog, CarA from Myxococcus xanthus, is involved in activation of the carotenoid biosynthesis genes by light. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins. 68
24443 133392 cd04764 HTH_MlrA-like_sg1 Helix-Turn-Helix DNA binding domain of putative MlrA-like transcription regulators. Putative helix-turn-helix (HTH) MlrA-like transcription regulators (subgroup 1). The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins. 67
24444 133393 cd04765 HTH_MlrA-like_sg2 Helix-Turn-Helix DNA binding domain of putative MlrA-like transcription regulators. Putative helix-turn-helix (HTH) MlrA-like transcription regulators (subgroup 2), N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 99
24445 133394 cd04766 HTH_HspR Helix-Turn-Helix DNA binding domain of the HspR transcription regulator. Helix-turn-helix (HTH) transcription regulator HspR, N-terminal domain. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 91
24446 133395 cd04767 HTH_HspR-like_MBC Helix-Turn-Helix DNA binding domain of putative HspR-like transcription regulators. Putative helix-turn-helix (HTH) transcription regulator HspR-like proteins. Unlike the characterized HspR, these proteins have a C-terminal domain with putative metal binding cysteines (MBC). Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 120
24447 133396 cd04768 HTH_BmrR-like Helix-Turn-Helix DNA binding domain of BmrR-like transcription regulators. Helix-turn-helix (HTH) BmrR-like transcription regulators (TipAL, Mta, SkgA, BmrR, and BltR), N-terminal domain. These proteins have been shown to regulate expression of specific regulons in response to various toxic substances, antibiotics, or oxygen radicals in Bacillus subtilis, Streptomyces, and Caulobacter crescentus. They are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 96
24448 133397 cd04769 HTH_MerR2 Helix-Turn-Helix DNA binding domain of MerR2-like transcription regulators. Helix-turn-helix (HTH) transcription regulator MerR2 and related proteins. MerR2 in Bacillus cereus RC607 regulates resistance to organomercurials. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 116
24449 133398 cd04770 HTH_HMRTR Helix-Turn-Helix DNA binding domain of Heavy Metal Resistance transcription regulators. Helix-turn-helix (HTH) heavy metal resistance transcription regulators (HMRTR): MerR1 (mercury), CueR (copper), CadR (cadmium), PbrR (lead), ZntR (zinc), and other related proteins. These transcription regulators mediate responses to heavy metal stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 123
24450 133399 cd04772 HTH_TioE_rpt1 First Helix-Turn-Helix DNA binding domain of the regulatory protein TioE. Putative helix-turn-helix (HTH) regulatory protein, TioE, and related proteins. TioE is part of the thiocoraline gene cluster, which is involved in the biosynthesis of the antitumor thiocoraline from the marine actinomycete, Micromonospora. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Proteins in this family are unique within the MerR superfamily in that they are composed of just two adjacent MerR-like N-terminal domains; this CD contains the N-terminal or first repeat (rpt1) of these tandem MerR-like domain proteins. 99
24451 133400 cd04773 HTH_TioE_rpt2 Second Helix-Turn-Helix DNA binding domain of the regulatory protein TioE. Putative helix-turn-helix (HTH) regulatory protein, TioE, and related proteins. TioE is part of the thiocoraline gene cluster, which is involved in the biosynthesis of the antitumor thiocoraline from the marine actinomycete, Micromonospora. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Proteins in this family are unique within the MerR superfamily in that they are composed of just two adjacent MerR-like N-terminal domains; this CD mainly contains the C-terminal or second repeat (rpt2) of these tandem MerR-like domain proteins. 108
24452 133401 cd04774 HTH_YfmP Helix-Turn-Helix DNA binding domain of the YfmP transcription regulator. Helix-turn-helix (HTH) transcription regulator, YfmP, and related proteins; N-terminal domain. YfmP regulates the multidrug efflux protein, YfmO, and indirectly regulates the expression of the Bacillus subtilis copZA operon encoding a metallochaperone, CopZ, and a CPx-type ATPase efflux protein, CopA. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 96
24453 133402 cd04775 HTH_Cfa-like Helix-Turn-Helix DNA binding domain of Cfa-like transcription regulators. Putative helix-turn-helix (HTH) MerR-like transcription regulators; the HTH domain of Cfa, a cyclopropane fatty acid synthase, and other related methyltransferases, as well as, the N-terminal domain of a conserved, uncharacterized ~172 a.a. protein. Based on sequence similarity of the N-terminal domain, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 102
24454 133403 cd04776 HTH_GnyR Helix-Turn-Helix DNA binding domain of the regulatory protein GnyR. Putative helix-turn-helix (HTH) regulatory protein, GnyR, and other related proteins. GnyR belongs to the gnyRDBHAL cluster, which is involved in acyclic isoprenoid degradation in Pseudomonas aeruginosa. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 118
24455 133404 cd04777 HTH_MerR-like_sg1 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 1), N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 107
24456 133405 cd04778 HTH_MerR-like_sg2 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 2). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 219
24457 133406 cd04779 HTH_MerR-like_sg4 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 4). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 134
24458 133407 cd04780 HTH_MerR-like_sg5 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 5), N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 95
24459 133408 cd04781 HTH_MerR-like_sg6 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 6) with at least two conserved cysteines present in the C-terminal portion of the protein. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 120
24460 133409 cd04782 HTH_BltR Helix-Turn-Helix DNA binding domain of the BltR transcription regulator. Helix-turn-helix (HTH) multidrug-efflux transporter transcription regulator, BltR (BmrR-like transporter) of Bacillus subtilis, and related proteins; N-terminal domain. Blt, like Bmr, is a membrane protein which causes the efflux of a variety of toxic substances and antibiotics. These regulators are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. They share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 97
24461 133410 cd04783 HTH_MerR1 Helix-Turn-Helix DNA binding domain of the MerR1 transcription regulator. Helix-turn-helix (HTH) transcription regulator MerR1. MerR1 transcription regulators, such as Tn21 MerR and Tn501 MerR, mediate response to mercury exposure in eubacteria. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines that define a mercury binding site. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 126
24462 133411 cd04784 HTH_CadR-PbrR Helix-Turn-Helix DNA binding domain of the CadR and PbrR transcription regulators. Helix-turn-helix (HTH) CadR and PbrR transcription regulators including Pseudomonas aeruginosa CadR and Ralstonia metallidurans PbrR that regulate expression of the cadmium and lead resistance operons, respectively. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines which form a putative metal binding site. Some members in this group have a histidine-rich C-terminal extension. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 127
24463 133412 cd04785 HTH_CadR-PbrR-like Helix-Turn-Helix DNA binding domain of the CadR- and PbrR-like transcription regulators. Helix-turn-helix (HTH) CadR- and PbrR-like transcription regulators. CadR and PbrR regulate expression of the cadmium and lead resistance operons, respectively. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines which comprise a putative metal binding site. Some members in this group have a histidine-rich C-terminal extension. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 126
24464 133413 cd04786 HTH_MerR-like_sg7 Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 7) with a conserved cysteine present in the C-terminal portion of the protein. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 131
24465 133414 cd04787 HTH_HMRTR_unk Helix-Turn-Helix DNA binding domain of putative Heavy Metal Resistance transcription regulators. Putative helix-turn-helix (HTH) heavy metal resistance transcription regulators (HMRTR), unknown subgroup. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to heavy metal stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. This subgroup lacks one of the conserved, metal-binding cysteines seen in the MerR1 group. 133
24466 133415 cd04788 HTH_NolA-AlbR Helix-Turn-Helix DNA binding domain of the transcription regulators NolA and AlbR. Helix-turn-helix (HTH) transcription regulators NolA and AlbR, N-terminal domain. In Bradyrhizobium (Arachis) sp. NC92, NolA is required for efficient nodulation of host plants. In Xanthomonas albilineans, AlbR regulates the expression of the pathotoxin, albicidin. These proteins are putatively comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. They share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 96
24467 133416 cd04789 HTH_Cfa Helix-Turn-Helix DNA binding domain of the Cfa transcription regulator. Putative helix-turn-helix (HTH) MerR-like transcription regulator; the N-terminal domain of Cfa, a cyclopropane fatty acid synthase and other related methyltransferases. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 102
24468 133417 cd04790 HTH_Cfa-like_unk Helix-Turn-Helix DNA binding domain of putative Cfa-like transcription regulators. Putative helix-turn-helix (HTH) MerR-like transcription regulator; conserved, Cfa-like, unknown proteins (~172 a.a.). The N-terminal domain of these proteins appears to be related to the HTH domain of Cfa, a cyclopropane fatty acid synthase. These Cfa-like proteins have a unique C-terminal domain with conserved histidines (motif HXXFX7HXXF). Based on sequence similarity of the N-terminal domains, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 172
24469 271199 cd04791 LanC_SerThrkinase Lanthionine synthetase C-like domain associated with serine/threonine kinases. Some members of this subgroup lack the zinc binding site and the active site residues, and therefore are most likely inactive. The function of this domain is unknown. 327
24470 271200 cd04792 LanM-like Cyclases involved in the biosynthesis of class II lantibiotics, and similar proteins. LanM-like proteins. LanM is a bifunctional enzyme, involved in the synthesis of class II lantibiotics. It is responsible for both the dehydration and the cyclization of the precursor-peptide during lantibiotic synthesis. The C-terminal domain shows similarity to LanC, the cyclase component of the lan operon, but the N terminus seems to be unrelated to the dehydratase, LanB. 836
24471 271201 cd04793 LanC Cyclases involved in the biosynthesis of lantibiotics. LanC is the cyclase enzyme of the lanthionine synthetase. Lantibiotics are a unique class of lanthionine-containing peptide antibiotics. They are ribosomally synthesized as precursor peptides and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans) in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. Also contains SpaC (the cyclase involved in the biosynthesis of subtilin), NisC, and homologs. 377
24472 271202 cd04794 euk_LANCL Eukaryotic Lanthionine synthetase C-like protein. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 and LANCL2 (testes-specific adriamycin sensitivity protein) were thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have a role in the immune surveillance of these organs. More recently, they have been associated with signal transduction processes and insulin sensitization. In particular, LANCL2 has been shown to bind abscisic acid (ABA), and this interaction may play a role in signaling pathways triggered by ABA, such as in human granulocytes and rat insulinoma cells. This eukaryotic LANCL family also includes Arabidopsis GCR2. 349
24473 240112 cd04795 SIS SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. 87
24474 341401 cd04801 CBS_pair_peptidase_M50 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in the metalloprotease peptidase M50. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in peptidase M50. Members of the M50 metallopeptidase family include mammalian sterol-regulatory element binding protein (SREBP) site 2 proteases and various hypothetical bacterial homologues. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 113
24475 240117 cd04813 PA_1 PA_1: Protease-associated (PA) domain subgroup 1. A subgroup of PA-domain containing proteins. Proteins in this subgroup contain a RING-finger (Really Interesting New Gene) domain C-terminal to this PA domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins in this group contain a C-terminal RING-finger domain. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases, such as hSPPL2a and 2b, ii) various E3 ubiquitin ligases similar to human GRAIL (gene related to anergy in lymphocytes) protein, iii) various proteins containing a RING finger motif such as Arabidopsis ReMembR-H2 protein, iv) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), v) various plant vacuolar sorting receptors such as Pisum sativum BP-80, vi) prostate-specific membrane antigen (PSMA), vii) yeast aminopeptidase Y, viii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, ix) various subtilisin-like proteases such as Cucumisin from the juice of melon fruits, and x) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 117
24476 240118 cd04814 PA_M28_1 PA_M28_1: Protease-associated (PA) domain, peptidase family M28, subfamily-1. A subfamily of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subfamilies; relatively little is known about proteins in this subfamily. 142
24477 240119 cd04815 PA_M28_2 PA_M28_2: Protease-associated (PA) domain, peptidase family M28, subfamily-2. A subfamily of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subfamilies; relatively little is known about proteins in this subfamily. 134
24478 240120 cd04816 PA_SaNapH_like PA_SaNapH_like: Protease-associated domain containing proteins like Streptomyces anulatus N-acetylpuromycin N-acetylhydrolase (SaNapH). This group contains various PA domain-containing proteins similar to SaNapH. Proteins in this group belong to the peptidase M28 family. NapH is a terminal enzyme in the puromycin biosynthetic pathway; NapH hydrolyzes N-acetylpuromycin to the active antibiotic. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 122
24479 240121 cd04817 PA_VapT_like PA_VapT_like: Protease-associated domain containing proteins like VapT from Vibrio metschnikovii strain RH530. This group contains various PA domain-containing proteins similar to V. metschnikovii VapT, including the serine alkaline protease SapSh from the psychrotroph Shewanella strain Ac10 and the Apa1 protease from the psychrotroph Pseudoalteromonas sp. As-11. VapT is a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease showing high activity over a broad pH range and temperature. SapSh has a high level of protease activity at low temperatures. Apa1 is also cold-adapted. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 139
24480 240122 cd04818 PA_subtilisin_1 PA_subtilisin_1: Protease-associated domain containing subtilisin-like proteases, subgroup 1. A subgroup of PA domain-containing subtilisin-like proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following subtilisin-like proteases: i) melon cucumisin, ii) Arabidopsis thaliana Ara12, iii) Alnus glutinosa ag12, iv) members of the tomato P69 family, and v) tomato LeSBT2. However, these proteins belong to other subtilisin-like subgroups. Relatively little is known about proteins in this subgroup. 118
24481 240123 cd04819 PA_2 PA_2: Protease-associated (PA) domain subgroup 2. A subgroup of PA-domain containing proteins. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins in this group contain a C-terminal RING-finger domain. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases, such as hSPPL2a and 2b, ii) various E3 ubiquitin ligases similar to human GRAIL (gene related to anergy in lymphocytes) protein, iii) various proteins containing a RING finger motif such as Arabidopsis ReMembR-H2 protein, iv) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), v) various plant vacuolar sorting receptors such as Pisum sativum BP-80, vi) prostate-specific membrane antigen (PSMA), vii) yeast aminopeptidase Y, viii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, ix) various subtilisin-like proteases such as Cucumisin from the juice of melon fruits, and x) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 127
24482 240124 cd04820 PA_M28_1_1 PA_M28_1_1: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 1. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 137
24483 240125 cd04821 PA_M28_1_2 PA_M28_1_2: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 2. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 157
24484 240126 cd04822 PA_M28_1_3 PA_M28_1_3: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 3. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 151
24485 240127 cd04823 ALAD_PBGS_aspartate_rich Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. All of PBGS_aspartate_rich contain an aspartate rich metal binding site with the general sequence DXALDX(Y/F)X3G(H/Q)DG. They also contain an allosteric magnesium binding sequence RX~164DX~65EXXXD and are activated by magnesium and/or potassium, but not by zinc. PBGSs_aspartate_rich are found in some bacterial species and photosynthetic organisms such as vascular plants, mosses and algae, but not in archaea. 320
24486 240128 cd04824 eu_ALAD_PBGS_cysteine_rich Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. The eukaryotic PBGSs represented by this model, which contain a cysteine-rich zinc binding motif (DXCXCX(Y/F)X3G(H/Q)CG), require zinc for their activity; they do not contain an additional allosteric metal binding site and do not bind magnesium. 320
24487 173791 cd04842 Peptidases_S8_Kp43_protease Peptidase S8 family domain in Kp43 proteases. Kp43 proteases are members of the peptidase S8 or Subtilase clan of proteases. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution). Kp43 is topologically similar to kexin and furin, both of which are proprotein convertases, but differs in amino acid sequence and the position of its C-terminal barrel. Kp43 has 3 Ca2+ binding sites that differ from the corresponding sites in the other known subtilisin-like proteases. KP-43 protease is known to be an oxidation-resistant protease when compared with the other subtilisin-like proteases. 293
24488 173792 cd04843 Peptidases_S8_11 Peptidase S8 family domain, uncharacterized subfamily 11. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 277
24489 173793 cd04847 Peptidases_S8_Subtilisin_like_2 Peptidase S8 family domain in Subtilisin-like proteins. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 291
24490 173794 cd04848 Peptidases_S8_Autotransporter_serine_protease_like Peptidase S8 family domain in Autotransporter serine proteases. Autotransporter serine proteases belong to the Peptidase S8 or Subtilase family. Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution). Autotransporters are a superfamily of outer membrane/secreted proteins of gram-negative bacteria. The presence of these subtilisin-like domains in these autotransporters may enable them to be auto-catalytic and may also allow them to act as maturation proteases, cleaving other outer membrane proteins at the cell surface. 267
24491 173795 cd04852 Peptidases_S8_3 Peptidase S8 family domain, uncharacterized subfamily 3. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 307
24492 173796 cd04857 Peptidases_S8_Tripeptidyl_Aminopeptidase_II Peptidase S8 family domain in Tripeptidyl aminopeptidases_II. Tripeptidyl aminopeptidases II are members of the peptidase S8 or Subtilase family. Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution). Tripeptidyl aminopeptidase II removes tripeptides from the free N terminus of oligopeptides as well as having endoproteolytic activity. Some tripeptidyl aminopeptidases have been shown to cleave tripeptides and small peptides, e.g. angiotensin II and glucagon, while others are believed to be involved in MHC I processing. 412
24493 240129 cd04859 Prim_Pol Prim_Pol: Primase-polymerase (primpol) domain of the type found in bifunctional replicases from archaeal plasmids, including ORF904 protein of the crenarchaeal plasmid pRN1 from Sulfolobus islandicus (pRN1 primpol). These primpol domains belong to the archaeal/eukaryal primase (AEP) superfamily. This group includes archaeal plasmids and bacteriophage AEPs. The ORF904 protein is a multifunctional protein having ATPase, primase and DNA polymerase activity, and may play a role in the replication of the archaeal plasmid. The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. For pRN1 primpol, the primase activity prefers dNTPs to rNTPs; incorporation of dNTPs requires rNTP as cofactor. The pRN1 primpol contains an unusual zinc-binding stem, which is not conserved in other members of this group. 152
24494 240130 cd04860 AE_Prim_S AE_Prim_S: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In addition to its catalytic role in replication, DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. The function of the larger primase subunit is unclear. Included in this group are Pfu41 and Pfu46; these two proteins comprise the primase complex of the archaeon Pyrococcus furiosus. Pfu41 and Pfu46 have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41. 232
24495 240131 cd04861 LigD_Pol_like LigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. PaeLigD is monomeric, containing an N-terminal phosphoesterase module, a central polymerase (Pol) domain, and a C-terminal ATP-dependent ligase domain. Mycobacterium tuberculosis (Mt)LigD, also found in this group, is monomeric and contains the same modules but these are arranged differently: an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The PaeLigD Pol domain in vitro, in a manganese-dependent fashion, catalyzes templated extensions of 5'-overhang duplex DNA, and nontemplated single-nucleotide additions to blunt-end duplex DNA; it preferentially adds single ribonucleotides at blunt DNA ends. PaeLigD Pol adds a correctly paired rNTP to the DNA primer termini more rapidly than it does a correctly paired dNTP; it has higher infidelity as an RNA polymerase than it does as a DNA polymerase, which is in keeping with the mutagenic property of NHEJ-mediated DNA DSB repair. The MtLigD Pol domain similarly is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro. The MtLigD Pol domain has been shown to prefer DNA gapped substrates containing a 5'-phosphate group at the gap. 227
24496 240132 cd04862 PaeLigD_Pol_like PaeLigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. PaeLigD is monomeric, containing an N-terminal phosphoesterase module, a central polymerase (Pol) domain, and a C-terminal ATP-dependent ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The PaeLigD Pol domain in vitro, in a manganese-dependent fashion, catalyzes templated extensions of 5'-overhang duplex DNA, and nontemplated single-nucleotide additions to blunt-end duplex DNA; it preferentially adds single ribonucleotides at blunt DNA ends. PaeLigD Pol adds a correctly paired rNTP to the DNA primer termini more rapidly than it does a correctly paired dNTP; it has higher infidelity as an RNA polymerase than it does as a DNA polymerase, which is in keeping with the mutagenic property of NHEJ-mediated DNA DSB repair. 227
24497 240133 cd04863 MtLigD_Pol_like MtLigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Mycobacterium tuberculosis (Mt)LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. MtLigD is monomeric and contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro. The MtLigD Pol domain has been shown to prefer DNA gapped substrates containing a 5'-phosphate group at the gap. 231
24498 240134 cd04864 LigD_Pol_like_1 LigD_Pol_like_1: Polymerase (Pol) domain of mostly bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 1. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization. 228
24499 240135 cd04865 LigD_Pol_like_2 LigD_Pol_like_2: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 2. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization. 228
24500 240136 cd04866 LigD_Pol_like_3 LigD_Pol_like_3: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 3. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization. 223
24501 340516 cd04867 TGS_YchF_OLA1 TGS (ThrRS, GTPase and SpoT) domain found in the YchF/OLA1 family proteins. The YchF/Ola1 family includes bacterial ribosome-binding ATPase YchF as well as its human homolog Obg-like ATPase 1 (OLA1), both of which belong to the Obg family of GTPases, and are novel ATPases that bind and hydrolyze ATP more efficiently than GTP. They have been associated with various cellular processes and pathologies, including DNA repair, tumorigenesis, and apoptosis, in addition to the regulation of the oxidative stress response. OLA1 is also termed DNA damage-regulated overexpressed in cancer 45 (DOC45), or GTP-binding protein 9 (GTPBP9). It is over-expressed in several human malignancies, including cancers of the colon, rectum, ovary, lung, stomach, and uterus. It is linked to the cellular stress response and tumorigenesis, and may also serve as a valuable tumor marker. Members in this family contain a central Obg-type G (guanine nucleotide-binding) domain, flanked by a coiled-coil domain and this TGS (ThrRS, GTPase, SpoT) domain of unknown function. 85
24502 153140 cd04868 ACT_AK-like ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes each of two ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). Typically, AK consists of two ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Aspartokinase is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. One mechanism for the regulation of this pathway is by the production of several isoenzymes of aspartokinase with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli (EC), three different aspartokinase isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis (BS) isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta-subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium, and apparently, unique to cyanobacteria, are aspartokinases with two tandem pairs of ACT domains, C-terminal to the catalytic domain. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants, a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this AK family CD are the ACT domains of the Methylomicrobium alcaliphilum AK; the first enzyme of the ectoine biosynthetic pathway found in this bacterium and several other halophilic/halotolerant bacteria. Members of this CD belong to the superfamily of ACT regulatory domains. 60
24503 153141 cd04869 ACT_GcvR_2 ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the second of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR interacts directly with GcvA rather than binding to DNA to cause repression. Members of this CD belong to the superfamily of ACT regulatory domains. 81
24504 153142 cd04870 ACT_PSP_1 ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). The ACT_PSP_1 CD includes the first of the two ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). PSPs belong to the L-2-haloacid dehalogenase-like protein superfamily. PSP is involved in serine metabolism; serine is synthesized from phosphoglycerate through sequential reactions catalyzed by 3-phosphoglycerate dehydrogenase (SerA), 3-phosphoserine aminotransferase (SerC), and SerB. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24505 153143 cd04871 ACT_PSP_2 ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). The ACT_PSP_2 CD includes the second of the two ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). PSPs belong to the L-2-haloacid dehalogenase-like protein superfamily. PSP is involved in serine metabolism; serine is synthesized from phosphoglycerate through sequential reactions catalyzed by 3-phosphoglycerate dehydrogenase (SerA), 3-phosphoserine aminotransferase (SerC), and SerB. Members of this CD belong to the superfamily of ACT regulatory domains. 84
24506 153144 cd04872 ACT_1ZPV ACT domain proteins similar to the yet uncharacterized Streptococcus pneumoniae ACT domain protein. This CD, ACT_1ZPV, includes those single ACT domain proteins similar to the yet uncharacterized Streptococcus pneumoniae ACT domain protein (pdb structure 1ZPV). Members of this CD belong to the superfamily of ACT regulatory domains. 88
24507 153145 cd04873 ACT_UUR-ACR-like ACT domains of the bacterial signal-transducing uridylyltransferase/uridylyl-removing (UUR) enzyme, GlnD. This ACT domain family, ACT_UUR_ACR-like, includes the two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase/uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD are the four ACT domains of a novel protein composed almost entirely of ACT domain repeats (the ACR protein) and like proteins. These ACR proteins, found in Arabidopsis and Oryza, are proposed to function as novel regulatory or sensor proteins in plants. This CD also includes the first of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein and related domains, as well as the N-terminal ACT domain of a yet uncharacterized Arabidopsis/Oryza predicted tyrosine kinase. Members of this CD belong to the superfamily of ACT regulatory domains. 70
24508 153146 cd04874 ACT_Af1403 N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, and related domains. This CD includes the N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, from Archaeoglobus fulgidus and other related archaeal ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24509 153147 cd04875 ACT_F4HF-DF N-terminal ACT domain of formyltetrahydrofolate deformylase (F4HF-DF; formyltetrahydrofolate hydrolase). This CD includes the N-terminal ACT domain of formyltetrahydrofolate deformylase (F4HF-DF; formyltetrahydrofolate hydrolase) which catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Formyl-FH4 hydrolase generates the formate that is used by purT-encoded 5'-phosphoribosylglycinamide transformylase for step three of de novo purine nucleotide synthesis. Formyl-FH4 hydrolase, a hexamer which is activated by methionine and inhibited by glycine, is proposed to regulate the balance of FH4 and C1-FH4 in response to changing growth conditions. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24510 153148 cd04876 ACT_RelA-SpoT ACT domain found C-terminal of the RelA/SpoT domains. ACT_RelA-SpoT: the ACT domain found C-terminal of the RelA/SpoT domains. Enzymes of the Rel/Spo family enable bacteria to survive prolonged periods of nutrient limitation by controlling guanosine-3'-diphosphate-5'-(tri)diphosphate ((p)ppGpp) production and subsequent rRNA repression (stringent response). Both the synthesis of (p)ppGpp from ATP and GDP(GTP), and its hydrolysis to GDP(GTP) and pyrophosphate, are catalyzed by Rel/Spo proteins. In Escherichia coli and its close relatives, the metabolism of (p)ppGpp is governed by two homologous proteins, RelA and SpoT. The RelA protein catalyzes (p)ppGpp synthesis in a reaction requiring its binding to ribosomes bearing codon-specified uncharged tRNA. The major role of the SpoT protein is the breakdown of (p)ppGpp by a manganese-dependent (p)ppGpp pyrophosphohydrolase activity. Although the stringent response appears to be tightly regulated by these two enzymes in E. coli, a bifunctional Rel/Spo protein has been discovered in most gram-positive organisms studied so far. These bifunctional Rel/Spo homologs (rsh) appear to modulate (p)ppGpp levels through two distinct active sites that are controlled by a reciprocal regulatory mechanism ensuring inverse coupling of opposing activities. In studies with the Streptococcus equisimilis Rel/Spo homolog, the C-terminal domain appears to be involved in this reciprocal regulation of the two opposing catalytic activities present in the N-terminal domain, ensuring that both synthesis and degradation activities are not coinduced. Members of this CD belong to the superfamily of ACT regulatory domains. 71
24511 153149 cd04877 ACT_TyrR N-terminal ACT domain of the TyrR protein. ACT_TyrR: N-terminal ACT domain of the TyrR protein. The TyrR protein of Escherichia coli controls the expression of a group of transcription units (TyrR regulon) whose gene products are involved in the biosynthesis or transport of the aromatic amino acids. Binding to specific DNA sequences known as TyrR boxes, the TyrR protein can either activate or repress transcription at different sigma70 promoters. Its regulatory activity occurs in response to intracellular levels of tyrosine, phenylalanine and tryptophan. The TyrR protein consists of an N-terminal region important for transcription activation with an ATP-independent aromatic amino acid binding site (contained within the ACT domain) and is involved in dimerization; a central region with an ATP binding site, an ATP-dependent aromatic amino acid binding site and is involved in hexamerization; and a helix turn helix DNA binding C-terminal region. In solution, in the absence of cofactors or in the presence of phenylalanine alone, the TyrR protein exists as a dimer. However, in the presence of ATP and tyrosine the TyrR protein self-aggregates to form a hexamer. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24512 153150 cd04878 ACT_AHAS N-terminal ACT domain of the Escherichia coli IlvH-like regulatory subunit of acetohydroxyacid synthase (AHAS). ACT_AHAS: N-terminal ACT domain of the Escherichia coli IlvH-like regulatory subunit of acetohydroxyacid synthase (AHAS). AHAS catalyses the first common step in the biosynthesis of the three branched-chain amino acids. The first step involves the condensation of either pyruvate or 2-ketobutyrate with the two-carbon hydroxyethyl fragment derived from another pyruvate molecule, covalently bound to the coenzyme thiamine diphosphate. Bacterial AHASs generally consist of regulatory and catalytic subunits. The effector (valine) binding sites are proposed to be located in two symmetrically related positions in the interface between a pair of N-terminal ACT domains with the C-terminal domain of IlvH contacting the catalytic dimer. Plants Arabidopsis and Oryza have tandem IlvH subunits; both the first and second ACT domain sequences are present in this CD. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24513 153151 cd04879 ACT_3PGDH-like ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). ACT_3PGDH-like: The ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with or without an extended C-terminal (xct) region found in various bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback controlled by the end product L-serine in an allosteric manner. In the Escherichia coli homotetrameric enzyme, the interface at adjacent ACT (regulatory) domains couples to create an extended beta-sheet. Each regulatory interface forms two serine-binding sites. The mechanism by which serine transmits inhibition to the active site is postulated to involve the tethering of the regulatory domains together to create a rigid quaternary structure with a solvent-exposed active site cleft. This CD also includes the C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit, found in various bacterial anaerobes such as Clostridium, Bacillus, and Treponema species. LSD enzymes catalyze the deamination of L-serine, producing pyruvate and ammonia. Unlike the eukaryotic L-serine dehydratase, which requires the pyridoxal-5'-phosphate (PLP) cofactor, the prokaryotic L-serine dehydratase contains a [4Fe-4S] cluster instead of a PLP active site. The LSD alpha and beta subunits of the 'clostridial' enzyme are encoded by the sdhA and sdhB genes. The single subunit bacterial homologs of L-serine dehydratase (LSD1, LSD2, TdcG) present in E. coli, and other Enterobacteriales, lack the ACT domain described here. Members of this CD belong to the superfamily of ACT regulatory domains. 71
24514 153152 cd04880 ACT_AAAH-PDT-like ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. Eukaryotic AAAHs have an N-terminal ACT (regulatory) domain, a middle catalytic domain and a C-terminal domain which is responsible for the oligomeric state of the enzyme forming a domain-swapped tetrameric coiled-coil. The PAH, TH, and TPH enzymes contain highly conserved catalytic domains but distinct N-terminal ACT domains and differ in their mechanisms of regulation. One commonality is that all three eukaryotic enzymes appear to be regulated, in part, by the phosphorylation of serine residues N-terminal of the ACT domain. Also included in this CD are the C-terminal ACT domains of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme found in plants, fungi, bacteria, and archaea. The P-protein of Escherichia coli (CM-PDT) catalyzes the conversion of chorismate to prephenate and then the decarboxylation and dehydration to form phenylpyruvate. These are the first two steps in the biosynthesis of L-Phe and L-Tyr via the shikimate pathway in microorganisms and plants. The E. coli P-protein (CM-PDT) has three domains with an N-terminal domain with chorismate mutase activity, a middle domain with prephenate dehydratase activity, and an ACT regulatory C-terminal domain. The prephenate dehydratase enzyme has a PDT and ACT domain. The ACT domain is essential to bring about the negative allosteric regulation by L-Phe binding. L-Phe binds with positive cooperativity; with this binding, there is a shift in the protein to less active tetrameric and higher oligomeric forms from a more active dimeric form. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24515 153153 cd04881 ACT_HSDH-Hom ACT_HSDH_Hom CD includes the C-terminal ACT domain of the NAD(P)H-dependent, homoserine dehydrogenase (HSDH) and related domains. The ACT_HSDH_Hom CD includes the C-terminal ACT domain of the NAD(P)H-dependent, homoserine dehydrogenase (HSDH) encoded by the hom gene of Bacillus subtilis and other related sequences. HSDH reduces aspartate semi-aldehyde to the amino acid homoserine, one that is required for the biosynthesis of Met, Thr, and Ile from Asp. Neither the enzyme nor the aspartate pathway is found in the animal kingdom. This mostly bacterial HSDH group has a C-terminal ACT domain and is believed to be involved in enzyme regulation. A C-terminal deletion in the Corynebacterium glutamicum HSDH abolished allosteric inhibition by L-threonine. Members of this CD belong to the superfamily of ACT regulatory domains. 79
24516 153154 cd04882 ACT_Bt0572_2 C-terminal ACT domain of a novel protein composed of just two ACT domains. Included in this CD is the C-terminal ACT domain of a novel protein composed of just two ACT domains, as seen in the yet uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related proteins. Members of this CD belong to the superfamily of ACT regulatory domains. 65
24517 153155 cd04883 ACT_AcuB C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. This CD includes the C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. AcuB is putatively involved in the anaerobic catabolism of acetoin, and related proteins. Studies report the induction of AcuB by nitrate respiration and also by fermentation. Since acetoin can be secreted and later serve as a source of carbon, it has been proposed that, during anaerobic growth when other carbon sources are exhausted, the induction of the AcuB protein results in acetoin catabolism. AcuB-like proteins have two N-terminal tandem CBS domains and a single C-terminal ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24518 153156 cd04884 ACT_CBS C-terminal ACT domain of the cystathionine beta-synthase (CBS) domain protein found in Thermotoga maritima, Tm0935, and delta proteobacteria. This CD includes the C-terminal ACT domain of the cystathionine beta-synthase (CBS) domain protein found in Thermotoga maritima, Tm0935, and delta proteobacteria. This protein has two N-terminal tandem CBS domains and a single C-terminal ACT domain. The CBS domain is found in a wide range of proteins, often in tandem arrangements and together with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24519 153157 cd04885 ACT_ThrD-I Tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes each of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains. 68
24520 153158 cd04886 ACT_ThrD-II-like C-terminal ACT domain of biodegradative (catabolic) threonine dehydratase II (ThrD-II) and other related ACT domains. This CD includes the C-terminal ACT domain of biodegradative (catabolic) threonine dehydratase II (ThrD-II) and other related ACT domains. The Escherichia coli tdcB gene product, ThrD-II, anaerobically catalyzes the pyridoxal phosphate-dependent dehydration of L-threonine and L-serine to ammonia and to alpha-ketobutyrate and pyruvate, respectively. Tetrameric ThrD-II is subject to allosteric activation by AMP, inhibition by alpha-keto acids, and catabolite inactivation by several metabolites of glycolysis and the citric acid cycle. Also included in this CD are N-terminal ACT domains present in smaller (~170 a.a.) archaeal proteins of unknown function. Members of this CD belong to the superfamily of ACT regulatory domains. 73
24521 153159 cd04887 ACT_MalLac-Enz ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI and related domains. The ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI, a malolactic enzyme (MalLac-Enz) which converts malate to lactate, and other related ACT domains. The yqkJ product is predicted to convert malate directly to lactate, as opposed to related malic enzymes that convert malate to pyruvate. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24522 153160 cd04888 ACT_PheB-BS C-terminal ACT domain of a small (~147 a.a.) putative phenylalanine biosynthetic pathway protein described in Bacillus subtilis (BS) PheB (PheB-BS) and related domains. This CD includes the C-terminal ACT domain of a small (~147 a.a.) putative phenylalanine biosynthetic pathway protein described in Bacillus subtilis (BS) PheB (PheB-BS) and other related ACT domains. In B. subtilis, the upstream gene of pheB, pheA encodes prephenate dehydratase (PDT). The presumed product of the pheB gene is chorismate mutase (CM). The deduced product of the B. subtilis pheB gene, however, has no significant homology to the CM portion of the bifunctional CM-PDT of Escherichia coli. The presence of an ACT domain lends support to the prediction that these proteins function as a phenylalanine-binding regulatory protein. Members of this CD belong to the superfamily of ACT regulatory domains. 76
24523 153161 cd04889 ACT_PDH-BS-like C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate. Included in this CD is the C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate, found in Bacillus subtilis (BS) and other Firmicutes, Deinococci, and Bacteroidetes. PDH is the first enzyme in the aromatic amino acid pathway specific for the biosynthesis of tyrosine. This enzyme is feedback inhibited by tyrosine in B. subtilis and other microorganisms. Both phenylalanine and tryptophan have been shown to be inhibitors of this activity in B. subtilis. Bifunctional chorismate mutase-PDH (TyrA) enzymes such as those seen in Escherichia coli do not contain an ACT domain. Also included in this CD is the N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains as seen in the uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 56
24524 153162 cd04890 ACT_AK-like_1 ACT domains found C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes the first of two ACT domains found C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids, lysine, threonine, methionine, and isoleucine. This CD includes the first ACT domain of the Escherichia coli (EC) isoenzyme, AKIII (LysC) and the Arabidopsis isoenzyme, aspartate kinase 1, both enzymes monofunctional and involved in lysine synthesis, as well as the first ACT domain of Bacillus subtilis (BS) isoenzyme, AKIII (YclM), and of the Saccharomyces cerevisiae AK (Hom3). Also included are the first ACT domains of the Methylomicrobium alcaliphilum AK, the first enzyme of the ectoine biosynthetic pathway. Members of this CD belong to the superfamily of ACT regulatory domains. 62
24525 153163 cd04891 ACT_AK-LysC-DapG-like_1 ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII and related proteins. This CD includes the N-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, as well as, the first and third, of four, ACT domains present in cyanobacteria AK. Also included are the N-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase isoenzyme AKI found in Bacilli (Bacillus subtilis strain 168), Clostridia, and Actinobacteria bacterial species. Members of this CD belong to the superfamily of ACT regulatory domains. 61
24526 153164 cd04892 ACT_AK-like_2 ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes the second of two ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). The exception in this group, is the inclusion of the first ACT domain of the bifunctional aspartokinase - homoserine dehydrogenase-like enzyme group (ACT_AKi-HSDH-ThrA-like_1) which includes the monofunctional, threonine-sensitive, aspartokinase found in Methanococcus jannaschii and other related archaeal species. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. AK is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. One mechanism for the regulation of this pathway is by the production of several isoenzymes of AK with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli (EC), three different AK isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis (BS) isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas, the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta- subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium, and apparently, unique to cyanobacteria, are AKs with two tandem pairs of ACT domains, C-terminal to the catalytic domain. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants, a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this CD are the ACT domains of the Methylomicrobium alcaliphilum AK; the first enzyme of the ectoine biosynthetic pathway found in this bacterium and several other halophilic/halotolerant bacteria. Members of this CD belong to the superfamily of ACT regulatory domains. 65
24527 153165 cd04893 ACT_GcvR_1 ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the first of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR interacts directly with GcvA rather than binding to DNA to cause repression. Members of this CD belong to the superfamily of ACT regulatory domains. 77
24528 153166 cd04894 ACT_ACR-like_1 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the N-terminal ACT domain of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains. 69
24529 153167 cd04895 ACT_ACR_1 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the N-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24530 153168 cd04896 ACT_ACR-like_3 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the third ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24531 153169 cd04897 ACT_ACR_3 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the third ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24532 153170 cd04898 ACT_ACR-like_4 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the C-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains. 77
24533 153171 cd04899 ACT_ACR-UUR-like_2 C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD and related domains. This ACT domain family, ACT_ACR-UUR-like_2, includes the second of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD are the second and fourth ACT domains of a novel protein composed almost entirely of ACT domain repeats, the ACR protein. These ACR proteins, found in Arabidopsis and Oryza, are proposed to function as novel regulatory or sensor proteins in plants. Members of this CD belong to the superfamily of ACT regulatory domains. 70
24534 153172 cd04900 ACT_UUR-like_1 ACT domain family, ACT_UUR-like_1, includes the first of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD and related domains. This ACT domain family, ACT_UUR-like_1, includes the first of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD is the N-terminal ACT domain of an as yet uncharacterized Arabidopsis/Oryza predicted tyrosine kinase. Members of this CD belong to the superfamily of ACT regulatory domains. 73
24535 153173 cd04901 ACT_3PGDH C-terminal ACT (regulatory) domain of D-3-Phosphoglycerate Dehydrogenase (3PGDH) found in fungi and bacteria. The C-terminal ACT (regulatory) domain of D-3-Phosphoglycerate Dehydrogenase (3PGDH) found in fungi and bacteria. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3- phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In Escherichia coli, the SerA 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. In the homotetrameric enzyme, the interface at adjacent ACT (regulatory) domains couples to create an extended beta-sheet. Each regulatory interface forms two serine-binding sites. The mechanism by which serine transmits inhibition to the active site is postulated to involve the tethering of the regulatory domains together to create a rigid quaternary structure with a solvent-exposed active site cleft. Members of this CD belong to the superfamily of ACT regulatory domains. 69
24536 153174 cd04902 ACT_3PGDH-xct C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). The C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with an extended C-terminal (xct) region from bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3- phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. Some 3PGDH enzymes have an additional domain formed by an extended C-terminal region. This additional domain introduces significant asymmetry to the homotetramer. Adjacent ACT (regulatory) domains interact, creating two serine-binding sites, however, this asymmetric arrangement results in the formation of two different and distinct domain interfaces between identical domains in the asymmetric unit. How this asymmetry influences the mechanism of effector inhibition is still unknown. Members of this CD belong to the superfamily of ACT regulatory domains. 73
24537 153175 cd04903 ACT_LSD C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit. The C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit, found in various bacterial anaerobes such as Clostridium, Bacillus, and Treponema species. These enzymes catalyze the deamination of L-serine, producing pyruvate and ammonia. Unlike the eukaryotic L-serine dehydratase, which requires the pyridoxal-5'-phosphate (PLP) cofactor, the prokaryotic L-serine dehydratase contains a [4Fe-4S] cluster instead of a PLP active site. The LSD alpha and beta subunits of the 'clostridial' enzyme are encoded by the sdhA and sdhB genes. The single subunit bacterial homologs of L-serine dehydratase (LSD1, LSD2, TdcG) present in Escherichia coli, and other enterobacteria, lack the ACT domain described here. Members of this CD belong to the superfamily of ACT regulatory domains. 71
24538 153176 cd04904 ACT_AAAH ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe; TH catalyses the hydroxylation of L-Tyr to 3,4-dihydroxyphenylalanine, the rate limiting step in the biosynthesis of catecholamines; and TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxytryptamine (serotonin) and the first reaction in the synthesis of melatonin. Eukaryotic AAAHs have an N-terminal ACT (regulatory) domain, a middle catalytic domain and a C-terminal domain which is responsible for the oligomeric state of the enzyme forming a domain-swapped tetrameric coiled-coil. The PAH, TH, and TPH enzymes contain highly conserved catalytic domains but distinct N-terminal ACT domains (this CD) and differ in their mechanisms of regulation. One commonality is that all three eukaryotic enzymes are regulated in part by the phosphorylation of serine residues N-terminal of the ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24539 153177 cd04905 ACT_CM-PDT C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme. The C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme, found in plants, fungi, bacteria, and archaea. The P-protein of E. coli (CM-PDT, PheA) catalyzes the conversion of chorismate to prephenate and then the decarboxylation and dehydration to form phenylpyruvate. These are the first two steps in the biosynthesis of L-Phe and L-Tyr via the shikimate pathway in microorganisms and plants. The E. coli P-protein (CM-PDT) has three domains with an N-terminal domain with chorismate mutase activity, a middle domain with prephenate dehydratase activity, and an ACT regulatory C-terminal domain. The prephenate dehydratase enzyme has a PDT and ACT domain. The ACT domain is essential to bring about the negative allosteric regulation by L-Phe binding. L-Phe binds with positive cooperativity; with this binding, there is a shift in the protein to less active tetrameric and higher oligomeric forms from a more active dimeric form. Members of this CD belong to the superfamily of ACT regulatory domains. 80
24540 153178 cd04906 ACT_ThrD-I_1 First of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes the first of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains. 85
24541 153179 cd04907 ACT_ThrD-I_2 Second of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes the second of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains. 81
24542 153180 cd04908 ACT_Bt0572_1 N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains. Included in this CD is the N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains as seen in the uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related ACT domains. These tandem ACT domain proteins belong to the superfamily of ACT regulatory domains. 66
24543 153181 cd04909 ACT_PDH-BS C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH). The C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate, found in Bacillus subtilis (BS) and other Firmicutes, Deinococci, and Bacteroidetes. PDH is the first enzyme in the aromatic amino acid pathway specific for the biosynthesis of tyrosine. This enzyme is feedback-inhibited by tyrosine in B. subtilis and other microorganisms. Both phenylalanine and tryptophan have been shown to be inhibitors of this activity in B. subtilis. Bifunctional chorismate mutase-PDH (TyrA) enzymes such as those seen in Escherichia coli do not contain an ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains. 69
24544 153182 cd04910 ACT_AK-Ectoine_1 ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and various other halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes' of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. de novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. Members of this CD belong to the superfamily of ACT regulatory domains. 71
24545 153183 cd04911 ACT_AKiii-YclM-BS_1 ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Bacillus subtilis YclM is reported to be a single polypeptide of 50 kD. AKIII from Bacillus subtilis strain 168 is induced by lysine and repressed by threonine and it is synergistically inhibited by lysine and threonine. Members of this CD belong to the superfamily of ACT regulatory domains. 76
24546 153184 cd04912 ACT_AKiii-LysC-EC-like_1 ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC) and plants, (Zea mays Ask1, Ask2, and Arabidopsis thaliana AK1). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Like the A. thaliana AK1 (AK1-AT), the E. coli AKIII (LysC) has two bound feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. The lysine-sensitive plant isoenzyme is synergistically inhibited by S-adenosylmethionine. A homolog of this group appears to be the Saccharomyces cerevisiae AK (Hom3) which clusters with this group as well. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24547 153185 cd04913 ACT_AKii-LysC-BS-like_1 ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related proteins. This CD includes the N-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. subtilis 168 AKII is induced by methionine and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In corynebacteria and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Conserved residues in the ACT domains have been shown to be involved in this concerted feedback inhibition. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas aeruginosa, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides. This CD includes the first ACT domain C-terminal to the AK catalytic domain of the alpha subunit and the first ACT domain of the beta subunit that lacks the AK catalytic domain. Unlike the C. glutamicum AK beta subunit, which is involved in feedback regulation, the B. subtilis AKII beta subunit is not. Cyanobacteria aspartokinases are unique to this CD and they have a unique domain architecture with two tandem pairs of ACT domains, C-terminal to the catalytic AK domain. In this CD, the first and third cyanobacteria AK ACT domains are present. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24548 153186 cd04914 ACT_AKi-DapG-BS_1 ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI. This CD includes the N-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) strain 168), Clostridia, and Actinobacteria, bacterial species. In B. subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 67
24549 153187 cd04915 ACT_AK-Ectoine_2 ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and various other halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes' of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. De novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. Members of this CD belong to the superfamily of ACT regulatory domains. 66
24550 153188 cd04916 ACT_AKiii-YclM-BS_2 ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. B. subtilis YclM is reported to be a single polypeptide of 50 kD. AKIII from B. subtilis strain 168 is induced by lysine and repressed by threonine and it is synergistically inhibited by lysine and threonine. Members of this CD belong to the superfamily of ACT regulatory domains. 66
24551 153189 cd04917 ACT_AKiii-LysC-EC_2 ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The E. coli AKIII (LysC) binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. The second ACT domain (ACT2), this CD, is not involved in the binding of heterotropic effectors. Members of this CD belong to the superfamily of ACT regulatory domains. 64
24552 153190 cd04918 ACT_AK1-AT_2 ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1). This CD includes the second of two ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1), which can be synergistically inhibited by S-adenosylmethionine (SAM). This isoenzyme is found in higher plants, Arabidopsis thaliana (AT) and Zea mays, and also in Chlorophyta. In its inactive state, Arabidopsis AK1 binds the effectors lysine and SAM (two molecules each) at the interface of two ACT1 domain subunits. The second ACT domain (ACT2), this CD, does not interact with an effector. Members of this CD belong to the superfamily of ACT regulatory domains. 65
24553 153191 cd04919 ACT_AK-Hom3_2 ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. AK is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single AK, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies have shown that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. Members of this CD belong to the superfamily of ACT regulatory domains. 66
24554 153192 cd04920 ACT_AKiii-DAPDC_2 ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC). This CD includes the second of two ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. Aspartokinase (AK) is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The lysA gene encodes the enzyme DAPDC, a pyridoxal-5'-phosphate (PLP)-dependent enzyme which catalyzes the final step in the lysine biosynthetic pathway converting meso-diaminopimelic acid (DAP) to l-lysine. Tandem ACT domains are positioned centrally with the AK catalytic domain N-terminal and the DAPDC domains C-terminal. Members of this CD belong to the superfamily of ACT regulatory domains. 63
24555 153193 cd04921 ACT_AKi-HSDH-ThrA-like_1 ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). This CD includes the first of two ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). The ACT domains are positioned between the N-terminal catalytic domain of AK and the C-terminal HSDH domain found in bacteria (Escherichia coli (EC) ThrA) and higher plants (Zea mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. HSDH is the first committed reaction in the branch of the pathway that leads to Thr and Met. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains were shown to be involved in allosteric activation. Also included in this CD is the first of two ACT domains of a tetrameric, monofunctional, threonine-sensitive, AK found in Methanococcus jannaschii and other related archaeal species. Members of this CD belong to the superfamily of ACT regulatory domains. 80
24556 153194 cd04922 ACT_AKi-HSDH-ThrA_2 ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). This CD includes the second of two ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). The ACT domains are positioned between the N-terminal catalytic domain of AK and the C-terminal HSDH domain found in bacteria (Escherichia coli (EC) ThrA) and higher plants (Zea mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. HSDH is the first committed reaction in the branch of the pathway that leads to Thr and Met. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains were shown to be involved in allosteric activation. Members of this CD belong to the superfamily of ACT regulatory domains. 66
24557 153195 cd04923 ACT_AK-LysC-DapG-like_2 ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related domains. This CD includes the C-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, as well as, the second and fourth, of four, ACT domains present in cyanobacteria AK. Also included are the C-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase isoenzyme AKI found in Bacilli (B. subtilis strain 168), Clostridia, and Actinobacteria bacterial species. Members of this CD belong to the superfamily of ACT regulatory domains. 63
24558 153196 cd04924 ACT_AK-Arch_2 ACT domains of a monofunctional aspartokinase found mostly in Archaea species (ACT_AK-Arch_2). Included in this CD is the second of two ACT domains of a monofunctional aspartokinase found mostly in Archaea species (ACT_AK-Arch_2). The first or N-terminal ACT domain of these proteins cluster with the ThrA-like ACT 1 domains (ACT_AKi-HSDH-ThrA-like_1) which includes the threonine-sensitive archaeal Methanococcus jannaschii aspartokinase ACT 1 domain. Members of this CD belong to the superfamily of ACT regulatory domains. 66
24559 153197 cd04925 ACT_ACR_2 ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24560 153198 cd04926 ACT_ACR_4 C-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the C-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains. 72
24561 153199 cd04927 ACT_ACR-like_2 Second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains. 76
24562 153200 cd04928 ACT_TyrKc Uncharacterized, N-terminal ACT domain of an Arabidopsis/Oryza predicted tyrosine kinase and other related ACT domains. This CD includes a novel, yet uncharacterized, N-terminal ACT domain of an Arabidopsis/Oryza predicted tyrosine kinase and other related ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 68
24563 153201 cd04929 ACT_TPH ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxytryptamine (serotonin) and the first reaction in the synthesis of melatonin. Very little is known about the role of the ACT domain in TPH, which appears to be regulated by phosphorylation but not by its substrate or cofactor. Members of this CD belong to the superfamily of ACT regulatory domains. 74
24564 153202 cd04930 ACT_TH ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tyrosine hydroxylases (TH). ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tyrosine hydroxylases (TH). TH catalyses the hydroxylation of L-Tyr to 3,4-dihydroxyphenylalanine, the rate limiting step in the biosynthesis of catecholamines (dopamine, noradrenaline and adrenaline), functioning as hormones and neurotransmitters. The enzyme is not regulated by its amino acid substrate, but instead by phosphorylation at several serine residues located N-terminal of the ACT domain, and by feedback inhibition by catecholamines at the active site. Members of this CD belong to the superfamily of ACT regulatory domains. 115
24565 153203 cd04931 ACT_PAH ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe. In PAH, an autoregulatory sequence, N-terminal of the ACT domain, extends across the catalytic domain active site and regulates the enzyme by intrasteric regulation. It appears that the activation by L-Phe induces a conformational change that converts the enzyme to a high-affinity and high-activity state. Modulation of activity is achieved through inhibition by BH4 and activation by phosphorylation of serine residues of the autoregulatory region. The molecular basis for the cooperative activation process is not fully understood yet. Members of this CD belong to the superfamily of ACT regulatory domains. 90
24566 153204 cd04932 ACT_AKiii-LysC-EC_1 ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The E. coli AKIII (LysC) binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24567 153205 cd04933 ACT_AK1-AT_1 ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1). This CD includes the first of two ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1), which can be synergistically inhibited by S-adenosylmethionine. This isoenzyme is found in higher plants, Arabidopsis thaliana (AT) and Zea mays, and also in Chlorophyta. Like the Escherichia coli AKIII (LysC), Arabidopsis AK1 binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. A loop in common is involved in the binding of both Lys and S-adenosylmethionine providing an explanation for the synergistic inhibition by these effectors. Members of this CD belong to the superfamily of ACT regulatory domains. 78
24568 153206 cd04934 ACT_AK-Hom3_1 ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. AK is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single AK, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies have shown that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. Members of this CD belong to the superfamily of ACT regulatory domains. 73
24569 153207 cd04935 ACT_AKiii-DAPDC_1 ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. This CD includes the first of two ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. Aspartokinase (AK) is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The lysA gene encodes the enzyme DAPDC, a pyridoxal-5'-phosphate (PLP)-dependent enzyme which catalyzes the final step in the lysine biosynthetic pathway converting meso-diaminopimelic acid (DAP) to l-lysine. Tandem ACT domains are positioned centrally with the AK catalytic domain N-terminal and the DAPDC domains C-terminal. Members of this CD belong to the superfamily of ACT regulatory domains. 75
24570 153208 cd04936 ACT_AKii-LysC-BS-like_2 ACT domains of the lysine-sensitive, aspartokinase (AK) isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related domains. This CD includes the C-terminal of the two ACT domains of the lysine-sensitive, aspartokinase (AK) isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis strain 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive AK isoenzymes. The B. subtilis strain 168 AKII is induced by methionine and repressed and inhibited by lysine. Although C. glutamicum is known to contain a single AK, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In corynebacteria and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Conserved residues in the ACT domains have been shown to be involved in this concerted feedback inhibition. Also included in this CD are the AKs of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single AKs found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis strain 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans AKs are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides. This CD includes the second ACT domain C-terminal to the AK catalytic domain of the alpha subunit and the second ACT domain of the beta subunit that lacks the AK catalytic domain. Unlike the C. glutamicum AK beta subunit, which is involved in feedback regulation, the B. subtilis AKII beta subunit is not. Cyanobacteria AKs are unique to this CD and they have a unique domain architecture with two tandem pairs of ACT domains, C-terminal to the catalytic AK domain. In this CD, the second and fourth cyanobacteria AK ACT domains are present. Members of this CD belong to the superfamily of ACT regulatory domains. 63
24571 153209 cd04937 ACT_AKi-DapG-BS_2 ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI. This CD includes the C-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) strain 168), Clostridia, and Actinobacteria bacterial species. In B. subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive AK isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The BS AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains. 64
24572 340517 cd04938 TGS_Obg TGS (ThrRS, GTPase and SpoT) domain found in the Obg protein family. The Obg family of GTPases has been implicated in cellular processes as diverse as sporulation, stress response, control of DNA replication, and ribosome assembly. It consists of several subfamilies such as DRG and YchF with TGS domain. The TGS domain is named after the various RNA-binding multidomain ThrRS, GTPase, and SpoT/RelA proteins in which this domain occurs. The TGS domain of Obg-like GTPases such as those present in DRG (developmentally regulated GTP-binding protein), and GTP-binding proteins Ygr210 and YchF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. 77
24573 240137 cd04939 PA2301 PA2301 is an uncharacterized Pseudomonas aeruginosa protein with a YbaK-like domain of unknown function. The YbaK-like domain family includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, and ProX. The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation. Organisms whose ProRS lacks the INS domain express a single-domain INS homolog such as YbaK, ProX, or PrdX which supplies the function of INS in trans. 139
24574 340854 cd04946 GT4_AmsK-like amylovoran biosynthesis glycosyltransferase AmsK and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. AmsK is involved in the biosynthesis of amylovoran, which functions as a virulence factor. It functions as a glycosyl transferase which transfers galactose from UDP-galactose to a lipid-linked amylovoran-subunit precursor. The members of this family are found mainly in bacteria and Archaea. 401
24575 340855 cd04949 GT4_GtfA-like accessory Sec system glycosyltransferase GtfA and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases and is named after gtfA in Streptococcus gordonii, where it plays a role in the O-linked glycosylation of GspB, a cell surface glycoprotein involved in platelet binding. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in bacteria. 328
24576 340856 cd04950 GT4_TuaH-like teichuronic acid biosynthesis glycosyltransferase TuaH and similar proteins. Members of this family may function in teichuronic acid biosynthesis/cell wall biogenesis. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 373
24577 340857 cd04951 GT4_WbdM_like LPS/UnPP-GlcNAc-Gal a-1,4-glucosyltransferase WbdM and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases and is named after WbdM in Escherichia coli. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in bacteria. 360
24578 340858 cd04955 GT4-like glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in certain bacteria and Archaea. 379
24579 340859 cd04962 GT4_BshA-like N-acetyl-alpha-D-glucosaminyl L-malate synthase BshA and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria, while some of them are also found in Archaea and eukaryotes. 370
24580 409356 cd04967 IgI_1_Contactin First immunoglobulin (Ig) domain of contactin; member of the I-set of (Ig) superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains. 96
24581 409357 cd04968 IgI_3_Contactin Third immunoglobulin (Ig) domain of contactin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains. 88
24582 409358 cd04969 Ig5_Contactin Fifth immunoglobulin (Ig) domain of contactin. The members here are composed of the fifth immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. 89
24583 409359 cd04970 Ig6_Contactin Sixth immunoglobulin (Ig) domain of contactin. The members here are composed of the sixth immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. 102
24584 409360 cd04971 IgI_TrKABC_d5 Fifth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB, and TrkC; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth domain of Trk receptors TrkA, TrkB, and TrkC, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, TrkB, and TrkC mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor found in all major NGF targets, including the sympathetic, trigeminal, and dorsal root ganglia, cholinergic neurons of the basal forebrain, and the striatum. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. The TrkC gene is expressed throughout the mammalian nervous system. This group belongs to the I-set of IgSF domains. 96
24585 409361 cd04972 Ig_TrkABC_d4 Fourth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB, and TrkC. The members here are composed of the fourth domain of Trk receptors TrkA, TrkB, and TrkC, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, TrkB, and TrkC mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor found in all major NGF targets, including the sympathetic, trigeminal, and dorsal root ganglia, cholinergic neurons of the basal forebrain, and the striatum. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. The TrkC gene is expressed throughout the mammalian nervous system. 88
24586 409362 cd04973 IgI_1_FGFR First immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for all FGFs. 94
24587 409363 cd04974 IgI_3_FGFR Third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for FGFs. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 102
24588 409364 cd04975 IgI_4_SCFR_like Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR), and similar domains; member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR). In addition to SCFR, this group also includes the fourth Ig domain of macrophage colony stimulating factor receptor (M-CSF-R). SCFR, also called receptor tyrosine kinase KIT or proto-oncogene c-Kit, contains an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes, and gametogenesis. SCF binds to the second and third Ig-like domains of SCFR, this fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth SCFR Ig-like domain abolishes the ligand-induced dimerization of SCFR and completely inhibits signal transduction. M-CSF-R, also called proto-oncogene c-Fms, acts as cell-surface receptor for CSF1 and IL34 and plays an essential role in the regulation of survival, proliferation and differentiation of hematopoietic precursor cells, such as macrophages and monocytes. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 101
24589 409365 cd04976 IgI_VEGFR Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members, VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1), and VEGFR-3 (Flt-4). VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic, and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 90
24590 409366 cd04977 IgI_1_NCAM-1_like First immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-1, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-1. NCAM-1 plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM) and heterophilic (NCAM-nonNCAM) interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves the Ig1, Ig2, and Ig3 domains. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. Also included in this group is NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE). This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 95
24591 409367 cd04978 Ig4_L1-NrCAM_like Fourth immunoglobulin (Ig)-like domain of L1, Ng-CAM (Neuron-glia CAM cell adhesion molecule), and NrCAM (Ng-CAM-related). The members here are composed of the fourth immunoglobulin (Ig)-like domain of L1, Ng-CAM (Neuron-glia CAM cell adhesion molecule), and NrCAM (Ng-CAM-related). These proteins belong to the L1 subfamily of cell adhesion molecules (CAMs) and are comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. These molecules are primarily expressed in the nervous system. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. 89
24592 409368 cd04979 Ig_Semaphorin_C Immunoglobulin (Ig)-like domain at the C-terminus of semaphorins. The members here are composed of the immunoglobulin (Ig)-like domain in semaphorins. Semaphorins are transmembrane proteins that have important roles in a variety of tissues. Functionally, semaphorins were initially characterized for their importance in the development of the nervous system and in axonal guidance. Later they were found to be important for the formation and functioning of the cardiovascular, endocrine, gastrointestinal, hepatic, immune, musculoskeletal, renal, reproductive, and respiratory systems. Semaphorins function through binding to their receptors, and transmembrane semaphorins also serve as receptors themselves. Although the molecular mechanism of semaphorins is poorly understood, the Ig-like domains may be involved in ligand binding or dimerization. 88
24593 409369 cd04980 IgV_L_kappa Immunoglobulin (Ig) light chain, kappa type, variable (V) domain. The members here are composed of the immunoglobulin (Ig) light chain, kappa type, variable (V) domain. This group contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains. 106
24594 409370 cd04981 IgV_H Immunoglobulin (Ig) heavy chain (H), variable (V) domain. The members here are composed of the immunoglobulin (Ig) heavy chain (H), variable (V) domain. This group contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which can associate with any of the heavy chains. This family includes alpha, gamma, delta, epsilon, and mu heavy chains. 118
24595 409371 cd04982 IgV_TCR_gamma Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) gamma chain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the gamma chain of gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains. Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigens as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain the standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 117
24596 409372 cd04983 IgV_TCR_alpha Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) alpha chain and similar proteins. The members here are composed of the immunoglobulin (Ig) variable domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta polypeptide chains with variable (V) and constant (C) regions. This group represents the variable domain of the alpha chain of TCRs and also includes the variable domain of delta chains of TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The variable domain of TCRs is responsible for antigen recognition, and is located at the N-terminus of the receptor. Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 109
24597 409373 cd04984 IgV_L_lambda Immunoglobulin (Ig) lambda light chain variable (V) domain. The members here are composed of the immunoglobulin (Ig) light chain, lambda type, variable (V) domain. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 105
24598 409374 cd04985 IgC1_CH1_IgADEGM CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, delta, epsilon, gamma, and mu chains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 98
24599 409375 cd04986 IgC1_CH2_IgA CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin heavy alpha chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin constant-1 set domain (IgC) of alpha heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 96
24600 240138 cd05005 SIS_PHI Hexulose-6-phosphate isomerase (PHI). PHI is a member of the SIS (Sugar ISomerase domain) superfamily. In the ribulose monophosphate pathway of formaldehyde fixation, hexulose-6-phosphate synthase catalyzes the condensation of ribulose-5-phosphate with formaldehyde to form hexulose-6-phosphate, which is then isomerized to fructose-6-phosphate by PHI. 179
24601 240139 cd05006 SIS_GmhA Phosphoheptose isomerase is a member of the SIS (Sugar ISomerase) superfamily. Phosphoheptose isomerase catalyzes the isomerization of sedoheptulose 7-phosphate into D-glycero-D-mannoheptose 7-phosphate. This is the first step of the biosynthesis of gram-negative bacteria inner core lipopolysaccharide precursor, L-glycero-D-mannoheptose (Gmh). 177
24602 240140 cd05007 SIS_Etherase N-acetylmuramic acid 6-phosphate etherase. Members of this family contain the SIS (Sugar ISomerase) domain. The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. The bacterial cell wall sugar N-acetylmuramic acid carries a unique D-lactyl ether substituent at the C3 position. The etherase catalyzes the cleavage of the lactyl ether bond of N-acetylmuramic acid 6-phosphate. 257
24603 240141 cd05008 SIS_GlmS_GlmD_1 SIS (Sugar ISomerase) domain repeat 1 found in Glucosamine 6-phosphate synthase (GlmS) and Glucosamine-6-phosphate deaminase (GlmD). The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. GlmS contains an N-terminal glutaminase domain and two C-terminal SIS domains and catalyzes the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source. The glutaminase domain hydrolyzes glutamine to glutamate and ammonia. Ammonia is transferred through a channel to the isomerase domain for glucosamine 6-phosphate synthesis. The end product of the pathway is N-acetylglucosamine, which plays multiple roles in eukaryotic cells including being a building block of bacterial and fungal cell walls. In the absence of glutamine, GlmS catalyzes the isomerization of fructose 6-phosphate into glucose 6-phosphate (PGI-like activity). Glucosamine-6-phosphate deaminase (GlmD) contains two SIS domains and catalyzes the deamination and isomerization of glucosamine-6-phosphate into fructose-6-phosphate with the release of ammonia; in the presence of high ammonia concentrations, GlmD can catalyze the reverse reaction. 126
24604 240142 cd05009 SIS_GlmS_GlmD_2 SIS (Sugar ISomerase) domain repeat 2 found in Glucosamine 6-phosphate synthase (GlmS) and Glucosamine-6-phosphate deaminase (GlmD). The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. GlmS contains an N-terminal glutaminase domain and two C-terminal SIS domains and catalyzes the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source. The glutaminase domain hydrolyzes glutamine to glutamate and ammonia. Ammonia is transferred through a channel to the isomerase domain for glucosamine 6-phosphate synthesis. The end product of the pathway is N-acetylglucosamine, which plays multiple roles in eukaryotic cells including being a building block of bacterial and fungal cell walls. In the absence of glutamine, GlmS catalyzes the isomerization of fructose 6-phosphate into glucose 6-phosphate (PGI-like activity). Glucosamine-6-phosphate deaminase (GlmD) contains two SIS domains and catalyzes the deamination and isomerization of glucosamine-6-phosphate into fructose-6-phosphate with the release of ammonia; in the presence of high ammonia concentrations, GlmD can catalyze the reverse reaction. 153
24605 240143 cd05010 SIS_AgaS_like AgaS-like protein. AgaS contains a SIS (Sugar ISomerase) domain which is found in many phosphosugar isomerases and phosphosugar binding proteins. AgaS is a putative isomerase in Escherichia coli. It is similar to the glucosamine-6-phosphate synthases (GlmS) which catalyzes the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source. 151
24606 240144 cd05013 SIS_RpiR RpiR-like protein. RpiR contains a SIS (Sugar ISomerase) domain, which is found in many phosphosugar isomerases and phosphosugar binding proteins. In E. coli, rpiR negatively regulates the expression of rpiB gene. Both rpiB and rpiA are ribose phosphate isomerases that catalyze the reversible reactions of ribose 5-phosphate into ribulose 5-phosphate. 139
24607 240145 cd05014 SIS_Kpsf KpsF-like protein. KpsF is an arabinose-5-phosphate isomerase which contains SIS (Sugar ISomerase) domains. SIS domains are found in many phosphosugar isomerases and phosphosugar binding proteins. KpsF catalyzes the reversible reaction of ribulose 5-phosphate to arabinose 5-phosphate. This is the second step in the CMP-Kdo biosynthesis pathway. 128
24608 240146 cd05015 SIS_PGI_1 Phosphoglucose isomerase (PGI) contains two SIS (Sugar ISomerase) domains. This classification is based on the alignment of the first SIS domain. PGI is a multifunctional enzyme which as an intracellular dimer catalyzes the reversible isomerization of glucose 6-phosphate to fructose 6-phosphate. As an extracellular protein, PGI also has functions equivalent to neuroleukin (NLK), autocrine motility factor (AMF), and maturation factor (MF). Evidence suggests that PGI, NLK, AMF, and MF are closely related or identical. NLK is a neurotrophic growth factor that promotes regeneration and survival of neurons. The dimeric form of NLK has isomerase function, whereas its monomeric form carries out neurotrophic activity. AMF is a cytokine that stimulates cell migration and metastasis. MF mediates the differentiation of human myeloid leukemic HL-60 cells to terminal monocytic cells. 158
24609 240147 cd05016 SIS_PGI_2 Phosphoglucose isomerase (PGI) contains two SIS (Sugar ISomerase) domains. This classification is based on the alignment of the second SIS domain. PGI is a multifunctional enzyme which as an intracellular dimer catalyzes the reversible isomerization of glucose 6-phosphate to fructose 6-phosphate. As an extracellular protein, PGI also has functions equivalent to neuroleukin (NLK), autocrine motility factor (AMF), and maturation factor (MF). Evidence suggests that PGI, NLK, AMF, and MF are closely related or identical. NLK is a neurotrophic growth factor that promotes regeneration and survival of neurons. The dimeric form of NLK has isomerase function, whereas its monomeric form carries out neurotrophic activity. AMF is a cytokine that stimulates cell migration and metastasis. MF mediates the differentiation of human myeloid leukemic HL-60 cells to terminal monocytic cells. 164
24610 240148 cd05017 SIS_PGI_PMI_1 The members of this protein family contain the SIS (Sugar ISomerase) domain and have both the phosphoglucose isomerase (PGI) and the phosphomannose isomerase (PMI) functions. These functions catalyze the reversible reactions of glucose 6-phosphate to fructose 6-phosphate, and mannose 6-phosphate to fructose 6-phosphate, respectively at an equal rate. This protein contains two SIS domains. This alignment is based on the first SIS domain. 119
24611 176853 cd05018 CoxG Carbon monoxide dehydrogenase subunit G (CoxG). CoxG has been shown, in Oligotropha carboxidovorans, to anchor the carbon monoxide (CO) dehydrogenase to the cytoplasmic membrane. The gene encoding CoxG is part of the Cox cluster (coxBCMSLDEFGHIK) located on a low-copy-number, circular, megaplasmid pHCG3. This cluster includes genes encoding subunits of CO dehydrogenase and several accessory components involved in the utilization of CO. This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 144
24612 240149 cd05022 S-100A13 S-100A13: S-100A13 domain found in proteins similar to S100A13. S100A13 is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A13 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100A13 is involved in the cellular export of interleukin-1 (IL-1) and of fibroblast growth factor-1 (FGF-1), which plays an important role in angiogenesis and tissue regeneration. Export is based on the Cu(II)-dependent formation of multiprotein complexes containing the S100A13 protein. Assembly of these complexes occurs near the inner surface of the plasma membrane. Binding of two Ca(II) ions per monomer triggers key conformational changes leading to the creation of two identical and symmetrical Cu(II)-binding sites on the surface of the protein, close to the interface between the two monomers. These Cu(II)-binding sites are unique among the S100 proteins, which are reported to bind Cu(II) or Zn(II) ions in addition to Ca(II) ions. In addition, the three-dimensional structure of S100A13 differs significantly from those of other S100 proteins; the hydrophobic pocket that largely contributes to protein-protein interactions in other S100 proteins is absent in S100A13. The structure of S100A13 contains a large patch of negatively charged residues flanked by dense cationic clusters, formed mostly from positively charged residues from the C-terminal end, which plays a major role in binding FGF-1. 89
24613 240150 cd05023 S-100A11 S-100A11: S-100A11 domain found in proteins similar to S100A11. S100A11 is a member of the S-100 domain family within EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100A11 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100 proteins have also been associated with a variety of pathological events, including neoplastic transformation and neurodegenerative diseases such as Alzheimer's, usually via overexpression of the protein. S100A11 is expressed in smooth muscle and other tissues and is involved in calcium-dependent membrane aggregation, which is important for cell vesiculation. As is the case for many other S100 proteins, S100A11 is a homodimer, which is able to form a heterodimer with S100B through subunit exchange. Ca2+ binding to S100A11 results in a conformational change in the protein, exposing a hydrophobic surface that interacts with target proteins. In addition to binding to annexin A1 and A6, S100A11 also interacts with actin and transglutaminase. 89
24614 240151 cd05024 S-100A10 S-100A10: A subgroup of the S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutation in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data supports the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions. 91
24615 240152 cd05025 S-100A1 S-100A1: S-100A1 domain found in proteins similar to S100A1. S100A1 is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A1 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. As is the case with many other members of S100 protein family, S100A1 is implicated in intracellular and extracellular regulatory activities, including interaction with myosin-associated twitchin kinase, actin-capping protein CapZ, synapsin I, and tubulin. Structural data suggests that S100A1 proteins exist within cells as antiparallel homodimers, while heterodimers with S100A4 and S100B have also been reported. Upon binding calcium S100A1 changes conformation to expose a hydrophobic cleft which is the interaction site of S100A1 with its more than 20 known target proteins. 92
24616 240153 cd05026 S-100Z S-100Z: S-100Z domain found in proteins similar to S100Z. S100Z is a member of the S100 domain family within the EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100Z group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100Z is normally expressed in various tissues, with its highest level of expression being in spleen and leukocytes. The function of S100Z remains unclear. Preliminary structural data suggests that S100Z is a homodimer; however, a heterodimer with S100P has been reported. S100Z is capable of binding calcium ions. When calcium binds to S100Z, the protein experiences a conformational change, which exposes hydrophobic surfaces on the protein. In comparison with their normal tissue counterparts, S100Z gene expression appears to be deregulated in some tumor tissues. 93
24617 240154 cd05027 S-100B S-100B: S-100B domain found in proteins similar to S100B. S100B is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100B group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100B is most abundant in glial cells of the central nervous system, predominately in astrocytes. S100B is involved in signal transduction via the inhibition of protein phosphorylation, regulation of enzyme activity and by affecting the calcium homeostasis. Upon calcium binding the S100B homodimer changes conformation to expose a hydrophobic cleft, which represents the interaction site of S100B with its more than 20 known target proteins. These target proteins include several cellular architecture proteins such as tubulin and GFAP; S100B can inhibit polymerization of these oligomeric molecules. Furthermore, S100B inhibits the phosphorylation of multiple kinase substrates including the Alzheimer protein tau and neuromodulin (GAP-43) through a calcium-sensitive interaction with the protein substrates. 88
24618 240155 cd05029 S-100A6 S-100A6: S-100A6 domain found in proteins similar to S100A6. S100A6 is a member of the S100 domain family within EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100A6 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100A6 is normally expressed in the G1 phase of the cell cycle in neuronal cells. The function of S100A6 remains unclear, but evidence suggests that it is involved in cell cycle regulation and exocytosis. S100A6 may also be involved in tumorigenesis; the protein is overexpressed in several tumors. Ca2+ binding to S100A6 leads to a conformational change in the protein, which exposes a hydrophobic surface for interaction with target proteins. Several such proteins have been identified: glyceraldehyde-3-phosphate dehydrogenase, annexins 2, 6 and 11 and Calcyclin-Binding Protein (CacyBP). 88
24619 240156 cd05030 calgranulins Calgranulins: S-100 domain found in proteins belonging to the Calgranulin subgroup of the S100 family of EF-hand calcium-modulated proteins, including S100A8, S100A9, and S100A12. Note that the S-100 hierarchy, to which this Calgranulin group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. These proteins are expressed mainly in granulocytes, and are involved in inflammation, allergy, and neuritogenesis, as well as in host-parasite response. Calgranulins are modulated not only by calcium, but also by other metals such as zinc and copper. Structural data suggested that calgranulins may exist in multiple structural forms, homodimers, as well as hetero-oligomers. For example, the S100A8/S100A9 complex called calprotectin plays important roles in the regulation of inflammatory processes, wound repair, and regulating zinc-dependent enzymes as well as microbial growth. 88
24620 240157 cd05031 S-100A10_like S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10_like group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutation in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data supports the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions. 94
24621 173625 cd05032 PTKc_InsR_like Catalytic domain of Insulin Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The InsR subfamily is composed of InsR, Insulin-like Growth Factor-1 Receptor (IGF-1R), and similar proteins. InsR and IGF-1R are receptor PTKs (RTKs) composed of two alphabeta heterodimers. Binding of the ligand (insulin, IGF-1, or IGF-2) to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, stimulating downstream kinase activities, which initiate signaling cascades and biological function. InsR and IGF-1R, which share 84% sequence identity in their kinase domains, display physiologically distinct yet overlapping functions in cell growth, differentiation, and metabolism. InsR activation leads primarily to metabolic effects while IGF-1R activation stimulates mitogenic pathways. In cells expressing both receptors, InsR/IGF-1R hybrids are found together with classical receptors. Both receptors can interact with common adaptor molecules such as IRS-1 and IRS-2. The InsR-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
24622 270629 cd05033 PTKc_EphR Catalytic domain of Ephrin Receptor Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EphRs comprise the largest subfamily of receptor PTKs (RTKs). They can be classified into two classes (EphA and EphB), according to their extracellular sequences, which largely correspond to binding preferences for either GPI-anchored ephrin-A ligands or transmembrane ephrin-B ligands. Vertebrates have ten EphA and six EphB receptors, which display promiscuous ligand interactions within each class. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. This allows ephrin/EphR dimers to form, leading to the activation of the intracellular tyr kinase domain. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The main effect of ephrin/EphR interaction is cell-cell repulsion or adhesion. Ephrin/EphR signaling is important in neural development and plasticity, cell morphogenesis and proliferation, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
24623 270630 cd05034 PTKc_Src_like Catalytic domain of Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Src subfamily members include Src, Lck, Hck, Blk, Lyn, Fgr, Fyn, Yrk, and Yes. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. Src kinases are overexpressed in a variety of human cancers, making them attractive targets for therapy. They are also implicated in acute inflammatory responses and osteoclast function. Src, Fyn, Yes, and Yrk are widely expressed, while Blk, Lck, Hck, Fgr, and Lyn show a limited expression pattern. The Src-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 248
24624 270631 cd05035 PTKc_TAM Catalytic Domain of TAM (Tyro3, Axl, Mer) Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The TAM subfamily consists of Tyro3 (or Sky), Axl, Mer (or Mertk), and similar proteins. TAM subfamily members are receptor tyr kinases (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. TAM proteins are implicated in a variety of cellular effects including survival, proliferation, migration, and phagocytosis. They are also associated with several types of cancer as well as inflammatory, autoimmune, vascular, and kidney diseases. The TAM subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 273
24625 270632 cd05036 PTKc_ALK_LTK Catalytic domain of the Protein Tyrosine Kinases, Anaplastic Lymphoma Kinase and Leukocyte Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyr residues in protein substrates. ALK and LTK are orphan receptor PTKs (RTKs) whose ligands are not yet well-defined. ALK appears to play an important role in mammalian neural development as well as visceral muscle differentiation in Drosophila. ALK is aberrantly expressed as fusion proteins, due to chromosomal translocations, in about 60% of anaplastic large cell lymphomas (ALCLs). ALK fusion proteins are also found in rare cases of diffuse large B cell lymphomas (DLBCLs). LTK is mainly expressed in B lymphocytes and neuronal tissues. It is important in cell proliferation and survival. Transgenic mice expressing LTK display retarded growth and high mortality rate. In addition, a polymorphism in mouse and human LTK is implicated in the pathogenesis of systemic lupus erythematosus. RTKs contain an extracellular ligand-binding domain, a transmembrane region, and an intracellular tyr kinase domain. They are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The ALK/LTK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
24626 270633 cd05037 PTK_Jak_rpt1 Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinases, Janus kinases. The Jak subfamily is composed of Jak1, Jak2, Jak3, TYK2, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. In the case of Jak2, the presumed pseudokinase (repeat 1) domain exhibits dual-specificity kinase activity, phosphorylating two negative regulatory sites in Jak2: Ser523 and Tyr570. Most Jaks are expressed in a wide variety of tissues, except for Jak3, which is expressed only in hematopoietic cells. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). Jaks are also involved in regulating the surface expression of some cytokine receptors. The Jak-STAT pathway is involved in many biological processes including hematopoiesis, immunoregulation, host defense, fertility, lactation, growth, and embryogenesis. The Jak subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
24627 270634 cd05038 PTKc_Jak_rpt2 Catalytic (repeat 2) domain of the Protein Tyrosine Kinases, Janus kinases. The Jak subfamily is composed of Jak1, Jak2, Jak3, TYK2, and similar proteins. They are PTKs, catalyzing the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jaks are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase catalytic domain. Most Jaks are expressed in a wide variety of tissues, except for Jak3, which is expressed only in hematopoietic cells. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). Jaks are also involved in regulating the surface expression of some cytokine receptors. The Jak-STAT pathway is involved in many biological processes including hematopoiesis, immunoregulation, host defense, fertility, lactation, growth, and embryogenesis. The Jak subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
24628 270635 cd05039 PTKc_Csk_like Catalytic domain of C-terminal Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of Csk, Chk, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. They negatively regulate the activity of Src kinases that are anchored to the plasma membrane. To inhibit Src kinases, Csk and Chk are translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. Csk catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. Chk inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As negative regulators of Src kinases, Csk and Chk play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. The Csk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
24629 270636 cd05040 PTKc_Ack_like Catalytic domain of the Protein Tyrosine Kinase, Activated Cdc42-associated kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily includes Ack1, thirty-eight-negative kinase 1 (Tnk1), and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing an N-terminal catalytic domain, an SH3 domain, a Cdc42-binding CRIB domain, and a proline-rich region. They are mainly expressed in brain and skeletal tissues and are involved in the regulation of cell adhesion and growth, receptor degradation, and axonal guidance. Ack1 is also associated with androgen-independent prostate cancer progression. Tnk1 regulates TNFalpha signaling and may play an important role in cell death. The Ack-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
24630 270637 cd05041 PTKc_Fes_like Catalytic domain of Fes-like Protein Tyrosine Kinases. Protein Tyrosine Kinase (PTK) family; Fes subfamily; catalytic (c) domain. Fes subfamily members include Fes (or Fps), Fer, and similar proteins. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fes subfamily proteins are cytoplasmic (or nonreceptor) tyr kinases containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. The genes for Fes (feline sarcoma) and Fps (Fujinami poultry sarcoma) were first isolated from tumor-causing retroviruses. The viral oncogenes encode chimeric Fes proteins consisting of Gag sequences at the N-termini, resulting in unregulated tyr kinase activity. Fes and Fer kinases play roles in haematopoiesis, inflammation and immunity, growth factor signaling, cytoskeletal regulation, cell migration and adhesion, and the regulation of cell-cell interactions. Fes and Fer show redundancy in their biological functions. 251
24631 270638 cd05042 PTKc_Aatyk Catalytic domain of the Protein Tyrosine Kinases, Apoptosis-associated tyrosine kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Aatyk subfamily is also referred to as the lemur tyrosine kinase (Lmtk) subfamily. It consists of Aatyk1 (Lmtk1), Aatyk2 (Lmtk2, Brek), Aatyk3 (Lmtk3), and similar proteins. Aatyk proteins are mostly receptor PTKs (RTKs) containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. Aatyk1 does not contain a transmembrane segment and is a cytoplasmic (or nonreceptor) kinase. Aatyk proteins are classified as PTKs based on overall sequence similarity and the phylogenetic tree. However, analysis of catalytic residues suggests that Aatyk proteins may be multispecific kinases, functioning also as serine/threonine kinases. They are involved in neural differentiation, nerve growth factor (NGF) signaling, apoptosis, and spermatogenesis. The Aatyk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
24632 270639 cd05043 PTK_Ryk Pseudokinase domain of Ryk (Receptor related to tyrosine kinase). Ryk is a receptor tyr kinase (RTK) containing an extracellular region with two leucine-rich motifs, a transmembrane segment, and an intracellular inactive pseudokinase domain, which shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. The extracellular region of Ryk shows homology to the N-terminal domain of Wnt inhibitory factor-1 (WIF) and serves as the ligand (Wnt) binding domain of Ryk. Ryk is expressed in many different tissues both during development and in adults, suggesting a widespread function. It acts as a chemorepulsive axon guidance receptor of Wnt glycoproteins and is responsible for the establishment of axon tracts during the development of the central nervous system. In addition, studies in mice reveal that Ryk is essential in skeletal, craniofacial, and cardiac development. Thus, it appears Ryk is involved in signal transduction despite its lack of kinase activity. Ryk may function as an accessory protein that modulates the signals coming from catalytically active partner RTKs such as the Eph receptors. The Ryk subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases including PTKs, protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
24633 270640 cd05044 PTKc_c-ros Catalytic domain of the Protein Tyrosine Kinase, C-ros. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily contains c-ros, Sevenless, and similar proteins. The proto-oncogene c-ros encodes an orphan receptor PTK (RTK) with an unknown ligand. RTKs contain an extracellular ligand-binding domain, a transmembrane region, and an intracellular tyr kinase domain. RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. C-ros is expressed in embryonic cells of the kidney, intestine and lung, but disappears soon after birth. It persists only in the adult epididymis. Male mice bearing inactive mutations of c-ros lack the initial segment of the epididymis and are infertile. The Drosophila protein, Sevenless, is required for the specification of the R7 photoreceptor cell during eye development. The c-ros subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
24634 173631 cd05045 PTKc_RET Catalytic domain of the Protein Tyrosine Kinase, REarranged during Transfection protein. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. RET is a receptor PTK (RTK) containing an extracellular region with four cadherin-like repeats, a calcium-binding site, and a cysteine-rich domain, a transmembrane segment, and an intracellular catalytic domain. It is part of a multisubunit complex that binds glial-derived neurotrophic factor (GDNF) family ligands (GFLs) including GDNF, neurturin, artemin, and persephin. GFLs bind RET along with four GPI-anchored coreceptors, bringing two RET molecules together, leading to autophosphorylation, activation, and intracellular signaling. RET is essential for the development of the sympathetic, parasympathetic and enteric nervous systems, and the kidney. RET disruption by germline mutations causes diseases in humans including congenital aganglionosis of the gastrointestinal tract (Hirschsprung's disease) and three related inherited cancers: multiple endocrine neoplasia type 2A (MEN2A), MEN2B, and familial medullary thyroid carcinoma. The RET subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
24635 133178 cd05046 PTK_CCK4 Pseudokinase domain of the Protein Tyrosine Kinase, Colon Carcinoma Kinase 4. CCK4, also called protein tyrosine kinase 7 (PTK7), is an orphan receptor PTK (RTK) containing an extracellular region with seven immunoglobulin domains, a transmembrane segment, and an intracellular inactive pseudokinase domain, which shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. Studies in mice reveal that CCK4 is essential for neural development. Mouse embryos containing a truncated CCK4 die perinatally and display craniorachischisis, a severe form of neural tube defect. The mechanism of action of the CCK4 pseudokinase is still unknown. Other pseudokinases such as HER3 rely on the activity of partner RTKs. The CCK4 subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases including PTKs, protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
24636 270641 cd05047 PTKc_Tie Catalytic domain of Tie Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie proteins, consisting of Tie1 and Tie2, are receptor PTKs (RTKs) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie receptors are specifically expressed in endothelial cells and hematopoietic stem cells. The angiopoietins (Ang-1 to Ang-4) serve as ligands for Tie2, while no specific ligand has been identified for Tie1. The binding of Ang-1 to Tie2 leads to receptor autophosphorylation and activation, promoting cell migration and survival. In contrast, Ang-2 binding to Tie2 does not result in the same response, suggesting that Ang-2 may function as an antagonist. In vivo studies of Tie1 show that it is critical in vascular development. The Tie subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
24637 270642 cd05048 PTKc_Ror Catalytic Domain of the Protein Tyrosine Kinases, Receptor tyrosine kinase-like Orphan Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Ror subfamily consists of Ror1, Ror2, and similar proteins. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. Ror kinases are expressed in many tissues during development. They play important roles in bone and heart formation. Mutations in human Ror2 result in two different bone development genetic disorders, recessive Robinow syndrome and brachydactyly type B. Drosophila Ror is expressed only in the developing nervous system during neurite outgrowth and neuronal differentiation, suggesting a role for Drosophila Ror in neural development. More recently, mouse Ror1 and Ror2 have also been found to play an important role in regulating neurite growth in central neurons. Ror1 and Ror2 are believed to have some overlapping and redundant functions. The Ror subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
24638 270643 cd05049 PTKc_Trk Catalytic domain of the Protein Tyrosine Kinases, Tropomyosin Related Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Trk subfamily consists of TrkA, TrkB, TrkC, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, the nerve growth factor (NGF) family of neurotrophins, leads to Trk receptor oligomerization and activation of the catalytic domain. Trk receptors are mainly expressed in the peripheral and central nervous systems. They play important roles in cell fate determination, neuronal survival and differentiation, as well as in the regulation of synaptic plasticity. Altered expression of Trk receptors is associated with many human diseases. The Trk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 280
24639 133181 cd05050 PTKc_Musk Catalytic domain of the Protein Tyrosine Kinase, Muscle-specific kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Musk is a receptor PTK (RTK) containing an extracellular region with four immunoglobulin-like domains and a cysteine-rich cluster, a transmembrane segment, and an intracellular catalytic domain. Musk is expressed and concentrated in the postsynaptic membrane in skeletal muscle. It is essential for the establishment of the neuromuscular junction (NMJ), a peripheral synapse that conveys signals from motor neurons to muscle cells. Agrin, a large proteoglycan released from motor neurons, stimulates Musk autophosphorylation and activation, leading to the clustering of acetylcholine receptors (AChRs). To date, there is no evidence to suggest that agrin binds directly to Musk. Mutations in AChR, Musk and other partners are responsible for diseases of the NMJ, such as the autoimmune syndrome myasthenia gravis. The Musk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
24640 270644 cd05051 PTKc_DDR Catalytic domain of the Protein Tyrosine Kinases, Discoidin Domain Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The DDR subfamily consists of homologs of mammalian DDR1, DDR2, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDRs results in a slow but sustained receptor activation. DDRs regulate cell adhesion, proliferation, and extracellular matrix remodeling. They have been linked to a variety of human cancers including breast, colon, ovarian, brain, and lung. There is no evidence showing that DDRs act as transforming oncogenes. They are more likely to play a role in the regulation of tumor growth and metastasis. The DDR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 297
24641 270645 cd05052 PTKc_Abl Catalytic domain of the Protein Tyrosine Kinase, Abelson kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Abl (or c-Abl) is a ubiquitously-expressed cytoplasmic (or nonreceptor) PTK that contains SH3, SH2, and tyr kinase domains in its N-terminal region, as well as nuclear localization motifs, a putative DNA-binding domain, and F- and G-actin binding domains in its C-terminal tail. It also contains a short autoinhibitory cap region in its N-terminus. Abl function depends on its subcellular localization. In the cytoplasm, Abl plays a role in cell proliferation and survival. In response to DNA damage or oxidative stress, Abl is transported to the nucleus where it induces apoptosis. In chronic myelogenous leukemia (CML) patients, an aberrant translocation results in the replacement of the first exon of Abl with the BCR (breakpoint cluster region) gene. The resulting BCR-Abl fusion protein is constitutively active and associates into tetramers, resulting in a hyperactive kinase sending a continuous signal. This leads to uncontrolled proliferation, morphological transformation and anti-apoptotic effects. BCR-Abl is the target of selective inhibitors, such as imatinib (Gleevec), used in the treatment of CML. Abl2, also known as ARG (Abelson-related gene), is thought to play a cooperative role with Abl in the proper development of the nervous system. The Tel-ARG fusion protein, resulting from reciprocal translocation between chromosomes 1 and 12, is associated with acute myeloid leukemia (AML). The TEL gene is a frequent fusion partner of other tyr kinase oncogenes, including Tel/Abl, Tel/PDGFRbeta, and Tel/Jak2, found in patients with leukemia and myeloproliferative disorders. The Abl subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
24642 270646 cd05053 PTKc_FGFR Catalytic domain of the Protein Tyrosine Kinases, Fibroblast Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The FGFR subfamily consists of FGFR1, FGFR2, FGFR3, FGFR4, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, and to heparin/heparan sulfate (HS) results in the formation of a ternary complex, which leads to receptor dimerization and activation, and intracellular signaling. There are at least 23 FGFs and four types of FGFRs. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. FGF/FGFR signaling is important in the regulation of embryonic development, homeostasis, and regenerative processes. Depending on the cell type and stage, FGFR signaling produces diverse cellular responses including proliferation, growth arrest, differentiation, and apoptosis. Aberrant signaling leads to many human diseases such as skeletal, olfactory, and metabolic disorders, as well as cancer. The FGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 294
24643 270647 cd05054 PTKc_VEGFR Catalytic domain of the Protein Tyrosine Kinases, Vascular Endothelial Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The VEGFR subfamily consists of VEGFR1 (Flt1), VEGFR2 (Flk1), VEGFR3 (Flt4), and similar proteins. VEGFR subfamily members are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. In VEGFR3, the fifth Ig-like domain is replaced by a disulfide bridge. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. There are five VEGF ligands in mammals, which bind in an overlapping pattern to the three VEGFRs, which can form homo- or heterodimers. VEGFRs regulate the cardiovascular system. They are critical for vascular development during embryogenesis and blood vessel formation in adults. They induce cellular functions common to other growth factor receptors such as cell migration, survival, and proliferation. The VEGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 298
24644 133186 cd05055 PTKc_PDGFR Catalytic domain of the Protein Tyrosine Kinases, Platelet Derived Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The PDGFR subfamily consists of PDGFR alpha, PDGFR beta, KIT, CSF-1R, the mammalian FLT3, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. PDGFR kinase domains are autoinhibited by their juxtamembrane regions containing tyr residues. The binding to their ligands leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR subfamily receptors are important in the development of a variety of cells. PDGFRs are expressed in many cells including fibroblasts, neurons, endometrial cells, mammary epithelial cells, and vascular smooth muscle cells. PDGFR signaling is critical in normal embryonic development, angiogenesis, and wound healing. Kit is important in the development of melanocytes, germ cells, mast cells, hematopoietic stem cells, the interstitial cells of Cajal, and the pacemaker cells of the GI tract. CSF-1R signaling is critical in the regulation of macrophages and osteoclasts. Mammalian FLT3 plays an important role in the survival, proliferation, and differentiation of stem cells. The PDGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 302
24645 133187 cd05056 PTKc_FAK Catalytic domain of the Protein Tyrosine Kinase, Focal Adhesion Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. FAK is a cytoplasmic (or nonreceptor) PTK that contains an autophosphorylation site and a FERM domain at the N-terminus, a central tyr kinase domain, proline-rich regions, and a C-terminal FAT (focal adhesion targeting) domain. FAK activity is dependent on integrin-mediated cell adhesion, which facilitates N-terminal autophosphorylation. Full activation is achieved by the phosphorylation of its two adjacent A-loop tyrosines. FAK is important in mediating signaling initiated at sites of cell adhesions and at growth factor receptors. Through diverse molecular interactions, FAK functions as a biosensor or integrator to control cell motility. It is a key regulator of cell survival, proliferation, migration and invasion, and thus plays an important role in the development and progression of cancer. Src binds to autophosphorylated FAK forming the FAK-Src dual kinase complex, which is activated in a wide variety of tumor cells and generates signals promoting growth and metastasis. FAK is being developed as a target for cancer therapy. The FAK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
24646 270648 cd05057 PTKc_EGFR_like Catalytic domain of Epidermal Growth Factor Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER, ErbB) subfamily members include EGFR (HER1, ErbB1), HER2 (ErbB2), HER3 (ErbB3), HER4 (ErbB4), and similar proteins. They are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, resulting in the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Collectively, they can recognize a variety of ligands including EGF, TGFalpha, and neuregulins, among others. All four subfamily members can form homo- or heterodimers. HER3 contains an impaired kinase domain and depends on its heterodimerization partner for activation. EGFR subfamily members are involved in signaling pathways leading to a broad range of cellular responses including cell proliferation, differentiation, migration, growth inhibition, and apoptosis. Gain of function alterations, through their overexpression, deletions, or point mutations in their kinase domains, have been implicated in various cancers. These receptors are targets of many small molecule inhibitors and monoclonal antibodies used in cancer therapy. The EGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
24647 270649 cd05058 PTKc_Met_Ron Catalytic domain of the Protein Tyrosine Kinases, Met and Ron. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Met and Ron are receptor PTKs (RTKs) composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. Met binds to the ligand, hepatocyte growth factor/scatter factor (HGF/SF), and is also called the HGF receptor. HGF/Met signaling plays a role in growth, transformation, cell motility, invasion, metastasis, angiogenesis, wound healing, and tissue regeneration. Aberrant expression of Met through mutations or gene amplification is associated with many human cancers including hereditary papillary renal and gastric carcinomas. The ligand for Ron is macrophage stimulating protein (MSP). Ron signaling is important in regulating cell motility, adhesion, proliferation, and apoptosis. Aberrant Ron expression is implicated in tumorigenesis and metastasis. The Met/Ron subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
24648 173637 cd05059 PTKc_Tec_like Catalytic domain of Tec-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Tec-like subfamily is composed of Tec, Btk, Bmx (Etk), Itk (Tsk, Emt), Rlk (Txk), and similar proteins. They are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, some members contain the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. Tec kinases form the second largest subfamily of nonreceptor PTKs and are expressed mainly by haematopoietic cells, although Tec and Bmx are also found in endothelial cells. B-cells express Btk and Tec, while T-cells express Itk, Txk, and Tec. Collectively, Tec kinases are expressed in a variety of myeloid cells such as mast cells, platelets, macrophages, and dendritic cells. Each Tec kinase shows a distinct cell-type pattern of expression. Tec kinases play important roles in the development, differentiation, maturation, regulation, survival, and function of B-cells and T-cells. Mutations in Btk cause the severe B-cell immunodeficiency, X-linked agammaglobulinaemia (XLA). The Tec-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
24649 270650 cd05060 PTKc_Syk_like Catalytic domain of Spleen Tyrosine Kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Syk-like subfamily is composed of Syk, ZAP-70, Shark, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. They are involved in the signaling downstream of activated receptors (including B-cell, T-cell, and Fc receptors) that contain ITAMs (immunoreceptor tyr activation motifs), leading to processes such as cell proliferation, differentiation, survival, adhesion, migration, and phagocytosis. Syk is important in B-cell receptor signaling, while ZAP-70 is primarily expressed in T-cells and NK cells, and is a crucial component in T-cell receptor signaling. Syk also plays a central role in Fc receptor-mediated phagocytosis in the adaptive immune system. Shark is exclusively expressed in ectodermally derived epithelia, and is localized preferentially to the apical surface of the epithelial cells; it may play a role in a signaling pathway for epithelial cell polarity. The Syk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
24650 133192 cd05061 PTKc_InsR Catalytic domain of the Protein Tyrosine Kinase, Insulin Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. InsR is a receptor PTK (RTK) that is composed of two alphabeta heterodimers. Binding of the insulin ligand to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, stimulating downstream kinase activities, which initiate signaling cascades and biological function. InsR signaling plays an important role in many cellular processes including glucose homeostasis, glycogen synthesis, lipid and protein metabolism, ion and amino acid transport, cell cycle and proliferation, cell differentiation, gene transcription, and nitric oxide synthesis. Insulin resistance, caused by abnormalities in InsR signaling, has been described in diabetes, hypertension, cardiovascular disease, metabolic syndrome, heart failure, and female infertility. The InsR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
24651 133193 cd05062 PTKc_IGF-1R Catalytic domain of the Protein Tyrosine Kinase, Insulin-like Growth Factor-1 Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. IGF-1R is a receptor PTK (RTK) that is composed of two alphabeta heterodimers. Binding of the ligand (IGF-1 or IGF-2) to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, which stimulates downstream kinase activities and biological function. IGF-1R signaling is important in the differentiation, growth, and survival of normal cells. In cancer cells, where it is frequently overexpressed, IGF-1R is implicated in proliferation, the suppression of apoptosis, invasion, and metastasis. IGF-1R is being developed as a therapeutic target in cancer treatment. The IGF-1R subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
24652 133194 cd05063 PTKc_EphR_A2 Catalytic domain of the Protein Tyrosine Kinase, Ephrin Receptor A2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The EphA2 receptor is overexpressed in tumor cells and tumor blood vessels in a variety of cancers including breast, prostate, lung, and colon. As a result, it is an attractive target for drug design since its inhibition could affect several aspects of tumor progression. EphRs comprise the largest subfamily of receptor PTKs (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphA2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). 268
24653 133195 cd05064 PTKc_EphR_A10 Catalytic domain of the Protein Tyrosine Kinase, Ephrin Receptor A10. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EphA10, which contains an inactive tyr kinase domain, may function to attenuate signals of co-clustered active receptors. EphA10 is mainly expressed in the testis. Ephrin/EphR interaction results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. EphRs comprise the largest subfamily of receptor tyr kinases (RTKs). In general, class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The EphA10 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
24654 173638 cd05065 PTKc_EphR_B Catalytic domain of the Protein Tyrosine Kinases, Class EphB Ephrin Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphB receptors play important roles in synapse formation and plasticity, spine morphogenesis, axon guidance, and angiogenesis. In the intestinal epithelium, EphBs are Wnt signaling target genes that control cell compartmentalization. They function as suppressors of colon cancer progression. EphRs comprise the largest subfamily of receptor PTKs (RTKs). They contain an ephrin-binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion. The EphB subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
24655 270651 cd05066 PTKc_EphR_A Catalytic domain of the Protein Tyrosine Kinases, Class EphA Ephrin Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of most class EphA receptors including EphA3, EphA4, EphA5, and EphA7, but excluding EphA1, EphA2 and EphA10. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. One exception is EphA4, which also binds ephrins-B2/B3. EphA receptors and ephrin-A ligands are expressed in multiple areas of the developing brain, especially in the retina and tectum. They are part of a system controlling retinotectal mapping. EphRs comprise the largest subfamily of receptor PTKs (RTKs). EphRs contain an ephrin-binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphA subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
24656 270652 cd05067 PTKc_Lck_Blk Catalytic domain of the Protein Tyrosine Kinases, Lymphocyte-specific kinase and Blk. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Lck and Blk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lck is expressed in T-cells and natural killer cells. It plays a critical role in T-cell maturation, activation, and T-cell receptor (TCR) signaling. Lck phosphorylates ITAM (immunoreceptor tyr activation motif) sequences on several subunits of TCRs, leading to the activation of different second messenger cascades. Phosphorylated ITAMs serve as binding sites for other signaling factors such as Syk and ZAP-70, leading to their activation and propagation of downstream events. In addition, Lck regulates drug-induced apoptosis by interfering with the mitochondrial death pathway. The apoptotic role of Lck is independent of its primary function in T-cell signaling. Blk is expressed specifically in B-cells. It is involved in pre-BCR (B-cell receptor) signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Lck/Blk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 264
24657 270653 cd05068 PTKc_Frk_like Catalytic domain of Fyn-related kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Frk and Srk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Frk, also known as Rak, is specifically expressed in liver, lung, kidney, intestine, mammary glands, and the islets of Langerhans. Rodent homologs were previously referred to as GTK (gastrointestinal tyr kinase), BSK (beta-cell Src-like kinase), or IYK (intestinal tyr kinase). Studies in mice reveal that Frk is not essential for viability. It plays a role in the signaling that leads to cytokine-induced beta-cell death in Type I diabetes. It also regulates beta-cell number during embryogenesis and early in life. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Frk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
24658 270654 cd05069 PTKc_Yes Catalytic domain of the Protein Tyrosine Kinase, Yes. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Yes (or c-Yes) is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. c-Yes kinase is the cellular homolog of the oncogenic protein (v-Yes) encoded by the Yamaguchi 73 and Esh sarcoma viruses. It displays functional overlap with other Src subfamily members, particularly Src. It also shows some unique functions such as binding to occludins, transmembrane proteins that regulate extracellular interactions in tight junctions. Yes also associates with a number of proteins in different cell types that Src does not interact with, like JAK2 and gp130 in pre-adipocytes, and Pyk2 in treated pulmonary vein endothelial cells. Although the biological function of Yes remains unclear, it appears to have a role in regulating cell-cell interactions and vesicle trafficking in polarized cells. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Yes subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 279
24659 270655 cd05070 PTKc_Fyn Catalytic domain of the Protein Tyrosine Kinase, Fyn. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fyn and Yrk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Fyn, together with Lck, plays a critical role in T-cell signal transduction by phosphorylating ITAM (immunoreceptor tyr activation motif) sequences on T-cell receptors, ultimately leading to the proliferation and differentiation of T-cells. In addition, Fyn is involved in the myelination of neurons, and is implicated in Alzheimer's and Parkinson's diseases. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Fyn/Yrk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase. 274
24660 270656 cd05071 PTKc_Src Catalytic domain of the Protein Tyrosine Kinase, Src. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Src (or c-Src) is a cytoplasmic (or non-receptor) PTK, containing an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region with a conserved tyr. It is activated by autophosphorylation at the tyr kinase domain, and is negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). c-Src is the vertebrate homolog of the oncogenic protein (v-Src) from Rous sarcoma virus. Together with other Src subfamily proteins, it is involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. Src also plays a role in regulating cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. Elevated levels of Src kinase activity have been reported in a variety of human cancers. Several inhibitors of Src have been developed as anti-cancer drugs. Src is also implicated in acute inflammatory responses and osteoclast function. The Src subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
24661 270657 cd05072 PTKc_Lyn Catalytic domain of the Protein Tyrosine Kinase, Lyn. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Lyn is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lyn is expressed in B lymphocytes and myeloid cells. It exhibits both positive and negative regulatory roles in B cell receptor (BCR) signaling. Lyn, as well as Fyn and Blk, promotes B cell activation by phosphorylating ITAMs (immunoreceptor tyr activation motifs) in CD19 and in Ig components of BCR. It negatively regulates signaling by its unique ability to phosphorylate ITIMs (immunoreceptor tyr inhibition motifs) in cell surface receptors like CD22 and CD5. Lyn also plays an important role in G-CSF receptor signaling by phosphorylating a variety of adaptor molecules. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Lyn subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
24662 270658 cd05073 PTKc_Hck Catalytic domain of the Protein Tyrosine Kinase, Hematopoietic cell kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Hck is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Hck is present in myeloid and lymphoid cells that play a role in the development of cancer. It may be important in the oncogenic signaling of the protein Tel-Abl, which induces a chronic myelogenous leukemia (CML)-like disease. Hck also acts as a negative regulator of G-CSF-induced proliferation of granulocytic precursors, suggesting a possible role in the development of acute myeloid leukemia (AML). In addition, Hck is essential in regulating the degranulation of polymorphonuclear leukocytes. Genetic polymorphisms affect the expression level of Hck, which affects PMN mediator release and influences the development of chronic obstructive pulmonary disease (COPD). Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Hck subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
24663 270659 cd05074 PTKc_Tyro3 Catalytic domain of the Protein Tyrosine Kinase, Tyro3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tyro3 (or Sky) is predominantly expressed in the central nervous system and the brain, and functions as a neurotrophic factor. It is also expressed in osteoclasts and has a role in bone resorption. Tyro3 is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Tyro3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
24664 270660 cd05075 PTKc_Axl Catalytic domain of the Protein Tyrosine Kinase, Axl. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Axl is widely expressed in a variety of organs and cells including epithelial, mesenchymal, hematopoietic, as well as non-transformed cells. It is important in many cellular functions such as survival, anti-apoptosis, proliferation, migration, and adhesion. Axl was originally isolated from patients with chronic myelogenous leukemia and a chronic myeloproliferative disorder. It is overexpressed in many human cancers including colon, squamous cell, thyroid, breast, and lung carcinomas. Axl is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to its ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Axl subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
24665 270661 cd05076 PTK_Tyk2_rpt1 Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Tyrosine kinase 2. Tyk2 is widely expressed in many tissues. It is involved in signaling via the cytokine receptors IFN-alphabeta, IL-6, IL-10, IL-12, IL-13, and IL-23. It mediates cell surface urokinase receptor (uPAR) signaling and plays a role in modulating vascular smooth muscle cell (VSMC) functional behavior in response to injury. Tyk2 is also important in dendritic cell function and T helper (Th)1 cell differentiation. A homozygous mutation of Tyk2 was found in a patient with hyper-IgE syndrome (HIES), a primary immunodeficiency characterized by recurrent skin abscesses, pneumonia, and elevated serum IgE. This suggests that Tyk2 may play important roles in multiple cytokine signaling involved in innate and adaptive immunity. Tyk2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. The Tyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 273
24666 270662 cd05077 PTK_Jak1_rpt1 Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 1. Jak1 is widely expressed in many tissues. Many cytokines are dependent on Jak1 for signaling, including those that use the shared receptor subunits, common gamma chain (IL-2, IL-4, IL-7, IL-9, IL-15, IL-21) and gp130 (IL-6, IL-11, oncostatin M, G-CSF, and IFNs, among others). The many varied interactions of Jak1 and its ubiquitous expression suggest many biological roles. Jak1 is important in neurological development, as well as in lymphoid development and function. It also plays a role in the pathophysiology of cardiac hypertrophy and heart failure. A mutation in the ATP-binding site of Jak1 was identified in a human uterine leiomyosarcoma cell line, resulting in defective cytokine induction and antigen presentation, thus allowing the tumor to evade the immune system. Jak1 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. The Jak1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
24667 270663 cd05078 PTK_Jak2_rpt1 Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 2. Jak2 is widely expressed in many tissues. It is essential for the signaling of hormone-like cytokines such as growth hormone, erythropoietin, thrombopoietin, and prolactin, as well as some IFNs and cytokines that signal through the IL-3 and gp130 receptors. Disruption of Jak2 in mice results in an embryonic lethal phenotype with multiple defects including erythropoietic and cardiac abnormalities. It is the only Jak gene that results in a lethal phenotype when disrupted in mice. A mutation in the pseudokinase domain of Jak2, V617F, is present in many myeloproliferative diseases, including almost all patients with polycythemia vera, and 50% of patients with essential thrombocytosis and myelofibrosis. Jak2 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. Despite this, the presumed pseudokinase (repeat 1) domain of Jak2 exhibits dual-specificity kinase activity, phosphorylating two negative regulatory sites in Jak2: Ser523 and Tyr570. Inactivation of the repeat 1 domain increased Jak2 basal activity, suggesting that it modulates the kinase activity of the C-terminal catalytic (repeat 2) domain. The Jak2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
24668 173644 cd05079 PTKc_Jak1_rpt2 Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak1 is widely expressed in many tissues. Many cytokines are dependent on Jak1 for signaling, including those that use the shared receptor subunits common gamma chain (IL-2, IL-4, IL-7, IL-9, IL-15, IL-21) and gp130 (IL-6, IL-11, oncostatin M, G-CSF, and IFNs, among others). The many varied interactions of Jak1 and its ubiquitous expression suggest many biological roles. Jak1 is important in neurological development, as well as in lymphoid development and function. It also plays a role in the pathophysiology of cardiac hypertrophy and heart failure. A mutation in the ATP-binding site of Jak1 was identified in a human uterine leiomyosarcoma cell line, resulting in defective cytokine induction and antigen presentation, thus allowing the tumor to evade the immune system. Jak1 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Jak1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
24669 270664 cd05080 PTKc_Tyk2_rpt2 Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Tyrosine kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tyk2 is widely expressed in many tissues. It is involved in signaling via the cytokine receptors IFN-alphabeta, IL-6, IL-10, IL-12, IL-13, and IL-23. It mediates cell surface urokinase receptor (uPAR) signaling and plays a role in modulating vascular smooth muscle cell (VSMC) functional behavior in response to injury. Tyk2 is also important in dendritic cell function and T helper (Th)1 cell differentiation. A homozygous mutation of Tyk2 was found in a patient with hyper-IgE syndrome (HIES), a primary immunodeficiency characterized by recurrent skin abscesses, pneumonia, and elevated serum IgE. This suggests that Tyk2 may play important roles in multiple cytokine signaling involved in innate and adaptive immunity. Tyk2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase catalytic domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Tyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
24670 270665 cd05081 PTKc_Jak3_rpt2 Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak3 is expressed only in hematopoietic cells. It binds the shared receptor subunit common gamma chain and thus, is essential in the signaling of cytokines that use it such as IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21. Jak3 is important in lymphoid development and myeloid cell differentiation. Inactivating mutations in Jak3 have been reported in humans with severe combined immunodeficiency (SCID). Jak3 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
24671 133213 cd05082 PTKc_Csk Catalytic domain of the Protein Tyrosine Kinase, C-terminal Src kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Csk catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. Csk is expressed in a wide variety of tissues. As a negative regulator of Src, Csk plays a role in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. Csk is a cytoplasmic (or nonreceptor) PTK containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. To inhibit Src kinases, Csk is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. In addition, Csk also shows Src-independent functions. It is a critical component in G-protein signaling, and plays a role in cytoskeletal reorganization and cell migration. The Csk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
24672 270666 cd05083 PTKc_Chk Catalytic domain of the Protein Tyrosine Kinase, Csk homologous kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Chk is also referred to as megakaryocyte-associated tyrosine kinase (Matk). Chk inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As a negative regulator of Src kinases, Chk may play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. Chk is expressed in brain and hematopoietic cells. Like Csk, it is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. To inhibit Src kinases that are anchored to the plasma membrane, Chk is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. Studies in mice reveal that Chk is not functionally redundant with Csk and that it plays an important role as a regulator of immune responses. Chk also plays a role in neural differentiation in a manner independent of Src by enhancing Mapk activation via Ras-mediated signaling. The Chk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
24673 270667 cd05084 PTKc_Fes Catalytic domain of the Protein Tyrosine Kinase, Fes. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fes (or Fps) is a cytoplasmic (or nonreceptor) PTK containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. The genes for Fes (feline sarcoma) and Fps (Fujinami poultry sarcoma) were first isolated from tumor-causing retroviruses. The viral oncogenes encode chimeric Fes proteins consisting of Gag sequences at the N-termini, resulting in unregulated PTK activity. Fes kinase is expressed in myeloid, vascular endothelial, epithelial, and neuronal cells. It plays important roles in cell growth and differentiation, angiogenesis, inflammation and immunity, and cytoskeletal regulation. A recent study implicates Fes kinase as a tumor suppressor in colorectal cancer. The Fes subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
24674 270668 cd05085 PTKc_Fer Catalytic domain of the Protein Tyrosine Kinase, Fer. Protein Tyrosine Kinase (PTK) family; Fer kinase; catalytic (c) domain. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fer kinase is a member of the Fes subfamily of proteins which are cytoplasmic (or nonreceptor) tyr kinases containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. Fer kinase is expressed in a wide variety of tissues, and is found to reside in both the cytoplasm and the nucleus. It plays important roles in neuronal polarization and neurite development, cytoskeletal reorganization, cell migration, growth factor signaling, and the regulation of cell-cell interactions mediated by adherens junctions and focal adhesions. Fer kinase also regulates cell cycle progression in malignant cells. 251
24675 270669 cd05086 PTKc_Aatyk2 Catalytic domain of the Protein Tyrosine Kinase, Apoptosis-associated tyrosine kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk2 is a member of the Aatyk subfamily of proteins, which are receptor kinases containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. Aatyk2 is also called lemur tyrosine kinase 2 (Lmtk2) or brain-enriched kinase (Brek). It is expressed at high levels in early postnatal brain, and has been shown to play a role in nerve growth factor (NGF) signaling. Studies with knockout mice reveal that Aatyk2 is essential for late stage spermatogenesis. Although it is classified as a PTK based on sequence similarity and the phylogenetic tree, Aatyk2 has been functionally characterized as a serine/threonine kinase. The Aatyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
24676 270670 cd05087 PTKc_Aatyk1 Catalytic domain of the Protein Tyrosine Kinases, Apoptosis-associated tyrosine kinase 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk1 (or simply Aatyk) is also called lemur tyrosine kinase 1 (Lmtk1). It is a cytoplasmic (or nonreceptor) kinase containing a long C-terminal region. The expression of Aatyk1 is upregulated during growth arrest and apoptosis in myeloid cells. Aatyk1 has been implicated in neural differentiation, and is a regulator of the Na-K-2Cl cotransporter, a membrane protein involved in cell proliferation and survival, epithelial transport, and blood pressure control. The Aatyk1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
24677 133219 cd05088 PTKc_Tie2 Catalytic domain of the Protein Tyrosine Kinase, Tie2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie2 is a receptor PTK (RTK) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie2 is expressed mainly in endothelial cells and hematopoietic stem cells. It is also found in a subset of tumor-associated monocytes and eosinophils. The angiopoietins (Ang-1 to Ang-4) serve as ligands for Tie2. The binding of Ang-1 to Tie2 leads to receptor autophosphorylation and activation, promoting cell migration and survival. In contrast, Ang-2 binding to Tie2 does not result in the same response, suggesting that Ang-2 may function as an antagonist. Tie2 signaling plays key regulatory roles in vascular integrity and quiescence, and in inflammation. The Tie2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 303
24678 270671 cd05089 PTKc_Tie1 Catalytic domain of the Protein Tyrosine Kinase, Tie1. Protein Tyrosine Kinase (PTK) family; Tie1; catalytic (c) domain. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie1 is a receptor tyr kinase (RTK) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie receptors are specifically expressed in endothelial cells and hematopoietic stem cells. No specific ligand has been identified for Tie1, although the angiopoietin, Ang-1, binds to Tie1 through integrins at high concentrations. In vivo studies of Tie1 show that it is critical in vascular development. 297
24679 270672 cd05090 PTKc_Ror1 Catalytic domain of the Protein Tyrosine Kinase, Receptor tyrosine kinase-like Orphan Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Ror kinases are expressed in many tissues during development. Avian Ror1 was found to be involved in late limb development. Studies in mice reveal that Ror1 is important in the regulation of neurite growth in central neurons, as well as in respiratory development. Loss of Ror1 also enhances the heart and skeletal abnormalities found in Ror2-deficient mice. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The Ror1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
24680 270673 cd05091 PTKc_Ror2 Catalytic domain of the Protein Tyrosine Kinase, Receptor tyrosine kinase-like Orphan Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Ror2 plays important roles in skeletal and heart formation. Ror2-deficient mice show widespread bone abnormalities, ventricular defects in the heart, and respiratory dysfunction. Mutations in human Ror2 result in two different bone development genetic disorders, recessive Robinow syndrome and brachydactyly type B. Ror2 is also implicated in neural development. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The Ror2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
24681 270674 cd05092 PTKc_TrkA Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase A. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkA is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkA to its ligand, nerve growth factor (NGF), results in receptor oligomerization and activation of the catalytic domain. TrkA is expressed mainly in neural-crest-derived sensory and sympathetic neurons of the peripheral nervous system, and in basal forebrain cholinergic neurons of the central nervous system. It is critical for neuronal growth, differentiation and survival. Alternative TrkA splicing has been implicated as a pivotal regulator of neuroblastoma (NB) behavior. Normal TrkA expression is associated with better NB prognosis, while the hypoxia-regulated TrkAIII splice variant promotes NB pathogenesis and progression. Aberrant TrkA expression has also been demonstrated in non-neural tumors including prostate, breast, lung, and pancreatic cancers. The TrkA subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 280
24682 270675 cd05093 PTKc_TrkB Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase B. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkB is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkB to its ligands, brain-derived neurotrophic factor (BDNF) or neurotrophin 4 (NT4), results in receptor oligomerization and activation of the catalytic domain. TrkB is broadly expressed in the nervous system and in some non-neural tissues. It plays important roles in cell proliferation, differentiation, and survival. BDNF/Trk signaling plays a key role in regulating activity-dependent synaptic plasticity. TrkB also contributes to protection against gp120-induced neuronal cell death. TrkB overexpression is associated with poor prognosis in neuroblastoma (NB) and other human cancers. It acts as a suppressor of anoikis (detachment-induced apoptosis) and contributes to tumor metastasis. The TrkB subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
24683 270676 cd05094 PTKc_TrkC Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase C. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkC is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkC to its ligand, neurotrophin 3 (NT3), results in receptor oligomerization and activation of the catalytic domain. TrkC is broadly expressed in the nervous system and in some non-neural tissues including the developing heart. NT3/TrkC signaling plays an important role in the innervation of the cardiac conducting system and the development of smooth muscle cells. Mice deficient in NT3 and TrkC have multiple heart defects. NT3/TrkC signaling is also critical for the development and maintenance of enteric neurons that are important for the control of gut peristalsis. The TrkC subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
24684 270677 cd05095 PTKc_DDR2 Catalytic domain of the Protein Tyrosine Kinase, Discoidin Domain Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR2 is a receptor PTK (RTK) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDR2 results in a slow but sustained receptor activation. DDR2 binds mostly to fibrillar collagens as well as collagen X. DDR2 is widely expressed in many tissues with the highest levels found in skeletal muscle, skin, kidney and lung. It is important in cell proliferation and development. Mice with a deletion of DDR2 suffer from dwarfism and delayed healing of epidermal wounds. DDR2 also contributes to collagen (type I) regulation by inhibiting fibrillogenesis and altering the morphology of collagen fibers. It is also expressed in immature dendritic cells (DCs), where it plays a role in DC activation and function. The DDR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 297
24685 133227 cd05096 PTKc_DDR1 Catalytic domain of the Protein Tyrosine Kinase, Discoidin Domain Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR1 is a receptor PTK (RTK) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDR1 results in a slow but sustained receptor activation. DDR1 binds to all collagens tested to date (types I-IV). It is widely expressed in many tissues. It is abundant in the brain and is also found in keratinocytes, colonic mucosa epithelium, lung epithelium, thyroid follicles, and the islets of Langerhans. During embryonic development, it is found in the developing neuroectoderm. DDR1 is a key regulator of cell morphogenesis, differentiation and proliferation. It is important in the development of the mammary gland, the vasculature and the kidney. DDR1 is also found in human leukocytes, where it facilitates cell adhesion, migration, maturation, and cytokine production. The DDR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 304
24686 133228 cd05097 PTKc_DDR_like Catalytic domain of Discoidin Domain Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR-like proteins are members of the DDR subfamily, which are receptor PTKs (RTKs) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDRs results in a slow but sustained receptor activation. DDRs regulate cell adhesion, proliferation, and extracellular matrix remodeling. They have been linked to a variety of human cancers including breast, colon, ovarian, brain, and lung. There is no evidence showing that DDRs act as transforming oncogenes. They are more likely to play a role in the regulation of tumor growth and metastasis. The DDR-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 295
24687 270678 cd05098 PTKc_FGFR1 Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Alternative splicing of FGFR1 transcripts produces a variety of isoforms, which are differentially expressed in cells. FGFR1 binds the ligands, FGF1 and FGF2, with high affinity and has also been reported to bind FGF4, FGF6, and FGF9. FGFR1 signaling is critical in the control of cell migration during embryo development. It promotes cell proliferation in fibroblasts. Nuclear FGFR1 plays a role in the regulation of transcription. Mutations, insertions or deletions of FGFR1 have been identified in patients with Kallmann syndrome (KS), an inherited disorder characterized by hypogonadotropic hypogonadism and loss of olfaction. Aberrant FGFR1 expression has been found in some human cancers including 8p11 myeloproliferative syndrome (EMS), breast cancer, and pancreatic adenocarcinoma. FGFR1 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 302
24688 133230 cd05099 PTKc_FGFR4 Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 4. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Unlike other FGFRs, there is only one splice form of FGFR4. It binds FGF1, FGF2, FGF6, FGF19, and FGF23. FGF19 is a selective ligand for FGFR4. Although disruption of FGFR4 in mice causes no obvious phenotype, in vivo inhibition of FGFR4 in cultured skeletal muscle cells resulted in an arrest of muscle progenitor differentiation. FGF6 and FGFR4 are uniquely expressed in myofibers and satellite cells. FGF6/FGFR4 signaling appears to play a key role in the regulation of muscle regeneration. A polymorphism in FGFR4 is found in head and neck squamous cell carcinoma. FGFR4 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR4 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 314
24689 173652 cd05100 PTKc_FGFR3 Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Many FGFR3 splice variants have been reported with the IIIb and IIIc isoforms being the predominant forms. FGFR3 IIIc is the isoform expressed in chondrocytes, the cells affected in dwarfism, while IIIb is expressed in epithelial cells. FGFR3 ligands include FGF1, FGF2, FGF4, FGF8, FGF9, and FGF23. It is a negative regulator of long bone growth. In the cochlear duct and in the lens, FGFR3 is involved in differentiation while it appears to have a role in cell proliferation in epithelial cells. Germline mutations in FGFR3 are associated with skeletal disorders including several forms of dwarfism. Some missense mutations are associated with multiple myeloma and carcinomas of the bladder and cervix. Overexpression of FGFR3 is found in thyroid carcinoma. FGFR3 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 334
24690 270679 cd05101 PTKc_FGFR2 Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. There are many splice variants of FGFR2 which show differential expression and binding to FGF ligands. Disruption of either FGFR2 or FGFR2b is lethal in mice, due to defects in the placenta or severe impairment of tissue development including lung, limb, and thyroid, respectively. Disruption of FGFR2c in mice results in defective bone and skull development. Genetic alterations of FGFR2 are associated with many human skeletal disorders including Apert syndrome, Crouzon syndrome, Jackson-Weiss syndrome, and Pfeiffer syndrome. FGFR2 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
24691 270680 cd05102 PTKc_VEGFR3 Catalytic domain of the Protein Tyrosine Kinase, Vascular Endothelial Growth Factor Receptor 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR3 (or Flt4) preferentially binds the ligands VEGFC and VEGFD. VEGFR3 is essential for lymphatic endothelial cell (EC) development and function. It has been shown to regulate adaptive immunity during corneal transplantation. VEGFR3 is upregulated on blood vascular ECs in pathological conditions such as vascular tumors and the periphery of solid tumors. It plays a role in cancer progression and lymph node metastasis. Missense mutations in the VEGFR3 gene are associated with primary human lymphedema. VEGFR3 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. In VEGFR3, the fifth Ig-like domain is replaced by a disulfide bridge. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 336
24692 270681 cd05103 PTKc_VEGFR2 Catalytic domain of the Protein Tyrosine Kinase, Vascular Endothelial Growth Factor Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR2 (or Flk1) binds the ligands VEGFA, VEGFC, VEGFD and VEGFE. VEGFR2 signaling is implicated in all aspects of normal and pathological vascular endothelial cell biology. It induces a variety of cellular effects including migration, survival, and proliferation. It is critical in regulating embryonic vascular development and angiogenesis. VEGFR2 is the major signal transducer in pathological angiogenesis including cancer and diabetic retinopathy, and is a target for inhibition in cancer therapy. The carboxyl terminus of VEGFR2 plays an important role in its autophosphorylation and activation. VEGFR2 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 343
24693 270682 cd05104 PTKc_Kit Catalytic domain of the Protein Tyrosine Kinase, Kit. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Kit is important in the development of melanocytes, germ cells, mast cells, hematopoietic stem cells, the interstitial cells of Cajal, and the pacemaker cells of the GI tract. Kit signaling is involved in major cellular functions including cell survival, proliferation, differentiation, adhesion, and chemotaxis. Mutations in Kit, which result in constitutive ligand-independent activation, are found in human cancers such as gastrointestinal stromal tumor (GIST) and testicular germ cell tumor (TGCT). The aberrant expression of Kit and/or SCF is associated with other tumor types such as systemic mastocytosis and cancers of the breast, neurons, lung, prostate, colon, and rectum. Although the structure of the human Kit catalytic domain is known, it is excluded from this specific alignment model because it contains a deletion in its sequence. Kit is a member of the Platelet Derived Growth Factor Receptor (PDGFR) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of Kit to its ligand, the stem-cell factor (SCF), leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. The Kit subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 375
24694 173653 cd05105 PTKc_PDGFR_alpha Catalytic domain of the Protein Tyrosine Kinase, Platelet Derived Growth Factor Receptor alpha. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. PDGFR alpha is a receptor PTK (RTK) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding to its ligands, the PDGFs, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR alpha forms homodimers or heterodimers with PDGFR beta, depending on the nature of the PDGF ligand. PDGF-AA, PDGF-AB, and PDGF-CC induce PDGFR alpha homodimerization. PDGFR signaling plays many roles in normal embryonic development and adult physiology. PDGFR alpha signaling is important in the formation of lung alveoli, intestinal villi, mesenchymal dermis, and hair follicles, as well as in the development of oligodendrocytes, retinal astrocytes, neural crest cells, and testicular cells. Aberrant PDGFR alpha expression is associated with some human cancers. Mutations in PDGFR alpha have been found within a subset of gastrointestinal stromal tumors (GISTs). An active fusion protein FIP1L1-PDGFR alpha, derived from interstitial deletion, is associated with idiopathic hypereosinophilic syndrome and chronic eosinophilic leukemia. The PDGFR alpha subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 400
24695 133237 cd05106 PTKc_CSF-1R Catalytic domain of the Protein Tyrosine Kinase, Colony-Stimulating Factor-1 Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. CSF-1R, also called c-Fms, is a member of the Platelet Derived Growth Factor Receptor (PDGFR) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of CSF-1R to its ligand, CSF-1, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. CSF-1R signaling is critical in the regulation of macrophages and osteoclasts. It leads to increases in gene transcription and protein translation, and induces cytoskeletal remodeling. CSF-1R signaling leads to a variety of cellular responses including survival, proliferation, and differentiation of target cells. It plays an important role in innate immunity, tissue development and function, and the pathogenesis of some diseases including atherosclerosis and cancer. CSF-1R signaling is also implicated in mammary gland development during pregnancy and lactation. Aberrant CSF-1/CSF-1R expression correlates with tumor cell invasiveness, poor clinical prognosis, and bone metastasis in breast cancer. Although the structure of the human CSF-1R catalytic domain is known, it is excluded from this specific alignment model because it contains a deletion in its sequence. The CSF-1R subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 374
24696 133238 cd05107 PTKc_PDGFR_beta Catalytic domain of the Protein Tyrosine Kinase, Platelet Derived Growth Factor Receptor beta. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. PDGFR beta is a receptor PTK (RTK) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding to its ligands, the PDGFs, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR beta forms homodimers or heterodimers with PDGFR alpha, depending on the nature of the PDGF ligand. PDGF-BB and PDGF-DD induce PDGFR beta homodimerization. PDGFR signaling plays many roles in normal embryonic development and adult physiology. PDGFR beta signaling leads to a variety of cellular effects including the stimulation of cell growth and chemotaxis, as well as the inhibition of apoptosis and gap junctional communication. It is critical in normal angiogenesis as it is involved in the recruitment of pericytes and smooth muscle cells essential for vessel stability. Aberrant PDGFR beta expression is associated with some human cancers. The continuously-active fusion proteins of PDGFR beta with COL1A1 and TEL are associated with dermatofibrosarcoma protuberans (DFSP) and a subset of chronic myelomonocytic leukemia (CMML), respectively. The PDGFR beta subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 401
24697 270683 cd05108 PTKc_EGFR Catalytic domain of the Protein Tyrosine Kinase, Epidermal Growth Factor Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER1, ErbB1) is a receptor PTK (RTK) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands for EGFR include EGF, heparin binding EGF-like growth factor (HBEGF), epiregulin, amphiregulin, TGFalpha, and betacellulin. Upon ligand binding, EGFR can form homo- or heterodimers with other EGFR subfamily members. The EGFR signaling pathway is one of the most important pathways regulating cell proliferation, differentiation, survival, and growth. Overexpression and mutation in the kinase domain of EGFR have been implicated in the development and progression of a variety of cancers. A number of monoclonal antibodies and small molecule inhibitors have been developed that target EGFR, including the antibodies Cetuximab and Panitumumab, which are used in combination with other therapies for the treatment of colorectal cancer and non-small cell lung carcinoma (NSCLC). The small molecule inhibitors Gefitinib (Iressa) and Erlotinib (Tarceva), already used for NSCLC, are undergoing clinical trials for other types of cancer including gastrointestinal, breast, head and neck, and bladder. The EGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
24698 270684 cd05109 PTKc_HER2 Catalytic domain of the Protein Tyrosine Kinase, HER2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. HER2 (ErbB2, HER2/neu) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. HER2 does not bind to any known EGFR subfamily ligands, but contributes to the kinase activity of all possible heterodimers. It acts as the preferred partner of other ligand-bound EGFR proteins and functions as a signal amplifier, with the HER2-HER3 heterodimer being the most potent pair in mitogenic signaling. HER2 plays an important role in cell development, proliferation, survival and motility. Overexpression of HER2 results in its activation and downstream signaling, even in the absence of ligand. HER2 overexpression, mainly due to gene amplification, has been shown in a variety of human cancers. Its role in breast cancer is especially well-documented. HER2 is up-regulated in about 25% of breast tumors and is associated with increases in tumor aggressiveness, recurrence and mortality. HER2 is a target for monoclonal antibodies and small molecule inhibitors, which are being developed as treatments for cancer. The first humanized antibody approved for clinical use is Trastuzumab (Herceptin), which is being used in combination with other therapies to improve the survival rates of patients with HER2-overexpressing breast cancer. The HER2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
24699 173655 cd05110 PTKc_HER4 Catalytic domain of the Protein Tyrosine Kinase, HER4. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. HER4 (ErbB4) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands that bind HER4 fall into two groups, the neuregulins (or heregulins) and some EGFR (HER1) ligands including betacellulin, HBEGF, and epiregulin. All four neuregulins (NRG1-4) interact with HER4. Upon ligand binding, HER4 forms homo- or heterodimers with other HER proteins. HER4 is essential in embryonic development. It is implicated in mammary gland, cardiac, and neural development. As a postsynaptic receptor of NRG1, HER4 plays an important role in synaptic plasticity and maturation. The impairment of NRG1/HER4 signaling may contribute to schizophrenia. The HER4 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 303
24700 173656 cd05111 PTK_HER3 Pseudokinase domain of the Protein Tyrosine Kinase, HER3. HER3 (ErbB3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. HER3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. HER3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The HER2-HER3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. HER3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells. The HER3 subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
24701 133243 cd05112 PTKc_Itk Catalytic domain of the Protein Tyrosine Kinase, Interleukin-2-inducible T-cell Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Itk, also known as Tsk or Emt, is a member of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, Itk contains the Tec homology (TH) domain containing one proline-rich region and a zinc-binding region. Itk is expressed in T-cells and mast cells, and is important in their development and differentiation. Of the three Tec kinases expressed in T-cells, Itk plays the predominant role in T-cell receptor (TCR) signaling. It is activated by phosphorylation upon TCR crosslinking and is involved in the pathway resulting in phospholipase C-gamma1 activation and actin polymerization. It also plays a role in the downstream signaling of the T-cell costimulatory receptor CD28, the T-cell surface receptor CD2, and the chemokine receptor CXCR4. In addition, Itk is crucial for the development of T-helper(Th)2 effector responses. The Itk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
24702 173657 cd05113 PTKc_Btk_Bmx Catalytic domain of the Protein Tyrosine Kinases, Bruton's tyrosine kinase and Bone marrow kinase on the X chromosome. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Btk and Bmx (also named Etk) are members of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, Btk contains the Tec homology (TH) domain with proline-rich and zinc-binding regions. Btk is expressed in B-cells, and a variety of myeloid cells including mast cells, platelets, neutrophils, and dendritic cells. It interacts with a variety of partners, from cytosolic proteins to nuclear transcription factors, suggesting a diversity of functions. Stimulation of a diverse array of cell surface receptors, including antigen engagement of the B-cell receptor, leads to PH-mediated membrane translocation of Btk and subsequent phosphorylation by Src kinase and activation. Btk plays an important role in the life cycle of B-cells including their development, differentiation, proliferation, survival, and apoptosis. Mutations in Btk cause the primary immunodeficiency disease, X-linked agammaglobulinaemia (XLA) in humans. Bmx is primarily expressed in bone marrow and the arterial endothelium, and plays an important role in ischemia-induced angiogenesis. It facilitates arterial growth, capillary formation, vessel maturation, and bone marrow-derived endothelial progenitor cell mobilization. The Btk/Bmx subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
24703 270685 cd05114 PTKc_Tec_Rlk Catalytic domain of the Protein Tyrosine Kinases, Tyrosine kinase expressed in hepatocellular carcinoma and Resting lymphocyte kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tec and Rlk (also named Txk) are members of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. Instead of PH, Rlk contains an N-terminal cysteine-rich region. In addition to PH, Tec also contains the Tec homology (TH) domain with proline-rich and zinc-binding regions. Tec kinases are expressed mainly by haematopoietic cells. Tec is more widely expressed than other Tec-like subfamily kinases. It is found in endothelial cells, both B- and T-cells, and a variety of myeloid cells including mast cells, erythroid cells, platelets, macrophages and neutrophils. Rlk is expressed in T-cells and mast cell lines. Tec and Rlk are both key components of T-cell receptor (TCR) signaling. They are important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. The Tec/Rlk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
24704 270686 cd05115 PTKc_Zap-70 Catalytic domain of the Protein Tyrosine Kinase, Zeta-chain-associated protein of 70kDa. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Zap-70 is a cytoplasmic (or nonreceptor) PTK containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. Zap-70 is primarily expressed in T-cells and NK cells, and is a crucial component in T-cell receptor (TCR) signaling. Zap-70 binds the phosphorylated ITAM (immunoreceptor tyr activation motif) sequences of the activated TCR zeta-chain through its SH2 domains, leading to its phosphorylation and activation. It then phosphorylates target proteins, which propagate the signals to downstream pathways. Zap-70 is hardly detected in normal peripheral B-cells, but is present in some B-cell malignancies. It is used as a diagnostic marker for chronic lymphocytic leukemia (CLL) as it is associated with the more aggressive subtype of the disease. The Zap-70 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
24705 133247 cd05116 PTKc_Syk Catalytic domain of the Protein Tyrosine Kinase, Spleen tyrosine kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Syk is a cytoplasmic (or nonreceptor) PTK containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. Syk was first cloned from the spleen, and its function in hematopoietic cells is well-established. It is involved in the signaling downstream of activated receptors (including B-cell and Fc receptors) that contain ITAMs (immunoreceptor tyr activation motifs), leading to processes such as cell proliferation, differentiation, survival, adhesion, migration, and phagocytosis. More recently, Syk expression has been detected in other cell types (including epithelial cells, vascular endothelial cells, neurons, hepatocytes, and melanocytes), suggesting a variety of biological functions in non-immune cells. Syk plays a critical role in maintaining vascular integrity and in wound healing during embryogenesis. It also regulates Vav3, which is important in osteoclast function including bone development. In breast epithelial cells, where Syk acts as a negative regulator for EGFR signaling, loss of Syk expression is associated with abnormal proliferation during cancer development suggesting a potential role as a tumor suppressor. In mice, Syk has been shown to inhibit malignant transformation of mammary epithelial cells induced with murine mammary tumor virus (MMTV). The Syk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
24706 270687 cd05117 STKc_CAMK The catalytic domain of CAMK family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. CaMKII is a signaling molecule that translates upstream calcium and reactive oxygen species (ROS) signals into downstream responses that play important roles in synaptic function and cardiovascular physiology. CAMKIV is implicated in regulating several transcription factors like CREB, MEF2, and retinoid orphan receptors, as well as in T-cell development and signaling. The CAMK family also consists of other related kinases including the Phosphorylase kinase Gamma subunit (PhKG), the C-terminal kinase domains of Ribosomal S6 kinase (RSK) and Mitogen and stress-activated kinase (MSK), Doublecortin-like kinase (DCKL), and the MAPK-activated protein kinases MK2, MK3, and MK5, among others. The CAMK family is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
24707 270688 cd05118 STKc_CMGC Catalytic domain of CMGC family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The CMGC family consists of Cyclin-Dependent protein Kinases (CDKs), Mitogen-activated protein kinases (MAPKs) such as Extracellular signal-regulated kinase (ERKs), c-Jun N-terminal kinases (JNKs), and p38, and other kinases. CDKs belong to a large subfamily of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. MAPKs serve as important mediators of cellular responses to extracellular signals. They control critical cellular functions including differentiation, proliferation, migration, and apoptosis. They are also implicated in the pathogenesis of many diseases including multiple types of cancer, stroke, diabetes, and chronic inflammation. Other members of the CMGC family include casein kinase 2 (CK2), Dual-specificity tYrosine-phosphorylated and -Regulated Kinase (DYRK), Glycogen Synthase Kinase 3 (GSK3), among many others. The CMGC family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 249
24708 270689 cd05119 RIO Catalytic domain of the atypical protein serine kinases, RIO kinases. RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. RIO1 is also required for proper cell cycle progression and chromosome maintenance. The biological substrates for RIO kinases are still unknown. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 192
24709 270690 cd05120 APH_ChoK_like Aminoglycoside 3'-phosphotransferase and Choline Kinase family. This family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 158
24710 270691 cd05121 ABC1_ADCK3-like Activator of bc1 complex (ABC1) kinases (also called aarF domain containing kinase 3) and similar proteins. This family is composed of the atypical yeast protein kinase Abc1p, its human homolog ADCK3 (also called CABC1), and similar proteins. Abc1p (also called Coq8p) is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is necessary for the formation of a multi-subunit Q-biosynthetic complex and may also function in the regulation of Q synthesis. Human ADCK3 is able to rescue defects in Q synthesis and the phosphorylation state of Coq proteins in yeast Abc1 (or Coq8) mutants. Mutations in ADCK3 cause progressive cerebellar ataxia and atrophy due to Q10 deficiency. Eukaryotes contain at least two more ABC1/ADCK3-like proteins: in humans, these are the putative atypical protein kinases named ADCK1 and ADCK2. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Eight of these plant ABC1 kinase subfamilies (ABC1K1-8) are specific for photosynthetic organisms. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family. 247
24711 270692 cd05122 PKc_STE Catalytic domain of STE family Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. This family is composed of STKs, and some dual-specificity PKs that phosphorylate both threonine and tyrosine residues of target proteins. Most members are kinases involved in mitogen-activated protein kinase (MAPK) signaling cascades, acting as MAPK kinases (MAPKKs), MAPKK kinases (MAPKKKs), or MAPKKK kinases (MAP4Ks). The MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The pathways involve a triple kinase core cascade comprising the MAPK, which is phosphorylated and activated by a MAPKK, which itself is phosphorylated and activated by a MAPKKK. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAPKKK to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Other STE family members include p21-activated kinases (PAKs) and class III myosins, among others. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain, which can phosphorylate several cytoskeletal proteins, conventional myosin regulatory light chains, as well as autophosphorylate the C-terminal motor domain. They play an important role in maintaining the structural integrity of photoreceptor cell microvilli. The STE family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
24712 270693 cd05123 STKc_AGC Catalytic domain of AGC family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. AGC kinases regulate many cellular processes including division, growth, survival, metabolism, motility, and differentiation. Many are implicated in the development of various human diseases. Members of this family include cAMP-dependent Protein Kinase (PKA), cGMP-dependent Protein Kinase (PKG), Protein Kinase C (PKC), Protein Kinase B (PKB), G protein-coupled Receptor Kinase (GRK), Serum- and Glucocorticoid-induced Kinase (SGK), and 70 kDa ribosomal Protein S6 Kinase (p70S6K or S6K), among others. AGC kinases share an activation mechanism based on the phosphorylation of up to three sites: the activation loop (A-loop), the hydrophobic motif (HM) and the turn motif. Phosphorylation at the A-loop is required of most AGC kinases, which results in a disorder-to-order transition of the A-loop. The ordered conformation results in the access of substrates and ATP to the active site. A subset of AGC kinases with C-terminal extensions containing the HM also requires phosphorylation at this site. Phosphorylation at the HM allows the C-terminal extension to form an ordered structure that packs into the hydrophobic pocket of the catalytic domain, which then reconfigures the kinase into an active bi-lobed state. In addition, growth factor-activated AGC kinases such as PKB, p70S6K, RSK, MSK, PKC, and SGK, require phosphorylation at the turn motif (also called tail or zipper site), located N-terminal to the HM at the C-terminal extension. The AGC family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and Phosphoinositide 3-Kinase. 250
24713 270694 cd05124 AFK Catalytic domain of Actin-Fragmin Kinase. AFK is found in slime molds, ciliates, and flowering plants. It catalyzes the transfer of the gamma-phosphoryl group from ATP specifically to threonine residues in the actin-fragmin complex. The phosphorylation sites are located at a minor contact site for DNase I and at an actin-actin contact site. Fragmin is an actin-binding protein that functions as a regulator of the microfilament system. It interferes with the growth of F-actin by severing actin filaments and capping their ends. The phosphorylation of the actin-fragmin complex inhibits its nucleation activity and results in calcium-dependent capping activity. Thus, AFK plays a role in regulating actin polymerization. The AFK catalytic domain is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 240
24714 240161 cd05125 Mth938_2P1-like Mth938_2P1-like domain. This model contains sequences that are similar to 2P1, a partially characterized nuclear protein, which is homologous to E3-3 from rat and known to be alternatively spliced. Its function is unknown. This family is part of the Mth938 family, for which structures, but no functional data are available. 114
24715 240162 cd05126 Mth938 Mth938 domain. Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. This protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer. The function of the protein has not been determined. 117
24716 213329 cd05127 RasGAP_IQGAP_like Ras-GTPase Activating Domain of IQ motif containing GTPase activating proteins. This family represents IQ motif containing GTPase activating protein (IQGAP), which is associated with the Ras GTP-binding protein. A primary function of IQGAP proteins is to modulate cytoskeletal architecture. There are three known IQGAP family members: IQGAP1, IQGAP2 and IQGAP3. Human IQGAP1 and IQGAP2 share 62% identity. IQGAPs are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP is an essential regulator of cytoskeletal function. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity, the protein actually lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite their similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3, only present in mammals, regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton. 331
24717 213330 cd05128 RasGAP_GAP1_like Ras-GTPase Activating Domain of GAP1 and similar proteins. The GAP1 family of Ras GTPase-activating proteins includes GAP1(m) (or RASA2), GAP1_IP4BP (or RASA3), Ca2+ -promoted Ras inactivator (CAPRI, or RASAL4), and Ras GTPase activating-like proteins (RASAL) or RASAL1. The members are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin homology domain that is associated with a Bruton's tyrosine kinase motif. While this domain structure is conserved, a small change in the function of each individual domain and the interaction between domains has a marked effect on the regulation of each protein. 269
24718 213331 cd05129 RasGAP_RAP6 Ras-GTPase Activating Domain of Rab5-activating protein 6. Rab5-activating protein 6 (RAP6) is an endosomal protein with a role in the regulation of receptor-mediated endocytosis. RAP6 contains a Vps9 domain, which is involved in the activation of Rab5, and a Ras GAP domain (RGD). Rab5 is a small GTPase required for the control of the endocytic route, and its activity is regulated by guanine nucleotide exchange factor, such as Rabex5, and GAPs, such as RN-tre. Human Rap6 protein is localized on the plasma membrane and on the endosome. RAP6 binds to Rab5 and Ras through the Vps9 and RGD domains, respectively. 365
24719 213332 cd05130 RasGAP_Neurofibromin Ras-GTPase Activating Domain of neurofibromin. Neurofibromin is the product of the neurofibromatosis type 1 gene (NF1) and shares a region of similarity with the catalytic domain of the mammalian p120RasGAP protein and an extended similarity with the Saccharomyces cerevisiae RasGAP proteins Ira1 and Ira2. Neurofibromin has been shown to function as a GAP (GTPase-activating protein) which inhibits low molecular weight G proteins such as Ras by stimulating their intrinsic GTPase activity. NF1 is a common genetic disorder characterized by various symptoms ranging from predisposition for the development of tumors to learning disability or mental retardation. Loss of neurofibromin activity can be correlated to the increase in Ras-GTP concentration in neurofibromas of NF1 patients, supporting the notion that unregulated Ras signaling may contribute to their development. 332
24720 213333 cd05131 RasGAP_IQGAP2 Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 2. IQGAP2 is a member of the IQGAP family that contains a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeat, a single WW domain, four IQ motifs which mediate interactions with calmodulin, and a Ras-GTPase-activating protein (GAP)-related domain that binds Rho family GTPases. IQGAP2 and IQGAP3 play important roles in the regulation of the cytoskeleton for axon outgrowth in hippocampal neurons and are thought to stay in a common regulatory pathway. The results of RNA interference studies indicated that IQGAP3 partially compensates for the functions of IQGAP2, but has lesser ability than IQGAP2 to promote axon outgrowth in hippocampal neurons. Moreover, IQGAP2 is required for the cadherin-mediated cell-to-cell adhesion in Xenopus laevis embryos. 359
24721 213334 cd05132 RasGAP_GAPA Ras-GTPase Activating Domain of GAPA. GAPA is an IQGAP-related protein and is predicted to bind to small GTPases, which are yet to be identified. IQGAP proteins are integral components of cytoskeletal regulation. Results from truncated GAPAs indicated that almost the entire region of GAPA homologous to IQGAP is required for cytokinesis in Dictyostelium. More members of the IQGAP family are emerging, and evidence suggests that there are both similarities and differences in their function. 352
24722 213335 cd05133 RasGAP_IQGAP1 Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 1. IQGAP1 is a homodimeric protein that is widely expressed among vertebrate cell types from early embryogenesis. Mammalian IQGAP1 protein is the best characterized member of the IQGAP family, and contains several protein-interacting domains. Human IQGAP1 is most similar to mouse Iqgap1 (94% identity) and has 62% identity to human IQGAP2. IQGAP1 binds and cross-links actin filaments in vitro and has been implicated in Ca2+/calmodulin signaling, E-cadherin-dependent cell adhesion, cell motility, and invasion. Yeast IQGAP homologs have a role in the recruitment of actin filaments, are components of the spindle pole body, and are required for actomyosin ring assembly and cytokinesis. Furthermore, IQGAP1 over-expression has also been detected in gastric and colorectal carcinomas and gastric cancer cell lines. 380
24723 213336 cd05134 RasGAP_RASA3 Ras-GTPase Activating Domain of RASA3. RASA3 (or GAP1_IP4BP) is a member of the GAP1 family and has been shown to specifically bind inositol 1,3,4,5-tetrakisphosphate (IP4). Thus, RASA3 may function as an IP4 receptor. The members of GAP1 family are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. Purified RASA3 stimulates GAP activity on Ras with about a five-fold lower potency than p120RasGAP, but shows no GAP-stimulating activity at all against Rac or Rab3A. 269
24724 213337 cd05135 RasGAP_RASAL Ras-GTPase Activating Domain of RASAL1 and similar proteins. Ras GTPase activating-like protein (RASAL) or RASAL1 is a member of the GAP1 family, and a Ca2+ sensor responding in-phase to repetitive Ca2+ signals by associating with the plasma membrane and deactivating Ras. It contains a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. RASAL, like Ca2+ -promoted Ras inactivator (CAPRI, or RASAL4), is a cytosolic protein that undergoes a rapid translocation to the plasma membrane in response to receptor-mediated elevation in the concentration of intracellular free Ca2+, a translocation that activates its ability to function as a RasGAP. However, unlike RASAL4, RASAL undergoes an oscillatory translocation to the plasma membrane that occurs in synchrony with repetitive Ca2+ spikes. 287
24725 213338 cd05136 RasGAP_DAB2IP Ras-GTPase Activating Domain of DAB2IP and similar proteins. The DAB2IP family of Ras GTPase-activating proteins includes DAB2IP, nGAP, and Syn GAP. Disabled 2 interactive protein, (DAB2IP; also known as ASK-interacting protein 1 (AIP1)), is a member of the GTPase-activating proteins, down-regulates Ras-mediated signal pathways, and mediates TNF-induced activation of ASK1-JNK signaling pathways. The mechanism by which TNF signaling is coupled to DAB2IP is not known. 324
24726 213339 cd05137 RasGAP_CLA2_BUD2 Ras-GTPase Activating Domain of CLA2/BUD2. CLA2/BUD2 functions as a GTPase-activating protein (GAP) for BUD1/RSR1 and is necessary for proper bud-site selection in yeast. BUD2 has sequence similarity to the catalytic domain of RasGAPs, and stimulates the hydrolysis of BUD1-GTP to BUD1-GDP. Elimination of Bud2p activity by mutation causes a random budding pattern with no growth defect. Overproduction of Bud2p also alters the budding pattern. 356
24727 240163 cd05140 Barstar_AU1054-like Barstar_AU1054-like contains uncharacterized sequences similar to the uncharacterized, predicted RNAase inhibitor AU1054 found in Burkholderia cenocepacia. This is a subfamily of the Barstar family of RNAase inhibitors. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase. 86
24728 240164 cd05141 Barstar_evA4336-like Barstar_evA4336-like contains uncharacterized sequences similar to the uncharacterized, predicted RNAase inhibitor evA4336 found in Azoarcus sp. EvN1. This is a subfamily of the Barstar family of RNAase inhibitors. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase. 81
24729 240165 cd05142 Barstar Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase. 87
24730 240166 cd05143 Barstar_SaI14_like Barstar_SaI14_like contains sequences that are similar to SaI14, an RNAase inhibitor, which are members of the Barstar family. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. The sequences in this subfamily are mostly uncharacterized, but believed to have a similar function and role. 88
24731 270695 cd05144 RIO2_C C-terminal catalytic domain of the atypical protein serine kinase, RIO2 kinase. RIO2 is present in archaea and eukaryotes. It contains an N-terminal winged helix (wHTH) domain and a C-terminal RIO kinase catalytic domain. The wHTH domain is primarily seen in DNA-binding proteins, although some wHTH domains may be involved in RNA recognition. RIO2 is essential for survival and is necessary for rRNA cleavage during 40S ribosomal subunit maturation. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO2 kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 183
24732 270696 cd05145 RIO1_like Catalytic domain of the atypical protein serine kinases, RIO1 and RIO3 kinases and similar proteins. RIO1 is present in archaea, bacteria and eukaryotes. In addition, RIO3 is present in multicellular eukaryotes. Both RIO1 and RIO3 are associated with precursors of 40S ribosomal subunits, just like RIO2. RIO1 is essential for survival and is required for 18S rRNA processing, proper cell cycle progression and chromosome maintenance. Although depletion of either RIO1 or RIO2 results in similar effects, the two kinases are not fully interchangeable. The specific function of RIO3 is unknown. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 189
24733 270697 cd05146 RIO3_euk Catalytic domain of the atypical protein serine kinase, RIO3 kinase. RIO3 is present only in multicellular eukaryotes. It is associated with precursors of 40S ribosomal subunits, just like RIO1 and RIO2. Its specific function is still unknown. Like RIO1 and RIO2, it may be involved in ribosomal subunit processing and maturation. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO3 kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 196
24734 270698 cd05147 RIO1_euk Catalytic domain of the atypical protein serine kinase, Eukaryotic RIO1 kinase. RIO1 is present in archaea, bacteria and eukaryotes. This subfamily is composed of RIO1 proteins from eukaryotes. RIO1 is essential for survival and is required for 18S rRNA processing, proper cell cycle progression and chromosome maintenance. It is associated with precursors of 40S ribosomal subunits, just like RIO2. Although depletion of either RIO1 or RIO2 results in similar effects, the two kinases are not fully interchangeable. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 190
24735 133248 cd05148 PTKc_Srm_Brk Catalytic domain of the Protein Tyrosine Kinases, Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristylation sites (Srm) and Breast tumor kinase (Brk). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Srm and Brk (also called protein tyrosine kinase 6) are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Brk has been found to be overexpressed in a majority of breast tumors. Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Srm and Brk however, lack the N-terminal myristylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. The Srm/Brk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
24736 270699 cd05150 APH Aminoglycoside 3'-phosphotransferase. APH catalyzes the transfer of the gamma-phosphoryl group from ATP to aminoglycoside antibiotics such as kanamycin, streptomycin, neomycin, and gentamicin, among others. The aminoglycoside antibiotics target the 30S ribosome and promote miscoding, leading to the production of defective proteins which insert into the bacterial membrane, resulting in membrane damage and the ultimate demise of the bacterium. Phosphorylation of the aminoglycoside antibiotics results in their inactivation, leading to bacterial antibiotic resistance. The APH gene is found on transposons and plasmids and is thought to have originated as a self-defense mechanism used by microorganisms that produce the antibiotics. The APH subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 244
24737 270700 cd05151 ChoK-like Choline Kinase and similar proteins. This subfamily is composed of bacterial and eukaryotic choline kinases, as well as eukaryotic ethanolamine kinase. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC), and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. Bacterial ChoK is also referred to as licA protein. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate and displays negligible activity towards N-methylated derivatives of Etn. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 152
24738 270701 cd05152 MPH2' Macrolide 2'-Phosphotransferase. MPH2' catalyzes the transfer of the gamma-phosphoryl group from ATP to the 2'-hydroxyl of macrolide antibiotics such as erythromycin, clarithromycin, and azithromycin, among others. Macrolides penetrate the bacterial cell and bind to ribosomes, where they interrupt protein elongation, leading ultimately to the demise of the bacterium. Phosphorylation of macrolides leads to their inactivation. Based on substrate specificity and amino acid sequence, MPH2' is divided into types I and II, encoded by mphA and mphB genes, respectively. MPH2'I inactivates 14-membered ring macrolides while MPH2'II inactivates both 14- and 16-membered ring macrolides. Enzymatic inactivation of macrolides has been reported as a mechanism for bacterial resistance in clinical samples. MPH2' is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 276
24739 270702 cd05153 HomoserineK_II Type II Homoserine Kinase. This subfamily is composed of unusual homoserine kinases, from a subset of bacteria, which have a Protein Kinase fold. These proteins do not bear any similarity to the GHMP family homoserine kinases present in most bacteria and eukaryotes. Homoserine kinase catalyzes the transfer of the gamma-phosphoryl group from ATP to L-homoserine producing L-homoserine phosphate, an intermediate in the production of the amino acids threonine, methionine, and isoleucine. The Type II homoserine kinase subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 300
24740 270703 cd05154 ACAD10_11_N-like N-terminal domain of Acyl-CoA dehydrogenase (ACAD) 10 and 11, and similar proteins. This subfamily is composed of the N-terminal domains of vertebrate ACAD10 and ACAD11, and similar uncharacterized bacterial and eukaryotic proteins. ACADs are a family of flavoproteins that are involved in the beta-oxidation of fatty acyl-CoA derivatives. ACAD deficiency can cause metabolic disorders including muscle fatigue, hypoglycemia, and hepatic lipidosis. There are at least 11 distinct ACADs, some of which show distinct substrate specificities to either straight-chain or branched-chain fatty acids. ACAD10 is widely expressed in human tissues and highly expressed in liver, kidney, pancreas, and spleen. ACAD10 and ACAD11 are both significantly expressed in human brain tissues. They contain a long N-terminal domain with similarity to phosphotransferases with a Protein Kinase fold, which is absent in other ACADs. They may exhibit multiple functions in acyl-CoA oxidation pathways. ACAD11 utilizes substrates with carbon chain lengths of 20 to 26, with optimal activity towards C22CoA. ACAD10 may be associated with an increased risk in type II diabetes. The ACAD10/11-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 254
24741 270704 cd05155 APH_ChoK_like_1 Uncharacterized bacterial proteins with similarity to Aminoglycoside 3'-phosphotransferase and Choline kinase. This subfamily is composed of uncharacterized bacterial proteins with similarity to APH and ChoK. Other APH/ChoK-like proteins include ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). These proteins catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates, such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides, and macrolides leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine. The APH/ChoK-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 234
24742 270705 cd05156 ChoK_euk Eukaryotic Choline Kinase. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC) and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. Along with PCho, it is involved in malignant transformation through Ras oncogenes in various human cancers such as breast, lung, colon, prostate, neuroblastoma, and hepatic lymphoma. In mammalian cells, there are three ChoK isoforms (A-1, A-2, and B) which are active in homo- or heterodimeric forms. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 326
24743 270706 cd05157 ETNK_euk Eukaryotic Ethanolamine kinase. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate, and displays negligible activity towards N-methylated derivatives of Etn. The Drosophila ETNK is implicated in development and neuronal function. Mammals contain two ETNK proteins, ETNK1 and ETNK2. ETNK1 selectively increases Etn uptake and phosphorylation, as well as PtdEtn synthesis. ETNK2 is found primarily in the liver and reproductive tissues. It plays a critical role in regulating placental hemostasis to support late embryonic development. It may also have a role in testicular maturation. ETNK is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 307
24744 176646 cd05160 DEDDy_DNA_polB_exo DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. The 3'-5' exonuclease domain of family-B DNA polymerases. This domain has a fundamental role in reducing polymerase errors and is involved in proofreading activity. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The exonuclease domain of family B polymerase also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Members include Escherichia coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon and zeta), and eukaryotic viral and plasmid-borne enzymes. Nuclear DNA polymerases alpha and zeta lack the four conserved acidic metal-binding residues. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair. 199
24745 99894 cd05162 PWWP The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors. 87
24746 270707 cd05163 PIKK_TRRAP Pseudokinase domain of TRansformation/tRanscription domain-Associated Protein. TRRAP belongs to the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. It contains a FATC (FRAP, ATM and TRRAP, C-terminal) domain and has a large molecular weight. Unlike most PIKK proteins, however, it contains an inactive PI3K-like pseudokinase domain, which lacks the conserved residues necessary for ATP binding and catalytic activity. TRRAP also contains many motifs that may be critical for protein-protein interactions. TRRAP is a common component of many histone acetyltransferase (HAT) complexes, and is responsible for the recruitment of these complexes to chromatin during transcription, replication, and DNA repair. TRRAP also exists in non-HAT complexes such as the p400 and MRN complexes, which are implicated in ATP-dependent remodeling and DNA repair, respectively. The TRRAP pseudokinase domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 252
24747 270708 cd05164 PIKKc Catalytic domain of Phosphoinositide 3-kinase-related protein kinases. PIKK subfamily members include ATM (Ataxia telangiectasia mutated), ATR (Ataxia telangiectasia and Rad3-related), TOR (Target of rapamycin), SMG-1 (Suppressor of morphogenetic effect on genitalia-1), and DNA-PK (DNA-dependent protein kinase). PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). They show strong preference for phosphorylating serine/threonine residues followed by a glutamine and are also referred to as (S/T)-Q-directed kinases. They all contain a FATC (FRAP, ATM and TRRAP, C-terminal) domain. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control. The PIKK catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 222
24748 270709 cd05165 PI3Kc_I Catalytic domain of Class I Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. In vitro, they can also phosphorylate the substrates PtdIns and PtdIns(4)P. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 363
24749 270710 cd05166 PI3Kc_II Catalytic domain of Class II Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. They are activated by a variety of stimuli including chemokines, cytokines, lysophosphatidic acid (LPA), insulin, and tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 352
24750 270711 cd05167 PI4Kc_III_alpha Catalytic domain of Type III Phosphoinositide 4-kinase alpha. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. PI4KIIIalpha is a 220 kDa protein found in the plasma membrane and the endoplasmic reticulum (ER). The role of PI4KIIIalpha in the ER remains unclear. In the plasma membrane, it provides PtdIns(4)P, which is then converted by PI5Ks to PtdIns(4,5)P2, an important signaling molecule. Vertebrate PI4KIIIalpha is also part of a signaling complex associated with P2X7 ion channels. The yeast homolog, Stt4p, is also important in regulating the conversion of phosphatidylserine to phosphatidylethanolamine at the ER and Golgi interface. Mammalian PI4KIIIalpha is highly expressed in the nervous system. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 307
24751 270712 cd05168 PI4Kc_III_beta Catalytic domain of Type III Phosphoinositide 4-kinase beta. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. PI4KIIIbeta (also called Pik1p in yeast) is a 110 kDa protein that is localized to the Golgi and the nucleus. It is required for maintaining the structural integrity of the Golgi complex (GC), and is a key regulator of protein transport from the GC to the plasma membrane. PI4KIIIbeta also functions in the genesis, transport, and exocytosis of synaptic vesicles. The Drosophila PI4KIIIbeta is essential for cytokinesis during spermatogenesis. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 292
24752 270713 cd05169 PIKKc_TOR Catalytic domain of Target of Rapamycin. TOR contains a rapamycin binding domain, a catalytic domain, and a FATC (FRAP, ATM and TRRAP, C-terminal) domain at the C-terminus. It is also called FRAP (FK506 binding protein 12-rapamycin associated protein). TOR is a central component of the eukaryotic growth regulatory network. It controls the expression of many genes transcribed by all three RNA polymerases. It associates with other proteins to form two distinct complexes, TORC1 and TORC2. TORC1 is involved in diverse growth-related functions including protein synthesis, nutrient use and transport, autophagy and stress responses. TORC2 is involved in organizing cytoskeletal structures. TOR is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The TOR catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 279
24753 270714 cd05170 PIKKc_SMG1 Catalytic domain of Suppressor of Morphogenetic effect on Genitalia-1. SMG-1 plays a critical role in the mRNA surveillance mechanism known as non-sense mediated mRNA decay (NMD). NMD protects the cells from the accumulation of aberrant mRNAs with premature termination codons (PTCs) generated by genome mutations and by errors during transcription and splicing. SMG-1 phosphorylates Upf1, another central component of NMD, at the C-terminus upon recognition of PTCs. The phosphorylation/dephosphorylation cycle of Upf1 is essential for promoting NMD. In addition to its catalytic domain, SMG-1 contains a FATC (FRAP, ATM and TRRAP, C-terminal) domain at the C-terminus. SMG-1 is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The SMG-1 catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 304
24754 270715 cd05171 PIKKc_ATM Catalytic domain of Ataxia Telangiectasia Mutated. ATM is critical in the response to DNA double strand breaks (DSBs) caused by radiation. It is activated at the site of a DSB and phosphorylates key substrates that trigger pathways that regulate DNA repair and cell cycle checkpoints at the G1/S, S phase, and G2/M transition. Patients with the human genetic disorder Ataxia telangiectasia (A-T), caused by truncating mutations in ATM, show genome instability, increased cancer risk, immunodeficiency, compromised mobility, and neurodegeneration. A-T displays clinical heterogeneity, which is correlated to the degree of retained ATM activity. ATM contains a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. It is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The ATM catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 282
24755 270716 cd05172 PIKKc_DNA-PK Catalytic domain of DNA-dependent protein kinase. DNA-PK is comprised of a regulatory subunit, containing the Ku70/80 subunit, and a catalytic subunit, which contains a NUC194 domain of unknown function, a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. It is part of a multi-component system involved in non-homologous end joining (NHEJ), a process of repairing double strand breaks (DSBs) by joining together two free DNA ends of little homology. DNA-PK functions as a molecular sensor for DNA damage that enhances the signal via phosphorylation of downstream targets. It may also act as a protein scaffold that aids the localization of DNA repair proteins to the site of DNA damage. DNA-PK also plays a role in the maintenance of telomeric stability and the prevention of chromosomal end fusion. DNA-PK is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The DNA-PK catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 235
24756 270717 cd05173 PI3Kc_IA_beta Catalytic domain of Class IA Phosphoinositide 3-kinase beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kbeta can be activated by G-protein-coupled receptors. Deletion of PI3Kbeta in mice results in early lethality at around day three of development. PI3Kbeta plays an important role in regulating sustained integrin activation and stable platelet aggregation, especially under conditions of high shear stress. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 362
24757 270718 cd05174 PI3Kc_IA_delta Catalytic domain of Class IA Phosphoinositide 3-kinase delta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kdelta is mainly expressed in immune cells and plays an important role in cellular and humoral immunity. It plays a major role in antigen receptor signaling in B-cells, T-cells, and mast cells. It regulates the differentiation of peripheral helper T-cells and controls the development and function of regulatory T-cells. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 366
24758 270719 cd05175 PI3Kc_IA_alpha Catalytic domain of Class IA Phosphoinositide 3-kinase alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kalpha plays an important role in insulin signaling. It also mediates physiologic heart growth and provides protection from stress. Activating mutations of PI3Kalpha are associated with diverse forms of cancer at high frequency. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 370
24759 270720 cd05176 PI3Kc_C2_alpha Catalytic domain of Class II Phosphoinositide 3-kinase alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II alpha isoform, PI3K-C2alpha, plays key roles in clathrin assembly and clathrin-mediated membrane trafficking, insulin signaling, vascular smooth muscle contraction, and the priming of neurosecretory granule exocytosis. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 353
24760 270721 cd05177 PI3Kc_C2_gamma Catalytic domain of Class II Phosphoinositide 3-kinase gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II gamma isoform, PI3K-C2gamma, is expressed in the liver, breast, and prostate. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. Its biological function remains unknown. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 354
24761 176178 cd05188 MDR Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family. The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc. 271
24762 133449 cd05191 NAD_bind_amino_acid_DH NAD(P) binding domain of amino acid dehydrogenase-like proteins. Amino acid dehydrogenase (DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 86
24763 187536 cd05193 AR_like_SDR_e aldehyde reductase, flavonoid reductase, and related proteins, extended (e) SDRs. This subgroup contains aldehyde reductase and flavonoid reductase of the extended SDR-type and related proteins. Proteins in this subgroup have a complete SDR-type active site tetrad and a close match to the canonical extended SDR NADP-binding motif. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it catalyzes the NADP-dependent reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. The related flavonoid reductases act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 295
24764 176179 cd05195 enoyl_red enoyl reductase of polyketide synthase. Putative enoyl reductase of polyketide synthase. Polyketide synthases produce polyketides in a step-by-step mechanism that is similar to fatty acid synthesis. Enoyl reductase reduces a double bond to a single bond. Erythromycin is one example of a polyketide generated by 3 complex enzymes (megasynthases). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been identified in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has an NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. 
Coenzyme binding typically precedes and contributes to substrate binding. 293
24765 133425 cd05197 GH4_glycoside_hydrolases Glycoside Hydrolases Family 4. Glycoside hydrolases cleave glycosidic bonds to release smaller sugars from oligo- or polysaccharides. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). After translocation, these phospho-disaccharides may be hydrolyzed by GH4 glycoside hydrolases. Other organisms (such as archaea and Thermotoga maritima) lack the PEP-PTS system, but have several enzymes normally associated with the PEP-PTS operon. GH4 family members include 6-phospho-beta-glucosidases, 6-phospho-alpha-glucosidases, alpha-glucosidases/alpha-glucuronidases (only from Thermotoga), and alpha-galactosidases. They require two cofactors, NAD+ and a divalent metal (Mn2+, Ni2+, Mg2+), for activity. Some also require reducing conditions. GH4 glycoside hydrolases are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, amino acid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 425
24766 240622 cd05198 formate_dh_like Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxy acid dehydrogenase family. Formate dehydrogenase, D-specific 2-hydroxy acid dehydrogenase, Phosphoglycerate Dehydrogenase, Lactate dehydrogenase, Thermostable Phosphite Dehydrogenase, and Hydroxy(phenyl)pyruvate reductase, among others, share a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have a similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production and in the stress responses of plants. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase, among others. 
While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 302
24767 240623 cd05199 SDH_like Saccharopine Dehydrogenase like proteins. Saccharopine Dehydrogenase (SDH) and related proteins, including bifunctional lysine ketoglutarate reductase/SDH enzymes and N(5)-(carboxyethyl)ornithine synthases. SDH catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of related structure. Bifunctional lysine ketoglutarate reductase/SDH protein is a pair of enzymes linked on a single polypeptide chain that catalyze the initial, consecutive steps of lysine degradation. These proteins are related to the 2-domain saccharopine dehydrogenases. 319
24768 133450 cd05211 NAD_bind_Glu_Leu_Phe_Val NAD(P) binding domain of glutamate dehydrogenase, leucine dehydrogenase, phenylalanine dehydrogenase, and valine dehydrogenase. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NAD(P)+. This subfamily includes glutamate, leucine, phenylalanine, and valine DHs. Glutamate DH is a multi-domain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. As in other NAD+-dependent DHs, monomers in this family have 2 domains separated by a deep cleft. Here the C-terminal domain contains a modified NAD-binding Rossmann fold with 7 rather than the usual 6 beta strands and one strand anti-parallel to the others. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. 
NAD binding involves numerous hydrogen and van der Waals contacts. 217
24769 133451 cd05212 NAD_bind_m-THF_DH_Cyclohyd_like NAD(P) binding domain of methylene-tetrahydrofolate dehydrogenase and methylene-tetrahydrofolate dehydrogenase/cyclohydrolase. NAD(P) binding domains of methylene-tetrahydrofolate dehydrogenase (m-THF DH) and m-THF DH/cyclohydrolase bifunctional enzymes (m-THF DH/cyclohydrolase). M-THF is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of metabolic intermediate m-THF to m-THF requires the enzyme m-THF DH. In addition, most DHs also have an associated cyclohydrolase activity which catalyzes its hydrolysis to N10-formyltetrahydrofolate. m-THF DH is typically found as part of a multifunctional protein in eukaryotes. NADP-dependent m-THF DH in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, mono-functional DH, as well as bifunctional DH/cyclohydrolase are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. 
The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 140
24770 133452 cd05213 NAD_bind_Glutamyl_tRNA_reduct NADP-binding domain of glutamyl-tRNA reductase. Glutamyl-tRNA reductase catalyzes the conversion of glutamyl-tRNA to glutamate-1-semialdehyde, initiating the synthesis of tetrapyrrole. Whereas tRNAs are generally associated with peptide bond formation in protein translation, here the tRNA activates glutamate in the initiation of tetrapyrrole biosynthesis in archaea, plants and many bacteria. In the first step, activated glutamate is reduced to glutamate-1-semialdehyde via the NADPH-dependent glutamyl-tRNA reductase. Glutamyl-tRNA reductase forms a V-shaped dimer. Each monomer has 3 domains: an N-terminal catalytic domain, a classic nucleotide binding domain, and a C-terminal dimerization domain. Although the representative structure 1GPJ lacks a bound NADPH, a theoretical binding pocket has been described (PMID 11172694). Amino acid dehydrogenase (DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 311
24771 187537 cd05226 SDR_e_a Extended (e) and atypical (a) SDRs. Extended or atypical short-chain dehydrogenases/reductases (SDRs, aka tyrosine-dependent oxidoreductases) are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 176
24772 187538 cd05227 AR_SDR_e aldehyde reductase, extended (e) SDRs. This subgroup contains aldehyde reductase of the extended SDR-type and related proteins. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it has an NADP-binding motif consensus that is slightly different from the canonical SDR form and lacks the Asn of the extended SDR active site tetrad. Aldehyde reductase I catalyzes the NADP-dependent reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 301
24773 187539 cd05228 AR_FR_like_1_SDR_e uncharacterized subgroup of aldehyde reductase and flavonoid reductase related proteins, extended (e) SDRs. This subgroup contains proteins of unknown function related to aldehyde reductase and flavonoid reductase of the extended SDR-type. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it has an NADP-binding motif consensus that is slightly different from the canonical SDR form and lacks the Asn of the extended SDR active site tetrad. Aldehyde reductase I catalyzes the NADP-dependent reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. The related flavonoid reductases act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 318
24774 187540 cd05229 SDR_a3 atypical (a) SDRs, subgroup 3. These atypical SDR family members of unknown function have a glycine-rich NAD(P)-binding motif consensus that is very similar to the extended SDRs, GXXGXXG. Generally, this group has poor conservation of the active site tetrad; however, individual sequences do contain matches to the YXXXK active site motif, and generally Tyr or Asn in place of the upstream Ser found in most SDRs. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 302
24775 187541 cd05230 UGD_SDR_e UDP-glucuronate decarboxylase (UGD) and related proteins, extended (e) SDRs. UGD catalyzes the formation of UDP-xylose from UDP-glucuronate; it is an extended-SDR, and has the characteristic glycine-rich NAD-binding pattern, TGXXGXXG, and active site tetrad. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 305
24776 187542 cd05231 NmrA_TMR_like_1_SDR_a NmrA (a transcriptional regulator) and triphenylmethane reductase (TMR) like proteins, subgroup 1, atypical (a) SDRs. Atypical SDRs related to NMRa, TMR, and HSCARG (an NADPH sensor). This subgroup resembles the SDRs and has a partially conserved characteristic [ST]GXXGXXG NAD-binding motif, but lacks the conserved active site residues. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Atypical SDRs are distinct from classical SDRs. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 259
24777 187543 cd05232 UDP_G4E_4_SDR_e UDP-glucose 4 epimerase, subgroup 4, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup is comprised of bacterial proteins, and includes the Staphylococcus aureus capsular polysaccharide Cap5N, which may have a role in the synthesis of UDP-N-acetyl-d-fucosamine. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 303
24778 212491 cd05233 SDR_c classical (c) SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 234
24779 187545 cd05234 UDP_G4E_2_SDR_e UDP-glucose 4 epimerase, subgroup 2, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup is comprised of archaeal and bacterial proteins, and has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 305
24780 187546 cd05235 SDR_e1 extended (e) SDRs, subgroup 1. This family consists of an SDR module of multidomain proteins identified as putative polyketide synthases, fatty acid synthases (FAS), and nonribosomal peptide synthases, among others. However, unlike the usual ketoreductase modules of FAS and polyketide synthase, these domains are related to the extended SDRs, and have canonical NAD(P)-binding motifs and an active site tetrad. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 290
24781 187547 cd05236 FAR-N_SDR_e fatty acyl CoA reductases (FARs), extended (e) SDRs. SDRs are Rossmann-fold NAD(P)H-binding proteins, many of which may function as fatty acyl CoA reductases (FAR), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. This N-terminal domain shares the catalytic triad (but not the upstream Asn) and characteristic NADP-binding motif of the extended SDR family. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 320
24782 187548 cd05237 UDP_invert_4-6DH_SDR_e UDP-GlcNAc (UDP-linked N-acetylglucosamine) inverting 4,6-dehydratase, extended (e) SDRs. UDP-GlcNAc inverting 4,6-dehydratase was identified in Helicobacter pylori as the hexameric flaA1 gene product (FlaA1). FlaA1 possesses UDP-GlcNAc-inverting 4,6-dehydratase activity and catalyzes the first step in the creation of a pseudaminic acid derivative in protein glycosylation. Although this subgroup has the NADP-binding motif characteristic of extended SDRs, its members tend to have a Met substituted for the active site Tyr found in most SDR families. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 287
24783 187549 cd05238 Gne_like_SDR_e Escherichia coli Gne (a nucleoside-diphosphate-sugar 4-epimerase)-like, extended (e) SDRs. Nucleoside-diphosphate-sugar 4-epimerase has the characteristic active site tetrad and NAD-binding motif of the extended SDR, and is related to more specifically defined epimerases such as UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), which catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup includes Escherichia coli O55:H7 Gne, a UDP-GlcNAc 4-epimerase, essential for O55 antigen synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 305
24784 187550 cd05239 GDP_FS_SDR_e GDP-fucose synthetase, extended (e) SDRs. GDP-fucose synthetase (aka 3, 5-epimerase-4-reductase) acts in the NADP-dependent synthesis of GDP-fucose from GDP-mannose. Two activities have been proposed for the same active site: epimerization and reduction. Proteins in this subgroup are extended SDRs, which have a characteristic active site tetrad and an NADP-binding motif, [AT]GXXGXXG, that is a close match to the archetypical form. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 300
24785 187551 cd05240 UDP_G4E_3_SDR_e UDP-glucose 4 epimerase (G4E), subgroup 3, extended (e) SDRs. Members of this bacterial subgroup are identified as possible sugar epimerases, such as UDP-glucose 4 epimerase. However, while the NAD(P)-binding motif is fairly well conserved, not all members retain the canonical active site tetrad of the extended SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 306
24786 187552 cd05241 3b-HSD-like_SDR_e 3beta-hydroxysteroid dehydrogenases (3b-HSD)-like, extended (e) SDRs. Extended SDR family domains belonging to this subgroup have the characteristic active site tetrad and a fairly well-conserved NAD(P)-binding motif. 3b-HSD catalyzes the NAD-dependent conversion of various steroids, such as pregnenolone to progesterone, or androstenediol to testosterone. This subgroup includes an unusual bifunctional 3b-HSD/C-4 decarboxylase from Arabidopsis thaliana, and Saccharomyces cerevisiae ERG26, a 3b-HSD/C-4 decarboxylase, involved in the synthesis of ergosterol, the major sterol of yeast. It also includes human 3 beta-HSD/HSD3B1 and C(27) 3beta-HSD/ [3beta-hydroxy-delta(5)-C(27)-steroid oxidoreductase; HSD3B7]. C(27) 3beta-HSD/HSD3B7 is a membrane-bound enzyme of the endoplasmic reticulum, that catalyzes the isomerization and oxidation of 7alpha-hydroxylated sterol intermediates, an early step in bile acid biosynthesis. Mutations in the human NSDHL (NAD(P)H steroid dehydrogenase-like protein) cause CHILD syndrome (congenital hemidysplasia with ichthyosiform nevus and limb defects), an X-linked dominant, male-lethal trait. Mutations in the human gene encoding C(27) 3beta-HSD underlie a rare autosomal recessive form of neonatal cholestasis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 331
24787 187553 cd05242 SDR_a8 atypical (a) SDRs, subgroup 8. This subgroup contains atypical SDRs of unknown function. Proteins in this subgroup have a glycine-rich NAD(P)-binding motif consensus that resembles that of the extended SDRs, (GXXGXXG or GGXGXXG), but lacks the characteristic active site residues of the SDRs. A Cys often replaces the usual Lys of the YXXXK active site motif, while the upstream Ser is generally present and Arg replaces the usual Asn. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 296
24788 187554 cd05243 SDR_a5 atypical (a) SDRs, subgroup 5. This subgroup contains atypical SDRs, some of which are identified as putative NAD(P)-dependent epimerases, one as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif that is very similar to the extended SDRs, GXXGXXG, and binds NADP. Generally, this subgroup has poor conservation of the active site tetrad; however, individual sequences do contain matches to the YXXXK active site motif, the upstream Ser, and there is a highly conserved Asp in place of the usual active site Asn throughout the subgroup. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 203
24789 187555 cd05244 BVR-B_like_SDR_a biliverdin IX beta reductase (BVR-B, aka flavin reductase)-like proteins; atypical (a) SDRs. Human BVR-B catalyzes pyridine nucleotide-dependent production of bilirubin-IX beta during fetal development; in the adult BVR-B has flavin and ferric reductase activities. Human BVR-B catalyzes the reduction of FMN, FAD, and riboflavin. Recognition of flavin occurs mostly by hydrophobic interactions, accounting for the broad substrate specificity. Atypical SDRs are distinct from classical SDRs. BVR-B does not share the key catalytic triad, or conserved tyrosine typical of SDRs. The glycine-rich NADP-binding motif of BVR-B is GXXGXXG, which is similar but not identical to the pattern seen in extended SDRs. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 207
24790 187556 cd05245 SDR_a2 atypical (a) SDRs, subgroup 2. This subgroup contains atypical SDRs, one member is identified as Escherichia coli protein ybjT, function unknown. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif consensus that generally matches the extended SDRs, TGXXGXXG, but lacks the characteristic active site residues of the SDRs. This subgroup has basic residues (HXXXR) in place of the active site motif YXXXK, these may have a catalytic role. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 293
24791 187557 cd05246 dTDP_GD_SDR_e dTDP-D-glucose 4,6-dehydratase, extended (e) SDRs. This subgroup contains dTDP-D-glucose 4,6-dehydratase and related proteins, members of the extended-SDR family, with the characteristic Rossmann fold core region, active site tetrad and NAD(P)-binding motif. dTDP-D-glucose 4,6-dehydratase is closely related to other sugar epimerases of the SDR family. dTDP-D-glucose 4,6-dehydratase catalyzes the second of four steps in the dTDP-L-rhamnose pathway (the dehydration of dTDP-D-glucose to dTDP-4-keto-6-deoxy-D-glucose) in the synthesis of L-rhamnose, a cell wall component of some pathogenic bacteria. In many gram negative bacteria, L-rhamnose is an important constituent of lipopolysaccharide O-antigen. The larger N-terminal portion of dTDP-D-glucose 4,6-dehydratase forms a Rossmann fold NAD-binding domain, while the C-terminus binds the sugar substrate. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 315
24792 187558 cd05247 UDP_G4E_1_SDR_e UDP-glucose 4 epimerase, subgroup 1, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 323
24793 187559 cd05248 ADP_GME_SDR_e ADP-L-glycero-D-mannoheptose 6-epimerase (GME), extended (e) SDRs. This subgroup contains ADP-L-glycero-D-mannoheptose 6-epimerase, an extended SDR, which catalyzes the NAD-dependent interconversion of ADP-D-glycero-D-mannoheptose and ADP-L-glycero-D-mannoheptose. This subgroup has the canonical active site tetrad and NAD(P)-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 317
24794 187560 cd05250 CC3_like_SDR_a CC3(TIP30)-like, atypical (a) SDRs. Atypical SDRs in this subgroup include CC3 (also known as TIP30) which is implicated in tumor suppression. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine rich NAD(P)-binding motif that resembles the extended SDRs, and have an active site triad of the SDRs (YXXXK and upstream Ser), although the upstream Asn of the usual SDR active site is substituted with Asp. For CC3, the Tyr of the triad is displaced compared to the usual SDRs and the protein is monomeric, both these observations suggest that the usual SDR catalytic activity is not present. NADP appears to serve an important role as a ligand, and may be important in the interaction with other macromolecules. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 214
24795 187561 cd05251 NmrA_like_SDR_a NmrA (a transcriptional regulator) and HSCARG (an NADPH sensor) like proteins, atypical (a) SDRs. NmrA and HSCARG like proteins. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Atypical SDRs are distinct from classical SDRs. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 242
24796 187562 cd05252 CDP_GD_SDR_e CDP-D-glucose 4,6-dehydratase, extended (e) SDRs. This subgroup contains CDP-D-glucose 4,6-dehydratase, an extended SDR, which catalyzes the conversion of CDP-D-glucose to CDP-4-keto-6-deoxy-D-glucose. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 336
24797 187563 cd05253 UDP_GE_SDE_e UDP glucuronic acid epimerase, extended (e) SDRs. This subgroup contains UDP-D-glucuronic acid 4-epimerase, an extended SDR, which catalyzes the conversion of UDP-alpha-D-glucuronic acid to UDP-alpha-D-galacturonic acid. This group has the SDR's canonical catalytic tetrad and the TGxxGxxG NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 332
24798 187564 cd05254 dTDP_HR_like_SDR_e dTDP-6-deoxy-L-lyxo-4-hexulose reductase and related proteins, extended (e) SDRs. dTDP-6-deoxy-L-lyxo-4-hexulose reductase, an extended SDR, synthesizes dTDP-L-rhamnose from alpha-D-glucose-1-phosphate, providing the precursor of L-rhamnose, an essential cell wall component of many pathogenic bacteria. This subgroup has the characteristic active site tetrad and NADP-binding motif. This subgroup also contains human MAT2B, the regulatory subunit of methionine adenosyltransferase (MAT); MAT catalyzes S-adenosylmethionine synthesis. The human gene encoding MAT2B encodes two major splicing variants which are induced in human cell liver cancer and regulate HuR, an mRNA-binding protein which stabilizes the mRNA of several cyclins, to affect cell proliferation. Both MAT2B variants include this extended SDR domain. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 280
24799 187565 cd05255 SQD1_like_SDR_e UDP_sulfoquinovose_synthase (Arabidopsis thaliana SQD1 and related proteins), extended (e) SDRs. Arabidopsis thaliana UDP-sulfoquinovose-synthase (SQD1), an extended SDR, catalyzes the transfer of SO(3)(-) to UDP-glucose in the biosynthesis of plant sulfolipids. Members of this subgroup share the conserved SDR catalytic residues, and a partial match to the characteristic extended-SDR NAD-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 382
24800 187566 cd05256 UDP_AE_SDR_e UDP-N-acetylglucosamine 4-epimerase, extended (e) SDRs. This subgroup contains UDP-N-acetylglucosamine 4-epimerase of Pseudomonas aeruginosa, WbpP, an extended SDR, that catalyzes the NAD+ dependent conversion of UDP-GlcNAc and UDPGalNA to UDP-Glc and UDP-Gal. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 304
24801 187567 cd05257 Arna_like_SDR_e Arna decarboxylase_like, extended (e) SDRs. Decarboxylase domain of ArnA. ArnA is an enzyme involved in the modification of outer membrane protein lipid A of gram-negative bacteria. It is a bifunctional enzyme that catalyzes the NAD-dependent decarboxylation of UDP-glucuronic acid and N-10-formyltetrahydrofolate-dependent formylation of UDP-4-amino-4-deoxy-l-arabinose; its NAD-dependent decarboxylating activity is in the C-terminal 360 residues. This subgroup belongs to the extended SDR family, however the NAD binding motif is not a perfect match and the upstream Asn of the canonical active site tetrad is not conserved. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 316
24802 187568 cd05258 CDP_TE_SDR_e CDP-tyvelose 2-epimerase, extended (e) SDRs. CDP-tyvelose 2-epimerase is a tetrameric SDR that catalyzes the conversion of CDP-D-paratose to CDP-D-tyvelose, the last step in tyvelose biosynthesis. This subgroup is a member of the extended SDR subfamily, with a characteristic active site tetrad and NAD-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 337
24803 187569 cd05259 PCBER_SDR_a phenylcoumaran benzylic ether reductase (PCBER) like, atypical (a) SDRs. PCBER and pinoresinol-lariciresinol reductases are NADPH-dependent aromatic alcohol reductases, and are atypical members of the SDR family. Other proteins in this subgroup are identified as eugenol synthase. These proteins contain an N-terminus characteristic of NAD(P)-binding proteins and a small C-terminal domain presumed to be involved in substrate binding, but they do not have the conserved active site Tyr residue typically found in SDRs. Numerous other members have unknown functions. The glycine rich NADP-binding motif in this subgroup is of 2 forms: GXGXXG and G[GA]XGXXG; it tends to be atypical compared with the forms generally seen in classical or extended SDRs. The usual SDR active site tetrad is not present, but a critical active site Lys at the usual SDR position has been identified in various members, though other charged and polar residues are found at this position in this subgroup. Atypical SDR-related proteins retain the Rossmann fold of the SDRs, but have limited sequence identity and generally lack the catalytic properties of the archetypical members. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 282
24804 187570 cd05260 GDP_MD_SDR_e GDP-mannose 4,6 dehydratase, extended (e) SDRs. GDP-mannose 4,6 dehydratase, a homodimeric SDR, catalyzes the NADP(H)-dependent conversion of GDP-(D)-mannose to GDP-4-keto, 6-deoxy-(D)-mannose in the fucose biosynthesis pathway. These proteins have the canonical active site triad and NAD-binding pattern, however the active site Asn is often missing and may be substituted with Asp. A Glu residue has been identified as an important active site base. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 316
24805 187571 cd05261 CAPF_like_SDR_e capsular polysaccharide assembling protein (CAPF) like, extended (e) SDRs. This subgroup of extended SDRs, includes some members which have been identified as capsular polysaccharide assembling proteins, such as Staphylococcus aureus Cap5F which is involved in the biosynthesis of N-acetyl-l-fucosamine, a constituent of surface polysaccharide structures of S. aureus. This subgroup has the characteristic active site tetrad and NAD-binding motif of extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 248
24806 187572 cd05262 SDR_a7 atypical (a) SDRs, subgroup 7. This subgroup contains atypical SDRs of unknown function. Members of this subgroup have a glycine-rich NAD(P)-binding motif consensus that matches the extended SDRs, TGXXGXXG, but lacks the characteristic active site residues of the SDRs. This subgroup has basic residues (HXXXR) in place of the active site motif YXXXK, these may have a catalytic role. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 291
24807 187573 cd05263 MupV_like_SDR_e Pseudomonas fluorescens MupV-like, extended (e) SDRs. This subgroup of extended SDR family domains have the characteristic active site tetrad and a well-conserved NAD(P)-binding motif. This subgroup is not well characterized, its members are annotated as having a variety of putative functions. One characterized member is Pseudomonas fluorescens MupV a protein involved in the biosynthesis of Mupirocin, a polyketide-derived antibiotic. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 293
24808 187574 cd05264 UDP_G4E_5_SDR_e UDP-glucose 4-epimerase (G4E), subgroup 5, extended (e) SDRs. This subgroup partially conserves the characteristic active site tetrad and NAD-binding motif of the extended SDRs, and has been identified as possible UDP-glucose 4-epimerase (aka UDP-galactose 4-epimerase), a homodimeric member of the extended SDR family. UDP-glucose 4-epimerase catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 300
24809 187575 cd05265 SDR_a1 atypical (a) SDRs, subgroup 1. Atypical SDRs in this subgroup are poorly defined and have been identified putatively as isoflavone reductase, sugar dehydratase, mRNA binding protein, etc. Atypical SDRs are distinct from classical SDRs. Members of this subgroup retain the canonical active site triad (though not the upstream Asn found in most SDRs) but have an unusual putative glycine-rich NAD(P)-binding motif, GGXXXXG, in the usual location. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 250
24810 187576 cd05266 SDR_a4 atypical (a) SDRs, subgroup 4. Atypical SDRs in this subgroup are poorly defined, one member is identified as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif that is related to, but is different from, the archetypical SDRs, GXGXXG. This subgroup also lacks most of the characteristic active site residues of the SDRs; however, the upstream Ser is present at the usual place, and some potential catalytic residues are present in place of the usual YXXXK active site motif. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 251
24811 187577 cd05267 SDR_a6 atypical (a) SDRs, subgroup 6. These atypical SDR family members of unknown function have only a partial match to a prototypical glycine-rich NAD(P)-binding motif consensus, GXXG, which conserves part of the motif of extended SDR. Furthermore, they lack the characteristic active site residues of the SDRs. This subgroup is related to phenylcoumaran benzylic ether reductase, an NADPH-dependent aromatic alcohol reductase. One member is identified as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 203
24812 187578 cd05269 TMR_SDR_a triphenylmethane reductase (TMR)-like proteins, NMRa-like, atypical (a) SDRs. TMR is an atypical NADP-binding protein of the SDR family. It lacks the active site residues of the SDRs but has a glycine rich NAD(P)-binding motif that matches the extended SDRs. Proteins in this subgroup, however, are more similar in length to the classical SDRs. TMR was identified as a reducer of triphenylmethane dyes, important environmental pollutants. This subgroup also includes Escherichia coli NADPH-dependent quinone oxidoreductase (QOR2), which catalyzes two-electron reduction of quinone but is unlikely to play a major role in protecting against quinone cytotoxicity. Atypical SDRs are distinct from classical SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 272
24813 187579 cd05271 NDUFA9_like_SDR_a NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, subunit 9, 39 kDa, (NDUFA9) -like, atypical (a) SDRs. This subgroup of extended SDR-like proteins are atypical SDRs. They have a glycine-rich NAD(P)-binding motif similar to the typical SDRs, GXXGXXG, and have the YXXXK active site motif (though not the other residues of the SDR tetrad). Members identified include NDUFA9 (mitochondrial) and putative nucleoside-diphosphate-sugar epimerase. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 273
24814 187580 cd05272 TDH_SDR_e L-threonine dehydrogenase, extended (e) SDRs. This subgroup contains members identified as L-threonine dehydrogenase (TDH). TDH catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. This group is distinct from TDHs that are members of the medium chain dehydrogenase/reductase family. This group has the NAD-binding motif and active site tetrad of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 308
24815 187581 cd05273 GME-like_SDR_e Arabidopsis thaliana GDP-mannose-3',5'-epimerase (GME)-like, extended (e) SDRs. This subgroup of NDP-sugar epimerase/dehydratases are extended SDRs; they have the characteristic active site tetrad, and an NAD-binding motif: TGXXGXX[AG], which is a close match to the canonical NAD-binding motif. Members include Arabidopsis thaliana GDP-mannose-3',5'-epimerase (GME) which catalyzes the epimerization of two positions of GDP-alpha-D-mannose to form GDP-beta-L-galactose. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 328
24816 187582 cd05274 KR_FAS_SDR_x ketoreductase (KR) and fatty acid synthase (FAS), complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. In some instances, such as porcine FAS, an enoyl reductase (ER) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial, type II systems, use single function proteins. Fungal fatty acid synthase uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-KR, forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-ER. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. 
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues.
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 375
24817 176180 cd05276 p53_inducible_oxidoreductase PIG3 p53-inducible quinone oxidoreductase. PIG3 p53-inducible quinone oxidoreductase, a medium chain dehydrogenase/reductase family member, acts in the apoptotic pathway. PIG3 reduces ortho-quinones, but its apoptotic activity has been attributed to oxidative stress generation, since overexpression of PIG3 accumulates reactive oxygen species. PIG3 resembles the MDR family member quinone reductases, which catalyze the reduction of quinone to hydroxyquinone. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.
In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 323
24818 176181 cd05278 FDH_like Formaldehyde dehydrogenases. Formaldehyde dehydrogenase (FDH) is a member of the zinc-dependent/medium chain alcohol dehydrogenase family. Formaldehyde dehydrogenase (aka ADH3) may be the ancestral form of alcohol dehydrogenase, which evolved to detoxify formaldehyde. This CD contains glutathione-dependent FDH, glutathione-independent FDH, and related alcohol dehydrogenases. FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. Unlike typical FDH, Pseudomonas putida aldehyde-dismutating FDH (PFDH) is glutathione-independent. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 347
24819 176182 cd05279 Zn_ADH1 Liver alcohol dehydrogenase and related zinc-dependent alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates. For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES.
These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 365
24820 176183 cd05280 MDR_yhdh_yhfp Yhdh and yhfp-like putative quinone oxidoreductases. Yhdh and yhfp-like putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft.
Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 325
24821 176184 cd05281 TDH Threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. TDH is a member of the zinc-requiring, medium chain NAD(H)-dependent alcohol dehydrogenase family (MDR). MDRs have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria) and have 2 tightly bound zinc atoms per subunit. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. 341
24822 176645 cd05282 ETR_like 2-enoyl thioester reductase-like. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 323
24823 176186 cd05283 CAD1 Cinnamyl alcohol dehydrogenases (CAD). Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 337
24824 176187 cd05284 arabinose_DH_like D-arabinose dehydrogenase. This group contains arabinose dehydrogenase (AraDH) and related alcohol dehydrogenases. AraDH is a member of the medium chain dehydrogenase/reductase family and catalyzes the NAD(P)-dependent oxidation of D-arabinose and other pentoses, the initial step in the metabolism of D-arabinose into 2-oxoglutarate. Like the alcohol dehydrogenases, AraDH binds a zinc in the catalytic cleft as well as a distal structural zinc. AraDH forms homotetramers as a dimer of dimers. AraDH replaces a conserved catalytic His with Arg, compared to the canonical ADH site. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.
In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 340
24825 176188 cd05285 sorbitol_DH Sorbitol dehydrogenase. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit. Aldose reductase catalyzes the NADP(H)-dependent conversion of glucose to sorbitol, and SDH uses NAD(H) in the conversion of sorbitol to fructose. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 343
24826 176189 cd05286 QOR2 Quinone oxidoreductase (QOR). Quinone oxidoreductase (QOR) and 2-haloacrylate reductase. QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. 2-haloacrylate reductase, a member of this subgroup, catalyzes the NADPH-dependent reduction of a carbon-carbon double bond in organohalogen compounds. Although similar to QOR, Burkholderia 2-haloacrylate reductase does not act on the quinones 1,4-benzoquinone and 1,4-naphthoquinone. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES.
These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 320
24827 176190 cd05288 PGDH Prostaglandin dehydrogenases. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group of the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14,-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 329
24828 176191 cd05289 MDR_like_2 alcohol dehydrogenase and quinone reductase-like medium chain dehydrogenases/reductases. Members identified as zinc-dependent alcohol dehydrogenases and quinone oxidoreductase. QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.
NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 309
24829 133426 cd05290 LDH_3 A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed of some bacterial LDHs from firmicutes, gammaproteobacteria, and actinobacteria. Vertebrate LDHs are non-allosteric, but some bacterial LDHs are activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenase, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 307
24830 133427 cd05291 HicDH_like L-2-hydroxyisocapronate dehydrogenases and some bacterial L-lactate dehydrogenases. L-2-hydroxyisocapronate dehydrogenase (HicDH) catalyzes the conversion of a variety of 2-oxo carboxylic acids with medium-sized aliphatic or aromatic side chains. This subfamily is composed of HicDHs and some bacterial L-lactate dehydrogenases (LDH). LDHs catalyze the last step of glycolysis in which pyruvate is converted to L-lactate. Bacterial LDHs can be non-allosteric or may be activated by an allosteric effector such as fructose-1,6-bisphosphate. Members of this subfamily with known structures such as the HicDH of Lactobacillus confusus, the non-allosteric LDH of Lactobacillus pentosus, and the allosteric LDH of Bacillus stearothermophilus, show that they exist as homotetramers. The HicDH-like subfamily is part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 306
24831 133428 cd05292 LDH_2 A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed predominantly of bacterial LDHs and a few fungal LDHs. Bacterial LDHs may be non-allosteric or may be activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 308
24832 133429 cd05293 LDH_1 A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed of eukaryotic LDHs. Vertebrate LDHs are non-allosteric. This is in contrast to some bacterial LDHs that are activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 312
24833 133430 cd05294 LDH-like_MDH_nadp A lactate dehydrogenases-like structure with malate dehydrogenase enzymatic activity. The LDH-like MDH proteins have a lactate dehydrogenase-like (LDH-like) structure and malate dehydrogenase (MDH) enzymatic activity. This subgroup is composed of some archaeal LDH-like MDHs that prefer NADP(H) rather than NAD(H) as a cofactor. One member, MJ0490 from Methanococcus jannaschii, has been observed to form dimers and tetramers during crystallization, although it is believed to exist primarily as a tetramer in solution. In addition to its MDH activity, MJ0490 also possesses fructose-1,6-bisphosphate-activated LDH activity. Members of this subgroup have a higher sequence similarity to LDHs than to other MDHs. LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenase, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 309
24834 133431 cd05295 MDH_like Malate dehydrogenase-like. These MDH-like proteins are related to other groups in the MDH family but do not have conserved substrate and cofactor binding residues. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subgroup are uncharacterized MDH-like proteins from animals. They are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 452
24835 133432 cd05296 GH4_P_beta_glucosidase Glycoside Hydrolases Family 4; Phospho-beta-glucosidase. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). After translocation, these phospho-disaccharides may be hydrolyzed by the GH4 glycoside hydrolases such as the phospho-beta-glucosidases. Other organisms (such as archaea and Thermotoga maritima) lack the PEP-PTS system, but have several enzymes normally associated with the PEP-PTS operon. The 6-phospho-beta-glucosidase from Thermotoga maritima hydrolyzes cellobiose 6-phosphate (6P) into glucose-6P and glucose, in an NAD+ and Mn2+ dependent fashion. The Escherichia coli 6-phospho-beta-glucosidase (also called celF) hydrolyzes a variety of phospho-beta-glucosides including cellobiose-6P, salicin-6P, arbutin-6P, and gentobiose-6P. Phospho-beta-glucosidases are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 419
24836 133433 cd05297 GH4_alpha_glucosidase_galactosidase Glycoside Hydrolases Family 4; Alpha-glucosidases and alpha-galactosidases. linked to 3D-structure 423
24837 133434 cd05298 GH4_GlvA_pagL_like Glycoside Hydrolases Family 4; GlvA- and pagL-like glycosidases. Bacillus subtilis GlvA and Clostridium acetobutylicum pagL are 6-phospho-alpha-glucosidase, catalyzing the hydrolysis of alpha-glucopyranoside bonds to release glucose from oligosaccharides. The substrate specificities of other members of this subgroup are unknown. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP_PTS). After translocation, these phospho-disaccharides may be hydrolyzed by the GH4 glycoside hydrolases, which include 6-phospho-beta-glucosidases, 6-phospho-alpha-glucosidases, alpha-glucosidases/alpha-glucuronidases (only from Thermotoga), and alpha-galactosidases. Members of this subfamily are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others. 437
24838 240624 cd05299 CtBP_dh C-terminal binding protein (CtBP), D-isomer-specific 2-hydroxyacid dehydrogenases related repressor. The transcriptional corepressor CtBP is a dehydrogenase with sequence and structural similarity to the d2-hydroxyacid dehydrogenase family. CtBP was initially identified as a protein that bound the PXDLS sequence at the adenovirus E1A C terminus, causing the loss of CR-1-mediated transactivation. CtBP binds NAD(H) within a deep cleft, undergoes a conformational change upon NAD binding, and has NAD-dependent dehydrogenase activity. 312
24839 240625 cd05300 2-Hacid_dh_1 Putative D-isomer specific 2-hydroxyacid dehydrogenase. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of the hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production and in the stress responses of plants. 313
24840 240626 cd05301 GDH D-glycerate dehydrogenase/hydroxypyruvate reductase (GDH). D-glycerate dehydrogenase (GDH, also known as hydroxypyruvate reductase, HPR) catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. In humans, HPR deficiency causes primary hyperoxaluria type 2, characterized by over-excretion of L-glycerate and oxalate in the urine, possibly due to an imbalance in competition with L-lactate dehydrogenase, another formate dehydrogenase (FDH)-like enzyme. GDH, like FDH and other members of the D-specific hydroxyacid dehydrogenase family that also includes L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase, typically has a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form, despite often low sequence identity. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 309
24841 240627 cd05302 FDH NAD-dependent Formate Dehydrogenase (FDH). NAD-dependent formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of a formate anion to carbon dioxide coupled with the reduction of NAD+ to NADH. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxy acid dehydrogenase family have 2 highly similar subdomains of the alpha/beta form, with NAD binding occurring in the cleft between subdomains. NAD contacts are primarily to the Rossmann-fold NAD-binding domain which is inserted within the linear sequence of the more diverse flavodoxin-like catalytic subdomain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of the hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production from C1 compounds such as methanol, and in the stress responses of plants. NAD-dependent FDH is useful in cofactor regeneration in asymmetrical biocatalytic reduction processes, where FDH irreversibly oxidizes formate to carbon dioxide, while reducing the oxidized form of the cofactor to the reduced form. 348
24842 240628 cd05303 PGDH_2 Phosphoglycerate dehydrogenase (PGDH) NAD-binding and catalytic domains. Phosphoglycerate dehydrogenase (PGDH) catalyzes the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDH comes in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. 301
24843 240629 cd05304 Rubrum_tdh Rubrum transdehydrogenase NAD-binding and catalytic domains. Transhydrogenases found in bacterial and inner mitochondrial membranes link NAD(P)(H)-dependent redox reactions to proton translocation. The energy of the proton electrochemical gradient (delta-p), generated by the respiratory electron transport chain, is consumed by transhydrogenase in NAD(P)+ reduction. Transhydrogenase is likely involved in the regulation of the citric acid cycle. Rubrum transhydrogenase has 3 components, dI, dII, and dIII. dII spans the membrane while dI and dIII protrude on the cytoplasmic/matrix side. DI contains 2 domains in Rossmann-like folds, linked by a long alpha helix, and contains a NAD binding site. Two dI polypeptides (represented in this sub-family) spontaneously form a heterotrimer with dIII in the absence of dII. In the heterotrimer, both dI chains may bind NAD, but only one is well-ordered. dIII also binds a well-ordered NADP, but in a different orientation than a classical Rossmann domain. 363
24844 240630 cd05305 L-AlaDH Alanine dehydrogenase NAD-binding and catalytic domains. Alanine dehydrogenase (L-AlaDH) catalyzes the NAD-dependent conversion of pyruvate to L-alanine via reductive amination. Like formate dehydrogenase and related enzymes, L-AlaDH is comprised of 2 domains connected by a long alpha helical stretch, each resembling a Rossmann fold NAD-binding domain. The NAD-binding domain is inserted within the linear sequence of the more divergent catalytic domain. Ligand binding and active site residues are found in the cleft between the subdomains. L-AlaDH is typically hexameric and is critical in carbon and nitrogen metabolism in micro-organisms. 359
24845 133453 cd05311 NAD_bind_2_malic_enz NAD(P) binding domain of malic enzyme (ME), subgroup 2. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+. ME has been found in all organisms, and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2. This subfamily consists primarily of archaeal and bacterial ME. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 226
24846 133454 cd05312 NAD_bind_1_malic_enz NAD(P) binding domain of malic enzyme (ME), subgroup 1. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+. ME has been found in all organisms, and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2. This subfamily consists of eukaryotic and bacterial ME. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 279
24847 133455 cd05313 NAD_bind_2_Glu_DH NAD(P) binding domain of glutamate dehydrogenase, subgroup 2. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. Glutamate DH is a multidomain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts. 254
24848 187583 cd05322 SDH_SDR_c_like Sorbitol 6-phosphate dehydrogenase (SDH), classical (c) SDRs. Sorbitol 6-phosphate dehydrogenase (SDH, aka glucitol 6-phosphate dehydrogenase) catalyzes the NAD-dependent interconversion of D-fructose 6-phosphate to D-sorbitol 6-phosphate. SDH is a member of the classical SDRs, with the characteristic catalytic tetrad, but without a complete match to the typical NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 257
24849 187584 cd05323 ADH_SDR_c_like insect type alcohol dehydrogenase (ADH)-like, classical (c) SDRs. This subgroup contains insect type ADH, and 15-hydroxyprostaglandin dehydrogenase (15-PGDH) type I; these proteins are classical SDRs. ADH catalyzes the NAD+-dependent oxidation of alcohols to aldehydes/ketones. This subgroup is distinct from the zinc-dependent alcohol dehydrogenases of the medium chain dehydrogenase/reductase family, and evolved in fruit flies to allow the digestion of fermenting fruit. 15-PGDH catalyzes the NAD-dependent interconversion of (5Z,13E)-(15S)-11alpha,15-dihydroxy-9-oxoprost-13-enoate and (5Z,13E)-11alpha-hydroxy-9,15-dioxoprost-13-enoate, and has a typical SDR glycine-rich NAD-binding motif, which is not fully present in ADH. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 244
24850 187585 cd05324 carb_red_PTCR-like_SDR_c Porcine testicular carbonyl reductase (PTCR)-like, classical (c) SDRs. PTCR is a classical SDR which catalyzes the NADPH-dependent reduction of ketones on steroids and prostaglandins. Unlike most SDRs, PTCR functions as a monomer. This subgroup also includes human carbonyl reductase 1 (CBR1) and CBR3. CBR1 is an NADPH-dependent SDR with broad substrate specificity and may be responsible for the in vivo reduction of quinones, prostaglandins, and other carbonyl-containing compounds. In addition it includes poppy NADPH-dependent salutaridine reductase which catalyzes the stereospecific reduction of salutaridine to 7(S)-salutaridinol in the biosynthesis of morphine, and Arabidopsis SDR1, a menthone reductase, which catalyzes the reduction of menthone to neomenthol, a compound with antimicrobial activity; SDR1 can also carry out neomenthol oxidation. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering).
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 225
24851 187586 cd05325 carb_red_sniffer_like_SDR_c carbonyl reductase sniffer-like, classical (c) SDRs. Sniffer is an NADPH-dependent carbonyl reductase of the classical SDR family. Studies in Drosophila melanogaster implicate Sniffer in the prevention of neurodegeneration due to aging and oxidative stress. This subgroup also includes Rhodococcus sp. AD45 IsoH, which is an NAD-dependent 1-hydroxy-2-glutathionyl-2-methyl-3-butene dehydrogenase involved in isoprene metabolism, Aspergillus nidulans StcE encoded by a gene which is part of a proposed sterigmatocystin biosynthesis gene cluster, Bacillus circulans SANK 72073 BtrF encoded by a gene found in the butirosin biosynthesis gene cluster, and Aspergillus parasiticus nor-1 involved in the biosynthesis of aflatoxins. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering).
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 233
24852 187587 cd05326 secoisolariciresinol-DH_like_SDR_c secoisolariciresinol dehydrogenase (secoisolariciresinol-DH)-like, classical (c) SDRs. Podophyllum secoisolariciresinol-DH is a homotetrameric, classical SDR that catalyzes the NAD-dependent conversion of (-)-secoisolariciresinol to (-)-matairesinol via a (-)-lactol intermediate. (-)-Matairesinol is an intermediate to various 8'-lignans, including the cancer-preventive mammalian lignan, and those involved in vascular plant defense. This subgroup also includes rice momilactone A synthase, which catalyzes the conversion of 3beta-hydroxy-9betaH-pimara-7,15-dien-19,6beta-olide into momilactone A; Arabidopsis ABA2, which catalyzes the conversion of xanthoxin to abscisic aldehyde during abscisic acid (ABA) biosynthesis; and maize Tasselseed2, which participates in the maize sex determination pathway. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering).
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 249
24853 212492 cd05327 retinol-DH_like_SDR_c_like retinol dehydrogenase (retinol-DH), Light dependent Protochlorophyllide (Pchlide) OxidoReductase (LPOR) and related proteins, classical (c) SDRs. Classical SDR subgroup containing retinol-DHs, LPORs, and related proteins. Retinol is processed by a medium chain alcohol dehydrogenase followed by retinol-DHs. Pchlide reductases act in chlorophyll biosynthesis. There are distinct enzymes that catalyze Pchlide reduction in light or dark conditions. Light-dependent reduction is via an NADP-dependent SDR, LPOR. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. This subgroup includes the human proteins: retinol dehydrogenase -12, -13, and -14, dehydrogenase/reductase SDR family member (DHRS)-12, -13, and -X (a DHRS on chromosome X), and WWOX (WW domain-containing oxidoreductase), as well as a Neurospora crassa SDR encoded by the blue light inducible bli-4 gene. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing.
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 269
24854 187589 cd05328 3alpha_HSD_SDR_c alpha hydroxysteroid dehydrogenase (3alpha_HSD), classical (c) SDRs. Bacterial 3-alpha_HSD, which catalyzes the NAD-dependent oxidoreduction of hydroxysteroids, is a dimeric member of the classical SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
24855 187590 cd05329 TR_SDR_c tropinone reductase-I and II (TR-1, and TR-II)-like, classical (c) SDRs. This subgroup includes TR-I and TR-II; these proteins are members of the SDR family. TRs catalyze the NADPH-dependent reductions of the 3-carbonyl group of tropinone, to a beta-hydroxyl group. TR-I and TR-II produce different stereoisomers from tropinone, TR-I produces tropine (3alpha-hydroxytropane), and TR-II, produces pseudotropine (sigma-tropine, 3beta-hydroxytropane). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 251
24856 187591 cd05330 cyclohexanol_reductase_SDR_c cyclohexanol reductases, including levodione reductase, classical (c) SDRs. Cyclohexanol reductases, including (6R)-2,2,6-trimethyl-1,4-cyclohexanedione (levodione) reductase of Corynebacterium aquaticum, catalyze the reversible oxidoreduction of hydroxycyclohexanone derivatives. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 257
24857 187592 cd05331 DH-DHB-DH_SDR_c 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenases, classical (c) SDRs. 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase shares the characteristics of the classical SDRs. This subgroup includes Escherichia coli EntA which catalyzes the NAD+-dependent oxidation of 2,3-dihydro-2,3-dihydroxybenzoate to 2,3-dihydroxybenzoate during biosynthesis of the siderophore Enterobactin. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 244
24858 187593 cd05332 11beta-HSD1_like_SDR_c 11beta-hydroxysteroid dehydrogenase type 1 (11beta-HSD1)-like, classical (c) SDRs. Human 11beta_HSD1 catalyzes the NADP(H)-dependent interconversion of cortisone and cortisol. This subgroup also includes human dehydrogenase/reductase SDR family member 7C (DHRS7C) and DHRS7B. These proteins have the GxxxGxG nucleotide binding motif and S-Y-K catalytic triad characteristic of the SDRs, but have an atypical C-terminal domain that contributes to homodimerization contacts. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 257
24859 187594 cd05333 BKR_SDR_c beta-Keto acyl carrier protein reductase (BKR), involved in Type II FAS, classical (c) SDRs. This subgroup includes the Escherichia coli K12 BKR, FabG. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet) NAD(P)(H) binding region and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H) binding pattern: TGxxxGxG in classical SDRs. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P) binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P) binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr-151 and Lys-155, as well as Asn-111 (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 240
24860 187595 cd05334 DHPR_SDR_c_like dihydropteridine reductase (DHPR), classical (c) SDRs. Dihydropteridine reductase is an NAD-binding protein related to the SDRs. It converts dihydrobiopterin into tetrahydrobiopterin, a cofactor necessary in catecholamines synthesis. Dihydropteridine reductase has the YXXXK of these tyrosine-dependent oxidoreductases, but lacks the typical upstream Asn and Ser catalytic residues. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 221
24861 187596 cd05337 BKR_1_SDR_c putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 1, classical (c) SDR. This subgroup includes Escherichia coli CFT073 FabG. The Escherichia coli K12 BKR, FabG, belongs to a different subgroup. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet) NAD(P)(H) binding region and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H) binding pattern: TGxxxGxG in classical SDRs. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P) binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P) binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr-151 and Lys-155, as well as Asn-111 (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 255
24862 187597 cd05338 DHRS1_HSDL2-like_SDR_c human dehydrogenase/reductase (SDR family) member 1 (DHRS1) and human hydroxysteroid dehydrogenase-like protein 2 (HSDL2), classical (c) SDRs. This subgroup includes human DHRS1 and human HSDL2 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. DHRS1 mRNA has been detected in many tissues, liver, heart, skeletal muscle, kidney and pancreas; a longer transcript is predominantly expressed in the liver , a shorter one in the heart. HSDL2 may play a part in fatty acid metabolism, as it is found in peroxisomes. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. 
The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 246
24863 187598 cd05339 17beta-HSDXI-like_SDR_c human 17-beta-hydroxysteroid dehydrogenase XI-like, classical (c) SDRs. 17-beta-hydroxysteroid dehydrogenases (17betaHSD) are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. 17betaHSD type XI, a classical SDR, preferentially converts 3alpha-Adiol to androsterone but not numerous other tested steroids. This subgroup of classical SDRs also includes members identified as retinol dehydrogenases, which convert retinol to retinal, a property that overlaps with 17betaHSD activity. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 243
24864 187599 cd05340 Ycik_SDR_c Escherichia coli K-12 YCIK-like, classical (c) SDRs. Escherichia coli K-12 YCIK and related proteins have a canonical classical SDR nucleotide-binding motif and active site tetrad. They are predicted oxoacyl-(acyl carrier protein/ACP) reductases. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 236
24865 187600 cd05341 3beta-17beta-HSD_like_SDR_c 3beta17beta hydroxysteroid dehydrogenase-like, classical (c) SDRs. This subgroup includes members identified as 3beta17beta hydroxysteroid dehydrogenase, 20beta hydroxysteroid dehydrogenase, and R-alcohol dehydrogenase. These proteins exhibit the canonical active site tetrad and glycine rich NAD(P)-binding motif of the classical SDRs. 17beta-dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 247
24866 187601 cd05343 Mgc4172-like_SDR_c human Mgc4172-like, classical (c) SDRs. Human Mgc4172-like proteins, putative SDRs. These proteins are members of the SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
24867 187602 cd05344 BKR_like_SDR_like putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR)-like, SDR. This subgroup resembles the SDR family, but does not have a perfect match to the NAD-binding motif or the catalytic tetrad characteristic of the SDRs. It includes the SDRs, Q9HYA2 from Pseudomonas aeruginosa PAO1 and APE0912 from Aeropyrum pernix K1. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 253
24868 187603 cd05345 BKR_3_SDR_c putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 3, classical (c) SDR. This subgroup includes the putative Brucella melitensis biovar Abortus 2308 BKR, FabG, Mesorhizobium loti MAFF303099 FabG, and other classical SDRs. BKR, a member of the SDR family, catalyzes the NADPH-dependent reduction of acyl carrier protein in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 248
24869 187604 cd05346 SDR_c5 classical (c) SDR, subgroup 5. These proteins are members of the classical SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 249
24870 187605 cd05347 Ga5DH-like_SDR_c gluconate 5-dehydrogenase (Ga5DH)-like, classical (c) SDRs. Ga5DH catalyzes the NADP-dependent conversion of carbon source D-gluconate and 5-keto-D-gluconate. This SDR subgroup has a classical Gly-rich NAD(P)-binding motif and a conserved active site tetrad pattern. However, it has been proposed that Arg104 (Streptococcus suis Ga5DH numbering), as well as an active site Ca2+, play a critical role in catalysis. In addition to Ga5DHs this subgroup contains Erwinia chrysanthemi KduD which is involved in pectin degradation, and is a putative 2,5-diketo-3-deoxygluconate dehydrogenase. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107,15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 248
24871 187606 cd05348 BphB-like_SDR_c cis-biphenyl-2,3-dihydrodiol-2,3-dehydrogenase (BphB)-like, classical (c) SDRs. cis-biphenyl-2,3-dihydrodiol-2,3-dehydrogenase (BphB) is a classical SDR, it is of particular importance for its role in the degradation of biphenyl/polychlorinated biphenyls(PCBs); PCBs are a significant source of environmental contamination. This subgroup also includes Pseudomonas putida F1 cis-biphenyl-1,2-dihydrodiol-1,2-dehydrogenase (aka cis-benzene glycol dehydrogenase, encoded by the bnzE gene), which participates in benzene metabolism. In addition it includes Pseudomonas sp. C18 putative 1,2-dihydroxy-1,2-dihydronaphthalene dehydrogenase (aka dibenzothiophene dihydrodiol dehydrogenase, encoded by the doxE gene) which participates in an upper naphthalene catabolic pathway. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 257
24872 187607 cd05349 BKR_2_SDR_c putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 2, classical (c) SDR. This subgroup includes Rhizobium sp. NGR234 FabG1. The Escherichia coli K12 BKR, FabG, belongs to a different subgroup. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. 
Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 246
24873 187608 cd05350 SDR_c6 classical (c) SDR, subgroup 6. These proteins are members of the classical SDR family, with a canonical active site tetrad and a fairly well conserved typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 239
24874 187609 cd05351 XR_like_SDR_c xylulose reductase-like, classical (c) SDRs. Members of this subgroup include proteins identified as L-xylulose reductase (XR) and carbonyl reductase; they are members of the SDR family. XR catalyzes the NADP-dependent reduction of L-xylulose and other sugars. Tetrameric mouse carbonyl reductase is involved in the metabolism of biogenic and xenobiotic carbonyl compounds. This subgroup also includes tetrameric chicken liver D-erythrulose reductase, which catalyzes the reduction of D-erythrulose to D-threitol. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 244
24875 187610 cd05352 MDH-like_SDR_c mannitol dehydrogenase (MDH)-like, classical (c) SDRs. NADP-mannitol dehydrogenase catalyzes the conversion of fructose to mannitol, an acyclic 6-carbon sugar. MDH is a tetrameric member of the SDR family. This subgroup also includes various other tetrameric SDRs, including Pichia stipitis D-arabinitol dehydrogenase (aka polyol dehydrogenase), Candida albicans Sou1p, a sorbose reductase, and Candida parapsilosis (S)-specific carbonyl reductase (SCR, aka S-specific alcohol dehydrogenase) which catalyzes the enantioselective reduction of 2-hydroxyacetophenone into (S)-1-phenyl-1,2-ethanediol. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 252
24876 187611 cd05353 hydroxyacyl-CoA-like_DH_SDR_c-like (3R)-hydroxyacyl-CoA dehydrogenase-like, classical(c)-like SDRs. Beta oxidation of fatty acids in eukaryotes occurs by a four-reaction cycle, that may take place in mitochondria or in peroxisomes. (3R)-hydroxyacyl-CoA dehydrogenase is part of rat peroxisomal multifunctional MFE-2, it is a member of the NAD-dependent SDRs, but contains an additional small C-terminal domain that completes the active site pocket and participates in dimerization. The atypical, additional C-terminal extension allows for more extensive dimerization contact than other SDRs. MFE-2 catalyzes the second and third reactions of the peroxisomal beta oxidation cycle. Proteins in this subgroup have a typical catalytic triad, but have a His in place of the usual upstream Asn. This subgroup also contains members identified as 17-beta-hydroxysteroid dehydrogenases, including human peroxisomal 17-beta-hydroxysteroid dehydrogenase type 4 (17beta-HSD type 4, aka MFE-2, encoded by HSD17B4 gene) which is involved in fatty acid beta-oxidation and steroid metabolism. This subgroup also includes two SDR domains of the Neurospora crassa and Saccharomyces cerevisiae multifunctional beta-oxidation protein (MFP, aka Fox2). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. 
These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 250
24877 187612 cd05354 SDR_c7 classical (c) SDR, subgroup 7. These proteins are members of the classical SDR family, with a canonical active site triad (and also an active site Asn) and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 235
24878 187613 cd05355 SDR_c1 classical (c) SDR, subgroup 1. These proteins are members of the classical SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 270
24879 187614 cd05356 17beta-HSD1_like_SDR_c 17-beta-hydroxysteroid dehydrogenases (17beta-HSDs) types -1, -3, and -12, -like, classical (c) SDRs. This subgroup includes various 17-beta-hydroxysteroid dehydrogenases and 3-ketoacyl-CoA reductase, these are members of the SDR family, and contain the canonical active site tetrad and glycine-rich NAD-binding motif of the classical SDRs. 3-ketoacyl-CoA reductase (KAR, aka 17beta-HSD type 12, encoded by HSD17B12) acts in fatty acid elongation; 17beta- hydroxysteroid dehydrogenases are isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the SDR family. 17beta-estradiol dehydrogenase (aka 17beta-HSD type 1, encoded by HSD17B1) converts estrone to estradiol. Estradiol is the predominant female sex hormone. 17beta-HSD type 3 (aka testosterone 17-beta-dehydrogenase 3, encoded by HSD17B3) catalyses the reduction of androstenedione to testosterone, it also accepts estrogens as substrates. This subgroup also contains a putative steroid dehydrogenase let-767 from Caenorhabditis elegans, mutation in which results in hypersensitivity to cholesterol limitation. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. 
A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 239
24880 187615 cd05357 PR_SDR_c pteridine reductase (PR), classical (c) SDRs. Pteridine reductases (PRs), members of the SDR family, catalyze the NAD-dependent reduction of folic acid, dihydrofolate and related compounds. In Leishmania, pteridine reductase (PTR1) acts to circumvent the anti-protozoan drugs that attack dihydrofolate reductase activity. Proteins in this subgroup have an N-terminal NAD-binding motif and a YxxxK active site motif, but have an Asp instead of the usual upstream catalytic Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 234
24881 187616 cd05358 GlcDH_SDR_c glucose 1 dehydrogenase (GlcDH), classical (c) SDRs. GlcDH is a tetrameric member of the SDR family; it catalyzes the NAD(P)-dependent oxidation of beta-D-glucose to D-glucono-delta-lactone. GlcDH has a typical NAD-binding site glycine-rich pattern as well as the canonical active site tetrad (YXXXK motif plus upstream Ser and Asn). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 253
24882 187617 cd05359 ChcA_like_SDR_c 1-cyclohexenylcarbonyl_coenzyme A_reductase (ChcA)_like, classical (c) SDRs. This subgroup contains classical SDR proteins, including members identified as 1-cyclohexenylcarbonyl coenzyme A reductase. ChcA of Streptomyces collinus is implicated in the final reduction step of shikimic acid to ansatrienin. ChcA shows sequence similarity to the SDR family of NAD-binding proteins, but it lacks the conserved Tyr of the characteristic catalytic site. This subgroup also contains the NADH-dependent enoyl-[acyl-carrier-protein(ACP)] reductase FabL from Bacillus subtilis. This enzyme participates in bacterial fatty acid synthesis, in type II fatty-acid synthases and catalyzes the last step in each elongation cycle. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. 
The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 242
24883 187618 cd05360 SDR_c3 classical (c) SDR, subgroup 3. These proteins are members of the classical SDR family, with a canonical active site triad (and also active site Asn) and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 233
24884 187619 cd05361 haloalcohol_DH_SDR_c-like haloalcohol dehalogenase, classical (c) SDRs. Dehalogenases cleave carbon-halogen bonds. Haloalcohol dehalogenase show low sequence similarity to short-chain dehydrogenases/reductases (SDRs). Like the SDRs, haloalcohol dehalogenases have a conserved catalytic triad (Ser-Tyr-Lys/Arg), and form a Rossmann fold. However, the normal classical SDR NAD(P)-binding motif (TGXXGXG) and NAD-binding function is replaced with a halide binding site, allowing the enzyme to catalyze a dehalogenation reaction. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 242
24885 187620 cd05362 THN_reductase-like_SDR_c tetrahydroxynaphthalene/trihydroxynaphthalene reductase-like, classical (c) SDRs. 1,3,6,8-tetrahydroxynaphthalene reductase (4HNR) of Magnaporthe grisea and the related 1,3,8-trihydroxynaphthalene reductase (3HNR) are typical members of the SDR family containing the canonical glycine rich NAD(P)-binding site and active site tetrad, and function in fungal melanin biosynthesis. This subgroup also includes an SDR from Norway spruce that may function to protect against both biotic and abiotic stress. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 243
24886 187621 cd05363 SDH_SDR_c Sorbitol dehydrogenase (SDH), classical (c) SDR. This bacterial subgroup includes Rhodobacter sphaeroides SDH, and other SDHs. SDH preferentially interconverts D-sorbitol (D-glucitol) and D-fructose, but also interconverts L-iditol/L-sorbose and galactitol/D-tagatose. SDH is NAD-dependent and is a dimeric member of the SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 254
24887 187622 cd05364 SDR_c11 classical (c) SDR, subgroup 11. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 253
24888 187623 cd05365 7_alpha_HSDH_SDR_c 7 alpha-hydroxysteroid dehydrogenase (7 alpha-HSDH), classical (c) SDRs. This bacterial subgroup contains 7 alpha-HSDHs, including Escherichia coli 7 alpha-HSDH. 7 alpha-HSDH, a member of the SDR family, catalyzes the NAD+ -dependent dehydrogenation of a hydroxyl group at position 7 of the steroid skeleton of bile acids. In humans the two primary bile acids are cholic and chenodeoxycholic acids, these are formed from cholesterol in the liver. Escherichia coli 7 alpha-HSDH dehydroxylates these bile acids in the human intestine. Mammalian 7 alpha-HSDH activity has been found in livers. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 242
24889 187624 cd05366 meso-BDH-like_SDR_c meso-2,3-butanediol dehydrogenase-like, classical (c) SDRs. 2,3-butanediol dehydrogenases (BDHs) catalyze the NAD+ dependent conversion of 2,3-butanediol to acetoin; BDHs are classified into types according to their stereospecificity as to substrates and products. Included in this subgroup are Klebsiella pneumoniae meso-BDH which catalyzes meso-2,3-butanediol to D(-)-acetoin, and Corynebacterium glutamicum L-BDH which catalyzes L(+)-2,3-butanediol to L(+)-acetoin. This subgroup is comprised of classical SDRs with the characteristic catalytic triad and NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity.
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 257
24890 187625 cd05367 SPR-like_SDR_c sepiapterin reductase (SPR)-like, classical (c) SDRs. Human SPR, a member of the SDR family, catalyzes the NADP-dependent reduction of sepiapterin to 7,8-dihydrobiopterin (BH2). In addition to SPRs, this subgroup also contains Bacillus cereus yueD, a benzil reductase, which catalyzes the stereospecific reduction of benzil to (S)-benzoin. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif.
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 241
24891 187626 cd05368 DHRS6_like_SDR_c human DHRS6-like, classical (c) SDRs. Human DHRS6, and similar proteins. These proteins are classical SDRs, with a canonical active site tetrad and a close match to the typical Gly-rich NAD-binding motif. Human DHRS6 is a cytosolic type 2 (R)-hydroxybutyrate dehydrogenase, which catalyses the conversion of (R)-hydroxybutyrate to acetoacetate. Also included in this subgroup is Escherichia coli UcpA (upstream cys P). Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. Note: removed: needed to make this child smaller when drawing final trees: removed text from description: Other proteins in this subgroup include Thermoplasma acidophilum aldohexose dehydrogenase, which has high dehydrogenase activity against D-mannose, Bacillus subtilis BacC involved in the biosynthesis of the dipeptide bacilysin and its antibiotic moiety anticapsin, Sphingomonas paucimobilis strain B90 LinC, involved in the degradation of hexachlorocyclohexane isomers...... P). 241
24892 187627 cd05369 TER_DECR_SDR_a Trans-2-enoyl-CoA reductase (TER) and 2,4-dienoyl-CoA reductase (DECR), atypical (a) SDR. TER is a peroxisomal protein with a proposed role in fatty acid elongation. Fatty acid synthesis is known to occur in both the endoplasmic reticulum and mitochondria; peroxisomal TER has been proposed as an additional fatty acid elongation system; it reduces the double bond at C-2 as the last step of elongation. This system resembles the mitochondrial system in that acetyl-CoA is used as a carbon donor. TER may also function in phytol metabolism, reducing phytenoyl-CoA to phytanoyl-CoA in peroxisomes. DECR processes double bonds in fatty acids to increase their utility in fatty acid metabolism; it reduces 2,4-dienoyl-CoA to an enoyl-CoA. DECR is active in mitochondria and peroxisomes. This subgroup has the Gly-rich NAD-binding motif of the classical SDR family, but does not display strong identity to the canonical active site tetrad, and lacks the characteristic Tyr at the usual position. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern.
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 249
24893 187628 cd05370 SDR_c2 classical (c) SDR, subgroup 2. Short-chain dehydrogenases/reductases (SDRs, aka Tyrosine-dependent oxidoreductases) are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 228
24894 187629 cd05371 HSD10-like_SDR_c 17-beta-hydroxysteroid dehydrogenase type 10 (HSD10)-like, classical (c) SDRs. HSD10, also known as amyloid-peptide-binding alcohol dehydrogenase (ABAD), was previously identified as a L-3-hydroxyacyl-CoA dehydrogenase, HADH2. In fatty acid metabolism, HADH2 catalyzes the third step of beta-oxidation, the conversion of a hydroxyl to a keto group in the NAD-dependent oxidation of L-3-hydroxyacyl CoA. In addition to alcohol dehydrogenase and HADH2 activities, HSD10 has steroid dehydrogenase activity. Although the mechanism is unclear, HSD10 is implicated in the formation of amyloid beta-peptide in the brain (which is linked to the development of Alzheimer's disease). Although HSD10 is normally concentrated in the mitochondria, in the presence of amyloid beta-peptide it translocates into the plasma membrane, where its action may generate cytotoxic aldehydes and may lower estrogen levels through its use of 17-beta-estradiol as a substrate. HSD10 is a member of the SDR family, but differs from other SDRs by the presence of two insertions of unknown function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing.
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 252
24895 187630 cd05372 ENR_SDR Enoyl acyl carrier protein (ACP) reductase (ENR), divergent SDR. This bacterial subgroup of ENRs includes Escherichia coli ENR. ENR catalyzes the NAD(P)H-dependent reduction of enoyl-ACP in the last step of fatty acid biosynthesis. De novo fatty acid biosynthesis is catalyzed by the fatty acid synthetase complex, through the serial addition of 2-carbon subunits. In bacteria and plants, ENR catalyzes one of six synthetic steps in this process. Oilseed rape ENR, and also apparently the NADH-specific form of Escherichia coli ENR, is tetrameric. Although similar to the classical SDRs, this group does not have the canonical catalytic tetrad, nor does it have the typical Gly-rich NAD-binding pattern. Such so-called divergent SDRs have a GXXXXXSXA NAD-binding motif and a YXXMXXXK (or YXXXMXXXK) active site motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering).
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
24896 187631 cd05373 SDR_c10 classical (c) SDR, subgroup 10. This subgroup resembles the classical SDRs, but has an incomplete match to the canonical glycine-rich NAD-binding motif and lacks the typical active site tetrad (instead of the critical active site Tyr, it has Phe, but contains the nearby Lys). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 238
24897 187632 cd05374 17beta-HSD-like_SDR_c 17beta hydroxysteroid dehydrogenase-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogens and androgens. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 248
24898 349398 cd05379 CAP_bacterial Bacterial CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain proteins. Little is known about bacterial and archaeal members of the CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. Studies of eukaryotic proteins show that CAP domains have several functions, including the binding of cholesterol, lipids and heparan sulfate. This group includes Borrelia burgdorferi outer surface protein BB0689, which does not bind to cholesterol, lipids, or heparan sulfate, and whose function is unknown. 120
24899 349399 cd05380 CAP_euk Eukaryotic CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain proteins. The CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain is found mainly in eukaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), and allergen 5 from vespid venom. 144
24900 349400 cd05381 CAP_PR-1 CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of pathogenesis-related protein 1 (PR-1) family proteins. Members of pathogenesis-related protein 1 (PR-1) family are among the most abundantly produced proteins in plants on pathogen attack. They are considered hallmarks of hypersensitive response/defense pathways and may act as anti-fungal agents or be involved in cell wall loosening. 136
24901 349401 cd05382 CAP_GAPR1-like CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of Golgi-associated plant pathogenesis-related protein 1 and similar proteins. Golgi-associated plant pathogenesis related protein 1 (GAPR1), also called Golgi-associated PR-1 protein or glioma pathogenesis-related protein 2 (GLIPR-2), forms amyloid-like fibrils in the presence of liposomes containing acidic phospholipids. It has been identified in mice as an up-regulated protein in kidney fibrosis, and is involved in epithelial to mesenchymal transition and in generating a pool of myofibroblasts contributing to fibrosis. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 132
24902 349402 cd05383 CAP_CRISP CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory proteins. Cysteine-rich secretory proteins (CRISPs) are two-domain proteins with an evolutionary diverse and structurally conserved N-terminal CAP domain and a C-terminal cysteine-rich domain, which is comprised of a hinge and an ICR (ion channel regulator) region. CRISPs are involved in response to pathogens, fertilization, and sperm maturation. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. CRISP-1 has been shown to mediate gamete fusion by binding to the egg surface. Other members of the CRISP family secreted in the testis (CRISP2), epididymis (CRISP3-4), or during ejaculation (CRISP3), are also involved in sperm-egg interaction, supporting the existence of a functional redundancy and cooperation between homolog proteins ensuring the success of fertilization. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1) and allergen 5 from vespid venom, among others. 139
24903 349403 cd05384 CAP_PRY1-like CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of pathogen-related yeast 1 (PRY1) protein and similar fungal proteins. PRY1, also called pathogenesis-related protein 1, is a yeast protein that is up-regulated in core ESCRT mutants. It is a secreted protein required for efficient export of lipids such as acetylated sterols, and acts in detoxification of hydrophobic compounds. This PRY1-like group also contains fruiting body proteins SC7/14 from Schizophyllum commune. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 129
24904 349404 cd05385 CAP_GLIPR1-like CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of glioma pathogenesis-related protein 1 and similar proteins. Glioma pathogenesis-related protein 1 (GLIPR1) is also called related to testes-specific, vespid, and pathogenesis protein 1 (RTVP-1). The GLIPR1 gene has been identified as a p53 target gene and was shown to be methylated and down-regulated in prostate cancer. It is a novel broad-spectrum tumor suppressor whose proapoptotic properties are exerted in part through ROS-JNK signaling. GLIPR1 is composed of a signal peptide that directs its secretion, a CAP domain, and a transmembrane domain. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 148
24905 349771 cd05386 TraL transfer origin protein TraL. The transfer origin protein TraL is a member of the SIMIBI superfamily which contains an ATP-binding domain. Proteins in this superfamily use the energy from NTP hydrolysis to transfer electrons or ions. The specific function of the TraL protein is unknown. 155
24906 349772 cd05387 BY-kinase bacterial tyrosine-kinase. Bacterial tyrosine (BY)-kinases catalyze the autophosphorylation on a C-terminal tyrosine cluster and also phosphorylate endogenous protein substrates by using ATP as phosphoryl donor. Besides their capacity to function as tyrosine kinase, most of these proteins are also involved in the production and transport of exopolysaccharides. BY-kinases are involved in a number of physiological processes ranging from stress resistance to pathogenicity. 190
24907 349773 cd05388 CobB_N N-terminal domain of cobyrinic acid a,c-diamide synthase. Cobyrinic acid a,c-diamide synthase (CobB, CbiA). Biosynthesis of cobalamin (vitamin B12) requires more than two dozen different enzymes. CobB catalyzes the ATP-dependent amidation of the two carboxylate groups at positions a and c of cobyrinic acid, via the formation of a phosphorylated intermediate, using glutamine or ammonia as the nitrogen source. CobB is comprised of two protein domains: the C-terminal glutaminase domain and the N-terminal ATP-binding domain. The glutaminase domain catalyzes the hydrolysis of glutamine to glutamate and ammonia. It belongs to the triad class of glutamine amidotransferases. This classification is based on the N-terminal domain which catalyzes the ultimate synthesis of the diamide product by using energy from the hydrolysis of ATP and ammonia transferred from the C-terminal domain. 193
24908 349774 cd05389 CobQ_N N-terminal domain of cobyric acid synthase. Cobyric acid synthase (CobQ, CbiP) N-terminal domain. CobQ plays a role in the cobalamin (vitamin B12) biosynthesis pathway. CobQ catalyzes the ATP-dependent amidation of adenosyl-cobyrinic acid a,c-diamide at carboxylates positions b, d, e, and g to produce cobyric acid using glutamine or ammonia as the nitrogen source. The C-terminal glutaminase domain catalyzes the hydrolysis of glutamine to glutamate and ammonia. Ammonia is translocated via an intramolecular tunnel to the N-terminal domain for the synthesis of cobyric acid. 223
24909 349775 cd05390 HypB nickel incorporation protein HypB. HypB is one of numerous accessory proteins required for the maturation of nickel-dependent hydrogenases, like carbon monoxide dehydrogenase or urease. HypB is a GTP-binding protein and has GTP hydrolase activity. It forms a homodimer and is capable of binding two nickel ions and two zinc ions. The active site is located on the dimer interface. Energy from hydrolysis of GTP is used to insert nickel into hydrogenases. 203
24910 213340 cd05391 RasGAP_p120GAP Ras-GTPase Activating Domain of p120. p120GAP is a negative regulator of Ras that stimulates hydrolysis of bound GTP to GDP. Once the Ras regulator p120GAP, a member of the GAP protein family, is recruited to the membrane, it is transiently immobilized to interact with Ras-GTP. The down-regulation of Ras by p120GAP is a critical step in the regulation of many cellular processes, which is disrupted in approximately 30% of human cancers. p120GAP contains SH2, SH3, PH, calcium- and lipid-binding domains, suggesting its involvement in a complex network of cellular interactions in vivo. 328
24911 213341 cd05392 RasGAP_Neurofibromin_like Ras-GTPase Activating Domain of proteins similar to neurofibromin. Neurofibromin-like proteins include the Saccharomyces cerevisiae RasGAP proteins Ira1 and Ira2, the closest homologs of neurofibromin, which is responsible for the human autosomal dominant disease neurofibromatosis type I (NF1). The RasGAP Ira1/2 proteins are negative regulators of the Ras-cAMP signaling pathway and are conserved from yeast to human. In yeast, Ras proteins are activated by GEFs and inhibited by two GAPs, Ira1 and Ira2. Ras proteins activate the cAMP/protein kinase A (PKA) pathway, which controls metabolism, stress resistance, growth, and meiosis. Recent studies showed that the kelch proteins Gpb1 and Gpb2 inhibit Ras activity via association with Ira1 and Ira2. Gpb1/2 bind to a conserved C-terminal domain of Ira1/2, and loss of Gpb1/2 results in a destabilization of Ira1 and Ira2, leading to elevated levels of Ras2-GTP and uninhibited cAMP-PKA signaling. Since the Gpb1/2 binding domain on Ira1/2 is conserved in the human neurofibromin protein, the studies suggest that an analogous signaling mechanism may contribute to the neoplastic development of NF1. 317
24912 213342 cd05394 RasGAP_RASA2 Ras-GTPase Activating Domain of RASA2. RASA2 (or GAP1(m)) is a member of the GAP1 family of Ras GTPase-activating proteins that includes GAP1_IP4BP (or RASA3), CAPRI, and RASAL. In vitro, RASA2 has been shown to bind inositol 1,3,4,5-tetrakisphosphate (IP4), the water soluble inositol head group of the lipid second messenger phosphatidylinositol 3,4,5-trisphosphate (PIP3). In vivo studies also demonstrated that RASA2 binds PIP3, and it is recruited to the plasma membrane following agonist stimulation of PI 3-kinase. Furthermore, the membrane translocation is a consequence of the ability of its pleckstrin homology (PH) domain to bind PIP3. 272
24913 213343 cd05395 RasGAP_RASA4 Ras-GTPase Activating Domain of RASA4. Ras GTPase activating-like 4 protein (RASAL4), also known as Ca2+ -promoted Ras inactivator (CAPRI), is a member of the GAP1 family. Members of the GAP1 family are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. RASAL4, like RASAL, is a cytosolic protein that undergoes a rapid translocation to the plasma membrane in response to a receptor-mediated elevation in the concentration of intracellular free Ca2+ ([Ca2+]i). However, unlike RASAL, RASAL4 does not sense oscillations in [Ca2+]i. 287
24914 188647 cd05396 An_peroxidase_like Animal heme peroxidases and related proteins. A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well. 370
24915 143387 cd05397 NT_Pol-beta-like Nucleotidyltransferase (NT) domain of DNA polymerase beta and similar proteins. This superfamily includes the NT domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of Class I and Class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly(A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. The Escherichia coli CCA-adding enzyme belongs to this superfamily but is not included as this enzyme lacks the N-terminal helix conserved in the remainder of the superfamily. In the majority of the Pol beta-like superfamily NTs, two carboxylates, Dx[D/E], together with a third more distal carboxylate coordinate two divalent metal cations that are essential for catalysis. These divalent metal ions are involved in a two-metal ion mechanism of nucleotide addition. Two of the three catalytic carboxylates are found in Rel-Spo enzymes, with the second carboxylate of the DXD motif missing. Evidence supports a single-cation synthetase mechanism for Rel-Spo enzymes. 49
24916 143388 cd05398 NT_ClassII-CCAase Nucleotidyltransferase (NT) domain of ClassII CCA-adding enzymes. CCA-adding enzymes add the sequence [cytidine(C)-cytidine-adenosine (A)], one nucleotide at a time, onto the 3' end of tRNA, in a template-independent reaction. This Class II group is comprised mainly of eubacterial and eukaryotic enzymes and includes Bacillus stearothermophilus CCAase, Escherichia coli poly(A) polymerase I, human mitochondrial CCAase, and Saccharomyces cerevisiae CCAase (CCA1). CCA-adding enzymes have a single catalytic pocket, which recognizes both ATP and CTP substrates. Included in this subgroup are CC- and A-adding enzymes from various ancient species of bacteria such as Aquifex aeolicus; these enzymes collaborate to add CCA to tRNAs. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are fairly well conserved in this family. Escherichia coli CCAase is related to this group but has not been included in this alignment as this enzyme lacks the N-terminal helix conserved in the remainder of the NT superfamily. 139
24917 143389 cd05399 NT_Rel-Spo_like Nucleotidyltransferase (NT) domain of RelA- and SpoT-like ppGpp synthetases and hydrolases. This family includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1, -2, and -3, appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. Two of the three catalytic carboxylates are found in Rel-Spo enzymes, with the second carboxylate of the DXD motif missing. Evidence supports a single-cation synthetase mechanism. 129
24918 143390 cd05400 NT_2-5OAS_ClassI-CCAase Nucleotidyltransferase (NT) domain of 2'5'-oligoadenylate (2-5A) synthetase (2-5OAS) and class I CCA-adding enzyme. In vertebrates, 2-5OASs are induced by interferon during the innate immune response to protect against RNA virus infections. In the presence of an RNA activator, 2-5OASs catalyze the oligomerization of ATP into 2-5A. 2-5A activates endoribonuclease L, which leads to degradation of the viral RNA. 2-5OASs are also implicated in cell growth control, differentiation, and apoptosis. This family includes human OAS1, -2, -3, and OASL. CCA-adding enzymes add the sequence [cytidine(C)-cytidine-adenosine (A)], one nucleotide at a time, onto the 3' end of tRNA, in a template-independent reaction. This class I group includes the archaeal Sulfolobus shibatae and Archaeoglobus fulgidus CCA-adding enzymes. It belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this family. 143
24919 143391 cd05401 NT_GlnE_GlnD_like Nucleotidyltransferase (NT) domain of Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), and similar proteins. Escherichia coli GlnD and -E participate in the glutamine synthetase (GS)/glutamate synthase (GOGAT) pathway for the assimilation of ammonium nitrogen. In nitrogen sufficiency, GlnE adenylates GS, reducing GS activity; when nitrogen is limiting, GlnE deadenylates GS-AMP, restoring GS activity. When nitrogen is limiting, GlnD uridylylates the nitrogen regulatory protein PII to PII-UMP, and in nitrogen sufficiency, it removes the modifying groups. The activity of Escherichia coli GlnE is modulated by PII-proteins. PII-UMP promotes GlnE deadenylation activity, and PII promotes GlnE adenylation activity. Escherichia coli GlnE has two separate NT domains. The N-terminal NT domain catalyzes the deadenylylation of GS, and the C-terminal NT domain the adenylylation reaction. The majority of proteins in this family contain a C-terminal NT domain which is associated with a cystathionine beta-synthase (CBS) domain pair and a CAP_ED (cAMP receptor protein effector) domain. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. For the majority of proteins in this family, these carboxylate residues are conserved. 172
24920 143392 cd05402 NT_PAP_TUTase Nucleotidyltransferase (NT) domain of poly(A) polymerases and terminal uridylyl transferases. Poly(A) polymerases (PAPs) catalyze mRNA poly(A) tail synthesis, and terminal uridylyl transferases (TUTases) uridylate RNA. PAPs in this subgroup include human PAP alpha, mouse testis-specific cytoplasmic PAP beta, human nuclear PAP gamma, Saccharomyces cerevisiae PAP1, TRF4 and -5, Schizosaccharomyces pombe caffeine-induced death proteins -1 and -14, Caenorhabditis elegans Germ Line Development-2, and Chlamydomonas reinhardtii MUT68. This family also includes human U6 snRNA-specific TUTase1, and Trypanosoma brucei 3'-TUTase-1, -2, and -4. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. For the majority of proteins in this family, these carboxylate residues are conserved. 114
24921 143393 cd05403 NT_KNTase_like Nucleotidyltransferase (NT) domain of Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. S. aureus KNTase is a plasmid encoded enzyme which confers resistance to a wide range of aminoglycoside antibiotics which have a 4'- or 4''-hydroxyl group in the equatorial position, such as kanamycin A. This enzyme transfers a nucleoside monophosphate group from a nucleotide (ATP, GTP, or UTP) to the 4'-hydroxyl group of kanamycin A. This enzyme is a homodimer, having two NT active sites. The nucleotide and antibiotic binding sites of each active site include residues from each monomer. Included in this subgroup is Escherichia coli AadA5 which confers resistance to the antibiotic spectinomycin and is a putative aminoglycoside-3'-adenylyltransferase. It is part of the aadA5 cassette of a class 1 integron. This subgroup also includes Haemophilus influenzae HI0073 which forms a 2:2 heterotetramer with an unrelated protein HI0074. Structurally HI0074 is related to the substrate-binding domain of S. aureus KNTase. The genes encoding HI0073 and HI0074 form an operon. Little is known about the substrate specificity or function of two-component NTs. The characterized members of this subgroup may not be representative of the function of this subgroup. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this subgroup. 93
24922 176102 cd05466 PBP2_LTTR_substrate The substrate binding domain of LysR-type transcriptional regulators (LTTRs), a member of the type 2 periplasmic binding fold protein superfamily. This model and hierarchy represent the substrate-binding domain of the LysR-type transcriptional regulators that form the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, oxidative stress responses, nodule formation of nitrogen-fixing bacteria, synthesis of virulence factors, toxin production, attachment and secretion, to name a few. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 197
24923 119437 cd05467 CBM20 The family 20 carbohydrate-binding module (CBM20), also known as the starch-binding domain, is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 96
24924 176472 cd05468 pVHL von Hippel-Lindau (pVHL) tumor suppressor protein. von Hippel-Lindau (pVHL) protein, the gene product of VHL, is a critical regulator of the ubiquitous oxygen-sensing pathway. It is conserved throughout evolution, as its homologs are found in organisms ranging from mammals to the insects Drosophila melanogaster and Anopheles gambiae and the nematode Caenorhabditis elegans. pVHL acts as the substrate recognition component of an E3 ubiquitin ligase complex. Several proteins have been identified as pVHL-binding proteins that are subject to ubiquitin-mediated proteolysis; the best characterized putative substrates are the alpha subunits of the hypoxia-inducible factor (HIF1alpha, HIF2alpha, and HIF3alpha). In addition to HIF degradation, pVHL has been implicated in HIF-independent cellular processes. Germline VHL mutations cause renal cell carcinomas, hemangioblastomas and pheochromocytomas in humans. pVHL can bind to and direct the proper deposition of fibronectin and collagen IV within the extracellular matrix. It works to stabilize microtubules and foster the maintenance of the primary cilium. It also has been reported to promote the stabilization and activation of p53 in a HIF-independent manner and, in neuronal cells, promote apoptosis by down-regulation of Jun-B. 141
24925 100112 cd05469 Transthyretin_like Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms. 113
24926 133137 cd05470 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site Asp residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles the structure of the N- or C-terminal domain of the eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N- and C-terminal domains of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family). 109
24927 133138 cd05471 pepsin_like Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 283
24928 133139 cd05472 cnd41_like Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 299
24929 133140 cd05473 beta_secretase_like Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of pepsin family of aspartic proteases. Same as other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which have at least six residues in length with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 364
24930 133141 cd05474 SAP_like SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. The overall structure of Sap protein conforms to the classical aspartic proteinase fold typified by pepsin. SAP is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 295
24931 133142 cd05475 nucellin_like Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more divergent, except for the conserved catalytic site motif. 273
24932 133143 cd05476 pepsin_A_like_plant Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco and Nucellins are important regulators of nucellar cell's progressive degradation after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides which have at least six residues in length with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. 265
24933 133144 cd05477 gastricsin Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. It is self-converting to a mature enzyme being triggered by a drop in pH from neutrality to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. Mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more divergent, except for the conserved catalytic site motif. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 318
24934 133145 cd05478 pepsin_A Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequence of the zymogens are self cleaved under acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may be evolved from the other through ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures, however their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides which have at least six residues in length with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 317
24935 133146 cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI. 124
24936 133147 cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 103
24937 133148 cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 93
24938 133149 cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 87
24939 133150 cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 96
24940 133151 cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 91
24941 133152 cd05485 Cathepsin_D_like Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank a deep active site cleft. Each of the two related lobes contributes one active site aspartic acid residue and contains a single carbohydrate group. Cathepsin D is an essential enzyme. Mice deficient for proteinase cathepsin D, generated by gene targeting, develop normally during the first 2 weeks, stop thriving in the third week and die in a state of anorexia in the fourth week. The mice develop atrophy of ileal mucosa followed by other degradation of intestinal organs. In these knockout mice, lysosomal proteolysis was normal. These results suggest that vital functions of cathepsin D are exerted by limited proteolysis of proteins regulating cell growth and/or tissue homeostasis, while its contribution to bulk proteolysis in lysosomes appears to be non-critical. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 329
24942 133153 cd05486 Cathespin_E Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family, it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/thr motif in both N- and C-terminal lobes of the enzyme. The aspartic acid residues act together to allow a water molecule to attack the peptide bond. One aspartic acid residue (in its deprotonated form) activates the attacking water molecule, whereas the other aspartic acid residue (in its protonated form) polarizes the peptide carbonyl, increasing its susceptibility to attack. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 316
24943 133154 cd05487 renin_like Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 326
24944 133155 cd05488 Proteinase_A_fungi Fungal Proteinase A , aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, targeted to the vacuole as a zymogen, activation of proteinases A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that flap loop is atypically pointing directly into the S(1) pocket of the enzyme. Proteinase A preferentially hydrolyzes hydrophobic residues such as Phe, Leu or Glu at the P1 position and Phe, Ile, Leu or Ala at P1'. Moreover, the enzyme is inhibited by IA3, a natural and highly specific inhibitor produced by S. cerevisiae. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 320
24945 133156 cd05489 xylanase_inhibitor_I_like TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability of tight binding and inhibition with subnanomolar affinity and indicates the importance of the C-terminal end for the differences in xylanase specificity among different TAXI-type inhibitors. This family also contains pepsin-like aspartic proteinases homologous to TAXI-I. Unlike TAXI-I, they have active site aspartates and are functionally active. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 362
24946 133157 cd05490 Cathepsin_D2 Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank a deep active site cleft. Each of the two related lobes contributes one active site aspartic acid residue and contains a single carbohydrate group. Cathepsin D is an essential enzyme. Mice deficient for proteinase cathepsin D, generated by gene targeting, develop normally during the first 2 weeks, stop thriving in the third week and die in a state of anorexia in the fourth week. The mice develop atrophy of ileal mucosa followed by other degradation of intestinal organs. In these knockout mice, lysosomal proteolysis was normal. These results suggest that vital functions of cathepsin D are exerted by limited proteolysis of proteins regulating cell growth and/or tissue homeostasis, while its contribution to bulk proteolysis in lysosomes appears to be non-critical. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 325
24947 99923 cd05491 Bromo_TBP7_like Bromodomain; TBP7_like subfamily, limited to fungi. TBP7, or TAT-binding protein homolog 7, is a yeast protein of unknown function that contains AAA-superfamily ATP-ase domains and a bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 119
24948 99924 cd05492 Bromo_ZMYND11 Bromodomain; ZMYND11_like sub-family. ZMYND11 or BS69 is a ubiquitously expressed nuclear protein that has been shown to associate with chromatin. It interacts with chromatin remodeling factors and might play a role in chromatin remodeling and gene expression. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 109
24949 99925 cd05493 Bromo_ALL-1 Bromodomain, ALL-1 like proteins. ALL-1 is a vertebrate homologue of Drosophila trithorax and is often affected in chromosomal rearrangements that are linked to acute leukemias, such as acute lymphocytic leukemia (ALL). Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 131
24950 99926 cd05494 Bromodomain_1 Bromodomain; uncharacterized subfamily. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 114
24951 99927 cd05495 Bromo_cbp_like Bromodomain, cbp_like subfamily. Cbp (CREB binding protein or CREBBP) is an acetyltransferase acting on histone, which gives a specific tag for transcriptional activation and also acetylates non-histone proteins. CREBBP binds specifically to phosphorylated CREB protein and augments the activity of phosphorylated CREB to activate transcription of cAMP-responsive genes. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 108
24952 99928 cd05496 Bromo_WDR9_II Bromodomain; WDR9 repeat II_like subfamily. WDR9 is a human gene located in the Down Syndrome critical region-2 of chromosome 21. It encodes for a nuclear protein containing WD40 repeats and two bromodomains, which may function as a transcriptional regulator involved in chromatin remodeling and play a role in embryonic development. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 119
24953 99929 cd05497 Bromo_Brdt_I_like Bromodomain, Brdt_like subfamily, repeat I. Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins; the first bromodomain in Brdt has been shown to be essential for male germ cell differentiation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 107
24954 99930 cd05498 Bromo_Brdt_II_like Bromodomain, Brdt_like subfamily, repeat II. Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins; the first bromodomain in Brdt has been shown to be essential for male germ cell differentiation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 102
24955 99931 cd05499 Bromo_BDF1_2_II Bromodomain. BDF1/BDF2 like subfamily, restricted to fungi, repeat II. BDF1 and BDF2 are yeast transcription factors involved in the expression of a wide range of genes, including snRNAs; they are required for sporulation and DNA repair and protect histone H4 from deacetylation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 102
24956 99932 cd05500 Bromo_BDF1_2_I Bromodomain. BDF1/BDF2 like subfamily, restricted to fungi, repeat I. BDF1 and BDF2 are yeast transcription factors involved in the expression of a wide range of genes, including snRNAs; they are required for sporulation and DNA repair and protect histone H4 from deacetylation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 103
24957 99933 cd05501 Bromo_SP100C_like Bromodomain, SP100C_like subfamily. The SP100C protein is a splice variant of SP100, a major component of PML-SP100 nuclear bodies (NBs), which are poorly understood. It is covalently modified by SUMO-1 and may play a role in processes at the chromatin level. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 102
24958 99934 cd05502 Bromo_tif1_like Bromodomain; tif1_like subfamily. Tif1 (transcription intermediary factor 1) is a member of the tripartite motif (TRIM) protein family, which is characterized by a particular domain architecture. It functions by recruiting coactivators and/or corepressors to modulate transcription. Vertebrate Tif1-gamma, also labeled E3 ubiquitin-protein ligase TRIM33, plays a role in the control of hematopoiesis. Its homologue in Xenopus laevis, Ectodermin, has been shown to function in germ-layer specification and control of cell growth during embryogenesis. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 109
24959 99935 cd05503 Bromo_BAZ2A_B_like Bromodomain, BAZ2A/BAZ2B_like subfamily. Bromo adjacent to zinc finger 2A (BAZ2A) and 2B (BAZ2B) were identified as a novel human bromodomain gene by cDNA library screening. BAZ2A is also known as Tip5 (Transcription termination factor I-interacting protein 5) and hWALp3. The proteins may play roles in transcriptional regulation. Human Tip5 is part of a complex termed NoRC (nucleolar remodeling complex), which induces nucleosome sliding and may play a role in the regulation of the rDNA locus. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 97
24960 99936 cd05504 Bromo_Acf1_like Bromodomain; Acf1_like or BAZ1A_like subfamily. Bromo adjacent to zinc finger 1A (BAZ1A) was identified as a novel human bromodomain gene by cDNA library screening. The Drosophila homologue, Acf1, is part of the CHRAC (chromatin accessibility complex) and regulates ISWI-induced nucleosome remodeling. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 115
24961 99937 cd05505 Bromo_WSTF_like Bromodomain; Williams syndrome transcription factor-like subfamily (WSTF-like). The Williams-Beuren syndrome deletion transcript 9 is a putative transcriptional regulator. WSTF was found to play a role in vitamin D-mediated transcription as part of two chromatin remodeling complexes, WINAC and WICH. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 97
24962 99938 cd05506 Bromo_plant1 Bromodomain, uncharacterized subfamily specific to plants. Might function as a global transcription factor. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 99
24963 99939 cd05507 Bromo_brd8_like Bromodomain, brd8_like subgroup. In mammals, brd8 (bromodomain containing 8) interacts with the thyroid hormone receptor in a ligand-dependent fashion and enhances thyroid hormone-dependent activation from thyroid response elements. Brd8 is thought to be a nuclear receptor coactivator. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 104
24964 99940 cd05508 Bromo_RACK7 Bromodomain, RACK7_like subfamily. RACK7 (also called human protein kinase C-binding protein) was identified as a potential tumor suppressor gene. It shares domain architecture with BS69/ZMYND11; both have been implicated in the regulation of cellular proliferation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 99
24965 99941 cd05509 Bromo_gcn5_like Bromodomain; Gcn5_like subfamily. Gcn5p is a histone acetyltransferase (HAT) which mediates acetylation of histones at lysine residues; such acetylation is generally correlated with the activation of transcription. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 101
24966 99942 cd05510 Bromo_SPT7_like Bromodomain; SPT7_like subfamily. SPT7 is a yeast protein that functions as a component of the transcription regulatory histone acetylation (HAT) complexes SAGA, SALSA, and SLIK. SAGA is involved in the RNA polymerase II-dependent transcriptional regulation of about 10% of all yeast genes. The SPT7 bromodomain has been shown to weakly interact with acetylated histone H3, but not H4. The human representative of this subfamily is cat eye syndrome critical region protein 2 (CECR2). Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 112
24967 99943 cd05511 Bromo_TFIID Bromodomain, TFIID-like subfamily. Human TAFII250 (or TAF250) is the largest subunit of TFIID, a large multi-domain complex, which initiates the assembly of the transcription machinery. TAFII250 contains two bromodomains that specifically bind to acetylated histone H4. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 112
24968 99944 cd05512 Bromo_brd1_like Bromodomain; brd1_like subfamily. BRD1 is a mammalian gene which encodes for a nuclear protein assumed to be a transcriptional regulator. BRD1 has been implicated with brain development and susceptibility to schizophrenia and bipolar affective disorder. Bromodomains are 110 amino acid long domains that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 98
24969 99945 cd05513 Bromo_brd7_like Bromodomain, brd7_like subgroup. The BRD7 gene encodes a nuclear protein that has been shown to inhibit cell growth and the progression of the cell cycle by regulating cell-cycle genes at the transcriptional level. BRD7 has been identified as a gene involved in nasopharyngeal carcinoma. The protein interacts with acetylated histone H3 via its bromodomain. Bromodomains are 110 amino acid long domains that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 98
24970 99946 cd05515 Bromo_polybromo_V Bromodomain, polybromo repeat V. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 105
24971 99947 cd05516 Bromo_SNF2L2 Bromodomain, SNF2L2-like subfamily, specific to animals. SNF2L2 (SNF2-alpha) or SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 2 is a global transcriptional activator, which cooperates with nuclear hormone receptors to boost transcriptional activation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 107
24972 99948 cd05517 Bromo_polybromo_II Bromodomain, polybromo repeat II. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 103
24973 99949 cd05518 Bromo_polybromo_IV Bromodomain, polybromo repeat IV. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 103
24974 99950 cd05519 Bromo_SNF2 Bromodomain, SNF2-like subfamily, specific to fungi. SNF2 is a yeast protein involved in transcriptional activation, it is the catalytic component of the SWI/SNF ATP-dependent chromatin remodeling complex. The protein is essential for the regulation of gene expression (both positive and negative) of a large number of genes. The SWI/SNF complex changes chromatin structure by altering DNA-histone contacts within the nucleosome, which results in a re-positioning of the nucleosome and facilitates or represses the binding of gene-specific transcription factors. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 103
24975 99951 cd05520 Bromo_polybromo_III Bromodomain, polybromo repeat III. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 103
24976 99952 cd05521 Bromo_Rsc1_2_I Bromodomain, repeat I in Rsc1/2_like subfamily, specific to fungi. Rsc1 and Rsc2 are components of the RSC complex (remodeling the structure of chromatin), are essential for transcriptional control, and have a specific domain architecture including two bromodomains. The RSC complex has also been linked to homologous recombination and nonhomologous end-joining repair of DNA double strand breaks. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 106
24977 99953 cd05522 Bromo_Rsc1_2_II Bromodomain, repeat II in Rsc1/2_like subfamily, specific to fungi. Rsc1 and Rsc2 are components of the RSC complex (remodeling the structure of chromatin), are essential for transcriptional control, and have a specific domain architecture including two bromodomains. The RSC complex has also been linked to homologous recombination and nonhomologous end-joining repair of DNA double strand breaks. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 104
24978 99954 cd05524 Bromo_polybromo_I Bromodomain, polybromo repeat I. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 113
24979 99955 cd05525 Bromo_ASH1 Bromodomain; ASH1_like sub-family. ASH1 (absent, small, or homeotic 1) is a member of the trithorax-group in Drosophila melanogaster, an epigenetic transcriptional regulator of HOX genes. Drosophila ASH1 has been shown to methylate specific lysines in histones H3 and H4. Mammalian ASH1 has been shown to methylate histone H3. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 106
24980 99956 cd05526 Bromo_polybromo_VI Bromodomain, polybromo repeat VI. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine. 110
24981 99957 cd05528 Bromo_AAA Bromodomain; sub-family co-occurring with AAA domains. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. The structure (2DKW) in this alignment is an uncharacterized protein predicted from analysis of cDNA clones from human fetal liver. 112
24982 99958 cd05529 Bromo_WDR9_I_like Bromodomain; WDR9 repeat I_like subfamily. WDR9 is a human gene located in the Down Syndrome critical region-2 of chromosome 21. It encodes for a nuclear protein containing WD40 repeats and two bromodomains, which may function as a transcriptional regulator involved in chromatin remodeling and play a role in embryonic development. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 128
24983 99913 cd05530 POLBc_B1 DNA polymerase type-B B1 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaeal members also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases. 372
24984 99914 cd05531 POLBc_B2 DNA polymerase type-B B2 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaeal members also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases. 352
24985 99915 cd05532 POLBc_alpha DNA polymerase type-B alpha subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase (Pol) alpha is almost exclusively required for the initiation of DNA replication and the priming of Okazaki fragments during elongation. In most organisms no specific repair role, other than check point control, has been assigned to this enzyme. Pol alpha contains both polymerase and exonuclease domains, but lacks exonuclease activity suggesting that the exonuclease domain may be for structural purposes only. 400
24986 99916 cd05533 POLBc_delta DNA polymerase type-B delta subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. Presently, no direct data is available regarding the strand specificity of DNA polymerase during DNA replication in vivo. However, mutation analysis supports the hypothesis that DNA polymerase delta is the enzyme responsible for both elongation and maturation of Okazaki fragments on the lagging strand. 393
24987 99917 cd05534 POLBc_zeta DNA polymerase type-B zeta subfamily catalytic domain. DNA polymerase (Pol) zeta is a member of the eukaryotic B-family of DNA polymerases and distantly related to DNA Pol delta. Pol zeta plays a major role in translesion replication and the production of either spontaneous or induced mutations. Apart from its role in translesion replication, Pol zeta also appears to be involved in somatic hypermutability in B lymphocytes, an important element for the production of high affinity antibodies in response to an antigen. 451
24988 99918 cd05535 POLBc_epsilon DNA polymerase type-B epsilon subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase (Pol) epsilon has been proposed to play a role in elongation of the leading strand during DNA replication. Pol epsilon might also have a role in DNA repair. The structure of pol epsilon is characteristic of this family with the exception that it contains a large c-terminal domain with an unclear function. Phylogenetic analyses indicate that Pol epsilon is the ortholog to the archaeal Pol B3 rather than to Pol alpha, delta, or zeta. This might be because pol epsilon is ancestral to both archaea and eukaryotes DNA polymerases type B. 621
24989 99919 cd05536 POLBc_B3 DNA polymerase type-B B3 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some members of the archaea also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases. Structural comparison of the thermostable DNA polymerase type B to its mesostable homolog suggests several adaptations to high temperature such as shorter loops, disulfide bridges, and increasing electrostatic interaction at subdomain interfaces. 371
24990 99920 cd05537 POLBc_Pol_II DNA polymerase type-II subfamily catalytic domain. Bacteria contain five DNA polymerases (I, II, III, IV and V). DNA polymerase II (Pol II) is a prototype for the B-family of polymerases. The role of Pol II in a variety of cellular activities, such as repair of DNA damaged by UV irradiation or oxidation, has been proven by genetic studies. DNA polymerase III is the main enzyme responsible for replication of the bacterial chromosome; however, in vivo studies have also shown that Pol II is able to participate in chromosomal DNA replication with a larger role in lagging-strand replication. 371
24991 99921 cd05538 POLBc_Pol_II_B DNA polymerase type-II B subfamily catalytic domain. Bacteria contain five DNA polymerases (I, II, III, IV and V). DNA polymerase II (Pol II) is a prototype for the B-family of polymerases. The role of Pol II in a variety of cellular activities, such as repair of DNA damaged by UV irradiation or oxidation, has been proven by genetic studies. DNA polymerase III is the main enzyme responsible for replication of the bacterial chromosome; however, in vivo studies have also shown that Pol II is able to participate in chromosomal DNA replication with a larger role in lagging-strand replication. 347
24992 349776 cd05540 UreG urease accessory protein UreG. UreG is one of the four accessory proteins of urease. Urease is an enzyme which catalyzes the decomposition of urea to form ammonia and carbon dioxide. Bacterial urease is a trimer of three subunits which are encoded by genes ureA, ureB, and ureC. Up to four accessory proteins (ureD, ureE, ureF, and ureG) are required for urease catalytic function. UreG may play an important role in nickel incorporation of the urease metallocenter. UreG is a member of the Fer4_NifH superfamily which contains an ATP-binding domain. Proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions. 191
24993 349405 cd05559 CAP_PI16_HrTT-1 CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 16 and HrTT-1 protein. Human peptidase inhibitor 16 (PI16) is also called cysteine-rich secretory protein 9 (CRISP-9) or PSP94-binding protein. Mouse PI16 is also called cysteine-rich protease inhibitor. PI16 is predominantly expressed by cardiac fibroblasts and is exposed to the interstitium via a glycophosphatidylinositol (-GPI) membrane anchor. It suppresses the activation of the chemokine chemerin in the myocardium, which may be a part of the cardiac stress response. At high endothelial shear stress, PI16 is an inflammation-regulated inhibitor of matrix metalloproteinase 2 (MMP2). Also included in this subfamily is the HrTT-1 protein, a tail-tip epidermis marker in ascidians. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 134
24994 240187 cd05560 Xcc1710_like Xcc1710_like family, specific to proteobacteria. Xcc1710 is a hypothetical protein from Xanthomonas campestris pv. campestris str. ATCC 33913, similar to Mth938, a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. Their three-dimensional structures have been determined, but their functions are unknown. 109
24995 173797 cd05561 Peptidases_S8_4 Peptidase S8 family domain, uncharacterized subfamily 4. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 239
24996 173798 cd05562 Peptidases_S53_like Peptidase domain in the S53 family. Members of the peptidase S53 (sedolisin) family include endopeptidases and exopeptidases. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of Asn in subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. Characterized sedolisins include Kumamolisin, an extracellular calcium-dependent thermostable endopeptidase from Bacillus. The enzyme is synthesized with a 188 amino acid N-terminal preprotein region which is cleaved after the extraction into the extracellular space with low pH. One kumamolisin paralog, kumamolisin-As, is believed to be a collagenase. TPP1 is a serine protease that functions as a tripeptidyl exopeptidase as well as an endopeptidase. Less is known about PSCP from Pseudomonas which is thought to be an aspartic proteinase. 275
24997 99905 cd05563 PTS_IIB_ascorbate PTS_IIB_ascorbate: subunit IIB of enzyme II (EII) of the L-ascorbate-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is an L-ascorbate-specific permease with two cytoplasmic subunits (IIA and IIB) and a transmembrane channel IIC subunit. Subunits IIA, IIB, and IIC are encoded by the sgaA, sgaB, and sgaT genes of the E. coli sgaTBA operon. In some bacteria, the IIB (SgaB) domain is fused C-terminal to the IIA (SgaT) domain. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include ascorbate, chitobiose/lichenan, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. 86
24998 99906 cd05564 PTS_IIB_chitobiose_lichenan PTS_IIB_chitobiose_lichenan: subunit IIB of enzyme II (EII) of the N,N-diacetylchitobiose-specific and lichenan-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In these systems, EII is either a lichenan- or an N,N-diacetylchitobiose-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. In the chitobiose system, these subunits are expressed as separate proteins from chbA, chbB, and chbC of the chb operon (formerly the cel (cellulose) operon). In the lichenan system, these subunits are expressed from licA, licB, and licC of the lic operon. The lic operon of Bacillus subtilis is required for the transport and degradation of oligomeric beta-glucosides, which are produced by extracellular enzymes on substrates such as lichenan or barley glucan. The lic operon is transcribed from a gammaA-dependent promoter and is inducible by lichenan, lichenan hydrolysate, and cellobiose. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. 96
24999 99907 cd05565 PTS_IIB_lactose PTS_IIB_lactose: subunit IIB of enzyme II (EII) of the lactose-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) found in Firmicutes as well as Actinobacteria. In this system, EII is a lactose-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIC and IIB domains are expressed as a single protein from the lac operon. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include lactose, chitobiose/lichenan, ascorbate, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. 99
25000 99908 cd05566 PTS_IIB_galactitol PTS_IIB_galactitol: subunit IIB of enzyme II (EII) of the galactitol-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is a galactitol-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain that are expressed on three distinct polypeptide chains, in contrast to other PTS sugar transporters. The three genes encoding these subunits (gatA, gatB, and gatC) comprise the gatCBA operon. Galactitol PTS permease takes up exogenous galactitol, releasing the phosphate ester into the cytoplasm in preparation for oxidation and further metabolism via a modified glycolytic pathway called the tagatose-6-phosphate glycolytic pathway. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include galactitol, chitobiose/lichenan, ascorbate, lactose, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. 89
25001 99909 cd05567 PTS_IIB_mannitol PTS_IIB_mannitol: subunit IIB of enzyme II (EII) of the mannitol-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is a mannitol-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIA, IIB, and IIC domains are expressed from the mtlA gene as a single protein, also known as the mannitol PTS permease, the mtl transporter, or MtlA. MtlA is only functional as a dimer with the dimer contacts occurring between the IIC domains. MtlA takes up exogenous mannitol, releasing the phosphate ester into the cytoplasm in preparation for oxidation to fructose-6-phosphate by the NAD-dependent mannitol-P dehydrogenase (MtlD). The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include mannitol, chitobiose/lichenan, ascorbate, lactose, galactitol, fructose, and a sensory system with similarity to the bacterial bgl system. 87
25002 99910 cd05568 PTS_IIB_bgl_like PTS_IIB_bgl_like: the PTS (phosphotransferase system) IIB domain of a family of sensory systems composed of a membrane-bound sugar-sensor (similar to BglF) and a transcription antiterminator (similar to BglG) which regulate expression of genes involved in sugar utilization. The domain architecture of the IIB-containing protein includes a region N-terminal to the IIB domain which is homologous to the BglG transcription antiterminator with an RNA-binding domain followed by two homologous domains, PRD1 and PRD2 (PTS Regulation Domains). C-terminal to the IIB domain is a domain similar to the PTS IIA domain. In this system, the BglG-like region and the IIB and IIA-like domains are all expressed together as a single multidomain protein. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include this sensory system with similarity to the bacterial bgl system, chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, and fructose systems. 85
25003 99911 cd05569 PTS_IIB_fructose PTS_IIB_fructose: subunit IIB of enzyme II (EII) of the fructose-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII (also referred to as FruAB) is a fructose-specific permease made up of two proteins (FruA and FruB) each containing 3 domains. The FruA protein contains two tandem nonidentical IIB domains and a C-terminal IIC transmembrane domain. Both IIB domains of FruA are included in this alignment. The FruB protein (also referred to as diphosphoryl transfer protein) contains a IIA domain, a domain of unknown function, and an Hpr-like domain called FPr (fructose-inducible HPr). This family also includes the IIB domains of several fructose-like PTS permeases including the Frv permease encoded by the frvABXR operon, the Frw permease encoded by the frwACBD operon, the Frx permease encoded by the hrsA gene, and the Fry permease encoded by the fryABC (ypdDGH) operon. FruAB takes up exogenous fructose, releasing the 1-phosphate ester into the cytoplasm in preparation for metabolism primarily via glycolysis. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include fructose, chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, and a sensory system with similarity to the bacterial bgl system. 96
25004 270722 cd05570 STKc_PKC Catalytic domain of the Serine/Threonine Kinase, Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, classical PKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. Novel PKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. Also included in this subfamily are the PKC-like proteins, called PKNs. The PKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 318
25005 270723 cd05571 STKc_PKB Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. There are three PKB isoforms from different genes, PKB-alpha (or Akt1), PKB-beta (or Akt2), and PKB-gamma (or Akt3). PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. It is activated downstream of phosphoinositide 3-kinase (PI3K) and plays important roles in diverse cellular functions including cell survival, growth, proliferation, angiogenesis, motility, and migration. PKB also has a central role in a variety of human cancers, having been implicated in tumor initiation, progression, and metastasis. The PKB subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and PI3K. 322
25006 270724 cd05572 STKc_cGK Catalytic domain of the Serine/Threonine Kinase, cGMP-dependent protein kinase (cGK or PKG). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mammals have two cGK isoforms from different genes, cGKI and cGKII. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. cGK consists of an N-terminal regulatory domain containing a dimerization and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. cGKII is a membrane-bound protein that is most abundantly expressed in the intestine. It is also present in the brain nuclei, adrenal cortex, kidney, lung, and prostate. cGKI is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. cGKII plays a role in the regulation of secretion, such as renin secretion by the kidney and aldosterone secretion by the adrenal. It also regulates bone growth and the circadian rhythm. The cGK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
25007 270725 cd05573 STKc_ROCK_NDR_like Catalytic domain of Rho-associated coiled-coil containing protein kinase (ROCK)- and Nuclear Dbf2-Related (NDR)-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily include ROCK and ROCK-like proteins such as DMPK, MRCK, and CRIK, as well as NDR and NDR-like proteins such as LATS, CBK1 and Sid2p. ROCK and CRIK are effectors of the small GTPase Rho, while MRCK is an effector of the small GTPase Cdc42. NDR and NDR-like kinases contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Proteins in this subfamily are involved in regulating many cellular functions including contraction, motility, division, proliferation, apoptosis, morphogenesis, and cytokinesis. The ROCK/NDR-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 350
25008 270726 cd05574 STKc_phototropin_like Catalytic domain of Phototropin-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phototropins are blue-light receptors that control responses such as phototropism, stomatal opening, and chloroplast movement in order to optimize the photosynthetic efficiency of plants. They are light-activated STKs that contain an N-terminal photosensory domain and a C-terminal catalytic domain. The N-terminal domain contains two LOV (Light, Oxygen or Voltage) domains that bind FMN. Photoexcitation of the LOV domains results in autophosphorylation at multiple sites and activation of the catalytic domain. In addition to plant phototropins, included in this subfamily are predominantly uncharacterized fungal STKs whose catalytic domains resemble the phototropin kinase domain. One protein from Neurospora crassa is called nrc-2, which plays a role in growth and development by controlling entry into the conidiation program. The phototropin-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 316
25009 270727 cd05575 STKc_SGK Catalytic domain of the Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGKs are activated by insulin and growth factors via phosphoinositide 3-kinase and PDK1. They activate ion channels, ion carriers, and the Na-K-ATPase, as well as regulate the activity of enzymes and transcription factors. SGKs play important roles in transport, hormone release, neuroexcitability, cell proliferation, and apoptosis. There are three isoforms of SGK, named SGK1, SGK2, and SGK3 (also called cytokine-independent survival kinase CISK). The SGK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 323
25010 270728 cd05576 STKc_RPK118_like Catalytic domain of the Serine/Threonine Kinase, RPK118, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RPK118 contains an N-terminal Phox homology (PX) domain, a Microtubule Interacting and Trafficking (MIT) domain, and a kinase domain containing a long uncharacterized insert. Also included in the family is human RPK60 (or ribosomal protein S6 kinase-like 1), which also contains MIT and kinase domains but lacks a PX domain. RPK118 binds sphingosine kinase, a key enzyme in the synthesis of sphingosine 1-phosphate (SPP), a lipid messenger involved in many cellular events. RPK118 may be involved in transmitting SPP-mediated signaling. RPK118 also binds the antioxidant peroxiredoxin-3. RPK118 may be involved in the transport of PRDX3 from the cytoplasm to its site of function in the mitochondria. The RPK118-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
25011 270729 cd05577 STKc_GRK Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. GRKs play important roles in the cardiovascular, immune, respiratory, skeletal, and nervous systems. They contain a central catalytic domain, flanked by N- and C-terminal extensions. The N-terminus contains an RGS (regulator of G protein signaling) homology (RH) domain and several motifs. The C-terminus diverges among different groups of GRKs. There are seven types of GRKs, named GRK1 to GRK7, which are subdivided into three main groups: visual (GRK1/7); beta-adrenergic receptor kinases (GRK2/3); and GRK4-like (GRK4/5/6). Expression of GRK2/3/5/6 is widespread while GRK1/4/7 show a limited tissue distribution. The substrate spectrum of the widely expressed GRKs partially overlaps. The GRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 278
25012 270730 cd05578 STKc_Yank1 Catalytic domain of the Serine/Threonine Kinase, Yank1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily contains uncharacterized STKs with similarity to the human protein designated as Yank1 or STK32A. The Yank1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
25013 270731 cd05579 STKc_MAST_like Catalytic domain of Microtubule-associated serine/threonine (MAST) kinase-like proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes MAST kinases, MAST-like (MASTL) kinases (also called greatwall kinase or Gwl), and fungal kinases with similarity to Saccharomyces cerevisiae Rim15 and Schizosaccharomyces pombe cek1. MAST kinases contain an N-terminal domain of unknown function, a central catalytic domain, and a C-terminal PDZ domain that mediates protein-protein interactions. MASTL kinases carry only a catalytic domain which contains a long insert relative to other kinases. The fungal kinases in this subfamily harbor other domains in addition to a central catalytic domain, which like in MASTL, also contains an insert relative to MAST kinases. Rim15 contains a C-terminal signal receiver (REC) domain while cek1 contains an N-terminal PAS domain. MAST kinases are cytoskeletal associated kinases of unknown function that are also expressed at neuromuscular junctions and postsynaptic densities. MASTL/Gwl is involved in the regulation of mitotic entry, mRNA stabilization, and DNA checkpoint recovery. The fungal proteins Rim15 and cek1 are involved in the regulation of meiosis and mitosis, respectively. The MAST-like kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
25014 270732 cd05580 STKc_PKA_like Catalytic subunit of the Serine/Threonine Kinases, cAMP-dependent protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the cAMP-dependent protein kinases, PKA and PRKX, and similar proteins. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. PRKX is also regulated by the R subunit and is present in many tissues including fetal and adult brain, kidney, and lung. It is implicated in granulocyte/macrophage lineage differentiation, renal cell epithelial migration, and tubular morphogenesis in the developing kidney. The PKA-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
25015 270733 cd05581 STKc_PDK1 Catalytic domain of the Serine/Threonine Kinase, Phosphoinositide-dependent kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PDK1 carries an N-terminal catalytic domain and a C-terminal pleckstrin homology (PH) domain that binds phosphoinositides. It phosphorylates the activation loop of AGC kinases that are regulated by PI3K such as PKB, SGK, and PKC, among others, and is crucial for their activation. Thus, it contributes in regulating many processes including metabolism, growth, proliferation, and survival. PDK1 also has the ability to autophosphorylate and is constitutively active in mammalian cells. It is essential for normal embryo development and is important in regulating cell volume. The PDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 278
25016 270734 cd05582 STKc_RSK_N N-terminal catalytic domain of the Serine/Threonine Kinase, 90 kDa ribosomal protein S6 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. Mammals possess four RSK isoforms (RSK1-4) from distinct genes. RSK proteins are also referred to as MAP kinase-activated protein kinases (MAPKAPKs), p90-RSKs, or p90S6Ks. The RSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 317
25017 270735 cd05583 STKc_MSK_N N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, in response to various stimuli such as growth factors, hormones, neurotransmitters, cellular stress, and pro-inflammatory cytokines. This triggers phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) in the C-terminal extension of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. MSKs are predominantly nuclear proteins. They are widely expressed in many tissues including heart, brain, lung, liver, kidney, and pancreas. There are two isoforms of MSK, called MSK1 and MSK2. The MSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
25018 270736 cd05584 STKc_p70S6K Catalytic domain of the Serine/Threonine Kinase, 70 kDa ribosomal protein S6 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p70S6K (or S6K) contains only one catalytic kinase domain, unlike p90 ribosomal S6 kinases (RSKs). It acts as a downstream effector of the STK mTOR (mammalian Target of Rapamycin) and plays a role in the regulation of the translation machinery during protein synthesis. p70S6K also plays a pivotal role in regulating cell size and glucose homeostasis. Its targets include S6, the translation initiation factor eIF3, and the insulin receptor substrate IRS-1, among others. Mammals contain two isoforms of p70S6K, named S6K1 and S6K2 (or S6K-beta). The p70S6K subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 323
25019 270737 cd05585 STKc_YPK1_like Catalytic domain of Yeast Protein Kinase 1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of fungal proteins with similarity to the AGC STKs, Saccharomyces cerevisiae YPK1 and Schizosaccharomyces pombe Gad8p. YPK1 is required for cell growth and acts as a downstream kinase in the sphingolipid-mediated signaling pathway of yeast. It also plays a role in efficient endocytosis and in the maintenance of cell wall integrity. Gad8p is a downstream target of Tor1p, the fission yeast homolog of mTOR. It plays a role in cell growth and sexual development. The YPK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
25020 270738 cd05586 STKc_Sck1_like Catalytic domain of Suppressor of loss of cAMP-dependent protein kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Sck1 and similar fungal proteins. Sck1 plays a role in trehalase activation triggered by glucose and a nitrogen source. Trehalase catalyzes the cleavage of the disaccharide trehalose to glucose. Trehalose, as a carbohydrate reserve and stress metabolite, plays an important role in the response of yeast to environmental changes. The Sck1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 330
25021 270739 cd05587 STKc_cPKC Catalytic domain of the Serine/Threonine Kinase, Classical (or Conventional) Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. cPKCs contain a calcium-binding C2 region in their regulatory domain. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. The cPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 320
25022 270740 cd05588 STKc_aPKC Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. The aPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 328
25023 270741 cd05589 STKc_PKN Catalytic domain of the Serine/Threonine Kinase, Protein Kinase N. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKN has a C-terminal catalytic domain that is highly homologous to PKCs. Its unique N-terminal regulatory region contains antiparallel coiled-coil (ACC) domains. In mammals, there are three PKN isoforms from different genes (designated PKN-alpha, beta, and gamma), which show different enzymatic properties, tissue distribution, and varied functions. PKN can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. The PKN subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 326
25024 270742 cd05590 STKc_nPKC_eta Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C eta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-eta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 323
25025 270743 cd05591 STKc_nPKC_epsilon Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C epsilon. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-epsilon subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 321
25026 270744 cd05592 STKc_nPKC_theta_like Catalytic domain of the Serine/Threonine Kinases, Novel Protein Kinase C theta, delta, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. There are four nPKC isoforms, delta, epsilon, eta, and theta. The nPKC-theta-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 320
25027 270745 cd05593 STKc_PKB_gamma Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B gamma (also called Akt3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-gamma is predominantly expressed in neuronal tissues. Mice deficient in PKB-gamma show a reduction in brain weight due to the decreases in cell size and cell number. PKB-gamma has also been shown to be upregulated in estrogen-deficient breast cancer cells, androgen-independent prostate cancer cells, and primary ovarian tumors. It acts as a key mediator in the genesis of ovarian cancer. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-gamma subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 348
25028 270746 cd05594 STKc_PKB_alpha Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B alpha (also called Akt1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-alpha is predominantly expressed in endothelial cells. It is critical for the regulation of angiogenesis and the maintenance of vascular integrity. It also plays a role in adipocyte differentiation. Mice deficient in PKB-alpha exhibit perinatal morbidity, growth retardation, reduction in body weight accompanied by reduced sizes of multiple organs, and enhanced apoptosis in some cell types. PKB-alpha activity has been reported to be frequently elevated in breast and prostate cancers. In some cancer cells, PKB-alpha may act as a suppressor of metastasis. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 356
25029 173686 cd05595 STKc_PKB_beta Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B beta (also called Akt2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-beta is the predominant PKB isoform expressed in insulin-responsive tissues. It plays a critical role in the regulation of glucose homeostasis. It is also implicated in muscle cell differentiation. Mice deficient in PKB-beta display normal growth weights but exhibit severe insulin resistance and diabetes, accompanied by lipoatrophy and B-cell failure. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 323
25030 270747 cd05596 STKc_ROCK Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK is also referred to as Rho-associated kinase or simply as Rho kinase. It contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain. It is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Studies in knockout mice result in different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. The ROCK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 352
25031 270748 cd05597 STKc_DMPK_like Catalytic domain of Myotonic Dystrophy protein kinase (DMPK)-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The DMPK-like subfamily is composed of DMPK and DMPK-related cell division control protein 42 (Cdc42) binding kinase (MRCK). DMPK is expressed in skeletal and cardiac muscles, and in central nervous tissues. The functional role of DMPK is not fully understood. It may play a role in the signal transduction and homeostasis of calcium. The DMPK gene is implicated in myotonic dystrophy 1 (DM1), an inherited multisystemic disorder with symptoms that include muscle hyperexcitability, progressive muscle weakness and wasting, cataract development, testicular atrophy, and cardiac conduction defects. The genetic basis for DM1 is the mutational expansion of a CTG repeat in the 3'-UTR of DMPK. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. Three isoforms of MRCK are known, named alpha, beta and gamma. MRCKgamma is expressed in heart and skeletal muscles, unlike MRCKalpha and MRCKbeta, which are expressed ubiquitously. The DMPK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 331
25032 270749 cd05598 STKc_LATS Catalytic domain of the Serine/Threonine Kinase, Large Tumor Suppressor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS was originally identified in Drosophila using a screen for genes whose inactivation led to overproliferation of cells. In tetrapods, there are two LATS isoforms, LATS1 and LATS2. Inactivation of LATS1 in mice results in the development of various tumors, including sarcomas and ovarian cancer. LATS functions as a tumor suppressor and is implicated in cell cycle regulation. The LATS subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 333
25033 270750 cd05599 STKc_NDR_like Catalytic domain of Nuclear Dbf2-Related kinase-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR kinases regulate mitosis, cell growth, embryonic development, and neurological processes. They are also required for proper centrosome duplication. Higher eukaryotes contain two NDR isoforms, NDR1 and NDR2. This subfamily also contains fungal NDR-like kinases. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 324
25034 270751 cd05600 STKc_Sid2p_like Catalytic domain of Fungal Sid2p-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This group contains fungal kinases including Schizosaccharomyces pombe Sid2p and Saccharomyces cerevisiae Dbf2p. Group members show similarity to NDR kinases in that they contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Sid2p plays a crucial role in the septum initiation network (SIN) and in the initiation of cytokinesis. Dbf2p is important in regulating the mitotic exit network (MEN) and in cytokinesis. The Sid2p-like group is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 386
25035 270752 cd05601 STKc_CRIK Catalytic domain of the Serine/Threonine Kinase, Citron Rho-interacting kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CRIK (also called citron kinase) is an effector of the small GTPase Rho. It plays an important function during cytokinesis and affects its contractile process. CRIK-deficient mice show severe ataxia and epilepsy as a result of abnormal cytokinesis and massive apoptosis in neuronal precursors. A Down syndrome critical region protein TTC3 interacts with CRIK and inhibits CRIK-dependent neuronal differentiation and neurite extension. CRIK contains a catalytic domain, a central coiled-coil domain, and a C-terminal region containing a Rho-binding domain (RBD), a zinc finger, and a pleckstrin homology (PH) domain, in addition to other motifs. The CRIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 328
25036 270753 cd05602 STKc_SGK1 Catalytic domain of the Protein Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK1 is ubiquitously expressed and is under transcriptional control of numerous stimuli including cell stress (cell shrinkage), serum, hormones (gluco- and mineralocorticoids), gonadotropins, growth factors, interleukin-6, and other cytokines. It plays roles in sodium retention and potassium elimination in the kidney, nutrient transport, salt sensitivity, memory consolidation, and cardiac repolarization. A common SGK1 variant is associated with increased blood pressure and body weight. SGK1 may also contribute to tumor growth, neurodegeneration, fibrosing disease, and ischemia. The SGK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 339
25037 270754 cd05603 STKc_SGK2 Catalytic domain of the Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK2 shows a more restricted distribution than SGK1 and is most abundantly expressed in epithelial tissues including kidney, liver, pancreas, and the choroid plexus of the brain. In vitro cellular assays show that SGK2 can stimulate the activity of ion channels, the glutamate transporter EAAT4, and the glutamate receptors, GluR6 and GluR1. The SGK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 321
25038 270755 cd05604 STKc_SGK3 Catalytic domain of the Protein Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK3 (also called cytokine-independent survival kinase or CISK) is expressed in most tissues and is most abundant in the embryo and adult heart and spleen. It was originally discovered in a screen for antiapoptotic genes. It phosphorylates and inhibits the proapoptotic proteins, Bad and FKHRL1. SGK3 also regulates many transporters, ion channels, and receptors. It plays a critical role in hair follicle morphogenesis and hair cycling. The SGK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 326
25039 270756 cd05605 STKc_GRK4_like Catalytic domain of G protein-coupled Receptor Kinase 4-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of the GRK4-like group include GRK4, GRK5, GRK6, and similar GRKs. They contain an N-terminal RGS homology (RH) domain and a catalytic domain, but lack a G protein betagamma-subunit binding domain. They are localized to the plasma membrane through post-translational lipid modification or direct binding to PIP2. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK4-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
25040 270757 cd05606 STKc_beta_ARK Catalytic domain of the Serine/Threonine Kinase, beta-adrenergic receptor kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The beta-ARK group is composed of GRK2, GRK3, and similar proteins. GRK2 and GRK3 are both widely expressed in many tissues, although GRK2 is present at higher levels. They contain an N-terminal RGS homology (RH) domain, a central catalytic domain, and C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRK2 (also called beta-ARK or beta-ARK1) is important in regulating several cardiac receptor responses. It plays a role in cardiac development and in hypertension. Deletion of GRK2 in mice results in embryonic lethality, caused by hypoplasia of the ventricular myocardium. GRK2 also plays important roles in the liver (as a regulator of portal blood pressure), in immune cells, and in the nervous system. Altered GRK2 expression has been reported in several disorders including major depression, schizophrenia, bipolar disorder, and Parkinsonism. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The beta-ARK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
25041 270758 cd05607 STKc_GRK7 Catalytic domain of the Protein Serine/Threonine Kinase, G protein-coupled Receptor Kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK7 (also called iodopsin kinase) belongs to the visual group of GRKs. It is primarily found in the retina and plays a role in the regulation of opsin light receptors. GRK7 is located in retinal cone outer segments and plays an important role in regulating photoresponse of the cones. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
25042 270759 cd05608 STKc_GRK1 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK1 (also called rhodopsin kinase) belongs to the visual group of GRKs and is expressed in retinal cells. It phosphorylates rhodopsin in rod cells, which leads to termination of the phototransduction cascade. Mutations in GRK1 are associated with a recessively inherited form of stationary night blindness called Oguchi disease. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
25043 270760 cd05609 STKc_MAST Catalytic domain of the Protein Serine/Threonine Kinase, Microtubule-associated serine/threonine kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAST kinases contain an N-terminal domain of unknown function, a central catalytic domain, and a C-terminal PDZ domain that mediates protein-protein interactions. There are four mammalian MAST kinases, named MAST1-MAST4. MAST1 is also called syntrophin-associated STK (SAST) while MAST2 is also called MAST205. MAST kinases are cytoskeletal associated kinases of unknown function that are also expressed at neuromuscular junctions and postsynaptic densities. MAST1, MAST2, and MAST3 bind and phosphorylate the tumor suppressor PTEN, and may contribute to the regulation and stabilization of PTEN. MAST2 is involved in the regulation of the Fc-gamma receptor of the innate immune response in macrophages, and may also be involved in the regulation of the Na+/H+ exchanger NHE3. The MAST kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 280
25044 270761 cd05610 STKc_MASTL Catalytic domain of the Serine/Threonine Kinase, Microtubule-associated serine/threonine-like kinase (also called greatwall kinase). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The MASTL kinases in this group carry only a catalytic domain, which contains a long insertion relative to MAST kinases. MASTL, also called greatwall kinase (Gwl), is involved in the regulation of mitotic entry, which is controlled by the coordinated activities of protein kinases and opposing protein phosphatases (PPs). The cyclin B/CDK1 complex induces entry into M-phase while PP2A-B55 shows anti-mitotic activity. MASTL/Gwl is activated downstream of cyclin B/CDK1 and indirectly inhibits PP2A-B55 by phosphorylating the small protein alpha-endosulfine (Ensa) or the cAMP-regulated phosphoprotein 19 (Arpp19), resulting in M-phase progression. Gwl kinase may also play roles in mRNA stabilization and DNA checkpoint recovery. The human MASTL gene has also been named FLJ14813; a missense mutation in FLJ14813 is associated with autosomal dominant thrombocytopenia. The MASTL kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 349
25045 270762 cd05611 STKc_Rim15_like Catalytic domain of fungal Rim15-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include Saccharomyces cerevisiae Rim15, Schizosaccharomyces pombe cek1, and similar fungal proteins. They contain a central catalytic domain, which contains an insert relative to MAST kinases. In addition, Rim15 contains a C-terminal signal receiver (REC) domain while cek1 contains an N-terminal PAS domain. Rim15 (or Rim15p) functions as a regulator of meiosis. It acts as a downstream effector of PKA and regulates entry into stationary phase (G0). Thus, it plays a crucial role in regulating yeast proliferation, differentiation, and aging. Cek1 may facilitate progression of mitotic anaphase. The Rim15-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
25046 270763 cd05612 STKc_PRKX_like Catalytic domain of PRKX-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include human PRKX (X chromosome-encoded protein kinase), Drosophila DC2, and similar proteins. PRKX is present in many tissues including fetal and adult brain, kidney, and lung. The PRKX gene is located in the Xp22.3 subregion and has a homolog called PRKY on the Y chromosome. An abnormal interchange between PRKX and PRKY leads to the sex reversal disorder of XX males and XY females. PRKX is implicated in granulocyte/macrophage lineage differentiation, renal cell epithelial migration, and tubular morphogenesis in the developing kidney. The PRKX-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
25047 270764 cd05613 STKc_MSK1_N N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK1 plays a role in the regulation of translational control and transcriptional activation. It phosphorylates the transcription factors, CREB and NFkB. It also phosphorylates the nucleosomal proteins H3 and HMG-14. Increased phosphorylation of MSK1 is associated with the development of cerebral ischemic/hypoxic preconditioning. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
25048 270765 cd05614 STKc_MSK2_N N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK2 and MSK1 play nonredundant roles in activating histone H3 kinases, which play pivotal roles in compaction of the chromatin fiber. MSK2 is the required H3 kinase in response to stress stimuli and activation of the p38 MAPK pathway. MSK2 also plays a role in the pathogenesis of psoriasis. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family, similar to 90 kDa ribosomal protein S6 kinases (RSKs). MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 332
25049 270766 cd05615 STKc_cPKC_alpha Catalytic domain of the Serine/Threonine Kinase, Classical Protein Kinase C alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. The cPKC-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 341
25050 270767 cd05616 STKc_cPKC_beta Catalytic domain of the Serine/Threonine Kinase, Classical Protein Kinase C beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG, and in most cases, phosphatidylserine (PS) for activation. The cPKC-beta subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 323
25051 270768 cd05617 STKc_aPKC_zeta Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C zeta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. The aPKC-zeta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 357
25052 270769 cd05618 STKc_aPKC_iota Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C iota. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. The aPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 364
25053 270770 cd05619 STKc_nPKC_theta Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C theta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. Although T-cells also express other PKC isoforms, PKC-theta is unique in that upon antigen stimulation, it is translocated to the plasma membrane at the immunological synapse, where it mediates signals essential for T-cell activation. It is essential for TCR-induced proliferation, cytokine production, T-cell survival, and the differentiation and effector function of T-helper (Th) cells, particularly Th2 and Th17. PKC-theta is being developed as a therapeutic target for Th2-mediated allergic inflammation and Th17-mediated autoimmune diseases. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 331
25054 173710 cd05620 STKc_nPKC_delta Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C delta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. It slows down cell proliferation, inducing cell cycle arrest and enhancing cell differentiation. PKC-delta is also involved in the regulation of transcription as well as immune and inflammatory responses. It plays a central role in the genotoxic stress response that leads to DNA damage-induced apoptosis. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-delta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 316
25055 270771 cd05621 STKc_ROCK2 Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK2 was the first identified target of activated RhoA, and was found to play a role in stress fiber and focal adhesion formation. It is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain, and is activated via interaction with Rho GTPases. The ROCK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 379
25056 270772 cd05622 STKc_ROCK1 Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK1 is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain, and is activated via interaction with Rho GTPases. The ROCK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 405
25057 270773 cd05623 STKc_MRCK_alpha Catalytic domain of the Serine/Threonine Kinase, DMPK-related cell division control protein 42 binding kinase (MRCK) alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MRCK-alpha is expressed ubiquitously in many tissues. It plays a role in the regulation of peripheral actin reorganization and neurite outgrowth. It may also play a role in the transferrin iron uptake pathway. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. The MRCK-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. This alignment model includes the dimerization domain. 409
25058 270774 cd05624 STKc_MRCK_beta Catalytic domain of the Protein Serine/Threonine Kinase, DMPK-related cell division control protein 42 binding kinase (MRCK) beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MRCK-beta is expressed ubiquitously in many tissues. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. The MRCK-beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. This alignment model includes the dimerization domain. 409
25059 270775 cd05625 STKc_LATS1 Catalytic domain of the Serine/Threonine Kinase, Large Tumor Suppressor 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS1 functions as a tumor suppressor and is implicated in cell cycle regulation. Inactivation of LATS1 in mice results in the development of various tumors, including sarcomas and ovarian cancer. Promoter methylation, loss of heterozygosity, and missense mutations targeting the LATS1 gene have also been found in human sarcomas and ovarian cancers. In addition, decreased expression of LATS1 is associated with an aggressive phenotype and poor prognosis. LATS1 induces G2 arrest and promotes cytokinesis. It may be a component of the mitotic exit network in higher eukaryotes. The LATS1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 382
25060 173715 cd05626 STKc_LATS2 Catalytic domain of the Protein Serine/Threonine Kinase, Large Tumor Suppressor 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS2 is an essential mitotic regulator responsible for coordinating accurate cytokinesis completion and governing the stabilization of other mitotic regulators. It is also critical in the maintenance of proper chromosome number, genomic stability, mitotic fidelity, and the integrity of centrosome duplication. Downregulation of LATS2 is associated with poor prognosis in acute lymphoblastic leukemia and breast cancer. The LATS2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 381
25061 270776 cd05627 STKc_NDR2 Catalytic domain of the Serine/Threonine Kinase, Nuclear Dbf2-Related kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR2 (also called STK38-like) plays a role in proper centrosome duplication. In addition, it is involved in regulating neuronal growth and differentiation, as well as in facilitating neurite outgrowth. NDR2 is also implicated in fear conditioning as it contributes to the coupling of neuronal morphological changes with fear-memory consolidation. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 366
25062 270777 cd05628 STKc_NDR1 Catalytic domain of the Serine/Threonine Kinase, Nuclear Dbf2-Related kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR1 (also called STK38) plays a role in proper centrosome duplication. It is highly expressed in thymus, muscle, lung and spleen. It is not an essential protein because mice deficient of NDR1 remain viable and fertile. However, these mice develop T-cell lymphomas and appear to be hypersensitive to carcinogenic treatment. NDR1 appears to also act as a tumor suppressor. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 376
25063 270778 cd05629 STKc_NDR_like_fungal Catalytic domain of Fungal Nuclear Dbf2-Related kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This group is composed of fungal NDR-like proteins including Saccharomyces cerevisiae CBK1 (or CBK1p), Schizosaccharomyces pombe Orb6 (or Orb6p), Ustilago maydis Ukc1 (or Ukc1p), and Neurospora crassa Cot1. Like NDR kinase, group members contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. CBK1 is an essential component in the RAM (regulation of Ace2p activity and cellular morphogenesis) network. CBK1 and Orb6 play similar roles in coordinating cell morphology with cell cycle progression. Ukc1 is involved in morphogenesis, pathogenicity, and pigment formation. Cot1 plays a role in polar tip extension. The fungal NDR subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 377
25064 270779 cd05630 STKc_GRK6 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK6 is widely expressed in many tissues and is expressed as multiple splice variants with different domain architectures. It is post-translationally palmitoylated and localized in the membrane. GRK6 plays important roles in the regulation of dopamine, M3 muscarinic, opioid, and chemokine receptor signaling. It also plays maladaptive roles in addiction and Parkinson's disease. GRK6-deficient mice exhibit altered dopamine receptor regulation, decreased lymphocyte chemotaxis, and increased acute inflammation and neutrophil chemotaxis. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
25065 173720 cd05631 STKc_GRK4 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK4 has a limited tissue distribution. It is mainly found in the testis, but is also present in the cerebellum and kidney. It is expressed as multiple splice variants with different domain architectures and is post-translationally palmitoylated and localized in the membrane. GRK4 polymorphisms are associated with hypertension and salt sensitivity, as they cause hyperphosphorylation, desensitization, and internalization of the dopamine 1 (D1) receptor while increasing the expression of the angiotensin II type 1 receptor. GRK4 plays a crucial role in the D1 receptor regulation of sodium excretion and blood pressure. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
25066 270780 cd05632 STKc_GRK5 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK5 is widely expressed in many tissues. It associates with the membrane through an N-terminal PIP2 binding domain and also binds phospholipids via its C-terminus. GRK5 deficiency is associated with early Alzheimer's disease in humans and mouse models. GRK5 also plays a crucial role in the pathogenesis of sporadic Parkinson's disease. It participates in the regulation and desensitization of PDGFRbeta, a receptor tyrosine kinase involved in a variety of downstream cellular effects including cell growth, chemotaxis, apoptosis, and angiogenesis. GRK5 also regulates Toll-like receptor 4, which is involved in innate and adaptive immunity. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
25067 270781 cd05633 STKc_GRK3 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK3, also called beta-adrenergic receptor kinase 2 (beta-ARK2), is widely expressed in many tissues. It is involved in modulating the cholinergic response of airway smooth muscles, and also plays a role in dopamine receptor regulation. GRK3-deficient mice show a lack of olfactory receptor desensitization and altered regulation of the M2 muscarinic airway. GRK3 promoter polymorphisms may also be associated with bipolar disorder. GRK3 contains an N-terminal RGS homology (RH) domain, a central catalytic domain, and C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 346
25068 100059 cd05635 LbH_unknown Uncharacterized proteins, Left-handed parallel beta-Helix (LbH) domain: Members in this group are uncharacterized bacterial proteins containing a LbH domain with multiple turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 101
25069 100060 cd05636 LbH_G1P_TT_C_like Putative glucose-1-phosphate thymidylyltransferase, C-terminal Left-handed parallel beta-Helix (LbH) domain: Proteins in this family show similarity to glucose-1-phosphate adenylyltransferases in that they contain N-terminal catalytic domains that resemble a dinucleotide-binding Rossmann fold and C-terminal LbH fold domains. Members in this family are predicted to be glucose-1-phosphate thymidylyltransferases, which are involved in the dTDP-L-rhamnose biosynthetic pathway. Glucose-1-phosphate thymidylyltransferase catalyzes the synthesis of deoxy-thymidine di-phosphate (dTDP)-L-rhamnose, an important component of the cell wall of many microorganisms. The C-terminal LbH domain contains multiple turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 163
25070 240188 cd05637 SIS_PGI_PMI_2 The members of this protein family contain the SIS (Sugar ISomerase) domain and have both the phosphoglucose isomerase (PGI) and the phosphomannose isomerase (PMI) functions. These functions catalyze the reversible reactions of glucose 6-phosphate to fructose 6-phosphate, and mannose 6-phosphate to fructose 6-phosphate, respectively at an equal rate. This protein contains two SIS domains. This alignment is based on the second SIS domain. 132
25071 193517 cd05638 M42 M42 Peptidases, also known as glutamyl aminopeptidase family. Peptidase M42 family proteins, also known as glutamyl aminopeptidases (GAP), are co-catalytic metallopeptidases, found in archaea and bacteria. They typically bind two zinc or cobalt atoms and include cellulase and endo-1,4-beta-glucanase (endoglucanase). Some of the enzymes exhibit typical aminopeptidase specificity, whereas others are also capable of N-terminal deblocking activity, i.e. hydrolyzing acylated N-terminal residues. GAP removes glutamyl residues from the N-terminus of peptide substrates, but is also effective against aspartyl and, to a lesser extent, seryl residues. Lactococcus lactis glutamyl aminopeptidase (PepA; aminopeptidase A) has high thermal stability and aids growth of the organism in milk. Pyrococcus horikoshii contain a thermostable de-blocking aminopeptidase member of this family, used commercially for N-terminal protein sequencing. 332
25072 349892 cd05639 M18 M18 peptidase aminopeptidase family. Peptidase M18 aminopeptidase family is widely distributed in bacteria and eukaryotes, but only the yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized to date. Yeast aminopeptidase I is active only in its dodecameric form with broad substrate specificity, acting on N-terminal leucine and most other amino acids. In contrast, the mammalian aspartyl aminopeptidase is highly selective for hydrolysis of N-terminal Asp or Glu residues from peptides. These enzymes have two catalytic zinc ions at the active site. 430
25073 349893 cd05640 M28_like M28 Zn-peptidase; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 281
25074 349894 cd05642 M28_like M28 Zn-peptidase-like; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 347
25075 349895 cd05643 M28_like M28 Zn-peptidase-like. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They typically have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This protein subfamily conserves some of the metal-coordinating residues of the typically co-catalytic M28 family which might suggest binding of a single metal ion. 290
25076 381731 cd05644 M28_like M28 Zn-peptidase-like, uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They typically have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. Proteins in this subfamily conserve some of the metal-coordinating residues of the typically co-catalytic M28 family, and appear to bind a single metal (Zn) ion. 340
25077 349897 cd05645 M20_peptidase_T M20 Peptidase T specifically cleaves tripeptides. Peptidase M20 family, Peptidase T (PepT; tripeptide aminopeptidase; tripeptidase) subfamily and similar proteins. PepT acts only on tripeptide substrates, and is thus termed a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein. 400
25078 349898 cd05646 M20_AcylaseI_like M20 Aminoacylase-I like subfamily. Peptidase M20 family, aminoacylase-I like (AcyI-like; acylase I; N-acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) subfamily. Acylase I is involved in the hydrolysis of N-acylated or N-acetylated amino acids (except L-aspartate) and is considered as a potential target of antimicrobial agents. Porcine AcyI is also shown to deacetylate certain quorum-sensing N-acylhomoserine lactones, while the rat enzyme has been implicated in degradation of chemotactic peptides of commensal bacteria. Prokaryotic arginine synthesis usually involves the transfer of an acetyl group to glutamate by ornithine acetyltransferase in order to form ornithine. However, Escherichia coli acetylornithine deacetylase (acetylornithinase, ArgE) (EC 3.5.1.16) catalyzes the deacylation of N2-acetyl-L-ornithine to yield ornithine and acetate. Phylogenetic evidence suggests that the clustering of the arg genes in one continuous sequence pattern arose in an ancestor common to Enterobacteriaceae and Vibrionaceae, where ornithine acetyltransferase was lost and replaced by a deacylase. Elevated levels of serum aminoacylase-1 autoantibody have been seen in the disease progression of chronic hepatitis B (CHB), making ACY1 autoantibody a valuable serum biomarker for discriminating hepatitis B virus (HBV) related liver cirrhosis from CHB. 391
25079 349899 cd05647 M20_DapE_actinobac M20 Peptidase actinobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase. Peptidase M20 family, actinobacterial dapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase (DapE) subfamily. This group is composed of predominantly actinobacterial DapE proteins. DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. It has been shown that DapE is essential for cell growth and proliferation. DapEs have been purified from proteobacteria such as Escherichia coli and Haemophilus influenzae, while genes that encode for DapEs have been sequenced from several bacterial sources such as the actinobacteria Corynebacterium glutamicum and Mycobacterium tuberculosis. DapE is a small, dimeric enzyme (41.6 kDa per subunit) that requires 2 atoms of zinc per molecule of polypeptide for full enzymatic activity. All of the amino acids that function as metal binding ligands are strictly conserved in DapE. 347
25080 349900 cd05649 M20_ArgE_DapE-like M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial and archaeal, and have been inferred by homology as being related to both ArgE and DapE. 381
25081 349901 cd05650 M20_ArgE_DapE-like M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial and archaeal, and have been inferred by homology as being related to both ArgE and DapE. 389
25082 349902 cd05651 M20_ArgE_DapE-like M20 peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are bacterial, and have been inferred by homology as being related to both ArgE and DapE. 341
25083 349903 cd05652 M20_ArgE_DapE-like_fungal M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly fungal, and have been inferred by similarity as being related to both ArgE and DapE. 340
25084 349904 cd05653 M20_ArgE_LysK M20 Peptidase acetylornithine deacetylase/acetyl-lysine deacetylase. Peptidase M20 family, acetylornithine deacetylase (ArgE)/acetyl-lysine deacetylase (LysK) subfamily. Proteins in this subfamily are mainly archaeal with related bacterial species and are deacetylases with specificity for both N-acetyl-ornithine and N-acetyl-lysine found within a lysine biosynthesis operon. ArgE catalyzes the conversion of N-acetylornithine to ornithine, while LysK, a homolog of ArgE, has deacetylating activities for both N-acetyllysine and N-acetylornithine at almost equal efficiency. These results suggest that LysK which may share an ancestor with ArgE functions not only for lysine biosynthesis, but also for arginine biosynthesis in species such as Thermus thermophilus. The substrate specificity of ArgE is quite broad in that several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved. 343
25085 349905 cd05654 M20_ArgE_RocB M20 Peptidase arginine utilization protein, RocB. Peptidase M20 family, ArgE RocB (arginine utilization protein, RocB; arginine degradation protein, RocB) subfamily. This group of proteins is possibly related to acetylornithine deacetylase (ArgE) and may be involved in the arginine and/or ornithine degradation pathway. In Bacillus subtilis, RocB is one of the three genes found in the rocABC operon, which is sigma L dependent and induced by arginine. The function of members of this family is as yet unknown, although they are predicted as deacetylases. 534
25086 349906 cd05656 M42_Frv M42 Peptidase, endoglucanases. Peptidase M42 family, Frv (Frv Operon Protein; Endo-1 4-Beta-Glucanase; Cellulase Protein; Endoglucanase; Endo-1 4-Beta-Glucanase Homolog; Glucanase; EC. 3.2.1.4) subfamily. Frv is a co-catalytic metallopeptidase, found in archaea and bacteria, including Pyrococcus horikoshii tetrahedral shaped phTET1 (DAPPh1; FrvX; PhDAP aminopeptidase; PhTET aminopeptidase; deblocking aminopeptidase), phTET2 (DAPPh2) and phTET3 (DAPPh3), Haloarcula marismortui TET (HmTET) as well as Bacillus subtilis YsdC. All of these exhibit aminopeptidase and deblocking activities. The HmTET is a broad substrate aminopeptidase capable of degrading large peptides. PhTET2, which shares 24% identity with HmTET, is a cobalt-activated peptidase and possibly a deblocking aminopeptidase, assembled as a 12-subunit tetrahedral dodecamer, while PhTET1 can be alternatively assembled as a tetrahedral dodecamer or as an octahedral tetracosameric structure. The active site in such a self-compartmentalized complex is located on the inside such that substrate sizes are limited, indicating function as possible peptide scavengers. PhTET2 cleaves polypeptides by a nonprocessive mechanism, preferring N-terminal hydrophobic or uncharged polar amino acids. Streptococcus pneumoniae PepA (SpPepA) also forms a dodecamer with tetrahedral architecture, and exhibits selective substrate specificity for acidic amino acids with a preference for glutamic acid; its substrate binding S1 pocket contains an Arg that allows electrostatic interactions with the N-terminal acidic residue in the substrate. The YsdC gene is conserved in a number of thermophiles, archaea and pathogenic bacterial species; the closest structural homolog is Thermotoga maritima FrwX (34% identity), which is annotated as either a cellulase or an endoglucanase, and is possibly involved in polysaccharide biosynthesis or degradation. 337
25087 349907 cd05657 M42_glucanase_like M42 Peptidase, endoglucanase-like subfamily. Peptidase M42 family, glucanase (endo-1,4-beta-glucanase or endoglucanase)-like subfamily. Proteins in this subfamily are co-catalytic metallopeptidases, found in archaea and bacteria. They show similarity to cellulase and endo-1,4-beta-glucanase (endoglucanase) which typically bind two zinc or cobalt atoms. Some of the enzymes exhibit typical aminopeptidase specificity, whereas others are also capable of N-terminal deblocking activity, i.e. hydrolyzing acylated N-terminal residues. Many of these enzymes are assembled either as tetrahedral dodecamers or as octahedral tetracosameric structures, with the active site located on the inside such that substrate sizes are limited, indicating function as possible peptide scavengers. 337
25088 349908 cd05658 M18_DAP M18 peptidase aspartyl aminopeptidase. Peptidase M18 family, aspartyl aminopeptidase (DAP; EC 3.4.11.21) subfamily, is widely distributed in bacteria and eukaryotes. DAP cleaves only unblocked N-terminal acidic amino-acid residues. It is a cytosolic enzyme and is highly conserved; for example, the human enzyme has 51% identity to an aspartyl aminopeptidase-like protein in Arabidopsis thaliana. The mammalian DAP is highly selective for hydrolysis of N-terminal aspartate or glutamate residues from peptides. Unlike glutamyl aminopeptidase (M42), DAP does not cleave simple aminoaryl-arylamide substrates. Although there is lack of understanding of the function of this enzyme, it is thought to act in concert with other aminopeptidases to facilitate protein turnover because of their restricted specificities for the N-terminal aspartic and glutamic acid, which cannot be cleaved by any other aminopeptidases. The mammalian aspartyl aminopeptidase is possibly contributing to the catabolism of peptides, including those produced by the proteasome. It may also trim the N-terminus of peptides that are intended for the MHC class I system. In humans, DAP has been implicated in the specific function of converting angiotensin II to the vasoactive angiotensin III within the brain. Saccharomyces cerevisiae aminopeptidase I (Ape1) is involved in protein degradation in vacuoles (the yeast lysosomes) where it is transported by the unique cytoplasm-to-vacuole targeting (Cvt) pathway under vegetative growth conditions and by the autophagy pathway during starvation. Its N-terminal propeptide region, which mediates higher-order complex formation, serves as a scaffolding cargo critical for the assembly of the Cvt vesicle for vacuolar delivery. Pseudomonas aeruginosa aminopeptidase (PaAP) shows that its activity is dependent on Co2+ rather than Zn2+, and is thus a cocatalytic cobalt peptidase rather than a zinc-dependent peptidase. 439
25089 349909 cd05659 M18_API M18 peptidase aminopeptidase I. Peptidase M18 family, aminopeptidase I (vacuolar aminopeptidase I; polypeptidase; Leucine aminopeptidase IV; LAPIV; aminopeptidase III; aminopeptidase yscI; EC 3.4.11.22) subfamily. Aminopeptidase I is widely distributed in bacteria and eukaryotes, but only the yeast enzyme has been characterized to date. It is a vacuolar enzyme, synthesized as a cytosolic proform, and proteolytically matured upon arrival in the vacuole. The pro-aminopeptidase I (proAPI) does not enter the vacuole via the secretory pathway. In non-starved cells, it uses the cytoplasm to vacuole targeting (cvt) pathway and in cells starved for nitrogen, it is targeted to the vacuole via autophagy. Yeast aminopeptidase I is active only in its dodecameric form with broad substrate specificity, acting on all aminoacyl and peptidyl derivatives that contain a free alpha-amino group; this is in contrast to the highly selective M18 mammalian aspartyl aminopeptidase. N-terminal leucine and most other hydrophobic amino acid residues are the best substrates while glycine and charged amino acid residues in P1 position are cleaved much more slowly. This enzyme is strongly and specifically activated by zinc (Zn2+) and chloride (Cl-) ions. 446
25090 349910 cd05660 M28_like_PA M28 Zn-peptidase containing a protease-associated (PA) domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins containing a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. 290
25091 349911 cd05661 M28_like_PA M28 Zn-peptidase containing a PA domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins containing a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. 262
25092 349912 cd05662 M28_like M28 Zn-Peptidases. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins that do not contain a protease-associated (PA) domain. 268
25093 349913 cd05663 M28_like_PA_PDZ_associated M28 Zn-peptidase containing a protease-associated (PA) domain insert and associated with a PDZ domain. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins, many of which contain a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. Proteins in this subfamily are also associated with the PDZ domain, a widespread protein module that has been recruited to serve multiple functions during the course of evolution. 266
25094 349914 cd05664 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, Uncharacterized subfamily of proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 399
25095 349915 cd05665 M20_Acy1_IAAspH M20 Peptidases aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase. Peptidase M20 family, bacterial and archaeal aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase (IAA-Asp hydrolase; IAAspH; IAAH; IAA amidohydrolase; EC 3.5.1.-) subfamily. IAAspH hydrolyzes indole-3-acetyl-N-aspartic acid (IAA or auxin) to indole-3-acetic acid. Genes encoding IAA-amidohydrolases were first cloned from Arabidopsis; ILR1, IAR3, ILL1 and ILL2 encode active IAA-amino acid hydrolases, and three additional amidohydrolase-like genes (ILL3, ILL5, ILL6) have been isolated. In higher plants, the growth regulator indole-3-acetic acid (IAA or auxin) is found both free and conjugated via amide bonding to a variety of amino acids and peptides, and via an ester linkage to carbohydrates. IAA-Asp conjugates are involved in homeostatic control, protection, storing and subsequent use of free IAA. IAA-Asp is also found in some plants as a unique intermediate for entering into IAA non-decarboxylative oxidative pathway. IAA amidohydrolase cleaves the amide bond between the auxin and the conjugated amino acid. Enterobacter agglomerans IAAspH has very strong enzyme activity and substrate specificity towards IAA-Asp, although its substrate affinity is weaker compared to Arabidopsis enzymes of the ILR1 gene family. Enhanced IAA-hydrolase activity has been observed during clubroot disease in Chinese cabbage. 415
25096 349916 cd05666 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 373
25097 349917 cd05667 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins that have been predicted as N-acyl-L-amino acid amidohydrolase (amaA), thermostable carboxypeptidase (cpsA-1, cpsA-2 in Sulfolobus solfataricus) and abgB (aminobenzoyl-glutamate utilization protein B), and generally are involved in the urea cycle and metabolism of amino groups. Aminoacylases 1 (ACY1s) comprise a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 403
25098 349918 cd05668 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 371
25099 349919 cd05669 M20_Acy1_YxeP-like M20 Peptidase aminoacylase-1 YxeP-like proteins, including YxeP, YtnL, YjiB and HipO2. Peptidase M20 family, aminoacylase-1 YxeP-like subfamily including YxeP, YtnL, YjiB and HipO2, most of which have not been well characterized to date. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney suggests a role of the enzyme in amino acid metabolism of these organs. 371
25100 349920 cd05670 M20_Acy1_YkuR-like M20 Peptidase aminoacylase-1 YkuR-like proteins, including YkuR and Ama/HipO/HyuC proteins. Peptidase M20 family, aminoacylase-1 YkuR-like subfamily including YkuR and Ama/HipO/HyuC proteins, most of which have not been well characterized to date. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney suggests a role of the enzyme in amino acid metabolism of these organs. 367
25101 349921 cd05672 M20_ACY1L2-like M20 Peptidase aminoacylase 1-like protein 2-like, amidohydrolase subfamily. Peptidase M20 family, aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase)-like subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. This subfamily includes Staphylococcus aureus antibiotic resistance factor HmrA that has been shown to participate in methicillin resistance mechanisms in vivo in the presence of beta-lactams. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 360
25102 349922 cd05673 M20_Acy1L2_AbgB M20 Peptidase Aminoacylase 1-like protein 2 aminobenzoyl-glutamate utilization protein B subfamily. Peptidase M20 family, ACY1L2 aminobenzoyl-glutamate utilization protein B (AbgB) subfamily. This group contains mostly bacterial amidohydrolases, including gene products of abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate is a natural end product of folate catabolism, and its utilization is initiated by the abg region gene product, AbgT, which enables its uptake into the cell in a concentration-dependent, saturable manner. It is subsequently cleaved by AbgA and AbgB (sometimes referred to as AbgAB). 437
25103 349923 cd05674 M20_yscS M20 Peptidase, carboxypeptidase yscS. Peptidase M20 family, yscS (GlyX-carboxypeptidase, CPS1, carboxypeptidase S, carboxypeptidase a, carboxypeptidase yscS, glycine carboxypeptidase)-like subfamily. This group mostly contains proteins that have been uncharacterized to date, but also includes vacuolar proteins involved in nitrogen metabolism which are essential for use of certain peptides that are sole nitrogen sources. YscS releases a C-terminal amino acid from a peptide that has glycine as the penultimate residue. It is synthesized as one polypeptide chain precursor which yields two active precursor molecules after carbohydrate modification in the secretory pathway. The proteolytically unprocessed forms are associated with the membrane, whereas the mature forms of the enzyme are soluble. Enzymes in this subfamily may also cleave intracellularly generated peptides in order to recycle amino acids for protein synthesis. Also included in this subfamily is peptidase M20 domain containing 1 (PM20D1), which is enriched in UCP1(+) versus UCP1(-) adipocytes (UCP1, uncoupling protein 1). PM20D1 is a bidirectional enzyme in vitro, catalyzing both the condensation of fatty acids and amino acids to generate N-acyl amino acids and the reverse hydrolytic reaction; N-acyl amino acids directly bind mitochondria and function as endogenous uncouplers of UCP1-independent respiration. Mouse studies show that increased circulating PM20D1 augments respiration and increases N-acyl amino acids in blood, and that administration of N-acyl amino acids improves glucose homeostasis and increases energy expenditure. 471
25104 349924 cd05675 M20_yscS_like M20 Peptidase, carboxypeptidase yscS-like. Peptidase M20 family, yscS (GlyX-carboxypeptidase, CPS1, carboxypeptidase S, carboxypeptidase a, carboxypeptidase yscS, glycine carboxypeptidase)-like subfamily. This group contains proteins that have been uncharacterized to date with similarity to vacuolar proteins involved in nitrogen metabolism which are essential for use of certain peptides that are sole nitrogen sources. YscS releases a C-terminal amino acid from a peptide that has glycine as the penultimate residue. It is synthesized as one polypeptide chain precursor which yields two active precursor molecules after carbohydrate modification in the secretory pathway. The proteolytically unprocessed forms are associated with the membrane, whereas the mature forms of the enzyme are soluble. Enzymes in this subfamily may also cleave intracellularly generated peptides in order to recycle amino acids for protein synthesis. 431
25105 349925 cd05676 M20_dipept_like_CNDP M20 cytosolic nonspecific dipeptidases including anserinase and serum carnosinase. Peptidase M20 family, CNDP (cytosolic nonspecific dipeptidase) subfamily including anserinase (Xaa-methyl-His dipeptidase, EC 3.4.13.5), 'serum' carnosinase (beta-alanyl-L-histidine dipeptidase; EC 3.4.13.20), and some uncharacterized proteins. Two genes, CN1 and CN2, coding for proteins that degrade carnosine (beta-alanyl-L-histidine) and homocarnosine (gamma-aminobutyric acid-L-histidine), two naturally occurring dipeptides with potential neuroprotective and neurotransmitter functions, have been identified. CN1 encodes for serum carnosinase and has narrow substrate specificity for Xaa-His dipeptides, where Xaa can be beta-alanine (carnosine), N-methyl beta-alanine, alanine, glycine and gamma-aminobutyric acid (homocarnosine). CN2 corresponds to the cytosolic nonspecific dipeptidase (CNDP; EC 3.4.13.18) and is not limited to Xaa-His dipeptides. CNDP requires Mn(2+) for full activity and does not hydrolyze homocarnosine. Anserinase is a dipeptidase that mainly catalyzes the hydrolysis of N-alpha-acetylhistidine. 467
25106 349926 cd05677 M20_dipept_like_DUG2_type M20 Defective in Utilization of Glutathione-type peptidases containing WD repeats. Peptidase M20 family, Defective in Utilization of Glutathione (DUG2) subfamily. DUG2-type proteins are metallopeptidases containing WD repeats at the N-terminus. DUG2 proteins are involved in the alternative pathway of glutathione (GSH) degradation. GSH, the major low-molecular-weight thiol compound in most eukaryotic cells, is normally degraded through the gamma-glutamyl cycle initiated by gamma-glutamyl transpeptidase. However, a novel pathway for the degradation of GSH has been characterized; it requires the participation of three genes identified in Saccharomyces cerevisiae as "defective in utilization of glutathione" genes including DUG1, DUG2, and DUG3. DUG1 encodes a probable di- or tri-peptidase identified as M20 metallopeptidase, DUG2 gene encodes a protein with a metallopeptidase domain and a large N-terminal WD40 repeat region, while DUG3 encodes a protein with a glutamine amidotransferase domain. Although dipeptides and tripeptides with a normal peptide bond, such as cys-gly or glu-cys-gly, can be hydrolyzed by the DUG1 protein, the presence of an unusual peptide bond, like in GSH, requires the participation of the DUG2 and DUG3 proteins as well. These three proteins form a GSH degradosomal complex. 436
25107 349927 cd05678 M20_dipept_like uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. 466
25108 349928 cd05679 M20_dipept_like uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. 448
25109 349929 cd05680 M20_dipept_like uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. 437
25110 349930 cd05681 M20_dipept_Sso-CP2 uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. This family includes Sso-CP2 from Sulfolobus solfataricus. 429
25111 349931 cd05682 M20_dipept_dapE uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. This family includes dapE (Lpg0809) from Legionella pneumophila. 451
25112 349932 cd05683 M20_peptT_like M20 Peptidase T like enzymes specifically cleave tripeptides. Peptidase M20 family, PeptT (tripeptide aminopeptidase; tripeptidase)-like subfamily. This group includes bacterial tripeptidases as well as predicted tripeptidases. Peptidase T acts only on tripeptide substrates, and is thus called a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein. 368
25113 240189 cd05684 S1_DHX8_helicase S1_DHX8_helicase: The N-terminal S1 domain of human ATP-dependent RNA helicase DHX8, a DEAH (Asp-Glu-Ala-His) box polypeptide. The DEAH-box RNA helicases are thought to play key roles in pre-mRNA splicing and DHX8 facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. DHX8 is also known as HRH1 (human RNA helicase 1) in Homo sapiens and PRP22 in Saccharomyces cerevisiae. 79
25114 240190 cd05685 S1_Tex S1_Tex: The C-terminal S1 domain of a transcription accessory factor called Tex, which has been characterized in Bordetella pertussis and Pseudomonas aeruginosa. The tex gene is essential in Bordetella pertussis and is named for its role in toxin expression. Tex has two functional domains, an N-terminal domain homologous to the Escherichia coli maltose repression protein, which is a poorly defined transcriptional factor, and a C-terminal S1 RNA-binding domain. Tex is found in prokaryotes, eukaryotes, and archaea. 68
25115 240191 cd05686 S1_pNO40 S1_pNO40: pNO40, S1-like RNA-binding domain. pNO40 is a nucleolar protein of unknown function with an N-terminal S1 RNA binding domain, a CCHC type zinc finger, and clusters of basic amino acids representing a potential nucleolar targeting signal. pNO40 was identified through a yeast two-hybrid interaction screen of a human kidney cDNA library using the pinin (pnn) protein as bait. pNO40 is thought to play a role in ribosome maturation and/or biogenesis. 73
25116 240192 cd05687 S1_RPS1_repeat_ec1_hs1 S1_RPS1_repeat_ec1_hs1: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 1 of the Escherichia coli and Homo sapiens RPS1 (ec1 and hs1, respectively). Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 70
25117 240193 cd05688 S1_RPS1_repeat_ec3 S1_RPS1_repeat_ec3: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 3 (ec3) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 68
25118 240194 cd05689 S1_RPS1_repeat_ec4 S1_RPS1_repeat_ec4: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 4 (ec4) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 72
25119 240195 cd05690 S1_RPS1_repeat_ec5 S1_RPS1_repeat_ec5: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 5 (ec5) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 69
25120 240196 cd05691 S1_RPS1_repeat_ec6 S1_RPS1_repeat_ec6: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 6 (ec6) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 73
25121 240197 cd05692 S1_RPS1_repeat_hs4 S1_RPS1_repeat_hs4: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 4 (hs4) of the H. sapiens RPS1 homolog. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog. 69
25122 240198 cd05693 S1_Rrp5_repeat_hs1_sc1 S1_Rrp5_repeat_hs1_sc1: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 1 (hs1) and S. cerevisiae S1 repeat 1 (sc1). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 100
25123 240199 cd05694 S1_Rrp5_repeat_hs2_sc2 S1_Rrp5_repeat_hs2_sc2: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 2 (hs2) and S. cerevisiae S1 repeat 2 (sc2). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 74
25124 240200 cd05695 S1_Rrp5_repeat_hs3 S1_Rrp5_repeat_hs3: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 3 (hs3). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 66
25125 240201 cd05696 S1_Rrp5_repeat_hs4 S1_Rrp5_repeat_hs4: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 4 (hs4). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 71
25126 240202 cd05697 S1_Rrp5_repeat_hs5 S1_Rrp5_repeat_hs5: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 5 (hs5) and S. cerevisiae S1 repeat 5 (sc5). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 69
25127 240203 cd05698 S1_Rrp5_repeat_hs6_sc5 S1_Rrp5_repeat_hs6_sc5: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 6 (hs6) and S. cerevisiae S1 repeat 5 (sc5). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 70
25128 240204 cd05699 S1_Rrp5_repeat_hs7 S1_Rrp5_repeat_hs7: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 7 (hs7). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 72
25129 240205 cd05700 S1_Rrp5_repeat_hs9 S1_Rrp5_repeat_hs9: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes Homo sapiens S1 repeat 9 (hs9). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 65
25130 240206 cd05701 S1_Rrp5_repeat_hs10 S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 69
25131 240207 cd05702 S1_Rrp5_repeat_hs11_sc8 S1_Rrp5_repeat_hs11_sc8: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 11 (hs11) and S. cerevisiae S1 repeat 8 (sc8). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 70
25132 240208 cd05703 S1_Rrp5_repeat_hs12_sc9 S1_Rrp5_repeat_hs12_sc9: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 12 (hs12) and S. cerevisiae S1 repeat 9 (sc9). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 73
25133 240209 cd05704 S1_Rrp5_repeat_hs13 S1_Rrp5_repeat_hs13: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 13 (hs13). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 72
25134 240210 cd05705 S1_Rrp5_repeat_hs14 S1_Rrp5_repeat_hs14: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 14 (hs14). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 74
25135 240211 cd05706 S1_Rrp5_repeat_sc10 S1_Rrp5_repeat_sc10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 10 (sc10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 73
25136 240212 cd05707 S1_Rrp5_repeat_sc11 S1_Rrp5_repeat_sc11: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 11 (sc11). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 68
25137 240213 cd05708 S1_Rrp5_repeat_sc12 S1_Rrp5_repeat_sc12: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 12 (sc12). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. 77
25138 100078 cd05709 S2P-M50 Site-2 protease (S2P) class of zinc metalloproteases (MEROPS family M50) cleaves transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of this family use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. The domain core structure appears to contain at least three transmembrane helices with a catalytic zinc atom coordinated by three conserved residues contained within the consensus sequence HExxH, together with a conserved aspartate residue. The S2P/M50 family of RIP proteases is widely distributed; in eukaryotic cells, they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum (ER) stress responses. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of sterol regulatory element-binding proteins (SREBPs) from membranes of the ER. These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD superfamily. Prokaryotic S2P/M50 homologs have been shown to regulate stress responses, sporulation, cell division, and cell differentiation. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses, and in Bacillus subtilis, the S2P homolog SpoIVFB is involved in the pro-sigmaK pathway of spore formation. Some of the subfamilies within this hierarchy contain one or two PDZ domain insertions, with putative regulatory roles, such as the inhibition of substrate cleavage as seen by the RseP PDZ domain. 180
25139 240214 cd05710 SIS_1 A subgroup of the SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. 120
25140 409376 cd05711 IgC2_D2_LILR_KIR_like Second immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors, Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains, for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. 
GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules. 90
25141 409377 cd05712 IgV_CD33 Immunoglobulin Variable (IgV) domain at the N-terminus of CD33 and related Siglecs (sialic acid-binding Ig-like lectins). The members here are composed of the immunoglobulin (Ig) domain at the N-terminus of Cluster of Differentiation (CD) 33 and related Siglecs (sialic acid-binding Ig-like lectins). Siglec refers to a structurally related protein family that specifically recognizes sialic acid in oligosaccharide chains of glycoproteins and glycolipids. Siglecs are type I transmembrane proteins, organized as an extracellular module composed of Ig-like domains, an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains, followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG, the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. 119
25142 409378 cd05713 IgV_MOG_like Immunoglobulin (Ig)-like domain of myelin oligodendrocyte glycoprotein (MOG). The members here are composed of the immunoglobulin (Ig)-like domain of myelin oligodendrocyte glycoprotein (MOG). MOG, a minor component of the myelin sheath, is an important CNS-specific autoantigen, linked to the pathogenesis of multiple sclerosis (MS) and experimental autoimmune encephalomyelitis (EAE). It is a transmembrane protein having an extracellular Ig domain. MOG is expressed in the CNS on the outermost lamellae of the myelin sheath, and on the surface of oligodendrocytes, and may participate in the completion, compaction, and/or maintenance of myelin. This group also includes butyrophilin (BTN). BTN is the most abundant protein in bovine milk-fat globule membrane (MFGM). 114
25143 409379 cd05714 Ig_CSPGs_LP_like Immunoglobulin (Ig)-like domain of chondroitin sulfate proteoglycans (CSPGs), human cartilage link protein (LP), and similar domains. The members here are composed of the immunoglobulin (Ig)-like domain similar to that found in chondroitin sulfate proteoglycans (CSPGs) and human cartilage link protein (LP). Included in this group are the CSPGs aggrecan, versican, and neurocan. In CSPGs, this Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggrecan and versican have a wide distribution in connective tissue and extracellular matrices. Neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. There is considerable evidence that HA-binding CSPGs are involved in developmental processes in the central nervous system. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 123
25144 409380 cd05715 IgV_P0-like Immunoglobulin (Ig)-like domain of protein zero (P0) and similar proteins. The members here are composed of the immunoglobulin (Ig) domain of protein zero (P0), a myelin membrane adhesion molecule. P0 accounts for over 50% of the total protein in peripheral nervous system (PNS) myelin. P0 is a single-pass transmembrane glycoprotein with a highly basic intracellular domain and an extracellular Ig domain. The extracellular domain of P0 (P0-ED) is similar to the Ig variable domain, carrying one acceptor sequence for N-linked glycosylation. P0 plays a role in membrane adhesion in the spiral wraps of the myelin sheath. The intracellular domain is thought to mediate membrane apposition of the cytoplasmic faces and may, through electrostatic interactions, interact directly with lipid headgroups. It is thought that homophilic interactions of the P0 extracellular domain mediate membrane juxtaposition in the extracellular space of PNS myelin. This group also contains the Ig domain of sodium channel subunit beta-2 (SCN2B), and of epithelial V-like antigen 1 (EVA). EVA, also known as myelin protein zero-like 2, is an adhesion molecule, which may play a role in structural organization of the thymus and early lymphocyte development. SCN2B subunits play a role in determining sodium channel density and function in neurons, and in control of electrical excitability in the brain. 117
25145 409381 cd05716 IgV_pIgR_like Immunoglobulin (Ig)-like domain in the polymeric Ig receptor (pIgR) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain in the polymeric Ig receptor (pIgR) and similar proteins. pIgR delivers dimeric IgA and pentameric IgM to mucosal secretions. Polymeric immunoglobulin (pIgs) are the first defense against pathogens and toxins. IgA and IgM can form polymers via an 18-residue extension at their C-termini referred to as the tailpiece. pIgR transports pIgs across mucosal epithelia into mucosal secretions. Human pIgR is a glycosylated type I transmembrane protein, comprised of a 620-residue extracellular region, a 23-residue transmembrane region, and a 103-residue cytoplasmic tail. The extracellular region contains five domains that share sequence similarity with Ig variable (v) regions. This group also contains the Ig-like extracellular domains of other receptors such as NK cell receptor Nkp44 and myeloid receptors, among others. 100
25146 409382 cd05717 IgV_1_Necl_like First (N-terminal) immunoglobulin (Ig)-like domain of the nectin-like molecules; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 (also known as cell adhesion molecule 3 (CADM3)), Necl-2 (CADM1), Necl-3 (CADM2), and similar proteins. At least five nectin-like molecules have been identified (Necl-1 to Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1, Necl-2, and Necl-3 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue, and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Necl-3 accumulates in central and peripheral nervous system tissue and has been shown to selectively interact with oligodendrocytes. This group also contains Class-I MHC-restricted T-cell-associated molecule (CRTAM), whose expression pattern is consistent with its expression in Class-I MHC-restricted T-cells. 94
25147 409383 cd05718 IgV_1_PVR_like First immunoglobulin variable (IgV) domain of poliovirus receptor (PVR, also known as CD155 and necl-5), and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and nectin-like protein 5 (necl-5)). Poliovirus (PV) binds to its cellular receptor (PVR/CD155) to initiate infection. CD155 is a membrane-anchored, single-span glycoprotein; its extracellular region has three Ig-like domains. There are four different isotypes of CD155 (referred to as alpha, beta, gamma, and delta), that result from alternate splicing of the CD155 mRNA, and have identical extracellular domains. CD155-beta and CD155-gamma are secreted; CD155-alpha and CD155-delta are membrane-bound and function as PV receptors. The virus recognition site is contained in the amino-terminal domain, D1. Having the virus attachment site on the receptor distal from the plasma membrane may be important for successful initiation of infection of cells by the virus. CD155 binds in the poliovirus "canyon" with a footprint similar to that of the intercellular adhesion molecule-1 receptor on human rhinoviruses. This group also includes the first Ig-like domain of nectin-1 (also known as poliovirus receptor related protein (PVRL) 1; CD111), nectin-3 (also known as PVRL3), nectin-4 (also known as PVRL4; LNIR receptor), and DNAX accessory molecule 1 (DNAM-1; CD226). 113
25148 409384 cd05719 IgC1_2_PVR_like Second immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and Necl-5), and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and nectin-like protein 5 (Necl-5)) and similar proteins. Poliovirus (PV) binds to its cellular receptor (PVR/CD155) to initiate infection. CD155 is a membrane-anchored, single-span glycoprotein; its extracellular region has three Ig-like domains. There are four different isotypes of CD155 (referred to as alpha, beta, gamma, and delta) that result from alternate splicing of the CD155 mRNA and have identical extracellular domains. CD155-beta and CD155-gamma are secreted, while CD155-alpha and CD155-delta are membrane-bound and function as PV receptors. The virus recognition site is contained in the amino-terminal domain, D1. Having the virus attachment site on the receptor distal from the plasma membrane may be important for successful initiation of infection of cells by the virus. CD155 binds in the poliovirus "canyon" and has a footprint similar to that of the intercellular adhesion molecule-1 receptor on human rhinoviruses. This group also includes the second Ig-like domain of nectin-1, also known as poliovirus receptor related protein (PVRL) 1 or CD111. 96
25149 409385 cd05720 IgV_CD8_alpha Immunoglobulin (Ig)-like variable (V) domain of Cluster of Differentiation (CD) 8 alpha chain. The members here are composed of the immunoglobulin (Ig)-like variable domain of the Cluster of Differentiation (CD) 8 alpha. The CD8 glycoprotein plays an essential role in the control of T-cell selection, maturation, and the T-cell receptor (TCR)-mediated response to peptide antigen. CD8 is comprised of alpha and beta subunits and is expressed as either an alpha/alpha or alpha/beta dimer. Both dimeric isoforms can serve as a coreceptor for T cell activation and differentiation, however they have distinct physiological roles, different cellular distributions, unique binding partners, etc. Each CD8 subunit is comprised of an extracellular domain containing a V-type Ig-like domain, a single pass transmembrane portion, and a short intracellular domain. The Ig domain of CD8 alpha binds to antibodies. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 110
25150 409386 cd05721 IgV_CTLA-4 Immunoglobulin Variable (IgV) domain of cytotoxic T lymphocyte-associated antigen 4 (CTLA-4). The members here are composed of the variable(v)-type immunoglobulin (Ig) domain found in cytotoxic T lymphocyte-associated antigen 4 (CTLA-4). CTLA-4 is involved in the regulation of T cell response, acting as an inhibitor of intracellular signaling. CTLA-4 is similar to CD28, a T cell co-receptor protein that recognizes the B7 proteins (CD80 and CD86). CD28 binding of the B7 proteins occurs after the presentation of antigen to the T cell receptor (TCR) via the peptide-MHC complex on the surface of an antigen presenting cell (APC). CTLA-4 also binds the B7 molecules with a higher affinity than does CD28. The B7/CTLA-4 interaction generates inhibitory signals down-regulating the response, and may prevent T cell activation by weak TCR signals. CD28 and CTLA-4 then elicit opposing signals in the regulation of T cell responsiveness and homeostasis. T cell activation leads to increased CTLA-4 gene expression and trafficking of CTLA-4 protein to the cell surface. CTLA-4 is not detected on the T-cell surface until 24 hours after activation. Covalent dimerization of CTLA-4 has been shown to be required for its high binding avidity, although each CTLA-4 monomer contains a binding site for CD80 and CD86. 115
25151 409387 cd05722 IgI_1_Neogenin_like First immunoglobulin (Ig)-like domain in neogenin, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain in neogenin and related proteins. Neogenin is a cell surface protein which is expressed in the developing nervous system of vertebrate embryos in the growing nerve cells. It is also expressed in other embryonic tissues and may play a general role in developmental processes such as cell migration, cell-cell recognition, and tissue growth regulation. Included in this group is the tumor suppressor protein DCC which is deleted in colorectal carcinoma. DCC and neogenin each have four Ig-like domains followed by six fibronectin type III domains, a transmembrane domain, and an intracellular domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 97
25152 409388 cd05723 IgI_4_Neogenin_like Fourth immunoglobulin (Ig)-like domain in neogenin, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain in neogenin and related proteins. Neogenin is a cell surface protein which is expressed in the developing nervous system of vertebrate embryos in the growing nerve cells. It is also expressed in other embryonic tissues, and may play a general role in developmental processes such as cell migration, cell-cell recognition, and tissue growth regulation. Included in this group is the tumor suppressor protein DCC which is deleted in colorectal carcinoma. DCC and neogenin each have four Ig-like domains followed by six fibronectin type III domains, a transmembrane domain, and an intracellular domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 84
25153 409389 cd05724 IgI_2_Robo Second immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of the Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, and Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit-2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 87
25154 409390 cd05725 IgI_3_Robo Third immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 83
25155 409391 cd05726 IgI_4_Robo Fourth immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 98
25156 409392 cd05727 Ig2_Contactin-2-like Second Ig domain of the neural cell adhesion molecule contactin-2, and similar domains. The members here are composed of the second Ig domain of the neural cell adhesion molecule contactin-2. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (also called TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. The first four Ig domains form the intermolecular binding fragment which arranges as a compact U-shaped module by contacts between Ig domains 1 and 4, and domains 2 and 3. It has been proposed that a linear zipper-like array forms, from contactin-2 molecules alternatively provided by the two apposed membranes. 88
25157 143205 cd05728 Ig4_Contactin-2-like Fourth Ig domain of the neural cell adhesion molecule contactin-2, and similar domains. The members here are composed of the fourth Ig domain of the neural cell adhesion molecule contactin-2. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (also called TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. The first four Ig domains form the intermolecular binding fragment which arranges as a compact U-shaped module by contacts between Ig domains 1 and 4, and domains 2 and 3. It has been proposed that a linear zipper-like array forms, from contactin-2 molecules alternatively provided by the two apposed membranes. 85
25158 409393 cd05729 IgI_2_FGFR_like Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor, and similar domains; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor. FGF receptors bind FGF signaling polypeptides. FGFs participate in multiple processes such as morphogenesis, development, and angiogenesis. FGFs bind to four FGF receptor tyrosine kinases (FGFR1, FGFR2, FGFR3, FGFR4). Receptor diversity is controlled by alternative splicing producing splice variants with different ligand binding characteristics and different expression patterns. FGFRs have an extracellular region comprised of three Ig-like domains, a single transmembrane helix, and an intracellular tyrosine kinase domain. Ligand binding and specificity reside in the Ig-like domains 2 and 3, and the linker region that connects these two. FGFR activation and signaling depend on FGF-induced dimerization, a process involving cell surface heparin or heparan sulfate proteoglycans. This group also contains fibroblast growth factor (FGF) receptor-like 1 (FGFRL1). FGFRL1 does not have a protein tyrosine kinase domain at its C-terminus; neither does its cytoplasmic domain appear to interact with a signaling partner. It has been suggested that FGFRL1 may not have any direct signaling function, but instead acts as a decoy receptor trapping FGFs and preventing them from binding other receptors. 95
25159 143207 cd05730 IgI_3_NCAM-1 Third immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule 1 (NCAM-1); member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule (NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-non-NCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions) through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. 95
25160 409394 cd05731 Ig3_L1-CAM_like Third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM), and similar domains. The members here are composed of the third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, and spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM and human neurofascin. 83
25161 409395 cd05732 IgI_NCAM-1_like Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 1 (NCAM-1) and similar proteins. The members here are composed of the fourth immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule (NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-non-NCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. Also included in this group is NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE). One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains. 96
25162 409396 cd05733 IgI_L1-CAM_like Immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains NrCAM [Ng(neuronglia)CAM-related cell adhesion molecule], which is primarily expressed in the nervous system, and human neurofascin. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 94
25163 409397 cd05734 Ig_DSCAM Immunoglobulin (Ig)-like domain of Down Syndrome Cell Adhesion molecule (DSCAM). The members here are composed of the immunoglobulin (Ig)-like domain of Down Syndrome Cell Adhesion molecule (DSCAM). DSCAM is a cell adhesion molecule expressed largely in the developing nervous system. The gene encoding DSCAM is located at human chromosome 21q22, the locus associated with the intellectual disability phenotype of Down Syndrome. DSCAM is predicted to be the largest member of the IG superfamily. It has been demonstrated that DSCAM can mediate cation-independent homophilic intercellular adhesion. 97
25164 409398 cd05735 Ig_DSCAM Immunoglobulin (Ig) domain of Down Syndrome Cell Adhesion molecule (DSCAM). The members here are composed of the immunoglobulin (Ig) domain of Down Syndrome Cell Adhesion molecule (DSCAM). DSCAM is a cell adhesion molecule expressed largely in the developing nervous system. The gene encoding DSCAM is located at human chromosome 21q22, the locus associated with the intellectual disability phenotype of Down Syndrome. DSCAM is predicted to be the largest member of the IG superfamily. It has been demonstrated that DSCAM can mediate cation-independent homophilic intercellular adhesion. 101
25165 409399 cd05736 IgI_2_Follistatin_like Second immunoglobulin (Ig)-like domain of a Follistatin-related protein 5, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in human Follistatin-related protein 5 (FSTL5) and a follistatin-like molecule encoded by the CNS-related Mahya gene. Mahya genes have been retained in certain Bilaterian branches during evolution. They are conserved in Hymenoptera and Deuterostomes, but are absent from other metazoan species such as fruit fly and nematode. Mahya proteins are secretory, with a follistatin-like domain (Kazal-type serine/threonine protease inhibitor domain and EF-hand calcium-binding domain), two Ig-like domains, and a novel C-terminal domain. Mahya may be involved in learning and memory and in processing of sensory information in Hymenoptera and vertebrates. Follistatin is a secreted, multidomain protein that binds activins with high affinity and antagonizes their signaling. 93
25166 319300 cd05737 IgI_Myomesin_like_C C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein (also known as myomesin-2). Myomesin and M-protein are both structural proteins localized to the M-band, a transverse structure in the center of the sarcomere, and are candidates for M-band bridges. Both proteins are modular, consisting mainly of repetitive Ig-like and fibronectin type III (FnIII) domains. Myomesin is expressed in all types of vertebrate striated muscle; M-protein has a muscle-type specific expression pattern. Myomesin is present in both slow and fast fibers; M-protein is present only in fast fibers. It has been suggested that myomesin acts as a molecular spring with alternative splicing as a means of modifying its elasticity. 92
25167 409400 cd05738 IgI_2_RPTP_IIa_LAR_like Second immunoglobulin (Ig)-like domain of the receptor protein tyrosine phosphatase (RPTP)-F; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in the receptor protein tyrosine phosphatase (RPTP)-F, also known as LAR. LAR belongs to the RPTP type IIa subfamily. Members of this subfamily are cell adhesion molecule-like proteins involved in central nervous system (CNS) development. They have large extracellular portions comprised of multiple Ig-like domains and two to nine fibronectin type III (FNIII) domains and a cytoplasmic portion having two tandem phosphatase domains. 91
25168 409401 cd05739 IgI_3_RPTP_IIa_LAR_like Third immunoglobulin (Ig)-like domain of the receptor protein tyrosine phosphatase (RPTP)-F (also known as LAR), type IIa; member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain found in the receptor protein tyrosine phosphatase (RPTP)-F, also known as LAR. LAR belongs to the RPTP type IIa subfamily. Members of this subfamily are cell adhesion molecule-like proteins involved in central nervous system (CNS) development. They have large extracellular portions comprised of multiple Ig-like domains and two to nine fibronectin type III (FNIII) domains and a cytoplasmic portion having two tandem phosphatase domains. Included in this group is Drosophila LAR (DLAR). 82
25169 409402 cd05740 IgI_hCEACAM_2_4_6_like Immunoglobulin (Ig)-like domain of human carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) domains 2, 4, and 6, and similar domains. The members here are composed of the second, fourth, and sixth immunoglobulin (Ig)-like domains in human carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily. The CEA family is a group of anchored or secreted glycoproteins expressed by epithelial cells, leukocytes, endothelial cells, and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions; it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surface. 89
25170 409403 cd05741 IgV_CEACAM_like Immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and related domains. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions: it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two (D1, D4) or four (D1-D4) Ig-like domains on the cell surface. This family corresponds to the D1 Ig-like domain. Also belonging to this group is the N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84-like family. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. SLAM family proteins are organized as an extracellular domain having two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region having Tyr-based motifs. The extracellular domain is organized as a membrane-distal Ig variable (IgV) domain that is responsible for ligand recognition and a membrane-proximal truncated Ig constant-2 (IgC2) domain. 102
25171 409404 cd05742 IgI_VEGFR_like Immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor (R) and similar proteins; member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor (R) and related proteins. The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members: VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1) and VEGFR-3 (Flt-4). VEGF-A interacts with both VEGFR-1 and VEGFR-2. VEGFR-1 binds VEGF most strongly; VEGFR-2 binds more weakly. VEGFR-3 appears not to bind VEGF, but binds other members of the VEGF family (VEGF-C and -D). VEGFRs bind VEGFs with high affinity through the Ig-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic, and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth. This group also contains alpha-type platelet-derived growth factor receptor precursor (PDGFR)-alpha (CD140a), and PDGFR-beta (CD140b). PDGFRs alpha and beta have an extracellular component with five Ig-like domains, a transmembrane segment, and a cytoplasmic portion that has protein tyrosine kinase activity. 102
25172 143220 cd05743 Ig_Perlecan_like Immunoglobulin (Ig)-like domain of the human basement membrane heparan sulfate proteoglycan perlecan and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of the human basement membrane heparan sulfate proteoglycan perlecan, also known as HSPG2, and similar proteins. Perlecan consists of five domains: domain I has three putative heparan sulfate attachment sites, domain II has four LDL receptor-like repeats, and one Ig-like repeat, domain III resembles the short arm of laminin chains, domain IV has multiple Ig-like repeats (21 repeats in human perlecan), and domain V resembles the globular G domain of the laminin A chain and internal repeats of EGF. Perlecan may participate in a variety of biological functions including cell binding, LDL-metabolism, basement membrane assembly and selective permeability, calcium binding, and growth- and neurite-promoting activities. 78
25173 409405 cd05744 IgI_Myotilin_C_like Immunoglobulin (Ig)-like domain of myotilin, palladin, and myopalladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in myotilin, palladin, and myopalladin. Myotilin, palladin, and myopalladin function as scaffolds that regulate actin organization. Myotilin and myopalladin are most abundant in skeletal and cardiac muscle; palladin is ubiquitously expressed in the organs of developing vertebrates and plays a key role in cellular morphogenesis. The three family members each interact with specific molecular partners with all three binding to alpha-actinin; In addition, palladin also binds to vasodilator-stimulated phosphoprotein (VASP) and ezrin, myotilin binds to filamin and actin, and myopalladin also binds to nebulin and cardiac ankyrin repeat protein (CARP). This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 91
25174 143222 cd05745 Ig3_Peroxidasin Third immunoglobulin (Ig)-like domain of peroxidasin. The members here are composed of the third immunoglobulin (Ig)-like domain in peroxidasin. Peroxidasin has a peroxidase domain and interacting extracellular motifs containing four Ig-like domains. It has been suggested that peroxidasin is secreted and has functions related to the stabilization of the extracellular matrix. It may play a part in various other important processes such as removal and destruction of cells which have undergone programmed cell death and protection of the organism against non-self. 74
25175 143223 cd05746 Ig4_Peroxidasin Fourth immunoglobulin (Ig)-like domain of peroxidasin. The members here are composed of the fourth immunoglobulin (Ig)-like domain in peroxidasin. Peroxidasin has a peroxidase domain and interacting extracellular motifs containing four Ig-like domains. It has been suggested that peroxidasin is secreted, and has functions related to the stabilization of the extracellular matrix. It may play a part in various other important processes such as removal and destruction of cells which have undergone programmed cell death and protection of the organism against non-self. 69
25176 143224 cd05747 IgI_Titin_like Immunoglobulin (Ig)-like domain of human titin C terminus and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain from the C-terminus of human titin x and similar proteins. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is gigantic; depending on isoform composition it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone and appears to function similar to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 92
25177 409406 cd05748 Ig_Titin_like Immunoglobulin (Ig)-like domain of titin and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain found in titin-like proteins and similar proteins. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is a giant protein; depending on isoform composition, it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone. It appears to function similarly to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. Within the sarcomere, titin is also attached to or is associated with myosin binding protein C (MyBP-C). MyBP-C appears to contribute to the generation of passive tension by titin and like titin has repeated Ig-like and FN-III domains. Also included in this group are worm twitchin and insect projectin, thick filament proteins of invertebrate muscle which also have repeated Ig-like and FN-III domains. 82
25178 409407 cd05749 IgI_2_Axl_Tyro3_like Second immunoglobulin (Ig)-like domain of Axl/Tyro3 family receptor tyrosine kinases (RTKs); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain in the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin-types III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3, and Mer are widely expressed in adult tissues, though they show higher expression in the brain, lymphatic and vascular systems, and testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation. 82
25179 409408 cd05750 Ig_Pro_neuregulin Immunoglobulin (Ig)-like domain in neuregulins. The members here are composed of the immunoglobulin (Ig)-like domain in neuregulins (NRGs). NRGs are signaling molecules which participate in cell-cell interactions in the nervous system, breast, heart, and other organ systems, and are implicated in the pathology of diseases including schizophrenia, multiple sclerosis, and breast cancer. There are four members of the neuregulin gene family (NRG-1, NRG-2, NRG-3, and NRG-4). The NRG-1 protein binds to and activates the tyrosine kinase receptors ErbB3 and ErbB4, initiating signaling cascades. The other NRG proteins bind one or the other or both of these ErbBs. NRG-1 has multiple functions: in the brain it regulates various processes such as radial glia formation and neuronal migration, dendritic development, and expression of neurotransmitter receptors, while in the peripheral nervous system NRG-1 regulates processes such as target cell differentiation, and Schwann cell survival. There are many NRG-1 isoforms which arise from the alternative splicing of mRNA. Less is known of the functions of the other NRGs. NRG-2 and NRG-3 are expressed predominantly in the nervous system. NRG-2 is expressed by motor neurons and terminal Schwann cells, and is concentrated near synaptic sites and may be a signal that regulates synaptic differentiation. NRG-4 has been shown to direct pancreatic islet cell development towards the delta-cell lineage. 92
25180 409409 cd05751 IgC2_D1_LILR_KIR_like First immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs) and Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules. 88
25181 409410 cd05752 Ig1_FcgammaR_like First immunoglobulin (Ig)-like domain of Fcgamma-receptors (FcgammaRs), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Fcgamma-receptors (FcgammaRs). Interactions between IgG and FcgammaR are important to the initiation of cellular and humoral response. IgG binding to FcgammaR leads to a cascade of signals and ultimately to functions such as antibody-dependent-cellular-cytotoxicity (ADCC), endocytosis, phagocytosis, release of inflammatory mediators, etc. FcgammaR has two Ig-like domains. This group also contains FcepsilonRI which binds IgE with high affinity. 79
25182 409411 cd05753 Ig2_FcgammaR_like Second immunoglobulin (Ig)-like domain of Fcgamma-receptors (FcgammaRs), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of Fcgamma-receptors (FcgammaRs). Interactions between IgG and FcgammaR are important to the initiation of cellular and humoral response. IgG binding to FcgammaR leads to a cascade of signals and ultimately to functions such as antibody-dependent-cellular-cytotoxicity (ADCC), endocytosis, phagocytosis, release of inflammatory mediators, etc. FcgammaR has two Ig-like domains. This group also contains FcepsilonRI which binds IgE with high affinity. 83
25183 409412 cd05754 IgI_Perlecan_like Immunoglobulin (Ig)-like domain found in Perlecan and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain found in Perlecan. Perlecan is a large multi-domain heparan sulfate proteoglycan, important in tissue development and organogenesis. Perlecan can be represented as 5 major portions; its fourth major portion (domain IV) is a tandem repeat of immunoglobulin-like domains (Ig2-Ig15) which can vary in size due to alternative splicing. Perlecan binds many cellular and extracellular ligands. Its domain IV region has many binding sites. Some of these have been mapped at the level of individual Ig-like domains, including a site restricted to the Ig5 domain for heparin/sulfatide, a site restricted to the Ig3 domain for nidogen-1 and nidogen-2, a site restricted to Ig4-5 for fibronectin, and sites restricted to Ig2 and to Ig13-15 for fibulin-2. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 85
25184 409413 cd05755 IgC2_2_ICAM-1_like Second immunoglobulin (Ig)-like C2-set domain of intercellular cell adhesion molecule 1 (ICAM-1), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of intercellular cell adhesion molecule 1 (ICAM-1; also known as domain of cluster of differentiation (CD) 54) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. ICAM-1 may be involved in organ targeted tumor metastasis. The interaction of ICAM-1 with leukocyte function-associated antigen-1 (LFA-1) plays a part in leukocyte-endothelial cell recognition. This group also contains ICAM-2 which also interacts with LFA-1. Transmigration of immature dendritic cells across resting endothelium is dependent on the interaction of ICAM-2 with as yet unidentified ligand(s) on the dendritic cells. ICAM-1 has five Ig-like domains and ICAM-2 has two. ICAM-1 may also act as host receptor for viruses and parasites. The structures of this group show that the second Ig domain lacks a D strand and thus belongs to the C2-set of the IgSF. 101
25185 409414 cd05756 Ig1_IL1R_like First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R; also known as cluster of differentiation (CD) 121). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. 96
25186 409415 cd05757 Ig2_IL1R-like Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R; also known as cluster of differentiation (CD) 121). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains IL1R-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2. 92
25187 319310 cd05758 IgI_5_KIRREL3-like Fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (also known as Neph1), Kirrel2 (also known as Neph3), and Drosophila RST (also known as irregular chiasm C-roughest) protein. These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development. 98
25188 409416 cd05759 IgI_2_KIRREL3-like Second immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (Neph1), Kirrel2 (Neph3), and Drosophila RST (irregular chiasm C-roughest) protein. These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development. 98
25189 409417 cd05760 Ig2_PTK7 Second immunoglobulin (Ig)-like domain of protein tyrosine kinase (PTK) 7. The members here are composed of the second immunoglobulin (Ig)-like domain in protein tyrosine kinase (PTK) 7, also known as CCK4. PTK7 is a subfamily of the receptor protein tyrosine kinase family, and is referred to as an RPTK-like molecule. RPTKs transduce extracellular signals across the cell membrane and play important roles in regulating cell proliferation, migration, and differentiation. PTK7 is organized as an extracellular portion having seven Ig-like domains, a single transmembrane region, and a cytoplasmic tyrosine kinase-like domain. PTK7 is considered a pseudokinase as it has several unusual residues in some of the highly conserved tyrosine kinase (TK) motifs; it is predicted to lack TK activity. PTK7 may function as a cell-adhesion molecule. PTK7 mRNA is expressed at high levels in placenta, melanocytes, liver, lung, pancreas, and kidney. PTK7 is overexpressed in several cancers, including melanoma and colon cancer lines. 95
25190 409418 cd05761 IgI_2_Necl-1-4 Second immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 - Necl-4; member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 (also known as cell adhesion molecule 3 or CADM3), Necl-2 (also known as CADM1), Necl-3 (also known as CADM2) and Necl-4 (also known as CADM4). These nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 and Necl-2 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues, and is a putative tumour suppressor gene, which is downregulated in aggressive neuroblastoma. Necl-3 has been shown to accumulate in tissues of the central and peripheral nervous system, where it is expressed in ependymal cells and myelinated axons. It is observed at the interface between the axon shaft and the myelin sheath. Necl-4 is expressed on Schwann cells, and plays a key part in initiating peripheral nervous system (PNS) myelination. Necl-4 participates in cell-cell adhesion and is proposed to play a role in tumor suppression. 102
25191 409419 cd05762 IgI_8_hMLCK_like Eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK) and similar protein; member of the I-set of IgSF domains. The members here are composed of the eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK) and similar proteins. Myosin light-chain kinase (MLCK) is a key regulator of different forms of cell motility involving actin and myosin II. Agonist stimulation of smooth muscle cells increases cytosolic Ca2+ which binds calmodulin. This Ca2+-calmodulin complex in turn binds to and activates MLCK. Activated MLCK leads to the phosphorylation of the 20 kDa myosin regulatory light chain (RLC) of myosin II and the stimulation of actin-activated myosin MgATPase activity. MLCK is widely present in vertebrate tissues; it phosphorylates the 20 kDa RLC of both smooth and nonmuscle myosin II. Phosphorylation leads to the activation of the myosin motor domain and altered structural properties of myosin II. In smooth muscle, MLCK is involved in initiating contraction. In nonmuscle cells, MLCK may participate in cell division and cell motility; it has been suggested MLCK plays a role in cardiomyocyte differentiation and contraction through regulation of nonmuscle myosin II. 99
25192 409420 cd05763 IgI_LRIG1-like Immunoglobulin (Ig)-like ectodomain of the LRIG1 (Leucine-rich Repeats And Immunoglobulin-like Domains Protein 1) and similar proteins; member of the I-set of IgSF domains. The members here are composed of a subgroup of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. The ectodomain of LRIG1 has two distinct regions: the proposed 15 LRRs and three Ig-like domains closer to the membrane. LRIG1 has been reported to interact with many receptor tyrosine kinases, GDNF/c-Ret, E-cadherin, JAK/STAT, c-Met, and the EGFR family signaling systems. Immunoglobulin Superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The structure of the LRIG1 extracellular Ig domain lacks a C" strand and thus is better described as a member of the I-set of IgSF domains. 91
25193 409421 cd05764 IgI_SALM5_like Immunoglobulin domain of human Synaptic Adhesion-Like Molecule 5 (SALM5) and similar proteins; member of the I-set of IgSF domains. This group contains the immunoglobulin domain of human Synaptic Adhesion-Like Molecule 5 (SALM5) and similar proteins. The SALM (for synaptic adhesion-like molecules; also known as Lrfn for leucine-rich repeat and fibronectin type III domain containing) family of adhesion molecules consists of five known members: SALM1/Lrfn2, SALM2/Lrfn1, SALM3/Lrfn4, SALM4/Lrfn3, and SALM5/Lrfn5. SALMs share a similar domain structure, containing leucine-rich repeats (LRRs), an immunoglobulin (Ig) domain, and a fibronectin III (FNIII) domain, followed by a transmembrane domain and a C-terminal PDZ-binding motif. SALM5 is implicated in autism spectrum disorders (ASDs) and schizophrenia, and induces presynaptic differentiation in contacting axons. SALM5 interacts with the Ig domains of LAR (Leukocyte common Antigen-Related) family receptor protein tyrosine phosphatases (LAR-RPTPs; LAR, PTPdelta, and PTPsigma). In addition, PTPdelta is implicated in ASDs, ADHD, bipolar disorder, and restless leg syndrome. Studies have shown that LAR-RPTPs are novel and splicing-dependent presynaptic ligands for SALM5, and that they mediate SALM5-dependent presynaptic differentiation. Furthermore, SALM5 maintains AMPA receptor (AMPAR)-mediated excitatory synaptic transmission through mechanisms involving the interaction of SALM5 with LAR-RPTPs. This group belongs to the I-set of immunoglobulin superfamily (IgSF) domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 88
25194 409422 cd05765 IgI_3_WFIKKN-like Third immunoglobulin-like domain of the human WFIKKN (WAP, follistatin, immunoglobulin, Kunitz and NTR domain-containing protein), and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin-like domain of the human WFIKKN (WAP, follistatin, immunoglobulin, Kunitz and NTR domain-containing protein) and similar proteins. WFIKKN is a secreted protein that consists of multiple types of protease inhibitory modules, including two tandem Kunitz-type protease inhibitor-domains. The Ig superfamily is a heterogenous group of proteins built on a common fold comprised of a sandwich of two beta sheets. Members of the Ig superfamily are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 95
25195 409423 cd05766 IgC1_MHC_II_beta Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II beta chain. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes and they are also expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 96
25196 409424 cd05767 IgC1_MHC_II_alpha Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of the major histocompatibility complex (MHC) class II alpha chain. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are also expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 95
25197 409425 cd05768 IgC1_CH3_IgAGD_CH4_IgAEM CH3 domain (third constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, gamma, and delta chains, and CH4 domain (fourth constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, epsilon, and mu chains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the third and fourth immunoglobulin constant domain (IgC) of alpha, delta, gamma and alpha, epsilon, and mu heavy chains, respectively. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. 105
25198 409426 cd05769 IgC1_TCR_beta T cell receptor (TCR) beta chain constant immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the T cell receptor (TCR) beta chain constant immunoglobulin domain. TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the beta chain. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The antigen binding site is formed by the variable domains of the alpha and beta chains, located at the N-terminus of each chain. Alpha/beta TCRs recognize antigens differently from gamma/delta TCRs. 116
25199 409427 cd05770 IgC1_beta2m Class I major histocompatibility complex (MHC) beta-2-microglobulin; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain in beta-2-microglobulin (beta2m). Beta2m is the non-covalently bound light chain of the human class I major histocompatibility complex (MHC-I). Beta2m is structured as a beta-sandwich domain composed of two facing beta-sheets (four stranded and three stranded), that is typical of the C-type immunoglobulin superfamily. This structure is stabilized by an intramolecular disulfide bridge connecting two Cys residues in the facing beta-sheets. In vivo, MHC-I continuously exposes beta2m on the cell surface, where it may be released to plasmatic fluids, transported to the kidneys, degraded, and finally excreted. 94
25200 409428 cd05771 IgC1_Tapasin_R Tapasin-R immunoglobulin-like domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain on Tapasin-R. Tapasin is a V-C1 (variable-constant) immunoglobulin superfamily molecule present in the endoplasmic reticulum (ER), where it links MHC class I molecules to the transporter associated with antigen processing (TAP). Tapasin-R is a tapasin-related protein that contains similar structural motifs to Tapasin, with some marked differences, especially in the V domain, transmembrane and cytoplasmic regions. The majority of Tapasin-R is located within the ER; however, there may be some expression of Tapasin-R at the cell surface. Tapasin-R lacks an obvious ER retention signal. 100
25201 409429 cd05772 IgC1_SIRP_domain_2 Signal-regulatory protein (SIRP) immunoglobulin-like domain 2; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in Signal-Regulatory Protein (SIRP), domain 2 (C1 repeat 1). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions, but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly on NK cells, but are also on many myeloid cells. Their extracellular region contains three Immunoglobulin superfamily domains, a single V-set and two C1-set IgSF domains. Their cytoplasmic tails contain either ITIMs or transmembrane regions that have positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma. SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs, but also binds CD47, but with much less affinity. 102
25202 143250 cd05773 IgC1_hNephrin_like Immunoglobulin-like domain of human nephrin and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain in human nephrin and similar proteins. Nephrin is an integral component of the slit diaphragm and is a central component of the glomerular ultrafilter. Nephrin plays a structural role and has a role in signaling. Nephrin is a transmembrane protein having a short intracellular portion, an extracellular portion comprised of eight Ig-like domains, and one fibronectin type III-like domain. The extracellular portions of nephrin from neighboring foot processes of separate podocyte cells may interact with each other, and in association with other components of the slit diaphragm form a porous molecular sieve within the slit pore. The intracellular portion of nephrin is associated with linker proteins, which connect nephrin to the actin cytoskeleton. The intracellular portion is tyrosine phosphorylated, and mediates signaling from the slit diaphragm into the podocytes. 109
25203 409430 cd05774 IgV_CEACAM_D1 First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM). The members here are composed of the immunoglobulin (Ig)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) proteins. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells, and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions: it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two (D1, D4) or four (D1-D4) Ig-like domains on the cell surface. 105
25204 409431 cd05775 IgV_CD2_like_N N-terminal immunoglobulin (Ig)-like domain of T-cell surface antigen CD2, and similar domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain (or domain 1) of T-cell surface antigen Clusters of Differentiation (CD) 2 and similar proteins. CD2 is a T-cell specific surface glycoprotein and is critically important for mediating adhesion between T cells and antigen-presenting cells or between cytolytic T cells and target cells. CD2 is located on chromosome 1 at 1p13 in humans and on chromosome 3 in mice. CD2 contains an extracellular domain with two Ig-like domains, a single transmembrane segment, and a cytoplasmic region rich in proline and basic residues. 98
25205 99819 cd05776 DNA_polB_alpha_exo inactive DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase alpha, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase alpha. DNA polymerase alpha is a family-B DNA polymerase with a catalytic subunit that contains a DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (delta and epsilon are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase alpha is almost exclusively required for the initiation of DNA replication and the priming of Okazaki fragments during elongation. It associates with DNA primase and is the only enzyme able to start DNA synthesis de novo. The catalytic subunit contains both polymerase and 3'-5' exonuclease domains, but only exhibits polymerase activity. The 3'-5' exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, without the four conserved acidic residues that are crucial for metal binding and catalysis. This explains why, in most organisms, no specific repair role, other than checkpoint control, has been assigned to this enzyme. The exonuclease domain may have a structural role. 234
25206 99820 cd05777 DNA_polB_delta_exo DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase delta, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase delta. DNA polymerase delta is a family-B DNA polymerase with a catalytic subunit that contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (alpha and epsilon are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase delta is the enzyme responsible for both elongation and maturation of Okazaki fragments on the lagging strand. It is also implicated in mismatch repair (MMR) and base excision repair (BER). The catalytic subunit displays both polymerase and 3'-5' exonuclease activities. The exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues necessary for metal binding and catalysis. The exonuclease domain of family B polymerase also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. 230
25207 99821 cd05778 DNA_polB_zeta_exo inactive DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase zeta, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase zeta. DNA polymerase zeta is a family-B DNA polymerase which is distantly related to DNA polymerase delta. It plays a major role in translesion replication and the production of either spontaneous or induced mutations. In addition, DNA polymerase zeta also appears to be involved in somatic hypermutability in B lymphocytes, an important element for the production of high affinity antibodies in response to an antigen. The catalytic subunit contains both polymerase and 3'-5' exonuclease domains, but only exhibits polymerase activity. The DnaQ-like 3'-5' exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, without the four conserved acidic residues that are crucial for metal binding and catalysis. 231
25208 99822 cd05779 DNA_polB_epsilon_exo DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase epsilon, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase epsilon. DNA polymerase epsilon is a family-B DNA polymerase with a catalytic subunit that contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (alpha and delta are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase epsilon plays a role in elongating the leading strand during DNA replication. It is also involved in DNA repair. The catalytic subunit contains both polymerase and 3'-5' exonuclease activities. The N-terminal exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. DNA polymerase epsilon also carries a unique large C-terminal domain with an unknown function. Phylogenetic analyses indicate that it is orthologous to the archaeal DNA polymerase B3 rather than to the eukaryotic alpha, delta, or zeta polymerases. The exonuclease domain of family-B polymerases contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation 204
25209 99823 cd05780 DNA_polB_Kod1_like_exo DEDDy 3'-5' exonuclease domain of Pyrococcus kodakaraensis Kod1 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of archaeal family-B DNA polymerases with similarity to Pyrococcus kodakaraensis Kod1, including polymerases from Desulfurococcus (D. Tok Pol) and Thermococcus gorgonarius (Tgo Pol). Kod1, D. Tok Pol, and Tgo Pol are thermostable enzymes that exhibit both polymerase and 3'-5' exonuclease activities. They are family-B DNA polymerases. Their amino termini harbor a DEDDy-type DnaQ-like 3'-5' exonuclease domain that contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family B polymerases contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Members of this subfamily show similarity to eukaryotic DNA polymerases involved in DNA replication. Some archaea possess multiple family-B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family-B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family-B DNA polymerases. 195
25210 99824 cd05781 DNA_polB_B3_exo DEDDy 3'-5' exonuclease domain of Sulfurisphaera ohwakuensis DNA polymerase B3 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of archaeal proteins with similarity to Sulfurisphaera ohwakuensis DNA polymerase B3. B3 is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. B3 exhibits both polymerase and 3'-5' exonuclease activities. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family B polymerases also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaea possess multiple family-B DNA polymerases. B3 is mainly found in crenarchaea. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B-DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family-B DNA polymerases. 188
25211 99825 cd05782 DNA_polB_like1_exo Uncharacterized bacterial subgroup of the DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. A subfamily of the 3'-5' exonuclease domain of family-B DNA polymerases. This subfamily is composed of uncharacterized bacterial family-B DNA polymerases. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are involved in metal binding and catalysis. The exonuclease domain of family-B DNA polymerases has a fundamental role in proofreading activity. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair. 208
25212 99826 cd05783 DNA_polB_B1_exo DEDDy 3'-5' exonuclease domain of Sulfolobus solfataricus DNA polymerase B1 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of Sulfolobus solfataricus DNA polymerase B1 and similar archaeal proteins. B1 is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. B1 displays thermostable polymerase and 3'-5' exonuclease activities. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family-B polymerases also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Family-B DNA polymerases from thermophilic archaea are unique in that they are able to recognize the presence of uracil in the template strand, leading to the stalling of DNA synthesis. This is an additional safeguard mechanism against increased levels of deaminated bases during genome duplication at high temperatures. S. solfataricus B1 also interacts with DNA polymerase Y and may contribute to genome stability mechanisms. 204
25213 99827 cd05784 DNA_polB_II_exo DEDDy 3'-5' exonuclease domain of Escherichia coli DNA polymerase II and similar bacterial family-B DNA polymerases. The 3'-5' exonuclease domain of Escherichia coli DNA polymerase II (Pol II) and similar bacterial proteins. Pol II is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain has a fundamental role in the proofreading activity of polII. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Pol II is involved in a variety of cellular activities, such as the repair of DNA damaged by UV irradiation or oxidation. It plays a pivotal role in replication-restart, a process that bypasses DNA damage in an error-free manner. Pol II is also involved in lagging strand synthesis. 193
25214 99828 cd05785 DNA_polB_like2_exo Uncharacterized bacterial subgroup of the DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. A subfamily of the 3'-5' exonuclease domain of family-B DNA polymerases. This subfamily is composed of uncharacterized bacterial family-B DNA polymerases. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are involved in metal binding and catalysis. The exonuclease domain of family-B DNA polymerases has a fundamental role in proofreading activity. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair. 207
25215 100061 cd05787 LbH_eIF2B_epsilon eIF-2B epsilon subunit, central Left-handed parallel beta-Helix (LbH) domain: eIF-2B is a eukaryotic translation initiator, a guanine nucleotide exchange factor (GEF) composed of five different subunits (alpha, beta, gamma, delta and epsilon). eIF2B is important for regenerating GTP-bound eIF2 during the initiation process. This event is obligatory for eIF2 to bind initiator methionyl-tRNA, forming the ternary initiation complex. The eIF-2B epsilon subunit contains an N-terminal domain that resembles a dinucleotide-binding Rossmann fold, a central LbH domain containing 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), and a C-terminal domain of unknown function that is present in eIF-4 gamma, eIF-5, and eIF-2B epsilon. The epsilon and gamma subunits form the catalytic subcomplex of eIF-2B, which binds eIF2 and catalyzes guanine nucleotide exchange. 79
25216 240215 cd05789 S1_Rrp4 S1_Rrp4: Rrp4 S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp4 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure". 86
25217 240216 cd05790 S1_Rrp40 S1_Rrp40: Rrp40 S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp40 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure". 86
25218 240217 cd05791 S1_CSL4 S1_CSL4: CSL4, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. ScCSL4 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In S. cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure". 92
25219 240218 cd05792 S1_eIF1AD_like S1_eIF1AD_like: eukaryotic translation initiation factor 1A domain containing protein (eIF1AD)-like, S1-like RNA-binding domain. eIF1AD is also known as MGC11102 protein. Little is known about the function of eIF1AD. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins, including translation initiation factor IF1A (also referred to as eIF1A in eukaryotes). eIF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1A also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. 78
25220 240219 cd05793 S1_IF1A S1_IF1A: Translation initiation factor IF1A, also referred to as eIF1A in eukaryotes and aIF1A in archaea, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1A also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea. 77
25221 240220 cd05794 S1_EF-P_repeat_2 S1_EF-P_repeat_2: Translation elongation factor P (EF-P), S1-like RNA-binding domain, repeat 1. EF-P stimulates the peptidyltransferase activity in the prokaryotic 70S ribosome. EF-P enhances the synthesis of certain dipeptides with N-formylmethionyl-tRNA and puromycin in vitro. EF-P binds to both the 30S and 50S ribosomal subunits. EF-P binds near the streptomycin binding site of the 16S rRNA in the 30S subunit. EF-P interacts with domains 2 and 5 of the 23S rRNA. The L16 ribosomal protein of the 50S or its N-terminal fragment are required for EF-P mediated peptide bond synthesis, whereas L11, L15, and L7/L12 are not required in this reaction, suggesting that EF-P may function at a different ribosomal site than most other translation factors. EF-P is essential for cell viability and is required for protein synthesis. EF-P is mainly present in bacteria. The EF-P homologs in archaea and eukaryotes are the initiation factors aIF5A and eIF5A, respectively. EF-P has 3 domains (domains I, II, and III). Domains II and III are S1-like domains. This CD includes domain III (the second S1 domain of EF-P). Domains II and III have structural homology to the eIF5A domain C, suggesting that domains II and III evolved by duplication. 56
25222 240221 cd05795 Ribosomal_P0_L10e Ribosomal protein L10 family, P0 and L10e subfamily; composed of eukaryotic 60S ribosomal protein P0 and the archaeal P0 homolog, L10e. P0 or L10e forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). These eukaryotic and archaeal P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2. 175
25223 240222 cd05796 Ribosomal_P0_like Ribosomal protein L10 family, P0-like protein subfamily; composed of uncharacterized eukaryotic proteins with similarity to the 60S ribosomal protein P0, including the Saccharomyces cerevisiae protein called mRNA turnover protein 4 (MRT4). MRT4 may be involved in mRNA decay. P0 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. It occupies the L7/L12 stalk of the ribosome. The stalk is known to contain the binding site for elongation factors EF-G and EF-Tu; however, there is disagreement as to whether or not P0 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, P0 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2. 163
25224 240223 cd05797 Ribosomal_L10 Ribosomal protein L10 family, L10 subfamily; composed of bacterial 50S ribosomal protein and eukaryotic mitochondrial 39S ribosomal protein, L10. L10 occupies the L7/L12 stalk of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). These bacterial and eukaryotic sequences have no additional C-terminal domain, present in other eukaryotic and archaeal orthologs. 157
25225 240224 cd05798 SIS_TAL_PGI SIS_TAL_PGI: Transaldolase (TAL)/ Phosphoglucose isomerase (PGI). This group represents the SIS (Sugar ISomerase) PGI domain, of a multifunctional protein (TAL-PGI) having both TAL and PGI activities. TAL_PGI contains an N-terminal TAL domain and a C-terminal PGI domain. TAL catalyzes the reversible conversion of sedoheptulose-7-phosphate (S7P) and glyceraldehyde-3-phosphate (G3P), to fructose-6-phosphate (F6P) and erythrose-4-phosphate (E4P). PGI catalyzes the reversible isomerization of F6P to glucose-6-phosphate (G6P). It has been suggested for Gluconobacter oxydans TAL_PGI that this enzyme generates E4P and G6P directly from S7P and G3P. G. oxydans TAL_PGI contributes to increased xylitol production from D-arabitol. As xylitol is an alternative natural sweetener to sucrose, the microbial conversion of D-arabitol to xylitol is of interest to food and pharmaceutical industries. 129
25226 100092 cd05799 PGM2 This CD includes PGM2 (phosphoglucomutase 2) and PGM2L1 (phosphoglucomutase 2-like 1). The mammalian PGM2 is thought to be a phosphopentomutase that catalyzes the conversion of the nucleoside breakdown products, ribose-1-phosphate and deoxyribose-1-phosphate to the corresponding 5-phosphopentoses. PGM2L1 is thought to catalyze the 1,3-bisphosphoglycerate-dependent synthesis of glucose 1,6-bisphosphate and other aldose-bisphosphates that serve as cofactors for several sugar phosphomutases and possibly also as regulators of glycolytic enzymes. PGM2 and PGM2L1 belong to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 487
25227 100093 cd05800 PGM_like2 This PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily and is found in both archaea and bacteria. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four structural domains (subdomains) with a centrally located active site formed by four loops, one from each subdomain. All four subdomains are included in this alignment model. 461
25228 100094 cd05801 PGM_like3 This bacterial PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 522
25229 100095 cd05802 GlmM GlmM is a bacterial phosphoglucosamine mutase (PNGM) that belongs to the alpha-D-phosphohexomutase superfamily. It is required for the interconversion of glucosamine-6-phosphate and glucosamine-1-phosphate in the biosynthetic pathway of UDP-N-acetylglucosamine, an essential precursor to components of the cell envelope. In order to be active, GlmM must be phosphorylated, which can occur via autophosphorylation or by the Ser/Thr kinase StkP. GlmM functions in a classical ping-pong bi-bi mechanism with glucosamine-1,6-diphosphate as an intermediate. Other members of the alpha-D-phosphohexomutase superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 434
25230 100096 cd05803 PGM_like4 This PGM-like (phosphoglucomutase-like) domain is located C-terminal to a mannose-1-phosphate guanyltransferase domain in a protein of unknown function that is found in both prokaryotes and eukaryotes. This domain belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 445
25231 100115 cd05804 StaR_like StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure. 355
25232 100097 cd05805 MPG1_transferase GTP-mannose-1-phosphate guanyltransferase (MPG1 transferase), also known as GDP-mannose pyrophosphorylase, is a bifunctional enzyme with both phosphomannose isomerase (PMI) activity and GDP-mannose phosphorylase (GMP) activity. The protein contains an N-terminal NTP transferase domain, an L-beta-H domain, and a C-terminal PGM-like domain that belongs to the alpha-D-phosphohexomutase superfamily. This subfamily is limited to bacteria and archaea. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this group appear to lack conserved residues necessary for metal binding and catalytic activity. Other members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model. 441
25233 99881 cd05806 CBM20_laforin Laforin protein tyrosine phosphatase, N-terminal CBM20 (carbohydrate-binding module, family 20) domain. Laforin, encoded by the EPM2A gene, is a dual-specificity phosphatase that dephosphorylates complex carbohydrates. Mutations in the gene encoding laforin result in Lafora disease, a fatal autosomal recessive neurodegenerative disorder characterized by the presence of intracellular deposits of insoluble, abnormally branched, glycogen-like polymers, known as Lafora bodies, in neurons, muscle, liver, and other tissues. The molecular basis for the formation of these Lafora bodies is unknown. Laforin is one of the only phosphatases that contains a carbohydrate-binding module. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 112
25234 99882 cd05807 CBM20_CGTase CGTase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. CGTase, also known as cyclodextrin glycosyltransferase and cyclodextrin glucanotransferase, catalyzes the formation of various cyclodextrins (alpha-1,4-glucans) from starch. CGTase has, in addition to its C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 13 and an IPT domain of unknown function. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 101
25235 99883 cd05808 CBM20_alpha_amylase Alpha-amylase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. This domain is found in several bacterial and fungal alpha-amylases including the maltopentaose-forming amylases (G5-amylases). Most alpha-amylases have, in addition to the C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 13, which hydrolyzes internal alpha-1,4-glucosidic bonds in starch and related saccharides, yielding maltotriose and maltose. Two types of soluble substrates are used by alpha-amylases including long substrates (e.g. amylose) and short substrates (e.g. maltodextrins or maltooligosaccharides). The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 95
25236 99884 cd05809 CBM20_beta_amylase Beta-amylase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Beta-amylase has, in addition to its C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 14, which hydrolyzes the alpha-1,4-glucosidic bonds of starch, yielding beta-maltose from the nonreducing end of the substrate. Beta-amylase is found in both plants and microorganisms, however the plant members lack a C-terminal CBM20 domain and are not included in this group. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 99
25237 99885 cd05810 CBM20_alpha_MTH Glucan 1,4-alpha-maltotetraohydrolase (alpha-MTH), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Alpha-MTH, also known as maltotetraose-forming exo-amylase or G4-amylase, is an exo-amylase found in bacteria that degrades starch from its non-reducing end. Most alpha-MTHs have, in addition to the C-terminal CBM20 domain, an N-terminal glycosyl hydrolase family 13 catalytic domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 97
25238 99886 cd05811 CBM20_glucoamylase Glucoamylase (glucan 1,4-alpha-glucosidase), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Glucoamylases are inverting, exo-acting starch hydrolases that hydrolyze starch and related polysaccharides by releasing the nonreducing end glucose. They are mainly active on alpha-1,4-glycosidic bonds but also have some activity towards 1,6-glycosidic bonds occurring in natural oligosaccharides. The ability of glucoamylases to cleave 1,6-glycosidic bonds is called "debranching activity" and is of importance in industrial applications, where complete degradation of starch to glucose is needed. Most glucoamylases are multidomain proteins containing an N-terminal catalytic domain, a C-terminal CBM20 domain, and a highly O-glycosylated linker region that connects the two. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 106
25239 99887 cd05813 CBM20_genethonin_1 Genethonin-1, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Genethonin-1 is a human skeletal muscle protein with no known function. It contains a C-terminal CBM20 domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 95
25240 99888 cd05814 CBM20_Prei4 Prei4, N-terminal CBM20 (carbohydrate-binding module, family 20) domain. Preimplantation protein 4 (Prei4) is a protein of unknown function that is expressed during mouse preimplantation embryogenesis. In addition to the N-terminal CBM20 domain, Prei4 contains a C-terminal glycerophosphoryl diester phosphodiesterase (GDPD) domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 120
25241 99889 cd05815 CBM20_DPE2_repeat1 Disproportionating enzyme 2 (DPE2), N-terminal CBM20 (carbohydrate-binding module, family 20) domain, repeat 1. DPE2 is a transglucosidase that is essential for the cytosolic metabolism of maltose in plant leaves at night. Maltose is an intermediate on the pathway from starch to sucrose and DPE2 is thought to metabolize the maltose that is exported from the chloroplast. DPE2 has two N-terminal CBM20 starch binding domains as well as a C-terminal amylomaltase (4-alpha-glucanotransferase) catalytic domain. DPE1, the plastid version of this enzyme, has a transglucosidase domain that is similar to that of DPE2 but lacks the N-terminal carbohydrate-binding domains. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 101
25242 99890 cd05816 CBM20_DPE2_repeat2 Disproportionating enzyme 2 (DPE2), N-terminal CBM20 (carbohydrate-binding module, family 20) domain, repeat 2. DPE2 is a transglucosidase that is essential for the cytosolic metabolism of maltose in plant leaves at night. Maltose is an intermediate on the pathway from starch to sucrose and DPE2 is thought to metabolize the maltose that is exported from the chloroplast. DPE2 has two N-terminal CBM20 domains as well as a C-terminal amylomaltase (4-alpha-glucanotransferase) catalytic domain. DPE1, the plastid version of this enzyme, has a transglucosidase domain that is similar to that of DPE2 but lacks the N-terminal CBM20 domains. Included in this group are DPE2-like proteins from Dictyostelium, Entamoeba, and Bacteroides. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 99
25243 99891 cd05817 CBM20_DSP Dual-specificity phosphatase (DSP), N-terminal CBM20 (carbohydrate-binding module, family 20) domain. This CBM20 domain is located at the N-terminus of a protein tyrosine phosphatase of unknown function found in slime molds and ciliated protozoans. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 100
25244 99892 cd05818 CBM20_water_dikinase Phosphoglucan water dikinase (also known as alpha-glucan water dikinase), N-terminal CBM20 (carbohydrate-binding module, family 20) domain. This domain is found in the chloroplast-encoded phosphoglucan water dikinase, one of two enzymes involved in the phosphorylation of plant starches. In addition to the CBM20 domain, phosphoglucan water dikinase contains a C-terminal pyruvate binding domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 92
25245 271320 cd05819 NHL NHL repeat unit of beta-propeller proteins. The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. 269
25246 99893 cd05820 CBM20_novamyl Novamyl (also known as acarviose transferase, ATase, maltogenic alpha-amylase, glucan 1,4-alpha-maltohydrolase, and AcbD), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Novamyl has a five-domain structure similar to that of cyclodextrin glucanotransferase (CGTase). Novamyl has a substrate-binding surface with an open groove which can accommodate both cyclodextrins and linear substrates. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 103
25247 100113 cd05821 TLP_Transthyretin Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates. TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein. Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity. A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by binding to apolipoprotein AI (apoA-I). TTR is able to proteolytically process apoA-I by cleaving its C-terminus; therefore TTR has protease activity in addition to its function in protein transport. 121
25248 100114 cd05822 TLP_HIUase HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family. HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site. In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location. Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences. HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. 112
25249 100062 cd05824 LbH_M1P_guanylylT_C Mannose-1-phosphate guanylyltransferase, C-terminal Left-handed parallel beta helix (LbH) domain: Mannose-1-phosphate guanylyltransferase is also known as GDP-mannose pyrophosphorylase. It catalyzes the synthesis of GDP-mannose from GTP and mannose-1-phosphate, and is involved in the maintenance of cell wall integrity and glycosylation. Similar to ADP-glucose pyrophosphorylase, it contains an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain, presumably with 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. 80
25250 100063 cd05825 LbH_wcaF_like wcaF-like: This group is composed of the protein product of the E. coli wcaF gene and similar proteins. WcaF is part of the gene cluster responsible for the biosynthesis of the extracellular polysaccharide colanic acid. The wcaF protein is predicted to contain a left-handed parallel beta-helix (LbH) domain encoded by imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. Many are trimeric in their active forms. 107
25251 320675 cd05826 Sortase_B Sortase domain found in class B sortases. Class B sortases are membrane-bound cysteine transpeptidases broadly distributed in Gram-positive bacteria (mainly present in Firmicutes and Actinobacteria). They can have radically distinct functions. Some members of this group attach haemoproteins to the peptidoglycan of the cell wall, while others assemble pili, which are multi-subunit hair-like fibres that extend from the cell surface to promote microbial adhesion and biofilm formation. In transpeptidation reaction, the surface protein substrate is cleaved at a conserved cell wall-sorting signal (Class B sortases normally recognize the consensus NP[Q/K][T/S][N/G/S][D/A] motif), and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical sortase B protein from Staphylococcus aureus (named Sa-SrtB) cleaves surface protein precursors between threonine and asparagine at a conserved NPQTN motif with subsequent covalent linkage to pentaglycine cross-bridges. It is required for anchoring the heme-iron binding surface protein IsdC to the cell wall envelope. SrtB contains an N-terminal hydrophobic region that functions as a signal peptide/transmembrane domain. At the C terminus, it contains an essential cysteine residue within the catalytic TLXTC signature sequence, where X is usually a serine. Genes encoding SrtB and its targets are generally clustered in the same locus. The prototypical class B sortase involved in pilus biogenesis is pilus-specific sortase C2 from Streptococcus pyogenes (named Sp-SrtC2) that anchors a surface protein containing a QVPTGV motif to the cell wall, as well as polymerizes the major pilin subunit Tee3/FctA and attaches the minor tip pilin Cpa. The linkage of Cpa to Tee3 by SrtC2 requires the VPPTG motif in the cell wall-sorting signal of Cpa. The family also includes SrtB enzymes from Bacillus anthracis (named Ba-SrtB) and Clostridium difficile (named Cd-SrtB). Ba-SrtB is thought to recognize the NPKTG motif, and attaches surface proteins to meso-diaminopimelic acid (mDAP) cross-bridges. Cd-SrtB does not play an essential role in pathogenesis. It cleaves short [SP]PXTG motif-containing peptides between the threonine and glycine residues and then covalently anchors the threonine residue to a nucleophile such as glycine or mDAP, but not to the peptidoglycan of C. difficile, suggesting a novel association of sortase activity with cyclic diGMP (c-diGMP)-mediated regulation to control levels of cell wall anchoring and secretion of putative adhesion molecules. 170
25252 320676 cd05827 Sortase_C Sortase domain found in class C sortases. Class C sortases are membrane-bound cysteine transpeptidases broadly distributed in Gram-positive bacteria (mainly present in Firmicutes and Actinobacteria). They function as pilin polymerases responsible for the assembly of pili, which are multi-subunit hair-like fibres that extend from the cell surface to promote microbial adhesion and biofilm formation. First, one or more class C sortases form the long thin shaft of the pilus through linking together pilin subunits via isopeptide bonds. The base of the pilus is then anchored to the cell wall by a housekeeping sortase or, in some cases, the class C sortase itself. Depending upon the organism both the number and type of sortase enzymes involved varies, and in some cases, accessory factors appear to be needed. In three-component spaA pilus from Corynebacterium diphtheriae, the prototypical class C sortase (named Cd-SrtA) catalyzes polymerization of the SpaA-type pilus, consisting of the shaft pilin SpaA, tip pilin SpaC and minor pilin SpaB. The pilus shaft is then attached to the cell wall by a housekeeping class E sortase, Cd-SrtF. In the absence of Cd-SrtF, Cd-SrtA attaches the pilus to the cell wall, albeit at a reduced rate. Cd-SrtA can recognize two distinct sorting signals (LPLTG in SpaA and SpaC, and LAFTG in SpaB) and it can employ lysine residues that originate from different proteins (either Lys190 within the pilin motif of SpaA or Lys139 in SpaB). However, Cd-SrtA is not able to polymerize the major pilin subunit SpaH, even though it contains LPLTG motif. In two-component pili of prototypical Bacillus cereus, the class C sortase (named Bc-SrtD) cleaves related sorting signals within a major pilin protein BcpA (LPVTG) and a minor tip pilin BcpB (IPNTG), and catalyzes a transpeptidation that joins the threonine residues in each signal to the side-chain of Lys162 in BcpA (located within a pilin motif). Unlike the SpaA pilus in C. diphtheriae, in B. cereus Bc-SrtD is unable to covalently attach the pilus to the cell wall without the help of the housekeeping sortase. 131
25253 320677 cd05828 Sortase_D_1 Sortase domain found in subfamily 1 of the class D family of sortases. Class D sortases are cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). The prototypical subfamily 1 of class D sortase from Bacillus anthracis (named Ba-SrtC) covalently attaches proteins bearing a noncanonical LPNTA sorting signal, such as the BasH and BasI proteins, to the peptidoglycan of the cell wall that facilitate sporulation. BasH is exclusively anchored to the forespore cell wall envelope, while BasI is attached to the diaminopimelic acid moiety of the peptidoglycan of predivisional cells. Ba-SrtC lacks the N-terminal signal peptide and membrane anchor. The family also includes many class D sortase homologs from Gram-negative bacteria, but the functions of these enzymes are unknown. 127
25254 320678 cd05829 Sortase_F Sortase domain found in the class F family of sortases. Class F sortases are mainly present in Actinobacteria, Chlorobacteria and Firmicutes. Their functions are largely unknown. 144
25255 320679 cd05830 Sortase_E Sortase domain found in the class E family of sortases. Class E sortases are membrane-bound cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Actinobacteria). Genes encoding class A and E sortases are never found in the same organism, and similar to class A sortases, the genes encoding class E sortases are not positioned adjacent to genes encoding potential protein substrates, suggesting a housekeeping sortase function of class E sortases in some high G + C Gram-positive bacteria. Similar to class A sortase, class E sortases are capable of anchoring a large number of functionally distinct surface proteins containing a cell wall sorting signal to an amino group located on the bacterial cell wall. They recognize an LAXTG sorting signal, instead of the canonical LPXTG motif processed by class A sortases. The prototypical class E sortase from Corynebacterium diphtheriae (named Cd-SrtF) is a non-polymerization sortase that is not required for pilus polymerization, and proceeds to complete the assembly process by anchoring the polymer to the cell wall peptidoglycan. Moreover, in Streptomyces coelicolor, one or both of Staphylococcus aureus SrtA homologs may function as class E sortase responsible for the cell wall anchoring of the long chaplin proteins (ChpA-C) containing an LAXTG sorting signal, which presumably mediate aerial hyphae formation. The family also includes some class E sortase homologs from Gram-negative and Archaebacterial species, but the functions of these enzymes are unknown. 135
25256 100109 cd05831 Ribosomal_P1 Ribosomal protein P1. This subfamily represents the eukaryotic large ribosomal protein P1. Eukaryotic P1 and P2 are functionally equivalent to the bacterial protein L7/L12, but are not homologous to L7/L12. P1 is located in the L12 stalk, with proteins P2, P0, L11, and 28S rRNA. P1 and P2 are the only proteins in the ribosome to occur as multimers, always appearing as sets of heterodimers. Recent data indicate that eukaryotes have four copies (two heterodimers), while most archaeal species contain six copies of L12p (three homodimers) and bacteria may have four or six copies (two or three homodimers), depending on the species. Experiments using S. cerevisiae P1 and P2 indicate that P1 proteins are positioned more internally with limited reactivity in the C-terminal domains, while P2 proteins seem to be more externally located and are more likely to interact with other cellular components. In lower eukaryotes, P1 and P2 are further subdivided into P1A, P1B, P2A, and P2B, which form P1A/P2B and P1B/P2A heterodimers. Some plant species have a third P-protein, called P3, which is not homologous to P1 and P2. In humans, P1 and P2 are strongly autoimmunogenic. They play a significant role in the etiology and pathogenesis of systemic lupus erythematosus (SLE). In addition, the ribosome-inactivating protein trichosanthin (TCS) interacts with human P0, P1, and P2, with its primary binding site located in the C-terminal region of P2. TCS inactivates the ribosome by depurinating a specific adenine in the sarcin-ricin loop of 28S rRNA. 103
25257 100110 cd05832 Ribosomal_L12p Ribosomal protein L12p. This subfamily includes archaeal L12p, the protein that is functionally equivalent to L7/L12 in bacteria and the P1 and P2 proteins in eukaryotes. L12p is homologous to P1 and P2 but is not homologous to bacterial L7/L12. It is located in the L12 stalk, with proteins L10, L11, and 23S rRNA. L12p is the only protein in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have four copies (two heterodimers), and bacteria may have four or six copies (two or three homodimers), depending on the species. The organization of proteins within the stalk has been characterized primarily in bacteria, where L7/L12 forms either two or three homodimers and each homodimer binds to the extended C-terminal helix of L10. L7/L12 is attached to the ribosome through L10 and is the only ribosomal protein that does not directly interact with rRNA. Archaeal L12p is believed to function in a similar fashion. However, hybrid ribosomes containing the large subunit from E. coli with an archaeal stalk are able to bind archaeal and eukaryotic elongation factors but not bacterial elongation factors. In several mesophilic and thermophilic archaeal species, the binding of 23S rRNA to protein L11 and to the L10/L12p pentameric complex was found to be temperature-dependent and cooperative. 106
25258 100111 cd05833 Ribosomal_P2 Ribosomal protein P2. This subfamily represents the eukaryotic large ribosomal protein P2. Eukaryotic P1 and P2 are functionally equivalent to the bacterial protein L7/L12, but are not homologous to L7/L12. P2 is located in the L12 stalk, with proteins P1, P0, L11, and 28S rRNA. P1 and P2 are the only proteins in the ribosome to occur as multimers, always appearing as sets of heterodimers. Recent data indicate that eukaryotes have four copies (two heterodimers), while most archaeal species contain six copies of L12p (three homodimers). Bacteria may have four or six copies of L7/L12 (two or three homodimers) depending on the species. Experiments using S. cerevisiae P1 and P2 indicate that P1 proteins are positioned more internally with limited reactivity in the C-terminal domains, while P2 proteins seem to be more externally located and are more likely to interact with other cellular components. In lower eukaryotes, P1 and P2 are further subdivided into P1A, P1B, P2A, and P2B, which form P1A/P2B and P1B/P2A heterodimers. Some plants have a third P-protein, called P3, which is not homologous to P1 and P2. In humans, P1 and P2 are strongly autoimmunogenic. They play a significant role in the etiology and pathogenesis of systemic lupus erythematosus (SLE). In addition, the ribosome-inactivating protein trichosanthin (TCS) interacts with human P0, P1, and P2, with its primary binding site in the C-terminal region of P2. TCS inactivates the ribosome by depurinating a specific adenine in the sarcin-ricin loop of 28S rRNA. 109
25259 99895 cd05834 HDGF_related The PWWP domain is an essential part of the Hepatoma Derived Growth Factor (HDGF) family of proteins, and is necessary for DNA binding by HDGF. This family of endogenous nuclear-targeted mitogens includes HRP (HDGF-related proteins 1, 2, 3, 4, or HPR1, HPR2, HPR3, HPR4, respectively) and lens epithelium-derived growth factor, LEDGF. Members of the HDGF family have been linked to human diseases, and HDGF is a prognostic factor in several types of cancer. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 83
25260 99896 cd05835 Dnmt3b_related The PWWP domain is an essential component of DNA methyltransferase 3 B (Dnmt3b) which is responsible for establishing DNA methylation patterns during embryogenesis and gametogenesis. In tumorigenesis, DNA methylation by Dnmt3b is known to play a role in the inactivation of tumor suppressor genes. In addition, a point mutation in the PWWP domain of Dnmt3b has been identified in patients with ICF syndrome (immunodeficiency, centromeric instability, and facial anomalies), a rare autosomal recessive disorder characterized by hypomethylation of classical satellite DNA. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 87
25261 99897 cd05836 N_Pac_NP60 The PWWP domain is an essential part of the cytokine-like nuclear factor n-pac protein, or NP60, which enhances the activity of MAP2K4 and MAP2K6 kinases to phosphorylate p38-alpha. In a variety of cell lines, NP60 has been shown to localize to the nucleus. In addition to the PWWP domain, NP60 also contains an AT-hook and a C-terminal NAD-binding domain. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes. 86
25262 99898 cd05837 MSH6_like The PWWP domain is present in MSH6, a mismatch repair protein homologous to bacterial MutS. The PWWP domain of histone-lysine N-methyltransferase, also known as Nuclear SET domain-containing protein 3, is also included. Mutations in MSH6 have been linked to increased cancer susceptibility, particularly in hereditary nonpolyposis colorectal cancer in humans. The role of the PWWP domain in MSH6 is not clear; MSH6 orthologs found in S. cerevisiae, Caenorhabditis elegans and Arabidopsis thaliana lack the PWWP domain. Histone methyltransferases (HMTases) induce the posttranslational methylation of lysine residues in histones and play a role in apoptosis. In the HMTase Whistle, the PWWP domain is necessary for HMTase activity. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 110
25263 99899 cd05838 WHSC1_related The PWWP domain was first identified in the WHSC1 (Wolf-Hirschhorn syndrome candidate 1) protein, a protein implicated in Wolf-Hirschhorn syndrome (WHS). When translocated, WHSC1 plays a role in lymphoid multiple myeloma (MM) disease, also known as plasmacytoma. WHSC1 proteins typically contain two copies of the PWWP domain. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 95
25264 99900 cd05839 BR140_related The PWWP domain is found in the BR140 family, which includes peregrin and BR140-like proteins 1 and 2. BR140 is the only family to contain the PWWP domain at the C terminus, with PHD and bromo domains in the N-terminal region. In myeloid leukemias, BR140 is disrupted by chromosomal translocations, similar to translocations of WHSC1 in lymphoid multiple myeloma. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes. 111
25265 99901 cd05840 SPBC215_ISWI_like The PWWP domain is a component of the S. pombe hypothetical protein SPBC215, as well as ISWI complex protein 4. The ISWI (imitation switch) proteins are ATPases responsible for chromatin remodeling in eukaryotes, and SPBC215 is proposed to also bind chromatin. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 93
25266 99902 cd05841 BS69_related The PWWP domain is part of BS69 protein, a nuclear protein that specifically binds adenoviral E1A and Epstein-Barr viral EBNA2 proteins, suppressing their transactivation functions. BS69 is a multi-domain protein, containing bromo, PHD, PWWP, and MYND domains. The specific role of the PWWP domain within BS69 is not clearly identified, but BS69 functions in chromatin remodeling, consistent with other PWWP-containing proteins. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 83
25267 320682 cd05843 Peptidase_M48_M56 Peptidases M48 (Ste24 endopeptidase or htpX homolog) and M56 (in MecR1 and BlaR1), integral membrane metallopeptidases. This family contains peptidase M48 (also known as Ste24 peptidase, Ste24p, Ste24 endopeptidase, a-factor converting enzyme, AFC1), M56 (also known as BlaR1 peptidase) as well as a novel family called minigluzincins. Peptidase M48 belongs to Ste24 endopeptidase family. Members of this family include Ste24 protease (peptidase M48A), protease htpX homolog (peptidase M48B), or CAAX prenyl protease 1, and mitochondrial metalloendopeptidase OMA1 (peptidase M48C). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit. In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. The Ste24p contains multiple membrane spans, a zinc metalloprotease motif (HEXXH), and a COOH-terminal ER retrieval signal (KKXX). Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity. Ste24p has limited homology to HtpX family of prokaryotic proteins; HtpX proteins, also part of the M48 peptidase family, are smaller and homology is restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins; HtpX then undergoes self-degradation and collaborates with FtsH to eliminate these misfolded proteins. Peptidase M56 includes zinc metalloprotease domain in MecR1 and BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Also included are a novel family of related proteins that consist of the soluble minimal scaffold similar to the catalytic domains of the integral-membrane metallopeptidase M48 and M56, thus called minigluzincins. 94
25268 340860 cd05844 GT4-like glycosyltransferase family 4 proteins. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to glycosyltransferase family 4 (GT4). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 365
25269 409432 cd05845 IgI_2_L1-CAM_like Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM), and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region, and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1 that involves abnormalities of axonal growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 91
25270 409433 cd05846 IgV_1_MRC-OX-2_like First immunoglobulin (Ig) variable (V) domain of rat MRC OX-2 antigen, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of rat MRC OX-2 antigen (also known as CD200) and similar proteins. MRC OX-2 is a membrane glycoprotein expressed in a variety of lymphoid and non-lymphoid cells in rats. It has a similar broad distribution pattern in humans. MRC OX-2 may regulate myeloid cell activity. The protein has an extracellular portion containing two Ig-like domains, a transmembrane portion, and a cytoplasmic portion. 108
25271 409434 cd05847 IgC1_CH2_IgE CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin E (IgE); member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second constant domain of the heavy chain of immunoglobulin E (IgE). The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta, and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). The different classes of antibodies vary in their heavy chains; the IgE class has the epsilon type. This domain (Cepsilon2) of IgE is in place of the flexible hinge region found in IgG. 97
25272 409435 cd05848 IgI_1_Contactin-5 First immunoglobulin (Ig) domain of contactin-5; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-5. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains, anchored to the membrane by glycosylphosphatidylinositol. The different contactins show different expression patterns in the central nervous system. In rats, a lack of contactin-5 (NB-2) results in an impairment of the neuronal activity in the auditory system. Contactin-5 is expressed specifically in the postnatal nervous system, peaking at about 3 weeks postnatal. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala; lower levels of expression have been detected in the corpus callosum, caudate nucleus, and spinal cord. This group belongs to the I-set of IgSF domains. 96
25273 409436 cd05849 IgI_1_Contactin-1 First immunoglobulin (Ig) domain of contactin-1; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains. 95
25274 409437 cd05850 IgI_1_Contactin-2 First immunoglobulin (Ig) domain of contactin-2; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-2-like. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. It may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module by contacts between IG domains 1 and 4, and domains 2 and 3. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-2 is also expressed in retinal amacrine cells in the developing chick retina, corresponding to the period of formation and maturation of AC processes. This group belongs to the I-set of IgSF domains. 97
25275 143259 cd05851 IgI_3_Contactin-1 Third immunoglobulin (Ig) domain of contactin-1; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains. 88
25276 409438 cd05852 Ig5_Contactin-1 Fifth immunoglobulin (Ig) domain of contactin-1. The members here are composed of the fifth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. 89
25277 409439 cd05853 Ig6_Contactin-4 Sixth immunoglobulin (Ig) domain of contactin-4. The members here are composed of the sixth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-4. Contactins are neural cell adhesion molecules, and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The different contactins show different expression patterns in the central nervous system. Highest expression of contactin-4 is in testes, thyroid, small intestine, uterus, and brain. Contactin-4 plays a role in the response of neuroblastoma cells to differentiating agents, such as retinoids. The contactin 4 gene is associated with cerebellar degeneration in spinocerebellar ataxia type 16. 102
25278 409440 cd05854 Ig6_Contactin-2 Sixth immunoglobulin (Ig) domain of contactin-2. The members here are composed of the sixth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-2-like. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. It may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module by contacts between IG domains 1 and 4, and domains 2 and 3. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-2 is also expressed in retinal amacrine cells (AC) in the developing chick retina, corresponding to the period of formation and maturation of AC processes. 102
25279 409441 cd05855 IgI_TrkB_d5 Fifth domain (immunoglobulin-like) of Trk receptor TrkB; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth domain of Trk receptor, TrkB, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors, which mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. Trks are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkB shares significant sequence homology and domain organization with TrkA and TrkC. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. In some cell systems NT-3 can activate TrkA and TrkB receptors. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. This group belongs to the I-set of IgSF domains. 94
25280 409442 cd05856 IgI_2_FGFRL1-like Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor_like-1(FGFRL1); member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor like-1(FGFRL1). FGFRL1 is comprised of a signal peptide, three extracellular Ig-like modules, a transmembrane segment, and a short intracellular domain. FGFRL1 is expressed preferentially in skeletal tissues. Similar to FGF receptors, the expressed protein interacts specifically with heparin and with FGF2. FGFRL1 does not have a protein tyrosine kinase domain at its C-terminus; neither does its cytoplasmic domain appear to interact with a signaling partner. It has been suggested that FGFRL1 may not have any direct signaling function, but instead acts as a decoy receptor trapping FGFs and preventing them from binding other receptors. 92
25281 409443 cd05857 IgI_2_FGFR Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor. FGF receptors bind FGF signaling polypeptides. FGFs participate in multiple processes such as morphogenesis, development, and angiogenesis. FGFs bind to four FGF receptor tyrosine kinases (FGFR1, FGFR2, FGFR3, FGFR4). Receptor diversity is controlled by alternative splicing producing splice variants with different ligand binding characteristics and different expression patterns. FGFRs have an extracellular region comprised of three IG-like domains, a single transmembrane helix, and an intracellular tyrosine kinase domain. Ligand binding and specificity reside in the Ig-like domains 2 and 3, and the linker region that connects these two. FGFR activation and signaling depend on FGF-induced dimerization, a process involving cell surface heparin or heparin sulfate proteoglycans. 95
25282 409444 cd05858 IgI_3_FGFR2 Third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor 2 (FGFR2); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain of human fibroblast growth factor receptor 2 (FGFR2). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for FGFs. FGFR2 is required for male sex determination. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 105
25283 409445 cd05859 Ig4_PDGFR Fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR). The members here are composed of the fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR; also known as cluster of differentiation (CD) 140a) alpha and beta. PDGF is a potent mitogen for connective tissue cells. PDGF-stimulated processes are mediated by three different PDGFs (PDGF-A, PDGF-B, and PDGF-C). PDGFR alpha binds to all three PDGFs, whereas the PDGFR beta binds only to PDGF-B. PDGFR alpha is organized as an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. In mice, PDGFR alpha and PDGFR beta are essential for normal development. 101
25284 409446 cd05860 IgI_4_SCFR Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR); member of the I-set of IgSF domains. The members here are composed of the fourth Immunoglobulin (Ig)-like domain in stem cell factor receptor (SCFR). SCFR is organized as an extracellular component having five IG-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes, and gametogenesis. SCF binds to the second and third Ig-like domains of SCFR. This fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth domain abolishes the ligand-induced dimerization of SCFR and completely inhibits signal transduction. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 101
25285 409447 cd05861 IgI_PDGFR-alphabeta Immunoglobulin (Ig)-like domain of platelet-derived growth factor (PDGF) receptors (R), alpha and beta; member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of platelet-derived growth factor (PDGF) receptors (R), alpha (also known as cluster of differentiation (CD) 140a), and beta (also known as CD140b). PDGF is a potent mitogen for connective tissue cells. PDGF-stimulated processes are mediated by three different PDGFs (PDGF-A, PDGF-B, and PDGF-C). PDGFRalpha binds to all three PDGFs, whereas the PDGFRbeta binds only to PDGF-B. PDGFRs alpha and beta have similar organization: an extracellular component with five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. In mice, PDGFRalpha and PDGFRbeta are essential for normal development. 99
25286 409448 cd05862 IgI_VEGFR Immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor(R); member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor(R). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members, VEGFR-1 (also known as Flt-1), VEGFR-2 (also known as KDR or Flk-1) and VEGFR-3 (also known as Flt-4). VEGF-A interacts with both VEGFR-1 and VEGFR-2. VEGFR-1 binds most strongly to VEGF; VEGFR-2 binds more weakly. VEGFR-3 appears not to bind VEGF, but binds other members of the VEGF family (VEGF-C and -D). VEGFRs bind VEGFs with high affinity with the IG-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-2 and -1 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth. 102
25287 409449 cd05863 IgI_VEGFR-3 Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-3 (Flt-4) binds two members of the VEGF family (VEGF-C and VEGF-D) and is involved in tumor angiogenesis and growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 88
25288 409450 cd05864 IgI_VEGFR-2 Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-2 (KDR/Flk-1) is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A; VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGF-A also interacts with VEGFR-1, which it binds more strongly than VEGFR-2. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 89
25289 409451 cd05865 IgI_1_NCAM-1 First immunoglobulin (Ig)-like domain of neural cell adhesion molecule (NCAM-1); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule (NCAM-1). NCAM-1 plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-nonNCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves the Ig1, Ig2, and Ig3 domains. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 97
25290 409452 cd05866 IgI_1_NCAM-2 First immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-2; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-2 (OCAM/mamFas II, RNCAM). NCAM-2 is organized similarly to NCAM-1, including five N-terminal Ig-like domains and two fibronectin type III domains. NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE), and may function like NCAM, as an adhesion molecule. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 93
25291 409453 cd05867 Ig4_L1-CAM_like Fourth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). The members here are composed of the fourth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, and spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM. 89
25292 409454 cd05868 Ig4_NrCAM Fourth immunoglobulin (Ig)-like domain of NrCAM (NgCAM-related cell adhesion molecule). The members here are composed of the fourth immunoglobulin (Ig)-like domain of NrCAM (NgCAM-related cell adhesion molecule). NrCAM belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six IG-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. NrCAM is primarily expressed in the nervous system. 89
25293 143277 cd05869 IgI_NCAM-1 Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 1 (NCAM-1). The members here are composed of the fourth Ig domain of Neural Cell Adhesion Molecule 1(NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM) and heterophilic (NCAM-non-NCAM) interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains. 97
25294 143278 cd05870 IgI_NCAM-2 Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 2 (NCAM-2). The members here are composed of the fourth Ig domain of Neural Cell Adhesion Molecule NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is organized similarly to NCAM, including five N-terminal Ig-like domains and two fibronectin type III domains. NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE), and may function like NCAM, as an adhesion molecule. One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains. 98
25295 409455 cd05871 Ig_Sema3 Immunoglobulin (Ig)-like domain of class III semaphorin Sema3. The members here are composed of the immunoglobulin (Ig)-like domain of Sema3 and similar proteins. Semaphorins are classified based on structural features additional to the Sema domain. Sema3 is a Class III semaphorin that is secreted. It is a vertebrate class having a Sema domain, an Ig domain, and a short basic domain. They have been shown to be axonal guidance cues and have a part in the regulation of the cardiovascular, immune, and respiratory systems. Sema3A, the prototype member of this class III subfamily, induces growth cone collapse and is an inhibitor of axonal sprouting. In perinatal rat cortex, it acts as a chemoattractant and functions to direct the orientated extension of apical dendrites. It may play a role, prior to the development of apical dendrites, in signaling the radial migration of newborn cortical neurons towards the upper layers. Sema3A selectively inhibits vascular endothelial growth factor (VEGF)-induced angiogenesis and induces microvascular permeability. This group also includes Sema3B, -C, -D, -E, -G. 92
25296 409456 cd05872 Ig_Sema4B_like Immunoglobulin (Ig)-like domain of the class IV semaphorin Sema4B. The members here are composed of the immunoglobulin (Ig)-like domain of Sema4B and similar proteins. Sema4B is a Class IV semaphorin. Semaphorins are classified based on structural features additional to the Sema domain. Sema4B has extracellular Sema and Ig domains, a transmembrane domain, and a short cytoplasmic domain. Sema4B has been shown to preferentially regulate the development of the postsynaptic specialization at the glutamatergic synapses. This cytoplasmic domain includes a PDZ-binding motif upon which the synaptic localization of Sema4B is dependent. Sema4B is a ligand of CLCP1. CLCP1 was identified in an expression profiling analysis, which compared a highly metastatic lung cancer subline with its low metastatic parental line. Sema4B was shown to promote CLCP1 endocytosis and their interaction is a potential target for therapeutic intervention of metastasis. 86
25297 409457 cd05873 Ig_Sema4D_like Immunoglobulin (Ig)-like domain of semaphorin 4D (Sema4D) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of semaphorin 4D (Sema4D) and similar proteins. Sema4D is a Class IV semaphorin. Semaphorins are classified based on structural features additional to the Sema domain. Sema4D has extracellular Sema and Ig domains, a transmembrane domain, and a short cytoplasmic domain. Sema4D plays a part in the development of GABAergic synapses. Sema4D in addition is an immune semaphorin. It is abundant on resting T cells; its expression is weak on resting B cells and antigen presenting cells (APCs), but is upregulated by various stimuli. The receptor used by Sema4D in the immune system is CD72. Sema4D enhances the activation of B cells and DCs through binding CD72, perhaps by reducing CD72's inhibitory signals. The receptor used by Sema4D in the non-lymphatic tissues is plexin-B1. Sema4D is anchored to the cell surface but its extracellular domain can be released from the cell surface by a metalloprotease-dependent process. Sema4D may mediate its effects in its membrane-bound form and/or its cleaved form. 87
25298 409458 cd05874 IgI_NrCAM Immunoglobulin (Ig)-like domain of NrCAM (Ng (neuronglia) CAM-related cell adhesion molecule); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of NrCAM (Ng (neuronglia) CAM-related cell adhesion molecule). NrCAM belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. NrCAM is primarily expressed in the nervous system. 95
25299 409459 cd05875 IgI_hNeurofascin_like Immunoglobulin (Ig)-like domain of human neurofascin (NF); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of human neurofascin (NF). NF belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and a cytoplasmic domain. NF has many alternatively spliced isoforms having different temporal expression patterns during development. NF participates in axon subcellular targeting and synapse formation; however, little is known of the functions of the different isoforms. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 95
25300 409460 cd05876 Ig3_L1-CAM Third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). The members here are composed of the third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM. 83
25301 409461 cd05877 Ig_LP_like Immunoglobulin (Ig)-like domain of human cartilage link protein (LP), and similar domains. The members here are composed of the immunoglobulin (Ig)-like domain similar to that found in human cartilage link protein (LP; also called hyaluronan and proteoglycan link protein). In cartilage, chondroitin-keratan sulfate proteoglycan (CSPG), aggrecan, forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 117
25302 409462 cd05878 Ig_Aggrecan_like Immunoglobulin (Ig)-like domain of the aggrecan-like chondroitin sulfate proteoglycan core protein (CSPG). The members here are composed of the immunoglobulin (Ig)-like domain of the aggrecan-like chondroitin sulfate proteoglycan core proteins (CSPGs). Included in this group are the Ig domains of other CSPGs: versican, and neurocan. In CSPGs, this Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggrecan and versican have a wide distribution in connective tissue and extracellular matrices. Neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 125
25303 409463 cd05879 IgV_P0 Immunoglobulin (Ig)-like domain of protein zero (P0). The members here are composed of the immunoglobulin (Ig) domain of protein zero (P0), a myelin membrane adhesion molecule. P0 accounts for over 50% of the total protein in peripheral nervous system (PNS) myelin. P0 is a single-pass transmembrane glycoprotein with a highly basic intracellular domain and an Ig domain. The extracellular domain of P0 (P0-ED) is similar to the Ig variable domain, carrying one acceptor sequence for N-linked glycosylation. P0 plays a role in membrane adhesion in the spiral wraps of the myelin sheath. The intracellular domain is thought to mediate membrane apposition of the cytoplasmic faces and may, through electrostatic interactions, interact directly with lipid headgroups. It is thought that homophilic interactions of the P0 extracellular domain mediate membrane juxtaposition in the extracellular space of PNS myelin. 117
25304 409464 cd05880 IgV_EVA1 Immunoglobulin (Ig)-like domain of epithelial V-like antigen (EVA) 1. The members here are composed of the immunoglobulin (Ig) domain of epithelial V-like antigen 1 (EVA 1). EVA is also known as myelin protein zero-like 2. EVA is an adhesion molecule and may play a role in the structural organization of the thymus and early lymphocyte development. 116
25305 409465 cd05881 IgV_1_Necl-2 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule 2; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-2, Necl-2 (also known as cell adhesion molecule 1 (CADM1), SynCAM1, IGSF4A, Tslc1, sgIGSF, and RA175). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region, belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene, which is downregulated in aggressive neuroblastoma. 94
25306 143290 cd05882 IgV_1_Necl-1 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1); member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-1, Necl-1 (also known as cell adhesion molecule 3 (CADM3), SynCAM2, or IGSF4). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons. 95
25307 409466 cd05883 IgI_2_Necl-2 Second immunoglobulin (Ig)-like domain of nectin-like molecule 2 (Necl-2); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule 2 (Necl-2; also known as cell adhesion molecule 1 (CADM1)). Nectin-like molecules (Necls) have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Ig domains are likely to participate in ligand binding and recognition. 99
25308 409467 cd05884 IgI_2_Necl-3 Second immunoglobulin (Ig)-like domain of nectin-like molecule-3 (Necl-3); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-3 (Necl-3; also known as cell adhesion molecule 2 (CADM2)). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Necl-3 has been shown to accumulate in tissues of the central and peripheral nervous system where it is expressed in ependymal cells and myelinated axons. It is observed at the interface between the axon shaft and the myelin sheath. Ig domains are likely to participate in ligand binding and recognition. 104
25309 409468 cd05885 IgI_2_Necl-4 Second immunoglobulin (Ig)-like domain of nectin-like molecule-4 (Necl-4); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-4 (Necl-4; also known as cell adhesion molecule 4 (CADM4)). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1-Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Ig domains are likely to participate in ligand binding and recognition. Necl-4 is expressed on Schwann cells, and plays a key part in initiating peripheral nervous system (PNS) myelination. In injured peripheral nerve cells, the mRNA signal for both Necl-4 and Necl-5 was observed to be elevated. Necl-4 participates in cell-cell adhesion and is proposed to play a role in tumor suppression. 100
25310 409469 cd05886 IgV_1_Nectin-1_like First immunoglobulin variable (IgV) domain of nectin-1, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-1 (also known as poliovirus receptor related protein 1 (PVRL1) or cluster of differentiation (CD) 111). Nectin-1 belongs to the nectin family comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. In addition, nectins heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1, for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein, afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D. 113
25311 409470 cd05887 IgV_1_Nectin-3_like First immunoglobulin variable (IgV) domain of nectin-3 (also known as poliovirus receptor related protein 3), and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-3 (also known as poliovirus receptor related protein 3 (PVRL3) or cluster of differentiation (CD) 113). Nectin-3 belongs to the nectin family comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which participate in adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. For example, during spermatid development, the nectin-3,-2 trans-interaction is required for the formation of Sertoli cell-spermatid junctions in testis, and during morphogenesis of the ciliary body, the nectin-3,-1 trans-interaction is important for apex-apex adhesion between the pigment and non-pigment layers of the ciliary epithelia. Nectins also heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-3 for example, trans-interacts with Necl-5, regulating cell movement and proliferation. Other proteins with which nectin-3 interacts include the actin filament-binding protein, afadin, integrin alpha-beta3, Par-3, and PDGF receptor; its interaction with PDGF receptor regulates the latter's signaling for anti-apoptosis. 110
25312 409471 cd05888 IgV_1_Nectin-4_like First immunoglobulin (Ig) domain of nectin-4, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-4 (also known as poliovirus receptor related protein 4 or LNIR receptor). Nectin-4 belongs to the nectin family, which is comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which participate in adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. For example, nectin-4 trans-interacts with nectin-1. Nectin-4 has also been shown to interact with the actin filament-binding protein, afadin. Unlike the other nectins, which are widely expressed in adult tissues, nectin-4 is mainly expressed during embryogenesis, and is not detected in normal adult tissue or in serum. Nectin-4 is re-expressed in breast carcinoma, and patients having metastatic breast cancer have a circulating form of nectin-4 formed from the ectodomain. 108
25313 409472 cd05889 IgV_1_DNAM-1_like First immunoglobulin variable (IgV) domain of DNAX accessory molecule 1, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of DNAX accessory molecule 1 (DNAM-1, also known as CD226). DNAM-1 is a transmembrane protein having two Ig-like domains. It is an adhesion molecule which plays a part in tumor-directed cytotoxicity and adhesion in natural killer (NK) cells and T lymphocytes. It has been shown to regulate the NK cell killing of several tumor types, including myeloma cells and ovarian carcinoma cells. DNAM-1 interacts specifically with poliovirus receptor (PVR; CD155) and nectin-2 (CD112), other members of the Ig superfamily. DNAM-1 is expressed in most peripheral T cells, NK cells, monocytes and a subset of B lymphocytes. 111
25314 143298 cd05890 IgC1_2_Nectin-1_like Second immunoglobulin (Ig) domain of nectin-1, and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-1 (also known as poliovirus receptor related protein 1, or cluster of differentiation (CD) 111). Nectin-1 belongs to the nectin family comprised of four transmembrane glycoproteins (nectin-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectins also heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1 for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein, afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D. 98
25315 143299 cd05891 IgI_M-protein_C C-terminal immunoglobulin (Ig)-like domain of M-protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of M-protein (also known as myomesin-2). M-protein is a structural protein localized to the M-band, a transverse structure in the center of the sarcomere, and is a candidate for M-band bridges. M-protein is modular, consisting mainly of repetitive Ig-like and fibronectin type III (FnIII) domains, and has a muscle-type specific expression pattern. M-protein is present in fast fibers. 92
25316 409473 cd05892 IgI_Myotilin_C C-terminal immunoglobulin (Ig)-like domain of myotilin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of myotilin. Myotilin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to this family contain multiple Ig-like domains and function as scaffolds, modulating the actin cytoskeleton. Myotilin is most abundant in skeletal and cardiac muscle and is involved in maintaining sarcomere integrity. It binds to alpha-actinin, filamin, and actin. Mutations in myotilin lead to muscle disorders. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 92
25317 409474 cd05893 IgI_1_Palladin_C First C-terminal immunoglobulin (Ig)-like domain of palladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of palladin. Palladin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to this family contain multiple Ig-like domains and function as scaffolds, modulating the actin cytoskeleton. Palladin binds to alpha-actinin, ezrin, vasodilator-stimulated phosphoprotein (VASP), SPIN90 (also known as DIP or mDia interacting protein), and Src. Palladin also binds F-actin directly, via its Ig3 domain. Palladin is expressed as several alternatively spliced isoforms, having various combinations of Ig-like domains, in a cell-type-specific manner. It has been suggested that palladin's different Ig-like domains may be specialized for distinct functions. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 92
25318 409475 cd05894 Ig_C5_MyBP-C C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). The members here are composed of the C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). MyBP-C consists of repeated domains, Ig and fibronectin type 3, and various linkers. Three isoforms of MyBP-C exist: slow-skeletal (ssMyBP-C), fast-skeletal (fsMyBP-C), and cardiac (cMyBP-C). cMyBP-C has insertions between and inside domains and an additional cardiac-specific Ig domain at the N-terminus. For cMyBP-C, an interaction has been demonstrated between this C5 domain and the Ig C8 domain. 86
25319 409476 cd05895 Ig_Pro_neuregulin-1 Immunoglobulin (Ig)-like domain found in neuregulin (NRG)-1. The members here are composed of the immunoglobulin (Ig)-like domain found in neuregulin (NRG)-1. There are many NRG-1 isoforms which arise from the alternative splicing of mRNA. NRG-1 belongs to the neuregulin gene family which is comprised of four genes. This group represents NRG-1. NRGs are signaling molecules which participate in cell-cell interactions in the nervous system, breast, heart, and other organ systems, and are implicated in the pathology of diseases including schizophrenia, multiple sclerosis, and breast cancer. The NRG-1 protein binds to and activates the tyrosine kinase receptors ErbB3 and ErbB4, initiating signaling cascades. NRG-1 has multiple functions. For example, in the brain it regulates various processes such as radial glia formation and neuronal migration, dendritic development, and expression of neurotransmitter receptors; in the peripheral nervous system, NRG-1 regulates processes such as target cell differentiation and Schwann cell survival. 93
25320 409477 cd05896 Ig1_IL1RAPL-1_like First immunoglobulin (Ig)-like domain of X-linked interleukin-1 receptor accessory protein-like 1 (IL1RAPL-1), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of X-linked interleukin-1 receptor accessory protein-like 1 (IL1RAPL-1). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. IL1RAPL is encoded by a gene on the X-chromosome; this gene is wholly or partially deleted in multiple cases of non-syndromic intellectual disability. This group also contains IL1RAPL-2, which is also encoded by a gene on the X-chromosome and is a candidate for another non-syndromic intellectual disability locus. 105
25321 409478 cd05897 Ig2_IL1R2_like Second immunoglobulin (Ig)-like domain of interleukin-1 receptor-2 (IL1R2), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor-2 (IL1R2). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds the IL-1 receptor, type II (IL1R2) represented in this group. Mature IL1R2 consists of three Ig-like domains, a transmembrane domain, and a short cytoplasmic domain. It lacks the large cytoplasmic domain of mature IL1R1 and does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. 95
25322 409479 cd05898 IgI_5_KIRREL3 Fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 protein (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (Neph1). These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development. Neph1 and 2 may mediate axonal guidance and synapse formation in certain areas of the CNS. In the kidney they participate in the formation of the slit diaphragm. 98
25323 409480 cd05899 IgV_TCR_beta Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) beta chain. The members here are composed of the immunoglobulin (Ig) variable domain of the beta chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the beta chain of alpha/beta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The variable domain of TCRs is responsible for antigen recognition, and is located at the N-terminus of the receptor. Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 110
25324 409481 cd05900 Ig_Aggrecan Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), aggrecan. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), aggrecan. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggrecan has a wide distribution in connective tissue and extracellular matrices. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 123
25325 409482 cd05901 Ig_Versican Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), versican. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), versican. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, the CSPG aggrecan (not included in this group) forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Like aggrecan, versican has a wide distribution in connective tissue and extracellular matrices. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 128
25326 409483 cd05902 Ig_Neurocan Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), neurocan. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), neurocan. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, the CSPG aggrecan (not included in this group) forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Unlike aggrecan which is widely distributed in connective tissue and extracellular matrices, neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 121
25327 341229 cd05903 CHC_CoA_lg Cyclohexanecarboxylate-CoA ligase (also called cyclohex-1-ene-1-carboxylate:CoA ligase). Cyclohexanecarboxylate-CoA ligase activates the aliphatic ring compound, cyclohexanecarboxylate, for degradation. It catalyzes the synthesis of cyclohexanecarboxylate-CoA thioesters in a two-step reaction involving the formation of cyclohexanecarboxylate-AMP anhydride, followed by the nucleophilic substitution of AMP by CoA. 437
25328 341230 cd05904 4CL 4-Coumarate-CoA Ligase (4CL). 4-Coumarate:coenzyme A ligase is a key enzyme in the phenylpropanoid metabolic pathway for monolignol and flavonoid biosynthesis. It catalyzes the synthesis of hydroxycinnamate-CoA thioesters in a two-step reaction, involving the formation of hydroxycinnamate-AMP anhydride and the nucleophilic substitution of AMP by CoA. The phenylpropanoid pathway is one of the most important secondary metabolism pathways in plants and hydroxycinnamate-CoA thioesters are the precursors of lignin and other important phenylpropanoids. 505
25329 341231 cd05905 Dip2 Disco-interacting protein 2 (Dip2). Dip2 proteins show sequence similarity to other members of the adenylate forming enzyme family, including insect luciferase, acetyl CoA ligases and the adenylation domain of nonribosomal peptide synthetases (NRPS). However, its function may have diverged from other members of the superfamily. In mouse embryo, Dip2 homolog A plays an important role in the development of both vertebrate and invertebrate nervous systems. Dip2A appears to regulate cell growth and the arrangement of cells in organs. Biochemically, Dip2A functions as a receptor of FSTL1, an extracellular glycoprotein, and may play a role as a cardiovascular protective agent. 571
25330 341232 cd05906 A_NRPS_TubE_like The adenylation domain (A domain) of a family of nonribosomal peptide synthetases (NRPSs) synthesizing toxins and antitumor agents. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino)-acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family includes NRPSs that synthesize toxins and antitumor agents; for example, TubE for Tubulysine, CrpA for cryptophycin, TdiA for terrequinone A, KtzG for kutzneride, and Vlm1/Vlm2 for Valinomycin. Nonribosomal peptide synthetases are large multifunctional enzymes which synthesize many therapeutically useful peptides. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 540
25331 341233 cd05907 VL_LC_FACS_like Long-chain fatty acid CoA synthetases and Bubblegum-like very long-chain fatty acid CoA synthetases. This family includes long-chain fatty acid (C12-C20) CoA synthetases and Bubblegum-like very long-chain (>C20) fatty acid CoA synthetases. FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Eukaryotes generally have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. Drosophila melanogaster mutant bubblegum (BGM) have elevated levels of very-long-chain fatty acids (VLCFA) caused by a defective gene later named bubblegum. The human homolog (hsBG) of bubblegum has been characterized as a very long chain fatty acid CoA synthetase that functions specifically in the brain; hsBG may play a central role in brain VLCFA metabolism and myelinogenesis. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 452
25332 341234 cd05908 A_NRPS_MycA_like The adenylation domain of nonribosomal peptide synthetases (NRPS) similar to mycosubtilin synthase subunit A (MycA). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino)-acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family includes NRPS similar to mycosubtilin synthase subunit A (MycA). Mycosubtilin, which is characterized by a beta-amino fatty acid moiety linked to the circular heptapeptide Asn-Tyr-Asn-Gln-Pro-Ser-Asn, belongs to the iturin family of lipopeptide antibiotics. The mycosubtilin synthase subunit A (MycA) combines functional domains derived from peptide synthetases, amino transferases, and fatty acid synthases. Nonribosomal peptide synthetases are large multifunctional enzymes that synthesize many therapeutically useful peptides. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 499
25333 341235 cd05909 AAS_C C-terminal domain of the acyl-acyl carrier protein synthetase (also called 2-acylglycerophosphoethanolamine acyltransferase, Aas). Acyl-acyl carrier protein synthase (Aas) is a membrane protein responsible for a minor pathway of incorporating exogenous fatty acids into membrane phospholipids. Its in vitro activity is characterized by the ligation of free fatty acids between 8 and 18 carbons in length to the acyl carrier protein sulfhydryl group (ACP-SH) in the presence of ATP and Mg2+. However, its in vivo function is as a 2-acylglycerophosphoethanolamine (2-acyl-GPE) acyltransferase. The reaction occurs in two steps: the acyl chain is first esterified to acyl carrier protein (ACP) via a thioester bond, followed by a second step where the acyl chain is transferred to a 2-acyllysophospholipid, thus completing the transacylation reaction. This model represents the C-terminal domain of the enzyme, which belongs to the class I adenylate-forming enzyme family, including acyl-CoA synthetases. 490
25334 341236 cd05910 FACL_like_1 Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 457
25335 341237 cd05911 Firefly_Luc_like Firefly luciferase of light emitting insects and 4-Coumarate-CoA Ligase (4CL). This family contains insect firefly luciferases that share significant sequence similarity to plant 4-coumarate:coenzyme A ligases, despite their functional diversity. Luciferase catalyzes the production of light in the presence of MgATP, molecular oxygen, and luciferin. In the first step, luciferin is activated by acylation of its carboxylate group with ATP, resulting in an enzyme-bound luciferyl adenylate. In the second step, luciferyl adenylate reacts with molecular oxygen, producing an enzyme-bound excited state product (Luc=O*) and releasing AMP. This excited-state product then decays to the ground state (Luc=O), emitting a quantum of visible light. 486
25336 341238 cd05912 OSB_CoA_lg O-succinylbenzoate-CoA ligase (also known as O-succinylbenzoate-CoA synthase, OSB-CoA synthetase, or MenE). O-succinylbenzoic acid-CoA synthase catalyzes the coenzyme A (CoA)- and ATP-dependent conversion of o-succinylbenzoic acid to o-succinylbenzoyl-CoA. The reaction is the fourth step of the biosynthesis pathway of menaquinone (vitamin K2). In certain bacteria, menaquinone is used during fumarate reduction in anaerobic respiration. In cyanobacteria, the product of the menaquinone pathway is phylloquinone (2-methyl-3-phytyl-1,4-naphthoquinone), a molecule used exclusively as an electron transfer cofactor in Photosystem 1. In green sulfur bacteria and heliobacteria, menaquinones are used as loosely bound secondary electron acceptors in the photosynthetic reaction center. 411
25337 341239 cd05913 PaaK Phenylacetate-CoA ligase (also known as PaaK). PaaK catalyzes the first step in the aromatic degradation pathway, by converting phenylacetic acid (PA) into phenylacetyl-CoA (PA-CoA). Phenylacetate-CoA ligase has been found in proteobacteria as well as gram positive prokaryotes. The enzyme is specifically induced after aerobic growth in a chemically defined medium containing PA or phenylalanine (Phe) as the sole carbon source. PaaKs are members of the adenylate-forming enzyme (AFE) family. However, sequence comparison reveals divergent features of PaaK with respect to the superfamily, including a novel N-terminal sequence. 425
25338 341240 cd05914 LC_FACL_like Uncharacterized subfamily of fatty acid CoA ligase (FACL). The members of this family are bacterial long-chain fatty acid CoA synthetase, most of which are as yet uncharacterized. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 463
25339 213283 cd05915 ttLC_FACS_like Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus. This family includes fatty acyl-CoA synthetases that can activate medium-chain to long-chain fatty acids. They catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family has been shown to activate the long-chain fatty acid, myristoyl acid, while another member in this family, the AlkK protein identified in Pseudomonas oleovorans, targets medium chain fatty acids. This family also includes an uncharacterized subgroup of FACS. 509
25340 341241 cd05917 FACL_like_2 Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 349
25341 341242 cd05918 A_NRPS_SidN3_like The adenylation (A) domain of siderophore-synthesizing nonribosomal peptide synthetases (NRPS). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family of siderophore-synthesizing NRPS includes the third adenylation domain of SidN from the endophytic fungus Neotyphodium lolii, ferrichrome siderophore synthetase, HC-toxin synthetase, and enniatin synthase. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 481
25342 341243 cd05919 BCL_like Benzoate CoA ligase (BCL) and similar adenylate forming enzymes. This family contains benzoate CoA ligase (BCL) and related ligases that catalyze the acylation of benzoate derivatives, 2-aminobenzoate and 4-hydroxybenzoate. Aromatic compounds represent the second most abundant class of organic carbon compounds after carbohydrates. Xenobiotic aromatic compounds are also a major class of man-made pollutants. Some bacteria use benzoate as the sole source of carbon and energy through benzoate degradation. Benzoate degradation starts with its activation to benzoyl-CoA by benzoate CoA ligase. The reaction catalyzed by benzoate CoA ligase proceeds via a two-step process; the first ATP-dependent step forms an acyl-AMP intermediate, and the second step forms the acyl-CoA ester with release of the AMP. 436
25343 341244 cd05920 23DHB-AMP_lg 2,3-dihydroxybenzoate-AMP ligase. 2,3-dihydroxybenzoate-AMP ligase activates 2,3-dihydroxybenzoate (DHB) by ligation of AMP from ATP with the release of pyrophosphate. However, it can also catalyze the ATP-PPi exchange for 2,3-DHB analogs, such as salicylic acid (o-hydroxybenzoate), as well as 2,4-DHB and 2,5-DHB, but with less efficiency. Proteins in this family are the stand-alone adenylation components of non-ribosomal peptide synthases (NRPSs) involved in the biosynthesis of siderophores, which are low molecular weight iron-chelating compounds synthesized by many bacteria to aid in the acquisition of this vital trace element. In Escherichia coli, the 2,3-dihydroxybenzoate-AMP ligase is called EntE, the adenylation component of the enterobactin NRPS system. 482
25344 341245 cd05921 FCS Feruloyl-CoA synthetase (FCS). Feruloyl-CoA synthetase is an essential enzyme in the ferulic acid degradation pathway and enables some proteobacteria to grow on media containing ferulic acid as the sole carbon source. It catalyzes the transfer of CoA to the carboxyl group of ferulic acid, which then forms feruloyl-CoA in the presence of ATP and Mg2+. The resulting feruloyl-CoA is further degraded to vanillin and acetyl-CoA. Feruloyl-CoA synthetase (FCS) is a subfamily of the adenylate-forming enzymes superfamily. 561
25345 341246 cd05922 FACL_like_6 Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 457
25346 341247 cd05923 CBAL 4-Chlorobenzoate-CoA ligase (CBAL). CBAL catalyzes the conversion of 4-chlorobenzoate (4-CB) to 4-chlorobenzoyl-coenzyme A (4-CB-CoA) by the two-step adenylation and thioester-forming reactions. 4-Chlorobenzoate (4-CBA) is an environmental pollutant derived from microbial breakdown of aromatic pollutants, such as polychlorinated biphenyls (PCBs), DDT, and certain herbicides. The 4-CBA degrading pathway converts 4-CBA to the metabolite 4-hydroxybenzoate (4-HBA), allowing some soil-dwelling microbes to utilize 4-CBA as an alternate carbon source. This pathway consists of three chemical steps catalyzed by 4-CBA-CoA ligase, 4-CBA-CoA dehalogenase, and 4-HBA-CoA thioesterase in sequential reactions. 493
25347 341248 cd05924 FACL_like_5 Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 364
25348 341249 cd05926 FACL_fum10p_like Subfamily of fatty acid CoA ligase (FACL) similar to Fum10p of Gibberella moniliformis. FACL catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, followed by the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Fum10p is a fatty acid CoA ligase involved in the synthesis of fumonisin, a polyketide mycotoxin, in Gibberella moniliformis. 493
25349 341250 cd05927 LC-FACS_euk Eukaryotic long-chain fatty acid CoA synthetase (LC-FACS). The members of this family are eukaryotic fatty acid CoA synthetases that activate fatty acids with chain lengths of 12 to 20. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Organisms tend to have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. 545
25350 341251 cd05928 MACS_euk Eukaryotic Medium-chain acyl-CoA synthetase (MACS or ACSM). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The acyl-CoA is a key intermediate in many important biosynthetic and catabolic processes. MACS enzymes are localized to mitochondria. Two murine MACS family proteins are found in liver and kidney. In rodents, a MACS member is detected particularly in the olfactory epithelium and is called O-MACS. O-MACS demonstrates substrate preference for the fatty acid lengths of C6-C12. 530
25351 341252 cd05929 BACL_like Bacterial Bile acid CoA ligases and similar proteins. Bile acid-Coenzyme A ligase catalyzes the formation of bile acid-CoA conjugates in a two-step reaction: the formation of a bile acid-AMP molecule as an intermediate, followed by the formation of a bile acid-CoA. This ligase requires a bile acid with a free carboxyl group, ATP, Mg2+, and CoA for synthesis of the final bile acid-CoA conjugate. The bile acid-CoA ligation is believed to be the initial step in the bile acid 7alpha-dehydroxylation pathway in the intestinal bacterium Eubacterium sp. 473
25352 341253 cd05930 A_NRPS The adenylation domain of nonribosomal peptide synthetases (NRPS). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 444
25353 341254 cd05931 FAAL Fatty acyl-AMP ligase (FAAL). FAAL belongs to the class I adenylate forming enzyme family and is homologous to fatty acyl-coenzyme A (CoA) ligases (FACLs). However, FAALs produce only the acyl adenylate and are unable to perform the thioester-forming reaction, while FACLs perform a two-step catalytic reaction; AMP ligation followed by CoA ligation using ATP and CoA as cofactors. FAALs have insertion motifs between the N-terminal and C-terminal subdomains that distinguish them from the FACLs. This insertion motif precludes the binding of CoA, thus preventing CoA ligation. It has been suggested that the acyl adenylates serve as substrates for multifunctional polyketide synthases to permit synthesis of complex lipids such as phthiocerol dimycocerosate, sulfolipids, mycolic acids, and mycobactin. 547
25354 341255 cd05932 LC_FACS_bac Bacterial long-chain fatty acid CoA synthetase (LC-FACS), including Marinobacter hydrocarbonoclasticus isoprenoid Coenzyme A synthetase. The members of this family are bacterial long-chain fatty acid CoA synthetase. Marinobacter hydrocarbonoclasticus isoprenoid Coenzyme A synthetase in this family is involved in the synthesis of isoprenoid wax ester storage compounds when grown on phytol as the sole carbon source. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 508
25355 341256 cd05933 ACSBG_like Bubblegum-like very long-chain fatty acid CoA synthetase (VL-FACS). This family of very long-chain fatty acid CoA synthetase is named bubblegum because Drosophila melanogaster mutant bubblegum (BGM) has elevated levels of very-long-chain fatty acids (VLCFA) caused by a defective gene of this family. The human homolog (hsBG) has been characterized as a very long chain fatty acid CoA synthetase that functions specifically in the brain; hsBG may play a central role in brain VLCFA metabolism and myelinogenesis. VL-FACS is involved in the first reaction step of very long chain fatty acid degradation. It catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 596
25356 341257 cd05934 FACL_DitJ_like Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Members of this family include DitJ from Pseudomonas and similar proteins. 422
25357 341258 cd05935 LC_FACS_like Putative long-chain fatty acid CoA ligase. The members of this family are putative long-chain fatty acyl-CoA synthetases, which catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. 430
25358 341259 cd05936 FC-FACS_FadD_like Prokaryotic long-chain fatty acid CoA synthetases similar to Escherichia coli FadD. This subfamily of the AMP-forming adenylation family contains Escherichia coli FadD and similar prokaryotic fatty acid CoA synthetases. FadD was characterized as a long-chain fatty acid CoA synthetase. The gene fadD is regulated by the fatty acid regulatory protein FadR. Fatty acid CoA synthetase catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, followed by the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 468
25359 341260 cd05937 FATP_chFAT1_like Uncharacterized subfamily of bifunctional fatty acid transporter/very-long-chain acyl-CoA synthetase in fungi. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and its activation enzymes. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis. Members of this family are fungal FATPs, including FAT1 from Cochliobolus heterostrophus. 468
25360 341261 cd05938 hsFATP2a_ACSVL_like Fatty acid transport proteins (FATP) including hsFATP2, hsFATP5, and hsFATP6, and similar proteins. Fatty acid transport proteins (FATP) of this family transport long-chain or very-long-chain fatty acids across the plasma membrane. At least five copies of FATPs are identified in mammalian cells. This family includes hsFATP2, hsFATP5, and hsFATP6, and similar proteins. Each FATP has unique patterns of tissue distribution. These FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and its activation enzymes. The hsFATP proteins exist in two splice variants; the b variant, lacking exon 3, has no acyl-CoA synthetase activity. FATPs are key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis. 537
25361 341262 cd05939 hsFATP4_like Fatty acid transport proteins (FATP), including FATP4 and FATP1, and similar proteins. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. At least five copies of FATPs are identified in mammalian cells. This family includes FATP4, FATP1, and homologous proteins. Each FATP has unique patterns of tissue distribution. FATP4 is mainly expressed in the brain, testis, colon and kidney. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and its activation enzymes. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis. 474
25362 341263 cd05940 FATP_FACS Fatty acid transport proteins (FATP) play dual roles as fatty acid transporters and its activation enzymes. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and its activation enzymes. At least five copies of FATPs are identified in mammalian cells. This family also includes prokaryotic FATPs. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis. 449
25363 341264 cd05941 MCS Malonyl-CoA synthetase (MCS). MCS catalyzes the formation of malonyl-CoA in a two-step reaction consisting of the adenylation of malonate with ATP, followed by malonyl transfer from malonyl-AMP to CoA. Malonic acid and its derivatives are the building blocks of polyketides and malonyl-CoA serves as the substrate of polyketide synthases. Malonyl-CoA synthetase has broad substrate tolerance and can activate a variety of malonyl acid derivatives. MCS may play an important role in biosynthesis of polyketides, the important secondary metabolites with therapeutic and agrochemical utility. 442
25364 341265 cd05943 AACS Acetoacetyl-CoA synthetase (acetoacetate-CoA ligase, AACS). AACS is a cytosolic ligase that specifically activates acetoacetate to its coenzyme A ester by a two-step reaction. Acetoacetate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is the first step of the mevalonate pathway of isoprenoid biosynthesis via isopentenyl diphosphate. Isoprenoids are a large class of compounds found in all living organisms. AACS is widely distributed in bacteria, archaea and eukaryotes. In bacteria, AACS is known to exhibit an important role in the metabolism of poly-b-hydroxybutyrate, an intracellular reserve of organic carbon and chemical energy by some microorganisms. In mammals, AACS influences the rate of ketone body utilization for the formation of physiologically important fatty acids and cholesterol. 629
25365 341266 cd05944 FACL_like_4 Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 359
25366 341267 cd05945 DltA D-alanine:D-alanyl carrier protein ligase (DltA) and similar proteins. This family includes D-alanyl carrier protein ligase DltA and aliphatic beta-amino acid adenylation enzymes IdnL1 and CmiS6. DltA incorporates D-ala in teichoic acids in gram-positive bacteria via a two-step process, starting with adenylation of D-alanine that transfers D-alanine to the D-alanyl carrier protein. IdnL1, a short-chain aliphatic beta-amino acid adenylation enzyme, recognizes 3-aminobutanoic acid, and is involved in the synthesis of the macrolactam antibiotic incednine. CmiS6 is a medium-chain beta-amino acid adenylation enzyme that recognizes 3-aminononanoic acid, and is involved in the synthesis of cremimycin, also a macrolactam antibiotic. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 449
25367 341268 cd05958 ABCL 2-aminobenzoate-CoA ligase (ABCL). ABCL catalyzes the initial step in the 2-aminobenzoate aerobic degradation pathway by activating 2-aminobenzoate to 2-aminobenzoyl-CoA. The reaction is carried out via a two-step process; the first step is ATP-dependent and forms a 2-aminobenzoyl-AMP intermediate, and the second step forms the 2-aminobenzoyl-CoA ester and releases the AMP. 2-Aminobenzoyl-CoA is further converted to 2-amino-5-oxo-cyclohex-1-ene-1-carbonyl-CoA catalyzed by 2-aminobenzoyl-CoA monooxygenase/reductase. ABCL has been purified from cells aerobically grown with 2-aminobenzoate as sole carbon, energy, and nitrogen source, and has been characterized as a monomer. 439
25368 341269 cd05959 BCL_4HBCL Benzoate CoA ligase (BCL) and 4-Hydroxybenzoate-Coenzyme A Ligase (4-HBA-CoA ligase). Benzoate CoA ligase and 4-hydroxybenzoate-coenzyme A ligase catalyze the first activating step for benzoate and 4-hydroxybenzoate catabolic pathways, respectively. Although these two enzymes share very high sequence homology, they have their own substrate preference. The reaction proceeds via a two-step process; the first ATP-dependent step forms the substrate-AMP intermediate, while the second step forms the acyl-CoA ester, releasing the AMP. Aromatic compounds represent the second most abundant class of organic carbon compounds after carbohydrates. Some bacteria can use benzoic acid or benzenoid compounds as the sole source of carbon and energy through degradation. Benzoate CoA ligase and 4-hydroxybenzoate-Coenzyme A ligase are key enzymes of this process. 508
25369 341270 cd05966 ACS Acetyl-CoA synthetase (also known as acetate-CoA ligase and acetyl-activating enzyme). Acetyl-CoA synthetase (ACS, EC 6.2.1.1, acetate-CoA ligase or acetate:CoA ligase (AMP-forming)) catalyzes the formation of acetyl-CoA from acetate, CoA, and ATP. Synthesis of acetyl-CoA is carried out in a two-step reaction. In the first step, the enzyme catalyzes the synthesis of acetyl-AMP intermediate from acetate and ATP. In the second step, acetyl-AMP reacts with CoA to produce acetyl-CoA. This enzyme is widely present in all living organisms. The activity of this enzyme is crucial for maintaining the required levels of acetyl-CoA, a key intermediate in many important biosynthetic and catabolic processes. Acetyl-CoA is used in the biosynthesis of glucose, fatty acids, and cholesterol. It can also be used in the production of energy in the citric acid cycle. Eukaryotes typically have two isoforms of acetyl-CoA synthetase, a cytosolic form involved in biosynthetic processes and a mitochondrial form primarily involved in energy generation. 608
25370 341271 cd05967 PrpE Propionyl-CoA synthetase (PrpE). EC 6.2.1.17: propanoate:CoA ligase (AMP-forming) or propionate-CoA ligase (PrpE) catalyzes the first step of the 2-methylcitric acid cycle for propionate catabolism. It activates propionate to propionyl-CoA in a two-step reaction, which proceeds through a propionyl-AMP intermediate and requires ATP and Mg2+. In Salmonella enterica, the PrpE protein is required for growth of Salmonella enterica on propionate and can substitute for the acetyl-CoA synthetase (Acs) enzyme during growth on acetate. PrpE can also activate acetate, 3HP, and butyrate to their corresponding CoA-thioesters, although with less efficiency. 617
25371 341272 cd05968 AACS_like Uncharacterized acyl-CoA synthetase subfamily similar to Acetoacetyl-CoA synthetase. This uncharacterized acyl-CoA synthetase family (EC 6.2.1.16, or acetoacetate-CoA ligase or acetoacetate:CoA ligase (AMP-forming)) is highly homologous to acetoacetyl-CoA synthetase. However, the proteins in this family exist in only bacteria and archaea. AACS is a cytosolic ligase that specifically activates acetoacetate to its coenzyme A ester by a two-step reaction. Acetoacetate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is the first step of the mevalonate pathway of isoprenoid biosynthesis via isopentenyl diphosphate. Isoprenoids are a large class of compounds found in all living organisms. 610
25372 341273 cd05969 MACS_like_4 Uncharacterized subfamily of Acetyl-CoA synthetase like family (ACS). This family is most similar to acetyl-CoA synthetase. Acetyl-CoA synthetase (ACS) catalyzes the formation of acetyl-CoA from acetate, CoA, and ATP. Synthesis of acetyl-CoA is carried out in a two-step reaction. In the first step, the enzyme catalyzes the synthesis of acetyl-AMP intermediate from acetate and ATP. In the second step, acetyl-AMP reacts with CoA to produce acetyl-CoA. This enzyme is only present in bacteria. 442
25373 341274 cd05970 MACS_AAE_MA_like Medium-chain acyl-CoA synthetase (MACS) of AAE_MA like. MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This family of MACS enzymes is found in archaea and bacteria. It is represented by the acyl-adenylating enzyme from Methanosarcina acetivorans (AAE_MA). AAE_MA is most active with propionate, butyrate, and the branched analogs: 2-methyl-propionate, butyrate, and pentanoate. The specific activity is weaker for smaller or larger acids. 537
25374 341275 cd05971 MACS_like_3 Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria. 439
25375 341276 cd05972 MACS_like Medium-chain acyl-CoA synthetase (MACS or ACSM). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The acyl-CoA is a key intermediate in many important biosynthetic and catabolic processes. 428
25376 341277 cd05973 MACS_like_2 Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria. 437
25377 341278 cd05974 MACS_like_1 Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria. 432
25378 99716 cd05992 PB1 The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, an acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1 interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 81
25379 100076 cd06006 R3H_unknown_2 R3H domain of a group of fungal proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 59
25380 100077 cd06007 R3H_DEXH_helicase R3H domain of a group of proteins which also contain a DEXH-box helicase domain, and may function as ATP-dependent DNA or RNA helicases. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner. 59
25381 100116 cd06008 NF-X1-zinc-finger Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms. 49
25382 276963 cd06059 Tubulin The tubulin superfamily and related homologs. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules. The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications. The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. Also included in this group is the mitochondrial Misato/DML1 protein family, involved in mitochondrial fusion and in mitochondrial distribution and morphology. 387
25383 276964 cd06060 misato Misato segment II tubulin-like domain. Human Misato shows similarity with Tubulin/FtsZ family of GTPases and is localized to the outer membrane of mitochondria. It has a role in mitochondrial fusion and in mitochondrial distribution and morphology. Mutations in its Drosophila homolog (misato) lead to irregular chromosome segregation during mitosis. Deletion of the budding yeast homolog DML1 is lethal and unregulated expression of DML1 leads to mitochondrial dispersion and abnormalities in cell morphology. The Misato/DML1 protein family is conserved from yeast to human, but its exact function is still unknown. 539
25384 100037 cd06061 PurM-like1 AIR synthase (PurM) related protein, subgroup 1 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain. 298
25385 99873 cd06062 H2MP_MemB-H2up Endopeptidases belonging to membrane-bound hydrogenases group. These hydrogenases transfer electrons from H2 to a cytochrome that is bound to a membrane-located complex coupling electron transfer to transmembrane proton translocation. Endopeptidase HybD from E. coli is well studied in this group. Maturation of [NiFe] hydrogenases includes proteolytic processing of the large subunit, assembly with other subunits, and formation of the nickel metallocenter. Hydrogenase maturation endopeptidase (HybD) cleaves a short C-terminal peptide after a His or an Arg residue in the large subunit (pre-HybC) of hydrogenase 2 (hyb operon) in E. coli. This cleavage is nickel dependent. A variety of endopeptidases belong to this group that are similar in function and sequence homology. They include such proteins as HynC, HoxM, and HupD. 146
25386 99874 cd06063 H2MP_Cyano-H2up This group of endopeptidases includes HupW enzymes that are specific to the cyanobacterial hydrogenase and are involved in the C-terminal cleavage of the hydrogenase large subunit precursor protein. Cyanobacterial nickel-iron (NiFe)-hydrogenases are found exclusively in the N2-fixing strains and are encoded by hup (hydrogen uptake) genes. These uptake hydrogenases are heterodimers with a large (hupL) and small subunit (hupS) and catalyze the consumption of the H2 produced during N2 fixation. Sequence similarity shows that the putative metal-binding residues are well conserved in this group of hydrogen maturation proteases. This group also includes such proteins as the hydrogenase III from Aquifex aeolicus. 146
25387 99875 cd06064 H2MP_F420-Reduc Endopeptidases belonging to F420-reducing hydrogenases group. These hydrogenases from methanogens are encoded by the fru, frc, or frh genes. Sequence comparison indicates that fruD and frcD gene products from Methanococcus voltae are similar to HycI protease of Escherichia coli and are putatively involved in the C-terminal processing of large subunits (FruA and FrcA respectively). FrhD (F420 reducing hydrogenase delta subunit) enzyme belongs to the gene cluster of 8-hydroxy-5-deazaflavin (F420) reducing hydrogenase (FRH) from the thermophilic methanogen Methanobacterium thermoautotrophicum delta H. The FrhD subunit is putatively involved in the processing of the coenzyme F420 hydrogenase. It is similar to those frhD genes found in Methanomicrobia and Methanobacteria. It is different from the FrhD conserved domain found in methyl viologen-reducing hydrogenase and F420-non-reducing hydrogenase iron-sulfur subunit D. 150
25388 99876 cd06066 H2MP_NAD-link-bidir Endopeptidases that belong to the bidirectional NAD-linked hydrogenase group. The endopeptidases in this group are highly specific carboxyl-terminal proteases (HoxW protease) which release a 24-amino-acid peptide from HoxH prior to progression of subunit assembly. These bidirectional hydrogenases are heteropentamers encoded by the hox (hydrogen oxidation) genes, in which complex HoxEFU shows the diaphorase activity, and HoxYH constitutes the NiFe-hydrogenase. 139
25389 99877 cd06067 H2MP_MemB-H2evol Endopeptidases belonging to membrane-bound hydrogen evolving hydrogenase group. In hydrogenase 3 from E. coli, the maturation of the large subunit (HycE) requires the cleavage of a C-terminal peptide by the endopeptidase HycI, before the final formation of the [NiFe] metallocenter. HycI protease is a monomer and lacks characteristic signature motifs of serine, zinc, cysteine, or acid proteases and thus its cleavage reaction is not inhibited by conventional inhibitors of serine and metalloproteases. Such hydrogenases as those from Methanosarcina barkeri (EchCE) and Rhodospirillum rubrum (CooLH) also belong to this group of membrane-bound hydrogen evolving hydrogenases. Sequence comparison of the large subunits from related hydrogenases indicates that in contrast to EchE (358 amino acids) and CooH (361 amino acids), the large subunit HycE (569 amino acids) contains an extra carboxy-terminal stretch of 32 amino acids that is cleaved during the maturation process. In the absence of this C-terminal stretch, there is no homolog of endopeptidase HycI found in these two related hydrogenases. 136
25390 99878 cd06068 H2MP_like-1 Putative [NiFe] hydrogenase-specific C-terminal protease. Sequence comparison shows similarity to hydrogenase-specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). Maturation of [NiFe] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HycE (the large subunit of hydrogenase 3). This cleavage is nickel dependent. 144
25391 99879 cd06070 H2MP_like-2 Putative [NiFe] hydrogenase-specific C-terminal protease. Sequence comparison shows similarity to hydrogenase-specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). Maturation of [NiFe] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HycE (the large subunit of hydrogenase 3). This cleavage is nickel dependent. 140
25392 100117 cd06071 Beach BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins. 275
25393 99903 cd06080 MUM1_like Mutated melanoma-associated antigen 1 (MUM-1) is a melanoma-associated antigen (MAA). MUM-1 belongs to the mutated or aberrantly expressed type of MAAs, along with antigens such as CDK4, beta-catenin, gp100-in4, p15, and N-acetylglucosaminyltransferase V. It is highly expressed in several types of human cancers. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 80
25394 240505 cd06081 KOW_Spt5_1 KOW domain of Spt5, repeat 1. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 38
25395 240506 cd06082 KOW_Spt5_2 KOW domain of Spt5, repeat 2. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 51
25396 240507 cd06083 KOW_Spt5_3 KOW domain of Spt5, repeat 3. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 51
25397 240508 cd06084 KOW_Spt5_4 KOW domain of Spt5, repeat 4. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 43
25398 240509 cd06085 KOW_Spt5_5 KOW domain of Spt5, repeat 5. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 52
25399 240510 cd06086 KOW_Spt5_6 KOW domain of Spt5, repeat 6. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases. 58
25400 240511 cd06087 KOW_RPS4 KOW motif of Ribosomal Protein S4 (RPS4). RPS4 plays a critical role in the core assembly of the small ribosomal subunit and has a KOW motif at its C-terminus. RPS4 also acts as a general transcription antiterminator factor and regulates the ribosomal RNA expression level. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. RPS4 deficiency in humans has been associated with Turner syndrome. Archaeal RPS4 (RPS4e) shows substantial identity to its eukaryotic equivalents, but the archaeal proteins form a different complex from the eukaryotic proteins. 55
25401 240512 cd06088 KOW_RPL14 KOW motif of Ribosomal Protein L14. RPL14 is a component of the large ribosomal subunit in both archaea and eukaryotes, with a KOW motif at its N-terminus. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. Auto-antibodies to RPL14 in humans have been associated with systemic lupus erythematosus. Although RPL14 is well conserved, it is not found in all archaea, and therefore it is presumably not essential. 76
25402 240513 cd06089 KOW_RPL26 KOW motif of Ribosomal Protein L26. RPL26 and its bacterial paralog RPL24 have a KOW motif at their N-terminus. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. RPL26 makes only a very minor contribution to the biogenesis, structure, and function of 60S ribosomal subunits. However, RPL24 is essential for generating the first intermediate during 50S ribosomal subunit assembly. RPL26 also has an extra-ribosomal function: it enhances p53 translation after DNA damage. 65
25403 240514 cd06090 KOW_RPL27 KOW motif of eukaryotic Ribosomal Protein L27. RPL27e has a KOW motif at its N-terminus. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. 83
25404 240515 cd06091 KOW_NusG NusG contains an NGN domain at its N-terminus and a KOW motif at its C-terminus. The KOW_NusG motif is one of the two domains of N-Utilization Substance G (NusG), a transcription elongation and Rho-termination factor in bacteria and archaea. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The eukaryotic ortholog of NusG is Spt5, which has multiple KOW motifs at its C-terminus. 56
25405 132768 cd06093 PX_domain The Phox Homology domain, a phosphoinositide binding module. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting. 106
25406 133158 cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 89
25407 133159 cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A. 86
25408 133160 cd06096 Plasmepsin_5 Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal lobes of the enzyme. There are four types of plasmepsins, closely related but varying in the specificity of cleavage site. The name plasmepsin may come from Plasmodium (the organism) and pepsin (a common aspartic acid protease with similar molecular structure). This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 326
25409 133161 cd06097 Aspergillopepsin_like Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The fungal species represented in this family include some of the most economically important fungi. The proteases may serve as virulence factors or as industrial aids. For example, aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity, and aspergillopepsins from the mold A. saitoi are used in the fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-site cleft capable of interacting with multiple residues of a substrate. Although members of the aspartic protease family of enzymes have very similar three-dimensional structures and catalytic mechanisms, each has unique substrate specificity. The members of this family have an acidic pH optimum (5.5) and cleave protein substrates with similar specificity to that of porcine pepsin A, preferring hydrophobic residues at P1 and P1' in the cleavage site. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 278
25410 133162 cd06098 phytepsin Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant-specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA). 317
25411 99853 cd06099 CS_ACL-C_CCL Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and CoA-dependent cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two-enzyme system, the second enzyme of which is CCL. 213
25412 99854 cd06100 CCL_ACL-C Citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CCL cleaves citryl-CoA (CiCoA) to acetyl-CoA (AcCoA) and oxaloacetate (OAA). ACL catalyzes an ATP- and CoA-dependent cleavage of citrate to form AcCoA and OAA in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. ACL may be required for fruiting body maturation in the filamentous fungus Sordaria macrospora. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two-enzyme system, the second enzyme of which is CCL; the first enzyme is citryl-CoA synthetase (CCS), which is not included in this group. Chlorobium limicola ACL is an example of a two-subunit type ACL. It is composed of a large and a small subunit; it has been speculated that the large subunit arose from a fusion of the small subunit of the two-subunit CCS with CCL. The small ACL subunit is a homolog of the larger CCS subunit. Mammalian ACL is of the single-subunit type and may have arisen from the two-subunit ACL by another gene fusion. Mammalian ACLs are homotetramers; the ACLs of C. limicola and Arabidopsis are heterooctamers (alpha4beta4). In cancer cells there is a shift in energy metabolism to aerobic glycolysis; the glycolytic end product pyruvate enters a truncated TCA cycle, generating citrate which is cleaved in the cytosol by ACL. Inhibiting ACL limits the in vitro proliferation and survival of these cancer cells, reduces in vivo tumor growth, and induces differentiation. 227
25413 99855 cd06101 citrate_synt Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and form homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria. 265
25414 99856 cd06102 citrate_synt_like_2 Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria. 282
25415 99857 cd06103 ScCS-like Saccharomyces cerevisiae (Sc) citrate synthase (CS)-like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) with oxaloacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). Some CS proteins function as 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). This group includes three S. cerevisiae CS proteins, ScCit1, -2, and -3. ScCit1 is a nuclear-encoded mitochondrial CS with high specificity for AcCoA; in addition to having activity with AcCoA, it plays a part in the construction of the TCA cycle metabolon. Yeast cells deleted for Cit1 are hyper-susceptible to apoptosis induced by heat and aging stress. ScCit2 is a peroxisomal CS involved in the glyoxylate cycle; in addition to having activity with AcCoA, it may have activity with PrCoA. ScCit3 is a mitochondrial CS and functions in the metabolism of PrCoA; it is a dual specificity CS and 2MCS, having similar catalytic efficiency with both AcCoA and PrCoA. The pattern of expression of the ScCIT3 gene follows that of the ScCIT1 gene and its expression is increased in the presence of a ScCIT1 deletion. Included in this group are the Tetrahymena 14 nm filament protein, which functions as a CS in mitochondria and as a cytoskeletal component in cytoplasm, and Geobacter sulfurreducens (GSu) CS. GSuCS is dimeric and eukaryotic-like; it lacks 2MCS activity and is inhibited by ATP. In contrast to eukaryotic and other prokaryotic CSs, GSuCS is not stimulated by K+ ions. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. 426
25416 99858 cd06105 ScCit1-2_like Saccharomyces cerevisiae (Sc) citrate synthases Cit1-2_like. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) with oxaloacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). Some CS proteins function as 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). ScCit1 is a nuclear-encoded mitochondrial CS with high specificity for AcCoA. In addition to its CS function, ScCit1 plays a part in the construction of the TCA cycle metabolon. Yeast cells deleted for Cit1 are hyper-susceptible to apoptosis induced by heat and aging stress. ScCit2 is a peroxisomal CS involved in the glyoxylate cycle; in addition to having activity with AcCoA, it may have activity with PrCoA. Chicken and pig heart CS, two Arabidopsis thaliana (Ath) CSs, CSY4 and -5, and Aspergillus niger (An) CS also belong to this group. AthCSY4 has a mitochondrial targeting sequence; AthCSY5 has no identifiable targeting sequence. AnCS, encoded by the citA gene, has both an N-terminal mitochondrial import signal and a C-terminal peroxisomal target sequence; it is not known if both these signals are functional in vivo. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. 427
25417 99859 cd06106 ScCit3_like Saccharomyces cerevisiae (Sc) 2-methylcitrate synthase Cit3-like. 2-methylcitrate synthase (2MCS) catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxaloacetate (OAA) to form 2-methylcitrate and CoA. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) with OAA to form citrate and CoA, the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). ScCit3 is mitochondrial and functions in the metabolism of PrCoA; it is a dual specificity CS and 2MCS, having similar catalytic efficiency with both AcCoA and PrCoA. The pattern of expression of the ScCIT3 gene follows that of the major mitochondrial CS gene (CIT1, not included in this group) and its expression is increased in the presence of a CIT1 deletion. This group also contains Aspergillus nidulans 2MCS; a deletion of the gene encoding this protein results in a strain unable to grow on propionate. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. 428
25418 99860 cd06107 EcCS_AthCS-per_like Escherichia coli (Ec) citrate synthase (CS) gltA and Arabidopsis thaliana (Ath) peroxisomal (Per) CS_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs, including EcCS, are strongly and specifically inhibited by NADH through an allosteric mechanism. Included in this group is an NADH-insensitive type II Acetobacter aceti CS which has retained many of the residues used by EcCS for NADH binding. C. aurantiacus is a gram-negative thermophilic green gliding bacterium; its CS, which belongs to this group, may be a type I CS. It is not inhibited by NADH or 2-oxoglutarate and is inhibited by ATP. Both gram-positive and gram-negative bacteria are found in this group. This group also contains three Arabidopsis peroxisomal CS proteins, CSY1, -2, and -3, which participate in the glyoxylate cycle. AthCSY1, in addition to a peroxisomal targeting sequence, has a predicted secretory signal peptide; it may be targeted to both the secretory pathway and the peroxisomes and perhaps is located in the extracellular matrix. AthCSY1 is expressed only in siliques and specifically in developing seeds. AthCSY2 and -3 are active during seed germination and seedling development and are thought to participate in the beta-oxidation of fatty acids. 382
25419 99861 cd06108 Ec2MCS_like Escherichia coli (Ec) 2-methylcitrate synthase (2MCS)_like. 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxalacetate (OAA) to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and OAA to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group contains proteins similar to the E. coli 2MCS, EcPrpC. EcPrpC is one of two CS isozymes in the gram-negative E. coli. EcPrpC is a dimeric (type I) CS; it is induced during growth on propionate and prefers PrCoA as a substrate though it has partial CS activity with AcCoA. This group also includes Salmonella typhimurium PrpC and Ralstonia eutropha (Re) 2-MCS1 which are also induced during growth on propionate and prefer PrCoA as substrate, but can also use AcCoA. Re 2-MCS1 can use butyryl-CoA and valeryl-CoA at a lower rate. A second Ralstonia eutropha 2MCS, Re 2-MCS2, which is induced on propionate is also found in this group. This group may include proteins which may function exclusively as a CS, those which may function exclusively as a 2MCS, or those with dual specificity which function as both a CS and a 2MCS. 363
25420 99862 cd06109 BsCS-I_like Bacillus subtilis (Bs) citrate synthase CS-I_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains proteins similar to BsCS-I, one of two CS isozymes in the gram-positive B. subtilis. The majority of CS activity in B. subtilis is provided by the other isozyme, BsCS-II (not included in this group). BsCS-I has a lower catalytic activity than BsCS-II, and has a Glu in place of a key catalytic Asp residue. This change is conserved in other members of this group. For E. coli CS (not included in this group), site directed mutagenesis of the key Asp residue to a Glu converts the enzyme into citryl-CoA lyase which cleaves citryl-CoA to AcCoA and OAA. A null mutation in the gene encoding BsCS-I (citA) had little effect on B. subtilis CS activity or on sporulation. However, disruption of the citA gene in a strain null for the gene encoding BsCS-II resulted in a sporulation deficiency, a characteristic of strains defective in the Krebs cycle. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. Many of the gram-negative species represented in this group have a second CS isozyme which is in another group. 349
25421 99863 cd06110 BSuCS-II_like Bacillus subtilis (Bs) citrate synthase (CS)-II_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains proteins similar to BsCS-II, the major CS of the gram-positive bacterium Bacillus subtilis. A mutation in the gene which encodes BsCS-II (citZ gene) has been described which resulted in a significant loss of CS activity, partial glutamate auxotrophy, and a sporulation deficiency, all of which are characteristic of strains defective in the Krebs cycle. Streptococcus mutans CS, found in this group, may participate in a pathway for the anaerobic biosynthesis of glutamate. This group also contains functionally uncharacterized CSs of various gram-negative bacteria. Some of the gram-negative species represented in this group have a second CS isozyme found in another group. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. 356
25422 99864 cd06111 DsCS_like Cold-active citrate synthase (CS) from an Antarctic bacterial strain DS2-3R (Ds)-like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2-methylcitrate synthase (2MCS) catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. DsCS, compared with CS from the hyperthermophile Pyrococcus furiosus (not included in this group), has an increase in the size of surface loops, a higher proline content in the loop regions, a more accessible active site, and a higher number of intramolecular ion pairs. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. For example, included in this group are Corynebacterium glutamicum (Cg) PrpC1 and -2, which are only synthesized during growth on propionate-containing medium, can use PrCoA, AcCoA and butyryl-CoA as substrates, and have comparable catalytic activity with AcCoA as the major CgCS (GltA, not included in this group). 362
25423 99865 cd06112 citrate_synt_like_1_1 Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. 373
25424 99866 cd06113 citrate_synt_like_1_2 Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) a carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) hydrolysis of citryl-CoA to produce citrate and CoA. CSs are found in two structural types: type I (homodimeric) and type II CSs (homohexameric). Type II CSs are unique to gram-negative bacteria. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria. Type I CS is active as a homodimer, both subunits participating in the active site. Type II CS is a hexamer of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria. 406
25425 99867 cd06114 EcCS_like Escherichia coli (Ec) citrate synthase (CS) GltA_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs, including EcCS, are strongly and specifically inhibited by NADH through an allosteric mechanism. Included in this group is an NADH-insensitive type II Acetobacter aceti CS which has retained many of the residues used by EcCS for NADH binding. 400
25426 99868 cd06115 AthCS_per_like Arabidopsis thaliana (Ath) peroxisomal (Per) CS_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains three Arabidopsis peroxisomal CS proteins, CSY1, -2, and -3 which are involved in the glyoxylate cycle. AthCSY1, in addition to a peroxisomal targeting sequence, has a predicted secretory signal peptide; it may be targeted to both the secretory pathway and the peroxisomes and is thought to be located in the extracellular matrix. AthCSY1 is expressed only in siliques and specifically in developing seeds. AthCSY2 and 3 are active during seed germination and seedling development and are thought to participate in the beta-oxidation of fatty acids. 410
25427 99869 cd06116 CaCS_like Chloroflexus aurantiacus (Ca) citrate synthase (CS)_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group is similar to gram-negative Escherichia coli (Ec) CS (type II, gltA) and Arabidopsis thaliana (Ath) peroxisomal (Per) CS. However, EcCS and AthPerCS are not found in this group. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. C. aurantiacus is a gram-negative thermophilic green gliding bacterium; its CS belonging to this group may be a type I CS. It is not inhibited by NADH or 2-oxoglutarate and is inhibited by ATP. Both gram-positive and gram-negative bacteria are found in this group. 384
25428 99870 cd06117 Ec2MCS_like_1 Subgroup of Escherichia coli (Ec) 2-methylcitrate synthase (2MCS)_like. 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxalacetate (OAA) to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and OAA to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group contains proteins similar to the E. coli 2MCS, EcPrpC. EcPrpC is one of two CS isozymes in the gram-negative E. coli. EcPrpC is a dimeric (type I) CS; it is induced during growth on propionate and prefers PrCoA as a substrate, but has partial CS activity with AcCoA. This group also includes Salmonella typhimurium PrpC and Ralstonia eutropha (Re) 2-MCS1 which are also induced during growth on propionate and prefer PrCoA as substrate, but can also use AcCoA. Re 2-MCS1 can use butyryl-CoA and valeryl-CoA at a low rate. A second Ralstonia eutropha 2MCS, Re 2-MCS2, which is induced on propionate, is also found in this group. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. 366
25429 99871 cd06118 citrate_synt_like_1 Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. 358
25430 380376 cd06121 cupin_YML079wp Saccharomyces cerevisiae YML079wp and related proteins, cupin domain. This family includes eukaryotic, bacterial, and archaeal proteins homologous to YML079wp, a Saccharomyces cerevisiae cupin-like protein of unknown function that structurally resembles plant seed storage and ligand-binding proteins (canavalin, glycinin, auxin binding protein) as well as the bacterial RmlC epimerase. YML079wp is non-essential in yeast and localizes to the nucleus and cytoplasm. The presence of a hydrophobic ligand within a well-conserved binding pocket inside the cupin beta-barrel and sequence similarity with bacterial epimerases suggest a possible biochemical function for YML079wp and its homologs. Also included in this family are Shewanella oneidensis So0799, Agrobacterium fabrum Atu3615 and Branchiostoma belcheri Bbduf985. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold and form homodimers. 153
25431 380377 cd06122 cupin_TTHA0104 Thermus thermophilus TTHA0104 and related proteins, cupin domain. This family contains bacterial proteins including TTHA0104 (also called TT1209), a putative antibiotic synthesis protein from Thermus thermophilus. TTHA0104 is a cupin-like protein. The cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins. This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin (cupa is the Latin term for small barrel). 102
25432 380378 cd06123 cupin_HAO 3-Hydroxyanthranilate-3,4-dioxygenase, cupin domain. 3-Hydroxyanthranilate-3,4-dioxygenase (HAO or 3HAO) is a non-heme iron-dependent extradiol dioxygenase that catalyzes the oxidative ring opening of 3-hydroxyanthranilate (3-HAA) in the final enzymatic step of the kynurenine biosynthetic pathway in which tryptophan is converted to quinolinate, an endogenous neurotoxin, making HAO a target for pharmacological downregulation. Quinolinate is also the universal de novo precursor to the pyridine ring of nicotinamide adenine dinucleotide. The enzyme forms homodimers, with two metal binding sites per molecule. One of the bound metal ions occupies the proposed ferrous-coordinated active site, which is located in a conserved double-strand beta-helix domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 153
25433 380379 cd06124 cupin_NimR-like_N AraC/XylS family transcriptional regulators similar to NimR, N-terminal cupin domain. This family contains mostly bacterial proteins that have an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain and may be transcriptional regulators. Included in this family is Escherichia coli HTH-type transcriptional regulator NimR (also called YeaM) that negatively regulates expression of the nimT operon and its own expression. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 95
25434 176647 cd06125 DnaQ_like_exo DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily. The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases is accompanied by horizontal gene transfer. 96
25435 176648 cd06127 DEDDh DEDDh 3'-5' exonuclease domain family. DEDDh exonucleases, part of the DnaQ-like (or DEDD) exonuclease superfamily, catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. These proteins contain four invariant acidic residues in three conserved sequence motifs termed ExoI, ExoII and ExoIII. DEDDh exonucleases are classified as such because of the presence of specific Hx(4)D conserved pattern at the ExoIII motif. The four conserved acidic residues are clustered around the active site and serve as ligands for the two metal ions required for catalysis. Most DEDDh exonucleases are the proofreading subunits (epsilon) or domains of bacterial DNA polymerase III, the main replicating enzyme in bacteria, which functions as the chromosomal replicase. Other members include other DNA and RNA exonucleases such as RNase T, Oligoribonuclease, and RNA exonuclease (REX), among others. 159
25436 176649 cd06128 DNA_polA_exo DEDDy 3'-5' exonuclease domain of family-A DNA polymerases. The 3'-5' exonuclease domain of family-A DNA polymerases has a fundamental role in reducing polymerase errors and is involved in proofreading activity. Family-A DNA polymerases contain a DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-B DNA polymerases. The exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four invariant acidic residues that serve as ligands for the two metal ions required for catalysis. The Klenow fragment (KF) of Escherichia coli Pol I, the Thermus aquaticus (Taq) Pol I, and Bacillus stearothermophilus (BF) Pol I are examples of family-A DNA polymerases. They are involved in nucleotide excision repair and in the processing of Okazaki fragments that are generated during lagging strand synthesis. The N-terminal domains of BF Pol I and Taq Pol I resemble the fold of the 3'-5' exonuclease domain of KF without the proofreading activity of KF. The four critical metal-binding residues are not conserved in BF Pol I and Taq Pol I, and they are unable to bind metals necessary for exonuclease activity. 151
25437 176650 cd06129 RNaseD_like DEDDy 3'-5' exonuclease domain of RNase D, WRN, and similar proteins. The RNase D-like group is composed of RNase D, WRN, and similar proteins. They contain a DEDDy-type, DnaQ-like, 3'-5' exonuclease domain that contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. RNase D is involved in the 3'-end processing of tRNA precursors. RNase D-like proteins in eukaryotes include yeast Rrp6p, human PM/Scl-100 and Drosophila melanogaster egalitarian (Egl) protein. WRN is a unique DNA helicase possessing exonuclease activity. Mutation in the WRN gene is implicated in Werner syndrome, a disease associated with premature aging and increased predisposition to cancer. Yeast Rrp6p and the human Polymyositis/scleroderma autoantigen 100kDa (PM/Scl-100) are exosome-associated proteins involved in the degradation and processing of precursors to stable RNAs. Egl is a component of an mRNA-binding complex which is required for oocyte specification. The Egl subfamily does not possess a completely conserved YX(3)D pattern at the ExoIII motif. 161
25438 99834 cd06130 DNA_pol_III_epsilon_like an uncharacterized bacterial subgroup of the DEDDh 3'-5' exonuclease domain family with similarity to the epsilon subunit of DNA polymerase III. This subfamily is composed of uncharacterized bacterial proteins with similarity to the epsilon subunit of DNA polymerase III (Pol III), a multisubunit polymerase which is the main DNA replicating enzyme in bacteria, functioning as the chromosomal replicase. The Pol III holoenzyme is a complex of ten different subunits, three of which (alpha, epsilon, and theta) compose the catalytic core. The Pol III epsilon subunit, encoded by the dnaQ gene, is a DEDDh-type 3'-5' exonuclease which is responsible for the proofreading activity of the polymerase, increasing the fidelity of DNA synthesis. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The epsilon subunit of Pol III also functions as a stabilizer of the holoenzyme complex. 156
25439 99835 cd06131 DNA_pol_III_epsilon_Ecoli_like DEDDh 3'-5' exonuclease domain of the epsilon subunit of Escherichia coli DNA polymerase III and similar proteins. This subfamily is composed of the epsilon subunit of Escherichia coli DNA polymerase III (Pol III) and similar proteins. Pol III is the main DNA replicating enzyme in bacteria, functioning as the chromosomal replicase. It is a holoenzyme complex of ten different subunits, three of which (alpha, epsilon, and theta) compose the catalytic core. The Pol III epsilon subunit, encoded by the dnaQ gene, is a DEDDh-type 3'-5' exonuclease which is responsible for the proofreading activity of the polymerase, increasing the fidelity of DNA synthesis. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The epsilon subunit of Pol III also functions as a stabilizer of the holoenzyme complex. 167
25440 99836 cd06133 ERI-1_3'hExo_like DEDDh 3'-5' exonuclease domain of Caenorhabditis elegans ERI-1, human 3' exonuclease, and similar proteins. This subfamily is composed of Caenorhabditis elegans ERI-1, human 3' exonuclease (3'hExo), Drosophila exonuclease snipper (snp), and similar proteins from eukaryotes and bacteria. These are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. ERI-1 has been implicated in the degradation of small interfering RNAs (RNAi). 3'hExo participates in the degradation of histone mRNAs. Snp is a non-essential exonuclease that efficiently degrades structured RNA and DNA substrates as long as there is a minimum of 2 nucleotides in the 3' overhang to initiate degradation. Snp is not a functional homolog of either ERI-1 or 3'hExo. 176
25441 99837 cd06134 RNaseT DEDDh 3'-5' exonuclease domain of RNase T. RNase T is a DEDDh-type DnaQ-like 3'-5' exoribonuclease implicated in the 3' maturation of small stable RNAs and 23S rRNA, and in the end turnover of tRNA. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. RNase T is related to the proofreading domain of DNA polymerase III. Despite its important role, RNase T is mainly found only in gammaproteobacteria. It is speculated that it might have originated from DNA polymerase III at the time the gamma division of proteobacteria diverged from other bacteria. RNase T is a homodimer with the catalytic residues of one monomer contacting a large basic patch on the other monomer to form a functional active site. 189
25442 99838 cd06135 Orn DEDDh 3'-5' exonuclease domain of oligoribonuclease and similar proteins. Oligoribonuclease (Orn) is a DEDDh-type DnaQ-like 3'-5' exoribonuclease that is responsible for degrading small oligoribonucleotides to mononucleotides. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. Orn is essential for Escherichia coli survival. The human homolog, also called Sfn (small fragment nuclease), is able to hydrolyze short single-stranded RNA and DNA oligomers. It plays a role in cellular nucleotide recycling. 173
25443 99839 cd06136 TREX1_2 DEDDh 3'-5' exonuclease domain of three prime repair exonuclease (TREX)1, TREX2, and similar proteins. Three prime repair exonuclease (TREX)1 and TREX2 are closely related DEDDh-type DnaQ-like 3'-5' exonucleases. They contain three conserved sequence motifs known as ExoI, II, and III, with a specific Hx(4)D conserved pattern at ExoIII. These motifs contain four conserved acidic residues that participate in coordination of divalent metal ions required for catalysis. Both proteins play a role in the metabolism and clearance of DNA. TREX1 is the major 3'-5' exonuclease activity detected in mammalian cells. Mutations in the human TREX1 gene can cause Aicardi-Goutieres syndrome (AGS), which is characterized by perturbed innate immunity and presents itself as a severe neurological disease. TREX1 degrades ssDNA generated by aberrant replication intermediates to prevent checkpoint activation and autoimmune disease. There are distinct structural differences between TREX1 and TREX2 that point to different biological roles for these proteins. The main difference is the presence of about 70 amino acids at the C-terminus of TREX1. In addition, TREX1 has a nonrepetitive proline-rich region that is not present in the TREX2 protein. Furthermore, TREX2 contains a conserved DNA binding loop positioned adjacent to the active site that has a sequence distinct from the corresponding loop in TREX1. Truncations in the C-terminus of human TREX1 cause autosomal dominant retinal vasculopathy with cerebral leukodystrophy (RVCL), a neurovascular syndrome featuring a progressive loss of visual acuity combined with a variable neurological picture. 177
25444 99840 cd06137 DEDDh_RNase DEDDh 3'-5' exonuclease domain of the eukaryotic exoribonucleases PAN2, RNA exonuclease (REX)-1,-3, and -4, ISG20, and similar proteins. This group is composed of eukaryotic exoribonucleases that include PAN2, RNA exonuclease 1 (REX1 or Rex1p), REX3 (Rex3p), REX4 (or Rex4p), ISG20, and similar proteins. They are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. PAN2 is the catalytic subunit of poly(A) nuclease (PAN), a Pab1p-dependent 3'-5' exoribonuclease which plays an important role in the posttranscriptional maturation of pre-mRNAs. REX proteins are required for the processing and maturation of many RNA species, and ISG20 is an interferon-induced antiviral exonuclease with a strong preference for single-stranded RNA. 161
25445 99841 cd06138 ExoI_N N-terminal DEDDh 3'-5' exonuclease domain of Escherichia coli exonuclease I and similar proteins. This subfamily is composed of the N-terminal domain of Escherichia coli exonuclease I (ExoI) and similar proteins. ExoI is a monomeric enzyme that hydrolyzes single stranded DNA in the 3' to 5' direction. It plays a role in DNA recombination and repair. It primarily functions in repairing frameshift mutations. The N-terminal domain of ExoI is a DEDDh-type DnaQ-like 3'-5' exonuclease containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The ExoI structure is unique among DnaQ family enzymes in that there is a large distance between the two metal ions required for catalysis and the catalytic histidine is oriented away from the active site. 183
25446 176651 cd06139 DNA_polA_I_Ecoli_like_exo DEDDy 3'-5' exonuclease domain of Escherichia coli DNA polymerase I and similar bacterial family-A DNA polymerases. Escherichia coli-like Polymerase I (Pol I), a subgroup of family-A DNA polymerases, contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain in the same polypeptide chain as the polymerase domain. The exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The 3'-5' exonuclease domain of DNA polymerases has a fundamental role in reducing polymerase errors and is involved in proofreading activity. E. coli DNA Pol I is involved in genome replication but is not the main replicating enzyme. It is also implicated in DNA repair. 193
25447 176652 cd06140 DNA_polA_I_Bacillus_like_exo inactive DEDDy 3'-5' exonuclease domain of Bacillus stearothermophilus DNA polymerase I and similar family-A DNA polymerases. Bacillus stearothermophilus-like Polymerase I (Pol I), a subgroup of the family-A DNA polymerases, contains an inactive DnaQ-like 3'-5' exonuclease domain in the same polypeptide chain as the polymerase region. The exonuclease-like domain of these proteins possesses the same fold as the Klenow fragment (KF) of Escherichia coli Pol I, but does not contain the four critical metal-binding residues necessary for activity. The function of this domain is unknown. It might act as a spacer between the polymerase and the 5'-3' exonuclease domains. Some members of this subgroup, such as those from Bacillus sphaericus and Thermus aquaticus, are thermostable DNA polymerases. 178
25448 176653 cd06141 WRN_exo DEDDy 3'-5' exonuclease domain of WRN and similar proteins. WRN is a unique RecQ DNA helicase exhibiting an exonuclease activity. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. Mutations in the WRN gene cause Werner syndrome, an autosomal recessive disorder associated with premature aging and increased susceptibility to cancer and type II diabetes. WRN interacts with key proteins involved in DNA replication, recombination, and repair. It is believed to maintain genomic stability and life span by participating in DNA processes. WRN is stimulated by Ku70/80, an important regulator of genomic stability. 170
25449 176654 cd06142 RNaseD_exo DEDDy 3'-5' exonuclease domain of Ribonuclease D and similar proteins. Ribonuclease (RNase) D is a bacterial enzyme involved in the maturation of small stable RNAs and the 3' maturation of tRNA. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. In vivo, RNase D only becomes essential upon removal of other ribonucleases. Eukaryotic RNase D homologs include yeast Rrp6p, human PM/Scl-100, and the Drosophila melanogaster egalitarian protein. 178
25450 99846 cd06143 PAN2_exo DEDDh 3'-5' exonuclease domain of the eukaryotic exoribonuclease PAN2. PAN2 is the catalytic subunit of poly(A) nuclease (PAN), a Pab1p-dependent 3'-5' exoribonuclease which plays an important role in the posttranscriptional maturation of pre-mRNAs. PAN catalyzes the deadenylation of poly(A) tails, which are initially synthesized to default lengths of 70 to 90, to mRNA-specific lengths of 55 to 71. Pab1p and PAN also play a role in the export and decay of mRNA. PAN2 contains a DEDDh-type DnaQ-like 3'-5' exonuclease domain with three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. 174
25451 99847 cd06144 REX4_like DEDDh 3'-5' exonuclease domain of RNA exonuclease 4, XPMC2, Interferon Stimulated Gene product of 20 kDa, and similar proteins. This subfamily is composed of RNA exonuclease 4 (REX4 or Rex4p), XPMC2, Interferon (IFN) Stimulated Gene product of 20 kDa (ISG20), and similar proteins. REX4 is involved in pre-rRNA processing. It controls the ratio between the two forms of 5.8S rRNA in yeast. XPMC2 is a Xenopus gene which was identified through its ability to correct a mitotic defect in fission yeast. The human homolog of XPMC2 (hPMC2) may be involved in angiotensin II-induced adrenal cell cycle progression and cell proliferation. ISG20 is an IFN-induced antiviral exonuclease with a strong preference for single-stranded RNA and minor activity towards single-stranded DNA. These proteins are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. REX proteins function in the processing and maturation of many RNA species, similar to the function of Escherichia coli RNase T. 152
25452 99848 cd06145 REX1_like DEDDh 3'-5' exonuclease domain of RNA exonuclease 1, -3 and similar eukaryotic proteins. This subfamily is composed of RNA exonuclease 1 (REX1 or Rex1p), REX3 (or Rex3p), and similar eukaryotic proteins. In yeast, REX1 and REX3 are required for 5S rRNA and MRP (mitochondrial RNA processing) RNA maturation, respectively. They are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. REX1 is the major exonuclease responsible for pre-tRNA trailer trimming and may also be involved in nuclear CCA turnover. REX proteins function in the processing and maturation of many RNA species, similar to the function of Escherichia coli RNase T. 150
25453 176655 cd06146 mut-7_like_exo DEDDy 3'-5' exonuclease domain of Caenorhabditis elegans mut-7 and similar proteins. The mut-7 subfamily is composed of Caenorhabditis elegans mut-7 and similar proteins found in plants and metazoans. Mut-7 is implicated in posttranscriptional gene silencing. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs, termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. 193
25454 99850 cd06147 Rrp6p_like_exo DEDDy 3'-5' exonuclease domain of yeast Rrp6p, human polymyositis/scleroderma autoantigen 100kDa, and similar proteins. Yeast Rrp6p and its human homolog, the polymyositis/scleroderma autoantigen 100kDa (PM/Scl-100), are exosome-associated proteins involved in the degradation and processing of precursors to stable RNAs. Both proteins contain a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. The motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. PM/Scl-100, an autoantigen present in the nucleolar compartment of the cell, reacts with autoantibodies produced by about 50% of patients with polymyositis-scleroderma overlap syndrome. 192
25455 99851 cd06148 Egl_like_exo DEDDy 3'-5' exonuclease domain of Drosophila Egalitarian (Egl) and similar proteins. The Egalitarian (Egl) protein subfamily is composed of Drosophila Egl and similar proteins. Egl is a component of an mRNA-binding complex which is required for oocyte specification. Egl contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. The motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation of this subfamily throughout eukaryotes suggests that its members may be part of ancient RNA processing complexes that are likely to participate in the regulated processing of specific mRNAs. Some members of this subfamily do not have a completely conserved YX(3)D pattern at the ExoIII motif. 197
25456 99852 cd06149 ISG20 DEDDh 3'-5' exonuclease domain of Interferon Stimulated Gene product of 20 kDa, and similar proteins. Interferon (IFN) Stimulated Gene product of 20 kDa (ISG20) is an IFN-induced antiviral exonuclease with a strong preference for single-stranded RNA and minor activity towards single-stranded DNA. It was also independently identified by its response to estrogen and was called HEM45 (human estrogen regulated transcript). ISG20 is a DEDDh-type DnaQ-like 3'-5' exonuclease containing three conserved sequence motifs termed ExoI, ExoII and ExoIII with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. ISG20 may be a major effector of innate immunity against pathogens including viruses, bacteria, and parasites. It is located in promyelocytic leukemia (PML) nuclear bodies, sites for oncogenic DNA viral transcription and replication. It may carry out its function by degrading viral RNAs as part of the IFN-regulated antiviral response. 157
25457 100007 cd06150 YjgF_YER057c_UK114_like_2 This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 105
25458 100008 cd06151 YjgF_YER057c_UK114_like_3 This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 126
25459 100009 cd06152 YjgF_YER057c_UK114_like_4 YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 114
25460 100010 cd06153 YjgF_YER057c_UK114_like_5 This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 114
25461 100011 cd06154 YjgF_YER057c_UK114_like_6 This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 119
25462 100012 cd06155 eu_AANH_C_1 A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the first of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 101
25463 100013 cd06156 eu_AANH_C_2 A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the second of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 118
25464 132726 cd06157 NR_LBD The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators. Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 168
25465 100079 cd06158 S2P-M50_like_1 Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with a minimal core protein and no PDZ domains. 181
25466 100080 cd06159 S2P-M50_PDZ_Arch Uncharacterized Archaeal homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group appears to be limited to Archaeal S2P/M50s homologs with additional putative N-terminal transmembrane spanning regions, relative to the core protein, and either one or two PDZ domains present. 263
25467 100081 cd06160 S2P-M50_like_2 Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with additional putative N- and C-terminal transmembrane spanning regions, relative to the core protein, and no PDZ domains. 183
25468 100082 cd06161 S2P-M50_SpoIVFB SpoIVFB Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. 208
25469 100083 cd06162 S2P-M50_PDZ_SREBP Sterol regulatory element-binding protein (SREBP) Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50A), regulates intramembrane proteolysis (RIP) of SREBP and is part of a signal transduction mechanism involved in sterol and lipid metabolism. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of SREBPs from membranes of the endoplasmic reticulum (ER). These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. The first cleavage occurs at Site-1 within the ER lumen to generate an intermediate that is subsequently released from the membrane by cleavage at Site-2, which lies within the first transmembrane domain. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD family. This group appears to be limited to eumetazoan proteins and contains one PDZ domain. 277
25470 100084 cd06163 S2P-M50_PDZ_RseP-like RseP-like Site-2 proteases (S2P), zinc metalloproteases (MEROPS family M50A), cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses. Also included in this group are such homologs as Bacillus subtilis YluC, Mycobacterium tuberculosis Rv2869c S2P, and Bordetella bronchiseptica HurP. Rv2869c S2P appears to have a role in the regulation of prokaryotic lipid biosynthesis and membrane composition and YluC of Bacillus has a role in transducing membrane stress. This group includes bacterial and eukaryotic S2P/M50s homologs with either one or two PDZ domains present. PDZ domains are believed to have a regulatory role. The RseP PDZ domain is required for the inhibitory reaction that prevents cleavage of its substrate, RseA. 182
25471 100085 cd06164 S2P-M50_SpoIVFB_CBS SpoIVFB Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. In this subgroup, SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain. SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 227
25472 320680 cd06165 Sortase_A Sortase domain found in class A sortases. Class A sortases are membrane-bound cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). They perform a housekeeping role in the cell as members of this group are capable of anchoring a large number of functionally distinct surface proteins containing a cell wall sorting signal to an amino group located on the bacterial cell wall. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall-sorting signal (Class A sortases recognize a canonical LPXTG motif, X can be any amino acid), and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical sortase A protein from Staphylococcus aureus (named Sa-SrtA) cleaves the amide bond between threonine and glycine residues of the canonical LPXTG motif in a wide range of protein substrates with diverse functions that can promote bacterial adhesion, nutrient acquisition, host cell invasion, and immune evasion. Next, it catalyzes a transpeptidation reaction by which the proteins are covalently linked to the peptidoglycan precursor lipid II. SrtA therefore affects the ability of a pathogen to establish successful infection. SrtA contains an N-terminal hydrophobic segment, a linker region and an extra-cellular C-terminal catalytic domain. The hydrophobic segment functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. The catalytic domain contains the catalytic TLXTC signature sequence where X is usually a valine, isoleucine or a threonine. The gene encoding SrtA is generally not located in the same gene cluster as its substrates while the gene encoding SrtB is usually clustered in the same locus as its substrate. 127
25473 320681 cd06166 Sortase_D_2 Sortase domain found in subfamily 2 of the class D family of sortases. Class D sortases are cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). They anchor surface proteins bearing a cell wall sorting signal to peptidoglycans of the bacterial cell wall envelope, which is responsible for spore formation under anaerobic conditions. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical subfamily 2 of class D sortase from Clostridium perfringens (named Cp-SrtD) recognizes the LPQTGS signal motif for transpeptidation. Its catalytic activity is metal cation- and temperature-dependent. The presence of magnesium appears to enhance Cp-SrtD catalysis towards the LPQTGS signal motif. 127
25474 350201 cd06167 PIN_LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing. It is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system. In particular, LabA seems necessary for KaiC-dependent repression of gene expression. This family also includes the N-terminal domain of limkain b1, a human autoantigen associated with cytoplasmic vesicles. Other members are the LabA-like PIN domains of human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. 113
25475 212486 cd06168 LSMD1 LSM domain containing 1. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSMD1 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes, forming heptameric and hexameric ring structures similar to those found in eukaryotes. 73
25476 132884 cd06169 BMC Bacterial Micro-Compartment (BMC) domain. Bacterial micro-compartments are primitive protein-based organelles that sequester specific metabolic pathways in bacterial cells. The prototypical bacterial microcompartment is the carboxysome shell, a bacterial polyhedral organelle which increases the efficiency of CO2 fixation by encapsulating RuBisCO and carbonic anhydrase. They can be divided into two types: alpha-type carboxysomes (alpha-cyanobacteria and proteobacteria) and beta-type carboxysomes (beta-cyanobacteria). In addition to these proteins there are several homologous shell proteins including those found in pdu organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol and eut organelles involved in the cobalamin-dependent degradation of ethanolamine. Structural evidence shows that several carboxysome shell proteins and their homologs (Csos1A, CcmK1,2,4, and PduU) exist as hexamers which might further assemble into extended, tightly packed layers hypothesized to represent the flat facets of the polyhedral organelle's outer shell. Although it has been suggested that other homologous proteins in this family might also form hexamers and play similar functional roles in the construction of their corresponding organelle outer shell, at present no experimental evidence directly supports this view. 62
25477 99777 cd06170 LuxR_C_like C-terminal DNA-binding domain of LuxR-like proteins. This domain contains a helix-turn-helix motif and binds DNA. Proteins belonging to this group are response regulators; some act as transcriptional activators, others as transcriptional repressors. Many are active as homodimers. Many are two domain proteins in which the DNA binding property of the C-terminal DNA binding domain is modulated by modifications of the N-terminal domain. For example in the case of LuxR which participates in the regulation of gene expression in response to fluctuations in cell-population density (quorum-sensing), a signaling molecule, the pheromone Acyl HSL (N-acyl derivatives of homoserine lactone), binds to the N-terminal domain and leads to LuxR dimerization. For others phosphorylation of the N-terminal domain leads to multimerization, for example Escherichia coli NarL and Sinorhizobium meliloti FixJ. NarL controls gene expression of many respiratory-related operons when environmental nitrate or nitrite is present under anaerobic conditions. FixJ is involved in the transcriptional activation of nitrogen fixation genes. The group also includes small proteins which lack an N-terminal signaling domain, such as Bacillus subtilis GerE. GerE is dimeric and acts in conjunction with sigmaK as an activator or a repressor modulating the expression of various genes in particular those encoding the spore-coat. These LuxR family regulators may share a similar organization of their target binding sites. For example the LuxR dimer binds the lux box, a 20bp inverted repeat, GerE dimers bind two 12bp consensus sequences in inverted orientation having the central four bases overlap, and the NarL dimer binds two 7bp inverted repeats separated by 2 bp. 57
25478 100119 cd06171 Sigma70_r4 Sigma70, region (SR) 4 refers to the most C-terminal of four conserved domains found in Escherichia coli (Ec) sigma70, the main housekeeping sigma, and related sigma-factors (SFs). A SF is a dissociable subunit of RNA polymerase, it directs bacterial or plastid core RNA polymerase to specific promoter elements located upstream of transcription initiation points. The SR4 of Ec sigma70 and other essential primary SFs contact promoter sequences located 35 base-pairs upstream of the initiation point, recognizing a 6-base-pair -35 consensus TTGACA. Sigma70 related SFs also include SFs which are dispensable for bacterial cell growth for example Ec sigmaS, SFs which activate regulons in response to a specific signal for example heat-shock Ec sigmaH, and a group of SFs which includes the extracytoplasmic function (ECF) SFs and is typified by Ec sigmaE which contains SR2 and -4 only. ECF SFs direct the transcription of genes that regulate various responses including periplasmic stress and pathogenesis. Ec sigmaE SR4 also contacts the -35 element, but recognizes a different consensus (a 7-base-pair GGAACTT). Plant SFs recognize sigma70 type promoters and direct transcription of the major plastid RNA polymerase, plastid-encoded RNA polymerase (PEP). 55
25479 340862 cd06172 MFS_LacY LacY proton/sugar symporter family of the Major Facilitator Superfamily of transporters. LacY proton/sugar family (also called LacY/RafB family) symporters are integral membrane proteins responsible for the transport of specific beta-glucosides into the cell accompanied by the import of a proton. Members include lactose permease (LacY), raffinose permease (RafB), and sucrose permease, which facilitate the transport of beta-galactosides, raffinose, and sucrose, respectively. The prototypical member, LacY, contains 12 transmembrane helices connected by hydrophilic loops with both N and C termini on the cytoplasmic face. The LacY/RafB permease family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
25480 340863 cd06173 MFS_MefA_like Macrolide efflux protein A and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Streptococcus pyogenes macrolide efflux protein A (MefA) and similar transporters, many of which remain uncharacterized. Some members may be multidrug resistance (MDR) transporters, which are drug/H+ antiporters (DHAs) that mediate the efflux of a variety of drugs and toxic compounds, conferring resistance to these compounds. MefA confers resistance to 14-membered macrolides including erythromycin and to 15-membered macrolides. It functions as an efflux pump to regulate intracellular macrolide levels. The MefA-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 383
25481 349949 cd06174 MFS Major Facilitator Superfamily. The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated. 378
25482 340865 cd06175 MFS_POT Proton-coupled oligopeptide transporter (POT) family of the Major Facilitator Superfamily of transporters. The Proton-coupled oligopeptide transporter (POT) family is present across all major kingdoms of life and is known by a variety of names. It is referred to as the Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) in plants, and in addition to POT, it is also known as the Peptide transporter (PepT/PTR) or Solute Carrier 15 (SLC15) family in animals. Members of this family are proton-driven symporters involved in nitrogen acquisition in the form of di- and tripeptides. Plant members transport other nitrogenous ligands including nitrate, the plant hormone auxin, and glucosinolate compounds that are important for seed defense. POT proteins exhibit substrate multispecificity, with one transporter able to recognize as many as 8,400 types of di/tripeptides and certain peptide-like drugs. The POT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 422
25483 349950 cd06176 MFS_BCD_PucC-like Bacteriochlorophyll delivery (BCD) family, also called PucC family, of the Major Facilitator Superfamily. The bacteriochlorophyll delivery (BCD) family, also called PucC family, is composed of the PucC protein and related proteins including LhaA (also called ORF477 and F1696) and bacteriochlorophyll synthase 44.5 kDa chain (also called ORF428). These proteins are found in photosynthetic organisms. Rhodobacter capsulatus LhaA and PucC are implicated in light-harvesting complex 1 and 2 (LH1 and LH2) assembly. PucC may function to shepherd or sequester LH2 alpha and beta proteins to facilitate proper assembly, as well as deliver bacteriochlorophyll a to nascent LH2 complexes. The BCD family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 409
25484 340866 cd06177 MFS_NHS Nucleoside:H(+) symporter family of the Major Facilitator Superfamily of transporters. The prototypical members of the Nucleoside:H(+) symporter (NHS) family are Escherichia coli nucleoside permease NupG and xanthosine permease. Nucleoside:H(+) symporters are proton-driven transporters that facilitate the import of nucleosides across the cytoplasmic membrane. NupG is a broad-specificity transporter of purine and pyrimidine nucleosides. Xanthosine permease is involved in the uptake of xanthosine and other nucleosides such as inosine, adenosine, cytidine, uridine and thymidine. The NHS family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 386
25485 340867 cd06178 MFS_unc93-like Uncharacterized Unc-93-like proteins of the Major Facilitator Superfamily of transporters. This subfamily consists of uncharacterized proteins, mainly from fungi and plants, with similarity to Caenorhabditis elegans uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93). Unc-93 acts as a regulatory subunit of a multi-subunit potassium channel complex that may function in coordinating muscle contraction in C. elegans. The unc93-like subfamily belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 415
25486 340868 cd06179 MFS_TRI12_like Fungal trichothecene efflux pump (TRI12) of the Major Facilitator Superfamily of transporters. This family includes Fusarium sporotrichioides trichothecene efflux pump (TRI12), which may play a role in F. sporotrichioides self-protection against trichothecenes. TRI12 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 518
25487 340869 cd06180 MFS_YjiJ Uncharacterized protein YjiJ and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Escherichia coli YjiJ and other uncharacterized proteins. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
25488 198409 cd06181 BI-1-like BAX inhibitor (BI)-1/YccA-like protein family. Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes. 202
25489 99779 cd06182 CYPOR_like NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPOR has a C-terminal ferredoxin reductase (FNR)-like FAD and NAD binding module, an FMN-binding domain, and an additional connecting domain (inserted within the FAD binding region) that orients the FNR and FMN binding domains. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria and participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2, which then transfers two electrons and a proton to NADP+ to form NADPH. 267
25490 99780 cd06183 cyt_b5_reduct_like Cytochrome b5 reductase catalyzes the reduction of 2 molecules of cytochrome b5 using NADH as an electron donor. Like ferredoxin reductases, these proteins have an N-terminal FAD binding subdomain and a C-terminal NADH binding subdomain, separated by a cleft, which accepts FAD. The NADH-binding moiety interacts with part of the FAD and resembles a Rossmann fold. However, NAD is bound differently than in canonical Rossmann fold proteins. Nitrate reductases, flavoproteins similar to pyridine nucleotide cytochrome reductases, catalyze the reduction of nitrate to nitrite. The enzyme can be divided into three functional fragments that bind the cofactors molybdopterin, heme-iron, and FAD/NADH. 234
25491 99781 cd06184 flavohem_like_fad_nad_binding FAD_NAD(P)H binding domain of flavohemoglobin. Flavohemoglobins have a globin domain containing a B-type heme fused with a ferredoxin reductase-like FAD/NAD-binding domain. Flavohemoglobins detoxify nitric oxide (NO) via an NO dioxygenase reaction. The hemoglobin domain adopts a globin fold with an embedded heme molecule. Flavohemoglobins also have a C-terminal reductase domain with binding sites for FAD and NAD(P)H. This domain catalyzes the conversion of NO + O2 + NAD(P)H to NO3- + NAD(P)+. Instead of the oxygen transport function of hemoglobins, flavohemoglobins seem to act in NO dioxygenation and NO signalling. 247
25492 99782 cd06185 PDR_like Phthalate dioxygenase reductase (PDR) is an FMN-dependent reductase that mediates electron transfer from NADH to FMN to an iron sulfur cluster. PDR has an N-terminal ferredoxin reductase (FNR)-like NAD(H) binding domain and a C-terminal iron-sulfur [2Fe-2S] cluster domain. Although structurally homologous to FNR, PDR binds FMN rather than FAD in its FNR-like domain. Electron transfer between pyrimidines and iron-sulfur clusters (Rieske center [2Fe-2S]) or heme groups is mediated by flavins in respiration, photosynthesis, and oxygenase systems. Type I dioxygenase systems, including the hydroxylate phthalate system, have 2 components, a monomeric reductase consisting of a flavin and a 2Fe-2S center and a multimeric oxygenase. In contrast to other Rieske dioxygenases the ferredoxin like domain is C-, not N-terminal. 211
25493 99783 cd06186 NOX_Duox_like_FAD_NADP NADPH oxidase (NOX) catalyzes the generation of reactive oxygen species (ROS) such as superoxide and hydrogen peroxide. ROS were originally identified as bactericidal agents in phagocytes, but are now also implicated in cell signaling and metabolism. NOX has a 6-alpha helix heme-binding transmembrane domain fused to a flavoprotein with the nucleotide binding domain located in the cytoplasm. Duox enzymes link a peroxidase domain to the NOX domain via a single transmembrane and EF-hand Ca2+ binding sites. The flavoprotein module has a ferredoxin like FAD/NADPH binding domain. In classical phagocytic NOX2, electron transfer occurs from NADPH to FAD to the heme of cytb to oxygen leading to superoxide formation. 210
25494 99784 cd06187 O2ase_reductase_like The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons using oxygen as the oxidant. Electron transfer is from NADH via FAD (in the oxygenase reductase) and a [2Fe-2S] ferredoxin center (fused to the FAD/NADH domain and/or discrete) to the oxygenase. Dioxygenases add both atoms of oxygen to the substrate, while mono-oxygenases (aka mixed oxygenases) add one atom to the substrate and one atom to water. In dioxygenases, Class I enzymes are 2 component, containing a reductase with Rieske type [2Fe-2S] redox centers and an oxygenase. Class II are 3 component, having discrete flavin and ferredoxin proteins and an oxygenase. Class III have 2 [2Fe-2S] centers, one fused to the flavin domain and the other separate. 224
25495 99785 cd06188 NADH_quinone_reductase Na+-translocating NADH:quinone oxidoreductase (Na+-NQR) FAD/NADH binding domain. (Na+-NQR) provides a means of storing redox reaction energy via the transmembrane translocation of Na+ ions. The C-terminal domain resembles ferredoxin:NADP+ oxidoreductase, and has NADH and FAD binding sites. (Na+-NQR) is distinct from H+-translocating NADH:quinone oxidoreductases and noncoupled NADH:quinone oxidoreductases. The NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal domain of this group typically contains an iron-sulfur cluster binding domain. 283
25496 99786 cd06189 flavin_oxioreductase NAD(P)H dependent flavin oxidoreductases use flavin as a substrate in mediating electron transfer from iron complexes or iron proteins. Structurally similar to ferredoxin reductases, but with only 15% sequence identity, flavin reductases reduce FAD, FMN, or riboflavin via NAD(P)H. Flavin is used as a substrate, rather than a tightly bound prosthetic group as in flavoenzymes; weaker binding is due to the absence of a binding site for the AMP moiety of FAD. 224
25497 99787 cd06190 T4MO_e_transfer_like Toluene-4-monooxygenase electron transfer component of Pseudomonas mendocina hydroxylates toluene and forms p-cresol as part of a three component toluene-4-monooxygenase system. Electron transfer is from NADH to an NADH:ferredoxin oxidoreductase (TmoF in P. mendocina) to ferredoxin to an iron-containing oxygenase. TmoF is homologous to other mono- and dioxygenase systems within the ferredoxin reductase family. 232
25498 99788 cd06191 FNR_iron_sulfur_binding Iron-sulfur binding Ferredoxin Reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with a C-terminal iron-sulfur binding cluster domain. FNR was initially identified as a chloroplast reductase activity catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in a variety of organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and two electron carriers. FNR has a strong preference for NADP(H) vs NAD(H). 231
25499 99789 cd06192 DHOD_e_trans_like FAD/NAD binding domain (electron transfer subunit) of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as NAD binding. NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH. 243
25500 99790 cd06193 siderophore_interacting Siderophore interacting proteins share the domain structure of the ferredoxin reductase like family. Siderophores are produced in various bacteria (and some plants) to extract iron from hosts. Binding constants are high, so iron can be pilfered from transferrin and lactoferrin for bacterial uptake, contributing to pathogen virulence. Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in a variety of organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and two electron carriers. FNR has a strong preference for NADP(H) vs NAD(H). 235
25501 99791 cd06194 FNR_N-term_Iron_sulfur_binding Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an N-terminal Iron-Sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 222
25502 99792 cd06195 FNR1 Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 241
25503 99793 cd06196 FNR_like_1 Ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal region may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH. 218
25504 99794 cd06197 FNR_like_2 FAD/NAD(P) binding domain of ferredoxin reductase-like proteins. Ferredoxin reductase (FNR) was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and have a variety of physiological functions in a variety of organisms including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and two electron carriers. FNR has a strong preference for NADP(H) vs NAD(H). 220
25505 99795 cd06198 FNR_like_3 NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) domain, which varies in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH. 216
25506 99796 cd06199 SiR Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain. 360
25507 99797 cd06200 SiR_like1 Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues, and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule, which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 245
25508 99798 cd06201 SiR_like2 Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 289
25509 99799 cd06202 Nitric_oxide_synthase The ferredoxin-reductase (FNR) like C-terminal domain of the nitric oxide synthase (NOS) fuses with a heme-containing N-terminal oxidase domain. The reductase portion is similar in structure to NADPH dependent cytochrome p450 reductase (CYPOR), having an inserted connecting sub-domain within the FAD binding portion of FNR. NOS differs from CYPOR in a requirement for the cofactor tetrahydrobiopterin and unlike most CYPOR is dimeric. Nitric oxide synthase produces nitric oxide in the conversion of L-arginine to L-citrulline. NOS has been implicated in a variety of processes including cytotoxicity, anti-inflammation, neurotransmission, and vascular smooth muscle relaxation. 406
25510 99800 cd06203 methionine_synthase_red Human methionine synthase reductase (MSR) restores methionine sythase which is responsible for the regeneration of methionine from homocysteine, as well as the coversion of methyltetrahydrofolate to tetrahydrofolate. In MSR, electrons are transferred from NADPH to FAD to FMN to cob(II)alamin. MSR resembles proteins of the cytochrome p450 family including nitric oxide synthase, the alpha subunit of sulfite reductase, but contains an extended hinge region. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPORs resemble ferredoxin reductase (FNR) but have a connecting subdomain inserted within the flavin binding region, which helps orient the FMN binding doamin with the FNR module. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap betweed the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 398
25511 99801 cd06204 CYPOR NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 416
25512 99802 cd06206 bifunctional_CYPOR These bifunctional proteins fuse N-terminal cytochrome p450 with a cytochrome p450 reductase (CYPOR). NADPH cytochrome p450 reductase serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 384
25513 99803 cd06207 CyPoR_like NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 382
25514 99804 cd06208 CYPOR_like_FNR These ferredoxin reductases are related to the NADPH cytochrome p450 reductases (CYPOR), but lack the FAD-binding region connecting sub-domain. Ferredoxin-NADP+ reductase (FNR) is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins, such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2, which then transfers two electrons and a proton to NADP+ to form NADPH. CYPOR serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases, sulfite reductase, and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPOR has a C-terminal FNR-like FAD and NAD binding module, an FMN-binding domain, and an additional connecting domain (inserted within the FAD binding region) that orients the FNR and FMN-binding domains. The C-terminal domain contains most of the NADP(H) binding residues, and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule, which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate.
The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 286
25515 99805 cd06209 BenDO_FAD_NAD Benzoate dioxygenase reductase (BenDO) FAD/NAD binding domain. Oxygenases oxidize hydrocarbons using dioxygen as the oxidant. As Class I bacterial dioxygenases, benzoate dioxygenase-like proteins combine a [2Fe-2S] cluster-containing N-terminal ferredoxin fused to an FAD/NAD(P) domain. In the dioxygenase FAD/NAD(P) binding domain, the reductase transfers 2 electrons from NAD(P)H to the oxygenase, which inserts oxygen into an aromatic substrate, an initial step in microbial aerobic degradation of aromatic rings. Flavin oxidoreductases use flavins as substrates, unlike flavoenzymes which have a flavin prosthetic group. 228
25516 99806 cd06210 MMO_FAD_NAD_binding Methane monooxygenase (MMO) reductase of methanotrophs catalyzes the NADH-dependent hydroxylation of methane to methanol. This multicomponent enzyme mediates electron transfer via a hydroxylase (MMOH), a coupling protein, and a reductase which is comprised of an N-terminal [2Fe-2S] ferredoxin domain, an FAD binding subdomain, and an NADH binding subdomain. Oxygenases oxidize hydrocarbons using dioxygen as the oxidant. Dioxygenases add both atoms of oxygen to the substrate, while mono-oxygenases add one atom to the substrate and one atom to water. 236
25517 99807 cd06211 phenol_2-monooxygenase_like Phenol 2-monooxygenase (phenol hydroxylase) is a flavoprotein monooxygenase, able to use molecular oxygen as a substrate in the microbial degradation of phenol. This protein is encoded by a single gene and uses a tightly bound FAD cofactor in the NAD(P)H dependent conversion of phenol and O2 to catechol and H2O. This group is related to the NAD binding ferredoxin reductases. 238
25518 99808 cd06212 monooxygenase_like The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons. These flavoprotein monooxygenases use molecular oxygen as a substrate and require reduced FAD. One atom of oxygen is incorporated into the aromatic compound, while the other is used to form a molecule of water. In contrast, dioxygenases add both atoms of oxygen to the substrate. 232
25519 99809 cd06213 oxygenase_e_transfer_subunit The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons. Electron transfer is from NADH via FAD (in the oxygenase reductase) and a [2Fe-2S] ferredoxin center (fused to the FAD/NADH domain and/or discrete) to the oxygenase. Dioxygenases add both atoms of oxygen to the substrate while mono-oxygenases add one atom to the substrate and one atom to water. In dioxygenases, Class I enzymes are 2 component, containing a reductase with Rieske type [2Fe-2S] redox centers and an oxygenase. Class II are 3 component, having discrete flavin and ferredoxin proteins and an oxygenase. Class III have 2 [2Fe-2S] centers, one fused to the flavin domain and the other separate. 227
25520 99810 cd06214 PA_degradation_oxidoreductase_like NAD(P) binding domain of ferredoxin reductase like phenylacetic acid (PA) degradation oxidoreductase. PA oxidoreductases of E. coli hydroxylate PA-CoA in the second step of PA degradation. Members of this group typically fuse a ferredoxin reductase-like domain with an iron-sulfur binding cluster domain. Ferredoxins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal portion may contain a flavin prosthetic group, as in flavoenzymes, or use flavin as a substrate. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria and participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 241
25521 99811 cd06215 FNR_iron_sulfur_binding_1 Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal portion of the FAD/NAD binding domain contains most of the NADP(H) binding residues and the N-terminal sub-domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. In this ferredoxin like sub-group, the FAD/NAD sub-domain is typically fused to a C-terminal iron-sulfur binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins which act as electron carriers in photosynthesis and ferredoxins which participate in redox chains from bacteria to mammals. Ferredoxin reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 231
25522 99812 cd06216 FNR_iron_sulfur_binding_2 Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 243
25523 99813 cd06217 FNR_iron_sulfur_binding_3 Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH. 235
25524 99814 cd06218 DHOD_e_trans FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as 3 cofactors: FMN, FAD, and an [2Fe-2S] cluster. 246
25525 99815 cd06219 DHOD_e_trans_like1 FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as NAD binding. NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group, as in flavoenzymes, or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD, forming FADH2 via a semiquinone intermediate, and then transfers a hydride ion to convert NADP+ to NADPH. 248
25526 99816 cd06220 DHOD_e_trans_like2 FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as 3 cofactors: FMN, FAD, and an [2Fe-2S] cluster. 233
25527 99817 cd06221 sulfite_reductase_like Anaerobic sulfite reductase contains an FAD and NADPH binding module with structural similarity to ferredoxin reductase and sequence similarity to dihydroorotate dehydrogenases. Clostridium pasteurianum inducible dissimilatory type sulfite reductase is linked to ferredoxin and reduces NH2OH and SeO3 at a lesser rate than its normal substrate SO3(2-). Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. 253
25528 259998 cd06222 RNase_H_like Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as anti-HIV drug targets since RNase H inactivation inhibits reverse transcription. This model also includes the Prp8 domain IV, which adopts the RNase fold but shows low sequence homology; domain IV is implicated in key spliceosomal interactions. 121
25529 206754 cd06223 PRTases_typeI Phosphoribosyl transferase (PRT)-type I domain. Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22. 130
25530 100121 cd06224 REM Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few. 122
25531 381743 cd06225 HAMP Histidine kinase, Adenylyl cyclase, Methyl-accepting protein, and Phosphatase (HAMP) domain. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established. 45
25532 349445 cd06226 M14_CPT_like Peptidase M14-like domain of an uncharacterized group of Peptidase M14 Carboxypeptidase T (CPT)-like proteins. Peptidase M14-like domain of an uncharacterized group of Peptidase M14 Carboxypeptidase T (CPT)-like proteins. This group belongs to the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPT exhibits dual-substrate specificity by cleaving C-terminal hydrophobic amino acid residues and C-terminal positively charged residues. However, CPT does not belong to this CPT-like group. 267
25533 349446 cd06227 M14-CPA-like Peptidase M14 carboxypeptidase A-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 224
25534 349447 cd06228 M14-like Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 294
25535 349448 cd06229 M14_Endopeptidase_I Peptidase M14 carboxypeptidase family-like domain of Endopeptidase I. Peptidase M14-like domain of Gamma-D-glutamyl-L-diamino acid endopeptidase 1 (also known as Gamma-D-glutamyl-meso-diaminopimelate peptidase I, and Endopeptidase I (ENP1); EC 3.4.19.11). ENP1 is a member of the M14 family of metallocarboxypeptidases (MCPs), and is classified as belonging to subfamily C. However it has an exceptional type of activity of hydrolyzing the gamma-D-Glu-(L)meso-diaminopimelic acid (gamma-D-Glu-Dap) bond of L-Ala-gamma-D-Glu-(L)meso-diaminopimelic acid and L-Ala-gamma-D-Glu-(L)meso-diaminopimelic acid(L)-D-Ala peptides. ENP1 has a different substrate specificity and cellular role than MpaA (MpaA does not belong to this group). ENP1 hydrolyzes the gamma-D-Glu-Dap bond of MurNAc-tripeptide and MurNAc-tetrapeptide, as well as the amide bond of free tripeptide and tetrapeptide. ENP1 is active on spore cortex peptidoglycan, and is produced at stage IV of sporulation in forespore and spore integuments. 238
25536 349449 cd06230 M14_ASTE_ASPA_like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily. The Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily belongs to the M14 family of metallocarboxypeptidases (MCPs), and includes ASTE, which catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) which cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 177
25537 349450 cd06231 M14_REP34-like Peptidase M14-like domain similar to rapid encystment phenotype 34 (REP34). This family includes Francisella tularensis protein rapid encystment phenotype 34 (REP34) which is a zinc-containing monomeric protein demonstrating carboxypeptidase B-like activity. REP34 possesses a novel topology with its substrate binding pocket deviating from the canonical M14 peptidases with a possible catalytic role for a conserved tyrosine and distinct S1' recognition site. Thus, REP34, identified as an active carboxypeptidase and a potential key F. tularensis effector protein, may help elucidate a mechanistic understanding of F. tularensis infection of phagocytic cells. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. 
Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 239
25538 349451 cd06232 M14-like Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 276
25539 349452 cd06233 M14-like Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 249
25540 349453 cd06234 M14_PaCCP-like Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases similar to Pseudomonas aeruginosa CCP (PaCCP). A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP)-like proteins. This subgroup includes PaCCP from Pseudomonas aeruginosa, a carboxypeptidase homologous to the M14D subfamily of human CCPs. Structural complexes with well-known inhibitors of metallocarboxypeptidases indicate that PaCCP might only possess C-terminal hydrolase activity against cellular substrates of particular specificity. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 256
25541 349454 cd06235 M14_AGTPBP-like Peptidase M14-like domain of human Nna1/AGTPBP-1, AGBL2 -5, and related proteins. Subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human Nna1/AGTPBP-1 and AGBL -2, -3, -4, and -5, and the mouse Nna1/CCP-1 and CCP -2 through -6. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 256
25542 349455 cd06236 M14_AGBL5_like Peptidase M14-like domain of ATP/GTP binding protein (AGBL)-5 and related proteins. Peptidase M14-like domain of ATP/GTP binding protein-like (AGBL)-5, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human AGBL5 and the mouse cytosolic carboxypeptidase (CCP)-5. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1, however, does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 263
25543 349456 cd06237 M14_Nna1-like Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases; uncharacterized bacterial subgroup. A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP)-like proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 239
25544 349457 cd06238 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 217
25545 349458 cd06239 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 194
25546 349459 cd06240 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 212
25547 349460 cd06241 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 215
25548 349461 cd06242 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 220
25549 349462 cd06243 M14_CP_Csd4-like Peptidase M14 carboxypeptidase Csd4 and similar proteins. This family includes peptidase M14 carboxypeptidase Csd4 from H. pylori which has been shown to be a DL-carboxypeptidase with a modified zinc binding site containing a glutamine residue in place of a conserved histidine. It is an archetype of a new carboxypeptidase subfamily with a domain arrangement that differs from this family of peptide-cleaving enzymes. Csd4 plays a role in trimming uncrosslinked peptidoglycan peptide chains by cleaving the amide bond between meso-diaminopimelate and iso-D-glutamic acid in truncated peptidoglycan side chains. It acts as a cell shape determinant, similar to Campylobacter jejuni Pgp1. The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. 
Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 227
25550 349463 cd06244 M14-like Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 223
25551 349464 cd06245 M14_CPD_III Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain III subgroup. The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues; the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans-Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans-Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE); it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. 
As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop. 283
25552 349465 cd06246 M14_CPB2 Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase B2 subgroup. Peptidase M14 Carboxypeptidase (CP) B2 (CPB2, also known as plasma carboxypeptidase B, carboxypeptidase U, and CPU), belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPB2 enzyme displays B-like activity; it only cleaves the basic residues lysine or arginine. It is produced and secreted by the liver as the inactive precursor, procarboxypeptidase U or PCPB2, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). It circulates in plasma as a zymogen bound to plasminogen, and the active enzyme, TAFIa, inhibits fibrinolysis. It is highly regulated; increased TAFI concentrations are thought to increase the risk of thrombosis and coronary artery disease by reducing fibrinolytic activity, while low TAFI levels have been correlated with chronic liver disease. 300
25553 349466 cd06247 M14_CPO Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase O subgroup. Peptidase M14 carboxypeptidase (CP) O (CPO, also known as metallocarboxypeptidase C; EC 3.4.17.) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPO has not been well characterized as yet, and little is known about it. Based on modeling studies, CPO has been suggested to have specificity for acidic residues rather than aliphatic/aromatic residues as in A-like enzymes or basic residues as in B-like enzymes. It remains to be demonstrated that CPO is functional as an MCP. 298
25554 349467 cd06248 M14_CP_insect Peptidase M14 carboxypeptidase subfamily A/B-like. This family includes peptidase M14 carboxypeptidases found specifically in insects, including B-type carboxypeptidase of H. zea (CPBHz, insect gut carboxypeptidase-3) that is insensitive to potato carboxypeptidase inhibitor (PCI) in corn earworm, and midgut procarboxypeptidase A (PCPAHa, insect gut carboxypeptidase-1) from Helicoverpa armigera larva, a devastating pest of crops. PCPAHa preferentially cleaves aliphatic and aromatic residues. The peptidase M14 Carboxypeptidase (CP) A/B subfamily is one of two main M14 CP subfamilies defined by sequence and structural homology, the other being the N/E subfamily. CPs hydrolyze single, C-terminal amino acids from polypeptide chains. They have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. There are nine members in the A/B family: CPA1, CPA2, CPA3, CPA4, CPA5, CPA6, CPB, CPO and CPU. CPA1, CPA2 and CPB are produced by the pancreas. The A forms have slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA3 is found in secretory granules of mast cells and functions in inflammatory processes. CPA4 is detected in hormone-regulated tissues, and is thought to play a role in prostate cancer. CPA5 is present in discrete regions of pituitary and other tissues, and cleaves aliphatic C-terminal residues. 
CPA6 is highly expressed in embryonic brain and optic muscle, suggesting that it may play a specific role in cell migration and axonal guidance. CPU (also called CPB2) is produced and secreted by the liver as the inactive precursor, PCPU, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). Little is known about CPO but it has been suggested to have specificity for acidic residues. 297
25555 349468 cd06250 M14_PaAOTO_like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like subfamily; subgroup includes Pseudomonas aeruginosa AotO. An uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. This subgroup includes Pseudomonas aeruginosa AotO and related proteins. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. The gene encoding P. aeruginosa AotO was characterized as part of an operon encoding an arginine and ornithine transport system; however, it is not essential for arginine and ornithine uptake. 267
25556 349469 cd06251 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 195
25557 349470 cd06252 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 224
25558 349471 cd06253 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 211
25559 349472 cd06254 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 198
25560 349473 cd06255 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 223
25561 349474 cd06256 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 204
25562 99751 cd06257 DnaJ DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification. 55
25563 341049 cd06258 M3_like M3-like Peptidases, zincin metallopeptidases, include M2_ACE, M3A, M3B_PepF, and M32 families. The peptidase M3-like family, also called neurolysin-like family, is part of the "zincin" metallopeptidases, and includes the M2, M3 and M32 families of metallopeptidases. The M2 angiotensin converting enzyme (ACE, EC 3.4.15.1) is a membrane-bound, zinc-dependent dipeptidase that catalyzes the conversion of the decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. The M3 family is subdivided into two subfamilies: the widespread M3A, which comprises a number of high-molecular mass endo- and exopeptidases from bacteria, archaea, protozoa, fungi, plants and animals, and the small M3B, whose members are enzymes primarily from bacteria. Well-known mammalian/eukaryotic M3A endopeptidases are the thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (alias endopeptidase 3.4.24.16), and the mitochondrial intermediate peptidase. The first two are intracellular oligopeptidases, which act only on relatively short substrates of less than 20 amino acid residues, while the latter cleaves N-terminal octapeptides from proteins during their import into the mitochondria. The M3A subfamily also contains several bacterial endopeptidases, called oligopeptidases A, as well as a large number of bacterial carboxypeptidases, called dipeptidyl peptidases (Dcp; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). M3B subfamily consists of oligopeptidase F (PepF) which hydrolyzes peptides containing 7-17 amino acid residues with fairly broad specificity. Peptidases in the M3 family contain the HEXXH motif that forms part of the active site in conjunction with a C-terminally-located Glutamic acid (Glu) residue. A single zinc ion is ligated by the side-chains of the two Histidine (His) residues, and the more C-terminal Glu. Most of the peptidases are synthesized without signal peptides or propeptides, and function intracellularly. 
There are similarities to the thermostable carboxypeptidases from Pyrococcus furiosus (PfuCP) and Thermus aquaticus (TaqCP), belonging to peptidase family M32. Little is known about the function of this family, including carboxypeptidases Taq and Pfu. 473
25564 99750 cd06259 YdcF-like YdcF-like. YdcF-like is a large family of mainly bacterial proteins, with a few members found in fungi, plants, and archaea. Escherichia coli YdcF has been shown to bind S-adenosyl-L-methionine (AdoMet), but a biochemical function has not been identified. The family also includes Escherichia coli sanA and Salmonella typhimurium sfiX, which are involved in vancomycin resistance; sfiX may also be involved in murein synthesis. 150
25565 411709 cd06260 DUF820-like Uncharacterized PDDEXK family nuclease. The Domain of unknown function 820 (DUF820) family is composed of hypothetical proteins that are greatly expanded in cyanobacteria. The proteins are found sporadically in other bacteria. They have been predicted to belong to the PD-(D/E)xK superfamily of nucleases, which includes very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 155
25566 119394 cd06261 TM_PBP2 Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs. These types of transporters consist of a PBP, two TMs, and two cytoplasmic ABC ATPase subunits, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. For these transporters the ABCs and TMs are on independent polypeptide chains. These systems transport a diverse range of substrates. Most are specific for a single substrate or a group of related substrates; however some transporters are more promiscuous, transporting structurally diverse substrates such as the histidine/lysine and arginine transporter in Enterobacteriaceae. In the latter case, this is achieved through binding different PBPs with different specificities to the TMs. For other promiscuous transporters such as the multiple-sugar transporter Msm of Streptococcus mutans, the PBP has a wide substrate specificity. These transporters include the maltose-maltodextrin, phosphate and sulfate transporters, among others. 190
25567 293792 cd06262 metallo-hydrolase-like_MBL-fold mainly hydrolytic enzymes and related proteins which carry out various biological functions; MBL-fold metallohydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases which can catalyze the hydrolysis of a wide range of beta-lactam antibiotics, hydroxyacylglutathione hydrolases (also called glyoxalase II) which hydrolyze S-d-lactoylglutathione to d-lactate in the second step of the glycoxlase system, AHL lactonases which catalyze the hydrolysis and opening of the homoserine lactone rings of acyl homoserine lactones (AHLs), persulfide dioxygenase which catalyze the oxidation of glutathione persulfide to glutathione and persulfite in the mitochondria, flavodiiron proteins which catalyze the reduction of oxygen and/or nitric oxide to water or nitrous oxide respectively, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J which has both 5'-3' exoribonucleolytic and endonucleolytic activity and ribonuclease Z which catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors, cyclic nucleotide phosphodiesterases which decompose cyclic adenosine and guanosine 3', 5'-monophosphate (cAMP and cGMP) respectively, insecticide hydrolases, and proteins required for natural transformation competence. 
The diversity of biological roles is reflected in variations in the active site metallo-chemistry: for example, classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases; human persulfide dioxygenase ETHE1 is a mono-iron binding member of the superfamily; Arabidopsis thaliana hydroxyacylglutathione hydrolase incorporates iron, manganese, and zinc in its dinuclear metal binding site; and flavodiiron proteins contain a diiron site. 188
25568 99706 cd06263 MAM Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region. 157
25569 119387 cd06265 RNase_A_canonical Canonical RNase A family includes all vertebrate homologues to the bovine pancreatic ribonuclease A (RNase A) that contain the catalytic site, necessary for RNase activity. In the human genome, 8 RNases, referred to as "canonical" RNases, have been identified: pancreatic RNase (RNase 1), Eosinophil Derived Neurotoxin (SEDN/RNASE 2), Eosinophil Cationic Protein (ECP/RNase 3), RNase 4, Angiogenin (RNase 5), RNase 6 or k6, the skin derived RNase (RNase 7) and RNase 8. The eight human genes are all located in a cluster on chromosome 14. Canonical RNase A enzymes have special biological activities; for example, some stimulate the development of vascular endothelial cells, dendritic cells, and neurons, are cytotoxic/anti-tumoral and/or anti-pathogenic. RNase A is involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. The catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups. The canonical RNase A family proteins have a conserved catalytic triad (two histidines and one lysine). They also share 6 to 8 cysteines that form three to four disulfide bonds. Two disulfide bonds that are close to the N and C termini contribute most significantly to conformational stability. Angiogenin or RNase 5 has been implicated in tumor-associated angiogenesis. Comparative analysis in mammals and birds indicates that the whole family may have originated from a RNase 5-like gene. This hypothesis is supported by the fact that only RNase 5-like RNases have been reported outside the mammalian class. The RNase 5 group would therefore be the most ancient form of this family, and all other members would have arisen during mammalian evolution. 115
25570 259999 cd06266 RNase_HII Ribonuclease H (RNase H) type II family (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). This family contains ribonucleases HII (RNases H2) which include bacterial RNase HII and HIII, and eukaryotic and archaeal RNase H2/HII. RNase H2 cleaves RNA sequences that are part of RNA/DNA hybrids or that are incorporated into DNA, thereby preventing genomic instability and the accumulation of aberrant nucleic acid which can induce Aicardi-Goutieres syndrome, a severe autoimmune disorder in humans. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes, but no prokaryotic genome contains the combination of only RNase HI and HIII. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype in E. coli. 193
25571 380491 cd06267 PBP1_LacI_sugar_binding-like ligand binding domain of the LacI transcriptional regulator family belonging to the type 1 periplasmic-binding fold protein superfamily. Ligand binding domain of the LacI transcriptional regulator family belonging to the type 1 periplasmic-binding fold protein superfamily. In most cases, ligands are monosaccharide including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the domain sugar binding changes the DNA binding activity of the repressor domain. 264
25572 380492 cd06268 PBP1_ABC_transporter_LIVBP-like periplasmic binding domain of ATP-binding cassette transporter-like systems that belong to the type 1 periplasmic binding fold protein superfamily. Periplasmic binding domain of ATP-binding cassette transporter-like systems that belong to the type 1 periplasmic binding fold protein superfamily. They are mostly present in archaea and eubacteria, and are primarily involved in scavenging solutes from the environment. ABC-type transporters couple ATP hydrolysis with the uptake and efflux of a wide range of substrates across bacterial membranes, including amino acids, peptides, lipids and sterols, and various drugs. These systems are comprised of transmembrane domains, nucleotide binding domains, and in most bacterial uptake systems, periplasmic binding proteins (PBPs) which transfer the ligand to the extracellular gate of the transmembrane domains. These PBPs bind their substrates selectively and with high affinity. Members of this group include ABC-type Leucine-Isoleucine-Valine-Binding Proteins (LIVBP), which are homologous to the aliphatic amidase transcriptional repressor, AmiC, of Pseudomonas aeruginosa. The uncharacterized periplasmic components of various ABC-type transport systems are included in this group. 298
25573 380493 cd06269 PBP1_glutamate_receptors-like ligand-binding domain of family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases such as natriuretic peptide receptors (NPRs), and N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of ionotropic glutamate receptors. This CD represents the ligand-binding domain of the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases such as the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the ionotropic glutamate receptors, all of which are structurally similar and related to the periplasmic-binding fold type 1 family. The family C GPCRs consists of metabotropic glutamate receptor (mGluR), a calcium-sensing receptor (CaSR), gamma-aminobutyric acid receptor (GABAbR), the promiscuous L-alpha-amino acid receptor GPR6A, families of taste and pheromone receptors, and orphan receptors. Truncated splicing variants of the orphan receptors are not included in this CD. The family C GPCRs are activated by endogenous agonists such as amino acids, ions, and sugar based molecules. Their amino terminal ligand-binding region is homologous to the bacterial leucine-isoleucine-valine binding protein (LIVBP) and a leucine binding protein (LBP). The ionotropic glutamate receptors (iGluRs) have an integral ion channel and are subdivided into three major groups based on their pharmacology and structural similarities: NMDA receptors, AMPA receptors, and kainate receptors. The family of membrane bound guanylyl cyclases is further divided into three subfamilies: the ANP receptor (GC-A)/C-type natriuretic peptide receptor (GC-B), the heat-stable enterotoxin receptor (GC-C)/sensory organ specific membrane GCs such as retinal receptors (GC-E, GC-F), and olfactory receptors (GC-D and GC-G). 332
25574 380494 cd06270 PBP1_GalS-like ligand binding domain of DNA transcription iso-repressor GalS, which is one of two regulatory proteins involved in galactose transport and metabolism. Ligand binding domain of DNA transcription iso-repressor GalS, which is one of two regulatory proteins involved in galactose transport and metabolism. Transcription of the galactose regulon genes is regulated by Gal iso-repressor (GalS) and Gal repressor (GalR) in different ways, but both repressors recognize the same DNA binding site in the absence of D-galactose. GalS is a dimeric protein like GalR, and its major role is in regulating expression of the high-affinity galactose transporter encoded by the mgl operon, whereas GalR is the exclusive regulator of galactose permease, the low-affinity galactose transporter. GalS and GalR are members of the LacI-GalR family of transcription regulators and both contain the type 1 periplasmic binding protein-like fold. Hence, they are homologous to the periplasmic sugar binding of ABC-type transport systems. 266
25575 380495 cd06271 PBP1_AglR_RafR-like ligand-binding domain of DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 264
25576 380496 cd06272 PBP1_hexuronate_repressor-like ligand-binding domain of DNA transcription repressor for the hexuronate utilization operon from Bacillus species and close homologs, all members of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor for the hexuronate utilization operon from Bacillus species and its close homologs from other bacteria, all of which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 266
25577 380497 cd06273 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 268
25578 380498 cd06274 PBP1_FruR ligand binding domain of DNA transcription repressor specific for fructose (FruR) and its close homologs. Ligand binding domain of DNA transcription repressor specific for fructose (FruR) and its close homologs, all of which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to members of the type 1 periplasmic binding protein superfamily. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 264
25579 380499 cd06275 PBP1_PurR ligand-binding domain of purine repressor, PurR, which functions as the master regulatory protein of de novo purine nucleotide biosynthesis in Escherichia coli. Ligand-binding domain of purine repressor, PurR, which functions as the master regulatory protein of de novo purine nucleotide biosynthesis in Escherichia coli. This dimeric PurR belongs to the LacI-GalR family of transcription regulators and is activated to bind to DNA operator sites by initially binding either of the high-affinity corepressors, hypoxanthine or guanine. PurR is composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the purine transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 269
25580 380500 cd06277 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 275
25581 380501 cd06278 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 266
25582 380502 cd06279 PBP1_LacI-like ligand-binding domain of an uncharacterized transcription regulator from Corynebacterium glutamicum and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Corynebacterium glutamicum and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 284
25583 380503 cd06280 PBP1_LacI-like ligand-binding domain of an uncharacterized transcription regulator from Staphylococcus saprophyticus and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Staphylococcus saprophyticus and its close homologs from other bacteria. This group belongs to the LacI-GalR family repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding. 266
25584 380504 cd06281 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 270
25585 380505 cd06282 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 267
25586 380506 cd06283 PBP1_RegR_EndR_KdgR-like ligand-binding domain of DNA transcription repressor RegR and other putative regulators such as KdgR and EndR. Ligand-binding domain of DNA transcription repressor RegR and other putative regulators such as KdgR and EndR, all of which are members of the LacI-GalR family of bacterial transcription regulators. RegR regulates bacterial competence and the expression of virulence factors, including hyaluronidase. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 266
25587 380507 cd06284 PBP1_LacI-like ligand-binding domain of an uncharacterized transcription regulator from Actinobacillus succinogenes and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Actinobacillus succinogenes and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors, which are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 267
25588 380508 cd06285 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 269
25589 380509 cd06286 PBP1_CcpB-like ligand-binding domain of a novel transcription factor implicated in catabolite repression in Bacillus and Clostridium species. This group includes the ligand-binding domain of a novel transcription factor implicated in catabolite repression in Bacillus and Clostridium species. Catabolite control protein B (CcpB) is 30% identical in sequence to CcpA which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. Like CcpA, the DNA-binding protein CcpB exerts its catabolite-repressing effect by a mechanism dependent on the presence of HPr(Ser-P), the small phosphocarrier proteins of the phosphoenolpyruvate-sugar phosphotransferase system, but with a less significant degree. 262
25590 380510 cd06287 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 268
25591 380511 cd06288 PBP1_sucrose_transcription_regulator ligand-binding domain of DNA-binding regulatory proteins specific to sucrose that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of DNA-binding regulatory proteins specific to sucrose that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 268
25592 380512 cd06289 PBP1_MalI-like ligand-binding domain of MalI, a transcription regulator of the maltose system of Escherichia coli and its close homologs from other bacteria. This group includes the ligand-binding domain of MalI, a transcription regulator of the maltose system of Escherichia coli and its close homologs from other bacteria. They are members of the LacI-GalR family of repressor proteins which are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 268
25593 380513 cd06290 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 267
25594 380514 cd06291 PBP1_Qymf-like ligand binding domain of the lacI-like transcription regulator from a novel metal-reducing bacterium Alkaliphilus metalliredigens (strain Qymf) and its close homologs. This group includes the ligand binding domain of the lacI-like transcription regulator from a novel metal-reducing bacterium Alkaliphilus metalliredigens (strain Qymf) and its close homologs. Qymf is a strict anaerobe that could be grown in the presence of borax and its cells are straight rods that produce endospores. This group is a member of the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 264
25595 380515 cd06292 PBP1_AglR_RafR-like Ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins highly similar to DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR). Members of this group belong to the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type I periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 273
25596 380516 cd06293 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 270
25597 380517 cd06294 PBP1_MalR-like ligand-binding domain of maltose transcription regulator MalR which is a member of the LacI-GalR family repressors. This group includes the ligand-binding domain of maltose transcription regulator MalR which is a member of the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 269
25598 380518 cd06295 PBP1_CelR ligand binding domain of a transcription regulator of cellulose genes, CelR, which is highly homologous to the LacI-GalR family of bacterial transcription regulators. This group includes the ligand binding domain of a transcription regulator of cellulose genes, CelR, which is highly homologous to the LacI-GalR family of bacterial transcription regulators. The binding of CelR to the celE promoter is inhibited specifically by cellobiose. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 273
25599 380519 cd06296 PBP1_CatR-like ligand-binding domain of a LacI-like transcriptional regulator, CatR which is involved in catechol degradation. This group includes the ligand-binding domain of a LacI-like transcriptional regulator, CatR which is involved in catechol degradation. This group belongs to the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 270
25600 380520 cd06297 PBP1_CcpA_TTHA0807 ligand-binding domain of TTHA0807, a CcpA regulator, from Thermus thermophilus HB8 and its close homologs. Ligand-binding domain of the uncharacterized transcription regulator TTHA0807 from the extremely thermophilic organism Thermus thermophilus HB8 and close homologs from other bacteria. Although its exact biological function is not known, the TTHA0807 belongs to the catabolite control protein A (CcpA) family of regulatory proteins. The CcpA functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators. 268
25601 380521 cd06298 PBP1_CcpA ligand-binding domain of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation. Ligand-binding domain of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators. 268
25602 380522 cd06299 PBP1_LacI-like ligand-binding domain of DNA-binding regulatory protein from Corynebacterium glutamicum, which produces significant amounts of L-glutamate directly from cheap sugar and ammonia. This group includes the ligand-binding domain of DNA-binding regulatory protein from Corynebacterium glutamicum which has a unique ability to produce significant amounts of L-glutamate directly from cheap sugar and ammonia. This regulatory protein is a member of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 268
25603 380523 cd06300 PBP1_ABC_sugar_binding-like periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail. 302
25604 380524 cd06301 PBP1_rhizopine_binding-like periplasmic binding proteins specific to rhizopines. Periplasmic binding proteins specific to rhizopines, which are simple sugar-like compounds produced in the nodules induced by the symbiotic root nodule bacteria, such as Rhizobium and Sinorhizobium. Rhizopine-binding-like proteins from other bacteria are also included. Two inositol based rhizopine compounds are known to date: L-3-O-methyl-scyllo-inosamine (3-O-MSI) and scyllo-inosamine. Bacterial strains that can metabolize rhizopine have a greater competitive advantage in nodulation and rhizopine synthesis is regulated by NifA/NtrA regulatory transcription activators which are maximally expressed at the onset of nitrogen fixation in bacteroids. The members of this group belong to the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily. 272
25605 380525 cd06302 PBP1_LsrB_Quorum_Sensing-like periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs. Periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling. 296
25606 380526 cd06303 PBP1_LuxPQ_Quorum_Sensing periplasmic binding protein (LuxP) of autoinducer-2 (AI-2) receptor LuxPQ from Vibrio harveyi and its close homologs. Periplasmic binding protein (LuxP) of autoinducer-2 (AI-2) receptor LuxPQ from Vibrio harveyi and its close homologs from other bacteria. The members of this group are highly homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea, and that are members of the type 1 periplasmic binding protein superfamily. The Vibrio harveyi AI-2 receptor consists of two polypeptides, LuxP and LuxQ: LuxP is a periplasmic binding protein that binds AI-2 by clamping it between two domains, LuxQ is an integral membrane protein belonging to the two-component sensor kinase family. Unlike AI-2 bound to the LsrB receptor in Salmonella typhimurium, the Vibrio harveyi AI-2 signaling molecule has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LuxPQ to control light production as well as its motility behavior. 320
25607 380527 cd06304 PBP1_BmpA_Med_PnrA-like periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold. 262
25608 380528 cd06305 PBP1_methylthioribose_binding-like similar to methylthioribose-binding protein of ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. Proteins similar to methylthioribose-binding protein of ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. The sugar-binding domain of the periplasmic proteins in this group is also homologous to the ligand-binding domain of eukaryotic receptors such as metabotropic glutamate receptor (mGluR), DNA-binding transcriptional repressors such as LacI and GalR. 273
25609 380529 cd06306 PBP1_TorT-like TorT-like proteins, a periplasmic binding protein family that activates induction of the Tor respiratory system upon trimethylamine N-oxide (TMAO) electron-acceptor binding in bacteria. TorT-like proteins, a periplasmic binding protein family that activates induction of the Tor respiratory system upon trimethylamine N-oxide (TMAO) electron-acceptor binding in bacteria. The Tor respiratory system consists of three proteins (TorC, TorA, and TorD) and is induced in the presence of TMAO. The TMAO control is tightly regulated by three proteins: TorS, TorT, and TorR. Thus, the disruption of any of these proteins can abolish the Tor respiratory induction. TorT shares homology with the sugar-binding domain of the type 1 periplasmic binding proteins. The members of TorT-like family bind TMAO or related compounds and are predicted to be involved in signal transduction and/or substrate transport. 269
25610 380530 cd06307 PBP1_sugar_binding periplasmic sugar-binding domain of uncharacterized transport systems. Periplasmic sugar-binding domain of uncharacterized transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes. 275
25611 380531 cd06308 PBP1_sensor_kinase-like periplasmic binding domain of two-component sensor kinase signaling systems. Periplasmic binding domain of two-component sensor kinase signaling systems, some of which are fused with a C-terminal histidine kinase A domain (HisK) and/or a signal receiver domain (REC). Members of this group share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily and are predicted to be involved in sensing of environmental stimuli; their substrate specificities, however, are not known in detail. 268
25612 380532 cd06309 PBP1_galactofuranose_YtfQ-like periplasmic binding domain of ABC-type galactofuranose YtfQ-like transport systems. Periplasmic binding domain of ABC-type YtfQ-like transport systems. The YtfQ protein from Escherichia coli is up-regulated under glucose-limited conditions and shares homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their ligand specificity is not determined experimentally. 285
25613 380533 cd06310 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 272
25614 380534 cd06311 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 270
25615 380535 cd06312 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 272
25616 380536 cd06313 PBP1_ABC_ThpA_XypA periplasmic sugar-binding proteins (ThpA and XypA) of ABC-type transport systems. This group includes periplasmic D-threitol-binding protein ThpA and xylitol/L-sorbitol-binding protein XypA, which are part of sugar ABC-type transport systems. Both ThpA and XypA share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. 277
25617 380537 cd06314 PBP1_tmGBP periplasmic sugar-binding domain of Thermotoga maritima glucose-binding protein (tmGBP) and its close homologs. Periplasmic sugar-binding domain of Thermotoga maritima glucose-binding protein (tmGBP) and its close homologs from other bacteria. They are members of the type 1 periplasmic binding protein superfamily which consists of two domains connected by a three-stranded hinge. TmGBP is specific for glucose and its binding pocket is buried at the interface of the two domains. TmGBP also exhibits high thermostability and the highest structural similarity to E. coli glucose binding protein (ecGBP). 271
25618 380538 cd06315 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 278
25619 380539 cd06316 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 294
25620 380540 cd06317 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 281
25621 380541 cd06318 PBP1_ABC_D-talitol-like periplasmic D-talitol-binding protein of an ABC transport system and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 282
25622 380542 cd06319 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 278
25623 380543 cd06320 PBP1_allose_binding periplasmic allose-binding domain of bacterial transport systems that function as a primary receptor of active transport and chemotaxis. Periplasmic allose-binding domain of bacterial transport systems that function as a primary receptor of active transport and chemotaxis. The members of this group belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Like other periplasmic receptors of the ABC-type transport systems, the allose-binding protein consists of two alpha/beta domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. 283
25624 380544 cd06321 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 270
25625 380545 cd06322 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 270
25626 380546 cd06323 PBP1_ribose_binding periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis ribose binding protein (ttRBP) and its mesophilic homologs. Periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis D-ribose binding protein (ttRBP) and its mesophilic homologs. Members of this group belong to the type 1 periplasmic binding protein superfamily, whose members are involved in chemotaxis, ATP-binding cassette transport, and intercellular communication in the central nervous system. The thermophilic and mesophilic ribose-binding proteins are structurally very similar, but differ substantially in thermal stability. 268
25627 380547 cd06324 PBP1_ABC_sugar_binding-like periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 317
25628 380548 cd06325 PBP1_ABC_unchar_transporter type 1 periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally. 282
25629 380549 cd06326 PBP1_ABC_ligand_binding-like periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally, however. 339
25630 380550 cd06327 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in gram-negative, gram-positive bacteria, and archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 336
25631 380551 cd06328 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in gram-negative and gram-positive bacteria. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 336
25632 380552 cd06329 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 343
25633 380553 cd06330 PBP1_As_SBP-like periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea that is predicted to be involved in the efflux of toxic compounds. Members of this subgroup include proteins from Herminiimonas arsenicoxydans, which is resistant to arsenic (As) and various heavy metals such as cadmium and zinc. Moreover, they show significant sequence similarity to the cluster of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. 342
25634 380554 cd06331 PBP1_AmiC-like type 1 periplasmic components of amide-binding protein (AmiC) and the active transport system for short-chain amides and urea (FmdDEF). This group includes the type 1 periplasmic components of amide-binding protein (AmiC) and the active transport system for short-chain amides and urea (FmdDEF), found in bacteria and Archaea. AmiC controls expression of the amidase operon by a ligand-triggered conformational switch. In the absence of ligand or presence of butyramide (repressor), AmiC (the ligand sensor and negative regulator) adopts an open conformation and inhibits the transcription antitermination function of AmiR by direct protein-protein interaction. In the presence of inducing ligands such as acetamide, AmiC adopts a closed conformation which disrupts a silencing AmiC-AmiR complex and the expression of amidase and other genes of the operon is induced. FmdDEF is predicted to be an ATP-dependent transporter and closely resembles the periplasmic binding protein and the two transmembrane proteins present in various hydrophobic amino acid-binding transport systems. 333
25635 380555 cd06332 PBP1_aromatic_compounds-like type 1 periplasmic binding proteins of active transport systems predicted to be involved in transport of aromatic compounds such as 2-nitrobenzoic acid and alkylbenzenes. This group includes the type 1 periplasmic binding proteins of active transport systems that are predicted to be involved in transport of aromatic compounds such as 2-nitrobenzoic acid and alkylbenzenes; their substrate specificities are not well characterized, however. Members also exhibit close similarity to active transport systems for short chain amides and/or urea found in bacteria and archaea. 336
25636 380556 cd06333 PBP1_ABC_RPA1789-like type 1 periplasmic binding-protein component (CouP) of an ABC system (CouPSTU; RPA1789, RPA1791-1793), involved in active transport of lignin-derived aromatic substrates, and its close homologs. This group includes RPA1789 (CouP) from Rhodopseudomonas palustris and its close homologs in other bacteria. RPA1789 (CouP) is the periplasmic binding-protein component of an ABC system (CouPSTU; RPA1789, RPA1791-1793) that is involved in the active transport of lignin-derived aromatic substrates. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP). 342
25637 380557 cd06334 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally. 360
25638 380558 cd06335 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally. 348
25639 380559 cd06336 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally. 345
25640 380560 cd06337 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally. 354
25641 380561 cd06338 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT); however, their ligand specificity has not been determined experimentally. 347
25642 380562 cd06339 PBP1_YraM_LppC_lipoprotein-like periplasmic binding component of lipoprotein LppC, an immunodominant antigen. This subgroup includes periplasmic binding component of lipoprotein LppC, an immunodominant antigen, whose molecular function is not characterized. Members of this subgroup are predicted to be involved in transport of lipid compounds, and they are sequence similar to the family of ABC-type hydrophobic amino acid transporters (HAAT). 331
25643 380563 cd06340 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally. 352
25644 380564 cd06341 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally. 340
25645 380565 cd06342 PBP1_ABC_LIVBP-like type 1 periplasmic ligand-binding domain of ABC (Atpase Binding Cassette)-type active transport systems involved in the transport of all three branched chain aliphatic amino acids (leucine, isoleucine and valine). This subgroup includes the type 1 periplasmic ligand-binding domain of ABC (Atpase Binding Cassette)-type active transport systems that are involved in the transport of all three branched chain aliphatic amino acids (leucine, isoleucine and valine). This subgroup also includes a leucine-specific binding protein (or LivK), which is very similar in sequence and structure to leucine-isoleucine-valine binding protein (LIVBP). ABC-type active transport systems are transmembrane proteins that function in the transport of diverse sets of substrates across extra- and intracellular membranes, including carbohydrates, amino acids, inorganic ions, dipeptides and oligopeptides, metabolic products, lipids and sterols, and heme, to name a few. 334
25646 380566 cd06343 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however its ligand specificity has not been determined experimentally. 355
25647 380567 cd06344 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 332
25648 380568 cd06345 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 356
25649 380569 cd06346 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 314
25650 380570 cd06347 PBP1_ABC_LivK_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 334
25651 380571 cd06348 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 342
25652 380572 cd06349 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 338
25653 380573 cd06350 PBP1_GPCR_family_C-like ligand-binding domain of membrane-bound glutamate receptors that mediate excitatory transmission on the cellular surface through initial binding of glutamate; categorized into ionotropic glutamate receptors (iGluRs) and metabotropic glutamate receptors (mGluRs). Ligand-binding domain of membrane-bound glutamate receptors that mediate excitatory transmission on the cellular surface through initial binding of glutamate and are categorized into ionotropic glutamate receptors (iGluRs) and metabotropic glutamate receptors (mGluRs). The metabotropic glutamate receptors (mGluR) are key receptors in the modulation of excitatory synaptic transmission in the central nervous system. The mGluRs are coupled to G proteins and are thus distinct from the iGluRs which internally contain ligand-gated ion channels. The mGluR structure is divided into three regions: the extracellular region, the seven-spanning transmembrane region and the cytoplasmic region. The extracellular region is further divided into the ligand-binding domain (LBD) and the cysteine-rich domain. The LBD has sequence similarity to the LIVBP, which is a bacterial periplasmic protein (PBP), as well as to the extracellular region of both iGluR and the gamma-aminobutyric acid (GABA)b receptor. iGluRs are divided into three main subtypes based on pharmacological profile: NMDA, AMPA, and kainate receptors. All family C GPCRs have a large extracellular N terminus that contain a domain with homology to bacterial periplasmic amino acid-binding proteins. 350
25654 380574 cd06351 PBP1_iGluR_N_LIVBP-like N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NMDA, AMPA, and kainate receptor subtypes of ionotropic glutamate receptors (iGluRs). N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NMDA, AMPA, and kainate receptor subtypes of ionotropic glutamate receptors (iGluRs). While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Glutamate mediates the majority of excitatory synaptic transmission in the central nervous system via two broad classes of ionotropic receptors characterized by their response to glutamate agonists: N-methyl-aspartate (NMDA) and non-NMDA receptors. NMDA receptors have intrinsically slow kinetics, are highly permeable to Ca2+, and are blocked by extracellular Mg2+ in a voltage-dependent manner. On the other hand, non-NMDA receptors have faster kinetics, are weakly permeable to Ca2+, and are not blocked by extracellular Mg2+. While non-NMDA receptors typically mediate excitatory synaptic responses at resting membrane potentials, NMDA receptors contribute to several forms of synaptic plasticity and are suggested to play an important role in the development of synaptic pathways. 348
25655 380575 cd06352 PBP1_NPR_GC-like ligand-binding domain of membrane guanylyl-cyclase receptors. Ligand-binding domain of membrane guanylyl-cyclase receptors. Membrane guanylyl cyclases (GC) have a single membrane-spanning region and are activated by endogenous and exogenous peptides. This family can be divided into three major subfamilies: the natriuretic peptide receptors (NPRs), sensory organ-specific membrane GCs, and the enterotoxin/guanylin receptors. The binding of peptide ligands to the receptor results in the activation of the cytosolic catalytic domain. Three types of NPRs have been cloned from mammalian tissues: NPR-A/GC-A, NPR-B/ GC-B, and NPR-C. In addition, two of the GCs, GC-D and GC-G, appear to be pseudogenes in humans. Atrial natriuretic peptide (ANP) and brain natriuretic peptide (BNP) are produced in the heart, and both bind to the NPR-A. NPR-C, also termed the clearance receptor, binds each of the natriuretic peptides and can alter circulating levels of these peptides. The ligand binding domain of the NPRs exhibits strong structural similarity to the type 1 periplasmic binding fold protein family. 391
25656 380576 cd06353 PBP1_Med-like periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. Periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. Med, a cell-surface localized protein, which regulates the competence transcription factor gene comK in Bacillus subtilis, lacks the DNA binding domain when compared with structures of transcription regulators from the LacI family. Nevertheless, Med has significant overall sequence homology to various periplasmic substrate-binding proteins. Moreover, the structure of Med shows a striking similarity to PnrA, a periplasmic nucleoside binding protein of an ATP-binding cassette transport system. Members of this group contain the type 1 periplasmic sugar-binding protein-like fold. 260
25657 380577 cd06354 PBP1_PrnA-like periplasmic binding domain of basic membrane lipoprotein, PnrA, in Treponema pallidum and its homologs from other bacteria and Archaea. Periplasmic binding domain of basic membrane lipoprotein, PnrA, in Treponema pallidum and its homologs from other bacteria and Archaea. The PnrA lipoprotein, also known as Tp0319 or TmpC, represents a novel family of bacterial purine nucleoside receptor encoded within an ATP-binding cassette (ABC) transport system (pnrABCDE). It shows a striking structural similarity to another basic membrane lipoprotein Med which regulates the competence transcription factor gene, comK, in Bacillus subtilis. The members of PnrA-like subgroup are likely to have similar nucleoside-binding functions and a similar type 1 periplasmic sugar-binding protein-like fold. 268
25658 380578 cd06355 PBP1_FmdD-like periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF). This group includes the periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF), found in Methylophilus methylotrophus, and its homologs from other bacteria. FmdD, a type 1 periplasmic binding protein, is induced by short-chain amides and urea and repressed by excess ammonia, while FmdE and FmdF are hydrophobic transmembrane proteins. FmdDEF is predicted to be an ATP-dependent transporter and closely resembles the periplasmic binding protein and the two transmembrane proteins present in various hydrophobic amino acid-binding transport systems. 347
25659 380579 cd06356 PBP1_amide_urea_BP-like periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF). This group includes the type 1 periplasmic-binding proteins that are predicted to have a function similar to that of an active transport system for short chain amides and/or urea in bacteria and Archaea, by sequence comparison and phylogenetic analysis. 334
25660 380580 cd06357 PBP1_AmiC periplasmic binding domain of amidase (AmiC) that belongs to the type 1 periplasmic binding fold protein family. This group includes the periplasmic binding domain of amidase (AmiC) that belongs to the type 1 periplasmic binding fold protein family. AmiC controls expression of the amidase operon by a ligand-triggered conformational switch. In the absence of ligand or presence of butyramide (repressor), AmiC (the ligand sensor and negative regulator) adopts an open conformation and inhibits the transcription antitermination function of AmiR by direct protein-protein interaction. In the presence of inducing ligands such as acetamide, AmiC adopts a closed conformation which disrupts a silencing AmiC-AmiR complex and the expression of amidase and other genes of the operon is induced. 357
25661 380581 cd06358 PBP1_NHase type 1 periplasmic-binding protein of the nitrile hydratase (NHase) system that selectively converts nitriles to corresponding amides. This group includes the type 1 periplasmic-binding protein of the nitrile hydratase (NHase) system that selectively converts nitriles to corresponding amides, which are subsequently converted by amidases to yield free carboxylic acids and ammonia. NHases from bacteria and fungi have been purified and characterized. In Rhodococcus sp., the nitrile hydratase operon consists of six genes encoding NHase regulator 2, NHase regulator 1, amidase, NHase alpha subunit, NHase beta subunit, and NHase activator. The operon produces a constitutive hydratase that has a broad substrate spectrum: aliphatic and aromatic nitriles, mononitriles and dinitriles, hydroxynitriles and amino-nitriles, and a constitutive amidase of equally low substrate specificity. NHases are metalloenzymes containing either cobalt or iron, and therefore can be classified into two subgroups: ferric NHases and cobalt NHases. 333
25662 380582 cd06359 PBP1_Nba-like type 1 periplasmic binding component of active transport systems predicted to be involved in 2-nitrobenzoic acid degradation pathway. This group includes the type 1 periplasmic binding component of active transport systems that are predicted to be involved in 2-nitrobenzoic acid degradation pathway; their substrate specificities are not well characterized. 333
25663 380583 cd06360 PBP1_alkylbenzenes-like type 1 periplasmic binding component of active transport systems predicted to be involved in anaerobic biodegradation of alkylbenzenes such as toluene and ethylbenzene. This group includes the type 1 periplasmic binding component of active transport systems that are predicted to be involved in anaerobic biodegradation of alkylbenzenes such as toluene and ethylbenzene; their substrate specificity is not well characterized, however. 357
25664 380584 cd06361 PBP1_GPC6A-like ligand-binding domain of the promiscuous L-alpha-amino acid receptor GPRC6A which is a broad-spectrum amino acid-sensing receptor. This family includes the ligand-binding domain of the promiscuous L-alpha-amino acid receptor GPRC6A which is a broad-spectrum amino acid-sensing receptor, and its fish homolog, the 5.24 chemoreceptor. GPRC6A is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into cellular responses. 401
25665 380585 cd06362 PBP1_mGluR ligand binding domain of metabotropic glutamate receptors (mGluR). Ligand binding domain of the metabotropic glutamate receptors (mGluR), which are members of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into cellular responses. mGluRs bind glutamate, an excitatory neurotransmitter; they are involved in learning, memory, anxiety, and the perception of pain. Eight subtypes of mGluRs have been cloned so far, and are classified into three groups according to their sequence similarities, transduction mechanisms, and pharmacological profiles. Group I is composed of mGlu1R and mGlu5R that both stimulate PLC hydrolysis. Group II includes mGlu2R and mGlu3R, which inhibit adenylyl cyclase, as do mGlu4R, mGlu6R, mGlu7R, and mGlu8R, which form group III. 460
25666 380586 cd06363 PBP1_taste_receptor ligand-binding domain of the T1R taste receptor. Ligand-binding domain of the T1R taste receptor. The T1R is a member of the family C receptors within the G-protein coupled receptor superfamily, which also includes the metabotropic glutamate receptors, GABAb receptors, the calcium-sensing receptor (CaSR), the V2R pheromone receptors, and a small group of uncharacterized orphan receptors. 418
25667 380587 cd06364 PBP1_CaSR ligand-binding domain of the CaSR calcium-sensing receptor, a member of the family C receptors within the G-protein coupled receptor superfamily. Ligand-binding domain of the CaSR calcium-sensing receptor, which is a member of the family C receptors within the G-protein coupled receptor superfamily. CaSR provides feedback control of extracellular calcium homeostasis by responding sensitively to acute fluctuations in extracellular ionized Ca2+ concentration. This ligand-binding domain has homology to the bacterial leucine-isoleucine-valine binding protein (LIVBP) and a leucine binding protein (LBP). CaSR is widely expressed in mammalian tissues and is active in tissues that are not directly involved in extracellular calcium homeostasis. Moreover, CaSR responds to aromatic, aliphatic, and polar amino acids, but not to positively charged or branched chain amino acids, which suggests that changes in plasma amino acid levels are likely to modulate whole body calcium metabolism. Additionally, the family C GPCRs includes at least two receptors with broad-spectrum amino acid-sensing properties: GPRC6A which recognizes basic and various aliphatic amino acids, its gold-fish homolog the 5.24 chemoreceptor, and a specific taste receptor (T1R) which responds to aliphatic, polar, charged, and branched amino acids, but not to aromatic amino acids. 473
25668 380588 cd06365 PBP1_pheromone_receptor Ligand-binding domain of the V2R pheromone receptor, a member of the family C receptors within the G-protein coupled receptor superfamily. Ligand-binding domain of the V2R pheromone receptor, a member of the family C receptors within the G-protein coupled receptor superfamily, which also includes the metabotropic glutamate receptor, the GABAb receptor, the calcium-sensing receptor (CaSR), the T1R taste receptor, and a small group of uncharacterized orphan receptors. 464
25669 380589 cd06366 PBP1_GABAb_receptor ligand-binding domain of GABAb receptors, which are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). Ligand-binding domain of GABAb receptors, which are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). GABA is the major inhibitory neurotransmitter in the mammalian CNS and, like glutamate and other transmitters, acts via both ligand gated ion channels (GABAa receptors) and G-protein coupled receptors (GABAb receptor or GABAbR). GABAa receptors are members of the ionotropic receptor superfamily which includes alpha-adrenergic and glycine receptors. The GABAb receptor is a member of a receptor superfamily which includes the mGlu receptors. The GABAb receptor is coupled to G alpha-i proteins, and activation causes a decrease in calcium, an increase in potassium membrane conductance, and inhibition of cAMP formation. The response is thus inhibitory and leads to hyperpolarization and decreased neurotransmitter release, for example. 404
25670 380590 cd06367 PBP1_iGluR_NMDA N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. The NMDA subtype receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer comprising two NR1 and two NR2 (A, B, C, and D) or NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 357
25671 380591 cd06368 PBP1_iGluR_non_NMDA-like N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the non-NMDA (N-methyl-D-aspartate) subtypes of ionotropic glutamate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the non-NMDA (N-methyl-D-aspartate) subtypes of ionotropic glutamate receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Glutamate mediates the majority of excitatory synaptic transmission in the central nervous system via two broad classes of ionotropic receptors, characterized by their response to glutamate agonists: N-methyl-D-aspartate (NMDA) and non-NMDA receptors. NMDA receptors have intrinsically slow kinetics, are highly permeable to Ca2+, and are blocked by extracellular Mg2+ in a voltage-dependent manner. Non-NMDA receptors have faster kinetics, are most often only weakly permeable to Ca2+, and are not blocked by extracellular Mg2+. While non-NMDA receptors typically mediate excitatory synaptic responses at resting membrane potentials, NMDA receptors contribute to several forms of synaptic plasticity and are thought to play an important role in the development of synaptic pathways. Non-NMDA receptors include alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionate (AMPA) and kainate receptors. 339
25672 380592 cd06369 PBP1_GC_C_enterotoxin_receptor ligand-binding domain of the membrane guanylyl cyclase C. Ligand-binding domain of the membrane guanylyl cyclase C (GC-C or StaR). StaR is a key receptor for the STa (Escherichia coli Heat Stable enterotoxin), a potent stimulant of intestinal chloride and bicarbonate secretion that causes acute secretory diarrhea. The catalytic domain of the STa/guanylin receptor type membrane GC is highly similar to those of the natriuretic peptide receptor (NPR) type and sensory organ-specific type membrane GCs (GC-D, GC-E and GC-F). The GC-C receptor is mainly expressed in the intestine of most vertebrates, but is also found in the kidney and other organs. Moreover, GC-C is activated by guanylin and uroguanylin, endogenous peptide ligands synthesized in the intestine and kidney. Consequently, the receptor activation results in increased cGMP levels and phosphorylation of the CFTR chloride channel and secretion. 381
25673 380593 cd06370 PBP1_SAP_GC-like Ligand-binding domain of membrane bound guanylyl cyclases. Ligand-binding domain of membrane bound guanylyl cyclases (GCs), which are known to be activated by sperm-activating peptides (SAPs), such as speract or resact. These ligand peptides are released by a range of invertebrates to stimulate the metabolism and motility of spermatozoa and are also potent chemoattractants. These GCs contain a single transmembrane segment, an extracellular ligand binding domain, and intracellular protein kinase-like and cyclase catalytic domains. GCs of insect and nematodes, which exhibit high sequence similarity to the speract receptor are also included in this model. 400
25674 380594 cd06371 PBP1_sensory_GC_DEF-like ligand-binding domain of membrane guanylyl cyclases (GC-D, GC-E, and GC-F) that are specifically expressed in sensory tissues. This group includes the ligand-binding domain of membrane guanylyl cyclases (GC-D, GC-E, and GC-F) that are specifically expressed in sensory tissues. They share a similar topology with an N-terminal extracellular ligand-binding domain, a single transmembrane domain, and a C-terminal cytosolic region that contains kinase-like and catalytic domains. GC-D is specifically expressed in a subpopulation of olfactory sensory neurons. GC-E and GC-F are colocalized within the same photoreceptor cells of the retina and have important roles in phototransduction. Unlike the other family members, GC-E and GC-F have no known extracellular ligands. Instead, they are activated under low calcium conditions by guanylyl cyclase activating proteins called GCAPs. GC-D expressing neurons have been implicated in pheromone detection and GC-D is phylogenetically more similar to the Ca2+-regulated GC-E and GC-F than to receptor GC-A, -B and -C which are activated by peptide ligands. Moreover, these olfactory GCs and retinal GCs share characteristic sequence similarity in a regulatory domain that is involved in the binding of GCAPs, suggesting GC-D activity may be regulated by an unknown extracellular ligand and intracellular Ca2+. Rodent GC-D-expressing neurons have been implicated in pheromone detection and were recently shown to respond to atmospheric CO2 which is an olfactory stimulus for many invertebrates and regulates some insect innate behavior, such as the location of food and hosts. 379
25675 380595 cd06372 PBP1_GC_G-like Ligand-binding domain of membrane guanylyl cyclase G. This group includes the ligand-binding domain of membrane guanylyl cyclase G (GC-G) which is a sperm surface receptor and might function, similar to its sea urchin counterpart, in the early signaling event that regulates the Ca2+ influx/efflux and subsequent motility response in sperm. GC-G appears to be a pseudogene in human. Furthermore, in contrast to the other orphan receptor GCs, GC-G has a broad tissue distribution in rat, including lung, intestine, kidney, and skeletal muscle. 390
25676 380596 cd06373 PBP1_NPR-like Ligand binding domain of natriuretic peptide receptor (NPR) family. Ligand binding domain of natriuretic peptide receptor (NPR) family which consists of three different subtypes: type A natriuretic peptide receptor (NPR-A, or GC-A), type B natriuretic peptide receptors (NPR-B, or GC-B), and type C natriuretic peptide receptor (NPR-C). There are three types of natriuretic peptide (NP) ligands specific to the receptors: atrial NP (ANP), brain or B-type NP (BNP), and C-type NP (CNP). The NP family is thought to have arisen through gene duplication during evolution and plays an essential role in cardiovascular and body fluid homeostasis. ANP and BNP bind mainly to NPR-A, while CNP binds specifically to NPR-B. Both NPR-A and NPR-B have guanylyl cyclase catalytic activity and produce intracellular secondary messenger cGMP in response to peptide-ligand binding. Consequently, the NPR-A activation results in vasodilation and inhibition of vascular smooth muscle cell proliferation. NPR-C acts as the receptor for all the three members of NP family, and functions as a clearance receptor. Unlike NPR-A and -B, NPR-C lacks an intracellular guanylyl cyclase domain and is thought to exert biological actions by sequestration of released natriuretic peptides and/or inhibition of adenylyl cyclase. 394
25677 380597 cd06374 PBP1_mGluR_groupI ligand binding domain of the group I metabotropic glutamate receptor. Ligand binding domain of the group I metabotropic glutamate receptor, a family containing mGlu1R and mGlu5R, all of which stimulate phospholipase C (PLC) hydrolysis. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes. 474
25678 380598 cd06375 PBP1_mGluR_groupII ligand binding domain of the group II metabotropic glutamate receptor. Ligand binding domain of the group II metabotropic glutamate receptor, a family that contains mGlu2R and mGlu3R, all of which inhibit adenylyl cyclase. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes. 462
25679 380599 cd06376 PBP1_mGluR_groupIII ligand-binding domain of the group III metabotropic glutamate receptor. Ligand-binding domain of the group III metabotropic glutamate receptor, a family which contains mGlu4R, mGlu6R, mGlu7R, and mGlu8R, all of which inhibit adenylyl cyclase. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes. 467
25680 380600 cd06377 PBP1_iGluR_NMDA_NR3 N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR3 subunit of NMDA receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR3 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 373
25681 380601 cd06378 PBP1_iGluR_NMDA_NR2 N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR2 subunit of NMDA receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR2 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 356
25682 380602 cd06379 PBP1_iGluR_NMDA_NR1 N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore can be classified as excitatory glycine receptors. NR1/NR3 receptors are calcium-impermeable and unaffected by ligands acting at the NR2 glutamate-binding site. 364
25683 380603 cd06380 PBP1_iGluR_AMPA N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor, a member of the glutamate-receptor ion channels (iGluRs). AMPA receptors are the major mediators of excitatory synaptic transmission in the central nervous system. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. AMPA receptors consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 390
25684 380604 cd06381 PBP1_iGluR_delta-like N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. This CD represents the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family which is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans. 401
25685 380605 cd06382 PBP1_iGluR_Kainate N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the kainate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the kainate receptors, non-NMDA ionotropic receptors which respond to the neurotransmitter glutamate. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Kainate receptors have five subunits, GluR5, GluR6, GluR7, KA1 and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined. 335
25686 380606 cd06383 PBP1_iGluR_AMPA_Like N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of uncharacterized AMPA-like receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of uncharacterized AMPA-like receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. AMPA receptors consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 379
25687 380607 cd06384 PBP1_NPR_B ligand-binding domain of type B natriuretic peptide receptor. Ligand-binding domain of type B natriuretic peptide receptor (NPR-B). NPR-B is one of three known single membrane-spanning natriuretic peptide receptors that have been identified. Natriuretic peptides are a family of structurally related but genetically distinct hormones/paracrine factors that regulate blood volume, blood pressure, ventricular hypertrophy, pulmonary hypertension, fat metabolism, and long bone growth. In mammals there are three natriuretic peptides: ANP, BNP, and CNP. Like NPR-A (or GC-A), NPR-B (or GC-B) is a transmembrane guanylyl cyclase, an enzyme that catalyzes the synthesis of cGMP. NPR-B is the predominant natriuretic peptide receptor in the brain. The rank order of NPR-B activation by natriuretic peptides is CNP>>ANP>BNP. Homozygous inactivating mutations in human NPR-B cause a form of short-limbed dwarfism known as acromesomelic dysplasia type Maroteaux. 399
25688 380608 cd06385 PBP1_NPR_A Ligand-binding domain of type A natriuretic peptide receptor. Ligand-binding domain of type A natriuretic peptide receptor (NPR-A). NPR-A is one of three known single membrane-spanning natriuretic peptide receptors that regulate blood volume, blood pressure, ventricular hypertrophy, pulmonary hypertension, fat metabolism, and long bone growth. In mammals there are three natriuretic peptides: ANP, BNP, and CNP. NPR-A is highly expressed in kidney, adrenal, terminal ileum, adipose, aortic, and lung tissues. The rank order of NPR-A activation by natriuretic peptides is ANP>BNP>>CNP. Single allele-inactivating mutations in the promoter of human NPR-A are associated with hypertension and heart failure. 408
25689 380609 cd06386 PBP1_NPR_C ligand-binding domain of type C natriuretic peptide receptor. Ligand-binding domain of type C natriuretic peptide receptor (NPR-C). NPR-C is found in atrial, mesentery, placenta, lung, kidney, venous tissue, aortic smooth muscle, and aortic endothelial cells. The affinity of NPR-C for natriuretic peptides is ANP>CNP>BNP. The extracellular domain of NPR-C is about 30% identical to NPR-A and NPR-B. However, unlike the cyclase-linked receptors, it contains only 37 intracellular amino acids and no guanylyl cyclase activity. The major function of NPR-C is to clear natriuretic peptides from the circulation or extracellular surroundings through constitutive receptor-mediated internalization and degradation. 391
25690 380610 cd06387 PBP1_iGluR_AMPA_GluR3 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR3 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR3 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs). 375
25691 380611 cd06388 PBP1_iGluR_AMPA_GluR4 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR4 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR4 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs). 373
25692 380612 cd06389 PBP1_iGluR_AMPA_GluR2 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR2 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR2 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs). 372
25693 380613 cd06390 PBP1_iGluR_AMPA_GluR1 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR1 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR1 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs). 367
25694 380614 cd06391 PBP1_iGluR_delta_2 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta2 receptor of an orphan glutamate receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta2 receptor of an orphan glutamate receptor family. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more closely related to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells. 402
25695 380615 cd06392 PBP1_iGluR_delta_1 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta1 receptor of an orphan glutamate receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta1 receptor of an orphan glutamate receptor family. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 may be more closely related to non-NMDA receptors. In contrast to GluRdelta2, GluRdelta1 is expressed in many areas in the developing CNS, including the hippocampus and the caudate putamen. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans. 402
25696 380616 cd06394 PBP1_iGluR_Kainate_KA1_2 N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the KA1 and KA2 subunits of Kainate receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the KA1 and KA2 subunits of Kainate receptor. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined. 379
25697 99717 cd06395 PB1_Map2k5 The PB1 domain is an essential part of the mitogen-activated protein kinase kinase 5 (Map2k5, alias MEK5), one of the key members of the signaling kinase cascade which is involved in angiogenesis and early cardiovascular development. The PB1 domain of Map2k5 interacts with the PB1 domain of another member of the kinase cascade, MEKK2 (or MEKK3). A canonical PB1-PB1 interaction, involving heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The Map2k5 protein contains a type I PB1 domain. 91
25698 99718 cd06396 PB1_NBR1 The PB1 domain is an essential part of NBR1 protein, next to BRCA1, a scaffold protein mediating specific protein-protein interaction with both titin protein kinase and with another scaffold protein p62. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The NBR1 protein contains a type I PB1 domain. 81
25699 99719 cd06397 PB1_UP1 Uncharacterized protein 1. The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. 82
25700 99720 cd06398 PB1_Joka2 The PB1 domain is present in the Nicotiana plumbaginifolia Joka2 protein which interacts with sulfur stress inducible UP9 protein. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 91
25701 99721 cd06399 PB1_P40 The PB1 domain is essential part of the p40 adaptor protein which plays an important role in activating phagocyte NADPH oxidase during phagocytosis. The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes , such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The PB1 domain of p40 represents a type I PB1 domain which interacts with the PB1 domain of oxidase activator p67 which belong to type II PB1 domain. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 92
25702 99722 cd06401 PB1_TFG The PB1 domain found in TFG protein, an oncogenic gene product and fusion partner to nerve growth factor tyrosine kinase receptor TrkA and to the tyrosine kinase ALK. The PB1 domain is a modular domain mediating specific protein-protein interaction in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The PB1 domains of TFG represent a type I/II PB1 domain. The physiological function of TFG remains unknown. 81
25703 99723 cd06402 PB1_p62 The PB1 domain is an essential part of p62 scaffold protein (alias sequestosome 1,SQSTM) involved in cell signaling, receptor internalization, and protein turnover. The PB1 domain is a modular domain mediating specific protein-protein interaction which play roles in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 87
25704 99724 cd06403 PB1_Par6 The PB1 domain is an essential part of Par6 protein which in complex with Par3 and aPKC proteins is crucial for establishment of apical-basal polarity of animal cells. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The Par6 protein contains a type II PB1 domain. 80
25705 99725 cd06404 PB1_aPKC PB1 domain is an essential modular domain of the atypical protein kinase C (aPKC) which in complex with Par6 and Par3 proteins is crucial for establishment of apical-basal polarity of animal cells. PB1 domain is a modular domain mediating specific protein-protein interaction which play roles in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The aPKC protein contains a type I/II PB1 domain. 83
25706 99726 cd06405 PB1_Mekk2_3 The PB1 domain is present in the two mitogen-activated protein kinase kinases MEKK2 and MEKK3 which are two members of the signaling kinase cascade involved in angiogenesis and early cardiovascular development. The PB1 domain of MEKK2 (and/or MEKK3) interacts with the PB1 domain of another member of the kinase cascade Map2k5. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The MEKK2 and MEKK3 proteins contain a type II PB1 domain. 79
25707 99727 cd06406 PB1_P67 A PB1 domain is present in p67 proteins which forms a signaling complex with p40, a crucial step for activation of NADPH oxidase during phagocytosis. PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes . A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The p67 proteins contain a type II PB1 domain. 80
25708 99728 cd06407 PB1_NLP A PB1 domain is present in NIN like proteins (NLP), a key enzyme in a process of establishment of symbiosis between legumes and nitrogen fixing bacteria (Rhizobium). The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes like osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 82
25709 99729 cd06408 PB1_NoxR The PB1 domain is present in the Epichloe festucae NoxR protein (NADPH oxidase regulator), a key regulator of NADPH oxidase isoform, NoxA. NoxA is essential for growth control of the fungal endophyte in plant tissue in the process of symbiotic interaction between a fungus and its plant host. The Epichloe festucae p67(phox)-like regulator, NoxR, is dispensable in culture but essential in plants for the symbiotic interaction. Plants infected with a noxR deletion mutant show severe stunting and premature senescence, whereas hyphae in the meristematic tissues show increased branching leading to increased fungal colonization of pseudostem and leaf blade tissue. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 86
25710 99730 cd06409 PB1_MUG70 The MUG70 protein is a product of the meiotically up-regulated gene 70 which has a role in meiosis and harbors a PB1 domain. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 86
25711 99731 cd06410 PB1_UP2 Uncharacterized protein 2. The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. 97
25712 99732 cd06411 PB1_p51 The PB1 domain is present in the p51 protein, a homolog of the p67 protein. p51 plays an important role in NADPH oxidase activation during phagocytosis. The PB1 domain is a modular domain mediating specific protein-protein interaction in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. 78
25713 119374 cd06412 GH25_CH-type CH-type (Chalaropsis-type) lysozymes represent one of four functionally-defined classes of peptidoglycan hydrolases (also referred to as endo-N-acetylmuramidases) that cleave bacterial cell wall peptidoglycans. CH-type lysozymes exhibit both lysozyme (acetylmuramidase) and diacetylmuramidase activity. The first member of this family to be described was a muramidase from the fungus Chalaropsis. However, a majority of the CH-type lysozymes are found in bacteriophages and Gram-positive bacteria such as Streptomyces and Clostridium. CH-type lysozymes have a single glycosyl hydrolase family 25 (GH25) domain with an unusual beta/alpha-barrel fold in which the last strand of the barrel is antiparallel to strands beta7 and beta1. Most CH-type lysozymes appear to lack the cell wall-binding domain found in other GH25 muramidases. 199
25714 119375 cd06413 GH25_muramidase_1 Uncharacterized bacterial muramidase containing a glycosyl hydrolase family 25 (GH25) catalytic domain. Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. 191
25715 119376 cd06414 GH25_LytC-like The LytC lysozyme of Streptococcus pneumoniae is a bacterial cell wall hydrolase that cleaves the beta1-4-glycosidic bond located between the N-acetylmuramoyl-N-glucosaminyl residues of the cell wall polysaccharide chains. LytC is composed of a C-terminal glycosyl hydrolase family 25 (GH25) domain and an N-terminal choline-binding module (CBM) consisting of eleven homologous repeats that specifically recognizes the choline residues of pneumococcal lipoteichoic and teichoic acids. This domain arrangement is the reverse of the major pneumococcal autolysin, LytA, and the CPL-1-like lytic enzymes of the pneumococcal bacteriophages, in which the CBM (consisting of six repeats) is at the C-terminus. This model represents the C-terminal catalytic domain of the LytC-like enzymes. 191
25716 119377 cd06415 GH25_Cpl1-like Cpl-1 lysin (also known as Cpl-9 lysozyme / muramidase) is a bacterial cell wall endolysin encoded by the pneumococcal bacteriophage Cp-1, which cleaves the glycosidic N-acetylmuramoyl-(beta1,4)-N-acetylglucosamine bonds of the pneumococcal glycan chain, thus acting as an enzymatic antimicrobial agent (an enzybiotic) against streptococcal infections. Cpl-1 belongs to the CP family of lysozymes (CPL lysozymes) which includes the Cpl-7 lysin. Cpl-1 has a glycosyl hydrolase family 25 (GH25) catalytic domain with an irregular (beta/alpha)5-beta3 barrel and a C-terminal cell wall-anchoring module formed by six similar choline-binding repeats (ChBr's). The ChBr's facilitate the anchoring of Cpl-1 to the choline-containing teichoic acid of the pneumococcal cell wall. Other members of this domain family have an N-terminal CHAP (cysteine, histidine-dependent amidohydrolases/peptidases) domain similar to that of the firmicute CHAP lysins and associated with endopeptidase activity. The Cpl-7 lysin is also included here as is LysB of Lactococcus phage, and the Mur lysin of Lactobacillus phage. 196
25717 119378 cd06416 GH25_Lys1-like Lys-1 is a lysozyme encoded by the Caenorhabditis elegans lys-1 gene. This gene is one of several lysozyme genes upregulated upon infection by the Gram-negative bacterial pathogen Serratia marcescens. Lys-1 contains a glycosyl hydrolase family 25 (GH25) catalytic domain. This family also includes Lys-5 from Caenorhabditis elegans. 196
25718 119379 cd06417 GH25_LysA-like LysA is a cell wall endolysin produced by Lactobacillus fermentum, which degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. The N-terminal glycosyl hydrolase family 25 (GH25) domain of LysA has sequence similarity with other murein hydrolase catalytic domains while the C-terminal domain has sequence similarity with putative bacterial cell wall-binding SH3b domains. This domain family also includes LysL of Lactococcus lactis. 195
25719 119380 cd06418 GH25_BacA-like BacA is a bacterial lysin from Enterococcus faecalis that degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. BacA is homologous to the YbfG and YkuG lysins of Bacillus subtilis. BacA has a C-terminal catalytic glycosyl hydrolase family 25 (GH25) domain and an N-terminal peptidoglycan-binding domain comprised of three alpha helices which is similar to a domain found in matrixins. 212
25720 119381 cd06419 GH25_muramidase_2 Uncharacterized bacterial muramidase containing a glycosyl hydrolase family 25 (GH25) catalytic domain. Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. 190
25721 133042 cd06420 GT2_Chondriotin_Pol_N N-terminal domain of Chondroitin polymerase functions as a GalNAc transferase. Chondroitin polymerase is a two domain, bi-functional protein. The N-terminal domain functions as a GalNAc transferase. The bacterial chondroitin polymerase catalyzes elongation of the chondroitin chain by alternatively transferring the GlcUA and GalNAc moiety from UDP-GlcUA and UDP-GalNAc to the non-reducing ends of the chondroitin chain. The enzyme consists of N-terminal and C-terminal domains in which the two active sites catalyze the addition of GalNAc and GlcUA, respectively. Chondroitin chains range from 40 to over 100 repeating units of the disaccharide. Sulfated chondroitins are involved in the regulation of various biological functions such as central nervous system development, wound repair, infection, growth factor signaling, and morphogenesis, in addition to its conventional structural roles. In Caenorhabditis elegans, chondroitin is an essential factor for the worm to undergo cytokinesis and cell division. Chondroitin is synthesized as proteoglycans, sulfated and secreted to the cell surface or extracellular matrix. 182
25722 133043 cd06421 CESA_CelA_like CESA_CelA_like are involved in the elongation of the glucan chain of cellulose. Family of proteins related to Agrobacterium tumefaciens CelA and Gluconacetobacter xylinus BscA. These proteins are involved in the elongation of the glucan chain of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues. They are putative catalytic subunit of cellulose synthase, which is a glycosyltransferase using UDP-glucose as the substrate. The catalytic subunit is an integral membrane protein with 6 transmembrane segments and it is postulated that the protein is anchored in the membrane at the N-terminal end. 234
25723 133044 cd06422 NTP_transferase_like_1 NTP_transferase_like_1 is a member of the nucleotidyl transferase family. This is a subfamily of nucleotidyl transferases. Nucleotidyl transferases transfer nucleotides onto phosphosugars. The activated sugars are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides. Other subfamilies of nucleotidyl transferases include Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase. 221
25724 133045 cd06423 CESA_like CESA_like is the cellulose synthase superfamily. The cellulose synthase (CESA) superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains. The members include cellulose synthase catalytic subunit, chitin synthase, glucan biosynthesis protein and other families of CESA-like proteins. Cellulose synthase catalyzes the polymerization reaction of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues in plants, most algae, some bacteria and fungi, and even some animals. In bacteria, algae and lower eukaryotes, there is a second unrelated type of cellulose synthase (Type II), which produces acylated cellulose, a derivative of cellulose. Chitin synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of beta-(1,4)-linked GlcNAc residues and Glucan Biosynthesis protein catalyzes the elongation of beta-1,2 polyglucose chains of Glucan. 180
25725 133046 cd06424 UGGPase UGGPase catalyzes the synthesis of UDP-Glucose/UDP-Galactose. UGGPase: UDP-Galactose/Glucose Pyrophosphorylase catalyzes the reversible production of UDP-Glucose/UDP-Galactose and pyrophosphate (PPi) from Glucose-1-phosphate/Galactose-1-phosphate and UTP. Its dual substrate specificity distinguishes it from the single substrate enzyme UDP-glucose pyrophosphorylase. It may play a key role in the galactose metabolism in raffinose oligosaccharide (RFO) metabolizing plants. RFO raffinose is a major photoassimilate and is a galactosylderivative of sucrose (Suc) containing a galactose (Gal) moiety. Upon arriving at the sink tissue, the Gal moieties of the RFOs are initially removed by alpha-galactosidase and then are phosphorylated to Gal-1-P. Gal-1-P is converted to UDP-Gal. The UDP-Gal is further metabolized to UDP-Glc via an epimerase reaction. The UDP-Glc can be directly utilized in cell wall metabolism or in Suc synthesis. However, for the Suc synthesis UDP-Glc must be further metabolized to Glc-1-P. This can be carried out either by the UGPase in the reverse direction or by the dual substrate PPase itself operating in the reverse direction. According to the latter possibility, the three-step pathway of Gal-1-P to Glc-1-P could be carried out by a single PPase, functioning sequentially in reverse directions separated by the epimerase reaction. 315
25726 133047 cd06425 M1P_guanylylT_B_like_N N-terminal domain of the M1P-guanylyltransferase B-isoform like proteins. GDP-mannose pyrophosphorylase (GTP: alpha-d-mannose-1-phosphate guanyltransferase) catalyzes the formation of GDP-d-mannose from GTP and alpha-d-mannose-1-Phosphate. It contains an N-terminal catalytic domain and a C-terminal Lefthanded-beta-Helix fold domain. GDP-d-mannose is the activated form of mannose for formation of cell wall lipoarabinomannan and various mannose-containing glycolipids and polysaccharides. The function of GDP-mannose pyrophosphorylase is essential for cell wall integrity, morphogenesis and viability. Repression of GDP-mannose pyrophosphorylase in yeast leads to phenotypes, such as cell lysis, defective cell wall, and failure of polarized growth and cell separation. 233
25727 133048 cd06426 NTP_transferase_like_2 NTP_transferase_like_2 is a member of the nucleotidyl transferase family. This is a subfamily of nucleotidyl transferases. Nucleotidyl transferases transfer nucleotides onto phosphosugars. The activated sugars are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides. Other subfamilies of nucleotidyl transferases include Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase. 220
25728 133049 cd06427 CESA_like_2 CESA_like_2 is a member of the cellulose synthase superfamily. The cellulose synthase (CESA) superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains. The members include cellulose synthase catalytic subunit, chitin synthase, Glucan Biosynthesis protein and other families of CESA-like proteins. Cellulose synthase catalyzes the polymerization reaction of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues in plants, most algae, some bacteria and fungi, and even some animals. In bacteria, algae and lower eukaryotes, there is a second unrelated type of cellulose synthase (Type II), which produces acylated cellulose, a derivative of cellulose. Chitin synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of beta-(1,4)-linked GlcNAc residues and Glucan Biosynthesis protein catalyzes the elongation of beta-1,2 polyglucose chains of glucan. 241
25729 133050 cd06428 M1P_guanylylT_A_like_N N-terminal domain of M1P_guanylyl_A_ like proteins are likely to be a isoform of GDP-mannose pyrophosphorylase. N-terminal domain of the M1P-guanylyltransferase A-isoform like proteins: The proteins of this family are likely to be a isoform of GDP-mannose pyrophosphorylase. Their sequences are highly conserved with mannose-1-phosphate guanyltransferase, but generally about 40-60 bases longer. GDP-mannose pyrophosphorylase (GTP: alpha-d-mannose-1-phosphate guanyltransferase) catalyzes the formation of GDP-d-mannose from GTP and alpha-d-mannose-1-Phosphate. It contains an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain. GDP-d-mannose is the activated form of mannose for formation of cell wall lipoarabinomannan and various mannose-containing glycolipids and polysaccharides. The function of GDP-mannose pyrophosphorylase is essential for cell wall integrity, morphogenesis and viability. Repression of GDP-mannose pyrophosphorylase in yeast leads to phenotypes including cell lysis, defective cell wall, and failure of polarized growth and cell separation. 257
25730 133051 cd06429 GT8_like_1 GT8_like_1 represents a subfamily of GT8 with unknown function. A subfamily of glycosyltransferase family 8 with unknown function: Glycosyltransferase family 8 comprises enzymes with a number of known activities; lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase and inositol 1-alpha-galactosyltransferase. It is classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. 257
25731 133052 cd06430 GT8_like_2 GT8_like_2 represents a subfamily of GT8 with unknown function. A subfamily of glycosyltransferase family 8 with unknown function: Glycosyltransferase family 8 comprises enzymes with a number of known activities; lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase and inositol 1-alpha-galactosyltransferase. It is classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. 304
25732 133053 cd06431 GT8_LARGE_C LARGE catalytic domain has closest homology to GT8 glycosyltransferase involved in lipooligosaccharide synthesis. The catalytic domain of LARGE is a putative glycosyltransferase. Mutations of LARGE in mouse and human cause dystroglycanopathies, a disease associated with hypoglycosylation of the membrane protein alpha-dystroglycan (alpha-DG) and consequent loss of extracellular ligand binding. LARGE needs to both physically interact with alpha-dystroglycan and function as a glycosyltransferase in order to stimulate alpha-dystroglycan hyperglycosylation. LARGE localizes to the Golgi apparatus and contains three conserved DxD motifs. While two of the motifs are indispensable for glycosylation function, one is important for localization of the enzyme. LARGE was originally named because it covers a large stretch of genomic DNA, more than 600bp long. The predicted protein structure contains an N-terminal cytoplasmic domain, a transmembrane region, a coiled-coil motif, and two putative catalytic domains. This catalytic domain has closest homology to GT8 glycosyltransferase involved in lipooligosaccharide synthesis. 280
25733 133054 cd06432 GT8_HUGT1_C_like The C-terminal domain of HUGT1-like is highly homologous to the GT 8 family. C-terminal domain of glycoprotein glucosyltransferase (UGT). UGT is a large glycoprotein whose C-terminus contains the catalytic activity. This catalytic C-terminal domain is highly homologous to Glycosyltransferase Family 8 (GT 8) and contains the DXD motif that coordinates donor sugar binding, characteristic for Family 8 glycosyltransferases. GT 8 proteins are retaining enzymes based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. The non-catalytic N-terminal portion of the human UGT1 (HUGT1) has been shown to monitor the protein folding status and activate its glucosyltransferase activity. 248
25734 133055 cd06433 GT_2_WfgS_like WfgS and WfeV are involved in O-antigen biosynthesis. Escherichia coli WfgS and Shigella dysenteriae WfeV are glycosyltransferase 2 family enzymes involved in O-antigen biosynthesis. GT-2 enzymes have GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families. 202
25735 133056 cd06434 GT2_HAS Hyaluronan synthases catalyze polymerization of hyaluronan. Hyaluronan synthases (HASs) are bi-functional glycosyltransferases that catalyze polymerization of hyaluronan. HASs transfer both GlcUA and GlcNAc in beta-(1,3) and beta-(1,4) linkages, respectively to the hyaluronan chain using UDP-GlcNAc and UDP-GlcUA as substrates. HA is made as a free glycan, not attached to a protein or lipid. HASs do not need a primer for HA synthesis; they initiate HA biosynthesis de novo with only UDP-GlcNAc, UDP-GlcUA, and Mg2+. Hyaluronan (HA) is a linear heteropolysaccharide composed of (1-3)-linked beta-D-GlcUA-beta-D-GlcNAc disaccharide repeats. It can be found in vertebrates and a few microbes and is typically on the cell surface or in the extracellular space, but is also found inside mammalian cells. Hyaluronan has several physiochemical and biological functions such as space filling, lubrication, and providing a hydrated matrix through which cells can migrate. 235
25736 133057 cd06435 CESA_NdvC_like NdvC_like proteins in this family are putative bacterial beta-(1,6)-glucosyltransferase. NdvC_like proteins in this family are putative bacterial beta-(1,6)-glucosyltransferase. Bradyrhizobium japonicum synthesizes periplasmic cyclic beta-(1,3),beta-(1,6)-D-glucans during growth under hypoosmotic conditions. Two genes (ndvB, ndvC) are involved in the beta-(1, 3), beta-(1,6)-glucan synthesis. The ndvC mutant strain resulted in synthesis of altered cyclic beta-glucans composed almost entirely of beta-(1, 3)-glycosyl linkages. The periplasmic cyclic beta-(1,3),beta-(1,6)-D-glucans function for osmoregulation. The ndvC mutation also affects the ability of the bacteria to establish a successful symbiotic interaction with host plant. Thus, the beta-glucans may function as suppressors of a host defense response. 236
25737 133058 cd06436 GlcNAc-1-P_transferase N-acetyl-glucosamine transferase is involved in the synthesis of Poly-beta-1,6-N-acetyl-D-glucosamine. N-acetyl-glucosamine transferase is responsible for the synthesis of bacteria Poly-beta-1,6-N-acetyl-D-glucosamine (PGA). Poly-beta-1,6-N-acetyl-D-glucosamine is a homopolymer that serves as an adhesion for the maintenance of biofilm structural stability in diverse eubacteria. N-acetyl-glucosamine transferase is the product of gene pgaC. Genetic analysis indicated that all four genes of the pgaABCD locus were required for the PGA production, pgaC being a glycosyltransferase. 191
25738 133059 cd06437 CESA_CaSu_A2 Cellulose synthase catalytic subunit A2 (CESA2) is a catalytic subunit or a catalytic subunit substitute of the cellulose synthase complex. Cellulose synthase (CESA) catalyzes the polymerization reaction of cellulose using UDP-glucose as the substrate. Cellulose is an aggregate of unbranched polymers of beta-1,4-linked glucose residues, which is an abundant polysaccharide produced by plants and in varying degrees by several other organisms including algae, bacteria, fungi, and even some animals. Genomes from higher plants harbor multiple CESA genes. There are ten in Arabidopsis. At least three different CESA proteins are required to form a functional complex. In Arabidopsis, CESA1, 3 and 6 and CESA4, 7 and 8, are required for cellulose biosynthesis during primary and secondary cell wall formation. CESA2 is very closely related to CESA6 and is viewed as a prime substitute for CESA6. They functionally compensate each other. The cesa2 and cesa6 double mutant plants were significantly smaller, while the single mutant plants were almost normal. 232
25739 133060 cd06438 EpsO_like EpsO protein participates in the methanolan synthesis. The Methylobacillus sp EpsO protein is predicted to participate in the methanolan synthesis. Methanolan is an exopolysaccharide (EPS), composed of glucose, mannose and galactose. A 21 genes cluster was predicted to participate in the methanolan synthesis. Gene disruption analysis revealed that EpsO is one of the glycosyltransferase enzymes involved in the synthesis of repeating sugar units onto the lipid carrier. 183
25740 133061 cd06439 CESA_like_1 CESA_like_1 is a member of the cellulose synthase (CESA) superfamily. This is a subfamily of cellulose synthase (CESA) superfamily. CESA superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains. The members of the superfamily include cellulose synthase catalytic subunit, chitin synthase, glucan biosynthesis protein and other families of CESA-like proteins. 251
25741 133062 cd06442 DPM1_like DPM1_like represents putative enzymes similar to eukaryotic DPM1. Proteins similar to eukaryotic DPM1, including enzymes from bacteria and archaea; DPM1 is the catalytic subunit of eukaryotic dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. In higher eukaryotes, the enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. In lower eukaryotes, such as Saccharomyces cerevisiae and Trypanosoma brucei, DPM synthase consists of a single component (Dpm1p and TbDpm1, respectively) that possesses one predicted transmembrane region near the C terminus for anchoring to the ER membrane. In contrast, the Dpm1 homologues of higher eukaryotes, namely fission yeast, fungi, and animals, have no transmembrane region, suggesting the existence of adapter molecules for membrane anchoring. This family also includes bacteria and archaea DPM1_like enzymes. However, the enzyme structure and mechanism of function are not well understood. This protein family belongs to Glycosyltransferase 2 superfamily. 224
25742 176473 cd06444 DNA_pol_A Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication. DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 347
25743 119438 cd06445 ATase The DNA repair protein O6-alkylguanine-DNA alkyltransferase (ATase; also known as AGT, AGAT and MGMT) reverses O6-alkylation DNA damage by transferring O6-alkyl adducts to an active site cysteine irreversibly, without inducing DNA strand breaks. ATases are specific for repair of guanines with O6-alkyl adducts, however human ATase is not limited to O6-methylguanine, repairing many other adducts at the O6-position of guanine as well. ATase is widely distributed among species. Most ATases have N- and C-terminal domains. The C-terminal domain contains the conserved active-site cysteine motif (PCHR), the O6-alkylguanine binding channel, and the helix-turn-helix (HTH) DNA-binding motif. The active site is located near the recognition helix of the HTH motif. While the C-terminal domain of ATase contains residues that are necessary for DNA binding and alkyl transfer, the function of the N-terminal domain is still unknown. Removal of the N-terminal domain abolishes the activity of the C-terminal domain, suggesting an important structural role for the N-terminal domain in orienting the C-terminal domain for proper catalysis. Some ATase C-terminal domain homologs are either single-domain proteins that lack an N-terminal domain, or have a tryptophan substituted in place of the acceptor cysteine (i.e. the motif PCHR is replaced by PWHR). ATase null mutant mice are viable, fertile, and have a normal lifespan. 79
25744 107207 cd06446 Trp-synth_B Tryptophan synthase-beta: Tryptophan synthase is a bifunctional enzyme that catalyses the last two steps in the biosynthesis of L-tryptophan via its alpha and beta reactions. In the alpha reaction, indole 3-glycerol phosphate is cleaved reversibly to glyceraldehyde 3-phosphate and indole at the active site of the alpha subunit. In the beta reaction, indole undergoes a PLP-dependent reaction with L-serine to form L-tryptophan at the active site of the beta subunit. Members of this CD, Trp-synth_B, are found in all three major phylogenetic divisions. 365
25745 107208 cd06447 D-Ser-dehyd D-Serine dehydratase is a pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of L- or D-serine to pyruvate and ammonia. D-serine dehydratase serves as a detoxifying enzyme in most E. coli strains where D-serine is a competitive antagonist of beta-alanine in the biosynthetic pathway to pantothenate and coenzyme A. D-serine dehydratase is different from other pyridoxal-5'-phosphate-dependent enzymes in that it catalyzes alpha, beta-elimination reactions on amino acids. 404
25746 107209 cd06448 L-Ser-dehyd Serine dehydratase is a pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of L- , D-serine, or L-threonine to pyruvate/ketobutyrate and ammonia. 316
25747 107210 cd06449 ACCD Aminocyclopropane-1-carboxylate deaminase (ACCD): Pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of 1-aminocyclopropane-L-carboxylate (ACC), a precursor of the plant hormone ethylene, to alpha-ketobutyrate and ammonia. 307
25748 99743 cd06450 DOPA_deC_like DOPA decarboxylase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to DOPA/tyrosine decarboxylase (DDC), histidine decarboxylase (HDC), and glutamate decarboxylase (GDC). DDC is active as a dimer and catalyzes the decarboxylation of tyrosine. GDC catalyzes the decarboxylation of glutamate and HDC catalyzes the decarboxylation of histidine. 345
25749 99744 cd06451 AGAT_like Alanine-glyoxylate aminotransferase (AGAT) family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to alanine-glyoxylate aminotransferase (AGAT), serine-glyoxylate aminotransferase (SGAT), and 3-hydroxykynurenine transaminase (HKT). AGAT is a homodimeric protein, which catalyses the transamination of glyoxylate to glycine, and SGAT converts serine and glyoxylate to hydroxypyruvate and glycine. HKT catalyzes the PLP-dependent transamination of 3-hydroxykynurenine, a potentially toxic metabolite of the kynurenine pathway. 356
25750 99745 cd06452 SepCysS Sep-tRNA:Cys-tRNA synthase. This family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). Cys-tRNA(Cys) is produced by O-phosphoseryl-tRNA synthetase which ligates O-phosphoserine (Sep) to tRNA(Cys), and Sep-tRNA:Cys-tRNA synthase (SepCysS) converts Sep-tRNA(Cys) to Cys-tRNA(Cys), in methanogenic archaea. SepCysS forms a dimer, each monomer is composed of a large and small domain; the larger, a typical pyridoxal 5'-phosphate (PLP)-dependent-like enzyme fold. In the active site of each monomer, PLP is covalently bound to a conserved Lys residue near the dimer interface. 361
25751 99746 cd06453 SufS_like Cysteine desulfurase (SufS)-like. This family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to cysteine desulfurase (SufS) and selenocysteine lyase. SufS catalyzes the removal of elemental sulfur and selenium atoms from L-cysteine, L-cystine, L-selenocysteine, and L-selenocystine to produce L-alanine; and selenocysteine lyase catalyzes the decomposition of L-selenocysteine. 373
25752 99747 cd06454 KBL_like KBL_like; this family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to serine palmitoyltransferase (SPT), 5-aminolevulinate synthase (ALAS), 8-amino-7-oxononanoate synthase (AONS), and 2-amino-3-ketobutyrate CoA ligase (KBL). SPT is responsible for the condensation of L-serine with palmitoyl-CoA to produce 3-ketodihydrosphingosine, the reaction of the first step in sphingolipid biosynthesis. ALAS is involved in heme biosynthesis; it catalyzes the synthesis of 5-aminolevulinic acid from glycine and succinyl-coenzyme A. AONS catalyses the decarboxylative condensation of l-alanine and pimeloyl-CoA in the first committed step of biotin biosynthesis. KBL catalyzes the second reaction step of the metabolic degradation pathway for threonine, converting 2-amino-3-ketobutyrate to glycine and acetyl-CoA. The members of this CD are widely found in all three forms of life. 349
25753 341050 cd06455 M3A_TOP Peptidase M3 thimet oligopeptidase (TOP), also includes neurolysin. Peptidase M3 Thimet oligopeptidase (TOP; PZ-peptidase; endo-oligopeptidase A; endopeptidase 24.15; soluble metallo-endopeptidase; EC 3.4.24.15) family also includes neurolysin (endopeptidase 24.16, microsomal endopeptidase, mitochondrial oligopeptidase M, neurotensin endopeptidase, soluble angiotensin II-binding protein, thimet oligopeptidase II) which hydrolyzes oligopeptides such as neurotensin, bradykinin and dynorphin A. TOP and neurolysin are neuropeptidases expressed abundantly in the testis, but are also found in the liver, lung and kidney. They are involved in the metabolism of neuropeptides under 20 amino acid residues long and cleave most bioactive peptides at the same sites, but recognize different positions on some naturally occurring and synthetic peptides; they cleave at distinct sites on the 13-residue bioactive peptide neurotensin, which modulates central dopaminergic and cholinergic circuits. TOP has been shown to degrade peptides released by the proteasome, limiting the extent of antigen presentation by major histocompatibility complex class I molecules, and has been associated with amyloid protein precursor processing. 642
25754 341051 cd06456 M3A_DCP Peptidase family M3, dipeptidyl carboxypeptidase (DCP). Peptidase family M3 dipeptidyl carboxypeptidase (DCP; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). This metal-binding M3A family also includes oligopeptidase A (OpdA; EC 3.4.24.70). DCP cleaves dipeptides off the C-termini of various peptides and proteins, the smallest substrate being N-blocked tripeptides and unblocked tetrapeptides. DCP from Escherichia coli is inhibited by the anti-hypertensive drug captopril, an inhibitor of the mammalian angiotensin converting enzyme (ACE, also called peptidyl dipeptidase A). OpdA may play a specific role in the degradation of signal peptides after they are released from precursor forms of secreted proteins. It can also cleave N-acetyl-L-Ala. This family also includes Arabidopsis thaliana organellar oligopeptidase OOP (At5g65620), which plays a role in targeting peptide degradation in mitochondria and chloroplasts; it degrades peptide substrates that are between 8 to 23 amino acid residues, and shows a weak preference for hydrophobic residues (F/L) at the P1 position. 653
25755 341052 cd06457 M3A_MIP Peptidase M3 mitochondrial intermediate peptidase (MIP). Peptidase M3 mitochondrial intermediate peptidase (MIP; EC 3.4.24.59) belongs to the widespread subfamily M3A, that shows similarity to Thimet oligopeptidase (TOP). It is one of three peptidases responsible for the proteolytic processing of both nuclear and mitochondrial encoded precursor polypeptides targeted to various subcompartments of the mitochondria. It cleaves intermediate-size proteins initially processed by mitochondrial processing peptidase (MPP) to yield a processing intermediate with a typical N-terminal octapeptide that is sequentially cleaved by MIP to mature-size protein. MIP cleaves precursor proteins of respiratory components, including subunits of the electron transport chain and tri-carboxylic acid cycle enzymes, and components of the mitochondrial genetic machinery, including ribosomal proteins, translation factors, and proteins required for mitochondrial DNA metabolism. It has been suggested that the human MIP (HMIP polypeptide; gene symbol MIPEP) may be one of the loci predicted to influence the clinical manifestations of Friedreich's ataxia (FRDA), an autosomal recessive neurodegenerative disease caused by the lack of human frataxin. These proteins are enriched in cysteine residues, two of which are highly conserved, suggesting their importance to stability as well as in formation of metal binding sites, thus playing a role in MIP activity. 613
25756 341053 cd06459 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and includes oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. 539
25757 341054 cd06460 M32_Taq Peptidase family M32, which includes thermostable carboxypeptidases TaqCP, PfuCP and FisCP. Peptidase family M32 is a subclass of metallocarboxypeptidases, distributed mainly in bacteria and archaea, whose members contain a HEXXH motif that participates in coordinating a divalent cation such as Zn2+ or Co2+. It includes the thermostable carboxypeptidases (E.C. 3.4.17.19) from Thermus aquaticus (TaqCP) and Pyrococcus furiosus (PfuCP), which have broad specificities toward a wide range of C-terminal substrates that include basic, aromatic, neutral and polar amino acids. These enzymes have a similar fold to the M3 peptidases such as neurolysin and the M2 angiotensin converting enzyme (ACE). The keratin-degrading extremophilic eubacterium Fervidobacterium islandicum M32 carboxypeptidase (FisCP) plays an important role in cellular metabolism, and significantly enhances the degradation of native chicken feathers. It has been shown to mainly cleave the C-termini of peptides with a basic amino acid sequence. Novel M32 peptidases from some eukaryotes: protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, have been identified, thus making these enzymes an attractive potential target for drug development against these organisms. 484
25758 341055 cd06461 M2_ACE Peptidase family M2, angiotensin converting enzyme (ACE). Peptidase family M2 angiotensin converting enzyme (ACE, EC 3.4.15.1) is a membrane-bound, zinc-dependent dipeptidase that catalyzes the conversion of the decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II, by removing two C-terminal amino acids. There are two forms of the enzyme in humans, the ubiquitous somatic ACE and the sperm-specific germinal ACE, both encoded by the same gene through transcription from alternative promoters. Somatic ACE has two tandem active sites with distinct catalytic properties, whereas germinal ACE, the function of which is largely unknown, has just a single active site. Recently, an ACE homolog, ACE2, has been identified in humans that differs from ACE; it preferentially removes carboxy-terminal hydrophobic or basic amino acids and appears to be important in cardiac function. ACE homologs (also known as members of the M2 gluzincin family) have been found in a wide variety of species, including those that neither have a cardiovascular system nor synthesize angiotensin. ACE is well-known as a key part of the renin-angiotensin system that regulates blood pressure and ACE inhibitors are important for the treatment of hypertension. 563
25759 119396 cd06462 Peptidase_S24_S26 The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases. 84
25760 107220 cd06463 p23_like Proteins containing this p23_like domain include p23 and its Saccharomyces cerevisiae (Sc) homolog Sba1. Both are co-chaperones for the heat shock protein (Hsp) 90. p23 binds Hsp90 and participates in the folding of a number of Hsp90 clients, including the progesterone receptor. p23 also has a passive chaperoning activity and in addition may participate in prostaglandin synthesis. Both p23 and Sba1p can regulate telomerase activity. This group includes domains similar to the C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1). Sgt1 interacts with multiple protein complexes and has the features of a co-chaperone. Human (h) Sgt1 interacts with both Hsp70 and Hsp90, and has been shown to bind Hsp90 through its CS domain. Saccharomyces cerevisiae (Sc) Sgt1 is a subunit of both core kinetochore and SCF (Skp1-Cul1-F-box) ubiquitin ligase complexes. Sgt1 is required for pathogen resistance in plants. This group also includes the p23_like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR). hB-ind1 plays a role in the signaling pathway mediated by the small GTPase Rac1, NUDC is needed for nuclear movement, Melusin interacts with two splice variants of beta1 integrin, and NCB5OR plays a part in maintaining viable pancreatic beta cells. 84
25761 107221 cd06464 ACD_sHsps-like Alpha-crystallin domain (ACD) of alpha-crystallin-type small(s) heat shock proteins (Hsps). sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is the Alpha-crystallin domain (ACD). sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. 88
25762 107222 cd06465 p23_hB-ind1_like p23_like domain found in human (h) butyrate-induced transcript 1 (B-ind1) and similar proteins. hB-ind1 participates in signaling by the small GTPase Rac1. It binds to Rac1 and enhances different Rac1 effects including activation of nuclear factor (NF) kappaB and activation of c-Jun N-terminal kinase (JNK). hB-ind1 also plays a part in the RNA replication and particle production of Hepatitis C virus (HCV) through its interaction with heat shock protein Hsp90, HCV nonstructural protein 5A (NS5A), and the immunophilin FKBP8. hB-ind1 is upregulated in the outer layer of Chinese hamster V79 cells grown as multicell spheroids, versus in the same cells grown as monolayers. This group includes the Saccharomyces cerevisiae Sba1, a co-chaperone of the Hsp90. Sba1 has been shown to be required for telomere length maintenance, and may modulate telomerase DNA-binding activity. 108
25763 107223 cd06466 p23_CS_SGT1_like p23_like domain similar to the C-terminal CHORD-SGT1 (CS) domain of Sgt1 (suppressor of G2 allele of Skp1). Sgt1 interacts with multiple protein complexes and has the features of a cochaperone. Human (h) Sgt1 interacts with both Hsp70 and Hsp90, and has been shown to bind Hsp90 through its CS domain. Saccharomyces cerevisiae (Sc) Sgt1 is a subunit of both core kinetochore and SCF (Skp1-Cul1-F-box) ubiquitin ligase complexes. Sgt1 is required for pathogen resistance in plants. ScSgt1 is needed for the G1/S and G2/M cell-cycle transitions, and for assembly of the core kinetochore complex (CBF3) via activation of Ctf13, the F-box protein. Binding of Hsp82 (a yeast Hsp90 homologue) to ScSgt1, promotes the binding of Sgt1 to Skp1 and of Skp1 to Ctf13. Some proteins in this group have an SGT1-specific (SGS) domain at the extreme C-terminus. The ScSgt1-SGS domain binds adenylate cyclase. The hSgt1-SGS domain interacts with some S100 family proteins, and studies suggest that the interaction of hSgt1 with Hsp90 and Hsp70 may be regulated by S100A6 in a Ca2+ dependent fashion. This group also includes the p23_like domains of Melusin and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR). Melusin is a vertebrate protein which interacts with two splice variants of beta1 integrin, and NCB5OR plays a part in maintaining viable pancreatic beta cells. 84
25764 107224 cd06467 p23_NUDC_like p23_like domain of NUD (nuclear distribution) C and similar proteins. Aspergillus nidulans (An) NUDC is needed for nuclear movement. AnNUDC is localized at the hyphal cortex, and binds NUDF at spindle pole bodies (SPBs) and in the cytoplasm at different stages in the cell cycle. At the SPBs it is part of the dynein molecular motor/NUDF complex that regulates microtubule dynamics. Mammalian (m) NUDC associates both with the dynein complex and also with an anti-inflammatory enzyme, platelet activating factor acetylhydrolase I, PAF-AH(I) complex, through binding mNUDF, the regulatory beta subunit of PAF-AH(I). mNUDC is important for cell proliferation both in normal and tumor tissues. Its expression is elevated in various cell types undergoing mitosis or stimulated to proliferate, with high expression levels observed in leukemic cells and tumors. For a leukemic cell line, human NUDC was shown to activate the thrombopoietin (TPO) receptor (Mpl) by binding to its extracellular domain, and promoting cell proliferation and differentiation. This group also includes the human broadly immunogenic tumor associated antigen, CML66, which is highly expressed in a variety of solid tumors and in leukemias. In normal tissues high expression of CML66 is limited to testis and heart. 85
25765 107225 cd06468 p23_CacyBP p23_like domain found in proteins similar to Calcyclin-Binding Protein(CacyBP)/Siah-1-interacting protein (SIP). CacyBP/SIP interacts with S100A6 (calcyclin), with some other members of the S100 family, with tubulin, and with Siah-1 and Skp-1. The latter two are components of the ubiquitin ligase that regulates beta-catenin degradation. The beta-catenin gene is an oncogene participating in tumorigenesis in many different cancers. Overexpression of CacyBP/SIP, in part through its effect on the expression of beta-catenin, inhibits the proliferation, tumorigenicity, and invasion of gastric cancer cells. CacyBP/SIP is abundant in neurons and neuroblastoma NB2a cells. An extensive re-organization of microtubules accompanies the differentiation of NB2a cells. CacyBP/SIP may contribute to NB2a cell differentiation through binding to and increasing the oligomerization of tubulin. CacyBP/SIP is also implicated in differentiation of erythroid cells, rat neonatal cardiomyocytes, in mouse endometrial events, and in thymocyte development. 92
25766 107226 cd06469 p23_DYX1C1_like p23_like domain found in proteins similar to dyslexia susceptibility 1 (DYX1) candidate 1 (C1) protein, DYX1C1. The human gene encoding this protein is a positional candidate gene for developmental dyslexia (DD), it is located on 15q21.3 by the DYX1 DD susceptibility locus (15q15-21). Independent association studies have reported conflicting results. However, association of short-term memory, which plays a role in DD, with a variant within the DYX1C1 gene has been reported. Most proteins belonging to this group contain a C-terminal tetratricopeptide repeat (TPR) protein binding region. 78
25767 107227 cd06470 ACD_IbpA-B_like Alpha-crystallin domain (ACD) found in Escherichia coli inclusion body-associated proteins IbpA and IbpB, and similar proteins. IbpA and IbpB are 16 kDa small heat shock proteins (sHsps). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. IbpA and IbpB are produced during high-level production of various heterologous proteins, specifically human prorenin, renin and bovine insulin-like growth factor 2 (bIGF-2), and are strongly associated with inclusion bodies containing these heterologous proteins. IbpA and IbpB work as an integrated system to stabilize thermally aggregated proteins in a disaggregation competent state. The chaperone activity of IbpB is also significantly elevated as the temperature increases from normal to heat shock. The high temperature results in the disassociation of 2-3-MDa IbpB oligomers into smaller approximately 600-kDa structures. This elevated activity seen under heat shock conditions is retained for an extended period of time after the temperature is returned to normal. IbpA also forms multimers. 90
25768 107228 cd06471 ACD_LpsHSP_like Group of bacterial proteins containing an alpha crystallin domain (ACD) similar to Lactobacillus plantarum (Lp) small heat shock proteins (sHsp) HSP 18.5, HSP 18.55 and HSP 19.3. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Transcription of the genes encoding Lp HSP 18.5, 18.55 and 19.3 is regulated by a variety of stresses including heat, cold and ethanol. Early growing L. plantarum cells contain elevated levels of these mRNAs which rapidly fall off as the cells enter stationary phase. Also belonging to this group is Bifidobacterium breve (Bb) HSP20 and Oenococcus oeni (syn. Leuconostoc oenos) (Oo) HSP18. Transcription of the gene encoding BbHSP20 is strongly induced following heat or osmotic shock, and that of the gene encoding OoHSP18 following heat, ethanol or acid shock. OoHSP18 is peripherally associated with the cytoplasmic membrane. 93
25769 107229 cd06472 ACD_ScHsp26_like Alpha crystallin domain (ACD) found in Saccharomyces cerevisiae (Sc) small heat shock protein (Hsp)26 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. ScHsp26 is temperature-regulated; it switches from an inactive to a chaperone-active form upon elevation in temperature. It associates into large 24-mer storage forms which upon heat shock disassociate into dimers. These dimers initiate the interaction with non-native substrate proteins and re-assemble into large globular assemblies having one monomer of substrate bound per dimer. This group also contains Arabidopsis thaliana (Ath) Hsp15.7, a peroxisomal matrix protein which can complement the morphological phenotype of S. cerevisiae mutants deficient in Hsp26. AthHsp15.7 is minimally expressed under normal conditions and is strongly induced by heat and oxidative stress. Also belonging to this group is wheat HSP16.9, which differs in quaternary structure from the shell-type particles of ScHsp26; it assembles as a dodecameric double disc, with each disc organized as a trimer of dimers. 92
25770 107230 cd06475 ACD_HspB1_like Alpha crystallin domain (ACD) found in mammalian small (s)heat shock protein (Hsp)-27 (also denoted HspB1 in human) and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Hsp27 shows enhanced synthesis in response to stress. It is a molecular chaperone which interacts with a large number of different proteins. It is found in many types of human cells including breast, uterus, cervix, platelets and cancer cells. Hsp27 has diverse cellular functions including chaperoning, regulation of actin polymerization, keratinocyte differentiation, regulation of inflammatory pathways in keratinocytes, and protection from oxidative stress through modulating glutathione levels. It is also a subunit of AUF1-containing protein complexes. It has been linked to several transduction pathways regulating cellular functions including differentiation, cell growth, development, and apoptosis. Its activity can be regulated by phosphorylation. Its unphosphorylated state is a high molecular weight aggregated form (100-800kDa) composed of up to 24 subunits, which forms as a result of multiple interactions within the ACD, and is required for chaperone function and resistance to oxidative stress. Upon phosphorylation these large aggregates rapidly disassociate to smaller oligomers and chaperone activity is modified. High constitutive levels of Hsp27 have been detected in various cancer cells, in particular those of carcinoma origin. Over-expression of Hsp27 has a protective effect against various disease processes, including Huntington's disease. Mutations in Hsp27 have been associated with a form of distal hereditary motor neuropathy type II and Charcot-Marie-Tooth disease type 2. 86
25771 107231 cd06476 ACD_HspB2_like Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB2/heat shock 27kDa protein 2 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. HspB2 is preferentially and constitutively expressed in skeletal muscle and heart. HspB2 shows homooligomeric activity and forms aggregates in muscle cytosol. Although its expression is not induced by heat shock, it redistributes to the insoluble fraction in response to heat shock. In the mouse heart, HspB2 plays a role in maintaining energetic balance, by protecting cardiac energetics during ischemia/reperfusion, and allowing for increased work during acute inotropic challenge. hHspB2 [previously also known as myotonic dystrophy protein kinase (DMPK) binding protein (MKBP)] is selectively up-regulated in skeletal muscles from myotonic dystrophy patients. The ACD of hHspB2 binds the DMPK kinase domain. In vitro, hHspB2 enhances the kinase activity of DMPK and confers thermoresistance. The hHspB2 gene lies less than 1kb from the 5 prime end of the related alphaB (HspB4)-crystallin gene, with the opposite transcription direction. These two genes may share regulatory elements for their expression. 83
25772 107232 cd06477 ACD_HspB3_Like Alpha crystallin domain (ACD) found in mammalian HspB3, also known as heat-shock protein 27-like protein (HSPL27, 17-kDa) and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. HspB3 is expressed in adult skeletal muscle, smooth muscle, and heart, and in several other fetal tissues. In muscle cells HspB3 forms an oligomeric 150 kDa complex with myotonic dystrophy protein kinase-binding protein (MKBP/ HspB2), this complex may comprise one of two independent muscle-cell specific chaperone systems. The expression of HspB3 is induced during muscle differentiation controlled by the myogenic factor MyoD. HspB3 may also interact with Hsp22 (HspB8). 83
25773 107233 cd06478 ACD_HspB4-5-6 Alpha-crystallin domain found in alphaA-crystallin (HspB4), alphaB-crystallin (HspB5), and the small heat shock protein (sHsp) HspB6, also known as Hsp20. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1. Only trace amounts of HspB4 are found in tissues other than the lens. HspB5 on the other hand is also expressed constitutively in other tissues including brain, heart, and type I and type IIa skeletal muscle fibers, and in several cancers including gliomas, renal cell carcinomas, basal-like and metaplastic breast carcinomas, and head and neck cancer. HspB5's functions include effects on the apoptotic pathway and on metastasis. Phosphorylation of HspB5 reduces its oligomerization and anti-apoptotic activities. HspB5 is protective in demyelinating disease such as multiple sclerosis (MS), being a negative regulator of inflammation. In early active MS lesions it is the most abundant gene transcript and an autoantigen, the immune response against it would disrupt its function and worsen inflammation and demyelination. Given as therapy for ongoing demyelinating disease it may counteract this effect. It is an autoantigen in the pathogenesis of various other inflammatory disorders including Lens-associated uveitis (LAU), and Behcet's disease. Mutations in HspB5 have been associated with diseases including dominant cataract and desmin-related myopathy. Mutations in HspB4 have been associated with Autosomal Dominant Congenital Cataract (ADCC). HspB6 (Hsp20) is ubiquitous and is involved in diverse functions including regulation of glucose transport and contraction of smooth muscle, in platelet aggregation, in cardioprotection, and in the prevention of apoptosis. It interacts with the universal scaffolding and adaptor protein 14-3-3, and also with the proapoptotic protein Bax. 83
25774 107234 cd06479 ACD_HspB7_like Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB7, also known as cardiovascular small heat shock protein (cvHsp), and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. HspB7 is a 25-kDa protein, preferentially expressed in heart and skeletal muscle. It binds the cytoskeleton protein alpha-filamin (also known as actin-binding protein 280). The expression of HspB7 is increased during rat muscle aging. Its expression is also modulated in obesity implicating this protein in this and related metabolic disorders. As the human gene encoding HspB7 is mapped to chromosome 1p36.23-p34.3 it is a positional candidate for several dystrophies and myopathies. 81
25775 107235 cd06480 ACD_HspB8_like Alpha-crystallin domain (ACD) found in mammalian 21.6 KDa small heat shock protein (sHsp) HspB8, also denoted as Hsp22 in humans, and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. A chaperone complex formed of HspB8 and Bag3 stimulates degradation of protein complexes by macroautophagy. HspB8 also forms complexes with Hsp27 (HspB1), MKBP (HspB2), HspB3, alphaB-crystallin (HspB5), Hsp20 (HspB6), and cvHsp (HspB7). These latter interactions may depend on phosphorylation of the respective partner sHsp. HspB8 may participate in the regulation of cell proliferation, cardiac hypertrophy, apoptosis, and carcinogenesis. Point mutations in HspB8 have been correlated with the development of several congenital neurological diseases, including Charcot Marie tooth disease and distal motor neuropathy type II. 91
25776 107236 cd06481 ACD_HspB9_like Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB9 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Human (h) HspB9 is expressed exclusively in the normal testis and in various tumor samples and is a cancer/testis antigen. hHspB9 interacts with TCTEL1 (T-complex testis expressed protein -1), a subunit of dynein. hHspB9 and TCTEL1 are co-expressed in similar cells within the testis and in tumor cells. Included in this group is Xenopus Hsp30, a developmentally-regulated heat-inducible molecular chaperone. 87
25777 107237 cd06482 ACD_HspB10 Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB10, also known as sperm outer dense fiber protein (ODFP), and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Human (h) HspB10 occurs exclusively in the axoneme of sperm cells and may have a cytoskeletal role. 87
25778 107238 cd06488 p23_melusin_like p23_like domain similar to the C-terminal (tail) domain of vertebrate Melusin and related proteins. Melusin's tail domain interacts with the cytoplasmic domain of beta1-A and beta1-D isoforms of beta1 integrin, it does not bind other integrin beta subunits. Melusin is a muscle-specific protein expressed in skeletal and cardiac muscles but not in smooth muscle or other tissues. It is needed for heart hypertrophy following mechanical overload. The integrin-binding portion of this domain appears to be sequestered in the full length melusin protein, Ca2+ may modulate the protein's conformation exposing this binding site. This group includes Chordc1, also known as Chp-1, which is conserved from vertebrates to humans. Mammalian Chordc1 interacts with the heat shock protein (HSP) Hsp90 and is implicated in circadian and/or homeostatic mechanisms in the brain. The N-terminal portions of proteins belonging to this group contain two cysteine and histidine rich domain (CHORD) domains. 87
25779 107239 cd06489 p23_CS_hSgt1_like p23_like domain similar to the C-terminal CS (CHORD-SGT1) domain of human (h) Sgt1 and related proteins. hSgt1 is a co-chaperone which has been shown to be elevated in HEp-2 cells as a result of stress conditions such as heat shock. It interacts with the heat shock proteins (HSPs) Hsp70 and Hsp90, and its expression pattern is synchronized with these two Hsps. The interaction with HSP90 has been shown to involve the hSgt1_CS domain, and appears to be required for correct kinetochore assembly and efficient cell division. Some proteins in this subgroup contain a tetratricopeptide repeat (TPR) HSP-binding domain N-terminal to this CS domain, and most proteins in this subgroup contain a Sgt1-specific (SGS) domain C-terminal to the CS domain. The SGS domain interacts with some S100 family proteins. Studies suggest that S100A6 modulates in a Ca2+ dependent manner the interactions of hSgt1 with Hsp90 and Hsp70. The yeast Sgt1 CS domain is not found in this subgroup. 84
25780 107240 cd06490 p23_NCB5OR p23_like domain found in NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR) and similar proteins. NCB5OR is widely expressed in human organs and tissues and is localized in the ER (endoplasmic reticulum). It appears to play a critical role in maintaining viable pancreatic beta cells. Mice homozygous for a targeted knockout (KO) of the gene encoding NCB5OR develop an early-onset nonautoimmune diabetes phenotype with a non-inflammatory beta-cell deficiency. The role of NCB5OR in beta cells may be in maintaining or regulating their redox status. Proteins in this group in addition contain an N-terminal cytochrome b5 domain and a C-terminal cytochrome b5 oxidoreductase domain. The gene encoding NCB5OR has been considered as a positional candidate for type II diabetes and other diabetes subtypes related to B-cell dysfunction; however, variation in its coding region does not appear to be a major contributor to the pathogenesis of these diseases. 87
25781 107241 cd06492 p23_mNUDC_like p23-like NUD (nuclear distribution) C-like domain of mammalian(m) NUDC and similar proteins. Mammalian(m) NUDC associates both with the dynein complex and also with an anti-inflammatory enzyme, platelet activating factor acetylhydrolase I, PAF-AH(I) complex, through binding mNUDF, the regulatory beta subunit of PAF-AH(I). mNUDC is important for cell proliferation both in normal and tumor tissues. Its expression is elevated in various cell types undergoing mitosis or stimulated to proliferate, with high expression levels observed in leukemic cells and tumors. For a leukemic cell line, human NUDC was shown to activate the thrombopoietin (TPO) receptor (Mpl) by binding to its extracellular domain, and promoting cell proliferation and differentiation. 87
25782 107242 cd06493 p23_NUDCD1_like p23_NUDCD1: p23-like NUD (nuclear distribution) C-like domain found in human NUD (nuclear distribution) C domain-containing protein 1, NUDCD1 (also known as CML66), and similar proteins. NUDCD1/CML66 is a broadly immunogenic tumor associated antigen, which is highly expressed in a variety of solid tumors and in leukemias. In normal tissues high expression of NUDCD1/CML66 is limited to testis and heart. 85
25783 107243 cd06494 p23_NUDCD2_like p23-like NUD (nuclear distribution) C-like found in human NUDC domain-containing protein 2 (NUDCD2) and similar proteins. Little is known about the function of the proteins in this subgroup. 93
25784 107244 cd06495 p23_NUDCD3_like p23-like NUD (nuclear distribution) C-like domain found in human NUDC domain-containing protein 3 (NUDCD3) and similar proteins. Little is known about the function of the proteins in this subgroup. 102
25785 107245 cd06497 ACD_alphaA-crystallin_HspB4 Alpha-crystallin domain found in the small heat shock protein (sHsp) alphaA-crystallin (HspB4, 20kDa). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1. Only trace amounts of HspB4 are found in tissues other than the lens. HspB5 does not belong to this group. Mutations in HspB4 have been associated with Autosomal Dominant Congenital Cataract (ADCC). The chaperone-like functions of HspB4 are considered important for maintaining lens transparency and preventing cataract. 86
25786 107246 cd06498 ACD_alphaB-crystallin_HspB5 Alpha-crystallin domain found in the small heat shock protein (sHsp) alphaB-crystallin (HspB5, 20kDa). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1. HspB4 does not belong to this group. HspB5 shows increased synthesis in response to stress. HspB5 is also expressed constitutively in other tissues including brain, heart, and type I and type IIa skeletal muscle fibers, and in several cancers including gliomas, renal cell carcinomas, basal-like and metaplastic breast carcinomas, and head and neck cancer. Its functions include effects on the apoptotic pathway and on metastasis. Phosphorylation of HspB5 reduces its oligomerization and anti-apoptotic activities. HspB5 is protective in demyelinating disease such as multiple sclerosis (MS), being a negative regulator of inflammation. In early active MS lesions it is the most abundant gene transcript and an autoantigen, the immune response against it would disrupt its function and worsen inflammation and demyelination. Given as therapy for ongoing demyelinating disease it may counteract this effect. It is an autoantigen in the pathogenesis of various other inflammatory disorders including Lens-associated uveitis (LAU), and Behcet's disease. Mutations in HspB5 have been associated with diseases including dominant cataract and desmin-related myopathy. 84
25787 133460 cd06499 GT_MraY-like Glycosyltransferase 4 (GT4) includes both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases. They catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate, which is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. One member, D-N-acetylhexosamine 1-phosphate transferase (GPT) is a eukaryotic enzyme, which is specific for UDP-GlcNAc as donor substrate and dolichol-phosphate as the membrane bound acceptor. The bacterial members MraY, WecA, and WbpL/WbcO utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic endproducts implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The eukaryotic reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for N-glycosylation. The prokaryotic reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly. Archaeal and eukaryotic enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme. Archaea possess the same N-glycosylation pathway as eukaryotes. A glycosyl transferase gene Mv1751 in M. voltae encodes the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene. 185
25788 99748 cd06502 TA_like Low-specificity threonine aldolase (TA). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). TA catalyzes the conversion of L-threonine or L-allo-threonine to glycine and acetaldehyde in a secondary glycine biosynthetic pathway. 338
25789 349951 cd06503 ATP-synt_Fo_b F-type ATP synthase, membrane subunit b. Membrane subunit b is a component of the Fo complex of FoF1-ATP synthase. The F-type ATP synthases (FoF1-ATPase) consist of two structural domains: the F1 (assembly factor one) complex containing the soluble catalytic core, and the Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F1 is composed of alpha (or A), beta (B), gamma (C), delta (D) and epsilon (E) subunits with a stoichiometry of 3:3:1:1:1, while Fo consists of the three subunits a, b, and c (1:2:10-14). An oligomeric ring of 10-14 c subunits (c-ring) make up the Fo rotor. The flux of protons through the ATPase channel (Fo) drives the rotation of the c-ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the c-ring of Fo. The F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts or in the plasma membranes of bacteria. The F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. This group also includes F-ATP synthase that has also been found in the archaea Candidatus Methanoperedens. 132
25790 119382 cd06522 GH25_AtlA-like AtlA is an autolysin found in Gram-positive lactic acid bacteria that degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. This family includes the AtlA and Aml autolysins from Streptococcus mutans which have a C-terminal glycosyl hydrolase family 25 (GH25) catalytic domain as well as six tandem N-terminal repeats of the GBS (group B Streptococcus) Bsp-like peptidoglycan-binding domain. Other members of this family have one or more C-terminal peptidoglycan-binding domain(s) (SH3 or LysM) in addition to the GH25 domain. 192
25791 119383 cd06523 GH25_PlyB-like PlyB is a bacteriophage endolysin that displays potent lytic activity toward Bacillus anthracis. PlyB has an N-terminal glycosyl hydrolase family 25 (GH25) catalytic domain and a C-terminal bacterial SH3-like domain, SH3b. Both domains are required for effective catalytic activity. Endolysins are produced by bacteriophages at the end of their life cycle and participate in lysing the bacterial cell in order to release the newly formed progeny. Endolysins (also referred to as endo-N-acetylmuramidases or peptidoglycan hydrolases) degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. 177
25792 119384 cd06524 GH25_YegX-like YegX is an uncharacterized bacterial protein with a glycosyl hydrolase family 25 (GH25) catalytic domain that is similar in sequence to the CH-type (Chalaropsis-type) lysozymes of the GH25 family of endolysins. 194
25793 119385 cd06525 GH25_Lyc-like Lyc muramidase is an autolytic lysozyme (autolysin) from Clostridium acetobutylicum encoded by the lyc gene. Lyc has a glycosyl hydrolase family 25 (GH25) catalytic domain. Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues. 184
25794 107247 cd06526 metazoan_ACD Alpha-crystallin domain (ACD) of metazoan alpha-crystallin-type small(s) heat shock proteins (Hsps). sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is the Alpha-crystallin domain (ACD). sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. 83
25795 132725 cd06528 RNAP_A'' A'' subunit of Archaeal RNA Polymerase (RNAP). Archaeal RNA polymerase (RNAP), like bacterial RNAP, is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. The relative positioning of the RNAP core is highly conserved between archaeal RNAP and the three classes of eukaryotic RNAPs. In archaea, the largest subunit is split into two polypeptides, A' and A'', which are encoded by separate genes in an operon. Sequence alignments reveal that the archaeal A'' subunit corresponds to the C-terminal one-third of the RNAPII largest subunit (Rpb1). In subunit A'', several loops in the jaw domain are shorter. The RNAPII Rpb1 interacts with the second-largest subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. 363
25796 119397 cd06529 S24_LexA-like Peptidase S24 LexA-like proteins are involved in the SOS response leading to the repair of single-stranded DNA within the bacterial cell. This family includes: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The LexA-like proteins contain two domains: an N-terminal DNA binding domain and a C-terminal domain (CTD) that provides LexA dimerization as well as cleavage activity. They undergo autolysis, cleaving at an Ala-Gly or a Cys-Gly bond, separating the DNA-binding domain from the rest of the protein. In the presence of single-stranded DNA, the LexA, UmuD and MucA proteins interact with RecA, activating self cleavage, thus either derepressing transcription in the case of LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. The LexA proteins are serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases. LexA sequence homologs are found in almost all of the bacterial genomes sequenced to date, covering a large number of phyla, suggesting both an ancient origin and a widespread distribution of lexA and the SOS response. 81
25797 119398 cd06530 S26_SPase_I The S26 Type I signal peptidase (SPase; LepB; leader peptidase B; leader peptidase I; EC 3.4.21.89) family members are essential membrane-bound serine proteases that function to cleave the amino-terminal signal peptide extension from proteins that are translocated across biological membranes. The bacterial signal peptidase I, which is the most intensively studied, has two N-terminal transmembrane segments inserted in the plasma membrane and a hydrophilic, C-terminal catalytic region that is located in the periplasmic space. Although the bacterial signal peptidase I is monomeric, signal peptidases of eukaryotic cells commonly function as oligomeric complexes containing two divergent copies of the catalytic monomer. These are the IMP1 and IMP2 signal peptidases of the mitochondrial inner membrane that remove leader peptides from nuclear- and mitochondrial-encoded proteins. Also, two components of the endoplasmic reticulum signal peptidase in mammals (18-kDa and 21-kDa) belong to this family and they process many proteins that enter the ER for retention or for export to the Golgi apparatus, secretory vesicles, plasma membranes or vacuole. An atypical member of the S26 SPase type I family is the TraF peptidase which has the remarkable activity of producing a cyclic protein of the Pseudomonas pilin system. The type I signal peptidases are unique serine proteases that utilize a serine/lysine catalytic dyad mechanism in place of the classical serine/histidine/aspartic acid catalytic triad mechanism. 85
25798 133474 cd06532 Glyco_transf_25 Glycosyltransferase family 25 [lipooligosaccharide (LOS) biosynthesis protein] is a family of glycosyltransferases involved in LOS biosynthesis. The members include the beta(1,4) galactosyltransferases: Lgt2 of Moraxella catarrhalis, LgtB and LgtE of Neisseria gonorrhoeae and Lic2A of Haemophilus influenzae. M. catarrhalis Lgt2 catalyzes the addition of galactose (Gal) to the growing chain of LOS on the cell surface. N. gonorrhoeae LgtB and LgtE link Gal-beta(1,4) to GlcNAc (N-acetylglucosamine) and Glc (glucose), respectively. The genes encoding LgtB and LgtE are two genes of a five gene locus involved in the synthesis of gonococcal LOS. LgtE is believed to perform the first step in LOS biosynthesis. 128
25799 119439 cd06533 Glyco_transf_WecG_TagA The glycosyltransferase WecG/TagA superfamily contains Escherichia coli WecG, Bacillus subtilis TagA and related proteins. E. coli WecG is believed to be a UDP-N-acetyl-D-mannosaminuronic acid transferase, and is involved in enterobacterial common antigen (eca) synthesis. B. subtilis TagA plays a key role in the Wall Teichoic Acid (WTA) biosynthetic pathway, catalyzing the transfer of N-acetylmannosamine to the C4 hydroxyl of a membrane-anchored N-acetylglucosaminyl diphospholipid to make ManNAc-beta-(1,4)-GlcNAc-pp-undecaprenyl. This is the first committed step in this pathway. Also included in this group is Xanthomonas campestris pv. campestris GumM, a glycosyltransferase participating in the biosynthesis of the exopolysaccharide xanthan. 171
25800 143395 cd06534 ALDH-SF NAD(P)+-dependent aldehyde dehydrogenase superfamily. The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group. 367
25801 119368 cd06535 CIDE_N_CAD CIDE_N domain of CAD nuclease. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of CAD/DFF40 and ICAD/DFF45 during apoptosis. In normal cells, DFF exists in the nucleus as a heterodimer composed of CAD/DFF40 as a latent nuclease and its chaperone and inhibitor subunit ICAD/DFF45. Apoptotic activation of caspase-3 results in the cleavage of DFF45/ICAD and the release of active DFF40/CAD nuclease. 77
25802 119369 cd06536 CIDE_N_ICAD CIDE_N domain of ICAD. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD (DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40 and ICAD/DFF45 during apoptosis. In normal cells, DFF exists in the nucleus as a heterodimer composed of CAD/DFF40 as a latent nuclease and its chaperone and inhibitor subunit ICAD/DFF45. Apoptotic activation of caspase-3 results in the cleavage of DFF45/ICAD and release of active DFF40/CAD nuclease. 80
25803 119370 cd06537 CIDE_N_B CIDE_N domain of CIDE-B proteins. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins. These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40, ICAD/DFF45 and CIDE nucleases during apoptosis. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C). Based on sequence similarity with DFF40 and DFF45, CIDE proteins were initially characterized as mitochondrial activators of apoptosis. However, strong metabolic phenotypes of mice lacking CIDE-A and CIDE-B indicated that this family may play critical roles in energy balance. 81
25804 119371 cd06538 CIDE_N_FSP27 CIDE_N domain of FSP27 proteins. The CIDE-N (cell death-inducing DFF45-like effector, N-terminal) domain is found in the FSP27/CIDE-C protein, which has been identified as an adipocyte lipid droplet protein that negatively regulates lipolysis and promotes triglyceride accumulation. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C). Based on sequence similarity with DFF40 and DFF45, CIDE proteins were initially characterized as mitochondrial activators of apoptosis. The CIDE-N domain of FSP27 is sufficient to increase apoptosis in vitro when overexpressed. 79
25805 119372 cd06539 CIDE_N_A CIDE_N domain of CIDE-A proteins. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins. These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40, ICAD/DFF45, and CIDE nucleases during apoptosis. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C). Based on sequence similarity with DFF40 and DFF45, the CIDE proteins were initially characterized as mitochondrial activators of apoptosis. However, strong metabolic phenotypes of mice lacking CIDE-A and CIDE-B indicated that this family may play critical roles in energy balance. 78
25806 119343 cd06541 ASCH ASC-1 homology or ASCH domain, a small beta-barrel domain found in all three kingdoms of life. ASCH resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. The domain has been named after the ASC-1 protein, the activating signal cointegrator 1 or thyroid hormone receptor interactor protein 4 (TRIP4). ASC-1 is conserved in many eukaryotes and has been suggested to participate in a protein complex that interacts with RNA. It has been shown that ASC-1 mediates the interaction between various transcription factors and the basal transcriptional machinery. 105
25807 119359 cd06542 GH18_EndoS-like Endo-beta-N-acetylglucosaminidases are bacterial chitinases that hydrolyze the chitin core of various asparagine (N)-linked glycans and glycoproteins. The endo-beta-N-acetylglucosaminidases have a glycosyl hydrolase family 18 (GH18) catalytic domain. Some members also have an additional C-terminal glycosyl hydrolase family 20 (GH20) domain while others have an N-terminal domain of unknown function (pfam08522). Members of this family include endo-beta-N-acetylglucosaminidase S (EndoS) from Streptococcus pyogenes, EndoF1, EndoF2, EndoF3, and EndoH from Flavobacterium meningosepticum, and EndoE from Enterococcus faecalis. EndoS is a secreted endoglycosidase from Streptococcus pyogenes that specifically hydrolyzes the glycan on human IgG between two core N-acetylglucosamine residues. EndoE is a secreted endoglycosidase, encoded by the ndoE gene in Enterococcus faecalis, that hydrolyzes the glycan on human RNase B. 255
25808 119360 cd06543 GH18_PF-ChiA-like PF-ChiA is an uncharacterized chitinase found in the hyperthermophilic archaeon Pyrococcus furiosus with a glycosyl hydrolase family 18 (GH18) catalytic domain as well as a cellulose-binding domain. Members of this domain family are found not only in archaea but also in eukaryotes and prokaryotes. PF-ChiA exhibits hydrolytic activity toward both colloidal and crystalline (beta/alpha) chitins at high temperature. 294
25809 119361 cd06544 GH18_narbonin Narbonin is a plant 2S protein from the globulin fraction of narbon bean (Vicia narbonensis L.) cotyledons with unknown function. Narbonin has a glycosyl hydrolase family 18 (GH18) domain without the conserved catalytic residues and with no known enzymatic activity. Narbonin amounts to up to 3% of the total seed globulins of mature seeds and was thought to be a storage protein but was found to degrade too slowly during germination. This family also includes the VfNOD32 nodulin from Vicia faba. 253
25810 119362 cd06545 GH18_3CO4_chitinase The Bacteroides thetaiotaomicron protein represented by pdb structure 3CO4 is an uncharacterized bacterial member of the family 18 glycosyl hydrolases with homologs found in Flavobacterium, Stigmatella, and Pseudomonas. 253
25811 119363 cd06546 GH18_CTS3_chitinase GH18 domain of CTS3 (chitinase 3), an uncharacterized protein from the human fungal pathogen Coccidioides posadasii. CTS3 has a chitinase-like glycosyl hydrolase family 18 (GH18) domain; and has homologs in bacteria as well as fungi. 256
25812 119364 cd06547 GH85_ENGase Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of N-glycosylproteins. The beta-1,4-glycosyl bond located between two N-acetylglucosamine residues is hydrolyzed such that N-acetylglucosamine 1 remains with the protein and N-acetylglucosamine 2 forms the reducing end of the released glycan. ENGase is a key enzyme in the processing of free oligosaccharides in the cytosol of eukaryotes. Oligosaccharides formed in the lumen of the endoplasmic reticulum are transported into the cytosol where they are catabolized by cytosolic ENGases and other enzymes, possibly to maximize the reutilization of the component sugars. ENGases have an eight-stranded alpha/beta barrel topology and are classified as a family 85 glycosyl hydrolase (GH85) domain. The GH85 ENGases are sequence-similar to the family 18 glycosyl hydrolases, also known as GH18 chitinases. An ENGase-like protein is also found in bacteria and is included in this alignment model. 339
25813 119365 cd06548 GH18_chitinase The GH18 (glycosyl hydrolases, family 18) type II chitinases hydrolyze chitin, an abundant polymer of N-acetylglucosamine and have been identified in bacteria, fungi, insects, plants, viruses, and protozoan parasites. The structure of this domain is an eight-stranded alpha/beta barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 322
25814 119366 cd06549 GH18_trifunctional GH18 domain of an uncharacterized family of bacterial proteins, which share a common three-domain architecture: an N-terminal glycosyl hydrolase family 18 (GH18) domain, a glycosyl transferase family 2 domain, and a C-terminal polysaccharide deacetylase domain. 298
25815 119348 cd06550 TM_ABC_iron-siderophores_like Transmembrane subunit (TM), of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters involved in the uptake of siderophores, heme, vitamin B12, or the divalent cations Mg2+ and Zn2+. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The TMs are bundles of alpha helices that traverse the cytoplasmic membrane multiple times. The two ABCs bind and hydrolyze ATP and drive the transport reaction. Each TM has a prominent cytoplasmic loop which contacts an ABC and represents a conserved motif. The two TMs form either a homodimer (e.g. in the case of the BtuC subunits of the Escherichia coli BtuCD vitamin B12 transporter), a heterodimer (e.g. the TroC and TroD subunits of the Treponema pallidum general transition metal transporter, TroBCD), or a pseudo-heterodimer (e.g. the FhuB protein of the E. coli ferrichrome transporter, FhuBC). FhuB contains two tandem TMs which associate to form the pseudo-heterodimer. Both FhuB TMs are found in this hierarchy. 261
25816 153244 cd06551 LPLAT Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis. Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene). 187
25817 119344 cd06552 ASCH_yqfb_like ASC-1 homology domain, subfamily similar to Escherichia coli Yqfb. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. 100
25818 119345 cd06553 ASCH_Ef3133_like ASC-1 homology domain, subfamily similar to Enterococcus faecalis Ef3133. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. 127
25819 119346 cd06554 ASCH_ASC-1_like ASC-1 homology domain, ASC-1-like subfamily. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. The domain has been named after the ASC-1 protein, the activating signal cointegrator 1 or thyroid hormone receptor interactor protein 4 (TRIP4). ASC-1 is conserved in many eukaryotes and has been suggested to participate in a protein complex that interacts with RNA. It has been shown that ASC-1 mediates the interaction between various transcription factors and the basal transcriptional machinery. 113
25820 119347 cd06555 ASCH_PF0470_like ASC-1 homology domain, subfamily similar to Pyrococcus furiosus Pf0470. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. 109
25821 119341 cd06556 ICL_KPHMT Members of the ICL/PEPM_KPHMT enzyme superfamily catalyze the formation and cleavage of either P-C or C-C bonds. Typical members are phosphoenolpyruvate mutase (PEPM), phosphonopyruvate hydrolase (PPH), carboxyPEP mutase (CPEP mutase), oxaloacetate hydrolase (OAH), isocitrate lyase (ICL), 2-methylisocitrate lyase (MICL), and ketopantoate hydroxymethyltransferase (KPHMT). 240
25822 119342 cd06557 KPHMT-like Ketopantoate hydroxymethyltransferase (KPHMT) is the first enzyme in the pantothenate biosynthesis pathway. Ketopantoate hydroxymethyltransferase (KPHMT) catalyzes the first committed step in the biosynthesis of pantothenate (vitamin B5), which is a precursor to coenzyme A and is required for penicillin biosynthesis. 254
25823 119339 cd06558 crotonase-like Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole. 195
25824 143472 cd06559 Endonuclease_V Endonuclease_V, a DNA repair enzyme that initiates repair of nitrosative deaminated purine bases. Endonuclease_V (EndoV) is an enzyme that can initiate repair of all possible deaminated DNA bases. EndoV cleaves the DNA strand containing lesions at the second phosphodiester bond 3' to the lesion using Mg2+ as a cofactor. EndoV homologs are conserved throughout all domains of life from bacteria to humans. EndoV is encoded by the nfi gene and nfi null mutant mice have a phenotype prone to cancer. The ability of endonuclease V to recognize mismatches and abnormal replicative DNA structures suggests that the enzyme plays an important role in DNA metabolism. The details of downstream processing for the EndoV pathway remain unknown. 208
25825 143473 cd06560 PriL Archaeal/eukaryotic core primase: Large subunit, PriL. Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. The DNA replication machinery of archaeal organisms contains only the core primase, a simpler arrangement compared to eukaryotes. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL, such as the stabilization of PriS, involvement in the initiation of synthesis, the improvement of primase processivity, and the determination of product size. 166
25826 132880 cd06561 AlkD_like A new structural DNA glycosylase. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the five identified structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix). DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in the Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerases, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity. 197
25827 119332 cd06562 GH20_HexA_HexB-like Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. The hexA and hexB genes encode the alpha- and beta-subunits of the two major beta-N-acetylhexosaminidase isoenzymes, N-acetyl-beta-D-hexosaminidase A (HexA) and beta-N-acetylhexosaminidase B (HexB). Both the alpha and the beta catalytic subunits have a TIM-barrel fold and belong to the glycosyl hydrolase family 20 (GH20). The HexA enzyme is a heterodimer containing one alpha and one beta subunit while the HexB enzyme is a homodimer containing two beta-subunits. Hexosaminidase mutations cause an inability to properly hydrolyze certain sphingolipids which accumulate in lysosomes within the brain, resulting in the lipid storage disorders Tay-Sachs and Sandhoff. Mutations in the alpha subunit cause a deficiency in the HexA enzyme and result in Tay-Sachs disease; mutations in the beta-subunit cause a deficiency in both HexA and HexB enzymes and result in Sandhoff disease. In both disorders GM(2) gangliosides accumulate in lysosomes. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 348
25828 119333 cd06563 GH20_chitobiase-like The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This GH20 domain family includes an N-acetylglucosamidase (GlcNAcase A) from Pseudoalteromonas piscicida and an N-acetylhexosaminidase (SpHex) from Streptomyces plicatus. SpHex lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 357
25829 119334 cd06564 GH20_DspB_LnbB-like Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosamidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway. The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 326
25830 119335 cd06565 GH20_GcnA-like Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, also known as BhsA) and related proteins. GcnA is an exoglucosidase which cleaves N-acetyl-beta-D-glucosamine (NAG) and N-acetyl-beta-D-galactosamine residues from 4-methylumbelliferylated (4MU) substrates, as well as cleaving NAG from chito-oligosaccharides (i.e. NAG polymers). In contrast, sulfated forms of the substrate are unable to be cleaved and act instead as mild competitive inhibitors. Additionally, the enzyme is known to be poisoned by several first-row transition metals as well as by mercury. GcnA forms a homodimer with subunits comprised of three domains, an N-terminal zincin-like domain, this central catalytic GH20 domain, and a C-terminal alpha helical domain. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 301
25831 143475 cd06567 Peptidase_S41 C-terminal processing peptidase family S41. Peptidase family S41 (C-terminal processing peptidase or CTPase family) contains very different subfamilies; it includes photosystem II D1 C-terminal processing protease (CTPase), interphotoreceptor retinoid-binding protein IRBP and tricorn protease (TRI). CTPase and TRI both contain the PDZ domain while IRBP, although being very similar to the tail-specific protease domain, lacks the PDZ insertion domain and hydrolytic activity. These serine proteases have distinctly different active sites: in CTPase, the active site consists of a serine/lysine catalytic dyad while in tricorn core protease, it is a tetrad (serine, histidine, serine, glutamate). CPases with different substrate specificities in different species include processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein; and others such as tricorn protease (TRI) act as a carboxypeptidase, involved in the degradation of proteasomal products. CTPase homolog IRBP, secreted by photoreceptors into the interphotoreceptor matrix, having arisen in the early evolution of the vertebrate eye, promotes the release of all-trans retinol from photoreceptors and facilitates its delivery to the retinal pigment epithelium. 224
25832 119336 cd06568 GH20_SpHex_like A subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the N-acetylhexosaminidase from Streptomyces plicatus (SpHex). SpHex catalyzes the hydrolysis of N-acetyl-beta-hexosaminides. An Asp residue within the active site plays a critical role in substrate-assisted catalysis by orienting the 2-acetamido group and stabilizing the transition state. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. Proteins belonging to this subgroup lack the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. 329
25833 119337 cd06569 GH20_Sm-chitobiase-like The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 445
25834 119338 cd06570 GH20_chitobiase-like_1 A functionally uncharacterized subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the chitobiase of Serratia marcescens, a beta-N-1,4-acetylhexosaminidase that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This subgroup lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. 311
25835 119330 cd06571 Bac_DnaA_C C-terminal domain of bacterial DnaA proteins. The DNA-binding C-terminal domain of DnaA contains a helix-turn-helix motif that specifically interacts with the DnaA box, a 9-mer motif that occurs repetitively in the replication origin oriC. Multiple copies of DnaA, which is an ATPase, bind to 9-mers at the origin and form an initial complex in which the DNA strands are being separated in an ATP-dependent step. 90
25836 119329 cd06572 Histidinol_dh Histidinol dehydrogenase, HisD, EC 1.1.1.23. Histidinol dehydrogenase catalyzes the last two steps in the L-histidine biosynthesis pathway, which is conserved in bacteria, archaea, fungi, and plants. These last two steps are (i) the NAD-dependent oxidation of L-histidinol to L-histidinaldehyde, and (ii) the NAD-dependent oxidation of L-histidinaldehyde to L-histidine. In most fungi and in the unicellular choanoflagellate Monosiga brevicollis, the HisD domain is fused with units that catalyze the second and third biosynthesis steps in this same pathway. 390
25837 119325 cd06573 PASTA PASTA domain. This domain is found at the C-termini of several Penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 53
25838 119320 cd06574 TM_PBP1_branched-chain-AA_like Transmembrane subunit (TM) of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which are involved in the uptake of branched-chain amino acids (AAs), as well as TMs of transporters involved in the uptake of monosaccharides including ribose, galactose, and arabinose. These transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. This group includes Escherichia coli LivM and LivH, two TMs which heterodimerize to form the translocation pathway of the E. coli branched-chain AA LIV-1/LS transporter. This transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) and LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine. Included in this group are proteins from transport systems that contain a single TM which homodimerizes to generate the transmembrane pore; for example E. coli RbsC, AlsC, and MglC, the TMs of the high affinity ribose transporter, the D-allose transporter and the galactose transporter, respectively. The D-allose transporter may also be involved in low affinity ribose transport. 266
25839 119326 cd06575 PASTA_Pbp2x-like_2 PASTA domain of PBP2x-like proteins, second repeat. Penicillin-binding proteins (PBPs) are the major targets for beta-lactam antibiotics, like penicillins and cephalosporins. Beta-lactam antibiotics specifically inhibit transpeptidase activity by acylating the active site serine. PBPs catalyze key steps in the synthesis of the peptidoglycan, such as the interconnecting of glycan chains (polymers of N-acetylglucosamine and N-acetylmuramic acid residues) and the cross-linking (transpeptidation) of short stem peptides, which are attached to glycan chains. Peptidoglycan is essential in cell division and protects bacteria from osmotic shock and lysis. PBP2x is one of the two monofunctional high molecular mass PBPs in Streptococcus pneumoniae and has been seen as the primary PBP target in beta-lactam-resistant strains. The PASTA domain is found at the C-termini of several PBPs and bacterial serine/threonine kinases. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 54
25840 119327 cd06576 PASTA_Pbp2x-like_1 PASTA domain of PBP2x-like proteins, first repeat. Penicillin-binding proteins (PBPs) are the major targets for beta-lactam antibiotics, like penicillins and cephalosporins. Beta-lactam antibiotics specifically inhibit transpeptidase activity by acylating the active site serine. PBPs catalyze key steps in the synthesis of the peptidoglycan, such as the interconnecting of glycan chains (polymers of N-acetylglucosamine and N-acetylmuramic acid residues) and the cross-linking (transpeptidation) of short stem peptides, which are connected to glycan chains. Peptidoglycan is essential in cell division and protects bacteria from osmotic shock and lysis. PBP2x is one of the two monofunctional high molecular mass PBPs in Streptococcus pneumoniae and has been seen as the primary PBP target in beta-lactam-resistant strains. The PASTA domain is found at the C-termini of several PBPs and bacterial serine/threonine kinases. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 55
25841 119328 cd06577 PASTA_pknB PASTA domain of bacterial serine/threonine kinase pknB-like proteins. PknB is a member of a group of related transmembrane sensor kinases present in many gram positive bacteria, which has been shown to regulate cell shape in Mycobacterium tuberculosis. PknB is a receptor-like transmembrane protein with an extracellular signal sensor domain (containing multiple PASTA domains) and an intracellular, eukaryotic serine/threonine kinase-like domain. The PASTA domain is found at the C-termini of several Penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 62
25842 119440 cd06578 HemD Uroporphyrinogen-III synthase (HemD) catalyzes the asymmetrical cyclization of tetrapyrrole (linear) to uroporphyrinogen-III, the fourth step in the biosynthesis of heme. This ubiquitous enzyme is present in eukaryotes, bacteria and archaea. Mutations in the human uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria, a recessive inborn error of metabolism also known as Gunther disease. 239
25843 119321 cd06579 TM_PBP1_transp_AraH_like Transmembrane subunit (TM) of Escherichia coli AraH and related proteins. E. coli AraH is the TM of a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of the monosaccharide arabinose. This group also contains E. coli RbsC, AlsC, and MglC, which are TMs of other monosaccharide transporters, the ribose transporter, the D-allose transporter and the galactose transporter, respectively. The D-allose transporter may also be involved in low affinity ribose transport. These transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. Proteins in this subgroup have a single TM which homodimerizes to generate the transmembrane pore. 263
25844 119322 cd06580 TM_PBP1_transp_TpRbsC_like Transmembrane subunit (TM) of Treponema pallidum (Tp) RbsC-1, RbsC-2 and related proteins. This is a functionally uncharacterized subgroup of TMs which belong to a larger group of TMs of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters, which are mainly involved in the uptake of branched-chain amino acids (AAs) or in the uptake of monosaccharides including ribose, galactose, and arabinose, and which generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. 234
25845 119323 cd06581 TM_PBP1_LivM_like Transmembrane subunit (TM) of Escherichia coli LivM and related proteins. LivM is one of two TMs of the E. coli LIV-1/LS transporter, a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of branched-chain amino acids (AAs). These types of transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. E. coli LivM forms a heterodimer with another TM, LivH, to generate the transmembrane pore. LivH is not included in this subgroup. The LIV-1/LS transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) or LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine. 268
25846 119324 cd06582 TM_PBP1_LivH_like Transmembrane subunit (TM) of Escherichia coli LivH and related proteins. LivH is one of two TMs of the E. coli LIV-1/LS transporter, a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of branched-chain amino acids (AAs). These types of transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. E. coli LivH forms a heterodimer with another TM, LivM, to generate the transmembrane pore. LivM is not included in this subgroup. The LIV-1/LS transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) or LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine. 272
25847 133475 cd06583 PGRP Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity. 126
25848 132915 cd06586 TPP_enzyme_PYR Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1-PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit. 154
25849 319898 cd06587 VOC vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxalase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases. 112
25850 319899 cd06588 PhnB_like Escherichia coli PhnB and similar proteins. The Escherichia coli phnB gene is found next to an operon of fourteen genes (phnC-to-phnP) related to the cleavage of carbon-phosphorus (C-P) bonds in unactivated alkylphosphonates, supporting bacterial growth on alkylphosphonates as the sole phosphorus source. It was originally considered part of that operon. PhnB appears to play no direct catalytic role in the usage of alkylphosphonate. Although many of the proteins in this family have been annotated as 3-demethylubiquinone-9 3-methyltransferase enzymes by automatic annotation programs, the experimental evidence for this assignment is lacking. In Escherichia coli, the gene coding 3-demethylubiquinone-9 3-methyltransferase enzyme is ubiG, which belongs to the AdoMet-MTase protein family. PhnB-like proteins adopt a structural fold similar to bleomycin resistance proteins, glyoxalase I, and type I extradiol dioxygenases. 129
25851 269876 cd06589 GH31 glycosyl hydrolase family 31 (GH31). GH31 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively. 265
25852 260000 cd06590 RNase_HII_bacteria_HIII_like Bacterial type 2 ribonuclease, HII and HIII-like. This family includes type 2 RNases H from several bacteria, such as Bacillus subtilis, which have two different RNases, HII and HIII. RNases HIII are distinguished by having a large (70-90 residues) N-terminal extension of unknown function. In addition, the active site of RNase HIII differs from that of other RNases H; replacing the fourth residue (aspartate) of the acidic "DEDD" motif with a glutamate. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes; however, no prokaryotic genomes contain the combination of both RNase HI and HIII. This mutually exclusive gene inheritance might be the result of functional redundancy of RNase HI and HIII in prokaryotes. Ribonuclease (RNase) H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, archaeal RNase HII and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication or repair. 207
25853 269877 cd06591 GH31_xylosidase_XylS xylosidase XylS-like. XylS is a glycosyl hydrolase family 31 (GH31) alpha-xylosidase found in prokaryotes, eukaryotes, and archaea, that catalyzes the release of alpha-xylose from the non-reducing terminal side of the alpha-xyloside substrate. XylS has been characterized in Sulfolobus solfataricus where it hydrolyzes isoprimeverose, the p-nitrophenyl-beta derivative of isoprimeverose, and xyloglucan oligosaccharides, and has transxylosidic activity. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. The XylS family corresponds to subgroup 3 in the Ernst et al classification of GH31 enzymes. 322
25854 269878 cd06592 GH31_NET37 glucosidase NET37. NET37 (also known as KIAA1161) is a human lamina-associated nuclear envelope transmembrane protein. A member of the glycosyl hydrolase family 31 (GH31), it has been shown to be required for myogenic differentiation of C2C12 cells. Related proteins are found in eukaryotes and prokaryotes. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 364
25855 269879 cd06593 GH31_xylosidase_YicI alpha-xylosidase YicI-like. YicI alpha-xylosidase is a glycosyl hydrolase family 31 (GH31) enzyme that catalyzes the release of an alpha-xylosyl residue from the non-reducing end of alpha-xyloside substrates such as alpha-xylosyl fluoride and isoprimeverose. YicI forms a homohexamer (a trimer of dimers). All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. The YicI family corresponds to subgroup 4 in the Ernst et al classification of GH31 enzymes. 308
25856 269880 cd06594 GH31_glucosidase_YihQ alpha-glucosidase YihQ-like. YihQ is a bacterial alpha-glucosidase with a conserved glycosyl hydrolase family 31 (GH31) domain that catalyzes the release of an alpha-glucosyl residue from the non-reducing end of alpha-glucoside substrates such as alpha-glucosyl fluoride. Orthologs of YihQ that have not yet been functionally characterized are present in plants and fungi. YihQ has sequence similarity to other GH31 enzymes such as CtsZ, a 6-alpha-glucosyltransferase from Bacillus globisporus, and YicI, an alpha-xylosidase from Escherichia coli. These latter two belong to different GH31 subfamilies than YihQ. In bacteria, YihQ (along with YihO) is important for bacterial O-antigen capsule assembly and translocation. 325
25857 269881 cd06595 GH31_u1 glycosyl hydrolase family 31 (GH31); uncharacterized subgroup. This family represents an uncharacterized GH31 enzyme subgroup found in bacteria and eukaryotes. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 304
25858 269882 cd06596 GH31_CPE1046 Clostridium CPE1046-like. CPE1046 is an uncharacterized Clostridium perfringens protein with a glycosyl hydrolase family 31 (GH31) domain. The domain architecture of CPE1046 and its orthologs includes a C-terminal fibronectin type 3 (FN3) domain and a coagulation factor 5/8 type C domain in addition to the GH31 domain. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 334
25859 269883 cd06597 GH31_transferase_CtsY CtsY (cyclic tetrasaccharide-synthesizing enzyme Y)-like. CtsY is a bacterial 3-alpha-isomaltosyltransferase, first identified in Arthrobacter globiformis, that produces cyclic tetrasaccharides together with a closely related enzyme CtsZ. CtsY and CtsZ both have a glycosyl hydrolase family 31 (GH31) catalytic domain; CtsZ belongs to a different subfamily. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 326
25860 269884 cd06598 GH31_transferase_CtsZ CtsZ (cyclic tetrasaccharide-synthesizing enzyme Z)-like. CtsZ is a bacterial 6-alpha-glucosyltransferase, first identified in Arthrobacter globiformis, that produces cyclic tetrasaccharides together with a closely related enzyme CtsY. CtsZ and CtsY both have a glycosyl hydrolase family 31 (GH31) catalytic domain; CtsY belongs to a different subfamily. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 332
25861 269885 cd06599 GH31_glycosidase_Aec37 E.coli Aec37-like. Glycosyl hydrolase family 31 (GH31) domain of a bacterial protein family represented by Escherichia coli protein Aec37. The gene encoding Aec37 (aec-37) is located within a genomic island (AGI-3) isolated from the extraintestinal avian pathogenic Escherichia coli strain BEN2908. The function of Aec37 and its orthologs is unknown; however, deletion of a region of the genome that includes aec-37 affects the assimilation of seven carbohydrates, decreases growth rate of the strain in minimal medium containing galacturonate or trehalose, and attenuates the virulence of E. coli BEN2908 in chickens. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 319
25862 269886 cd06600 GH31_MGAM-like maltase-glucoamylase (MGAM)-like. This family includes the following closely related glycosyl hydrolase family 31 (GH31) enzymes: maltase-glucoamylase (MGAM), sucrase-isomaltase (SI), lysosomal acid alpha-glucosidase (GAA), neutral alpha-glucosidase C (GANC), the alpha subunit of neutral alpha-glucosidase AB (GANAB), and alpha-glucosidase II. MGAM is one of the two enzymes responsible for catalyzing the last glucose-releasing step in starch digestion. SI is implicated in the digestion of dietary starch and major disaccharides such as sucrose and isomaltose, while GAA degrades glycogen in the lysosome, cleaving both alpha-1,4 and alpha-1,6 glucosidic linkages. MGAM and SI are anchored to small-intestinal brush-border epithelial cells. The absence of SI from the brush border membrane or its malfunction is associated with malabsorption disorders such as congenital sucrase-isomaltase deficiency (CSID). The domain architectures of MGAM and SI include two tandem GH31 catalytic domains, an N-terminal domain found near the membrane-bound end and a C-terminal luminal domain. Both of the tandem GH31 domains of MGAM and SI are included in this family. The domain architecture of GAA includes an N-terminal TFF (trefoil factor family) domain in addition to the GH31 catalytic domain. Deficient GAA expression causes Pompe disease, an autosomal recessive genetic disorder also known as glycogen storage disease type II (GSDII). GANC and GANAB are key enzymes in glycogen metabolism that hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues from glycogen in the endoplasmic reticulum. Alpha-glucosidase II is a GH31 enzyme, found in bacteria and plants, which has exo-alpha-1,4-glucosidase and oligo-1,6-glucosidase activities. Alpha-glucosidase II has been characterized in Bacillus thermoamyloliquefaciens where it forms a homohexamer. This family also includes the MalA alpha-glucosidase from Sulfolobus solfataricus and the AglA alpha-glucosidase from Picrophilus torridus. MalA is part of the carbohydrate-metabolizing machinery that allows this organism to utilize carbohydrates, such as maltose, as the sole carbon and energy source. The MGAM-like family corresponds to subgroup 1 in the Ernst et al classification of GH31 enzymes. 256
25863 269887 cd06601 GH31_lyase_GLase alpha-1,4-glucan lyase. GLases (alpha-1,4-glucan lyases) are glycosyl hydrolase family 31 (GH31) enzymes that degrade alpha-1,4-glucans and maltooligosaccharides via a nonhydrolytic pathway to yield 1,5-D-anhydrofructose from the nonreducing end. GLases cleave the bond between C1 and O1 of the nonreducing sugar residue of alpha-glucans to generate a monosaccharide product with a double bond between C1 and C2. This family corresponds to subgroup 2 in the Ernst et al classification of GH31 enzymes. 347
25864 269888 cd06602 GH31_MGAM_SI_GAA maltase-glucoamylase, sucrase-isomaltase, lysosomal acid alpha-glucosidase. This subgroup includes the following three closely related glycosyl hydrolase family 31 (GH31) enzymes: maltase-glucoamylase (MGAM), sucrase-isomaltase (SI), and lysosomal acid alpha-glucosidase (GAA), also known as acid-maltase. MGAM is one of the two enzymes responsible for catalyzing the last glucose-releasing step in starch digestion. SI is implicated in the digestion of dietary starch and major disaccharides such as sucrose and isomaltose, while GAA degrades glycogen in the lysosome, cleaving both alpha-1,4 and alpha-1,6 glucosidic linkages. MGAM and SI are anchored to small-intestinal brush-border epithelial cells. The absence of SI from the brush border membrane or its malfunction is associated with malabsorption disorders such as congenital sucrase-isomaltase deficiency (CSID). The domain architectures of MGAM and SI include two tandem GH31 catalytic domains, an N-terminal domain found near the membrane-bound end, and a C-terminal luminal domain. Both of the tandem GH31 domains of MGAM and SI are included in this family. The domain architecture of GAA includes an N-terminal TFF (trefoil factor family) domain in addition to the GH31 catalytic domain. Deficient GAA expression causes Pompe disease, an autosomal recessive genetic disorder also known as glycogen storage disease type II (GSDII). 367
25865 269889 cd06603 GH31_GANC_GANAB_alpha neutral alpha-glucosidase C, neutral alpha-glucosidase AB. This subgroup includes the closely related glycosyl hydrolase family 31 (GH31) isozymes, neutral alpha-glucosidase C (GANC) and the alpha subunit of heterodimeric neutral alpha-glucosidase AB (GANAB). Initially distinguished on the basis of differences in electrophoretic mobility in starch gel, GANC and GANAB have been shown to have other differences, including those of substrate specificity. GANC and GANAB are key enzymes in glycogen metabolism that hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues from glycogen in the endoplasmic reticulum. The GANC/GANAB family includes the alpha-glucosidase II (ModA) from Dictyostelium discoideum as well as the alpha-glucosidase II (GLS2, or ROT2 - Reversal of TOR2 lethality protein 2) from Saccharomyces cerevisiae. 467
25866 269890 cd06604 GH31_glucosidase_II_MalA Alpha-glucosidase II-like. Alpha-glucosidase II (alpha-D-glucoside glucohydrolase) is a glycosyl hydrolase family 31 (GH31) enzyme, found in bacteria and plants, which has exo-alpha-1,4-glucosidase and oligo-1,6-glucosidase activities. Alpha-glucosidase II has been characterized in Bacillus thermoamyloliquefaciens where it forms a homohexamer. This subgroup also includes the MalA alpha-glucosidase from Sulfolobus solfataricus and the AglA alpha-glucosidase from Picrophilus torridus. MalA is part of the carbohydrate-metabolizing machinery that allows this organism to utilize carbohydrates, such as maltose, as the sole carbon and energy source. 339
25867 270782 cd06605 PKc_MAPKK Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein Kinase Kinase. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MAPKKs are dual-specificity PKs that phosphorylate their downstream targets, MAPKs, at specific threonine and tyrosine residues. The MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The pathways involve a triple kinase core cascade comprising the MAPK, which is phosphorylated and activated by a MAPK kinase (MAPKK or MKK or MAP2K), which itself is phosphorylated and activated by a MAPKK kinase (MAPKKK or MKKK or MAP3K). There are three MAPK subfamilies: extracellular signal-regulated kinase (ERK), c-Jun N-terminal kinase (JNK), and p38. In mammalian cells, there are seven MAPKKs (named MKK1-7) and 20 MAPKKKs. Each MAPK subfamily can be activated by at least two cognate MAPKKs and by multiple MAPKKKs. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
25868 270783 cd06606 STKc_MAPKKK Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase Kinase Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPKKKs (MKKKs or MAP3Ks) are also called MAP/ERK kinase kinases (MEKKs) in some cases. They phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. This subfamily is composed of the Apoptosis Signal-regulating Kinases ASK1 (or MAPKKK5) and ASK2 (or MAPKKK6), MEKK1, MEKK2, MEKK3, MEKK4, as well as plant and fungal MAPKKKs. Also included in this subfamily are the cell division control proteins Schizosaccharomyces pombe Cdc7 and Saccharomyces cerevisiae Cdc15. The MAPKKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
25869 270784 cd06607 STKc_TAO Catalytic domain of the Serine/Threonine Kinases, Thousand-and-One Amino acids proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO proteins possess mitogen-activated protein kinase (MAPK) kinase kinase activity. They activate the MAPKs, p38 and c-Jun N-terminal kinase (JNK), by phosphorylating and activating the respective MAP/ERK kinases (MEKs, also known as MKKs or MAPKKs), MEK3/MEK6 and MKK4/MKK7. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. Vertebrates contain three TAO subfamily members, named TAO1, TAO2, and TAO3. The TAO subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
25870 270785 cd06608 STKc_myosinIII_N_like N-terminal Catalytic domain of Class III myosin-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class III myosins are motor proteins with an N-terminal kinase catalytic domain and a C-terminal actin-binding motor domain. Class III myosins are present in the photoreceptors of invertebrates and vertebrates and in the auditory hair cells of mammals. The kinase domain of myosin III can phosphorylate several cytoskeletal proteins, conventional myosin regulatory light chains, and can autophosphorylate the C-terminal motor domain. Myosin III may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. It may also function as a cargo carrier during light-dependent translocation, in photoreceptor cells, of proteins such as transducin and arrestin. The Drosophila class III myosin, called NinaC (Neither inactivation nor afterpotential protein C), is critical in normal adaptation and termination of photoresponse. Vertebrates contain two isoforms of class III myosin, IIIA and IIIB. This subfamily also includes mammalian NIK-like embryo-specific kinase (NESK), Traf2- and Nck-interacting kinase (TNIK), and mitogen-activated protein kinase (MAPK) kinase kinase kinase 4/6. MAP4Ks are involved in some MAPK signaling pathways by activating a MAPK kinase kinase. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The class III myosin-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
25871 270786 cd06609 STKc_MST3_like Catalytic domain of Mammalian Ste20-like protein kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MST3, MST4, STK25, Schizosaccharomyces pombe Nak1 and Sid1, Saccharomyces cerevisiae sporulation-specific protein 1 (SPS1), and related proteins. Nak1 is required by fission yeast for polarizing the tips of actin cytoskeleton and is involved in cell growth, cell separation, cell morphology and cell-cycle progression. Sid1 is a component in the septation initiation network (SIN) signaling pathway, and plays a role in cytokinesis. SPS1 plays a role in regulating proteins required for spore wall formation. MST4 plays a role in mitogen-activated protein kinase (MAPK) signaling during cytoskeletal rearrangement, morphogenesis, and apoptosis. MST3 phosphorylates the STK NDR and may play a role in cell cycle progression and cell morphology. STK25 may play a role in the regulation of cell migration and polarization. The MST3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 274
25872 270787 cd06610 STKc_OSR1_SPAK Catalytic domain of the Serine/Threonine Kinases, Oxidative stress response kinase and Ste20-related proline alanine-rich kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SPAK is also referred to as STK39 or PASK (proline-alanine-rich STE20-related kinase). OSR1 and SPAK regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. They are also implicated in cytoskeletal rearrangement, cell differentiation, transformation and proliferation. OSR1 and SPAK contain a conserved C-terminal (CCT) domain, which recognizes a unique motif ([RK]FX[VI]) present in their activating kinases (WNK1/WNK4) and their substrates. The OSR1 and SPAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
25873 132942 cd06611 STKc_SLK_like Catalytic domain of Ste20-Like Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of the subfamily include SLK, STK10 (also called LOK for Lymphocyte-Oriented Kinase), SmSLK (Schistosoma mansoni SLK), and related proteins. SLK promotes apoptosis through apoptosis signal-regulating kinase 1 (ASK1) and the mitogen-activated protein kinase (MAPK) p38. It also plays a role in mediating actin reorganization. STK10 is responsible for regulating the CD28 responsive element in T cells, as well as leukocyte function associated antigen (LFA-1)-mediated lymphocyte adhesion. SmSLK is capable of activating the MAPK Jun N-terminal kinase (JNK) pathway in human embryonic kidney cells as well as in Xenopus oocytes. It may participate in regulating MAPK cascades during host-parasite interactions. The SLK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 280
25874 132943 cd06612 STKc_MST1_2 Catalytic domain of the Serine/Threonine Kinases, Mammalian STe20-like protein kinase 1 and 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MST1, MST2, and related proteins including Drosophila Hippo and Dictyostelium discoideum Krs1 (kinase responsive to stress 1). MST1/2 and Hippo are involved in a conserved pathway that governs cell contact inhibition, organ size control, and tumor development. MST1 activates the mitogen-activated protein kinases (MAPKs) p38 and c-Jun N-terminal kinase (JNK) through MKK7 and MEKK1 by acting as a MAPK kinase kinase kinase. Activation of JNK by MST1 leads to caspase activation and apoptosis. MST1 has also been implicated in cell proliferation and differentiation. Krs1 may regulate cell growth arrest and apoptosis in response to cellular stress. The MST1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
25875 270788 cd06613 STKc_MAP4K3_like Catalytic domain of Mitogen-activated protein kinase kinase kinase kinase (MAP4K) 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes MAP4K3, MAP4K1, MAP4K2, MAP4K5, and related proteins. Vertebrate members contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. MAP4K1, also called haematopoietic progenitor kinase 1 (HPK1), is a hematopoietic-specific STK involved in many cellular signaling cascades including MAPK, antigen receptor, apoptosis, growth factor, and cytokine signaling. It participates in the regulation of T cell receptor signaling and T cell-mediated immune responses. MAP4K2 was referred to as germinal center (GC) kinase because of its preferred location in GC B cells. MAP4K3 plays a role in the nutrient-responsive pathway of mTOR (mammalian target of rapamycin) signaling. It is required in the activation of S6 kinase by amino acids and for the phosphorylation of the mTOR-regulated inhibitor of eukaryotic initiation factor 4E. MAP4K5, also called germinal center kinase-related enzyme (GCKR), has been shown to activate the MAPK c-Jun N-terminal kinase (JNK). The MAP4K3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
25876 270789 cd06614 STKc_PAK Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs are implicated in the regulation of many cellular processes including growth factor receptor-mediated proliferation, cell polarity, cell motility, cell death and survival, and actin cytoskeleton organization. PAK deregulation is associated with tumor development. PAKs from higher eukaryotes are classified into two groups (I and II), according to their biochemical and structural features. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). Group II PAKs contain a PBD and a catalytic domain, but lack other motifs found in group I PAKs. Since group II PAKs do not contain an obvious AID, they may be regulated differently from group I PAKs. Group I PAKs interact with the SH3 containing proteins Nck, Grb2 and PIX; no such binding has been demonstrated for group II PAKs. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
25877 132946 cd06615 PKc_MEK Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK1 and MEK2 are MAPK kinases (MAPKKs or MKKs), and are dual-specificity PKs that phosphorylate and activate the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK1/2, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. This cascade has also been implicated in synaptic plasticity, migration, morphological determination, and stress response immunological reactions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK1/2, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 308
25878 270790 cd06616 PKc_MKK4 Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 4. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK4 is a dual-specificity PK that phosphorylates and activates the downstream targets, c-Jun N-terminal kinase (JNK) and p38 MAPK, on specific threonine and tyrosine residues. JNK and p38 are collectively known as stress-activated MAPKs, as they are activated in response to a variety of environmental stresses and pro-inflammatory cytokines. Their activation is associated with the induction of cell death. Mice deficient in MKK4 die during embryogenesis and display anemia, severe liver hemorrhage, and abnormal hepatogenesis. MKK4 may also play roles in the immune system and in cardiac hypertrophy. It plays a major role in cancer as a tumor and metastasis suppressor. Under certain conditions, MKK4 is pro-oncogenic. The MKK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
25879 173729 cd06617 PKc_MKK3_6 Catalytic domain of the dual-specificity Protein Kinases, Mitogen-activated protein Kinase Kinases 3 and 6. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK3 and MKK6 are dual-specificity PKs that phosphorylate and activate their downstream target, p38 MAPK, on specific threonine and tyrosine residues. MKK3/6 play roles in the regulation of cell cycle progression, cytokine- and stress-induced apoptosis, oncogenic transformation, and adult tissue regeneration. In addition, MKK6 plays a critical role in osteoclast survival in inflammatory disease while MKK3 is associated with tumor invasion, progression, and poor patient survival in glioma. The MKK3/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
25880 270791 cd06618 PKc_MKK7 Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 7. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK7 is a dual-specificity PK that phosphorylates and activates its downstream target, c-Jun N-terminal kinase (JNK), on specific threonine and tyrosine residues. Although MKK7 is capable of dual phosphorylation, it prefers to phosphorylate the threonine residue of JNK. Thus, optimal activation of JNK requires both MKK4 and MKK7. MKK7 is primarily activated by cytokines. MKK7 is essential for liver formation during embryogenesis. It plays roles in G2/M cell cycle arrest and cell growth. In addition, it is involved in the control of programmed cell death, which is crucial in oncogenesis, cancer chemoresistance, and antagonism to TNFalpha-induced killing, through its inhibition by Gadd45beta and the subsequent suppression of the JNK cascade. The MKK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 295
25881 132950 cd06619 PKc_MKK5 Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 5. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK5 (also called MEK5) is a dual-specificity PK that phosphorylates its downstream target, extracellular signal-regulated kinase 5 (ERK5), on specific threonine and tyrosine residues. MKK5 is activated by MEKK2 and MEKK3 in response to mitogenic and stress stimuli. The ERK5 cascade promotes cell proliferation, differentiation, neuronal survival, and neuroprotection. This cascade plays an essential role in heart development. Mice deficient in either ERK5 or MKK5 die around embryonic day 10 due to cardiovascular defects including underdevelopment of the myocardium. In addition, MKK5 is associated with metastasis and unfavorable prognosis in prostate cancer. The MKK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
25882 270792 cd06620 PKc_Byr1_like Catalytic domain of fungal Byr1-like dual-specificity Mitogen-activated protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Byr1 from Schizosaccharomyces pombe, FUZ7 from Ustilago maydis, and related proteins. Byr1 phosphorylates its downstream target, the MAPK Spk1, and is regulated by the MAPKK kinase Byr2. The Spk1 cascade is pheromone-responsive and is essential for sporulation and sexual differentiation in fission yeast. FUZ7 phosphorylates and activates its target, the MAPK Crk1, which is required in mating and virulence in U. maydis. MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The Byr-1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
25883 270793 cd06621 PKc_Pek1_like Catalytic domain of fungal Pek1-like dual-specificity Mitogen-Activated Protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Pek1/Skh1 from Schizosaccharomyces pombe and MKK2 from Saccharomyces cerevisiae, and related proteins. Both fission yeast Pek1 and baker's yeast MKK2 are components of the cell integrity MAPK pathway. In fission yeast, Pek1 phosphorylates and activates Pmk1/Spm1 and is regulated by the MAPKK kinase Mkh1. In baker's yeast, the pathway involves the MAPK Slt2, the MAPKKs MKK1 and MKK2, and the MAPKK kinase Bck1. The cell integrity MAPK cascade is activated by multiple stress conditions, and is essential in cell wall construction, morphogenesis, cytokinesis, and ion homeostasis. MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
25884 132953 cd06622 PKc_PBS2_like Catalytic domain of fungal PBS2-like dual-specificity Mitogen-Activated Protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Polymyxin B resistance protein 2 (PBS2) from Saccharomyces cerevisiae, Wis1 from Schizosaccharomyces pombe, and related proteins. PBS2 and Wis1 are components of stress-activated MAPK cascades in budding and fission yeast, respectively. PBS2 is the specific activator of the MAPK Hog1, which plays a central role in the response of budding yeast to stress including exposure to arsenite and hyperosmotic environments. Wis1 phosphorylates and activates the MAPK Sty1 (also called Spc1 or Phh1), which stimulates a transcriptional response to a wide range of cellular insults through the bZip transcription factors Atf1, Pcr1, and Pap1. The PBS2 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
25885 132954 cd06623 PKc_MAPKK_plant_like Catalytic domain of Plant dual-specificity Mitogen-Activated Protein Kinase Kinases and similar proteins. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include MAPKKs from plants, kinetoplastids, alveolates, and mycetozoa. The MAPKK, LmxPK4, from Leishmania mexicana, is important in differentiation and virulence. Dictyostelium discoideum MEK1 is required for proper chemotaxis; MEK1 null mutants display severe defects in cell polarization and directional movement. Plants contain multiple MAPKKs like other eukaryotes. The Arabidopsis genome encodes 10 MAPKKs while poplar and rice contain 13 MAPKKs each. The functions of these proteins have not been fully elucidated. There is evidence to suggest that MAPK cascades are involved in plant stress responses. In Arabidopsis, MKK3 plays a role in pathogen signaling; MKK2 is involved in cold and salt stress signaling; MKK4/MKK5 participate in innate immunity; and MKK7 regulates basal and systemic acquired resistance. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 264
25886 270794 cd06624 STKc_ASK Catalytic domain of the Serine/Threonine Kinase, Apoptosis signal-regulating kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily are mitogen-activated protein kinase (MAPK) kinase kinases (MAPKKKs or MKKKs) and include ASK1, ASK2, and MAPKKK15. ASK1 (also called MAPKKK5) functions in the c-Jun N-terminal kinase (JNK) and p38 MAPK signaling pathways by directly activating their respective MAPKKs, MKK4/MKK7 and MKK3/MKK6. It plays important roles in cytokine and stress responses, as well as in reactive oxygen species-mediated cellular responses. ASK1 is implicated in various diseases mediated by oxidative stress including ischemic heart disease, hypertension, vessel injury, brain ischemia, Fanconi anemia, asthma, and pulmonary edema, among others. ASK2 (also called MAPKKK6) functions only in a heteromeric complex with ASK1, and can activate ASK1 by direct phosphorylation. The function of MAPKKK15 is still unknown. The ASK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
25887 270795 cd06625 STKc_MEKK3_like Catalytic domain of Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MEKK3, MEKK2, and related proteins; all contain an N-terminal PB1 domain, which mediates oligomerization, and a C-terminal catalytic domain. MEKK2 and MEKK3 are MAPK kinase kinases (MAPKKKs or MKKK) that activate MEK5 (also called MKK5), which activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. MEKK2 and MEKK3 can also activate the MAPKs, c-Jun N-terminal kinase (JNK) and p38, through their respective MAPKKs. The MEKK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
25888 270796 cd06626 STKc_MEKK4 Catalytic domain of the Protein Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK4 is a MAPK kinase kinase that phosphorylates and activates the c-Jun N-terminal kinase (JNK) and p38 MAPK signaling pathways by directly activating their respective MAPKKs, MKK4/MKK7 and MKK3/MKK6. JNK and p38 are collectively known as stress-activated MAPKs, as they are activated in response to a variety of environmental stresses and pro-inflammatory cytokines. MEKK4 also plays roles in the re-polarization of the actin cytoskeleton in response to osmotic stress, in the proper closure of the neural tube, in cardiovascular development, and in immune responses. The MEKK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
25889 270797 cd06627 STKc_Cdc7_like Catalytic domain of Cell division control protein 7-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily include Schizosaccharomyces pombe Cdc7, Saccharomyces cerevisiae Cdc15, Arabidopsis thaliana mitogen-activated protein kinase kinase kinase (MAPKKK) epsilon, and related proteins. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Fission yeast Cdc7 is essential for cell division by playing a key role in the initiation of septum formation and cytokinesis. Budding yeast Cdc15 functions to coordinate mitotic exit with cytokinesis. Arabidopsis MAPKKK epsilon is required for pollen development in the plasma membrane. The Cdc7-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
25890 270798 cd06628 STKc_Byr2_like Catalytic domain of the Serine/Threonine Kinases, fungal Byr2-like Mitogen-Activated Protein Kinase Kinase Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include the MAPKKKs Schizosaccharomyces pombe Byr2, Saccharomyces cerevisiae and Cryptococcus neoformans Ste11, and related proteins. They contain an N-terminal SAM (sterile alpha-motif) domain, which mediates protein-protein interaction, and a C-terminal catalytic domain. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Fission yeast Byr2 is regulated by Ras1. It responds to pheromone signaling and controls mating through the MAPK pathway. Budding yeast Ste11 functions in MAPK cascades that regulate mating, high osmolarity glycerol, and filamentous growth responses. The Byr2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
25891 270799 cd06629 STKc_Bck1_like Catalytic domain of the Serine/Threonine Kinases, fungal Bck1-like Mitogen-Activated Protein Kinase Kinase Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include the MAPKKKs Saccharomyces cerevisiae Bck1 and Schizosaccharomyces pombe Mkh1, and related proteins. Budding yeast Bck1 is part of the cell integrity MAPK pathway, which is activated by stresses and aggressions to the cell wall. The MAPKKK Bck1, MAPKKs Mkk1 and Mkk2, and the MAPK Slt2 make up the cascade that is important in the maintenance of cell wall homeostasis. Fission yeast Mkh1 is involved in MAPK cascades regulating cell morphology, cell wall integrity, salt resistance, and filamentous growth in response to stress. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The Bck1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
25892 270800 cd06630 STKc_MEKK1 Catalytic domain of the Protein Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK1 is a MAPK kinase kinase (MAPKKK or MKKK) that phosphorylates and activates the ERK1/2 and c-Jun N-terminal kinase (JNK) pathways by activating their respective MAPKKs, MEK1/2 and MKK4/MKK7. MEKK1 is important in regulating cell survival and apoptosis. MEKK1 also plays a role in cell migration, tissue maintenance and homeostasis, and wound healing. The MEKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
25893 270801 cd06631 STKc_YSK4 Catalytic domain of the Serine/Threonine Kinase, Yeast Sps1/Ste20-related Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. YSK4 is a putative MAPKKK, whose mammalian gene has been isolated. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The YSK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
25894 270802 cd06632 STKc_MEKK1_plant Catalytic domain of the Serine/Threonine Kinase, Plant Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of plant MAPK kinase kinases (MAPKKKs) including Arabidopsis thaliana MEKK1 and MAPKKK3. Arabidopsis thaliana MEKK1 activates MPK4, a MAPK that regulates systemic acquired resistance. MEKK1 also participates in the regulation of temperature-sensitive and tissue-specific cell death. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The plant MEKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
25895 270803 cd06633 STKc_TAO3 Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO3 is also known as JIK (c-Jun N-terminal kinase inhibitory kinase) or KFC (kinase from chicken). It specifically activates JNK, presumably by phosphorylating and activating MKK4/MKK7. In Saccharomyces cerevisiae, TAO3 is a component of the RAM (regulation of Ace2p activity and cellular morphogenesis) signaling pathway. TAO3 is upregulated in retinal ganglion cells after axotomy, and may play a role in apoptosis. TAO proteins possess mitogen-activated protein kinase (MAPK) kinase kinase activity. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The TAO3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
25896 270804 cd06634 STKc_TAO2 Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Human TAO2 is also known as prostate-derived Ste20-like kinase (PSK) and was identified in a screen for overexpressed RNAs in prostate cancer. TAO2 possesses mitogen-activated protein kinase (MAPK) kinase kinase activity and activates both p38 and c-Jun N-terminal kinase (JNK), by phosphorylating and activating their respective MAP/ERK kinases, MEK3/MEK6 and MKK4/MKK7. It contains a long C-terminal extension with autoinhibitory segments, and is activated by the release of this inhibition and the phosphorylation of its activation loop serine. TAO2 functions as a regulator of actin cytoskeletal and microtubule organization. In addition, it regulates the transforming growth factor-activated kinase 1 (TAK1), which is a MAPKKK that plays an essential role in the signaling pathways of tumor necrosis factor, interleukin 1, and Toll-like receptor. The TAO2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 308
25897 270805 cd06635 STKc_TAO1 Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO1 is sometimes referred to as prostate-derived sterile 20-like kinase 2 (PSK2). TAO1 activates the p38 MAPK through direct interaction with and activation of MEK3. TAO1 is highly expressed in the brain and may play a role in neuronal apoptosis. TAO1 interacts with the checkpoint proteins BubR1 and Mad2, and plays an important role in regulating mitotic progression, which is required for both chromosome congression and checkpoint-induced anaphase delay. TAO1 may play a role in protecting genomic stability. TAO proteins possess MAPK kinase kinase activity. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The TAO1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 317
25898 270806 cd06636 STKc_MAP4K4_6_N N-terminal Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinase Kinase Kinase Kinase 4 and 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. MAP4K4 is also called Nck Interacting kinase (NIK). It facilitates the activation of the MAPKs, extracellular signal-regulated kinase (ERK) 1, ERK2, and c-Jun N-terminal kinase (JNK), by phosphorylating and activating MEKK1. MAP4K4 plays a role in tumor necrosis factor (TNF) alpha-induced insulin resistance. MAP4K4 silencing in skeletal muscle cells from type II diabetic patients restores insulin-mediated glucose uptake. MAP4K4, through JNK, also plays a broad role in cell motility, which impacts inflammation, homeostasis, as well as the invasion and spread of cancer. MAP4K4 is found to be highly expressed in most tumor cell lines relative to normal tissue. MAP4K6 (also called MINK for Misshapen/NIKs-related kinase) is activated after Ras induction and mediates activation of p38 MAPK. MAP4K6 plays a role in cell cycle arrest, cytoskeleton organization, cell adhesion, and cell motility. The MAP4K4/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 282
25899 270807 cd06637 STKc_TNIK Catalytic domain of the Serine/Threonine Kinase, Traf2- and Nck-Interacting Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TNIK is an effector of Rap2, a small GTP-binding protein from the Ras family. TNIK specifically activates the c-Jun N-terminal kinase (JNK) pathway and plays a role in regulating the actin cytoskeleton. The TNIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 296
25900 132969 cd06638 STKc_myosinIIIA_N N-terminal Catalytic domain of the Serine/Threonine Kinase, Class IIIA myosin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class IIIA myosin is highly expressed in retina and in inner ear hair cells. It is localized to the distal ends of actin-bundled structures. Mutations in human myosin IIIA are responsible for progressive nonsyndromic hearing loss. Human myosin IIIA possesses ATPase and kinase activities, and the ability to move actin filaments in a motility assay. It may function as a cellular transporter capable of moving along actin bundles in sensory cells. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain. Class III myosins may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. In photoreceptor cells, they may also function as cargo carriers during light-dependent translocation of proteins such as transducin and arrestin. The class III myosin subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
25901 270808 cd06639 STKc_myosinIIIB_N N-terminal Catalytic domain of the Serine/Threonine Kinase, Class IIIB myosin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class IIIB myosin is expressed highly in retina. It is also present in the brain and testis. The human class IIIB myosin gene maps to a region that overlaps the locus for Bardet-Biedl syndrome, which is characterized by dysmorphic extremities, retinal dystrophy, obesity, male hypogenitalism, and renal abnormalities. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain. They may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. They may also function as cargo carriers during light-dependent translocation, in photoreceptor cells, of proteins such as transducin and arrestin. The class III myosin subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
25902 132971 cd06640 STKc_MST4 Catalytic domain of the Serine/Threonine Kinase, Mammalian Ste20-like protein kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MST4 is sometimes referred to as MASK (MST3 and SOK1-related kinase). It plays a role in mitogen-activated protein kinase (MAPK) signaling during cytoskeletal rearrangement, morphogenesis, and apoptosis. It influences cell growth and transformation by modulating the extracellular signal-regulated kinase (ERK) pathway. MST4 may also play a role in tumor formation and progression. It localizes in the Golgi apparatus by interacting with the Golgi matrix protein GM130 and may play a role in cell migration. The MST4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
25903 270809 cd06641 STKc_MST3 Catalytic domain of the Serine/Threonine Kinase, Mammalian Ste20-like protein kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MST3 phosphorylates the STK NDR and may play a role in cell cycle progression and cell morphology. It may also regulate paxillin and consequently, cell migration. MST3 is present in human placenta, where it plays an essential role in the oxidative stress-induced apoptosis of trophoblasts in normal spontaneous delivery. Dysregulation of trophoblast apoptosis may result in pregnancy complications such as preeclampsia and intrauterine growth retardation. The MST3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
25904 270810 cd06642 STKc_STK25 Catalytic domain of Serine/Threonine Kinase 25 (also called Yeast Sps1/Ste20-related kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK25 is also called Ste20/oxidant stress response kinase 1 (SOK1) or yeast Sps1/Ste20-related kinase 1 (YSK1). It is localized in the Golgi apparatus through its interaction with the Golgi matrix protein GM130. It may be involved in the regulation of cell migration and polarization. STK25 binds and phosphorylates CCM3 (cerebral cavernous malformation 3), also called PCD10 (programmed cell death 10), and may play a role in apoptosis. Human STK25 is a candidate gene responsible for pseudopseudohypoparathyroidism (PPHP), a disease that shares features with the Albright hereditary osteodystrophy (AHO) phenotype. The STK25 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
25905 270811 cd06643 STKc_SLK Catalytic domain of the Serine/Threonine Kinase, Ste20-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SLK promotes apoptosis through apoptosis signal-regulating kinase 1 (ASK1) and the mitogen-activated protein kinase (MAPK) p38. It acts as a MAPK kinase kinase by phosphorylating ASK1, resulting in the phosphorylation of p38. SLK also plays a role in mediating actin reorganization. It is part of a microtubule-associated complex that is targeted at adhesion sites, and is required in focal adhesion turnover and in regulating cell migration. The SLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
25906 132975 cd06644 STKc_STK10 Catalytic domain of the Serine/Threonine Kinase, STK10 (also called Lymphocyte-Oriented Kinase or LOK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK10/LOK is also called polo-like kinase kinase 1 in Xenopus (xPlkk1). It is highly expressed in lymphocytes and is responsible for regulating leukocyte function-associated antigen (LFA-1)-mediated lymphocyte adhesion. It plays a role in regulating the CD28 responsive element in T cells, and may also function as a regulator of polo-like kinase 1 (Plk1), a protein which is overexpressed in multiple tumor types. The STK10 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
25907 270812 cd06645 STKc_MAP4K3 Catalytic domain of the Serine/Threonine Kinase, Mitogen-activated protein kinase kinase kinase kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP4K3 plays a role in the nutrient-responsive pathway of mTOR (mammalian target of rapamycin) signaling. MAP4K3 is required in the activation of S6 kinase by amino acids and for the phosphorylation of the mTOR-regulated inhibitor of eukaryotic initiation factor 4E. mTOR regulates ribosome biogenesis and protein translation, and is frequently deregulated in cancer. MAP4Ks are involved in MAPK signaling pathways by activating a MAPK kinase kinase. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. The MAP4K3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
25908 270813 cd06646 STKc_MAP4K5 Catalytic domain of the Serine/Threonine Kinase, Mitogen-activated protein kinase kinase kinase kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP4K5, also called germinal center kinase-related enzyme (GCKR), has been shown to activate the MAPK c-Jun N-terminal kinase (JNK). MAP4K5 also facilitates Wnt signaling in B cells, and may therefore be implicated in the control of cell fate, proliferation, and polarity. MAP4Ks are involved in some MAPK signaling pathways by activating a MAPK kinase kinase. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. The MAP4K5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
25909 270814 cd06647 STKc_PAK_I Catalytic domain of the Serine/Threonine Kinase, Group I p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Group I PAKs, also called conventional PAKs, include PAK1, PAK2, and PAK3. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). They interact with the SH3 domain containing proteins Nck, Grb2 and PIX. Binding of group I PAKs to activated GTPases leads to conformational changes that destabilize the AID, allowing autophosphorylation and full activation of the kinase domain. Known group I PAK substrates include MLCK, Bad, Raf, MEK1, LIMK, Merlin, Vimentin, Myc, Stat5a, and Aurora A, among others. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs are implicated in the regulation of many cellular processes including growth factor receptor-mediated proliferation, cell polarity, cell motility, cell death and survival, and actin cytoskeleton organization. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
25910 270815 cd06648 STKc_PAK_II Catalytic domain of the Serine/Threonine Kinase, Group II p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Group II PAKs, also called non-conventional PAKs, include PAK4, PAK5, and PAK6. Group II PAKs contain PBD (p21-binding domain) and catalytic domains, but lack other motifs found in group I PAKs, such as an AID (autoinhibitory domain) and SH3 binding sites. Since group II PAKs do not contain an obvious AID, they may be regulated differently from group I PAKs. While group I PAKs interact with the SH3 containing proteins Nck, Grb2 and PIX, no such binding has been demonstrated for group II PAKs. Some known substrates of group II PAKs are also substrates of group I PAKs such as Raf, BAD, LIMK and GEFH1. Unique group II substrates include MARK/Par-1 and PDZ-RhoGEF. Group II PAKs play important roles in filopodia formation, neuron extension, cytoskeletal organization, and cell survival. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
25911 132980 cd06649 PKc_MEK2 Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase 2. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK2 is a dual-specificity PK and a MAPK kinase (MAPKK or MKK) that phosphorylates and activates the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK2, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK2, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 331
25912 270816 cd06650 PKc_MEK1 Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase 1. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK1 is a dual-specificity PK and a MAPK kinase (MAPKK or MKK) that phosphorylates and activates the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK1, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK1, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. MEK1 also plays a role in cell cycle control. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 319
25913 270817 cd06651 STKc_MEKK3 Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK3 is a MAPK kinase kinase (MAPKKK or MKKK), that phosphorylates and activates the MAPK kinase MEK5 (or MKK5), which in turn phosphorylates and activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. In addition, MEKK3 is involved in interleukin-1 receptor and Toll-like receptor 4 signaling. It is also a specific regulator of the proinflammatory cytokines IL-6 and GM-CSF in some immune cells. MEKK3 also regulates calcineurin, which plays a critical role in T cell activation, apoptosis, skeletal myocyte differentiation, and cardiac hypertrophy. The MEKK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
25914 270818 cd06652 STKc_MEKK2 Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK2 is a MAPK kinase kinase (MAPKKK or MKKK), that phosphorylates and activates the MAPK kinase MEK5 (or MKK5), which in turn phosphorylates and activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK2 also activates ERK1/2, c-Jun N-terminal kinase (JNK) and p38 through their respective MAPKKs MEK1/2, JNK-activating kinase 2 (JNKK2), and MKK3/6. MEKK2 plays roles in T cell receptor signaling, immune synapse formation, cytokine gene expression, as well as in EGF and FGF receptor signaling. The MEKK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 264
25915 270819 cd06653 STKc_MEKK3_like_u1 Catalytic domain of an Uncharacterized subfamily of Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of uncharacterized proteins with similarity to MEKK3, MEKK2, and related proteins; they contain an N-terminal PB1 domain, which mediates oligomerization, and a C-terminal catalytic domain. MEKK2 and MEKK3 are MAPK kinase kinases (MAPKKKs or MKKKs), proteins that phosphorylate and activate MAPK kinases (MAPKKs or MKKs), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MEKK2 and MEKK3 activate MEK5 (also called MKK5), which activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. MEKK2 and MEKK3 can also activate the MAPKs, c-Jun N-terminal kinase (JNK) and p38, through their respective MAPKKs. The MEKK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 264
25916 270820 cd06654 STKc_PAK1 Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK1 is important in the regulation of many cellular processes including cytoskeletal dynamics, cell motility, growth, and proliferation. Although PAK1 has been regarded mainly as a cytosolic protein, recent reports indicate that PAK1 also exists in significant amounts in the nucleus, where it is involved in transcription modulation and in cell cycle regulatory events. PAK1 is also involved in transformation and tumorigenesis. Its overexpression, hyperactivation and increased nuclear accumulation are correlated with breast cancer invasiveness and progression. Nuclear accumulation is also linked to tamoxifen resistance in breast cancer cells. PAK1 belongs to the group I PAKs, which contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 296
25917 132986 cd06655 STKc_PAK2 Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK2 plays a role in pro-apoptotic signaling. It is cleaved and activated by caspases leading to morphological changes during apoptosis. PAK2 is also activated in response to a variety of stresses including DNA damage, hyperosmolarity, serum starvation, and contact inhibition, and may play a role in coordinating the stress response. PAK2 also contributes to cancer cell invasion through a mechanism distinct from that of PAK1. It belongs to the group I PAKs, which contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 296
25918 132987 cd06656 STKc_PAK3 Catalytic domain of the Protein Serine/Threonine Kinase, p21-activated kinase 3. Serine/threonine kinases (STKs), p21-activated kinase (PAK) 3, catalytic (c) domain. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs from higher eukaryotes are classified into two groups (I and II), according to their biochemical and structural features. PAK3 belongs to group I. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAK3 is highly expressed in the brain. It is implicated in neuronal plasticity, synapse formation, dendritic spine morphogenesis, cell cycle progression, neuronal migration, and apoptosis. Inactivating mutations in the PAK3 gene cause X-linked non-syndromic mental retardation, the severity of which depends on the site of the mutation. 297
25919 132988 cd06657 STKc_PAK4 Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK4 regulates cell morphology and cytoskeletal organization. It is essential for embryonic viability and proper neural development. Mice lacking PAK4 die due to defects in the fetal heart. In addition, their spinal cord motor neurons showed failure to differentiate and migrate. PAK4 also plays a role in cell survival and tumorigenesis. It is overexpressed in many primary tumors including colon, esophageal, and mammary tumors. PAK4 has also been implicated in viral and bacterial infection pathways. PAK4 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
25920 132989 cd06658 STKc_PAK5 Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK5 is mainly expressed in the brain. It is not required for viability, but together with PAK6, it is required for normal levels of locomotion and activity, and for learning and memory. PAK5 cooperates with Inca (induced in neural crest by AP2) in the regulation of cell adhesion and cytoskeletal organization in the embryo and in neural crest cells during craniofacial development. PAK5 may also play a role in controlling the signaling of Raf-1, an effector of Ras, at the mitochondria. PAK5 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
25921 270821 cd06659 STKc_PAK6 Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK6 may play a role in stress responses through its activation by the mitogen-activated protein kinase (MAPK) p38 and MAPK kinase 6 (MKK6) pathway. PAK6 is highly expressed in the brain. It is not required for viability, but together with PAK5, it is required for normal levels of locomotion and activity, and for learning and memory. Increased expression of PAK6 is found in primary and metastatic prostate cancer. PAK6 may play a role in the regulation of motility. PAK6 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 297
25922 381296 cd06660 AKR_SF Aldo-keto reductase (AKR) superfamily. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits, and aflatoxin aldehyde reductases, among others. 232
25923 119400 cd06661 GGCT_like GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC). 99
25924 119401 cd06662 SURF1 SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder. 202
25925 133456 cd06663 Biotinyl_lipoyl_domains Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue. 73
25926 143480 cd06664 IscU_like Iron-sulfur cluster scaffold-like proteins. IscU_like and NifU_like proteins. IscU and NifU function as a scaffold for the assembly of [2Fe-2S] clusters before they are transferred to apo target proteins. They are highly conserved and play vital roles in the ISC and NIF systems of Fe-S protein maturation. NIF genes participate in nitrogen fixation in several isolated bacterial species. The NifU domain, however, is also found in bacteria that do not fix nitrogen, so it may have wider significance in the cell. Human IscU interacts with frataxin, the Friedreich ataxia gene product, and incorrectly spliced IscU has been shown to disrupt iron homeostasis in skeletal muscle and cause myopathy. 123
25927 143484 cd06808 PLPDE_III Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes. The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity. 211
25928 143485 cd06810 PLPDE_III_ODC_DapDC_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Ornithine and Diaminopimelate Decarboxylases, and Related Enzymes. This family includes eukaryotic ornithine decarboxylase (ODC, EC 4.1.1.17), diaminopimelate decarboxylase (DapDC, EC 4.1.1.20), plant and prokaryotic biosynthetic arginine decarboxylase (ADC, EC 4.1.1.19), carboxynorspermidine decarboxylase (CANSDC), and ODC-like enzymes from diverse bacterial species. These proteins are fold type III PLP-dependent enzymes that catalyze essential steps in the biosynthesis of polyamine and lysine. ODC and ADC participate in alternative pathways of the biosynthesis of putrescine, which is the precursor of aliphatic polyamines in many organisms. ODC catalyzes the direct synthesis of putrescine from L-ornithine, while ADC converts L-arginine to agmatine, which is hydrolysed to putrescine by agmatinase in a pathway that exists only in plants and bacteria. DapDC converts meso-2,6-diaminoheptanedioate to L-lysine, which is the final step of lysine biosynthesis. CANSDC catalyzes the decarboxylation of carboxynorspermidine, which is the last step in the synthesis of norspermidine. The PLP-dependent decarboxylases in this family contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Prokaryotic ornithine, lysine and biodegradative arginine decarboxylases are fold type I PLP-dependent enzymes and are not included in this family. 368
25929 143486 cd06811 PLPDE_III_yhfX_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme yhfX. This subfamily is composed of the uncharacterized protein yhfX from Escherichia coli K-12 and similar bacterial proteins. These proteins are homologous to bacterial alanine racemases (AR), which are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. It catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. Members of this subfamily may act as PLP-dependent enzymes. 382
25930 143487 cd06812 PLPDE_III_DSD_D-TA_like_1 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 1. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution. 374
25931 143488 cd06813 PLPDE_III_DSD_D-TA_like_2 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 2. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution. 388
25932 143489 cd06814 PLPDE_III_DSD_D-TA_like_3 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 3. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution. 379
25933 143490 cd06815 PLPDE_III_AR_like_1 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Alanine Racemase-like 1. This subfamily is composed of uncharacterized bacterial proteins with similarity to bacterial alanine racemases (AR), which are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. It catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. Members of this subfamily may act as PLP-dependent enzymes. 353
25934 143491 cd06817 PLPDE_III_DSD Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Eukaryotic D-Serine Dehydratase. This subfamily is composed of chicken D-serine dehydratase (DSD, EC 4.3.1.18) and similar eukaryotic proteins. Chicken DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. It is a fold type III PLP-dependent enzyme with similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as dimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Experimental data suggest that chicken DSD also exists as dimers. Sequence comparison and biochemical experiments show that chicken DSD is distinct from the ubiquitous bacterial DSDs coded by the dsdA gene, mammalian L-serine dehydratases (LSD) and mammalian serine racemase (SerRac), which are fold type II PLP-dependent enzymes. 389
25935 143492 cd06818 PLPDE_III_cryptic_DSD Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Bacterial Cryptic D-Serine Dehydratase. This subfamily is composed of Burkholderia cepacia cryptic D-serine dehydratase (cryptic DSD), which is also called D-serine deaminase, and similar bacterial proteins. Members of this subfamily are fold type III PLP-dependent enzymes with similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as dimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on similarity, it is possible cryptic DSDs may also form dimers. Cryptic DSDs are distinct from the ubiquitous bacterial DSDs coded by the dsdA gene, mammalian L-serine dehydratases (LSD) and mammalian serine racemase (SerRac), which are fold type II PLP-dependent enzymes. At present, the enzymatic and biochemical properties of cryptic DSDs are still poorly understood. Typically, DSDs catalyze the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. 382
25936 143493 cd06819 PLPDE_III_LS_D-TA Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Low Specificity D-Threonine Aldolase. Low specificity D-threonine aldolase (Low specificity D-TA, EC 4.3.1.18), encoded by the dtaAS gene from Arthrobacter sp. strain DK-38, is the prototype of this subfamily. Low specificity D-TAs are fold type III PLP-dependent enzymes that catalyze the interconversion between D-threonine/D-allo-threonine and glycine plus acetaldehyde. Both PLP and divalent cations (e.g. Mn2+) are required for catalytic activity. Members of this subfamily show similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity. 358
25937 143494 cd06820 PLPDE_III_LS_D-TA_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Low Specificity D-Threonine Aldolase-like. This subfamily is composed of uncharacterized bacterial proteins with similarity to low specificity D-threonine aldolase (D-TA), which is a fold type III PLP-dependent enzyme that catalyzes the interconversion between D-threonine/D-allo-threonine and glycine plus acetaldehyde. Both PLP and divalent cations (e.g. Mn2+) are required for catalytic activity. Low specificity D-TAs show similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity. 353
25938 143495 cd06821 PLPDE_III_D-TA Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme D-Threonine Aldolase. D-threonine aldolase (D-TA, EC 4.3.1.18) reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. Its activity is present in several genera of bacteria but not in fungi. It requires PLP and a divalent cation such as Co2+, Ni2+, Mn2+, or Mg2+ as cofactors for catalytic activity and thermal stability. Members of this subfamily show similarity to bacterial alanine racemase (AR), a fold type III PLP-dependent enzyme which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity. 361
25939 143496 cd06822 PLPDE_III_YBL036c_euk Pyridoxal 5-phosphate (PLP)-binding TIM barrel domain of Type III PLP-Dependent Enzymes, Eukaryotic YBL036c-like proteins. This subfamily contains mostly uncharacterized eukaryotic proteins with similarity to the yeast hypothetical protein YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. YBL036c is a single domain monomeric protein with a typical TIM barrel fold. It binds the PLP cofactor and has been shown to exhibit amino acid racemase activity. The YBL036c structure is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. The lack of a second domain in YBL036c may explain limited D- to L-alanine racemase or non-specific racemase activity. Some members of this subfamily are also referred to as PROSC (Proline synthetase co-transcribed bacterial homolog). 227
25940 143497 cd06824 PLPDE_III_Yggs_like Pyridoxal 5-phosphate (PLP)-binding TIM barrel domain of Type III PLP-Dependent Enzymes, YggS-like proteins. This subfamily contains mainly uncharacterized proteobacterial proteins with similarity to the hypothetical Escherichia coli protein YggS, a homolog of yeast YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. Like yeast YBL036c, YggS is a single domain monomeric protein with a typical TIM-barrel fold. Its structure, which shows a covalently-bound PLP cofactor, is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. YggS has not been characterized extensively and its biological function is still unknown. 224
25941 143498 cd06825 PLPDE_III_VanT Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, VanT and similar proteins. This subfamily is composed of Enterococcus gallinarum VanT and similar proteins. VanT is a membrane-bound serine racemase (EC 5.1.1.18) that is essential for vancomycin resistance in Enterococcus gallinarum. It converts L-serine into its D-enantiomer (D-serine) for peptidoglycan synthesis. The C-terminal region of this protein contains a PLP-binding TIM-barrel domain followed by beta-sandwich domain, which is homologous to the fold type III PLP-dependent enzyme, bacterial alanine racemase (AR). AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. On the basis of this similarity, it has been suggested that dimer formation of VanT is required for its catalytic activity, and that it catalyzes the racemization of serine in a mechanistically similar manner to that of alanine by bacterial AR. Some biochemical evidence indicates that VanT also exhibits alanine racemase activity and plays a role in the racemization of L-alanine. VanT contains a unique N-terminal transmembrane domain, which may function as an L-serine transporter. VanT serine racemases are not related to eukaryotic serine racemases, which are fold type II PLP-dependent enzymes. 368
25942 143499 cd06826 PLPDE_III_AR2 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme, Alanine Racemase 2. This subfamily is composed of bacterial alanine racemases (EC 5.1.1.1) with similarity to Yersinia pestis and Vibrio cholerae alanine racemase (AR) 2. ARs catalyze the interconversion between L- and D-alanine, an essential component of the peptidoglycan layer of bacterial cell walls. These proteins are similar to other bacterial ARs and are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity. 365
25943 143500 cd06827 PLPDE_III_AR_proteobact Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Proteobacterial Alanine Racemases. This subfamily is composed mainly of proteobacterial alanine racemases (EC 5.1.1.1), fold type III PLP-dependent enzymes that catalyze the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. These proteins are similar to other bacterial ARs and are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity. 354
25944 143501 cd06828 PLPDE_III_DapDC Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Diaminopimelate Decarboxylase. Diaminopimelate decarboxylase (DapDC, EC 4.1.1.20) participates in the last step of lysine biosynthesis. It converts meso-2,6-diaminoheptanedioate to L-lysine. It is a fold type III PLP-dependent enzyme that contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. DapDC exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity. 373
25945 143502 cd06829 PLPDE_III_CANSDC Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Carboxynorspermidine Decarboxylase. Carboxynorspermidine decarboxylase (CANSDC) catalyzes the decarboxylation of carboxynorspermidine, the last step in the biosynthesis of norspermidine. It is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC), which are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. Based on this similarity, CANSDC may require homodimer formation and the presence of the PLP cofactor for its catalytic activity. 346
25946 143503 cd06830 PLPDE_III_ADC Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Arginine Decarboxylase. This subfamily includes plant and biosynthetic prokaryotic arginine decarboxylases (ADC, EC 4.1.1.19). ADC is involved in the biosynthesis of putrescine, which is the precursor of aliphatic polyamines in many organisms. It catalyzes the decarboxylation of L-arginine to agmatine, which is then hydrolyzed to putrescine by agmatinase. ADC is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC), which are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. Homodimer formation and the presence of both PLP and Mg2+ cofactors may be required for catalytic activity. Prokaryotic ADCs (biodegradative), which are fold type I PLP-dependent enzymes, are not included in this family. 409
25947 143504 cd06831 PLPDE_III_ODC_like_AZI Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Ornithine Decarboxylase-like Antizyme Inhibitor. Antizyme inhibitor (AZI) is homologous to the fold type III PLP-dependent enzyme ODC but does not retain any decarboxylase activity. Like ODC, AZI is presumed to exist as a homodimer. Antizyme is a regulatory protein that binds directly to the ODC monomer to block its active site, leading to its degradation by the 26S proteasome. AZI binds to Antizyme with a higher affinity than ODC, preventing the formation of the Antizyme-ODC complex. Thus, AZI blocks the ability of Antizyme to promote ODC degradation, which leads to increased ODC enzymatic activity and polyamine levels. AZI also prevents the degradation of other proteins regulated by Antizyme, such as cyclin D1. 394
25948 143505 cd06836 PLPDE_III_ODC_DapDC_like_1 Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Uncharacterized Proteins with similarity to Ornithine and Diaminopimelate Decarboxylases. This subfamily contains uncharacterized proteins with similarity to ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Proteins in this subfamily may function as PLP-dependent decarboxylases. Homodimer formation and the presence of the PLP cofactor may be required for catalytic activity. 379
25949 143506 cd06839 PLPDE_III_Btrk_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Btrk Decarboxylase. This subfamily is composed of Bacillus circulans BtrK decarboxylase and similar proteins. These proteins are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases, eukaryotic ornithine decarboxylases and diaminopimelate decarboxylases. BtrK is presumed to function as a PLP-dependent decarboxylase involved in the biosynthesis of the aminoglycoside antibiotic butirosin. Homodimer formation and the presence of the PLP cofactor may be required for catalytic activity. 382
25950 143507 cd06840 PLPDE_III_Bif_AspK_DapDC Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Bifunctional Aspartate Kinase/Diaminopimelate Decarboxylase. Bifunctional aspartate kinase/diaminopimelate decarboxylase (AspK/DapDC, EC 4.1.1.20/EC 2.7.2.4) typically exists in bacteria. These proteins contain an N-terminal AspK region and a C-terminal DapDC region, which contains a PLP-binding TIM-barrel domain followed by beta-sandwich domain, characteristic of fold type III PLP-dependent enzymes. Members of this subfamily have not been fully characterized. Based on their sequence, these proteins may catalyze both reactions catalyzed by AspK and DapDC. AspK catalyzes the phosphorylation of L-aspartate to produce 4-phospho-L-aspartate while DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. 368
25951 143508 cd06841 PLPDE_III_MccE_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme MccE. This subfamily is composed of uncharacterized proteins with similarity to Escherichia coli MccE, a hypothetical protein that is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Most members of this subfamily share the same domain architecture as ODC and DapDC. A few members, including Escherichia coli MccE, contain an additional acetyltransferase domain at the C-terminus. 379
25952 143509 cd06842 PLPDE_III_Y4yA_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Y4yA. This subfamily is composed of the hypothetical Rhizobium sp. protein Y4yA and similar uncharacterized bacterial proteins. These proteins are homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Proteins in this subfamily may function as PLP-dependent decarboxylases. 423
25953 143510 cd06843 PLPDE_III_PvsE_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme PvsE. This subfamily is composed of PvsE from Vibrio parahaemolyticus and similar proteins. PvsE is a vibrioferrin biosynthesis protein which is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. It has been suggested that PvsE may be involved in the biosynthesis of the polycarboxylate siderophore vibrioferrin. It may catalyze the decarboxylation of serine to yield ethanolamine. PvsE may require homodimer formation and the presence of the PLP cofactor for activity. 377
25954 132911 cd06844 STAS Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors. The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation. 100
25955 132900 cd06845 Bcl-2_like Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly. 144
25956 185704 cd06846 Adenylation_DNA_ligase_like Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases. ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but does not have enzymatic activity. 182
25957 133457 cd06848 GCS_H Glycine cleavage H-protein. Glycine cleavage H-proteins are part of the glycine cleavage system (GCS) found in bacteria, archaea and the mitochondria of eukaryotes. GCS is a multienzyme complex consisting of 4 different components (P-, H-, T- and L-proteins) which catalyzes the oxidative cleavage of glycine. The H-protein shuttles the methylamine group of glycine from the P-protein (glycine dehydrogenase) to the T-protein (aminomethyltransferase) via a lipoyl group, attached to a completely conserved lysine residue. 96
25958 133458 cd06849 lipoyl_domain Lipoyl domain of the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases. 2-oxo acid dehydrogenase multienzyme complexes, like pyruvate dehydrogenase (PDH), 2-oxoglutarate dehydrogenase (OGDH) and branched-chain 2-oxo acid dehydrogenase (BCDH), contain at least three different enzymes, 2-oxo acid dehydrogenase (E1), dihydrolipoyl acyltransferase (E2) and dihydrolipoamide dehydrogenase (E3) and play a key role in redox regulation. E2, the central component of the complex, catalyzes the transfer of the acyl group of CoA from E1 to E3 via reductive acetylation of a lipoyl group covalently attached to a lysine residue. 74
25959 133459 cd06850 biotinyl_domain The biotinyl-domain or biotin carboxyl carrier protein (BCCP) domain is present in all biotin-dependent enzymes, such as acetyl-CoA carboxylase, pyruvate carboxylase, propionyl-CoA carboxylase, methylcrotonyl-CoA carboxylase, geranyl-CoA carboxylase, oxaloacetate decarboxylase, methylmalonyl-CoA decarboxylase, transcarboxylase and urea amidolyase. This domain functions in transferring CO2 from one subsite to another, allowing carboxylation, decarboxylation, or transcarboxylation. During this process, biotin is covalently attached to a specific lysine. 67
25960 133461 cd06851 GT_GPT_like This family includes eukaryotic UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT) and archaeal GPT-like glycosyltransferases. Eukaryotic GPT catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-linked glycosylation. GPT activity has been identified in all eukaryotic cells examined to date. Evidence for the existence of the N-glycosylation pathway in archaea has emerged and genes responsible for the pathway have been identified. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene, indicating eukaryotic and archaeal enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme, which uses a different substrate. 223
25961 133462 cd06852 GT_MraY Phospho-N-acetylmuramoyl-pentapeptide-transferase (mraY) is an enzyme responsible for the formation of the first lipid intermediate in the synthesis of bacterial cell wall peptidoglycan. It catalyzes the formation of undecaprenyl-pyrophosphoryl-N-acetylmuramoyl-pentapeptide from UDP-MurNAc-pentapeptide and undecaprenyl-phosphate. It is an integral membrane protein with possibly ten transmembrane domains. 280
25962 133463 cd06853 GT_WecA_like This subfamily contains Escherichia coli WecA, Bacillus subtilis TagO and related proteins. WecA is an UDP-N-acetylglucosamine (GlcNAc):undecaprenyl-phosphate (Und-P) GlcNAc-1-phosphate transferase that catalyzes the formation of a phosphodiester bond between a membrane-associated undecaprenyl-phosphate molecule and N-acetylglucosamine 1-phosphate, which is usually donated by a soluble UDP-N-acetylglucosamine precursor. WecA participates in the biosynthesis of O antigen LPS in many enteric bacteria and is also involved in the biosynthesis of enterobacterial common antigen. A conserved short sequence motif and a conserved arginine at a cytosolic loop of this integral membrane protein were shown to be critical in recognition of substrate UDP-N-acetylglucosamine. 249
25963 133464 cd06854 GT_WbpL_WbcO_like The members of this subfamily catalyze the formation of a phosphodiester bond between a membrane-associated undecaprenyl-phosphate (Und-P) molecule and N-acetylhexosamine 1-phosphate, which is usually donated by a soluble UDP-N-acetylhexosamine precursor. The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic end products implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The subgroup of bacterial UDP-HexNAc:polyprenol-P HexNAc-1-P transferases includes the WbcO protein from Yersinia enterocolitica and the WbpL protein from Pseudomonas aeruginosa. These transferases initiate LPS O-antigen biosynthesis. Similar to other GlcNAc/MurNAc-1-P transferase family members, WbpL is a highly hydrophobic protein possessing 11 predicted transmembrane segments. 253
25964 133465 cd06855 GT_GPT_euk UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT) catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-glycosylation. GPT activity has been identified in all eukaryotic cells examined to date. A series of six conserved motifs designated A through F, ranging in length from 5 to 13 amino acid residues, has been identified in this family. They have been determined to be important for stable expression, substrate binding, or catalytic activities. 283
25965 133466 cd06856 GT_GPT_archaea UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT)-like proteins in archaea. Eukaryotic GPT catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-linked glycosylation. Evidence for the existence of the N-glycosylation pathway in archaea has emerged and genes responsible for the pathway have been identified. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene, indicating that eukaryotic and archaeal enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme, which uses a different substrate. 280
25966 271356 cd06857 SLC5-6-like_sbd Solute carrier families 5 and 6-like; solute binding domain. This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT. 407
25967 132769 cd06859 PX_SNX1_2_like The phosphoinositide binding Phox Homology domain of Sorting Nexins 1 and 2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX1, SNX2, and similar proteins. They harbor a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. Both domains have been shown to determine the specific membrane-targeting of SNX1. SNX1 and SNX2 are components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. 114
25968 132770 cd06860 PX_SNX7_30_like The phosphoinositide binding Phox Homology domain of Sorting Nexins 7 and 30. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. This subfamily consists of SNX7, SNX30, and similar proteins. They harbor a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-6, SNX8, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of the sorting nexins in this subfamily has yet to be elucidated. 116
25969 132771 cd06861 PX_Vps5p The phosphoinositide binding Phox Homology domain of yeast sorting nexin Vps5p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Vps5p is the yeast counterpart of human SNX1 and is part of the retromer complex, which functions in the endosome-to-Golgi retrieval of vacuolar protein sorting receptor Vps10p, the Golgi-resident membrane protein A-ALP, and endopeptidase Kex2. The PX domain of Vps5p binds phosphatidylinositol-3-phosphate (PI3P). Similar to SNX1, Vps5p contains a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. Both domains have been shown to determine the specific membrane-targeting of SNX1. 112
25970 132772 cd06862 PX_SNX9_18_like The phosphoinositide binding Phox Homology domain of Sorting Nexins 9 and 18. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX9, SNX18, and similar proteins. They contain an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. 125
25971 132773 cd06863 PX_Atg24p The phosphoinositide binding Phox Homology domain of yeast Atg24p, an autophagic degradation protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The yeast Atg24p is a sorting nexin (SNX) which is involved in membrane fusion events at the vacuolar surface during pexophagy. This is facilitated via binding of Atg24p to phosphatidylinositol 3-phosphate (PI3P) through its PX domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. 118
25972 132774 cd06864 PX_SNX4 The phosphoinositide binding Phox Homology domain of Sorting Nexin 4. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. It shows a similar domain architecture as SNX1-2, among others, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. SNX4 is implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and the long form of the leptin receptor. 129
25973 132775 cd06865 PX_SNX_like The phosphoinositide binding Phox Homology domain of SNX-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. This subfamily is composed of uncharacterized proteins, predominantly from plants, with similarity to sorting nexins. A few members show a similar domain architecture as a subfamily of sorting nexins, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX-BAR structural unit is known to determine specific membrane localization. 120
25974 132776 cd06866 PX_SNX8_Mvp1p_like The phosphoinositide binding Phox Homology domain of Sorting Nexin 8 and yeast Mvp1p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX8 and the yeast counterpart Mvp1p are involved in sorting and delivery of late-Golgi proteins, such as carboxypeptidase Y, to vacuoles. 105
25975 132777 cd06867 PX_SNX41_42 The phosphoinositide binding Phox Homology domain of fungal Sorting Nexins 41 and 42. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX41 and SNX42 (also called Atg20p) form dimers with SNX4, and are required in protein recycling from the sorting endosome (post-Golgi endosome) back to the late Golgi in yeast. 112
25976 132778 cd06868 PX_HS1BP3 The phosphoinositide binding Phox Homology domain of HS1BP3. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Hematopoietic lineage cell-specific protein-1 (HS1) binding protein 3 (HS1BP3) associates with HS1 proteins through their SH3 domains, suggesting a role in mediating signaling. It has been reported that HS1BP3 might affect the IL-2 signaling pathway in hematopoietic lineage cells. Mutations in HS1BP3 may also be associated with familial Parkinson disease and essential tremor. HS1BP3 contains a PX domain, a leucine zipper, motifs similar to immunoreceptor tyrosine-based inhibitory motif and proline-rich regions. The PX domain interacts with PIs and plays a role in targeting proteins to PI-enriched membranes. 120
25977 132779 cd06869 PX_UP2_fungi The phosphoinositide binding Phox Homology domain of uncharacterized fungal proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized fungal proteins containing a PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction. 119
25978 132780 cd06870 PX_CISK The phosphoinositide binding Phox Homology Domain of Cytokine-Independent Survival Kinase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Cytokine-independent survival kinase (CISK), also called Serum- and Glucocorticoid-induced Kinase 3 (SGK3), plays a role in cell growth and survival. It is expressed in most tissues and is most abundant in the embryo and adult heart and spleen. It was originally discovered in a screen for antiapoptotic genes. It phosphorylates and inhibits the proapoptotic proteins, Bad and FKHRL1. CISK/SGK3 also regulates many transporters, ion channels, and receptors. It plays a critical role in hair follicle morphogenesis and hair cycling. N-terminal to a catalytic kinase domain, CISK contains a PX domain which binds highly phosphorylated PIs, directs membrane localization, and regulates the enzyme's activity. 109
25979 132781 cd06871 PX_MONaKA The phosphoinositide binding Phox Homology domain of Modulator of Na,K-ATPase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. MONaKA (Modulator of Na,K-ATPase) binds the plasma membrane ion transporter, Na,K-ATPase, and modulates its enzymatic and ion pump activities. It modulates brain Na,K-ATPase and may be involved in regulating electrical excitability and synaptic transmission. MONaKA contains an N-terminal PX domain and a C-terminal catalytic kinase domain. The PX domain interacts with PIs and plays a role in targeting proteins to PI-enriched membranes. 120
25980 132782 cd06872 PX_SNX19_like_plant The phosphoinositide binding Phox Homology domain of uncharacterized SNX19-like plant proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized plant proteins containing an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some sorting nexins (SNXs). This is the same domain architecture found in SNX19. SNX13 and SNX14 also contain these three domains but also contain a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction. 107
25981 132783 cd06873 PX_SNX13 The phosphoinositide binding Phox Homology domain of Sorting Nexin 13. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX13, also called RGS-PX1, contains an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs. It specifically binds to the stimulatory subunit of the heterotrimeric G protein G(alpha)s, serving as its GTPase activating protein, through the RGS domain. It preferentially binds phosphatidylinositol-3-phosphate (PI3P) through the PX domain and is localized in early endosomes. SNX13 is involved in endosomal sorting of EGFR into multivesicular bodies (MVB) for delivery to the lysosome. 120
25982 132784 cd06874 PX_KIF16B_SNX23 The phosphoinositide binding Phox Homology domain of KIF16B kinesin or Sorting Nexin 23. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. KIF16B, also called sorting nexin 23 (SNX23), is a family-3 kinesin which harbors an N-terminal kinesin motor domain containing ATP and microtubule binding sites, a ForkHead Associated (FHA) domain, and a C-terminal PX domain. The PX domain of KIF16B binds to phosphatidylinositol-3-phosphate (PI3P) in early endosomes and plays a role in the transport of early endosomes to the plus end of microtubules. By regulating early endosome plus end motility, KIF16B modulates the balance between recycling and degradation of receptors. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. 127
25983 132785 cd06875 PX_IRAS The phosphoinositide binding Phox Homology domain of the Imidazoline Receptor Antisera-Selected. The PX domain is a phosphoinositide binding (PI) module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Imidazoline Receptor Antisera-Selected (IRAS), also called nischarin, contains an N-terminal PX domain, leucine rich repeats, and a predicted coiled coil domain. The PX domain of IRAS binds to phosphatidylinositol-3-phosphate in membranes. Together with the coiled coil domain, it is essential for the localization of IRAS to endosomes. IRAS has been shown to interact with integrin and inhibit cell migration. Its interaction with alpha5 integrin causes a redistribution of the receptor from the cell surface to endosomal structures, suggesting that IRAS may function as a sorting nexin (SNX) which regulates the endosomal trafficking of integrin. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. 116
25984 132786 cd06876 PX_MDM1p The phosphoinositide binding Phox Homology domain of yeast MDM1p. The PX domain is a phosphoinositide binding (PI) module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Yeast MDM1p is a filament-like protein localized in punctate structures distributed throughout the cytoplasm. It plays an important role in nuclear and mitochondrial transmission to daughter buds. Members of this subfamily show similar domain architectures as some sorting nexins (SNXs). Some members are similar to SNX19 in that they contain an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some SNXs. Others are similar to SNX13 and SNX14, which also harbor these three domains as well as a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. 133
25985 132787 cd06877 PX_SNX14 The phosphoinositide binding Phox Homology domain of Sorting Nexin 14. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX14 may be involved in recruiting other proteins to the membrane via protein-protein and protein-ligand interaction. It is expressed in the embryonic nervous system of mice, and is co-expressed in the motoneurons and the anterior pituary with Islet-1. SNX14 shows a similar domain architecture as SNX13, containing an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs. 119
25986 132788 cd06878 PX_SNX25 The phosphoinositide binding Phox Homology domain of Sorting Nexin 25. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. The function of SNX25 is not yet known. It has been found in exosomes from human malignant pleural effusions. SNX25 shows the same domain architecture as SNX13 and SNX14, containing an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs. 127
25987 132789 cd06879 PX_UP1_plant The phosphoinositide binding Phox Homology domain of uncharacterized plant proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized plant proteins containing a PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction. 138
25988 132790 cd06880 PX_SNX22 The phosphoinositide binding Phox Homology domain of Sorting Nexin 22. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX22 may be involved in recruiting other proteins to the membrane via protein-protein and protein-ligand interaction. The biological function of SNX22 is not yet known. 110
25989 132791 cd06881 PX_SNX15_like The phosphoinositide binding Phox Homology domain of Sorting Nexin 15-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily have similarity to sorting nexin 15 (SNX15), which contains an N-terminal PX domain and a C-terminal Microtubule Interacting and Trafficking (MIT) domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNX15 plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein. Other members of this subfamily contain an additional C-terminal kinase domain, similar to human RPK118, which binds sphingosine kinase and the antioxidant peroxiredoxin-3 (PRDX3). RPK118 may be involved in the transport of proteins such as PRDX3 from the cytoplasm to its site of function in the mitochondria. 117
25990 132792 cd06882 PX_p40phox The phosphoinositide binding Phox Homology domain of the p40phox subunit of NADPH oxidase. The PX domain is a phosphoinositide binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p40phox contains an N-terminal PX domain, a central SH3 domain that binds p47phox, and a C-terminal PB1 domain that interacts with p67phox. It is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p40phox positively regulates NADPH oxidase in both phosphatidylinositol-3-phosphate (PI3P)-dependent and PI3P-independent manner. The PX domain is a phospholipid-binding module involved in the membrane targeting of proteins. The p40phox PX domain binds to PI3P, an abundant lipid in phagosomal membranes, playing an important role in the localization of NADPH oxidase. The PX domain of p40phox is also involved in protein-protein interaction. 123
25991 132793 cd06883 PX_PI3K_C2 The phosphoinositide binding Phox Homology Domain of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They are also involved in the regulation of clathrin-mediated membrane trafficking as well as ATP-dependent priming of neurosecretory granule exocytosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding cataytic domain, a PX domain, and a second C2 domain at the C-terminus. Class II PI3Ks include three vertebrate isoforms (alpha, beta, and gamma), the Drosophila PI3K_68D, and similar proteins. 109
25992 132794 cd06884 PX_PI3K_C2_68D The phosphoinositide binding Phox Homology Domain of Class II Phosphoinositide 3-Kinases similar to the Drosophila PI3K_68D protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. PI3K_68D is a novel PI3K which is widely expressed throughout the Drosophila life cycle. In vitro, it has been shown to phosphorylate PI and PI4P. It is involved in signaling pathways that affect pattern formation of Drosophila wings. 111
25993 132795 cd06885 PX_SNX17_31 The phosphoinositide binding Phox Homology domain of Sorting Nexins 17 and 31. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Members of this subfamily include sorting nexin 17 (SNX17), SNX31, and similar proteins. They contain an N-terminal PX domain followed by a truncated FERM (4.1, ezrin, radixin, and moesin) domain and a unique C-terminal region. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX17 is known to regulate the trafficking and processing of a number of proteins. It binds some members of the low-density lipoprotein receptor (LDLR) family such as LDLR, VLDLR, ApoER2, and others, regulating their endocytosis. It also binds P-selectin and may regulate its lysosomal degradation. SNX17 is highly expressed in neurons. It binds amyloid precursor protein (APP) and may be involved in its intracellular trafficking and processing to amyloid beta peptide, which plays a central role in the pathogenesis of Alzheimer's disease. The biological function of SNX31 is unknown. 104
25994 132796 cd06886 PX_SNX27 The phosphoinositide binding Phox Homology domain of Sorting Nexin 27. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX27 contains an N-terminal PDZ domain followed by a PX domain and a Ras-Associated (RA) domain. It binds G protein-gated potassium (Kir3) channels, which play a role in neuronal excitability control, through its PDZ domain. SNX27 downregulates Kir3 channels by promoting their movement in the endosome, reducing surface expression and increasing degradation. SNX27 also associates with 5-hydroxytryptamine type 4 receptor (5-HT4R), cytohesin associated scaffolding protein (CASP), and diacylglycerol kinase zeta, and may play a role in their intracellular trafficking and endocytic recycling. The SNX27 PX domain preferentially binds to phosphatidylinositol-3-phosphate (PI3P) and is important for targeting to the early endosome. 106
25995 132797 cd06887 PX_p47phox The phosphoinositide binding Phox Homology domain of the p47phox subunit of NADPH oxidase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p47phox is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal PX domain, two Src Homology 3 (SH3) domains, and a C-terminal domain that contains PxxP motifs for binding SH3 domains. The PX domain of p47phox is unique in that it contains two distinct basic pockets on the membrane-binding surface: one preferentially binds phosphatidylinositol-3,4-bisphosphate [PI(3,4)P2] and is analogous to the PI3P-binding pocket of p40phox, while the other binds anionic phospholipids such as phosphatidic acid or phosphatidylserine. Simultaneous binding in the two pockets results in increased membrane affinity. The PX domain of p47phox is also involved in protein-protein interaction. 118
25996 132798 cd06888 PX_FISH The phosphoinositide binding Phox Homology domain of Five SH protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Five SH (FISH), also called Tks5, is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. FISH contains an N-terminal PX domain and five Src homology 3 (SH3) domains. FISH binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. This subfamily also includes proteins with a different number of SH3 domains than FISH, such as Tks4, which contains four SH3 domains instead of five. The Tks4 adaptor protein is required for the formation of functional podosomes. It has overlapping, but not identical, functions as FISH. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 119
25997 132799 cd06889 PX_NoxO1 The phosphoinositide binding Phox Homology domain of Nox Organizing protein 1. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1, a homolog of the p47phox subunit of phagocytic NADPH oxidase, is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of NoxO1 preferentially binds phosphatidylinositol-3,5-bisphosphate [PI(3,5)P2], PI5P, and PI4P. 121
25998 132800 cd06890 PX_Bem1p The phosphoinositide binding Phox Homology domain of Bem1p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of Bem1p specifically binds phosphatidylinositol-4-phosphate (PI4P). 112
25999 132801 cd06891 PX_Vps17p The phosphoinositide binding Phox Homology domain of yeast sorting nexin Vps17p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Vps17p forms a dimer with Vps5p, the yeast counterpart of human SNX1, and is part of the retromer complex that mediates the transport of the carboxypeptidase Y receptor Vps10p from endosomes to Golgi. Similar to Vps5p and SNX1, Vps17p harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX-BAR structural unit helps determine specific membrane localization. 140
26000 132802 cd06892 PX_SNX5_like The phosphoinositide binding Phox Homology domain of Sorting Nexins 5 and 6. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Members of this subfamily include SNX5, SNX6, and similar proteins. They contain a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs. SNX5 and SNX6 may be components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. 141
26001 132803 cd06893 PX_SNX19 The phosphoinositide binding Phox Homology domain of Sorting Nexin 19. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX19 contains an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some SNXs. These domains are also found in SNX13 and SNX14, which also contain a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNX19 interacts with IA-2, a major autoantigen found in type-1 diabetes. It inhibits the conversion of phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2] to PI(3,4,5)P3, which leads to a decrease of protein phosphorylation in the Akt signaling pathway, resulting in apoptosis. SNX19 may also be implicated in coronary heart disease and thyroid oncocytic tumors. 132
26002 132804 cd06894 PX_SNX3_like The phosphoinositide binding Phox Homology domain of Sorting Nexin 3 and related proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily is composed of SNX3, SNX12, and fungal Grd19. Grd19 is involved in the localization of late Golgi membrane proteins in yeast. SNX3/Grd19 associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer. 123
26003 132805 cd06895 PX_PLD The phosphoinositide binding Phox Homology domain of Phospholipase D. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. Members of this subfamily contain PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. PLD activity has been detected in viruses, bacteria, yeast, plants, and mammals, but the PX domain is not present in PLDs from viruses and bacteria. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. Vertebrates contain two PLD isozymes, PLD1 and PLD2. PLD1 is located mainly in intracellular membranes while PLD2 is associated with plasma membranes. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 140
26004 132806 cd06896 PX_PI3K_C2_gamma The phosphoinositide binding Phox Homology Domain of the Gamma Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II gamma isoform, PI3K-C2gamma, is expressed in the liver, breast, and prostate. Its biological function remains unknown. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 101
26005 132807 cd06897 PX_SNARE The phosphoinositide binding Phox Homology domain of SNARE proteins from fungi. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. This subfamily is composed of fungal proteins similar to Saccharomyces cerevisiae Vam7p. They contain an N-terminal PX domain and a C-terminal SNARE domain. The SNARE (Soluble NSF attachment protein receptor) family of proteins are integral membrane proteins that serve as key factors for vesicular trafficking. Vam7p is anchored at the vacuolar membrane through the specific interaction of its PX domain with phosphatidylinositol-3-phosphate (PI3P) present in bilayers. It plays an essential role in vacuole fusion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 108
26006 132808 cd06898 PX_SNX10 The phosphoinositide binding Phox Homology domain of Sorting Nexin 10. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX10 may be involved in the regulation of endosome homeostasis. Its expression induces the formation of giant vacuoles in mammalian cells. 113
26007 173887 cd06899 lectin_legume_LecRK_Arcelin_ConA legume lectins, lectin-like receptor kinases, arcelin, concanavalinA, and alpha-amylase inhibitor. This alignment model includes the legume lectins (also known as agglutinins), the arcelin (also known as phytohemagglutinin-L) family of lectin-like defense proteins, the LecRK family of lectin-like receptor kinases, concanavalinA (ConA), and an alpha-amylase inhibitor. Arcelin is a major seed glycoprotein discovered in kidney beans (Phaseolus vulgaris) that has insecticidal properties and protects the seeds from predation by larvae of various bruchids. Arcelin is devoid of monosaccharide binding properties and lacks a key metal-binding loop that is present in other members of this family. Phytohaemagglutinin (PHA) is a lectin found in plants, especially beans, that affects cell metabolism by inducing mitosis and by altering the permeability of the cell membrane to various proteins. PHA agglutinates most mammalian red blood cell types by binding glycans on the cell surface. Medically, PHA is used as a mitogen to trigger cell division in T-lymphocytes and to activate latent HIV-1 from human peripheral lymphocytes. Plant L-type lectins are primarily found in the seeds of leguminous plants where they constitute about 10% of the total soluble protein of the seed extracts. They are synthesized during seed development several weeks after flowering and transported to the vacuole where they become condensed into specialized vesicles called protein bodies. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 236
26008 173888 cd06900 lectin_VcfQ VcfQ bacterial pilus biogenesis protein, lectin domain. This family includes bacterial proteins homologous to the VcfQ (also known as MshQ) bacterial pilus biogenesis protein. VcfQ is encoded by the vcfQ gene of the type IV pilus gene cluster of Vibrio cholerae and is essential for type IV pilus assembly. VcfQ has a Laminin G-like domain as well as an L-type lectin domain. 255
26009 173889 cd06901 lectin_VIP36_VIPL VIP36 and VIPL type 1 transmembrane proteins, lectin domain. The vesicular integral protein of 36 kDa (VIP36) is a type 1 transmembrane protein of the mammalian early secretory pathway that acts as a cargo receptor transporting high mannose type glycoproteins between the Golgi and the endoplasmic reticulum (ER). Lectins of the early secretory pathway are involved in the selective transport of newly synthesized glycoproteins from the ER to the ER-Golgi intermediate compartment (ERGIC). The most prominent cycling lectin is the mannose-binding type1 membrane protein ERGIC-53, which functions as a cargo receptor to facilitate export of glycoproteins from the ER. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 248
26010 173890 cd06902 lectin_ERGIC-53_ERGL ERGIC-53 and ERGL type 1 transmembrane proteins, N-terminal lectin domain. ERGIC-53 and ERGL, N-terminal carbohydrate recognition domain. ERGIC-53 and ERGL are eukaryotic mannose-binding type 1 transmembrane proteins of the early secretory pathway that transport newly synthesized glycoproteins from the endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC). ERGIC-53 and ERGL have an N-terminal lectin-like carbohydrate recognition domain (represented by this alignment model) as well as a C-terminal transmembrane domain. ERGIC-53 functions as a 'cargo receptor' to facilitate the export of glycoproteins with different characteristics from the ER, while the ERGIC-53-like protein (ERGL) may act as a regulator of ERGIC-53. In mammals, ERGIC-53 forms a complex with MCFD2 (multi-coagulation factor deficiency 2) which then recruits blood coagulation factors V and VIII. Mutations in either MCFD2 or ERGIC-53 cause a mild form of inherited hemophilia known as combined deficiency of factors V and VIII (F5F8D). In addition to the lectin and transmembrane domains, ERGIC-53 and ERGL have a short N-terminal cytoplasmic region of about 12 amino acids. ERGIC-53 forms disulphide-linked homodimers and homohexamers. ERGIC-53 and ERGL are sequence-similar to the lectins of leguminous plants. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 225
26011 173891 cd06903 lectin_EMP46_EMP47 EMP46 and EMP47 type 1 transmembrane proteins, N-terminal lectin domain. EMP46 and EMP47, N-terminal carbohydrate recognition domain. EMP46 and EMP47 are fungal type-I transmembrane proteins that cycle between the endoplasmic reticulum and the golgi apparatus and are thought to function as cargo receptors that transport newly synthesized glycoproteins. EMP47 is a receptor for EMP46 responsible for the selective transport of EMP46 by forming hetero-oligomerization between the two proteins. EMP46 and EMP47 have an N-terminal lectin-like carbohydrate recognition domain (represented by this alignment model) as well as a C-terminal transmembrane domain. EMP46 and EMP47 are 45% sequence-identical to one another and have sequence homology to a class of intracellular lectins defined by ERGIC-53 and VIP36. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 215
26012 349475 cd06904 M14_MpaA-like Peptidase M14-like domain of Escherichia coli Murein Peptide Amidase A and related proteins. Peptidase M14-like domain of Escherichia coli Murein Peptide Amidase A (MpaA) and related proteins. MpaA is a member of the M14 family of metallocarboxypeptidases (MCPs); however, it has an exceptional type of activity: it hydrolyzes the gamma-D-glutamyl-meso-diaminopimelic acid (gamma-D-Glu-Dap) bond in murein peptides. MpaA is specific for cleavage of the gamma-D-Glu-Dap bond of free murein tripeptide; it may also cleave murein tetrapeptide. MpaA has a different substrate specificity and cellular role than endopeptidase I, ENP1 (ENP1 does not belong to this group). MpaA works on free murein peptide in the recycling pathway. 214
26013 349476 cd06905 M14-like Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism. 359
26014 349477 cd06906 M14_Nna1 Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases. Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the mouse Nna1/CCP-1, and -4 proteins, and the human Nna1/AGTPBP-1 protein. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 271
26015 349478 cd06907 M14_AGBL2-3_like Peptidase M14-like domain of ATP/GTP binding protein AGBL-2 and AGBL-3, and related proteins. Peptidase M14-like domain of ATP/GTP binding protein_like (AGBL)-2, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This subgroup includes the human AGBL-2, and -3, and the mouse cytosolic carboxypeptidase (CCPs)-2, and -3. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1 however does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 252
26016 349479 cd06908 M14_AGBL4_like Peptidase M14-like domain of ATP/GTP binding protein AGBL-4 and related proteins. Peptidase M14-like domain of ATP/GTP binding protein_like (AGBL)-4, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human AGBL4 and the mouse cytosolic carboxypeptidase (CCP)-6. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1 however does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 254
26017 349480 cd06909 M14_ASPA Peptidase M14 Aspartoacylase (ASPA) subfamily. Aspartoacylase (ASPA) belongs to the Succinylglutamate desuccinylase/aspartoacylase subfamily of the M14 family of metallocarboxypeptidases. ASPA (also known as aminoacylase 2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 190
26018 349481 cd06910 M14_ASTE_ASPA-like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 208
26019 132874 cd06911 VirB9_CagX_TrbG VirB9/CagX/TrbG, a component of the type IV secretion system. VirB9 is a component of the type IV secretion system, which is employed by pathogenic bacteria to export virulence proteins directly from the bacterial cytoplasm into the host cell. Unlike the more common type III secretion system, type IV systems evolved from the conjugative apparatus, which is used to transfer DNA between cells. VirB9 was initially identified as an essential virulence gene on the Agrobacterium tumefaciens Ti plasmid. In the pilin-like conjugative structure, VirB9 appears to form a stabilizing complex in the outer membrane, by interacting with the lipoprotein VirB7. The heterodimer has been shown to stabilize other components of the type IV system. This alignment model spans the C-terminal domain of VirB9. CagX is a component of the Helicobacter pylori cag PAI-encoded type IV secretion system. Some other members of this family are involved in conjugal transfer to T-DNA of plant cells. 86
26020 133467 cd06912 GT_MraY_like This subfamily is composed of uncharacterized bacterial glycosyltransferases in the MraY-like family. This family contains both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases, which catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate. This is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. The three bacterial members MraY, WecA, and WbpL/WbcO, utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic end products implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The prokaryotic enzyme-catalyzed reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly. 193
26021 133063 cd06913 beta3GnTL1_like Beta 1, 3-N-acetylglucosaminyltransferase is essential for the formation of poly-N-acetyllactosamine. This family includes human Beta3GnTL1 and related eukaryotic proteins. Human Beta3GnTL1 is a putative beta-1,3-N-acetylglucosaminyltransferase. Beta3GnTL1 is expressed at various levels in most of the tissues examined. Beta 1, 3-N-acetylglucosaminyltransferase has been found to be essential for the formation of poly-N-acetyllactosamine. Poly-N-acetyllactosamine is a unique carbohydrate composed of N-acetyllactosamine repeats. It is often an important part of cell-type-specific oligosaccharide structures and some functional oligosaccharides. It has been shown that the structure and biosynthesis of poly-N-acetyllactosamine display a dramatic change during development and oncogenesis. Several members of beta-1, 3-N-acetylglucosaminyltransferase have been identified. 219
26022 133064 cd06914 GT8_GNT1 GNT1 is a fungal enzyme that belongs to the GT 8 family. N-acetylglucosaminyltransferase is a fungal enzyme that catalyzes the addition of N-acetyl-D-glucosamine to mannotetraose side chains by an alpha 1-2 linkage during the synthesis of mannan. The N-acetyl-D-glucosamine moiety in mannan plays a role in the attachment of mannan to asparagine residues in proteins. The mannotetraose and its N-acetyl-D-glucosamine derivative side chains of mannan are the principal immunochemical determinants on the cell surface. N-acetylglucosaminyltransferase is a member of glycosyltransferase family 8, which are, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed, retaining glycosyltransferases. 278
26023 133065 cd06915 NTP_transferase_WcbM_like WcbM_like is a subfamily of nucleotidyl transferases. WcbM protein of Burkholderia mallei is involved in the biosynthesis, export or translocation of capsule. It is a subfamily of nucleotidyl transferases that transfer nucleotides onto phosphosugars. 223
26024 143512 cd06916 NR_DBD_like DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers. 72
26025 270822 cd06917 STKc_NAK1_like Catalytic domain of Fungal Nak1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Nak1, Saccharomyces cerevisiae Kic1p (kinase that interacts with Cdc31p) and related proteins. Nak1 (also called N-rich kinase 1), is required by fission yeast for polarizing the tips of actin cytoskeleton and is involved in cell growth, cell separation, cell morphology and cell-cycle progression. Kic1p is required by budding yeast for cell integrity and morphogenesis. Kic1p interacts with Cdc31p, the yeast homologue of centrin, and phosphorylates substrates in a Cdc31p-dependent manner. The Nak1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
26026 132994 cd06919 Asp_decarbox Aspartate alpha-decarboxylase or L-aspartate 1-decarboxylase, a pyruvoyl group-dependent decarboxylase in beta-alanine production. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalyzed by the enzyme L-aspartate decarboxylase (ADC), EC:4.1.1.11 which requires a pyruvoyl group for its activity. The pyruvoyl cofactor is covalently bound to the enzyme. The protein is synthesized as a proenzyme and cleaved via self-processing at Gly23-Ser24 to yield an alpha chain (C-terminal fragment) and beta chain (N-terminal fragment), and the pyruvoyl group. Beta-alanine is required for the biosynthesis of pantothenate, in which the enzyme plays a critical regulatory role. The active site of the tetrameric enzyme is located at the interface of two subunits, with a Lysine and a Histidine from the beta chain of one subunit forming the active site with residues from the alpha chain of the adjacent subunit. This alignment model spans the precursor (or both beta and alpha chains) of aspartate decarboxylase. 111
26027 132993 cd06920 NEAT NEAr Transport domain, a component of cell surface proteins. NEAr Transporter (NEAT) domain; used by pathogenic bacteria to scavenge heme-iron from host hemoproteins. The NEAT domain is a component of cell surface proteins (iron regulated surface determinants, or Isd, such as IsdA and IsdC) in various gram-positive bacteria, and may be arranged in tandem repeats. 117
26028 211312 cd06921 ChtBD1_GH19_hevein Hevein or Type 1 chitin binding domain subfamily co-occurring with family 19 glycosyl hydrolases or with barwin domains. This subfamily includes Hevein, a major IgE-binding allergen in natural rubber latex. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 40
26029 211313 cd06922 ChtBD1_GH18_1 Hevein or Type 1 chitin binding domain subfamily that co-occurs with family 18 glycosyl hydrolases. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 38
26030 211314 cd06923 ChtBD1_GH16 Hevein or Type 1 chitin binding domain subfamily that co-occurs with family 16 glycosyl hydrolases. This subfamily includes Saccharomyces cerevisiae Utr2p, also known as Crh2p, which participates in the cross-linking of chitin to beta(1-3)- and beta(1-6) glucan in the cell wall, and S. cerevisiae Crr1p, a putative transglycosidase which is needed for proper spore wall assembly. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 47
26031 132902 cd06926 RNAP_II_RPB11 RPB11 subunit of Eukaryotic RNA polymerase II. The eukaryotic RPB11 subunit of RNA polymerase (RNAP) II is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP II is responsible for the synthesis of mRNA precursor. The RPB11 subunit heterodimerizes with the RPB3 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. 93
26032 132903 cd06927 RNAP_L L subunit of Archaeal RNA polymerase. The archaeal L subunit of RNA polymerase (RNAP) is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. A single distinct RNAP complex is found in archaea, which may be responsible for the synthesis of all RNAs. The archaeal RNAP harbors homologues of all eukaryotic RNAP II subunits with two exceptions (RPB8 and RPB9). The 12 archaeal subunits are designated by letters and can be divided into three functional groups that are engaged in: (I) catalysis (A'/A", B'/B" or B); (II) assembly (L, N, D and P); and (III) auxiliary functions (F, E, H and K). The assembly of the two largest archaeal RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of the archaeal D/L heterodimer. 83
26033 132904 cd06928 RNAP_alpha_NTD N-terminal domain of the Alpha subunit of Bacterial RNA polymerase. The bacterial alpha subunit of RNA polymerase (RNAP) consists of two independently folded domains: an amino-terminal domain (alphaNTD) and a carboxy-terminal domain (alphaCTD). AlphaCTD is not required for RNAP assembly but interacts with transcription activators. AlphaNTD is essential in vivo and in vitro for RNAP assembly and basal transcription. It is similar to the eukaryotic RPB3/AC40/archaeal D subunit, and contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The alphaNTDs of plant plastid RNAP (PEP) are also included in this subfamily. PEP is largely responsible for the transcription of photosynthetic genes and is closely related to the multi-subunit bacterial RNAP, which is a large multi-subunit complex responsible for the synthesis of all bacterial RNAs. The bacterial RNAP core enzyme consists of four subunits (beta', beta, alpha and omega). All residues in the alpha subunit that are involved in dimerization or in the interaction with other subunits are located within alphaNTD. 215
26034 132727 cd06929 NR_LBD_F1 Ligand-binding domain of nuclear receptor family 1. Ligand-binding domain (LBD) of nuclear receptor (NR) family 1: This is one of the major subfamilies of nuclear receptors, including thyroid receptor, retinoic acid receptor, ecdysone receptor, farnesoid X receptor, vitamin D receptor, and other related receptors. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 174
26035 132728 cd06930 NR_LBD_F2 Ligand-binding domain of nuclear receptor family 2. Ligand-binding domain (LBD) of nuclear receptor (NR) family 2: This is one of the major subfamilies of nuclear receptors, including some well known nuclear receptors such as glucocorticoid receptor (GR), mineralocorticoid receptor (MR), estrogen receptor (ER), progesterone receptor (PR), and androgen receptor (AR), and other related receptors. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 165
26036 132729 cd06931 NR_LBD_HNF4_like The ligand binding domain of hepatocyte nuclear factor 4, which is explosively expanded in nematodes. The ligand binding domain of hepatocyte nuclear factor 4 (HNF4) like proteins: HNF4 is a member of the nuclear receptor superfamily. HNF4 plays a key role in establishing and maintaining hepatocyte differentiation in the liver. It is also expressed in gut, kidney, and pancreatic beta cells. HNF4 was originally classified as an orphan receptor, but it was later found that HNF4 binds with very high affinity to a variety of fatty acids. However, unlike other nuclear receptors, the ligands do not act as a molecular switch for HNF4. They seem to constantly bind to the receptor, which is constitutively active as a transcription activator. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, HNF4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD domain is also responsible for recruiting co-activator proteins. More than 280 nuclear receptors are found in C. elegans, most of which originated from an explosive burst of duplications of HNF4. 222
26037 132730 cd06932 NR_LBD_PPAR The ligand binding domain of peroxisome proliferator-activated receptors. The ligand binding domain (LBD) of peroxisome proliferator-activated receptors (PPAR): Peroxisome proliferator-activated receptors (PPARs) are members of the nuclear receptor superfamily of ligand-activated transcription factors. PPARs play important roles in regulating cellular differentiation, development and lipid metabolism. Activated PPAR forms a heterodimer with the retinoid X receptor (RXR) that binds to the hormone response element located upstream of the peroxisome proliferator responsive genes and interacts with co-activators. There are three subtypes of peroxisome proliferator activated receptors, alpha, beta (or delta), and gamma, each with a distinct tissue distribution. Several essential fatty acids, oxidized lipids and prostaglandin J derivatives can bind and activate PPAR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR has a central well conserved DNA binding domain (DBD), a variable N-terminal regulatory domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 259
26038 132731 cd06933 NR_LBD_VDR The ligand binding domain of vitamin D receptors, a member of the nuclear receptor superfamily. The ligand binding domain of vitamin D receptors (VDR): VDR is a member of the nuclear receptor (NR) superfamily that functions as classical endocrine receptors. VDR controls a wide range of biological activities including calcium metabolism, cell proliferation and differentiation, and immunomodulation. VDR is a high affinity receptor for the biologically most active Vitamin D metabolite, 1alpha,25-dihydroxyvitamin D3 (1alpha,25(OH)2D3). The binding of the ligand to the receptor induces a conformational change of the ligand binding domain (LBD) with consequent dissociation of corepressors. Upon ligand binding, VDR forms heterodimer with the retinoid X receptor (RXR) that binds to vitamin D response elements (VDREs), recruits coactivators. This leads to the expression of a large number of genes. Approximately 200 human genes are considered to be primary targets of VDR and even more genes are regulated indirectly. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, VDR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 238
26039 132732 cd06934 NR_LBD_PXR_like The ligand binding domain of xenobiotic receptors:pregnane X receptor and constitutive androstane receptor. The ligand binding domain of xenobiotic receptors: This xenobiotic receptor family includes pregnane X receptor (PXR), constitutive androstane receptor (CAR) and other related nuclear receptors. They function as sensors of toxic byproducts of cell metabolism and of exogenous chemicals, to facilitate their elimination. The nuclear receptor pregnane X receptor (PXR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. The ligand binding domain of PXR shows remarkable flexibility to accommodate both large and small molecules. PXR functions as a heterodimer with retinoic X receptor-alpha (RXRa) and binds to a variety of response elements in the promoter regions of a diverse set of target genes involved in the metabolism, transport, and elimination of these molecules from the cell. Constitutive androstane receptor (CAR) is a closest mammalian relative of PXR, which has also been proposed to function as a xenosensor. CAR is activated by some of the same ligands as PXR and regulates a subset of common genes. The sequence homology and functional similarity suggests that the CAR gene arose from a duplication of an ancestral PXR gene. Like other nuclear receptors, xenobiotic receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 226
26040 132733 cd06935 NR_LBD_TR The ligand binding domain of thyroid hormone receptor, a member of a superfamily of nuclear receptors. The ligand binding domain (LBD) of thyroid hormone receptors: Thyroid hormone receptors are members of a superfamily of nuclear receptors. Thyroid hormone receptors (TR) mediate the actions of thyroid hormones, which play critical roles in growth, development, and homeostasis in mammals. They regulate overall metabolic rate, cholesterol and triglyceride levels, and heart rate, and affect mood. TRs are expressed from two separate genes (alpha and beta) in human and each gene generates two isoforms of the receptor through differential promoter usage or splicing. TRalpha functions in the heart to regulate heart rate and rhythm and TRbeta is active in the liver and other tissues. The unliganded TRs function as transcription repressors, by binding to thyroid hormone response elements (TRE) predominantly as homodimers, or as heterodimers with retinoid X-receptors (RXR), and being associated with a complex of proteins containing corepressor proteins. Ligand binding promotes corepressor dissociation and binding of a coactivator to activate transcription. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 243
26041 132734 cd06936 NR_LBD_Fxr The ligand binding domain of Farnesoid X receptor:a member of the nuclear receptor superfamily of ligand-activated transcription factors. The ligand binding domain (LBD) of Farnesoid X receptor: Farnesoid X receptor (FXR) is a member of the nuclear receptor superfamily of ligand-activated transcription factors. FXR is highly expressed in the liver, the intestine, the kidney, and the adrenals. FXR plays key roles in the regulation of bile acid, cholesterol, triglyceride, and glucose metabolism. Evidences show that it also regulates liver regeneration. Upon binding of ligands, such as bile acid, an endogenous ligand, FXRs bind to FXR response elements (FXREs) either as a monomer or as a heterodimer with retinoid X receptor (RXR), and regulate the expression of various genes involved in bile acid, lipid, and glucose metabolism. There are two FXR genes (FXRalpha and FXRbeta) in mammals. A single FXRalpha gene encodes four isoforms resulting from differential use of promoters and alternative splicing. FXRbeta is a functional receptor in mice, rats, rabbits and dogs, but is a pseudogene in humans and primates. Like other members of the nuclear receptor (NR) superfamily, farnesoid X receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 221
26042 132735 cd06937 NR_LBD_RAR The ligand binding domain (LBD) of retinoic acid receptor (RAR), a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of retinoic acid receptor (RAR): Retinoic acid receptors are members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors. RARs mediate the biological effect of retinoids, including both naturally dietary vitamin A (retinol) metabolites and active synthetic analogs. Retinoids play key roles in a wide variety of essential biological processes, such as vertebrate embryonic morphogenesis and organogenesis, differentiation and apoptosis, and homeostasis. RARs function as heterodimers with retinoic X receptors by binding to specific RAR response elements (RAREs) found in the promoter regions of retinoid target genes. In the absence of ligand, the RAR-RXR heterodimer recruits the corepressor proteins NCoR or AMRT, and associated factors such as histone deacetylases or DNA-methyltransferases, leading to an inactive condensed chromatin structure, preventing transcription. Upon ligand binding, the corepressors are released, and coactivator complexes such as histone acetyltransferase or histone arginine methyltransferases are recruited to activate transcription. There are three RAR subtypes (alpha, beta, gamma), originating from three distinct genes. For each subtype, several isoforms exist that differ in their N-terminal region, allowing retinoids to exert their pleiotropic effects. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoic acid receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 231
26043 132736 cd06938 NR_LBD_EcR The ligand binding domain (LBD) of the Ecdysone receptor, a member of the nuclear receptors super family. The ligand binding domain (LBD) of the ecdysone receptor: The ecdysone receptor (EcR) belongs to the superfamily of nuclear receptors (NRs) of ligand-dependent transcription factors. Ecdysone receptor is present only in invertebrates and regulates the expression of a large number of genes during development and reproduction. ECR functions as a heterodimer by partnering with ultraspiracle protein (USP), the ortholog of the vertebrate retinoid X receptor (RXR). The natural ligands of ecdysone receptor are ecdysteroids, the endogenous steroidal hormones found in invertebrates. In addition, insecticide bisacylhydrazine used against pests has been shown to act on EcR. EcR must be dimerised with a USP for high-affinity ligand binding to occur. The ligand binding triggers a conformational change in the C-terminal part of the EcR ligand-binding domain that leads to transcriptional activation of genes controlled by EcR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ecdysone receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 231
26044 132737 cd06939 NR_LBD_ROR_like The ligand binding domain of Retinoid-related orphan receptors, of the nuclear receptor superfamily. The ligand binding domain (LBD) of Retinoid-related orphan receptors (RORs): Retinoid-related orphan receptors (RORs) are transcription factors belonging to the nuclear receptor superfamily. RORs are key regulators of many physiological processes during embryonic development. RORs bind as monomers to specific ROR response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by a 5-bp A/T-rich sequence. Transcription regulation by RORs is mediated through certain corepressors, as well as coactivators. There are three subtypes of retinoid-related orphan receptors (RORs), alpha, beta, and gamma that differ only in N-terminal sequence and are distributed in distinct tissues. RORalpha plays a key role in the development of the cerebellum, particularly in the regulation of the maturation and survival of Purkinje cells. RORbeta expression is largely restricted to several regions of the brain, the retina, and pineal gland. RORgamma is essential for lymph node organogenesis. Recently, it has been suggested that cholesterol or a cholesterol derivative is the natural ligand of RORalpha. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoid-related orphan receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 241
26045 132738 cd06940 NR_LBD_REV_ERB The ligand binding domain of REV-ERB receptors, members of the nuclear receptor superfamily. The ligand binding domain (LBD) of REV-ERB receptors: REV-ERBs are transcriptional regulators belonging to the nuclear receptor superfamily. They regulate a number of physiological functions including the circadian rhythm, lipid metabolism, and cellular differentiation. The LBD domain of REV-ERB is unusual in the nuclear receptor family by lacking the AF-2 region that is responsible for coactivator interaction. REV-ERBs act as constitutive repressors because of their inability to bind coactivators. REV-ERB receptors can bind to two classes of DNA response elements as either a monomer or heterodimer, indicating functional diversity. When bound to the DNA, they recruit corepressors (NcoR/histone deacetylase 3) to the promoter, resulting in repression of the target gene. The porphyrin heme has been demonstrated to function as a ligand for REV-ERB. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, REV-ERB receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 189
26046 132739 cd06941 NR_LBD_DmE78_like The ligand binding domain of Drosophila ecdysone-induced protein 78, a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of Drosophila ecdysone-induced protein 78 (E78) like: Drosophila ecdysone-induced protein 78 (E78) is a transcription factor belonging to the nuclear receptor superfamily. E78 is a product of the ecdysone-inducible gene found in an early late puff locus at position 78C during the onset of Drosophila metamorphosis. Two isoforms of E78, E78A and E78B, are expressed from two nested transcription units. An E78 orthologue from the Platyhelminth Schistosoma mansoni (SmE78) has also been identified. It is the first E78 orthologue known outside of the molting animals--the Ecdysozoa. SmE78 may be involved in transduction of an ecdysone signal in S. mansoni, consistent with its function in Drosophila. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, E78-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 195
26047 132740 cd06942 NR_LBD_Sex_1_like The ligand binding domain of Caenorhabditis elegans nuclear hormone receptor Sex-1 protein. The ligand binding domain (LBD) of Caenorhabditis elegans nuclear hormone receptor Sex-1 protein like: Sex-1 protein of C. elegans is a transcription factor belonging to the nuclear receptor superfamily. Sex-1 plays pivotal role in sex fate of C. elegans by regulating the transcription of the sex-determination gene xol-1, which specifies male (XO) fate when active and hermaphrodite (XX) fate when inactive. The Sex-1 protein directly represses xol-1 transcription by binding to its promoter. However, the active ligand for Sex-1 protein has not yet been identified. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, Sex-1 like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 191
26048 132741 cd06943 NR_LBD_RXR_like The ligand binding domain of the retinoid X receptor and Ultraspiracle, members of nuclear receptor superfamily. The ligand binding domain of the retinoid X receptor (RXR) and Ultraspiracle (USP): This family includes two evolutionary related nuclear receptors: retinoid X receptor (RXR) and Ultraspiracle (USP). RXR is a nuclear receptor in mammalian and USP is its counterpart in invertebrates. The native ligand of retinoid X receptor is 9-cis retinoic acid (RA). RXR functions as a DNA binding partner by forming heterodimers with other nuclear receptors including CAR, FXR, LXR, PPAR, PXR, RAR, TR, and VDR. RXRs can play different roles in these heterodimers. It acts either as a structural component of the heterodimer complex, required for DNA binding but not acting as a receptor or as both a structural and a functional component of the heterodimer, allowing 9-cis RA to signal through the corresponding heterodimer. In addition, RXR can also form homodimers, functioning as a receptor for 9-cis RA, independently of other nuclear receptors. Ultraspiracle (USP) plays similar roles as DNA binding partner of other nuclear receptors in invertebrates. USP has no known high-affinity ligand and is thought to be a silent component in the heterodimeric complex with partner receptors. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, RXR and USP have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 207
26049 132742 cd06944 NR_LBD_Ftz-F1_like The ligand binding domain of FTZ-F1 like nuclear receptors. The ligand binding domain of FTZ-F1 like nuclear receptors: This nuclear receptor family includes at least three subgroups of receptors that function in embryo development and differentiation, and other processes. FTZ-F1 interacts with the cis-acting DNA motif of ftz gene, which required at several stages of development. Particularly, FTZ-F1 genes are strongly linked to steroid biosynthesis and sex-determination; LRH-1 is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development. SF-1 is an essential regulator of endocrine development and function and is considered a master regulator of reproduction; SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been identified as potential ligand for LRH-1 and steroidogenic factor-1 (SF-1). However, the ligand for FTZ-F1 has not yet been identified. Most nuclear receptors function as homodimer or heterodimers. However, LRH-1 and SF-1 bind to DNA as a monomer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, receptors in this family have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 237
26050 132743 cd06945 NR_LBD_Nurr1_like The ligand binding domain of Nurr1 and related nuclear receptor proteins, members of nuclear receptor superfamily. The ligand binding domain of nuclear receptor Nurr1_like: This family of nuclear receptors, including Nurr1, Nerve growth factor-induced-B (NGFI-B) and DHR38 are involved in the embryo development. Nurr1 is a transcription factor that is expressed in the embryonic ventral midbrain and is critical for the development of dopamine (DA) neurons. Structural studies have shown that the ligand binding pocket of Nurr1 is filled by bulky hydrophobic residues, making it unable to bind to ligands. Therefore, it belongs to the class of orphan receptors. However, Nurr1 forms heterodimers with RXR and can promote signaling via its partner, RXR. NGFI-B is an early immediate gene product of embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. Another group of receptor in this family is DHR38. DHR38 is the Drosophila homolog to the vertebrate NGFI-B-type orphan receptor. It interacts with the USP component of the ecdysone receptor complex, suggesting that DHR38 might modulate ecdysone-triggered signals in the fly, in addition to the ECR/USP pathway. Nurr1_like proteins exhibit a modular structure that is characteristic for nuclear receptors; they have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 239
26051 132744 cd06946 NR_LBD_ERR The ligand binding domain of estrogen receptor-related nuclear receptors. The ligand binding domain of estrogen receptor-related receptors (ERRs): The family of estrogen receptor-related receptors (ERRs), a subfamily of nuclear receptors, is closely related to the estrogen receptor (ER) family, but it lacks the ability to bind estrogen. ERRs can interfere with the classic ER-mediated estrogen signaling pathway, positively or negatively. ERRs share target genes, co-regulators and promoters with the estrogen receptor (ER) family. There are three subtypes of ERRs: alpha, beta and gamma. ERRs bind at least two types of DNA sequence, the estrogen response element and another site, originally characterized as SF-1 (steroidogenic factor 1) response element. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ERR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 221
26052 132745 cd06947 NR_LBD_GR_Like Ligand binding domain of nuclear hormone receptors:glucocorticoid receptor, mineralocorticoid receptor , progesterone receptor, and androgen receptor. The ligand binding domain of GR_like nuclear receptors: This family of NRs includes four distinct, but closely related nuclear hormone receptors: glucocorticoid receptor (GR), mineralocorticoid receptor (MR), progesterone receptor (PR), and androgen receptor (AR). These four receptors play key roles in some of the most fundamental physiological functions such as the stress response, metabolism, electrolyte homeostasis, immune function, growth, development, and reproduction. The NRs in this family use multiple signaling pathways and share similar functional mechanisms. The dominant signaling pathway is via direct DNA binding and transcriptional regulation of target genes. Another mechanism is via protein-protein interactions, mainly with other transcription factors such as nuclear factor-kappaB and activator protein-1, to regulate gene expression patterns. Both pathways can up-regulate or down-regulate gene expression and require ligand activation of the receptor and recruitment of other cofactors such as chaperone proteins and coregulator proteins. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR, MR, PR, and AR share the same modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 246
26053 132746 cd06948 NR_LBD_COUP-TF Ligand binding domain of chicken ovalbumin upstream promoter transcription factors, a member of the nuclear receptor family. The ligand binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs): COUP-TFs are orphan members of the steroid/thyroid hormone receptor superfamily. They are expressed in many tissues and are involved in the regulation of several important biological processes, such as neurogenesis, organogenesis, cell fate determination, and metabolic homeostasis. In mammals two isoforms named COUP-TFI and COUP-TFII have been identified. Both genes show an exceptional homology and overlapping expression patterns, suggesting that they may serve redundant functions. Although COUP-TF was originally characterized as a transcriptional activator of the chicken ovalbumin gene, COUP-TFs are generally considered to be repressors of transcription for other nuclear hormone receptors, such as retinoic acid receptor (RAR), thyroid hormone receptor (TR), vitamin D receptor (VDR), peroxisome proliferator activated receptor (PPAR), and hepatocyte nuclear factor 4 (HNF4). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, COUP-TFs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 236
26054 132747 cd06949 NR_LBD_ER Ligand binding domain of Estrogen receptor, which are activated by the hormone 17beta-estradiol (estrogen). The ligand binding domain (LBD) of Estrogen receptor (ER): Estrogen receptor, a member of nuclear receptor superfamily, is activated by the hormone estrogen. Estrogen regulates many physiological processes including reproduction, bone integrity, cardiovascular health, and behavior. The main mechanism of action of the estrogen receptor is as a transcription factor by binding to the estrogen response element of target genes upon activation by estrogen and then recruiting coactivator proteins which are responsible for the transcription of target genes. Additionally some ERs may associate with other membrane proteins and can be rapidly activated by exposure of cells to estrogen. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The C-terminal LBD also contains AF-2 activation motif, the dimerization motif, and part of the nuclear localization region. Estrogen receptor has been linked to aging, cancer, obesity and other diseases. 235
26055 132748 cd06950 NR_LBD_Tlx_PNR_like The ligand binding domain of Tailless-like proteins, orphan nuclear receptors. The ligand binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like family: This family includes photoreceptor cell-specific nuclear receptor (PNR), Tailless (TLX), and related receptors. TLX is an orphan receptor that is expressed by neural stem/progenitor cells in the adult brain of the subventricular zone (SVZ) and the dentate gyrus (DG). It plays a key role in neural development by promoting cell cycle progression and preventing apoptosis in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TLX and PNR have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 206
26056 132749 cd06951 NR_LBD_Dax1_like The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of DAX1-like proteins: This orphan nuclear receptor family includes DAX1 (dosage-sensitive sex reversal adrenal hypoplasia congenita critical region on chromosome X gene 1) and the Small Heterodimer Partner (SHP). Both receptors have a typical ligand binding domain, but lack the DNA binding domain, typical to almost all of the nuclear receptors. They function as a transcriptional coregulator by directly interacting with other nuclear receptors. DAX1 and SHP can form heterodimers with each other, as well as with many other nuclear receptors. In addition, DAX1 can also form homodimers. DAX1 plays an important role in the normal development of several hormone-producing tissues. SHP has shown to regulate a variety of target genes. 222
26057 132750 cd06952 NR_LBD_TR2_like The ligand binding domain of the orphan nuclear receptors TR4 and TR2. The ligand binding domain of the TR4 and TR2 (human testicular receptor 4 and 2): TR4 and TR2 are orphan nuclear receptors. Several isoforms of TR4 and TR2 have been isolated in various tissues. TR2 is abundantly expressed in the androgen-sensitive prostate. TR4 transcripts are expressed in many tissues, including central nervous system, adrenal gland, spleen, thyroid gland, and prostate. The expression of TR2 is negatively regulated by androgen, retinoids, and radiation. The expression of both mouse TR2 and TR4 is up-regulated by neurocytokine ciliary neurotrophic factor (CNTF) in mouse. It has shown that human TR2 binds to a wide spectrum of natural hormone response elements (HREs) with distinct affinities suggesting that TR2 may cross-talk with other gene expression regulation systems. The genes responding to TR2 or TR4 include genes that are regulated by retinoic acid receptor, vitamin D receptor, peroxisome proliferator-activated receptor. TR4/2 binds to HREs as a dimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR2-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 222
26058 132751 cd06953 NR_LBD_DHR4_like The ligand binding domain of orphan nuclear receptor Ecdysone-induced receptor DHR4. The ligand binding domain of Ecdysone-induced receptor DHR4: Ecdysone-induced orphan receptor DHR4 is a member of the nuclear receptor family. DHR4 is expressed during the early Drosophila larval development and is induced by ecdysone. DHR4 coordinates growth and maturation in Drosophila by mediating endocrine response to the attainment of proper body size during larval development. Mutations in DHR4 result in shorter larval development which translates into smaller and lighter flies. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 213
26059 132752 cd06954 NR_LBD_LXR The ligand binding domain of Liver X receptors, a family of nuclear receptors of ligand-activated transcription factors. The ligand binding domain of Liver X receptors: Liver X receptors (LXRs) belong to a family of nuclear receptors of ligand-activated transcription factors. LXRs operate as cholesterol sensors which protect from cholesterol overload by stimulating reverse cholesterol transport from peripheral tissues to the liver and its excretion in the bile. Oxidized cholesterol derivatives or oxysterols were identified as specific ligands for LXRs. Upon ligand binding a conformational change leads to recruitment of co-factors, which stimulates expression of target genes. Among the LXR target genes are several genes involved in cholesterol efflux from peripheral tissues such as the ATP-binding-cassette transporters ABCA1, ABCG1 and ApoE. There are two LXR isoforms in mammals, LXRalpha and LXRbeta. LXRalpha is expressed mainly in the liver, intestine, kidney, spleen, and adipose tissue, whereas LXRbeta is ubiquitously expressed at lower level. Both LXRalpha and LXRbeta function as heterodimers with the retinoid X receptor (RXR) which may be activated by either LXR ligands or 9-cis retinoic acid, a specific RXR ligand. The LXR/RXR complex binds to a liver X receptor response element (LXRE) in the promoter region of target genes. LXR has typical NR modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and the ligand binding domain (LBD) at the C-terminal. 236
26060 143513 cd06955 NR_DBD_VDR DNA-binding domain of vitamin D receptors (VDR) is composed of two C4-type zinc fingers. DNA-binding domain of vitamin D receptors (VDR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. VDR interacts with a VDR response element, a direct repeat of GGTTCA DNA site with 3 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation. VDR is a member of the nuclear receptor (NR) superfamily that functions as classical endocrine receptors. VDR controls a wide range of biological activities including calcium metabolism, cell proliferation and differentiation, and immunomodulation. VDR is a high-affinity receptor for the biologically most active Vitamin D metabolite, 1alpha,25-dihydroxyvitamin D3 (1alpha,25(OH)2D3). The binding of the ligand to the receptor induces a conformational change of the ligand binding domain (LBD) with consequent dissociation of corepressors. Upon ligand binding, VDR forms a heterodimer with the retinoid X receptor (RXR) that binds to vitamin D response elements (VDREs), recruits coactivators. This leads to the expression of a large number of genes. Approximately 200 human genes are considered to be primary targets of VDR and even more genes are regulated indirectly. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, VDR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 107
26061 143514 cd06956 NR_DBD_RXR DNA-binding domain of retinoid X receptor (RXR) is composed of two C4-type zinc fingers. DNA-binding domain of retinoid X receptor (RXR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. RXR functions as a DNA binding partner by forming heterodimers with other nuclear receptors including CAR, FXR, LXR, PPAR, PXR, RAR, TR, and VDR. All RXR heterodimers preferentially bind response elements composed of direct repeats of two AGGTCA sites with a 1-5 bp spacer. RXRs can play different roles in these heterodimers. RXR acts either as a structural component of the heterodimer complex, required for DNA binding but not acting as a receptor, or as both a structural and a functional component of the heterodimer, allowing 9-cis RA to signal through the corresponding heterodimer. In addition, RXR can also form homodimers, functioning as a receptor for 9-cis RA, independently of other nuclear receptors. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, RXR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 77
26062 143515 cd06957 NR_DBD_PNR_like_2 DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like is composed of two C4-type zinc fingers. The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes nuclear receptor Tailless (TLX), photoreceptor cell-specific nuclear receptor (PNR) and related receptors. TLX is an orphan receptor that plays a key role in neural development by regulating cell cycle progression and exit of neural stem cells in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR-like receptors have a central well-conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 82
26063 143516 cd06958 NR_DBD_COUP_TF DNA-binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs) is composed of two C4-type zinc fingers. DNA-binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. COUP-TFs are orphan members of the steroid/thyroid hormone receptor superfamily. They are expressed in many tissues and are involved in the regulation of several important biological processes, such as neurogenesis, organogenesis, cell fate determination, and metabolic homeostasis. COUP-TFs homodimerize or heterodimerize with retinoid X receptor (RXR) and a few other nuclear receptors and bind to a variety of response elements that are composed of imperfect AGGTCA direct or inverted repeats with various spacings. COUP-TFs are generally considered to be repressors of transcription for other nuclear hormone receptors such as retinoic acid receptor (RAR), thyroid hormone receptor (TR), vitamin D receptor (VDR), peroxisome proliferator activated receptor (PPAR), and hepatocyte nuclear factor 4 (HNF4). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, COUP-TFs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 73
26064 143517 cd06959 NR_DBD_EcR_like The DNA-binding domain of Ecdysone receptor (EcR) like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of Ecdysone receptor (EcR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. EcR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes three types of nuclear receptors: Ecdysone receptor (EcR), Liver X receptor (LXR) and Farnesoid X receptor (FXR). The DNA binding activity is regulated by their corresponding ligands. The ligands for EcR are ecdysteroids; LXR is regulated by oxidized cholesterol derivatives or oxysterols; and bile acids control FXR's activities. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, EcR-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 73
26065 143518 cd06960 NR_DBD_HNF4A DNA-binding domain of hepatocyte nuclear factor 4 (HNF4) is composed of two C4-type zinc fingers. DNA-binding domain of hepatocyte nuclear factor 4 (HNF4) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. HNF4 interacts with a DNA site, composed of two direct repeats of AGTTCA with 1 bp spacer, which is upstream of target genes and modulates the rate of transcriptional initiation. HNF4 is a member of the nuclear receptor superfamily. HNF4 plays a key role in establishing and maintenance of hepatocyte differentiation in the liver. It is also expressed in gut, kidney, and pancreatic beta cells. HNF4 was originally classified as an orphan receptor, but later it is found that HNF4 binds with very high affinity to a variety of fatty acids. However, unlike other nuclear receptors, the ligands do not act as a molecular switch for HNF4. They seem to constantly bind to the receptor, which is constitutively active as a transcription activator. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, HNF4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 76
26066 143519 cd06961 NR_DBD_TR DNA-binding domain of thyroid hormone receptors (TRs) is composed of two C4-type zinc fingers. DNA-binding domain of thyroid hormone receptors (TRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. TR interacts with the thyroid response element, which is a DNA site with direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pairs, upstream of target genes and modulates the rate of transcriptional initiation. Thyroid hormone receptor (TR) mediates the actions of thyroid hormones, which play critical roles in growth, development, and homeostasis in mammals. They regulate overall metabolic rate, cholesterol and triglyceride levels, and heart rate, and affect mood. TRs are expressed from two separate genes (alpha and beta) in human and each gene generates two isoforms of the receptor through differential promoter usage or splicing. TRalpha functions in the heart to regulate heart rate and rhythm and TRbeta is active in the liver and other tissues. The unliganded TRs function as transcription repressors, by binding to thyroid hormone response elements (TRE) predominantly as homodimers, or as heterodimers with retinoid X-receptors (RXR), and being associated with a complex of proteins containing corepressor proteins. Ligand binding promotes corepressor dissociation and binding of a coactivator to activate transcription. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 85
26067 143520 cd06962 NR_DBD_FXR DNA-binding domain of Farnesoid X receptor (FXR) family is composed of two C4-type zinc fingers. DNA-binding domain of Farnesoid X receptor (FXR) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. FXR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. FXR is a member of the nuclear receptor family of ligand activated transcription factors. Bile acids are endogenous ligands for FXRs. Upon binding of a ligand, FXR binds to FXR response element (FXRE), which is an inverted repeat of TGACCT spaced by one nucleotide, either as a monomer or as a heterodimer with retinoid X receptor (RXR), to regulate the expression of various genes involved in bile acid, lipid, and glucose metabolism. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, FXR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 84
26068 143521 cd06963 NR_DBD_GR_like The DNA binding domain of GR_like nuclear receptors is composed of two C4-type zinc fingers. The DNA binding domain of GR_like nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family of NRs includes four types of nuclear hormone receptors: glucocorticoid receptor (GR), mineralocorticoid receptor (MR), progesterone receptor (PR), and androgen receptor (AR). The receptors bind to common DNA elements containing a partial palindrome of the core sequence 5'-TGTTCT-3' with a 3bp spacer. These four receptors regulate some of the most fundamental physiological functions such as the stress response, metabolism, electrolyte homeostasis, immune function, growth, development, and reproduction. The NRs in this family have high sequence homology and share similar functional mechanisms. The dominant mechanism of function is by direct DNA binding and transcriptional regulation of target genes . The GR, MR, PR, and AR exhibit same modular structure. They have a central highly conserved DNA binding domain (DBD), a non-conserved N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 73
26069 143522 cd06964 NR_DBD_RAR DNA-binding domain of retinoic acid receptor (RAR) is composed of two C4-type zinc fingers. DNA-binding domain of retinoic acid receptor (RAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. RAR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. RARs mediate the biological effect of retinoids, including both natural dietary vitamin A (retinol) metabolites and active synthetic analogs. Retinoids play key roles in a wide variety of essential biological processes, such as vertebrate embryonic morphogenesis and organogenesis, differentiation and apoptosis, and homeostasis. RAR function as a heterodimer with retinoic X receptor by binding to specific RAR response elements (RAREs), which are composed of two direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pair and found in the promoter regions of retinoid target genes. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoic acid receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 85
26070 143523 cd06965 NR_DBD_Ppar DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) is composed of two C4-type zinc fingers. DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PPAR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Peroxisome proliferator-activated receptors (PPARs) are members of the nuclear receptor superfamily of ligand-activated transcription factors. PPARs play important roles in regulating cellular differentiation, development and lipid metabolism. Activated PPAR forms a heterodimer with the retinoid X receptor (RXR) that binds to the hormone response elements, which are composed of two direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pair located upstream of the peroxisome proliferator responsive genes, and interacts with co-activators. Several essential fatty acids, oxidized lipids and prostaglandin J derivatives can bind and activate PPAR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR has a central well conserved DNA binding domain (DBD), a variable N-terminal regulatory domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 84
26071 143524 cd06966 NR_DBD_CAR DNA-binding domain of constitutive androstane receptor (CAR) is composed of two C4-type zinc fingers. DNA-binding domain (DBD) of constitutive androstane receptor (CAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. CAR DBD interacts with CAR response element, a perfect repeat of two AGTTCA motifs with a 4 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation. The constitutive androstane receptor (CAR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. It functions as a heterodimer with RXR. The CAR/RXR heterodimer binds many common response elements in the promoter regions of a diverse set of target genes involved in the metabolism, transport, and ultimately, elimination of these molecules from the body. CAR is a closest mammalian relative of PXR and is activated by some of the same ligands as PXR and regulates a subset of common genes. The sequence homology and functional similarity suggests that the CAR gene arose from a duplication of an ancestral PXR gene. Like other nuclear receptors, CAR has a central well conserved DNA binding domain, a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain. 94
26072 143525 cd06967 NR_DBD_TR2_like DNA-binding domain of the TR2 and TR4 (human testicular receptor 2 and 4) is composed of two C4-type zinc fingers. DNA-binding domain of the TR2 and TR4 (human testicular receptor 2 and 4) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. TR2 and TR4 interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. TR4 and TR2 are orphan nuclear receptors; the physiological ligand is as yet unidentified. TR2 is abundantly expressed in the androgen-sensitive prostate. TR4 transcripts are expressed in many tissues, including central nervous system, adrenal gland, spleen, thyroid gland, and prostate. It has been shown that human TR2 binds to a wide spectrum of natural hormone response elements (HREs) with distinct affinities suggesting that TR2 may cross-talk with other gene expression regulation systems. The genes responding to TR2 or TR4 include genes that are regulated by retinoic acid receptor, vitamin D receptor, and peroxisome proliferator-activated receptor. TR4/2 binds to HREs as dimers. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR2-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 87
26073 143526 cd06968 NR_DBD_ROR DNA-binding domain of Retinoid-related orphan receptors (RORs) is composed of two C4-type zinc fingers. DNA-binding domain of Retinoid-related orphan receptors (RORs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ROR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. RORs are key regulators of many physiological processes during embryonic development. RORs bind as monomers to specific ROR response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by a 5-bp A/T-rich sequence. There are three subtypes of retinoid-related orphan receptors (RORs), alpha, beta, and gamma, which differ only in N-terminal sequence and are distributed in distinct tissues. RORalpha plays a key role in the development of the cerebellum particularly in the regulation of the maturation and survival of Purkinje cells. RORbeta expression is largely restricted to several regions of the brain, the retina, and pineal gland. RORgamma is essential for lymph node organogenesis. Recently, it has been suggested that cholesterol or a cholesterol derivative is the natural ligand of RORalpha. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoid-related orphan receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 95
26074 143527 cd06969 NR_DBD_NGFI-B DNA-binding domain of the orphan nuclear receptor, nerve growth factor-induced-B. DNA-binding domain (DBD) of the orphan nuclear receptor, nerve growth factor-induced-B (NGFI-B) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NGFI-B interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. NGFI-B is a member of the nuclear-steroid receptor superfamily. NGFI-B is classified as an orphan receptor because no ligand has yet been identified. NGFI-B is an early immediate gene product of embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. NGFI-B binds to the NGFI-B response element (NBRE) 5'-(A/T)AAAGGTCA as a monomer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, NGFI-B has a central well-conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 75
26075 143528 cd06970 NR_DBD_PNR DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) is composed of two C4-type zinc fingers. DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. PNR is a member of the nuclear receptor superfamily of the ligand-activated transcription factors. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. It most likely binds to DNA as a homodimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 92
26076 133477 cd06971 PgpA Phosphatidylglycerophosphatase A; a bacterial membrane-associated enzyme involved in lipid metabolism. Phosphatidylglycerophosphatase A domain represents a family of bacterial membrane-associated enzymes involved in lipid metabolism. The prototype of this CD is a putative Phosphatidylglycerophosphatase A (PGPase A) from Listeria monocytogenes. PGPase A (EC: 3.1.3.27), encoded by the gene pgpA, specifically catalyzes the formation of phosphatidylglycerol from phosphatidyl glycerophosphate (PGP). It requires Mg2+ for activity and is inhibited by sulfhydryl agents and freezing/thawing. PGPase B, encoded by pgpB, which also acts on phosphatidic acid (PA) and lysophosphatidic acid (LPA), is not included in this family. Aside from PGPase A and B, evidence shows that there is another PGPase existing in E. coli. Thus, PGPase A is not essential for PGPase activity in E. coli. 143
26077 132992 cd06974 TerD_like Uncharacterized proteins involved in stress response, similar to tellurium resistance terD. Tellurium resistance terD like proteins. This family is composed of uncharacterized proteins involved in stress response, such as the tellurium resistance proteins, chemical-damaging agent resistance proteins, and general stress proteins from a variety of organisms. The tellurium resistance proteins are homologous terA,-D,-E,-F,-Z,-X gene products, which confer tellurium resistance mediated by plasmids. Currently, the biochemical mechanism of tellurium resistance remains unknown. The family also contains several ter gene homologues, YceC, YceD, YceE, for which there is no clear evidence for any involvement in the tellurium resistance. A putative cAMP-binding protein CABP1 shows a significant similarity to the terD protein and is also included in this family. 162
26078 380380 cd06975 cupin_BacB Bacillus subtilis bacilysin and related proteins, cupin domain. Bacilysin (BacB, also known as AerE in Microcystis aeruginosa) is a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. It is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF aeruginosin biosynthesis gene cluster in Microcystis aeruginosa. 93
26079 380381 cd06976 cupin_MtlR-like_N AraC/XylS family transcriptional regulators similar to MtlR, N-terminal cupin domain. MtlR is a Pseudomonas fluorescens protein that acts as a transcriptional regulator of the mannitol utilization genes. It has an N-terminal cupin domain (represented by this alignment) and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 83
26080 380382 cd06977 cupin_RhaR_RhaS-like_N HTH-type transcriptional activator RhaR and RhaS and related proteins, N-terminal cupin domain. Members of this family contain an N-terminal cupin domain and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain, including the HTH-type transcription activators RhaS and RhaR. RhaS and RhaR respond to the availability of L-rhamnose and activate transcription of the operons in the Escherichia coli L-rhamnose catabolic regulon. The E. coli RhaR protein activates expression of the rhaSR operon in the presence of its effector, L-rhamnose. The resulting RhaS protein (plus L-rhamnose) activates expression of the L-rhamnose catabolic operon rhaBAD as well as the transport operon rhaT. These proteins bind DNA as dimers, via their HTH motifs. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 147
26081 380383 cd06978 cupin_EctC L-ectoine synthase, cupin domain. Ectoine synthase (EctC; also known as L-ectoine synthase or N-acetyldiaminobutyrate dehydratase; EC 4.2.1.108) is a cupin-like bacterial protein that converts N'-acetyldiaminobutyric acid to ectoine, in the last step of the L-ectoine biosynthetic pathway, via a cyclo-condensation reaction and using iron as the cofactor. Ectoines are potent microbial stress protectants, primarily synthesized by bacteria but also found in a few obligate halophilic protists and archaea, based on the ectoine biosynthetic ectABC gene. In halophilic eubacteria, the osmolytic ectoines enable the organisms to adapt to a wide range of salt concentrations by adjusting the cytoplasmic solute pool to the osmolarity of the surrounding environment. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 111
26082 380384 cd06979 cupin_RemF-like Streptomyces resistomycificus RemF cyclase and related proteins, cupin domain. RemF cyclase is a manganese-containing polyketide cyclase present in bacteria that is involved in the biosynthesis of resistomycin, the aromatic pentacyclic metabolite in Streptomyces resistomycificus. Structure of this enzyme shows a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer. It contains an unusual octahedral zinc-binding site in a large hydrophobic pocket that may represent the active site. The zinc ion, coordinated to four histidine side chains and two water molecules, could act as a Lewis acid in the aldol condensation reaction catalyzed by RemF, reminiscent of class II aldolases. 93
26083 380385 cd06980 cupin_bxe_c0505 uncharacterized protein bxe_c0505, cupin domain. This family includes mostly bacterial proteins homologous to bxe_c0505, a Burkholderia xenovorans protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 105
26084 380386 cd06981 cupin_reut_a1446 Cupriavidus pinatubonensis reut_a1446 and related proteins, cupin domain. This family includes bacterial and some eukaryotic proteins homologous to reut_a1446, a Cupriavidus pinatubonensis protein of unknown function that may be related to mannose-6-phosphate isomerase. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 103
26085 380387 cd06982 cupin_BauB-like Pseudomonas aeruginosa BauB and related proteins, cupin domain. This family includes bacterial proteins homologous to beta-alanine degradation protein BauB from Pseudomonas aeruginosa, which is involved in the degradation of beta-alanine. Also included are Rhodopseudomonas palustris Rpa4178 and Bordetella pertussis Bp2299, which are both proteins with unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 91
26086 380388 cd06983 cupin_dsy2733 Desulfitobacterium hafniense dsy2733 and related proteins, cupin domain. This family includes bacterial proteins homologous to dsy2733, a Desulfitobacterium hafniense protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 81
26087 380389 cd06984 cupin_Moth_1897 uncharacterized Methanocaldococcus jannaschii Moth_1897 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to Moth_1897, a Methanocaldococcus jannaschii protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 83
26088 380390 cd06985 cupin_BF4112 Bacteroides fragilis BF4112 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to BF4112, a Bacteroides fragilis protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 101
26089 380391 cd06986 cupin_MmsR-like_N AraC/XylS family transcriptional regulators similar to MmsR, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included is MmsR, a bacterial transcriptional regulator thought to positively regulate the expression of the mmsAB operon. The mmsAB operon contains two structural genes involved in valine metabolism: mmsA which encodes methylmalonate-semialdehyde dehydrogenase, and mmsB which encodes 3-hydroxyisobutyrate dehydrogenase. The cupin domain of members of this subfamily does not contain a metal binding site. 84
26090 380392 cd06987 cupin_MAE_RS03005 Microcystis aeruginosa MAE_RS03005 and related proteins, cupin domain. This family includes bacterial and some eukaryotic proteins homologous to MAE_RS03005, a Microcystis aeruginosa protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 122
26091 380393 cd06988 cupin_DddK Dimethylsulfoniopropionate lyase DddK and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to dimethylsulfoniopropionate lyase DddK from marine bacterium Pelagibacter. DddK cleaves dimethylsulfoniopropionate (DMSP), the organic osmolyte and antioxidant produced in marine environments, and yields acrylate and the climate-active gas dimethyl sulfide (DMS). DddK contains a double-stranded beta-helical motif which utilizes various divalent metal ions as cofactors for catalytic activity; however, nickel, an abundant metal ion in marine environments, confers the highest DMSP lyase activity. Also included in this family is Plu4264, a Photorhabdus luminescens manganese-containing cupin shown to have similar metal binding site to TM1287 decarboxylase, but two very different substrate binding pockets. The Plu4264 binding pocket shows a cavity and substrate entry point more than twice as large as and more hydrophobic than TM1287, suggesting that Plu4264 accepts a substrate that is significantly larger than that of TM1287, a putative oxalate decarboxylase. Thus, the function of Plu4264 could be similar to that of TM1287 but with a larger, less charged substrate. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 76
26092 380394 cd06989 cupin_DRT102 Arabidopsis thaliana DRT102 and related proteins, cupin domain. This family includes bacterial and eukaryotic proteins homologous to DNA-damage-repair/toleration protein DRT102 found in Arabidopsis thaliana. DRT102 may be involved in DNA repair from UV damage. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 97
26093 380395 cd06990 cupin_DUF861 domain of unknown function DUF 861, cupin domain. This family contains proteins which seem to be specific to bacteria and some fungi. The function of this family is unknown but contains a cupin domain without a metal binding site. Cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins. This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin (cupa is the Latin term for small barrel). 101
26094 380396 cd06991 cupin_TcmJ-like TcmJ monooxygenase and related proteins, cupin domain. This family includes TcmJ, a subunit of the tetracenomycin (TCM) polyketide synthase (PKS) type II complex in Streptomyces glaucescens. TcmJ is a quinone-forming monooxygenase involved in the modification of aromatic polyketides synthesized by polyketide synthases of types II and III. Orthologs of TcmJ include the Streptomyces BenD (benastatin biosynthetic pathway), the Streptomyces olivaceus ElmJ (polyketide antibiotic elloramycin biosynthetic pathway), the Actinomadura hibisca PdmL (pradimicin biosynthetic pathway), the Streptomyces cyaneus CurC (curamycin biosynthetic pathway), the Streptomyces rishiriensis Lct30 (lactonamycin biosynthetic pathway), and the Streptomyces WhiE II (spore pigment polyketide biosynthetic pathway). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 105
26095 380397 cd06992 cupin_GDO-like_C gentisate 1,2-dioxygenase, 1-hydroxy-2-naphthoate dioxygenase, and salicylate 1,2-dioxygenase bicupin aromatic ring-cleaving dioxygenases, C-terminal cupin domain. This model represents the C-terminal cupin domains of three closely related bicupin aromatic ring-cleaving dioxygenases: gentisate 1,2-dioxygenase (GDO), salicylate 1,2-dioxygenase (SDO), and 1-hydroxy-2-naphthoate dioxygenase (NDO). GDO catalyzes the cleavage of the gentisate (2,5-dihydroxybenzoate) aromatic ring, a key step in the gentisate degradation pathway allowing soil bacteria to utilize 2,5-xylenol, 3,5-xylenol, and m-cresol as sole carbon and energy sources. NDO catalyzes the cleavage of 1-hydroxy-2-naphthoate as part of the bacterial phenanthrene degradation pathway. SDO is a ring cleavage dioxygenase from Pseudaminobacter salicylatoxidans that oxidizes salicylate to 2-oxohepta-3,5-dienedioic acid via a novel ring fission mechanism. SDO differs from other known GDOs and NDOs in its unique ability to oxidatively cleave many different salicylate, gentisate, and 1-hydroxy-2-naphthoate substrates with high catalytic efficiency. The active site of this enzyme is located in the N-terminal domain but could be influenced by changes in the C-terminal domain, which lacks the strictly conserved metal-binding residues found in other cupin domains and is thought to be an inactive vestigial remnant. 99
26096 380398 cd06993 cupin_CENP-C_C centromere-binding protein CENP-C, C-terminal cupin domain. This family includes centromeric protein C (CENP-C; known as Mif2 in budding yeast and centromere protein 3 or cnp3 in fission yeast), which is an inner kinetochore centromere (CEN)-binding protein found in fungi and metazoans. CENP-C is a component of the CENP-A nucleosome-associated complex (NAC) that plays a central role in assembly of kinetochore proteins, mitotic progression and chromosome segregation. CENP-C localizes to the inner kinetochore plates adjacent to the centromeric DNA and is known to have DNA-binding ability. CENP-C, along with CENP-H, provides a platform onto which the mitotic kinetochore is assembled and thus plays a critical role in the structuring of kinetochore chromatin. The cupin domain at the C-terminus forms a homodimer which is part of an enhanceosome-like structure that nucleates kinetochore assembly in budding yeast. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 77
26097 380399 cd06995 cupin_YkgD-like_N AraC/XylS family transcriptional regulators similar to Escherichia coli YkgD, N-terminal cupin domain. This family contains mostly bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included in this family is YkgD, an uncharacterized Escherichia coli protein thought to be a transcriptional regulator. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 114
26098 380400 cd06996 cupin_Lmo2851-like_N AraC/XylS family transcriptional regulators similar to Listeria monocytogenes Lmo2851 protein, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included is Listeria monocytogenes Lmo2851 protein, whose function is unknown. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 87
26099 380401 cd06997 cupin_MelR-like_N AraC/XylS family transcriptional regulators similar to Escherichia coli MelR, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators, including Escherichia coli MelR, a transcription factor that controls melibiose utilization. MelR is encoded by the melR gene and is essential for melibiose-dependent triggering of the melAB operon that encodes products needed for melibiose catabolism and transport. Expression of melR is autoregulated by MelR, which represses the melR promoter by binding to a target that overlaps the transcript start. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 78
26100 380402 cd06998 cupin_D-LI-like sugar isomerase such as lyxose isomerase, cupin domain. This family includes D-lyxose isomerase (D-LI; EC 5.3.1.15) homologous to YdaE from the sigma B regulon of Bacillus subtilis and to pathogenic Escherichia coli O157 z5688 D-lyxose isomerase (EcSI or Z5688), both having highly similar active sites. YdaE may have a synergistic role with ydaD, an NAD(P)-dependent alcohol dehydrogenase, in the adaptation to environment stresses, while EcSI has D-lyxose/D-mannose activity. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 100
26101 380403 cd06999 cupin_HpaA-like_N AraC/XylS family transcriptional regulators similar to HpaA, N-terminal cupin domain. Members of this family contain an N-terminal cupin domain and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain, similar to Escherichia coli 4-hydroxyphenylacetate catabolism regulatory protein HpaA (also known as 4HPA). HpaA is encoded by the hpaA gene which is located upstream of hpaBC. It is activated by 4-HPA, 3-HPA and phenylacetate, and represents a member of the AraC/XylS family of regulators that recognizes aromatic effectors. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 98
26102 380404 cd07000 cupin_HGO_N homogentisate 1,2-dioxygenase and related proteins, N-terminal cupin domain. This family includes homogentisate 1,2-dioxygenase (also known as homogentisate oxygenase, homogentisic acid oxidase, homogentisicase, HGO, HGD, HGDO, or HmgA; EC 1.13.11.5), which is involved in the metabolic degradation of phenylalanine and tyrosine. It catalyzes the crucial aromatic ring opening reaction, utilizing nonheme Fe2+ to incorporate both atoms of molecular oxygen into homogentisate (2,5-dihydroxyphenylacetate) to yield 4-maleylacetoacetate as part of the homogentisate pathway. HGO deficiency caused by critical mutations and polymorphic sites, causes the metabolic disease alkaptonuria (AKU), a rare disorder of autosomal recessive inheritance. Homogentisate accumulation causes insoluble ochronotic pigments to deposit in connective tissues, resulting in degenerative arthritis. These enzymes are found in prokaryotes, eukaryotes, and archaea. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 109
26103 380405 cd07001 cupin_YbfI-like_N AraC/XylS family transcriptional regulators similar to Bacillus subtilis YbfI, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators, including YbfI, an uncharacterized Bacillus subtilis protein. In Pseudomonas putida, this protein is thought to regulate the expression of phenylserine aldolase. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 76
26104 380406 cd07002 cupin_SznF-like_C Streptomyces achromogenes SznF and related proteins, C-terminal cupin domain. This family includes bacterial proteins similar to Streptomyces achromogenes SznF, containing an N-terminal helical region that mediates dimerization, a central heme oxygenase domain, and a C-terminal cupin domain. SznF is a metalloenzyme that catalyzes an oxidative rearrangement of the guanidine group of N(omega)-methyl-L-arginine to generate an N-nitrosourea product, during the biosynthesis of streptozotocin, an N-nitrosourea natural product and an approved cancer chemotherapeutic. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold generally capable of homodimerization. However, in SznF, the cupin domain is not involved in dimerization. 96
26105 380407 cd07003 cupin_YobQ-like_N Bacillus subtilis YobQ and related proteins, N-terminal cupin domain. This family includes bacterial proteins homologous to Bacillus subtilis YobQ and Photobacterium leiognathi LumQ, both uncharacterized proteins thought to be DNA-binding proteins that may function as AraC/XylS family transcriptional regulators. YobQ has an N-terminal cupin beta barrel domain (represented by this alignment model) and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 66
26106 380408 cd07005 cupin_WbuC-like Escherichia coli WbuC and related proteins, cupin domain. This family includes bacterial proteins homologous to WbuC, an Escherichia coli protein of unknown function with a cupin beta barrel fold. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 114
26107 380409 cd07006 cupin_XcTcmJ-like Xanthomonas campestris XcTcmJ and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to plant pathogen Xanthomonas campestris tetracenomycin polyketide synthesis protein XcTcmJ, a protein encoded by the tcmJ gene. XcTcmJ is annotated as being involved in tetracenomycin polyketide biosynthesis. Also included is Xc5357 from a different strain of X. campestris. Structure studies show that binding of zinc induces conformational changes and serves a functional role in this cupin protein. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 89
26108 380410 cd07007 cupin_CapF-like_C Staphylococcus aureus CapF and related proteins, C-terminal cupin domain. This family contains cupin domains of proteins homologous to Staphylococcus aureus CapF (also known as WbjC in Pseudomonas aeruginosa and FnlB in Escherichia coli). CapF is a bifunctional metalloenzyme produced by certain pathogenic bacteria and is essential in the biosynthetic path of capsular polysaccharide (CP), a mucous layer on the surface of the bacterium that facilitates immune evasion and infection. Thus, CapF is an antibacterial/therapeutic target. In S. aureus, enzymes CapE, CapF and CapG catalyze the sequential transformation of UDP-D-GlcNAc into the CP precursor UDP-L-FucNAc via the intermediate compound UDP-N-acetyl-L-talosamine (UDP-L-TalNAc). CapF consists of two domains; the C-terminal cupin domain catalyzes the epimerization of the compound produced by the upstream enzyme CapE, and the N-terminal short-chain dehydrogenase/reductase (SDR) domain catalyzes the reduction of the compound afforded by the cupin domain, requiring one equivalent of NADPH. The cupin domain is crucial for catalyzing the first chemical reaction, and also important for the stability of the enzyme. Similarly, in P. aeruginosa, WbjC, WbjB and WbjD enzymes synthesize UDP-N-acetyl-L-fucosamine, a precursor of the lipopolysaccharide component L-fucosamine. The cupin domains contain a conserved "jelly roll-like" beta-barrel fold. 109
26109 380411 cd07008 cupin_yp_001338853-like Klebsiella pneumoniae yp_001338853.1 and related proteins, cupin domain. This family includes bacterial proteins homologous to Klebsiella pneumoniae yp_001338853.1, an uncharacterized conserved protein with double-stranded beta-helix domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 101
26110 380412 cd07009 cupin_BLL0285-like Bradyrhizobium japonicum BLL0285 and related proteins, cupin domain. This family includes bacterial proteins homologous to BLL0285, a Bradyrhizobium japonicum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 81
26111 380413 cd07010 cupin_PMI_type_I_N_bac Phosphomannose isomerase in bacteria and archaea, N-terminal cupin domain. This subfamily contains type I phosphomannose isomerase (PMI; E.C. 5.3.1.8; also known as mannose-6-phosphate isomerase) found in many bacteria (e.g. Bacillus subtilis) and archaea. PMI catalyzes the reversible isomerization of fructose-6-phosphate (F6P) and mannose-6-phosphate (M6P), the first committed step in the synthesis of mannosylated glycoproteins. The active site, located within the N-terminal jelly roll-like beta-barrel cupin fold, contains a single essential zinc atom and forms a deep, open cavity large enough to contain M6P or F6P. PMI type I also has a C-terminal beta-barrel fold which has diverged considerably from the N-terminal domain and is not included here. This subfamily does not contain an alpha helical domain that exists in eukaryotic and some prokaryotic PMIs. F6P is a substrate for glycolysis and gluconeogenesis, while M6P is a substrate for production of activated mannose donor guanosine 5'-diphosphate D-mannose, an important precursor of mannosylated biomolecules such as glycoproteins, bacterial exopolysaccharides and fungal cell wall components. PMI is also essential for survival, virulence and possibly pathogenicity of some bacteria and protozoan parasites, as well as for cell wall integrity of certain yeasts. Thus, PMI is a potential target against fungal infections causing serious illness or death. 173
26112 380414 cd07011 cupin_PMI_type_I_N type I phosphomannose isomerase in eukaryotes and bacteria, N-terminal cupin domain. This subfamily contains type I phosphomannose isomerase (PMI; E.C. 5.3.1.8; also known as mannose-6-phosphate isomerase) found in eukaryotes and some bacteria such as Salmonella enterica. PMI catalyzes the reversible isomerization of fructose-6-phosphate (F6P) and mannose-6-phosphate (M6P), the first committed step in the synthesis of mannosylated glycoproteins. The active site, located within the N-terminal jelly roll-like beta-barrel cupin fold, contains a single essential zinc atom and forms a deep, open cavity large enough to contain M6P or F6P. PMI type I also has a C-terminal beta-barrel fold which has diverged considerably from the N-terminal domain and is not included here. This subfamily contains an alpha helical domain that is found in eukaryotic and some prokaryotic PMIs but is not present in their archaeal counterparts. F6P is a substrate for glycolysis and gluconeogenesis, while M6P is a substrate for production of activated mannose donor guanosine 5'-diphosphate D-mannose, an important precursor of mannosylated biomolecules such as glycoproteins, bacterial exopolysaccharides and fungal cell wall components. PMI is also essential for survival, virulence and possibly pathogenicity of some bacteria and protozoan parasites, as well as for cell wall integrity of certain yeasts. Thus, PMI is a potential target against fungal infections causing serious illness or death. 247
26113 270234 cd07012 PBP2_Bug_TTT Bug (Bordetella uptake gene) protein family of periplasmic solute-binding receptors; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) proteins present in a number of bacterial species, but mainly in proteobacteria. In eubacteria, at least three families of periplasmic binding-protein dependent transporters are known: the ATP-binding cassette (ABC) transporters, the tripartite ATP-independent periplasmic transporters, and the tripartite tricarboxylate transporters (TTT). Bug proteins are the PBP components of the TTT. Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter. 291
26114 132924 cd07013 S14_ClpP Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. Additionally, they are implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. 162
26115 132925 cd07014 S49_SppA Signal peptide peptidase A. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is an intramembrane enzyme found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be ClpP-like serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain, cleaving peptide bonds in the plane of the lipid bilayer. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain (sometimes referred to as 67K type). Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain (sometimes referred to as 36K type). Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. This family also contains homologs that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases. 177
26116 132926 cd07015 Clp_protease_NfeD Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. 172
26117 132927 cd07016 S14_ClpP_1 Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. This subfamily only contains bacterial sequences. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. 160
26118 132928 cd07017 S14_ClpP_2 Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. 171
26119 132929 cd07018 S49_SppA_67K_type Signal peptide peptidase A (SppA) 67K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 67K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily contain an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown that members in this subfamily, mostly bacterial, are serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. 222
26120 132930 cd07019 S49_SppA_1 Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppAs in this subfamily are found in all three domains of life and are involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain, similar to Arabidopsis thaliana SppA1 peptidase. Others, including sohB peptidase, protein C and archaeal signal peptide peptidase, do not contain the amino-terminal domain. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. 211
26121 132931 cd07020 Clp_protease_NfeD_1 Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. 187
26122 132932 cd07021 Clp_protease_NfeD_like Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. 178
26123 132933 cd07022 S49_Sppa_36K_type Signal peptide peptidase A (SppA) 36K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 36K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily are all bacterial and include sohB peptidase and protein C. These are sometimes referred to as 36K type since they contain only one domain, unlike E. coli SppA that also contains an amino-terminal domain. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. 214
26124 132934 cd07023 S49_Sppa_N_C Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. This subfamily contains members with either a single domain (sometimes referred to as 36K type), such as sohB peptidase, protein C and archaeal signal peptide peptidase, or an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. 208
26125 132882 cd07025 Peptidase_S66 LD-Carboxypeptidase, a serine protease, includes microcin C7 self immunity protein. LD-carboxypeptidase (Muramoyltetrapeptide carboxypeptidase; EC 3.4.17.13; Merops family S66; initially described as Carboxypeptidase II) family also includes the microcin c7 self-immunity protein (MccF) as well as uncharacterized proteins including hypothetical proteins. LD-carboxypeptidase hydrolyzes the amide bond that links the dibasic amino acids to C-terminal D-amino acids. The physiological substrates of LD-carboxypeptidase are tetrapeptide fragments (such as UDP-MurNAc-tetrapeptides) that are produced when bacterial cell walls are degraded; they contain an L-configured residue (L-lysine or meso-diaminopimelic acid residue) as the penultimate residue and D-alanine as the ultimate residue. A possible role of LD-carboxypeptidase is in peptidoglycan recycling whereby the resulting tripeptide (precursor for murein synthesis) can be reconverted into peptidoglycan by attachment of preformed D-Ala-D-Ala dipeptides. Some enzymes possessing LD-carboxypeptidase activity also act as LD-transpeptidase by replacing the terminal D-Ala with another D-amino acid. MccF contributes to self-immunity towards microcin C7 (MccC7), a ribosomally encoded peptide antibiotic that contains a phosphoramidate linkage to adenosine monophosphate at its C-terminus. Its possible biological role is to defend producer cells against exogenous microcin from re-entering after having been exported. It is suggested that MccF is involved in microcin degradation or sequestration in the periplasm. 282
26126 197305 cd07026 Ribosomal_L20 Ribosomal protein L20. The ribosomal protein family L20 contains members from eubacteria, as well as their mitochondrial and plastid homologs. L20 is an assembly protein, required for the first in-vitro reconstitution step of the 50S ribosomal subunit, but does not seem to be essential for ribosome activity. L20 has been shown to partially unfold in the absence of RNA, in regions corresponding to the RNA-binding sites. L20 represses the translation of its own mRNA via specific binding to two distinct mRNA sites, in a manner similar to the L20 interaction with 23S ribosomal RNA. 106
26127 132905 cd07027 RNAP_RPB11_like RPB11 subunit of RNA polymerase. The eukaryotic RPB11 subunit of RNA polymerase (RNAP), as well as its archaeal (L subunit) and bacterial (alpha subunit) counterparts, is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts. 83
26128 132906 cd07028 RNAP_RPB3_like RPB3 subunit of RNA polymerase. The eukaryotic RPB3 subunit of RNA polymerase (RNAP), as well as its archaeal (D subunit) and bacterial (alpha subunit) counterparts, is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts. 212
26129 132907 cd07029 RNAP_I_III_AC19 AC19 subunit of Eukaryotic RNA polymerase (RNAP) I and RNAP III. The eukaryotic AC19 subunit of RNA polymerase (RNAP) I and RNAP III is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP I is responsible for the synthesis of ribosomal RNA precursor, while RNAP III functions in the synthesis of 5S and tRNA. The AC19 subunit is the equivalent of the RPB11 subunit of RNAP II. The RPB11 subunit heterodimerizes with the RPB3 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. The homology of AC19 to RPB11 suggests a similar function. The AC19 subunit is likely to associate with the RPB3 counterpart, AC40, to form a heterodimer, which stabilizes the association of the two largest subunits of RNAP I and RNAP III. 85
26130 132908 cd07030 RNAP_D D subunit of Archaeal RNA polymerase. The D subunit of archaeal RNA polymerase (RNAP) is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. A single distinct RNAP complex is found in archaea, which may be responsible for the synthesis of all RNAs. The archaeal RNAP harbors homologues of all eukaryotic RNAP II subunits with two exceptions (RPB8 and RPB9). The 12 archaeal subunits are designated by letters and can be divided into three functional groups that are engaged in: (I) catalysis (A'/A", B'/B" or B); (II) assembly (L, N, D and P); and (III) auxiliary functions (F, E, H and K). The D subunit is equivalent to the RPB3 subunit of eukaryotic RNAP II. It contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization, and the other is an inserted beta sheet subdomain. The assembly of the two largest archaeal RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of the archaeal D/L heterodimer. 259
26131 132909 cd07031 RNAP_II_RPB3 RPB3 subunit of Eukaryotic RNA polymerase II. The eukaryotic RPB3 subunit of RNA polymerase (RNAP) II is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP II is responsible for the synthesis of mRNA precursor. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization, and the other is an inserted beta sheet subdomain. The RPB3 subunit heterodimerizes with the RPB11 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. 265
26132 132910 cd07032 RNAP_I_II_AC40 AC40 subunit of Eukaryotic RNA polymerase (RNAP) I and RNAP III. The eukaryotic AC40 subunit of RNA polymerase (RNAP) I and RNAP III is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP I is responsible for the synthesis of ribosomal RNA precursor, while RNAP III functions in the synthesis of 5S and tRNA. The AC40 subunit is the equivalent of the RPB3 subunit of RNAP II. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The RPB3 subunit heterodimerizes with the RPB11 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. The homology of AC40 to RPB3 suggests a similar function. The AC40 subunit is likely to associate with the RPB11 counterpart, AC19, to form a heterodimer, which stabilizes the association of the two largest subunits of RNAP I and RNAP III. 291
26133 132916 cd07033 TPP_PYR_DXS_TK_like Pyrimidine (PYR) binding domain of 1-deoxy-D-xylulose-5-phosphate synthase (DXS), transketolase (TK), and related proteins. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain of 1-deoxy-D-xylulose-5-phosphate synthase (DXS), transketolase (TK), and the beta subunits of the E1 component of the human pyruvate dehydrogenase complex (E1-PDHc), subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Like many TPP-dependent enzymes, DXS and TK are homodimers having a PYR and a PP domain on the same subunit. TK has two active sites per dimer which lie between PYR and PP domains of different subunits. For DXS each active site is located at the interface of a PYR and a PP domain from the same subunit. E1-PDHc is an alpha2beta2 dimer-of-heterodimers having two active sites but having the PYR and PP domains arranged on separate subunits, the PYR domains on the beta subunits, the PP domains on the alpha subunits. DXS is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis; it catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. 
In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. TK also plays a central role in the Calvin cycle in plants. PDHc catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. This subfamily includes the beta subunits of the E1 component of the acetoin dehydrogenase complex (ADC) and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC). ADC participates in the breakdown of acetoin. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate during the breakdown of branched chain amino acids. 156
26134 132917 cd07034 TPP_PYR_PFOR_IOR-alpha_like Pyrimidine (PYR) binding domain of pyruvate ferredoxin oxidoreductase (PFOR), indolepyruvate ferredoxin oxidoreductase alpha subunit (IOR-alpha), and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain, of pyruvate ferredoxin oxidoreductase (PFOR), indolepyruvate ferredoxin oxidoreductase (IOR) alpha subunit (IOR-alpha), and related proteins, subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzyme Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit. This subfamily includes proteins characterized as pyruvate NADP+ oxidoreductase (PNO). PFOR and PNO catalyze the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. The facultative anaerobic mitochondrion of the photosynthetic protist Euglena gracilis oxidizes pyruvate with PNO. 
IOR catalyzes the oxidative decarboxylation of arylpyruvates, such as indolepyruvate or phenylpyruvate. 160
26135 132918 cd07035 TPP_PYR_POX_like Pyrimidine (PYR) binding domain of POX and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of pyruvate oxidase (POX) and related proteins subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. For glyoxylate carboligase, which belongs to this subfamily, but lacks this conserved glutamate, the rate of the initial TPP activation step is reduced but the ensuing steps of the enzymic reaction proceed efficiently. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites; for many, the active sites lie between PP and PYR domains on different subunits. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. This subfamily includes pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC). PDC catalyzes the conversion of pyruvate to acetaldehyde and CO2 in alcoholic fermentation. IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway in plants and various plant-associated bacteria; it catalyzes the decarboxylation of IPA to IAA. This subfamily also includes the large catalytic subunit of acetohydroxyacid synthase (AHAS). AHAS catalyzes the condensation of two molecules of pyruvate to give the acetohydroxyacid, 2-acetolactate, a precursor of the branched chain amino acids, valine and leucine. AHAS also catalyzes the condensation of pyruvate and 2-ketobutyrate to form 2-aceto-2-hydroxybutyrate in isoleucine biosynthesis. Methanococcus jannaschii sulfopyruvate decarboxylase (MjComDE) and phosphonopyruvate decarboxylase (PpyrDc) also belong to this subfamily. PpyrDc is a homotrimeric enzyme having the PP and PYR domains tandemly arranged on the same subunit. It functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. MjComDE is a dodecamer having the PYR and PP domains on different subunits; it has six alpha (PYR/ComD) subunits and six beta (PP/ComE) subunits. MjComDE catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. 155
26136 132919 cd07036 TPP_PYR_E1-PDHc-beta_like Pyrimidine (PYR) binding domain of the beta subunits of the E1 components of human pyruvate dehydrogenase complex (E1- PDHc) and related proteins. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain of the beta subunits of the E1 components of: human pyruvate dehydrogenase complex (E1- PDHc), the acetoin dehydrogenase complex (ADC), and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC), subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. E1-PDHc is an alpha2beta2 dimer-of-heterodimers having two active sites lying between PYR and PP domains of separate subunits, the PYR domains are arranged on the beta subunit, the PP domains on the alpha subunits. PDHc catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. ADC participates in the breakdown of acetoin. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate during the breakdown of branched chain amino acids. 167
26137 132920 cd07037 TPP_PYR_MenD Pyrimidine (PYR) binding domain of 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate synthase (MenD) and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate (SEPHCHC) synthase (MenD) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. Escherichia coli MenD (EcMenD) is a homotetramer (dimer-of-homodimers), having two active sites per homodimer lying between PYR and PP domains of different subunits. EcMenD catalyzes a Stetter-like conjugate addition of alpha-ketoglutarate to isochorismate, leading to the formation of SEPHCHC and carbon dioxide, this addition is the first committed step in the biosynthesis of vitamin K2 (menaquinone). 162
26138 132921 cd07038 TPP_PYR_PDC_IPDC_like Pyrimidine (PYR) binding domain of pyruvate decarboxylase (PDC), indolepyruvate decarboxylase (IPDC) and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites, for many the active sites lie between PP and PYR domains on different subunits. PDC catalyzes the conversion of pyruvate to acetaldehyde and CO2 in alcoholic fermentation. IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway in plants and various plant-associated bacteria, it catalyzes the decarboxylation of IPA to IAA. Also belonging to this group is Mycobacterium tuberculosis alpha-keto acid decarboxylase (MtKDC) which participates in amino acid degradation via the Ehrlich pathway, and Lactococcus lactis branched-chain keto acid decarboxylase (KdcA) an enzyme identified as being involved in cheese ripening, which exhibits a very broad substrate range in the decarboxylation and carboligation reactions. 162
26139 132922 cd07039 TPP_PYR_POX Pyrimidine (PYR) binding domain of POX. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of pyruvate oxidase (POX) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. Lactobacillus plantarum POX is a homotetramer (dimer-of-homodimers), having two active sites per homodimer lying between PYR and PP domains of different subunits. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. 164
26140 132716 cd07040 HP Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BPase), Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation; for example, F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli SixA participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease; for example, erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed. 153
26141 132912 cd07041 STAS_RsbR_RsbS_like Sulphate Transporter and Anti-Sigma factor antagonist domain of the "stressosome" complex proteins RsbS and RsbR, regulators of the bacterial stress activated alternative sigma factor sigma-B by phosphorylation. The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain of proteins related to RsbS and RsbR which are part of the "stressosome" complex that plays an important role in the regulation of the bacterial stress activated alternative sigma factor sigma-B. During stress conditions RsbS and RsbR are phosphorylated, which leads to the release of RsbT, an activator of the RsbU phosphatase, which in turn activates RsbV, which leads to the release and activation of sigma factor B. RsbS is a single domain protein (STAS domain), while RsbR-like proteins have a well-conserved C-terminal STAS domain and a variable N-terminal domain. The STAS domain is also found in the C-terminal region of sulphate transporters and anti-anti-sigma factors. 109
26142 132913 cd07042 STAS_SulP_like_sulfate_transporter Sulphate Transporter and Anti-Sigma factor antagonist domain of SulP-like sulfate transporters, plays a role in the function and regulation of the transport activity, proposed general NTP binding function. The SulP family is a large and diverse family of anion transporters, with members from eubacteria, plants, fungi, and mammals. They contain 10 to 14 transmembrane helices which form the catalytic core of the protein and a C-terminal extension, the STAS (Sulphate Transporter and AntiSigma factor antagonist) domain which plays a role in the function and regulation of the transport activity. The STAS domain is found in the C-terminal region of sulphate transporters and bacterial anti-sigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. 107
26143 132914 cd07043 STAS_anti-anti-sigma_factors Sulphate Transporter and Anti-Sigma factor antagonist domain of anti-anti-sigma factors, key regulators of anti-sigma factors by phosphorylation. Anti-anti-sigma factors play an important role in the regulation of several sigma factors and their corresponding anti-sigma factors. Upon dephosphorylation they bind the anti-sigma factor and induce the release of the sigma factor from the anti-sigma factor. In a feedback mechanism the anti-anti-sigma factor can be inactivated via phosphorylation by the anti-sigma factor. Well-studied examples from Bacillus subtilis are SpoIIAA (regulating sigmaF and sigmaC which play an important role in sporulation) and RsbV (regulating sigmaB involved in the general stress response). The STAS domain is also found in the C-terminal region of sulphate transporters and stressosomes. 99
26144 132871 cd07044 CofD_YvcK Family of CofD-like proteins and proteins related to YvcK. CofD is a 2-phospho-L-lactate transferase that catalyzes the last step in the biosynthesis of coenzyme F(420)-0 (F(420) without polyglutamate) by transferring the lactyl phosphate moiety of lactyl(2)diphospho-(5')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin ribitol (F0). F420 is a hydride carrier, important for energy metabolism of methanogenic archaea, as well as for the biosynthesis of other natural products, like tetracycline in Streptomyces. F420 and some of its precursors are also utilized as cofactors for enzymes, like DNA photolyase in Mycobacterium tuberculosis. YvcK from Bacillus subtilis is a member of a family of mostly uncharacterized proteins and has been proposed to play a role in carbon metabolism, since its function is essential for growth on intermediates of the Krebs cycle and pentose phosphate pathway. Both families appear to have a conserved phosphate binding site, but have different substrate binding residues conserved within each family. 309
26145 132885 cd07045 BMC_CcmK_like Carbon dioxide concentrating mechanism K (CcmK)-like proteins, Bacterial Micro-Compartment (BMC) domain. Bacterial micro-compartments are primitive protein-based organelles that sequester specific metabolic pathways in bacterial cells. The prototypical bacterial microcompartment is the carboxysome shell, a bacterial polyhedral organelle which increases the efficiency of CO2 fixation by encapsulating RuBisCO and carbonic anhydrase. Carboxysomes can be divided into two types: alpha-type carboxysomes (alpha-cyanobacteria and proteobacteria) and beta-type carboxysomes (beta-cyanobacteria). Potential functional differences between the two types are not yet fully understood. In addition to these proteins there are several homologous shell proteins including those found in pdu organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol and eut organelles involved in the cobalamin-dependent degradation of ethanolamine. Structural evidence shows that several carboxysome shell proteins and their homologs (Csos1A, CcmK1,2,4, and PduU) exist as hexamers which might further assemble into extended, tightly packed layers hypothesized to represent the flat facets of the polyhedral organelle's outer shell. Although it has been suggested that other homologous proteins in this family might also form hexamers and play similar functional roles in the construction of their corresponding organelle outer shells, at present no experimental evidence directly supports this view. 84
26146 132886 cd07046 BMC_PduU-EutS 1,2-propanediol utilization protein U (PduU)/ethanolamine utilization protein S (EutS), Bacterial Micro-Compartment (BMC) domain. PduU encapsulates several related enzymes within a shell composed of a few thousand protein subunits. PduU exists as a hexamer which might further assemble into the flat facets of the polyhedral outer shell of the pdu organelle. This proteinaceous noncarboxysome microcompartment is involved in coenzyme B12-dependent degradation of 1,2-propanediol. The core of PduU is related to the typical BMC domain and its natural oligomeric state is a cyclic hexamer. Unlike other typical BMC domain proteins, the 3D topology of PduU reveals a circular permuted variation on the typical BMC fold which leads to several unique features. The exact functions related to those unique features are still not clear. Another difference is the presence of a deep cavity on one side of the hexamer as well as an intermolecular six-stranded beta barrel that seems to block the central pore that is present in other BMC domain proteins. EutS proteins included in this CD are sequence homologs of PduU. They are encoded within eut operon and may be required for the formation of the outer shell of bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutS might also form hexamers and play similar functional roles in the construction of the eut organelle outer shell at present no experimental evidence directly supports this view. 110
26147 132887 cd07047 BMC_PduB_repeat1 1,2-propanediol utilization protein B (PduB), Bacterial Micro-Compartment (BMC) domain repeat 1. PduB proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduB might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, at present no experimental evidence directly supports this view. PduB proteins contain two tandem BMC domain repeats. This CD contains repeat 1 (the first BMC domain of PduB). 134
26148 132888 cd07048 BMC_PduB_repeat2 1,2-propanediol utilization protein B (PduB), Bacterial Micro-Compartment (BMC) domain repeat 2. PduB proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduB might form hexamers and further assemble into the flat facets of the polyhedral outer shell of the pdu organelles, at present no experimental evidence directly supports this view. PduB proteins contain two tandem BMC domain repeats. This CD contains repeat 2 (the second BMC domain of PduB). 70
26149 132889 cd07049 BMC_EutL_repeat1 ethanolamine utilization protein L (EutL), Bacterial Micro-Compartment (BMC) domain repeat 1. EutL proteins are homologs of the carboxysome shell protein. They are encoded within the eut operon and might be required for the formation of the outer shell of the bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutL might form hexamers and further assemble into the flat facets of the polyhedral outer shell of the eut organelles, at present no experimental evidence directly supports this view. EutL proteins contain two tandem BMC domains. This CD includes domain 1 (the first BMC domain of EutL). 103
26150 132890 cd07050 BMC_EutL_repeat2 ethanolamine utilization protein L (EutL), Bacterial Micro-Compartment (BMC) domain repeat 2. EutL proteins are homologs of the carboxysome shell protein. They are encoded within the eut operon and might be required for the formation of the outer shell of the bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutL might form hexamers and further assemble into the flat facets of the polyhedral outer shell of eut organelles, at present no experimental evidence directly supports this view. EutL proteins contain two tandem BMC domains. This CD includes domain 2 (the second BMC domain of EutL). 87
26151 132891 cd07051 BMC_like_1_repeat1 Bacterial Micro-Compartment (BMC)-like domain 1 repeat 1. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 1 (the first BMC domain of BMC like 1 proteins). 111
26152 132892 cd07052 BMC_like_1_repeat2 Bacterial Micro-Compartment (BMC)-like domain 1 repeat 2. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 2 (the second BMC domain of BMC like 1 proteins). 79
26153 132893 cd07053 BMC_PduT_repeat1 1,2-propanediol utilization protein T (PduT), Bacterial Micro-Compartment (BMC) domain repeat 1. PduT proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduT might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, at present no experimental evidence directly supports this view. PduT proteins contain two tandem BMC domain repeats. This CD contains repeat 1 (the first BMC domain of PduT). EutM proteins, sequence homologs of the carboxysome shell protein, are also included in this CD. They too might exist as hexamers and might play similar functional roles in the construction of the eut organelle outer shell, which remains poorly understood. 76
26154 132894 cd07054 BMC_PduT_repeat2 1,2-propanediol utilization protein T (PduT), Bacterial Micro-Compartment (BMC) domain repeat 2. PduT proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduT might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, at present no experimental evidence directly supports this view. PduT proteins contain two tandem BMC domain repeats. This CD contains repeat 2 (the second BMC domain of PduT). EutM proteins, sequence homologs of the carboxysome shell protein, are also included in this CD. They too might exist as hexamers and might play similar functional roles in the construction of the eut organelle outer shell, which remains poorly understood. 78
26155 132895 cd07055 BMC_like_2 Bacterial Micro-Compartment (BMC)-like domain 2. BMC like 2 domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view. 61
26156 132896 cd07056 BMC_PduK 1,2-propanediol utilization protein K (PduK), Bacterial Micro-Compartment (BMC) domain. PduK proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduK might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, at present no experimental evidence directly supports this view. 77
26157 132897 cd07057 BMC_CcmK Carbon dioxide concentrating mechanism (CcmK); Bacterial Micro-Compartment (BMC) domain. CcmK1-4 and CcmL proteins found in Synechocystis sp. strain PCC 6803 make up the beta carboxysome shell. These CcmK proteins have been shown to form hexameric units, while the CcmL proteins have been shown to form pentameric units. Together these proteins further assemble into the flat facets of the polyhedral carboxysome shell. The structures suggest that the central pores and the gaps between hexamers limit the transport of metabolites into and out of the carboxysome. 88
26158 132898 cd07058 BMC_CsoS1 Carboxysome Shell 1 (CsoS1); Bacterial Micro-Compartment (BMC) domain. The cso operon in Halothiobacillus neapolitanus contains the genes involved in alpha carboxysome function including those for the carboxysome shell proteins: CsoS1A, CsoS1B, and CsoS1C. CsoS1A has been shown to form hexameric units which further assemble into the flat facets of the polyhedral carboxysome shell. The structures suggest that the central pores and the gaps between hexamers limit the transport of metabolites into and out of the carboxysome. Although it has been suggested that other homologous proteins, CsoS1B and CsoS1C, in this family might also form hexamers and play similar functional roles in the construction of the carboxysome outer shell, at present no experimental evidence directly supports this view. 88
26159 132899 cd07059 BMC_PduA 1,2-propanediol utilization protein A (PduA), Bacterial Micro-Compartment (BMC) domain. PduA is encoded within the 1,2-propanediol utilization (pdu) operon along with other homologous carboxysome shell proteins PduB, B', J, K, T, and U. PduA is thought to be required for the formation of the outer shell of bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduA might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, like PduU does, at present no experimental evidence directly supports this view. 85
26160 349952 cd07060 SPOUT_MTase SPOUT superfamily of SAM-dependent RNA methyltransferases. The SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, also known as class IV methyltransferase family, is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. Members of the SPOUT superfamily that have been characterized functionally are involved in post-transcriptional RNA modification by catalyzing methylation of the 2-OH group of ribose, the N-1 atom of guanosine 37 in tRNA, or the N-3 atom of uridine 1498 in 16S rRNA. 99
26161 132717 cd07061 HP_HAP_like Histidine phosphatase domain found in histidine acid phosphatases and phytases; contains a His residue which is phosphorylated during the reaction. Catalytic domain of HAP (histidine acid phosphatases) and phytases (myo-inositol hexakisphosphate phosphohydrolases). The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. Functions in this subgroup include roles in metabolism, signaling, or regulation; for example, Escherichia coli glucose-1-phosphatase functions to scavenge glucose from glucose-1-phosphate, and the signaling molecules inositol 1,3,4,5,6-pentakisphosphate (InsP5) and inositol hexakisphosphate (InsP6) are in vivo substrates for eukaryotic multiple inositol polyphosphate phosphatase 1 (Minpp1). Phytases scavenge phosphate from extracellular sources and are added to animal feed, while prostatic acid phosphatase (PAP) has been used for many years as a serum marker for prostate cancer. Recently PAP has been shown in mouse models to suppress pain by functioning as an ecto-5'-nucleotidase. In vivo it dephosphorylates extracellular adenosine monophosphate (AMP), generating adenosine and leading to the activation of A1-adenosine receptors in dorsal spinal cord. 242
26162 132883 cd07062 Peptidase_S66_mccF_like Microcin C7 self-immunity protein determines resistance to exogenous microcin C7. Microcin C7 self-immunity protein (mccF): MccF, a homolog of the LD-carboxypeptidase family, mediates resistance against exogenously added microcin C7 (MccC7), a ribosomally-encoded peptide antibiotic that contains a phosphoramidate linkage to adenosine monophosphate at its C-terminus. The plasmid-encoded mccF gene is transcribed in the opposite direction to the other five genes (mccA-E) and is required for the full expression of immunity but not for production. The catalytic triad residues (Ser, His, Glu) of LD-carboxypeptidase are also conserved in MccF, strongly suggesting that MccF shares the hydrolytic activity with LD-carboxypeptidases. Substrates of MccF have not been deduced, but could likely be microcin C7 precursors. The possible role of MccF is to defend producer cells against exogenous microcin from re-entering after having been exported. It is suggested that MccF is involved in microcin degradation or sequestration in the periplasm. 308
26163 132881 cd07064 AlkD_like_1 A new structural DNA glycosylase containing HEAT-like repeats. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the identified five structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix). DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in the Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerase, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity. The known structures for members of this family, AlkC and AlkD from Bacillus cereus, are distant homologues and are composed of six variant HEAT (Huntington/Elongation/A subunit/Target of rapamycin) repeats. HEAT motifs are ~45-amino acid sequences that form antiparallel alpha-helices, which are packed by a conserved hydrophobic interface and are tandemly repeated to form superhelical alpha-structures. AlkD and AlkC are specific for removal of 3-methyladenine (3mA) and 7-methylguanine (7mG) from the DNA by base excision repair. Homologues of AlkC and AlkD were also identified in other organisms. 208
26164 143549 cd07066 CRD_FZ CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain. CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. 119
26165 132718 cd07067 HP_PGM_like Histidine phosphatase domain found in phosphoglycerate mutases and related proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. Subgroup of the catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This subgroup contains cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BPase), Sts-1, SixA, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example, F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli SixA participates in the ArcB-dependent His-to-Asp phosphorelay signaling system. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. 153
26166 132753 cd07068 NR_LBD_ER_like The ligand binding domain of estrogen receptor and estrogen receptor-related receptors. The ligand binding domain of estrogen receptor (ER) and estrogen receptor-related receptors (ERRs): Estrogen receptors are a group of receptors which are activated by the hormone estrogen. Estrogen regulates many physiological processes including reproduction, bone integrity, cardiovascular health, and behavior. The main mechanism of action of the estrogen receptor is as a transcription factor by binding to the estrogen response element of target genes upon activation by estrogen and then recruiting coactivator proteins which are responsible for the transcription of target genes. Additionally some ERs may associate with other membrane proteins and can be rapidly activated by exposure of cells to estrogen. ERRs are closely related to the estrogen receptor (ER) family but lack the ability to bind estrogen. ERRs can interfere with the classic ER-mediated estrogen signaling pathway, positively or negatively. ERRs share target genes, co-regulators and promoters with the estrogen receptor (ER) family. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER and ERRs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 221
26167 132754 cd07069 NR_LBD_Lrh-1 The ligand binding domain of the liver receptor homolog-1, a member of nuclear receptor superfamily. The ligand binding domain (LBD) of the liver receptor homolog-1 (LRH-1): LRH-1 belongs to nuclear hormone receptor superfamily, and is expressed mainly in the liver, intestine, exocrine pancreas, and ovary. Most nuclear receptors function as homodimers or heterodimers. However, LRH-1 binds DNA as a monomer, and is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development. Recently, phospholipids have been identified as potential ligands for LRH-1 and steroidogenic factor-1 (SF-1). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, LRH-1 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 241
26168 132755 cd07070 NR_LBD_SF-1 The ligand binding domain of nuclear receptor steroidogenic factor 1, a member of nuclear receptor superfamily. The ligand binding domain of nuclear receptor steroidogenic factor 1 (SF-1): SF-1, a member of the nuclear hormone receptor superfamily, is an essential regulator of endocrine development and function and is considered a master regulator of reproduction. Most nuclear receptors function as homodimers or heterodimers; however, SF-1 binds to its target genes as a monomer, recognizing the variations of the DNA sequence motif, T/CCA AGGTCA. SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been identified as potential ligands of SF-1. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, SF-1 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 237
26169 132756 cd07071 NR_LBD_Nurr1 The ligand binding domain of Nurr1, a member of conserved family of nuclear receptors. The ligand binding domain of nuclear receptor Nurr1: Nurr1 belongs to the conserved family of nuclear receptors. It is a transcription factor that is expressed in the embryonic ventral midbrain and is critical for the development of dopamine (DA) neurons. Structural studies have shown that the ligand binding pocket of Nurr1 is filled by bulky hydrophobic residues, making it unable to bind to ligands. Therefore, it belongs to the class of orphan receptors. However, Nurr1 forms heterodimers with RXR and can promote signaling via its partner, RXR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, Nurr1 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 238
26170 132757 cd07072 NR_LBD_DHR38_like Ligand binding domain of DHR38_like proteins, members of the nuclear receptor superfamily. The ligand binding domain of nuclear receptor DHR38_like proteins: DHR38 is a member of the steroid receptor superfamily in Drosophila. DHR38 interacts with the USP component of the ecdysone receptor complex, suggesting that DHR38 might modulate ecdysone-triggered signals in the fly, in addition to the ECR/USP pathway. At least four differentially expressed mRNA isoforms have been detected during development. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR38 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 239
26171 132758 cd07073 NR_LBD_AR Ligand binding domain of the nuclear receptor androgen receptor, ligand activated transcription regulator. The ligand binding domain of the androgen receptor (AR): AR is a member of the nuclear receptor family. It is activated by binding either of the androgenic hormones, testosterone or dihydrotestosterone, which are responsible for male primary sexual characteristics and for secondary male characteristics, respectively. The primary mechanism of action of ARs is by direct regulation of gene transcription. The binding of an androgen results in a conformational change in the androgen receptor which causes its transport from the cytosol into the cell nucleus, and dimerization. The receptor dimer binds to a hormone response element of AR-regulated genes and modulates their expression. Another mode of action is independent of their interactions with DNA. The receptors interact directly with signal transduction proteins in the cytoplasm, causing rapid changes in cell function, such as ion transport. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, AR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD is not only involved in binding to androgen, but also involved in binding of coactivator proteins and dimerization. A ligand dependent nuclear export signal is also present at the ligand binding domain. 246
26172 132759 cd07074 NR_LBD_PR Ligand binding domain of the progesterone receptor, a member of the nuclear hormone receptor. The ligand binding domain of the progesterone receptor (PR): PR is a member of the nuclear receptor superfamily of ligand dependent transcription factors, mediating the biological actions of progesterone. PR functions in a variety of biological processes including development of the mammary gland, regulating cell cycle progression, protein processing, and metabolism. When no binding hormone is present the carboxyl terminal inhibits transcription. Binding to a hormone induces a structural change that removes the inhibitory action. After progesterone binds to the receptor, PR forms a dimer and the complex enters the nucleus where it interacts with the hormone response element (HRE) in the promoters of progesterone responsive genes and alters their transcription. In addition, rapid actions of PR that occur independent of transcription, have also been observed in several tissues like brain, liver, mammary gland and spermatozoa. There are two natural PR isoforms called PR-A and PR-B. PR-B has an additional stretch of 164 amino acids at the N terminus. The extra domain in PR-B performs activation functions by recruiting coactivators that could not be recruited by PR-A. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD is not only involved in binding to progesterone, but also involved in coactivator binding and dimerization. 248
26173 132760 cd07075 NR_LBD_MR Ligand binding domain of the mineralocorticoid receptor, a member of the nuclear receptor superfamily. The ligand binding domain of the mineralocorticoid receptor (MR): MR, also called aldosterone receptor, is a member of nuclear receptor superfamily involved in the regulation of electrolyte and fluid balance. The receptor is activated by mineralocorticoids such as aldosterone and deoxycorticosterone as well as glucocorticoids, like cortisol and cortisone. Binding of its ligand results in its translocation to the cell nucleus, homodimerization and binding to hormone response elements (HREs) present in the promoter of MR controlled genes. This results in the recruitment of the coactivators and the transcription of the activated genes. MR is expressed in many tissues and its activation results in the expression of proteins regulating electrolyte and fluid balance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, MR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD, in addition to binding ligand, contains a ligand-dependent activation function-2 (AF-2). 248
26174 132761 cd07076 NR_LBD_GR Ligand binding domain of the glucocorticoid receptor, a member of the nuclear receptor superfamily. The ligand binding domain of the glucocorticoid receptor (GR): GR is a ligand-activated transcription factor belonging to the nuclear receptor superfamily. It binds with high affinity to cortisol and other glucocorticoids. GR is expressed in almost every cell in the body and regulates genes controlling a wide variety of processes including the development, metabolism, and immune response of the organism. In the absence of hormone, the glucocorticoid receptor (GR) is complexed with a variety of heat shock proteins in the cytosol. The binding of the glucocorticoids results in release of the heat shock proteins and transforms it to its active state. One mechanism of action of GR is by direct activation of gene transcription. The activated form of GR forms dimers, translocates into the nucleus, and binds to specific hormone responsive elements, activating gene transcription. GR can also function as a repressor of other gene transcription activators, such as NF-kappaB and AF-1 by directly binding to them, and blocking the expression of their activated genes. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD also functions for dimerization and chaperone protein association. 247
26175 143396 cd07077 ALDH-like NAD(P)+-dependent aldehyde dehydrogenase-like (ALDH-like) family. The aldehyde dehydrogenase-like (ALDH-like) group of the ALDH superfamily of NAD(P)+-dependent enzymes which, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. This group includes families ALDH18, ALDH19, and ALDH20 and represents such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group. 397
26176 143397 cd07078 ALDH NAD(P)+ dependent aldehyde dehydrogenase family. The aldehyde dehydrogenase family (ALDH) of NAD(P)+ dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or as osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-like) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. 432
26177 143398 cd07079 ALDH_F18-19_ProA-GPR Gamma-glutamyl phosphate reductase (GPR), aldehyde dehydrogenase families 18 and 19. Gamma-glutamyl phosphate reductase (GPR), a L-proline biosynthetic pathway (PBP) enzyme that catalyzes the NADPH dependent reduction of L-gamma-glutamyl 5-phosphate into L-glutamate 5-semialdehyde and phosphate. The glutamate route of the PBP involves two enzymatic steps catalyzed by gamma-glutamyl kinase (GK, EC 2.7.2.11) and GPR (EC 1.2.1.41). These enzymes are fused into the bifunctional enzyme, ProA or delta(1)-pyrroline-5-carboxylate synthetase (P5CS) in plants and animals, whereas they are separate enzymes in bacteria and yeast. In humans, the P5CS (ALDH18A1), an inner mitochondrial membrane enzyme, is essential to the de novo synthesis of the amino acids proline and arginine. Tomato (Lycopersicon esculentum) has both the prokaryotic-like polycistronic operons encoding GK and GPR (PRO1, ALDH19) and the full-length, bifunctional P5CS (PRO2, ALDH18B1). 406
26178 143399 cd07080 ALDH_Acyl-CoA-Red_LuxC Acyl-CoA reductase LuxC. Acyl-CoA reductase, LuxC, (EC=1.2.1.50) is the fatty acid reductase enzyme responsible for synthesis of the aldehyde substrate for the luminescent reaction catalyzed by luciferase. The fatty acid reductase, a luminescence-specific, multienzyme complex (LuxCDE), reduces myristic acid to generate the long chain fatty aldehyde required for the luciferase-catalyzed reaction resulting in the emission of blue-green light. Mutational studies of conserved cysteines of LuxC revealed that the cysteine which aligns with the catalytic cysteine conserved throughout the ALDH superfamily is the LuxC acylation site. This CD is composed of mainly bacterial sequences but also includes a few archaeal sequences similar to the Methanospirillum hungatei acyl-CoA reductase RfbN. 422
26179 143400 cd07081 ALDH_F20_ACDH_EutE-like Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway. 439
26180 143401 cd07082 ALDH_F11_NP-GAPDH NADP+-dependent non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase and ALDH family 11. NADP+-dependent non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase (NP-GAPDH, EC=1.2.1.9) catalyzes the irreversible oxidation of glyceraldehyde 3-phosphate to 3-phosphoglycerate generating NADPH for biosynthetic reactions. This CD also includes the Arabidopsis thaliana osmotic-stress-inducible ALDH family 11, ALDH11A3 and similar sequences. In autotrophic eukaryotes, NP-GAPDH generates NADPH for biosynthetic processes from photosynthetic glyceraldehyde-3-phosphate exported from the chloroplast and catalyzes one of the classic glycolytic bypass reactions unique to plants. 473
26181 143402 cd07083 ALDH_P5CDH ALDH subfamily NAD+-dependent delta(1)-pyrroline-5-carboxylate dehydrogenase-like. ALDH subfamily of the NAD+-dependent, delta(1)-pyrroline-5-carboxylate dehydrogenases (P5CDH, EC=1.5.1.12). The proline catabolic enzymes, proline dehydrogenase and P5CDH catalyze the two-step oxidation of proline to glutamate. P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. In certain prokaryotes such as Escherichia coli, PutA is also a transcriptional repressor of the proline utilization genes. Monofunctional enzyme sequences such as those seen in the Bacillus RocA P5CDH are also present in this subfamily as well as the human ALDH4A1 P5CDH and the Drosophila Aldh17 P5CDH. 500
26182 143403 cd07084 ALDH_KGSADH-like ALDH subfamily: NAD(P)+-dependent alpha-ketoglutaric semialdehyde dehydrogenases and plant delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH family 12-like. ALDH subfamily which includes the NAD(P)+-dependent, alpha-ketoglutaric semialdehyde dehydrogenases (KGSADH, EC 1.2.1.26); plant delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), ALDH family 12; the N-terminal domain of the MaoC (monoamine oxidase C) dehydratase regulatory protein; and orthologs of MaoC, PaaZ and PaaN, which are putative ring-opening enzymes of the aerobic phenylacetic acid catabolic pathway. 442
26183 143404 cd07085 ALDH_F6_MMSDH Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD. 478
26184 143405 cd07086 ALDH_F7_AASADH-like NAD+-dependent alpha-aminoadipic semialdehyde dehydrogenase and related proteins. ALDH subfamily which includes the NAD+-dependent, alpha-aminoadipic semialdehyde dehydrogenase (AASADH, EC=1.2.1.31), also known as Antiquitin-1, ALDH7A1, ALDH7B or delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH), and other similar sequences, such as the uncharacterized aldehyde dehydrogenase of Candidatus kuenenia AldH (locus CAJ73105). 478
26185 143406 cd07087 ALDH_F3-13-14_CALDH-like ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate. 426
26186 143407 cd07088 ALDH_LactADH-AldA Escherichia coli lactaldehyde dehydrogenase AldA-like. Lactaldehyde dehydrogenase from Escherichia coli (AldA, LactADH, EC=1.2.1.22), an NAD(+)-dependent enzyme involved in the metabolism of L-fucose and L-rhamnose, and other similar sequences are present in this CD. 468
26187 143408 cd07089 ALDH_CddD-AldA-like Rhodococcus ruber 6-oxolauric acid dehydrogenase-like and related proteins. The 6-oxolauric acid dehydrogenase (CddD) from Rhodococcus ruber SC1 which converts 6-oxolauric acid to dodecanedioic acid; and the aldehyde dehydrogenase (locus SSP0762) from Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305 and also, the Mycobacterium tuberculosis H37Rv ALDH AldA (locus Rv0768) sequence; and other similar sequences, are included in this CD. 459
26188 143409 cd07090 ALDH_F9_TMBADH NAD+-dependent 4-trimethylaminobutyraldehyde dehydrogenase, ALDH family 9A1. NAD+-dependent, 4-trimethylaminobutyraldehyde dehydrogenase (TMABADH, EC=1.2.1.47), also known as aldehyde dehydrogenase family 9 member A1 (ALDH9A1) in humans, is a cytosolic tetramer which catalyzes the oxidation of gamma-aminobutyraldehyde involved in 4-aminobutyric acid (GABA) biosynthesis and also oxidizes betaine aldehyde (gamma-trimethylaminobutyraldehyde) which is involved in carnitine biosynthesis. 457
26189 143410 cd07091 ALDH_F1-2_Ald2-like ALDH subfamily: ALDH families 1 and 2, including 10-formyltetrahydrofolate dehydrogenase, NAD+-dependent retinal dehydrogenase 1 and related proteins. ALDH subfamily which includes the NAD+-dependent retinal dehydrogenase 1 (RALDH 1, ALDH1, EC=1.2.1.36), also known as aldehyde dehydrogenase family 1 member A1 (ALDH1A1), in humans, a homotetrameric, cytosolic enzyme that catalyzes the oxidation of retinaldehyde to retinoic acid. Human ALDH1B1 and ALDH2 are also in this cluster; both are mitochondrial homotetramers which play important roles in acetaldehyde oxidation; ALDH1B1 in response to UV light exposure and ALDH2 during ethanol metabolism. 10-formyltetrahydrofolate dehydrogenase (FTHFDH, EC=1.5.1.6), also known as aldehyde dehydrogenase family 1 member L1 (ALDH1L1), in humans, a multi-domain homotetramer with an N-terminal formyl transferase domain and a C-terminal ALDH domain. FTHFDH catalyzes an NADP+-dependent dehydrogenase reaction resulting in the conversion of 10-formyltetrahydrofolate to tetrahydrofolate and CO2. Also included in this subfamily is the Arabidopsis aldehyde dehydrogenase family 2 members B4 and B7 (EC=1.2.1.3), which are mitochondrial homotetramers that oxidize acetaldehyde and glycolaldehyde, as well as the Arabidopsis cytosolic homotetramer ALDH2C4 (EC=1.2.1.3), an enzyme involved in the oxidation of sinapaldehyde and coniferaldehyde. Also included is the AldA aldehyde dehydrogenase of Aspergillus nidulans (locus AN0554), the aldehyde dehydrogenase 2 (YMR170c, ALD5, EC=1.2.1.5) of Saccharomyces cerevisiae, and other similar sequences. 476
26190 143411 cd07092 ALDH_ABALDH-YdcW Escherichia coli NAD+-dependent gamma-aminobutyraldehyde dehydrogenase YdcW-like. NAD+-dependent, tetrameric, gamma-aminobutyraldehyde dehydrogenase (ABALDH), YdcW of Escherichia coli K12, catalyzes the oxidation of gamma-aminobutyraldehyde to gamma-aminobutyric acid. ABALDH can also oxidize n-alkyl medium-chain aldehydes, but with a lower catalytic efficiency. 450
26191 143412 cd07093 ALDH_F8_HMSADH Human aldehyde dehydrogenase family 8 member A1-like. In humans, the aldehyde dehydrogenase family 8 member A1 (ALDH8A1) protein functions to convert 9-cis-retinal to 9-cis-retinoic acid and has a preference for NAD+. Also included in this CD is the 2-hydroxymuconic semialdehyde dehydrogenase (HMSADH) which catalyzes the conversion of 2-hydroxymuconic semialdehyde to 4-oxalocrotonate, a step in the meta cleavage pathway of aromatic hydrocarbons in bacteria. Such HMSADHs seen here are: XylG of the TOL plasmid pWW0 of Pseudomonas putida, TomC of Burkholderia cepacia G4, and AphC of Comamonas testosteroni. 455
26192 143413 cd07094 ALDH_F21_LactADH-like ALDH subfamily: NAD+-dependent, lactaldehyde dehydrogenase, ALDH family 21 A1, and related proteins. ALDH subfamily which includes Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123), and NAD+-dependent, lactaldehyde dehydrogenase (EC=1.2.1.22) and like sequences. 453
26193 143414 cd07095 ALDH_SGSD_AstD N-succinylglutamate 5-semialdehyde dehydrogenase, AstD-like. N-succinylglutamate 5-semialdehyde dehydrogenase or succinylglutamic semialdehyde dehydrogenase (SGSD, E. coli AstD, EC=1.2.1.71) is involved in L-arginine degradation via the arginine succinyltransferase (AST) pathway and catalyzes the NAD+-dependent oxidation of succinylglutamate semialdehyde into succinylglutamate. 431
26194 143415 cd07097 ALDH_KGSADH-YcbD Bacillus subtilis NADP+-dependent alpha-ketoglutaric semialdehyde dehydrogenase ycbD-like. Kinetic studies of the Bacillus subtilis ALDH-like ycbD protein, which is involved in d-glucarate/d-galactarate utilization, reveal that it is a NADP+-dependent, alpha-ketoglutaric semialdehyde dehydrogenase (KGSADH). KGSADHs (EC 1.2.1.26) catalyze the NAD(P)+-dependent conversion of KGSA to alpha-ketoglutarate. Interestingly, the NADP+-dependent, tetrameric, 2,5-dioxopentanoate dehydrogenase (EC=1.2.1.26), an enzyme involved in the catabolic pathway for D-arabinose in Sulfolobus solfataricus, also clusters in this group. This CD shows a distant phylogenetic relationship to the Azospirillum brasilense KGSADH-II (-III) group. 473
26195 143416 cd07098 ALDH_F15-22 Aldehyde dehydrogenase family 15A1 and 22A1-like. Aldehyde dehydrogenase family members ALDH15A1 (Saccharomyces cerevisiae YHR039C) and ALDH22A1 (Arabidopsis thaliana, EC=1.2.1.3), and similar sequences, are in this CD. Significant improvement of stress tolerance in tobacco plants was observed by overexpressing the ALDH22A1 gene from maize (Zea mays) and was accompanied by a reduction of malondialdehyde derived from cellular lipid peroxidation. 465
26196 143417 cd07099 ALDH_DDALDH Methylomonas sp. 4,4'-diapolycopene-dialdehyde dehydrogenase-like. The 4,4'-diapolycopene-dialdehyde dehydrogenase (DDALDH) involved in C30 carotenoid synthesis in Methylomonas sp. strain 16a and other similar sequences are present in this CD. DDALDH converts 4,4'-diapolycopene-dialdehyde into 4,4'-diapolycopene-diacid. 453
26197 143418 cd07100 ALDH_SSADH1_GabD1 Mycobacterium tuberculosis succinate-semialdehyde dehydrogenase 1-like. Succinate-semialdehyde dehydrogenase 1 (SSADH1, GabD1, EC=1.2.1.16) catalyzes the NADP(+)-dependent oxidation of succinate semialdehyde (SSA) to succinate. SSADH activity in Mycobacterium tuberculosis (Mtb) is encoded by both gabD1 (Rv0234c) and gabD2 (Rv1731). The Mtb GabD1 SSADH1 reportedly is an enzyme of the gamma-aminobutyrate shunt, which forms a functional link between two TCA half-cycles by converting alpha-ketoglutarate to succinate. 429
26198 143419 cd07101 ALDH_SSADH2_GabD2 Mycobacterium tuberculosis succinate-semialdehyde dehydrogenase 2-like. Succinate-semialdehyde dehydrogenase 2 (SSADH2) and similar proteins are in this CD. SSADH1 (GabD1, EC=1.2.1.16) catalyzes the NADP(+)-dependent oxidation of succinate semialdehyde to succinate. SSADH activity in Mycobacterium tuberculosis is encoded by both gabD1 (Rv0234c) and gabD2 (Rv1731); however, the Vmax of GabD1 was shown to be much higher than that of GabD2, and GabD2 (SSADH2) is likely to serve physiologically as a dehydrogenase for a different aldehyde(s). 454
26199 143420 cd07102 ALDH_EDX86601 Uncharacterized aldehyde dehydrogenase of Synechococcus sp. PCC 7335 (EDX86601). Uncharacterized aldehyde dehydrogenase of Synechococcus sp. PCC 7335 (locus EDX86601) and other similar sequences, are present in this CD. 452
26200 143421 cd07103 ALDH_F5_SSADH_GabD Mitochondrial succinate-semialdehyde dehydrogenase and ALDH family members 5A1 and 5F1-like. Succinate-semialdehyde dehydrogenase, mitochondrial (SSADH, GabD, EC=1.2.1.24) catalyzes the NAD+-dependent oxidation of succinate semialdehyde (SSA) to succinate. This group includes the human aldehyde dehydrogenase family 5 member A1 (ALDH5A1) which is a mitochondrial homotetramer that converts SSA to succinate in the last step of 4-aminobutyric acid (GABA) catabolism. This CD also includes the Arabidopsis SSADH gene product ALDH5F1. Mutations in this gene result in the accumulation of H2O2, suggesting a role in plant defense against the environmental stress of elevated reactive oxygen species. 451
26201 143422 cd07104 ALDH_BenzADH-like ALDH subfamily: NAD(P)+-dependent benzaldehyde dehydrogenase II, vanillin dehydrogenase, p-hydroxybenzaldehyde dehydrogenase and related proteins. ALDH subfamily which includes the NAD(P)+-dependent, benzaldehyde dehydrogenase II (XylC, BenzADH, EC=1.2.1.28) involved in the oxidation of benzyl alcohol to benzoate; p-hydroxybenzaldehyde dehydrogenase (PchA, HBenzADH) which catalyzes the oxidation of p-hydroxybenzaldehyde to p-hydroxybenzoic acid; vanillin dehydrogenase (Vdh, VaniDH) involved in the metabolism of ferulic acid as seen in Pseudomonas putida KT2440; and other related sequences. 431
26202 143423 cd07105 ALDH_SaliADH Salicylaldehyde dehydrogenase, DoxF-like. Salicylaldehyde dehydrogenase (DoxF, SaliADH, EC=1.2.1.65) involved in the upper naphthalene catabolic pathway of Pseudomonas strain C18 and other similar sequences are present in this CD. 432
26203 143424 cd07106 ALDH_AldA-AAD23400 Streptomyces aureofaciens putative aldehyde dehydrogenase AldA (AAD23400)-like. Putative aldehyde dehydrogenase, AldA, from Streptomyces aureofaciens (locus AAD23400) and other similar sequences are present in this CD. 446
26204 143425 cd07107 ALDH_PhdK-like Nocardioides 2-carboxybenzaldehyde dehydrogenase, PhdK-like. Nocardioides sp. strain KP7 2-carboxybenzaldehyde dehydrogenase (PhdK), an enzyme involved in phenanthrene degradation, and other similar sequences, are present in this CD. 456
26205 143426 cd07108 ALDH_MGR_2402 Magnetospirillum NAD(P)+-dependent aldehyde dehydrogenase MSR-1-like. NAD(P)+-dependent aldehyde dehydrogenase of Magnetospirillum gryphiswaldense MSR-1 (MGR_2402), and other similar sequences, are present in this CD. 457
26206 143427 cd07109 ALDH_AAS00426 Uncharacterized Saccharopolyspora spinosa aldehyde dehydrogenase (AAS00426)-like. Uncharacterized aldehyde dehydrogenase of Saccharopolyspora spinosa (AAS00426) and other similar sequences, are present in this CD. 454
26207 143428 cd07110 ALDH_F10_BADH Arabidopsis betaine aldehyde dehydrogenase 1 and 2, ALDH family 10A8 and 10A9-like. Present in this CD are the Arabidopsis betaine aldehyde dehydrogenase (BADH) 1 (chloroplast) and 2 (mitochondria), also known as, aldehyde dehydrogenase family 10 member A8 and aldehyde dehydrogenase family 10 member A9, respectively, and are putative dehydration- and salt-inducible BADHs (EC 1.2.1.8) that catalyze the oxidation of betaine aldehyde to the compatible solute glycine betaine. 456
26208 143429 cd07111 ALDH_F16 Aldehyde dehydrogenase family 16A1-like. Uncharacterized aldehyde dehydrogenase family 16 member A1 (ALDH16A1) and other related sequences are present in this CD. The active site cysteine and glutamate residues are not conserved in the human ALDH16A1 protein sequence. 480
26209 143430 cd07112 ALDH_GABALDH-PuuC Escherichia coli NADP+-dependent gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase PuuC-like. NADP+-dependent, gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase (GABALDH) PuuC of Escherichia coli which catalyzes the conversion of putrescine to 4-aminobutanoate and other similar sequences are present in this CD. 462
26210 143431 cd07113 ALDH_PADH_NahF Escherichia coli NAD+-dependent phenylacetaldehyde dehydrogenase PadA-like. NAD+-dependent, homodimeric, phenylacetaldehyde dehydrogenase (PADH, EC=1.2.1.39) PadA of Escherichia coli involved in the catabolism of 2-phenylethylamine, and other related sequences, are present in this CD. Also included is the Pseudomonas fluorescens ST StyD PADH involved in styrene catabolism, the Sphingomonas sp. LB126 FldD protein involved in fluorene degradation, and the Novosphingobium aromaticivorans NahF salicylaldehyde dehydrogenase involved in the NAD+-dependent conversion of salicylaldehyde to salicylate. 477
26211 143432 cd07114 ALDH_DhaS Uncharacterized Candidatus pelagibacter aldehyde dehydrogenase, DhaS-like. Uncharacterized aldehyde dehydrogenase from Candidatus pelagibacter (DhaS) and other related sequences are present in this CD. 457
26212 143433 cd07115 ALDH_HMSADH_HapE Pseudomonas fluorescens 4-hydroxymuconic semialdehyde dehydrogenase-like. 4-hydroxymuconic semialdehyde dehydrogenase (HapE, EC=1.2.1.61) of Pseudomonas fluorescens ACB involved in 4-hydroxyacetophenone degradation, and putative hydroxycaproate semialdehyde dehydrogenase (ChnE) of Brachymonas petroleovorans involved in cyclohexane metabolism, and other similar sequences, are present in this CD. 453
26213 143434 cd07116 ALDH_ACDHII-AcoD Ralstonia eutrophus NAD+-dependent acetaldehyde dehydrogenase II-like. Included in this CD is the NAD+-dependent, acetaldehyde dehydrogenase II (AcDHII, AcoD, EC=1.2.1.3) from Ralstonia (Alcaligenes) eutrophus H16 involved in the catabolism of acetoin and ethanol, and similar proteins, such as, the dimeric dihydrolipoamide dehydrogenase of the acetoin dehydrogenase enzyme system of Klebsiella pneumoniae. Also included are sequences similar to the NAD+-dependent chloroacetaldehyde dehydrogenases (AldA and AldB) of Xanthobacter autotrophicus GJ10 which are involved in the degradation of 1,2-dichloroethane. These proteins apparently require RpoN factors for expression. 479
26214 143435 cd07117 ALDH_StaphAldA1 Uncharacterized Staphylococcus aureus AldA1 (SACOL0154) aldehyde dehydrogenase-like. Uncharacterized aldehyde dehydrogenase from Staphylococcus aureus (AldA1, locus SACOL0154) and other similar sequences are present in this CD. 475
26215 143436 cd07118 ALDH_SNDH Gluconobacter oxydans L-sorbosone dehydrogenase-like. Included in this CD is the L-sorbosone dehydrogenase (SNDH) from Gluconobacter oxydans UV10. In G. oxydans, D-sorbitol is converted to 2-keto-L-gulonate (a precursor of L-ascorbic acid) in sequential oxidation steps catalyzed by a FAD-dependent, L-sorbose dehydrogenase and an NAD(P)+-dependent, L-sorbosone dehydrogenase. 454
26216 143437 cd07119 ALDH_BADH-GbsA Bacillus subtilis NAD+-dependent betaine aldehyde dehydrogenase-like. Included in this CD is the NAD+-dependent, betaine aldehyde dehydrogenase (BADH, GbsA, EC=1.2.1.8) of Bacillus subtilis involved in the synthesis of the osmoprotectant glycine betaine from choline or glycine betaine aldehyde. 482
26217 143438 cd07120 ALDH_PsfA-ACA09737 Pseudomonas putida aldehyde dehydrogenase PsfA (ACA09737)-like. Included in this CD is the aldehyde dehydrogenase (PsfA, locus ACA09737) of Pseudomonas putida involved in furoic acid metabolism. Transcription of psfA was induced in response to 2-furoic acid, furfuryl alcohol, and furfural. 455
26218 143439 cd07121 ALDH_EutE Ethanolamine utilization protein EutE-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), converts acetaldehyde into acetyl-CoA. This CD is limited to such monofunctional enzymes as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium. Mutations in eutE abolish the ability to utilize ethanolamine as a carbon source. 429
26219 143440 cd07122 ALDH_F20_ACDH Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway. 436
26220 143441 cd07123 ALDH_F4-17_P5CDH Delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH families 4 and 17. Delta(1)-pyrroline-5-carboxylate dehydrogenase (EC=1.5.1.12), families 4 and 17: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily. Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH), also known as ALDH4A1 in humans, is a mitochondrial homodimer involved in proline degradation and catalyzes the NAD+-dependent conversion of P5C to glutamate. This is a necessary step in the pathway interconnecting the urea and tricarboxylic acid cycles. The preferred substrate is glutamic gamma-semialdehyde, other substrates include succinic, glutaric and adipic semialdehydes. Also included in this CD is the Aldh17 Drosophila melanogaster (Q9VUC0) P5CDH and similar sequences. 522
26221 143442 cd07124 ALDH_PutA-P5CDH-RocA Delta(1)-pyrroline-5-carboxylate dehydrogenase, RocA. Delta(1)-pyrroline-5-carboxylate dehydrogenase (EC=1.5.1.12), RocA: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily. The proline catabolic enzymes, proline dehydrogenase and Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH), catalyze the two-step oxidation of proline to glutamate; P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). In this CD, monofunctional enzyme sequences such as seen in the Bacillus subtilis RocA P5CDH are also present. These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. 512
26222 143443 cd07125 ALDH_PutA-P5CDH Delta(1)-pyrroline-5-carboxylate dehydrogenase, PutA. The proline catabolic enzymes of the aldehyde dehydrogenase (ALDH) protein superfamily, proline dehydrogenase and Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), catalyze the two-step oxidation of proline to glutamate; P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. In certain prokaryotes such as Escherichia coli, PutA is also a transcriptional repressor of the proline utilization genes. 518
26223 143444 cd07126 ALDH_F12_P5CDH Delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH family 12. Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), family 12: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily. P5CDH is a mitochondrial enzyme involved in proline degradation and catalyzes the NAD+-dependent conversion of P5C to glutamate. The P5CDH, ALDH12A1 gene, in Arabidopsis, has been identified as an osmotic-stress-inducible ALDH gene. This CD contains both Viridiplantae and Alveolata P5CDH sequences. 489
26224 143445 cd07127 ALDH_PAD-PaaZ Phenylacetic acid degradation proteins PaaZ (Escherichia coli) and PaaN (Pseudomonas putida)-like. Phenylacetic acid degradation (PAD) proteins PaaZ (Escherichia coli) and PaaN (Pseudomonas putida) are putative aromatic ring cleavage enzymes of the aerobic PA catabolic pathway. PaaZ mutants were defective for growth with PA as a sole carbon source due to interruption of the putative ring opening system. This CD is limited to bacterial monofunctional enzymes. 549
26225 143446 cd07128 ALDH_MaoC-N N-terminal domain of the monoamine oxidase C dehydratase. The N-terminal domain of the MaoC dehydratase, a monoamine oxidase regulatory protein. Orthologs of MaoC include PaaZ (Escherichia coli) and PaaN (Pseudomonas putida), which are putative ring-opening enzymes of the aerobic phenylacetic acid (PA) catabolic pathway. The C-terminal domain of MaoC has sequence similarity to enoyl-CoA hydratase. Also included in this CD is a novel Burkholderia xenovorans LB400 ALDH of the aerobic benzoate oxidation (box) pathway. This pathway involves first the synthesis of a CoA thio-esterified aromatic acid, with subsequent dihydroxylation and cleavage steps, yielding the CoA thio-esterified aliphatic aldehyde, 3,4-dehydroadipyl-CoA semialdehyde, which is further converted into its corresponding CoA acid by the Burkholderia LB400 ALDH. 513
26226 143447 cd07129 ALDH_KGSADH Alpha-Ketoglutaric Semialdehyde Dehydrogenase. Alpha-Ketoglutaric Semialdehyde (KGSA) Dehydrogenase (KGSADH, EC 1.2.1.26) catalyzes the NAD(P)+-dependent conversion of KGSA to alpha-ketoglutarate. This CD contains such sequences as those seen in Azospirillum brasilense, KGSADH-II (D-glucarate/D-galactarate-inducible) and KGSADH-III (hydroxy-L-proline-inducible). Both show similar high substrate specificity for KGSA and different coenzyme specificity; KGSADH-II is NAD+-dependent and KGSADH-III is NADP+-dependent. Also included in this CD is the NADP(+)-dependent aldehyde dehydrogenase from Vibrio harveyi which catalyzes the oxidation of long-chain aliphatic aldehydes to acids. 454
26227 143448 cd07130 ALDH_F7_AASADH NAD+-dependent alpha-aminoadipic semialdehyde dehydrogenase, ALDH family members 7A1 and 7B. Alpha-aminoadipic semialdehyde dehydrogenase (AASADH, EC=1.2.1.31), also known as ALDH7A1, Antiquitin-1, ALDH7B, or delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH), is a NAD+-dependent ALDH. Human ALDH7A1 is involved in the pipecolic acid pathway of lysine catabolism, catalyzing the oxidation of alpha-aminoadipic semialdehyde to alpha-aminoadipate. Arabidopsis thaliana ALDH7B4 appears to be an osmotic-stress-inducible ALDH gene encoding a turgor-responsive or stress-inducible ALDH. The Streptomyces clavuligerus P6CDH appears to be involved in cephamycin biosynthesis, catalyzing the second stage of the two-step conversion of lysine to alpha-aminoadipic acid. The ALDH7A1 enzyme and others in this group have been observed as tetramers, yet the bacterial P6CDH enzyme has been reported as a monomer. 474
26228 143449 cd07131 ALDH_AldH-CAJ73105 Uncharacterized Candidatus kuenenia aldehyde dehydrogenase AldH (CAJ73105)-like. Uncharacterized aldehyde dehydrogenase of Candidatus kuenenia AldH (locus CAJ73105) and similar sequences with similarity to alpha-aminoadipic semialdehyde dehydrogenase (AASADH, human ALDH7A1, EC=1.2.1.31), Arabidopsis ALDH7B4, and Streptomyces clavuligerus delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH) are included in this CD. 478
26229 143450 cd07132 ALDH_F3AB Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes. 443
26230 143451 cd07133 ALDH_CALDH_CalB Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD. 434
26231 143452 cd07134 ALDH_AlkH-like Pseudomonas putida Aldehyde dehydrogenase AlkH-like. Aldehyde dehydrogenase AlkH (locus name P12693, EC=1.2.1.3) of the alkBFGHJKL operon that allows Pseudomonas putida to metabolize alkanes and the aldehyde dehydrogenase AldX of Bacillus subtilis (locus P46329, EC=1.2.1.3), and similar sequences, are present in this CD. 433
26232 143453 cd07135 ALDH_F14-YMR110C Saccharomyces cerevisiae aldehyde dehydrogenase family 14 and related proteins. Aldehyde dehydrogenase family 14 (ALDH14), isolated mainly from the mitochondrial outer membrane of Saccharomyces cerevisiae (YMR110C) and most closely related to the plant and animal ALDHs and fatty ALDHs family 3 members, and similar fungal sequences, are present in this CD. 436
26233 143454 cd07136 ALDH_YwdH-P39616 Bacillus subtilis aldehyde dehydrogenase ywdH-like. Uncharacterized Bacillus subtilis ywdH aldehyde dehydrogenase (locus P39616) most closely related to the ALDHs and fatty ALDHs of families 3 and 14, and similar sequences, are included in this CD. 449
26234 143455 cd07137 ALDH_F3FHI Plant aldehyde dehydrogenase family 3 members F1, H1, and I1 and related proteins. Aldehyde dehydrogenase family members 3F1, 3H1, and 3I1 (ALDH3F1, ALDH3H1, and ALDH3I1), and similar plant sequences, are in this CD. In Arabidopsis thaliana, stress-regulated expression of ALDH3I1 was observed in leaves and osmotic stress expression of ALDH3H1 was observed in root tissue, whereas, ALDH3F1 expression was not stress responsive. Functional analysis of ALDH3I1 suggests it may be involved in a detoxification pathway in plants that limits aldehyde accumulation and oxidative stress. 432
26235 143456 cd07138 ALDH_CddD_SSP0762 Rhodococcus ruber 6-oxolauric acid dehydrogenase-like. The 6-oxolauric acid dehydrogenase (CddD) from Rhodococcus ruber SC1 which converts 6-oxolauric acid to dodecanedioic acid, and the aldehyde dehydrogenase (locus SSP0762) from Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305 and other similar sequences, are included in this CD. 466
26236 143457 cd07139 ALDH_AldA-Rv0768 Mycobacterium tuberculosis aldehyde dehydrogenase AldA-like. The Mycobacterium tuberculosis NAD+-dependent, aldehyde dehydrogenase PDB structure, 3B4W, and the Mycobacterium tuberculosis H37Rv aldehyde dehydrogenase AldA (locus Rv0768) sequence, as well as the Rhodococcus rhodochrous ALDH involved in haloalkane catabolism, and other similar sequences, are included in this CD. 471
26237 143458 cd07140 ALDH_F1L_FTFDH 10-formyltetrahydrofolate dehydrogenase, ALDH family 1L. 10-formyltetrahydrofolate dehydrogenase (FTHFDH, EC=1.5.1.6), also known as aldehyde dehydrogenase family 1 member L1 (ALDH1L1) in humans, is a multi-domain homotetramer with an N-terminal formyl transferase domain and a C-terminal ALDH domain. FTHFDH catalyzes an NADP+-dependent dehydrogenase reaction resulting in the conversion of 10-formyltetrahydrofolate to tetrahydrofolate and CO2. The ALDH domain is also capable of the oxidation of short chain aldehydes to their corresponding acids. 486
26238 143459 cd07141 ALDH_F1AB_F2_RALDH1 NAD+-dependent retinal dehydrogenase 1, ALDH families 1A, 1B, and 2-like. NAD+-dependent retinal dehydrogenase 1 (RALDH 1, ALDH1, EC=1.2.1.36) also known as aldehyde dehydrogenase family 1 member A1 (ALDH1A1) in humans, is a homotetrameric, cytosolic enzyme that catalyzes the oxidation of retinaldehyde to retinoic acid. Human ALDH1B1 and ALDH2 are also in this cluster; both are mitochondrial homotetramers which play important roles in acetaldehyde oxidation; ALDH1B1 in response to UV light exposure and ALDH2 during ethanol metabolism. 481
26239 143460 cd07142 ALDH_F2BC Arabidopsis aldehyde dehydrogenase family 2 B4, B7, C4-like. Included in this CD is the Arabidopsis aldehyde dehydrogenase family 2 members B4 and B7 (EC=1.2.1.3), which are mitochondrial homotetramers that oxidize acetaldehyde and glycolaldehyde, but not L-lactaldehyde. Also in this group, is the Arabidopsis cytosolic, homotetramer ALDH2C4 (EC=1.2.1.3), an enzyme involved in the oxidation of sinapaldehyde and coniferaldehyde. 476
26240 143461 cd07143 ALDH_AldA_AN0554 Aspergillus nidulans aldehyde dehydrogenase, AldA (AN0554)-like. NAD(P)+-dependent aldehyde dehydrogenase (AldA) of Aspergillus nidulans (locus AN0554), and other similar sequences, are present in this CD. 481
26241 143462 cd07144 ALDH_ALD2-YMR170C Saccharomyces cerevisiae aldehyde dehydrogenase 2 (YMR170c)-like. NAD(P)+-dependent Saccharomyces cerevisiae aldehyde dehydrogenase 2 (YMR170c, ALD5, EC=1.2.1.5) and other similar sequences, are present in this CD. 484
26242 143463 cd07145 ALDH_LactADH_F420-Bios Methanocaldococcus jannaschii NAD+-dependent lactaldehyde dehydrogenase-like. NAD+-dependent, lactaldehyde dehydrogenase (EC=1.2.1.22) involved in the biosynthesis of coenzyme F(420) in Methanocaldococcus jannaschii through the oxidation of lactaldehyde to lactate and generation of NADH, and similar sequences are included in this CD. 456
26243 143464 cd07146 ALDH_PhpJ Streptomyces putative phosphonoformaldehyde dehydrogenase PhpJ-like. Putative phosphonoformaldehyde dehydrogenase (PhpJ), an aldehyde dehydrogenase homolog reportedly involved in the biosynthesis of phosphinothricin tripeptides in Streptomyces viridochromogenes DSM 40736, and similar sequences are included in this CD. 451
26244 143465 cd07147 ALDH_F21_RNP123 Aldehyde dehydrogenase family 21A1-like. Aldehyde dehydrogenase ALDH21A1 (gene name RNP123) was first described in the moss Tortula ruralis and is believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and ALDH21A1 expression represents a unique stress tolerance mechanism. So far, of plants, only the bryophyte sequence has been observed, but similar protein sequences from bacteria and archaea are also present in this CD. 452
26245 143466 cd07148 ALDH_RL0313 Uncharacterized ALDH (RL0313) with similarity to Tortula ruralis aldehyde dehydrogenase ALDH21A1. Uncharacterized aldehyde dehydrogenase (locus RL0313) with sequence similarity to the moss Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123) believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and similar sequences are included in this CD. 455
26246 143467 cd07149 ALDH_y4uC Uncharacterized ALDH (y4uC) with similarity to Tortula ruralis aldehyde dehydrogenase ALDH21A1. Uncharacterized aldehyde dehydrogenase (ORF name y4uC) with sequence similarity to the moss Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123) believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and similar sequences are included in this CD. 453
26247 143468 cd07150 ALDH_VaniDH_like Pseudomonas putida vanillin dehydrogenase-like. Vanillin dehydrogenase (Vdh, VaniDH) involved in the metabolism of ferulic acid and other related sequences are included in this CD. The E. coli vanillin dehydrogenase (LigV) preferred NAD+ to NADP+ and exhibited a broad substrate preference, including vanillin, benzaldehyde, protocatechualdehyde, m-anisaldehyde, and p-hydroxybenzaldehyde. 451
26248 143469 cd07151 ALDH_HBenzADH NADP+-dependent p-hydroxybenzaldehyde dehydrogenase-like. NADP+-dependent, p-hydroxybenzaldehyde dehydrogenase (PchA, HBenzADH) which catalyzes oxidation of p-hydroxybenzaldehyde to p-hydroxybenzoic acid and other related sequences are included in this CD. 465
26249 143470 cd07152 ALDH_BenzADH NAD-dependent benzaldehyde dehydrogenase II-like. NAD-dependent, benzaldehyde dehydrogenase II (XylC, BenzADH, EC=1.2.1.28) is involved in the oxidation of benzyl alcohol to benzoate. In Acinetobacter calcoaceticus, this process is carried out by the chromosomally encoded, benzyl alcohol dehydrogenase (xylB) and benzaldehyde dehydrogenase II (xylC) enzymes; whereas in Pseudomonas putida they are encoded by TOL plasmids. 443
26250 133478 cd07153 Fur_like Ferric uptake regulator (Fur) and related metalloregulatory proteins; typically iron-dependent, DNA-binding repressors and activators. Ferric uptake regulator (Fur) and related metalloregulatory proteins are iron-dependent, DNA-binding repressors and activators mainly involved in iron metabolism. A general model for Fur repression under iron-rich conditions is that activated Fur (a dimer having one Fe2+ coordinated per monomer) binds to specific DNA sequences (Fur boxes) in the promoter region of iron-responsive genes, hindering access of RNA polymerase, and repressing transcription. Positive regulation by Fur can be direct or indirect, as in the Fur repression of an anti-sense regulatory small RNA. Some members sense metal ions other than Fe2+. For example, the zinc uptake regulator (Zur) responds to Zn2+, the manganese uptake regulator (Mur) responds to Mn2+, and the nickel uptake regulator (Nur) responds to Ni2+. Other members sense signals other than metal ions. For example, PerR is a metal-dependent sensor of hydrogen peroxide. PerR regulates DNA-binding activity through metal-based protein oxidation, and co-ordinates Mn2+ or Fe2+ at its regulatory site. Fur family proteins contain an N-terminal winged-helix DNA-binding domain followed by a dimerization domain; this CD spans both those domains. 116
26251 143529 cd07154 NR_DBD_PNR_like The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family. The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes nuclear receptor Tailless (TLX), photoreceptor cell-specific nuclear receptor (PNR) and related receptors. TLX is an orphan receptor that plays a key role in neural development by regulating cell cycle progression and exit of neural stem cells in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR-like receptors have a central well-conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 73
26252 143530 cd07155 NR_DBD_ER_like DNA-binding domain of estrogen receptor (ER) and estrogen related receptors (ERR) is composed of two C4-type zinc fingers. DNA-binding domains of estrogen receptor (ER) and estrogen related receptors (ERR) are composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. ER and ERR interact with the palindromic inverted repeat, 5'GGTCAnnnTGACC-3', upstream of the target gene and modulate the rate of transcriptional initiation. ERR and ER are closely related and share sequence similarity, target genes, co-regulators and promoters. While ER is activated by endogenous estrogen, ERR lacks the ability to bind to estrogen. Estrogen receptor mediates the biological effects of hormone estrogen by the binding of the receptor dimer to estrogen response element of target genes. However, ERRs seem to interfere with the classic ER-mediated estrogen responsive signaling by targeting the same set of genes. ERRs and ERs exhibit the common modular structure with other nuclear receptors. They have a central highly conserved DNA binding domain (DBD), a non-conserved N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 75
26253 143531 cd07156 NR_DBD_VDR_like The DNA-binding domain of vitamin D receptors (VDR) like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of vitamin D receptors (VDR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. This domain interacts with specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. This family includes three types of nuclear receptors: vitamin D receptors (VDR), constitutive androstane receptor (CAR) and pregnane X receptor (PXR). VDR regulates calcium metabolism, cellular proliferation and differentiation. PXR and CAR function as sensors of toxic byproducts of cell metabolism and of exogenous chemicals, to facilitate their elimination. The DNA binding activity is regulated by their corresponding ligands. VDR is activated by Vitamin D; CAR and PXR respond to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. Like other nuclear receptors, xenobiotic receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 72
26254 143532 cd07157 2DBD_NR_DBD1 The first DNA-binding domain (DBD) of the 2DBD nuclear receptors is composed of two C4-type zinc fingers. The first DNA-binding domain (DBD) of the 2DBD nuclear receptors (NRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NRs interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. These proteins contain two DBDs in tandem, probably resulting from an ancient recombination event. The 2DBD-NRs are found only in flatworm species, mollusks and arthropods. Their biological function is unknown. 86
26255 143533 cd07158 NR_DBD_Ppar_like The DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) like nuclear receptor family. The DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. These domains interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. This family includes three known types of nuclear receptors: peroxisome proliferator-activated receptors (PPAR), REV-ERB receptors and Drosophila ecdysone-induced protein 78 (E78). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 73
26256 143534 cd07160 NR_DBD_LXR DNA-binding domain of Liver X receptors (LXRs) family is composed of two C4-type zinc fingers. DNA-binding domain of Liver X receptors (LXRs) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. LXR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. LXR operates as cholesterol sensor which protects cells from cholesterol overload by stimulating reverse cholesterol transport from peripheral tissues to the liver and its excretion in the bile. Oxidized cholesterol derivatives or oxysterols were identified as specific ligands for LXRs. LXR functions as a heterodimer with the retinoid X receptor (RXR) which may be activated by either LXR agonist or 9-cis retinoic acid, a specific RXR ligand. The LXR/RXR complex binds to a liver X receptor response element (LXRE) in the promoter region of target genes. The ideal LXRE sequence is a direct repeat-4 (DR-4) DNA fragment consisting of two AGGTCA hexameric half-sites separated by a 4-nucleotide spacer. LXR has typical NR modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and the ligand binding domain (LBD) at the C-terminal. 101
26257 143535 cd07161 NR_DBD_EcR DNA-binding domain of Ecdysone receptor (ECR) family is composed of two C4-type zinc fingers. DNA-binding domain of Ecdysone receptor (EcR) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. EcR interacts with highly degenerate pseudo-palindromic response elements, resembling inverted repeats of 5'-AGGTCA-3' separated by 1 bp, upstream of the target gene and modulates the rate of transcriptional initiation. EcR is present only in invertebrates and regulates the expression of a large number of genes during development and reproduction. EcR functions as a heterodimer by partnering with ultraspiracle protein (USP), the ortholog of the vertebrate retinoid X receptor (RXR). The natural ligands of EcR are ecdysteroids, the endogenous steroidal hormones found in invertebrates. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, EcRs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 91
26258 143536 cd07162 NR_DBD_PXR DNA-binding domain of pregnane X receptor (PXRs) is composed of two C4-type zinc fingers. DNA-binding domain (DBD) of pregnane X receptor (PXR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PXR DBD interacts with the PXR response element, a perfect repeat of two AGTTCA motifs with a 4 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation. The pregnane X receptor (PXR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. PXR functions as a heterodimer with retinoic X receptor-alpha (RXRa) and binds to a variety of promoter regions of a diverse set of target genes involved in the metabolism, transport, and ultimately, elimination of these molecules from the body. Like other nuclear receptors, PXR has a central well conserved DNA-binding domain, a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain. 87
26259 143537 cd07163 NR_DBD_TLX DNA-binding domain of Tailless (TLX) is composed of two C4-type zinc fingers. DNA-binding domain of Tailless (TLX) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. TLX interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. TLX is an orphan receptor that is expressed by neural stem/progenitor cells in the adult brain of the subventricular zone (SVZ) and the dentate gyrus (DG). It plays a key role in neural development by promoting cell cycle progression and preventing apoptosis in the developing brain. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TLX has a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 92
26260 143538 cd07164 NR_DBD_PNR_like_1 DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like proteins is composed of two C4-type zinc fingers. DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like proteins is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. PNR is a member of nuclear receptor superfamily of the ligand-activated transcription factors. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. It most likely binds to DNA as a homodimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 78
26261 143539 cd07165 NR_DBD_DmE78_like DNA-binding domain of Drosophila ecdysone-induced protein 78 (E78) like is composed of two C4-type zinc fingers. DNA-binding domain of proteins similar to Drosophila ecdysone-induced protein 78 (E78) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. E78 interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Drosophila ecdysone-induced protein 78 (E78) is a transcription factor belonging to the nuclear receptor superfamily. E78 is a product of the ecdysone-inducible gene found in an early late puff locus at position 78C during the onset of Drosophila metamorphosis. An E78 orthologue from the Platyhelminth Schistosoma mansoni (SmE78) has also been identified. It is the first E78 orthologue known outside of the molting animals--the Ecdysozoa. The SmE78 may be involved in transduction of an ecdysone signal in S. mansoni, consistent with its function in Drosophila. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, E78-like receptors have a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 81
26262 143540 cd07166 NR_DBD_REV_ERB DNA-binding domain of REV-ERB receptor-like is composed of two C4-type zinc fingers. DNA-binding domain of REV-ERB receptor-like is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. REV-ERB receptors are transcriptional regulators belonging to the nuclear receptor superfamily. They regulate a number of physiological functions including the circadian rhythm, lipid metabolism, and cellular differentiation. REV-ERB receptors bind as a monomer to a (A/G)GGTCA half-site with a 5' AT-rich extension or as a homodimer to a direct repeat 2 element (AGGTCA sequence with a 2-bp spacer), indicating functional diversity. When bound to the DNA, they recruit corepressors (NcoR/histone deacetylase 3) to the promoter, resulting in repression of the target genes. The porphyrin heme has been demonstrated to function as a ligand for REV-ERB receptor. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, REV-ERB receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 89
26263 143541 cd07167 NR_DBD_Lrh-1_like The DNA-binding domain of the Lrh-1-like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of the Lrh-1-like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This nuclear receptor family includes at least three subgroups of receptors that function in embryo development and differentiation, and other processes. FTZ-F1 interacts with the cis-acting DNA motif of ftz gene, which is required at several stages of development. Particularly, FTZ-F1 regulated genes are strongly linked to steroid biosynthesis and sex-determination; LRH-1 is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development; SF-1 is an essential regulator of endocrine development and function and is considered a master regulator of reproduction; SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been identified as potential ligands for LRH-1 and steroidogenic factor-1 (SF-1). However, the ligand for FTZ-F1 has not yet been identified. Most nuclear receptors function as homodimers or heterodimers. However, LRH-1 and SF-1 bind to DNA as monomers. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, receptors in this family have a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 93
26264 143542 cd07168 NR_DBD_DHR4_like DNA-binding domain of ecdysone-induced DHR4 orphan nuclear receptor is composed of two C4-type zinc fingers. DNA-binding domain of ecdysone-induced DHR4 orphan nuclear receptor is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Ecdysone-induced orphan receptor DHR4 is a member of the nuclear receptor family. DHR4 is expressed during the early Drosophila larval development and is induced by ecdysone. DHR4 coordinates growth and maturation in Drosophila by mediating endocrine response to the attainment of proper body size during larval development. Mutations in DHR4 result in shorter larval development which translates into smaller and lighter flies. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 90
26265 143543 cd07169 NR_DBD_GCNF_like DNA-binding domain of Germ cell nuclear factor (GCNF) F1 is composed of two C4-type zinc fingers. DNA-binding domain of Germ cell nuclear factor (GCNF) F1 is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. GCNF is a transcription factor expressed in post-meiotic stages of developing male germ cells. In vitro, GCNF has the ability to bind to direct repeat elements of 5'-AGGTCA.AGGTCA-3', as well as to an extended half-site sequence 5'-TCA.AGGTCA-3'. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GCNF has a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 90
26266 143544 cd07170 NR_DBD_ERR DNA-binding domain of estrogen related receptors (ERR) is composed of two C4-type zinc fingers. DNA-binding domain of estrogen related receptors (ERRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ERR interacts with the palindromic inverted repeat, 5'GGTCAnnnTGACC-3', upstream of the target gene and modulates the rate of transcriptional initiation. The estrogen receptor-related receptors (ERRs) are transcriptional regulators, which are closely related to the estrogen receptor (ER) family. Although ERRs lack the ability to bind to estrogen and are so-called orphan receptors, they share target genes, co-regulators and promoters with the estrogen receptor (ER) family. By targeting the same set of genes, ERRs seem to interfere with the classic ER-mediated estrogen response in various ways. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ERR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 97
26267 143545 cd07171 NR_DBD_ER DNA-binding domain of estrogen receptors (ER) is composed of two C4-type zinc fingers. DNA-binding domain of estrogen receptors (ER) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ER interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Estrogen receptor is a transcription regulator that mediates the biological effects of hormone estrogen. The binding of estrogen to the receptor triggers the dimerization and the binding of the receptor dimer to estrogen response element, which is a palindromic inverted repeat: 5'GGTCAnnnTGACC-3', of target genes. Through ER, estrogen regulates development, reproduction and homeostasis. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER has a central well-conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 82
26268 143546 cd07172 NR_DBD_GR_PR DNA-binding domain of glucocorticoid receptor (GR) is composed of two C4-type zinc fingers. DNA-binding domains of glucocorticoid receptor (GR) and progesterone receptor (PR) are composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinate a single zinc atom. The DBD from both receptors interact with the same hormone response element (HRE), which is an imperfect palindrome GGTACAnnnTGTTCT, upstream of target genes and modulates the rate of transcriptional initiation. GR is a transcriptional regulator that mediates the biological effects of glucocorticoids and PR regulates genes controlled by progesterone. GR is expressed in almost every cell in the body and regulates genes controlling a wide variety of processes including the development, metabolism, and immune response of the organism. PR functions in a variety of biological processes including development of the mammary gland, regulating cell cycle progression, protein processing, and metabolism. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR and PR have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD). 78
26269 143547 cd07173 NR_DBD_AR DNA-binding domain of androgen receptor (AR) is composed of two C4-type zinc fingers. DNA-binding domain of androgen receptor (AR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. To regulate gene expression, AR interacts with a palindrome of the core sequence 5'-TGTTCT-3' with a 3-bp spacer. It also binds to the direct repeat 5'-TGTTCT-3' hexamer in some androgen controlled genes. AR is activated by the androgenic hormones, testosterone or dihydrotestosterone, which are responsible for primary and for secondary male characteristics, respectively. The primary mechanism of action of ARs is by direct regulation of gene transcription. The binding of androgen results in a conformational change in the androgen receptor which causes its transport from the cytosol into the cell nucleus, and dimerization. The receptor dimer binds to a hormone response element of AR regulated genes and modulates their expression. Another mode of action of androgen receptor is independent of their interactions with DNA. The receptor interacts directly with signal transduction proteins in the cytoplasm, causing rapid changes in cell function, such as ion transport. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, AR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 82
26270 143580 cd07176 terB tellurite resistance protein terB. This family contains uncharacterized bacterial proteins involved in tellurium resistance. The prototype of this CD is the Kp-terB protein from Klebsiella pneumoniae, whose 3D structure was recently determined. The biological function of terB and the mechanism responsible for tellurium resistance are unknown. 111
26271 143581 cd07177 terB_like tellurium resistance terB-like protein. This family consists of tellurium resistance terB proteins, N-terminal domain of heat shock DnaJ-like proteins, N-terminal domain of Mo-dependent nitrogenase-like proteins, C-terminal domain of ABC transporter ATP-binding proteins, C-terminal domain of serine/threonine protein kinase, and many hypothetical bacterial proteins. The function of this family is unknown. 104
26272 143582 cd07178 terB_like_YebE tellurium resistance terB-like protein, subgroup 3. This family includes several uncharacterized bacterial proteins including an Escherichia coli protein called YebE. Protein sequence homology analysis shows they are similar to tellurium resistance protein terB, but the function of this family is unknown. 95
26273 143548 cd07179 2DBD_NR_DBD2 The second DNA-binding domain (DBD) of the 2DBD nuclear receptor is composed of two C4-type zinc fingers. The second DNA-binding domain (DBD) of the 2DBD nuclear receptor (NR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NRs interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. The proteins contain two DBDs in tandem, probably resulting from an ancient recombination event. The 2DBD-NRs are found only in flatworm species, mollusks and arthropods. Their biological function is unknown. 74
26274 260001 cd07180 RNase_HII_archaea_like Archaeal Ribonuclease HII. This family includes type 2 RNases H from archaea, some of which show broad divalent cation specificity. It is proposed that three of the four acidic residues at the active site are involved in metal binding and the fourth one is involved in the catalytic process in archaea. Most archaeal genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype in E. coli. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, archaeal RNase HII and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication or repair. 204
26275 260002 cd07181 RNase_HII_eukaryota_like Eukaryotic RNase HII. This family includes eukaryotic type 2 RNase H (RNase HII or H2) which is active during replication and is believed to play a role in the removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII (RNASEH2A) is functional when it forms a heterotrimeric complex with two other accessory proteins (RNASEH2B and RNASEH2C). It is speculated that these accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause the severe genetic neurological disorder Aicardi-Goutieres syndrome. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. 221
26276 260003 cd07182 RNase_HII_bacteria_HII_like Bacterial Ribonuclease HII-like. This family includes mostly bacterial type 2 RNases H, with some eukaryotic members. Bacterial RNase HII has a role in primer removal based on its involvement in ribonucleotide-specific catalytic activity in the presence of RNA/DNA hybrid substrates. Several bacteria, such as Bacillus subtilis, have two different type II RNases H, RNases HII and HIII; double deletion of these leads to cellular lethality. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype. In Leishmania mitochondria, of the four distinct RNase H genes (H1, HIIA, HIIB, HIIC), HIIC is essential for the survival of the parasite and thus can be a potential target for anti-leishmanial chemotherapy. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. 177
26277 199892 cd07184 E_set_Isoamylase_like_N N-terminal Early set domain associated with the catalytic domain of isoamylase-like (also called glycogen 6-glucanohydrolase) proteins. E or "early" set domains are associated with the catalytic domain of isoamylase-like proteins at the N-terminal end. Isoamylase is one of the starch-debranching enzymes that catalyze the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen. Isoamylase contains a bound calcium ion, but this is not in the same position as the conserved calcium ion that has been reported in other alpha-amylase family enzymes. The N-terminal domain of isoamylase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, and the beta subunit of AMP-activated protein kinase. 86
26278 143586 cd07185 OmpA_C-like Peptidoglycan binding domains similar to the C-terminal domain of outer-membrane protein OmpA. OmpA-like domains (named after the C-terminal domain of Escherichia coli OmpA protein) have been shown to non-covalently associate with peptidoglycan, a network of glycan chains composed of disaccharides, which are crosslinked via short peptide bridges. Well-studied members of this family include the Escherichia coli outer membrane protein OmpA, the Escherichia coli lipoprotein PAL, and Neisseria meningitidis RmpM, which interact with the outer membrane, as well as the Escherichia coli motor protein MotB, and the Vibrio flagellar motor proteins PomB and MotY, which interact with the inner membrane. 106
26279 132872 cd07186 CofD_like LPPG:FO 2-phospho-L-lactate transferase; important in F420 biosynthesis. CofD is a 2-phospho-L-lactate transferase that catalyzes the last step in the biosynthesis of coenzyme F(420)-0 (F(420) without polyglutamate) by transferring the lactyl phosphate moiety of lactyl(2)diphospho-(5')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin ribitol (F0). F420 is a hydride carrier, important for energy metabolism of methanogenic archaea, as well as for the biosynthesis of other natural products, like tetracycline in Streptomyces. F420 and some of its precursors are also utilized as cofactors for enzymes, like DNA photolyase in Mycobacterium tuberculosis. 303
26280 132873 cd07187 YvcK_like family of mostly uncharacterized proteins similar to B.subtilis YvcK. One member of this protein family, YvcK from Bacillus subtilis, has been proposed to play a role in carbon metabolism, since its function is essential for growth on intermediates of the Krebs cycle and the pentose phosphate pathway. In general, this family of mostly uncharacterized proteins is related to the CofD-like protein family. CofD has been characterized as a 2-phospho-L-lactate transferase involved in F420 biosynthesis. This family appears to have the same conserved phosphate binding site as the other family in this hierarchy, but a different substrate binding site. 308
26281 143587 cd07197 nitrilase Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes. This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy. 253
26282 132837 cd07198 Patatin Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes PNPLA (1-9), TGL (3-5), ExoU-like, and SDP1-like subfamilies. There are some additional hypothetical proteins included in this family. 172
26283 132838 cd07199 Pat17_PNPLA8_PNPLA9_like Patatin-like phospholipase; includes PNPLA8, PNPLA9, and Pat17. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum. 258
26284 132839 cd07200 cPLA2_Grp-IVA Group IVA cytosolic phospholipase A2; catalytic domain; Ca-dependent. Group IVA cPLA2, an 85 kDa protein, consists of two domains: the regulatory C2 domain and the alpha/beta hydrolase PLA2 domain. Group IVA cPLA2 is also referred to as cPLA2-alpha. The catalytic domain of cytosolic phospholipase A2 (cPLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. A calcium-dependent phospholipid binding domain resides in the N-terminal region of cPLA2; it is homologous to the C2 domain superfamily which is not included in this hierarchy. Includes PLA2G4A from chicken, human, and frog. 505
26285 132840 cd07201 cPLA2_Grp-IVB-IVD-IVE-IVF Group IVB, IVD, IVE, and IVF cytosolic phospholipase A2; catalytic domain; Ca-dependent. Group IVB, IVD, IVE, and IVF cPLA2 consists of two domains: the regulatory C2 domain and alpha/beta hydrolase PLA2 domain. Group IVB, IVD, IVE, and IVF cPLA2 are also referred to as cPLA2-beta, -delta, -epsilon, and -zeta respectively. cPLA2-beta is approximately 30% identical to cPLA2-alpha and it shows low enzymatic activity compared to cPLA2alpha. cPLA2-beta hydrolyzes palmitic acid from 1-[14C]palmitoyl-2-arachidonoyl-PC and arachidonic acid from 1-palmitoyl-2[14C]arachidonoyl-PC, but not from 1-O-alkyl-2[3H]arachidonoyl-PC. cPLA2-delta, -epsilon, and -zeta are approximately 45-50% identical to cPLA2-beta and 31-37% identical to cPLA2-alpha. It's possible that cPLA2-beta, -delta, -epsilon, and -zeta may have arisen by gene duplication from an ancestral gene. The catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. The calcium-dependent phospholipid binding domain resides in the N-terminal region of cPLA2; it is homologous to the C2 domain superfamily which is not included in this hierarchy. It includes PLA2G4B, PLA2G4D, PLA2G4E, and PLA2G4F from humans. 541
26286 132841 cd07202 cPLA2_Grp-IVC Group IVC cytoplasmic phospholipase A2; catalytic domain; Ca-independent. Group IVC cPLA2, a small 61 kDa protein, is a single domain alpha/beta hydrolase. It lacks a C2 domain; therefore, it has no Ca-dependence. Group IVC cPLA2 is also referred to as cPLA2-gamma. The cPLA2-gamma enzyme is predominantly found in cardiac and skeletal muscles, and to a lesser extent in the brain. Human cPLA2-gamma is approximately 30% identical to cPLA2-alpha. The catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Includes PLA2G4C protein from human and Pla2g4c protein from mouse. 430
26287 132842 cd07203 cPLA2_Fungal_PLB Fungal Phospholipase B-like; cPLA2 GrpIVA homologs; catalytic domain. Fungal phospholipase B (PLB) enzymes are Group IV cPLA2 homologs. Aspergillus PLA2 is Ca-dependent, yet it does not contain a C2 domain. PLB deacylates both sn-1 and sn-2 chains of phospholipids and is abundantly expressed in fungi. It shows lysophospholipase (lysoPL) and transacylase activities. The active site residues from cPLA2 are also conserved in PLB. Like cPLA2, PLB also has a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). It includes PLB1 from Schizosaccharomyces pombe, PLB2 from Candida glabrata, and PLB3 from Saccharomyces cerevisiae. PLB1, PLB2, and PLB3 show PLB and lysoPL activities; PLB3 is specific for phosphoinositides. 552
26288 132843 cd07204 Pat_PNPLA_like Patatin-like phospholipase domain containing protein family. Members of this family share a patatin domain, initially discovered in potato tubers. PNPLA protein members show non-specific hydrolase activity with a variety of substrates such as triacylglycerol, phospholipids, and retinyl esters. It contains the lipase consensus sequence (Gly-X-Ser-X-Gly). Nomenclature of PNPLA family could be misleading as some of the mammalian members of this family show hydrolase, but no phospholipase activity. 243
26289 132844 cd07205 Pat_PNPLA6_PNPLA7_NTE1_like Patatin-like phospholipase domain containing protein 6, protein 7, and fungal NTE1. Patatin-like phospholipase domain containing protein 6 (PNPLA6) and protein 7 (PNPLA7) are included in this family. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. PNPLA7 is an insulin-regulated phospholipase that is homologous to Neuropathy Target Esterase (NTE or PNPLA6) and is also known as NTE-related esterase (NRE). Human NRE is predominantly expressed in prostate, white adipose, and pancreatic tissue. NRE hydrolyzes sn-1 esters in lysophosphatidylcholine and lysophosphatidic acid, but shows no lipase activity with substrates like triacylglycerols (TG), cholesteryl esters, retinyl esters (RE), phosphatidylcholine (PC), or monoacylglycerol (MG). This family includes subfamily of PNPLA6 (NTE) and PNPLA7 (NRE)-like phospholipases. 175
26290 132845 cd07206 Pat_TGL3-4-5_SDP1 Triacylglycerol lipase 3, 4, and 5 and Sugar-Dependent 1 lipase. Triacylglycerol lipases are involved in triacylglycerol mobilization and degradation; they are found in lipid particles. TGL4 is 30% homologous to TGL3, whereas TGL5 is 26% homologous to TGL3. Sugar-Dependent 1 (SDP1) lipase has a patatin-like acyl-hydrolase domain that initiates the breakdown of storage oil in germinating Arabidopsis seeds. This family includes subfamilies of proteins: TGL3, TGL4, TGL5, and SDP1. 298
26291 132846 cd07207 Pat_ExoU_VipD_like ExoU and VipD-like proteins; homologous to patatin, cPLA2, and iPLA2. ExoU, a 74-kDa enzyme, is a potent virulence factor of Pseudomonas aeruginosa. One of the pathogenic mechanisms of P. aeruginosa is to induce cytotoxicity by the injection of effector proteins (e.g. ExoU) using the type III secretion (T3S) system. ExoU is homologous to patatin and also has the conserved catalytic residues of mammalian calcium-independent (iPLA2) and cytosolic (cPLA2) PLA2. In vitro, ExoU cytotoxicity is blocked by the inhibitor of cytosolic and Ca2-independent phospholipase A2 (cPLA2 and iPLA2) enzymes, suggesting that phospholipase A2 inhibitors may represent a novel mode of treatment for acute P. aeruginosa infections. ExoU requires eukaryotic superoxide dismutase as a cofactor and cleaves phosphatidylcholine and phosphatidylethanolamine in vitro. VipD, a 69-kDa cytosolic protein, belongs to the members of Legionella pneumophila family and is homologous to ExoU from Pseudomonas. Even though VipD shows high sequence similarity with several functional regions of ExoU (e.g. oxyanion hole, active site serine, active site aspartate), it has been shown to have no phospholipase activity. This family includes ExoU from Pseudomonas aeruginosa and VipD of Legionella pneumophila. 194
26292 132847 cd07208 Pat_hypo_Ecoli_yjju_like Hypothetical patatin similar to yjju protein of Escherichia coli. Patatin-like phospholipase similar to yjju protein of Escherichia coli. This family predominantly consists of bacterial patatin glycoproteins, and some representatives from eukaryotes and archaea. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. 266
26293 132848 cd07209 Pat_hypo_Ecoli_Z1214_like Hypothetical patatin similar to Z1214 protein of Escherichia coli. Patatin-like phospholipase similar to Z1214 protein of Escherichia coli. This family predominantly consists of bacterial patatin glycoproteins and some representatives from eukaryotes and archaea. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. 215
26294 132849 cd07210 Pat_hypo_W_succinogenes_WS1459_like Hypothetical patatin similar to WS1459 of Wolinella succinogenes. Patatin-like phospholipase. This family predominantly consists of bacterial patatin glycoproteins. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. 221
26295 132850 cd07211 Pat_PNPLA8 Patatin-like phospholipase domain containing protein 8. PNPLA8 is a Ca-independent myocardial phospholipase which maintains mitochondrial integrity. PNPLA8 is also known as iPLA2-gamma. In humans, it is predominantly expressed in heart tissue. iPLA2-gamma can catalyze both phospholipase A1 and A2 reactions (PLA1 and PLA2 respectively). This family includes PNPLA8 (iPLA2-gamma) from Homo sapiens and iPLA2-2 from Mus musculus. 308
26296 132851 cd07212 Pat_PNPLA9 Patatin-like phospholipase domain containing protein 9. PNPLA9 is a Ca-independent phospholipase that catalyzes the hydrolysis of glycerophospholipids at the sn-2 position. PNPLA9 is also known as PLA2G6 (phospholipase A2 group VI) or iPLA2beta. PLA2G6 is stimulated by ATP and inhibited by bromoenol lactone (BEL). In humans, PNPLA9 is expressed ubiquitously and is involved in signal transduction, cell proliferation, and apoptotic cell death. Mutations in human PLA2G6 lead to infantile neuroaxonal dystrophy (INAD) and idiopathic neurodegeneration with brain iron accumulation (NBIA). This family includes PLA2G6 from Homo sapiens and Rattus norvegicus. 312
26297 132852 cd07213 Pat17_PNPLA8_PNPLA9_like1 Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum. 288
26298 132853 cd07214 Pat17_isozyme_like Patatin-like phospholipase of plants. Pat17 is an isozyme of patatin cloned from Solanum cardiophyllum. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue, and Nu = nucleophile). Patatin-like phospholipase are included in this group. Members of this family have also been found in vertebrates. 349
26299 132854 cd07215 Pat17_PNPLA8_PNPLA9_like2 Patatin-like phospholipase of bacteria. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum. 329
26300 132855 cd07216 Pat17_PNPLA8_PNPLA9_like3 Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum. 309
26301 132856 cd07217 Pat17_PNPLA8_PNPLA9_like4 Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum. 344
26302 132857 cd07218 Pat_iPLA2 Calcium-independent phospholipase A2; Classified as Group IVA-1 PLA2. Calcium-independent phospholipase A2; otherwise known as Group IVA-1 PLA2. It contains the lipase consensus sequence (Gly-X-Ser-X-Gly); mutagenesis experiments confirm the role of this serine as a nucleophile. Some members of this group show triacylglycerol lipase activity (EC 3.1.1.3). Members include iPLA-1, iPLA-2, and iPLA-3 from Aedes aegypti and show acylglycerol transacylase/lipase activity. Also includes putative iPLA2-eta from Pediculus humanus corporis which shows patatin-like phospholipase activity. 245
26303 132858 cd07219 Pat_PNPLA1 Patatin-like phospholipase domain containing protein 1. Members of this family share a patatin domain, initially discovered in potato tubers. Some members of PNPLA1 subfamily do not have the lipase consensus sequence Gly-X-Ser-X-Gly which is essential for hydrolase activity. This family includes PNPLA1 from Homo sapiens and Gallus gallus. Currently, there is no literature available on the physiological role, structure, or enzymatic activity of PNPLA1. It is expressed in various human tissues in low mRNA levels. 382
26304 132859 cd07220 Pat_PNPLA2 Patatin-like phospholipase domain containing protein 2. PNPLA2 plays a key role in hydrolysis of stored triacylglycerols and is also known as adipose triglyceride lipase (ATGL). Members of this family share a patatin domain, initially discovered in potato tubers. ATGL is expressed in white and brown adipose tissue in high mRNA levels. Mutations in PNPLA2 encoding adipose triglyceride lipase (ATGL) lead to neutral lipid storage disease (NLSD) which is characterized by the accumulation of triglycerides in multiple tissues. ATGL mutations are also commonly associated with severe forms of skeletal- and cardio-myopathy. This family includes patatin-like proteins: TTS-2.2 (transport-secretion protein 2.2), PNPLA2 (Patatin-like phospholipase domain-containing protein 2), and iPLA2-zeta (Calcium-independent phospholipase A2) from Homo sapiens. 249
26305 132860 cd07221 Pat_PNPLA3 Patatin-like phospholipase domain containing protein 3. PNPLA3 is a triacylglycerol lipase that mediates triacylglycerol hydrolysis in adipocytes and is an indicator of the nutritional state. PNPLA3 is also known as adiponutrin (ADPN) or iPLA2-epsilon. Human adiponutrins are bound to the cell membrane of adipocytes and show transacylase, TG hydrolase, and PLA2 activity. This family includes patatin-like proteins: ADPN (adiponutrin) from mammals, PNPLA3 (Patatin-like phospholipase domain-containing protein 3), and iPLA2-epsilon (Calcium-independent phospholipase A2) from Homo sapiens. 252
26306 132861 cd07222 Pat_PNPLA4 Patatin-like phospholipase domain containing protein 4. PNPLA4, also known as GS2 (gene sequence-2), shows both lipase and transacylation activities. GS2 lipase is expressed in various tissues, predominantly in muscle and adipocytes tissue. It is also expressed in keratinocytes and shows retinyl ester hydrolase, acylglycerol, TG hydrolase, and PLA2 activity. This family includes patatin-like proteins: GS2 from mammals, PNPLA4 (Patatin-like phospholipase domain-containing protein 4), and iPLA2-eta (Calcium-independent phospholipase A2) from Homo sapiens. 246
26307 132862 cd07223 Pat_PNPLA5-mammals Patatin-like phospholipase domain containing protein 5. PNPLA5, also known as GS2L (GS2-like), plays a role in regulation of adipocyte differentiation. PNPLA5 is expressed in brain tissue in high mRNA levels and low levels in liver tissue. There is no concrete evidence in support of the enzymatic activity of GS2L. This family includes patatin-like proteins: GS2L (GS2-like) and PNPLA5 (Patatin-like phospholipase domain-containing protein 5) reported exclusively in mammals. 405
26308 132863 cd07224 Pat_like Patatin-like phospholipase. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of lipid acyl hydrolase, catalysing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates. 233
26309 132864 cd07225 Pat_PNPLA6_PNPLA7 Patatin-like phospholipase domain containing protein 6 and protein 7. Patatin-like phospholipase domain containing protein 6 (PNPLA6) and protein 7 (PNPLA7) are 60% identical to each other. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. PNPLA7 is an insulin-regulated phospholipase that is homologous to Neuropathy Target Esterase (NTE or PNPLA6) and is also known as NTE-related esterase (NRE). Human NRE is predominantly expressed in prostate, white adipose, and pancreatic tissue. NRE hydrolyzes sn-1 esters in lysophosphatidylcholine and lysophosphatidic acid, but shows no lipase activity with substrates like triacylglycerols (TG), cholesteryl esters, retinyl esters (RE), phosphatidylcholine (PC), or monoacylglycerol (MG). This family includes PNPLA6 and PNPLA7 from Homo sapiens, YMF9 from Yeast, and Swiss Cheese protein (sws) from Drosophila melanogaster. 306
26310 132865 cd07227 Pat_Fungal_NTE1 Fungal patatin-like phospholipase domain containing protein 6. These are fungal Neuropathy Target Esterase (NTE), commonly referred to as NTE1. Patatin-like phospholipase. NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. This family includes NTE1 from fungi. 269
26311 132866 cd07228 Pat_NTE_like_bacteria Bacterial patatin-like phospholipase domain containing protein 6. Bacterial patatin-like phospholipase domain containing protein 6. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. This group includes YCHK and rssA from Escherichia coli as well as Ylbk from Bacillus amyloliquefaciens. 175
26312 132867 cd07229 Pat_TGL3_like Triacylglycerol lipase 3. Triacylglycerol lipase 3 (TGL3) are responsible for all the TAG lipase activity of the lipid particle. Triacylglycerol (TAG) lipases are also necessary for the mobilization of TAG stored in lipid particles. TGL3 contains the consensus sequence motif GXSXG, which is found in lipolytic enzymes. This family includes Tgl3p from Saccharomyces cerevisiae. 391
26313 132868 cd07230 Pat_TGL4-5_like Triacylglycerol lipase 4 and 5. TGL4 and TGL5 are triacylglycerol lipases that are involved in triacylglycerol mobilization and degradation; they are found in lipid particles. Tgl4 is a functional ortholog of mammalian adipose TG lipase (ATGL) and is phosphorylated and activated by cyclin-dependent kinase 1 (Cdk1/Cdc28). TGL4 is 30% homologous to TGL3, whereas TGL5 is 26% homologous to TGL3. This family includes TGL4 (STC1) and TGL5 (STC2) from Saccharomyces cerevisiae. 421
26314 132869 cd07231 Pat_SDP1-like Sugar-Dependent 1 like lipase. Sugar-Dependent 1 (SDP1) lipase has a patatin-like acyl-hydrolase domain that initiates the breakdown of storage oil in germinating Arabidopsis seeds. This acyl-hydrolase domain is homologous to yeast triacylglycerol lipase 3 and human adipose triglyceride lipase. This family includes SDP1 from Arabidopsis thaliana. 323
26315 132870 cd07232 Pat_PLPL Patatin-like phospholipase. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants and fungi. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates. 407
26316 319900 cd07233 GlxI_Zn Glyoxalase I that uses Zn(++) as cofactor. This family includes eukaryotic glyoxalase I that prefers the divalent cation zinc as cofactor. Glyoxalase I (also known as lactoylglutathione lyase; EC 4.4.1.5) is part of the glyoxalase system, a two-step system for detoxifying methylglyoxal, a side product of glycolysis. This system is responsible for the conversion of reactive, acyclic alpha-oxoaldehydes into the corresponding alpha-hydroxyacids and involves 2 enzymes, glyoxalase I and II. Glyoxalase I catalyses an intramolecular redox reaction of the hemithioacetal (formed from methylglyoxal and glutathione) to form the thioester, S-D-lactoylglutathione. This reaction involves the transfer of two hydrogen atoms from C1 to C2 of the methylglyoxal, and proceeds via an ene-diol intermediate. Glyoxalase I has a requirement for bound metal ions for catalysis. Eukaryotic glyoxalase I prefers the divalent cation zinc as cofactor, whereas Escherichia coli and other prokaryotic glyoxalase I uses nickel. However, eukaryotic Trypanosomatid parasites also use nickel as a cofactor, which could possibly be explained by acquiring their GLOI gene by horizontal gene transfer. Human glyoxalase I is a two-domain enzyme and it has the structure of a domain-swapped dimer with two active sites located at the dimer interface. In yeast, in various plants, insects and Plasmodia, glyoxalase I is four-domain, possibly the result of a further gene duplication and an additional gene fusing event. 142
26317 319901 cd07235 MRD Mitomycin C resistance protein (MRD). Mitomycin C (MC) is a naturally occurring antibiotic, and antitumor agent used in the treatment of cancer. Its antitumor activity is exerted primarily through monofunctional and bifunctional alkylation of DNA. MRD binds to MC and functions as a component of the MC exporting system. MC is bound to MRD by a stacking interaction between a His and a Trp. MRD adopts a structural fold similar to bleomycin resistance protein, glyoxalase I, and extradiol dioxygenases; and it has binding sites at an identical location to binding sites in these evolutionarily related enzymes. 123
26318 319902 cd07237 BphC1-RGP6_C_like C-terminal domain of 2,3-dihydroxybiphenyl 1,2-dioxygenase. This subfamily contains the C-terminal, catalytic, domain of BphC1-RGP6 and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which require a metal in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6), all three enzymes are type I extradiol dioxygenases. BphC1-RGP6 has an internal duplication, it is a two-domain dioxygenase which forms octamers, and has Fe(II) at the catalytic site. Its C-terminal repeat is represented in this subfamily. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases, they belong to a different subfamily of the ED_TypeI_classII_C (C-terminal domain of type I, class II extradiol dioxygenases) family. 153
26319 319903 cd07238 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 112
26320 319904 cd07239 BphC5-RK37_C_like C-terminal, catalytic domain of BphC5 (2,3-dihydroxybiphenyl 1,2-dioxygenase). 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). The enzyme contains a N-terminal and a C-terminal domain of similar structure fold, resulting from an ancient gene duplication. BphC belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic activity. Polychlorinated biphenyl degrading bacteria demonstrate multiplicity of BphCs. Bacterium Rhodococcus rhodochrous K37 has eight genes encoding BphC enzymes. This family includes the C-terminal domain of BphC5-RrK37. The crystal structure of the protein from Novosphingobium aromaticivorans has a Mn(II) in the active site, although most proteins of type I extradiol dioxygenases are activated by Fe(II). 143
26321 319905 cd07241 VOC_BsYyaH vicinal oxygen chelate (VOC) family protein similar to Bacillus subtilis YyaH. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 125
26322 319906 cd07242 VOC_BsYqjT vicinal oxygen chelate (VOC) family protein similar to Bacillus subtilis YqjT. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 126
26323 319907 cd07243 2_3_CTD_C C-terminal domain of catechol 2,3-dioxygenase. This subfamily contains the C-terminal, catalytic, domain of catechol 2,3-dioxygenase. Catechol 2,3-dioxygenase (2,3-CTD, catechol:oxygen 2,3-oxidoreductase) catalyzes an extradiol cleavage of catechol to form 2-hydroxymuconate semialdehyde with the insertion of two atoms of oxygen. The enzyme is a homotetramer and contains catalytically essential Fe(II). The reaction proceeds by an ordered bi-unit mechanism. First, catechol binds to the enzyme, this is then followed by the binding of dioxygen to form a tertiary complex, and then the aromatic ring is cleaved to produce 2-hydroxymuconate semialdehyde. Catechol 2,3-dioxygenase belongs to the type I extradiol dioxygenase family. The subunit comprises the N- and C-terminal domains of similar structure fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. This subfamily represents the C-terminal domain. 144
26324 319908 cd07244 FosA fosfomycin resistant protein subfamily FosA. This subfamily contains FosA, a fosfomycin resistant protein. FosA is a Mn(II) and K(+)-dependent glutathione transferase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosA catalyzes the addition of glutathione to the antibiotic fosfomycin, (1R,2S)-epoxypropylphosphonic acid, making it inactive. FosA is a Mn(II) dependent enzyme. It is evolutionarily related to glyoxalase I and type I extradiol dioxygenases. 121
26325 319909 cd07245 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 117
26326 319910 cd07246 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 124
26327 319911 cd07247 SgaA_N_like N-terminal domain of Streptomyces griseus SgaA and similar domains. SgaA suppresses the growth disturbances caused by high osmolarity and a high concentration of A-factor, a microbial hormone, during the early growth phase in Streptomyces griseus. A-factor (2-isocapryloyl-3R-hydroxymethyl-gamma-butyrolactone) controls morphological differentiation and secondary metabolism in Streptomyces griseus. It is a chemical signaling molecule that at a very low concentration acts as a switch for yellow pigment production, aerial mycelium formation, streptomycin production, and streptomycin resistance. The structure and amino acid sequence of SgaA are closely related to a group of antibiotics resistance proteins, including bleomycin resistance protein, mitomycin resistance protein, and fosfomycin resistance proteins. SgaA might also function as a streptomycin resistance protein. 114
26328 319912 cd07249 MMCE Methylmalonyl-CoA epimerase (MMCE). MMCE, also called methylmalonyl-CoA racemase (EC 5.1.99.1) interconverts (2R)-methylmalonyl-CoA and (2S)-methylmalonyl-CoA. MMCE has been found in bacteria, archaea, and in animals. In eukaryotes, MMCE is an essential enzyme in a pathway that converts propionyl-CoA to succinyl-CoA, and is important in the breakdown of odd-chain length fatty acids, branched-chain amino acids, and other metabolites. In bacteria, MMCE participates in the reverse pathway for propionate fermentation, glyoxylate regeneration, and the biosynthesis of polyketide antibiotics. MMCE is closely related to glyoxalase I and type I extradiol dioxygenases. 127
26329 319913 cd07250 HPPD_C_like C-terminal domain of 4-hydroxyphenylpyruvate dioxygenase (HppD) and hydroxymandelate synthase (HmaS). HppD and HmaS are non-heme iron-dependent dioxygenases, which modify a common substrate, 4-hydroxyphenylpyruvate (HPP), but yield different products. HPPD catalyzes the second reaction in tyrosine catabolism, the conversion of 4-hydroxyphenylpyruvate to homogentisate (2,5-dihydroxyphenylacetic acid, HG). HmaS converts HPP to 4-hydroxymandelate, a committed step in the formation of hydroxyphenylglycine, a structural component of nonproteinogenic macrocyclic peptide antibiotics, such as vancomycin. If the emphasis is on catalytic chemistry, HPPD and HmaS are classified as members of a large family of alpha-keto acid dependent mononuclear non-heme iron oxygenases, most of which require Fe(II), molecular oxygen, and an alpha-keto acid (typically alpha-ketoglutarate) to either oxygenate or oxidize a third substrate. Both enzymes are exceptions in that they require two, instead of three, substrates, do not use alpha-ketoglutarate, and incorporate both atoms of dioxygen into the aromatic product. Both HPPD and HmaS exhibit duplicate beta barrel topology in their N- and C-terminal domains which share sequence similarity, suggestive of a gene duplication. Each protein has only one catalytic site located in the C-terminal domain. This HPPD_C_like domain represents the C-terminal domain. 194
26330 319914 cd07251 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 120
26331 319915 cd07252 BphC1-RGP6_N_like N-terminal domain of 2,3-dihydroxybiphenyl 1,2-dioxygenase. This subfamily contains the N-terminal, non-catalytic, domain of BphC1-RGP6 and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of 2,3-dihydroxybiphenyl 1,2-dioxygenases. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6); all three enzymes are type I extradiol dioxygenases. BphC1-RGP6 has an internal duplication; it is a two-domain dioxygenase which forms octamers, and has Fe(II) at the catalytic site. Its N-terminal repeat is represented in this subfamily. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases; they belong to a different family, the ED_TypeI_classII_C (C-terminal domain of type I, class II extradiol dioxygenases) family. 120
26332 319916 cd07253 GLOD5 Human glyoxalase domain-containing protein 5 and similar proteins. Uncharacterized subfamily of VOC family contains human glyoxalase domain-containing protein 5 and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 123
26333 319917 cd07254 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 120
26334 319918 cd07255 VOC_BsCatE_like_N N-terminal of Bacillus subtilis CatE like protein. Uncharacterized subfamily of VOC superfamily contains Bacillus subtilis CatE and similar proteins. CatE is proposed to function as Catechol-2,3-dioxygenase. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 124
26335 319919 cd07256 HPCD_C_class_II C-terminal domain of 3,4-dihydroxyphenylacetate 2,3-dioxygenase (HPCD). This subfamily contains the C-terminal, catalytic, domain of HPCD. HPCD catalyses the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. The aromatic ring of 4-hydroxyphenylacetate is opened by this dioxygenase to yield the 3,4-diol product, 2-hydroxy-5-carboxymethylmuconate semialdehyde. HPCD is a homotetramer and each monomer contains two structurally homologous barrel-shaped domains at the N- and C-terminus. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. Most extradiol dioxygenases contain Fe(II) in their active site, but HPCD can be activated by either Mn(II) or Fe(II). These enzymes belong to the type I class II family of extradiol dioxygenases. The class III 3,4-dihydroxyphenylacetate 2,3-dioxygenases belong to a different superfamily. 160
26336 319920 cd07257 THT_oxygenase_C The C-terminal domain of 2,4,5-trihydroxytoluene (THT) oxygenase. This subfamily contains the C-terminal, catalytic, domain of THT oxygenase. THT oxygenase is an extradiol dioxygenase in the 2,4-dinitrotoluene (DNT) degradation pathway. It catalyzes the conversion of 2,4,5-trihydroxytoluene to an unstable ring fission product, 2,4-dihydroxy-5-methyl-6-oxo-2,4-hexadienoic acid. The native protein was determined to be a dimer by gel filtration. The enzyme belongs to the type I family of extradiol dioxygenases which contains two structurally homologous barrel-shaped domains at the N- and C-terminus of each monomer. The active-site metal is located in the C-terminal barrel. Fe(II) is required for its catalytic activity. 152
26337 319921 cd07258 PpCmtC_C C-terminal domain of 2,3-dihydroxy-p-cumate-3,4-dioxygenase (PpCmtC). This subfamily contains the C-terminal, catalytic, domain of PpCmtC. 2,3-dihydroxy-p-cumate-3,4-dioxygenase (CmtC of Pseudomonas putida F1) is a dioxygenase involved in the eight-step catabolism pathway of p-cymene. CmtC acts upon the reaction intermediate 2,3-dihydroxy-p-cumate, yielding 2-hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate. The CmtC belongs to the type I family of extradiol dioxygenases. Fe2+ was suggested as a cofactor, same as for other enzymes in the family. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. 138
26338 319922 cd07261 EhpR_like phenazine resistance protein, EhpR. Phenazine resistance protein (EhpR) in Enterobacter agglomerans confers resistance by binding D-alanyl-griseoluteic acid and acting as a chaperone involved in exporting the antibiotic rather than by altering it chemically. EhpR is evolutionarily related to glyoxalase I and type I extradiol dioxygenases. 114
26339 319923 cd07262 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 121
26340 319924 cd07263 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 120
26341 319925 cd07264 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 118
26342 319926 cd07265 2_3_CTD_N N-terminal domain of catechol 2,3-dioxygenase. This subfamily contains the N-terminal, non-catalytic, domain of catechol 2,3-dioxygenase. Catechol 2,3-dioxygenase (2,3-CTD, catechol:oxygen 2,3-oxidoreductase) catalyzes an extradiol cleavage of catechol to form 2-hydroxymuconate semialdehyde with the insertion of two atoms of oxygen. The enzyme is a homotetramer and contains catalytically essential Fe(II). The reaction proceeds by an ordered bi-uni mechanism. First, catechol binds to the enzyme; this is followed by the binding of dioxygen to form a ternary complex, and then the aromatic ring is cleaved to produce 2-hydroxymuconate semialdehyde. Catechol 2,3-dioxygenase belongs to the type I extradiol dioxygenase family. The subunit comprises N- and C-terminal domains of similar structural fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. This subfamily represents the N-terminal domain. 122
26343 319927 cd07266 HPCD_N_class_II N-terminal domain of 3,4-dihydroxyphenylacetate 2,3-dioxygenase (HPCD). This subfamily contains the N-terminal, non-catalytic, domain of HPCD. HPCD catalyses the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. The aromatic ring of 4-hydroxyphenylacetate is opened by this dioxygenase to yield the 3,4-diol product, 2-hydroxy-5-carboxymethylmuconate semialdehyde. HPCD is a homotetramer and each monomer contains two structurally homologous barrel-shaped domains at the N- and C-terminus. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. Most extradiol dioxygenases contain Fe(II) in their active site, but HPCD can be activated by either Mn(II) or Fe(II). These enzymes belong to the type I class II family of extradiol dioxygenases. The class III 3,4-dihydroxyphenylacetate 2,3-dioxygenases belong to a different superfamily. 118
26344 319928 cd07267 THT_Oxygenase_N N-terminal domain of 2,4,5-trihydroxytoluene (THT) oxygenase. This subfamily contains the N-terminal, non-catalytic, domain of THT oxygenase. THT oxygenase is an extradiol dioxygenase in the 2,4-dinitrotoluene (DNT) degradation pathway. It catalyzes the conversion of 2,4,5-trihydroxytoluene to an unstable ring fission product, 2,4-dihydroxy-5-methyl-6-oxo-2,4-hexadienoic acid. The native protein was determined to be a dimer by gel filtration. The enzyme belongs to the type I family of extradiol dioxygenases which contains two structurally homologous barrel-shaped domains at the N- and C-terminus of each monomer. The active-site metal is located in the C-terminal barrel. Fe(II) is required for its catalytic activity. 113
26345 319929 cd07268 VOC_EcYecM_like Escherichia coli YecM and similar proteins, a vicinal oxygen chelate subfamily. Uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily contains Escherichia coli YecM and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 171
26346 132809 cd07276 PX_SNX16 The phosphoinositide binding Phox Homology domain of Sorting Nexin 16. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX16 contains a central PX domain followed by a coiled-coil region. SNX16 is localized in early and recycling endosomes through the binding of its PX domain to phosphatidylinositol-3-phosphate (PI3P). It plays a role in epidermal growth factor (EGF) signaling by regulating EGF receptor membrane trafficking. 110
26347 132810 cd07277 PX_RUN The phosphoinositide binding Phox Homology domain of uncharacterized proteins containing PX and RUN domains. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized proteins containing an N-terminal RUN domain and a C-terminal PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction. The RUN domain is found in GTPases in the Rap and Rab families and may play a role in Ras-like signaling pathways. 118
26348 132811 cd07278 PX_RICS_like The phosphoinositide binding Phox Homology domain of PX-RICS-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this family include PX-RICS, TCGAP (Tc10/Cdc42 GTPase-activating protein), and similar proteins. They contain N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal extensions. They act as Rho GTPase-activating proteins. PX-RICS is the main isoform expressed during neural development. It is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. The PX domain of PX-RICS specifically binds phosphatidylinositol 3-phosphate (PI3P), PI4P, and PI5P. TCGAP is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 114
26349 132812 cd07279 PX_SNX20_21_like The phosphoinositide binding Phox Homology domain of Sorting Nexins 20 and 21. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX20, SNX21, and similar proteins. SNX20 interacts with P-Selectin glycoprotein ligand-1 (PSGL-1), a surface-expressed mucin that acts as a ligand for the selectin family of adhesion proteins. It may function in the sorting and cycling of PSGL-1 into endosomes. SNX21, also called SNX-L, is distinctly and highly-expressed in fetal liver and may be involved in protein sorting and degradation during embryonic liver development. 112
26350 132813 cd07280 PX_YPT35 The phosphoinositide binding Phox Homology domain of the fungal protein YPT35. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. This subfamily is composed of YPT35 proteins from the fungal subkingdom Dikarya. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of YPT35 binds to phosphatidylinositol 3-phosphate (PI3P). It also serves as a protein interaction domain, binding to members of the Yip1p protein family, which localize to the ER and Golgi. YPT35 is mainly associated with endosomes and together with Yip1p proteins, may be involved in a specific function in the endocytic pathway. 120
26351 132814 cd07281 PX_SNX1 The phosphoinositide binding Phox Homology domain of Sorting Nexin 1. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX1 is both membrane associated and a cytosolic protein that exists as a tetramer in protein complexes. It can associate reversibly with membranes of the endosomal compartment, thereby coating these vesicles. SNX1 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. SNX1 contains a Bin/Amphiphysin/Rvs (BAR) domain C-terminal to the PX domain. The PX domain of SNX1 specifically binds phosphatidylinositol-3-phosphate (PI3P) and PI(3,5)P2, while the BAR domain detects membrane curvature. Both domains help determine the specific membrane-targeting of SNX1, which is localized to a microdomain in early endosomes where it regulates cation-independent mannose-6-phosphate receptor retrieval to the trans Golgi network. 124
26352 132815 cd07282 PX_SNX2 The phosphoinositide binding Phox Homology domain of Sorting Nexin 2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX2 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. Similar to SNX1, SNX2 contains a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX domain of SNX2 preferentially binds phosphatidylinositol-3-phosphate (PI3P), but not PI(3,4,5)P3. Studies on mice deficient in SNX1 and/or SNX2 suggest that they provide an essential function in embryogenesis and are functionally redundant. 124
26353 132816 cd07283 PX_SNX30 The phosphoinositide binding Phox Homology domain of Sorting Nexin 30. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX30 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-8, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of SNX30 has yet to be elucidated. 116
26354 132817 cd07284 PX_SNX7 The phosphoinositide binding Phox Homology domain of Sorting Nexin 7. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX7 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-6, SNX8, SNX30, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of SNX7 has yet to be elucidated. 116
26355 132818 cd07285 PX_SNX9 The phosphoinositide binding Phox Homology domain of Sorting Nexin 9. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX9, also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It contains an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature. The PX-BAR structural unit helps determine specific membrane localization. Through its SH3 domain, SNX9 binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization. 126
26356 132819 cd07286 PX_SNX18 The phosphoinositide binding Phox Homology domain of Sorting Nexin 18. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX18, like SNX9, contains an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature. The PX-BAR structural unit helps determine specific membrane localization. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. 127
26357 132820 cd07287 PX_RPK118_like The phosphoinositide binding Phox Homology domain of RPK118-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily bear similarity to human RPK118, which contains an N-terminal PX domain, a Microtubule Interacting and Trafficking (MIT) domain, and a kinase domain. RPK118 binds sphingosine kinase, a key enzyme in the synthesis of sphingosine 1-phosphate (SPP), a lipid messenger involved in many cellular events. RPK118 may be involved in transmitting SPP-mediated signaling. It also binds the antioxidant peroxiredoxin-3 (PRDX3) and may be involved in the transport of PRDX3 from the cytoplasm to its site of function in the mitochondria. Members of this subfamily also show similarity to sorting nexin 15 (SNX15), which contains PX and MIT domains but does not contain a kinase domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNX15 plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein. 118
26358 132821 cd07288 PX_SNX15 The phosphoinositide binding Phox Homology domain of Sorting Nexin 15. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX15 contains an N-terminal PX domain and a C-terminal Microtubule Interacting and Trafficking (MIT) domain. It plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein. 118
26359 132822 cd07289 PX_PI3K_C2_alpha The phosphoinositide binding Phox Homology Domain of the Alpha Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II alpha isoform, PI3K-C2alpha, plays key roles in clathrin assembly and clathrin-mediated membrane trafficking, insulin signaling, vascular smooth muscle contraction, and the priming of neurosecretory granule exocytosis. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 109
26360 132823 cd07290 PX_PI3K_C2_beta The phosphoinositide binding Phox Homology Domain of the Beta Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II beta isoform, PI3K-C2beta, contributes to the migration and survival of cancer cells. It regulates Rac activity and impacts membrane ruffling, cell motility, and cadherin-mediated cell-cell adhesion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 109
26361 132824 cd07291 PX_SNX5 The phosphoinositide binding Phox Homology domain of Sorting Nexin 5. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX5, abundantly expressed in macrophages, regulates macropinocytosis, a process that enables cells to internalize large amounts of external solutes. It may also be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It also binds the Fanconi anaemia complementation group A protein (FANCA). SNX5 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs. The PX domain of SNX5 binds phosphatidylinositol-3-phosphate (PI3P) and PI(3,4)P2. SNX5 is localized to a subdomain of early endosome and is recruited to the plasma membrane following EGF stimulation and elevation of PI(3,4)P2 levels. 141
26362 132825 cd07292 PX_SNX6 The phosphoinositide binding Phox Homology domain of Sorting Nexin 6. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX6 forms a stable complex with SNX1 and may be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It interacts with the receptor serine/threonine kinases from the transforming growth factor-beta family. It also plays roles in enhancing the degradation of EGFR and in regulating the activity of Na,K-ATPase through its interaction with Translationally Controlled Tumor Protein (TCTP). SNX6 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs. 141
26363 132826 cd07293 PX_SNX3 The phosphoinositide binding Phox Homology domain of Sorting Nexin 3. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX3 associates with early endosomes through a PX domain-mediated interaction with phosphatidylinositol-3-phosphate (PI3P). It associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer. SNX3 is required for the formation of multivesicular bodies, which function as transport intermediates to late endosomes. It also promotes cell surface expression of the amiloride-sensitive epithelial Na+ channel (ENaC), which is critical in sodium homeostasis and maintenance of extracellular fluid volume. 123
26364 132827 cd07294 PX_SNX12 The phosphoinositide binding Phox Homology domain of Sorting Nexin 12. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. The specific function of SNX12 has yet to be elucidated. 132
26365 132828 cd07295 PX_Grd19 The phosphoinositide binding Phox Homology domain of fungal Grd19. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Grd19 is involved in the localization of late Golgi membrane proteins in yeast. Grd19 associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer. 116
26366 132829 cd07296 PX_PLD1 The phosphoinositide binding Phox Homology domain of Phospholipase D1. The PX domain is a phosphoinositide binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. PLD1 contains PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. It acts as an effector of Rheb in the signaling of the mammalian target of rapamycin (mTOR), a serine/threonine protein kinase that transduces nutrients and other stimuli to regulate many cellular processes. PLD1 also regulates the secretion of the procoagulant von Willebrand factor (VWF) in endothelial cells. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of PLD1 specifically binds to phosphatidylinositol-3,4,5-trisphosphate [PI(3,4,5)P3], which enables PLD1 to mediate signals via the ERK1/2 pathway. 135
26367 132830 cd07297 PX_PLD2 The phosphoinositide binding Phox Homology domain of Phospholipase D2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. PLD activity has been detected in viruses, bacteria, yeast, plants, and mammals, but the PX domain is not present in PLDs from viruses and bacteria. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. PLD2 contains PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. It mediates EGF-dependent insulin secretion and EGF-induced Ras activation by the guanine nucleotide-exchange factor Son of sevenless (Sos). It regulates mast cell activation by associating and promoting the activation of the protein tyrosine kinase Syk. PLD2 also participates in the sphingosine 1-phosphate-mediated pathway that stimulates the migration of endothelial cells, an important factor in angiogenesis. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. 130
26368 132831 cd07298 PX_RICS The phosphoinositide binding Phox Homology domain of PX-RICS. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. RICS is a Rho GTPase-activating protein for cdc42 and Rac1. It is implicated in the regulation of postsynaptic signaling and neurite outgrowth. An N-terminal splicing variant of RICS containing additional PX and Src Homology 3 (SH3) domains, also called PX-RICS, is the main isoform expressed during neural development. PX-RICS is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of PX-RICS specifically binds phosphatidylinositol 3-phosphate (PI3P), PI4P, and PI5P. 115
26369 132832 cd07299 PX_TCGAP The phosphoinositide binding Phox Homology domain of Tc10/Cdc42 GTPase-activating protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. TCGAP (Tc10/Cdc42 GTPase-activating protein) contains N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal proline-rich regions. It is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. It interacts with cdc42 and TC10beta through its GAP domain and with phosphatidylinositol-(4,5)-bisphosphate [PI(4,5)P2] through its PX domain. It is translocated to the plasma membrane in adipocytes in response to insulin and may be involved in the regulation of insulin-stimulated glucose transport. TCGAP has also been named sorting nexins 26 (SNX26). SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. It is unknown whether TCGAP also functions as a SNX. 113
26370 132833 cd07300 PX_SNX20 The phosphoinositide binding Phox Homology domain of Sorting Nexin 20. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX20 interacts with P-Selectin glycoprotein ligand-1 (PSGL-1), a surface-expressed mucin that acts as a ligand for the selectin family of adhesion proteins. The PX domain of SNX20 binds PIs and targets the SNX20/PSGL-1 complex to endosomes. SNX20 may function in the sorting and cycling of PSGL-1 into endosomes. 114
26371 132834 cd07301 PX_SNX21 The phosphoinositide binding Phox Homology domain of Sorting Nexin 21. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX21, also called SNX-L, is distinctly and highly-expressed in fetal liver and may be involved in protein sorting and degradation during embryonic liver development. 112
26372 143636 cd07302 CHD cyclase homology domain. Catalytic domains of the mononucleotidyl cyclases (MNC's), also called cyclase homology domains (CHDs), are part of the class III nucleotidyl cyclases. This class includes eukaryotic and prokaryotic adenylate cyclases (AC's) and guanylate cyclases (GC's). They seem to share a common catalytic mechanism in their requirement for two magnesium ions to bind the polyphosphate moiety of the nucleotide. 177
26373 132765 cd07303 Porin3 Eukaryotic porin family that forms channels in the mitochondrial outer membrane. The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane. 274
26374 143612 cd07304 Chorismate_synthase Chorismate synthase, the enzyme catalyzing the final step of the shikimate pathway. Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway: the conversion of 5-enolpyruvylshikimate-3-phosphate (EPSP) to chorismate, a precursor for the biosynthesis of aromatic compounds. This process has an absolute requirement for reduced FMN as a co-factor which is thought to facilitate cleavage of C-O bonds by transiently donating an electron to the substrate, with no overall change in its redox state. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two classes: Enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment, are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapy against infectious diseases such as tuberculosis (TB). 344
26375 132766 cd07305 Porin3_Tom40 Translocase of outer mitochondrial membrane 40 (Tom40). Tom40 forms a channel in the mitochondrial outer membrane with a pore about 1.5 to 2.5 nanometers wide. It functions as a transport channel for unfolded protein chains and forms a complex with Tom5, Tom6, Tom7, and Tom22. The primary receptors Tom20 and Tom70 recruit the unfolded precursor protein from the mitochondrial-import stimulating factor (MSF) or cytosolic Hsc70. The precursor passes through the Tom40 channel and through another channel in the inner membrane, formed by Tim23, to be finally translocated into the mitochondrial matrix. The process depends on a proton motive force across the inner membrane and requires a contact site where the outer and inner membranes come close. Tom40 is also involved in inserting outer membrane proteins into the membrane, most likely not via a lateral opening in the pore, but by transferring precursor proteins to an outer membrane sorting and assembly machinery. 279
26376 132767 cd07306 Porin3_VDAC Voltage-dependent anion channel of the outer mitochondrial membrane. The voltage-dependent anion channel (VDAC) regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane, which is highly permeable to small molecules. VDAC is the most abundant protein in the outer membrane, and membrane potentials can toggle VDAC between open or high-conducting and closed or low-conducting forms. VDAC binds to and is regulated in part by hexokinase, an interaction that renders mitochondria less susceptible to pro-apoptotic signals, most likely by interfering with VDAC's capability to respond to Bcl-2 family proteins. While VDAC appears to play a key role in mitochondrially induced cell death, a proposed involvement in forming the mitochondrial permeability transition pore, which is characteristic for damaged mitochondria and apoptosis, has been challenged by more recent studies. 276
26377 153271 cd07307 BAR The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively. 194
26378 173892 cd07308 lectin_leg-like legume-like lectins: ERGIC-53, ERGL, VIP36, VIPL, EMP46, and EMP47. The legume-like (leg-like) lectins are eukaryotic intracellular sugar transport proteins with a carbohydrate recognition domain similar to that of the legume lectins. This domain binds high-mannose-type oligosaccharides for transport from the endoplasmic reticulum to the Golgi complex. These leg-like lectins include ERGIC-53, ERGL, VIP36, VIPL, EMP46, EMP47, and the UIP5 (ULP1-interacting protein 5) precursor protein. Leg-like lectins have different intracellular distributions and dynamics in the endoplasmic reticulum-Golgi system of the secretory pathway and interact with N-glycans of glycoproteins in a calcium-dependent manner, suggesting a role in glycoprotein sorting and trafficking. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely. 218
26379 213985 cd07309 PHP Polymerase and Histidinol Phosphatase domain. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PHP in polymerases has trinuclear zinc/magnesium dependent proofreading activity. It has also been shown that the PHP domain functions in DNA repair. The PHP structures have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 88
26380 143583 cd07311 terB_like_1 tellurium resistance terB-like protein, subgroup 1. This family includes several uncharacterized bacterial proteins. The prototype of this CD is tellurite resistance protein from Nostoc punctiforme that belongs to COG3793. Its precise biological function and the mechanism responsible for tellurium resistance remain poorly understood. 150
26381 143584 cd07313 terB_like_2 tellurium resistance terB-like protein, subgroup 2. This family includes several uncharacterized bacterial proteins. Protein sequence homology analysis shows they are similar to tellurium resistance protein terB, but the function of this family is unknown. 104
26382 143585 cd07316 terB_like_DjlA N-terminal tellurium resistance protein terB-like domain of heat shock DnaJ-like proteins. Tellurium resistance terB-like domain of the DnaJ-like DjlA proteins. This family represents the terB-like domain of DjlA-like proteins, a subgroup of heat shock DnaJ-like proteins. Escherichia coli DjlA is a type III membrane protein with a small N-terminal transmembrane region and DnaJ-like domain on the extreme C-terminus. Overproduction has been shown to activate the RcsC pathway, which regulates the production of the capsular exopolysaccharide colanic acid. The specific function of this domain is unknown. 106
26383 153371 cd07320 Extradiol_Dioxygenase_3B_like Subunit B of Class III Extradiol ring-cleavage dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity. 260
26384 153390 cd07321 Extradiol_Dioxygenase_3A_like Subunit A of Class III extradiol dioxygenases. Extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. There are two major groups of dioxygenases according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents subunit A of class III extradiol dioxygenase enzymes. The A subunit is the smaller, non-catalytic subunit. Enzymes that belong to this family include Protocatechuate 4,5-dioxygenase (LigAB) A subunit, 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB) A subunit, Gallate Dioxygenase and proteins of unknown function. 77
26385 143474 cd07322 PriL_PriS_Eukaryotic Eukaryotic core primase: Large subunit, PriL. Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). In eukaryotic organisms, a heterotetrameric enzyme formed by DNA polymerase alpha, the B subunit and two primase subunits has primase activity. Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL such as stabilization of the PriS, involvement in synthesis initiation, improvement of primase processivity, determination of product size and transfer of the products to DNA polymerase alpha. 390
26386 153396 cd07323 LAM LA motif RNA-binding domain. This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 75
26387 320683 cd07324 M48C_Oma1-like Oma1 peptidase-like, integral membrane metallopeptidase. This family contains peptidase M48 subfamily C (also known as Oma1 peptidase or mitochondrial metalloendopeptidase OMA1), including similar peptidases containing tetratricopeptide (TPR) repeats, as well as uncharacterized proteins such as E. coli bepA (formerly yfgC), ycaL and loiP (formerly yggG), considered to be putative metallopeptidases. Oma1 peptidase is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Homologs of Oma1 are present in higher eukaryotes, eubacteria and archaebacteria, suggesting that Oma1 is the founding member of a conserved family of membrane-embedded metallopeptidases, all containing the zinc metalloprotease motif (HEXXH). M48 peptidases proteolytically remove the C-terminal three residues of farnesylated proteins. 142
26388 320684 cd07325 M48_Ste24p_like M48 Ste24 endopeptidase-like, integral membrane metallopeptidase. This family contains peptidase M48 family Ste24p-like proteins that are as yet uncharacterized, but probably function as intracellular, membrane-associated zinc metalloproteases; they all contain the HEXXH Zn-binding motif, which is critical for Ste24p activity. They likely remove the C-terminal three residues of farnesylated proteins proteolytically and are possibly associated with the endoplasmic reticulum and golgi. Some members also contain ankyrin domains which occur in very diverse families of proteins and mediate protein-protein interactions. 199
26389 320685 cd07326 M56_BlaR1_MecR1_like Peptidase M56-like including those in BlaR1 and MecR1, integral membrane metallopeptidase. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both, MecR1 and BlaR1, are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized. 165
26390 320686 cd07327 M48B_HtpX_like HtpX-like membrane-bound metallopeptidase. This family contains peptidase M48 subfamily B, also known as HtpX, which consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX, an integral membrane (IM) metallopeptidase, is widespread in bacteria and archaea, and plays a central role in protein quality control by preventing the accumulation of misfolded proteins in the membrane. Its expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and eliminating them by collaborating with FtsH, a membrane-bound and ATP-dependent protease. HtpX contains the zinc binding motif (HEXXH), has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. Mutation of the HtpX-like M48 metalloprotease from Leptospira interrogans (LA4131) has been shown to result in altered expression of a subset of metal toxicity and stress response genes. 183
26391 320687 cd07328 M48_Ste24p_like M48 Ste24 endopeptidase-like, integral membrane metallopeptidase. This family contains peptidase M48-like proteins that are as yet uncharacterized, but probably function as intracellular, membrane-associated zinc metalloproteases; they all contain the HEXXH Zn-binding motif, which is critical for Ste24p activity. They likely remove the C-terminal three residues of farnesylated proteins proteolytically and are possibly associated with the endoplasmic reticulum and golgi. 160
26392 320688 cd07329 M56_like Peptidase M56-like, integral membrane metallopeptidase in bacteria. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both, MecR1 and BlaR1, are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized. 188
26393 320689 cd07330 M48A_Ste24p Peptidase M48 CaaX prenyl protease type 1, an integral membrane, Zn-dependent protein. This M48 CaaX prenyl protease 1-like family includes a number of well characterized genes such as those found in Taenia solium metacestode (TsSte24p), Arabidopsis (AtSte24), yeast Ste24p and human (Hs Ste24p) as well as several uncharacterized genes such as YhfN, some of which also contain tetratricopeptide (TPR) repeats. All members of this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be intimately associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) in the C-terminal. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. The gene ZmpSte24, also known as FACE-1 in humans, a member of this family, is involved in the post-translational processing of prelamin A to mature lamin A, a major component of the nuclear envelope. ZmpSte24 deficiency causes an accumulation of prelamin A leading to lipodystrophy and other disease phenotypes while mutations in the protein lead to diseases of lamin processing (laminopathies), such as premature aging disease progeria and metabolic disorders. Some of these mutations map to the peptide-binding site. 285
26394 320690 cd07331 M48C_Oma1_like Peptidase M48C, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer. 187
26395 320691 cd07332 M48C_Oma1_like Peptidase M48C Ste24p, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer. 222
26396 320692 cd07333 M48C_bepA_like Peptidase M48C Ste24p bepA-like, integral membrane protein. This family contains peptidase M48C Ste24p protease bepA (formerly yfgC)-like proteins considered to be putative metallopeptidases, containing a zinc-binding motif, HEXXH, and a COOH-terminal ER retrieval signal (KKXX). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit. In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity. Several members of this family also contain tetratricopeptide (TPR) repeat motifs, which are involved in a variety of functions including protein-protein interactions. BepA has been shown to possess protease activity and is responsible for the degradation of incorrectly folded LptD, an essential outer-membrane protein (OMP) involved in OM transport and assembly of lipopolysaccharide. Overexpression of the bepA protease causes abnormal biofilm architecture. 174
26397 320693 cd07334 M48C_loiP_like Peptidase M48C Ste24p loiP-like, integral membrane protein. This subfamily contains peptidase M48 Ste24p protease loiP (formerly yggG)-like proteins, most of them uncharacterized, that include E. coli loiP and ycaLG and are considered to be putative metallopeptidases, containing a zinc-binding motif, HEXXH, and a COOH-terminal ER retrieval signal (KKXX). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit. In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity. LoiP has been shown to be a metallopeptidase that cleaves its targets preferentially between Phe-Phe residues. It is upregulated when bacteria are subjected to media of low osmolarity, thus yggG was named LoiP (low osmolarity induced protease). Proper membrane localization of LoiP may depend on YfgC, another putative metalloprotease in this subfamily. 215
26398 320694 cd07335 M48B_HtpX_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This family contains peptidase M48 subfamily B, also known as HtpX, which consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX, an integral membrane (IM) metallopeptidase, is widespread in bacteria and archaea, and plays a central role in protein quality control by preventing the accumulation of misfolded proteins in the membrane. Its expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and eliminating them by collaborating with FtsH, a membrane-bound and ATP-dependent protease. HtpX contains the zinc binding motif (HEXXH), has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. Mutation of the HtpX-like M48 metalloprotease from Leptospira interrogans (LA4131) has been shown to result in altered expression of a subset of metal toxicity and stress response genes. 240
26399 320695 cd07336 M48B_HtpX_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. 266
26400 320696 cd07337 M48B_HtpX_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. 203
26401 320697 cd07338 M48B_HtpX_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. 216
26402 320698 cd07339 M48B_HtpX_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. 229
26403 320699 cd07340 M48B_Htpx_like Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. 246
26404 320700 cd07341 M56_BlaR1_MecR1_like Peptidase M56-like including those in BlaR1 and MecR1, integral membrane metallopeptidase. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both, MecR1 and BlaR1, are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized. 187
26405 320701 cd07342 M48C_Oma1_like M48C peptidase, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer. 158
26406 320702 cd07343 M48A_Zmpste24p_like Peptidase M48 subfamily A, a type 1 CaaX endopeptidase. This family contains peptidase family M48 subfamily A which includes a number of well-characterized genes such as those found in humans (ZMPSTE24, also known as farnesylated protein-converting enzyme 1 or FACE-1 or Hs Ste24), Taenia solium metacestode (TsSte24p), Arabidopsis (AtSte24) and yeast (Ste24p). Ste24p contains the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. It is thought to be intimately associated with the endoplasmic reticulum (ER), regardless of whether its genes possess the conventional signal motif (KKXX) in the C-terminal. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. Ste24p is involved in the post-translational processing of prelamin A to mature lamin A, a major component of the nuclear envelope. ZmpSte24 deficiency causes an accumulation of prelamin A leading to lipodystrophy and other disease phenotypes, while mutations in this gene or in that encoding its substrate, prelamin A, result in a series of human inherited diseases known as laminopathies, the most severe of which are Hutchinson Gilford progeria syndrome (HGPS) and restrictive dermopathy (RD) which arise due to unsuccessful maturation of prelamin A. Two forms of mandibuloacral dysplasia, a condition that causes a variety of abnormalities involving bone development, skin pigmentation, and fat distribution, are caused by mutations in two different genes; mutations in the LMNA gene, which normally provides instructions for making lamin A and lamin C, cause mandibuloacral dysplasia with A-type lipodystrophy (MAD-A), and mutations in the ZMPSTE24 gene cause mandibuloacral dysplasia with B-type lipodystrophy (MAD-B). Within cells, these genes are involved in maintaining the structure of the nucleus and may play a role in many cellular processes. Certain HIV protease inhibitors have been shown to inhibit the enzymatic activity of ZMPSTE24, but not enzymes involved in prelamin A processing. 405
26407 320703 cd07344 M48_yhfN_like Peptidase M48 YhfN-like, a novel minigluzincin. M48 YhfN-like protease is considered a CaaX prenyl protease 1 homolog, with most of the sequences in this family as yet uncharacterized. It contains the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. It is probably associated with the endoplasmic reticulum (ER), regardless of whether its genes possess the conventional signal motif (KKXX) in the C-terminal. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. This novel family of related proteins consists of the soluble minimal scaffold similar to the catalytic domains of the integral-membrane metallopeptidase M48 and M56, thus called minigluzincins. 96
26408 320704 cd07345 M48A_Ste24p-like Peptidase M48 subfamily A-like, putative CaaX prenyl protease. This family contains peptidase family M48 subfamily A-like CaaX prenyl protease 1, most of which are uncharacterized. Some of these contain tetratricopeptide (TPR) repeats at the C-terminus. Proteins in this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be possibly associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) in the C-terminal. These proteins putatively remove the C-terminal three residues of farnesylated proteins proteolytically. 346
26409 349983 cd07346 ABC_6TM_exporters Six-transmembrane helical domain of the ATP-binding cassette transporters. This family represents a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in this family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied among the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 292
26410 259818 cd07347 harmonin_N_like N-terminal protein-binding module of harmonin and similar domains, also known as HHD (harmonin homology domain). This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This domain is also found in CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM). CCM2 along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signaling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours; the C-terminal domain of malcavernin represented here has also been referred to as the Karet domain. Harmonin contains a single copy of this domain at its N-terminus which binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain (a component of the Usher protein network). Whirlin contains two copies of this domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected. 78
26411 132762 cd07348 NR_LBD_NGFI-B The ligand binding domain of Nurr1, a member of conserved family of nuclear receptors. The ligand binding domain of Nerve growth factor-induced-B (NGFI-B): NGFI-B is a member of the nuclear-steroid receptor superfamily. NGFI-B is classified as an orphan receptor because no ligand has yet been identified. NGFI-B is an early immediate gene product of the embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, NGFI-B has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 238
26412 132763 cd07349 NR_LBD_SHP The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of the Small Heterodimer Partner (SHP): SHP is a member of the nuclear receptor superfamily. SHP has a ligand binding domain, but lacks the DNA binding domain typical of almost all of the nuclear receptors. It functions as a transcriptional coregulator by directly interacting with other nuclear receptors through its AF-2 motif. The closest relative of SHP is DAX1, and the two can form a heterodimer. SHP is an orphan receptor, lacking an identified ligand. 222
26413 132764 cd07350 NR_LBD_Dax1 The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of the DAX1 protein: DAX1 (dosage-sensitive sex reversal adrenal hypoplasia congenita critical region on chromosome X gene 1) is a nuclear receptor with a typical ligand binding domain, but lacks the DNA binding domain. DAX1 plays an important role in the normal development of several hormone-producing tissues. Duplications of the region of the X chromosome containing DAX1 cause dosage sensitive sex reversal. DAX1 acts as a global repressor of many nuclear receptors, including SF-1, LRH-1, ERR, ER, AR and PR. DAX1 can form a homodimer and can heterodimerize with its alternatively spliced isoform DAX1A and other nuclear receptors such as SHP, ERalpha and SF-1. 232
26414 259819 cd07353 harmonin_N N-terminal protein-binding module of harmonin. Harmonin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein, which organizes the Usher protein network of the inner ear and the retina. Harmonin contains a single copy of this domain, which is found at the N-terminus of all three harmonin isoform classes (a, b and c), and which precedes the first PDZ protein-binding domain, PDZ1. This harmonin_N domain binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. 79
26415 259820 cd07354 HN_L-delphilin-R1_like first harmonin_N_like domain (repeat 1) of L-delphilin, and related domains. This subgroup contains the first of two harmonin_N_like domains of an alternatively spliced longer variant of mouse delphilin (L-delphilin, isoform 1), and related domains. Delphilin is a scaffold protein which binds the glutamate receptor delta-2 (GRID2) subunit and the monocarboxylate transporter 2 at the cerebellar parallel fiber-Purkinje cell synapses. The N-terminus of L-delphilin contains this harmonin_N_like domain preceded by a postsynaptic density-95/discs-large/ZO-1 (PDZ) protein-binding domain, PDZ1. L-delphilin, in common with the shorter C-terminal isoforms (S-delphilin/delphilin alpha and delphilin beta) has a second harmonin_N_like domain (not belonging to this subgroup) and a second PDZ domain, PDZ2. This first harmonin_N_like domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. 80
26416 259821 cd07355 HN_L-delphilin-R2_like second harmonin_N_like domain (repeat 2) of L-delphilin, and related domains. This subgroup contains the second of two harmonin_N_like domains of an alternatively spliced longer variant of mouse delphilin (L-delphilin), and related domains. Delphilin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds the glutamate receptor delta-2 (GRID2) subunit and the monocarboxylate transporter 2 at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain in L-delphilin follows the second PDZ protein-binding domain, PDZ2; it is also found in the shorter C-terminal isoforms (S-delphilin/delphilin alpha and delphilin beta). It is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. The first harmonin_N_like domain of L-delphilin belongs to a different subgroup and is missing from S-delphilin. 80
26417 259822 cd07356 HN_L-whirlin_R1_like first harmonin_N_like domain (repeat 1) of the long isoform of whirlin, and related domains. This subgroup contains the first of two harmonin_N_like domains of the long isoform of whirlin, and related domains. Whirlin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds various components of the Usher protein network of the inner ear and the retina: erythrocyte protein p55, usherin, VlGR1, and myosin XVa. The long isoform of whirlin contains two harmonin_N_like domains, and three PDZ protein-binding domains, PDZ1-3. This first harmonin_N_like domain precedes PDZ1, and is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. This first harmonin_N_like domain has been assayed for interaction with the cytoplasmic domain of cadherin 23 (a component of the Usher network and an interacting partner of the harmonin N-domain), however no interaction could be detected. The short whirlin isoform, derived from an alternative start ATG, lacks this first harmonin_N_like domain. The short isoform has in common with the long isoform, the second harmonin_N_like domain (designated repeat 2, not present in this subgroup), and PDZ3. 78
26418 259823 cd07357 HN_L-whirlin_R2_like second harmonin_N_like domain (repeat 2) of the long isoform of whirlin, and related domains. This subgroup contains the second of two harmonin_N_like domains found in the long isoform of whirlin, and related domains. Whirlin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds various components of the Usher protein network of the inner ear and the retina: erythrocyte protein p55, usherin, VlGR1, and myosin XVa. The long isoform of whirlin contains two harmonin_N_like domains, and three PDZ protein-binding domains, PDZ1-3. The short whirlin isoform, derived from an alternative start ATG, lacks the first harmonin_N_like domain but has in common with the long isoform, this second harmonin_N_like domain (designated repeat 2, included in this subgroup) and PDZ3. This second harmonin_N_like domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. 81
26419 259824 cd07358 HN_PDZD7_like harmonin_N_like domain, a protein-binding module of PDZ domain-containing protein 7 and related proteins. Human PDZD7 is a scaffolding protein which associates with the Usher Syndrome protein network, and localizes to the stereocilia Ankle-link. Usher syndrome is the leading cause of genetic deaf-blindness. PDZD7 has a role in Usher syndrome type 2 (and not in USH1) in humans. Whirlin, Usherin and GPR98 are other USH2 proteins. The latter two form the ankle links and whirlin is thought to be a scaffold for protein interactions at these links. PDZD7, whirlin, and harmonin (an USH1 protein) have a similar domain composition. The domain represented here is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. Cooperative effects of mutations in PDZD7 and Usherin, and in PDZD7 and GPR98, result in a digenic USH2 phenotype. 78
26420 153372 cd07359 PCA_45_Doxase_B_like Subunit B of the Class III Extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, and similar enzymes. This subfamily of class III extradiol dioxygenases consists of a number of proteins with known enzymatic activities: Protocatechuate (PCA) 4,5-dioxygenase (LigAB), 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB), 3-O-Methylgallate Dioxygenase, 2-aminophenol 1,6-dioxygenase, as well as proteins without any known enzymatic activity. These proteins play essential roles in the degradation of aromatic compounds by catalyzing the incorporation of both atoms of molecular oxygen into their preferred substrates. As members of the Class III extradiol dioxygenase family, the enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. 271
26421 153373 cd07361 MEMO_like Memo (mediator of ErbB2-driven cell motility) is co-precipitated with the C terminus of ErbB2, a protein involved in cell motility. This subfamily is composed of Memo (mediator of ErbB2-driven cell motility) and similar proteins. Memo is a protein that is co-precipitated with the C terminus of ErbB2, a protein involved in cell motility. It is required for the ErbB2-driven cell mobility and is found in protein complexes with cofilin, ErbB2 and PLCgamma1. However, Memo is not homologous to any known signaling proteins, and its function in ErbB2 signaling is not known. Structural studies show that Memo binds directly to a specific ErbB2-derived phosphopeptide. Memo is homologous to class III nonheme iron-dependent extradiol dioxygenases, however, no metal binding or enzymatic activity can be detected for Memo. This subfamily also contains a few members containing a C-terminal AMMECR1-like domain. The AMMECR1 protein was proposed to be a regulatory factor that is potentially involved in the development of AMME contiguous gene deletion syndrome. 266
26422 153374 cd07362 HPCD_like Class III extradiol dioxygenases with similarity to homoprotocatechuate 2,3-dioxygenase, which catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate. This subfamily of class III extradiol dioxygenases consists of two types of proteins with known enzymatic activities; 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD) and 2-amino-5-chlorophenol 1,6-dioxygenase. HPCD catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate (hpca), a central intermediate in the bacterial degradation of aromatic compounds. The enzyme incorporates both atoms of molecular oxygen into hpca, resulting in aromatic ring-opening to yield the product alpha-hydroxy-delta-carboxymethyl cis-muconic semialdehyde. 2-amino-5-chlorophenol 1,6-dioxygenase catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. The enzyme is probably a heterotetramer composed of two alpha and two beta subunits. Alpha and beta subunits share significant sequence similarity and both belong to this family. Like all Class III extradiol dioxygenases, these enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 272
26423 153375 cd07363 45_DOPA_Dioxygenase The Class III extradiol dioxygenase, 4,5-DOPA Dioxygenase, catalyzes the incorporation of both atoms of molecular oxygen into 4,5-dihydroxy-phenylalanine. This subfamily is composed of plant 4,5-DOPA Dioxygenase, the uncharacterized Escherichia coli protein Jw3007, and similar proteins. 4,5-DOPA Dioxygenase catalyzes the incorporation of both atoms of molecular oxygen into 4,5-dihydroxy-phenylalanine (4,5-DOPA). The reaction results in the opening of the cyclic ring between carbons 4 and 5 and producing an unstable seco-DOPA that rearranges to betalamic acid. 4,5-DOPA Dioxygenase is a key enzyme in the biosynthetic pathway of the plant pigment betalain. Homologs of DODA are present not only in betalain-producing plants but also in bacteria and archaea. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 253
26424 153376 cd07364 PCA_45_Dioxygenase_B Subunit B of the Class III extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of protocatechuate. Protocatechuate 4,5-dioxygenase (LigAB) catalyzes the oxidization and subsequent ring-opening of protocatechuate (or 3,4-dihydroxybenzoic acid, PCA), an intermediate in the breakdown of lignin and other compounds. Protocatechuate 4,5-dioxygenase is an aromatic ring opening dioxygenase belonging to the class III extradiol enzyme family, a group of enzymes that cleaves aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon using a non-heme Fe(II). LigAB is composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. The B subunit (LigB) is the catalytic subunit of LigAB. 277
26425 153377 cd07365 MhpB_like Subunit B of the Class III Extradiol ring-cleavage dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB), which catalyzes the oxidization and subsequent ring-opening of 2,3-dihydroxyphenylpropionate. 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB) catalyzes the oxidization and subsequent ring-opening of 2,3-dihydroxyphenylpropionate, yielding the product 2-hydroxy-6-oxo-nona-2,4-diene 1,9-dicarboxylate. It is an essential enzyme in the beta-phenylpropionic degradation pathway, in which beta-phenylpropionic is first hydrolyzed to produce 2,3-dihydroxyphenylpropionate. The enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. MhpB is likely to be a tetramer. 310
26426 153378 cd07366 3MGA_Dioxygenase Subunit B of the Class III Extradiol ring-cleavage dioxygenase, 3-O-Methylgallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 3-O-Methylgallate. 3-O-Methylgallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of 3-O-Methylgallate (3MGA) between carbons 2 and 3. 3-O-Methylgallate Dioxygenase is a key enzyme in the syringate degradation pathway, in which the syringate is first converted to 3-O-Methylgallate by O-demethylase. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which uses a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. 328
26427 153379 cd07367 CarBb CarBb is the B subunit of the Class III Extradiol ring-cleavage dioxygenase, 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. CarBb is the B subunit of 2-aminophenol 1,6-dioxygenase (CarB), which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. It is a key enzyme in the carbazole degradation pathway isolated from bacterial strains with carbazole degradation ability. The enzyme is a heterotetramer composed of two A and two B subunits. CarB belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Although the enzyme was originally isolated as a meta-cleavage enzyme for 2'-aminobiphenyl-2,3-diol involved in carbazole degradation, it has also shown high specificity for 2,3-dihydroxybiphenyl. 268
26428 153380 cd07368 PhnC_Bs_like PhnC is a Class III Extradiol ring-cleavage dioxygenase involved in the polycyclic aromatic hydrocarbon (PAH) catabolic pathway. This subfamily is composed of Burkholderia sp. PhnC and similar proteins. PhnC is one of nine protein products encoded by the phn locus. These proteins are involved in the polycyclic aromatic hydrocarbon (PAH) catabolic pathway. PhnC is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. 277
26429 153381 cd07369 PydA_Rs_like PydA is a Class III Extradiol ring-cleavage dioxygenase required for the degradation of 3-hydroxy-4-pyridone (HP). This subfamily is composed of Rhizobium sp. PydA and similar proteins. PydA is required for the degradation of 3-hydroxy-4-pyridone (HP), an intermediate in the Leucaena toxin mimosine degradation pathway. It is a member of the class III extradiol dioxygenase family, a group of enzymes that use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. 329
26430 153382 cd07370 HPCD The Class III extradiol dioxygenase, homoprotocatechuate 2,3-dioxygenase, catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate. 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD) catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate (hpca), a central intermediate in the bacterial degradation of aromatic compounds. The enzyme incorporates both atoms of molecular oxygen into hpca, resulting in aromatic ring-opening to yield alpha-hydroxy-delta-carboxymethyl cis-muconic semialdehyde. HPCD is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 280
26431 153383 cd07371 2A5CPDO_AB The alpha and beta subunits of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. This subfamily contains both alpha and beta subunits of 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO), which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, an intermediate during p-chloronitrobenzene degradation. 2A5CPDO is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. Alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. 268
26432 153384 cd07372 2A5CPDO_B The beta subunit of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO), catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active 2A5CPDO enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. The alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. This model describes the beta subunit, which contains a putative metal binding site with two conserved histidines; these residues are equivalent to two out of three Fe(II) binding residues present in the catalytic subunit dioxygenase LigB. The alpha subunit does not contain these potential metal binding residues. The 2A5CPDO beta subunit may be the catalytic subunit of the enzyme. 294
26433 153385 cd07373 2A5CPDO_A The alpha subunit of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO) catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. The alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. This model describes the alpha subunit, which does not contain a potential metal binding site and may not possess catalytic activity. 271
26434 143620 cd07374 CYTH-like_Pase CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases. CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions. 174
26435 153408 cd07375 Anticodon_Ia_like Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains. This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway. 117
26436 143511 cd07376 PLPDE_III_DSD_D-TA_like Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase. This family includes eukaryotic D-serine dehydratases (DSD), cryptic DSDs from bacteria, D-threonine aldolases (D-TA), low specificity D-TAs, and similar uncharacterized proteins. DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. Members of this family are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on similarity to AR, it is possible members of this family also form dimers in solution. 345
26437 153418 cd07377 WHTH_GntR Winged helix-turn-helix (WHTH) DNA-binding domain of the GntR family of transcriptional regulators. This CD represents the winged HTH DNA-binding domain of the GntR (named after the gluconate operon repressor in Bacillus subtilis) family of bacterial transcriptional regulators and their putative homologs found in eukaryota and archaea. The GntR family has over 6000 members distributed among almost all bacterial species, which is comprised of FadR, HutC, MocR, YtrA, AraR, PlmA, and other subfamilies for the regulation of the most varied biological process. The monomeric proteins of the GntR family are characterized by two function domains: a small highly conserved winged helix-turn-helix prokaryotic DNA binding domain in the N-terminus, and a very diverse regulatory ligand-binding domain in the C-terminus for effector-binding/oligomerization, which provides the basis for the subfamily classifications. Binding of the effector to GntR-like transcriptional regulators is presumed to result in a conformational change that regulates the DNA-binding affinity of the repressor. The GntR-like proteins bind as dimers, where each monomer recognizes a half-site of 2-fold symmetric DNA sequences. 66
26438 277324 cd07378 MPP_ACP5 Homo sapiens acid phosphatase 5 and related proteins, metallophosphatase domain. Acid phosphatase 5 (ACP5) removes the mannose 6-phosphate recognition marker from lysosomal proteins. The exact site of dephosphorylation is not clear. Evidence suggests dephosphorylation may take place in a prelysosomal compartment as well as in the lysosome. ACP5 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 286
26439 277325 cd07379 MPP_239FB Homo sapiens 239FB and related proteins, metallophosphatase domain. 239FB (Fetal brain protein 239) is thought to play a role in central nervous system development, but its specific role is unknown. 239FB is expressed predominantly in human fetal brain from a gene located in the chromosome 11p13 region associated with the mental retardation component of the WAGR (Wilms tumor, Aniridia, Genitourinary anomalies, Mental retardation) syndrome. Orthologous brp-like (brain protein 239-like) proteins have been identified in the invertebrate amphioxus group and in vertebrates. 239FB belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 135
26440 277326 cd07380 MPP_CWF19_N Schizosaccharomyces pombe CWF19 and related proteins, N-terminal metallophosphatase domain. CWF19 cell cycle control protein (also known as CWF19-like 1 (CWF19L1) in Homo sapiens), N-terminal metallophosphatase domain. CWF19 contains C-terminal domains similar to that found in the CwfJ cell cycle control protein. The metallophosphatase domain belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 149
26441 277327 cd07381 MPP_CapA CapA and related proteins, metallophosphatase domain. CapA is one of three membrane-associated enzymes in Bacillus anthracis that is required for synthesis of gamma-polyglutamic acid (PGA), a major component of the bacterial capsule. The YwtB and PgsA proteins of Bacillus subtilis are closely related to CapA and are also included in this alignment model. CapA belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 239
26442 277328 cd07382 MPP_DR1281 Deinococcus radiodurans DR1281 and related proteins, metallophosphatase domain. DR1281 is an uncharacterized Deinococcus radiodurans protein with a domain that belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 255
26443 277329 cd07383 MPP_Dcr2 Saccharomyces cerevisiae DCR2 phosphatase and related proteins, metallophosphatase domain. DCR2 phosphatase (Dosage-dependent Cell Cycle Regulator 2) functions together with DCR1 (Gid8) in a common pathway to accelerate initiation of DNA replication in Saccharomyces cerevisiae. Genetic analysis suggests that DCR1 functions upstream of DCR2. DCR2 interacts with and dephosphorylates Sic1, an inhibitor of mitotic cyclin/cyclin-dependent kinase complexes, which may serve to trigger the initiation of cell division. DCR2 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 202
26444 277330 cd07384 MPP_Cdc1_like Saccharomyces cerevisiae CDC1 and related proteins, metallophosphatase domain. Cdc1 (also known as XlCdc1 in Xenopus laevis) is an endoplasmic reticulum-localized transmembrane lipid phosphatase with a metallophosphatase domain facing the ER lumen. In budding yeast, the gene encoding CDC1 is essential while nonlethal mutations cause defects in Golgi inheritance and actin polarization. Cdc1 mutant cells accumulate an unidentified phospholipid, suggesting that Cdc1 is a lipid phosphatase. Cdc1 mutant cells also have highly elevated intracellular calcium levels suggesting a possible role for Cdc1 in calcium regulation. The 5' flanking region of Cdc1 is a regulatory region with conserved binding site motifs for AP1, AP2, Sp1, NF-1 and CREB. DNA polymerase delta consists of at least four subunits - Pol3, Cdc1, Cdc27, and Cdm1. This group also contains Saccharomyces cerevisiae TED1 (Trafficking of Emp24p/Erv25p-dependent cargo disrupted 1), which acts together with Emp24p and Erv25p in cargo exit from the ER, and human MPPE1. The human MPPE1 gene is a candidate susceptibility gene for bipolar disorder. These proteins belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 172
26445 277331 cd07385 MPP_YkuE_C Bacillus subtilis YkuE and related proteins, C-terminal metallophosphatase domain. YkuE is an uncharacterized Bacillus subtilis protein with a C-terminal metallophosphatase domain and an N-terminal twin-arginine (RR) motif. An RR-signal peptide derived from the Bacillus subtilis YkuE protein can direct Tat-dependent secretion of agarase in Streptomyces lividans. This is an indication that YkuE is transported by the Bacillus subtilis Tat (Twin-arginine translocation) pathway machinery. YkuE belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 224
26446 277332 cd07386 MPP_DNA_pol_II_small_archeal_C archeal DNA polymerase II, small subunit, C-terminal metallophosphatase domain. The small subunit of the archeal DNA polymerase II contains a C-terminal metallophosphatase domain. This domain is thought to be functionally active because the active site residues required for phosphoesterase activity in other members of this superfamily are intact. The archeal replicative DNA polymerases are thought to possess intrinsic phosphatase activity that hydrolyzes the pyrophosphate released during nucleotide polymerization. This domain belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 243
26447 277333 cd07387 MPP_PolD2_C PolD2 (DNA polymerase delta, subunit 2), C-terminal domain. PolD2 (DNA polymerase delta, subunit 2) is an auxiliary subunit of the eukaryotic DNA polymerase delta (PolD) complex thought to play a regulatory role and to serve as a scaffold for PolD assembly by interacting simultaneously with all of the other three subunits. PolD2 is catalytically inactive and lacks the active site residues required for phosphoesterase activity in other members of this superfamily. PolD2 is also involved in the recruitment of several proteins regulating DNA metabolism, including p21, PDIP1, PDIP38, PDIP46, and WRN. Human PolD consists of four subunits: p125 (PolD1), p50 (PolD2), p66 (PolD3), and p12 (PolD4). PolD is one of three major replicases in eukaryotes. PolD also plays an essential role in translesion DNA synthesis, homologous recombination, and DNA repair. Within the PolD complex, PolD2 tightly associates with PolD3. PolD2 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 257
26448 277334 cd07388 MPP_Tt1561 Thermus thermophilus Tt1561 and related proteins, metallophosphatase domain. This family includes bacterial proteins related to Tt1561 (also known as Aq1956 in Aquifex aeolicus), an uncharacterized Thermus thermophilus protein. The conserved domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets, and is thought to allow for productive metal coordination. However, the active site residues required for phosphoesterase activity in other members of this superfamily are poorly conserved in this functionally uncharacterized family. 224
26449 277335 cd07389 MPP_PhoD Bacillus subtilis PhoD and related proteins, metallophosphatase domain. PhoD (also known as alkaline phosphatase D/APaseD in Bacillus subtilis) is a secreted phosphodiesterase encoded by phoD of the Pho regulon in Bacillus subtilis. PhoD homologs are found in prokaryotes, eukaryotes, and archaea. PhoD contains a twin arginine (RR) motif and is transported by the Tat (Twin-arginine translocation) pathway machinery (TatAyCy). This family also includes the Fusarium oxysporum Fso1 protein. PhoD belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 242
26450 277336 cd07390 MPP_AQ1575 Aquifex aeolicus AQ1575 and related proteins, metallophosphatase domain. This family includes bacterial and archeal proteins homologous to AQ1575, an uncharacterized Aquifex aeolicus protein. AQ1575 may play an accessory role in DNA repair, based on the close proximity of its gene to Holliday junction resolvasome genes. The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 170
26451 277337 cd07391 MPP_PF1019 Pyrococcus furiosus PF1019 and related proteins, metallophosphatase domain. This family includes bacterial and archeal proteins homologous to PF1019, an uncharacterized Pyrococcus furiosus protein. The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 175
26452 277338 cd07392 MPP_PAE1087 Pyrobaculum aerophilum PAE1087 and related proteins, metallophosphatase domain. PAE1087 is an uncharacterized Pyrobaculum aerophilum protein with a metallophosphatase domain. The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 190
26453 277339 cd07393 MPP_DR1119 Deinococcus radiodurans DR1119 and related proteins, metallophosphatase domain. DR1119 is an uncharacterized Deinococcus radiodurans protein with a metallophosphatase domain. The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 238
26454 163637 cd07394 MPP_Vps29 Homo sapiens Vps29 and related proteins, metallophosphatase domain. Vps29 (vacuolar sorting protein 29), also known as vacuolar membrane protein Pep11, is a subunit of the retromer complex which is responsible for the retrieval of mannose-6-phosphate receptors (MPRs) from the endosomes for retrograde transport back to the Golgi. Vps29 has a phosphoesterase fold that acts as a protein interaction scaffold for retromer complex assembly as well as a phosphatase with specificity for the cytoplasmic tail of the MPR. The retromer includes the following 5 subunits: Vps35, Vps26, Vps29, and a dimer of the sorting nexins Vps5 (Snx1) and Vps17 (Snx2). Vps29 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 178
26455 277340 cd07395 MPP_CSTP1 Homo sapiens CSTP1 and related proteins, metallophosphatase domain. CSTP1 (complete S-transactivated protein 1) is an uncharacterized Homo sapiens protein with a metallophosphatase domain, that is transactivated by the complete S protein of hepatitis B virus. CSTP1 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 263
26456 277341 cd07396 MPP_Nbla03831 Homo sapiens Nbla03831 and related proteins, metallophosphatase domain. Nbla03831 (also known as LOC56985) is an uncharacterized Homo sapiens protein with a domain that belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 245
26457 277342 cd07397 MPP_NostocDevT-like Nostoc DevT and similar proteins, metallophosphatase domain. DevT (Alr4674) is a putative protein phosphatase from Nostoc PCC 7120 (Anabaena PCC 7120). DevT mutants form mature heterocysts, but they are unable to fix N(2) and must be supplied with a source of combined nitrogen in order to survive. Anabaena DevT shows homology to phosphatases of the PPP family and displays a Mn(2+)-dependent phosphatase activity. DevT is constitutively expressed in both vegetative cells and heterocysts, and is not regulated by NtcA. The heterocyst regulator HetR may exert a certain inhibition on the expression of devT. Under diazotrophic growth conditions, DevT protein accumulates specifically in mature heterocysts. The role that DevT plays in a late essential step of heterocyst differentiation is still unknown. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 245
26458 277343 cd07398 MPP_YbbF-LpxH Escherichia coli YbbF/LpxH and related proteins, metallophosphatase domain. YbbF/LpxH is an Escherichia coli UDP-2,3-diacylglucosamine hydrolase thought to catalyze the fourth step of lipid A biosynthesis, in which a precursor UDP-2,3-diacylglucosamine is hydrolyzed to yield 2,3-diacylglucosamine 1-phosphate and UMP. YbbF belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 217
26459 277344 cd07399 MPP_YvnB Bacillus subtilis YvnB and related proteins, metallophosphatase domain. YvnB (BSU35040) is an uncharacterized Bacillus subtilis protein with a metallophosphatase domain. This family includes bacterial and eukaryotic proteins similar to YvnB. YvnB belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 207
26460 277345 cd07400 MPP_1 Uncharacterized subfamily, metallophosphatase domain. Uncharacterized subfamily of the MPP superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 138
26461 277346 cd07401 MPP_TMEM62_N Homo sapiens TMEM62, N-terminal metallophosphatase domain. TMEM62 (transmembrane protein 62) is an uncharacterized Homo sapiens transmembrane protein with an N-terminal metallophosphatase domain. TMEM62 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 254
26462 277347 cd07402 MPP_GpdQ Enterobacter aerogenes GpdQ and related proteins, metallophosphatase domain. GpdQ (glycerophosphodiesterase Q, also known as Rv0805 in Mycobacterium tuberculosis) is a binuclear metallophosphoesterase from Enterobacter aerogenes that catalyzes the hydrolysis of mono-, di-, and triester substrates, including some organophosphate pesticides and products of the degradation of nerve agents. The GpdQ homolog, Rv0805, has 2',3'-cyclic nucleotide phosphodiesterase activity. GpdQ and Rv0805 belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 240
26463 277348 cd07403 MPP_TTHA0053 Thermus thermophilus TTHA0053 and related proteins, metallophosphatase domain. TTHA0053 is an uncharacterized Thermus thermophilus protein with a domain that belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 130
26464 277349 cd07404 MPP_MS158 Microscilla MS158 and related proteins, metallophosphatase domain. MS158 is an uncharacterized Microscilla protein with a metallophosphatase domain. Microscilla proteins MS152 and MS153 are also included in this family. The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 201
26465 277350 cd07405 MPP_UshA_N Escherichia coli UshA and related proteins, N-terminal metallophosphatase domain. UshA is a bacterial periplasmic enzyme with UDP-sugar hydrolase and dinucleoside-polyphosphate hydrolase activities associated with its N-terminal metallophosphatase domain, and 5'-nucleotidase activity associated with its C-terminal domain. UshA has been studied in Escherichia coli where it is expressed from the ushA gene as an immature precursor and proteolytically cleaved to form a mature product upon export to the periplasm. UshA hydrolyzes many different nucleotides and nucleotide derivatives and has been shown to degrade external UDP-glucose to uridine, glucose 1-phosphate and phosphate for utilization by the cell. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 287
26466 277351 cd07406 MPP_CG11883_N Drosophila melanogaster CG11883 and related proteins, N-terminal metallophosphatase domain. CG11883 is an uncharacterized Drosophila melanogaster UshA-like protein with two domains, an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 257
26467 277352 cd07407 MPP_YHR202W_N Saccharomyces cerevisiae YHR202W and related proteins, N-terminal metallophosphatase domain. YHR202W is an uncharacterized Saccharomyces cerevisiae UshA-like protein with two domains, an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 286
26468 277353 cd07408 MPP_SA0022_N Staphylococcus aureus SA0022 and related proteins, N-terminal metallophosphatase domain. SA0022 is an uncharacterized Staphylococcus aureus UshA-like protein with two putative domains, an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. SA0022 also contains a putative C-terminal cell wall anchor domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 255
26469 277354 cd07409 MPP_CD73_N CD73 ecto-5'-nucleotidase and related proteins, N-terminal metallophosphatase domain. CD73 is a mammalian ecto-5'-nucleotidase expressed in endothelial cells and lymphocytes that catalyzes the conversion of 5'-AMP to adenosine in the final step of a pathway that generates adenosine from ATP. This pathway also includes a CD39 nucleoside triphosphate dephosphorylase that mediates the dephosphorylation of ATP to ADP and then to 5'-AMP. These enzymes all have an N-terminal metallophosphatase domain and a C-terminal 5'-nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 279
26470 277355 cd07410 MPP_CpdB_N Escherichia coli CpdB and related proteins, N-terminal metallophosphatase domain. CpdB is a bacterial periplasmic protein with an N-terminal metallophosphatase domain and a C-terminal 3'-nucleotidase domain. This alignment model represents the N-terminal metallophosphatase domain, which has 2',3'-cyclic phosphodiesterase activity, hydrolyzing the 2',3'-cyclic phosphates of adenosine, guanosine, cytosine and uridine to yield nucleoside and phosphate. CpdB also hydrolyzes the chromogenic substrates p-nitrophenyl phosphate (PNPP), bis(PNPP) and p-nitrophenyl phosphorylcholine (NPPC). CpdB is thought to play a scavenging role during RNA hydrolysis by converting the non-transportable nucleotides produced by RNaseI to nucleosides which can easily enter a cell for use as a carbon source. This family also includes YfkN, a Bacillus subtilis nucleotide phosphoesterase with two copies of each of the metallophosphatase and 3'-nucleotidase domains. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 280
26471 277356 cd07411 MPP_SoxB_N Thermus thermophilus SoxB and related proteins, N-terminal metallophosphatase domain. SoxB (sulfur oxidation protein B) is a periplasmic thiosulfohydrolase and an essential component of the sulfur oxidation pathway in archaea and bacteria. SoxB has a dinuclear manganese cluster and is thought to catalyze the release of sulfate from a protein-bound cysteine S-thiosulfonate. SoxB is expressed from the sox (sulfur oxidation) gene cluster, which encodes 15 other sox genes, and has two domains, an N-terminal metallophosphatase domain and a C-terminal 5'-nucleotidase domain. SoxB binds the SoxYZ complex and is thought to function as a sulfate-thiohydrolase. SoxB is closely related to the UshA, YchR, and CpdB proteins, all of which have the same two-domain architecture. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 273
26472 277357 cd07412 MPP_YhcR_N Bacillus subtilis YhcR endonuclease and related proteins, N-terminal metallophosphatase domain. YhcR is a Bacillus subtilis sugar-nonspecific endonuclease. It cleaves endonucleolytically to yield nucleotide 3'-monophosphate products, similar to Staphylococcus aureus micrococcal nuclease. YhcR appears to be located in the cell wall, and is thought to be a substrate for a Bacillus subtilis sortase. YhcR is the major calcium-activated nuclease of B. subtilis. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 295
26473 277358 cd07413 MPP_PA3087 Pseudomonas aeruginosa PA3087 and related proteins, metallophosphatase domain. PA3087 is an uncharacterized protein from Pseudomonas aeruginosa with a metallophosphatase domain that belongs to the phosphoprotein phosphatase (PPP) family. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 222
26474 277359 cd07414 MPP_PP1_PPKL PP1, PPKL (PP1 and kelch-like) enzymes, and related proteins, metallophosphatase domain. PP1 (protein phosphatase type 1) is a serine/threonine phosphatase that regulates many cellular processes including: cell-cycle progression, protein synthesis, muscle contraction, carbohydrate metabolism, transcription and neuronal signaling, through its interaction with at least 180 known targeting proteins. PP1 occurs in all tissues and regulates many pathways, ranging from cell-cycle progression to carbohydrate metabolism. Also included here are the PPKL (PP1 and kelch-like) enzymes including the PPQ, PPZ1, and PPZ2 fungal phosphatases. These PPKLs have a large N-terminal kelch repeat in addition to a C-terminal phosphoesterase domain. The PPP (phosphoprotein phosphatase) family, to which PP1 belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 291
26475 277360 cd07415 MPP_PP2A_PP4_PP6 PP2A, PP4, and PP6 phosphoprotein phosphatases, metallophosphatase domain. PP2A-like family of phosphoprotein phosphatases (PPP's) including PP4 and PP6. PP2A (Protein phosphatase 2A) is a critical regulator of many cellular activities. PP2A comprises about 1% of total cellular proteins. PP2A, together with protein phosphatase 1 (PP1), accounts for more than 90% of all serine/threonine phosphatase activities in most cells and tissues. The PP2A subunit in addition to having a catalytic domain homologous to PP1, has a unique C-terminal tail, containing a motif that is conserved in the catalytic subunits of all PP2A-like phosphatases including PP4 and PP6, and has an important role in PP2A regulation. The PP2A-like family of phosphatases all share a similar heterotrimeric architecture, that includes: a 65kDa scaffolding subunit (A), a 36kDa catalytic subunit (C), and one of 18 regulatory subunits (B). The PPP (phosphoprotein phosphatase) family, to which PP2A belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 285
26476 277361 cd07416 MPP_PP2B PP2B, metallophosphatase domain. PP2B (calcineurin) is a unique serine/threonine protein phosphatase in its regulation by a second messenger (calcium and calmodulin). PP2B is involved in many biological processes including immune responses, the second messenger cAMP pathway, sodium/potassium ion transport in the nephron, cell cycle progression in lower eukaryotes, cardiac hypertrophy, and memory formation. PP2B is highly conserved from yeast to humans, but is absent from plants. PP2B is a heterodimer consisting of a catalytic subunit (CnA) and a regulatory subunit (CnB); CnB contains four Ca2+ binding motifs referred to as EF hands. The PPP (phosphoprotein phosphatase) family, to which PP2B belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. 
This domain is thought to allow for productive metal coordination. 305
26477 277362 cd07417 MPP_PP5_C PP5, C-terminal metallophosphatase domain. Serine/threonine protein phosphatase-5 (PP5) is a member of the PPP gene family of protein phosphatases that is highly conserved among eukaryotes and widely expressed in mammalian tissues. PP5 has a C-terminal phosphatase domain and an extended N-terminal TPR (tetratricopeptide repeat) domain containing three TPR motifs. The PPP (phosphoprotein phosphatase) family, to which PP5 belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 316
26478 163661 cd07418 MPP_PP7 PP7, metallophosphatase domain. PP7 is a plant phosphoprotein phosphatase that is highly expressed in a subset of stomata and thought to play an important role in sensory signaling. PP7 acts as a positive regulator of signaling downstream of cryptochrome blue light photoreceptors. PP7 also controls amplification of phytochrome signaling, and interacts with nucleotidediphosphate kinase 2 (NDPK2), a positive regulator of phytochrome signalling. In addition, PP7 interacts with heat shock transcription factor HSF and up-regulates protective heat shock proteins. PP7 may also play a role in salicylic acid-dependent defense signaling. The PPP (phosphoprotein phosphatase) family, to which PP7 belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP2A, PP2B (calcineurin), PP4, PP5, PP6, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. 
This domain is thought to allow for productive metal coordination. 377
26479 277363 cd07419 MPP_Bsu1_C Arabidopsis thaliana Bsu1 phosphatase and related proteins, C-terminal metallophosphatase domain. Bsu1 encodes a nuclear serine-threonine protein phosphatase found in plants and protozoans. Bsu1 has a C-terminal phosphatase domain and an N-terminal Kelch-repeat domain. Bsu1 is preferentially expressed in elongating plant cells. It modulates the phosphorylation state of Bes1, a transcriptional regulator phosphorylated by the glycogen synthase kinase Bin2, as part of a steroid hormone signal transduction pathway. The PPP (phosphoprotein phosphatase) family, to which Bsu1 belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 311
26480 277364 cd07420 MPP_RdgC Drosophila melanogaster RdgC and related proteins, metallophosphatase domain. RdgC (retinal degeneration C) is a vertebrate serine-threonine protein phosphatase that is required to prevent light-induced retinal degeneration. In addition to its catalytic domain, RdgC has two C-terminal EF hands. Homologs of RdgC include the human phosphatases protein phosphatase with EF hands 1 and -2 (PPEF-1 and -2). PPEF-1 transcripts are present at low levels in the retina, PPEF-2 transcripts and PPEF-2 protein are present at high levels in photoreceptors. The PPP (phosphoprotein phosphatase) family, to which RdgC belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 297
26481 163664 cd07421 MPP_Rhilphs Rhilph phosphatases, metallophosphatase domain. Rhilphs (Rhizobiales/ Rhodobacterales/ Rhodospirillaceae-like phosphatases) are a phylogenetically distinct group of PPP (phosphoprotein phosphatases), found only in land plants. They are named for their close relationship to PPP phosphatases from alpha-Proteobacteria, including Rhizobiales, Rhodobacterales and Rhodospirillaceae. The PPP (phosphoprotein phosphatase) family, to which the Rhilphs belong, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 304
26482 277365 cd07422 MPP_ApaH Escherichia coli ApaH and related proteins, metallophosphatase domain. ApaH (also known as symmetrically cleaving Ap4A hydrolase and bis(5'nucleosyl)-tetraphosphatase) is a bacterial member of the PPP (phosphoprotein phosphatase) family of serine/threonine phosphatases that hydrolyzes the nucleotide-signaling molecule diadenosine tetraphosphate (Ap(4)A) into two ADP and also hydrolyzes Ap(5)A, Gp(4)G, and other extending compounds. Null mutations in apaH result in high intracellular levels of Ap(4)A which correlate with multiple phenotypes, including a decreased expression of catabolite-repressible genes, a reduction in the expression of flagellar operons, and an increased sensitivity to UV and heat. Ap4A hydrolase is important in responding to heat shock and oxidative stress via regulating the concentration of Ap4A in bacteria. Ap4A hydrolase is also thought to play a role in siderophore production, but the mechanism by which ApaH interacts with siderophore pathways is unknown. The PPP (phosphoprotein phosphatase) family, to which ApaH belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, and PrpA/PrpB. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 257
26483 277366 cd07423 MPP_Prp_like Bacillus subtilis PrpE and related proteins, metallophosphatase domain. PrpE (protein phosphatase E) is a bacterial member of the PPP (phosphoprotein phosphatase) family of serine/threonine phosphatases and a key signal transduction pathway component controlling the expression of spore germination receptors GerA and GerK in Bacillus subtilis. PrpE is closely related to ApaH (also known as symmetrical Ap(4)A hydrolase and bis(5'nucleosyl)-tetraphosphatase). PrpE has specificity for phosphotyrosine only, unlike the serine/threonine phosphatases to which it is related. The Bacilli members of this family are single domain proteins while the other members have N- and C-terminal domains in addition to this phosphatase domain. Pnkp is the end-healing and end-sealing component of an RNA repair system present in bacteria. It is composed of three catalytic modules: an N-terminal polynucleotide 5' kinase, a central 2',3' phosphatase, and a C-terminal ligase. Pnkp is a Mn(2+)-dependent phosphodiesterase-monoesterase that dephosphorylates 2',3'-cyclic phosphate RNA ends. An RNA binding site is suggested by a continuous tract of positive surface potential flanking the active site. The PPP (phosphoprotein phosphatase) family, to which PrpE belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. 
MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 235
26484 277367 cd07424 MPP_PrpA_PrpB PrpA and PrpB, metallophosphatase domain. PrpA and PrpB are bacterial type I serine/threonine and tyrosine phosphatases thought to modulate the expression of proteins that protect the cell upon accumulation of misfolded proteins in the periplasm. The PPP (phosphoprotein phosphatase) family, to which PrpA and PrpB belong, is one of two known protein phosphatase families specific for serine and threonine. This family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 201
26485 277368 cd07425 MPP_Shelphs Shewanella-like phosphatases, metallophosphatase domain. This family includes bacterial, eukaryotic, and archeal proteins orthologous to the Shewanella cold-active protein-tyrosine phosphatase, CAPTPase. CAPTPase is an uncharacterized protein that belongs to the Shelph (Shewanella-like phosphatase) family of PPP (phosphoprotein phosphatases). The PPP family is one of two known protein phosphatase families specific for serine and threonine. In addition to Shelphs, the PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes. PPPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 209
26486 143631 cd07429 Cby_like Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins. Chibby (Cby) is a well-conserved nuclear protein that functions as part of the Wnt/beta-catenin signaling pathway. Specifically, Cby binds directly to beta-catenin by interacting with its central region, which harbors armadillo repeats. Cby-beta-catenin interactions may also involve 14-3-3 proteins. By competing with other binding partners of beta-catenin, the Tcf/Lef transcription factors, Cby inhibits transcriptional activation. Cby has been shown to play a role in adipocyte differentiation. The C-terminal region of Cby appears to contain an alpha-helical coiled-coil motif. 108
26487 143632 cd07430 GH15_N Glycoside hydrolase family 15, N-terminal domain. Members of this family are N-terminal domains uniquely found in bacterial and archaeal glucoamylases and glucodextranases. Glucoamylase (glucan 1,4-alpha-glucosidase; 4-alpha-D-glucan glucohydrolase; amyloglucosidase; exo-1,4-alpha-glucosidase; gamma-amylase; lysosomal alpha-glucosidase; EC 3.2.1.3) hydrolyzes alpha-1,4-glucosidic linkages of starch, glycogen and malto-oligosaccharides, releasing beta-D-glucose from the non-reducing end. Glucodextranase (glucan 1,6-alpha-glucosidase; exo-1,6-alpha-glucosidase; EC 3.2.1.70) uses an inverting reaction mechanism to hydrolyze alpha-1,6-glucosidic linkages of dextran and related oligosaccharides, releasing beta-D-glucose from the non-reducing end. These N-terminal domains adopt a structure consisting of antiparallel beta-strands, divided into two beta-sheets, with one sheet wrapped by an extended polypeptide, which appears to stabilize the domain. The function of these domains in the enzymes is as yet unknown. However, it is suggested that domain N of bacterial GA is involved in folding and/or the thermostability of the A domain that forms an (alpha/alpha)6-barrel structure. 260
26488 213986 cd07431 PHP_PolIIIA Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that is responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. The PolIIIA PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination, and like other PHP structures, exhibits a distorted (beta/alpha)7 barrel and coordinates up to 3 metals. Initially, it was proposed that PHP region might be involved in pyrophosphate hydrolysis, but such activity has not been found. It has been shown that the PHP domain of PolIIIA has a trinuclear metal complex and is capable of proofreading activity. 179
26489 213987 cd07432 PHP_HisPPase Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 129
26490 213988 cd07433 PHP_PolIIIA_DnaE1 Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III DnaE1. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. PolIIIA core enzyme catalyzes the reaction for polymerizing both DNA strands. dnaE1 is the longest compared to dnaE2 and dnaE3. A unique motif was also identified in dnaE1 and dnaE3 genes. 277
26491 213989 cd07434 PHP_PolIIIA_DnaE2 Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III at DnaE2 gene. PolIIIA DnaE2 plays a role in SOS mutagenesis/translesion synthesis and has dominant effects in determining GC variability in the bacterial genome. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. PolIIIA core enzyme catalyzes the reaction for polymerizing both DNA strands. PolC PHP is located in a different location compared to dnaE1, 2, and 3. dnaE1 is the longest compared to dnaE2 and dnaE3. A unique motif was also identified in dnaE1 and dnaE3 genes. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PHP domains found in DnaEs of thermophilic origin exhibit 3'-5' exonuclease activity. 260
26492 213990 cd07435 PHP_PolIIIA_POLC Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III at PolC gene. DNA polymerase III alphas (PolIIIAs) that contain a PHP domain have been classified into four basic groups based on phylogenetic and domain structural analyses: polC, dnaE1, dnaE2, and dnaE3. The PolC group is distinct from the other three and is clustered together. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. PolC PHP is located in a different location compared to dnaE1, 2, and 3. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of PolC is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. PHP domains found in dnaEs of thermophilic origin exhibit 3'-5' exonuclease activity. In contrast, PolC PHP lacks detectable nuclease activity. 268
26493 213991 cd07436 PHP_PolX Polymerase and Histidinol Phosphatase domain of bacterial polymerase X. The bacterial/archaeal X-family DNA polymerases (PolXs) have a PHP domain at their C-terminus. The bacterial/archaeal PolX core domain and PHP domain interact with each other and together are involved in metal dependent 3'-5' exonuclease activity. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PolX is found in all kingdoms; however, bacterial PolXs have a completely different domain structure from eukaryotic PolXs. Bacterial PolX has an extended conformation in contrast to the common closed 'right hand' conformation for DNA polymerases. This extended conformation is stabilized by the PHP domain. The PHP domain of PolX is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 237
26494 213992 cd07437 PHP_HisPPase_Ycdx_like Polymerase and Histidinol Phosphatase domain of Ycdx like. PHP Ycdx-like is a stand alone PHP domain similar to Ycdx E. coli protein with an unknown physiological role. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. It has also been shown that the PHP domain functions in DNA repair. The PHP structures have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. YcdX may be involved in swarming. 233
26495 213993 cd07438 PHP_HisPPase_AMP Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase (HisPPase) AMP bound. The PHP domain of this HisPPase family has an unknown function. It has a second domain inserted in the middle that binds adenosine 5-monophosphate (AMP). The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to give histidinol. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to the other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 155
26496 143633 cd07439 FANCE_c-term Fanconi anemia complementation group E protein, C-terminal domain. Fanconi Anemia (FA) is an autosomal recessive disorder associated with increased susceptibility to various cancers, bone marrow failure, cardiac, renal, and limb malformations, and other characteristics. Cells are highly sensitive to DNA damaging agents. A multi-subunit protein complex, the FA core complex, is responsible for ubiquitination of the protein FANCD2 in response to DNA damage. This monoubiquitination results in a downstream effect on homology-directed DNA repair. FANCE is part of the FA core complex and its C-terminal domain, which is modeled here, has been shown to directly interact with FANCD2. The domain contains a five-fold repeat of a structural unit similar to ARM and HEAT repeats. FANCE appears conserved in metazoa and in plants. 254
26497 188659 cd07440 RGS Regulator of G protein signaling (RGS) domain superfamily. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in the modulation of neuronal signals. Some RGS proteins are principal elements needed for proper vision. 113
26498 143550 cd07441 CRD_SFRP3 Cysteine-rich domain of the secreted frizzled-related protein 3 (SFRP3, alias FRZB), a Wnt antagonist. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 3 (SFRP3, alias FRZB), which plays important roles in embryogenesis and postnatal development as an antagonist of Wnt proteins, key players in a number of fundamental cellular processes. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled proteins (Fz), thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. SFRP3 regulates Wnt signaling activity in bone development and homeostasis. It is also involved in the control of planar cell polarity. 126
26499 143551 cd07442 CRD_SFRP4 Cysteine-rich domain of the secreted frizzled-related protein 4 (SFRP4), a Wnt antagonist. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 4 (SFRP4), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. 127
26500 143552 cd07443 CRD_SFRP1 Cysteine-rich domain of the secreted frizzled-related protein 1 (SFRP1), a regulator of Wnt activity. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 1 (SFRP1), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. SFRP1 is expressed in many tissues and is involved in the regulation of Wnt signaling in osteoblasts, leading to enhanced trabecular bone formation in adults; it has also been shown to control the growth of retinal ganglion cell axons and the elongation of the antero-posterior axis. 124
26501 143553 cd07444 CRD_SFRP5 Cysteine-rich domain of the secreted frizzled-related protein 5 (SFRP5), a regulator of Wnt activity. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related Protein 5 (SFRP5), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. 127
26502 143554 cd07445 CRD_corin_1 One of two cysteine-rich domains of the corin protein, a type II transmembrane serine protease. The cysteine-rich domain (CRD) is an essential component of corin, a type II transmembrane serine protease which functions as the convertase of the pro-atrial natriuretic peptide (pro-ANP) in the heart. Corin contains two CRDs in its extracellular region, which play an important role in recognition of the physiological substrate, pro-ANP. This model characterizes the first (N-terminal) CRD. 130
26503 143555 cd07446 CRD_SFRP2 Cysteine-rich domain of the secreted frizzled-related protein 2 (SFRP2), a regulator of Wnt activity. The cysteine-rich-domain (CRD) is an essential part of the secreted frizzled related protein 2 (SFRP2), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. As a Wnt antagonist, SFRP2 regulates Nkx2.2 expression in the ventral spinal cord and anteroposterior axis elongation. SFRP2 also has a Wnt-independent function as an enhancer of procollagen cleavage by the TLD proteinases. SFRP2 binds both procollagen and TLD, thus facilitating the enzymatic reaction by bringing together the proteinase and its substrate. 128
26504 143556 cd07447 CRD_Carboxypeptidase_Z Cysteine-rich domain of carboxypeptidase Z, a member of the carboxypeptidase E family. The cysteine-rich-domain (CRD) is an essential part of carboxypeptidase Z (CPZ), a member of the carboxypeptidase E family of metallocarboxypeptidases. This is a group of Zn-dependent enzymes implicated in the intra- and extracellular processing of proteins. Carboxypeptidase Z removes C-terminal basic amino acid residues from its substrates, particularly arginine. The CRD acts as a ligand-binding domain for Wnts involved in developmental processes. CPZ binds and may process Wnt-4; it has also been found to enhance the induction of the homeobox gene Cdx1. During vertebrate embryogenesis, the CRD of CPZ upregulates Pax3, a Wnt reporter gene essential for patterning of somites and limb development. 128
26505 143557 cd07448 CRD_FZ4 Cysteine-rich Wnt-binding domain of the frizzled 4 (Fz4) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 4 (Fz4) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and the Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Frizzled 4 (Fz4) activates the Ca(2+)/calmodulin-dependent protein kinase II and protein kinase C of the Wnt/Ca(2+) signaling pathway during retinal angiogenesis. Mutations in Fz4 lead to familial exudative vitreoretinopathy (FEVR), a hereditary ocular disorder characterized by failure of the peripheral retinal vascularization. In addition, the interplay between Fz4 and norrin as a receptor-ligand pair plays an important role in vascular development in the retina and inner ear in a Wnt-independent manner. 126
26506 143558 cd07449 CRD_FZ3 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 3 (Fz3) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 3 (Fz3) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz3 plays a vital role in the anterior-posterior guidance of commissural axons. Knockout mice without Fz3 show defects in fiber tracts in the rostral CNS. 127
26507 143559 cd07450 CRD_FZ6 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 6 (Fz6) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 6 (Fz6) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Frizzled 6 (Fz6) is expressed in the skin and hair follicles and controls hair patterning in mammals using a Fz-dependent tissue polarity system, which is similar to the one that patterns the Drosophila cuticle. 127
26508 143560 cd07451 CRD_SMO Cysteine-rich domain of the smoothened receptor (Smo) integral membrane protein. The cysteine-rich domain (CRD) is part of the smoothened receptor (Smo), an integral membrane protein and one of the key players in the Hedgehog (Hh) signaling pathway, critical for development, cell growth and migration, as well as stem cell maintenance. The CRD of Smo is conserved in vertebrates and can also be identified in invertebrates. The precise function of the CRD in Smo is unknown. Mutations in the Drosophila CRD disrupt Smo activity in vivo, while deletion of the CRD in mammalian cells does not seem to affect the activity of overexpressed Smo. 132
26509 143561 cd07452 CRD_sizzled Cysteine-rich domain of the sizzled protein. The cysteine-rich domain (CRD) is an essential part of the sizzled protein, which regulates bone morphogenetic protein (Bmp) signaling by stabilizing chordin, and plays a critical role in the patterning of vertebrate and invertebrate embryos. Sizzled also functions in the ventral region as a Wnt inhibitor and modulates canonical Wnt signaling. Sizzled proteins belong to the secreted frizzled-related protein family (SFRP), and have been identified in the genomes of birds, fishes and frogs, but not mammals. 141
26510 143562 cd07453 CRD_crescent Cysteine-rich domain of the crescent protein. The cysteine-rich domain (CRD) is an essential part of the crescent protein, a member of the secreted frizzled-related protein (SFRP) family, which regulates convergent extension movements (CEMs) during gastrulation and neurulation. Xenopus laevis crescent efficiently forms inhibitory complexes with Wnt5a and Wnt11, but this effect is cancelled in the presence of another member of the SFRP family, Frzb1. A potential role for Crescent in head formation is to regulate a non-canonical Wnt pathway positively in the adjacent posterior mesoderm, and negatively in the overlying anterior neuroectoderm. 135
26511 143563 cd07454 CRD_LIN_17 Cysteine-rich domain (CRD) of LIN_17. A cysteine-rich domain (CRD) is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. The protein lin-17 is involved in cell type specification during Caenorhabditis elegans vulval development. 124
26512 143564 cd07455 CRD_Collagen_XVIII Cysteine-rich domain of the variant 3 of collagen XVIII (V3C18). The cysteine-rich domain (CRD) is an essential part of the variant 3 of collagen XVIII (V3C18), which regulates major cellular functions such as the differential epithelial morphogenesis of early lung and kidney development. V3C18 is a 170 kD protein, which is proteolytically processed into the CRD-containing 50 kD glycoprotein precursor that binds Wnt3a through its CRD domain and suppresses the Wnt3a-induced stabilization of beta catenin. Full-length V3C18 is unable to inhibit Wnt signaling. 123
26513 143565 cd07456 CRD_FZ5_like Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 5. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 5 (Fz5) and frizzled 8 (Fz8) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. 120
26514 143566 cd07457 CRD_FZ9_like Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 9. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 9 (Fz9) and frizzled 10 (Fz10) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. 121
26515 143567 cd07458 CRD_FZ1_like Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 1. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 1 (Fz1), frizzled 2 (Fz2), and frizzled 7 (Fz7) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. 119
26516 143568 cd07459 CRD_TK_ROR_like Cysteine-rich domain of tyrosine kinase-like orphan receptors. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor (Ror) proteins, a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2. 135
26517 143569 cd07460 CRD_FZ5 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 5 (Fz5) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 5 (Fz5) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz5 plays critical regulating roles in the yolk sac and placental angiogenesis, in the maturation of the Paneth cell phenotype, in governing the neural potential of progenitors in the developing retina, and in neuronal survival in the parafascicular nucleus. 127
26518 143570 cd07461 CRD_FZ8 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 8 (Fz8) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 8 (Fz8) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Xenopus Fz8 is important in Wnt/beta-catenin signaling pathways controlling the transcriptional activation of target genes Siamois and Xnr3 in the animal caps of late blastula. 125
26519 143571 cd07462 CRD_FZ10 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 10 (Fz10) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 10 (Fz10) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. The cellular function of Fz10 is unknown. 127
26520 143572 cd07463 CRD_FZ9 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 9 (Fz9) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 9 (Fz9) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz9 may play a signaling role in lymphoid development and maturation, particularly at points where B cells undergo self-renewal prior to further differentiation. 127
26521 143573 cd07464 CRD_FZ2 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 2 (Fz2) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 2 (Fz2) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz2 is involved in the Wnt/beta-catenin signaling pathway and in the activation of protein kinase C and calcium/calmodulin-dependent protein kinase (CaM kinase). 127
26522 143574 cd07465 CRD_FZ1 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 1 (Fz1) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 1 (Fz1) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. 127
26523 143575 cd07466 CRD_FZ7 Cysteine-rich Wnt-binding domain (CRD) of the frizzled 7 (Fz7) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 7 (Fz7) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Xenopus Fz7 is important in Wnt/beta-catenin signaling pathways controlling the transcriptional activation of target genes Siamois and Xnr3 in the animal caps of late blastula. 125
26524 143576 cd07467 CRD_TK_ROR1 Cysteine-rich domain of tyrosine kinase-like orphan receptor 1. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor 1 (Ror1), a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2. 142
26525 143577 cd07468 CRD_TK_ROR2 Cysteine-rich domain of tyrosine kinase-like orphan receptor 2. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor 2 (Ror2), a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2. 140
26526 143578 cd07469 CRD_TK_ROR_related Cysteine-rich domain of proteins similar to tyrosine kinase-like orphan receptors. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor (Ror) proteins, a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. 147
26527 143621 cd07470 CYTH-like_mRNA_RTPase CYTH-like mRNA triphosphatase (RTPase) component of the mRNA capping apparatus. This subgroup includes fungal and protozoal RTPases. RTPase catalyzes the first step in the mRNA cap formation process, the removal of the gamma-phosphate of triphosphate-terminated pre-mRNA. This activity is metal-dependent. The 5'-end of the resulting mRNA diphosphate is subsequently capped with GMP by RNA guanylyltransferase, and then further modified by one or more methyltransferases. The mRNA cap-forming activity is an essential step in mRNA processing. The RTPases are not conserved among eukarya. The structure and mechanism of this fungal RTPase domain group are different from those of higher eukaryotes. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. The RTPase domain of the mimivirus RTPase-GTase fusion mRNA capping enzyme also belongs to this subgroup. 243
26528 213030 cd07472 HmuY_like Bacterial proteins similar to Porphyromonas gingivalis HmuY and the C-terminal domain of PARMER_03218. HmuY is a hemophore that scavenges heme from infected hosts and delivers it to the outer membrane receptor HmuR. Related but uncharacterized proteins do not appear to share the specific heme-binding site. The C-terminal domain of PARMER_03128, an uncharacterized protein from Parabacteroides merdae, plus related proteins from Bacteroidetes, appear to be a distantly related family and have been included in this model. 157
26529 173799 cd07473 Peptidases_S8_Subtilisin_like Peptidase S8 family domain in Subtilisin-like proteins. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 259
26530 173800 cd07474 Peptidases_S8_subtilisin_Vpr-like Peptidase S8 family domain in Vpr-like proteins. The maturation of the peptide antibiotic (lantibiotic) subtilin in Bacillus subtilis ATCC 6633 includes posttranslational modifications of the propeptide and proteolytic cleavage of the leader peptide. Vpr was identified as one of the proteases, along with WprA, that are capable of processing subtilin. Peptidases S8 or Subtilases are a serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 295
26531 173801 cd07475 Peptidases_S8_C5a_Peptidase Peptidase S8 family domain in Streptococcal C5a peptidases. Streptococcal C5a peptidase (SCP), is a highly specific protease and adhesin/invasin. The subtilisin-like protease domain is located at the N-terminus and contains a protease-associated domain inserted into a loop. There are three fibronectin type III (Fn) domains at the C-terminus. SCP binds to integrins with the help of Arg-Gly-Asp motifs which are thought to stabilize conformational changes required for substrate binding. Peptidases S8 or Subtilases are a serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 346
26532 173802 cd07476 Peptidases_S8_thiazoline_oxidase_subtilisin-like_protease Peptidase S8 family domain in Thiazoline oxidase/subtilisin-like proteases. Thiazoline oxidase/subtilisin-like protease is produced by the symbiotic bacteria Prochloron spp. that inhabit didemnid family ascidians. The cyclic peptides of the patellamide class found in didemnid extracts are now known to be synthesized by the Prochloron spp. The prepatellamide is heterocyclized to form thiazole and oxazoline rings and the peptide is cleaved to form the two cyclic patellamides A and C. Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution). 267
26533 173803 cd07477 Peptidases_S8_Subtilisin_subset Peptidase S8 family domain in Subtilisin proteins. This group is composed of many different subtilisins: Pro-TK-subtilisin, subtilisin Carlsberg, serine protease Pb92 subtilisin, and BPN subtilisins just to name a few. Pro-TK-subtilisin is a serine protease from the hyperthermophilic archaeon Thermococcus kodakaraensis and consists of a signal peptide, a propeptide, and a mature domain. TK-subtilisin is matured from pro-TK-subtilisin upon autoprocessing and degradation of the propeptide. Unlike other subtilisins though, the folding of the unprocessed form of pro-TK-subtilisin is induced by Ca2+ binding which is almost completed prior to autoprocessing. Ca2+ is required for activity unlike the bacterial subtilisins. The propeptide is not required for folding of the mature domain unlike the bacterial subtilases because of the stability produced from Ca2+ binding. Subtilisin Carlsberg is extremely similar in structure to subtilisin BPN'/Novo though it has a 30% difference in amino acid sequence. The substrate binding regions are also similar and 2 possible Ca2+ binding sites have been identified recently. Subtilisin Carlsberg possesses the highest commercial importance as a proteolytic additive for detergents. Serine protease Pb92, the serine protease from the alkalophilic Bacillus strain PB92, also contains two calcium ions and the overall folding of the polypeptide chain closely resembles that of the subtilisins. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 229
26534 173804 cd07478 Peptidases_S8_CspA-like Peptidase S8 family domain in CspA-like proteins. GSP (germination-specific protease) converts the spore peptidoglycan hydrolase (SleC) precursor to an active enzyme during germination of Clostridium perfringens S40 spores. Analysis of an enzyme fraction of GSP showed that it was composed of a gene cluster containing the processed forms of products of cspA, cspB, and cspC which are positioned in a tandem array just upstream of the 5' end of sleC. The amino acid sequences deduced from the nucleotide sequences of the csp genes showed significant similarity and showed a high degree of homology with those of the catalytic domain and the oxyanion binding region of subtilisin-like serine proteases. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 455
26535 173805 cd07479 Peptidases_S8_SKI-1_like Peptidase S8 family domain in SKI-1-like proteins. SKI-1 (type I membrane-bound subtilisin-kexin-isoenzyme) proteins are secretory Ca2+-dependent serine proteinases that cleave at nonbasic residues: Thr, Leu, and Lys. SKI-1s play a critical role in the regulation of the synthesis and metabolism of cholesterol and fatty acid metabolism. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 255
26536 173806 cd07480 Peptidases_S8_12 Peptidase S8 family domain, uncharacterized subfamily 12. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 297
26537 173807 cd07481 Peptidases_S8_BacillopeptidaseF-like Peptidase S8 family domain in BacillopeptidaseF-like proteins. Bacillus subtilis produces and secretes proteases and other types of exoenzymes at the end of the exponential phase of growth. The ones that make up this group are known as bacillopeptidase F, encoded by bpr, a serine protease with high esterolytic activity which is inhibited by PMSF. Like other members of the peptidases S8 family these have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. 264
26538 173808 cd07482 Peptidases_S8_Lantibiotic_specific_protease Peptidase S8 family domain in Lantibiotic (lanthionine-containing antibiotics) specific proteases. Lantibiotic (lanthionine-containing antibiotics) specific proteases are very similar in structure to serine proteases. Lantibiotics are ribosomally synthesised antimicrobial agents derived from ribosomally synthesised peptides with antimicrobial activities against Gram-positive bacteria. The proteases that cleave the N-terminal leader peptides from lantibiotics include: epiP, nsuP, mutP, and nisP. EpiP, from Staphylococcus, is thought to cleave matured epidermin. NsuP, a dehydratase from Streptococcus, and NisP, a membrane-anchored subtilisin-like serine protease from Lactococcus, cleave nisin. MutP is highly similar to epiP and nisP and is thought to process the prepeptide mutacin III of S. mutans. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) clan include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53 it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values. 294
26539 173809 cd07483 Peptidases_S8_Subtilisin_Novo-like Peptidase S8 family domain in Subtilisin_Novo-like proteins. Subtilisins are a group of alkaline proteinases originating from different strains of Bacillus subtilis. Novo is one of the strains that produced enzymes belonging to this group. The enzymes obtained from the Novo and BPN' strains are identical. The Carlsberg and Novo subtilisins are thought to have arisen from a common ancestral protein. They have similar peptidase and esterase activities, pH profiles, catalyze transesterification reactions, and are both inhibited by diisopropyl fluorophosphate, though they differ in 85 positions in the amino acid sequence. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 291
26540 173810 cd07484 Peptidases_S8_Thermitase_like Peptidase S8 family domain in Thermitase-like proteins. Thermitase is a non-specific, trypsin-related serine protease with a very high specific activity. It contains a subtilisin-like domain. The tertiary structure of thermitase is similar to that of subtilisin BPN'. It contains an Asp/His/Ser catalytic triad. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) clan include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53 it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values. 260
26541 173811 cd07485 Peptidases_S8_Fervidolysin_like Peptidase S8 family domain in Fervidolysin. Fervidolysin found in Fervidobacterium pennivorans is an extracellular subtilisin-like keratinase. It contains a signal peptide, a propeptide, and a catalytic region. The tertiary structure of fervidolysin is similar to that of subtilisin. It contains an Asp/His/Ser catalytic triad and is a member of the peptidase S8 (subtilisin and kexin) family. The catalytic triad is similar to that found in trypsin-like proteases, but it does not share their three-dimensional structure and is not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values. 273
26542 173812 cd07487 Peptidases_S8_1 Peptidase S8 family domain, uncharacterized subfamily 1. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 264
26543 173813 cd07488 Peptidases_S8_2 Peptidase S8 family domain, uncharacterized subfamily 2. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 247
26544 173814 cd07489 Peptidases_S8_5 Peptidase S8 family domain, uncharacterized subfamily 5. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 312
26545 173815 cd07490 Peptidases_S8_6 Peptidase S8 family domain, uncharacterized subfamily 6. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 254
26546 173816 cd07491 Peptidases_S8_7 Peptidase S8 family domain, uncharacterized subfamily 7. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 247
26547 173817 cd07492 Peptidases_S8_8 Peptidase S8 family domain, uncharacterized subfamily 8. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 222
26548 173818 cd07493 Peptidases_S8_9 Peptidase S8 family domain, uncharacterized subfamily 9. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 261
26549 173819 cd07494 Peptidases_S8_10 Peptidase S8 family domain, uncharacterized subfamily 10. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 298
26550 173820 cd07496 Peptidases_S8_13 Peptidase S8 family domain, uncharacterized subfamily 13. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 285
26551 173821 cd07497 Peptidases_S8_14 Peptidase S8 family domain, uncharacterized subfamily 14. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 311
26552 173822 cd07498 Peptidases_S8_15 Peptidase S8 family domain, uncharacterized subfamily 15. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. 242
26553 319802 cd07499 HAD_CBAP molecular class B acid phosphatases, similar to Escherichia coli AphA. class B acid phosphatases (CBAPs) have been detected in a minority of bacterial species which include a number of major pathogens such as Escherichia coli, Haemophilus influenzae, and Streptococcus pyogenes. This family includes the CBAP Escherichia coli AphA. The purified enzyme is a broad-spectrum nucleotidase highly active against both 3'- and 5'-mononucleotides and monodeoxynucleotides, which can also act as a phosphotransferase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 185
26554 319803 cd07500 HAD_PSP phosphoserine phosphatase (PSP), similar to Methanococcus jannaschii PSP and Saccharomyces cerevisiae SER2p. This family includes Methanococcus jannaschii PSP, and Saccharomyces cerevisiae phosphoserine phosphatase SER2p, EC 3.1.3.3, which participates in a pathway whereby serine and glycine are synthesized from the glycolytic intermediate 3-phosphoglycerate; phosphoserine phosphatase catalyzes the hydrolysis of phospho-L-serine to L-serine and inorganic phosphate, the third reaction in this pathway. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 180
26555 319804 cd07501 HAD_MDP-1_like eukaryotic hypothetical phosphotyrosine phosphatase MDP-1 and related phosphatases, similar to Bacillus cereus phosphonoacetaldehyde hydrolase and Streptomyces FkbH. This family includes eukaryotic magnesium-dependent phosphatase-1 (MDP-1) which is most likely a phosphotyrosine phosphatase catalyzing the dephosphorylation of tyrosine-phosphorylated proteins, Bacillus cereus phosphonoacetaldehyde hydrolase (phosphonatase), which catalyzes the hydrolysis of phosphonoacetaldehyde to acetaldehyde and phosphate using Mg(II) as cofactor, and sequences annotated as FkbH including BafAIV, an FkbH-like protein from Streptomyces griseus encoded in ORF12 of the bafilomycin synthesis gene cluster. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 129
26556 319805 cd07502 HAD_PNKP-C C-terminal phosphatase domain of T4 polynucleotide kinase/phosphatase (PNKP) and related phosphatases. This family includes the C-terminal domain of the bifunctional enzyme T4 polynucleotide kinase/phosphatase, PNKP. The PNKP phosphatase domain can catalyze the hydrolytic removal of the 3'-phosphoryl of DNA, RNA and deoxynucleoside 3'-monophosphates. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 145
26557 319806 cd07503 HAD_HisB-N histidinol phosphate phosphatase and related phosphatases. This family includes the N-terminal domain of the Escherichia coli bifunctional enzyme histidinol-phosphate phosphatase/imidazole-glycerol-phosphate dehydratase, HisB. The N-terminal histidinol-phosphate phosphatase domain catalyzes the dephosphorylation of histidinol phosphate, the eighth step of L-histidine biosynthesis. This family also includes Escherichia coli GmhB phosphatase which is highly specific for D-glycero-D-manno-heptose-1,7-bisphosphate; it removes the C(7) phosphate and not the C(1) phosphate, and this is the third essential step of lipopolysaccharide heptose biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 142
26558 319807 cd07504 HAD_5NT haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to human cytosolic IIIA and IIIB. 5'-nucleotidases dephosphorylate nucleoside 5prime-monophosphates. This family includes human 5'-nucleotidase, cytosolic IIIA (cN-IIIA, previously called cN-III; NT5C3A) the main pyrimidine 5'-nucleotidase in erythrocytes which dephosphorylates the pyrimidine nucleotides CMP, UMP, TMP, and the purine 7-methylguanosine monophosphate (m7GMP), and possesses phosphotransferase activity. It also includes human 5'-nucleotidase, cytosolic IIIB (cN-IIIB; NT5C3B) which has a strong preference for m7GMP, dephosphorylates CMP and UMP and, with significantly lower efficiency, GMP and AMP, and can also act as a phosphotransferase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 273
26559 319808 cd07505 HAD_BPGM-like beta-phosphoglucomutase-like family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. This family represents the beta-phosphoglucomutase-like family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. Family members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. It belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 143
26560 319809 cd07506 HAD_like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 115
26561 319810 cd07507 HAD_Pase haloacid dehalogenase-like superfamily phosphatase similar to Pyrococcus horikoshii mannosyl-3-phosphoglycerate phosphatase and Persephonella marina glucosyl-3-phosphoglycerate phosphatase. This family includes Pyrococcus horikoshii and Thermus thermophilus HB27 mannosyl-3-phosphoglycerate phosphatases (MpgPs) which catalyze the dephosphorylation of alpha-mannosyl-3-phosphoglycerate (MPG) to produce alpha-mannosylglycerate (MG), and Persephonella marina glucosyl-3-phosphoglycerate phosphatase (GpgP) which catalyzes the dephosphorylation of glucosyl-3-phosphoglycerate (GPG) to produce glucosylglycerate (GG). It also includes Methanococcoides burtonii MpgP protein which is able to dephosphorylate GPG to GG, and MPG to MG. Similar flexibilities in substrate specificity have been confirmed in vitro for the MpgPs from Thermus thermophilus and Pyrococcus horikoshii. Screens with natural substrates have not yet detected activity for another member, Escherichia coli YedP. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 255
26562 319811 cd07508 HAD_Pase_UmpH-like haloacid dehalogenase-like superfamily phosphatases, UmpH/NagD family. Phosphatases in this UmpH/NagD family include Escherichia coli UmpH UMP phosphatase/NagD nucleotide phosphatase, Mycobacterium tuberculosis Rv1692 glycerol 3-phosphate phosphatase, human PGP phosphoglycolate phosphatase, Schizosaccharomyces pombe PHO2 p-nitrophenylphosphatase, Bacillus AraL a putative sugar phosphatase, and Plasmodium falciparum para nitrophenyl phosphate phosphatase PNPase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 270
26563 319812 cd07509 HAD_PPase inorganic pyrophosphatase similar to a human phospholysine phosphohistidine inorganic pyrophosphate phosphatase (LHPP). LHPP hydrolyzes nitrogen-phosphorus bonds in phospholysine, phosphohistidine and imidodiphosphate as well as oxygen-phosphorus bonds in inorganic pyrophosphate in vitro. This family also includes human haloacid dehalogenase like hydrolase domain containing 2 protein (HDHD2), a phosphatase which may be involved in polygenic hypertension. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 248
26564 319813 cd07510 HAD_Pase_UmpH-like UmpH/NagD family phosphatase, similar to human PGP phosphoglycolate phosphatase and Schizosaccharomyces pombe PHO2 p-nitrophenylphosphatase. This subfamily includes the phosphoglycolate phosphatases (human PGP and Arabidopsis thaliana PGLP2) and p-nitrophenylphosphatases (Schizosaccharomyces pombe PHO2 and Saccharomyces PHO13p). It belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 282
26565 319814 cd07511 HAD_like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily, similar to the uncharacterized human CECR5 (cat eye syndrome critical region protein 5). This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 136
26566 319815 cd07512 HAD_PGPase haloacid dehalogenase-like superfamily phosphoglycolate phosphatase, similar to Rhodobacter sphaeroides CbbZ. Phosphoglycolate phosphatase catalyzes the dephosphorylation of phosphoglycolate; its activity requires divalent cations, especially Mg++. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 214
26567 319816 cd07514 HAD_Pase phosphatase, similar to Thermoplasma acidophilum TA0175 phosphoglycolate phosphatase (PGPase), and Pyrococcus horikoshii PH1421, a magnesium-dependent phosphatase; belongs to the haloacid dehalogenase-like superfamily. Thermoplasma acidophilum TA0175 phosphoglycolate phosphatase (PGPase) catalyzes the magnesium-dependent dephosphorylation of phosphoglycolate. This family also includes Pyrococcus horikoshii OT3 PH1421, a magnesium-dependent phosphatase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 139
26568 319817 cd07515 HAD-like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 131
26569 319818 cd07516 HAD_Pase phosphatase, similar to Escherichia coli Cof and Thermotoga maritima TM0651; belongs to the haloacid dehalogenase-like superfamily. Escherichia coli Cof is involved in the hydrolysis of HMP-PP (4-amino-2-methyl-5-hydroxymethylpyrimidine pyrophosphate, an intermediate in thiamin biosynthesis), Cof also has phosphatase activity against the coenzymes pyridoxal phosphate (PLP) and FMN. Thermotoga maritima TM0651 acts as a phosphatase with a phosphorylated carbohydrate molecule as a possible substrate. Escherichia coli YbhA is also a member of this family and catalyzes the dephosphorylation of PLP, YbhA can also hydrolyze erythrose-4-phosphate and fructose-1,6-bis-phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 253
26570 319819 cd07517 HAD_HPP phosphatase, similar to Bacteroides thetaiotaomicron VPI-5482 BT4131 hexose phosphate phosphatase; belongs to the haloacid dehalogenase-like superfamily. Bacteroides thetaiotaomicron VPI-5482 BT4131 is a phosphatase with preference for hexose phosphates. In addition this family includes uncharacterized Bacillus subtilis YkrA, a putative phosphatase and uncharacterized Streptococcus pyogenes MGAS10394 a putative bifunctional phosphatase/peptidyl-prolyl cis-trans isomerase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 213
26571 319820 cd07518 HAD_YbiV-Like Escherichia coli YbiV sugar phosphatase/phosphotransferase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Escherichia coli YbiV can act as both a sugar phosphatase and as a phosphotransferase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 184
26572 319821 cd07519 HAD_PTase hydrolase domain of the bifunctional HAD hydrolase/UbiA family prenyltransferase proteins and related domains; belongs to the haloacid dehalogenase-like superfamily. This family includes bifunctional enzymes that have both an N-terminal HAD hydrolase domain and a C-terminal UbiA family prenyltransferase domain. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases (PTases) and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. PTases catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 105
26573 319822 cd07520 HAD_like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 144
26574 319823 cd07521 HAD_FCP1-like human CTD phosphatase subunit 1 (CTDP1/FCP1) and related proteins; belongs to the haloacid dehalogenase-like superfamily. Human CTDP1/FCP1 is a protein phosphatase which dephosphorylates the phosphorylated C terminus (CTD) of RNA polymerase II. CTD phosphorylation is a key mechanism of regulation of gene expression in eukaryotes. CTDP1/FCP1 may have other roles in transcription regulation independent of its phosphatase activity. This family also includes human translocase of inner mitochondrial membrane 50 (TIMM50), CTD small phosphatase like (CTDSPL) and CTD small phosphatase like 2 (CTDSPL2), Saccharomyces cerevisiae (nuclear envelope morphology protein 1) Nem1p, and Xenopus Dullard. Yeast Nem1p in complex with Spo7p dephosphorylates the nuclear membrane-associated phosphatidic acid phosphatase, Smp2p, which may be part of a signaling cascade playing a role in nuclear membrane biogenesis. Xenopus Dullard is a potential regulator of neural tube development. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 134
26575 319824 cd07522 HAD_cN-II cytosolic 5'-nucleotidase II (cN-II) similar to human NT5DC1 (5'-nucleotidase domain-containing protein 1) and NT5DC2. Cytosolic 5'-nucleotidase II (cN-II), also known as purine 5'-nucleotidase, IMP-GMP specific nucleotidase, or high Km 5prime-nucleotidase, catalyzes the dephosphorylation of 6-hydroxypurine nucleoside monophosphates. It is ubiquitously expressed and likely to play an important role in the regulation of purine nucleotide interconversions and in the regulation of IMP and GMP pools within the cell. It also acts as a phosphotransferase, catalyzing the reverse reaction, the transfer of a phosphate from a monophosphate substrate to a nucleoside acceptor, to form a nucleoside monophosphate. The nucleoside acceptor is preferentially inosine and deoxyinosine; phosphate donors include any 6-hydroxypurine monophosphate substrate of the nucleotidase reaction. Both the dephosphorylation and phosphotransferase reactions are allosterically activated by adenine-based nucleotides and 2,3-bisphosphoglycerate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 352
26576 319825 cd07523 HAD_YsbA-like uncharacterized family of the haloacid dehalogenase-like superfamily, similar to the uncharacterized Lactococcus lactis YsbA. The specific function of Lactococcus lactis YsbA is unknown. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 173
26577 319826 cd07524 HAD_Pase phosphatase, similar to Bacillus subtilis MtnX; belongs to the haloacid dehalogenase-like superfamily. Bacillus subtilis recycles two toxic byproducts of polyamine metabolism, methylthioadenosine and methylthioribose, into methionine by a salvage pathway. The sixth reaction in this pathway is catalyzed by B. subtilis MtnX: the dephosphorylation of 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate (HK-MTP-1-P) into 1,2-dihydroxy-3-keto-5-methylthiopentene. The hydrolysis of HK-MTP-1-P is a two-step mechanism involving the formation of a transiently phosphorylated aspartyl intermediate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 211
26578 319827 cd07525 HAD_like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 253
26579 319828 cd07526 HAD_BPGM_like subfamily of beta-phosphoglucomutase-like family, similar to Escherichia coli 6-phosphogluconate phosphatase YieH. This subfamily includes Escherichia coli YieH/HAD3, a 6-phosphogluconate phosphatase, which can hydrolyze purines and pyrimidines as secondary substrates. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate, and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 141
26580 319829 cd07527 HAD_ScGPP-like subfamily of beta-phosphoglucomutase-like family, similar to Saccharomyces cerevisiae DL-glycerol-3-phosphate phosphatase (GPP1p/Rhr2p and GPP2p/HOR2p) and 2-deoxyglucose-6-phosphate phosphatase (DOG1p and DOG2p). This subfamily includes Saccharomyces cerevisiae DL-glycerol-3-phosphate phosphatase (GPP1p/Rhr2p and GPP2p/HOR2p) and 2-deoxyglucose-6-phosphate phosphatase (DOG1p and DOG2p). GPP1p and GPP2p are involved in glycerol biosynthesis, GPP1 is induced in response to both anaerobic and hyperosmotic stress, GPP2 is induced in response to hyperosmotic or oxidative stress, and during diauxic shift; overexpression of DOG1 or DOG2 confers 2-deoxyglucose resistance. These belong to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 205
26581 319830 cd07528 HAD_CbbY-like subfamily of beta-phosphoglucomutase-like family, similar to Rhodobacter sphaeroides xylulose-1,5-bisphosphate phosphatase CbbY. This family includes Rhodobacter sphaeroides and Arabidopsis thaliana xylulose-1,5-bisphosphate phosphatase CbbY which convert xylulose-1,5-bisphosphate (a potent inhibitor of Ribulose-1,5-bisphosphate carboxylase/oxygenase, Rubisco), to the non-inhibitory compound xylulose-5-phosphate. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 199
26582 319831 cd07529 HAD_AtGPP-like subfamily of beta-phosphoglucomutase-like family, similar to Arabidopsis thaliana Gpp1 and Gpp2. This subfamily includes Arabidopsis thaliana AtGpp1 and AtGpp2, and Drosophila GS1-like protein (Dmel\Gs1l) of unknown function. AtGpp1 and AtGpp2 are constitutively expressed in all the Arabidopsis tissues and unaffected under abiotic stress. Overexpression of AtGpp2 in transgenic Arabidopsis plants increases the specific DL-glycerol-3-phosphatase activity and improves the plants tolerance to salt, osmotic and oxidative stress. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 192
26583 319832 cd07530 HAD_Pase_UmpH-like UmpH/NagD family phosphatase, similar to Escherichia coli UmpH UMP phosphatase/NagD nucleotide phosphatase and Mycobacterium tuberculosis Rv1692 glycerol 3-phosphate phosphatase. Escherichia coli UmpH/NagD is a ribonucleoside tri-, di-, and monophosphatase with a preference for purines, it shows peak activity with UMP and functions in UMP-degradation. It is also an effective phosphatase with AMP, GMP and CMP. Mycobacterium tuberculosis phosphatase, Rv1692 is a glycerol 3-phosphate phosphatase. Rv1692 is the final enzyme involved in glycerophospholipid recycling/catabolism. This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 247
26584 319833 cd07531 HAD_Pase_UmpH-like UmpH/NagD family phosphatase, similar to Bacillus AraL phosphatase, a putative sugar phosphatase. Bacillus subtilis AraL is a phosphatase displaying activity towards different sugar phosphate substrates; it is encoded by the arabinose metabolic operon araABDLMNPQ-abfA and may play a role in the dephosphorylation of substrates related to l-arabinose metabolism. This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 252
26585 319834 cd07532 HAD_PNPase_UmpH-like UmpH/NagD family phosphatase para nitrophenyl phosphate phosphatase, similar to Plasmodium falciparum PNPase. Plasmodium falciparum para nitrophenyl phosphate phosphatase (PNPase) catalyzes the dephosphorylation of thiamine monophosphate to thiamine; other substrates on which it is active are nucleotides, phosphorylated sugars, pyridoxal-5-phosphate, and paranitrophenyl phosphate. This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 286
26586 319835 cd07533 HAD_like uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily, similar to Parvibaculum lavamentivorans HAD-superfamily hydrolase, subfamily IA, variant 1. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 207
26587 319836 cd07534 HAD_CAP molecular class C acid phosphatases, similar to Haemophilus influenzae e (P4) acid phosphatase; belongs to the haloacid dehalogenase-like hydrolase superfamily. Molecular class C acid phosphatases (CAPs) are nonspecific acid phosphatases with generally broad substrate specificity and optimum activity at neutral to acidic pH. Members include Haemophilus influenzae lipoprotein e (P4), Elizabethkingia meningosepticum OlpA, Helicobacter pylori HppA, Enterobacter sp. 4 acid phosphatase PhoI, and Streptococcus pyogenes M1 GAS LppC. Lipoprotein e (P4) exhibits phosphomonoesterase activity with aryl phosphate substrates including nicotinamide mononucleotide (NMN), tyrosine phosphate, phenyl phosphate, p-nitrophenyl phosphate, and 4-methylumbelliferyl phosphate. The role of P4 in NAD+ uptake appears to be the dephosphorylation of NMN to nicotinamide riboside, which is then taken up by the organism. Elizabethkingia meningosepticum OlpA is a broad-spectrum nucleotidase with preference for 5'-nucleotides, it efficiently hydrolyzes nucleotide monophosphates, with a strong preference for 5'-nucleotides and for 3'-AMP; OlpA can also hydrolyze sugar phosphates and beta-glycerol phosphate, although with a lower efficiency. Helicobacter pylori HppA is also a 5' nucleotidase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 196
26588 319837 cd07535 HAD_VSP vegetative storage proteins similar to soybean VSPalpha and VSPbeta proteins; belongs to the haloacid dehalogenase-like superfamily. Soybean [Glycine max (L.) Merr.] vegetative storage protein VSPalpha and VSPbeta levels were identified as storage proteins due to their abundance and pattern of expression in plant tissues, they accumulate to almost one-half the amount of soluble leaf protein when soybean plants are continually depodded. They possess acid phosphatase activity which appears to be low compared to several other plant acid phosphatases, it increases in the leaves of depodded soybean plants, but to no more than 0.1% of the total acid phosphatase activity in these leaves. This acid phosphatase activity has maximal activity at pH 5.0 - 5.5, and can liberate Pi from different substrates such as napthyl acid phosphate, carboxyphenyl phosphate, sugar-phosphates, glyceraldehyde 3-phosphate, dihydroxyacetone phosphate, phosphoenolpyruvate, ATP, ADP, PPi, and short chain polyphosphates; they cleave phosphoenolpyruvate, ATP, ADP, PPI, and polyphosphates most efficiently. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Soybean VSPalpha and VSPbeta lack this active site aspartate, other members of this family have this aspartate and may be more active. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 186
26589 319838 cd07536 P-type_ATPase_APLT Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Dnf1-3p, Drs2p, Neo1p, and human ATP8A2, -9B, -10D, -11B, and -11C. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. Yeast Dnf1 and Dnf2 mediate the transport of phosphatidylethanolamine, phosphatidylserine, and phosphatidylcholine from the outer to the inner leaflet of the plasma membrane. Mammalian ATP11C may selectively transport PS and PE from the outer leaflet of the plasma membrane to the inner leaflet. The yeast Neo1p localizes to the endoplasmic reticulum and the Golgi complex and plays a role in membrane trafficking within the endomembrane system. Human putative ATPase phospholipid transporting 9B, ATP9B, localizes to the trans-golgi network in a CDC50 protein-independent manner. It also includes Arabidopsis phospholipid flippases including ALA1, and Caenorhabditis elegans flippases, including TAT-1, the latter has been shown to facilitate the inward transport of phosphatidylserine. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 805
26590 319839 cd07538 P-type_ATPase uncharacterized subfamily of P-type ATPase transporters. This subfamily contains P-type ATPase transporters of unknown function. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd2+, and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine. 653
26591 319840 cd07539 P-type_ATPase uncharacterized subfamily of P-type ATPase transporters. This subfamily contains P-type ATPase transporters of unknown function. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd2+, and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine. 634
26592 319841 cd07541 P-type_ATPase_APLT_Neo1-like Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Neo1p and human putative APLT, ATP9B. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. The yeast Neo1 gene is an essential gene; Neo1p localizes to the endoplasmic reticulum and the Golgi complex and plays a role in membrane trafficking within the endomembrane system. Also included in this subfamily is human putative ATPase phospholipid transporting 9B, ATP9B, which localizes to the trans-golgi network in a CDC50 protein-independent manner. Levels of ATP9B, along with levels of other ATPase genes, may contribute to expressivity of and atypical presentations of Hailey-Hailey disease (HHD), and the ATP9B gene has recently been identified as a putative Alzheimer's disease locus. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 792
26593 319842 cd07542 P-type_ATPase_cation P-type cation-transporting ATPases, similar to human ATPase type 13A2 (ATP13A2) protein and Saccharomyces cerevisiae Ypk9p. Saccharomyces cerevisiae Ypk9p localizes to the yeast vacuole and may play a role in sequestering heavy metal ions; its deletion confers sensitivity for growth for cadmium, manganese, nickel or selenium. Human ATP13A2 (PARK9/CLN12) is a lysosomal transporter with zinc as the possible substrate. Mutation in the ATP13A2 gene has been linked to Parkinson's disease and Kufor-Rakeb syndrome, and to neuronal ceroid lipofuscinoses. ATP13A3/AFURS1 is a candidate gene for oculo auriculo vertebral spectrum (OAVS), being one of nine genes included in a 3q29 microduplication in a patient with OAVS. Mutation in the human ATP13A4 may be involved in a speech-language disorder. This subfamily also includes zebrafish ATP13A2, a lysosome-specific transmembrane ATPase protein of unknown function which plays a crucial role during embryonic development; its deletion is lethal. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 760
26594 319843 cd07543 P-type_ATPase_cation P-type cation-transporting ATPases, similar to human cation-transporting ATPase type 13A1 (ATP13A1) and Saccharomyces manganese-transporting ATPase 1 Spf1p. Saccharomyces Spf1p may mediate manganese transport into the endoplasmic reticulum (ER); one consequence of deletion of SPF1 is severe ER stress. This subfamily also includes Arabidopsis thaliana MIA (Male Gametogenesis Impaired Anthers) protein which is highly abundant in the endoplasmic reticulum and small vesicles of developing pollen grains and tapetum cells. The MIA gene functionally complements a mutant in the SPF1 from Saccharomyces cerevisiae. The expression of ATP13A1 has been followed during mouse development, ATP13A1 transcript expression showed an increase as development progressed, with the highest expression at the peak of neurogenesis. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 804
26595 319844 cd07544 P-type_ATPase_HM P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 596
26596 319845 cd07545 P-type_ATPase_Cd-like P-type heavy metal-transporting ATPase, similar to Staphylococcus aureus plasmid pI258 CadA, a cadmium-efflux ATPase. CadA from gram-positive Staphylococcus aureus plasmid pI258 is required for full Cd(2+) and Zn(2+) resistance. This subfamily also includes CadA, from the gram-negative bacilli, Stenotrophomonas maltophilia D457R, which is a cadmium efflux pump acquired as part of a cluster of antibiotic and heavy metal resistance genes from gram-positive bacteria. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 599
26597 319846 cd07546 P-type_ATPase_Pb_Zn_Cd2-like P-type heavy metal-transporting ATPase, similar to Escherichia coli ZntA which is selective for Pb(2+), Zn(2+), and Cd(2+). Escherichia coli ZntA mediates resistance to toxic levels of selected divalent metal ions. ZntA has the highest selectivity for Pb(2+), followed by Zn(2+) and Cd(2+); it also shows low levels of activity with Cu(2+), Ni(2+), and Co(2+). It is upregulated by the transcription factor ZntR at high zinc concentrations. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 597
26598 319847 cd07548 P-type_ATPase-Cd_Zn_Co_like P-type heavy metal-transporting ATPase, similar to Bacillus subtilis CadA which appears to transport cadmium, zinc and cobalt but not copper out of the cell. Bacillus subtilis CadA/YvgW appears to transport cadmium, zinc and cobalt but not copper, out of the cell. Functions in metal ion resistance and cellular metal ion homeostasis. CadA/YvgW is also important for sporulation in B. subtilis, the significant specific expression of the cadA/yvgW gene during the late stage of sporulation, is controlled by forespore-specific sigma factor, sigma G, and mother cell-specific sigma factor, sigma E. This subfamily also includes Helicobacter pylori CadA an essential resistance pump with ion specificity towards Cd(2+), Zn(2+) and Co(2+), and Zn-transporting ATPase, ZiaA(N) in Synechocystis PCC 6803. Transcription of ziaA is induced by Zn under the control of the Zn responsive repressor ZiaR. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 604
26599 319848 cd07550 P-type_ATPase_HM P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 592
26600 319849 cd07551 P-type_ATPase_HM_ZosA_PfeT-like P-type heavy metal-transporting ATPase, similar to Bacillus subtilis ZosA/PfeT which transports copper, and perhaps zinc under oxidative stress, and perhaps ferrous iron. Bacillus subtilis ZosA/PfeT (previously known as YkvW) transports copper, it may also transport zinc under oxidative stress and may also be involved in ferrous iron efflux. ZosA/PfeT is expressed under the regulation of the peroxide-sensing repressor PerR. It is involved in competence development. Disruption of the zosA/pfeT gene results in low transformability. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 611
26601 319850 cd07552 P-type_ATPase_Cu-like P-type heavy metal-transporting ATPase, similar to Archaeoglobus fulgidus CopB, a Cu(2+)-ATPase. Archaeoglobus fulgidus CopB transports Cu(2+) from the cytoplasm to the exterior of the cell using ATP as energy source, it transports preferentially Cu(2+) over Cu(+), it is activated by Cu(2+) with high affinity and partially by Cu(+) and Ag(+). This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 632
26602 319851 cd07553 P-type_ATPase_HM P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. 610
26603 143637 cd07556 Nucleotidyl_cyc_III Class III nucleotidyl cyclases. Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways. 133
26604 143638 cd07557 trimeric_dUTPase Trimeric dUTP diphosphatases. Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA. It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface. 92
26605 143471 cd07559 ALDH_ACDHII_AcoD-like Ralstonia eutrophus NAD+-dependent acetaldehyde dehydrogenase II and Staphylococcus aureus AldA1 (SACOL0154)-like. Included in this CD is the NAD+-dependent, acetaldehyde dehydrogenase II (AcDHII, AcoD, EC=1.2.1.3) from Ralstonia (Alcaligenes) eutrophus H16 involved in the catabolism of acetoin and ethanol, and similar proteins, such as the dimeric dihydrolipoamide dehydrogenase of the acetoin dehydrogenase enzyme system of Klebsiella pneumoniae. Also included are sequences similar to the NAD+-dependent chloroacetaldehyde dehydrogenases (AldA and AldB) of Xanthobacter autotrophicus GJ10 which are involved in the degradation of 1,2-dichloroethane, as well as the uncharacterized aldehyde dehydrogenase from Staphylococcus aureus (AldA1, locus SACOL0154) and other similar sequences. 480
26606 143476 cd07560 Peptidase_S41_CPP C-terminal processing peptidase; serine protease family S41. The C-terminal processing peptidase (CPP, EC 3.4.21.102) also known as tail-specific protease (tsp), the photosystem II D1 C-terminal processing protease (D1P), and other related S41 protease family members are present in this CD. CPP is synthesized as a precursor form with a carboxyl-terminal extension. It specifically recognizes a C-terminal tripeptide, Xaa-Yaa-Zaa, in which Xaa is preferably Ala or Leu, Yaa is preferably Ala or Tyr and Zaa is preferably Ala, but then cleaves at a variable distance from the C-terminus. The C-terminal carboxylate group is essential, and proteins where this group is amidated are not substrates. This family of proteases contains the PDZ domain that promotes protein-protein interactions and is important for substrate recognition. The active site consists of a serine/lysine catalytic dyad. The bacterial CPP-1 is believed to be important for the degradation of incorrectly synthesized proteins as well as protection from thermal and osmotic stresses. In E. coli, it is involved in the cleavage of a C-terminal peptide of 11 residues from the precursor form of penicillin-binding protein 3 (PBP3). In the plant chloroplast, the enzyme removes the C-terminal extension of the D1 polypeptide of photosystem II, allowing the light-driven assembly of the tetranuclear manganese cluster, which is responsible for photosynthetic water oxidation. 211
26607 143477 cd07561 Peptidase_S41_CPP_like C-terminal processing peptidase-like; serine protease family S41. Bacterial protease homologs of the S41 family related to C-terminal processing peptidase (CPP). CPP-1 is believed to be important for the degradation of incorrectly synthesized proteins as well as protection from thermal and osmotic stresses. CPP is synthesized with an extension on its carboxyl-terminus and specifically recognizes a C-terminal tripeptide, but cleaves at a variable distance from the C-terminus. The CPP active site consists of a serine/lysine catalytic dyad. Conservation of these residues is seen in the CPP-like proteins of this group. CPP proteins contain a PDZ domain that promotes protein-protein interactions and is important for substrate recognition; however, most CPP-like proteins only have an internal fragment or lack the PDZ domain. 256
26608 143478 cd07562 Peptidase_S41_TRI Tricorn protease; serine protease family S41. The tricorn protease (TRI), a member of the S41 peptidase family and named for its tricorn-like shape, exists only in some archaea and eubacteria. It has been shown to act as a carboxypeptidase, involved in the degradation of proteasomal products to preferentially yield di- and tripeptides, with subsequent and final degradations to free amino acid residues by tricorn interacting factors, F1, F2 and F3. Tricorn is a hexameric D3-symmetric protease of 720kD, and can self-associate further into a giant icosahedral capsid structure containing twenty copies of the complex. Each tricorn peptidase monomer consists of five structural domains: a six-bladed beta-propeller and a seven-bladed beta-propeller that limit access to the active site, the two domains (C1 and C2) that carry the active site residues, and a PDZ-like domain (proposed to be important for substrate recognition) between the C1 and C2 domains. The active site tetrad residues are distributed between the C1 and C2 domains, with serine and histidine on C1 and serine and glutamate on C2. 266
26609 143479 cd07563 Peptidase_S41_IRBP Interphotoreceptor retinoid-binding protein; serine protease family S41. Interphotoreceptor retinoid-binding protein (IRBP) is a homolog of the S41 protease, C-terminal processing peptidase (CTPase) family. It is thought to facilitate the compartmentalization of the visual cycle that requires poorly soluble and potentially toxic retinoids to cross the aqueous subretinal space between the photoreceptors and the retinal pigment epithelium (RPE). IRBP is secreted by photoreceptors into the interphotoreceptor matrix (IPM) where it is rapidly turned over by a combination of RPE and photoreceptor endocytosis. It is the most abundant soluble protein component of the IPM, consisting of homologous modules, each repeat structure arising through the duplication (as in teleost IRBP) or quadruplication (in tetrapods) of an ancient gene, arisen in the early evolution of the vertebrate eye. IRBP has been shown to promote the release of all-trans retinol from photoreceptors and facilitates its delivery to the RPE. Conversely, IRBP can promote the release of 11-cis-retinal from the RPE, prevent its isomerization in the subretinal space, and transfer it to photoreceptors. In vivo evidence implicates IRBP as a retinoid transporter in the visual cycle, suggesting a critical role for IRBP in cone function essential for human vision. IRBP is a dominant autoimmune antigen in the eye; IRBP proteolysis analysis has proven a useful biomarker for autoimmune uveitis (AU) disorders, a major cause of blindness. This family also includes a chlamydia-secreted protein, designated chlamydia protease-like activity factor (CPAF), known to degrade host proteins, enabling Chlamydia to evade host defenses and replicate. 250
26610 143588 cd07564 nitrilases_CHs Nitrilases, cyanide hydratase (CH)s, and similar proteins (class 1 nitrilases). Nitrilases (nitrile aminohydrolases, EC:3.5.5.1) hydrolyze nitriles (RCN) to ammonia and the corresponding carboxylic acid. Most nitrilases prefer aromatic nitriles, some prefer arylacetonitriles and others aliphatic nitriles. This group includes the nitrilase cyanide dihydratase (CDH), which hydrolyzes inorganic cyanide (HCN) to produce formate. It also includes cyanide hydratase (CH), which hydrolyzes HCN to formamide. This group includes four Arabidopsis thaliana nitrilases (Ath)NIT1-4. AthNIT1-3 have a strong substrate preference for phenylpropionitrile (PPN) and other nitriles which may originate from the breakdown of glucosinolates. The product of PPN hydrolysis, phenylacetic acid has auxin activity. AthNIT1-3 can also convert indoleacetonitrile to indole-3-acetic acid (IAA, auxin), but with a lower affinity and velocity. From their expression patterns, it has been speculated that NIT3 may produce IAA during the early stages of germination, and that NIT3 may produce IAA during embryo development and maturation. AthNIT4 has a strong substrate specificity for the nitrile, beta-cyano-L-alanine (Ala(CN)), an intermediate of cyanide detoxification. AthNIT4 has both a nitrilase activity and a nitrile hydratase (NHase) activity, which generate aspartic acid and asparagine respectively from Ala(CN). NHase catalyzes the hydration of nitriles to their corresponding amides. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 1. 297
26611 143589 cd07565 aliphatic_amidase aliphatic amidases (class 2 nitrilases). Aliphatic amidases catalyze the hydrolysis of short-chain aliphatic amides to form ammonia and the corresponding organic acid. This group includes Pseudomonas aeruginosa (Pa) AmiE, the amidase from Geobacillus pallidus RAPc8 (RAPc8 amidase), and Helicobacter pylori (Hp) AmiE and AmiF. PaAmiE and HpAmiE hydrolyze various very short aliphatic amides, including propionamide, acetamide and acrylamide. HpAmiF is a formamidase which specifically hydrolyzes formamide. These proteins belong to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 2. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. HpAmiE, HpAmiF, RAPc8 amidase, and PaAmiE appear to be homohexameric enzymes (trimers of dimers). 291
26612 143590 cd07566 ScNTA1_like Saccharomyces cerevisiae N-terminal amidase NTA1, and related proteins (class 3 nitrilases). Saccharomyces cerevisiae NTA1 functions in the N-end rule protein degradation pathway. It specifically deaminates the N-terminal asparagine and glutamine residues of substrates of this pathway, to aspartate and glutamate respectively, these latter are the destabilizing residues. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 3. 295
26613 143591 cd07567 biotinidase_like biotinidase and vanins (class 4 nitrilases). These secondary amidases participate in vitamin recycling. Biotinidase (EC 3.5.1.12) has both a hydrolase and a transferase activity. It hydrolyzes free biocytin or small biotinyl-peptides produced during the proteolytic degradation of biotin-dependent carboxylases, to release free biotin (vitamin H), and it can transfer biotin to acceptor molecules such as histones. Biotinidase deficiency in humans is an autosomal recessive disorder characterized by neurological and cutaneous symptoms. This subgroup includes the three human vanins, vanin1-3. Vanins are ectoenzymes; vanin-1 and -2 are membrane associated, and vanin-3 is secreted. They are pantetheinases (EC 3.5.1.92, pantetheine hydrolase), which convert pantetheine to pantothenic acid (vitamin B5) and cysteamine (2-aminoethanethiol, a potent anti-oxidant). They are potential targets for therapeutic intervention in inflammatory disorders. Vanin-1 deficient mice lacking free cysteamine are less susceptible to intestinal inflammation, and expression of vanin-1 and -3 is induced as part of the inflammatory-regenerative differentiation program of human epidermis. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 4. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 299
26614 143592 cd07568 ML_beta-AS_like mammalian-like beta-alanine synthase (beta-AS) and similar proteins (class 5 nitrilases). This family includes mammalian-like beta-AS (EC 3.5.1.6, also known as beta-ureidopropionase or N-carbamoyl-beta-alanine amidohydrolase). This enzyme catalyzes the third and final step in the pyrimidine catabolic pathway responsible for the degradation of uracil and thymine, the hydrolysis of N-carbamyl-beta-alanine and N-carbamyl-beta-aminoisobutyrate to the beta-amino acids, beta-alanine and beta-aminoisobutyrate respectively. This family belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 5. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Beta-ASs from this subgroup are found in various oligomeric states, dimer (human), hexamer (calf liver), decamer (Arabidopsis and Zea mays), and in the case of Drosophila melanogaster beta-AS, as a homooctamer assembled as a left-handed helical turn, with the possibility of higher order oligomers formed by adding dimers at either end. Rat beta-AS changes its oligomeric state (hexamer, trimer, dodecamer) in response to allosteric effectors. Eukaryotic Saccharomyces kluyveri beta-AS belongs to a different superfamily. 287
26615 143593 cd07569 DCase N-carbamyl-D-amino acid amidohydrolase (DCase, class 6 nitrilases). DCase hydrolyses N-carbamyl-D-amino acids to produce D-amino acids. It is an important biocatalyst in the pharmaceutical industry, producing useful D-amino acids for example in the preparation of beta-lactam antibiotics. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 6. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Agrobacterium radiobacter DCase forms a tetramer (dimer of dimers). Some DCases may form trimers. 302
26616 143594 cd07570 GAT_Gln-NAD-synth Glutamine aminotransferase (GAT, glutaminase) domain of glutamine-dependent NAD synthetases (class 7 and 8 nitrilases). Glutamine-dependent NAD synthetases are bifunctional enzymes, which have an N-terminal GAT domain and a C-terminal NAD+ synthetase domain. The GAT domain is a glutaminase (EC 3.5.1.2) which hydrolyses L-glutamine to L-glutamate and ammonia. The ammonia is used by the NAD+ synthetase domain in the ATP-dependent amidation of nicotinic acid adenine dinucleotide. Glutamine aminotransferases are categorized depending on their active site residues into different unrelated classes. This class of GAT domain belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to classes 7 and 8. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Mycobacterium tuberculosis glutamine-dependent NAD+ synthetase forms a homooctamer. 261
26617 143595 cd07571 ALP_N-acyl_transferase Apolipoprotein N-acyl transferase (class 9 nitrilases). ALP N-acyl transferase (Lnt), is an essential membrane-bound enzyme in gram-negative bacteria, which catalyzes the N-acylation of apolipoproteins, the final step in lipoprotein maturation. This is a reverse amidase (i.e. condensation) reaction. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 9. 270
26618 143596 cd07572 nit Nit1, Nit 2, and related proteins, and the Nit1-like domain of NitFhit (class 10 nitrilases). This subgroup includes mammalian Nit1 and Nit2, the Nit1-like domain of the invertebrate NitFhit, and various uncharacterized bacterial and archaeal Nit-like proteins. Nit1 and Nit2 are candidate tumor suppressor proteins. In NitFhit, the Nit1-like domain is encoded as a fusion protein with the non-homologous tumor suppressor, fragile histidine triad (Fhit). Mammalian Nit1 and Fhit may affect distinct signal pathways, and both may participate in DNA damage-induced apoptosis. Nit1 is a negative regulator in T cells. Overexpression of Nit2 in HeLa cells leads to a suppression of cell growth through cell cycle arrest in G2. These Nit proteins and the Nit1-like domain of NitFhit belong to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 10. 265
26619 143597 cd07573 CPA N-carbamoylputrescine amidohydrolase (CPA) (class 11 nitrilases). CPA (EC 3.5.1.53, also known as N-carbamoylputrescine amidase and carbamoylputrescine hydrolase) converts N-carbamoylputrescine to putrescine, a step in polyamine biosynthesis in plants and bacteria. This subgroup includes Arabidopsis thaliana CPA, also known as nitrilase-like 1 (NLP1), and Pseudomonas aeruginosa AguB. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 11. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer; P. aeruginosa AguB is a homohexamer, Arabidopsis thaliana NLP1 is a homooctamer. 284
26620 143598 cd07574 nitrilase_Rim1_like Uncharacterized subgroup of the nitrilase superfamily; some members of this subgroup have an N-terminal RimI domain (class 12 nitrilases). Some members of this subgroup are implicated in post-translational modification, as they contain an N-terminal GCN5-related N-acetyltransferase (GNAT) protein RimI family domain. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 12. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 280
26621 143599 cd07575 Xc-1258_like Xanthomonas campestris XC1258 and related proteins, members of the nitrilase superfamily (putative class 13 nitrilases). Uncharacterized subgroup belonging to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup either represents a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. XC1258 is a homotetramer. 252
26622 143600 cd07576 R-amidase_like Pseudomonas sp. MCI3434 R-amidase and related proteins (putative class 13 nitrilases). Pseudomonas sp. MCI3434 R-amidase hydrolyzes (R,S)-piperazine-2-tert-butylcarboxamide to form (R)-piperazine-2-carboxylic acid. It does so with strict R-stereoselectivity. Its preferred substrates are carboxamide compounds which have the amino or imino group connected to their beta- or gamma-carbon. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), class 13 represents proteins that at the time were difficult to place in a distinct similarity group. It has been suggested that this subgroup represents a new class. Members of the nitrilase superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Native R-amidase however appears to be a monomer. 254
26623 143601 cd07577 Ph0642_like Pyrococcus horikoshii Ph0642 and related proteins, members of the nitrilase superfamily (putative class 13 nitrilases). Uncharacterized subgroup of the nitrilase superfamily. This superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. Pyrococcus horikoshii Ph0642 is a hypothetical protein belonging to this subgroup. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). This subgroup was classified as belonging to class 13, which represents proteins that at the time were difficult to place in a distinct similarity group. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 259
26624 143602 cd07578 nitrilase_1_R1 First nitrilase domain of an uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). Members of this subgroup have two nitrilase domains. This is the first of those two domains. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 258
26625 143603 cd07579 nitrilase_1_R2 Second nitrilase domain of an uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). Members of this subgroup have two nitrilase domains. This is the second of those two domains. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 279
26626 143604 cd07580 nitrilase_2 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 268
26627 143605 cd07581 nitrilase_3 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 255
26628 143606 cd07582 nitrilase_4 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 294
26629 143607 cd07583 nitrilase_5 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 253
26630 143608 cd07584 nitrilase_6 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 258
26631 143609 cd07585 nitrilase_7 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 261
26632 143610 cd07586 nitrilase_8 Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. 269
26633 143611 cd07587 ML_beta-AS mammalian-like beta-alanine synthase (beta-AS) and similar proteins (class 5 nitrilases). This subgroup includes mammalian-like beta-AS (EC 3.5.1.6, also known as beta-ureidopropionase or N-carbamoyl-beta-alanine amidohydrolase). This enzyme catalyzes the third and final step in the pyrimidine catabolic pathway responsible for the degradation of uracil and thymine, the hydrolysis of N-carbamyl-beta-alanine and N-carbamyl-beta-aminoisobutyrate to the beta-amino acids, beta-alanine and beta-aminoisobutyrate respectively. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 5. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Beta-ASs from this subgroup are found in various oligomeric states, dimer (human), hexamer (calf liver), decamer (Arabidopsis and Zea mays), and in the case of Drosophila melanogaster beta-AS, as a homooctamer assembled as a left-handed helical turn, with the possibility of higher order oligomers formed by adding dimers at either end. Rat beta-AS changes its oligomeric state (hexamer, trimer, dodecamer) in response to allosteric effectors. Eukaryotic Saccharomyces kluyveri beta-AS belongs to a different superfamily. 363
26634 153272 cd07588 BAR_Amphiphysin The Bin/Amphiphysin/Rvs (BAR) domain of Amphiphysins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. This subfamily is composed of different isoforms of amphiphysin and Bridging integrator 2 (Bin2). Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin and synaptojanin. They function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. Bin2 is mainly expressed in hematopoietic cells and is upregulated during granulocyte differentiation. The N-BAR domains of amphiphysins form a curved dimer with a positively-charged concave face that can drive membrane bending and curvature. 211
26635 153273 cd07589 BAR_DNMBP The Bin/Amphiphysin/Rvs (BAR) domain of Dynamin Binding Protein. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. DyNamin Binding Protein (DNMBP), also called Tuba, is a Cdc42-specific Guanine nucleotide Exchange Factor (GEF) that binds dynamin and various actin regulatory proteins. It serves as a link between dynamin function, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. DNMBP contains BAR and SH3 domains as well as a Dbl Homology domain (DH domain), which harbors GEF activity. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of DNMBP may be involved in binding to membranes. The gene encoding DNMBP is a candidate gene for late onset Alzheimer's disease. 195
26636 153274 cd07590 BAR_Bin3 The Bin/Amphiphysin/Rvs (BAR) domain of Bridging integrator 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Bridging integrator 3 (Bin3) is widely expressed in many tissues except in the brain. It plays roles in regulating filamentous actin localization and in cell division. In humans, the Bin3 gene is located in chromosome 8p21.3, a region that is implicated in cancer suppression. Homozygous inactivation of the Bin3 gene in mice led to the development of cataracts and an increased likelihood of lymphomas during aging, suggesting a role for Bin3 in lens development and cancer suppression. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 225
26637 153275 cd07591 BAR_Rvs161p The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Reduced viability upon starvation protein 161 and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of fungal proteins with similarity to Saccharomyces cerevisiae Reduced viability upon starvation protein 161 (Rvs161p) and Schizosaccharomyces pombe Hob3 (homolog of Bin3). S. cerevisiae Rvs161p plays a role in regulating cell polarity, actin cytoskeleton polarization, vesicle trafficking, endocytosis, bud formation, and the mating response. It forms a heterodimer with another BAR domain protein Rvs167p. Rvs161p and Rvs167p share common functions but are not interchangeable. Their BAR domains cannot be replaced with each other and the overexpression of one cannot suppress the mutant phenotypes of the other. S. pombe Hob3 is important in regulating filamentous actin localization and may be required in activating Cdc42 and recruiting it to cell division sites. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 224
26638 153276 cd07592 BAR_Endophilin_A The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins, localized at synapses, which interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. They tubulate membranes and regulate calcium influx into neurons to trigger the activation of the endocytic machinery. They are also involved in the sorting of plasma membrane proteins, actin filament assembly, and the uncoating of clathrin-coated vesicles for fusion with endosomes. The BAR domains of endophilin-A1 and A3 form crescent-shaped dimers that can detect membrane curvature and drive membrane bending. 223
26639 153277 cd07593 BAR_MUG137_fungi The Bin/Amphiphysin/Rvs (BAR) domain of Schizosaccharomyces pombe Meiotically Up-regulated Gene 137 protein and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This subfamily is composed predominantly of uncharacterized fungal proteins with similarity to Schizosaccharomyces pombe Meiotically Up-regulated Gene 137 protein (MUG137), which may play a role in meiosis and sporulation in fission yeast. MUG137 contains an N-terminal BAR domain and a C-terminal SH3 domain, similar to endophilins. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 215
26640 153278 cd07594 BAR_Endophilin_B The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. 229
26641 153279 cd07595 BAR_RhoGAP_Rich-like The Bin/Amphiphysin/Rvs (BAR) domain of Rich-like Rho GTPase Activating Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of Rho and Rac GTPase activating proteins (GAPs) with similarity to GAP interacting with CIP4 homologs proteins (Rich). Members contain an N-terminal BAR domain, followed by a Rho GAP domain, and a C-terminal proline-rich region. Vertebrates harbor at least three Rho GAPs in this subfamily including Rich1, Rich2, and SH3-domain binding protein 1 (SH3BP1). Rich1 and Rich2 play complementary roles in the establishment and maintenance of cell polarity. Rich1 is a Cdc42- and Rac-specific GAP that binds to polarity proteins through the scaffold protein angiomotin and plays a role in maintaining the integrity of tight junctions. Rich2 is a Rac GAP that interacts with CD317 and plays a role in actin cytoskeleton organization and the maintenance of microvilli in polarized epithelial cells. SH3BP1 is a Rac GAP that inhibits Rac-mediated platelet-derived growth factor (PDGF)-induced membrane ruffling. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of Rich1 has been shown to form oligomers, bind membranes and induce membrane tubulation. 244
26642 153280 cd07596 BAR_SNX The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 218
26643 153281 cd07597 BAR_SNX8 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 8. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX8 and the yeast counterpart Mvp1p are involved in sorting and delivery of late-Golgi proteins, such as carboxypeptidase Y, to vacuoles. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 246
26644 153282 cd07598 BAR_FAM92 The Bin/Amphiphysin/Rvs (BAR) domain of Family with sequence similarity 92 (FAM92). BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of proteins from the family with sequence similarity 92 (FAM92), which were originally identified by the presence of the unknown domain DUF1208. This domain shows similarity to the BAR domains of sorting nexins. Mammals contain at least two member types, FAM92A and FAM92B, which may exist in many variants. The Xenopus homolog of FAM92A1, xVAP019, is essential for embryo survival and cell differentiation. FAM92A1 may be involved in regulating cell proliferation and apoptosis. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 211
26645 153283 cd07599 BAR_Rvs167p The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Reduced viability upon starvation protein 167 and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of fungal proteins with similarity to Saccharomyces cerevisiae Reduced viability upon starvation protein 167 (Rvs167p) and Schizosaccharomyces pombe Hob1 (homolog of Bin1). S. cerevisiae Rvs167p plays a role in regulation of the actin cytoskeleton, endocytosis, and sporulation. It forms a heterodimer with another BAR domain protein Rvs161p. Rvs161p and Rvs167p share common functions but are not interchangeable. Their BAR domains cannot be replaced with each other and the overexpression of one cannot suppress the mutant phenotypes of the other. Rvs167p also interacts with the GTPase activating protein (GAP) Gyp5p, which is involved in ER to Golgi vesicle trafficking. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 216
26646 153284 cd07600 BAR_Gvp36 The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Golgi vesicle protein of 36 kDa and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Proteomic analysis shows that Golgi vesicle protein of 36 kDa (Gvp36) may be involved in vesicular trafficking and nutritional adaptation. A Saccharomyces cerevisiae strain deficient in Gvp36 shows defects in growth, in actin cytoskeleton polarization, in endocytosis, in vacuolar biogenesis, and in the cell cycle. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 242
26647 153285 cd07601 BAR_APPL The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains, and are localized to cytoplasmic membranes. Vertebrates contain two APPL proteins, APPL1 and APPL2. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 215
26648 153286 cd07602 BAR_RhoGAP_OPHN1-like The Bin/Amphiphysin/Rvs (BAR) domain of Oligophrenin1-like Rho GTPase Activating Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of Rho and Rac GTPase activating proteins (GAPs) with similarity to oligophrenin1 (OPHN1). Members contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and a Rho GAP domain. Some members contain a C-terminal SH3 domain. Vertebrates harbor at least three Rho GAPs in this subfamily including OPHN1, GTPase Regulator Associated with Focal adhesion kinase (GRAF), GRAF2, and an uncharacterized protein called GAP10-like. OPHN1, GRAF and GRAF2 show GAP activity towards RhoA and Cdc42. In addition, OPHN1 is active towards Rac. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domains of OPHN1 and GRAF directly interact with their Rho GAP domains and inhibit their activity. The autoinhibited proteins are able to bind membranes and tubulate liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domains can occur simultaneously. 207
26649 153287 cd07603 BAR_ACAPs The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of ACAPs (ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins), which are Arf GTPase activating proteins (GAPs) containing an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. Vertebrates contain at least three members, ACAP1, ACAP2, and ACAP3. ACAP1 and ACAP2 are Arf6-specific GAPs, involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration, by mediating Arf6 signaling. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 200
26650 153288 cd07604 BAR_ASAPs The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of ASAPs (ArfGAP with SH3 domain, ANK repeat and PH domain containing proteins), which are Arf GTPase activating proteins (GAPs) with similarity to ACAPs (ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins) in that they contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and ankyrin (ANK) repeats. However, ASAPs contain an additional C-terminal SH3 domain. ASAPs function in regulating cell growth, migration, and invasion. Vertebrates contain at least three members, ASAP1, ASAP2, and ASAP3. ASAP1 and ASAP2 show GTPase activating protein (GAP) activity towards Arf1 and Arf5. They do not show GAP activity towards Arf6, but are able to mediate Arf6 signaling by binding stably to GTP-Arf6. ASAP3 is an Arf6-specific GAP. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains. 215
26651 153289 cd07605 I-BAR_IMD Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), a dimerization module that binds and bends membranes. Inverse (I)-BAR (or IMD) is a member of the Bin/Amphiphysin/Rvs (BAR) domain family. It is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. IMD domains are found in Insulin Receptor tyrosine kinase Substrate p53 (IRSp53), Missing in Metastasis (MIM), and Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-like (BAIAP2L) proteins. These are multi-domain proteins that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. Most members contain an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus, except for MIM which does not carry an SH3 domain. Some members contain additional domains and motifs. The IMD domain binds and bundles actin filaments, binds membranes and produces membrane protrusions, and interacts with the small GTPase Rac. 223
26652 153290 cd07606 BAR_SFC_plant The Bin/Amphiphysin/Rvs (BAR) domain of the plant protein SCARFACE (SFC). BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. The plant protein SCARFACE (SFC), also called VAscular Network 3 (VAN3), is a plant ACAP (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein), an Arf GTPase Activating Protein (GAP) that plays a role in the trafficking of auxin efflux regulators from the plasma membrane to the endosome. It is required for the normal vein patterning in leaves. SFC contains an N-terminal BAR domain, followed by a Pleckstrin Homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 202
26653 153291 cd07607 BAR_SH3P_plant The Bin/Amphiphysin/Rvs (BAR) domain of the plant SH3 domain-containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of proteins with similarity to Arabidopsis thaliana SH3 domain-containing proteins 1 (SH3P1) and 2 (SH3P2). SH3P1 is involved in the trafficking of clathrin-coated vesicles. It is localized at the plasma membrane and is associated with vesicles of the trans-Golgi network. Yeast complementation studies reveal that SH3P1 has similar functions to the Saccharomyces cerevisiae Rvs167p, which is involved in endocytosis and actin cytoskeletal arrangement. Members of this group contain an N-terminal BAR domain and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 209
26654 153292 cd07608 BAR_ArfGAP_fungi The Bin/Amphiphysin/Rvs (BAR) domain of uncharacterized fungal Arf GAP proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of uncharacterized fungal proteins containing an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and an Arf GTPase Activating Protein (GAP) domain. These proteins may play roles in Arf-mediated functions involving membrane dynamics. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 192
26655 153293 cd07609 BAR_SIP3_fungi The Bin/Amphiphysin/Rvs (BAR) domain of fungal Snf1p-interacting protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of mostly uncharacterized fungal proteins with similarity to Saccharomyces cerevisiae Snf1p-interacting protein 3 (SIP3). These proteins contain an N-terminal BAR domain followed by a Pleckstrin Homology (PH) domain. SIP3 interacts with SNF1 protein kinase and activates transcription when anchored to DNA. It may function in the SNF1 pathway. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 214
26656 153294 cd07610 FCH_F-BAR The Extended FES-CIP4 Homology (FCH) or F-BAR (FCH and Bin/Amphiphysin/Rvs) domain, a dimerization module that binds and bends membranes. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. F-BAR domain containing proteins, also known as Pombe Cdc15 homology (PCH) family proteins, include Fes and Fer tyrosine kinases, PACSINs/Syndapins, FCHO, PSTPIP, CIP4-like proteins and srGAPs. Many members also contain an SH3 domain and play roles in endocytosis. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. These tubules have diameters larger than those observed with N-BARs. The F-BAR domains of some members such as NOSTRIN and Rgd1 are important for the subcellular localization of the protein. 191
26657 153295 cd07611 BAR_Amphiphysin_I_II The Bin/Amphiphysin/Rvs (BAR) domain of Amphiphysin I and II. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin and synaptojanin. They function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. The N-BAR domain of amphiphysin forms a curved dimer with a positively-charged concave face that can drive membrane bending and curvature. Human autoantibodies to amphiphysin-1 hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Mutations in amphiphysin-2 (BIN1) are associated with autosomal recessive centronuclear myopathy. 211
26658 153296 cd07612 BAR_Bin2 The Bin/Amphiphysin/Rvs (BAR) domain of Bridging integrator 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Bridging integrator 2 (Bin2) is a BAR domain containing protein that is mainly expressed in hematopoietic cells. It is upregulated during granulocyte differentiation and is thought to function primarily in this lineage. The BAR domain of Bin2 is closely related to the BAR domains of amphiphysins, which function primarily in endocytosis and other membrane remodeling events. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. Unlike amphiphysins, Bin2 does not appear to contain a C-terminal SH3 domain. Amphiphysin I proteins, enriched in the brain and nervous system, function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), function in intracellular vesicle trafficking. Bin2 can form a stable complex with Bin1 in cells but cannot replace the function of Bin1, and thus, appears to harbor a nonredundant function. The N-BAR domain of amphiphysin forms a curved dimer with a positively-charged concave face that can drive membrane bending and curvature. 211
26659 153297 cd07613 BAR_Endophilin_A1 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A1 (or endophilin-1) is also referred to as SH3P4 (SH3 domain containing protein 4) or SH3GL2 (SH3 domain containing Grb2-like protein 2). It is localized in presynaptic nerve terminals. It plays many roles in clathrin-dependent endocytosis of synaptic vesicles including early vesicle formation, ubiquitin-dependent sorting of plasma membrane proteins, and regulation of calcium influx into neurons. The BAR domain of endophilin-A1 forms crescent-shaped dimers that can detect membrane curvature and drive membrane bending, while its SH3 domain binds the endocytic proteins, dynamin 1, synaptojanin 1, and amphiphysins. 223
26660 153298 cd07614 BAR_Endophilin_A2 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins, localized at synapses, which interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A2 (or endophilin-2) is also referred to as SH3P8 (SH3 domain containing protein 8) or SH3GL1 (SH3 domain containing Grb2-like protein 1). It localizes to presynaptic nerve terminals and forms heterodimers with endophilin-A1 through their BAR domains. Endophilin-A2 binds dynamin 1, synaptojanin 1, and the beta1-adrenergic receptor cytoplasmic tail through its SH3 domain. 223
26661 153299 cd07615 BAR_Endophilin_A3 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins localized at synapses that interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A3 (or endophilin-3) is also referred to as SH3P13 (SH3 domain containing protein 13) or SH3GL3 (SH3 domain containing Grb2-like protein 3). It regulates Arp2/3-dependent actin filament assembly during endocytosis. It binds N-WASP through its SH3 domain and enhances the ability of N-WASP to activate the Arp2/3 complex. Endophilin-A3 co-localizes with the vesicular glutamate transporter 1 (VGLUT1), and may play an important role in the synaptic release of glutamate. 223
26662 153300 cd07616 BAR_Endophilin_B1 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilin-B1, also called Bax-interacting factor 1 (Bif-1) or SH3GLB1 (SH3-domain GRB2-like endophilin B1), is localized mainly to the Golgi apparatus. It is involved in the regulation of many biological events including autophagy, tumorigenesis, nerve growth factor (NGF) trafficking, neurite outgrowth, mitochondrial outer membrane dynamics, and cell death. Endophilin-B1 forms homo- and heterodimers (with endophilin-B2) through its BAR domain, which can bind and bend membranes. It interacts with amphiphysin 1 and dynamin 1 through its SH3 domain. 229
26663 153301 cd07617 BAR_Endophilin_B2 The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilin-B2, also called SH3GLB2 (SH3-domain GRB2-like endophilin B2), is a cytoplasmic protein that interacts with the apoptosis inducer Bax. It is overexpressed in prostate cancer metastasis and has been identified as a cancer antigen with potential utility in immunotherapy. Endophilin-B2 forms homo- and heterodimers (with endophilin-B1) through its BAR domain, which can bind and bend membranes. 220
26664 153302 cd07618 BAR_Rich1 The Bin/Amphiphysin/Rvs (BAR) domain of RhoGAP interacting with CIP4 homologs protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. RhoGAP interacting with CIP4 homologs protein 1 (Rich1) is also called Neuron-associated developmentally-regulated protein (Nadrin) or Rho GTPase activating protein 17 (ARHGAP17). It is a Cdc42- and Rac-specific GAP that binds to polarity proteins through the scaffold protein angiomotin and plays a role in maintaining the integrity of tight junctions. It may be a component of a sorting mechanism in the recycling of tight junction transmembrane proteins. Rich1 contains an N-terminal BAR domain followed by a Rho GAP domain and a C-terminal proline-rich domain. It interacts with the BAR domain proteins endophilin and amphiphysin through its proline-rich region. The BAR domain of Rich1 forms oligomers and can bind membranes and induce membrane tubulation. 246
26665 153303 cd07619 BAR_Rich2 The Bin/Amphiphysin/Rvs (BAR) domain of RhoGAP interacting with CIP4 homologs protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. RhoGAP interacting with CIP4 homologs protein 2 (Rich2) is a Rho GTPase activating protein that interacts with CD317, a lipid raft-associated integral membrane protein. It plays a role in actin cytoskeleton organization and the maintenance of microvilli in polarized epithelial cells. Rich2 contains an N-terminal BAR domain followed by a GAP domain for Rho and Rac GTPases and a C-terminal proline-rich domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 248
26666 153304 cd07620 BAR_SH3BP1 The Bin/Amphiphysin/Rvs (BAR) domain of SH3-domain Binding Protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. SH3-domain binding protein 1 (SH3BP1 or 3BP-1) is a Rac GTPase activating protein that inhibits Rac-mediated platelet-derived growth factor (PDGF)-induced membrane ruffling. SH3BP1 contains an N-terminal BAR domain followed by a GAP domain for Rho and Rac GTPases and a C-terminal proline-rich domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 257
26667 153305 cd07621 BAR_SNX5_6 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 5 and 6. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Members of this subfamily include SNX5, SNX6, the mammalian SNX32, and similar proteins. SNX5 and SNX6 may be components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. The function of SNX32 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 219
26668 153306 cd07622 BAR_SNX4 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 4. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. It is also implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and leptin. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 201
26669 153307 cd07623 BAR_SNX1_2 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 1 and 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX1, SNX2, and similar proteins. SNX1 and SNX2 are components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 224
26670 153308 cd07624 BAR_SNX7_30 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 7 and 30. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX7, SNX30, and similar proteins. The specific functions of SNX7 and SNX30 have not been elucidated. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 200
26671 153309 cd07625 BAR_Vps17p The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Vps17p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Vps17p forms a dimer with Vps5p, the yeast counterpart of human SNX1, and is part of the retromer complex that mediates the transport of the carboxypeptidase Y receptor Vps10p from endosomes to Golgi. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 230
26672 153310 cd07626 BAR_SNX9_like The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 9 and Similar Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX9, SNX18, SNX33, and similar proteins. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 199
26673 153311 cd07627 BAR_Vps5p The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Vps5p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Vps5p is the yeast counterpart of human SNX1 and is part of the retromer complex, which functions in the endosome-to-Golgi retrieval of vacuolar protein sorting receptor Vps10p, the Golgi-resident membrane protein A-ALP, and endopeptidase Kex2. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 216
26674 153312 cd07628 BAR_Atg24p The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Atg24p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Atg24p is involved in membrane fusion events at the vacuolar surface during pexophagy. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 185
26675 153313 cd07629 BAR_Atg20p The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Atg20p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The function of Atg20p is unknown but it has been shown to interact with Atg11p, which plays a role in linking cargo molecules with vesicle-forming components. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 187
26676 153314 cd07630 BAR_SNX_like The Bin/Amphiphysin/Rvs (BAR) domain of uncharacterized Sorting Nexins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of uncharacterized proteins with similarity to sorting nexins (SNXs), which are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 198
26677 153315 cd07631 BAR_APPL1 The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains. Vertebrates contain two APPL proteins, APPL1 and APPL2. APPL1 interacts with diverse receptors (e.g. NGF receptor TrkA, FSHR, adiponectin receptors) and signaling proteins (e.g. Akt, PI3K), and may function as an adaptor linked to many distinct signaling pathways. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 215
26678 153316 cd07632 BAR_APPL2 The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains. Vertebrates contain two APPL proteins, APPL1 and APPL2. Both APPL proteins interact with the transcriptional repressor Reptin, acting as activators of beta-catenin/TCF-mediated transcription. APPL2 is essential for cell proliferation. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 215
26679 153317 cd07633 BAR_OPHN1 The Bin/Amphiphysin/Rvs (BAR) domain of Oligophrenin-1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Oligophrenin-1 (OPHN1) is a GTPase activating protein (GAP) with activity towards RhoA, Rac, and Cdc42, that is expressed in developing spinal cord and in adult brain areas with high plasticity. It plays a role in regulating the actin cytoskeleton as well as morphology changes in axons and dendrites, and may also function in modulating neuronal connectivity. Mutations in the OPHN1 gene cause X-linked mental retardation associated with cerebellar hypoplasia, lateral ventricle enlargement and epilepsy. OPHN1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and a Rho GAP domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 207
26680 153318 cd07634 BAR_GAP10-like The Bin/Amphiphysin/Rvs (BAR) domain of Rho GTPase activating protein 10-like. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This group is composed of uncharacterized proteins called Rho GTPase activating protein (GAP) 10-like. GAP10-like may be a GAP with activity towards RhoA and Cdc42. Similar to GRAF and GRAF2, it contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domains of the related proteins GRAF and OPHN1 directly interact with their Rho GAP domains and inhibit their activity. The autoinhibited proteins are capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously. 207
26681 153319 cd07635 BAR_GRAF2 The Bin/Amphiphysin/Rvs (BAR) domain of GTPase Regulator Associated with Focal adhesion 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. GTPase Regulator Associated with Focal adhesion kinase 2 (GRAF2), also called Rho GTPase activating protein 10 (ARHGAP10) or PS-GAP, is a GAP with activity towards Cdc42 and RhoA which regulates caspase-activated p21-activated protein kinase-2 (PAK-2p34). GRAF2 interacts with PAK-2p34, leading to its stabilization and decrease of cell death. It is highly expressed in skeletal muscle and also interacts with PKNbeta, which is a target of Rho. GRAF2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein GRAF directly interacts with its Rho GAP domain and inhibits its activity. Autoinhibited GRAF is capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously. 207
26682 153320 cd07636 BAR_GRAF The Bin/Amphiphysin/Rvs (BAR) domain of GTPase Regulator Associated with Focal adhesion kinase. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. GTPase Regulator Associated with Focal adhesion kinase (GRAF), also called Rho GTPase activating protein 26 (ARHGAP26), is a GAP with activity towards RhoA and Cdc42 and is only weakly active towards Rac1. It influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase (FAK), which is a critical component of integrin signaling. GRAF contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of GRAF directly interacts with its Rho GAP domain and inhibits its activity. Autoinhibited GRAF is capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously. 207
26683 153321 cd07637 BAR_ACAP3 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP3 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 3), also called centaurin beta-5, is presumed to be an Arf GTPase activating protein (GAP) based on its similarity to the Arf6-specific GAPs ACAP1 and ACAP2. The specific function of ACAP3 is still unknown. ACAP3 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 200
26684 153322 cd07638 BAR_ACAP2 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP2 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 2), also called centaurin beta-2, is an Arf6-specific GTPase activating protein (GAP) which mediates Arf6 signaling. Arf6 is involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration. ACAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 200
26685 153323 cd07639 BAR_ACAP1 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP1 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 1), also called centaurin beta-1, is an Arf6-specific GTPase activating protein (GAP) which mediates Arf6 signaling. Arf6 is involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration. ACAP1 also participates in the cargo sorting and recycling of the transferrin receptor and integrin beta1. It may also play a role in innate immune responses. ACAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 200
26686 153324 cd07640 BAR_ASAP3 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP3 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 3) is also known as ACAP4 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 4), DDEFL1 (Development and Differentiation Enhancing Factor-Like 1), or centaurin beta-6. It is an Arf6-specific GTPase activating protein (GAP) and is co-localized with Arf6 in ruffling membranes upon EGF stimulation. ASAP3 is implicated in the pathogenesis of hepatocellular carcinoma and plays a role in regulating cell migration and invasion. ASAP3 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains. 213
26687 153325 cd07641 BAR_ASAP1 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP1 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 1) is also known as DDEF1 (Development and Differentiation Enhancing Factor 1), AMAP1, centaurin beta-4, or PAG2. ASAP1 is an Arf GTPase activating protein (GAP) with activity towards Arf1 and Arf5 but not Arf6. However, it has been shown to bind GTP-Arf6 stably without GAP activity. It has been implicated in cell growth, migration, and survival, as well as in tumor invasion and malignancy. It binds paxillin and cortactin, two components of invadopodia which are essential for tumor invasiveness. It also binds focal adhesion kinase (FAK) and the SH2/SH3 adaptor CrkL. ASAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains. 215
26688 153326 cd07642 BAR_ASAP2 The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP2 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 2) is also known as DDEF2 (Development and Differentiation Enhancing Factor 2), AMAP2, centaurin beta-3, or PAG3. ASAP2 mediates the functions of Arf GTPases via dual mechanisms: it exhibits GTPase activating protein (GAP) activity towards class I (Arf1) and II (Arf5) Arfs; and binds class III Arfs (GTP-Arf6) stably without GAP activity. It binds paxillin and is implicated in Fcgamma receptor-mediated phagocytosis in macrophages and in cell migration. ASAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains. 215
26689 153327 cd07643 I-BAR_IMD_MIM Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Missing In Metastasis. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. Members of this subfamily include missing in metastasis (MIM) or metastasis suppressor 1 (MTSS1), metastasis suppressor 1-like (MTSSL) or ABBA (Actin-Bundling protein with BAIAP2 homology), and similar proteins. They contain an N-terminal IMD and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. MIM was originally identified as a missing transcript from metastatic bladder and prostate cancer cells. It is a scaffold protein that functions in a signaling pathway between the PDGF receptor, Src kinases, and actin assembly. It may also function as a cofactor of the Sonic hedgehog (Shh) transcriptional pathway and may participate in tumor development and progression via this pathway. ABBA regulates actin and plasma membrane dynamics to promote the extension of radial glia, which is important in neuronal migration, axon guidance and neurogenesis. The IMD domain of MIM binds and bundles actin filaments, binds membranes, and interacts with the small GTPase Rac. 231
26690 153328 cd07644 I-BAR_IMD_BAIAP2L2 Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. This group is composed of uncharacterized proteins known as BAIAP2L2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2). They contain an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The related proteins, BAIAP2L1 and IRSp53, function as regulators of membrane dynamics and the actin cytoskeleton. The IMD domain binds and bundles actin filaments, binds membranes and produces membrane protrusions, and interacts with the small GTPase Rac. 215
26691 153329 cd07645 I-BAR_IMD_BAIAP2L1 Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. BAIAP2L1 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1) is also known as IRTKS (Insulin Receptor Tyrosine Kinase Substrate). It is widely expressed, serves as a substrate for the insulin receptor, and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. BAIAP2L1 expression leads to the formation of short actin bundles, distinct from filopodia-like protrusions induced by the expression of the related protein IRSp53. It contains an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The IMD domain of BAIAP2L1 binds and bundles actin filaments, and binds the small GTPase Rac. 226
26692 153330 cd07646 I-BAR_IMD_IRSp53 Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Insulin Receptor tyrosine kinase Substrate p53. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. IRSp53 (Insulin Receptor tyrosine kinase Substrate p53) is also known as BAIAP2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2). It is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. One variant (T-form) is expressed exclusively in human breast cancer cells. The gene encoding IRSp53 is a putative susceptibility gene for Gilles de la Tourette syndrome. IRSp53 contains an N-terminal IMD, a CRIB (Cdc42 and Rac interactive binding motif), an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. Its IMD domain binds and bundles actin filaments, binds membranes, and interacts with the small GTPase Rac. 232
26693 153331 cd07647 F-BAR_PSTPIP The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Vertebrates contain two Proline-Serine-Threonine Phosphatase-Interacting Proteins (PSTPIPs), PSTPIP1 and PSTPIP2. PSTPIPs are mainly expressed in hematopoietic cells and are involved in the regulation of cell adhesion and motility. Mutations in PSTPIPs have been shown to cause autoinflammatory disorders. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain, while PSTPIP2 contains only the N-terminal F-BAR domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 239
26694 153332 cd07648 F-BAR_FCHO The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proteins in this group have been named FCH domain Only (FCHO) proteins. Vertebrates have two members, FCHO1 and FCHO2. These proteins contain an F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 261
26695 153333 cd07649 F-BAR_GAS7 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Growth Arrest Specific protein 7. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Growth Arrest Specific protein 7 (GAS7) is mainly expressed in the brain and is required for neurite outgrowth. It may also play a role in the protection and migration of embryonic stem cells. Treatment-related acute myeloid leukemia (AML) has been reported resulting from mixed-lineage leukemia (MLL)-GAS7 translocations as a complication of primary cancer treatment. GAS7 contains an N-terminal SH3 domain, followed by a WW domain, and a central F-BAR domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 233
26696 153334 cd07650 F-BAR_Syp1p_like The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of yeast Syp1 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Syp1p is associated with septins, a family of GTP-binding proteins that serve as elements of septin filaments, which are required for cell morphogenesis and division. Syp1p regulates cell-cycle dependent septin cytoskeletal dynamics in yeast. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCH domain Only (FCHO) proteins and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 228
26697 153335 cd07651 F-BAR_PombeCdc15_like The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Schizosaccharomyces pombe Cdc15, and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of Schizosaccharomyces pombe Cdc15 and Imp2, and similar proteins. These proteins contain an N-terminal F-BAR domain and a C-terminal SH3 domain. S. pombe Cdc15 and Imp2 play both distinct and overlapping roles in the maintenance and strengthening of the contractile ring at the division site, which is required in cell division. Cdc15 is a component of the actomyosin ring and is required in normal cytokinesis. Imp2 colocalizes with the medial ring during septation and is required for normal septation. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 236
26698 153336 cd07652 F-BAR_Rgd1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Saccharomyces cerevisiae Rho GTPase activating protein Rgd1 and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Saccharomyces cerevisiae Rgd1 is a GTPase activating protein (GAP) with activity towards Rho3p and Rho4p, which are involved in bud growth and cytokinesis, respectively. At low pH, S. cerevisiae Rgd1 is required for cell survival and the activation of the protein kinase C pathway, which is important in cell integrity and the maintenance of cell shape. It contains an N-terminal F-BAR domain and a C-terminal Rho GAP domain. The F-BAR domain of S. cerevisiae Rgd1 binds to phosphoinositides and plays an important role in the localization of the protein to the bud tip/neck during the cell cycle. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 234
26699 153337 cd07653 F-BAR_CIP4-like The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Cdc42-Interacting Protein 4 and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. Members of this subfamily typically contain an N-terminal F-BAR domain and a C-terminal SH3 domain. In addition, some members such as FNBP1L contain a central Cdc42-binding HR1 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 251
26700 153338 cd07654 F-BAR_FCHSD The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains proteins (FCHSD). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of FCH and double SH3 domain (FCHSD) proteins, so named as they contain an N-terminal F-BAR domain and two SH3 domains at the C-terminus. Vertebrates harbor two subfamily members, FCHSD1 and FCHSD2, which have been characterized only in silico. Their biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 264
26701 153339 cd07655 F-BAR_PACSIN The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 258
26702 153340 cd07656 F-BAR_srGAP The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs, all of which are expressed during embryonic and early development in the nervous system but with different localization and timing. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 241
26703 153341 cd07657 F-BAR_Fes_Fer The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fes (feline sarcoma) and Fer (Fes related) tyrosine kinases. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fes (feline sarcoma), also called Fps (Fujinami poultry sarcoma), and Fer (Fes related) are cytoplasmic (or nonreceptor) tyrosine kinases that play roles in haematopoiesis, inflammation and immunity, growth factor signaling, cytoskeletal regulation, cell migration and adhesion, and the regulation of cell-cell interactions. Although Fes and Fer show redundancy in their biological functions, they show differences in their expression patterns. Fer is ubiquitously expressed while Fes is expressed predominantly in myeloid and endothelial cells. Fes and Fer contain an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of Fes is critical in its role in microtubule nucleation and bundling. 237
26704 153342 cd07658 F-BAR_NOSTRIN The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Nitric Oxide Synthase TRaffic INducer (NOSTRIN). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Nitric Oxide Synthase TRaffic INducer (NOSTRIN) is expressed in endothelial and epithelial cells and is involved in the regulation, trafficking and targeting of endothelial NOS (eNOS). NOSTRIN facilitates the endocytosis of eNOS by coordinating the functions of dynamin and the Wiskott-Aldrich syndrome protein (WASP). Increased expression of NOSTRIN may be correlated to preeclampsia. NOSTRIN contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of NOSTRIN is necessary and sufficient for its membrane association and is responsible for its subcellular localization. 239
26705 153343 cd07659 BAR_PICK1 The Bin/Amphiphysin/Rvs (BAR) domain of Protein Interacting with C Kinase 1. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Protein Interacting with C Kinase 1 (PICK1), also called Protein kinase C-alpha-binding protein, is highly expressed in brain and testes. PICK1 plays a key role in the trafficking of AMPA receptors, which are critical for regulating synaptic strength and may be important in cellular processes involved in learning and memory. PICK1 is also critical in the early stages of spermiogenesis. Mice deficient in PICK1 are infertile and show characteristics of the human disease globozoospermia such as round-headed sperm, reduced sperm count, and severely impaired sperm motility. PICK1 may also be involved in the neuropathogenesis of schizophrenia. PICK1 contains an N-terminal PDZ domain and a C-terminal BAR domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of PICK1 is necessary for its membrane localization and activation. 215
26706 153344 cd07660 BAR_Arfaptin The Bin/Amphiphysin/Rvs (BAR) domain of Arfaptin. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Arfaptins are ubiquitously expressed proteins implicated in mediating cross-talk between Rac, a member of the Rho family GTPases, and Arf (ADP-ribosylation factor) small GTPases. Arfaptins bind to GTP-bound Arf1, Arf5, and Arf6, with strongest binding to GTP-Arf1. Arfaptins also bind to Rac-GTP and Rac-GDP with similar affinities. The Arfs are thought to bind to the same surface as Rac, and their binding is mutually exclusive. Mammals contain at least two isoforms of Arfaptin. Arfaptin 1 has been shown to inhibit the activation of Arf-dependent phospholipase D (PLD) and the secretion of matrix metalloproteinase-9 (MMP-9), an enzyme implicated in cancer invasiveness and metastasis. Arfaptin 2 regulates the aggregation of the protein huntingtin, which is implicated in Huntington disease. Arfaptins are single-domain proteins with a BAR-like structure. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 201
26707 153345 cd07661 BAR_ICA69 The Bin/Amphiphysin/Rvs (BAR) domain of Islet Cell Autoantigen 69-kDa. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Islet cell autoantigen 69-kDa (ICA69) is a diabetes-associated autoantigen that is highly expressed in brain and beta cells. It is involved in membrane trafficking at the Golgi complex in neurosecretory cells. It is coexpressed with Protein Interacting with C Kinase 1 (PICK1), also a BAR domain containing protein, in many tissues at different developmental stages. In neurons, ICA69 colocalizes with PICK1 in cell bodies and dendrites but is absent in synapses where PICK1 is enriched. ICA69 contains an N-terminal BAR domain and a conserved C-terminal domain of unknown function. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. ICA69 associates with PICK1 through their BAR domains to form a heterodimer which is involved in regulating the synaptic targeting and surface expression of AMPA receptors. Autoantibodies against ICA69 have been identified in patients with insulin-dependent diabetes mellitus, rheumatoid arthritis, and primary Sjogren's syndrome. ICA69 has also been shown to be released by pancreatic cancer cells. 204
26708 153346 cd07662 BAR_SNX6 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 6. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX6 forms a stable complex with SNX1 and may be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It interacts with the receptor serine/threonine kinases from the transforming growth factor-beta family. It also plays roles in enhancing the degradation of EGFR and in regulating the activity of Na,K-ATPase through its interaction with Translationally Controlled Tumor Protein (TCTP). BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 218
26709 153347 cd07663 BAR_SNX5 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 5. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX5, abundantly expressed in macrophages, regulates macropinocytosis, a process that enables cells to internalize large amounts of external solutes. It may also be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It also binds the Fanconi anaemia complementation group A protein (FANCA). SNX5 is localized to a subdomain of early endosome and is recruited to the plasma membrane following EGF stimulation and elevation of PI(3,4)P2 levels. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 218
26710 153348 cd07664 BAR_SNX2 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX2 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 234
26711 153349 cd07665 BAR_SNX1 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX1 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. SNX1 is localized to a microdomain in early endosomes where it regulates cation-independent mannose-6-phosphate receptor retrieval to the trans Golgi network. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 234
26712 153350 cd07666 BAR_SNX7 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 7. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The specific function of SNX7 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 243
26713 153351 cd07667 BAR_SNX30 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 30. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The specific function of SNX30 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 240
26714 153352 cd07668 BAR_SNX9 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 9. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX9, also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 210
26715 153353 cd07669 BAR_SNX33 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 33. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX33 interacts with Wiskott-Aldrich syndrome protein (WASP) and plays a role in the maintenance of cell shape and cell cycle progression. It modulates the shedding and endocytosis of cellular prion protein (PrP(c)) and amyloid precursor protein (APP). BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 207
26716 153354 cd07670 BAR_SNX18 The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 18. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 207
26717 153355 cd07671 F-BAR_PSTPIP1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 1. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proline-Serine-Threonine Phosphatase-Interacting Protein 1 (PSTPIP1), also known as CD2 Binding Protein 1 (CD2BP1), is mainly expressed in hematopoietic cells. It is a binding partner of the cell surface receptor CD2 and PTP-PEST, a tyrosine phosphatase which functions in cell motility and Rac1 regulation. It also plays a role in the activation of the Wiskott-Aldrich syndrome protein (WASP), which couples actin rearrangement and T cell activation. Mutations in the gene encoding PSTPIP1 cause the autoinflammatory disorder known as PAPA (pyogenic sterile arthritis, pyoderma gangrenosum, and acne) syndrome. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 242
26718 153356 cd07672 F-BAR_PSTPIP2 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 2. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proline-Serine-Threonine Phosphatase-Interacting Protein 2 (PSTPIP2), also known as Macrophage Actin-associated tYrosine Phosphorylated protein (MAYP), is mostly expressed in hematopoietic cells but is also expressed in the brain. It is involved in regulating cell adhesion and motility. Mutations in the gene encoding murine PSTPIP2 can cause autoinflammatory disorders such as chronic multifocal osteomyelitis and macrophage autoinflammatory disease. PSTPIP2 contains an N-terminal F-BAR domain and lacks the PEST motifs and SH3 domain that are found in PSTPIP1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 240
26719 153357 cd07673 F-BAR_FCHO2 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only 2 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. The specific function of FCH domain Only 2 (FCHO2) is still unknown. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCHO1 and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 269
26720 153358 cd07674 F-BAR_FCHO1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only 1 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH domain Only 1 (FCHO1) may be involved in clathrin-coated vesicle formation. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCHO2 and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 261
26721 153359 cd07675 F-BAR_FNBP1L The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Formin Binding Protein 1-Like. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FormiN Binding Protein 1-Like (FNBP1L), also known as Toca-1 (Transducer of Cdc42-dependent actin assembly), forms a complex with neural Wiskott-Aldrich syndrome protein (N-WASP). The FNBP1L/N-WASP complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. It contains an N-terminal F-BAR domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 252
26722 153360 cd07676 F-BAR_FBP17 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Formin Binding Protein 17. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Formin Binding Protein 17 (FBP17), also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 253
26723 153361 cd07677 F-BAR_FCHSD2 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains 2 (FCHSD2). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH and double SH3 domains 2 (FCHSD2) contains an N-terminal F-BAR domain and two SH3 domains at the C-terminus. It has been characterized only in silico, and its biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 260
26724 153362 cd07678 F-BAR_FCHSD1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains 1 (FCHSD1). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH and double SH3 domains 1 (FCHSD1) contains an N-terminal F-BAR domain and two SH3 domains at the C-terminus. It has been characterized only in silico, and its biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 263
26725 153363 cd07679 F-BAR_PACSIN2 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 2 (PACSIN2). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 2 or Syndapin II is expressed ubiquitously and is involved in the regulation of tubulin polymerization. It associates with Golgi membranes and forms a complex with dynamin II which is crucial in promoting vesicle formation from the trans-Golgi network. PACSIN 2 contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 258
26726 153364 cd07680 F-BAR_PACSIN1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 1 (PACSIN1). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 1 or Syndapin I is expressed specifically in the brain and is localized in neurites and synaptic boutons. It binds the brain-specific proteins dynamin I, synaptojanin, synapsin I, and neural Wiskott-Aldrich syndrome protein (nWASP), and functions as a link between the cytoskeletal machinery and synaptic vesicle endocytosis. PACSIN 1 interacts with huntingtin and may be implicated in the neuropathology of Huntington's disease. It contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 258
26727 153365 cd07681 F-BAR_PACSIN3 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 3 (PACSIN3). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 3 or Syndapin III is expressed ubiquitously and regulates glucose uptake in adipocytes through its role in GLUT1 trafficking. It also modulates the subcellular localization and stimulus-specific function of the cation channel TRPV4. PACSIN 3 contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 258
26728 153366 cd07682 F-BAR_srGAP2 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 2. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP2 is expressed in zones of neuronal differentiation. It plays a role in the regeneration of neurons and axons. srGAP2 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 263
26729 153367 cd07683 F-BAR_srGAP1 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 1. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP1, also called Rho GTPase-Activating Protein 13 (ARHGAP13), is a Cdc42- and RhoA-specific GAP and is expressed later in the development of CNS (central nervous system) tissues. It is an important downstream signaling molecule of Robo1. srGAP1 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 253
26730 153368 cd07684 F-BAR_srGAP3 The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 3. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP3, also called MEGAP (MEntal disorder associated GTPase-Activating Protein), is a Rho GAP with activity towards Rac1 and Cdc42. It impacts cell migration by regulating actin and microtubule cytoskeletal dynamics. The association between srGAP3 haploinsufficiency and mental retardation is under debate. srGAP3 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 253
26731 153369 cd07685 F-BAR_Fes The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fes (feline sarcoma) tyrosine kinase. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fes (feline sarcoma), also called Fps (Fujinami poultry sarcoma), is a cytoplasmic (or nonreceptor) tyrosine kinase whose gene was first isolated from tumor-causing retroviruses. It is expressed in myeloid, vascular endothelial, epithelial, and neuronal cells, and plays important roles in cell growth and differentiation, angiogenesis, inflammation and immunity, and cytoskeletal regulation. Fes kinase has also been implicated as a tumor suppressor in colorectal cancer. It contains an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of Fes is critical in its role in microtubule nucleation and bundling. 237
26732 153370 cd07686 F-BAR_Fer The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fer (Fes related) tyrosine kinase. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fer (Fes related) is a cytoplasmic (or nonreceptor) tyrosine kinase expressed in a wide variety of tissues, and is found to reside in both the cytoplasm and the nucleus. It plays important roles in neuronal polarization and neurite development, cytoskeletal reorganization, cell migration, growth factor signaling, and the regulation of cell-cell interactions mediated by adherens junctions and focal adhesions. Fer kinase also regulates cell cycle progression in malignant cells. It contains an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. 234
26733 409484 cd07687 IgC_TCR_delta Immunoglobulin (Ig) constant domain of the delta chain of delta/gamma T-cell antigen receptors (TCRs). The members here are composed of the constant domain of the delta chain of delta/gamma T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. The majority of T cells contain alpha-beta TCRs, but a small subset contain gamma-delta TCRs. Alpha-beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma-delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing and MHC independently of the bound peptide. Gamma-delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. 80
26734 409485 cd07688 IgC_TCR_alpha Immunoglobulin (Ig) constant domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). The members here are composed of the constant domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes and are composed of alpha and beta, or gamma and delta polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the alpha chain. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The antigen binding site is formed by the variable domains of the alpha and beta chains, located at the N-terminus of each chain. Alpha/beta TCRs recognize antigens differently from gamma/delta TCRs. 83
26735 409486 cd07689 IgC2_VCAM-1 Immunoglobulin (Ig)-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1) and similar proteins; member of the C2-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1; also known as Cluster of Differentiation (CD) 106) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. The interaction of VCAM-1 binding to the beta1 integrin very late antigen 4 (VLA-4) expressed by lymphocytes and monocytes mediates the adhesion of leucocytes to blood vessel walls, and regulates migration across the endothelium. During metastasis, some circulating cancer cells extravasate to a secondary site by a similar process. VCAM-1 may be involved in organ targeted tumor metastasis and may also act as host receptors for viruses and parasites. VCAM-1 contains seven Ig domains. 101
26736 409487 cd07690 IgV_1_CD4 First immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4; member of the V-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4. CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy. The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy. 97
26737 409488 cd07691 IgC1_CD3_gamma_delta Immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 gamma and delta chains; member of the C1-set of IgSF domains. The members here are composed of immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 gamma and delta chains. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex. 69
26738 409489 cd07692 IgC1_CD3_epsilon Immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 epsilon chain; member of the C1-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 epsilon chain. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex. 75
26739 409490 cd07693 IgC_1_Robo First immunoglobulin (Ig)-like constant domain in Robo (roundabout) receptors, and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain in Roundabout (Robo) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, and Robo3), and three mammalian Slit homologs (Slit1, Slit2, Slit3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit1, Slit2, and Slit3 are expressed at the ventral midline. Robo3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. 99
26740 409491 cd07694 IgC2_2_CD4 Second immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4; member of the C2-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig) Constant 2 (C2)-set domain of Cluster of Differentiation (CD) 4. CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy. The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand. 88
26741 409492 cd07695 IgV_3_CD4 Third immunoglobulin (Ig) Variable (V) domain of Cluster of Differentiation (CD) 4; member of the V-set of IgSF domains. The members here are composed of the third immunoglobulin variable (IgV) domain of Cluster of Differentiation (CD) 4. CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy. The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy. 107
26742 409493 cd07696 IgC1_CH3_IgAEM_CH2_IgG CH3 domain (third constant Ig domain of heavy chains) in immunoglobulin heavy alpha, epsilon, and mu chains, and CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin heavy gamma chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin constant domain (IgC) of the gamma heavy chains and the second immunoglobulin constant domain (IgC) of alpha, epsilon, and mu heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. 98
26743 409494 cd07697 IgC1_TCR_gamma T cell receptor (TCR) gamma chain constant immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) constant (C) domain of the gamma chain of gamma-delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes and are heterodimers consisting of alpha and beta chains or gamma and delta chains. Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha-beta TCRs, but a small subset contain gamma-delta TCRs. Alpha-beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma-delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing and MHC independently of the bound peptide. Gamma-delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. 98
26744 409495 cd07698 IgC1_MHC_I_alpha3 Class I major histocompatibility complex (MHC) alpha chain, alpha3 immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 92
26745 409496 cd07699 IgC1_L Immunoglobulin light chain Constant domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) light chain constant (C) domain. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determine the type of immunoglobulin: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains. 99
26746 409497 cd07700 IgV_CD8_beta Immunoglobulin (Ig) variable (V) domain of Cluster of Differentiation (CD) 8 beta chain. The members here are composed of the immunoglobulin (Ig)-like domain in Cluster of Differentiation (CD) 8 beta. The CD8 glycoprotein plays an essential role in the control of T-cell selection, maturation, and the T-cell receptor (TCR)-mediated response to peptide antigen. CD8 is comprised of alpha and beta subunits and is expressed as either an alpha/alpha or alpha/beta dimer. Both dimeric isoforms can serve as a coreceptor for T cell activation and differentiation, however they have distinct physiological roles, different cellular distributions, unique binding partners, etc. Each CD8 subunit is comprised of an extracellular domain containing a V-type Ig-like domain, a single pass transmembrane portion, and a short intracellular domain. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 116
26747 409498 cd07701 IgV_1_Necl-3 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule-3; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-3, Necl-3 (also known as cell adhesion molecule 2 (CADM2), SynCAM2, IGSF4D). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-3 accumulates in central and peripheral nervous system tissue, and has been shown to selectively interact with oligodendrocytes. 96
26748 409499 cd07702 IgI_VEGFR-1 Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 1 (VEGFR-1); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 1 (VEGFR-1). VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-1 binds VEGF-A strongly; VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-1 may play an inhibitory role in the function of VEGFR-2 by binding VEGF-A and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis and may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 92
26749 409500 cd07703 IgC1_2_Nectin-2_Necl-5_like Second immunoglobulin (Ig) domain of Nectin-2 and Nectin-like protein 5, and similar domains; member of the C1-set of the Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-2 (also known as poliovirus receptor related protein 2 or Cluster of Differentiation 112 (CD112)), nectin-like protein 5 (CD155), and similar proteins. Nectins and Nectin-like molecules are a family of Ca(2+)-independent immunoglobulin-like transmembrane glycoproteins belonging to the class of adhesion receptors, consisting of nine members (nectins 1 through 4 and nectin-like proteins 1 through 5). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. CD155 is the fifth member in the nectin-like molecule family, and functions as the receptor of poliovirus; therefore, CD155 is also referred to as Necl-5, or PVR. In contrast to all other family members, CD155 lacks self-adhesion capacity, yet it shares with nectins the feature to interact with other nectins. For instance, CD155 heterophilically trans-interacts with nectin-3, thereby contributing significantly to the establishment of adherens junctions between epithelial cells. This group belongs to the Constant 1 (C1)-set of IgSF domains, which has one beta-sheet that is formed by strands A-B-E-D and the other strands by G-F-C-C'. 97
26750 409501 cd07704 IgC1_2_Nectin-3-4_like Second immunoglobulin (Ig) domain of nectin-3 and nectin-4 (poliovirus receptor related protein 4), and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-3 (also known as poliovirus receptor related protein 3 or cluster of differentiation (CD) 113) and nectin-4 (poliovirus receptor related protein 4). Nectin-3 and nectin-4 belong to the nectin family comprised of four transmembrane glycoproteins (nectin-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. Nectin-3 has also been shown to form a heterophilic trans-interaction with nectin-1 in ciliary epithelia, establishing the apex-apex adhesion between the pigment and non-pigment cell layers. Nectin-4 has recently been identified in several types of breast carcinoma and can be used as a histological and serological marker for breast cancer. 96
26751 409502 cd07705 IgI_2_Necl-1 Second immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1; also known as cell adhesion molecule 3 (CADM3)). These nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 and Necl-2 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Ig domains are likely to participate in ligand binding and recognition. 103
26752 409503 cd07706 IgV_TCR_delta Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) delta chain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the delta chain of gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains. Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing, and MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 112
26753 293793 cd07707 MBL-B1-B2-like metallo-beta-lactamases; subclasses B1 and B2 and related proteins; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. B1 MBLs include chromosomally-encoded MBLs such as Bacillus cereus BcII, Bacteroides fragilis CcrA, and Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and acquired MBLs including IMP-1, VIM-1, VIM-2, GIM-1, NDM-1 and FIM-1. B2 MBLs have a narrow substrate profile that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site; binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis. B2 MBLs include Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and Serratia fonticola Sfh-I. 219
26754 293794 cd07708 MBL-B3-like metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. B3 MBLs include Fluoribacter gormanii FEZ-1, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) GOB-1, Stenotrophomonas maltophilia L1, Bradyrhizobium diazoefficiens BJP-1, Serratia marcescens SMB-1, and Pseudomonas aeruginosa AIM-1. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 248
26755 293795 cd07709 flavodiiron_proteins_MBL-fold catalytic domain of flavodiiron proteins (FDPs) and related proteins; MBL-fold metallo-hydrolase domain. FDPs catalyze the reduction of oxygen and/or nitric oxide to water or nitrous oxide respectively. In addition to this N-terminal catalytic domain they contain a C-terminal flavin mononucleotide-binding flavodoxin-like domain. Although some FDPs are able to reduce NO or O2 with similar catalytic efficiencies others are selective for either NO or O2, such as Escherichia coli flavorubredoxin which is selective toward NO and G. intestinalis FDP which is selective toward O2. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. Some members of this subgroup are single domain. 238
26756 293796 cd07710 arylsulfatase_Sdsa1-like_MBL-fold Pseudomonas aeruginosa arylsulfatase SdsA1, Pseudomonas sp. DSM6611 arylsulfatase Pisa1, and related proteins; MBL-fold metallo-hydrolase domain. Arylsulfatase (also known as aryl-sulfate sulfohydrolase, EC 3.1.6.1). Pseudomonas aeruginosa SdsA1 is a secreted SDS hydrolase that allows the bacterium to use primary sulfates such as the detergent SDS common in commercial personal hygiene products as a sole carbon or sulfur source. Pseudomonas inverting secondary alkylsulfatase 1 (Pisa1) is specific for secondary alkyl sulfates. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 239
26757 293797 cd07711 MBLAC1-like_MBL-fold uncharacterized human metallo-beta-lactamase domain-containing protein 1 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized human MBLAC1 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 190
26758 293798 cd07712 MBLAC2-like_MBL-fold uncharacterized human metallo-beta-lactamase domain-containing protein 2 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized human MBLAC2 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 182
26759 293799 cd07713 DHPS-like_MBL-fold Methanocaldococcus jannaschii dihydropteroate synthase, Thermoanaerobacter tengcongensis Tflp, and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes Methanocaldococcus jannaschii 7,8-dihydropterin-6-methyl-4-(beta-D-ribofuranosyl)-aminobenzene-5'-phosphate synthase (EC 2.5.1.15), a folate biosynthetic enzyme also known as dihydropteroate synthase and 7,8 dihydropteroate synthase. Thermoanaerobacter tengcongensis Tflp is a ferredoxin-like member. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 269
26760 293800 cd07714 RNaseJ_MBL-fold RNase J, MBL-fold metallo-hydrolase domain. RNase J, also called Ribonuclease J, is a prokaryotic ribonuclease which plays a key part in RNA processing and in RNA degradation. It can act as an endonuclease which is specific for single-stranded regions of RNA irrespective of their sequence or location, and as a processive 5' exonuclease which only acts on substrates having a single phosphate or a hydroxyl at the 5' end. Many bacterial species have only one RNase J, but some, such as Bacillus subtilis, have two. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 248
26761 293801 cd07715 TaR3-like_MBL-fold MBL-fold metallo-hydrolase domain of Myxococcus xanthus TaR3 and related proteins; MBL-fold metallo-hydrolase domain. Myxococcus xanthus TaR3 may function as an ammonium regulator/effector protein involved in biosynthesis of the antibiotic TA. Some members of this subgroup are annotated as ribonucleases. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 212
26762 293802 cd07716 RNaseZ_short-form-like_MBL-fold uncharacterized bacterial subgroup of Ribonuclease Z, short form; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme. Only the short form exists in bacteria. Members of this bacterial subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 175
26763 293803 cd07717 RNaseZ_ZiPD-like_MBL-fold Ribonuclease Z, E. coli 3' tRNA-processing endonuclease ZiPD and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Escherichia coli zinc phosphodiesterase (ZiPD, also known as ecoZ, tRNase Z, or RNase BN) is a 3' tRNA-processing endonuclease, encoded by the elaC gene. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme; this subgroup includes the short form (ELAC1). Only the short form exists in bacteria. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 247
26764 293804 cd07718 RNaseZ_ELAC1_ELAC2-C-term-like_MBL-fold Ribonuclease Z ELAC1, C-terminus of ELAC2, and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme; this eukaryotic subgroup includes short forms (ELAC1) and the C-terminus of long forms including human ELAC2. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 204
26765 293805 cd07719 arylsulfatase_AtsA-like_MBL-fold Pseudoalteromonas carrageenovora arylsulfatase AtsA and related proteins; MBL-fold metallo-hydrolase domain. Arylsulfatase (also known as aryl-sulfate sulfohydrolase, EC 3.1.6.1). Pseudoalteromonas carrageenovora arylsulfatase AtsA may function as a glycosulfohydrolase involved with desulfation of sulfated polysaccharides, which catalyzes hydrolysis of the arylsulfate ester bond, producing the aryl compounds and inorganic sulfate. CD also includes some sequences annotated as ribonucleases. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily. 193
26766 293806 cd07720 OPHC2-like_MBL-fold Pseudomonas pseudoalcaligenes organophosphorus hydrolase C2, and related proteins; MBL-fold metallo hydrolase domain. Pseudomonas pseudoalcaligenes OPHC2 is a thermostable organophosphorus hydrolase which has a broad substrate activity spectrum: it hydrolyzes various phosphotriesters, esters, and a lactone. This subgroup also includes Pseudomonas oleovorans PoOPH which exhibits high lactonase and esterase activities, and latent PTE activity. However, double mutations His250Ile/Ile263Trp switch PoOPH into an efficient and thermostable PTE. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 251
26767 293807 cd07721 yflN-like_MBL-fold uncharacterized subgroup which includes Bacillus subtilis yflN; MBL-fold metallo hydrolase domain. This subgroup includes the uncharacterized Bacillus subtilis yflN protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 202
26768 293808 cd07722 LACTB2-like_MBL-fold uncharacterized subgroup which includes human lactamase beta 2 and related proteins; MBL-fold metallo hydrolase domain. Includes functionally uncharacterized human lactamase beta 2. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 188
26769 293809 cd07723 hydroxyacylglutathione_hydrolase_MBL-fold hydroxyacylglutathione hydrolase, MBL-fold metallo-hydrolase domain. Hydroxyacylglutathione hydrolase (EC 3.1.2.6, also known as glyoxalase II; S-2-hydroxylacylglutathione hydrolase; hydroxyacylglutathione hydrolase; acetoacetylglutathione hydrolase). In the second step of the glyoxalase system this enzyme hydrolyzes S-D-lactoylglutathione to D-lactate and regenerates glutathione in the process. It has broad substrate specificity for glutathione thiol esters, hydrolyzing a number of these species to their corresponding carboxylic acids and reduced glutathione. It appears to hydrolyze 2-hydroxy thiol esters with greatest efficiency. It belongs to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 165
26770 293810 cd07724 POD-like_MBL-fold ETHE1 (PDO type I), persulfide dioxygenase A (PDOA, PDO type II) and related proteins; MBL-fold metallo-hydrolase domain. Persulfide dioxygenase (PDO, also known as sulfur dioxygenase, SDO, EC 1.13.11.18) is a non-heme iron-dependent oxygenase which catalyzes the oxidation of glutathione persulfide to glutathione and persulfite in the mitochondria. Mutations in ethe1 (the human PDO gene) are responsible for a rare autosomal recessive metabolic disorder called ethylmalonic encephalopathy. Arabidopsis thaliana ETHE1 is essential for embryo and endosperm development. Bacterial ETHE1-type PDOs are also called Type 1 PDOs. Type II PDOs (also called PDOAs), are mainly proteobacterial. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 177
26771 293811 cd07725 TTHA1429-like_MBL-fold uncharacterized Thermus thermophilus TTHA1429 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized Thermus thermophilus TTHA1429 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 184
26772 293812 cd07726 ST1585-like_MBL-fold uncharacterized subgroup which includes Sulfolobus tokodaii ST1585 protein; MBL-fold metallo hydrolase domain. This subgroup includes the uncharacterized Sulfolobus tokodaii ST1585 protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 215
26773 293813 cd07727 YmaE-like_MBL-fold uncharacterized subgroup which includes Bacillus subtilis YmaE and related proteins; MBL-fold metallo hydrolase domain. Includes the uncharacterized Bacillus subtilis YmaE and Nostoc all1228 proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 181
26774 293814 cd07728 YtnP-like_MBL-fold Bacillus subtilis YtnP and related proteins; MBL-fold metallo hydrolase domain. Bacillus subtilis YtnP inhibits the signaling pathway required for the streptomycin production and development of aerial mycelium in Streptomyces griseus. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 249
26775 293815 cd07729 AHL_lactonase_MBL-fold quorum-quenching N-acyl-homoserine lactonase, MBL-fold metallo-hydrolase domain. Acyl Homoserine Lactones (also known as AHLs) are signal molecules which coordinate gene expression in quorum sensing, in many Gram-negative bacteria. Quorum-quenching N-acyl-homoserine lactonase (also known as AHL lactonase, N-acyl-L-homoserine lactone hydrolase, EC 3.1.1.81) catalyzes the hydrolysis and opening of the homoserine lactone rings of AHLs, a reaction that can block quorum sensing. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 238
26776 293816 cd07730 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. Some members of this subgroup are annotated as GumP protein. 250
26777 293817 cd07731 ComA-like_MBL-fold Competence protein ComA, ComEC and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes proteins required for natural transformation competence including Neisseria gonorrhoeae ComA, Pseudomonas stutzeri ComA, Bacillus subtilis ComEC (also known as ComE operon protein 3) and Haemophilus influenzae ORF2 encoded by the rec-2 gene, as well as Escherichia coli YcaI which does not mediate spontaneous plasmid transformation on nutrient-containing agar plates. It also includes the phosphorylcholine esterase (Pce) domain of choline-binding protein E from Streptococcus pneumoniae. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 179
26778 293818 cd07732 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Includes functionally uncharacterized Enterococcus faecalis EF2904. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 202
26779 293819 cd07733 YycJ-like_MBL-fold uncharacterized subgroup which includes Bacillus subtilis YycJ and related proteins; MBL-fold metallo hydrolase domain. Includes the uncharacterized Bacillus subtilis YycJ protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 151
26780 293820 cd07734 Int9-11_CPSF2-3-like_MBL-fold Int9, Int11, CPSF2, CPSF3 and related cleavage and polyadenylation specificity factors; MBL-fold metallo-hydrolase domain. CPSF3 (cleavage and polyadenylation specificity factor subunit 3; also known as cleavage and polyadenylation specificity factor 73 kDa subunit, CPSF-73) and CPSF2 (also known as cleavage and polyadenylation specificity factor 100 kDa subunit /CPSF-100) are components of the CPSF complex, which plays a role in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and during processing of metazoan histone pre-mRNAs. CPSF3 functions as a 3' endonuclease. Int11 (also known as cleavage and polyadenylation-specific factor (CPSF) 3-like protein, and protein related to CPSF subunits of 68 kDa (RC-68)), and Int9, also known as protein related to CPSF subunits of 74 kDa (RC-74) are subunits of Integrator, a metazoan-specific multifunctional protein complex composed of 14 subunits. Integrator has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 193
26781 293821 cd07735 class_II_PDE_MBL-fold class II cyclic nucleotide phosphodiesterases Saccharomyces cerevisiae PDE1, Dictyostelium discoideum PDE1 and PDE7, and related proteins; MBL-fold metallo-hydrolase domain. Cyclic nucleotide phosphodiesterases (PDEs) decompose the second messengers cyclic adenosine and guanosine 3',5'-monophosphate (cAMP and cGMP, respectively). Saccharomyces cerevisiae PDE1 and Dictyostelium discoideum PDE1 and PDE7, have dual cAMP/cGMP specificity. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 259
26782 293822 cd07736 PhnP-like_MBL-fold phosphodiesterase Escherichia coli PhnP and related proteins; MBL-fold metallo hydrolase domain. Escherichia coli PhnP catalyzes the hydrolysis of 5-phospho-D-ribose-1,2-cyclic phosphate to D-ribose-1,5-bisphosphate, a step in the C-P lyase pathway. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 186
26783 293823 cd07737 YcbL-like_MBL-fold Salmonella enterica serovar typhimurium YcbL and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes Salmonella enterica serovar typhimurium YcbL which has type II hydroxyacylglutathione hydrolase (EC 3.1.2.6, also known as glyoxalase II) activity, and has a single metal ion binding site, and Thermus thermophilus TTHA1623 which does not have GLX2 activity and has two metal ion binding sites with a glyoxalase II-type metal coordination. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 190
26784 293824 cd07738 DdPDE5-like_MBL-fold Dictyostelium discoideum phosphodiesterase 5 and related proteins; MBL-fold metallo hydrolase domain. Includes Dictyostelium discoideum cAMP/cGMP-dependent 3',5'-cAMP/cGMP phosphodiesterase A (also known as cyclic GMP-binding protein A, phosphodiesterase 5, phosphodiesterase D, and PDE5) and cAMP/cGMP-dependent 3',5'-cAMP/cGMP phosphodiesterase B (also known as cyclic GMP-binding protein B, phosphodiesterase 6, phosphodiesterase E, and PDE6). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 189
26785 293825 cd07739 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 201
26786 293826 cd07740 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 194
26787 293827 cd07741 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 212
26788 293828 cd07742 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 249
26789 293829 cd07743 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 197
26790 143394 cd07749 NT_Pol-beta-like_1 Nucleotidyltransferase (NT) domain of an uncharacterized subgroup of the Pol beta-like NT superfamily. The Pol beta-like NT superfamily includes DNA polymerase beta and other family X DNA Polymerases, as well as Class I and Class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A)synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly(A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. Proteins belonging to this subgroup are uncharacterized. In the majority of the Pol beta-like superfamily NTs, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations essential for catalysis. These divalent metal ions are involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this subgroup. 156
26791 143622 cd07750 PolyPPase_VTC_like Polyphosphate (polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2, -3 and -4, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2, -3, and -4 contain polyP polymerase domains. For S. cerevisiae VTC4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification, and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. 214
26792 143623 cd07751 PolyPPase_VTC4_like Polyphosphate(polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) protein VTC4, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2,-3, and -4 contain polyP polymerase domains. S. cerevisiae VTC4 belongs to this subgroup. For VTC4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification, and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. 290
26793 143624 cd07756 CYTH-like_Pase_CHAD Uncharacterized subgroup of the CYTH-like superfamily having an associated CHAD domain. This subgroup belongs to the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily. Members of this superfamily hydrolyze triphosphate-containing substrates, require metal cations as cofactors, and have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). A number of proteins in this subgroup also contain a C-terminal CHAD (Conserved Histidine Alpha-helical Domain) domain which may participate in metal chelation or act as a phosphor-acceptor. The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB) and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions. Proteins of this subgroup have not been characterized. 197
26794 143625 cd07758 ThTPase Thiamine Triphosphatase. ThTPase is a soluble cytosolic enzyme which converts thiamine triphosphate (ThTP) to thiamine diphosphate. This catalytic activity depends on a divalent metal cofactor, for example Mg++. ThTPase regulates the intracellular concentration of ThTP, maintaining it at a low concentration in vivo. ThTP acts as a messenger in cell signaling in response to cellular stress, and in addition, can phosphorylate proteins in certain tissues. There is another class of membrane-associated enzymes in animal tissues which also convert ThTP to thiamine diphosphate, however they do not belong to this subgroup. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. 196
26795 143626 cd07761 CYTH-like_CthTTM-like Clostridium thermocellum (Cth)TTM and similar proteins, a subgroup of the CYTH-like superfamily. CthTTM is a metal dependent tripolyphosphatase, nucleoside triphosphatase, and nucleoside tetraphosphatase. It hydrolyzes the beta-gamma phosphoanhydride linkage of triphosphate-containing substrates including tripolyphosphate, nucleoside triphosphates and nucleoside tetraphosphates. These substrates are hydrolyzed, releasing Pi. Mg++ or Mn++ are required for the enzyme's activity. CthTTM appears to have no adenylate cyclase activity. This subgroup consists chiefly of bacterial sequences. Members of the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily have a unique active site located within an eight-stranded beta barrel. 146
26796 143627 cd07762 CYTH-like_Pase_1 Uncharacterized subgroup 1 of the CYTH-like superfamily. Enzymes belonging to the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily hydrolyze triphosphate-containing substrates, require metal cations as cofactors, and have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB) and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions. Proteins of this subgroup are of bacterial origin and have not been characterized. 180
26797 143639 cd07765 KRAB_A-box KRAB (Kruppel-associated box) domain A-box. The KRAB domain is a transcription repression module, found in a subgroup of the zinc finger proteins (ZFPs) of the C2H2 family, KRAB-ZFPs. KRAB-ZFPs comprise the largest group of transcriptional regulators in mammals, and are only found in tetrapods. These proteins have been shown to play important roles in cell differentiation and organ development, and in regulating viral replication and transcription. A KRAB domain may consist of an A-box, or of an A-box plus either a B-box, a divergent B-box (b), or a C-box. Only the A-box is included in this model. The A-box is needed for repression; the B- and C-boxes are not. KRAB-ZFPs have one or two KRAB domains at their amino-terminal end, and multiple C2H2 zinc finger motifs at their C-termini. Some KRAB-ZFPs also contain a SCAN domain which mediates homo- and hetero-oligomerization. The KRAB domain is a protein-protein interaction module which represses transcription through recruiting corepressors. A key mechanism appears to be the following: KRAB-ZFPs tethered to DNA recruit, via their KRAB domain, the repressor KAP1 (KRAB-associated protein-1, also known as transcription intermediary factor 1 beta, KRAB-A interacting protein, and tripartite motif protein 28). The KAP1/KRAB-ZFP complex in turn recruits the heterochromatin protein 1 (HP1) family, and other chromatin modulating proteins, leading to transcriptional repression through heterochromatin formation. 40
26798 341447 cd07766 DHQ_Fe-ADH Dehydroquinate synthase-like (DHQ-like) and iron-containing alcohol dehydrogenases (Fe-ADH). This superfamily consists of two subgroups: the dehydroquinate synthase (DHQS)-like group, and a large group of metal-containing alcohol dehydrogenases (ADH), known as iron-containing alcohol dehydrogenases. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. The dehydroquinate synthase-like group includes dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. The alcohol dehydrogenases (ADHs) in this superfamily contain a dehydroquinate synthase-like protein structural fold and mostly contain iron. They are distinct from other alcohol dehydrogenases which contain different protein domains. There are several distinct families of alcohol dehydrogenases: zinc-containing long-chain alcohol dehydrogenases; insect-type, or short-chain alcohol dehydrogenases; iron-containing alcohol dehydrogenases; and others. The iron-containing family has a Rossmann fold-like topology that resembles the fold of the zinc-dependent alcohol dehydrogenases, but lacks sequence homology, and differs in strand arrangement. ADH catalyzes the reversible oxidation of alcohol to acetaldehyde with the simultaneous reduction of NAD(P)+ to NAD(P)H. 271
26799 163686 cd07767 MPN Mpr1p, Pad1p N-terminal (MPN) domains. MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found at the N-termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid that is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease.
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function. 116
26800 198346 cd07768 FGGY_RBK_like Ribulokinase-like carbohydrate kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of ribulokinases (RBKs) and similar proteins from bacteria and eukaryota. RBKs catalyze the MgATP-dependent phosphorylation of a variety of sugar substrates including L- and/or D-ribulose. Members of this subfamily contain two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subfamily belong to the FGGY family of carbohydrate kinases. 465
26801 198347 cd07769 FGGY_GK Glycerol kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily includes glycerol kinases (GK; EC 2.7.1.30) and glycerol kinase-like proteins from all three kingdoms of living organisms. Glycerol is an important intermediate of energy metabolism and it plays fundamental roles in several vital physiological processes. GKs are involved in the entry of external glycerol into cellular metabolism. They catalyze the rate-limiting step in glycerol metabolism by transferring a phosphate from ATP to glycerol thus producing glycerol 3-phosphate (G3P) in the cytoplasm. Human GK deficiency, called hyperglycerolemia, is an X-linked recessive trait associated with psychomotor retardation, osteoporosis, spasticity, esotropia, and bone fractures. Under different conditions, GKs from different species may exist in different oligomeric states. The monomer of GKs is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The high affinity ATP binding site of GKs is created only by a substrate-induced conformational change. Based on sequence similarity, some GK-like proteins from metazoa, which have lost their GK enzymatic activity, are also included in this CD. Members in this subfamily belong to the FGGY family of carbohydrate kinases. 484
26802 212659 cd07770 FGGY_GntK Gluconate kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of a group of gluconate kinases (GntK, also known as gluconokinase; EC 2.7.1.12) encoded by the gntK gene, which catalyze the ATP-dependent phosphorylation of D-gluconate to produce 6-phospho-D-gluconate and ADP. The presence of Mg2+ might be required for catalytic activity. The prototypical member of this subfamily is GntK from Lactobacillus acidophilus. Unlike Escherichia coli GntK, which belongs to the superfamily of P-loop containing nucleoside triphosphate hydrolases, members in this subfamily are homologous to glycerol kinase, xylulose kinase, and rhamnulokinase from Escherichia coli. They have been classified as members of the FGGY family of carbohydrate kinases, which contain two large domains separated by a deep cleft that forms the active site. This model spans both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Some uncharacterized homologous sequences are also included in this subfamily. The Lactobacillus gnt operon contains a single gntK gene. The gnt operons of some bacteria, such as Corynebacterium glutamicum, have two gntK genes. For example, the C. glutamicum gnt operon has both a gluconate kinase gntV gene (also known as gntK) and a second hypothetical gntK gene (also known as gntK2). Both gluconate kinases encoded by these genes belong to this family; however, the protein encoded by C. glutamicum gntV is not included in this model as it is truncated in the C-terminal domain. 440
26803 198349 cd07771 FGGY_RhuK L-rhamnulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of bacterial L-rhamnulose kinases (RhuK, also known as rhamnulokinase; EC 2.7.1.5), which are encoded by the rhaB gene and catalyze the ATP-dependent phosphorylation of L-rhamnulose to produce L-rhamnulose-1-phosphate and ADP. Some uncharacterized homologous sequences are also included in this subfamily. The prototypical member of this subfamily is Escherichia coli RhuK, which exists as a monomer composed of two large domains. The ATP binding site is located in the cleft between the two domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of divalent Mg2+ or Mn2+ is required for catalysis. Although an intramolecular disulfide bridge is present in Rhuk, disulfide formation is not important to the regulation of RhuK enzymatic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases. 440
26804 198350 cd07772 FGGY_NaCK_like Novosphingobium aromaticivorans carbohydrate kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of uncharacterized bacterial proteins with similarity to carbohydrate kinase from Novosphingobium aromaticivorans (NaCK). These proteins may catalyze the transfer of a phosphate group from ATP to their carbohydrate substrates. They belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 419
26805 198351 cd07773 FGGY_FK L-fuculose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial L-fuculose kinases (FK, also known as fuculokinase, EC 2.7.1.51), which catalyze the ATP-dependent phosphorylation of L-fuculose to produce L-fuculose-1-phosphate and ADP. The presence of Mg2+ or Mn2+ is required for enzymatic activity. FKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 448
26806 198352 cd07774 FGGY_1 uncharacterized subgroup; belongs to the FGGY family of carbohydrate kinases. This subfamily is composed of uncharacterized carbohydrate kinases. They are sequence homologous to bacterial glycerol kinase and have been classified as members of the FGGY family of carbohydrate kinases. The monomers of FGGY proteins contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 430
26807 198353 cd07775 FGGY_AI-2K Autoinducer-2 kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial autoinducer-2 (AI-2) kinases and similar proteins. AI-2 is a small chemical quorum-sensing signal involved in interspecies communication in bacteria. Cytoplasmic autoinducer-2 kinase, encoded by the lsrK gene from Salmonella enterica serovar Typhimurium lsr (luxS regulated) operon, is the prototypical member of this subfamily. AI-2 kinase catalyzes the phosphorylation of intracellular AI-2 to phospho-AI-2, which leads to the inactivation of lsrR, the repressor of the lsr operon. Members of this family are homologs of glycerol kinase-like proteins and belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 452
26808 212660 cd07776 FGGY_D-XK_euk eukaryotic D-xylulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of eukaryotic D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. They belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subfamily are similar to bacterial D-XKs, which exist as dimers with active sites that lie at the interface between two large domains. The presence of Mg2+ or Mn2+ is required for catalytic activity. 480
26809 212661 cd07777 FGGY_SHK_like sedoheptulokinase-like proteins; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of uncharacterized bacterial and eukaryotic proteins with similarity to human sedoheptulokinase (SHK, also known as D-altro-heptulose or heptulokinase, EC 2.7.1.14) encoded by the carbohydrate kinase-like (CARKL/SHPK) gene. SHK catalyzes the ATP-dependent phosphorylation of sedoheptulose to produce sedoheptulose 7-phosphate and ADP. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 448
26810 198356 cd07778 FGGY_L-RBK_like L-ribulokinase-like proteins; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of a group of putative bacterial L-ribulokinases (RBK; EC 2.7.1.16) and similar proteins. L-RBK catalyzes the MgATP-dependent phosphorylation of a variety of sugar substrates. Members of this subfamily belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 466
26811 212662 cd07779 FGGY_ygcE_like uncharacterized ygcE-like proteins. This subfamily consists of uncharacterized hypothetical bacterial proteins with similarity to Escherichia coli sugar kinase ygcE, whose functional roles are not yet clear. Escherichia coli ygcE is recognized by this model, but is not present in the alignment as it contains a deletion relative to other members of the group. These proteins belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 488
26812 198358 cd07781 FGGY_RBK Ribulokinases; belongs to the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial ribulokinases (RBK) which catalyze the MgATP-dependent phosphorylation of L(or D)-ribulose to produce L(or D)-ribulose 5-phosphate and ADP. RBK also phosphorylates a variety of other sugar substrates including ribitol and arabitol. The reason why L-RBK can phosphorylate so many different substrates is not yet clear. The presence of Mg2+ is required for catalytic activity. This group belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 498
26813 212663 cd07782 FGGY_YpCarbK_like Yersinia pseudotuberculosis carbohydrate kinase-like subgroup; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of the uncharacterized Yersinia pseudotuberculosis carbohydrate kinase that has been named glycerol/xylulose kinase and similar uncharacterized proteins from bacteria and eukaryota. Carbohydrate kinases catalyze the ATP-dependent phosphorylation of their carbohydrate substrate to produce phosphorylated sugar and ADP. The presence of Mg2+ is required for catalytic activity. This subgroup shows high homology to characterized ribulokinases and belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 536
26814 198360 cd07783 FGGY_CarbK-RPE_like Carbohydrate kinase and ribulose-phosphate 3-epimerase fusion proteins-like; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized proteins with similarity to carbohydrate kinases. Some members are carbohydrate kinase and ribulose-phosphate 3-epimerase fusion proteins. Carbohydrate kinases catalyze the ATP-dependent phosphorylation of their carbohydrate substrate to produce phosphorylated sugar and ADP. The presence of Mg2+ is required for catalytic activity. This subgroup shows high homology to characterized ribulokinases and belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 484
26815 198361 cd07786 FGGY_EcGK_like Escherichia coli glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of mostly bacterial and archaeal glycerol kinases (GK), including the well characterized proteins from Escherichia coli (EcGK), Thermococcus kodakaraensis (TkGK), and Enterococcus casseliflavus (EnGK). GKs contain two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The high affinity ATP binding site of EcGK is created only by a substrate-induced conformational change, which is initiated by protein-protein interactions through complex formation with enzyme IIAGlc (also known as IIIGlc), the glucose-specific phosphocarrier protein of the phosphotransferase system (PTS). EcGK exists in a dimer-tetramer equilibrium. IIAGlc binds to both EcGK dimer and tetramer, and inhibits the uptake and subsequent metabolism of glycerol and maltose. Another well-known allosteric regulator of EcGK is fructose 1,6-bisphosphate (FBP), which binds to the EcGK tetramer and plays an essential role in the stabilization of the inactive tetrameric form. EcGK requires Mg2+ for its enzymatic activity. Members in this subgroup belong to the FGGY family of carbohydrate kinases. 486
26816 198362 cd07789 FGGY_CsGK_like Cellulomonas sp. glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of bacterial glycerol kinases (GK) with similarity to Cellulomonas sp. glycerol kinase (CsGK). CsGK might exist as a dimer. Its monomer is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The regulation of the catalytic activity of this group has not yet been examined. Members in this subgroup belong to the FGGY family of carbohydrate kinases. 495
26817 198363 cd07791 FGGY_GK2_bacteria bacterial glycerol kinase 2-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of putative bacterial glycerol kinases (GK), which may be coded by the GK-like gene, GK2. Sequence comparison shows members in this CD are homologs of Escherichia coli GK. They retain all functionally important residues, and may catalyze the Mg-ATP dependent phosphorylation of glycerol to yield glycerol 3-phosphate (G3P). GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 484
26818 212664 cd07792 FGGY_GK1-3_metazoa Metazoan glycerol kinase 1 and 3-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of metazoan glycerol kinases (GKs), coded by X chromosome-linked GK genes, and glycerol kinase (GK)-like proteins, coded by autosomal testis-specific GK-like genes (GK-like genes, GK1 and GK3). Sequence comparison shows that metazoan GKs and GK-like proteins in this family are closely related to the bacterial GKs, which catalyze the Mg-ATP dependent phosphorylation of glycerol to yield glycerol 3-phosphate (G3P). The metazoan GKs do have GK enzymatic activity. However, the GK-like metazoan proteins do not exhibit GK activity and their biological functions are not yet clear. Some of them lack important functional residues involved in the binding of ADP and Mg2+, which may result in the loss of GK catalytic function. Others that have conserved catalytic residues have lost their GK activity as well; the reason remains unclear. It has been suggested the conserved catalytic residues might facilitate them performing a distinct function. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 504
26819 212665 cd07793 FGGY_GK5_metazoa metazoan glycerol kinase 5-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of metazoan putative glycerol kinases (GK), which may be coded by the GK-like gene, GK5. Sequence comparison shows members of this group are homologs of bacterial GKs, and they retain all functionally important residues. However, GK-like proteins in this family do not have detectable GK activity. The reason remains unclear. It has been suggested that the conserved catalytic residues might facilitate them performing a distinct function. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 504
26820 198366 cd07794 FGGY_GK_like_proteobact Proteobacterial glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of proteobacterial glycerol kinase (GK)-like proteins, including the glycerol kinase from Pseudomonas aeruginosa. Most bacteria, such as Escherichia coli, take up glycerol passively by facilitated diffusion. In contrast, P. aeruginosa may also utilize a binding protein-dependent active transport system to mediate glycerol transportation. The glycerol kinase subsequently phosphorylates the intracellular glycerol to glycerol 3-phosphate (G3P). GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 470
26821 198367 cd07795 FGGY_ScGut1p_like Saccharomyces cerevisiae Gut1p and related proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of fungal glycerol kinases (GK), including Saccharomyces cerevisiae Gut1p/YHL032Cp, which phosphorylates glycerol to glycerol-3-phosphate in the cytosol. Glycerol utilization has been considered as the sole source of carbon and energy in S. cerevisiae, and is mediated by glycerol kinase and glycerol 3-phosphate dehydrogenase, which is encoded by the GUT2 gene. Members in this family show high similarity to their prokaryotic and eukaryotic homologs. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 496
26822 198368 cd07796 FGGY_NHO1_plant Arabidopsis NHO1 and related proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup includes Arabidopsis NHO1 (also known as NONHOST1, or noh-host resistant 1) and other putative plant glycerol kinases, which share strong homology with glycerol kinases from bacteria, fungi, and animals. Nonhost resistance of plants refers to the phenomenon observed when all members of a plant species are typically resistant to a specific parasite. NHO1 is required for nonspecific resistance to nonhost Pseudomonas bacteria, it is also required for resistance to the fungal pathogen Botrytis cinerea. This subgroup belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 503
26823 198369 cd07798 FGGY_AI-2K_like Autoinducer-2 kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup consists of uncharacterized hypothetical bacterial proteins with similarity to bacterial autoinducer-2 (AI-2) kinases, which catalyze the phosphorylation of intracellular AI-2 to phospho-AI-2, leading to the inactivation of lsrR, the repressor of the lsr operon. Members of this subgroup belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 437
26824 212666 cd07802 FGGY_L-XK L-xylulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial L-xylulose kinases (L-XK, also known as L-xylulokinase; EC 2.7.1.53), which catalyze the ATP-dependent phosphorylation of L-xylulose to produce L-xylulose 5-phosphate and ADP. The presence of Mg2+ might be required for catalytic activity. Some uncharacterized sequences are also included in this subfamily. L-XKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 447
26825 198371 cd07803 FGGY_D-XK D-xylulose kinases; a subgroup of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. The prototypical member of this subfamily is Escherichia coli xylulokinase (EcXK), which exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. XKs do not have any known allosteric regulators, and they may have weak but significant activity in the absence of substrate. The presence of Mg2+ or Mn2+ is required for catalytic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases. 482
26826 198372 cd07804 FGGY_XK_like_1 uncharacterized xylulose kinase-like proteins; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized bacterial and archaeal xylulose kinase-like proteins with similarity to bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. The presence of Mg2+ or Mn2+ is required for catalytic activity. D-XK exists as a dimer with an active site that lies at the interface between the N- and C-terminal domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 492
26827 198373 cd07805 FGGY_XK_like_2 uncharacterized xylulose kinase-like proteins; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized proteins with similarity to bacterial D-Xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. The presence of Mg2+ or Mn2+ is required for catalytic activity. D-XK exists as a dimer with an active site that lies at the interface between the N- and C-terminal domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 514
26828 198374 cd07808 FGGY_D-XK_EcXK-like Escherichia coli xylulokinase-like D-xylulose kinases; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. D-xylulose has been used as a source of carbon and energy by a variety of microorganisms. Some uncharacterized sequences are also included in this subgroup. The prototypical member of this CD is Escherichia coli xylulokinase (EcXK), which exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ is required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 482
26829 198375 cd07809 FGGY_D-XK_1 D-xylulose kinases, subgroup 1; members of the FGGY family of carbohydrate kinases. This subgroup is composed of D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17) from bacteria and eukaryota. They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 487
26830 198376 cd07810 FGGY_D-XK_2 D-xylulose kinases, subgroup 2; members of the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17). They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 490
26831 198377 cd07811 FGGY_D-XK_3 D-xylulose kinases, subgroup 3; members of the FGGY family of carbohydrate kinases. This subgroup is composed of proteobacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17). They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases. 493
26832 176854 cd07812 SRPBCC START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 141
26833 176855 cd07813 COQ10p_like Coenzyme Q-binding protein COQ10p and similar proteins. Coenzyme Q-binding protein COQ10p and similar proteins. COQ10p is a hydrophobic protein located in the inner membrane of mitochondria that binds coenzyme Q (CoQ), also called ubiquinone, which is an essential electron carrier of the respiratory chain. Deletion of the gene encoding COQ10p (COQ10 or YOL008W) in Saccharomyces cerevisiae results in respiratory defect because of the inability to oxidize NADH and succinate. COQ10p may function in the delivery of CoQ (Q6 in budding yeast) to its proper location for electron transport. The human homolog, called Q-binding protein COQ10 homolog A (COQ10A), is able to fully complement for the absence of COQ10p in fission yeast. Human COQ10A also has a splice variant COQ10B. COQ10p belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 138
26834 176856 cd07814 SRPBCC_CalC_Aha1-like Putative hydrophobic ligand-binding SRPBCC domain of Micromonospora echinospora CalC, human Aha1, and related proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Micromonospora echinospora CalC, human Aha1, and related proteins. Proteins in this group belong to the SRPBCC domain superfamily of proteins, which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. MeCalC confers resistance to the enediyne, calicheamicin gamma 1 (CLM), by a self sacrificing mechanism which results in inactivation of both CalC and the highly reactive diradical enediyne species. MeCalC can also inactivate two other enediynes, shishijimicin and namenamicin. A crucial Gly of the MeCalC CLM resistance mechanism is not conserved in this subgroup. This family also includes the C-terminal, Bet v1-like domain of Aha1, one of several co-chaperones, which regulate the dimeric chaperone Hsp90. Aha1 promotes dimerization of the N-terminal domains of Hsp90, and stimulates its low intrinsic ATPase activity, and may regulate the dwell time of Hsp90 with client proteins. Aha1 can act as either a positive or negative regulator of chaperone-dependent activation, depending on the client protein, but the mechanisms by which these opposing functions are achieved are unclear. Aha1 is upregulated in a number of tumor lines co-incident with the activation of several signaling kinases. 139
26835 176857 cd07815 SRPBCC_PITP Lipid-binding SRPBCC domain of Class I and Class II Phosphatidylinositol Transfer Proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of the phosphatidylinositol transfer protein (PITP) family of lipid transfer proteins. This family of proteins includes Class 1 PITPs (PITPNA/PITPalpha and PITPNB/PITPbeta, Drosophila vibrator and related proteins), Class IIA PITPs (PITPNM1/PITPalphaI/Nir2, PITPNM2/PITPalphaII/Nir3, Drosophila RdgB, and related proteins), and Class IIB PITPs (PITPNC1/RdgBbeta and related proteins). The PITP family belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Class III PITPs, exemplified by the Sec14p family, are found in yeast and plants but are unrelated in sequence and structure to Class I and II PITPs and belong to a different superfamily. 251
26836 176858 cd07816 Bet_v1-like Ligand-binding bet_v_1 domain of major pollen allergen of white birch (Betula verrucosa), Bet v 1, and related proteins. This family includes the ligand binding domain of Bet v 1 (the major pollen allergen of white birch, Betula verrucosa) and related proteins. In addition to birch Bet v 1, this family includes other plant intracellular pathogenesis-related class 10 (PR-10) proteins, norcoclaurine synthases (NCSs), cytokinin binding proteins (CSBPs), major latex proteins (MLPs), and ripening-related proteins. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Members of this family bind a diverse range of ligands. Bet v 1 can bind brassinosteroids, cytokinins, flavonoids and fatty acids. Hyp-1, a PR-10 from Hypericum perforatum/St. John's wort, catalyzes the condensation of two molecules of emodin to the bioactive naphthodianthrone hypericin. NCSs catalyze the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. The role of MLPs is unclear; however, they are associated with fruit and flower development and with pathogen defense responses. A number of PR-10 proteins in this subgroup, including Bet v 1, have in vitro RNase activity, the biological significance of which is unclear. Bet v 1 family proteins have a conserved glycine-rich P (phosphate-binding)-loop proximal to the entrance of the ligand-binding pocket. However, its conformation differs from that of the canonical P-loop structure found in nucleotide-binding proteins. Several PR-10 members including Bet v 1 are allergenic. Cross-reactivity of Bet v 1 with homologs from plant foods results in birch-fruit syndrome. 148
26837 176859 cd07817 SRPBCC_8 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 139
26838 176860 cd07818 SRPBCC_1 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 150
26839 176861 cd07819 SRPBCC_2 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 140
26840 176862 cd07820 SRPBCC_3 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 137
26841 176863 cd07821 PYR_PYL_RCAR_like Pyrabactin resistance 1 (PYR1), PYR1-like (PYL), regulatory component of abscisic acid receptors (RCARs), and related proteins. The PYR/PYL/RCAR-like family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. PYR/PYL/RCAR plant proteins are receptors involved in signal transduction. They bind abscisic acid (ABA) and mediate its signaling. ABA is a vital plant hormone, which regulates plant growth, development, and response to environmental stresses. Upon binding ABA, these plant proteins interact with a type 2C protein phosphatase (PP2C), such as ABI1 and ABI2, and inhibit their activity. When ABA is bound, a loop (designated the gate/CL2 loop) closes over the ligand binding pocket, resulting in the weakening of the inactive PYL dimer and facilitating type 2C protein phosphatase binding. In the ABA:PYL1:ABI1 complex, the gate blocks substrate access to the phosphatase active site. A conserved Trp from PP2C inserts into PYL to lock the receptor in a closed formation. This group also contains Methylobacterium extorquens AM1 MxaD. The mxaD gene is located within the mxaFJGIR(S)ACKLDEHB cluster which encodes proteins involved in methanol oxidation. MxaD may participate in the periplasmic electron transport chain for oxidation of methanol. Mutants lacking MxaD exhibit a reduced growth on methanol, and a lower rate of respiration with methanol. 140
26842 176864 cd07822 SRPBCC_4 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 141
26843 176865 cd07823 SRPBCC_5 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 146
26844 176866 cd07824 SRPBCC_6 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 146
26845 176867 cd07825 SRPBCC_7 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 144
26846 176868 cd07826 SRPBCC_CalC_Aha1-like_9 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 142
26847 143640 cd07827 RHD-n N-terminal sub-domain of the Rel homology domain (RHD). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel). 174
26848 381185 cd07828 lipocalin_heme-bd-THAP4-like heme-binding beta-barrel domain of human THAP4, Arabidopsis thaliana nitrobindin, and similar proteins. Proteins in this subfamily use a beta-barrel domain to bind ferric heme. This group also includes the beta-barrel domain of human THAP domain containing 4 (THAP4). The THAP domain is found in proteins involved in transcriptional regulation, cell-cycle control, apoptosis and chromatin modification. Arabidopsis thaliana nitrobindin may reversibly bind nitric oxide (NO) and be involved in NO transport. It also includes the beta-barrel domain of Caenorhabditis elegans protein male abnormal 7 (Mab-7) which plays an important role in determining body shape and sensory ray morphology. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 150
26849 270823 cd07829 STKc_CDK_like Catalytic domain of Cyclin-Dependent protein Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. CDKs are partly regulated by their subcellular localization, which defines substrate phosphorylation and the resulting specific function. CDK1, CDK2, CDK4, and CDK6 have well-defined functions in the cell cycle, such as the regulation of the early G1 phase by CDK4 or CDK6, the G1/S phase transition by CDK2, or the entry into mitosis by CDK1. They also exhibit overlapping cyclin specificity and functions in certain conditions. Knockout mice with a single CDK deleted remain viable with specific phenotypes, showing that some CDKs can compensate for each other. For example, CDK4 can compensate for the loss of CDK6; however, double knockout mice with both CDK4 and CDK6 deleted die in utero. CDK8 and CDK9 are mainly involved in transcription while CDK5 is implicated in neuronal function. CDK7 plays essential roles both in the cell cycle as a CDK-Activating Kinase (CAK) and in transcription as a component of the general transcription factor TFIIH. The CDK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 282
26850 270824 cd07830 STKc_MAK_like Catalytic domain of Male germ cell-Associated Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of human MAK and MAK-related kinase (MRK), Saccharomyces cerevisiae Ime2p, Schizosaccharomyces pombe Mei4-dependent protein 3 (Mde3) and Pit1, Caenorhabditis elegans dyf-5, Arabidopsis thaliana MHK, and similar proteins. These proteins play important roles during meiosis. MAK is highly expressed in testicular cells specifically in the meiotic phase, but is not essential for spermatogenesis and fertility. It functions as a coactivator of the androgen receptor in prostate cells. MRK, also called Intestinal Cell Kinase (ICK), is expressed ubiquitously, with highest expression in the ovary and uterus. A missense mutation in MRK causes endocrine-cerebro-osteodysplasia, suggesting that this protein plays an important role in the development of many organs. MAK and MRK may be involved in regulating cell cycle and cell fate. Ime2p is a meiosis-specific kinase that is important during meiotic initiation and during the later stages of meiosis. Mde3 functions downstream of the transcription factor Mei-4 which is essential for meiotic prophase I. The MAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
26851 270825 cd07831 STKc_MOK Catalytic domain of the Serine/Threonine Kinase, MAPK/MAK/MRK Overlapping Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MOK, also called Renal tumor antigen 1 (RAGE-1), is widely expressed and is enriched in testis, kidney, lung, and brain. It is expressed in approximately 50% of renal cell carcinomas (RCC) and is a potential target for immunotherapy. MOK is stabilized by its association with the HSP90 molecular chaperone. It is induced by the transcription factor Cdx2 and may be involved in regulating intestinal epithelial development and differentiation. The MOK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 282
26852 270826 cd07832 STKc_CCRK Catalytic domain of the Serine/Threonine Kinase, Cell Cycle-Related Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CCRK was previously called p42. It is a Cyclin-Dependent Kinase (CDK)-Activating Kinase (CAK) which is essential for the activation of CDK2. It is indispensable for cell growth and has been implicated in the progression of glioblastoma multiforme. In the heart, a splice variant of CCRK with a different C-terminal half is expressed; this variant promotes cardiac cell growth and survival and is significantly down-regulated during the development of heart failure. The CCRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
26853 270827 cd07833 STKc_CDKL Catalytic domain of Cyclin-Dependent protein Kinase Like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDKL1-5 and similar proteins. Some CDKLs, like CDKL1 and CDKL3, may be implicated in transformation and others, like CDKL3 and CDKL5, are associated with mental retardation when impaired. CDKL2 plays a role in learning and memory. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
26854 270828 cd07834 STKc_MAPK Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPKs serve as important mediators of cellular responses to extracellular signals. They control critical cellular functions including differentiation, proliferation, migration, and apoptosis. They are also implicated in the pathogenesis of many diseases including multiple types of cancer, stroke, diabetes, and chronic inflammation. Typical MAPK pathways involve a triple kinase core cascade comprising the MAPK, which is phosphorylated and activated by a MAPK kinase (MAP2K or MKK), which itself is phosphorylated and activated by a MAPK kinase kinase (MAP3K or MKKK). Each cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. There are three typical MAPK subfamilies: Extracellular signal-Regulated Kinase (ERK), c-Jun N-terminal Kinase (JNK), and p38. Some MAPKs are atypical in that they are not regulated by MAP2Ks. These include MAPK4, MAPK6, NLK, and ERK7. The MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 329
26855 270829 cd07835 STKc_CDK1_CdkB_like Catalytic domain of Cyclin-Dependent protein Kinase 1-like Serine/Threonine Kinases and of Plant B-type Cyclin-Dependent protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK1, CDK2, and CDK3. CDK1 is also called Cell division control protein 2 (Cdc2) or p34 protein kinase, and is regulated by cyclins A, B, and E. The CDK1/cyclin A complex controls G2 phase entry and progression while the CDK1/cyclin B complex is critical for G2 to M phase transition. CDK2 is regulated by cyclin E or cyclin A. Upon activation by cyclin E, it phosphorylates the retinoblastoma (pRb) protein which activates E2F mediated transcription and allows cells to move into S phase. The CDK2/cyclin A complex plays a role in regulating DNA replication. Studies in knockout mice revealed that CDK1 can compensate for the loss of the cdk2 gene as it can also bind cyclin E and drive G1 to S phase transition. CDK3 is regulated by cyclin C and it phosphorylates pRB specifically during the G0/G1 transition. This phosphorylation is required for cells to exit G0 efficiently and enter the G1 phase. The plant-specific B-type CDKs are expressed from the late S to the M phase of the cell cycle. They are characterized by the cyclin binding motif PPT[A/T]LRE. They play a role in controlling mitosis and integrating developmental pathways, such as stomata and leaf development. CdkB has been shown to associate with both cyclin B, which controls G2/M transition, and cyclin D, which acts as a mediator in linking extracellular signals to the cell cycle. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
26856 143341 cd07836 STKc_Pho85 Catalytic domain of the Serine/Threonine Kinase, Fungal Cyclin-Dependent protein Kinase Pho85. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Pho85 is a multifunctional CDK in yeast. It is regulated by 10 different cyclins (Pcls) and plays a role in G1 progression, cell polarity, phosphate and glycogen metabolism, gene expression, and in signaling changes in the environment. It is not essential for yeast viability and is the functional homolog of mammalian CDK5, which plays a role in central nervous system development. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The Pho85 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
26857 270830 cd07837 STKc_CdkB_plant Catalytic domain of the Serine/Threonine Kinase, Plant B-type Cyclin-Dependent protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The plant-specific B-type CDKs are expressed from the late S to the M phase of the cell cycle. They are characterized by the cyclin binding motif PPT[A/T]LRE. They play a role in controlling mitosis and integrating developmental pathways, such as stomata and leaf development. CdkB has been shown to associate with both cyclin B, which controls G2/M transition, and cyclin D, which acts as a mediator in linking extracellular signals to the cell cycle. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CdkB subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 294
26858 270831 cd07838 STKc_CDK4_6_like Catalytic domain of Cyclin-Dependent protein Kinase 4 and 6-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK4 and CDK6 partner with D-type cyclins to regulate the early G1 phase of the cell cycle. They are the first kinases activated by mitogenic signals to release cells from the G0 arrested state. CDK4 and CDK6 are both expressed ubiquitously, associate with all three D cyclins (D1, D2 and D3), and phosphorylate the retinoblastoma (pRb) protein. They are also regulated by the INK4 family of inhibitors which associate with either the CDK alone or the CDK/cyclin complex. CDK4 and CDK6 show differences in subcellular localization, sensitivity to some inhibitors, timing in activation, tumor selectivity, and possibly substrate profiles. Although CDK4 and CDK6 seem to show some redundancy, they also have discrete, nonoverlapping functions. CDK6 plays an important role in cell differentiation. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK4/6-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
26859 143344 cd07839 STKc_CDK5 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK5 is unusual in that it is regulated by non-cyclin proteins, p35 and p39. It is highly expressed in the nervous system and is critical in normal neural development and function. It plays a role in neuronal migration and differentiation, and is also important in synaptic plasticity and learning. CDK5 also participates in protecting against cell death and promoting angiogenesis. Impaired CDK5 activity is implicated in Alzheimer's disease, amyotrophic lateral sclerosis, Parkinson's disease, Huntington's disease and acute neuronal injury. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
26860 270832 cd07840 STKc_CDK9_like Catalytic domain of Cyclin-Dependent protein Kinase 9-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK9 and CDK12 from higher eukaryotes, yeast BUR1, C-type plant CDKs (CdkC), and similar proteins. CDK9, BUR1, and CdkC are functionally equivalent. They act as a kinase for the C-terminal domain of RNA polymerase II and participate in regulating multiple steps of gene expression including transcription elongation and RNA processing. CDK9 and CdkC associate with T-type cyclins while BUR1 associates with the cyclin BUR2. CDK12 is a unique CDK that contains an arginine/serine-rich (RS) domain, which is predominantly found in splicing factors. CDK12 interacts with cyclins L1 and L2, and participates in regulating transcription and alternative splicing. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK9-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
26861 270833 cd07841 STKc_CDK7 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK7 plays essential roles in the cell cycle and in transcription. It associates with cyclin H and MAT1 and acts as a CDK-Activating Kinase (CAK) by phosphorylating and activating cell cycle CDKs (CDK1/2/4/6). In the brain, it activates CDK5. CDK7 is also a component of the general transcription factor TFIIH, which phosphorylates the C-terminal domain (CTD) of RNA polymerase II when it is bound with unphosphorylated DNA, as present in the pre-initiation complex. Following phosphorylation, the CTD dissociates from the DNA which allows transcription initiation. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 298
26862 270834 cd07842 STKc_CDK8_like Catalytic domain of Cyclin-Dependent protein Kinase 8-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK8, CDC2L6, and similar proteins. CDK8 functions as a negative or positive regulator of transcription, depending on the scenario. Together with its regulator, cyclin C, it reversibly associates with the multi-subunit core Mediator complex, a cofactor that is involved in regulating RNA polymerase II-dependent transcription. CDC2L6 also associates with Mediator in complexes lacking CDK8. In VP16-dependent transcriptional activation, CDK8 and CDC2L6 exert opposing effects by positive and negative regulation, respectively, in similar conditions. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK8-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 316
26863 173741 cd07843 STKc_CDC2L1 Catalytic domain of the Serine/Threonine Kinase, Cell Division Cycle 2-like 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDC2L1, also called PITSLRE, exists in different isoforms which are named using the alias CDK11(p). The CDC2L1 gene produces two protein products, CDK11(p110) and CDK11(p58). CDC2L1 is also represented by the caspase-processed CDK11(p46). CDK11(p110), the major isoform, associates with cyclin L and is expressed throughout the cell cycle. It is involved in RNA processing and the regulation of transcription. CDK11(p58) associates with cyclin D3 and is expressed during the G2/M phase of the cell cycle. It plays roles in spindle morphogenesis, centrosome maturation, sister chromatid cohesion, and the completion of mitosis. CDK11(p46) is formed from the larger isoforms by caspases during TNFalpha- and Fas-induced apoptosis. It functions as a downstream effector kinase in apoptotic signaling pathways and interacts with eukaryotic initiation factor 3f (eIF3f), p21-activated kinase (PAK1), and Ran-binding protein (RanBPM). CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDC2L1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 293
26864 270835 cd07844 STKc_PCTAIRE_like Catalytic domain of PCTAIRE-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-like proteins show unusual expression patterns with high levels in post-mitotic tissues, suggesting that they may be involved in regulating post-mitotic cellular events. They share sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The association of PCTAIRE-like proteins with cyclins has not been widely studied, although PFTAIRE-1 has been shown to function as a CDK which is regulated by cyclin D3 as well as the membrane-associated cyclin Y. The PCTAIRE-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
26865 173742 cd07845 STKc_CDK10 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 10. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK10, also called PISSLRE, is essential for cell growth and proliferation, and acts through the G2/M phase of the cell cycle. CDK10 has also been identified as an important factor in endocrine therapy resistance in breast cancer. CDK10 silencing increases the transcription of c-RAF and the activation of the p42/p44 MAPK pathway, which leads to antiestrogen resistance. Patients who express low levels of CDK10 relapse early on tamoxifen. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK10 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 309
26866 270836 cd07846 STKc_CDKL2_3 Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase Like 2 and 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKL2, also called p56 KKIAMRE, is expressed in testis, kidney, lung, and brain. It functions mainly in mature neurons and plays an important role in learning and memory. Inactivation of CDKL3, also called NKIAMRE (NKIATRE in rat), by translocation is associated with mild mental retardation. It has been reported that CDKL3 is lost in leukemic cells having a chromosome arm 5q deletion, and may contribute to the transformed phenotype. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
26867 270837 cd07847 STKc_CDKL1_4 Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase Like 1 and 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKL1, also called p42 KKIALRE, is a glial protein that is upregulated in gliosis. It is present in neuroblastoma and A431 human carcinoma cells, and may be implicated in neoplastic transformation. The function of CDKL4 is unknown. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL1/4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
26868 270838 cd07848 STKc_CDKL5 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase Like 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mutations in the gene encoding CDKL5, previously called STK9, are associated with early onset epilepsy and severe mental retardation [X-linked infantile spasm syndrome (ISSX) or West syndrome]. In addition, CDKL5 mutations also sometimes cause a phenotype similar to Rett syndrome (RTT), a progressive neurodevelopmental disorder. These pathogenic mutations are located in the N-terminal portion of the protein within the kinase domain. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
26869 270839 cd07849 STKc_ERK1_2_like Catalytic domain of Extracellular signal-Regulated Kinase 1 and 2-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the mitogen-activated protein kinases (MAPKs) ERK1, ERK2, baker's yeast Fus3, and similar proteins. MAPK pathways are important mediators of cellular responses to extracellular signals. ERK1/2 activation is preferentially by mitogenic factors, differentiation stimuli, and cytokines, through a kinase cascade involving the MAPK kinases MEK1/2 and a MAPK kinase kinase from the Raf family. ERK1/2 have numerous substrates, many of which are nuclear and participate in transcriptional regulation of many cellular processes. They regulate cell growth, cell proliferation, and cell cycle progression from G1 to S phase. Although the distinct roles of ERK1 and ERK2 have not been fully determined, it is known that ERK2 can maintain most functions in the absence of ERK1, and that the deletion of ERK2 is embryonically lethal. The MAPK, Fus3, regulates yeast mating processes including mating-specific gene expression, G1 arrest, mating projection, and cell fusion. This ERK1/2-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 336
26870 270840 cd07850 STKc_JNK Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. They are also essential regulators of physiological and pathological processes and are involved in the pathogenesis of several diseases such as diabetes, atherosclerosis, stroke, Parkinson's and Alzheimer's. Vertebrates harbor three different JNK genes (Jnk1, Jnk2, and Jnk3) that are alternatively spliced to produce at least 10 isoforms. JNKs are specifically activated by the MAPK kinases MKK4 and MKK7, which are in turn activated by upstream MAPK kinase kinases as a result of different stimuli including stresses such as ultraviolet (UV) irradiation, hyperosmolarity, heat shock, or cytokines. JNKs activate a large number of different substrates based on specific stimulus, cell type, and cellular condition, and may be implicated in seemingly contradictory functions. The JNK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 337
26871 143356 cd07851 STKc_p38 Catalytic domain of the Serine/Threonine Kinase, p38 Mitogen-Activated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38 kinases are mitogen-activated protein kinases (MAPKs), serving as important mediators of cellular responses to extracellular signals. They function in the regulation of the cell cycle, cell development, cell differentiation, senescence, tumorigenesis, apoptosis, pain development and pain progression, and immune responses. p38 kinases are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. p38 substrates include other protein kinases and factors that regulate transcription, nuclear export, mRNA stability and translation. p38 kinases are drug targets for the inflammatory diseases psoriasis, rheumatoid arthritis, and chronic pulmonary disease. Vertebrates contain four isoforms of p38, named alpha, beta, gamma, and delta, which show varying substrate specificity and expression patterns. p38alpha and p38beta are ubiquitously expressed, p38gamma is predominantly found in skeletal muscle, and p38delta is found in the heart, lung, testis, pancreas, and small intestine. The p38 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 343
26872 270841 cd07852 STKc_MAPK15-like Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase 15 and similar MAPKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Human MAPK15 is also called Extracellular signal Regulated Kinase 8 (ERK8) while the rat protein is called ERK7. ERK7 and ERK8 display both similar and different biochemical properties. They autophosphorylate and activate themselves and do not require upstream activating kinases. ERK7 is constitutively active and is not affected by extracellular stimuli whereas ERK8 shows low basal activity and is activated by DNA-damaging agents. ERK7 and ERK8 also have different substrate profiles. Genome analysis shows that they are orthologs with similar gene structures. ERK7 and ERK8 may be involved in the signaling of some nuclear receptor transcription factors. ERK7 regulates hormone-dependent degradation of estrogen receptor alpha while ERK8 down-regulates the transcriptional co-activation of androgen and glucocorticoid receptors. MAPKs are important mediators of cellular responses to extracellular signals. The MAPK15 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 337
26873 173748 cd07853 STKc_NLK Catalytic domain of the Serine/Threonine Kinase, Nemo-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NLK is an atypical mitogen-activated protein kinase (MAPK) that is not regulated by a MAPK kinase. It functions downstream of the MAPK kinase kinase Tak1, which also plays a role in activating the JNK and p38 MAPKs. The Tak1/NLK pathways are regulated by Wnts, a family of secreted proteins that is critical in the control of asymmetric division and cell polarity. NLK can phosphorylate transcription factors from the TCF/LEF family, inhibiting their ability to activate the transcription of target genes. In prostate cancer cells, NLK is involved in regulating androgen receptor-mediated transcription and its expression is altered during cancer progression. MAPKs are important mediators of cellular responses to extracellular signals. The NLK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 372
26874 143359 cd07854 STKc_MAPK4_6 Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinases 4 (also called ERK4) and 6 (also called ERK3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK4 (also called ERK4 or p63MAPK) and MAPK6 (also called ERK3 or p97MAPK) are atypical MAPKs that are not regulated by MAPK kinases. MAPK6 is expressed ubiquitously with highest amounts in brain and skeletal muscle. It may be involved in the control of cell differentiation by negatively regulating cell cycle progression in certain conditions. It may also play a role in glucose-induced insulin secretion. MAPK6 and MAPK4 cooperate to regulate the activity of MAPK-activated protein kinase 5 (MK5), leading to its relocation to the cytoplasm and exclusion from the nucleus. The MAPK6/MK5 and MAPK4/MK5 pathways may play critical roles in embryonic and post-natal development. MAPKs are important mediators of cellular responses to extracellular signals. The MAPK4/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 342
26875 270842 cd07855 STKc_ERK5 Catalytic domain of the Serine/Threonine Kinase, Extracellular signal-Regulated Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ERK5 (also called Big MAPK1 (BMK1) or MAPK7) has a unique C-terminal extension, making it approximately twice as big as other MAPKs. This extension contains transcriptional activation capability which is inhibited by the N-terminal half. ERK5 is activated in response to growth factors and stress by a cascade that leads to its phosphorylation by the MAP2K MEK5, which in turn is regulated by the MAP3Ks MEKK2 and MEKK3. Activated ERK5 phosphorylates its targets including myocyte enhancer factor 2 (MEF2), Sap1a, c-Myc, and RSK. It plays a role in EGF-induced cell proliferation during the G1/S phase transition. Studies on knockout mice revealed that ERK5 is essential for cardiovascular development and plays an important role in angiogenesis. It is also critical for neural differentiation and survival. The ERK5 pathway has been implicated in the pathogenesis of many diseases including cancer, cardiac hypertrophy, and atherosclerosis. MAPKs are important mediators of cellular responses to extracellular signals. The ERK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 336
26876 270843 cd07856 STKc_Sty1_Hog1 Catalytic domain of the Serine/Threonine Kinases, Fungal Mitogen-Activated Protein Kinases Sty1 and Hog1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPKs Sty1 from Schizosaccharomyces pombe, Hog1 from Saccharomyces cerevisiae, and similar proteins. Sty1 and Hog1 are stress-activated MAPKs that participate in transcriptional regulation in response to stress. Sty1 is activated in response to oxidative stress, osmotic stress, and UV radiation. It is regulated by the MAP2K Wis1, which is activated by the MAP3Ks Wis4 and Win1, which receive signals of the stress condition from membrane-spanning histidine kinases Mak1-3. Activated Sty1 stabilizes the Atf1 transcription factor and induces transcription of Atf1-dependent genes of the core environmental stress response. Hog1 is the key element in the high osmolarity glycerol (HOG) pathway and is activated upon hyperosmotic stress. Activated Hog1 accumulates in the nucleus and regulates stress-induced transcription. The HOG pathway is mediated by two transmembrane osmosensors, Sln1 and Sho1. MAPKs are important mediators of cellular responses to extracellular signals. The Sty1/Hog1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 328
26877 173750 cd07857 STKc_MPK1 Catalytic domain of the Serine/Threonine Kinase, Fungal Mitogen-Activated Protein Kinase MPK1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPKs MPK1 from Saccharomyces cerevisiae, Pmk1 from Schizosaccharomyces pombe, and similar proteins. MPK1 (also called Slt2) and Pmk1 (also called Spm1) are stress-activated MAPKs that regulate the cell wall integrity pathway, and are therefore important in the maintenance of cell shape, cell wall construction, morphogenesis, and ion homeostasis. MPK1 is activated in response to cell wall stress including heat stimulation, osmotic shock, UV irradiation, and any agents that interfere with cell wall biogenesis such as chitin antagonists, caffeine, or zymolyase. MPK1 is regulated by the MAP2Ks Mkk1/2, which are regulated by the MAP3K Bck1. Pmk1 is also activated by multiple stresses including elevated temperatures, hyper- or hypotonic stress, glucose deprivation, exposure to cell-wall damaging compounds, and oxidative stress. It is regulated by the MAP2K Pek1, which is regulated by the MAP3K Mkh1. MAPKs are important mediators of cellular responses to extracellular signals. The MPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 332
26878 143363 cd07858 STKc_TEY_MAPK Catalytic domain of the Serine/Threonine Kinases, Plant TEY Mitogen-Activated Protein Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Plant MAPKs are typed based on the conserved phosphorylation motif present in the activation loop, TEY and TDY. This subfamily represents the TEY subtype of plant MAPKs and is further subdivided into three groups (A, B, and C). Group A is represented by AtMPK3, AtMPK6, Nicotiana tabacum NTF4 (NtNTF4), among others. They are mostly involved in environmental and hormonal responses. AtMPK3 and AtMPK6 are also key regulators for stomatal development and patterning. Group B is represented by AtMPK4, AtMPK13, and NtNTF6, among others. They may be involved in both cell division and environmental stress response. AtMPK4 also participates in regulating innate immunity. Group C is represented by AtMPK1, AtMPK2, NtNTF3, Oryza sativa MAPK4 (OsMAPK4), among others. They may also be involved in stress responses. AtMPK1 and AtMPK2 are activated following mechanical injury and in the presence of stress chemicals such as jasmonic acid, hydrogen peroxide and abscisic acid. OsMAPK4 is also called OsMSRMK3 for Multiple Stress-Responsive MAPK3. In plants, MAPKs are associated with physiological, developmental, hormonal, and stress responses. Some plants show numerous gene duplications of MAPKs; Arabidopsis thaliana harbors at least 20 MAPKs, named AtMPK1-20. The TEY MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 337
26879 143364 cd07859 STKc_TDY_MAPK Catalytic domain of the Serine/Threonine Kinases, Plant TDY Mitogen-Activated Protein Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Plant MAPKs are typed based on the conserved phosphorylation motif present in the activation loop, TEY and TDY. This subfamily represents the TDY subtype and is composed of Group D plant MAPKs including Arabidopsis thaliana MPK18 (AtMPK18), Oryza sativa Blast- and Wound-induced MAPK1 (OsBWMK1), OsWJUMK1 (Wound- and JA-Uninducible MAPK1), Zea mays MPK6, and the Medicago sativa TDY1 gene product. OsBWMK1 enhances resistance to pathogenic infections. It mediates stress-activated defense responses by activating a transcription factor that affects the expression of stress-related genes. AtMPK18 is involved in microtubule-related functions. In plants, MAPKs are associated with physiological, developmental, hormonal, and stress responses. Some plants show numerous gene duplications of MAPKs; Arabidopsis thaliana harbors at least 20 MAPKs, named AtMPK1-20 while Oryza sativa contains at least 17 MAPKs. Arabidopsis thaliana contains more TEY-type MAPKs than TDY-type, whereas the reverse is true for Oryza sativa. The TDY MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 338
26880 270844 cd07860 STKc_CDK2_3 Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase 2 and 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK2 is regulated by cyclin E or cyclin A. Upon activation by cyclin E, it phosphorylates the retinoblastoma (pRb) protein which activates E2F mediated transcription and allows cells to move into S phase. The CDK2/cyclin A complex plays a role in regulating DNA replication. CDK2, together with CDK4, also regulates embryonic cell proliferation. Despite these important roles, mice deleted for the cdk2 gene are viable and normal except for being sterile. This may be due to compensation provided by CDK1 (also called Cdc2), which can also bind cyclin E and drive the G1 to S phase transition. CDK3 is regulated by cyclin C and it phosphorylates pRB specifically during the G0/G1 transition. This phosphorylation is required for cells to exit G0 efficiently and enter the G1 phase. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
26881 270845 cd07861 STKc_CDK1_euk Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 1 from higher eukaryotes. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK1 is also called Cell division control protein 2 (Cdc2) or p34 protein kinase, and is regulated by cyclins A, B, and E. The CDK1/cyclin A complex controls G2 phase entry and progression. CDK1/cyclin A2 has also been implicated as an important regulator of S phase events. The CDK1/cyclin B complex is critical for G2 to M phase transition. It induces mitosis by activating nuclear enzymes that regulate chromatin condensation, nuclear membrane degradation, mitosis-specific microtubule and cytoskeletal reorganization. CDK1 also associates with cyclin E and plays a role in the entry into S phase. CDK1 transcription is stable throughout the cell cycle but is modulated in some pathological conditions. It may play a role in regulating apoptosis under these conditions. In breast cancer cells, HER2 can mediate apoptosis by inactivating CDK1. Activation of CDK1 may contribute to HIV-1 induced apoptosis as well as neuronal apoptosis in neurodegenerative diseases. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
26882 270846 cd07862 STKc_CDK6 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK6 is regulated by D-type cyclins and INK4 inhibitors. It is active towards the retinoblastoma (pRb) protein, implicating it to function in regulating the early G1 phase of the cell cycle. It is expressed ubiquitously and is localized in the cytoplasm. It is also present in the ruffling edge of spreading fibroblasts and may play a role in cell spreading. It binds to the p21 inhibitor without any effect on its own activity and it is overexpressed in squamous cell carcinomas and neuroblastomas. CDK6 has also been shown to inhibit cell differentiation in many cell types. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
26883 143368 cd07863 STKc_CDK4 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK4 partners with all three D-type cyclins (D1, D2, and D3) and is also regulated by INK4 inhibitors. It is active towards the retinoblastoma (pRb) protein and plays a role in regulating the early G1 phase of the cell cycle. It is expressed ubiquitously and is localized in the nucleus. CDK4 also shows kinase activity towards Smad3, a signal transducer of TGF-beta signaling which modulates transcription and plays a role in cell proliferation and apoptosis. CDK4 is inhibited by the p21 inhibitor and is specifically mutated in human melanoma. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
26884 270847 cd07864 STKc_CDK12 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 12. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK12 is also called Cdc2-related protein kinase 7 (CRK7) or Cdc2-related kinase arginine/serine-rich (CrkRS). It is a unique CDK that contains an RS domain, which is predominantly found in splicing factors. CDK12 is widely expressed in tissues. It interacts with cyclins L1 and L2, and plays roles in regulating transcription and alternative splicing. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK12 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 302
26885 270848 cd07865 STKc_CDK9 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 9. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK9, together with a cyclin partner (cyclin T1, T2a, T2b, or K), is the main component of distinct positive transcription elongation factors (P-TEFb), which function as Ser2 C-terminal domain kinases of RNA polymerase II. P-TEFb participates in multiple steps of gene expression including transcription elongation, mRNA synthesis, processing, export, and translation. It also plays a role in mediating cytokine induced transcription networks such as IL6-induced STAT3 signaling. In addition, the CDK9/cyclin T2a complex promotes muscle differentiation and enhances the function of some myogenic regulatory factors. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK9 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 310
26886 270849 cd07866 STKc_BUR1 Catalytic domain of the Serine/Threonine Kinase, Fungal Cyclin-Dependent protein Kinase (CDK), Bypass UAS Requirement 1, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BUR1, also called SGV1, is a yeast CDK that is functionally equivalent to mammalian CDK9. It associates with the cyclin BUR2. BUR genes were originally identified in a genetic screen as factors involved in general transcription. The BUR1/BUR2 complex phosphorylates the C-terminal domain of RNA polymerase II. In addition, this complex regulates histone modification by phosphorylating Rad6 and mediating the association of the Paf1 complex with chromatin. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The BUR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 311
26887 270850 cd07867 STKc_CDC2L6 Catalytic domain of Serine/Threonine Kinase, Cell Division Cycle 2-like 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDC2L6 is also called CDK8-like and was previously referred to as CDK11. However, this is a confusing nomenclature as CDC2L6 is distinct from CDC2L1, which is represented by the two protein products from its gene, called CDK11(p110) and CDK11(p58), as well as the caspase-processed CDK11(p46). CDK11(p110), CDK11(p58), and CDK11(p46) do not belong to this subfamily. CDC2L6 is an associated protein of Mediator, a multiprotein complex that provides a platform to connect transcriptional and chromatin regulators and cofactors, in order to activate and mediate RNA polymerase II transcription. CDC2L6 is localized mainly in the nucleus and exerts an opposing effect to CDK8 in VP16-dependent transcriptional activation by being a negative regulator. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDC2L6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 318
26888 270851 cd07868 STKc_CDK8 Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK8 can act as a negative or positive regulator of transcription, depending on the scenario. Together with its regulator, cyclin C, it reversibly associates with the multi-subunit core Mediator complex, a cofactor that is involved in regulating RNA polymerase II (RNAP II)-dependent transcription. CDK8 phosphorylates cyclin H, a subunit of the general transcription factor TFIIH, which results in the inhibition of TFIIH-dependent phosphorylation of the C-terminal domain of RNAP II, facilitating the inhibition of transcription. It has also been shown to promote transcription by a mechanism that is likely to involve RNAP II phosphorylation. CDK8 also functions as a stimulus-specific positive coregulator of p53 transcriptional responses. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK8 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 333
26889 143374 cd07869 STKc_PFTAIRE1 Catalytic domain of the Serine/Threonine Kinase, PFTAIRE-1 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PFTAIRE-1 is widely expressed except in the spleen and thymus. It is highly expressed in the brain, heart, pancreas, testis, and ovary, and is localized in the cytoplasm. It is regulated by cyclin D3 and is inhibited by the p21 cell cycle inhibitor. It has also been shown to interact with the membrane-associated cyclin Y, which recruits the protein to the plasma membrane. PFTAIRE-1 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PFTAIRE-1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 303
26890 270852 cd07870 STKc_PFTAIRE2 Catalytic domain of the Serine/Threonine Kinase, PFTAIRE-2 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PFTAIRE-2 is also referred to as ALS2CR7 (amyotrophic lateral sclerosis 2 (juvenile) chromosome region candidate 7). It may be associated with amyotrophic lateral sclerosis 2 (ALS2), an autosomal recessive form of juvenile ALS. The function of PFTAIRE-2 is not yet known. It shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PFTAIRE-2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
26891 270853 cd07871 STKc_PCTAIRE3 Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-3 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-3 shows a restricted pattern of expression and is present in brain, kidney, and intestine. It is elevated in Alzheimer's disease (AD) and has been shown to associate with paired helical filaments (PHFs) and stimulate Tau phosphorylation. As AD progresses, phosphorylated Tau aggregates and forms PHFs, which leads to the formation of neurofibrillary tangles. In human glioma cells, PCTAIRE-3 induces cell cycle arrest and cell death. PCTAIRE-3 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
26892 143377 cd07872 STKc_PCTAIRE2 Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-2 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-2 is specifically expressed in neurons in the central nervous system, mainly in terminally differentiated neurons. It associates with Trap (Tudor repeat associator with PCTAIRE-2) and could play a role in regulating mitochondrial function in neurons. PCTAIRE-2 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 309
26893 270854 cd07873 STKc_PCTAIRE1 Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-1 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-1 is expressed ubiquitously and is localized in the cytoplasm. Its kinase activity is cell cycle dependent and peaks at the S and G2 phases. PCTAIRE-1 is highly expressed in the brain and may play a role in regulating neurite outgrowth. It can also associate with Trap (Tudor repeat associator with PCTAIRE-2), a physiological partner of PCTAIRE-2; with p11, a small dimeric protein with similarity to S100; and with 14-3-3 proteins, mediators of phosphorylation-dependent interactions in many different proteins. PCTAIRE-1 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 297
26894 143379 cd07874 STKc_JNK3 Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK3 is expressed primarily in the brain, and to a lesser extent in the heart and testis. Mice deficient in JNK3 are protected against kainic acid-induced seizures, stroke, sciatic axotomy neural death, and neuronal death due to NGF deprivation, oxidative stress, or exposure to beta-amyloid peptide. This suggests that JNK3 may play roles in the pathogenesis of these diseases. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 355
26895 143380 cd07875 STKc_JNK1 Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK1 is expressed in every cell and tissue type. It specifically binds with JAMP (JNK1-associated membrane protein), which regulates the duration of JNK1 activity in response to stimuli. Specific JNK1 substrates include Itch and SG10, which are implicated in Th2 responses and airway inflammation, and microtubule dynamics and axodendritic length, respectively. Mice deficient in JNK1 are protected against arthritis, obesity, type 2 diabetes, cardiac cell death, and non-alcoholic liver disease, suggesting that JNK1 may play roles in the pathogenesis of these diseases. Initially, it was thought that JNK1 and JNK2 were functionally redundant as mice deficient in either gene could survive but disruption of both genes resulted in lethality. However, recent studies have shown that JNK1 and JNK2 perform distinct functions through specific binding partners and substrates. JNKs are mitogen-activated protein kinases that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 364
26896 143381 cd07876 STKc_JNK2 Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK2 is expressed in every cell and tissue type. It is specifically translocated to the mitochondria during dopaminergic cell death. Specific substrates include the microtubule-associated proteins DCX and Tau, as well as TIF-IA which is involved in ribosomal RNA synthesis regulation. Mice deficient in Jnk2 show protection against arthritis, type 1 diabetes, atherosclerosis, abdominal aortic aneurysm, cardiac cell death, TNF-induced liver damage, and tumor growth, indicating that JNK2 may play roles in the pathogenesis of these diseases. Initially, it was thought that JNK1 and JNK2 were functionally redundant as mice deficient in either gene could survive but disruption of both genes resulted in lethality. However, recent studies have shown that JNK1 and JNK2 perform distinct functions through specific binding partners and substrates. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 359
26897 143382 cd07877 STKc_p38alpha Catalytic domain of the Serine/Threonine Kinase, p38alpha Mitogen-Activated Protein Kinase (also called MAPK14). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38alpha/MAPK14 is expressed in most tissues and is the major isoform involved in the immune and inflammatory response. It is the central p38 MAPK involved in myogenesis. It plays a role in regulating cell cycle check-point transition and promoting cell differentiation. p38alpha also regulates cell proliferation and death through crosstalk with the JNK pathway. Its substrates include MAPK activated protein kinase 2 (MK2), MK5, and the transcription factors ATF2 and Mitf. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 345
26898 143383 cd07878 STKc_p38beta Catalytic domain of the Serine/Threonine Kinase, p38beta Mitogen-Activated Protein Kinase (also called MAPK11). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38beta/MAPK11 is widely expressed in tissues and shows more similarity with p38alpha than with the other isoforms. Both are sensitive to pyridinylimidazoles and share some common substrates such as MAPK activated protein kinase 2 (MK2) and the transcription factors ATF2, c-Fos, and ELK-1. p38beta is involved in regulating the activation of the cyclooxygenase-2 promoter and the expression of TGFbeta-induced alpha-smooth muscle cell actin. p38 kinases are mitogen-activated protein kinases (MAPKs), serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 343
26899 143384 cd07879 STKc_p38delta Catalytic domain of the Serine/Threonine Kinase, p38delta Mitogen-Activated Protein Kinase (also called MAPK13). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38delta/MAPK13 is found in skeletal muscle, heart, lung, testis, pancreas, and small intestine. It regulates microtubule function by phosphorylating Tau. It activates the c-jun promoter and plays a role in G2 cell cycle arrest. It also controls the degradation of c-Myb, which is associated with myeloid leukemia and poor prognosis in colorectal cancer. p38delta is the main isoform involved in regulating the differentiation and apoptosis of keratinocytes. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38delta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 342
26900 143385 cd07880 STKc_p38gamma Catalytic domain of the Serine/Threonine Kinase, p38gamma Mitogen-Activated Protein Kinase (also called MAPK12). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38gamma/MAPK12 is predominantly expressed in skeletal muscle. Unlike p38alpha and p38beta, p38gamma is insensitive to pyridinylimidazoles. It displays an antagonizing function compared to p38alpha. p38gamma inhibits, while p38alpha stimulates, c-Jun phosphorylation and AP-1 mediated transcription. p38gamma also plays a role in the signaling between Ras and the estrogen receptor and has been implicated to increase cell invasion and breast cancer progression. In Xenopus, p38gamma is critical in the meiotic maturation of oocytes. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38gamma subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 343
26901 143641 cd07881 RHD-n_NFAT N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of activated T-cells (NFAT) proteins. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes that are mainly involved in cell-cell interaction. Upon de-phosphorylation of the nuclear localization signal, NFAT enters the nucleus and acts as a transcription factor; its export from the nucleus is triggered by phosphorylation via export kinases. NFATs play important roles in mediating the immune response, and are found in T cells, B Cells, NK cells, mast cells, and monocytes. NFATs are also found in various non-hematopoietic cell types, where they play roles in development. 175
26902 143642 cd07882 RHD-n_TonEBP N-terminal sub-domain of the Rel homology domain (RHD) of tonicity-responsive enhancer binding protein (TonEBP). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the tonicity-responsive enhancer binding protein (TonEBP), also called NFAT5. Mammalian TonEBP regulates the expression of genes in response to tonicity. It plays a pivotal role in urinary concentrating mechanisms in kidney medulla, by triggering the accumulation of osmolytes that enable renal medullary cells to tolerate high levels of urea and salt. 161
26903 143643 cd07883 RHD-n_NFkB N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of kappa light polypeptide gene enhancer in B-cells (NF-kappa B). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B1 and B2 families of transcription factors, also referred to as class I members of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. Family members include NF-kappa B1 and NF-kappa B2. NF-kappa B1 is commonly referred to as p105 or p50 (proteolytically processed form), while NF-kappa B2 is called p100 or p52 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). p105 and p100 may also act as I-kappa Bs due to their C-terminal ankyrin repeats. 197
26904 143644 cd07884 RHD-n_Relish N-terminal sub-domain of the Rel homology domain (RHD) of the arthropod protein Relish. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the arthropod Relish protein, in which the RHD domain co-occurs with C-terminal ankyrin repeats. Family members are sometimes referred to as p110 or p68 (proteolytically processed form). Relish is an NF-kappa B-like transcription factor, which plays a role in mediating innate immunity in Drosophila. It is activated via the Imd (immune deficiency) pathway, which triggers phosphorylation of Relish. IKK-dependent proteolytic cleavage of Relish (which involves Dredd) results in a smaller active form (without the C-terminal ankyrin repeats), which is transported into the nucleus and functions as a transactivator. 159
26905 143645 cd07885 RHD-n_RelA N-terminal sub-domain of the Rel homology domain (RHD) of RelA. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD domain of the RelA family of transcription factors, categorized as a class II member of the NF-kappa B family. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). RelA (also called p65) forms heterodimers with NF-kappa B1 (p50) and B2 (p52). RelA also forms homodimers. 169
26906 143646 cd07886 RHD-n_RelB N-terminal sub-domain of the Rel homology domain (RHD) of the reticuloendotheliosis viral oncogene homolog B (RelB) protein. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the RelB family of transcription factors, categorized as class II NF-kappa B family members. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). RelB is unable to homodimerize but is a potent transactivator in a heterodimer with NF-kappa B1 (p50) or B2 (p52). It is involved in the regulation of genes that play roles in inflammatory processes and the immune response. 172
26907 143647 cd07887 RHD-n_Dorsal_Dif N-terminal sub-domain of the Rel homology domain (RHD) of the arthropod protein Dorsal. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the arthropod Dorsal and Dif (Dorsal-related immunity factor), and similar proteins. Dorsal and Dif are Rel-like transcription factors, which play roles in mediating innate immunity in Drosophila. They are activated via the Toll pathway. Cytoplasmic Dorsal/Dif are inactivated via forming a complex with Cactus, the Drosophila homologue of mammalian I-kappa B proteins. In response to signals, Cactus is degraded and Dorsal/Dif can be transported into the nucleus, where they act as transcription factors. Dorsal is also an essential gene in establishing the proper dorsal/ventral polarity in the developing embryo. 173
26908 143579 cd07888 CRD_corin_2 One of two cysteine-rich domains of the corin protein, a type II transmembrane serine protease. The cysteine-rich domain (CRD) is an essential component of corin, a type II transmembrane serine protease which functions as the convertase of the pro-atrial natriuretic peptide (pro-ANP) in the heart. Corin contains two CRDs in its extracellular region, which play an important role in recognition of the physiological substrate, pro-ANP. This model characterizes the second (C-terminal) CRD. 122
26909 143628 cd07890 CYTH-like_AC_IV-like Adenylyl cyclase (AC) class IV-like, a subgroup of the CYTH-like superfamily. This subgroup contains class IV ACs and similar proteins. AC catalyzes the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. cAMP is a key signaling molecule which conveys a variety of signals in different cell types. In prokaryotes, cAMP is a catabolite derepression signal which triggers the expression of metabolic pathways including the lactose operon. Six non-homologous classes of ACs have been identified (I-VI). Class IV ACs are found in this group. In bacteria, the gene encoding Class IV AC has been designated cyaB and the protein as AC2. AC-IV occurs in addition to AC-I in bacterial pathogens such as Yersinia pestis (plague disease). The role of AC-IV is unknown but it has been speculated that it may be a factor in pathogenesis, perhaps providing cAMP for a secondary internal signaling function, or for secretion and uptake into host cells, where it may disrupt normal cellular processes. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. 169
26910 143629 cd07891 CYTH-like_CthTTM-like_1 CYTH-like Clostridium thermocellum TTM-like subgroup 1. This subgroup contains the triphosphate tunnel metalloenzyme (TTM) from Clostridium thermocellum (CthTTM) and similar proteins. These are found primarily in bacteria. CthTTM is a metal dependent tripolyphosphatase, nucleoside triphosphatase, and nucleoside tetraphosphatase. It hydrolyzes the beta-gamma phosphoanhydride linkage of triphosphate-containing substrates including tripolyphosphate, nucleoside triphosphates and nucleoside tetraphosphates. These substrates are hydrolyzed, releasing Pi. Mg++ or Mn++ are required for the enzyme's activity. CthTTM appears to have no adenylate cyclase activity. This subgroup consists chiefly of bacterial sequences. These enzymes are members of the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily, which have a unique active site located within an eight-stranded beta barrel. 148
26911 143630 cd07892 PolyPPase_VTC2-3_like Polyphosphate (polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2 and -3, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2, -3, and -4 contain polyP polymerase domains. S. cerevisiae VTC-2 and -3 belong to this subgroup. For VTC-4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. 303
26912 153435 cd07893 OBF_DNA_ligase The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 129
26913 185705 cd07894 Adenylation_RNA_ligase Adenylation domain of RNA circularization proteins. RNA circularization proteins are capable of circularizing RNA molecules in an ATP-dependent reaction. RNA circularization may protect RNA from exonuclease activity. This model comprises the adenylation domain, the minimal catalytic unit that is common to all members of the ATP-dependent DNA ligase family, and the carboxy-terminal extension of RNA circularization protein that serves as a dimerization module. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation of nicked nucleic acid substrates using the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. The adenylation domain binds ATP and contains many active site residues. 342
26914 185706 cd07895 Adenylation_mRNA_capping Adenylation domain of GTP-dependent mRNA capping enzymes. RNA capping enzymes transfer GMP from GTP to the 5'-diphosphate end of nascent mRNAs to form a G(5')ppp(5')RNA cap structure. The RNA cap is found only in eukarya. RNA capping is chemically analogous to the first two steps of polynucleotide ligation. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation of nicked nucleic acid substrates using the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. Structural studies reveal a shared structure for DNA ligases and capping enzymes, with a common catalytic core composed of an adenylation or nucleotidyltransferase domain and a C-terminal OB-fold domain containing conserved sequence motifs. The adenylation domain binds ATP and contains many active site residues. 215
26915 185707 cd07896 Adenylation_kDNA_ligase_like Adenylation domain of kDNA ligases and similar proteins. The mitochondrial DNA of parasitic protozoans is highly unusual. It is termed the kinetoplast DNA (kDNA) and consists of circular DNA molecules (maxicircles) and several thousand smaller circular molecules (minicircles). This group is composed of kDNA ligase, Chlorella virus DNA ligase, and similar proteins. kDNA ligase and Chlorella virus DNA ligase are the smallest known ATP-dependent ligases. They are involved in DNA replication or repair. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. They have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and the C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family, including this group. The adenylation domain binds ATP and contains many of the active-site residues. 174
26916 185708 cd07897 Adenylation_DNA_ligase_Bac1 Adenylation domain of putative bacterial ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of predicted bacterial ATP-dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three-step reaction mechanism. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family, including this group. The adenylation domain binds ATP and contains many of the active site residues. 207
26917 185709 cd07898 Adenylation_DNA_ligase Adenylation domain of ATP-dependent DNA Ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Some organisms express a variety of different ligases which appear to be targeted to specific functions. ATP-dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many of the active-site residues. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 201
26918 185710 cd07900 Adenylation_DNA_ligase_I_Euk Adenylation domain of eukaryotic DNA Ligase I. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Some organisms express a variety of different ligases which appear to be targeted to specific functions. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase I is required for the ligation of Okazaki fragments during lagging-strand DNA synthesis and for base excision repair (BER). DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. DNA ligase I is the main replicative ligase in eukaryotes. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 219
26919 185711 cd07901 Adenylation_DNA_ligase_Arch_LigB Adenylation domain of archaeal and bacterial LigB-like DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of archaeal DNA ligases and bacterial proteins similar to Mycobacterium tuberculosis LigB. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 207
26920 185712 cd07902 Adenylation_DNA_ligase_III Adenylation domain of DNA Ligase III. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three-step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase III is not found in lower eukaryotes and is present both in the nucleus and mitochondria. It has several isoforms; two splice forms, III-alpha and III-beta, differ in their carboxy-terminal sequences. DNA ligase III-beta is believed to play a role in homologous recombination during meiotic prophase. DNA ligase III-alpha interacts with X-ray Cross Complementing factor 1 (XRCC1) and functions in single nucleotide Base Excision Repair (BER). The mitochondrial form of DNA ligase III originates from the nucleolus and is involved in the mitochondrial DNA repair pathway. This isoform is expressed by a second start site on the DNA ligase III gene. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many active site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 213
26921 185713 cd07903 Adenylation_DNA_ligase_IV Adenylation domain of DNA Ligase IV. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligase in eukaryotic cells (I, III and IV). DNA ligase IV is required for DNA non-homologous end joining pathways, including recombination of the V(D)J immunoglobulin gene segments in cells of the mammalian immune system. DNA ligase IV is stabilized by forming a complex with XRCC4, a nuclear phosphoprotein, which is phosphorylated by DNA-dependent protein kinase. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 225
26922 185714 cd07905 Adenylation_DNA_ligase_LigC Adenylation domain of Mycobacterium tuberculosis LigC-like ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of ATP-dependent DNA ligases similar to Mycobacterium tuberculosis LigC. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. 194
26923 185715 cd07906 Adenylation_DNA_ligase_LigD_LigC Adenylation domain of Mycobacterium tuberculosis LigD and LigC-like ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of ATP-dependent DNA ligases similar to Mycobacterium tuberculosis LigC. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. LigD consists of a central ATP-dependent DNA ligase catalytic core unit fused to a C-terminal polymerase domain and an N-terminal 3'-phosphoesterase (PE) module. LigD catalyzes the end-healing and end-sealing steps during non-homologous end joining. 190
26924 153117 cd07908 Mn_catalase_like Manganese catalase-like protein, ferritin-like diiron-binding domain. This uncharacterized bacterial protein family has a ferritin-like domain similar to that of the manganese catalase protein of Lactobacillus plantarum and the bll3758 protein of Bradyrhizobium japonicum. Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF). 154
26925 153118 cd07909 YciF YciF bacterial stress response protein, ferritin-like iron-binding domain. YciF is a bacterial protein of unknown function that is up-regulated when bacteria experience stress conditions, and is highly conserved in a broad range of bacterial species. YciF has a ferritin-like domain. Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF). 147
26926 153119 cd07910 MiaE MiaE tRNA-modifying nonheme diiron monooxygenase, ferritin-like diiron-binding domain. MiaE is a nonheme diiron monooxygenase that catalyzes the posttranscriptional allylic hydroxylation of a modified nucleoside in tRNA called 2-methylthio-N-6-isopentenyl adenosine (ms2i6A). ms2i6A is found at position 37, next to the anticodon at the 3' position in almost all eukaryotic and bacterial tRNA's that read codons beginning with uridine. The miaE gene is absent in Escherichia coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species. 180
26927 153120 cd07911 RNRR2_Rv0233_like Ribonucleotide Reductase R2-like protein, Mn/Fe-binding domain. Rv0233 is a Mycobacterium tuberculosis ribonucleotide reductase R2 protein with a heterodinuclear manganese/iron-carboxylate cofactor located in its metal center. The Rv0233-like family may represent a structural/functional counterpart of the evolutionary ancestor of the RNRR2's (Ribonucleotide Reductase, R2/beta subunit) and the bacterial multicomponent monooxygenases. RNRR2s belong to a broad superfamily of ferritin-like diiron-carboxylate proteins. The RNR protein catalyzes the conversion of ribonucleotides to deoxyribonucleotides and is found in prokaryotes and archaea. The catalytically active form of RNR is a proposed alpha2-beta2 tetramer. The homodimeric alpha subunit (R1) contains the active site and redox active cysteines as well as the allosteric binding sites. 280
26928 143653 cd07912 Tweety_N N-terminal domain of the protein encoded by the Drosophila tweety gene and related proteins, a family of chloride ion channels. The protein product of the Drosophila tweety (tty) gene is thought to form a trans-membrane protein with five membrane-spanning regions and a cytoplasmic C-terminus. This N-terminal domain contains the putative transmembrane spanning regions. Tweety has been suggested as a candidate for a large conductance chloride channel, both in vertebrate and insect cells. Three human homologs have been identified and designated TTYH1-3. TTYH2 has been associated with the progression of cancer, and Drosophila melanogaster tweety has been assumed to play a role in development. TTYH2 and TTYH3 bind to and are ubiquitinated by Nedd4-2, a HECT type E3 ubiquitin ligase, which most likely plays a role in controlling the cellular levels of tweety family proteins. 418
26929 153419 cd07914 IGPD Imidazoleglycerol-phosphate dehydratase. Imidazoleglycerol-phosphate dehydratase (IGPD; EC 4.2.1.19) catalyzes the dehydration of imidazole glycerol phosphate to imidazole acetol phosphate, the sixth step of histidine biosynthesis in plants and microorganisms where the histidine is synthesized de novo. There is an internal repeat in the protein domain that is related by pseudo-dyad symmetry, perhaps as a result of an ancient gene duplication. The apo-form of IGPD exists as a catalytically inactive trimer which, in the presence of specific divalent metal cations such as manganese (Mn2+), cobalt (Co2+), cadmium (Cd2+), nickel (Ni2+), iron (Fe2+) and zinc (Zn2+), assembles to form a biologically active high molecular weight metalloenzyme; a 24-mer with 4-3-2 symmetry. Each 24-mer has 24 active sites, and contains around 1.5 metal ions per monomer, each monomer contributing residues to three separate active sites. IGPD enzymes are monofunctional in fungi, plants, archaea and some eubacteria while they are encoded as bifunctional enzymes in other eubacteria, such that the enzyme is fused to histidinol-phosphate phosphatase, the penultimate enzyme of the histidine biosynthesis pathway. The histidine biosynthesis pathway is a potential target for development of herbicides, and IGPD is a target for the triazole phosphonate herbicides. 190
26930 153420 cd07920 Pumilio Pumilio-family RNA binding domain. Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats. 322
26931 153391 cd07921 PCA_45_Doxase_A_like Subunit A of the Class III Extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, and similar enzymes. This subfamily includes the A subunit of protocatechuate (PCA) 4,5-dioxygenase (LigAB) and two subfamilies of unknown function. The A subunit is the smaller, non-catalytic subunit of LigAB. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As members of the Class III extradiol dioxygenase family, the enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. 106
26932 153392 cd07922 CarBa CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. 2-aminophenol 1,6-dioxygenase is a key enzyme in the carbazole degradation pathway isolated from bacterial strains with carbazole degradation ability. The enzyme is a heterotetramer composed of two A and two B subunits. CarB belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Although the enzyme was originally isolated as a meta-cleavage enzyme for 2'-aminobiphenyl-2,3-diol involved in carbazole degradation, the enzyme has also shown high specificity for 2,3-dihydroxybiphenyl. 81
26933 153393 cd07923 Gallate_dioxygenase_C The C-terminal domain of Gallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of gallate. Gallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of gallate, an intermediate in the degradation of the aromatic compound, syringate. The reaction product of gallate dioxygenase is 4-oxalomesaconate. The amino acid sequence of the N-terminal and C-terminal regions of gallate dioxygenase exhibits homology with the sequence of the PCA 4,5-dioxygenase B (catalytic) and A subunits, respectively. This model represents the C-terminal domain, which is similar to the A subunit of PCA 4,5-dioxygenase (or LigAB). The enzyme is estimated to be a homodimer according to the Escherichia coli enzyme. Since enzymes in this subfamily have fused A and B subunits, the dimer interface may resemble the tetramer interface of classical LigAB enzymes. This enzyme belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 94
26934 153394 cd07924 PCA_45_Doxase_A The A subunit of Protocatechuate 4,5-dioxygenase (LigAB) is the smaller, non-catalytic subunit. The A subunit is the non-catalytic subunit of Protocatechuate (PCA) 4,5-dioxygenase (LigAB), which is composed of A and B subunits that form a tetramer. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As a member of the Class III extradiol dioxygenase family, LigAB uses a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 121
26935 153395 cd07925 LigA_like_1 The A subunit of Uncharacterized proteins with similarity to Protocatechuate 4,5-dioxygenase (LigAB). The proteins of unknown function in this subfamily are similar to the A subunit of the Protocatechuate (PCA) 4,5-dioxygenase (LigAB). LigAB belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Dioxygenases play key roles in the degradation of aromatic compounds. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. 106
26936 143648 cd07927 RHD-n_NFAT_like N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of activated T-cells (NFAT) proteins and similar proteins. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes that are mainly involved in cell-cell interaction. Upon de-phosphorylation of the nuclear localization signal, NFAT enters the nucleus and acts as a transcription factor; its export from the nucleus is triggered by phosphorylation via export kinases. NFATs play important roles in mediating the immune response, and are found in T cells, B Cells, NK cells, mast cells, and monocytes. NFATs are also found in various non-hematopoietic cell types, where they play roles in development. This group also contains the N-terminal RHD sub-domain of the non-calcium regulated tonicity-responsive enhancer binding protein (TonEBP), also called NFAT5. Mammalian TonEBP regulates the expression of genes in response to tonicity. It plays a pivotal role in urinary concentrating mechanisms in kidney medulla, by triggering the accumulation of osmolytes that enable renal medullary cells to tolerate high levels of urea and salt. 161
26937 153077 cd07930 bacterial_phosphagen_kinase Phosphagen (guanidino) kinases found in bacteria. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, such as phosphocreatine (PCr) or phosphoarginine, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. This subfamily is specific to bacteria and lacks an N-terminal domain, which otherwise forms part of the substrate binding site. Most of the catalytic residues are found in the larger C-terminal domain, however, which appears conserved in these bacterial proteins. Their functions have not been characterized. 232
26938 153078 cd07931 eukaryotic_phosphagen_kinases Phosphagen (guanidino) kinases mostly found in eukaryotes. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). 338
26939 153079 cd07932 arginine_kinase_like Phosphagen (guanidino) kinases such as arginine kinase and similar enzymes. Eukaryotic arginine kinase-like phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphoarginine in the case of arginine kinase (AK), which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. Besides AK, one of the most studied members of this family, this model also represents a phosphagen kinase with different substrate specificity, hypotaurocyamine kinase (HTK). 350
26940 143649 cd07933 RHD-n_c-Rel N-terminal sub-domain of the Rel homology domain (RHD) of c-Rel. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the c-Rel family of transcription factors, categorized as a class II member of the NF-kappa B family. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). c-Rel plays an important role in B cell proliferation and survival. 172
26941 143650 cd07934 RHD-n_NFkB2 N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor kappa B2 (NF-kappa B2). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B2 family of transcription factors, a class I member of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. NF-kappa B2 is commonly referred to as p100 or p52 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). NF-kappa B2 is involved in the alternative NF-kappa B signaling pathway which is activated by few agonists and plays an important role in secondary lymphoid organogenesis, maturation of B-cells, and adaptive humoral immunity. p100 may also act as an I-kappa B due to its C-terminal ankyrin repeats. 185
26942 143651 cd07935 RHD-n_NFkB1 N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of kappa B1 (NF-kappa B1). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B1 family of transcription factors, a class I member of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. NF-kappa B1 is commonly referred to as p105 or p50 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). NF-kappa B1 is involved in the canonical NF-kappa B signaling pathway which is activated by many agonists and is essential in immune and inflammatory responses, as well as cell survival. p105 is involved in its own specific NF-kappa B signaling pathway which is also implicated in immune and inflammatory responses. p105 may also act as an I-kappa B due to its C-terminal ankyrin repeats. It is also involved in mitogen-activated protein kinase (MAPK) signaling as its degradation leads to the activation of TPL-2, a MAPK kinase kinase which activates ERK pathways. 202
26943 153421 cd07936 SCAN SCAN oligomerization domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several vertebrate proteins that contain C2H2 zinc finger motifs, many of which may be transcription factors playing roles in cell survival and differentiation. This protein-interaction domain is able to mediate homo- and hetero-oligomerization of SCAN-containing proteins. Some SCAN-containing proteins, including those of lower vertebrates, do not contain zinc finger motifs. It has been noted that the SCAN domain resembles a domain-swapped version of the C-terminal domain of the HIV capsid protein. This domain model features elements common to the three general groups of SCAN domains (SCAN-A1, SCAN-A2, and SCAN-B). The SCAND1 protein is truncated at the C-terminus with respect to this model; the SCAND2 protein appears to have a truncated central helix. 85
26944 163675 cd07937 DRE_TIM_PC_TC_5S Pyruvate carboxylase and Transcarboxylase 5S, carboxyltransferase domain. This family includes the carboxyltransferase domains of pyruvate carboxylase (PC) and the transcarboxylase (TC) 5S subunit. Transcarboxylase 5S is a cobalt-dependent metalloenzyme subunit of the biotin-dependent transcarboxylase multienzyme complex. Transcarboxylase 5S transfers carbon dioxide from the 1.3S biotin to pyruvate in the second of two carboxylation reactions catalyzed by TC. The first reaction involves the transfer of carbon dioxide from methylmalonyl-CoA to the 1.3S biotin, and is catalyzed by the 12S subunit. These two steps allow a carboxylate group to be transferred from oxaloacetate to propionyl-CoA to yield pyruvate and methylmalonyl-CoA. The catalytic domain of transcarboxylase 5S has a canonical TIM-barrel fold with a large C-terminal extension that forms a funnel leading to the active site. Transcarboxylase 5S forms a homodimer and there are six dimers per complex. In addition to the catalytic domain, transcarboxylase 5S has several other domains including a carbamoyl-phosphate synthase domain, a biotin carboxylase domain, a carboxyltransferase domain, and an ATP-grasp domain. Pyruvate carboxylase, like TC, is a biotin-dependent enzyme that catalyzes the carboxylation of pyruvate to produce oxaloacetate. In mammals, PC has critical roles in gluconeogenesis, lipogenesis, glyceroneogenesis, and insulin secretion. Inherited PC deficiencies are linked to serious diseases in humans such as lactic acidemia, hypoglycemia, psychomotor retardation, and death. PC is a single-chain enzyme and is active only in its homotetrameric form. PC has three domains, an N-terminal biotin carboxylase domain, a carboxyltransferase domain (this alignment model), and a C-terminal biotin-carboxyl carrier protein domain. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 275
26945 163676 cd07938 DRE_TIM_HMGL 3-hydroxy-3-methylglutaryl-CoA lyase, catalytic TIM barrel domain. 3-hydroxy-3-methylglutaryl-CoA lyase (HMGL) catalyzes the cleavage of HMG-CoA to acetyl-CoA and acetoacetate, one of the terminal steps in ketone body generation and leucine degradation, and is a key enzyme in the pathway that supplies metabolic fuel to extrahepatic tissues. Mutations in HMGL cause a human autosomal recessive disorder called primary metabolic aciduria that affects ketogenesis and leucine catabolism and can be fatal due to an inability to tolerate hypoglycemia. HMGL has a TIM barrel domain with a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. The cleavage of HMG-CoA requires the presence of a divalent cation like Mg2+ or Mn2+, and the reaction is thought to involve general acid/base catalysis. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 274
26946 163677 cd07939 DRE_TIM_NifV Streptomyces rubellomurinus FrbC and related proteins, catalytic TIM barrel domain. FrbC (NifV) of Streptomyces rubellomurinus catalyzes the condensation of acetyl-CoA and alpha-ketoglutarate to form homocitrate and CoA, a reaction similar to one catalyzed by homocitrate synthase. The gene encoding FrbC is one of several genes required for the biosynthesis of FR900098, a potent antimalarial antibiotic. This protein is also required for assembly of the nitrogenase MoFe complex but its exact role is unknown. This family also includes the NifV proteins of Heliobacterium chlorum and Gluconacetobacter diazotrophicus, which appear to be orthologous to FrbC. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 259
26947 163678 cd07940 DRE_TIM_IPMS 2-isopropylmalate synthase (IPMS), N-terminal catalytic TIM barrel domain. 2-isopropylmalate synthase (IPMS) catalyzes an aldol-type condensation of acetyl-CoA and 2-oxoisovalerate yielding 2-isopropylmalate and CoA, the first committed step in leucine biosynthesis. This family includes the Arabidopsis thaliana IPMS1 and IPMS2 proteins, the Glycine max GmN56 protein, and the Brassica insularis BatIMS protein. This family also includes a group of archaeal IPMS-like proteins represented by the Methanocaldococcus jannaschii AksA protein. AksA catalyzes the condensation of alpha-ketoglutarate and acetyl-CoA to form trans-homoaconitate, one of 13 steps in the conversion of alpha-ketoglutarate and acetyl-CoA to alpha-ketosuberate, a precursor to coenzyme B and biotin. AksA also catalyzes the condensation of alpha-ketoadipate or alpha-ketopimelate with acetyl-CoA to form, respectively, the (R)-homocitrate homologs (R)-2-hydroxy-1,2,5-pentanetricarboxylic acid and (R)-2-hydroxy-1,2,6-hexanetricarboxylic acid. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 268
26948 163679 cd07941 DRE_TIM_LeuA3 Desulfobacterium autotrophicum LeuA3 and related proteins, N-terminal catalytic TIM barrel domain. Desulfobacterium autotrophicum LeuA3 is sequence-similar to alpha-isopropylmalate synthase (LeuA) but its exact function is unknown. Members of this family have an N-terminal TIM barrel domain that belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 273
26949 163680 cd07942 DRE_TIM_LeuA Mycobacterium tuberculosis LeuA3 and related proteins, N-terminal catalytic TIM barrel domain. Alpha-isopropylmalate synthase (LeuA), a key enzyme in leucine biosynthesis, catalyzes the first committed step in the pathway, converting acetyl-CoA and alpha-ketoisovalerate to alpha-isopropyl malate and CoA. Although the reaction catalyzed by LeuA is similar to that of the Arabidopsis thaliana IPMS1 protein, the two fall into phylogenetically distinct families within the same superfamily. LeuA has an N-terminal TIM barrel catalytic domain, a helical linker domain, and a C-terminal regulatory domain. LeuA forms a homodimer in which the linker domain of one monomer sits over the catalytic domain of the other, inserting residues into the active site that may be important for catalysis. Homologs of LeuA are found in bacteria as well as fungi. This family includes alpha-isopropylmalate synthases I (LEU4) and II (LEU9) from Saccharomyces cerevisiae. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 284
26950 163681 cd07943 DRE_TIM_HOA 4-hydroxy-2-oxovalerate aldolase, N-terminal catalytic TIM barrel domain. 4-hydroxy 2-ketovalerate aldolase (also known as 4-hydroxy-2-ketovalerate aldolase and 4-hydroxy-2-oxopentanoate aldolase (HOA)) converts 4-hydroxy-2-oxopentanoate to acetaldehyde and pyruvate, the penultimate step in the meta-cleavage pathway for the degradation of phenols, cresols and catechol. This family includes the Escherichia coli MhpE aldolase, the Pseudomonas DmpG aldolase, and the Burkholderia xenovorans BphI pyruvate aldolase. In Pseudomonas, the DmpG aldolase tightly associates with a dehydrogenase (DmpF) and is inactive without it. HOA has a canonical TIM-barrel fold with a C-terminal extension that forms a funnel leading to the active site. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 263
26951 163682 cd07944 DRE_TIM_HOA_like 4-hydroxy-2-oxovalerate aldolase-like, N-terminal catalytic TIM barrel domain. This family of bacterial enzymes is sequence-similar to 4-hydroxy-2-oxovalerate aldolase (HOA) but its exact function is unknown. This family includes the Bacteroides vulgatus Bvu_2661 protein and belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 266
26952 163683 cd07945 DRE_TIM_CMS Leptospira interrogans citramalate synthase (CMS) and related proteins, N-terminal catalytic TIM barrel domain. Citramalate synthase (CMS) catalyzes the conversion of pyruvate and acetyl-CoA to (R)-citramalate in the first dedicated step of the citramalate pathway. Citramalate is only found in Leptospira interrogans and a few other microorganisms. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 280
26953 163684 cd07947 DRE_TIM_Re_CS Clostridium kluyveri Re-citrate synthase and related proteins, catalytic TIM barrel domain. Re-citrate synthase (Re-CS) is a Clostridium kluyveri enzyme that converts acetyl-CoA and oxaloacetate to citrate. In most organisms, this reaction is catalyzed by Si-citrate synthase which is Si-face stereospecific with respect to C-2 of oxaloacetate, and phylogenetically unrelated to Re-citrate synthase. Re-citrate synthase is also found in a few other strictly anaerobic organisms. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 279
26954 163685 cd07948 DRE_TIM_HCS Saccharomyces cerevisiae homocitrate synthase and related proteins, catalytic TIM barrel domain. Homocitrate synthase (HCS) catalyzes the condensation of acetyl-CoA and alpha-ketoglutarate to form homocitrate, the first step in the lysine biosynthesis pathway. This family includes the Yarrowia lipolytica LYS1 protein as well as the Saccharomyces cerevisiae LYS20 and LYS21 proteins. This family belongs to the DRE-TIM metallolyase superfamily. DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM". 262
26955 153386 cd07949 PCA_45_Doxase_B_like_1 The B subunit of unknown Class III extradiol dioxygenases with similarity to Protocatechuate 4,5-dioxygenase. This subfamily is composed of proteins of unknown function with similarity to the B subunit of Protocatechuate 4,5-dioxygenase (LigAB). LigAB belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Dioxygenases play key roles in the degradation of aromatic compounds. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. 276
26956 153387 cd07950 Gallate_Doxase_N The N-terminal domain of the Class III extradiol dioxygenase, Gallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of gallate. Gallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of gallate, an intermediate in the degradation of the aromatic compound, syringate. The reaction product of gallate dioxygenase is 4-oxalomesaconate. The amino acid sequence of the N-terminal and C-terminal regions of gallate dioxygenase exhibits homology with the sequence of PCA 4,5-dioxygenase B (catalytic) and A subunits, respectively. The enzyme is estimated to be a homodimer according to the Escherichia coli enzyme. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. In this subfamily, the subunits A and B are fused to make a single polypeptide chain. The dimer interface for this subfamily may resemble the tetramer interface of classical LigAB enzymes. Gallate Dioxygenase belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. 277
26957 153388 cd07951 ED_3B_N_AMMECR1 The N-terminal domain, an extradiol dioxygenase class III subunit B-like domain, of unknown proteins containing a C-terminal AMMECR1 domain. This subfamily is composed of uncharacterized proteins containing an N-terminal domain with similarity to the catalytic B subunit of class III extradiol dioxygenases and a C-terminal AMMECR1-like domain. This model represents the N-terminal domain. Class III extradiol dioxygenases use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon, however, proteins in this subfamily do not contain a potential metal binding site and may not exhibit class III extradiol dioxygenase-like activity. The AMMECR1 protein was proposed to be a regulatory factor that is potentially involved in the development of AMME contiguous gene deletion syndrome. 256
26958 153389 cd07952 ED_3B_like Uncharacterized class III extradiol dioxygenases. This subfamily is composed of proteins of unknown function with similarity to the catalytic B subunit of class III extradiol dioxygenases. Class III extradiol dioxygenases use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. They play key roles in the degradation of aromatic compounds. 256
26959 409289 cd07953 PUA PUA RNA binding domain. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, and a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was also found in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in regulating the expression of other genes. It has been shown that the PUA domain acts as an RNA binding domain in at least some of the proteins involved in RNA metabolism. 73
26960 271157 cd07954 AP_MHD_Cterm C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD). This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred to as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. 
Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15. 245
26961 153409 cd07955 Anticodon_Ia_Cys_like Anticodon-binding domain of cysteinyl tRNA synthetases and domain found in MshC. This domain is found in cysteinyl tRNA synthetases (CysRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. CysRS catalyzes the transfer of cysteine to the 3'-end of its tRNA. The family also includes a domain of MshC, the rate-determining enzyme in the mycothiol biosynthetic pathway, which is specific to actinomycetes. The anticodon-binding site of CysRS lies C-terminal to this model's footprint and is not shared by MshC. 81
26962 153410 cd07956 Anticodon_Ia_Arg Anticodon-binding domain of arginyl tRNA synthetases. This domain is found in arginyl tRNA synthetases (ArgRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. ArgRS catalyzes the transfer of arginine to the 3'-end of its tRNA. 156
26963 153411 cd07957 Anticodon_Ia_Met Anticodon-binding domain of methionyl tRNA synthetases. This domain is found in methionyl tRNA synthetases (MetRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon (CAU). MetRS catalyzes the transfer of methionine to the 3'-end of its tRNA. 129
26964 153412 cd07958 Anticodon_Ia_Leu_BEm Anticodon-binding domain of bacterial and eukaryotic mitochondrial leucyl tRNA synthetases. This domain is found in leucyl tRNA synthetases (LeuRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain. In contrast to other class Ia enzymes, the anticodon is not used as an identity element in LeuRS (with exceptions such as Saccharomyces cerevisiae and some other eukaryotes). No anticodon-binding site can be defined for this family, which includes bacterial and eukaryotic mitochondrial members, as well as LeuRS from the archaeal Halobacteria. LeuRS catalyzes the transfer of leucine to the 3'-end of its tRNA. 117
26965 153413 cd07959 Anticodon_Ia_Leu_AEc Anticodon-binding domain of archaeal and eukaryotic cytoplasmic leucyl tRNA synthetases. This domain is found in leucyl tRNA synthetases (LeuRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain. In contrast to other class Ia enzymes, the anticodon is not used as an identity element in LeuRS (with exceptions such as Saccharomyces cerevisiae and some other eukaryotes). No anticodon-binding site can be defined for this family, which includes archaeal and eukaryotic cytoplasmic members. LeuRS catalyzes the transfer of leucine to the 3'-end of its tRNA. 117
26966 153414 cd07960 Anticodon_Ia_Ile_BEm Anticodon-binding domain of bacterial and eukaryotic mitochondrial isoleucyl tRNA synthetases. This domain is found in isoleucyl tRNA synthetases (IleRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. This family includes bacterial and eukaryotic mitochondrial members. IleRS catalyzes the transfer of isoleucine to the 3'-end of its tRNA. 180
26967 153415 cd07961 Anticodon_Ia_Ile_ABEc Anticodon-binding domain of archaeal, bacterial, and eukaryotic cytoplasmic isoleucyl tRNA synthetases. This domain is found in isoleucyl tRNA synthetases (IleRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. This family includes bacterial, archaeal, and eukaryotic cytoplasmic members. IleRS catalyzes the transfer of isoleucine to the 3'-end of its tRNA. 183
26968 153416 cd07962 Anticodon_Ia_Val Anticodon-binding domain of valyl tRNA synthetases. This domain is found in valyl tRNA synthetases (ValRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. ValRS catalyzes the transfer of valine to the 3'-end of its tRNA. 135
26969 153417 cd07963 Anticodon_Ia_Cys Anticodon-binding domain of cysteinyl tRNA synthetases. This domain is found in cysteinyl tRNA synthetases (CysRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. CysRS catalyzes the transfer of cysteine to the 3'-end of its tRNA. 156
26970 176481 cd07964 RBP-H Head domain of virus receptor-binding proteins (RBP). Virus receptor-binding proteins (RBPs) are found in lactococcal bacteriophages, as well as in adenoviruses and reoviruses, which invade mammalian cells. Lactococcus lactis is widely used in dairy fermentations, and infection of L. lactis by phages greatly impairs the fermentation process. Adenovirus typically infects respiratory tracts with symptoms ranging from the common cold to pneumonia. Onset of viral infection begins with the recognition of host cells through the receptor-binding protein complex located at the distal part of the virion. The RBP has three domains: the N-terminal shoulders domain, the interlaced neck domain, and the C-terminal head domain. Phages recognize their host through an interaction between the RBP head (RBP-H) domain and saccharidic receptors at the host cell surface. Adenovirus recognizes the membrane cofactor protein, CD46, as a cellular receptor. 103
26971 153436 cd07967 OBF_DNA_ligase_III The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase III is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase III is not found in lower eukaryotes and is present both in the nucleus and mitochondria. It has several isoforms; two splice forms, III-alpha and III-beta, differ in their carboxy-terminal sequences. DNA ligase III-beta is believed to play a role in homologous recombination during meiotic prophase. DNA ligase III-alpha interacts with X-ray Cross Complementing factor 1 (XRCC1) and functions in single nucleotide Base Excision Repair (BER). The mitochondrial form of DNA ligase III originates from the nucleolus and is involved in the mitochondrial DNA repair pathway. This isoform is expressed by a second start site on the DNA ligase III gene. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. 
The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 139
26972 153437 cd07968 OBF_DNA_ligase_IV The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase IV is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase IV is required for DNA non-homologous end joining pathways, including recombination of the V(D)J immunoglobulin gene segments in cells of the mammalian immune system. DNA ligase IV is stabilized by forming a complex with XRCC4, a nuclear phosphoprotein, which is phosphorylated by DNA-dependent protein kinase. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 140
26973 153438 cd07969 OBF_DNA_ligase_I The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase I is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). This group is composed of eukaryotic DNA ligase I, Sulfolobus solfataricus DNA ligase and similar proteins. DNA ligase I is required for the ligation of Okazaki fragments during lagging-strand DNA synthesis and for base excision repair (BER). ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 144
26974 153439 cd07970 OBF_DNA_ligase_LigC The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase LigC is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Mycobacterium tuberculosis LigC and similar bacterial proteins. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 122
26975 153440 cd07971 OBF_DNA_ligase_LigD The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase LigD is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Mycobacterium tuberculosis LigD and similar bacterial proteins. LigD, or DNA ligase D, catalyzes the end-healing and end-sealing steps during nonhomologous end joining. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 115
26976 153441 cd07972 OBF_DNA_ligase_Arch_LigB The Oligonucleotide/oligosaccharide binding (OB)-fold domain of archaeal and bacterial ATP-dependent DNA ligases is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Pyrococcus furiosus DNA ligase, Mycobacterium tuberculosis LigB, and similar archaeal and bacterial proteins. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 122
26977 153422 cd07973 Spt4 Transcription elongation factor Spt4. Spt4 is a transcription elongation factor. Three transcription-elongation factors, Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. Spt4 functions entirely in the context of the Spt4-Spt5 heterodimer and it has been found only in complex with Spt5 in yeast and human. Spt4 is a small protein that has a zinc finger at the N-terminus. Spt5 is a large protein with several interesting structural features: an acidic N-terminus, a single NGN domain, five or six KOW domains, and a set of simple C-terminal repeats. Spt4 binds to the Spt5 NGN domain. Unlike Spt5, Spt4 is not essential for viability in yeast; however, Spt4 is critical for normal function of the Spt4-Spt5 complex. No Spt4 homolog is found in bacteria. 98
26978 199899 cd07976 TFIIA_alpha_beta_like Precursor of TFIIA alpha and beta subunits and similar proteins. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta), its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage). 102
26979 153423 cd07977 TFIIE_beta_winged_helix TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity. Transcription Factor IIE (TFIIE) beta winged-helix (or forkhead) domain is located at the central core region of TFIIE beta. The winged-helix is a form of helix-turn-helix (HTH) domain which typically binds DNA with the 3rd helix. The winged-helix domain is distinguished by the presence of a C-terminal beta-strand hairpin unit (the wing) that packs against the cleft of the tri-helical core. Although most winged-helix domains are multi-member families, TFIIE beta winged-helix domain is typically found as a single orthologous group. TFIIE is one of the six eukaryotic general transcription factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH) that are required for transcription initiation of protein-coding genes. TFIIE is a heterotetramer consisting of two copies each of alpha and beta subunits. TFIIE beta contains several functional domains, an N-terminal serine-rich region, a central core domain exhibiting a winged-helix structure capable of binding double-stranded DNA, a leucine repeat, a sigma3 region, and a C-terminal domain containing two basic regions. The assembly of transcription preinitiation complex (PIC) includes the general transcription factors and RNA polymerase II (pol II) initiated by the binding of the TBP subunit of TFIID to the TATA box, followed by either the sequential assembly of other general transcription factors and pol II or a preassembled pol II holoenzyme pathway. TFIIE interacts directly with TFIIF, TFIIB, pol II, and promoter DNA. TFIIE recruits TFIIH and regulates its activities. TFIIE and TFIIH are also important for the transition from initiation to elongation. 75
26980 173962 cd07978 TAF13 The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAF functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF13 interacts with TAF11 and makes a histone-like heterodimer similar to H3/H4-like proteins. 
The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex. 92
26981 173963 cd07979 TAF9 TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and are involved in forming the TFIID complex. TFIID is one of several General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Human TAF9 has a paralogue gene (TAF9L) which has a redundant function. Several hypotheses are proposed for TAF function such as serving as activator-binding sites, in core-promoter recognition or a role in essential catalytic activity. It has been shown that TAF9 interacts directly with different transcription factors such as p53, herpes simplex virus activator vp16 and the basal transcription factor TFIIB. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF9 is a component of TFIID in multiple organisms as well as different TBP-free TAF complexes containing the GCN5-type histone acetyltransferase. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. 
The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF9 is a shared subunit of both the histone acetyltransferase complex (SAGA) and the TFIID complex. The TAF9 domain interacts with TAF6 to form a novel histone-like heterodimer that is structurally related to the histone H3 and H4 oligomer. 117
26982 259828 cd07980 TFIIF_beta Transcription initiation factor IIF, beta subunit. The TFIIF-beta subunit, also called RNA Polymerase II-associating Protein 30 (RAP30), forms a heteromeric complex of RAP30/74 (TFIIF, beta/gamma) that is involved in both initiation and elongation of RNA chains by RNA polymerase II. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. TFIIF-beta binds directly to RNA polymerase II and helps bring polymerase into a pre-initiation complex. 123
26983 381751 cd07981 TAF12 TATA Binding Protein (TBP) Associated Factor 12. The TATA Binding Protein (TBP) Associated Factor 12 (TAF12; also known as TAF2J or TAFII20) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and in the assembly of the pre-initiation complex (PIC). The TFIID complex is composed of the TBP and at least 13 TAFs which specifically interact with a variety of core promoter DNA sequences. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A unified and systematic nomenclature has been adopted for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAF function such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF12 interacts with TAF4 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function for a subset of genes. It is important for RAS-induced transformation properties of human colorectal cancer cells; its levels are increased in the cells harboring the RAS mutation. Also, TAF12 interacts with activating transcription factor 7 (ATF7) and contributes to the hypersensitivity of osteoclast (OCL) precursors to 1,25-dihydroxyvitamin D3 (1,25-(OH)2D3; also known as calcitriol) in Paget's disease (PD), a disorder of the bone remodeling process, in which the body absorbs old bone and forms abnormal new bone. 69
26984 187739 cd07982 TAF10 The TATA Binding Protein (TBP) Associated Factor 10. The TATA Binding Protein (TBP) Associated Factor 10 (TAF10) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. Several hypotheses are proposed for TAF functions, such as serving as activator-binding sites, being involved in core-promoter recognition, or performing an essential catalytic activity. Each TAF, with the help of a specific activator, is required only for the expression of a subset of genes, and TAFs are not universally involved in transcription as the GTFs are. TAF10 regulates genes that are important for cell cycle progression and cell morphology. A lack of TAF10 leads to cell cycle arrest and cell death by apoptosis in mouse. In both yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF10 is part of other transcription regulatory multiprotein complexes (e.g., SAGA, TBP-free TAF-containing complex [TFTC], STAGA, and PCAF/GCN5). Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops. The HFD is found in core histones, TAFs and many other transcription factors. Five HF-containing TAF pairs have been described in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10. 108
26985 153245 cd07983 LPLAT_DUF374-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: DUF374. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are the uncharacterized DUF374 phospholipid/glycerol acyltransferases and similar proteins. 189
26986 153246 cd07984 LPLAT_LABLAT-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LABLAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as lipid A biosynthesis lauroyl/myristoyl (LABLAT, HtrB) acyltransferases and similar proteins. 192
26987 153247 cd07985 LPLAT_GPAT Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: GPAT. Lysophospholipid acyltransferase (LPLAT) superfamily member: glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB). LPLATs are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. This subgroup includes glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB). 235
26988 153248 cd07986 LPLAT_ACT14924-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown ACT14924. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized phospholipid/glycerol acyltransferases such as the Pectobacterium carotovorum subsp. carotovorum PC1 locus ACT14924 putative acyltransferase, and similar proteins. 210
26989 153249 cd07987 LPLAT_MGAT-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: MGAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as 2-acylglycerol O-acyltransferase (MGAT), and similar proteins. 212
26990 153250 cd07988 LPLAT_ABO13168-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown ABO13168. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized phospholipid/glycerol acyltransferases such as the Acinetobacter baumannii ATCC 17978 locus ABO13168 putative acyltransferase, and similar proteins. 163
26991 153251 cd07989 LPLAT_AGPAT-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: AGPAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), Tafazzin (product of Barth syndrome gene), and similar proteins. 184
26992 153252 cd07990 LPLAT_LCLAT1-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LCLAT1-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as Lysocardiolipin acyltransferase 1 (LCLAT1) or 1-acyl-sn-glycerol-3-phosphate acyltransferase and similar proteins. 193
26993 153253 cd07991 LPLAT_LPCAT1-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LPCAT1-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as lysophosphatidylcholine acyltransferase 1 (LPCAT-1), glycerol-3-phosphate acyltransferase 3 (GPAT3), and similar sequences. 211
26994 153254 cd07992 LPLAT_AAK14816-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown AAK14816-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized glycerol-3-phosphate acyltransferases such as the Plasmodium falciparum locus AAK14816 putative acyltransferase, and similar proteins. 203
26995 153255 cd07993 LPLAT_DHAPAT-like Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: GPAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and similar proteins. 205
26996 153424 cd07994 WGR WGR domain. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. 73
26997 153431 cd07995 TPK Thiamine pyrophosphokinase. Thiamine pyrophosphokinase (TPK, EC:2.7.6.2, also spelled thiamin pyrophosphokinase) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamine) to form the coenzyme thiamine pyrophosphate (TPP). TPP is required for central metabolic functions, and thiamine deficiency is associated with potentially fatal human diseases. The structure of thiamine pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 208
26998 153425 cd07996 WGR_MMR_like WGR domain of molybdate metabolism regulator and related proteins. The WGR domain is found in the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, as well as in various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain appears to occur in single-domain proteins and in a variety of domain architectures, together with ATP-dependent DNA ligase domains, WD40 repeats, leucine-rich repeats, and other domains. It has been proposed to function as a nucleic acid binding domain. 74
26999 153426 cd07997 WGR_PARP WGR domain of poly(ADP-ribose) polymerases. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins and histones. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. Poly-ADP-ribosylation was thought to be a reversible post-translational covalent modification that serves as a regulatory mechanism for protein substrates. However, it is now known that it plays important roles in many cellular processes including maintenance of genomic stability, transcriptional regulation, energy metabolism, cell death and survival, among others. 102
27000 153427 cd07998 WGR_DNA_ligase WGR domain of bacterial DNA ligases. The WGR domain is found in a small family of predicted bacterial DNA ligases. It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with an ATP-dependent DNA ligase domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. 77
27001 153432 cd07999 GH7_CBH_EG Glycosyl hydrolase family 7. Glycosyl hydrolase family 7 contains eukaryotic endoglucanases (EGs) and cellobiohydrolases (CBHs) that hydrolyze glycosidic bonds using a double-displacement mechanism. This leads to a net retention of the conformation at the anomeric carbon. Both enzymes work synergistically in the degradation of cellulose, which is the main component of plant cell walls and is composed of beta-1,4 linked glycosyl units. EG cleaves the beta-1,4 linkages of cellulose and CBH cleaves off cellobiose disaccharide units from the reducing end of the chain. In general, the O-glycosyl hydrolases are a widespread group of enzymes that hydrolyze the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycoside hydrolase family 7. 386
27002 193574 cd08000 NGN N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily. The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms a Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements differ. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria. 99
27003 153428 cd08001 WGR_PARP1_like WGR domain of poly(ADP-ribose) polymerase 1 and similar proteins. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of vertebrate PARP-1 and similar proteins, including Arabidopsis thaliana PARP-1 and PARP-3. PARP-1 is the best-studied among the PARPs. It is a widely expressed nuclear chromatin-associated enzyme that possesses auto-mono-ADP-ribosylation (initiation), elongation, and branching activities. PARP-1 is implicated in DNA damage and cell death pathways and is important in maintaining genomic stability and regulating cell proliferation, differentiation, neuronal function, inflammation, and aging. 104
27004 153429 cd08002 WGR_PARP3_like WGR domain of poly(ADP-ribose) polymerase 3 and similar proteins. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of human PARP-3 and similar proteins, including Arabidopsis thaliana PARP-2. PARP-3 displays a tissue-specific expression, with highest amounts found in the nuclei of epithelial cells of prostate ducts, salivary glands, liver, pancreas, and in the neurons of terminal ganglia. Unlike PARP-1 and PARP-2, PARP-3 activity is not induced by DNA strand breaks. However, it co-localizes with Polycomb group bodies and is part of complexes making up DNA-PKcs, DNA ligases III and IV, Ku70, and Ku80. PARP-3 is a nuclear protein that may be involved in transcriptional control and responses to DNA damage. 100
27005 153430 cd08003 WGR_PARP2_like WGR domain of poly(ADP-ribose) polymerases. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of human PARP-2 and similar proteins. Similar to PARP-1, PARP-2 is ubiquitously expressed and its activity is induced by DNA strand breaks. It also plays a role in cell differentiation, cell death, and maintaining genomic stability. Studies on PARP-2-deficient mice show that it is important in fat storage, T cell maturation, and spermatogenesis. 103
27006 381750 cd08010 MltG_like proteins similar to Escherichia coli YceG/mltG may function as endolytic murein transglycosylases. The gene product of Escherichia coli yceG/mltG has been erroneously annotated as an aminodeoxychorismate lyase. Its overexpression has been reported to cause abnormal biofilm architecture, and it has been reported to be part of a putative five-gene operon. More recently it has been proposed to function as a terminase for peptidoglycan polymerization. The family also includes Streptomyces caeruleus NovB, an uncharacterized member of the novobiocin biosynthetic gene cluster. 246
27007 349933 cd08011 M20_ArgE_DapE-like M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly archaeal, and have been inferred by homology as being related to both ArgE and DapE. 355
27008 349934 cd08012 M20_ArgE-related M20 Peptidases with similarity to acetylornithine deacetylases. Peptidase M20 family, acetylornithine deacetylase (ArgE, Acetylornithinase, AO, N2-acetyl-L-ornithine amidohydrolase, EC 3.5.1.16)-related subfamily. Proteins in this subfamily have not yet been characterized, but have been predicted to have deacetylase activity. ArgE catalyzes the conversion of N-acetylornithine to ornithine, which can then be incorporated into the urea cycle for the final stage of arginine synthesis. The substrate specificity of ArgE is quite broad; several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved. 423
27009 349935 cd08013 M20_ArgE_DapE-like M20 peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly fungal and bacterial, and have been inferred by homology as being related to both ArgE and DapE. 379
27010 349936 cd08014 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of uncharacterized bacterial proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 371
27011 349937 cd08015 M28_like M28 Zn-peptidase-like; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 218
27012 349938 cd08017 M20_IAA_Hyd M20 Peptidase Indole-3-acetic acid amino acid hydrolase. Peptidase M20 family, plant aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase (IAA-Asp hydrolase; IAAspH; IAAH; IAA amidohydrolase; EC 3.5.1.-) subfamily. IAAspH hydrolyzes indole-3-acetyl-N-aspartic acid (IAA-Asp) to indole-3-acetic acid (IAA or auxin). Genes encoding IAA-amidohydrolases were first cloned from Arabidopsis; ILR1, IAR3, ILL1 and ILL2 encode active IAA-amino acid hydrolases, and three additional amidohydrolase-like genes (ILL3, ILL5, ILL6) have been isolated. In higher plants, the growth regulator indole-3-acetic acid (IAA or auxin) is found both free and conjugated via amide bonding to a variety of amino acids and peptides, and via an ester linkage to carbohydrates. IAA-Asp conjugates are involved in homeostatic control, protection, storing and subsequent use of free IAA. IAA-Asp is also found in some plants as a unique intermediate for entering into the IAA non-decarboxylative oxidative pathway. IAA amidohydrolase cleaves the amide bond between the auxin and the conjugated amino acid. Enterobacter agglomerans IAAspH has very strong enzyme activity and substrate specificity towards IAA-Asp, although its substrate affinity is weaker compared to Arabidopsis enzymes of the ILR1 gene family. Enhanced IAA-hydrolase activity has been observed during clubroot disease in Chinese cabbage. 376
27013 349939 cd08018 M20_Acy1_amhX-like M20 Peptidase aminoacylase 1 amhX-like subfamily. Peptidase M20 family, uncharacterized subfamily of proteins predicted as putative amidohydrolases, including the amhX gene product from Bacillus subtilis. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 365
27014 349940 cd08019 M20_Acy1-like M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 372
27015 349941 cd08021 M20_Acy1_YhaA-like M20 Peptidase aminoacylase 1 subfamily, includes Bacillus subtilis YhaA and Staphylococcus aureus amidohydrolase, SACOL0085. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). This family includes Staphylococcus aureus amidohydrolase, SACOL0085, which contains two manganese ions in the active site, and forms a homotetramer with variations in interdomain orientation which possibly plays a role in the regulation of catalytic activity. 384
27016 349942 cd08022 M28_PSMA_like M28 Zn-peptidase prostate-specific membrane antigen. Peptidase M28 family; prostate-specific membrane antigen (PSMA, also called glutamate carboxypeptidase II or GCP-II)-like subfamily. PSMA is a homodimeric type II transmembrane protein containing three distinct domains: protease-like, apical or protease-associated (PA) and helical domains. The protease-like domain is a large extracellular portion (ectodomain). PSMA is over-expressed predominantly in prostate cancer (PCa) as well as in the neovasculature of most solid tumors, but not in the vasculature of the normal tissues. PSMA is considered a biomarker for PCa and possibly for use as an imaging and therapeutic target. The extracellular domain of PSMA possesses two unique enzymatic functions: N-acetylated, alpha-linked acidic dipeptidase (NAALADase) which cleaves terminal glutamate from the neurodipeptide N-acetyl-aspartyl-glutamate (NAAG), and folate hydrolase (FOLH) which cleaves the terminal glutamates from gamma-linked polyglutamates (carboxypeptidase). A mutation in this gene may be associated with impaired intestinal absorption of dietary folates, resulting in low blood folate levels and consequent hyperhomocysteinemia. Expression of this protein in the brain may be involved in a number of pathological conditions associated with glutamate excitotoxicity. Inhibition of GCP-II has been shown to be effective in preclinical models of neurological disorders associated with excessive activation of glutamatergic systems. This gene likely arose from a duplication event of a nearby chromosomal region. Alternative splicing gives rise to multiple transcript variants. 287
27017 185693 cd08023 GH16_laminarinase_like Laminarinase, member of the glycosyl hydrolase family 16. Laminarinase, also known as glucan endo-1,3-beta-D-glucosidase, is a glycosyl hydrolase family 16 member that hydrolyzes 1,3-beta-D-glucosidic linkages in 1,3-beta-D-glucans such as laminarins, curdlans, paramylons, and pachymans, with very limited action on mixed-link (1,3-1,4-)-beta-D-glucans. 235
27018 185694 cd08024 GH16_CCF Coelomic cytolytic factor, member of glycosyl hydrolase family 16. Subgroup of glucanases of unknown function that are related to beta-GRP (beta-1,3-glucan recognition protein), but contain active site residues. Beta-GRPs are one group of pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complex with pathogen-associated beta-1,3-glucans and then transduce signals necessary for activation of an appropriate innate immune response. Beta-GRPs are present in insects and lack all catalytic residues. This subgroup contains related proteins that still contain the active site and are widely distributed in eukaryotes. Their structures adopt a jelly roll fold with a deep active site channel harboring the catalytic residues, like those of other glycosyl hydrolase family 16 members. 330
27019 153090 cd08025 RNR_PFL_like_DUF711 Uncharacterized proteins with similarity to Ribonucleotide reductase and Pyruvate formate lyase. This subfamily contains Streptococcus pneumoniae Sp0239 and similar uncharacterized proteins. Sp0239 is structurally similar to ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL), which are believed to have diverged from a common ancestor. RNR and PFL possess a ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. PFL is an essential enzyme in anaerobic bacteria that catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate. 400
27020 153434 cd08026 DUF326 Cysteine-rich 4 helical bundle widely conserved in bacteria. This functionally uncharacterized protein forms a 4-helical bundle with a bromodomain-like topology. It is present in major bacterial lineages and contains highly conserved cysteines in a repeated pattern, whose sidechains appear buried. Some family members have been (mis)annotated as putative ferredoxins. 102
27021 153397 cd08028 LARP_3 La RNA-binding domain of La-related protein 3. This domain is found at the N-terminus of the La autoantigen and similar proteins, and co-occurs with an RNA-recognition motif (RRM). Together these domains function to bind primary transcripts of RNA polymerase III at their 3' terminus and protect them from exonucleolytic degradation. Binding is specific for the 3'-terminal UUU-OH motif. The La autoantigen is also called Lupus La protein, LARP3, or Sjoegren syndrome type B antigen (SS-B). 82
27022 153398 cd08029 LA_like_fungal La-motif domain of fungal proteins similar to the La autoantigen. This domain is found in fungal proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 76
27023 153399 cd08030 LA_like_plant La-motif domain of plant proteins similar to the La autoantigen. This domain is found in plant proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 90
27024 153400 cd08031 LARP_4_5_like La RNA-binding domain of proteins similar to La-related proteins 4 and 5. This domain is found in proteins similar to La-related proteins 4 and 5 (LARP4, LARP5). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 75
27025 153401 cd08032 LARP_7 La RNA-binding domain of La-related protein 7. LARP7 is a component of the 7SK snRNP, a key factor in the regulation of RNA polymerase II transcription. 7SK functionality is dependent on the presence of LARP7, which is thought to stabilize the 7SK RNA by interacting with its 3' end. The release of 7SK RNA from P-TEFb/HEXIM/7SK complexes activates the cyclin-dependent kinase P-TEFb, which in turn phosphorylates the C-terminal domain of RNA pol II and mediates a transition into productive transcription elongation. 82
27026 153402 cd08033 LARP_6 La RNA-binding domain of La-related protein 6. This domain is found in animal and plant proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 77
27027 153403 cd08034 LARP_1_2 La RNA-binding domain of proteins similar to La-related proteins 1 and 2. This domain is found in proteins similar to vertebrate La-related proteins 1 and 2 (LARP1, LARP2). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 73
27028 153404 cd08035 LARP_4 La RNA-binding domain of La-related protein 4. This domain is found in vertebrate La-related protein 4 (LARP4), also known as c-MPL binding protein. La-type domains often co-occur with RNA-recognition motifs (RRMs). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 75
27029 153405 cd08036 LARP_5 La RNA-binding domain of La-related protein 5. This domain is found in vertebrate La-related protein 5 (LARP5). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 75
27030 153406 cd08037 LARP_1 La RNA-binding domain of La-related protein 1. This domain is found in vertebrate La-related protein 1 (LARP1). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 73
27031 153407 cd08038 LARP_2 La RNA-binding domain of La-related protein 2. This domain is found in vertebrate La-related protein 2 (LARP2). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. 73
27032 185716 cd08039 Adenylation_DNA_ligase_Fungal Adenylation domain of uncharacterized fungal ATP-dependent DNA ligase-like proteins. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. This group is composed of uncharacterized fungal proteins with similarity to ATP-dependent DNA ligases. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many of the active-site residues. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. This model characterizes the adenylation domain of this group of uncharacterized fungal proteins. It is not known whether these proteins also contain an OB-fold domain. 235
27033 153442 cd08040 OBF_DNA_ligase_family The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 108
27034 153443 cd08041 OBF_kDNA_ligase_like The Oligonucleotide/oligosaccharide binding (OB)-fold domain of kDNA ligase-like ATP-dependent DNA ligases is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. The mitochondrial DNA of parasitic protozoans is highly unusual. It is termed the kinetoplast DNA (kDNA) and consists of circular DNA molecules (maxicircles) and several thousand smaller circular molecules (minicircles). This group is composed of kDNA ligase, Chlorella virus DNA ligase, and similar proteins. kDNA ligase and Chlorella virus DNA ligase are the smallest known ATP-dependent ligases. They are involved in DNA replication or repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain. 77
27035 176269 cd08044 TAF5_NTD2 TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID). The TATA Binding Protein (TBP) Associated Factor 5 (TAF5) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. TAF5 contains three domains, two conserved sequence motifs at the N-terminal and one at the C-terminal region. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF5 may play a major role in forming TFIID and its related complexes. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. TAF5 has a paralog gene (TAF5L) which has a redundant function. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. The C-terminus of TAF5 contains six WD40 repeats that likely form a closed beta propeller structure and may be involved in protein-protein interaction. The first part of the TAF5 N-terminal (TAF5_NTD1) homodimerizes in the absence of other TAFs. The second conserved N-terminal part of TAF5 (TAF5_NTD2) has an alpha-helical domain. One study has shown that TAF5_NTD2 homodimerizes only at high concentration of calcium but not any other metals. No dimerization was observed in other structural studies of TAF5_NTD2. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. However, TAF5 does not have an HFD motif. 133
27036 173965 cd08045 TAF4 TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for the expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF4 domain interacts with TAF12 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function for a subset of genes. 212
27037 173966 cd08047 TAF7 TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved in transcription as are GTFs. TAF7 is involved in the regulation of the transition from PIC assembly to initiation and elongation. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. 162
27038 173967 cd08048 TAF11 TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. TAF11 interacts with the ligand binding domains of the nuclear receptors for vitamin D3 and thyroid hormone. TAF11 also directly interacts with TFIIA, acting as a bridging factor that stabilizes the TFIIA-TBP-DNA complex. Each TAF, with the help of a specific activator, is required only for the expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. The TAF11 domain is structurally analogous to histone H3 and interacts with TAF13, making a novel histone-like heterodimer. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex. 85
27039 176263 cd08049 TAF8 TATA Binding Protein (TBP) Associated Factor 8. The TATA Binding Protein (TBP) Associated Factor 8 (TAF8) is one of several TAFs that bind TBP, and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, involvement in the core-promoter recognition, or a role in the essential catalytic activity of the complex. The mouse ortholog of TAF8 is called taube nuss protein (TBN), and is required for early embryonic development. TBN mutant mice exhibit disturbances in the balance between cell death and cell survival in the early embryo. TAF8 plays a role in the differentiation of preadipocyte fibroblasts to adipocytes; it is also required for the integration of TAF10 into the TAF complex. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF8 is also a component of a small TAF complex (SMAT), which contains TAF8, TAF10 and SUPT7L. Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. TAF8 contains an H4 related histone fold motif, and interacts with several subunits of TFIID, including TBP and the histone-fold protein TAF10. Currently, five HF-containing TAF pairs have been described or suggested to exist in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10. 54
27040 381749 cd08050 TAF6C C-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6). This model characterizes the carboxy (C)-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6), which is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. This C-terminal HEAT repeat domain of TAF6 (TAF6C) is proposed to form a homodimer that effectively bridges the downstream promoter-interacting TAFs (TAF1, -2, and -7) with lobe B of TFIID. This domain influences the TAF6-TAF9 complex, is thus important for TFIID assembly, and may trigger signals from transcriptional effectors. The HEAT domain motif is generally involved in protein/protein interactions, and in A. locustae, the conserved TAF6C domain is formed by five HEAT repeats, tightly packed against each other, defining a single structural domain. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays a key role in the recognition of promoter DNA and assembly of the pre-initiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved in transcription, as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold domain (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF6 is a shared subunit of histone acetyltransferase complex SAGA and TFIID complexes. The N-terminal HFD of TAF6 interacts with the HFD of TAF9 and makes a novel histone-like heterodimer that is structurally related to histones H4 and H3. TAF6 may also interact with the downstream core promoter element (DPE). 216
27041 153444 cd08051 gp6_gp15_like Head-Tail Connector Proteins gp6 and gp15, and similar proteins. Members of this family include the proteins gp6 and gp15 from bacteriophage HK97 and SPP1, respectively. They are critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. They form dodecameric ring structures that comprise the middle ring of the connector, located between the portal protein (attached to the head) and the gp7/gp16 ring (attached to the tail). They are components of the mature phage and the absence or mutation of HK97 gp6 or SPP1 gp15, respectively, result in defective head-tail joining and the absence of mature phage particles. The genome maps of HK97 and SPP1 show that genes encoding gp6 and gp15 are in the same relative position on the genome, located adjacent to the major capsid protein (MCP) gene and in between head and tail genes. Also included in this family is the uncharacterized Bacillus subtilis Yqbg protein, whose gene is part of the unusual genetic element called skin. The Yqbg gene is surrounded with genes similar to genes in the Bacillus subtilis prophage-like element PBSX, which encode for proteins comprising contractile-tailed phage-like particles that are produced upon mitomycin C treatment. Yqbg likely acts as a head-tail connector protein, similar to gp6 and gp15, of the PBSX-like prophage encoded in the skin element. 94
27042 153445 cd08053 Yqbg Putative Head-Tail Connector Protein Yqbg from Bacillus subtilis and similar proteins. The uncharacterized Bacillus subtilis Yqbg protein, whose gene is part of the unusual genetic element called skin, shows a similar structure to the connector proteins gp6 and gp15 from bacteriophage HK97 and SPP1, respectively. gp6 and gp15 are critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. They form dodecameric ring structures that comprise the middle ring of the connector, located between the portal protein (attached to the head) and the gp7/gp16 ring (attached to the tail). The Yqbg gene is surrounded with genes similar to genes in the Bacillus subtilis prophage-like element PBSX, which encode for proteins comprising contractile-tailed phage-like particles that are produced upon mitomycin C treatment. Yqbg likely acts as a head-tail connector protein, similar to gp6 and gp15, of the PBSX-like prophage encoded in the skin element. 121
27043 153446 cd08054 gp6 Head-Tail Connector Protein gp6 of Bacteriophage HK97 and similar proteins. The bacteriophage HK97 gp6 protein is critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. It forms a dodecameric ring structure that comprises the middle ring of the connector, located between the portal protein (attached to the head) and the gp7 ring (attached to the tail). It is a component of the mature phage and the absence of HK97 gp6 results in defective head-tail joining and the absence of mature phage particles. Although the crystal structure of HK97 gp6 shows an unexpected 13-mer ring, the biological form present in the mature phage is believed to be a dodecamer. 91
27044 153447 cd08055 gp15 Head-Tail Connector Protein gp15 of Bacteriophage SPP1 and similar proteins. The bacteriophage SPP1 gp15 protein is critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. It forms a dodecameric ring structure that comprises the middle ring of the connector, located between the portal protein (attached to the head) and the gp16 ring (attached to the tail). Binding of the gp15 and gp16 rings to the portal protein is essential to prevent leakage of packaged DNA. gp15 is a component of the mature phage and its mutation results in defective head-tail joining. 95
27045 163687 cd08056 MPN_PRP8 Mpr1p, Pad1p N-terminal (MPN) domains without isopeptidase activity found in splicing factor Prp8. Members of this family are found in pre-mRNA-processing factor 8 (Prp8) which is a critical splicing factor, interacting with several other spliceosomal proteins, snRNAs, and the pre-mRNA, thus organizing and stabilizing the spliceosome catalytic core. Prp8 is one of the largest and most highly conserved of nuclear proteins, occupying a central position in the catalytic core of the spliceosome. Its C-terminal domain exhibits a JAB1/MPN-like core similar to deubiquitinating enzymes, but does not show catalytic isopeptidase activity, possibly because the putative isopeptidase center is covered by insertions and terminal appendices that are grafted onto this core, thus impairing the metal binding site. It is proposed that this domain is a protein interaction domain instead of a Zn(2+)-dependent metalloenzyme as proposed for some MPN proteins. The DEAD-box protein Brr2 and the GTPase Snu114 bind to the Prp8 C-terminus, a region where mutations in human Prp8 (hPrp8) cause a severe form of the genetic disorder retinitis pigmentosa, RP13, which leads to progressive photoreceptor degeneration in the retina and eventual blindness. At the N-terminus of Prp8, there are several domains, including a highly variable nuclear localization signal (NLS) motif rich in prolines, a conserved RNA recognition motif (RRM), and U5 and U6 snRNA binding sites. 252
27046 163688 cd08057 MPN_euk_non_mb Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity (non metal-binding); eukaryotic. This family contains MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domain variants that lack key residues in the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Rpn7/PSMD7, Rpn8/PSMD8, CSN6, Prp8p, and the translation initiation factor 3 subunits f and h do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function. Rpn7 is known to be critical for the integrity of the 26S proteasome complex by establishing a correct lid structure. It is necessary for the incorporation/anchoring of Rpn3 and Rpn12 to the lid and essential for viability and normal mitosis. CSN6 is a highly conserved protein complex with diverse functions, including several important intracellular pathways such as the ubiquitin/proteasome system, DNA repair, cell cycle, developmental changes, and some aspects of immune responses. It cleaves ubiquitin-like protein Nedd8 (neural precursor cell expressed, developmentally downregulated 8) from cullin 1 in cells. EIF3f is a potent inhibitor of HIV-1 replication as well as an important negative regulator of cell growth and proliferation. EIF3h regulates cell growth and viability, and over-expression of the gene may provide a growth advantage to prostate, breast, and liver cancer cells. 157
27047 163689 cd08058 MPN_euk_mb Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding); eukaryotic. This family contains eukaryotic MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains found in proteins with a variety of functions, including AMSH (associated molecule with the Src homology 3 domain (SH3) of STAM), H2A-DUB (histone H2A deubiquitinase), BRCC36 (BRCA1/BRCA2-containing complex subunit 36), as well as Rpn11 (regulatory particle number 11) and CSN5 (COP9 signalosome complex subunit 5). These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity. Rpn11 is responsible for substrate deubiquitination during proteasomal degradation. It is essential for maintaining a correct cell cycle and normal mitochondrial morphology and physiology. CSN5 is critical for nuclear export and the degradation of several tumor suppressor proteins, including p53, p27, and Smad4. Over-expression of CSN5 has been implicated in cancer initiation and progression. AMSH specifically cleaves Lys63-linked and not Lys48-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. It is involved in the degradation of EGF receptor (EGFR) and possibly other ubiquitinated endocytosed proteins. BRCC36 is part of the BRCA1/BRCA2/BARD1-containing nuclear complex that displays an E3 ubiquitin ligase activity; it is targeted to DNA damage foci after irradiation. H2A-DUB is specific for monoubiquitinated H2A (uH2A), regulating transcription by coordinating histone acetylation and deubiquitination, and destabilizing the association of linker histone H1 with nucleosomes. It is a positive regulator of androgen receptor (AR) transactivation activity on a reporter gene and serves as a marker in prostate tumors. 119
27048 163690 cd08059 MPN_prok_mb Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding); prokaryotic. This family contains bacterial and archaeal MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These catalytically active domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site. 101
27049 163691 cd08060 MPN_UPF0172 Mov34/MPN/PAD-1 family: UPF0172 family of unknown function includes neighbor of COX4 (Noc4p). This family includes Noc4p (neighbor of COX4; neighbor of Cytochrome c Oxidase 4; nucleolar complex associated 4 homolog) which belongs to the family of unknown function, UPF0172, with MPN/JAMM-like domains. Proteins in this family are homologs of the NOC4 gene which is conserved in eukaryotic members including human, dog, mouse, rat, chicken, zebrafish, fruit fly, mosquito, S.pombe, K.lactis, E.gossypii, M.grisea, N.crassa, A.thaliana, and rice. NOC4 is highly expressed in the pancreas and moderately in liver, heart, lung, kidney, brain, skeletal muscle, and placenta. This nucleolar protein forms a complex with Nop14p that mediates maturation and nuclear export of 40S ribosomal subunits. This family of eukaryotic MPN-like domains lacks the key residues that coordinate a metal ion and therefore does not show catalytic isopeptidase activity. 182
27050 163692 cd08061 MPN_NPL4 Mov34/MPN/PAD-1 family: nuclear protein localization-4 (Npl4) domain. Npl4p (nuclear protein localization-4) is identical to Hmg-CoA reductase degradation 4 (HRD4) protein and contains a domain that is part of the pfam clan MPN/Mov34-like. Npl4 plays an intermediate role between endoplasmic reticulum-associated degradation (ERAD) substrate ubiquitylation and proteasomal degradation. Npl4p associates with Cdc48p (Cdc48 in yeast and p97 or valosin-containing protein (VCP) in higher eukaryotes), the highly conserved ATPase of the AAA family, via ubiquitin fusion degradation-1 protein (Ufd1p) to form a Cdc48p-Ufd1p-Npl4p complex which then functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation. This family of eukaryotic MPN-like domains lacks the key residues that coordinate a metal ion and therefore does not show catalytic isopeptidase activity. 274
27051 163693 cd08062 MPN_RPN7_8 Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in 19S proteasomal subunits Rpn7 and Rpn8. This family includes lid subunits of the 26S proteasome regulatory particles, Rpn7 (PSMD7; proteasome 26S non-ATPase subunit 7; p44), and Rpn8 (PSMD8; proteasome 26S non-ATPase subunit 8; p40; Mov34). Rpn7 is known to be critical for the integrity of the 26S proteasome complex by establishing a correct lid structure. It is necessary for the incorporation/anchoring of Rpn3 and Rpn12 to the lid and essential for viability and normal mitosis. Rpn7 and Rpn8 are ATP-independent components of the 19S regulator subunit, and contain the MPN structural motif in their N-terminal regions. However, while they show a typical MPN metalloprotease fold, they lack the canonical JAMM motif, and therefore do not show catalytic isopeptidase activity. It is suggested that Rpn7 function is primarily structural. 280
27052 163694 cd08063 MPN_CSN6 Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in COP9 signalosome complex subunit 6. CSN6 (COP9 signalosome subunit 6; COP9 subunit 6; MOV34 homolog, 34 kD) is one of the eight subunits of COP9 signalosome, a highly conserved protein complex with diverse functions, including several important intracellular pathways such as the ubiquitin/proteasome system, DNA repair, cell cycle, developmental changes, and some aspects of immune responses. CSN6 is an MPN-domain protein that directly interacts with the MPN+-domain subunit CSN5. It is cleaved during apoptosis by activated caspases. CSN6 processing occurs in CSN/CRL (cullin-RING Ub ligase) complexes and is followed by the cleavage of Rbx1, the direct interaction partner of CSN6. CSN6 cleavage enhances CSN-mediated deneddylating activity (i.e. cleavage of ubiquitin-like protein Nedd8 (neural precursor cell expressed, developmentally downregulated 8)) in the cullin 1 in cells. The cleavage of Rbx1 and increased deneddylation of cullins inactivate CRLs and presumably stabilize pro-apoptotic factors for final apoptotic steps. While CSN6 shows a typical MPN metalloprotease fold, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity. 288
27053 163695 cd08064 MPN_eIF3f Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in eIF3f. Eukaryotic translation initiation factor 3 (eIF3) subunit F (eIF3F; EIF3S5; eIF3-p47; eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa; Mov34/MPN/PAD-1 family protein) is an evolutionarily non-conserved subunit of the functional core that comprises eIF3a, eIF3b, eIF3c, eIF3e, eIF3f, and eIF3h, and contains the MPN domain. However, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity. It has been shown that eIF3f mRNA expression is significantly decreased in many human tumors including pancreatic cancer and melanoma. EIF3f is a potent inhibitor of HIV-1 replication; it mediates restriction of HIV-1 expression through several factors including the serine/arginine-rich (SR) protein 9G8, and cyclin-dependent kinase 11 (CDK11). EIF3f phosphorylation by CDK11 is important in regulating its function in translation and apoptosis. It enhances its association with the core eIF3 subunits during apoptosis, suggesting that eIF3f may inhibit translation by increasing the binding to the eIF3 complex during apoptosis. Thus, eIF3f may be an important negative regulator of cell growth and proliferation. 265
27054 163696 cd08065 MPN_eIF3h Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in eIF3h. Eukaryotic translation initiation factor 3 (eIF3) subunit h (eIF3h; eIF3 subunit 3; eIF3S3; eIF3-gamma; eIF3-p40) is an evolutionarily non-conserved subunit of the functional core that comprises eIF3a, eIF3b, eIF3c, eIF3e, eIF3f, and eIF3h, and contains the MPN domain. However, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity. Together with eIF3e and eIF3f, eIF3h stabilizes the eIF3 complex. Results suggest that eIF3h regulates cell growth and viability, and that over-expression of the gene may provide growth advantage to prostate, breast, and liver cancer cells. For example, EIF3h gene amplification is common in late-stage prostate cancer, suggesting that it may be functionally involved in the progression of the disease. It has been shown that coamplification of MYC, a well characterized oncogene involved in cell growth, differentiation, and apoptosis, and EIF3h in patients with non-small cell lung cancer (NSCLC) improves survival if treated with the Epidermal Growth Factor Receptor Tyrosine Kinase Inhibitor (EGFR-TKI), Gefitinib. Plant eIF3h is implicated in translation of specific mRNAs. 266
27055 163697 cd08066 MPN_AMSH_like Mov34/MPN/PAD-1 family. AMSH (associated molecule with the Src homology 3 domain (SH3) of STAM (signal-transducing adapter molecule, also known as STAMBP)) and AMSH-like proteins (AMSH-LP) are members of JAMM/MPN+ deubiquitinases (DUBs), with Zn2+-dependent ubiquitin isopeptidase activity. AMSH specifically cleaves Lys 63 and not Lys48-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. AMSH and AMSH-LP are anchored on the early endosomal membrane via interaction with the clathrin coat. AMSH shares a common SH3-binding site with another endosomal DUB, UBPY (ubiquitin-specific protease Y; also known as USP8), the latter being a cysteine protease that does not discriminate between Lys48 and Lys63-linked ubiquitin. AMSH is involved in the degradation of EGF receptor (EGFR) and possibly other ubiquitinated endocytosed proteins. AMSH also interacts with CHMP1, CHMP2, and CHMP3 proteins, all of which are components of ESCRT-III, suggested to be required for EGFR down-regulation. The function of AMSH-LP has not been elucidated; however, it exhibits two fundamentally distinct features from AMSH: first, there is a substitution in the critical amino acid residue in the SH3-binding motif (SBM) in the human AMSH-LP, but not in its mouse ortholog, and lacks STAM-binding ability; second, AMSH-LP lacks the ability to interact with CHMP proteins. It is therefore likely that AMSH and AMSH-LP play different roles on early endosomes. 173
27056 163698 cd08067 MPN_2A_DUB Mov34/MPN/PAD-1 family: Histone H2A deubiquitinase. This family includes histone H2A deubiquitinase (Histone H2A DUB;MYSM1; myb-like, SWIRM and MPN domains 1; 2ADUB; 2A-DUB; KIAA19152ADUB, or KIAA1915/MYSM1), a member of JAMM/MPN+ deubiquitinases (DUBs), with possible Zn2+-dependent ubiquitin isopeptidase activity. It contains the SWIRM (Swi3p, Rsc8p and Moira), and SANT (SWI-SNF, ADA N-CoR, TFIIIB)/Myb domains; the SANT, but not the SWIRM, domain can bind directly to DNA. 2A-DUB is specific for monoubiquitinated H2A (uH2A), regulating transcription by coordinating histone acetylation and deubiquitination, and destabilizing the association of linker histone H1 with nucleosomes. 2A-DUB interacts with p/CAF (p300/CBP-associated factor) in a co-regulatory protein complex, where the status of acetylation of nucleosomal histones modulates its deubiquitinase activity. 2A-DUB is a positive regulator of androgen receptor (AR) transactivation activity on a reporter gene; it participates in transcriptional regulation events in androgen receptor-dependent gene activation. In prostate tumors, the levels of uH2A are dramatically decreased, thus 2A-DUB serving as a cancer-related marker. 187
27057 163699 cd08068 MPN_BRCC36 Mov34/MPN/PAD-1 family: BRCC36, a subunit of BRCA1-A complex. BRCC36 (BRCA1-A complex subunit BRCC36; BRCA1/BRCA2-containing complex subunit 36; BRCA1/BRCA2-containing complex subunit 3; BRCC3; BRISC complex subunit BRCC36; BRCC36 isopeptidase complex; Lys-63-specific deubiquitinase BRCC36) and BRCC36-like domains are members of JAMM/MPN+ deubiquitinases (DUBs), possibly with Zn2+-dependent ubiquitin isopeptidase activity. BRCC36 is part of the BRCA1/BRCA2/BARD1-containing nuclear complex that displays an E3 ubiquitin ligase activity. It is targeted to DNA damage foci after irradiation; RAP80 recruits the Abraxas-BRCC36-BRCA1-BARD1 complex to DNA double strand breaks (DSBs) for DNA repair through specific recognition of Lys 63-linked polyubiquitinated proteins by its tandem ubiquitin-interacting motifs. A new protein, MERIT40 (mediator of RAP80 interactions and targeting 40 kDa), also named NBA1 (new component of the BRCA1 A complex), exists in the same BRCA1-containing complex and is essential for the integrity of the complex. There are studies suggesting that MERIT40/NBA1 ties BRCA1 complex integrity, DSB recognition, and ubiquitin chain activities to the DNA damage response. It has also been shown that BRCA1-containing complex resembles the lid complex of the 26S proteasome. 244
27058 163700 cd08069 MPN_RPN11_CSN5 Mov34/MPN/PAD-1 family: proteasomal regulatory protein Rpn11 and signalosome complex subunit CSN5. This family contains proteasomal regulatory protein Rpn11 (26S proteasome regulatory subunit rpn11; PAD1; POH1; RPN11; PSMD14; Rpn11 subunit of the 19S-proteasome; regulatory particle number 11) and signalosomal CSN5 (COP9 signalosome complex subunit 5; COP9 complex homolog subunit 5; c-Jun activation domain-binding protein-1; CSN5/JAB1; JAB1). COP9 signalosome (CSN) and the proteasome lid are paralogous complexes and their respective subunits CSN5 and Rpn11 are most closely related between the two complexes, both containing the conserved JAMM (JAB1/MPN/Mov34 metalloenzyme) motif involved in zinc ion coordination and providing the active site for isopeptidase activity. Rpn11 is responsible for substrate deubiquitination during proteasomal degradation. It is essential for maintaining a correct cell cycle and normal mitochondrial morphology and physiology; mutations in Rpn11 cause cell cycle and mitochondrial defects, temperature sensitivity and sensitivity to DNA damaging reagents such as UV. It has been shown that the C-terminal region of Rpn11 is involved in the regulation of the mitochondrial fission and tubulation processes. CSN5, one of the eight subunits of CSN, is critical for nuclear export and the degradation of several tumor suppressor proteins, including p53, p27, and Smad4. Its MPN+ domain is critical for the physical interaction of RUNX3 and Jab1. It has been suggested that the direct interaction of CSN5/JAB1 with p27 provides p27 with a leucine-rich nuclear export signal (NES), which is required for binding to chromosomal region maintenance 1 (CRM1), and facilitates nuclear export. The over-expression of CSN5/JAB1 also has been implicated in cancer initiation and progression, including cancer of the lung, pancreas, mouth, thyroid, and breast, suggesting that the oncogenic activity of CSN5 is associated with the down-regulation of RUNX3. 268
27059 163701 cd08070 MPN_like Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding). This family contains archaeal and bacterial MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site. 128
27060 163702 cd08071 MPN_DUF2466 Mov34/MPN/PAD-1 family. Mov34 DUF2466 (also known as DNA repair protein RadC) domain of unknown function contains the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity. However, to date, the name RadC has been misleading and no function has been determined. 113
27061 163703 cd08072 MPN_archaeal Mov34/MPN/PAD-1 family: archaeal JAB1/MPN/Mov34 metalloenzyme. This family contains only archaeal MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site. 117
27062 163704 cd08073 MPN_NLPC_P60 Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding) found in proteins also containing NlpC/P60 domains. This family contains bacterial MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains at the N-terminus of NlpC/P60 phage tail protein domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site. 108
27063 173969 cd08148 RuBisCO_large Ribulose bisphosphate carboxylase large chain. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions. 366
27064 163706 cd08150 catalase_like Catalase-like heme-binding proteins and protein domains. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity. 283
27065 163707 cd08151 AOS Allene oxide synthase. Allene oxide synthase converts a fatty acid hydroperoxide to an allene oxide, which is an unstable epoxide. In corals, the enzyme is part of an eicosanoid synthesis pathway that is initiated by a lipoxygenase, which generates the fatty acid hydroperoxides in the first step. The structure of allene oxide synthase closely resembles that of catalase, but allene oxide synthase does not have catalase activity. 328
27066 163708 cd08152 y4iL_like Catalase-like heme-binding proteins similar to the uncharacterized y4iL. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity. This family contains uncharacterized proteins similar to Rhizobium sp. NGR234 y4iL, of mostly bacterial origin. 305
27067 163709 cd08153 srpA_like Catalase-like heme-binding proteins similar to the uncharacterized srpA. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity. This family contains uncharacterized proteins similar to the Synechococcus elongatus PCC 7942 periplasmic protein srpA, of mostly bacterial origin. The plasmid-encoded srpA is regulated by sulfate, but does not seem to function in its uptake or metabolism. 295
27068 163710 cd08154 catalase_clade_1 Clade 1 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 1 catalases are found in bacteria, algae, and plants; they have a relatively small subunit size of 55 to 69 kDa, and bind a protoheme IX (heme b) group buried deep inside the structure. They appear to form tetramers. In eukaryotic cells, catalases are located in peroxisomes. 469
27069 163711 cd08155 catalase_clade_2 Clade 2 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 2 catalases are mostly found in bacteria and fungi; they have a large subunit size of 75 to 84 kDa, and bind a heme d group buried deep inside the structure. They appear to form tetramers. In eukaryotic cells, catalases are located in peroxisomes. 443
27070 163712 cd08156 catalase_clade_3 Clade 3 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 3 catalases are the most abundant subfamily and are found in all three kingdoms of life; they have a relatively small subunit size of 43 to 75 kDa, and bind a protoheme IX (heme b) group buried deep inside the structure. Clade 3 catalases also bind NADPH as a second redox-active cofactor. They form tetramers, and in eukaryotic cells, catalases are located in peroxisomes. 429
27071 163713 cd08157 catalase_fungal Fungal catalases similar to yeast catalases A and T. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. This family of fungal catalases has a relatively small subunit size, and binds a protoheme IX (heme b) group buried deep inside the structure. Fungal catalases also bind NADPH as a second redox-active cofactor. They form tetramers; in eukaryotic cells, catalases are typically located in peroxisomes. Saccharomyces cerevisiae catalase T is found in the cytoplasm, though. 451
27072 176482 cd08159 APC10-like APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination. This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here. 129
27073 380914 cd08161 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily. The Su(var)3-9, Enhancer-of-zeste, Trithorax (SET) domain superfamily corresponds to SET domain-containing lysine methyltransferases, which catalyze site and state-specific methylation of lysine residues in histones that are fundamental in epigenetic regulation of gene activation and silencing in eukaryotic organisms. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains has been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as N-SET and C-SET. C-SET forms an unusual and conserved knot-like structure of probable functional importance. In addition to N-SET and C-SET, an insert region (I-SET) and flanking regions of high structural variability form part of the overall structure. Some family members contain a pre-SET domain, which is found in a number of histone methyltransferases (HMTase), and a post-SET domain, which harbors a zinc-binding site. 72
27074 277369 cd08162 MPP_PhoA_N Synechococcus sp. strain PCC 7942 PhoA and related proteins, N-terminal metallophosphatase domain. Synechococcus sp. strain PCC 7942 PhoA is a large atypical alkaline phosphatase. It is known to be transported across the inner cytoplasmic membrane and into the periplasmic space. In vivo inactivation of the gene encoding PhoA leads to a loss of extracellular, phosphate-regulated phosphatase activity, but does not appear to affect the cell's capacity for phosphate uptake. PhoA may play a role in scavenging phosphate during growth of Synechococcus sp. strain PCC 7942 in its natural environment. PhoA belongs to a domain family which includes the bacterial enzyme UshA and several other related enzymes including SoxB, CpdB, YhcR, and CD73. All members have a similar domain architecture which includes an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 325
27075 277370 cd08163 MPP_Cdc1 Saccharomyces cerevisiae CDC1 and related proteins, metallophosphatase domain. Cdc1 (also known as XlCdc1 in Xenopus laevis) is an endoplasmic reticulum-localized transmembrane lipid phosphatase with a metallophosphatase domain facing the ER lumen. In budding yeast, the gene encoding CDC1 is essential while nonlethal mutations cause defects in Golgi inheritance and actin polarization. Cdc1 mutant cells accumulate an unidentified phospholipid, suggesting that Cdc1 is a lipid phosphatase. Cdc1 mutant cells also have highly elevated intracellular calcium levels suggesting a possible role for Cdc1 in calcium regulation. The 5' flanking region of Cdc1 is a regulatory region with conserved binding site motifs for AP1, AP2, Sp1, NF-1 and CREB. DNA polymerase delta consists of at least four subunits - Pol3, Cdc1, Cdc27, and Cdm1. Cdc1 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 257
27076 277371 cd08164 MPP_Ted1 Saccharomyces cerevisiae Ted1 and related proteins, metallophosphatase domain. Saccharomyces cerevisiae Ted1 (trafficking of Emp24p/Erv25p-dependent cargo disrupted 1) is a metallophosphatase domain-containing protein which acts together with Emp24p and Erv25p in cargo exit from the ER. Ted1 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 193
27077 277372 cd08165 MPP_MPPE1 human MPPE1 and related proteins, metallophosphatase domain. MPPE1 is a functionally uncharacterized metallophosphatase domain-containing protein. The MPPE1 gene is located on chromosome 18 and is a candidate susceptibility gene for Bipolar disorder. MPPE1 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 156
27078 277373 cd08166 MPP_Cdc1_like_1 uncharacterized subgroup related to Saccharomyces cerevisiae CDC1, metallophosphatase domain. A functionally uncharacterized subgroup related to the metallophosphatase domain of Saccharomyces cerevisiae Cdc1, S. cerevisiae Ted1 and human MPPE1. Cdc1 is an endoplasmic reticulum-localized transmembrane lipid phosphatase and is a subunit of DNA polymerase delta. TED1 (trafficking of Emp24p/Erv25p-dependent cargo disrupted 1), acts together with Emp24p and Erv25p in cargo exit from the ER. The MPPE1 gene is a candidate susceptibility gene for Bipolar disorder. Proteins in this uncharacterized subgroup belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination. 195
27079 173979 cd08168 Cytochrom_C3 Heme-binding domain of the class III cytochrome C family and related proteins. This alignment models heme binding core motifs as encountered in the cytochrome C3 family and related proteins. Cytochrome C3 is a tetraheme protein found in sulfate-reducing bacteria which use either thiosulfate or sulfate as the ultimate electron acceptors. C3 is an integral part of a complex electron transfer chain. The model also contains triheme cytochromes C7 which function in electron transfer during Fe(III) respiration by Geobacter sulfurreducens (PpcA, PpcB, PpcC, PpcD, and PpcE) and four repeated core motifs as found in the 16-heme cytochrome C HmcA of Desulfovibrio vulgaris Hildenborough which plays a role in electron transfer through the membrane following periplasmic oxidation of hydrogen (resulting in sulfate reduction in the cytoplasm). 85
27080 341448 cd08169 DHQ-like Dehydroquinate synthase-like which includes dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. This group contains dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. These proteins exhibit the dehydroquinate synthase structural fold. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. 2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose through a multi-step reaction in the biosynthesis of aminoglycoside antibiotics. 2-deoxystreptamine (DOS)-containing aminoglycoside antibiotics includes neomycin, kanamycin, gentamicin, and ribostamycin. 2-epi-5-epi-valiolone synthases catalyze the cyclization of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone in the biosynthesis of C(7)N-aminocyclitol-containing products. The cyclization product, 2-epi-5-epi-valiolone ((2S,3S,4S,5R)-5-(hydroxymethyl)cyclohexanon-2,3,4,5-tetrol), is a precursor of the valienamine moiety. The valienamine unit is responsible for their biological activities as various glycosidic hydrolases inhibitors. Two important microbial secondary metabolites, validamycin and acarbose, are used in agricultural and biomedical applications. 328
27081 341449 cd08170 GlyDH Glycerol dehydrogenase (GlyDH) catalyzes oxidation of glycerol to dihydroxyacetone in glycerol dissimilation. Glycerol dehydrogenase (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site. 351
27082 341450 cd08171 GlyDH-like Glycerol dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins that have yet to be characterized, but show sequence homology with glycerol dehydrogenase. Glycerol dehydrogenases (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site. 345
27083 341451 cd08172 GlyDH-like Glycerol_dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins that have yet to be characterized, but show sequence homology with glycerol dehydrogenase. Glycerol dehydrogenases (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site. 346
27084 341452 cd08173 Gro1PDH Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH) catalyzes the reversible conversion between dihydroxyacetone phosphate and glycerol-1-phosphate using either NADH or NADPH as a coenzyme. Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH, EC 1.1.1.261) plays an important role in the formation of the enantiomeric configuration of the glycerophosphate backbone (sn-glycerol-1-phosphate) of archaeal ether lipids. It catalyzes the reversible conversion between dihydroxyacetone phosphate and glycerol-1-phosphate using either NADH or NADPH as a coenzyme. The activity is zinc-dependent. One characteristic feature of archaea is that their cellular membrane has an ether linkage between the glycerol backbone and the hydrocarbon residues. The polar lipids of the members of Archaea consist of di- and tetra-ethers of glycerol with isoprenoid alcohols bound at the sn-2 and sn-3 positions of the glycerol moiety. The archaeal polar lipids have the enantiomeric configuration of a glycerophosphate backbone [sn-glycerol-1-phosphate (G-1-P)] that is the mirror image structure of the bacterial or eukaryal counterpart [sn-glycerol- 3-phosphate (G-3-P)]. The absolute stereochemistry of the glycerol moiety in all archaeal polar lipids is opposite to that of glycerol ester lipids in bacteria and eukarya. 343
27085 341453 cd08174 G1PDH-like Glycerol-1-phosphate dehydrogenase-like. These glycerol-1-phosphate dehydrogenase-like proteins have not been characterized. The protein sequences have high similarity with that of glycerol-1-phosphate dehydrogenase (G1PDH) which plays a role in the synthesis of phosphoglycerolipids in Gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Its activity requires a Ni++ ion. 332
27086 341454 cd08175 G1PDH Glycerol-1-phosphate dehydrogenase (G1PDH) catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Glycerol-1-phosphate dehydrogenase (G1PDH) plays a role in the synthesis of phosphoglycerolipids in Gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Its activity requires a Ni++ ion. In Bacillus subtilis, it has been described as the AraM gene in the L-arabinose (ara) operon. The AraM protein forms a homodimer. 340
27087 341455 cd08176 LPO Lactaldehyde:propanediol oxidoreductase (LPO) catalyzes the interconversion between L-lactaldehyde and L-1,2-propanediol in Escherichia coli and other enterobacteria. Lactaldehyde:propanediol oxidoreductase (LPO) is a member of the group III iron-activated dehydrogenases which catalyze the interconversion between L-lactaldehyde and L-1,2-propanediol in Escherichia coli and other enterobacteria. L-fucose and L-rhamnose are used by Escherichia coli through an inducible pathway mediated by the fucose regulon comprising four linked operons fucO, fucA, fucPIK, and fucR. The fucA-encoded aldolase catalyzes the formation of dihydroxyacetone phosphate and L-lactaldehyde. Under anaerobic conditions, with NADH as a cofactor, lactaldehyde is converted by a fucO-encoded lactaldehyde:propanediol oxidoreductase (LPO) to L-1,2-propanediol, which is excreted as a fermentation product. In mutant strains, E. coli adapted to grow on L-1,2-propanediol, FucO catalyzes the oxidation of the polyol to L-lactaldehyde. FucO is induced regardless of the respiratory conditions of the culture, and remains fully active in the absence of oxygen. In the presence of oxygen, this enzyme becomes oxidatively inactivated by a metal-catalyzed oxidation mechanism. FucO is an iron-dependent metalloenzyme that is inactivated by other metals, such as zinc, copper, or cadmium. This enzyme can also reduce glycol aldehyde with similar efficiency. Besides L-1,2-propanediol, the enzyme is also able to oxidize methanol as an alternative substrate. 378
27088 341456 cd08177 MAR Maleylacetate reductase is involved in many aromatic compounds degradation pathways of aerobic microbes. Maleylacetate reductase (MAR) plays an important role in the degradation of aromatic compounds in aerobic microbes. In fungi and yeasts, the enzyme is involved in the catabolism of compounds such as phenol, tyrosine, benzoate, 4-hydroxybenzoate and resorcinol. In bacteria, the enzyme contributes to the degradation of resorcinol, 2,4-dihydroxybenzoate ([beta]-resorcylate) and 2,6-dihydroxybenzoate ([gamma]-resorcylate) via hydroxyquinol and maleylacetate. Maleylacetate reductase catalyzes NADH- or NADPH-dependent reduction, at the carbon-carbon double bond, of maleylacetate or 2-chloromaleylacetate to 3-oxoadipate. In the case of 2-chloromaleylacetate, MAR initially catalyzes the NAD(P)H-dependent dechlorination to maleylacetate, which is then reduced to 3-oxoadipate. This enzyme is a homodimer and is inhibited by thiol-blocking reagents such as p-chloromercuribenzoate and Hg++, indicating that the cysteine residue is probably necessary for the catalytic activity of maleylacetate reductase. 337
27089 341457 cd08178 AAD_C C-terminal alcohol dehydrogenase domain of the acetaldehyde dehydrogenase-alcohol dehydrogenase bifunctional two-domain protein (AAD). This alcohol dehydrogenase domain is located on the C-terminal of a bifunctional two-domain protein. The N-terminal of the protein contains an acetaldehyde-CoA dehydrogenase domain. This protein is involved in pyruvate metabolism whereby pyruvate is converted to acetyl-CoA and formate by pyruvate formate-lyase (PFL). Under anaerobic conditions, acetyl-CoA is reduced to acetaldehyde and ethanol by this two-domain protein. Acetyl-CoA is first converted into an enzyme-bound thiohemiacetal by the N-terminal acetaldehyde dehydrogenase domain. The enzyme-bound thiohemiacetal is subsequently reduced by the C-terminal NAD+-dependent alcohol dehydrogenase domain. In E. coli, this protein is called AdhE and has been shown to have pyruvate formate-lyase (PFL) deactivase activity, which leads to the inactivation of PFL, a key enzyme in anaerobic metabolism. In Escherichia coli and Entamoeba histolytica, this enzyme forms homopolymeric peptides composed of more than 20 protomers associated in a helical rod-like structure. 400
27090 341458 cd08179 NADPH_BDH NADPH-dependent butanol dehydrogenase involved in the butanol and ethanol formation pathway in bacteria. NADPH-dependent butanol dehydrogenase (BDH) is involved in the butanol and ethanol formation pathway of some bacteria. The fermentation process is characterized by an acid producing growth phase, followed by a solvent producing phase. The latter phase is associated with the induction of solventogenic enzymes such as butanol dehydrogenase. The activity of the enzyme requires NADPH as cofactor, as well as divalent ions zinc or iron. This family is a member of the iron-containing alcohol dehydrogenase superfamily. Protein structure has a dehydroquinate synthase-like fold. 379
27091 341459 cd08180 PDD 1,3-propanediol dehydrogenase (PPD) catalyzes the reduction of 3-hydroxypropionaldehyde (3-HPA) to 1,3-propanediol in glycerol metabolism. 1,3-propanediol dehydrogenase (PPD) plays a role in glycerol metabolism of some bacteria in anaerobic conditions. In this degradation pathway, glycerol is converted in a two-step process to 1,3-propanediol (1,3-PD) which is then excreted into the extracellular medium. The first reaction involves the transformation of glycerol into 3-hydroxypropionaldehyde (3-HPA) by a coenzyme B-12-dependent dehydratase. The second reaction involves the dismutation of the 3-hydroxypropionaldehyde (3-HPA) to 1,3-propanediol by the NADH-linked 1,3-propanediol dehydrogenase (PPD). The enzyme requires iron ion for its function. Because many genes in this pathway are present in the propanediol utilization (pdu) operon, they are also named pdu genes. PPD is a member of the iron-containing alcohol dehydrogenase superfamily. The PPD structure has a dehydroquinate synthase-like fold. 333
27092 341460 cd08181 PPD-like 1,3-propanediol dehydrogenase-like (PPD). This family contains proteins similar to 1,3-propanediol dehydrogenase (PPD) which is a member of the iron-containing alcohol dehydrogenase superfamily, and exhibits a dehydroquinate synthase-like fold. Protein sequence similarity search and other biochemical evidences suggest that they are close to the iron-containing 1,3-propanediol dehydrogenase (EC 1.1.1.202). 1,3-propanediol dehydrogenase catalyzes the oxidation of propane-1,3-diol to 3-hydroxypropanal with the simultaneous reduction of NADP+ to NADPH. The protein structure of Thermotoga maritima TM0920 gene contains one NADP+ and one iron ion. 358
27093 341461 cd08182 HEPD Hydroxyethylphosphonate dehydrogenase (HEPD) catalyzes the reduction of phosphonoacetaldehyde (PnAA) to hydroxyethylphosphonate (HEP). Hydroxyethylphosphonate dehydrogenase (HEPD) catalyzes the reduction of phosphonoacetaldehyde (PnAA) to hydroxyethylphosphonate (HEP) with either NADH or NADPH as a cofactor, although NADH is the preferred cofactor. PnAA is a biosynthetic intermediate for several phosphonates such as the antibiotic fosfomycin, phosphinothricin tripeptide (PTT), and 2-aminoethylphosphonate (AEP). This enzyme is named PhpC in the PTT biosynthesis pathway in Streptomyces hygroscopicus and S. viridochromogenes. 370
27094 341462 cd08183 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 377
27095 341463 cd08184 Fe-ADH_KdnB-like Iron-containing alcohol dehydrogenase similar to Shewanella oneidensis KdnB required for Kdo8N biosynthesis. This family contains iron-containing alcohol dehydrogenase-like proteins, many of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the iron-containing alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron or zinc ions. This family also includes Shewanella oneidensis KdnB which is required for biosynthesis of 8-Amino-3,8-dideoxy-D-manno-octulosonic acid (Kdo8N), a unique amino sugar that has thus far only been observed on the lipopolysaccharides of marine bacteria belonging to the genus Shewanella, and thought to be important for the integrity of the bacterial cell outer membrane. KdnB requires NAD(P) and zinc ion for activity. 348
27096 341464 cd08185 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase-like (ADH) proteins. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase fold and is a member of the iron-containing alcohol dehydrogenase-like family. They are distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 379
27097 341465 cd08186 Fe-ADH-like Iron-containing alcohol dehydrogenase. This family contains iron-containing alcohol dehydrogenase (ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. The ADH of hyperthermophilic archaeon Thermococcus hydrothermalis oxidizes a series of primary aliphatic and aromatic alcohols, preferentially from C2 to C8, but is also active towards methanol and glycerol, and is stereospecific for monoterpenes. It has been suggested that the type III ADHs in microorganisms are involved in acetaldehyde detoxication rather than in alcohol turnover. 380
27098 341466 cd08187 BDH Butanol dehydrogenase catalyzes the conversion of butyraldehyde to butanol with the cofactor NAD(P)H being oxidized in the process. The butanol dehydrogenase (BDH) is involved in the final step of the butanol formation pathway in anaerobic micro-organism. Butanol dehydrogenase catalyzes the conversion of butyraldehyde to butanol with the cofactor NAD(P)H being oxidized in the process. Activity in the reverse direction is 50-fold lower than that in the forward direction. The NADH-BDH has higher activity with longer chained aldehydes and is inhibited by metabolites containing an adenine moiety. This protein family belongs to the so-called iron-containing alcohol dehydrogenase superfamily. Since members of this superfamily use different divalent ions, preferentially iron or zinc, it has been suggested to be renamed to family III metal-dependent polyol dehydrogenases. This family also includes E. coli YqhD enzyme, an NADP-dependent dehydrogenase whose activity measurements with several alcohols demonstrate preference for alcohols longer than C3. The active site of YqhD contains a Zn metal, and a modified NADPH cofactor bearing OH groups on the saturated C5 and C6 atoms, possibly due to oxygen stress on the enzyme, which would functionally work under anaerobic conditions. 382
27099 341467 cd08188 PDDH 1,3-Propanediol (1,3-PD) dehydrogenase. This family includes 1,3-propanediol (1,3-PD) dehydrogenase, a key enzyme in the microbial production of 1,3-PD that has been previously characterized as the product of dhaT gene in Klebsiella pneumoniae. 1,3-PD dehydrogenase is a member of the family III metal-dependent polyol dehydrogenases, which are shown to require a divalent metal ion for catalysis. However, some members of this family showed a dependence on Fe(2+) or Zn(2+) for activity. 377
27100 341468 cd08189 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domain. Proteins of this family have not been characterized. 378
27101 341469 cd08190 HOT Hydroxyacid-oxoacid transhydrogenase (HOT) involved in gamma-hydroxybutyrate metabolism. This family contains hydroxyacid-oxoacid transhydrogenase (HOT), also known as D-2-hydroxyglutarate transhydrogenase. It catalyzes the conversion of gamma-hydroxybutyrate (GHB) to succinic semialdehyde (SSA), coupled to the stoichiometric conversion of alpha-ketoglutarate to D-2-hydroxyglutarate in gamma-Hydroxybutyrate catabolism. Unlike many other alcohols, which are oxidized by NAD-linked dehydrogenases, gamma-hydroxybutyrate is metabolized to succinate semialdehyde by hydroxyacid-oxoacid transhydrogenase which does not require free NAD or NADP; instead, it uses alpha-ketoglutarate as an acceptor, converting it to d-2-hydroxyglutarate. Alpha-ketoglutarate serves as an intermediate acceptor to regenerate NAD(P) required for the oxidation of GHB. HOT also catalyzes the reversible oxidation of a hydroxyacid obligatorily coupled to the reduction of an oxoacid, and requires no cofactor. In mammals, the HOT enzyme is located in mitochondria, and is expressed with an N-terminal mitochondrial targeting sequence. HOT enzyme is member of the metal-containing alcohol dehydrogenase family. It typically contains an iron although other metal ions may be used. 412
27102 341470 cd08191 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domain. Proteins of this family have not been characterized. 392
27103 341471 cd08192 MAR-like Maleylacetate reductase is involved in many aromatic compounds degradation pathways of aerobic microbes. Maleylacetate reductase (MAR) plays an important role in the degradation of aromatic compounds in aerobic microbes. In fungi and yeasts, the enzyme is involved in the catabolism of compounds such as phenol, tyrosine, benzoate, 4-hydroxybenzoate and resorcinol. In bacteria, the enzyme contributes to the degradation of resorcinol, 2,4-dihydroxybenzoate ([beta]-resorcylate) and 2,6-dihydroxybenzoate ([gamma]-resorcylate) via hydroxyquinol and maleylacetate. Maleylacetate reductase (MAR) catalyzes NADH- or NADPH-dependent reduction, at the carbon-carbon double bond, of maleylacetate or 2-chloromaleylacetate to 3-oxoadipate. In the case of 2-chloromaleylacetate, MAR initially catalyzes the NAD(P)H-dependent dechlorination to maleylacetate, which is then reduced to 3-oxoadipate. This enzyme is a homodimer. It is inhibited by thiol-blocking reagents such as p-chloromercuribenzoate and Hg++, indicating that the cysteine residue is probably necessary for the catalytic activity of maleylacetate reductase. 380
27104 341472 cd08193 HVD 5-hydroxyvalerate dehydrogenase (HVD) catalyzes the oxidation of 5-hydroxyvalerate to 5-oxovalerate with NAD+ as cofactor. 5-hydroxyvalerate dehydrogenase (HVD) is an iron-containing (type III) NAD-dependent alcohol dehydrogenase. It plays a role in the cyclopentanol metabolism biochemical pathway. It catalyzes the oxidation of 5-hydroxyvalerate to 5-oxovalerate with NAD+ as cofactor. This cyclopentanol (cpn) degradation pathway is present in some bacteria which can use cyclopentanol as sole carbon source. In Comamonas sp. strain NCIMB 9872, this enzyme is encoded by the CpnD gene. 379
27105 341473 cd08194 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) most of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. 378
27106 341474 cd08195 DHQS Dehydroquinate synthase (DHQS) catalyzes the conversion of DAHP to DHQ in shikimate pathway for aromatic compounds synthesis. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway, which involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds, is found in bacteria, microbial eukaryotes, and plants, but not in mammals. Therefore, enzymes of this pathway are attractive targets for the development of non-toxic antimicrobial compounds, herbicides and anti-parasitic agents. The activity of DHQS requires nicotinamide adenine dinucleotide (NAD) as cofactor. A single active site in DHQS catalyzes five sequential reactions involving alcohol oxidation, phosphate elimination, carbonyl reduction, ring opening, and intramolecular aldol condensation. The binding of substrates and ligands induces domain conformational changes. In some fungi and protozoa, this domain is fused with the other four domains in shikimate pathway and forms a penta-domain AROM protein, which catalyzes steps 2-6 in the shikimate pathway. 343
27107 341475 cd08196 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contains different protein domains. Proteins of this family have not been characterized. 367
27108 341476 cd08197 DOIS 2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose. 2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose through a multistep reaction in the biosynthesis of aminoglycoside antibiotics. 2-deoxystreptamine (DOS)-containing aminoglycoside antibiotics include neomycin, kanamycin, gentamicin, and ribostamycin. They are important antibacterial agents. DOIS is a homolog of the dehydroquinate synthase which catalyzes the cyclization of 3-deoxy-D-arabino-heptulosonate-7-phosphate to dehydroquinate (DHQ) in the shikimate pathway. 355
27109 341477 cd08198 DHQS-like Dehydroquinate synthase (DHQS) catalyzes the conversion of DAHP to DHQ in shikimate pathway for aromatic compounds synthesis. This family contains dehydroquinate synthase-like proteins. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. The activity of DHQS requires NAD as cofactor. Proteins of this family share sequence similarity and functional motifs with that of dehydroquinate synthase, but the specific function has not been characterized. 366
27110 341478 cd08199 EEVS 2-epi-5-epi-valiolone synthase (EEVS). 2-epi-5-epi-valiolone synthase catalyzes the cyclization of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone in the biosynthesis of C(7)N-aminocyclitol-containing products. The cyclization product, 2-epi-5-epi-valiolone ((2S,3S,4S,5R)-5-(hydroxymethyl)cyclohexanon-2,3,4,5-tetrol), is a precursor of the valienamine moiety. The valienamine unit is responsible for their biological activities as various glycosidic hydrolases inhibitors. Two important microbial secondary metabolites, validamycin and acarbose, are used in agricultural and biomedical applications. Validamycin A is an antifungal antibiotic which has a strong trehalase inhibitory activity and has been used to control sheath blight disease in rice caused by Rhizoctonia solani. Acarbose is an alpha-glucosidase inhibitor used for the treatment of type II insulin-independent diabetes. Salbostatin produced by Streptomyces albus also belongs to this family. It exhibits strong trehalase inhibitory activity. 349
27111 173828 cd08200 catalase_peroxidase_2 C-terminal non-catalytic domain of catalase-peroxidases. This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Catalase-peroxidases can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. These enzymes are found in many archaeal and bacterial organisms where they neutralize potentially lethal hydrogen peroxide molecules generated during photosynthesis or stationary phase. Along with related intracellular fungal and plant peroxidases, catalase-peroxidases belong to plant peroxidase superfamily. Unlike the eukaryotic enzymes, they are typically comprised of two homologous domains that probably arose via a single gene duplication event. The heme binding motif is present only in the N-terminal domain; the function of the C-terminal domain is not clear. 297
27112 173829 cd08201 plant_peroxidase_like_1 Uncharacterized family of plant peroxidase-like proteins. This is a subgroup of heme-dependent peroxidases similar to plant peroxidases. Along with animal peroxidases, these enzymes belong to a group of peroxidases containing a heme prosthetic group (ferriprotoporphyrin IX) which catalyzes a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. The plant peroxidase-like superfamily is found in all three kingdoms of life and carries out a variety of biosynthetic and degradative functions. 264
27113 188876 cd08203 SAM_PNT Sterile alpha motif (SAM)/Pointed domain. Sterile alpha motif (SAM)/Pointed domain is found in about 40% of transcriptional regulators of ETS family (initially named for Erythroblastosis virus, E26-E Twenty Six). SAM Pointed domain containing proteins of this family additionally have a C-terminal ETS DNA-binding domain. In a few cases, SAM Pointed domain appears as a single domain protein. Members of this group are mostly involved in regulation of embryonic development and growth control in eukaryotes. SAM Pointed domains mediate protein-protein interactions. Depending on the subgroup, they can interact with other SAM Pointed domains forming homo or hetero dimers/oligomers and/or they can recruit a protein kinase to its target which can be a SAM Pointed domain containing protein itself or another protein that has no kinase docking site. Thus, SAM Pointed domains participate in transcriptional regulation and signal transduction. Some genes coding ETS family transcriptional regulators are proto-oncogenes. They are prone to chromosomal translocations resulting in gene fusions. Chimeric proteins with SAM Pointed domains were found in a number of different human tumors including myeloid leukemia, lymphoblastic leukemia, Ewing's sarcoma and primitive neuroectodermal tumor. Members of this family are potential targets for cancer therapy. 67
27114 350058 cd08204 ArfGap GTPase-activating protein (GAP) for the ADP ribosylation factors (ARFs). ArfGAPs are a family of proteins containing an ArfGAP catalytic domain that induces the hydrolysis of GTP bound to the small guanine nucleotide-binding protein Arf, a member of the Ras superfamily of GTPases. Like all GTP-binding proteins, Arf proteins function as molecular switches, cycling between GTP (active-membrane bound) and GDP (inactive-cytosolic) form. Conversion to the GTP-bound form requires a guanine nucleotide exchange factor (GEF), whereas conversion to the GDP-bound form is catalyzed by a GTPase activating protein (GAP). In that sense, ArfGAPs were originally proposed to function as terminators of Arf signaling, which is mediated by regulating Arf family GTP-binding proteins. However, recent studies suggest that ArfGAPs can also function as Arf effectors, independently of their GAP enzymatic activity to transduce signals in cells. The ArfGAP domain contains a C4-type zinc finger motif and a conserved arginine that is required for activity, within a specific spacing (CX2CX16CX2CX4R). ArfGAPs, which have multiple functional domains, regulate the membrane trafficking and actin cytoskeleton remodeling via specific interactions with signaling lipids such as phosphoinositides and trafficking proteins, which consequently affect cellular events such as cell growth, migration, and cancer invasion. The ArfGAP family, which includes 31 human ArfGAP-domain containing proteins, is divided into 10 subfamilies based on domain structure and sequence similarity. The ArfGAP nomenclature is mainly based on the protein domain structure. For example, ASAP1 contains ArfGAP, SH3, ANK repeat and PH domains; ARAPs contain ArfGAP, Rho GAP, ANK repeat and PH domains; ACAPs contain ArfGAP, BAR (coiled coil), ANK repeat and PH domains; and AGAPs contain Arf GAP, GTP-binding protein-like, ANK repeat and PH domains. Furthermore, the ArfGAPs can be classified into two major types of subfamilies, according to the overall domain structure: the ArfGAP1 type includes 6 subfamilies (ArfGAP1, ArfGAP2/3, ADAP, SMAP, AGFG, and GIT), which contain the ArfGAP domain at the N-terminus of the protein; and the AZAP type includes 4 subfamilies (ASAP, ACAP, AGAP, and ARAP), which contain an ArfGAP domain between the PH and ANK repeat domains. 106
27115 173970 cd08205 RuBisCO_IV_RLP Ribulose bisphosphate carboxylase like proteins, Rubisco-Form IV. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions, like for example 2,3-diketo-5-methylthiopentyl-1-phosphate enolase or 5-methylthio-d-ribulose 1-phosphate isomerase. 367
27116 173971 cd08206 RuBisCO_large_I_II_III Ribulose bisphosphate carboxylase large chain, Form I,II,III. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues. 414
27117 173972 cd08207 RLP_NonPhot Ribulose bisphosphate carboxylase like proteins from nonphototrophic bacteria. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions. The specific function of this subgroup is unknown. 406
27118 173973 cd08208 RLP_Photo Ribulose bisphosphate carboxylase like proteins from phototrophic bacteria. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions. The specific function of this subgroup is unknown. 424
27119 173974 cd08209 RLP_DK-MTP-1-P-enolase 2,3-diketo-5-methylthiopentyl-1-phosphate enolase. Ribulose bisphosphate carboxylase like proteins (RLPs) similar to B. subtilis YkrW protein, have been identified as 2,3-diketo-5-methylthiopentyl-1-phosphate enolases. They catalyze the tautomerization of 2,3-diketo-5-methylthiopentane 1-phosphate (DK-MTP 1-P). This is an important step in the methionine salvage pathway in which 5-methylthio-D-ribose (MTR) derived from 5'-methylthioadenosine is converted to methionine. 391
27120 173975 cd08210 RLP_RrRLP Ribulose bisphosphate carboxylase like proteins (RLPs) similar to R.rubrum RLP. RLP from Rhodospirillum rubrum plays a role in an uncharacterized sulfur salvage pathway and has been shown to catalyze a novel isomerization reaction that converts 5-methylthio-d-ribulose 1-phosphate to a 3:1 mixture of 1-methylthioxylulose 5-phosphate and 1-methylthioribulose 5-phosphate. 364
27121 173976 cd08211 RuBisCO_large_II Ribulose bisphosphate carboxylase large chain, Form II. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form II is mainly found in bacteria, and forms large subunit oligomers (dimers, tetramers, etc.) that do not include small subunits. 439
27122 173977 cd08212 RuBisCO_large_I Ribulose bisphosphate carboxylase large chain, Form I. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I is the most abundant class, present in plants, algae, and bacteria, and forms large complexes composed of 8 large and 8 small subunits. 450
27123 173978 cd08213 RuBisCO_large_III Ribulose bisphosphate carboxylase large chain, Form III. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form III is only found in archaea and forms large subunit oligomers (dimers or decamers) that do not include small subunits. 412
27124 270855 cd08215 STKc_Nek Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Nek family is composed of 11 different mammalian members (Nek1-11) with similarity to the catalytic domain of Aspergillus nidulans NIMA kinase, the founding member of the Nek family, which was identified in a screen for cell cycle mutants that were prevented from entering mitosis. Neks contain a conserved N-terminal catalytic domain and a more divergent C-terminal regulatory region of various sizes and structures. They are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
27125 270856 cd08216 PK_STRAD Pseudokinase domain of STE20-related kinase adapter protein. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. LKB1 is a tumor suppressor linked to the rare inherited disease, Peutz-Jeghers syndrome, which is characterized by a predisposition to benign polyps and hyperpigmentation of the buccal mucosa. There are two forms of STRAD, alpha and beta, that complex with LKB1 and MO25. The structure of STRAD-alpha is available and shows that this protein binds ATP, has an ordered activation loop, and adopts a closed conformation typical of fully active protein kinases. It does not possess activity due to nonconservative substitutions of essential catalytic residues. ATP binding enhances the affinity of STRAD for MO25. The conformation of STRAD-alpha stabilized through ATP and MO25 may be needed to activate LKB1. The STRAD subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 315
27126 270857 cd08217 STKc_Nek2 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Nek2 subfamily includes Aspergillus nidulans NIMA kinase, the founding member of the Nek family, which was identified in a screen for cell cycle mutants prevented from entering mitosis. NIMA is essential for mitotic entry and progression through mitosis, and its degradation is essential for mitotic exit. NIMA is involved in nuclear membrane fission. Vertebrate Nek2 is a cell cycle-regulated STK, localized in centrosomes and kinetochores, that regulates centrosome splitting at the G2/M phase. It also interacts with other mitotic kinases such as Polo-like kinase 1 and may play a role in spindle checkpoint. An increase in the expression of the human NEK2 gene is strongly associated with the progression of non-Hodgkin lymphoma. Nek2 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
27127 270858 cd08218 STKc_Nek1 Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek1 is associated with centrosomes throughout the cell cycle. It is involved in the formation of primary cilium and in the maintenance of centrosomes. It cycles through the nucleus and may be capable of relaying signals between the cilium and the nucleus. Nek1 is implicated in the development of polycystic kidney disease, which is characterized by benign polycystic tumors formed by abnormal overgrowth of renal epithelial cells. It appears also to be involved in DNA damage response, and may be important for both correct DNA damage checkpoint activation and DNA repair. Nek1 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
27128 173759 cd08219 STKc_Nek3 Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek3 is primarily localized in the cytoplasm and shows no cell cycle-dependent changes in its activity. It is present in the axons of neurons and affects morphogenesis and polarity through its regulation of microtubule acetylation. Nek3 modulates the signaling of the prolactin receptor through its activation of Vav2 and contributes to prolactin-mediated motility of breast cancer cells. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
27129 270859 cd08220 STKc_Nek8 Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek8 contains an N-terminal kinase catalytic domain and a C-terminal RCC1 (regulator of chromosome condensation) domain. A double point mutation in Nek8 causes cystic kidney disease in mice that genetically resembles human autosomal recessive polycystic kidney disease (ARPKD). Nek8 is also associated with a rare form of juvenile renal cystic disease, nephronophthisis type 9. It has been suggested that a defect in the ciliary localization of Nek8 contributes to the development of cysts manifested by these diseases. Nek8 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
27130 270860 cd08221 STKc_Nek9 Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 9. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek9, also called Nercc1, is primarily a cytoplasmic protein but can also localize in the nucleus. It is involved in modulating chromosome alignment and splitting during mitosis. It interacts with the gamma-tubulin ring complex and the Ran GTPase, and is implicated in microtubule organization. Nek9 associates with FACT (FAcilitates Chromatin Transcription) and modulates interphase progression. It also interacts with Nek6 and Nek7 during mitosis, resulting in their activation. Nek9 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
27131 270861 cd08222 STKc_Nek11 Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 11. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek11 is involved, through direct phosphorylation, in regulating the degradation of Cdc25A (Cell Division Cycle 25 homolog A), which plays a role in cell cycle progression and in activating cyclin dependent kinases. Nek11 is activated by CHK1 (CHeckpoint Kinase 1) and may be involved in the G2/M checkpoint. Nek11 may also play a role in the S-phase checkpoint as well as in DNA replication and genotoxic stress responses. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
27132 270862 cd08223 STKc_Nek4 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek4 is highly abundant in the testis. Its specific function is unknown. Neks are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. Nek4 is one in a family of 11 different Neks (Nek1-11). The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
27133 270863 cd08224 STKc_Nek6_7 Catalytic domain of the Serine/Threonine Kinases, Never In Mitosis gene A (NIMA)-related kinase 6 and 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek6 and Nek7 are the shortest Neks, consisting only of the catalytic domain and a very short N-terminal extension. They show distinct expression patterns and both appear to be downstream substrates of Nek9. They are required for mitotic spindle formation and cytokinesis. They may also be regulators of the p70 ribosomal S6 kinase. Nek6/7 is part of a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
27134 173765 cd08225 STKc_Nek5 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Neks are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. The specific function of Nek5 is unknown. Nek5 is one in a family of 11 different Neks (Nek1-11). The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
27135 270864 cd08226 PK_STRAD_beta Pseudokinase domain of STE20-related kinase adapter protein beta. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. STRAD-beta is also referred to as ALS2CR2 (Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 2 protein), since the human gene encoding it is located within the juvenile ALS2 critical region on chromosome 2q33-q34. It is not linked to the development of ALS2. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. LKB1 is a tumor suppressor linked to the rare inherited disease, Peutz-Jeghers syndrome, which is characterized by a predisposition to benign polyps and hyperpigmentation of the buccal mucosa. The STRAD-beta subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 328
27136 173767 cd08227 PK_STRAD_alpha Pseudokinase domain of STE20-related kinase adapter protein alpha. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. The structure of STRAD-alpha is available and shows that this protein binds ATP, has an ordered activation loop, and adopts a closed conformation typical of fully active protein kinases. It does not possess activity due to nonconservative substitutions of essential catalytic residues. ATP binding enhances the affinity of STRAD for MO25. The conformation of STRAD-alpha, stabilized through ATP and MO25, may be needed to activate LKB1. A mutation which results in a truncation of a C-terminal part of the human STRAD-alpha pseudokinase domain and disrupts its association with LKB1, leads to PMSE (polyhydramnios, megalencephaly, symptomatic epilepsy) syndrome. Several splice variants of STRAD-alpha exist which exhibit different effects on the localization and activation of LKB1. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. The STRAD alpha subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 327
27137 270865 cd08228 STKc_Nek6 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek6 is required for the transition from metaphase to anaphase. It also plays important roles in mitotic spindle formation and cytokinesis. Activated by Nek9 during mitosis, Nek6 phosphorylates Eg5, a kinesin that is important for spindle bipolarity. Nek6 localizes to spindle microtubules during metaphase and anaphase, and to the midbody during cytokinesis. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
27138 270866 cd08229 STKc_Nek7 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek7 is required for mitotic spindle formation and cytokinesis. It is enriched in the centrosome and is critical for microtubule nucleation. Nek7 is activated by Nek9 during mitosis, and may regulate the p70 ribosomal S6 kinase. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
27139 176192 cd08230 glucose_DH Glucose dehydrogenase. Glucose dehydrogenase (GlcDH), a member of the medium chain dehydrogenase/zinc-dependent alcohol dehydrogenase-like family, catalyzes the NADP(+)-dependent oxidation of glucose to gluconate, the first step in the Entner-Doudoroff pathway, an alternative to or substitute for glycolysis or the pentose phosphate pathway. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 355
27140 176193 cd08231 MDR_TM0436_like Hypothetical enzyme TM0436 resembles the zinc-dependent alcohol dehydrogenases (ADH). This group contains the hypothetical TM0436 alcohol dehydrogenase from Thermotoga maritima, proteins annotated as 5-exo-alcohol dehydrogenase, and other members of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. MDR, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 361
27141 176194 cd08232 idonate-5-DH L-idonate 5-dehydrogenase. L-idonate 5-dehydrogenase (L-ido 5-DH) catalyzes the conversion of L-idonate to 5-ketogluconate in the metabolism of L-idonate to 6-P-gluconate. In E. coli, this GntII pathway is a subsidiary pathway to the canonical GntI system, which also phosphorylates and transports gluconate. L-ido 5-DH is found in an operon with a regulator indR, transporter idnT, 5-keto-D-gluconate 5-reductase, and Gnt kinase. L-ido 5-DH is a zinc-dependent alcohol dehydrogenase-like protein. The alcohol dehydrogenase ADH-like family of proteins is a diverse group of proteins related to the first identified member, class I mammalian ADH. This group is also called the medium chain dehydrogenases/reductase family (MDR), which displays a broad range of activities and is distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal GroES-like catalytic domain. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 339
27142 176195 cd08233 butanediol_DH_like (2R,3R)-2,3-butanediol dehydrogenase. (2R,3R)-2,3-butanediol dehydrogenase, a zinc-dependent medium chain alcohol dehydrogenase, catalyzes the NAD(+)-dependent oxidation of (2R,3R)-2,3-butanediol and meso-butanediol to acetoin. BDH functions as a homodimer. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit. 351
27143 176196 cd08234 threonine_DH_like L-threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine, via NAD(H)-dependent oxidation. TDH is a member of the zinc-requiring, medium chain NAD(H)-dependent alcohol dehydrogenase family (MDR). MDRs have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. 334
27144 176197 cd08235 iditol_2_DH_like L-iditol 2-dehydrogenase. Putative L-iditol 2-dehydrogenase based on annotation of some members in this subgroup. L-iditol 2-dehydrogenase catalyzes the NAD+-dependent conversion of L-iditol to L-sorbose in fructose and mannose metabolism. This enzyme is related to sorbitol dehydrogenase, alcohol dehydrogenase, and other medium chain dehydrogenase/reductases. The zinc-dependent alcohol dehydrogenase (ADH-Zn)-like family of proteins is a diverse group of proteins related to the first identified member, class I mammalian ADH. This group is also called the medium chain dehydrogenases/reductase family (MDR) to highlight its broad range of activities and to distinguish from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal GroES-like catalytic domain. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 343
27145 176198 cd08236 sugar_DH NAD(P)-dependent sugar dehydrogenases. This group contains proteins identified as sorbitol dehydrogenases and other sugar dehydrogenases of the medium-chain dehydrogenase/reductase family (MDR), which includes zinc-dependent alcohol dehydrogenase and related proteins. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Related proteins include threonine dehydrogenase, formaldehyde dehydrogenase, and butanediol dehydrogenase. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Horse liver alcohol dehydrogenase is a dimeric enzyme and each subunit has two domains. The NAD binding domain is in a Rossmann fold and the catalytic domain contains a zinc ion to which substrates bind. There is a cleft between the domains that closes upon formation of the ternary complex. 343
27146 176199 cd08237 ribitol-5-phosphate_DH ribitol-5-phosphate dehydrogenase. NAD-linked ribitol-5-phosphate dehydrogenase, a member of the MDR/zinc-dependent alcohol dehydrogenase-like family, oxidizes the phosphate ester of ribitol-5-phosphate to xylulose-5-phosphate of the pentose phosphate pathway. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 341
27147 176200 cd08238 sorbose_phosphate_red L-sorbose-1-phosphate reductase. L-sorbose-1-phosphate reductase, a member of the MDR family, catalyzes the NADPH-dependent conversion of l-sorbose 1-phosphate to d-glucitol 6-phosphate in the metabolism of L-sorbose to (also converts d-fructose 1-phosphate to d-mannitol 6-phosphate). The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 410
27148 176201 cd08239 THR_DH_like L-threonine dehydrogenase (TDH)-like. MDR/ADH-like proteins, including a protein annotated as a threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Zinc-dependent ADHs are medium chain dehydrogenase/reductase type proteins (MDRs) and have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. In addition to alcohol dehydrogenases, this group includes quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 339
27149 176202 cd08240 6_hydroxyhexanoate_dh_like 6-hydroxyhexanoate dehydrogenase. 6-hydroxyhexanoate dehydrogenase, an enzyme of the zinc-dependent alcohol dehydrogenase-like family of medium chain dehydrogenases/reductases, catalyzes the conversion of 6-hydroxyhexanoate and NAD(+) to 6-oxohexanoate + NADH and H+. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 350
27150 176203 cd08241 QOR1 Quinone oxidoreductase (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 323
27151 176204 cd08242 MDR_like Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group contains members identified as related to zinc-dependent alcohol dehydrogenase and other members of the MDR family, including threonine dehydrogenase. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group includes various activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 319
27152 176205 cd08243 quinone_oxidoreductase_like_1 Quinone oxidoreductase (QOR). NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 320
27153 176206 cd08244 MDR_enoyl_red Possible enoyl reductase. Member identified as possible enoyl reductase of the MDR family. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 324
27154 176207 cd08245 CAD Cinnamyl alcohol dehydrogenases (CAD) and related proteins. Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes, or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 330
27155 176208 cd08246 crotonyl_coA_red crotonyl-CoA reductase. Crotonyl-CoA reductase, a member of the medium chain dehydrogenase/reductase family, catalyzes the NADPH-dependent conversion of crotonyl-CoA to butyryl-CoA, a step in (2S)-methylmalonyl-CoA production for straight-chain fatty acid biosynthesis. Like enoyl reductase, another enzyme in fatty acid synthesis, crotonyl-CoA reductase is a member of the zinc-dependent alcohol dehydrogenase-like medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 393
27156 176209 cd08247 AST1_like AST1 is a cytoplasmic protein associated with the periplasmic membrane in yeast. This group contains members identified in targeting of yeast membrane proteins ATPase. AST1 is a cytoplasmic protein associated with the periplasmic membrane in yeast, identified as a multicopy suppressor of pma1 mutants which cause temperature sensitive growth arrest due to the inability of ATPase to target to the cell surface. This family is homologous to the medium chain family of dehydrogenases and reductases. Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 352
27157 176210 cd08248 RTN4I1 Human Reticulon 4 Interacting Protein 1. Human Reticulon 4 Interacting Protein 1 is a member of the medium chain dehydrogenase/reductase (MDR) family. Reticulons are endoplasmic reticulum associated proteins involved in membrane trafficking and neuroendocrine secretion. The MDR/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 350
27158 176211 cd08249 enoyl_reductase_like enoyl_reductase_like. Member identified as possible enoyl reductase of the MDR family. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 339
27159 176212 cd08250 Mgc45594_like Mgc45594 gene product and other MDR family members. Includes Human Mgc45594 gene product of undetermined function. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 329
27160 176213 cd08251 polyketide_synthase polyketide synthase. Polyketide synthases produce polyketides in a step-by-step mechanism that is similar to fatty acid synthesis. Enoyl reductase reduces a double bond to a single bond. Erythromycin is one example of a polyketide generated by 3 complex enzymes (megasynthases). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 303
27161 176214 cd08252 AL_MDR Arginate lyase and other MDR family members. This group contains a structure identified as an arginate lyase. Other members are identified as quinone reductases, alginate lyases, and other proteins related to the zinc-dependent dehydrogenases/reductases. QOR catalyzes the conversion of a quinone and NAD(P)H to a hydroquinone and NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 336
27162 176215 cd08253 zeta_crystallin Zeta-crystallin with NADP-dependent quinone reductase activity (QOR). Zeta-crystallin is an eye lens protein with NADP-dependent quinone reductase activity (QOR). It has been cited as a structural component in mammalian eyes, but also has homology to quinone reductases in unrelated species. QOR catalyzes the conversion of a quinone and NAD(P)H to a hydroquinone and NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 325
27163 176216 cd08254 hydroxyacyl_CoA_DH 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase, N-benzyl-3-pyrrolidinol dehydrogenase, and other MDR family members. This group contains enzymes of the zinc-dependent alcohol dehydrogenase family (aka MDR), including members identified as 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase and N-benzyl-3-pyrrolidinol dehydrogenase. 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase catalyzes the conversion of 6-Hydroxycyclohex-1-enecarbonyl-CoA and NAD+ to 6-Ketoxycyclohex-1-ene-1-carboxyl-CoA, NADH, and H+. This group displays the characteristic catalytic and structural zinc sites of the zinc-dependent alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 338
27164 176217 cd08255 2-desacetyl-2-hydroxyethyl_bacteriochlorophyllide_like 2-desacetyl-2-hydroxyethyl bacteriochlorophyllide and other MDR family members. This subgroup of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family has members identified as 2-desacetyl-2-hydroxyethyl bacteriochlorophyllide A dehydrogenase and alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 277
27165 176218 cd08256 Zn_ADH2 Alcohol dehydrogenases of the MDR family. This group has the characteristic catalytic and structural zinc-binding sites of the zinc-dependent alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 350
27166 176219 cd08258 Zn_ADH4 Alcohol dehydrogenases of the MDR family. This group shares the zinc coordination sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 306
27167 176220 cd08259 Zn_ADH5 Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. This group contains proteins that share the characteristic catalytic and structural zinc-binding sites of the zinc-dependent alcohol dehydrogenase family. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 332
27168 176221 cd08260 Zn_ADH6 Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. This group has the characteristic catalytic and structural zinc sites of the zinc-dependent alcohol dehydrogenases. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 345
27169 176222 cd08261 Zn_ADH7 Alcohol dehydrogenases of the MDR family. This group contains members identified as related to zinc-dependent alcohol dehydrogenase and other members of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group includes various activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 337
27170 176223 cd08262 Zn_ADH8 Alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 341
27171 176224 cd08263 Zn_ADH10 Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 367
27172 176225 cd08264 Zn_ADH_like2 Alcohol dehydrogenases of the MDR family. This group resembles the zinc-dependent alcohol dehydrogenases of the medium chain dehydrogenase family. However, this subgroup does not contain the characteristic catalytic zinc site. Also, it contains an atypical structural zinc-binding pattern: DxxCxxCxxxxxxxC. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 325
27173 176226 cd08265 Zn_ADH3 Alcohol dehydrogenases of the MDR family. This group resembles the zinc-dependent alcohol dehydrogenase and has the catalytic and structural zinc-binding sites characteristic of this group. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc. 384
27174 176227 cd08266 Zn_ADH_like1 Alcohol dehydrogenases of the MDR family. This group contains proteins related to the zinc-dependent alcohol dehydrogenases. However, while the group has the structural zinc site characteristic of these enzymes, it lacks the consensus site for a catalytic zinc. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 342
27175 176228 cd08267 MDR1 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 319
27176 176229 cd08268 MDR2 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 328
27177 176230 cd08269 Zn_ADH9 Alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. 312
27178 176231 cd08270 MDR4 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 305
27179 176232 cd08271 MDR5 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 325
27180 176233 cd08272 MDR6 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 326
27181 176234 cd08273 MDR8 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 331
27182 176235 cd08274 MDR9 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 350
27183 176236 cd08275 MDR3 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 337
27184 176237 cd08276 MDR7 Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 336
27185 176238 cd08277 liver_alcohol_DH_like Liver alcohol dehydrogenase. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates. For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES.
These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 365
27186 176239 cd08278 benzyl_alcohol_DH Benzyl alcohol dehydrogenase. Benzyl alcohol dehydrogenase is similar to liver alcohol dehydrogenase, but has some amino acid substitutions near the active site, which may determine the enzyme's specificity of oxidizing aromatic substrates. Also known as aryl-alcohol dehydrogenases, they catalyze the conversion of an aromatic alcohol + NAD+ to an aromatic aldehyde + NADH + H+. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone.
In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 365
27187 176240 cd08279 Zn_ADH_class_III Class III alcohol dehydrogenase. Glutathione-dependent formaldehyde dehydrogenases (FDHs, Class III ADH) are members of the zinc-dependent/medium chain alcohol dehydrogenase family. FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. Class III ADH are also known as glutathione-dependent formaldehyde dehydrogenase (FDH), which convert aldehydes to corresponding carboxylic acid and alcohol. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 363
27188 176241 cd08281 liver_ADH_like1 Zinc-dependent alcohol dehydrogenases (ADH) and class III ADH (AKA formaldehyde dehydrogenase). NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. This group contains members identified as zinc-dependent alcohol dehydrogenases (ADH), and class III ADH (aka formaldehyde dehydrogenase, FDH). Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. Class III ADH are also known as glutathione-dependent formaldehyde dehydrogenase (FDH), which convert aldehydes to the corresponding carboxylic acid and alcohol. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft.
Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 371
27189 176242 cd08282 PFDH_like Pseudomonas putida aldehyde-dismutating formaldehyde dehydrogenase (PFDH). Formaldehyde dehydrogenase (FDH) is a member of the zinc-dependent/medium chain alcohol dehydrogenase family. Unlike typical FDH, Pseudomonas putida aldehyde-dismutating FDH (PFDH) is glutathione-independent. PFDH converts 2 molecules of aldehydes to corresponding carboxylic acid and alcohol. The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Like the zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. Unlike ADH, where NAD(P)(H) acts as a cofactor, NADH in FDH is a tightly bound redox cofactor (similar to nicotinamide proteins). The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 375
27190 176243 cd08283 FDH_like_1 Glutathione-dependent formaldehyde dehydrogenase related proteins, child 1. Members identified as glutathione-dependent formaldehyde dehydrogenase (FDH), a member of the zinc-dependent/medium chain alcohol dehydrogenase family. FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Like many zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these FDHs form dimers, with 4 zinc ions per dimer. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 386
27191 176244 cd08284 FDH_like_2 Glutathione-dependent formaldehyde dehydrogenase related proteins, child 2. Glutathione-dependent formaldehyde dehydrogenases (FDHs) are members of the zinc-dependent/medium chain alcohol dehydrogenase family. FDH converts formaldehyde and NAD to formate and NADH. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. These tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 344
27192 176245 cd08285 NADP_ADH NADP(H)-dependent alcohol dehydrogenases. This group is predominated by atypical alcohol dehydrogenases; they exist as tetramers and exhibit specificity for NADP(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Like other zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), tetrameric ADHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains; however, they do not have a structural zinc in a lobe of the catalytic domain. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 351
27193 176246 cd08286 FDH_like_ADH2 formaldehyde dehydrogenase (FDH)-like. This group is related to formaldehyde dehydrogenase (FDH), which is a member of the zinc-dependent/medium chain alcohol dehydrogenase family. This family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Another member is identified as a dihydroxyacetone reductase. Like the zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. Unlike ADH, where NAD(P)(H) acts as a cofactor, NADH in FDH is a tightly bound redox cofactor (similar to nicotinamide proteins). The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 345
27194 176247 cd08287 FDH_like_ADH3 formaldehyde dehydrogenase (FDH)-like. This group contains proteins identified as alcohol dehydrogenases and glutathione-dependent formaldehyde dehydrogenases (FDH) of the zinc-dependent/medium chain alcohol dehydrogenase family. The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. FDH converts formaldehyde and NAD to formate and NADH. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. 345
27195 176248 cd08288 MDR_yhdh Yhdh putative quinone oxidoreductases. Yhdh putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR act in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft.
Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 324
27196 176249 cd08289 MDR_yhfp_like Yhfp putative quinone oxidoreductases. Yhfp putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR act in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft.
Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 326
27197 176250 cd08290 ETR 2-enoyl thioester reductase (ETR). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 341
27198 176251 cd08291 ETR_like_1 2-enoyl thioester reductase (ETR) like proteins, child 1. 2-enoyl thioester reductase (ETR) like proteins. ETR catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.
NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 324
27199 176252 cd08292 ETR_like_2 2-enoyl thioester reductase (ETR) like proteins, child 2. 2-enoyl thioester reductase (ETR) like proteins. ETR catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.
NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain. 324
27200 176253 cd08293 PTGR2 Prostaglandin reductase. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group of the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14,-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 345
27201 176254 cd08294 leukotriene_B4_DH_like 13-PGR is a bifunctional enzyme with delta-13 15-prostaglandin reductase and leukotriene B4 12-hydroxydehydrogenase activity. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group of the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14,-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 329
27202 176255 cd08295 double_bond_reductase_like Arabidopsis alkenal double bond reductase and leukotriene B4 12-hydroxydehydrogenase. This group includes proteins identified as the Arabidopsis alkenal double bond reductase and leukotriene B4 12-hydroxydehydrogenase. The Arabidopsis enzyme, a member of the medium chain dehydrogenase/reductase family, catalyzes the reduction of 7-8-double bond of phenylpropanal substrates as a plant defense mechanism. Prostaglandins and related eicosanoids (lipid mediators involved in host defense and inflammation) are metabolized by the oxidation of the 15(S)-hydroxyl group of the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14,-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. Leukotriene B4 (LTB4) can be metabolized by LTB4 20-hydroxylase in inflammatory cells, and in other cells by bifunctional LTB4 12-HD/PGR. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 338
27203 176256 cd08296 CAD_like Cinnamyl alcohol dehydrogenases (CAD). Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 333
27204 176257 cd08297 CAD3 Cinnamyl alcohol dehydrogenases (CAD). These alcohol dehydrogenases are related to the cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Cinnamyl alcohol dehydrogenases (CAD) reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 341
27205 176258 cd08298 CAD2 Cinnamyl alcohol dehydrogenases (CAD). These alcohol dehydrogenases are related to the cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Cinnamyl alcohol dehydrogenases (CAD) reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 329
27206 176259 cd08299 alcohol_DH_class_I_II_IV class I, II, IV alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. This group includes alcohol dehydrogenases corresponding to mammalian classes I, II, IV. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. 373
27207 176260 cd08300 alcohol_DH_class_III class III alcohol dehydrogenases. Members identified as glutathione-dependent formaldehyde dehydrogenase (FDH), a member of the zinc dependent/medium chain alcohol dehydrogenase family. FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. MDH family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes or ketones. Like many zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these FDHs form dimers, with 4 zinc ions per dimer. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 368
27208 176261 cd08301 alcohol_DH_plants Plant alcohol dehydrogenase. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates. For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 369
27209 176720 cd08304 DD Death Domain Superfamily of protein-protein interaction domains. The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer. 69
27210 260019 cd08305 Pyrin Pyrin: a protein-protein interaction domain. The Pyrin domain (or PYD), also called DAPIN or PAAD, is a subfamily of the Death Domain (DD) superfamily and it functions in several signaling pathways. The Pyrin domain is found at the N-terminus of a variety of proteins and serves as a linker that recruits other domains into signaling complexes. Pyrin-containing proteins include NALPs, ASC (Apoptosis-associated speck-like protein containing a CARD), and the interferon-inducible p200 (IFI-200) family of proteins which includes the human IFI-16, myeloid cell nuclear differentiation antigen (MNDA) and absent in melanoma (AIM) 2. NALPs are members of the NBS-LRR family of proteins possessing a tripartite domain structure including a C-terminal LRR (leucine-rich repeats), a central nucleotide-binding site (NBS) domain or NACHT (for neuronal apoptosis inhibitor protein, CIITA, HET-E and TP1), and an N-terminal protein-protein interaction domain, which is a Pyrin domain in the case of NALPs. ASC and NALPs are involved in the regulation of inflammation. ASC, NALP1 and NALP3 are involved in the assembly of the 'inflammasome', a multiprotein platform which is formed in response to infection or injury and is responsible for caspase-1 activation and regulation of IL-1beta maturation. NALP12 functions as a negative regulator of inflammation. The p200 proteins are involved in the regulation of cell cycle and differentiation. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including Caspase activation and recruitment domain (CARD) and Death Effector Domain (DED). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 73
27211 260020 cd08306 Death_FADD Fas-associated Death Domain protein-protein interaction domain. Death domain (DD) found in FAS-associated via death domain (FADD). FADD is a component of the death-inducing signaling complex (DISC) and serves as an adaptor in the signaling pathway of death receptor proteins. It modulates apoptosis as well as non-apoptotic processes such as cell cycle progression, survival, innate immune signaling, and hematopoiesis. FADD contains an N-terminal DED and a C-terminal DD. Its DD interacts with the DD of the activated death receptor, FAS, and its DED recruits the initiator caspases, caspase-8 and -10, to the DISC complex via a homotypic interaction with the N-terminal DED of the caspase. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. 85
27212 260021 cd08307 Death_Pelle Death domain of the protein kinase Pelle. Death domain (DD) of the protein kinase Pelle from Drosophila melanogaster and similar proteins. In Drosophila, interaction between the DDs of Tube and Pelle is an important component of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and in mediating innate immune responses to pathogens. Tube and Pelle transmit the signal from the Toll receptor to the Dorsal/Cactus complex. Pelle also functions in photoreceptor axon targeting. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 97
27213 260022 cd08308 Death_Tube Death domain of Tube. Death domains (DDs) similar to the DD in the protein Tube from Drosophila melanogaster. In Drosophila, interaction between the DDs of Tube and Pelle is an important component of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and also in mediating innate immune response to pathogens. Tube and Pelle transmit the signal from the Toll receptor to the Dorsal/Cactus complex. Some members of this subfamily contain a C-terminal kinase domain, like Pelle, in addition to the DD. Tube has no counterpart in vertebrates. It contains an N-terminal DD and a C-terminal region with five copies of the Tube repeat, an 8-amino acid motif. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 128
27214 260023 cd08309 Death_IRAK Death domain of Interleukin-1 Receptor-Associated Kinases. Death Domains (DDs) found in Interleukin-1 (IL-1) Receptor-Associated Kinases (IRAK1-4) and similar proteins. IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. All four types are involved in signal transduction involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB, and mitogen-activated protein kinase pathways. IRAK1 and IRAK4 are active kinases while IRAK2 and IRAK-M (also called IRAK3) are inactive. In general, IRAKs are expressed ubiquitously, except for IRAK-M which is detected only in macrophages. The insect homologs, Pelle and Tube, are important components of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and also in the innate immune response. Most members have an N-terminal DD followed by a kinase domain. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 88
27215 260024 cd08310 Death_NFkB-like Death domain of Nuclear Factor-KappaB precursor proteins. Death Domain (DD) of Nuclear Factor-KappaB (NF-kB) precursor proteins. The NF-kB family of transcription factors plays a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). Two of these, NF-kB1 and NF-kB2, are produced from the processing of the precursor proteins p105 and p100, respectively. In addition to RHD, p105 and p100 contain ANK repeats and a C-terminal DD. NF-kBs are regulated by the Inhibitor of NF-kB (IkB) Kinase (IKK) complex through classical and non-canonical pathways, which differ in the IKK subunits involved and downstream targets. IKKs facilitate the release of NF-kB dimers from an inactive state, allowing them to migrate to the nucleus where they regulate gene transcription. The precursor proteins p105 and p100 function as IkBs and as NF-kB proteins after being processed by the proteasome. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 72
27216 260025 cd08311 Death_p75NR Death domain of p75 Neurotrophin Receptor. Death Domain (DD) found in p75 neurotrophin receptor (p75NTR, NGFR, TNFRSF16). p75NTR binds members of the neurotrophin (NT) family including nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), and NT3, among others. It contains an NT-binding extracellular region that bears four cysteine-rich repeats, a transmembrane domain, and an intracellular DD. p75NTR plays roles in the immune, vascular, and nervous systems, and has been shown to promote cell death or survival, and to induce neurite outgrowth or collapse depending on its ligands and co-receptors. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 80
27217 260026 cd08312 Death_MyD88 Death domain of Myeloid Differentiation primary response protein MyD88. Death Domain (DD) of Myeloid Differentiation primary response protein 88 (MyD88). MyD88 is an adaptor protein involved in interleukin-1 receptor (IL-1R)- and Toll-like receptor (TLR)-induced activation of nuclear factor-kappaB (NF-kB) and mitogen activated protein kinase pathways that lead to the induction of proinflammatory cytokines. It is a key component in the signaling pathway of pathogen recognition in the innate immune system. MyD88 contains an N-terminal DD and a C-terminal Toll/IL-1 Receptor (TIR) homology domain that mediates interaction with TLRs and IL-1R. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 79
27218 176729 cd08313 Death_TNFR1 Death domain of Tumor Necrosis Factor Receptor 1. Death Domain (DD) found in tumor necrosis factor receptor-1 (TNFR-1). TNFR-1 has many names including TNFRSF1A, CD120a, p55, p60, and TNFR60. It activates two major intracellular signaling pathways that lead to the activation of the transcription factor NF-kB and the induction of cell death. Upon binding of its ligand TNF, TNFR-1 trimerizes, which leads to the recruitment of an adaptor protein named TNFR-associated death domain protein (TRADD) through a DD/DD interaction. Mutations in the TNFRSF1A gene cause TNFR-associated periodic syndrome (TRAPS), a rare disorder characterized by recurrent fever, myalgia, abdominal pain, conjunctivitis and skin eruptions. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 80
27219 260027 cd08315 Death_TRAILR_DR4_DR5 Death domain of Tumor necrosis factor-Related Apoptosis-Inducing Ligand Receptors. Death Domain (DD) found in Tumor necrosis factor-Related Apoptosis-Inducing Ligand (TRAIL) Receptors. In mammals, this family includes TRAILR1 (also called DR4 or TNFRSF10A) and TRAILR2 (also called DR5, TNFRSF10B, or KILLER). They function as receptors for the cytokine TRAIL and are involved in apoptosis signaling pathways. TRAIL preferentially induces apoptosis in cancer cells while exhibiting little toxicity in normal cells. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 88
27220 260028 cd08316 Death_FAS_TNFRSF6 Death domain of FAS or TNF receptor superfamily member 6. Death Domain (DD) found in the FS7-associated cell surface antigen (FAS). FAS, also known as TNFRSF6 (TNF receptor superfamily member 6), APT1, CD95, FAS1, or APO-1, together with FADD (Fas-associating via Death Domain) and caspase 8, is an integral part of the death inducing signalling complex (DISC), which plays an important role in the induction of apoptosis and is activated by binding of the ligand FasL to FAS. FAS also plays a critical role in self-tolerance by eliminating cell types (autoreactive T and B cells) that contribute to autoimmunity. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 94
27221 260029 cd08317 Death_ank Death domain associated with Ankyrins. Death Domain (DD) associated with Ankyrins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. Ankyrins function as adaptor proteins and they interact, through ANK repeats, with structurally diverse membrane proteins, including ion channels/pumps, calcium release channels, and cell adhesion molecules. They play critical roles in the proper expression and membrane localization of these proteins. In mammals, this family includes ankyrin-R for restricted (or ANK1), ankyrin-B for broadly expressed (or ANK2) and ankyrin-G for general or giant (or ANK3). They are expressed in different combinations in many tissues and play non-overlapping functions. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27222 260030 cd08318 Death_NMPP84 Death domain of Nuclear Matrix Protein P84. Death domain (DD) found in the Nuclear Matrix Protein P84 (also known as HPR1 or THOC1). HPR1/p84 resides in the nuclear matrix and is part of the THO complex, also called TREX (transcription/export) complex, which functions in mRNP biogenesis at the interface between transcription and export of mRNA from the nucleus. Mice lacking THOC1 have abnormal testis development and are sterile. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27223 260031 cd08319 Death_RAIDD Death domain of RIP-associated ICH-1 homologous protein with a death domain. Death domain (DD) of RAIDD (RIP-associated ICH-1 homologous protein with a death domain), also known as CRADD (Caspase and RIP adaptor). RAIDD is an adaptor protein that together with the p53-inducible protein PIDD and caspase-2, forms the PIDDosome complex, which is required for caspase-2 activation and plays a role in mediating stress-induced apoptosis. RAIDD contains an N-terminal Caspase Activation and Recruitment Domain (CARD), which interacts with the caspase-2 CARD, and a C-terminal DD, which interacts with the DD of PIDD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD, DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27224 260032 cd08320 Pyrin_NALPs Pyrin death domain found in NALP proteins. Pyrin Death Domain found in NALP (NACHT, LRR and PYD domains) proteins including NALP1 (CARD7, NLRP1), NALP3 (NLRP3, Cryopyrin, CIAS1), and NALP12 (NLRP12, Monarch-1), among others. Mammals contain at least 14 NALP proteins, named NALP1-14 (or NLRP1-14). NALPs are members of the NBS-LRR family of proteins possessing a tripartite domain structure including a C-terminal LRR (leucine-rich repeats), a central nucleotide-binding site (NBS) domain or NACHT (for neuronal apoptosis inhibitor protein, CIITA, HET-E and TP1), and an N-terminal protein-protein interaction domain, which is a Pyrin domain in the case of NALPs. The NBS-LRR family is also referred to as the NLR (Nod-like Receptor) or CATERPILLAR (for CARD, transcription enhancer, R-(purine)-binding, pyrin, lots of LRRs) family. NALP1 contains an additional Caspase activation and recruitment domain (CARD) at the C-terminus. NALP1 and NALP3 are both involved in the assembly of the 'inflammasome', a multiprotein platform which is formed in response to infection or injury and is responsible for caspase-1 activation and regulation of IL-1beta maturation. NALP1-inflammasomes recognize specific substances while NALP3-inflammasomes respond to many diverse triggers. Mutations in the NALP3 gene are associated with a broad spectrum of autoinflammatory disorders including Muckle-Wells Syndrome (MWS), familial cold autoinflammatory syndrome (FCAS), and chronic neurologic cutaneous and articular syndrome (CINCA). NALP12 functions as a negative regulator of inflammation. In general, Pyrin is a subfamily of the Death Domain (DD) superfamily and functions in several signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27225 260033 cd08321 Pyrin_ASC-like Pyrin Death Domain found in ASC. Pyrin Death Domain found in ASC (Apoptosis-associated speck-like protein containing a CARD) and similar proteins. ASC is an adaptor molecule that functions in the assembly of the 'inflammasome', a multiprotein platform, which is responsible for caspase-1 activation and regulation of IL-1beta maturation. ASC contains two domains from the Death Domain (DD) superfamily, an N-terminal pyrin-like domain and a C-terminal Caspase activation and recruitment domain (CARD). Through these 2 domains, ASC serves as an adaptor for inflammasome integrity and oligomerizes to form supramolecular assemblies. Included in this family is human PYNOD (also known as NLRP10 or NOD8) which via its Pyrin domain suppresses oligomerization of ASC, and ASC-mediated NF-kappaB activation. Other members of this subfamily are associated with ATPase domains and their function remains unknown. In general, Pyrin is a subfamily of the DD superfamily and functions in several signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD and Death Effector Domain (DED). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 82
27226 260034 cd08323 CARD_APAF1 Caspase activation and recruitment domain similar to that found in Apoptotic Protease-Activating Factor 1. Caspase activation and recruitment domain (CARD) similar to that found in apoptotic protease-activating factor 1 (APAF-1), which is an activator of caspase-9. APAF-1 contains WD-40 repeats, a CARD, and an ATPase domain. Upon stimulation, APAF-1, together with caspase-9, forms the heptameric 'apoptosome', which leads to the processing and activation of caspase-9, starting a caspase cascade which leads to apoptosis. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27227 260035 cd08324 CARD_NOD1_CARD4 Caspase activation and recruitment domain similar to that found in NOD1. Caspase activation and recruitment domain (CARD) found in human NOD1 (CARD4) and similar proteins. NOD1 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD1, as well as NOD2, the N-terminal effector domain is a CARD. Nod1-CARD has been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 85
27228 260036 cd08325 CARD_CASP1-like Caspase activation and recruitment domain found in Caspase-1 and related proteins. Caspase activation and recruitment domain (CARD) similar to those found in Caspase-1 (CASP1, ICE) and related proteins, including CARD-only proteins such as ICEBERG or CARD18, INCA (CARD17), CARD16 (COP1, PSEUDO-ICE), CARD8 (DACAR, NDPP1, TUCAN), and CARD12 (NLRC4), as well as ICE-like caspases such as CASP12, CASP5 (ICH-3) and CASP4 (TX, ICH-2). Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. CASP1 plays a central role in the cellular response to a wide variety of microbial and non-microbial stimuli, being activated by the inflammasome or the pyroptosome. CARD8 binds itself and the initiator caspase-9, interfering with the binding of APAF-1 and suppressing caspase-9 activation. CARD12 is a Nod-like receptor (NLR) that plays an important role in the innate immune response to Gram-negative bacteria. Caspase-4 (CASP4), -5 (CASP5), and -12 (CASP12) are inflammatory caspases implicated in inflammation and endoplasmic reticulum stress-induced apoptosis. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27229 176740 cd08326 CARD_CASP9 Caspase activation and recruitment domain of Caspase-9. Caspase activation and recruitment domain (CARD) similar to that found in caspase-9 (CASP9, MCH6, APAF3), which interacts with the CARD of apoptotic protease-activating factor 1 (APAF-1). Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-9 is the initiator caspase associated with the intrinsic or mitochondrial pathway of apoptosis, induced by many pro-apoptotic signals. Together with APAF-1, it forms the heptameric 'apoptosome' in response to the release of cytochrome c from mitochondria. Activated caspase-9 cleaves and activates downstream effector caspases, like caspase-3, caspase-6, and caspase-7, resulting in apoptosis. In general, CARDs are death domains (DDs) associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27230 260037 cd08327 CARD_RAIDD Caspase activation and recruitment domain of RIP-associated ICH-1 homologous protein with a death domain. Caspase activation and recruitment domain (CARD) of RAIDD (RIP-associated ICH-1 homologous protein with a death domain), also known as CRADD (Caspase and RIP adaptor). RAIDD is an adaptor protein that together with the p53-inducible protein PIDD and caspase-2, forms the PIDDosome complex, which is required for caspase-2 activation and plays a role in mediating stress-induced apoptosis. RAIDD contains an N-terminal CARD, which interacts with the caspase-2 CARD, and a C-terminal Death domain (DD), which interacts with the DD of PIDD. In general, CARDs are DDs associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 94
27231 260038 cd08329 CARD_BIRC2_BIRC3 Caspase activation and recruitment domain found in Baculoviral IAP repeat-containing proteins, BIRC2 (c-IAP1) and BIRC3 (c-IAP2). Caspase activation and recruitment domain (CARD) similar to those found in Baculoviral IAP repeat (BIR)-containing protein 2 (BIRC2) or cellular Inhibitor of Apoptosis Protein 1 (c-IAP1), and BIRC3 (or c-IAP2). IAPs are anti-apoptotic proteins that contain at least one BIR domain. Most IAPs also contain a C-terminal RING domain. In addition, both BIRC2 and BIRC3 contain a CARD. BIRC2 and BIRC3, through their binding with TRAF (TNF receptor-associated factor) 2, are recruited to TNFR-1/2 signaling complexes, where they regulate caspase-8 activity. They also play important roles in pro-survival NF-kB signaling pathways. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 94
27232 260039 cd08330 CARD_ASC_NALP1 Caspase activation and recruitment domain found in Human ASC, NALP1, and similar proteins. Caspase activation and recruitment domain (CARD) similar to those found in human ASC (Apoptosis-associated speck-like protein containing a CARD) and NALP1 (CARD7, NLRP1). ASC, an adaptor molecule, and NALP1, a member of the Nod-like receptor (NLR) family, are involved in the assembly of the 'inflammasome', a multiprotein platform, which is responsible for caspase-1 activation and regulation of IL-1beta maturation. In general, CARDs are death domains (DDs) associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 81
27233 260040 cd08332 CARD_CASP2 Caspase activation and recruitment domain of Caspase-2. Caspase activation and recruitment domain (CARD) similar to that found in caspase-2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Caspase-2 (also known as ICH1, NEDD2, or CASP2) is one of the most evolutionarily conserved caspases, and plays a role in apoptosis, DNA damage response, cell cycle regulation, and tumor suppression. It is localized in the nucleus and exhibits properties of both an initiator and an effector caspase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 87
27234 260041 cd08333 DED_Caspase_8_r1 Death effector domain, repeat 1, of Caspase-8. Death effector domain (DED) found in caspase-8 (CASP8, FLICE), repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-10, and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 also plays many important non-apoptotic functions including roles in embryonic development, cell adhesion and motility, immune cell proliferation and differentiation, T-cell activation, and NFkappaB signaling. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 82
27235 260042 cd08334 DED_Caspase_8_10_r2 Death effector domain, repeat 2, of initiator caspases 8 and 10. Death Effector Domain (DED) found in caspase-8 and caspase-10, repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis, and they play partially redundant roles. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. They contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27236 260043 cd08336 DED_FADD Death Effector Domain found in Fas-Associated via Death Domain. Death Effector Domain (DED) found in Fas-Associated via Death Domain (FADD). DEDs comprise a subfamily of the Death Domain (DD) superfamily. FADD is a component of the death-inducing signaling complex (DISC) and serves as an adaptor in the signaling pathway of death receptor proteins. It modulates apoptosis as well as non-apoptotic processes such as cell cycle progression, survival, innate immune signaling, and hematopoiesis. FADD contains an N-terminal DED and a C-terminal DD. Its DD interacts with the DD of the activated death receptor and its DED recruits the initiator caspases 8 and 10 to the DISC complex via a homotypic interaction with the N-terminal DED of the caspase. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. 82
27237 260044 cd08337 DED_c-FLIP_r1 Death Effector Domain, repeat 1, of cellular FLICE-Inhibitory Protein. Death Effector Domain (DED), repeat 1, similar to that found in FLICE-inhibitory protein (c-FLIP/CASH, also known as Casper/iFLICE/FLAME-1/CLARP/MRIT/usurpin). c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of the Death Inducing Signalling Complex (DISC). At low levels, c-FLIP has been shown to enhance apoptotic signaling by allosterically activating caspase-8. As a modulator of the initiator caspases, c-FLIP regulates life and death in various types of cells and tissues. All members contain two N-terminal DEDs and a C-terminal pseudo-caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 80
27238 260045 cd08338 DED_PEA15 Death Effector Domain of Astrocyte phosphoprotein PEA-15. Death Effector Domain (DED) similar to that found in PEA-15 (Astrocyte phosphoprotein PEA-15). PEA-15 is a multifunctional phosphoprotein that modulates signaling pathways, like the ERK MAP kinase cascade by binding to ERK and changing its subcellular localization. It has been implicated in apoptosis, cell proliferation, and glucose metabolism. It does not possess enzymatic activity and mainly acts as an adaptor protein. PEA-15 contains an N-terminal DED domain and a C-terminal disordered region. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. 84
27239 176750 cd08339 DED_DEDD-like Death Effector Domain of DEDD and DEDD2. Death Effector Domain (DED) found in DEDD and DEDD2. Both proteins have a single N-terminal DED and a long C-terminal portion with no known domains. DEDD has been shown to block mitotic progression by inhibiting Cdk1 and to be involved in regulating the insulin signaling cascade. DEDD and DEDD2 can bind to themselves, to each other, and to the two tandem DED-containing caspases, caspase-8 and -10. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. 97
27240 260046 cd08340 DED_c-FLIP_r2 Death Effector Domain, repeat 2, of cellular FLICE-Inhibitory Protein. Death Effector Domain (DED), repeat 2, similar to that found in cellular FLICE-inhibitory protein (c-FLIP/CASH, also known as Casper/iFLICE/FLAME-1/CLARP/MRIT/usurpin). c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of the Death Inducing Signalling Complex (DISC). At low levels, c-FLIP has been shown to enhance apoptotic signaling by allosterically activating caspase-8. As a modulator of the initiator caspases, c-FLIP regulates life and death in various types of cells and tissues. All members contain two N-terminal DEDs and a C-terminal pseudo-caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 81
27241 260047 cd08341 DED_Caspase_10_r1 Death effector domain, repeat 1, of Caspase-10. Death effector domain (DED) found in caspase-10, repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-10 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-8 and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 82
27242 319930 cd08342 HPPD_N_like N-terminal domain of 4-hydroxyphenylpyruvate dioxygenase (HPPD) and hydroxymandelate Synthase (HmaS). HPPD and HmaS are non-heme iron-dependent dioxygenases, which modify a common substrate, 4-hydroxyphenylpyruvate (HPP), but yield different products. HPPD catalyzes the second reaction in tyrosine catabolism, the conversion of HPP to homogentisate (2,5-dihydroxyphenylacetic acid, HG). HmaS converts HPP to 4-hydroxymandelate, a committed step in the formation of hydroxyphenylglycine, a structural component of nonproteinogenic macrocyclic peptide antibiotics, such as vancomycin. If the emphasis is on catalytic chemistry, HPPD and HmaS are classified as members of a large family of alpha-keto acid dependent mononuclear non-heme iron oxygenases most of which require Fe(II), molecular oxygen, and an alpha-keto acid (typically alpha-ketoglutarate) to either oxygenate or oxidize a third substrate. Both enzymes are exceptions in that they require two, instead of three, substrates, do not use alpha-ketoglutarate, and incorporate both atoms of dioxygen into the aromatic product. Both HPPD and HmaS exhibit duplicate beta barrel topology in their N- and C-terminal domains which share sequence similarity, suggestive of a gene duplication. Each protein has only one catalytic site, located in the C-terminal domain. This HPPD_N_like domain represents the N-terminal domain. 141
27243 319931 cd08343 ED_TypeI_classII_C C-terminal domain of type I, class II extradiol dioxygenases, catalytic domain. This family contains the C-terminal, catalytic domain of type I, class II extradiol dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site; extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon, whereas intradiol enzymes cleave the aromatic ring between two hydroxyl groups. Extradiol dioxygenases are classified into type I and type II enzymes. Type I extradiol dioxygenases include class I and class II enzymes. These two classes of enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. The extradiol dioxygenases represented in this family are type I, class II enzymes, and are composed of N- and C-terminal domains of similar structural fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. A catalytically essential metal, Fe(II) or Mn(II), is present in all the enzymes in this family. 132
27244 319932 cd08344 MhqB_like_N N-terminal domain of MhqB, a type I extradiol dioxygenase, and similar proteins. This subfamily contains the N-terminal, non-catalytic, domain of Burkholderia sp. NF100 MhqB and similar proteins. MhqB is a type I extradiol dioxygenase involved in the catabolism of methylhydroquinone, an intermediate in the degradation of fenitrothion. The purified enzyme has shown extradiol ring cleavage activity toward 3-methylcatechol. Fe2+ was suggested as a cofactor, the same as most other enzymes in the family. Burkholderia sp. NF100 MhqB is encoded on the plasmid pNF1. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. 112
27245 319933 cd08345 Fosfomycin_RP Fosfomycin resistant protein. This family contains three types of fosfomycin resistant protein. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. The three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin [(1R,2S)-epoxypropylphosphonic acid] inactive. FosB catalyzes the addition of L-cysteine to the epoxide ring of fosfomycin. FosX catalyzes the addition of a water molecule to the C1 position of the antibiotic with inversion of configuration at C1. FosA catalyzes the addition of glutathione to the antibiotic fosfomycin, making it inactive. Catalytic activities of both FosX and FosA are Mn(II)-dependent, but FosB is activated by Mg(II). Fosfomycin resistant proteins are evolutionarily related to glyoxalase I and type I extradiol dioxygenases. 118
27246 319934 cd08346 PcpA_N_like N-terminal domain of Sphingobium chlorophenolicum 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. The N-terminal domain of Sphingobium chlorophenolicum (formerly Sphingomonas chlorophenolica) 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. PcpA is a key enzyme in the pentachlorophenol (PCP) degradation pathway, catalyzing the conversion of 2,6-dichloro-p-hydroquinone to 2-chloromaleylacetate. This domain belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. 124
27247 319935 cd08347 PcpA_C_like C-terminal domain of Sphingobium chlorophenolicum 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. The C-terminal domain of Sphingobium chlorophenolicum (formerly Sphingomonas chlorophenolica) 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. PcpA is a key enzyme in the pentachlorophenol (PCP) degradation pathway, catalyzing the conversion of 2,6-dichloro-p-hydroquinone to 2-chloromaleylacetate. This domain belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. 157
27248 319936 cd08348 BphC2-C3-RGP6_C_like The single-domain 2,3-dihydroxybiphenyl 1,2-dioxygenases. This subfamily contains Rhodococcus globerulus P6 BphC2-RGP6 and BphC3-RGP6, and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, yielding 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoic acid. This is the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic mechanism. Most type I extradiol dioxygenases are activated by Fe(II). Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6); all three enzymes are type I extradiol dioxygenases. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases, which form hexamers. BphC1-RGP6 has an internal duplication; it is a two-domain dioxygenase which forms octamers, and its two domains do not belong to this subfamily. 137
27249 319937 cd08349 BLMA_like Bleomycin binding protein (BLMA) and similar proteins. BLMA, also called bleomycin resistance protein, confers bleomycin (Bm) resistance by directly binding to Bm. Bm is a glycopeptide antibiotic produced naturally by actinomycetes. It is a potent anti-cancer drug, which acts as a strong DNA-cutting agent, thereby causing cell death. BLMA is produced by actinomycetes to protect themselves against their own lethal compound. BLMA has two identically-folded subdomains, with the same alpha/beta fold; these two halves have no sequence similarity. BLMAs are dimers and each dimer binds two Bm molecules at the Bm-binding pockets formed at the dimer interface. BLMA belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. As for the larger superfamily, this family contains members with or without domain swapping. 114
27250 319938 cd08350 BLMT_like BLMT, a bleomycin resistance protein encoded on the transposon Tn5, and similar proteins. BLMT is a bleomycin (Bm) resistance protein, encoded by the ble gene on the transposon Tn5. This protein confers a survival advantage to Escherichia coli host cells. Bm is a glycopeptide antibiotic produced naturally by actinomycetes. It is a potent anti-cancer drug, which acts as a strong DNA-cutting agent, thereby causing cell death. BLMT has strong binding affinity to Bm and it protects against this lethal compound through drug sequestering. BLMT has two identically-folded subdomains, with the same alpha/beta fold; these two halves have no sequence similarity. BLMT is a dimer with two Bm-binding pockets formed at the dimer interface. 118
27251 319939 cd08351 ChaP_like ChaP, an enzyme involved in the biosynthesis of the antitumor agent chartreusin (cha), and similar proteins. ChaP is an enzyme involved in the biosynthesis of the potent antitumor agent chartreusin (cha). Cha is an aromatic polyketide glycoside produced by Streptomyces chartreusis. ChaP may play a role as a meta-cleavage dioxygenase in the oxidative rearrangement of the anthracyclic polyketide. ChaP belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. 118
27252 319940 cd08352 VOC_Bs_YwkD_like vicinal oxygen chelate (VOC) family protein Bacillus subtilis YwkD and similar proteins. This uncharacterized subfamily of the vicinal oxygen chelate (VOC) family contains Bacillus subtilis YwkD and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. The VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping. 123
27253 319941 cd08353 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 142
27254 319942 cd08354 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 122
27255 319943 cd08355 TioX_like Micromonospora sp. TioX and similar proteins. Micromonospora sp. TioX is encoded by a gene of the thiocoraline biosynthetic gene cluster. Thiocoraline is a thiodepsipeptide with potent antitumor activity. TioX may be involved in thiocoraline resistance or secretion. TioX belongs to vicinal oxygen chelate (VOC) superfamily that is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 123
27256 319944 cd08356 VOC_CChe_VCA0619_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. uncharacterized subfamily of vicinal oxygen chelate (VOC) family contains Vibrio cholerae VCA0619 and similar proteins. The VOC superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 113
27257 319945 cd08357 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 124
27258 319946 cd08358 GLOD4_N N-terminal domain of human glyoxalase domain-containing protein 4 and similar proteins. Uncharacterized subfamily of the vicinal oxygen chelate (VOC) superfamily contains human glyoxalase domain-containing protein 4 and similar proteins. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 127
27259 319947 cd08359 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 119
27260 319948 cd08360 MhqB_like_C C-terminal domain of Burkholderia sp. NF100 MhqB and similar proteins. This subfamily contains the C-terminal, catalytic, domain of Burkholderia sp. NF100 MhqB and similar proteins. MhqB is a type I extradiol dioxygenase involved in the catabolism of methylhydroquinone, an intermediate in the degradation of fenitrothion. The purified enzyme has shown extradiol ring cleavage activity toward 3-methylcatechol. Fe2+ was suggested as a cofactor, the same as most other enzymes in the family. Burkholderia sp. NF100 MhqB is encoded on the plasmid pNF1. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. 134
27261 319949 cd08361 PpCmtC_N N-terminal domain of 2,3-dihydroxy-p-cumate-3,4-dioxygenase (PpCmtC). This subfamily contains the N-terminal, non-catalytic, domain of PpCmtC. 2,3-dihydroxy-p-cumate-3,4-dioxygenase (CmtC of Pseudomonas putida F1) is a dioxygenase involved in the eight-step catabolism pathway of p-cymene. CmtC acts upon the reaction intermediate 2,3-dihydroxy-p-cumate, yielding 2-hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate. The CmtC belongs to the type I family of extradiol dioxygenases. Fe2+ was suggested as a cofactor, same as other enzymes in the family. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. 124
27262 319950 cd08362 BphC5-RrK37_N_like N-terminal, non-catalytic, domain of BphC5 (2,3-dihydroxybiphenyl 1,2-dioxygenase) from Rhodococcus rhodochrous K37, and similar proteins. 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). The enzyme contains an N-terminal and a C-terminal domain of similar structure fold, resulting from an ancient gene duplication. BphC belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic activity. Polychlorinated biphenyl degrading bacteria demonstrate multiplicity of BphCs. Bacterium Rhodococcus rhodochrous K37 has eight genes encoding BphC enzymes. This family includes the N-terminal domain of BphC5-RrK37. The crystal structure of the protein from Novosphingobium aromaticivorans has a Mn(II) in the active site, although most proteins of type I extradiol dioxygenases are activated by Fe(II). 120
27263 319951 cd08363 FosB fosfomycin resistant protein subfamily FosB. This subfamily contains FosB, a fosfomycin resistant protein. FosB is a Mg(2+)-dependent L-cysteine thiol transferase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosB catalyzes the Mg(II) dependent addition of L-cysteine to the epoxide ring of fosfomycin, (1R,2S)-epoxypropylphosphonic acid, rendering it inactive. FosB is evolutionarily related to glyoxalase I and type I extradiol dioxygenases. 131
27264 319952 cd08364 FosX fosfomycin resistant protein subfamily FosX. This subfamily contains FosX, a fosfomycin resistant protein. FosX is a Mn(II)-dependent fosfomycin-specific epoxide hydrolase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosX catalyzes the addition of a water molecule to the C1 position of the antibiotic with inversion of the configuration at C1 in the presence of Mn(II). The hydrated fosfomycin loses its inhibitory activity. FosX is evolutionarily related to glyoxalase I and type I extradiol dioxygenases. 130
27265 176483 cd08365 APC10-like1 APC10-like DOC1 domains of E3 ubiquitin ligases that mediate substrate ubiquitination. This model represents the APC10-like DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. APC10/DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included here. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10/DOC1 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here. 131
27266 176484 cd08366 APC10 APC10 subunit of the anaphase-promoting complex (APC) that mediates substrate ubiquitination. This model represents the single domain protein APC10, a subunit of the anaphase-promoting complex (APC), which is a multi-subunit E3 ubiquitin ligase. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a vital component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC (also known as the cyclosome), is a cell cycle-regulated E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. In mitosis, the APC initiates sister chromatid separation by ubiquitinating the anaphase inhibitor securin and triggers exit from mitosis by ubiquitinating cyclin B. The C-terminus of APC10 binds to CDC27/APC3, an APC subunit that contains multiple tetratrico peptide repeats. APC10 domains are homologous to the DOC1 domains present in the HECT (Homologous to the E6-AP Carboxyl Terminus) E3 ubiquitin ligase protein, and the Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase complex. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here. 139
27267 176262 cd08367 P53 P53 DNA-binding domain. P53 is a tumor suppressor gene product; mutations in p53 or lack of expression are found associated with a large fraction of all human cancers. P53 is activated by DNA damage and acts as a regulator of gene expression that ultimately blocks progression through the cell cycle. P53 binds to DNA as a tetrameric transcription factor. In its inactive form, p53 is bound to the ring finger protein Mdm2, which promotes its ubiquitinylation and subsequent proteasomal degradation. Phosphorylation of p53 disrupts the Mdm2-p53 complex, while the stable and active p53 binds to regulatory regions of its target genes, such as the cyclin-kinase inhibitor p21, which complexes and inactivates cdk2 and other cyclin complexes. 179
27268 259829 cd08368 LIM LIM is a small protein-protein interaction domain, containing two zinc fingers. LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid). 53
27269 187712 cd08369 FMT_core Formyltransferase, catalytic core domain. Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis. 173
27270 187727 cd08370 FMT_C_like Carboxy-terminal domain of Formyltransferase and similar domains. This family represents the C-terminal domain of formyltransferase and similar proteins. This domain is found in a variety of enzymes with formyl transferase and alkyladenine DNA glycosylase activities. The proteins with formyltransferase function include methionyl-tRNA formyltransferase, ArnA, 10-formyltetrahydrofolate dehydrogenase and HypX proteins. Although most proteins with formyl transferase activity contain this C-terminal domain, prokaryotic glycinamide ribonucleotide transformylase (GART), a single domain protein, only contains the core catalytic domain. Thus, the C-terminal domain is not required for formyl transferase catalytic activity and may be involved in substrate binding. Some members of this family have shown nucleic acid binding capacity. The C-terminal domain of methionyl-tRNA formyltransferase is involved in tRNA binding. Alkyladenine DNA glycosylase is a distant member of this family with very low sequence similarity to other members. It catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site and shows ability to bind to DNA. 73
27271 187740 cd08371 Lumazine_synthase-like lumazine synthase and riboflavin synthase; involved in the riboflavin (vitamin B2) biosynthetic pathway. This superfamily contains lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS) and riboflavin synthase (RS). Both enzymes play important roles in the riboflavin biosynthetic pathway. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. In the final steps of the riboflavin biosynthetic pathway, LS catalyzes the condensation of the 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy- 2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), and RS catalyzes a dismutation of DMRL which yields riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. In bacteria and eukaryotes, there are two types of LS: type-I LS forms homo-pentamers or icosahedrally arranged dodecamers of pentamers, type-II LS forms decamers (dimers of pentamers). In archaea LSs and RSs appear to have diverged early in the evolution of archaea from a common ancestor. 129
27272 197306 cd08372 EEP Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily. This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins. 241
27273 176019 cd08373 C2A_Ferlin C2 domain first repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 127
27274 176020 cd08374 C2F_Ferlin C2 domain sixth repeat in Ferlin. Ferlins are involved in vesicle fusion events. Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together. There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6. Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1). Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E. In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the sixth C2 repeat, C2F, and has a type-II topology. 133
27275 176021 cd08375 C2_Intersectin C2 domain present in Intersectin. A single instance of the C2 domain is located C terminally in the intersectin protein. Intersectin functions as a scaffolding protein, providing a link between the actin cytoskeleton and the components of endocytosis and plays a role in signal transduction. In addition to C2, intersectin contains several additional domains including: Eps15 homology domains, SH3 domains, a RhoGEF domain, and a PH domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. The members here have topology I. 136
27276 176022 cd08376 C2B_MCTP_PRT C2 domain second repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane. MCTP is composed of a variable N-terminal sequence, three C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 116
27277 176023 cd08377 C2C_MCTP_PRT C2 domain third repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane. The cds in this family contain multiple C2 domains as well as a C-terminal PRT domain. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 119
27278 176024 cd08378 C2B_MCTP_PRT_plant C2 domain second repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane. Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 121
27279 176025 cd08379 C2D_MCTP_PRT_plant C2 domain fourth repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane. Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. MCTPs are one of four protein classes that are anchored to membranes via a transmembrane region, the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology. 126
27280 176026 cd08380 C2_PI3K_like C2 domain present in phosphatidylinositol 3-kinases (PI3Ks). C2 domain present in all classes of PI3Ks. PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. In addition, some PI3Ks contain a Ras-binding domain and/or a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains members with the first C2 repeat, C2A, and a type-I topology, as well as some with a single C2 repeat. 156
27281 176027 cd08381 C2B_PI3K_class_II C2 domain second repeat present in class II phosphatidylinositol 3-kinases (PI3Ks). There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain an N-terminal C2 domain, a PIK domain, and a kinase catalytic domain. Unlike class I and class III, class II PI3Ks additionally have a PX domain and a C-terminal C2 domain containing a nuclear localization signal, both of which bind phospholipids, though in slightly different ways. PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 122
27282 176028 cd08382 C2_Smurf-like C2 domain present in Smad ubiquitination-related factor (Smurf)-like proteins. A single C2 domain is found in Smurf proteins, C2-WW-HECT-domain E3s, which play an important role in the downregulation of the TGF-beta signaling pathway. Smurf proteins also regulate cell shape, motility, and polarity by degrading small guanosine triphosphatases (GTPases). C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have type-II topology. 123
27283 176029 cd08383 C2A_RasGAP C2 domain (first repeat) of Ras GTPase activating proteins (GAPs). RasGAPs suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way they can control cellular proliferation and differentiation. The proteins here all contain either a single C2 domain or two tandem C2 domains, a Ras-GAP domain, and a pleckstrin homology (PH)-like domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology. 117
27284 176030 cd08384 C2B_Rabphilin_Doc2 C2 domain second repeat present in Rabphilin and Double C2 domain. Rabphilin is found in neurons and in neuroendocrine cells, while Doc2 is found not only in the brain but also in other tissues, including mast cells, chromaffin cells, and osteoblasts. Rabphilin and Doc2s share highly homologous tandem C2 domains, although their N-terminal structures are completely different: rabphilin contains an N-terminal Rab-binding domain (RBD), whereas Doc2 contains an N-terminal Munc13-1-interacting domain (MID). C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 133
27285 176031 cd08385 C2A_Synaptotagmin-1-5-6-9-10 C2A domain first repeat present in Synaptotagmins 1, 5, 6, 9, and 10. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 1, a member of class 1 synaptotagmins, is located in the brain and endocranium and localized to the synaptic vesicles and secretory granules. It functions as a Ca2+ sensor for fast exocytosis as do synaptotagmins 5, 6, and 10. It is distinguished from the other synaptotagmins by having an N-glycosylated N-terminus. Synaptotagmins 5, 6, and 10, members of class 3 synaptotagmins, are located primarily in the brain and localized to the active zone and plasma membrane. They are distinguished from the other synaptotagmins by having disulfide bonds at their N-terminus. Synaptotagmin 6 also regulates the acrosome reaction, a unique Ca2+-regulated exocytosis, in sperm. Synaptotagmin 9, a class 5 synaptotagmin, is located in the brain and localized to the synaptic vesicles. It is thought to be a Ca2+-sensor for dense-core vesicle exocytosis. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 124
27286 176032 cd08386 C2A_Synaptotagmin-7 C2A domain first repeat present in Synaptotagmin 7. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 7, a member of class 2 synaptotagmins, is located in presynaptic plasma membranes in neurons, dense-core vesicles in endocrine cells, and lysosomes in fibroblasts. It has been shown to play a role in regulation of Ca2+-dependent lysosomal exocytosis in fibroblasts and may also function as a vesicular Ca2+-sensor. It is distinguished from the other synaptotagmins by having over 12 splice forms. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 125
27287 176033 cd08387 C2A_Synaptotagmin-8 C2A domain first repeat present in Synaptotagmin 8. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 124
27288 176034 cd08388 C2A_Synaptotagmin-4-11 C2A domain first repeat present in Synaptotagmins 4 and 11. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmins 4 and 11, class 4 synaptotagmins, are located in the brain. Their functions are unknown. They are distinguished from the other synaptotagmins by having an Asp to Ser substitution in their C2A domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 128
27289 176035 cd08389 C2A_Synaptotagmin-14_16 C2A domain first repeat present in Synaptotagmins 14 and 16. Synaptotagmins 14 and 16 are membrane-trafficking proteins in specific tissues outside the brain. Both of these contain C-terminal tandem C2 repeats, but only Synaptotagmin 14 has an N-terminal transmembrane domain and a putative fatty-acylation site. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium, and this is indeed the case here. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 124
27290 176036 cd08390 C2A_Synaptotagmin-15-17 C2A domain first repeat present in Synaptotagmins 15 and 17. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. It is thought to be involved in the trafficking and exocytosis of secretory vesicles in non-neuronal tissues and is Ca2+ independent. Human synaptotagmin 15 has 2 alternatively spliced forms that encode proteins with different C-termini. The larger, SYT15a, contains an N-terminal TM region, a putative fatty-acylation site, and 2 tandem C-terminal C2 domains. The smaller, SYT15b, lacks the C-terminal portion of the second C2 domain. Unlike most other synaptotagmins it is nearly absent in the brain and rather is found in the heart, lungs, skeletal muscle, and testis. Synaptotagmin 17 is located in the brain, kidney, and prostate and is thought to be a peripheral membrane protein. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 123
27291 176037 cd08391 C2A_C2C_Synaptotagmin_like C2 domain first and third repeat in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains either the first or third repeat in Synaptotagmin-like proteins with a type-I topology. 121
27292 176038 cd08392 C2A_SLP-3 C2 domain first repeat present in Synaptotagmin-like protein 3. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. The SHD of Slps (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. Little is known about the expression or localization of Slp3. The C2A domain of Slp3 is Ca2+ dependent. It has been demonstrated that Slp3 promotes dense-core vesicle exocytosis. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 128
27293 176039 cd08393 C2A_SLP-1_2 C2 domain first repeat present in Synaptotagmin-like proteins 1 and 2. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane. Additionally, their C2A domains are both Ca2+ independent, unlike Slp3 and Slp4/granuphilin which are Ca2+ dependent. It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 125
27294 176040 cd08394 C2A_Munc13 C2 domain first repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner. Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 proteins are the mammalian homologs, which are expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2-related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 127
27295 176041 cd08395 C2C_Munc13 C2 domain third repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner. Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 proteins are the mammalian homologs, which are expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2-related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology. 120
27296 176042 cd08397 C2_PI3K_class_III C2 domain present in class III phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks phosphorylate phosphatidylinositol, phosphatidylinositol 4-phosphate (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. These are the only domains identified in the class III PI3Ks present in this cd. In addition some PI3Ks contain a Ras-binding domain and/or a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 159
27297 176043 cd08398 C2_PI3K_class_I_alpha C2 domain present in class I alpha phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks phosphorylate phosphatidylinositol, phosphatidylinositol 4-phosphate (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. The members here are class I, alpha isoform PI3Ks and contain both a Ras-binding domain and a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-I topology. 158
27298 176044 cd08399 C2_PI3K_class_I_gamma C2 domain present in class I gamma phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks phosphorylate phosphatidylinositol, phosphatidylinositol 4-phosphate (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. The members here are class I, gamma isoform PI3Ks and contain both a Ras-binding domain and a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-I topology. 178
27299 176045 cd08400 C2_Ras_p21A1 C2 domain present in RAS p21 protein activator 1 (RasA1). RasA1 is a GAP1 (GTPase activating protein 1), a Ras-specific GAP member, which suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation. RasA1 contains a C2 domain, a Ras-GAP domain, a pleckstrin homology (PH)-like domain, an SH3 domain, and 2 SH2 domains. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology. 126
27300 176046 cd08401 C2A_RasA2_RasA3 C2 domain first repeat present in RasA2 and RasA3. RasA2 and RasA3 are GAP1s (GTPase activating protein 1s), Ras-specific GAP members, which suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way they can control cellular proliferation and differentiation. RasA2 and RasA3 are both inositol 1,3,4,5-tetrakisphosphate-binding proteins and contain an N-terminal C2 domain, a Ras-GAP domain, a pleckstrin-homology (PH) domain which localizes them to the plasma membrane, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 121
27301 176047 cd08402 C2B_Synaptotagmin-1 C2 domain second repeat present in Synaptotagmin 1. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 1, a member of the class 1 synaptotagmins, is located in the brain and endocranium and localized to the synaptic vesicles and secretory granules. It functions as a Ca2+ sensor for fast exocytosis. It, like synaptotagmin-2, has an N-glycosylated N-terminus. Synaptotagmin 4, a member of class 4 synaptotagmins, is located in the brain. Its functions are unknown. It, like synaptotagmin-11, has an Asp to Ser substitution in its C2A domain. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 136
27302 176048 cd08403 C2B_Synaptotagmin-3-5-6-9-10 C2 domain second repeat present in Synaptotagmins 3, 5, 6, 9, and 10. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 3, a member of class 3 synaptotagmins, is located in the brain and localized to the active zone and plasma membrane. It functions as a Ca2+ sensor for fast exocytosis. It, along with synaptotagmins 5, 6, and 10, has disulfide bonds at its N-terminus. Synaptotagmin 9, a class 5 synaptotagmin, is located in the brain and localized to the synaptic vesicles. It is thought to be a Ca2+-sensor for dense-core vesicle exocytosis. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 134
27303 176049 cd08404 C2B_Synaptotagmin-4 C2 domain second repeat present in Synaptotagmin 4. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 4, a member of class 4 synaptotagmins, is located in the brain. Its functions are unknown. It, like synaptotagmin-11, has an Asp to Ser substitution in its C2A domain. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.
This cd contains the second C2 repeat, C2B, and has a type-I topology. 136
27304 176050 cd08405 C2B_Synaptotagmin-7 C2 domain second repeat present in Synaptotagmin 7. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 7, a member of class 2 synaptotagmins, is located in presynaptic plasma membranes in neurons, dense-core vesicles in endocrine cells, and lysosomes in fibroblasts. It has been shown to play a role in regulation of Ca2+-dependent lysosomal exocytosis in fibroblasts and may also function as a vesicular Ca2+-sensor. It is distinguished from the other synaptotagmins by having over 12 splice forms. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 136
27305 176051 cd08406 C2B_Synaptotagmin-12 C2 domain second repeat present in Synaptotagmin 12. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 12, a member of class 6 synaptotagmins, is located in the brain. Its functions are unknown. It, like synaptotagmins 8 and 13, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.
This cd contains the second C2 repeat, C2B, and has a type-I topology. 136
27306 176052 cd08407 C2B_Synaptotagmin-13 C2 domain second repeat present in Synaptotagmin 13. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 13, a member of class 6 synaptotagmins, is located in the brain. Its functions are unknown. It, like synaptotagmins 8 and 12, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.
This cd contains the second C2 repeat, C2B, and has a type-I topology. 138
27307 176053 cd08408 C2B_Synaptotagmin-14_16 C2 domain second repeat present in Synaptotagmins 14 and 16. Synaptotagmins 14 and 16 are membrane-trafficking proteins in specific tissues outside the brain. Both of these contain C-terminal tandem C2 repeats, but only Synaptotagmin 14 has an N-terminal transmembrane domain and a putative fatty-acylation site. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium and this is indeed the case here. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 138
27308 176054 cd08409 C2B_Synaptotagmin-15 C2 domain second repeat present in Synaptotagmin 15. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. It is thought to be involved in the trafficking and exocytosis of secretory vesicles in non-neuronal tissues and is Ca2+ independent. Human synaptotagmin 15 has 2 alternatively spliced forms that encode proteins with different C-termini. The larger, SYT15a, contains an N-terminal TM region, a putative fatty-acylation site, and 2 tandem C-terminal C2 domains. The smaller, SYT15b, lacks the C-terminal portion of the second C2 domain. Unlike most other synaptotagmins it is nearly absent in the brain and rather is found in the heart, lungs, skeletal muscle, and testis. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 137
27309 176055 cd08410 C2B_Synaptotagmin-17 C2 domain second repeat present in Synaptotagmin 17. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 17 is located in the brain, kidney, and prostate and is thought to be a peripheral membrane protein. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology. 135
27310 176103 cd08411 PBP2_OxyR The C-terminal substrate-binding domain of the LysR-type transcriptional regulator OxyR, a member of the type 2 periplasmic binding fold protein superfamily. OxyR senses hydrogen peroxide and is activated through the formation of an intramolecular disulfide bond. The OxyR activation induces the transcription of genes necessary for the bacterial defense against oxidative stress. The OxyR of LysR-type transcriptional regulator family is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The C-terminal domain also contains the redox-active cysteines that mediate the redox-dependent conformational switch. Thus, the interaction between the OxyR-tetramer and DNA is notably different between the oxidized and reduced forms. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27311 176104 cd08412 PBP2_PAO1_like The C-terminal substrate-binding domain of putative LysR-type transcriptional regulator PAO1-like, a member of the type 2 periplasmic binding fold protein superfamily. This family includes the C-terminal substrate domain of a putative LysR-type transcriptional regulator from the plant pathogen Pseudomonas aeruginosa PAO1 and its closely related homologs. The LysR-type transcriptional regulators (LTTRs) are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of N2 fixing bacteria, and synthesis of virulence factors, to name a few. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.
Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 198
27312 176105 cd08413 PBP2_CysB_like The C-terminal substrate domain of LysR-type transcriptional regulators CysB-like contains type 2 periplasmic binding fold. CysB is a transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the regulation of transcription in response to sulfur source is attributed to two transcriptional regulators, CysB and Cbl. CysB, in association with Cbl, downregulates the expression of the ssuEADCB operon, which is required for the utilization of sulfur from aliphatic sulfonates, in the presence of cysteine. Also, Cbl and CysB together directly function as transcriptional activators of tauABCD genes, which are required for utilization of taurine as a sulfur source for growth. Like many other members of the LTTR family, CysB is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains.
This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 198
27313 176106 cd08414 PBP2_LTTR_aromatics_like The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the catabolism of aromatic compounds and that of other related regulators, contains type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LTTRs involved in degradation of aromatic compounds, such as CbnR, BenM, CatM, ClcR and TfdR, as well as that of other transcriptional regulators clustered together in phylogenetic trees, including XapR, HcaR, MprR, IlvR, BudR, AlsR, LysR, and OccR. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 197
27314 176107 cd08415 PBP2_LysR_opines_like The C-terminal substrate-domain of LysR-type transcriptional regulators involved in the catabolism of opines and that of related regulators, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulators, OccR and NocR, involved in the catabolism of opines and that of LysR for lysine biosynthesis which clustered together in phylogenetic trees. Opines, such as octopine and nopaline, are low molecular weight compounds found in plant crown gall tumors that are produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. NocR and OccR belong to the family of LysR-type transcriptional regulators that positively regulates the catabolism of nopaline and octopine, respectively. Both nopaline and octopine are arginine derivatives. In Agrobacterium tumefaciens, NocR regulates expression of the divergently transcribed nocB and nocR genes of the nopaline catabolism (noc) region. OccR protein activates the occQ operon of the Ti plasmid in response to octopine. This operon encodes proteins required for the uptake and catabolism of octopine. The occ operon also encodes the TraR protein, which is a quorum-sensing transcriptional regulator of the Ti plasmid tra regulon. LysR is the transcriptional activator of lysA gene encoding diaminopimelate decarboxylase, an enzyme that catalyses the decarboxylation of diaminopimelate to produce lysine. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 
After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 196
27315 176108 cd08416 PBP2_MdcR The C-terminal substrate-binding domain of LysR-type transcriptional regulator MdcR, which is involved in malonate catabolism, contains the type 2 periplasmic binding fold. This family includes the C-terminal substrate binding domain of LysR-type transcriptional regulator (LTTR) MdcR that controls the expression of the malonate decarboxylase (mdc) genes. Like other members of the LTTRs, MdcR is a positive regulatory protein for its target promoter and is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 199
27316 176109 cd08417 PBP2_Nitroaromatics_like The C-terminal substrate binding domain of LysR-type transcriptional regulators that are involved in the catabolism of nitroaromatic/naphthalene compounds and that of related regulators; contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the catabolism of dinitrotoluene and similar compounds, such as DntR, NahR, and LinR. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. Also included are related LysR-type regulators clustered together in phylogenetic trees, including NodD, ToxR, LeuO, SyrM, TdcA, and PnbR. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27317 176110 cd08418 PBP2_TdcA The C-terminal substrate binding domain of LysR-type transcriptional regulator TdcA, which is involved in the degradation of L-serine and L-threonine, contains the type 2 periplasmic binding fold. TdcA, a member of the LysR family, activates the expression of the anaerobically-regulated tdcABCDEFG operon which is involved in the degradation of L-serine and L-threonine to acetate and propionate, respectively. The tdc operon is comprised of one regulatory gene tdcA and six structural genes, tdcB to tdcG. The expression of the tdc operon is affected by several transcription factors including the cAMP receptor protein (CRP), integration host factor (IHF), histone-like protein (HU), and the operon specific regulators TdcA and TdcR. TdcR is divergently transcribed from the operon and encodes a small protein that is required for efficient expression of the Escherichia coli tdc operon. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 201
27318 176111 cd08419 PBP2_CbbR_RubisCO_like The C-terminal substrate binding domain of LysR-type transcriptional regulator (CbbR) of RubisCO operon, which is involved in the carbon dioxide fixation, contains the type 2 periplasmic binding fold. CbbR, a LysR-type transcriptional regulator, is required to activate expression of RubisCO, one of two unique enzymes in the Calvin-Benson-Bassham (CBB) cycle pathway. All plants, cyanobacteria, and many autotrophic bacteria use the CBB cycle to fix carbon dioxide. Thus, this cycle plays an essential role in assimilating CO2 into organic carbon on earth. The key CBB cycle enzyme is ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO), which catalyzes the actual CO2 fixation reaction. The CO2 concentration affects the expression of RubisCO genes. It has also been shown that NADPH enhances the DNA-binding ability of CbbR. RubisCO is composed of eight large (CbbL) and eight small subunits (CbbS). The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27319 176112 cd08420 PBP2_CysL_like C-terminal substrate binding domain of LysR-type transcriptional regulator CysL, which activates the transcription of the cysJI operon encoding sulfite reductase, contains the type 2 periplasmic binding fold. CysL, also known as YwfK, is a regulator of sulfur metabolism in Bacillus subtilis. Sulfur is required for the synthesis of proteins and essential cofactors in all living organisms. Sulfur can be assimilated either from inorganic sources (sulfate and thiosulfate), or from organic sources (sulfate esters, sulfamates, and sulfonates). CysL activates the transcription of the cysJI operon encoding sulfite reductase, which reduces sulfite to sulfide. Both cysL mutant and cysJI mutant are unable to grow using sulfate or sulfite as the sulfur source. Like other LysR-type regulators, CysL also negatively regulates its own transcription. In Escherichia coli, three LysR-type activators are involved in the regulation of sulfur metabolism: CysB, Cbl and MetR. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 201
27320 176113 cd08421 PBP2_LTTR_like_1 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27321 176114 cd08422 PBP2_CrgA_like The C-terminal substrate binding domain of LysR-type transcriptional regulator CrgA and its related homologs, contains the type 2 periplasmic binding domain. This CD includes the substrate binding domain of LysR-type transcriptional regulator (LTTR) CrgA and its related homologs. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis further showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27322 176115 cd08423 PBP2_LTTR_like_6 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27323 176116 cd08425 PBP2_CynR The C-terminal substrate-binding domain of the LysR-type transcriptional regulator CynR, contains the type 2 periplasmic binding fold. CynR is a LysR-like transcriptional regulator of the cyn operon, which encodes genes that allow cyanate to be used as a sole source of nitrogen. The operon includes three genes in the following order: cynT (cyanate permease), cynS (cyanase), and cynX (a protein of unknown function). CynR negatively regulates its own expression independently of cyanate. CynR binds to DNA and induces bending of DNA in the presence or absence of cyanate, but the amount of bending is decreased by cyanate. The CynR of LysR-type transcriptional regulator family is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27324 176117 cd08426 PBP2_LTTR_like_5 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 199
27325 176118 cd08427 PBP2_LTTR_like_2 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 195
27326 176119 cd08428 PBP2_IciA_ArgP The C-terminal substrate binding domain of LysR-type transcriptional regulator, ArgP (IciA), for arginine exporter (ArgO); contains the type 2 periplasmic binding fold. The inhibitor of chromosomal replication (iciA) protein encoded by Mycobacterium tuberculosis, which is implicated in chromosome replication initiation in vitro, has been identified as arginine permease (ArgP), a LysR-type transcriptional regulator for arginine outward transport, based on the same amino acid sequence and similar DNA binding targets. ArgP has been shown to regulate various targets including DnaA (replication), ArgO (arginine export), dapB (lysine biosynthesis), and gdhA (glutamate biosynthesis). With abundant nutrition, ArgP activates the DnaA gene (to increase replication) and the ArgO (to export redundant molecules). However, when nutrition supply is limited, it is suggested that ArgP might function as an inhibitor of chromosome replication in order to slow replication. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 195
27327 176120 cd08429 PBP2_NhaR The C-terminal substrate binding domain of LysR-type transcriptional activator of the nhaA gene, encoding Na+/H+ antiporter, contains the type 2 periplasmic binding fold. NhaR is a positive regulator of the LysR family and is known to be an activator of the nhaA gene encoding a Na(+)/H(+) antiporter. In Escherichia coli, NhaA is the vital antiporter that protects against high sodium stress, and it is essential for growth in high sodium levels, while NhaB becomes essential only if NhaA is not available. The nhaA gene of nhaAR operon is induced by monovalent cations. The nhaR of the operon activates nhaAR, as well as the osmC transcription which is induced at elevated osmolarity. OsmC is transcribed from two overlapping promoters (osmCp1 and osmCp2), and NhaR is shown to activate only the expression of osmCp1. NhaR also activates the transcription of the pgaABCD operon which is required for production of the biofilm adhesin, poly-beta-1,6-N-acetyl-d-glucosamine (PGA). Thus, it is suggested that NhaR has an extended role in promoting bacterial survival. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 204
27328 176121 cd08430 PBP2_IlvY The C-terminal substrate binding domain of LysR-type transcriptional regulator IlvY, which activates the expression of the ilvC gene encoding acetohydroxy acid isomeroreductase for the biosynthesis of branched-chain amino acids; contains the type 2 periplasmic binding fold. In Escherichia coli, IlvY is required for the regulation of ilvC gene expression that encodes acetohydroxy acid isomeroreductase (AHIR), a key enzyme in the biosynthesis of branched-chain amino acids (isoleucine, valine, and leucine). The ilvGMEDA operon genes encode remaining enzyme activities required for the biosynthesis of these amino acids. Activation of ilvC transcription by IlvY requires the additional binding of a co-inducer molecule (either alpha-acetolactate or alpha-acetohydroxybutyrate, the substrates for AHIR) to a preformed complex of IlvY protein-DNA. Like many other LysR-family members, IlvY negatively auto-regulates the transcription of its own divergently transcribed ilvY gene in an inducer-independent manner. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 199
27329 176122 cd08431 PBP2_HupR The C-terminal substrate binding domain of LysR-type transcriptional regulator, HupR, which regulates expression of the heme uptake receptor HupA; contains the type 2 periplasmic binding fold. HupR, a member of the LysR family, activates hupA transcription under low-iron conditions in the presence of hemin. The expression of many iron-uptake genes, such as hupA, is regulated at the transcriptional level by iron and an iron-binding repressor protein called Fur (ferric uptake regulation). Under iron-abundant conditions with heme, the active Fur repressor protein represses transcription of the iron-uptake gene hupA, and prevents transcriptional activation via HupR. Under low-iron conditions with heme, the Fur repressor is inactive and transcription of the hupA is allowed. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 195
27330 176123 cd08432 PBP2_GcdR_TrpI_HvrB_AmpR_like The C-terminal substrate domain of LysR-type GcdR, TrpI, HvrB and beta-lactamase regulators, and that of other closely related homologs; contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate domain of LysR-type transcriptional regulators involved in controlling the expression of glutaryl-CoA dehydrogenase (GcdH), S-adenosyl-L-homocysteine hydrolase, cell division protein FtsW, tryptophan synthase, and beta-lactamase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 194
27331 176124 cd08433 PBP2_Nac The C-terminal substrate binding domain of LysR-like nitrogen assimilation control (NAC) protein, contains the type 2 periplasmic binding fold. The NAC is a LysR-type transcription regulator that activates expression of operons such as hut (histidine utilization) and ure (urea utilization), allowing use of non-preferred (poor) nitrogen sources, and represses expression of operons, such as glutamate dehydrogenase (gdh), allowing assimilation of the preferred nitrogen source. The expression of the nac gene is fully dependent on the nitrogen regulatory system (NTR) and the sigma54-containing RNA polymerase (sigma54-RNAP). In response to nitrogen starvation, NTR system activates the expression of nac, and NAC activates the expression of hut, ure, and put (proline utilization). NAC is not involved in the transcription of sigma54-RNAP operons such as glnA, which respond directly to the NTR system, but activates the transcription of sigma70-RNAP dependent operons such as hut. Hence, NAC allows the coupling of sigma70-RNAP dependent operons to the sigma54-RNAP dependent NTR system. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27332 176125 cd08434 PBP2_GltC_like The substrate binding domain of LysR-type transcriptional regulator GltC, which activates gltA expression of glutamate synthase operon, contains type 2 periplasmic binding fold. GltC, a member of the LysR family of bacterial transcriptional factors, activates the expression of gltA gene of glutamate synthase operon and is essential for cell growth in the absence of glutamate. Glutamate synthase is a heterodimeric protein that is encoded by gltA and gltB, whose expression is subject to nutritional regulation. GltC also negatively auto-regulates its own expression. This substrate-binding domain has strong homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 195
27333 176126 cd08435 PBP2_GbpR The C-terminal substrate binding domain of galactose-binding protein regulator contains the type 2 periplasmic binding fold. Galactose-binding protein regulator (GbpR), a member of the LysR family of bacterial transcriptional regulators, regulates the expression of chromosomal virulence gene chvE. The chvE gene is involved in the uptake of specific sugars, in chemotaxis to these sugars, and in the VirA-VirG two-component signal transduction system. In the presence of an inducing sugar such as L-arabinose, D-fucose, or D-galactose, GbpR activates chvE expression, while in the absence of an inducing sugar, GbpR represses expression. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 201
27334 176127 cd08436 PBP2_LTTR_like_3 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 194
27335 176128 cd08437 PBP2_MleR The substrate binding domain of LysR-type transcriptional regulator MleR, which is required for malolactic fermentation, contains type 2 periplasmic binding fold. MleR, a transcription activator of malolactic fermentation system, is found in gram-positive bacteria and belongs to the LysR family of bacterial transcriptional regulators. The mleR gene is required for the expression and induction of malolactic fermentation. This substrate binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27336 176129 cd08438 PBP2_CidR The C-terminal substrate binding domain of LysR-like transcriptional regulator CidR, contains the type 2 periplasmic binding fold. This CD includes the substrate binding domain of CidR which positively up-regulates the expression of cidABC operon in the presence of acetic acid produced by the metabolism of excess glucose. The CidR affects the control of murein hydrolase activity by enhancing cidABC expression in the presence of acetic acid. Thus, up-regulation of cidABC expression results in increased murein hydrolase activity. This substrate binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27337 176130 cd08439 PBP2_LrhA_like The C-terminal substrate domain of LysR-like regulator LrhA (LysR homologue A) and that of closely related homologs, contains the type 2 periplasmic binding fold. This CD represents the LrhA subfamily of LysR-like bacterial transcriptional regulators, including LrhA, HexA, PecT, and DgdR. LrhA is involved in control of the transcription of flagellar, motility, and chemotaxis genes by regulating the synthesis and concentration of FlhD(2)C(2), the master regulator for the expression of flagellar and chemotaxis genes. The LrhA protein has strong homology to HexA and PecT from plant pathogenic bacteria, in which HexA and PecT act as repressors of motility and of virulence factors, such as exoenzymes required for lytic reactions. DgdR also shares similar characteristics to those of LrhA, HexA and PecT. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 185
27338 176131 cd08440 PBP2_LTTR_like_4 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27339 176132 cd08441 PBP2_MetR The C-terminal substrate binding domain of LysR-type transcriptional regulator metR, which regulates the expression of methionine biosynthetic genes, contains type 2 periplasmic binding fold. MetR, a member of the LysR family, is a positive regulator for the metA, metE, metF, and metH genes. The sulfur-containing amino acid methionine is the universal initiator of protein synthesis in all known organisms and its derivative S-adenosylmethionine (SAM) and autoinducer-2 (AI-2) are involved in various cellular processes. SAM plays a central role as methyl donor in methylation reactions, which are essential for the biosynthesis of phospholipids, proteins, DNA and RNA. The interspecies signaling molecule AI-2 is involved in cell-cell communication process (quorum sensing) and gene regulation in bacteria. Although methionine biosynthetic enzymes and metabolic pathways are well conserved in bacteria, the regulation of methionine biosynthesis involves various regulatory mechanisms. In Escherichia coli and Salmonella enterica serovar Typhimurium, MetJ and MetR regulate the expression of methionine biosynthetic genes. The MetJ repressor negatively regulates the E. coli met genes, except for metH. Several of these genes are also under the positive control of MetR with homocysteine as a co-inducer. In Bacillus subtilis, the met genes are controlled by S-box termination-antitermination system. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27340 176133 cd08442 PBP2_YofA_SoxR_like The C-terminal substrate binding domain of LysR-type transcriptional regulators, YofA and SoxR, contains the type 2 periplasmic binding fold. YofA is a LysR-like transcriptional regulator of cell growth in Bacillus subtilis. YofA controls cell viability and the formation of constrictions during cell division. YofA positively regulates expression of the cell division gene ftsW, and thus is essential for cell viability during stationary-phase growth of Bacillus subtilis. YofA shows significant homology to SoxR from Arthrobacter sp. TE1826. SoxR is a negative regulator for the sarcosine oxidase gene soxA. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine, which is involved in the metabolism of creatine and choline. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 193
27341 176134 cd08443 PBP2_CysB The C-terminal substrate domain of LysR-type transcriptional regulator CysB contains type 2 periplasmic binding fold. CysB is a transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the regulation of transcription in response to sulfur source is attributed to two transcriptional regulators, CysB and Cbl. CysB, in association with Cbl, downregulates the expression of ssuEADCB operon which is required for the utilization of sulfur from aliphatic sulfonates, in the presence of cysteine. Also, Cbl and CysB together directly function as transcriptional activators of tauABCD genes, which are required for utilization of taurine as sulfur source for growth. Like many other members of the LTTR family, CysB is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27342 176135 cd08444 PBP2_Cbl The C-terminal substrate binding domain of LysR-type transcriptional regulator Cbl, which is required for expression of sulfate starvation-inducible (ssi) genes, contains the type 2 periplasmic binding fold. Cbl is a member of the LysR transcriptional regulators that comprise the largest family of prokaryotic transcription factors. Cbl shows high sequence similarity to CysB, the LysR-type transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the function of Cbl is required for expression of sulfate starvation-inducible (ssi) genes, coupled with the biosynthesis of cysteine from the organic sulfur sources (sulfonates). The ssi genes include the ssuEADCB and tauABCD operons encoding uptake systems for organosulfur compounds, aliphatic sulfonates, and taurine. The genes in these operons encode an ABC-type transport system required for uptake of aliphatic sulfonates and a desulfonation enzyme. Both Cbl and CysB are required for expression of the tau and ssu genes. Like many other members of the LTTR family, the Cbl is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27343 176136 cd08445 PBP2_BenM_CatM_CatR The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in benzoate catabolism; contains the type 2 periplasmic binding fold. This CD includes the C-terminal of LysR-type transcription regulators, BenM, CatM, and CatR, which are involved in the benzoate catabolism. The BenM and CatM are paralogs with overlapping functions. BenM responds synergistically to two effectors, benzoate and cis,cis-muconate, to activate expression of the benABCDE operon which is involved in benzoate catabolism, while CatM responds only to muconate. BenM and CatM share high protein sequence identity and bind to the operator-promoter regions that have similar DNA sequences. In Pseudomonas species, phenolic compounds are converted by different enzymes to central intermediates, such as protocatechuate and catechols. Generally, unsubstituted compounds, such as benzoate, are metabolized by an ortho-cleavage pathway. The catBCA operon encodes three enzymes of the ortho-pathway required for benzoate catabolism: muconate lactonizing enzyme I, muconolactone isomerase, and catechol 1,2-dioxygenase. CatR normally responds to benzoate and cis,cis-muconate, an inducer molecule, to activate transcription of the catBCA operon, whose gene products convert benzoate to catechol. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 203
27344 176137 cd08446 PBP2_Chlorocatechol The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the chlorocatechol catabolism, contains the type 2 periplasmic binding fold. This CD includes the substrate binding domain of LysR-type regulators CbnR, ClcR and TfdR, which are involved in the regulation of chlorocatechol breakdown. The chlorocatechol-degradative pathway is often found in bacteria that can use chlorinated aromatic compounds as carbon and energy sources. CbnR is found in the 3-chlorobenzoate degradative bacterium Ralstonia eutropha NH9 and forms a tetramer. CbnR activates the expression of the cbnABCD genes, which are responsible for the degradation of chlorocatechol converted from 3-chlorobenzoate and are transcribed divergently from cbnR. In soil bacterium Pseudomonas putida, the 3-chlorocatechol-degradative pathway is encoded by clcABD operon, which requires the divergently transcribed clcR for activation. TfdR is involved in the activation of tfdA and tfdB gene expression. These genes encode enzymes for the conversion of 2,4-dichlorophenoxyacetic acid and 2,4-dichlorophenol. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27345 176138 cd08447 PBP2_LTTR_aromatics_like_1 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to regulators involved in the catabolism of aromatic compounds, contains type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type regulator similar to CbnR which is involved in the regulation of chlorocatechol breakdown. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27346 176139 cd08448 PBP2_LTTR_aromatics_like_2 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to regulators involved in the catabolism of aromatic compounds, contains type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type regulator similar to CbnR which is involved in the regulation of chlorocatechol breakdown. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27347 176140 cd08449 PBP2_XapR The C-terminal substrate binding domain of LysR-type transcriptional regulator XapR involved in xanthosine catabolism, contains the type 2 periplasmic binding fold. In Escherichia coli, XapR is a positive regulator for the expression of xapA gene, encoding xanthosine phosphorylase, and xapB gene, encoding a polypeptide similar to the nucleotide transport protein NupG. As an operon, the expression of both xapA and xapB is fully dependent on the presence of both XapR and the inducer xanthosine. Expression of the xapR is constitutive but not auto-regulated, unlike many other LysR family proteins. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27348 176141 cd08450 PBP2_HcaR The C-terminal substrate binding domain of LysR-type transcriptional regulator HcaR involved in 3-phenylpropionic acid catabolism, contains the type 2 periplasmic binding fold. HcaR, a member of the LysR family of transcriptional regulators, controls the expression of the hcA1, A2, B, C, and D operon, encoding for the 3-phenylpropionate dioxygenase complex and 3-phenylpropionate-2',3'-dihydrodiol dehydrogenase, that oxidizes 3-phenylpropionate to 3-(2,3-dihydroxyphenyl) propionate. Dioxygenases play an important role in protecting the cell against the toxic effects of dioxygen. The expression of hcaR is negatively auto-regulated, as for other members of the LysR family, and is strongly repressed in the presence of glucose. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 196
27349 176142 cd08451 PBP2_BudR The C-terminal substrate binding domain of LysR-type transcriptional regulator BudR, which is responsible for activation of the expression of the butanediol operon genes; contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of BudR regulator, which is responsible for induction of the butanediol formation pathway under fermentative growth conditions. Three enzymes are involved in the production of 1 mol of 2,3 butanediol from the condensation of 2 mol of pyruvate with acetolactate and acetoin as intermediates: acetolactate synthetase, acetolactate decarboxylase, and acetoin reductase. In Klebsiella terrigena, BudR regulates the expression of the budABC operon genes, encoding these three enzymes of the butanediol pathway. In many bacterial species, the use of this pathway can prevent intracellular acidification by diverting metabolism from acid production to the formation of neutral compounds (acetoin and butanediol). This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 199
27350 176143 cd08452 PBP2_AlsR The C-terminal substrate binding domain of LysR-type transcriptional regulator AlsR, which regulates acetoin formation under stationary phase growth conditions; contains the type 2 periplasmic binding fold. AlsR is responsible for activating the expression of the acetoin operon (alsSD) in response to inducing signals such as glucose and acetate. Like many other LysR family proteins, AlsR is transcribed divergently from the alsSD operon. The alsS gene encodes acetolactate synthase, an enzyme involved in the production of acetoin in cells of stationary-phase. AlsS catalyzes the conversion of two pyruvate molecules to acetolactate and carbon dioxide. Acetolactate is then converted to acetoin at low pH by acetolactate decarboxylase which is encoded by the alsD gene. Acetoin is an important physiological metabolite excreted by many microorganisms grown on glucose or other fermentable carbon sources. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27351 176144 cd08453 PBP2_IlvR The C-terminal substrate binding domain of LysR-type transcriptional regulator, IlvR, involved in the biosynthesis of isoleucine, leucine and valine; contains type 2 periplasmic binding fold. The IlvR is an activator of the upstream and divergently transcribed ilvD gene, which encodes dihydroxy acid dehydratase that participates in isoleucine, leucine, and valine biosynthesis. As in the case of other members of the LysR family, the expression of ilvR gene is repressed in the presence of its own gene product. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27352 176145 cd08456 PBP2_LysR The C-terminal substrate binding domain of LysR, transcriptional regulator for lysine biosynthesis, contains the type 2 periplasmic binding fold. LysR is the transcriptional activator of lysA, which encodes diaminopimelate decarboxylase, the enzyme that catalyses the decarboxylation of diaminopimelate to produce lysine. The LysR-type transcriptional regulators (LTTRs) comprise the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 196
27353 176146 cd08457 PBP2_OccR The C-terminal substrate-domain of LysR-type transcriptional regulator, OccR, involved in the catabolism of octopine, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulator OccR, which is involved in the catabolism of octopine. Opines are low molecular weight compounds found in plant crown gall tumors produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. In Agrobacterium tumefaciens, OccR protein activates the occQ operon of the Ti plasmid in response to octopine. This operon encodes proteins required for the uptake and catabolism of octopine, an arginine derivative. The occ operon also encodes the TraR protein, which is a quorum-sensing transcriptional regulator of the Ti plasmid tra regulon. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 196
27354 176147 cd08458 PBP2_NocR The C-terminal substrate-domain of LysR-type transcriptional regulator, NocR, involved in the catabolism of nopaline, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulator NocR, which is involved in the catabolism of nopaline. Opines are low molecular weight compounds found in plant crown gall tumors produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. In Agrobacterium tumefaciens, NocR regulates expression of the divergently transcribed nocB and nocR genes of the nopaline catabolism (noc) region. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 196
27355 176148 cd08459 PBP2_DntR_NahR_LinR_like The C-terminal substrate binding domain of LysR-type transcriptional regulators that are involved in the catabolism of dinitrotoluene, naphthalene and gamma-hexachlorocyclohexane; contains the type 2 periplasmic binding fold. This CD includes LysR-like bacterial transcriptional regulators, DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. DntR from Burkholderia species controls genes encoding enzymes for oxidative degradation of the nitro-aromatic compound 2,4-dinitrotoluene. The active form of DntR is homotetrameric, consisting of a dimer of dimers. NahR is a salicylate-dependent transcription activator of the nah and sal operons for naphthalene degradation. Salicylic acid is an intermediate of the oxidative degradation of the aromatic ring in soil bacteria. LinR positively regulates expression of the genes (linD and linE) encoding enzymes for gamma-hexachlorocyclohexane (a haloorganic insecticide) degradation. Expression of linD and linE is induced by their substrates, 2,5-dichlorohydroquinone (2,5-DCHQ) and chlorohydroquinone (CHQ). The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 201
27356 176149 cd08460 PBP2_DntR_like_1 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27357 176150 cd08461 PBP2_DntR_like_3 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27358 176151 cd08462 PBP2_NodD The C-terminal substrate binding domain of NodD family of LysR-type transcriptional regulators that regulates the expression of nodulation (nod) genes; contains the type 2 periplasmic binding fold. The nodulation (nod) genes in soil bacteria play important roles in the development of nodules. nod genes are involved in synthesis of Nod factors that are required for bacterial entry into root hairs. Thirteen nod genes have been identified and are classified into five transcription units: nodD, nodABCIJ, nodFEL, nodMNT, and nodO. NodD negatively auto-regulates expression of its own gene, nodD, while other nod genes are inducible and positively regulated by NodD in the presence of flavonoids released by plant roots. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27359 176152 cd08463 PBP2_DntR_like_4 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 203
27360 176153 cd08464 PBP2_DntR_like_2 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27361 176154 cd08465 PBP2_ToxR The C-terminal substrate binding domain of LysR-type transcriptional regulator ToxR, which regulates the expression of the toxoflavin biosynthesis genes; contains the type 2 periplasmic binding fold. In soil bacterium Burkholderia glumae, ToxR regulates the toxABCDE and toxFGHI operons in the presence of toxoflavin as a coinducer. Additionally, the expression of both operons requires a transcriptional activator, ToxJ, whose expression is regulated by the TofI or TofR quorum-sensing system. Toxoflavin is suggested to be synthesized in a pathway common to the synthesis of riboflavin. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27362 176155 cd08466 PBP2_LeuO The C-terminal substrate binding domain of LysR-type transcriptional regulator LeuO, an activator of leucine synthesis operon, contains the type 2 periplasmic binding fold. LeuO, a LysR-type transcriptional regulator, was originally identified as an activator of the leucine synthesis operon (leuABCD). Subsequently, LeuO was found to be not a specific regulator of the leu gene but a global regulator of various unrelated genes. LeuO activates bglGFB (utilization of beta-D-glucoside) and represses cadCBA (lysine decarboxylation) and dsrA (encoding a regulatory small RNA for translational control of rpoS and hns). LeuO also regulates the yjjQ-bglJ operon, which codes for a LuxR-type transcription factor. In Salmonella enterica serovar Typhi, LeuO is a positive regulator of ompS1 (encoding an outer membrane), ompS2 (encoding a pathogenicity determinant), and assT, while LeuO represses the expression of OmpX and Tpx. Both ompS1 and ompS2 influence virulence in the mouse model of Salmonella. In Vibrio cholerae, LeuO is involved in control of biofilm formation and in the stringent response. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27363 176156 cd08467 PBP2_SyrM The C-terminal substrate binding domain of LysR-type symbiotic regulator SyrM, which activates expression of nodulation gene NodD3, contains the type 2 periplasmic binding fold. Rhizobium is a nitrogen-fixing bacterium present in the roots of leguminous plants, which fixes atmospheric nitrogen to the soil. Most Rhizobium species possess multiple nodulation (nod) genes for the development of nodules. For example, Rhizobium meliloti possesses three copies of nodD genes. NodD1 and NodD2 activate nod operons when Rhizobium is exposed to inducers synthesized by the host plant, while NodD3 acts independently of plant inducers and requires the symbiotic regulator SyrM for nod gene expression. SyrM activates the expression of the regulatory nodulation gene nodD3. In turn, NodD3 activates expression of syrM. In addition, SyrM is involved in exopolysaccharide synthesis. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 200
27364 176157 cd08468 PBP2_Pa0477 The C-terminal substrate binding domain of an uncharacterized LysR-like transcriptional regulator Pa0477 related to DntR, contains the type 2 periplasmic binding fold. LysR-type transcriptional regulator Pa0477 is related to DntR, which controls genes encoding enzymes for oxidative degradation of the nitro-aromatic compound 2,4-dinitrotoluene. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 202
27365 176158 cd08469 PBP2_PnbR The C-terminal substrate binding domain of LysR-type transcriptional regulator PnbR, which is involved in regulating the pnb genes encoding enzymes for 4-nitrobenzoate catabolism, contains the type 2 periplasmic binding fold. PnbR is the regulator of one or both of the two pnb genes that encode enzymes for 4-nitrobenzoate catabolism. In Pseudomonas putida strain, pnbA encodes a 4-nitrobenzoate reductase, which is responsible for catalyzing the direct reduction of 4-nitrobenzoate to 4-hydroxylaminobenzoate, and pnbB encodes a 4-hydroxylaminobenzoate lyase, which catalyzes the conversion of 4-hydroxylaminobenzoate to 3,4-dihydroxybenzoic acid and ammonium. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 221
27366 176159 cd08470 PBP2_CrgA_like_1 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding domain. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 1. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27367 176160 cd08471 PBP2_CrgA_like_2 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 2. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 201
27368 176161 cd08472 PBP2_CrgA_like_3 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 3. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 202
27369 176162 cd08473 PBP2_CrgA_like_4 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 4. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 202
27370 176163 cd08474 PBP2_CrgA_like_5 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 5. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 202
27371 176164 cd08475 PBP2_CrgA_like_6 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 6. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 199
27372 176165 cd08476 PBP2_CrgA_like_7 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 7. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27373 176166 cd08477 PBP2_CrgA_like_8 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 8. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 197
27374 176167 cd08478 PBP2_CrgA The C-terminal substrate binding domain of LysR-type transcriptional regulator CrgA, contains the type 2 periplasmic binding domain. This CD represents the substrate binding domain of LysR-type transcriptional regulator (LTTR) CrgA. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis further showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 199
27375 176168 cd08479 PBP2_CrgA_like_9 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 9. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27376 176169 cd08480 PBP2_CrgA_like_10 The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 10. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27377 176170 cd08481 PBP2_GcdR_like The C-terminal substrate binding domain of LysR-type transcriptional regulators GcdR-like, contains the type 2 periplasmic binding fold. GcdR is involved in the glutaconate/glutarate-specific activation of the Pg promoter driving expression of a glutaryl-CoA dehydrogenase-encoding gene (gcdH). The GcdH protein is essential for the anaerobic catabolism of many aromatic compounds and some alicyclic and dicarboxylic acids. The structural topology of this substrate-binding domain is most similar to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 194
27378 176171 cd08482 PBP2_TrpI The C-terminal substrate binding domain of LysR-type transcriptional regulator TrpI, which is involved in control of tryptophan synthesis, contains type 2 periplasmic binding fold. TrpI and indoleglycerol phosphate (InGP) are required to activate transcription of trpBA, the genes for tryptophan synthase. The trpBA genes are induced by the InGP substrate, rather than by tryptophan, but the exact mechanism of the activation event is not known. This substrate-binding domain of TrpI shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 195
27379 176172 cd08483 PBP2_HvrB The C-terminal substrate-binding domain of LysR-type transcriptional regulator HvrB, an activator of S-adenosyl-L-homocysteine hydrolase expression, contains the type 2 periplasmic binding fold. The transcriptional regulator HvrB of the LysR family is required for the light-dependent activation of both ahcY, which encodes the enzyme S-adenosyl-L-homocysteine hydrolase (AdoHcyase) that is responsible for the reversible hydrolysis of AdoHcy to adenosine and homocysteine, and orf5, a gene of unknown function. The topology of this C-terminal domain of HvrB is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 190
27380 176173 cd08484 PBP2_LTTR_beta_lactamase The C-terminal substrate-binding domain of LysR-type transcriptional regulators for beta-lactamase genes, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LysR-type transcriptional regulators, BlaA and AmpR, that are involved in control of the expression of beta-lactamase genes. Beta-lactamases are responsible for bacterial resistance to beta-lactam antibiotics such as penicillins. BlaA (a constitutive class A penicillinase) belongs to the LysR family of transcriptional regulators, while BlaB (an inducible class C cephalosporinase or AmpC) can be referred to as a penicillin-binding protein, but it does not act as a beta-lactamase. AmpR regulates the expression of beta-lactamases in many enterobacterial strains and many other gram-negative bacilli. In contrast to BlaA, AmpR acts as an activator only in the presence of the beta-lactam inducer. In the absence of the inducer, AmpR acts as a repressor. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 189
27381 176174 cd08485 PBP2_ClcR The C-terminal substrate binding domain of LysR-type transcriptional regulator ClcR involved in the chlorocatechol catabolism, contains type 2 periplasmic binding fold. In the soil bacterium Pseudomonas putida, the ortho-pathways of catechol and 3-chlorocatechol are central catabolic pathways that convert aromatic and chloroaromatic compounds to tricarboxylic acid (TCA) cycle intermediates. The 3-chlorocatechol-degradative pathway is encoded by the clcABD operon, which requires the divergently transcribed clcR and an intermediate of the pathway, 2-chloromuconate, as an inducer for activation. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27382 176175 cd08486 PBP2_CbnR The C-terminal substrate binding domain of LysR-type transcriptional regulator, CbnR, involved in the chlorocatechol catabolism, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of LysR-type regulator CbnR which is involved in the regulation of chlorocatechol breakdown. The chlorocatechol-degradative pathway is often found in bacteria that can use chlorinated aromatic compounds as carbon and energy sources. CbnR is found in the 3-chlorobenzoate degradative bacterium Ralstonia eutropha NH9 and forms a tetramer. CbnR activates the expression of the cbnABCD genes, which are responsible for the degradation of chlorocatechol converted from 3-chlorobenzoate and are transcribed divergently from cbnR. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 198
27383 176176 cd08487 PBP2_BlaA The C-terminal substrate-binding domain of LysR-type transcriptional regulator BlaA, which is involved in control of the beta-lactamase gene expression; contains the type 2 periplasmic binding fold. This CD represents the C-terminal substrate binding domain of LysR-type transcriptional regulator, BlaA, that is involved in control of the expression of beta-lactamase genes, blaA and blaB. Beta-lactamases are responsible for bacterial resistance to beta-lactam antibiotics such as penicillins. The blaA gene is located just upstream of blaB in the opposite direction and regulates the expression of blaB. BlaA also negatively auto-regulates the expression of its own gene, blaA. BlaA (a constitutive class A penicillinase) belongs to the LysR family of transcriptional regulators, whereas BlaB (an inducible class C cephalosporinase or AmpC) can be referred to as a penicillin binding protein but it does not act as a beta-lactamase. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 189
27384 176177 cd08488 PBP2_AmpR The C-terminal substrate domain of LysR-type transcriptional regulator AmpR that is involved in control of the expression of beta-lactamase gene ampC, contains the type 2 periplasmic binding fold. AmpR acts as a transcriptional activator by binding to a DNA region immediately upstream of the ampC promoter. In the absence of a beta-lactam inducer, AmpR represses the synthesis of beta-lactamase, whereas expression is induced in the presence of a beta-lactam inducer. The AmpD, AmpG, and AmpR proteins are involved in the induction of AmpC-type beta-lactamase (class C), which is produced by enterobacterial strains and many other gram-negative bacilli. The activation of ampC by AmpR requires ampG for induction or high-level expression of AmpC. It is probable that AmpD and AmpG work together to modulate the ability of AmpR to activate ampC expression. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 191
27385 173854 cd08489 PBP2_NikA The substrate-binding component of an ABC-type nickel import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain of nickel transport system, which functions in the import of nickel and in the control of chemotactic response away from nickel. The ATP-binding cassette (ABC) type nickel transport system is comprised of five subunits NikABCDE: the two pore-forming integral inner membrane proteins NikB and NikC; the two inner membrane-associated proteins with ATPase activity NikD and NikE; and the periplasmic nickel binding NikA, the initial nickel receptor. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 488
27386 173855 cd08490 PBP2_NikA_DppA_OppA_like_3 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 470
27387 173856 cd08491 PBP2_NikA_DppA_OppA_like_12 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 473
27388 173857 cd08492 PBP2_NikA_DppA_OppA_like_15 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 484
27389 173858 cd08493 PBP2_DppA_like The substrate-binding component of an ABC-type dipeptide import system contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of an ATP-binding cassette (ABC)-type dipeptide import system. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 482
27390 173859 cd08494 PBP2_NikA_DppA_OppA_like_6 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 448
27391 173860 cd08495 PBP2_NikA_DppA_OppA_like_8 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 482
27392 173861 cd08496 PBP2_NikA_DppA_OppA_like_9 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA can bind peptides of a wide range of lengths (2-35 amino-acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 454
27393 173862 cd08497 PBP2_NikA_DppA_OppA_like_14 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 491
27394 173863 cd08498 PBP2_NikA_DppA_OppA_like_2 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 481
27395 173864 cd08499 PBP2_Ylib_like The substrate-binding component of an uncharacterized ABC-type peptide import system YliB contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding component of an uncharacterized ATP-binding cassette (ABC)-type peptide transport system YliB. Although the ligand specificity of the YliB protein is not known, it shares significant sequence similarity to the ABC-type dipeptide and oligopeptide binding proteins. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 474
27396 173865 cd08500 PBP2_NikA_DppA_OppA_like_4 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 499
27397 173866 cd08501 PBP2_Lpqw The substrate-binding domain of mycobacterial lipoprotein LpqW contains the type 2 periplasmic binding fold. LpqW is one of the key players in synthesis and transport of the unique components of the mycobacterial cell wall, which is a complex structure rich in two related lipoglycans, the phosphatidylinositol mannosides (PIMs) and lipoarabinomannans (LAMs). LpqW is a highly conserved lipoprotein that transports intermediates from a pathway for mature PIMs production into a pathway for LAMs biosynthesis, thus controlling the relative abundance of these two essential components of the cell wall. LpqW is thought to have been adapted by the cell-wall biosynthesis machinery of mycobacteria and other closely related pathogens, evolving to play an important role in PIMs/LAMs biosynthesis. Most of periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the LpqW protein. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 486
27398 173867 cd08502 PBP2_NikA_DppA_OppA_like_16 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 472
27399 173868 cd08503 PBP2_NikA_DppA_OppA_like_17 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 460
27400 173869 cd08504 PBP2_OppA The substrate-binding component of an ABC-type oligopeptide import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding component of an ATP-binding cassette (ABC)-type oligopeptide transport system comprised of 5 subunits. The transport system OppABCDEF contains two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 498
27401 173870 cd08505 PBP2_NikA_DppA_OppA_like_18 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 528
27402 173871 cd08506 PBP2_clavulanate_OppA2 The substrate-binding domain of an oligopeptide binding protein (OppA2) from the biosynthesis pathway of the beta-lactamase inhibitor clavulanic acid contains the type 2 periplasmic binding fold. Clavulanic acid (CA), a clinically important beta-lactamase inhibitor, is one of a family of clavams produced as secondary metabolites by fermentation of Streptomyces clavuligerus. The biosynthesis of CA proceeds via multiple steps from the precursors, glyceraldehyde-3-phosphate and arginine. CA possesses a characteristic (3R,5R) stereochemistry essential for reaction with penicillin-binding proteins and beta-lactamases. Two genes (oppA1 and oppA2) in the clavulanic acid gene cluster encode oligopeptide-binding proteins that are required for CA biosynthesis. OppA1/2 is involved in the binding and transport of peptides across the cell membrane of Streptomyces clavuligerus. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 466
27403 173872 cd08507 PBP2_SgrR_like The C-terminal solute-binding domain of DNA-binding transcriptional regulator SgrR is related to the ABC-type oligopeptide-binding proteins and contains the type 2 periplasmic-binding fold. A novel family of SgrR transcriptional regulators contains a two-domain structure with an N-terminal DNA-binding domain of the winged helix family and a C-terminal solute-binding domain. The C-terminal domain shows strong homology with the ABC-type oligopeptide-binding protein family, a member of the type 2 periplasmic-binding fold protein (PBP2) superfamily that also includes the C-terminal substrate-binding domain of LysR-type transcriptional regulators. SgrR (SugaR transport-related Regulator) is negatively autoregulated and activates transcription of divergent operon SgrS, which encodes a small RNA required for recovery from glucose-phosphate stress. Hence, the small RNA SgrS and SgrR, the transcription factor that controls sgrS expression, are both required for recovery from glucose-phosphate stress. Most of periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 448
27404 173873 cd08508 PBP2_NikA_DppA_OppA_like_1 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 470
27405 173874 cd08509 PBP2_TmCBP_oligosaccharides_like The substrate binding domain of a cellulose-binding protein from Thermotoga maritima contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of a cellulose-binding protein from the hyperthermophilic bacterium Thermotoga maritima (TmCBP) and its closest related proteins. TmCBP binds a variety of lengths of beta-1,4-linked glucose oligomers, ranging from two sugar rings (cellobiose) to five (cellopentose). TmCBP is structurally homologous to domains I and III of the ATP-binding cassette (ABC)-type oligopeptide-binding proteins and thus belongs to the type 2 periplasmic binding fold protein (PBP2) superfamily. The type 2 periplasmic binding proteins are soluble ligand-binding components of ABC or tripartite ATP-independent transporters and chemotaxis systems. Members of the PBP2 superfamily function in uptake of a variety of metabolites in bacteria such as amino acids, carbohydrate, ions, and polyamines. Ligands are then transported across the cytoplasmic membrane energized by ATP hydrolysis or electrochemical ion gradient. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 509
27406 173875 cd08510 PBP2_Lactococcal_OppA_like The substrate binding component of an ABC-type lactococcal OppA-like transport system contains the type 2 periplasmic binding fold. This family represents the substrate binding domain of an ATP-binding cassette (ABC)-type oligopeptide import system from Lactococcus lactis and other gram-positive bacteria, as well as its closest homologs from gram-negative bacteria. Oligopeptide-binding protein (OppA) from Lactococcus lactis can bind peptides of length from 4 to at least 35 residues without sequence preference. The oligopeptide import system OppABCDEF consists of five subunits: two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and also is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 516
27407 173876 cd08511 PBP2_NikA_DppA_OppA_like_5 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 467
27408 173877 cd08512 PBP2_NikA_DppA_OppA_like_7 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 476
27409 173878 cd08513 PBP2_thermophilic_Hb8_like The substrate-binding component of ABC-type thermophilic oligopeptide-binding protein Hb8-like import systems contains the type 2 periplasmic binding fold. This family includes the substrate-binding domain of an ABC-type oligopeptide-binding protein Hb8 from Thermus thermophilus and its closest homologs from other bacteria. The structural topology of this substrate-binding domain is similar to those of DppA from Escherichia coli and OppA from Salmonella typhimurium, and thus belongs to the type 2 periplasmic binding fold protein (PBP2) superfamily. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. The type 2 periplasmic binding proteins are soluble ligand-binding components of ABC or tripartite ATP-independent transporters and chemotaxis systems. Members of the PBP2 superfamily function in uptake of a variety of metabolites in bacteria such as amino acids, carbohydrate, ions, and polyamines. Ligands are then transported across the cytoplasmic membrane energized by ATP hydrolysis or electrochemical ion gradient. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 482
27410 173879 cd08514 PBP2_AppA_like The substrate-binding component of the oligopeptide-binding protein, AppA, from Bacillus subtilis contains the type 2 periplasmic-binding fold. This family represents the substrate-binding domain of the oligopeptide-binding protein, AppA, from Bacillus subtilis and its closest homologs from other bacteria and archaea. Bacillus subtilis has three ABC-type peptide transport systems, a dipeptide-binding protein (DppA) and two oligopeptide-binding proteins (OppA and AppA) with overlapping specificity. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and also is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 483
27411 173880 cd08515 PBP2_NikA_DppA_OppA_like_10 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 460
27412 173881 cd08516 PBP2_NikA_DppA_OppA_like_11 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 457
27413 173882 cd08517 PBP2_NikA_DppA_OppA_like_13 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 480
27414 173883 cd08518 PBP2_NikA_DppA_OppA_like_19 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 464
27415 173884 cd08519 PBP2_NikA_DppA_OppA_like_20 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 469
27416 173885 cd08520 PBP2_NikA_DppA_OppA_like_21 The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction. 468
27417 176056 cd08521 C2A_SLP C2 domain first repeat present in Synaptotagmin-like proteins. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane. Additionally, their C2A domains are both Ca2+ independent, unlike the case in Slp3 and Slp4/granuphilin in which their C2A domains are Ca2+ dependent. It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp3 and Slp4/granuphilin promote dense-core vesicle exocytosis. Slp5 mRNA has been shown to be restricted to human placenta and liver suggesting a role in Rab27A-dependent membrane trafficking in specific tissues. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology. 123
27418 260080 cd08523 Reeler_cohesin_like Domains similar to the eukaryotic reeler domain and bacterial cohesins. This diverse family summarizes a set of distantly related domains, as revealed by structural similarity 128
27419 197341 cd08524 Reelin_subrepeat_like Tandem repeat subunit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns. 144
27420 197342 cd08525 Reelin_subrepeat_1 N-terminal subrepeat in the tandem repeat unit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 161
27421 197343 cd08526 Reelin_subrepeat_2 C-terminal subrepeat in the tandem repeat unit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 152
27422 270867 cd08528 STKc_Nek10 Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 10. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. No function has yet been ascribed to Nek10. The gene encoding Nek10 is a putative causative gene for breast cancer; it is located within a breast cancer susceptibility locus on chromosome 3p24. Nek10 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
27423 270868 cd08529 STKc_FA2-like Catalytic domain of the Serine/Threonine Kinases, Chlamydomonas reinhardtii FA2 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chlamydomonas reinhardtii FA2 was discovered in a genetic screen for deflagellation-defective mutants. It is essential for basal-body/centriole-associated microtubule severing, and plays a role in cell cycle progression. No cellular function has yet been ascribed to CNK4. The Chlamydomonas reinhardtii FA2-like subfamily belongs to the (NIMA)-related kinase (Nek) family, which includes seven different Chlamydomonas Neks (CNKs 1-6 and Fa2). This subfamily contains FA2 and CNK4. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
27424 270869 cd08530 STKc_CNK2-like Catalytic domain of the Serine/Threonine Kinases, Chlamydomonas reinhardtii CNK2 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chlamydomonas reinhardtii CNK2 has both ciliary and cell cycle functions. It influences flagellar length through promoting flagellar disassembly, and it regulates cell size through influencing the size threshold at which cells commit to mitosis. This subfamily belongs to the (NIMA)-related kinase (Nek) family, which includes seven different Chlamydomonas Neks (CNKs 1-6 and Fa2). This subfamily includes CNK1 and CNK2. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
27425 188877 cd08531 SAM_PNT-ERG_FLI-1 Sterile alpha motif (SAM)/Pointed domain of ERG (Ets related gene) and FLI-1 (Friend leukemia integration 1) transcription factors. SAM Pointed domain of ERG/FLI-1 subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. The ERG and FLI regulators are involved in endothelial cell differentiation, bone morphogenesis and neural crest development. They are proto-oncogenes implicated in cancer development such as myeloid leukemia, Ewing's sarcoma and erythroleukemia. Members of this subfamily are potential targets for cancer therapy. 75
27426 188878 cd08532 SAM_PNT-PDEF-like Sterile alpha motif (SAM)/Pointed domain of prostate-derived ETS factor. SAM Pointed domain of PDEF-like (Prostate-Derived ETS Factor) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. In human males this activator is highly expressed in the prostate gland and enhances androgen-mediated activation of the PSA promoter through interaction with the DNA binding domain of the androgen receptor. PDEF may play a role in prostate cancer development as well as in goblet cell formation and mucus production in the epithelial lining of respiratory and intestinal tracts. 81
27427 188879 cd08533 SAM_PNT-ETS-1,2 Sterile alpha motif (SAM)/Pointed domain of ETS-1,2 family. SAM Pointed domain of ETS-1,2 family of transcriptional activators is a protein-protein interaction domain. It carries a kinase docking site and mediates interaction between ETS transcriptional activators and protein kinases. This group of transcriptional factors is involved in the Ras/MAP kinase signaling pathway. MAP kinases phosphorylate the transcription factors. Phosphorylated factors then recruit coactivators and enhance transactivation. Members of this group play a role in regulation of different embryonic developmental processes. ETS-1,2 transcriptional activators are proto-oncogenes involved in malignant transformation and tumor progression. They are potential molecular targets for selective cancer therapy. 71
27428 176084 cd08534 SAM_PNT-GABP-alpha Sterile alpha motif (SAM)/Pointed domain of GA-binding protein (GABP) alpha chain. SAM Pointed domain of GA-binding protein (GABP) alpha subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. This type of transcriptional regulators forms heterotetramers containing two alpha and two beta subunits. It interacts with GA repeats (purine rich repeats). GABP transcriptional factors control gene expression in cell cycle control, apoptosis, and cellular respiration. GABP participates in regulation of transmembrane receptors and key hormones especially in myeloid cells and at the neuromuscular junction. 89
27429 176085 cd08535 SAM_PNT-Tel_Yan Sterile alpha motif (SAM)/Pointed domain of Tel/Yan protein. SAM Pointed domain of Tel (Translocation, Ets, Leukemia)/Yan subfamily of ETS transcriptional repressors is a protein-protein interaction domain. SAM Pointed domains of this type of regulators can interact with each other, forming head-to-tail homodimers or homooligomers, and/or interact with SAM Pointed domains of another subfamily of ETS factors forming heterodimers. The oligomeric form is able to block transcription of target genes and is involved in MAPK signaling. They participate in regulation of different processes during embryonic development including hematopoietic differentiation and eye development. Tel/Yan transcriptional factors are frequent targets of chromosomal translocations resulting in fusions of SAM domain with new neighboring genes. Such chimeric proteins were found in different tumors. Members of this subfamily are potential targets for cancer therapy. 68
27430 176086 cd08536 SAM_PNT-Mae Sterile alpha motif (SAM)/Pointed domain of Mae protein homolog. Mae (Modulator of the Activity of ETS) subfamily represents a group of SAM Pointed monodomain proteins. SAM Pointed domain is a protein-protein interaction domain. It can interact with other SAM pointed domains forming head-to-tail heterodimers and also provides a kinase docking site. For example, in Drosophila Mae is required for facilitating phosphorylation of the Yan factor and for blocking phosphorylation of the ETS-2 regulator. Mae interacts with the SAM Pointed domains of Yan and ETS-2. Binding enhances access of the kinase to the Yan phosphorylation site by providing a kinase docking site, or inhibits phosphorylation of ETS-2 by blocking its docking site. This type of factors participates in regulation of kinase signaling particularly during embryogenesis. 66
27431 188880 cd08537 SAM_PNT-ESE-1-like Sterile alpha motif (SAM)/Pointed domain of ESE-1 like ETS transcriptional regulators. SAM Pointed domain of ESE-1-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. SAM Pointed domain of ESE-1 provides a potential docking site for signaling kinase Pak1 in humans. ESE-1 factors are involved in regulation of gene expression in different types of epithelial cells. ESE-1 is expressed in many different organs including intestine, stomach, pancreas, lungs, kidneys, and prostate. The DNA binding consensus motif for ESE-1 consists of a purine-rich GGA[AT] core sequence. The expression profile of these factors is altered in epithelial cancers if compared to normal tissues. Members of this subfamily are potential targets for cancer therapy. 81
27432 188881 cd08538 SAM_PNT-ESE-2-like Sterile alpha motif (SAM)/Pointed domain of ESE-2 like ETS transcriptional regulators. SAM Pointed domain of ESE-2-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. ESE-2 factors are involved in regulation of gene expression in a variety of epithelial (glandular and secretory) cells. ESE-2 mRNA was found in skin keratinocytes, salivary gland, mammary gland, stomach, prostate, and kidneys. The DNA binding consensus motif for ESE-2 consists of a GGA core and AT-rich flanks. The expression profiles of these factors are altered in epithelial cancers. Members of this subfamily are potential targets for cancer therapy. 84
27433 188882 cd08539 SAM_PNT-ESE-3-like Sterile alpha motif (SAM)/Pointed domain of ESE-3 like ETS transcriptional regulators. SAM Pointed domain of ESE-3-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. The ESE-3 transcriptional activator is involved in regulation of glandular epithelium differentiation through the MAP kinase signaling cascade. It is found to be expressed in glandular epithelium of prostate, pancreas, salivary gland, and trachea. Additionally, ESE-3 is differentially expressed during monocyte-derived dendritic cells development. DNA binding consensus motif for ESE-3 consists of purine-rich GGAA/T core sequence. The expression profiles of these factors are altered in epithelial cancers. Members of this subfamily are potential targets for cancer therapy. 78
27434 176090 cd08540 SAM_PNT-ERG Sterile alpha motif (SAM)/Pointed domain of ERG transcription factor. SAM Pointed domain of ERG subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It may participate in formation of homodimers or heterodimers with ETS-2, Fli-1, ER81, and Pu-1. However, dimeric forms are inactive and SAM Pointed domain is not essential for dimerization, since ER81 and Pu-1 do not have it. In mouse, a regulator of this type binds the ESET histone H3-specific methyltransferase (human homolog is SETDB1), which leads to modification of the local chromatin structure through histone methylation. ERG regulators are involved in endothelial cell differentiation, bone morphogenesis and neural crest development. The Erg gene is a proto-oncogene. It is a target of chromosomal translocations resulting in fusions with other neighboring genes. Chimeric proteins were found in solid tumors such as myeloid leukemia or Ewing's sarcoma. Members of this subfamily are potential targets for cancer therapy. 75
27435 188883 cd08541 SAM_PNT-FLI-1 Sterile alpha motif (SAM)/Pointed domain of friend leukemia integration 1 transcription activator. SAM Pointed domain of FLI-1 (Friend Leukemia Integration) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. The FLI-1 protein participates in regulation of cellular differentiation, proliferation, and survival. The Fli-1 gene was initially described in Friend virus-induced erythroleukemias as a site for virus integration. It is highly expressed in hematopoietic tissues and at lower level in lungs, heart, and ovaries. Fli-1 is a proto-oncogene implicated in Ewing's sarcoma and erythroleukemia. Members of this subfamily are potential targets for cancer therapy. 91
27436 176092 cd08542 SAM_PNT-ETS-1 Sterile alpha motif (SAM)/Pointed domain of ETS-1. SAM Pointed domain of ETS-1 subfamily of ETS transcriptional activators is a protein-protein interaction domain. The ETS-1 activator is regulated by phosphorylation. It contains a docking site for the ERK2 MAP (Mitogen Activated Protein) kinase, while the ERK2 phosphorylation site is located in the N-terminal disordered region upstream of the SAM Pointed domain. Mutations of the kinase docking site residues inhibit phosphorylation. ETS-1 activators play a role in a number of different physiological processes, and they are expressed during embryonic development, including blood vessel formation, hematopoietic, lymphoid, neuronal and osteogenic differentiation. The Ets-1 gene is a proto-oncogene involved in progression of different tumors (including breast cancer, meningioma, and prostate cancer). Members of this subfamily are potential molecular targets for selective cancer therapy. 88
27437 188884 cd08543 SAM_PNT-ETS-2 Sterile alpha motif (SAM)/Pointed domain of ETS-2. SAM Pointed domain of ETS-2 subfamily of ETS transcriptional regulators is a protein-protein interaction domain. It contains a docking site for Cdk10 (cyclin-dependent kinase 10), a member of the Cdc2 kinase family. The interaction between ETS-2 and Cdk10 kinase inhibits ETS-2 transactivation activity in mammals. ETS-2 is also regulated by ERK2 MAP kinase. ETS-2, which is phosphorylated by ERK2, can interact with coactivators and enhance transactivation. ETS-2 transcriptional activators are involved in embryonic development and cell cycle control. The Ets-2 gene is a proto-oncogene. It is overexpressed in breast and prostate cancer cells and its overexpression is necessary for transformation of such cells. Members of ETS-2 subfamily are potential molecular targets for selective cancer therapy. 89
27438 260081 cd08544 Reeler Reeler, the N-terminal domain of reelin, F-spondin, and a variety of other proteins. This domain is found at the N-terminus of F-spondin, a protein attached to the extracellular matrix, which plays roles in neuronal development and vascular remodelling. The F-spondin reeler domain has been reported to bind heparin. The reeler domain is also found at the N-terminus of reelin, an extracellular glycoprotein involved in the development of the brain cortex, and in a variety of other eukaryotic proteins with different domain architectures, including the animal ferric-chelate reductase 1 or stromal cell-derived receptor 2, a member of the cytochrome B561 family, which reduces ferric iron before its transport from the endosome to the cytoplasm. Also included is the insect putative defense protein 1, which is expressed upon bacterial infection and appears to contain a single reeler domain. 135
27439 260082 cd08545 YcnI_like Reeler-like domain of YcnI and similar proteins. YcnI is a copper-responsive gene of Bacillus subtilis. It is homologous to an uncharacterized protein from Nocardia farcinica, which shares a conserved three-dimensional structure with cohesins and the reeler domain. Some members in this YcnI_like family have C-terminal domains (DUF461) that may bind copper. 152
27440 260083 cd08546 cohesin_like Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesin modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. Cohesin modules are phylogenetically distributed into three groups: type I cohesin-dockerin interactions mediate assembly of a range of dockerin-borne enzymes to the complex, while type-II interactions mediate attachment of the cellulosome complex to the bacterial cell wall. Recently discovered type-III cohesins, such as found in the anchoring scaffoldin ScaE, appear to contribute to increased stability of the elaborate cellulosome complex. While the presence of cohesin and dockerin domains in a genome can be indicative of cellulolytic activity, cohesin domains may occur in a wider range of domain architectures, biological systems, and taxonomic lineages. 144
27441 260084 cd08547 Type_II_cohesin Type II cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesin modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type II cohesins; their interactions with dockerin mediate attachment of the cellulosome complex to the bacterial cell wall. 136
27442 260085 cd08548 Type_I_cohesin_like Type I cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesin modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I cohesins; their interactions with dockerin mediate assembly of a range of dockerin-borne enzymes to the complex. 135
27443 341479 cd08549 G1PDH_related Glycerol-1-phosphate dehydrogenase and related proteins. This family contains bacterial and archaeal glycerol-1-phosphate dehydrogenase-like oxidoreductases. These proteins have similarity with glycerol-1-phosphate dehydrogenase (G1PDH) which plays a role in the synthesis of phosphoglycerolipids in gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in a NADH-dependent manner. Its activity requires Ni++ ion. It also contains archaeal Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH) that plays an important role in the formation of the enantiomeric configuration of the glycerophosphate backbone (sn-glycerol-1-phosphate) of archaeal ether lipids. 331
27444 341480 cd08550 GlyDH-like Glycerol_dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins. Glycerol dehydrogenase (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site. Some subfamilies have yet to be characterized. 347
27445 341481 cd08551 Fe-ADH iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains large metal-containing alcohol dehydrogenases (ADH), known as iron-containing alcohol dehydrogenases. They contain a dehydroquinate synthase-like protein structural fold and mostly contain iron. They are distinct from other alcohol dehydrogenases which contain different protein domains. There are several distinct families of alcohol dehydrogenases: Zinc-containing long-chain alcohol dehydrogenases, insect-type, or short-chain alcohol dehydrogenases, iron-containing alcohol dehydrogenases, among others. The iron-containing family has a Rossmann fold-like topology that resembles the fold of the zinc-dependent alcohol dehydrogenases, but lacks sequence homology, and differs in strand arrangement. ADH catalyzes the reversible oxidation of alcohol to acetaldehyde with the simultaneous reduction of NAD(P)+ to NAD(P)H. 372
27446 350202 cd08553 PIN_Fcf1-like VapC-like PIN domain of rRNA-processing proteins, Fcf1 (Utp24, YDR339C), Utp23 (YOR004W), and other eukaryotic homologs. Fcf1 (FAF1-copurifying factor 1, also known as Utp24) and Utp23 (U three-associated protein 23) are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly. Components of the small subunit (SSU) processome, Fcf1 and Utp23 are essential nucleolar proteins that are required for processing of the 18S pre-rRNA at sites A0-A2. The Fcf1 protein was reported to interact with Pmc1p (vacuolar Ca2+ ATPase) and Cor1p (core subunit of the ubiquinol-cytochrome c reductase complex). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. The subfamily of Fcf1- and Utp23-like homologs has three of the four conserved residues found in S. cerevisiae Fcf1. Some members of the superfamily, including S. cerevisiae Utp23, lack several of these key catalytic residues. Mutation of the remaining conserved putative active site residues seen in Utp23 did not interfere with rRNA maturation and cell viability. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 123
27447 176489 cd08554 Cyt_b561 Eukaryotic cytochrome b(561). Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous. 131
27448 176498 cd08555 PI-PLCc_GDPD_SF Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily. The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases. 179
27449 176499 cd08556 GDPD Glycerophosphodiester phosphodiesterase domain as found in prokaryota and eukaryota, and similar proteins. The typical glycerophosphodiester phosphodiesterase domain (GDPD) consists of a TIM barrel and a small insertion domain named the GDPD-insertion (GDPD-I) domain, which is specific for GDPD proteins. This family corresponds to both typical GDPD domain and GDPD-like domain which lacks the GDPD-I region. Members in this family mainly consist of a large family of prokaryotic and eukaryotic glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), and a number of uncharacterized homologs. Sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria are also included in this family. GDPD plays an essential role in glycerol metabolism and catalyzes the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. Its catalytic mechanism is based on the metal ion-dependent acid-base reaction, which is similar to that of phosphoinositide-specific phospholipases C (PI-PLCs, EC 3.1.4.11). Both GDPD-related proteins and PI-PLCs belong to the superfamily of PI-PLC-like phosphodiesterases. 189
27450 176500 cd08557 PI-PLCc_bacteria_like Catalytic domain of bacterial phosphatidylinositol-specific phospholipase C and similar proteins. This subfamily corresponds to the catalytic domain present in bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) and their sequence homologs found in eukaryota. Bacterial PI-PLCs participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. Bacterial PI-PLCs contain a single TIM-barrel type catalytic domain. Its catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. Eukaryotic homologs in this family are named as phosphatidylinositol-specific phospholipase C X domain containing proteins (PI-PLCXD). They are distinct from the typical eukaryotic phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11), which have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, which is closely related to that of bacterial PI-PLCs. Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. This family also includes a distinctly different type of eukaryotic PLC, glycosylphosphatidylinositol-specific phospholipase C (GPI-PLC), an integral membrane protein characterized in the protozoan parasite Trypanosoma brucei. T. brucei GPI-PLC hydrolyzes the GPI-anchor on the variant specific glycoprotein (VSG), releasing dimyristyl glycerol (DMG), which may facilitate the evasion of the protozoan to the host's immune system. It does not require Ca2+ for its activity and is more closely related to bacterial PI-PLCs, but not mammalian PI-PLCs. 271
27451 176501 cd08558 PI-PLCc_eukaryota Catalytic domain of eukaryotic phosphoinositide-specific phospholipase C and similar proteins. This family corresponds to the catalytic domain present in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) and similar proteins. The higher eukaryotic PI-PLCs play a critical role in most signal transduction pathways, controlling numerous cellular events such as cell growth, proliferation, excitation and secretion. They strictly require Ca2+ for the catalytic activity. They display a clear preference towards the hydrolysis of the more highly phosphorylated membrane phospholipids PI-analogues, phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidylinositol-4-phosphate (PIP), to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. The eukaryotic PI-PLCs have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains, such as the pleckstrin homology (PH) domain, EF-hand motif, and C2 domain. The catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a linker region. The catalytic mechanism of eukaryotic PI-PLCs is based on general base and acid catalysis utilizing two well conserved histidines and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. The mammalian PI-PLCs consist of 13 isozymes, which are classified into six-subfamilies, PI-PLC-delta (1,3 and 4), -beta(1-4), -gamma(1,2), -epsilon, -zeta, and -eta (1,2). Ca2+ is required for the activation of all forms of mammalian PI-PLCs, and the concentration of calcium influences substrate specificity. This family also includes metazoan phospholipase C related but catalytically inactive proteins (PRIP), which belong to a group of novel inositol trisphosphate binding proteins. Due to the replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity. 226
27452 176502 cd08559 GDPD_periplasmic_GlpQ_like Periplasmic glycerophosphodiester phosphodiesterase domain (GlpQ) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in bacterial and eukaryotic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46) similar to Escherichia coli periplasmic phosphodiesterase GlpQ. GP-GDEs are involved in glycerol metabolism and catalyze the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. In E. coli, there are two major G3P uptake systems: Glp and Ugp, which contain genes coding for two different GP-GDEs. GlpQ gene from the glp operon codes for a periplasmic phosphodiesterase GlpQ. GlpQ is a dimeric enzyme that hydrolyzes periplasmic glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), to the corresponding alcohols and G3P, which is subsequently transported into the cell through the GlpT transport system. Ca2+ is required for GlpQ enzymatic activity. This subfamily also includes some GP-GDEs in higher plants and their eukaryotic homologs, which show very high sequence similarities with bacterial periplasmic GP-GDEs. 296
27453 176503 cd08560 GDPD_EcGlpQ_like_1 Glycerophosphodiester phosphodiesterase domain similar to Escherichia coli periplasmic phosphodiesterase (GlpQ) include uncharacterized proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) and their hypothetical homologs. Members in this subfamily show high sequence similarity to Escherichia coli periplasmic phosphodiesterase GlpQ, which catalyzes the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 356
27454 176504 cd08561 GDPD_cytoplasmic_ScUgpQ2_like Glycerophosphodiester phosphodiesterase domain of Streptomyces coelicolor cytoplasmic phosphodiesterases UgpQ2 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized cytoplasmic phosphodiesterases which predominantly exist in bacteria. The prototype of this family is a putative cytoplasmic phosphodiesterase encoded by gene ulpQ2 (SCO1419) in the Streptomyces coelicolor genome. It is distantly related to the Escherichia coli cytoplasmic phosphodiesterase UgpQ, which catalyzes the hydrolysis of glycerophosphodiesters at the inner side of the cytoplasmic membrane to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 249
27455 176505 cd08562 GDPD_EcUgpQ_like Glycerophosphodiester phosphodiesterase domain in Escherichia coli cytosolic glycerophosphodiester phosphodiesterase UgpQ and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Escherichia coli cytosolic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46), UgpQ, and similar proteins. GP-GDE plays an essential role in the metabolic pathway of E. coli. It catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. E. coli possesses two major G3P uptake systems: Glp and Ugp, which contain genes coding for two distinct GP-GDEs. UgpQ gene from the E. coli ugp operon codes for a cytosolic phosphodiesterase UgpQ, which is the prototype of this family. Various glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), can only be hydrolyzed by UgpQ during transport at the inner side of the cytoplasmic membrane to alcohols and G3P, which is a source of phosphate. In contrast to Ca2+-dependent periplasmic phosphodiesterase GlpQ, cytosolic phosphodiesterase UgpQ requires divalent cations, such as Mg2+, Co2+, or Mn2+, for its enzyme activity. 229
27456 176506 cd08563 GDPD_TtGDE_like Glycerophosphodiester phosphodiesterase domain of Thermoanaerobacter tengcongensis and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Thermoanaerobacter tengcongensis glycerophosphodiester phosphodiesterase (TtGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Despite the fact that most of GDPD family members exist as the monomer, TtGDE can function as a dimeric unit. Its catalytic mechanism is based on the general base-acid catalysis, which is similar to that of phosphoinositide-specific phospholipases C (PI-PLCs, EC 3.1.4.11). A divalent metal cation is required for the enzyme activity of TtGDE. 230
27457 176507 cd08564 GDPD_GsGDE_like Glycerophosphodiester phosphodiesterase domain of putative Galdieria sulphuraria glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative Galdieria sulphuraria glycerophosphodiester phosphodiesterase (GsGDE, EC 3.1.4.46) and its uncharacterized eukaryotic homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 265
27458 176508 cd08565 GDPD_pAtGDE_like Glycerophosphodiester phosphodiesterase domain of putative Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase (pAtGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 235
27459 176509 cd08566 GDPD_AtGDE_like Glycerophosphodiester phosphodiesterase domain of Agrobacterium tumefaciens and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase (AtGDE, EC 3.1.4.46) and its uncharacterized eukaryotic homologues. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. AtGDE exists as a hexamer that is a trimer of dimers, which is unique among currently known GDPD family members. However, it remains unclear if the hexamer plays a physiological role in AtGDE enzymatic function. 240
27460 176510 cd08567 GDPD_SpGDE_like Glycerophosphodiester phosphodiesterase domain of putative Silicibacter pomeroyi glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) and similar proteins. The prototype of this CD is a putative GP-GDE from Silicibacter pomeroyi (SpGDE). It shows high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 263
27461 176511 cd08568 GDPD_TmGDE_like Glycerophosphodiester phosphodiesterase domain of Thermotoga maritima and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Thermotoga maritima glycerophosphodiester phosphodiesterase (TmGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. TmGDE exists as a monomer that might be the biologically relevant form. 226
27462 176512 cd08570 GDPD_YPL206cp_fungi Glycerophosphodiester phosphodiesterase domain of Saccharomyces cerevisiae YPL206cp and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Saccharomyces cerevisiae YPL206cp and uncharacterized hypothetical homologs existing in fungi. The product of S. cerevisiae ORF YPL206c (PGC1), YPL206cp (Pgc1p), displays homology to bacterial and mammalian glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), which catalyze the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. S. cerevisiae YPL206cp is an integral membrane protein with a single GDPD domain followed by a short hydrophobic C-terminal tail that may function as a membrane anchor. This protein plays an essential role in the regulation of the cardiolipin (CL) biosynthetic pathway in yeast by removing the excess phosphatidylglycerol (PG) content of membranes via a phospholipase C-type degradation mechanism. YPL206cp has been characterized as a PG-specific phospholipase C that selectively catalyzes the cleavage of PG, not glycerophosphoinositol (GPI) or glycerophosphocholine (GPC), to diacylglycerol (DAG) and glycerophosphate. Members in this family are distantly related to S. cerevisiae YPL110cp, which selectively hydrolyzes glycerophosphocholine (GPC), not glycerophosphoinositol (GPI), to generate choline and glycerophosphate, and has been characterized as a cytoplasmic GPC-specific phosphodiesterase. 234
27463 176513 cd08571 GDPD_SHV3_plant Glycerophosphodiester phosphodiesterase domain of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase (GDPD) domain present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Both SHV3 and SVLs have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical functions. Most members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens. 302
27464 176514 cd08572 GDPD_GDE5_like Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE5-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian glycerophosphodiester phosphodiesterase GDE5-like proteins. GDE5 is widely expressed in mammalian tissues, with highest expression in spinal cord. Although its biological function remains unclear, mammalian GDE5 shows higher sequence homology to fungal and plant glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46) than to other bacterial and mammalian GP-GDEs. It may also hydrolyze glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 293
27465 176515 cd08573 GDPD_GDE1 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE1 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE1 (also known as MIR16, membrane interacting protein of RGS16) and their metazoan homologs. GDE1 is widely expressed in mammalian tissues, including the heart, brain, liver, and kidney. It shows sequence homology to bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. GDE1 has been characterized as GPI-GDE (EC 3.1.4.44) that selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate glycerol phosphate and inositol. It functions as an integral membrane-bound glycoprotein interacting with regulator of G protein signaling protein RGS16, and is modulated by G protein-coupled receptor (GPCR) signaling. In addition, GDE1 may interact with PRA1 domain family, member 2 (PRAF2, also known as JM4), which is an interacting protein of the G protein-coupled chemokine receptor CCR5. The catalytic activity, which is dependent on the integrity of the GDPD domain, is required for GDE1 cellular function. 258
27466 176516 cd08574 GDPD_GDE_2_3_6 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE2, GDE3, GDE6-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian glycerophosphodiester phosphodiesterase domain-containing protein subtype 5 (GDE2), subtype 2 (GDE3), subtype 1 (GDE6), and their eukaryotic homologs. Mammalian GDE2, GDE3, and GDE6 show very high sequence similarity to each other and have been classified into the same family. Although they are all transmembrane proteins, based on different pattern of tissue distribution, these enzymes might display diverse cellular functions. Mammalian GDE2 is primarily expressed in mature neurons. It selectively hydrolyzes glycerophosphocholine (GPC) and mainly functions in a complex with an antioxidant scavenger peroxiredoxin1 (Prdx1) to control motor neuron differentiation in the spinal cord. Mammalian GDE3 is specifically expressed in bone tissues and spleen. It selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate inositol 1-phosphate (Ins1P) and glycerol and functions as an inducer of osteoblast differentiation. Mammalian GDE6 is predominantly expressed in the spermatocytes of testis, and its specific physiological function has not been elucidated yet. 252
27467 176517 cd08575 GDPD_GDE4_like Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE4-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE4 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 1 (GDPD1)) and similar proteins. Mammalian GDE4 is a transmembrane protein whose cellular function is not elucidated. It is expressed widely, including in placenta, liver, kidney, pancreas, spleen, thymus, ovary, small intestine and peripheral blood leukocytes. It is also expressed in the growth cones in neuroblastoma Neuro2a cells, which suggests mammalian GDE4 may play some distinct role from other members of mammalian GDEs family. Also included in this subfamily are uncharacterized mammalian glycerophosphodiester phosphodiesterase domain-containing protein 3 (GDPD3) and similar proteins which display very high sequence homology to mammalian GDE4. 264
27468 176518 cd08576 GDPD_like_SMaseD_PLD Glycerophosphodiester phosphodiesterase-like domain of spider venom sphingomyelinases D, bacterial phospholipase D, and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase-like domain (GDPD-like) present in sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.4) from spider venom, the Corynebacterium pseudotuberculosis Phospholipase D (PLD)-like protein from pathogenic bacteria, and the Ajellomyces capsulatus H143 PLD-like protein from ascomycetes. Spider SMases D and bacterial PLD proteins catalyze the Mg2+-dependent hydrolysis of sphingomyelin producing choline and ceramide 1-phosphate (C1P), which possess a number of biological functions, such as regulating cell proliferation and apoptosis, participating in inflammatory responses, and playing a key role in phagocytosis. In the presence of Mg2+, SMases D can function as lysophospholipase D and hydrolyze lysophosphatidylcholine (LPC) to choline and lysophosphatidic acid (LPA), which is a multifunctional phospholipid involved in platelet aggregation, endothelial hyperpermeability, and pro-inflammatory responses. Loxosceles spider venoms' SMases D are the principal toxins responsible for dermonecrosis and complement dependent haemolysis induced by spider venom. Due to amino acid substitutions at the entrance to the active-site pocket, some members lack activity. The typical GDPD domain consists of a TIM barrel and a small insertion domain named as the GDPD-insertion (GDPD-I) domain, which is specific for GDPD proteins. Although proteins in this family contain a non-typical GDPD domain which lacks the GDPD-I, their catalytic mechanisms are based on Mg2+-dependent acid-base reactions similar to GDPD proteins. They might be divergent members of the GDPD family. Moreover, this family does not belong to phospholipase D (PLD) superfamily, since it lacks the conserved HKD sequence motif that characterizes the catalytic center of the PLD superfamily. It belongs to the superfamily of PLC-like phosphodiesterases. 265
27469 176519 cd08577 PI-PLCc_GDPD_SF_unchar3 Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues. 228
27470 176520 cd08578 GDPD_NUC-2_fungi Putative glycerophosphodiester phosphodiesterase domain of ankyrin repeat protein NUC-2 and similar proteins. This subfamily corresponds to a putative glycerophosphodiester phosphodiesterase domain (GDPD) present in Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81. Some uncharacterized NUC-2 sequence homologs are also included in this family. NUC-2 plays an important role in the phosphate-regulated signal transduction pathway in Neurospora crassa. It shows high similarity to a cyclin-dependent kinase inhibitory protein PHO81, which is part of the phosphate regulatory cascade in S. cerevisiae. Both NUC-2 and PHO81 have multi-domain architecture, including an SPX N-terminal domain followed by several ankyrin repeats and a putative C-terminal GDPD domain with unknown function. Although the putative GDPD domain displays sequence homology to that of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), the residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in members of this family, which suggests the function of putative GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. 300
27471 176521 cd08579 GDPD_memb_like Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial glycerophosphodiester phosphodiesterases. In addition to a C-terminal GDPD domain, most members in this family have an N-terminus that functions as a membrane anchor. 220
27472 176522 cd08580 GDPD_Rv2277c_like Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial protein Rv2277c and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial protein Rv2277c and similar proteins. Members in this subfamily are bacterial homologs of mammalian GDE4, a transmembrane protein whose cellular function has not yet been elucidated. 263
27473 176523 cd08581 GDPD_like_1 Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity to Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 229
27474 176524 cd08582 GDPD_like_2 Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity to Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 233
27475 176525 cd08583 PI-PLCc_GDPD_SF_unchar1 Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues. 237
27476 176526 cd08584 PI-PLCc_GDPD_SF_unchar2 Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues. 192
27477 176527 cd08585 GDPD_like_3 Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity with Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 237
27478 176528 cd08586 PI-PLCc_BcPLC_like Catalytic domain of Bacillus cereus phosphatidylinositol-specific phospholipases C and similar proteins. This subfamily corresponds to the catalytic domain present in Bacillus cereus phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) and its sequence homologs found in bacteria and eukaryota. Bacterial PI-PLCs participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. Bacterial PI-PLCs contain a single TIM-barrel type catalytic domain. Their catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This family also includes some uncharacterized eukaryotic homologs, which contain a single TIM-barrel type catalytic domain, the X domain. They are similar to bacterial PI-PLCs, and distinct from typical eukaryotic PI-PLCs, which have a multidomain organization that consists of a PLC catalytic core domain and various regulatory domains, and strictly require Ca2+ for their catalytic activities. The prototype of this family is Bacillus cereus PI-PLC, which has a moderate thermal stability and is active as a monomer. 279
27479 176529 cd08587 PI-PLCXDc_like Catalytic domain of phosphatidylinositol-specific phospholipase C X domain containing and similar proteins. This family corresponds to the catalytic domain present in phosphatidylinositol-specific phospholipase C X domain containing proteins (PI-PLCXD) which are bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) sequence homologs mainly found in eukaryota. The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs and their bacterial homologs contain a single TIM-barrel type catalytic domain, X domain, which is more closely related to that of bacterial PI-PLCs. Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 288
27480 176530 cd08588 PI-PLCc_At5g67130_like Catalytic domain of Arabidopsis thaliana PI-PLC X domain-containing protein At5g67130 and its uncharacterized homologs. This subfamily corresponds to the catalytic domain present in Arabidopsis thaliana PI-PLC X domain-containing protein At5g67130 and its uncharacterized homologs. Members in this family show high sequence similarity to bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), which participates in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). 270
27481 176531 cd08589 PI-PLCc_SaPLC1_like Catalytic domain of Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1-like proteins. This subfamily corresponds to the catalytic domain present in Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1 (SaPLC1) and similar proteins. The typical bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) catalyzes Ca2+-independent hydrolysis of the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). The catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. In contrast, SaPLC1 is the first known natural Ca2+-dependent bacterial PI-PLC. It is more closely related to the eukaryotic PI-PLCs rather than the typical bacterial PI-PLCs. It participates in PI metabolism to generate myo-inositol-1-phosphate and myo-inositol-1:2-cyclic phosphate simultaneously. SaPLC1 and other members in this subfamily have two Ca2+-chelating amino acid substitutions which convert them from metal-independent enzymes to metal-dependent bacterial PI-PLC. Additionally, SaPLC1 active site utilizes a mechanism of amino acid juxtaposition, swapping amino acid positions, to adapt a calcium binding pocket and maintain more ideal active site geometry to support efficient catalysis. 324
27482 176532 cd08590 PI-PLCc_Rv2075c_like Catalytic domain of uncharacterized Mycobacterium tuberculosis Rv2075c-like proteins. This subfamily corresponds to the catalytic domain present in uncharacterized Mycobacterium tuberculosis Rv2075c and its homologs. Members in this family are more closely related to the Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1(SaPLC1)-like proteins rather than the typical bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). In contrast, SaPLC1-like proteins have two Ca2+-chelating amino acid substitutions which convert them to metal-dependent bacterial PI-PLC. Rv2075c and its homologs have the same amino acid substitutions as well, which might suggest they have metal-dependent PI-PLC activity. 267
27483 176533 cd08591 PI-PLCc_beta Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are four PLC-beta isozymes (1-4). They are activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. The beta-gamma subunits of heterotrimeric G proteins are known to activate the PLC-beta2 and -beta3 isozymes only. Aside from four PLC-beta isozymes identified in mammals, some eukaryotic PLC-beta homologs have been classified into this subfamily, such as NorpA and PLC-21 from Drosophila and PLC-beta from turkey, Xenopus, sponge, and hydra. 257
27484 176534 cd08592 PI-PLCc_gamma Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region are present within this linker region. There are two PI-PLC-gamma isozymes (1-2). They are activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region. Aside from the two PI-PLC-gamma isozymes identified in mammals, some eukaryotic PI-PLC-gamma homologs have been classified with this subfamily. 229
27485 176535 cd08593 PI-PLCc_delta Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This CD corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1,3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Different PI-PLC-delta isozymes have different tissue distribution and different subcellular locations. PI-PLC-delta1 is mostly a cytoplasmic protein, PI-PLC-delta3 is located in the membrane, and PI-PLC-delta4 is predominantly detected in the cell nucleus. Aside from three PI-PLC-delta isozymes identified in mammals, some eukaryotic PI-PLC-delta homologs have been classified to this CD. 257
27486 176536 cd08594 PI-PLCc_eta Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are two PI-PLC-eta isozymes (1-2), both neuron-specific enzymes. They function as calcium sensors that are activated by small increases in intracellular calcium concentrations. The PI-PLC-eta isozymes are also activated through GPCR stimulation. Aside from the PI-PLC-eta isozymes identified in mammals, their eukaryotic homologs are also present in this family. 227
27487 176537 cd08595 PI-PLCc_zeta Catalytic domain of metazoan phosphoinositide-specific phospholipase C-zeta. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-zeta isozyme. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-zeta represents a class of sperm-specific PI-PLC that has an N-terminal EF-hand domain, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is one PLC-zeta isozyme (1). PLC-zeta plays a fundamental role in vertebrate fertilization by initiating intracellular calcium oscillations that trigger embryo development. However, the mechanism of its activation still remains unclear. Aside from PI-PLC-zeta identified in mammals, its eukaryotic homologs have been classified with this family. 257
27488 176538 cd08596 PI-PLCc_epsilon Catalytic domain of metazoan phosphoinositide-specific phospholipase C-epsilon. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-epsilon isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-epsilon represents a class of mammalian PI-PLC that has an N-terminal CDC25 homology domain with a guanyl-nucleotide exchange factor (GEF) activity, a pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and two predicted RA (Ras association) domains that are implicated in the binding of small GTPases, such as Ras or Rap, from the Ras family. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is one PI-PLC-epsilon isozyme (1). PI-PLC-epsilon is activated by G alpha(12/13), G beta gamma, and activated members of Ras and Rho small GTPases. Aside from PI-PLC-epsilon identified in mammals, its eukaryotic homologs have been classified with this family. 254
27489 176539 cd08597 PI-PLCc_PRIP_metazoa Catalytic domain of metazoan phospholipase C related, but catalytically inactive protein. This family corresponds to the catalytic domain present in metazoan phospholipase C related, but catalytically inactive proteins (PRIP), which belong to a group of novel Inositol 1,4,5-trisphosphate (InsP3) binding proteins. PRIP has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity. PRIP consists of two subfamilies, PRIP-1 (previously known as p130 or PLC-1), which is predominantly expressed in the brain, and PRIP-2 (previously known as PLC-2), which exhibits a relatively ubiquitous expression. Experiments show that both PRIP-1 and PRIP-2 are involved in the InsP3-mediated calcium signaling pathway and the GABA(A) receptor-mediated signaling pathway. In addition, PRIP-2 acts as a negative regulator of B-cell receptor signaling and immune responses. 260
27490 176540 cd08598 PI-PLC1c_yeast Catalytic domain of putative yeast phosphatidylinositide-specific phospholipases C. This family corresponds to the catalytic domain present in a group of putative phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) encoded by PLC1 genes from yeasts, which are homologs of the delta isoforms of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The prototype of this CD is protein Plc1p encoded by PLC1 genes from Saccharomyces cerevisiae. Plc1p contains both highly conserved X- and Y- regions of PLC catalytic core domain, as well as a presumptive EF-hand like calcium binding motif. Experiments show that Plc1p displays calcium dependent catalytic properties with high similarity to those of the mammalian PLCs, and plays multiple roles in modulating the membrane/protein interactions in filamentation control. CaPlc1p encoded by CAPLC1 from the closely related yeast Candida albicans, an orthologue of S. cerevisiae Plc1p, is also included in this group. Like Plc1p, CaPlc1p has conserved presumptive catalytic domain, shows PLC activity when expressed in E. coli, and is involved in multiple cellular processes. There are two other gene copies of CAPLC1 in C. albicans, CAPLC2 (also named as PIPLC) and CAPLC3. Experiments show CaPlc1p is the only enzyme in C. albicans which functions as PLC. The biological functions of CAPLC2 and CAPLC3 gene products must be clearly different from CaPlc1p, but their exact roles remain unclear. Moreover, CAPLC2 and CAPLC3 gene products are more similar to extracellular bacterial PI-PLC than to the eukaryotic PI-PLC, and they are not included in this subfamily. 231
27491 176541 cd08599 PI-PLCc_plant Catalytic domain of plant phosphatidylinositide-specific phospholipases C. This family corresponds to the catalytic domain present in a group of phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11) encoded by PLC genes from higher plants, which are homologs of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The domain arrangement of plant PI-PLCs is structurally similar to the mammalian PLC-zeta isoform, which lacks the N-terminal pleckstrin homology (PH) domain, but contains EF-hand like motifs (which are absent in a few plant PLCs), a PLC catalytic core domain with X- and Y- highly conserved regions split by a linker sequence, and a C2 domain. However, at the sequence level, the plant PI-PLCs are closely related to the mammalian PLC-delta isoform. Experiments show that plant PLCs display calcium dependent PLC catalytic properties, although they lack some of the N-terminal motifs found in their mammalian counterparts. A putative calcium binding site may be located at the region spanning the X- and Y- domains. 228
27492 176542 cd08600 GDPD_EcGlpQ_like Glycerophosphodiester phosphodiesterase domain of Escherichia coli (GlpQ) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Escherichia coli periplasmic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46), GlpQ, and similar proteins. GP-GDE plays an essential role in the metabolic pathway of E. coli. It catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. E. coli possesses two major G3P uptake systems: Glp and Ugp, which contain genes coding for two different GP-GDEs. GlpQ gene from the E. coli glp operon codes for a periplasmic phosphodiesterase GlpQ, which is the prototype of this family. GlpQ is a dimeric enzyme that hydrolyzes periplasmic glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), to the corresponding alcohols and G3P, which is subsequently transported into the cell through the GlpT transport system. Ca2+ is required for the enzymatic activity of GlpQ. This family also includes a surface-exposed lipoprotein, protein D (HPD), from Haemophilus influenzae Type b and nontypeable strains, which shows very high sequence similarity with E. coli GlpQ. HPD has been characterized as a human immunoglobulin D-binding protein with glycerophosphodiester phosphodiesterase activity. It can hydrolyze phosphatidylcholine from host membranes to produce free choline on the lipopolysaccharides on the surface of pathogenic bacteria. 318
27493 176543 cd08601 GDPD_SaGlpQ_like Glycerophosphodiester phosphodiesterase domain of Staphylococcus aureus and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46) from Staphylococcus aureus, Bacillus subtilis and similar proteins. Members in this family show very high sequence similarity to Escherichia coli periplasmic phosphodiesterase GlpQ, which catalyzes the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. 256
27494 176544 cd08602 GDPD_ScGlpQ1_like Glycerophosphodiester phosphodiesterase domain of Streptomyces coelicolor (GlpQ1) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of putative bacterial and eukaryotic glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) similar to Escherichia coli periplasmic phosphodiesterase GlpQ, as well as plant glycerophosphodiester phosphodiesterases (GP-PDEs), all of which catalyze the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. The prototypes of this family include putative secreted phosphodiesterase encoded by gene glpQ1 (SCO1565) from the pho regulon in Streptomyces coelicolor genome, and in plants, two distinct Arabidopsis thaliana genes, AT5G08030 and AT1G74210, coding putative GP-PDEs from the cell walls and vacuoles, respectively. 309
27495 176545 cd08603 GDPD_SHV3_repeat_1 Glycerophosphodiester phosphodiesterase domain repeat 1 of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) repeat 1 present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Both, SHV3 and SVLs, have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical function. Most of the members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens. This family includes domain I, the first GDPD domain of SHV3 and SVLs. 299
27496 176546 cd08604 GDPD_SHV3_repeat_2 Glycerophosphodiester phosphodiesterase domain repeat 2 of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) repeat 2 present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Both, SHV3 and SVLs, have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical function. Most of the members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens. This CD includes domain II (the second GDPD domain of SHV3 and SVLs), which is necessary for SHV3 function. 300
27497 176547 cd08605 GDPD_GDE5_like_1_plant Glycerophosphodiester phosphodiesterase domain of uncharacterized plant glycerophosphodiester phosphodiesterase-like proteins similar to mammalian GDE5. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized plant glycerophosphodiester phosphodiesterase (GP-PDE)-like proteins. Members in this family show very high sequence homology to mammalian glycerophosphodiester phosphodiesterase GDE5 and are distantly related to plant GP-PDEs. 282
27498 176548 cd08606 GDPD_YPL110cp_fungi Glycerophosphodiester phosphodiesterase domain of Saccharomyces cerevisiae YPL110cp and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Saccharomyces cerevisiae YPL110cp and other uncharacterized fungal homologs. The product of S. cerevisiae ORF YPL110c (GDE1), YPL110cp (Gde1p), displays homology to bacterial and mammalian glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. S. cerevisiae YPL110cp has been characterized as a cytoplasmic glycerophosphocholine (GPC)-specific phosphodiesterase that selectively hydrolyzes GPC, not glycerophosphoinositol (GPI), to generate choline and glycerolphosphate. YPL110cp has multi-domain architecture, including not only C-terminal GDPD, but also an SPX N-terminal domain along with several ankyrin repeats, which implies that YPL110cp may mediate protein-protein interactions in a variety of proteins and play a role in maintaining cellular phosphate levels. Members in this family are distantly related to S. cerevisiae YPL206cp, which selectively catalyzes the cleavage of phosphatidylglycerol (PG), not glycerophosphoinositol (GPI) or glycerophosphocholine (GPC), to diacylglycerol (DAG) and glycerophosphate, and has been characterized as a PG-specific phospholipase C. 286
27499 176549 cd08607 GDPD_GDE5 Glycerophosphodiester phosphodiesterase domain of putative mammalian glycerophosphodiester phosphodiesterase GDE5 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative mammalian GDE5 and similar proteins. Mammalian GDE5 is widely expressed in mammalian tissues, with highest expression in the spinal cord. Although its biological function remains unclear, mammalian GDE5 shows higher sequence homology to fungal and plant glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46) than to other bacterial and mammalian GP-GDEs. It may also hydrolyze glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. In addition to C-terminal GDPD domain, all members in this subfamily have a starch binding domain (CBM20) in the N-terminus, which suggests these proteins may play a distinct role in glycerol metabolism. 290
27500 176550 cd08608 GDPD_GDE2 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE2 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE2 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 5 (GDPD5)) and their metazoan homologs. Mammalian GDE2 is a transmembrane protein primarily expressed in mature neurons. It is a mammalian homolog of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Mammalian GDE2 selectively hydrolyzes glycerophosphocholine (GPC) and has been characterized as GPC-GDE (EC 3.1.4.2) that contributes to osmotic regulation of cellular GPC. Mammalian GDE2 functions in a complex with an antioxidant scavenger peroxiredoxin1 (Prdx1) to control motor neuron differentiation in the spinal cord. Mammalian GDE2 also plays a critical role for retinoid-induced neuronal outgrowth. The catalytic activity of GDPD domain is essential for mammalian GDE2 cellular function. 351
27501 176551 cd08609 GDPD_GDE3 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE3 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE3 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 2 (GDPD2), Osteoblast differentiation promoting factor) and their metazoan homologs. Mammalian GDE3 is a transmembrane protein specifically expressed in bone tissues and spleen. It is a mammalian homolog of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Mammalian GDE3 has been characterized as glycerophosphoinositol inositolphosphodiesterase (EC 3.1.4.43) that selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate inositol 1-phosphate (Ins1P) and glycerol. Mammalian GDE3 functions as an inducer of osteoblast differentiation. It also plays a critical role for actin cytoskeletal modulation. The catalytic activity of GDPD domain is essential for mammalian GDE3 cellular function. 315
27502 176552 cd08610 GDPD_GDE6 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE6 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE6 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 4 (GDPD4)) and their metazoan homologs. Mammalian GDE6 is a transmembrane protein predominantly expressed in the spermatocytes of testis. Although the specific physiological function of mammalian GDE6 has not been elucidated, its different pattern of tissue distribution suggests it might play a critical role in the completion of meiosis during male germ cell differentiation. 316
27503 176553 cd08612 GDPD_GDE4 Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE4 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE4 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 1 (GDPD1)) and similar proteins. Mammalian GDE4 is a transmembrane protein whose cellular function has not yet been elucidated. It is expressed widely, including in placenta, liver, kidney, pancreas, spleen, thymus, ovary, small intestine and peripheral blood leukocytes. It is also expressed in the growth cones in neuroblastoma Neuro2a cells, which suggests GDE4 may play some distinct role from other members of the GDE family. 300
27504 176554 cd08613 GDPD_GDE4_like_1 Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial homologs of mammalian glycerophosphodiester phosphodiesterase GDE4. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial homologs of mammalian GDE4, a transmembrane protein whose cellular function has not been elucidated yet. 309
27505 176555 cd08616 PI-PLCXD1c Catalytic domain of phosphatidylinositol-specific phospholipase C, X domain containing 1. This subfamily corresponds to the catalytic domain present in a group of phosphatidylinositol-specific phospholipase C X domain containing 1 (PI-PLCXD1), 2 (PI-PLCXD2) and 3 (PI-PLCXD3), which are bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) sequence homologs found in vertebrates. The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, members in this group contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 290
27506 176556 cd08619 PI-PLCXDc_plant Catalytic domain of phosphatidylinositol-specific phospholipase C, X domain containing proteins found in plants. The CD corresponds to the catalytic domain present in uncharacterized plant phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, plant PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of plant PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 285
27507 176557 cd08620 PI-PLCXDc_like_1 Catalytic domain of uncharacterized hypothetical proteins similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins. This subfamily corresponds to the catalytic domain present in a group of uncharacterized hypothetical proteins found in bacteria and fungi, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 281
27508 176558 cd08621 PI-PLCXDc_like_2 Catalytic domain of uncharacterized hypothetical proteins similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins. This subfamily corresponds to the catalytic domain present in a group of uncharacterized hypothetical proteins found in bacteria and fungi, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 300
27509 176559 cd08622 PI-PLCXDc_CG14945_like Catalytic domain of Drosophila melanogaster CG14945-like proteins similar to phosphatidylinositol-specific phospholipase C, X domain containing. This subfamily corresponds to the catalytic domain present in uncharacterized metazoan Drosophila melanogaster CG14945-like proteins, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. 276
27510 176560 cd08623 PI-PLCc_beta1 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta1 is expressed at highest levels in specific regions of the brain. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. 258
27511 176561 cd08624 PI-PLCc_beta2 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta2 is expressed at highest levels in cells of hematopoietic origin. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. 261
27512 176562 cd08625 PI-PLCc_beta3 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta3. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 3. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta3 is widely expressed at highest levels in brain, liver, and parotid gland. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. 258
27513 176563 cd08626 PI-PLCc_beta4 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta4. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 4. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta4 is expressed in high concentrations in cerebellar Purkinje and granule cells, the median geniculate body, and the lateral geniculate nucleus. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. 257
27514 176564 cd08627 PI-PLCc_gamma1 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma1, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region is present within this linker region. PI-PLC-gamma1 is ubiquitously expressed. It is activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region. 229
27515 176565 cd08628 PI-PLCc_gamma2 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma2, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region is present within this linker region. PI-PLC-gamma2 is highly expressed in cells of hematopoietic origin. It is activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region. Unlike PI-PLC-gamma1, the activation of PI-PLC-gamma2 may require concurrent stimulation of PI 3-kinase. 254
27516 176566 cd08629 PI-PLCc_delta1 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta1 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This subfamily corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1,3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Unlike PI-PLC-delta 4, PI-PLC-delta1 and 3 possess a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus. Experiments show PI-PLC-delta1 is essential for normal hair formation. 258
27517 176567 cd08630 PI-PLCc_delta3 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta3. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta3 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This family corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1,3 and 4). Unlike PI-PLC-delta 4, PI-PLC-delta1 and 3 possess a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus. 258
27518 176568 cd08631 PI-PLCc_delta4 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta4. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta4 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This CD corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1,3 and 4). Unlike PI-PLC-delta 1 and 3, a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus, is not present in PI-PLC-delta4. Experiments show PI-PLC-delta4 is required for the acrosome reaction in fertilization. 258
27519 176569 cd08632 PI-PLCc_eta1 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-eta1 is a neuron-specific enzyme expressed only in nerve tissues such as the brain and spinal cord. It may perform a fundamental role in the brain. 253
27520 176570 cd08633 PI-PLCc_eta2 Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-eta2 is a neuron-specific enzyme and is expressed in the brain. It may in part function downstream of G-protein-coupled receptors and play an important role in the formation and maintenance of the neuronal network in the postnatal brain. 254
27521 176474 cd08637 DNA_pol_A_pol_I_C Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase (polymerase I) functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I (pol I), mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 377
27522 176475 cd08638 DNA_pol_A_theta DNA polymerase theta is a low-fidelity family A enzyme implicated in translesion synthesis and in somatic hypermutation. DNA polymerase theta is a low-fidelity family A enzyme implicated in translesion synthesis (TLS) and in somatic hypermutation (SHM). DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Pol theta is an exception among family A polymerases and generates processive single base substitutions. Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I (pol I), mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. Polymerase theta mostly has an amino-terminal helicase domain, a carboxy-terminal polymerase domain and an intervening space region. 373
27523 176476 cd08639 DNA_pol_A_Aquificae_like Phylum Aquificae Pol A is different from Escherichia coli Pol A by three signature sequences. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used for phylogenetic analysis of bacteria. Species of the phylum Aquificae grow in extreme thermophilic environments. The Aquificae are non-spore-forming, Gram-negative rods and strictly thermophilic. Phylum Aquificae Pol A is different from E. coli Pol I by three signature sequences consisting of a 2 amino acid (aa) insert, a 5-6 aa insert and a 6 aa deletion. These signature sequences may provide a molecular marker for the family Aquificaceae and related species. 324
27524 176477 cd08640 DNA_pol_A_plastid_like DNA polymerase A type from plastids of higher plants possibly involved in DNA replication or in the repair of errors occurring during replication. DNA polymerase A type from plastids of higher plants possibly involved in DNA replication or in the repair of errors occurring during replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). The three-dimensional structure of plastid DNA polymerase has substantial similarity to Pol I. The structure of Pol I resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 371
27525 176478 cd08641 DNA_pol_gammaA Pol gammaA is a family A polymerase that is responsible for DNA replication and repair in mitochondria. DNA polymerase gamma (Pol gamma), 5'-3' polymerase domain (Pol gammaA). Pol gammaA is a family A polymerase that is responsible for DNA replication and repair in mitochondria. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gammaA, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. Pol gammaA also has the right-hand configuration. Pol gammaA has both polymerase and proofreading exonuclease activities separated by a spacer. Pol gamma holoenzyme is a heterotrimer containing one Pol gammaA subunit and a dimeric Pol gammaB subunit. Pol gamma is important for mitochondrial DNA maintenance and mutation of the catalytic subunit of Pol gamma is implicated in more than 30 human diseases. 425
27526 176479 cd08642 DNA_pol_A_pol_I_A Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase (polymerase I) functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 378
27527 176480 cd08643 DNA_pol_A_pol_I_B Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 429
27528 187713 cd08644 FMT_core_ArnA_N ArnA, N-terminal formyltransferase domain. ArnA_N: ArnA is a bifunctional enzyme required for the modification of lipid A with 4-amino-4-deoxy-L-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. The C-terminal dehydrogenase domain of ArnA catalyzes the oxidative decarboxylation of UDP-glucuronic acid (UDP-GlcUA) to UDP-4-keto-arabinose (UDP-Ara4O), while the N-terminal formyltransferase domain of ArnA catalyzes the addition of a formyl group to UDP-4-amino-4-deoxy-L-arabinose (UDP-L-Ara4N) to form UDP-L-4-formamido-arabinose (UDP-L-Ara4FN). This domain family represents the catalytic core of the N-terminal formyltransferase domain. The formyltransferase also contains a smaller C-terminal domain that may be involved in substrate binding. ArnA forms a hexameric structure, in which the dehydrogenase domains are arranged at the center of the particle with the transformylase domains on the outside of the particle. 203
27529 187714 cd08645 FMT_core_GART Phosphoribosylglycinamide formyltransferase (GAR transformylase, GART). Phosphoribosylglycinamide formyltransferase, also known as GAR transformylase or GART, is an essential enzyme that catalyzes the third step in de novo purine biosynthesis. This enzyme uses formyl tetrahydrofolate as a formyl group donor to produce 5'-phosphoribosyl-N-formylglycinamide. In prokaryotes, GART is a single domain protein but in most eukaryotes it is the C-terminal portion of a large multifunctional protein which also contains GAR synthetase and aminoimidazole ribonucleotide synthetase activities. 183
27530 187715 cd08646 FMT_core_Met-tRNA-FMT_N Methionyl-tRNA formyltransferase, N-terminal hydrolase domain. Methionyl-tRNA formyltransferase (Met-tRNA-FMT), N-terminal formyltransferase domain. Met-tRNA-FMT transfers a formyl group from N-10 formyltetrahydrofolate to the amino terminal end of a methionyl-aminoacyl-tRNA acyl moiety, yielding formyl-Met-tRNA. Formyl-Met-tRNA plays an essential role in protein translation initiation by forming a complex with IF2. The formyl group plays a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP. The N-terminal domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 204
27531 187716 cd08647 FMT_core_FDH_N 10-formyltetrahydrofolate dehydrogenase (FDH), N-terminal hydrolase domain. This family represents the N-terminal hydrolase domain of the bifunctional protein 10-formyltetrahydrofolate dehydrogenase (FDH). This domain contains a 10-formyl-tetrahydrofolate (10-formyl-THF) binding site and shares sequence homology and structural topology with other enzymes utilizing this substrate. This domain functions as a hydrolase, catalyzing the conversion of 10-formyl-THF, a precursor for nucleotide biosynthesis, to tetrahydrofolate (THF). The overall FDH reaction mechanism is a coupling of two sequential reactions, a hydrolase and a formyl dehydrogenase, bridged by a substrate transfer step. The N-terminal hydrolase domain removes the formyl group from 10-formyl-THF and the C-terminal NADP-dependent dehydrogenase domain then reduces the formyl group to carbon dioxide. The two catalytic domains are connected by a third intermediate linker domain that transfers the formyl group, covalently attached to the sulfhydryl group of the phosphopantetheine arm, from the N-terminal domain to the C-terminal domain. 203
27532 187717 cd08648 FMT_core_Formyl-FH4-Hydrolase_C Formyltetrahydrofolate deformylase (Formyl-FH4 hydrolase), C-terminal hydrolase domain. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Formate is the substrate of phosphoribosylglycinamide transformylase for step three of de novo purine nucleotide synthesis. Formyl-FH4 hydrolase has been proposed to regulate the balance of FH4 and C1-FH4 in the cell. The enzyme uses methionine and glycine to sense the pools of C1-FH4 and FH4, respectively. This domain belongs to the formyltransferase (FMT) domain superfamily. Members of this family have an N-terminal ACT domain, which is commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The N-terminal of this protein family may be responsible for the binding of the regulators methionine and glycine. 196
27533 187718 cd08649 FMT_core_NRPS_like N-terminal formyl transferase catalytic core domain of NRPS_like proteins, one of the proteins involved in the synthesis of Oxazolomycin. This family represents the N-terminal formyl transferase catalytic core domain present in a subgroup of non-ribosomal peptide synthetases. In Streptomyces albus a member of this family has been shown to be involved in the synthesis of oxazolomycin (OZM). OZM is a hybrid peptide-polyketide antibiotic and exhibits potent antitumor and antiviral activities. It is a multi-domain protein consisting of a formyl transferase domain, a Flavin-utilizing monooxygenase domain, a LuxE domain functioning as an acyl protein synthetase and a pp-binding domain, which may function as an acyl carrier. It shows sequence similarity with other peptide-polyketide biosynthesis proteins. 166
27534 187719 cd08650 FMT_core_HypX_N HypX protein, N-terminal hydrolase domain. The family represents the N-terminal hydrolase domain of HypX protein. HypX is involved in the maturation process of active [NiFe] hydrogenase. [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalent under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of a large and a small subunit. The large subunit contains the [NiFe] active site, which is synthesized as a precursor without the [NiFe] active site. This precursor then undergoes a complex post-translational maturation process that requires the presence of a number of accessory proteins. HypX has been shown to be involved in this maturation process and has been proposed to participate in the generation and transport of the CO and CN ligands. However, HypX is not present in all hydrogen-metabolizing bacteria. Furthermore, hypX deletion mutants have a reduced but detectable level of hydrogenase activity. Thus, HypX might not be a determining factor in the maturation process. Members of this group have an N-terminal formyl transferase domain and a C-terminal enoyl-CoA hydratase/isomerase domain. 151
27535 187720 cd08651 FMT_core_like_4 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 180
27536 187721 cd08653 FMT_core_like_3 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 152
27537 349943 cd08656 M28_like M28 Zn-peptidase; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 287
27538 349944 cd08659 M20_ArgE_DapE-like Peptidase M20 acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE)-like. Peptidase M20 acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) like family of enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this family are mostly bacterial and have been inferred by homology as being related to both ArgE and DapE. This family also includes N-acetyl-L-citrulline deacetylase (ACDase; acetylcitrulline deacetylase), a unique, novel enzyme found in Xanthomonas campestris, a plant pathogen, in which N-acetyl-L-ornithine is the substrate for transcarbamoylation reaction, and the product is N-acetyl-L-citrulline. Thus, in the arginine biosynthesis pathway, ACDase subsequently catalyzes the hydrolysis of N-acetyl-L-citrulline to acetate and L-citrulline. 361
27539 349945 cd08660 M20_Acy1-like M20 Peptidase Aminoacylase 1-like family. This family includes aminoacylase 1 (ACY1) and Aminoacylase 1-like protein 2 (ACY1L2). Aminoacylase 1 proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. ACY1 (acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) is the most abundant of the aminoacylases, a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. It is encoded by the aminoacylase 1 gene (Acy1) on chromosome 3p21 that comprises 15 exons. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1L2 family contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in E. coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Defects in ACY1 are the cause of aminoacylase-1 deficiency (ACY1D) resulting in a metabolic disorder manifesting with encephalopathy and psychomotor delay. 366
27540 341056 cd08662 M13 Peptidase family M13 includes neprilysin and endothelin-converting enzyme I. The M13 family of metallopeptidases includes neprilysin (neutral endopeptidase, NEP, enkephalinase, CD10, CALLA, EC 3.4.24.11), endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), erythrocyte surface antigen KELL (ECE-3), phosphate-regulating gene on the X chromosome (PHEX), soluble secreted endopeptidase (SEP), and damage-induced neuronal endopeptidase (DINE)/X-converting enzyme (XCE). Proteins in this family fulfill a broad range of physiological roles due to the greater variation in the active site's S2' subsite allowing substrate specificity. NEP is expressed in a variety of tissues including kidney and brain, and is involved in many physiological and pathological processes, including blood pressure and inflammatory response. It degrades a wide array of substrates such as substance P, enkephalins, cholecystokinin, neurotensin and somatostatin. It is an important enzyme in the regulation of amyloid-beta (Abeta) protein that forms amyloid plaques that are associated with Alzheimer's disease (AD). ECE-1 catalyzes the final rate-limiting step in the biosynthesis of endothelins via post-translational conversion of the biologically inactive big endothelins. Like NEP, it also hydrolyzes bradykinin, substance P, neurotensin, and Abeta. Endothelin-1 overproduction has been implicated in various diseases including stroke, asthma, hypertension, and cardiac and renal failure. Kell is a homolog of NEP and constitutes a major antigen on human erythrocytes; it preferentially cleaves big endothelin-3 to produce bioactive endothelin-3, but is also known to cleave substance P and neurokinin A. PHEX forms a complex interaction with fibroblast growth factor 23 (FGF23) and matrix extracellular phosphoglycoprotein, causing bone mineralization. A loss-of-function mutation in PHEX disrupts this interaction leading to hypophosphatemic rickets; X-linked hypophosphatemic (XLH) rickets is the most common form of metabolic rickets. ECEL1 is a brain metalloprotease which plays a critical role in the nervous regulation of the respiratory system, while DINE is abundantly expressed in the hypothalamus and its expression responds to nerve injury. A majority of these M13 proteases are prime therapeutic targets for selective inhibition. 642
27541 176450 cd08663 DAP_dppA_1 Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases. 266
27542 176485 cd08664 APC10-HERC2 APC10-like DOC1 domain present in HERC2 (HECT domain and RLD2). This model represents the APC10/DOC1 domain present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including a zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT (Homologous to the E6-AP Carboxyl Terminus) domain. The APC10/DOC1 domain of HERC2 is a homolog of the APC10 subunit and the DOC1 domain present in E3 ubiquitin ligases which mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. As suggested by structural relationships between HERC2 and other proteins such as HERC1, the proposed role for HERC2 in protein trafficking and degradation pathways is consistent with observations that mutations in HERC2 lead to neuromuscular secretory vesicle and sperm acrosome defects, other developmental abnormalities, and juvenile lethality of jdf2 mice. Recent studies have shown that the protein complex, HERC2-RNF8, coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. 152
27543 176486 cd08665 APC10-CUL7 APC10-like DOC1 domain of CUL7, subunit of the SCF-ROC1-like E3 ubiquitin ligase complex that mediates substrate ubiquitination. This model represents the APC10/DOC1 domain present in CUL7, a subunit of the SCF-ROC1-like E3 Ubiquitin (Ub) ligase complex, which mediates substrate ubiquitination (or ubiquitylation), and is a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling the SCF-ROC1-like E3 Ub ligase complex consisting of the adapter protein Skp1, CUL7, the WD40 repeat-containing F-box Fbw8 (also known as Fbx29), and ROC1 (RING-box protein 1). CUL7 is a large protein with a C-terminal cullin domain that binds ROC1 and additional domains, including an APC10/DOC1 domain. While the Fbw8 protein is responsible for substrate protein recognition, the ROC1 RING domain recruits an Ub-charged E2 Ub-conjugating enzyme for substrate ubiquitination. It remains to be determined how CUL7 binds to the Skp1-Fbw8 heterodimer. The CUL7 E3 Ub ligase has been implicated in the proteasomal degradation of the cellular proteins, cyclin D1, an important regulator of the G1 to S-phase cell cycle progression, and insulin receptor substrate 1, a critical component of the signaling pathways downstream of the insulin and insulin-like growth factor 1 receptor. CUL7 appears to be an important regulator of placental development. Germ line mutations of CUL7 are linked to 3-M syndrome and Yakuts short stature syndrome. 131
27544 176487 cd08666 APC10-HECTD3 APC10-like DOC1 domain of HECTD3, a HECT E3 ubiquitin ligase protein that mediates substrate ubiquitination. This model represents the APC10/DOC1 domain present in HECTD3, a HECT (Homologous to the E6-AP Carboxyl Terminus) E3 ubiquitin ligase protein. HECT E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), and are a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. They also regulate the trafficking of many receptors, channels, transporters and viral proteins. HECTD3 (HECT domain-containing protein3) contains a C-terminal HECT domain with the active site for ubiquitin transfer onto substrates, and an N-terminal APC10/DOC1 domain, which is responsible for substrate recognition and binding. HECTD3 specifically recognizes the Trio-binding protein, Tara (Trio-associated repeat on actin), implicated in regulating actin cytoskeletal, cell motility and cell growth. Tara also binds to TRF1 and may participate in telomere maintenance and/or mitotic regulation through interacting with TRF1. HECTD3 interacts with and promotes the ubiquitination of Syntaxin 8, an endosomal syntaxin proposed to mediate distinct steps of endosomal protein trafficking. HECTD3-mediated Syntaxin 8 degradation has been suggested to contribute to the pathophysiology of neurodegenerative diseases. 134
27545 176488 cd08667 APC10-ZZEF1 APC10/DOC1-like domain of uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) and homologs. This model represents the APC10/DOC1-like domain present in the uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) of Mus musculus. Members of this family contain EF-hand, APC10, CUB, and zinc finger ZZ-type domains. ZZEF1-like APC10 domains are homologous to the APC10 subunit/DOC1 domains present in E3 ubiquitin ligases, which mediate substrate ubiquitination (or ubiquitylation), and are components of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. 131
27546 176571 cd08674 Cdt1_m The middle winged helix fold of replication licensing factor Cdt1 binds geminin to inhibit binding of the MCM complex to origins of replication and DNA. Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the origin recognition complex (ORC). The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and other archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity. 185
27547 176057 cd08675 C2B_RasGAP C2 domain second repeat of Ras GTPase activating proteins (GAPs). RasGAPs suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation. The proteins here all contain two tandem C2 domains, a Ras-GAP domain, and a pleckstrin homology (PH)-like domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology. 137
27548 176058 cd08676 C2A_Munc13-like C2 domain first repeat in Munc13 (mammalian uncoordinated)-like proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner. Mutations in Unc13 results in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode. Munc13 is the mammalian homolog which are expressed in the brain. There are 3 isoforms (Munc13-1, -2, -3) and are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters. Unc13 and Munc13 contain both C1 and C2 domains. There are two C2 related domains present, one central and one at the carboxyl end. Munc13-1 contains a third C2-like domain. Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology. 153
27549 176059 cd08677 C2A_Synaptotagmin-13 C2 domain. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 13, a member of class 6 synaptotagmins, is located in the brain. It functions are unknown. It, like synaptotagmins 8 and 12, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium. Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10). The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B). C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This CD contains the first C2 repeat, C2A, and has a type-I topology. 118
27550 176060 cd08678 C2_C21orf25-like C2 domain found in the Human chromosome 21 open reading frame 25 (C21orf25) protein. The members in this cd are named after the Human C21orf25 which contains a single C2 domain. Several other members contain a C1 domain downstream of the C2 domain. No other information on this protein is currently known. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 126
27551 176061 cd08679 C2_DOCK180_related C2 domains found in Dedicator Of CytoKinesis 1 (DOCK 180) and related proteins. Dock180 was first identified as an 180kd proto-oncogene product c-Crk-interacting protein involved in actin cytoskeletal changes. It is now known that it has Rac-specific GEF activity, but lacks the conventional Dbl homology (DH) domain. There are 10 additional related proteins that can be divided into four classes based on sequence similarity and domain organization: Dock-A which includes Dock180/Dock1, Dock2, and Dock5; Dock-B which includes Dock3/MOCA (modifier of cell adhesion) and Dock4; Dock-C which includes Dock6/Zir1, Dock7/Zir2, and Dock8/Zir3; and Dock-D, which includes Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2/ACG (activated Cdc42-associated GEF). Most of members of classes Dock-A and Dock-B are the GEFs specific for Rac. Those of Dock-D are Cdc42-specific GEFs while those of Dock-C are the GEFs for both. All Dock180-related proteins have two common homology domains: the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker). DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 178
27552 176062 cd08680 C2_Kibra C2 domain found in Human protein Kibra. Kibra is thought to be a regulator of the Salvador (Sav)/Warts (Wts)/Hippo (Hpo) (SWH) signaling network, which limits tissue growth by inhibiting cell proliferation and promoting apoptosis. The core of the pathway consists of a MST and LATS family kinase cascade that ultimately phosphorylates and inactivates the YAP/Yorkie (Yki) transcription coactivator. The FERM domain proteins Merlin (Mer) and Expanded (Ex) are part of the upstream regulation controlling pathway mechanism. Kibra colocalizes and associates with Mer and Ex and is thought to transduce an extracellular signal via the SWH network. The apical scaffold machinery that contains Hpo, Wts, and Ex recruits Yki to the apical membrane facilitating its inhibitory phosphorlyation by Wts. Since Kibra associates with Ex and is apically located it is hypothesized that KIBRA is part of the scaffold, helps in the Hpo/Wts complex, and helps recruit Yki for inactivation that promotes SWH pathway activity. Kibra contains two amino-terminal WW domains, an internal C2-like domain, and a carboxy-terminal glutamic acid-rich stretch. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 124
27553 176063 cd08681 C2_fungal_Inn1p-like C2 domain found in fungal Ingression 1 (Inn1) proteins. Saccharomyces cerevisiae Inn1 associates with the contractile actomyosin ring at the end of mitosis and is needed for cytokinesis. The C2 domain of Inn1, located at the N-terminus, is required for ingression of the plasma membrane. The C-terminus is relatively unstructured and contains eight PXXP motifs that are thought to mediate interaction of Inn1 with other proteins with SH3 domains in the cytokinesis proteins Hof1 (an F-BAR protein) and Cyk3 (whose overexpression can restore primary septum formation in Inn1Delta cells) as well as recruiting Inn1 to the bud-neck by binding to Cyk3. Inn1 and Cyk3 appear to cooperate in activating chitin synthase Chs2 for primary septum formation, which allows coordination of actomyosin ring contraction with ingression of the cleavage furrow. It is thought that the C2 domain of Inn1 helps to preserve the link between the actomyosin ring and the plasma membrane, contributing both to membrane ingression, as well as to stability of the contracting ring. Additionally, Inn1 might induce curvature of the plasma membrane adjacent to the contracting ring, thereby promoting ingression of the membrane. It has been shown that the C2 domain of human synaptotagmin induces curvature in target membranes and thereby contributes to fusion of these membranes with synaptic vesicles. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 118
27554 176064 cd08682 C2_Rab11-FIP_classI C2 domain found in Rab11-family interacting proteins (FIP) class I. Rab GTPases recruit various effector proteins to organelles and vesicles. Rab11-family interacting proteins (FIPs) are involved in mediating the role of Rab11. FIPs can be divided into three classes: class I FIPs (Rip11a, Rip11b, RCP, and FIP2) which contain a C2 domain after N-terminus of the protein, class II FIPs (FIP3 and FIP4) which contain two EF-hands and a proline rich region, and class III FIPs (FIP1) which exhibits no homology to known protein domains. All FIP proteins contain a highly conserved, 20-amino acid motif at the C-terminus of the protein, known as Rab11/25 binding domain (RBD). Class I FIPs are thought to bind to endocytic membranes via their C2 domain, which interacts directly with phospholipids. Class II FIPs do not have any membrane binding domains leaving much to speculate about the mechanism involving FIP3 and FIP4 interactions with endocytic membranes. The members in this CD are class I FIPs. The exact function of the Rab11 and FIP interaction is unknown, but there is speculation that it involves the role of forming a targeting complex that recruits a group of proteins involved in membrane transport to organelles. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 126
27555 176065 cd08683 C2_C2cd3 C2 domain found in C2 calcium-dependent domain containing 3 (C2cd3) proteins. C2cd3 is a novel C2 domain-containing protein specific to vertebrates. C2cd3 functions in regulator of cilia formation, Hedgehog signaling, and mouse embryonic development. Mutations in C2cd3 mice resulted in lethality in some cases and exencephaly, a twisted body axis, and pericardial edema in others. The presence of calcium-dependent lipid-binding domains in C2cd3 suggests a potential role in vesicular transport. C2cd3 is also an interesting candidate for ciliopathy because of its orthology to certain cilia-related genetic disease loci on chromosome. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 143
27556 176066 cd08684 C2A_Tac2-N C2 domain first repeat found in Tac2-N (Tandem C2 protein in Nucleus). Tac2-N contains two C2 domains and a short C-terminus including a WHXL motif, which are key in stabilizing transport vesicles to the plasma membrane by binding to a plasma membrane. However unlike the usual carboxyl-terminal-type (C-type) tandem C2 proteins, it lacks a transmembrane domain, a Slp-homology domain, and a Munc13-1-interacting domain. Homology search analysis indicate that no known protein motifs are located in its N-terminus, making Tac2-N a novel class of Ca2+-independent, C-type tandem C2 proteins. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 103
27557 176067 cd08685 C2_RGS-like C2 domain of the Regulator Of G-Protein Signaling (RGS) family. This CD contains members of the regulator of G-protein signaling (RGS) family. RGS is a GTPase activating protein which inhibits G-protein mediated signal transduction. The protein is largely cytosolic, but G-protein activation leads to translocation of this protein to the plasma membrane. A nuclear form of this protein has also been described, but its sequence has not been identified. There are multiple alternatively spliced transcript variants in this family with some members having additional domains (ex. PDZ and RGS) downstream of the C2 domain. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 119
27558 176068 cd08686 C2_ABR C2 domain in the Active BCR (Breakpoint cluster region) Related protein. The ABR protein is similar to the breakpoint cluster region protein. It has homology to guanine nucleotide exchange proteins and GTPase-activating proteins (GAPs). ABR is expressed primarily in the brain, but also includes non-neuronal tissues such as the heart. It has been associated with human diseases such as Miller-Dieker syndrome in which mental retardation and malformations of the heart are present. ABR contains a RhoGEF domain and a PH-like domain upstream of its C2 domain and a RhoGAP domain downstream of this domain. A few members also contain a Bcr-Abl oncoprotein oligomerization domain at the very N-terminal end. Splice variants of ABR have been identified. ABR is found in a wide variety of organisms including chimpanzee, dog, mouse, rat, fruit fly, and mosquito. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 118
27559 176069 cd08687 C2_PKN-like C2 domain in Protein kinase C-like (PKN) proteins. PKN is a lipid-activated serine/threonine kinase. It is a member of the protein kinase C (PKC) superfamily, but lacks a C1 domain. There are at least 3 different isoforms of PKN (PRK1/PKNalpha/PAK1; PKNbeta, and PRK2/PAK2/PKNgamma). The C-terminal region contains the Ser/Thr type protein kinase domain, while the N-terminal region of PKN contains three antiparallel coiled-coil (ACC) finger domains which are relatively rich in charged residues and contain a leucine zipper-like sequence. These domains binds to the small GTPase RhoA. Following these domains is a C2-like domain. Its C-terminal part functions as an auto-inhibitory region. PKNs are not activated by classical PKC activators such as diacylglycerol, phorbol ester or Ca2+, but instead are activated by phospholipids and unsaturated fatty acids. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 98
27560 176070 cd08688 C2_KIAA0528-like C2 domain found in the Human KIAA0528 cDNA clone. The members of this CD are named after the Human KIAA0528 cDNA clone. All members here contain a single C2 repeat. No other information on this protein is currently known. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 110
27561 176071 cd08689 C2_fungal_Pkc1p C2 domain found in protein kinase C (Pkc1p) in Saccharomyces cerevisiae. This family is named after the protein kinase C in Saccharomyces cerevisiae, Pkc1p. Protein kinase C is a member of a family of Ser/Thr phosphotransferases that are involved in many cellular signaling pathways. PKC has two antiparallel coiled-coiled regions (ACC finger domain) (AKA PKC homology region 1 (HR1)/ Rho binding domain) upstream of the C2 domain and two C1 domains downstream. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains, like those of PKC, are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 109
27562 176072 cd08690 C2_Freud-1 C2 domain found in 5' repressor element under dual repression binding protein-1 (Freud-1). Freud-1 is a novel calcium-regulated repressor that negatively regulates basal 5-HT1A receptor expression in neurons. It may also play a role in the altered regulation of 5-HT1A receptors associated with anxiety or major depression. Freud-1 contains two DM-14 basic repeats, a helix-loop-helix DNA binding domain, and a C2 domain. The Freud-1 C2 domain is thought to be calcium insensitive and it lacks several acidic residues that mediate calcium binding of the PKC C2 domain. In addition, it contains a poly-basic insert that is not present in calcium-dependent C2 domains and may function as a nuclear localization signal. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology. 155
27563 176073 cd08691 C2_NEDL1-like C2 domain present in NEDL1 (NEDD4-like ubiquitin protein ligase-1). NEDL1 (AKA HECW1(HECT, C2 and WW domain containing E3 ubiquitin protein ligase 1)) is a newly identified HECT-type E3 ubiquitin protein ligase highly expressed in favorable neuroblastomas. In vertebrates it is found primarily in neuronal tissues, including the spinal cord. NEDL1 is thought to normally function in the quality control of cellular proteins by eliminating misfolded proteins. This is thought to be accomplished via a mechanism analogous to that of ER-associated degradation by forming tight complexes and aggregating misfolded proteins that have escaped ubiquitin-mediated degradation. NEDL1, is composed of a C2 domain, two WW domains, and a ubiquitin ligase Hect domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 137
27564 176074 cd08692 C2B_Tac2-N C2 domain second repeat found in Tac2-N (Tandem C2 protein in Nucleus). Tac2-N contains two C2 domains and a short C-terminus including a WHXL motif, which are key in stabilizing transport vesicles to the plasma membrane by binding to a plasma membrane. However unlike the usual carboxyl-terminal-type (C-type) tandem C2 proteins, it lacks a transmembrane domain, a Slp-homology domain, and a Munc13-1-interacting domain. Homology search analysis indicate that no known protein motifs are located in its N-terminus, making Tac2-N a novel class of Ca2+-independent, C-type tandem C2 proteins. The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 135
27565 176075 cd08693 C2_PI3K_class_I_beta_delta C2 domain present in class I beta and delta phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. The members here are class I, beta and delta isoforms of PI3Ks and contain both a Ras-binding domain and a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-I topology. 173
27566 176076 cd08694 C2_Dock-A C2 domains found in Dedicator Of CytoKinesis (Dock) class A proteins. Dock-A is one of 4 classes of Dock family proteins. The members here include: Dock180/Dock1, Dock2, and Dock5. Most of these members have been shown to be GEFs specific for Rac. Dock5 has not been well characterized to date, but most likely also is a GEF specific for Rac. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-A members contain a proline-rich region and a SH3 domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 196
27567 176077 cd08695 C2_Dock-B C2 domains found in Dedicator Of CytoKinesis (Dock) class B proteins. Dock-B is one of 4 classes of Dock family proteins. The members here include: Dock3/MOCA (modifier of cell adhesion) and Dock4. Most of these members have been shown to be GEFs specific for Rac, although Dock4 has also been shown to interact indirectly with the Ras family GTPase Rap1, probably through Rap regulatory proteins. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-B members contain a SH3 domain upstream of the C2 domain and a proline-rich region downstream. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 189
27568 176078 cd08696 C2_Dock-C C2 domains found in Dedicator Of CytoKinesis (Dock) class C proteins. Dock-C is one of 4 classes of Dock family proteins. The members here include: Dock6/Zir1, Dock7/Zir2, and Dock8/Zir3. Dock-C members are GEFs for both Rac and Cdc42. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-C members contain a functionally uncharacterized domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 179
27569 176079 cd08697 C2_Dock-D C2 domains found in Dedicator Of CytoKinesis (Dock) class D proteins. Dock-D is one of 4 classes of Dock family proteins. The members here include: Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2/ACG (activated Cdc42-associated GEF). Dock-D are Cdc42-specific GEFs. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-D members contain a functionally uncharacterized domain and a PH domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The PH domain broadly binds to phospholipids and is thought to be involved in targeting the plasma membrane. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 185
27570 381627 cd08698 TGF_beta_SF transforming growth factor beta (TGF-beta) like domain found in TGF-beta superfamily. TGF-beta superfamily consists of a large group of cell regulatory proteins, such as TGF-betas, Nodal, Activins/Inhibins, glial cell-line-derived neurotrophic factor (GDNF) family of ligands, bone morphogenetic proteins (BMPs), and growth and differentiation factors (GDFs). They play important roles in developmental and physiological processes in a variety of species, including invertebrates as well as vertebrates, through specific receptor complexes that are composed of type I and type II serine/threonine receptor kinases. The receptor kinases subsequently activate Smad proteins, which then propagate the signals into the nucleus to regulate target gene expression. Proteins from the TGF-beta superfamily are only active as homo- or heterodimers. 100
27571 187728 cd08700 FMT_C_OzmH_like C-terminal subdomain of the Formyltransferase-like domain found in OzmH-like proteins. Domain found in OzmH-like proteins with similarity to the C-terminal domain of Formyltransferase. OzmH is one of the proteins involved in the synthesis of Oxazolomycin (OZM), which is a hybrid peptide-polyketide antibiotic that exhibits potent antitumor and antiviral activities. OzmH is a multi-domain protein consisting of a formyl transferase domain, a flavin-utilizing monooxygenase domain, a LuxE domain functioning as an acyl protein synthetase and a phosphopantetheine (PP)-binding domain, which may function as an acyl carrier. It shows sequence similarity with other peptide-polyketide biosynthesis proteins. 100
27572 187729 cd08701 FMT_C_HypX C-terminal subdomain of the Formyltransferase-like domain found in HypX-like proteins. Domain found in HypX-like proteins with similarity to the C-terminal domain of Formyltransferase. HypX is involved in the maturation process of active [NiFe] hydrogenase. [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of a large and a small subunit. The large subunit contains the [NiFe] active site but is synthesized as a precursor without the [NiFe] active site. This precursor undergoes a complex post-translational maturation process that requires the presence of a number of accessory proteins. HypX has been shown to be involved in this maturation process and has been proposed to participate in the generation and transport of the CO and CN ligands. However, HypX is not present in all hydrogen-metabolizing bacteria. Furthermore, hypX deletion mutants have a reduced but detectable level of hydrogenase activity. Thus, HypX might not be the determining factor in the maturation process. Members of this group have an N-terminal formyl transferase domain and a C-terminal enoyl-CoA hydratase/isomerase domain. 96
27573 187730 cd08702 Arna_FMT_C C-terminal subdomain of the formyltransferase domain on ArnA, which modifies lipid A with 4-amino-4-deoxy-l-arabinose. Domain found in ArnA with similarity to the C-terminal domain of Formyltransferase. ArnA is a bifunctional enzyme required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. The C-terminal domain of ArnA is a dehydrogenase domain that catalyzes the oxidative decarboxylation of UDP-glucuronic acid (UDP-GlcUA) to UDP-4-keto-arabinose (UDP-Ara4O) and the N-terminal domain is a formyltransferase domain that catalyzes the addition of a formyl group to UDP-4-amino-4-deoxy-L-arabinose (UDP-L-Ara4N) to form UDP-L-4-formamido-arabinose (UDP-L-Ara4FN). This domain family represents the C-terminal subdomain of the formyltransferase domain, downstream of the N-terminal subdomain containing the catalytic center. ArnA forms a hexameric structure (a dimer of trimers), in which the dehydrogenase domains are arranged at the center with the transformylase domains on the outside of the complex. 92
27574 187731 cd08703 FDH_Hydrolase_C The C-terminal subdomain of the hydrolase domain on the bi-functional protein 10-formyltetrahydrofolate dehydrogenase. The family represents the C-terminal subdomain of the hydrolase domain on the bi-functional protein, 10-formyltetrahydrofolate dehydrogenase (FDH). FDH catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. The protein comprises two functional domains: the N-terminal hydrolase domain that removes a formyl group from 10-formyltetrahydrofolate and the C-terminal NADP-dependent dehydrogenase domain that reduces the formyl group to carbon dioxide. The hydrolase domain contains an N-terminal formyl transferase catalytic core subdomain and this C-terminal subdomain, which may be involved in substrate binding. 100
27575 187732 cd08704 Met_tRNA_FMT_C C-terminal domain of Formyltransferase and other enzymes. C-terminal domain of formyl transferase and other proteins with diverse enzymatic activities. Proteins found in this family include methionyl-tRNA formyltransferase, ArnA, and 10-formyltetrahydrofolate dehydrogenase. Methionyl-tRNA formyltransferases constitute the majority of the family and also demonstrate greater sequence diversity. Although most proteins with formyltransferase activity contain the C-terminal domain, some formyltransferases (for example, prokaryotic glycinamide ribonucleotide transformylase (GART)) only have the core catalytic domain, indicating that the C-terminal domain is not a requirement for catalytic activity and may be involved in substrate binding. For example, the C-terminal domain of methionyl-tRNA formyltransferase is involved in the tRNA binding. 87
27576 188660 cd08705 RGS_R7-like Regulator of G protein signaling (RGS) domain found in the R7 subfamily of proteins. The RGS (Regulator of G-protein Signaling) domain is an essential part of the R7 (Neuronal RGS) protein subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The R7 subfamily includes RGS6, RGS7, RGS9, and RGS11, all of which, in humans, are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes. In addition, R7 proteins were found to bind many other proteins outside of the G protein signaling pathways including: m-opioid receptor, beta-arrestin, alpha-actinin-2, NMDAR, polycystin, spinophilin, guanylyl cyclase, among others. 121
27577 188661 cd08706 RGS_R12-like Regulator of G protein signaling (RGS) domain found in the R12 subfamily of proteins. The RGS (Regulator of G-protein Signaling) domain is an essential part of the R12 (Neuronal RGS) protein subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling, controlled by RGS domain, accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP that results in reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The R12 RGS subfamily includes RGS10, RGS12 and RGS14 all of which are highly selective for G-alpha-i1 over G-alpha-q. 113
27578 188662 cd08707 RGS_Axin Regulator of G protein signaling (RGS) domain found in the Axin protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the Axin protein. Axin is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, and skeletal and muscle development. The RGS domain of Axin specifically interacts with the heterotrimeric G-alpha12 protein, but not with closely related G-alpha13, and provides a unique tool to regulate G-alpha12-mediated signaling processes. The RGS domain of Axin also interacts with the tumor suppressor protein APC (Adenomatous Polyposis Coli) in order to control the cytoplasmic level of the proto-oncogene, beta-catenin. 117
27579 188663 cd08708 RGS_FLBA Regulator of G protein signaling (RGS) domain found in the FLBA (Fluffy Low BrlA) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the FLBA (Fluffy Low BrlA) protein. FLBA is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of the G-protein signaling controlled by the RGS domain accelerates the GTPase activity of the alpha subunit by hydrolysis of GTP to GDP which results in reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. The RGS domain of the FLBA protein antagonizes G protein signaling to block proliferation and allow development. It is required for control of mycelial proliferation and activation of asexual sporulation in yeast. 148
27580 188664 cd08709 RGS_RGS2 Regulator of G protein signaling (RGS) domain found in the RGS2 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS2 protein. RGS2 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS2 plays important roles in the regulation of blood pressure and the pathogenesis of human hypertension, as well as in bone formation in osteoblasts. Outside of the GPCR pathway RGS2 interacts with calmodulin, beta-COP, tubulin, PKG1-alpha, and TRPV6. 114
27581 188665 cd08710 RGS_RGS16 Regulator of G protein signaling (RGS) domain found in the RGS16 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS16 protein. RGS16 is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS16 is a member of the R4/RGS subfamily and interacts with neuronal G-alpha0. RGS16 expression is upregulated by IL-17 of the NF-kappaB signaling pathway in autoimmune B cells. 114
27582 188666 cd08711 RGS_RGS8 Regulator of G protein signaling (RGS) domain found in the RGS8 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS8 protein. RGS8 is a member of R4/RGS subfamily of RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS8 is involved in G-protein-gated potassium channels regulation and predominantly expressed in the brain. RGS8 also is selectively expressed in the hematopoietic system (NK cells). 125
27583 188667 cd08712 RGS_RGS18 Regulator of G protein signaling (RGS) domain found in the RGS18 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS18 protein. RGS18 is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS18 is a member of the R4/RGS subfamily and is expressed predominantly in osteoclasts where it acts as a negative regulator of the acidosis-induced osteoclastogenic OGR1/NFAT signaling pathway. RANKL (receptor activator of nuclear factor kappa-B ligand) stimulates osteoclastogenesis by inhibiting expression of RGS18. 114
27584 188668 cd08713 RGS_RGS3 Regulator of G protein signaling (RGS) domain found in the RGS3 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS3 protein. RGS3 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS3 induces apoptosis when overexpressed and is involved in cell migration through interaction with the Ephrin receptor. RGS3 exists as several splice isoforms and interacts with neuroligin, estrogen receptor-alpha, and 14-3-3 outside of the GPCR pathways. 114
27585 188669 cd08714 RGS_RGS4 Regulator of G protein signaling (RGS) domain found in the RGS4 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS4 protein. RGS4 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. RGS4 is expressed widely in brain including prefrontal cortex, striatum, locus coeruleus (LC), and hippocampus and has been implicated in regulation of opioid, cholinergic, and serotonergic signaling. Dysfunctions in RGS4 proteins are involved in etiology of Parkinson's disease, addiction, and schizophrenia. RGS4 also is up-regulated in the failing human heart. RGS4 interacts with many binding partners outside of GPCR pathways, including calmodulin, COP, Kir3, PIP, calcium/CaM, PA, ErbB3, and 14-3-3. 114
27586 188670 cd08715 RGS_RGS1 Regulator of G protein signaling (RGS) domain found in the RGS1 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS1 protein. RGS1 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS1 is expressed predominantly in hematopoietic compartments, including T and B lymphocytes, and may play a major role in chemokine-mediated homing of lymphocytes to secondary lymphoid organs. In addition, RGS1 interacts with calmodulin and 14-3-3 protein outside of the GPCR pathway. 114
27587 188671 cd08716 RGS_RGS13 Regulator of G protein signaling (RGS) domain found in the RGS13 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS13 protein. RGS13 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS13 is predominantly expressed in T and B lymphocytes and in mast cells, and plays a role in adaptive immune responses. RGS13 is also expressed in dendritic cells and in neuroendocrine cells of the thymus, gastrointestinal, and respiratory tracts. Outside of the GPCR pathway, RGS5 interacts with the PIP3 protein. 114
27588 188672 cd08717 RGS_RGS5 Regulator of G protein signaling (RGS) domain found in the RGS5 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS5 protein. RGS5 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Two splice isoforms of RGS5 have been found: RGS5L (long) which is expressed in smooth muscle cells (pericytes) and heart and RGS5S (short) which is highly expressed in the ciliary body of the eye, kidney, brain, spleen, skeletal muscle, and small intestine. Outside of the GPCR pathway, RGS5 interacts with the 14-3-3 protein. 114
27589 188673 cd08718 RGS_RZ-like Regulator of G protein signaling (RGS) domain found in the RZ protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by RGS domains, which accelerate GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS19 (former GAIP), RGS20, and its splice variant Ret-RGS. 118
27590 188674 cd08719 RGS_SNX13 Regulator of G protein signaling (RGS) domain found in the Sorting Nexin 13 (SNX13) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX13 (Sorting Nexin 13) protein, a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. The RGS-domain of SNX13 plays a major role through attenuation of Galphas-mediated signaling and regulates endocytic trafficking and degradation of the epidermal growth factor receptor. Snx13-null mice were embryonic lethal around midgestation which supports an essential role for SNX13 in mouse development and regulation of endocytosis dynamics. 135
27591 188675 cd08720 RGS_SNX25 Regulator of G protein signaling (RGS) domain found in the Sorting Nexin 25 (SNX25) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX25 (Sorting Nexin 25) protein, a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. SNX25 is a member of the Dopamine receptors (DAR) signalplex and regulates the trafficking of D1 and D2 DARs. 110
27592 188676 cd08721 RGS_AKAP2_2 Regulator of G protein signaling (RGS) domain 2 found in the A-kinase anchoring protein, D-AKAP2. The RGS (Regulator of G-protein Signaling) domain is an essential part of the D-AKAP2 (A-kinase anchoring protein), a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. D-AKAP2 contains two RGS domains which play an important role in spatiotemporal localization of cAMP-dependent PKA (cyclic AMP-dependent protein kinase) that regulates many different signaling pathways by phosphorylation of target proteins. This cd contains the second RGS domain. 121
27593 188677 cd08722 RGS_SNX14 Regulator of G protein signaling (RGS) domain found in the Sorting Nexin14 (SNX14) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX14 (Sorting Nexin14) protein, a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. SNX14 is believed to regulate membrane trafficking in motor neurons. 127
27594 188678 cd08723 RGS_RGS21 Regulator of G protein signaling (RGS) domain found in the RGS21 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS21 protein, a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, apoptosis, and cell proliferation, as well as modulation of cardiac development. RGS21 is a member of the R4/RGS subfamily and its mRNA was detected only in sensory taste cells that express sweet taste receptors and the taste G-alpha subunit, gustducin, suggesting a potential role in regulating taste transduction. 111
27595 188679 cd08724 RGS_GRK-like Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase (GRK). The RGS domain is found in G protein-coupled receptor kinases (GRKs). These proteins play a key role in phosphorylation-dependent desensitization/resensitization of GPCRs (G protein-coupled receptors), intracellular trafficking, endocytosis, as well as in the modulation of important intracellular signaling cascades by GPCR. GRKs also modulate cellular response in a phosphorylation-independent manner using their ability to interact with multiple signaling proteins involved in many essential cellular pathways. The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. Based on sequence homology the GRK family consists of three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 114
27596 188680 cd08725 RGS_RGS22_4 Regulator of G protein signaling domain RGS_RGS22_4. The RGS (Regulator of G-protein Signaling) domain found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis. 123
27597 188681 cd08726 RGS_RGS22_3 Regulator of G protein signaling domain RGS_RGS22_3. The RGS (Regulator of G-protein Signaling) domain found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis. 130
27598 188682 cd08727 RGS_RGS22_2 Regulator of G protein signaling domain RGS_RGS22_2. The RGS (Regulator of G-protein Signaling) domain found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis. 116
27599 188683 cd08728 RGS-like_2 Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 2. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision. 179
27600 188684 cd08729 RGS_PX Regulator of G protein signaling domain. These uncharacterized RGS-like domains are found in proteins that also contain one or more PX domains. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As major G-protein regulators, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate intracellular trafficking and provide vital support for signal transduction. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision. 136
27601 188685 cd08730 RGS-like_3 Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 3. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As major G-protein regulators, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision. 165
27602 188686 cd08731 RGS_RGS22_1 Regulator of G protein signaling domain RGS_RGS22_1. The RGS (Regulator of G-protein Signaling) domain found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis. 125
27603 188687 cd08732 RGS-like_4 Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 4. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision. 139
27604 188688 cd08734 RGS-like_1 Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 1. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As major G-protein regulators, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision. 109
27605 188689 cd08735 RGS_AKAP2_1 Regulator of G protein signaling (RGS) domain 1 found in the A-kinase anchoring protein, D-AKAP2. The RGS (Regulator of G-protein Signaling) domain is an essential part of the D-AKAP2 (A-kinase anchoring protein), a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. D-AKAP2 contains two RGS domains which play an important role in spatiotemporal localization of cAMP-dependent PKA (cyclic AMP-dependent protein kinase) that regulates many different signaling pathways by phosphorylation of target proteins. This cd contains the first RGS domain. 171
27606 188690 cd08736 RGS_RhoGEF-like Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein subfamily of the RGS domain containing protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RhoGEFs link signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RGS domain of the RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. The RGS-GEFs subfamily includes the leukemia-associated RhoGEF (LARG), p115RhoGEF, and PDZ-RhoGEF. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 120
27607 188691 cd08737 RGS_RGS6 Regulator of G protein signaling (RGS) domain found in the RGS6 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS6 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). Other members of the R7 subfamily (Neuronal RGS) include: RGS7, RGS9, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS6 exists in multiple splice isoforms with identical RGS domains, but possess complete or incomplete GGL domains and distinct N- and C-terminal domains. RGS6 interacts with SCG10, a neuronal growth-associated protein and therefore regulates neuronal differentiation. Another RGS6-binding protein is DMAP1, a component of the Dnmt1 complex involved in repression of newly replicated genes. Mutations of a critical residue required for interaction of RGS6 protein with G proteins did not affect the ability of RGS6 to interact with both SCG10 and DMAP1. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. 125
27608 188692 cd08738 RGS_RGS7 Regulator of G protein signaling (RGS) domain found in the RGS7 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS7 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS9, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. R7 RGS proteins are key modulators of the pharmacological effects of drugs involved in the development of tolerance and addiction. In addition, RGS7 was found to bind a component of the synaptic fusion complex, snapin, and some other proteins outside of G protein signaling pathways. 121
27609 188693 cd08739 RGS_RGS9 Regulator of G protein signaling (RGS) domain found in the RGS9 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS9 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS7, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS9 forms constitutive complexes with G-beta-5 subunit and controls such fundamental functions as vision and behavior. RGS9 exists in two splice isoforms: RGS9-1 which regulates phototransduction in rods and cones and RGS9-2 which regulates dopamine and opioid signaling in the basal ganglia. In addition, RGS9 was found to bind many other proteins outside of G protein signaling pathways including: mu-opioid receptor, beta-arrestin, alpha-actinin-2, NMDAR, polycystin, spinophilin, and guanylyl cyclase, among others. 121
27610 188694 cd08740 RGS_RGS11 Regulator of G protein signaling (RGS) domain found in the RGS11 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS11 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS7, and RGS9, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS11 is expressed exclusively in retinal ON-bipolar neurons in which it forms complexes with G-beta-5 and R7AP (RGS7 anchor protein) and plays crucial roles in processing the light responses of retinal neurons. 126
27611 188695 cd08741 RGS_RGS10 Regulator of G protein signaling (RGS) domain found in the RGS10 protein. RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS10 protein. RGS10 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS10 belongs to the R12 RGS subfamily, which includes RGS12 and RGS14, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS10 exists in 2 splice isoforms. RGS10A is specifically expressed in osteoclasts and is a key component in the RANKL signaling mechanism for osteoclast differentiation, whereas RGS10B is expressed in brain and in immune tissues and has been implicated in diverse processes including: promotion of dopaminergic neuron survival via regulation of the microglial inflammatory response, modulation of presynaptic and postsynaptic G-protein signalling, as well as a possible role in regulation of gene expression. 113
27612 188696 cd08742 RGS_RGS12 Regulator of G protein signaling (RGS) domain found in the RGS12 protein. RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS12 protein. RGS12 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS12 belongs to the R12 RGS subfamily, which includes RGS10 and RGS14, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS12 exists in multiple splice variants: RGS12s (short) contains the core RGS/RBD/GoLoco domains, while RGS12L (long) has additional N-terminal PDZ and PTB domains. RGS12 splice variants show distinct expression patterns, suggesting that they have discrete functions during mouse embryogenesis. RGS12 also may play a critical role in coordinating Ras-dependent signals that are required for promoting and maintaining neuronal differentiation. 115
27613 188697 cd08743 RGS_RGS14 Regulator of G protein signaling (RGS) domain found in the RGS14 protein. RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS14 protein. RGS14 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS14 belongs to the R12 RGS subfamily, which includes RGS10 and RGS12, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS14 binds and regulates the subcellular localization and activities of H-Ras and Raf kinases in cells and thereby integrates G protein and Ras/Raf signaling pathways. 129
27614 188698 cd08744 RGS_RGS17 Regulator of G protein signaling (RGS) domain found in the RGS17 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS17 protein, a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of the G-protein signaling controlled by the RGS domain, which accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, results in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. The RZ subfamily of RGS proteins includes RGS19 (former GAIP), RGS20, and its splice variant Ret-RGS. RGS17 is a relatively non-selective GAP for G-alpha-z and other G-alpha-i/o proteins. RGS17 blocks dopamine receptor-mediated inhibition of cAMP accumulation; it also blocks thyrotropin releasing hormone-stimulated Ca++ mobilization. RGS17, like other members of RZ subfamily, can act either as a GAP or as G-protein effector antagonist. 118
27615 188699 cd08745 RGS_RGS19 Regulator of G protein signaling (RGS) domain found in the RGS19 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS19 protein (also known as GAIP), a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by RGS domains, which accelerate GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, resulting in a reassociation of the alpha-subunit with the beta-gamma-dimer and an inhibition of downstream activity. As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS20, and its splice variant Ret-RGS. RGS19 participates in regulation of dopamine receptor D2R and D3R, as well as beta-adrenergic receptors. 118
27616 188700 cd08746 RGS_RGS20 Regulator of G protein signaling (RGS) domain found in the RGS20 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS20 protein (also known as RGSZ1), a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by the RGS domain, which accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP resulting in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS19 (former GAIP), and the splice variant of RGS20, Ret-RGS. RGS20 is expressed exclusively in brain, with the highest concentrations in the temporal lobe and the caudate nucleus and may play a role in signaling regulation in these brain regions. RGS20 acts as a GAP of both G-alpha-z and G-alpha-I and controls signaling in the mu opioid receptor pathway. 167
27617 188701 cd08747 RGS_GRK2_GRK3 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 2 (GRK2) and G protein-coupled receptor kinase 3 (GRK3). The RGS domain is an essential part of the GRK2 (G protein-coupled receptor kinase 2) and the GRK3 proteins, which are members of the beta-adrenergic receptor kinases subfamily. GRK2 and GRK3 are ubiquitously expressed and can phosphorylate many different GPCRs. The C-terminus of GRK2 and 3 contains a pleckstrin homology domain (PH) with binding sites for the membrane phospholipid PIP2 and free G-beta-gamma subunits. These specific interactions could help to maintain a membrane-bound population of GRK2 prior to the agonist-dependent overt GRK2 translocation. GRK2 and GRK3 are members of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 157
27618 188702 cd08748 RGS_GRK1 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 1 (GRK1). The RGS domain is found in G protein-coupled receptor kinase 1 (GRK1, also referred to as Rhodopsin kinase), which plays a key role in phosphorylation of rhodopsin (Rho), a G protein-coupled receptor responsible for visual signal transduction in rod cells. GRK1 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. A few inactivation mutations in GRK1 have been found in patients with Oguchi disease, a stationary form of night blindness. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 138
27619 188703 cd08749 RGS_GRK7 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 7 (GRK7). The RGS domain is an essential part of the GRK7 (G protein-coupled receptor kinase 7) proteins, which together with GRK1 (Rhodopsin kinase) have been implicated in the shutoff of the photoresponse and adaptation to changing light conditions via rod and cone opsin phosphorylation. GRK7 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. GRK7 is expressed in all vertebrate cones except those of mice and rats, which do not have the gene for GRK7. Lack of either GRK7 or both GRK1 and GRK7 in humans leads to a vision defect called Enhanced S Cone syndrome. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 139
27620 188704 cd08750 RGS_GRK4 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 4 (GRK4). The RGS domain is an essential part of the GRK4 (G protein-coupled receptor kinase 4) proteins, which are membrane-associated serine/threonine protein kinases that phosphorylate G protein-coupled receptors (GPCRs) upon agonist stimulation. This phosphorylation initiates beta-arrestin-mediated receptor desensitization, internalization, and signaling events. GRK4 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. GRK4 plays a key role in regulating dopaminergic-mediated natriuresis and is associated with essential hypertension and/or salt-sensitive hypertension. GRK4 exists in four splice variants involved in hyperphosphorylation, desensitization, and internalization of two dopamine receptors (D1R and D3R). GRK4 also increases the expression of a key receptor of the renin-angiotensin system, the AT1R (angiotensin type 1 receptor). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 132
27621 188705 cd08751 RGS_GRK6 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 6 (GRK6). The RGS domain is an essential part of the GRK6 (G protein-coupled receptor kinase 6) protein, which plays an important role in the regulation of dopamine, opioid, M3 muscarinic, and chemokine receptor signaling. GRK6 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. The RH domain of GRK6 does not have structural determinants that are required for binding G-alpha subunit, in contrast to GRK2 and many other RGS proteins. GRK6 is an important target for treatment of addiction and Parkinson disease. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 145
27622 188706 cd08752 RGS_GRK5 Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 5 (GRK5). The RGS domain is an essential part of the GRK5 (G protein-coupled receptor kinase 5) protein, a membrane-associated serine/threonine protein kinase which phosphorylates G protein-coupled receptors (GPCRs) upon agonist stimulation. This phosphorylation initiates beta-arrestin-mediated receptor desensitization, internalization, and signaling events. GRK5 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 123
27623 188707 cd08753 RGS_PDZRhoGEF Regulator of G protein signaling (RGS) domain found in the PDZ-Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is an essential part of the PDZ-RhoGEF (PDZ: Postsynaptic density 95, Disk large, Zona occludens-1; RhoGEF: Rho guanine nucleotide exchange factor; alias PRG) protein, a member of the RhoGEF subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, and cell cycle progression, as well as gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes leukemia-associated RhoGEF protein (LARG), p115RhoGEF, PDZ-RhoGEF and its rat specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In contrast to p115RhoGEF and LARG, PDZ-RhoGEF cannot serve as a GTPase-activating protein (GAP), due to the mutation of sites in the RGS domain region that are crucial for GAP activity. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 145
27624 188708 cd08754 RGS_LARG Regulator of G protein signaling (RGS) domain found in the leukemia-associated Rho guanine nucleotide exchange factor (RhoGEF) protein (LARG). The RGS domain is an essential part of the leukemia-associated RhoGEF protein (LARG), a member of the RhoGEF (Rho guanine nucleotide exchange factor) subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, cell cycle progression, and gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes p115RhoGEF, LARG, PDZ-RhoGEF, and its rat specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In addition to being a G-alpha13 effector, the LARG protein also functions as a GTPase-activating protein (GAP) for G-alpha13. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 222
27625 188709 cd08755 RGS_p115RhoGEF Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (GEF), p115 RhoGEF. The RGS (Regulator of G-protein Signaling) domain is an essential part of the p115RhoGEF protein, a member of the RhoGEF (Rho guanine nucleotide exchange factor) subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, cell cycle progression, and gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes p115RhoGEF, LARG, PDZ-RhoGEF and its rat specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In addition to being a G-alpha13/12 effector, the p115RhoGEF protein also functions as a GTPase-activating protein (GAP) for G-alpha13. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 193
27626 188710 cd08756 RGS_GEF_like Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is found in the Rho guanine nucleotide exchange factor (RhoGEF) protein subfamily of the RGS domain containing protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration and cell cycle progression as well as gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes the leukemia-associated RhoGEF protein (LARG), p115RhoGEF, PDZ-RhoGEF, and its rat specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. 122
27627 188885 cd08757 SAM_PNT_ESE Sterile alpha motif (SAM)/Pointed domain of ESE-like ETS transcriptional regulators. SAM Pointed domain of ESE-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. ETS factors are important for cell differentiation. They can be involved in regulation of gene expression in different types of epithelial cells. They are expressed in salivary gland, intestine, stomach, pancreas, lungs, kidneys, colon, mammary gland, and prostate. Members of this group are proto-oncogenes. Expression profiles of these factors are altered in epithelial cancers, which makes them potential targets for cancer therapy. 69
27628 260086 cd08759 Type_III_cohesin_like Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents type III cohesins and closely related domains. 167
27629 176490 cd08760 Cyt_b561_FRRS1_like Eukaryotic cytochrome b(561), including the FRRS1 gene product. Cytochrome b(561), as found in eukaryotes, similar to and including the human FRRS1 gene product (ferric-chelate reductase 1), also called SDR-2 (stromal cell-derived receptor 2). This family comprises a variety of domain architectures, many of which contain dopamine beta-monooxygenase (DOMON) domains. The protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 191
27630 176491 cd08761 Cyt_b561_CYB561D2_like Eukaryotic cytochrome b(561), including the CYB561D2 gene product. Cytochrome b(561), as found in eukaryotes, similar to and including the human CYB561D2 gene product. CYB561D2 is a candidate tumor suppressor. The protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 183
27631 176492 cd08762 Cyt_b561_CYBASC3 Vertebrate cytochrome b(561), CYBASC3 gene product. Cytochrome b ascorbate-dependent 3, as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 179
27632 176493 cd08763 Cyt_b561_CYB561 Vertebrate cytochrome b(561), CYB561 gene product. Cytochrome b(561), as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 143
27633 176494 cd08764 Cyt_b561_CG1275_like Non-vertebrate eumetazoan cytochrome b(561). Cytochrome b(561), as found in non-vertebrate eumetazoans, similar to the Drosophila melanogaster CG1275 gene product. This protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 214
27634 176495 cd08765 Cyt_b561_CYBRD1 Vertebrate cytochrome b(561), CYBRD1 gene product. Duodenal cytochrome b or ferric-chelate reductase 3, a cytochrome b(561), as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. This protein is expressed at the brush border of duodenal enterocytes and may play a role in the uptake of dietary Fe(3+), facilitating its transport into the mucosal cells. It may also be involved in the recycling of extracellular ascorbate in erythrocyte membranes, and act as a ferrireductase in epithelial cells of the respiratory system. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 153
27635 176496 cd08766 Cyt_b561_ACYB-1_like Plant cytochrome b(561), including the carbon monoxide oxygenase ACYB-1. Cytochrome b(561), as found in plants, similar to the Arabidopsis thaliana ACYB-1 gene product, a cytochrome b561 isoform localized to the tonoplast. This protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), and might be capable of trans-membrane electron transport from intracellular ascorbate to extracellular ferric chelates. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. 144
27636 176572 cd08767 Cdt1_c The C-terminal fold of replication licensing factor Cdt1 is essential for Cdt1 activity and directly interacts with MCM2-7 helicase. Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the Origin Recognition Complex (ORC). The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and is degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity. 126
27637 176573 cd08768 Cdc6_C Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding. This model characterizes the winged-helix, C-terminal domain of the Cell division control protein (Cdc6_C). Cdc6 (also known as Cell division cycle 6 or Cdc18) functions as a regulator at the early stages of DNA replication, by helping to recruit and load the Minichromosome Maintenance Complex (MCM) onto DNA and may have additional roles in the control of mitotic entry. Precise duplication of chromosomal DNA is required for genomic stability during replication. Cdc6 has an essential role in DNA replication and irregular expression of Cdc6 may lead to genomic instability. Cdc6 over-expression is observed in many cancerous lesions. DNA replication begins when an origin recognition complex (ORC) binds to a replication origin site on the chromatin. Studies indicate that Cdc6 interacts with ORC through the Orc1 subunit, and that this association increases the specificity of the ORC-origins interaction. Further studies suggest that hydrolysis of Cdc6-bound ATP promotes the association of the replication licensing factor Cdt1 with origins through an interaction with Orc6 and this in turn promotes the loading of MCM2-7 helicase onto chromatin. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication. S-Cdk (S-phase cyclin and cyclin-dependent kinase complex) prevents rereplication by causing the Cdc6 protein to dissociate from ORC and prevents the Cdc6 and MCM proteins from reassembling at any origin. By phosphorylating Cdc6, S-Cdk also triggers Cdc6's ubiquitination. The Cdc6 protein is composed of three domains, an N-terminal AAA+ domain with Walker A and B, and Sensor-1 and -2 motifs. The central region contains a conserved nucleotide binding/ATPase domain and is a member of the ATPase superfamily. 
The C-terminal domain (Cdc6_C) is a conserved winged-helix domain that possibly mediates protein-protein interactions or direct DNA interactions. Cdc6 is conserved in eukaryotes, and related genes are found in Archaea. The winged helix fold structure of Cdc6_C is similar to the structures of other eukaryotic replication initiators without apparent sequence similarity. 87
27638 176451 cd08769 DAP_dppA_2 Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases. 270
27639 176452 cd08770 DAP_dppA_3 Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases. 263
27640 206738 cd08771 DLP_1 Dynamin_like protein family includes dynamins and Mx proteins. The dynamin family of large mechanochemical GTPases includes the classical dynamins and dynamin-like proteins (DLPs) that are found throughout the Eukarya. These proteins catalyze membrane fission during clathrin-mediated endocytosis. Dynamin consists of five domains; an N-terminal G domain that binds and hydrolyzes GTP, a middle domain (MD) involved in self-assembly and oligomerization, a pleckstrin homology (PH) domain responsible for interactions with the plasma membrane, GED, which is also involved in self-assembly, and a proline arginine rich domain (PRD) that interacts with SH3 domains on accessory proteins. To date, three vertebrate dynamin genes have been identified; dynamin 1, which is brain specific, mediates uptake of synaptic vesicles in presynaptic terminals; dynamin-2 is expressed ubiquitously and similarly participates in membrane fission; mutations in the MD, PH and GED domains of dynamin 2 have been linked to human diseases such as Charcot-Marie-Tooth peripheral neuropathy and rare forms of centronuclear myopathy. Dynamin 3 participates in megakaryocyte progenitor amplification, and is also involved in cytoplasmic enlargement and the formation of the demarcation membrane system. This family also includes interferon-induced Mx proteins that inhibit a wide range of viruses by blocking an early stage of the replication cycle. Dynamin oligomerizes into helical structures around the neck of budding vesicles in a GTP hydrolysis-dependent manner. 278
27641 350091 cd08772 GH43_62_32_68_117_130 Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. Members of the glycosyl hydrolase families 32, 43, 62, 68, 117 and 130 (GH32, GH43, GH62, GH68, GH117, GH130) all possess 5-bladed beta-propeller domains and comprise clans F and J, as classified by the carbohydrate-active enzymes database (CAZY). Clan F consists of families GH43 and GH62. GH43 includes beta-xylosidases (EC 3.2.1.37), beta-xylanases (EC 3.2.1.8), alpha-L-arabinases (EC 3.2.1.99), and alpha-L-arabinofuranosidases (EC 3.2.1.55), using aryl-glycosides as substrates, while family GH62 contains alpha-L-arabinofuranosidases (EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose sidechains from xylans. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Clan J consists of families GH32 and GH68. GH32 comprises sucrose-6-phosphate hydrolases, invertases (EC 3.2.1.26), inulinases (EC 3.2.1.7), levanases (EC 3.2.1.65), eukaryotic fructosyltransferases, and bacterial fructanotransferases, while GH68 consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10); beta-fructofuranosidase (EC 3.2.1.26); inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Members of this clan are retaining enzymes (i.e. 
they retain the configuration at anomeric carbon atom of the substrate) that catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Structures of all families in the two clans manifest a funnel-shaped active site that comprises two subsites with a single route for access by ligands. Also included in this superfamily are GH117 enzymes that have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate, and GH130 that are phosphorylases and hydrolases for beta-mannosides, involved in the bacterial utilization of mannans or N-linked glycans. 257
27642 176798 cd08773 FpgNei_N N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. These enzymes initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. The FpgNei DNA glycosylases represent one of the two structural superfamilies of DNA glycosylases that recognize oxidized bases (the other is the HTH-GPD superfamily exemplified by Escherichia coli Nth). Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. One exception is mouse Nei-like glycosylase 3 (Neil3) which forms a Schiff base intermediate via its N-terminal valine. In addition to this FpgNei_N domain, FpgNei proteins have a helix-two-turn-helix (H2TH) domain and a zinc (or zincless)-finger motif which also contribute residues to the active site. FpgNei DNA glycosylases have a broad substrate specificity. They are bifunctional: in addition to the glycosylase (recognition) activity, they have a lyase (cleaving) activity on the phosphodiester backbone of the DNA at the AP site. This superfamily includes eukaryotic, bacterial, and viral proteins. 117
27643 206755 cd08774 14-3-3 14-3-3 domain. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in phosphorylation-independent manner. Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum 14-3-3 proteins play an important role in key steps of parasite development. 225
27644 176753 cd08775 DED_Caspase-like_r2 Death effector domain, repeat 2, of initiator caspase-like proteins. Death Effector Domain (DED), second repeat, found in initiator caspase-like proteins like caspase-8, -10 and c-FLIP. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of DISC. All members contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 81
27645 176754 cd08776 DED_Caspase-like_r1 Death effector domain, repeat 1, of initiator caspase-like proteins. Death Effector Domain (DED), first repeat, found in initiator caspase-like proteins, like caspase-8 and -10 and c-FLIP. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of DISC. All members contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 71
27646 260048 cd08777 Death_RIP1 Death Domain of Receptor-Interacting Protein 1. Death domain (DD) found in Receptor-Interacting Protein 1 (RIP1) and related proteins. RIP kinases serve as essential sensors of cellular stress. Vertebrates contain several types containing a homologous N-terminal kinase domain and varying C-terminal domains. RIP1 harbors a C-terminal DD, which binds death receptors (DRs) including TNF receptor 1, Fas, TNF-related apoptosis-inducing ligand receptor 1 (TRAILR1), and TRAILR2. It also interacts with other DD-containing adaptor proteins such as TRADD and FADD. RIP1 plays a crucial role in determining a cell's fate, between survival or death, following exposure to stress signals. It is important in the signaling of NF-kappaB and MAPKs, and it links DR-associated signaling to reactive oxygen species (ROS) production. Abnormal RIP1 function may result in ROS accumulation affecting inflammatory responses, innate immunity, stress responses, and cell survival. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27647 176756 cd08778 Death_TNFRSF21 Death domain of tumor necrosis factor receptor superfamily member 21. Death domain (DD) found in tumor necrosis factor receptor superfamily member 21 (TNFRSF21), also called death receptor-6, DR6. DR6 is an orphan receptor that is expressed ubiquitously, but shows high expression in lymphoid organs, heart, brain and pancreas. Results from DR6(-/-) mice indicate that DR6 plays an important regulatory role for the generation of adaptive immunity. It may also be involved in tumor cell survival and immune evasion. In neuronal cells, it binds beta-amyloid precursor protein (APP) and activates caspase-dependent cell death. It may contribute to the pathogenesis of Alzheimer's disease. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27648 260049 cd08779 Death_PIDD Death Domain of p53-induced protein with a death domain. Death domain (DD) found in PIDD (p53-induced protein with a death domain) and similar proteins. PIDD is a component of the PIDDosome complex, which is an oligomeric caspase-activating complex involved in caspase-2 activation and plays a role in mediating stress-induced apoptosis. The PIDDosome complex is composed of three components, PIDD, RAIDD and caspase-2, which interact through their DDs and DD-like domains. The DD of PIDD interacts with the DD of RAIDD, which also contains a Caspase Activation and Recruitment Domain (CARD) that interacts with the caspase-2 CARD. Autoproteolysis of PIDD determines the downstream signaling event, between pro-survival NF-kB or pro-death caspase-2 activation. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD, DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27649 260050 cd08780 Death_TRADD Death Domain of Tumor Necrosis Factor Receptor 1-Associated Death Domain protein. Death domain (DD) of TRADD (TNF Receptor 1-Associated Death Domain or TNFRSF1A-associated via death domain) protein. TRADD is a central signaling adaptor for TNF-receptor 1 (TNFR1), mediating activation of Nuclear Factor-kappaB (NF-kB) and c-Jun N-terminal kinase (JNK), as well as caspase-dependent apoptosis. It also carries important immunological roles including germinal center formation, DR3-mediated T-cell stimulation, and TNFalpha-mediated inflammatory responses. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 90
27650 260051 cd08781 Death_UNC5-like Death domain found in Uncoordinated-5 homolog family. Death Domain (DD) found in Uncoordinated-5 (UNC-5) homolog family, which includes Unc5A, B, C and D in vertebrates. UNC5 proteins are receptors for secreted netrins (netrin-1, -3 and -4) that are involved in diverse processes like axonal guidance, neuronal migration, blood vessel patterning, and apoptosis. They are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27651 260052 cd08782 Death_DAPK1 Death domain found in death-associated protein kinase 1. Death domain (DD) found in death-associated protein kinase 1 (DAPK1). DAPK1 is composed of several functional domains, including a kinase domain, a CaM regulatory domain, ankyrin repeats, a cytoskeletal-binding domain and a C-terminal DD. It plays important roles in a diverse range of signal transduction pathways including apoptosis, growth factor signalling, and autophagy. Loss of DAPK1 expression, usually because of DNA methylation, is implicated in many tumor types. DAPK1 is highly abundant in the brain and has also been associated with neurodegeneration. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 82
27652 260053 cd08783 Death_MALT1 Death domain similar to that found in Mucosa-associated lymphoid tissue-lymphoma-translocation gene 1. Death domain (DD) similar to that found in Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1). Malt1, together with Bcl10 (B-cell lymphoma 10), are the integral components of the CBM signalosome. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells) and with CARMA1 to form L-CBM (CBM complex in lymphoid immune cells), to mediate activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 96
27653 260054 cd08784 Death_DRs Death Domain of Death Receptors. Death domain (DD) found in death receptor proteins. Death receptors are members of the tumor necrosis factor (TNF) receptor superfamily, characterized by having a cytoplasmic DD. Known members of the family are Fas (CD95/APO-1), TNF-receptor 1 (TNFR1/TNFRSF1A/p55/CD120a), TNF-related apoptosis-inducing ligand receptor 1 (TRAIL-R1 /DR4), and receptor 2 (TRAIL-R2/DR5/APO-2/KILLER), as well as Death Receptor 3 (DR3/APO-3/TRAMP/WSL-1/LARD). They are involved in apoptosis signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 80
27654 260055 cd08785 CARD_CARD9-like Caspase activation and recruitment domain of CARD9 and related proteins. Caspase activation and recruitment domain (CARD) found in CARD9, CARD14 (CARMA2), CARD10 (CARMA3), CARD11 (CARMA1) and BCL10. BCL10 (B-cell lymphoma 10), together with Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), are integral components of the CBM signalosome. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells), and with CARD11 to form L-CBM (CBM complex in lymphoid immune cells), which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. BCL10/Malt1 also associates with CARD10, which is more widely expressed and is not restricted to hematopoietic cells, to play a role in GPCR-induced NF-kB activation. CARD14 has also been shown to associate with BCL10. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27655 176764 cd08786 CARD_RIP2_CARD3 Caspase activation and recruitment domain of Receptor Interacting Protein 2. Caspase activation and recruitment domain (CARD) of Receptor Interacting Protein 2 (RIP2/RIPK2/RICK/CARDIAK/CARD3). RIP kinases serve as essential sensors of cellular stress. Vertebrates contain several types containing a homologous N-terminal kinase domain and varying C-terminal domains. RIP2 harbors a C-terminal CARD domain and functions as an effector kinase downstream of the pattern recognition receptors from the Nod-like (NLR)-family, NOD1 and NOD2, which recognize bacterial peptidoglycans released upon infection. This cascade is implicated in inflammatory immune responses and the clearance of intracellular pathogens. RIP2 associates with NOD1 and NOD2 via CARD-CARD interactions. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 87
27656 176765 cd08787 CARD_NOD2_1_CARD15 Caspase activation and recruitment domain of NOD2, repeat 1. Caspase activation and recruitment domain (CARD) similar to that found in human NOD2 (CARD15), repeat 1. NOD2 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD2, as well as NOD1, the N-terminal effector domain is a CARD. NOD2 contains two N-terminal CARD repeats. Mutations in NOD2 have been associated with Crohn's disease and Blau syndrome. Nod2-CARDs have been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 87
27657 260056 cd08788 CARD_NOD2_2_CARD15 Caspase activation and recruitment domain of NOD2, repeat 2. Caspase activation and recruitment domain (CARD) similar to that found in human NOD2 (CARD15), repeat 2. NOD2 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD2, as well as NOD1, the N-terminal effector domain is a CARD. NOD2 contains two N-terminal CARD repeats. Mutations in NOD2 have been associated with Crohn's disease and Blau syndrome. Nod2-CARDs have been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 81
27658 260057 cd08789 CARD_IPS-1_RIG-I Caspase activation and recruitment domains (CARDs) found in IPS-1 and RIG-I-like RNA helicases. Caspase activation and recruitment domains (CARDs) found in IPS-1 (Interferon beta promoter stimulator protein 1) and Retinoic acid Inducible Gene I (RIG-I)-like DEAD box helicases. RIG-I-like helicases and IPS-1 play important roles in the induction of interferons in response to viral infection. They are crucial in triggering innate immunity and in developing adaptive immunity against viral pathogens. RIG-I-like helicases, including MDA5 and RIG-I, contain two N-terminal CARD domains and a C-terminal DEAD box RNA helicase domain. They are cytoplasmic RNA helicases that play an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. MDA5 and RIG-I associate with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 91
27659 260058 cd08790 DED_DEDD Death Effector Domain of DEDD. Death Effector Domain (DED) found in DEDD. DEDD has been shown to block mitotic progression by inhibiting Cdk1 and to be involved in regulating the insulin signaling cascade. DEDD can bind to itself, to DEDD2, and to the two tandem DED-containing caspases, caspase-8 and -10. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 97
27660 176769 cd08791 DED_DEDD2 Death Effector Domain of DEDD2. Death Effector Domain (DED) found in DEDD2. DEDD2 has been shown to bind to itself, DEDD, and to the two tandem DED-containing caspases, caspase-8 and -10. It may play a role in apoptosis. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. In mammals, they are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways. 106
27661 260059 cd08792 DED_Caspase_8_10_r1 Death effector domain, repeat 1, of initiator caspases 8 and 10. Death Effector Domain (DED) found in caspase-8 and caspase-10, repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis, and they play partially redundant roles. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. They contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 77
27662 260060 cd08793 Death_IRAK4 Death domain of Interleukin-1 Receptor-Associated Kinase 4. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 4 (IRAK4). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB, and mitogen-activated protein kinases. IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK4 is an active kinase that is also involved in T-cell receptor signaling pathways, implying that it may function in acquired immunity and not just in innate immunity. It is known as the master IRAK member because its absence strongly impairs TLR- and IL-1-mediated signaling and innate immune defenses, while the absence of other IRAK proteins only shows slight effects. IRAK4-deficient patients have impaired inflammatory responses and recurrent life-threatening infections. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 100
27663 260061 cd08794 Death_IRAK1 Death domain of Interleukin 1 Receptor Associated Kinase-1. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 1 (IRAK1). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK1 is an active kinase and also plays adaptor functions. It binds to the MyD88-IRAK4 complex via its DD, which facilitates its phosphorylation by IRAK4, activating it for further auto-phosphorylation. Hyper-phosphorylated IRAK1 forms a cytosolic complex with TRAF6, leading to the activation of NF-kB and MAPK pathways. IRAK1 is involved in autoimmunity and may be associated with lupus pathogenesis. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27664 176773 cd08795 Death_IRAK2 Death domain of Interleukin 1 Receptor Associated Kinase-2. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 2 (IRAK2). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors (TLRs), nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK2 is an essential component of several signaling pathways, including NF-kappaB and the IL-1 signaling pathways. It is an inactive kinase that participates in septic shock mediated by TLR4 and TLR9. It plays a redundant role with IRAK1 in early NF-kB and MAPK responses, and remains present at later stages whereas IRAK1 disappears. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 88
27665 260062 cd08796 Death_IRAK-M Death domain of Interleukin 1 Receptor Associated Kinase-M. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase M (IRAK-M). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors (TLRs), nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK-M, also called IRAK-3, is an inactive kinase present only in macrophages in an inducible manner. It is a negative regulator of TLR signaling and it contributes to the attenuation of NF-kB activation. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 89
27666 260063 cd08797 Death_NFkB1_p105 Death domain of the Nuclear Factor-KappaB1 precursor protein p105. Death Domain (DD) of the Nuclear Factor-KappaB1 (NF-kB1) precursor protein p105. The NF-kB family of transcription factors play a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). NF-kB1 (or p50) is produced from the processing of the precursor protein p105, which contains ANK repeats and a C-terminal DD in addition to the RHD. It is regulated by the classical (or canonical) NF-kB pathway. In the cytosol, p50 forms an inactive complex with RelA (or p65) and the Inhibitor of NF-kB (IkB). Activation is triggered by the phosphorylation and degradation of IkB, which allows the active DNA-binding p50-RelA dimer to migrate to the nucleus. The classical pathway regulates the majority of genes activated by NF-kB including those encoding cytokines, chemokines, leukocyte adhesion molecules, and anti-apoptotic factors. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 76
27667 176776 cd08798 Death_NFkB2_p100 Death domain of the Nuclear Factor-KappaB2 precursor protein p100. Death Domain (DD) of the Nuclear Factor-KappaB2 (NF-kB2) precursor protein p100. The NF-kB family of transcription factors play a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). NF-kB2 (or p52) is produced from the processing of the precursor protein p100, which contains ANK repeats and a C-terminal DD in addition to the RHD. It is regulated by the non-canonical NF-kB pathway. The p100 precursor is cytosolic and interacts with RelB. Upon phosphorylation by IKKalpha, p100 is processed to its 52kDa active, DNA-binding form and the p52/RelB complex is translocated into the nucleus. The non-canonical pathway plays a role in adaptive immunity and lymphorganogenesis. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 76
27668 260064 cd08799 Death_UNC5C Death domain found in Uncoordinated-5C. Death Domain (DD) found in Uncoordinated-5C (UNC5C). UNC5C is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5C plays a critical role in the development of spinal accessory motor neurons. Methylation of the UNC5C gene is associated with early stages of colorectal carcinogenesis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27669 260065 cd08800 Death_UNC5A Death domain found in Uncoordinated-5A. Death Domain (DD) found in Uncoordinated-5A (UNC5A). UNC5A is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a critical role in neuronal development and differentiation, as well as axon-guidance. It also plays a role in regulating apoptosis in non-neuronal cells as a downstream target of p53. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27670 176779 cd08801 Death_UNC5D Death domain found in Uncoordinated-5D. Death Domain (DD) found in Uncoordinated-5D (UNC5D). UNC5D is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 98
27671 176780 cd08802 Death_UNC5B Death domain found in Uncoordinated-5B. Death Domain (DD) found in Uncoordinated-5B (UNC5B). UNC5B is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5B signaling is involved in the netrin-1-induced proliferation and migration of renal proximal tubular cells. It is also required for vascular patterning during embryonic development, and its activation inhibits sprouting angiogenesis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27672 176781 cd08803 Death_ank3 Death domain of Ankyrin-3. Death Domain (DD) of the human protein ankyrin-3 (ANK-3) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-3, also called ankyrin-G (for general or giant), is found in neurons and at least one splice variant has been shown to be essential for propagation of action potentials as a binding partner to neurofascin and voltage-gated sodium channels. It is required for maintaining axo-dendritic polarity, and may be a genetic risk factor associated with bipolar disorder. ANK-3 may also play roles in other cell types. Mutations affecting ANK-3 pathways for Na channel localization are associated with Brugada syndrome, a potentially fatal arrhythmia. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27673 260066 cd08804 Death_ank2 Death domain of Ankyrin-2. Death Domain (DD) of Ankyrin-2 (ANK-2) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-2, also called ankyrin-B (for broadly expressed), is required for proper function of the Na/Ca ion exchanger-1 in cardiomyocytes, and is thought to function in linking integral membrane proteins to the underlying cytoskeleton. Human ANK-2 is associated with "Ankyrin-B syndrome", an atypical arrhythmia disorder with risk of sudden cardiac death. It also plays key roles in the brain and striated muscle. Loss of ANK-2 is associated with significant nervous system defects and sarcomere disorganization. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27674 260067 cd08805 Death_ank1 Death domain of Ankyrin-1. Death Domain (DD) of the human protein ankyrin-1 (ANK-1) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-1, also called ankyrin-R (for restricted), is found in brain, muscle, and erythrocytes and is thought to function in linking integral membrane proteins to the underlying cytoskeleton. It plays a critical nonredundant role in erythroid development and is associated with hereditary spherocytosis (HS), a common disorder of the red cell membrane. The small alternatively-spliced variant, sANK-1, found in striated muscle and concentrated in the sarcoplasmic reticulum (SR) binds obscurin and titin, which facilitates the anchoring of the network SR to the contractile apparatus. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 84
27675 260068 cd08806 CARD_CARD14_CARMA2 Caspase activation and recruitment domain of CARD14-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD14, also known as BIMP2 or CARMA2 (caspase recruitment domain-containing membrane-associated guanylate kinase protein 2). CARD14 has been identified as a novel member of the MAGUK (membrane-associated guanylate kinase) family that functions as upstream activators of BCL10 (B-cell lymphoma 10) and NF-kB signaling. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27676 260069 cd08807 CARD_CARD10_CARMA3 Caspase activation and recruitment domain of CARD10-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD10, also known as CARMA3 (caspase recruitment domain-containing membrane-associated guanylate kinase protein 3) or BIMP1. The CARMA3-BCL10-MALT1 signalosome plays a role in the GPCR-induced NF-kB activation. CARMA3 is more widely expressed than CARMA1, which is found only in hematopoietic cells. In endothelial and smooth muscle cells, CARMA3-mediated NF-kB activation induces pro-inflammatory signals within the vasculature and is a key factor in atherogenesis. In bronchial epithelial cells, CARMA3-mediated NF-kB signaling is important for the development of allergic airway inflammation. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27677 260070 cd08808 CARD_CARD11_CARMA1 Caspase activation and recruitment domain of CARD11-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD11, also known as caspase recruitment domain-containing membrane-associated guanylate kinase protein 1 (CARMA1). CARMA1, together with BCL10 (B-cell lymphoma 10) and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), form the L-CBM signalosome (CBM complex in lymphoid immune cells) which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. CARMA1 associates with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27678 260071 cd08809 CARD_CARD9 Caspase activation and recruitment domain of CARD9-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD9. CARD9 is a central regulator of innate immunity and is highly expressed in dendritic cells and macrophages. Together with BCL10 (B-cell lymphoma 10) and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), it forms the M-CBM signalosome (the CBM complex in myeloid immune cells), which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. CARD9 associates with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
27679 260072 cd08810 CARD_BCL10 Caspase activation and recruitment domain of B-cell lymphoma 10. Caspase activation and recruitment domain (CARD) similar to that found in BCL10 (B-cell lymphoma 10). BCL10 and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1) are the integral components of CBM signalosomes. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells) and with CARMA1 to form L-CBM (CBM complex in lymphoid immune cells), to mediate activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. Both CARMA1 and CARD9 associate with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 85
27680 260073 cd08811 CARD_IPS1 Caspase activation and recruitment domain (CARD) found in IPS-1. Caspase activation and recruitment domain (CARD) found in IPS-1 (Interferon beta promoter stimulator protein 1), also known as CARDIF, VISA or MAVS. IPS-1 is an adaptor protein that plays an important role in interferon induction in response to viral infection. It is crucial in triggering innate immunity and in developing adaptive immunity against viral pathogens. The CARD of IPS-1 associates with the CARDs of two RNA helicases, RIG-I and MDA5, which bind viral RNA in the cytoplasm during the initial stage of intracellular antiviral response, leading to the induction of type I interferons. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 92
27681 176791 cd08813 DED_Caspase_8_r2 Death Effector Domain, repeat 2, of Caspase-8. Death effector domain (DED) found in caspase-8 (CASP8, FLICE), repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-10, and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 also plays many important non-apoptotic functions including roles in embryonic development, cell adhesion and motility, immune cell proliferation and differentiation, T-cell activation, and NFkappaB signaling. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 83
27682 260074 cd08814 DED_Caspase_10_r2 Death Effector Domain, repeat 2, of Caspase-10. Death effector domain (DED) found in Caspase-10, repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-10 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-8 and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 79
27683 176793 cd08815 Death_TNFRSF25_DR3 Death domain of Tumor Necrosis Factor Receptor superfamily 25. Death Domain (DD) found in Tumor Necrosis Factor (TNF) receptor superfamily 25 (TNFRSF25), also known as TRAMP (TNF receptor-related apoptosis-mediating protein), LARD, APO-3, WSL-1, or DR3 (Death Receptor-3). TNFRSF25 is primarily expressed in T cells, is activated by binding to its ligand TL1A, and plays an important role in T-cell function. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 77
27684 260075 cd08816 CARD_RIG-I_r1 Caspase activation and recruitment domain found in RIG-I, first repeat. Caspase activation and recruitment domain (CARD) found in RIG-I (Retinoic acid Inducible Gene I, also known as Ddx58), first repeat. RIG-I is a cytoplasmic RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. RIG-I contains two N-terminal CARD domains and a C-terminal RNA helicase. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, RIG-I recognizes different sets of viruses compared to MDA5, a related RNA helicase. RIG-I associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 90
27685 260076 cd08817 CARD_RIG-I_r2 Caspase activation and recruitment domain found in RIG-I, second repeat. Caspase activation and recruitment domain (CARD) found in RIG-I (Retinoic acid Inducible Gene I, also known as Ddx58), second repeat. RIG-I is a cytoplasmic RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. RIG-I contains two N-terminal CARD domains and a C-terminal RNA helicase. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, RIG-I recognizes different sets of viruses compared to MDA5, a related RNA helicase. RIG-I associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 91
27686 260077 cd08818 CARD_MDA5_r1 Caspase activation and recruitment domain found in MDA5, first repeat. Caspase activation and recruitment domain (CARD) found in MDA5 (melanoma-differentiation-associated gene 5), first repeat. MDA5, also known as IFIH1, contains two N-terminal CARD domains and a C-terminal RNA helicase domain. MDA5 is a cytoplasmic DEAD box RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, MDA5 recognizes different sets of viruses compared to RIG-I, a related RNA helicase. MDA5 associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 92
27687 260078 cd08819 CARD_MDA5_r2 Caspase activation and recruitment domain found in MDA5, second repeat. Caspase activation and recruitment domain (CARD) found in MDA5 (melanoma-differentiation-associated gene 5), second repeat. MDA5, also known as IFIH1, contains two N-terminal CARD domains and a C-terminal RNA helicase domain. MDA5 is a cytoplasmic DEAD box RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, MDA5 recognizes different sets of viruses compared to RIG-I, a related RNA helicase. MDA5 associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 92
27688 187722 cd08820 FMT_core_like_6 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 173
27689 187723 cd08821 FMT_core_like_1 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 211
27690 187724 cd08822 FMT_core_like_2 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 192
27691 187725 cd08823 FMT_core_like_5 Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group. This domain contains a Rossmann fold and it is the catalytic domain of the enzyme. 177
27692 193585 cd08824 LOTUS LOTUS is an uncharacterized small globular domain found in Limkain b1, Oskar and Tudor-containing proteins 5 and 7. LOTUS is an uncharacterized small globular domain found in Limkain b1, Oskar and Tudor-containing proteins 5 and 7. The LOTUS containing proteins are germline-specific and are found in the nuage/polar granules of germ cells. Tudor-containing protein 5 and 7 belong to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 and TDRD7 are components of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. Oskar protein is a critical component of the pole plasm in the Drosophila oocyte, which is required for germ cell formation. Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. Limkain b1 contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 70
27693 259807 cd08825 MVP_shoulder Shoulder domain of the major vault protein. The major vault protein is the major polypeptide component of a large cellular ribonuclear protein complex found in the cytoplasm of eukaryotic cells. Its shoulder domain appears to be a homolog of the SPFH core domain. Vault proteins may be involved in detoxification processes, and have been associated with the multi-drug resistance (MDR) phenotype in malignancies. Presumably they play a role in transport processes. 151
27694 259808 cd08826 SPFH_eoslipins_u1 Uncharacterized prokaryotic subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria and archaebacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial and archaebacterial SLPs remain uncharacterized. This subgroup contains PH1511 from the hyperthermophilic archaeon Pyrococcus horikoshii. 178
27695 259809 cd08827 SPFH_podocin Podocin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in vertebrates. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Podocin is expressed in the kidney and mutations in the gene have been linked to familial idiopathic nephrotic syndrome. Podocin interacts with the TRP ion channel TRPV-6 and may function as a scaffolding protein in the organization of lipid-protein domains. 223
27696 259810 cd08828 SPFH_SLP-3 Slipin-3 (SLP-3), an uncharacterized subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in vertebrates. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Members of this slipin subgroup remain uncharacterized, except for Caenorhabditis elegans UNC-1. Mutations in the unc-1 gene result in abnormal motion and altered patterns of sensitivity to volatile anesthetics. 154
27697 259811 cd08829 SPFH_paraslipin Paraslipin or slipin-2 (SLP-2), a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in all three kingdoms of life. The conserved domain common to these families has also been referred to as the Band 7 domain. Individual proteins of the SPFH family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. This subgroup of the SLPs remains largely uncharacterized. It includes human SLP-2, which is upregulated and involved in the development and progression of several types of cancer, including esophageal squamous cell carcinoma, endometrial adenocarcinoma, breast cancer, and glioma. 111
27698 350059 cd08830 ArfGap_ArfGap1 Arf1 GTPase-activating protein 1. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in the control of membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex), which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 have been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess an ALPS motif. 115
27699 350060 cd08831 ArfGap_ArfGap2_3_like Arf1 GTPase-activating protein 2/3-like. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling of membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 has been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motif. 116
27700 350061 cd08832 ArfGap_ADAP ArfGap with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation. ADAP2 has high sequence similarity to ADAP1 and they both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in the brain neurons, while ADAP2 is broadly expressed, including the adipocytes, heart, and skeletal muscle but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may be contributing to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions. 113
27701 350062 cd08833 ArfGap_GIT The GIT subfamily of ADP-ribosylation factor GTPase-activating proteins. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. Examples include the GIT partners paxillin or integrin-alpha4 (to focal adhesions), piccolo and liprin-alpha (to synapses), and the beta-PIX partner Scribble (to epithelial cell-cell contacts and synapses). Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration. 109
27702 350063 cd08834 ArfGap_ASAP ArfGAP domain of ASAP (Arf GAP, SH3, ANK repeat and PH domains) subfamily of ADP-ribosylation factor GTPase-activating proteins. The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline-rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via its PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. Both ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma. 117
27703 350064 cd08835 ArfGap_ACAP ArfGAP domain of ACAP (ArfGAP with Coiled-coil, ANK repeat and PH domains) proteins. ArfGAP domain is an essential part of ACAP proteins that play an important role in endocytosis, actin remodeling and receptor tyrosine kinase-dependent cell movement. ACAP subfamily of ArfGAPs are composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides, PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. In addition, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons. 116
27704 350065 cd08836 ArfGap_AGAP ArfGAP with GTPase domain, ANK repeat and PH domains. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab-4 dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking. 108
27705 350066 cd08837 ArfGap_ARAP ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing proteins. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factor stimulation, and focal adhesion dynamics. 116
27706 350067 cd08838 ArfGap_AGFG ArfGAP domain of the AGFG subfamily (ArfGAP domain and FG repeat-containing proteins). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG1 (alias: HIV-1 Rev binding protein, HRB; Rev interacting protein, RIP; Rev/Rex activating domain-binding protein, RAB) and AGFG2 are involved in the maintenance and spread of human immunodeficiency virus type 1 (HIV-1) infection. The ArfGAP domain of AGFG is related to nucleoporins, which are a class of proteins that mediate nucleocytoplasmic transport. AGFG plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs, possibly together with the nuclear export receptor CRM1. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication. 113
27707 350068 cd08839 ArfGap_SMAP Stromal membrane-associated proteins; a subfamily of the ArfGAP family. The SMAP subfamily of Arf GTPase-activating proteins consists of the two structurally-related members, SMAP1 and SMAP2. Each SMAP member exhibits common and distinct functions in vesicle trafficking. They both bind to clathrin heavy chain molecules and are involved in the trafficking of clathrin-coated vesicles. SMAP1 preferentially exhibits GAP activity toward Arf6, while SMAP2 prefers Arf1 as a substrate. SMAP1 is involved in Arf6-dependent vesicle trafficking, but not Arf6-mediated actin cytoskeleton reorganization, and regulates clathrin-dependent endocytosis of the transferrin receptors and E-cadherin. SMAP2 regulates Arf1-dependent retrograde transport of TGN38/46 from the early endosome to the trans-Golgi network (TGN). SMAP2 has the Clathrin Assembly Lymphoid Myeloid (CALM)-binding domain, but SMAP1 does not. 103
27708 350069 cd08843 ArfGap_ADAP1 ADAP1 GTPase activating protein for Arf, with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation. ADAP2 has high sequence similarity to ADAP1 and they both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in the brain neurons, while ADAP2 is broadly expressed, including the adipocytes, heart, and skeletal muscle but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may be contributing to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions. 112
27709 350070 cd08844 ArfGap_ADAP2 ADAP2 GTPase activating protein for Arf, with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation. ADAP2 has high sequence similarity to ADAP1 and they both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in the brain neurons, while ADAP2 is broadly expressed, including the adipocytes, heart, and skeletal muscle but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may be contributing to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions. 112
27710 350071 cd08846 ArfGap_GIT1 GIT1 GTPase activating protein for Arf. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. Examples include the GIT partners paxillin or integrin-alpha4 (to focal adhesions), piccolo and liprin-alpha (to synapses), and the beta-PIX partner Scribble (to epithelial cell-cell contacts and synapses). Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration. 111
27711 350072 cd08847 ArfGap_GIT2 GIT2 GTPase activating protein for Arf. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. Examples include the GIT partners paxillin or integrin-alpha4 (to focal adhesions), piccolo and liprin-alpha (to synapses), and the beta-PIX partner Scribble (to epithelial cell-cell contacts and synapses). Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration. 111
27712 350073 cd08848 ArfGap_ASAP1 ArfGAP domain of ASAP1 (ArfGAP with SH3 domain, ANK repeat and PH domain-containing protein 1). The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline-rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via its PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma. 122
27713 350074 cd08849 ArfGap_ASAP2 ArfGAP domain of ASAP2 (ArfGAP with SH3 domain, ANK repeat and PH domain-containing protein 2). The Arf GAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline-rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. 123
27714 350075 cd08850 ArfGap_ACAP3 ArfGAP domain of ACAP3 (ArfGAP with Coiled-coil, ANK repeat and PH domains 3). ACAP3 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). ACAP subfamily of ArfGAPs are composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. It has been shown that ACAP3 positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) also have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides, PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. 116
27715 350076 cd08851 ArfGap_ACAP2 ArfGAP domain of ACAP2 (ArfGAP with Coiled-coil, ANK repeat and PH domains 2). ACAP2 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). ACAP subfamily of ArfGAPs are composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides, PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons. 116
27716 350077 cd08852 ArfGap_ACAP1 ArfGAP domain of ACAP1 (ArfGAP with Coiled-coil, ANK repeat and PH domains 1). ACAP1 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). ACAP subfamily of ArfGAPs are composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides, PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons. 120
27717 350078 cd08853 ArfGap_AGAP2 ArfGAP with GTPase domain, ANK repeat and PH domain 2. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab-4 dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking. 109
27718 350079 cd08854 ArfGap_AGAP1 ArfGAP with GTPase domain, ANK repeat and PH domain 1. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab-4 dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking. 109
27719 350080 cd08855 ArfGap_AGAP3 ArfGAP with GTPase domain, ANK repeat and PH domain 3. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab-4 dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. 110
27720 350081 cd08856 ArfGap_ARAP2 ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 2. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factor stimulation, and focal adhesion dynamics. ARAP2 localizes to the cell periphery and on focal adhesions composed of paxillin and vinculin, and functions downstream of RhoA to regulate focal adhesion dynamics. ARAP2 is a PI(3,4,5)P3-dependent Arf6 GAP that binds RhoA-GTP, but it lacks the predicted catalytic arginine in the RhoGAP domain and does not have RhoGAP activity. ARAP2 reduces Rac1-GTP levels by reducing Arf6-GTP levels through GAP activity. AGAP2 also binds to and regulates focal adhesion kinase (FAK). Thus, ARAP2 signals through Arf6 and Rac1 to control focal adhesion morphology. 121
27721 350082 cd08857 ArfGap_AGFG1 ArfGAP domain of AGFG1 (ArfGAP domain and FG repeat-containing protein 1). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG1 (alias: HIV-1 Rev binding protein, HRB; Rev interacting protein, RIP; Rev/Rex activating domain-binding protein, RAB) and AGFG2 are involved in the maintenance and spread of human immunodeficiency virus type 1 (HIV-1) infection. The ArfGAP domain of AGFG1 is related to nucleoporins, which are a class of proteins that mediate nucleocytoplasmic transport. AGFG1 plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs, possibly together with the nuclear export receptor CRM1. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG1 promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication. 116
27722 350083 cd08859 ArfGap_SMAP2 Stromal membrane-associated protein 2; a subfamily of the ArfGAP family. The SMAP subfamily of Arf GTPase-activating proteins consists of the two structurally-related members, SMAP1 and SMAP2. Each SMAP member exhibits common and distinct functions in vesicle trafficking. They both bind to clathrin heavy chain molecules and are involved in the trafficking of clathrin-coated vesicles. SMAP1 preferentially exhibits GAP toward Arf6, while SMAP2 prefers Arf1 as a substrate. SMAP1 is involved in Arf6-dependent vesicle trafficking, but not Arf6-mediated actin cytoskeleton reorganization, and regulates clathrin-dependent endocytosis of the transferrin receptors and E-cadherin. SMAP2 regulates Arf1-dependent retrograde transport of TGN38/46 from the early endosome to the trans-Golgi network (TGN). SMAP2 has the Clathrin Assembly Lymphoid Myeloid (CALM)-binding domain, but SMAP1 does not. 107
27723 176869 cd08860 TcmN_ARO-CYC_like N-terminal aromatase/cyclase domain of the multifunctional protein tetracenomycin (TcmN) and related domains. This family includes the N-terminal aromatase/cyclase (ARO/CYC) domain of Streptomyces glaucescens TcmN, and related domains. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. ARO/CYC domains participate in the diversification of aromatic polyketides by promoting polyketide cyclization. They occur in two architectural forms, monodomain and didomain. Monodomain aromatase/cyclases have a single ARO/CYC domain. For some, such as TcmN, this single domain is linked to a second domain of unrelated function. TcmN is a multifunctional cyclase-dehydratase-O-methyl transferase. Its N-terminal ARO/CYC domain participates in polyketide binding and catalysis; it promotes C9-C14 first-ring (and C7-C16 second-ring) cyclizations. Its C-terminal domain has O-methyltransferase activity. Didomain aromatase/cyclases contain two ARO/CYC domains, and they biosynthesize C7-C12 first ring cyclized polyketides. These latter domains belong to a different subfamily in the SRPBCC superfamily. 146
27724 176870 cd08861 OtcD1_ARO-CYC_like N-terminal and C-terminal aromatase/cyclase domains of Streptomyces rimosus OtcD1 and related domains. This family includes the N- and C-terminal aromatase/cyclase (ARO/CYC) domains of Streptomyces rimosus OtcD1 and related domains. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. ARO/CYC domains participate in the diversification of aromatic polyketides by promoting polyketide cyclization. They occur in two architectural forms, didomain and monodomain. Didomain aromatase/cyclases (ARO/CYCs) contain two ARO/CYC domains, and are associated with C7-C12 first ring cyclized polyketides. Streptomyces rimosus OtcD1 is a didomain ARO/CYC. The polyketide oxytetracycline (OTC) is a broad spectrum antibiotic made by Streptomyces rimosus. The gene encoding OtcD1 is part of the oxytetracycline (OTC) gene cluster. Disruption of this gene results in the production of novel polyketides having shorter chain lengths (by up to 10 carbons) than OTC. Monodomain ARO/CYCs have a single ARO/CYC domain, and are often associated with C9-C14 first ring cyclizations; these latter domains belong to a different subfamily in the SRPBCC superfamily. 142
27725 176871 cd08862 SRPBCC_Smu440-like Ligand-binding SRPBCC domain of Streptococcus mutans Smu.440 and related proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Streptococcus mutans Smu.440 and related proteins. This domain belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Streptococcus mutans is a dental pathogen, and the leading cause of dental caries. In this pathogen, the gene encoding Smu.440 is in the same operon as the gene encoding SMU.441, a member of the MarR protein family of transcriptional regulators involved in multiple antibiotic resistance. It has been suggested that SMU.440 is involved in polyketide-like antibiotic resistance. 138
27726 176872 cd08863 SRPBCC_DUF1857 DUF1857, an uncharacterized ligand-binding domain of the SRPBCC domain superfamily. Uncharacterized family of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 141
27727 176873 cd08864 SRPBCC_DUF3074 DUF3074, an uncharacterized ligand-binding domain of the SRPBCC domain superfamily. Uncharacterized family of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 208
27728 176874 cd08865 SRPBCC_10 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 140
27729 176875 cd08866 SRPBCC_11 Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins. 144
27730 176876 cd08867 START_STARD4_5_6-like Lipid-binding START domain of mammalian STARD4, -5, -6, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD4, -5, and -6. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD4 plays an important role in steroidogenesis, trafficking cholesterol into mitochondria. It specifically binds cholesterol, and demonstrates limited binding to another sterol, 7a-hydroxycholesterol. STARD4 and STARD5 are ubiquitously expressed, with highest levels in liver and kidney. STARD5 functions in the kidney within the proximal tubule cells where it is associated with the Endoplasmic Reticulum (ER), and may participate in ER-associated cholesterol transport. It binds cholesterol and 25-hydroxycholesterol. Expression of the gene encoding STARD5 is increased by ER stress, and its mRNA and protein levels are elevated in a type I diabetic mouse model of human diabetic nephropathy. STARD6 is expressed in male germ cells of normal rats, and in the steroidogenic Leydig cells of perinatal hypothyroid testes. It may play a pivotal role in the steroidogenesis as well as in the spermatogenesis of normal rats. STARD6 has also been detected in the rat nervous system, and may participate in neurosteroid synthesis. 206
27731 176877 cd08868 START_STARD1_3_like Cholesterol-binding START domain of mammalian STARD1, -3 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD1 (also known as StAR) and STARD3 (also known as metastatic lymph node 64/MLN64). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. This STARD1-like subfamily has a high affinity for cholesterol. STARD1/StAR can reduce macrophage lipid content and inflammatory status. It plays an essential role in steroidogenic tissues: transferring the steroid precursor, cholesterol, from the outer to the inner mitochondrial membrane, across the aqueous space. Mutations in the gene encoding STARD1/StAR can cause lipid congenital adrenal hyperplasia (CAH), an autosomal recessive disorder characterized by a steroid synthesis deficiency and an accumulation of cholesterol in the adrenal glands and the gonads. STARD3 may function in trafficking endosomal cholesterol to a cytosolic acceptor or membrane. In addition to having a cytoplasmic START cholesterol-binding domain, STARD3 also contains an N-terminal MENTAL cholesterol-binding and protein-protein interaction domain. The MENTAL domain contains transmembrane helices and anchors MLN64 to endosome membranes. The gene encoding STARD3 is overexpressed in about 25% of breast cancers. 208
27732 176878 cd08869 START_RhoGAP C-terminal lipid-binding START domain of mammalian STARD8, -12, -13 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD8 (also known as deleted in liver cancer 3/DLC3, and Arhgap38), STARD12 (also known as DLC-1, Arhgap7, and p122-RhoGAP), and STARD13 (also known as DLC-2, Arhgap37, and SDCCAG13). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. Some, including STARD12 and -13, also have an N-terminal SAM (sterile alpha motif) domain; these have a SAM-RhoGAP-START domain organization. This subfamily is involved in cancer development. A large spectrum of cancers have dysregulated genes encoding these proteins. The precise function of the START domain in this subfamily is unclear. 197
27733 176879 cd08870 START_STARD2_7-like Lipid-binding START domain of mammalian STARD2, -7, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD2 (also known as phosphatidylcholine transfer protein/PC-TP), and STARD7 (also known as gestational trophoblastic tumor 1/GTT1). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD2 is a cytosolic phosphatidylcholine (PtdCho) transfer protein, which traffics PtdCho, the most common class of phospholipids in eukaryotes, between membranes. It represents a minimal START domain structure. STARD2 plays roles in hepatic cholesterol metabolism, in the development of atherosclerosis, and may also have a mitochondrial function. The gene encoding STARD7 is overexpressed in choriocarcinoma. STARD7 appears to be involved in the intracellular trafficking of PtdCho to mitochondria. STARD7 was shown to be surface active and to interact differentially with phospholipid monolayers. It showed a preference for phosphatidylserine, cholesterol, and phosphatidylglycerol. 209
27734 176880 cd08871 START_STARD10-like Lipid-binding START domain of mammalian STARD10 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD10 (also known as CGI-52, PTCP-like, and SDCCAG28). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD10 binds phosphatidylcholine and phosphatidylethanolamine. This protein is widely expressed and is synthesized constitutively in many organs. It may function in the liver in the export of phospholipids into bile. It is concentrated in the sperm flagellum, and may play a role in energy metabolism. In the mammary gland it may participate in the enrichment of lipids in milk, and be a potential marker of differentiation. Its expression is induced in this gland during gestation and lactation. It is overexpressed in mammary tumors from Neu/ErbB2 transgenic mice, in several breast carcinoma cell lines, and in 35% of primary human breast cancers, and may cooperate with c-erbB receptor signaling in breast oncogenesis. It is a potential marker of disease outcome in breast cancer; loss of STARD10 expression in breast cancer strongly predicts an aggressive disease course. The lipid transfer activity of STARD10 is downregulated by phosphorylation of its Ser284 by CK2 (casein kinase 2). 222
27735 176881 cd08872 START_STARD11-like Ceramide-binding START domain of mammalian STARD11 and related domains. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD11 and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD11 can mediate transfer of the natural ceramide isomers, dihydroceramide and phytoceramide, as well as ceramides having C14, C16, C18, and C20 chains. They can also transfer diacylglycerol, but with a lower efficiency. STARD11 is synthesized from two major transcripts: a larger one encoding Goodpasture antigen-binding protein (GPBP)/ceramide transporter long form (CERTL); and a smaller one encoding GPBPdelta26/CERT, which is deleted for 26 amino acids. Both splicing variants mediate ceramide transfer from the ER to the Golgi, in a non-vesicular manner. It is likely that these two carry out different functions in specific sub-cellular locations. These proteins have roles in brain homeostasis and disease processes. GPBP/CERTL exists in multiple isoforms originating from alternative translation initiation sites and post-translational modifications. Goodpasture syndrome is a human disorder caused by antibodies directed against the a3-chain of collagen type IV. GPBP/CERTL binds and phosphorylates this antigen. The human gene encoding STARD11 is referred to as COL4A3BP referring to its collagen binding function. It is unknown whether the ceramide-transfer function of GPBP/CERTL is related to this collagen interaction. The expression of GPBP/CERTL is elevated in these and other spontaneous autoimmune disorders including cutaneous lupus erythematosus, pemphigoid, and lichen planus. GPBP/CERTL contains an N-terminal pleckstrin homology domain (PH), which targets the protein to the Golgi, a middle region containing two serine-rich domains (SR1, SR2), a FFAT (two phenylalanine amino acids in an acidic tract) motif which is involved in endoplasmic reticulum targeting, and this C-terminal START domain. The shorter splicing variant, CERT, lacks the SR2 domain. 235
27736 176882 cd08873 START_STARD14_15-like Lipid-binding START domain of mammalian STARD14, -15, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian brown fat-inducible STARD14 (also known as Acyl-Coenzyme A Thioesterase 11 or ACOT11, BFIT, THEA, THEM1, KIAA0707, and MGC25974), STARD15/ACOT12 (also known as cytoplasmic acetyl-CoA hydrolase/CACH, THEAL, and MGC105114), and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD14/ACOT11 and STARD15/ACOT12 are type II acetyl-CoA thioesterases; they catalyze the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Human STARD14 displays acetyl-CoA thioesterase activity towards medium(C12)- and long(C16)-chain fatty acyl-CoA substrates. Rat CACH hydrolyzes acetyl-CoA to acetate and CoA. In addition to having a START domain, STARD14 and STARD15 each have two tandem copies of the hotdog domain. There are two splice variants of human STARD14, named BFIT1 and BFIT2, which differ in their C-termini. Human BFIT2 is equivalent to mouse mBFIT/Acot11, whose transcription is increased two fold in obesity-resistant mice compared with obesity-prone mice. Human STARD15 may have roles in cholesterol metabolism and in beta-oxidation. 235
27737 176883 cd08874 START_STARD9-like C-terminal START domain of mammalian STARD9, and related domains; lipid binding. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD9 (also known as KIAA1300), and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Some members of this subfamily have N-terminal kinesin motor domains. STARD9 interacts with supervillin, a protein important for efficient cytokinesis, perhaps playing a role in coordinating microtubule motors with actin and myosin II functions at membranes. The human gene encoding STARD9 lies within a target region for LGMD2A, an autosomal recessive form of limb-girdle muscular dystrophy. 205
27738 176884 cd08875 START_ArGLABRA2_like C-terminal lipid-binding START domain of the Arabidopsis homeobox protein GLABRA 2 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of the Arabidopsis homeobox protein GLABRA 2 and related proteins. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Most proteins in this subgroup contain an N-terminal homeobox DNA-binding domain, some contain a leucine zipper. ArGLABRA2 plays a role in the differentiation of hairless epidermal cells of the Arabidopsis root. It acts in a cell-position-dependent manner to suppress root hair formation in those cells. 229
27739 176885 cd08876 START_1 Uncharacterized subgroup of the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain family. Functionally uncharacterized subgroup of the START domain family. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some mammalian members of the START family (STARDs), it is known which lipids bind in this pocket; these include cholesterol (STARD1, -3, -4, and -5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2, -7, and -10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). Mammalian STARDs participate in the control of various cellular processes, including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease. 195
27740 176886 cd08877 START_2 Uncharacterized subgroup of the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain family. Functionally uncharacterized subgroup of the START domain family. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some mammalian members of the START family (STARDs), it is known which lipids bind in this pocket; these include cholesterol (STARD1, -3, -4, and -5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2, -7, and -10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). Mammalian STARDs participate in the control of various cellular processes, including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease. 215
27741 176887 cd08878 RHO_alpha_C_DMO-like C-terminal catalytic domain of the oxygenase alpha subunit of dicamba O-demethylase and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Stenotrophomonas maltophilia dicamba O-demethylase (DMO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and the C-terminal catalytic domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this subgroup include the alpha subunits of carbazole 1,9a-dioxygenase, phthalate dioxygenase, vanillate O-demethylase, Pseudomonas putida 2-oxoquinoline 8-monooxygenase, and Comamonas testosteroni T-2 p-toluenesulfonate dioxygenase. It also includes the C-terminal domain of the lignin biphenyl-specific O-demethylase (LigX) of the 5,5'-dehydrodivanillic acid O- demethylation system of Sphingomonas paucimobilis SYK-6. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 196
27742 176888 cd08879 RHO_alpha_C_AntDO-like C-terminal catalytic domain of the oxygenase alpha subunit of Pseudomonas resinovorans strain CA10 anthranilate 1,2-dioxygenase and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of anthranilate 1,2-dioxygenase (AntDO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and the C-terminal catalytic domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this subgroup include the alpha subunits of AntDO, aniline dioxygenase, Acinetobacter calcoaceticus benzoate 1,2-dioxygenase, 2-halobenzoate 1,2-dioxygenase from Pseudomonas cepacia 2CBS, 2,4,5-trichlorophenoxyacetic acid oxygenase from Pseudomonas cepacia AC1100, 2,4-dichlorophenoxyacetic acid oxygenase from Bradyrhizobium sp. strain HW13, p-cumate 2,3-dioxygenase, and Pseudomonas putida IacC, which may be involved in the catabolism of the plant hormone indole 3-acetic acid. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 237
27743 176889 cd08880 RHO_alpha_C_ahdA1c-like C-terminal catalytic domain of the large/alpha subunit (ahdA1c) of a ring-hydroxylating dioxygenase from Sphingomonas sp. strain P2 and related proteins. C-terminal catalytic domain of the large subunit (ahdA1c) of the AhdA3A4A2cA1c salicylate 1-hydroxylase complex from Sphingomonas sp. strain P2, and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). AhdA3A4A2cA1c is one of three known isofunctional salicylate 1-hydroxylase complexes in strain P2, involved in phenanthrene degradation, which catalyze the monooxygenation of salicylate, the metabolite of phenanthrene degradation, to produce catechol. This complex prefers salicylate over other substituted salicylates; the other two salicylate 1-hydroxylases have different substrate preferences. RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Other oxygenases belonging to this subgroup include the alpha subunits of anthranilate 1,2-dioxygenase from Burkholderia cepacia DBO1, a polycyclic aromatic hydrocarbon dioxygenase from Cycloclasticus sp. strain A5 (PhnA dioxygenase), salicylate-5-hydroxylase from Ralstonia sp. U2, ortho-halobenzoate 1,2-dioxygenase from Pseudomonas aeruginosa strain JB2, and the terephthalate 1,2-dioxygenase system from Delftia tsuruhatensis strain T7. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 222
27744 176890 cd08881 RHO_alpha_C_NDO-like C-terminal catalytic domain of the oxygenase alpha subunit of naphthalene 1,2-dioxygenase (NDO) and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of naphthalene 1,2-dioxygenase (NDO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). This domain binds non-heme Fe(II). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Proteins belonging to this subgroup include the terminal oxygenase alpha subunits of biphenyl dioxygenase, cumene dioxygenase from Pseudomonas fluorescens IP01, ethylbenzene dioxygenase, naphthalene 1,2-dioxygenase, nitrobenzene dioxygenase from Comamonas sp. strain JS765, toluene 2,3-dioxygenase from Pseudomonas putida F1, dioxin dioxygenase of Sphingomonas sp. strain RW1, and the polycyclic aromatic hydrocarbons (PAHs)-degrading ring-hydroxylating dioxygenase from Sphingomonas CHY-1. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 206
27745 176891 cd08882 RHO_alpha_C_MupW-like C-terminal catalytic domain of Pseudomonas fluorescens MupW and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Pseudomonas fluorescens MupW and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. MupW is part of the mupirocin biosynthetic gene cluster in Pseudomonas fluorescens, and may catalyze the oxidation of the 16-methyl group during biosynthesis of this polyketide antibiotic. Mupirocin is a mixture of pseudomonic acids which targets isoleucyl-tRNA synthase and is a strong inhibitor of Gram positive bacterial and mycoplasmal pathogens. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 243
27746 176892 cd08883 RHO_alpha_C_CMO-like C-terminal catalytic domain of plant choline monooxygenase (CMO) and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of plant choline monooxygenase and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Plant choline monooxygenase catalyzes the first step in a two-step oxidation of choline to the osmoprotectant glycine betaine. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 175
27747 176893 cd08884 RHO_alpha_C_GbcA-like C-terminal catalytic domain of GbcA (glycine betaine catabolism A) from Pseudomonas aeruginosa PAO1 and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of GbcA (glycine betaine catabolism A) from Pseudomonas aeruginosa PAO1 and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. GbcA is involved in glycine betaine (GB) catabolism in Pseudomonas aeruginosa; it may remove a methyl group from GB via a dioxygenase mechanism, producing dimethylglycine and formaldehyde. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 205
27748 176894 cd08885 RHO_alpha_C_1 C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This group contains two putative Parvibaculum lavamentivorans (T) DS-1 oxygenases; this organism catabolizes commercial linear alkylbenzenesulfonate surfactant (LAS) and other surfactants, by a pathway involving an undefined 'omega-oxygenation' and beta-oxidation of the LAS side chain. The nature of the LAS-oxygenase is unknown but is likely a multicomponent system. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 190
27749 176895 cd08886 RHO_alpha_C_2 C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 182
27750 176896 cd08887 RHO_alpha_C_3 C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This group contains a putative Parvibaculum lavamentivorans (T) DS-1 oxygenase; this organism catabolizes commercial linear alkylbenzenesulfonate surfactant (LAS) and other surfactants, by a pathway involving an undefined 'omega-oxygenation' and beta-oxidation of the LAS side chain. The nature of the LAS-oxygenase is unknown but is likely a multicomponent system. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. 185
27751 176897 cd08888 SRPBCC_PITPNA-B_like Lipid-binding SRPBCC domain of mammalian PITPNA, -B, and related proteins (Class I PITPs). This subgroup includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class 1 phosphatidylinositol transfer proteins (PITPs), PITPNA/PITPalpha and PITPNB/PITPbeta, Drosophila vibrator, and related proteins. These are single domain proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. In addition, PITPNB transfers sphingomyelin in vitro, with a low affinity. PITPNA is found chiefly in the nucleus and cytoplasm; it is enriched in the brain and predominantly localized in the axons. A reduced expression of PITPNA contributes to the neurodegenerative phenotype of the mouse vibrator mutation. The role of PITPNA in vivo may be to provide PtdIns for localized PI3K-dependent signaling, thereby controlling the polarized extension of axonal processes. PITPNA homozygous null mice die soon after birth from complicated organ failure, including intestinal and hepatic steatosis, hypoglycemia, and spinocerebellar disease. PITPNB is associated with the Golgi and ER, and is highly expressed in the liver. Deletion of the PITPNB gene results in embryonic lethality. The PtdIns and PtdCho exchange activity of PITPNB is required for COPI-mediated retrograde transport from the Golgi to the ER. Drosophila vibrator localizes to the ER, and has an essential role in cytokinesis during mitosis and meiosis. 258
27752 176898 cd08889 SRPBCC_PITPNM1-2_like Lipid-binding SRPBCC domain of mammalian PITPNM1-2 and related proteins (Class IIA PITPs). This subgroup includes an N-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class II phosphatidylinositol transfer proteins (PITPs), PITPNM1/PITPalphaI/Nir2 (PYK2 N-terminal domain-interacting receptor 2) and PITPNM2/PITPalphaII/Nir3, Drosophila RdgB, and related proteins. These are membrane associated multidomain proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Ablation of the mouse gene encoding PITPNM1 results in early embryonic death. PITPNM1 is localized chiefly to the Golgi apparatus, and under certain conditions translocates to the lipid droplets. Targeting to the latter is dependent on a specific threonine residue within the SRPBCC domain. PITPNM1 plays a part in Golgi-mediated transport. It regulates diacylglycerol (DAG) production at the trans-Golgi network (TGN) via the CDP-choline pathway. Drosophila RdgB, the founding member of the PITP family, is implicated in visual and olfactory transduction. RdgB is required for maintenance of ultrastructure in photoreceptors and for sensory transduction. The mouse PITPNM1 gene rescues the phenotype of Drosophila rdgB mutant flies. In addition to the SRPBCC domain, PITPNM1 and -2 contain a Rho-inhibitory domain (Rid), six hydrophobic stretches, a DDHD calcium binding region, and a C-terminal tyrosine kinase Pyk2-binding / HAD-like phosphohydrolase domain. PITPNM1 has a role in regulating cell morphogenesis through its Rho inhibitory domain (Rid). This SRPBCC_PITPNM1-2_like domain model includes the first 52 residues of the 224-residue Rid (Rho-inhibitory domain). 260
27753 176899 cd08890 SRPBCC_PITPNC1_like Lipid-binding SRPBCC domain of mammalian PITPNC1, and related proteins (Class IIB PITPs). This subgroup includes the N-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class IIB phosphatidylinositol transfer protein (PITP), PITPNC1/RdgBbeta, and related proteins. These are metazoan proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Mammalian PITPNC1 contains an amino-terminal SRPBCC PITP-like domain and a short carboxyl-terminal domain. It is a cytoplasmic protein, and is ubiquitously expressed. It can transfer phosphatidylinositol (PtdIns) in vitro with a similar ability to other PITPs. 250
27754 176900 cd08891 SRPBCC_CalC Ligand-binding SRPBCC domain of Micromonospora echinospora CalC and related proteins. This subfamily includes Micromonospora echinospora CalC (MeCalC) and related proteins. These proteins belong to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. MeCalC confers resistance to the enediyne, calicheamicin gamma 1 (CLM). Enediyne antibiotics are antitumor agents. Enediynes have an in vitro and in vivo role as DNA damaging agents; they consist of a DNA recognition unit (e.g., aryltetrasaccharide of CLM), an activating component (e.g., methyl trisulfide of CLM), which promotes cycloaromatization, and the enediyne warhead which cycloaromatizes to a reactive diradical species, resulting in oxidative strand cleavage of the targeted DNA sequence. MeCalC confers resistance to CLM by a self-sacrificing mechanism: the transient enediyne diradical species abstracts a CalC Gly Calpha-hydrogen, thereby quenching the reactive enediyne moiety, and generating a CalC Gly Calpha radical. This radical then reacts with oxygen, leading to oxidative site-specific proteolysis of CalC. This antibiotic-induced proteolysis of CalC results in inactivation of both CalC and the highly reactive diradical species. CalC has also been shown to inactivate two other enediynes, shishijimicin and namenamicin. The crucial Gly of the MeCalC CLM resistance mechanism is contained in a loop (L1) which is displaced when CLM is bound; this Gly is not conserved in this subgroup. 149
27755 176901 cd08892 SRPBCC_Aha1 Putative hydrophobic ligand-binding SRPBCC domain of the Hsp90 co-chaperone Aha1 and related proteins. This subfamily includes the C-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Aha1, and related domains. Proteins in this group belong to the SRPBCC domain superfamily of proteins which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Aha1 is one of several co-chaperones, which regulate the dimeric chaperone Hsp90. Hsp90, Aha1, and other accessory proteins interact in a chaperone cycle driven by ATP binding and hydrolysis. Aha1 promotes dimerization of the N-terminal domains of Hsp90, and stimulates its low intrinsic ATPase activity. One Aha1 molecule binds per Hsp90 dimer. The N- and C- terminal domains of Aha1 cooperatively bind across the dimer interface of Hsp90. The C-terminal domain of Aha1 binds the N-terminal Hsp90 ATPase domain. Aha1 may regulate the dwell time of Hsp90 with client proteins. Aha1 may act as either a negative or positive regulator of chaperone-dependent activation, depending on the client protein; for example, it acts as a negative regulator in the case of Saccharomyces cerevisiae MAL63 MAL-activator, and acts as a positive regulator in the case of glucocorticoid receptor and v-Src kinase. The mechanisms by which these opposing functions are achieved are unclear. Aha1 is upregulated in a number of tumor lines co-incident with the activation of several signaling kinases. 126
27756 176902 cd08893 SRPBCC_CalC_Aha1-like_GntR-HTH Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins; some contain an N-terminal GntR family winged HTH DNA-binding domain. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. Some proteins in this subgroup contain an N-terminal winged helix-turn-helix DNA-binding domain found in the GntR family of proteins which include bacterial transcriptional regulators and their putative homologs from eukaryota and archaea. 136
27757 176903 cd08894 SRPBCC_CalC_Aha1-like_1 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 139
27758 176904 cd08895 SRPBCC_CalC_Aha1-like_2 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 146
27759 176905 cd08896 SRPBCC_CalC_Aha1-like_3 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 146
27760 176906 cd08897 SRPBCC_CalC_Aha1-like_4 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 133
27761 176907 cd08898 SRPBCC_CalC_Aha1-like_5 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 145
27762 176908 cd08899 SRPBCC_CalC_Aha1-like_6 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 157
27763 176909 cd08900 SRPBCC_CalC_Aha1-like_7 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 143
27764 176910 cd08901 SRPBCC_CalC_Aha1-like_8 Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. 136
27765 176911 cd08902 START_STARD4-like Lipid-binding START domain of mammalian STARD4 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD4 and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD4 plays an important role in steroidogenesis, trafficking cholesterol into mitochondria. It specifically binds cholesterol, and demonstrates limited binding to another sterol, 7alpha-hydroxycholesterol. STARD4 is ubiquitously expressed, with highest levels in liver and kidney. 202
27766 176912 cd08903 START_STARD5-like Lipid-binding START domain of mammalian STARD5 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD5, and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD5 is ubiquitously expressed, with highest levels in liver and kidney. STARD5 functions in the kidney within the proximal tubule cells where it is associated with the Endoplasmic Reticulum (ER), and may participate in ER-associated cholesterol transport. It binds cholesterol and 25-hydroxycholesterol. Expression of the gene encoding STARD5 is increased by ER stress, and its mRNA and protein levels are elevated in a type I diabetic mouse model of human diabetic nephropathy. 208
27767 176913 cd08904 START_STARD6-like Lipid-binding START domain of mammalian STARD6 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD6 and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD6 is expressed in male germ cells of normal rats, and in the steroidogenic Leydig cells of perinatal hypothyroid testes. It may play a pivotal role in the steroidogenesis as well as in the spermatogenesis of normal rats. STARD6 has also been detected in the rat nervous system, and may participate in neurosteroid synthesis. 204
27768 176914 cd08905 START_STARD1-like Cholesterol-binding START domain of mammalian STARD1 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD1 (also known as StAR) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD1 has a high affinity for cholesterol. It can reduce macrophage lipid content and inflammatory status. It plays an essential role in steroidogenic tissues: transferring the steroid precursor, cholesterol, from the outer to the inner mitochondrial membrane, across the aqueous space. Mutations in the gene encoding STARD1/StAR can cause lipid congenital adrenal hyperplasia (CAH), an autosomal recessive disorder characterized by a steroid synthesis deficiency and an accumulation of cholesterol in the adrenal glands and the gonads. 209
27769 176915 cd08906 START_STARD3-like Cholesterol-binding START domain of mammalian STARD3 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD3 (also known as metastatic lymph node 64/MLN64) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD3 has a high affinity for cholesterol. It may function in trafficking endosomal cholesterol to a cytosolic acceptor or membrane. In addition to having a cytoplasmic START cholesterol-binding domain, STARD3 also contains an N-terminal MENTAL cholesterol-binding and protein-protein interaction domain. The MENTAL domain contains transmembrane helices and anchors MLN64 to endosome membranes. The gene encoding STARD3 is overexpressed in about 25% of breast cancers. 209
27770 176916 cd08907 START_STARD8-like C-terminal lipid-binding START domain of mammalian STARD8 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD8 (also known as deleted in liver cancer 3/DLC3, and Arhgap38) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. The precise function of the START domain in this subgroup is unclear. 205
27771 176917 cd08908 START_STARD12-like C-terminal lipid-binding START domain of mammalian STARD12 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD12 (also known as DLC-1, Arhgap7, and p122-RhoGAP) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subgroup also have an N-terminal SAM (sterile alpha motif) domain and a RhoGAP domain, and have a SAM-RhoGAP-START domain organization. The precise function of the START domain in this subgroup is unclear. 204
27772 176918 cd08909 START_STARD13-like C-terminal lipid-binding START domain of mammalian STARD13 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD13 (also known as DLC-2, Arhgap37, and SDCCAG13) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. The precise function of the START domain in this subgroup is unclear. 205
27773 176919 cd08910 START_STARD2-like Lipid-binding START domain of mammalian STARD2 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD2 (also known as phosphatidylcholine transfer protein/PC-TP) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD2 is a cytosolic phosphatidylcholine (PtdCho) transfer protein, which traffics PtdCho, the most common class of phospholipids in eukaryotes, between membranes. It represents a minimal START domain structure. STARD2 plays roles in hepatic cholesterol metabolism, in the development of atherosclerosis, and may have a mitochondrial function. 207
27774 176920 cd08911 START_STARD7-like Lipid-binding START domain of mammalian STARD7 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD7 (also known as gestational trophoblastic tumor 1/GTT1). It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. The gene encoding STARD7 is overexpressed in choriocarcinoma. STARD7 appears to be involved in the intracellular trafficking of phosphatidylcholine (PtdCho) to mitochondria. STARD7 was shown to be surface active and to interact differentially with phospholipid monolayers; it showed a preference for phosphatidylserine, cholesterol, and phosphatidylglycerol. 207
27775 176921 cd08913 START_STARD14-like Lipid-binding START domain of mammalian STARD14 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian brown fat-inducible STARD14 (also known as Acyl-Coenzyme A Thioesterase 11 or ACOT11, BFIT, THEA, THEM1, KIAA0707, and MGC25974) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD14/ACOT11 is a type II acetyl-CoA thioesterase; it catalyzes the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Human STARD14 displays acetyl-CoA thioesterase activity towards medium(C12)- and long(C16)-chain fatty acyl-CoA substrates. In addition to having a START domain, most proteins in this subgroup have two tandem copies of the hotdog domain. There are two splice variants of human STARD14, named BFIT1 and BFIT2, which differ in their C-termini. Human BFIT2 is equivalent to mouse mBFIT/Acot11, whose transcription is increased twofold in obesity-resistant mice compared with obesity-prone mice. 240
27776 176922 cd08914 START_STARD15-like Lipid-binding START domain of mammalian STARD15 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD15/ACOT12 (also known as cytoplasmic acetyl-CoA hydrolase/CACH, THEAL, and MGC105114) and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD15/ACOT12 is a type II acetyl-CoA thioesterase; it catalyzes the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Rat CACH hydrolyzes acetyl-CoA to acetate and CoA. In addition to having a START domain, most proteins in this subgroup have two tandem copies of the hotdog domain. Human STARD15/ACOT12 may have roles in cholesterol metabolism and in beta-oxidation. 236
27777 185746 cd08915 V_Alix_like Protein-interacting V-domain of mammalian Alix and related domains. This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-I) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4's catalytic domain to stimulate its deubiquitination activity. 
Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 into proximity with Rim101 (a YPxL-containing transcription factor), and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain. 342
27778 381257 cd08916 TrHb3_P Truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 3 (P). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb3s include Campylobacter jejuni Ctb, encoded by Cj0465c, which may play a role in moderating O2 flux within C. jejuni. 116
27779 381258 cd08917 TrHb2_O Truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s include the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins. Other TrHb2's include Bacillus subtilis trHb (Bs-trHb) which exhibits an extremely high oxygen affinity, and Pseudoalteromonas haloplanktis PhHbO (encoded by the PSHAa0030 gene) which appears to be involved in oxidative and nitrosative stress resistance. 116
27780 381259 cd08919 PBP-like Phycobiliproteins (PBPs) and related proteins. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are composed of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up of allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This family also contains allophycocyanin-like (Apl) proteins, which conserve the residues critical for chromophore interactions, but may not maintain the proper alpha-beta subunit interactions and tertiary structure of PBPs. The genes encoding the Apl proteins cluster with light-responsive regulatory components, so these may have photoresponsive regulatory role(s). Included in this family is the PBP-like domain of the core-membrane linker polypeptide (LCM). The LCM serves both as a terminal energy acceptor and as a linker polypeptide. Its single phycocyanobilin (PCB) chromophore is one of two terminal energy transmitters, and transfers excitations from the hundreds of chromophores of the PBS to the RCs. This family also includes some proteins which have glutathione-S-transferases (GST) domains N-terminal to this PBP-like domain. 153
27781 271272 cd08920 Ngb Neuroglobins. The Ngb described in this subfamily is a hexacoordinated heme globin chiefly expressed in neurons of the brain and retina. In the human brain, it is highly expressed in the hypothalamus, amygdala, and in the pontine tegmental nuclei. It affords protection of brain neurons from ischemia and hypoxia. In rats, it plays a role in the neuroprotection of limb ischemic preconditioning (LIP). It plays roles as: a sensor of oxygen levels; a store or reservoir for oxygen; a facilitator for oxygen transport; a regulator of ROS; and a scavenger of nitric oxide. It also functions in the protection against apoptosis and in sleep regulation. This subgroup contains Ngb from mammalian and non-mammalian vertebrates, including fish, amphibians and reptiles; the functionally pentacoordinated acoelomorph Symsagittifera roscoffensis Ngb does not belong to this subgroup. 148
27782 381260 cd08922 FHb-globin Globin domain of flavohemoglobins (flavoHbs). FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses. FlavoHb expression affects Aspergillus nidulans sexual development and mycotoxin production, and Dictyostelium discoideum development. This family also includes some single-domain globins (SDgbs). 140
27783 381261 cd08923 class1-2_nsHbs_Lbs Class1 nonsymbiotic hemoglobins (nsHbs), class II nsHbs, leghemoglobins (Lbs), and related proteins. Class1 nsHbs include the dimeric hexacoordinate Trema tomentosa nsHb and the dimeric hexacoordinate nsHb from monocot barley. Also belonging to this family is ParaHb, a dimeric pentacoordinate Hb from the root nodules of Parasponia andersonii, a non-legume capable of symbiotic nitrogen fixation. ParaHb is unusual in that it has different heme redox potentials for each subunit; it may have evolved from class1 nsHbs. Lbs are pentacoordinate, and facilitate the diffusion of O2 to the respiring Rhizobium bacteroids within root nodules. They may have evolved from class 2 nonsymbiotic hemoglobins (class2 nsHb). 147
27784 271275 cd08924 Cygb Cytoglobin and related globins. Cygb is a hexacoordinated heme-containing protein, able to bind O2, NO and carbon monoxide. It has both nitric oxide dioxygenase and lipid peroxidase activities, and potentially participates in the maintenance of normal phenotype by implementing a homeostatic effect, to counteract stress conditions imposed on a cell. Cygb is implicated in multiple human pathologies: it is up-regulated in fibrosis and neurodegenerative disorders, and down-regulated in multiple cancer types, and may have a tumor suppressor role. It is expressed ubiquitously across a broad range of vertebrate organs including liver, heart, brain, lung, retina, and gut. In the human brain, it was detected at high levels in the habenula, hypothalamus, thalamus, hippocampus and pontine tegmental nuclei, detected at a low level in the cerebral cortex, and undetected in the cerebellar cortex. 153
27785 381262 cd08925 Hb-beta-like Hemoglobin beta, gamma, delta, epsilon, and related Hb subunits. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T(tense state) Hb having low oxygen affinity, and oxygenated/liganded/R(relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2) which, in normal individuals, is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary hemoglobin throughout most of gestation. These Hb types have differences in O2 affinity and in their interactions with allosteric effectors. 139
27786 271277 cd08926 Mb Animal Myoglobins. Myoglobin (Mb) is a monomeric pentacoordinate heme-bound globin protein whose expression has long been considered limited to cardiomyocytes and striated skeletal muscle cells; however, it has recently been found localized in a wide variety of tissues including smooth muscle cells. As a physiological catalyst, it can modulate reactive oxygen species levels, facilitate oxygen diffusion within the cell, and scavenge or generate NO depending on oxygen tensions within the cell. Through its NO dioxygenase and nitrite reductase activities, Mb regulates mitochondrial function in energy-demanding tissues. 148
27787 381263 cd08927 Hb-alpha-like Hemoglobin alpha, zeta, mu, theta, and related Hb subunits. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T(tense state) Hb having low oxygen affinity, and oxygenated/liganded/R(relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2) which, in normal individuals, is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary hemoglobin throughout most of gestation. These Hb types have differences in O2 affinity and in their interactions with allosteric effectors. 140
27788 187633 cd08928 KR_fFAS_like_SDR_c_like ketoacyl reductase (KR) domain of fungal-type fatty acid synthase (fFAS)-like, classical (c)-like SDRs. KR domain of FAS, including the fungal-type multidomain FAS alpha chain, and the single domain daunorubicin C-13 ketoreductase. Fungal-type FAS is a heterododecameric FAS composed of alpha and beta multifunctional polypeptide chains. The KR, an SDR family member, is located centrally in the alpha chain. KR catalyzes the NADP-dependent reduction of ketoacyl-ACP to hydroxyacyl-ACP. KR shares the critical active site Tyr of the classical SDR and has partial identity of the active site tetrad, but the upstream Asn is replaced in KR by Met. As in other SDRs, there is a glycine rich NAD(P)-binding motif, but the pattern found in KR does not match the classical SDRs, and is not strictly conserved within this group. Daunorubicin is a clinically important therapeutic compound used in some cancer treatments. Single domain daunorubicin C-13 ketoreductase is a member of the classical SDR family with a canonical glycine-rich NAD(P)-binding motif, but lacking a complete match to the active site tetrad characteristic of this group. The critical Tyr, plus the Lys and upstream Asn are present, but the catalytic Ser is replaced, generally by Gln. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. 
These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 248
27789 187634 cd08929 SDR_c4 classical (c) SDR, subgroup 4. This subgroup has a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 226
27790 187635 cd08930 SDR_c8 classical (c) SDR, subgroup 8. This subgroup has a fairly well conserved active site tetrad and domain size of the classical SDRs, but has an atypical NAD-binding motif ([ST]G[GA]XGXXG). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
27791 187636 cd08931 SDR_c9 classical (c) SDR, subgroup 9. This subgroup has the canonical active site tetrad and NAD-binding motif of the classical SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 227
27792 212493 cd08932 HetN_like_SDR_c HetN oxidoreductase-like, classical (c) SDR. This subgroup includes Anabaena sp. strain PCC 7120 HetN, a putative oxidoreductase involved in heterocyst differentiation, and related proteins. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 223
27793 187638 cd08933 RDH_SDR_c retinal dehydrogenase-like, classical (c) SDR. These classical SDRs include members identified as retinol dehydrogenases, which convert retinol to retinal, a property that overlaps with 17betaHSD activity. 17beta-dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the short-chain dehydrogenases/reductase family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 261
27794 187639 cd08934 CAD_SDR_c clavulanic acid dehydrogenase (CAD), classical (c) SDR. CAD catalyzes the NADP-dependent reduction of clavulanate-9-aldehyde to clavulanic acid, a beta-lactamase inhibitor. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 243
27795 187640 cd08935 mannonate_red_SDR_c putative D-mannonate oxidoreductase, classical (c) SDR. D-mannonate oxidoreductase catalyzes the NAD-dependent interconversion of D-mannonate and D-fructuronate. This subgroup includes Bacillus subtilis UxuB/YjmF, a putative D-mannonate oxidoreductase; the B. subtilis UxuB gene is part of a putative ten-gene operon (the Yjm operon) involved in hexuronate catabolism. Escherichia coli UxuB does not belong to this subgroup. This subgroup has a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 271
27796 187641 cd08936 CR_SDR_c Porcine peroxisomal carbonyl reductase like, classical (c) SDR. This subgroup contains porcine peroxisomal carbonyl reductase and similar proteins. The porcine enzyme efficiently reduces retinals. This subgroup also includes human dehydrogenase/reductase (SDR family) member 4 (DHRS4), and human DHRS4L1. DHRS4 is a peroxisomal enzyme with 3beta-hydroxysteroid dehydrogenase activity; it catalyzes the reduction of 3-keto-C19/C21-steroids into 3beta-hydroxysteroids more efficiently than it does the retinal reduction. The human DHRS4 gene cluster contains DHRS4, DHRS4L2 and DHRS4L1. DHRS4L2 and DHRS4L1 are paralogs of DHRS4, DHRS4L2 being the most recent member. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 256
27797 187642 cd08937 DHB_DH-like_SDR_c 1,6-dihydroxycyclohexa-2,4-diene-1-carboxylate dehydrogenase (DHB DH)-like, classical (c) SDR. DHB DH (aka 1,2-dihydroxycyclohexa-3,5-diene-1-carboxylate dehydrogenase) catalyzes the NAD-dependent conversion of 1,2-dihydroxycyclohexa-3,4-diene carboxylate to a catechol. This subgroup also contains Pseudomonas putida F1 CmtB, 2,3-dihydroxy-2,3-dihydro-p-cumate dehydrogenase, the second enzyme in the pathway for p-cumate catabolism. This subgroup shares the glycine-rich NAD-binding motif of the classical SDRs and shares the same catalytic triad; however, the upstream Asn implicated in cofactor binding or catalysis in other SDRs is generally substituted by a Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 256
27798 187643 cd08939 KDSR-like_SDR_c 3-ketodihydrosphingosine reductase (KDSR) and related proteins, classical (c) SDR. These proteins include members identified as KDSR, ribitol type dehydrogenase, and others. The group shows strong conservation of the active site tetrad and glycine rich NAD-binding motif of the classical SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 239
27799 187644 cd08940 HBDH_SDR_c d-3-hydroxybutyrate dehydrogenase (HBDH), classical (c) SDRs. HBDH, an NAD+-dependent enzyme, catalyzes the interconversion of D-3-hydroxybutyrate and acetoacetate. It is a classical SDR, with the canonical NAD-binding motif and active site tetrad. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 258
27800 187645 cd08941 3KS_SDR_c 3-keto steroid reductase, classical (c) SDRs. 3-keto steroid reductase (in concert with other enzymes) catalyzes NADP-dependent sterol C-4 demethylation, as part of steroid biosynthesis. 3-keto reductase is a classical SDR, with a well conserved canonical active site tetrad and fairly well conserved characteristic NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 290
27801 187646 cd08942 RhlG_SDR_c RhlG and related beta-ketoacyl reductases, classical (c) SDRs. Pseudomonas aeruginosa RhlG is an SDR-family beta-ketoacyl reductase involved in rhamnolipid biosynthesis. RhlG is similar to but distinct from the FabG family of beta-ketoacyl-acyl carrier protein (ACP) reductases of type II fatty acid synthesis. RhlG and related proteins are classical SDRs, with a canonical active site tetrad and glycine-rich NAD(P)-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
27802 187647 cd08943 R1PA_ADH_SDR_c rhamnulose-1-phosphate aldolase/alcohol dehydrogenase, classical (c) SDRs. This family has bifunctional proteins with an N-terminal aldolase and a C-terminal classical SDR domain. One member is identified as a rhamnulose-1-phosphate aldolase/alcohol dehydrogenase. The SDR domain has a canonical SDR glycine-rich NAD(P) binding motif and a match to the characteristic active site triad. However, it lacks an upstream active site Asn typical of SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 250
27803 187648 cd08944 SDR_c12 classical (c) SDR, subgroup 12. These are classical SDRs, with the canonical active site tetrad and glycine-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 246
27804 187649 cd08945 PKR_SDR_c Polyketide ketoreductase, classical (c) SDR. Polyketide ketoreductase (KR) is a classical SDR with a characteristic NAD-binding pattern and active site tetrad. Aromatic polyketides include various aromatic compounds of pharmaceutical interest. Polyketide KR, part of the type II polyketide synthase (PKS) complex, is comprised of stand-alone domains that resemble the domains found in fatty acid synthase and multidomain type I PKS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 258
27805 212494 cd08946 SDR_e extended (e) SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 200
27806 187651 cd08947 NmrA_TMR_like_SDR_a NmrA (a transcriptional regulator), HSCARG (an NADPH sensor), and triphenylmethane reductase (TMR) like proteins, atypical (a) SDRs. Atypical SDRs belonging to this subgroup include NmrA, HSCARG, and TMR; these proteins bind NAD(P) but lack the usual catalytic residues of the SDRs. Atypical SDRs are distinct from classical SDRs. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. TMR, an NADP-binding protein, lacks the active site residues of the SDRs but has a glycine rich NAD(P)-binding motif that matches the extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NmrA (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 224
27807 187652 cd08948 5beta-POR_like_SDR_a progesterone 5-beta-reductase-like proteins (5beta-POR), atypical (a) SDRs. 5beta-POR catalyzes the reduction of progesterone to 5beta-pregnane-3,20-dione in Digitalis plants. This subgroup of atypical-extended SDRs shares the structure of an extended SDR, but has a different glycine-rich nucleotide binding motif (GXXGXXG) and lacks the YXXXK active site motif of classical and extended SDRs. Tyr-179 and Lys-147 are present in the active site, but not in the usual SDR configuration. Given these differences, it has been proposed that this subfamily represents a new SDR class. Other atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NmrA (a negative transcriptional regulator of various fungi), phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 308
27808 187653 cd08950 KR_fFAS_SDR_c_like ketoacyl reductase (KR) domain of fungal-type fatty acid synthase (fFAS), classical (c)-like SDRs. KR domain of fungal-type fatty acid synthase (FAS), type I. Fungal-type FAS is a heterododecameric FAS composed of alpha and beta multifunctional polypeptide chains. The KR, an SDR family member, is located centrally in the alpha chain. KR catalyzes the NADP-dependent reduction of ketoacyl-ACP to hydroxyacyl-ACP. KR shares the critical active site Tyr of the Classical SDR and has partial identity of the active site tetrad, but the upstream Asn is replaced in KR by Met. As in other SDRs, there is a glycine rich NAD-binding motif, but the pattern found in KR does not match the classical SDRs, and is not strictly conserved within this group. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 259
27809 187654 cd08951 DR_C-13_KR_SDR_c_like daunorubicin C-13 ketoreductase (KR), classical (c)-like SDRs. Daunorubicin is a clinically important therapeutic compound used in some cancer treatments. Daunorubicin C-13 ketoreductase is a member of the classical SDR family with a canonical glycine-rich NAD(P)-binding motif, but lacking a complete match to the active site tetrad characteristic of this group. The critical Tyr, plus the Lys and upstream Asn are present, but the catalytic Ser is replaced, generally by Gln. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 260
27810 187655 cd08952 KR_1_SDR_x ketoreductase (KR), subgroup 1, complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes KR domains found in many multidomain PKSs, including six of seven Sorangium cellulosum PKSs (encoded by spiDEFGHIJ) which participate in the synthesis of the polyketide scaffold of the cytotoxic spiroketal polyketide spirangien. These seven PKSs have either a single PKS module (SpiF), two PKS modules (SpiD, -E, -I, -J), or three PKS modules (SpiG, -H). This subfamily includes the single KR domain of SpiF, the first KR domains of SpiE, -G, -H, -I, and -J, the third KR domain of SpiG, and the second KR domain of SpiH. The second KR domains of SpiE, -G, -I, and -J, and the KR domains of SpiD, belong to a different KR_FAS_SDR subfamily.
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues.
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 480
27811 187656 cd08953 KR_2_SDR_x ketoreductase (KR), subgroup 2, complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes both KR domains of the Bacillus subtilis PksJ, PksL, and PksM, and all three KR domains of PksN, components of the megacomplex bacillaene synthase, which synthesizes the antibiotic bacillaene. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns.
These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 436
27812 187657 cd08954 KR_1_FAS_SDR_x beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 1, complex (x) SDRs. NADP-dependent KR domain of the multidomain type I FAS, a complex SDR family. This subfamily also includes proteins identified as polyketide synthase (PKS), a protein with related modular protein architecture and similar function. It includes the KR domains of mammalian and chicken FAS, and Dictyostelium discoideum putative polyketide synthases (PKSs). These KR domains contain two subdomains, each of which is related to SDR Rossmann fold domains. However, while the C-terminal subdomain has an active site similar to the other SDRs and an NADP-binding capability, the N-terminal SDR-like subdomain is truncated and lacks these functions, serving a supportive structural role. In some instances, such as porcine FAS, an enoyl reductase (a Rossmann fold NAD-binding domain of the medium-chain dehydrogenase/reductase, MDR family) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-ketoacyl reductase (KR), forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-enoyl reductase (ER); this KR and ER are members of the SDR family. This KR subfamily has an active site tetrad with a similar 3D orientation compared to archetypical SDRs, but the active site Lys and Asn residue positions are swapped. The characteristic NADP-binding is typical of the multidomain complex SDRs, with a GGXGXXG NADP binding motif.
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues.
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 452
27813 187658 cd08955 KR_2_FAS_SDR_x beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 2, complex (x). Ketoreductase, a module of the multidomain polyketide synthase, has 2 subdomains, each corresponding to a short-chain dehydrogenase/reductase (SDR) family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. In some instances, as in porcine FAS, an enoyl reductase (a Rossmann fold NAD binding domain of the MDR family) module is inserted between the sub-domains. The active site resembles that of typical SDRs, except that the usual positions of the catalytic asparagine and tyrosine are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular polyketide synthases are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) fatty acid synthase. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-ketoacyl reductase (KR), forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-enoyl reductase (ER).
Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes the KR domain of the Lyngbya majuscula JamJ, -K, and -L which are encoded on the jam gene cluster and are involved in the synthesis of the jamaicamides (neurotoxins); Lyngbya majuscula JamP belongs to a different KR_FAS_SDR_x subfamily. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 376
27814 187659 cd08956 KR_3_FAS_SDR_x beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 3, complex (x). Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. In some instances, such as porcine FAS, an enoyl reductase (ER) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-KR, forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-ER. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module.
This subfamily includes KR domains found in many multidomain PKSs, including six of seven Sorangium cellulosum PKSs (encoded by spiDEFGHIJ) which participate in the synthesis of the polyketide scaffold of the cytotoxic spiroketal polyketide spirangien. These seven PKSs have either a single PKS module (SpiF), two PKS modules (SpiD, -E, -I, -J), or three PKS modules (SpiG, -H). This subfamily includes the second KR domains of SpiE, -G, -I, and -J, both KR domains of SpiD, and the third KR domain of SpiH. The single KR domain of SpiF, the first and second KR domains of SpiH, the first KR domains of SpiE, -G, -I, and -J, and the third KR domain of SpiG, belong to a different KR_FAS_SDR subfamily. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity.
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 448
27815 187660 cd08957 WbmH_like_SDR_e Bordetella bronchiseptica enzymes WbmH and WbmG-like, extended (e) SDRs. Bordetella bronchiseptica enzymes WbmH and WbmG, and related proteins. This subgroup exhibits the active site tetrad and NAD-binding motif of the extended SDR family. It has been proposed that the active site in Bordetella WbmG and WbmH cannot function as an epimerase, and that it plays a role in the O-antigen synthesis pathway from UDP-2,3-diacetamido-2,3-dideoxy-l-galacturonic acid. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 307
27816 187661 cd08958 FR_SDR_e flavonoid reductase (FR), extended (e) SDRs. This subgroup contains FRs of the extended SDR-type and related proteins. These FRs act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites; they have the characteristic active site triad of the SDRs (though not the upstream active site Asn) and an NADP-binding motif that is very similar to the typical extended SDR motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 293
27817 350084 cd08959 ArfGap_ArfGap1_like ARF1 GTPase-activating protein 1-like. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex), which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 have been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motifs. 115
27818 185752 cd08961 GH64-TLP-SF glycoside hydrolase family 64 (beta-1,3-glucanases which produce specific pentasaccharide oligomers) and thaumatin-like proteins. This superfamily includes glycoside hydrolases of family 64 (GH64); these are mostly bacterial beta-1,3-glucanases which cleave long-chain polysaccharide beta-1,3-glucans into specific pentasaccharide oligomers and are implicated in fungal cell wall degradation. Also included in this superfamily are thaumatin, the sweet-tasting protein from the African berry Thaumatococcus daniellii, and thaumatin-like proteins (TLPs) which are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Like GH64s, some TLPs also hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5); their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. Several members of the plant TLP family have been reported as food allergens from fruits, and pollen allergens from conifers. Streptomyces matensis laminaripentaose-producing beta-1,3-glucanase (GH64-LPHase) and TLPs have in common a core N-terminal barrel domain (domain I) composed of 10 beta-strands, two coming from the C-terminal region of the protein. In TLPs, this core domain is flanked by two shorter domains (domains II and III). Small TLPs, such as Triticum aestivum thaumatin-like xylanase inhibitor, have a deletion in the third domain (domain II). GH64-LPHase has a second C-terminal domain which corresponds positionally to, but is much larger than, domain III of TLP. GH64-LPHase and TLPs are described as crescent-fold structures. Critical functional residues common to GH64-LPHase and TLPs are a Glu and an Asp residue. LPHase has an electronegative, substrate-binding cleft and the aforementioned conserved Glu and Asp residues are the catalytic residues essential for beta-1,3-glucan cleavage.
In TLPs, these residues are two of the four conserved residues which contribute to the strong electronegative character of the cleft which is associated with the antifungal activity of TLPs. 153
27819 199206 cd08962 GatD GatD subunit of archaeal Glu-tRNA amidotransferase. GatD is involved in the alternative synthesis of Gln-tRNA(Gln) in archaea via the transamidation of incorrectly charged Glu-tRNA(Gln). GatD is active as a dimer, and it provides the amino group required for this reaction. GatD is related to bacterial L-asparaginases (amidohydrolases), which catalyze the hydrolysis of asparagine to aspartic acid and ammonia. This CD spans both the L-asparaginase_like domain and an N-terminal supplementary domain. 402
27820 199207 cd08963 L-asparaginase_I Type I (cytosolic) bacterial L-asparaginase. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases. This model represents type I L-asparaginases, which are highly specific for asparagine and localized in the cytosol. Type I L-asparaginase acts as a dimer. A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. One example of an enzyme with no L-glutaminase activity is the type I L-asparaginase from Wolinella succinogenes. 316
27821 199208 cd08964 L-asparaginase_II Type II (periplasmic) bacterial L-asparaginase. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases. This model represents type II L-asparaginases, which tend to be highly specific for asparagine and localized to the periplasm. They are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL), but not without severe side effects. Tumor cells appear to have a heightened dependence on exogenous L-aspartate, and depleting their surroundings of L-aspartate may starve cancerous ALL cells. Type II L-asparaginase acts as a tetramer, which is actually a dimer of two tightly bound dimers. A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. 319
27822 176799 cd08965 EcNei-like_N N-terminal domain of Escherichia coli Nei/endonuclease VIII and related DNA glycosylases. This family contains the N-terminal domain of proteobacteria Nei and related DNA glycosylases. It includes Escherichia coli Nei, and belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. Escherichia coli Nei has been well studied, it is a DNA glycosylase/AP lyase that excises damaged pyrimidines, including 5-hydroxycytosine, 5-hydroxyuracil, and uracil glycol. In addition to this EcNei-like_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a canonical zinc-finger motif. 115
27823 176800 cd08966 EcFpg-like_N N-terminal domain of Escherichia coli Fpg1/MutM and related bacterial DNA glycosylases. This family contains the N-terminal domain of Escherichia coli Fpg1/MutM and related bacterial DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. Escherichia coli Fpg mainly recognizes and excises damaged purines such as 8-oxo-7,8-dihydroguanine (8-oxoG) and 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG). It is bifunctional, having both a DNA glycosylase (recognition activity) and an AP lyase activity. In addition to this EcFpg-like_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif, which also contribute residues to the active site. 120
27824 176801 cd08967 MeNeil1_N N-terminal domain of metazoan Nei-like glycosylase 1 (NEIL1). This family contains the N-terminal domain of metazoan NEIL1. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. NEIL1 recognizes the oxidized pyrimidines 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 4,6-diamino- 5-formamidopyrimidine (FapyA), thymine glycol (Tg) and 5-hydroxyuracil (5-OHU). However, even though it has weak activity on 8-oxo-7,8-dihydroguanine (8-oxoG), it does show strong preference for the products of its further oxidation: spiroiminodihydantoin and guanidinohydantoin. In addition to this MeNeil1_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zincless finger motif. This characteristic "zincless finger" motif, is a structural equivalent of the zinc finger common to other members of the Fpg/Nei family. Neil1 is one of three homologs found in eukaryotes and its lineage extends back as far as early metazoans. 131
27825 176802 cd08968 MeNeil2_N N-terminal domain of metazoan Nei-like glycosylase 2 (NEIL2). This family contains the N-terminal domain of the metazoan protein Neil2. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. NEIL2 repairs 5-hydroxyuracil (5-OHU) and other oxidized derivatives of cytosine, but it shows preference for DNA bubble structures. In addition to this MeNeil2_N domain, NEIL2 contains a helix-two turn-helix (H2TH) domain and a characteristic CHCC zinc finger motif. Neil2 is one of three homologs found in eukaryotes. 126
27826 176803 cd08969 MeNeil3_N N-terminal domain of metazoan Nei-like glycosylase 3 (NEIL3). This family contains the N-terminal domain of the Metazoan Neil3. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. In contrast, mouse NEIL3 (MmuNEIL3) forms a Schiff base intermediate via its N-terminal valine. The latter is a functional DNA glycosylase in vitro and in vivo. MmuNEIL3 prefers lesions in single-stranded DNA and in bubble structures. In duplex DNA, it recognizes the oxidized purines spiroiminodihydantoin (Sp), guanidinohydantoin (Gh), 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 4,6-diamino-5-formamidopyrimidine (FapyA), but not 8-oxo-7,8-dihydroguanine (8-oxoG). Since the expression of the MmuNeil3 glycosylase domain (MmuNeil3delta324) reduces both the high spontaneous mutation frequency and the FapyG level in an Escherichia coli mutant lacking Fpg, Nei and MutY glycosylase activities, NEIL3 may play a role in repairing FapyG in vivo. In addition to this MeNeil3_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc finger motif, plus a characteristic C-terminal extension that contains additional zinc fingers. Neil3 is one of three homologs found in eukaryotes. 140
27827 176804 cd08970 AcNei1_N N-terminal domain of the actinomycetal Nei1 and related DNA glycosylases. This family contains the N-terminal domain of the actinomycetal Nei1 and related DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This family contains mostly actinomycetes and includes Mycobacterium tuberculosis Nei1 (MtuNei1). MtuNei1 recognizes oxidized pyrimidines such as thymine glycol (Tg) and 5,6-dihydrouracil on both double stranded and single stranded DNA, it has a strong preference for the 5R isomer of Tg. In addition to this domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif. 110
27828 176805 cd08971 AcNei2_N N-terminal domain of the actinomycetal Nei2 and related DNA glycosylases. This family contains the N-terminal domain of the actinomycetal Nei2 and related DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This family contains mostly actinomycetes and includes Mycobacterium tuberculosis Nei2 (MtuNei2). Complementation experiments in repair-deficient Escherichia coli (fpg mutY nei triple and nei nth double mutants), support that MtuNei2 is functionally active in vivo and recognizes both guanine and cytosine oxidation products. In addition to this AcNei2_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif. 114
27829 176806 cd08972 PF_Nei_N N-terminal domain of the plant and fungal Nei and related proteins. This family contains the N-terminal domain of plant and Fungi Nei and related proteins. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. The plant and fungal FpgNei glycosylases prefer the oxidized pyrimidines spiroiminodihydantoin (Sp), guanidinohydantoin (Gh) over 8-oxoguanine in double stranded oligonucleotides and also show weak activity on single stranded DNA. In addition to this domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a characteristic zincless finger motif. They share a common ancestor not shared with other eukaryotic members of the FpgNei family. 137
27830 176807 cd08973 BaFpgNei_N_1 Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg or Nei belong to this family. In addition to this BaFpgNei_N_1 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif. 122
27831 176808 cd08974 BaFpgNei_N_2 Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines, and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg or Nei belong to this family. In addition to this BaFpgNei_N_2 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain. Most also contain a zinc-finger motif. 98
27832 176809 cd08975 BaFpgNei_N_3 Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. One exception is mouse Nei-like glycosylase 3 (Neil3) which forms a Schiff base intermediate via its N-terminal valine. In this family the N-terminal proline is replaced by an isoleucine or valine. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg or Nei belong to this family. In addition to this BaFpgNei_N_3 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif. 117
27833 176810 cd08976 BaFpgNei_N_4 Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg or Nei belong to this family. In addition to this BaFpgNei_N_4 domain, most enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif. 117
27834 185760 cd08977 SusD starch binding outer membrane protein SusD. SusD-like proteins from Bacteroidetes, members of the human distal gut microbiota, are part of the starch utilization system (Sus). Sus is one of the large clusters of glycosyl hydrolases, called polysaccharide utilization loci (PULs), which play an important role in polysaccharide recognition and uptake, and it is needed for growth on amylose, amylopectin, pullulan, and maltooligosaccharides. SusD, together with SusC, a predicted beta-barrel porin, forms the minimum outer-membrane starch-binding complex. The adult human distal gut microbiota is essential for digestion of a large variety of dietary polysaccharides, for which humans lack the necessary glycosyl hydrolases. 359
27835 350092 cd08978 GH_F Glycosyl hydrolase families 43 and 62 form CAZY clan GH-F. This glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) includes family 43 (GH43) and 62 (GH62). GH43 includes enzymes with beta-xylosidase (EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanases (beta-xylanases) and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. GH62 includes enzymes characterized as arabinofuranosidases (alpha-L-arabinofuranosidases; EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose side chains from xylans. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. GH62 are also predicted to be inverting enzymes. A common structural feature of both, GH43 and GH62 enzymes, is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 251
27836 350093 cd08979 GH_J Glycosyl hydrolase families 32 and 68, which form the clan GH-J. This glycosyl hydrolase family clan J (according to carbohydrate-active enzymes database (CAZY)) includes family 32 (GH32) and 68 (GH68). GH32 enzymes include invertase (EC 3.2.1.26) and other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). The GH68 family consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10, also known as beta-D-fructofuranosyl transferase), beta-fructofuranosidase (EC 3.2.1.26) and inulosucrase (EC 2.4.1.9). GH32 and GH68 family enzymes are retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) and catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 292
27837 350094 cd08980 GH43_LbAraf43-like Glycosyl hydrolase family 43 proteins such as Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities. In addition to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 276
27838 350095 cd08981 GH43_Bt1873-like Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron BT_1873. This glycosyl hydrolase family 43 (GH43) subfamily includes Bacteroides thetaiotaomicron VPI-5482 endo-arabinase (Bt1873;BT_1873), as well as uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the GH43 enzymes in this family may display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 289
27839 350096 cd08982 GH43-like Glycosyl hydrolase family 43 protein; uncharacterized. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 308
27840 350097 cd08983 GH43_Bt3655-like Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 arabinofuranosidase Bt3655. This glycosyl hydrolase family 43 (GH43)-like family includes the characterized arabinofuranosidases (EC 3.2.1.55): Bacteroides thetaiotaomicron VPI-5482 (Bt3655;BT_3655) and Penicillium chrysogenum 31B Abf43B, as well as Bifidobacterium adolescentis ATCC 15703 beta-xylosidase (EC 3.2.1.37) BAD_1527. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 includes enzymes with beta-xylosidase (EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanases (beta-xylanases) and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 262
27841 350098 cd08984 GH43-like Glycosyl hydrolase family 43. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 291
27842 350099 cd08985 GH43_CtGH43-like Glycosyl hydrolase family 43 protein such as Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 273
27843 350100 cd08986 GH43-like Glycosyl hydrolase family 43 protein; uncharacterized. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 257
27844 350101 cd08987 GH62 Glycosyl hydrolase family 62, characterized arabinofuranosidases. The glycosyl hydrolase family 62 (GH62) includes eukaryotic (mostly fungal) and prokaryotic enzymes which are characterized arabinofuranosidases (alpha-L-arabinofuranosidases; EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose side chains from xylans. These enzymes show significantly different substrate preference with rather low specific activity towards natural substrates and differ in catalytic efficiency. They do not act on xylose moieties in xylan that are adorned with an arabinose side chain at both O2 and O3 positions, nor do they display any non-specific arabinofuranosidase activity. The synergistic action in biomass degradation makes GH62 promising candidates for biotechnological improvements of biofuel production and in various biorefinery applications. These enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. 304
27845 350102 cd08988 GH43_ABN Glycosyl hydrolase family 43. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 277
27846 350103 cd08989 GH43_XYL-like Glycosyl hydrolase family 43, beta-D-xylosidases and arabinofuranosidases. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity, including Selenomonas ruminantium beta-D-xylosidase SXA. These are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. It also includes various GH43 family arabinofuranosidases (EC 3.2.1.55) including Humicola insolens alpha-L-arabinofuranosidase AXHd3, Bacteroides ovatus alpha-L-arabinofuranosidase (BoGH43, XynB), and the bifunctional Phanerochaete chrysosporium xylosidase/arabinofuranosidase (Xyl;PcXyl). GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 272
27847 350104 cd08990 GH43_AXH_like Glycosyl hydrolase family 43 protein, includes arabinoxylan arabinofuranohydrolase, beta-xylosidase, endo-1,4-beta-xylanase, and alpha-L-arabinofuranosidase. This subgroup includes Bacillus subtilis arabinoxylan arabinofuranohydrolase (XynD;BsAXH-m23;BSU18160), Butyrivibrio proteoclasticus alpha-L-arabinofuranosidase (Xsa43E;bpr_I2319), Clostridium stercorarium alpha-L-arabinofuranosidase XylA, and metagenomic beta-xylosidase (EC 3.2.1.37) / alpha-L-arabinofuranosidase (EC 3.2.1.55) CoXyl43. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_AXH-like subgroup includes enzymes that have been characterized with beta-xylosidase, alpha-L-arabinofuranosidase, endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. Metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43 shows synergy with Trichoderma reesei cellulases and promotes plant biomass saccharification by degrading xylo-oligosaccharides, such as xylobiose and xylotriose, into the monosaccharide xylose. Studies show that the hydrolytic activity of CoXyl43 is stimulated in the presence of calcium. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 269
27848 350105 cd08991 GH43_HoAraf43-like Glycosyl hydrolase family 43 protein such as Halothermothrix orenii H 168 alpha-L-arabinofuranosidase (HoAraf43;Hore_20580). This glycosyl hydrolase family 43 (GH43) subgroup includes Halothermothrix orenii H 168 alpha-L-arabinofuranosidase (EC 3.2.1.55) (HoAraf43;Hore_20580). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. This GH43_HoAraf43-like subgroup includes enzymes that have been annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-alpha-L-arabinanase, EC 3.2.1.8) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 283
27849 350106 cd08992 GH117 Glycosyl hydrolase family 117 (GH117). This glycoside hydrolase 117 (GH117) family includes alpha-1,3-L-neoagarooligosaccharide hydrolase (EC 3.2.1.-); alpha-1,3-L-neoagarobiase/neoagarobiose hydrolase (NABH, EC 3.2.1.-). In the agarolytic pathway, in order to metabolize agar, NABH is an essential enzyme because it converts alpha-neoagarobiose (O-3,6-anhydro-alpha-l-galactopyranosyl-(1,3)-d-galactose) into fermentable monosaccharides (d-galactose and 3,6-anhydro-l-galactose). Thus, these enzymes have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate. This family includes Zobellia galactanivorans enzymes, Zg4663 and Zg3615 (also known as ZgAhgA and ZgAhgB, respectively) that have been shown to have similar activity on unsubstituted agarose oligosaccharides while Zg3597 has been shown to be inactive, possibly due to differences in dimerization conformation, active-site structure and function. GH117 shares distant sequence similarity with families GH43 and GH32. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 314
27850 350107 cd08993 GH130 Glycosyl hydrolase family 130. Glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281) and beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), among others that have yet to be characterized. They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. This family includes Ruminococcus albus 4-O-beta-D-mannosyl-D-glucose phosphorylase (RaMP1) and beta-(1,4)-mannooligosaccharide phosphorylase (RaMP2), enzymes that phosphorolyze beta-mannosidic linkages at the non-reducing ends of their substrates, and have substantially diverse substrate specificities that are determined by three loop regions. 279
27851 350108 cd08994 GH43_62_32_68_117_130-like Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. Members of the glycosyl hydrolase families 32, 43, 62, 68, 117 and 130 (GH32, GH43, GH62, GH68, GH117, GH130) all possess 5-bladed beta-propeller domains and comprise clans F and J, as classified by the carbohydrate-active enzymes database (CAZY). Clan F consists of families GH43 and GH62. GH43 includes beta-xylosidases (EC 3.2.1.37), beta-xylanases (EC 3.2.1.8), alpha-L-arabinases (EC 3.2.1.99), and alpha-L-arabinofuranosidases (EC 3.2.1.55), using aryl-glycosides as substrates, while family GH62 contains alpha-L-arabinofuranosidases (EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose sidechains from xylans. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Clan J consists of families GH32 and GH68. GH32 comprises sucrose-6-phosphate hydrolases, invertases (EC 3.2.1.26), inulinases (EC 3.2.1.7), levanases (EC 3.2.1.65), eukaryotic fructosyltransferases, and bacterial fructanotransferases while GH68 consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10); beta-fructofuranosidase (EC 3.2.1.26); inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Members of this clan are retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) that catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Structures of all families in the two clans manifest a funnel-shaped active site that comprises two subsites with a single route for access by ligands. Also included in this superfamily are GH117 enzymes that have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate, and GH130 that are phosphorylases and hydrolases for beta-mannosides, involved in the bacterial utilization of mannans or N-linked glycans. 294
27852 350109 cd08995 GH32_EcAec43-like Glycosyl hydrolase family 32, such as the putative glycoside hydrolase Escherichia coli Aec43 (FosGH2). This glycosyl hydrolase family 32 (GH32) subgroup includes Escherichia coli strain BEN2908 putative glycoside hydrolase Aec43 (FosGH2). GH32 enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). GH32 family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. 281
27853 350110 cd08996 GH32_FFase Glycosyl hydrolase family 32, beta-fructosidases. Glycosyl hydrolase family GH32 cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). This family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 281
27854 350111 cd08997 GH68 Glycosyl hydrolase family 68, includes levansucrase, beta-fructofuranosidase and inulosucrase. Glycosyl hydrolase family 68 (GH68) consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10), beta-fructofuranosidase (EC 3.2.1.26) and inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Levansucrase, also known as beta-D-fructofuranosyl transferase, catalyzes the transfer of the sucrose fructosyl moiety to a growing levan chain. Similarly, inulosucrase catalyzes the synthesis of long inulin-type fructans, and beta-fructofuranosidases create fructooligosaccharides (FOS). However, in the absence of high fructan/sucrose ratio, some GH68 enzymes can also use fructan as donor substrate. GH68 retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. Biotechnological applications of these enzymes include use of inulin in inexpensive production of rich fructose syrups as well as use of FOS as health-promoting pre-biotics. 354
27855 350112 cd08998 GH43_Arb43a-like Glycosyl hydrolase family 43 protein such as Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase Arb43A. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 278
27856 350113 cd08999 GH43_ABN-like Glycosyl hydrolase family 43 protein such as endo-alpha-L-arabinanase. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 284
27857 350114 cd09000 GH43_SXA-like Glycosyl hydrolase family 43, such as Selenomonas ruminantium beta-D-xylosidase SXA. This glycosyl hydrolase family 43 (GH43) includes enzymes that have been characterized to mainly have beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity, including Selenomonas ruminantium (Xsa;Sxa;SXA), Bifidobacterium adolescentis ATCC 15703 (XylC;XynB;BAD_0428) and Bacillus sp. KK-1 XylB. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. These enzymes possess an additional C-terminal beta-sandwich domain that restricts access for substrates to a portion of the active site to form a pocket. The active-site pockets comprise two subsites, with binding capacity for two monosaccharide moieties and a single route of access for small molecules such as substrate. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 292
27858 350115 cd09001 GH43_FsAxh1-like Glycosyl hydrolase family 43 such as Fibrobacter succinogenes subsp. succinogenes S85 arabinoxylan alpha-L-arabinofuranosidase. This glycosyl hydrolase family 43 (GH43) includes mostly enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase; xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. This subfamily includes the characterized Clostridium stercorarium F-9 beta-xylosidase Xyl43B. It also includes Humicola insolens AXHd3 (HiAXHd3), a GH43 arabinofuranosidase (EC 3.2.1.55) that hydrolyzes O3-linked arabinose of doubly substituted xylans, a feature of the polysaccharide that is recalcitrant to degradation. It possesses an additional C-terminal beta-sandwich domain such that the interface between the domains comprises a xylan binding cleft that houses the active site pocket. The HiAXHd3 active site is tuned to hydrolyze arabinofuranosyl or xylosyl linkages, and the topology of the distal regions of the substrate binding surface confers specificity. It also includes Fibrobacter succinogenes subsp. succinogenes S85 arabinoxylan alpha-L-arabinofuranosidase (Axh1;Fisuc_1769;FSU_2269), Paenibacillus sp. E18 alpha-L-arabinofuranosidase (Abf43A), Bifidobacterium adolescentis ATCC 15703 double substituted xylan alpha-1,3-L-specific arabinofuranosidase d3 (AXHd3;AXH-d3;BaAXH-d3;BAD_0301;E-AFAM2), and Chrysosporium lucknowense C1 arabinoxylan hydrolase / double substituted xylan alpha-1,3-L-arabinofuranosidase (Abn7;AXHd). A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 270
27859 350116 cd09002 GH43_XYL-like Glycosyl hydrolase family 43, beta-D-xylosidase (uncharacterized). This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 271
27860 350117 cd09003 GH43_XynD-like Glycosyl hydrolase family 43 protein such as Bacillus subtilis arabinoxylan arabinofuranohydrolase (XynD;BsAXH-m23;BSU18160). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized Bacillus subtilis arabinoxylan arabinofuranohydrolase (AXH), Caldicellulosiruptor sp. Tok7B.1 beta-1,4-xylanase (EC 3.2.1.8) / alpha-L-arabinosidase (EC 3.2.1.55) XynA, Caldicellulosiruptor sp. Rt69B.1 xylanase C (EC 3.2.1.8) XynC, and Caldicellulosiruptor saccharolyticus beta-xylosidase (EC 3.2.1.37)/ alpha-L-arabinofuranosidase (EC 3.2.1.55) XynF. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. It belongs to the GH43_AXH-like subgroup which includes enzymes that have been annotated as having beta-xylosidase, alpha-L-arabinofuranosidase and arabinoxylan alpha-L-1,3-arabinofuranohydrolase, xylanase (endo-alpha-L-arabinanase) as well as AXH activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. Bacillus subtilis AXH (BsAXH-m2,3) has been shown to cleave arabinose units from O-2- or O-3-mono-substituted xylose residues and superposition of its structure with known structures of the GH43 exo-acting enzymes, beta-xylosidase and alpha-L-arabinanase, each in complex with their substrate, reveals a different orientation of the sugar backbone. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 315
27861 350118 cd09004 GH43_bXyl-like Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (BT3675;BT_3675) and (BT3662;BT_3662); includes mostly xylanases. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-alpha-L-arabinanase, EC 3.2.1.8) activities, as well as the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 266
27862 350156 cd09005 NP-I nucleoside phosphorylase-I family. The nucleoside phosphorylase-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases such as purine nucleoside phosphorylase (PNP, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases such as AMP nucleosidase (AMN, EC 3.2.2.4) and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Members of this family display different physiologically relevant quaternary structures: hexameric (trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP); homotrimeric (human PNP and Escherichia coli PNPII or XapA); hexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD); or homodimeric such as human and Trypanosoma brucei UP. The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 216
27863 350157 cd09006 PNP_EcPNPI-like purine nucleoside phosphorylases similar to Escherichia coli PNP-I (DeoD) and Trichomonas vaginalis PNP. Escherichia coli purine nucleoside phosphorylase (PNP)-I (or DeoD) accepts both 6-oxo and 6-amino purine nucleosides as substrates. Trichomonas vaginalis PNP has broad substrate specificity, having phosphorolytic catalytic activity with adenosine, inosine, and guanosine (with adenosine as the preferred substrate). This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 228
27864 350158 cd09007 NP-I_spr0068 uncharacterized subfamily of the nucleoside phosphorylase-I family. This subfamily is composed of uncharacterized members including Streptococcus pneumoniae hypothetical protein spr0068. The nucleoside phosphorylase-I (NP-I) family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases such as purine nucleoside phosphorylase (PNP, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases such as AMP nucleosidase (AMN, EC 3.2.2.4) and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Members of the NP-I family display different physiologically relevant quaternary structures: hexameric (trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP); homotrimeric (human PNP and Escherichia coli PNPII or XapA); hexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD); or homodimeric such as human and Trypanosoma brucei UP. The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 221
27865 350159 cd09008 MTAN 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidases. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs): bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 222
27866 350160 cd09009 PNP-EcPNPII_like purine nucleoside phosphorylases similar to human PNP and Escherichia coli PNP-II (XapA). Human PNP catalyzes the reversible phosphorolysis of the purine nucleosides and deoxynucleosides inosine, guanosine, deoxyinosine, and deoxyguanosine. Patients with PNP deficiency typically present with severe immunodeficiency, neurological dysfunction, and autoimmunity. Escherichia coli PNPII, product of the xapA/pndA gene, catalyzes the phosphorolysis of xanthosine, inosine and guanosine with equal efficiency and has been referred to as xanthosine phosphorylase and inosine-guanosine phosphorylase. E. coli PNPII is also capable of converting nicotinamide to nicotinamide riboside, and may be involved in the NAD+ salvage pathway. It is one of two purine nucleoside phosphorylases found in E. coli, which also contains PNPI, which displays a different substrate specificity and belongs to a different subgroup of the nucleoside phosphorylase-I (NP-I) family than PNPII. NP-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 265
27867 350161 cd09010 MTAP_SsMTAPII_like_MTIP 5'-deoxy-5'-methylthioadenosine phosphorylases (MTAP) similar to Sulfolobus solfataricus MTAPII and Pseudomonas aeruginosa PAO1 5'-methylthioinosine phosphorylase (MTIP). MTAP catalyzes the reversible phosphorolysis of 5'-deoxy-5'-methylthioadenosine (MTA) to adenine and 5-methylthio-D-ribose-1-phosphate. This subfamily includes human MTAP which is highly specific for MTA, and Sulfolobus solfataricus MTAPII which accepts adenosine in addition to MTA. Two MTAPs have been isolated from S. solfataricus: SsMTAP1 and SsMTAPII; SsMTAP1 belongs to a different subfamily of the nucleoside phosphorylase-I (NP-I) family. This group also includes Pseudomonas aeruginosa PAO1 MTI phosphorylase (MTIP) which uses 5'-methylthioinosine (MTI) as a preferred substrate, and does not use MTA. NP-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 238
27868 319953 cd09011 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 122
27869 319954 cd09012 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 127
27870 319955 cd09013 BphC-JF8_N_like N-terminal, non-catalytic, domain of BphC_JF8, (2,3-dihydroxybiphenyl 1,2-dioxygenase) from Bacillus sp. JF8, and similar proteins. 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, a key step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). BphC belongs to the type I extradiol dioxygenase family, which requires a metal ion in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. This subfamily of BphC is represented by the enzyme purified from the thermophilic biphenyl and naphthalene degrader, Bacillus sp. JF8. The members in this family of BphC enzymes may use either Mn(II) or Fe(II) as cofactors. The enzyme purified from Bacillus sp. JF8 is Mn(II)-dependent; however, the enzyme from Rhodococcus jostii RHA1 has Fe(II) bound to it. BphC_JF8 is thermostable and its optimum activity is at 85 degrees C. The enzymes in this family have an internal duplication. This family represents the N-terminal repeat. 121
27871 319956 cd09014 BphC-JF8_C_like C-terminal, catalytic domain of BphC_JF8, (2,3-dihydroxybiphenyl 1,2-dioxygenase). 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, a key step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). BphC belongs to the type I extradiol dioxygenase family, which requires a metal ion in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. This subfamily of BphC is represented by the enzyme purified from the thermophilic biphenyl and naphthalene degrader, Bacillus sp. JF8. The members in this family of BphC enzymes may use either Mn(II) or Fe(II) as cofactors. The enzyme purified from Bacillus sp. JF8 is Mn(II)-dependent; however, the enzyme from Rhodococcus jostii RHA1 has Fe(II) bound to it. BphC_JF8 is thermostable and its optimum activity is at 85 degrees C. The enzymes in this family have an internal duplication. This family represents the C-terminal repeat. 167
27872 212511 cd09015 Ureohydrolase Ureohydrolase superfamily includes arginase, formiminoglutamase, agmatinase and proclavaminate amidinohydrolase (PAH). This family, also known as arginase-like amidino hydrolase family, includes Mn-dependent enzymes: arginase (Arg, EC 3.5.3.1), formimidoylglutamase (HutG, EC 3.5.3.8), agmatinase (SpeB, EC 3.5.3.11), guanidinobutyrase (Gbh, EC 3.5.3.7), proclavaminate amidinohydrolase (PAH, EC 3.5.3.22) and related proteins. These enzymes catalyze hydrolysis of amide bond. They are involved in control of cellular levels of arginine and ornithine (both involved in protein biosynthesis, and production of creatine, polyamines, proline and nitric oxide), in histidine and arginine degradation, and in clavulanic acid biosynthesis. 270
27873 176656 cd09018 DEDDy_polA_RNaseD_like_exo DEDDy 3'-5' exonuclease domain of family-A DNA polymerases, RNase D, WRN, and similar proteins. DEDDy exonucleases, part of the DnaQ-like (or DEDD) exonuclease superfamily, catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. They contain four invariant acidic residues in three conserved sequence motifs termed ExoI, ExoII and ExoIII. DEDDy exonucleases are classified as such because of the presence of a specific YX(3)D pattern at ExoIII. The four conserved acidic residues serve as ligands for the two metal ions required for catalysis. This family of DEDDy exonucleases includes the proofreading domains of family A DNA polymerases, as well as RNases such as RNase D and yeast Rrp6p. The Egalitarian (Egl) and Bacillus-like DNA Polymerase I subfamilies do not possess a completely conserved YX(3)D pattern at the ExoIII motif. In addition, the Bacillus-like DNA polymerase I subfamily has inactive 3'-5' exonuclease domains which do not possess the metal-binding residues necessary for activity. 150
27874 185696 cd09019 galactose_mutarotase_like galactose mutarotase_like. Galactose mutarotase catalyzes the conversion of beta-D-galactose to alpha-D-galactose. Beta-D-galactose is produced by the degradation of lactose, a disaccharide composed of beta-D-glucose and beta-D-galactose. This epimerization reaction is the first step in the four-step Leloir pathway, which converts galactose into metabolically important glucose. This epimerization step is followed by the phosphorylation of alpha-D-galactose by galactokinase, an enzyme which can only act on the alpha anomer. A glutamate and a histidine residue of the galactose mutarotase have been shown to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. Galactose mutarotase is a member of the aldose-1-epimerase superfamily. 326
27875 185697 cd09020 D-hex-6-P-epi_like D-hexose-6-phosphate epimerase-like. D-Hexose-6-phosphate epimerase Ymr099c from Saccharomyces cerevisiae belongs to the large superfamily of aldose-1-epimerases. Its active site is very similar to the catalytic site of galactose mutarotase, the best studied member of the superfamily. It also contains the conserved glutamate and histidine residues that have been shown in galactose mutarotase to be critical for catalysis, the glutamate serving as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. In addition Ymr099c contains 2 conserved arginine residues which are involved in phosphate binding, and exhibits hexose-6-phosphate mutarotase activity on glucose-6-P, galactose-6-P and mannose-6-P. 269
27876 185698 cd09021 Aldose_epim_Ec_YphB aldose 1-epimerase, similar to Escherichia coli YphB. Proteins similar to Escherichia coli YphB are uncharacterized members of the aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. 273
27877 185699 cd09022 Aldose_epim_Ec_YihR Aldose 1-epimerase, similar to Escherichia coli YihR. Proteins similar to Escherichia coli YihR are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. 284
27878 185700 cd09023 Aldose_epim_Ec_c4013 Aldose 1-epimerase, similar to Escherichia coli c4013. Proteins, similar to Escherichia coli c4013, are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. 284
27879 185701 cd09024 Aldose_epim_lacX Aldose 1-epimerase, similar to Lactococcus lactis lacX. Proteins similar to Lactococcus lactis lacX are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. 288
27880 185702 cd09025 Aldose_epim_Slr1438 Aldose 1-epimerase, similar to Synechocystis Slr1438. Proteins similar to Synechocystis Slr1438 are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. 271
27881 193601 cd09027 PET PET (Prickle Espinas Testin) domain is involved in protein-protein interactions. PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation. 82
27882 350085 cd09028 ArfGap_ArfGap3 Arf1 GTPase-activating protein 3. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 has been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motifs. 120
27883 350086 cd09029 ArfGap_ArfGap2 Arf1 GTPase-activating protein 2. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 has been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motifs. 120
27884 176923 cd09030 DUF1425 Putative periplasmic lipoprotein. This bacterial family of proteins contains members described as putative lipoproteins, some are also known as YcfL. The function of this family is unknown. Family members have also been annotated as predicted periplasmic lipoproteins (COG5633), and appear to contain an N-terminal membrane lipoprotein lipid attachment site (pfam08139), which is not included in this alignment model. 101
27885 411807 cd09031 KH-I_NOVA_rpt3 third type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis. Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 71
27886 411808 cd09032 KH-I_N4BP1_like_rpt1 first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1). The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing ubiquitination of substrates. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif. 65
27887 411809 cd09033 KH-I_PNPT1 type I K homology (KH) RNA-binding domain found in mitochondrial polyribonucleotide nucleotidyltransferase 1 (PNPT1) and similar proteins. PNPT1, also called 3'-5' RNA exonuclease OLD35, or PNPase old-35, or polynucleotide phosphorylase 1, or PNPase 1, or polynucleotide phosphorylase-like protein, is an RNA-binding protein implicated in numerous RNA metabolic processes. It catalyzes the phosphorolysis of single-stranded polyribonucleotides processively in the 3'-to-5' direction. It acts as a mitochondrial intermembrane factor with RNA-processing exoribonuclease activity. PNPT1 is a component of the mitochondrial degradosome (mtEXO) complex, that degrades 3' overhang double-stranded RNA with a 3'-to-5' directionality in an ATP-dependent manner. It is involved in the degradation of non-coding mitochondrial transcripts (MT-ncRNA) and tRNA-like molecules and required for correct processing and polyadenylation of mitochondrial mRNAs. PNPT1 also plays a role as a cytoplasmic RNA import factor that mediates the translocation of small RNA components, like the 5S RNA, the RNA subunit of ribonuclease P and the mitochondrial RNA-processing (MRP) RNA, into the mitochondrial matrix. 67
27888 185761 cd09034 BRO1_Alix_like Protein-interacting Bro1-like domain of mammalian Alix and related domains. This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1 and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain. 345
27889 176924 cd09071 FAR_C C-terminal domain of fatty acyl CoA reductases. C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear. 92
27890 197307 cd09073 ExoIII_AP-endo Escherichia coli exonuclease III (ExoIII)-like apurinic/apyrimidinic (AP) endonucleases. The ExoIII family AP endonucleases belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, which is then followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, which has both mutagenic and cytotoxic effects. AP endonucleases can carry out a wide range of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two functional AP endonucleases, for example, APE1/Ref-1 and Ape2 in humans, Apn1 and Apn2 in baker's yeast, Nape and NExo in Neisseria meningitidis, and exonuclease III (ExoIII) and endonuclease IV (EndoIV) in Escherichia coli. Usually, one of the two is the dominant AP endonuclease, the other has weak AP endonuclease activity, but exhibits strong 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, and 3'-phosphatase activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes. This family contains the ExoIII family; the EndoIV family belongs to a different superfamily. 251
27891 197308 cd09074 INPP5c Catalytic domain of inositol polyphosphate 5-phosphatases. Inositol polyphosphate 5-phosphatases (5-phosphatases) are signal-modifying enzymes, which hydrolyze the 5-phosphate from the inositol ring of specific 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), such as PI(4,5)P2, PI(3,4,5)P3, PI(3,5)P2, I(1,4,5)P3, and I(1,3,4,5)P4. These enzymes are Mg2+-dependent, and belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. In addition to this INPP5c domain, 5-phosphatases often contain additional domains and motifs, such as the SH2 domain, the Sac-1 domain, the proline-rich domain (PRD), CAAX, RhoGAP (RhoGTPase-activating protein), and SKICH [SKIP (skeletal muscle- and kidney-enriched inositol phosphatase) carboxyl homology] domains, that are important for protein-protein interactions and/or for the subcellular localization of these enzymes. 5-phosphatases incorporate into large signaling complexes, and regulate diverse cellular processes including postsynaptic vesicular trafficking, insulin signaling, cell growth and survival, and endocytosis. Loss or gain of function of 5-phosphatases is implicated in certain human diseases. This family also contains a functionally unrelated nitric oxide transport protein, Cimex lectularius (bedbug) nitrophorin, which catalyzes a heme-assisted S-nitrosation of a proximal thiolate; the heme however binds at a site distinct from the active site of the 5-phosphatases. 299
27892 197309 cd09075 DNase1-like Deoxyribonuclease 1 and related proteins. This family includes Deoxyribonuclease 1 (DNase1, EC 3.1.21.1) and related proteins. DNase1, also known as DNase I, is a Ca2+, Mg2+/Mn2+-dependent secretory endonuclease, first isolated from bovine pancreas extracts. It cleaves DNA preferentially at phosphodiester linkages next to a pyrimidine nucleotide, producing 5'-phosphate terminated polynucleotides with a free hydroxyl group on position 3'. It generally produces tetranucleotides. DNase1 substrates include single-stranded DNA, double-stranded DNA, and chromatin. This enzyme may be responsible for apoptotic DNA fragmentation. Other deoxyribonucleases in this subfamily include human DNL1L (human DNase I lysosomal-like, also known as DNASE1L1, Xib and DNase X), human DNASE1L2 (also known as DNAS1L2), and DNASE1L3 (also known as DNAS1L3, nhDNase, LS-DNase, DNase Y, and DNase gamma). DNASE1L3 is also implicated in apoptotic DNA fragmentation. DNase1 is also a cytoskeletal protein which binds actin. A recombinant form of human DNase1 is used as a mucoactive therapy in patients with cystic fibrosis; it hydrolyzes the extracellular DNA in sputum and reduces its viscosity. Mutations in the gene encoding DNase1 have been associated with Systemic Lupus Erythematosus, a multifactorial autoimmune disease. This family also includes a subfamily of mostly uncharacterized proteins, which includes Mycoplasma pulmonis MnuA, a membrane-associated nuclease. The in vivo role of MnuA is as yet undetermined. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 258
27893 197310 cd09076 L1-EN Endonuclease domain (L1-EN) of the non-LTR retrotransposon LINE-1 (L1), and related domains. This family contains the endonuclease domain (L1-EN) of the non-LTR retrotransposon LINE-1 (L1), and related domains, including the endonuclease of Xenopus laevis Tx1. These retrotransposons belong to the subtype 2, L1-clade. LINEs can be classified into two subtypes. Subtype 2 has two ORFs: the second (ORF2) encodes a modular protein consisting of an N-terminal apurine/apyrimidine endonuclease domain (EN), a central reverse transcriptase, and a zinc-finger-like domain at the C-terminus. LINE-1/L1 elements (full length and truncated) comprise about 17% of the human genome. This endonuclease nicks the genomic DNA at the consensus target sequence 5'TTTT-AA3' producing a ribose 3'-hydroxyl end as a primer for reverse transcription of associated template RNA. This subgroup also includes the endonuclease of Xenopus laevis Tx1, another member of the L1-clade. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 236
27894 197311 cd09077 R1-I-EN Endonuclease domain encoded by various R1- and I-clade non-long terminal repeat retrotransposons. This family contains the endonuclease (EN) domain of various non-long terminal repeat (non-LTR) retrotransposons, long interspersed nuclear elements (LINEs) which belong to the subtype 2, R1- and I-clade. LINEs can be classified into two subtypes. Subtype 2 has two ORFs: the second (ORF2) encodes a modular protein consisting of an N-terminal apurine/apyrimidine endonuclease domain (EN), a central reverse transcriptase, and a zinc-finger-like domain at the C-terminus. Most non-LTR retrotransposons are inserted throughout the host genome; however, many retrotransposons of the R1 clade exhibit target-specific retrotransposition. This family includes the endonucleases of SART1 and R1bm, from the silkworm Bombyx mori, which belong to the R1-clade. It also includes the endonuclease of snail (Biomphalaria glabrata) Nimbus/Bgl and mosquito Aedes aegypti (MosquI), both of which belong to the I-clade. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 205
27895 197312 cd09078 nSMase Neutral sphingomyelinases (nSMase) catalyze the hydrolysis of sphingomyelin in biological membranes to ceramide and phosphorylcholine. Sphingomyelinases (SMase) are phosphodiesterases that catalyze the hydrolysis of sphingomyelin to ceramide and phosphorylcholine. Eukaryotic SMases have been classified according to their pH optima and are known as acid SMase, alkaline SMase, and neutral SMase (nSMase). Eukaryotic proteins in this family are nSMases, and are activated by a variety of stress-inducing agents such as cytokines or UV radiation. Ceramides and other metabolic derivatives, including sphingosine, are lipid "second messenger" molecules that participate in the regulation of stress-induced cellular responses, including cell death, adhesion, differentiation, and proliferation. Bacterial neutral SMases, which also belong to this domain family, are secreted proteins that act as membrane-damaging virulence factors. They promote colonization of the host tissue. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 280
27896 197313 cd09079 RgfB-like Streptococcus agalactiae RgfB, part of a putative two component signal transduction system, and related proteins. This family includes Streptococcus agalactiae RgfB (for regulator of fibrinogen binding) and related proteins. The function of RgfB is unknown. It is part of a putative two component signal transduction system designated rgfBDAC (the rgf locus was identified in a screen for mutants of Streptococcus agalactiae with altered binding to fibrinogen). RgfA, -C, and -D do not belong to this superfamily: rgfA encodes a putative response regulator, and rgfC, a putative histidine kinase. All four genes are co-transcribed, and may be involved in regulating expression of bacterial cell surface components. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 259
27897 197314 cd09080 TDP2 Phosphodiesterase domain of human TDP2, a 5'-tyrosyl DNA phosphodiesterase, and related domains. Human TDP2, also known as TTRAP (TRAF/TNFR-associated factors, and tumor necrosis factor receptor/TNFR-associated protein), is a 5'-tyrosyl DNA phosphodiesterase. It is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini, needed for subsequent DNA ligation, and hence repair of the break. TDP2 and 3'-tyrosyl DNA phosphodiesterase (TDP1) are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TTRAP has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation, and it may inhibit the activation of nuclear factor-kB. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 248
27898 197315 cd09081 CdtB CdtB, the catalytic DNase I-like subunit of cytolethal distending toxin (CDT) protein. CDT is a secreted protein toxin produced by a number of Gram-negative disease-causing bacteria. CDT causes cell cycle arrest and eventual cell death in eukaryotic cells, as a result of chromosomal DNA damage caused by the catalytic, DNase I-like, CdtB subunit. Bacterial CDTs are generally comprised of three subunits, CdtA, -B and -C. CdtB is translocated into the host cell, where it acts as a genotoxin. CdtA and CdtC are needed for cell surface binding and cellular entry, and it is likely that they remain associated with the membrane, when CdtB is internalized. CdtB enters the target nucleus via nuclear translocation signal domain(s). This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 247
27899 197316 cd09082 Deadenylase C-terminal deadenylase domain of CCR4, nocturnin, and related domains. This family contains the C-terminal catalytic domains of the deadenylases, CCR4 and nocturnin, and related domains. Nocturnin is a poly(A)-specific 3' exonuclease that specifically degrades the 3' poly(A) tail of RNA in a process known as deadenylation. This nuclease activity is manganese dependent. Nocturnin is expressed in the cytoplasm of the Xenopus laevis retinal photoreceptor cells in a rhythmic fashion, and it has been proposed that it participates in posttranscriptional regulation of the circadian clock or its outputs, and that the mRNA target(s) of this deadenylase are circadian clock-related. Saccharomyces cerevisiae CCR4p is a 3'-5' poly(A) RNA and ssDNA exonuclease. It is the catalytic subunit of the yeast mRNA deadenylase (Ccr4p/Pop2p/Not complex). This complex participates in various ways in mRNA metabolism, including transcription initiation and elongation, and mRNA degradation. The deadenylase activities of Ccr4p and nocturnin differ: nocturnin degrades poly(A), Ccr4p degrades both poly(A) and single-stranded DNA, and in contrast to Ccr4p, nocturnin appears to function in a highly processive manner. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 348
27900 197317 cd09083 EEP-1 Exonuclease-Endonuclease-Phosphatase domain; uncharacterized family 1. This family of uncharacterized proteins belongs to a superfamily that includes the catalytic domain (exonuclease/endonuclease/phosphatase, EEP, domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds. Their substrates range from nucleic acids to phospholipids and perhaps, proteins. 252
27901 197318 cd09084 EEP-2 Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; uncharacterized family 2. This family of uncharacterized proteins belongs to a superfamily that includes the catalytic domain (exonuclease/endonuclease/phosphatase, EEP, domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps, proteins. 246
27902 197319 cd09085 Mth212-like_AP-endo Methanothermobacter thermautotrophicus Mth212-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes the thermophilic archaeon Methanothermobacter thermautotrophicus Mth212 and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Mth212 is an AP endonuclease, and a DNA uridine endonuclease (U-endo) that nicks double-stranded DNA at the 5'-side of a 2'-d-uridine residue. After incision at the 5'-side of a 2'-d-uridine residue by Mth212, DNA polymerase B takes over the 3'-OH terminus and carries out repair synthesis, generating a 5'-flap structure that is resolved by a 5'-flap endonuclease. Finally, DNA ligase seals the resulting nick. This U-endo activity shares the same catalytic center as its AP-endo activity, and is absent from other AP endonuclease homologues. 252
27903 197320 cd09086 ExoIII-like_AP-endo Escherichia coli exonuclease III (ExoIII) and Neisseria meningitidis NExo-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes Escherichia coli ExoIII, Neisseria meningitidis NExo, and related proteins. These are ExoIII family AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases; usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity, for example, Nape and NExo in Neisseria meningitidis, and exonuclease III (ExoIII) and endonuclease IV (EndoIV) in Escherichia coli. NExo and ExoIII are found in this subfamily. NExo is the non-dominant AP endonuclease. It exhibits strong 3'-5' exonuclease and 3'-deoxyribose phosphodiesterase activities. Escherichia coli ExoIII is an active AP endonuclease, and in addition, it exhibits double strand (ds)-specific 3'-5' exonuclease, exonucleolytic RNase H, 3'-phosphomonoesterase and 3'-phosphodiesterase activities, all catalyzed by a single active site. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes ExoIII and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily. 254
27904 197321 cd09087 Ape1-like_AP-endo Human Ape1-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes human Ape1 (also known as Apex, Hap1, or Ref-1) and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases; usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity, for example, Ape1 and Ape2 in humans. Ape1 is found in this subfamily; it exhibits strong AP-endonuclease activity but shows weak 3'-5' exonuclease and 3'-phosphodiesterase activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes exonuclease III (ExoIII) and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily. 253
27905 197322 cd09088 Ape2-like_AP-endo Human Ape2-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes human APE2, Saccharomyces cerevisiae Apn2/Eth1, and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases; usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity, for example, Ape1 and Ape2 in humans, and Apn1 and Apn2 in baker's yeast. Ape2 and Apn2/Eth1 are both found in this subfamily, and have the weaker AP endonuclease activity. Ape2 shows strong 3'-5' exonuclease and 3'-phosphodiesterase activities; it can reduce the mutagenic consequences of attack by reactive oxygen species by removing 3'-end adenine opposite from 8-oxoG, in addition to repairing 3'-damaged termini. Apn2/Eth1 exhibits AP endonuclease activity, but has 30-40 fold more active 3'-phosphodiesterase and 3'-5' exonuclease activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes exonuclease III (ExoIII) and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily. 309
27906 197323 cd09089 INPP5c_Synj Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanins. This subfamily contains the INPP5c domains of two human synaptojanins, synaptojanin 1 (Synj1) and synaptojanin 2 (Synj2), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs). They belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj1 occurs as two main isoforms: a brain-enriched 145 kDa protein (Synj1-145) and a ubiquitously expressed 170 kDa protein (Synj1-170). Synj1-145 participates in clathrin-mediated endocytosis. The primary substrate of the Synj1-145 INPP5c domain is PI(4,5)P2, which it converts to PI4P. Synj1-145 may work with membrane curvature sensors/generators (such as endophilin) to remove PI(4,5)P2 from curved membranes. The recruitment of the INPP5c domain of Synj1-145 to endophilin-induced membranes leads to a fragmentation and condensation of these structures. The PI(4,5)P2 to PI4P conversion may cooperate with dynamin to produce membrane fission. In addition to this INPP5c domain, Synjs contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro. Synj2 can hydrolyze phosphatidylinositol diphosphate (PIP2) to phosphatidylinositol phosphate (PIP). Synj2 occurs as multiple alternative splice variants in various tissues. These variants share the INPP5c domain and the Sac1 domain. Synj2A is recruited to the mitochondria via its interaction with OMP25 (a mitochondrial outer membrane protein). Synj2B is found at nerve terminals in the brain and at the spermatid manchette in testis. Synj2B undergoes further alternative splicing to give 2B1 and 2B2. In clathrin-mediated endocytosis, Synj2 participates in the formation of clathrin-coated pits, and perhaps also in vesicle decoating. Rac1 GTPase regulates the intracellular localization of Synj2 forms, but not Synj1. Synj2 may contribute to the role of Rac1 in cell migration and invasion, and is a potential target for therapeutic intervention in malignant tumors. 328
27907 197324 cd09090 INPP5c_ScInp51p-like Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Saccharomyces cerevisiae Inp51p, Inp52p, and Inp53p, and related proteins. This subfamily contains the INPP5c domain of three Saccharomyces cerevisiae synaptojanin-like inositol polyphosphate 5-phosphatases (INP51, INP52, and INP53), Schizosaccharomyces pombe synaptojanin (SPsynaptojanin), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. In addition to this INPP5c domain, these proteins have an N-terminal catalytic Sac1-like domain (found in other proteins including the phosphoinositide phosphatase Sac1p), and a C-terminal proline-rich domain (PRD). The Sac1 domain allows Inp52p and Inp53p to recognize and dephosphorylate a wider range of substrates including PI3P, PI4P, and PI(3,5)P2. The Sac1 domain of Inp51p is non-functional. Disruption of any two of INP51, INP52, and INP53 in S. cerevisiae leads to abnormal vacuolar and plasma membrane morphology. During hyperosmotic stress, Inp52p and Inp53p localize at actin patches, where they may facilitate the hydrolysis of PI(4,5)P2, and consequently promote actin rearrangement to regulate cell growth. SPsynaptojanin is also active against a range of soluble and lipid inositol phosphates, including I(1,4,5)P3, I(1,3,4,5)P4, I(1,4,5,6)P4, PI(4,5)P2, and PIP3. Transformation of S. cerevisiae with a plasmid expressing the SPsynaptojanin 5-phosphatase domain rescues inp51/inp52/inp53 triple-mutant strains. 291
27908 197325 cd09091 INPP5c_SHIP Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol polyphosphate 5-phosphatase-1 and -2, and related proteins. This subfamily contains the INPP5c domain of SHIP1 (SH2 domain containing inositol polyphosphate 5-phosphatase-1, also known as SHIP/INPP5D), and SHIP2 (also known as INPPL1). It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Both SHIP1 and -2 catalyze the dephosphorylation of the PI, phosphatidylinositol 3,4,5-trisphosphate [PI(3,4,5)P3], to phosphatidylinositol 3,4-bisphosphate [PI(3,4)P2]. SHIP1 also converts inositol-1,3,4,5-polyphosphate [I(1,3,4,5)P4] to inositol-1,3,4-polyphosphate [I(1,3,4)P3]. SHIP1 and SHIP2 have little overlap in their in vivo functions. SHIP1 is a negative regulator of cell growth and plays a major part in mediating the inhibitory signaling in B cells; it is predominantly expressed in hematopoietic cells. SHIP2 is an inhibitor of the insulin signaling pathway, and is implicated in actin structure remodeling, cell adhesion and cell spreading, receptor endocytosis and degradation, and in the JIP1-mediated JNK pathway. SHIP2 is widely expressed, most prominently in brain, heart and in skeletal muscle. In addition to this INPP5c domain, SHIP1 has an N-terminal SH2 domain, two NPXY motifs, and a C-terminal proline-rich region (PRD), while SHIP2 has an N-terminal SH2 domain, a C-terminal proline-rich domain (PRD), which includes a WW-domain binding motif (PPLP), an NPXY motif, and a sterile alpha motif (SAM) domain. The gene encoding SHIP2 is a candidate gene for conferring a predisposition for type 2 diabetes. 307
27909 197326 cd09092 INPP5A Type I inositol polyphosphate 5-phosphatase I. Type I inositol polyphosphate 5-phosphatase I (INPP5A) hydrolyzes the 5-phosphate from inositol 1,3,4,5-tetrakisphosphate [I(1,3,4,5)P4] and inositol 1,4,5-trisphosphate [I(1,4,5)P3]. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. As the substrates of INPP5A mobilize intracellular calcium ions, INPP5A is a calcium signal-terminating enzyme. In platelets, phosphorylated pleckstrin binds and activates INPP5A in a 1:1 complex, and accelerates the degradation of the calcium ion-mobilizing I(1,4,5)P3. 383
27910 197327 cd09093 INPP5c_INPP5B Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Type II inositol polyphosphate 5-phosphatase I, Oculocerebrorenal syndrome of Lowe 1, and related proteins. This subfamily contains the INPP5c domain of type II inositol polyphosphate 5-phosphatase I (INPP5B), Oculocerebrorenal syndrome of Lowe 1 (OCRL-1), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5B and OCRL1 preferentially hydrolyze the 5-phosphate of phosphatidylinositol (4,5)-bisphosphate [PI(4,5)P2] and phosphatidylinositol (3,4,5)-trisphosphate [PI(3,4,5)P3]. INPP5B can also hydrolyze soluble inositol (1,4,5)-trisphosphate [I(1,4,5)P3] and inositol (1,3,4,5)-tetrakisphosphate [I(1,3,4,5)P4]. INPP5B participates in the endocytic pathway and in the early secretory pathway. In the latter, it may function in retrograde ERGIC (ER-to-Golgi intermediate compartment)-to-ER transport; it binds specific RAB proteins within the secretory pathway. In the endocytic pathway, it binds RAB5 and during endocytosis, may function in a RAB5-controlled cascade for converting PI(3,4,5)P3 to phosphatidylinositol 3-phosphate (PI3P). This cascade may link growth factor signaling and membrane dynamics. Mutation in OCRL1 is implicated in Lowe syndrome, an X-linked recessive multisystem disorder, which includes defects in eye, brain, and kidney function, and in Type 2 Dent's disease, a disorder with only the renal symptoms. OCRL-1 may have a role in membrane trafficking within the endocytic pathway and at the trans-Golgi network, and may participate in actin dynamics or signaling from endomembranes. OCRL1 and INPP5B have overlapping functions: deletion of both 5-phosphatases in mice is embryonic lethal, deletion of OCRL1 alone has no phenotype, and deletion of Inpp5b alone has only a mild phenotype (male sterility). Several of the proteins that interact with OCRL1 also bind INPP5B, for example, inositol polyphosphate phosphatase interacting protein of 27kDa (IPIP27)A and B (also known as Ses1 and 2), and endocytic signaling adaptor APPL1. OCRL1, but not INPP5B, binds clathrin heavy chain and the plasma membrane AP2 adaptor subunit alpha-adaptin. In addition to this INPP5c domain, most proteins in this subfamily have a C-terminal RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain. 292
27911 197328 cd09094 INPP5c_INPP5J-like Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of inositol polyphosphate 5-phosphatase J and related proteins. INPP5c domain of Inositol polyphosphate-5-phosphatase J (INPP5J), also known as PIB5PA or PIPP, and related proteins. This subfamily belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5J hydrolyzes PI(4,5)P2, I(1,4,5)P3, and I(1,3,4,5)P4 at ruffling membranes. These proteins contain a C-terminal, SKIP carboxyl homology domain (SKICH), which may direct plasma membrane ruffle localization. 300
27912 197329 cd09095 INPP5c_INPP5E-like Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Inositol polyphosphate-5-phosphatase E and related proteins. INPP5c domain of Inositol polyphosphate-5-phosphatase E (also called type IV or 72 kDa 5-phosphatase), rat pharbin, and related proteins. This subfamily belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5E hydrolyzes the 5-phosphate from PI(3,5)P2, PI(4,5)P2 and PI(3,4,5)P3, forming PI3P, PI4P, and PI(3,4)P2, respectively. It is a very potent PI(3,4,5)P3 5-phosphatase. Its intracellular localization is chiefly cytosolic, with pronounced perinuclear/Golgi localization. INPP5E also has an N-terminal proline-rich domain (PRD) and a C-terminal CAAX motif. This protein is expressed in a variety of tissues, including the breast, brain, testis, and haemopoietic cells. It is differentially expressed in several cancers; for example, it is up-regulated in cervical cancer and down-regulated in stomach cancer. It is a candidate target for therapeutics of obesity and related disorders, as it is expressed in the hypothalamus, and following insulin stimulation, it undergoes tyrosine phosphorylation, associates with insulin receptor substrate-1, -2, and PI3-kinase, and becomes active as a 5-phosphatase. INPP5E may play a role, along with other 5-phosphatases SHIP2 and SKIP, in regulating glucose homoeostasis and energy metabolism. Mice deficient in INPP5E develop a multi-organ disorder associated with structural defects of the primary cilium. 298
27913 197330 cd09096 Deadenylase_nocturnin C-terminal deadenylase domain of nocturnin and related domains. This subfamily contains the C-terminal catalytic domain of the deadenylase, nocturnin, and related domains. Nocturnin is a poly(A)-specific 3' exonuclease that specifically degrades the 3' poly(A) tail of RNA in a process known as deadenylation. This nuclease activity is manganese dependent. Nocturnin is expressed in the cytoplasm of Xenopus laevis retinal photoreceptor cells in a rhythmic fashion, and it has been proposed that it participates in posttranscriptional regulation of the circadian clock or its outputs, and that the mRNA target(s) of this deadenylase are circadian clock-related. In mouse, the nocturnin gene, mNoc, is expressed in a circadian pattern in a range of tissues including retina, spleen, heart, kidney, and liver. It is highly expressed in bone-marrow stromal cells, adipocytes and hepatocytes. In mammals, nocturnin plays a role in regulating mesenchymal stem-cell lineage allocation, perhaps through regulating PPAR-gamma (peroxisome proliferator-activated receptor-gamma) nuclear translocation. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 280
27914 197331 cd09097 Deadenylase_CCR4 C-terminal deadenylase domain of CCR4 and related domains. This subfamily contains the C-terminal catalytic domain of the deadenylases, Saccharomyces cerevisiae Ccr4p and two vertebrate homologs (CCR4a and CCR4b), and related domains. CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1 (called Pop2 in yeast), is a DEDD-type protein and does not belong in this superfamily. Saccharomyces cerevisiae CCR4 (or Ccr4p) is a 3'-5' poly(A) RNA and ssDNA exonuclease. It is the catalytic subunit of the yeast mRNA deadenylase (Ccr4p/Pop2p/Not complex). This complex participates in various ways in mRNA metabolism, including transcription initiation and elongation, and mRNA degradation. Ccr4p degrades both poly(A) and single-stranded DNA. There are two vertebrate homologs of Ccr4p, CCR4a (also called CCR4-NOT transcription complex subunit 6 or CNOT6) and CCR4b (also called CNOT6-like or CNOT6L), which independently associate with other components to form distinct CCR4-NOT multisubunit complexes. The nuclease domain of CNOT6 and CNOT6L exhibits Mg2+-dependent deadenylase activity, with specificity for poly (A) RNA as substrate. CCR4a is a component of P-bodies and is necessary for foci formation. CCR4b regulates p27/Kip1 mRNA levels, thereby influencing cell cycle progression. They both contribute to the prevention of cell death by regulating insulin-like growth factor-binding protein 5. 329
27915 197332 cd09098 INPP5c_Synj1 Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanin 1. This subfamily contains the INPP5c domains of human synaptojanin 1 (Synj1) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj1 occurs as two main isoforms: a brain enriched 145 KDa protein (Synj1-145) and a ubiquitously expressed 170KDa protein (Synj1-170). Synj1-145 participates in clathrin-mediated endocytosis. The primary substrate of the Synj1-145 INPP5c domain is PI(4,5)P2, which it converts to PI4P. Synj1-145 may work with membrane curvature sensors/generators (such as endophilin) to remove PI(4,5)P2 from curved membranes. The recruitment of the INPP5c domain of Synj1-145 to endophilin-induced membranes leads to a fragmentation and condensation of these structures. The PI(4,5)P2 to PI4P conversion may cooperate with dynamin to produce membrane fission. In addition to this INPP5c domain, these proteins contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro. 336
27916 197333 cd09099 INPP5c_Synj2 Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanin 2. This subfamily contains the INPP5c domains of human synaptojanin 2 (Synj2) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj2 can hydrolyze phosphatidylinositol diphosphate (PIP2) to phosphatidylinositol phosphate (PIP). In addition to this INPP5c domain, these proteins contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro. Synj2 occurs as multiple alternative splice variants in various tissues. These variants share the INPP5c domain and the Sac1 domain. Synj2A is recruited to the mitochondria via its interaction with OMP25, a mitochondrial outer membrane protein. Synj2B is found at nerve terminals in the brain and at the spermatid manchette in testis. Synj2B undergoes further alternative splicing to give 2B1 and 2B2. In clathrin-mediated endocytosis, Synj2 participates in the formation of clathrin-coated pits, and perhaps also in vesicle decoating. Rac1 GTPase regulates the intracellular localization of Synj2 forms, but not Synj1. Synj2 may contribute to the role of Rac1 in cell migration and invasion, and is a potential target for therapeutic intervention in malignant tumors. 336
27917 197334 cd09100 INPP5c_SHIP1-INPP5D Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol polyphosphate 5-phosphatase-1 and related proteins. This subfamily contains the INPP5c domain of SHIP1 (SH2 domain containing inositol polyphosphate 5-phosphatase-1, also known as SHIP/INPP5D) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. SHIP1's enzymic activity is restricted to phosphatidylinositol 3,4,5-trisphosphate [PI (3,4,5)P3] and inositol-1,3,4,5- polyphosphate [I(1,3,4,5)P4]. It converts these two phosphoinositides to phosphatidylinositol 3,4-bisphosphate [PI (3,4)P2] and inositol-1,3,4-polyphosphate [I(1,3,4)P3], respectively. SHIP1 is a negative regulator of cell growth and plays a major part in mediating the inhibitory signaling in B cells; it is predominantly expressed in hematopoietic cells. In addition to this INPP5c domain, SHIP1 has an N-terminal SH2 domain, two NPXY motifs, and a C-terminal proline-rich region (PRD). SHIP1's phosphorylated NPXY motifs interact with proteins with phosphotyrosine binding (PTB) domains, and facilitate the translocation of SHIP1 to the plasma membrane to hydrolyze PI(3,4,5)P3. SHIP1 generally acts to oppose the activity of phosphatidylinositol 3-kinase (PI3K). It acts as a negative signaling molecule, reducing the levels of PI(3,4,5)P3, thereby removing the latter as a membrane-targeting signal for PH domain-containing effector molecules. SHIP1 may also, in certain contexts, amplify PI3K signals. SHIP1 and SHIP2 have little overlap in their in vivo functions. 307
27918 197335 cd09101 INPP5c_SHIP2-INPPL1 Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol 5-phosphatase-2 and related proteins. This subfamily contains the INPP5c domain of SHIP2 (SH2 domain containing inositol 5-phosphatase-2, also called INPPL1) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. SHIP2 catalyzes the dephosphorylation of the PI, phosphatidylinositol 3,4,5-trisphosphate [PI(3,4,5)P3], to phosphatidylinositol 3,4-bisphosphate [PI(3,4)P2]. SHIP2 is widely expressed, most prominently in brain, heart and in skeletal muscle. SHIP2 is an inhibitor of the insulin signaling pathway. It is implicated in actin structure remodeling, cell adhesion and cell spreading, receptor endocytosis and degradation, and in the JIP1-mediated JNK pathway. Its interacting partners include filamin/actin, p130Cas, Shc, Vinexin, Intersectin 1, and c-Jun NH2-terminal kinase (JNK)-interacting protein 1 (JIP1). A large variety of extracellular stimuli appear to lead to the tyrosine phosphorylation of SHIP2, including epidermal growth factor (EGF), platelet-derived growth factor (PDGF), insulin, macrophage colony-stimulating factor (M-CSF) and hepatocyte growth factor (HGF). SHIP2 is localized to the cytosol in quiescent cells; following growth factor stimulation and/or cell adhesion, it relocalizes to membrane ruffles. In addition to this INPP5c domain, SHIP2 has an N-terminal SH2 domain, a C-terminal proline-rich domain (PRD), which includes a WW-domain binding motif (PPLP), an NPXY motif and a sterile alpha motif (SAM) domain. The gene encoding SHIP2 is a candidate for conferring a predisposition for type 2 diabetes; it has been suggested that suppression of SHIP2 may be of benefit in the treatment of obesity and thereby prevent type 2 diabetes. SHIP2 and SHIP1 have little overlap in their in vivo functions. 304
27919 197201 cd09102 PLDc_CDP-OH_P_transf_II_1 Catalytic domain, repeat 1, of CDP-alcohol phosphatidyltransferase class-II family members. Catalytic domain, repeat 1, of CDP-alcohol phosphatidyltransferase class-II family members, which mainly include gram-negative bacterial phosphatidylserine synthases (PSS; CDP-diacylglycerol--serine O-phosphatidyltransferase, EC 2.7.8.8), yeast phosphatidylglycerophosphate synthase (PGP synthase; CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase, EC 2.7.8.5), and metazoan PGP synthase 1. All members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism, involving a substrate-enzyme intermediate, to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involving phosphatidyl group transfer. Phosphatidylserine synthases from gram-positive bacteria and eukaryotes, and prokaryotic phosphatidylglycerophosphate synthases are not members of this subfamily. 168
27920 197202 cd09103 PLDc_CDP-OH_P_transf_II_2 Catalytic domain, repeat 2, of CDP-alcohol phosphatidyltransferase class-II family members. Catalytic domain, repeat 2, of CDP-alcohol phosphatidyltransferase class-II family members, which mainly include gram-negative bacterial phosphatidylserine synthases (PSS; CDP-diacylglycerol--serine O-phosphatidyltransferase, EC 2.7.8.8), yeast phosphatidylglycerophosphate synthase (PGP synthase; CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase, EC 2.7.8.5), and metazoan PGP synthase 1. All members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism, involving a substrate-enzyme intermediate, to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involving phosphatidyl group transfer. Phosphatidylserine synthases from gram-positive bacteria and eukaryotes, and prokaryotic phosphatidylglycerophosphate synthases are not members of this subfamily. 184
27921 197203 cd09104 PLDc_vPLD1_2_like_1 Catalytic domain, repeat 1, of vertebrate phospholipases, PLD1 and PLD2, and similar proteins. Catalytic domain, repeat 1, of phospholipase D (PLD, EC 3.1.4.4) found in yeast, plants, and vertebrates, and their bacterial homologs. PLDs are involved in signal transduction, vesicle formation, protein transport, and mitosis by participating in phospholipid metabolism. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both prokaryotic and eukaryotic PLDs have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. PLDs are active as bi-lobed monomers. Each monomer contains two domains, each of which carries one copy of the HKD motif. Two HKD motifs from two domains form a single active site. PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 147
27922 197204 cd09105 PLDc_vPLD1_2_like_2 Catalytic domain, repeat 2, of vertebrate phospholipases, PLD1 and PLD2, and similar proteins. Catalytic domain, repeat 2, of phospholipase D (PLD, EC 3.1.4.4) found in yeast, plants, and vertebrates, and their bacterial homologs. PLDs are involved in signal transduction, vesicle formation, protein transport, and mitosis by participating in phospholipid metabolism. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both prokaryotic and eukaryotic PLDs have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. PLDs are active as bi-lobed monomers. Each monomer contains two domains, each of which carries one copy of the HKD motif. Two HKD motifs from two domains form a single active site. PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 146
27923 197205 cd09106 PLDc_vPLD3_4_5_like_1 Putative catalytic domain, repeat 1, of vertebrate phospholipases, PLD3, PLD4 and PLD5, viral envelope proteins K4 and p37, and similar proteins. Putative catalytic domain, repeat 1, of vertebrate phospholipases D, PLD3, PLD4, and PLD5 (EC 3.1.4.4), viral envelope proteins (vaccinia virus proteins K4 and p37), and similar proteins. Most family members contain two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified into the phospholipase D (PLD) superfamily. Proteins in this subfamily are associated with Golgi membranes, altering their lipid content by the conversion of phospholipids into phosphatidic acid, which is thought to be involved in the regulation of lipid movement. ADP ribosylation factor (ARF), a small guanosine triphosphate binding protein, might be required for activity. The vaccinia virus p37 protein, encoded by the F13L gene, is also associated with Golgi membranes and is required for the envelopment and spread of the extracellular enveloped virus (EEV). The vaccinia virus protein K4, encoded by the HindIII K4L gene, remains to be characterized. Sequence analysis indicates that the vaccinia virus proteins K4 and p37 might have evolved from one or more captured eukaryotic genes involved in cellular lipid metabolism. To date, no catalytic activity of PLD3 has been shown. Furthermore, due to the lack of functionally important histidine and lysine residues in the HKD motif, mammalian PLD5 has been characterized as an inactive PLD. The poxvirus p37 proteins may also lack PLD enzymatic activity, since they contain only one partially conserved HKD motif (N-x-K-x(4)-D). 153
27924 197206 cd09107 PLDc_vPLD3_4_5_like_2 Putative catalytic domain, repeat 2, of vertebrate phospholipases, PLD3, PLD4 and PLD5, viral envelope proteins K4 and p37, and similar proteins. Putative catalytic domain, repeat 2, of vertebrate phospholipases D, PLD3, PLD4, and PLD5 (EC 3.1.4.4), viral envelope proteins (vaccinia virus proteins K4 and p37), and similar proteins. Most family members contain two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified into the phospholipase D (PLD) superfamily. Proteins in this subfamily are associated with Golgi membranes, altering their lipid content by the conversion of phospholipids into phosphatidic acid, which is thought to be involved in the regulation of lipid movement. ADP ribosylation factor (ARF), a small guanosine triphosphate binding protein, might be required for activity. The vaccinia virus p37 protein, encoded by the F13L gene, is also associated with Golgi membranes and is required for the envelopment and spread of the extracellular enveloped virus (EEV). The vaccinia virus protein K4, encoded by the HindIII K4L gene, remains to be characterized. Sequence analysis indicates that the vaccinia virus proteins K4 and p37 might have evolved from one or more captured eukaryotic genes involved in cellular lipid metabolism. To date, no catalytic activity of PLD3 has been shown. Furthermore, due to the lack of functionally important histidine and lysine residues in the HKD motif, mammalian PLD5 has been characterized as an inactive PLD. The poxvirus p37 proteins may also lack PLD enzymatic activity, since they contain only one partially conserved HKD motif (N-x-K-x(4)-D). 175
27925 197207 cd09108 PLDc_PMFPLD_like_1 Catalytic domain, repeat 1, of phospholipase D from Streptomyces Sp. Strain PMF and similar proteins. Catalytic domain, repeat 1, of phospholipases D (PLD, EC 3.1.4.4) from Streptomyces Sp. Strain PMF (PMFPLD) and similar proteins, which are generally extracellular and bear N-terminal signal sequences. PMFPLD hydrolyzes the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. In contrast to eukaryotic PLDs, PMFPLD has a compact structure, which consists of two catalytic domains, but lacks the regulatory domains. Each catalytic domain contains one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. Two HKD motifs from two domains form a single active site. Like other PLD enzymes, PMFPLD may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. A calcium-dependent PLD from Streptomyces chromofuscus is excluded from this family, since it displays very little sequence homology with other Streptomyces PLDs. Moreover, it does not contain the conserved HKD motif and hydrolyzes the phospholipids via a different mechanism. 210
27926 197208 cd09109 PLDc_PMFPLD_like_2 Catalytic domain, repeat 2, of phospholipase D from Streptomyces Sp. Strain PMF and similar proteins. Catalytic domain, repeat 2, of phospholipases D (PLD, EC 3.1.4.4) from Streptomyces Sp. Strain PMF (PMFPLD) and similar proteins, which are generally extracellular and bear N-terminal signal sequences. PMFPLD hydrolyzes the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. In contrast to eukaryotic PLDs, PMFPLD has a compact structure, which consists of two catalytic domains, but lacks the regulatory domains. Each catalytic domain contains one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. Two HKD motifs from two domains form a single active site. Like other PLD enzymes, PMFPLD may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. A calcium-dependent PLD from Streptomyces chromofuscus is excluded from this family, since it displays very little sequence homology with other Streptomyces PLDs. Moreover, it does not contain the conserved HKD motif and hydrolyzes the phospholipids via a different mechanism. 212
27927 197209 cd09110 PLDc_CLS_1 Catalytic domain, repeat 1, of bacterial cardiolipin synthase and similar proteins. Catalytic domain, repeat 1, of bacterial cardiolipin (CL) synthase and a few homologs found in eukaryotes and archaea. Bacterial CL synthases catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. The monomer of bacterial CL synthase consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. Bacterial CL synthases can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, bacterial CL synthases utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 154
27928 197210 cd09111 PLDc_ymdC_like_1 Putative catalytic domain, repeat 1, of Escherichia coli uncharacterized protein ymdC and similar proteins. Putative catalytic domain, repeat 1, of Escherichia coli uncharacterized protein ymdC and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli cardiolipin (CL) synthase. The prototype of this subfamily is an uncharacterized protein ymdC specified by the o493 (ymdC) gene. Although the function of ymdC and similar proteins remains unknown, members of this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, ymdC and its similar proteins contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 162
27929 197211 cd09112 PLDc_CLS_2 catalytic domain repeat 2 of bacterial cardiolipin synthase and similar proteins. This CD corresponds to the catalytic domain repeat 2 of bacterial cardiolipin synthase (CL synthase, EC 2.7.8.-) and a few homologs found in eukaryotes and archaea. Bacterial CL synthases catalyze reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of bacterial CL synthase consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-X-K-X(4)-D, where X represents any amino acid residue) that is characteristic of the phospholipase D (PLD) superfamily. Two HKD motifs from two domains together form a single active site involved in phosphatidyl group transfer. Bacterial CL synthases can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity in the PLD superfamily. Like other PLD enzymes, bacterial CL synthases utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid stabilizing the leaving group. 174
27930 197212 cd09113 PLDc_ymdC_like_2 Putative catalytic domain, repeat 2, of Escherichia coli uncharacterized protein ymdC and similar proteins. Putative catalytic domain, repeat 2, of Escherichia coli uncharacterized protein ymdC and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli cardiolipin (CL) synthase. The prototype of this subfamily is an uncharacterized protein ymdC specified by the o493 (ymdC) gene. Although the function of ymdC and similar proteins remains unknown, members of this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, ymdC and its similar proteins contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 218
27931 197213 cd09114 PLDc_PPK1_C1 Catalytic C-terminal domain, first repeat, of prokaryotic polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of bacterial polyphosphate kinases 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. There is a second bacterial-type enzyme, PPK2, which is involved in the synthesis of poly P from GTP or ATP. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily. 162
27932 197214 cd09115 PLDc_PPK1_C2 Catalytic C-terminal domain, second repeat, of prokaryotic polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of bacterial polyphosphate kinases 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. There is a second bacterial-type enzyme, PPK2, which is involved in the synthesis of poly P from GTP or ATP. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily. 162
27933 197215 cd09116 PLDc_Nuc_like Catalytic domain of EDTA-resistant nuclease Nuc, vertebrate phospholipase D6, and similar proteins. Catalytic domain of EDTA-resistant nuclease Nuc, vertebrate phospholipase D6 (PLD6, EC 3.1.4.4), and similar proteins. Nuc is an endonuclease from Salmonella typhimurium and the smallest known member of the PLD superfamily. It cleaves both single- and double-stranded DNA. PLD6 selectively hydrolyzes the terminal phosphodiester bond of phosphatidylcholine (PC), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLD6 also catalyzes the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which is essential for catalysis. PLDs utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. This subfamily also includes some uncharacterized hypothetical proteins, which have two HKD motifs in a single polypeptide chain. 138
27934 197216 cd09117 PLDc_Bfil_DEXD_like Catalytic domain of type II restriction endonucleases BfiI and NgoFVII, and uncharacterized proteins with a DEAD domain. Catalytic domain of type II restriction endonucleases BfiI and NgoFVII, uncharacterized type III restriction endonuclease Res subunit, and uncharacterized DNA/RNA helicase superfamily II members. Proteins in this family are found mainly in prokaryotes. They contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain, and have been classified as members of the phospholipase D (PLD, EC 3.1.4.4) superfamily. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. BfiI forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains that contain the two HKD motifs from both subunits. BfiI utilizes a single active site to cut both DNA strands, which represents a novel mechanism for the scission of double-stranded DNA. It uses a histidine residue from the HKD motif in one subunit as the nucleophile for the cleavage of the target phosphodiester bond in both of the anti-parallel DNA strands, while the symmetrically-related histidine residue from the HKD motif of the opposite subunit acts as the proton donor/acceptor during both strand-scission events. 117
27935 197217 cd09118 PLDc_yjhR_C_like C-terminal domain of Escherichia coli uncharacterized protein yjhR and similar proteins. C-terminal domain of Escherichia coli uncharacterized protein yjhR, encoded by the o338 gene, and similar proteins. Although the biological function of yjhR remains unknown, it shows sequence similarity to the C-terminal portions of superfamily I DNA and RNA helicases, which are ubiquitous enzymes mediating ATP-dependent unwinding of DNA and RNA duplexes, and play essential roles in gene replication and expression. Moreover, the C-termini of yjhR and similar proteins contain one HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. The PLDc-like domain of yjhR is similar to bacterial endonucleases, Nuc and BfiI, both of which have only one copy of the HKD motif per chain. They function as homodimers, with a single active site at the dimer interface containing the HKD motifs from both subunits. They utilize a two-step mechanism to cleave phosphodiester bonds. Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. 144
27936 197218 cd09119 PLDc_FAM83_N N-terminal phospholipase D-like domain of proteins from the Family with sequence similarity 83. N-terminal phospholipase D (PLD)-like domain of vertebrate proteins from the Family with sequence similarity 83 (FAM83), which is comprised of 8 members, designated FAM83A through FAM83H. Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, the FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are unlikely to carry PLD activity. Members of the FAM83 are mostly uncharacterized proteins. FAM83A, also known as tumor antigen BJ-TSA-9, is a novel tumor-specific gene highly expressed in human lung adenocarcinoma. FAM83D, also known as spindle protein CHICA, is a cell-cycle-regulated spindle component which localizes to the mitotic spindle and is both upregulated and phosphorylated during mitosis. The gene encoding protein FAM83H is the first gene involved in the etiology of amelogenesis imperfecta (AI), that encodes a non-secreted protein due to the absence of a signal peptide. Defects in gene FAM83H cause autosomal dominant hypocalcified amelogenesis imperfecta (ADHCAI). FAM83B, FAM83C, FAM83F, and FAM83G are uncharacterized proteins present across vertebrates while FAM83E is an uncharacterized protein found only in mammals. 269
27937 197219 cd09120 PLDc_DNaseII_1 Catalytic domain, repeat 1, of Deoxyribonuclease II and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II (DNase II, EC 3.1.22.1), an endodeoxyribonuclease with ubiquitous tissue distribution. It is essential for accessory apoptotic DNA fragmentation and DNA clearance during development, as well as in tissue regeneration in higher eukaryotes. Unlike the majority of nucleases, DNase II functions optimally at acidic pH in the absence of divalent metal ion cofactors. It hydrolyzes the phosphodiester backbone of DNA by a single strand cleavage mechanism to generate 3'-phosphate termini. The majority of family members contain an N-terminal signal-peptide leader sequence, which is critical for N-glycosylation and DNase II activity. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta (also known as DNase II-like acid DNase, DLAD) subtypes. A few homologs are found in non-metazoan species, but none are found in fungi, plants or prokaryotes, with the sole exception of Burkholderia pseudomallei. Among those homologs, the Caenorhabditis elegans C07B5.5 ORF encoding NUC-1 apoptotic nuclease, the uncharacterized C. elegans crn-6 (cell death related nuclease) gene encoding protein, and the putative gene CG7780 encoding Drosophila DNase II (dDNase II) have similar cleavage activity and specificity to mammalian DNase II enzymes. They may function like an acid DNase implicated in degrading DNA from apoptotic cells engulfed by macrophages. 
Plancitoxin I, the major lethal factor from the Acanthaster planci venom, is a unique homolog of mammalian DNase II. It has potent hepatotoxicity and the optimum pH for its activity is 7.2, unlike the optimum acidic pH for mammalian DNase II. Some members of this family contain substitutions of conserved residues found in the putative active site, which suggest that these proteins may have diverged from a canonical DNase II activity and may perform other functions. 141
27938 197220 cd09121 PLDc_DNaseII_2 Catalytic domain, repeat 2, of Deoxyribonuclease II and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II (DNase II, EC 3.1.22.1), an endodeoxyribonuclease with ubiquitous tissue distribution. It is essential for accessory apoptotic DNA fragmentation and DNA clearance during development, as well as in tissue regeneration in higher eukaryotes. Unlike the majority of nucleases, DNase II functions optimally at acidic pH in the absence of divalent metal ion cofactors. It hydrolyzes the phosphodiester backbone of DNA by a single strand cleavage mechanism to generate 3'-phosphate termini. The majority of family members contain an N-terminal signal-peptide leader sequence, which is critical for N-glycosylation and DNase II activity. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta (also known as DNase II-like acid DNase, DLAD) subtypes. A few homologs are found in non-metazoan species, but none are found in fungi, plants or prokaryotes, with the sole exception of Burkholderia pseudomallei. Among those homologs, the Caenorhabditis elegans C07B5.5 ORF encoding NUC-1 apoptotic nuclease, the uncharacterized C. elegans crn-6 (cell death related nuclease) gene encoding protein, and the putative gene CG7780 encoding Drosophila DNase II (dDNase II) have similar cleavage activity and specificity to mammalian DNase II enzymes. They may function like an acid DNase implicated in degrading DNA from apoptotic cells engulfed by macrophages. 
Plancitoxin I, the major lethal factor from the Acanthaster planci venom, is a unique homolog of mammalian DNase II. It has potent hepatotoxicity and the optimum pH for its activity is 7.2, unlike the optimum acidic pH for mammalian DNase II. Some members of this family contain substitutions of conserved residues found in the putative active site, which suggest that these proteins may have diverged from the canonical DNase II activity and may perform other functions. 139
27939 197221 cd09122 PLDc_Tdp1_1 Catalytic domain, repeat 1, of Tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of Tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-), which exists in eukaryotes but not in prokaryotes. Tdp1 acts as an important DNA repair enzyme that removes stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is a monomeric protein that contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Thus, this family represents a distinct class within the PLD superfamily. Like other PLD enzymes, Tdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 145
27940 197222 cd09123 PLDc_Tdp1_2 Catalytic domain, repeat 2, of tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of Tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-), which exists in eukaryotes but not in prokaryotes. Tdp1 acts as an important DNA repair enzyme that removes stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is a monomeric protein that contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Thus, this family represents a distinct class within the PLD superfamily. Like other PLD enzymes, Tdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 182
27941 197223 cd09124 PLDc_like_TrmB_middle Middle phospholipase D-like domain of the transcriptional regulator TrmB and similar proteins. Middle phospholipase D (PLD)-like domain of the transcriptional regulator TrmB and similar proteins. TrmB acts as a bifunctional sugar-sensing transcriptional regulator which controls two operons encoding maltose/trehalose and maltodextrin ABC transporters of Pyrococcus furiosus. It functions as a dimer. Full length TrmB includes an N-terminal DNA-binding domain, a C-terminal sugar-binding domain and middle region that has been named as a PLD-like domain. The middle domain displays homology to PLD enzymes, which contain one or two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) per chain. The HKD motif characterizes the PLD superfamily. Due to the lack of key residues related to PLD activity in the PLD-like domain, members of this subfamily are unlikely to carry PLD activity. 126
27942 197224 cd09126 PLDc_C_DEXD_like C-terminal putative phospholipase D-like domain of uncharacterized prokaryotic HKD family nucleases fused to DEAD/DEAH box helicases. C-terminal putative phospholipase D (PLD)-like domain of uncharacterized prokaryotic HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. In addition to the helicase-like region, members of this family also contain a PLD-like domain in the C-terminal region, which is characterized by a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Due to the lack of key residues related to PLD activity in the variant HKD motif, members of this subfamily are most unlikely to carry PLD activity. 126
27943 197225 cd09127 PLDc_unchar1_1 Putative catalytic domain, repeat 1, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 1, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 141
27944 197226 cd09128 PLDc_unchar1_2 Putative catalytic domain, repeat 2, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 2, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 142
27945 197227 cd09129 PLDc_unchar2_1 Putative catalytic domain, repeat 1, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 1, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 196
27946 197228 cd09130 PLDc_unchar2_2 Putative catalytic domain, repeat 2, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 2, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 157
27947 197229 cd09131 PLDc_unchar3 Putative catalytic domain of uncharacterized phospholipase D-like proteins. Putative catalytic domain of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. Members of this subfamily contain one copy of HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. 143
27948 197230 cd09132 PLDc_unchar4 Putative catalytic domain of uncharacterized phospholipase D-like proteins. Putative catalytic domain of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. Members of this subfamily contain one copy of HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. 122
27949 197231 cd09133 PLDc_unchar5 Putative catalytic domain of uncharacterized hypothetical proteins with one or two copies of the HKD motif. Putative catalytic domain of uncharacterized hypothetical proteins with similarity to phospholipase D (PLD, EC 3.1.4.4). PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain one or two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. 127
27950 197232 cd09134 PLDc_PSS_G_neg_1 Catalytic domain, repeat 1, of phosphatidylserine synthases from gram-negative bacteria. Catalytic domain, repeat 1, of phosphatidylserine synthases (PSSs) from gram-negative bacteria. There are two subclasses of PSS enzymes in bacteria: subclass I of gram-negative bacteria and subclass II of gram-positive bacteria. It is common that PSSs in gram-positive bacteria and yeast are tight membrane-associated enzymes. By contrast, the gram-negative bacterial PSSs, such as Escherichia coli PSS, are commonly bound to the ribosomes. They are peripheral membrane proteins that can interact with the surface of the inner membrane by binding to the lipid substrate (CDP-diacylglycerol) and the lipid product (phosphatidylserine). The prototypical member of this subfamily is Escherichia coli PSS (also called CDP-diacylglycerol-L-serine O-phosphatidyltransferase, EC 2.7.8.8), which catalyzes the exchange reactions between CMP and CDP-diacylglycerol, and between serine and phosphatidylserine. The phosphatidylserine is then decarboxylated by phosphatidylserine decarboxylase to yield phosphatidylethanolamine, the major phospholipid in Escherichia coli. It also catalyzes the hydrolysis of CDP-diacylglycerol to form phosphatidic acid with the release of CMP. PSS may utilize a ping-pong mechanism involving a phosphatidyl-enzyme intermediate, which is distinct from those of gram-positive bacterial phosphatidylserine synthases. Moreover, all members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. The two motifs constitute an active site for the formation of a covalent substrate-enzyme intermediate. 173
27951 197233 cd09135 PLDc_PGS1_euk_1 Catalytic domain, repeat 1, of eukaryotic PhosphatidylGlycerophosphate Synthases. Catalytic domain, repeat 1, of eukaryotic phosphatidylglycerophosphate (PGP) synthases, also called CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (EC 2.7.8.5). Eukaryotic PGP synthases are different and unrelated to prokaryotic PGP synthases and yeast phosphatidylserine synthase. They catalyze the synthesis of PGP from CDP-diacylglycerol and sn-glycerol 3-phosphate, the committed and rate-limiting step in the biosynthesis of cardiolipin (CL), which is an essential component of many mitochondrial functions in eukaryotes. Members in this subfamily all have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism involving a substrate-enzyme intermediate to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involved in the phosphatidyl group transfer. 170
27952 197234 cd09136 PLDc_PSS_G_neg_2 Catalytic domain, repeat 2, of phosphatidylserine synthases from gram-negative bacteria. Catalytic domain, repeat 2, of phosphatidylserine synthases (PSSs) from gram-negative bacteria. There are two subclasses of PSS enzymes in bacteria: subclass I of gram-negative bacteria and subclass II of gram-positive bacteria. It is common that PSSs in gram-positive bacteria and yeast are tight membrane-associated enzymes. By contrast, the gram-negative bacterial PSSs, such as Escherichia coli PSS, are commonly bound to the ribosomes. They are peripheral membrane proteins that can interact with the surface of the inner membrane by binding to the lipid substrate (CDP-diacylglycerol) and the lipid product (phosphatidylserine). The prototypical member of this subfamily is Escherichia coli PSS (also called CDP-diacylglycerol-L-serine O-phosphatidyltransferase, EC 2.7.8.8), which catalyzes the exchange reactions between CMP and CDP-diacylglycerol, and between serine and phosphatidylserine. The phosphatidylserine is then decarboxylated by phosphatidylserine decarboxylase to yield phosphatidylethanolamine, the major phospholipid in Escherichia coli. It also catalyzes the hydrolysis of CDP-diacylglycerol to form phosphatidic acid with the release of CMP. PSS may utilize a ping-pong mechanism involving a phosphatidyl-enzyme intermediate, which is distinct from those of gram-positive bacterial phosphatidylserine synthases. Moreover, all members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. The two motifs constitute an active site for the formation of a covalent substrate-enzyme intermediate. 215
27953 197235 cd09137 PLDc_PGS1_euk_2 Catalytic domain, repeat 2, of eukaryotic phosphatidylglycerophosphate synthases. Catalytic domain, repeat 2, of eukaryotic phosphatidylglycerophosphate (PGP) synthases, also called CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (EC 2.7.8.5). Eukaryotic PGP synthases are different and unrelated to prokaryotic PGP synthases and yeast phosphatidylserine synthase. They catalyze the synthesis of PGP from CDP-diacylglycerol and sn-glycerol 3-phosphate, the committed and rate-limiting step in the biosynthesis of cardiolipin (CL), which is an essential component of many mitochondrial functions in eukaryotes. Members in this subfamily all have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism involving a substrate-enzyme intermediate to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involved in the phosphatidyl group transfer. 186
27954 197236 cd09138 PLDc_vPLD1_2_yPLD_like_1 Catalytic domain, repeat 1, of vertebrate phospholipases, PLD1 and PLD2, yeast PLDs, and similar proteins. Catalytic domain, repeat 1, of vertebrate phospholipases D (PLD1 and PLD2), yeast phospholipase D (PLD SPO14/PLD1), and other similar eukaryotic proteins. These PLD enzymes play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. The vertebrate PLD1 and PLD2 are membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzymes that selectively hydrolyze phosphatidylcholine (PC). Protein cofactors and calcium may be required for their activation. Yeast SPO14/PLD1 is a calcium-independent PLD, which needs PIP2 for its activity. Instead of the regulatory calcium-dependent phospholipid-binding C2 domain in plants, most mammalian and yeast PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at the N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. The PX and PH domains are also present in zeta-type PLD from Arabidopsis, which is more closely related to vertebrate PLDs than to other plant PLD types. In addition, this subfamily also includes some related proteins which have either PX-like or PH domains in their N-termini. Like other members of the PLD superfamily, the monomer of mammalian and yeast PLDs consists of two catalytic domains, each containing one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from the two domains form a single active site. 
These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 146
27955 197237 cd09139 PLDc_pPLD_like_1 Catalytic domain, repeat 1, of plant phospholipase D and similar proteins. Catalytic domain, repeat 1, of plant phospholipase D (PLD, EC 3.1.4.4) and similar proteins. Plant PLDs have broad substrate specificity and can hydrolyze the terminal phosphodiester bond of several common membrane phospholipids such as phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), and phosphatidylserine (PS), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Most plant PLDs possess a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and require calcium for activity, which is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDs may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. This subfamily includes two types of plant PLDs, alpha-type and beta-type PLDs, which are derived from different gene products and distinctly regulated. The zeta-type PLD from Arabidopsis is not included in this subfamily. 176
27956 197238 cd09140 PLDc_vPLD1_2_like_bac_1 Catalytic domain, repeat 1, of uncharacterized bacterial proteins with similarity to vertebrate phospholipases, PLD1 and PLD2. Catalytic domain, repeat 1, of uncharacterized bacterial counterparts of vertebrate, yeast and plant phospholipase D (PLD, EC 3.1.4.4). PLDs hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. They also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Instead of the regulatory C2 (calcium-activated lipid binding) domain in plants and the adjacent Phox (PX) and the Pleckstrin homology (PH) N-terminal domains in most mammalian and yeast PLDs, many members in this subfamily contain a SNARE associated C-terminal domain, whose functional role is unclear. Like other PLD enzymes, members in this subfamily contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), that may play an important role in the catalysis. 146
27957 197239 cd09141 PLDc_vPLD1_2_yPLD_like_2 Catalytic domain, repeat 2, of vertebrate phospholipases, PLD1 and PLD2, yeast PLDs, and similar proteins. Catalytic domain, repeat 2, of vertebrate phospholipases D (PLD1 and PLD2), yeast phospholipase D (PLD SPO14/PLD1), and other similar eukaryotic proteins. These PLD enzymes play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. The vertebrate PLD1 and PLD2 are membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzymes that selectively hydrolyze phosphatidylcholine (PC). Protein cofactors and calcium may be required for their activation. Yeast SPO14/PLD1 is a calcium-independent PLD, which needs PIP2 for its activity. Instead of the regulatory calcium-dependent phospholipid-binding C2 domain in plants, most mammalian and yeast PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at the N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. The PX and PH domains are also present in zeta-type PLD from Arabidopsis, which is more closely related to vertebrate PLDs than to other plant PLD types. In addition, this subfamily also includes some related proteins which have either PX-like or PH domains in their N-termini. Like other members of the PLD superfamily, the monomer of mammalian and yeast PLDs consists of two catalytic domains, each containing one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from the two domains form a single active site. 
These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 183
27958 197240 cd09142 PLDc_pPLD_like_2 Catalytic domain, repeat 2, of plant phospholipase D and similar proteins. Catalytic domain, repeat 2, of plant phospholipase D (PLD, EC 3.1.4.4) and similar proteins. Plant PLDs have broad substrate specificity and can hydrolyze the terminal phosphodiester bond of several common membrane phospholipids such as phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), and phosphatidylserine (PS), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Most plant PLDs possess a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and require calcium for activity, which is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDs may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. This subfamily includes two types of plant PLDs, alpha-type and beta-type PLDs, which are derived from different gene products and distinctly regulated. The zeta-type PLD from Arabidopsis is not included in this subfamily. 208
27959 197241 cd09143 PLDc_vPLD1_2_like_bac_2 Catalytic domain, repeat 2, of uncharacterized bacterial proteins with similarity to vertebrate phospholipases, PLD1 and PLD2. Catalytic domain, repeat 2, of uncharacterized bacterial counterparts of vertebrate, yeast and plant phospholipase D (PLD, EC 3.1.4.4). PLDs hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. They also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Instead of the regulatory C2 (calcium-activated lipid binding) domain in plants and the adjacent Phox (PX) and the Pleckstrin homology (PH) N-terminal domains in most mammalian and yeast PLDs, many members in this subfamily contain a SNARE associated C-terminal domain, whose functional role is unclear. Like other PLD enzymes, members in this subfamily contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), that may play an important role in the catalysis. 142
27960 197242 cd09144 PLDc_vPLD3_1 Putative catalytic domain, repeat 1, of vertebrate phospholipase PLD3. Putative catalytic domain, repeat 1, of phospholipase D3 (PLD3, EC 3.1.4.4). The human protein is also known as Hu-K4 or HUK4 and it was identified as a human homolog of the vaccinia virus protein K4, which is encoded by the HindIII K4L gene. PLD3 is found in many human organs with highest expression levels found in the central nervous system. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD3 has been assigned to the PLD superfamily although no catalytic activity has been detected experimentally. PLD3 is a membrane-bound protein that colocalizes with protein disulfide isomerase, an endoplasmic reticulum (ER) protein. Like other homologs of protein K4, PLD3 might alter the lipid content of associated membranes by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement. 172
27961 197243 cd09145 PLDc_vPLD4_1 Putative catalytic domain, repeat 1, of vertebrate phospholipase PLD4. Putative catalytic domain, repeat 1, of vertebrate phospholipases D4 (PLD4, EC 3.1.4.4), homologs of the vaccinia virus protein K4 which is encoded by the HindIII K4L gene. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD4 has been assigned to the PLD superfamily although no catalytic activity has been detected to date. Unlike PLD1 and PLD2, PLD4 does not contain Phox (PX) and Pleckstrin homology (PH) domains but has a putative transmembrane domain. Like other vertebrate homologs of protein K4, PLD4 might be associated with Golgi membranes and alter their lipid content by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement. 170
27962 197244 cd09146 PLDc_vPLD5_1 Putative catalytic domain, repeat 1, of inactive vertebrate phospholipase PLD5. Putative catalytic domain, repeat 1, of inactive vertebrate phospholipases D5 (PLD5, EC 3.1.4.4), homologs of the vaccinia virus protein K4 encoded by the HindIII K4L gene. Vertebrate PLD5 has been assigned to the PLD superfamily, since it shows high sequence similarity to other human homologs of protein K4, which contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). However, due to the lack of functionally important histidine and lysine residues in the HKD motif, vertebrate PLD5 has been characterized as an inactive PLD. 163
27963 197245 cd09147 PLDc_vPLD3_2 Putative catalytic domain, repeat 2, of vertebrate phospholipase PLD3. Putative catalytic domain, repeat 2, of phospholipase D3 (PLD3, EC 3.1.4.4). The human protein is also known as Hu-K4 or HUK4 and it was identified as a human homolog of the vaccinia virus protein K4, which is encoded by the HindIII K4L gene. PLD3 is found in many human organs with highest expression levels found in the central nervous system. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD3 has been assigned to the PLD superfamily although no catalytic activity has been detected experimentally. PLD3 is a membrane-bound protein that colocalizes with protein disulfide isomerase, an endoplasmic reticulum (ER) protein. Like other homologs of protein K4, PLD3 might alter the lipid content of associated membranes by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement. 186
27964 197246 cd09148 PLDc_vPLD4_2 Putative catalytic domain, repeat 2, of vertebrate phospholipase PLD4. Putative catalytic domain, repeat 2, of vertebrate phospholipases D4 (PLD4, EC 3.1.4.4), homologs of the vaccinia virus protein K4 which is encoded by the HindIII K4L gene. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD4 has been assigned to the PLD superfamily although no catalytic activity has been detected to date. Unlike PLD1 and PLD2, PLD4 does not contain Phox (PX) and Pleckstrin homology (PH) domains but has a putative transmembrane domain. Like other vertebrate homologs of protein K4, PLD4 might be associated with Golgi membranes and alter their lipid content by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement. 187
27965 197247 cd09149 PLDc_vPLD5_2 Putative catalytic domain, repeat 2, of inactive vertebrate phospholipase PLD5. Putative catalytic domain, repeat 2, of inactive vertebrate phospholipases D5 (PLD5, EC 3.1.4.4), homologs of the vaccinia virus protein K4 encoded by the HindIII K4L gene. Vertebrate PLD5 has been assigned to the PLD superfamily, since it shows high sequence similarity to other human homologs of protein K4, which contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). However, due to the lack of functionally important histidine and lysine residues in the HKD motif, vertebrate PLD5 has been characterized as an inactive PLD. 188
27966 197248 cd09150 PLDc_Ymt_1 Putative catalytic domain, repeat 1, of Yersinia pestis murine toxin-like proteins. Putative catalytic domain, repeat 1, of Yersinia pestis murine toxin (Ymt), a plasmid-encoded phospholipase D (PLD, EC 3.1.4.4), and similar proteins. Ymt is important in order for Yersinia pestis to survive and spread. It is toxic to mice and rats but not to other animals. It is not a conventional secreted exotoxin, but a cytoplasmic protein that is released upon bacterial lysis. Ymt may be active as a dimer. The monomeric Ymt consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Ymt has PLD-like activity and has been classified into the PLD superfamily. It hydrolyzes the terminal phosphodiester bond in several phospholipids, with preference for phosphatidylethanolamine (PE) over phosphatidylcholine (PC) and phosphatidylserine (PS). Like other PLD enzymes, Ymt may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. In terms of sequence similarity, Ymt is closely related to Streptomyces PLDs. 215
27967 197249 cd09151 PLDc_Ymt_2 Putative catalytic domain, repeat 2, of Yersinia pestis murine toxin-like proteins. Putative catalytic domain, repeat 2, of Yersinia pestis murine toxin (Ymt), a plasmid-encoded phospholipase D (PLD, EC 3.1.4.4), and similar proteins. Ymt is important in order for Yersinia pestis to survive and spread. It is toxic to mice and rats but not to other animals. It is not a conventional secreted exotoxin, but a cytoplasmic protein that is released upon bacterial lysis. Ymt may be active as a dimer. The monomeric Ymt consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Ymt has PLD-like activity and has been classified into the PLD superfamily. It hydrolyzes the terminal phosphodiester bond in several phospholipids, with preference for phosphatidylethanolamine (PE) over phosphatidylcholine (PC) and phosphatidylserine (PS). Like other PLD enzymes, Ymt may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. In terms of sequence similarity, Ymt is closely related to Streptomyces PLDs. 264
27968 197250 cd09152 PLDc_EcCLS_like_1 Catalytic domain, repeat 1, of Escherichia coli cardiolipin synthase and similar proteins. Catalytic domain, repeat 1, of Escherichia coli cardiolipin (CL) synthase and similar proteins. Escherichia coli CL synthase (EcCLS), specified by the cls gene, is the prototype of this family. EcCLS is a multi-pass membrane protein that catalyzes reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of EcCLS consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. EcCLS can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, EcCLS utilizes a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 163
27969 197251 cd09154 PLDc_SMU_988_like_1 Putative catalytic domain, repeat 1, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Putative catalytic domain, repeat 1, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Although SMU_988 and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 155
27970 197252 cd09155 PLDc_PaCLS_like_1 Putative catalytic domain, repeat 1, of Pseudomonas aeruginosa cardiolipin synthase and similar proteins. Putative catalytic domain, repeat 1, of Pseudomonas aeruginosa cardiolipin (CL) synthase (PaCLS) and similar proteins. Although PaCLS and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, PaCLS and other members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 156
27971 197253 cd09156 PLDc_CLS_unchar1_1 Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 154
27972 197254 cd09157 PLDc_CLS_unchar2_1 Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 155
27973 197255 cd09158 PLDc_EcCLS_like_2 Catalytic domain, repeat 2, of Escherichia coli cardiolipin synthase and similar proteins. Catalytic domain, repeat 2, of Escherichia coli cardiolipin (CL) synthase and similar proteins. Escherichia coli CL synthase (EcCLS), specified by the cls gene, is the prototype of this family. EcCLS is a multi-pass membrane protein that catalyzes reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of EcCLS consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. EcCLS can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, EcCLS utilizes a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 174
27974 197256 cd09159 PLDc_ybhO_like_2 Catalytic domain, repeat 2, of Escherichia coli cardiolipin synthase ybhO and similar proteins. Catalytic domain, repeat 2, of Escherichia coli cardiolipin (CL) synthase ybhO and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli CL synthase. The prototype of this subfamily is Escherichia coli CL synthase ybhO specified by the f413 (ybhO) gene. ybhO is a membrane-bound protein that catalyzes the formation of cardiolipin (CL) by transferring phosphatidyl group between two phosphatidylglycerol molecules. It can also catalyze phosphatidyl group transfer to water to form phosphatidate. In contrast to the Escherichia coli CL synthase encoded by the cls gene (EcCLS), ybhO does not hydrolyze CL. Moreover, ybhO lacks an N-terminal segment encoded by Escherichia coli cls, which makes ybhO easy to denature. The monomer of ybhO consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. ybhO can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. 170
27975 197257 cd09160 PLDc_SMU_988_like_2 Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Although SMU_988 and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 176
27976 197258 cd09161 PLDc_PaCLS_like_2 Putative catalytic domain, repeat 2, of Pseudomonas aeruginosa cardiolipin synthase and similar proteins. Putative catalytic domain, repeat 2, of Pseudomonas aeruginosa cardiolipin (CL) synthase (PaCLS) and similar proteins. Although PaCLS and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, PaCLS and other members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 176
27977 197259 cd09162 PLDc_CLS_unchar1_2 Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 172
27978 197260 cd09163 PLDc_CLS_unchar2_2 Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer. 176
27979 197261 cd09164 PLDc_EcPPK1_C1_like Catalytic C-terminal domain, first repeat, of Escherichia coli polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of Escherichia coli polyphosphate kinase 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. The prototype of this subfamily is Escherichia coli polyphosphate kinase (EcPPK), which forms a homotetramer in solution, and becomes a homodimer upon the binding of AMPPNP, a non-hydrolysable ATP analogue. Each EcPPK monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of EcPPK are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of EcPPK. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. 162
27980 197262 cd09165 PLDc_PaPPK1_C1_like Catalytic C-terminal domain, first repeat, of Pseudomonas aeruginosa polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of polyphosphate kinase (Poly P kinase 1 or PPK1, EC 2.7.4.1) from Pseudomonas aeruginosa (PaPPK1), Dictyostelium discoideum (DdPPK1), and other similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PaPPK1 is the key enzyme responsible for the synthesis of Poly P in Pseudomonas aeruginosa. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. PaPPK1 shows high sequence homology to Escherichia coli polyphosphate kinase (EcPPK), which contains four structural domains per chain: the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. The polyphosphate kinase from Dictyostelium discoideum (DdPPK1) shares similar structural features with EcPPK1 in the ATP-binding pocket and poly P tunnel, but has a unique N-terminal extension that may be responsible for its enzymatic activity, cellular localization, and physiological functions. In spite of the lack of sequence homology, the C1 and C2 domains of the family members are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. 
It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. In some bacteria, such as Pseudomonas aeruginosa, a second enzyme, PPK2, which is involved in the alternative pathway of polyphosphate synthesis, has been found. It can catalyze the synthesis of poly P from GTP or ATP, with a preference for Mn2+ over Mg2+. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily. 164
27981 197263 cd09166 PLDc_PPK1_C1_unchar Catalytic C-terminal domain, first repeat, of uncharacterized prokaryotic polyphosphate kinases. Catalytic C-terminal domain, first repeat (C1 domain), of a group of uncharacterized prokaryotic polyphosphate kinases (Poly P kinase 1 or PPK1, EC 2.7.4.1). Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. 162
27982 197264 cd09167 PLDc_EcPPK1_C2_like Catalytic C-terminal domain, second repeat, of Escherichia coli polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of Escherichia coli polyphosphate kinase 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. The prototype of this subfamily is Escherichia coli polyphosphate kinase (EcPPK), which forms a homotetramer in solution, and becomes a homodimer upon the binding of AMPPNP, a non-hydrolysable ATP analogue. Each EcPPK monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of EcPPK are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of EcPPK. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. 165
27983 197265 cd09168 PLDc_PaPPK1_C2_like Catalytic C-terminal domain, second repeat, of Pseudomonas aeruginosa polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of polyphosphate kinase (Poly P kinase 1 or PPK1, EC 2.7.4.1) from Pseudomonas aeruginosa (PaPPK1), Dictyostelium discoideum (DdPPK1), and other similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PaPPK1 is the key enzyme responsible for the synthesis of Poly P in Pseudomonas aeruginosa. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. PaPPK1 shows high sequence homology to Escherichia coli polyphosphate kinase (EcPPK), which contains four structural domains per chain: the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. The polyphosphate kinase from Dictyostelium discoideum (DdPPK1) shares similar structural features with EcPPK1 in the ATP-binding pocket and poly P tunnel, but has a unique N-terminal extension that may be responsible for its enzymatic activity, cellular localization, and physiological functions. In spite of the lack of sequence homology, the C1 and C2 domains of the family members are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. 
It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. In some bacteria, such as Pseudomonas aeruginosa, a second enzyme, PPK2, which is involved in the alternative pathway of polyphosphate synthesis, has been found. It can catalyze the synthesis of poly P from GTP or ATP, with a preference for Mn2+ over Mg2+. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily. 163
27984 197266 cd09169 PLDc_PPK1_C2_unchar Catalytic C-terminal domain, second repeat, of uncharacterized prokaryotic polyphosphate kinases. Catalytic C-terminal domain, second repeat (C2 domain), of a group of uncharacterized prokaryotic polyphosphate kinases (Poly P kinase 1 or PPK1, EC 2.7.4.1). Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. 162
27985 197267 cd09170 PLDc_Nuc Catalytic domain of EDTA-resistant nuclease Nuc from Salmonella typhimurium and similar proteins. Catalytic domain of an EDTA-resistant nuclease Nuc from Salmonella typhimurium and similar proteins. Nuc is an endonuclease cleaving both single- and double-stranded DNA. It is the smallest known member of the phospholipase D (PLD, EC 3.1.4.4) superfamily that includes a diverse group of proteins with various catalytic functions. Most members of this superfamily have two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain and both are required for catalytic activity. However, Nuc only has one copy of the HKD motif per subunit but forms a functionally active homodimer (it is most likely also active in solution as a multimeric protein), which has a single active site at the dimer interface containing the HKD motifs from both subunits. Due to the lack of a distinct domain for DNA binding, Nuc cuts DNA non-specifically. It utilizes a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. 142
27986 197268 cd09171 PLDc_vPLD6_like Catalytic domain of vertebrate phospholipase D6 and similar proteins. Catalytic domain of vertebrate phospholipase D6 (PLD6, EC 3.1.4.4), a homolog of the EDTA-resistant nuclease Nuc from Salmonella typhimurium, and similar proteins. PLD6 can selectively hydrolyze the terminal phosphodiester bond of phosphatidylcholine (PC) with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. PLD6 belongs to the phospholipase D (PLD) superfamily. Its monomer contains a short conserved sequence motif, H-x-K-x(4)-D (where x represents any amino acid residue), termed the HKD motif, which is essential in catalysis. PLD6 is more closely related to the nuclease Nuc than to other vertebrate phospholipases, which have two copies of the HKD motif in a single polypeptide chain. Like Nuc, PLD6 may utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from the HKD motif of one subunit to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. 136
27987 197269 cd09172 PLDc_Nuc_like_unchar1_1 Putative catalytic domain, repeat 1, of uncharacterized hypothetical proteins similar to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain, repeat 1, of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain. 144
27988 197270 cd09173 PLDc_Nuc_like_unchar1_2 Putative catalytic domain, repeat 2, of uncharacterized hypothetical proteins similar to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain, repeat 2, of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain. 159
27989 197271 cd09174 PLDc_Nuc_like_unchar2 Putative catalytic domain of uncharacterized hypothetical proteins closely related to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain. 136
27990 197272 cd09175 PLDc_Bfil Catalytic domain of type IIs restriction endonuclease BfiI and similar proteins. Catalytic domain of a novel type IIs restriction endonuclease BfiI and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. Unlike all other restriction enzymes known to date, BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. BfiI presumably evolved through domain fusion of a DNA recognition domain to the catalytic Nuc-like domain from the PLD superfamily. Most PLD enzymes have two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain and both are required for catalytic activity. However, BfiI contains only one HKD motif per protein chain and forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains that contain the two HKD motifs from both subunits. BfiI utilizes a single active site to cut both DNA strands, which represents a novel mechanism for the scission of double-stranded DNA. It uses a histidine residue from the HKD motif in one subunit as the nucleophile for the cleavage of the target phosphodiester bond in both of the anti-parallel DNA strands, while the symmetrically-related histidine residue from the HKD motif of the opposite subunit acts as the proton donor/acceptor during both strand-scission events. 161
27991 197273 cd09176 PLDc_unchar6 Putative catalytic domain of uncharacterized hypothetical proteins with one or two copies of the HKD motif. Putative catalytic domain of uncharacterized hypothetical proteins with similarity to phospholipase D (PLD, EC 3.1.4.4). PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain one or two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. 114
27992 197274 cd09177 PLDc_RE_NgoFVII Putative catalytic domain of type II restriction enzyme NgoFVII and similar proteins. Putative catalytic domain of type II restriction enzyme NgoFVII (EC 3.1.21.4), which shows high sequence similarity to type IIs restriction endonuclease BfiI. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. The prototype of this subfamily is the NgoFVII restriction endonuclease from Neisseria gonorrhoeae. It plays an essential role in the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. It recognizes the double-stranded sequence GCSGC and cleaves after G-4. Members of this subfamily contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) per protein chain and have been classified into the phospholipase D (PLD, EC 3.1.4.4) superfamily. 143
27993 197275 cd09178 PLDc_N_Snf2_like N-terminal putative catalytic domain of uncharacterized HKD family nucleases fused to putative helicases from the Snf2-like family. N-terminal putative catalytic domain of uncharacterized archaeal and prokaryotic HKD family nucleases fused to putative helicases from the Snf2-like family, which belong to the DNA/RNA helicase superfamily II (SF2). Although Snf2-like family enzymes do not possess helicase activity, they contain a helicase-like region, where seven helicase-related sequence motifs are found, similar to those in DEAD/DEAH box helicases, which represent the biggest family within the SF2 superfamily. In addition to the helicase-like region, members of this family also contain an N-terminal putative catalytic domain with one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified as members of the phospholipase D (PLD, EC 3.1.4.4) superfamily. 134
27994 197276 cd09179 PLDc_N_DEXD_a N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain), include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. 190
27995 197277 cd09180 PLDc_N_DEXD_b N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain), include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. A few family members contain additional domains, like a C-terminal peptidase S24-like domain. 142
27996 197278 cd09181 PLDc_FAM83A_N N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83A. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83A (FAM83A), also known as tumor antigen BJ-TSA-9. FAM83A or BJ-TSA-9 is a novel tumor-specific gene highly expressed in human lung adenocarcinoma. Due to this specific expression pattern, it may serve as a biomarker for lung cancer, especially in the early detection of micrometastasis for lung adenocarcinoma patients. Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. 276
27997 197279 cd09182 PLDc_FAM83B_N N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83B. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83B (FAM83B). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83B shows high homology to other FAM83 family members, indicating that FAM83B might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family. 266
27998 197280 cd09183 PLDc_FAM83C_N N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83C. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83C (FAM83C). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83C shows high homology to other FAM83 family members, indicating that FAM83C might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family. 274
27999 197281 cd09184 PLDc_FAM83D_N N-terminal phospholipase D-like domain of the protein, Family with sequence similarity 83D. N-terminal phospholipase D (PLD)-like domain of the protein Family with sequence similarity 83D (FAM83D), also known as spindle protein CHICA. CHICA is a cell-cycle-regulated spindle component, which localizes to the mitotic spindle and is both upregulated and phosphorylated during mitosis. CHICA is required to localize the chromokinesin Kid to the mitotic spindle and serves as a novel interaction partner of Kid, which is required for the generation of polar ejection forces and chromosome congression. Since the N-terminal PLD-like domain of FAM83D shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83D may share a similar three-dimensional fold with PLD enzymes, but is unlikely to carry PLD activity. 271
28000 197282 cd09186 PLDc_FAM83F_N N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83F. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83F (FAM83F). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83F shows high homology to other FAM83 family members, indicating that FAM83F might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family. 268
28001 197283 cd09187 PLDc_FAM83G_N N-terminal phospholipase D-like domain of the uncharacterized protein Family with sequence similarity 83G. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83G (FAM83G). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83G shows high homology to other FAM83 family members, indicating that FAM83G might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family. 275
28002 197284 cd09188 PLDc_FAM83H_N N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83H. N-terminal phospholipase D (PLD)-like domain of the protein, Family with sequence similarity 83H (FAM83H) on chromosome 8q24.3, which localizes in the intracellular environment and is associated with vesicles, can be regulated by kinases, and plays important roles during ameloblast differentiation and enamel matrix calcification. The gene encoding protein FAM83H is the first gene involved in the etiology of amelogenesis imperfecta (AI), that encodes a non-secreted protein due to the absence of a signal peptide. Defects in gene FAM83H cause autosomal dominant hypocalcified amelogenesis imperfecta (ADHCAI). Since the N-terminal PLD-like domain of FAM83H shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83H may share a similar three-dimensional fold with PLD enzymes, but is most unlikely to carry PLD activity. 265
28003 197285 cd09189 PLDc_DNaseII_alpha_1 Catalytic domain, repeat 1, of Deoxyribonuclease II alpha and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II alpha (DNase II alpha, EC 3.1.22.1) and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II alpha is an acidic endonuclease found in lysosomes, nuclei, and various secretions. It plays a critical role in the degradation of nuclear DNA expelled from erythroid precursor cells, as well as in the degradation of the apoptotic DNA after macrophages engulf them. It cleaves double-stranded DNA to short 3'-phosphoryl oligonucleotides, rather than 3'-hydroxyl groups, and functions optimally at acidic pH in the absence of divalent metal ion cofactors. 162
28004 197286 cd09190 PLDc_DNaseII_beta_1 Catalytic domain, repeat 1, of Deoxyribonuclease II beta and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II beta (DNase II beta, EC 3.1.22.1), also known as DNase II-like acid DNase (DLAD), and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II beta, or DLAD, is a novel mammalian divalent cation-independent endonuclease with homology to DNase II alpha. It is highly expressed in the eye lens and in salivary glands and is responsible for the degradation of nuclear DNA during lens cell differentiation. DLAD mainly exists as a cytoplasmic protein and cleaves DNA to produce 3'-phosphoryl/5'-hydroxyl ends. Like DNase II alpha, DLAD is active under acidic conditions with maximum activity at pH 5.2. Aurintricarboxylic acid and Zn2+ are effective inhibitors of DLAD activity. Mice deficient in DLAD develop cataracts as they are unable to degrade DNA during differentiation of the lens cells. 165
28005 197287 cd09191 PLDc_DNaseII_alpha_2 Catalytic domain, repeat 2, of Deoxyribonuclease II alpha and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II alpha (DNase II alpha, EC 3.1.22.1) and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II alpha is an acidic endonuclease found in lysosomes, nuclei, and various secretions. It plays a critical role in the degradation of nuclear DNA expelled from erythroid precursor cells, as well as in the degradation of the apoptotic DNA after macrophages engulf them. It cleaves double-stranded DNA to short 3'-phosphoryl oligonucleotides, rather than 3'-hydroxyl groups, and functions optimally at acidic pH in the absence of divalent metal ion cofactors. 137
28006 197288 cd09192 PLDc_DNaseII_beta_2 Catalytic domain, repeat 2, of Deoxyribonuclease II beta and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II beta (DNase II beta, EC 3.1.22.1), also known as DNase II-like acid DNase (DLAD), and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II beta, or DLAD, is a novel mammalian divalent cation-independent endonuclease with homology to DNase II alpha. It is highly expressed in the eye lens and in salivary glands and is responsible for the degradation of nuclear DNA during lens cell differentiation. DLAD mainly exists as a cytoplasmic protein and cleaves DNA to produce 3'-phosphoryl/5'-hydroxyl ends. Like DNase II alpha, DLAD is active under acidic conditions with maximum activity at pH 5.2. Aurintricarboxylic acid and Zn2+ are effective inhibitors of DLAD activity. Mice deficient in DLAD develop cataracts as they are unable to degrade DNA during differentiation of the lens cells. 139
28007 197289 cd09193 PLDc_mTdp1_1 Catalytic domain, repeat 1, of metazoan tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of metazoan tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-). Human Tdp1 (hTdp1) acts as an important DNA repair enzyme with a preference for single-stranded or blunt-ended duplex oligonucleotides. It can remove stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is therefore a potential molecular target for new anti-cancer drugs. hTdp1 has been shown to associate with additional proteins, such as XRCC1, to form a multi-enzyme complex. These additional proteins may be involved in recognizing 3'-phosphotyrosyl DNA in vivo. hTdp1 is a monomeric protein containing two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, hTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 169
28008 197290 cd09194 PLDc_yTdp1_1 Catalytic domain, repeat 1, of yeast tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of yeast tyrosyl-DNA phosphodiesterase (yTdp1, EC 3.1.4.-). yTdp1 is involved in the repair of topoisomerase I DNA lesions by hydrolyzing the topoisomerase from the 3'-end of the DNA during double-strand break repair. Unlike human Tdp1 whose substrate-binding pocket can accommodate a fairly large topoisomerase I peptide fragment, yTdp1 has a preference for substrates containing one to four amino acid residues. The monomeric yTdp1 contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, yTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 166
28009 197291 cd09195 PLDc_mTdp1_2 Catalytic domain, repeat 2, of metazoan tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of metazoan tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-). Human Tdp1 (hTdp1) acts as an important DNA repair enzyme with a preference for single-stranded or blunt-ended duplex oligonucleotides. It can remove stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is therefore a potential molecular target for new anti-cancer drugs. hTdp1 has been shown to associate with additional proteins, such as XRCC1, to form a multi-enzyme complex. These additional proteins may be involved in recognizing 3'-phosphotyrosyl DNA in vivo. hTdp1 is a monomeric protein containing two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, hTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 191
28010 197292 cd09196 PLDc_yTdp1_2 Catalytic domain, repeat 2, of yeast tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of yeast tyrosyl-DNA phosphodiesterase (yTdp1, EC 3.1.4.-). yTdp1 is involved in the repair of topoisomerase I DNA lesions by hydrolyzing the topoisomerase from the 3'-end of the DNA during double-strand break repair. Unlike human Tdp1 whose substrate-binding pocket can accommodate a fairly large topoisomerase I peptide fragment, yTdp1 has a preference for substrates containing one to four amino acid residues. The monomeric yTdp1 contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, yTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. 200
28011 197293 cd09197 PLDc_pPLDalpha_1 Catalytic domain, repeat 1, of plant alpha-type phospholipase D. Catalytic domain, repeat 1, of plant alpha-type phospholipase D (PLDalpha, EC 3.1.4.4). Plant PLDalpha is a phosphatidylinositol 4,5-bisphosphate (PIP2)-independent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires millimolar calcium for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDalpha consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDalpha may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 178
28012 197294 cd09198 PLDc_pPLDbeta_1 Catalytic domain, repeat 1, of plant beta-type phospholipase D. Catalytic domain, repeat 1, of plant beta-type phospholipase D (PLDbeta, EC 3.1.4.4). Plant PLDbeta is a phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires nanomolar calcium and cytosolic factors for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Sequence analysis shows that plant PLDbeta is evolutionarily divergent from alpha-type plant PLD, and plant PLDbeta is more closely related to mammalian and yeast PLDs than to plant PLDalpha. Like other PLD enzymes, the monomer of plant PLDbeta consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDbeta may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 180
28013 197295 cd09199 PLDc_pPLDalpha_2 Catalytic domain, repeat 2, of plant alpha-type phospholipase D. Catalytic domain, repeat 2, of plant alpha-type phospholipase D (PLDalpha, EC 3.1.4.4). Plant PLDalpha is a phosphatidylinositol 4,5-bisphosphate (PIP2)-independent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires millimolar calcium for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDalpha consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDalpha may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 211
28014 197296 cd09200 PLDc_pPLDbeta_2 Catalytic domain, repeat 2, of plant beta-type phospholipase D. Catalytic domain, repeat 2, of plant beta-type phospholipase D (PLDbeta, EC 3.1.4.4). Plant PLDbeta is a phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires nanomolar calcium and cytosolic factors for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Sequence analysis shows that plant PLDbeta is evolutionarily divergent from alpha-type plant PLD, and plant PLDbeta is more closely related to mammalian and yeast PLDs than to plant PLDalpha. Like other PLD enzymes, the monomer of plant PLDbeta consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDbeta may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 211
28015 197297 cd09203 PLDc_N_DEXD_b1 N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. 143
28016 197298 cd09204 PLDc_N_DEXD_b2 N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. 139
28017 197299 cd09205 PLDc_N_DEXD_b3 N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. A few family members contain additional domains, like a C-terminal peptidase S24-like domain. 143
28018 187741 cd09208 Lumazine_synthase-II lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS), catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2); type-II. Type-II LS, also known as RibH2, catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. LS catalyzes the formation of 6,7-dimethyl-8-ribityllumazine by the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate. Subsequently, the lumazine intermediate dismutates yielding riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione, in a reaction catalyzed by riboflavin synthase (RS); RS belongs to a different family of the Lumazine-synthase-like superfamily. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. Type-II LSs are distinct from type-I LS not only in protein sequence, but in that they exhibit different quaternary assemblies; type-II LSs form decamers (dimers of pentamers). The pathogen Brucella spp. have both a type-I LS and a type-II LS called RibH1 and RibH2, respectively. RibH1/type-I LS appears to be a functional LS in Brucella spp., whereas RibH2/type-II LS has much lower catalytic activity as LS and may be regulated by a riboswitch that senses FMN, suggesting that the type-II LSs may have evolved into very poor catalysts or that they may harbor a new, as-yet-unknown function. 137
28019 187742 cd09209 Lumazine_synthase-I lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS), catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2); type-I. Type-I LS, also known as RibH1, catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. LS catalyzes the formation of 6,7-dimethyl-8-ribityllumazine by the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate. Subsequently, the lumazine intermediate dismutates to yield riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione, in a reaction catalyzed by riboflavin synthase (RS); RS belongs to a different family of the Lumazine-synthase-like superfamily. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. Type-II LSs are distinct from type-I LS not only in protein sequence, but in that they exhibit different quaternary assemblies; type-I LSs form pentamers. The pathogen Brucella spp. encode both a type-I LS and a type-II LS called RibH1 and RibH2, respectively. RibH1/type-I LS appears to be the functional LS in Brucella spp., whereas RibH2/type-II LS has much lower catalytic activity as LS. 133
28020 187743 cd09210 Riboflavin_synthase_archaeal archaeal riboflavin synthase (RS); involved in the biosynthesis pathway of riboflavin (vitamin B2). Archaeal RSs are homopentamers catalyzing the formation of riboflavin from 6,7-dimethyl-8-ribityllumazine in riboflavin biosynthesis. Divalent metal ions, preferably manganese or magnesium, are needed for maximum activity. Riboflavin serves as the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD), essential cofactors for several oxidoreductases that are indispensable in most living cells. In the final steps of the riboflavin biosynthetic pathway, lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS) catalyzes the condensation of the 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy- 2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), followed by RS which catalyzes a dismutation of DMRL yielding riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. Archaeal RSs share sequence similarity with LSs, both appear to have diverged early in the evolution of archaea from a common ancestor. 143
28021 187744 cd09211 Lumazine_synthase_archaeal lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS); catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2). Archaeal LS is an important enzyme in the riboflavin biosynthetic pathway. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. In the final steps of the riboflavin biosynthetic pathway LS catalyzes the condensation of the 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy- 2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), and riboflavin synthase (RS) catalyzes a dismutation of DMRL which yields riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. LS from Methanococcus jannaschii forms capsids with icosahedral 532 symmetry consisting of 60 subunits. Archaeal LSs share sequence similarity with archaeal RSs, both appear to have diverged early in the evolution of archaea from a common ancestor. 131
28022 198416 cd09212 PUB PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins. The PUB domain is found in p97 adaptor proteins such as PNGase, UBXD1 (UBX domain-containing protein 1), and RNF31 (RING finger protein 31). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The p97, a type II AAA+ ATPase, is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes (except for fungi) and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97. 96
28023 188873 cd09213 Luminal_IRE1_like The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), and similar proteins. IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions. This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR. 312
28024 185753 cd09214 GH64-like glycosyl hydrolase 64 family. This family is represented by the laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis and related bacterial and ascomycete proteins. LPHase is a member of glycoside hydrolase family 64 (GH64), it is an inverting enzyme involved in the cleavage of long-chain polysaccharide beta-1,3-glucans, into specific pentasaccharide oligomers. LPHase is a two-domain crescent fold structure: one domain is composed of 10 beta-strands, eight coming from the N-terminus of the protein and two from the C-terminal region, and the protein has a second inserted domain; this cd includes both domains. This protein has an electronegative, substrate-binding cleft, and conserved Glu and Asp residues involved in the cleavage of the beta-1,3-glucan, laminarin, a plant and fungal cell wall component. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. Also included in this family is GluB , the beta-1,3-glucanase B from Lysobacter enzymogenes Strain N4-7. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain. In the Cellulosimicrobium cellulans, glucan endo-1,3-beta-glucosidase, they are positioned N-terminal of a RICIN, carbohydrate-binding domain, and in the Salinispora tropica CNB-440, coagulation factor 5/8 C-terminal domain (FA58C) protein, they are positioned C-terminal of two FA58C domains which are proposed to function as cell surface-attached, carbohydrate-binding domain. This FA58C-containing protein has an internal peptide deletion (of approx. 44 residues) in the LPHase domain II. 319
28025 185754 cd09215 Thaumatin-like the sweet-tasting protein, thaumatin, and thaumatin-like proteins involved in host defense. This family is represented by the sweet-tasting protein thaumatin from the African berry Thaumatococcus daniellii and thaumatin-like proteins (TLPs) involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5), their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. TLPs included in this family are such proteins as zeamatin, found in high concentrations in cereal seeds; osmotin, a salt-induced protein in osmotically stressed plants; and PpAZ44, a propylene-induced TLP in abscission of young fruit. Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). Thaumatin and TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Most TLPs contain 16 conserved Cys residues. A deletion within the third domain (domain II) of the Triticum aestivum thaumatin-like xylanase inhibitor is observed, thus, only 10 conserved Cys residues are present within this smaller TLP and similar homologs. 157
28026 185755 cd09216 GH64-LPHase-like glycoside hydrolase family 64: laminaripentaose-producing, beta-1,3-glucanase (LPHase)-like. This subfamily is represented by the laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis and related bacterial and ascomycete proteins. LPHase is a member of glycoside hydrolase family 64 (GH64), it is an inverting enzyme involved in the cleavage of long-chain polysaccharide beta-1,3-glucans, into specific pentasaccharide oligomers. LPHase is a two-domain crescent fold structure: one domain is composed of 10 beta-strands, eight coming from the N-terminus of the protein and two from the C-terminal region, and the protein has a second inserted domain; this cd includes both domains. This protein has an electronegative, substrate-binding cleft, and conserved Glu and Asp residues involved in the cleavage of the beta-1,3-glucan, laminarin, a plant and fungal cell wall component. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. Also included in this family is GluB , the beta-1,3-glucanase B from Lysobacter enzymogenes Strain N4-7. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain. In the Cellulosimicrobium cellulans, glucan endo-1,3-beta-glucosidase, they are positioned N-terminal of a RICIN, carbohydrate-binding domain. 353
28027 185756 cd09217 TLP-P thaumatin and allergenic/antifungal thaumatin-like proteins: plant homologs. This subfamily is represented by the sweet-tasting protein thaumatin from the African berry Thaumatococcus daniellii, allergenic/antifungal Thaumatin-like proteins (TLPs), and related plant proteins. TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5), their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. TLPs in this subfamily include such proteins as zeamatin, found in high concentrations in cereal seeds, and osmotin, a salt-induced protein in osmotically stressed plants. Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). Thaumatin and TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. IgE-binding epitopes of mountain Cedar (Juniperus ashei) allergen Jun a 3, which interact with pooled IgE from patients suffering allergenic response to this allergen, were mainly located on the helical domain II; the best-conserved IgE-binding epitope predicted for TLPs corresponds to this region. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Most TLPs contain 16 conserved Cys residues. A deletion within the third domain (domain II) of the Triticum aestivum thaumatin-like xylanase inhibitor is observed, thus, only 10 conserved Cys residues are present within this smaller TLP and similar homologs. 151
28028 185757 cd09218 TLP-PA allergenic/antifungal thaumatin-like proteins: plant and animal homologs. This subfamily is represented by the thaumatin-like proteins (TLPs), Cherry Allergen Pru Av 2 TLP, Peach PpAZ44 TLP (a propylene-induced TLP in abscission), the Caenorhabditis elegans thaumatin family member (thn-6), and other plant and animal homologs. TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Due to their inducible expression by environmental stresses such as pathogen/pest attack, drought and cold, plant TLPs are classified as the pathogenesis-related (PR) protein family 5 (PR5). Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. TLPs within this subfamily contain 16 conserved Cys residues. 219
28029 185758 cd09219 TLP-F thaumatin-like proteins: basidiomycete homologs. This subfamily is represented by Lentinula edodes TLG1, a thaumatin-like protein (TLP), as well as, other basidiomycete homologs. In general, TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. TLG1 TLP is involved in lentinan degradation and fruiting body senescence. TLG1 expressed in Escherichia coli and Aspergillus oryzae exhibited beta-1,3-glucanase activity and demonstrated lentinan degrading activity. TLG1 is proposed to be involved in lentinan and cell wall degradation during senescence following harvest and spore diffusion. TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. TLG1 from Lentinula edodes contains the required acidic amino acids conserved in the appropriate positions to possess an electronegative cleft. TLPs within this subfamily contain 13 conserved Cys residues; the number of total Cys residues in these TLPs varies from 16 in L. edodes TLG1 to 18 in other basidiomycete homologs. 229
28030 185759 cd09220 GH64-GluB-like glycoside hydrolase family 64: beta-1,3-glucanase B (GluB)-like. This subfamily is represented by GluB, beta-1,3-glucanase B , from Lysobacter enzymogenes Strain N4-7 and related bacterial and ascomycete proteins. GluB is a member of the glycoside hydrolase family 64 (GH64) involved in the cleavage of long-chain polysaccharide beta-1,3-glucans, into specific pentasaccharide oligomers. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. GluB possesses the conserved Glu and Asp residues required to cleave substrate beta-1,3-glucans. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Based on the structure of laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis, which belongs to the same family as GluB but to a different subfamily, this cd is a two-domain model. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain. 369
28031 187745 cd09223 Photo_RC D1, D2 subunits of photosystem II (PSII); M, L subunits of bacterial photosynthetic reaction center. This protein superfamily contains the D1, D2 subunits of the photosystem II (PS II) and the M, L subunits of the bacterial photosynthetic reaction center (RC). These four proteins are highly homologous and share a common fold. PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria. It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation. Bacterial photosynthetic reaction center (RC) complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species. It couples light-induced electron transfer to proton pumping across the membrane by reactions of a quinone molecule (QB) that binds two electrons and two protons at the active site. Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as the synthesis of ATP. 199
28032 198423 cd09224 CytoC_RC Cytochrome C subunit of the bacterial photosynthetic reaction center. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction center (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor. The electron transfer reactions of photosynthesis are performed by the following three components: the photosynthetic reaction center (RC), the cytochrome, and the soluble electron carrier protein. Firstly, the RC promotes the light-induced charge separation across the plasma membrane, which results in the oxidation of a pair of light-harvesting complexes, LH1 and LH2, and the reduction of quinone to quinol. The quinol then leaves the RC and moves to the cytochrome complex through the quinone pool of the plasma membrane. Secondly, the cytochrome complex reoxidizes the quinol to quinone, and the released electrons are transferred to soluble electron carriers. Third, the soluble electron carriers transport the electrons to the RC through the periplasmic space. Finally, the photo-oxidized light-harvesting complex is reduced by the soluble electron carriers, and the RC comes back to the initial state. In the course of the oxidation and reduction of the quinones, a transmembrane electrochemical gradient of protons is formed, and its energy is used to produce ATP by the ATP synthase complex. 300
28033 185717 cd09232 Snurportin-1_C C-terminal m3G cap-binding domain of nuclear import adaptor snurportin-1. Snurportin-1 (SPN1 or SNUPN) is a nuclear import adaptor for m3G-capped spliceosomal U small nucleoproteins (snRNPs), which are assembled in the cytoplasm. After capping and assembly, the U snRNPs are transported into the nucleus by SPN1 and importin beta; SPN1 is then returned to the cytoplasm by exportin 1 (CRM1), which also transports the non-capped U snRNPs. The U snRNPs are essential elements of the spliceosome, which catalyzes the excision of introns and the ligation of exons to form a mature mRNA. SPN1 contains two domains, an N-terminal importin beta-binding (IBB) domain and a C-terminal m3G cap-binding domain. 186
28034 187750 cd09233 ACE1-Sec16-like Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. The COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site. 314
28035 185747 cd09234 V_HD-PTP_like Protein-interacting V-domain of mammalian His-Domain type N23 protein tyrosine phosphatase and related domains. This family contains the V-shaped (V) domain of mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23) and related domains. It belongs to the V_Alix_like superfamily which includes the V domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X/ also known as apoptosis-linked gene-2 interacting protein 1, AIP1), and related domains. HD_PTP interacts with the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in cell migration and endosomal trafficking. The related Alix V-domain (belonging to a different family in this superfamily) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to the V-domain, HD_PTP also has an N-terminal Bro1-like domain, a proline-rich region (PRR), a catalytically inactive tyrosine phosphatase domain, and a region containing a PEST motif. Bro1-like domains bind components of the ESCRT-III complex, specifically to CHMP4 in the case of HD-PTP. The Bro1-like domain of HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This family also contains Drosophila Myopic, which promotes epidermal growth factor receptor (EGFR) signaling, and Caenorhabditis elegans (enhancer of glp-1) EGO-2 which promotes Notch signaling. 337
28036 185748 cd09235 V_Alix Middle V-domain of mammalian Alix and related domains are dimerization and protein interaction modules. This family contains the middle V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X) and related domains. It belongs to the V_Alix_like superfamily which includes the V-domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), is part of the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in membrane remodeling processes, including the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), the abscission reactions of mammalian cell division, and in apoptosis. The Alix V-domain is a dimerization domain, and contains a binding site, partially conserved in the V_Alix_like superfamily, for the retroviral late assembly (L) domain YPXnL motif. In addition to the V-domain, Alix also has an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex, in particular CHMP4. The Bro1-like domain of Alix can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix also has a C-terminal proline-rich region (PRR) that binds multiple partners including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1), and the apoptotic protein ALG-2. 339
28037 185749 cd09236 V_AnPalA_UmRIM20_like Protein-interacting V-domains of Aspergillus nidulans PalA/RIM20, Ustilago maydis RIM20, and related proteins. This family belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Aspergillus nidulans PalA/RIM20 and Ustilago maydis RIM20, like Saccharomyces cerevisiae Rim20, participate in the response to the external pH via the Pal/Rim101 pathway; however, Saccharomyces cerevisiae Rim20 does not belong to this family. This pathway is a signaling cascade resulting in the activation of the transcription factor PacC/Rim101. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. Aspergillus nidulans PalA binds a nonviral YPXnL motif (tandem YPXL/I motifs within PacC). The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_like superfamily also have an N-terminal Bro1-like domain, which has been shown to bind CHMP4/Snf7, a component of the ESCRT-III complex. 353
28038 185750 cd09237 V_ScBro1_like Protein-interacting V-domain of Saccharomyces cerevisiae Bro1 and related domains. This family contains the V-shaped (V) domain of Saccharomyces cerevisiae Bro1, and related domains. It belongs to the V_Alix_like superfamily which also includes the V-domain of Saccharomyces cerevisiae Rim20 (also known as PalA), mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Bro1 interacts with the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in endosomal trafficking. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Bro1 also has an N-terminal Bro1-like domain, which binds Snf7, a component of the ESCRT-III complex, and a C-terminal proline-rich region (PRR). The C-terminal portion (V-domain and PRR) of S. cerevisiae Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes. It interacts with a YPxL motif in the Doa4 catalytic domain to stimulate its deubiquitination activity. 356
28039 185751 cd09238 V_Alix_like_1 Protein-interacting V-domain of an uncharacterized family of the V_Alix_like superfamily. This domain family is comprised of uncharacterized plant proteins. It belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), (His-Domain) type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_Rim20_Bro1_like superfamily also have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members of the V_Alix_like superfamily also have a proline-rich region (PRR). 339
28040 185762 cd09239 BRO1_HD-PTP_like Protein-interacting, N-terminal, Bro1-like domain of mammalian His-Domain type N23 protein tyrosine phosphatase and related domains. This family contains the N-terminal, Bro1-like domain of mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP) and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. HD-PTP participates in cell migration and endosomal trafficking. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 in the case of HD-PTP. The Bro1-like domain of HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. HD-PTP, and some other members of the BRO1_Alix_like superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the V-domain superfamily. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This family also contains Drosophila Myopic which promotes epidermal growth factor receptor (EGFR) signaling, and Caenorhabditis elegans (enhancer of glp-1) EGO-2 which promotes Notch signaling. 361
28041 185763 cd09240 BRO1_Alix Protein-interacting, N-terminal, Bro1-like domain of mammalian Alix and related domains. This family contains the N-terminal, Bro1-like domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), also called apoptosis-linked gene-2 interacting protein 1 (AIP1). It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4, in the case of Alix. The Alix Bro1-like domain can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid and Rab5-specific GAP (RabGAP5, also known as Rab-GAPLP). In addition to this Bro1-like domain, Alix has a middle V-shaped (V) domain. The Alix V-domain is a dimerization domain, and carries a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the superfamily. Alix also has a C-terminal proline-rich region (PRR) that binds multiple partners including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. 346
28042 185764 cd09241 BRO1_ScRim20-like Protein-interacting, N-terminal, Bro1-like domain of Saccharomyces cerevisiae Rim20 and related proteins. This family contains the N-terminal, Bro1-like domain of Saccharomyces cerevisiae Rim20 (also known as PalA) and related proteins. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Saccharomyces cerevisiae Bro1, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Rim20 and Rim23 participate in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: Snf7 in the case of Rim20. RIM20, and some other members of the BRO1_Alix_like superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain is a dimerization domain that also contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the V-domain superfamily. Rim20 localizes to endosomes under alkaline pH conditions. By binding Snf7, it may bring the protease Rim13 into proximity with Rim101 (a YPxL-containing transcription factor), and thus aid in the proteolytic activation of the latter. Rim20 and other intermediates in the Rim101 pathway play roles in the pathogenesis of fungal corneal infection during Candida albicans keratitis. 355
28043 185765 cd09242 BRO1_ScBro1_like Protein-interacting, N-terminal, Bro1-like domain of Saccharomyces cerevisiae Bro1 and related proteins. This family contains the N-terminal, Bro1-like domain of Saccharomyces cerevisiae Bro1 and related proteins. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Saccharomyces cerevisiae Rim20 (also known as PalA), Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Bro1 participates in endosomal trafficking. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: Snf7 in the case of Bro1. Snf7 binds to a conserved hydrophobic patch on the middle of the concave side of the Bro1 domain. RIM20, and some other members of the BRO1_Alix_like superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the superfamily. The Alix V-domain is also a dimerization domain. The C-terminal portion (V-domain and proline rich-region) of Bro1 interacts with Doa4, a protease that deubiquitinates integral membrane proteins sorted into the lumenal vesicles of late-endosomal multivesicular bodies. It interacts with a YPxL motif in the Doa4 catalytic domain to stimulate its deubiquitination activity. 348
28044 185766 cd09243 BRO1_Brox_like Protein-interacting Bro1-like domain of human Brox1 and related proteins. This family contains the Bro1-like domain of a single-domain protein, human Brox, and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 in the case of Brox. Human Brox can bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. In addition to a Bro1-like domain, Brox also has a C-terminal thioester-linkage site for isoprenoid lipids (CaaX motif). This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 353
28045 185767 cd09244 BRO1_Rhophilin Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin and related domains. This family contains the Bro1-like domain of RhoA-binding proteins, Rhophilin-1 and -2, and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-1 and -2 bind both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-1 and -2 contain an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. Their PDZ domains have limited homology. Rhophilin-1 and -2 have different activities. The Drosophila knockout of Rhophilin-1 is embryonic lethal, suggesting an essential role in embryonic development. Roles of Rhophilin-2 may include limiting stress fiber formation or increasing the turnover of F-actin in the absence of high levels of RhoA signaling activity. The isolated Bro1-like domain of Rhophilin-1 binds human immunodeficiency virus type 1 (HIV-1) nucleocapsid. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 350
28046 185768 cd09245 BRO1_UmRIM23-like Protein-interacting, Bro1-like domain of Ustilago maydis Rim23 (PalC), and related domains. This family contains the Bro1-like domain of Ustilago maydis Rim23 (also known as PalC), and related proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Rim20 and Rim23 participate in the response to the external pH via the Rim101 pathway. Through its Bro1-like domain, Rim23 allows the interaction between the endosomal and plasma membrane complexes. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Intermediates in the Rim101 pathway may play roles in the pathogenesis of fungal corneal infection during Candida albicans keratitis. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 413
28047 185769 cd09246 BRO1_Alix_like_1 Protein-interacting, N-terminal, Bro1-like domain of an Uncharacterized family of the BRO1_Alix_like superfamily. This domain family is comprised of uncharacterized proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20 and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP and Bro1 function in endosomal trafficking, with HD-PTP having additional functions in cell migration. Rim20 and Rim23 play roles in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, Brox and HD-PTP) and Snf7 (in the case of yeast Bro1 and Rim20). The Bro1-like domains of Alix, HD-PTP, Brox, and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. In addition to this Bro1-like domain, Alix, Bro1, Rim20, HD_PTP, and proteins belonging to this uncharacterized family, also have a V-shaped (V) domain. The Alix V-domain is a dimerization domain, and contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the BRO1_Alix_like superfamily. 
Many members of this superfamily also have a proline-rich region (PRR), a protein interaction domain. 353
28048 185770 cd09247 BRO1_Alix_like_2 Protein-interacting Bro1-like domain of an Uncharacterized family of the BRO1_Alix_like superfamily. This domain family is comprised of uncharacterized proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20 and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP and Bro1 function in endosomal trafficking, with HD-PTP having additional functions in cell migration. Rim20 and Rim23 play roles in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. These domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, Brox and HD-PTP) and Snf7 (in the case of yeast Bro1 and Rim20). The Bro1-like domains of Alix, HD-PTP, Brox, and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 346
28049 185771 cd09248 BRO1_Rhophilin_1 Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin-1. This subfamily contains the Bro1-like domain of the RhoA-binding protein, Rhophilin-1. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding protein Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-1 binds both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-1 contains an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. The Drosophila knockout of Rhophilin-1 is embryonic lethal, suggesting an essential role in embryonic development. The isolated Bro1-like domain of Rhophilin-1 binds human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Rhophilin-1 lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 384
28050 185772 cd09249 BRO1_Rhophilin_2 Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin-2. This subfamily contains the Bro1-like domain of RhoA-binding protein, Rhophilin-2. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding protein Rhophilin-1, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-2 binds both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-2 contains an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. Roles for Rhophilin-2 may include limiting stress fiber formation or increasing the turnover of F-actin in the absence of high levels of RhoA signaling activity. Rhophilin-2 lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily. 385
28051 271158 cd09250 AP-1_Mu1_Cterm C-terminal domain of medium Mu1 subunit in clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1 subunit, which includes two closely related homologs, mu1A (encoded by ap1m1) and mu1B (encoded by ap1m2). Mu1A is ubiquitously expressed, but mu1B is expressed exclusively in polarized epithelial cells. AP-1 has been implicated in bi-directional transport between the trans-Golgi network (TGN) and endosomes. It plays an essential role in the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). Epithelial cell-specific AP-1 is also involved in sorting to the basolateral surface of polarized epithelial cells. Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). Phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. 
They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. 272
28052 271159 cd09251 AP-2_Mu2_Cterm C-terminal domain of medium Mu2 subunit in ubiquitously expressed clathrin-associated adaptor protein (AP) complex AP-2. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, -2, -3, and -4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 2 (AP-2) medium mu2 subunit. Mu2 is ubiquitously expressed in mammals. In higher eukaryotes, AP-2 plays a critical role in clathrin-mediated endocytosis from the plasma membrane in different cells. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-2. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-2 mu2 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. Since the Y-X-X-Phi binding site is buried in the core structure of AP-2, a phosphorylation-induced conformational change is required when the cargo molecules bind to AP-2. In addition, the C-terminal domain of mu2 subunit has been shown to bind other molecules. 
For instance, it can bind phosphoinositides, in particular PI[4,5]P2, which might be involved in the recognition process of the tyrosine-based signals. It can also interact with synaptotagmins, a family of important modulators of calcium-dependent neurosecretion within the synaptic vesicle (SV) membrane. Since many of the other endocytic adaptors responsible for biogenesis of synaptic vesicles exist, in the absence of AP-2, clathrin-mediated endocytosis can still occur. However, the cells may not survive in the complete absence of clathrin as well as AP-2. 263
28053 271160 cd09252 AP-3_Mu3_Cterm C-terminal domain of medium Mu3 subunit in adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3 subunit, which includes two closely related homologs, mu3A (P47A, encoded by ap3m1) and mu3B (P47B, encoded by ap3m2). Mu3A is ubiquitously expressed, but mu3B is specifically expressed in neurons and neuroendocrine cells. AP-3 is particularly important for targeting integral membrane proteins to lysosomes and lysosome-related organelles at trans-Golgi network (TGN) and/or endosomes, such as the yeast vacuole, fly pigment granules and mammalian melanosomes, platelet dense bodies and the secretory lysosomes of cytotoxic T lymphocytes. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of AP-3 containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. 
These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. 251
28054 271161 cd09253 AP-4_Mu4_Cterm C-terminal domain of medium Mu4 subunit in adaptor protein (AP) complex AP-4. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 4 (AP-4) medium mu4 subunit. AP-4 plays a role in signal-mediated trafficking of integral membrane proteins in mammalian cells. Unlike other AP complexes, AP-4 is found only in mammals and plants. It is believed to be part of a nonclathrin coat, since it might function independently of clathrin, a scaffolding protein participating in the formation of coated vesicles. Recruitment of AP-4 to the trans-Golgi network (TGN) membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1) or a related protein. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. One of the most important sorting signals binding to mu subunits of AP complexes are tyrosine-based endocytotic signals, which are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. However, AP-4 does not bind most canonical tyrosine-based signals except for two naturally occurring ones from the lysosomal membrane proteins CD63 and LAMP-2a. 
It binds YX [FYL][FL]E motif, where X can be any residue, from the cytosolic tails of amyloid precursor protein (APP) family members in a distinct way. 271
28055 271162 cd09254 AP_delta-COPI_MHD Mu homology domain (MHD) of adaptor protein (AP) coat protein I (COPI) delta subunit. COPI complex-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. COPI complex-coated vesicles consist of a small GTPase, ADP-ribosylation factor 1 (ARF1) and a heteroheptameric coatomer composed of two subcomplexes, F-COPI and B-COPI. ARF1 regulates COPI vesicle formation by recruiting the coatomer onto Golgi membranes to initiate its coat function. Coatomer complexes then bind cargo molecules and self-assemble to form spherical cages that yield COPI-coated vesicles. The heterotetrameric F-COPI subcomplex contains beta-, gamma-, delta-, and zeta-COP subunits, where beta- and gamma-COP subunits are related to the large AP subunits, and delta- and zeta-COP subunits are related to the medium and small AP subunits, respectively. Due to the sequence similarity to the AP complexes, the F-COPI subcomplex might play a role in the cargo-binding. The heterotrimeric B-COPI contains alpha-, beta-, and epsilon-COP subunits, which are not related to the adaptins. This subcomplex is thought to participate in the cage-forming and might serve a function similar to that of clathrin. This family corresponds to the mu homology domain of delta-subunit of COPI complex (delta-COP), which is distantly related to the C-terminal domain of mu chains among AP complexes. The delta-COP subunit appears tightly associated with the beta-COP subunit to confer its interaction with ARF1. In addition, both delta- and beta-COP subunits contribute to a common binding site for arginine (R)-based signals, which are sorting motifs conferring transient endoplasmic reticulum (ER) localization to unassembled subunits of multimeric membrane proteins. 237
28056 271163 cd09255 AP-like_stonins_MHD Mu homology domain (MHD) of adaptor-like proteins (AP-like), stonins. A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonins, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonins is unable to recognize tyrosine-based endocytic sorting signals. To date, little is known about the localization and function of stonin 1. Stonin 2, also known as stoned B, acts as an AP-2-dependent synaptotagmin-specific sorting adaptor for SV endocytosis. Stoned A is not a stonin. It is structurally unrelated to the adaptins and does not appear to have mammalian homologs. It is not included in this family. 315
28057 271164 cd09256 AP_MuD_MHD Mu-homology domain (MHD) of an adaptor protein (AP) encoded by mu-2 related death-inducing gene, MuD (also known as MUDENG). This family corresponds to the MHD found in a protein encoded by MuD (also known as Adapter-related protein complex 5 subunit mu-1), which is distantly related to the C-terminal domain of the mu2 subunit of AP complexes that participates in clathrin-mediated endocytosis. MuD is evolutionarily conserved from mammals to amphibians. It is able to induce cell death by itself and plays an important role in cell death in various tissues. 276
28058 271165 cd09257 AP_muniscins_like_MHD Mu-homology domain (MHD) of muniscins adaptor proteins (AP) and similar proteins. This family corresponds to the MHD found in muniscins, a novel family of endocytic adaptor proteins. The term, muniscins, has been assigned to name the MHD of proteins with both EFC/F-BAR domain and MHD. These two domains are responsible for the membrane-tubulation activity associated with transmembrane cargo proteins. Members in this family include an endocytic adaptor Syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related uncharacterized proteins. Syp1 is a poorly characterized yeast protein with multiple biological functions. Syp1 contains an N-terminal EFC/F-BAR domain that induces membrane tubulation, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD that can directly bind to the endocytic adaptor/scaffold protein Ede1 or a transmembrane stress sensor cargo protein Mid2. Thus, Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress response. Syp1 shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, the membrane-sculpting F-BAR domain-containing Fer/Cip4 homology domain-only proteins 1 and 2 (FCHo1/2). FCHo1/2 represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They are required for plasma membrane clathrin-coated vesicle (CCV) budding and marked sites of CCV formation. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. 
Another mammalian neuronal-specific protein, neuronal-specific transcript Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 [SGIP1] does not contain EFC/F-BAR domain, but does have a PRD and a C-terminal MHD and has been classified into this family as well. SGIP1 is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15. 244
28059 271166 cd09258 AP-1_Mu1A_Cterm C-terminal domain of medium Mu1A subunit in ubiquitously expressed clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1A subunit encoded by ap1m1 gene, which is ubiquitously expressed in all mammalian tissues and cells. AP-1 has been implicated in bidirectional transport between the trans-Golgi network (TGN) and endosomes. It is involved in the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). The ubiquitous AP-1 is recruited to the TGN membrane, as well as to immature secretory granules. Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). Phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. 
These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1A subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. 270
28060 271167 cd09259 AP-1_Mu1B_Cterm C-terminal domain of medium Mu1B subunit in epithelial cell-specific clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1B subunit encoded by ap1m2 gene exclusively expressed in polarized epithelial cells. Epithelial cell-specific AP-1 is used to sort proteins to the basolateral plasma membrane, which involves the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). The phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. 
These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1B subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. Besides, AP-1 mu1B subunit mediates the basolateral recycling of low-density lipoprotein receptor (LDLR) and transferrin receptor (TfR) from the sorting endosomes, where the basolateral sorting signal does not belong to the tyrosine-based signals. Thus, the binding site in mu1B subunit of AP-1 for the signals of LDLR and TfR might be distinct from that for YXXPhi signals. 268
28061 211371 cd09260 AP-3_Mu3A_Cterm C-terminal domain of medium Mu3A subunit in ubiquitously expressed adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3A subunit encoded by ap3m1 gene. Mu3A is ubiquitously expressed in all mammalian tissues and cells. It appears to be localized to the trans-Golgi network (TGN) and/or endosomes and participates in trafficking to the vacuole/lysosome in yeast, flies, and mammals. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of ubiquitous AP-3 containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3A subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. 254
28062 211372 cd09261 AP-3_Mu3B_Cterm C-terminal domain of medium Mu3B subunit in neuron-specific adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3B subunit encoded by ap3m2 gene. Mu3B is specifically expressed in neurons and neuroendocrine cells. Neuron-specific AP-3 appears to be involved in synaptic vesicle biogenesis from endosomes in neurons and plays an important role in synaptic transmission in the central nervous system. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of neuron-specific AP-3 containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3B subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. 254
28063 271168 cd09262 AP_stonin-1_MHD Mu homology domain (MHD) of adaptor-like protein (AP-like), stonin-1 (also called Stoned B-like factor). A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonin 1, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonin-1 is unable to recognize tyrosine-based endocytic sorting signals. To date, little is known about the localization and function of stonin-1. 314
28064 271169 cd09263 AP_stonin-2_MHD Mu homology domain (MHD) of adaptor-like protein (AP-like), stonin-2. A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonin 2, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonin-2 is unable to recognize tyrosine-based endocytic sorting signals. It acts as an AP-2-dependent synaptotagmin-specific sorting adaptor for SV endocytosis. 318
28065 271170 cd09264 AP_Syp1_MHD mu-homology domain (MHD) of adaptor protein (AP), Syp1, and related proteins. This family corresponds to the MHD found in a novel endocytic adaptor Syp1 and related proteins. Syp1 is a poorly characterized yeast protein with multiple biological functions. It was originally identified as a suppressor of a yeast profilin deletion and later as a suppressor of arf3delta (Arf3 is the yeast homologue of Arf6, a mammalian regulator of endocytosis). Syp1 can bind to septins and physically link with cell polarity factors. It also directly binds to the endocytic adaptor/scaffold protein Ede1, and plays a role in endocytosis. Further studies show that Syp1 is itself an endocytic adaptor protein contributing to stress responses. Its mu-homology domain at the C-terminus binds to the cargo protein Mid2, a transmembrane stress sensor protein, and mediates Mid2 internalization. In addition, Syp1 contains an EFC/F-BAR domain which can induce membrane tubulation. 257
28066 271171 cd09265 AP_Syp1_like_MHD Mu-homology domain (MHD) of endocytic adaptor protein (AP), Syp1. This family corresponds to the MHD found in the metazoan counterparts of yeast Syp1, which includes two ubiquitously expressed membrane-sculpting F-BAR domain-containing Fer/Cip4 homology domain-only proteins 1 and 2 (FCH domain only 1 and 2, or FCHo1/FCHo2), neuronal-specific SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related uncharacterized proteins. FCHo1/FCHo2 represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They are required for plasma membrane clathrin-coated vesicle (CCV) budding and marked sites of CCV formation. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Both FCHo1/FCHo2 contain an N-terminal EFC/F-BAR domain that induces membrane tubulation, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD responsible for the binding of eps15 and intersectin. Another mammalian neuronal-specific protein, neuronal-specific transcript Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 [SGIP1] does not contain EFC/F-BAR domain, but does have a PRD and a C-terminal MHD and has been classified into this family as well. SGIP1 is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15. 266
28067 271172 cd09266 SGIP1_MHD mu-homology domain (MHD) of Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 (also known as endophilin-3-interacting protein, SGIP1) and similar proteins. This family corresponds to the MHD found in mammalian neuronal-specific transcript SGIP1 and similar proteins. Unlike other members in this family, SGIP1 does not contain EFC/F-BAR domain, but does have a proline-rich domain (PRD) and a C-terminal MHD. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis, and is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15. 267
28068 211378 cd09267 FCHo2_MHD mu-homology domain (MHD) of F-BAR domain-containing Fer/Cip4 homology domain-only protein 2 (FCH domain only 2 or FCHo2) and similar proteins. This family corresponds to the MHD found in the ubiquitously expressed mammalian membrane-sculpting FCHo2 and similar proteins. FCHo2 represents a key initial protein that ultimately controls cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. It is required for plasma membrane clathrin-coated vesicle (CCV) budding and marks sites of CCV formation. It binds specifically to the plasma membrane and recruits the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. FCHo2 contains an N-terminal EFC/F-BAR domain, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD. The crescent-shaped EFC/F-BAR domain can form an antiparallel dimer structure that binds PtdIns(4,5)P2-enriched membranes and can polymerize into rings to generate membrane tubules. The MHD is structurally related to the cargo-binding mu2 subunit of adaptor complex 2 (AP-2) and is responsible for the binding of eps15 and intersectin. 267
28069 271173 cd09268 FCHo1_MHD mu-homology domain (MHD) of F-BAR domain-containing Fer/Cip4 homology domain-only protein 1 (FCH domain only 1 or FCHo1, also known as KIAA0290) and similar proteins. This family corresponds to the MHD found in ubiquitously expressed mammalian membrane-sculpting FCHo1 and similar proteins. FCHo1 represents a key initial protein that ultimately controls cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. It is required for plasma membrane clathrin-coated vesicle (CCV) budding and marks sites of CCV formation. It binds specifically to the plasma membrane and recruits the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. FCHo1 contains an N-terminal EFC/F-BAR domain, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD. The crescent-shaped EFC/F-BAR domain can form an antiparallel dimer structure that binds PtdIns(4,5)P2-enriched membranes and can polymerize into rings to generate membrane tubules. The MHD is structurally related to the cargo-binding mu2 subunit of adaptor complex 2 (AP-2) and is responsible for the binding of eps15 and intersectin. Unlike other F-BAR domain containing proteins, FCHo1 has neither the Src homology 3 (SH3) domain nor any other known domain for interaction with dynamin and actin cytoskeleton. However, it can periodically accumulate at the budding site of clathrin. FCHo1 may utilize a unique action mode for vesicle formation as compared with other F-BAR proteins. 265
28070 185703 cd09269 deoxyribose_mutarotase deoxyribose mutarotase_like. Salmonella enterica serovar Typhi DeoM (earlier named as DeoX) is a mutarotase with high specificity for deoxyribose. It is encoded by one of four genes belonging to the deoK operon. This operon has also been found in Escherichia coli where it is more common in pathogenic than in commensal strains and is associated with pathogenicity. It has been found on a pathogenicity island from a human blood isolate AL863 and confers the ability to use deoxyribose as a carbon source; deoxyribose is not fermented by non-pathogenic E. coli K-12. Proteins in this family are members of the aldose-1-epimerase superfamily. Aldose 1-epimerases, or mutarotases, are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. Site directed mutagenesis of this latter histidine residue renders Salmonella enterica DeoM inactive. 293
28071 187751 cd09270 RNase_H2-B Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. Ribonuclease H2B is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder. 211
28072 187752 cd09271 RNase_H2-C Ribonuclease H2-C is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. Ribonuclease H2C is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder. 93
28073 260004 cd09272 RNase_HI_RT_Ty1 Ty1/Copia family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are unvaried across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families, Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 140
28074 260005 cd09273 RNase_HI_RT_Bel Bel/Pao family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are unvaried across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families, Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Bel/Pao family has been described only in metazoan genomes. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 131
28075 260006 cd09274 RNase_HI_RT_Ty3 Ty3/Gypsy family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 121
28076 260007 cd09275 RNase_HI_RT_DIRS1 DIRS1 family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The structural features of DIRS1-group elements are different from typical LTR elements. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 120
28077 260008 cd09276 Rnase_HI_RT_non_LTR non-LTR RNase HI domain of reverse transcriptases. Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). Ribonuclease HI (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. The RNase domain of non-LTR and LTR transposons is located at the carboxyl terminus of the reverse transcriptase (RT) domain, and their RNase domains group together, indicating a common evolutionary origin. Many non-LTR transposons have lost the RNase domain because their activity is at the nucleus and cellular RNase may suffice; however, LTR retrotransposons always encode their own RNase domain because they require RNase activity in RNA-protein particles in the cytoplasm. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 131
28078 260009 cd09277 RNase_HI_bacteria_like Bacterial RNase HI containing a hybrid binding domain (HBD) at the N-terminus. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and Type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Prokaryotic RNase H varies greatly in domain structures and substrate specificities. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability. The members of this family are distinguished from other bacterial RNase HI by the presence of a hybrid binding domain (HBD) at the N-terminus, which is commonly present at the N-termini of eukaryotic RNase HI. It has been reported that this domain is required for dimerization and processivity of RNase HI upon binding to RNA-DNA hybrids. 133
28079 260010 cd09278 RNase_HI_prokaryote_like RNase HI family found mainly in prokaryotes. Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Prokaryotic RNase H varies greatly in domain structures and substrate specificities. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability. 139
28080 260011 cd09279 RNase_HI_like RNase HI family that includes archaeal, some bacterial as well as plant RNase HI. Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Most archaeal genomes contain only type 2 RNase H (RNase HII); however, a few contain RNase HI as well. Although archaeal RNase HI sequences conserve the DEDD active-site motif, they lack other common features important for catalytic function, such as the basic protrusion region. In enzymatic properties, archaeal RNase HI homologs are more closely related to retroviral RNase HI than to bacterial and eukaryotic type 1 RNase H. 128
28081 260012 cd09280 RNase_HI_eukaryote_like Eukaryotic RNase H is essential and is longer and more complex than its prokaryotic counterparts. Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. Eukaryotic RNase H is longer and more complex than in prokaryotes. Almost all eukaryotic RNase HI have a highly conserved region at their N-termini called the hybrid binding domain (HBD). It is speculated that the HBD contributes to binding the RNA/DNA hybrid. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability, but RNase H is essential in higher eukaryotes. RNase H knockout mice lack mitochondrial DNA replication and die as embryos. 145
28082 187753 cd09281 UPF0066 Escherichia coli YaeB and related proteins. Uncharacterized protein family UPF0066. This domain includes Escherichia coli YaeB, Archaeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR. Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases, but they have not been functionally characterized and the substrate is unknown. 124
28083 185681 cd09286 NMNAT_Eukarya Nicotinamide/nicotinate mononucleotide adenylyltransferase, Eukaryotic. Nicotinamide/nicotinate mononucleotide (NMN/NaMN) adenylyltransferase (NMNAT). NMNAT represents the primary bacterial and eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide. It is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. Human NMNAT displays unique dual substrate specificity toward both NMN and NaMN, and can participate in both de novo and salvage pathways of NAD synthesis. This subfamily consists strictly of eukaryotic members and includes secondary structural elements not found in all NMNATs. 225
28084 185682 cd09287 GluRS_non_core catalytic core domain of non-discriminating glutamyl-tRNA synthetase. Non-discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. These enzymes attach Glu to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea and most bacteria lack GlnRS. In these organisms, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. 240
28085 187746 cd09288 Photosystem-II_D2 D2 subunit of photosystem II (PS II). Photosystem II (PS II), D2 subunit. PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria. It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation. Molecular dioxygen is released as a by-product. PS II can be described as containing two parts: the photochemical part and the catalytic part. The photochemical portion promotes the fast, efficient light-induced charge separation and stabilization that occur when light is absorbed by chlorophyll. The catalytic portion, where water is oxidized, involves a cluster of Mn ions close to a redox-active tyrosine residue. The Mn cluster and its ligands form a functional unit called the oxygen-evolving complex (OEC) or the water-oxidizing complex (WOC). The D1 and D2 subunits are a pair of intertwined polypeptides. They contain all the cofactors involved directly in water oxidation and plastoquinone reduction. D1 and D2 are highly homologous and are also similar to the L and M proteins in bacterial photosynthetic reaction centers. 339
28086 187747 cd09289 Photosystem-II_D1 D1 subunit of photosystem II (PS II). Photosystem II (PS II), D1 subunit. PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria. It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation. Molecular dioxygen is released as a by-product. PS II can be described as containing two parts: the photochemical part and the catalytic part. The photochemical portion promotes the fast, efficient light-induced charge separation and stabilization that occur when light is absorbed by chlorophyll. The catalytic portion, where water is oxidized, involves a cluster of Mn ions close to a redox-active tyrosine residue. The Mn cluster and its ligands form a functional unit called the oxygen-evolving complex (OEC) or the water-oxidizing complex (WOC). The D1 and D2 subunits are a pair of intertwined polypeptides. They contain all the cofactors involved directly in water oxidation and plastoquinone reduction. The D1 subunit contains the Mn cluster that constitutes the site of water oxidation. D1 and D2 are highly homologous and are also similar to the L and M proteins in bacterial photosynthetic reaction centers. 338
28087 187748 cd09290 Photo-RC_L Subunit L of bacterial photosynthetic reaction center. Bacterial photosynthetic reaction center (RC) complex, subunit L. The bacterial photosynthetic reaction center couples light-induced electron transfer with pumping protons across the membrane using reactions involving a quinone molecule (QB) that binds two electrons and two protons at the active site. The reaction center consists of three membrane-bound subunits, designated L, M, and H, plus an additional extracellular cytochrome subunit. The L and M subunits are arranged around an axis of 2-fold rotational symmetry perpendicular to the membrane, forming a scaffold that maintains the cofactors in a precise configuration. The L and M subunits have both sequence and structural similarity, suggesting a common evolutionary origin. The L and M subunits bind noncovalently to the nine cofactors in 2-fold symmetric branches: four bacteriochlorophylls (Bchl), two bacteriopheophytins (Bphe), two ubiquinone molecules (QA and QB), and a non-heme iron. Two Bchls on the periplasmic side of the membrane form the 'special pair' or dimer which is the primary electron donor for the photosynthetic reactions. The electron transfer reaction proceeds from the dimer to an intermediate acceptor (PA), a primary quinone (QA), and a secondary quinone (QB). Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as ATP synthesis. The RC complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species. 273
28088 187749 cd09291 Photo-RC_M Subunit M of bacterial photosynthetic reaction center. Bacterial photosynthetic reaction center (RC) complex, subunit M. The bacterial photosynthetic reaction center couples light-induced electron transfer with pumping protons across the membrane using reactions involving a quinone molecule (QB) that binds two electrons and two protons at the active site. The reaction center consists of three membrane-bound subunits, designated L, M, and H, plus an additional extracellular cytochrome subunit. The L and M subunits are arranged around an axis of 2-fold rotational symmetry perpendicular to the membrane, forming a scaffold that maintains the cofactors in a precise configuration. The L and M subunits have both sequence and structural similarity, suggesting a common evolutionary origin. The L and M subunits bind noncovalently to the nine cofactors in 2-fold symmetric branches: four bacteriochlorophylls (Bchl), two bacteriopheophytins (Bphe), two ubiquinone molecules (QA and QB), and a non-heme iron. Two Bchls on the periplasmic side of the membrane form the 'special pair' or dimer which is the primary electron donor for the photosynthetic reactions. The electron transfer reaction proceeds from the dimer to an intermediate acceptor (PA), a primary quinone (QA), and a secondary quinone (QB). Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as ATP synthesis. The RC complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species. 297
28089 187754 cd09293 AMN1 Antagonist of mitotic exit network protein 1. Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally; they have fewer LRRs, which determine the extent of this model. 226
28090 187755 cd09294 SmpB Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosomes from damaged messenger RNAs. Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosomes from damaged messenger RNAs and targeting incompletely synthesized protein fragments for degradation. The trans-translation system is composed of a ribonucleoprotein complex of tmRNA, a specialized RNA with properties of both tRNA and mRNA, and SmpB. SmpB is highly conserved and present in all bacterial kingdoms and is also found in some chloroplasts and mitochondria. This suggests that trans-translation arose early in bacterial evolution and that its mechanism is a quality control for protein synthesis in spite of challenges such as transcription errors, mRNA damage, and translation frame shifting. SmpB deletion results in a phage development defect phenotype and the absence of tagged proteins translated from defective mRNAs. 116
28091 200495 cd09295 Sema The Sema domain, a protein interacting module, of semaphorins and plexins. Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors. 392
28092 187756 cd09299 TDT The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs). 326
28093 350171 cd09300 DEAD-like_helicase_C C-terminal helicase domain of the DEAD-like helicases. This hierarchy of DEAD-like helicases is composed of two superfamilies, SF1 and SF2, that share almost identical folds and extensive structural similarity in their catalytic core. Helicases are involved in ATP-dependent RNA or DNA unwinding. Two distinct types of helicases exist, those forming toroidal, predominantly hexameric structures, and those that do not. SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Their conserved helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 59
28094 212512 cd09301 HDAC Histone deacetylase (HDAC) classes I, II, IV and related proteins. The HDAC/HDAC-like family includes Zn-dependent histone deacetylase classes I, II and IV (class III HDACs, also called sirtuins, are NAD-dependent and structurally unrelated, and therefore not part of this family). Histone deacetylases catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98), as opposed to the acetylation reaction by some histone acetyltransferases (EC 2.3.1.48). Deacetylases of this family are involved in signal transduction through histone and other protein modification, and can repress/activate transcription of a number of different genes. They usually act via the formation of large multiprotein complexes. They are involved in various cellular processes, including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. In mammals, they are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs. 279
28095 187706 cd09302 Jacalin_like Jacalin-like lectin domain. Jacalin-like lectins are sugar-binding protein domains mostly found in plants. They adopt a beta-prism topology consistent with a circularly permuted three-fold repeat of a structural motif. Proteins containing this domain may bind mono- or oligosaccharides with high specificity. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. Taxonomic distribution is not restricted to plants; for example, the domain is also found in various mammalian proteins. 128
28096 187757 cd09317 TDT_Mae1_like C4-dicarboxylate transporter/malic acid transport protein family includes Mae1. This family contains eukaryotic homologs of C4-dicarboxylate transporter/malic acid transport proteins which are part of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This includes the MAE1 gene in Schizosaccharomyces pombe that encodes malate permease, Mae1, which functions by proton symport and transports C4-dicarboxylates (malate, fumarate, succinate, oxaloacetate, etc.), but not alpha-ketoglutarate. 330
28097 187758 cd09318 TDT_SSU1 Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes sulfite sensitivity protein (sulfite efflux pump; SSU1). This family contains the sulfite sensitivity protein (sulfite efflux pump; SSU1) and belongs to the tellurite-resistance/dicarboxylate transporter (TDT) family. The SSU1 gene encodes the sulfite pump required for efficient sulfite efflux. Mutations in the SSU1 gene cause sensitivity to sulfite while overexpression confers heightened resistance to sulfite toxicity. In dermatophytes and other filamentous fungi, sulfite is excreted as a reducing agent during keratin degradation; thus sulfite transporters in keratinolytic fungi could be a new target for antifungal drugs in dermatology. The number of genes encoding sulfite efflux pumps in fungal genomes varies from species to species. 341
28098 187759 cd09319 TDT_like_1 The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs). 317
28099 187760 cd09320 TDT_like_2 The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs). 327
28100 187761 cd09321 TDT_like_3 The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs). 327
28101 187762 cd09322 TDT_TehA_like The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes TehA proteins. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs). 289
28102 187763 cd09323 TDT_SLAC1_like Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes SLAC1 (Slow Anion Channel-Associated 1). SLAC1 (Slow Anion Channel-Associated 1) is a plasma membrane protein, preferentially expressed in guard cells, which encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. It is essential for stomatal closure in response to carbon dioxide, abscisic acid, ozone, light/dark transitions, humidity change, calcium ions, hydrogen peroxide and nitric oxide. In the Arabidopsis genome, SLAC1 is part of a gene family with five members and encodes a membrane protein that has ten putative transmembrane domains flanked by large N- and C-terminal domains. Mutations in SLAC1 impair slow (S-type) anion channel currents that are activated by cytosolic calcium ions and abscisic acid, but do not affect rapid (R-type) anion channel currents or calcium ion channel function. 297
28103 187764 cd09324 TDT_TehA Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes TehA protein. This subfamily includes Tellurite resistance protein TehA that belongs to the C4-dicarboxylate transporter/malic acid transport (TDT) protein family and is a homolog of plant Slow Anion Channel-Associated 1 (SLAC1). The tehA gene encodes an integral membrane protein that has been shown to have efflux activity of quaternary ammonium compounds. TehA protein of Escherichia coli functions as a tellurite-resistance uptake permease. 301
28104 187765 cd09325 TDT_C4-dicarb_trans C4-dicarboxylate transporters of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This subfamily contains bacterial C4-dicarboxylate transporters, which are part of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. It includes Tellurite resistance protein TehA; the tehA gene encodes an integral membrane protein that has been shown to have efflux activity of quaternary ammonium compounds. TehA protein of Escherichia coli functions as a tellurite-resistance uptake permease. 293
28105 188712 cd09326 LIM_CRP_like The LIM domains of Cysteine Rich Protein (CRP) family. The LIM domains of Cysteine Rich Protein (CRP) family: Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The known CRP family members include CRP1, CRP2, and CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28106 188713 cd09327 LIM1_abLIM The first LIM domain of actin binding LIM (abLIM) proteins. The first LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly. They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28107 188714 cd09328 LIM2_abLIM The second LIM domain of actin binding LIM (abLIM) proteins. The second LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly. They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28108 188715 cd09329 LIM3_abLIM The third LIM domain of actin binding LIM (abLIM) proteins. The third LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly. They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28109 188716 cd09330 LIM4_abLIM The fourth LIM domain of actin binding LIM (abLIM) proteins. The fourth LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly. They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28110 188717 cd09331 LIM1_PINCH The first LIM domain of protein PINCH. The first LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28111 188718 cd09332 LIM2_PINCH The second LIM domain of protein PINCH. The second LIM domain of protein PINCH: PINCH plays a pivotal role in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners. These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28112 188719 cd09333 LIM3_PINCH The third LIM domain of protein PINCH. The third LIM domain of protein PINCH: PINCH plays pivotal roles in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners. These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 51
28113 188720 cd09334 LIM4_PINCH The fourth LIM domain of protein PINCH. The fourth LIM domain of protein PINCH: PINCH plays a pivotal role in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners. These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton. The PINCH LIM4 domain recognizes the third SH3 domain of another adaptor protein, Nck2. This step is an important component of integrin signaling event. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28114 188721 cd09335 LIM5_PINCH The fifth LIM domain of protein PINCH. The fifth LIM domain of protein PINCH: PINCH plays pivotal roles in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners. These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28115 259830 cd09336 LIM1_Paxillin_like The first LIM domain of the paxillin like protein family. The first LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region. Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and has been identified as a component of the osteoclast podosomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28116 188723 cd09337 LIM2_Paxillin_like The second LIM domain of the paxillin like protein family. The second LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region. Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and has been identified as a component of the osteoclast podosomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28117 188724 cd09338 LIM3_Paxillin_like The third LIM domain of the paxillin like protein family. The third LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region. Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and has been identified as a component of the osteoclast podosomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28118 188725 cd09339 LIM4_Paxillin_like The fourth LIM domain of the Paxillin-like protein family. The fourth LIM domain of the Paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region. Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and has been identified as a component of the osteoclast podosomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28119 188726 cd09340 LIM1_Testin_like The first LIM domain of Testin-like family. The first LIM domain of Testin_like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28120 188727 cd09341 LIM2_Testin_like The second LIM domain of Testin-like family. The second LIM domain of Testin-like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28121 188728 cd09342 LIM3_Testin_like The third LIM domain of Testin-like family. The third LIM domain of Testin_like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 57
28122 188729 cd09343 LIM1_FHL The first LIM domain of Four and a half LIM domains protein (FHL). The first LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells. FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28123 188730 cd09344 LIM1_FHL1 The first LIM domain of Four and a half LIM domains protein 1. The first LIM domain of Four and a half LIM domains protein 1 (FHL1): FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28124 188731 cd09345 LIM2_FHL The second LIM domain of Four and a half LIM domains protein (FHL). The second LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells. FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28125 188732 cd09346 LIM3_FHL The third LIM domain of Four and a half LIM domains protein (FHL). The third LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells. FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28126 188733 cd09347 LIM4_FHL The fourth LIM domain of Four and a half LIM domains protein (FHL). The fourth LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells. FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28127 188734 cd09348 LIM4_FHL1 The fourth LIM domain of Four and a half LIM domains protein 1 (FHL1). The fourth LIM domain of Four and a half LIM domains protein 1 (FHL1): FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 64
28128 188735 cd09349 LIM1_Zyxin The first LIM domain of Zyxin. The first LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal. Localized at sites of cell substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 87
28129 188736 cd09350 LIM1_TRIP6 The first LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The first LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-Induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 54
28130 188737 cd09351 LIM1_LPP The first LIM domain of lipoma preferred partner (LPP). The first LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal and proline-rich region at the N-terminal. LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 54
28131 188738 cd09352 LIM1_Ajuba_like The first LIM domain of Ajuba-like proteins. The first LIM domain of Ajuba-like proteins: Ajuba like LIM protein family includes three highly homologous proteins Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adheren junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In nucleus, Ajuba functions as a corepressor for the zinc finger-protein Snail. It binds to the SNAG repression domain of Snail through its LIM region. Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptor (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. 
The inhibition may be mediated through an interaction with the protein barrier-to-autointegration (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as tumor repressor to block lung tumor cell line in vitro and in vivo. Recent studies revealed that LIM proteins Wtip, LIMD1 and Ajuba interact with components of RNA induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 54
28132 188739 cd09353 LIM2_Zyxin The second LIM domain of Zyxin. The second LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal. Localized at sites of cell substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 60
28133 188740 cd09354 LIM2_LPP The second LIM domain of lipoma preferred partner (LPP). The second LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal and proline-rich region at the N-terminal. LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 60
28134 188741 cd09355 LIM2_Ajuba_like The second LIM domain of Ajuba-like proteins. The second LIM domain of Ajuba-like proteins: Ajuba like LIM protein family includes three highly homologous proteins Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adheren junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In nucleus, Ajuba functions as a corepressor for the zinc finger-protein Snail. It binds to the SNAG repression domain of Snail through its LIM region. Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptor (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. 
The inhibition may be mediated through an interaction with the protein barrier-to-autointegration (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as tumor repressor to block lung tumor cell line in vitro and in vivo. Recent studies revealed that LIM proteins Wtip, LIMD1 and Ajuba interact with components of RNA induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28135 188742 cd09356 LIM2_TRIP6 The second LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The second LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-Induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28136 188743 cd09357 LIM3_Zyxin_like The third LIM domain of Zyxin-like family. The third LIM domain of Zyxin like family: This family includes Ajuba, Limd1, WTIP, Zyxin, LPP, and Trip6 LIM proteins. Members of Zyxin family contain three tandem C-terminal LIM domains, and a proline-rich N-terminal region. Zyxin proteins are detected primarily in focal adhesion plaques. They function as scaffolds, participating in the assembly of multiple interactions and signal transduction networks, which regulate cell adhesion, spreading, and motility. They can also shuffle into nucleus. In nucleus, zyxin proteins affect gene transcription by interaction with a variety of nuclear proteins, including several transcription factors, playing regulating roles in cell proliferation, differentiation and apoptosis. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 63
28137 188744 cd09358 LIM_Mical_like The LIM domain of Mical (molecule interacting with CasL) like family. The LIM domain of Mical (molecule interacting with CasL) like family: Known members of this family include the LIM domain containing proteins Mical (molecule interacting with CasL), pollen specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2) and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth and mobility. Eplin has also been found to be a tumor suppressor. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28138 188745 cd09359 LIM_LASP_like The LIM domain of LIM and SH3 Protein (LASP)-like proteins. The LIM domain of LIM and SH3 Protein (LASP) like proteins: This family contains two types of LIM containing proteins; LASP and N-RAP. LASP family contains two highly homologous members, LASP-1 and LASP-2. LASP contains a LIM motif at its amino terminus, a src homology 3 (SH3) domains at its C-terminal part, and a nebulin-like region in the middle. LASP-1 and -2 are highly conserved in their LIM, nebulin-like, and SH3 domains, but differ significantly at their linker regions. Both proteins are ubiquitously expressed and involved in cytoskeletal architecture, especially in the organization of focal adhesions. LASP-1 and LASP-2, are important during early embryo- and fetogenesis and are highly expressed in the central nervous system of the adult. However, only LASP-1 seems to participate significantly in neuronal differentiation and plays an important functional role in migration and proliferation of certain cancer cells while the role of LASP-2 is more structural. The expression of LASP-1 in breast tumors is increased significantly. N-RAP is a muscle-specific protein concentrated at myotendinous junctions in skeletal muscle and intercalated disks in cardiac muscle. LIM domain is found at the N-terminus of N-RAP and the C-terminal of N-RAP contains a region with multiple of nebulin repeats. N-RAP functions as a scaffolding protein that organizes alpha-actinin and actin into symmetrical I-Z-I structures in developing myofibrils. Nebulin repeat is known as actin binding domain. The N-RAP is hypothesized to form antiparallel dimerization via its LIM domain. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28139 188746 cd09360 LIM_ALP_like The LIM domain of ALP (actinin-associated LIM protein) family. This family represents the LIM domain of ALP (actinin-associated LIM protein) family. Four proteins: ALP, CLP36, RIL, and Mystique have been classified into the ALP subfamily of LIM domain proteins. Each member of the subfamily contains an N-terminal PDZ domain and a C-terminal LIM domain. Functionally, these proteins bind to alpha-actinin through their PDZ domains and bind or other signaling molecules through their LIM domains. ALP proteins have been implicated in cardiac and skeletal muscle structure, function and disease, platelet, and epithelial cell motility. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28140 188747 cd09361 LIM1_Enigma_like The first LIM domain of Enigma-like family. The first LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain. It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28141 188748 cd09362 LIM2_Enigma_like The second LIM domain of Enigma-like family. The second LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain. It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28142 188749 cd09363 LIM3_Enigma_like The third LIM domain of Enigma-like family. The third LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain. It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28143 188750 cd09364 LIM1_LIMK The first LIM domain of LIMK (LIM domain Kinase ). The first LIM domain of LIMK (LIM domain Kinase ): LIMK protein family is comprised of two members LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. The mechanism of the activation is to phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, and altering the rate of actin depolymerisation. LIMKs can function in both cytoplasm and nucleus and are expressed in all tissues. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. However, LIMK1 and LIMk2 have different cellular locations. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The LIM domains of LIMK have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28144 188751 cd09365 LIM2_LIMK The second LIM domain of LIMK (LIM domain Kinase ). The second LIM domain of LIMK (LIM domain Kinase ): LIMK protein family is comprised of two members LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. The mechanism of the activation is to phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, and altering the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus and are expressed in all tissues. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. However, LIMK1 and LIMk2 have different cellular locations. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The LIM domains of LIMK have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28145 188752 cd09366 LIM1_Isl The first LIM domain of Isl, a member of LHX protein family. The first LIM domain of Isl: Isl is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Isl1 and Isl2 are the two conserved members of this family. Proteins in this group are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl-1 is one of the LHX proteins isolated originally by virtue of its ability to bind DNA sequences from the 5'-flanking region of the rat insulin gene in pancreatic insulin-producing cells. Mice deficient in Isl-1 fail to form the dorsal exocrine pancreas and islet cells fail to differentiate. On the other hand, Isl-1 takes part in the pituitary development by activating the gonadotropin-releasing hormone receptor gene together with LHX3 and steroidogenic factor 1. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord where it plays a role in motor neuron development. Same as Isl1, Isl2 may also be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28146 188753 cd09367 LIM1_Lhx1_Lhx5 The first LIM domain of Lhx1 (also known as Lim1) and Lhx5. The first LIM domain of Lhx1 (also known as Lim1) and Lhx5. Lhx1 and Lhx5 are closely related members of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx1 is required for regulating the vertebrate head organizer, the nervous system, and female reproductive tract development. During embryogenesis in the mouse, Lhx1 is expressed early in mesodermal tissue, then later during urogenital, kidney, liver, and nervous system development. In the adult, expression is restricted to the kidney and brain. A mouse embryos with Lhx1 gene knockout cannot grow normal anterior head structures, kidneys, and gonads, but with normally developed trunk and tail morphology. In the developing nervous system, Lhx1 is required to direct the trajectories of motor axons in the limb. Lhx1 null female mice lack the oviducts and uterus. Lhx5 protein may play complementary or overlapping roles with Lhx1. The expression of Lhx5 in the anterior portion of the mouse neural tube suggests a role in patterning of the forebrain. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28147 188754 cd09368 LIM1_Lhx3_Lhx4 The first LIM domain of Lhx3 and Lhx4 family. The first LIM domain of Lhx3-Lhx4 family: Lhx3 and Lhx4 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. The LHX3 and LHX4 LIM-homeodomain transcription factors play essential roles in pituitary gland and nervous system development. Although LHX3 and LHX4 share marked sequence homology, the genes have different expression patterns. They play overlapping but distinct roles during the establishment of the specialized cells of the mammalian pituitary gland and the nervous system. Lhx3 proteins have been shown to bind directly to the promoters/enhancers of several pituitary hormone genes to increase transcription. Lhx3a and Lhx3b, whose mRNAs have distinct temporal expression profiles during development, are two isoforms of Lhx3. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to the isoform Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulatory roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28148 188755 cd09369 LIM1_Lhx2_Lhx9 The first LIM domain of Lhx2 and Lhx9 family. The first LIM domain of Lhx2 and Lhx9 family: Lhx2 and Lhx9 are highly homologous LHX regulatory proteins. They belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Although Lhx2 and Lhx9 are highly homologous, they seem to play regulatory roles in different organs. In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. Lhx9 is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development. Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28149 188756 cd09370 LIM1_Lmx1a The first LIM domain of Lmx1a. The first LIM domain of Lmx1a: Lmx1a belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Mouse Lmx1a is expressed in multiple tissues, including the roof plate of the neural tube, the developing brain, the otic vesicles, the notochord, and the pancreas. Human Lmx1a can be found in pancreas, skeletal muscle, adipose tissue, developing brain, mammary glands, and pituitary. The functions of Lmx1a in the developing nervous system were revealed by studies of mutant mice. In mouse, mutations in Lmx1a result in failure of the roof plate to develop. Lmx1a may act upstream of other roof plate markers such as MafB, Gdf7, Bmp6, and Bmp7. Further characterization of these mice reveals numerous defects including disorganized cerebellum, hippocampus, and cortex; altered pigmentation; female sterility; skeletal defects; and behavioral abnormalities. Within pancreatic cells, the Lmx1a protein interacts synergistically with the bHLH transcription factor E47 to activate the insulin gene enhancer/promoter. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28150 188757 cd09371 LIM1_Lmx1b The first LIM domain of Lmx1b. The first LIM domain of Lmx1b: Lmx1b belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. In mouse, Lmx1b functions in the developing limbs and eyes, the kidneys, the brain, and in cranial mesenchyme. Disruption of the Lmx1b gene results in kidney and limb defects. In the brain, Lmx1b is important for generation of mesencephalic dopamine neurons and the differentiation of serotonergic neurons. In the mouse eye, Lmx1b regulates anterior segment (cornea, iris, ciliary body, trabecular meshwork, and lens) development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28151 188758 cd09372 LIM2_FBLP-1 The second LIM domain of the filamin-binding LIM protein-1 (FBLP-1). The second LIM domain of the filamin-binding LIM protein-1 (FBLP-1): FBLP-1 contains a proline-rich domain near its N terminus and two LIM domains at its C terminus. FBLP-1 mRNA was detected in a variety of tissues and cells including platelets and endothelial cells. FBLP-1 binds to filamins. The association between filamin B and FBLP-1 may play an as yet unknown role in cytoskeletal function, cell adhesion, and cell motility. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28152 188759 cd09373 LIM1_AWH The first LIM domain of Arrowhead (AWH). The first LIM domain of Arrowhead (AWH): Arrowhead belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. During embryogenesis of Drosophila, Arrowhead is expressed in each abdominal segment and in the labial segment. Late in embryonic development, expression of arrowhead is refined to the abdominal histoblasts and salivary gland imaginal ring cells themselves. The Arrowhead gene is required for establishment of a subset of imaginal tissues: the abdominal histoblasts and the salivary gland imaginal rings. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28153 188760 cd09374 LIM2_Isl The second LIM domain of Isl, a member of LHX protein family. The second LIM domain of Isl: Isl is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Isl1 and Isl2 are the two conserved members of this family. Proteins in this group are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl-1 is one of the LHX proteins isolated originally by virtue of its ability to bind DNA sequences from the 5'-flanking region of the rat insulin gene in pancreatic insulin-producing cells. Mice deficient in Isl-1 fail to form the dorsal exocrine pancreas and islet cells fail to differentiate. Isl-1 also takes part in pituitary development by activating the gonadotropin-releasing hormone receptor gene together with LHX3 and steroidogenic factor 1. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord where it plays a role in motor neuron development. Like Isl1, Isl2 may also be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28154 188761 cd09375 LIM2_Lhx1_Lhx5 The second LIM domain of Lhx1 (also known as Lim1) and Lhx5. The second LIM domain of Lhx1 (also known as Lim1) and Lhx5. Lhx1 and Lhx5 are closely related members of the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx1 is required for regulating the vertebrate head organizer, the nervous system, and female reproductive tract development. During embryogenesis in the mouse, Lhx1 is expressed early in mesodermal tissue, then later during urogenital, kidney, liver, and nervous system development. In the adult, expression is restricted to the kidney and brain. Mouse embryos with an Lhx1 gene knockout fail to form normal anterior head structures, kidneys, and gonads, but develop normal trunk and tail morphology. In the developing nervous system, Lhx1 is required to direct the trajectories of motor axons in the limb. Lhx1 null female mice lack the oviducts and uterus. Lhx5 protein may play complementary or overlapping roles with Lhx1. The expression of Lhx5 in the anterior portion of the mouse neural tube suggests a role in patterning of the forebrain. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28155 188762 cd09376 LIM2_Lhx3_Lhx4 The second LIM domain of Lhx3-Lhx4 family. The second LIM domain of Lhx3-Lhx4 family: Lhx3 and Lhx4 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. The LHX3 and LHX4 LIM-homeodomain transcription factors play essential roles in pituitary gland and nervous system development. Although LHX3 and LHX4 share marked sequence homology, the genes have different expression patterns. They play overlapping but distinct roles during the establishment of the specialized cells of the mammalian pituitary gland and the nervous system. Lhx3 proteins have been shown to bind directly to the promoters/enhancers of several pituitary hormone genes to increase transcription. Lhx3a and Lhx3b, whose mRNAs have distinct temporal expression profiles during development, are two isoforms of Lhx3. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to the isoform Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulatory roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28156 188763 cd09377 LIM2_Lhx2_Lhx9 The second LIM domain of Lhx2 and Lhx9 family. The second LIM domain of Lhx2 and Lhx9 family: Lhx2 and Lhx9 are highly homologous LHX regulatory proteins. They belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Although Lhx2 and Lhx9 are highly homologous, they seem to play regulatory roles in different organs. In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. Lhx9 is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development. Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28157 188764 cd09378 LIM2_Lmx1a_Lmx1b The second LIM domain of Lmx1a and Lmx1b. The second LIM domain of Lmx1a and Lmx1b: Lmx1a and Lmx1b belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Mouse Lmx1a is expressed in multiple tissues, including the roof plate of the neural tube, the developing brain, the otic vesicles, the notochord, and the pancreas. In mouse, mutations in Lmx1a result in failure of the roof plate to develop. Lmx1a may act upstream of other roof plate markers such as MafB, Gdf7, Bmp6, and Bmp7. Further characterization of these mice reveals numerous defects including disorganized cerebellum, hippocampus, and cortex; altered pigmentation; female sterility; skeletal defects; and behavioral abnormalities. In the mouse, Lmx1b functions in the developing limbs and eyes, the kidneys, the brain, and in cranial mesenchyme. Disruption of the Lmx1b gene results in kidney and limb defects. In the brain, Lmx1b is important for generation of mesencephalic dopamine neurons and the differentiation of serotonergic neurons. In the mouse eye, Lmx1b regulates anterior segment (cornea, iris, ciliary body, trabecular meshwork, and lens) development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28158 188765 cd09379 LIM2_AWH The second LIM domain of Arrowhead (AWH). The second LIM domain of Arrowhead (AWH): Arrowhead belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. During embryogenesis of Drosophila, Arrowhead is expressed in each abdominal segment and in the labial segment. Late in embryonic development, expression of arrowhead is refined to the abdominal histoblasts and salivary gland imaginal ring cells themselves. The Arrowhead gene is required for establishment of a subset of imaginal tissues: the abdominal histoblasts and the salivary gland imaginal rings. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28159 188766 cd09380 LIM1_Lhx6 The first LIM domain of Lhx6. The first LIM domain of Lhx6. Lhx6 is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Lhx6 functions in the brain and nervous system. It is expressed at high levels in several regions of the embryonic mouse CNS, including the telencephalon and hypothalamus, and the first branchial arch. Lhx6 is proposed to have a role in patterning of the mandible and maxilla, and in signaling during odontogenesis. In brain sections, knockdown of Lhx6 gene blocks the normal migration of neurons to the cortex. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28160 188767 cd09381 LIM1_Lhx7_Lhx8 The first LIM domain of Lhx7 and Lhx8. The first LIM domain of Lhx7 and Lhx8: Lhx7 and Lhx8 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Studies using mutant mice have revealed roles for Lhx7 and Lhx8 in the development of cholinergic neurons in the telencephalon and in basal forebrain development. Mice lacking alleles of the LIM-homeobox gene Lhx7 or Lhx8 display a dramatically reduced number of forebrain cholinergic neurons. In addition, Lhx7 mutation affects male and female mice differently, with females appearing more affected than males. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28161 188768 cd09382 LIM2_Lhx6 The second LIM domain of Lhx6. The second LIM domain of Lhx6. Lhx6 is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Lhx6 functions in brain and nervous system. It is expressed at high levels in several regions of the embryonic mouse CNS, including the telencephalon and hypothalamus, and the first branchial arch. Lhx6 is proposed to have a role in patterning of the mandible and maxilla, and in signaling during odontogenesis. In brain sections, knockdown of Lhx6 gene blocks the normal migration of neurons to the cortex. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28162 188769 cd09383 LIM2_Lhx7_Lhx8 The second LIM domain of Lhx7 and Lhx8. The second LIM domain of Lhx7 and Lhx8: Lhx7 and Lhx8 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Studies using mutant mice have revealed roles for Lhx7 and Lhx8 in the development of cholinergic neurons in the telencephalon and in basal forebrain development. Mice lacking alleles of the LIM-homeobox gene Lhx7 or Lhx8 display a dramatically reduced number of forebrain cholinergic neurons. In addition, Lhx7 mutation affects male and female mice differently, with females appearing more affected than males. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28163 188770 cd09384 LIM1_LMO2 The first LIM domain of LMO2 (LIM domain only protein 2). The first LIM domain of LMO2 (LIM domain only protein 2): LMO2 is a nuclear protein that plays important roles in transcriptional regulation and development. The two tandem LIM domains of LMO2 support the assembly of a crucial cell-regulatory complex by interacting with both the TAL1-E47 and GATA1 transcription factors to form a DNA-binding complex that is capable of transcriptional activation. LMOs have also been shown to be involved in oncogenesis. LMO1 and LMO2 are activated in T-cell acute lymphoblastic leukemia by distinct chromosomal translocations. LMO2 was also shown to be involved in erythropoiesis and is required for hematopoiesis in adult animals. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28164 188771 cd09385 LIM2_LMO2 The second LIM domain of LMO2 (LIM domain only protein 2). The second LIM domain of LMO2 (LIM domain only protein 2): LMO2 is a nuclear protein that plays important roles in transcriptional regulation and development. The two tandem LIM domains of LMO2 support the assembly of a crucial cell-regulatory complex by interacting with both the TAL1-E47 and GATA1 transcription factors to form a DNA-binding complex that is capable of transcriptional activation. LMOs have also been shown to be involved in oncogenesis. LMO1 and LMO2 are activated in T-cell acute lymphoblastic leukemia by distinct chromosomal translocations. LMO2 was also shown to be involved in erythropoiesis and is required for hematopoiesis in adult animals. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28165 188772 cd09386 LIM1_LMO4 The first LIM domain of LMO4 (LIM domain only protein 4). The first LIM domain of LMO4 (LIM domain only protein 4): LMO4 is a nuclear protein that plays important roles in transcriptional regulation and development. LMO4 is involved in various functions in tumorigenesis and cellular differentiation. LMO4 proteins regulate gene expression by interacting with a wide variety of transcription factors and cofactors to form large transcription complexes. It can interact with Smad proteins, and associate with the promoter of the PAI-1 (plasminogen activator inhibitor-1) gene in a TGFbeta (transforming growth factor beta)-dependent manner. LMO4 can also form a complex with transcription regulator CREB (cAMP response element-binding protein) and interact with CLIM1 and CLIM2. In breast tissue, LMO4 interacts with multiple proteins, including the cofactor CtIP [CtBP (C-terminal binding protein)-interacting protein], the breast and ovarian tumor suppressor BRCA1 (breast-cancer susceptibility gene 1) and the LIM-domain-binding protein LDB1. Functionally, LMO4 is shown to repress BRCA1-mediated transcription activation, thus invoking a potential role for LMO4 as a negative regulator of BRCA1 in sporadic breast cancer. LMO4 also forms a complex with ERa (oestrogen receptor alpha), MTA1 (metastasis tumor antigen 1), and HDACs (histone deacetylases), implying that LMO4 is also a component of the MTA1 corepressor complex. Over-expressed LMO4 represses ERa transactivation functions in an HDAC-dependent manner, and contributes to the process of breast cancer progression by allowing the development of ERa-negative phenotypes. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28166 188773 cd09387 LIM2_LMO4 The second LIM domain of LMO4 (LIM domain only protein 4). The second LIM domain of LMO4 (LIM domain only protein 4): LMO4 is a nuclear protein that plays important roles in transcriptional regulation and development. LMO4 is involved in various functions in tumorigenesis and cellular differentiation. LMO4 proteins regulate gene expression by interacting with a wide variety of transcription factors and cofactors to form large transcription complexes. It can interact with Smad proteins, and associate with the promoter of the PAI-1 (plasminogen activator inhibitor-1) gene in a TGFbeta (transforming growth factor beta)-dependent manner. LMO4 can also form a complex with transcription regulator CREB (cAMP response element-binding protein) and interact with CLIM1 and CLIM2. In breast tissue, LMO4 interacts with multiple proteins, including the cofactor CtIP [CtBP (C-terminal binding protein)-interacting protein], the breast and ovarian tumor suppressor BRCA1 (breast-cancer susceptibility gene 1) and the LIM-domain-binding protein LDB1. Functionally, LMO4 is shown to repress BRCA1-mediated transcription activation, thus invoking a potential role for LMO4 as a negative regulator of BRCA1 in sporadic breast cancer. LMO4 also forms a complex with ERa (oestrogen receptor alpha), MTA1 (metastasis tumor antigen 1), and HDACs (histone deacetylases), implying that LMO4 is also a component of the MTA1 corepressor complex. Over-expressed LMO4 represses ERa transactivation functions in an HDAC-dependent manner, and contributes to the process of breast cancer progression by allowing the development of ERa-negative phenotypes. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28167 188774 cd09388 LIM1_LMO1_LMO3 The first LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3). The first LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3): LMO1 and LMO3 are highly homologous and belong to the LMO protein family. LMO1 and LMO3 are nuclear proteins that play important roles in transcriptional regulation and development. As LIM domains lack intrinsic DNA-binding activity, nuclear LMOs are involved in transcriptional regulation by forming complexes with other transcription factors or cofactors. For example, LMO1 interacts with the bHLH domain of the bHLH transcription factor TAL1 (T-cell acute leukemia 1)/SCL (stem cell leukemia). LMO1 inhibits the expression of TAL1/SCL target genes. LMO3 facilitates p53 binding to its response elements, which suggests that LMO3 acts as a co-repressor of p53, suppressing p53-dependent transcriptional regulation. In addition, LMO3 interacts with the neuronal transcription factor HEN2 and acts as an oncogene in neuroblastoma. Another binding partner of LMO3 is calcium- and integrin-binding protein CIB, which binds via the second LIM domain (LIM2) of LMO3. One role of the CIB/LMO3 complex is to inhibit cell proliferation. Although LMO1 and LMO3 are highly homologous proteins, they play different roles in the regulation of the pituitary glycoprotein hormone alpha-subunit (alpha GSU) gene. Alpha GSU promoter activity was markedly repressed by LMO1 but activated by LMO3. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28168 188775 cd09389 LIM2_LMO1_LMO3 The second LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3). The second LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3): LMO1 and LMO3 are highly homologous and belong to the LMO protein family. LMO1 and LMO3 are nuclear proteins that play important roles in transcriptional regulation and development. As LIM domains lack intrinsic DNA-binding activity, nuclear LMOs are involved in transcriptional regulation by forming complexes with other transcription factors or cofactors. For example, LMO1 interacts with the bHLH domain of the bHLH transcription factor TAL1 (T-cell acute leukemia 1)/SCL (stem cell leukemia). LMO1 inhibits the expression of TAL1/SCL target genes. LMO3 facilitates p53 binding to its response elements, which suggests that LMO3 acts as a co-repressor of p53, suppressing p53-dependent transcriptional regulation. In addition, LMO3 interacts with the neuronal transcription factor HEN2 and acts as an oncogene in neuroblastoma. Another binding partner of LMO3 is calcium- and integrin-binding protein CIB, which binds via the second LIM domain (LIM2) of LMO3. One role of the CIB/LMO3 complex is to inhibit cell proliferation. Although LMO1 and LMO3 are highly homologous proteins, they play different roles in the regulation of the pituitary glycoprotein hormone alpha-subunit (alpha GSU) gene. Alpha GSU promoter activity was markedly repressed by LMO1 but activated by LMO3. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28169 188776 cd09390 LIM2_dLMO The second LIM domain of dLMO (Beaderx). The second LIM domain of dLMO (Beaderx): dLMO is a nuclear protein that plays important roles in transcriptional regulation and development. In Drosophila dLMO modulates the activity of LIM-homeodomain protein Apterous (Ap), which regulates the formation of the dorsal-ventral axis of the Drosophila wing. Biochemical analysis shows that dLMO protein influences the activity of Apterous by binding of its cofactor Chip. Further studies have shown that dLMO proteins might function in an evolutionarily conserved mechanism involved in patterning the appendages. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28170 188777 cd09391 LIM1_Lrg1p_like The first LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The first LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 57
28171 188778 cd09392 LIM2_Lrg1p_like The second LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The second LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28172 188779 cd09393 LIM3_Lrg1p_like The third LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The third LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 56
28173 188780 cd09394 LIM1_Rga The first LIM domain of Rga GTPase-Activating Proteins. The first LIM domain of Rga GTPase-Activating Proteins: The members of this family contain two tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Rga activates GTPases during polarized morphogenesis. In yeast, a known regulating target of Rga is CDC42p, a small GTPase. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 55
28174 188781 cd09395 LIM2_Rga The second LIM domain of Rga GTPase-Activating Proteins. The second LIM domain of Rga GTPase-Activating Proteins: The members of this family contain two tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Rga activates GTPases during polarized morphogenesis. In yeast, a known regulating target of Rga is CDC42p, a small GTPase. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28175 188782 cd09396 LIM_DA1 The Lim domain of DA1. The Lim domain of DA1: DA1 contains one copy of LIM domain and a domain of unknown function. DA1 is predicted as an ubiquitin receptor, which sets final seed and organ size by restricting the period of cell proliferation. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28176 188783 cd09397 LIM1_UF1 LIM domain in proteins of unknown function. The first Lim domain of a LIM domain containing protein: The functions of the proteins are unknown. The members of this family contain two copies of LIM domain. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 58
28177 188784 cd09400 LIM_like_1 LIM domain in proteins of unknown function. LIM domain in proteins of unknown function: LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation, and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. The LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid). 61
28178 188785 cd09401 LIM_TLP_like The LIM domains of thymus LIM protein (TLP). The LIM domain of thymus LIM protein (TLP) like proteins: This family includes the LIM domains of TLP and CRIP (Cysteine-Rich Intestinal Protein). TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs. Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells. TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. CRIP is a short LIM protein with only one LIM domain. CRIP gene is developmentally regulated and can be induced by glucocorticoid hormones during the first three postnatal weeks. The domain shows close sequence homology to LIM domain of thymus LIM protein. However, unlike the TLP proteins which have two LIM domains, the members of this family have only one LIM domain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28179 188786 cd09402 LIM1_CRP The first LIM domain of Cysteine Rich Protein (CRP). The first LIM domain of Cysteine Rich Protein (CRP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. It is evident that CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. Although members of the CRP family share common binding partners, they are also capable of recognizing different and specific targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28180 188787 cd09403 LIM2_CRP The second LIM domain of Cysteine Rich Protein (CRP). The second LIM domain of Cysteine Rich Protein (CRP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. It is evident that CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. Although members of the CRP family share common binding partners, they are also capable of recognizing different and specific targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28181 188788 cd09404 LIM1_MLP84B_like The LIM domain of Mlp84B and Mlp60A. The LIM domain of Mlp84B and Mlp60A: Mlp84B and Mlp60A belong to the CRP LIM domain protein family. The Mlp84B protein contains five copies of the LIM domains, each followed by a Glycin Rich Region (GRR). However, only the first LIM domain of Mlp84B is in this family. Mlp60A exhibits only one LIM domain linked to a glycin-rich region. Mlp84B and Mlp60A are muscle specific proteins and have been implicated in muscle differentiation. While Mlp84B transcripts are enriched at the terminal ends of muscle fibers, Mlp60A transcripts are found throughout the muscle fibers. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28182 188789 cd09405 LIM1_Paxillin The first LIM domain of paxillin. The first LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28183 188790 cd09406 LIM1_Leupaxin The first LIM domain of Leupaxin. The first LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL. When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28184 188791 cd09407 LIM2_Paxillin The second LIM domain of paxillin. The second LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28185 188792 cd09408 LIM2_Leupaxin The second LIM domain of Leupaxin. The second LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL. When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28186 188793 cd09409 LIM3_Paxillin The third LIM domain of paxillin. The third LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28187 188794 cd09410 LIM3_Leupaxin The third LIM domain of Leupaxin. The third LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL. When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28188 188795 cd09411 LIM4_Paxillin The fourth LIM domain of Paxillin. The fourth LIM domain of Paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28189 188796 cd09412 LIM4_Leupaxin The fourth LIM domain of Leupaxin. The fourth LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL. When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28190 188797 cd09413 LIM1_Testin The first LIM domain of Testin. The first LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of Testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28191 188798 cd09414 LIM1_LIMPETin The first LIM domain of protein LIMPETin. The first LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the Testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28192 188799 cd09415 LIM1_Prickle The first LIM domain of Prickle. The first LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28193 188800 cd09416 LIM2_Testin The second LIM domain of Testin. The second LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28194 188801 cd09417 LIM2_LIMPETin_like The second LIM domain of protein LIMPETin and related proteins. The second LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28195 188802 cd09418 LIM2_Prickle The second LIM domain of Prickle. The second LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Two forms of prickles have been identified; namely prickle 1 and prickle 2. Prickle 1 and prickle 2 are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28196 188803 cd09419 LIM3_Testin The third LIM domain of Testin. The third LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers at cell-cell-contact areas and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of Testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28197 188804 cd09420 LIM3_Prickle The third LIM domain of Prickle. The third LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Two forms of prickles have been identified; namely prickle 1 and prickle 2. Prickle 1 and prickle 2 are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28198 188805 cd09421 LIM3_LIMPETin The third LIM domain of protein LIMPETin. The third LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28199 188806 cd09422 LIM1_FHL2 The first LIM domain of Four and a half LIM domains protein 2 (FHL2). The first LIM domain of Four and a half LIM domains protein 2 (FHL2): FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung at lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 62
28200 188807 cd09423 LIM1_FHL3 The first LIM domain of Four and a half LIM domains protein 3 (FHL3). The first LIM domain of Four and a half LIM domains protein 3 (FHL3): FHL3 is highly expressed in skeletal and cardiac muscles and possesses transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also shown to possess auto-activation ability, and the second zinc finger motif in the fourth LIM domain was confirmed to be responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28201 188808 cd09424 LIM2_FHL1 The second LIM domain of Four and a half LIM domains protein 1 (FHL1). The second LIM domain of Four and a half LIM domains protein 1 (FHL1): FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28202 188809 cd09425 LIM4_LIMPETin The fourth LIM domain of protein LIMPETin. The fourth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the Testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28203 188810 cd09426 LIM2_FHL2 The second LIM domain of Four and a half LIM domains protein 2 (FHL2). The second LIM domain of Four and a half LIM domains protein 2 (FHL2): FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 57
28204 188811 cd09427 LIM2_FHL3 The second LIM domain of Four and a half LIM domains protein 3 (FHL3). The second LIM domain of Four and a half LIM domains protein 3 (FHL3): FHL3 is highly expressed in skeletal and cardiac muscles and possesses transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also shown to possess auto-activation ability, and the second zinc finger motif in the fourth LIM domain was confirmed to be responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28205 188812 cd09428 LIM2_FHL5 The second LIM domain of Four and a half LIM domains protein 5 (FHL5). The second LIM domain of Four and a half LIM domains protein 5 (FHL5): FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors, which are highly expressed in male germ cells and is required for post-meiotic gene expression. FHL5 associates with CREM and confers a powerful transcriptional activation function. Activation by CREB is known to occur upon phosphorylation at an essential regulatory site and the subsequent interaction with the ubiquitous coactivator CREB-binding protein (CBP). However, the activation by FHL5 is independent of phosphorylation and CBP association. It represents a new route for transcriptional activation by CREM and CREB. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28206 188813 cd09429 LIM3_FHL1 The third LIM domain of Four and a half LIM domains protein 1 (FHL1). The third LIM domain of Four and a half LIM domains protein 1 (FHL1): FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28207 188814 cd09430 LIM5_LIMPETin The fifth LIM domain of protein LIMPETin. The fifth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28208 188815 cd09431 LIM3_Fhl2 The third LIM domain of Four and a half LIM domains protein 2 (FHL2). The third LIM domain of Four and a half LIM domains protein 2 (FHL2): FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 57
28209 188816 cd09432 LIM6_LIMPETin The sixth LIM domain of protein LIMPETin. The sixth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28210 188817 cd09433 LIM4_FHL2 The fourth LIM domain of Four and a half LIM domains protein 2 (FHL2). The fourth LIM domain of Four and a half LIM domains protein 2 (FHL2): FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 58
28211 188818 cd09434 LIM4_FHL3 The fourth LIM domain of Four and a half LIM domains protein 3 (FHL3). The fourth LIM domain of Four and a half LIM domains protein 3 (FHL3): FHL3 is highly expressed in skeletal and cardiac muscles and possesses transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also shown to possess auto-activation ability, and the second zinc finger motif in the fourth LIM domain was confirmed to be responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28212 188819 cd09435 LIM3_Zyxin The third LIM domain of Zyxin. The third LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal. Localized at sites of cell-substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 67
28213 188820 cd09436 LIM3_TRIP6 The third LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The third LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-Induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 66
28214 188821 cd09437 LIM3_LPP The third LIM domain of lipoma preferred partner (LPP). The third LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal and proline-rich region at the N-terminal. LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator-stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 68
28215 188822 cd09438 LIM3_Ajuba_like The third LIM domain of Ajuba-like proteins. The third LIM domain of Ajuba-like proteins: Ajuba like LIM protein family includes three highly homologous proteins Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adheren junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In nucleus, Ajuba functions as a corepressor for the zinc finger-protein Snail. It binds to the SNAG repression domain of Snail through its LIM region. Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptor (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. The inhibition may be mediated through an interaction with the protein barrier-to-autointegration (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as tumor repressor to block lung tumor cell line in vitro and in vivo. Recent studies revealed that LIM proteins Wtip, LIMD1 and Ajuba interact with components of RNA induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 62
28216 188823 cd09439 LIM_Mical The LIM domain of Mical (molecule interacting with CasL). The LIM domain of Mical (molecule interacting with CasL): MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal plexin A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM domain and calponin homology domain are known for interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein monooxygenase (MO) is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. In addition, MICAL was characterized to interact with Rab13 and Rab8 to coordinate the assembly of tight junctions and adherens junctions in epithelial cells. Thus, MICAL was also named junctional Rab13-binding protein (JRAB). As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 55
28217 188824 cd09440 LIM1_SF3 The first Lim domain of pollen specific protein SF3. The first Lim domain of pollen specific protein SF3: SF3 is a Lim protein that is found exclusively in mature plant pollen grains. It contains two LIM domains. The exact function of SF3 is unknown. It may be a transcription factor required for the expression of late pollen genes. It is possible that SF3 protein is involved in controlling pollen-specific processes such as male gamete maturation, pollen tube formation, or even fertilization. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 63
28218 188825 cd09441 LIM2_SF3 The second Lim domain of pollen specific protein SF3. The second Lim domain of pollen specific protein SF3: SF3 is a Lim protein that is found exclusively in mature plant pollen grains. It contains two LIM domains. The exact function of SF3 is unknown. It may be a transcription factor required for the expression of late pollen genes. It is possible that SF3 protein is involved in controlling pollen-specific processes such as male gamete maturation, pollen tube formation, or even fertilization. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 61
28219 188826 cd09442 LIM_Eplin_like The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin) like proteins. The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin) like proteins: This family contains Epithelial Protein Lost in Neoplasm (Eplin), xin actin-binding repeat-containing protein 2 (XIRP2) and a group of proteins with unknown function. The members of this family all contain a single LIM domain. Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality. Eplin interacts with and stabilizes F-actin filaments and stress fibers, which correlates with its ability to suppress anchorage independent growth. In epithelial cells, Eplin is required for formation of the F-actin adhesion belt by binding to the E-cadherin-catenin complex through alpha-catenin. Eplin is expressed in two isoforms, a longer Eplin-beta and a shorter Eplin-alpha. Eplin-alpha mRNA is detected in various tissues and cell lines, but is absent or down regulated in cancer cells. Xirp2 contains a LIM domain and Xin repeats for binding to and stabilising F-actin. Xirp2 is expressed in muscles and is significantly induced in the heart in response to systemic administration of angiotensin II. Xirp2 is an important effector of the Ang II signaling pathway in the heart. The expression of Xirp2 is activated by myocyte enhancer factor (MEF)2A, whose transcriptional activity is stimulated by angiotensin II. Thus, Xirp2 plays important pathological roles in the angiotensin II induced hypertension. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28220 188827 cd09443 LIM_Ltd-1 The LIM domain of LIM and transglutaminase domains protein (Ltd-1). The LIM domain of LIM and transglutaminase domains protein (Ltd-1): This family includes mouse Ky protein and Caenorhabditis elegans Ltd-1 protein. The members of this family contain an N-terminal Lim domain and a C-terminal transglutaminase domain. The mouse Ky protein has putative function in muscle development. The mouse with ky mutant exhibits combined posterior and lateral curvature of the spine. The Ltd-1 gene in C. elegans is expressed in developing hypodermal cells from the twofold stage embryo through adulthood. These data define the ltd-1 gene as a novel marker for C. elegans epithelial cell development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 55
28221 188828 cd09444 LIM_Mical_like_1 This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. The LIM domain on proteins of unknown function: This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. Known members of the Mical-like family includes single LIM domain containing proteins, Mical (molecule interacting with CasL), pollen specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2), and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth, and mobility. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 55
28222 188829 cd09445 LIM_Mical_like_2 This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. The LIM domain on proteins of unknown function: This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL)-like proteins. Known members of the Mical-like family includes single LIM domain containing proteins, Mical (molecule interacting with CasL), pollen specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2), and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth, and mobility. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28223 188830 cd09446 LIM_N_RAP The LIM domain of N-RAP. The LIM domain of N-RAP: N-RAP is a muscle-specific protein concentrated at myotendinous junctions in skeletal muscle and intercalated disks in cardiac muscle. LIM domain is found at the N-terminus of N-RAP and the C-terminal of N-RAP contains a region with multiple of nebulin repeats. N-RAP functions as a scaffolding protein that organizes alpha-actinin and actin into symmetrical I-Z-I structures in developing myofibrils. Nebulin repeat is known as actin binding domain. The N-RAP is hypothesized to form antiparallel dimerization via its LIM domain. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28224 188831 cd09447 LIM_LASP The LIM domain of LIM and SH3 Protein (LASP). The LIM domain of LIM and SH3 Protein (LASP): LASP family contains two highly homologous members, LASP-1 and LASP-2. LASP contains a LIM motif at its amino terminus, a src homology 3 (SH3) domain at its C-terminal part, and a nebulin-like region in the middle. LASP-1 and -2 are highly conserved in their LIM, nebulin-like, and SH3 domains, but differ significantly at their linker regions. Both proteins are ubiquitously expressed and involved in cytoskeletal architecture, especially in the organization of focal adhesions. LASP-1 and LASP-2 are important during early embryo- and fetogenesis and are highly expressed in the central nervous system of the adult. However, only LASP-1 seems to participate significantly in neuronal differentiation and plays an important functional role in migration and proliferation of certain cancer cells while the role of LASP-2 is more structural. The expression of LASP-1 in breast tumors is increased significantly. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28225 188832 cd09448 LIM_CLP36 This family represents the LIM domain of CLP36. This family represents the LIM domain of CLP36. CLP36 has also been named as CLIM1, Elfin, or PDLIM1. CLP36 contains a C-terminal LIM domain and an N-terminal PDZ domain. CLP36 is highly expressed in heart and is present in many other tissues including lung, liver, spleen, and blood. CLP36 has been implicated in many processes including hypoxia and regulation of actin stress fibers. CLP36 co-localizes with alpha-actinin-2 at the Z-lines in myocardium. In addition, CLP36 binds to alpha-actinin-1 and alpha-actinin-4, and associates with F-actin filaments and stress fibers. CLP36 might be involved in not only the function of sarcomeres in muscle cells, but also in actin stress fiber-mediated cellular processes, such as cell shape, migration, polarity, and cytokinesis in non-muscle cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28226 188833 cd09449 LIM_Mystique The LIM domain of Mystique, a subfamily of ALP LIM domain proteins. The LIM domain of Mystique, a subfamily of ALP LIM domain proteins: Mystique is the most recently identified member of the ALP protein family. It also interacts with alpha-actinin, as other ALP proteins do. Mystique promotes cell attachment and migration and suppresses anchorage-independent growth. The LIM domain of Mystique is required for the suppression function. Moreover, Mystique functions as a ubiquitin E3 ligase acting on STAT proteins to cause their proteasome-mediated degradation. As in all LIM domains, this domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28227 188834 cd09450 LIM_ALP This family represents the LIM domain of ALP, actinin-associated LIM protein. This family represents the LIM domain of ALP, actinin-associated LIM protein. ALP contains an N-terminal PDZ domain, a C-terminal LIM domain and an ALP-subfamily-specific 34-amino-acid motif termed ALP-like motif (AM), which contains a putative consensus protein kinase C (PKC) phosphorylation site and two alpha-helices. ALP proteins are found in heart and in skeletal muscle. ALP may act as a signaling molecule which is regulated by PKC-dependent signaling. ALP plays an essential role in the development of RV (right ventricle) chamber. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28228 188835 cd09451 LIM_RIL The LIM domain of RIL. The LIM domain of RIL: RIL contains an N-terminal PDZ domain, a LIM domain, and a short consensus C-terminal region. It is the smallest molecule in the ALP LIM domain containing protein family. RIL was identified in rat fibroblasts and in human lymphocytes. The LIM domain interacts with the AMPA glutamate receptor in dendritic spines. The consensus C-terminus interacts with PTP-BL, a submembranous protein tyrosine phosphatase and the PDZ domain is responsible to interact with alpha-actinin molecules. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28229 188836 cd09452 LIM1_Enigma The first LIM domain of Enigma. The first LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28230 188837 cd09453 LIM1_ENH The first LIM domain of the Enigma Homolog (ENH) family. The first LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus. ENH is implicated in signal transduction processes involving protein kinases. It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28231 188838 cd09454 LIM1_ZASP_Cypher The first LIM domain of ZASP/Cypher family. The first LIM domain of ZASP/Cypher family: ZASP was identified in human heart and skeletal muscle and Cypher is a mouse ortholog of ZASP. ZASP/Cypher contains three LIM domains at the C-terminus and a PDZ domain at the N-terminus. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28232 188839 cd09455 LIM1_Enigma_like_1 The first LIM domain of an Enigma subfamily with unknown function. The first LIM domain of an Enigma subfamily with unknown function: The Enigma LIM domain family is comprised of three characterized members: Enigma, ENH and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. They serve as adaptor proteins, where the PDZ domain tethers the protein to the cytoskeleton and the LIM domains recruit signaling proteins to implement corresponding functions. The members of the Enigma family have been implicated in regulating or organizing cytoskeletal structure, as well as in multiple signaling pathways. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28233 188840 cd09456 LIM2_Enigma The second LIM domain of Enigma. The second LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28234 188841 cd09457 LIM2_ENH The second LIM domain of the Enigma Homolog (ENH) family. The second LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus. ENH is implicated in signal transduction processes involving protein kinases. It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 52
28235 188842 cd09458 LIM3_Enigma The third LIM domain of Enigma. The third LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28236 188843 cd09459 LIM3_ENH The third LIM domain of the Enigma Homolog (ENH) family. The third LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus. ENH is implicated in signal transduction processes involving protein kinases. It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28237 188844 cd09460 LIM3_ZASP_Cypher The third LIM domain of ZASP/Cypher family. The third LIM domain of ZASP/Cypher family: ZASP was identified in human heart and skeletal muscle and Cypher is a mouse ortholog of ZASP. ZASP/Cypher contains three LIM domains at the C-terminus and a PDZ domain at the N-terminus. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28238 188845 cd09461 LIM3_Enigma_like_1 The third LIM domain of an Enigma subfamily with unknown function. The third LIM domain of an Enigma subfamily with unknown function: The Enigma LIM domain family is comprised of three characterized members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. They serve as adaptor proteins, where the PDZ domain tethers the protein to the cytoskeleton and the LIM domains recruit signaling proteins to implement corresponding functions. The members of the Enigma family have been implicated in regulating or organizing cytoskeletal structure, as well as in multiple signaling pathways. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28239 188846 cd09462 LIM1_LIMK1 The first LIM domain of LIMK1 (LIM domain Kinase 1). The first LIM domain of LIMK1 (LIM domain Kinase 1): LIMK1 belongs to the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, altering the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK1 is expressed in all tissues and is localized to focal adhesions in the cell. LIMK1 can form homodimers upon binding of HSP90 and is activated by Rho effector Rho kinase and MAPKAPK2. LIMK1 is important for normal central nervous system development, and its deletion has been implicated in the development of the human genetic disorder Williams syndrome. Moreover, LIMK1 up-regulates the promoter activity of urokinase type plasminogen activator and induces its mRNA and protein expression in breast cancer cells. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 74
28240 188847 cd09463 LIM1_LIMK2 The first LIM domain of LIMK2 (LIM domain Kinase 2). The first LIM domain of LIMK2 (LIM domain Kinase 2): LIMK2 is a member of the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, altering the rate of actin depolymerization. LIMK activity is activated by phosphorylation of a threonine residue within the activation loop of the kinase by p21-activated kinases 1 and 4 and by Rho kinase. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK2 is expressed in all tissues. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The activity of LIM kinase 2 to regulate cofilin phosphorylation is inhibited by the direct binding of Par-3. LIMK2 activation promotes cell cycle progression. The phenotype of Limk2 knockout mice shows a defect in spermatogenesis. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 53
28241 188848 cd09464 LIM2_LIMK1 The second LIM domain of LIMK1 (LIM domain Kinase 1). The second LIM domain of LIMK1 (LIM domain Kinase 1): LIMK1 belongs to the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, altering the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK1 is expressed in all tissues and is localized to focal adhesions in the cell. LIMK1 can form homodimers upon binding of HSP90 and is activated by Rho effector Rho kinase and MAPKAPK2. LIMK1 is important for normal central nervous system development, and its deletion has been implicated in the development of the human genetic disorder Williams syndrome. Moreover, LIMK1 up-regulates the promoter activity of urokinase type plasminogen activator and induces its mRNA and protein expression in breast cancer cells. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28242 188849 cd09465 LIM2_LIMK2 The second LIM domain of LIMK2 (LIM domain Kinase 2). The second LIM domain of LIMK2 (LIM domain Kinase 2): LIMK2 is a member of the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3 and inactivates its actin-severing activity, altering the rate of actin depolymerization. LIMK activity is activated by phosphorylation of a threonine residue within the activation loop of the kinase by p21-activated kinases 1 and 4 and by Rho kinase. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK2 is expressed in all tissues. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The activity of LIM kinase 2 to regulate cofilin phosphorylation is inhibited by the direct binding of Par-3. LIMK2 activation promotes cell cycle progression. The phenotype of Limk2 knockout mice shows a defect in spermatogenesis. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28243 188850 cd09466 LIM1_Lhx3a The first LIM domain of Lhx3a. The first LIM domain of Lhx3a: Lhx3a is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3a is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal. They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 56
28244 188851 cd09467 LIM1_Lhx3b The first LIM domain of Lhx3b. The first LIM domain of Lhx3b. Lhx3b is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3b is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal. They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 55
28245 188852 cd09468 LIM1_Lhx4 The first LIM domain of Lhx4. The first LIM domain of Lhx4. Lhx4 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoforms Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulation roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 52
28246 188853 cd09469 LIM1_Lhx2 The first LIM domain of Lhx2. The first LIM domain of Lhx2: Lhx2 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. The Lhx2 protein has been shown to bind to the mouse M71 olfactory receptor promoter. Similar to other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 64
28247 188854 cd09470 LIM1_Lhx9 The first LIM domain of Lhx9. The first LIM domain of Lhx9: Lhx9 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx9 is highly homologous to Lhx2. It is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development. Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice have reduced levels of the Sf1 nuclear receptor that is required for gonadogenesis, and recent studies have shown that Lhx9 is able to activate the Sf1/FtzF1 gene. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 54
28248 188855 cd09471 LIM2_Isl2 The second LIM domain of Isl2. The second LIM domain of Isl2: Isl is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl proteins are found in the nucleus and act as transcription factors or cofactors. Isl1 and Isl2 are the two conserved members of this family. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord where it plays a role in motor neuron development. Isl2 may be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28249 188856 cd09472 LIM2_Lhx3b The second LIM domain of Lhx3b. The second LIM domain of Lhx3b. Lhx3b is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3b is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal. They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein 57
28250 188857 cd09473 LIM2_Lhx4 The second LIM domain of Lhx4. The second LIM domain of Lhx4. Lhx4 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoforms Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulation roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 56
28251 188858 cd09474 LIM2_Lhx2 The second LIM domain of Lhx2. The second LIM domain of Lhx2: Lhx2 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. The Lhx2 protein has been shown to bind to the mouse M71 olfactory receptor promoter. Similar to other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 59
28252 188859 cd09475 LIM2_Lhx9 The second LIM domain of Lhx9. The second LIM domain of Lhx9: Lhx9 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx9 is highly homologous to Lhx2. It is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development. Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice have reduced levels of the Sf1 nuclear receptor that is required for gonadogenesis, and recent studies have shown that Lhx9 is able to activate the Sf1/FtzF1 gene. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 59
28253 188860 cd09476 LIM1_TLP The first LIM domain of thymus LIM protein (TLP). The first LIM domain of thymus LIM protein (TLP): TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs. Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells. TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28254 188861 cd09477 LIM2_TLP The second LIM domain of thymus LIM protein (TLP). The second LIM domain of thymus LIM protein (TLP): TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs. Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells. TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28255 188862 cd09478 LIM_CRIP The LIM domain of Cysteine-Rich Intestinal Protein (CRIP). The LIM domain of Cysteine-Rich Intestinal Protein (CRIP): CRIP is a short protein with only one LIM domain. CRIP gene is developmentally regulated and can be induced by glucocorticoid hormones during the first three postnatal weeks. The domain shows close sequence homology to LIM domain of thymus LIM protein. However, unlike the TLP proteins which have two LIM domains, the members of this family have only one LIM domain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28256 188863 cd09479 LIM1_CRP1 The first LIM domain of Cysteine Rich Protein 1 (CRP1). The first LIM domain of Cysteine Rich Protein 1 (CRP1): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP1 can associate with the actin cytoskeleton and are capable of interacting with alpha-actinin and zyxin. CRP1 was shown to regulate actin filament bundling by interaction with alpha-actinin and direct binding to actin filaments. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 56
28257 188864 cd09480 LIM1_CRP2 The first LIM domain of Cysteine Rich Protein 2 (CRP2). The first LIM domain of Cysteine Rich Protein 2 (CRP2): The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP2 specifically binds to protein inhibitor of activated STAT-1 (PIAS1) and a novel human protein designed CRP2BP (for CRP2 binding partner). PIAS1 specifically inhibits the STAT-1 pathway and CRP2BP is homologous to members of the histone acetyltransferase family raising the possibility that CRP2 is a modulator of cytokine-controlled pathways or is functionally active in the transcriptional regulatory network. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 55
28258 188865 cd09481 LIM1_CRP3 The first LIM domain of Cysteine Rich Protein 3 (CRP3/MLP). The first LIM domain of Cysteine Rich Protein 3 (CRP3/MLP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4 thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28259 188866 cd09482 LIM2_CRP3 The second LIM domain of Cysteine Rich Protein 3 (CRP3/MLP). The second LIM domain of Cysteine Rich Protein 3 (CRP3/MLP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. The second LIM domain of CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4 thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28260 188867 cd09483 LIM1_Prickle_1 The first LIM domain of Prickle 1. The first LIM domain of Prickle 1. Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is mainly expressed in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. In addition, Prickle 1 regulates cell movements during gastrulation and neuronal migration through interaction with the noncanonical Wnt11/Wnt5 pathway in zebrafish. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28261 188868 cd09484 LIM1_Prickle_2 The first LIM domain of Prickle 2. The first LIM domain of Prickle 2: Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28262 188869 cd09485 LIM_Eplin_alpha_beta The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin). The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin): Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality. Eplin interacts and stabilizes F-actin filaments and stress fibers, which correlates with its ability to suppress anchorage independent growth. In epithelial cells, Eplin is required for formation of the F-actin adhesion belt by binding to the E-cadherin-catenin complex through alpha-catenin. Eplin is expressed in two isoforms, a longer Eplin-beta and a shorter Eplin-alpha. Eplin-alpha mRNA is detected in various tissues and cell lines, but is absent or down regulated in cancer cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28263 188870 cd09486 LIM_Eplin_like_1 a LIM domain subfamily on a group of proteins with unknown function. This model represents a LIM domain subfamily of Eplin-like family. This family shows highest homology to the LIM domains on Eplin and XIRP2 protein families. Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality. Xirp2 is expressed in muscles and is an important effector of the Ang II signaling pathway in the heart. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein. 53
28264 188886 cd09487 SAM_superfamily SAM (Sterile alpha motif ). SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases. 56
28265 188887 cd09488 SAM_EPH-R SAM domain of EPH family of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH (erythropoietin-producing hepatocyte) family of receptor tyrosine kinases is a C-terminal signal transduction module located in the cytoplasmic region of these receptors. SAM appears to mediate cell-cell initiated signal transduction via binding proteins to a conserved tyrosine that is phosphorylated. In some cases the SAM domain mediates homodimerization/oligomerization and plays a role in the clustering process necessary for signaling. EPH kinases are the largest family of receptor tyrosine kinases. They are classified into two groups based on their abilities to bind ephrin-A and ephrin-B ligands. The EPH receptors are involved in regulation of cell movement, shape, and attachment during embryonic development; they control cell-cell interactions in the vascular, nervous, epithelial, and immune systems, and in many tumors. They are potential molecular markers for cancer diagnostics and potential targets for cancer therapy. 61
28266 188888 cd09489 SAM_Smaug-like SAM (Sterile alpha motif ). SAM (sterile alpha motif) domain of Smaug-like subfamily proteins is an RNA binding domain. SAM interacts with stem-loop structures in target mRNAs. Proteins of this subfamily are post-transcriptional regulators involved in mRNA silencing and deadenylation; they can be implicated in transcript stability regulation and vacuolar protein transport as well. SAM_Smaug-like domain-containing proteins are found in metazoa from yeast to human. In animals they are active during early embryogenesis. 57
28267 188889 cd09490 SAM_Arap1,2,3 SAM domain of Arap1,2,3 (angiotensin receptor-associated protein). SAM (sterile alpha motif) domain of Arap1,2,3 subfamily proteins (angiotensin receptor-associated) is a protein-protein interaction domain. Arap1,2,3 proteins are phosphatidylinositol-3,4,5-trisphosphate-dependent GTPase-activating proteins. They are involved in phosphatidylinositol-3 kinase (PI3K) signaling pathways. In addition to SAM domain, Arap1,2,3 proteins contain ArfGap, PH-like, RhoGAP and UBQ domains. SAM domain of Arap3 protein was shown to interact with SAM domain of Ship2 phosphatidylinositol-trisphosphate phosphatase proteins. Such interaction apparently plays a role in inhibition of PI3K regulated pathways since Ship2 converts PI(3,4,5)P3 into PI(3,4)P2. Proteins of this subfamily participate in regulation of signaling and trafficking associated with a number of different receptors (including EGFR, TRAIL-R1/DR4, TRAIL-R2/DR5) in normal and cancer cells; they are involved in regulation of actin cytoskeleton remodeling, cell spreading and formation of lamellipodia. 63
28268 188890 cd09491 SAM_Ship2 SAM domain of Ship2 lipid phosphatase proteins. SAM (sterile alpha motif) domain of Ship2 subfamily is a protein-protein interaction domain. Ship2 proteins are lipid phosphatases (Phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 2) containing an N-terminal SH2 domain, a central phosphatase domain and a C-terminal SAM domain. Ship2 is involved in a number of PI3K signaling pathways. For example, it plays a role in regulation of the actin cytoskeleton remodeling, in insulin signaling pathways, and in EphA2 receptor endocytosis. SAM domain of Ship2 can interact with SAM domain of other proteins in these pathways, thus participating in signal transduction. In particular, SAM of Ship2 is known to form heterodimers with SAM domain of Eph-A2 receptor tyrosine kinase during receptor endocytosis as well as with SAM domain of PI3K effector protein Arap3 in the actin cytoskeleton signaling network. Since Ship2 plays a role in negatively regulating insulin signaling, it has been suggested that inhibition of its expression or function may contribute in treating type 2 diabetes and obesity-induced insulin resistance. 63
28269 188891 cd09492 SAM_SASH1_repeat2 SAM domain of SASH1 proteins, repeat 2. SAM (sterile alpha motif) repeat 2 of SASH1 proteins is a protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. SASH1 can bind 14-3-3 proteins in response to IGF1/phosphatidylinositol 3-kinase signaling. SASH1 was found upregulated in different tissues including thymus, placenta, and lungs, and downregulated in some breast tumors, liver metastases and colon cancers compared to corresponding normal tissues. SASH1 is a potential candidate for a tumor suppressor gene in breast cancers. At the same time, downregulation of SASH1 in colon cancer is associated with metastasis and a poor prognosis. 70
28270 188892 cd09493 SAM_SASH-like SAM (Sterile alpha motif ), SASH1-like. SAM (sterile alpha motif) domain of SASH1-like proteins is a protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. Proteins of this subfamily are known to be involved in preventing DN thymocytes from premature initiation of programmed cell death and in B cell activation and differentiation. They have been found downregulated in some breast tumors, liver metastases and colon cancers compared to corresponding normal tissues. 60
28271 188893 cd09494 SAM_liprin-kazrin_repeat1 SAM domain of liprin/kazrin proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of the SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 58
28272 188894 cd09495 SAM_liprin-kazrin_repeat2 SAM domain of liprin/kazrin proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of the SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 60
28273 188895 cd09496 SAM_liprin-kazrin_repeat3 SAM domain of liprin/kazrin proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of the SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 62
28274 188896 cd09497 SAM_caskin1,2_repeat1 SAM domain of caskin protein repeat 1. SAM (sterile alpha motif) domain repeat 1 of caskin1,2 proteins is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and apparently may play a role in neural development, synaptic protein targeting, and regulation of gene expression. 66
28275 188897 cd09498 SAM_caskin1,2_repeat2 SAM domain of caskin protein repeat 2. SAM (sterile alpha motif) domain repeat 2 of caskin1,2 proteins is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and may play a role in neural development, synaptic protein targeting, and regulation of gene expression. 71
28276 188898 cd09499 SAM_AIDA1AB-like_repeat1 SAM domain of AIDA1AB-like proteins, repeat 1. SAM (sterile alpha motif) domain repeat 1 of AIDA1AB-like proteins is a protein-protein interaction domain. AIDA1AB-like proteins have two tandem SAM domains. They may form an intramolecular head-to-tail homodimer. One of two basic motifs of the nuclear localization signal (NLS) is located within helix 5 of SAM2 (motif HKRK). This signal plays a role in decoupling of SAM2 from SAM1, thus facilitating translocation of this type proteins into the nucleus. SAM1 domain has a potential phosphorylation site for CMGC group of serine/threonine kinases. SAM domains of the AIDA1-like subfamily can directly bind ubiquitin and participate in regulating the degradation of ubiquitinated EphA receptors, particularly EPH-A8 receptor. Additionally AIDA1AB-like proteins may participate in the regulation of nucleoplasmic coilin protein interactions. 67
28277 188899 cd09500 SAM_AIDA1AB-like_repeat2 SAM domain of AIDA1AB-like proteins, repeat 2. SAM (sterile alpha motif) domain repeat 2 of AIDA1AB-like proteins is a protein-protein interaction domain. AIDA1AB-like proteins have two tandem SAM domains. They may form an intramolecular head-to-tail homodimer. One of two basic motifs of the nuclear localization signal (NLS) is located within helix 5 of the SAM2 (motif HKRK). This signal plays a role in decoupling of SAM2 from SAM1, thus facilitating translocation of this type proteins into the nucleus. SAM domains of the AIDA1AB-like subfamily can directly bind ubiquitin and participate in regulating the degradation of ubiquitinated EphA receptors, particularly EPH-A8 receptor. Additionally AIDA1AB-like proteins may participate in the regulation of nucleoplasmic coilin protein interactions. 65
28278 188900 cd09501 SAM_SARM1-like_repeat1 SAM domain of SARM1-like proteins, repeat 1. SAM (sterile alpha motif) domain repeat 1 of SARM1-like adaptor proteins is a protein-protein interaction domain. SARM1-like proteins contain two tandem SAM domains. SARM1-like proteins are involved in TLR (Toll-like receptor) signaling. They are responsible for targeted localization of the whole protein to post-synaptic regions of axons. In humans SARM1 expression is detected in kidney and liver. 69
28279 188901 cd09502 SAM_SARM1-like_repeat2 SAM domain of SARM1-like, repeat 2. SAM (sterile alpha motif) domain repeat 2 of SARM1-like adaptor proteins is a protein-protein interaction domain. SARM1-like proteins contain two tandem SAM domains. SARM1-like proteins are involved in TLR (Toll-like receptor) signaling. They are responsible for targeted localization of the whole protein to post-synaptic regions of axons. In humans SARM1 expression is detected in kidney and liver. 70
28280 188902 cd09503 SAM_tumor-p63,p73 SAM domain of tumor-p63,p73 proteins. SAM (sterile alpha motif) domain of p63, p73 transcriptional factors is a putative protein-protein interaction domain and lipid-binding domain. p63 and p73 are homologs to the tumor suppressor p53. They have a C-terminal SAM domain in their longest spliced alpha forms, while p53 doesn't have it. p63 or p73 knockout mice show significant developmental abnormalities but no increased cancer susceptibility, suggesting that p63 and p73 play a role in regulation of normal development. It was shown that SAM domain of p73 is able to bind some membrane lipids. The structural rearrangements in SAM are necessary to accomplish the binding. No evidence for homooligomerization through SAM domains was found for p63/p73 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain. 65
28281 188903 cd09504 SAM_STIM-1,2-like SAM domain of STIM-1,2-like proteins. SAM (sterile alpha motif) domain of STIM-1,2-like (Stromal interaction molecule) proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM1 and STIM2 are involved in opposite regulation of store operated channels in plasma membrane. 74
28282 188904 cd09505 SAM_WDSUB1 SAM domain of WDSUB1 proteins. SAM (sterile alpha motif) domain of WDSUB1 subfamily proteins is a putative protein-protein interaction domain. Proteins of this group contain multiple domains: SAM, one or more WD40 repeats and U-box (derived version of the RING-finger domain). Apparently the WDSUB1 subfamily proteins participate in protein degradation through ubiquitination, since U-box domains are known as members of the E3 ubiquitin ligase family, while SAM and WD40 domains most probably are responsible for binding an E2 ubiquitin-conjugating enzyme and a target protein. 72
28283 188905 cd09506 SAM_Shank1,2,3 SAM domain of Shank1,2,3 family proteins. SAM (sterile alpha motif) domain of Shank1,2,3 family proteins is a protein-protein interaction domain. Shank1,2,3 proteins are scaffold proteins that are known to interact with a variety of cytoplasmic and membrane proteins. SAM domains of the Shank1,2,3 family are prone to homooligomerization. They are highly enriched in the postsynaptic density, acting as scaffolds to organize assembly of postsynaptic proteins. SAM domains of Shank3 proteins can form large sheets of helical fibers. Shank genes show distinct patterns of expression, in rat Shank1 mRNA is found almost exclusively in brain, Shank2 in brain, kidney and liver, and Shank3 in heart, brain and spleen. 66
28284 188906 cd09507 SAM_DGK-delta-eta SAM domain of diacylglycerol kinase delta and eta subunits. SAM (sterile alpha motif) domain of DGK-eta-delta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases with a SAM domain located at the C-terminus. DGK proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. The SAM domain of DGK proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers between the SAM domains of DGK delta and eta proteins. The oligomerization plays a role in the regulation of DGK intracellular localization. 65
28285 188907 cd09508 SAM_HD SAM domain of HD-phosphohydrolase. SAM (sterile alpha motif) domain of SAM_HD subfamily proteins is a putative protein-protein interaction domain. Proteins of this group, additionally to the SAM domain, contain a HD hydrolase domain. Human SAM-HD1 is a nuclear protein involved in innate immune response and may act as a negative regulator of the cell-intrinsic antiviral response. Mutations in this gene lead to Aicardi-Goutieres syndrome (symptoms include cerebral atrophy, leukoencephalopathy, hepatosplenomegaly, and increased production of alpha-interferon). 70
28286 188908 cd09509 SAM_Polycomb SAM domain of Polycomb group. SAM (sterile alpha motif) domain of Polycomb group is a protein-protein interaction domain. The Polycomb group includes transcriptional repressors which are involved in the regulation of some key regulatory genes during development in many organisms. They are best known for silencing Hox (Homeobox) genes. Polycomb proteins work together in large multimeric and chromatin-associated complexes. They organize chromatin of the target genes and maintain repressed states during many cell divisions. Polycomb proteins are classified based on their common function, but not on conserved domains and/or motifs; however many Polycomb proteins (members of PRC1 class complex) contain SAM domains which are more similar to each other inside of the Polycomb group than to SAM domains outside of it. Most information about structure and function of Polycomb SAM domains comes from studies of Ph (Polyhomeotic) and Scm (Sex comb on midleg) proteins. Polycomb SAM domains usually can be found at the C-terminus of the proteins. Some members of this group contain, in addition to the SAM domain, MTB repeats, Zn finger, and/or DUF3588 domains. Polycomb SAM domains can form homo- and/or heterooligomers through ML and EH surfaces. SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome. Polycomb proteins are known to be highly expressed in some cells years before their cancer pathology; thus they are attractive markers for early cancer therapy. 64
28287 188909 cd09510 SAM_aveugle-like SAM domain of aveugle-like subfamily. SAM (sterile alpha motif) domain of SAM_aveugle-like subfamily is a protein-protein interaction domain. In Drosophila, the aveugle (AVE) protein (also known as HYP (Hyphen)) is involved in normal photoreceptor differentiation, and required for epidermal growth factor receptor (EGFR) signaling between ras and raf genes during eye development and wing vein formation. SAM domain of the HYP(AVE) protein interacts with SAM domain of CNK, the multidomain scaffold protein connector enhancer of kinase suppressor of ras. CNK/HYP(AVE) complex interacts with KSR (kinase suppressor of Ras) protein. This interaction leads to stimulation of Ras-dependent Raf activation. This subfamily also includes vertebrate AVE homologs - Samd10 and Samd12 proteins. Their exact function is unknown, but they may play a role in signal transduction during embryogenesis. 75
28288 188910 cd09511 SAM_CNK1,2,3-suppressor SAM domain of CNK1,2,3-suppressor subfamily. SAM (sterile alpha motif) domain of CNK (connector enhancer of kinase suppressor of ras (Ksr)) subfamily is a protein-protein interaction domain. CNK proteins are multidomain scaffold proteins containing a few protein-protein interaction domains and are required for connecting Rho and Ras signaling pathways. In Drosophila, the SAM domain of CNK is known to interact with the SAM domain of the aveugle protein, forming a heterodimer. Mutation of the SAM domain in human CNK1 abolishes the ability to cooperate with the Ras effector, supporting the idea that this interaction is necessary for proper Ras signal transduction. 69
28289 188911 cd09512 SAM_Neurabin-like SAM domain of SAM_Neurabin-like subfamily. SAM (sterile alpha motif) domain of Neurabin-like (Neural actin-binding) subfamily is a putative protein-protein interaction domain. This group currently includes the SAM domains of neurabin-I, SAMD14 and neurabin-I/SAMD14-like proteins. Most are multidomain proteins and in addition to SAM domain they contain other protein-binding domains such as PDZ and actin-binding domains. Members of this subfamily participate in signal transduction. Neurabin-I is involved in the regulation of Ca signaling intensity in alpha-adrenergic receptors; it forms a functional pair of opposing regulators with neurabin-II. Neurabins are expressed almost exclusively in neuronal cells. They are known to interact with protein phosphatase 1 and inhibit its activity; they also can bind actin filaments; however, the exact role of the SAM domain is unclear, since SAM doesn't participate in these interactions. 70
28290 188912 cd09513 SAM_BAR SAM domain of BAR subfamily. SAM (sterile alpha motif) domain of BAR (Bifunctional Apoptosis Regulator) subfamily is a protein-protein interaction domain. In addition to the SAM domain, this type of regulator has a RING finger domain. Proteins of this subfamily are involved in the apoptosis signal network. Their overexpression in human neuronal cells significantly protects cells from a broad range of cell death stimuli. SAM domain can interact with Caspase8, Bcl-2 and Bcl-X resulting in suppression of Bax-induced cell death. 71
28291 188913 cd09514 SAM_SGMS1 SAM domain of sphingomyelin synthase. SAM (sterile alpha motif) domain of SGMS-1 (sphingomyelin synthase) subfamily is a potential protein-protein interaction domain. Sphingomyelin synthase 1 is a transmembrane protein with a SAM domain at the N-terminus and a catalytic domain at the C-terminus. Sphingomyelin synthase 1 is a Golgi-associated enzyme, and depending on the concentration of diacylglycerol and ceramide, can catalyze synthesis of phosphocholine or sphingomyelin, respectively. It plays a central role in sphingolipid and glycerophospholipid metabolism. 72
28292 188914 cd09515 SAM_SGMS1-like SAM domain of sphingomyelin synthase related subfamily. SAM (sterile alpha motif) domain of SGMS-like (sphingomyelin synthase) subfamily is a potential protein-protein interaction domain. This group of proteins is related to sphingomyelin synthase 1, and contains an N-terminal SAM domain. The function of SGMS1-like proteins is unknown; they may play a role in sphingolipid metabolism. 70
28293 188915 cd09516 SAM_sec23ip-like SAM domain of sec23ip-like subfamily. SAM (sterile alpha motif) domain of Sec23ip-like (Sec23 interacting protein) subfamily is a potential protein-protein interaction domain. This group of proteins includes Sec23ip and DDHD2 proteins. All of them contain at least two domains: a SAM domain and a predicted metal-binding domain. For mammalian DDHD2 members of this group, phospholipase activity has been demonstrated. Sec23ip proteins of this group interact with Sec23 proteins via an N-terminal proline-rich region. Members of this subfamily are involved in organization of ER/Golgi intermediate compartment. 69
28294 188916 cd09517 SAM_USH1G_HARP SAM domain of USH1G_HARP family. SAM (sterile alpha motif) domain of USH1G/HARP (Usher syndrome type-1G/ Harmonin-interacting Ankyrin Repeat-containing protein) family is a protein-protein interaction domain. Members of this family have an N-terminal ankyrin repeat region and a C-terminal SAM domain. In mammals these proteins can interact via the SAM domain with the PDZ domain of harmonin to form a scaffolding complex that facilitates signal transduction in epithelial and inner ear sensory cells. It was suggested that USH1G and HARP can be tissue specific partners of harmonin. Mutations in ush1g genes lead to Usher syndrome type 1G. This syndrome is the cause of deaf-blindness in humans. 66
28295 188917 cd09518 SAM_ANKS6 SAM domain of ANKS6 (or SamCystin) subfamily. SAM (sterile alpha motif) domain of ANKS6 (or SamCystin) subfamily is a potential protein-protein interaction domain. Proteins of this subfamily have N-terminal ankyrin repeats and a C-terminal SAM domain. They are able to form self-associated complexes and both (SAM and ANK) domains play a role in such interactions. Mutations in Anks6 gene are associated with polycystic kidney disease. They cause formation of renal cysts in rodent models. It was suggested that the ANKS6 protein can interact indirectly (through RNA and protein intermediates) with BICC1, another polycystic kidney disease-associated protein. 65
28296 188918 cd09519 SAM_ANKS3 SAM domain of ANKS3 subfamily. SAM (sterile alpha motif) domain of ANKS3 subfamily is a potential protein-protein interaction domain. Proteins of this subfamily have N-terminal ankyrin repeats and a C-terminal SAM domain. SAM is a widespread domain in signaling proteins. In many cases it mediates homo-dimerization/oligomerization. 64
28297 188919 cd09520 SAM_BICC1 SAM domain of BICC1 (bicaudal) subfamily. SAM (sterile alpha motif) domain of BICC1 (bicaudal) subfamily is a protein-protein interaction domain. Proteins of this group have N-terminal K homology RNA-binding vigilin-like repeats and a C-terminal SAM domain. BICC1 is involved in the regulation of embryonic differentiation. It plays a role in the regulation of Dvl (Dishevelled) signaling, particularly in the correct cilia orientation and nodal flow generation. In Drosophila, disruption of BICC1 can disturb the normal migration direction of the anterior follicle cell of oocytes; the specific function of SAM is to recruit whole protein to the periphery of P-bodies. In mammals, mutations in this gene are associated with polycystic kidney disease and it was suggested that the BICC1 protein can indirectly interact with ANKS6 protein (ANKS6 is also associated with polycystic kidney disease) through some protein and RNA intermediates. 65
28298 188920 cd09521 SAM_ASZ1 SAM domain of ASZ1 subfamily. SAM (sterile alpha motif) domain of ASZ1 (Ankyrin, SAM, leucine Zipper) also known as GASZ (Germ cell-specific Ankyrin, SAM, leucine Zipper) subfamily is a potential protein-protein interaction domain. Proteins of this group are involved in the repression of transposable elements during spermatogenesis, oogenesis, and preimplantation embryogenesis. They support synthesis of PIWI-interacting RNA via association with some PIWI proteins, such as MILI and MIWI. This association is required for initiation and maintenance of retrotransposon repression during the meiosis. In mice lacking ASZ1, DNA damage and delayed germ cell maturation was observed due to retrotransposons releasing from their repressed state. 64
28299 188921 cd09522 SAM_SLP76 SAM domain of SLP76 subfamily. SAM (sterile alpha motif) domain of SLP76 (SH2 domain-containing leukocyte protein 76), also known as LCP2 (Lymphocyte cytosolic protein), subfamily is a protein-protein interaction domain. Proteins of this group have an N-terminal SAM domain, 3 phosphotyrosine motifs, a proline-rich region and a C-terminal SH2 domain. They are scaffold proteins involved in protein complex formation. The complexes play a role in T-cell receptor mediated signaling pathways such as integrin activation, cytoskeletal organization, MAPK activation, and calcium flux. SAM domain deleted SLP76 knockin mice show a number of defects, including partially blocked thymocyte development, impaired positive and negative thymic selection and changes in T-cell receptor mediated signaling. 69
28300 188922 cd09523 SAM_TAL SAM domain of TAL subfamily. SAM (sterile alpha motif) domain of TAL (Tsg101-associated ligase) proteins, also known as LRSAM1 (Leucine-rich repeat and sterile alpha motif-containing) proteins, is a putative protein-protein interaction domain. Proteins of this subfamily participate in the regulation of retrovirus budding and receptor endocytosis. They show E3 ubiquitin ligase activity. Human TAL protein interacts with Tsg101 and TAL's C-terminal ring finger domain is essential for the multiple monoubiquitylation of Tsg101. 65
28301 188923 cd09524 SAM_tankyrase1,2 SAM domain of tankyrase1,2 subfamily. SAM (sterile alpha motif) domain of Tankyrase1,2 subfamily is a protein-protein interaction domain. In addition to the SAM domain, proteins of this group have ankyrin repeats and an ADP-ribosyltransferase (poly-(ADP-ribose) synthase) domain. Tankyrases can polymerize through their SAM domains forming homooligomers and these complexes are disrupted by autoribosylation. Tankyrases apparently act as master scaffolding proteins and thus may interact simultaneously with multiple proteins, in particular with TRF1, NuMA, IRAP and Grb14 (ankyrin repeats are involved in these interactions). Tankyrases participate in a variety of cell signaling pathways as effector molecules. Their functions are different depending on the intracellular location: at telomeres they play a role in the regulation of telomere length via control of telomerase access to telomeres, at centrosomes they promote spindle assembly/disassembly, in Golgi vesicles they participate in the regulation of vesicle trafficking and Golgi dynamics. Tankyrase 1 may be of interest as new potential target for telomerase-directed cancer therapy. 66
28302 188924 cd09525 SAM_GAREM SAM domain of GAREM subfamily. SAM (sterile alpha motif) domain of GAREM (Grb2-associated and regulator of Erk/MAPK) protein subfamily (also known as FAM59A) is a putative protein-protein interaction domain. SAM domain is a widespread domain in signaling proteins. Proteins of this group have SAM at the C-terminus. Human GAREM protein is known to play a role in regulation of the EGF (Epidermal Growth Factor) receptor and of Gab or insulin receptor substrate-1 family proteins. Grb2 (Growth factor receptor-bound) protein was identified as a binding partner of human GAREM. Proline-rich motifs and phosphorylation of two conserved tyrosines in GAREM are important for the interaction with the SH3 domains of Grb2 protein; however these motifs and residues do not belong to the SAM domain. 67
28303 188925 cd09526 SAM_Samd3 SAM domain of Samd3 subfamily. SAM (sterile alpha motif) domain of the Samd3 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates dimerization/oligomerization. Exact function of proteins belonging to this subfamily is unknown. 66
28304 188926 cd09527 SAM_Samd5 SAM domain of Samd5 subfamily. SAM (sterile alpha motif) domain of Samd5 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates dimerization/oligomerization. The exact function of proteins belonging to this subfamily is unknown. 63
28305 188927 cd09528 SAM_Samd9_Samd9L SAM domain of Samd9/Samd9L subfamily. SAM (sterile alpha motif) domain of Samd9/Samd9L subfamily is a putative protein-protein interaction domain. SAM is a widespread domain in signaling proteins. Samd9 is a tumor suppressor gene. It is involved in death signaling of malignant glioblastoma. Samd9 suppression blocks cancer cell death induced by HVJ-E or IFN-beta treatment. Deleterious mutations in Samd9 lead to normophosphatemic familial tumoral calcinosis, a cutaneous disorder characterized by cutaneous calcification or ossification. 64
28306 188928 cd09529 SAM_MLTK SAM domain of MLTK subfamily. SAM (sterile alpha motif) domain of MLTK subfamily is a protein-protein interaction domain. Besides SAM domain, these proteins have N-terminal protein tyrosine kinase domain and leucine-zipper motif. Proteins of this group act as mitogen-activated protein triple kinase in a number of MAPK cascades. They can be activated by autophosphorylation in response to stress signals. MLTK-alpha is known to phosphorylate histone H3. In mammals, MLTKs participate in the activation of the JNK/SAPK, p38, ERK5 pathways, the transcriptional factor NF-kB, in the regulation of the cell cycle checkpoint, and in the induction of apoptosis in a hepatoma cell line. Some members of this subfamily are proto-oncogenes, thus MLTK-alpha is involved in neoplasmic cell transformation and/or skin cancer development in athymic nude mice. Based on in vivo coprecipitation experiments in mammalian cells, it has been demonstrated that MLTK proteins might form homodimers/oligomers via their SAM domains. 71
28307 188929 cd09530 SAM_Samd14 SAM domain of Samd14 subfamily. SAM (sterile alpha motif) domain of SamD14 (or FAM15A) subfamily is a putative protein-protein interaction domain. SAM is widespread domain in proteins involved in signal transduction and regulation. In many cases SAM mediates homodimerization/oligomerization. The exact function of proteins belonging to this subfamily is unknown. 67
28308 188930 cd09531 SAM_CS047 SAM domain of CS047 subfamily. SAM (sterile alpha motif) domain of CS047 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates homodimerization/oligomerization. The exact function of proteins belonging to this group is unknown. 65
28309 188931 cd09532 SAM_SLA1_fungal SAM domain of SLA1 subfamily. SAM (sterile alpha motif) domain of fungal SLA1 proteins is a protein-protein interaction domain. Proteins of this group consist of a few N-terminal SH3 domains followed by SHD1 domain, SAM domain (also known as SHD2) and multiple C-terminal repeats. The yeast SLA1 protein is an endocytic clathrin adaptor. It is associated with a variety of endocytic accessory factors and required for endocytic vesicle formation and for clathrin and actin-dependent cargo recognition. SLA1 binds clathrin through a variant clathrin-binding motif (vCB). The SAM domain negatively regulates this binding by blocking the vCB site. The SAM domains of SLA1 proteins can form oligomers via their mid-loop (ML) and end-helix (EH) regions. Such self-associations apparently are important for SLA1 function. A proposed regulatory model suggests that SAM can be considered a mediator of two aspects of clathrin adaptor function. It plays a role in negative regulation of clathrin binding via an intramolecular interaction with the vCB, and a role in positive regulation of vesicle coat assembly via self-oligomerization. 62
28310 188932 cd09533 SAM_Ste50-like_fungal SAM domain of Ste50_like (ubc2) subfamily. SAM (sterile alpha motif) domain of Ste50-like (or Ubc2 for Ustilago bypass of cyclase) subfamily is a putative protein-protein interaction domain. This group includes only fungal proteins. Basidiomycetes have an N-terminal SAM domain, central UBQ domain, and C-terminal SH3 domain, while Ascomycetes lack the SH3 domain. Ubc2 of Ustilago maydis is a major virulence and maize pathogenicity factor. It is required for filamentous growth (the budding haploid form of Ustilago maydis is a saprophyte, while filamentous dikaryotic form is a pathogen). Also the Ubc2 protein is involved in the pheromone-responsive morphogenesis via the MAP kinase cascade. The SAM domain is necessary for ubc2 function; deletion of SAM eliminates this function. A Lys-to-Glu mutation in the SAM domain of ubc2 gene induces temperature sensitivity. 58
28311 188933 cd09534 SAM_Ste11_fungal SAM domain of Ste11_fungal subfamily. SAM (sterile alpha motif) domain of Ste11 subfamily is a protein-protein interaction domain. Proteins of this subfamily have SAM domain at the N-terminus and protein kinase domain at the C-terminus. They participate in regulation of mating pheromone response, invasive growth and high osmolarity growth response. MAP triple kinase Ste11 from S.cerevisiae is known to interact with Ste20 kinase and Ste50 regulator. These kinases are able to form homodimers interacting through their SAM domains as well as heterodimers or heterogeneous complexes when either SAM domain of monomeric or homodimeric form of Ste11 interacts with Ste50 regulator. 62
28312 188934 cd09535 SAM_BOI-like_fungal SAM domain of BOI-like fungal subfamily. SAM (sterile alpha motif) domain of BOI-like fungal subfamily is a potential protein-protein interaction domain. Proteins of this subfamily are apparently scaffold proteins, since most contain SH3 and PH domains, which are also protein-protein interaction domains, in addition to SAM domain. BOI-like proteins participate in cell cycle regulation. In particular BOI1 and BOI2 proteins of budding yeast S.cerevisiae are involved in bud formation, and POB1 protein of fission yeast S.pombe plays a role in cell elongation and separation. Among binding partners of BOI-like fungal subfamily members are such proteins as Bem1 and Cdc42 (they are known to be involved in cell polarization and bud formation). 65
28313 188935 cd09536 SAM_Ste50_fungal SAM domain of Ste50 fungal subfamily. SAM (sterile alpha motif) domain of Ste50 fungal subfamily is a protein-protein interaction domain. Proteins of this subfamily have SAM domain at the N-terminus and Ras-associated UBQ superfamily domain at the C-terminus. They participate in regulation of mating pheromone response, invasive growth and high osmolarity growth response, and contribute to cell wall integrity in vegetative cells. Ste50 of S.cerevisiae acts as an adaptor protein between G protein and MAP triple kinase Ste11. Ste50 proteins are able to form homooligomers, binding each other via their SAM domains, as well as heterodimers and heterogeneous complexes with SAM domain or SAM homodimers of MAPKKK Ste11 protein kinase. 74
28314 188936 cd09537 SAM_CP2-like SAM domain of CP2-like transcription factors. SAM (sterile alpha motif) domain of CP2-like transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and some also have a C-terminal dimerization domain. CP2-like family of transcriptional factors includes three subgroups: LBP1, TFCP2, and LBP9. Members of this family are involved in transcriptional regulation from early development to terminal differentiation. They play a role in regulation of expression of P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in placenta, and alpha-globin in erythroid cells. They are required for proper maturation of the duct (epithelial component of tubular organs) of kidney and salivary gland. Human LBP1 is known to be induced by HIV type I infection in lymphocytes; it represses HIV transcription by preventing the binding of TFIID to the virus promoter. Additionally, it has been suggested that UBP1 (LBP1) regulator might be a member of a blood pressure controlling network. LBP1 protein isoforms are able to form dimers apparently via SAM domain since SAM deletion or mutation resulted in a loss of this ability. 67
28315 188937 cd09538 SAM_DLC1,2-like SAM domain of DLC1,2-like subfamily. SAM (sterile alpha motif) domain of DLC-1,2-like (Deleted in liver cancer) subfamily is a protein-protein interaction domain located at the N-terminus of the protein. Members of this subfamily do not form dimers/oligomers through their SAM domains. They participate in regulation of cell migration and lipid transfer. SAM domain of human DLC1 protein contains the EF1A1 (eukaryotic elongation factor) binding motif, thus SAM facilitates recruitment of EF1A1 to the membrane periphery and suppresses cell migration. Human Dlc2 gene is known as a tumor suppressor gene. It was found underexpressed in hepatocellular carcinoma. 60
28316 188938 cd09539 SAM_TNK-like SAM domain of TNK(ACK)-like non-receptor tyrosine-protein kinases. SAM (sterile alpha motif) domain of TNK-like subfamily is a putative protein-protein interaction domain. This subfamily includes TNK1 and TNK2 (also known as ACK1) non-receptor tyrosine-protein kinases. They contain a SAM domain at the N-terminus followed by a catalytic domain and a few other domains. Members of this group are involved in the regulation of cell adhesion and growth, receptor degradation, and axonal guidance. Deletion of the SAM domain reduces the ability of Ack1 to undergo autophosphorylation and dramatically reduces ubiquitination of Ack1 catalyzed by HECT E3 ubiquitin ligase (Nedd4-1) during EGF-induced Ack1 degradation. It has been suggested that the lysine-rich region in SAM domain might be a major ubiquitination site. Members of this group are also associated with some cancers. Amplification of the Ack1 gene correlates with prostate and lung cancer progression, and Ack1 overexpression increases invasiveness. Oncogenicity of Tnk1 gene apparently depends on cell context; it may play a role in tumor suppression since Tnk1 knockout mice can develop spontaneous tumors. 62
28317 188939 cd09540 SAM_EPS8-like SAM domain of EPS8-like subfamily. SAM (sterile alpha motif) domain of EPS8-like subfamily is a putative protein-protein interaction domain. This subfamily includes epidermal growth factor receptor kinase substrate 8 proteins (EPS8) and epidermal growth factor receptor kinase substrate 8-like (EPSL8) 1, 2, 3 proteins with the SAM domain located in the C-terminal effector region. This region is responsible for intracellular protein localization and is involved in small GTPases (such as Rac and Rab5) activation/inhibition. Proteins belonging to this group participate in coordination and integration of multiple signaling pathways; in particular, they play a role in the control of actin dynamics and in receptor endocytosis. They can form complexes with other proteins; for example, in the actin signaling network they interact with SOS1 and E3b1 (Abl1) proteins as well as with CRIB (via SH3 domains) during the actin filament formation, and in the receptor endocytosis their partner is RN-tre protein. 66
28318 188940 cd09541 SAM_KIF24-like SAM domain of KIF24-like subfamily. SAM (sterile alpha motif) domain of KIF24 subfamily is a putative protein-protein interaction domain. This subfamily includes proteins related to human kinesin-like protein KIF24. SAM domain is located at the N-terminus followed by kinesin motor domain. Kinesins are proteins involved in a number of different cell processes including microtubule dynamics and axonal transport. Kinesins of this group belong to N-type; they drive microtubule plus end-directed transport. SAM apparently plays the role of adaptor or scaffold domain. In many cases SAM is known as a mediator of dimerization/oligomerization. 60
28319 188941 cd09542 SAM_EPH-A1 SAM domain of EPH-A1 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A1 subfamily of the receptor tyrosine kinases is a C-terminal protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A1 receptors and appears to mediate cell-cell initiated signal transduction. Activation of these receptors leads to inhibition of cell spreading and migration in a RhoA-ROCK-dependent manner. EPH-A1 receptors are known to bind ILK (integrin-linked kinase) which is the mediator of interactions between integrin and the actin cytoskeleton. However SAM is not sufficient for this interaction; it rather plays an ancillary role. SAM domains of Eph-A1 receptors do not form homo/hetero dimers/oligomers. EphA1 gene was found expressed widely in differentiated epithelial cells. In a number of different malignant tumors EphA1 genes are downregulated. In breast carcinoma the downregulation is associated with invasive behavior of the cell. 63
28320 188942 cd09543 SAM_EPH-A2 SAM domain of EPH-A2 family of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A2 subfamily of receptor tyrosine kinases is a C-terminal protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A2 receptors and appears to mediate cell-cell initiated signal transduction. For example, SAM domain of EPH-A2 receptors interacts with SAM domain of Ship2 proteins (SH2 containing phosphoinositide 5-phosphatase-2) forming heterodimers; such recruitment of Ship2 by EPH-A2 attenuates the positive signal for receptor endocytosis. Eph-A2 is found overexpressed in many types of human cancer, including breast, prostate, lung and colon cancer. High level of expression could induce cancer progression by a variety of mechanisms and could be used as a novel tag for cancer immunotherapy. EPH-A2 receptors are attractive targets for drug design. 70
28321 188943 cd09544 SAM_EPH-A3 SAM domain of EPH-A3 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A3 subfamily of receptor tyrosine kinases is a C-terminal putative protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A3 receptors and appears to mediate cell-cell initiated signal transduction. EPH-A3 receptors bind SH2/SH3 containing adaptor protein Nck1 and this adaptor is a key factor in EPH-A3 mediated signaling. However SAM domain is not implicated in this interaction. Activation of EPH-A3 receptors inhibits outgrowth and cell migration. Mutations in SAM domain may play a role in development of hepatocellular carcinoma. Expression of EPH-A3 is associated with lymphocytic leukemia and defines the subset of rhabdomyosarcoma tumors. EPH-A3 receptors are attractive targets for drug design. 63
28322 188944 cd09545 SAM_EPH-A4 SAM domain of EPH-A4 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A4 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A4 receptors and appears to mediate cell-cell initiated signal transduction. SAM domains of EPH-A4 receptors can form homodimers. EPH-A4 receptors bind ligands such as ephrin A1, A4, A5. They are known to interact with a number of different proteins, including meltrin beta metalloprotease, Cdk5, and EFS2alpha, however SAM domain doesn't participate in these interactions. EPH-A4 receptors are involved in regulation of corticospinal tract formation, in pathway controlling voluntary movements, in formation of motor neurons, and in axon guidance (SAM domain is not required for axon guidance or for EPH-A4 kinase signaling). In Xenopus embryos EPH-A4 induces loss of cell adhesion, ventro-lateral protrusions, and severely expanded posterior structures. Mutations in SAM domain conserved tyrosine (Y928F) enhance the ability of EPH-A4 to induce these phenotypes, thus supporting the idea that the SAM domain may negatively regulate some aspects of EPH-A4 activity. EphA4 gene was found overexpressed in a number of different cancers including human gastric cancer, colorectal cancer, and pancreatic ductal adenocarcinoma. It is likely to be a promising molecular target for the cancer therapy. 71
28323 188945 cd09546 SAM_EPH-A5 SAM domain of EPH-A5 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A5 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A5 receptors and appears to mediate cell-cell initiated signal transduction. Eph-A5 gene is almost exclusively expressed in the nervous system. Murine EPH-A5 receptors participate in axon guidance during embryogenesis and play a role in the adult synaptic plasticity, particularly in neuron-target interactions in multiple neural circuits. Additionally EPH-A5 receptors and its ligand ephrin A5 regulate dopaminergic axon outgrowth and influence the formation of the midbrain dopaminergic pathways. EphA5 gene expression was found decreased in a few different breast cancer cell lines, thus it might be a potential molecular marker for breast cancer carcinogenesis and progression. 66
28324 188946 cd09547 SAM_EPH-A6 SAM domain of EPH-A6 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A6 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A6 receptors and appears to mediate cell-cell initiated signal transduction. Eph-A6 gene is preferentially expressed in the nervous system. EPH-A6 receptors are involved in primate retina vascular and axon guidance, and in neural circuits responsible for learning and memory. EphA6 gene was significantly down regulated in colorectal cancer and in malignant melanomas. It is a potential molecular marker for these cancers. 64
28325 188947 cd09548 SAM_EPH-A7 SAM domain of EPH-A7 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A7 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A7 receptors and appears to mediate cell-cell initiated signal transduction. EphA7 was found expressed in human embryonic stem (ES) cells, neural tissues, kidney vasculature. EphA7 knockout mice show decrease in cortical progenitor cell death at mid-neurogenesis and significant increase in cortical size. EphA7 may be involved in the pathogenesis and development of different cancers; in particular, EphA7 was found upregulated in glioblastoma and downregulated in colorectal cancer and gastric cancer. Thus, it is a potential molecular marker and/or therapy target for these types of cancers. 70
28326 188948 cd09549 SAM_EPH-A10 SAM domain of EPH-A10 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A10 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A10 receptors and appears to mediate cell-cell initiated signal transduction. It was found preferentially expressed in the testis. EphA10 may be involved in the pathogenesis and development of prostate carcinoma and lymphocytic leukemia. It is a potential molecular marker and/or therapy target for these types of cancers. 70
28327 188949 cd09550 SAM_EPH-A8 SAM domain of EPH-A8 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A8 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A8 receptors and appears to mediate cell-cell initiated signal transduction. EPH-A8 receptors are involved in ligand dependent (ephrin A2, A3, A5) regulation of cell adhesion and migration, and in ligand independent regulation of neurite outgrowth in neuronal cells. They perform signaling in kinase dependent and kinase independent manner. EPH-A8 receptors are known to interact with a number of different proteins including PI 3-kinase and AIDA1-like subfamily SAM repeat domain containing proteins. However other domains (not SAM) of EPH-A8 receptors are involved in these interactions. 65
28328 188950 cd09551 SAM_EPH-B1 SAM domain of EPH-B1 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B1 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B1 receptors. In human vascular endothelial cells it appears to mediate cell-cell initiated signal transduction via the binding of the adaptor protein GRB10 (growth factor) through its SH2 domain to a conserved tyrosine that is phosphorylated. EPH-B1 receptors play a role in neurogenesis, in particular in regulation of proliferation and migration of neural progenitors in the hippocampus and in corneal neovascularization; they are involved in converting the crossed retinal projection to ipsilateral retinal projection. They may be potential targets in angiogenesis-related disorders. 68
28329 188951 cd09552 SAM_EPH-B2 SAM domain of EPH-B2 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B2 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B2 receptors and appears to mediate cell-cell initiated signal transduction. SAM domains of this subfamily form homodimers/oligomers (in head-to-head/tail-to-tail orientation); apparently such clustering is necessary for signaling. EPH-B2 receptor is involved in regulation of synaptic function; it is needed for normal vestibular function, proper formation of anterior commissure, control of cell positioning, and ordered migration in the intestinal epithelium. EPH-B2 plays a tumor suppressor role in colorectal cancer. It was found to be downregulated in gastric cancer and thus may be a negative biomarker for it. 71
28330 188952 cd09553 SAM_EPH-B3 SAM domain of EPH-B3 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B3 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B3 receptors and appears to mediate cell-cell initiated signal transduction. EPH-B3 receptor protein kinase performs kinase-dependent and kinase-independent functions. It is known to be involved in thymus morphogenesis, in regulation of cell adhesion and migration. Also EphB3 controls cell positioning and ordered migration in the intestinal epithelium and plays a role in the regulation of adult retinal ganglion cell axon plasticity after optic nerve injury. In some experimental models overexpression of EphB3 enhances cell/cell contacts and suppresses colon tumor growth. 69
28331 188953 cd09554 SAM_EPH-B4 SAM domain of EPH-B4 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B4 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B4 receptors and appears to mediate cell-cell initiated signal transduction. EPH-B4 protein kinase performs kinase-dependent and kinase-independent functions. These receptors play a role in the regular vascular system development during embryogenesis. They were found overexpressed in a variety of cancers, including carcinoma of the head and neck, ovarian cancer, bladder cancer, and downregulated in bone myeloma. Thus, EphB4 is a potential biomarker and a target for drug design. 67
28332 188954 cd09555 SAM_EPH-B6 SAM domain of EPH-B6 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B6 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B6 receptors and appears to mediate cell-cell initiated signal transduction. Receptors of this type are highly expressed in embryo and adult nervous system, in thymus and also in T-cells. They are involved in regulation of cell adhesion and migration. (EPH-B6 receptor is unusual; it fails to show catalytic activity due to alteration in kinase domain). EPH-B6 may be considered as a biomarker in some types of tumors; EPH-B6 activates MAP kinase signaling in lung adenocarcinoma, suppresses metastasis formation in non-small cell lung cancer, and slows invasiveness in some breast cancer cell lines. 69
28333 188955 cd09556 SAM_VTS1_fungal SAM domain of VTS1 RNA-binding proteins. SAM (sterile alpha motif) domain of VTS1 subfamily proteins is RNA binding domain located in the C-terminal region. SAM interacts with stem-loop structures of mRNA. Proteins of this subfamily participate in regulation of transcript stability and degradation, and also may be involved in vacuolar protein transport regulation. VTS1 protein of S.cerevisiae induces mRNA degradation via the major deadenylation-dependent mRNA decay pathway; VTS1 recruits CCR4/POP2/NOT deadenylase complex to target mRNA. The recruitment is the initial step resulting in poly(A) tail removal from transcripts. Potentially SAM domain may be responsible not only for RNA binding but also for deadenylase binding. 69
28334 188956 cd09557 SAM_Smaug SAM domain of Smaug subfamily. SAM (sterile alpha motif) domain of Smaug proteins is an RNA recognition domain. It binds a specific RNA motif known as Smaug recognition element (SRE). Among members of this group are invertebrate Smaug (Smg) proteins and vertebrate Smaug1 and Smaug2 proteins. They are involved in post-transcriptional control during early embryogenesis in animals. In Drosophila, Smaug protein is a translational repressor of mRNA of Nanos (Nos) protein. Gradient of Nanos is required for proper abdominal segmentation. SAM domain interacts specifically with the Nanos mRNA regulatory regions. Moreover, Smaug protein is involved in regulation of specific maternal transcripts degradation in Drosophila early embryo via recruitment of the CCR4/POP2/NOT deadenylase. 63
28335 188957 cd09558 SAM_ZCCH14 SAM domain of ZCCH14 subfamily. SAM (sterile alpha motif) domain of ZCCH14 (Zinc finger CCHC domain 14) protein subfamily (also known as BDG-29 or KIAA0579) is a putative RNA binding domain. Members of this group are believed to be involved in post-translational regulation during early embryogenesis. 65
28336 188958 cd09559 SAM_SASH1_repeat1 SAM domain of SASH1 proteins, repeat 1. SAM (sterile alpha motif) repeat 1 of SASH1 proteins is a predicted protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. SASH1 can bind 14-3-3 proteins in response to IGF1/phosphatidylinositol 3-kinase signaling. SASH1 was found upregulated in different tissues including thymus, placenta, lungs and downregulated in some breast tumors, liver metastases and colon cancers, relative to corresponding normal tissues. SASH1 is a potential candidate for a tumor suppressor gene in breast cancers. At the same time, downregulation of SASH1 in colon cancer is associated with metastasis and a poor prognosis. 66
28337 188959 cd09560 SAM_SASH3 SAM domain of SASH3 subfamily. SAM (sterile alpha motif) domain of SASH3 (also known as SLY) proteins is a predicted protein-protein interaction domain. Members of this subfamily are putative signaling/adaptor proteins. In addition to SAM, they contain SLY and SH3 domains. They appear to mediate signal transduction in lymphoid tissues. Murine SASH3 is involved in preventing DN thymocytes from premature initiation of programmed cell death and in mTOR (mammalian target of rapamycin) activation via signal integration of the Notch receptor and preTCR (T cell receptor) pathways. 68
28338 188960 cd09561 SAM_SAMSN1 SAM domain of SAMSN1 subfamily. SAM (sterile alpha motif) domain of SAMSN1 (also known as HACS1 or NASH1) proteins is a predicted protein-protein interaction domain. Members of this group are putative signaling/adaptor proteins. They appear to mediate signal transduction in lymphoid tissues. Murine HACS1 protein likely plays a role in B cell activation and differentiation. Potential binding partners of HACS1 are SLAM, DEC205 and PIR-B receptors and also some unidentified tyrosine-phosphorylated proteins. Proteins of this group were found preferentially expressed in normal hematopoietic tissues and in some malignancies including lymphoma, myeloid leukemia and myeloma. 66
28339 188961 cd09562 SAM_liprin-alpha1,2,3,4_repeat1 SAM domain of liprin-alpha1,2,3,4 proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone. 71
28340 188962 cd09563 SAM_liprin-beta1,2_repeat1 SAM domain of liprin-beta1,2 proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-alpha proteins through their SAM domains. It was suggested based on bioinformatic approaches that the second SAM domain of liprin-beta is potentially able to form polymers. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity. 64
28341 188963 cd09564 SAM_kazrin_repeat1 SAM domain of kazrin proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrin contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 70
28342 188964 cd09565 SAM_liprin-alpha1,2,3,4_repeat2 SAM domain of liprin-alpha1,2,3,4 proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone. 66
28343 188965 cd09566 SAM_liprin-beta1,2_repeat2 SAM domain of liprin-beta1,2 proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-alpha proteins through their SAM domains. It was suggested based on bioinformatic approaches that the second SAM domain of liprin-beta potentially is able to form polymers. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity. 63
28344 188966 cd09567 SAM_kazrin_repeat2 SAM domain of kazrin proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrins contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 65
28345 188967 cd09568 SAM_liprin-alpha1,2,3,4_repeat3 SAM domain of liprin-alpha1,2,3,4 proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone. 72
28346 188968 cd09569 SAM_liprin-beta1,2_repeat3 SAM domain of liprin-beta proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-alpha proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity. 72
28347 188969 cd09570 SAM_kazrin_repeat3 SAM domain of kazrin proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrins contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization. 72
28348 188970 cd09571 SAM_tumor-p73 SAM domain of tumor-p73 proteins. SAM (sterile alpha motif) domain of p73 proteins is a putative protein-protein interaction and lipid-binding domain. p73 is a homolog to the tumor suppressor p53. p73 has a C-terminal SAM domain in the longest spliced alpha form, while p53 doesn't have it. p73 knockout mouse shows significant developmental abnormalities but no increased cancer susceptibility, suggesting that p73 plays a role in regulation of normal development. It was shown that SAM domain of p73 is able to bind some membrane lipids. The structural rearrangements in SAM are necessary to accomplish the binding. No evidence for homooligomerization through SAM domains was found for the p73 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain. 65
28349 188971 cd09572 SAM_tumor-p63 SAM domain of tumor-p63 proteins. SAM (sterile alpha motif) domain of p63 proteins is a putative protein-protein interaction domain. p63 is a homolog to the tumor suppressor p53. p63 has a C-terminal SAM domain in the longest spliced alpha form, while p53 doesn't have it. p63 knockout mice show significant developmental abnormalities but no increased cancer susceptibility, suggesting that p63 plays a role in regulation of normal development. No evidence for homooligomerization through SAM domains was found for the p63 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain. Mutations in the SAM domain of p63 are found in AEC syndrome patients. 65
28350 188972 cd09573 SAM_STIM1 SAM domain of STIM1 subfamily proteins. SAM (sterile alpha motif) domain of STIM1 (Stromal interaction molecule) subfamily proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM1 protein is an activator of store operated channels in plasma membrane. 74
28351 188973 cd09574 SAM_STIM2 SAM domain of STIM2 subfamily proteins. SAM (sterile alpha motif) domain of STIM2 (Stromal interaction molecule) subfamily proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM2 protein is an inhibitor of store operated channels in plasma membrane. 74
28352 188974 cd09575 SAM_DGK-delta SAM domain of diacylglycerol kinase delta. SAM (sterile alpha motif) domain of DGK-delta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases with a SAM domain located at the C-terminus. DGK-delta proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. In particular DGK-delta is involved in the regulation of clathrin-dependent endocytosis. The SAM domain of DGK-delta proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers with the SAM domain of DGK-eta proteins. The oligomerization plays a role in the regulation of the DGK-delta intracellular localization: it inhibits the translocation of the protein to the plasma membrane from the cytoplasm. The SAM domain also can bind Zn at multiple (not conserved) sites driving the formation of highly ordered large sheets of polymers, thus suggesting that Zn may play important role in the function of DGK-delta. 65
28353 188975 cd09576 SAM_DGK-eta SAM domain of diacylglycerol kinase eta. SAM (sterile alpha motif) domain of DGK-eta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases. The SAM domain is located at the C-terminus of two out of three isoforms of DGK-eta protein. DGK-eta proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. The SAM domain of DGK-eta proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers with the SAM domain of DGK-delta proteins. The oligomerization plays a role in the regulation of the DGK-delta intracellular localization: it is responsible for sustained endosomal localization of the protein and resulted in negative regulation of DGK-eta catalytic activity. 65
28354 188976 cd09577 SAM_Ph1,2,3 SAM domain of Ph (polyhomeotic) proteins of Polycomb group. SAM (sterile alpha motif) domain of Ph (polyhomeotic) proteins of Polycomb group is a protein-protein interaction domain. Ph1,2,3 proteins are members of PRC1 complex. This complex is involved in transcriptional repression of Hox (Homeobox) cluster genes. It is recruited through methylated H3Lys27 and supports the repression state by mediating monoubiquitination of histone H2A. Proteins of the Ph1,2,3 subfamily contribute to anterior-posterior neural tissue specification during embryogenesis. Additionally, the P2 protein of zebrafish is known to be involved in epiboly and tailbud formation. SAM domains of Ph proteins may interact with each other, forming homooligomers, as well as with SAM domains of other proteins, in particular with the SAM domain of Scm (sex comb on midleg) proteins, forming heterooligomers. Homooligomers are similar to the ones formed by SAM Pointed domains of the TEL proteins. Such SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome. 69
28355 188977 cd09578 SAM_Scm SAM domain of Scm proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm (Sex comb on midleg) subfamily of Polycomb group is a protein-protein interaction domain. Proteins of this subfamily are transcriptional repressors associated with PRC1 complex. This group includes invertebrate Scm protein and chordate Scm homolog 1 and Scm-like 1, 2, 3 proteins. Most have a SAM domain, two MBT repeats, and a DUF3588 domain, except Scm-like 4 proteins which do not have MBT repeats. Originally the Scm protein was described in Drosophila as a regulator required for proper spatial expression of homeotic genes. It plays a major role during early embryogenesis. SAM domains of Scm proteins can interact with each other, forming homooligomers, as well as with SAM domains of other proteins, in particular with SAM domains of Ph (polyhomeotic) proteins, forming heterooligomers. Homooligomers are similar to the ones formed by SAM Pointed domains of the TEL proteins. Such SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome. Mammalian Scmh1 protein is known to be an indispensable member of PRC1 complex; it plays a regulatory role for the complex during meiotic prophase of male sperm cells, and is particularly involved in regulation of chromatin modification at the XY chromatin domain of the pachytene spermatocytes. 72
28356 188978 cd09579 SAM_Samd7,11 SAM domain of Samd7,11 subfamily of Polycomb group. SAM (sterile alpha motif) domain is a protein-protein interaction domain. Phylogenetic analysis suggests that proteins of this subfamily are most closely related to SAM-Ph1,2,3 subfamily of Polycomb group. They are predicted transcriptional repressors in photoreceptor cells and pinealocytes of vertebrates. SAM domain containing protein 11 is also known as Mr-s (major retinal SAM) protein. In mouse, it is predominantly expressed in developing retinal photoreceptors and in adult pineal gland. The SAM domain is involved in homooligomerization of whole proteins (it was shown based on immunoprecipitation assay and mutagenesis), however its repression activity is not due to SAM/SAM interactions but to the C-terminal region. 68
28357 188979 cd09580 SAM_Scm-like-4MBT SAM domain of Scm-like-4MBT proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-4MBT (Sex comb on midleg like, Malignant Brain Tumor) subfamily proteins of the polycomb group is a putative protein-protein interaction domain. Additionally to the SAM domain, most of the proteins of this subfamily have 4 MBT repeats. In Drosophila SAM-Scm-like-4MBT protein (known as dSfmbt) is a member of Pho repressive complex (PhoRC). Additionally to dSfmbt, the PhoRC complex includes Pho or Pho-like proteins. This complex is responsible for HOX (Homeobox) gene silencing: Pho or Pho-like proteins bind DNA and dSfmbt binds methylated histones. dSfmbt can interact with mono- and di-methylated histones H3 and H4 (however this activity has been shown for the MBT repeats, while exact function of the SAM domain is unclear). Besides interaction with histones, dSfmbt can interact with Scm (a member of PRC complex), but this interaction also seems to be SAM domain independent. 67
28358 188980 cd09581 SAM_Scm-like-4MBT1,2 SAM domain of Scm-like-4MBT1,2 proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-4MBT1,2 (Sex comb on midleg, Malignant Brain Tumor) subfamily proteins (also known as Sfmbt1,2 proteins) is a putative protein-protein interaction domain. Proteins of this subfamily are transcriptional regulators belonging to Polycomb group. The majority of them are multidomain proteins: in addition to the C-terminal SAM domain, they contain four MBT repeats and DUF5388 domain. The MBT repeats of the human sfmbt1 protein are responsible for association with the nuclear matrix and for selective binding of H3 histone N-terminal tails, while the exact function of the SAM domain is unclear. 85
28359 188981 cd09582 SAM_Scm-like-3MBT3,4 SAM domain of Scm-like-3MBT3,4 proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-3MBT3,4 (Sex comb on midleg, Malignant brain tumor) subfamily proteins (also known as L3mbtl3,4 proteins) is a putative protein-protein interaction domain. Proteins of this subfamily are predicted transcriptional regulators belonging to Polycomb group. The majority of them are multidomain proteins: in addition to the C-terminal SAM domain, they contain three MBT repeats and Zn finger domain. Murine L3mbtl3 protein of this subfamily is essential for maturation of myeloid progenitor cells during differentiation. Human L3mbtl4 is a potential tumor suppressor gene in breast cancer, while deregulation of L3MBTL3 is associated with neuroblastoma. 66
28360 188982 cd09583 SAM_Atherin-like SAM domain of Atherin/Atherin-like subfamily. SAM (sterile alpha motif) domain of SAM_Atherin and Atherin-like subfamily proteins is a putative protein-protein and/or protein-lipid interaction domain. In addition to the C-terminal SAM domain, the majority of proteins belonging to this group also have PHD (or Zn finger) domain. As potential members of the polycomb group, these proteins may be involved in regulation of some key regulatory genes during development. Atherin can be recruited by Ruk/CIN85 kinase-binding proteins via its SH3 domains thus participating in the signal transferring kinase cascades. Also, atherin was found associated with low density lipids (LDL) in atherosclerotic lesions in human. It was suggested that atherin plays an essential role in atherogenesis via immobilization of LDL in the arterial wall. SAM domains of atherins are predicted to form polymers. Inhibition of polymer formation could be a potential antiatherosclerotic therapy. 69
28361 188983 cd09584 SAM_sec23ip SAM domain of sec23ip. SAM (sterile alpha motif) domain of Sec23ip (Sec23 interacting protein) group is a potential protein-protein interaction domain. Sec23ip proteins (also known as p125) contain an N-terminal proline-rich region, a central region containing a SAM domain and a C-terminal region with a predicted metal-binding domain. Sec23ip interacts with Sec23p/Sec24p part of COPII-coated vesicles complex involved in protein transport from the ER to the Golgi apparatus. The proline-rich region plays an essential role in this interaction. Overexpression of Sec23ip leads to disorganization of ER/Golgi intermediate compartment. 69
28362 188984 cd09585 SAM_DDHD2 SAM domain of DDHD2. SAM (sterile alpha motif) domain of DDHD2 group is a potential protein-protein interaction domain. DDHD2 proteins contain at least two domains: a SAM domain and a predicted metal-binding domain. Phospholipase A1 activity was demonstrated for the mammalian DDHD2 protein. Mutation of the putative catalytic serine resulted in elimination of activity. Unlike SEC23IP, DDHD2 proteins do not have an N-terminal proline-rich region and correspondingly they are not able to interact with Sec23p/Sec24p complex. Overexpression of DDHD2 is the cause of dispersion of ER/Golgi intermediate compartment and dispersion of tethering proteins located in the Golgi region, leading to aggregation in the endoplasmic reticulum. 69
28363 188985 cd09586 SAM_USH1G SAM domain of USH1G. SAM (sterile alpha motif) domain of USH1G (Usher syndrome type-1G protein) proteins (also known as SANS) is a putative protein-protein interaction domain. Members of this group have an N-terminal ankyrin repeat region and C-terminal SAM domain. USH1G is expressed in the hair bundles of the inner ear sensory cells. It can form a functional network with USH1B (myosin VIIa), USH1C (harmonin b), USH1F (protocadherin-related 15), and USH1D (cadherin 23). The SAM domain of the USH1G protein is involved in synergetic interactions with the PDZ domain of harmonin. Such interactions contribute to the stability of harmonin. The network is required for the correct cohesion of the hair bundle. Mutations in the ush1g gene lead to Usher syndrome type 1G. This syndrome is the cause of deaf-blindness in humans. 66
28364 188986 cd09587 SAM_HARP SAM domain of HARP subfamily. SAM (sterile alpha motif) domain of HARP (Harmonin-interacting Ankyrin Repeat-containing) proteins, also known as ANKS4B, is a protein-protein interaction domain. Proteins of this subfamily have an N-terminal ankyrin repeat region and C-terminal SAM. In mouse epithelial tissues, HARP protein interacts with the PDZ domain of harmonin. This scaffolding complex facilitates signal transduction in epithelia. HARP was found co-expressed with harmonin in a number of epithelial cells including pancreatic ductal epithelium, embryonic epithelia of the lung, kidney, salivary glands, and cochlea. 67
28365 188987 cd09588 SAM_LBP1 SAM domain of LBP1 (UBP1) transcription factors. SAM (sterile alpha motif) domain of LBP1 (also known as UBP1) transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and some also have a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they regulate alpha-globin in erythroid cells and P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in human placenta. Human LBP1 is known to be induced by HIV type I infection in lymphocytes; it represses HIV transcription by preventing the binding of TFIID to the virus promoter. Additionally, it has been suggested that UBP1 (LBP1) regulator might be a member of a blood pressure controlling network. LBP1 protein isoforms are able to form dimers, apparently via SAM domain since SAM deletion or mutation resulted in a loss of this ability. 67
28366 188988 cd09589 SAM_TFCP2 SAM domain of TFCP2 transcription factors. SAM (sterile alpha motif) domain of TFCP2 transcription factors is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they regulate expression of erythroid cell-specific alpha-globin, fibrinogen, and sex-determining gene SRY as well as lens alpha-crystallin. TFCP2 regulators can interact with NF-E4 proteins forming heteromeric stage selector protein complex (SSP). This complex is able to bind stage selector element (SSE) and regulate embryonic globin expression in fetal-erythroid cells. 67
28367 188989 cd09590 SAM_LBP9 SAM domain of LBP9 transcriptional factors. SAM (sterile alpha motif) domain of LBP9 (also known as TFCP2L1 or CRTR-1 (CP2-Related Transcriptional Repressor-1)) transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they are required for proper maturation of the duct (epithelial component of tubular organs) of kidney and salivary gland as well as for regulation of P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in human placenta. 67
28368 188990 cd09591 SAM_DLC1 SAM domain of DLC1 subfamily. SAM (sterile alpha motif) domain of DLC1 (Deleted in liver cancer) protein is a protein-protein interaction domain located at the N-terminus. Proteins of this subfamily do not form dimers/oligomers through their SAM domains. They participate in regulation of cell migration. SAM domain of human DLC1 protein contains the EF1A1 (eukaryotic elongation factor) binding motif, thus SAM facilitates recruitment of EF1A1 to the membrane periphery and suppresses cell migration. 60
28369 188991 cd09592 SAM_DLC2 SAM domain of STARD13-like subfamily. SAM (sterile alpha motif) domain of DLC2 (Deleted in liver cancer) protein is a lipid-binding and putative protein-protein interaction domain located at the N-terminus of the protein. Members of this subfamily do not form dimers/oligomers through their SAM domains. They participate in lipid transfer. Human Dlc2 gene is known as a tumor suppressor gene. It was found underexpressed in hepatocellular carcinoma. 64
28370 381677 cd09593 UDG-like uracil-DNA glycosylases (UDG) and related enzymes. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches, but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes Human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from, single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. 
UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it. 125
28371 341057 cd09594 GluZincin Gluzincin Peptidase family (thermolysin-like proteinases, TLPs) which includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins). The Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), which contain the HEXXH motif as part of their active site. Peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. The M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The M3_like peptidases include the M2_ACE, M3 or neurolysin-like family (subfamilies M3B_PepF and M3A) and M32_Taq peptidases. The M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key component of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M3A includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; and M3B includes oligopeptidase F. The M32 family includes eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and from Leishmania major, a parasite that causes leishmaniasis, making these enzymes attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, and neutral protease as well as bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. The M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. The peptidase M36 fungalysin family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers. 105
28372 341058 cd09595 M1 Peptidase M1 family includes the catalytic domains of aminopeptidase N and leukotriene A4 hydrolase. The model represents the catalytic domains of M1 peptidase family members including aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile upon activation during catalysis. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. APN expression is dysregulated in many inflammatory diseases and is enhanced in numerous tumor cells, making it a lead target in the development of anti-cancer and anti-inflammatory drugs. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity. The two activities occupy different, but overlapping sites. The activity and physiological relevance of the aminopeptidase in LTA4H is as yet unknown, while the epoxide hydrolase converts leukotriene A4 (LTA4) into leukotriene B4 (LTB4), a potent chemotaxin that is fundamental to the inflammatory response of mammals. 413
28373 341059 cd09596 M36 Peptidase M36 family, also known as fungalysin family. The M36 peptidase family, also known as fungalysin (elastinolytic metalloproteinase) family, includes endopeptidases from pathogenic fungi. Fungalysin can hydrolyze extracellular matrix proteins such as elastin and keratin, with a preference for cleavage on the amino side of hydrophobic residues with bulky side-chains. This family is similar to the M4 (thermolysin) family due to the presence of the HEXXH motif in the active site residues, as well as its fold prediction. Some of these enzymes also contain a protease-associated (PA) domain insert. The eukaryotic M36 and bacterial M4 families of metalloproteases also share a conserved domain in their propeptides called FTP (fungalysin/thermolysin propeptide). Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals; it secretes fungalysin that possibly breaks down proteinaceous structural barriers. A solid lesion known as an aspergilloma can grow in a lung cavity, particularly following recovery from tuberculosis. Fungalysins are also found as multiple copies in the human and animal pathogenic fungi such as Microsporum canis, Trichophyton rubrum and T. mentagrophytes, which cause cutaneous infections. 317
28374 341060 cd09597 M4_TLP Peptidase M4 family including thermolysin, protealysin, aureolysin, and neutral protease. This peptidase M4 family includes several endopeptidases such as thermolysin (EC 3.4.24.27), aureolysin (the extracellular metalloproteinase from Staphylococcus aureus), neutral protease from Bacillus cereus, protealysin, and bacillolysin (EC 3.4.24.28). Typically, the M4 peptidases consist of a presequence (signal sequence), a propeptide sequence, and a peptidase unit. The presequence is cleaved off during export while the propeptide has inhibitory and chaperone functions and facilitates folding. The propeptide remains attached until the peptidase is secreted and can be safely activated. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. The active site is found between two sub-domains; the N-terminal domain contains the HEXXH zinc-binding motif while the helical C-terminal domain, which is unique for the family, carries the third zinc ligand. These peptidases are secreted eubacterial endopeptidases from Gram-positive or Gram-negative sources that degrade extracellular proteins and peptides for bacterial nutrition. They are selectively inhibited by Steptomyces metalloproteinase inhibitor (SMPI) as well as by phosphoramidon from Streptomyces tanashiensis. A large number of these enzymes are implicated as key factors in the pathogenesis of various diseases, including gastritis, peptic ulcer, gastric carcinoma, cholera and several types of bacterial infections, and are therefore important drug targets. Some enzymes of the family can function at extremes of temperatures, while some function in organic solvents, thus rendering them novel targets for biotechnological applications. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing. 
It has also been used in production of the artificial sweetener aspartame. 278
28375 341061 cd09598 M4_like Peptidase M4 family containing mostly uncharacterized proteins. This family of uncharacterized bacterial proteins are homologs of the M4 peptidase family that is also known as the thermolysin-like peptidase (TLP) family. Typically, the M4 peptidases consist of a presequence (signal sequence), a propeptide sequence and a peptidase unit. The presequence is cleaved off during export while the propeptide has inhibitory and chaperone functions and facilitates folding. The propeptide remains attached until the peptidase is secreted and can be safely activated. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. TLPs are secreted eubacterial endopeptidases from Gram-positive or Gram-negative sources that degrade extracellular proteins and peptides for bacterial nutrition. They contain the HEXXH motif as part of their active site and belong to the Gluzincins family and are selectively inhibited by Steptomyces metalloproteinase inhibitor (SMPI) as well as by phosphoramidon from Streptomyces tanashiensis. A large number of these enzymes are implicated as key factors in the pathogenesis of various diseases, including gastritis, peptic ulcer, gastric carcinoma, cholera and several types of bacterial infections, and are therefore important drug targets. Some enzymes of the family can function at extremes of temperatures, while some function in organic solvents, thus rendering them novel targets for biotechnological applications. 263
28376 341062 cd09599 M1_LTA4H Peptidase M1 family including Leukotriene A4 hydrolase catalytic domain. This model represents the N-terminal catalytic domain of leukotriene A4 hydrolase (LTA4H; E.C. 3.3.2.6) and the close homolog cold-active aminopeptidase (Colwellia psychrerythraea-type peptidase; ColAP), both members of the aminopeptidase M1 family. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity. The two activities occupy different, but overlapping sites. The activity and physiological relevance of the aminopeptidase is poorly understood while the epoxide hydrolase converts leukotriene A4 (LTA4) into leukotriene B4 (LTB4), a potent chemotaxin that is fundamental to the inflammatory response of mammals. It accepts a variety of substrates, including some opioid, di- and tripeptides, as well as chromogenic aminoacyl-p-nitroanilide derivatives. The aminopeptidase activity of LTA4H is possibly involved in the processing of peptides related to inflammation and host defense. Kinetic analysis shows that LTA4H hydrolyzes arginyl tripeptides with high efficiency and specificity, indicating its function as an arginyl aminopeptidase. Thermodynamic characterization using different biophysical methods shows that structurally distinct inhibitors of the LTA4H occupy different regions of the binding site; while some (RB202, ARM1 and SC57461A) bind to the hydrophobic hydrolase side, both bestatin and captopril are located at the hydrophilic peptidase side. LTB4H overexpression is associated with different pathological conditions and diseases such as cystic fibrosis, coronary heart disease, sepsis, shock, connective tissue disease, and chronic obstructive pulmonary disease. It is also overexpressed in certain human cancers, and has been identified as a functionally important target for mediating anticancer properties of resveratrol, a well-known red wine polyphenolic compound with cancer chemopreventive activity. 442
28377 341063 cd09600 M1_APN Peptidase M1 family, including aminopeptidase N catalytic domain. This model represents the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. It includes bacterial-type alanyl aminopeptidases as well as PfA-M1 aminopeptidase (Plasmodium falciparum-type). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. 
Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established. 434
28378 341064 cd09601 M1_APN-Q_like Peptidase M1 aminopeptidase N catalytic domain family which includes aminopeptidase N (APN), aminopeptidase Q (APQ), tricorn interacting factor F3, and endoplasmic reticulum aminopeptidase 1 (ERAP1). This M1 peptidase family includes eukaryotic and bacterial members: the catalytic domains of aminopeptidase N (APN), aminopeptidase Q (APQ, laeverin), endoplasmic reticulum aminopeptidase 1 (ERAP1) as well as tricorn interacting factor F3. Aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease, preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is considered a marker of differentiation since it is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. ERAP1, also known as endoplasmic reticulum aminopeptidase associated with antigen processing (ERAAP), adipocyte derived leucine aminopeptidase (A-LAP), or aminopeptidase regulating tumor necrosis factor receptor I (THFRI) shedding (ARTS-1), associates with the closely related ER aminopeptidase ERAP2, for the final trimming of peptides within the ER for presentation by MHC class I molecules. ERAP1 is associated with ankylosing spondylitis (AS), an inflammatory arthritis that predominantly affects the spine. ERAP1 also aids in the shedding of membrane-bound cytokine receptors. 
The tricorn interacting factor F3, together with factors F1 and F2, degrades the tricorn protease products, producing free amino acids, thus completing the proteasomal degradation pathway. F3 is homologous to F2, but not F1, and shows a strong preference for glutamate in the P1' position. APQ, also known as laeverin, is specifically expressed in human embryo-derived extravillous trophoblasts (EVTs) that invade the uterus during early placentation. It cleaves the N-terminal amino acid of various peptides such as angiotensin III, endokinin C, and kisspeptin-10, all expressed in the placenta in large quantities. APN is a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs are also putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has been yet to be firmly established. 442
28379 341065 cd09602 M1_APN Peptidase M1 family including aminopeptidase N catalytic domain. This model represents the catalytic domain of bacterial and eukaryotic aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has been yet to be firmly established. 440
28380 341066 cd09603 M1_APN_like Peptidase M1 family similar to aminopeptidase N catalytic domain. This family contains mostly bacterial and some archaeal M1 peptidases with similarity to the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyositis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. 
Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established. 410
28381 341067 cd09604 M1_APN_like Peptidase M1 family similar to aminopeptidase N catalytic domain. This family contains bacterial M1 peptidases with similarity to the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyositis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established. 440
28382 341068 cd09605 M3A Peptidase M3A family includes thimet oligopeptidase, dipeptidyl carboxypeptidase and mitochondrial intermediate peptidase. The M3-like family also called neurolysin-like family, is part of the "zincins" metallopeptidases, and includes M3, M2 and M32 families of metallopeptidases. The M3 family is subdivided into two subfamilies: the widespread M3A, represented by this CD, which comprises a number of high-molecular mass endo- and exopeptidases from bacteria, archaea, protozoa, fungi, plants and animals, and the small M3B, whose members are enzymes primarily from bacteria. Well-known mammalian/eukaryotic M3A endopeptidases are the thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (alias endopeptidase 3.4.24.16), and the mitochondrial intermediate peptidase. The first two are intracellular oligopeptidases, which act only on relatively short substrates of less than 20 amino acid residues, while the latter cleaves N-terminal octapeptides from proteins during their import into the mitochondria. The M3A subfamily also contains several bacterial endopeptidases, called oligopeptidases A, as well as a large number of bacterial carboxypeptidases, called dipeptidyl peptidases (Dcp; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). The peptidases in the M3 family contain the HEXXH motif that forms part of the active site in conjunction with a C-terminally-located Glutamic acid (Glu) residue. A single zinc ion is ligated by the side-chains of the two Histidine (His) residues, and the more C-terminal Glu. Most of the peptidases are synthesized without signal peptides or propeptides, and function intracellularly. 587
28383 341069 cd09606 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3 oligopeptidase F (oligoendopeptidase) is mostly bacterial and includes oligoendopeptidase F from Geobacillus stearothermophilus. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids and may cleave proteins at Leu-Gly. The PepF gene is duplicated in Lactococcus lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. 543
28384 341070 cd09607 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B Oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. 580
28385 341071 cd09608 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and includes oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. This PepF family includes Streptococcus agalactiae PepB, a group B streptococcal oligopeptidase which has been shown to degrade a variety of bioactive peptides as well as the synthetic collagen-like substrate N-(3-[2-furyl]acryloyl)-Leu-Gly-Pro-Ala in vitro. 560
28386 341072 cd09609 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. 586
28387 341073 cd09610 M3B_PepF Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. 532
28388 187707 cd09611 Jacalin_ZG16_like Jacalin-like lectin domain of the zymogen granule protein 16 and related proteins. ZG16p is a conserved secreted vertebrate protein with tissue-specific expression profiles, which might play a role in glycoprotein secretion, perhaps as a linker protein that participates in the formation and/or transport of the zymogen granule. Its paralog ZG16b (PAUF) has been associated with roles in gene regulation and cancer. This domain family also contains mammalian proteins labelled as prostatic spermine-binding protein (SBP) and salivary-gland specific secreted proteins. 128
28389 187708 cd09612 Jacalin Jacalin-like plant lectin domain. Jacalin-like lectins are sugar-binding protein domains mostly found in plants. They adopt a beta-prism topology consistent with a circularly permuted three-fold repeat of a structural motif. Proteins containing this domain may bind mono- or oligosaccharides with high specificity. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. The family was initially named after an abundant protein found in the jackfruit seed. Jacalin specifically binds to the alpha-O-glycoside of the disaccharide Gal-beta1-3-GalNAc, and has proven useful in the study of O-linked glycoproteins. Jacalin-like lectins in this family may occur in various oligomerization states. 130
28390 187709 cd09613 Jacalin_metallopeptidase_like Jacalin-like lectin domain of putative metalloproteases and similar proteins. Members of this family, which appears restricted to fungi, co-occur with protein domains that contain an HExxH motif characteristic of metallopeptidases. They have not been functionally characterized. 124
28391 187710 cd09614 griffithsin_like Jacalin-like lectin domain of griffithsin and related proteins. Griffithsin is a lectin isolated from a red alga, which has shown potential as an inhibitor of viral entry, exhibiting antiviral activity against HIV and SARS. The biological functions of griffithsin and griffithsin-like proteins with respect to their source organisms are not known. 128
28392 187711 cd09615 Jacalin_EEP Jacalin-like lectin domains of putative endonucleases/exonucleases/phosphatases and related proteins. Members of this taxonomically diverse family co-occur with metal-dependent endonucleases/exonucleases/phosphatases. They have not been functionally characterized. 134
28393 187737 cd09616 Peptidase_C12_UCH_L1_L3 Cysteine peptidase C12 containing ubiquitin carboxyl-terminal hydrolase (UCH) families L1 and L3. This ubiquitin C-terminal hydrolase (UCH) family includes UCH-L1 and UCH-L3, the two members sharing around 53% sequence identity as well as conserved catalytic residues. Both enzymes hydrolyze carboxyl terminal esters and amides of ubiquitin (Ub). UCH-L1, in dimeric form, has additional enzymatic activity as a ubiquitin ligase. It is highly abundant in the brain, constituting up to 2% of total protein, and is expressed exclusively in neurons and testes. Abnormal expression of UCH-L1 has been shown to correlate with several forms of cancer, including several primary lung tumors, lung tumor cell lines, and colorectal cancers. Mutations in the UCH-L1 gene have been linked to susceptibility to and protection from Parkinson's disease (PD); dysfunction of the hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD), while accumulation of neurofibrillary tangles is linked to Alzheimer's disease (AD). UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. It can also interact with Lys48-linked Ub dimers to protect them from degradation while inhibiting its hydrolase activity at the same time. Unlike UCH-L1, neither dimerization nor ligase activity have been observed for UCH-L3. It has been shown that levels of Nedd8 and the apoptotic protein p53 and Bax are elevated in UCH-L3 knockout mice upon cryptorchid injury, possibly contributing to profound germ cell loss via apoptosis. 222
28394 187738 cd09617 Peptidase_C12_UCH37_BAP1 Cysteine peptidase C12 containing ubiquitin carboxyl-terminal hydrolase (UCH) families UCH37 (UCH-L5) and BAP1. This ubiquitin C-terminal hydrolase (UCH) family includes UCH37 (also known as UCH-L5) and BRCA1-associated protein-1 (BAP1). They contain a UCH catalytic domain as well as an additional C-terminal extension which plays a role in protein-protein interactions. UCH37 is responsible for ubiquitin (Ub) isopeptidase activity in the 19S proteasome regulatory complex; it disassembles Lys48-linked poly-ubiquitin from the distal end of the chain. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus and can be activated through transient association of hINO80 with hRpn13 that is bound to the 19S regulatory particle or the proteasome. UCH37 possibly plays a role in oncogenesis; it competes with Smad ubiquitination regulatory factor 2 (Smurf2, ubiquitin ligase) in binding concurrently to Smad7 in order to deubiquitinate the activated type I transforming growth factor beta (TGF-beta) receptor, thus rescuing it from proteasomal degradation. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. In addition to the UCH catalytic domain, BAP1 contains a UCH37-like domain (ULD), binding domains for BRCA1 and BARD1, which form a tumor suppressor heterodimeric complex, and a binding domain for HCFC1, which interacts with histone-modifying complexes during cell division. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. BAP1 exhibits tumor suppressor activity in cancer cells, and gene mutations have been reported in a small number of breast and lung cancer samples. In metastasis of uveal melanoma, the most common primary cancer of the eye, inactivating somatic mutations have been identified in the gene encoding BAP1 on chromosome 3p21.1. These mutations include several that cause premature protein termination as well as affect its UCH domain, thus implicating loss of BAP1 and suggesting that the BAP1 pathway may be a valuable therapeutic target. 219
28395 187676 cd09618 CBM9_like_2 DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized subfamily are typically found at the N-terminus of longer proteins that lack additional annotation with domain footprints. 186
28396 187677 cd09619 CBM9_like_4 DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily are often located at the C-terminus of longer proteins and may co-occur with various other domains. 187
28397 187678 cd09620 CBM9_like_3 DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily may co-occur with various other domains. 200
28398 187679 cd09621 CBM9_like_5 DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily are often located at the C-terminus of longer proteins and may co-occur with various other functional domains such as glycosyl hydrolases. The CBM9 module in these architectures may be involved in binding to carbohydrates. 188
28399 187680 cd09622 CBM9_like_HisKa DOMON-like type 9 carbohydrate binding module at the N-terminus of bacterial sensor histidine kinases. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this family are located at the N-terminus of bacterial sensor histidine kinases and may constitute or contribute to the ligand-binding moiety. 265
28400 187681 cd09623 DOMON_EBDH Heme-binding domain of bacterial ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase (EBDH) is a bacterial molybdopterin enzyme. It catalyzes anaerobic hydroxylation of alkylaromatic compounds to secondary alcohols. The DOMON domain in EBDH and related proteins, typically called the gamma subunit, binds a heme; its function in the catalytic mechanism is unclear. It co-occurs with a molybdopterin-binding subunit and an iron-sulfur protein. This family also contains heme-binding domains of dimethylsulfide dehydrogenase, selenate reductases, and chlorate reductase. 224
28401 187682 cd09624 DOMON_b558_566 DOMON-like heme-binding domain of CbsA. This family, conserved in some lineages of the Crenarchaeota, represents a mono-heme cytochrome b558/566. CbsA is reported to be a subunit in a heterodimeric complex (CbsA-CbsB in Sulfolobus species), and appears to be glycosylated. 279
28402 187683 cd09625 DOMON_like_cytochrome DOMON-like domain of an uncharacterized protein family. This family of uncharacterized bacterial proteins contains a DOMON-like domain and an N-terminal B- or C-type cytochrome domain. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases. 348
28403 187684 cd09626 DOMON_glucodextranase_like DOMON-like domain of various glycoside hydrolases. This DOMON-like domain is found at the C-terminus of various bacterial proteins that play roles in metabolizing carbohydrates, such as glucodextranase (hydrolyzes alpha-1,6-glucosidic linkages of dextran from the non-reducing end), glucan alpha-1,4-glucosidase, pullulanase (degrades pullulan, a polysaccharide built from maltotriose units), arabinogalactan endo-1,4-beta-galactosidase, and others. Consequently, the DOMON-like domains in this family co-occur with catalytic domains from various glycosyl hydrolase families. The precise function of the DOMON domains in these proteins is not clear; they may be involved in interactions with carbohydrates. 220
28404 187685 cd09627 DOMON_murB_like DOMON-like domain of UDP-N-acetylenolpyruvoylglucosamine reductase. UDP-N-acetylenolpyruvoylglucosamine reductase (murB) catalyzes an essential step in peptidoglycan biosynthesis, the reduction of UDP-N-acetylglucosamine-enolpyruvate to UDP-N-acetylmuramate. A subset of these FAD-dependent enzymes contains a C-terminal DOMON-like domain. DOMON domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes; initially DOMON domains were suspected to confer protein-protein interactions. The DOMON-like domain in murB may bind a heme. 179
28405 187686 cd09628 DOMON_SDR_2_like DOMON domain of stromal cell-derived receptor 2 (ferric chelate reductase 1) and related proteins. Stromal cell-derived receptor 2 (or ferric chelate reductase 1) reduces Fe(3+) to Fe(2+) ahead of iron transport from the endosome to the cytoplasm. This transmembrane protein is a member of the cytochrome b561 family and contains a DOMON domain which may bind to heme or another ligand. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases. 169
28406 187687 cd09629 DOMON_CIL1_like DOMON-like domain of Brassica carinata CIL1 and similar proteins. Brassica carinata CIL1 has been described as involved in suppression of axillary meristem development. It contains a single DOMON domain, the function of which is unclear. Members in this diverse family of plant proteins may have a cytochrome b561 domain C-terminal to the DOMON domain; some members from Arabidopsis have been characterized as auxin-responsive or auxin-induced proteins. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases. 152
28407 187688 cd09630 CDH_like_cytochrome Heme-binding cytochrome domain of fungal cellobiose dehydrogenases. Cellobiose dehydrogenase (CellobioseDH or CDH) is an extracellular fungal oxidoreductase that degrades both lignin and cellulose. Specifically, CDHs oxidize cellobiose, cellodextrins, and lactose to corresponding lactones, utilizing a variety of electron acceptors. Class-II CDHs are monomeric hemoflavoenzymes that are comprised of a b-type cytochrome domain linked to a large flavodehydrogenase domain. The cytochrome domain of CDH and related enzymes, which this model describes, folds as a beta sandwich and complexes a heme molecule. It is found at the N-terminus of this family of enzymes, and belongs to the DOMON domain superfamily, a ligand-interacting motif found in all three kingdoms of life. 168
28408 187689 cd09631 DOMON_DOH DOMON-like domain of copper-dependent monooxygenases and related proteins. This diverse family characterizes DOMON domains found in dopamine beta-hydroxylase (DBH), monooxygenase X (MOX), and various other proteins, some of which contain DOMON domains exclusively; the family is not restricted to eukaryotes. DBH is a membrane-bound enzyme that converts dopamine to L-norepinephrine, and plays a central role in the metabolism of catecholamine neurotransmitters. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases. 138
28409 193606 cd09632 PliI_like Periplasmic lysozyme inhibitor, I-type (PliI) and similar proteins. Aeromonas hydrophila PliI is a dimeric periplasmic protein that enables bacteria to resist permeabilization of the outer membrane by the bactericidal action of lysozyme. PliI may be a direct inhibitor of lysozyme that inserts a conserved loop into the active site of type I (invertebrate) lysozymes. 109
28410 193607 cd09633 Deltex_C Domain found at the C-terminus of deltex-like. The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage. 131
28411 187766 cd09634 Cas1_I-II-III CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 317
28412 187767 cd09636 Cas1_I-II-III CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 260
28413 187768 cd09637 Cas4_I-A_I-B_I-C_I-D_II-B CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster 178
28414 187769 cd09638 Cas2_I_II_III CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold 90
28415 187770 cd09639 Cas3_I CRISPR/Cas system-associated protein Cas3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DEAD/DEAH box helicase DNA helicase cas3'; Often but not always is fused to HD nuclease domain; signature gene for Type I 353
28416 187771 cd09640 Cas7_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as CT1132 family 258
28417 193608 cd09641 Cas3''_I CRISPR/Cas system-associated protein Cas3''. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; HD-like nuclease, specifically digesting double-stranded oligonucleotides and preferably cleaving at G:C pairs; signature gene for Type I 200
28418 187773 cd09642 Cas8c_I-C CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csd1 family 574
28419 187774 cd09643 Csn1 CRISPR/Cas system-associated protein Cas9. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Very large protein containing McrA/HNH-nuclease related domain and a RuvC-like nuclease domain; signature gene for type II 799
28420 213407 cd09644 Csn2 CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family. 223
28421 187776 cd09645 Cas5_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 137
28422 187777 cd09646 Cas7_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Cse4/CasC family 325
28423 187778 cd09647 Csm2_III-A CRISPR/Cas system-associated protein Csm2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-A 95
28424 187779 cd09648 Cas2_I-E CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold 93
28425 187780 cd09649 Cas5_I-A CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 143
28426 187781 cd09650 Cas7_I CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as MJ0381 family 189
28427 187782 cd09651 Cas5_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family 198
28428 410980 cd09652 Cas6 Class 1 CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are typically found within types I and III CRISPR-Cas systems and are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. Cas6 is also found in the rare type IV system that includes rudimentary CRISPR-cas loci lacking the adaptation module. 258
28429 187784 cd09653 Csa5_I-A CRISPR/Cas system-associated protein Csa5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system; contains DNA binding HTH domain; also known as Csa5 family 97
28430 187785 cd09654 Cmr5_III-B CRISPR/Cas system-associated protein Cmr5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-B 127
28431 187786 cd09655 CasRa_I-A CRISPR/Cas system-associated transcriptional regulator CasRa. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system 198
28432 187787 cd09656 Cmr3_III-B CRISPR/Cas system-associated RAMP superfamily protein Cmr3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex 318
28433 187788 cd09657 Cmr1_III-B CRISPR/Cas system-associated RAMP superfamily protein Cmr1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex 132
28434 187789 cd09658 Cas5_I-B CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 181
28435 187790 cd09659 Cas4_I-A CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster 270
28436 187791 cd09660 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as MJ1666 family 394
28437 187792 cd09661 Cmr6_III-B CRISPR/Cas system-associated RAMP superfamily protein Cmr6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex 210
28438 187793 cd09662 Csm5_III-A CRISPR/Cas system-associated RAMP superfamily protein Csm5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein 365
28439 187794 cd09663 Csm4_III-A CRISPR/Cas system-associated RAMP superfamily protein Csm4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein 301
28440 187795 cd09664 Cas6_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas6e. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6e is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-E subtype; Homologous to Cas6 (RAMP superfamily protein); Possesses double RRM/ferredoxin fold; also known as Cse3 family 210
28441 187796 cd09665 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as CXXC_CXXC family 334
28442 187797 cd09666 Cas8a2_I-A CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-A subtype; also known as Csa4 family 352
28443 187798 cd09667 Csb2_I-U CRISPR/Cas system-associated protein Csb2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains; also known as GSU0054 family 418
28444 187799 cd09668 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TM1812 family 214
28445 187800 cd09669 Cse1_I-E CRISPR/Cas system-associated protein Cse1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; subunit of the Cascade complex; signature gene for I-E subtype; also known as Cse1/CasA/YgcL family 477
28446 187801 cd09670 Cse2_I-E CRISPR/Cas system-associated protein Cse2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; also known as Cse2/CasB/YgcK family; specific gene for I-E subtype 152
28447 187802 cd09671 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as DxTHG family 346
28448 187803 cd09672 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as TM1802 family 545
28449 187804 cd09673 Cas3_Cas2_I-F CRISPR/Cas system-associated protein Cas3/Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas3/Cas2 fusion; This protein includes both DEAH and HD motifs for helicase and N-terminal domain corresponding to Cas2 RNAse; signature gene for Type I and subtype I-F 1106
28450 187805 cd09674 Cas6_I-F CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6f is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-F subtype; Possesses RRM fold; also known as Csy4 family 186
28451 187806 cd09675 Csy1_I-F CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex; signature gene for I-F subtype; also known as Csy1 family 384
28452 187807 cd09676 Csy2_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog 292
28453 187808 cd09677 Csy3_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog 339
28454 187809 cd09678 Csb1_I-U CRISPR/Cas system-associated protein Csb1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family; also known as GSU0053 family 174
28455 187810 cd09679 Cas10_III CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, palm domain and Zn-ribbon; MTH326-like has inactivated polymerase catalytic domain; alr1562 and slr7011 - predicted only on the basis of size, presence of HD domain, and location with RAMPs in one operon; signature gene for type III; also known as Crm2 family 475
28456 187811 cd09680 Cas10_III CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, palm domain and Zn-ribbon; signature gene for type III; also known as Csm1 family 650
28457 187812 cd09681 Csx3_III-U CRISPR/Cas system-associated protein Csx3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein in some cases fused to Csx1 (COG1517) family domains 83
28458 187813 cd09682 Cmr4_III-B CRISPR/Cas system-associated RAMP superfamily protein Cmr4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex 242
28459 187814 cd09683 Csm3_III-A CRISPR/Cas system-associated RAMP superfamily protein Csm3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein 216
28460 187815 cd09684 Csm3_III-A CRISPR/Cas system-associated RAMP superfamily protein Csm3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein 215
28461 187816 cd09685 Cas7_I-A CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as DevR family 274
28462 187817 cd09686 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as NE0113 family 209
28463 187818 cd09687 Cas7_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Cst2/DevR family 302
28464 187819 cd09688 Cas5_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family 174
28465 187820 cd09689 Cas7_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csd2 family 278
28466 187821 cd09690 Cas7_I-B CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csh2 family 286
28467 187822 cd09691 Cas8b_I-B CRISPR/Cas system-associated protein Cas8b. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-B subtype; also known as Csh1 family 381
28468 187823 cd09692 Cas5_I-B CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 189
28469 187824 cd09693 Cas5_I CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 202
28470 187825 cd09694 Csm6_III-A CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems 181
28471 187826 cd09695 Csx16_III-U CRISPR/Cas system-associated protein Csx16. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein often seen in proximity to Csx1 (COG1517) family; also known as VVA1548 family 76
28472 187827 cd09696 Cas3_I CRISPR/Cas system-associated protein Cas3; Distinct Cas3 family with HD domain fused to C-terminus of Helicase domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DNA helicase Cas3; This protein includes both DEAH and HD motifs; signature gene for Type I 843
28473 187828 cd09697 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx8 family 441
28474 187829 cd09698 Cas8a2_I-A CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx9 family 377
28475 187830 cd09699 Csm6_III-A CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems 360
28476 187831 cd09700 Csx10 CRISPR/Cas system-associated RAMP superfamily protein Csx10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains 386
28477 187832 cd09701 Cas10_III CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, inactivated palm domain and Zn-ribbon; signature gene for type III 909
28478 187833 cd09702 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TIGR02710 family 378
28479 187834 cd09703 Cas6-I-III CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold; also known as Cse3 family 188
28480 187835 cd09704 Csx12 CRISPR/Cas system-associated protein Cas9. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Very large protein containing McrA/HNH-nuclease related domain and a RuvC-like nuclease domain; signature gene for type II 804
28481 187836 cd09705 Csf1_U CRISPR/Cas system-associated protein Csf1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein; also known as Csf1 family 202
28482 187837 cd09706 Csf2_U CRISPR/Cas system-associated RAMP superfamily protein Csf2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family 328
28483 187838 cd09707 Csf3_U CRISPR/Cas system-associated RAMP superfamily protein Csf3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein 214
28484 187839 cd09708 Csf4_U CRISPR/Cas system-associated DinG family helicase Csf4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase 632
28485 187840 cd09709 Csc2_I-D CRISPR/Cas system-associated protein Csc2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog; also known as Cse1 family 274
28486 187841 cd09710 Cas3_I-D CRISPR/Cas system-associated protein Cas3; Distinct diverged subfamily of Cas3 helicase domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Diverged DNA helicase Cas3'; signature gene for Type I and subtype I-D 353
28487 187842 cd09711 Csc1_I-D CRISPR/Cas system-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog; also known as CasA/Cse1 family 210
28488 187843 cd09712 Cas10d_I-D CRISPR/Cas system-associated protein Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain. Fused to N-terminal HD domain; signature gene for I-D subtype; also known as Csc3 family 900
28489 187844 cd09713 Cas8c_I-C CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csx13_N family 316
28490 187845 cd09714 Cas8c'_I-D CRISPR/Cas system-associated protein Cas8c'. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csx13_C family 152
28491 187846 cd09715 Csp2_I-U CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted Cas8 ortholog 474
28492 187847 cd09716 Cas5_I CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 220
28493 187848 cd09717 Cas7_I CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csp1 family 292
28494 187849 cd09718 Cas1_I-F CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 306
28495 187850 cd09719 Cas1_I-E CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 262
28496 187851 cd09720 Cas1_II CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration. Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain. 275
28497 187852 cd09721 Cas1_I-C CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 338
28498 187853 cd09722 Cas1_I-B CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain 320
28499 187854 cd09723 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as csx13 family 132
28500 187855 cd09724 CsaX_III-U CRISPR/Cas system-associated protein CsaX. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; No prediction 296
28501 187856 cd09725 Cas2_I_II_III CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold 79
28502 187857 cd09726 RAMP_I_III CRISPR/Cas system-associated RAMP superfamily protein. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily proteins 177
28503 187858 cd09727 Cas6_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas6e. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6e is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-E subtype; Homologous to Cas6 (RAMP superfamily protein); Possesses double RRM/ferredoxin fold; also known as Cse3 family 210
28504 187859 cd09728 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as DxTHG family 400
28505 187860 cd09729 Cse1_I-E CRISPR/Cas system-associated protein Cse1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; subunit of the Cascade complex; signature gene for I-E subtype; also known as Cse1/CasA/YgcL family 465
28506 187861 cd09730 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as TM1802 family 579
28507 187862 cd09731 Cse2_I-E CRISPR/Cas system-associated protein Cse2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; also known as Cse2/CasB/YgcK family; specific gene for I-E subtype 141
28508 187863 cd09732 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TM1812 family 221
28509 187864 cd09733 Cas6-I-III CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold; also known as AF0072 family 193
28510 320705 cd09734 Csb2_I-U CRISPR/Cas system-associated protein Csb2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains; also known as GSU0054 family 496
28511 187866 cd09735 Csy1_I-F CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex; signature gene for I-F subtype; also known as Csy1 family 377
28512 187867 cd09736 Csy2_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog 289
28513 187868 cd09737 Csy3_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog 329
28514 187869 cd09738 Csb1_I-U CRISPR/Cas system-associated protein Csb1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family; also known as GSU0053 family 168
28515 187870 cd09739 Cas6_I-F CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6f is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-F subtype; Possesses RRM fold; also known as Csy4 family 185
28516 187871 cd09740 Csx3_III-U CRISPR/Cas system-associated protein Csx3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein in some cases fused to Csx1 (COG1517) family domains 84
28517 187872 cd09741 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as NE0113 family 219
28518 187873 cd09742 Csm6_III-A CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems; also known as APE2256 family 183
28519 187874 cd09743 Csx16_III-U CRISPR/Cas system-associated protein Csx16. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein often seen in proximity to Csx1 (COG1517) family; also known as VVA1548 family 90
28520 187875 cd09744 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx8 family 441
28521 187876 cd09745 Cas8a2_I-A CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx9 family 377
28522 187877 cd09746 Csm6_III-A CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems 382
28523 187878 cd09747 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as Cas02710 family 378
28524 187879 cd09748 Cmr3_III-B CRISPR/Cas system-associated RAMP superfamily protein Cmr3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex 356
28525 187880 cd09749 Cmr5_III-B CRISPR/Cas system-associated protein Cmr5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-B 119
28526 187881 cd09750 Csa5_I-A CRISPR/Cas system-associated protein Csa5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system; contains DNA binding HTH domain; also known as Csa5 family 101
28527 187882 cd09751 Cas8a2_I-A CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-A subtype; also known as Csa4 family 355
28528 187534 cd09752 Cas5_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family 198
28529 187883 cd09753 Cas5_I-A CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 147
28530 187884 cd09754 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as CXXC_CXXC family 65
28531 187885 cd09755 Cas2_I-E CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold 62
28532 187886 cd09756 Cas5_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex 135
28533 187887 cd09757 Cas8c_I-C CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-C subtype; also known as Csd1 family 569
28534 213408 cd09758 Csn2 CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family. 218
28535 187889 cd09759 Cas6_I-A CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold 240
28536 187890 cd09760 Cas6_III CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex 289
28537 187662 cd09761 A3DFK9-like_SDR_c Clostridium thermocellum A3DFK9-like, a putative carbohydrate or polyalcohol metabolizing SDR, classical (c) SDRs. This subgroup includes a putative carbohydrate or polyalcohol metabolizing SDR (A3DFK9) from Clostridium thermocellum. Its members have a TGXXXGXG classical-SDR glycine-rich NAD-binding motif, and some have a canonical SDR active site tetrad (A3DFK9 lacks the upstream Asn). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 242
28538 187663 cd09762 HSDL2_SDR_c human hydroxysteroid dehydrogenase-like protein 2 (HSDL2), classical (c) SDRs. This subgroup includes human HSDL2 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. HSDL2 may play a part in fatty acid metabolism, as it is found in peroxisomes. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs. 243
28539 187664 cd09763 DHRS1-like_SDR_c human dehydrogenase/reductase (SDR family) member 1 (DHRS1) -like, classical (c) SDRs. This subgroup includes human DHRS1 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. DHRS1 mRNA has been detected in many tissues, liver, heart, skeletal muscle, kidney and pancreas; a longer transcript is predominantly expressed in the liver, a shorter one in the heart. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser).
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs. 265
28540 187733 cd09764 Csb3_I-U CRISPR/Cas system-associated RAMP superfamily protein Csb3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Might be a catalytically active RNA endoribonuclease 341
28541 187734 cd09765 Csx14_I-U CRISPR/Cas system-associated protein Csx14. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein containing C-terminal alpha-helical domain resembling Cas8a2, also known as GSU0052 272
28542 187735 cd09766 Csx15_I-U CRISPR/Cas system-associated protein Csx15. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein loosely associated with CRISPR/Cas systems; some are fused to AAA ATPase domain, also known as TTE2665 family 101
28543 187705 cd09767 Csx17_I-U CRISPR/Cas system-associated protein Csx17. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex 652
28544 188874 cd09768 Luminal_EIF2AK3 The Luminal domain, a dimerization domain, of the Serine/Threonine protein kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 3. The Luminal domain is a dimerization domain present in eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), also called PKR-like Endoplasmic Reticulum Kinase (PERK). EIF2AK3 is a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). As an EIF2AK, it phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis: General Control Non-derepressible-2 (GCN2), protein kinase regulated by RNA (PKR), heme-regulated inhibitor kinase (HRI), and PERK. PERK contains a luminal domain bound with the chaperone BiP under unstressed conditions and a cytoplasmic catalytic kinase domain. In response to the accumulation of misfolded or unfolded proteins in the ER, PERK is activated through the release of BiP, allowing it to dimerize through its luminal domain and autophosphorylate. It functions as the central regulator of translational control during the Unfolded Protein Response (UPR) pathway. In addition to the eIF-2 alpha subunit, PERK also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR. 301
28545 188875 cd09769 Luminal_IRE1 The Luminal domain, a dimerization domain, of the Serine/Threonine protein kinase, Inositol-requiring protein 1. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is a kinase receptor that also contains an endoribonuclease domain in the cytoplasmic side. It plays roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its luminal domain and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). IRE1alpha is expressed in all cells and tissues while IRE1beta is found only in intestinal epithelial cells. 295
28546 197361 cd09803 UBAN polyubiquitin binding domain of NEMO and related proteins. NEMO (NF-kappaB essential modulator) is a regulatory subunit of the kinase complex IKK, which is involved in the activation of NF-kappaB via phosphorylation of inhibitory IkappaBs. This mechanism requires the binding of NEMO to ubiquitinated substrates. Binding is achieved via the UBAN motif (ubiquitin binding in ABIN and NEMO), which is described in this model. This region of NEMO has also been named CoZi (for coiled-coil 2 and leucine zipper). ABINs (A20-binding inhibitors of NF-kappaB) are sensors for ubiquitin that are involved in regulation of apoptosis; ABIN-1 is presumed to inhibit signalling via the NF-kappaB route. The UBAN motif is also found in optineurin, the product of a gene associated with glaucoma, which has been characterized as a negative regulator of NF-kappaB as well. 87
28547 197362 cd09804 Dcp1 mRNA decapping enzyme 1 (Dcp1). mRNA decapping enzyme 1 (Dcp1), together with Dcp2, is part of the decapping complex which catalyzes the removal of the 5' cap structure of mRNA. This decapping reaction is an essential step in mRNA degradation, by exposing the 5' end for exonucleolytic digestion. Dcp1 binds to the N-terminal helical domain of catalytic subunit Dcp2 and enhances its function by promoting Dcp2's closed conformation which is catalytically more active. 121
28548 187665 cd09805 type2_17beta_HSD-like_SDR_c human 17beta-hydroxysteroid dehydrogenase type 2 (type 2 17beta-HSD)-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. This classical-SDR subgroup includes the human proteins: type 2 17beta-HSD, type 6 17beta-HSD, type 2 11beta-HSD, dehydrogenase/reductase SDR family member 9, short-chain dehydrogenase/reductase family 9C member 7, 3-hydroxybutyrate dehydrogenase type 1, and retinol dehydrogenase 5. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 281
28549 187666 cd09806 type1_17beta-HSD-like_SDR_c human estrogenic 17beta-hydroxysteroid dehydrogenase type 1 (type 1 17beta-HSD)-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. This classical SDR subgroup includes human type 1 17beta-HSD, human retinol dehydrogenase 8, zebrafish photoreceptor associated retinol dehydrogenase type 2, and a chicken ovary-specific 17beta-hydroxysteroid dehydrogenase. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 258
28550 212495 cd09807 retinol-DH_like_SDR_c retinol dehydrogenases (retinol-DHs), classical (c) SDRs. Classical SDR-like subgroup containing retinol-DHs and related proteins. Retinol is processed by a medium chain alcohol dehydrogenase followed by retinol-DHs. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. This subgroup includes the human proteins: retinol dehydrogenase -12, -13, and -14. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 274
28551 187668 cd09808 DHRS-12_like_SDR_c-like human dehydrogenase/reductase SDR family member (DHRS)-12/FLJ13639-like, classical (c)-like SDRs. Classical SDR-like subgroup containing human DHRS-12/FLJ13639, the 36K protein of zebrafish CNS myelin, and related proteins. DHRS-12/FLJ13639 is expressed in neurons and oligodendrocytes in the human cerebral cortex. Proteins in this subgroup share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 255
28552 187669 cd09809 human_WWOX_like_SDR_c-like human WWOX (WW domain-containing oxidoreductase)-like, classical (c)-like SDRs. Classical-like SDR domain of human WWOX and related proteins. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 284
28553 187670 cd09810 LPOR_like_SDR_c_like light-dependent protochlorophyllide reductase (LPOR)-like, classical (c)-like SDRs. Classical SDR-like subgroup containing LPOR and related proteins. Protochlorophyllide (Pchlide) reductases act in chlorophyll biosynthesis. There are distinct enzymes that catalyze Pchlide reduction in light or dark conditions. Light-dependent reduction is via an NADP-dependent SDR, LPOR. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 311
28554 187671 cd09811 3b-HSD_HSDB1_like_SDR_e human 3beta-HSD (hydroxysteroid dehydrogenase) and HSD3B1(delta 5-delta 4-isomerase)-like, extended (e) SDRs. This extended-SDR subgroup includes human 3 beta-HSD/HSD3B1 and C(27) 3beta-HSD/ [3beta-hydroxy-delta(5)-C(27)-steroid oxidoreductase; HSD3B7], and related proteins. These proteins have the characteristic active site tetrad and NAD(P)-binding motif of extended SDRs. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. C(27) 3beta-HSD is a membrane-bound enzyme of the endoplasmic reticulum, it catalyzes the isomerization and oxidation of 7alpha-hydroxylated sterol intermediates, an early step in bile acid biosynthesis. Mutations in the human gene encoding C(27) 3beta-HSD underlie a rare autosomal recessive form of neonatal cholestasis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 354
28555 187672 cd09812 3b-HSD_like_1_SDR_e 3beta-hydroxysteroid dehydrogenase (3b-HSD)-like, subgroup1, extended (e) SDRs. An uncharacterized subgroup of the 3b-HSD-like extended-SDR family. Proteins in this subgroup have the characteristic active site tetrad and NAD(P)-binding motif of extended-SDRs. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 339
28556 187673 cd09813 3b-HSD-NSDHL-like_SDR_e human NSDHL (NAD(P)H steroid dehydrogenase-like protein)-like, extended (e) SDRs. This subgroup includes human NSDHL and related proteins. These proteins have the characteristic active site tetrad of extended SDRs, and also have a close match to their NAD(P)-binding motif. Human NSDHL is a 3beta-hydroxysteroid dehydrogenase (3 beta-HSD) which functions in the cholesterol biosynthetic pathway. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. Mutations in the gene encoding NSDHL cause CHILD syndrome (congenital hemidysplasia with ichthyosiform nevus and limb defects), an X-linked dominant, male-lethal trait. This subgroup also includes an unusual bifunctional [3beta-hydroxysteroid dehydrogenase (3b-HSD)/C-4 decarboxylase from Arabidopsis thaliana, and Saccharomyces cerevisiae ERG26, a 3b-HSD/C-4 decarboxylase, involved in the synthesis of ergosterol, the major sterol of yeast. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 335
28557 381167 cd09815 TP_methylase S-AdoMet-dependent tetrapyrrole methylases. This superfamily uses S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA. Other superfamily members not involved in cobalamin biosynthesis include the N-terminal tetrapyrrole methylase domain of Bacillus subtilis YabN whose specific function is unknown, and Omphalotus olearius omphalotin methyltransferase which catalyzes the automethylation of its own C-terminus; this C terminus is subsequently released and macrocyclized to give Omphalotin A, a potent nematicide. 219
28558 188648 cd09816 prostaglandin_endoperoxide_synthase Animal prostaglandin endoperoxide synthase and related bacterial proteins. Animal prostaglandin endoperoxide synthases, including prostaglandin H2 synthase and a set of similar bacterial proteins which may function as cyclooxygenases. Prostaglandin H2 synthase catalyzes the synthesis of prostaglandin H2 from arachidonic acid. In two reaction steps, arachidonic acid is converted to Prostaglandin G2, a peroxide (cyclooxygenase activity) and subsequently converted to the end product via the enzyme's peroxidase activity. Prostaglandin H2 synthase is the target of aspirin and other non-steroid anti-inflammatory drugs such as ibuprofen, which block the substrate's access to the active site and may acetylate a conserved serine residue. In humans and other mammals, prostaglandin H2 synthase (PGHS), also called cyclooxygenase (COX) is present as at least two isozymes, PGHS-1 (or COX-1) and PGHS-2 (or COX-2), respectively. PGHS-1 is expressed constitutively in most mammalian cells, while the expression of PGHS-2 is induced via inflammation response in endothelial cells, activated macrophages, and others. COX-3 is a splice variant of COX-1. 490
28559 188649 cd09817 linoleate_diol_synthase_like Linoleate (8R)-dioxygenase and related enzymes. These fungal enzymes, related to animal heme peroxidases, catalyze the oxygenation of linoleate and similar targets. Linoleate (8R)-dioxygenase, also called linoleate:oxygen 7S,8S-oxidoreductase, generates (9Z,12Z)-(7S,8S)-dihydroxyoctadeca-9,12-dienoate as a product. Other members are 5,8-linoleate dioxygenase (LDS, ppoA) and linoleate 10R-dioxygenase (ppoC), involved in the biosynthesis of oxylipins. 550
28560 188650 cd09818 PIOX_like Animal heme oxidases similar to plant pathogen-inducible oxygenases. This is a diverse family of oxygenases related to the animal heme peroxidases, with members from plants, animals, and bacteria. The plant pathogen-inducible oxygenases (PIOX) oxygenate fatty acids into 2R-hydroperoxides. They may be involved in the hypersensitive reaction, rapid and localized cell death induced by infection with pathogens, and the rapidly induced expression of PIOX may be caused by the oxidative burst that occurs in the process of cell death. 484
28561 188651 cd09819 An_peroxidase_bacterial_1 Uncharacterized bacterial family of heme peroxidases. Animal heme peroxidases are a diverse family of enzymes which are not restricted to metazoans; members are also found in fungi, plants, and bacteria, like this family of uncharacterized proteins. 465
28562 188652 cd09820 dual_peroxidase_like Dual oxidase and related animal heme peroxidases. Animal heme peroxidases of the dual-oxidase like subfamily play vital roles in the innate mucosal immunity of gut epithelia. They provide reactive oxygen species which help control infection. 558
28563 188653 cd09821 An_peroxidase_bacterial_2 Uncharacterized bacterial family of heme peroxidases. Animal heme peroxidases are a diverse family of enzymes which are not restricted to metazoans; members are also found in fungi, plants, and bacteria, like this family of uncharacterized proteins. 570
28564 188654 cd09822 peroxinectin_like_bacterial Uncharacterized family of heme peroxidases, mostly bacterial. Animal heme peroxidases are a diverse family of enzymes which are not restricted to animals. Members are also found in metazoans, fungi, and plants, and also in bacteria, like most members of this family of uncharacterized proteins. 420
28565 188655 cd09823 peroxinectin_like peroxinectin_like animal heme peroxidases. Peroxinectin is an arthropod protein that plays a role in invertebrate immunity mechanisms. Specifically, peroxinectins are secreted as cell-adhesive and opsonic peroxidases. The immunity mechanism appears to involve an interaction between peroxinectin and a transmembrane receptor of the integrin family. Human myeloperoxidase, which is included in this wider family, has also been reported to interact with integrins. 378
28566 188656 cd09824 myeloperoxidase_like Myeloperoxidases, eosinophil peroxidases, and lactoperoxidases. This well conserved family of animal heme peroxidases contains members with somewhat diverse functions. Myeloperoxidases are lysosomal proteins found in azurophilic granules of neutrophils and the lysosomes of monocytes. They are involved in the formation of microbicidal agents upon activation of activated neutrophils (neutrophils undergoing respiratory bursts as a result of phagocytosis), by catalyzing the conversion of hydrogen peroxide to hypochlorous acid. As a heme protein, myeloperoxidase is responsible for the greenish tint of pus, which is rich in neutrophils. Eosinophil peroxidases are haloperoxidases as well, preferring bromide over chloride. Expressed by eosinophil granulocytes, they are involved in attacking multicellular parasites and play roles in various inflammatory diseases such as asthma. The haloperoxidase lactoperoxidase is secreted from mucosal glands and provides antibacterial activity by oxidizing a variety of substrates such as bromide or chloride in the presence of hydrogen peroxide. 411
28567 188657 cd09825 thyroid_peroxidase Thyroid peroxidase (TPO). TPO is a member of the animal heme peroxidase family, which is expressed in the thyroid and involved in the processing of iodine and iodine compounds. Specifically, TPO oxidizes iodide via hydrogen peroxide to form active iodine, which is then, for example, incorporated into the tyrosine residues of thyroglobulin to yield mono- and di-iodotyrosines. 565
28568 188658 cd09826 peroxidasin_like Animal heme peroxidase domain of peroxidasin and related proteins. Peroxidasin is a secreted heme peroxidase which is involved in hydrogen peroxide metabolism and peroxidative reactions in the cardiovascular system. The domain co-occurs with extracellular matrix domains and may play a role in the formation of the extracellular matrix. 440
28569 193602 cd09827 PET_Prickle The PET domain of Prickle. The PET domain of Prickle: Prickle contains an N-terminal PET domain and three C-terminal LIM domains. Prickle has been implicated in regulation of cell movement in the planar cell polarity (PCP) pathway which requires the conserved Frizzled/Dishevelled (Dsh); Prickle interacts with Dishevelled, thereby modulating the activity of Frizzled/Dishevelled and the PCP signaling. Two forms of Prickle have been identified, namely Prickle 1 and Prickle 2. These are differentially expressed; Prickle 1 is found in fetal heart and hematological malignancies, while Prickle 2 is expressed in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. The PET domain is a protein-protein interaction domain, usually found in conjunction with the LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. 97
28570 193603 cd09828 PET_OEBT The PET domain of overexpressed breast tumor protein (OEBT). The PET domain of overexpressed breast tumor protein (OEBT): OEBT contains an N-terminal PET domain and two C-terminal LIM domains, and is predicted to be localized in the nucleus. The expression pattern of OEBT in malignant tissues indicates a possible role of OEBT in cancer differentiation. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. 116
28571 193604 cd09829 PET_testin The PET domain of Testin. The PET domain of Testin: Testin contains a PET domain at the N-terminus and three C-terminal LIM domains. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and is involved in cell motility and adhesion events. Knockout mice experiments reveal a tumor repressor function of Testin. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. 88
28572 193605 cd09830 PET_LIMPETin_LIM-9 The PET domain of protein LIMPETin and LIM-9. The PET domain of protein LIMPETin and LIM-9: Members of this family contain an N-terminal PET domain and five to six LIM domains at the C-terminus. Four of the six LIM domains are highly homologous to the four-and-half LIM (FHL) domain family while the other two show sequence similarity to LIM domains of the Testin family. Thus, proteins of this family may be the recombinant product of genes coding Testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 binds to UNC-97 and UNC-96, components of sarcomeric muscle M-lines. LIM-9 also forms a complex with SCPL-1 and UNC-89, whose function is to organize sarcomeric A-bands, especially the M-line of muscle. Thus, it might play a role in regulating the assembly and maintenance of muscle A-band. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. 83
28573 341402 cd09831 CBS_pair_ABC_Gly_Pro_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the glycine betaine/L-proline ABC transporter. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 116
28574 341403 cd09833 CBS_pair_GGDEF_PAS_repeat1 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors, repeat 1. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 116
28575 341404 cd09834 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 118
28576 341405 cd09836 CBS_pair_arch Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 116
28577 341406 cd09837 CBS_pair_chlorobiales Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in chlorobiales. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 111
28578 341074 cd09839 M1_like_TAF2 TATA binding protein (TBP) associated factor 2. This family includes TATA binding protein (TBP) associated factor 2 (TAF2, TBP-associated factor TAFII150, transcription initiation factor TFIID subunit 2, RNA polymerase II TBP-associated factor subunit B), and has homology to the M1 gluzincin family. TAF2 is part of the TFIID multidomain subunit complex essential for transcription of most protein-encoded genes by RNA polymerase II. TAF2 is known to interact with the initiator element (Inr) found at the transcription start site of many genes, thus possibly playing a key role in promoter binding as well as start-site selection. Image analysis has shown TAF2 to form a complex with TAF1 and TBP, inferring its role in promoter recognition. Peptidases in the M1 family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. TAF2, however, lacks these active site residues. 531
28579 188871 cd09840 LIM2_CRP2 The second LIM domain of Cysteine Rich Protein 2 (CRP2). The second LIM domain of Cysteine Rich Protein 2 (CRP2): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits and in the organization and arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. The second LIM domain of CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4, thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 54
28580 188872 cd09841 LIM1_Prickle_3 The first LIM domain of Prickle 3. The first LIM domain of Prickle 3/LIM domain only 6 (LM06): Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP). PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 59
28581 197300 cd09842 PLDc_vPLD1_1 Catalytic domain, repeat 1, of vertebrate phospholipase D1. Catalytic domain, repeat 1, of vertebrate phospholipase D1 (PLD1). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD1 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 151
28582 197301 cd09843 PLDc_vPLD2_1 Catalytic domain, repeat 1, of vertebrate phospholipase D2. Catalytic domain, repeat 1, of vertebrate phospholipase D2 (PLD2). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. They also catalyze a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD2 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 145
28583 197302 cd09844 PLDc_vPLD1_2 Catalytic domain, repeat 2, of vertebrate phospholipase D1. Catalytic domain, repeat 2, of vertebrate phospholipase D1 (PLD1). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD1 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 182
28584 197303 cd09845 PLDc_vPLD2_2 Catalytic domain, repeat 2, of vertebrate phospholipase D2. Catalytic domain, repeat 2, of vertebrate phospholipase D2 (PLD2). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. They also catalyze a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD2 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. 182
28585 197363 cd09846 DUF1312 N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431 are part of DUF1312. Domains of Unknown Function 1312 (DUF1312) are represented in at least 71 bacterial species with no functional annotation. Included in this family are N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431, having similar structure and surface features that appear to be conserved across these domain families, suggesting similar function. NusG contains NGN at the N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at the C-terminus in bacteria and archaea, and this insert (often known as Domain II) is found in several bacteria. Lin0431 is similar to NGN-insert but does not contain the disulfide bridge. 81
28586 349946 cd09848 M28_TfR M28 Zn-peptidase Transferrin Receptor family. Peptidase M28 family; Transferrin Receptor (TfR) subfamily. TfRs are homodimeric type II transmembrane proteins containing three distinct domains: protease-like, apical or protease-associated (PA), and helical domains. The protease-like domain is a large extracellular portion (ectodomain). In TfR, it contains a binding site for the transferrin molecule and has 28% identity to membrane glutamate carboxypeptidase II (mGCP-II or PSMA). The PA domain is inserted between the first and second strands of the central beta sheet in the protease-like domain. TfR1 is widely expressed, and is a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and the complex is then internalized. TfR1 may also participate in cell growth and proliferation. TfR2 binds Tf but with a significantly lower affinity than TfR1. It is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis; in humans, mutations in TfR2 are associated with a form of hemochromatosis (HFE3). While related in sequence to peptidase M28 glutamate carboxypeptidase II (also called prostate-specific membrane antigen or PSMA), TfR lacks the metal ion coordination centers and protease activity of that group. 285
28587 349947 cd09849 M20_Acy1L2-like M20 Peptidase aminoacylase 1-like protein 2, amidohydrolase family. Peptidase M20 family, aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase)-like subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli , to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). 389
28588 197367 cd09850 Ebola-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region of the transmembrane subunit of Filoviridae viruses, Ebola virus and Marburg virus, and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Marburg virus gp, and the envelope proteins of various ERVs, including human HERV-R_c7q21.2 (ERV-3). This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1s helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host. However, it is unclear whether ERV-3 has a critical biological role: it is expressed in the placenta, but is not fusogenic, has an immunosuppressive domain, but lacks a fusion peptide. Filoviridae, the family of viruses including Ebola and Marburg, may have acquired this domain via horizontal transfer from retroviruses. 77
28589 197368 cd09851 HTLV-1-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of human T-cell leukemia virus type 1 (HTLV-1), and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane(TM) subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including HTLV-1, HTLV -2, primate Mason-Pfizer monkey virus, Moloney murine leukemia virus, simian T-cell lymphotropic virus, feline leukemia virus (FeLV), bovine leukemia virus, and various human endogenous retroviruses (HERVs), including, HERV-H1_c2q24.3, HERV-H2_3q26, HERV-F(c)1_cXq21.33, HERV-T_19q13.11, Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), Syncytin-2 (HERV-FRD_6p24.1), and related domains. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1s helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as FeLV. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Syncytin-1 and Syncytin-2 are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. 
Syncytin-2, but not Syncytin-1, is immunosuppressive; its immunosuppressive domain may protect the fetus from the mother's immune system. Syncytin-1 may participate in the formation of the placental trophoblast; it is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This subfamily also contains a mouse envelope protein encoded by the Fv-4 env gene that blocks infection by exogenous MuLV. 78
28590 350203 cd09852 PIN_SF PIN (PilT N terminus) domain: Superfamily. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. The PIN domain superfamily includes: the FEN-like PIN domain family such as the PIN domains of Flap endonuclease-1 (FEN1), exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. It also includes the Mut7-C PIN domain family, which is not represented here as it is a shortened version of the PIN fold and lacks a core strand and helix (H3 and S3). The Mut7-C PIN domain family includes the C-terminus of Caenorhabditis elegans exonuclease Mut-7. 114
28591 350204 cd09853 PIN_FEN-like FEN-like PIN domains of structure-specific 5' nucleases (or Flap endonuclease-1-like) involved in DNA replication, repair, and recombination. Structure-specific 5' nucleases are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner. The family includes the PIN (PilT N terminus) domains of Flap endonuclease-1 (FEN1), exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the PIN domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4- and T5-5' nucleases, and other homologs. Canonical members of this FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 174
28592 350205 cd09854 PIN_VapC-like VapC-like PIN domains of VapC and Smg6 ribonucleases, ribosome assembly factor NOB1, rRNA-processing protein Fcf1, Archaeoglobus fulgidus AF0591 protein, and homologs. PIN (PilT N terminus) domains of such ribonucleases as the toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1, are included in this VapC-like family. Also included are the PIN domains of the Pyrobaculum aerophilum Pea0151 and Archaeoglobus fulgidus AF0591 proteins and other similar archaeal homologs. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
28593 350206 cd09856 PIN_FEN1-like FEN-like PIN domains of Flap endonuclease-1 (FEN1)-like, structure-specific, divalent-metal-ion dependent, 5' nucleases. PIN (PilT N terminus) domain of Flap endonuclease-1 (FEN1)-like nucleases: FEN1, Gap endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Nucleases in this subfamily are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 235
28594 350207 cd09857 PIN_EXO1 FEN-like PIN domains of Exonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR)), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively. 202
28595 350208 cd09858 PIN_MKT1 FEN-like PIN domains of Mkt1, a global regulator of mRNAs encoding mitochondrial proteins and eukaryotic homologs. The Mkt1 gene product interacts with the Poly(A)-binding protein associated factor, Pbp1, and is present at the 3' end of RNA transcripts during translation. The Mkt1-Pbp1 complex is involved in the post-transcriptional regulation of HO endonuclease expression. Mkt1 and eukaryotic homologs are atypical members of the structure-specific, 5' nuclease family (FEN-like). Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. Although Mkt1 appears to possess both a PIN and H3TH domain, the Mkt1 PIN domain lacks several of the active site residues necessary to bind essential divalent metal ion cofactors (Mg2+/Mn2+) required for nuclease activity in this family. Also, Mkt1 lacks the glycine-rich loop in the H3TH domain which is proposed to facilitate duplex DNA binding. 206
28596 350209 cd09859 PIN_53EXO FEN-like PIN domains of PIN domain of the 5'-3' exonuclease of Thermus aquaticus DNA polymerase I (Taq) and homologs. The 5'-3' exonuclease (53EXO) PIN (PilT N terminus) domain of multi-domain DNA polymerase I and single domain protein homologs are included in this family. Taq contains a polymerase domain for synthesizing a new DNA strand and a 53EXO PIN domain for cleaving RNA primers or damaged DNA strands. Taq's 53EXO PIN domain recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The 53EXO PIN domain cleaves the unpaired 5'-arm of the overlap flap DNA substrate. 5'-3' exonucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 160
28597 350210 cd09860 PIN_T4-like FEN-like PIN domains of bacteriophage T3, T4 RNase H, T5-5'nuclease, and homologs. PIN (PilT N terminus) domain of bacteriophage T5-5'nuclease (5'-3' exonuclease or T5FEN), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar 5' nucleases are included in this family. T5-5'nuclease is a 5'-3'exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'- 3'exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. Bacteriophage T3 is believed to function in the removal of DNA-linked RNA primers and is essential for phage DNA replication and also necessary for host DNA degradation and phage genetic recombination. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. In the T5-5'nuclease, structure-specific endonuclease activity requires binding of a single metal ion in the high-affinity metal binding site 1, whereas exonuclease activity requires both the high-affinity metal binding site 1 and the low-affinity metal binding site 2 to be occupied by a divalent cofactor. The T5-5'nuclease is reported to be able to bind several metal ions, including Mg2+, Mn2+, Zn2+ and Co2+, as cofactors. 158
28598 350211 cd09862 PIN_Rrp44-like VapC-like PIN domain of yeast exosome subunit Rrp44 endoribonuclease and other eukaryotic homologs. PIN (PilT N terminus) domain of the Saccharomyces cerevisiae exosome subunit Rrp44 (Ribosomal RNA-processing protein 44 or Protein Dis3 homolog) and other similar eukaryotic homologs are included in this family. The eukaryotic exosome is a conserved macromolecular complex responsible for many RNA-processing and RNA-degradation reactions. It is composed of nine core subunits that directly binds Rrp44. The Rrp44 nuclease is the catalytic subunit of the exosome and has endonuclease activity in the PIN domain and an exoribonuclease activity in its RNase II-like region. Rrp44 binding to the exosome is mediated mainly by the PIN domain and by subunits Rrp41-Rrp45, and binding predictions indicate that the PIN domain active site is positioned on the outer surface of the exosome. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. PIN domains within this subgroup contain four of these residues which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. 
Recombinant Rrp44 was shown to possess manganese-dependent endonuclease activity in vitro that was abolished by point mutations in these putative metal binding residues of its PIN domain. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 178
28599 350212 cd09864 PIN_Fcf1-like VapC-like PIN domain of rRNA-processing protein, Fcf1 (Utp24, YDR339C), and other eukaryotic homologs. Fcf1/Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) is an essential protein involved in pre-rRNA processing and 40S ribosomal subunit assembly. Component of the small subunit (SSU) processome, Fcf1 is an essential nucleolar protein that is required for processing of the 18S pre-rRNA at sites A0-A2. The Fcf1 protein was reported to interact with Pmc1p (vacuolar Ca2+ ATPase) and Cor1p (core subunit of the ubiquinol-cytochrome c reductase complex). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. Most members of the Fcf1 PIN domain subfamily have four of these conserved residues and the Fcf1-Utp23 homolog PIN domain subfamily has three. Point mutation studies of the conserved acidic residues in the putative active site of Saccharomyces cerevisiae Fcf1 determined they were essential for pre-rRNA processing at sites A1 and A2, whereas the presence of the Fcf1 protein itself is also required for cleavage at site A0. 131
28600 350213 cd09865 PIN_ScUtp23p-like VapC-like PIN domain of rRNA-processing protein, Utp23 (YOR004W), and other fungal homologs. Saccharomyces cerevisiae Utp23 (U three-associated protein 23), component of the small subunit (SSU) processome, is an essential protein involved in pre-rRNA processing and 40S ribosomal subunit assembly. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily, including S. cerevisiae Utp23, lack several of these key catalytic residues. Mutation of the remaining conserved putative active site residues seen in Utp23 did not interfere with rRNA maturation and cell viability. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 149
28601 350214 cd09866 PIN_Fcf1-Utp23-H VapC-like PIN domain of rRNA-processing protein Fcf1- and Utp23-like homologs found in eukaryotes except fungi; similar to human rRNA-processing protein UTP23. PIN domain homologs of Fcf1/Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23, essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly, are included in this subfamily. It includes human UTP24 (hUTP24), which plays a crucial role in human rRNA processing and is essential for accurate endonucleolytic cleavage at the 5'-end of 18S rRNA. Fcf1 is a component of the small subunit (SSU) processome and an essential nucleolar protein required for processing of the 18S pre-rRNA at sites A0-A2. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The Fcf1-Utp23 homolog PIN domain subfamily has three of these conserved acidic residues rather than the four seen in the Fcf1 PIN domain subfamily. 130
28602 350215 cd09867 PIN_FEN1 FEN-like PIN domains of Flap endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. Flap endonuclease-1 (FEN1) is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. FEN1 belongs to the FEN1-EXO1-like subfamily of structure-specific, 5' nucleases (FEN-like family). Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA. 251
28603 350216 cd09868 PIN_XPG_RAD2 FEN-like PIN domains of Xeroderma pigmentosum complementation group G (XPG) nuclease, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. The Xeroderma pigmentosum complementation group G (XPG) nuclease plays a central role in nucleotide excision repair (NER) in cleaving DNA bubble structures or loops. XPG is a member of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 209
28604 350217 cd09869 PIN_GEN1 FEN-like PIN domains of Gap Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. Gap Endonuclease 1 (GEN1) is a Holliday junction resolvase reported to symmetrically cleave Holliday junctions and allow religation without additional processing. GEN1 is a member of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 227
28605 350218 cd09870 PIN_YEN1 FEN-like PIN domains of Saccharomyces cerevisiae endonuclease 1 (YEN1), Chaetomium thermophilum junction-resolving enzyme GEN1, and fungal homologs. Fungal Endonuclease 1 (YEN1 and GEN1, GEN1 is known as YEN1 in Saccharomyces cerevisiae) is a four-way (Holliday) junction resolvase. Members of this subgroup belong to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 229
28606 350219 cd09871 PIN_MtVapC28-VapC30-like VapC-like PIN domain of Mycobacterium tuberculosis VapC28 and 30 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC28 and VapC30 toxins. M. tuberculosis VapC28 and VapC30 both cleave tRNA25Ser-TGA and tRNA28Ser-CGA. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
28607 350220 cd09872 PIN_Sll0205-like VapC-like PIN domain of Sll0205 protein and homologs. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of the Synechocystis sp. (strain PCC 6803) Sll0205 protein and other uncharacterized homologs are included in this subfamily. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 125
28608 350221 cd09873 PIN_Pae0151-like VapC-like PIN domain of the Pyrobaculum aerophilum Pae0151 and Pae2754 proteins and homologs. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of the Pyrobaculum aerophilum proteins, Pae0151 and Pae2754, and homologs are included in this subfamily. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
28609 350222 cd09874 PIN_MT3492-like VapC-like PIN domain of the hypothetical protein MT3492 of Mycobacterium tuberculosis CDC1551 and other uncharacterized, annotated PilT protein domain proteins. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis CDC1551, hypothetical protein MT3492, and similar bacterial and archaeal proteins are included in this subfamily. They are PIN domain homologs of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 134
28610 350223 cd09875 PIN_VapC-FitB-like VapC-like PIN domain of ribonucleases (toxins), VapC and FitB, of prokaryotic toxin/antitoxin operons, Pyrococcus horikoshii protein PH0500, and other similar bacterial and archaeal homologs. PIN (PilT N terminus) domain-containing proteins of prokaryotic toxin/antitoxin (TA) operons, such as, Mycobacterium tuberculosis VapC of the VapBC (virulence associated proteins) TA operon, and Neisseria gonorrhoeae FitB of the FitAB (fast intracellular trafficking) TA operon, as well as, the archaeal Pyrococcus horikoshii protein PH0500 are included in this family. Toxins of TA operons are believed to be involved in growth inhibition by regulating translation and are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the complex activates the ribonuclease activity of the toxin. In N. gonorrhoeae, FitA and FitB form a heterodimer: FitA is the DNA binding subunit and FitB contains a ribonuclease activity that is blocked by the presence of FitA. A tetramer of FitAB heterodimers binds DNA from the fitAB upstream promoter region with high affinity. This results in both sequestration of FitAB and repression of fitAB transcription. It is thought that FitAB release from the DNA and subsequent dissociation both slows N. gonorrhoeae replication and transcytosis by an as yet undefined mechanism. The toxin M. tuberculosis VapC is a structural homolog of N. gonorrhoeae FitB, but their antitoxin partners, VapB and FitA, respectively, differ structurally. The M. tuberculosis VapC-5 is proposed to be both an endoribonuclease and an exoribonuclease that can act on free RNA in a similar manner to the endo and exonuclease Flap endonuclease-1 (FEN1). VapC-like toxins are structural homologs of FEN1-like PIN domains, but lack the extensive arch/clamp region and the H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region, seen in FEN1-like PIN domains. 
PIN domains within this group typically contain three or four conserved acidic residues that cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. These putative active site residues are thought to bind Mg2+ and/or Mn2+ ions and be essential for single-stranded ribonuclease activity. VapC-like PIN domains are single domain proteins that form dimers and dimerization configures the active sites in a groove along the long-axis of the structure. 130
28611 350224 cd09876 PIN_Nob1-like VapC-like PIN domain of eukaryotic ribosome assembly factor Nob1 and archaeal UPF0129 protein Ta0041-like homologs. PIN (PilT N terminus) domain of the Saccharomyces cerevisiae ribosome assembly factor, Nob1 (Nin one binding) protein, the Thermoplasma acidophilum DSM 1728, UPF0129 protein Ta0041, and similar eukaryotic and archaeal homologs are included in this family. The Nob1 PIN domain binds the single-stranded cleavage site D at the 3-prime end of 18S rRNA. Recombinant Nob1 binds as a tetramer to pre-18S rRNA fragments containing cleavage site D and believed to cleave at this site. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_6. 112
28612 350225 cd09877 PIN_YacL-like VapC-like PIN domain of Thermus thermophilus HB8, uncharacterized Bacillus subtilis YacL, and other bacterial homologs. PIN (PilT N terminus) domain of the conserved membrane protein of unknown function of Thermus thermophilus HB8, Bacillus subtilis YacL and other similar homologs are included in this family. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Proteins in this group have a C-terminal TRAM domain whose function is unknown but which is predicted to be an RNA-binding domain common to tRNA uracil methylation and adenine thiolation enzymes. 127
28613 350226 cd09878 PIN_VapC_VirB11L-ATPase-like VapC-like PIN domain of an uncharacterized AAA+, VirB11-like ATPase-, KH- and PIN-domain containing protein MJ1533 from Methanocaldococcus jannaschii DSM 2661, and other similar archaeal homologs. PIN (PilT N terminus) domain present N-terminal of AAA+, VirB11-like ATPases. Several members of this subfamily possess an AAA+, VirB11-like ATPase domain, flanked by PIN and KH nucleic acid-binding domains. VirB11-ATPase is a type IV secretory pathway component required for T-pilus biogenesis and virulence. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. PIN domains within this subgroup contain four of these highly conserved residues which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 125
28614 350227 cd09879 PIN_VapC_AF0591-like VapC-like PIN domain of Archaeoglobus fulgidus AF0591 protein and other similar archaeal homologs. PIN (PilT N terminus) domain of Archaeoglobus fulgidus AF0591 protein and other similar uncharacterized archaeal homologs are included in this family. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. PIN domains within this subgroup contain four of these highly conserved putative metal-binding, active site residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains and included distant subgroups, this subgroup includes some sequences belonging to one of these, PIN_14. 118
28615 350228 cd09880 PIN_Smg5-6-like VapC-like PIN domain of nonsense-mediated decay (NMD) factors, Smg5 and Smg6, and related proteins. PIN (PilT N terminus) domain of nonsense-mediated decay (NMD) factors, Smg5 and Smg6, and homologs are included in this family. Smg5 and Smg6 are essential factors in NMD, a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal-ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. Many of the bacterial homologs in this group have an N-terminal PIN domain and a C-terminal PhoH-like ATPase domain. 152
28616 350229 cd09881 PIN_VapC4-5_FitB-like VapC-like PIN domain of Mycobacterium tuberculosis VapC4 and VapC5, and Neisseria gonorrhoeae FitB and related proteins. This family includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This family belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
28617 350230 cd09882 PIN_MtVapC3-like_start VapC-like PIN domain of Mycobacterium tuberculosis VapC3 toxin and related proteins. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, and VapC21. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
28618 350231 cd09883 PIN_VapC_PhoHL-ATPase VapC-like PIN domain of bacterial Smg6-like proteins with C-terminal PhoH-like ATPase domains. PIN (PilT N terminus) domain of Smg6-like bacterial proteins with C-terminal PhoH-like ATPase domains and other similar homologs are included in this family. Eukaryotic Smg5 and Smg6 nucleases are essential factors in nonsense-mediated mRNA decay (NMD), a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. PIN domains within this subgroup contain four highly conserved acidic residues (putative metal-binding, active site residues). Many of the bacterial homologs in this group have an N-terminal PIN domain and a C-terminal PhoH-like ATPase domain and are predicted to be ATPases which are induced by phosphate starvation. 146
28619 350232 cd09884 PIN_Smg5-like VapC-like PIN domain of human nonsense-mediated decay factor Smg5, and other similar eukaryotic homologs. Nonsense-mediated decay (NMD) factors, Smg5 and Smg6 are essential to the post-transcriptional regulatory pathway, NMD, which recognizes and rapidly degrades mRNAs containing premature translation termination codons. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. 160
28620 350233 cd09885 PIN_Smg6-like VapC-like PIN domain of human telomerase-binding protein EST1, Smg6, and other similar eukaryotic homologs. Nonsense-mediated decay (NMD) factors, Smg5 and Smg6 are essential to the post-transcriptional regulatory pathway, NMD, which recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN (PilT N terminus) domain elicits degradation of bound mRNAs, as well as, metal ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. PIN domains within this subgroup contain four highly conserved acidic residues (putative metal-binding, active site residues) which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. Eukaryotic Smg6 PIN domains are present at the C-terminal end of the telomerase activating proteins, EST1. 178
28621 193575 cd09886 NGN_SP N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP). The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separate these two domains are different. The domain organization of NusG and its orthologs suggest that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains. 97
28622 193576 cd09887 NGN_Arch Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain. The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Transcription in archaea has a eukaryotic-type transcription apparatus, but contains bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as a N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes. 82
28623 193577 cd09888 NGN_Euk Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1). The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli, and has a variety of functions such as its involvement in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes, which is unique as it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is a RNA-directed DNA methylation (RdDM) pathway effector in plants. 86
28624 193578 cd09889 NGN_Bact_2 Bacterial N-Utilization Substance G (NusG) N-terminal (NGN) domain, subgroup 2. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and associates with RNA polymerase elongation and Rho-termination. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. The NusG N-terminal domain (NGN) is quite similar in all NusG orthologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains. 100
28625 193579 cd09890 NGN_plant Plant N-Utilization Substance G (NusG) N-terminal (NGN) domain. The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains a NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein comprising an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. The bacterial infected plants contain bacterial DNA, such as NGN sequences, that can be used to clone the DNA of uncultured organisms. 113
28626 193580 cd09891 NGN_Bact_1 Bacterial N-Utilization Substance G (NusG) N-terminal (NGN) domain, subgroup 1. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination in bacteria. NusG is essential in Escherichia coli and associates with RNA polymerase elongation and Rho-termination. Homologs of the NusG gene exist in all bacteria. The NusG N-terminal domain (NGN) is similar in all NusG homologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG suggests that the common properties of NusG and its homologs are due to their similar NGN domains. 107
28627 193581 cd09892 NGN_SP_RfaH N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), RfaH. RfaH is an operon-specific virulence regulator, thought to have arisen from an early duplication of N-Utilization Substance G (NusG). Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. In contrast, RfaH is a non-essential protein that controls expression of operons containing an ops (operon polarity suppressor) element in their transcribed DNA. RfaH and NusG are different in their response to Rho-dependent terminators and regulatory targets. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separate these two domains are different. The domain organization of NusG and its homologs suggest that the common properties of NusG and RfaH are due to their similar NGN domains. 96
28628 193582 cd09893 NGN_SP_TaA N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), TaA. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. TaA is a NusG SP factor that is required for synthesis of a polyketide antibiotic TA in Myxococcus xanthus. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separate these two domains are different. The domain organization of NusG and its orthologs suggest that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains. 95
28629 193583 cd09894 NGN_SP_AnfA1 N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), AnFA1. Regulation of the afp, antifeeding prophage, gene cluster is mediated by AnFA1, a RfaH-like transcriptional antiterminator. RfaH is an operon-specific virulence regulator, thought to have arisen from an early duplication of N-Utilization Substance G (NusG). NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal domain (NGN) is similar in all NusG orthologs, but its C-terminal domain and the linker that separate these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains. 99
28630 193584 cd09895 NGN_SP_UpxY N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), UpxY. The N-Utilization Substance G (NusG) proteins are involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS (lipopolysaccharide) biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. UpxY proteins, where the x is replaced by the letter designation of the specific polysaccharide (UpaY to UphY), are a family of NusG SP factors that act specifically in transcriptional antitermination of operons from which they are encoded. UpxYs are necessary and specific for transcription regulation of the polysaccharide biosynthesis operon. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal (NGN) domain is similar in all NusG orthologs, but its C-terminal domain and the linker that separate these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains. 95
28631 188617 cd09897 H3TH_FEN1-XPG-like H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), Xeroderma pigmentosum complementation group G (XPG) nuclease, and other eukaryotic and archaeal homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. With the exception of the Mkt1-like proteins, the nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 68
28632 188618 cd09898 H3TH_53EXO H3TH domain of the 5'-3' exonuclease of Taq DNA polymerase I and homologs. H3TH (helix-3-turn-helix) domains of the 5'-3' exonuclease (53EXO) of multi-domain DNA polymerase I and single domain protein homologs are included in this family. Taq DNA polymerase I contains a polymerase domain for synthesizing a new DNA strand and a 53EXO domain for cleaving RNA primers or damaged DNA strands. Taq's 53EXO recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The 53EXO cleaves the unpaired 5'-arm of the overlap flap DNA substrate. 5'-3' exonucleases are members of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+ or Mn2+ or Zn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and two Asp residues from the H3TH domain. 
Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 73
28633 188619 cd09899 H3TH_T4-like H3TH domain of bacteriophage T3, T4 RNase H, T5-5' nucleases, and homologs. H3TH (helix-3-turn-helix) domains of bacteriophage T5-5'nuclease (5'-3' exonuclease or T5FEN), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar 5' nucleases are included in this family. The T5-5'nuclease is a 5'-3' exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'- 3' exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. Bacteriophage T3 is believed to function in the removal of DNA-linked RNA primers and is essential for phage DNA replication and also necessary for host DNA degradation and phage genetic recombination. These nucleases are members of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. They contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors required for nuclease activity. 
The first metal binding site (MBS-1) is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site (MBS-2) is composed generally of two Asp residues from the PIN domain and two Asp residues from the H3TH domain. In the T5-5'nuclease, structure-specific endonuclease activity requires binding of a single metal ion in the high-affinity, MBS-1, whereas exonuclease activity requires both, the high-affinity, MBS-1 and the low-affinity, MBS-2 to be occupied by a divalent cofactor. The T5-5'nuclease is reported to be able to bind several metal ions including, Mg2+, Mn2+, Zn2+ and Co2+, as co-factors. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 74
28634 188620 cd09900 H3TH_XPG-like H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases: FEN1 (archaeal), GEN1, YEN1, and XPG. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of archaeal Flap Endonuclease-1 (FEN1), Gap Endonuclease 1 (GEN1), Yeast Endonuclease 1 (YEN1), Xeroderma pigmentosum complementation group G (XPG) nuclease, and other eukaryotic and archaeal homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. With the exception of the Mkt1-like proteins, the nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 52
28635 188621 cd09901 H3TH_FEN1-like H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases: FEN1 (eukaryotic) and EXO1. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of eukaryotic Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), and other eukaryotic homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 73
28636 188622 cd09902 H3TH_MKT1 H3TH domain of Mkt1: A global regulator of mRNAs encoding mitochondrial proteins and eukaryotic homologs. The Mkt1 gene product interacts with the Poly(A)-binding protein associated factor, Pbp1, and is present at the 3' end of RNA transcripts during translation. The Mkt1-Pbp1 complex is involved in the post-transcriptional regulation of HO endonuclease expression. Mkt1 and eukaryotic homologs are atypical members of the structure-specific, 5' nuclease family. Canonical members of this family possess a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH (helix-3-turn-helix) domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Although Mkt1 appears to possess both a PIN and H3TH domain, the Mkt1 PIN domain lacks several of the active site residues necessary to bind essential divalent metal ion cofactors (Mg2+/Mn2+) required for nuclease activity in this family. Also, Mkt1 lacks the glycine-rich loop in the H3TH domain which is proposed to facilitate duplex DNA binding. 81
28637 188623 cd09903 H3TH_FEN1-Arc H3TH domain of Flap Endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease: Archaeal homologs. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of archaeal Flap endonuclease-1 (FEN1), 5' nucleases. FEN1 is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this subfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 
Also, FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA. 65
28638 188624 cd09904 H3TH_XPG H3TH domain of Xeroderma pigmentosum complementation group G (XPG) nuclease, a structure-specific, divalent-metal-ion dependent, 5' nuclease. The Xeroderma pigmentosum complementation group G (XPG) nuclease plays a central role in nucleotide excision repair (NER) in cleaving DNA bubble structures or loops. XPG is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of XPG and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 97
28639 188625 cd09905 H3TH_GEN1 H3TH domain of Gap Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Gap Endonuclease 1 (GEN1): Holliday junction resolvase reported to symmetrically cleave Holliday junctions and allow religation without additional processing. GEN1 is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of GEN1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 108
28640 188626 cd09906 H3TH_YEN1 H3TH domain of Yeast Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Yeast Endonuclease 1 (YEN1): Holliday junction resolvase which promotes reciprocal exchange during mitotic recombination to maintain genome integrity in budding yeast. YEN1 is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of YEN1 and other similar fungal 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. 105
28641 188627 cd09907 H3TH_FEN1-Euk H3TH domain of Flap Endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease: Eukaryotic homologs. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of eukaryotic Flap endonuclease-1 (FEN1), 5' nucleases. FEN1 is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this subfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. Also, FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA. 70
28642 188628 cd09908 H3TH_EXO1 H3TH domain of Exonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR)), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of EXO1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively. 73
28643 197369 cd09909 HIV-1-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the gp41 subunit of human immunodeficiency virus (HIV-1), and related domains. This domain family spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including human, simian, and feline immunodeficiency viruses (HIV, SIV, and FIV), bovine immunodeficiency-like virus (BIV), equine infectious anaemia virus (EIAV), and Jaagsiekte sheep retrovirus (JSRV), mouse mammary tumour virus (MMTV) and various ERVs including sheep enJSRV-26, and human ERVs (HERVs): HERV-K_c1q23.3 and HERV-K_c12q14.1. This domain belongs to a larger superfamily containing the HR1-HR2 domain of ERVs and infectious retroviruses, including Ebola virus, and Rous sarcoma virus. Proteins in this family lack the canonical CSK17-like immunosuppressive sequence, and the intrasubunit disulfide bond-forming CX6C motif found in linker region between HR1 and HR2 in the Ebola_RSV-like_HR1-HR2 family. N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as JSRV. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Included in this subgroup are ERVs from domestic sheep that are related to JSRV, the agent of transmissible lung cancer in sheep, for example enJSRV-26 that retains an intact genome. These endogenous JSRVs protect the sheep against JSRV infection and are required for sheep placental development. HERV-K_c12q14.1 is potentially a complete envelope protein; however, it does not appear to be fusogenic. 128
28644 197364 cd09910 NGN-insert_like NGN-insert domain found between N-terminal domain (D1) and C-terminal KOW domain (DIII) repeats of some N-Utilization Substance G (NusG) N-terminal (NGN). This family contains a unique insert (domain II, DII) found between the highly conserved N-terminal domain (NGN, domain I, D1) and C-terminal Kyrpides Ouzounis and Woese domain (KOW, domain III, DIII) repeats of some N-Utilization Substance G (NusG) N-terminal (NGN) proteins in bacteria such as Aquifex aeolicus NusG (AaeNusG). NusG was originally discovered as having an N-dependent antitermination enhancing activity in Escherichia coli, and has since been shown to have a variety of functions such as being involved in RNA polymerase elongation and Rho-termination. Orthologs of NusG gene exist in bacteria, but their functions and requirements are diverse. The function of DII is as yet unknown, and belongs to Domains of Unknown Function 1312 (DUF1312). 80
28645 197365 cd09911 Lin0431_like Listeria innocua Lin0431 is similar to the N-Utilization Substance G (NusG) N terminal (NGN) insert (DII). This family contains domains homologous to Listeria innocua Lin0431, a protein that is similar to the N-Utilization Substance G (NusG) N terminal (NGN) insert (domain II, DII). Lin0431 and Aquifex aeolicus NusG DII (AaeNusG DII) have similar structure and similar basic charged surface distributions that may bind negatively charged nucleic acids and/or another anionic binding partner, suggesting a possible role in transcription/translation regulating functions. Despite these two domains having low sequence similarity, the NusG DII and DUF1312 domain families may have diverged from common evolutionary ancestral proteins, and may have similar biochemical functions. 82
28646 206739 cd09912 DLP_2 Dynamin-like protein including dynamins, mitofusins, and guanylate-binding proteins. The dynamin family of large mechanochemical GTPases includes the classical dynamins and dynamin-like proteins (DLPs) that are found throughout the Eukarya. This family also includes bacterial DLPs. These proteins catalyze membrane fission during clathrin-mediated endocytosis. Dynamin consists of five domains; an N-terminal G domain that binds and hydrolyzes GTP, a middle domain (MD) involved in self-assembly and oligomerization, a pleckstrin homology (PH) domain responsible for interactions with the plasma membrane, GED, which is also involved in self-assembly, and a proline arginine rich domain (PRD) that interacts with SH3 domains on accessory proteins. To date, three vertebrate dynamin genes have been identified; dynamin 1, which is brain specific, mediates uptake of synaptic vesicles in presynaptic terminals; dynamin-2 is expressed ubiquitously and similarly participates in membrane fission; mutations in the MD, PH and GED domains of dynamin 2 have been linked to human diseases such as Charcot-Marie-Tooth peripheral neuropathy and rare forms of centronuclear myopathy. Dynamin 3 participates in megakaryocyte progenitor amplification, and is also involved in cytoplasmic enlargement and the formation of the demarcation membrane system. This family also includes mitofusins (MFN1 and MFN2 in mammals) that are involved in mitochondrial fusion. Dynamin oligomerizes into helical structures around the neck of budding vesicles in a GTP hydrolysis-dependent manner. 180
28647 206740 cd09913 EHD Eps15 homology domain (EHD), C-terminal domain. Dynamin-like C-terminal Eps15 homology domain (EHD) proteins regulate endocytic events; they have been linked to a number of Rab proteins through their association with mutual effectors, suggesting a coordinate role in endocytic regulation. Eukaryotic EHDs comprise four members (EHD1-4) in mammals and single members in Caenorhabditis elegans (Rme-1), Drosophila melanogaster (Past1) as well as several eukaryotic parasites. EHD1 regulates trafficking of multiple receptors from the endocytic recycling compartment (ERC) to the plasma membrane; EHD2 regulates trafficking from the plasma membrane by controlling Rac1 activity; EHD3 regulates endosome-to-Golgi transport, and preserves Golgi morphology; EHD4 is involved in the control of trafficking at the early endosome and regulates exit of cargo toward the recycling compartment as well as late endocytic pathway. Rme-1, an ortholog of human EHD1, controls the recycling of internalized receptors from the endocytic recycling compartment to the plasma membrane. In D. melanogaster, deletion of the Past1 gene leads to infertility as well as premature death of adult flies. Arabidopsis thaliana also has homologs of EHD proteins (AtEHD1 and AtEHD2), possibly involved in regulating endocytosis and signaling. 241
28648 206741 cd09914 RocCOR Ras of complex proteins (Roc) C-terminal of Roc (COR) domain family. RocCOR (or Roco) protein family is characterized by a superdomain containing a Ras-like GTPase domain, called Roc (Ras of complex proteins), and a characteristic second domain called COR (C-terminal of Roc). A kinase domain and diverse regulatory domains are also often found in Roco proteins. Their functions are diverse; in Dictyostelium discoideum, which encodes 11 Roco proteins, they are involved in cell division, chemotaxis and development, while in human, where 4 Roco proteins (LRRK1, LRRK2, DAPK1, and MFHAS1) are encoded, these proteins are involved in epilepsy and cancer. Mutations in LRRK2 (leucine-rich repeat kinase 2) are known to cause familial Parkinson's disease. 161
28649 206742 cd09915 Rag Rag GTPase subfamily of Ras-related GTPases. Rag GTPases (ras-related GTP-binding proteins) constitute a unique subgroup of the Ras superfamily, playing an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. This subfamily consists of RagA and RagB as well as RagC and RagD that are closely related. Saccharomyces cerevisiae encodes single orthologs of metazoan RagA/B and RagC/D, Gtr1 and Gtr2, respectively. Dimer formation is important for their cellular function; these domains form heterodimers, as RagA or RagB dimerizes with RagC or RagD, and similarly, Gtr1 dimerizes with Gtr2. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7. 175
28650 197366 cd09916 CpxP_like CpxP component of the bacterial Cpx-two-component system and related proteins. This family summarizes bacterial proteins related to CpxP, a periplasmic protein that forms part of a two-component system which acts as a global modulator of cell-envelope stress in gram-negative bacteria. CpxP aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH. Functioning as a dimer, it inhibits activation of the kinase CpxA, but also plays a vital role in the quality control system of P pili. It has been suggested that CpxP directly interacts with CpxA via its concave polar surface. Another member of this family, Spy, is also a periplasmic protein that may be involved in the response to stress. The homology between CpxP and Spy suggests similar functions. A characteristic 5-residue sequence motif LTXXQ is found repeated twice in many members of this family. 96
28651 198174 cd09918 SH2_Nterm_SPT6_like N-terminal Src homology 2 (SH2) domain found in Spt6. N-terminal SH2 domain in Spt6. Spt6 is an essential transcription elongation factor and histone chaperone that binds the C-terminal repeat domain (CTD) of RNA polymerase II. Spt6 contains a tandem SH2 domain with a novel structure and CTD-binding mode. The tandem SH2 domain binds to a serine 2-phosphorylated CTD peptide in vitro, whereas its N-terminal SH2 subdomain does not. CTD binding requires a positively charged crevice in the C-terminal SH2 subdomain, which lacks the canonical phospho-binding pocket of SH2 domains. The tandem SH2 domain is apparently required for transcription elongation in vivo as its deletion in cells is lethal in the presence of 6-azauracil. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 85
28652 198175 cd09919 SH2_STAT_family Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) family. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated by a receptor. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STAT monomers and in the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. The CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 115
28653 198176 cd09920 SH2_Cbl-b_TKB Src homology 2 (SH2) domain found in the Cbl-b TKB domain. SH2 found in the Cbl-b TKB domain. The Cbl (for Casitas B-lineage lymphoma) family of E3 ubiquitin ligases contains three members Cbl, Cbl-b and Cbl-c. The founding member Cbl was discovered first as the oncogenic protein v-Cbl, a Gag-fusion transforming protein of Cas NS-1 retrovirus, which causes pre- and pro-B lymphomas in mice. The N-terminus of the Cbl proteins is composed of a tyrosine kinase-binding (TKB) domain, also called phosphotyrosine binding (PTB) domain, a short linker region and the RING-type zinc finger. In addition, Cbl and Cbl-b contain a leucine zipper motif and a proline-rich domain in the C-terminus. The TKB domain consists of a four-helix bundle (4H), a calcium-binding EF hand and a divergent SH2 domain. Cbl-b plays a role in early hematopoietic development and is a negative regulator of T-cell receptor, B-cell receptor and high affinity immunoglobulin epsilon receptor signal transduction pathways. It also negatively regulates insulin-like growth factor 1 signaling during muscle atrophy caused by unloading and is involved in EGFR ubiquitination and internalization. Diseases associated with defects in Cbl-b include: multiple sclerosis, autoimmune diseases, including type 1 diabetes, and a craniofacial phenotype. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28654 198177 cd09921 SH2_Jak_family Src homology 2 (SH2) domain in the Janus kinase (Jak) family. The Janus kinases (Jak) are a family of 4 non-receptor tyrosine kinases (Jak1, Jak2, Jak3, Tyk2) which respond to cytokine or growth factor receptor activation. To transduce cytokine signaling, a series of conformational changes occur in the receptor-Jak complex upon extracellular ligand binding. This results in trans-activation of the receptor-associated Jaks followed by phosphorylation of receptor tail tyrosine sites. The Signal Transducers and Activators of Transcription (STAT) are then recruited to the receptor tail, become phosphorylated and translocate to the nucleus to regulate transcription. Jaks have four domains: the pseudokinase domain, the catalytic tyrosine kinase domain, the FERM (band four-point-one, ezrin, radixin, and moesin) domain, and the SH2 (Src Homology-2) domain. The Jak kinases are regulated by several enzymatic and non-enzymatic mechanisms. First, the Jak kinase domain is regulated by phosphorylation of the activation loop which is associated with the catalytically competent kinase conformation and is distinct from the inactive kinase conformation. Second, the pseudokinase domain directly modulates Jak catalytic activity with the FERM domain maintaining an active state. Third, the suppressor of cytokine signaling (SOCS) family and tyrosine phosphatases directly regulate Jak activity. Dysregulation of Jak activity can manifest as either a reduction or an increase in kinase activity resulting in immunodeficiency, inflammatory diseases, hematological defects, autoimmune and myeloproliferative disorders, and susceptibility to infection. Altered Jak regulation occurs by many mechanisms, including: gene translocations, somatic or inherited point mutations, receptor mutations, and alterations in the activity of Jak regulators such as SOCS or phosphatases. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28655 198178 cd09923 SH2_SOCS_family Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) family. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 81
28656 198179 cd09925 SH2_SHC Src homology 2 (SH2) domain found in SH2 adaptor protein C (SHC). SHC is involved in a wide variety of pathways including regulating proliferation, angiogenesis, invasion and metastasis, and bone metabolism. An adapter protein, SHC has been implicated in Ras activation following the stimulation of a number of different receptors, including growth factors [insulin, epidermal growth factor (EGF), nerve growth factor, and platelet derived growth factor (PDGF)], cytokines [interleukins 2, 3, and 5], erythropoietin, and granulocyte/macrophage colony-stimulating factor, and antigens [T-cell and B-cell receptors]. SHC has been shown to bind to tyrosine-phosphorylated receptors, and receptor stimulation leads to tyrosine phosphorylation of SHC. Upon phosphorylation, SHC interacts with another adapter protein, Grb2, which binds to the Ras GTP/GDP exchange factor mSOS which leads to Ras activation. SHC is composed of an N-terminal domain that interacts with proteins containing phosphorylated tyrosines, a (glycine/proline)-rich collagen-homology domain that contains the phosphorylated binding site, and a C-terminal SH2 domain. SH2 has been shown to interact with the tyrosine-phosphorylated receptors of EGF and PDGF and with the tyrosine-phosphorylated C chain of the T-cell receptor, providing one of the mechanisms of T-cell-mediated Ras activation. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28657 198180 cd09926 SH2_CRK_like Src homology 2 domain found in cancer-related signaling adaptor protein CRK. SH2 domain in the CRK proteins. CRKI (SH2-SH3) and CRKII (SH2-SH3-SH3) are splicing isoforms of the oncoprotein CRK. CRKs regulate transcription and cytoskeletal reorganization for cell growth and motility by linking tyrosine kinases to small G proteins. The SH2 domain of CRK associates with tyrosine-phosphorylated receptors or components of focal adhesions, such as p130Cas and paxillin. CRK transmits signals to small G proteins through effectors that bind its SH3 domain, such as C3G, the guanine-nucleotide exchange factor (GEF) for Rap1 and R-Ras, and DOCK180, the GEF for Rac1. The binding of p130Cas to the CRK-C3G complex activates Rap1, leading to regulation of cell adhesion, and activates R-Ras, leading to JNK-mediated activation of cell proliferation, whereas the binding of CRK to DOCK180 induces Rac1-mediated activation of cellular migration. The activity of the different splicing isoforms varies greatly with CRKI displaying substantial transforming activity, CRKII less so, and phosphorylated CRKII with no biological activity whatsoever. CRKII has a linker region with a phosphorylated Tyr and an additional C-terminal SH3 domain. The phosphorylated Tyr creates a binding site for its SH2 domain which disrupts the association between CRK and its SH2 target proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 106
28658 198181 cd09927 SH2_Tensin_like Src homology 2 domain found in Tensin-like proteins. SH2 domain found in Tensin-like proteins. The Tensins are a family of intracellular proteins that interact with receptor tyrosine kinases (RTKs), integrins, and actin. They are thought to act as signaling bridges between the extracellular space and the cytoskeleton. There are four homologues: Tensin1, Tensin2 (TENC1, C1-TEN), Tensin3 and Tensin4 (cten), all of which contain a C-terminal tandem SH2-PTB domain pairing, as well as actin-binding regions that may localize them to focal adhesions. The isoforms of Tensin2 and Tensin3 contain N-terminal C1 domains, which are atypical and not expected to bind to phorbol esters. Tensins 1-3 contain a phosphatase (PTPase) and C2 domain pairing which resembles PTEN (phosphatase and tensin homologue deleted on chromosome 10) protein. PTEN is a lipid phosphatase that dephosphorylates phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) to yield phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2). As PtdIns(3,4,5)P3 is the product of phosphatidylinositol 3-kinase (PI3K) activity, PTEN is therefore a key negative regulator of the PI3K pathway. Because of their PTEN-like domains, the Tensins may also possess phosphoinositide-binding or phosphatase capabilities. However, only Tensin2 and Tensin3 have the potential to be phosphatases since only their PTPase domains contain a cysteine residue that is essential for catalytic activity. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 116
28659 198182 cd09928 SH2_Cterm_SPT6_like C-terminal Src homology 2 (SH2) domain found in Spt6. Spt6 is an essential transcription elongation factor and histone chaperone that binds the C-terminal repeat domain (CTD) of RNA polymerase II. Spt6 contains a tandem SH2 domain with a novel structure and CTD-binding mode. The tandem SH2 domain binds to a serine 2-phosphorylated CTD peptide in vitro, whereas its N-terminal SH2 subdomain does not. CTD binding requires a positively charged crevice in the C-terminal SH2 subdomain, which lacks the canonical phospho-binding pocket of SH2 domains. The tandem SH2 domain is apparently required for transcription elongation in vivo as its deletion in cells is lethal in the presence of 6-azauracil. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 89
28660 198183 cd09929 SH2_BLNK_SLP-76 Src homology 2 (SH2) domain found in B-cell linker (BLNK) protein and SH2 domain-containing leukocyte protein of 76 kDa (SLP-76). BLNK (also known as SLP-65 or BASH) is an important adaptor protein expressed in B-lineage cells. BLNK consists of an N-terminal sterile alpha motif (SAM) domain and a C-terminal SH2 domain. BLNK is a cytoplasmic protein, but a part of it is bound to the plasma membrane through an N-terminal leucine zipper motif and transiently bound to a cytoplasmic domain of Iga through its C-terminal SH2 domain upon B cell antigen receptor (BCR)-stimulation. A non-ITAM phosphotyrosine in Iga is necessary for the binding with the BLNK SH2 domain and/or for normal BLNK function in signaling and B cell activation. Upon phosphorylation BLNK binds Btk and PLCgamma2 through their SH2 domains and mediates PLCgamma2 activation by Btk. BLNK also binds other signaling molecules such as Vav, Grb2, Syk, and HPK1. BLNK has been shown to be necessary for BCR-mediated Ca2+ mobilization, for the activation of mitogen-activated protein kinases such as ERK, JNK, and p38 in a chicken B cell line DT40, and for activation of transcription factors such as NF-AT and NF-kappaB in human or mouse B cells. BLNK is involved in B cell development, B cell survival, activation, proliferation, and T-independent immune responses. BLNK is structurally homologous to SLP-76. SLP-76 and linker for activation of T cells (LAT) are adaptor/linker proteins in T cell antigen receptor activation and T cell development. BLNK interacts with many downstream signaling proteins that interact directly with both SLP-76 and LAT. New data suggest functional complementation of SLP-76 and LAT in T cell antigen receptor function with BLNK in BCR function. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 121
28661 198184 cd09930 SH2_cSH2_p85_like C-terminal Src homology 2 (cSH2) domain found in p85. Phosphoinositide 3-kinases (PI3Ks) are essential for cell growth, migration, and survival. p110, the catalytic subunit, is composed of an adaptor-binding domain, a Ras-binding domain, a C2 domain, a helical domain, and a kinase domain. The regulatory unit is called p85 and is composed of an SH3 domain, a RhoGap domain, an N-terminal SH2 (nSH2) domain, an inter-SH2 (iSH2) domain, and a C-terminal SH2 (cSH2) domain. There are 2 inhibitory interactions between p110alpha and p85 of PI3K: 1) p85 nSH2 domain with the C2, helical, and kinase domains of p110alpha and 2) p85 iSH2 domain with C2 domain of p110alpha. There are 3 inhibitory interactions between p110beta and p85 of PI3K: 1) p85 nSH2 domain with the C2, helical, and kinase domains of p110beta, 2) p85 iSH2 domain with C2 domain of p110alpha, and 3) p85 cSH2 domain with the kinase domain of p110alpha. It is interesting to note that p110beta is oncogenic as a wild type protein while p110alpha lacks this ability. One explanation is the idea that the regulation of p110beta by p85 is unique because of the addition of inhibitory contacts from the cSH2 domain and the loss of contacts in the iSH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28662 198185 cd09931 SH2_C-SH2_SHP_like C-terminal Src homology 2 (C-SH2) domain found in SH2 domain Phosphatases (SHP) proteins. The SH2 domain phosphatases (SHP-1, SHP-2/Syp, Drosophila corkscrew (csw), and Caenorhabditis elegans Protein Tyrosine Phosphatase (Ptp-2)) are cytoplasmic signaling enzymes. They are both targeted and regulated by interactions of their SH2 domains with phosphotyrosine docking sites. These proteins contain two SH2 domains (N-SH2, C-SH2) followed by a tyrosine phosphatase (PTP) domain, and a C-terminal extension. Shp1 and Shp2 have two tyrosyl phosphorylation sites in their C-tails, which are phosphorylated differentially by receptor and nonreceptor PTKs. Csw retains the proximal tyrosine and Ptp-2 lacks both sites. Shp-binding proteins include receptors, scaffolding adapters, and inhibitory receptors. Some of these bind both Shp1 and Shp2 while others bind only one. Most proteins that bind a Shp SH2 domain contain one or more immuno-receptor tyrosine-based inhibitory motifs (ITIMs): [SIVL]xpYxx[IVL]. Shp1 N-SH2 domain blocks the catalytic domain and keeps the enzyme in the inactive conformation, and is thus believed to regulate the phosphatase activity of SHP-1. Its C-SH2 domain is thought to be involved in searching for phosphotyrosine activators. The SHP2 N-SH2 domain is a conformational switch; it either binds and inhibits the phosphatase, or it binds phosphoproteins and activates the enzyme. The C-SH2 domain contributes binding energy and specificity, but it does not have a direct role in activation. Csw SH2 domain function is essential, but either SH2 domain can fulfill this requirement. The role of the csw SH2 domains during Sevenless receptor tyrosine kinase (SEV) signaling is to bind Daughter of Sevenless rather than activated SEV. Ptp-2 acts in oocytes downstream of sheath/oocyte gap junctions to promote major sperm protein (MSP)-induced MAP Kinase (MPK-1) phosphorylation. 
Ptp-2 functions in the oocyte cytoplasm, not at the cell surface to inhibit multiple RasGAPs, resulting in sustained Ras activation. It is thought that MSP triggers PTP-2/Ras activation and ROS production to stimulate MPK-1 activity essential for oocyte maturation and that secreted MSP domains and Cu/Zn superoxide dismutases function antagonistically to control ROS and MAPK signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 99
28663 198186 cd09932 SH2_C-SH2_PLC_gamma_like C-terminal Src homology 2 (C-SH2) domain in Phospholipase C gamma. Phospholipase C gamma is a signaling molecule that is recruited to the C-terminal tail of the receptor upon autophosphorylation of a highly conserved tyrosine. PLCgamma is composed of a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, 2 catalytic regions of PLC domains that flank 2 tandem SH2 domains (N-SH2, C-SH2), and ending with a SH3 domain and C2 domain. N-SH2 SH2 domain-mediated interactions represent a crucial step in transmembrane signaling by receptor tyrosine kinases. SH2 domains recognize phosphotyrosine (pY) in the context of particular sequence motifs in receptor phosphorylation sites. Both N-SH2 and C-SH2 have a very similar binding affinity to pY. But in growth factor stimulated cells these domains bind to different target proteins. N-SH2 binds to pY containing sites in the C-terminal tails of tyrosine kinases and other receptors. Recently it has been shown that this interaction is mediated by phosphorylation-independent interactions between a secondary binding site found exclusively on the N-SH2 domain and a region of the FGFR1 tyrosine kinase domain. This secondary site on the SH2 cooperates with the canonical pY site to regulate selectivity in mediating a specific cellular process. C-SH2 binds to an intramolecular site on PLCgamma itself which allows it to hydrolyze phosphatidylinositol-4,5-bisphosphate into diacylglycerol and inositol triphosphate. These then activate protein kinase C and release calcium. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28664 199827 cd09933 SH2_Src_family Src homology 2 (SH2) domain found in the Src family of non-receptor tyrosine kinases. The Src family kinases are nonreceptor tyrosine kinases that have been implicated in pathways regulating proliferation, angiogenesis, invasion and metastasis, and bone metabolism. It is thought that transforming ability of Src is linked to its ability to activate key signaling molecules in these pathways, rather than through direct activity. As such, blocking Src activation has been a target for drug companies. Src family members can be divided into 3 groups based on their expression pattern: 1) Src, Fyn, and Yes; 2) Blk, Fgr, Hck, Lck, and Lyn; and 3) Frk-related kinases Frk/Rak and Iyk/Bsk. Of these, cellular c-Src is the best studied and most frequently implicated in oncogenesis. c-Src contains five distinct regions: a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Src exists in both active and inactive conformations. Negative regulation occurs through phosphorylation of Tyr, resulting in an intramolecular association between phosphorylated Tyr and the SH2 domain of SRC, which locks the protein in a closed conformation. Further stabilization of the inactive state occurs through interactions between the SH3 domain and a proline-rich stretch of residues within the kinase domain. Conversely, dephosphorylation of Tyr allows SRC to assume an open conformation. Full activity requires additional autophosphorylation of a Tyr residue within the catalytic domain. Loss of the negative-regulatory C-terminal segment has been shown to result in increased activity and transforming potential. Phosphorylation of the C-terminal Tyr residue by C-terminal Src kinase (Csk) and Csk homology kinase results in increased intramolecular interactions and consequent Src inactivation. 
Specific phosphatases, protein tyrosine phosphatase a (PTPa) and the SH-containing phosphatases SHP1/SHP2, have also been shown to take a part in Src activation. Src is also activated by direct binding of focal adhesion kinase (Fak) and Crk-associated substrate (Cas) to the SH2 domain. SRC activity can also be regulated by numerous receptor tyrosine kinases (RTKs), such as Her2, epidermal growth factor receptor (EGFR), fibroblast growth factor receptor, platelet-derived growth factor receptor (PDGFR), and vascular endothelial growth factor receptor (VEGFR). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28665 198188 cd09934 SH2_Tec_family Src homology 2 (SH2) domain found in Tec-like proteins. The Tec protein tyrosine kinase is the founding member of a family that includes Btk, Itk, Bmx, and Txk. The members have a PH domain, a zinc-binding motif, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. Btk is involved in B-cell receptor signaling with mutations in Btk responsible for X-linked agammaglobulinemia (XLA) in humans and X-linked immunodeficiency (xid) in mice. Itk is involved in T-cell receptor signaling. Tec is expressed in both T and B cells, and is thought to function in activated and effector T lymphocytes to induce the expression of genes regulated by NFAT transcription factors. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28666 198189 cd09935 SH2_ABL Src homology 2 (SH2) domain found in Abelson murine lymphosarcoma virus (ABL) proteins. ABL-family proteins are highly conserved tyrosine kinases. Each ABL protein contains an SH3-SH2-TK (Src homology 3-Src homology 2-tyrosine kinase) domain cassette, which confers autoregulated kinase activity and is common among nonreceptor tyrosine kinases. Several types of posttranslational modifications control ABL catalytic activity, subcellular localization, and stability, with consequences for both cytoplasmic and nuclear ABL functions. Binding partners provide additional regulation of ABL catalytic activity, substrate specificity, and downstream signaling. By combining this cassette with actin-binding and -bundling domains, ABL proteins are capable of connecting phosphoregulation with actin-filament reorganization. Vertebrate paralogs, ABL1 and ABL2, have evolved to perform specialized functions. ABL1 includes nuclear localization signals and a DNA binding domain which is used to mediate DNA damage-repair functions, while ABL2 has additional binding capacity for actin and for microtubules to enhance its cytoskeletal remodeling functions. SH2 is involved in several autoinhibitory mechanisms that constrain the enzymatic activity of the ABL-family kinases. In one mechanism SH2 and SH3 cradle the kinase domain while a cap sequence stabilizes the inactive conformation resulting in a locked inactive state. Another involves phosphatidylinositol 4,5-bisphosphate (PIP2) which binds the SH2 domain through residues normally required for phosphotyrosine binding in the linker segment between the SH2 and kinase domains. The SH2 domain contributes to ABL catalytic activity and target site specificity. It is thought that the ABL catalytic site and SH2 pocket have coevolved to recognize the same sequences. 
Recent work now supports a hierarchical processivity model in which the substrate target site most compatible with ABL kinase domain preferences is phosphorylated with greatest efficiency. If this site is compatible with the ABL SH2 domain specificity, it will then reposition and dock in the SH2 pocket. This mechanism also explains how ABL kinases phosphorylate poor targets on the same substrate if they are properly positioned and how relatively poor substrate proteins might be recruited to ABL through a complex with strong substrates that can also dock with the SH2 pocket. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 94
28667 198190 cd09937 SH2_csk_like Src homology 2 (SH2) domain found in Carboxyl-Terminal Src Kinase (Csk). Both the C-terminal Src kinase (CSK) and CSK-homologous kinase (CHK) are members of the CSK-family of protein tyrosine kinases. These proteins suppress activity of Src-family kinases (SFK) by selectively phosphorylating the conserved C-terminal tail regulatory tyrosine by a similar mechanism. CHK is also capable of inhibiting SFKs by a non-catalytic mechanism that involves binding of CHK to SFKs to form stable protein complexes. The unphosphorylated form of SFKs is inhibited by CSK and CHK by a two-step mechanism. The first step involves the formation of a complex of SFKs with CSK/CHK in which the SFKs are inactive. The second step involves the phosphorylation of the C-terminal tail tyrosine of SFKs, which then dissociate and adopt an inactive conformation. The structural basis of how the phosphorylated SFKs dissociate from CSK/CHK to adopt the inactive conformation is not known. The inactive conformation of SFKs is stabilized by two intramolecular inhibitory interactions: (a) the pYT:SH2 interaction in which the phosphorylated C-terminal tail tyrosine (YT) binds to the SH2 domain, and (b) the linker:SH3 interaction in which the SH2-kinase domain linker binds to the SH3 domain. SFKs are activated by multiple mechanisms including binding of the ligands to the SH2 and SH3 domains to displace the two inhibitory intramolecular interactions, autophosphorylation, and dephosphorylation of YT. By selective phosphorylation and the non-catalytic inhibitory mechanism CSK and CHK are able to inhibit the active forms of SFKs. CSK and CHK are regulated by phosphorylation and inter-domain interactions. They both contain SH3, SH2, and kinase domains separated by the SH3-SH2 connector and SH2-kinase linker, intervening segments separating the three domains. 
They lack a conserved tyrosine phosphorylation site in the kinase domain and the C-terminal tail regulatory tyrosine phosphorylation site. The CSK SH2 domain is crucial for stabilizing the kinase domain in the active conformation. A disulfide bond here regulates CSK kinase activity. The subcellular localization and activity of CSK are regulated by its SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28668 198191 cd09938 SH2_N-SH2_Zap70_Syk_like N-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70) and Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains. In Syk the two SH2 domains do not form such a phosphotyrosine-binding site. The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the N-terminus SH2 domains of both Syk and Zap70. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28669 198192 cd09939 SH2_STAP_family Src homology 2 domain found in Signal-transducing adaptor protein (STAP) family. STAP1 and STAP2 are signal-transducing adaptor proteins. They are composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. STAP-1 is an ortholog of BRDG1 (BCR downstream signaling 1). STAP1 protein functions as a docking protein acting downstream of Tec tyrosine kinase in B cell antigen receptor signaling. The protein is phosphorylated by Tec and participates in a positive feedback loop, increasing Tec activity. STAP1 has been shown to interact with C19orf2, an unconventional prefoldin RPB5 interactor. The STAP2 protein is the substrate of breast tumor kinase, an Src-type non-receptor tyrosine kinase that mediates the interactions linking proteins involved in signal transduction pathways. STAP2 has alternative splicing variants. STAP2 has been shown to interact with tyrosine-protein kinase 6 (PTK6). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 94
28670 198193 cd09940 SH2_Vav_family Src homology 2 (SH2) domain found in the Vav family. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and an SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins with proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. There are three Vav mammalian family members: Vav1, which is expressed in the hematopoietic system, and Vav2 and Vav3, which are more ubiquitously expressed. The members here include insect and amphibian Vavs. In general SH2 domains are involved in signal transduction. 
They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28671 199828 cd09941 SH2_Grb2_like Src homology 2 domain found in Growth factor receptor-bound protein 2 (Grb2) and similar proteins. The adaptor proteins here include homologs Grb2 in humans, Sex muscle abnormal protein 5 (Sem-5) in Caenorhabditis elegans, and Downstream of receptor kinase (drk) in Drosophila melanogaster. They are composed of one SH2 and two SH3 domains. Grb2/Sem-5/drk regulates the Ras pathway by linking the tyrosine kinases to the Ras guanine nucleotide releasing protein Sos, which converts Ras to the active GTP-bound state. The SH2 domain of Grb2/Sem-5/drk binds class II phosphotyrosyl peptides while its SH3 domain binds to Sos and Sos-derived, proline-rich peptides. Besides its function in Ras signaling, Grb2 is also thought to play a role in apoptosis. Unlike most SH2 structures in which the peptide binds in an extended conformation (such that the +3 peptide residue occupies a hydrophobic pocket in the protein, conferring a modest degree of selectivity), Grb2 forms several hydrogen bonds via main chain atoms with the side chain of +2 Asn. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 95
28672 198195 cd09942 SH2_nSH2_p85_like N-terminal Src homology 2 (nSH2) domain found in p85. Phosphoinositide 3-kinases (PI3Ks) are essential for cell growth, migration, and survival. p110, the catalytic subunit, is composed of an adaptor-binding domain, a Ras-binding domain, a C2 domain, a helical domain, and a kinase domain. The regulatory unit is called p85 and is composed of an SH3 domain, a RhoGap domain, an N-terminal SH2 (nSH2) domain, an internal SH2 (iSH2) domain, and a C-terminal SH2 (cSH2) domain. There are 2 inhibitory interactions between p110alpha and p85 of PI3K: (1) p85 nSH2 domain with the C2, helical, and kinase domains of p110alpha and (2) p85 iSH2 domain with C2 domain of p110alpha. There are 3 inhibitory interactions between p110beta and p85 of PI3K: (1) p85 nSH2 domain with the C2, helical, and kinase domains of p110beta, (2) p85 iSH2 domain with C2 domain of p110alpha, and (3) p85 cSH2 domain with the kinase domain of p110alpha. It is interesting to note that p110beta is oncogenic as a wild type protein while p110alpha lacks this ability. One explanation is the idea that the regulation of p110beta by p85 is unique because of the addition of inhibitory contacts from the cSH2 domain and the loss of contacts in the iSH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 110
28673 198196 cd09943 SH2_Nck_family Src homology 2 (SH2) domain found in the Nck family. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates. There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)). They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets. Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus. Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 93
28674 198197 cd09944 SH2_Grb7_family Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 7 (Grb7) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. There are 3 members of the Grb7 family of proteins: Grb7, Grb10, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 each preferentially bind to a different RTK. Grb7 binds strongly to the erbB2 receptor, unlike Grb10 and Grb14 which bind weakly to it. Grb14 binds to Fibroblast Growth Factor Receptor (FGFR). Grb10 has been shown to interact with many different proteins, including the insulin and IGF1 receptors, platelet-derived growth factor (PDGF) receptor-beta, Ret, Kit, Raf1 and MEK1, and Nedd4. Grb7 family proteins are phosphorylated on serine/threonine as well as tyrosine residues. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 108
28675 198198 cd09945 SH2_SHB_SHD_SHE_SHF_like Src homology 2 domain found in SH2 domain-containing adapter proteins B, D, E, and F (SHB, SHD, SHE, SHF). SHB, SHD, SHE, and SHF are SH2 domain-containing proteins that play various roles throughout the cell. SHB functions in generating signaling compounds in response to tyrosine kinase activation. SHB contains proline-rich motifs, a phosphotyrosine binding (PTB) domain, tyrosine phosphorylation sites, and an SH2 domain. SHB mediates certain aspects of platelet-derived growth factor (PDGF) receptor-, fibroblast growth factor (FGF) receptor-, neural growth factor (NGF) receptor TRKA-, T cell receptor-, interleukin-2 (IL-2) receptor- and focal adhesion kinase- (FAK) signaling. SRC-like FYN-Related Kinase FRK/RAK (also named BSK/IYK or GTK) and SHB regulate apoptosis, proliferation and differentiation. SHB promotes apoptosis and is also required for proper mitogenicity, spreading and tubular morphogenesis in endothelial cells. SHB also plays a role in preventing early cavitation of embryoid bodies and reduces differentiation to cells expressing albumin, amylase, insulin and glucagon. SHB is a multifunctional protein that has different responses in different cells under various conditions. SHE is expressed in heart, lung, brain, and skeletal muscle, while expression of SHD is restricted to the brain. SHF is mainly expressed in skeletal muscle, brain, liver, prostate, testis, ovary, small intestine, and colon. SHD may be a physiological substrate of c-Abl and may function as an adapter protein in the central nervous system. It is also thought to be involved in apoptotic regulation. SHD contains five YXXP motifs, a substrate sequence preferred by Abl tyrosine kinases, in addition to a poly-proline rich region and a C-terminal SH2 domain. 
SHE contains two pTyr protein-binding domains, a protein interaction domain (PID) and an SH2 domain, followed by a glycine-proline rich region, all of which are N-terminal to the phosphotyrosine binding (PTB) domain. SHF contains four putative tyrosine phosphorylation sites and an SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28676 198199 cd09946 SH2_HSH2_like Src homology 2 domain found in hematopoietic SH2 (HSH2) protein. HSH2 is thought to function as an adapter protein involved in tyrosine kinase signaling. It may also be involved in regulating cytokine signaling and cytoskeletal reorganization in hematopoietic cells. HSH2 contains several putative protein-binding motifs, SH3-binding proline-rich regions, and phosphotyrosine sites, but lacks enzymatic motifs. HSH2 was found to interact with cytokine-regulated tyrosine kinase c-FES and an activated Cdc42-associated tyrosine kinase ACK1. HSH2 binds c-FES through both its C-terminal region and its N-terminal region including the SH2 domain and binds ACK1 via its N-terminal proline-rich region. Both kinases bound and tyrosine-phosphorylated HSH2 in mammalian cells. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28677 197370 cd09947 Ebola_HIV-1-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and human immunodeficiency virus type 1 (HIV-1). This domain superfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Rous sarcoma virus gp37, human immunodeficiency virus type 1 (HIV-1) gp41, and the envelope proteins of various ERVs. In the HR1-HR2 region of Ebola virus and RSV, the linker region between the two repeats includes a CKS17-like immunosuppressive region and a CX6C motif that forms an intra-subunit disulfide bond; MMTV, HIV-1, HERV-K endogenous retroviruses and related sequences lack a canonical CKS17-like sequence and CX6C motif. N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as Jaagsiekte sheep retrovirus (JSRV), feline leukemia virus (FeLV), and avian leukemia virus (ALV). Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. 
Human ERVs (HERVs) belonging to this superfamily include Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), and Syncytin-2 (HERV-FRD_6p24.1) which are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. Syncytin-2, but not Syncytin-1, is immunosuppressive; its immunosuppressive domain may protect the fetus from the mother's immune system. Syncytin-1 may participate in the formation of the placental trophoblast; it is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This superfamily also contains human HERV-R_c7q21.2 (ERV-3), which is also expressed in the placenta, but is not fusogenic, and has an immunosuppressive domain, but lacks a fusion peptide. It is unclear whether ERV-3 has a critical biological role. Included in this superfamily are ERVs from domestic sheep that are related to JSRV, the agent of transmissible lung cancer in sheep; for example, enJSRV-26 that retains an intact genome. These endogenous JSRVs protect the sheep against JSRV infection and are required for sheep placental development. 73
28678 197371 cd09948 Ebola_RSV-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and Rous sarcoma virus. This domain family spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Rous sarcoma virus gp37, and the envelope proteins of various ERVs. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intra-subunit disulfide bond, and a C-terminal heptad repeat. N-terminal to the HR1-HR2 region is a fusion peptide (FP), while C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as Jaagsiekte sheep retrovirus (JSRV), feline leukemia virus (FeLV), and avian leukemia virus (ALV). Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Human ERVs (HERVs) belonging to this family include Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), and Syncytin-2 (HERV-FRD_6p24.1) which are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. Syncytin-2, but not Syncytin-1, is immunosuppressive. Its immunosuppressive domain may protect the fetus from the mother's immune system. 
Syncytin-1 may participate in the formation of the placental trophoblast. It is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This family also contains human HERV-R_c7q21.2 (ERV-3), which is also expressed in the placenta, but is not fusogenic, and has an immunosuppressive domain, but lacks a fusion peptide. It is unclear whether ERV-3 has a critical biological role. 72
28679 197372 cd09949 RSV-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of Rous sarcoma virus (RSV), and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Rous sarcoma virus gp37, Avian leukosis virus subgroup J (ALV-J) envelope protein, and the envelope proteins of various ERVs, including those belonging to the ev/J (or EAV-HP) family of chicken ERVs, such as ev/J 4.1 Rb. ALV-J is a recently emerged avian pathogen, the causative agent of myeloid leukosis in meat-type chicken. ERVs are likely to originate from ancient germ-line infections by active retroviruses. ALV-J may have emerged from a recombination event between an unknown ALV and an EAV-HP ERV. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. 72
28680 197373 cd09950 ENVV1-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of the human endogenous retrovirus ENVV1, and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs), including chicken FET-1 (Female Expressed Transcript 1) protein, and the envelope proteins of the human ERVs (HERVs): ENVV1 (also known as HERV-V2_c19q13.41) and ENVV2 (also known as HERV-V1_c19q13.41 ). This domain belongs to a larger superfamily containing the HR1-HR2 domain of endogenous retroviruses (ERVs) and infectious retroviruses, such as Ebola virus, Rous sarcoma virus and human immunodeficiency virus type 1. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intra-subunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. FET-1 may have an ovary-determining role. The FET-1 gene is located on the female specific W chromosome in chickens. During the sex-determining period, the FET-1 transcript is up-regulated in the cortex of the left gonad (the only gonad which develops in female chickens); it is also expressed at a lower level, in neural tissue and waste collection ducts. 
The genes encoding ENVV1 and ENVV2 proteins are located in tandem on chromosome 19q13.41, and show placenta-specific expression in human and baboon. 72
28681 197374 cd09951 HERV-Rb-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of the human endogenous retrovirus HERV-R(b)_c3p24.3 and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) including the human ERVs (HERVs): HERV-R(b)_c3p24.3 and Syncytin-3 (also known as HERV-P(b)_c14q32.12). This domain belongs to a larger superfamily containing the HR1-HR2 domain of endogenous retroviruses (ERVs) and infectious retroviruses, such as Ebola virus, Rous sarcoma virus (RSV) and human immunodeficiency virus type 1 (HIV-1). This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. In intact retroviruses, N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Syncytin-3 is fusogenic, whereas HERV-R(b)_c3p24.3 appears not to have fusogenic activity. 81
28682 197375 cd09966 UP_III_II Uroplakin IIIb, IIIa and II. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis. 181
28683 197376 cd09967 UP_II Uroplakin II. Uroplakin II, the dimerization partner of uroplakin Ia, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis. 165
28684 197377 cd09968 UP_III Uroplakin III. Uroplakin IIIa and IIIb, the dimerization partners of uroplakin Ib, are members of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis. 187
28685 197378 cd09969 UP_IIIb Uroplakin IIIb. Uroplakin IIIb, the minor isoform of the dimerization partner of uroplakin Ib, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis. 184
28686 197379 cd09970 UP_IIIa Uroplakin IIIa. Uroplakin IIIa, the major isoform of the dimerization partner of uroplakin Ib, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis. 212
28687 197380 cd09971 SdiA-regulated SdiA-regulated. This model represents a bacterial family of proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. The C-terminal domain included in the alignment forms a five-bladed beta-propeller structure. The X-ray structure of Escherichia coli yjiK (C-terminal domain) exhibits binding of calcium ions (Ca++) in what appears to be an evolutionarily conserved site. Sequence analysis suggests a distant relationship to proteins that are characterized as containing NHL-repeats. The latter also form beta-propeller structures, with several examples known to form six-bladed beta-propellers. Several of the six-bladed beta-propellers containing NHL repeats have been characterized functionally, including members with enzymatic functions that are dependent on metal ions. No functional characterization is available for this family of five-bladed propellers, though. 242
28688 193586 cd09972 LOTUS_TDRD_OSKAR The first LOTUS domain in Oskar and Tudor-containing proteins 5 and 7. The first LOTUS domain in Oskar and Tudor-containing proteins 5 and 7: The LOTUS containing proteins are germline-specific and are found in the nuage/polar granules of germ cells. Tudor-containing protein 5 and 7 belong to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 and TDRD7 are components of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. Oskar protein is a critical component of the pole plasm in the Drosophila oocyte, which is required for germ cell formation. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 87
28689 193587 cd09973 LOTUS_2_TDRD7 The second LOTUS domain on Tudor-containing protein 7 (TDRD7). The second LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 68
28690 193588 cd09974 LOTUS_3_TDRD7 The third LOTUS domain on Tudor-containing protein 7 (TDRD7). The third LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 67
28691 193589 cd09975 LOTUS_2_TDRD5 The second LOTUS domain on Tudor-containing protein 5 (TDRD5). The second LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of LOTUS domain on TDRD5 remains to be discovered. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 70
28692 193590 cd09976 LOTUS_3_TDRD5 The third LOTUS domain on Tudor-containing protein 5 (TDRD5). The third LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of LOTUS domain on TDRD5 remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 74
28693 193591 cd09977 LOTUS_1_Limkain_b1 The first LOTUS domain on Limkain b1(LKAP). The first LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 62
28694 193592 cd09978 LOTUS_2_Limkain_b1 The second LOTUS domain on Limkain b1(LKAP). The second LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 71
28695 193593 cd09979 LOTUS_3_Limkain_b1 The third LOTUS domain on Limkain b1(LKAP). The third LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 72
28696 193594 cd09980 LOTUS_4_Limkain_b1 The fourth LOTUS domain on Limkain b1(LKAP). The fourth LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 72
28697 193595 cd09981 LOTUS_5_Limkain_b1 The fifth LOTUS domain on Limkain b1(LKAP). The fifth LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 71
28698 193596 cd09982 LOTUS_6_Limkain_b1 The sixth LOTUS domain on Limkain b1(LKAP). The sixth LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 71
28699 193597 cd09983 LOTUS_7_Limkain_b1 The seventh LOTUS domain on Limkain b1(LKAP). The seventh LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 73
28700 193598 cd09984 LOTUS_8_Limkain_b1 The eighth LOTUS domain on Limkain b1(LKAP). The eighth LOTUS domain on Limkain b1(LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 76
28701 193599 cd09985 LOTUS_1_TDRD5 The first LOTUS domain on Tudor-containing protein 5 (TDRD5). The first LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of LOTUS domain on TDRD5 remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 95
28702 193600 cd09986 LOTUS_1_TDRD7 The first LOTUS domain on Tudor-containing protein 7 (TDRD7). The first LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization. 88
28703 212513 cd09987 Arginase_HDAC Arginase-like and histone-like hydrolases. Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bonds. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs. 217
28704 212514 cd09988 Formimidoylglutamase Formimidoylglutamase or HutE. Formimidoylglutamase (N-formimidoyl-L-glutamate formimidoylhydrolase; formiminoglutamase; N-formiminoglutamate hydrolase; N-formimino-L-glutamate formiminohydrolase; HutE; EC 3.5.3.8) is a metalloenzyme that catalyzes hydrolysis of N-formimidoyl-L-glutamate to L-glutamate and formamide. This enzyme is involved in histidine degradation, requiring Mn as a cofactor while glutathione may be required for maximal activity. In Pseudomonas PAO1, mutation studies show that histidine degradation proceeds via a 'four-step' pathway if the 'five-step' route is absent and vice versa; in the four-step pathway, formiminoglutamase (HutE, EC 3.5.3.8) directly converts formiminoglutamate (FIGLU) to L-glutamate and formamide in a single step. Formiminoglutamase has traditionally also been referred to as HutG; however, formiminoglutamase is structurally and mechanistically unrelated to N-formyl-glutamate deformylase (also called HutG). Phylogenetic analysis has suggested that HutE was acquired by horizontal gene transfer from a Ralstonia-like ancestor. 262
28705 212515 cd09989 Arginase Arginase family. This family includes arginase, also known as arginase-like amidino hydrolase family, and related proteins. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide. In vertebrates, at least two isozymes have been identified: type I (ARG1) cytoplasmic or hepatic liver-type arginase and type II (ARG2) mitochondrial or non-hepatic arginase. Point mutations in human arginase ARG1 gene lead to hyperargininemia with consequent mental disorders, retarded development and early death. Hyperargininemia is associated with a several-fold increase in the activity of the mitochondrial arginase (ARG2), causing persistent ureagenesis in patients. ARG2 overexpression plays a critical role in the pathophysiology of cholesterol mediated endothelial dysfunction. Thus, arginase is a therapeutic target to treat asthma, erectile dysfunction, atherosclerosis and cancer. 290
28706 212516 cd09990 Agmatinase-like Agmatinase-like family. Agmatinase subfamily currently includes metalloenzymes such as agmatinase, guanidinobutyrase, guanidopropionase, formimidoylglutamase and proclavaminate amidinohydrolase. Agmatinase (agmatine ureohydrolase; SpeB; EC=3.5.3.11) is the key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield putrescine and urea. This enzyme has been found in bacteria, archaea and eukaryotes, requiring divalent Mn and sometimes Zn, Co or Ca for activity. In mammals, the highest level of agmatinase mRNA was found in liver and kidney. However, catabolism of agmatine via agmatinase is apparently not a major path; it is mostly catabolized via diamine oxidase. Agmatinase has been shown to be down-regulated in tumor renal cells. Guanidinobutyrase (Gbh, EC=3.5.3.7) catalyzes hydrolysis of 4-guanidinobutanoate to yield 4-aminobutanoate and urea in arginine degradation pathway. Activity has been shown for purified enzyme from Arthrobacter sp. KUJ 8602. Additionally, guanidinobutyrase is able to hydrolyze D-arginine, 3-guanidinopropionate, 5-guanidinovaleriate and L-arginine with much less affinity, having divalent Zn ions for catalysis. Proclavaminate amidinohydrolase (Pah, EC 3.5.3.22) hydrolyzes amidinoproclavaminate to yield proclavaminate and urea in clavulanic acid biosynthesis. Activity has been shown for purified enzyme from Streptomyces clavuligerus. Clavulanic acid is an effective inhibitor of beta-lactamases. This acid is used in combination with the penicillin amoxicillin to protect the antibiotic's beta-lactam ring from hydrolysis, thus keeping the antibiotic biologically active. 275
28707 212517 cd09991 HDAC_classI Class I histone deacetylases. Class I histone deacetylases (HDACs) are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. This group includes animal HDAC1, HDAC2, HDAC3, HDAC8, fungal RPD3, HOS1 and HOS2, plant HDA9, protist, archaeal and bacterial (AcuC) deacetylases. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase. In mammals, they are known to be involved in progression of various tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs. 306
28708 212518 cd09992 HDAC_classII Histone deacetylases and histone-like deacetylases, classII. Class II histone deacetylases are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. This group includes animal HDAC4,5,6,7,8,9,10, fungal HOS3 and HDA1, plant HDA5 and HDA15 as well as other eukaryotes, archaeal and bacterial histone-like deacetylases. Eukaryotic deacetylases mostly use histones (H2, H3, H4) as substrates for deacetylation; however, non-histone substrates are known (for example, tubulin). Substrates for prokaryotic histone-like deacetylases are not known. Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Interaction partners of class II deacetylases include 14-3-3 proteins, MEF2 family of transcriptional factors, CtBP, calmodulin (CaM), SMRT, N-CoR, BCL6, HP1alpha and SUMO. Histone deacetylases play a role in the regulation of cell cycle, cell differentiation and survival. Class II mammalian HDACs are differentially inhibited by structurally diverse compounds with known antitumor activities, thus presenting them as potential drug targets for human diseases resulting from aberrant acetylation. 291
28709 212519 cd09993 HDAC_classIV Histone deacetylase class IV also known as histone deacetylase 11. Class IV histone deacetylases (HDAC11; EC 3.5.1.98) are predicted Zn-dependent enzymes. This class includes animal HDAC11, plant HDA2 and related bacterial deacetylases. Enzymes in this subfamily participate in regulation of a number of different processes through protein modification (deacetylation). They catalyze hydrolysis of N(6)-acetyl-lysine of histones (or other proteins) to yield deacetylated proteins. Histone deacetylases often act as members of large multi-protein complexes such as mSin3A or SMRT/N-CoR. Human HDAC11 does not associate with them but can interact with HDAC6 in vivo. It has been suggested that HDAC11 and HDAC6 may use non-histone proteins as their substrates and play a role other than to directly modulate chromatin structure. In normal tissues, expression of HDAC11 is limited to kidney, heart, brain, skeletal muscle and testis, suggesting that its function might be tissue-specific. In mammals, HDAC11 proteins are known to be involved in progression of various tumors. HDAC11 plays an essential role in regulating OX40 ligand (OX40L) expression in Hodgkin lymphoma (HL); selective inhibition of HDAC11 expression significantly up-regulates OX40L and induces apoptosis in HL cell lines. Thus, inhibition of HDAC11 could be a therapeutic drug option for antitumor immune response in HL patients. 275
28710 212520 cd09994 HDAC_AcuC_like Class I histone deacetylase AcuC (Acetoin utilization protein)-like enzymes. AcuC (Acetoin utilization protein) is a class I deacetylase found only in bacteria and is involved in post-translational control of the acetyl-coenzyme A synthetase (AcsA). Deacetylase AcuC works in coordination with deacetylase SrtN (class III), possibly to maintain AcsA in active (deacetylated) form and let the cell grow under low concentration of acetate. B. subtilis AcuC is a member of operon acuABC; this operon is repressed by the presence of glucose and does not show induction by acetoin; acetoin is a bacterial fermentation product that can be converted to acetate via the butanediol cycle in absence of other carbon sources. Inactivation of AcuC leads to slower growth and lower cell yield under low-acetate conditions in Bacillus subtilis. In general, Class I histone deacetylases (HDACs) are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase. 313
28711 212521 cd09996 HDAC_classII_1 Histone deacetylases and histone-like deacetylases, classII. This subfamily includes bacterial as well as eukaryotic Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. Included in this family is a bacterial HDAC-like amidohydrolase (Bordetella/Alcaligenes species FB18817, denoted as FB188 HDAH) shown to be most similar in sequence and function to class II HDAC6 domain 3 or b (HDAC6b). FB188 HDAH is able to remove the acetyl moiety from acetylated histones, and can be inhibited by common HDAC inhibitors such as SAHA (suberoylanilide hydroxamic acid) as well as class II-specific but not class I specific inhibitors. 359
28712 212522 cd09998 HDAC_Hos3 Class II histone deacetylases Hos3 and related proteins. Fungal histone deacetylase Hos3 from Saccharomyces cerevisiae is a Zn-dependent enzyme belonging to HDAC class II. It catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Hos3 deacetylase is a homodimer; in vitro, it shows specificity for H4, H3 and H2A. 353
28713 212523 cd09999 Arginase-like_1 Arginase-like amidino hydrolase family. This family includes arginase, also known as arginase-like amidino hydrolase family, as well as arginase-like proteins found in bacteria, archaea and eukaryotes, but does not include metazoan arginases. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide. 272
28714 212524 cd10000 HDAC8 Histone deacetylase 8 (HDAC8). HDAC8 is a Zn-dependent class I histone deacetylase that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. HDAC8 is found in human cytoskeleton-bound protein fraction and insoluble cell pellets. It plays a crucial role in intramembranous bone formation; germline deletion of HDAC8 is detrimental to skull bone formation. HDAC8 is possibly associated with the smooth muscle actin cytoskeleton and may regulate the contractile capacity of smooth muscle cells. HDAC8 is also involved in the metabolic control of the estrogen receptor related receptor (ERR)-alpha/peroxisome proliferator activated receptor (PPAR) gamma coactivator 1 alpha (PGC1-alpha) transcriptional complex as well as in the development of neuroblastoma and T-cell lymphoma. HDAC8-selective small-molecule inhibitors could be a therapeutic drug option for these diseases. 364
28715 212525 cd10001 HDAC_classII_APAH Histone deacetylase class IIa. This subfamily includes bacterial acetylpolyamine amidohydrolase (APAH) as well as other Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. Mycoplana ramosa APAH exhibits broad substrate specificity and catalyzes the deacetylation of polyamines such as putrescine, spermidine, and spermine by cleavage of a non-peptide amide bond. 298
28716 212526 cd10002 HDAC10_HDAC6-dom1 Histone deacetylase 6, domain 1 and histone deacetylase 10. Histone deacetylases 6 and 10 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. HDAC10 has an N-terminal deacetylase domain and a C-terminal pseudo-repeat that shares significant similarity with its catalytic domain. It is located in the nucleus and cytoplasm, and is involved in regulation of melanogenesis. It transcriptionally down-regulates thioredoxin-interacting protein (TXNIP), leading to altered reactive oxygen species (ROS) signaling in human gastric cancer cells. Known interaction partners of HDAC6 are alpha tubulin (substrate) and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD) while interaction partners of HDAC10 are Pax3, KAP1, hsc70 and HDAC3 proteins. 336
28717 212527 cd10003 HDAC6-dom2 Histone deacetylase 6, domain 2. Histone deacetylase 6 is a class IIb Zn-dependent enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. Known interaction partners of HDAC6 are alpha tubulin and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD). 350
28718 212528 cd10004 RPD3-like reduced potassium dependency-3 (RPD3)-like. Proteins of the Rpd3-like family are class I Zn-dependent Histone deacetylases that catalyze hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). RPD3 is the yeast homolog of class I HDACs. The main function of RPD3-like group members is regulation of a number of different processes through protein (mostly different histones) modification (deacetylation). This group includes fungal RPD3 and acts via the formation of large multiprotein complexes. Members of this group are involved in cell cycle regulation, DNA damage response, embryonic development and cytokine signaling important for immune response. Histone deacetylation by yeast RPD3 represses genes regulated by the Ash1 and Ume6 DNA-binding proteins. In mammals, they are known to be involved in progression of various tumors. Specific inhibitors of mammalian histone deacetylases could be a therapeutic drug option. 375
28719 212529 cd10005 HDAC3 Histone deacetylase 3 (HDAC3). HDAC3 is a Zn-dependent class I histone deacetylase that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. In order to target specific chromatin regions, HDAC3 can interact with DNA-binding proteins (transcriptional factors) either directly or after forming complexes with a number of other proteins, as observed for the SMRT/N-CoR complex which recruits human HDAC3 to specific chromatin loci and activates deacetylation. Human HDAC3 is also involved in deacetylation of non-histone substrates such as RelA, SPY and p53 factors. This protein can also down-regulate p53 function and subsequently modulate cell growth and apoptosis. This gene is therefore regarded as a potential tumor suppressor gene. HDAC3 plays a role in various physiological processes, including subcellular protein localization, cell cycle progression, cell differentiation, apoptosis and survival. HDAC3 has been found to be overexpressed in some tumors including leukemia, lung carcinoma, colon cancer and maxillary carcinoma. Thus, inhibitors precisely targeting HDAC3 (in some cases together with retinoic acid or hyperthermia) could be a therapeutic drug option. 381
28720 212530 cd10006 HDAC4 Histone deacetylase 4. Histone deacetylase 4 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC4 participates in regulation of chondrocyte hypertrophy and skeletogenesis. However, biological substrates for HDAC4 have not been identified; only low lysine deacetylation activity has been demonstrated and active site mutant has enhanced activity toward acetylated lysines. HDAC4 does not bind DNA directly, but through transcription factors MEF2C (myocyte enhancer factor-2C) and MEF2D. Other known interaction partners of the protein are 14-3-3 proteins, SMRT and N-CoR co-repressors, BCL6, HP1, SUMO-1 ubiquitin-like protein, and ANKRA2. It appears to interact in a multiprotein complex with RbAp48 and HDAC3. Furthermore, HDAC4 is required for TGFbeta1-induced myofibroblastic differentiation. 409
28721 212531 cd10007 HDAC5 Histone deacetylase 5. Histone deacetylase 5 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC5 is involved in integration of chronic drug (cocaine) addiction and depression with changes in chromatin structure and gene expression; cocaine regulates HDAC5 function to antagonize the rewarding impact of cocaine, possibly by blocking drug-stimulated gene expression that supports drug-induced behavioral change. It is also involved in regulation of angiogenesis and cell cycle as well as immune system development. HDAC5 and HDAC9 have been found to be significantly up-regulated in high-risk medulloblastoma compared with low-risk and may potentially be novel drug targets. 420
28722 212532 cd10008 HDAC7 Histone deacetylase 7. Histone deacetylase 7 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC7 is involved in regulation of myocyte migration and differentiation. Known interaction partners of class IIa HDAC7 are myocyte enhancer factors - MEF2A, -2C, and -2D, 14-3-3 proteins, SMRT and N-CoR co-repressors, HDAC3, ETA (endothelin receptor). This enzyme is also involved in the development of the immune system as well as brain and heart development. Multiple alternatively spliced transcript variants encoding several isoforms have been found for this gene. 378
28723 212533 cd10009 HDAC9 Histone deacetylase 9. Histone deacetylase 9 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC9 is involved in regulation of gene expression and dendritic growth in developing cortical neurons. It also plays a role in hematopoiesis. Its deregulated expression may be associated with some human cancers. HDAC5 and HDAC9 have been found to be significantly up-regulated in high-risk medulloblastoma compared with low-risk and may potentially be novel drug targets. 379
28724 212534 cd10010 HDAC1 Histone deacetylase 1 (HDAC1). Histone deacetylase 1 (HDAC1) is a Zn-dependent class I enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDAC1 is involved in regulation through association with DNA binding proteins to target specific chromatin regions. In particular, HDAC1 appears to play a major role in pre-implantation embryogenesis in establishing a repressive chromatin state. Its interaction with retinoblastoma tumor-suppressor protein is essential in the control of cell proliferation and differentiation. Together with metastasis-associated protein-2 (MTA2), it deacetylates p53, thereby modulating its effect on cell growth and apoptosis. It participates in DNA-damage response, along with HDAC2; together, they promote DNA non-homologous end-joining. HDAC1 is also involved in tumorigenesis; its overexpression modulates cancer progression. Specific inhibitors of HDAC1 are currently used in cancer therapy. 371
28725 212535 cd10011 HDAC2 Histone deacetylase 2 (HDAC2). Histone deacetylase 2 (HDAC2) is a Zn-dependent class I enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDAC2 is involved in regulation through association with DNA binding proteins to target specific chromatin regions. It forms transcriptional repressor complexes by associating with several proteins, including the mammalian zinc-finger transcription factor YY1, thus playing an important role in transcriptional regulation, cell cycle progression and developmental events. Additionally, a few non-histone HDAC2 substrates have been found. HDAC2 plays a role in embryonic development and cytokine signaling important for immune response, and is over-expressed in several solid tumors including oral, prostate, ovarian, endometrial and gastric cancer. It participates in DNA-damage response, along with HDAC1; together, they can promote DNA non-homologous end-joining. HDAC2 is considered an important cancer prognostic marker. Inhibitors specifically targeting HDAC2 could be a therapeutic drug option. 366
28726 193609 cd10013 Cas3''_I CRISPR/Cas system-associated protein Cas3''. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; HD-like nuclease, specifically digesting double-stranded oligonucleotides and preferably cleaving at G:C pairs; signature gene for Type I 188
28727 199900 cd10014 TFIIA_gamma_C Gamma subunit of transcription initiation factor IIA, C-terminal domain. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The C-terminal domain of the gamma (TFIIA_gamma_C) subunit forms a beta-barrel structure together with TFIIA beta. 47
28728 197381 cd10015 BfiI_C_EcoRII_N_B3 DNA binding domains of BfiI, EcoRII and plant B3 proteins. This family contains the N-terminal DNA binding domain of type IIE restriction endonuclease EcoRII-like proteins, the C-terminal DNA binding domain of type IIS restriction endonuclease BfiI-like proteins and plant-specific B3 proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. EcoRII is specific for the 5'-CCWGG sequence (W stands for A or T). EcoRII consists of 2 domains, the C-terminal catalytic/dimerization domain (EcoRII-C), and the N-terminal effector DNA binding domain (EcoRII-N). BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. B3 proteins are a family of plant-specific transcription factors, involved in a great variety of processes, including seed development and auxin response. 109
28729 197382 cd10016 EcoRII_N N-terminal domain of type IIE restriction endonuclease EcoRII and similar proteins. N-terminal domain of type IIE restriction endonuclease EcoRII and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. EcoRII is specific for the 5'-CCWGG sequence (W stands for A or T). EcoRII consists of 2 domains, the C-terminal catalytic/dimerization domain (EcoRII-C), and the N-terminal effector DNA binding domain (EcoRII-N). To be catalytically active, EcoRII has to form a dimer. 142
28730 197383 cd10017 B3_DNA Plant-specific B3-DNA binding domain. The plant-specific B3 DNA binding domain superfamily includes the well-characterized auxin response factor (ARF) and the LAV (Leafy cotyledon2 [LEC2]-Abscisic acid insensitive3 [ABI3]-VAL) families, as well as the RAV (Related to ABI3 and VP1) and REM (REproductive Meristem) families. LEC2 and ABI3 have been shown to be involved in seed development, while other members of the LAV family seem to have a more general role, being expressed in many organs during plant development. Members of the ARF family bind to the auxin response element and, depending on the presence of an activation or repression domain, activate or repress transcription. RAV and REM families are less studied B3 protein families. 98
28731 197384 cd10018 BfiI_C C-terminal domain of type IIs restriction endonuclease BfiI and similar proteins. C-terminal domain of a novel type IIs restriction endonuclease BfiI and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. Unlike all other restriction enzymes known to date, BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. BfiI presumably evolved through domain fusion of a DNA recognition domain to the catalytic Nuc-like domain from the PLD superfamily. BfiI forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains. 157
28732 206756 cd10019 14-3-3_sigma 14-3-3 sigma, an isoform of 14-3-3 protein. 14-3-3 protein sigma isoform, also known as stratifin or human mammary epithelial marker (HME) 1, has been most directly linked to tumor development. In humans, it is expressed by the SFN gene, strictly in stratified squamous epithelial cells in response to DNA damage where it is transcriptionally induced in a p53-dependent manner, subsequently causing cell-cycle arrest at the G2/M checkpoint. Up-regulation and down-regulation of 14-3-3 sigma expression have both been described in tumors. For example, in human breast cancer, 14-3-3 sigma is predominantly down-regulated by CpG methylation, acting as both a tumor suppressor and a prognostic indicator, while in human scirrhous-type gastric carcinoma (SGC), it is up-regulated and may play an important role in SGC carcinogenesis and progression. Loss of 14-3-3 sigma expression sensitizes tumor cells to treatment with conventional cytostatic drugs, making this protein an attractive therapeutic target. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 242
28733 206757 cd10020 14-3-3_epsilon 14-3-3 epsilon, an isoform of 14-3-3 protein. 14-3-3 protein epsilon isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, epsilon polypeptide) is encoded by the YWHAE gene in humans and is involved in cancer cell survival and growth. It interacts with CDC25 phosphatases, RAF1 and IRS1 proteins, suggesting its role in diverse biochemical activities related to signal transduction, such as cell division and regulation of insulin sensitivity. Overexpression of 14-3-3 epsilon in primary hepatocellular carcinoma (HCC) tissues predicts a high risk of extrahepatic metastasis and worse survival, and is a potential therapeutic target. It has also been implicated in the pathogenesis of small cell lung cancer. 14-3-3 epsilon overexpression protects colorectal cancer and endothelial cells from oxidative stress-induced apoptosis, while its suppression by non-steroidal anti-inflammatory drugs induces cancer and endothelial cell death. Cellular levels of 14-3-3 epsilon could possibly serve as an important regulator of cell survival in response to oxidative stress and other death signals. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 230
28734 206758 cd10022 14-3-3_beta_zeta 14-3-3 beta and zeta isoforms of 14-3-3 protein. 14-3-3 protein beta and zeta isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, beta and zeta polypeptide) are encoded by the YWHAB gene and YWHAZ gene in humans. They have been linked to mitogenic signaling and the cell cycle machinery, and to cancer initiation and progression, respectively. The beta isoform has been shown to interact with RAF1 and CDC25 phosphatases and its overexpression is associated with invasion, migration, metastasis and proliferation of tumor cells and its elevated levels are correlated with tumor size, the number of lymph node metastases and a reduced survival rate. It is significantly overexpressed in lung cancer tissues, mutated chronic lymphocytic leukemia (M-CLL), gastric cancer tissues, aflatoxin B1-induced rat hepatocellular carcinoma K1 and K2 cells, as well as renal cell carcinoma cysts, and can potentially be used as a diagnostic and prognostic biomarker in the cancer. Numerous proteins involved in anti-apoptosis and tumor progression were also found to be differentially expressed in gastric cancer cells where 14-3-3 beta is overexpressed. 14-3-3 beta also interacts with human Dapper1 (hDpr1), a key negative regulator of Wnt signaling, via hDpr1 phosphorylation by protein kinase A, thus attenuating the ability of hDpr1 to promote Dishevelled (Dvl) degradation, and subsequently enhancing Wnt signaling. The zeta isoform is ubiquitously expressed and localized to most subcellular regions, including the cytoplasm, plasma membrane, mitochondria, and nucleus. Its overexpression and gene amplification in multiple cancers are correlated with poor prognosis and chemoresistance in cancer patients. 14-3-3 zeta has been identified as a biomarker with high sensitivity and specificity for diagnosis and prognosis in multiple tumor types, including hepatocellular carcinoma, head and neck cancer, indicating a potential clinical application for using 14-3-3 zeta in selecting treatment options and predicting cancer outcome. It also interacts with IRS1 protein, suggesting a role in regulating insulin sensitivity. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 229
28735 206759 cd10023 14-3-3_theta 14-3-3 theta/tau (theta in mice, tau in human), an isoform of 14-3-3 protein. 14-3-3 tau/theta (tau in humans, theta in mice) isoform (also known as tyrosine 3-monooxygenase/ tryptophan 5-monooxygenase activation protein, theta polypeptide) is encoded by the YWHAQ gene in humans and plays an important role in controlling apoptosis through interactions with ASK1, c-jun NH-terminal kinase, and p38 mitogen-activated protein kinase (MAPK). Its interaction with CDC25c regulates entry into the cell cycle and subsequent interaction with Bad prevents apoptosis. 14-3-3 theta protein expression is induced in patients with amyotrophic lateral sclerosis. 14-3-3 tau is often overexpressed in breast cancer, which is associated with the downregulation of p21, a p53 target gene, and thus leads to tamoxifen resistance in MCF7 breast cancer cells and shorter patient survival. Therefore, 14-3-3 tau may be a potential therapeutic target in breast cancer. Additionally, 14-3-3 theta mediates nucleocytoplasmic shuttling of the coronavirus nucleocapsid protein which causes severe acute respiratory syndrome. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 234
28736 206760 cd10024 14-3-3_gamma 14-3-3 gamma, an isoform of 14-3-3 protein. 14-3-3 gamma isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, gamma polypeptide) is encoded by the YWHAG gene in humans and is induced by growth factors in human vascular smooth muscle cells. It is also highly expressed in skeletal and heart muscles, suggesting an important role in muscle tissue. It has been shown to interact with RAF1 and protein kinase C, proteins involved in various signal transduction pathways. 14-3-3 gamma mediates Cdc25A proteolysis to block premature mitotic entry after DNA damage. 14-3-3 gamma mediates the interaction between Chk1 and Cdc25A; this complex has an essential function in Cdc25A phosphorylation and degradation to block premature mitotic entry after DNA damage. Increased expression of 14-3-3 gamma in lung cancer coincides with loss of functional p53, possibly in a cooperative manner promoting genomic instability. Also, during cell cycle, 14-3-3 gamma protects p21, a cyclin-dependent kinase inhibitor, from degradation mediated by the p53 suppressor MDMX, which may account for elevation of p21 levels independent of p53 and in response to DNA damage. Elevated expression of 14-3-3 gamma in human hepatocellular carcinoma predicts extrahepatic metastasis and worse survival, thus making this protein a candidate biomarker and a potential target for novel therapies against the disease. 246
28737 206761 cd10025 14-3-3_eta 14-3-3 eta, an isoform of 14-3-3 protein. 14-3-3 eta isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide) is expressed mainly in brain, and is involved in hypothalamic-pituitary-adrenocortical (HPA) axis regulation. In humans, it is encoded by the YWHAH gene, and is a positional and functional candidate for schizophrenia as well as bipolar disorder (BP). This gene contains a 7 bp repeat sequence in its 5' Untranslated Region (UTR), and early-onset schizophrenia has been associated with changes in the number of this repeat. 14-3-3 eta and gamma are found in the serum and synovial fluid of patients with joint inflammation. Specifically, 14-3-3 eta, which plays a regulatory role in chondrogenic differentiation, is significantly overexpressed in juvenile rheumatoid arthritis (JRA), a chronic inflammatory disease often associated with growth impairment. Overexpression of Gremlin 1, the bone morphogenetic protein antagonist, may play an oncogenic role in carcinomas of the uterine cervix, lung, ovary, kidney, breast, colon, pancreas, and sarcoma, since it functions by interaction with the 14-3-3 eta domain. Therefore, Gremlin 1 and its binding protein 14-3-3 eta could be appropriate targets for developing diagnostic and therapeutic strategies against human cancers. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 239
28738 206762 cd10026 14-3-3_plant Plant 14-3-3 protein domain. Plant 14-3-3 isoforms, similar to their highly conserved homologs in mammals, bind to phosphorylated target proteins to modulate their function. They have been implicated in a variety of physiological functions; in particular, abiotic and biotic stress responses, primary metabolism, as well as various aspects of plant growth and development. They function through the regulation of a diverse range of proteins including transcription factors, kinases, structural proteins, ion channels as well as pathogen defense-related proteins. The 14-3-3 proteins are affected transcriptionally as well as functionally by the environment of the plant, both intracellular and extracellular, thus playing a key role in the response to environmental stress, pathogens and light conditions. Plant 14-3-3 proteins have been divided into epsilon-like groups and non-epsilon groups based on phylogenetic clustering. They have a varying number of isoforms (for example, Arabidopsis has thirteen known protein isoforms, cotton has six) with variation in their affinity for specific binding partners, suggesting specific roles in specific processes. 237
28739 381678 cd10027 UDG-F1-like Uracil DNA glycosylase family 1 subfamily, includes Human uracil DNA glycosylase and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 200
28740 381679 cd10028 UDG-F2_TDG_MUG Uracil DNA glycosylase family 2, includes thymine DNA glycosylase, mismatch-specific uracil DNA glycosylase and similar proteins. Uracil DNA glycosylase family 2 consists of thymine DNA glycosylase (TDG), which removes uracil and thymine from G:U and G:T mismatches in double-stranded DNA. It includes mismatch-specific uracil DNA glycosylase (MUG), the prokaryotic homolog of TDG. Escherichia coli MUG is highly specific to G:U mismatches but also repairs G:T mismatches at high enzyme concentration. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 163
28741 381680 cd10030 UDG-F4_TTUDGA_SPO1dp_like Uracil DNA glycosylase family 4, includes Thermotoga maritima TTUDGA, Bacillus phage SPO1 DNA polymerase, and similar proteins. Uracil DNA glycosylase family 4 includes Thermotoga maritima TTUDGA, a robust uracil DNA glycosylase that shares narrow substrate specificity and high catalytic efficiency with family 1, acting on double-stranded and single-stranded uracil-containing DNA. Members of this family possess four conserved cysteine residues required to coordinate the [4Fe-4S] iron-sulfur cluster. This family also includes the N-terminal domain of Bacillus phage SPO1 DNA polymerase. Bacteriophage SPO1 is one of a group of large, lytic, tailed bacteriophages of Bacillus subtilis, and contains hydroxymethyluracil (hmUra) in place of thymine in their DNA. It has been speculated that this UDG domain may help discriminate between hmUra containing SPO1 DNA and thymine-containing host DNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 165
28742 381681 cd10031 UDG-F5_TTUDGB_like Uracil DNA glycosylase family 5, includes Thermotoga maritima TTUDGB and similar proteins. Uracil DNA glycosylase family 5 includes Thermus thermophilus HB8 TTUDGB (also called UDGb) which is not only a UDG acting on double-stranded uracil-containing DNA, but also a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA (except for the C/I base pair), as well as a xanthine DNA glycosylase acting on both double-stranded and single-stranded xanthine-containing DNA. TTUDGB also excises thymine from G:T mismatched DNA, and removes analogs of uracil from DNA, including 5-hydroxymethyluracil (hmU) and 5-fluorouracil (fU). This subfamily also contains Bradyrhizobium diazoefficiens family 5 homolog Blr5068 (UdgB) which has been found to efficiently excise uracil from ssDNA and dsDNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Similar to family 4 UDGs, members of this family possess four conserved cysteine residues required to coordinate the [4Fe-4S] iron-sulfur cluster. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 204
28743 381682 cd10032 UDG-F6_HDG Uracil DNA glycosylase family 6, includes hypoxanthine-DNA glycosylase and similar proteins. Uracil DNA glycosylase family 6 hypoxanthine-DNA glycosylase (HDG) lacks any detectable UDG activity; it excises hypoxanthine, a deamination product of adenine, from double-stranded DNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 141
28744 381683 cd10033 UDG_like uncharacterized family of the uracil-DNA glycosylase superfamily. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes Human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which, like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it. 171
28745 381684 cd10034 UDG_BdiUng_like Uracil DNA glycosylase family which includes Bradyrhizobium diazoefficiens Blr0248 (BdiUng) and similar proteins. Bradyrhizobium diazoefficiens (previously B. japonicum) Blr0248 uracil-DNA glycosylase (BdiUng) has broad substrate specificity, preferring single-stranded DNA and removing uracil, 5-hydroxymethyl-uracil or xanthine from it. BdiUng is impervious to inhibition by AP DNA, and Ugi protein that specifically inhibits conventional family 1 UDGs. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 181
28746 381685 cd10035 UDG_like uncharacterized family of the uracil-DNA glycosylase superfamily. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches, but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes Human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which, like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it. 143
28747 197344 cd10036 Reelin_subrepeat_Nt Additional N-terminal subrepeat of reelin. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. Some family members appear to have an additional subrepeat at the N-terminus as characterized in this model. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns. 151
28748 197345 cd10037 Reelin_repeat_1_subrepeat_1 N-terminal subrepeat of tandem repeat unit 1 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 146
28749 197346 cd10038 Reelin_repeat_2_subrepeat_1 N-terminal subrepeat of tandem repeat unit 2 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 168
28750 197347 cd10039 Reelin_repeat_3_subrepeat_1 N-terminal subrepeat of tandem repeat unit 3 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 170
28751 197348 cd10040 Reelin_repeat_4_subrepeat_1 N-terminal subrepeat of tandem repeat unit 4 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 170
28752 197349 cd10041 Reelin_repeat_5_subrepeat_1 N-terminal subrepeat of tandem repeat unit 5 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 174
28753 197350 cd10042 Reelin_repeat_6_subrepeat_1 N-terminal subrepeat of tandem repeat unit 6 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 157
28754 197351 cd10043 Reelin_repeat_7_subrepeat_1 N-terminal subrepeat of tandem repeat unit 7 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 171
28755 197352 cd10044 Reelin_repeat_8_subrepeat_1 N-terminal subrepeat of tandem repeat unit 8 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 176
28756 197353 cd10045 Reelin_repeat_1_subrepeat_2 C-terminal subrepeat of tandem repeat unit 1 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 155
28757 197354 cd10046 Reelin_repeat_2_subrepeat_2 C-terminal subrepeat of tandem repeat unit 2 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 156
28758 197355 cd10047 Reelin_repeat_3_subrepeat_2 C-terminal subrepeat of tandem repeat unit 3 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 151
28759 197356 cd10048 Reelin_repeat_4_subrepeat_2 C-terminal subrepeat of tandem repeat unit 4 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 150
28760 197357 cd10049 Reelin_repeat_5_subrepeat_2 C-terminal subrepeat of tandem repeat unit 5 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 150
28761 197358 cd10050 Reelin_repeat_6_subrepeat_2 C-terminal subrepeat of tandem repeat unit 6 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 148
28762 197359 cd10051 Reelin_repeat_7_subrepeat_2 C-terminal subrepeat of tandem repeat unit 7 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 162
28763 197360 cd10052 Reelin_repeat_8_subrepeat_2 C-terminal subrepeat of tandem repeat unit 8 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 161
28764 380782 cd10140 PFM_aerolysin_family pore-forming module of aerolysin-type beta-barrel pore-forming proteins. Pore-forming proteins (PFPs) are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta pore-forming proteins (beta-PFPs) form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). Members of this family include enterolobin, a cytolytic, inflammatory and insecticidal protein from the Brazilian tree Enterolobium contortisiliquum. 92
28765 381074 cd10141 CopZ-like_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of Archaeoglobus fulgidus CopZ, and similar proteins. Archaeoglobus fulgidus CopZ is a fusion of a redox-active domain (containing a mononuclear metal center and an [2Fe-2S] cluster) with a CXXC-containing copper-binding domain. It is a soluble Cu+ chaperone which delivers cytoplasmic Cu+ to the transmembrane metal-binding sites in the Cu+-ATPase CopA; CopA couples the hydrolysis of ATP to the efflux of cytoplasmic Cu+. In addition to CopZ, the BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), and the large subunit of NADH-dependent nitrite reductase. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 58
28766 408998 cd10142 HD_SAS6_N N-terminal head domain found in spindle assembly abnormal protein 6 and similar proteins. Spindle assembly abnormal protein 6 (SAS6) is a central scaffolding component of the centrioles that ensures their 9-fold symmetry. It is required for centrosome biogenesis and duplication, and is required for both mother-centriole-dependent centriole duplication and deuterosome-dependent centriole amplification in multiciliated cells. It is also required for the recruitment of microcephaly protein STIL to the procentriole and for STIL-mediated centriole amplification. SAS6 is comprised of an N-terminal globular head domain, a centrally located coiled-coil domain, and a disordered C-terminus. These monomers homodimerize symmetrically, through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. These homodimers can self-assemble into a 9-fold symmetric cartwheel structure comprised of nine SAS6 homodimers associated via their head domains; the dimerized coiled-coil domains being the spokes, the central hub being the head domains. This model corresponds to the N-terminal head domain of SAS6, which is structurally related to other XRCC4-superfamily members, XRCC4, PAXX, XLF and CCDC61. 137
28767 381748 cd10144 Peptidase_S74_CIMCD Peptidase S74 family, C-terminal intramolecular chaperone domain of Escherichia coli phage K1F endosialidase and related proteins. This peptidase S74 family includes C-terminal intramolecular chaperone domain (CIMCD) of Escherichia coli phage K1F endosialidase, Bacillus phage GA-1 neck appendage protein, and Bacteriophage T5 L-shaped tail fibre. This domain acts as a molecular chaperone; during virus particle assembly, the CIMCD of phage tailspike proteins induces the homo-trimerization of phage tailspike proteins by chaperoning the formation of a triple beta-helix. Homo-trimeric phage tailspike proteins are then auto-cleaved by the CIMCD domain. This family also includes the peptidase S74 Intramolecular Chaperone Auto-processing (ICA) domain of mammalian Myrf. The ICA domain drives the homo-oligomerization of Myrf in the endoplasmic reticulum (ER) membrane. The homo-oligomeric Myrf is proteolyzed by the ICA domain, releasing its N-terminal fragments from the ER membrane. 113
28768 199901 cd10145 TFIIA_gamma_N Gamma subunit of transcription initiation factor IIA, N-terminal helical domain. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The N-terminal domain of the gamma subunit forms a 4-helix bundle together with the alpha subunit. 49
28769 199214 cd10146 LabA_like_C C-terminal domain of LabA_like proteins. This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains). 69
28770 199902 cd10147 Wzt_C-like C-Terminal domain of O-antigenic polysaccharide transporter protein Wzt and related proteins. The Escherichia coli ABC protein Wzt consists of 2 domains, a conventional ABC domain that binds ATP and utilizes its energy to transport molecules across membranes, and a C terminal domain which is responsible for its target molecule specificity. Wzt is part of the ATP-binding-cassette (ABC) transporter complex, responsible for the transport of the O-antigenic polysaccharide (O-PS) portion of lipopolysaccharide (LPS), a major component of the outer membrane of Gram-negative bacteria. This CD includes Wzt proteins from two Escherichia coli serotypes O8 and O9a, WztO8 and WztO9a; these proteins are specific for their cognate polysaccharides (O8 or O9a O-PS). 144
28771 197385 cd10148 CsoR-like_DUF156 Transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this domain superfamily was previously known as DUF156. This superfamily includes various transcriptional regulators that respond to stressors including Cu(I), Ni(I), sulfite, and formaldehyde. It includes CsoR (copper-sensitive operon repressor) from Mycobacterium tuberculosis (MtCsoR), Bacillus subtilis (BsCsoR), Thermus thermophilus (TthCsoR), and Staphylococcus aureus (SaCsoR), Mycobacterium tuberculosis RicR (regulated in copper repressor, MtRicR), Escherichia coli RcnR (formerly known as YohL, nickel and cobalt-sensitive), Alcaligenes xylosoxidans NreA (nickel-sensitive), E. coli FrmR (formerly known as YaiN, formaldehyde sensitive), and Staphylococcus aureus CstR (CsoR-like sulfur transferase repressor, NWMN_0026.5, SaCstR). CsoR is Cu(I)-inducible, and regulates the expression of genes involved in copper homeostasis. For example, TthCsoR binds the promoter region of the copZ-csoR-copA operon, and represses expression of these genes, which encode the copper chaperone CopZ, CsoR, and the copper efflux P-type ATPase CopA, respectively. In the presence of excess Cu(I), TthCsoR binds this ion, and is released from the DNA, allowing expression of the downstream genes. TthCsoR also senses other metal ions such as Cu(II), Zn(II), Ag(I), Cd(II) and Ni(II). CsoRs form a homotetramer (dimer of dimers). In the case of MtCsoR, two Cys residues on opposite subunits within each dimer, along with a His residue, bind the Cu(I) ion. These residues are conserved in the majority of members of this superfamily.
Exceptions include the functionally uncharacterized Bacillus subtilis YrkD where there is an Asn instead of His (C-N-C), E. coli RcnR where there is a Thr instead of the second Cys (C-H-T), or TthCsoR and E. coli FrmR where there is a His instead of the second Cys and which have an additional N-terminal His (not found in those family members having C-H-C) that may also be involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue facilitate allosteric regulation of DNA binding. SaCstR regulates genes predicted to function in sulfur metabolism; it is thought that oxidation of the intersubunit Cys pair to a mixture of disulphide and trisulphide linkages by sulfite results in a reduced affinity of SaCstR for the operator DNA. SaCstR exists as a mixture of oligomeric states, including dimers, tetramers and octamers. The sequence of SaCstR was not available at the time this hierarchy was curated and therefore was not included. Escherichia coli RcnR represses expression of the gene encoding the nickel and cobalt-efflux protein RcnA. The gene encoding Alcaligenes xylosoxidans NreA is part of the nre nickel resistance locus located on the pTOM9 plasmid from this bacterium. Escherichia coli FrmR regulates the formaldehyde degradation frmRAB operon. 80
28772 197397 cd10149 ClassIIa_HDAC_Gln-rich-N Glutamine-rich N-terminal helical domain of various Class IIa histone deacetylases (HDAC4, HDAC5 and HDAC9). This superfamily consists of a glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing in HDAC7. It is referred to as the glutamine-rich domain, and confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors. This domain is able to repress transcription independently of the HDAC's C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues. 90
28773 199903 cd10150 CobN_like CobN subunit of cobaltochelatase, bchH and chlH subunits of magnesium chelatases, and similar proteins. Cobaltochelatase is a complex enzyme that catalyzes the insertion of cobalt into hydrogenobyrinic acid a,c-diamide, resulting in cobyrinic acid, as demonstrated for Pseudomonas denitrificans. This is an essential step in the bacterial synthesis of cobalamine (B12). The insertion of cobalt requires a complex composed of three polypeptides, cobN, cobS, and cobT. Also included in this family are protoporphyrin IX magnesium chelatases involved in the synthesis of chlorophyll and bacteriochlorophyll, specifically the large (chlH or bchH) subunits. They are thought to bind both the protoporphyrin and the magnesium ion. Hydrolysis of ATP by the smaller subunits in the complex may trigger a conformational change that results in the insertion of the ion into the protoporphyrin scaffold. Cryo electron microscopy studies have suggested that a distinct bchH C-terminal domain may bind tightly to the N-terminal domain upon substrate binding, requiring a substantial conformational change of the bchH subunit. It has also been suggested that chlH of higher plants binds abscisic acid via a C-terminal domain and plays a role in abscisic acid signaling, and that the protein spans the chloroplast envelope, with the C-terminus exposed to the cytosol. 910
28774 197386 cd10151 TthCsoR-like_DUF156 Thermus thermophilus CsoR, a Cu(I)-sensing transcriptional regulator, and related domains; this domain family was previously known as part of DUF156. This domain family contains various Cu(I)-inducible transcriptional regulators including CsoR (copper-sensitive operon repressor) from Mycobacterium tuberculosis (MtCsoR), and Thermus thermophilus (TthCsoR). CsoR regulates the expression of genes involved in copper homeostasis. For example, TthCsoR binds the promoter region of the copZ-csoR-copA operon, and represses expression of these genes, which encode the copper chaperone CopZ, CsoR, and the copper efflux P-type ATPase CopA, respectively. In the presence of excess Cu(I), TthCsoR binds this ion, and is released from the DNA, allowing expression of the downstream genes. TthCsoR also senses other metal ions such as Cu(II), Zn(II), Ag(I), Cd(II) and Ni(II). MtCsoR regulates an operon that includes CsoR and a putative copper transporter gene, ctpV (cation transporter P-type ATPase). CsoRs form a homotetramer (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in some but not all members of this family; for example, for TthCsoR, there is a His instead of the second Cys as well as an N-terminal His (not found in those family members having C-H-C) which may also be involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue facilitate allosteric regulation of DNA binding. 82
28775 197387 cd10152 SaCsoR-like_DUF156 Staphylococcus aureus copper-sensitive operon repressor (CsoR), and related domains; this family was previously known as part of DUF156. This domain family includes Staphylococcus aureus CsoR (SaCsoR). SaCsoR is Cu(I)-inducible, and regulates the expression of genes involved in copper homeostasis; it represses a genetically unlinked copA-copZ operon. copA encodes a copper efflux P-type ATPase, and copZ, a copper chaperone. This family belongs to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes Mycobacterium tuberculosis CsoR (MtCsoR), Bacillus subtilis CsoR, and Thermus thermophilus CsoR. The latter three proteins do not belong to this family. CsoRs form homotetramers (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also well conserved. 82
28776 197388 cd10153 RcnR-FrmR-like_DUF156 Transcriptional regulators RcnR and FrmR, and related domains; this domain family was previously known as part of DUF156. This domain family includes various transcriptional regulators that respond to different stressors. It includes Escherichia coli RcnR (formerly known as YohL, nickel and cobalt-sensitive), and E. coli FrmR (formerly known as YaiN, formaldehyde sensitive). Escherichia coli RcnR represses expression of the gene encoding the nickel and cobalt-efflux protein RcnA; RcnA may act through modulating NikR, to repress the NikABCDE nickel transporter. In vitro, purified RcnR binds to the rcnA promoter DNA fragment in the absence of Ni2+ or Co2+, and the affinity of RcnR for this promoter is reduced in the presence of excess nickel. Escherichia coli FrmR regulates the formaldehyde degradation frmRAB operon. This family belongs to a larger superfamily that includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, not all these residues are conserved; in E. coli RcnR and FrmR there is a His or a Thr instead of the second Cys (C-H-H or C-H-T) respectively. For E. coli FrmR, an N-terminal His residue, not conserved in all members of this family, is also involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved in this family. 88
28777 197389 cd10154 NreA-like_DUF156 Alcaligenes xylosoxidans NreA and related domains; this domain family was previously known as part of DUF156. This domain family includes Alcaligenes xylosoxidans NreA, Pseudomonas putida MreA, and related domains. The gene encoding Alcaligenes xylosoxidans NreA is part of the nre nickel resistance locus located on the pTOM9 plasmid from this bacterium; it confers low-level nickel resistance on both Ralstonia and Escherichia coli strains. The Pseudomonas putida MreA gene is found in association with a gene encoding mrdH, a heavy metal efflux transporter of broad specificity. MreA may have a role in cadmium and nickel resistance. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including members of this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved. 86
28778 197390 cd10155 BsYrkD-like_DUF156 Uncharacterized protein YrkD from Bacillus subtilis and related domains; this domain superfamily was previously known as part of DUF156. This domain family contains an uncharacterized protein YrkD from Bacillus subtilis and related proteins. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, not all these residues are conserved: there is an Asn instead of the His (C-N-C); also a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are very poorly conserved. 82
28779 197391 cd10156 FpFrmR-Cterm-like_DUF156 C-terminal domain of Faecalibacterium prausnitzii A2-165 FrmR, and related domains; this domain family was previously known as part of DUF156. This domain family contains the C-terminal domain of the functionally uncharacterized protein Faecalibacterium prausnitzii A2-165 FrmR, and related domains. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also conserved. 86
28780 197392 cd10157 BsCsoR-like_DUF156 Bacillus subtilis copper-sensitive operon repressor (BsCsoR), and related domains; this family was previously known as part of DUF156. This domain family includes Bacillus subtilis CsoR (BsCsoR). CsoRs are Cu(I)-inducible, and regulate the expression of genes involved in copper homeostasis. BsCsoR regulates the copZA operon which encodes the copper chaperone CopZ, and the copper efflux P-type ATPase CopA. This family belongs to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes Mycobacterium tuberculosis CsoR (MtCsoR), Thermus thermophilus CsoR, and Staphylococcus aureus CsoR. The latter three proteins do not belong to this family. CsoRs form homotetramers (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and the conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also well conserved. 85
28781 197393 cd10158 CsoR-like_DUF156_1 Uncharacterized family 1; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 1, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved. 81
28782 197394 cd10159 CsoR-like_DUF156_2 Uncharacterized family 2; belongs to a superfamily containing transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 2, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also conserved. 82
28783 197395 cd10160 CsoR-like_DUF156_3 Uncharacterized family 3; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 3, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are not conserved. 85
28784 197396 cd10161 CsoR-like_DUF156_4 Uncharacterized family 4; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 4, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, only one of these residues is conserved (the first Cys); and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also not conserved. 82
28785 197398 cd10162 ClassIIa_HDAC4_Gln-rich-N Glutamine-rich N-terminal helical domain of HDAC4, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 4 (HDAC4). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues. 90
28786 197399 cd10163 ClassIIa_HDAC9_Gln-rich-N Glutamine-rich N-terminal helical domain of HDAC9, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 9 (HDAC9). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues. 90
28787 197400 cd10164 ClassIIa_HDAC5_Gln-rich-N Glutamine-rich N-terminal helical domain of HDAC5, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 5 (HDAC5). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues. 97
28788 212667 cd10170 HSP70_NBD Nucleotide-binding domain of the HSP70 family. HSP70 (70-kDa heat shock protein) family chaperones assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some HSP70 family members are not chaperones but instead function as NEFs to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle; some may function as both chaperones and NEFs. 369
28789 212668 cd10225 MreB_like MreB and similar proteins. MreB is a bacterial protein which assembles into filaments resembling those of eukaryotic F-actin. It is involved in determining the shape of rod-like bacterial cells, by assembling into large fibrous spirals beneath the cell membrane. MreB has also been implicated in chromosome segregation; specifically MreB is thought to bind to and segregate the replication origin of bacterial chromosomes. 320
28790 212669 cd10227 ParM_like Plasmid segregation protein ParM and similar proteins. ParM is a plasmid-encoded bacterial homolog of actin, which polymerizes into filaments similar to F-actin, and plays a vital role in plasmid segregation. ParM filaments segregate plasmids paired at midcell into the individual daughter cells. This subfamily also contains Thermoplasma acidophilum Ta0583, an active ATPase at physiological temperatures, which has a propensity to form filaments. 312
28791 212670 cd10228 HSPA4_like_NDB Nucleotide-binding domain of 105/110 kDa heat shock proteins including HSPA4 and similar proteins. This subgroup includes the human proteins, HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1), HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28), and HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3), Saccharomyces cerevisiae Sse1p and Sse2p, and a sea urchin sperm receptor. It belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family, and includes proteins believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins. 381
28792 212671 cd10229 HSPA12_like_NBD Nucleotide-binding domain of HSPA12A, HSPA12B and similar proteins. Human HSPA12A (also known as 70-kDa heat shock protein-12A) and HSPA12B (also known as 70-kDa heat shock protein-12B, chromosome 20 open reading frame 60/C20orf60, dJ1009E24.2) belong to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12A or HSPA12B. The gene encoding HSPA12A maps to 10q26.12, a cytogenetic region that might represent a common susceptibility locus for both schizophrenia and bipolar affective disorder; reduced expression of HSPA12A has been shown in the prefrontal cortex of subjects with schizophrenia. HSPA12A is also a candidate gene for forelimb-girdle muscular anomaly, an autosomal recessive disorder of Japanese black cattle. HSPA12A is predominantly expressed in neuronal cells. It may also play a role in the atherosclerotic process. The gene encoding HSPA12B maps to 20p13. HSPA12B is predominantly expressed in endothelial cells, is required for angiogenesis, and may interact with known angiogenesis mediators. It may be important for host defense in microglia-mediated immune response. HSPA12B expression is up-regulated in lipopolysaccharide (LPS)-induced inflammatory response in the spinal cord, and mostly located in active microglia; this induced expression may be regulated by activation of MAPK-p38, ERK1/2 and SAPK/JNK signaling pathways. Overexpression of HSPA12B also protects against LPS-induced cardiac dysfunction and involves the preserved activation of the PI3K/Akt signaling pathway. 404
28793 212672 cd10230 HYOU1-like_NBD Nucleotide-binding domain of human HYOU1 and similar proteins. This subgroup includes human HYOU1 (also known as human hypoxia up-regulated 1, GRP170; HSP12A; ORP150; GRP-170; ORP-150; the human HYOU1 gene maps to 11q23.1-q23.3) and Saccharomyces cerevisiae Lhs1p (also known as Cer1p, SsI1). Mammalian HYOU1 functions as a nucleotide exchange factor (NEF) for HSPA5 (also known as BiP, Grp78 or HspA5) and may also function as a HSPA5-independent chaperone. S. cerevisiae Lhs1p does not have a detectable endogenous ATPase activity like canonical HSP70s, but functions as a NEF for Kar2p; its interaction with Kar2p is stimulated by nucleotide-binding. In addition, Lhs1p has a nucleotide-independent holdase activity that prevents heat-induced aggregation of proteins in vitro. This subgroup belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as NEFs, to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins. 388
28794 212673 cd10231 YegD_like Escherichia coli YegD, a putative chaperone protein, and related proteins. This bacterial subfamily includes the uncharacterized Escherichia coli YegD. It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. YegD lacks the SBD. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some family members are not chaperones but instead, function as NEFs for their Hsp70 partners, other family members function as both chaperones and NEFs. 415
28795 212674 cd10232 ScSsz1p_like_NBD Nucleotide-binding domain of Saccharomyces cerevisiae Ssz1p and similar proteins. Saccharomyces cerevisiae Ssz1p (also known as Pdr13p/YHR064C) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some family members are not chaperones but rather function as NEFs for their Hsp70 partners, while other family members function as both chaperones and NEFs. Ssz1 does not function as a chaperone; it facilitates the interaction between the HSP70 Ssb protein and its partner J-domain protein Zuo1 (also known as zuotin) on the ribosome. Ssz1 is found in a stable heterodimer (called RAC, ribosome associated complex) with Zuo1. Zuo1 can only stimulate the ATPase activity of Ssb when it is in complex with Ssz1. Ssz1 binds ATP, but neither nucleotide binding, nor hydrolysis, nor its SBD is needed for its in vivo function. 386
28796 212675 cd10233 HSPA1-2_6-8-like_NBD Nucleotide-binding domain of HSPA1-A, -B, -L, HSPA-2, -6, -7, -8, and similar proteins. This subfamily includes human HSPA1A (70-kDa heat shock protein 1A, also known as HSP72; HSPA1; HSP70I; HSPA1B; HSP70-1; HSP70-1A), HSPA1B (70-kDa heat shock protein 1B, also known as HSPA1A; HSP70-2; HSP70-1B), and HSPA1L (70-kDa heat shock protein 1-like, also known as HSP70T; hum70t; HSP70-1L; HSP70-HOM). The genes for these three HSPA1 proteins map in close proximity on the major histocompatibility complex (MHC) class III region on chromosome 6, 6p21.3. This subfamily also includes human HSPA8 (heat shock 70kDa protein 8, also known as LAP1; HSC54; HSC70; HSC71; HSP71; HSP73; NIP71; HSPA10; the HSPA8 gene maps to 11q24.1), human HSPA2 (70-kDa heat shock protein 2, also known as HSP70-2; HSP70-3, the HSPA2 gene maps to 14q24.1), human HSPA6 (also known as heat shock 70kDa protein 6 (HSP70B') gi 94717614, the HSPA6 gene maps to 1q23.3), human HSPA7 (heat shock 70kDa protein 7, also known as HSP70B; the HSPA7 gene maps to 1q23.3) and Saccharomyces cerevisiae Stress-Seventy subfamily B/Ssb1p. This subfamily belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Associations of polymorphisms within the MHC-III HSP70 gene locus with longevity, systemic lupus erythematosus, Meniere's disease, noise-induced hearing loss, high-altitude pulmonary edema, and coronary heart disease, have been found. HSPA2 is involved in cancer cell survival, is required for maturation of male gametophytes, and is linked to male infertility. The induction of HSPA6 is a biomarker of cellular stress. HSPA8 participates in the folding and trafficking of client proteins to different subcellular compartments, and in the signal transduction and apoptosis process; it has been shown to protect cardiomyocytes against oxidative stress partly through an interaction with alpha-enolase. S. cerevisiae Ssb1p is part of the ribosome-associated complex (RAC); it acts as a chaperone for nascent polypeptides, and is important for translation fidelity; Ssb1p is also a [PSI+] prion-curing factor. 376
28797 212676 cd10234 HSPA9-Ssq1-like_NBD Nucleotide-binding domain of human HSPA9 and similar proteins. This subfamily includes human mitochondrial HSPA9 (also known as 70-kDa heat shock protein 9, CSA; MOT; MOT2; GRP75; PBP74; GRP-75; HSPA9B; MTHSP75; the gene encoding HSPA9 maps to 5q31.1), Escherichia coli DnaK, Saccharomyces cerevisiae Stress-seventy subfamily Q protein 1/Ssq1p (also called Ssc2p, Ssh1p, mtHSP70 homolog), and S. cerevisiae Stress-Seventy subfamily C/Ssc1p (also called mtHSP70, Endonuclease SceI 75 kDa subunit). It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs); for Escherichia coli DnaK, these are the DnaJ and GrpE, respectively. 376
28798 212677 cd10235 HscC_like_NBD Nucleotide-binding domain of Escherichia coli HscC and similar proteins. This subfamily includes Escherichia coli HscC (also called heat shock cognate protein C, Hsc62, or YbeW) and the putative DnaK-like protein Escherichia coli ECs0689. It belongs to the heat shock protein 70 (Hsp70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, Hsp70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Two genes in the vicinity of the HscC gene code for potential cochaperones: J-domain containing proteins, DjlB/YbeS and DjlC/YbeV. HscC and its co-chaperone partners may play a role in the SOS DNA damage response. HscC does not appear to require a NEF. 339
28799 212678 cd10236 HscA_like_NBD Nucleotide-binding domain of HscA and similar proteins. Escherichia coli HscA (heat shock cognate protein A, also called Hsc66), belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). HscA's partner J-domain protein is HscB; it does not appear to require a NEF, and has been shown to be induced by cold-shock. The HscA-HscB chaperone/co-chaperone pair is involved in [Fe-S] cluster assembly. 355
28800 212679 cd10237 HSPA13-like_NBD Nucleotide-binding domain of human HSPA13 and similar proteins. Human HSPA13 (also called 70-kDa heat shock protein 13, STCH, "stress 70 protein chaperone, microsome-associated, 60kD", "stress 70 protein chaperone, microsome-associated, 60kDa"; the gene encoding HSPA13 maps to 21q11.1) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). STCH contains an NBD but lacks an SBD. STCH may function to regulate cell proliferation and survival, and modulate the TRAIL-mediated cell death pathway. The HSPA13 gene is a candidate stomach cancer susceptibility gene; a mutation in the NBD coding region of HSPA13 has been identified in stomach cancer cells. The NBD of HSPA13 interacts with the ubiquitin-like proteins Chap1 and Chap2, implicating HSPA13 in regulating cell cycle and cell death events. HSPA13 is induced by the Ca2+ ionophore A23187. 417
28801 212680 cd10238 HSPA14-like_NBD Nucleotide-binding domain of human HSPA14 and similar proteins. Human HSPA14 (also known as 70-kDa heat shock protein 14, HSP70L1, HSP70-4; the gene encoding HSPA14 maps to 10p13), is ribosome-associated and belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). HSPA14 interacts with the J-protein MPP11 to form the mammalian ribosome-associated complex (mRAC). HSPA14 participates in a pathway along with Nijmegen breakage syndrome 1 (NBS1, also known as p85 or nibrin), heat shock transcription factor 4b (HSF4b), and HSPA4 (belonging to a different subfamily), that induces tumor migration, invasion, and transformation. HSPA14 is a potent T helper cell (Th1) polarizing adjuvant that contributes to antitumor immune responses. 375
28802 212681 cd10241 HSPA5-like_NBD Nucleotide-binding domain of human HSPA5 and similar proteins. This subfamily includes human HSPA5 (also known as 70-kDa heat shock protein 5, glucose-regulated protein 78/GRP78, and immunoglobulin heavy chain-binding protein/BIP, MIF2; the gene encoding HSPA5 maps to 9q33.3), Saccharomyces cerevisiae Kar2p (also known as Grp78p), and related proteins. This subfamily belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. HSPA5 and Kar2p are chaperones of the endoplasmic reticulum (ER). Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Multiple ER DNAJ domain proteins have been identified and may exist in distinct complexes with HSPA5 in various locations in the ER, for example DNAJC3-p58IPK in the lumen. HSPA5-NEFs include SIL1 and an atypical HSP70 family protein HYOU1/ORP150. The ATPase activity of Kar2p is stimulated by the NEFs: Sil1p and Lhs1p. 374
28803 199834 cd10276 BamB_YfgL Beta-barrel assembly machinery (Bam) complex component B and related proteins. BamB (YfgL) is a non-essential component of the beta-barrel assembly machinery (Bam), a multi-subunit complex that inserts proteins with beta-barrel topology into the outer membrane. BamB has been found to interact with BamA, which in turn binds and stabilizes pre-folded beta-barrel proteins; it has been suggested that BamB participates in the stabilization. 358
28804 199835 cd10277 PQQ_ADH_I Ethanol dehydrogenase, a bacterial quinoprotein (PQQ-dependent type I alcohol dehydrogenase). This bacterial family of homodimeric ethanol dehydrogenases utilize pyrroloquinoline quinone (PQQ) as a cofactor. It represents proteins whose expression may be induced by ethanol, and which are similar to quinoprotein methanol dehydrogenases, but have higher specificities for ethanol and other primary and secondary alcohols. Dehydrogenases with PQQ cofactors, such as ethanol, methanol, and membrane-bound glucose dehydrogenases, form an 8-bladed beta-propeller. 529
28805 199836 cd10278 PQQ_MDH Large subunit of methanol dehydrogenase (moxF). Methanol dehydrogenase is a key enzyme in the utilization of C1 compounds as a source of energy and carbon by bacteria. It catalyzes the oxidation of methanol to formaldehyde, transferring two electrons per methanol to cytochrome c(L) as the acceptor. Methanol dehydrogenase belongs to a family of dehydrogenases with pyrroloquinoline quinone (PQQ) as cofactor, which also includes dehydrogenases specific to other alcohols and membrane-bound glucose dehydrogenases. This alignment model for the large subunit contains an 8-bladed beta-propeller; the functional enzyme forms a heterotetramer composed of two large and two small subunits. 553
28806 199837 cd10279 PQQ_ADH_II PQQ_like domain of the quinohemoprotein alcohol dehydrogenase (type II). This family of monomeric and soluble type II alcohol dehydrogenases utilizes pyrroloquinoline quinone (PQQ) as a cofactor and is related to ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller. 549
28807 199838 cd10280 PQQ_mGDH Membrane-bound PQQ-dependent glucose dehydrogenase. This bacterial subfamily of enzymes belongs to the dehydrogenase family with pyrroloquinoline quinone (PQQ) as cofactor, and is the only subfamily that is bound to the membrane. Glucose dehydrogenase converts D-glucose to D-glucono-1,5-lactone in a reaction that is coupled with the respiratory chain in the periplasmic oxidation of sugars and alcohols in gram-negative bacteria. Ubiquinone functions as the electron acceptor. The alignment model contains an 8-bladed beta-propeller. 616
28808 197336 cd10281 Nape_like_AP-endo Neisseria meningitidis Nape-like subfamily of the ExoIII family purinic/apyrimidinic (AP) endonucleases. This subfamily includes Neisseria meningitidis Nape and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases; usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity, for example, Neisseria meningitidis Nape and NExo. Nape, found in this subfamily, is the dominant AP endonuclease. It exhibits strong AP endonuclease activity, and also exhibits 3'-5' exonuclease and 3'-deoxyribose phosphodiesterase activities. 253
28809 197337 cd10282 DNase1 Deoxyribonuclease 1. Deoxyribonuclease 1 (DNase1, EC 3.1.21.1), also known as DNase I, is a Ca2+, Mg2+/Mn2+-dependent secretory endonuclease, first isolated from bovine pancreas extracts. It cleaves DNA preferentially at phosphodiester linkages next to a pyrimidine nucleotide, producing 5'-phosphate terminated polynucleotides with a free hydroxyl group on position 3'. It generally produces tetranucleotides. DNase1 substrates include single-stranded DNA, double-stranded DNA, and chromatin. This enzyme may be responsible for apoptotic DNA fragmentation. Other deoxyribonucleases in this subfamily include human DNL1L (human DNase I lysosomal-like, also known as DNASE1L1, Xib, and DNase X), human DNASE1L2 (also known as DNAS1L2), and DNASE1L3 (also known as DNAS1L3, nhDNase, LS-DNase, DNase Y, and DNase gamma). DNASE1L3 is implicated in apoptotic DNA fragmentation. DNase I is also a cytoskeletal protein which binds actin. A recombinant form of human DNase1 is used as a mucoactive therapy in patients with cystic fibrosis; it hydrolyzes the extracellular DNA in sputum and reduces its viscosity. Mutations in the gene encoding DNase1 have been associated with Systemic Lupus Erythematosus, a multifactorial autoimmune disease. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 256
28810 197338 cd10283 MnuA_DNase1-like Mycoplasma pulmonis MnuA nuclease-like. This subfamily includes Mycoplasma pulmonis MnuA, a membrane-associated nuclease related to Deoxyribonuclease 1 (DNase1 or DNase I, EC 3.1.21.1). The in vivo role of MnuA is as yet undetermined. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. 266
28811 198434 cd10284 growth_hormone_like Somatotropin/prolactin hormone family. The somatotropin/prolactin hormone family includes growth hormones 1 and 2, prolactin, prolactin 2, and other members that play vital roles in a variety of processes, including growth control. They are long-chain class-I helical cytokines, most of which are secreted by the pituitary gland, and are active as monomers, binding to cellular receptors with EpoR-like ligand binding domains. 178
28812 198435 cd10285 somatotropin_like Somatotropin or growth hormone (GH), placental lactogen, and related pituitary gland hormones. Growth hormone (GH) or somatotropin is a peptide hormone synthesized by the pituitary gland, which mediates anabolic effects in development. GH is known to activate, via binding to specific cellular receptors, the MAPK/ERK and JAK-STAT signaling pathways. Via the latter, it triggers the secretion of insulin-like growth factor 1 (mostly in the liver). Besides increasing body height, GH has been shown to have a host of other effects. 180
28813 198436 cd10286 somatolactin Somatolactin (SL) and somatolactin-like proteins. This family of hormones specific to Actinopterygii is expressed in the pars intermedia bordering the neurohypophysis (posterior pituitary). Somatolactin appears to be involved in acid-base regulation, but much of its physiological role remains to be understood. 207
28814 198437 cd10287 prolactin_2 Vertebrate, non-mammalian prolactin 2 (PRL2). A functionally uncharacterized subfamily of the growth-hormone-like helical cytokines, which is found in vertebrata (except for mammals). The protein has been shown to be expressed in the zebrafish eye and brain, but not the pituitary gland, and might play a role in retina development. 184
28815 198438 cd10288 prolactin_like Prolactin (PRL or PRL1), chorionic somatomammotropin, and related pituitary gland hormones. Prolactin is primarily responsible for stimulating milk production and breast development in mammals. Aside from roles in reproduction, various functions have been attributed to prolactin, more than for other pituitary gland hormones combined. These are roles in growth and development, metamorphosis, metabolism of lipids, carbohydrates, and steroids, brain biochemistry and even immunoregulation, among others. Most of these roles are poorly understood, but it has become clear that many prolactin-like hormones are actually produced in the placenta and not the pituitary. 199
28816 198322 cd10289 GST_C_AaRS_like Glutathione S-transferase C-terminal-like, alpha helical domain of various Aminoacyl-tRNA synthetases and similar domains. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl-tRNA synthetase (AaRS)-like subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of some eukaryotic AaRSs, as well as similar domains found in proteins involved in protein synthesis including Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 2 (AIMP2), AIMP3, and eukaryotic translation Elongation Factor 1 beta (eEF1b). AaRSs comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. AaRSs in this subfamily include GluRS from lower eukaryotes, as well as GluProRS, MetRS, and CysRS from higher eukaryotes. AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex found in higher eukaryotes. The GST_C-like domain is involved in protein-protein interactions, mediating the formation of aaRS complexes such as the MetRS-Arc1p-GluRS ternary complex in lower eukaryotes and the multi-aaRS complex in higher eukaryotes, that act as molecular hubs for protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain. 82
28817 198323 cd10290 GST_C_MetRS_N_fungi Glutathione S-transferase C-terminal-like, alpha helical domain of Saccharomycetales Methionyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Saccharomycetales Methionyl-tRNA synthetase (MetRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of Saccharomycetales MetRS. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. MetRS is a class I aaRS, containing a Rossmann fold catalytic core. It recognizes the initiator tRNA as well as the Met-tRNA for protein chain elongation. The GST_C-like domain of MetRS from Saccharomycetales is involved in protein-protein interactions, to mediate the formation of the MetRS-Arc1p-GluRS ternary complex which is considered an evolutionary intermediate between prokaryotic aaRS and the multi-aaRS complex found in higher eukaryotes. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain. 95
28818 198324 cd10291 GST_C_YfcG_like C-terminal, alpha helical domain of Escherichia coli YfcG Glutathione S-transferases and related uncharacterized proteins. Glutathione S-transferase (GST) C-terminal domain family, YfcG-like subfamily; composed of the Escherichia coli YfcG and related proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. YfcG is one of nine GST homologs in Escherichia coli. It is expressed predominantly during the late stationary phase where the predominant form of GSH is glutathionylspermidine (GspSH), suggesting that YfcG might interact with GspSH. It has very low or no GSH transferase or peroxidase activity, but displays a unique disulfide bond reductase activity that is comparable to thioredoxins (TRXs) and glutaredoxins (GRXs). However, unlike TRXs and GRXs, YfcG does not contain a redox active cysteine residue and may use a bound thiol disulfide couple such as 2GSH/GSSG for activity. The crystal structure of YfcG reveals a bound GSSG molecule in its active site. The actual physiological substrates for YfcG are yet to be identified. 110
28819 198325 cd10292 GST_C_YghU_like C-terminal, alpha helical domain of Escherichia coli YghU Glutathione S-transferases and related uncharacterized proteins. Glutathione S-transferase (GST) C-terminal domain family, YghU-like subfamily; composed of the Escherichia coli YghU and related proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. YghU is one of nine GST homologs in the genome of Escherichia coli. It is similar to Escherichia coli YfcG in that it has poor GSH transferase activity towards typical substrates. It shows modest reductase activity towards some organic hydroperoxides. Like YfcG, YghU also shows good disulfide bond oxidoreductase activity comparable to the activities of glutaredoxins and thioredoxins. YghU does not contain a redox active cysteine residue, and may use a bound thiol disulfide couple such as 2GSH/GSSG for activity. The crystal structure of YghU reveals two GSH molecules bound in its active site. 118
28820 198326 cd10293 GST_C_Ure2p C-terminal, alpha helical domain of fungal Ure2p Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Ure2p subfamily; composed of the Saccharomyces cerevisiae Ure2p and related fungal proteins. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The N-terminal thioredoxin-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 117
28821 198327 cd10294 GST_C_ValRS_N Glutathione S-transferase C-terminal-like, alpha helical domain of vertebrate Valyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Valyl-tRNA synthetase (ValRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of human ValRS and its homologs from other vertebrates such as frog and zebrafish. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. They typically form large stable complexes with other proteins. ValRS forms a stable complex with Elongation Factor-1H (EF-1H), and together, they catalyze consecutive steps in protein biosynthesis, tRNA aminoacylation and its transfer to EF. The GST_C-like domain of ValRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. ValRSs from prokaryotes and lower eukaryotes, such as fungi and plants, do not appear to contain this GST_C-like domain. 123
28822 198328 cd10295 GST_C_Sigma C-terminal, alpha helical domain of Class Sigma Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Sigma; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation, and mediation of allergy and inflammation. 100
28823 198329 cd10296 GST_C_CLIC4 C-terminal, alpha helical domain of Chloride Intracellular Channel 4. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 4 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC4, also known as p64H1, is expressed ubiquitously and its localization varies depending on the nature of the cells and tissues, from the plasma membrane to subcellular compartments including the nucleus, mitochondria, ER, and the trans-Golgi network, among others. In response to cellular stress such as DNA damage and senescence, cytoplasmic CLIC4 translocates to the nucleus, where it acts on the TGF-beta pathway. Studies on knockout mice suggest that CLIC4 also plays an important role in angiogenesis, specifically in network formation, capillary sprouting, and lumen formation. CLIC4 has been found to induce apoptosis in several cell types and to retard the growth of grafted tumors in vivo. 141
28824 198330 cd10297 GST_C_CLIC5 C-terminal, alpha helical domain of Chloride Intracellular Channel 5. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 5 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC5 exists in two alternatively-spliced isoforms, CLIC5A or CLIC5B (also called p64). It is expressed at high levels in hair cell stereocilia and is associated with the actin cytoskeleton and ezrin. A recessive mutation in the CLIC5 gene in mice led to the lack of coordination and deafness, due to a defect in the basal region of the hair bundle causing stereocilia to degrade. CLIC5 is therefore essential for normal inner ear function. CLIC5 is also highly expressed in podocytes where it is colocalized with the ezrin/radixin/moesin (ERM) complex. It is essential for foot process integrity, and for podocyte morphology and function. 141
28825 198331 cd10298 GST_C_CLIC2 C-terminal, alpha helical domain of Chloride Intracellular Channel 2. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 2 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC2 contains an intramolecular disulfide bond and exists as a monomer regardless of redox conditions, in contrast to CLIC1 which forms a dimer under oxidizing conditions. It is expressed in most tissues except the brain, and is highly expressed in the lung, spleen, and in cardiac and skeletal muscles. CLIC2 interacts with ryanodine receptors (cardiac RyR2 and skeletal RyR1) and modulates their activity, suggesting that CLIC2 may function in the regulation of calcium release and signaling in cardiac and skeletal muscles. 138
28826 198332 cd10299 GST_C_CLIC3 C-terminal, alpha helical domain of Chloride Intracellular Channel 3. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 3 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC3 is highly expressed in placental tissues, and may play a role in fetal development. 133
28827 198333 cd10300 GST_C_CLIC1 C-terminal, alpha helical domain of Chloride Intracellular Channel 1. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 1 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Soluble CLIC1 is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity. CLIC1 is widely expressed in many tissues and its subcellular localization is dependent on cell type and cell cycle phase. It acts as a sensor of cell oxidation and appears to have a role in diseases that involve oxidative stress including tumorigenic and neurodegenerative diseases. 139
28828 198334 cd10301 GST_C_CLIC6 C-terminal, alpha helical domain of Chloride Intracellular Channel 6. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 6 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC6 is expressed predominantly in the stomach, pituitary, and brain. It interacts with D2-like dopamine receptors directly and through scaffolding proteins. CLIC6 may be involved in the regulation of secretion, possibly through chloride ion transport regulation. 140
28829 198335 cd10302 GST_C_GDAP1L1 C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1-like 1. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1-like 1 (GDAP1L1) subfamily; GDAP1L1 is a paralogue of GDAP1 with about 56% sequence identity and 70% similarity. Its function is unknown. Like GDAP1, it does not exhibit GST activity using standard substrates. GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. 111
28830 198336 cd10303 GST_C_GDAP1 C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates. 111
28831 198337 cd10304 GST_C_Arc1p_N_like Glutathione S-transferase C-terminal-like, alpha helical domain of the Aminoacyl tRNA synthetase cofactor 1 and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase cofactor 1 (Arc1p)-like subfamily; Arc1p, also called GU4 nucleic binding protein 1 (G4p1) or p42, is a tRNA-aminoacylation and nuclear-export cofactor. It contains a domain in the N-terminal region with similarity to the C-terminal alpha helical domain of GSTs. This domain mediates the association of the aminoacyl tRNA synthetases (aaRSs), MetRS and GluRS, in yeast to form a stable stoichiometric ternary complex. The GST_C-like domain of Arc1p is a protein-protein interaction domain containing two binding sites which enable it to bind the two aaRSs simultaneously and independently. The MetRS-Arc1p-GluRS complex selectively recruits and aminoacylates its cognate tRNAs without additional cofactors. Arc1p also plays a role in the transport of tRNA from the nucleus to the cytoplasm. It may also control the subcellular distribution of GluRS in the cytoplasm, nucleoplasm, and the mitochondrial matrix. 100
28832 198338 cd10305 GST_C_AIMP3 Glutathione S-transferase C-terminal-like, alpha helical domain of Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 3. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein (AIMP) 3 subfamily; AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex that functions as a molecular hub to coordinate protein synthesis. There are three AIMPs, named AIMP1-3, which play diverse regulatory roles. AIMP3, also called p18 or eukaryotic translation elongation factor 1 epsilon-1 (EEF1E1), contains a C-terminal domain with similarity to the C-terminal alpha helical domain of GSTs. It specifically interacts with methionyl-tRNA synthetase (MetRS) and is translocated to the nucleus during DNA synthesis or in response to DNA damage and oncogenic stress. In the nucleus, it interacts with ATM and ATR, which are upstream kinase regulators of p53. It appears to work against DNA damage in cooperation with AIMP2, and similar to AIMP2, AIMP3 is also a haploinsufficient tumor suppressor. AIMP3 transgenic mice have shorter lifespans than wild-type mice and they show characteristics of progeria, suggesting that AIMP3 may also be involved in cellular and organismal aging. 101
28833 198339 cd10306 GST_C_GluRS_N Glutathione S-transferase C-terminal-like, alpha helical domain of Glutamyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Glutamyl-tRNA synthetase (GluRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of GluRS from lower eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of GluRS is involved in protein-protein interactions. This domain mediates the formation of the MetRS-Arc1p-GluRS ternary complex found in lower eukaryotes, which is considered an evolutionary intermediate between prokaryotic aaRS and the multi-aaRS complex found in higher eukaryotes. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain. 87
28834 198340 cd10307 GST_C_MetRS_N Glutathione S-transferase C-terminal-like, alpha helical domain of Methionyl-tRNA synthetase from higher eukaryotes. Glutathione S-transferase (GST) C-terminal domain family, Methionyl-tRNA synthetase (MetRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of MetRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. MetRS is a class I aaRS, containing a Rossmann fold catalytic core. It recognizes the initiator tRNA as well as the Met-tRNA for protein chain elongation. The GST_C-like domain of MetRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain. 102
28835 198341 cd10308 GST_C_eEF1b_like Glutathione S-transferase C-terminal-like, alpha helical domain of eukaryotic translation Elongation Factor 1 beta. Glutathione S-transferase (GST) C-terminal domain family, eukaryotic translation Elongation Factor 1 beta (eEF1b) subfamily; eEF1b is a component of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. eEF1b contains a GST_C-like alpha helical domain at the N-terminal region and a C-terminal guanine nucleotide exchange domain. The GST_C-like domain likely functions as a protein-protein interaction domain, similar to the function of the GST_C-like domains of EF1Bgamma and various aminoacyl-tRNA synthetases (aaRSs) from higher eukaryotes. 82
28836 198342 cd10309 GST_C_GluProRS_N Glutathione S-transferase C-terminal-like, alpha helical domain of bifunctional Glutamyl-Prolyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, bifunctional GluRS-Prolyl-tRNA synthetase (GluProRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of GluProRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of GluProRS may be involved in protein-protein interactions, mediating the formation of the multi-aaRS complex in higher eukaryotes. The multi-aaRS complex acts as a molecular hub for protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain. 81
28837 198343 cd10310 GST_C_CysRS_N Glutathione S-transferase C-terminal-like, alpha helical domain of Cysteinyl-tRNA synthetase from higher eukaryotes. Glutathione S-transferase (GST) C-terminal domain family, Cysteinyl-tRNA synthetase (CysRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of CysRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of CysRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. CysRSs from prokaryotes and lower eukaryotes do not appear to contain this GST_C-like domain. 73
28838 197304 cd10311 PLDc_N_DEXD_c N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain), include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. 156
28839 197339 cd10312 Deadenylase_CCR4b C-terminal deadenylase domain of CCR4b, also known as CCR4-NOT transcription complex subunit 6-like. This subfamily contains the C-terminal catalytic domain of the deadenylase, CCR4b, also known as CCR4-NOT transcription complex subunit 6-like (CNOT6L). CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1, is a DEDD-type protein and does not belong in this superfamily. There are two vertebrate CCR4 proteins, CCR4a (also called CCR4-NOT transcription complex subunit 6 or CNOT6) and CCR4b. CCR4b associates with other components, such as CNOT1-3 and Caf1, to form a CCR4-NOT multisubunit complex, which regulates transcription and mRNA degradation. The nuclease domain of CCR4b exhibits Mg2+-dependent deadenylase activity with strict specificity for poly (A) RNA as substrate. CCR4b is mainly localized in the cytoplasm. It regulates cell growth and influences cell cycle progression by regulating p27/Kip1 mRNA levels. It contributes to the prevention of cell death by regulating insulin-like growth factor-binding protein 5. 348
28840 197340 cd10313 Deadenylase_CCR4a C-terminal deadenylase domain of CCR4a, also known as CCR4-NOT transcription complex subunit 6. This subfamily contains the C-terminal catalytic domain of the deadenylase, CCR4a, also known as CCR4-NOT transcription complex subunit 6 (CNOT6). CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1, is a DEDD-type protein and does not belong in this superfamily. There are two vertebrate CCR4 proteins, CCR4a and CCR4b (also called CNOT6-like or CNOT6L). CCR4a associates with other components, such as CNOT1-3 and Caf1, to form a CCR4-NOT multisubunit complex, which regulates transcription and mRNA degradation. The nuclease domain of CCR4a exhibits Mg2+-dependent deadenylase activity with specificity for poly (A) RNA as substrate. CCR4a is a component of P-bodies and is necessary for foci formation of various P-body components. It also plays a role in cellular responses to DNA damage, by regulating Chk2 activity. 350
28841 198457 cd10314 FAM20_C C-terminal putative kinase domain of FAM20 (family with sequence similarity 20) proteins. This family contains the C-terminal domain of FAM20A, -B, -C and related proteins. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. The C-terminal domains of members of this family are putative kinase domains, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This domain family is also known as DUF1193. 209
28842 199215 cd10315 CBM41_pullulanase Family 41 Carbohydrate-Binding Module from pullulanase-like enzymes. Pullulanases (EC 3.2.1.41) are a group of starch-debranching enzymes, catalyzing the hydrolysis of the alpha-1,6-glucosidic linkages of alpha-glucans, preferentially pullulan. Pullulan is a polysaccharide in which alpha-1,4 linked maltotriosyl units are combined via an alpha-1,6 linkage. These enzymes are of importance in the starch industry, where they are used to hydrolyze amylopectin starch. Pullulanases consist of multiple distinct domains, including a catalytic domain belonging to the glycoside hydrolase (GH) family 13 and carbohydrate-binding modules (CBM), including CBM41. 100
28843 199904 cd10316 RGL4_M Middle domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold. Both the middle domain represented by this model and the C-terminal domain are putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11. 92
28844 199905 cd10317 RGL4_C C-terminal domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold. Both the middle and the C-terminal domain are putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11. 161
28845 199906 cd10318 RGL11 Rhamnogalacturonan lyase of the polysaccharide lyase family 11. The rhamnogalacturonan lyase of the polysaccharide lyase family 11 (RGL11) cleaves glycoside bonds in polygalacturonan as well as RG (rhamnogalacturonan) type-I through a beta-elimination reaction. Functionally characterized members of this family, YesW and YesX from Bacillus subtilis, cleave glycoside bonds between rhamnose and galacturonic acid residues in the RG-I region of plant cell wall pectin. YesW and YesX work synergistically, with YesW cleaving the glycoside bond of the RG chain endolytically, and YesX converting the resultant oligosaccharides through an exotype reaction. This domain is sometimes found in architectures with non-catalytic carbohydrate-binding modules (CBMs). There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain through a beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11. 564
28846 198439 cd10319 EphR_LBD Ligand Binding Domain of Ephrin Receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). They are subdivided into 2 groups, A and B type receptors, depending on their ligand ephrin-A or ephrin-B, respectively. In general, class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. 177
28847 199907 cd10320 RGL4_N N-terminal catalytic domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold; the middle and C-terminal domains are both putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11. 265
28848 199216 cd10321 RNase_Ire1_like RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L. This RNase domain is found in the multi-functional protein Ire1; Ire1 also contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR). The UPR is activated when protein misfolding is detected in the ER in order to reduce the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor; IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain which stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is also found in Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase. RNase L is a highly regulated, latent endoribonuclease widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system; the interferon (IFN)-inducible 2'-5'-oligoadenylate synthetase (OAS)/RNase L pathway blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele. 127
28849 271357 cd10322 SLC5sbd Solute carrier 5 family, sodium/glucose transporters and related proteins; solute-binding domain. This family represents the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporter family or solute sodium symporter family) that co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. Family members include: the human glucose (SGLT1, 2, 4, 5), chiro-inositol (SGLT5), myo-inositol (SMIT), choline (CHT), iodide (NIS), multivitamin (SMVT), and monocarboxylate (SMCT) cotransporters, as well as Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. Vibrio parahaemolyticus Na(+)/galactose cotransporter (vSGLT) has 13 transmembrane helices (TMs): TM-1, an inverted topology repeat: TMs1-5 and TMs6-10, and TMs 11-12 (TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). One member of this family, human SGLT3, has been characterized as a glucose sensor and not a transporter. Members of this family are important in human physiology and disease. 454
28850 271358 cd10323 SLC-NCS1sbd nucleobase-cation-symport-1 (NCS1) transporters; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. This family includes Microbacterium liquefaciens Mhp1, a transporter that mediates the uptake of indolyl methyl- and benzyl-hydantoins as part of a metabolic salvage pathway for their conversion to amino acids. It also includes various Saccharomyces cerevisiae transporters: Fcy21p (Purine-cytosine permease), vitamin B6 transporter Tpn1, nicotinamide riboside transporter 1 (Nrt1p, also called Thi71p), Dal4p (allantoin permease), Fui1p (uridine permease), and Fur4p (uracil permease). Mhp1 has 12 transmembrane (TM) helices (an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12; TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and SLC6 neurotransmitter transporters. 414
28851 271359 cd10324 SLC6sbd Solute carrier 6 family, neurotransmitter transporters; solute-binding domain. This family represents the solute-binding domain of SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family). These use sodium and chloride electrochemical gradients to catalyze the thermodynamically uphill movement of a variety of substrates, and include neurotransmitter transporters (NTTs). The latter are Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NTTs are widely expressed in the mammalian brain, and are involved in regulating neurotransmitter signaling and homeostasis, through facilitating the uptake of released neurotransmitters from the extracellular space into neurons and glial cells. NTTs are the target of a range of therapeutic drugs for the treatment of psychiatric diseases, such as major depression, anxiety disorders, attention deficit hyperactivity disorder and epilepsy. In addition, they are the primary targets of cocaine, amphetamines and other psychostimulants. This family also includes Drosophila Blot which is expressed primarily in epithelial tissues of ectodermal origin and in the nervous system of the embryo and larvae, but is in addition found in the developing oocyte and the freshly laid egg. A lack or reduction of Blot function during oogenesis results in early arrest of embryonic development. An arrangement of 12 transmembrane helices (TMs) appears to be common for eukaryotic and some prokaryotic and archaeal SLC6s (a core inverted topology repeat, TM1-5 and TM6-10, plus TMs11-12; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT), although a majority of bacterial, and some archaeal SLC6s lack TM12, for example the functional Fusobacterium nucleatum tyrosine transporter Tyt1. 415
28852 271360 cd10325 SLC5sbd_vSGLT Vibrio parahaemolyticus Na(+)/galactose cotransporter (vSGLT) and related proteins; solute binding domain. vSGLT transports D-galactose, D-glucose, and alpha-D-fucose, with a sugar specificity in the order of D-galactose >D-fucose >D-glucose. It transports one Na+ ion for each sugar molecule, and appears to function as a monomer. vSGLT has 13 transmembrane helices (TMs): TM-1, an inverted topology repeat: TMs1-5 and TMs6-10, and TMs 11-12 (TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 523
28853 271361 cd10326 SLC5sbd_NIS-like Na(+)/iodide (NIS) and Na(+)/multivitamin (SMVT) cotransporters, and related proteins; solute binding domain. NIS (product of the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. SMVT (product of the SLC5A6 gene) transports biotin, pantothenic acid and lipoate. This subfamily also includes SMCT1 and 2. SMCT1 (product of the SLC5A8 gene) is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. SMCT2 (product of the SLC5A12 gene) is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 472
28854 212037 cd10327 SLC5sbd_PanF Na(+)/pantothenate cotransporters: PanF of Escherichia coli and related proteins; solute binding domain. PanF catalyzes the Na+-coupled uptake of extracellular pantothenate for coenzyme A biosynthesis in cells. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 472
28855 271362 cd10328 SLC5sbd_YidK uncharacterized SLC5 subfamily, Escherichia coli YidK-like; solute binding domain. Uncharacterized subfamily of the solute binding domain of the solute carrier 5 (SLC5) transporter family (also called the sodium/glucose cotransporter family or solute sodium symporter family) that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily includes the uncharacterized Escherichia coli YidK protein, and belongs to the solute carrier 5 (SLC5) transporter family. 472
28856 271363 cd10329 SLC5sbd_SGLT1-like Na(+)/glucose cotransporter SGLT1 and related proteins; solute binding domain. This subfamily includes the solute-binding domain of SGLT proteins that cotransport Na+ with various solutes. Its members include: the human glucose (SGLT1, -2, -4, -5), chiro-inositol (SGLT5), and myo-inositol (SMIT) cotransporters. It also includes human SGLT3 which has been characterized as a glucose sensor and not a transporter. It belongs to the solute carrier 5 (SLC5) transporter family. 538
28857 271364 cd10332 SLC6sbd-B0AT-like System B(0) neutral amino acid transporter AT1, 2 and 3, and related proteins; solute-binding domain. This subgroup includes the solute-binding domain of transmembrane transporters, which transport, i) neutral amino acids: NTT4 (also called XT1), SBAT1 (also called B0AT2, v7-3, NTT7-3), and B0AT1 (also called HND); the human genes encoding these are SLC6A17, SLC6A15, and SLC6A19 respectively, ii) glycine: B0AT3 (also called Xtrp2, XT2), iii) imino acids, such as proline, pipecolate, MeAIB, and sarcosine: SIT1 (also called XTRP3, XT3, IMINO). The human genes encoding B0AT3 and SIT1 are SLC6A18 and SLC6A20 respectively. Transporters in this subgroup may play a role in disorders including major depression, Hartnup disorder, increased susceptibility to myocardial infarction, and iminoglycinuria. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 531
28858 271365 cd10333 LeuT-like_sbd Aquifex aeolicus LeuT and related proteins; solute binding domain. LeuT is a bacterial amino acid transporter with specificity for the hydrophobic amino acids glycine, alanine, methionine, and leucine. This subgroup belongs to the solute carrier 6 (SLC6) transporter family; LeuT has been used as a structural template for understanding fundamental aspects of SLC6 function. It has an arrangement of 12 transmembrane helices (TMs), which appears to be a common motif for eukaryotic and some prokaryotic and archaeal SLC6s: an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12. 496
28859 271366 cd10334 SLC6sbd_u1 uncharacterized bacterial and archaeal solute carrier 6 subfamily; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, involved in regulating neurotransmitter signaling and homeostasis, and the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter. 480
28860 271367 cd10336 SLC6sbd_Tyt1-Like solute carrier 6 subfamily, Fusobacterium nucleatum Tyt1-like; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, involved in regulating neurotransmitter signaling and homeostasis, and the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter. An arrangement of 12 transmembrane (TM) helices appears to be a common topological motif for eukaryotic and some prokaryotic and archaeal NTTs. However, this subfamily, which contains the majority of bacterial members and some archaeal members, appears to contain only 11 TMs; an example is the functional Fusobacterium nucleatum tyrosine transporter Tyt1. 440
28861 198200 cd10337 SH2_BCAR3 Src homology 2 (SH2) domain in the Breast Cancer Anti-estrogen Resistance protein 3. BCAR3 is part of a growing family of guanine nucleotide exchange factors that are responsible for activation of Ras-family GTPases, including Sos1 and 2, GRF1 and 2, CalDAG-GEF/GRP1-4, C3G, cAMP-GEF/Epac 1 and 2, PDZ-GEFs, MR-GEF, RalGDS family members, RalGPS, RasGEF, Smg GDS, and phospholipase C(epsilon). 12102558 21262352 BCAR3 binds to the carboxy-terminus of BCAR1/p130Cas, a focal adhesion adapter protein. Overexpression of BCAR1 (p130Cas) and BCAR3 induces estrogen-independent growth in normally estrogen-dependent cell lines. They have been linked to resistance to anti-estrogens in breast cancer, Rac activation, and cell motility, though the BCAR3/p130Cas complex is not required for this activity in BCAR3. Many BCAR3-mediated signaling events in epithelial and mesenchymal cells are independent of p130Cas association. Structurally these proteins contain a single SH2 domain upstream of their RasGEF domain, which is responsible for the ability of BCAR3 to enhance p130Cas over-expression-induced migration. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 136
28862 198201 cd10338 SH2_SHA Src homology 2 (SH2) domain found in SH2 adaptor proteins A (SHA) Signal transducers. Signal transducing adaptor proteins are accessory to main proteins in a signal transduction pathway. These proteins lack intrinsic enzymatic activity, but mediate specific protein-protein interactions that drive the formation of protein complexes. Adaptor proteins usually contain several domains within their structure (e.g. SH2 and SH3 domains) which allow specific interactions with several other specific proteins. Not much is known about the SHA protein except that it is predicted to act as a transcription factor. Arabidopsis SHA pulled down a 120-kD tyrosine-phosphorylated protein in vitro. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 106
28863 198202 cd10339 SH2_RIN_family Src homology 2 (SH2) domain found in Ras and Rab interactor (RIN)-family. The RIN (AKA Ras interaction/interference) family is composed of RIN1, RIN2 and RIN3. These proteins have multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs, and RIN3 specifically functions as a Rab31-GEF. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28864 198203 cd10340 SH2_N-SH2_SHP_like N-terminal Src homology 2 (N-SH2) domain found in SH2 domain Phosphatases (SHP) proteins. The SH2 domain phosphatases (SHP-1, SHP-2/Syp, Drosophila corkscrew (csw), and Caenorhabditis elegans Protein Tyrosine Phosphatase (Ptp-2)) are cytoplasmic signaling enzymes. They are both targeted and regulated by interactions of their SH2 domains with phosphotyrosine docking sites. These proteins contain two SH2 domains (N-SH2, C-SH2) followed by a tyrosine phosphatase (PTP) domain, and a C-terminal extension. Shp1 and Shp2 have two tyrosyl phosphorylation sites in their C-tails, which are phosphorylated differentially by receptor and nonreceptor PTKs. Csw retains the proximal tyrosine and Ptp-2 lacks both sites. Shp-binding proteins include receptors, scaffolding adapters, and inhibitory receptors. Some of these bind both Shp1 and Shp2 while others bind only one. Most proteins that bind a Shp SH2 domain contain one or more immuno-receptor tyrosine-based inhibitory motifs (ITIMs): [IVL]xpYxx[IVL]. Shp1 N-SH2 domain blocks the catalytic domain and keeps the enzyme in the inactive conformation, and is thus believed to regulate the phosphatase activity of SHP-1. Its C-SH2 domain is thought to be involved in searching for phosphotyrosine activators. The SHP2 N-SH2 domain is a conformational switch; it either binds and inhibits the phosphatase, or it binds phosphoproteins and activates the enzyme. The C-SH2 domain contributes binding energy and specificity, but it does not have a direct role in activation. Csw SH2 domain function is essential, but either SH2 domain can fulfill this requirement. The role of the csw SH2 domains during Sevenless receptor tyrosine kinase (SEV) signaling is to bind Daughter of Sevenless rather than activated SEV. Ptp-2 acts in oocytes downstream of sheath/oocyte gap junctions to promote major sperm protein (MSP)-induced MAP Kinase (MPK-1) phosphorylation. Ptp-2 functions in the oocyte cytoplasm, not at the cell surface to inhibit multiple RasGAPs, resulting in sustained Ras activation. It is thought that MSP triggers PTP-2/Ras activation and ROS production to stimulate MPK-1 activity essential for oocyte maturation and that secreted MSP domains and Cu/Zn superoxide dismutases function antagonistically to control ROS and MAPK signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 99
28865 199829 cd10341 SH2_N-SH2_PLC_gamma_like N-terminal Src homology 2 (N-SH2) domain in Phospholipase C gamma. Phospholipase C gamma is a signaling molecule that is recruited to the C-terminal tail of the receptor upon autophosphorylation of a highly conserved tyrosine. PLCgamma is composed of a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, 2 catalytic regions of PLC domains that flank 2 tandem SH2 domains (N-SH2, C-SH2), and ending with a SH3 domain and C2 domain. N-SH2 SH2 domain-mediated interactions represent a crucial step in transmembrane signaling by receptor tyrosine kinases. SH2 domains recognize phosphotyrosine (pY) in the context of particular sequence motifs in receptor phosphorylation sites. Both N-SH2 and C-SH2 have a very similar binding affinity to pY. But in growth factor stimulated cells these domains bind to different target proteins. N-SH2 binds to pY containing sites in the C-terminal tails of tyrosine kinases and other receptors. Recently it has been shown that this interaction is mediated by phosphorylation-independent interactions between a secondary binding site found exclusively on the N-SH2 domain and a region of the FGFR1 tyrosine kinase domain. This secondary site on the SH2 cooperates with the canonical pY site to regulate selectivity in mediating a specific cellular process. C-SH2 binds to an intramolecular site on PLCgamma itself which allows it to hydrolyze phosphatidylinositol-4,5-bisphosphate into diacylglycerol and inositol triphosphate. These then activate protein kinase C and release calcium. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 99
28866 198205 cd10342 SH2_SAP1 Src homology 2 (SH2) domain found in SLAM-associated protein (SAP)1. The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail. XLP is characterized by an extreme sensitivity to Epstein-Barr virus. Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX[VI], which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28867 198206 cd10343 SH2_SHIP Src homology 2 (SH2) domain found in SH2-containing inositol-5'-phosphatase (SHIP) and SLAM-associated protein (SAP). The SH2-containing inositol-5'-phosphatase, SHIP (also called SHIP1/SHIP1a), is a hematopoietic-restricted phosphatidylinositide phosphatase that translocates to the plasma membrane after extracellular stimulation and hydrolyzes the phosphatidylinositol-3-kinase (PI3K)-generated second messenger PI-3,4,5-P3 (PIP3) to PI-3,4-P2. As a result, SHIP dampens down PIP3 mediated signaling and represses the proliferation, differentiation, survival, activation, and migration of hematopoietic cells. PIP3 recruits lipid-binding pleckstrin homology (PH) domain-containing proteins to the inner wall of the plasma membrane and activates them. PH domain-containing downstream effectors include the survival/proliferation enhancing serine/threonine kinase, Akt (protein kinase B), the tyrosine kinase, Btk, the regulator of protein translation, S6K, and the Rac and cdc42 guanine nucleotide exchange factor, Vav. SHIP is believed to act as a tumor suppressor during leukemogenesis and lymphomagenesis, and may play a role in activating the immune system to combat cancer. SHIP contains an N-terminal SH2 domain, a centrally located phosphatase domain that specifically hydrolyzes the 5'-phosphate from PIP3, PI-4,5-P2 and inositol-1,3,4,5-tetrakisphosphate (IP4), a C2 domain, that is an allosteric activating site when bound by SHIP's enzymatic product, PI-3,4-P2; 2 NPXY motifs that bind proteins with a phosphotyrosine binding (Shc, Dok 1, Dok 2) or an SH2 (p85a, SHIP2) domain; and a proline-rich domain consisting of four PxxP motifs that bind a subset of SH3-containing proteins including Grb2, Src, Lyn, Hck, Abl, PLCg1, and PIAS1. The SH2 domain of SHIP binds to the tyrosine phosphorylated forms of Shc, SHP-2, Doks, Gabs, CD150, platelet-endothelial cell adhesion molecule, Cas, c-Cbl, immunoreceptor tyrosine-based inhibitory motifs (ITIMs), and immunoreceptor tyrosine-based activation motifs (ITAMs). The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail. XLP is characterized by an extreme sensitivity to Epstein-Barr virus. Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX(V/I), which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28868 198207 cd10344 SH2_SLAP Src homology 2 domain found in Src-like adaptor proteins. SLAP belongs to the subfamily of adapter proteins that negatively regulate cellular signaling initiated by tyrosine kinases. It has a myristylated N-terminus, SH3 and SH2 domains with high homology to Src family tyrosine kinases, and a unique C-terminal tail, which is important for c-Cbl binding. SLAP negatively regulates platelet-derived growth factor (PDGF)-induced mitogenesis in fibroblasts and regulates F-actin assembly for dorsal ruffles formation. c-Cbl mediated SLAP inhibition towards actin remodeling. Moreover, SLAP enhanced PDGF-induced c-Cbl phosphorylation by SFK. In contrast, SLAP mitogenic inhibition was not mediated by c-Cbl, but it rather involved a competitive mechanism with SFK for PDGF-receptor (PDGFR) association and mitogenic signaling. Accordingly, phosphorylation of the Src mitogenic substrates Stat3 and Shc were reduced by SLAP. Thus, we concluded that SLAP regulates PDGFR signaling by two independent mechanisms: a competitive mechanism for PDGF-induced Src mitogenic signaling and a non-competitive mechanism for dorsal ruffles formation mediated by c-Cbl. SLAP is a hematopoietic adaptor containing Src homology (SH)3 and SH2 motifs and a unique carboxy terminus. Unlike c-Src, SLAP lacks a tyrosine kinase domain. Unlike c-Src, SLAP does not impact resorptive function of mature osteoclasts but induces their early apoptosis. SLAP negatively regulates differentiation of osteoclasts and proliferation of their precursors. Conversely, SLAP decreases osteoclast death by inhibiting activation of caspase 3. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28869 198208 cd10345 SH2_C-SH2_Zap70_Syk_like C-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70) and Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains. In Syk the two SH2 domains do not form such a phosphotyrosine-binding site. The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of both Syk and Zap70. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 95
28870 198209 cd10346 SH2_SH2B_family Src homology 2 (SH2) domain found in SH2B adapter protein family. The SH2B adapter protein family has 3 members: SH2B1 (SH2-B, PSM), SH2B2 (APS), and SH2B3 (Lnk). SH2B family members contain a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases. SH2B1 and SH2B2 function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. SH2B3 negatively regulates lymphopoiesis and early hematopoiesis. The lnk-deficiency results in enhanced production of B cells, and expansion as well as enhanced function of hematopoietic stem cells (HSCs), demonstrating negative regulatory functions of Sh2b3/Lnk in cytokine signaling. Sh2b3/Lnk also functions in responses controlled by cell adhesion and in crosstalk between integrin- and cytokine-mediated signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28871 198210 cd10347 SH2_Nterm_shark_like N-terminal Src homology 2 (SH2) domain found in SH2 domains, ANK, and kinase domain (shark) proteins. These non-receptor protein-tyrosine kinases contain two SH2 domains, five ankyrin (ANK)-like repeats, and a potential tyrosine phosphorylation site in the carboxyl-terminal tail which resembles the phosphorylation site in members of the src family. Like the mammalian non-receptor protein-tyrosine kinases ZAP-70 and Syk, they do not have SH3 domains. However, the presence of ANK makes these unique among protein-tyrosine kinases. Both tyrosine kinases and ANK repeats have been shown to transduce developmental signals, and SH2 domains are known to participate intimately in tyrosine kinase signaling. These tyrosine kinases are believed to be involved in epithelial cell polarity. The members of this family include the shark (SH2 domains, ANK, and kinase domain) gene in Drosophila and yellow fever mosquitos, as well as the hydra protein HTK16. Drosophila Shark is proposed to transduce intracellularly the intercellular signal of Crumbs, a protein necessary for proper organization of ectodermal epithelia. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 81
28872 198211 cd10348 SH2_Cterm_shark_like C-terminal Src homology 2 (SH2) domain found in SH2 domains, ANK, and kinase domain (shark) proteins. These non-receptor protein-tyrosine kinases contain two SH2 domains, five ankyrin (ANK)-like repeats, and a potential tyrosine phosphorylation site in the carboxyl-terminal tail which resembles the phosphorylation site in members of the src family. Like the mammalian non-receptor protein-tyrosine kinases ZAP-70 and Syk, they do not have SH3 domains. However, the presence of ANK repeats makes them unique among protein-tyrosine kinases. Both tyrosine kinases and ANK repeats have been shown to transduce developmental signals, and SH2 domains are known to participate intimately in tyrosine kinase signaling. These tyrosine kinases are believed to be involved in epithelial cell polarity. The members of this family include the shark (SH2 domains, ANK, and kinase domain) gene in Drosophila and yellow fever mosquitos, as well as the hydra protein HTK16. Drosophila Shark is proposed to transduce intracellularly the intercellular signal of Crumbs, a protein necessary for proper organization of ectodermal epithelia. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 86
28873 199830 cd10349 SH2_SH2D2A_SH2D7 Src homology 2 domain found in the SH2 domain containing protein 2A and 7 (SH2D2A and SH2D7). SH2D2A and SH2D7 both contain a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 77
28874 198213 cd10350 SH2_SH2D4A Src homology 2 domain found in the SH2 domain containing protein 4A (SH2D4A). SH2D4A contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28875 198214 cd10351 SH2_SH2D4B Src homology 2 domain found in the SH2 domain containing protein 4B (SH2D4B). SH2D4B contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28876 198215 cd10352 SH2_a2chimerin_b2chimerin Src homology 2 (SH2) domain found in alpha2-chimerin and beta2-chimerin proteins. Chimerins are a family of phorbol ester- and diacylglycerol-responsive GTPase-activating proteins. Alpha1-chimerin (formerly known as n-chimerin) and alpha2-chimerin are alternatively spliced products of a single gene, as are beta1- and beta2-chimerin. Alpha1- and beta1-chimerin have a relatively short N-terminal region that does not encode any recognizable domains, whereas alpha2- and beta2-chimerin both include a functional SH2 domain that can bind to phosphotyrosine motifs within receptors. All of the isoforms contain a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. Other C1 domain-containing diacylglycerol receptors include PKC; Munc-13 proteins, which are phorbol ester binding scaffolding proteins involved in Ca2+-stimulated exocytosis; and RasGRPs, which are diacylglycerol-activated guanine-nucleotide exchange factors (GEFs) for Ras and Rap1. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 91
28877 198216 cd10353 SH2_Nterm_RasGAP N-terminal Src homology 2 (SH2) domain found in Ras GTPase-activating protein 1 (GAP). RasGAP is part of the GAP1 family of GTPase-activating proteins. The protein is located in the cytoplasm and stimulates the GTPase activity of normal RAS p21, but not its oncogenic counterpart. Acting as a suppressor of RAS function, the protein enhances the weak intrinsic GTPase activity of RAS proteins resulting in RAS inactivation, thereby allowing control of cellular proliferation and differentiation. Mutations leading to changes in the binding sites of either protein are associated with basal cell carcinomas. Alternative splicing results in two isoforms. The shorter isoform, which lacks the N-terminal hydrophobic region, has the same activity and is expressed in placental tissues. In general the longer isoform contains 2 SH2 domains, a SH3 domain, a pleckstrin homology (PH) domain, and a calcium-dependent phospholipid-binding C2 domain. The C-terminus contains the catalytic domain of RasGap, which inactivates Ras by stimulating hydrolysis of its bound GTP, converting active GTP-bound Ras into the inactive GDP-bound form. This model contains the N-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28878 198217 cd10354 SH2_Cterm_RasGAP C-terminal Src homology 2 (SH2) domain found in Ras GTPase-activating protein 1 (GAP). RasGAP is part of the GAP1 family of GTPase-activating proteins. The protein is located in the cytoplasm and stimulates the GTPase activity of normal RAS p21, but not its oncogenic counterpart. Acting as a suppressor of RAS function, the protein enhances the weak intrinsic GTPase activity of RAS proteins resulting in RAS inactivation, thereby allowing control of cellular proliferation and differentiation. Mutations leading to changes in the binding sites of either protein are associated with basal cell carcinomas. Alternative splicing results in two isoforms. The shorter isoform, which lacks the N-terminal hydrophobic region, has the same activity and is expressed in placental tissues. In general the longer isoform contains 2 SH2 domains, a SH3 domain, a pleckstrin homology (PH) domain, and a calcium-dependent phospholipid-binding C2 domain. The C-terminus contains the catalytic domain of RasGap, which inactivates Ras by stimulating hydrolysis of its bound GTP, converting active GTP-bound Ras into the inactive GDP-bound form. This model contains the C-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 77
28879 198218 cd10355 SH2_DAPP1_BAM32_like Src homology 2 domain found in dual adaptor for phosphotyrosine and 3-phosphoinositides (DAPP1)/B lymphocyte adaptor molecule of 32 kDa (Bam32)-like proteins. DAPP1/Bam32 contains a putative myristoylation site at its N-terminus, followed by a SH2 domain, and a pleckstrin homology (PH) domain at its C-terminus. DAPP1 could potentially be recruited to the cell membrane by any of these domains. Its putative myristoylation site could facilitate the interaction of DAPP1 with the lipid bilayer. Its SH2 domain may also interact with phosphotyrosine residues on membrane-associated proteins such as activated tyrosine kinase receptors. Finally, its PH domain exhibits a high-affinity interaction with the PtdIns(3,4,5)P(3) and PtdIns(3,4)P(2) second messengers produced at the cell membrane following the activation of PI 3-kinases. DAPP1 is thought to interact with both tyrosine phosphorylated proteins and 3-phosphoinositides and therefore may play a role in regulating the location and/or activity of such protein(s) in response to agonists that elevate PtdIns(3,4,5)P(3) and PtdIns(3,4)P(2). This protein is likely to play an important role in triggering signal transduction pathways that lie downstream from receptor tyrosine kinases and PI 3-kinase. It is likely that DAPP1 functions as an adaptor to recruit other proteins to the plasma membrane in response to extracellular signals. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 92
28880 198219 cd10356 SH2_ShkA_ShkC Src homology 2 (SH2) domain found in SH2 domain-bearing protein kinases A and C (ShkA and ShkC). SH2-bearing genes cloned from Dictyostelium include two transcription factors, STATa and STATc, and a signaling factor, SHK1 (shkA). A database search of the Dictyostelium discoideum genome revealed two additional putative STAT sequences, dd-STATb and dd-STATd, and four additional putative SHK genes, dd-SHK2 (shkB), dd-SHK3 (shkC), dd-SHK4 (shkD), and dd-SHK5 (shkE). This model contains members of shkA and shkC. All of the SHK members are most closely related to the protein kinases found in plants. However these kinases in plants are not conjugated to any SH2 or SH2-like sequences. Alignment data indicates that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an alpha-helix which is indeed homologous to that of STAT. Based on the phylogenetic alignment, SH2 domains can be grouped into two categories, STAT-type and Src-type. SHK family members are in between, but are closer to the STAT-type which indicates a close relationship between SHK and STAT families in their SH2 domains and further supports the notion that SHKs linker-SH2 domain evolved from STAT or STATL (STAT-like Linker-SH2) domain found in plants. In SHK, STAT, and SPT6, the linker-SH2 domains all reside exclusively in the C-terminal regions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 113
28881 198220 cd10357 SH2_ShkD_ShkE Src homology 2 (SH2) domain found in SH2 domain-bearing protein kinases D and E (ShkD and ShkE). SH2-bearing genes cloned from Dictyostelium include two transcription factors, STATa and STATc, and a signaling factor, SHK1 (shkA). A database search of the Dictyostelium discoideum genome revealed two additional putative STAT sequences, dd-STATb and dd-STATd, and four additional putative SHK genes, dd-SHK2 (shkB), dd-SHK3 (shkC), dd-SHK4 (shkD), and dd-SHK5 (shkE). This model contains members of shkD and shkE. All of the SHK members are most closely related to the protein kinases found in plants. However these kinases in plants are not conjugated to any SH2 or SH2-like sequences. Alignment data indicates that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an alpha-helix which is indeed homologous to that of STAT. Based on the phylogenetic alignment, SH2 domains can be grouped into two categories, STAT-type and Src-type. SHK family members are in between, but are closer to the STAT-type which indicates a close relationship between SHK and STAT families in their SH2 domains and further supports the notion that SHKs linker-SH2 domain evolved from STAT or STATL (STAT-like Linker-SH2) domain found in plants. In SHK, STAT, and SPT6, the linker-SH2 domains all reside exclusively in the C-terminal regions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 87
28882 198221 cd10358 SH2_PTK6_Brk Src homology 2 domain found in protein-tyrosine kinase-6 (PTK6) which is also known as breast tumor kinase (Brk). Human protein-tyrosine kinase-6 (PTK6, also known as breast tumor kinase (Brk)) is a member of the non-receptor protein-tyrosine kinase family and is expressed in two-thirds of all breast tumors. PTK6 contains a SH3 domain, a SH2 domain, and catalytic domains. In the non-receptor protein-tyrosine kinases, the SH2 domain is typically involved in negative regulation of kinase activity by binding to a phosphorylated tyrosine residue near the C terminus. The C-terminal sequence of PTK6 (PTSpYENPT where pY is phosphotyrosine) is thought to be a self-ligand for the SH2 domain. The structure of the SH2 domain resembles other SH2 domains except for a centrally located four-stranded antiparallel beta-sheet (strands betaA, betaB, betaC, and betaD). There are also differences in the loop length which might be responsible for PTK6 ligand specificity. There are two possible means of regulation of PTK6: autoinhibitory with the phosphorylation of Tyr playing a role in its negative regulation and autophosphorylation at this site, though it has been shown that PTK6 might phosphorylate signal transduction-associated proteins Sam68 and signal transducing adaptor family member 2 (STAP/BKS) in vivo. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 100
28883 198222 cd10359 SH2_SH3BP2 Src homology 2 domain found in c-Abl SH3 domain-binding protein-2 (SH3BP2). The adaptor protein 3BP2/SH3BP2 plays a regulatory role in signaling from immunoreceptors. The protein-tyrosine kinase Syk phosphorylates 3BP2 which results in the activation of Rac1 through the interaction with the SH2 domain of Vav1 and induces the binding to the SH2 domain of the upstream protein-tyrosine kinase Lyn and enhances its kinase activity. 3BP2 has a positive regulatory role in IgE-mediated mast cell activation. In lymphocytes, engagement of T cell or B cell receptors triggers tyrosine phosphorylation of 3BP2. Suppression of the 3BP2 expression by siRNA results in the inhibition of T cell or B cell receptor-mediated activation of NFAT. 3BP2 is required for the proliferation of B cells and B cell receptor signaling. Mutations in the 3BP2 gene are responsible for cherubism resulting in excessive bone resorption in the jaw. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28884 198223 cd10360 SH2_Srm Src homology 2 (SH2) domain found in Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites (srm). Srm is a nonreceptor protein kinase that has two SH2 domains, a SH3 domain, and a kinase domain with a tyrosine residue for autophosphorylation. However, it lacks an N-terminal glycine for myristoylation and a C-terminal tyrosine which suppresses kinase activity when phosphorylated. Srm is most similar to members of the Tec family, whose other members include Tec, Btk/Emb, and Itk/Tsk/Emt. However, Srm differs in its N-terminal unique domain, which is much smaller than in the Tec family and is closer to that of Src. Srm is thought to be a new family of nonreceptor tyrosine kinases that may be redundant in function. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 79
28885 198224 cd10361 SH2_Fps_family Src homology 2 (SH2) domain found in feline sarcoma, Fujinami poultry sarcoma, and fes-related (Fes/Fps/Fer) proteins. The Fps family consists of members Fps/Fes and Fer/Flk/Tyk3. They are cytoplasmic protein-tyrosine kinases implicated in signaling downstream from cytokines, growth factors and immune receptors. Fes/Fps/Fer contains three coiled-coil regions, an SH2 (Src-homology-2) and a TK (tyrosine kinase catalytic) domain signature. Members here include: Fps/Fes, Fer, Kin-31, and In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 90
28886 198225 cd10362 SH2_Src_Lck Src homology 2 (SH2) domain in lymphocyte cell kinase (Lck). Lck is a member of the Src non-receptor type tyrosine kinase family of proteins. It is expressed in the brain, T-cells, and NK cells. The unique domain of Lck mediates its interaction with two T-cell surface molecules, CD4 and CD8. It associates with their cytoplasmic tails on CD4 T helper cells and CD8 cytotoxic T cells to assist signaling from the T cell receptor (TCR) complex. When the T cell receptor is engaged by the specific antigen presented by MHC, Lck phosphorylates the intracellular chains of the CD3 and zeta-chains of the TCR complex, allowing ZAP-70 to bind them. Lck then phosphorylates and activates ZAP-70, which in turn phosphorylates Linker of Activated T cells (LAT), a transmembrane protein that serves as a docking site for proteins including: Shc-Grb2-SOS, PI3K, and phospholipase C (PLC). The tyrosine phosphorylation cascade culminates in the intracellular mobilization of calcium ions and activation of important signaling cascades within the lymphocyte, including the Ras-MEK-ERK pathway, which goes on to activate certain transcription factors such as NFAT, NF-kappaB, and AP-1. These transcription factors regulate the production of cytokines such as Interleukin-2 that promote long-term proliferation and differentiation of the activated lymphocytes. The N-terminal tail of Lck is myristoylated and palmitoylated and it tethers the protein to the plasma membrane of the cell. Lck also contains a SH3 domain, a SH2 domain, and a C-terminal tyrosine kinase domain. Lck has 2 phosphorylation sites, the first an autophosphorylation site that is linked to activation of the protein and the second which is phosphorylated by Csk, which inhibits it. Lck is also inhibited by SHP-1 dephosphorylation and by Cbl ubiquitin ligase, which is part of the ubiquitin-mediated pathway. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28887 198226 cd10363 SH2_Src_HCK Src homology 2 (SH2) domain found in HCK. HCK is a member of the Src non-receptor type tyrosine kinase family of proteins and is expressed in hemopoietic cells. HCK is proposed to couple the Fc receptor to the activation of the respiratory burst. It may also play a role in neutrophil migration and in the degranulation of neutrophils. It has two different translational starts that have different subcellular localization. HCK has been shown to interact with BCR gene, ELMO1, Cbl gene, RAS p21 protein activator 1, RASA3, Granulocyte colony-stimulating factor receptor, ADAM15 and RAPGEF1. Like the other members of the Src family, the SH2 domain, in addition to binding the target, also plays an autoinhibitory role by binding to its C-terminal tail. HCK has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 104
28888 198227 cd10364 SH2_Src_Lyn Src homology 2 (SH2) domain found in Lyn. Lyn is a member of the Src non-receptor type tyrosine kinase family of proteins and is expressed in the hematopoietic cells, in neural tissues, liver, and adipose tissue. There are two alternatively spliced forms of Lyn. Lyn plays an inhibitory role in myeloid lineage proliferation. Following engagement of the B cell receptors, Lyn undergoes rapid phosphorylation and activation, triggering a cascade of signaling events mediated by Lyn phosphorylation of tyrosine residues within the immunoreceptor tyrosine-based activation motifs (ITAM) of the receptor proteins, and subsequent recruitment and activation of other kinases including Syk, phospholipase Cgamma2 (PLCgamma2) and phosphatidyl inositol-3 kinase. These kinases play critical roles in proliferation, Ca2+ mobilization and cell differentiation. Lyn plays an essential role in the transmission of inhibitory signals through phosphorylation of tyrosine residues within the immunoreceptor tyrosine-based inhibitory motifs (ITIM) in regulatory proteins such as CD22, PIR-B and FcgammaRIIb1. Their ITIM phosphorylation subsequently leads to recruitment and activation of phosphatases such as SHIP-1 and SHP-1 which further down-modulate signaling pathways, attenuate cell activation and can mediate tolerance. Lyn also plays a role in the insulin signaling pathway. Activated Lyn phosphorylates insulin receptor substrate 1 (IRS1) leading to an increase in translocation of Glut-4 to the cell membrane and increased glucose utilization. It is the primary Src family member involved in signaling downstream of the B cell receptor. Lyn plays an unusual, 2-fold role in B cell receptor signaling; it is essential for initiation of signaling but is also later involved in negative regulation of the signal. Lyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28889 198228 cd10365 SH2_Src_Src Src homology 2 (SH2) domain found in tyrosine kinase sarcoma (Src). Src is a member of the Src non-receptor type tyrosine kinase family of proteins. Src is thought to play a role in the regulation of embryonic development and cell growth. Members here include v-Src and c-Src. v-Src lacks the C-terminal inhibitory phosphorylation site and is therefore constitutively active as opposed to normal cellular src (c-Src) which is only activated under certain circumstances where it is required (e.g. growth factor signaling). v-Src is an oncogene whereas c-Src is a proto-oncogene. c-Src consists of three domains, an N-terminal SH3 domain, a central SH2 domain and a tyrosine kinase domain. The SH2 and SH3 domains work together in the auto-inhibition of the kinase domain. The phosphorylation of an inhibitory tyrosine near the C-terminus of the protein produces a binding site for the SH2 domain which then facilitates binding of the SH3 domain to a polyproline site within the linker between the SH2 domain and the kinase domain. Binding of the SH3 domain inactivates the enzyme. This allows for multiple mechanisms for c-Src activation: dephosphorylation of the C-terminal tyrosine by a protein tyrosine phosphatase, binding of the SH2 domain by a competitive phospho-tyrosine residue, or competitive binding of a polyproline binding site to the SH3 domain. Unlike most other Src members Src lacks cysteine residues in the SH4 domain that undergo palmitylation. Serine and threonine phosphorylation sites have also been identified in the unique domains of Src and are believed to modulate protein-protein interactions or regulate catalytic activity. Alternatively spliced forms of Src, which contain 6- or 11-amino acid insertions in the SH3 domain, are expressed in CNS neurons. c-Src has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28890 198229 cd10366 SH2_Src_Yes Src homology 2 (SH2) domain found in Yes. Yes is a member of the Src non-receptor type tyrosine kinase family of proteins. Yes is the cellular homolog of the Yamaguchi sarcoma virus oncogene. In humans it is encoded by the YES1 gene which maps to chromosome 18 and is in close proximity to thymidylate synthase. A corresponding Yes pseudogene has been found on chromosome 22. YES1 has been shown to interact with Janus kinase 2, CTNND1, RPL10, and Occludin. Yes1 has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28891 198230 cd10367 SH2_Src_Fgr Src homology 2 (SH2) domain found in Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog, Fgr. Fgr is a member of the Src non-receptor type tyrosine kinase family of proteins. The protein contains N-terminal sites for myristoylation and palmitoylation, a PTK domain, and SH2 and SH3 domains which are involved in mediating protein-protein interactions with phosphotyrosine-containing and proline-rich motifs, respectively. Fgr is expressed in B-cells and myeloid cells, localizes to plasma membrane ruffles, and functions as a negative regulator of cell migration and adhesion triggered by the beta-2 integrin signal transduction pathway. Multiple alternatively spliced variants, encoding the same protein, have been identified. Fgr has been shown to interact with Wiskott-Aldrich syndrome protein. Fgr has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28892 198231 cd10368 SH2_Src_Fyn Src homology 2 (SH2) domain found in Fyn. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28893 199831 cd10369 SH2_Src_Frk Src homology 2 (SH2) domain found in the Fyn-related kinase (Frk). Frk is a member of the Src non-receptor type tyrosine kinase family of proteins. The Frk subfamily is composed of Frk/Rak and Iyk/Bsk/Gst. It is expressed primarily in epithelial cells. Frk is a nuclear protein and may function during G1 and S phase of the cell cycle and suppress growth. Unlike the other Src members it lacks a glycine at position 2 of SH4 which is important for addition of a myristic acid moiety that is involved in targeting Src PTKs to cellular membranes. FRK and SHB exert similar effects when overexpressed in rat phaeochromocytoma (PC12) and beta-cells, where both induce PC12 cell differentiation and beta-cell proliferation. Under conditions that cause beta-cell degeneration these proteins augment beta-cell apoptosis. The FRK-SHB responses involve FAK and insulin receptor substrates (IRS) -1 and -2. Frk has been demonstrated to interact with retinoblastoma protein. Frk regulates PTEN protein stability by phosphorylating PTEN, which in turn prevents PTEN degradation. Frk also plays a role in regulation of embryonal pancreatic beta cell formation. Frk has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Like the other members of the Src family the SH2 domain in addition to binding the target, also plays an autoinhibitory role by binding to its activation loop. The tyrosine involved is at the same site as the tyrosine involved in the autophosphorylation of Src. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 96
28894 198233 cd10370 SH2_Src_Src42 Src homology 2 (SH2) domain found in the Src oncogene at 42A (Src42). Src42 is a member of the Src non-receptor type tyrosine kinase family of proteins. The integration of receptor tyrosine kinase-induced RAS and Src42 signals by Connector eNhancer of KSR (CNK) as a two-component input is essential for RAF activation in Drosophila. Src42 is present in a wide variety of organisms including: California sea hare, pea aphid, yellow fever mosquito, honey bee, Panamanian leafcutter ant, and sea urchin. Src42 has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Like the other members of the Src family the SH2 domain in addition to binding the target, also plays an autoinhibitory role by binding to its C-terminal tail. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 96
28895 198234 cd10371 SH2_Src_Blk Src homology 2 (SH2) domain found in B lymphoid kinase (Blk). Blk is a member of the Src non-receptor type tyrosine kinase family of proteins. Blk is expressed in the B-cells. Unlike most other Src members Blk lacks cysteine residues in the SH4 domain that undergo palmitylation. Blk is required for the development of IL-17-producing gamma-delta T cells. Furthermore, Blk is expressed in lymphoid precursors and, in this capacity, plays a role in regulating thymus cellularity during ontogeny. Blk has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 100
28896 198235 cd10372 SH2_STAT1 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 1 proteins. STAT1 is a member of the STAT family of transcription factors. STAT1 is involved in upregulating genes due to a signal by interferons. STAT1 forms homodimers or heterodimers with STAT3 that bind to the Interferon-Gamma Activated Sequence (GAS) promoter element in response to IFN-gamma stimulation. STAT1 forms a heterodimer with STAT2 that can bind Interferon Stimulated Response Element (ISRE) promoter element in response to either IFN-alpha or IFN-beta stimulation. Binding in both cases leads to an increased expression of ISG (Interferon Stimulated Genes). STAT1 has been shown to interact with protein kinase R, Src, IRF1, STAT3, MCM5, STAT2, CD117, Fanconi anemia, complementation group C, CREB-binding protein, Interleukin 27 receptor, alpha subunit, PIAS1, BRCA1, Epidermal growth factor receptor, PTK2, Mammalian target of rapamycin, IFNAR2, PRKCD, TRADD, C-jun, Calcitriol receptor, ISGF3G, and GNB2L1. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. 
There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. 
The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 151
28897 198236 cd10373 SH2_STAT2 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 2 proteins. STAT2 is a member of the STAT protein family. In response to interferon, STAT2 forms a complex with STAT1 and IFN regulatory factor family protein p48 (ISGF3G), in which this protein acts as a transactivator, but lacks the ability to bind DNA directly. Transcription adaptor P300/CBP (EP300/CREBBP) has been shown to interact specifically with STAT2, which is thought to be involved in the process of blocking IFN-alpha response by adenovirus. STAT2 has been shown to interact with MED14, CREB-binding protein, SMARCA4, STAT1, IFNAR2, IFNAR1, and ISGF3G. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. 
It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. 
In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 151
28898 198237 cd10374 SH2_STAT3 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 3 proteins. STAT3 is a member of the STAT protein family. STAT3 mediates the expression of a variety of genes in response to cell stimuli, and plays a key role in many cellular processes such as cell growth and apoptosis. The small GTPase Rac1 regulates the activity of STAT3 and PIAS3 inhibits it. Three alternatively spliced transcript variants encoding distinct isoforms have been described. STAT3 activation is required for self-renewal of embryonic stem cells (ESCs) and is essential for the differentiation of the TH17 helper T cells. Mutations in the STAT3 gene result in Hyperimmunoglobulin E syndrome and human cancers. STAT3 has been shown to interact with Androgen receptor, C-jun, ELP2, EP300, Epidermal growth factor receptor, Glucocorticoid receptor, HIF1A, Janus kinase 1, KHDRBS1, Mammalian target of rapamycin, MyoD, NDUFA13, NFKB1, Nuclear receptor coactivator 1, Promyelocytic leukemia protein, RAC1, RELA, RET proto-oncogene, RPA2, Src, STAT1, and TRIP10. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. 
There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. 
The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 162
28899 198238 cd10375 SH2_STAT4 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 4 proteins. STAT4 mediates signals from the IL-12 receptors. STAT4 is mainly phosphorylated by the IL-12-mediated signaling pathway in T cells. STAT4 expression is restricted in myeloid cells, thymus and testis. IL-12 is the major cytokine that can activate STAT4, resulting in its tyrosine phosphorylation. The IL-12 receptor has two chains, termed IL-12Rbeta1 and IL-12Rbeta2, and ligand binding results in heterodimer formation and activation of the receptor associated JAK kinases, Jak2 and Tyk2. Phosphorylated STAT4 homo-dimerizes via its SH2 domain, and translocates into the nucleus where it can recognize traditional N3 STAT target sequences in IL-12 responsive genes. STAT4 can also be phosphorylated in response to IFN-gamma stimulation through activation of Jak1 and Tyk2 in human. IL-17 can also activate STAT4 in human monocytic leukemia cell lines and IL-2 can induce Jak2 and Stat4 activation in NK cells but not in T cells. T helper 1 (Th1) cells produce IL-2 and IFNgamma, whereas Th2 cells secrete IL-4, IL-5, IL-6 and IL-13. Th1 cells are responsible for cell-mediated/inflammatory immunity and can enhance defenses against infectious agents and cancer, while Th2 cells are essential for humoral immunity and the clearance of parasitic antigens. The most potent factors that can promote Th1 and Th2 differentiation are the cytokines IL-12 and IL-4 respectively. Although STAT4 is expressed both in Th1 and Th2 cells, STAT4 can only be phosphorylated by IL-12 which suggests that STAT4 plays an important role in Th1 cell function or development. STAT4 activation leads to Th1 differentiation, including the target genes of STAT4 such as ERM, a transcription factor that belongs to the Ets family of transcription factors. The expression of ERM is specifically induced by IL-12 in wild-type Th1 cells, but not in STAT4-deficient T cells. 
STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. 
LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 148
28900 198239 cd10376 SH2_STAT5 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5 proteins. STAT5 is a member of the STAT family of transcription factors. Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level. Both STAT5a and STAT5b are ubiquitously expressed and functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). 
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. 137
28901 198240 cd10377 SH2_STAT6 Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 6 proteins. STAT6 mediates signals from the IL-4 receptor. Unlike the other STAT proteins which bind an IFNgamma Activating Sequence (GAS), STAT6 stands out as having a unique binding site preference. This site consists of a palindromic sequence separated by a 3 bp spacer (TTCNNNG-AA)(N3 site). STAT6 is able to bind the GAS site but only at a low affinity. STAT6 may be an important regulator of mitogenesis when cells respond normally to IL-4. There is speculation that the inappropriate activation of STAT6 is involved in uncontrolled cell growth in an oncogenic state. IFNgamma is a negative regulator of STAT6 dependent transcription of target genes. Bcl-6 is another negative regulator of STAT6 activity. Bcl-6 is a transcriptional repressor normally expressed in germinal center B cells and some T cells. IL-4 signaling via STAT6 initially occurs unopposed, but is then dampened by a negative feedback mechanism through the IL-4/Stat6 dependent induction of SOCS1 expression. The IL-4 dependent aspect of Th2 differentiation requires the activation of STAT6. IL-4 signaling and STAT6 appear to play an important role in the immune response. Recently, it was shown that the large scale chromatin remodeling of the IL-4 gene that occurs as cells differentiate into Th2 effectors is STAT6 dependent. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. 
However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. 
The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 129
28902 198241 cd10378 SH2_Jak1 Src homology 2 (SH2) domain in the Janus kinase 1 (Jak1) proteins. Janus kinase 1 (JAK1) is a member of a class of protein-tyrosine kinases (PTK) characterized by the presence of a second phosphotransferase-related domain immediately N-terminal to the PTK domain. The second phosphotransferase domain bears all the hallmarks of a protein kinase, although its structure differs significantly from that of the PTK and threonine/serine kinase family members. JAK1 is a large, widely expressed membrane-associated phosphoprotein. JAK1 is involved in the interferon-alpha/beta and -gamma signal transduction pathways. The reciprocal interdependence between JAK1 and TYK2 activities in the interferon-alpha pathway, and between JAK1 and JAK2 in the interferon-gamma pathway, may reflect a requirement for these kinases in the correct assembly of interferon receptor complexes. These kinases couple cytokine ligand binding to tyrosine phosphorylation of various known signaling proteins and of a unique family of transcription factors termed the signal transducers and activators of transcription, or STATs. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28903 198242 cd10379 SH2_Jak2 Src homology 2 (SH2) domain in the Janus kinase 2 (Jak2) proteins. Jak2 is a protein tyrosine kinase involved in a specific subset of cytokine receptor signaling pathways. It has been found to be constitutively associated with the prolactin receptor and is required for responses to gamma interferon. Mice that do not express an active protein for this gene exhibit embryonic lethality associated with the absence of definitive erythropoiesis. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28904 198243 cd10380 SH2_Jak3 Src homology 2 (SH2) domain in the Janus kinase 3 (Jak3) proteins. Jak3 is a member of the Janus kinase (JAK) family of tyrosine kinases involved in cytokine receptor-mediated intracellular signal transduction. It is predominantly expressed in immune cells and transduces a signal in response to its activation via tyrosine phosphorylation by interleukin receptors. Mutations in this gene are associated with autosomal SCID (severe combined immunodeficiency disease). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 96
28905 198244 cd10381 SH2_Jak_Tyk2 Src homology 2 (SH2) domain in Tyrosine Kinase 2 (Tyk2), a member of the Janus kinases (JAK). Tyk2 is a member of the tyrosine kinase and, more specifically, the Janus kinases (JAKs) protein families. This protein associates with the cytoplasmic domain of type I and type II cytokine receptors and promulgates cytokine signals by phosphorylating receptor subunits. It is also a component of both the type I and type III interferon signaling pathways. As such, it may play a role in anti-viral immunity. A mutation in this gene has been associated with hyperimmunoglobulin E syndrome (HIES) - a primary immunodeficiency characterized by elevated serum immunoglobulin E. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28906 198245 cd10382 SH2_SOCS1 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankryin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28907 198246 cd10383 SH2_SOCS2 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28908 198247 cd10384 SH2_SOCS3 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28909 198248 cd10385 SH2_SOCS4 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28910 198249 cd10386 SH2_SOCS5 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) family. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 81
28911 198250 cd10387 SH2_SOCS6 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 100
28912 198251 cd10388 SH2_SOCS7 Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28913 198252 cd10389 SH2_SHB Src homology 2 domain found in SH2 domain-containing adapter protein B (SHB). SHB functions in generating signaling compounds in response to tyrosine kinase activation. SHB contains proline-rich motifs, a phosphotyrosine binding (PTB) domain, tyrosine phosphorylation sites, and a SH2 domain. SHB mediates certain aspects of platelet-derived growth factor (PDGF) receptor-, fibroblast growth factor (FGF) receptor-, neural growth factor (NGF) receptor TRKA-, T cell receptor-, interleukin-2 (IL-2) receptor- and focal adhesion kinase- (FAK) signaling. SRC-like FYN-Related Kinase FRK/RAK (also named BSK/IYK or GTK) and SHB regulate apoptosis, proliferation and differentiation. SHB promotes apoptosis and is also required for proper mitogenicity, spreading and tubular morphogenesis in endothelial cells. SHB also plays a role in preventing early cavitation of embryoid bodies and reduces differentiation to cells expressing albumin, amylase, insulin and glucagon. SHB is a multifunctional protein that has different responses in different cells under various conditions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28914 198253 cd10390 SH2_SHD Src homology 2 domain found in SH2 domain-containing adapter proteins D (SHD). The expression of SHD is restricted to the brain. SHD may be a physiological substrate of c-Abl and may function as an adapter protein in the central nervous system. It is also thought to be involved in apoptotic regulation. SHD contains five YXXP motifs, a substrate sequence preferred by Abl tyrosine kinases, in addition to a poly-proline rich region and a C-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28915 198254 cd10391 SH2_SHE Src homology 2 domain found in SH2 domain-containing adapter protein E (SHE). SHE is expressed in heart, lung, brain, and skeletal muscle. SHE contains two pTyr protein binding domains, protein interaction domain (PID) and a SH2 domain, followed by a glycine-proline rich region, all of which are N-terminal to the phosphotyrosine binding (PTB) domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28916 198255 cd10392 SH2_SHF Src homology 2 domain found in SH2 domain-containing adapter protein F (SHF). SHF is thought to play a role in PDGF-receptor signaling and regulation of apoptosis. SHF is mainly expressed in skeletal muscle, brain, liver, prostate, testis, ovary, small intestine, and colon. SHF contains four putative tyrosine phosphorylation sites and an SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28917 198256 cd10393 SH2_RIN1 Src homology 2 (SH2) domain found in Ras and Rab interactor 1 (RIN1)-like proteins. RIN1, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. Previous studies showed that RIN1 interacts with EGF receptors via its SH2 domain and regulates trafficking and degradation of EGF receptors via its interaction with STAM, indicating a vital role for RIN1 in regulating endosomal trafficking of receptor tyrosine kinases (RTKs). RIN1 was first identified as a Ras-binding protein that suppresses the activated RAS2 allele in S. cerevisiae. RIN1 binds to the activated Ras through its carboxyl-terminal domain and this Ras-binding domain also binds to 14-3-3 proteins as Raf-1 does. The SH2 domain of RIN1 is thought to interact with phosphotyrosine-containing proteins, but the physiological partners for this domain are unknown. The proline-rich domain in RIN1 is similar to the consensus SH3 binding regions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28918 198257 cd10394 SH2_RIN2 Src homology 2 (SH2) domain found in Ras and Rab interactor 2 (RIN2)-like proteins. RIN2, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. Ras induces activation of Rab5 through RIN2, which is a direct downstream target of Ras and a direct upstream regulator of Rab5. In other words it is the binding of the GTP-bound form of Ras to the RA domain of RIN2 that enhances the GEF activity toward Rab5. It is thought that the RA domain negatively regulates the Rab5 GEF activity. In steady state, RIN2 is likely to form a closed conformation by an intramolecular interaction between the RA domain and the Vps9p-like (Rab5 GEF) domain, negatively regulating the Rab5 GEF activity. In the active state, the binding of Ras to the RA domain may reduce the intramolecular interaction and stabilize an open conformation of RIN2. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 100
28919 198258 cd10395 SH2_RIN3 Src homology 2 (SH2) domain found in Ras and Rab interactor 3 (RIN3)-like proteins. RIN3, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. RIN3 stimulated the formation of GTP-bound Rab31, a Rab5-subfamily GTPase, and formed enlarged vesicles and tubular structures, where it colocalized with Rab31. Transferrin appeared to be transported partly through the RIN3-positive vesicles to early endosomes. RIN3 interacts via its Pro-rich domain with amphiphysin II, which contains SH3 domain and participates in receptor-mediated endocytosis. RIN3, a Rab5 and Rab31 GEF, plays an important role in the transport pathway from plasma membrane to early endosomes. Mutations in the region between the SH2 and RH domain of RIN3 specifically abolished its GEF action on Rab31, but not Rab5. RIN3 was also found to partially translocate the cation-dependent mannose 6-phosphate receptor from the trans-Golgi network to peripheral vesicles and that this is dependent on its Rab31-GEF activity. These data indicate that RIN3 specifically acts as a GEF for Rab31. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28920 198259 cd10396 SH2_Tec_Itk Src homology 2 (SH2) domain found in Tec protein, IL2-inducible T-cell kinase (Itk). A member of the Tec protein tyrosine kinase Itk is expressed in thymus, spleen, lymph node, T lymphocytes, NK and mast cells. It plays a role in T-cell proliferation and differentiation, analogous to Tec family kinases Txk. Itk has been shown to interact with Fyn, Wiskott-Aldrich syndrome protein, KHDRBS1, PLCG1, Lymphocyte cytosolic protein 2, Linker of activated T cells, Karyopherin alpha 2, Grb2, and Peptidylprolyl isomerase A. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP. It is crucial for the function of Tec PH domains and its absence from Txk is not surprising since it lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 108
28921 198260 cd10397 SH2_Tec_Btk Src homology 2 (SH2) domain found in Tec protein, Bruton's tyrosine kinase (Btk). A member of the Tec protein tyrosine kinase Btk is expressed in bone marrow, spleen, all hematopoietic cells except T lymphocytes and plasma cells where it plays a crucial role in B cell maturation and mast cell activation. Btk has been shown to interact with GNAQ, PLCG2, protein kinase D1, B-cell linker, SH3BP5, caveolin 1, ARID3A, and GTF2I. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. Btk is implicated in the primary immunodeficiency disease X-linked agammaglobulinemia (Bruton's agammaglobulinemia). The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP. It is crucial for the function of Tec PH domains and its absence from Txk is not surprising since it lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. Two tyrosine phosphorylation (pY) sites have been identified in Btk: one located in the activation loop of the catalytic domain which regulates the transition between open (active) and closed (inactive) states and the other in its SH3 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 106
28922 198261 cd10398 SH2_Tec_Txk Src homology 2 (SH2) domain found in Tec protein, Txk. A member of the Tec protein tyrosine kinase Txk is expressed in thymus, spleen, lymph node, T lymphocytes, NK cells, mast cell lines, and myeloid cell line. Txk plays a role in TCR signal transduction, T cell development, and selection which is analogous to the function of Itk. Txk has been shown to interact with IFN-gamma. Unlike most of the Tec family members Txk lacks a PH domain. Instead Txk has a unique region containing a palmitoylated cysteine string which has a similar membrane tethering function as the PH domain. Txk also has a zinc-binding motif, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP and crucial to the function of the PH domain. It is not present in Txk which is not surprising since it lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 106
28923 198262 cd10399 SH2_Tec_Bmx Src homology 2 (SH2) domain found in Tec protein, Bmx. A member of the Tec protein tyrosine kinase Bmx is expressed in the endothelium of large arteries, fetal endocardium, adult endocardium of the left ventricle, bone marrow, lung, testis, granulocytes, myeloid cell lines, and prostate cell lines. Bmx is involved in the regulation of Rho and serum response factor (SRF). Bmx has been shown to interact with PAK1, PTK2, PTPN21, and RUFY1. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP. It is crucial for the function of Tec PH domains. It is not present in Txk and the type 1 splice form of the Drosophila homolog. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 106
28924 198263 cd10400 SH2_SAP1a Src homology 2 (SH2) domain found in SLAM-associated protein (SAP) 1a. The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail. XLP is characterized by an extreme sensitivity to Epstein-Barr virus. Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX[VI], which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28925 198264 cd10401 SH2_C-SH2_Syk_like C-terminal Src homology 2 (SH2) domain found in Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains. In Syk the two SH2 domains do not form such a phosphotyrosine-binding site. The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of Syk. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 99
28926 198265 cd10402 SH2_C-SH2_Zap70 C-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70). ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains. In Syk the two SH2 domains do not form such a phosphotyrosine-binding site. The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of Zap70. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 105
28927 198266 cd10403 SH2_STAP1 Src homology 2 domain found in Signal-transducing adaptor protein 1 (STAP1). STAP1 is a signal-transducing adaptor protein. It is composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. STAP-1 is an ortholog of BRDG1 (BCR downstream signaling 1). STAP1 protein functions as a docking protein acting downstream of Tec tyrosine kinase in B cell antigen receptor signaling. The protein is phosphorylated by Tec and participates in a positive feedback loop, increasing Tec activity. STAP1 has been shown to interact with C19orf2, an unconventional prefoldin RPB5 interactor. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 94
28928 198267 cd10404 SH2_STAP2 Src homology 2 domain found in Signal-transducing adaptor protein 2 (STAP2). STAP2 is a signal-transducing adaptor protein. It is composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. The STAP2 protein is the substrate of breast tumor kinase, an Src-type non-receptor tyrosine kinase that mediates the interactions linking proteins involved in signal transduction pathways. STAP2 has alternative splicing variants. STAP2 has been shown to interact with tyrosine-protein kinase 6 (PTK6). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28929 198268 cd10405 SH2_Vav1 Src homology 2 (SH2) domain found in the Vav1 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All vavs are activated by tyrosine phosphorylation leading to their activation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 are more ubiquitously expressed. Vav1 plays a role in T-cell and B-cell development and activation. It has been identified as the specific binding partner of Nef proteins from HIV-1, resulting in morphological changes, cytoskeletal rearrangements, and the JNK/SAPK signaling cascade, leading to increased levels of viral transcription and replication. Vav1 has been shown to interact with Ku70, PLCG1, Lymphocyte cytosolic protein 2, Janus kinase 2, SIAH2, S100B, Abl gene, ARHGDIB, SHB, PIK3R1, PRKCQ, Grb2, MAPK1, Syk, Linker of activated T cells, Cbl gene and EZH2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. 
The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins with proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28930 198269 cd10406 SH2_Vav2 Src homology 2 (SH2) domain found in the Vav2 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All vavs are activated by tyrosine phosphorylation leading to their activation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 are more ubiquitously expressed. Vav2 is a GEF for RhoA, RhoB and RhoG and may activate Rac1 and Cdc42. Vav2 has been shown to interact with CD19 and Grb2. Alternatively spliced transcript variants encoding different isoforms have been found for Vav2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. 
The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins with proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28931 198270 cd10407 SH2_Vav3 Src homology 2 (SH2) domain found in the Vav3 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All vavs are activated by tyrosine phosphorylation leading to their activation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 are more ubiquitously expressed. Vav3 preferentially activates RhoA, RhoG and, to a lesser extent, Rac1. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. VAV3 has been shown to interact with Grb2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. 
The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins with proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 103
28932 198271 cd10408 SH2_Nck1 Src homology 2 (SH2) domain found in Nck. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates. There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)). They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets. Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus. Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28933 198272 cd10409 SH2_Nck2 Src homology 2 (SH2) domain found in Nck. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates. There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)). They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets. Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus. Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 98
28934 198273 cd10410 SH2_SH2B1 Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B1 (SH2-B, PSM), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases. SH2B1 and SH2B2 function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28935 198274 cd10411 SH2_SH2B2 Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B2 (APS), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases. SH2B1 and SH2B2 function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28936 198275 cd10412 SH2_SH2B3 Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B3 (Lnk), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases. SH2B3 negatively regulates lymphopoiesis and early hematopoiesis. The lnk-deficiency results in enhanced production of B cells, and expansion as well as enhanced function of hematopoietic stem cells (HSCs), demonstrating negative regulatory functions of Sh2b3/Lnk in cytokine signaling. Sh2b3/Lnk also functions in responses controlled by cell adhesion and in crosstalk between integrin- and cytokine-mediated signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 97
28937 198276 cd10413 SH2_Grb7 Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 7 (Grb7) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb7 is part of the Grb7 family of proteins which also includes Grb10, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb7 binds strongly to the erbB2 receptor, unlike Grb10 and Grb14 which bind weakly to it. Grb7 family proteins are phosphorylated on serine/threonine as well as tyrosine residues. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 108
28938 198277 cd10414 SH2_Grb14 Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 14 (Grb14) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb14 is part of the Grb7 family of proteins which also includes Grb7 and Grb10. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb14 binds to Fibroblast Growth Factor Receptor (FGFR) and weakly to the erbB2 receptor. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 108
28939 198278 cd10415 SH2_Grb10 Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 10 (Grb10) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb10 is part of the Grb7 family of proteins which also includes Grb7, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb10 has been shown to interact with many different proteins, including the insulin and IGF1 receptors, platelet-derived growth factor (PDGF) receptor-beta, Ret, Kit, Raf1 and MEK1, and Nedd4. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 108
28940 198279 cd10416 SH2_SH2D2A Src homology 2 domain found in the SH2 domain containing protein 2A (SH2D2A). SH2D2A contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28941 199832 cd10417 SH2_SH2D7 Src homology 2 domain found in the SH2 domain containing protein 7 (SH2D7). SH2D7 contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 102
28942 198281 cd10418 SH2_Src_Fyn_isoform_a_like Src homology 2 (SH2) domain found in Fyn isoform a like proteins. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. This cd contains the SH2 domain found in Fyn isoform a type proteins. Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28943 198282 cd10419 SH2_Src_Fyn_isoform_b_like Src homology 2 (SH2) domain found in Fyn isoform b like proteins. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. This cd contains the SH2 domain found in Fyn isoform b type proteins. Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 101
28944 198283 cd10420 SH2_STAT5b Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5b proteins. STAT5 is a member of the STAT family of transcription factors. Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level. Both STAT5a and STAT5b are ubiquitously expressed and functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). 
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 145
28945 198284 cd10421 SH2_STAT5a Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5a proteins. STAT5 is a member of the STAT family of transcription factors. Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level. Both STAT5a and STAT5b are ubiquitously expressed and functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). 
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 140
28946 199217 cd10422 RNase_Ire1 RNase domain (also known as the kinase extension nuclease domain) of Ire1. The model represents the C-terminal endoribonuclease domain of the multi-functional protein Ire1; Ire1 in addition contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR), which acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain. This leads to a conformational change that stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is homologous to the RNase domain of RNase L, and possesses a novel fold for a nuclease and appears to be rigid irrespective of the activation state of IRE1. Structural analysis and mutational studies have revealed that an early stage 'phosphoryl-transfer' competent conformation of IRE1 favors face-to-face dimerization of the kinase domains which precedes and is distinct from the RNase 'active' back-to-back conformation. Furthermore, in yeast IRE1, the flavonol quercetin activates the RNase and potentiates activation of the protein kinase by ADP, hinting at the possible existence of endogenous cytoplasmic ligands that may function along with stress signals from ER lumen in order to modulate IRE1 activity, thus identifying IRE1 as a target for development of ATP-competitive inhibitors to modulate the UPR with specific relevance for multiple myeloma. 129
28947 199218 cd10423 RNase_RNase-L RNase domain (also known as the kinase extension nuclease domain) of RNase L. Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase, is a highly regulated, latent endoribonuclease (thus the 'L' in RNase L) and is widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system, which blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L is unique in that it is composed of three major domains; N-terminus regulatory ankyrin repeat domain (ARD), followed by a linker, a protein kinase (PK)-like domain and a C-terminal ribonuclease (RNase) domain. The RNase domain has homology with IRE1, also containing both a kinase and an endoribonuclease, that functions in the unfolded protein response (UPR). RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele. The broad range of biological functions of RNase offers a possibility for RNase L as a therapeutic target. 119
28948 198344 cd10424 GST_C_9 C-terminal, alpha helical domain of an unknown subfamily 9 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 9; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 103
28949 259896 cd10425 Ephrin-A_Ectodomain Ectodomain of Ephrin A. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrin As contain a highly conserved receptor binding ectodomain described by this model. Although ephrin As do not have a cytoplasmic tail (in contrast to ephrin Bs), they are still capable of downstream activation of Src family kinases and phosphoinositide-3-kinases, most likely involving coreceptors such as neurotrophin receptors. 130
28950 259897 cd10426 Ephrin-B_Ectodomain Ectodomain of Ephrin B. Ephrin Bs have several conserved tyrosine phosphorylation sites in their cytoplasmic PDZ-like domain, which are important for signal transduction. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands and the five EphB receptors bind to three transmembrane ephrin-B ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrin Bs contain a highly conserved receptor binding ectodomain described in this model. 137
28951 198378 cd10427 FGGY_GK_1 Uncharacterized subgroup; belongs to the glycerol kinases subfamily of the FGGY family of carbohydrate kinases. This subgroup contains uncharacterized bacterial proteins belonging to the glycerol kinase subfamily of the FGGY family of carbohydrate kinases. The glycerol kinase subfamily includes glycerol kinases (GK; EC 2.7.1.30), and glycerol kinase-like proteins from all three kingdoms of living organisms. Glycerol is an important intermediate of energy metabolism and it plays fundamental roles in several vital physiological processes. GKs are involved in the entry of external glycerol into cellular metabolism. They catalyze the rate-limiting step in glycerol metabolism by transferring a phosphate from ATP to glycerol thus producing glycerol 3-phosphate (G3P) in the cytoplasm. Under different conditions, GKs from different species may exist in different oligomeric states. The monomer of GKs is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. 487
28952 198410 cd10428 LFG_like Proteins similar to and including lifeguard (LFG), a putative regulator of apoptosis. Lifeguard (LFG) inhibits Fas-mediated apoptosis and interacts with the death receptor FasR/CD95/Apo1. LFG has been shown to interact with Bax and is supposed to be integral to cellular membranes such as the ER. A close homolog, PP1201 or RECS1, appears located in the Golgi compartment and also interacts with the Fas receptor CD95/Apo1. PP1201 is expressed in response to shear stress. 217
28953 198411 cd10429 GAAP_like Golgi antiapoptotic protein. GAAP (or transmembrane BAX inhibitor motif containing 4) is a regulator of apoptosis that is related to the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Human GAAP has been linked to the modulation of intracellular fluxes of Ca(2+), by suppressing influx from the extracellular medium and reducing release from intracellular stores. A viral homolog (vaccinia virus vGAAP) acts similarly to its human counterpart in inhibiting apoptosis. 233
28954 198412 cd10430 BI-1 BAX inhibitor (BI)-1. Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggest an important regulatory role. In plants, BI-1 like proteins play a role in pathogen resistance. 213
28955 198413 cd10431 GHITM Growth-hormone inducible transmembrane protein. GHITM appears to be ubiquitously expressed in mammalian cells and expression has also been observed in various cancer cell lines. A cytoprotective function has been suggested. It is closely related to the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect. 264
28956 198414 cd10432 BI-1-like_bacterial Bacterial BAX inhibitor (BI)-1/YccA-like proteins. This family is comprised of bacterial relatives of the mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. In plants, BI-1 like proteins play a role in pathogen resistance. A characterized prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes. 211
28957 198415 cd10433 YccA_like YccA-like proteins. A prokaryotic member of the BAX inhibitor (BI)-1 like family of small transmembrane proteins, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes. 205
28958 198381 cd10434 GIY-YIG_UvrC_Cho Catalytic GIY-YIG domain of nucleotide excision repair endonucleases UvrC, Cho, and similar proteins. UvrC is essential for nucleotide excision repair (NER). The N-terminal catalytic GIY-YIG domain of UvrC (also known as Uri domain) is responsible for the 3' incision reaction and the C-terminal half of UvrC, consisting of an UvrB-binding domain (UvrBb), EndoV-like nuclease domain and a helix-hairpin-helix (HhH) DNA-binding domain, contains the residues involved in 5' incision. The N- and C-terminal regions are joined by a common Cys-rich domain containing four conserved Cys residues. Besides UvrC, protein Cho (UvrC homolog) serves as a second endonuclease in E. coli NER. Cho contains GIY-YIG motif followed by a Cys-rich region and shares sequence homology with the N-terminal half of UvrC. It is capable of incising the DNA at the 3' side of a lesion in the presence of the UvrA and UvrB proteins during NER. The C-terminal half of Cho is a unique uncharacterized domain, which is distinct from that of UvrC. Moreover, unlike UvrC, Cho does not require the UvrC-binding domain of UvrB for the 3' incision reaction, which might cause the shift in incision position and the difference in incision efficiencies between Cho and UvrC on different damaged substrates. Due to this, the range of NER in E. coli can be broadened by combining action of Cho and UvrC. This family also includes many uncharacterized epsilon proofreading subunits of DNA polymerase III, which have an additional N-terminal ExoIII domain and a 3'-5' exonuclease domain homolog, fused to an UvrC-like region or a Cho-like region. The UvrC-like region includes a GIY-YIG motif, followed by a Cys-rich region, and an UvrB-binding domain (UvrBb), but lacks the EndoV-like nuclease domain and the helix-hairpin-helix (HhH) DNA-binding domain. 
The Cho-like region consists of a GIY-YIG motif, followed by the Cys-rich region, and the unique uncharacterized domain present in the C-terminal half of Cho. Some family members may not carry the Cys-rich region. This family also includes a specific Cho-like protein from G. violaceus, which possesses only UvrBb domain at the C-terminus, but lacks the additional N-terminal ExoIII domain. The other two remote homologs of UvrC, Bacillus-I and -II, are included in this family as well. Both of them contain a GIY-YIG domain, but no Cys-rich region. Moreover, the whole C-terminal region of Bacillus-I is replaced by an unknown domain, and Bacillus-II possesses another unknown N-terminal extension. 81
28959 198382 cd10435 GIY-YIG_RE_Eco29kI_like Catalytic GIY-YIG domain of type II restriction endonucleases R.Eco29kI, R.Cfr42I, and similar proteins. This family corresponds to the catalytic GIY-YIG domain of a group of GGCGCC-specific type II restriction endonucleases R.Eco29kI, R.Cfr42I, and similar proteins. R.Eco29kI is encoded on plasmid pECO29 in the E. coli strain 29K. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. R.Eco29kI forms a domain-swapped homodimeric catalytically active complex during DNA binding and cleavage. Each subunit contains one GIY-YIG catalytic motif. Restriction endonuclease R.Cfr42I is an isoschizomer of R.Eco29kI. Unlike R.Eco29kI, R.Cfr42I is functional as a homotetramer, binding and cleaving two cognate DNA molecules in a cooperative manner. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis, but not for substrate binding. This family also includes a hypothetical protein from Deinococcus radiodurans that corresponds to MraI, a type II restriction enzyme similar to the GIY-YIG family of homing endonucleases. MraI is shown to be an isoschizomer of Eco29kI and Cfr42I, recognizing the palindromic nucleotide sequence 5'-CCGC/GG-3'. The enzyme shows an absolute requirement for Mg2+, but is active in the absence of added 2-mercaptoethanol. MraI represents the first restriction enzyme from a bacterium whose DNA lacks modified methylated bases. 117
28960 198383 cd10436 GIY-YIG_EndoII_Hpy188I_like Catalytic GIY-YIG domain of coliphage T4 non-specific endonuclease II, type II restriction endonuclease R.Hpy188I, and similar proteins. This family includes two different GIY-YIG enzymes, coliphage T4 non-specific endonuclease II (EndoII), and type II restriction endonuclease R.Hpy188I. They display high sequence similarity to each other, and both of them contain an extra N-terminal hairpin that lacks counterparts in other GIY-YIG enzymes. EndoII encoded by gene denA catalyzes the initial step in degradation of host DNA, which permits scavenging of host-derived nucleotides for phage DNA synthesis. R.Hpy188I recognizes the unique sequence, 5'-TCNGA-3', and cleaves the DNA between nucleotides N and G in its recognition sequence to generate a single nucleotide 3'-overhang. EndoII binds to two DNA substrates as an X-shaped tetrameric structure composed as a dimer of dimers. In contrast, two subunits of R.Hpy188I form a dimer to embrace one bound DNA. Divalent metal-ion cofactors are required for their catalytic events, but not for the substrates binding. 97
28961 198384 cd10437 GIY-YIG_HE_I-TevI_like N-terminal catalytic domain of GIY-YIG intron endonuclease I-TevI, I-BmoI, I-BanI, I-BthII and similar proteins. I-TevI is a site-specific GIY-YIG homing endonuclease encoded within the group I intron of the thymidylate synthase gene (td) from Escherichia coli phage T4. It functions as an endonuclease that catalyzes the first step in intron homing by generating a double-strand break in the intronless td allele within a sequence designated the homing site. I-TevI recognizes its extensive 37 base pair DNA target in a site-specific, but sequence-tolerant manner. The cleavage site is located at 23 (upper strand) and 25 (lower strand) nucleotides upstream of the intron insertion site. A divalent cation, such as Mg2+, is required for the catalysis. I-TevI also acts as a repressor of its own transcription. It binds an operator that is located upstream of the I-TevI coding sequence and overlaps the T4 late promoter, which drives I-TevI expression from within the td intron. I-TevI binds the homing sites and the operator with the same affinity, but cleaves the homing site more efficiently than the operator. I-TevI consists of an N-terminal catalytic domain, containing the GIY-YIG motif, and a C-terminal DNA-binding domain that binds DNA as a monomer, joined by a flexible linker. The C-terminal domain includes three subdomains: a zinc finger, a minor-groove binding alpha-helix (NUMOD3, nuclease-associated modular domain 3), and a helix-turn-helix domain (HTH). The last two are responsible for DNA-binding. The zinc finger is part of the linker and not required for DNA-binding. It is implicated as a distance sensor to constrain the catalytic domain to cleave the homing site at a fixed position. None of other GIY-YIG endonucleases have been found to have the zinc finger motif. 
This family also includes a reduced activity isoschizomer of I-TevI, I-BmoI, which is encoded within the group I intron of the thymidylate synthase (TS) gene (thyA) from Bacillus mojavensis. I-BmoI catalyzes the first step in intron homing by generating a double-strand break in the intronless td allele within a sequence designated the homing site in the presence of a divalent cation cofactor, such as Mg2+. In the absence of Mg2+, I-BmoI only nicks one of the strands. Both I-BmoI and I-TevI bind a homologous stretch of TS-encoding DNA as monomers, but use different strategies to distinguish intronless from intron-containing substrates. I-TevI recognizes substrates at the level of DNA-binding. However, I-BmoI binds both intron-containing and intronless TS-encoding substrates, but efficiently cleaves only intronless substrate. Afterwards they cleave their respective intronless substrates in the same positions, and both require a critical G-C base pair adjacent to the top strand site for efficient cleavage. The C-terminal domain of I-BmoI has nuclease-associated modular DNA-binding domains (NUMODs), but lacks the zinc finger, which is different from that of I-TevI. Although the zinc finger implicated in distance determination in I-TevI is absent, I-BmoI still possesses some cleavage distance discrimination. Besides I-TevI and I-BmoI, this family contains a putative GIY-YIG homing endonuclease, I-BanI, encoded within the self-splicing group I intron of nrdE gene from Bacillus anthracis. It contains two major domains, the N-terminal GIY-YIG domain and the C-terminal DNA-binding domain that consists of a minor-groove DNA binding alpha-helix motif and a helix-turn-helix (HTH) motif. I-BanI generates a double-strand break (DSB) in the intronless nrdE gene. The cleavage site is located at 5 and 7 nucleotides upstream of the intron insertion site, with 2-nucleotide 3' extensions. 
The recognition site is 35 to 40 base pairs and covers the cleavage site with a bias toward the downstream region including the (intervening sequence) IVS insertion site. Moreover, this family contains another putative GIY-YIG homing endonuclease, I-BthII, encoded within the self-splicing group I intron of nrdF gene from Bacillus thuringiensis ssp. pakistani. It contains a GIY-YIG motif that generates a double-strand break (DSB) in the intronless nrdF gene. The cleavage site is located at 7 and 9 nucleotides upstream of the intron insertion site, leaving 2-nucleotide 3' extensions. The recognition site is 27 to 29 base pairs with the DSB cleavage site at the 5'-end of the top strand, and with the intervening sequence (IVS) insertion site approximately in the middle of the recognition site. 90
28962 198385 cd10438 GIY-YIG_MSH Catalytic GIY-YIG domain of eukaryotic DNA mismatch repair protein MutS homologs. This family represents a putative GIY-YIG nuclease domain C-terminally fused to the DNA-repair ATPase on a small group of eukaryotic DNA mismatch repair protein mutS homologs (MSH). The MSH proteins in this family do not have the zinc finger domain, but have a predicted mitochondrial localization. They might play roles in the recognition and repair of errors made during the replication of DNA. The prototype of this family is the protein encoded by the chloroplast mutator (CHM) locus from Arabidopsis thaliana. It is suggested that this protein could be involved in the maintenance of mitochondrial genome stability. 72
28963 198386 cd10439 GIY-YIG_COG3410 GIY-YIG domain of uncharacterized bacterial protein structurally related to COG3410. This family contains a group of uncharacterized bacterial proteins. Although their functional roles have not been recognized, these proteins contain a putative GIY-YIG domain in their N-terminus. Moreover, a conserved domain COG3410 with unknown function has been found in the C-terminus of most family members. 80
28964 198387 cd10440 GIY-YIG_COG3680 GIY-YIG domain of uncharacterized proteins from bacteria and their eukaryotic homologs. This family includes a group of functionally uncharacterized proteins from bacteria and their eukaryotic homologs which are present only in metazoa. These proteins might have nuclease activities and possibly be engaged in DNA repair or recombination, since they share sequence homology with the catalytic GIY-YIG domain of bacterial UvrC DNA repair proteins. Distinct from their prokaryotic relatives, the eukaryotic homologs contain an N-terminal extension that includes a region of approximately 3-4 ankyrin repeats, unique motifs mediating protein-protein interactions. Some of the eukaryotic homologs have an additional LEM domain located between the ankyrin repeats region and the GIY-YIG domain. The LEM domain, found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The different domain composition of the eukaryotic homologs suggests that they might participate in interactions with multiple partners and implies important cellular function. 94
28965 198388 cd10441 GIY-YIG_COG1833 GIY-YIG domain of hypothetical proteins from archaea and their bacterial homologs. This family includes a group of functionally uncharacterized hypothetical proteins from archaea and their bacterial homologs. These proteins contain a putative GIY-YIG domain that shows sequence homology with bacterial UvrC DNA repair proteins. Meanwhile, all of them share a C-terminal extension with semi-conserved Cys and His residues, which suggests that the extended region may be a zinc-binding nucleic acid interaction domain. Although the majority of family members have a standalone GIY-YIG domain composition, some of them do have additional endonuclease III domain or sugar fermentation stimulation protein domain, both of which are N-terminally fused to the GIY-YIG domain. As a result, those proteins could perform some other role by cooperating with different domains, which remains to be determined in the future. 112
28966 198389 cd10442 GIY-YIG_PLEs Catalytic GIY-YIG endonuclease domain of penelope-like elements and similar proteins. This model corresponds to the EN domain of PLEs that contains the catalytic module of the GIY-YIG endonucleases of group I bacterial/organellar introns, as well as bacterial UvrC DNA repair proteins. It can cleave DNA with low nucleotide sequence specificity. However, the PLEs EN domain is distinct from other GIY-YIG endonucleases by the presence of a well-conserved CCHH motif (CX(2-7)CX(33-39)HX(3-5)H, X can be any residue). The role of the CCHH motif has not yet been identified. Penelope-like elements (PLEs) represent a novel class of eukaryotic retroelements, which do not belong to either long terminal repeat (LTR) retrotransposons or non-LTR retrotransposons (often called LINEs), but instead form a sister clade to telomerase reverse transcriptases (TERTs), highly specialized non-mobile reverse transcriptases (RTs) which are responsible for the addition of telomeric repeats to the ends of eukaryotic chromosomes. The single open reading frame (ORF) encoded by PLE consists of two principal domains, RT domain and endonuclease (EN) domain, joined by a linker region of variable length. Both of these two domains are functionally active. 92
28967 198390 cd10443 GIY-YIG_HE_Tlr8p_PBC-V_like GIY-YIG domain of uncharacterized hypothetical protein found in phycodnavirus PBCV-1 DNA virus, T. thermophila Tlr element encoding protein Tlr8p, and similar proteins found in bacteria. The family includes a group of diverse uncharacterized hypothetical proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI. Similar to I-TevI, family members from phycodnavirus PBCV-1 DNA virus have nuclease-associated modular DNA-binding domains (NUMODs) and a helix-turn-helix (HTH) domain C-terminally fused to the GIY-YIG domain, which suggests that these PBCV-1 acquired the I-TevI-like homing endonucleases from phages by horizontal gene transfer. This family also includes proteins that appear to connect homing endonucleases with Penelope elements, such as Tetrahymena thermophila Tlr element encoding protein Tlr8p that possesses additional N-terminal and central structural regions, followed by a putative superfamily 1 helicase domain and I-TevI-like GIY-YIG domain, but lacks the NUMOD domains and HTH domain. It is suggested that the Tlr8p element could have acquired its GIY-YIG domain within the nucleus of the ciliate cell infected by the Phycodnavirus. Some family members only contain a standalone GIY-YIG domain and their biological functions are unclear. 90
28968 198391 cd10444 GIY-YIG_SegABCDEFG N-terminal catalytic GIY-YIG domain of bacteriophage T4 segABCDEFG gene encoding proteins. The prototypes of Seg family are proteins SegA, B, C, D, E, F, and G encoded by seven seg genes segA, B, C, D, E, F, and G in the bacteriophage T4 genome, respectively. SegA, B, C, D, E, F, and G are not encoded by introns, but free-standing homologs of the GIY-YIG family of endonucleases encoded by group I introns, which are thought to initiate the homing of their own intron by cleaving the intronless DNA at or near the site of insertion. Both phage T4 intron-encoded and free-standing GIY-YIG endonucleases contribute to the exclusion of T2 markers from the progeny of mixed infections. SegA, encoded by the bacteriophage T4 segA gene, is a double-strand DNA endonuclease with a hierarchy of site specificity. The cleavage site of SegA is located in the uvsX gene of T4. Its cleaving activity requires the presence of Mg2+ and can be stimulated by the presence of ATP or ATPgammaS. Bacteriophage T4 segB gene encoding protein SegB is a site-specific endonuclease that recognizes a 27-bp sequence, cleaves DNA by introducing double-strand breaks in the adjacent gene 56 of T2 during mixed infection in the presence of Mg2+, Mn2+, or Ca2+ cations, and produces mostly 3' 2-nt protruding ends at its DNA cleavage site. It functions as a homing endonuclease to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages. Bacteriophage T4 segE gene encoding SegE is a site-specific endonuclease that preferentially cleaves DNA in a site located at the 5' end of the uvsW gene in the RB30 genome. It is responsible for a non-reciprocal genetic exchange between T-even-related phages. Bacteriophage T4 gene 69 encoding SegF is a site-specific double-strand DNA endonuclease that promotes marker exclusion. 
It preferentially introduces a double-strand break in the adjacent T2 gene 56 over T4 gene 56 both in vitro and in vivo during mixed infection, which results in the replacement of T2 gene 56 by T4 gene 56 in a process similar to group I intron homing. The cleavage site is located 210- and 212-bp upstream from its insertion site. Bacteriophage T4 segG gene (formerly gene 32.1) encoding SegG (also known as F-TevIV) is a double-strand DNA endonuclease adjacent to gene 32 of phage T4 that promotes marker exclusion. Although it is absent from phage T2, SegG preferentially introduces a double-strand break in T2 gene 32 during mixed infection, which results in replacement of T2 genetic markers by the corresponding T4 markers. The cleavage site is located 332- and 334-bp from its insertion site. 85
28969 198392 cd10445 GIY-YIG_bI1_like Catalytic GIY-YIG domain of putative intron-encoded endonuclease bI1 and similar proteins. The prototype of this family is a putative intron-encoded mitochondrial DNA endonuclease bI1 found in mitochondrion Ustilago maydis. This protein may arise from proteolytic cleavage of an in-frame translation of COB exon 1 plus intron 1, containing the bI1 open reading frame. It contains an N-terminal truncated non-functional cytochrome b region and a C-terminal intron-encoded endonuclease bI1 region. The bI1 region shows high sequence similarity to endonucleases of group I introns of fungi and phage and might be involved in intron homing. Many uncharacterized bI1 homologs existing in fungi and chlorophyta in this family do not contain the cytochrome b region, but have a standalone bI1-like region, which contains a GIY-YIG domain and a minor-groove binding alpha-helix nuclease-associated modular domain (NUMOD). This family also includes a Yarrowia lipolytica mobile group-II intron COX1-i1, also called intron alpha, encoding protein with reverse transcriptase activity. The group-II intron COX1-i1 may be involved both in the generation of the circular multimeric DNA molecules (senDNA alpha) which amplify during the senescence syndrome and in the generation of the site-specific deletion which accumulates in the premature-death syndrome. 88
28970 198393 cd10446 GIY-YIG_unchar_1 GIY-YIG domain of uncharacterized hypothetical protein found in bacteria. The family includes a group of uncharacterized bacterial hypothetical proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. 103
28971 198394 cd10447 GIY-YIG_unchar_2 GIY-YIG domain of uncharacterized hypothetical protein found in bacteria and archaea. The family includes a group of uncharacterized hypothetical proteins, mainly found in bacteria and a few found in archaea, with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. 80
28972 198395 cd10448 GIY-YIG_unchar_3 GIY-YIG domain of uncharacterized hypothetical protein found in bacteria. The family includes a group of uncharacterized bacterial proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. 87
28973 198396 cd10449 GIY-YIG_SLX1_like Catalytic GIY-YIG domain of yeast structure-specific endonuclease subunit SLX1 and its homologs. Structure-specific endonuclease subunit SLX1 is a highly conserved protein from yeast to human, with an N-terminal GIY-YIG endonuclease domain and a C-terminal PHD-type zinc finger postulated to mediate protein-protein or protein-DNA interaction. SLX1 forms active heterodimeric complexes with its SLX4 partner, which has additional roles in the DNA damage response that are distinct from the function of the heterodimeric SLX1-SLX4 nuclease. In yeast, the SLX1-SLX4 complex functions as a 5' flap endonuclease that maintains ribosomal DNA copy number, where SLX1 and SLX4 are shown to be catalytic and regulatory subunits, respectively. This endonuclease introduces single-strand cuts in duplex DNA on the 3' side of junctions with single-strand DNA. In addition to 5' flap endonuclease activity, human SLX1-SLX4 complex has been identified as a Holliday junction resolvase that promotes symmetrical cleavage of static and migrating Holliday junctions. SLX1 also associates with MUS81, EME1, C20orf94, PLK1, and ERCC1. Some eukaryotic SLX1 homologs lack the zinc finger domain, but possess intrinsically unstructured extensions of unknown function. These unstructured segments might be involved in interactions with other proteins. 67
28974 198397 cd10450 GIY-YIG_AtGrxS16_like GIY-YIG domain found in CAXIP1-like proteins, iron-sulfur cluster assembly proteins, and similar proteins. The family includes CAX-interacting protein-1 (CXIP1)-like proteins and iron-sulfur cluster assembly proteins, both of which contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. CAXIP1 is a novel PICOT (protein kinase C-interacting cousin of thioredoxin) domain-containing Arabidopsis protein that activates H+/Ca2+ exchanger CAX1, and its homolog CAX4, but not CAX2 or CAX3. Iron-sulfur cluster assembly proteins in this family also contain a C-terminal NifU-like domain that corresponds to a common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 70
28975 198398 cd10451 GIY-YIG_LuxR_like GIY-YIG domain of LuxR and ArsR family transcriptional regulators, and uncharacterized hypothetical proteins found in bacteria. The family includes some bacterial LuxR and ArsR family transcriptional regulators. Their C-terminal conserved domain shows sequence similarity to the N-terminal catalytic GIY-YIG domains of intron-encoded homing endonucleases. Besides, they have an N-terminally fused transcriptional regulator module, comprising the winged helix-turn-helix (wHTH) domain and uncharacterized domain DUF2087. In this respect, they are distinct from GIY-YIG homing endonucleases, which typically contain a variety of C-terminally fused nuclease-associated modular DNA-binding domains (NUMODs). Moreover, some key residues relevant to catalysis in GIY-YIG endonucleases are mutated or absent in this family, which suggests that members in this family might have lost the catalytic function that GIY-YIG endonucleases possess. This family also includes many uncharacterized hypothetical proteins that consist of a standalone GIY-YIG like domain. 101
28976 198399 cd10452 GIY-YIG_RE_Eco29kI_NgoMIII Catalytic GIY-YIG domain of type II restriction enzyme R.Eco29kI, R.NgoMIII, and similar proteins. This family corresponds to the catalytic GIY-YIG domain of GGCGCC-specific type II restriction endonucleases R.Eco29kI, NgoMIII, and similar proteins. R.Eco29kI is encoded on plasmid pECO29 in the E. coli strain 29K. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. R.Eco29kI forms a domain-swapped homodimeric catalytically active complex during DNA binding and cleavage. Each subunit contains one GIY-YIG catalytic motif. Restriction endonuclease R.NgoMIII is an isoschizomer of R.Eco29kI. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis, but not for their substrate binding. 204
28977 198400 cd10453 GIY-YIG_RE_Cfr42I Catalytic GIY-YIG domain of type II restriction enzyme R.Cfr42I and similar proteins. This family corresponds to the catalytic GIY-YIG domain of GGCGCC-specific type II restriction endonucleases R.Cfr42I and similar proteins. R.Cfr42I is encoded on plasmid pET21b(+) in the Citrobacter freundii RFL42 strain. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. It is an isoschizomer of R.Eco29kI. Unlike R.Eco29kI, R.Cfr42I is functional as a homotetramer, binding and cleaving two cognate DNA molecules in a cooperative manner. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis. 156
28978 198401 cd10454 GIY-YIG_COG3680_Meta GIY-YIG domain of hypothetical proteins from Metazoa. Members of this family are functionally uncharacterized hypothetical proteins from Metazoa. They have bacterial homologs that display sequence homology with the catalytic GIY-YIG domain of bacterial UvrC DNA repair proteins. However, unlike their bacterial relatives, these Metazoan proteins contain an N-terminal extension that includes a region of approximately 3-4 ankyrin repeats, unique motifs mediating protein-protein interactions. Some of them have an additional LEM domain located between the ankyrin repeat region and the GIY-YIG domain. The LEM domain, found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The differing domain composition suggests that members of this subfamily might interact with multiple partners and may have important cellular functions. 114
28979 198402 cd10455 GIY-YIG_SLX1 Catalytic GIY-YIG domain of yeast structure-specific endonuclease subunit SLX1 and its eukaryotic homologs. Structure-specific endonuclease subunit SLX1 is a highly conserved protein from yeast to human, with an N-terminal GIY-YIG endonuclease domain and a C-terminal PHD-type zinc finger postulated to mediate protein-protein or protein-DNA interaction. SLX1 forms active heterodimeric complexes with its SLX4 partner, which has additional roles in the DNA damage response that are distinct from the function of the heterodimeric SLX1-SLX4 nuclease. In yeast, the SLX1-SLX4 complex functions as a 5' flap endonuclease that maintains ribosomal DNA copy number, where SLX1 and SLX4 are shown to be catalytic and regulatory subunits, respectively. This endonuclease introduces single-strand cuts in duplex DNA on the 3' side of junctions with single-strand DNA. In addition to 5' flap endonuclease activity, human SLX1-SLX4 complex has been identified as a Holliday junction resolvase that promotes symmetrical cleavage of static and migrating Holliday junctions. SLX1 also associates with MUS81, EME1, C20orf94, PLK1, and ERCC1. Some eukaryotic SLX1 homologs lack the zinc finger domain, but possess intrinsically unstructured extensions of unknown function. These unstructured segments might be involved in interactions with other proteins. 76
28980 198403 cd10456 GIY-YIG_UPF0213 The GIY-YIG domain of uncharacterized protein family UPF0213 related to structure-specific endonuclease SLX1. This family contains a group of uncharacterized proteins found mainly in bacteria and several in dsDNA viruses. Although their functional roles have not been recognized, these proteins show significant sequence similarities with the N-terminal GIY-YIG endonuclease domain of structure-specific endonuclease subunit SLX1, which binds another structure-specific endonuclease subunit SLX4 to form an active heterodimeric SLX1-SLX4 complex. This complex functions as a 5' flap endonuclease in yeast, and has also been identified as a Holliday junction resolvase in human. 68
28981 198404 cd10457 GIY-YIG_AtGrxS16 GIY-YIG domain found in CAXIP1-like proteins. The family includes CAX-interacting protein-1 (CXIP1)-like proteins which contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. CAXIP1 is a novel PICOT (protein kinase C-interacting cousin of thioredoxin) domain-containing Arabidopsis protein that activates H+/Ca2+ exchanger CAX1, and its homolog CAX4, but not CAX2 or CAX3. 74
28982 198405 cd10458 GIY-YIG_NifU GIY-YIG domain found in iron-sulfur cluster assembly proteins. This family includes a group of uncharacterized iron-sulfur cluster assembly proteins that transiently bind the iron-sulfur cluster before transfer to target apoproteins. These iron-sulfur cluster assembly proteins contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. They also contain a C-terminal NifU-like domain that corresponds to a common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 76
28983 198417 cd10459 PUB_PNGase PNGase/UBA or UBX (PUB) domain of the P97 adaptor protein Peptide:N-glycanase (PNGase). This PUB (PNGase/UBA or UBX) domain is found in the p97 adaptor protein PNGase (Peptide:N-glycanase). The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The mammalian PNGase binds a considerable number of proteins via its PUB domain; these include ERAD E3 enzyme, the autocrine motility factor receptor (AMFR or gp78), SAKS and Derlin-1. 93
28984 198418 cd10460 PUB_UBXD1 PNGase/UBA or UBX (PUB) domain of UBXD1. This PUB domain is found in p97 adaptor protein UBXD1 (UBX domain-containing protein 1, also called UBXD6). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes, except for fungi, and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. UBXD1 also interacts with HRD1 and HERP, both components of the ERAD pathway, via p97. It is possibly involved in aggresome formation; aggresomes are perinuclear compartments that contain misfolded proteins colocalized with centrosome markers. 102
28985 198419 cd10461 PUB_UBA_plant PNGase/UBA or UBX (PUB) domain of plant Ubiquitin-associated (UBA) domain containing proteins. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover. This family contains only plant UBA domain-containing proteins. 107
28986 198420 cd10462 PUB_UBA PNGase/UBA or UBX (PUB) domain of Ubiquitin-associated (UBA) domain containing proteins. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover. 100
28987 198421 cd10463 PUB_WLM PNGase/UBA or UBX (PUB) domain of the Wss1p-like metalloprotease (WLM) family. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. WLM domains are found mostly in plant proteins, belonging to the Zincin-like superfamily of Zn-dependent peptidases that are linked to the ubiquitin signaling pathway through its fusion with the ubiquitin-binding PUB, ubiquitin-like, and Little Finger domains. More specifically, genetic evidence implicates the WLM family in de-SUMOylation. 96
28988 198422 cd10464 PUB_RNF31 PNGase/UBA or UBX (PUB) domain of the RNF31 (or HOIP) protein. This PUB domain is found in the p97 adaptor protein RNF31 (RING finger protein 31). The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97. RNF31 functions in a complex with another RING-finger protein (HOIL-IL), displaying E3 ubiquitin-protein ligase activity, and forming linear ubiquitin chain assembly complex (LUBAC) through linkages between the N- and C-termini of ubiquitin. LUBAC has been shown to activate the NF-kappaB pathway. 111
28989 198456 cd10466 FimH_man-bind Mannose binding domain of FimH and related proteins. This family, restricted to gammaproteobacteria, includes FimH, a mannose-specific adhesin of uropathogenic Escherichia coli strains. The domain appears to bind specifically to D-mannose and mediates cellular adhesion to mannosylated proteins, a prerequisite to colonization and subsequent invasion of epithelial tissues. 160
28990 198458 cd10467 FAM20_C_like C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins. Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways; these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site; mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family. 210
28991 198459 cd10468 Four-jointed-like_C C-terminal kinase domain of Drosophila Four-jointed (Fj), mouse Fjx1, and related proteins. Drosophila Fj is a Golgi type II transmembrane protein that is partially secreted, and is a kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). Mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in Drosophila Fj include loss of the intermediate leg joint, and a PCP defect in the eye. The expression of the Drosophila fj gene is modulated by Notch, Unpaired (JAK/STAT), and Wingless signals. Mouse Fjx1 has been shown to be involved in both the Fat and Hippo signaling pathways; these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. The expression of the mouse fjx1 gene is also Notch dependent; fjx1 is expressed in the brain, the peripheral nervous system, in epithelial structures of different organs, and during limb development. 286
28992 198460 cd10469 FAM20A_C C-terminal putative kinase domain of FAM20A. Human FAM20A may play a fundamental role in enamel development and gingival homeostasis as mutations in FAM20A may underlie the pathogenesis of the autosomal recessive Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. It is expressed in ameloblasts and gingivae. AI refers to a heterogeneous group of disorders of biomineralization caused by a lack of normal enamel formation. Mouse FAM20A is a secreted protein and the gene encoding it is differentially expressed in hematopoietic cells undergoing myeloid differentiation. This protein has also been associated with growth disorder in mice. The C-terminal domain of FAM20A is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family. 217
28993 198461 cd10470 FAM20B_C C-terminal putative kinase domain of FAM20B xylose kinase. Experiments with human FAM20B suggest that it is a xylose kinase that participates in proteoglycan production. It may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. The C-terminal domain of FAM20B is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family. 206
28994 198462 cd10471 FAM20C_C C-terminal putative kinase domain of FAM20C (also known as Dentin Matrix Protein 4, DMP4). Mouse DMP4 is abundant in the dentin matrix, and is expressed in high levels in odontoblasts. These latter cells synthesize various nucleators or inhibitors of mineralization. The in vivo role of DMP4 in dentinogenesis is unclear. However, gain- and loss-of-function experiments suggest that it participates in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. In addition to this domain, DMP4 contains a Greek key calcium-binding domain. Human FAM20C participates in bone development; mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), an autosomal recessive disorder in which affected individuals die within days or weeks of birth, usually due to thoracic malformation resulting in respiratory failure. The C-terminal domain of FAM20C is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family. 212
28995 198440 cd10472 EphR_LBD_B Ligand Binding Domain of Ephrin type-B receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. They play important roles in synapse formation and plasticity, spine morphogenesis, axon guidance, and angiogenesis. In the intestinal epithelium, EphB receptors are Wnt signaling target genes that control cell compartmentalization. They function as suppressors of colon cancer progression. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. 176
28996 198441 cd10473 EphR_LBD_A Ligand Binding Domain of Ephrin type-A Receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. 173
28997 198442 cd10474 EphR_LBD_B4 Ligand Binding Domain of Ephrin type-B Receptor 4. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB4 plays a role in osteoblast differentiation and has been linked to multiple myeloma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 180
28998 198443 cd10475 EphR_LBD_B6 Ligand Binding Domain of Ephrin type-B Receptor 6. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB6, a kinase-defective member of this family, is downregulated in MDA-MB-231-breast cancer cells and myeloid cancers and upregulated in neuroblastoma and glioblastoma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 180
28999 198444 cd10476 EphR_LBD_B1 Ligand Binding Domain of Ephrin type-B Receptor 1. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. Using EphB1 knockout-mice, EphB1 has been shown to be essential to the development of long-term potentiation (LTP), a cellular model of synaptic plasticity, learning and memory formation. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 176
29000 198445 cd10477 EphR_LBD_B2 Ligand Binding Domain of Ephrin type-B Receptor 2. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB2 plays a role in cell positioning in the gastrointestinal tract by being expressed in proliferating progenitor cells. It also has been implicated in colorectal cancer. A loss of EphB2, as well as EphA4, also precedes memory decline in a murine model of Alzheimer's disease. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 178
29001 198446 cd10478 EphR_LBD_B3 Ligand Binding Domain of Ephrin type-B Receptor 3. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB3 plays a role in cell positioning in the gastrointestinal tract by being preferentially expressed in Paneth cells. It also has been implicated in early colorectal cancer and early stage squamous cell lung cancer. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 173
29002 198447 cd10479 EphR_LBD_A1 Ligand Binding Domain of Ephrin type-A Receptor 1. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA1 is downregulated in some advanced colorectal and myeloid cancers and upregulated in neuroblastoma and glioblastoma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion. 177
29003 198448 cd10480 EphR_LBD_A2 Ligand Binding Domain of Ephrin type-A Receptor 2. EphRs comprise the largest subfamily of receptor tyr kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA2 negatively regulates cell differentiation and has been shown to be overexpressed in tumor cells and tumor blood vessels in a variety of cancers including breast, prostate, lung, and colon. As a result, it is an attractive target for drug design since its inhibition could affect several aspects of tumor progression. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion. 174
29004 198449 cd10481 EphR_LBD_A3 Ligand Binding Domain of Ephrin type-A Receptor 3. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA3 has been implicated in leukemia, lung and other cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion. 173
29005 198450 cd10482 EphR_LBD_A4 Ligand Binding Domain of Ephrin type-A Receptor 4. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. A loss of EphA4, as well as EphB2, precedes memory decline in a murine model of Alzheimer's disease. EphA4 has been shown to have a negative effect on axon regeneration and functional restoration in corticospinal lesions and is downregulated in some cervical cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 174
29006 198451 cd10483 EphR_LBD_A5 Ligand Binding Domain of Ephrin type-A Receptor 5. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA5 is almost exclusively expressed in the nervous system. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 173
29007 198452 cd10484 EphR_LBD_A6 Ligand Binding Domain of Ephrin type-A Receptor 6. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA6, like other Eph receptors and their ephrin ligands, seems to play a role in neural development, underlying learning and memory. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 173
29008 198453 cd10485 EphR_LBD_A7 Ligand Binding Domain of Ephrin type-A Receptor 7. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA7 has been implicated in various cancers, including prostate, gastric and colorectal cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 177
29009 198454 cd10486 EphR_LBD_A8 Ligand Binding Domain of Ephrin type-A Receptor 8. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA8 has been implicated in various cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). 173
29010 198455 cd10487 EphR_LBD_A10 Ligand Binding Domain of Ephrin type-A Receptor 10. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA10, which contains an inactive tyr kinase domain, may function to attenuate signals of co-clustered active receptors. EphA10 is mainly expressed in the testis. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction results in cell-cell repulsion or adhesion. 173
29011 199812 cd10488 MH1_R-SMAD N-terminal Mad Homology 1 (MH1) domain of receptor regulated SMADs. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. It binds to the major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in all receptor regulated SMADs (R-SMADs) including SMAD1, SMAD2, SMAD3, SMAD5 and SMAD9. SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta. SMAD4, a common mediator SMAD (co-SMAD) binds R-SMADs, forming an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD5 is involved in bone morphogenetic proteins (BMP) signal modulation, possibly playing a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 (also known as SMAD8) can mediate the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway 123
29012 199813 cd10489 MH1_SMAD_6_7 N-terminal Mad Homology 1 (MH1) domain in SMAD6 and SMAD7. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in SMAD6 and SMAD7, both inhibitory SMADs (I-SMADs) and negative regulators of signaling mediated by TGF-beta superfamily. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling while SMAD7 enhances muscle differentiation and is often associated with cancer, tissue fibrosis and inflammatory diseases. 119
29013 199814 cd10490 MH1_SMAD_1_5_9 N-terminal Mad Homology 1 (MH1) domain in SMAD1, SMAD5 and SMAD9 (also known as SMAD8). The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in SMAD1, SMAD5 and SMAD9, all closely related receptor regulated SMADs (R-SMADs). SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD5 is involved in bone morphogenetic proteins (BMP) signal modulation and may also play a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 mediates the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway. 124
29014 199815 cd10491 MH1_SMAD_2_3 N-terminal Mad Homology 1 (MH1) domain in SMAD2 and SMAD3. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 is found in SMAD2 as well as SMAD3. SMAD2 mediates the signal of the transforming growth factor (TGF)-beta, and thereby regulates multiple cellular processes, such as cell proliferation, apoptosis, and differentiation. It plays a role in the transmission of extracellular signals from ligands of the TGF-beta superfamily growth factors into the cell nucleus. SMAD3 modulates signals of activin and TGF-beta. It binds SMAD4, enabling its transmigration into the nucleus where it forms complexes with other proteins and acts as a transcription factor. Increased SMAD3 activity has been implicated in the pathogenesis of scleroderma. 124
29015 199816 cd10492 MH1_SMAD_4 N-terminal Mad Homology 1 (MH1) domain in SMAD4. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD4, a common mediator SMAD (co-SMAD), which belongs to the Dwarfin family of proteins and is involved in many cell functions such as differentiation, apoptosis, gastrulation, embryonic development and cell cycle. SMAD4 binds receptor regulated SMADs (R-SMADs) such as SMAD1 or SMAD2, and forms an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD4 is often mutated in several cancers, such as multiploid colorectal cancer and pancreatic carcinoma, as well as in juvenile polyposis syndrome (JPS). 125
29016 199817 cd10493 MH1_SMAD_6 N-terminal Mad Homology 1 (MH1) domain in SMAD6. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD6, an inhibitory SMAD (I-SMAD) or antagonistic SMAD, which acts as a negative regulator of signaling mediated by TGF-beta superfamily ligands, by competing with SMAD4 and preventing the transcription of SMAD4's gene products. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling. 113
29017 199818 cd10494 MH1_SMAD_7 N-terminal Mad Homology 1 (MH1) domain in SMAD7. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins. It binds to the major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD7, an inhibitory SMAD (I-SMAD) or antagonistic SMAD, which acts as a negative regulator of signaling mediated by TGF-beta superfamily ligands, by blocking TGF-beta type 1 and activin association with the receptor as well as access to SMAD2. SMAD7 enhances muscle differentiation, playing pivotal roles in embryonic development and adult homoeostasis. Altered expression of SMAD7 is often associated with cancer, tissue fibrosis and inflammatory diseases. 123
29018 199820 cd10495 MH2_R-SMAD C-terminal Mad Homology 2 (MH2) domain in receptor regulated SMADs. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. Receptor regulated SMADs (R-SMADs) include SMAD1, SMAD2, SMAD3, SMAD5 and SMAD9. SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta. SMAD5 is involved in BMP signal modulation, possibly playing a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 (also known as SMAD8) can mediate the differentiation of mesenchymal stem cells into tendon-like cells by inhibiting the osteogenic pathway. 182
29019 199821 cd10496 MH2_I-SMAD C-terminal Mad Homology 2 (MH2) domain in Inhibitory SMADs. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD6 and SMAD7 are inhibitory SMADs (I-SMADs) that function as negative regulators of signaling mediated by the TGF-beta superfamily. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling, while SMAD7 enhances muscle differentiation and is often associated with cancer, tissue fibrosis and inflammatory diseases. 165
29020 199822 cd10497 MH2_SMAD_1_5_9 C-terminal Mad Homology 2 (MH2) domain in SMAD1, SMAD5 and SMAD9. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD1, SMAD5 and SMAD9 (also known as SMAD8), are receptor regulated SMADs (R-SMADs). SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD5 is involved in BMP signal modulation and may also play a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 mediates the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway. 201
29021 199823 cd10498 MH2_SMAD_4 C-terminal Mad Homology 2 (MH2) domain in SMAD4. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. SMAD4, which belongs to the Dwarfin family of proteins, is involved in many cell functions such as differentiation, apoptosis, gastrulation, embryonic development and the cell cycle. SMAD4 binds receptor regulated SMADs (R-SMADs) such as SMAD1 or SMAD2, and forms an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD4 is often mutated in several cancers, such as multiploid colorectal cancer, cervical cancer and pancreatic carcinoma, as well as in juvenile polyposis syndrome. 222
29022 199824 cd10499 MH2_SMAD_6 C-terminal Mad Homology 2 (MH2) domain in SMAD6. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD6, an inhibitory or antagonistic SMAD (I-SMAD), acts as a negative regulator of signaling mediated by the TGF-beta superfamily of ligands, by competing with SMAD4 and preventing the transcription of SMAD4's gene products. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling. SMAD6 and SMAD7 act as critical mediators for effective TGF-beta I-mediated suppression of Interleukin-1/Toll-like receptor (IL-1R/TLR) signaling through simultaneous binding to Pellino-1, an adaptor protein of interleukin-1 receptor associated kinase 1 (IRAK1), via their MH2 domains. 174
29023 199825 cd10500 MH2_SMAD_7 C-terminal Mad Homology 2 (MH2) domain in SMAD7. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD7, an inhibitory or antagonistic SMAD (I-SMAD), acts as a negative regulator of signaling mediated by the TGF-beta superfamily of ligands, by blocking TGF-beta type 1 and activin association with the receptor as well as access to SMAD2. SMAD7 enhances muscle differentiation, playing pivotal roles in embryonic development and adult homoeostasis. SMAD7 and SMAD6 act as critical mediators for effective TGF-beta I-mediated suppression of Interleukin-1/Toll-like receptor (IL-1R/TLR) signaling through simultaneous binding to Pellino-1, an adaptor protein of interleukin-1 receptor associated kinase 1(IRAK1), via their MH2 domains. Altered expression of SMAD7 is often associated with cancer, tissue fibrosis and inflammatory diseases. 171
29024 259849 cd10506 RNAP_IV_RPD1_N Largest subunit (NRPD1) of higher plant RNA polymerase IV, N-terminal domain. NRPD1 and NRPE1 are the largest subunits of plant DNA-dependent RNA polymerase IV and V that, together with second largest subunits (NRPD2 and NRPE2), form the active site region of the DNA entry and RNA exit channel. Higher plants have five multi-subunit nuclear RNA polymerases; RNAP I, RNAP II and RNAP III, which are essential for viability, plus the two isoforms of the non-essential polymerase RNAP IV and V, which specialize in small RNA-mediated gene silencing pathways. RNAP IV and/or V might be involved in RNA-directed DNA methylation of endogenous repetitive elements, silencing of transgenes, regulation of flowering-time genes, inducible regulation of adjacent gene pairs, and spreading of mobile silencing signals. The subunit compositions of RNAP IV and V reveal that they evolved from RNAP II. 744
29025 259792 cd10507 Zn-ribbon_RPA12 C-terminal zinc ribbon domain of RPA12 subunit of RNA polymerase I. The C-terminal zinc ribbon domain (C-ribbon) of subunit A12 (Zn-ribbon_RPA12) in RNA polymerase (Pol) I is involved in intrinsic transcript cleavage. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPA12 in Pol I, RPB9 in Pol II, RPC11 in Pol III and TFS in archaea are distantly related to each other and to the TFIIS elongation factor of Pol II. RPA12 has two zinc-binding domains separated by a flexible linker. 47
29026 259793 cd10508 Zn-ribbon_RPB9 C-terminal zinc ribbon domain of RPB9 subunit of RNA polymerase II. The C-terminal zinc ribbon domain (C-ribbon) of subunit B9 (Zn-ribbon_RPB9) in RNA polymerase (Pol) II is involved in intrinsic transcript cleavage. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPB9 has strong homology to RPA12 of Pol I and RPC11 of Pol III subunits but its intrinsic cleavage activity is weaker for Pol II. Zn-ribbon_RPB9 is homologous to Pol II elongation factor TFIIS domain III. The very weak cleavage activity of Pol II is stimulated by TFIIS. RPB9 has two zinc-binding domains separated by a flexible linker. 49
29027 259794 cd10509 Zn-ribbon_RPC11 C-terminal zinc ribbon domain of RPC11 subunit of RNA polymerase III. The C-terminal zinc ribbon domain (C-ribbon) of subunit C11 (Zn-ribbon_RPC11) in RNA polymerase (Pol) III is required for intrinsic transcript cleavage. RPC11 is also involved in Pol III termination. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPC11 has strong homology to RPB9 of Pol II and RPA12 of Pol I. Zn-ribbon_RPC11 is homologous to Pol II elongation factor TFIIS domain III. C11 has two zinc-binding domains separated by a flexible linker. 46
29028 259795 cd10511 Zn-ribbon_TFS C-terminal zinc ribbon domain of archaeal Transcription Factor S (TFS). TFS is an archaeal protein that stimulates the intrinsic cleavage activity of archaeal RNA polymerase. TFS C-terminal domain shows sequence similarity to the homologous C-terminal zinc ribbon domain of subunits A12.2, Rpb9, and C11 in eukaryotic RNA Polymerases (Pol) I, II, and III, respectively and domain III of TFIIS. TFS is not a subunit of archaeal RNA polymerase even though its domain arrangement is similar to A12.2, Rpb9, and C11. TFS is a transcription factor with a similar function to eukaryotic TFIIS. TFS has external cleavage induction activity and improves the fidelity of transcription. TFS has two zinc-binding domains. 47
29029 380915 cd10517 SET_SETDB1 SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. 288
29030 380916 cd10518 SET_SETD1-like SET domain (including post-SET domain) found in SET domain-containing proteins (SETD1A/SETD1B), histone-lysine N-methyltransferases (KMT2A/KMT2B/KMT2C/KMT2D) and similar proteins. This family includes SET domain-containing protein 1A (SETD1A), 1B (SETD1B), as well as histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B), 2C (KMT2C), 2D (KMT2D). These proteins are histone-lysine N-methyltransferases (EC 2.1.1.43) that specifically methylate 'Lys-4' of histone H3 (H3K4me). 150
29031 380917 cd10519 SET_EZH SET domain found in enhancer of zeste homolog 1 (EZH1), zeste homolog 2 (EZH2) and similar proteins. The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both, EZH1 and EZH2, can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. 117
29032 380918 cd10520 PR-SET_PRDM17 PR-SET domain found in PR domain zinc finger protein 17 (PRDM17) and similar proteins. PRDM17 (also termed zinc finger protein 408 (ZNF408)) may be involved in transcriptional regulation. 121
29033 380919 cd10521 SET_SMYD5 SET domain (including iSET domain and post-SET domain) found in SET and MYND domain-containing protein 5 (SMYD5) and similar proteins. SMYD5 (also termed protein NN8-4AG, or retinoic acid-induced protein 15) functions as histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions. It plays an important role in chromosome integrity by regulating heterochromatin and repressing endogenous repetitive DNA elements during differentiation. In zebrafish embryogenesis, it plays pivotal roles in both primitive and definitive hematopoiesis. 282
29034 380920 cd10522 SET_LegAS4-like SET domain found in Legionella pneumophila type IV secretion system effector LegAS4 and similar proteins. LegAS4 is a type IV secretion system effector of Legionella pneumophila. It contains a SET domain that is involved in the modification of Lys4 of histone H3 (H3K4) in the nucleolus of the host cell, thereby enhancing heterochromatic rDNA transcription. It also contains an ankyrin repeat domain of unknown function at its C-terminal region. 122
29035 380921 cd10523 SET_SETDB2 SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 2 (SETDB2) and similar proteins. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis. 266
29036 380922 cd10524 SET_Suv4-20-like SET domain (including post-SET domain) found in Drosophila melanogaster suppressor of variegation 4-20 (Suv4-20) and similar proteins. Suv4-20 (also termed Su(var)4-20) is a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-20' of histone H4. It acts as a dominant suppressor of position-effect variegation. The family also includes Suv4-20 homologs, lysine N-methyltransferase 5B (KMT5B) and lysine N-methyltransferase 5C (KMT5C). Both KMT5B (also termed lysine-specific methyltransferase 5B, or suppressor of variegation 4-20 homolog 1, or Su(var)4-20 homolog 1, or Suv4-20h1) and KMT5C (also termed lysine-specific methyltransferase 5C, or suppressor of variegation 4-20 homolog 2, or Su(var)4-20 homolog 2, or Suv4-20h2) are histone methyltransferases that specifically trimethylate 'Lys-20' of histone H4 (H4K20me3). They play central roles in the establishment of constitutive heterochromatin in pericentric heterochromatin regions. 141
29037 380923 cd10525 SET_SUV39H1 SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 1 (SUV39H1) and similar proteins. SUV39H1 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A (KMT1A), position-effect variegation 3-9 homolog (SUV39H), or Su(var)3-9 homolog 1) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions. 255
29038 380924 cd10526 SET_SMYD1 SET domain (including post-SET domain) found in SET and MYND domain-containing protein 1 (SMYD1) and similar proteins. SMYD1 (EC 2.1.1.43), also termed BOP, is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me), and seems able to perform mono-, di-, and trimethylation. SMYD1 plays a critical role in cardiomyocyte differentiation, cardiac morphogenesis and myofibril organization, as well as in the regulation of endothelial cells (ECs). It is expressed in vascular endothelial cells; it has been shown that knockdown of SMYD1 in endothelial cells impairs EC migration and tube formation. 210
29039 380925 cd10527 SET_LSMT SET domain found in Rubisco large subunit methyltransferase (LSMT) and similar proteins. Rubisco LSMT is a non-histone protein methyl transferase responsible for the trimethylation of lysine14 in the large subunit of Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase). The family also includes SET domain-containing proteins, SETD3, SETD4 and SETD6, which belong to methyltransferase class VII that represents classical non-histone SET domain methyltransferases. Members in this family contain a SET domain and a C-terminal RubisCO LSMT substrate-binding (Rubis-subs-bind) domain. 236
29040 380926 cd10528 SET_SETD8 SET domain found in SET domain-containing protein 8 (SETD8) and similar proteins. SETD8 (EC 2.1.1.43; also termed N-lysine methyltransferase KMT5A, H4-K20-HMTase KMT5A, lysine N-methyltransferase 5A, lysine-specific methylase 5A, PR/SET domain-containing protein 07, PR-Set7 or PR/SET07) is a nucleosomal histone-lysine N-methyltransferase that specifically monomethylates 'Lys-20' of histone H4 (H4K20me1). It plays a central role in the silencing of euchromatic genes. 141
29041 380927 cd10529 SET_SETD5-like SET domain found in SET domain-containing protein 5 (SETD5), inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins. SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. KMT2E (also termed inactive lysine N-methyltransferase 2E or myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. The family also includes Saccharomyces cerevisiae SET domain-containing proteins, SET3 and SET4, and Schizosaccharomyces pombe SET3. Most of these family members contain a post-SET domain which harbors a zinc-binding site. 127
29042 380928 cd10530 SET_SETD7 SET domain found in SET domain-containing protein 7 (SETD7) and similar proteins. SETD7 (EC 2.1.1.43; also termed histone H3-K4 methyltransferase SETD7, H3-K4-HMTase SETD7, lysine N-methyltransferase 7 (KMT7) or SET7/9) is a histone-lysine N-methyltransferase that specifically monomethylates 'Lys-4' of histone H3. It plays a central role in the transcriptional activation of genes such as collagenase or insulin. Set7/9 also methylates non-histone proteins, including estrogen receptor alpha (ERa), suggesting it has a role in diverse biological processes. ERa methylation by Set7/9 stabilizes ERa and activates its transcriptional activities, which are involved in the carcinogenesis of breast cancer. In a high-throughput screen, treatment of human breast cancer cells (MCF7 cells) with cyproheptadine, a Set7/9 inhibitor, decreased the expression and transcriptional activity of ERa, thereby inhibiting estrogen-dependent cell growth. 130
29043 380929 cd10531 SET_SETD2-like SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2), ASH1-like protein (ASH1L) and similar proteins. This family includes SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2) and ASH1-like protein (ASH1L), which function as histone-lysine N-methyltransferases. SETD2 specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. NSD2 shows histone H3 'Lys-27' (H3K27me) methyltransferase activity. ASH1L specifically methylates 'Lys-36' of histone H3 (H3K36me). The family also includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins. 136
29044 380930 cd10532 SET_SUV39H2 SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 2 (SUV39H2) and similar proteins. SUV39H2 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B (KMT1B), or Su(var)3-9 homolog 2) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions. 243
29045 380931 cd10533 SET_EHMT2 SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 2 (EHMT2) and similar proteins. EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C (KMT1C), or protein G9a) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. 239
29046 380932 cd10534 PR-SET_PRDM-like PR-SET domain found in PRDM (PRDI-BF1 and RIZ homology domain) family of proteins. PRDM family of proteins is defined based on the conserved N-terminal PR domain, which is closely related to the Su(var)3-9, enhancer of zeste, and trithorax (SET) domains of histone methyltransferases, and is specifically called PR-SET domain. The family consists of 17 members in primates. PRDMs play diverse roles in cell-cycle regulation, differentiation, and meiotic recombination. The family also contains zinc finger protein ZFPM1 and ZFPM2. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis. 83
29047 380933 cd10535 SET_EHMT1 SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 1 (EHMT1) and similar proteins. EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, or lysine N-methyltransferase 1D (KMT1D)) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. 231
29048 380934 cd10536 SET_SMYD4 SET domain (including iSET domain and post-SET domain) found in SET and MYND domain-containing protein 4 (SMYD4) and similar proteins. SMYD4 functions as a potential tumor suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha. In zebrafish, SMYD4 is ubiquitously expressed in early embryos and becomes enriched in the developing heart; mutants show a strong defect in cardiomyocyte proliferation, which leads to a severe cardiac malformation. 218
29049 380935 cd10537 SET_SETD9 SET domain found in SET domain-containing protein 9 (SETD9) and similar proteins. SETD9 is an uncharacterized protein that belongs to the class V-like SAM-binding methyltransferase superfamily. 150
29050 380936 cd10538 SET_SETDB-like SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs, SUV39H1 and SUV39H2, euchromatic histone-lysine N-methyltransferase EHMT1 and EHMT2, and similar proteins. The family includes SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs, SUV39H1 and SUV39H2, euchromatic histone-lysine N-methyltransferase EHMT1 and EHMT2. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis. SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2), both act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a), both act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. This family also includes the pre-SET domain, which is found in a number of histone methyltransferases (HMTase), N-terminal to the SET domain. Pre-SET domain is a zinc binding motif which contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilizing SET domains. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site. 217
29051 380937 cd10539 SET_ATXR5_6-like SET domain found in fungal protein lysine methyltransferase SET5 and similar proteins. The family includes Arabidopsis thaliana ATXR5 and ATXR6. Both ATXR5 (also termed protein SET DOMAIN GROUP 15, or TRX-related protein 5) and ATXR6 (also termed protein SET DOMAIN GROUP 34, or TRX-related protein 6) function as histone methyltransferases that specifically monomethylate 'Lys-27' of histone H3 (H3K27me1). They are required for chromatin structure and gene silencing. 138
29052 380938 cd10540 SET_SpSet7-like SET domain found in Schizosaccharomyces pombe Set7 and similar proteins. Schizosaccharomyces pombe Set7 is a novel histone-lysine N-methyltransferase. The family also includes a viral histone H3 lysine 27 methyltransferase from Paramecium bursaria Chlorella virus 1 (PBCV-1). 112
29053 380939 cd10541 SET_SETDB SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1), SET domain bifurcated 2 (SETDB2), and similar proteins. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis. 236
29054 380940 cd10542 SET_SUV39H SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homologs, SUV39H1, SUV39H2 and similar proteins. This family includes SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2), both act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. Also included are Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (SUV39H homolog) and Neurospora crassa DIM-5, both of which also methylate 'Lys-9' of histone H3. 245
29055 380941 cd10543 SET_EHMT SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase EHMT1, EHMT2 and similar proteins. This family includes EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a), both act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. 231
29056 380942 cd10544 SET_SETMAR SET domain (including pre-SET and post-SET domains) found in SET domain and mariner transposase fusion protein (SETMAR) and similar proteins. SETMAR (also termed metnase) is a DNA-binding protein that is indirectly recruited to sites of DNA damage through protein-protein interactions. It has a sequence-specific DNA-binding activity recognizing the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element and displays a DNA nicking and end joining activity. SETMAR also acts as a histone-lysine N-methyltransferase that methylates 'Lys-4' and 'Lys-36' of histone H3. It specifically mediates dimethylation of H3 'Lys-36' at sites of DNA double-strand break and may recruit proteins required for efficient DSB repair through non-homologous end-joining. 254
29057 380943 cd10545 SET_AtSUVH-like SET domain found in Arabidopsis thaliana histone H3-K9 methyltransferases (SUVHs) and similar proteins. Arabidopsis thaliana SUVH protein (also termed suppressor of variegation 3-9 homolog protein) is a histone-lysine N-methyltransferase that methylates 'Lys-9' of histone H3. H3 'Lys-9' methylation represents a specific tag for epigenetic transcriptional repression. Some family members contain a post-SET domain which binds a Zn2+ ion. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site. 232
29058 240598 cd10546 VKOR Vitamin K epoxide reductase (VKOR) family. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. This family includes enzymes that are present in vertebrates, Drosophila, plants, bacteria, and archaea. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some plant and bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. Warfarin, an oral anticoagulant widely used in medicine as well as in rodenticides, inhibits the activity of VKOR, resulting in decreased levels of reduced vitamin K, which is required for the function of several clotting factors. However, the anticoagulation effect of warfarin is significantly associated with polymorphisms of certain genes, including VKORC1. Interestingly, in rodents, an adaptive trait appears to have evolved convergently by selection on new or standing genetic polymorphisms in VKORC1 as well as by adaptive introgressive hybridization between species, likely brought about by human-mediated dispersal. 126
29059 380415 cd10547 cupin_BacB_C Bacillus subtilis bacilysin and related proteins, C-terminal cupin domain. This model represents the C-terminal domain of bacilysin (BacB, also known as AerE in Microcystis aeruginosa), a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. Bacilysin is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF aeruginosin biosynthesis gene cluster in Microcystis aeruginosa. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 92
29060 380416 cd10548 cupin_CDO cysteine dioxygenase, cupin domain. This family contains cysteine dioxygenase (CDO; EC 1.13.11.20), which catalyzes the conversion of cysteine to cysteine sulfinic acid, the first step in the biosynthesis of essential oxidized cysteine metabolites such as sulfate, hypotaurine, and taurine. CDO also plays an important role in the regulation of intracellular cysteine levels in mammals; CDO expression is altered in cancer cells, and abnormal or deficient CDO activity has been linked to Parkinson's disease, Alzheimer's disease, and rheumatoid arthritis. CDO is an iron-dependent thiol dioxygenase that uses molecular oxygen to oxidize the sulfhydryl group of cysteine to generate cysteine sulfinic acid. The CDO active site contains an amino acid-derived cofactor. These enzymes are found in prokaryotes as well as eukaryotes and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 100
29061 319871 cd10549 MtMvhB_like Uncharacterized polyferredoxin-like protein. This family contains uncharacterized polyferredoxin protein similar to Methanobacterium thermoautotrophicum MvhB. The mvhB is a gene of the methylviologen-reducing hydrogenase operon. It is predicted to contain 12 [4Fe-4S] clusters, and was therefore suggested to be a polyferredoxin. As a subfamily of the beta subunit of the DMSO Reductase (DMSOR) family, it is predicted to function as electron carrier in the reducing reaction. 128
29062 319872 cd10550 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 130
29063 319873 cd10551 PsrB polysulfide reductase beta (PsrB) subunit. This family includes the beta subunit of bacterial polysulfide reductase (PsrABC), an integral membrane-bound enzyme responsible for quinone-coupled reduction of polysulfides, a process important in extreme environments such as deep-sea vents and hot springs. Polysulfide reductase contains three subunits: a catalytic subunit PsrA, an electron transfer PsrB subunit and the hydrophobic transmembrane PsrC subunit. PsrB belongs to the DMSO reductase superfamily that contains [4Fe-4S] clusters which transfer the electrons from the A subunit to the hydrophobic integral membrane C subunit via the B subunit. In Shewanella oneidensis, which has highly diverse anaerobic respiratory pathways, PsrABC is responsible for H2S generation as well as its regulation via respiration of sulfur species. PsrB transfers electrons from PsrC (serving as quinol oxidase) to the catalytic subunit PsrA for reduction of corresponding electron acceptors. It has been shown that T. thermophilus polysulfide reductase could be a key energy-conserving enzyme of the respiratory chain, using polysulfide as the terminal electron acceptor and pumping protons across the membrane. 185
29064 319874 cd10552 TH_beta_N N-terminal FeS domain of pyrogallol-phloroglucinol transhydroxylase (TH), beta subunit. This family includes the beta subunit of pyrogallol-phloroglucinol transhydroxylase (TH), a cytoplasmic molybdenum (Mo) enzyme from anaerobic microorganisms like Pelobacter acidigallici and Desulfitobacterium hafniense which catalyzes the conversion of pyrogallol to phloroglucinol, an important building block of plant polymers. TH belongs to the DMSO reductase (DMSOR) family; it is a heterodimer consisting of a large alpha catalytic subunit and a small beta FeS subunit. The beta subunit has two domains with the N-terminal domain containing three [4Fe-4S] centers and a seven-stranded, mainly antiparallel beta-barrel domain. In the anaerobic bacterium Pelobacter acidigallici, gallic acid, pyrogallol, phloroglucinol, or phloroglucinol carboxylic acid are fermented to three molecules of acetate (plus CO2), and TH is the key enzyme in the fermentation pathway, which converts pyrogallol to phloroglucinol in the absence of O2. 186
29065 319875 cd10553 PhsB_like uncharacterized beta subfamily of DMSO Reductase similar to Desulfonauticus sp PhsB. This family includes beta FeS subunits of anaerobic DMSO reductase (DMSOR) superfamily that have yet to be characterized. DMSOR consists of a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and the tungsten-containing formate dehydrogenase (FDH-T). Examples of heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 146
29066 319876 cd10554 HycB_like HycB, HydN and similar proteins. This family includes HycB, the FeS subunit of a membrane-associated formate hydrogenlyase system (FHL-1) in Escherichia coli that breaks down formate, produced during anaerobic fermentation, to H2 and CO2. FHL-1 consists of formate dehydrogenase H (FDH-H) and the hydrogenase 3 complex (Hyd-3). HycB is thought to code for the [4Fe-4S] ferredoxin subunit of hydrogenase 3, which functions as an intermediate electron carrier protein between hydrogenase 3 and formate dehydrogenase. HydN codes for the [4Fe-4S] ferredoxin subunit of FDH-H; a hydN in-frame deletion mutation causes only weak reduction in hydrogenase activity, but loss of more than 60% of FDH-H activity. This pathway is only active at low pH and high formate concentrations, and is thought to provide a detoxification/de-acidification system countering the buildup of formate during fermentation. 149
29067 319877 cd10555 EBDH_beta beta subunit of ethylbenzene-dehydrogenase (EBDH). This subfamily includes ethylbenzene dehydrogenase (EBDH, EC 1.17.99.2), a member of the DMSO reductase family. EBDH oxidizes the hydrocarbon ethylbenzene to (S)-1-phenylethanol. It is a heterotrimer, with the alpha subunit containing the catalytic center with a molybdenum held by two molybdopterin-guanine dinucleotides, the beta subunit containing four iron-sulfur clusters (the electron transfer subunit) and the gamma subunit containing a methionine and a lysine as axial heme ligands. During catalysis, electrons produced by substrate oxidation are transferred to a heme in the gamma subunit and then presumably to a separate cytochrome involved in nitrate respiration. 316
29068 319878 cd10556 SER_beta Beta subunit of selenate reductase. This subfamily includes the beta FeS subunit of selenate reductase (SER), a member of the DMSO reductase family. SER catalyzes the reduction of selenate to selenite in bacterial species that can obtain energy by respiring anaerobically with selenate as the terminal electron acceptor. The enzyme comprises three subunits, SerABC, forming a heterotrimer with the catalytic component (alpha-subunit), the iron-sulfur protein (beta-subunit), and the monomeric b-type heme-containing gamma subunit. The beta subunit coordinates one [3Fe-4S] cluster and three [4Fe-4S] clusters and functions as an electron carrier. 287
29069 319879 cd10557 NarH_beta-like beta subunit of nitrate reductase A (NarH) and similar proteins. This subfamily includes nitrate reductase A, a member of the DMSO reductase family. The respiratory nitrate reductase complex (NarGHI) from E. coli is a heterotrimer, with the catalytic subunit (NarG) with a molybdo-bis (molybdopterin guanine dinucleotide) cofactor and an [Fe-S] cluster, the electron transfer subunit (NarH) with four [Fe-S] clusters, and the integral membrane subunit (NarI) with two b-type hemes. Nitrate reductase A often forms a respiratory chain with the formate dehydrogenase via the lipid soluble quinol pool. Electron transfer from formate to nitrate is coupled to proton translocation across the cytoplasmic membrane generating proton motive force by a redox loop mechanism. Demethylmenaquinol (DMKH2) has been shown to be a good substrate for NarGHI in nitrate respiration in E. coli. 363
29070 319880 cd10558 FDH-N The beta FeS subunit of formate dehydrogenase-N (FDH-N). This subfamily contains beta FeS subunit of formate dehydrogenase-N (FDH-N), a member of the DMSO reductase family. FDH-N is involved in the major anaerobic respiratory pathway in the presence of nitrate, catalyzing the oxidation of formate to carbon dioxide at the expense of nitrate reduction to nitrite. Thus, FDH-N is a major component of nitrate respiration of Escherichia coli. This integral membrane enzyme forms a heterotrimer; the alpha-subunit (FDH-G) is the catalytic site of formate oxidation and membrane-associated, incorporating a selenocysteine (SeCys) residue and a [4Fe/4S] cluster in addition to two bis-MGD cofactors, the beta subunit (FDH-H) contains four [4Fe/4S] clusters which transfer the electrons from the alpha subunit to the gamma-subunit (FDH-I), a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. 208
29071 319881 cd10559 W-FDH tungsten-containing formate dehydrogenase, small subunit. This subfamily contains beta subunit of Tungsten-containing formate dehydrogenase (W-FDH), a member of the DMSO reductase family. W-FDH contains a tungsten instead of molybdenum at the catalytic center. This enzyme seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It is a heterodimer of a large and a small subunit; the large subunit harbors the W site and one [4Fe-4S] center and the small subunit, containing three [4Fe-4S] clusters, functions to transfer electrons. 200
29072 319882 cd10560 FDH-O_like beta subunit of formate dehydrogenase O (FDH-O) and similar proteins. This subfamily includes beta subunit of formate dehydrogenase family O (FDH-O), which is highly homologous to formate dehydrogenase N (FDH-N), a member of the DMSO reductase family. In E. coli three formate dehydrogenases are synthesized that are capable of oxidizing formate; Fdh-H, couples formate disproportionation to hydrogen and CO2, and is part of the cytoplasmically oriented formate hydrogenlyase complex, while FDH-N and FDH-O indicate their respective induction after growth with nitrate and oxygen. Little is known about FDH-O, although it shows formate oxidase activity during aerobic growth and is also synthesized during nitrate respiration, similar to FDH-N. 225
29073 319883 cd10561 HybA_like the FeS subunit of hydrogenase 2. This subfamily includes the beta-subunit of hydrogenase 2 (Hyd-2), an enzyme that catalyzes the reversible oxidation of H2 to protons and electrons. Hyd-2 is membrane-associated and forms an unusual heterotetrameric [NiFe]-hydrogenase in that it lacks the typical cytochrome b membrane anchor subunit that transfers electrons to the quinone pool. The electron transfer subunit of Hyd-2 (HybA) which is predicted to contain four iron-sulfur clusters, is essential for electron transfer from Hyd-2 to menaquinone/demethylmenaquinone (MQ/DMQ) to couple hydrogen oxidation to fumarate reduction. 196
29074 319884 cd10562 FDH_b_like uncharacterized subfamily of beta subunit of formate dehydrogenase. This subfamily includes the beta-subunit of formate dehydrogenases that are as yet uncharacterized. Members of the DMSO reductase family include formate dehydrogenase N and O (FDH-N, FDH-O) and tungsten-containing formate dehydrogenase (W-FDH) and other similar proteins. FDH-N, a major component of nitrate respiration of Escherichia coli, is involved in the major anaerobic respiratory pathway in the presence of nitrate, catalyzing the oxidation of formate to carbon dioxide at the expense of nitrate reduction to nitrite. It forms a heterotrimer; the alpha-subunit (FDH-G) is the catalytic site of formate oxidation and membrane-associated, incorporating a selenocysteine (SeCys) residue and a [4Fe/4S] cluster in addition to two bis-MGD cofactors, the beta subunit (FDH-H) contains four [4Fe/4S] clusters which transfer the electrons from the alpha subunit to the gamma-subunit (FDH-I), a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. W-FDH contains a tungsten instead of molybdenum at the catalytic center. This enzyme seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It is a heterodimer of a large and a small subunit; the large subunit harbors the W site and one [4Fe-4S] center and the small subunit, containing three [4Fe-4S] clusters, functions to transfer electrons. 161
29075 319885 cd10563 CooF_like CooF, iron-sulfur subunit of carbon monoxide dehydrogenase. This family includes CooF, the iron-sulfur subunit of carbon monoxide dehydrogenase (CODH), found in anaerobic bacteria and archaea. Carbon monoxide dehydrogenase is a key enzyme for carbon monoxide (CO) metabolism, where CooF is the proposed mediator of electron transfer between CODH and the CO-induced hydrogenase, catalyzing the reaction that uses CO as a single carbon and energy source, and producing only H2 and CO2. The iron-sulfur subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons in the protein complex during reaction. 140
29076 319886 cd10564 NapF_like NapF, iron-sulfur subunit of periplasmic nitrate reductase. This family contains NapF protein, the iron-sulfur subunit of periplasmic nitrate reductase. The periplasmic nitrate reductase NapABC of Escherichia coli likely functions during anaerobic growth in low-nitrate environments; napF operon expression is activated by cyclic AMP receptor protein (Crp). NapF is a subfamily of the beta subunit of DMSO reductase (DMSOR) family. DMSOR family members have a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit. The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 139
29077 349488 cd10566 MDM2_like p53-binding domain found in E3 ubiquitin-protein ligase MDM2, MDM4, and similar proteins. MDM2 (also termed HDM2) and MDM4 (also termed MDMX or HDMX) are the primary negative regulators of p53 tumor suppressor. They have non-redundant roles in the regulation of p53. MDM2 mainly functions to control p53 stability, while MDM4 controls p53 transcriptional activity. Both MDM2 and MDM4 contain an N-terminal p53-binding domain, a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region, and a C-terminal RING domain. Mdm2 can form homo-oligomers through its RING domain and display E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Despite its RING domain and structural similarity with MDM2, MDM4 does not homo-oligomerize and lacks ubiquitin-ligase function, but inhibits the transcriptional activity of p53. In addition, both their RING domains are responsible for the hetero-oligomerization, which is crucial for the suppression of p53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. Moreover, MDM2 and MDM4 can be phosphorylated and destabilized in response to DNA damage stress. In response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11 and L23. However, MDM4 is not bound to ribosomal proteins, suggesting its different response to regulation by small basic proteins such as ribosomal proteins and ARF. 75
29078 349489 cd10567 SWIB-MDM2_like SWIB/MDM2 domain found in SWIB/MDM2 homologous proteins. This family includes Schizosaccharomyces pombe upstream activation factor subunit spp27, Saccharomyces cerevisiae upstream activation factor subunit UAF30, Chlamydiae DNA topoisomerase/SWIB domain fusion protein, Arabidopsis thaliana zinc finger CCCH domain-containing proteins, AtC3H19 and AtC3H44, and similar proteins. S. pombe spp27, also termed upstream activation factor 27 KDa subunit (p27), or upstream activation factor 30 KDa subunit (p30), or upstream activation factor subunit uaf30, is a component of the UAF (upstream activation factor) complex which interacts with the upstream element of the RNA polymerase I promoter and forms a stable preinitiation complex. S. cerevisiae UAF30, also termed upstream activation factor 30 KDa subunit (p30), is a non-essential component of the UAF. It seems to play a role in silencing transcription by RNA polymerase II. The SWIB domain found in Chlamydiae DNA topoisomerase may play a role in chromatin condensation-decondensation, which is characteristic of the chlamydial developmental cycle and not found in any other types of bacteria. AtC3H19, also termed protein needed for RDR2-independent DNA methylation (NERD), is a plant-specific GW repeat- and PHD finger-containing protein that plays a central role in integrating RNA silencing and chromatin signals in 21 nt small-interfering RNA (siRNA)-dependent DNA methylation on the cytosine pathway, leading to transcriptional gene silencing of specific sequences. This family also includes many uncharacterized proteins containing two copies of SWIB/MDM2 domain. 71
29079 349490 cd10568 SWIB_like SWIB domain found in the 60 kDa subunit of the ATP-dependent SWI/SNF chromatin-remodeling complexes and similar proteins. SWIB domain is a conserved region found within proteins in the SWI/SNF family of complexes. SWI/SNF complex proteins display helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors), among which the BAF60 subunit serves as a key link between the core complexes and specific transcriptional factors. The BAF60 subunit has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. The family also includes Saccharomyces cerevisiae transcription regulatory protein SNF12 and remodel the structure of chromatin complex subunit 6 (RSC6), and Schizosaccharomyces pombe SWI/SNF and RSC complexes subunit SSR3. SNF12, also termed 73-kDa subunit of the SWI/SNF transcriptional regulatory complex, or SWI/SNF complex component SWP73, is involved in transcriptional activation and repression of select genes by chromatin remodeling (alteration of DNA-nucleosome topology). RSC6 and SSR3 are components of the RSC, which is involved in transcription regulation and nucleosome positioning. RSC6 is essential for mitotic growth and suppresses formamide sensitivity of the RSC8 mutants. 69
29080 269973 cd10569 FERM_C_Talin FERM domain C-lobe/F3 of Talin. Talin (also called filopodin) plays an important role in initiating actin filament growth in motile cell protrusions. It is responsible for linking the cytoplasmic domains of integrins to the actin-based cytoskeleton, and is involved in vinculin, integrin and actin interactions. At the leading edge of motile cells, talin colocalises with the hyaluronan receptor layilin in transient adhesions, some of which become more stable focal adhesions (FA). During this maturation process, layilin is replaced with integrins, where localized production of PI(4,5)P(2) by type 1 phosphatidyl inositol phosphate kinase type 1gamma (PIPK1gamma) is thought to play a role in FA assembly. Talins are composed of an N-terminal region FERM domain which is made up of 3 subdomains (N, alpha-, and C-lobe; or A-lobe, B-lobe, and C-lobe; or F1, F2, and F3) connected by short linkers, a talin rod which binds vinculin, and a conserved C-terminal region with actin- and integrin-binding sites. There are 2 additional actin-binding domains, one in the talin rod and the other in the FERM domain. Both the F2 and F3 FERM subdomains contribute to F-actin binding. Subdomain F3 of the FERM domain contains overlapping binding sites for integrin cytoplasmic domains and for the type 1 gamma isoform of PIP-kinase (phosphatidylinositol 4-phosphate 5-kinase). The FERM domain has a cloverleaf tripartite structure. F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 92
29081 275393 cd10570 PH-GRAM Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 94
29082 269975 cd10571 PH_beta_spectrin Beta-spectrin pleckstrin homology (PH) domain. Beta spectrin binds actin and functions as a major component of the cytoskeleton underlying cellular membranes. Beta spectrin consists of multiple spectrin repeats followed by a PH domain, which binds to inositol-1,4,5-trisphosphate. The PH domain of beta-spectrin is thought to play a role in the association of spectrin with the plasma membrane of cells. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
29083 269976 cd10572 PH_RhoGEF3_XPLN Rho guanine nucleotide exchange factor 3 Pleckstrin homology (PH) domain. RhoGEF3/XPLN, a Rho family GEF, preferentially stimulates guanine nucleotide exchange on RhoA and RhoB, but not RhoC, RhoG, Rac1, or Cdc42 in vitro. It also possesses transforming activity. RhoGEF3/XPLN contains a tandem Dbl homology and PH domain, but lacks homology with other known functional domains or motifs. It is expressed in the brain, skeletal muscle, heart, kidney, platelets, and macrophage and neuronal cell lines. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 133
29084 269977 cd10573 PH_DAPP1 Dual Adaptor for Phosphotyrosine and 3-Phosphoinositides Pleckstrin homology (PH) domain. DAPP1 (also known as PHISH/3' phosphoinositide-interacting SH2 domain-containing protein or Bam32) plays a role in B-cell activation and has potential roles in T-cell and mast cell function. DAPP1 promotes B cell receptor (BCR) induced activation of Rho GTPases Rac1 and Cdc42, which feed into mitogen-activated protein kinases (MAPK) activation pathways and affect cytoskeletal rearrangement. DAPP1 can also regulate BCR-induced activation of extracellular signal-regulated kinase (ERK), and c-jun NH2-terminal kinase (JNK). DAPP1 contains an N-terminal SH2 domain and a C-terminal pleckstrin homology (PH) domain with a single tyrosine phosphorylation site located centrally. DAPP1 binds strongly to both PtdIns(3,4,5)P3 and PtdIns(3,4)P2. The PH domain is essential for plasma membrane recruitment of PI3K upon cell activation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 96
29085 269978 cd10574 EVH1_SPRED-like Sprouty-related EVH1 domain-containing-like proteins EVH1 domain. The Spred family has the following domains: an N-terminal EVH1 domain, a unique KBD (c-Kit kinase binding) domain that is phosphorylated by the stem cell factor receptor c-Kit, and a C-terminal cysteine-rich SPR (Sprouty-related) domain which is involved in membrane localization. There are 3 Spred proteins: Spred1 which interacts with both Ras and Raf through its SPR domain; Spred2 which is the most abundant isoform; and Spred3 which has a non-functional KBD and maintains the inhibitory action on Raf. Legius syndrome is caused by heterozygous mutations in Spred1. Both EVH1 and SPR domains are involved in the inhibition of the MAP kinase pathway by Spred proteins. The specific function of the Spred2 EVH1 domain is unknown and there are no known interacting proteins to date. It is thought that its EVH1 domain will have a fourth distinct peptide binding mechanism within the EVH1 family. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 113
29086 276901 cd10575 TNFRSF6B Tumor necrosis factor receptor superfamily member 6B (TNFRSF6B), also known as decoy receptor 3 (DcR3). The subfamily TNFRSF6B is also known as decoy receptor 3 (DcR3), M68, or TR6. This protein is a soluble receptor without death domain and cytoplasmic domain, and secreted by cells. It acts as a decoy receptor that competes with death receptors for ligand binding. It is a pleiotropic immunomodulator and biomarker for inflammatory diseases, autoimmune diseases, and cancer. Over-expression of this gene has been noted in several cancers, including pancreatic carcinoma, and gastrointestinal tract tumors. It can neutralize the biological effects of three tumor necrosis factor superfamily (TNFSF) members: TNFSF6 (Fas ligand/FasL/CD95L) and TNFSF14 (LIGHT) which are both involved in apoptosis and inflammation, and TNFSF15 (TNF-like molecule 1A/TL1A), which is a T cell co-stimulator and involved in gut inflammation. DcR3 is a novel inflammatory marker; higher DcR3 levels strongly correlate with inflammation and independently predict cardiovascular and all-cause mortality in chronic kidney disease (CKD) patients on hemodialysis. Increased synovial inflammatory cells infiltration in rheumatoid arthritis and ankylosing spondylitis is also associated with the elevated DcR3 expression. In cartilaginous fish, mRNA expression of DcR3 in the thymus and leydig, which are the representative lymphoid tissues of elasmobranchs, suggests that DcR3 may act as a modulator in the immune system. Interestingly, in banded dogfish (Triakis scyllia), DcR3 mRNA is strongly expressed in the gill, compared with human expression in the normal lung; both are respiratory organs, suggesting potential relevance of DcR3 to respiratory function. 163
29087 276902 cd10576 TNFRSF1A Tumor necrosis factor receptor superfamily member 1A (TNFRSF1A), also known as TNFR1. TNFRSF1A (also known as type I TNFR, TNFR1, DR1, TNFRSF1A, CD120a, p55) binds TNF-alpha, through the death domain (DD), and activates NF-kappaB, mediates apoptosis and activates signaling pathways controlling inflammatory, immune, and stress responses. It mediates signal transduction by interacting with antiapoptotic protein BCL2-associated athanogene 4 (BAG4/SODD) and adaptor proteins TRAF2 and TRADD that play regulatory roles. The human genetic disorder called tumor necrosis factor associated periodic syndrome (TRAPS), or periodic fever syndrome, is associated with germline mutations of the extracellular domains of this receptor, possibly due to impaired receptor clearance. TNFRSF1A polymorphisms rs1800693 and rs4149584 are associated with elevated risk of multiple sclerosis. Serum levels of TNFRSF1A are elevated in schizophrenia and bipolar disorder, and high levels are also associated with cognitive impairment and dementia. Patients with idiopathic recurrent acute pericarditis (IRAP), presumed to be an autoimmune process, have also been shown to carry rare mutations (R104Q and D12E) in the TNFRSF1A gene. 130
29088 276903 cd10577 TNFRSF1B Tumor necrosis factor receptor superfamily member 1B (TNFRSF1B), also known as TNFR2. TNFRSF1B (also known as TNFR2, type 2 TNFR, TNFBR, TNFR80, TNF-R75, TNF-R-II, p75, CD120b) binds TNF-alpha, but lacks the death domain (DD) that is associated with the cytoplasmic domain of TNFRSF1A (TNFR1). It is inducible and expressed exclusively by oligodendrocytes, astrocytes, T cells, thymocytes, myocytes, endothelial cells, and in human mesenchymal stem cells. TNFRSF1B protects oligodendrocyte progenitor cells (OLGs) against oxidative stress, and induces the up-regulation of cell survival genes. While pro-inflammatory and pathogen-clearing activities of TNF are mediated mainly through activation of TNFRSF1A, a strong activator of NF-kappaB, TNFRSF1B is more responsible for suppression of inflammation. Although the affinities of both receptors for soluble TNF are similar, TNFRSF1B is sometimes more abundantly expressed and thought to associate with TNF, thereby increasing its concentration near TNFRSF1A receptors, and making TNF available to activate TNFRSF1A (a ligand-passing mechanism). 163
29089 276904 cd10578 TNFRSF3 Tumor necrosis factor receptor superfamily member 3 (TNFRSF3), also known as lymphotoxin beta receptor (LTBR). TNFRSF3 (also known as lymphotoxin beta receptor, LTbetaR, CD18, TNFCR, TNFR3, D12S370, TNFR-RP, TNFR2-RP, LT-BETA-R, TNF-R-III) plays a role in signaling during development of lymphoid and other organs, lipid metabolism, immune response, and programmed cell death. Its ligands include lymphotoxin (LT) alpha/beta membrane form (heterotrimer) and tumor necrosis factor ligand superfamily member 14 (also known as LIGHT). TNFRSF3 agonism by these ligands initiates canonical, as well as non-canonical nuclear factor-kappaB (NF-kappaB) signaling, and preferentially results in the translocation of p52-RELB complexes into the nucleus. While these ligands are often expressed by T and B cells, TNFRSF3 is conspicuously absent from T and B lymphocytes and NK cells, suggesting that signaling may be unidirectional for TNFRSF3. Activity of this receptor has also been linked to carcinogenesis; it helps trigger apoptosis and can also lead to release of interleukin 8 (IL8). Alternatively spliced transcript variants encoding multiple isoforms have been observed. 158
29090 276905 cd10579 TNFRSF6 Tumor necrosis factor receptor superfamily member 6 (TNFRSF6), also known as fas cell surface death receptor (Fas). TNFRSF6 (also known as fas cell surface death receptor (FasR) or Fas, APT1, CD95, FAS1, APO-1, FASTM, ALPS1A) contains a death domain and plays a central role in the physiological regulation of programmed cell death. It has been implicated in the pathogenesis of various malignancies and diseases of the immune system. The receptor interacts with the Fas ligand (FasL), allowing the formation of a death-inducing signaling complex that includes Fas-associated death domain protein (FADD), caspase 8, and caspase 10; autoproteolytic processing of the caspases in the complex triggers a downstream caspase cascade, leading to apoptosis. This receptor has also been shown to activate NF-kappaB, MAPK3/ERK1, and MAPK8/JNK, and is involved in transducing the proliferating signals in normal diploid fibroblast and T cells. Of the several alternatively spliced transcript variants, some are candidates for nonsense-mediated mRNA decay (NMD). Isoforms lacking the transmembrane domain may negatively regulate the apoptosis mediated by the full length isoform. 129
29091 276906 cd10580 TNFRSF10 Tumor necrosis factor receptor superfamily member 10 (TNFRSF10), includes TNFRSF10A (DR4), TNFRSF10B (DR5), TNFRSF10C (DcR1) and TNFRSF10D (DcR2). TNFRSF10 family contains TNFRSF10A (also known as DR4, Apo2, TRAIL-R1, CD261), TNFRSF10B (also known as DR5, KILLER, TRICK2A, TRAIL-R2, TRICKB, CD262), TNFRSF10C (also known as DcR1, TRAIL-R3, LIT, TRID, CD263), and TNFRSF10D (also known as DcR2, TRUNDD, TRAIL-R4, CD264). Tumor necrosis factor-related apoptosis inducing ligand (TNFSF10/TRAIL) binds to all 4 receptors. DR4 (TRAIL-R1) and DR5 (TRAIL-R2) are membrane-bound and contain a death domain in their intracellular portion, which is able to transmit an apoptotic signal, thus often called death receptors. In contrast, DcR1 (TRAIL-R3), which lacks the complete intracellular portion and DcR2 (TRAIL-R4), which has a truncated cytoplasmic death domain, do not transmit an apoptotic signal, thus known as decoy receptors. Apoptosis mediated by DR4 and DR5 requires Fas (TNFRSF6)-associated via death domain (FADD), a death domain containing adaptor protein. Two transcript variants encoding different isoforms and one non-coding transcript have been found for TNFRSF10B/DR5. DcR1 appears to function as an antagonistic receptor that protects cells from TRAIL-induced apoptosis; it has been found to be a p53-regulated DNA damage-inducible gene. The expression of this gene is detected in many normal tissues but not in most cancer cell lines, which may explain the specific sensitivity of cancer cells to the apoptosis-inducing activity of TRAIL. DcR2 has been shown to play an inhibitory role in TRAIL-induced cell apoptosis. The membrane expression of all of these receptors (DR4, DR5, DcR1, and DcR2) is greater in normal endometrium (NE) than in endometrioid adenocarcinoma (EAC). In EAC patients, membrane expression of these receptors is not an independent predictor of survival. DcR1 and DcR2 expression is critical in cell growth and apoptosis in cutaneous or uveal melanoma; DcR1 and DcR2 are frequently methylated in both, leading to loss of gene expression and melanomagenesis. On the other hand, DR4 and DR5 methylation is rare in cutaneous melanoma and frequent in uveal melanoma; their expression is wholly independent of the promoter methylation status. DcR1 and DcR2 genes are also reported to be hyper-methylated in prostate cancer. The TRAIL ligand, a potent and specific inducer of apoptosis in cancer cells, has been explored as a therapeutic drug; experimental data has shown that DR4 specific TRAIL variants are more efficacious than wild-type TRAIL in pancreatic cancer. 103
29092 276907 cd10581 TNFRSF11B Tumor necrosis factor receptor superfamily member 11B (TNFRSF11B), also known as Osteoprotegerin (OPG). TNFRSF11B (also known as Osteoprotegerin, OPG, TR1, OCIF) is a secreted glycoprotein that regulates bone resorption. It binds to two ligands, RANKL (receptor activator of nuclear factor kappaB ligand, also known as osteoprotegerin ligand, OPGL, TRANCE, TNF-related activation induced cytokine), a critical cytokine for osteoclast differentiation, and TRAIL (TNF-related apoptosis-inducing ligand), involved in immune surveillance. Therefore, acting as a decoy receptor for RANKL and TRAIL, OPG inhibits the regulatory effects of nuclear factor-kappaB on inflammation, skeletal, and vascular systems, and prevents TRAIL-induced apoptosis. Studies of the mouse counterparts suggest that this protein and its ligand also play a role in lymph-node organogenesis and vascular calcification. Circulating OPG levels have emerged as independent biomarkers of cardiovascular disease (CVD) in patients with acute or chronic heart disease. OPG has also been implicated in various inflammations and linked to diabetes and poor glycemic control. Alternatively spliced transcript variants of this gene have been reported, although their full length nature has not been determined. 147
29093 276908 cd10582 TNFRSF14 Tumor necrosis factor receptor superfamily member 14 (TNFRSF14), also known as herpes virus entry mediator (HVEM). TNFRSF14 (also known as herpes virus entry mediator or HVEM, ATAR, CD270, HVEA, LIGHTR, TR2) regulates T-cell immune responses by activating inflammatory, as well as inhibitory signaling pathways. HVEM acts as a receptor for the canonical TNF-related ligand LIGHT (lymphotoxin-like), which exhibits inducible expression, and competes with herpes simplex virus glycoprotein D for HVEM. It also acts as a ligand for the immunoglobulin superfamily proteins BTLA (B and T lymphocyte attenuator) and CD160, a feature distinguishing HVEM from other immune regulatory molecules, thus, creating a functionally diverse set of intrinsic and bidirectional signaling pathways. HVEM is highly expressed in the gut epithelium. Genome-wide association studies have shown that Hvem is an inflammatory bowel disease (IBD) risk gene, suggesting that HVEM could have a regulatory role influencing the regulation of epithelial barrier, host defense, and the microbiota. Mouse models have revealed that HVEM is involved in colitis pathogenesis, mucosal host defense, and epithelial immunity, thus acting as a mucosal gatekeeper with multiple regulatory functions in the mucosa. HVEM plays a critical role in both tumor progression and resistance to antitumor immune responses, possibly through direct and indirect mechanisms. It is known to be expressed in several human malignancies, including esophageal squamous cell carcinoma, follicular lymphoma and melanoma. HVEM network may therefore be an attractive target for drug intervention. 101
29094 276909 cd10583 TNFRSF21 Tumor necrosis factor receptor superfamily member 21 (TNFRSF21), also known as death receptor (DR6). TNFRSF21 (also known as death receptor 6 (DR6), CD358, BM-018) is highly expressed in differentiating neurons as well as in the adult brain, and is upregulated in injured neurons. DR6 negatively regulates neurondendrocyte, axondendrocyte, and oligodendrocyte survival, hinders axondendrocyte and oligodendrocyte regeneration and its inhibition has a neuro-protective effect in nerve injury. It activates nuclear factor kappa-B (NFkB) and mitogen-activated protein kinase 8 (MAPK8, also called c-Jun N-terminal kinase 1), and induces cell apoptosis by associating with TNFRSF1A-associated via death domain (TRADD), which is known to mediate signal transduction of tumor necrosis factor receptors. TNFRSF21 plays a role in T-helper cell activation, and may be involved in inflammation and immune regulation. Its possible ligand is alpha-amyloid precursor protein (APP), hence probably involved in the development of Alzheimer's disease; when released, APP binds in an autocrine/paracrine manner to activate a caspase-dependent self-destruction program that removes unnecessary or connectionless axons. Increasing beta-catenin levels in brain endothelium upregulates TNFRSF21 and TNFRSF19, indicating that these death receptors are downstream target genes of Wnt/beta-catenin signaling, which has been shown to be required for blood-brain barrier development. DR6 is up-regulated in numerous solid tumors as well as in tumor vascular cells, including ovarian cancer and may be a clinically useful diagnostic and predictive serum biomarker for some adult sarcoma subtypes. 159
29095 213020 cd10585 CE4_SF Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily. The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins. 142
29096 198285 cd10718 SH2_CIS Src homology 2 (SH2) domain found in cytokine-inducible SH2-containing protein (CIS). CIS family members are known to be cytokine-inducible negative regulators of cytokine signaling. The expression of the CIS gene can be induced by IL2, IL3, GM-CSF and EPO in hematopoietic cells. Proteasome-mediated degradation of this protein has been shown to be involved in the inactivation of the erythropoietin receptor. Suppressor of cytokine signalling (SOCS) was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 88
29097 199908 cd10719 DnaJ_zf Zinc finger domain of DnaJ and HSP40. Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding. 65
29098 199909 cd10747 DnaJ_C C-terminal substrate binding domain of DnaJ and HSP40. The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. 158
29099 199910 cd10748 anti-TRAP anti-TRAP (AT) protein specific to Bacilli. In Bacillus subtilis and related bacteria, AT binds to the TRAP protein, (tryptophan-activated trp RNA-binding attenuation protein), effectively disrupting interaction of TRAP with mRNAs. Upon binding of tryptophan, TRAP (which forms a complex of 11 identical subunits) interacts with a specific location in the leader RNA and blocks translation of the tryptophan biosynthetic operon. AT, in turn, recognizes the tryptophan-activated TRAP complex and prevents RNA binding. AT is expressed in response to high levels of uncharged tryptophan tRNA. AT contains a zinc-binding motif that closely resembles the zinc-binding motifs in the zinc-finger region of DnaJ/Hsp40. AT has been shown to form homo-dodecameric assemblies, and can actually do that in two different relative orientations, resulting in two different dodecamers. Recent data suggest that the trimeric form of AT may be the biologically relevant active complex. 52
29100 212097 cd10785 GH38-57_N_LamB_YdjC_SF Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins. The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possesses a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily. 203
29101 212098 cd10786 GH38N_AMII_like N-terminal catalytic domain of class II alpha-mannosidases and similar proteins; glycoside hydrolase family 38 (GH38). Alpha-mannosidases (EC 3.2.1.24) are extensively found in eukaryotes and play important roles in the processing of newly formed N-glycans and in degradation of mature glycoproteins. A deficiency of this enzyme causes the lysosomal storage disease alpha-mannosidosis. Many bacterial and archaeal species also possess putative alpha-mannosidases, but their activity and specificity are largely unknown. Based on different functional characteristics and sequence homology, alpha-mannosidases have been organized into two classes (class I, belonging to glycoside hydrolase family 47, and class II, belonging to glycoside hydrolase family 38). Members of this family correspond to class II alpha-mannosidases (alphaMII), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possesses a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides. The N-terminal catalytic domain of alphaMII adopts a structure consisting of parallel 7-stranded beta/alpha barrel. Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. 251
29102 212099 cd10787 LamB_YcsF_like LamB/YcsF family of lactam utilization protein. The LamB/YbgL family includes the Aspergillus nidulans protein LamB, and its homologs from all three kingdoms of life. The lamB gene is located at the lam locus of Aspergillus nidulans, consisting of two divergently transcribed genes, lamA and lamB, needed for the utilization of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. Although the exact molecular function of LamB is unknown, it might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA. 238
29103 212100 cd10788 YdjC_like YdjC-family proteins. YdjC-family proteins are widely distributed, from human to bacteria. It is represented by an uncharacterised protein YdjC (also known as ChbG), encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. This subfamily also includes hopanoid biosynthesis associated proteins HpnK and many uncharacterized YdjC homologs. Although the exact molecular function of the YdjC-family proteins remains unclear, it has been suggested that they play a role in the cleavage of cellobiosephosphate. 243
29104 212101 cd10789 GH38N_AMII_ER_cytosolic N-terminal catalytic domain of endoplasmic reticulum(ER)/cytosolic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). The subfamily is represented by Saccharomyces cerevisiae vacuolar alpha-mannosidase Ams1, rat ER/cytosolic alpha-mannosidase Man2C1, and similar proteins. Members in this family share high sequence similarity. None of them has any classical signal sequence or membrane spanning domains, which are typical of sorting or targeting signals. Ams1 functions as a second resident vacuolar hydrolase in S. cerevisiae. It aids in recycling macromolecular components of the cell through hydrolysis of terminal, non-reducing alpha-d-mannose residues. Ams1 utilizes both the cytoplasm to vacuole targeting (Cvt, nutrient-rich conditions) and autophagic (starvation conditions) pathways for biosynthetic delivery to the vacuole. Man2C1 is involved in oligosaccharide catabolism in both the ER and cytosol. It can catalyze the cobalt-dependent cleavage of alpha 1,2-, alpha 1,3-, and alpha 1,6-linked mannose residues. Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl-enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. 252
29105 212102 cd10790 GH38N_AMII_1 N-terminal catalytic domain of putative prokaryotic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). This mainly bacterial subfamily corresponds to a group of putative class II alpha-mannosidases, including various proteins assigned as alpha-mannosidases, Streptococcus pyogenes (SpGH38) encoded by ORF spy1604, Escherichia coli MngB encoded by the mngB/ybgG gene, and Thermotoga maritima TMM, and similar proteins. SpGH38 targets alpha-1,3 mannosidic linkages. SpGH38 appears to exist as an elongated dimer and displays alpha-1,3 mannosidase activity. It is active on disaccharides and some aryl glycosides. SpGH38 can also effectively deglycosylate human N-glycans in vitro. MngB exhibits alpha-mannosidase activity that catalyzes the conversion of 2-O-(6-phospho-alpha-mannosyl)-D-glycerate to mannose-6-phosphate and glycerate in the pathway which enables use of mannosyl-D-glycerate as a sole carbon source. TMM is a homodimeric enzyme that hydrolyzes p-nitrophenyl-alpha-D-mannopyranoside, alpha-1,2-mannobiose, alpha-1,3-mannobiose, alpha-1,4-mannobiose, and alpha-1,6-mannobiose. The GH38 family contains retaining glycosyl hydrolases that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. Divalent metal ions, such as zinc or cobalt ions, are suggested to be required for the catalytic activities of typical class II alpha-mannosidases. However, TMM requires cobalt or cadmium for its activity. The cadmium ion dependency is unique to TMM. Moreover, TMM is inhibited by swainsonine but not 1-deoxymannojirimycin, which is in agreement with the features of cytosolic alpha-mannosidase. 273
29106 212103 cd10791 GH38N_AMII_like_1 N-terminal catalytic domain of mainly uncharacterized eukaryotic proteins similar to alpha-mannosidases; glycoside hydrolase family 38 (GH38). The subfamily of mainly uncharacterized eukaryotic proteins shows sequence homology with class II alpha-mannosidases (AlphaAMIIs). AlphaAMIIs possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyze the degradation of N-linked oligosaccharides. The N-terminal catalytic domain of alphaMII adopts a structure consisting of parallel 7-stranded beta/alpha barrel. This subfamily belongs to the GH38 family of retaining glycosyl hydrolases, which employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. 254
29107 212104 cd10792 GH57N_AmyC_like N-terminal catalytic domain of alpha-amylase (AmyC) and similar proteins. Alpha-amylases (alpha-1,4-glucan-4-glucanohydrolases, EC 3.2.1.1) play essential roles in alpha-glucan metabolism by catalyzing the hydrolysis of polysaccharides such as amylose, starch, and beta-limit dextrin. This subfamily is represented by a novel alpha-amylase (AmyC) encoded by ORF tm1438 of the hyperthermophilic organism Thermotoga maritima, and its prokaryotic homologs. AmyC functions as a homotetramer and shows thermostable amylolytic activity. It is strongly inhibited by acarbose. AmyC is composed of an N-terminal catalytic domain, containing a distorted TIM-barrel structure with a characteristic (beta/alpha)7 fold motif, and two additional less conserved domains. Two other canonical alpha-amylases encoded by T. maritima lack sequence similarity to AmyC and belong to a different superfamily. 412
29108 212105 cd10793 GH57N_TLGT_like N-terminal catalytic domain of 4-alpha-glucanotransferase; glycoside hydrolase family 57 (GH57). 4-alpha-glucanotransferase (TLGT, EC 2.4.1.25) plays a key role in maltose metabolism. It catalyzes the disproportionation of amylose and the formation of large cyclic alpha-1,4-glucan (cycloamylose) from linear amylose. TLGT functions as a homodimer. Each monomer is composed of two domains, an N-terminal catalytic domain with a (beta/alpha)7 barrel fold and a C-terminal domain with a twisted beta-sandwich fold. Some family members have been designated as alpha-amylases, such as the heat-stable eubacterial amylase from Dictyoglomus thermophilum (DtAmyA) and the extremely thermostable archaeal amylase from Pyrococcus furiosus (PfAmyA). However, both of these proteins are 4-alpha-glucanotransferases. DtAmyA was shown to have transglycosylating activity and PfAmyA exhibits 4-alpha-glucanotransferase activity. 279
29109 212106 cd10794 GH57N_PfGalA_like N-terminal catalytic domain of alpha-galactosidase; glycoside hydrolase family 57 (GH57). Alpha-galactosidases (GalA, EC 3.2.1.22) catalyze the hydrolysis of alpha-1,6-linked galactose residues from oligosaccharides and polymeric galactomannans. Based on sequence similarity, the majority of eukaryotic and bacterial GalAs have been classified into glycoside hydrolase family GH27, GH36, and GH4, respectively. This subfamily is represented by a novel type of GalA from Pyrococcus furiosus (PfGalA), which belongs to the GH57 family. PfGalA is an extremely thermo-active and thermostable GalA that functions as a bacterial-like GalA, however, without the capacity to hydrolyze polysaccharides. It specifically catalyzes the hydrolysis of para-nitrophenyl-alpha-galactopyranoside, and to some extent that of melibiose and raffinose. PfGalA has a pH optimum between 5.0-5.5. 305
29110 212107 cd10795 GH57N_MJA1_like N-terminal catalytic domain of a thermoactive alpha-amylase from Methanococcus jannaschii and similar proteins; glycoside hydrolase family 57 (GH57). The subfamily is represented by a thermostable alpha-amylase (MJA1, EC 3.2.1.1) encoded by locus MJ1611 of the hyperthermophilic archaeon Methanococcus jannaschii. MJA1 has a broad pH optimum of 5.0-8.0. It exhibits extremely thermophilic alpha-amylase activity that catalyzes the hydrolysis of large sugar polymers with alpha-1,6 and alpha-1,4 linkages, and yields products including glucose polymers of 1-7 units. MJ1611 also encodes another alpha-amylase with catalytic features distinct from MJA1, which belongs to glycoside hydrolase family 13 (GH-13), and is not included here. This subfamily also includes many uncharacterized proteins found in bacteria and archaea. 306
29111 212108 cd10796 GH57N_APU N-terminal catalytic domain of thermoactive amylopullulanases; glycoside hydrolase family 57 (GH57). Pullulanases (EC 3.2.1.41) are capable of hydrolyzing the alpha-1,6 glucosidic bonds of pullulan, producing maltotriose. Amylopullulanases (APU, E.C 3.2.1.1/41) are type II pullulanases which can also degrade both the alpha-1,6 and alpha-1,4 glucosidic bonds of starch, producing oligosaccharides. This subfamily includes GH57 archaeal thermoactive APUs, which show both pullulanolytic and amylolytic activities. They have an acid pH optimum and the presence of Ca2+ might increase their activity, thermostability, and substrate affinity. Besides GH57 thermoactive APUs, all mesophilic and some thermoactive APUs belong to glycoside hydrolase family 13 with catalytic features distinct from GH57. This subfamily also includes many uncharacterized proteins found in bacteria and archaea. 313
29112 212109 cd10797 GH57N_APU_like_1 N-terminal putative catalytic domain of mainly uncharacterized prokaryotic proteins similar to archaeal thermoactive amylopullulanases; glycoside hydrolase family 57 (GH57). This subfamily of mainly uncharacterized bacterial proteins, shows high sequence homology to GH57 archaeal thermoactive amylopullulanases (APU, E.C 3.2.1.1/41). Thermoactive APUs are type II pullulanases with both pullulanolytic and amylolytic activities. They have an acid pH optimum and the presence of Ca2+ might increase their activity, thermostability, and substrate affinity. 327
29113 212110 cd10798 GH57N_like_1 Uncharacterized subfamily of glycoside hydrolase family 57 (GH57). This subfamily of uncharacterized bacterial proteins shows high sequence homology to glycoside hydrolase family 57 (GH57). Glycoside hydrolase family 57 (GH57) is a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). 330
29114 212111 cd10800 LamB_YcsF_YbgL_like Escherichia coli putative lactam utilization protein YbgL and similar proteins. This subfamily of the LamB/YbgL family is represented by the Escherichia coli putative lactam utilization protein YbgL. Although the molecular function of members of this subfamily is unknown, they show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA. 240
29115 212112 cd10801 LamB_YcsF_like_1 uncharacterized proteins similar to the Aspergillus nidulans lactam utilization protein LamB. This mainly bacterial subfamily of the LamB/YbgL family, contains many well conserved uncharacterized proteins. Although their molecular function remains unknown, those proteins show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA. 233
29116 212113 cd10802 YdjC_TTHB029_like Thermus thermophilus TTHB029 and similar proteins. This subfamily is represented by a YdjC-family protein TTHB029 from Thermus thermophilus HB8; it is similar to Escherichia coli YdjC, a hypothetical protein encoded by the celG gene. TTHB029 functions as a homodimer. Each monomer consists of a (beta/alpha)-barrel fold. The molecular function of TTHB029 is unclear. 251
29117 212114 cd10803 YdjC_EF3048_like Enterococcus faecalis EF3048 and similar proteins. This subfamily is represented by a putative cellobiose-phosphate cleavage protein EF3048 from Enterococcus faecalis V583. It is similar to Escherichia coli YdjC, a hypothetical protein encoded by the celG gene. EF3048 might function as a homodimer. Each of the monomers consists of a (beta/alpha)-barrel fold that forms an active homodimer. The molecular function of EF3048 is unclear. 228
29118 212115 cd10804 YdjC_HpnK_like hopanoid biosynthesis associated protein HpnK and similar proteins. The subfamily includes some uncharacterized proteins annotated as hopanoid biosynthesis associated proteins, HpnK. They show high sequence similarity to proteins from the YdjC-family, the latter is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. 261
29119 212116 cd10805 YdjC_like_1 uncharacterized YdjC-like family proteins from bacteria. The subfamily contains many hypothetical proteins, and belongs to the YdjC-like family of uncharacterized proteins from bacteria. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear. 251
29120 212117 cd10806 YdjC_like_2 uncharacterized YdjC-like family proteins from eukaryotes. This eukaryotic subfamily contains hypothetical and uncharacterized proteins, and belongs to the YdjC-like family of uncharacterized proteins. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear. 280
29121 212118 cd10807 YdjC_like_3 uncharacterized YdjC-like family proteins from bacteria. This subfamily contains many hypothetical proteins, and belongs to the YdjC-like family of uncharacterized proteins from bacteria. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear. 251
29122 212119 cd10808 YdjC Escherichia coli YdjC-like family of proteins. Uncharacterized subfamily of YdjC-like family of proteins. Included in this subfamily is the uncharacterized Escherichia coli protein YdjC (also known as ChbG), encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear. 259
29123 212120 cd10809 GH38N_AMII_GMII_SfManIII_like N-terminal catalytic domain of Golgi alpha-mannosidase II, Spodoptera frugiperda Sf9 alpha-mannosidase III, and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by Golgi alpha-mannosidase II (GMII, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A1), a monomeric, membrane-anchored class II alpha-mannosidase existing in the Golgi apparatus of eukaryotes. GMII plays a key role in the N-glycosylation pathway. It catalyzes the hydrolysis of the terminal both alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2(GlcNAc, N-acetylglucosmine), which is the committed step of complex N-glycan synthesis. GMII is activated by zinc or cobalt ions and is strongly inhibited by swainsonine. Inhibition of GMII provides a route to block cancer-induced changes in cell surface oligosaccharide structures. GMII has a pH optimum of 5.5-6.0, which is intermediate between those of acidic (lysosomal alpha-mannosidase) and neutral (ER/cytosolic alpha-mannosidase) enzymes. GMII is a retaining glycosyl hydrolase of family GH38 that employs a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. This subfamily also includes human alpha-mannosidase 2x (MX, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A2). MX is enzymatically and functionally very similar to GMII, and is thought to also function in the N-glycosylation pathway. Also found in this subfamily is class II alpha-mannosidase encoded by Spodoptera frugiperda Sf9 cell. This alpha-mannosidase is an integral membrane glycoprotein localized in the Golgi apparatus. 
It shows high sequence homology with mammalian Golgi alpha-mannosidase II (GMII). It can hydrolyze p-nitrophenyl alpha-D-mannopyranoside (pNP-alpha-Man), and it is inhibited by swainsonine. However, the Sf9 enzyme is stimulated by cobalt and can hydrolyze (Man)5(GlcNAc)2 to (Man)3(GlcNAc)2, but it cannot hydrolyze GlcNAc(Man)5(GlcNAc)2, which distinguishes its substrate specificity from that of GMII. Thus, this enzyme has been designated as Sf9 alpha-mannosidase III (SfManIII). It probably functions in an alternate N-glycan processing pathway in Sf9 cells. 340
29124 212121 cd10810 GH38N_AMII_LAM_like N-terminal catalytic domain of lysosomal alpha-mannosidase and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily is represented by lysosomal alpha-mannosidase (LAM, Man2B1, EC 3.2.1.114), which is a broad specificity exoglycosidase hydrolyzing all known alpha 1,2-, alpha 1,3-, and alpha 1,6-mannosidic linkages from numerous high mannose type oligosaccharides. LAM is expressed in all tissues and in many species. In mammals, the absence of LAM can cause the autosomal recessive disease alpha-mannosidosis. LAM has an acidic pH optimum at 4.0-4.5. It is stimulated by zinc ion and is inhibited by cobalt ion and plant alkaloids, such as swainsonine (SW). LAM catalyzes hydrolysis by a double displacement mechanism in which a glycosyl-enzyme intermediate is formed and hydrolyzed via oxacarbenium ion-like transition states. A carboxylic acid in the active site acts as the catalytic nucleophile in the formation of the covalent intermediate while a second carboxylic acid acts as a general acid catalyst. The same residue is thought to assist in the hydrolysis (deglycosylation) step, this time acting as a general base. 278
29125 212122 cd10811 GH38N_AMII_Epman_like N-terminal catalytic domain of mammalian core-specific lysosomal alpha 1,6-mannosidase and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily is represented by a novel human core-specific lysosomal alpha 1,6-mannosidase (Epman, Man2B2) and similar proteins. Although it was previously named as epididymal alpha-mannosidase, Epman has a broadly distributed transcript expression profile. Different from the major broad specificity lysosomal alpha-mannosidases (LAM, MAN2B1), Epman is not associated with genetic alpha-mannosidosis that is caused by the absence of LAM. Furthermore, Epman has unique substrate specificity. It can efficiently cleave only the alpha 1,6-linked mannose residue from (Man)3GlcNAc, but not (Man)3(GlcNAc)2 or other larger high mannose oligosaccharides, in the core of N-linked glycans. In contrast, the major LAM can cleave all of the alpha-linked mannose residues from high mannose oligosaccharides except the core alpha 1,6-linked mannose residue. Moreover, it is suggested that the catalytic activity of Epman is dependent on prior action by di-N-acetyl-chitobiase (chitobiase), which indicates there is a functional cooperation between these two enzymes for the full and efficient catabolism of mammalian lysosomal N-glycan core structures. Epman has an acidic pH optimum. It is strongly stimulated by cobalt or zinc ions and strongly inhibited by furanose analogues swainsonine (SW) and 1,4-dideoxy-1,4-imino-d-mannitol (DIM). 326
29126 212123 cd10812 GH38N_AMII_ScAms1_like N-terminal catalytic domain of yeast vacuolar alpha-mannosidases and similar proteins; glycoside hydrolase family 38 (GH38). The family is represented by Saccharomyces cerevisiae alpha-mannosidase (Ams1) and its eukaryotic homologs. Ams1 functions as a second resident vacuolar hydrolase in S. cerevisiae. It aids in recycling macromolecular components of the cell through hydrolysis of terminal, non-reducing alpha-d-mannose residues. Ams1 forms an oligomer in the cytoplasm and retains its oligomeric form during the import process. It utilizes both the Cvt (nutrient-rich conditions) and autophagic (starvation conditions) pathways for biosynthetic delivery to the vacuole. Mutants in either pathway are defective in Ams1 import. Members in this family show high sequence similarity with rat ER/cytosolic alpha-mannosidase Man2C1. 258
29127 212124 cd10813 GH38N_AMII_Man2C1 N-terminal catalytic domain of mammalian cytosolic alpha-mannosidase Man2C1 and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily corresponds to cytosolic alpha-mannosidase Man2C1 (also known as ER-mannosidase II or neutral/cytosolic mannosidase), mainly found in various vertebrates, and similar proteins. Man2C1 plays an essential role in the catabolism of cytosolic free oligomannosides derived from dolichol intermediates and the degradation of newly synthesized glycoproteins in ER or cytosol. It can catalyze the cleavage of alpha 1,2-, alpha 1,3-, and alpha 1,6-linked mannose residues. Man2C1 is a cobalt-dependent enzyme belonging to alpha-mannosidase class II. It has a neutral pH optimum and is strongly inhibited by furanose analogs swainsonine (SW) and 1,4-dideoxy-1,4-imino-D-mannitol (DIM), moderately by deoxymannojirimycin (DMM), but not by kifunensine (KIF). DMM and KIF, both pyranose analogs, are normally known to inhibit class I alpha-mannosidase. 252
29128 212125 cd10814 GH38N_AMII_SpGH38_like N-terminal catalytic domain of SPGH38, a putative alpha-mannosidase of Streptococcus pyogenes, and its prokaryotic homologs; glycoside hydrolase family 38 (GH38). The subfamily is represented by SpGH38 of Streptococcus pyogenes, which has been assigned as a putative alpha-mannosidase, and is encoded by ORF spy1604. SpGH38 appears to exist as an elongated dimer and display alpha-1,3 mannosidase activity. It is active on disaccharides and some aryl glycosides. SpGH38 can also effectively deglycosylate human N-glycans in vitro. A divalent metal ion, such as a zinc ion, is required for its activity. SpGH38 is inhibited by swainsonine. The absence of any secretion signal peptide suggests that SpGH38 may be intracellular. 271
29129 212126 cd10815 GH38N_AMII_EcMngB_like N-terminal catalytic domain of Escherichia coli alpha-mannosidase MngB and its bacterial homologs; glycoside hydrolase family 38 (GH38). The bacterial subfamily is represented by Escherichia coli alpha-mannosidase MngB, which is encoded by the mngB gene (previously called ybgG). MngB exhibits alpha-mannosidase activity that converts 2-O-(6-phospho-alpha-mannosyl)-D-glycerate to mannose-6-phosphate and glycerate in the pathway which enables use of mannosyl-D-glycerate as a sole carbon source. A divalent metal ion is required for its activity. 270
29130 212127 cd10816 GH57N_BE_TK1436_like N-terminal catalytic domain of GH57 branching enzyme TK1436 and similar proteins. The subfamily is represented by a novel branching enzyme TK1436 of hyperthermophilic archaeon Thermococcus kodakaraensis KOD1. Branching enzymes (BEs, EC 2.4.1.18) play a key role in synthesis of alpha-glucans and they generally are classified into glycoside hydrolase family 13 (GH13). However, TK1436 belongs to the GH57 family. It functions as a monomer and possesses BE activity. TK1436 is composed of a distorted N-terminal (beta/alpha)7-barrel domain and a C-terminal five alpha-helical domain, both of which participate in the formation of the active-site cleft. 423
29131 380680 cd10843 DSRM_DICER double-stranded RNA binding motif of endoribonuclease Dicer and similar proteins. Dicer (also known as helicase with RNase motif (HERNA), or helicase MOI) is a double-stranded RNA (dsRNA) endoribonuclease playing a central role in short dsRNA-mediated post-transcriptional gene silencing. It cleaves naturally occurring long dsRNAs and short hairpin pre-microRNAs (miRNA) into fragments of twenty-one to twenty-three nucleotides with 3' overhang of two nucleotides, producing respectively short interfering RNAs (siRNA) and mature microRNAs. Dicer contains a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 63
29132 380681 cd10844 DSRM_TARBP2_rpt2 second double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
29133 380682 cd10845 DSRM_RNAse_III_family double-stranded RNA binding motif of ribonuclease III (RNase III) and similar proteins. RNase III (EC 3.1.26.3; also known as ribonuclease 3) digests double-stranded RNA formed within single-strand substrates, but not RNA-DNA hybrids. It is involved in the processing of rRNA precursors, viral transcripts, some mRNAs, and at least 1 tRNA (metY, a minor form of tRNA-init-Met). It cleaves the 30S primary rRNA transcript to yield the immediate precursors to the 16S and 23S rRNAs. The cleavage can occur in assembled 30S, 50S, and even 70S subunits and is influenced by the presence of ribosomal proteins. The RNase III family also includes the mitochondrion-specific ribosomal protein mL44 subfamily, which is composed of mitochondrial 54S ribosomal protein L3 (MRPL3) and mitochondrial 39S ribosomal protein L44 (MRPL44). Members of this family contain an RNase III domain and a C-terminal double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
29134 211315 cd10909 ChtBD1_GH18_2 Hevein or type 1 chitin binding domain (ChtBD1) subfamily; in some members co-occurs with family 18 glycosyl hydrolases. This subfamily includes a Toxoplasma gondii ME49 protein annotated as a putative mannosyl-oligosaccharide glucosidase. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 51
29135 350234 cd10910 PIN_limkain_b1_N_like N-terminal LabA-like PIN domain of limkain b1 and similar proteins. Limkain b1 is a human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. Limkain b1 contains multiple copies of LOTUS domains and a conserved RNA recognition motif, this and similar domain architectures are shared by several members of this family, and a function of these architectures in RNA binding or RNA metabolism has been suggested. The function of the N-terminal domain is unknown. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
29136 350235 cd10911 PIN_LabA PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. This subfamily contains Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. This subfamily belongs to the LabA-like domain family which includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes. Also included in the LabA-like domain family are human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB , which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 154
29137 350236 cd10912 PIN_YacP-like PIN_domain of Bacillus subtilis YacP/Rae1 and related proteins. Bacillus subtilis YacP, also known as Rae1, is an endoribonuclease involved in ribosome-dependent mRNA decay. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 142
29138 199211 cd10913 Peptidase_C25_N_gingipain gingipain subgroup of the Peptidase C25 family N-terminal domain. Gingipain, produced by Porphyromonas gingivalis, exemplifies the Peptidase family C25, a unique class of cysteine proteases. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease also associated with other diseases such as diabetes and cardiovascular disease. The gingipain subgroup contains extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemaglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad, are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. It has been suggested that they enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. 
RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network. 348
29139 199212 cd10914 Peptidase_C25_N_1 uncharacterized subgroup of the Peptidase C25 family N-terminal domain. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene (also called prtK, prkP). Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemaglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. 
They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network. 365
29140 199213 cd10915 Peptidase_C25_N_2 uncharacterized subgroup of the Peptidase C25 family N-terminal domain. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that cause periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network. 403
29141 213021 cd10916 CE4_PuuE_HpPgdA_like Catalytic domain of bacterial PuuE allantoinases, Helicobacter pylori peptidoglycan deacetylase (HpPgdA), and similar proteins. This family is a member of the very large and functionally diverse carbohydrate esterase 4 (CE4) superfamily. It contains bacterial PuuE (purine utilization E) allantoinases, a peptidoglycan deacetylase from Helicobacter pylori (HpPgdA), Escherichia coli ArnD, and many uncharacterized homologs from all three kingdoms of life. PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as a homotetramer. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase (DCA)-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. However, in contrast with the typical DCAs, PuuE allantoinase and HpPgdA might not exhibit a solvent-accessible polysaccharide binding groove and only recognize a small substrate molecule. ArnD catalyzes the deformylation of 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol to 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol. 247
29142 213022 cd10917 CE4_NodB_like_6s_7s Catalytic NodB homology domain of rhizobial NodB-like proteins. This family belongs to the large and functionally diverse carbohydrate esterase 4 (CE4) superfamily, whose members show strong sequence similarity with some variability due to their distinct carbohydrate substrates. It includes many rhizobial NodB chitooligosaccharide N-deacetylase (EC 3.5.1.-)-like proteins, mainly from bacteria and eukaryotes, such as chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan. All members of this family contain a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold with 6- or 7 strands. Their catalytic activity is dependent on the presence of a divalent cation, preferably cobalt or zinc, and they employ a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. Several family members show diversity both in metal ion specificities and in the residues that coordinate the metal. 171
29143 213023 cd10918 CE4_NodB_like_5s_6s Putative catalytic NodB homology domain of PgaB, IcaB, and similar proteins which consist of a deformed (beta/alpha)8 barrel fold with 5- or 6-strands. This family belongs to the large and functionally diverse carbohydrate esterase 4 (CE4) superfamily, whose members show strong sequence similarity with some variability due to their distinct carbohydrate substrates. It includes bacterial poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB, hemin storage system HmsF protein in gram-negative species, intercellular adhesion proteins IcaB, and many uncharacterized prokaryotic polysaccharide deacetylases. It also includes a putative polysaccharide deacetylase YxkH encoded by the Bacillus subtilis yxkH gene, which is one of six polysaccharide deacetylase gene homologs present in the Bacillus subtilis genome. Sequence comparison shows all family members contain a conserved domain similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, which consists of a deformed (beta/alpha)8 barrel fold with 6 or 7 strands. However, in this family, most proteins have 5 strands and some have 6 strands. Moreover, long insertions are found in many family members, whose function remains unknown. 157
29144 200545 cd10919 CE4_CDA_like Putative catalytic domain of chitin deacetylase-like proteins from insects and similar proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase family 4 (CE4). This family includes many CDA-like proteins, mainly from insects, which contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. Some family members have an additional chitin binding domain (ChBD), or an additional low-density lipoprotein receptor class A domain (LDLa), or both. Due to the lack of some catalytically relevant residues, several insect CDA-like proteins are devoid of enzymatic activity and may simply bind to chitin and thus influence the mechanical or permeability properties of chitin-containing structures such as the cuticle or the peritrophic membrane. This family also includes many uncharacterized hypothetical proteins from bacteria, exhibiting high sequence similarity to insect CDA-like proteins. 273
29145 200546 cd10920 CE4_WbmS Catalytic domain of a putative polysaccharide deacetylase WbmS from Bordetella bronchiseptica and similar proteins. This family is represented by a putative polysaccharide deacetylase encoded by the O-antigen-related gene wbmS in Bordetella bronchiseptica. Although its precise function remains unknown, it has been suggested that WbmS might be involved in the biosynthesis of O-antigen, an important component of the gram-negative bacterial outer membrane, and may also play a role in sugar phosphate transfer. Structural superposition and sequence comparison show that WbmS consists of a conserved domain similar to the 7-stranded barrel catalytic domain of polysaccharide deacetylases (DACs) from the carbohydrate esterase 4 (CE4) superfamily, which removes N-linked acetyl groups from cell wall polysaccharides. 233
29146 200547 cd10921 CE4_MJ0505_like Putative catalytic domain of uncharacterized protein MJ0505 from Methanocaldococcus jannaschii and similar proteins. This family contains an uncharacterized protein MJ0505 from Methanocaldococcus jannaschii and its prokaryotic homologs. Although their biochemical properties remain to be determined, members of this family are composed of a seven-stranded barrel with detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 206
29147 200548 cd10922 CE4_PelA_like_C C-terminal Putative NodB-like catalytic domain of PelA-like uncharacterized hypothetical proteins found in bacteria. This family is represented by a protein PelA of unknown function that is encoded by a gene in the pelA-G gene cluster for pellicle production and biofilm formation in Pseudomonas aeruginosa. PelA and most of the family members contain a domain of unknown function, DUF297, in the N-terminus and a C-terminal domain that shows high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 266
29148 200549 cd10923 CE4_COG5298 Putative NodB-like catalytic domain of uncharacterized proteins found in bacteria. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. Some family members contain an additional copper amine oxidase N-terminal domain. 250
29149 200550 cd10924 CE4_COG4878 Putative NodB-like catalytic domain of uncharacterized proteins found in bacteria. The family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 273
29150 200551 cd10925 CE4_u1 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 216
29151 200552 cd10926 CE4_u2 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 253
29152 200553 cd10927 CE4_u3 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 227
29153 200554 cd10928 CE4_u4 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 222
29154 200555 cd10929 CE4_u5 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 263
29155 200556 cd10930 CE4_u6 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 240
29156 200557 cd10931 CE4_u7 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 224
29157 200558 cd10932 CE4_u8 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 324
29158 200559 cd10933 CE4_u9 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 266
29159 200560 cd10934 CE4_cadherin_MopE_like_N N-terminal Putative NodB-like catalytic domain of hypothetical proteins containing C-terminal cadherin or MopE copper binding domains. The family includes several cadherin or MopE copper binding domain containing hypothetical proteins found in bacteria. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. They play a role in cell fate, signalling, proliferation, differentiation, and migration. The copper binding domain involves a tryptophan metabolite, kynurenine, in the protein MopE. Members of this family contain an additional conserved domain, which is N-terminally fused to the cadherin domain or the MopE copper binding domain. Although its function remains unclear, the conserved domain exhibits a seven-stranded barrel with a detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 267
29160 200561 cd10935 CE4_WalW Putative catalytic domain of lipopolysaccharide biosynthesis protein WalW and its bacterial homologs. This family corresponds to a group of uncharacterized lipopolysaccharide biosynthesis WalW proteins found in bacteria. Although their biochemical properties remain to be determined, members of this family are composed of a seven-stranded barrel with detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 295
29161 200562 cd10936 CE4_DAC2 Putative catalytic domain of family 2 polysaccharide deacetylases (DACs) from bacteria. This family contains an uncharacterized protein BH1492 from Bacillus halodurans, an uncharacterized protein ATU2773 from Agrobacterium tumefaciens C58, and other bacterial hypothetical proteins. Although their functions are still unknown, structural superposition and sequence comparison suggest that BH1492 and ATU2773 might be divergently related to the 7-stranded barrel catalytic domain of polysaccharide deacetylases (DACs) from the carbohydrate esterase 4 (CE4) superfamily, which remove N-linked acetyl groups from cell wall polysaccharides. This family is designated as DAC family 2, a divergent DAC family. 215
29162 200563 cd10938 CE4_HpPgdA_like Catalytic domain of Helicobacter pylori peptidoglycan deacetylase (HpPgdA) and similar proteins. This family is represented by a peptidoglycan deacetylase (HP0310, HpPgdA) from the gram-negative pathogen Helicobacter pylori. HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. It functions as a homotetramer. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase (DCA)-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, HpPgdA does not exhibit a solvent-accessible polysaccharide binding groove, suggesting that the enzyme binds a small molecule at the active site. 258
29163 200564 cd10939 CE4_ArnD Catalytic domain of Escherichia coli 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD and other bacterial homologs. This family is represented by Escherichia coli 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD (EC 3.5.1.n3). ArnD plays an important role in the biosynthesis of undecaprenyl phosphate alpha-4-amino-4-deoxy-L-arabinose (alpha-L-Ara4N). It catalyzes the deformylation of 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol to 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol. The ArnD-dependent deformylation likely occurs on the inner leaflet of the inner membrane. This family also includes many uncharacterized bacterial polysaccharide deacetylases. All family members show high sequence homology to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA), and are classified within the larger carbohydrate esterase 4 (CE4) superfamily. 290
29164 200565 cd10940 CE4_PuuE_HpPgdA_like_1 Putative catalytic domain of uncharacterized bacterial polysaccharide deacetylases similar to bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). This family contains many uncharacterized bacterial polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as homotetramers. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, PuuE allantoinase and HpPgdA do not exhibit a solvent-accessible polysaccharide binding groove and might only bind a small molecule at the active site. 306
29165 200566 cd10941 CE4_PuuE_HpPgdA_like_2 Putative catalytic domain of uncharacterized prokaryotic polysaccharide deacetylases similar to bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). This family contains many uncharacterized prokaryotic polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as homotetramers. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, PuuE allantoinase and HpPgdA do not exhibit a solvent-accessible polysaccharide binding groove and might only bind a small molecule at the active site. 258
29166 200567 cd10942 CE4_u11 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 252
29167 200568 cd10943 CE4_NodB Putative catalytic domain of rhizobial NodB chitooligosaccharide N-deacetylase and its bacterial homologs. This family corresponds to rhizobial NodB chitooligosaccharide N-deacetylase (EC 3.5.1.-), encoded by the nodB gene from the nodulation (nod) gene cluster that is responsible for the biosynthesis of bacterial nodulation signals, termed Nod factors. NodB is involved in de-N-acetylating the nonreducing N-acetylglucosamine residue of chitooligosaccharides to allow for the attachment of the fatty acyl group by the acyltransferase NodA. The monosaccharide N-acetylglucosamine cannot be deacetylated by NodB. NodB is composed of a 6-stranded barrel catalytic domain with detectable sequence similarity to the 7-stranded barrel homology domain of polysaccharide deacetylase (DCA)-like proteins in the larger carbohydrate esterase 4 (CE4) superfamily. 193
29168 200569 cd10944 CE4_SmPgdA_like Catalytic NodB homology domain of Streptococcus mutans polysaccharide deacetylase PgdA, Bacillus subtilis YheN, and similar proteins. This family is represented by a putative polysaccharide deacetylase PgdA from the oral pathogen Streptococcus mutans (SmPgdA) and Bacillus subtilis YheN (BsYheN), which are members of the carbohydrate esterase 4 (CE4) superfamily. SmPgdA is an extracellular metal-dependent polysaccharide deacetylase with a typical CE4 fold, with metal bound to a His-His-Asp triad. It possesses de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. SmPgdA plays a role in tuning cell surface properties and in interactions with (salivary) agglutinin, an essential component of the innate immune system, most likely through deacetylation of an as-yet-unidentified polysaccharide. SmPgdA shows significant homology to the catalytic domains of peptidoglycan deacetylases from Streptococcus pneumoniae (SpPgdA) and Listeria monocytogenes (LmPgdA), both of which are involved in the bacterial defense mechanism against human mucosal lysozyme. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. The biological function of BsYheN is still unknown. This family also includes many uncharacterized polysaccharide deacetylases mainly found in bacteria. 189
29169 200570 cd10946 CE4_Mll8295_like Putative catalytic NodB homology domain of uncharacterized Mll8295 protein encoded from Rhizobium loti and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase Mll8295 encoded from Rhizobium loti. Although its biological function still remains unknown, Mll8295 shows high sequence homology to the catalytic domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both Mll8295 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases. 217
29170 200571 cd10947 CE4_SpPgdA_BsYjeA_like Catalytic NodB homology domain of Streptococcus pneumoniae peptidoglycan deacetylase PgdA, Bacillus subtilis BsYjeA protein, and their bacterial homologs. This family is represented by Streptococcus pneumoniae peptidoglycan GlcNAc deacetylase (SpPgdA), a member of the carbohydrate esterase 4 (CE4) superfamily. SpPgdA protects gram-positive bacterial cell wall from host lysozymes by deacetylating peptidoglycan N-acetylglucosamine (GlcNAc) residues. It consists of three separate domains: N-terminal, middle and C-terminal (catalytic) domains. The catalytic NodB homology domain is similar to the deformed (beta/alpha)8 barrel fold adopted by other CE4 esterases, which harbors a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The enzyme is able to accept GlcNAc3 as a substrate, with the N-acetyl of the middle sugar being removed by the enzyme. This family also includes Bacillus subtilis BsYjeA protein encoded by the yjeA gene, which is one of the six polysaccharide deacetylase gene homologs (pdaA, pdaB/ybaN, yheN, yjeA, yxkH and ylxY) in the Bacillus subtilis genome. Although homology comparison shows that the BsYjeA protein contains a polysaccharide deacetylase domain, and was predicted to be a membrane-bound xylanase or a membrane-bound chitooligosaccharide deacetylase, more recent research indicates BsYjeA might be a novel non-specific secretory endonuclease which creates random nicks progressively on the two strands of dsDNA, resulting in highly distinguishable intermediates/products very different in chemical and physical compositions over time. In addition, BsYjeA shares several enzymatic properties with the well-understood DNase I endonuclease. Both enzymes are active on ssDNA and dsDNA, both generate random nicks, and both require Mg2+ or Mn2+ for hydrolytic activity. 177
29171 200572 cd10948 CE4_BsPdaA_like Catalytic NodB homology domain of Bacillus subtilis polysaccharide deacetylase PdaA, and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by Bacillus subtilis pdaA gene encoding polysaccharide deacetylase BsPdaA, which is a member of the carbohydrate esterase 4 (CE4) superfamily. BsPdaA deacetylates peptidoglycan N-acetylmuramic acid (MurNAc) residues to facilitate the formation of muramic delta-lactam, which is required for recognition of germination lytic enzymes. BsPdaA deficiency leads to the absence of muramic delta-lactam residues in the spore cortex. Like other CE4 esterases, BsPdaA consists of a single catalytic NodB homology domain that appears to adopt a deformed (beta/alpha)8 barrel fold with a putative substrate binding groove harboring the majority of the conserved residues. It utilizes a general acid/base catalytic mechanism involving a tetrahedral transition intermediate, where a water molecule functions as the nucleophile tightly associated to the zinc cofactor. 223
29172 200573 cd10949 CE4_BsPdaB_like Putative catalytic NodB homology domain of Bacillus subtilis putative polysaccharide deacetylase PdaB, and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by the putative polysaccharide deacetylase PdaB encoded by the pdaB gene on sporulation of Bacillus subtilis. Although its biochemical properties remain to be determined, the PdaB (YbaN) protein is essential for maintaining spores after the late stage of sporulation and is highly conserved in spore-forming bacteria. The glycans of the spore cortex may be candidate PdaB substrates. Based on sequence similarity, the family members are classified as carbohydrate esterase 4 (CE4) superfamily members. However, the classical His-His-Asp zinc-binding motif of CE4 esterases is missing in this family. 192
29173 200574 cd10950 CE4_BsYlxY_like Putative catalytic NodB homology domain of uncharacterized protein YlxY from Bacillus subtilis and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by Bacillus subtilis putative polysaccharide deacetylase BsYlxY, encoded by the ylxY gene, which is a member of the carbohydrate esterase 4 (CE4) superfamily. Although its biological function remains unknown, BsYlxY shows high sequence homology to the catalytic domain of the putative polysaccharide deacetylase BsPdaB, encoded by the Bacillus subtilis pdaB gene, which is essential for the maintenance of spores after the late stage of sporulation and is highly conserved in spore-forming bacteria. However, disruption of the ylxY gene in B. subtilis did not cause any sporulation defect. Moreover, the Asp residue in the classical His-His-Asp zinc-binding motif of CE4 esterases is mutated to a Val residue in this family. Other catalytically relevant residues of CE4 esterases are also not conserved, which suggests that members of this family may be inactive. 188
29174 200575 cd10951 CE4_ClCDA_like Catalytic NodB homology domain of Colletotrichum lindemuthianum chitin deacetylase and similar proteins. This family is represented by the chitin deacetylase (endo-chitin de-N-acetylase, ClCDA, EC 3.5.1.41) from Colletotrichum lindemuthianum (also known as Glomerella lindemuthiana), which is a member of the carbohydrate esterase 4 (CE4) superfamily. ClCDA catalyzes the hydrolysis of N-acetamido groups of N-acetyl-D-glucosamine residues in chitin, converting it to chitosan in fungal cell walls. It consists of a single catalytic domain similar to the deformed (alpha/beta)8 barrel fold adopted by other CE4 esterases, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine), to carry out acid/base catalysis. It possesses a highly conserved substrate-binding groove, with subtle alterations that influence substrate specificity and subsite affinity. Unlike its bacterial homologs, ClCDA contains two intramolecular disulfide bonds that may add stability to this secreted protein. The family also includes many uncharacterized deacetylases and hypothetical proteins mainly from eukaryotes, which show high sequence similarity to ClCDA. 197
29175 200576 cd10952 CE4_MrCDA_like Catalytic NodB homology domain of Mucor rouxii chitin deacetylase and similar proteins. This family is represented by the chitin deacetylase (MrCDA, EC 3.5.1.41) encoded from the fungus Mucor rouxii (also known as Amylomyces rouxii). MrCDA is an acidic glycoprotein with a very stringent specificity for beta1-4-linked N-acetylglucosamine homopolymers. It requires at least four residues (chitotetraose) for catalysis, and can achieve extensive deacetylation on chitin polymers. MrCDA shows high sequence similarity to Colletotrichum lindemuthianum chitin deacetylase (endo-chitin de-N-acetylase, ClCDA), which consists of a single catalytic domain similar to the deformed (beta/alpha)8 barrel fold adopted by the carbohydrate esterase 4 (CE4) superfamily, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The family also includes some uncharacterized eukaryotic and bacterial homologs of MrCDA. 178
29176 200577 cd10953 CE4_SlAXE_like Catalytic NodB homology domain of Streptomyces lividans acetylxylan esterase and its bacterial homologs. This family is represented by Streptomyces lividans acetylxylan esterase (SlAXE, EC 3.1.1.72), a member of the carbohydrate esterase 4 (CE4) superfamily. SlAXE deacetylates O-acetylated xylan, a key component of plant cell walls. It shows no detectable activity on generic esterase substrates including para-nitrophenyl acetate. It is specific for sugar-based substrates and will precipitate acetylxylan as a result of deacetylation. SlAXE also functions as a chitin and chitooligosaccharide de-N-acetylase with equal efficiency to its activity on xylan. SlAXE forms a dimer. Each monomer contains a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold as other CE4 esterases, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine), to carry out acid/base catalysis. SlAXE possesses a single metal center with a chemical preference for Co2+. 179
29177 200578 cd10954 CE4_CtAXE_like Catalytic NodB homology domain of Clostridium thermocellum acetylxylan esterase and its bacterial homologs. This family is represented by Clostridium thermocellum acetylxylan esterase (CtAXE, EC 3.1.1.72), a member of the carbohydrate esterase 4 (CE4) superfamily. CtAXE deacetylates O-acetylated xylan, a key component of plant cell walls. It shows no detectable activity on generic esterase substrates including para-nitrophenyl acetate. It is specific for sugar-based substrates and will precipitate acetylxylan, as a consequence of deacetylation. CtAXE is a monomeric protein containing a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold as other CE4 esterases. However, due to differences in the topography of the substrate-binding groove, the chemistry of the active center, and metal ion coordination, CtAXE has different metal ion preference and lacks activity on N-acetyl substrates. It is significantly activated by Co2+. Moreover, CtAXE displays distinctly different ligand coordination to the metal ion, utilizing an aspartate, a histidine, and four water molecules, as opposed to the conserved His-His-Asp zinc-binding triad of other CE4 esterases. 180
29178 200579 cd10955 CE4_BH0857_like Putative catalytic NodB homology domain of uncharacterized BH0857 protein from Bacillus halodurans and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase BH0857 from Bacillus halodurans. Although its biological function still remains unknown, BH0857 shows high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both BH0857 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases. 195
29179 200580 cd10956 CE4_BH1302_like Putative catalytic NodB homology domain of uncharacterized BH1302 protein from Bacillus halodurans and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase BH1302 from Bacillus halodurans. Although its biological function is unknown, BH1302 shows high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both BH1302 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases. 194
29180 200581 cd10958 CE4_NodB_like_2 Catalytic NodB homology domain of uncharacterized chitin deacetylases and hypothetical proteins. This family includes some uncharacterized chitin deacetylases and hypothetical proteins, mainly from eukaryotes. Although their biological function is unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Colletotrichum lindemuthianum chitin deacetylase (endo-chitin de-N-acetylase, ClCDA, EC 3.5.1.41), which catalyzes the hydrolysis of N-acetamido groups of N-acetyl-D-glucosamine residues in chitin, converting it to chitosan in fungal cell walls. Like ClCDA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily. 190
29181 200582 cd10959 CE4_NodB_like_3 Catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases. This family includes many uncharacterized bacterial polysaccharide deacetylases. Although their biological function still remains unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily. 187
29182 200583 cd10960 CE4_NodB_like_1 Catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases. This family includes many uncharacterized bacterial polysaccharide deacetylases. Although their biological function still remains unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily. 238
29183 200584 cd10962 CE4_GT2-like Catalytic NodB homology domain of uncharacterized bacterial glycosyl transferase, group 2-like family proteins. This family includes many uncharacterized bacterial proteins containing an N-terminal GH18 (glycosyl hydrolase, family 18) domain, a middle NodB-like homology domain, and a C-terminal GT2-like (glycosyl transferase group 2) domain. Although their biological function is unknown, members in this family contain a middle NodB homology domain that is similar to the catalytic domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily. The presence of three domains suggests that members of this family may be multifunctional. 196
29184 200585 cd10963 CE4_RC0012_like Putative catalytic NodB homology domain of uncharacterized protein RC0012 from Rickettsia conorii and its bacterial homologs. This family contains an uncharacterized protein RC0012 from Rickettsia conorii and its bacterial homologs. Although their biochemical properties remain to be determined, members in this family seem to be composed of a seven-stranded barrel with detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 182
29185 200586 cd10964 CE4_PgaB_5s N-terminal putative catalytic polysaccharide deacetylase domain of bacterial poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB, and similar proteins. This family is represented by an outer membrane lipoprotein, poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase (PgaB, EC 3.5.1.-), encoded by Escherichia coli pgaB gene from the pgaABCD (formerly ycdSRQP) operon, which affects biofilm development by promoting abiotic surface binding and intercellular adhesion. PgaB catalyzes the N-deacetylation of poly-beta-1,6-N-acetyl-D-glucosamine (PGA), a biofilm adhesin polysaccharide that stabilizes biofilms of E. coli and other bacteria. PgaB contains an N-terminal NodB homology domain with a 5-stranded beta/alpha barrel, and a C-terminal carbohydrate binding domain required for PGA N-deacetylation, which may be involved in binding to unmodified poly-beta-1,6-GlcNAc and assisting catalysis by the deacetylase domain. This family also includes several orthologs of PgaB, such as the hemin storage system HmsF protein, encoded by Yersinia pestis hmsF gene from the hmsHFRS operon, which is essential for Y. pestis biofilm formation. Like PgaB, HmsF is an outer membrane protein with an N-terminal NodB homology domain, which is likely involved in the modification of the exopolysaccharide (EPS) component of the biofilm. HmsF also has a conserved but uncharacterized C-terminal domain that is present in other HmsF-like proteins in Gram-negative bacteria. This alignment model corresponds to the N-terminal NodB homology domain. 193
29186 200587 cd10965 CE4_IcaB_5s Putative catalytic polysaccharide deacetylase domain of bacterial intercellular adhesion protein IcaB and similar proteins. The family is represented by the surface-attached protein intercellular adhesion protein IcaB (Poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase, EC 3.5.1.-), encoded by Staphylococcus epidermidis icaB gene from the icaABC gene cluster that is involved in the synthesis of polysaccharide intercellular adhesin (PIA), which is located mainly on the cell surface. IcaB is a secreted, cell wall-associated protein that plays a crucial role in exopolysaccharide modification in bacterial biofilm formation. It catalyzes the N-deacetylation of poly-beta-1,6-N-acetyl-D-glucosamine (PNAG, also referred to as PIA), a biofilm adhesin polysaccharide. IcaB shows high homology to the N-terminal NodB homology domain of Escherichia coli PgaB. At this point, they are classified in the same family. 172
29187 213024 cd10966 CE4_yadE_5s Putative catalytic polysaccharide deacetylase domain of uncharacterized protein yadE and similar proteins. This family contains an uncharacterized protein yadE from Escherichia coli and its bacterial homologs. Although its molecular function remains unknown, yadE shows high sequence similarity with the catalytic NodB homology domain of outer membrane lipoprotein PgaB and the surface-attached protein intercellular adhesion protein IcaB. Both PgaB and IcaB are essential in bacterial biofilm formation. 164
29188 200589 cd10967 CE4_GLA_like_6s Putative catalytic NodB homology domain of gellan lyase and similar proteins. This family is represented by the extracellular polysaccharide-degrading enzyme, gellan lyase (gellanase, EC 4.2.2.-), from Bacillus sp. The enzyme acts on gellan exolytically and releases a tetrasaccharide of glucuronyl-glucosyl-rhamnosyl-glucose with unsaturated glucuronic acid at the nonreducing terminus. The family also includes many uncharacterized prokaryotic polysaccharide deacetylases, which show high sequence similarity to Bacillus sp. gellan lyase. Although their biological functions remain unknown, all members of the family contain a conserved domain with a 6-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 202
29189 213025 cd10968 CE4_Mlr8448_like_5s Putative catalytic NodB homology domain of Mesorhizobium loti Mlr8448 protein and its bacterial homologs. This family contains Mesorhizobium loti Mlr8448 protein and its bacterial homologs. Although their biochemical properties are yet to be determined, members in this subfamily contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 161
29190 213026 cd10969 CE4_Ecf1_like_5s Putative catalytic NodB homology domain of a hypothetical protein Ecf1 from Escherichia coli and similar proteins. This family contains a hypothetical protein Ecf1 from Escherichia coli and its prokaryotic homologs. Although their biochemical properties remain to be determined, members in this family contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 218
29191 213027 cd10970 CE4_DAC_u1_6s Putative catalytic NodB homology domain of uncharacterized prokaryotic polysaccharide deacetylases which consist of a 6-stranded beta/alpha barrel. This family contains uncharacterized prokaryotic polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family contain a conserved domain with a 6-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 194
29192 200593 cd10971 CE4_DAC_u2_5s Putative catalytic NodB homology domain of uncharacterized prokaryotic polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains many uncharacterized prokaryotic polysaccharide deacetylases. Although their biological functions remain unknown, all members of this family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 198
29193 200594 cd10972 CE4_DAC_u3_5s Putative catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains uncharacterized bacterial polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 216
29194 213028 cd10973 CE4_DAC_u4_5s Putative catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains many uncharacterized bacterial polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily. 157
29195 200596 cd10974 CE4_CDA_like_1 Putative catalytic domain of chitin deacetylase-like proteins with additional chitin-binding peritrophin-A domain (ChBD) and/or a low-density lipoprotein receptor class A domain (LDLa). Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase 4 (CE4) superfamily. This family includes many CDA-like proteins mainly from insects, which contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. In addition to the CDA-like domain, family members contain two additional domains, a chitin-binding peritrophin-A domain (ChBD) and a low-density lipoprotein receptor class A domain (LDLa), or have the ChBD domain but do not have the LDLa domain. 269
29196 200597 cd10975 CE4_CDA_like_2 Putative catalytic domain of chitin deacetylase-like proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase 4 (CE4) superfamily. This family includes many midgut-specific CDA-like proteins mainly from insects, such as Tribolium castaneum CDAs (TcCDA6-9). These proteins contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. In addition to the CDA-like domain, some family members have an additional chitin-binding peritrophin-A domain (ChBD). 268
29197 200598 cd10976 CE4_CDA_like_3 Putative catalytic domain of uncharacterized bacterial hypothetical proteins similar to insect chitin deacetylase-like proteins. The family includes many uncharacterized bacterial hypothetical proteins that show high sequence similarity to insect chitin deacetylase-like proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. 299
29198 200599 cd10977 CE4_PuuE_SpCDA1 Catalytic domain of bacterial PuuE allantoinases, Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), and similar proteins. Allantoinase (EC 3.5.2.5) can hydrolyze allantoin ((2,5-dioxoimidazolidin-4-yl)urea), one of the most important nitrogen carriers for some plants, soil animals, and microorganisms, to allantoate. DAL1 gene from Saccharomyces cerevisiae encodes an allantoinase. However, some organisms possess allantoinase activity but lack DAL1 allantoinase. In those organisms, a defective allantoinase gene, named puuE (purine utilization E), encodes an allantoinase that specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. PuuE allantoinase is related to polysaccharide deacetylase (DCA), one member of the carbohydrate esterase 4 (CE4) superfamily, that removes N-linked or O-linked acetyl groups of cell wall polysaccharides, and lacks sequence similarity with the known DAL1 allantoinase that belongs to the amidohydrolase superfamily. PuuE allantoinase functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCAs. It appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common features of DCAs that are normally metal ion dependent and recognize multimeric substrates. This family also includes a chitin deacetylase 1 (SpCDA1) encoded by the Schizosaccharomyces pombe cda1 gene. Although the general function of chitin deacetylase (CDA) is the synthesis of chitosan from chitin, a polymer of N-acetyl glucosamine, to build up the proper ascospore wall, the actual function of SpCDA1 might involve allantoin hydrolysis. It is likely orthologous to PuuE allantoinase, whereas it is more distantly related to the CDAs found in other fungi, such as Saccharomyces cerevisiae and Mucor rouxii. Those CDAs are similar to rhizobial NodB protein and are not included in this family. 273
29199 200600 cd10978 CE4_Sll1306_like Putative catalytic domain of Synechocystis sp. Sll1306 protein and other bacterial homologs. The family contains Synechocystis sp. Sll1306 protein and uncharacterized bacterial polysaccharide deacetylases. Although their biological function remains unknown, they show very high sequence homology to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases. PuuE allantoinase specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. It functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. PuuE allantoinase appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common feature of polysaccharide deacetylases that are normally metal ion dependent and recognize multimeric substrates. 271
29200 200601 cd10979 CE4_PuuE_like Putative catalytic domain of uncharacterized prokaryotic polysaccharide deacetylases similar to bacterial PuuE allantoinases. The family includes a group of uncharacterized prokaryotic polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases. PuuE allantoinase specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. It functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. PuuE allantoinase appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common feature of DCAs which are normally metal ion dependent and recognize multimeric substrates. 281
29201 200602 cd10980 CE4_SpCDA1 Putative catalytic domain of Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), and similar proteins. This family is represented by Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), encoded by the cda1 gene. The general function of chitin deacetylase (CDA) is the synthesis of chitosan from chitin, a polymer of N-acetyl glucosamine, to build up the proper ascospore wall. The actual function of SpCDA1 might be involved in allantoin hydrolysis. It is likely an ortholog of bacterial PuuE allantoinase, whereas it is more distantly related to the CDAs found in other fungi, such as Saccharomyces cerevisiae and Mucor rouxii. Those CDAs are similar to rhizobial NodB protein and are not included in this family. 297
29202 211380 cd10981 ZnPC_S1P1 Zinc dependent phospholipase C/S1-P1 nuclease. This model describes both the bacterial and archaeal zinc-dependent phospholipase C, a domain found in the alpha toxin of Clostridium perfringens, as well as S1/P1 nucleases, which predominantly act on single-stranded DNA and RNA. 238
29203 199826 cd10985 MH2_SMAD_2_3 C-terminal Mad Homology 2 (MH2) domain in SMAD2 and SMAD3. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. SMAD2 and SMAD3 are receptor regulated SMADs (R-SMADs). SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta. 191
29204 199911 cd11005 M35_like Peptidase M35 family. Family M35 Zn2+-metallopeptidase domain, also known as the deuterolysin family, contains fungal as well as bacterial metalloendopeptidases that include deuterolysin (EC 3.4.24.39), peptidyl-Lys metalloendopeptidase (MEP), penicillolysin, as well as uncharacterized sequences. Typically, members of this family of extracellular peptidases contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG motif C-terminal to the His zinc ligands. Deuterolysins are highly active towards basic nuclear proteins such as histones and protamines, with a preference for a Lys or Arg residue in the P1' subsite. MEPs specifically cleave peptidyl-lysine bonds (-X-Lys-) in proteins and peptides. Penicillolysin, a thermolabile protease from Penicillium citrinum, strongly hydrolyzes nuclear proteins such as clupeine, salmine and histone. Many members of the M35 peptidases display unusual thermostabilities. 167
29205 199912 cd11006 M35_peptidyl-Lys_like Peptidase M35 domain of peptidyl-Lys metalloendopeptidases and related proteins. This family M35 Zn2+-metallopeptidase extracellular domain is mostly found in proteins characterized as peptidyl-Lys metalloendopeptidases (MEP; peptidyllysine metalloproteinase; EC 3.4.24.20), including some well-characterized domains in Aeromonas salmonicida subsp. achromogenes (AsaP1) and Grifola frondosa (GfMEP). These proteins specifically cleave peptidyl-lysine bonds (-X-Lys- where X may even be Pro) in proteins and peptides. AsaP1 peptidase has been shown to be important in the virulence of A. salmonicida subsp. achromogenes, having a major role in the fish innate immune response. Members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands. 163
29206 199913 cd11007 M35_like_1 Peptidase M35-like domain of uncharacterized proteins. This family contains proteins similar to the M35 Zn2+-metallopeptidases, also known as the deuterolysin family, presumably these are bacterial metalloendopeptidases that have yet to be characterized. Typically, members of this family of extracellular peptidases contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand; however, members of this family do not contain the GTXDXXYG motif C-terminal to the His zinc ligands that is typical for the M35 proteases. Deuterolysins are highly active towards basic nuclear proteins such as histones and protamines, with a preference for a Lys or Arg residue in the P1' subsite. MEPs specifically cleave peptidyl-lysine bonds (-X-Lys-) in proteins and peptides. Many members of the M35 peptidases display unusual thermostabilities. 183
29207 199914 cd11008 M35_deuterolysin_like Peptidase M35 domain of deuterolysins and related proteins. This family M35 Zn2+-metallopeptidase extracellular domain is found in fungal deuterolysins (acid metalloproteinase, neutral proteinase II), including some well-characterized metallopeptidase domains in Aspergillus oryzae (NpII), Aspergillus fumigatus (MEP20), Penicillium roqueforti (protease II) and Emericella nidulans (PepJ peptidase). The neutral proteinase II from Aspergillus oryzae (NpII) unfolds reversibly upon incubation at higher temperatures, and loss in activity is mainly due to autoproteolysis. MEP20 is encoded by the mepB gene, which appears to be associated with the cytoplasmic degradation of small peptides. PepJ peptidase is a thermostable enzyme released under carbon starvation. Most members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands. The aspzincin motif is poorly conserved in one subgroup, that includes Asp f2, a major allergen from Aspergillus fumigatus. This subgroup in addition lacks the key conserved Tyr residue which acts as a proton donor during catalysis, and no protease activity has been detected to date for Asp f2. 167
29208 211381 cd11009 Zn_dep_PLPC Zinc dependent phospholipase C (alpha toxin). This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium. 218
29209 211382 cd11010 S1-P1_nuclease S1/P1 nucleases and related enzymes. This family summarizes both S1 and P1 nucleases (EC:3.1.30.1) which cleave RNA and single stranded DNA with no base specificity. S1 nuclease is more active on DNA than RNA. Its reaction products are oligonucleotides or single nucleotides with 5' phosphoryl groups. Although its primary substrate is single-stranded, it may also introduce single-stranded breaks in double-stranded DNA or RNA, or DNA-RNA hybrids. It is used as a reagent in nuclease protection assays and in removing single stranded tails from DNA molecules to create blunt ended molecules and opening hairpin loops generated during synthesis of double stranded cDNA. P1 nuclease cleaves its substrate at every position yielding nucleoside 5' monophosphates, and it does not recognize or act on double-stranded DNA. It is useful at removing single stranded strands hanging off the end of double stranded DNA and at completely cleaving melted DNA for simple DNA composition analysis. 249
29210 259898 cd11012 CuRO_6_ceruloplasmin The sixth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domains 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the sixth cupredoxin domain of ceruloplasmin. 145
29211 259899 cd11013 Plantacyanin Plantacyanin is a subclass of phytocyanins, plant type I copper proteins. Plantacyanins belong to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Plantacyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The exact function of plantacyanin is unknown. However plantacyanin is shown to play a role in reproduction in Arabidopsis. Plantacyanins may also be stress-related proteins and be involved in plant defense responses. 95
29212 259900 cd11014 Mavicyanin Mavicyanin is a subclass of phytocyanins, a plant blue copper protein. Mavicyanin is a glycosylated protein isolated from Cucurbita pepo medullosa (zucchini) peelings. It belongs to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Mavicyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The copper is tetrahedrally coordinated by a cysteine, 2 histidines, and a glutamine residue, like in the case of stellacyanin. The biological roles of mavicyanin have not been elucidated yet. 101
29213 259901 cd11015 CuRO_2_FVIII_like The second cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 2 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins. 134
29214 259902 cd11016 CuRO_4_FVIII_like The fourth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 4 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins. 143
29215 259903 cd11017 Phytocyanin_like_1 A subclass of phytocyanins, plant blue or type I copper proteins. Phytocyanins are plant blue or type I copper proteins. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Phytocyanins are classified into four groups: stellacyanin, plantacyanin, uclacyanin and early nodulin groups. Members of this unknown subgroup appear to have lost the T1 copper binding site. 99
29216 259904 cd11018 CuRO_6_FVIII_like The sixth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 6 of unprocessed Factor VIII or the second cupredoxin domain of the light chain of circulating Factor VIII, and similar proteins. 144
29217 259905 cd11019 OsENODL1_like Early nodulin-like protein (OsENODL1) and similar proteins. This family includes early nodulin-like protein (OsENODL1) from Oryza sativa and similar proteins. It belongs to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Phytocyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. OsENODL1 expression occurs specifically at the late developmental stage of the seeds. Members of this subgroup appear to have lost the T1 copper binding site. 103
29218 259906 cd11020 CuRO_1_CuNIR Cupredoxin domain 1 of Copper-containing nitrite reductase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center, which serves as the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis. 119
29219 259907 cd11021 CuRO_2_ceruloplasmin The second cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the second cupredoxin domain of ceruloplasmin. 141
29220 259908 cd11022 CuRO_4_ceruloplasmin The fourth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the fourth cupredoxin domain of ceruloplasmin. 144
29221 259909 cd11023 CuRO_2_ceruloplasmin_like_2 cupredoxin domain of ceruloplasmin homologs. Uncharacterized subfamily of ceruloplasmin homologous proteins. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. Ceruloplasmin also functions in copper transport, amine oxidase and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first domain of the triplicated units. 118
29222 259910 cd11024 CuRO_1_2DMCO_NIR_like The cupredoxin domain 1 of a two-domain laccase related to nitrite reductase. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles the two domain nitrite reductase in both sequence and structure. It consists of two cupredoxin domains and forms trimers and hence resembles the quaternary structure of nitrite reductases more than that of large laccases. There are three trinuclear copper clusters in the enzyme localized between domains 1 and 2 of each pair of neighbor chains. Three copper ions of type 1 lie close to one another near the surface of the central part of the trimer, and, effectively, a trimeric substrate binding site is formed in their vicinity. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic, notably phenolic, and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities. 119
29223 410652 cd11026 CYP2 cytochrome P450 family 2. The cytochrome P450 family 2 (CYP2 or Cyp2) is one of the largest, most diverse CYP families in vertebrates. It includes many subfamilies across vertebrate species but not all subfamilies are found in multiple vertebrate taxonomic classes. The CYP2U and CYP2R genes are present in the vertebrate ancestor and are shared across all vertebrate classes, whereas some subfamilies are lineage-specific, such as CYP2B and CYP2S in mammals. CYP2 enzymes play important roles in drug metabolism. The CYP2 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
29224 410653 cd11027 CYP17A1-like cytochrome P450 family 17, subfamily A, polypeptide 1, and similar cytochrome P450s. This subfamily contains cytochrome P450 17A1 (CYP17A1 or Cyp17a1), cytochrome P450 21 (CYP21 or Cyp21) and similar proteins. CYP17A1, also called cytochrome P450c17, steroid 17-alpha-hydroxylase (EC 1.14.14.19)/17,20 lyase (EC 1.14.14.32), or 17-alpha-hydroxyprogesterone aldolase, catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products and subsequently to dehydroepiandrosterone (DHEA) and androstenedione; it catalyzes both the 17-alpha-hydroxylation and the 17,20-lyase reaction. This subfamily also contains CYP21, also called steroid 21-hydroxylase (EC 1.14.14.16) or cytochrome P-450c21 or CYP21A2, which catalyzes the 21-hydroxylation of steroids and is required for the adrenal synthesis of mineralocorticoids and glucocorticoids. The CYP17A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 428
29225 410654 cd11028 CYP1 cytochrome P450 family 1. The cytochrome P450 family 1 (CYP1 or Cyp1) is composed of three functional human members: CYP1A1, CYP1A2 and CYP1B1, which are regulated by the aryl hydrocarbon receptor (AhR), a ligand-activated transcriptional factor that dimerizes with AhR nuclear translocator (ARNT). CYP1 enzymes are involved in the metabolism of endogenous hormones, xenobiotics, and drugs. Included in the CYP1 family is CYP1D1 (cytochrome P450 family 1, subfamily D, polypeptide 1), which is not expressed in humans as its gene is pseudogenized due to five nonsense mutations in the putative coding region, but is functional in other organisms including cynomolgus monkey. Zebrafish CYP1D1 expression is not regulated by AhR. The CYP1 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 430
29226 410655 cd11029 CYP107-like cytochrome P450 family 107 and similar cytochrome P450s. This group contains bacterial cytochrome P450s from families 107 (CYP107), 154 (CYP154), 197 (CYP197), and similar proteins. Among the members of this group are: Pseudonocardia autotrophica vitamin D(3) 25-hydroxylase (also known as CYP197A; EC 1.14.15.15) that catalyzes the hydroxylation of vitamin D(3) into 25-hydroxyvitamin D(3) and 1-alpha,25-dihydroxyvitamin D(3), its physiologically active forms; Saccharopolyspora erythraea CYP107A1, also called P450eryF or 6-deoxyerythronolide B hydroxylase (EC 1.14.15.35), that catalyzes the conversion of 6-deoxyerythronolide B (6-DEB) to erythronolide B (EB) by the insertion of an oxygen at the 6S position of 6-DEB; Bacillus megaterium CYP107DY1 that displays C6-hydroxylation activity towards mevastatin to produce pravastatin; Streptomyces coelicolor CYP154C1 that shows activity towards 12- and 14-membered ring macrolactones in vitro and may be involved in catalyzing the site-specific oxidation of the precursors to macrolide antibiotics, which introduces regiochemical diversity into the macrolide ring system; and Nocardia farcinica CYP154C5 that acts on steroids with regioselectivity and stereoselectivity, converting various pregnans and androstans to yield 16 alpha-hydroxylated steroid products. Bacillus subtilis CYP107H1 is not included in this group. The CYP107-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 384
29227 410656 cd11030 CYP105-like cytochrome P450 family 105 and similar cytochrome P450s. This group predominantly contains bacterial cytochrome P450s, including those belonging to families 105 (CYP105) and 165 (CYP165). Also included in this group are fungal family 55 proteins (CYP55). CYP105s are predominantly found in bacteria belonging to the phylum Actinobacteria and the order Actinomycetales, and are associated with a wide variety of pathways and processes, from steroid biotransformation to production of macrolide metabolites. CYP105A1 catalyzes two sequential hydroxylations of vitamin D3 with differing specificity and cytochrome P450-SOY (also known as CYP105D1) has been shown to be capable of both oxidation and dealkylation reactions. CYP105D6 and CYP105P1, from the filipin biosynthetic pathway, perform highly regio- and stereospecific hydroxylations. Other members of this group include, but are not limited to: CYP165D3 (also called OxyE) from the teicoplanin biosynthetic gene cluster of Actinoplanes teichomyceticus, which is responsible for the phenolic coupling of the aromatic side chains of the first and third peptide residues in the teicoplanin peptide; Micromonospora griseorubida cytochrome P450 MycCI that catalyzes hydroxylation at the C21 methyl group of mycinamicin VIII, the earliest macrolide form in the postpolyketide synthase tailoring pathway; and Fusarium oxysporum CYP55A1 (also called nitric oxide reductase cytochrome P450nor) that catalyzes an unusual reaction, the direct electron transfer from NAD(P)H to bound heme. The CYP105-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 381
29228 410657 cd11031 Cyp158A-like cytochrome P450 family 158, subfamily A and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Streptomyces coelicolor CYP158A1 and CYP158A2, Streptomyces natalensis PimD (also known as CYP107E), Mycobacterium tuberculosis CYP121, and Micromonospora griseorubida MycG (also known as CYP107B). CYP158A1 and CYP158A2 catalyze an unusual oxidative C-C coupling reaction to polymerize flaviolin and form highly conjugated pigments; CYP158A2 produces three isomers of biflaviolin and one triflaviolin while CYP158A1 produces only two isomers of biflaviolin. PimD is a cytochrome P450 monooxygenase with native epoxidase activity that is critical in the biosynthesis of the polyene macrolide antibiotic pimaricin. CYP121 is essential for the viability of M. tuberculosis and is a novel drug target for the inhibition of mycobacterial growth. MycG catalyzes both hydroxylation and epoxidation reactions in the biosynthesis of the 16-membered ring macrolide antibiotic mycinamicin II. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 380
29229 410658 cd11032 P450_EryK-like cytochrome P450 EryK and similar cytochrome P450s. This subfamily contains archaeal and bacterial CYPs including Saccharopolyspora erythraea P450 EryK, Saccharolobus solfataricus cytochrome P450 119 (CYP119), Picrophilus torridus CYP231A2, Bacillus subtilis CYP109, Streptomyces himastatinicus HmtT and HmtN, and Bacillus megaterium CYP106A2, among others. EryK, also called erythromycin C-12 hydroxylase, is active during the final steps of erythromycin A (ErA) biosynthesis. CYP106A2 catalyzes the hydroxylation of a variety of 3-oxo-delta(4)-steroids such as progesterone and deoxycorticosterone, mainly in the 15beta-position. It is also capable of hydroxylating a variety of terpenoids. HmtT and HmtN are involved in the post-tailoring of the cyclohexadepsipeptide backbone during the biosynthesis of the himastatin antibiotic. The EryK-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 368
29230 410659 cd11033 CYP142-like cytochrome P450 family 142 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Streptomyces sp. P450sky (also called CYP163B3), Sphingopyxis macrogoltabida P450pyr hydroxylase, Novosphingobium aromaticivorans CYP108D1, Pseudomonas sp. cytochrome P450-Terp (P450terp), and Amycolatopsis balhimycina P450 OxyD, as well as several Mycobacterium proteins CYP124, CYP125, CYP126, and CYP142. P450sky is involved in the hydroxylation of three beta-hydroxylated amino acid precursors required for the biosynthesis of the cyclic depsipeptide skyllamycin. P450pyr hydroxylase is an active and selective catalyst for the regio- and stereo-selective hydroxylation at non-activated carbon atoms with a broad substrate range. P450terp catalyzes the hydroxylation of alpha-terpineol as part of its catabolic assimilation. OxyD is involved in beta-hydroxytyrosine formation during vancomycin biosynthesis. CYP124 is a methyl-branched lipid omega-hydroxylase while CYP142 is a cholesterol 27-oxidase with likely roles in host response modulation and cholesterol metabolism. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 378
29231 410660 cd11034 P450cin-like P450cin and similar cytochrome P450s. This group is composed of Citrobacter braakii cytochrome P450cin (P450cin, also called CYP176A1) and similar proteins. P450cin is a bacterial P450 enzyme that catalyzes the enantiospecific hydroxylation of 1,8-cineole to (1R)-6beta-hydroxycineole; its natural reduction-oxidation partner is cindoxin. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 361
29232 410661 cd11035 P450cam-like P450cam and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Pseudomonas putida P450cam and Cyp101 proteins from Novosphingobium aromaticivorans such as CYP101C1 and CYP101D2. P450cam catalyzes the hydroxylation of camphor in a process that involves two electron transfers from the iron-sulfur protein, putidaredoxin. CYP101D2 is capable of oxidizing camphor while CYP101C1 does not bind camphor but is capable of binding and hydroxylating ionone derivatives such as alpha- and beta-ionone and beta-damascone. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 359
29233 410662 cd11036 AknT-like AknT-like proteins. This family is composed of proteins similar to Streptomyces biosynthesis proteins including anthracycline biosynthesis proteins DnrQ and AknT, and macrolide antibiotic biosynthesis proteins TylM3 and DesVIII. Streptomyces peucetius DnrQ is involved in the biosynthesis of carminomycin and daunorubicin (daunomycin) while Streptomyces galilaeus AknT functions in the biosynthesis of aclacinomycin A. Streptomyces fradiae TylM3 is involved in the biosynthesis of tylosin derived from the polyketide lactone tylactone, and Streptomyces venezuelae DesVIII functions in the biosynthesis of methymycin, neomethymycin, narbomycin, and pikromycin. These proteins are required for the glycosylation of specific substrates during the biosynthesis of specific anthracyclines and macrolide antibiotics. Although members of this family belong to the large cytochrome P450 (P450, CYP) superfamily and show significant similarity to cytochrome P450s, they lack heme-binding sites and are not functional cytochromes. 340
29234 410663 cd11037 CYP199A2-like cytochrome P450 family 199, subfamily A, polypeptide 2 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Rhodopseudomonas palustris CYP199A2 and CYP199A4. CYP199A2 catalyzes the oxidation of aromatic carboxylic acids including indole-2-carboxylic acid, 2-naphthoic acid and 4-ethylbenzoic acid. CYP199A4 catalyzes the hydroxylation of para-substituted benzoic acids. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 371
29235 410664 cd11038 CYP_AurH-like cytochrome P450 AurH and similar cytochrome P450s. This group includes Streptomyces thioluteus P450 monooxygenase AurH which is uniquely capable of forming a homochiral tetrahydrofuran ring, a vital component of the polyketide antibiotic aureothin. AurH catalyzes an unprecedented tandem oxygenation process: first, it catalyzes an asymmetric hydroxylation of deoxyaureothin to yield (7R)-7-hydroxydeoxyaureothin as an intermediate; and second, it mediates another C-O bond formation that leads to O-heterocyclization. The AurH-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 382
29236 410665 cd11039 P450-pinF2-like P450-pinF2 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Agrobacterium tumefaciens P450-pinF2, whose expression is induced by the presence of wounded plant tissue and by plant phenolic compounds such as acetosyringone. P450-pinF2 may be involved in the detoxification of plant protective agents at the site of wounding. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 372
29237 410666 cd11040 CYP7_CYP8-like cytochrome P450s similar to cytochrome P450 family 7, subfamily A, polypeptide 1, cytochrome P450 family 7, subfamily B, polypeptide 1, cytochrome P450 family 8, subfamily A, polypeptide 1. This family is composed of cytochrome P450s (CYPs) with similarity to the human P450s CYP7A1, CYP7B1, CYP8B1, CYP39A1 and prostacyclin synthase (CYP8A1). CYP7A1, CYP7B1, CYP8B1, and CYP39A1 are involved in the catabolism of cholesterol to bile acids (BAs) in two major pathways. CYP7A1 (cholesterol 7alpha-hydroxylase) and CYP8B1 (sterol 12-alpha-hydroxylase) function in the classic (or neutral) pathway, which leads to two bile acids: cholic acid (CA) and chenodeoxycholic acid (CDCA). CYP7B1 and CYP39A1 are 7-alpha-hydroxylases involved in the alternative (or acidic) pathway, which leads mainly to the formation of CDCA. Prostacyclin synthase (CYP8A1) catalyzes the isomerization of prostaglandin H2 to prostacyclin (or prostaglandin I2), a potent mediator of vasodilation and anti-platelet aggregation. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
29238 410667 cd11041 CYP503A1-like cytochrome P450 family 503, subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of predominantly fungal cytochrome P450s (CYPs) with similarity to Fusarium fujikuroi Cytochrome P450 503A1 (CYP503A1, also called ent-kaurene oxidase or cytochrome P450-4), Aspergillus nidulans austinol synthesis protein I (ausI), Alternaria alternata tentoxin synthesis protein 1 (TES1), and Acanthamoeba polyphaga mimivirus cytochrome P450 51 (CYP51, also called P450-LIA1 or sterol 14-alpha demethylase). Ent-kaurene oxidase catalyzes three successive oxidations of the 4-methyl group of ent-kaurene to form kaurenoic acid, an intermediate in gibberellin biosynthesis. AusI and TES1 are cytochrome P450 monooxygenases that mediate the biosynthesis of the meroterpenoids, austinol and dehydroaustinol, and the phytotoxin tentoxin, respectively. P450-LIA1 catalyzes the 14-alpha demethylation of obtusifoliol and functions in steroid biosynthesis. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 441
29239 410668 cd11042 CYP51-like cytochrome P450 family 51 and similar cytochrome P450s. This family is composed of cytochrome P450 51 (CYP51 or sterol 14alpha-demethylase) and related cytochrome P450s. CYP51 is the only cytochrome P450 enzyme with a conserved function across animals, fungi, and plants, in the synthesis of essential sterols. In mammals, it is expressed in many different tissues, with highest expression in testis, ovary, adrenal gland, prostate, liver, kidney, and lung. In fungi, CYP51 is a significant drug target for treatment of human protozoan infections. In plants, it functions within a specialized defense-related metabolic pathway. CYP51 is also found in several bacterial species. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 416
29240 410669 cd11043 CYP90-like plant cytochrome P450s similar to cytochrome P450 family 90, subfamily A, polypeptide 1, cytochrome P450 family 90, subfamily B, polypeptide 1, and cytochrome P450 family 90, subfamily D, polypeptide 2. This family is composed of plant cytochrome P450s including: Arabidopsis thaliana cytochrome P450s 85A1 (CYP85A1 or brassinosteroid-6-oxidase 1), 90A1 (CYP90A1), 88A3 (CYP88A3 or ent-kaurenoic acid oxidase 1), 90B1 (CYP90B1 or Dwarf4 or steroid 22-alpha-hydroxylase), and 90C1 (CYP90C1 or 3-epi-6-deoxocathasterone 23-monooxygenase); Oryza sativa cytochrome P450s 90D2 (CYP90D2 or C6-oxidase), 87A3 (CYP87A3), and 724B1 (CYP724B1 or dwarf protein 11); and Taxus cuspidata cytochrome P450 725A2 (CYP725A2 or taxane 13-alpha-hydroxylase). These enzymes are monooxygenases that catalyze oxidation reactions involved in steroid or hormone biosynthesis. CYP85A1, CYP90D2, and CYP90C1 are involved in brassinosteroids biosynthesis, while CYP88A3 catalyzes three successive oxidations of ent-kaurenoic acid, which is a key step in the synthesis of gibberellins. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 408
29241 410670 cd11044 CYP120A1_CYP26-like cyanobacterial cytochrome P450 family 120, subfamily A, polypeptide 1 (CYP120A1), vertebrate cytochrome P450 family 26 enzymes, and similar cytochrome P450s. This family includes cyanobacterial CYP120A1 and vertebrate cytochrome P450s 26A1 (CYP26A1), 26B1 (CYP26B1), and 26C1 (CYP26C1). These are retinoic acid-metabolizing cytochromes that play key roles in retinoic acid (RA) metabolism. Human and zebrafish CYP26a1, as well as Synechocystis CYP120A1 are characterized as RA hydroxylases. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 420
29242 410671 cd11045 CYP136-like putative cytochrome P450 family 136 and similar cytochrome P450s. This group is composed of Mycobacterium tuberculosis putative cytochrome P450 136 (CYP136) and similar proteins. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 407
29243 410672 cd11046 CYP97 cytochrome P450 family/clan 97. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). Members of the CYP97 clan include Arabidopsis thaliana cytochrome P450s 97A3 (CYP97A3), CYP97B3, and CYP97C1. CYP97A3 is also called protein LUTEIN DEFICIENT 5 (LUT5) and CYP97C1 is also called carotene epsilon-monooxygenase or protein LUTEIN DEFICIENT 1 (LUT1). These cytochromes function as beta- and epsilon-ring carotenoid hydroxylases and are involved in the biosynthesis of xanthophylls. CYP97 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 441
29244 410673 cd11049 CYP170A1-like cytochrome P450 family 170, subfamily A, polypeptide 1-like actinobacterial cytochrome P450s. This subfamily is composed of Streptomyces coelicolor cytochrome P450 170A1 (CYP170A1), Streptomyces avermitilis pentalenene oxygenase, and similar actinobacterial cytochrome P450s. CYP170A1, also called epi-isozizaene 5-monooxygenase (EC 1.14.13.106)/(E)-beta-farnesene synthase (EC 4.2.3.47), catalyzes the two-step allylic oxidation of epi-isozizaene to albaflavenone, which is a sesquiterpenoid antibiotic. Pentalenene oxygenase (EC 1.14.15.32) catalyzes the conversion of pentalenene to pentalen-13-al by stepwise oxidation via pentalen-13-ol, a precursor of the neopentalenolactone antibiotic. The CYP170A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 415
29245 410674 cd11051 CYP59-like cytochrome P450 family 59 and similar cytochrome P450s. This family is composed of Aspergillus nidulans cytochrome P450 59 (CYP59), also called sterigmatocystin biosynthesis P450 monooxygenase stcS, and similar fungal proteins. CYP59 is required for the conversion of versicolorin A to sterigmatocystin. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 403
29246 410675 cd11052 CYP72_clan Plant cytochrome P450s, clan CYP72. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). The CYP72 clan is associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. This clan includes: CYP734 enzymes that are involved in brassinosteroid (BRs) catabolism and regulation of BRs homeostasis; CYP714 enzymes that are involved in the biosynthesis of gibberellins (GAs) and the mechanism to control their bioactive endogenous levels; and CYP72 family enzymes, among others. The CYP72 clan belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 427
29247 410676 cd11053 CYP110-like cytochrome P450 family 110 and similar cytochrome P450s. This group is composed of mostly uncharacterized proteins, including Nostoc sp. probable cytochrome P450 110 (CYP110) and putative cytochrome P450s 139 (CYP139), 138 (CYP138), and 135B1 (CYP135B1) from Mycobacterium bovis. CYP110 genes, unique to cyanobacteria, are widely distributed in heterocyst-forming cyanobacteria including nitrogen-fixing genera Nostoc and Anabaena. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 415
29248 410677 cd11054 CYP24A1-like cytochrome P450 family 24 subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of vertebrate cytochrome P450 24A1 (CYP24A1) and similar proteins including several Drosophila proteins such as CYP315A1 (also called protein shadow) and CYP314A1 (also called ecdysone 20-monooxygenase), and vertebrate CYP11 and CYP27 subfamilies. Both CYP314A1 and CYP315A1, which has ecdysteroid C2-hydroxylase activity, are involved in the metabolism of insect hormones. CYP24A1 and CYP27B1 have roles in calcium homeostasis and metabolism, and the regulation of vitamin D. CYP24A1 catabolizes calcitriol (1,25(OH)2D), the physiologically active vitamin D hormone, by catalyzing its hydroxylation, while CYP27B1 is a calcidiol 1-monooxygenase that converts 25-hydroxyvitamin D3 to calcitriol. The CYP24A1-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
29249 410678 cd11055 CYP3A-like cytochrome P450 family 3, subfamily A and similar cytochrome P450s. This family includes vertebrate CYP3A subfamily enzymes and CYP5A1, and similar proteins. CYP5A1, also called thromboxane-A synthase, converts prostaglandin H2 into thromboxane A2, a biologically active metabolite of arachidonic acid. CYP3A enzymes are drug-metabolizing enzymes embedded in the endoplasmic reticulum, where they can catalyze a wide variety of biochemical reactions including hydroxylation, N-demethylation, O-dealkylation, S-oxidation, deamination, or epoxidation of substrates. The CYP3A-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 422
29250 410679 cd11056 CYP6-like cytochrome P450 family 6 and similar cytochrome P450s. This family is composed of cytochrome P450s from insects and crustaceans, including the CYP6, CYP9 and CYP310 subfamilies, which are involved in the metabolism of insect hormones and xenobiotic detoxification. The CYP6-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 429
29251 410680 cd11057 CYP313-like cytochrome P450 family 313 and similar cytochrome P450s. This subfamily is composed of insect cytochrome P450s from families 313 (CYP313) and 318 (CYP318), and similar proteins. These proteins may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. Their specific function is yet unknown. They belong to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 427
29252 410681 cd11058 CYP60B-like cytochrome P450 family 60, subfamily B and similar cytochrome P450s. This family is composed of fungal cytochrome P450s including: Aspergillus nidulans cytochrome P450 60B (CYP60B), also called versicolorin B desaturase, which catalyzes the conversion of versicolorin B to versicolorin A during sterigmatocystin biosynthesis; Fusarium sporotrichioides cytochrome P450 65A1 (CYP65A1), also called isotrichodermin C-15 hydroxylase, which catalyzes the hydroxylation at C-15 of isotrichodermin in trichothecene biosynthesis; and Penicillium aethiopicum P450 monooxygenase vrtK, also called viridicatumtoxin synthesis protein K, which catalyzes the spirocyclization of the geranyl moiety of previridicatumtoxin to produce viridicatumtoxin, a tetracycline-like fungal meroterpenoid. The CYP60B-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 419
29253 410682 cd11059 CYP_fungal unknown subfamily of fungal cytochrome P450s. This subfamily is composed of uncharacterized fungal cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 422
29254 410683 cd11060 CYP57A1-like cytochrome P450 family 57, subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of fungal cytochrome P450s including: Nectria haematococca cytochrome P450 57A1 (CYP57A1), also called pisatin demethylase, which detoxifies the phytoalexin pisatin; Penicillium aethiopicum P450 monooxygenase gsfF, also called griseofulvin synthesis protein F, which catalyzes the coupling of orcinol and phloroglucinol rings in griseophenone B to form desmethyl-dehydrogriseofulvin A during the biosynthesis of griseofulvin, a spirocyclic fungal natural product used to treat dermatophyte infections; and Penicillium aethiopicum P450 monooxygenase vrtE, also called viridicatumtoxin synthesis protein E, which catalyzes hydroxylation at C5 of the polyketide backbone during the biosynthesis of viridicatumtoxin, a tetracycline-like fungal meroterpenoid. The CYP57A1-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
29255 410684 cd11061 CYP67-like cytochrome P450 family 67 and similar cytochrome P450s. This subfamily includes Uromyces viciae-fabae cytochrome P450 67 (CYP67), also called planta-induced rust protein 16, Cystobasidium minutum (Rhodotorula minuta) cytochrome P450rm, and other fungal cytochrome P450s. P450rm catalyzes the formation of isobutene and 4-hydroxylation of benzoate. The gene encoding CYP67 is a planta-induced gene that is expressed in haustoria and rust-infected leaves. The CYP67-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 418
29256 410685 cd11062 CYP58-like cytochrome P450 family 58-like fungal cytochrome P450s. This group includes Fusarium sporotrichioides cytochrome P450 58 (CYP58, also known as Tri4 and trichodiene oxygenase), and similar fungal proteins. CYP58 catalyzes the oxygenation of trichodiene during the biosynthesis of trichothecenes, which are sesquiterpenoid toxins that act by inhibiting protein biosynthesis. The CYP58-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
29257 410686 cd11063 CYP52 cytochrome P450 family 52. Cytochrome P450 52 (CYP52), also called P450ALK, monooxygenases catalyze the first hydroxylation step in the assimilation of alkanes and fatty acids by filamentous fungi. The number of CYP52 proteins depends on the fungal species: for example, Candida tropicalis has seven, Candida maltosa has eight, and Yarrowia lipolytica has twelve. The CYP52 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 419
29258 410687 cd11064 CYP86A cytochrome P450 family 86, subfamily A. This subfamily includes several Arabidopsis thaliana cytochrome P450s (CYP86A1, CYP86A2, CYP86A4, among others), Petunia x hybrida CYP86A22, and Vicia sativa CYP94A1 and CYP94A2. They are P450-dependent fatty acid omega-hydroxylases that catalyze the omega-hydroxylation of various fatty acids. CYP86A2 acts on saturated and unsaturated fatty acids with chain lengths from C12 to C18; CYP86A22 prefers substrates with chain lengths of C16 and C18; and CYP94A1 acts on various fatty acids from 10 to 18 carbons. They play roles in the biosynthesis of extracellular lipids, cutin synthesis, and plant defense. The CYP86A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
29259 410688 cd11065 CYP64-like cytochrome P450 family 64-like fungal cytochrome P450s. This group includes Aspergillus flavus cytochrome P450 64 (CYP64), also called O-methylsterigmatocystin (OMST) oxidoreductase or aflatoxin B synthase or aflatoxin biosynthesis protein Q, and similar fungal cytochrome P450s. CYP64 converts OMST to aflatoxin B1 and converts dihydro-O-methylsterigmatocystin (DHOMST) to aflatoxin B2 in the aflatoxin biosynthesis pathway. The CYP64-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
29260 410689 cd11066 CYP_PhacA-like fungal cytochrome P450s similar to Aspergillus nidulans phenylacetate 2-hydroxylase. This group includes Aspergillus nidulans phenylacetate 2-hydroxylase (encoded by the phacA gene) and similar fungal cytochrome P450s. PhacA catalyzes the ortho-hydroxylation of phenylacetate, the first step of A. nidulans phenylacetate catabolism. The PhacA-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 434
29261 410690 cd11067 CYP152 cytochrome P450 family 152, also called fatty acid hydroxylases or P450 peroxygenases. The cytochrome P450 152 (CYP152) family enzymes act as peroxygenases, converting fatty acids through oxidative decarboxylation, yielding terminal alkenes, and via alpha- and beta-hydroxylation to yield hydroxy-fatty acids. Included in this family are Bacillus subtilis CYP152A1, also called cytochrome P450BsBeta, that catalyzes the alpha- and beta-hydroxylation of long-chain fatty acids such as myristic acid in the presence of hydrogen peroxide, and Sphingomonas paucimobilis CYP152B1, also called cytochrome P450(SPalpha), that hydroxylates fatty acids with high alpha-regioselectivity. The CYP152 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 400
29262 410691 cd11068 CYP120A1 cytochrome P450 family 102, subfamily A, polypeptide 1, also called bifunctional cytochrome P450/NADPH--P450 reductase. Cytochrome P450 102A1, also called cytochrome P450(BM-3) or P450BM-3, is a bifunctional cytochrome P450/NADPH--P450 reductase. These proteins fuse an N-terminal cytochrome P450 with a C-terminal cytochrome P450 reductase (CYPOR). It functions as a fatty acid monooxygenase, catalyzing the hydroxylation of fatty acids at omega-1, omega-2 and omega-3 positions, with activity towards fatty acids with a chain length of 9-18 carbons. Its NADPH-dependent reductase activity (via the C-terminal domain) allows electron transfer from NADPH to the heme iron of the N-terminal cytochrome P450. CYP102A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 430
29263 410692 cd11069 CYP_FUM15-like Fusarium verticillioides cytochrome P450 monooxygenase FUM15, and similar cytochrome P450s. Fusarium verticillioides cytochrome P450 monooxygenase FUM15, is also called fumonisin biosynthesis cluster protein 15. The FUM15 gene is part of the gene cluster that mediates the biosynthesis of fumonisins B1, B2, B3, and B4, which are carcinogenic mycotoxins. This FUM15-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 437
29264 410693 cd11070 CYP56-like cytochrome P450 family 56-like fungal cytochrome P450s. This group includes Saccharomyces cerevisiae cytochrome P450 56, also called cytochrome P450-DIT2, and similar fungal proteins. CYP56 is involved in spore wall maturation and is thought to catalyze the oxidation of tyrosine residues in the formation of LL-dityrosine-containing precursors of the spore wall. The CYP56-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 438
29265 410694 cd11071 CYP74 cytochrome P450 family 74. The cytochrome P450 74 (CYP74) family controls several enzymatic conversions of fatty acid hydroperoxides to bioactive oxylipins in plants, some invertebrates, and bacteria. It includes two dehydrases, namely allene oxide synthase (AOS) and divinyl ether synthase (DES), and two isomerases, hydroperoxide lyase (HPL) and epoxyalcohol synthase (EAS). AOS (EC 4.2.1.92, also called hydroperoxide dehydratase), such as Arabidopsis thaliana CYP74A acts on a number of unsaturated fatty-acid hydroperoxides, forming the corresponding allene oxides. DES (EC 4.2.1.121), also called colneleate synthase or CYP74D, catalyzes the selective removal of pro-R hydrogen at C-8 in the biosynthesis of colneleic acid. The linolenate HPL, Arabidopsis thaliana CYP74B2, is required for the synthesis of the green leaf volatiles (GLVs) hexanal and trans-2-hexenal. The fatty acid HPL, Solanum lycopersicum CYP74B, is involved in the biosynthesis of traumatin and C6 aldehydes. The epoxyalcohol synthase Ranunculus japonicus CYP74A88 (also known as RjEAS) specifically converts linoleic acid 9- and 13-hydroperoxides to oxiranyl carbinols 9,10-epoxy-11-hydroxy-12-octadecenoic acid and 11-hydroxy-12,13-epoxy-9-octadecenoic acid, respectively. The CYP74 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 424
29266 410695 cd11072 CYP71-like cytochrome P450 family 71 and similar cytochrome P450s. The group includes plant cytochrome P450 family 71 (CYP71) proteins, as well as some CYPs designated as belonging to a different family including CYP99A1, CYP83B1, and CYP84A1, among others. Characterized CYP71 enzymes include: parsnip (Pastinaca sativa) CYP71AJ4, also called angelicin synthase, that converts (+)-columbianetin to angelicin, an angular furanocoumarin; periwinkle (Catharanthus roseus) CYP71D351, also called tabersonine 16-hydroxylase 2, that is involved in the foliar biosynthesis of vindoline; sorghum CYP71E1, also called 4-hydroxyphenylacetaldehyde oxime monooxygenase, that catalyzes the conversion of p-hydroxyphenylacetaldoxime to p-hydroxymandelonitrile; as well as maize CYP71C1, CYP71C2, and CYP71C4, which are monooxygenases catalyzing the oxidation of 3-hydroxyindolin-2-one, indolin-2-one, and indole, respectively. CYPs within a single CYP71 subfamily, such as the C subfamily, usually metabolize similar/related compounds. The CYP71-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 428
29267 410696 cd11073 CYP76-like cytochrome P450 family 76 and similar cytochrome P450s. Characterized members of the plant cytochrome P450 family 76 (CYP76 or Cyp76) include: Catharanthus roseus CYP76B6, a multifunctional enzyme catalyzing two sequential oxidation steps leading to the formation of 8-oxogeraniol from geraniol; the Brassicaceae-specific CYP76C subfamily of enzymes that are involved in the metabolism of monoterpenols and phenylurea herbicides; and two P450s from Lamiaceae, CYP76AH and CYP76AK, that are involved in the oxidation of abietane diterpenes. CYP76AH produces ferruginol and 11-hydroxyferruginol, while CYP76AK catalyzes oxidations at the C20 position. Also included in this group is Berberis stolonifera Cyp80, also called berbamunine synthase or (S)-N-methylcoclaurine oxidase [C-O phenol-coupling], that catalyzes the phenol oxidation of N-methylcoclaurine to form the bisbenzylisoquinoline alkaloid berbamunine. The CYP76-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 435
29268 410697 cd11074 CYP73 cytochrome P450 family 73. Cytochrome P450 family 73 (CYP73 or Cyp73), also called trans-cinnamate 4-monooxygenase (EC 1.14.14.91) or cinnamic acid 4-hydroxylase, catalyzes the regiospecific 4-hydroxylation of cinnamic acid to form precursors of lignin and many other phenolic compounds. It controls the general phenylpropanoid pathway, and controls carbon flux to pigments essential for pollination or UV protection. CYP73 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 434
29269 410698 cd11075 CYP77_89 cytochrome P450 families 77 and 89, and similar cytochrome P450s. This group includes cytochrome P450 families 77 (CYP77) and 89 (CYP89), which are sister families that share a common ancestor. CYP89, present only in angiosperms, is younger than CYP77, which is already found in lycopods; thus, CYP89 may have evolved from CYP77 after duplication and divergence. Also included in this group is ent-kaurene oxidase, called CYP701A3 in Arabidopsis thaliana and CYP701B1 in Physcomitrella patens, that catalyzes the oxidation of ent-kaurene to form ent-kaurenoic acid. CYP701A3 is sensitive to inhibitor uniconazole-P while CYP701B1 is not. This CYP77/89 group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 433
29270 410699 cd11076 CYP78 cytochrome P450 family 78. Characterized cytochrome P450 family 78 (CYP78 or Cyp78) proteins include: CYP78A5, which is expressed in leaf, flora and embryo, and has been reported to stimulate plant organ growth in Arabidopsis thaliana and to regulate plant architecture, ripening time, and fruit mass in tomato; Glycine max CYP78A10 that functions in regulating seed size/weight and pod number; and Physcomitrella patens CYP78A27 or CYP78A28, which together, are essential in bud formation. The CYP78 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
29271 410700 cd11078 CYP130-like cytochrome P450 family 130-like and similar cytochrome P450s. This subfamily includes Mycobacterium tuberculosis cytochrome P450 130 (CYP130), Rhodococcus erythropolis CYP116, and similar bacterial proteins. CYP130 catalyzes the N-demethylation of dextromethorphan, and has also shown a natural propensity to bind primary arylamines. CYP116 is involved in the degradation of thiocarbamate herbicides. The CYP130-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 380
29272 410701 cd11079 Cyp_unk unknown subfamily of mostly bacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s, predominantly from bacteria. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 350
29273 410702 cd11080 CYP134A1 cytochrome P450 family 134, subfamily A, polypeptide 1. Cytochrome P450 134A1 (CYP134A1, EC 1.14.15.13), also called pulcherriminic acid synthase or cyclo-L-leucyl-L-leucyl dipeptide oxidase or cytochrome P450 CYPX, catalyzes the oxidation of cyclo(L-Leu-L-Leu) (cLL) to yield pulcherriminic acid which forms the red pigment pulcherrimin via a non-enzymatic spontaneous reaction with Fe(3+). It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 370
29274 410703 cd11082 CYP61_CYP710 C-22 sterol desaturase subfamily, such as fungal cytochrome P450 61 and plant cytochrome P450 710. C-22 sterol desaturase (EC 1.14.19.41), also called sterol 22-desaturase, is required for the formation of the C-22 double bond in the sterol side chain of delta22-unsaturated sterols, which are present specifically in fungi and plants. This enzyme is also called cytochrome P450 61 (CYP61) in fungi and cytochrome P450 710 (CYP710) in plants. The CYP61/CYP710 subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 415
29275 410704 cd11083 CYP_unk unknown subfamily of cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 421
29276 199893 cd11234 E_set_GDE_N N-terminal Early set domain associated with the catalytic domain of Glycogen debranching enzyme. E or "early" set domains are associated with the catalytic domain of the glycogen debranching enzyme at the N-terminal end. Glycogen debranching enzymes have both 4-alpha-glucanotransferase and amylo-1,6-glucosidase activities. As a transferase, it transfers a segment of a 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. As a glucosidase, it catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. The N-terminal domain of the glycogen debranching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase. 101
29277 200496 cd11235 Sema_semaphorin The Sema domain, a protein interacting module, of semaphorins. Semaphorins are regulator molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. They can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted proteins; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. The semaphorins exert their function through their receptors, the neuropilin and plexin families. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 437
29278 200497 cd11236 Sema_plexin_like The Sema domain, a protein interacting module, of Plexins and MET-like receptor tyrosine kinases. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestor of semaphorins. Ligand binding activates signal transduction pathways controlling axon guidance in the nervous system and other developmental processes including cell migration and morphogenesis, immune function, and tumor progression. Plexins are divided into four types (A-D) according to sequence similarity. In vertebrates, type A Plexins serve as the co-receptors for neuropilins to mediate the signalling of class 3 semaphorins except Sema3E, which signals through Plexin D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B. Plexin C1 serves as the receptor of Sema7A and plays regulation roles in both immune and nervous systems. This family also includes the Met and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 401
29279 200498 cd11237 Sema_1A The Sema domain, a protein interacting module, of semaphorin 1A (Sema1A). Sema1A is a transmembrane protein. It has been shown to mediate the defasciculation of motor axon bundles at specific choice points. Sema1A binds to its receptor plexin A (PlexA), which in turn triggers downstream signaling events involving the receptor tyrosine kinase Otk, the evolutionarily conserved flavoprotein monooxygenase molecule interacting with CasL (MICAL), and the A kinase anchoring protein Nervy, leading to repulsive growth-cone response. Sema1A has also been shown to be involved in synaptic formation. It is a member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 446
29280 200499 cd11238 Sema_2A The Sema domain, a protein interacting module, of semaphorin 2A (Sema2A). Sema2A, a secreted semaphorin, signals through its receptor plexin B (PlexB) to regulate central and peripheral axon pathfinding. In the Drosophila embryo, Sema2A secreted by oenocytes interacts with PlexB to guide sensory axons. Sema2A is a member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 452
29281 200500 cd11239 Sema_3 The Sema domain, a protein interacting module, of class 3 semaphorins. Class 3 semaphorins (Sema3s) are secreted regulator molecules involved in the development of the nervous system, vasculogenesis, angiogenesis, and tumorigenesis. There are 7 distinct subfamilies named Sema3A to 3G. Sema3s function as repellent signals during axon guidance by repelling neurons away from the source of Sema3s. However, Sema3s that are secreted by tumor cells play an inhibitory role in tumor growth and angiogenesis (specifically Sema3B and Sema3F). Sema3s function by forming complexes with neuropilins and A-type plexins, where neuropilins serve as the ligand binding moiety and the plexins function as signal transduction component. Sema3s primarily inhibit the cell motility and migration of tumor and endothelial cells by inducing collapse of the actin cytoskeleton via neuropilins and plexins. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 471
29282 200501 cd11240 Sema_4 The Sema domain, a protein interacting module, of class 4 semaphorins (Sema4). Class 4 semaphorins (Sema4s) are transmembrane regulator molecules involved in the development of the nervous system, immune response, cytoskeletal organization, angiogenesis, and cell-cell interactions. There are 7 distinct subfamilies in class 4 semaphorins, named 4A to 4G. Several class 4 subfamilies play important roles in the immune system and are called "immune semaphorins". Sema4A plays critical roles in T cell-DC interactions in the immune response. Sema4D/CD100, expressed by lymphocytes, promotes the aggregation and survival of B lymphocytes and inhibits cytokine-induced migration of immune cells in vitro. It is required for normal activation of B and T lymphocytes. Sema4B negatively regulates basophil functions through T cell-basophil contacts and significantly inhibits IL-4 and IL-6 production from basophils in response to various stimuli, including IL-3 and papain. Sema4s not only influence the activation state of cells but also modulate their migration and survival. The effects of Sema4s on nonlymphoid cells are mediated by plexin D1 and plexin Bs. The Sema4G and Sema4C genes are expressed in the developing cerebellar cortex and are involved in neural tube closure and development of cerebellar granule cells through receptor plexin B2. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 456
29283 200502 cd11241 Sema_5 The Sema domain, a protein interacting module, of semaphorin 5 (Sema5). Class 5 semaphorins are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. There are three subfamilies in class 5 semaphorins, namely 5A, 5B and 5C. Sema5A and Sema5B function as guidance cues for optic and corticofugal nerve development, respectively. Sema5A-induced cell migration requires Met signaling. Sema5C is an early development gene and may play a role in odor-guided behavior. Sema5A is also implicated in cancer. In a screening model for metastasis, the Drosophila Sema5A ortholog, Dsema-5C, has been found to be required in tumorigenicity and metastasis. Sema5A is highly expressed in human pancreatic cancer cells and is associated with tumor growth, invasion and metastasis. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 438
29284 200503 cd11242 Sema_6 The Sema domain, a protein interacting module, of class 6 semaphorins (Sema6). Class 6 semaphorins (Sema6s) are membrane associated semaphorins. There are 4 subfamilies named 6A to 6D. Sema6s bind to plexin As in a neuropilin independent fashion. Sema6-plexin A signaling plays important roles in lamina-specific axon projections. Interactions between plexin A2, plexin A4, and Sema6A control lamina-restricted projection of hippocampal mossy fibers. Interactions between Sema6C, Sema6D and plexin A1 shape the stereotypic trajectories of sensory axons in the spinal cord. In addition to axon targeting, Sema6D-plexin A1 interactions influence a wide range of other biological processes. During cardiac development, Sema6D attracts or repels endothelial cells in the cardiac tube depending on the expression patterns of specific coreceptors in addition to plexin A1. Furthermore, Sema6D binds a receptor complex comprising plexin A1, Trem2 (triggering receptor expressed on myeloid cells 2), and DAP12 on dendritic cells and osteoclasts to mediate T-cell-DC interactions and to control bone development, respectively. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 465
29285 200504 cd11243 Sema_7A The Sema domain, a protein interacting module, of semaphorin 7A (Sema7A, also called CD108). Sema7A plays regulatory roles in both immune and nervous systems. Unlike other semaphorins, which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Sema7A also plays a critical role in the negative regulation of T cell activation and function. Sema7A is a membrane-anchored member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 414
29286 200505 cd11244 Sema_plexin_A The Sema domain, a protein interacting module, of Plexin A. Plexins serve as receptors of semaphorins and may be the ancestor of semaphorins. Members of the Plexin A subfamily are receptors for Sema1s, Sema3s, and Sema6s, and they mediate diverse biological functions including axon guidance, cardiovascular development, and immune function. Guanylyl cyclase Gyc76C and Off-track kinase (OTK), a putative receptor tyrosine kinase, modulate Sema1a-Plexin A mediated axon repulsion. Sema3s do not interact directly with plexin A receptors, but instead bind Neuropilin-1 or Neuropilin-2 to activate neuropilin-plexin A holoreceptor complexes. In contrast to Sema3s, Sema6s do not require neuropilins for plexin A binding. In the complex, plexin As serve as signal-transducing subunits. An increasing number of molecules that interact with the intracellular region of Plexin A have been identified; among them are IgCAMs (in axon guidance events) and Trem2-DAP12 (in immune responses). The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 470
29287 200506 cd11245 Sema_plexin_B The Sema domain, a protein interacting module, of Plexin B. Plexins, which contain semaphorin domains, function as receptors of semaphorins and may be the ancestors of semaphorins. There are three members of the Plexin B subfamily, namely B1, B2 and B3. Plexins B1, B2 and B3 are receptors for Sema4D, Sema4C and Sema4G, and Sema5A, respectively. The activation of plexin B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. By signaling the effect of Sema4C and Sema4G, the plexin B2 receptor is critically involved in neural tube closure and cerebellar granule cell development. Plexin B3, the receptor of Sema5A, is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin B3 has been linked to verbal performance and white matter volume in human brain. Small GTPases play important roles in plexin B signaling. Plexin B1 activates Rho through Rho-specific guanine nucleotide exchange factors, leading to neurite retraction. Plexin B1 possesses an intrinsic GTPase-activating protein activity for R-Ras and induces growth cone collapse through R-Ras inactivation. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 440
29288 200507 cd11246 Sema_plexin_C1 The Sema domain, a protein interacting module, of Plexin C1. Plexins serve as semaphorin receptors. Plexin C1 has been identified as the receptor of semaphorin 7A, which plays regulation roles in both the immune and nervous systems. Unlike other semaphorins which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Plexin C1 is a potential tumor suppressor for melanoma progression. The expression of Plexin C1 is diminished or absent in human melanoma cell lines. Cofilin, an actin-binding protein involved in cell migration, is a downstream target of Sema7A-Plexin C1 signaling. Cofilin is not phosphorylated when Plexin C1 expression is silenced. Thus, melanoma invasion and metastasis may be promoted through the loss of Plexin C1 inhibitory signaling on cofilin activation. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 401
29289 200508 cd11247 Sema_plexin_D1 The Sema domain, a protein interacting module, of Plexin D1. Plexins are known as semaphorin receptors and Plexin D1 has been identified as the receptor of Sema3E. It binds to Sema3E directly with high affinity. Sema3E is implicated in axonal path finding and inhibition of developmental and post-ischemic angiogenesis. Plexin D1 is broadly expressed on tumor vessels and tumor cells in a number of different types of human tumors. Plexin D1-Sema3E interaction inhibits tumor growth but promotes invasiveness and metastasis. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 483
29290 200509 cd11248 Sema_MET_like The Sema domain, a protein interacting module, of MET and RON receptor tyrosine kinases. This family includes MET and RON receptor tyrosine kinases. MET is encoded by the c-met protooncogene. MET is the receptor for hepatocyte growth factor/scatter factor (HGF/SF). HGF/SF and MET regulate multiple cellular events and are essential for the development of several tissues and organs, including the placenta, liver, and several groups of skeletal muscles. RON receptor tyrosine kinase is a Macrophage-stimulating protein (MSP) receptor. Upon binding of MSP, RON is activated via autophosphorylation within its kinase catalytic domain, resulting in a variety of effects including proliferation, tubular morphogenesis, angiogenesis, cellular motility and invasiveness. By interacting with downstream signaling molecules, it regulates macrophage migration, phagocytosis, and nitric oxide production. MET and RON receptors have been implicated in cancer development and migration. They are composed of alpha-beta heterodimers. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The Sema domain is necessary for receptor dimerization and activation. 467
29291 200510 cd11249 Sema_3A The Sema domain, a protein interacting module, of semaphorin 3A (Sema3A). Sema3A has been reported to inhibit the growth of certain experimental tumors and to regulate endothelial cell migration and apoptosis in vitro, as well as arteriogenesis in the muscle, skin vessel permeability, and tumor angiogenesis in vivo. The function of Sema3A is mediated through receptors neuropilin-1 (NP1) and plexins, although little is known about the requirement of specific plexins in its receptor complex. It is known however that Plexin-A4 is the receptor for Sema3A in the Toll-like receptor- and sepsis-induced cytokine storm during immune response. Sema3A is a member of the Class 3 semaphorin family of secreted proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 493
29292 200511 cd11250 Sema_3B The Sema domain, a protein interacting module, of semaphorin 3B (Sema3B). Sema3B is coexpressed with semaphorin 3F and both proteins are candidate tumor suppressors. Both Sema3B and Sema3F show high levels of expression in normal tissues and low-grade tumors but are down-regulated in highly metastatic tumors in the lung, melanoma cells, bladder carcinoma cells and prostate carcinoma. They are upregulated by estrogen and inhibit cell motility and invasiveness through decreased FAK phosphorylation and inhibition of MMP-2 and MMP-9 expression. Two receptor families, the neuropilins (NP) and plexins, have been implicated in mediating the actions of semaphorins 3B and 3F. Sema3B is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 471
29293 200512 cd11251 Sema_3C The Sema domain, a protein interacting module, of semaphorin 3C (Sema3C). Sema3C is a secreted semaphorin expressed in and adjacent to cardiac neural crest cells, and causes impaired migration of neural crest cells to the developing cardiac outflow tract, resulting in the interruption of the aortic arch and persistent truncus arteriosus. It has been proposed that Sema3C acts as a guidance molecule, regulating migration of neural crest cells that express semaphorin receptors such as plexin A2. Sema3C may also participate in tumor progression. The cleavage of Sema3C induced by ADAMTS1 promotes the migration of breast cancer cells. Sema3C is a member of the class 3 semaphorin family of secreted proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 470
29294 200513 cd11252 Sema_3D The Sema domain, a protein interacting module, of semaphorin 3D (Sema3D). Sema3D is a secreted semaphorin expressed during the development of the nervous system. In zebrafish, Sema3D is expressed in the ventral tectum. It guides retinal axons along the dorsoventral axis of the tectum and guides the laterality of retinal ganglion cell (RGC) projections. Both Sema3D knockdown and its ubiquitous overexpression induced aberrant ipsilateral projections. Proper balance of Sema3D is needed at the midline for the progression of RGC axons from the chiasm midline into the contralateral optic tract. Sema3D is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 474
29295 200514 cd11253 Sema_3E The Sema domain, a protein interacting module, of semaphorin 3E (Sema3E). Sema3E is a secreted molecule implicated in axonal path finding and inhibition of developmental and postischemic angiogenesis. It is also highly expressed in metastatic cancer cells. Sema3E signaling, through its high affinity functional receptor Plexin D1, drives cancer cell invasiveness and metastatic spreading. Sema3E is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 471
29296 200515 cd11254 Sema_3F The Sema domain, a protein interacting module, of semaphorin 3F (Sema3F). Sema3F is coexpressed with semaphorin3B. Both Sema3B and Sema3F proteins are candidate tumor suppressors that are down-regulated in highly metastatic tumors. Two receptor families, the neuropilins and plexins, have been implicated in mediating the actions of semaphorins 3B and 3F. Sema3F is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 470
29297 200516 cd11255 Sema_3G The Sema domain, a protein interacting module, of semaphorin 3G (Sema3G). Semaphorin 3G is identified as a primarily endothelial cell-expressed class 3 semaphorin that controls endothelial and smooth muscle cell functions in autocrine and paracrine manners, respectively. It is mainly expressed in the lung and kidney, and a little in the brain. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 474
29298 200517 cd11256 Sema_4A The Sema domain, a protein interacting module, of semaphorin 4A (Sema4A). Sema4A is expressed in immune cells and is thus termed an "immune semaphorin". It plays critical roles in T cell-DC interactions in the immune response. It has been reported to enhance activation and differentiation of T cells in vitro and generation of antigen-specific T cells in vivo. The function of Sema4A in the immune response implicates its role in infectious and noninfectious diseases. Sema4A exerts its function through three receptors, namely Plexin B, Plexin D1, and Tim-2. Sema4A belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 447
29299 200518 cd11257 Sema_4B The Sema domain, a protein interacting module, of semaphorin 4B (Sema4B). Sema4B, expressed in T and B cells, is an immune semaphorin. It functions as a negative regulator of basophils through T cell-basophil contacts and it significantly inhibits IL-4 and IL-6 production from basophils in response to various stimuli, including IL-3 and papain. In addition, T cell-derived Sema4B suppresses basophil-mediated Th2 skewing and humoral memory responses. Sema4B may also be involved in lung cancer cell mobility by inducing the degradation of CLCP1 (CUB, LCCL-homology, coagulation factor V/VIII homology domains protein). Sema4B is characterized by a PDZ-binding motif at the carboxy-terminus, which mediates interaction with the post-synaptic density protein PSD-95/SAP90, which is thought to play a central role during synaptogenesis and in the structure and function of post-synaptic specializations of excitatory synapses. Sema4B belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 464
29300 200519 cd11258 Sema_4C The Sema domain, a protein interacting module, of semaphorin 4C (Sema4C). Sema4C acts as a Plexin B2 ligand to regulate the development of cerebellar granule cells and to modulate ureteric branching in the developing kidney. The binding of Sema4C to Plexin B2 results in the phosphorylation of downstream regulator ErbB-2 and the plexin protein itself. The cytoplasmic region of Sema4C binds a neurite-outgrowth-related protein SFAP75, suggesting that Sema4C may also play a role in neural function. Sema4C belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 458
29301 200520 cd11259 Sema_4D The Sema domain, a protein interacting module, of semaphorin 4D (Sema4D, also known as CD100). Sema4D/CD100 is expressed in immune cells and plays critical roles in immune response; it is thus termed an "immune semaphorin". It is expressed by lymphocytes and promotes the aggregation and survival of B lymphocytes and inhibits cytokine-induced migration of immune cells in vitro. Sema4D/CD100 knock-out mice demonstrate that Sema4D is required for normal activation of B and T lymphocytes. Sema4D increases B-cell and DC function using either Plexin B1 or CD72 as receptors. The function of Sema4D in immune response implicates its role in infectious and noninfectious diseases. Sema4D belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 471
29302 200521 cd11260 Sema_4E The Sema domain, a protein interacting module, of semaphorin 4E (Sema4E). Sema4E is expressed in the epithelial cells that line the pharyngeal arches in zebrafish. It may act as a guidance molecule to restrict the branchiomotor axons to the mesenchymal cells. Gain-of-function and loss-of-function studies demonstrate that Sema4E is essential for the guidance of facial axons from the hindbrain into their pharyngeal arch targets and is sufficient for guidance of gill motor axons. Sema4E guides facial motor axons by a repulsive action. Sema4E belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 456
29303 200522 cd11261 Sema_4F The Sema domain, a protein interacting module, of semaphorin 4F (Sema4F). Sema4F plays a role in heterotypic cell-cell contacts, controls cell proliferation, and suppresses tumorigenesis. In neurofibromatosis type 1 (NF1) patients, reduced Sema4F level disrupts Schwann cell/axonal interactions. Experiments using a yeast two-hybrid system show that the extreme C-terminus of Sema4F interacts with the PDZ domains of post-synaptic density protein SAP90/PSD-95, indicating possible functional involvement of Sema4F at glutamatergic synapses. Recent work also suggests a role for Sema4F in the injury response of intramedullary axotomized motoneurons. Sema4F belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 460
29304 200523 cd11262 Sema_4G The Sema domain, a protein interacting module, of semaphorin 4G (Sema4G). The Sema4G and Sema4C genes are expressed in the developing cerebellar cortex. Sema4G and Sema4C proteins specifically bind to Plexin B2 expressed in the cerebellar granule cells. Sema4G and Sema4C are involved in neural tube closure and cerebellar granule cell development through Plexin B2. Sema4G belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 457
29305 200524 cd11263 Sema_5A The Sema domain, a protein interacting module, of semaphorin 5A (Sema5A). Originally, mouse Sema5A was identified as a protein that induces inhibitory responses during optic nerve development. Recent studies show that Sema5A controls innate immunity in mice. It also has been identified as a candidate gene for causing idiopathic autism in humans. Plexin B3 functions as a binding partner and receptor for Sema5A. Furthermore, Sema5A is also implicated in cancer. The role of the Drosophila Sema5A ortholog, Dsema-5C, in tumorigenicity and metastasis has been reported. Sema5A is highly expressed in human pancreatic cancer cells and is associated with tumor growth, invasion and metastasis. Sema5A belongs to class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 436
29306 200525 cd11264 Sema_5B The Sema domain, a protein interacting module, of semaphorin 5B (Sema5B). Sema5B is expressed in regions of the basal telencephalon in rat. Sema5B is an inhibitory cue for corticofugal axons and acts as a source of repulsion for the appropriate guidance of cortical axons away from structures such as the ventricular zone as they navigate toward and within subcortical regions. In addition to its role as a guidance cue, Sema5B regulates the development and maintenance of synapse size and number in hippocampal neurons. In addition, the sema domain of Sema5B can be cleaved from the whole protein and exerts its function in regulation of synapse morphology. Sema5B belongs to the class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 437
29307 200526 cd11265 Sema_5C The Sema domain, a protein interacting module, of semaphorin 5C (sema5C). In Drosophila, Sema5C was identified as an early development gene, which is expressed in stage 2 embryos with a striped pattern emerging at later stages. Sema5c may play a role in odor-guided behavior and in tumorigenesis. Sema5C belongs to class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 433
29308 200527 cd11266 Sema_6A The Sema domain, a protein interacting module, of semaphorin 6A (Sema6A). In the cerebellum, Sema6A-plexin A2 signaling modulates granule cell migration by controlling centrosome positioning. Besides plexin A2, plexin A4 is also found to be a receptor of Sema6A. Interactions between plexin A2, plexin A4, and Sema6A control lamina-restricted projection of hippocampal mossy fibers. It is required for the clustering of boundary cap cells at the PNS/CNS interface and thus, prevents motoneurons from streaming out of the ventral spinal cord. At the dorsal root entry site, it organizes the segregation of dorsal roots. Sema6A may also be involved in axonal pathfinding processes in the periinfarct and homotopic contralateral cortex. Sema6A is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 466
29309 200528 cd11267 Sema_6B The Sema domain, a protein interacting module, of semaphorin 6B (Sema6B). Sema6B functions as a repellent for axon growth; this repulsive activity is mediated by its receptor Plexin A4. Sema6B is expressed in CA3, and repels mossy fibers in a Plexin A4 dependent manner. In humans, it was shown that peroxisome proliferator-activated receptors (PPARs) and 9-cis-retinoic acid receptor (RXR) regulate human semaphorin 6B (Sema6B) gene expression. Sema6B is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 466
29310 200529 cd11268 Sema_6C The Sema domain, a protein interacting module, of semaphorin 6C (Sema6C, also called semaphorin Y). Sema6C is highly expressed in adult brain and skeletal muscle and it shows growth cone collapsing activity. It may play a role in the maintenance and remodelling of neuronal connections. In adult skeletal muscle, this role includes prevention of motor neuron sprouting and uncontrolled motor neuron growth. The expression of Sema6C in adult skeletal muscle is down-regulated following denervation. Sema6C is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 465
29311 200530 cd11269 Sema_6D The Sema domain, a protein interacting module, of semaphorin 6D (Sema6D). Sema6D is expressed predominantly in the nervous system during embryogenesis and it uses Plexin-A1 as a receptor. It displays repellent activity for dorsal root ganglion axons. Sema6D also acts as a regulator of late phase primary immune responses. In addition, Sema6D is overexpressed in gastric carcinoma, indicating that it may have an important role in the occurrence and development of the cancer. Sema6D is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 465
29312 200531 cd11270 Sema_6E The Sema domain, a protein interacting module, of semaphorin 6E (sema6E). Sema6E is expressed predominantly in the nervous system during embryogenesis. It binds Plexin A1 and might utilize it as a receptor to repel axons of specific types during development. Sema6E acts as a repellent to dorsal root ganglion axons as well as sympathetic axons. Sema6E is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module. 462
29313 200532 cd11271 Sema_plexin_A1 The Sema domain, a protein interacting module, of Plexin A1. Plexin A1 is found in both the nervous and immune systems. Its external Sema domain is also shared by semaphorin proteins. In the nervous system, Plexin A1 mediates Sema3A axon guidance function by interacting with the Sema3A coreceptor neuropilin, resulting in actin depolarization and cell repulsion. In the immune system, Plexin A1 mediates Sema6D signaling by binding to the Sema6D-Trem2-DAP12 complex on immune cells and osteoclasts to promote Rac activation and DAP12 phosphorylation. In gene profiling experiments, Plexin A1 was identified as a CIITA (class II transactivator) regulated gene in primary dendritic cells (DCs). The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 474
29314 200533 cd11272 Sema_plexin_A2 The Sema domain, a protein interacting module, of Plexin A2. Plexin A2 serves as a receptor for class 6 semaphorins. Interactions between Plexin A2, A4 and semaphorins 6A and 6B control the lamina-restricted projection of hippocampal mossy fibers. Sema6B also repels the growth of mossy fibers in a Plexin A4 dependent manner. Plexin A2 does not suppress Sema6B function. In addition, studies have shown that Plexin A2 may be related to anxiety and other psychiatric disorders. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 515
29315 200534 cd11273 Sema_plexin_A3 The Sema domain, a protein interacting module, of Plexin A3. Plexin-A3 forms a receptor complex with neuropilin-2 and transduces signals for class 3 semaphorins in the nervous system. Both plexins A3 and A4 are essential for normal sympathetic neuron development. They function cooperatively to regulate the migration of sympathetic neurons, and differentially to guide sympathetic axons. Both plexins A3 and A4 are not required for guiding neural crest precursors prior to reaching the sympathetic anlagen. Plexin A3 is a major driving force for intraspinal motor growth cone guidance. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 469
29316 200535 cd11274 Sema_plexin_A4 The Sema domain, a protein interacting module, of Plexin A4. Plexin A4 forms a receptor complex with neuropilins (NRPs) and transduces signals for class 3 semaphorins in the nervous system. It regulates facial nerve development by functioning as a receptor for Sema3A/NRP1. Both plexins A3 and A4 are essential for normal sympathetic development. They function both cooperatively, to regulate the migration of sympathetic neurons, and differentially, to guide sympathetic axons. Plexin A4 is also expressed in lymphoid tissues and functions in the immune system. It negatively regulates T lymphocyte responses. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 473
29317 200536 cd11275 Sema_plexin_B1 The Sema domain, a protein interacting module, of Plexin B1. Plexin B1 serves as the Semaphorin 4D receptor and functions as a regulator of developing neurons and a tumor suppressor protein for melanoma. The Sema4D-plexin B signaling complex regulates dendritic and axonal complexity. The activation of Plexin B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. As a tumor suppressor, plexin B1 abrogates activation of the oncogenic receptor, c-Met, by its ligand, hepatocyte growth factor (HGF), in melanoma. Furthermore, plexin B1 suppresses integrin-dependent migration and activation of pp125FAK and inhibits Rho activity. Plexin B1 is highly expressed in endothelial cells and its activation by Sema4D elicits a potent proangiogenic response. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 461
29318 200537 cd11276 Sema_plexin_B2 The Sema domain, a protein interacting module, of Plexin B2. Plexin B2 serves as the receptor of Sema4C and Sema4G. By signaling the effect of Sema4C and Sema4G, the plexin B2 receptor plays important roles in neural tube closure and cerebellar granule cell development. Mice lacking Plexin B2 demonstrated defects in closure of the neural tube and disorganization of the embryonic brain. In developing kidney, Sema4C-Plexin B2 signaling modulates ureteric branching. Plexin B2 is expressed both in the pretubular aggregates and the ureteric epithelium in the developing kidney. Deletion of Plexin B2 results in renal hypoplasia and occasional double ureters. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 449
29319 200538 cd11277 Sema_plexin_B3 The Sema domain, a protein interacting module, of Plexin B3. Plexin B3 is the receptor of semaphorin 5A. It is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin B3 has been linked to verbal performance and white matter volume in human brain. Furthermore, Sema5A and plexin B3 have been implicated in the progression of various types of cancer. They play an important role in the invasion and metastasis of gastric carcinoma. The stimulation of plexin B3 by Sema5A binding in human glioma cells results in the inhibition of cell migration and invasion. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. 434
29320 200539 cd11278 Sema_MET The Sema domain, a protein interacting module, of MET (also called hepatocyte growth factor receptor, HGFR). MET is encoded by the c-met protooncogene. MET is a receptor tyrosine kinase that binds its ligand, hepatocyte growth factor/scatter factor (HGF/SF). HGF/SF and MET are essential for the development of several tissues and organs, including the placenta, liver, and several groups of skeletal muscles. It also plays a major role in the abnormal migration of cancer cells as a result of overexpression or MET mutations. MET is composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The cytoplasmic C-terminal region acts as a docking site for multiple protein substrates, including Grb2, Gab1, STAT3, Shc, SHIP-1 and Src. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. The Sema domain of Met is necessary for receptor dimerization and activation. 492
29321 200540 cd11279 Sema_RON The Sema domain, a protein interacting module, of RON Receptor Tyrosine Kinase. RON receptor tyrosine kinase is a Macrophage-stimulating protein (MSP) receptor. Upon binding of MSP, RON is activated via autophosphorylation within its kinase catalytic domain, resulting in a wide range of effects, including proliferation, tubular morphogenesis, angiogenesis, cellular motility and invasiveness. By interacting with downstream signaling molecules, it regulates macrophage migration, phagocytosis, and nitric oxide production. RON has been implicated in cancers of the breast, colon, pancreas and ovaries because both splice variants and receptor overexpression have been identified in these tumors. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. RON is composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The Sema domain of RON may be necessary for receptor dimerization and activation. 493
29322 200436 cd11280 gelsolin_like Tandemly repeated domains found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 88
29323 200437 cd11281 ADF_drebrin_like ADF homology domain of drebrin and actin-binding protein 1 (abp1). Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Many of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Abp1 and drebrin (developmentally regulated brain protein) are multidomain proteins with an N-terminal ADF homology domain and one or more C-terminal SH3 domains. They have been shown to interact with polymeric F-actin, but not with monomeric G-actin, and do not appear to promote the disassembly of actin filaments. Drebrin rather stabilizes actin filaments by inducing changes in the helical twist and may promote or interfere with the interactions of other proteins with actin filaments. 136
29324 200438 cd11282 ADF_coactosin_like Coactosin-like members of the ADF homology domain family. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Many of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. The function of coactosins is not well understood. They appear to interfere with the capping of actin filaments in Dictyostelium, and may not be able to bind monomeric globular actin. A role for coactosins as chaperones stabilizing 5-lipoxygenase (5LO) has been suggested; 5LO plays a crucial role in leukotriene synthesis. 114
29325 200439 cd11283 ADF_GMF-beta_like ADF-homology domain of glia maturation factor beta and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Most of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. The glia maturation factor (GMF), however, does not bind actin but interacts with the Arp2/3 complex (which contains actin-related proteins, amongst others) and suppresses Arp2/3 activity, inducing the dissociation of branched daughter filaments from their mother filaments. This family includes both mammalian GMF isoforms, GMF-beta and GMF-gamma. GMF-beta regulates cellular growth, fission, differentiation and apoptosis. GMF-gamma is important in myeloid cell development and is an important regulator for cell migration and polarity in neutrophils. 122
29326 200440 cd11284 ADF_Twf-C_like C-terminal ADF domain of twinfilin and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Twinfilin contains two ADF domains, and inhibits the assembly of actin filaments by strongly interacting with monomeric ADP-actin (ADP-G-actin) in a 1:1 stoichiometry (with its C-terminal ADF domain, Twf-C) and inhibiting the actin monomer's nucleotide exchange. Mammalian twinfilin may also cap the barbed ends of F-actin filaments and prevent further assembly (or disassembly), in a process which requires both ADF domains. The N-terminal ADF domain (Twf-N) binds G-actin with a lower affinity than Twf-C; Twf-C can also bind F-actin. During capping, Twf-N may interact with the terminal actin subunit, and Twf-C may bind between two adjacent subunits at the side of the filament. 132
29327 200441 cd11285 ADF_Twf-N_like N-terminal ADF domain of twinfilin and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Twinfilin contains two ADF domains, and inhibits the assembly of actin filaments by strongly interacting with monomeric ADP-actin (ADP-G-actin) in a 1:1 stoichiometry (with its C-terminal ADF domain, Twf-C) and inhibiting the actin monomer's nucleotide exchange. Mammalian twinfilin may also cap the barbed ends of F-actin filaments and prevent further assembly (or disassembly), in a process which requires both ADF domains. The N-terminal ADF domain (Twf-N) binds G-actin with a lower affinity than Twf-C; Twf-C can also bind F-actin. During capping, Twf-N may interact with the terminal actin subunit, and Twf-C may bind between two adjacent subunits at the side of the filament. 139
29328 200442 cd11286 ADF_cofilin_like Cofilin, Destrin, and related actin depolymerizing factors. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. These proteins enhance the turnover rate of actin, and interact with actin monomers (G-actin) as well as actin filaments (F-actin), typically with a preference for ADP-G-actin subunits. The basic function of cofilin is to promote disassembly of aged actin filaments. Vertebrates have three isoforms of cofilin: cofilin-1 (Cfl1, non-muscle cofilin), cofilin-2 (muscle cofilin), and ADF (destrin). When bound to actin monomers, cofilins inhibit their spontaneous exchange of nucleotides. The cooperative binding to (aged) ADP-F-actin induces a local change in the actin filament structure and further promotes aging. 133
29329 200443 cd11287 Sec23_C C-terminal Actin depolymerization factor-homology domain of Sec23. The C-terminal domain of the Sec23 subunit of the coat protein complex II (COPII) is distantly related to gelsolin-like repeats and the actin depolymerizing domains found in cofilin and similar proteins. Sec23 forms a tight complex with Sec24. The cytoplasmic Sec23/24 complex is recruited together with Sar1-GTP and Sec13/31 to induce coat polymerization and membrane deformation in the forming of COPII-coated endoplasmic reticulum vesicles. The function of the Sec23 C-terminal domain is unclear. 121
29330 200444 cd11288 gelsolin_S5_like Gelsolin sub-domain 5-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 92
29331 200445 cd11289 gelsolin_S2_like Gelsolin sub-domain 2-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 92
29332 200446 cd11290 gelsolin_S1_like Gelsolin sub-domain 1-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 113
29333 200447 cd11291 gelsolin_S6_like Gelsolin sub-domain 6-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 99
29334 200448 cd11292 gelsolin_S3_like Gelsolin sub-domain 3-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 98
29335 200449 cd11293 gelsolin_S4_like Gelsolin sub-domain 4-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins. 101
29336 199894 cd11294 E_set_Esterase_like_N N-terminal Early set domain associated with the catalytic domain of putative esterases. E or "early" set domains are associated with the catalytic domain of esterase at the N-terminal end. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term esterase can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminal domain of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others. 83
29337 199917 cd11295 Mago_nashi Mago nashi proteins, integral members of the exon junction complex. Members of this family, which was originally identified in Drosophila and called mago nashi, are integral members of the exon junction complex (EJC). The EJC is a multiprotein complex that is deposited on spliced mRNAs after intron removal at a conserved position upstream of the exon-exon junction, and transported to the cytoplasm where it has been shown to influence translation, surveillance, and localization of the spliced mRNA. It consists of four core proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP and is supposed to be a binding platform for more peripherally and transiently associated factors along mRNA travel. Mago and Y14 form a stable heterodimer that stabilizes the complex by inhibiting eIF4AIII's ATPase activity. In humans, but not Drosophila, EJC is involved in nonsense-mediated mRNA decay (NMD) via binding to Upf3b, a central NMD effector. EJC is stripped off the mRNA during the first round of translation and then the complex components are transported back into the nucleus and recycled. The Mago-Y14 heterodimer has been shown to interact with the cytoplasmic protein PYM, an EJC disassembly factor, and specifically binds to the karyopherin nuclear receptor importin 13. 143
29338 211383 cd11296 O-FucT_like GDP-fucose protein O-fucosyltransferase and related proteins. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 206
29339 350237 cd11297 PIN_LabA-like_N_1 uncharacterized subfamily of N-terminal LabA-like PIN domains. This N-terminal LabA-like PIN domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain), has been shown to play a role in cyanobacterial circadian timing. The LabA-like C-terminal domains characteristic of this subfamily may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains). The function of the N-terminal domain is unknown. The LabA-like PIN domain family also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes. Other members are the LabA-like PIN domains of human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 117
29340 211384 cd11298 O-FucT-2 GDP-fucose protein O-fucosyltransferase 2. O-FucT-2 adds O-fucose to thrombospondin type 1 repeats (TSRs), and appears conserved in bilateria. The O-fucosylation of TSRs appears to play a role in regulating secretion of metalloproteases of the ADAMTS superfamily. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 374
29341 211385 cd11299 O-FucT_plant GDP-fucose protein O-fucosyltransferase, plant specific subfamily. Some members of this plant-specific family of O-fucosyltransferases have been annotated as auxin-independent growth promotors. The function of the protein seems unclear. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 290
29342 211386 cd11300 Fut8_like Alpha 1-6-fucosyltransferase. Alpha 1,6-fucosyltransferase (Fut8) transfers a fucose moiety from GDP-fucose to the reducing terminal N-acetylglucosamine of the core structure of Asn-linked oligosaccharides, in a process termed core fucosylation. Core fucosylation is essential for the function of growth factor receptors. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 328
29343 211387 cd11301 Fut1_Fut2_like Alpha-1,2-fucosyltransferase. Alpha-1,2-fucosyltransferases (Fut1, Fut2) catalyze the transfer of alpha-L-fucose to the terminal beta-D-galactose residue of glycoconjugates via an alpha-1,2-linkage, generating carbohydrate structures that exhibit H-antigenicity for blood-group carbohydrates. These structures also act as ligands for morphogenesis, the adhesion of microbes, and metastasizing cancer cells. Fut1 is responsible for producing the H antigen on red blood cells. Fut2 is expressed in epithelia of secretory tissues, and individuals termed "secretors" have at least one functional copy of the gene; they secrete H antigen which is further processed into A and/or B antigens depending on the ABO genotype. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 265
29344 211388 cd11302 O-FucT-1 GDP-fucose protein O-fucosyltransferase 1. The protein O-fucosyltransferase 1 (Ofut1 or O-FucT-1) adds O-fucose to EGF (epidermal growth factor-like) repeats. The O-fucosylation of the Notch receptor signaling protein is dependent on this enzyme, which requires GDP-fucose as a substrate. O-fucose residues added to the target of O-FucT-1 may be further elongated by other glycosyltransferases. On top of O-fucosylation, O-FucT-1 may have other functions such as the regulation of the Notch receptor exit from the ER. Six highly conserved cysteines are present in O-FucT-1, which is a soluble ER protein, as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and C. elegans. Both features are characteristic of several glycosyltransferase families. The membrane-bound pre-protein is released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese. O-FucT-1 is similar to family 1 glycosyltransferases (GT1). 347
29345 206636 cd11303 Dystroglycan_repeat Cadherin-like repeat domain of alpha dystroglycan. Dystroglycan is a glycoprotein widely distributed in skeletal muscle and other tissues; the pre-protein is cleaved into two subunits (alpha and beta) that form a complex which links the extracellular matrix to the cytoskeleton. Cadherin-like dystroglycan repeats are present in the extracellular alpha-dystroglycan subunit, which binds to the alpha-2-laminin G-domain in the basement membrane as part of the dystrophin-dystroglycan-complex (DGC). DGC has been shown to interact with other extracellular matrix components as well, such as perlecan and m-agrin, suggesting that the complex may play various different roles depending on the extracellular ligand. 99
29346 206637 cd11304 Cadherin_repeat Cadherin tandem repeat domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 98
29347 206765 cd11305 alpha_DG_C C-terminal domain of alpha dystroglycan. Dystroglycan is a glycoprotein widely distributed in skeletal muscle and other tissues; the pre-protein is cleaved into two subunits (alpha and beta) that form a complex which links the extracellular matrix to the cytoskeleton. This C-terminal domain of the alpha-subunit appears to contact neighboring cadherin-like repeats of alpha dystroglycan, and may also be involved in interactions with other components of the dystrophin-dystroglycan-complex (DGC). DGC has been shown to interact with extracellular matrix components such as laminin, perlecan and m-agrin, suggesting that the complex may play various different roles depending on the extracellular ligand. 124
29348 199915 cd11306 M35_peptidyl-Lys Peptidase M35 domain of peptidyl-Lys metalloendopeptidases. This family M35 Zn2+-metallopeptidase extracellular domain is mostly found in proteins characterized as peptidyl-Lys metalloendopeptidases (MEP; peptidyllysine metalloproteinase; EC 3.4.24.20), including some well-characterized domains in Aeromonas salmonicida subsp. achromogenes (AsaP1) and Grifola frondosa (GfMEP). These proteins specifically cleave peptidyl-lysine bonds (-X-Lys- where X may even be Pro) in proteins and peptides. AsaP1 peptidase has been shown to be important in the virulence of A. salmonicida subsp. achromogenes, having a major role in the fish innate immune response. Members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands. 160
29349 199916 cd11307 M35_Asp_f2_like Peptidase M35 domain of Asp f2, a major allergen from Aspergillus fumigatus, and related proteins; non catalytic. In this domain subgroup the unique zinc-binding motif (the aspzincin motif, characteristic of the M35 deuterolysin family, and defined as the "HEXXH + D" motif: two His ligands and Asp as third ligand), is poorly conserved and may not bind zinc. Members of this subgroup also lack a key conserved Tyr residue which acts as a proton donor during metallopeptidase catalysis. These include Asp f2, a major allergen from Aspergillus fumigatus, which reacts with serum from patients with ABPA (allergic bronchopulmonary aspergillosis), and pH-regulated antigen 1 (PRA1) from Candida albicans, which has a role in fungal morphogenesis and perhaps in the host-parasite interaction during candidal infection. No protease activity has been detected for Asp f2 to date. This subgroup also includes Saccharomyces cerevisiae Zps1p. The expression of the ZPS1 gene is increased in response to zinc deficiency; it is a target of the Zap1p transcription factor. 179
29350 200604 cd11308 Peptidase_M14NE-CP-C_like Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain. This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families. 76
29351 206763 cd11309 14-3-3_fungi Fungal 14-3-3 protein domain. This family containing fungal 14-3-3 domains includes the yeasts Saccharomyces cerevisiae (BMH1 and BMH2) and Schizosaccharomyces pombe (rad24 and rad25) isoforms. They possess distinctively variant C-terminal segments that differentiate them from the mammalian isoforms; the C-terminus is longer and BMH1/2 isoforms contain polyglutamine (polyQ) sequences of unknown function. The C-terminal segments of yeast 14-3-3 isoforms may thus behave in a different manner compared to the higher eukaryote isoforms. Yeast 14-3-3 proteins bind to numerous proteins involved in a variety of yeast cellular processes making them excellent model organisms for elucidating the function of the 14-3-3 protein family. BMH1 and BMH2 are positive regulators of rapamycin-sensitive signaling via TOR kinases while they play an inhibitory role in Rtg3p-dependent transcription involved in retrograde signaling. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 231
29352 206764 cd11310 14-3-3_1 14-3-3 protein domain. This 14-3-3 domain family includes proteins in Caenorhabditis elegans, the silkworm (Bombyx mori) as well as barley (Hordeum vulgare). In C. elegans, 14-3-3 proteins are SIR-2.1 binding partners which induce transcriptional activation of DAF-16 during stress and are required for the life-span extension conferred by extra copies of sir-2.1. In B. mori, the 14-3-3 proteins are expressed widely in larval and adult tissues, including the brain, fat body, Malpighian tube, silk gland, midgut, testis, ovary, antenna, and pheromone gland, and interact with the N-terminal fragment of Hsp60, suggesting that 14-3-3 (a molecular adaptor) and Hsp60 (a molecular chaperone) work together to achieve a wide range of cellular functions in B. mori. In barley aleurone cells, 14-3-3 proteins and members of the ABF transcription factor family have a regulatory function in the gibberellic acid (GA) pathway since the balance of GA and abscisic acid (ABA) is a determining factor during transition of embryogenesis and seed germination. 14-3-3 is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 230
29353 200452 cd11313 AmyAc_arch_bac_AmyA Alpha amylase catalytic domain found in archaeal and bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes firmicutes, bacteroidetes, and proteobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 336
29354 200453 cd11314 AmyAc_arch_bac_plant_AmyA Alpha amylase catalytic domain found in archaeal, bacterial, and plant Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes AmyA from bacteria, archaea, water fleas, and plants. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 302
29355 200454 cd11315 AmyAc_bac1_AmyA Alpha amylase catalytic domain found in bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes Firmicutes, Proteobacteria, Actinobacteria, and Cyanobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 352
29356 200455 cd11316 AmyAc_bac2_AmyA Alpha amylase catalytic domain found in bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes Chloroflexi, Dictyoglomi, and Fusobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 403
29357 200456 cd11317 AmyAc_bac_euk_AmyA Alpha amylase catalytic domain found in bacterial and eukaryotic Alpha amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes AmyA proteins from bacteria, fungi, mammals, insects, mollusks, and nematodes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 329
29358 200457 cd11318 AmyAc_bac_fung_AmyA Alpha amylase catalytic domain found in bacterial and fungal Alpha amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes bacterial and fungal proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 391
29359 200458 cd11319 AmyAc_euk_AmyA Alpha amylase catalytic domain found in eukaryotic Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes eukaryotic alpha-amylases including proteins from fungi, sponges, and protozoans. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 375
29360 200459 cd11320 AmyAc_AmyMalt_CGTase_like Alpha amylase catalytic domain found in maltogenic amylases, cyclodextrin glycosyltransferase, and related proteins. Enzymes such as amylases, cyclomaltodextrinase (CDase), and cyclodextrin glycosyltransferase (CGTase) degrade starch to smaller oligosaccharides by hydrolyzing the alpha-D-(1,4) linkages between glucose residues. In the case of CGTases, an additional cyclization reaction is catalyzed yielding mixtures of cyclic oligosaccharides which are referred to as alpha-, beta-, or gamma-cyclodextrins (CDs), consisting of six, seven, or eight glucose residues, respectively. CGTases are characterized depending on the major product of the cyclization reaction. Besides having similar catalytic site residues, amylases and CGTases contain carbohydrate binding domains that are distant from the active site and are implicated in attaching the enzyme to raw starch granules and in guiding the amylose chain into the active site. The maltogenic alpha-amylase from Bacillus is a five-domain structure, unlike most alpha-amylases, but similar to that of cyclodextrin glycosyltransferase. In addition to the A, B, and C domains, they have a domain D and a starch-binding domain E. Maltogenic amylase is an endo-acting amylase that has activity on cyclodextrins, terminally modified linear maltodextrins, and amylose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 389
29361 200460 cd11321 AmyAc_bac_euk_BE Alpha amylase catalytic domain found in bacterial and eukaryotic branching enzymes. Branching enzymes (BEs) catalyze the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage yielding a non-reducing end oligosaccharide chain, and subsequent attachment to the alpha-1,6 position. By increasing the number of non-reducing ends, glycogen is more reactive to synthesis and digestion as well as being more soluble. This group includes bacterial and eukaryotic proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 406
29362 200461 cd11322 AmyAc_Glg_BE Alpha amylase catalytic domain found in the Glycogen branching enzyme (also called 1,4-alpha-glucan branching enzyme). The glycogen branching enzyme catalyzes the third step of glycogen biosynthesis by the cleavage of an alpha-(1,4)-glucosidic linkage and the formation of a new alpha-(1,6)-branch by subsequent transfer of the cleaved oligosaccharide. They are part of a group called branching enzymes which catalyze the formation of alpha-1,6 branch points in either glycogen or starch. This group includes proteins from bacteria, eukaryotes, and archaea. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 402
29363 200462 cd11323 AmyAc_AGS Alpha amylase catalytic domain found in Alpha 1,3-glucan synthase (also called uridine diphosphoglucose-1,3-alpha-glucan glucosyltransferase and 1,3-alpha-D-glucan synthase). Alpha 1,3-glucan synthase (AGS, EC 2.4.1.183) is an enzyme that catalyzes the reversible chemical reaction of UDP-glucose and [alpha-D-glucosyl-(1-3)]n to form UDP and [alpha-D-glucosyl-(1-3)]n+1. AGS is a component of fungal cell walls. The cell wall of filamentous fungi is composed of 10-15% chitin and 10-35% alpha-1,3-glucan. AGS is triggered in fungi as a response to cell wall stress and elongates the glucan chains in cell wall synthesis. This group includes proteins from Ascomycetes and Basidiomycetes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 569
29364 200463 cd11324 AmyAc_Amylosucrase Alpha amylase catalytic domain found in Amylosucrase. Amylosucrase is a glucosyltransferase that catalyzes the transfer of a D-glucopyranosyl moiety from sucrose onto an acceptor molecule. When the acceptor is another saccharide, only alpha-1,4 linkages are produced. Unlike most amylopolysaccharide synthases, it does not require any alpha-D-glucosyl nucleoside diphosphate substrate. In the presence of glycogen it catalyzes the transfer of a D-glucose moiety onto a glycogen branch, but in its absence, it hydrolyzes sucrose and synthesizes polymers, smaller maltosaccharides, and sucrose isoforms. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 536
29365 200464 cd11325 AmyAc_GTHase Alpha amylase catalytic domain found in Glycosyltrehalose trehalohydrolase (also called Maltooligosyl trehalose Trehalohydrolase). Glycosyltrehalose trehalohydrolase (GTHase) was discovered as part of a coupled system for the production of trehalose from soluble starch. In the first half of the reaction, glycosyltrehalose synthase (GTSase), an intramolecular glycosyl transferase, converts the glycosidic bond between the last two glucose residues of amylose from an alpha-1,4 bond to an alpha-1,1 bond, making a non-reducing glycosyl trehaloside. In the second half of the reaction, GTHase cleaves the alpha-1,4 glycosidic bond adjacent to the trehalose moiety to release trehalose and malto-oligosaccharide. Like isoamylase and other glycosidases that recognize branched oligosaccharides, GTHase contains an N-terminal extension and does not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. Glycosyltrehalose Trehalohydrolase Maltooligosyltrehalose Trehalohydrolase 436
29366 200465 cd11326 AmyAc_Glg_debranch Alpha amylase catalytic domain found in glycogen debranching enzymes. Debranching enzymes facilitate the breakdown of glycogen through glucosyltransferase and glucosidase activity. These activities are performed by a single enzyme in mammals, yeast, and some bacteria, but by two distinct enzymes in Escherichia coli and other bacteria. Debranching enzymes perform two activities: 4-alpha-D-glucanotransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). 4-alpha-D-glucanotransferase transfers a segment of a 1,4-alpha-D-glucan to a new position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. Amylo-alpha-1,6-glucosidase catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. In Escherichia coli, GlgX is the debranching enzyme and malQ is the 4-alpha-glucanotransferase. TreX, an archaeal glycogen-debranching enzyme, has dual activities like the enzymes of mammals and yeast, but is structurally similar to GlgX. TreX exists in two oligomeric states, a dimer and a tetramer. Isoamylase (EC 3.2.1.68) is one of the starch-debranching enzymes that catalyzes the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen and their beta-limit dextrins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key.
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 433
29367 200466 cd11327 AmyAc_Glg_debranch_2 Alpha amylase catalytic domain found in glycogen debranching enzymes. Debranching enzymes facilitate the breakdown of glycogen through glucosyltransferase and glucosidase activity. These activities are performed by a single enzyme in mammals, yeast, and some bacteria, but by two distinct enzymes in Escherichia coli and other bacteria. Debranching enzymes perform two activities, 4-alpha-D-glucanotransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). 4-alpha-D-glucanotransferase transfers a segment of a 1,4-alpha-D-glucan to a new position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. Amylo-alpha-1,6-glucosidase catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. The catalytic triad (DED), which is highly conserved in other debranching enzymes, is not present in this group. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 478
29368 200467 cd11328 AmyAc_maltase Alpha amylase catalytic domain found in maltase (also known as alpha glucosidase) and related proteins. Maltase (EC 3.2.1.20) hydrolyzes the terminal, non-reducing (1->4)-linked alpha-D-glucose residues in maltose, releasing alpha-D-glucose. In most cases, maltase is equivalent to alpha-glucosidase, but the term "maltase" emphasizes the disaccharide nature of the substrate from which glucose is cleaved, and the term "alpha-glucosidase" emphasizes the bond, whether the substrate is a disaccharide or polysaccharide. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 470
29369 200468 cd11329 AmyAc_maltase-like Alpha amylase catalytic domain family found in maltase. Maltase (EC 3.2.1.20) hydrolyzes the terminal, non-reducing (1->4)-linked alpha-D-glucose residues in maltose, releasing alpha-D-glucose. The catalytic triad (DED) which is highly conserved in the other maltase group is not present in this subfamily. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 477
29370 200469 cd11330 AmyAc_OligoGlu Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase) and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomalto-oligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 472
29371 200470 cd11331 AmyAc_OligoGlu_like Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase) and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomalto-oligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 450
29372 200471 cd11332 AmyAc_OligoGlu_TS Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase), trehalose synthase (also called maltose alpha-D-glucosyltransferase), and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomaltooligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. Trehalose synthase (EC 5.4.99.16) catalyzes the isomerization of maltose to produce trehalose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 481
29373 200472 cd11333 AmyAc_SI_OligoGlu_DGase Alpha amylase catalytic domain found in Sucrose isomerases, oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase), dextran glucosidase (also called glucan 1,6-alpha-glucosidase), and related proteins. The sucrose isomerases (SIs) Isomaltulose synthase (EC 5.4.99.11) and Trehalose synthase (EC 5.4.99.16) catalyze the isomerization of sucrose and maltose to produce isomaltulose and trehalose, respectively. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomaltooligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. Dextran glucosidase (DGase, EC 3.2.1.70) hydrolyzes alpha-1,6-glucosidic linkages at the non-reducing end of panose, isomaltooligosaccharides and dextran to produce alpha-glucose. The common reaction chemistry of the alpha-amylase family enzymes is based on a two-step acid catalytic mechanism that requires two critical carboxylates: one acting as a general acid/base (Glu) and the other as a nucleophile (Asp). Both hydrolysis and transglycosylation proceed via the nucleophilic substitution reaction between the anomeric carbon, C1 and a nucleophile. Both enzymes contain the three catalytic residues (Asp, Glu and Asp) common to the alpha-amylase family as well as two histidine residues which are predicted to be critical to binding the glucose residue adjacent to the scissile bond in the substrates. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C.
A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 428
29374 200473 cd11334 AmyAc_TreS Alpha amylase catalytic domain found in Trehalose synthetase. Trehalose synthetase (TreS) catalyzes the reversible interconversion of trehalose and maltose. The enzyme catalyzes the reaction in both directions, but the preferred substrate is maltose. Glucose is formed as a by-product of this reaction. It is believed that the catalytic mechanism may involve the cutting of the incoming disaccharide and transfer of a glucose to an enzyme-bound glucose. This enzyme also catalyzes production of a glucosamine disaccharide from maltose and glucosamine. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 447
29375 200474 cd11335 AmyAc_MTase_N Alpha amylase catalytic domain found in maltosyltransferase. Maltosyltransferase (MTase), a maltodextrin glycosyltransferase, acts on starch and maltooligosaccharides. It catalyzes the transfer of maltosyl units from alpha-1,4-linked glucans or maltooligosaccharides to other alpha-1,4-linked glucans, maltooligosaccharides or glucose. MTase is a homodimer. The catalytic core domain has the (beta/alpha) 8 barrel fold with the active-site cleft formed at the C-terminal end of the barrel. Substrate binding experiments have led to the location of two distinct maltose-binding sites: one lies in the active-site cleft and the other is located in a pocket adjacent to the active-site cleft. It is a member of the alpha-amylase family, but unlike typical alpha-amylases, MTase does not require calcium for activity and lacks two histidine residues which are predicted to be critical for binding the glucose residue adjacent to the scissile bond in the substrates. The common reaction chemistry of the alpha-amylase family of enzymes is based on a two-step acid catalytic mechanism that requires two critical carboxylates: one acting as a general acid/base (Glu) and the other as a nucleophile (Asp). Both hydrolysis and transglycosylation proceed via the nucleophilic substitution reaction between the anomeric carbon, C1 and a nucleophile. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 538
29376 200475 cd11336 AmyAc_MTSase Alpha amylase catalytic domain found in maltooligosyl trehalose synthase (MTSase). Maltooligosyl trehalose synthase (MTSase) domain. MTSase and maltooligosyl trehalose trehalohydrolase (MTHase) work together to produce trehalose. MTSase is responsible for converting the alpha-1,4-glucosidic linkage to an alpha,alpha-1,1-glucosidic linkage at the reducing end of the maltooligosaccharide through an intramolecular transglucosylation reaction, while MTHase hydrolyzes the penultimate alpha-1,4 linkage of the reducing end, resulting in the release of trehalose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 660
29377 200476 cd11337 AmyAc_CMD_like Alpha amylase catalytic domain found in cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is mainly bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 328
29378 200477 cd11338 AmyAc_CMD Alpha amylase catalytic domain found in cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 389
29379 200478 cd11339 AmyAc_bac_CMD_like_2 Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 344
29380 200479 cd11340 AmyAc_bac_CMD_like_3 Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 407
29381 200480 cd11341 AmyAc_Pullulanase_LD-like Alpha amylase catalytic domain found in Pullulanase (also called dextrinase; alpha-dextrin endo-1,6-alpha glucosidase), limit dextrinase, and related proteins. Pullulanase is an enzyme with action similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha- and beta-amylase limit-dextrins of amylopectin and glycogen. Pullulanases are very similar to limit dextrinases, although they differ in their action on glycogen and the rate of hydrolysis of limit dextrins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 406
29382 200481 cd11343 AmyAc_Sucrose_phosphorylase-like Alpha amylase catalytic domain found in sucrose phosphorylase (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 445
29383 200482 cd11344 AmyAc_GlgE_like Alpha amylase catalytic domain found in GlgE-like proteins. GlgE is a (1,4)-alpha-D-glucan:phosphate alpha-D-maltosyltransferase, involved in alpha-glucan biosynthesis in bacteria. It is also an anti-tuberculosis drug target. GlgE isoform I from Streptomyces coelicolor has the same catalytic and very similar kinetic properties to GlgE from Mycobacterium tuberculosis. GlgE from Streptomyces coelicolor forms a homodimer with each subunit comprising five domains (A, B, C, N, and S) and 2 inserts. Domain A is a catalytic alpha-amylase-type domain that along with domain N, which has a beta-sandwich fold and forms the core of the dimer interface, binds cyclodextrins. Domains A and B and the 2 inserts define a well conserved donor pocket that binds maltose. Cyclodextrins competitively inhibit the binding of maltooligosaccharides to the S. coelicolor enzyme, indicating that the hydrophobic patch overlaps with the acceptor binding site. This is not the case in M. tuberculosis GlgE because cyclodextrins do not inhibit this enzyme, despite acceptor length specificity being conserved. Domain C is hypothesized to help stabilize domain A and could be involved in substrate binding. Domain S is a helix bundle that is inserted within the N domain and it plays a role in the dimer interface and interacts directly with domain B. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key.
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 355
29384 200483 cd11345 AmyAc_SLC3A2 Alpha amylase catalytic domain found in solute carrier family 3 member 2 proteins. 4F2 cell-surface antigen heavy chain (hc) is a protein that in humans is encoded by the SLC3A2 gene. 4F2hc is a multifunctional type II membrane glycoprotein involved in amino acid transport and cell fusion, adhesion, and transformation. It is related to bacterial alpha-glycosidases, but lacks alpha-glycosidase activity. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 326
29385 200484 cd11346 AmyAc_plant_IsoA Alpha amylase catalytic domain family found in plant isoamylases. Two types of debranching enzymes exist in plants: isoamylase-type (EC 3.2.1.68) and a pullulanase-type (EC 3.2.1.41, also known as limit-dextrinase). These efficiently hydrolyze alpha-(1,6)-linkages in amylopectin and pullulan. This group does not contain the conserved catalytic triad present in other alpha-amylase-like proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 347
29386 200485 cd11347 AmyAc_1 Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 391
29387 200486 cd11348 AmyAc_2 Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The catalytic triad (DED) is not present here. The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 429
29388 200487 cd11349 AmyAc_3 Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 456
29389 200488 cd11350 AmyAc_4 Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 390
29390 200489 cd11352 AmyAc_5 Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 443
29391 200490 cd11353 AmyAc_euk_bac_CMD_like Alpha amylase catalytic domain found in eukaryotic and bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is mainly bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 366
29392 200491 cd11354 AmyAc_bac_CMD_like Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31).
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 357
29393 200492 cd11355 AmyAc_Sucrose_phosphorylase Alpha amylase catalytic domain found in sucrose phosphorylase (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 433
29394 200493 cd11356 AmyAc_Sucrose_phosphorylase-like_1 Alpha amylase catalytic domain found in sucrose phosphorylase-like proteins (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 458
29395 206766 cd11358 RNase_PH RNase PH-like 3'-5' exoribonucleases. RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites. 218
29396 200494 cd11359 AmyAc_SLC3A1 Alpha amylase catalytic domain found in Solute Carrier family 3 member 1 proteins. SLC3A1, also called Neutral and basic amino acid transport protein rBAT or NBAT, plays a role in amino acid and cystine absorption. Mutations in the gene encoding SLC3A1 cause cystinuria, an autosomal recessive disorder characterized by the failure of proximal tubules to reabsorb filtered cystine and dibasic amino acids. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase. 456
29397 206767 cd11362 RNase_PH_bact Ribonuclease PH. Ribonuclease PH (RNase PH)-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Structurally all members of this family form hexameric rings (trimers of dimers). Bacterial RNase PH forms a homohexameric ring, and removes nucleotide residues following the -CCA terminus of tRNA. 227
29398 206768 cd11363 RNase_PH_PNPase_1 Polyribonucleotide nucleotidyltransferase, repeat 1. Polyribonucleotide nucleotidyltransferase (PNPase) is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally, all members of this family form hexameric rings. In the case of PNPase the complex is a trimer, since each monomer contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction and in quality control of ribosomal RNA precursors. It is part of the RNA degradosome complex and binds to the scaffolding domain of the endoribonuclease RNase E. 229
29399 206769 cd11364 RNase_PH_PNPase_2 Polyribonucleotide nucleotidyltransferase, repeat 2. Polyribonucleotide nucleotidyltransferase (PNPase) is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally, all members of this family form hexameric rings. In the case of PNPase the complex is a trimer, since each monomer contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction and in quality control of ribosomal RNA precursors, with the second repeat containing the active site. PNPase is part of the RNA degradosome complex and binds to the scaffolding domain of the endoribonuclease RNase E. 223
29400 206770 cd11365 RNase_PH_archRRP42 RRP42 subunit of archaeal exosome. The RRP42 subunit of the archaeal exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of dimers). In archaea, the ring is formed by three Rrp41:Rrp42 dimers. The central chamber within the ring contains three phosphorolytic active sites located in an Rrp41 pocket at the interface between Rrp42 and Rrp41. The ring is capped by three copies of Rrp4 and/or Csl4 which contain putative RNA interaction domains. The archaeal exosome degrades single-stranded RNA (ssRNA) in the 3'-5' direction, but also can catalyze the reverse reaction of adding nucleoside diphosphates to the 3'-end of RNA which has been shown to lead to the formation of poly-A-rich tails on RNA. It is required for 3' processing of the 5.8S rRNA. 256
29401 206771 cd11366 RNase_PH_archRRP41 RRP41 subunit of archaeal exosome. The RRP41 subunit of the archaeal exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of dimers). In archaea, the ring is formed by three Rrp41:Rrp42 dimers. The central chamber within the ring contains three phosphorolytic active sites located in an Rrp41 pocket at the interface between Rrp42 and Rrp41. The ring is capped by three copies of Rrp4 and/or Csl4 which contain putative RNA interaction domains. The archaeal exosome degrades single-stranded RNA (ssRNA) in the 3'-5' direction, but also can catalyze the reverse reaction of adding nucleoside diphosphates to the 3'-end of RNA which has been shown to lead to the formation of poly-A-rich tails on RNA. 214
29402 206772 cd11367 RNase_PH_RRP42 RRP42 subunit of eukaryotic exosome. The RRP42 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 272
29403 206773 cd11368 RNase_PH_RRP45 RRP45 subunit of eukaryotic exosome. The RRP45 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 259
29404 206774 cd11369 RNase_PH_RRP43 RRP43 subunit of eukaryotic exosome. The RRP43 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 261
29405 206775 cd11370 RNase_PH_RRP41 RRP41 subunit of eukaryotic exosome. The RRP41 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 226
29406 206776 cd11371 RNase_PH_MTR3 MTR3 subunit of eukaryotic exosome. The MTR3 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 210
29407 206777 cd11372 RNase_PH_RRP46 RRP46 subunit of eukaryotic exosome. The RRP46 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes, the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts. 199
29408 200603 cd11374 CE4_u10 Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. The family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups of cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. 226
29409 213029 cd11375 Peptidase_M54 Peptidase family M54, also called archaemetzincins or archaelysins. Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events. 173
29410 271138 cd11376 Imelysin-like imelysin also called Peptidase M75. This family includes insulin-cleaving membrane protease (imelysin, ICMP), imelysin-like protein (IPPA from Psychrobacter arcticus), iron-regulated protein A (IrpA) and iron-transporter EfeO-like alginate-binding protein (Algp7). Imelysin is a membrane protein with the active site outside the cell envelope. It is also called the peptidase M75 since the HxxE sequence motif characteristic of the M14 peptidase is completely conserved. However, the overall structure and the GxHxxE motif region differ from the known HxxE metallopeptidases, suggesting that imelysin-like proteins may not be peptidases. Imelysin's cleavage of the oxidized insulin B chain shows a preference for aromatic hydrophobic amino acids at P1'. Imelysin was first identified in Pseudomonas aeruginosa and has also been shown to cleave fibrinogen. The tertiary structure shows a fold consisting of two domains, each consisting of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event. In addition to an imelysin-like domain, Algp7 typically contains an N-terminal cupredoxin (CUP) domain and has a deep cleft between the 4-helix bundles sufficiently large to accommodate macromolecules such as alginate polysaccharide. 253
29411 206778 cd11377 Pro-peptidase_S53 Activation domain of S53 peptidases. Members of this family are found in various subtilase propeptides, such as pro-kumamolysin and tripeptidyl peptidase I, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase. 139
29412 211390 cd11378 DUF296 Domain of unknown function found in archaea, bacteria, and plants. This domain is found in proteins that contain AT-hook motifs, which suggests a role in DNA-binding for the proteins as a whole. Three conserved histidine residues appear to form a zinc-binding site, and the domain has been observed to form homotrimers. It co-occurs with a thioredoxin-like domain in uncharacterized cyanobacterial proteins. 113
29413 211391 cd11379 DUF4425 Uncharacterized protein conserved in Bacteroidetes. This family appears to form homodimers; the 3D structure has been determined by both NMR and X-ray crystallography. 119
29414 211392 cd11380 Ribosomal_S8e_like Eukaryotic/archaeal ribosomal protein S8e and similar proteins. This family contains the eukaryotic/archaeal ribosomal protein S8, a component of the small ribosomal subunits, as well as the NSA2 gene product. 138
29415 211393 cd11381 NSA2 pre-ribosomal protein NSA2 (Nop seven-associated 2). NSA2 appears to be a protein required for the maturation of 27S pre-rRNA in yeast; it has been characterized in mammalian cells as a nucleolar protein that might play a role in the regulation of the cell cycle and in cell proliferation. 257
29416 211394 cd11382 Ribosomal_S8e Eukaryotic/archaeal ribosomal protein S8e (RPS8). The eukaryotic/archaeal ribosomal protein S8 is a component of the small (40S in eukaryotes, 30S in archaea) ribosomal subunits and interacts tightly with 18S rRNA (16S rRNA in archaea, presumably). 122
29417 206743 cd11383 YfjP YfjP GTPase. The Era (E. coli Ras-like protein)-like YfjP subfamily includes several uncharacterized bacterial GTPases that are similar to Era. They generally show sequence conservation in the region between the Walker A and B motifs (G1 and G3 box motifs), to the exclusion of other GTPases. Era is characterized by a distinct derivative of the KH domain (the pseudo-KH domain) which is located C-terminal to the GTPase domain. 140
29418 206744 cd11384 RagA_like Rag GTPase, subfamily of Ras-related GTPases, includes Ras-related GTP-binding proteins A and B. RagA and RagB are closely related Rag GTPases (ras-related GTP-binding protein A and B) that constitute a unique subgroup of the Ras superfamily, and are functional homologs of Saccharomyces cerevisiae Gtr1. These domains function by forming heterodimers with RagC or RagD, and similarly, Gtr1 dimerizes with Gtr2, through the carboxy-terminal segments. They play an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7. 286
29419 206745 cd11385 RagC_like Rag GTPase, subfamily of Ras-related GTPases, includes Ras-related GTP-binding proteins C and D. RagC and RagD are closely related Rag GTPases (ras-related GTP-binding protein C and D) that constitute a unique subgroup of the Ras superfamily, and are functional homologs of Saccharomyces cerevisiae Gtr2. These domains form heterodimers with RagA or RagB, and similarly, Gtr2 dimerizes with Gtr1 in order to function. They play an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7. 175
29420 206779 cd11386 MCP_signal Methyl-accepting chemotaxis protein (MCP), signaling domain. Methyl-accepting chemotaxis proteins (MCPs or chemotaxis receptors) are an integral part of the transmembrane protein complex that controls bacterial chemotaxis, together with the histidine kinase CheA, the receptor-coupling protein CheW, receptor-modification enzymes, and localized phosphatases. MCPs contain a four-helix transmembrane region, an N-terminal periplasmic ligand binding domain, and a C-terminal HAMP domain followed by a cytoplasmic signaling domain. This C-terminal signaling domain dimerizes into a four-helix bundle and interacts with CheA through the adaptor protein CheW. 200
29421 381393 cd11387 bHLHzip_USF_MITF basic Helix-Loop-Helix-zipper (bHLHzip) domain found in USF/MITF family. The USF (upstream stimulatory factor)/MITF (microphthalmia-associated transcription factor) family includes two bHLHzip transcription factor subfamilies. USFs are ubiquitously expressed and key regulators of a wide number of gene regulation networks, including the stress and immune responses, cell cycle and proliferation, lipid and glucid metabolism. USFs recruit chromatin remodeling enzymes and interact with co-activators and the members of the transcription pre-initiation complex. USFs interact with high affinity to E-box regulatory elements. The MITF (also known as microphthalmia-TFE, or MiT) subfamily comprises four genes in mammals (MITF, TFE3, TFEB, and TFEC); each gene has different functions. MITF is involved in neural crest melanocytes development as well as the pigmented retinal epithelium. TFEB is required for vascularization of the mouse placenta. TFE3 is involved in B cell function. TFEC regulates gene expression in macrophages. The MITF subfamily proteins can form homodimers or heterodimers with each other but not with other bHLH or bHLHzip proteins. 58
29422 381394 cd11388 bHLH_ScINO2_like basic helix-loop-helix (bHLH) domain found in Saccharomyces cerevisiae protein INO2 and similar proteins. INO2 is a positive regulatory factor required for derepression of the co-regulated phospholipid biosynthetic enzymes in Saccharomyces cerevisiae. It is also involved in the expression of ITR1. 68
29423 381395 cd11389 bHLH-O_HERP_like basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES)-related repressor protein (HERP)-like family. The HERP-like family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split proteins. They contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and YXXW sequence motif at its C-terminal region. HERP proteins (HEY1, HEY2 and HEYL) act as downstream effectors of Notch signaling. They are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis. Hairy and enhancer of split-related protein HELT is a transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. Differentially expressed in chondrocytes proteins, DEC1 and DEC2, are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers. Drosophila melanogaster protein clockwork orange (Cwo) is also included in this family. It is involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression. 55
29424 381396 cd11390 bHLH_TS tissue specific basic helix-loop-helix (bHLH-TS) domain family. Tissue specific bHLH domain family includes transcription regulators whose expression is restricted to certain tissues. They are involved in cell-fate determination and process in neurogenesis, cardiogenesis, myogenesis, and hematopoiesis and include proteins from myogenic regulatory factor (MRF) family, twist-related protein (TWIST) family, scleraxis-like family, heart- and neural crest derivatives-expressed protein (HAND) family, helix-loop-helix protein (HEN) family, musculin-like family, germline alpha (FIGLA) family, T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family, ovary, uterus and testis protein (OUT) family, mesoderm posterior protein (Mesp) family, muscle, intestine and stomach expression 1 (MIST-1) family, protein atonal homologs (ATOH) family, neurogenin (NGN) family, neurogenic differentiation factor (NeuroD) family, achaete-scute complex-like (ASCL) family, Fer3-like protein (FERD3L)-like family, and Oligodendrocyte lineage genes (OLIG) family of transcription factors. 55
29425 381397 cd11391 bHLH_PAS basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain family. bHLH-PAS domain has been found in a large group of bHLH transcription regulators that are involved in gene expression responding to environmental change and controlling aspects of neural development, including proteins from aryl hydrocarbon receptor nuclear translocator (ARNT) family, hypoxia-inducible factor (HIF) family, aryl hydrocarbon receptor (AhR) family, neuronal PAS domain-containing protein (NPAS) family, Circadian locomotor output cycles protein kaput (CLOCK)-like family, and single-minded (SIM) family. bHLH-PAS transcriptional regulatory factors have a bHLH DNA-binding domain followed by two PAS domains and a C-terminal activation or repression domain. bHLH-PAS family members can be divided into class I and class II based on their dimerization partner. bHLH-PAS class I factors include AhR, HIF and SIM. The best characterized bHLH-PAS Class II protein is the ubiquitous ARNT. Some members of bHLH-PAS family act as transcriptional coactivators (such as NCoA) that lack the ability to dimerize and bind DNA. 55
29426 381398 cd11392 bHLH_ScPHO4_like basic helix-loop-helix (bHLH) domain found in Saccharomyces cerevisiae phosphate system positive regulatory protein PHO4 and similar proteins. PHO4 is a transcriptional activator that regulates the expression of repressible phosphatase under phosphate starvation conditions in Saccharomyces cerevisiae. The PHO4 protein has four functional domains with the bHLH domain at its carboxyl-terminal region. It regulates transcription by binding to promoter of the genes as a homodimer. 80
29427 381399 cd11393 bHLH_AtbHLH_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana genes coding transcription factors and similar proteins. bHLH proteins are the second largest class of plant transcription factors that regulate transcription of genes that are involved in many essential physiological and developmental processes. bHLH proteins are transcriptional regulators that are found in organisms from yeast to humans. The Arabidopsis bHLH proteins that have been characterized so far have roles in regulation of fruit dehiscence, cell development (carpel, anther and epidermal), phytochrome signaling, flavonoid biosynthesis, hormone signaling and stress responses. 53
29428 381400 cd11394 bHLHzip_SREBP basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein (SREBP) family. The SREBP family includes SREBP1 and SREBP2, which are bHLHzip transcriptional activators of genes encoding proteins essential for cholesterol biosynthesis/uptake and fatty acid biosynthesis. SREBP1 and SREBP2 are principally found in the liver and in adipocytes and made up of an N-terminal transcription factor portion (composed of an activation domain, a bHLHzip domain, and a nuclear localization signal), a hydrophobic region containing two membrane spanning regions, and a C-terminal regulatory segment. They recognize a symmetric sterol regulatory element (TCACNCCAC) instead of E-box. 73
29429 381401 cd11395 bHLHzip_SREBP_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein (SREBP) family and similar proteins. The SREBP family includes SREBP1 and SREBP2, which are bHLHzip transcriptional activator of genes encoding proteins essential for cholesterol biosynthesis/uptake and fatty acid biosynthesis. SREBP1 and SREBP2 are principally found in the liver and in adipocytes and made up of an N-terminal transcription factor portion (composed of an activation domain, a bHLHzip domain, and a nuclear localization signal), a hydrophobic region containing two membrane spanning regions, and a C-terminal regulatory segment. They recognize a symmetric sterol regulatory element (TCACNCCAC) instead of E-box. The family also includes Saccharomyces cerevisiae transcription factor HMS1 (also termed high-copy MEP suppressor protein 1) and serine-rich protein TYE7. HMS1 is a putative bHLHzip transcription factor involved in exit from mitosis and pseudohyphal differentiation. TYE7, also termed basic-helix-loop-helix protein SGC1, is a putative bHLHzip transcription activator required for Ty1-mediated glycolytic gene expression. TYE7 N-terminal is extremely rich in serine residues. It binds DNA on E-box motifs, 5'-CANNTG-3'. TYE7 is not essential for growth. 87
29430 381402 cd11396 bHLHzip_USF basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factors, USF1, USF2 and similar proteins. Upstream stimulatory factor 1 and 2 (USF-1 and USF-2) are members of bHLHzip transcription factor family. USFs are ubiquitously expressed and key regulators of a wide number of gene regulation networks, including the stress and immune responses, cell cycle and proliferation, lipid and glucid metabolism. USFs recruit chromatin remodeling enzymes and interact with co-activators and the members of the transcription pre-initiation complex. USFs interact with high affinity to E-box regulatory elements. 58
29431 381403 cd11397 bHLHzip_MITF_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the microphthalmia-associated transcription factor (MITF) family. The MITF (also known as microphthalmia-TFE, or MiT) family is a small family whose members contain a basic helix loop helix domain associated with a leucine zipper (bHLHZip). The MITF family comprises four genes in mammals (MITF, TFE3, TFEB, and TFEC); each gene has different functions. MITF is involved in neural crest melanocytes development as well as the pigmented retinal epithelium. TFEB is required for vascularization of the mouse placenta. TFE3 is involved in B cell function. TFEC regulates gene expression in macrophages. The MITF family can form homodimers or heterodimers with each other but not with other bHLH or bHLHzip proteins. 69
29432 381404 cd11398 bHLHzip_scCBP1 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Saccharomyces cerevisiae centromere-binding protein 1 (CBP-1) and similar proteins. CBP-1, also termed centromere promoter factor 1 (CPF1), or centromere-binding factor 1 (CBF1), is a bHLHzip protein that is required for chromosome stability and methionine prototrophy. It binds as a homodimer to the centromere DNA elements I (CDEI, GTCACATG) region of the centromere that is required for optimal centromere function. 89
29433 381405 cd11399 bHLHzip_scHMS1_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Saccharomyces cerevisiae transcription factor HMS1 and similar proteins. HMS1, also termed high-copy MEP suppressor protein 1, is a putative bHLHzip transcription factor involved in exit from mitosis and pseudohyphal differentiation. 96
29434 381406 cd11400 bHLHzip_Myc basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the Myc family. The Myc family is a member of the bHLHzip family of transcription factors that play important roles in the control of normal cell proliferation, growth, survival and differentiation. All Myc isoforms contain two independently functioning polypeptide chain regions: N-terminal transactivating residues and a C-terminal bHLHzip segment. The bHLHzip family of bHLH transcription factors are characterized by a highly conserved N-terminal basic region that may bind DNA at a consensus hexanucleotide sequence known as the E-box (CANNTG) followed by HLH and leucine zipper motifs that may interact with other proteins to form homo- and heterodimers. Myc heterodimerizes with Max enabling specific binding to E-box DNA sequences in the promoters of target genes. The Myc proto-oncoprotein family includes at least five different functional members: c-, N-, L-, S- and B-Myc (which is lacking the bHLH domain). 80
29435 381407 cd11401 bHLHzip_Mad basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the Mad family. Members of the Mad family (Mad1, Mxi, Mad3, and Mad4) bear the bHLHzip domain (also known as basic-helix-loop-helix-leucine-zipper or bHLH-LZ domain), which mediates heterodimerization with Max and sequence-specific binding to E-box DNA. Mad family proteins can repress transcription at the E-box through their interaction with co-repressors. Mad family proteins antagonize Myc function in transactivation and transformation and they are growth/tumor suppressors. The developmental phenotypes of the individual Mad family member knockout mice are relatively mild; all these mice have been shown to be viable and normal. 76
29436 381408 cd11402 bHLHzip_Mnt basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-binding protein Mnt and similar proteins. Mnt, also termed Class D basic helix-loop-helix protein 3 (bHLHd3), or Myc antagonist MNT, or protein ROX, is a bHLHZip transcriptional repressor that binds DNA as a heterodimer with MAX. It binds to the canonical E box sequence 5'-CACGTG-3' and, with higher affinity, to 5'-CACGCG-3'. Mnt has an important role as an antagonist and regulator of Myc activities and it is a potential tumor suppressor. Mnt is ubiquitously expressed. Mnt-deficient mice have been shown to exhibit early postnatal lethality. 77
29437 381409 cd11403 bHLH_scINO4_like basic Helix-Loop-Helix (bHLH) domain found in Saccharomyces cerevisiae INO4 and similar proteins. INO4 is a bHLH transcriptional activator of phospholipid synthetic genes (such as INO1, CHO1/PSS, CHO2/PEM1, OPI3/PEM2, etc.). It is required for de-repression of phospholipid biosynthetic gene expression in response to inositol deprivation in yeast. INO4 dimerizes with INO2 and binds to an UAS DNA element to control expression of the genes whose expression is inositol-responsive. 71
29438 381410 cd11404 bHLHzip_Mlx_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-like protein X (Mlx) family. Mlx, also termed Class D basic helix-loop-helix protein 13 (bHLHd13), or Max-like bHLHZip protein, or protein BigMax, or transcription factor-like protein 4, is a Max-like bHLHZip transcription regulator that interacts with the Max network of transcription factors. It forms a sequence-specific DNA-binding protein complex with some members of the Mad family (Mad1 and Mad4) and the Mondo family, but not the Myc family, and binds E-box DNA to control transcription. The family also includes Saccharomyces cerevisiae INO4, which is a bHLH transcriptional activator of phospholipid synthetic genes (such as INO1, CHO1/PSS, CHO2/PEM1, OPI3/PEM2, etc.). It is required for de-repression of phospholipid biosynthetic gene expression in response to inositol deprivation in yeast. 70
29439 381411 cd11405 bHLHzip_MLXIP_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein (MLXIP), MLX-interacting protein-like (MLXIPL) and similar proteins. The family includes MLXIP and MLXIPL. MLXIP, also termed Class E basic helix-loop-helix protein 36 (bHLHe36), or transcriptional activator MondoA, is a bHLHZip transcriptional activator that binds DNA as a heterodimer with Mlx. It binds to the canonical E box sequence 5'-CACGTG-3' and plays a role in transcriptional activation of glycolytic target genes. MLXIP is most highly expressed in skeletal muscle and functions as an indirect glucose sensor, by sensing glucose 6-phosphate and shuttling between the nucleus and the cytoplasm. MLXIPL, also termed carbohydrate-responsive element-binding protein (ChREBP), or Class D basic helix-loop-helix protein 14 (bHLHd14), or MLX interactor, or WS basic-helix-loop-helix leucine zipper protein (WS-bHLH), or Williams-Beuren syndrome chromosomal region 14 protein (WBSCR14), is a bHLHZip transcriptional factor integral to the regulation of glycolysis and lipogenesis in the liver. It forms heterodimers with the bHLHZip protein Mlx to bind the DNA sequence 5'-CACGTG-3'. 74
29440 381412 cd11406 bHLHzip_Max basic Helix-Loop-Helix-zipper (bHLHzip) domain found in protein Max and similar proteins. Max, also termed Class D basic helix-loop-helix protein 4 (bHLHd4), or Myc-associated factor X, is a bHLHZip transcription regulator that forms a sequence-specific DNA-binding protein complex with MYC or MAD which recognizes the core sequence 5'-CAC[GA]TG-3'. The MYC:MAX complex is a transcriptional activator, whereas the MAD:MAX complex is a transcriptional repressor. The Max homodimer binds DNA but is transcriptionally inactive. Targeted deletion of max results in early embryonic lethality in mice. 69
29441 381413 cd11407 bHLH-O_HERP basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES)-related repressor protein (HERP) family. HERP (also called Hey/Hesr/HRT/CHF/gridlock) proteins correspond to a family of bHLH-O transcriptional repressors that are related to the Drosophila hairy and Enhancer-of-split proteins and act as downstream effectors of Notch signaling. They contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and a YXXW sequence motif at its C-terminal region. HERP proteins are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis. 59
29442 381414 cd11408 bHLH-O_HELT basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split-related protein HELT and similar proteins. HELT, also termed HES/HEY-like transcription factor, is a bHLH-O transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. HELT can homodimerize, or heterodimerize with other bHLH-O proteins such as HES-5 or HEY-2, and bind to the E box to repress gene transcription. 56
29443 381415 cd11409 bHLH-O_DEC basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein (DEC) family. The DEC family includes two bHLH-O transcriptional repressors, DEC1 and DEC2, which are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers. They mediate the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. They are induced by CLOCK:BMAL1 heterodimer via the CACGTG E-box in the promoter. 75
29444 381416 cd11410 bHLH_O_HES basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES) family. The HES family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split (HES) proteins. They contain a basic helix-loop-helix (bHLH) domain with an invariant proline residue in its basic region, an orange domain in the central region and a conserved tetrapeptide motif, WRPW, at its C-terminal region. HES family proteins form heterodimers or homodimers via their HLH domain and bind DNA to repress gene transcription. They play an essential role in the development of both compartment and boundary cells of the central nervous system. 54
29445 381417 cd11411 bHLH_TS_MRF basic helix-loop-helix (bHLH) domain found in myogenic regulatory factor (MRF) family. MRFs are a family of muscle-specific bHLH transcription factors (MyoD, Myf5, Mrf4 and MyoG) that play an essential role in regulating skeletal muscle development and growth. MRFs are capable of binding to E-box motifs as heterodimers with E-proteins to regulate transcription. 56
29446 381418 cd11412 bHLH_TS_TWIST1 basic helix-loop-helix (bHLH) domain found in twist-related protein 1 (TWIST1) and similar proteins. TWIST1, also termed Class A basic helix-loop-helix protein 38 (bHLHa38), or H-twist, is a bHLH transcriptional regulator that inhibits myogenesis by sequestrating E proteins, inhibiting trans-activation by MEF2, and inhibiting DNA-binding by MYOD1 through physical interaction. It also represses expression of proinflammatory cytokines such as TNFA and IL1B. In addition, TWIST1 is involved in cancer development and progression. 77
29447 381419 cd11413 bHLH_TS_TAL_LYL basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family. The TAL/LYL family includes a group of bHLH transcription factors (TAL1, TAL2 and LYL1) implicated in T cell acute leukaemia. They act as mediators of T cell leukaemogenesis. TAL-1, also termed Class A basic helix-loop-helix protein 17 (bHLHa17), or stem cell protein (SCL), or T-cell leukemia/lymphoma protein 5, is a hematopoietic-specific bHLH transcription factor that functions in embryonic and adult hematopoiesis in vertebrates. It is also required for embryonic vascular remodeling. It acts as a regulator of erythroid differentiation and binds to regulatory regions of a large cohort of erythroid genes as part of a complex with GATA-1, LMO2 and Ldb1. TAL-2, also termed Class A basic helix-loop-helix protein 19 (bHLHa19), is a bHLH transcription factor essential for the normal brain development. Lyl-1, also termed Class A basic helix-loop-helix protein 18 (bHLHa18), or lymphoblastic leukemia-derived sequence 1, is a proto-oncogenic bHLH transcription factor that plays an important role in hematopoietic stem cell function and is required for the late stages of postnatal angiogenesis to limit the formation of new blood vessels, notably by regulating the activity of the small GTPase Rap1. LYL-1 deficiency induces a stress erythropoiesis. 60
29448 381420 cd11414 bHLH_TS_HEN basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein (HEN) family. The HEN family includes two neuron-specific bHLH transcription factors, HEN-1 (also known as Nhlh1 or bHLHa35 or NSCL-1) and HEN-2 (also known as Nhlh2 or bHLHa34 or NSCL-2). They may serve as DNA-binding proteins that are involved in the control of cell-type determination, possibly within the developing nervous system. 57
29449 381421 cd11415 bHLH_TS_FERD3L_NATO3 basic helix-loop-helix (bHLH) domain found in Fer3-like protein (FERD3L) and similar proteins. FERD3L, also termed basic helix-loop-helix protein N-twist, or Class A basic helix-loop-helix protein 31 (bHLHa31), or nephew of atonal 3 (NATO3), or Neuronal twist (NTWIST), is a bHLH transcription factor expressed in the developing central nervous system (CNS). It regulates floor plate (FP) cell development. The FP is a critical organizing center located at the ventral-most midline of the neural tube. FERD3L binds to the E-box and functions as an inhibitor of transcription. 64
29450 381422 cd11416 bHLH_TS_ceHLH13_like basic helix-loop-helix (bHLH) domain found in Caenorhabditis elegans Helix-loop-helix protein 13 (HLH13) and similar proteins. Caenorhabditis elegans HLH13, also termed Fer3-like protein, or nephew of atonal 3, is a bHLH transcription factor that plays a role in the negative regulation of exit from L1 arrest and dauer diapause dependent on IIS signaling (insulin and insulin-like growth factor (IGF) signaling). 63
29451 381423 cd11417 bHLH_TS_PTF1A basic helix-loop-helix (bHLH) domain found in pancreas transcription factor 1 subunit alpha (PTF1A) and similar proteins. PTF1A, also termed Class A basic helix-loop-helix protein 29 (bHLHa29), or pancreas-specific transcription factor 1a, or bHLH transcription factor p48, or p48 DNA-binding subunit of transcription factor PTF1 (PTF1-p48), is a bHLH transcription factor implicated in the cell fate determination in various organs. It binds to the E-box consensus sequence 5'-CANNTG-3' and plays a role in early and late pancreas development and differentiation. 56
29452 381424 cd11418 bHLH_TS_ASCL basic helix-loop-helix (bHLH) domain found in achaete-scute complex-like (ASCL) family. The achaete-scute complex-like (ASCL, also known as achaete-scute complex homolog or ASH) family of bHLH transcription factors, ASCL1-5, have been implicated in cell fate specification and differentiation. They are critical for proper development of the nervous system. The deregulation of ASCL plays a key role in psychiatric and neurological disorders. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete-scute homolog 1 (Mash1), is a neural-specific bHLH transcription factor that is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete-scute homolog 2 (Mash2), is a bHLH transcription factor that is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is a bHLH transcription factor specifically localized in the duct cells of the salivary glands. It may act as transcriptional repressor that inhibits myogenesis. The family also includes Drosophila melanogaster achaete-scute complex (AS-C) proteins, which consists of lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation. 56
29453 381425 cd11419 bHLHzip_TFAP4 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor AP-4 (TFAP4) and similar proteins. TFAP4, also termed activating enhancer-binding protein 4, or Class C basic helix-loop-helix protein 41 (bHLHc41), is a bHLHzip transcription factor that activates both viral and cellular genes involved in the regulation of cellular proliferation, stemness, and epithelial-mesenchymal transition by binding to the symmetrical DNA sequence 5'-CAGCTG-3'. 61
29454 381426 cd11420 bHLH_E-protein basic helix-loop-helix (bHLH) domain found in E proteins family. The E proteins family corresponds to class I bHLH proteins, which are widely expressed within the immune system. Members in this family include E2A (also referred to as TCF-3), E47, TCF-12 (also referred to as HEB), and TCF-4 (also referred to as E2-2) in vertebrates, as well as the E protein ortholog, Daughterless (Da), from Drosophila melanogaster. E-proteins are expressed broadly and in certain complexes they are restricted to specific cell types. E-proteins homodimerize and heterodimerize with the tissue specific bHLH factors to bind DNA and regulate transcription and differentiation of cells during development. The activity of the E-proteins is regulated by two main mechanisms: first, the relative concentrations of E-proteins, tissue specific bHLH factors, and the Id proteins, and second, covalent modification. 47
29455 381427 cd11421 bHLH_TS_ATOH8 basic helix-loop-helix (bHLH) domain found in protein atonal homolog 8 (ATOH8) and similar proteins. ATOH8, also termed Class A basic helix-loop-helix protein 21 (bHLHa21), or helix-loop-helix protein hATH-6 (hATH6), is a bHLH shear-stress-responsive transcription factor expressed in activated satellite cells and proliferating myoblasts of human skeletal muscle tissue. It regulates endothelial cell proliferation, migration and tube-like structure formation. ATOH8 binds a palindromic (canonical) core consensus DNA sequence 5'-CANNTG-3' known as an E-box element, possibly as a heterodimer with other bHLH proteins. 68
29456 381428 cd11422 bHLH_TS_FIGLA basic helix-loop-helix (bHLH) domain found in factor in the germline alpha (FIGLA) and similar proteins. FIGLA, also termed FIGalpha, or Class C basic helix-loop-helix protein 8 (bHLHc8), or folliculogenesis-specific basic helix-loop-helix protein, or transcription factor FIGa, is a germ-cell-specific bHLH transcription factor expressed abundantly in female and less so in male germ cells. It is essential for primordial follicle formation and expression of many genes required for folliculogenesis, fertilization and early embryonic survival. FIGLA knockout mice cannot form primordial follicles and lose oocytes rapidly after birth, whereas male gonads are unaffected. 56
29457 381429 cd11423 bHLH_TS_musculin_like basic helix-loop-helix (bHLH) domain found in musculin, transcription factor 21 (TCF-21) and similar proteins. The family includes two bHLH transcription factors, musculin and transcription factor 21 (TCF-21). Musculin, also termed activated B-cell factor 1 (ABF-1), or Class A basic helix-loop-helix protein 22 (bHLHa22), is a bHLH transcription factor expressed in activated B lymphocytes. It acts as a transcription repressor capable of inhibiting the transactivation capability of TCF3/E47. Musculin may play a role in regulating antigen-dependent B-cell differentiation. The mouse homolog, musculin, is suggested to be a repressor of myogenesis that is expressed in developing muscle and in the spleen. TCF-21, also termed capsulin, or Class A basic helix-loop-helix protein 23 (bHLHa23), or epicardin, or podocyte-expressed 1 (Pod-1), is a bHLH transcription factor expressed specifically in mesodermally-derived cells that surround the epithelium of the developing gastrointestinal, genitourinary and respiratory systems during mouse embryogenesis. It may play a role in the specification or differentiation of one or more subsets of epicardial cell types. 56
29458 381430 cd11424 bHLH_TS_OUT basic helix-loop-helix (bHLH) domain found in ovary, uterus and testis protein (OUT) family. The OUT family includes transcription factor 23 (TCF-23), transcription factor 24 (TCF-24) and similar proteins. TCF-23, also termed Class A basic helix-loop-helix protein 24 (bHLHa24), is a bHLH transcription factor that is essential for progesterone-dependent decidualization. The mouse homolog is also called ovary, uterus and testis protein (OUT), which is expressed predominantly in the reproductive organs such as the uterus, ovary and testis. It shows an Id-like inhibitory activity and functions as a negative regulator of bHLH factors through the formation of a functionally inactive heterodimeric complex. OUT inhibits the formation of TCF3 and MYOD1 homodimers and heterodimers, but lacks DNA binding activity. OUT is involved in the regulation or modulation of smooth muscle contraction of the uterus during pregnancy and particularly around the time of delivery. It also plays a role in the inhibition of myogenesis. Unlike typical bHLH factors, OUT proteins do not bind E-box (CANNTG) or N-box DNA sequences and inhibit DNA binding of homo- and heterodimers consisting of E12 and MyoD in gel mobility shift assays. TCF-24 is an uncharacterized bHLH transcription factor that shows high sequence similarity with TCF-23. 55
29459 381431 cd11425 bHLH_TS_Mesp_like basic helix-loop-helix (bHLH) domain found in mesoderm posterior protein (Mesp) family. Mesp, a bHLH tissue specific transcription factor, acts as a key regulator of the cardiovascular transcriptional network by inducing directly and/or indirectly the expression of the majority of key cardiovascular transcription factors. The Mesp family includes two bHLH transcription factors, Mesp1 and Mesp2. Mesp1, also termed Class C basic helix-loop-helix protein 5 (bHLHc5), promotes cardiovascular differentiation during embryonic development and embryonic stem cell differentiation. Mesp2, also termed Class C basic helix-loop-helix protein 6 (bHLHc6), plays an important role in somitogenesis. The family also includes mesogenin-1 (Msgn1) and similar proteins. Msgn1, also termed paraxial mesoderm-specific mesogenin1, or pMesogenin1 (pMsgn1), is a bHLH transcription factor required for maturation and segmentation of paraxial mesoderm. It may regulate the expression of T-box transcription factors essential for mesoderm formation and differentiation. 59
29460 381432 cd11426 bHLH_TS_MIST1_like basic helix-loop-helix (bHLH) domain found in muscle, intestine and stomach expression 1 (MIST-1) family. MIST-1, also termed Class A basic helix-loop-helix protein 15 (bHLHa15), or Class B basic helix-loop-helix protein 8 (bHLHb8), is a bHLH transcription factor expressed in pancreatic acinar cells and other serous exocrine cells. It is essential for cytoskeletal organization and secretory activity. It also functions as a potent endoplasmic reticulum (ER) stress-inducible transcriptional regulator. MIST-1 is capable of binding to E-box (CANNTG) motifs as a homodimer or a heterodimer with E-proteins (E12 and E47) to regulate transcription. The family also includes Drosophila melanogaster protein dimmed and similar proteins. Dimmed, also termed DIMM, is a bHLH transcription factor that regulates neurosecretory (NS) cell function and neuroendocrine cell fate in Drosophila. 56
29461 381433 cd11427 bHLH_TS_NeuroD basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor (NeuroD) family. The NeuroD family includes NeuroD1, NeuroD2, NeuroD4 and NeuroD6. NeuroD1, also termed Class A basic helix-loop-helix protein 3 (bHLHa3), is a neuronal bHLH transcription factor involved in the development and maintenance of the endocrine pancreas and neuronal elements. It acts as an essential regulator of glutamatergic neuronal differentiation. Loss of NeuroD1 causes ataxia, cerebellar hypoplasia, sensorineural deafness, and severe retinal dystrophy in mice. NeuroD2, also termed Class A basic helix-loop-helix protein 1 (bHLHa1), or NeuroD-related factor (NDRF), is a neuronal calcium-dependent bHLH transcription factor that induces neuronal differentiation and promotes neuronal survival. It plays a central role in thalamocortical synaptic maturation. NeuroD2 mediates calcium-dependent transcription activation by binding to E box-containing promoter. NeuroD4, also termed Class A basic helix-loop-helix protein 4 (bHLHa4), or protein atonal homolog 3 (ATH-3), or Atoh3, or Math-3, is a bHLH transcriptional activator that mediates neuronal differentiation. NeuroD6, also termed Class A basic helix-loop-helix protein 2 (bHLHa2), or protein atonal homolog 2 (ATH-2), or Atoh2, or Math2, or Nex1, is a neurogenic bHLH transcription factor involved in neuronal development, differentiation, and survival in Alzheimer's disease (AD) brains of both cohorts. It plays an integrative role in coordinating increase in mitochondrial mass with cytoskeletal remodeling, suggesting that it may act as a co-regulator of neuronal differentiation and energy metabolism. 55
29462 381434 cd11428 bHLH_TS_NGN basic helix-loop-helix (bHLH) domain found in neurogenin (NGN) family. The NGN family includes three neural-specific bHLH transcription factors, NGN1-3, which may function at neuroblast selection genes during the development of several neuronal lineages. NGN-1, also termed Class A basic helix-loop-helix protein 6 (bHLHa6), or neurogenic basic-helix-loop-helix protein, or neurogenic differentiation factor 3 (NeuroD3), is a neural-specific bHLH transcription factor involved in the initiation of neuronal differentiation. NGN-2, also termed Class A basic helix-loop-helix protein 8 (bHLHa8), or protein atonal homolog 4 (ATOH4), is a neural-specific bHLH transcription factor required for sensory neurogenesis. NGN-3, also termed Class A basic helix-loop-helix protein 7 (bHLHa7), or protein atonal homolog 5 (ATOH5), is a neural-specific bHLH transcription factor expressed in the developing central nervous system and the embryonic pancreas. It is involved in neurogenesis and plays an important role in spermatogenesis. 57
29463 381435 cd11429 bHLH_TS_OLIG basic helix-loop-helix (bHLH) domain found in Oligodendrocyte lineage genes (OLIG) family of transcription factors. The OLIG family includes three bHLH transcription factors, Oligo1-3, which are expressed in both the developing and mature central nervous system. Oligo1 and Oligo2 are expressed in a nervous tissue-specific manner, but Oligo3 is found mainly in non-neural tissues. Oligo (also known as Olig) have key roles in the specification of motor neurons, dorsal interneurons, and oligodendrocytes. Oligo1, also termed Class B basic helix-loop-helix protein 6 (bHLHb6), or Class E basic helix-loop-helix protein 21 (bHLHe21), promotes formation and maturation of oligodendrocytes, especially within the brain. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons. This family also includes two OLIG-related bHLH transcription factors, bHLHe22 and bHLHe23. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is a neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 have roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types. 61
29464 381436 cd11430 bHLH_TS_ATOH1_like basic helix-loop-helix (bHLH) domain found in protein atonal homologs ATOH1, ATOH7 and similar proteins. The family includes ATOH1 and ATOH7. ATOH1, also termed Class A basic helix-loop-helix protein 14 (bHLHa14), or helix-loop-helix protein hATH-1 (hATH1), or Math1, or Cath1, is a proneural bHLH transcription factor that is essential for inner ear hair cell differentiation. It dimerizes with E47 and activates E-box (CANNTG) dependent transcription. ATOH1 is a mammalian homolog of the Drosophila melanogaster gene atonal and mouse atonal homolog 1 (Math1). ATOH7, also termed Class A basic helix-loop-helix protein 13 (bHLHa13), or helix-loop-helix protein hATH-5 (hATH5), or Math5, is a bHLH transcription factor involved in the differentiation of retinal ganglion cells. The family also includes protein Amos (also termed absent MD neurons and olfactory sensilla protein, or reduced olfactory organs protein, or rough eye protein). It is a bHLH transcription factor that promotes multiple dendritic neuron formation in the Drosophila peripheral nervous system. 56
29465 381437 cd11431 bHLH_TS_taxi_Dei basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein taxi and similar proteins. Protein taxi, also termed protein delilah (Dei), is a bHLH transcription factor that is involved in the regulation of cell adhesion and attachment. It is expressed in specialized cells that provide anchoring sites to either muscles (tendon cells), or proprioceptors (chordotonal attachment cells) during embryonic development. It probably plays an important role in the differentiation of epidermal cells into the tendon cells that form the attachment sites for all muscles. 59
29466 381438 cd11432 bHLH-PAS_NPAS1_3_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing proteins, NPAS1, NPAS3 and similar proteins. The family includes neuronal PAS domain proteins NPAS1 and NPAS3, both of which are master regulators of neuropsychiatric function. NPAS1, also termed neuronal PAS1, or Basic-helix-loop-helix-PAS protein MOP5, or Class E basic helix-loop-helix protein 11 (bHLHe11), or member of PAS protein 5, or PAS domain-containing protein 5 (PASD5), is a bHLH-PAS transcriptional repressor expressed in the central nervous system and involved in neuronal differentiation. It is active during late embryogenesis and postnatal development. NPAS3, also termed neuronal PAS3, or Basic-helix-loop-helix-PAS protein MOP6, or Class E basic helix-loop-helix protein 12 (bHLHe12), or member of PAS protein 6, or PAS domain-containing protein 6 (PASD6), is a bHLH-PAS brain-enriched transcription factor that is involved in central nervous system development and neurogenesis. It is a replicated genetic risk factor for psychiatric disorders. Human chromosomal rearrangements that affect NPAS3 normal expression are associated with schizophrenia and mental retardation. 55
29467 381439 cd11433 bHLH-PAS_HIF basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor (HIF) family. The HIF family contains bHLH-PAS transcription regulators involved in oxygen homeostasis, including HIF1a, HIF2a, and HIF3a. They have been implicated in development, postnatal physiology as well as disease pathogenesis. HIF1a, also termed HIF-1-alpha, or HIF1-alpha, or ARNT-interacting protein, or Basic-helix-loop-helix-PAS protein MOP1, or Class E basic helix-loop-helix protein 78 (bHLHe78), or Member of PAS protein 1, or PAS domain-containing protein 8 (PASD8), functions as a master transcriptional regulator of the adaptive response to hypoxia. HIF2a, also termed HIF-2-alpha, or HIF2-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP2, or Class E basic helix-loop-helix protein 73 (bHLHe73), or Member of PAS protein 2, or PAS domain-containing protein 2 (PASD2), or HIF-1-alpha-like factor (HLF), is a bHLH-PAS transcription factor involved in the induction of oxygen regulated genes. HIF3a, also termed HIF-3-alpha, or HIF3-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP7, or Class E basic helix-loop-helix protein 17 (bHLHe17), or Member of PAS protein 7, or PAS domain-containing protein 7 (PASD7), or HIF3-alpha-1, or inhibitory PAS domain protein (IPAS), is a bHLH-PAS transcriptional regulator in adaptive response to low oxygen tension. It plays a role in the regulation of hypoxia-inducible gene expression. 58
29468 381440 cd11434 bHLH-PAS_SIM basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded (SIM) family. The SIM family includes Drosophila melanogaster protein SIM and its homologs from vertebrates, single-minded homolog 1 (SIM1) and single-minded homolog 2 (SIM2). SIM is a nuclear bHLH-PAS transcription factor that functions as a master developmental regulator controlling midline development of the ventral nerve cord in Drosophila. SIM1, also termed Class E basic helix-loop-helix protein 14 (bHLHe14), is a bHLH-PAS transcription factor that may have pleiotropic effects during embryogenesis and in the adult. SIM2, also termed Class E basic helix-loop-helix protein 15 (bHLHe15), is a bHLH-PAS transcription factor that may be a master gene of central nervous system (CNS) development in cooperation with ARNT. It may have pleiotropic effects in the tissues expressed during development. 61
29469 381441 cd11435 bHLH-PAS_AhRR basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor repressor (AhRR) and similar proteins. AhRR, also termed AhR repressor, or Class E basic helix-loop-helix protein 77 (bHLHe77), is a member of bHLH-PAS transcription factors that acts as a negative regulator of AhR (or Dioxin Receptor), playing key roles in development and environmental sensing. AhR is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes, AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Receptor Nuclear Translocator). AhRR functions by competing with AhR for its partner ARNT. AhRR-ARNT complexes are transcriptionally inactive. 60
29470 381442 cd11436 bHLH-PAS_AhR basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor (AhR) and similar proteins. AhR, also termed Ah receptor, or Dioxin receptor (DR), or Class E basic helix-loop-helix protein 76 (bHLHe76), is the only member of bHLH-PAS transcription regulators that binds and is activated by small chemical ligands. It is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes, AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Receptor Nuclear Translocator). 61
29471 381443 cd11437 bHLH-PAS_ARNT_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator (ARNT) family. The ARNT family of bHLH-PAS transcription regulators includes ARNT, ARNT-like proteins (ARNTL and ARNTL2), and Drosophila melanogaster protein cycle. They act as the heterodimeric partner for bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM). These bHLH-PAS transcription complexes are involved in transcriptional responses to xenobiotic, hypoxia, and developmental pathways. Heterodimerization of bHLH-PAS proteins with ARNT is mediated by contacts between both the bHLH and the tandem PAS domains. ARNT use bHLH and/or PAS domains to interact with several transcriptional coactivators. It is required for activity of the aryl hydrocarbon (dioxin) receptor. ARNTL, also termed Basic-helix-loop-helix-PAS protein MOP3, or brain and muscle ARNT-like 1 (BMAL1), or Class E basic helix-loop-helix protein 5 (bHLHe5), or member of PAS protein 3, or PAS domain-containing protein 3 (PASD3), or bHLH-PAS protein JAP3, is a member of the bHLH-PAS transcription factor family that forms heterodimers with another bHLH-PAS protein, CLOCK (circadian locomotor output cycle kaput), which regulates circadian rhythm. ARNTL-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes. ARNTL is highly homologous to ARNT. ARNTL2, also termed Basic-helix-loop-helix-PAS protein MOP9, or brain and muscle ARNT-like 2 (BMAL2), or CYCLE-like factor (CLIF), or Class E basic helix-loop-helix protein 6 (bHLHe6), or member of PAS protein 9, or PAS domain-containing protein 9 (PASD9), is a neuronal bHLH-PAS transcriptional factor, regulating cell cycle progression and preventing cell death, whose sustained expression might ensure brain neuron survival. It also plays important roles in tumor angiogenesis. Protein cycle, also termed brain and muscle ARNT-like 1 (BMAL1), or MOP3, is a putative bHLH-PAS transcription factor involved in the generation of biological rhythms in Drosophila. It activates cycling transcription of Period (PER) and Timeless (TIM) by binding to the E-box (5'-CACGTG-3') present in their promoters. 58
29472 381444 cd11438 bHLH-PAS_ARNTL_PASD3 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator-like protein 1 (ARNTL) and similar proteins. ARNTL, also termed Basic-helix-loop-helix-PAS protein MOP3, or brain and muscle ARNT-like 1 (BMAL1), or Class E basic helix-loop-helix protein 5 (bHLHe5), or member of PAS protein 3, or PAS domain-containing protein 3 (PASD3), or bHLH-PAS protein JAP3, is a member of the bHLH-PAS transcription factor family that forms heterodimers with another bHLH-PAS protein, CLOCK (circadian locomotor output cycle kaput), which regulates circadian rhythm. ARNTL-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes. ARNTL is highly homologous to ARNT. 64
29473 381445 cd11439 bHLH-PAS_SRC basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in steroid receptor coactivator (SRC) family. The SRC family of coactivators includes SRC-1 (NcoA-1/p160), SRC-2 (TIF2/GRIP1/NcoA-2) and SRC-3 (NcoA-3/pCIP/RAC3/ACTR/AIB1/TRAM1), which are critical mediators of steroid receptor action. They contain a bHLH-PAS domain at the N-terminus, followed by a receptor-interacting domain and a C-terminal transcriptional activation domain. SRC coactivators interact with nuclear receptors in a ligand-dependent manner and enhance transcriptional activation by the receptor via histone acetylation/methylation. 58
29474 381446 cd11440 bHLH-O_Cwo_like basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster protein clockwork orange (Cwo) and similar proteins. Cwo is a bHLH-O transcriptional regulator involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression. 60
29475 381447 cd11441 bHLH-PAS_CLOCK_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Circadian locomotor output cycles protein kaput (CLOCK) and similar proteins. The family includes CLOCK, neuronal PAS domain-containing protein 2 (NPAS2) and non-mammalian circadian clock protein PASD1. CLOCK, also termed Class E basic helix-loop-helix protein 8 (bHLHe8), is a transcriptional activator which forms a core component of the circadian clock. NPAS2, also termed neuronal PAS2, or basic-helix-loop-helix-PAS protein MOP4, or Class E basic helix-loop-helix protein 9 (bHLHe9), or member of PAS protein 4, or PAS domain-containing protein 4, is a transcriptional activator which forms a core component of the circadian clock. PASD1 is evolutionarily related to Circadian locomotor output cycles protein kaput (CLOCK) and functions as a suppressor of the biological clock that drives the daily circadian rhythms of cells throughout the body. 54
29476 381448 cd11442 bHLH_AtPRE_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana paclobutrazol resistance (PRE) family. The PRE family includes several bHLH transcription factors from Arabidopsis thaliana, such as PRE1-6. PRE1 (also termed AtbHLH136, or protein banquo 1), PRE2 (also termed AtbHLH134, or protein banquo 2, or EN 52), PRE4 (also termed AtbHLH161, or protein banquo 3), and PRE5 (also termed AtbHLH164) are atypical and probable non DNA-binding bHLH transcription factors that integrate multiple signaling pathways to regulate cell elongation and plant development. PRE3 (also termed AtbHLH135, or protein activation-tagged BRI1 suppressor 1, or ATBS1, or protein target of MONOPTEROS 7, or EN 67) is an atypical and probable non DNA-binding bHLH transcription factor required for MONOPTEROS-dependent root initiation in embryo. It promotes the correct definition of the hypophysis cell division plane. PRE6 (also termed AtbHLH163, or protein KIDARI) is an atypical and probable non DNA-binding bHLH transcription factor that regulates light-mediated responses in day light conditions by binding and inhibiting the activity of the bHLH transcription factor HFR1, a critical regulator of light signaling and shade avoidance. 65
29477 381449 cd11443 bHLH_AtAMS_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein aborted microspores (AMS) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as AMS, ICE1 and SCREAM2. AMS, also termed AtbHLH21, or EN 48, plays a crucial role in tapetum development and it is required for male fertility and pollen differentiation. ICE1, also termed inducer of CBF expression 1, or AtbHLH116, or EN 45, or SCREAM, acts as a transcriptional activator that regulates the cold-induced transcription of CBF/DREB1 genes. It binds specifically to the MYC recognition sites (5'-CANNTG-3') found in the CBF3/DREB1A promoter. SCREAM2, also termed AtbHLH33, or EN 44, mediates stomatal differentiation in the epidermis probably by controlling successive roles of SPCH, MUTE, and FAMA. 72
29478 381450 cd11444 bHLH_AtIBH1_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana ILI1-BINDING BHLH 1 (IBH1) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as IBH1, UPBEAT1, PAR1 and PAR2. IBH1, also termed bHLH zeta, or AtbHLH158, is an atypical and probable non DNA-binding bHLH transcription factor that acts as transcriptional repressor that negatively regulates cell and organ elongation in response to gibberellin (GA) and brassinosteroid (BR) signaling. IBH1 forms heterodimer with BHLH49, thus inhibiting DNA binding of BHLH49, which is a transcriptional activator that regulates the expression of a subset of genes involved in cell expansion by binding to the G-box motif. UPBEAT1, also termed AtbHLH151, or EN 146, is a bHLH transcription factor that modulates the balance between cellular proliferation and differentiation in root growth. It does not act through cytokinin and auxin signaling, but by repressing peroxidase expression in the elongation zone. PAR1 (also termed AtbHLH165, or protein helix-loop-helix 1, or protein phytochrome rapidly regulated 1) and PAR2 (also termed AtbHLH166, or protein helix-loop-helix 2, or protein phytochrome rapidly regulated 2) are two atypical bHLH transcription factors that act as negative regulators of a variety of shade avoidance syndrome (SAS) responses, including seedling elongation and photosynthetic pigment accumulation. They act as direct transcriptional repressor of two auxin-responsive genes, SAUR15 and SAUR68. They may function in integrating shade and hormone transcriptional networks in response to light and auxin changes. 57
29479 381451 cd11445 bHLH_AtPIF_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana phytochrome interacting factors (PIFs) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as PIFs, ALC, PIL1, SPATULA, and UNE10. PIFs (PIF1, PIF3, PIF4, PIF5, PIF6 and PIF7) have been shown to control light-regulated gene expression. They directly bind to the photoactivated phytochromes and are degraded in response to light signals. ALC, also termed AtbHLH73, or protein ALCATRAZ, or EN 98, is required for the dehiscence of fruit, especially for the separation of the valve cells from the replum. It promotes the differentiation of a strip of labile non-lignified cells sandwiched between layers of lignified cells. PIL1, also termed AtbHLH124, or protein phytochrome interacting factor 3-like 1, or EN 110, is involved in responses to transient and long-term shade. It is required for the light-mediated inhibition of hypocotyl elongation and necessary for rapid light-induced expression of the photomorphogenesis- and circadian-related gene APRR9. PIL1 seems to play a role in multiple PHYB responses, such as flowering transition and petiole elongation. SPATULA, also termed AtbHLH24, or EN 99, plays a role in floral organogenesis. It promotes the growth of carpel margins and of pollen tract tissues derived from them. UNE10, also termed AtbHLH16, or protein UNFERTILIZED EMBRYO SAC 10, or EN 99, is required during the fertilization of ovules by pollen. 64
29480 381452 cd11446 bHLH_AtILR3_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein IAA-leucine resistant 3 (ILR3) and similar proteins. ILR3, also termed AtbHLH105, or EN 133, is a bHLH transcription factor that plays a role in resistance to amide-linked indole-3-acetic acid (IAA) conjugates such as IAA-Leu and IAA-Phe. It may regulate gene expression in response to metal homeostasis changes. 76
29481 381453 cd11447 bHLH-O_HEYL basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif-like protein (HEYL) and similar proteins. HEYL, also termed Class B basic helix-loop-helix protein 33 (bHLHb33), or hairy-related transcription factor 3 (HRT-3), is a bHLH-O transcriptional repressor that is strongly expressed in the presomitic mesoderm, the somites, the peripheral nervous system and smooth muscle of all arteries and is a downstream effector of the Notch and transforming growth factor-beta pathways. It promotes neuronal differentiation by activating proneural genes and inhibiting other hairy and enhancer of split (HES) and hairy/enhancer-of-split related with YRPW motif protein (HEY) proteins. HEYL also functions as a tumor suppressor involved in the progression of human cancers. 74
29482 381454 cd11448 bHLH_AtFAMA_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein FAMA and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as FAMA, MUTE and SPEECHLESS, which work together to regulate the sequential cell fate specification during stomatal development and differentiation. FAMA, also termed AtbHLH97, or EN 14, is a transcription activator required to promote differentiation and morphogenesis of stomatal guard cells and to halt proliferative divisions in their immediate precursors. It mediates the formation of stomata. MUTE, also termed AtbHLH45, or EN 20, is required for the differentiation of stomatal guard cells, by promoting successive asymmetric cell divisions and the formation of guard mother cells. It promotes the conversion of the leaf epidermis into stomata. SPEECHLESS, also termed AtbHLH98, or EN 19, is required for the initiation and the formation of stomata, by promoting the first asymmetric cell divisions. FAMA, MUTE and SPEECHLESS form heterodimers with SCREAM/ICE1 and SCRM2 to regulate transcription of genes during stomatal development. 74
29483 381455 cd11449 bHLH_AtAIB_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein ABA-INDUCIBLE bHLH-TYPE (AIB) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as AIB and MYC proteins (MYC2, MYC3 and MYC4). AIB, also termed AtbHLH17, or EN 35, is a transcription activator that regulates positively abscisic acid (ABA) response. MYC2, also termed protein jasmonate insensitive 1, or R-homologous Arabidopsis protein 1 (RAP-1), or AtbHLH6, or EN 38, or Z-box binding factor 1 protein, is a transcriptional activator involved in abscisic acid (ABA), jasmonic acid (JA), and light signaling pathways. MYC3, also termed protein altered tryptophan regulation 2, or AtbHLH5, or transcription factor ATR2, or EN 36, is a transcription factor involved in tryptophan, jasmonic acid (JA) and other stress-responsive gene regulation. MYC4, also termed AtbHLH4, or EN 37, is a transcription factor involved in jasmonic acid (JA) gene regulation. MYC2, together with MYC3 and MYC4, controls additively subsets of JA-dependent responses. 78
29484 381456 cd11450 bHLH_AtFIT_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana Fe-deficiency induced transcription factor 1 (FIT) and similar proteins. The family includes bHLH transcription factors from Arabidopsis thaliana, such as FIT and DYT1. FIT, also termed FER-like iron deficiency-induced transcription factor, or FER-like regulator of iron uptake, or AtbHLH29, or EN 43, is a bHLH transcription factor that is required for the iron deficiency response in plant. It regulates FRO2 at the level of mRNA accumulation and IRT1 at the level of protein accumulation. DYT1, also termed AtbHLH22, or protein dysfunctional tapetum 1, or EN 49, is a bHLH transcription factor involved in the control of tapetum development. It is required for male fertility and pollen differentiation, especially during callose deposition. 76
29485 381457 cd11451 bHLH_AtTT8_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein transparent testa 8 (TT8) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as TT8, EGL1, and GL3. TT8, also termed AtbHLH42, or EN 32, is involved in the control of flavonoid pigmentation and plays a key role in regulating leucoanthocyanidin reductase (BANYULS) and dihydroflavonol-4-reductase (DFR). EGL1, also termed AtbHLH2, or EN 30, or AtMYC146, or protein enhancer of GLABRA 3, is involved in epidermal cell fate specification and regulates negatively stomata formation but promotes trichome formation. GL3, also termed AtbHLH1, or AtMYC6, or protein shapeshifter, or EN 31, is involved in epidermal cell fate specification. It regulates negatively stomata formation, but, in association with TTG1 and MYB0/GL1, promotes trichome formation, branching and endoreplication. 75
29486 381458 cd11452 bHLH_AtNAI1_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein NAI1 and similar proteins. NAI1, also termed AtbHLH20, or EN 27, is a bHLH transcription activator that regulates the expression of at least NAI2, PYK10 and PBP1. It is required for and mediates the formation of endoplasmic reticulum bodies (ER bodies). It plays a role in the symbiotic interactions with the endophytes of the Sebacinaceae fungus family, such as Piriformospora indica and Sebacina. 75
29487 381459 cd11453 bHLH_AtBIM_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana BES1-interacting Myc-like proteins (BIMs) and similar proteins. The family includes Arabidopsis thaliana BIM1 and its homologs (BIM2 and BIM3), which are bHLH transcription factors that interact with BES1 to regulate transcription of Brassinosteroid (BR)-induced genes. BR regulates many growth and developmental processes such as cell elongation, vascular development, senescence, stress responses, and photomorphogenesis. BIM1 heterodimerizes with BES1 and binds to E-box sequences present in many BR-induced promoters to regulate BR-induced genes. 77
29488 381460 cd11454 bHLH_AtIND_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein INDEHISCENT (IND) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as IND, HEC proteins (HEC1, HEC2 and HEC3) and UNE12. IND, also termed AtbHLH40, or EN 120, is a bHLH transcription regulator required for seed dispersal. It is involved in the differentiation of all three cell types required for fruit dehiscence. HEC1 (also termed AtbHLH88, or protein HECATE 1, or EN 118), HEC2 (also termed AtbHLH37, or protein HECATE 2, or EN 117) and HEC3 (also termed AtbHLH43, or protein HECATE 3, or EN 119) are required for the female reproductive tract development and fertility. Both IND and HEC proteins have been implicated in regulation of auxin signaling. They heterodimerize with SPATULA (SPT) bHLH transcription factor to regulate reproductive tract development in plant. UNE12, also termed AtbHLH59, or protein UNFERTILIZED EMBRYO SAC 12, or EN 93, is required for ovule fertilization. 63
29489 381461 cd11455 bHLH_AtAIG1_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein AIG1 and similar proteins. AIG1, also termed AtbHLH32, or EN 54, or protein target of MONOPTEROS 5, is a transcription factor required for MONOPTEROS-dependent root initiation in embryo. 80
29490 381462 cd11456 bHLHzip_N-Myc_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in N-Myc and similar proteins. N-Myc, also termed Class E basic helix-loop-helix protein 37 (bHLHe37), is a bHLHZip proto-oncogene protein that positively regulates the transcription of MYCNOS in neuroblastoma cells. It is also essential during embryonic development. N-Myc has a critical role in regulating the switch between proliferation and differentiation of progenitor cells. It binds DNA as a heterodimer with MAX. The family also includes S-Myc, encoded by rat or mouse intronless myc gene, which has apoptosis-inducing activity. 87
29491 381463 cd11457 bHLHzip_L-Myc basic Helix-Loop-Helix-zipper (bHLHzip) domain found in L-Myc and similar proteins. L-Myc, also termed Class E basic helix-loop-helix protein 38 (bHLHe38), or protein L-Myc-1, or V-myc myelocytomatosis viral oncogene homolog, is a bHLHZip oncoprotein belonging to the Myc oncogene protein family. It binds DNA as a heterodimer with MAX. L-Myc is co-expressed with another Myc family member and has weaker transformation/transactivation activities. L-Myc knockout mouse did not exhibit any phenotypic abnormalities. 89
29492 381464 cd11458 bHLHzip_c-Myc basic Helix-Loop-Helix-zipper (bHLHzip) domain found in c-Myc and similar proteins. c-Myc, also termed Myc proto-oncogene protein, or Class E basic helix-loop-helix protein 39 (bHLHe39), or transcription factor p64, is a bHLHZip proto-oncogene protein that functions as a transcription factor, which binds DNA in a non-specific manner, yet also specifically recognizes the core sequence 5'-CAC[GA]TG-3'. It activates the transcription of growth-related genes. 84
29493 381465 cd11459 bHLH-O_HES1_4 basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split HES-1, HES-4 and similar proteins. The family includes two bHLH-O transcriptional repressors, HES-1 and HES-4. HES-1, also termed Class B basic helix-loop-helix protein 39 (bHLHb39), or hairy homolog, or hairy-like protein (HL), plays an essential role in development of both compartment and boundary cells of the central nervous system. It regulates the maintenance of neural stem/progenitor cells by inhibiting proneural gene expression via Notch signaling. HES-4, also termed Class B basic helix-loop-helix protein 42 (bHLHb42), or bHLH factor Hes4, antagonizes the function of Twist-1 to regulate lineage commitment of bone marrow stromal/stem cells (BMSC). Epigenetic dysregulation of HES-4 is associated with striatal degeneration in postmortem Huntington brains. Both HES-1 and HES-4 are mammalian counterparts of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 63
29494 381466 cd11460 bHLH-O_HES6 basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-6 and similar proteins. HES-6, also termed Class B basic helix-loop-helix protein 41 (bHLHb41), or hairy and enhancer of split 6, or C-HAIRY1, is a bHLH-O transcription factor that is expressed in developing muscle and involved in angiogenesis, myogenesis, neural differentiation and neurogenesis. HES-6 antagonizes Notch signaling but is not regulated by Notch signaling. It is a transcription co-factor associated with stem cell characteristics in neural tissue. It may act as an inhibitor of Hes-1 during neuronal development and forms a heterodimer with HES-1 to prevent its association with transcriptional co-repressors. The overexpression of HES-6 has been reported in metastatic cancers of different origins. HES-6 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 58
29495 381467 cd11461 bHLH-O_HES5 basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-5 and similar proteins. HES-5, also termed Class B basic helix-loop-helix protein 38 (bHLHb38), or hairy and enhancer of split 5, is a bHLH-O transcription factor that is involved in cell differentiation and proliferation in a variety of tissues. HES-5 is an essential effector for Notch signaling. It acts as a transducer of Notch signals in brain vascular development. It also acts as a key mediator of Wnt-3a-induced neuronal differentiation and plays a crucial role in normal inner ear hair cell development. HES-5 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 59
29496 381468 cd11462 bHLH-O_HES7 basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split 7 (HES-7) and similar proteins. HES-7, also termed Class B basic helix-loop-helix protein 37 (bHLHb37), or bHLH factor Hes7, is a bHLH-O transcriptional repressor that is expressed in an oscillatory manner and acts as a key regulator of the pace of the segmentation clock. It is regulated by the Notch and Fgf/Mapk pathways. HES-7 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 61
29497 381469 cd11463 bHLH-O_HES2 basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split 2 (HES-2) and similar proteins. HES-2, also termed Class B basic helix-loop-helix protein 40 (bHLHb40), is a bHLH-O transcriptional repressor of genes that require a bHLH protein for their transcription. It acts as a negative regulator through interaction with both E-box and N-box sequences. HES-2 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 65
29498 381470 cd11464 bHLH_TS_TWIST basic helix-loop-helix (bHLH) domain found in twist-related protein (TWIST) family. The TWIST family includes TWIST1 and TWIST2, which are highly homologous bHLH transcription factors that promote epithelial-mesenchymal transition (EMT) during development and tumor metastasis. They are involved in the negative regulation of cellular determination and in the differentiation of several lineages, including myogenesis, osteogenesis, and neurogenesis. TWIST factors are expressed in broad, partially overlapping patterns during embryo development and dimerize with a broad set of dimer partners to form numerous unique transcriptional complexes that regulate embryonic development. 59
29499 381471 cd11465 bHLH_TS_scleraxis_like basic helix-loop-helix (bHLH) domain found in scleraxis, transcription factor 15 (TCF-15) and similar proteins. The family includes scleraxis and transcription factor 15 (TCF-15). Scleraxis, also termed SCX, or Class A basic helix-loop-helix protein 41 (bHLHa41), or Class A basic helix-loop-helix protein 48 (bHLHa48), is a bHLH transcription factor that is expressed in sclerotome limb bud cranial and body wall mesenchyme, pericardium and heart valves, ligaments and tendons. It is required for tendon formation ligaments, connective tissue, the diaphragm, and testis development. Scleraxis plays a central role in promoting fibroblast proliferation and matrix synthesis during the embryonic development of tendons. TCF-15, also termed Class A basic helix-loop-helix protein 40 (bHLHa40), or paraxis, or protein bHLH-EC2, is a bHLH transcription factor expressed in caudal lateral and paraxial mesoderm dermomyotome and sclerotome fore limb buds during embryo development. It may function as an early transcriptional regulator involved in the patterning of the mesoderm and in lineage determination of cell types derived from the mesoderm. 55
29500 381472 cd11466 bHLH_TS_HAND basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein (HAND) family. The HAND family includes two bHLH transcription factors, HAND1 and HAND2. HAND1, also termed Class A basic helix-loop-helix protein 27 (bHLHa27), or extraembryonic tissues, heart, autonomic nervous system and neural crest derivatives-expressed protein 1 (eHAND), plays an essential role in both trophoblast-giant cells differentiation and in cardiac morphogenesis. HAND2, also termed Class A basic helix-loop-helix protein 26 (bHLHa26), or deciduum, heart, autonomic nervous system and neural crest derivatives-expressed protein 2 (dHAND), is essential for cardiac morphogenesis, particularly for the formation of the right ventricle and of the aortic arch arteries. 56
29501 381473 cd11467 bHLH_E-protein_Da_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein daughterless (Da) and similar proteins. Da is a nuclear bHLH transcription factor that is a sole E protein ortholog essential for both neurogenesis and sex determination in Drosophila. Da is expressed in a broad range of tissues and is involved in diverse developmental processes such as oogenesis, sex determination and neurogenesis depending on its bHLH-binding partners. Da and achaete-scute complex (AS-C) form heterodimers that act as transcriptional activators of neural cell fates and are involved in sex determination. 70
29502 381474 cd11468 bHLH_TS_bHLHe22_like basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein bHLHe22, bHLHe23 and similar proteins. The family includes two OLIG-related bHLH transcription factors, bHLHe22 and bHLHe23. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is a neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 has roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types. 62
29503 381475 cd11469 bHLH-PAS_ARNTL2_PASD9 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator-like protein 2 (ARNTL2) and similar proteins. ARNTL2, also termed Basic-helix-loop-helix-PAS protein MOP9, or brain and muscle ARNT-like 2 (BMAL2), or CYCLE-like factor (CLIF), or Class E basic helix-loop-helix protein 6 (bHLHe6), or member of PAS protein 9, or PAS domain-containing protein 9 (PASD9), is a neuronal bHLH-PAS transcriptional factor, regulating cell cycle progression and preventing cell death, whose sustained expression might ensure brain neuron survival. It also plays important roles in tumor angiogenesis. ARNT-2 heterodimerizes with other bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM). 60
29504 381476 cd11470 bHLH_TS_TCF15_paraxis basic helix-loop-helix (bHLH) domain found in transcription factor 15 (TCF-15) and similar proteins. TCF-15, also termed Class A basic helix-loop-helix protein 40 (bHLHa40), or paraxis, or protein bHLH-EC2, is a bHLH transcription factor expressed in caudal lateral and paraxial mesoderm dermomyotome and sclerotome fore limb buds during embryo development. It may function as an early transcriptional regulator involved in the patterning of the mesoderm and in lineage determination of cell types derived from the mesoderm. 66
29505 381477 cd11471 bHLH_TS_HAND2 basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein 2 (HAND2) and similar proteins. HAND2, also termed Class A basic helix-loop-helix protein 26 (bHLHa26), or deciduum, heart, autonomic nervous system and neural crest derivatives-expressed protein 2 (dHAND), is a bHLH transcription factor that is essential for cardiac morphogenesis, particularly for the formation of the right ventricle and of the aortic arch arteries. 62
29506 211395 cd11473 W2 C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon. This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats. 135
29507 271368 cd11474 SLC5sbd_CHT Na(+)- and Cl(-)-dependent choline cotransporter CHT and related proteins; solute-binding domain. Na+/choline co-transport by CHT is Cl- dependent. Human CHT (also called CHT1) is encoded by the SLC5A7 gene, and is expressed in the central nervous system. hCHT1-mediated choline uptake may be the rate-limiting step in acetylcholine synthesis, and essential for cholinergic transmission. Changes in this choline uptake in cortical neurons may contribute to Alzheimer's dementia. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 464
29508 271369 cd11475 SLC5sbd_PutP Na(+)/proline cotransporter PutP and related proteins; solute binding domain. Escherichia coli PutP catalyzes the Na+-coupled uptake of proline with a stoichiometry of 1:1. The putP gene is part of the put operon; this operon in addition encodes a proline dehydrogenase, allowing the use of proline as a source of nitrogen and/or carbon. This subfamily also includes the Bacillus subtilis Na+/proline cotransporter (OpuE) which has an osmoprotective instead of catabolic role. Expression of the opuE gene is under osmotic control and different sigma factors contribute to its regulation; it is also a putative CcpA-activated gene. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 464
29509 271370 cd11476 SLC5sbd_DUR3 Na(+)/urea-polyamine cotransporter DUR3, and related proteins; solute-binding domain. Dur3 is the yeast plasma membrane urea transporter. Saccharomyces cerevisiae DUR3 also transports polyamine. The polyamine uptake of S. cerevisiae DUR3 is activated upon its phosphorylation by polyamine transport protein kinase 2 (PTK2). S. cerevisiae DUR3 also appears to play a role in regulating the cellular boron concentration. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 493
29510 271371 cd11477 SLC5sbd_u1 Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 493
29511 271372 cd11478 SLC5sbd_u2 Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 496
29512 271373 cd11479 SLC5sbd_u3 Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 454
29513 271374 cd11480 SLC5sbd_u4 Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family. 488
29514 271375 cd11482 SLC-NCS1sbd_NRT1-like nucleobase-cation-symport-1 (NCS1) transporter NRT1-like; solute-binding domain. This fungal NCS1 subfamily includes various Saccharomyces cerevisiae transporters: nicotinamide riboside transporter 1 (Nrt1p, also called Thi71p), Dal4p (allantoin permease), Fui1p (uridine permease), Fur4p (uracil permease), and Thi7p (thiamine transporter). NCS1s are essential components of salvage pathways for nucleobases and related metabolites. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters, and solute carrier 6 family neurotransmitter transporters. 480
29515 271376 cd11483 SLC-NCS1sbd_Mhp1-like nucleobase-cation-symport-1 (NCS1) transporter Mhp1-like; solute-binding domain. This NCS1 subfamily includes Microbacterium liquefaciens Mhp1, and various uncharacterized NCS1s. Mhp1 mediates the uptake of indolyl methyl- and benzyl-hydantoins as part of a metabolic salvage pathway for their conversion to amino acids. Mhp1 has 12 transmembrane (TM) helices (an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12; TMs numbered to conform to the Solute carrier 6 (SLC6) family Aquifex aeolicus LeuT). NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their other known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and SLC6 neurotransmitter transporters. 451
29516 271377 cd11484 SLC-NCS1sbd_CobB-like nucleobase-cation-symport-1 (NCS1) transporter CobB-like; solute-binding domain. This NCS1 subfamily includes Escherichia coli CodB (cytosine permease), and the Saccharomyces cerevisiae transporters: Fcy21p (Purine-cytosine permease), and vitamin B6 transporter Tpn1. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s). 406
29517 271378 cd11485 SLC-NCS1sbd_YbbW-like uncharacterized nucleobase-cation-symport-1 (NCS1) transporter subfamily, YbbW-like; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. This subfamily includes the putative allantoin transporter Escherichia coli YbbW (also known as GlxB2). NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s). 456
29518 271379 cd11486 SLC5sbd_SGLT1 Na(+)/glucose cotransporter SGLT1; solute-binding domain. Human SGLT1 (hSGLT1) is a high-affinity/low-capacity glucose transporter, which can also transport galactose. In the transport mechanism, two Na+ ions first bind to the extracellular side of the transporter and induce a conformational change in the glucose binding site. This results in an increased affinity for glucose. A second conformational change in the transporter follows, bringing the Na+ and glucose binding sites to the inner surface of the membrane. Glucose is then released, followed by the Na+ ions. In the process, hSGLT1 is also able to transport water and urea and may be a major pathway for transport of these across the intestinal brush-border membrane. hSGLT1 is encoded by the SLC5A1 gene and expressed mostly in the intestine, but also in the trachea, kidney, heart, brain, testis, and prostate. The WHO/UNICEF oral rehydration solution (ORS) for the treatment of secretory diarrhea contains salt and glucose. The glucose, along with sodium ions, is transported by hSGLT1 and water is either co-transported along with these or follows by osmosis. Mutations in SGLT1 are associated with intestinal glucose galactose malabsorption (GGM). Up-regulation of intestinal SGLT1 may protect against enteric infections. SGLT1 is expressed in colorectal, head and neck, and prostate tumors. Epidermal growth factor receptor (EGFR) functions in cell survival by stabilizing SGLT1, and thereby maintaining intracellular glucose levels. SGLT1 is predicted to have 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 636
29519 212056 cd11487 SLC5sbd_SGLT2 Na(+)/glucose cotransporter SGLT2 and related proteins; solute-binding domain. Human SGLT2 (hSGLT2) is a high-capacity, low-affinity glucose transporter, that plays an important role in renal glucose reabsorption. It is encoded by the SLC5A2 gene and expressed almost exclusively in renal proximal tubule cells. Mutations in hSGLT2 cause Familial Renal Glucosuria (FRG), a rare autosomal defect in glucose transport. hSGLT2 is a major drug target for regulating blood glucose levels in diabetes. hSGLT2 is predicted to have 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 583
29520 271380 cd11488 SLC5sbd_SGLT4 Na(+)/glucose cotransporter SGLT4 and related proteins; solute-binding domain. Human SGLT4 (hSGLT4) has been reported to be a low-affinity glucose transporter with unusual sugar selectivity: it transports D-mannose but not galactose or 3-O-methyl-D-glucoside. It is encoded by the SLC5A9 gene and is expressed in intestine, kidney, liver, brain, lung, trachea, uterus, and pancreas. hSGLT4 is predicted to contain 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 605
29521 212058 cd11489 SLC5sbd_SGLT5 Na(+)/glucose cotransporter SGLT5 and related proteins; solute-binding domain. Human SGLT5 is a glucose transporter, which also transports galactose. It is encoded by the SLC5A10 gene, and is exclusively expressed in the renal cortex. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 604
29522 271381 cd11490 SLC5sbd_SGLT6 Na(+)/chiro-inositol cotransporter SGLT6 and related proteins; solute-binding domain. Human SGLT6 (also called KST1, SMIT2) is a chiro-inositol transporter, which also transports myo-inositol. It is encoded by the SLC5A11 gene. Xenopus Na1-glucose cotransporter type 1 (SGLT-1)-like protein is predicted to contain 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 602
29523 271382 cd11491 SLC5sbd_SMIT Na(+)/myo-inositol cotransporter SMIT and related proteins; solute-binding domain. Human SMIT is a high-affinity myo-inositol transporter, and is expressed in brain, heart, kidney, and lung. Inhibition of myo-inositol uptake, through down-regulation of SMIT, may be a common mechanism of action of mood stabilizers, including lithium, carbamazepine, and valproate. SMIT is encoded by the SLC5A3 gene, which is a candidate gene for pathogenesis of nervous system dysfunction in Down syndrome (DS). The SNP, 21q22 near SLC5A3-MRPS6-KCNE2, has been associated with coronary heart disease, cardiovascular disease, and myocardial infarction. SMIT may also be involved in the pathogenesis of congenital cataract. SMIT also plays roles in osteogenesis, bone formation, and bone mineral density determination. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 609
29524 271383 cd11492 SLC5sbd_NIS-SMVT Na(+)/iodide (NIS) and Na(+)/multivitamin (SMVT) cotransporters, and related proteins; solute binding domain. NIS (encoded by the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. SMVT (encoded by the SLC5A6 gene) transports biotin, pantothenic acid and lipoate. This subfamily also includes SMCT1 and -2. SMCT1 (encoded by the SLC5A8 gene) is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. SMCT2 (encoded by the SLC5A12 gene) is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 522
29525 271384 cd11493 SLC5sbd_NIS-like_u1 uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 479
29526 271385 cd11494 SLC5sbd_NIS-like_u2 uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters, SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 473
29527 271386 cd11495 SLC5sbd_NIS-like_u3 uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 473
29528 271387 cd11496 SLC6sbd-TauT-like Na(+)- and Cl(-)-dependent taurine transporter TauT, and related proteins; solute-binding domain. This subgroup represents the solute-binding domain of TauT-like Na(+)- and Cl(-)-dependent transporters. Family members include: human TauT which transports taurine, human GAT1, GAT2, and GAT3, and BGT1, which transport gamma-aminobutyric acid (GABA), and human CT1 which transports creatine. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 543
29529 271388 cd11497 SLC6sbd_SERT-like Na(+)- and Cl(-)-dependent monoamine transporters, SERT, NET, DAT1 and related proteins; solute binding domain. This subgroup represents the solute-binding domain of transmembrane transporters that transport monoamine neurotransmitters from synaptic spaces into presynaptic neurons. Members include: NET which transports norepinephrine, SERT which transports serotonin, and DAT1 which transports dopamine. These transporters may play a role in diseases including depression, anxiety disorders, attention-deficit hyperactivity disorder, and in the control of human behavior and emotional states. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 537
29530 212067 cd11498 SLC6sbd_GlyT1 Na(+)- and Cl(-)-dependent glycine transporter GlyT1; solute-binding domain. GlyT1 is a membrane-bound transporter that re-uptakes glycine from the synaptic cleft. Human GlyT1 is encoded by the SLC6A9 gene. GlyT1 is expressed in brain, pancreas, uterus, stomach, spleen, liver, and retina. GlyT1 may play a role in schizophrenia. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 585
29531 271389 cd11499 SLC6sbd_GlyT2 Na(+)- and Cl(-)-dependent glycine transporter GlyT2; solute-binding domain. GlyT2 (also called NET1) is a membrane-bound transporter that re-uptakes glycine from the synaptic cleft. Human GlyT2 is encoded by the SLC6A5 gene. GlyT2 is expressed in brain and spinal cord. GlyT2 may play a role in pain, and in spasticity. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 597
29532 271390 cd11500 SLC6sbd_PROT Na(+)- and Cl(-)-dependent L-proline transporter PROT; solute-binding domain. PROT is a high-affinity L-proline transporter that transports L-proline, and may have a role in excitatory neurotransmission. Human PROT is encoded by the SLC6A7 gene, a potential susceptibility gene for asthma. PROT is expressed in the brain. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 541
29533 271391 cd11501 SLC6sbd_ATB0 Na(+)- and Cl(-)-dependent beta-alanine transporter ATB0+; solute-binding domain. ATB0+ (also known as the beta-alanine carrier) is a transmembrane transporter with a broad substrate specificity; it can transport non-alpha-amino acids such as beta-alanine with low affinity, and can transport dipolar and cationic amino acids such as leucine and lysine, with a higher affinity. It may have a role in the absorption of essential nutrients and drugs in the distal regions of the human gastrointestinal tract. Human ATB0+ is encoded by the SLC6A14 gene. ATB0+ is expressed in the lung, trachea, salivary gland, mammary gland, stomach, and pituitary gland. ATB0+ may play a role in obesity, and its upregulation may have a pathogenic role in colorectal cancer. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 602
29534 271392 cd11502 SLC6sbd_NTT5 Neurotransmitter transporter 5; solute-binding domain. Human NTT5 is encoded by the SLC6A16 gene. NTT5 is expressed in testis, pancreas, and prostate; its expression is predominantly intracellular, indicative of a vesicular location. Its substrates are unknown. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 535
29535 271393 cd11503 SLC5sbd_NIS Na(+)/iodide cotransporter NIS and related proteins; solute-binding domain. NIS (product of the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. NIS is expressed in the thyroid, colon, ovary, and in human breast cancers. It mediates the active transport and the concentration of iodide from the blood into thyroid follicular cells, a fundamental step in thyroid hormone biosynthesis, and is the basis of radioiodine therapy for thyroid cancer. Mutation in the SLC5A5 gene can result in a form of thyroid hormone dysgenesis. Human NIS exists mainly as a dimer stabilized by a disulfide bridge. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 535
29536 271394 cd11504 SLC5sbd_SMVT Na(+)/multivitamin cotransporter SMVT and related proteins; solute-binding domain. This multivitamin transporter SMVT (product of the SLC5A6 gene) transports biotin, pantothenic acid and lipoate, and is essential for mediating biotin uptake into mammalian cells. SMVT is expressed in the placenta, intestine, heart, brain, lung, liver, kidney and pancreas. Biotin may regulate its own cellular uptake through participation in holocarboxylase synthetase-dependent chromatin remodeling events at SMVT promoter loci. The cis regulatory elements, Kruppel-like factor 4 and activator protein-2, regulate the activity of the human SMVT promoter in the intestine. Glycosylation of the hSMVT is important for its transport function. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 527
29537 271395 cd11505 SLC5sbd_SMCT Na(+)/monocarboxylate cotransporters SMCT1 and 2 and related proteins; solute-binding domain. SMCT1 is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. Human SMCT1 (hSMCT1, also called AIT) is encoded by the tumor suppressor gene SLC5A8. SMCT1 is expressed in the colon, small intestine, kidney, thyroid gland, retina, and brain. SMCT1 may contribute to the intestinal/colonic and oral absorption of monocarboxylate drugs. It also mediates iodide transport from thyrocyte into the colloid lumen in thyroid gland and, through transporting L-lactate and ketone bodies, helps maintain the energy status and the function of neurons. SMCT2 is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. hSMCT2 is encoded by the SLC5A12 gene. SMCT2 is expressed in the kidney, small intestine, skeletal muscle, and retina. In the kidney, SMCT2 may initiate lactate absorption in the early parts of the tubule, SMCT1 in the latter parts of the tubule. In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 538
29538 212075 cd11506 SLC6sbd_GAT1 Na(+)- and Cl(-)-dependent GABA transporter 1; solute-binding domain. GAT1 transports gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. Human GAT1 is encoded by the SLC6A1 gene. GAT1 is expressed in brain and peripheral nervous system. The antiepileptic drug, Tiagabine, inhibits GAT1. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 598
29539 271396 cd11507 SLC6sbd_GAT2 Na(+)- and Cl(-)-dependent GABA transporter 2; solute-binding domain. This family includes human GAT2 (hGAT2) which transports gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. hGAT2 is encoded by the SLC6A13 gene, and is similar to mouse GAT-3, and rat GAT2. hGAT2 is expressed in brain, kidney, lung, and testis. hGAT2 is a potential drug target for treatment of epilepsy. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 544
29540 212077 cd11508 SLC6sbd_GAT3 Na(+)- and Cl(-)-dependent GABA transporter 3; solute-binding domain. This family includes human GAT3 (hGAT3) a high-affinity transporter of gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. hGAT3 is encoded by the SLC6A11 gene, and is similar to mouse GAT4, and rat GAT3/GATB. GAT3 is expressed primarily in the glia of the brain, and is a potential drug target for antiepileptic drugs. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 542
29541 271397 cd11509 SLC6sbd_CT1 Na(+)- and Cl(-)-dependent creatine transporter 1; solute-binding domain. CT1 (also called CRTR, CRT) transports creatine. Human CT1 is encoded by the SLC6A8 gene. CT1 is ubiquitously expressed, with highest levels found in skeletal muscle and kidney. Creatine is absorbed from food or synthesized from arginine and plays an important role in energy metabolism. Deficiency in human CT1 leads to X-linked cerebral creatine transporter deficiency. In males, this disorder is characterized by language and speech delays, autistic-like behavior, seizures in about 50% of cases, and can also involve midfacial hypoplasia, and short stature. In females, it is characterized by mild cognitive impairment with behavior and learning problems. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 589
29542 271398 cd11510 SLC6sbd_TauT Na(+)- and Cl(-)-dependent taurine transporter; solute-binding domain. TauT is a Na(+)- and Cl(-)-dependent, high-affinity, low-capacity transporter of taurine and beta-alanine. Human TauT is encoded by the SLC6A6 gene. TauT is expressed in brain, retina, liver, kidney, heart, spleen, and pancreas. It may play a part in the supply of taurine to the intestinal epithelium and in the between-meal-capture of taurine. It may also participate in re-absorbing taurine that has been deconjugated from bile acids in the distal lumen. Functional TauT protects kidney cells from nephrotoxicity caused by the chemotherapeutic agent cisplatin; cisplatin down-regulates TauT in a p53-dependent manner. In mice, TauT has been shown to be important for the maintenance of skeletal muscle function and total exercise capacity. TauT-/- mice develop additional clinically important diseases, some of which are characterized by apoptosis, including vision loss, olfactory dysfunction, and chronic liver disease. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 542
29543 212080 cd11511 SLC6sbd_BGT1 Na(+)- and Cl(-)-dependent betaine/GABA transporter-1, and related proteins; solute-binding domain. BGT1 is a relatively low-affinity transporter of gamma-aminobutyric acid (GABA), and can also transport betaine. GABA is the main inhibitory neurotransmitter within the mammalian CNS. Human BGT1 is encoded by the SLC6A12 gene, and is similar to mouse GAT2. Mouse GAT2 plays a role in transporting GABA across the blood-brain barrier. In addition to being expressed in cells of the central nervous system, BGT1 is expressed in peripheral tissues, including kidney, liver, and heart. An association has been shown between the SLC6A12 gene and the occurrence of aspirin-intolerant asthma, and BGT1 is a drug target for antiepileptic drugs. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 541
29544 212081 cd11512 SLC6sbd_NET Na(+)- and Cl(-)-dependent norepinephrine transporter NET; solute-binding domain. NET (also called NAT1, NET1), is a transmembrane transporter that transports the neurotransmitter norepinephrine from synaptic spaces into presynaptic neurons. Human NET is encoded by the SLC6A2 gene. NET is expressed in brain, peripheral nervous system, adrenal gland, and placenta. NET may play a role in diseases or disorders including depression, orthostatic intolerance, anorexia nervosa, cardiovascular diseases, alcoholism, and attention-deficit hyperactivity disorder. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 560
29545 271399 cd11513 SLC6sbd_SERT Na(+)- and Cl(-)-dependent serotonin transporter SERT; solute-binding domain. SERT (also called 5-HTT), is a transmembrane transporter that transports the neurotransmitter serotonin from synaptic spaces into presynaptic neurons. The antiport of a K+ ion is believed to follow the transport of serotonin and promote the reorientation of SERT for another transport cycle. Human SERT is encoded by the SLC6A4 gene. SERT is expressed in brain, peripheral nervous system, placenta, epithelium, and platelets. SERT may play a role in diseases or disorders including anxiety, depression, autism, gastrointestinal disorders, premature ejaculation, and obesity. It may also have a role in social cognition. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 537
29546 212083 cd11514 SLC6sbd_DAT1 Na(+)- and Cl(-)-dependent dopamine transporter 1; solute-binding domain. DAT1 (also called DAT), is a plasma membrane transport protein that functions at the dopaminergic synapses to transport dopamine from the extracellular space back into the presynaptic nerve terminal. Human DAT1 is encoded by the SLC6A3 gene, and is expressed in the brain. DAT1 may play a role in diseases or disorders related to dopaminergic neurons, including attention-deficit hyperactivity disorder (ADHD), Tourette syndrome, Parkinson's disease, alcoholism, drug abuse, schizophrenia, extraversion, and risky behavior. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 555
29547 271400 cd11515 SLC6sbd_NTT4-like Na(+)-dependent neurotransmitter transporter 4, and related proteins; solute-binding domain. This subgroup includes the solute-binding domain of NTT4 (also called XT1) and SBAT1 (also called B0AT2, v7-3, NTT7-3); both these proteins can transport neutral amino acids. Human SBAT1 is encoded by the SLC6A15 gene, a susceptibility gene for major depression. SBAT1 is expressed in brain, and may have a role in transporting neurotransmitter precursors into neurons. Human NTT4 is encoded by the SLC6A17 gene. NTT4 is specifically expressed in the nervous system, in synaptic vesicles of glutamatergic and GABAergic neurons, and may play an important role in synaptic transmission. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 530
29548 212085 cd11516 SLC6sbd_B0AT1 Na(+)-dependent neutral amino acids transporter, B0AT1; solute-binding domain. B0AT1 (also called HND) transports neutral amino acids. Human B0AT1 is encoded by the SLC6A19 gene. B0AT1 is expressed primarily in the kidney and intestine; it requires collectrin for expression in the kidney, and angiotensin-converting enzyme 2 for expression in the intestine. Interaction with these two proteins implicates B0AT1 in more complex processes such as glomerular structure, exocytosis, and blood pressure control. The autosomal recessive disorder, Hartnup disorder, is caused by mutations in B0AT1. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 581
29549 212086 cd11517 SLC6sbd_B0AT3 glycine transporter, B0AT3; solute-binding domain. B0AT3 (also called Xtrp2, XT2) transports glycine. Human B0AT3 is encoded by the SLC6A18 gene. B0AT3 is expressed in the kidney. Mutations in the SLC6A18 gene may contribute to the autosomal recessive disorder iminoglycinuria and its related disorder hyperglycinuria. SLC6A18 or its neighboring genes are associated with increased susceptibility to myocardial infarction. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 576
29550 271401 cd11518 SLC6sbd_SIT1 Na(+)- and Cl(-)-dependent imino acid transporter SIT1; solute-binding domain. SIT1 (also called XTRP3, XT3, IMINO) transports imino acids, such as proline, pipecolate, MeAIB, and sarcosine. It has weak affinity for neutral amino acids such as phenylalanine. Human SIT1 is encoded by the SLC6A20 gene. SIT1 is expressed in brain, kidney, small intestine, thymus, spleen, ovary, and lung. SLC6A20 is a candidate gene for the rare disorder iminoglycinuria. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 576
29551 271402 cd11519 SLC5sbd_SMCT1 Na(+)/monocarboxylate cotransporter SMCT1 and related proteins; solute-binding domain. SMCT1 is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. Human SMCT1 (hSMCT1, also called AIT) is encoded by the tumor suppressor gene SLC5A8. Its expression is under the control of the C/EBP transcription factor. Its tumor-suppressive role is related to uptake of butyrate, propionate, and pyruvate; these are inhibitors of histone deacetylases. SMCT1 is expressed in the colon, small intestine, kidney, thyroid gland, retina, and brain. SMCT1 may contribute to the intestinal/colonic and oral absorption of monocarboxylate drugs. SMCT1 also mediates iodide transport from thyrocyte into the colloid lumen in thyroid gland and, through transporting L-lactate and ketone bodies, helps maintain the energy status and the function of neurons. In the kidney its expression is limited to the S3 segment of the proximal convoluted tubule (in contrast to the low-affinity monocarboxylate transporter SMCT2, belonging to a different family, which is expressed along the entire length of the tubule). In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner; SMCT1 is expressed predominantly in retinal neurons and in retinal pigmented epithelial (RPE) cells. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 542
29552 212089 cd11520 SLC5sbd_SMCT2 Na(+)/monocarboxylate cotransporter SMCT2 and related proteins; solute-binding domain. SMCT2 is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. Human SMCT2 (hSMCT2) is encoded by the SLC5A12 gene. SMCT2 is expressed in the kidney, small intestine, skeletal muscle, and retina. In the kidney, it is expressed in the apical membrane of the proximal convoluted tubule, along the entire length of the tubule (in contrast to the high-affinity monocarboxylate transporter SMCT1, belonging to a different family, which is limited to the S3 segment of the tubule). SMCT2 may initiate lactate absorption in the early parts of the tubule. In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner, SMCT2 is expressed exclusively in Muller cells. Nicotine transport by hSMCT2 is inhibited by several non-steroidal anti-inflammatory drugs. This subgroup belongs to the solute carrier 5 (SLC5) transporter family. 529
29553 271403 cd11521 SLC6sbd_NTT4 Na(+)-dependent neurotransmitter transporter 4; solute-binding domain. NTT4 (also called XT1) transports the neutral amino acids, proline, glycine, leucine, and alanine, and may play an important role in synaptic transmission. Human NTT4 is encoded by the SLC6A17 gene. NTT4 is specifically expressed in the nervous system, in synaptic vesicles of glutamatergic and GABAergic neurons. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 589
29554 212091 cd11522 SLC6sbd_SBAT1 Sodium-coupled branched-chain amino-acid transporter 1; solute-binding domain. SBAT1 (also called B0AT2, v7-3, NTT7-3) is a high-affinity Na(+)-dependent transporter for large neutral amino acids, including leucine, isoleucine, valine, proline and methionine. Human SBAT1 is encoded by the SLC6A15 gene, a susceptibility gene for major depression. SBAT1 is expressed in brain, and may have a role in transporting neurotransmitter precursors into neurons. This subgroup belongs to the solute carrier 6 (SLC6) transporter family. 580
29555 212133 cd11523 NTP-PPase Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily. This superfamily contains enzymes that hydrolyze the alpha-beta phosphodiester bond of all canonical NTPs into monophosphate derivatives and pyrophosphate (PPi). Divalent ions, such as Mg2+ ion(s), are essential to activate a proposed water nucleophile and stabilize the charged intermediates to facilitate catalysis. These enzymes share a conserved divalent ion-binding motif EXX[E/D] in their active sites. They also share a highly conserved four-helix bundle, where one face forms the active site, while the other participates in oligomer assembly. The four-helix bundle consists of two central antiparallel alpha-helices that can be contained within a single protomer or form upon dimerization. The superfamily members include dimeric dUTP pyrophosphatases (dUTPases; EC 3.6.1.23), the nonspecific NTP-PPase MazG proteins, HisE-encoded phosphoribosyl ATP pyrophosphohydrolase (PRA-PH), fungal histidine biosynthesis trifunctional proteins, and several uncharacterized protein families. 72
29556 211400 cd11524 SYLF The SYLF domain (also called DUF500), a novel lipid-binding module. The SYLF domain is named after SH3YL1, Ysc84p/Lsb4p, Lsb3p, and plant FYVE, which are proteins that contain it. It is also called DUF500 and is highly conserved from bacteria to mammals. Some members, such as SH3YL1, Ysc84p, and Lsb3p, which represent the best characterized members of the family, also contain an SH3 domain, while family members from plants and stramenopiles also contain a FYVE zinc finger domain. Other members only contain a stand-alone SYLF domain. The SYLF domain of SH3YL1 binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity. 194
29557 211401 cd11525 SYLF_SH3YL1_like The SYLF domain (also called DUF500), a novel lipid-binding module, of SH3 domain containing Ysc84-like 1 (SH3YL1) and similar proteins. This subfamily is composed of yeast Ysc84 (also called LAS17-binding protein 4, Lsb4p) and Lsb3p proteins, vertebrate SH3YL1 (SH3 domain containing Ysc84-like 1), and similar proteins. They contain an N-terminal SYLF domain (also called DUF500) and a C-terminal SH3 domain. SH3YL1 localizes to the plasma membrane and is required for dorsal ruffle formation. Ysc84p localizes to actin patches and plays an important role in actin polymerization during endocytosis. A study of the yeast SH3 domain interactome predicts that Lsb3p and Lsb4p may function as molecular hubs for the assembly of endocytic complexes. The SYLF domain of SH3YL1 binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity. 199
29558 211402 cd11526 SYLF_FYVE The SYLF domain (also called DUF500), a novel lipid-binding module, of FYVE zinc finger domain containing proteins. This subfamily is composed of uncharacterized proteins from plants and stramenopiles containing a FYVE zinc finger domain followed by a SYLF domain (also called DUF500). The SYLF domain of the related protein, SH3YL1, binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity. 201
29559 212134 cd11527 NTP-PPase_dUTPase Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in dimeric 2-Deoxyuridine 5'-triphosphate nucleotidohydrolase and similar proteins. dUTPase (dUTP pyrophosphatase; EC 3.6.1.23) catalyzes the hydrolysis of dUTP to dUMP and pyrophosphate. It acts to ensure chromosomal integrity by reducing the effective ratio of dUTP/dTTP. Members in this family are dimeric dUTPases, such as those from Leishmania major, Trypanosoma cruzi, and Campylobacter jejuni, which differ from the monomeric and trimeric forms and adopt an all-alpha topology. A central four-helix bundle, consisting of two alpha-helices from the rigid domain and two helices from the mobile domain and connecting loops, forms the active site in dimeric dUTPase-like proteins, requiring the presence of metal ion cofactors to hydrolyze both dUTP and dUDP. 94
29560 212135 cd11528 NTP-PPase_MazG_Nterm Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) N-terminal tandem-domain of MazG proteins from Escherichia coli and bacterial homologs. MazG is an NTP-PPase that hydrolyzes all canonical NTPs into their corresponding nucleoside monophosphates and pyrophosphate. The prototype of this family is the MazG protein from Escherichia coli (EcMazG), which represents the most abundant form, consisting of two sequence-related domains in tandem; this family corresponds to the N-terminal MazG-like domain. EcMazG functions as a regulator of cellular response to starvation by lowering the cellular concentration of guanosine 3',5'-bispyrophosphate (ppGpp). EcMazG exists as a dimer; each monomer contains two tandem MazG-like domains with similarly folded globular structures. However, only the C-terminal domain has a well-ordered active site and exhibits an NTPase activity responsible for the regulation of bacterial cell survival under nutritional stress. Divalent ions, such as Mg2+ or Mn2+, are required for activity; however, this domain does not exhibit an NTPase activity despite containing structural features such as the EEXX(E/D) motif and key basic catalytic residues responsible for nucleotide pyrophosphohydrolysis activity. It is suggested that the N-terminal domain of EcMazG might have a house-cleaning function by hydrolyzing noncanonical NTPs whose incorporation into the nascent DNA leads to increased mutagenesis and DNA damage. 114
29561 212136 cd11529 NTP-PPase_MazG_Cterm Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) C-terminal tandem-domain of MazG proteins from Escherichia coli and bacterial homologs. MazG is an NTP-PPase that hydrolyzes all canonical NTPs into their corresponding nucleoside monophosphates and pyrophosphate. The prototype of this family is the MazG protein from Escherichia coli (EcMazG), which represents the most abundant form, consisting of two sequence-related domains in tandem; this family corresponds to the C-terminal MazG-like domain. EcMazG functions as a regulator of cellular response to starvation by lowering the cellular concentration of guanosine 3',5'-bispyrophosphate (ppGpp). EcMazG exists as a dimer. Each monomer contains two tandem MazG-like domains with similarly folded globular structures. However, only the C-terminal domain has well-ordered active sites and exhibits an NTPase activity responsible for the regulation of bacterial cell survival under nutritional stress. Divalent ions, such as Mg2+ or Mn2+, are required for activity, along with structural features such as EEXX(E/D) motifs and key basic catalytic residues. It has been shown that the C-terminus NTPase activity is responsible for regulation of bacterial cell survival under nutritional stress. 116
29562 212137 cd11530 NTP-PPase_DR2231_like Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Deinococcus radiodurans DR2231 protein and its bacterial homologs. This family includes a MazG-like NTP-PPase from Deinococcus radiodurans (DR2231), a putative NTP-PPase YP_001813558.1 from Exiguobacterium sibiricum and their bacterial homologs. DR2231 shows significant structural resemblance to MazG proteins, but is functionally related to the dimeric dUTPases. It can hydrolyze dUTP into dUMP. DR2231-like proteins contain a well conserved divalent ion binding motif, EXXEX(12-28)EXXD, which is the identity signature for the all-alpha-helical NTP-PPase superfamily. Unlike normal dimeric dUTPase-like proteins with a central four-helix bundle forming the active site, YP_001813558.1 displays a very unusual interlaced segment-swapped dimer. It potentially prefers to hydrolyze dCTPs or its derivatives. YP_001813558.1-like proteins contain a variant divalent ion binding motif, EXXEX(12-28)AXXD. 88
29563 212138 cd11531 NTP-PPase_BsYpjD Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain putative pyrophosphatase YpjD from Bacillus subtilis and its bacterial homologs. This family includes a putative pyrophosphatase Ypjd from Bacillus subtilis (BsYpjD) and its homologs. Although its biological role has not been described in detail, BsYpjD shows significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, BsYpjD contains a single MazG-like domain. 93
29564 212139 cd11532 NTP-PPase_COG4997 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from archaea and bacteria. The family includes some uncharacterized hypothetical proteins from archaea and bacteria. Although their biological roles remain unclear, the family members show significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, the family contains a single MazG-like domain. 95
29565 212140 cd11533 NTP-PPase_Af0060_like Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in uncharacterized protein from Archaeoglobus fulgidus (Af0060) and its bacterial homologs. This family includes an uncharacterized protein from Archaeoglobus fulgidus (Af0060) and its homologs from bacteria. Although its biological role remains unclear, Af0060 shows high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 75
29566 212141 cd11534 NTP-PPase_HisIE_like Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Escherichia coli phosphoribosyl-ATP pyrophosphohydrolase (HisIE or PRATP-PH) and its homologs. This family includes Escherichia coli phosphoribosyl-ATP pyrophosphohydrolase, HisIE, and its homologs from all three kingdoms of life. E. coli HisIE is encoded by the hisIE gene, which is formed by the hisE gene fused to hisI. HisIE is a bifunctional enzyme responsible for the second and third steps of the histidine-biosynthesis pathway. Its N-terminal and C-terminal domains have phosphoribosyl-AMP cyclohydrolase (HisI) and phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) activity, respectively. This family corresponds to the C-terminal domain of HisIE and includes many proteins encoded by hisE genes, all of which show significant sequence similarity to Mycobacterium tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH). These proteins may be responsible for only the second step in the histidine-biosynthetic pathway, irreversibly hydrolyzing phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate. 84
29567 212142 cd11535 NTP-PPase_SsMazG Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Sulfolobus solfataricus (Ss) and its homologs from archaea and bacteria. This family includes a MazG-like protein from Sulfolobus solfataricus (SsMazG) and its homologs from archaea and bacteria. Although its biological role remains unclear, SsMazG shows significant sequence similarity to the NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, SsMazG contains a single MazG-like domain. It is predicted that SsMazG might participate in house-cleaning by preventing incorporation of the oxidation product 2-oxo-(d)ATP (iso-dGTP), a mutagenic derivative of ATP, into DNA. 76
29568 212143 cd11536 NTP-PPase_iMazG Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in integron-associated MazG (iMazG) proteins. This family corresponds to the iMazG proteins representing a new subfamily of MazG NTP-PPases. iMazG is likely to act as a house-cleaning enzyme capable of removing aberrant dNTPs, preventing the incorporation of damaging non-canonical nucleotides into host-cell DNA. It can convert dNTP to dNMP and pyrophosphate by cleaving between the alpha- and beta-phosphates of its dNTP substrates, with a marked preference for dCTP and dATP. Unlike typical tandem-domain MazG proteins, iMazG contains a single MazG-like domain and functions as a tetramer (a dimer of dimers) with a typical four-helical bundle. The divalent ions, such as Mg2+, are required for its pyrophosphatase activity. 90
29569 212144 cd11537 NTP-PPase_RS21-C6_like Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in mouse RS21-C6 protein and its homologs. RS21-C6 proteins, highly expressed in all vertebrate genomes and green plants, act as house-cleaning enzymes, removing 5-methyl dCTP (m5dCTP) in order to prevent gene silencing. They show significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, RS21-C6 contains a single MazG-like domain and functions as a tetramer (a dimer of dimers) with a typical four-helical bundle. Divalent ions, such as Mg2+, are required for its pyrophosphatase activity. This family also includes a pyrophosphatase from Archaeoglobus fulgidus (Af1178). Although its biological role remains unclear, Af1178 shows significant sequence similarity to the mouse RS21-C6 protein. 90
29570 212145 cd11538 NTP-PPase_u1 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 97
29571 212146 cd11539 NTP-PPase_u2 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. The family corresponds to a group of uncharacterized hypothetical proteins from bacteria and archaea, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 85
29572 212147 cd11540 NTP-PPase_u3 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria and archaea, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 76
29573 212148 cd11541 NTP-PPase_u4 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 91
29574 212149 cd11542 NTP-PPase_u5 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D]. 99
29575 212150 cd11543 NTP-PPase_u6 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain. 87
29576 212151 cd11544 NTP-PPase_DR2231 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Deinococcus radiodurans DR2231 protein and its bacterial homologs. This family corresponds to the DR2231 protein, a MazG-like NTP-PPase from Deinococcus radiodurans, and its bacterial homologs. All family members contain a well-conserved divalent ion binding motif, EXXEX(12-28)EXXD, which is the identity signature for the all-alpha-helical NTP-PPase superfamily. DR2231 shows significant structural resemblance to MazG proteins, but is functionally related to the dimeric dUTPases. It might be an evolutionary precursor of dimeric dUTPases with very high specificity in hydrolyzing dUTP into dUMP, but an inability to hydrolyze dTTP, a typical feature of dUTPases. Moreover, unlike the dUTPase monomer containing a single active site, the DR2231 protein dimer holds two putative active sites. 116
29577 212152 cd11545 NTP-PPase_YP_001813558 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Exiguobacterium sibiricum YP_001813558.1 protein and its bacterial homologs. This family contains a putative NTP_PPase (YP_001813558.1) from Exiguobacterium sibiricum and its bacterial homologs. Unlike normal dimeric dUTPase-like proteins with a central four-helix bundle forming the active site, YP_001813558.1 displays a very unusual interlaced segment-swapped dimer that might be important for it to adapt to an extremely cold environment. Moreover, structural analysis and comparisons indicate that YP_001813558.1 potentially prefers to hydrolyze dCTPs or its derivatives. 115
29578 212153 cd11546 NTP-PPase_His4 Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in His4-like fungal histidine biosynthesis trifunctional proteins and their homologs. This family includes fungal histidine biosynthesis trifunctional proteins and their homologs from eukaryotes and bacteria. Some family members contain three domains responsible for phosphoribosyl-AMP cyclohydrolase (PRAMP-CH), phosphoribosyl-ATP pyrophosphohydrolase (PRATP-PH), and histidinol dehydrogenase (Histidinol-DH) activity, respectively. Some others do not have Histidinol-DH domain, but have an additional N-terminal TIM phosphate binding domain. This family corresponds to the domain for PRATP-PH activity, which shows significant sequence similarity to Mycobacterium tuberculosis PRATP-PH that catalyzes the second step in the histidine-biosynthetic pathway, irreversibly hydrolyzing phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate. 84
29579 212154 cd11547 NTP-PPase_HisE Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Mycobacterium tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) and its bacterial homologs. This family includes M. tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) and its bacterial homologs. M. tuberculosis HisE is encoded by the hisE gene, which is a separate gene present in many bacteria and archaea but is fused to hisI in other bacteria, fungi and plants. HisE is responsible for the second step in the histidine-biosynthetic pathway. It can irreversibly hydrolyze phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate. HisE dimerizes into a four alpha-helix bundle, forming two inferred PRATP active sites on the outer faces. M. tuberculosis HisE has been found to be essential for growth in vitro, thus making it a potential drug target for tuberculosis. 86
29580 211389 cd11548 NodZ_like Alpha 1,6-fucosyltransferase similar to Bradyrhizobium NodZ. Bradyrhizobium NodZ is an alpha 1,6-fucosyltransferase involved in the biosynthesis of the nodulation factor, a lipo-chitooligosaccharide formed by three-to-six beta-1,4-linked N-acetyl-d-glucosamine (GlcNAc) residues and a fatty acid acyl group attached to the nitrogen atom at the non-reducing end. NodZ transfers L-fucose from the GDP-beta-L-fucose donor to the reducing residue of the chitin oligosaccharide backbone, before the attachment of a fatty acid group. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes. 287
29581 211403 cd11549 Serine_rich_CAS Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1 has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family. 159
29582 211404 cd11550 Serine_rich_NEDD9 Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein, Neural precursor cell Expressed, Developmentally Down-regulated 9; a protein interaction module. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1, another CAS protein, has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family. 162
29583 211405 cd11551 Serine_rich_CASS4 Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein family member 4; a protein interaction module. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1, another CAS protein, has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family. 159
29584 211406 cd11552 Serine_rich_BCAR1 Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein, Breast Cancer Anti-estrogen Resistance 1; a protein interaction module. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1 has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family. 157
29585 212092 cd11554 SLC6sbd_u2 uncharacterized eukaryotic solute carrier 6 subfamily; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, and are involved in regulating neurotransmitter signaling and homeostasis, and are the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter. 406
29586 271404 cd11555 SLC-NCS1sbd_u1 uncharacterized nucleobase-cation-symport-1 (NCS1) transporter subfamily; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s). 461
29587 271405 cd11556 SLC6sbd_SERT-like_u1 uncharacterized subgroup of the SERT-like Na(+)- and Cl(-)-dependent monoamine transporter subfamily; solute binding domain. SERT-like Na(+)- and Cl(-)-dependent monoamine transporters transport monoamine neurotransmitters from synaptic spaces into presynaptic neurons. Members include: the norepinephrine transporter NET, the serotonin transporter SERT, and the dopamine transporter DAT1. These latter may play a role in diseases or disorders including depression, anxiety disorders, and attention-deficit hyperactivity disorder, and in the control of human behavior and emotional states. They belong to the solute carrier 6 (SLC6) transporter family. Members of this subgroup are uncharacterized. 552
29588 211407 cd11557 ST7 Suppression of tumorigenicity 7. ST7 is a metazoan protein that behaves as a tumor suppressor in human cancer cells. It appears to localize to the cytoplasm and plasma membrane, and may mediate tumor suppression by regulating genes that are involved in oncogenic pathways and/or maintain cellular structure. It has been suggested that the suppression of tumorigenicity is associated with a function in mediating the remodeling of the extracellular matrix. However, somatic mutations of ST7 have not been observed as being commonly associated with molecular pathogenesis in various human neoplasias. 458
29589 211396 cd11558 W2_eIF2B_epsilon C-terminal W2 domain of eukaryotic translation initiation factor 2B epsilon. eIF2B is a heteropentameric complex which functions as a guanine nucleotide exchange factor in the recycling of eIF-2 during the initiation of translation in eukaryotes. The epsilon and gamma subunits are sequence similar and both are essential in yeast. Epsilon appears to be the catalytically active subunit, with gamma enhancing its activity. The C-terminal domain of the eIF2B epsilon subunit contains bipartite motifs rich in acidic and aromatic residues, which are responsible for the interaction with eIF2. The structure of the domain resembles that of a set of concatenated HEAT repeats. 169
29590 211397 cd11559 W2_eIF4G1_like C-terminal W2 domain of eukaryotic translation initiation factor 4 gamma 1 and similar proteins. eIF4G1 is a component of the multi-subunit eukaryotic translation initiation factor 4F, which facilitates recruitment of the mRNA to the ribosome, a rate-limiting step during translation initiation. This C-terminal domain, whose structure resembles that of a set of concatenated HEAT repeats, has been associated with binding to/recruiting the kinase Mnk1, which phosphorylates eIF4E. 134
29591 211398 cd11560 W2_eIF5C_like C-terminal W2 domain of the eukaryotic translation initiation factor 5C and similar proteins. eIF5C appears to be essential for the initiation of protein translation; its actual function, and specifically that of the C-terminal W2 domain, are not well understood. The Drosophila ortholog, kra (krasavietz) or exba (extra bases), may be involved in translational inhibition in neural development. The structure of this C-terminal domain resembles that of a set of concatenated HEAT repeats. 194
29592 211399 cd11561 W2_eIF5 C-terminal W2 domain of eukaryotic translation initiation factor 5. eIF5 functions as a GTPase acceleration protein (GAP), as well as a GDP dissociation inhibitor (GDI) during translational initiation in eukaryotes. The structure of this C-terminal domain resembles that of a set of concatenated HEAT repeats. 157
29593 211408 cd11564 FAT-like_CAS_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). The FAT-like C-terminal domain of CAS proteins binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion. 126
29594 211318 cd11566 eIF1_SUI1 Eukaryotic initiation factor 1. eIF1/SUI1 (eukaryotic initiation factor 1) plays an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. 84
29595 211319 cd11567 YciH_like Homologs of eIF1/SUI1 including Escherichia coli YciH. Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. The function of non-eukaryotic family members is unclear. Escherichia coli YciH is a non-essential protein and was reported to be able to perform some of the functions of IF3 in prokaryotic initiation. 76
29596 211409 cd11568 FAT-like_CASS4_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein family member 4; a protein interaction module. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion. 123
29597 211410 cd11569 FAT-like_BCAR1_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Breast Cancer Anti-estrogen Resistance 1; a protein interaction module. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion. 133
29598 211411 cd11570 FAT-like_NEDD9_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Neural precursor cell Expressed, Developmentally Down-regulated 9; a protein interaction module. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion. 128
29599 211412 cd11571 FAT-like_EFS_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Embryonal Fyn-associated Substrate; a protein interaction module. EFS is also called HEFS, CASS3 (CAS scaffolding protein family member 3) or SIN (Src-interacting protein). It was identified based on interactions with the Src kinases, Fyn and Yes. It plays a role in thymocyte development and acts as a negative regulator of T cell proliferation. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion. 130
29600 211413 cd11572 RlmI_M_like Middle domain of the SAM-dependent methyltransferase RlmI and related proteins. This middle or central domain is typically found between an N-terminal PUA domain and a C-terminal SAM-dependent methyltransferase domain, such as in the Escherichia coli ribosomal RNA large subunit methyltransferase RlmI (YccW). It may be involved in binding to the RNA substrate. 99
29601 211414 cd11573 GH99_GH71_like Glycoside hydrolase families 71, 99, and related domains. This superfamily of glycoside hydrolases contains families GH71 and GH99 (following the CAZY nomenclature), as well as other members with undefined function and specificity. 284
29602 211415 cd11574 GH99 Glycoside hydrolase family 99, an endo-alpha-1,2-mannosidase. This family of glycoside hydrolases 99 (following the CAZY nomenclature) includes endo-alpha-1,2-mannosidase (EC 3.2.1.130), which is an important membrane-associated eukaryotic enzyme involved in the maturation of N-linked glycans. Specifically, it cleaves mannoside linkages internal to N-linked glycan chains by hydrolyzing an alpha-1,2-mannosidic bond between a glucose-substituted mannose and the remainder of the chain. The biological function and significance of the soluble bacterial orthologs, which may have obtained the genes via horizontal transfer, is not clear. 338
29603 211416 cd11575 GH99_GH71_like_3 Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism. 376
29604 211417 cd11576 GH99_GH71_like_2 Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism. The domain may co-occur with other domains involved in the binding/processing of glycans. 378
29605 211418 cd11577 GH71 Glycoside hydrolase family 71. This family of glycoside hydrolases 71 (following the CAZY nomenclature) function as alpha-1,3-glucanases (mutanases, EC 3.2.1.59). They appear to have an endo-hydrolytic mode of enzymatic activity and bacterial members are investigated as candidates for the development of dental caries treatments. The member from fission yeast, endo-alpha-1,3-glucanase Agn1p, plays a vital role in daughter cell separation, while Agn2p has been associated with endolysis of the ascus wall. 283
29606 211419 cd11578 GH99_GH71_like_1 Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism. 313
29607 211420 cd11579 Glyco_tran_WbsX Glycosyl hydrolase family 99-like domain of WbsX-like glycosyltransferases. Members of this domain family are found in proteins within O-antigen biosynthesis clusters in Gram-negative bacteria, where they may function as glycosyl hydrolases and typically co-occur with glycosyltransferase domains. They bear resemblance to GH71 and the GH99 family of alpha-1,2-mannosidases and may share a similar catalytic site and mechanism. The O-antigens are essential lipopolysaccharides in the outer membrane of Gram-negative bacteria and have been linked to pathogenicity. 347
29608 211421 cd11580 eIF2D_N_like N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins. This N-terminal domain of various proteins co-occurs with a PUA domain. Members of this family are: (1) MCTS-1 (malignant T cell-amplified sequence 1) or MCT-1 (multiple copies T cell malignancies), which may play roles in the regulation of the cell cycle, (2) the eukaryotic translation initiation factor 2D, and (3) an uncharacterized archaeal family. 72
29609 212547 cd11581 GINS_A Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3. The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. 103
29610 211424 cd11582 Axin_TNKS_binding Tankyrase binding N-terminal segment of axin. This N-terminal region of axin mediates interactions with the ankyrin-repeat clusters 2 and 3 of tankyrase, which controls the turnover of axin via poly-ADP-ribosylation. Axin functions as a negative regulator of the WNT signaling pathway. 69
29611 211425 cd11583 Orc6_mid Middle domain of the origin recognition complex subunit 6. Orc6 is a subunit of the origin recognition complex in eukaryotes, and it may be involved in binding to DNA. This model describes the central or middle domain of Orc6, whose structure resembles that of TFIIB, a DNA-binding transcription factor. Orc6 appears to form distinct complexes with DNA, and a putative DNA-binding site has been identified. 94
29612 211426 cd11585 SATB1_N N-terminal domain of SATB1 and similar proteins. SATB1, the special AT-rich sequence-binding protein 1, is involved in organizing chromosomal loci into distinct loops, creating a "loopscape" that has a direct bearing on gene expression. This N-terminal domain, which may be involved in various interactions with chromatin proteins, resembles a ubiquitin domain and has been shown to form tetramers, a function critical to SATB1-DNA interactions. The related Drosophila homeobox gene defective proventriculus (dve) plays a key role in the functional specification during endoderm development. 100
29613 212155 cd11586 VbhA_like VbhA antitoxin and related proteins. VbhA is the antitoxin to VbhT. The VbhT toxin of the mammalian pathogen Bartonella schoenbuchensis is responsible for the disruptive adenylation of host proteins. VbhT also induces FIC-domain-mediated growth arrest in bacteria; it is inhibited by this antitoxin which binds to block the ATP binding site of the VbhT FIC domain. 54
29614 212536 cd11587 Arginase-like Arginase types I and II and arginase-like family. This family includes arginase, also known as arginase-like amidino hydrolase family, and related proteins, found in bacteria, archaea and eukaryotes. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide. In vertebrates, at least two isozymes have been identified: type I cytoplasmic or hepatic liver-type arginase and type II mitochondrial or non-hepatic arginase. Point mutations in human arginase gene lead to hyperargininemia with consequent mental disorders, retarded development and early death. Arginase is a therapeutic target to treat asthma, erectile dysfunction, atherosclerosis and cancer. 294
29615 212537 cd11589 Agmatinase_like_1 Agmatinase and related proteins. This family includes known and predicted bacterial agmatinase (agmatine ureohydrolase; AUH; SpeB; EC=3.5.3.11), a binuclear manganese metalloenzyme, belonging to the ureohydrolase superfamily. It is a key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield urea and putrescine, the precursor for biosynthesis of higher polyamines, spermidine, and spermine. Agmatinase from Deinococcus radiodurans shows approximately 33% of sequence identity to human mitochondrial agmatinase. An analysis of the evolutionary relationship among ureohydrolase superfamily enzymes indicates the pathway involving arginine decarboxylase and agmatinase evolved earlier than the arginase pathway of polyamine. 274
29616 212538 cd11592 Agmatinase_PAH Agmatinase-like family includes proclavaminic acid amidinohydrolase. This agmatinase subfamily contains bacterial and fungal/metazoan enzymes, including proclavaminic acid amidinohydrolase (PAH, EC 3.5.3.22) and Pseudomonas aeruginosa guanidinobutyrase (GbuA) and guanidinopropionase (GpuA). PAH hydrolyzes amidinoproclavaminate to yield proclavaminate and urea in clavulanic acid biosynthesis. Clavulanic acid is an effective inhibitor of beta-lactamases and is used in combination with amoxicillin to protect the beta-lactam ring of the antibiotic from hydrolysis, thus keeping the antibiotic biologically active. GbuA hydrolyzes 4-guanidinobutyrate (4-GB) into 4-aminobutyrate and urea while GpuA hydrolyzes 3-guanidinopropionate (3-GP) into beta-alanine and urea. Mutation studies show that significant variations in two active site loops in these two enzymes may be important for substrate specificity. This subfamily belongs to the ureohydrolase superfamily, which includes arginase, agmatinase, proclavaminate amidinohydrolase, and formiminoglutamase. 289
29617 212539 cd11593 Agmatinase-like_2 Agmatinase and related proteins. This family includes known and predicted bacterial and archaeal agmatinase (agmatine ureohydrolase; AUH; SpeB; EC=3.5.3.11), a binuclear manganese metalloenzyme that belongs to the ureohydrolase superfamily. It is a key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield urea and putrescine, the precursor for biosynthesis of higher polyamines, spermidine, and spermine. As compared to E. coli where two paths to putrescine exist, via decarboxylation of an amino acid, ornithine or arginine, a single path is found in Bacillus subtilis, where polyamine synthesis starts with agmatine; the speE and speB encode spermidine synthase and agmatinase, respectively. The level of agmatinase synthesis is very low, allowing strict control on the synthesis of putrescine and therefore, of all polyamines, consistent with polyamine levels in the cell. This subfamily belongs to the ureohydrolase superfamily, which includes arginase, agmatinase, proclavaminate amidinohydrolase, and formiminoglutamase. 263
29618 212540 cd11598 HDAC_Hos2 Class I histone deacetylases including ScHos2 and SpPhd1. This subfamily includes Class I histone deacetylase (HDAC) Hos2 from Saccharomyces cerevisiae as well as a histone deacetylase Phd1 from Schizosaccharomyces pombe. Hos2 binds to the coding regions of genes during gene activation, specifically it deacetylates the lysines in H3 and H4 histone tails. It is preferentially associated with genes of high activity genome-wide and is shown to be necessary for efficient transcription. Thus, Hos2 is directly required for gene activation in contrast to other class I histone deacetylases. Protein encoded by phd1 is inhibited by trichostatin A (TSA), a specific inhibitor of histone deacetylase, and is involved in the meiotic cell cycle in S. pombe. Class 1 HDACs are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). 311
29619 212541 cd11599 HDAC_classII_2 Histone deacetylases and histone-like deacetylases, classII. This subfamily includes eukaryotic as well as bacterial Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. In D. discoideum, where four homologs (HdaA, HdaB, HdaC, HdaD) have been identified, HDAC activity is important for regulating the timing of gene expression during development. Also, inhibition of HDAC activity by trichostatin A is shown to cause hyperacetylation of the histone and a delay in cell aggregation and differentiation. 288
29620 212542 cd11600 HDAC_Clr3 Class II Histone deacetylase Clr3 and similar proteins. Clr3 is a class II Histone deacetylase Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Clr3 is the homolog of the class-II HDAC Hda1 in S. cerevisiae, and is essential for silencing in heterochromatin regions, such as centromeric regions, ribosomal DNA, the mating-type region and telomeric loci. Clr3 has also been implicated in the regulation of stress-related genes; the histone acetyltransferase, Gcn5, in S. cerevisiae, preferentially acetylates global histone H3K14 while Clr3 preferentially deacetylates H3K14ac, and therefore, interplay between Gcn5 and Clr3 is crucial for the regulation of many stress-response genes. 313
29621 409282 cd11601 Nip7_N-like N-terminal domain of Nip7 and similar proteins. This domain of various proteins is often found N-terminal to a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. The family contains Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae. Recently, it was demonstrated that human Nip7 is essential in the accurate processing of pre-rRNA. Also included are KD93, a human homolog of Nip7, as well as an archaeal homolog and bacterial RsmB/RsmF family ribosomal methyltransferases. 76
29622 211427 cd11602 Ndc10 Ndc10 component of the yeast centromere-binding factor 3. Ndc10 is a multidomain protein conserved in Saccharomycotina that interacts with kinetochore components. This model characterizes the majority of the protein; some family members may have an additional C-terminal domain that is homologous to transcriptional activators (GCR1_C). Ndc10 is part of the centromere-binding factor 3 (CBF3) complex in budding yeast. The CBF3 complex contains four essential proteins, Ndc10, Cep3, Ctf13, and Skp1. CBF3/Ndc10 is essential for the recruitment of the centromeric nucleosome and formation of the kinetochore. The kinetochore is the large, multiprotein assembly that serves to connect condensed sister chromatids to the mitotic spindle. Ndc10 forms a dimer and it has non-sequence-specific DNA binding activity via the DNA backbone. Ndc10 also plays an important role in the coordination of cell division. It has been noted that the protein bears resemblance to the tyrosine recombinases (type IB topoisomerase/lambda-integrase). 413
29623 211428 cd11603 ThermoDBP Thermoproteales single-stranded DNA-binding (SSB) domain. ThermoDBP is a SSB protein of the Thermoproteales. SSB proteins are essential for the genome maintenance of all known cellular organisms. Many SSBs contain an OB fold domain, albeit with low sequence conservation and OB fold-containing SSB proteins have been detected in all three domains of life. However, one group of Crenarchaea, the Thermoproteales, lack SSB encoding genes. The Thermoproteales SSB protein, ThermoDBP, lacks the OB fold and binds specifically to ssDNA with low sequence specificity. Its three-dimensional structure resembles that of the Hut operon positive regulatory protein HutP. 141
29624 211429 cd11604 RTT106_N histone chaperone RTT106, regulator of Ty1 transposition protein 106; N-terminal homodimerization domain. This cd includes the N-terminal homodimerization domain of Saccharomyces cerevisiae Rtt106, a histone chaperone. In addition to this domain, Rtt106 contains two C-terminal pleckstrin-homology (PH) domains. The acetylation of lysine 56 in histone H3 (H3K56ac) is implicated in regulating nucleosome disassembly during gene transcription, and nucleosome assembly during DNA replication and repair. Rtt106 has been shown to aid in the efficient deposition of newly synthesized H3K56ac onto replicating DNA. The interaction of Rtt106 with (H3-H4)2, most likely in the form of a (H3-H4)2 tetramer, is important for gene silencing and for the DNA damage response. Data supports a combinatorial interaction: this N-terminal domain homodimerizes and intercalates between the two H3-H4 components of the (H3-H4)2 tetramer, independent of acetylation, and the two double PH domains bind the K56-containing region of H3. Acetylation of K56 increases the affinity of the interaction. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). Saccharomyces cerevisiae Rtt106 also plays a role in regulating Ty1 transposition. 54
29625 212156 cd11606 COE_DBD Collier/Olf/Early B-cell factor (EBF) DNA Binding Domain. COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain. 212
29626 211320 cd11607 DENR_C C-terminal domain of DENR and related proteins. DENR (density regulated protein), together with MCT-1 (multiple copies T cell malignancies), has been shown to have similar function as eIF2D translation initiation factor (also known as ligatin), which is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner. 86
29627 211321 cd11608 eIF2D_C C-terminal domain of eIF2D and related proteins. eIF2D translation initiation factor (also known as ligatin) is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner. 85
29628 211422 cd11609 MCT1_N N-terminal domain of multiple copies T cell malignancies 1 and related proteins. This N-terminal domain of MCT-1 (multiple copies T cell malignancies 1), also known as MCTS-1 (malignant T cell-amplified sequence 1), co-occurs with a PUA domain. MCT-1, together with DENR (density regulated protein), has been shown to have similar function as eIF2D translation initiation factor (also known as ligatin), which is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner. 77
29629 211423 cd11610 eIF2D_N N-terminal domain of eIF2D and related proteins. This N-terminal domain of eIF2D co-occurs with a PUA domain. eIF2D translation initiation factor (also known as ligatin) is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner. 76
29630 212157 cd11611 SAF Domains similar to fish antifreeze type III protein. SAF domains are found in a wide variety of proteins with quite different functions. They are components of enzymes, such as D-altronate-dehydratases or sialic acid synthetases, of antifreeze proteins conserved in fish (where they bind to nascent ice crystals), and may act as periplasmic chaperones in bacterial flagella basal body P-ring formation. 56
29631 212158 cd11613 SAF_AH_GD Domains similar to fish antifreeze type III protein. Altronate dehydratase (EC 4.2.1.7) converts D-altronate into 2-dehydro-3-deoxy-D-gluconate and is part of a bacterial pathway for the degradation of D-galacturonate. D-galactarate dehydratase (EC 4.2.1.42) eliminates water from D-galactarate to yield 5-dehydro-4-deoxy-D-glucarate, initializing the degradation of D-galactarate. The function of the SAF domain in these enzymes is not clear. It may participate in dimerization. 80
29632 212159 cd11614 SAF_CpaB_FlgA_like SAF domains of the flagella basal body P-ring formation protein FlgA and the flp pilus assembly CpaB. FlgA is a putative periplasmic chaperone that assists in the formation of the flagellar P ring; CpaB is a protein involved in the assembly of the flp pili, which are bacterial virulence factors mediating non-specific adherence to surfaces; these proteins appear to contain a single SAF domain. This intermediate family also contains the SAF domains of sialic acid synthetases and type III antifreeze proteins, which also share the same extensive core structure. 61
29633 212160 cd11615 SAF_NeuB_like C-terminal SAF domain of sialic acid synthetase. Sialic acid synthetase (N-acetylneuraminate synthase or N-acetylneuraminate-9-phosphate synthase) catalyzes the condensation of phosphoenolpyruvate with N-acetylmannosamine (ManNAc, in bacteria) or N-acetylmannosamine-6-phosphate (ManNAc-6P, in mammals), to yield N-acetylneuraminic acid (NeuNAc) or N-acetylneuraminic acid-9-phosphate (NeuNAc-9P), respectively. The N-terminal NeuB domain, a TIM-barrel-like structure, contains the catalytic site; the function of the SAF domain is not as clear. It may participate in domain-swapped dimerization and play a role in binding the substrate, in either domain-swapped dimers or by directly interacting with the N-terminal domain. Also included in the family are PEP-sugar pyruvyltransferases known as spore coat polysaccharide biosynthesis proteins (SpsE). 58
29634 212161 cd11616 SAF_DH_OX_like SAF domain of putative dehydrogenases or oxidoreductases. C-terminal SAF domain of an uncharacterized family of putative dehydrogenases or oxidoreductases, which are otherwise members of the NAD(P)-dependent Rossmann-fold superfamily. 80
29635 212162 cd11617 Antifreeze_III Type III antifreeze protein, may be specific to the Zoarcoidei. Antifreeze protein III inhibits the growth of ice crystals and protects fish from cold damage in sub-freezing temperatures. 62
29636 211316 cd11618 ChtBD1_1 Hevein or type 1 chitin binding domain; filamentous ascomycete subfamily. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 44
29637 212009 cd11619 HR1_CIP4-like Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Cdc42-Interacting Protein 4 and similar proteins. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. Members of this subfamily typically contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of CIP4 binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane. 77
29638 212010 cd11620 HR1_PKC-like_2_fungi Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of fungal Protein Kinase C-like proteins. This subfamily is composed of fungal PKC-like proteins including Pkc1p from Saccharomyces cerevisiae, and Pck1p and Pck2p from Schizosaccharomyces pombe. The yeast PKC-like proteins play a critical role in regulating cell wall biosynthesis and maintaining cell wall integrity. They contain two HR1 domains, C2 and C1 domains, and a kinase domain. This model characterizes the second HR1 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. The HR1 domains of Pck1p and Pck2p interact with GTP-bound Rho1p and Rho2p. 72
29639 212011 cd11621 HR1_PKC-like_1_fungi First Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of fungal Protein Kinase C-like proteins. This subfamily is composed of fungal PKC-like proteins including Pkc1p from Saccharomyces cerevisiae, and Pck1p and Pck2p from Schizosaccharomyces pombe. The yeast PKC-like proteins play a critical role in regulating cell wall biosynthesis and maintaining cell wall integrity. They contain two HR1 domains, C2 and C1 domains, and a kinase domain. This model characterizes the first HR1 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. The HR1 domains of Pck1p and Pck2p interact with GTP-bound Rho1p and Rho2p. 72
29640 212012 cd11622 HR1_PKN_1 First Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the first HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 66
29641 212013 cd11623 HR1_PKN_2 Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the second HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 71
29642 212014 cd11624 HR1_Rhophilin Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin. Rhophilins are scaffolding proteins that function as effectors of the Rho family of small GTPases. Vertebrates harbor two proteins, Rhophilin-1 and Rhophilin-2, whose exact functions are yet to be determined. Rhophilin-1 has been implicated in sperm motility. Rhophilin-2 regulates the organization of the actin cytoskeleton. Rhophilins contain N-terminal HR1, central Bro1-like, and C-terminal PDZ domains; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; both Rhophilin-1 and Rhophilin-2 bind RhoA, and Rhophilin-2 has also been shown to bind RhoB. 76
29643 212015 cd11625 HR1_PKN_3 Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the third HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 74
29644 212016 cd11626 HR1_ROCK Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase. ROCK is also referred to as Rho-associated kinase or simply as Rho kinase. It is a serine/threonine protein kinase that is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. ROCKs are the best-described effectors of RhoA. There are two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Studies in knockout mice result in different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. ROCK contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 66
29645 212017 cd11627 HR1_Ste20-like Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Schizosaccharomyces pombe Ste20-like proteins. This group is composed of predominantly uncharacterized fungal proteins, which contain two known domains: HR1 at the N-terminal region and REM (Ras exchanger motif) at the C-terminal region. One member protein from Schizosaccharomyces pombe is named Ste16 while its gene is called ste20 (a target of rapamycin complex 2 subunit). It is a subunit in the protein kinase TOR complexes in fission yeast. The REM domain is usually found in nucleotide exchange factors for Ras-like small GTPases. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 71
29646 212018 cd11628 HR1_CIP4_FNBP1L Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of vertebrate Cdc42-Interacting Protein 4 and FormiN Binding Protein 1-Like. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. FNBP1L, also called Toca-1 (Transducer of Cdc42-dependent actin assembly 1), forms a complex with neural WASP; the complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. CIP4 and FNBP1L contain an N-terminal F-BAR domain, a central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of CIP4 binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane. 81
29647 212019 cd11629 HR1_FBP17 Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Formin Binding Protein 17. FBP17, also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. It also binds to the Fas ligand and may be implicated in the inflammatory response. FBP17 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of the related protein, CIP4, binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane. 77
29648 212020 cd11630 HR1_PKN1_2 Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N1. PKN1, also called PKNalpha or Protein-kinase C-related kinase 1 (PRK1), is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, and by fatty acids such as arachidonic and linoleic acids. It is expressed ubiquitously and is the most abundant PKN isoform in neurons. PKN1 is implicated in a variety of functions including cytoskeletal reorganization, cardiac cell survival, cell adhesion, and glucose transport, among others. PKN1 contains three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the second HR1 domain of PKN1. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN1 binds the GTPases RhoA, RhoB, and RhoC, and can also interact weakly with Rac. 78
29649 212021 cd11631 HR1_PKN2_2 Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N2. PKN2, also called PKNgamma or Protein-kinase C-related kinase 2 (PRK2), is a serine/threonine protein kinase and an effector of the small GTPase Rho/Rac. It regulates G2/M cell cycle progression and the exit from cytokinesis. It also phosphorylates hepatitis C virus (HCV) RNA polymerase and thus, plays a role in HCV RNA replication. PKN2 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN2 contains a proline-rich region in between its C2 and kinase domains and has been shown to associate with SH3 domain containing proteins like NCK and Grb4. This model characterizes the second HR1 domain of PKN2. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN2 specifically binds to RhoA GTPase in a GTP-dependent manner. The HR1 domains of PKN2, together with its C2 domain, also facilitate the recruitment of PKN2 to primordial junctions at nascent cell-cell contacts, where it promotes junctional maturation. 74
29650 212022 cd11632 HR1_PKN3_2 Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N3. PKN3, also called PKNbeta, is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, preferentially by RhoC. Both PKN1 and RhoC show limited and barely detectable expression in normal tissues, but are both upregulated in cancer cells, particularly in late-stage malignancies. PKN3 has been implicated to play a role in the metastatic growth and invasiveness of cancer cells, downstream of the oncogenic phosphoinositide 3-kinase signaling network. PKN3 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN3 contains two proline-rich regions between its C2 and kinase domains, and has been shown to associate with SH3 domain containing proteins like GRAFs, GAP for RhoA, and Cdc42Hs. This model characterizes the second HR1 domain of PKN3. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN3 binds Rho family GTPases, preferentially RhoC. 74
29651 212023 cd11633 HR1_Rhophilin-1 Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin-1. Rhophilin-1 is a scaffolding protein that functions as an effector of the Rho family of small GTPases. It has been implicated in sperm motility. Rhophilin-1 contains an N-terminal HR1, a central Bro1-like, and a C-terminal PDZ domain; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; Rhophilin-1 binds RhoA and was isolated initially as a RhoA-binding protein. 85
29652 212024 cd11634 HR1_Rhophilin-2 Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin-2. Rhophilin-2 is a scaffolding protein that functions as an effector of the Rho family of small GTPases. It plays a role in regulating the organization of the actin cytoskeleton. Rhophilin-2 contains an N-terminal HR1, a central Bro1-like, and a C-terminal PDZ domain; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; Rhophilin-2 has been shown to bind both RhoA and RhoB. 82
29653 212025 cd11635 HR1_PKN2_3 Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N2. PKN2, also called PKNgamma or Protein-kinase C-related kinase 2 (PRK2), is a serine/threonine protein kinase and an effector of the small GTPase Rho/Rac. It regulates G2/M cell cycle progression and the exit from cytokinesis. It also phosphorylates hepatitis C virus (HCV) RNA polymerase and thus, plays a role in HCV RNA replication. PKN2 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN2 contains a proline-rich region in between its C2 and kinase domains and has been shown to associate with SH3 domain containing proteins like NCK and Grb4. This model characterizes the third HR1 domain of PKN2. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN2 specifically binds to RhoA GTPase in a GTP-dependent manner. The HR1 domains of PKN2, together with its C2 domain, also facilitate the recruitment of PKN2 to primordial junctions at nascent cell-cell contacts, where it promotes junctional maturation. 74
29654 212026 cd11636 HR1_PKN1_3 Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N1. PKN1, also called PKNalpha or Protein-kinase C-related kinase 1 (PRK1), is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, and by fatty acids such as arachidonic and linoleic acids. It is expressed ubiquitously and is the most abundant PKN isoform in neurons. PKN1 is implicated in a variety of functions including cytoskeletal reorganization, cardiac cell survival, cell adhesion, and glucose transport, among others. PKN1 contains three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the third HR1 domain of PKN1. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN1 binds the GTPases RhoA, RhoB, and RhoC, and can also interact weakly with Rac. 74
29655 212027 cd11637 HR1_PKN3_3 Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N3. PKN3, also called PKNbeta, is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, preferentially by RhoC. Both PKN1 and RhoC show limited and barely detectable expression in normal tissues, but are both upregulated in cancer cells, particularly in late-stage malignancies. PKN3 has been implicated to play a role in the metastatic growth and invasiveness of cancer cells, downstream of the oncogenic phosphoinositide 3-kinase signaling network. PKN3 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN3 contains two proline-rich regions between its C2 and kinase domains, and has been shown to associate with SH3 domain containing proteins like GRAFs, GAP for RhoA, and Cdc42Hs. This model characterizes the third HR1 domain of PKN3. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN3 binds Rho family GTPases, preferentially RhoC. 74
29656 212028 cd11638 HR1_ROCK2 Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase 2. ROCK2 is a serine/threonine protein kinase and was the first identified target of activated RhoA. It plays a role in stress fiber and focal adhesion formation, and is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK2 contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. ROCK2 is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 67
29657 212029 cd11639 HR1_ROCK1 Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase 1. ROCK1 is a serine/threonine kinase and is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK1 contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. It is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. 66
29658 212163 cd11640 HutP Histidine Utilizing Protein, the hut operon positive regulatory protein. The HutP protein family regulates the expression of 'hut' structural genes in Bacillus and other bacteria. It forms an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. In an L-histidine and Mg2+ dependent manner, HutP binds to the nascent hut mRNA leader transcript, and the ensuing anti-termination complex inhibits formation of a stem-loop terminator, clearing the way for transcription of the hut structural genes. 134
29659 381168 cd11641 Precorrin-4_C11-MT Precorrin-4 C11-methyltransferase (CbiF/CobM). Precorrin-4 C11-methyltransferase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. In the aerobic pathway, CobM catalyzes the methylation of precorrin-4 at C-11 to yield precorrin-5. In the anaerobic pathway, CbiF catalyzes the methylation of cobalt-precorrin-4 to cobalt-precorrin-5. Both CbiF and CobM, which are homologous, are included in this model. There are about 30 enzymes involved in the vitamin B12 synthetic pathway. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared in both pathways and several of these enzymes are pathway-specific. 225
29660 381169 cd11642 SUMT Uroporphyrin-III C-methyltransferase (also known as S-Adenosyl-L-methionine:uroporphyrinogen III methyltransferase, SUMT). SUMT is an enzyme of the cobalamin and siroheme biosynthetic pathway. It catalyzes the first of three steps leading to the formation of siroheme from uroporphyrinogen III; it transfers two methyl groups from S-adenosyl-L-methionine to the C-2 and C-7 atoms of uroporphyrinogen III to yield precorrin-2 via the intermediate formation of precorrin-1. Precorrin-2 is also a precursor for the biosynthesis of vitamin B12, coenzyme F430, siroheme and heme d1. This family includes proteins in which the SUMT domain is fused to other functional domains, such as to a uroporphyrinogen-III synthase domain to form bifunctional uroporphyrinogen-III methylase/uroporphyrinogen-III synthase, or to a dual function dehydrogenase-chelatase domain, as in the case of the multifunctional S-adenosyl-L-methionine (SAM)-dependent bismethyltransferase/dehydrogenase/ferrochelatase CysG, which catalyzes all three steps that transform uroporphyrinogen III into siroheme. 228
29661 381170 cd11643 Precorrin-6A-synthase Precorrin-6A synthase (also named CobF). Precorrin-6A synthase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. This model represents CobF, the precorrin-6A synthase, an enzyme specific to the aerobic pathway. After precorrin-4 is methylated at C-11 by CobM to produce precorrin-5, CobF catalyzes the removal of the extruded acyl group in the subsequent step, and the addition of a methyl group at C-1. The product of this reaction is precorrin-6A, which gets reduced by an NADH-dependent reductase to yield precorrin-6B. This family includes enzymes in GC-rich Gram-positive bacteria, alpha proteobacteria and Pseudomonas-related species. 244
29662 381171 cd11644 Precorrin-6Y-MT Precorrin-6Y methyltransferase (also named CbiE). CbiE (precorrin-6Y methyltransferase, also known as cobalt-precorrin-7 C(5)-methyltransferase, also known as cobalt-precorrin-6Y C(5)-methyltransferase) catalyzes the methylation of C-5 in cobalt-precorrin-7 to form cobalt-precorrin-8. It participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. CbiE functions in the anaerobic pathway, it is a subunit of precorrin-6Y C5,15-methyltransferase, a bifunctional enzyme: cobalt-precorrin-7 C(5)-methyltransferase (CbiE)/cobalt-precorrin-6B C(15)-methyltransferase (decarboxylating) (CbiT), that catalyzes two methylations (at C-5 and C-15) in precorrin-6Y, as well as the decarboxylation of the acetate side chain located in ring C, in order to generate precorrin-8X. CbiE and CbiT can be found fused (CbiET, also called CobL), or on separate protein chains (CbiE and CbiT). In the aerobic pathway, a single enzyme called CobL catalyzes the methylations at C-5 and C-15, and the decarboxylation of the C-12 acetate side chain of precorrin-6B. 198
29663 381172 cd11645 Precorrin_2_C20_MT Precorrin-2 C20-methyltransferase, also named CobI or CbiL. Precorrin-2 C20-methyltransferase (also known as S-adenosyl-L-methionine--precorrin-2 methyltransferase) participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. Precorrin-2 C20-methyltransferase catalyzes methylation at the C-20 position of a cyclic tetrapyrrole ring of precorrin-2 using S-adenosylmethionine as a methyl group source to produce precorrin-3A. In the anaerobic pathway, cobalt is inserted into precorrin-2 by CbiK to generate cobalt-precorrin-2, which is the substrate for CbiL, a C20 methyltransferase. In Clostridium difficile, CbiK and CbiL are fused into a bifunctional enzyme. In the aerobic pathway, the precorrin-2 C20-methyltransferase is named CobI. This family includes CbiL and CobI precorrin-2 C20-methyltransferases, both as stand-alone enzymes and when CbiL forms part of a bifunctional enzyme. 223
29664 381173 cd11646 Precorrin_3B_C17_MT Precorrin-3B C(17)-methyltransferase (also named CobJ or CbiH). Precorrin-3B C(17)-methyltransferase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. This model includes CobJ of the aerobic pathway and CbiH of the anaerobic pathway, both as stand-alone enzymes and when CobJ or CbiH form part of bifunctional enzymes, such as in Mycobacterium tuberculosis CobIJ where CobJ fuses with a precorrin-2 C(20)-methyltransferase domain, or Bacillus megaterium CbiH60, where CbiH is fused to a nitrite and sulfite reductase-like domain. In the aerobic pathway, once CobG has generated precorrin-3b, CobJ catalyzes the methylation of precorrin-3b at C-17 to form precorrin-4 (the extruded methylated C-20 fragment is left attached as an acyl group at C-1). In the corresponding anaerobic pathway, CbiH carries out this ring contraction, using cobalt-precorrin-3b as a substrate to generate a tetramethylated delta-lactone. 238
29665 381174 cd11647 DHP5_DphB diphthine methyl ester synthase and diphthine synthase. Eukaryotic diphthine methyl ester synthase (DHP5) and archaeal diphthamide synthase (DphB) participate in the second step of the biosynthetic pathway of diphthamide. The eukaryotic enzyme catalyzes four methylations of the modified target histidine residue in translation elongation factor 2 (EF-2), to form an intermediate called diphthine methyl ester; the archaeal enzyme catalyzes only three methylations, producing diphthine. Diphtheria toxin ADP-ribosylates diphthamide, leading to inhibition of protein synthesis in the eukaryotic host cells. 241
29666 381175 cd11648 RsmI Ribosomal RNA small subunit methyltransferase I (RsmI), also known as rRNA (cytidine-2'-O-)-methyltransferase. RsmI is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress. 216
29667 381176 cd11649 RsmI_like uncharacterized subfamily of the tetrapyrrole methylase family similar to Ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress. 229
29668 212164 cd11650 AT4G37440_like Uncharacterized protein domain conserved in plants. This domain contains an extensive protein sequence fragment that appears conserved in a number of plant proteins, including the gene product of Arabidopsis thaliana locus AT4G37440, which has been identified in transcriptional profiling as expressed at different levels in white cabbage cultivars. 253
29669 212165 cd11651 YPK1_N_like Fungal protein kinase domain similar to the N-terminus of YPK1. This fungal domain family includes the N-terminal region of the Saccharomyces cerevisiae AGC kinases YPK1 and YPK2, which were found to be essential for the proliferation of yeast. YPK1 is required for cell growth and acts as a downstream kinase in the sphingolipid-mediated signaling pathway of yeast. It also plays a role in efficient endocytosis and in the maintenance of cell wall integrity. 174
29670 212166 cd11652 SSH-N N-terminal domain conserved in slingshot (SSH) phosphatases. This domain or region conserved in Bilateria is found N-terminal to the DEK_C-like and catalytic domains of slingshot phosphatases. Slingshot is a cofilin-specific phosphatase. Dephosphorylation reactivates cofilin, which in turn depolymerizes actin and is thus required for actin filament reorganization. Slingshot is a member of the dual-specificity protein phosphatase family. This N-terminal SSH region may be involved in P-cofilin binding (the model C-terminus plus the DEK_C-like domain, which are characterized as the "B" domain in some of the literature), and may be required for the F-actin mediated activation of slingshot (the N-terminal region of this model, sometimes referred to as the "A" domain). 233
29671 212167 cd11653 rap1_RCT C-terminal domain of RAP1 recruits proteins to telomeres. The RAP1 (repressor activator protein 1) C-terminal domain (RCT) mediates interactions with other proteins such as TRF2 (human), Rif1, Rif2, Sir3, Sir4 (Saccharomyces cerevisiae), and Taz1 (Schizosaccharomyces pombe) at telomeres and other loci. RAP1, identified in budding yeast as repressor/activator protein 1, is a well-conserved telomere binding protein, also found in fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation, gene silencing, as well as binding at numerous sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human RAP1 apparently does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the RCT. RAP1 might act by suppressing nonhomologous end-joining. Yeast RAP1 has two myb-type DNA binding modules, and an RCT domain that recruits Sir proteins 3 and 4 (Sir3, Sir4) for gene silencing, and Rif1 and Rif2 for telomere length maintenance. Schizosaccharomyces pombe RAP1 (spRap1), like human RAP1, lacks direct DNA-binding activity and is localized to telomeres via Taz1, an ortholog of TRF1 and TRF2. The S. pombe RCT resembles the first 3-helix bundle of the yeast and human RCT forms, but is not included in this larger model. 100
29672 212553 cd11654 TRF2_RBM RAP1 binding motif of telomere repeat binding factor. TRF2 (Telomere repeat binding factor 2) functions as part of the 6-component shelterin complex. TRF2 binds DNA and recruits RAP1 (via binding to the RAP1 protein c-terminus (RCT)) and TIN2 in the protection of telomeres from DNA repair machinery. Metazoan shelterin consists of 3 DNA-binding proteins (TRF2, TRF1 and POT1) and 3 recruited proteins that bind to one or more of these DNA-binding proteins (RAP1, TIN2, TPP1). Human TRF1 and TRF2 bind double-stranded DNA. hTRF2 consists of a basic N-terminus, a TRF homology domain, the RAP1 binding motif (RBM) described by this model, the TIN2 binding motif (TBM), and a myb-like DNA binding domain. 42
29673 212554 cd11655 rap1_myb-like DNA-binding modules of yeast Rap1 and related proteins. Yeast Rap1 DNA-binding activity is mediated by a pair of DNA-binding modules comprised of 2 3-helix bundles with an N-terminal arm, closely matching the structure of homeodomain and myb-type proteins. Human Rap1 has a single myb-like module, and may not bind DNA directly. Rap1, identified in budding yeast as repressor-activator protein 1, is a conserved telomere binding protein, also identified in fission yeast and mammals. In Saccharomyces cerevisiae, Rap1 directly binds DNA and is involved in transcriptional activation, gene silencing, as well as binding at numerous binding sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human Rap1 apparently does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the Rap C-terminal domain (RCT). Rap1 may act by suppressing non-homologous end-joining. Yeast Rap1 has 2 myb-type DNA binding modules, a BRCT domain, and a RCT domain that recruits Sir3 and Sir4 proteins for gene silencing and Rif1 and Rif2 for telomere length maintenance. Human Rap1 has a similar domain architecture but has a single myb-like domain. 57
29674 212555 cd11656 FBX4_GTPase_like C-terminal GTPase-like domain of F-Box Only Protein 4. F-box proteins are involved in substrate recognition as part of SCF (Skp1-Cul1-Rbx1-F-box protein) ubiquitin ligase complexes. Fbx4 (or Fbxo4) binds to the telomere repeat binding factor 1 (TRF1), whose activity at telomeres is regulated in part by selective ubiquitination and degradation. This ubiquitination of TRF1 is mediated by Fbx4, which binds to the TRFH domain of TRF1, via the C-terminal domain characterized by this model, a module resembling a small GTPase domain that lacks the GTP-binding site. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, which in turn promotes telomere elongation. Fbx4 has also been reported to target cyclin D1 for degradation by the proteasome, a mechanism ensuring the fidelity of DNA replication. More recently, these findings have been disputed. 223
29675 240667 cd11657 TIN2_N N-terminal domain of TRF-interacting nuclear factor 2; shelterin complex protein of telomeres. TIN2 is one of the six proteins of shelterin complex, which acts to protect telomeres from DNA damage repair machinery. TIN2 binds directly to TRF1 and TRF2 and stabilizes TRF2 complex-telomere binding by tethering it to the TRF1 complex. TIN2 binding to TRF2 is primarily via the TRF binding motif (TBM) region and the N-terminus, while the far C-terminal region has lower affinity. The TIN2 TBM, but not the N-terminal region, is involved in TIN2 binding to TRF1. Truncation of the TIN2 N-terminus in mouse results in telomere elongation, suggesting a negative regulatory function of this region. Three shelterin components (TRF1, TRF2, POT1) bind DNA and 3 components (TIN2, RAP1, TPP1) are recruited by these DNA binding factors. TRF1 activity at telomeres is regulated in part by selective ubiquitination and degradation. Ubiquitination of TRF1 is mediated by Fbx4, which binds TRF1 in the TRFH domain, via a small GTPase module. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. F-box proteins act in substrate recognition as part of Skp1-Cul1-Rbx1-F-box (SCF) protein complexes. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, promoting telomere elongation. TIN2 also binds PIP1, which recruits POT1 to telomeres. 188
29676 212556 cd11658 SANT_DMAP1_like SANT/myb-like domain of Human Dna Methyltransferase 1 Associated Protein 1-like. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins. 46
29677 212557 cd11659 SANT_CDC5_II SANT/myb-like DNA-binding domain of Cell Division Cycle 5-Like Protein repeat II. In humans, cell division cycle 5-like protein (CDC5) functions in pre-mRNA splicing in cell cycle control. The DNA-binding, myb-like domain of CDC5 is a member of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of DNA-binding Myb domains and is found in a diverse set of proteins. 53
29678 212558 cd11660 SANT_TRF Telomere repeat binding factor-like DNA-binding domains of the SANT/myb-like family. Human telomere repeat binding factors, TRF1 and TRF2, function as part of the 6 component shelterin complex. TRF2 binds DNA and recruits RAP1 (via binding to the RAP1 protein c-terminal (RCT)) and TIN2 in the protection of telomeres from DNA repair machinery. Metazoan shelterin consists of 3 DNA binding proteins (TRF2, TRF1, and POT1) and 3 recruited proteins that bind to one or more of these DNA-binding proteins (RAP1, TIN2, TPP1). Schizosaccharomyces pombe TAZ1 is an ortholog and binds RAP1. Human TRF1 and TRF2 bind double-stranded DNA. hTRF2 consists of a basic N-terminus, a TRF homology domain, the RAP1 binding motif (RBM), the TIN2 binding motif (TBM) and a myb-like DNA binding domain, SANT, named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. The single myb-like domain of TRF-type proteins is similar to the tandem myb_like domains found in yeast RAP1. 50
29679 212559 cd11661 SANT_MTA3_like Myb-Like Dna-Binding Domain of MTA3 and related proteins. Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis. 46
29680 212560 cd11662 apollo_TRF2_binding TRF2-binding region of apollo and similar proteins. Apollo protein, a DNA repair nuclease, is recruited to telomeres by TRF2 where it is associated with the principal components of the shelterin complex. Apollo is a member of the metallo-beta-lactamase family that is required for telomere integrity during S phase; its 5' exonuclease activity is regulated by binding to TRF2. Apollo and TRF2 also suppress damage to engineered interstitial telomere repeat tracts at the chromosome ends. TRF2, which binds preferentially to positively supercoiled DNA substrates, together with Apollo, negatively regulates the amount of DNA topoisomerases (TOP1, TOP2-alpha, and TOP2-beta) at telomeres since they also act in the same pathway of telomere protection. The shelterin complex protein identified in mammals is principally comprised of 6 factors that act to protect telomeres from DNA damage repair machinery. 3 components (TRF1, TRF2, POT1) bind DNA and 3 components are recruited by these factors (TIN2, RAP1, TPP1). 34
29681 212128 cd11663 GH119_BcIgtZ-like putative catalytic domain of glycoside hydrolase family 119 (GH119). The prokaryotic subgroup is represented by IgtZ, an alpha-amylase from a Bacillus circulans strain. The GH119 family is related to GH57, a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. 363
29682 212129 cd11664 LamB_YcsF_like_2 uncharacterized proteins similar to the Aspergillus nidulans lactam utilization protein LamB. This bacterial subfamily of the LamB/YbgL family, contains many well conserved uncharacterized proteins. Although their molecular function is unknown, those proteins show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA. 238
29683 212130 cd11665 LamB_like Aspergillus nidulans lactam utilization protein LamB and similar proteins. This eukaryotic and bacterial subfamily of the LamB/YbgL family includes Aspergillus nidulans protein LamB. The lamB gene is located at the lam locus of Aspergillus nidulans, which consists of two divergently transcribed genes, lamA and lamB, needed for the utilization of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. Although the exact molecular function of the lamB-encoded protein LamB is unknown, it might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA. 238
29684 212131 cd11666 GH38N_Man2A1 N-terminal catalytic domain of Golgi alpha-mannosidase II and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by Golgi alpha-mannosidase II (GMII, also known as mannosyl-oligosaccharide 1,3-1,6-alpha mannosidase, EC 3.2.1.114, Man2A1), a monomeric, membrane-anchored class II alpha-mannosidase existing in the Golgi apparatus of eukaryotes. GMII plays a key role in the N-glycosylation pathway. It catalyzes the hydrolysis of the terminal of both alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2(GlcNAc, N-acetylglucosamine), which is the committed step of complex N-glycan synthesis. GMII is activated by zinc or cobalt ions and is strongly inhibited by swainsonine. Inhibition of GMII provides a route to block cancer-induced changes in cell surface oligosaccharide structures. GMII has a pH optimum of 5.5-6.0, which is intermediate between those of acidic (lysosomal alpha-mannosidase) and neutral (ER/cytosolic alpha-mannosidase) enzymes. GMII is a retaining glycosyl hydrolase of family GH38 that employs a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. 344
29685 212132 cd11667 GH38N_Man2A2 N-terminal catalytic domain of Golgi alpha-mannosidase IIx, and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by human alpha-mannosidase 2x (MX, also known as mannosyl-oligosaccharide 1,3-1,6-alpha mannosidase, EC 3.2.1.114, Man2A2). MX is enzymatically and functionally very similar to GMII (found in another subfamily) and, as an isoenzyme of GMII, is thought to also function in the N-glycosylation pathway. MX specifically hydrolyzes the same oligosaccharide substrate as does GMII. It specifically removes two mannosyl residues from GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2(GlcNAc, N-acetylglucosamine). 344
29686 212168 cd11669 TTHB210-like Hypothetical protein TTHB210, a sigma(E)-regulated gene product found in Thermus thermophilus, and similar proteins. TTHB210 is an uncharacterized protein found in Thermus thermophilus, and is controlled by the sigma(E)/anti-sigma(E) regulatory system. It is one of the five proteins of the extracytoplasmic function (ECF) sigma factor sigma(E)-regulated gene products whose physiological functions have not been determined. Its crystallographic structure reveals a novel homodecamer although it is a dimer in solution. 115
29687 212561 cd11670 Sp_RAP1_RCT C-terminal domain of S. pombe RAP1 protein. The Schizosaccharomyces pombe RAP1 (repressor activator protein 1) protein C-terminal (RCT) domain structurally resembles the first 3-helix bundle found in yeast and human RAP1 RCT. S. pombe RAP1 (spRap1), like human RAP1, lacks direct DNA-binding activity and is localized to telomeres via Taz1, an ortholog of TRF1 and TRF2. The RAP1 RCT domain interacts with RAP1 binding motif (RBM) of TAZ1. RAP1, identified in budding yeast as repressor/activator protein 1 is a well-conserved telomere binding protein, found in budding yeast, fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation and mating type information gene silencing, as well as binding at numerous sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human RAP1 does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the RAP C-terminal domain (RCT). Yeast RAP1 has 2 myb-type DNA binding modules, a BRCT domain, and a RCT domain that recruits Sir3 and Sir4 for gene silencing and Rif1 and Rif2 for telomere length maintenance. S. pombe RAP1 has a BRCT domain, 2 myb like domains, and the RCT. 52
29688 212562 cd11671 TAZ1_RBM RAP1 binding motif of Schizosaccharomyces pombe TAZ1. S. pombe TAZ1 recruits the spRAP1 protein to telomeres. The TAZ1 RAP1-binding motif (RBM) binds the RAP1 C-terminal domain (RCT), which structurally resembles the first 3-helix bundle found in yeast and human RAP1 RCT. TAZ1, an ortholog of TRF1 and TRF2, has a TRF homology (TRFH) domain, the RBM domain, a dimerization domain, and a myb-like C-terminus. RAP1, identified in budding yeast as repressor/activator protein 1, is a well-conserved telomere binding protein and is also found in fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation and mating type information gene silencing, as well as in binding to numerous binding sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing, and telomere end protection. Like S. pombe RAP1, human RAP1 does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) through the RAP C-terminal domain (RCT). 49
29689 277250 cd11672 ADDz ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger domain (ADDz). The ADDz zinc finger domain is present in the chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2 type transcription factor protein. The Dnmt3 family includes two active DNA methyltransferases, Dnmt3a and -3b, and one regulatory factor Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. 99
29690 212563 cd11673 hemoglobin_linker_C Globular domain of extracellular hemoglobin linker. This family of hemoglobin linker chains is restricted to annelid worms, and participates in the formation of the large erythrocruorin respiratory complex. Via its N-terminal coiled-coil segment (not included in this model), the molecule forms trimers, which are part of a scaffold organizing the overall complex architecture; the latter encompasses 36 linkers and 144 hemoglobins in total. This C-terminal globular domain is involved in trimerization, and also interacts with globins and other C-terminal globular linker domains of neighboring trimers. The structure resembles that of nitrophorins and lipocalins. 120
29691 212564 cd11674 lambda-1 inner capsid protein lambda-1 or VP3. The reovirus inner capsid protein lambda-1 displays nucleoside triphosphate phosphohydrolase (NTPase), RNA-5'-triphosphatase (RTPase), and RNA helicase activity and may play a role in the transcription of the virus genome, the unwinding or reannealing of double-stranded RNA during RNA synthesis. The RTPase activity constitutes the first step in the capping of RNA, resulting in a 5'-diphosphorylated RNA plus-strand. lambda1 is an Orthoreovirus core protein, VP3 is the homologous core protein in Aquareoviruses. 1166
29692 212565 cd11675 SCAB1_middle middle domain of the stomatal closure-related actin binding protein1. SCAB1 is a dimeric actin crosslinker conserved in plants. The three-dimensional structure of this domain resembles that of fibronectin type III repeat units and immunoglobulins. It is situated between a coiled-coil dimerization domain and a C-terminal pleckstrin homology-like module. SCAB1 appears to be required for normal actin dynamics in guard cells stomatal movement. The function of the middle domain is not clear. 85
29693 212487 cd11676 Gemin6 Gemin 6. Gemin 6, together with the survival motor neuron (SMN) protein, other Gemins, and Unr-interacting protein (UNRIP), forms the SMN complex, which plays an important role in the Sm core assembly reaction, by binding directly to the Sm proteins, as well as UsnRNAs. Gemin 6 forms a heterodimer with Gemin 7, which serves as a surrogate for the SmB-SmD3 dimer during the formation of the heptameric Sm ring. 63
29694 212488 cd11677 Gemin7 Gemin 7. Gemin 7, together with the survival motor neuron (SMN) protein, other Gemins, and Unr-interacting protein (UNRIP), forms the SMN complex, which plays an important role in the Sm core assembly reaction, by binding directly to the Sm proteins, as well as UsnRNAs. Gemin 7 forms a heterodimer with Gemin 6, which serves as a surrogate for the SmB-SmD3 dimer during the formation of the heptameric Sm ring. 77
29695 212489 cd11678 archaeal_LSm archaeal Like-Sm protein. The archaeal Sm-like (LSm): The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal LSm proteins are likely to represent the ancestral Sm domain. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 69
29696 212490 cd11679 archaeal_Sm_like archaeal Sm-related protein. Archaeal Sm-related proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain. 65
29697 212543 cd11680 HDAC_Hos1 Class I histone deacetylases Hos1 and related proteins. Saccharomyces cerevisiae Hos1 is responsible for Smc3 deacetylation. Smc3 is an important player during the establishment of sister chromatid cohesion. Hos1 belongs to the class I histone deacetylases (HDACs). HDACs are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. Other class I HDACs are animal HDAC1, HDAC2, HDAC3, HDAC8, fungal RPD3 and HOS2, plant HDA9, protist, archaeal and bacterial (AcuC) deacetylases. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase. 294
29698 212544 cd11681 HDAC_classIIa Histone deacetylases, class IIa. Class IIa histone deacetylases are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) to yield deacetylated histones. This subclass includes animal HDAC4, HDAC5, HDAC7, and HDAC9. Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors; they have an N-terminal regulatory domain with two or three conserved serine residues, and phosphorylation of these residues is important for their ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC9 is involved in regulation of gene expression and dendritic growth in developing cortical neurons. It also plays a role in hematopoiesis. HDAC7 is involved in regulation of myocyte migration and differentiation. HDAC5 is involved in integration of chronic drug (cocaine) addiction and depression with changes in chromatin structure and gene expression. HDAC4 participates in regulation of chondrocyte hypertrophy and skeletogenesis. 377
29699 212545 cd11682 HDAC6-dom1 Histone deacetylase 6, domain 1. Histone deacetylases 6 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. Known interaction partners of HDAC6 are alpha tubulin (substrate) and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD). 337
29700 212546 cd11683 HDAC10 Histone deacetylase 10. Histone deacetylases 10 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC10 has an N-terminal deacetylase domain and a C-terminal pseudo-repeat that shares significant similarity with its catalytic domain. It is located in the nucleus and cytoplasm, and is involved in regulation of melanogenesis. It transcriptionally down-regulates thioredoxin-interacting protein (TXNIP), leading to altered reactive oxygen species (ROS) signaling in human gastric cancer cells. Known interaction partners of HDAC10 are Pax3, KAP1, hsc70 and HDAC3 proteins. 337
29701 212566 cd11684 DHR2_DOCK Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins. DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42. 392
29702 212582 cd11687 PpPFK_gamma Pichia pastoris 6-phosphofructokinase, gamma subunit. Pichia pastoris 6-phosphofructokinase (PpPfk) is the most complex and probably largest (1 MDa) eukaryotic Pfk. It forms a dodecamer of four alpha-beta-gamma trimers. The gamma unit is unique, in contrast to other eukaryotic ATP-dependent 6-phosphofructokinases, and participates in oligomerization of the alpha and beta chains. It is not essential for enzymatic activity, but it modulates the allosteric behavior of the enzyme. 346
29703 212583 cd11688 THUMP THUMP domain, predicted to bind RNA. The THUMP domain is named after THioUridine synthases, RNA Methyltransferases and Pseudo-uridine synthases. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets. 148
29704 212588 cd11689 SidM_DrrA_GEF guanine nucleotide-exchange factor domain of Legionella SidM/DrrA. Effector protein DrrA of Legionella pneumophila, an intracellular pathogen, is a potent guanine nucleotide-exchange factor (GEF) specific for the host Rab1 GTPase. It competes with endogenous exchange factors to recruit and activate Rab1 on plasma membrane-derived organelles, thereby effectively hijacking the host's vesicle trafficking to avoid phagosome-lysosome fusion. 187
29705 212589 cd11690 Tsi2_like Tse2 immunity protein Tsi2 and similar proteins. Tsi2 is an essential protein in Pseudomonas aeruginosa, providing protection from the activity of Tse2, most likely by directly interacting with Tse2. Tse2 is a toxin transported via the type VI secretion system and is targeted towards other bacteria in the environment. 72
29706 212590 cd11691 HRI1_like Tandem repeat domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. The two repeats are sequence dissimilar, and the second (c-terminal) repeat is missing several strands, forming an incomplete barrel. 101
29707 212591 cd11692 HRI1_N_like N-terminal domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This N-terminal repeat is involved in homodimerization and may contain a ligand binding site. 134
29708 212592 cd11693 HRI1_C_like C-terminal domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This C-terminal repeat is missing several strands and forms an incomplete barrel. 90
29709 212567 cd11694 DHR2_DOCK_D Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class D, also called the Zizimin subfamily, includes Dock9, 10 and 11. Class D Docks are specific GEFs for Cdc42. Dock9 plays important roles in spine formation and dendritic growth. Dock10 and Dock11 are preferentially expressed in lymphocytes. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class D DOCKs, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus. 376
29710 212568 cd11695 DHR2_DOCK_C Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class C, also called the Zizimin-related (Zir) subfamily, includes Dock6, 7 and 8. Class C DOCKs have been shown to have GEF activity for both Rac and Cdc42. Dock6 regulates neurite outgrowth. Dock7 plays critical roles in the early stages of axon formation, neuronal polarity, and myelination. Dock8 regulates T and B cell numbers and functions, and plays essential roles in humoral immune responses and the proper formation of B cell immunological synapses. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Class C Docks, which contains the catalytic GEF activity for Rac and Cdc42. 368
29711 212569 cd11696 DHR2_DOCK_B Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. Dock3 is a specific GEF for Rac and it regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class B DOCKs, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 391
29712 212570 cd11697 DHR2_DOCK_A Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. Class A DOCKs are specific GEFs for Rac. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. Dock2 plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. Dock5 functions upstream of Rac1 to regulate osteoclast function. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class A DOCKs, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 400
29713 212571 cd11698 DHR2_DOCK9 Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 9. Dock9, also called Zizimin1, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. It plays important roles in spine formation and dendritic growth. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock9, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus. 415
29714 212572 cd11699 DHR2_DOCK10 Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 10. Dock10, also called Zizimin3, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. Dock10 is preferentially expressed in lymphocytes and may play a role in interleukin-4 induced activation of B cells. It may also play a role in the invasion of tumor cells. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock10, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus. 446
29715 212573 cd11700 DHR2_DOCK11 Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 11. Dock11, also called Zizimin2 or activated Cdc42-associated GEF (ACG), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. Dock11 is predominantly expressed in lymphocytes and is found in high levels in germinal center B lymphocytes after T cell dependent antigen immunization. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock11, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus. 413
29716 212574 cd11701 DHR2_DOCK8 Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 8. Dock8, also called Zizimin-related 3 (Zir3), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac1 and Cdc42 by exchanging bound GDP for free GTP. Dock8 is highly expressed in the immune system and it regulates T and B cell numbers and functions. It plays essential roles in humoral immune responses and the proper formation of B cell immunological synapses. Dock8 deficiency is a primary immune deficiency that results in extreme susceptibility to cutaneous viral infections, elevated IgE levels, and eosinophilia. It was originally described as an autosomal recessive form of hyper IgE syndrome (AR-HIES). DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock8, which contains the catalytic GEF activity for Rac and/or Cdc42. 422
29717 212575 cd11702 DHR2_DOCK6 Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 6. Dock6, also called Zizimin-related 1 (Zir1), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. It is widely expressed and shows highest expression in the dorsal root ganglion and the brain. It regulates neurite outgrowth. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock6, which contains the catalytic GEF activity for Rac and/or Cdc42. 423
29718 212576 cd11703 DHR2_DOCK7 Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 7. Dock7, also called Zizimin-related 2 (Zir2), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac1 and Cdc42 by exchanging bound GDP for free GTP. It plays a critical role in the initial specification of axon formation in hippocampal neurons. It affects neuronal polarity by regulating microtubule dynamics. Dock7 also plays a role in controlling myelination by Schwann cells. It may also play important roles in the function and distribution of dermal and follicular melanocytes. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock7, which contains the catalytic GEF activity for Rac and/or Cdc42. 473
29719 212577 cd11704 DHR2_DOCK3 Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis 3. Dock3, also called modifier of cell adhesion (MOCA), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. Dock3 is a specific GEF for Rac. It regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock3, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 392
29720 212578 cd11705 DHR2_DOCK4 Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis 4. Dock4 is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. It may also regulate spine morphology and synapse formation. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock4, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 391
29721 212579 cd11706 DHR2_DOCK2 Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 2. Dock2 is a hematopoietic cell-specific, class A DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock2, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock2, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 421
29722 212580 cd11707 DHR2_DOCK1 Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 1. Dock1, also called Dock180, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. In the nervous system, it mediates attractive responses to netrin-1 and thus, plays a role in axon outgrowth and pathfinding. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock1, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock1, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 400
29723 212581 cd11708 DHR2_DOCK5 Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 5. Dock5 is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It functions upstream of Rac1 to regulate osteoclast function. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock5, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock5, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. 400
29724 293931 cd11709 SPRY SPRY domain. SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). TRIM/RBCC proteins are involved in a variety of processes, including apoptosis, cell cycle regulation, cell growth, senescence, viral response, meiosis, cell differentiation, and vesicular transport. Genes belonging to this family are implicated in several human diseases that vary from cancer to rare genetic syndromes. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome. 118
29725 212548 cd11710 GINS_A_psf1 Alpha-helical domain of GINS complex protein Psf1. Psf1 is a component of the GINS tetrameric protein complex. Psf1 is mainly expressed in highly proliferative tissues, such as blastocysts, adult bone marrow, and testis, in which the stem cell system is active. Loss of Psf1 causes embryonic lethality. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. 129
29726 212549 cd11711 GINS_A_Sld5 Alpha-helical domain of GINS complex protein Sld5. Sld5 is a component of the GINS tetrameric protein complex, and within the complex Sld5 interacts with Psf1 via its N-terminal A-domain, and with Psf2 through a combination of the A and B domains. Sld5 in Drosophila is required for normal cell cycle progression and the maintenance of genomic integrity. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. 119
29727 212550 cd11712 GINS_A_psf2 Alpha-helical domain of GINS complex protein Psf2 (partner of Sld5 2). Psf2 is a component of the GINS tetrameric protein complex and has been found to play important roles in normal eye development in Xenopus laevis. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. 119
29728 212551 cd11713 GINS_A_psf3 Alpha-helical domain of GINS complex protein Psf3 (partner of Sld5 3). Psf3 is a component of GINS, a tetrameric protein complex. Psf3 expression is upregulated in malignant colon cancer and it might be involved in cancer cell proliferation. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. 109
29729 212552 cd11714 GINS_A_archaea Alpha-helical domain of archaeal GINS complex proteins. The GINS complex is involved in replication of archaeal and eukaryotic genomes. The archaeal DNA replication system is a simplified version of that of the eukaryotes. Like its eukaryotic counterpart, the archaeal GINS complex is tetrameric, but instead of four different subunits (Sld5, Psf1, Psf2 and Psf3) it consists of two different proteins named Gins51 and Gins23. All GINS subunits are homologs and they can be classified into two groups. One group (the eukaryotic Sld5 and Psf1, as well as the archaeal Gins51) has the alpha-helical (A) domain at the N-terminus and the beta-strand domain (B) at the C-terminus (this arrangement is called AB-type). The arrangement of the A and B domains is reversed in the second group (eukaryotic Psf2 and Psf3 and archaeal Gins23, also referred to as BA-type). The overall fold of each archaeal subunit and the overall tetrameric assembly of GINS are similar, but the relative locations of the C-terminal small domains are different with respect to the alpha helical domain characterized by this model, resulting in different subunit contacts in the archaeal GINS complex. Some archaea may have a homotetrameric GINS complex (4 copies of an AB-type module). 105
29730 212584 cd11715 THUMP_AdoMetMT THUMP domain associated with S-adenosylmethionine-dependent methyltransferases. Proteins of this family contain an N-terminal THUMP domain and a C-terminal S-adenosylmethionine-dependent methyltransferase domain. Members have been implicated in the modification of 23S RNA m2G2445, a highly conserved modification in bacteria and in the m2G6 modification of tRNA. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets. 152
29731 212585 cd11716 THUMP_ThiI THUMP domain of thiamine biosynthesis protein ThiI. ThiI is an enzyme responsible for the formation of the modified base S(4)U (4-thiouridine) found at position 8 in some prokaryotic tRNAs. This modification acts as a signal for UV exposure, triggering a response that provides protection against its damaging effects. ThiI consists of an N-terminal THUMP domain, followed by an NFLD domain, and a C-terminal PP-loop pyrophosphatase domain. The N-terminal THUMP domain has been implicated in the recognition of the acceptor-stem region. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets. 166
29732 212586 cd11717 THUMP_THUMPD1_like THUMP domain-containing protein 1-like. This family contains THUMP domain-only proteins including THUMP domain-containing protein 1 and Saccharomyces cerevisiae Tan1. Tan1 is non essential and has been shown to be required for the formation of the modified nucleoside N(4)-acetylcytidine (ac(4)C) in tRNA. To date, there is no functional information available about THUMPD1. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets. 158
29733 212587 cd11718 THUMP_SPOUT THUMP domain associated with SPOUT RNA Methylases. Members of this archaeal protein family are characterized by containing an N-terminal THUMP domain and a C-terminal SPOUT RNA methyltransferase domain. No functional information is available. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets. 145
29734 212593 cd11719 FANC Fanconi anemia ID complex proteins FANCI and FANCD2. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome. 977
29735 212594 cd11720 FANCI Fanconi anemia I protein. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome. The phosphorylation of FANCI may function as a molecular switch to turn on the FA pathway. 1202
29736 212595 cd11721 FANCD2 Fanconi anemia D2 protein. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome. The phosphorylation of FANCD2 is required for DNA damage-induced intra-S phase checkpoint and for cellular resistance to DNA crosslinking agents. 1161
29737 212596 cd11722 SOAR STIM1 Orai1-activating region. STIM1 (stromal interaction molecule 1) is a metazoan transmembrane protein located in the endoplasmic reticulum (ER) membrane, which functions as a sensor for ER calcium ion levels and activates store-operated Ca2+ influx channels (SOCs), such as the Orai1 Ca2+ channel located in the plasma membrane. STIM1 has an N-terminal Ca-binding EF-hand domain, which is located in the ER lumen. Responding to the release of Ca2+ from the ER, STIM1 was found to aggregate near the plasma membrane and contact Orai1. This model describes a region near the C-terminus of STIM1, which has been shown to mediate the interaction with Orai1 and has been labeled SOAR (STIM1 Orai1-activating region). STIM1 has also been linked to sensing oxidative and temperature-variation stress and may play a rather general role in mediating calcium signaling in response to stress. Dimerization of STIM1 via the SOAR domain appears required for the activation of the Orai1 calcium channel. A model for STIM1 activation has been proposed, in which an inhibitory helix N-terminal to the SOAR domain prevents STIM1 clustering or aggregation, and in which conformational changes triggered by depletion of the calcium stores allow the clustering and activation of Orai1. 92
29738 381177 cd11723 YabN_N_like N-terminal S-AdoMet-dependent tetrapyrrole methylase domain of Bacillus subtilis YabN and similar domains. This family includes the S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent tetrapyrrole methylase (TP-methylase) domain of Bacillus subtilis YabN, and similar domains. YabN is a fusion of an N-terminal TP-methylase and a C-terminal MazG-type nucleotide pyrophosphohydrolase domain. MazG-like NTP-PPases have been implicated in house-cleaning functions such as degrading abnormal (d)NTPs. TP-methylases use S-AdoMet in the methylation of diverse substrates. Most TP-methylase family members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis; other members, like diphthine synthase and ribosomal RNA small subunit methyltransferase I (RsmI), act on other substrates. The specific function of YabN's TP-methylase domain is not known. 218
29739 381178 cd11724 TP_methylase uncharacterized family of the tetrapyrrole methylase superfamily. Members of this superfamily use S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and Ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA. Other superfamily members not involved in cobalamin biosynthesis include the N-terminal tetrapyrrole methylase domain of Bacillus subtilis YabN whose specific function is unknown, and Omphalotus olearius omphalotin methyltransferase which catalyzes the automethylation of its own C-terminus; this C terminus is subsequently released and macrocyclized to give Omphalotin A, a potent nematicide. 243
29740 277251 cd11725 ADDz_Dnmt3 ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3 (Dnmt3). Dnmt3 is a de novo DNA methyltransferase family that includes two active enzymes Dnmt3a and -3b and one regulatory factor Dnmt3l. The ADDz domain of Dnmt3 is located in the C-terminal region of Dnmt3, which is an active catalytic domain in Dnmt3a and -b, but lacks some residues for enzymatic activity in Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. 108
29741 277252 cd11726 ADDz_ATRX ADDz domain found in ATRX (alpha-thalassemia/mental retardation, X-linked). ADDz_ATRX is a PHD-like zinc finger domain of ATRX, which belongs to the SNF2 family of chromatin remodeling proteins. ATRX is a large chromatin-associated nuclear protein with two domains, ADDz_ATRX at the N-terminus, followed by a C-terminal ATPase/helicase domain. The ADDz_ATRX domain recognizes a specific methylated histone, and this interaction is required for heterochromatin localization of the ATRX protein. Missense mutations in either of the two ATRX domains lead to the X-linked alpha-thalassemia and mental retardation syndrome; however the mutations in the ADDz_ATRX domain produce a more severe disease phenotype that may also relate to disturbing unknown functions or interaction sites of this domain. The ADDz domain is also present in chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3); it is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. 102
29742 277253 cd11727 ADDz_Dnmt3l ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3 like (Dnmt3l). Dnmt3l is a regulator of DNA methylation, which acts by recognizing unmethylated histone H3 tails and interacting with Dnmt3a to stimulate its de novo DNA methylation activity. The ADDz_Dnmt3l domain is located in the C-terminal region of Dnmt3l that otherwise lacks some residues required for DNA methyltransferase activity. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. Dnmt3l also associates with HDAC1 and acts as a transcriptional repressor. The ADDz_Dnmt3l domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. 123
29743 277254 cd11728 ADDz_Dnmt3b ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3b (Dnmt3b). ADDz_Dnmt3b is an active catalytic domain of Dnmt3b. Dnmt3b is a member of the Dnmt3 family and is a de novo DNA methyltransferase that has an N-terminal variable region followed by a conserved PWWP region and the cysteine-rich ADDz domain. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The methyltransferase activity of Dnmt3a is not only responsible for the establishment of DNA methylation patterns, but is also essential for the inheritance of these patterns during mitosis. Dnmt3b is ubiquitously expressed in most adult tissues. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. A knockout of Dnmt3b has been shown to be lethal in the mouse model. 120
29744 277255 cd11729 ADDz_Dnmt3a ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3a (Dnmt3a). Dnmt3a is a member of the Dnmt3 family and is a protein with de novo DNA methyltransferase activity. Dnmt3 family members are Dnmt3a, Dnmt3b, and Dnmt3l, the non-enzymatic regulatory factor. Dnmt3a is recruited by Dnmt3l to unmethylated histone H3 and methylates the target. Dnmt3a has a variable region at the N-terminus, followed by a conserved PWWP region and the cysteine-rich ADDz domain. ADDz_Dnmt3a is an active catalytic domain of Dnmt3a. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The methyltransferase activity of Dnmt3a is not only responsible for the establishment of DNA methylation patterns, but is also essential for the inheritance of these patterns during mitosis. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. A knockout of Dnmt3a has been shown to be lethal in the mouse model. 128
29745 212496 cd11730 Tthb094_like_SDR_c Tthb094 and related proteins, classical (c) SDRs. Tthb094 from Thermus thermophilus is a classical SDR which binds NADP. Members of this subgroup contain the YXXXK active site characteristic of SDRs. Also, an upstream Asn residue of the canonical catalytic tetrad is partially conserved in this subgroup of proteins of undetermined function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 206
29746 212497 cd11731 Lin1944_like_SDR_c Lin1944 and related proteins, classical (c) SDRs. Lin1944 protein from Listeria innocua is a classical SDR; it contains a glycine-rich motif similar to the canonical motif of the SDR NAD(P)-binding site. However, the typical SDR active site residues are absent in this subgroup of proteins of undetermined function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. 198
29747 212682 cd11732 HSP105-110_like_NBD Nucleotide-binding domain of 105/110 kDa heat shock proteins including HSPA4, HYOU1, and similar proteins. This subfamily includes the human proteins HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1), HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28), and HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3), HYOU1 (also known as human hypoxia up-regulated 1, GRP170; HSP12A; ORP150; GRP-170; ORP-150; the human HYOU1 gene maps to 11q23.1-q23.3), Saccharomyces cerevisiae Sse1p, Sse2p, and Lhs1p, and a sea urchin sperm receptor. It belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family, and includes proteins believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is also regulated by J-domain proteins. 377
29748 212683 cd11733 HSPA9-like_NBD Nucleotide-binding domain of human HSPA9, Escherichia coli DnaK, and similar proteins. This subgroup includes human mitochondrial HSPA9 (also known as 70-kDa heat shock protein 9, CSA; MOT; MOT2; GRP75; PBP74; GRP-75; HSPA9B; MTHSP75; the gene encoding HSPA9 maps to 5q31.1), Escherichia coli DnaK, and Saccharomyces cerevisiae Stress-Seventy subfamily C/Ssc1p (also called mtHSP70, Endonuclease SceI 75 kDa subunit). It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs); for Escherichia coli DnaK, these are DnaJ and GrpE, respectively. HSPA9 is involved in multiple processes including mitochondrial import, antigen processing, control of cellular proliferation and differentiation, and regulation of glucose responses. During glucose deprivation-induced cellular stress, HSPA9 plays an important role in the suppression of apoptosis by inhibiting a conformational change in Bax that allows the release of cytochrome c. DnaK modulates the heat shock response in Escherichia coli. It protects E. coli from protein carbonylation, an irreversible oxidative modification that increases during organism aging and bacterial growth arrest. Under severe thermal stress, it functions as part of a bi-chaperone system: the DnaK system and the ring-forming AAA+ chaperone ClpB (Hsp104) system, to promote cell survival. 
DnaK has also been shown to cooperate with GroEL and the ribosome-associated Escherichia coli Trigger Factor in the proper folding of cytosolic proteins. S. cerevisiae Ssc1p is the major HSP70 chaperone of the mitochondrial matrix, promoting translocation of proteins from the cytosol, across the inner membrane, to the matrix, and their subsequent folding. Ssc1p interacts with Tim44, a peripheral inner membrane protein associated with the TIM23 protein translocase. It is also a subunit of the endoSceI site-specific endoDNase and is required for full endoSceI activity. Ssc1p plays roles in the import of Yfh1p, a nucleus-encoded mitochondrial protein involved in iron homeostasis (and a homolog of human frataxin, implicated in the neurodegenerative disease, Friedreich's ataxia). Ssc1 also participates in translational regulation of cytochrome c oxidase (COX) biogenesis by interacting with Mss51 and Mss51-containing complexes. 377
29749 212684 cd11734 Ssq1_like_NBD Nucleotide-binding domain of Saccharomyces cerevisiae Ssq1 and similar proteins. Ssq1p (also called Stress-seventy subfamily Q protein 1, Ssc2p, Ssh1p, mtHSP70 homolog) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). S. cerevisiae Ssq1p is a mitochondrial chaperone that is involved in iron-sulfur (Fe/S) center biogenesis. Ssq1p plays a role in the maturation of Yfh1p, a nucleus-encoded mitochondrial protein involved in iron homeostasis (and a homolog of human frataxin, implicated in the neurodegenerative disease, Friedreich's ataxia). 373
29750 212685 cd11735 HSPA12A_like_NBD Nucleotide-binding domain of HSPA12A and similar proteins. HSPA12A (also known as 70-kDa heat shock protein-12A) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12A. The gene encoding HSPA12A maps to 10q26.12, a cytogenetic region that might represent a common susceptibility locus for both schizophrenia and bipolar affective disorder; reduced expression of HSPA12A has been shown in the prefrontal cortex of subjects with schizophrenia. HSPA12A is also a candidate gene for forelimb-girdle muscular anomaly, an autosomal recessive disorder of Japanese black cattle. HSPA12A is predominantly expressed in neuronal cells. It may play a role in the atherosclerotic process. 467
29751 212686 cd11736 HSPA12B_like_NBD Nucleotide-binding domain of HSPA12B and similar proteins. Human HSPA12B (also known as 70-kDa heat shock protein-12B, chromosome 20 open reading frame 60/C20orf60, dJ1009E24.2; the gene encoding HSPA12B maps to 20p13) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12B. HSPA12B is predominantly expressed in endothelial cells, is required for angiogenesis, and may interact with known angiogenesis mediators. HSPA12B may be important for host defense in microglia-mediated immune response. HSPA12B expression is up-regulated in lipopolysaccharide (LPS)-induced inflammatory response in the spinal cord, and mostly located in active microglia; this induced expression may be regulated by activation of MAPK-p38, ERK1/2 and SAPK/JNK signaling pathways. Overexpression of HSPA12B also protects against LPS-induced cardiac dysfunction and involves the preserved activation of the PI3K/Akt signaling pathway. 468
29752 212687 cd11737 HSPA4_NBD Nucleotide-binding domain of HSPA4. Human HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1) responds to acidic pH stress, is involved in the radioadaptive response, is required for normal spermatogenesis and is overexpressed in hepatocellular carcinoma. It participates in a pathway along with NBS1 (Nijmegen breakage syndrome 1, also known as p85 or nibrin), heat shock transcription factor 4b (HDF4b), and HSPA14 (belonging to a different HSP70 subfamily) that induces tumor migration, invasion, and transformation. HSPA4 expression in sperm was increased in men with oligozoospermia, especially in those with varicocele. HSPA4 belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins. 383
29753 212688 cd11738 HSPA4L_NBD Nucleotide-binding domain of HSPA4L. Human HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28) is expressed ubiquitously and predominantly in the testis. It is required for normal spermatogenesis and plays a role in osmotolerance. HSPA4L belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins. 383
29754 212689 cd11739 HSPH1_NBD Nucleotide-binding domain of HSPH1. Human HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3) suppresses the aggregation of denatured proteins caused by heat shock in vitro, and may substitute for HSP70 family proteins to suppress the aggregation of denatured proteins in cells under severe stress. It reduces the protein aggregation and cytotoxicity associated with Polyglutamine (PolyQ) diseases, including Huntington's disease, which are a group of inherited neurodegenerative disorders sharing the characteristic feature of having insoluble protein aggregates in neurons. The expression of HSPH1 is elevated in various malignant tumors, including malignant melanoma, and there is a direct correlation between HSPH1 expression and B-cell non-Hodgkin lymphomas (B-NHLs) aggressiveness and proliferation. HSPH1 belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins. 383
29755 213038 cd11740 YajQ_like Proteins similar to Escherichia coli YajQ. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains. 159
29756 240666 cd11741 TIN2_TBM TRF-binding motif region of TRF-Interacting Nuclear factor 2. The C-terminal region of TIN2 contains the TRF-binding motif (TBM), while the TIN2 N-terminal region acts in the modulation of TRF1 activity via the inhibition of tankyrase 1. TIN2 binding to TRF2 is primarily via the TRF binding motif (TBM) and the N-terminus, while the far C-terminal region interacts with lower affinity. The TIN2 TBM, but not the N-terminal region, is involved in TIN2 binding to TRF1. Truncation of the TIN2 N-terminus in mouse results in telomere elongation, suggesting a negative regulatory function of this region. TIN2 is a shelterin complex protein identified in mammals, one of 6 factors that act to protect telomeres from DNA damage repair machinery. Three shelterin components (TRF1, TRF2, POT1) bind DNA and 3 components (TIN2, RAP1, TPP1) are recruited by these DNA binding factors. TIN2 binds directly to TRF1 and TRF2 and stabilizes TRF2 complex-telomere binding by tethering it to the TRF1 complex. TRF1 activity at telomeres is regulated in part by selective ubiquitination and degradation. Ubiquitination of TRF1 is mediated by Fbx4, which binds TRF1 in the TRFH domain, via a small GTPase module. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. F-box proteins act in substrate recognition as part of SCF complexes (SCF: Skp1-Cul1-Rbx1-F-box protein). Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering it susceptible to ubiquitination and degradation, promoting telomere elongation. TIN2 also binds TPP1, which recruits POT1 to telomeres. 108
29757 213039 cd11743 Cthe_2751_like Uncharacterized protein domain similar to Clostridium thermocellum 2751. Cthe_2751 has been found to form homodimers. Based on structural similarity to other families, a role in processing nucleic acids was suggested, though interactions with DNA could not be demonstrated. 122
29758 213354 cd11744 MIT_CorA-like metal ion transporter CorA-like divalent cation transporter superfamily. This superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. They are found in most bacteria and archaea, and in some eukaryotes. It is a functionally diverse group which includes the Mg2+ transporters of Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+ and Ni2+), the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and the Zn2+ transporter Salmonella typhimurium ZntB, which mediates the efflux of Zn2+ (and Cd2+). It includes five Saccharomyces cerevisiae members: i) two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, ii) two mitochondrial inner membrane Mg2+ transporters: Mfm1p/Lpe10p, and Mrs2p, and iii) the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. It also includes a family of Arabidopsis thaliana members (AtMGTs), some of which are localized to distinct tissues, and not all of which can transport Mg2+. Thermotoga maritima CorA and Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers; the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1; mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, Mrs2p, and Alr1p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 286
29759 213372 cd11745 Yos9_DD C-terminal dimerization domain (DD) of Saccharomyces cerevisiae Yos9 and related proteins. Yos9 participates in the ER-associated protein degradation pathway that targets misfolded proteins for proteolysis. Yos9 is a component of the HMG-CoA reductase degradation (HRD) ubiquitin-ligase complex, specifically part of the luminal submodule of the ligase. Yos9 scans proteins for specific oligosaccharide modifications, which are critical determinants of the degradation signal. It has been shown to be involved in the degradation of glycosylated proteins and various nonglycosylated proteins. Yos9 functions as a homodimer where this domain is responsible for the self-association; it has an alphabeta-roll domain architecture, and is found at the C-terminus of the protein. The N-terminal portion of Yos9 which includes an MRH domain is required for binding to Hrd3p, another component of the HRD complex. The DD domain does not appear to bind Hrd3p directly. 124
29760 213062 cd11746 GH94N_like N-terminal domain of glycoside hydrolase family 94 and related domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. This GH94N domain also occurs in tandem repeat arrangements (not at the N-terminus) in cyclic beta 1-2 glucan synthetase and related proteins, and as a standalone domain in distantly related proteins of unknown function. 179
29761 213063 cd11747 GH94N_like_1 Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase and many other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain. This GH94N domain also occurs as a standalone domain in distantly related proteins of unknown function, as represented by this model, which also includes N-terminal GH94N-like domains of bacterial rhamnosidases and as found at the C-terminus of polygalacturonases. 204
29762 213064 cd11748 GH94N_NdvB_like Glycoside hydrolase family 94 N-terminal-like domain of NdvB-like proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found at the N-terminus of largely uncharacterized proteins; some members from Xanthomonas campestris and related organisms are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants. 294
29763 213065 cd11749 GH94N_LBP_like N-terminal-like domain of Paenibacillus sp. YM-1 Laminaribiose Phosphorylase and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes bacterial laminaribiose phosphorylase. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Bacterial laminaribiose phosphorylase phosphorolyzes laminaribiose into alpha-glucose 1-phosphate and glucose, but does not phosphorolyze other glucobioses; it only slightly phosphorolyzes laminaritriose and higher laminarioligosaccharides. The GH94N domain, as represented by this model, is also found at the N-terminus of GH94 members with uncharacterized specificities. 229
29764 213066 cd11750 GH94N_like_3 Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found at the N-terminus of GH94 members with uncharacterized specificities. 282
29765 213067 cd11751 GH94N_like_4 Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found near the N-terminus of GH94 members and related proteins with uncharacterized specificities. 223
29766 213068 cd11752 GH94N_CDP_like N-terminal domain of cellodextrin phosphorylase (CDP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellodextrin phosphorylase (EC:2.4.1.49), also known as 1,4-beta-D-oligo-D-glucan:phosphate alpha-D-glucosyltransferase or CepB. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Cellodextrin phosphorylase catalyzes the reversible and phosphate dependent removal of a single alpha-D-glucose-1-phosphate unit from a (1,4-beta-D-glucosyl) oligomer. 214
29767 213069 cd11753 GH94N_ChvB_NdvB_2_like Second GH94N domain of cyclic beta 1-2 glucan synthetase and similar domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cyclic beta 1-2 glucan synthetase (EC:2.4.1.20) or ChvB (encoded by the chromosomal chvB virulence gene). This second of two tandemly repeated GH94-N-terminal-like domains has not been characterized functionally. Some beta 1-2 glucan synthetases are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants. 336
29768 213070 cd11754 GH94N_CBP_like N-terminal domain of cellobiose phosphorylase (CBP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20) or cellobiose:phosphate alpha-D-glucosyltransferase, or CepA. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Cellobiose phosphorylase participates in the degradation of cellulose; it catalyzes the reversible, phosphate-dependent cleavage of cellobiose into alpha-D-glucose-1-phosphate and D-glucose. 303
29769 213071 cd11755 GH94N_ChBP_like N-terminal domain of chitobiose phosphorylase (ChBP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes chitobiose phosphorylase (EC:2.4.1.-). This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Chitobiose phosphorylase catalyzes the reversible, phosphate-dependent cleavage of chitobiose [(GlcNAc)2] into alpha-GlcNAc-1-phosphate and GlcNAc. In some organisms, ChBP may be involved in the production of GlcNAc-6-phosphate in intracellular pathways. 300
29770 213072 cd11756 GH94N_ChvB_NdvB_1_like First GH94N domain of cyclic beta 1-2 glucan synthetase and similar domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cyclic beta 1-2 glucan synthetase (EC:2.4.1.20) or ChvB (encoded by the chromosomal chvB virulence gene). This first of two tandemly repeated GH94-N-terminal-like domains has not been characterized functionally. Some beta 1-2 glucan synthetases are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants. 284
29771 212691 cd11757 SH3_SH3BP4 Src Homology 3 domain of SH3 domain-binding protein 4. SH3 domain-binding protein 4 (SH3BP4) is also called transferrin receptor trafficking protein (TTP). SH3BP4 is an endocytic accessory protein that interacts with endocytic proteins including clathrin and dynamin, and regulates the internalization of the transferrin receptor (TfR). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29772 212692 cd11758 SH3_CRK_N N-terminal Src Homology 3 domain of Ct10 Regulator of Kinase adaptor proteins. CRK adaptor proteins consist of SH2 and SH3 domains, which bind tyrosine-phosphorylated peptides and proline-rich motifs, respectively. They function downstream of protein tyrosine kinases in many signaling pathways started by various extracellular signals, including growth and differentiation factors. Cellular CRK (c-CRK) contains a single SH2 domain, followed by N-terminal and C-terminal SH3 domains. It is involved in the regulation of many cellular processes including cell growth, motility, adhesion, and apoptosis. CRK has been implicated in the malignancy of various human cancers. The N-terminal SH3 domain of CRK binds a number of target proteins including DOCK180, C3G, SOS, and cABL. The CRK family includes two alternatively spliced protein forms, CRKI and CRKII, that are expressed by the CRK gene, and the CRK-like (CRKL) protein, which is expressed by a distinct gene (CRKL). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29773 212693 cd11759 SH3_CRK_C C-terminal Src Homology 3 domain of Ct10 Regulator of Kinase adaptor proteins. CRK adaptor proteins consist of SH2 and SH3 domains, which bind tyrosine-phosphorylated peptides and proline-rich motifs, respectively. They function downstream of protein tyrosine kinases in many signaling pathways started by various extracellular signals, including growth and differentiation factors. Cellular CRK (c-CRK) contains a single SH2 domain, followed by N-terminal and C-terminal SH3 domains. It is involved in the regulation of many cellular processes including cell growth, motility, adhesion, and apoptosis. CRK has been implicated in the malignancy of various human cancers. The C-terminal SH3 domain of CRK has not been shown to bind any target protein; it acts as a negative regulator of CRK function by stabilizing a structure that inhibits the access by target proteins to the N-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components, and mediating the formation of multiprotein complex assemblies. 57
29774 212694 cd11760 SH3_MIA_like Src Homology 3 domain of Melanoma Inhibitory Activity protein and similar proteins. MIA is a single domain protein that adopts a SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. MIA is a member of the recently identified family that also includes MIA-like (MIAL), MIA2, and MIA3 (also called TANGO); the biological functions of this family are not yet fully understood. 76
29775 212695 cd11761 SH3_FCHSD_1 First Src Homology 3 domain of FCH and double SH3 domains proteins. This group is composed of FCH and double SH3 domains protein 1 (FCHSD1) and FCHSD2. These proteins have a common domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. They have only been characterized in silico and their functions remain unknown. This group also includes the insect protein, nervous wreck, which acts as a regulator of synaptic growth signaling. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29776 212696 cd11762 SH3_FCHSD_2 Second Src Homology 3 domain of FCH and double SH3 domains proteins. This group is composed of FCH and double SH3 domains protein 1 (FCHSD1) and FCHSD2. These proteins have a common domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. They have only been characterized in silico and their functions remain unknown. This group also includes the insect protein, nervous wreck, which acts as a regulator of synaptic growth signaling. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29777 212697 cd11763 SH3_SNX9_like Src Homology 3 domain of Sorting Nexin 9 and similar proteins. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. This subfamily consists of SH3 domain containing SNXs including SNX9, SNX18, SNX33, and similar proteins. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29778 212698 cd11764 SH3_Eps8 Src Homology 3 domain of Epidermal growth factor receptor kinase substrate 8 and similar proteins. This group is composed of Eps8 and Eps8-like proteins including Eps8-like 1-3, among others. These proteins contain N-terminal Phosphotyrosine-binding (PTB), central SH3, and C-terminal effector domains. Eps8 binds either Abi1 (also called E3b1) or Rab5 GTPase activating protein RN-tre through its SH3 domain. With Abi1 and Sos1, it becomes part of a trimeric complex that is required to activate Rac. Together with RN-tre, it inhibits the internalization of EGFR. The SH3 domains of Eps8 and similar proteins recognize peptides containing a PxxDY motif, instead of the classical PxxP motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29779 212699 cd11765 SH3_Nck_1 First Src Homology 3 domain of Nck adaptor proteins. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The first SH3 domain of Nck proteins preferentially binds the PxxDY sequence, which is present in the CD3e cytoplasmic tail. This binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29780 212700 cd11766 SH3_Nck_2 Second Src Homology 3 domain of Nck adaptor proteins. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29781 212701 cd11767 SH3_Nck_3 Third Src Homology 3 domain of Nck adaptor proteins. This group contains the third SH3 domain of Nck, the first SH3 domain of Caenorhabditis elegans Ced-2 (Cell death abnormality protein 2), and similar domains. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. Ced-2 is a cell corpse engulfment protein that interacts with Ced-5 in a pathway that regulates the activation of Ced-10, a Rac small GTPase. 56
29782 212702 cd11768 SH3_Tec_like Src Homology 3 domain of Tec-like Protein Tyrosine Kinases. The Tec (Tyrosine kinase expressed in hepatocellular carcinoma) subfamily is composed of Tec, Btk, Bmx (Etk), Itk (Tsk, Emt), Rlk (Txk), and similar proteins. They are cytoplasmic (or nonreceptor) tyr kinases containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Most Tec subfamily members (except Rlk) also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, some members contain the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. Tec kinases are expressed mainly by haematopoietic cells, although Tec and Bmx are also found in endothelial cells. B-cells express Btk and Tec, while T-cells express Itk, Txk, and Tec. Collectively, Tec kinases are expressed in a variety of myeloid cells such as mast cells, platelets, macrophages, and dendritic cells. Each Tec kinase shows a distinct cell-type pattern of expression. The functions of Tec kinases in lymphoid cells have been studied extensively. They play important roles in the development, differentiation, maturation, regulation, survival, and function of B-cells and T-cells. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29783 212703 cd11769 SH3_CSK Src Homology 3 domain of C-terminal Src kinase. CSK is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. They negatively regulate the activity of Src kinases that are anchored to the plasma membrane. To inhibit Src kinases, CSK is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. CSK catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. It is expressed in a wide variety of tissues and plays a role, as a regulator of Src, in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. In addition, CSK also shows Src-independent functions. It is a critical component in G-protein signaling, and plays a role in cytoskeletal reorganization and cell migration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29784 212704 cd11770 SH3_Nephrocystin Src Homology 3 domain of Nephrocystin (or Nephrocystin-1). Nephrocystin contains an SH3 domain involved in signaling pathways that regulate cell adhesion and cytoskeletal organization. It is a protein that in humans is associated with juvenile nephronophthisis, an inherited kidney disease characterized by renal fibrosis that leads to chronic renal failure in children. It is localized in cell-cell junctions in renal duct cells, and is known to interact with Ack1, an activated Cdc42-associated kinase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29785 212705 cd11771 SH3_Pex13p_fungal Src Homology 3 domain of fungal peroxisomal membrane protein Pex13p. Pex13p, located in the peroxisomal membrane, contains two transmembrane regions and a C-terminal SH3 domain. It binds to the peroxisomal targeting type I (PTS1) receptor Pex5p and the docking factor Pex14p through its SH3 domain. It is essential for both PTS1 and PTS2 protein import pathways into the peroxisomal matrix. Pex13p binds Pex14p, which contains a PxxP motif, in a classical fashion to the proline-rich ligand binding site of its SH3 domain. It binds the WxxxF/Y motif of Pex5p in a novel site that does not compete with Pex14p binding. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 60
29786 212706 cd11772 SH3_OSTF1 Src Homology 3 domain of metazoan osteoclast stimulating factor 1. OSTF1, also named OSF or SH3P2, is a signaling protein containing SH3 and ankyrin-repeat domains. It acts through a Src-related pathway to enhance the formation of osteoclasts and bone resorption. It also acts as a negative regulator of cell motility. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29787 212707 cd11773 SH3_Sla1p_1 First Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29788 212708 cd11774 SH3_Sla1p_2 Second Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29789 212709 cd11775 SH3_Sla1p_3 Third Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. The third SH3 domain of Sla1p can bind ubiquitin while retaining the ability to bind proline-rich ligands; monoubiquitination of target proteins signals internalization and sorting through the endocytic pathway. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29790 212710 cd11776 SH3_PI3K_p85 Src Homology 3 domain of the p85 regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 72
29791 212711 cd11777 SH3_CIP4_Bzz1_like Src Homology 3 domain of Cdc42-Interacting Protein 4, Bzz1 and similar domains. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4) and similar proteins such as Formin Binding Protein 17 (FBP17) and FormiN Binding Protein 1-Like (FNBP1L), as well as yeast Bzz1 (or Bzz1p). CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. Bzz1 is also a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Members of this subfamily contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain as well as at least one C-terminal SH3 domain. Bzz1 contains a second SH3 domain at the C-terminus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29792 212712 cd11778 SH3_Bzz1_2 Second Src Homology 3 domain of Bzz1 and similar domains. Bzz1 (or Bzz1p) is a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Together with other proteins, it induces membrane scission in yeast. Bzz1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. This model represents the second C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29793 212713 cd11779 SH3_Irsp53_BAIAP2L Src Homology 3 domain of Insulin Receptor tyrosine kinase Substrate p53, Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2 (BAIAP2)-Like proteins, and similar proteins. Proteins in this family include IRSp53, BAIAP2L1, BAIAP2L2, and similar proteins. They all contain an Inverse-Bin/Amphiphysin/Rvs (I-BAR) or IMD domain in addition to the SH3 domain. IRSp53, also known as BAIAP2, is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. BAIAP2L1, also called IRTKS (Insulin Receptor Tyrosine Kinase Substrate), serves as a substrate for the insulin receptor and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. IRSp53 and IRTKS also mediate the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. BAIAP2L2 co-localizes with clathrin plaques but its function has not been determined. The SH3 domains of IRSp53 and IRTKS have been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29794 212714 cd11780 SH3_Sorbs_3 Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the third SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29795 212715 cd11781 SH3_Sorbs_1 First Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the first SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29796 212716 cd11782 SH3_Sorbs_2 Second Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the second SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29797 212717 cd11783 SH3_SH3RF_3 Third Src Homology 3 domain of SH3 domain containing ring finger 1 (SH3RF1), SH3RF3, and similar domains. SH3RF1 (or POSH) and SH3RF3 (or POSH2) are scaffold proteins that function as E3 ubiquitin-protein ligases. They contain an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle of SH3RF1 and SH3RF3, and similar domains. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29798 212718 cd11784 SH3_SH3RF2_3 Third Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29799 212719 cd11785 SH3_SH3RF_C C-terminal (Fourth) Src Homology 3 domain of SH3 domain containing ring finger 1 (SH3RF1), SH3RF3, and similar domains. SH3RF1 (or POSH) and SH3RF3 (or POSH2) are scaffold proteins that function as E3 ubiquitin-protein ligases. They contain an N-terminal RING finger domain and four SH3 domains. This model represents the fourth SH3 domain, located at the C-terminus of SH3RF1 and SH3RF3, and similar domains. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29800 212720 cd11786 SH3_SH3RF_1 First Src Homology 3 domain of SH3 domain containing ring finger proteins. This model represents the first SH3 domain of SH3RF1 (or POSH), SH3RF2 (or POSHER), SH3RF3 (POSH2), and similar domains. Members of this family are scaffold proteins that function as E3 ubiquitin-protein ligases. They all contain an N-terminal RING finger domain and multiple SH3 domains; SH3RF1 and SH3RF3 have four SH3 domains while SH3RF2 has three. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3RF2 acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29801 212721 cd11787 SH3_SH3RF_2 Second Src Homology 3 domain of SH3 domain containing ring finger proteins. This model represents the second SH3 domain of SH3RF1 (or POSH), SH3RF2 (or POSHER), SH3RF3 (POSH2), and similar domains. Members of this family are scaffold proteins that function as E3 ubiquitin-protein ligases. They all contain an N-terminal RING finger domain and multiple SH3 domains; SH3RF1 and SH3RF3 have four SH3 domains while SH3RF2 has three. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3RF2 acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29802 212722 cd11788 SH3_RasGAP Src Homology 3 domain of Ras GTPase-Activating Protein 1. RasGAP, also called Ras p21 protein activator, RASA1, or p120RasGAP, is part of the GAP1 family of GTPase-activating proteins. It is a 120kD cytosolic protein containing an SH3 domain flanked by two SH2 domains at the N-terminal end, a pleckstrin homology (PH) domain, a calcium dependent phospholipid binding domain (CaLB/C2), and a C-terminal catalytic GAP domain. It stimulates the GTPase activity of normal RAS p21. It acts as a positive effector of Ras in tumor cells. It also functions as a regulator downstream of tyrosine receptors such as those of PDGF, EGF, ephrin, and insulin, among others. The SH3 domain of RasGAP is unable to bind proline-rich sequences but has been shown to interact with protein partners such as the G3BP protein, Aurora kinases, and the Calpain small subunit 1. The RasGAP SH3 domain is necessary for the downstream signaling of Ras and it also influences Rho-mediated cytoskeletal reorganization. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29803 212723 cd11789 SH3_Nebulin_family_C C-terminal Src Homology 3 domain of the Nebulin family of proteins. Nebulin family proteins contain multiple nebulin repeats, and may contain an N-terminal LIM domain and/or a C-terminal SH3 domain. They have molecular weights ranging from 34 to 900 kD, depending on the number of nebulin repeats, and they all bind actin. They are involved in the regulation of actin filament architecture and function as stabilizers and scaffolds for cytoskeletal structures with which they associate, such as long actin filaments or focal adhesions. Nebulin family proteins that contain a C-terminal SH3 domain include the giant filamentous protein nebulin, nebulette, Lasp1, and Lasp2. Lasp2, also called LIM-nebulette, is an alternatively spliced variant of nebulette. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29804 212724 cd11790 SH3_Amphiphysin Src Homology 3 domain of Amphiphysin and related domains. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They exist in several isoforms and mammals possess two amphiphysin proteins from distinct genes. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin, and synaptojanin. They function in synaptic vesicle endocytosis. Human autoantibodies to amphiphysin I hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. Mutations in Bin1 are associated with autosomal recessive centronuclear myopathy. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. The SH3 domain of amphiphysins binds proline-rich motifs present in binding partners such as dynamin, synaptojanin, and nsP3. It also belongs to a subset of SH3 domains that bind ubiquitin in a site that overlaps with the peptide binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 64
29805 212725 cd11791 SH3_UBASH3 Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing proteins, also called TULA (T cell Ubiquitin LigAnd) family of proteins. UBASH3 or TULA proteins are also referred to as Suppressor of T cell receptor Signaling (STS) proteins. They contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and to ubiquitin via UBA. In some vertebrates, there are two TULA family proteins, called UBASH3A (also called TULA or STS-2) and UBASH3B (also called TULA-2 or STS-1), which show partly overlapping as well as distinct functions. UBASH3B is widely expressed while UBASH3A is only found in lymphoid cells. UBASH3A facilitates apoptosis induced in T cells through its interaction with the apoptosis-inducing factor AIF. UBASH3B is an active phosphatase while UBASH3A is not. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29806 212726 cd11792 SH3_Fut8 Src homology 3 domain of Alpha1,6-fucosyltransferase (Fut8). Fut8 catalyzes the alpha1,6-linkage of a fucose residue from a donor substrate to N-linked oligosaccharides on glycoproteins in a process called core fucosylation, which is crucial for growth factor receptor-mediated biological functions. Fut8-deficient mice show severe growth retardation, early death, and a pulmonary emphysema-like phenotype. Fut8 is also implicated in aging and cancer metastasis. It contains an N-terminal coiled-coil domain, a catalytic domain, and a C-terminal SH3 domain. The SH3 domain of Fut8 is located in the lumen and its role in glycosyl transfer is unclear. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29807 212727 cd11793 SH3_ephexin1_like Src homology 3 domain of ephexin-1-like SH3 domain containing Rho guanine nucleotide exchange factors. Members of this family contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and C-terminal SH3 domains. They include the Rho guanine nucleotide exchange factors ARHGEF5, ARHGEF16, ARHGEF19, ARHGEF26, ARHGEF27 (also called ephexin-1), and similar proteins, and are also called ephexins because they interact directly with ephrin A receptors. GEFs interact with Rho GTPases via their DH domains to catalyze nucleotide exchange by stabilizing the nucleotide-free GTPase intermediate. They play important roles in neuronal development. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29808 212728 cd11794 SH3_DNMBP_N1 First N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29809 212729 cd11795 SH3_DNMBP_N2 Second N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29810 212730 cd11796 SH3_DNMBP_N3 Third N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29811 212731 cd11797 SH3_DNMBP_N4 Fourth N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29812 212732 cd11798 SH3_DNMBP_C1 First C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29813 212733 cd11799 SH3_ARHGEF37_C1 First C-terminal Src homology 3 domain of Rho guanine nucleotide exchange factor 37. ARHGEF37 contains a RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. Its specific function is unknown. Its domain architecture is similar to the C-terminal half of DNMBP or Tuba, a cdc42-specific GEF that provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics, and plays an important role in regulating cell junction configuration. GEFs activate small GTPases by exchanging bound GDP for free GTP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29814 212734 cd11800 SH3_DNMBP_C2_like Second C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba, and similar domains. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. Also included in this subfamily is the second C-terminal SH3 domain of Rho guanine nucleotide exchange factor 37 (ARHGEF37), whose function is still unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29815 212735 cd11801 SH3_JIP1_like Src homology 3 domain of JNK-interacting proteins 1 and 2, and similar domains. JNK-interacting proteins (JIPs) function as scaffolding proteins for c-Jun N-terminal kinase (JNK) signaling pathways. They bind to components of Mitogen-activated protein kinase (MAPK) pathways such as JNK, MKK, and several MAP3Ks such as MLK and DLK. There are four JIPs (JIP1-4); all contain a JNK binding domain. JIP1 and JIP2 also contain SH3 and Phosphotyrosine-binding (PTB) domains. Both are highly expressed in the brain and pancreatic beta-cells. JIP1 functions as an adaptor linking motor to cargo during axonal transport and also is involved in regulating insulin secretion. JIP2 forms complexes with fibroblast growth factor homologous factors (FHFs), which facilitates activation of the p38delta MAPK. The SH3 domain of JIP1 homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29816 212736 cd11802 SH3_Endophilin_B Src homology 3 domain of Endophilin-B. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29817 212737 cd11803 SH3_Endophilin_A Src homology 3 domain of Endophilin-A. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms (A1, A2, and A3). Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. They tubulate membranes and regulate calcium influx into neurons to trigger the activation of the endocytic machinery. They are also involved in the sorting of plasma membrane proteins, actin filament assembly, and the uncoating of clathrin-coated vesicles for fusion with endosomes. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29818 212738 cd11804 SH3_GRB2_like_N N-terminal Src homology 3 domain of Growth factor receptor-bound protein 2 (GRB2) and related proteins. This family includes the adaptor protein GRB2 and related proteins including Drosophila melanogaster Downstream of receptor kinase (DRK), Caenorhabditis elegans Sex muscle abnormal protein 5 (Sem-5), GRB2-related adaptor protein (GRAP), GRAP2, and similar proteins. Family members contain an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. GRB2/Sem-5/DRK is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. GRAP2 plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. GRAP acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. The N-terminal SH3 domain of GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29819 212739 cd11805 SH3_GRB2_like_C C-terminal Src homology 3 domain of Growth factor receptor-bound protein 2 (GRB2) and related proteins. This family includes the adaptor protein GRB2 and related proteins including Drosophila melanogaster Downstream of receptor kinase (DRK), Caenorhabditis elegans Sex muscle abnormal protein 5 (Sem-5), GRB2-related adaptor protein (GRAP), GRAP2, and similar proteins. Family members contain an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. GRB2/Sem-5/DRK is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. GRAP2 plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. GRAP acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. The C-terminal SH3 domains (SH3c) of GRB2 and GRAP2 have been shown to bind to classical PxxP motif ligands, as well as to non-classical motifs. GRB2 SH3c binds Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, while the SH3c of GRAP2 binds to the phosphatase-like protein HD-PTP via a RxxxxK motif. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29820 212740 cd11806 SH3_PRMT2 Src homology 3 domain of Protein arginine N-methyltransferase 2. PRMT2, also called HRMT1L1, belongs to the arginine methyltransferase protein family. It functions as a coactivator to both estrogen receptor alpha (ER-alpha) and androgen receptor (AR), presumably through arginine methylation. The ER-alpha transcription factor is involved in cell proliferation, differentiation, morphogenesis, and apoptosis, and is also implicated in the development and progression of breast cancer. PRMT2 and its variants are upregulated in breast cancer cells and may be involved in modulating the ER-alpha signaling pathway during formation of breast cancer. PRMT2 also plays a role in regulating the function of E2F transcription factors, which are critical cell cycle regulators, by binding to the retinoblastoma gene product (RB). It contains an N-terminal SH3 domain and an AdoMet binding domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29821 212741 cd11807 SH3_ASPP Src homology 3 domain of Apoptosis Stimulating of p53 proteins (ASPP). The ASPP family of proteins bind to important regulators of apoptosis (p53, Bcl-2, and RelA) and cell growth (APCL, PP1). They share similarity at their C-termini, where they harbor a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain. Vertebrates contain three members of the family: ASPP1, ASPP2, and iASPP. ASPP1 and ASPP2 activate the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73), while iASPP is an oncoprotein that specifically inhibits p53-induced apoptosis. The expression of ASPP proteins is altered in tumors; ASPP1 and ASPP2 are downregulated whereas iASPP is upregulated in some cancer types. ASPP proteins also bind and regulate protein phosphatase 1 (PP1), and this binding is competitive with p53 binding. The SH3 domain and the ANK repeats of ASPP contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29822 212742 cd11808 SH3_Alpha_Spectrin Src homology 3 domain of Alpha Spectrin. Spectrin is a major structural component of the red blood cell membrane skeleton and is important in erythropoiesis and membrane biogenesis. It is a flexible, rope-like molecule composed of two subunits, alpha and beta, which consist of many spectrin-type repeats. Alpha and beta spectrin associate to form heterodimers and tetramers; spectrin tetramer formation is critical for red cell shape and deformability. Defects in alpha spectrin have been associated with inherited hemolytic anemias including hereditary spherocytosis (HSp), hereditary elliptocytosis (HE), and hereditary pyropoikilocytosis (HPP). Alpha spectrin contains a middle SH3 domain and a C-terminal EF-hand binding motif in addition to multiple spectrin repeats. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29823 212743 cd11809 SH3_srGAP Src homology 3 domain of Slit-Robo GTPase Activating Proteins. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs (srGAP1-3), all of which are expressed during embryonic and early development in the nervous system but with different localization and timing. A fourth member has also been reported (srGAP4, also called ARHGAP4). srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29824 212744 cd11810 SH3_RUSC1_like Src homology 3 domain of RUN and SH3 domain-containing proteins 1 and 2. RUSC1 and RUSC2 were originally characterized in silico. They are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. RUSC1, also called NESCA (New molecule containing SH3 at the carboxy-terminus), is highly expressed in the brain and is translocated to the nuclear membrane from the cytoplasm upon stimulation with neurotrophin. It plays a role in facilitating neurotrophin-dependent neurite outgrowth. It also interacts with NEMO (or IKKgamma) and may function in NEMO-mediated activation of NF-kB. RUSC2, also called Iporin, is expressed ubiquitously with highest amounts in the brain and testis. It interacts with the small GTPase Rab1 and the Golgi matrix protein GM130, and may function in linking GTPases to certain intracellular signaling pathways. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29825 212745 cd11811 SH3_CHK Src Homology 3 domain of CSK homologous kinase. CHK is also referred to as megakaryocyte-associated tyrosine kinase (Matk). It inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As a negative regulator of Src kinases, Chk may play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. To inhibit Src kinases that are anchored to the plasma membrane, CHK is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. CHK also plays a role in neural differentiation in a manner independent of Src by enhancing MAPK activation via Ras-mediated signaling. It is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29826 212746 cd11812 SH3_AHI-1 Src Homology 3 domain of Abelson helper integration site-1 (AHI-1). AHI-1, also called Jouberin, is expressed in high levels in the brain, gonad tissues, and skeletal muscle. It is an adaptor protein that interacts with the small GTPase Rab8a and regulates its distribution and function, affecting cilium formation and vesicle transport. Mutations in the AHI-1 gene can cause Joubert syndrome, a disorder characterized by brainstem malformations, cerebellar aplasia/hypoplasia, and retinal dystrophy. AHI-1 variation is also associated with susceptibility to schizophrenia and type 2 diabetes mellitus progression. AHI-1 contains WD40 and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29827 212747 cd11813 SH3_SGSM3 Src Homology 3 domain of Small G protein Signaling Modulator 3. SGSM3 is also called Merlin-associated protein (MAP), RUN and SH3 domain-containing protein (RUSC3), RUN and TBC1 domain-containing protein 3 (RUTBC3), Rab GTPase-activating protein 5 (RabGAP5), or Rab GAP-like protein (RabGAPLP). It is expressed ubiquitously and functions as a regulator of small G protein RAP- and RAB-mediated neuronal signaling. It is involved in modulating NGF-mediated neurite outgrowth and differentiation. It also interacts with the tumor suppressor merlin and may play a role in the merlin-associated suppression of cell growth. SGSM3 contains TBC, SH3, and RUN domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29828 212748 cd11814 SH3_Eve1_1 First Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29829 212749 cd11815 SH3_Eve1_2 Second Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29830 212750 cd11816 SH3_Eve1_3 Third Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29831 212751 cd11817 SH3_Eve1_4 Fourth Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29832 212752 cd11818 SH3_Eve1_5 Fifth Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29833 212753 cd11819 SH3_Cortactin_like Src homology 3 domain of Cortactin and related proteins. This subfamily includes cortactin, Abp1 (actin-binding protein 1), hematopoietic lineage cell-specific protein 1 (HS1), and similar proteins. These proteins are involved in regulating actin dynamics through direct or indirect interaction with the Arp2/3 complex, which is required to initiate actin polymerization. They all contain at least one C-terminal SH3 domain. Cortactin and HS1 bind Arp2/3 and actin through an N-terminal region that contains an acidic domain and several copies of a repeat domain found in cortactin and HS1. Abp1 binds actin via an N-terminal actin-depolymerizing factor (ADF) homology domain. Yeast Abp1 binds Arp2/3 directly through two acidic domains. Mammalian Abp1 does not directly interact with Arp2/3; instead, it regulates actin dynamics indirectly by interacting with dynamin and WASP family proteins. The C-terminal region of these proteins acts as an adaptor or scaffold that can connect membrane trafficking and signaling proteins that bind the SH3 domain within the actin network. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29834 212754 cd11820 SH3_STAM Src homology 3 domain of Signal Transducing Adaptor Molecules. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. There are two vertebrate STAMs, STAM1 and STAM2, which may be functionally redundant; vertebrate STAMs contain ITAM motifs. They are part of the endosomal sorting complex required for transport (ESCRT-0). STAM2 deficiency in mice did not cause any obvious abnormality, while STAM1 deficiency resulted in growth retardation. Loss of both STAM1 and STAM2 in mice proved lethal, indicating that STAMs are important for embryonic development. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29835 212755 cd11821 SH3_ASAP Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing proteins. ASAPs are Arf GTPase activating proteins (GAPs) and they function in regulating cell growth, migration, and invasion. They contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. Vertebrates contain at least three members, ASAP1, ASAP2, and ASAP3, but some ASAP3 proteins do not seem to harbor a C-terminal SH3 domain. ASAP1 and ASAP2 show GTPase activating protein (GAP) activity towards Arf1 and Arf5. They do not show GAP activity towards Arf6, but are able to mediate Arf6 signaling by binding stably to GTP-Arf6. ASAP3 is an Arf6-specific GAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29836 212756 cd11822 SH3_SASH_like Src homology 3 domain of SAM And SH3 Domain Containing Proteins. This subfamily, also called the SLY family, is composed of SAM And SH3 Domain Containing Protein 1 (SASH1), SASH2, SASH3, and similar proteins. These are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SASH1 is a potential tumor suppressor in breast and colon cancer. It is widely expressed in normal tissues (except lymphocytes and dendritic cells) and is localized in the nucleus and the cytoplasm. SASH1 interacts with the oncoprotein cortactin and is important in cell migration and adhesion. SASH2 (also called SAMSN-1, SLY2, HACS1 or NASH1) and SASH3 (also called SLY/SLY1) are expressed mainly in hematopoietic cells, although SASH2 is also found in endothelial cells as well as myeloid leukemias and myeloma. SASH2 was found to be differentially expressed in malignant haematopoietic cells and in colorectal tumors, and is a potential tumor suppressor in lung cancer. SASH3 is essential in the full activation of adaptive immunity and is involved in the signaling of T cell receptors. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29837 212757 cd11823 SH3_Nostrin Src homology 3 domain of Nitric Oxide Synthase TRaffic INducer. Nostrin is expressed in endothelial and epithelial cells and is involved in the regulation, trafficking and targeting of endothelial NOS (eNOS). It facilitates the endocytosis of eNOS by coordinating the functions of dynamin and the Wiskott-Aldrich syndrome protein (WASP). Increased expression of Nostrin may be correlated to preeclampsia. Nostrin contains an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29838 212758 cd11824 SH3_PSTPIP1 Src homology 3 domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 1. PSTPIP1, also called CD2 Binding Protein 1 (CD2BP1), is mainly expressed in hematopoietic cells. It is a binding partner of the cell surface receptor CD2 and PTP-PEST, a tyrosine phosphatase which functions in cell motility and Rac1 regulation. It also plays a role in the activation of the Wiskott-Aldrich syndrome protein (WASP), which couples actin rearrangement and T cell activation. Mutations in the gene encoding PSTPIP1 cause the autoinflammatory disorder known as PAPA (pyogenic sterile arthritis, pyoderma gangrenosum, and acne) syndrome. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29839 212759 cd11825 SH3_PLCgamma Src homology 3 domain of Phospholipase C (PLC) gamma. PLC catalyzes the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG) in response to various receptors. Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma catalyzes this reaction in tyrosine kinase-dependent signaling pathways. It is activated and recruited to its substrate at the membrane. Vertebrates contain two forms of PLCgamma, PLCgamma1, which is widely expressed, and PLCgamma2, which is primarily found in haematopoietic cells. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. The SH3 domain of PLCgamma1 directly interacts with dynamin-1 and can serve as a guanine nucleotide exchange factor (GEF). It also interacts with Cbl, inhibiting its phosphorylation and activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29840 212760 cd11826 SH3_Abi Src homology 3 domain of Abl Interactor proteins. Abl interactor (Abi) proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. They localize to sites of actin polymerization in epithelial adherens junction and immune synapses, as well as to the leading edge of lamellipodia. Vertebrates contain two Abi proteins, Abi1 and Abi2. Abi1 displays a wide expression pattern while Abi2 is highly expressed in the eye and brain. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29841 212761 cd11827 SH3_MyoIe_If_like Src homology 3 domain of Myosins Ie, If, and similar proteins. Myosins Ie (MyoIe) and If (MyoIf) are nonmuscle, unconventional, long tailed, class I myosins containing an N-terminal motor domain and a myosin tail with TH1, TH2, and SH3 domains. MyoIe interacts with the endocytic proteins, dynamin and synaptojanin-1, through its SH3 domain; it may play a role in clathrin-dependent endocytosis. In the kidney, MyoIe is critical for podocyte function and normal glomerular filtration. Mutations in MyoIe are associated with focal segmental glomerulosclerosis, a disease characterized by massive proteinuria and progression to end-stage kidney disease. MyoIf is predominantly expressed in the immune system; it plays a role in immune cell motility and innate immunity. Mutations in MyoIf may be associated with the loss of hearing. The MyoIf gene has also been found to be fused to the MLL (Mixed lineage leukemia) gene in infant acute myeloid leukemias (AML). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29842 212762 cd11828 SH3_ARHGEF9_like Src homology 3 domain of ARHGEF9-like Rho guanine nucleotide exchange factors. Members of this family contain a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. They include the Rho guanine nucleotide exchange factors ARHGEF9, ASEF (also called ARHGEF4), ASEF2, and similar proteins. GEFs activate small GTPases by exchanging bound GDP for free GTP. ARHGEF9 specifically activates Cdc42, while both ASEF and ASEF2 can activate Rac1 and Cdc42. ARHGEF9 is highly expressed in the brain and it interacts with gephyrin, a postsynaptic protein associated with GABA and glycine receptors. ASEF plays a role in angiogenesis and cell migration. ASEF2 is important in cell migration and adhesion dynamics. ASEF exists in an autoinhibited form and is activated upon binding of the tumor suppressor APC (adenomatous polyposis coli), leading to the activation of Rac1 or Cdc42. In its autoinhibited form, the SH3 domain of ASEF forms an extensive interface with the DH and PH domains, blocking the Rac binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29843 212763 cd11829 SH3_GAS7 Src homology 3 domain of Growth Arrest Specific protein 7. GAS7 is mainly expressed in the brain and is required for neurite outgrowth. It may also play a role in the protection and migration of embryonic stem cells. Treatment-related acute myeloid leukemia (AML) has been reported resulting from mixed-lineage leukemia (MLL)-GAS7 translocations as a complication of primary cancer treatment. GAS7 contains an N-terminal SH3 domain, followed by a WW domain, and a central F-BAR domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29844 212764 cd11830 SH3_VAV_2 C-terminal (or second) Src homology 3 domain of VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and scaffold proteins and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29845 212765 cd11831 SH3_VAV_1 First Src homology 3 domain of VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and scaffold proteins and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29846 212766 cd11832 SH3_Shank Src homology 3 domain of SH3 and multiple ankyrin repeat domains (Shank) proteins. Shank proteins carry out scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline rich region, as well as SH3, PDZ, and SAM domains. They bind a variety of membrane and cytosolic proteins, and exist in alternatively spliced isoforms. They are highly enriched in postsynaptic density (PSD) where they interact with the cytoskeleton and with postsynaptic membrane receptors including NMDA and glutamate receptors. They are crucial in the construction and organization of the PSD and dendritic spines of excitatory synapses. There are three members of this family (Shank1, Shank2, Shank3) which show distinct and cell-type specific patterns of expression. Shank1 is brain-specific; Shank2 is found in neurons, glia, endocrine cells, liver, and kidney; Shank3 is widely expressed. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 50
29847 212767 cd11833 SH3_Stac_1 First C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing (Stac) proteins. Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. This model represents the first C-terminal SH3 domain of Stac1 and Stac3, and the single C-terminal SH3 domain of Stac2. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29848 212768 cd11834 SH3_Stac_2 Second C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing proteins 1 and 3. This model represents the second C-terminal SH3 domain of Stac1 and Stac3. Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29849 212769 cd11835 SH3_ARHGAP32_33 Src homology 3 domain of Rho GTPase-activating proteins 32 and 33, and similar proteins. Members of this family contain N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal extensions. RhoGAPs (or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP32 is also called RICS, PX-RICS, p250GAP, or p200RhoGAP. It is a Rho GTPase-activating protein for Cdc42 and Rac1, and is implicated in the regulation of postsynaptic signaling and neurite outgrowth. PX-RICS, a variant of RICS that contains PX and SH3 domains, is the main isoform expressed during neural development. It is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. ARHGAP33, also called sorting nexin 26 or TCGAP (Tc10/CDC42 GTPase-activating protein), is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. It is translocated to the plasma membrane in adipocytes in response to insulin and may be involved in the regulation of insulin-stimulated glucose transport. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29850 212770 cd11836 SH3_Intersectin_1 First Src homology 3 domain (or SH3A) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The first SH3 domain (or SH3A) of ITSN1 has been shown to bind many proteins including Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29851 212771 cd11837 SH3_Intersectin_2 Second Src homology 3 domain (or SH3B) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The second SH3 domain (or SH3B) of ITSN1 has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29852 212772 cd11838 SH3_Intersectin_3 Third Src homology 3 domain (or SH3C) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The third SH3 domain (or SH3C) of ITSN1 has been shown to bind many proteins including dynamin1/2, CIN85, c-Cbl, SHIP2, Reps1, synaptojanin-1, and WNK, among others. The SH3C of ITSN2 has been shown to bind the K15 protein of Kaposi's sarcoma-associated herpesvirus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29853 212773 cd11839 SH3_Intersectin_4 Fourth Src homology 3 domain (or SH3D) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The fourth SH3 domain (or SH3D) of ITSN1 has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29854 212774 cd11840 SH3_Intersectin_5 Fifth Src homology 3 domain (or SH3E) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The fifth SH3 domain (or SH3E) of ITSN1 has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29855 212775 cd11841 SH3_SH3YL1_like Src homology 3 domain of SH3 domain containing Ysc84-like 1 (SH3YL1) protein. SH3YL1 localizes to the plasma membrane and is required for dorsal ruffle formation. It binds phosphoinositides (PIs) with high affinity through its N-terminal SYLF domain (also called DUF500). In addition, SH3YL1 contains a C-terminal SH3 domain which has been reported to bind to N-WASP, dynamin 2, and SHIP2 (a PI 5-phosphatase). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29856 212776 cd11842 SH3_Ysc84p_like Src homology 3 domain of Ysc84p and similar fungal proteins. This family is composed of the Saccharomyces cerevisiae proteins, Ysc84p (also called LAS17-binding protein 4, Lsb4p) and Lsb3p, and similar fungal proteins. They contain an N-terminal SYLF domain (also called DUF500) and a C-terminal SH3 domain. Ysc84p localizes to actin patches and plays an important role in actin polymerization during endocytosis. The N-terminal domain of both Ysc84p and Lsb3p can bind and bundle actin filaments. A study of the yeast SH3 domain interactome predicts that the SH3 domains of Lsb3p and Lsb4p may function as molecular hubs for the assembly of endocytic complexes. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29857 212777 cd11843 SH3_PACSIN Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. PACSINs, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29858 212778 cd11844 SH3_CAS Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29859 212779 cd11845 SH3_Src_like Src homology 3 domain of Src kinase-like Protein Tyrosine Kinases. Src subfamily members include Src, Lck, Hck, Blk, Lyn, Fgr, Fyn, Yrk, Yes, and Brk. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Brk lacks the N-terminal myristoylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells, and tumor vasculature, contributing to cancer progression and metastasis. Src kinases are overexpressed in a variety of human cancers, making them attractive targets for therapy. They are also implicated in acute inflammatory responses and osteoclast function. Src, Fyn, Yes, and Yrk are widely expressed, while Blk, Lck, Hck, Fgr, Lyn, and Brk show a limited expression pattern. This subfamily also includes Drosophila Src42A, Src oncogene at 42A (also known as Dsrc41) which accumulates at sites of cell-cell or cell-matrix adhesion, and participates in Drosophila development and wound healing. It has been shown to promote tube elongation in the tracheal system, is essential for proper cell-cell matching during dorsal closure, and regulates cell-cell contacts in developing Drosophila eyes. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29860 212780 cd11846 SH3_Srms Src homology 3 domain of Srms Protein Tyrosine Kinase. Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites (Srms) is a cytoplasmic (or non-receptor) PTK with limited homology to Src kinases. Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Srms lacks the N-terminal myristoylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29861 212781 cd11847 SH3_Brk Src homology 3 domain of Brk (Breast tumor kinase) Protein Tyrosine Kinase (PTK), also called PTK6. Brk is a cytoplasmic (or non-receptor) PTK with limited homology to Src kinases. It has been found to be overexpressed in a majority of breast tumors. It plays roles in normal cell differentiation, proliferation, survival, migration, and cell cycle progression. Brk substrates include RNA-binding proteins (SLM-1/2, Sam68), transcription factors (STAT3/5), and signaling molecules (Akt, paxillin, IRS-4). Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Brk lacks the N-terminal myristoylation site. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29862 212782 cd11848 SH3_SLAP-like Src homology 3 domain of Src-Like Adaptor Proteins. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. They function in regulating the signaling, ubiquitination, and trafficking of T-cell receptor (TCR) and B-cell receptor (BCR) components. Vertebrates contain two SLAPs, named SLAP (or SLA1) and SLAP2 (or SLA2). SLAP has been shown to interact with the EphA receptor, EpoR, Lck, PDGFR, Syk, CD79a, among others, while SLAP2 interacts with CSF1R. Both SLAPs interact with c-Cbl, LAT, CD247, and Zap70. SLAP modulates TCR surface expression levels as well as surface and total BCR levels. As an adaptor to c-Cbl, SLAP increases the ubiquitination, intracellular retention, and targeted degradation of the BCR complex components. SLAP2 plays a role in c-Cbl-dependent regulation of CSF1R, a tyrosine kinase important for myeloid cell growth and differentiation. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29863 212783 cd11849 SH3_SPIN90 Src homology 3 domain of SH3 protein interacting with Nck, 90 kDa (SPIN90). SPIN90 is also called NCK interacting protein with SH3 domain (NCKIPSD), Dia-interacting protein (DIP), 54 kDa vimentin-interacting protein (VIP54), or WASP-interacting SH3-domain protein (WISH). It is an F-actin binding protein that regulates actin polymerization and endocytosis. It associates with the Arp2/3 complex near actin filaments and determines filament localization at the leading edge of lamellipodia. SPIN90 is expressed in the early stages of neuronal differentiation and plays a role in regulating growth cone dynamics and neurite outgrowth. It also interacts with IRSp53 and regulates cell motility by playing a role in the formation of membrane protrusions. SPIN90 contains an N-terminal SH3 domain, a proline-rich domain, and a C-terminal VCA (verprolin-homology and cofilin-like acidic) domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29864 212784 cd11850 SH3_Abl Src homology 3 domain of the Protein Tyrosine Kinase, Abelson kinase. Abl (or c-Abl) is a ubiquitously-expressed cytoplasmic (or nonreceptor) PTK that contains SH3, SH2, and tyr kinase domains in its N-terminal region, as well as nuclear localization motifs, a putative DNA-binding domain, and F- and G-actin binding domains in its C-terminal tail. It also contains a short autoinhibitory cap region in its N-terminus. Abl function depends on its subcellular localization. In the cytoplasm, Abl plays a role in cell proliferation and survival. In response to DNA damage or oxidative stress, Abl is transported to the nucleus where it induces apoptosis. In chronic myelogenous leukemia (CML) patients, an aberrant translocation results in the replacement of the first exon of Abl with the BCR (breakpoint cluster region) gene. The resulting BCR-Abl fusion protein is constitutively active and associates into tetramers, resulting in a hyperactive kinase sending a continuous signal. This leads to uncontrolled proliferation, morphological transformation and anti-apoptotic effects. BCR-Abl is the target of selective inhibitors, such as imatinib (Gleevec), used in the treatment of CML. Abl2, also known as ARG (Abelson-related gene), is thought to play a cooperative role with Abl in the proper development of the nervous system. The Tel-ARG fusion protein, resulting from reciprocal translocation between chromosomes 1 and 12, is associated with acute myeloid leukemia (AML). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29865 212785 cd11851 SH3_RIM-BP Src homology 3 domains of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29866 212786 cd11852 SH3_Kalirin_1 First Src homology 3 domain of the RhoGEF kinase, Kalirin. Kalirin, also called Duo, Duet, or TRAD, is a large neuronal dual Rho guanine nucleotide exchange factor (RhoGEF) that activates Rac1, RhoA, and RhoG using two RhoGEF domains. Kalirin exists in many isoforms generated by alternative splicing and the use of multiple promoters; the major isoforms are kalirin-7, -9, and -12, which differ at their C-terminal ends. Kalirin-12, the longest isoform, contains an N-terminal Sec14p domain, spectrin-like repeats, two RhoGEF domains, two SH3 domains, as well as Ig, FNIII, and kinase domains at the C-terminal end. Kalirin-7 contains only a single RhoGEF domain and does not contain an SH3 domain. Kalirin, through its many isoforms, interacts with many different proteins and is able to localize to different locations within the cell. It influences neurite initiation, axon growth, dendritic morphogenesis, vesicle trafficking, neuronal maintenance, and neurodegeneration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29867 212787 cd11853 SH3_Kalirin_2 Second Src homology 3 domain of the RhoGEF kinase, Kalirin. Kalirin, also called Duo, Duet, or TRAD, is a large neuronal dual Rho guanine nucleotide exchange factor (RhoGEF) that activates Rac1, RhoA, and RhoG using two RhoGEF domains. Kalirin exists in many isoforms generated by alternative splicing and the use of multiple promoters; the major isoforms are kalirin-7, -9, and -12, which differ at their C-terminal ends. Kalirin-12, the longest isoform, contains an N-terminal Sec14p domain, spectrin-like repeats, two RhoGEF domains, two SH3 domains, as well as Ig, FNIII, and kinase domains at the C-terminal end. Kalirin-7 contains only a single RhoGEF domain and does not contain an SH3 domain. Kalirin, through its many isoforms, interacts with many different proteins and is able to localize to different locations within the cell. It influences neurite initiation, axon growth, dendritic morphogenesis, vesicle trafficking, neuronal maintenance, and neurodegeneration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29868 212788 cd11854 SH3_Fus1p Src homology 3 domain of yeast cell fusion protein Fus1p. Fus1p is required at the cell surface for cell fusion during the mating response in yeast. It requires Bch1p and Bud7p, which are Chs5p-Arf1p binding proteins, for localization to the plasma membrane. It acts as a scaffold protein to assemble a cell surface complex which is involved in septum degradation and inhibition of the HOG pathway to promote cell fusion. The SH3 domain of Fus1p interacts with Bni1p, a formin that controls the assembly of actin cables in response to Cdc42 signaling. It has been shown to bind the motif, R(S/T)(S/T)SL, instead of PxxP motifs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29869 212789 cd11855 SH3_Sho1p Src homology 3 domain of High osmolarity signaling protein Sho1p. Sho1p (or Sho1), also called SSU81 (Suppressor of SUA8-1 mutation), is a yeast membrane protein that regulates adaptation to high salt conditions by activating the HOG (high-osmolarity glycerol) pathway. High salt concentrations lead to the localization to the membrane of the MAPKK Pbs2, which is then activated by the MAPKKK Ste11 and in turn, activates the MAPK Hog1. Pbs2 is localized to the membrane through the interaction of its PxxP motif with the SH3 domain of Sho1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29870 212790 cd11856 SH3_p47phox_like Src homology 3 domains of the p47phox subunit of NADPH oxidase and similar domains. This family is composed of the tandem SH3 domains of p47phox subunit of NADPH oxidase and Nox Organizing protein 1 (NoxO1), the four SH3 domains of Tks4 (Tyr kinase substrate with four SH3 domains), the five SH3 domains of Tks5, the SH3 domain of obscurin, Myosin-I, and similar domains. Most members of this group also contain Phox homology (PX) domains, except for obscurin and Myosin-I. p47phox and NoxO1 are regulators of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) and nonphagocytic NADPH oxidase Nox1, respectively. They play roles in the activation of their respective NADPH oxidase, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Obscurin is a giant muscle protein that plays important roles in the organization and assembly of the myofibril and the sarcoplasmic reticulum. Type I myosins (Myosin-I) are actin-dependent motors in endocytic actin structures and actin patches. They play roles in membrane traffic in endocytic and secretory pathways, cell motility, and mechanosensing. Myosin-I contains an N-terminal actin-activated ATPase, a phospholipid-binding TH1 (tail homology 1) domain, and a C-terminal extension which includes an F-actin-binding TH2 domain, an SH3 domain, and an acidic peptide that participates in activating the Arp2/3 complex. The SH3 domain of myosin-I is required for myosin-I-induced actin polymerization. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29871 212791 cd11857 SH3_DBS Src homology 3 domain of DBL's Big Sister (DBS), a guanine nucleotide exchange factor. DBS, also called MCF2L (MCF2-transforming sequence-like protein) or OST, is a Rho GTPase guanine nucleotide exchange factor (RhoGEF), facilitating the exchange of GDP and GTP. It was originally isolated from a cDNA screen for sequences that cause malignant growth. It plays roles in regulating clathrin-mediated endocytosis and cell migration through its activation of Rac1 and Cdc42. Depending on cell type, DBS can also activate RhoA and RhoG. DBS contains a Sec14-like domain, spectrin-like repeats, a RhoGEF [or Dbl homology (DH)] domain, a Pleckstrin homology (PH) domain, and an SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29872 212792 cd11858 SH3_Myosin-I_fungi Src homology 3 domain of Type I fungal Myosins. Type I myosins (myosin-I) are actin-dependent motors in endocytic actin structures and actin patches. They play roles in membrane traffic in endocytic and secretory pathways, cell motility, and mechanosensing. Saccharomyces cerevisiae has two myosins-I, Myo3 and Myo5, which are involved in endocytosis and the polarization of the actin cytoskeleton. Myosin-I contains an N-terminal actin-activated ATPase, a phospholipid-binding TH1 (tail homology 1) domain, and a C-terminal extension which includes an F-actin-binding TH2 domain, an SH3 domain, and an acidic peptide that participates in activating the Arp2/3 complex. The SH3 domain of myosin-I is required for myosin-I-induced actin polymerization. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29873 212793 cd11859 SH3_ZO Src homology 3 domain of the Tight junction proteins, Zonula occludens (ZO) proteins. ZO proteins are scaffolding proteins that associate with each other and with other proteins of the tight junction, zonula adherens, and gap junctions. They play roles in regulating cytoskeletal dynamics at these cell junctions. They are considered members of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. Vertebrates contain three ZO proteins (ZO-1, ZO-2, and ZO-3) with redundant and non-redundant roles. They contain three PDZ domains, followed by SH3 and GuK domains; in addition, ZO-1 and ZO-2 contain a proline-rich (PR) actin binding domain at the C-terminus while ZO-3 contains this PR domain between the second and third PDZ domains. The C-terminal regions of the three ZO proteins are unique. The SH3 domain of ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29874 212794 cd11860 SH3_DLG5 Src homology 3 domain of Disks Large homolog 5. DLG5 is a multifunctional scaffold protein that is located at sites of cell-cell contact and is involved in the maintenance of cell shape and polarity. Mutations in the DLG5 gene are associated with Crohn's disease (CD) and inflammatory bowel disease (IBD). DLG5 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG5 contains 4 PDZ domains as well as an N-terminal domain of unknown function. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 63
29875 212795 cd11861 SH3_DLG-like Src Homology 3 domain of Disks large homolog proteins. The DLG-like proteins are scaffolding proteins that cluster at synapses and are also called PSD (postsynaptic density)-95 proteins or SAPs (synapse-associated proteins). They play important roles in synaptic development and plasticity, cell polarity, migration and proliferation. They are members of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG-like proteins contain three PDZ domains and varying N-terminal regions. All DLG proteins exist as alternatively-spliced isoforms. Vertebrates contain four DLG proteins from different genes, called DLG1-4. DLG4 and DLG2 are found predominantly at postsynaptic sites and they mediate surface ion channel and receptor clustering. DLG3 is found in axons and some presynaptic terminals. DLG1 interacts with AMPA-type glutamate receptors and is critical in their maturation and delivery to synapses. The SH3 domain of DLG4 binds and clusters the kainate subgroup of glutamate receptors via two proline-rich sequences in their C-terminal tail. It also binds AKAP79/150 (A-kinase anchoring protein). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
29876 212796 cd11862 SH3_MPP Src Homology 3 domain of Membrane Protein, Palmitoylated (or MAGUK p55 subfamily member) proteins. The MPP/p55 subfamily of MAGUK (membrane-associated guanylate kinase) proteins includes at least eight vertebrate members (MPP1-7 and CASK), four Drosophila proteins (Stardust, Varicose, CASK and Skiff), and other similar proteins; they all contain one each of the core of three domains characteristic of MAGUK proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, most members except for MPP1 contain N-terminal L27 domains and some also contain a Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. CASK has an additional calmodulin-dependent kinase (CaMK)-like domain at the N-terminus. Members of this subfamily are scaffolding proteins that play important roles in regulating and establishing cell polarity, cell adhesion, and synaptic targeting and transmission, among others. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
29877 212797 cd11863 SH3_CACNB Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta. Voltage-dependent calcium channels (Ca(V)s) are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29878 212798 cd11864 SH3_PEX13_eumet Src Homology 3 domain of eumetazoan Peroxisomal biogenesis factor 13. PEX13 is a peroxin and is required for protein import into the peroxisomal matrix and membrane. It is an integral membrane protein that is essential for the localization of PEX14 and the import of proteins containing the peroxisome matrix targeting signals, PTS1 and PTS2. Mutations of the PEX13 gene in humans lead to a wide range of peroxisome biogenesis disorders (PBDs), the most severe of which is known as Zellweger syndrome (ZS), a severe multisystem disorder characterized by hypotonia, psychomotor retardation, and neuronal migration defects. PEX13 contains two transmembrane regions and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29879 212799 cd11865 SH3_Nbp2-like Src Homology 3 domain of Saccharomyces cerevisiae Nap1-binding protein 2 and similar fungal proteins. This subfamily includes Saccharomyces cerevisiae Nbp2 (Nucleosome assembly protein 1 (Nap1)-binding protein 2), Schizosaccharomyces pombe Skb5, and similar proteins. Nbp2 interacts with Nap1, which is essential for maintaining proper nucleosome structures in transcription and replication. It is also the binding partner of the yeast type II protein phosphatase Ptc1p and serves as a scaffolding protein that brings seven kinases in close contact to Ptc1p. Nbp2 plays a role in many cell processes including organelle inheritance, mating hormone response, cell wall stress, mitotic cell growth at elevated temperatures, and high osmolarity. Skb5 interacts with the p21-activated kinase (PAK) homolog Shk1, which is critical for fission yeast cell viability. Skb5 activates Shk1 and plays a role in regulating cell morphology and growth under hypertonic conditions. Nbp2 and Skb5 contain an SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29880 212800 cd11866 SH3_SKAP1-like Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 1 and similar proteins. This subfamily is composed of SKAP1, SKAP2, and similar proteins. SKAP1 and SKAP2 are immune cell-specific adaptor proteins that play roles in T- and B-cell adhesion, respectively, and are thus important in the migration of T- and B-cells to sites of inflammation and for movement during T-cell conjugation with antigen-presenting cells. Both SKAP1 and SKAP2 bind to ADAP (adhesion and degranulation-promoting adaptor protein), among many other binding partners. They contain a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. The SH3 domain of SKAP1 is necessary for its ability to regulate T-cell conjugation with antigen-presenting cells and the formation of LFA-1 clusters. SKAP1 binds primarily to a proline-rich region of ADAP through its SH3 domain; its degradation is regulated by ADAP. A secondary interaction occurs via the ADAP SH3 domain and the RKxxYxxY motif in SKAP1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29881 212801 cd11867 hSH3_ADAP Helically extended Src Homology 3 domain of Adhesion and Degranulation-promoting Adaptor Protein. ADAP, also called Fyn T-binding protein (FYB) or SLP-76-associated protein (SLAP), is expressed mainly in hematopoietic cells but not in B cells. It is required for the proliferation of mature T-cells and plays an important role in T-cell activation, TCR-induced integrin clustering, and T-cell adhesion. ADAP has been shown to bind many partners including SLP-76, Fyn, Src, SKAP1, SKAP2, dynein, Ena/VASP, Carma1, among others. It is connected to the cytoskeleton via its binding to Ena and VASP, which impacts actin cytoskeletal remodeling upon TCR ligation. The SH3 domain of ADAP adopts an altered fold referred to as a helically extended SH3 (hSH3) domain characterized by clusters of positive charges. The hSH3 domain can no longer bind conventional proline-rich peptides; instead, it functions as a novel lipid interaction domain and can bind acidic lipids such as phosphatidylserine, phosphatidylinositol, phosphatidic acid, and polyphosphoinositides. 77
29882 212802 cd11869 SH3_p40phox Src Homology 3 domain of the p40phox subunit of NADPH oxidase. p40phox, also called Neutrophil cytosol factor 4 (NCF-4), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p40phox positively regulates NADPH oxidase in both phosphatidylinositol-3-phosphate (PI3P)-dependent and PI3P-independent manner. It contains an N-terminal PX domain, a central SH3 domain, and a C-terminal PB1 domain that interacts with p67phox. The SH3 domain of p40phox binds to canonical polyproline and noncanonical motifs at the C-terminus of p47phox. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29883 212803 cd11870 SH3_p67phox-like_C C-terminal Src Homology 3 domain of the p67phox subunit of NADPH oxidase and similar proteins. This subfamily is composed of p67phox, NADPH oxidase activator 1 (Noxa1), and similar proteins. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), and Noxa1 are homologs and are the cytosolic subunits of the phagocytic (Nox2) and nonphagocytic (Nox1) NADPH oxidase complexes, respectively. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox and Noxa1 play regulatory roles. p67phox contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. Noxa1 has a similar domain architecture except it is lacking the N-terminal SH3 domain. The TPR domain of both binds activated GTP-bound Rac, while the C-terminal SH3 domain of p67phox and Noxa1 binds the polyproline motif found at the C-terminus of p47phox and Noxo1, respectively. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29884 212804 cd11871 SH3_p67phox_N N-terminal (or first) Src Homology 3 domain of the p67phox subunit of NADPH oxidase. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox plays a regulatory role and contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. It binds, via its C-terminal SH3 domain, to a proline-rich region of p47phox and upon activation, this complex assembles with flavocytochrome b558, the Nox2-p22phox heterodimer. Concurrently, RacGTP translocates to the membrane and interacts with the TPR domain of p67phox, which leads to the activation of NADPH oxidase. The PB1 domain of p67phox binds to its partner PB1 domain in p40phox, and this facilitates the assembly of p47phox-p67phox at the membrane. The N-terminal SH3 domain increases the affinity of p67phox for the oxidase complex. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29885 212805 cd11872 SH3_DOCK_AB Src Homology 3 domain of Class A and B Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. They are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1, 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-trisphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. This subfamily includes only Class A and B DOCKs, which also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. Class A/B DOCKs are mostly specific GEFs for Rac, except Dock4 which activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. The SH3 domain of class A/B DOCKs has been shown to bind Elmo, a scaffold protein that promotes GEF activity of DOCKs by releasing DHR-2 autoinhibition by the intramolecular SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29886 212806 cd11873 SH3_CD2AP-like_1 First Src Homology 3 domain (SH3A) of CD2-associated protein and similar proteins. This subfamily is composed of the first SH3 domain (SH3A) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3A of both proteins binds to an atypical PXXXPR motif at the C-terminus of Cbl and the cytoplasmic domain of the cell adhesion protein CD2. CIN85 SH3A binds to internal proline-rich motifs within the proline-rich region; this intramolecular interaction serves as a regulatory mechanism to keep CIN85 in a closed conformation, preventing the recruitment of other proteins. CIN85 SH3A has also been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29887 212807 cd11874 SH3_CD2AP-like_2 Second Src Homology 3 domain (SH3B) of CD2-associated protein and similar proteins. This subfamily is composed of the second SH3 domain (SH3B) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3B of both proteins has been shown to bind to Cbl. In the case of CD2AP, its SH3B binds to Cbl at a site distinct from the c-Cbl/SH3A binding site. The CIN85 SH3B also binds ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29888 212808 cd11875 SH3_CD2AP-like_3 Third Src Homology 3 domain (SH3C) of CD2-associated protein and similar proteins. This subfamily is composed of the third SH3 domain (SH3C) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3C of both proteins has been shown to bind to ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29889 212809 cd11876 SH3_MLK Src Homology 3 domain of Mixed Lineage Kinases. MLKs are Serine/Threonine Kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. Mammals have four MLKs (MLK1-4), mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29890 212810 cd11877 SH3_PIX Src Homology 3 domain of Pak Interactive eXchange factors. PIX proteins are Rho guanine nucleotide exchange factors (GEFs), which activate small GTPases by exchanging bound GDP for free GTP. They act as GEFs for both Cdc42 and Rac1, and have been implicated in cell motility, adhesion, neurite outgrowth, and cell polarity. Vertebrates contain two proteins from the PIX subfamily, alpha-PIX and beta-PIX. Alpha-PIX, also called ARHGEF6, is localized in dendritic spines where it regulates spine morphogenesis. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Beta-PIX plays roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29891 212811 cd11878 SH3_Bem1p_1 First Src Homology 3 domain of Bud emergence protein 1 and similar domains. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 54
29892 212812 cd11879 SH3_Bem1p_2 Second Src Homology 3 domain of Bud emergence protein 1 and similar domains. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 56
29893 212813 cd11880 SH3_Caskin Src Homology 3 domain of CASK interacting protein. Caskin proteins are multidomain adaptor proteins that contain six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. There are two Caskin proteins called Caskin1 and Caskin2. Caskin1 binds to the multidomain scaffolding protein CASK through the CaM domain in competition with Munc-interacting protein 1 (Mint1). CASK participates in one of two evolutionarily conserved tripartite complexes containing either Mint1 and Velis or Caskin1 and Velis. Caskin1 may play a role in infantile myoclonic epilepsy. There is not much known about Caskin2; despite sharing a domain architecture with Caskin1, Caskin2 does not bind CASK. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 61
29894 212814 cd11881 SH3_MYO7A Src Homology 3 domain of Myosin VIIa and similar proteins. Myo7A is an unconventional myosin that is involved in organelle transport. It is required for sensory function in both Drosophila and mammals. Mutations in the Myo7A gene cause both syndromic deaf-blindness [Usher syndrome I (USH1)] and nonsyndromic (DFNB2 and DFNA11) deafness in humans. It contains an N-terminal motor domain, light chain-binding IQ motifs, a coiled-coil region for heavy chain dimerization, and a tail consisting of a pair of MyTH4-FERM tandems separated by an SH3 domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 64
29895 212815 cd11882 SH3_GRAF-like Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase and similar proteins. This subfamily is composed of Rho GTPase activating proteins (GAPs) with similarity to GRAF. Members contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. Although vertebrates harbor four Rho GAPs in the GRAF subfamily including GRAF, GRAF2, GRAF3, and Oligophrenin-1 (OPHN1), only three are included in this model. OPHN1 contains the BAR, PH and GAP domains, but not the C-terminal SH3 domain. GRAF and GRAF2 show GAP activity towards RhoA and Cdc42. GRAF influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase. GRAF2 regulates caspase-activated p21-activated protein kinase-2. The SH3 domain of GRAF and GRAF2 binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 54
29896 212816 cd11883 SH3_Sdc25 Src Homology 3 domain of Sdc25/Cdc25 guanine nucleotide exchange factors. This subfamily is composed of the Saccharomyces cerevisiae guanine nucleotide exchange factors (GEFs) Sdc25 and Cdc25, and similar proteins. These GEFs regulate Ras by stimulating the GDP/GTP exchange on Ras. Cdc25 is involved in the Ras/PKA pathway that plays an important role in the regulation of metabolism, stress responses, and proliferation, depending on available nutrients and conditions. Proteins in this subfamily contain an N-terminal SH3 domain as well as REM (Ras exchanger motif) and RasGEF domains at the C-terminus. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 55
29897 212817 cd11884 SH3_MYO15 Src Homology 3 domain of Myosin XV. This subfamily is composed of proteins with similarity to Myosin XVa. Myosin XVa is an unconventional myosin that is critical for the normal growth of mechanosensory stereocilia of inner ear hair cells. Mutations in the myosin XVa gene are associated with nonsyndromic hearing loss. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by a SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 56
29898 212818 cd11885 SH3_SH3TC Src Homology 3 domain of SH3 domain and tetratricopeptide repeat-containing (SH3TC) proteins and similar domains. This subfamily is composed of vertebrate SH3TC proteins and hypothetical fungal proteins containing BAR and SH3 domains. Mammals contain two SH3TC proteins, SH3TC1 and SH3TC2. The function of SH3TC1 is unknown. SH3TC2 is localized in Schwann cells in the peripheral nervous system, where it interacts with Rab11 and plays a role in peripheral nerve myelination. Mutations in SH3TC2 are associated with Charcot-Marie-Tooth disease type 4C, a severe hereditary peripheral neuropathy with symptoms that include progressive scoliosis, delayed age of walking, muscular atrophy, distal weakness, and reduced nerve conduction velocity. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 55
29899 212819 cd11886 SH3_BOI Src Homology 3 domain of fungal BOI-like proteins. This subfamily includes the Saccharomyces cerevisiae proteins BOI1 and BOI2, and similar proteins. They contain an N-terminal SH3 domain, a Sterile alpha motif (SAM), and a Pleckstrin homology (PH) domain at the C-terminus. BOI1 and BOI2 interact with the SH3 domain of Bem1p, a protein involved in bud formation. They promote polarized cell growth and participate in the NoCut signaling pathway, which is involved in the control of cytokinesis. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 55
29900 212820 cd11887 SH3_Bbc1 Src Homology 3 domain of Bbc1 and similar domains. This subfamily is composed of Saccharomyces cerevisiae Bbc1p, also called Mti1p (Myosin tail region-interacting protein), and similar proteins. Bbc1p interacts with and regulates type I myosins in yeast, Myo3p and Myo5p, which are involved in actin cytoskeletal reorganization. It also binds and inhibits Las17, a WASp family protein that functions as an activator of the Arp2/3 complex. Bbc1p contains an N-terminal SH3 domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 60
29901 212821 cd11888 SH3_ARHGAP9_like Src Homology 3 domain of Rho GTPase-activating protein 9 and similar proteins. This subfamily is composed of Rho GTPase-activating proteins including mammalian ARHGAP9, and vertebrate ARHGAPs 12 and 27. RhoGAPs (or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP9 functions as a GAP for Rac and Cdc42, but not for RhoA. It negatively regulates cell migration and adhesion. It also acts as a docking protein for the MAP kinases Erk2 and p38alpha, and may facilitate cross-talk between the Rho GTPase and MAPK pathways to control actin remodeling. ARHGAP27, also called CAMGAP1, shows GAP activity towards Rac1 and Cdc42. It binds the adaptor protein CIN85 and may play a role in clathrin-mediated endocytosis. ARHGAP12 has been shown to display GAP activity towards Rac1. It plays a role in regulating HGF-driven cell growth and invasiveness. ARHGAPs in this subfamily contain SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 54
29902 212822 cd11889 SH3_Cyk3p-like Src Homology 3 domain of Cytokinesis protein 3 and similar proteins. Cytokinesis protein 3 (Cyk3 or Cyk3p) is a component of the actomyosin ring independent cytokinesis pathway in yeast. It interacts with Inn1 and facilitates its recruitment to the bud neck, thereby promoting cytokinesis. Cyk3p contains an N-terminal SH3 domain and a C-terminal transglutaminase-like domain. The Cyk3p SH3 domain binds to the C-terminal proline-rich region of Inn1. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 53
29903 212823 cd11890 MIA Melanoma Inhibitory Activity protein. MIA is a single domain protein that adopts a Src Homology 3 (SH3) domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. It binds peptide ligands with sequence similarity to type III human fibronectin repeats. 98
29904 212824 cd11891 MIAL Melanoma Inhibitory Activity-Like protein. MIAL is specifically expressed in the cochlea and the vestibule of the inner ear and may contribute to inner ear dysfunction in humans. MIAL is a member of the recently identified family that also includes MIA, MIA2, and MIA3 (also called TANGO); MIA is the most studied member of the family. MIA is a single domain protein that adopts a Src Homology 3 (SH3) domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. 83
29905 212825 cd11892 SH3_MIA2 Src Homology 3 domain of Melanoma Inhibitory Activity 2 protein. MIA2 is expressed specifically in hepatocytes and its expression is controlled by hepatocyte nuclear factor 1 binding sites in the MIA2 promoter. It inhibits the growth and invasion of hepatocellular carcinomas (HCC) and may act as a tumor suppressor. A mutation in MIA2 in mice resulted in reduced cholesterol and triglycerides. Since MIA2 localizes to ER exit sites, it may function as an ER-to-Golgi trafficking protein that regulates lipid metabolism. MIA2 contains an N-terminal SH3-like domain, similar to MIA. It is a member of the recently identified family that also includes MIA, MIAL, and MIA3 (also called TANGO). MIA is a single domain protein that adopts a SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. 73
29906 212826 cd11893 SH3_MIA3 Src Homology 3 domain of Melanoma Inhibitory Activity 3 protein. MIA3, also called TANGO or TANGO1, acts as a tumor suppressor of malignant melanoma. It is downregulated or lost in melanoma cell lines. Unlike other MIA family members, MIA3 is widely expressed except in hematopoietic cells. MIA3 is an ER resident transmembrane protein that is required for the loading of collagen VII into transport vesicles. SNPs in the MIA3 gene have been associated with coronary arterial disease and myocardial infarction. MIA3 contains an N-terminal SH3-like domain, similar to MIA. It is a member of the recently identified family that also includes MIA, MIAL, and MIA2. MIA is a single domain protein that adopts a SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. 73
29907 212827 cd11894 SH3_FCHSD2_2 Second Src Homology 3 domain of FCH and double SH3 domains protein 2. FCHSD2 has a domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. It has only been characterized in silico and its function is unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29908 212828 cd11895 SH3_FCHSD1_2 Second Src Homology 3 domain of FCH and double SH3 domains protein 1. FCHSD1 has a domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. It has only been characterized in silico and its function is unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29909 212829 cd11896 SH3_SNX33 Src Homology 3 domain of Sorting Nexin 33. SNX33 interacts with Wiskott-Aldrich syndrome protein (WASP) and plays a role in the maintenance of cell shape and cell cycle progression. It modulates the shedding and endocytosis of cellular prion protein (PrP(c)) and amyloid precursor protein (APP). SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX33 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29910 212830 cd11897 SH3_SNX18 Src Homology 3 domain of Sorting nexin 18. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. It binds FIP5 and is required for apical lumen formation. It may also play a role in axonal elongation. SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX18 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29911 212831 cd11898 SH3_SNX9 Src Homology 3 domain of Sorting nexin 9. Sorting nexin 9 (SNX9), also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization. SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX9 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29912 212832 cd11899 SH3_Nck2_1 First Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The first SH3 domain of Nck2 binds the PxxDY sequence in the CD3e cytoplasmic tail; this binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29913 212833 cd11900 SH3_Nck1_1 First Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The first SH3 domain of Nck1 binds the PxxDY sequence in the CD3e cytoplasmic tail; this binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29914 212834 cd11901 SH3_Nck1_2 Second Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29915 212835 cd11902 SH3_Nck2_2 Second Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29916 212836 cd11903 SH3_Nck2_3 Third Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29917 212837 cd11904 SH3_Nck1_3 Third Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29918 212838 cd11905 SH3_Tec Src Homology 3 domain of Tec (Tyrosine kinase expressed in hepatocellular carcinoma). Tec is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. It is more widely-expressed than other Tec subfamily kinases. Tec is found in endothelial cells, both B- and T-cells, and a variety of myeloid cells including mast cells, erythroid cells, platelets, macrophages and neutrophils. Tec is a key component of T-cell receptor (TCR) signaling, and is important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29919 212839 cd11906 SH3_BTK Src Homology 3 domain of Bruton's tyrosine kinase. BTK is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain with proline-rich and zinc-binding regions. Btk is expressed in B-cells, and a variety of myeloid cells including mast cells, platelets, neutrophils, and dendritic cells. It interacts with a variety of partners, from cytosolic proteins to nuclear transcription factors, suggesting a diversity of functions. Stimulation of a diverse array of cell surface receptors, including antigen engagement of the B-cell receptor (BCR), leads to PH-mediated membrane translocation of Btk and subsequent phosphorylation by Src kinase and activation. Btk plays an important role in the life cycle of B-cells including their development, differentiation, proliferation, survival, and apoptosis. Mutations in Btk cause the primary immunodeficiency disease, X-linked agammaglobulinaemia (XLA) in humans. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29920 212840 cd11907 SH3_TXK Src Homology 3 domain of TXK, also called Resting lymphocyte kinase (Rlk). TXK is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal cysteine-rich region. Rlk is expressed in T-cells and mast cell lines, and is a key component of T-cell receptor (TCR) signaling. It is important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29921 212841 cd11908 SH3_ITK Src Homology 3 domain of Interleukin-2-inducible T-cell Kinase. ITK (also known as Tsk or Emt) is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. ITK is expressed in T-cells and mast cells, and is important in their development and differentiation. Of the three Tec kinases expressed in T-cells, ITK plays the predominant role in T-cell receptor (TCR) signaling. It is activated by phosphorylation upon TCR crosslinking and is involved in the pathway resulting in phospholipase C-gamma1 activation and actin polymerization. It also plays a role in the downstream signaling of the T-cell costimulatory receptor CD28, the T-cell surface receptor CD2, and the chemokine receptor CXCR4. In addition, ITK is crucial for the development of T-helper(Th)2 effector responses. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29922 212842 cd11909 SH3_PI3K_p85beta Src Homology 3 domain of the p85beta regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. In addition to regulating the p110 subunit, p85beta binds CD28 and may be involved in the activation and differentiation of antigen-stimulated T cells. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 74
29923 212843 cd11910 SH3_PI3K_p85alpha Src Homology 3 domain of the p85alpha regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. In addition to regulating the p110 subunit, p85alpha interacts with activated FGFR3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 75
29924 212844 cd11911 SH3_CIP4-like Src Homology 3 domain of Cdc42-Interacting Protein 4. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. Members of this subfamily typically contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of CIP4 associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29925 212845 cd11912 SH3_Bzz1_1 First Src Homology 3 domain of Bzz1 and similar domains. Bzz1 (or Bzz1p) is a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Together with other proteins, it induces membrane scission in yeast. Bzz1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. This model represents the first C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29926 212846 cd11913 SH3_BAIAP2L1 Src Homology 3 domain of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1, also called Insulin Receptor Tyrosine Kinase Substrate (IRTKS). BAIAP2L1 or IRTKS is widely expressed, serves as a substrate for the insulin receptor, and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. BAIAP2L1 expression leads to the formation of short actin bundles, distinct from filopodia-like protrusions induced by the expression of the related protein IRSp53. IRTKS mediates the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. It contains an N-terminal IMD or Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The SH3 domain of IRTKS has been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29927 212847 cd11914 SH3_BAIAP2L2 Src Homology 3 domain of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2. BAIAP2L2 co-localizes with clathrin plaques but its function has not been determined. It contains an N-terminal IMD or Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The related proteins, BAIAP2L1 and IRSp53, function as regulators of membrane dynamics and the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29928 212848 cd11915 SH3_Irsp53 Src Homology 3 domain of Insulin Receptor tyrosine kinase Substrate p53. IRSp53 is also known as BAIAP2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2). It is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. One variant (T-form) is expressed exclusively in human breast cancer cells. The gene encoding IRSp53 is a putative susceptibility gene for Gilles de la Tourette syndrome. IRSp53 can also mediate the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. It contains an N-terminal IMD, a CRIB (Cdc42 and Rac interactive binding motif), an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The SH3 domain of IRSp53 has been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29929 212849 cd11916 SH3_Sorbs1_3 Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29930 212850 cd11917 SH3_Sorbs2_3 Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesion where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated to play roles in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
29931 212851 cd11918 SH3_Vinexin_3 Third (or C-terminal) Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29932 212852 cd11919 SH3_Sorbs1_1 First Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29933 212853 cd11920 SH3_Sorbs2_1 First Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesion where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated to play roles in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29934 212854 cd11921 SH3_Vinexin_1 First Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29935 212855 cd11922 SH3_Sorbs1_2 Second Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29936 212856 cd11923 SH3_Sorbs2_2 Second Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesion where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated to play roles in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29937 212857 cd11924 SH3_Vinexin_2 Second Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29938 212858 cd11925 SH3_SH3RF3_3 Third Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29939 212859 cd11926 SH3_SH3RF1_3 Third Src Homology 3 domain of SH3 domain containing ring finger 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29940 212860 cd11927 SH3_SH3RF1_1 First Src Homology 3 domain of SH3 domain containing ring finger protein 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29941 212861 cd11928 SH3_SH3RF3_1 First Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29942 212862 cd11929 SH3_SH3RF2_1 First Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29943 212863 cd11930 SH3_SH3RF1_2 Second Src Homology 3 domain of SH3 domain containing ring finger protein 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29944 212864 cd11931 SH3_SH3RF3_2 Second Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29945 212865 cd11932 SH3_SH3RF2_2 Second Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29946 212866 cd11933 SH3_Nebulin_C C-terminal Src Homology 3 domain of Nebulin. Nebulin is a giant filamentous protein (600-900 kD) that is expressed abundantly in skeletal muscle. It binds to actin thin filaments and regulates its assembly and function. Nebulin was thought to be part of a molecular ruler complex that is critical in determining the lengths of actin thin filaments in skeletal muscle since its length, which varies due to alternative splicing, correlates with the length of thin filaments in various muscle types. Recent studies indicate that nebulin regulates thin filament length by stabilizing the filaments and preventing depolymerization. Mutations in nebulin can cause nemaline myopathy, characterized by muscle weakness which can be severe and can lead to neonatal lethality. Nebulin contains an N-terminal LIM domain, many nebulin repeats/super repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29947 212867 cd11934 SH3_Lasp1_C C-terminal Src Homology 3 domain of LIM and SH3 domain protein 1. Lasp1 is a cytoplasmic protein that binds focal adhesion proteins and is involved in cell signaling, migration, and proliferation. It is overexpressed in several cancer cells including breast, ovarian, bladder, and liver. In cancer cells, it can be found in the nucleus; its degree of nuclear localization correlates with tumor size and poor prognosis. Lasp1 is a 36kD protein containing an N-terminal LIM domain, two nebulin repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29948 212868 cd11935 SH3_Nebulette_C C-terminal Src Homology 3 domain of Nebulette and LIM-nebulette (or Lasp2). Nebulette is a cardiac-specific protein that localizes to the Z-disc. It interacts with tropomyosin and is important in stabilizing actin thin filaments in cardiac muscles. Polymorphisms in the nebulette gene are associated with dilated cardiomyopathy, with some mutations resulting in severe heart failure. Nebulette is a 107kD protein that contains an N-terminal acidic region, multiple nebulin repeats, and a C-terminal SH3 domain. LIM-nebulette, also called Lasp2 (LIM and SH3 domain protein 2), is an alternatively spliced variant of nebulette. Although it shares a gene with nebulette, Lasp2 is not transcribed from a muscle-specific promoter, giving rise to its multiple tissue expression pattern with highest amounts in the brain. It can crosslink actin filaments and it affects cell spreading. Lasp2 is a 34kD protein containing an N-terminal LIM domain, three nebulin repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29949 212869 cd11936 SH3_UBASH3B Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing protein B. UBASH3B, also called Suppressor of T cell receptor Signaling (STS)-1 or T cell Ubiquitin LigAnd (TULA)-2 is an active phosphatase that is expressed ubiquitously. The phosphatase activity of UBASH3B is essential for its roles in the suppression of TCR signaling and the regulation of EGFR. It also interacts with Syk and functions as a negative regulator of platelet glycoprotein VI signaling. TULA proteins contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and to ubiquitin via UBA. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29950 212870 cd11937 SH3_UBASH3A Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing protein A. UBASH3A is also called Cbl-Interacting Protein 4 (CLIP4), T cell Ubiquitin LigAnd (TULA), or T cell receptor Signaling (STS)-2. It is only found in lymphoid cells and exhibits weak phosphatase activity. UBASH3A facilitates T cell-induced apoptosis through interaction with the apoptosis-inducing factor AIF. It is involved in regulating the level of phosphorylation of the zeta-associated protein (ZAP)-70 tyrosine kinase. TULA proteins contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and to ubiquitin via UBA. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 60
29951 212871 cd11938 SH3_ARHGEF16_26 Src homology 3 domain of the Rho guanine nucleotide exchange factors ARHGEF16 and ARHGEF26. ARHGEF16, also called ephexin-4, acts as a GEF for RhoG, activating it by exchanging bound GDP for free GTP. RhoG is a small GTPase that is a crucial regulator of Rac in migrating cells. ARHGEF16 interacts directly with the ephrin receptor EphA2 and mediates cell migration and invasion in breast cancer cells by activating RhoG. ARHGEF26, also called SGEF (SH3 domain-containing guanine exchange factor), also activates RhoG. It is highly expressed in liver and may play a role in regulating membrane dynamics. ARHGEF16 and ARHGEF26 contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29952 212872 cd11939 SH3_ephexin1 Src homology 3 domain of the Rho guanine nucleotide exchange factor, ephexin-1 (also called NGEF or ARHGEF27). Ephexin-1, also called NGEF (neuronal GEF) or ARHGEF27, activates RhoA, Rac1, and Cdc42 by exchanging bound GDP for free GTP. It is expressed mainly in the brain in a region associated with movement control. It regulates the stability of postsynaptic acetylcholine receptor (AChR) clusters and thus, plays a critical role in the maturation and neurotransmission of neuromuscular junctions. Ephexin-1 directly interacts with the ephrin receptor EphA4 and their coexpression enhances the ability of ephexin-1 to activate RhoA. It is required for normal axon growth and EphA-induced growth cone collapse. Ephexin-1 contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29953 212873 cd11940 SH3_ARHGEF5_19 Src homology 3 domain of the Rho guanine nucleotide exchange factors ARHGEF5 and ARHGEF19. ARHGEF5, also called ephexin-3 or TIM (Transforming immortalized mammary oncogene), is a potent activator of RhoA and it plays roles in regulating cell shape, adhesion, and migration. It binds to the SH3 domain of Src and is involved in regulating Src-induced podosome formation. ARHGEF19, also called ephexin-2 or WGEF (weak-similarity GEF), is highly expressed in the intestine, liver, heart and kidney. It activates RhoA, Cdc42, and Rac1, and has been shown to activate RhoA in the Wnt-PCP (planar cell polarity) pathway. It is involved in the regulation of cell polarity and cytoskeletal reorganization. ARHGEF5 and ARHGEF19 contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29954 212874 cd11941 SH3_ARHGEF37_C2 Second C-terminal Src homology 3 domain of Rho guanine nucleotide exchange factor 37. ARHGEF37 contains a RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. Its specific function is unknown. Its domain architecture is similar to the C-terminal half of DNMBP or Tuba, a cdc42-specific GEF that provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics, and plays an important role in regulating cell junction configuration. GEFs activate small GTPases by exchanging bound GDP for free GTP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29955 212875 cd11942 SH3_JIP2 Src homology 3 domain of JNK-interacting protein 2. JNK-interacting protein 2 (JIP2) is also called Mitogen-activated protein kinase 8-interacting protein 2 (MAPK8IP2) or Islet-brain-2 (IB2). It is widely expressed in the brain, where it forms complexes with fibroblast growth factor homologous factors (FHFs), which facilitates activation of the p38delta MAPK. JIP2 is enriched in postsynaptic densities and may play a role in motor and cognitive function. In addition to a JNK binding domain, JIP2 also contains SH3 and Phosphotyrosine-binding (PTB) domains. The SH3 domain of the related protein JIP1 homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29956 212876 cd11943 SH3_JIP1 Src homology 3 domain of JNK-interacting protein 1. JNK-interacting protein 1 (JIP1) is also called Islet-brain 1 (IB1) or Mitogen-activated protein kinase 8-interacting protein 1 (MAPK8IP1). It is highly expressed in neurons, where it functions as an adaptor linking motor to cargo during axonal transport. It also affects microtubule dynamics in neurons. JIP1 is also found in pancreatic beta-cells, where it is involved in regulating insulin secretion. In addition to a JNK binding domain, JIP1 also contains SH3 and Phosphotyrosine-binding (PTB) domains. Its SH3 domain homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29957 212877 cd11944 SH3_Endophilin_B2 Src homology 3 domain of Endophilin-B2. Endophilin-B2, also called SH3GLB2 (SH3-domain GRB2-like endophilin B2), is a cytoplasmic protein that interacts with the apoptosis inducer Bax. It is overexpressed in prostate cancer metastasis and has been identified as a cancer antigen with potential utility in immunotherapy. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. Endophilin-B2 forms homo- and heterodimers (with endophilin-B1) through its BAR domain. The related protein endophilin-B1 interacts with amphiphysin 1 and dynamin 1 through its SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29958 212878 cd11945 SH3_Endophilin_B1 Src homology 3 domain of Endophilin-B1. Endophilin-B1, also called Bax-interacting factor 1 (Bif-1) or SH3GLB1 (SH3-domain GRB2-like endophilin B1), is localized mainly to the Golgi apparatus. It is involved in the regulation of many biological events including autophagy, tumorigenesis, nerve growth factor (NGF) trafficking, neurite outgrowth, mitochondrial outer membrane dynamics, and cell death. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. Endophilin-B1 forms homo- and heterodimers (with endophilin-B2) through its BAR domain. It interacts with amphiphysin 1 and dynamin 1 through its SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
29959 212879 cd11946 SH3_GRB2_N N-terminal Src homology 3 domain of Growth factor receptor-bound protein 2. GRB2 is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. It is ubiquitously expressed in all tissues throughout development and is important in cell cycle progression, motility, morphogenesis, and angiogenesis. In lymphocytes, GRB2 is associated with antigen receptor signaling components. GRB2 contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. Its N-terminal SH3 domain binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29960 212880 cd11947 SH3_GRAP2_N N-terminal Src homology 3 domain of GRB2-related adaptor protein 2. GRAP2 is also called GADS (GRB2-related adapter downstream of Shc), GrpL, GRB2L, Mona, or GRID (Grb2-related protein with insert domain). It is expressed specifically in the hematopoietic system. It plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. It also has roles in antigen-receptor and tyrosine kinase mediated signaling. GRAP2 is unique from other GRB2-like adaptor proteins in that it can be regulated by caspase cleavage. It contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The N-terminal SH3 domain of the related protein GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29961 212881 cd11948 SH3_GRAP_N N-terminal Src homology 3 domain of GRB2-related adaptor protein. GRAP is a GRB-2 like adaptor protein that is highly expressed in lymphoid tissues. It acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. It has been identified as a regulator of TGFbeta signaling in diabetic kidney tubules and may have a role in the pathogenesis of the disease. GRAP contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The N-terminal SH3 domain of the related protein GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29962 212882 cd11949 SH3_GRB2_C C-terminal Src homology 3 domain of Growth factor receptor-bound protein 2. GRB2 is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. It is ubiquitously expressed in all tissues throughout development and is important in cell cycle progression, motility, morphogenesis, and angiogenesis. In lymphocytes, GRB2 is associated with antigen receptor signaling components. GRB2 contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domain of GRB2 binds to Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, as well as to the proline-rich C-terminus of FGFR2. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29963 212883 cd11950 SH3_GRAP2_C C-terminal Src homology 3 domain of GRB2-related adaptor protein 2. GRAP2 is also called GADS (GRB2-related adapter downstream of Shc), GrpL, GRB2L, Mona, or GRID (Grb2-related protein with insert domain). It is expressed specifically in the hematopoietic system. It plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. It also has roles in antigen-receptor and tyrosine kinase mediated signaling. GRAP2 is unique from other GRB2-like adaptor proteins in that it can be regulated by caspase cleavage. It contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domain of GRAP2 binds to different motifs found in substrate peptides including the typical PxxP motif in hematopoietic progenitor kinase 1 (HPK1), the RxxK motif in SLP-76 and HPK1, and the RxxxxK motif in phosphatase-like protein HD-PTP. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29964 212884 cd11951 SH3_GRAP_C C-terminal Src homology 3 domain of GRB2-related adaptor protein. GRAP is a GRB-2 like adaptor protein that is highly expressed in lymphoid tissues. It acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. It has been identified as a regulator of TGFbeta signaling in diabetic kidney tubules and may have a role in the pathogenesis of the disease. GRAP contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domains (SH3c) of the related proteins, GRB2 and GRAP2, have been shown to bind to classical PxxP motif ligands, as well as to non-classical motifs. GRB2 SH3c binds Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, while the SH3c of GRAP2 binds to the phosphatase-like protein HD-PTP via a RxxxxK motif. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29965 212885 cd11952 SH3_iASPP Src Homology 3 (SH3) domain of Inhibitor of ASPP protein (iASPP). iASPP, also called RelA-associated inhibitor (RAI), is an oncoprotein that inhibits the apoptotic transactivation potential of p53. It is upregulated in human breast cancers expressing wild-type p53, in acute leukemias regardless of the p53 mutation status, as well as in ovarian cancer where it is associated with poor patient outcome and chemoresistance. iASPP is also a binding partner and negative regulator of p65RelA, which promotes cell proliferation and inhibits apoptosis; p65RelA has the opposite effect on cell growth compared to the p53 family. It contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of iASPP contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29966 212886 cd11953 SH3_ASPP2 Src Homology 3 (SH3) domain of Apoptosis Stimulating of p53 protein 2. ASPP2 is the full length form of the previously-identified tumor suppressor, p53-binding protein 2 (p53BP2). ASPP2 activates the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73). It plays a central role in regulating apoptosis and cell growth; ASPP2-deficient mice show postnatal death. Downregulated expression of ASPP2 is frequently found in breast tumors, lung cancer, and diffuse large B-cell lymphoma where it is correlated with a poor clinical outcome. ASPP2 contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of ASPP2 contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29967 212887 cd11954 SH3_ASPP1 Src Homology 3 domain of Apoptosis Stimulating of p53 protein 1. ASPP1, like ASPP2, activates the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73). In addition, it functions in the cytoplasm to regulate the nuclear localization of the transcriptional cofactors YAP and TAZ by inhibiting their phosphorylation; YAP and TAZ are important regulators of cell expansion, differentiation, migration, and invasion. ASPP1 is downregulated in breast tumors expressing wild-type p53. It contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of ASPP1 contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29968 212888 cd11955 SH3_srGAP1-3 Src homology 3 domain of Slit-Robo GTPase Activating Proteins 1, 2, and 3. srGAP1, also called Rho GTPase-Activating Protein 13 (ARHGAP13), is a Cdc42- and RhoA-specific GAP and is expressed later in the development of central nervous system tissues. srGAP2 is expressed in zones of neuronal differentiation. It plays a role in the regeneration of neurons and axons. srGAP3, also called MEGAP (MEntal disorder associated GTPase-Activating Protein), is a Rho GAP with activity towards Rac1 and Cdc42. It impacts cell migration by regulating actin and microtubule cytoskeletal dynamics. The association between srGAP3 haploinsufficiency and mental retardation is under debate. srGAPs are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29969 212889 cd11956 SH3_srGAP4 Src homology 3 domain of Slit-Robo GTPase Activating Protein 4. srGAP4, also called ARHGAP4, is highly expressed in hematopoietic cells and may play a role in lymphocyte differentiation. It is able to stimulate the GTPase activity of Rac1, Cdc42, and RhoA. In the nervous system, srGAP4 has been detected in differentiating neurites and may be involved in axon and dendritic growth. srGAPs are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29970 212890 cd11957 SH3_RUSC2 Src homology 3 domain of RUN and SH3 domain-containing protein 2. RUSC2, also called Iporin or Interacting protein of Rab1, is expressed ubiquitously with highest amounts in the brain and testis. It interacts with the small GTPase Rab1 and the Golgi matrix protein GM130, and may function in linking GTPases to certain intracellular signaling pathways. RUSC proteins are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29971 212891 cd11958 SH3_RUSC1 Src homology 3 domain of RUN and SH3 domain-containing protein 1. RUSC1, also called NESCA (New molecule containing SH3 at the carboxy-terminus), is highly expressed in the brain and is translocated to the nuclear membrane from the cytoplasm upon stimulation with neurotrophin. It plays a role in facilitating neurotrophin-dependent neurite outgrowth. It also interacts with NEMO (or IKKgamma) and may function in NEMO-mediated activation of NF-kB. RUSC proteins are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 51
29972 212892 cd11959 SH3_Cortactin Src homology 3 domain of Cortactin. Cortactin was originally identified as a substrate of Src kinase. It is an actin regulatory protein that binds to the Arp2/3 complex and stabilizes branched actin filaments. It is involved in cellular processes that affect cell motility, adhesion, migration, endocytosis, and invasion. It is expressed ubiquitously except in hematopoietic cells, where the homolog hematopoietic lineage cell-specific 1 (HS1) is expressed instead. Cortactin contains an N-terminal acidic domain, several copies of a repeat domain found in cortactin and HS1, a proline-rich region, and a C-terminal SH3 domain. The N-terminal region interacts with the Arp2/3 complex and F-actin, and is crucial in regulating branched actin assembly. Cortactin also serves as a scaffold and provides a bridge to the actin cytoskeleton for membrane trafficking and signaling proteins that bind to its SH3 domain. Binding partners for the SH3 domain of cortactin include dynamin2, N-WASp, MIM, FGD1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29973 212893 cd11960 SH3_Abp1_eu Src homology 3 domain of eumetazoan Actin-binding protein 1. Abp1, also called drebrin-like protein, is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a helical domain, and a C-terminal SH3 domain. Mammalian Abp1, unlike yeast Abp1, does not contain an acidic domain that interacts with the Arp2/3 complex. It regulates actin dynamics indirectly by interacting with dynamin and WASP family proteins. Abp1 deficiency causes abnormal organ structure and function of the spleen, heart, and lung of mice. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29974 212894 cd11961 SH3_Abp1_fungi_C2 Second C-terminal Src homology 3 domain of Fungal Actin-binding protein 1. Abp1 is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a central proline-rich region, and a C-terminal SH3 domain (many yeast Abp1 proteins contain two C-terminal SH3 domains). Yeast Abp1 also contains two acidic domains that bind directly to the Arp2/3 complex, which is required to initiate actin polymerization. The SH3 domain of yeast Abp1 binds and localizes the kinases, Ark1p and Prk1p, which facilitate actin patch disassembly following vesicle internalization. It also mediates the localization to the actin patch of the synaptojanin-like protein, Sjl2p, which plays a key role in endocytosis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29975 212895 cd11962 SH3_Abp1_fungi_C1 First C-terminal Src homology 3 domain of Fungal Actin-binding protein 1. Abp1 is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a central proline-rich region, and a C-terminal SH3 domain (many yeast Abp1 proteins contain two C-terminal SH3 domains). Yeast Abp1 also contains two acidic domains that bind directly to the Arp2/3 complex, which is required to initiate actin polymerization. The SH3 domain of yeast Abp1 binds and localizes the kinases, Ark1p and Prk1p, which facilitate actin patch disassembly following vesicle internalization. It also mediates the localization to the actin patch of the synaptojanin-like protein, Sjl2p, which plays a key role in endocytosis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29976 212896 cd11963 SH3_STAM2 Src homology 3 domain of Signal Transducing Adaptor Molecule 2. STAM2, also called EAST (Epidermal growth factor receptor-associated protein with SH3 and TAM domain) or Hbp (Hrs binding protein), is part of the endosomal sorting complex required for transport (ESCRT-0). It plays a role in sorting mono-ubiquitinated endosomal cargo for trafficking to the lysosome for degradation. It is also involved in the regulation of exocytosis. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29977 212897 cd11964 SH3_STAM1 Src homology 3 domain of Signal Transducing Adaptor Molecule 1. STAM1 is part of the endosomal sorting complex required for transport (ESCRT-0) and is involved in sorting ubiquitinated cargo proteins from the endosome. It may also be involved in the regulation of IL2 and GM-CSF mediated signaling, and has been implicated in neural cell survival. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29978 212898 cd11965 SH3_ASAP1 Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing protein 1. ASAP1 is also called DDEF1 (Development and Differentiation Enhancing Factor 1), AMAP1, centaurin beta-4, or PAG2. It is an Arf GTPase activating protein (GAP) with activity towards Arf1 and Arf5 but not Arf6. However, it has been shown to bind GTP-Arf6 stably without GAP activity. It has been implicated in cell growth, migration, and survival, as well as in tumor invasion and malignancy. It binds paxillin and cortactin, two components of invadopodia which are essential for tumor invasiveness. It also binds focal adhesion kinase (FAK) and the SH2/SH3 adaptor CrkL. ASAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29979 212899 cd11966 SH3_ASAP2 Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing protein 2. ASAP2 is also called DDEF2 (Development and Differentiation Enhancing Factor 2), AMAP2, centaurin beta-3, or PAG3. It mediates the functions of Arf GTPases via dual mechanisms: it exhibits GTPase activating protein (GAP) activity towards class I (Arf1) and II (Arf5) Arfs; and it binds class III Arfs (GTP-Arf6) stably without GAP activity. It binds paxillin and is implicated in Fcgamma receptor-mediated phagocytosis in macrophages and in cell migration. ASAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29980 212900 cd11967 SH3_SASH1 Src homology 3 domain of SAM And SH3 Domain Containing Protein 1. SASH1 is a potential tumor suppressor in breast and colon cancer. Its decreased expression is associated with aggressive tumor growth, metastasis, and poor prognosis. It is widely expressed in normal tissues (except lymphocytes and dendritic cells) and is localized in the nucleus and the cytoplasm. SASH1 interacts with the oncoprotein cortactin and is important in cell migration and adhesion. It is a member of the SLY family of proteins, which are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
29981 212901 cd11968 SH3_SASH3 Src homology 3 domain of Sam And SH3 Domain Containing Protein 3. SASH3, also called SLY/SLY1 (SH3-domain containing protein expressed in lymphocytes), is expressed exclusively in lymphocytes and is essential in the full activation of adaptive immunity. It is involved in the signaling of T cell receptors. It was the first described member of the SLY family of proteins, which are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29982 212902 cd11969 SH3_PLCgamma2 Src homology 3 domain of Phospholipase C (PLC) gamma 2. PLCgamma2 is primarily expressed in haematopoietic cells, specifically in B cells. It is activated by tyrosine phosphorylation by B cell receptor (BCR) kinases and is recruited to the plasma membrane where its substrate is located. It is required in pre-BCR signaling and in the maturation of B cells. PLCs catalyze the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG). Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
29983 212903 cd11970 SH3_PLCgamma1 Src homology 3 domain of Phospholipase C (PLC) gamma 1. PLCgamma1 is widely expressed and is essential in growth and development. It is activated by the TrkA receptor tyrosine kinase and functions as a key regulator of cell differentiation. It is also the predominant PLCgamma in T cells and is required for T cell and NK cell function. PLCs catalyze the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG). Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. The SH3 domain of PLCgamma1 directly interacts with dynamin-1 and can serve as a guanine nucleotide exchange factor (GEF). It also interacts with Cbl, inhibiting its phosphorylation and activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 60
29984 212904 cd11971 SH3_Abi1 Src homology 3 domain of Abl Interactor 1. Abi1, also called e3B1, is a central regulator of actin cytoskeletal reorganization through interactions with many protein complexes. It is part of WAVE, a nucleation-promoting factor complex, that links Rac1 activation to actin polymerization causing lamellipodia protrusion at the plasma membrane. Abi1 interacts with formins to promote protrusions at the leading edge of motile cells. It also is a target of alpha4 integrin, regulating membrane protrusions at sites of integrin engagement. Abi proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
29985 212905 cd11972 SH3_Abi2 Src homology 3 domain of Abl Interactor 2. Abi2 is highly expressed in the brain and eye. It regulates actin cytoskeletal reorganization at adherens junctions and dendritic spines, which is important in cell morphogenesis, migration, and cognitive function. Mice deficient in Abi2 show defects in orientation and migration of lens fibers, neuronal migration, dendritic spine morphology, as well as deficits in learning and memory. Abi proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
29986 212906 cd11973 SH3_ASEF Src homology 3 domain of APC-Stimulated guanine nucleotide Exchange Factor. ASEF, also called ARHGEF4, exists in an autoinhibited form and is activated upon binding of the tumor suppressor APC (adenomatous polyposis coli). GEFs activate small GTPases by exchanging bound GDP for free GTP. ASEF can activate Rac1 or Cdc42. Truncated ASEF, which is found in colorectal cancers, is constitutively active and has been shown to promote angiogenesis and cancer cell migration. ASEF contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. In its autoinhibited form, the SH3 domain of ASEF forms an extensive interface with the DH and PH domains, blocking the Rac binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 73
29987 212907 cd11974 SH3_ASEF2 Src homology 3 domain of APC-Stimulated guanine nucleotide Exchange Factor 2. ASEF2, also called Spermatogenesis-associated protein 13 (SPATA13), is a GEF that localizes with actin at the leading edge of cells and is important in cell migration and adhesion dynamics. GEFs activate small GTPases by exchanging bound GDP for free GTP. ASEF2 can activate both Rac1 and Cdc42, but only Rac1 activation is necessary for increased cell migration and adhesion turnover. Together with APC (adenomatous polyposis coli) and Neurabin2, a scaffold protein that binds F-actin, it is involved in regulating HGF-induced cell migration. ASEF2 contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29988 212908 cd11975 SH3_ARHGEF9 Src homology 3 domain of the Rho guanine nucleotide exchange factor ARHGEF9. ARHGEF9, also called PEM2 or collybistin, selectively activates Cdc42 by exchanging bound GDP for free GTP. It is highly expressed in the brain and it interacts with gephyrin, a postsynaptic protein associated with GABA and glycine receptors. Mutations in the ARHGEF9 gene cause X-linked mental retardation with associated features like seizures, hyper-anxiety, aggressive behavior, and sensory hyperarousal. ARHGEF9 contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29989 212909 cd11976 SH3_VAV1_2 C-terminal (or second) Src homology 3 domain of VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The C-terminal SH3 domain of Vav1 interacts with a wide variety of proteins including cytoskeletal regulators (zyxin), RNA-binding proteins (Sam68), transcriptional regulators, viral proteins, and dynamin 2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
29990 212910 cd11977 SH3_VAV2_2 C-terminal (or second) Src homology 3 domain of VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
29991 212911 cd11978 SH3_VAV3_2 C-terminal (or second) Src homology 3 domain of VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. It has been implicated to function in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
29992 212912 cd11979 SH3_VAV1_1 First Src homology 3 domain of VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The first SH3 domain of Vav1 has been shown to bind the adaptor protein Grb2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 63
29993 212913 cd11980 SH3_VAV2_1 First Src homology 3 domain of VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 60
29994 212914 cd11981 SH3_VAV3_1 First Src homology 3 domain of VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. It has been implicated to function in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
29995 212915 cd11982 SH3_Shank1 Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 1. Shank1, also called SSTRIP (Somatostatin receptor-interacting protein), is a brain-specific protein that plays a role in the construction of postsynaptic density (PSD) and the maturation of dendritic spines. Mice deficient in Shank1 show altered PSD composition, thinner PSDs, smaller dendritic spines, and weaker basal synaptic transmission, although synaptic plasticity is normal. They show increased anxiety and impaired fear memory, but also show better spatial learning. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29996 212916 cd11983 SH3_Shank2 Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 2. Shank2, also called ProSAP1 (Proline-rich synapse-associated protein 1) or CortBP1 (Cortactin-binding protein 1), is found in neurons, glia, endocrine cells, liver, and kidney. It plays a role in regulating dendritic spine volume and branching and postsynaptic clustering. Mutations in the Shank2 gene are associated with autism spectrum disorder and mental retardation. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29997 212917 cd11984 SH3_Shank3 Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 3. Shank3, also called ProSAP2 (Proline-rich synapse-associated protein 2), is widely expressed. It plays a role in the formation of dendritic spines and synapses. Haploinsufficiency of the Shank3 gene causes the 22q13 deletion/Phelan-McDermid syndrome, and variants of Shank3 have been implicated in autism spectrum disorder, schizophrenia, and intellectual disability. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
29998 212918 cd11985 SH3_Stac2_C C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing protein 2 (Stac2). Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac2 contains a single SH3 domain at the C-terminus unlike Stac1 and Stac3, which contain two C-terminal SH3 domains. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
29999 212919 cd11986 SH3_Stac3_1 First C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing protein 3 (Stac3). Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30000 212920 cd11987 SH3_Intersectin1_1 First Src homology 3 domain (or SH3A) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The first SH3 domain (or SH3A) of ITSN1 has been shown to bind many proteins including Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30001 212921 cd11988 SH3_Intersectin2_1 First Src homology 3 domain (or SH3A) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The first SH3 domain (or SH3A) of ITSN2 is expected to bind many protein partners, similar to ITSN1 which has been shown to bind Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30002 212922 cd11989 SH3_Intersectin1_2 Second Src homology 3 domain (or SH3B) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The second SH3 domain (or SH3B) of ITSN1 has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
30003 212923 cd11990 SH3_Intersectin2_2 Second Src homology 3 domain (or SH3B) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The second SH3 domain (or SH3B) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
30004 212924 cd11991 SH3_Intersectin1_3 Third Src homology 3 domain (or SH3C) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The third SH3 domain (or SH3C) of ITSN1 has been shown to bind many proteins including dynamin1/2, CIN85, c-Cbl, SHIP2, Reps1, synaptojanin-1, and WNK, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
30005 212925 cd11992 SH3_Intersectin2_3 Third Src homology 3 domain (or SH3C) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The third SH3 domain (SH3C) of ITSN2 has been shown to bind the K15 protein of Kaposi's sarcoma-associated herpesvirus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 52
30006 212926 cd11993 SH3_Intersectin1_4 Fourth Src homology 3 domain (or SH3D) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fourth SH3 domain (or SH3D) of ITSN1 has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 65
30007 212927 cd11994 SH3_Intersectin2_4 Fourth Src homology 3 domain (or SH3D) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fourth SH3 domain (or SH3D) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
30008 212928 cd11995 SH3_Intersectin1_5 Fifth Src homology 3 domain (or SH3E) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitate the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fifth SH3 domain (or SH3E) of ITSN1 has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30009 212929 cd11996 SH3_Intersectin2_5 Fifth Src homology 3 domain (or SH3E) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitate the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fifth SH3 domain (or SH3E) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30010 212930 cd11997 SH3_PACSIN3 Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons 3 (PACSIN3). PACSIN 3 or Syndapin III (Synaptic dynamin-associated protein III) is expressed ubiquitously and regulates glucose uptake in adipocytes through its role in GLUT1 trafficking. It also modulates the subcellular localization and stimulus-specific function of the cation channel TRPV4. PACSINs act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30011 212931 cd11998 SH3_PACSIN1-2 Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons 1 (PACSIN1) and PACSIN 2. PACSIN 1 or Syndapin I (Synaptic dynamin-associated protein I) is expressed specifically in the brain and is localized in neurites and synaptic boutons. It binds the brain-specific proteins dynamin I, synaptojanin, synapsin I, and neural Wiskott-Aldrich syndrome protein (nWASP), and functions as a link between the cytoskeletal machinery and synaptic vesicle endocytosis. PACSIN 1 interacts with huntingtin and may be implicated in the neuropathology of Huntington's disease. PACSIN 2 or Syndapin II is expressed ubiquitously and is involved in the regulation of tubulin polymerization. It associates with Golgi membranes and forms a complex with dynamin II which is crucial in promoting vesicle formation from the trans-Golgi network. PACSINs act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30012 212932 cd11999 SH3_PACSIN_like Src homology 3 domain of an unknown subfamily of proteins with similarity to Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. PACSINs, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30013 212933 cd12000 SH3_CASS4 Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member 4. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30014 212934 cd12001 SH3_BCAR1 Src homology 3 domain of the CAS (Crk-Associated Substrate) scaffolding protein family member, Breast Cancer Anti-estrogen Resistance 1. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 68
30015 212935 cd12002 SH3_NEDD9 Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member, Neural precursor cell Expressed, Developmentally Down-regulated 9. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30016 212936 cd12003 SH3_EFS Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member, Embryonal Fyn-associated Substrate. EFS is also called HEFS, CASS3 (Cas scaffolding protein family member 3) or SIN (Src-interacting protein). It was identified based on interactions with the Src kinases, Fyn and Yes. It plays a role in thymocyte development and acts as a negative regulator of T cell proliferation. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30017 212937 cd12004 SH3_Lyn Src homology 3 domain of Lyn Protein Tyrosine Kinase. Lyn is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lyn is expressed in B lymphocytes and myeloid cells. It exhibits both positive and negative regulatory roles in B cell receptor (BCR) signaling. Lyn, as well as Fyn and Blk, promotes B cell activation by phosphorylating ITAMs (immunoreceptor tyr activation motifs) in CD19 and in Ig components of BCR. It negatively regulates signaling by its unique ability to phosphorylate ITIMs (immunoreceptor tyr inhibition motifs) in cell surface receptors like CD22 and CD5. Lyn also plays an important role in G-CSF receptor signaling by phosphorylating a variety of adaptor molecules. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30018 212938 cd12005 SH3_Lck Src homology 3 domain of Lck Protein Tyrosine Kinase. Lck is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lck is expressed in T-cells and natural killer cells. It plays a critical role in T-cell maturation, activation, and T-cell receptor (TCR) signaling. Lck phosphorylates ITAM (immunoreceptor tyr activation motif) sequences on several subunits of TCRs, leading to the activation of different second messenger cascades. Phosphorylated ITAMs serve as binding sites for other signaling factors such as Syk and ZAP-70, leading to their activation and propagation of downstream events. In addition, Lck regulates drug-induced apoptosis by interfering with the mitochondrial death pathway. The apoptotic role of Lck is independent of its primary function in T-cell signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30019 212939 cd12006 SH3_Fyn_Yrk Src homology 3 domain of Fyn and Yrk Protein Tyrosine Kinases. Fyn and Yrk (Yes-related kinase) are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Fyn, together with Lck, plays a critical role in T-cell signal transduction by phosphorylating ITAM (immunoreceptor tyr activation motif) sequences on T-cell receptors, ultimately leading to the proliferation and differentiation of T-cells. In addition, Fyn is involved in the myelination of neurons, and is implicated in Alzheimer's and Parkinson's diseases. Yrk has been detected only in chickens. It is primarily found in neuronal and epithelial cells and in macrophages. It may play a role in inflammation and in response to injury. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30020 212940 cd12007 SH3_Yes Src homology 3 domain of Yes Protein Tyrosine Kinase. Yes (or c-Yes) is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. c-Yes kinase is the cellular homolog of the oncogenic protein (v-Yes) encoded by the Yamaguchi 73 and Esh sarcoma viruses. It displays functional overlap with other Src subfamily members, particularly Src. It also shows some unique functions such as binding to occludins, transmembrane proteins that regulate extracellular interactions in tight junctions. Yes also associates with a number of proteins in different cell types that Src does not interact with, like JAK2 and gp130 in pre-adipocytes, and Pyk2 in treated pulmonary vein endothelial cells. Although the biological function of Yes remains unclear, it appears to have a role in regulating cell-cell interactions and vesicle trafficking in polarized cells. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
30021 212941 cd12008 SH3_Src Src homology 3 domain of Src Protein Tyrosine Kinase. Src (or c-Src) is a cytoplasmic (or non-receptor) PTK and is the vertebrate homolog of the oncogenic protein (v-Src) from Rous sarcoma virus. Together with other Src subfamily proteins, it is involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. Src also plays a role in regulating cell adhesion, invasion, and motility in cancer cells, and tumor vasculature, contributing to cancer progression and metastasis. Elevated levels of Src kinase activity have been reported in a variety of human cancers. Several inhibitors of Src have been developed as anti-cancer drugs. Src is also implicated in acute inflammatory responses and osteoclast function. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30022 212942 cd12009 SH3_Blk Src homology 3 domain of Blk Protein Tyrosine Kinase. Blk is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. It is expressed specifically in B-cells and is involved in pre-BCR (B-cell receptor) signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30023 212943 cd12010 SH3_SLAP Src homology 3 domain of Src-Like Adaptor Protein. SLAP (or SLA1) modulates TCR surface expression levels as well as surface and total BCR levels. As an adaptor to c-Cbl, SLAP increases the ubiquitination, intracellular retention, and targeted degradation of the BCR complex components. SLAP has been shown to interact with the EphA receptor, EpoR, Lck, PDGFR, Syk, CD79a, c-Cbl, LAT, CD247, and Zap70, among others. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30024 212944 cd12011 SH3_SLAP2 Src homology 3 domain of Src-Like Adaptor Protein 2. SLAP2 plays a role in c-Cbl-dependent regulation of CSF1R, a tyrosine kinase important for myeloid cell growth and differentiation. It has been shown to interact with CSF1R, c-Cbl, LAT, CD247, and Zap70. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. They function in regulating the signaling, ubiquitination, and trafficking of T-cell receptor (TCR) and B-cell receptor (BCR) components. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30025 212945 cd12012 SH3_RIM-BP_2 Second Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30026 212946 cd12013 SH3_RIM-BP_3 Third Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
30027 212947 cd12014 SH3_RIM-BP_1 First Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30028 212948 cd12015 SH3_Tks_1 First Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the first SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30029 212949 cd12016 SH3_Tks_2 Second Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the second SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30030 212950 cd12017 SH3_Tks_3 Third Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the third SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30031 212951 cd12018 SH3_Tks4_4 Fourth (C-terminal) Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the fourth (C-terminal) SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30032 212952 cd12019 SH3_Tks5_4 Fourth Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the fourth SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30033 212953 cd12020 SH3_Tks5_5 Fifth (C-terminal) Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the fifth (C-terminal) SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30034 212954 cd12021 SH3_p47phox_1 First or N-terminal Src homology 3 domain of the p47phox subunit of NADPH oxidase, also called Neutrophil Cytosolic Factor 1. p47phox, or NCF1, is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), a polybasic/autoinhibitory region, and a C-terminal proline-rich region (PRR). This model characterizes the first SH3 domain (or N-SH3) of p47phox. In its inactive state, the tandem SH3 domains interact intramolecularly with the autoinhibitory region; upon activation, the tandem SH3 domains are exposed through a conformational change, resulting in their binding to the PRR of p22phox and the activation of NADPH oxidase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30035 212955 cd12022 SH3_p47phox_2 Second or C-terminal Src homology 3 domain of the p47phox subunit of NADPH oxidase, also called Neutrophil Cytosolic Factor 1. p47phox, or NCF1, is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), a polybasic/autoinhibitory region, and a C-terminal proline-rich region (PRR). This model characterizes the second SH3 domain (or C-SH3) of p47phox. In its inactive state, the tandem SH3 domains interact intramolecularly with the autoinhibitory region; upon activation, the tandem SH3 domains are exposed through a conformational change, resulting in their binding to the PRR of p22phox and the activation of NADPH oxidase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30036 212956 cd12023 SH3_NoxO1_1 First or N-terminal Src homology 3 domain of Nox Organizing protein 1. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1 is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. NoxO1 contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), and a C-terminal proline-rich region (PRR). This model characterizes the first SH3 domain (or N-SH3) of NoxO1. The tandem SH3 domains of NoxO1 interact with the PRR of p22phox, which also complexes with Nox1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30037 212957 cd12024 SH3_NoxO1_2 Second or C-terminal Src homology 3 domain of NADPH oxidase (Nox) Organizing protein 1. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1 is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. NoxO1 contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), and a C-terminal proline-rich region (PRR). This model characterizes the second SH3 domain (or C-SH3) of NoxO1. The tandem SH3 domains of NoxO1 interact with the PRR of p22phox, which also complexes with Nox1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30038 212958 cd12025 SH3_Obscurin_like Src homology 3 domain of Obscurin and similar proteins. Obscurin is a giant muscle protein that is concentrated at the peripheries of Z-disks and M-lines. It binds small ankyrin I, a component of the sarcoplasmic reticulum (SR) membrane. It is associated with the contractile apparatus through binding with titin and sarcomeric myosin. It plays important roles in the organization and assembly of the myofibril and the SR. Obscurin has been observed as alternatively-spliced isoforms. The major isoform in skeletal muscle, approximately 800 kDa in size, is composed of many adhesion modules and signaling domains. It harbors 49 Ig and 2 FNIII repeats at the N-terminus, a complex middle region with additional Ig domains, an IQ motif, and a conserved SH3 domain near RhoGEF and PH domains, and a non-modular C-terminus with phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not part of the 800 kDa form of the protein, but are part of smaller spliced products that are present in heart muscle. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 63
30039 212959 cd12026 SH3_ZO-1 Src homology 3 domain of the Tight junction protein, Zonula occludens protein 1. ZO-1 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-1 plays an essential role in embryonic development. It regulates the assembly and dynamics of the cortical cytoskeleton at cell-cell junctions. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-1 is the largest of the three ZO proteins and contains an actin-binding region and domains of unknown function designated alpha and ZU5. The SH3 domain of ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 65
30040 212960 cd12027 SH3_ZO-2 Src homology 3 domain of the Tight junction protein, Zonula occludens protein 2. ZO-2 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-2 plays an essential role in embryonic development. It is critical for the blood-testis barrier integrity and male fertility. It also regulates the expression of cyclin D1 and cell proliferation. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-2 contains an actin-binding region and a domain of unknown function designated beta. The SH3 domain of the related protein ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 63
30041 212961 cd12028 SH3_ZO-3 Src homology 3 domain of the Tight junction protein, Zonula occludens protein 3. ZO-3 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-3 is critical for epidermal barrier function. It regulates cyclin D1-dependent cell proliferation. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-3 is the smallest of the three ZO proteins. The SH3 domain of the related protein ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 65
30042 212962 cd12029 SH3_DLG3 Src Homology 3 domain of Disks Large homolog 3. DLG3, also called synapse-associated protein 102 (SAP102), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. Mutations in DLG3 cause midgestational embryonic lethality in mice and may be associated with nonsyndromic X-linked mental retardation in humans. It interacts with the NEDD4 (neural precursor cell-expressed developmentally downregulated 4) family of ubiquitin ligases and promotes apical tight junction formation. DLG3 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG3 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 67
30043 212963 cd12030 SH3_DLG4 Src Homology 3 domain of Disks Large homolog 4. DLG4, also called postsynaptic density-95 (PSD95) or synapse-associated protein 90 (SAP90), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. It is responsible for the membrane clustering and retention of many transporters and receptors such as potassium channels and PMCA4b, a P-type ion transport ATPase, among others. DLG4 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG4 contains three PDZ domains. The SH3 domain of DLG4 binds and clusters the kainate subgroup of glutamate receptors via two proline-rich sequences in their C-terminal tail. It also binds AKAP79/150 (A-kinase anchoring protein). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 66
30044 212964 cd12031 SH3_DLG1 Src Homology 3 domain of Disks Large homolog 1. DLG1, also called synapse-associated protein 97 (SAP97), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. DLG1 plays roles in regulating cell polarity, proliferation, migration, and cycle progression. It interacts with AMPA-type glutamate receptors and is critical in their maturation and delivery to synapses. It also interacts with PKCalpha and promotes wound healing. DLG1 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG1 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 67
30045 212965 cd12032 SH3_DLG2 Src Homology 3 domain of Disks Large homolog 2. DLG2, also called postsynaptic density-93 (PSD93) or Channel-associated protein of synapse-110 (chapsyn 110), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. The DLG2 delta isoform binds inwardly rectifying potassium Kir2 channels, which determine resting membrane potential in neurons. It regulates the spatial and temporal distribution of Kir2 channels within neuronal membranes. DLG2 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG2 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 74
30046 212966 cd12033 SH3_MPP7 Src Homology 3 domain of Membrane Protein, Palmitoylated 7 (or MAGUK p55 subfamily member 7). MPP7 is a scaffolding protein that binds to DLG1 and promotes tight junction formation and epithelial cell polarity. Mutations in the MPP7 gene may be associated with the pathogenesis of diabetes and extreme bone mineral density. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
30047 212967 cd12034 SH3_MPP4 Src Homology 3 domain of Membrane Protein, Palmitoylated 4 (or MAGUK p55 subfamily member 4). MPP4, also called Disks Large homolog 6 (DLG6) or Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 5 protein (ALS2CR5), is a retina-specific scaffolding protein that plays a role in organizing presynaptic protein complexes in the photoreceptor synapse, where it localizes to the plasma membrane. It is required in the proper localization of calcium ATPases and for maintenance of calcium homeostasis. MPP4 is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
30048 212968 cd12035 SH3_MPP1-like Src Homology 3 domain of Membrane Protein, Palmitoylated 1 (or MAGUK p55 subfamily member 1)-like proteins. This subfamily includes MPP1, CASK (Calcium/calmodulin-dependent Serine protein Kinase), Caenorhabditis elegans lin-2, and similar proteins. MPP1 and CASK are scaffolding proteins from the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). In addition, they also have the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. CASK and lin-2 also contain an N-terminal calmodulin-dependent kinase (CaMK)-like domain and two L27 domains. MPP1 is ubiquitously-expressed and plays roles in regulating neutrophil polarity, cell shape, hair cell development, and neural development and patterning of the retina. CASK is highly expressed in the mammalian nervous system and plays roles in synaptic protein targeting, neural development, and gene expression regulation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30049 212969 cd12036 SH3_MPP5 Src Homology 3 domain of Membrane Protein, Palmitoylated 5 (or MAGUK p55 subfamily member 5). MPP5, also called PALS1 (Protein associated with Lin7) or Nagie oko protein in zebrafish or Stardust in Drosophila, is a scaffolding protein which associates with Crumbs homolog 1 (CRB1), CRB2, or CRB3 through its PDZ domain and with PALS1-associated tight junction protein (PATJ) or multi-PDZ domain protein 1 (MUPP1) through its L27 domain. The resulting tri-protein complexes are core proteins of the Crumbs complex, which localizes at tight junctions or subapical regions, and is involved in the maintenance of apical-basal polarity in epithelial cells and the morphogenesis and function of photoreceptor cells. MPP5 is critical for the proper stratification of the retina and is also expressed in T lymphocytes where it is important for TCR-mediated activation of NFkB. Drosophila Stardust exists in several isoforms, some of which show opposing functions in photoreceptor cells, which suggests that the relative ratio of different Crumbs complexes regulates photoreceptor homeostasis. MPP5 contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 63
30050 212970 cd12037 SH3_MPP2 Src Homology 3 domain of Membrane Protein, Palmitoylated 2 (or MAGUK p55 subfamily member 2). MPP2 is a scaffolding protein that interacts with the non-receptor tyrosine kinase c-Src in epithelial cells to negatively regulate its activity and morphological function. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 59
30051 212971 cd12038 SH3_MPP6 Src Homology 3 domain of Membrane Protein, Palmitoylated 6 (or MAGUK p55 subfamily member 6). MPP6, also called Veli-associated MAGUK 1 (VAM-1) or PALS2, is a scaffolding protein that binds to Veli-1, a homolog of Caenorhabditis Lin-7. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 61
30052 212972 cd12039 SH3_MPP3 Src Homology 3 domain of Membrane Protein, Palmitoylated 3 (or MAGUK p55 subfamily member 3). MPP3 is a scaffolding protein that colocalizes with MPP5 and CRB1 at the subapical region adjacent to adherens junctions and may function in photoreceptor polarity. It interacts with some nectins and regulates their trafficking and processing. Nectins are cell-cell adhesion proteins involved in the establishment of apical-basal polarity at cell adhesion sites. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30053 212973 cd12040 SH3_CACNB2 Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta2. The beta2 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is expressed in the heart and is present in specific neuronal cells including cerebellar Purkinje cells, hippocampal pyramidal neurons, and photoreceptors. Knockout of the beta2 gene in mice results in embryonic lethality, demonstrating its importance in development. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 69
30054 212974 cd12041 SH3_CACNB1 Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta-1. The beta1 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the only beta subunit, as the beta1a variant, expressed in skeletal muscle; the beta1b variant is also widely expressed in other tissues including the heart and brain. Knockout of the beta1 gene in mice results in embryonic lethality, demonstrating its importance in development. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 68
30055 212975 cd12042 SH3_CACNB3 Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta3. The beta3 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the main beta subunit present in smooth muscles and is strongly expressed in the brain; it is predominant in the olfactory bulb, cortex, and hippocampus. It may play a role in regulating the NMDAR (N-methyl-d-aspartate receptor) activity in the hippocampus and thus, activity-dependent synaptic plasticity and cognitive behaviors. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 68
30056 212976 cd12043 SH3_CACNB4 Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta4. The beta4 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the only beta subunit expressed in the cochlea and is highly expressed in the brain, predominantly in the cerebellum. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 68
30057 212977 cd12044 SH3_SKAP1 Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 1. SKAP1, also called SKAP55 (Src kinase-associated protein of 55kDa), is an immune cell-specific adaptor protein that plays an important role in T-cell adhesion, migration, and integrin clustering. It is expressed exclusively in T-lymphocytes, mast cells, and macrophages. Binding partners include ADAP (adhesion and degranulation-promoting adaptor protein), Fyn, Riam, RapL, and RasGRP. It contains a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. The SH3 domain of SKAP1 is necessary for its ability to regulate T-cell conjugation with antigen-presenting cells and the formation of LFA-1 clusters. SKAP1 binds primarily to a proline-rich region of ADAP through its SH3 domain; its degradation is regulated by ADAP. A secondary interaction occurs via the ADAP SH3 domain and the RKxxYxxY motif in SKAP1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30058 212978 cd12045 SH3_SKAP2 Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 2. SKAP2, also called SKAP55-Related (SKAP55R) or SKAP55 homolog (SKAP-HOM or SKAP55-HOM), is an immune cell-specific adaptor protein that plays an important role in adhesion and migration of B-cells and macrophages. Binding partners include ADAP (adhesion and degranulation-promoting adaptor protein), YopH, SHPS1, and HPK1. SKAP2 has also been identified as a substrate for lymphoid-specific tyrosine phosphatase (Lyp), which has been implicated in a wide variety of autoimmune diseases. It contains a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. Like SKAP1, SKAP2 is expected to bind primarily to a proline-rich region of ADAP through its SH3 domain; its degradation may be regulated by ADAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30059 212979 cd12046 SH3_p67phox_C C-terminal (or second) Src Homology 3 domain of the p67phox subunit of NADPH oxidase. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox plays a regulatory role and contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. It binds, via its C-terminal SH3 domain, to a proline-rich region of p47phox and upon activation, this complex assembles with flavocytochrome b558, the Nox2-p22phox heterodimer. Concurrently, RacGTP translocates to the membrane and interacts with the TPR domain of p67phox, which leads to the activation of NADPH oxidase. The PB1 domain of p67phox binds to its partner PB1 domain in p40phox, and this facilitates the assembly of p47phox-p67phox at the membrane. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30060 212980 cd12047 SH3_Noxa1_C C-terminal Src Homology 3 domain of NADPH oxidase activator 1. Noxa1 is a homolog of p67phox and is a cytosolic subunit of the nonphagocytic NADPH oxidase complex Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Noxa1 is co-expressed with Nox1 in colon, stomach, uterus, prostate, and vascular smooth muscle cells, consistent with its regulatory role. It does not interact with p40phox, unlike p67phox, making Nox1 activity independent of p40phox, unlike Nox2. Noxa1 contains TPR, PB1, and C-terminal SH3 domains, but lacks the central SH3 domain that is present in p67phox. The TPR domain binds activated GTP-bound Rac. The C-terminal SH3 domain binds the polyproline motif found at the C-terminus of Noxo1, a homolog of p47phox. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30061 212981 cd12048 SH3_DOCK3_B Src Homology 3 domain of Class B Dedicator of Cytokinesis 3. Dock3, also called modifier of cell adhesion (MOCA), and presenilin binding protein (PBP), is a class B DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; Dock3 is a specific GEF for Rac. The SH3 domain of Dock3 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock3 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30062 212982 cd12049 SH3_DOCK4_B Src Homology 3 domain of Class B Dedicator of Cytokinesis 4. Dock4 is a class B DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. It may also regulate spine morphology and synapse formation. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. The SH3 domain of Dock4 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock4 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30063 212983 cd12050 SH3_DOCK2_A Src Homology 3 domain of Class A Dedicator of Cytokinesis protein 2. Dock2 is a hematopoietic cell-specific, class A DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; they are specific GEFs for Rac. The SH3 domain of Dock2 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock2 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30064 212984 cd12051 SH3_DOCK1_5_A Src Homology 3 domain of Class A Dedicator of Cytokinesis proteins 1 and 5. Dock1, also called Dock180, and Dock5 are class A DOCKs and are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. Dock5 functions upstream of Rac1 to regulate osteoclast function. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; they are specific GEFs for Rac. The SH3 domain of Dock1 binds to DHR-2 in an autoinhibitory manner; binding of Elmo to the SH3 domain of Dock1 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30065 212985 cd12052 SH3_CIN85_1 First Src Homology 3 domain (SH3A) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the first SH3 domain (SH3A) of CIN85; SH3A binds to internal proline-rich motifs within the proline-rich region. This intramolecular interaction serves as a regulatory mechanism to keep CIN85 in a closed conformation, preventing the recruitment of other proteins. SH3A has also been shown to bind ubiquitin and to an atypical PXXXPR motif at the C-terminus of Cbl and the cytoplasmic end of the cell adhesion protein CD2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30066 212986 cd12053 SH3_CD2AP_1 First Src Homology 3 domain (SH3A) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the first SH3 domain (SH3A) of CD2AP. SH3A binds to the PXXXPR motif present in c-Cbl and the cytoplasmic domain of cell adhesion protein CD2. Its interaction with CD2 anchors CD2 at sites of cell contact. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30067 212987 cd12054 SH3_CD2AP_2 Second Src Homology 3 domain (SH3B) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the second SH3 domain (SH3B) of CD2AP. SH3B binds to c-Cbl in a site (TPSSRPLR is the core binding motif) distinct from the c-Cbl/SH3A binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30068 212988 cd12055 SH3_CIN85_2 Second Src Homology 3 domain (SH3B) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the second SH3 domain (SH3B) of CIN85. SH3B has been shown to bind Cbl proline-rich peptides and ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30069 212989 cd12056 SH3_CD2AP_3 Third Src Homology 3 domain (SH3C) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the third SH3 domain (SH3C) of CD2AP. SH3C has been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30070 212990 cd12057 SH3_CIN85_3 Third Src Homology 3 domain (SH3C) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the third SH3 domain (SH3C) of CIN85. SH3C has been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 56
30071 212991 cd12058 SH3_MLK4 Src Homology 3 domain of Mixed Lineage Kinase 4. MLK4 is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The specific function of MLK4 is yet to be determined. Mutations in the kinase domain of MLK4 have been detected in colorectal cancers. MLK4 contains an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
30072 212992 cd12059 SH3_MLK1-3 Src Homology 3 domain of Mixed Lineage Kinases 1, 2, and 3. MLKs 1, 2, and 3 are Serine/Threonine Kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. Little is known about the specific function of MLK1, also called MAP3K9. It is capable of activating the c-Jun N-terminal kinase pathway. Mice lacking both MLK1 and MLK2 are viable, fertile, and have normal life spans. MLK2, also called MAP3K10, is abundant in brain, skeletal muscle, and testis. It functions upstream of the MAPK, c-Jun N-terminal kinase. It binds hippocalcin, a calcium-sensor protein that protects neurons against calcium-induced cell death. Both MLK2 and hippocalcin may be associated with the pathogenesis of Parkinson's disease. MLK3, also called MAP3K11, is highly expressed in breast cancer cells and its signaling through c-Jun N-terminal kinase has been implicated in the migration, invasion, and malignancy of cancer cells. It also functions as a negative regulator of Inhibitor of Nuclear Factor-KappaB Kinase (IKK) and thus, impacts inflammation and immunity. MLKs contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
30073 212993 cd12060 SH3_alphaPIX Src Homology 3 domain of alpha-Pak Interactive eXchange factor. Alpha-PIX, also called Rho guanine nucleotide exchange factor 6 (ARHGEF6) or Cool (Cloned out of Library)-2, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 58
30074 212994 cd12061 SH3_betaPIX Src Homology 3 domain of beta-Pak Interactive eXchange factor. Beta-PIX, also called Rho guanine nucleotide exchange factor 7 (ARHGEF7) or Cool (Cloned out of Library)-1, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30075 212995 cd12062 SH3_Caskin1 Src Homology 3 domain of CASK interacting protein 1. Caskin1 is a multidomain adaptor protein that contains six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. It is expressed at high levels in the brain and is localized in presynaptic regions. It binds to the multidomain scaffolding protein CASK through the CaMK domain in competition with Munc-interacting protein 1 (Mint1). CASK participates in one of two evolutionarily conserved tripartite complexes containing either Mint1 and Velis or Caskin1 and Velis. Caskin1 may play a role in infantile myoclonic epilepsy. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, in changing the subcellular localization of signal pathway components, and in mediating multiprotein complex assemblies. 62
30076 212996 cd12063 SH3_Caskin2 Src Homology 3 domain of CASK interacting protein 2. Caskin2 is a multidomain adaptor protein that contains six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. It shares a domain architecture with Caskin1, but does not bind CASK. The function of Caskin2 is still unknown. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 62
30077 212997 cd12064 SH3_GRAF Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase. GRAF, also called Rho GTPase activating protein 26 (ARHGAP26), Oligophrenin-1-like (OPHN1L) or GRAF1, is a GAP with activity towards RhoA and Cdc42 and is only weakly active towards Rac1. It influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase (FAK), which is a critical component of integrin signaling. It is essential for the major clathrin-independent endocytic pathway mediated by pleiomorphic membranes. GRAF contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 56
30078 212998 cd12065 SH3_GRAF2 Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase 2. GRAF2, also called Rho GTPase activating protein 10 (ARHGAP10) or PS-GAP, is a GAP with activity towards Cdc42 and RhoA. It regulates caspase-activated p21-activated protein kinase-2 (PAK-2p34). GRAF2 interacts with PAK-2p34, leading to its stabilization and decrease of cell death. It is highly expressed in skeletal muscle, and is involved in alpha-catenin recruitment at cell-cell junctions. GRAF2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 54
30079 212999 cd12066 SH3_GRAF3 Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase 3. GRAF3 is also called Rho GTPase activating protein 42 (ARHGAP42) or ARHGAP10-like. Though its function has not been characterized, it may be a GAP with activity towards RhoA and Cdc42, based on its similarity to GRAF and GRAF2. It contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF and GRAF2 binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 55
30080 213000 cd12067 SH3_MYO15A Src Homology 3 domain of Myosin XVa. Myosin XVa is an unconventional myosin that is critical for the normal growth of mechanosensory stereocilia of inner ear hair cells. Mutations in the myosin XVa gene are associated with nonsyndromic hearing loss. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by a SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 80
30081 213001 cd12068 SH3_MYO15B Src Homology 3 domain of Myosin XVb. Myosin XVb, also called KIAA1783, was named based on its similarity with myosin XVa. It is a transcribed and unprocessed pseudogene whose predicted amino acid sequence contains mutated or deleted amino acid residues that are normally conserved and important for myosin function. The related myosin XVa is important for normal growth of mechanosensory stereocilia of inner ear hair cells. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by a SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 55
30082 213002 cd12069 SH3_ARHGAP27 Src Homology 3 domain of Rho GTPase-activating protein 27. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP27, also called CAMGAP1, shows GAP activity towards Rac1 and Cdc42. It binds the adaptor protein CIN85 and may play a role in clathrin-mediated endocytosis. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 57
30083 213003 cd12070 SH3_ARHGAP12 Src Homology 3 domain of Rho GTPase-activating protein 12. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP12 has been shown to display GAP activity towards Rac1. It plays a role in regulating hepatocyte growth factor (HGF)-driven cell growth and invasiveness. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies. 60
30084 213004 cd12071 SH3_FBP17 Src Homology 3 domain of Formin Binding Protein 17. Formin Binding Protein 17 (FBP17), also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. It contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, a Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of the related protein, CIP4, associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30085 213005 cd12072 SH3_FNBP1L Src Homology 3 domain of Formin Binding Protein 1-Like. FormiN Binding Protein 1-Like (FNBP1L), also known as Toca-1 (Transducer of Cdc42-dependent actin assembly), forms a complex with neural Wiskott-Aldrich syndrome protein (N-WASP). The FNBP1L/N-WASP complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. It contains an N-terminal F-BAR domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of the related protein, CIP4, associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30086 213006 cd12073 SH3_HS1 Src homology 3 domain of Hematopoietic lineage cell-specific protein 1. HS1, also called HCLS1 (hematopoietic cell-specific Lyn substrate 1), is a cortactin homolog expressed specifically in hematopoietic cells. It is an actin regulatory protein that binds the Arp2/3 complex and stabilizes branched actin filaments. It is required for cell spreading and signaling in lymphocytes. It regulates cytoskeletal remodeling that controls lymphocyte trafficking, and it also affects tissue invasion and infiltration of leukemic B cells. Like cortactin, HS1 contains an N-terminal acidic domain, several copies of a repeat domain found in cortactin and HS1, a proline-rich region, and a C-terminal SH3 domain. The N-terminal region binds the Arp2/3 complex and F-actin, while the C-terminal region acts as an adaptor or scaffold that can connect varied proteins that bind the SH3 domain within the actin network. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30087 213007 cd12074 SH3_Tks5_1 First Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the first SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30088 213008 cd12075 SH3_Tks4_1 First Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the first SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30089 213009 cd12076 SH3_Tks4_2 Second Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the second SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30090 213010 cd12077 SH3_Tks5_2 Second Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the second SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30091 213011 cd12078 SH3_Tks4_3 Third Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the third SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 53
30092 213012 cd12079 SH3_Tks5_3 Third Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the third SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 54
30093 213013 cd12080 SH3_MPP1 Src Homology 3 domain of Membrane Protein, Palmitoylated 1 (or MAGUK p55 subfamily member 1). MPP1, also called 55 kDa erythrocyte membrane protein (p55), is a ubiquitously-expressed scaffolding protein that plays roles in regulating neutrophil polarity, cell shape, hair cell development, and neural development and patterning of the retina. It was originally identified as an erythrocyte protein that stabilizes the actin cytoskeleton to the plasma membrane by forming a complex with 4.1R protein and glycophorin C. MPP1 is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains the three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30094 213014 cd12081 SH3_CASK Src Homology 3 domain of Calcium/calmodulin-dependent Serine protein Kinase. CASK is a scaffolding protein that is highly expressed in the mammalian nervous system and plays roles in synaptic protein targeting, neural development, and gene expression regulation. CASK interacts with many different binding partners including parkin, neurexin, syndecans, calcium channel proteins, caskin, among others, to perform specific functions in different subcellular locations. Disruption of the CASK gene in mice results in neonatal lethality while mutations in the human gene have been associated with X-linked mental retardation. Drosophila CASK is associated with both pre- and postsynaptic membranes and is crucial in synaptic transmission and vesicle cycling. CASK contains an N-terminal calmodulin-dependent kinase (CaMK)-like domain, two L27 domains, followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 62
30095 240527 cd12082 MATE_like Multidrug and toxic compound extrusion family and similar proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 420
30096 213373 cd12083 DD_cGKI Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates. 48
30097 213043 cd12084 DD_R_PKA Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 37
30098 213374 cd12085 DD_cGKI-alpha Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I alpha. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing for their targeting to different subcellular compartments and intracellular substrates. cGKI-alpha specifically binds to myosin light chain phosphatase targeting subunit (MYPT1) and the regulator of G-protein signaling-2 (RGS-2). cGKI-alpha activates the phosphatase activity of MYPT1, resulting in vasorelaxation. It increases the activity of RGS-2 toward G proteins, with implications in the downstream signaling for vasoconstrictive agents. 48
30099 213375 cd12086 DD_cGKI-beta Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I beta. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing for their targeting to different subcellular compartments and intracellular substrates. cGKI-beta binds specifically to inositol triphosphate receptor-associated PKG substrate (IRAG) and the transcriptional regulator TFII-I. Phosphorylation of IRAG by cGKI-beta contributes to smooth muscle relaxation while phosphorylation of TFII-I modulates its co-activator functions for serum response factor and Smad transcription factors. 52
30100 213052 cd12087 TM_EGFR-like Transmembrane domain of the Epidermal Growth Factor Receptor family of Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER, ErbB) subfamily members include EGFR (HER1, ErbB1), HER2 (ErbB2), HER3 (ErbB3), HER4 (ErbB4), and similar proteins. They are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. They are activated by ligand-induced dimerization, resulting in the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Collectively, they can recognize a variety of ligands including EGF, TGFalpha, and neuregulins, among others. All four subfamily members can form homo- or heterodimers. HER3 contains an impaired kinase domain and depends on its heterodimerization partner for activation. EGFR subfamily members are involved in signaling pathways leading to a broad range of cellular responses including cell proliferation, differentiation, migration, growth inhibition, and apoptosis. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of EGFR family RTKs have been associated with increased breast cancer risk. 38
30101 277187 cd12088 helicase_insert_domain helicase_insert_domain. helicase_insert_domain; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex. 82
30102 277189 cd12089 Hef_ID insert domain of Archaeal Hef helicase/nuclease. Archaeal Hef helicase/nuclease, originally identified in the hyperthermophilic archaeon Pyrococcus furiosus, contains an N-terminal SF2 helicase domain and a C-terminal XPF/Mus81-type nuclease domain. Hef has been shown to process flap- or fork-DNA structures, and both the helicase and nuclease domains independently recognize branched DNA, with a strong preference for forked DNA. The SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. This domain, which is not present in all SF2 helicases, has been shown to play an important role in branched structure processing. 119
30103 277189 cd12090 MDA5_ID Insert domain of MDA5. MDA5 (melanoma-differentiation-associated gene 5, also known as IFIH1), as well as RIG-I (Retinoic acid Inducible Gene I, also known as DDX58) and LGP2 (also known as DHX58), contain two N-terminal CARD domains and a C-terminal SF2 helicase domain. They are cytoplasmic DEAD box RNA helicases acting as key innate immune pattern-recognition receptors (PRRs) that play an important role in host antiviral response by sensing incoming viral RNA. Their SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. The inserted domain is involved in conformational changes upon ligand binding. 120
30104 277190 cd12091 FANCM_ID Insert domain of FANCM and related proteins. FANCM and related proteins, like Mph1 and Fml1, are DNA junction-specific helicases/translocases that bind to and process perturbed replication forks and intermediates of homologous recombination. FANCM contains an N-terminal superfamily 2 helicase (SF2) domain, although FANCM, in contrast to other members of this family, does not exhibit DNA helicase activity. The SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. FANCM is a component of the Fanconi anaemia (FA) core complex. FA is a rare genetic disease in humans that is associated with progressive bone marrow failure, a variety of developmental abnormalities, and a high incidence of cancer. A key role of this complex is the monoubiquitination of FANCD2 and FANCI during S-phase and in response to DNA damage. The role of FANCM during this process seems to be the recruitment of the complex to chromatin. 116
30105 213053 cd12092 TM_ErbB4 Transmembrane domain of ErbB4, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. ErbB4 (HER4) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands that bind ErbB4 fall into two groups, the neuregulins (or heregulins) and some EGFR (HER1, ErbB1) ligands including betacellulin, HBEGF, and epiregulin. All four neuregulins (NRG1-4) interact with ErbB4. Upon ligand binding, ErbB4 forms homo- or heterodimers with other ErbB proteins. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB4 have been associated with increased breast cancer risk. ErbB4 is essential in embryonic development. It is implicated in mammary gland, cardiac, and neural development. As a postsynaptic receptor of NRG1, ErbB4 plays an important role in synaptic plasticity and maturation. The impairment of NRG1/ErbB4 signaling may contribute to schizophrenia. 44
30106 213054 cd12093 TM_ErbB1 Transmembrane domain of Epidermal Growth Factor Receptor or ErbB1, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER1, ErbB1) is a receptor PTK (RTK) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands for ErbB1 include EGF, heparin binding EGF-like growth factor (HBEGF), epiregulin, amphiregulin, TGFalpha, and betacellulin. Upon ligand binding, ErbB1 can form homo- or heterodimers with other EGFR/ErbB subfamily members. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB1 have been associated with increased breast cancer risk. The ErbB1 signaling pathway is one of the most important pathways regulating cell proliferation, differentiation, survival, and growth. A number of monoclonal antibodies and small molecule inhibitors have been developed that target ErbB1, including the antibodies Cetuximab and Panitumumab, which are used in combination with other therapies for the treatment of colorectal cancer and non-small cell lung carcinoma (NSCLC). The small molecule inhibitors Gefitinib (Iressa) and Erlotinib (Tarceva), already used for NSCLC, are undergoing clinical trials for other types of cancer including gastrointestinal, breast, head and neck, and bladder. 44
30107 213055 cd12094 TM_ErbB2 Transmembrane domain of ErbB2, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. ErbB2 (HER2, HER2/neu) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB2 does not bind to any known EGFR subfamily ligands, but contributes to the kinase activity of all possible heterodimers. It acts as the preferred partner of other ligand-bound EGFR proteins and functions as a signal amplifier, with the ErbB2-ErbB3 heterodimer being the most potent pair in mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB2 have been associated with increased breast cancer risk. ErbB2 plays an important role in cell development, proliferation, survival and motility. Overexpression of ErbB2 results in its activation and downstream signaling, even in the absence of ligand. ErbB2 overexpression, mainly due to gene amplification, has been shown in a variety of human cancers. Its role in breast cancer is especially well-documented. ErbB2 is up-regulated in about 25% of breast tumors and is associated with increases in tumor aggressiveness, recurrence and mortality. ErbB2 is a target for monoclonal antibodies and small molecule inhibitors, which are being developed as treatments for cancer. The first humanized antibody approved for clinical use is Trastuzumab (Herceptin), which is being used in combination with other therapies to improve the survival rates of patients with HER2-overexpressing breast cancer. 44
30108 213056 cd12095 TM_ErbB3 Transmembrane domain of ErbB3, a Protein Tyrosine Kinase. ErbB3 (HER3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. ErbB receptors are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. ErbB3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The ErbB2-ErbB3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB receptors have been associated with increased breast cancer risk. ErbB3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells. 39
30109 213044 cd12097 DD_RI_PKA Dimerization/Docking domain of the Type I Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIalpha function is required for normal development as its deletion is embryonically lethal. RIbeta is expressed highly in the brain and is associated with hippocampal function. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 44
30110 213045 cd12098 DD_R_PKA_fungi Dimerization/Docking domain of the Regulatory subunit of fungal cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. The R subunit of fungal PKA is encoded by a single gene, which is called by various names in different organisms (for example: Yarrowia lipolytica RKA1, Saccharomyces cerevisiae Bcy1, and Schizosaccharomyces pombe Cgs1). Although most characterized PKA holoenzymes are tetramers, Y. lipolytica PKA has been reported to be a dimer of RKA1 and the catalytic subunit TPK1. RKA1 is essential and promotes hyphal growth. Cgs1 is essential for sexual differentiation of S. pombe; mutants with defective Cgs1 are partially sterile. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain of metazoan R subunits dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs). The D/D domain of fungal R subunits may also serve as a dimerization domain, in the case of heterotetrameric PKAs. Fungal PKA plays a major role in controlling cell growth and metabolism in response to nutrients and stress conditions. 38
30111 213046 cd12099 DD_RII_PKA Dimerization/Docking domain of the Type II Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIalpha plays a role in the association and dissociation of PKA with the centrosome during interphase and mitosis, respectively. RIIbeta plays an important role in adipocytes and neuronal tissues. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 39
30112 213047 cd12100 DD_CABYR_SP17 Dimerization/Docking domain of the sperm fibrous sheath proteins, Calcium-Binding tYrosine-phosphorylation Regulated protein and Sperm Protein 17. CABYR and SP17 are naturally located in human sperm fibrous sheath (FS). CABYR was originally isolated from spermatozoa and was thought to be testis-specific, but has recently been observed in lung and brain tumors. It is a polymorphic calcium binding protein that is phosphorylated during capacitation. SP17 plays an important role in the interaction of sperm with the zona pellucida during fertilization. It also promotes cell-cell adhesion. SP17 is found in various human tumors of unrelated histological origin including metastatic squamous cell carcinoma, multiple myeloma, ovarian cancer, and primary nervous system tumors, among others. Both CABYR and SP17 contain an N-terminal dimerization/docking (D/D) domain with similarity to the D/D domain of the R subunit of cAMP-dependent protein kinase (PKA). The D/D domain of the R subunit dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. The D/D domains of CABYR and SP17 have been shown to bind to AKAP3, a protein that is also associated with the FS of mammalian spermatozoa. 39
30113 213048 cd12101 DD_RIalpha_PKA Dimerization/Docking domain of the Type I alpha Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIalpha is the key regulatory subunit responsible for maintaining cAMP control of the catalytic subunit. RIalpha function is required for normal development as its deletion is embryonically lethal due to failed cardiac morphogenesis. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 50
30114 213049 cd12102 DD_RIbeta_PKA Dimerization/Docking domain of the Type I beta Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIbeta is expressed highly in the brain and is associated with hippocampal function. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 54
30115 213050 cd12103 DD_RIIalpha_PKA Dimerization/Docking domain of the Type II alpha Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIalpha plays a role in the association and dissociation of PKA with the centrosome during interphase and mitosis, respectively. It is also involved in endosome-to-Golgi and Golgi-to-ER transport. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 41
30116 213051 cd12104 DD_RIIbeta_PKA Dimerization/Docking domain of the Type II beta Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIbeta plays an important role in adipocytes and neuronal tissues. Mice deficient in RIIbeta have small fat cells and are resistant to obesity, diet-induced diabetes, and alcohol-induced motor defects. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 41
30117 213031 cd12105 HmuY Bacterial proteins similar to Porphyromonas gingivalis HmuY. HmuY is a hemophore that scavenges heme from infected hosts and delivers it to the outer membrane receptor HmuR. Related but uncharacterized proteins do not appear to share the specific heme-binding site. 121
30118 213061 cd12106 PARMER_03128_N N-terminal domain of PARMER_03128. PARMER_03128 is an uncharacterized protein from Parabacteroides merdae. This model characterizes its N-terminal domain plus that of related proteins from Bacteroidetes. Structurally, they resemble domains found in streptococcal surface proteins such as SpaP. 137
30119 213982 cd12107 Hemerythrin Hemerythrin. Hemerythrin (Hr) is a non-heme diiron oxygen transport protein found in four marine invertebrate phyla including priapulida, brachiopoda, sipunculida, and annelida, as well as in protozoa. Myohemerythrin (Mhr), a hemerythrin homolog, is found in the muscle tissue of sipunculids as well as in polychaete and oligochaete annelids. In addition to oxygen transport, Mhr proteins are involved in cadmium fixation and host anti-bacterial defense. Hr and Mhr proteins have the same "four alpha helix bundle" motif and active site structure. Hr forms oligomers, the octameric form being most prevalent, while Mhr is monomeric. 113
30120 213983 cd12108 Hr-like Hemerythrin-like domain. Hemerythrin (Hr) like domains have the same four alpha helix bundle and a similar, but slightly different active site structure than hemerythrin. They are non-heme diiron binding proteins mainly found in bacteria and eukaryotes. Like Hr, they may be involved in oxygen transport or like human FBXL5 (F-box and leucine-rich repeat protein 5), a member of this group, play a role in cellular iron homeostasis. 130
30121 213984 cd12109 Hr_FBXL5 Hemerythrin-like domain of FBXL5-like proteins. Human FBXL5 (F-box and leucine-rich repeat protein 5) protein plays a role in cellular iron homeostasis. It is part of an E3 ubiquitin ligase complex that targets the iron regulatory protein IRP2 for proteasomal degradation. The FBXL5's stability is regulated by iron concentration, with its iron- and oxygen-binding hemerythrin domain acting as a ligand-dependent regulatory switch. 158
30122 213994 cd12110 PHP_HisPPase_Hisj_like Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase of Hisj like. Bacillus subtilis YtvP HisJ has strong histidinol phosphate phosphatase (HisPPase) activity. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 244
30123 213995 cd12111 PHP_HisPPase_Thermotoga_like Polymerase and Histidinol Phosphatase domain of Thermotoga like. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. Thermotoga PHP is an uncharacterized protein. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to give histidinol. The HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 226
30124 213996 cd12112 PHP_HisPPase_Chlorobi_like Polymerase and Histidinol Phosphatase domain of Chlorobi like. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. Chlorobi PHP is an uncharacterized protein. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. The HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. 235
30125 213997 cd12113 PHP_PolIIIA_DnaE3 Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III DnaE3. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases and is responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. The PolIIIA PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination, and like other PHP structures, the PolIIIA PHP exhibits a distorted (beta/alpha)7 barrel and coordinates up to 3 metals. Initially, it was proposed that the PHP region might be involved in pyrophosphate hydrolysis, but such an activity has not been found. It has been shown that the PHP of PolIIIA has a trinuclear metal complex and is capable of proofreading activity. Bacterial genome replication and DNA repair mechanisms are related to the GC content of the genome. There is a correlation between GC content variations and the dimeric combinations of PolIIIA subunits. Eubacteria can be grouped into different GC variable groups: the full-spectrum or dnaE1 group, the high-GC or dnaE2-dnaE1 group, and the low GC or polC-dnaE3 group. 283
30126 341279 cd12114 A_NRPS_TlmIV_like The adenylation domain of nonribosomal peptide synthetases (NRPS), including Streptoalloteichus tallysomycin biosynthesis genes. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the TLM biosynthetic gene cluster from Streptoalloteichus that consists of nine NRPS genes; the N-terminal module of TlmVI (NRPS-5) and the starter module of BlmVI (NRPS-5) are comprised of the acyl CoA ligase (AL) and acyl carrier protein (ACP)-like domains, which are thought to be involved in the biosynthesis of the beta-aminoalaninamide moiety. 477
30127 341280 cd12115 A_NRPS_Sfm_like The adenylation domain of nonribosomal peptide synthetases (NRPS), including Saframycin A gene cluster from Streptomyces lavendulae. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the saframycin A gene cluster from Streptomyces lavendulae which implicates the NRPS system for assembling the unusual tetrapeptidyl skeleton in an iterative manner. It also includes saframycin Mx1 produced by Myxococcus xanthus NRPS. 447
30128 341281 cd12116 A_NRPS_Ta1_like The adenylation domain of nonribosomal peptide synthetases (NRPS), including salinosporamide A polyketide synthase. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the myxovirescin (TA) antibiotic biosynthetic gene in Myxococcus xanthus; TA production plays a role in predation. It also includes the salinosporamide A polyketide synthase which is involved in the biosynthesis of salinosporamide A, a marine microbial metabolite whose chlorine atom is crucial for potent proteasome inhibition and anticancer activity. 470
30129 341282 cd12117 A_NRPS_Srf_like The adenylation domain of nonribosomal peptide synthetases (NRPS), including Bacillus subtilis termination module Surfactin (SrfA-C). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the adenylation domain of the Bacillus subtilis termination module (Surfactin domain, SrfA-C) which recognizes a specific amino acid building block, which is then activated and transferred to the terminal thiol of the 4'-phosphopantetheine (Ppan) arm of the downstream peptidyl carrier protein (PCP) domain. 483
30130 341283 cd12118 ttLC_FACS_AEE21_like Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus and Arabidopsis. This family includes fatty acyl-CoA synthetases that can activate medium to long-chain fatty acids. These enzymes catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family has been shown to activate the long-chain fatty acid myristic acid. Also included in this family are acyl activating enzymes from Arabidopsis, which contains a large number of proteins from this family with up to 63 different genes, many of which are uncharacterized. 486
30131 341284 cd12119 ttLC_FACS_AlkK_like Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus. This family includes fatty acyl-CoA synthetases that can activate medium-chain to long-chain fatty acids. They catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family activates the long-chain fatty acid, myristoyl acid, while another member in this family, the AlkK protein identified from Pseudomonas oleovorans, targets medium chain fatty acids. This family also includes uncharacterized FACS proteins. 518
30132 213376 cd12120 AMPKA_C_like C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains. This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decode specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization. 95
30133 213377 cd12121 MARK_C_like C-terminal kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinases, and similar domains. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. In yeast, MARK/Par-1 homologs are called Kin1/2 kinases. Kin1 is a membrane-associated kinase that is involved in regulating cytokinesis and the cell surface. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain. 96
30134 213378 cd12122 AMPKA_C C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha catalytic subunit. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively. The C-terminal RD of the AMPK alpha subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit. AMPK is conserved throughout evolution; the AMPK alpha subunit homologs in yeast and plants are called Snf1 and SnRK1 (Snf1 related kinase), respectively. 132
30135 381264 cd12124 Pgbs Protoglobins (Pgbs). Pgbs are single-domain globins of as yet unknown biological function. Included in this subfamily are Pgbs from the strictly anaerobic methanogen Methanosarcina acetivorans (MaPgb) and from the obligate aerobic hyperthermophile Aeropyrum pernix (ApPgb). MaPgb is a dimeric globin which in addition to the 3-on-3 helical sandwich contains an N-terminal extension. This extension, along with other Pgb-specific loops, buries the heme within the protein; two orthogonal apolar tunnels grant access of small ligand molecules to the heme. Like other globins, MaPgb can bind O2, CO and NO reversibly in vitro; however, it has an unusually low O2 dissociation rate, along with a large structural distortion of the heme moiety. CO binding to and dissociation from the heme occur through biphasic kinetics. ApPgb also contains heme, and can bind O2, CO and NO. This subfamily belongs to a family which includes the globin-coupled-sensors (GCSs) and single-domain sensor globins. It has been demonstrated that Pgbs and other single-domain globins can function as sensors, when coupled to an appropriate regulator domain. 185
30136 381265 cd12125 APC_alpha Allophycocyanin alpha subunit of the phycobilisome core. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 159
30137 271281 cd12126 APC_beta Allophycocyanin beta subunit of the phycobilisome core. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 163
30138 381266 cd12127 PE-PC-PEC_beta Beta subunits of phycocyanin, phycoerythrin and phycoerythrocyanin; phycobilisome rod components. phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This family also includes the beta subunits of Cryptophyte phycobiliproteins which represent another type of biliprotein antenna with different structure and organization. The beta subunits of cryptophyte PBPs share a high degree of sequence identity with both the alpha and beta subunits of the cyanobacterial and red algal PBPs, however the alpha cryptophyte subunits are shorter, and unrelated. There is only one type of PBP present in a single species, either phycocyanin or phycoerythrin, but not allophycocyanin. Structurally, phycoerythrin in cryptophytes is an alpha1alpha2betabeta dimer and not a trimer as in the PBS. 174
30139 381267 cd12128 PBP_PBS-LCM Phycobiliprotein-like domain of the phycobilisome core-membrane linker polypeptide. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae; they consist of a central core and radiating rods and function to harvest and channel light energy toward the photosynthetic reaction centers (RCs) within the membrane. They are comprised of phycobiliproteins or chromophorylated proteins (PBPs) maintained together by linker polypeptides. LCM is a chromophore-bearing PBS linker protein; it facilitates PBS assembly and functionally connects the PBS to the chlorophyll-containing core-complexes in the photosynthetic membrane. In addition to being a linker polypeptide that stabilizes the PBS architecture, the LCM also serves as a terminal energy acceptor. The single phycocyanobilin (PCB) chromophore of LCM is one of two terminal energy transmitters that transfer excitations from the hundreds of chromophores of the PBS to the RCs within the membrane. 172
30140 271284 cd12129 PE-PC-PEC_alpha Alpha subunits of phycoerythrin, phycocyanin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 161
30141 381268 cd12130 Apl Allophycocyanin-like globins. phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This subfamily contains allophycocyanin-like proteins (Apls), which have conserved the residues critical for chromophore interactions, but may not maintain the proper alpha-beta subunit interactions and tertiary structure of phycobiliproteins. Indeed AplA isolated from Fremyella diplosiphon was not detected in phycobilisomes. As the genes encoding Apls cluster with light-responsive regulatory components, Apls may have photoresponsive regulatory role(s). 154
30142 381269 cd12131 HGbI-like Hell's gate globin I (HGbI) from Methylacidiphilum infernorum and related proteins. HGbI is a single-domain heme-containing protein isolated from Methylacidiphilum infernorum, an aerobic acidophilic and thermophilic methanotroph. M. infernorum grows optimally at pH 2.0 and 60C and its home is New Zealand's Hell's Gate geothermal park. The physiological role of HGbI has yet to be determined. It has an extremely strong resistance to auto-oxidation, and has fast oxygen-binding/slow release characteristics. Its CO on-rate is comparable to the O2 on-rate, and it is able to bind acetate with high affinity in the ferric state. The coordination of the heme iron changes in the ferrous form from pentacoordinate at low pH to predominantly hexacoordinate at high pH; in the ferric form, it is predominantly hexacoordinate at all pH. 128
30143 271287 cd12137 GbX Globin_X (GbX). Zebrafish globin X (GbX) is expressed at low levels in neurons of the central nervous system, and appears to be associated with the sensory system. GbX is likely to be attached to the cell membrane via S-palmitoylation and N-myristoylation. It's unlikely to have a true respiratory function as it is membrane-associated. It has been suggested that it may protect the lipids in the cell membrane from oxidation or act as a redox-sensing or signaling protein. Zebrafish GbX is hexacoordinate, and displays cooperative O2 binding. 145
30144 213015 cd12139 SH3_Bin1 Src Homology 3 domain of Bridging integrator 1 (Bin1), also called Amphiphysin-2. Bin1 isoforms are localized in many different tissues and may function in intracellular vesicle trafficking. It plays a role in the organization and maintenance of the T-tubule network in skeletal muscle. Mutations in Bin1 are associated with autosomal recessive centronuclear myopathy. Bin1 contains an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR) and a C-terminal SH3 domain. The SH3 domain of Bin1 forms transient complexes with actin, myosin filaments, and CDK5, to facilitate sarcomere organization and myofiber maturation. It also binds dynamin and prevents its self-assembly. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 72
30145 213016 cd12140 SH3_Amphiphysin_I Src Homology 3 domain of Amphiphysin I. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They exist in several isoforms and mammals possess two amphiphysin proteins from distinct genes. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin, and synaptojanin. They function in synaptic vesicle endocytosis. Human autoantibodies to amphiphysin I hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. The SH3 domain of amphiphysins bind proline-rich motifs present in binding partners such as dynamin, synaptojanin, and nsP3. It also belongs to a subset of SH3 domains that bind ubiquitin in a site that overlaps with the peptide binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 72
30146 213017 cd12141 SH3_DNMBP_C2 Second C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba, and similar domains. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 57
30147 213018 cd12142 SH3_D21-like Src Homology 3 domain of SH3 domain-containing protein 21 (SH3D21) and similar proteins. N-terminal SH3 domain of the uncharacterized protein SH3 domain-containing protein 21, and similar uncharacterized domains, it belongs to the CD2AP-like_3 subfamily of proteins. The CD2AP-like_3 subfamily is composed of the third SH3 domain (SH3C) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3C of both proteins have been shown to bind to ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies. 55
30148 213019 cd12143 SH3_ARHGAP9 Src Homology 3 domain of Rho GTPase-activating protein 9 and similar proteins. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP9 functions as a GAP for Rac and Cdc42, but not for RhoA. It negatively regulates cell migration and adhesion. It also acts as a docking protein for the MAP kinases Erk2 and p38alpha, and may facilitate cross-talk between the Rho GTPase and MAPK pathways to control actin remodeling. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediating multiprotein complex assemblies. 57
30149 213387 cd12144 SDH_N_domain Saccharopine dehydrogenase N-terminal domain. SDH N-terminal domain is named due to its appearance at the N-terminus of SDH in eukaryotes, but can be found C-terminal of the SDH-like domain in other enzymes, such as the bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase enzyme. SDH catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of a related structure. 114
30150 213388 cd12145 Rev1_C C-terminal domain of the Y-family polymerase Rev1. Rev1 is a eukaryotic translesion synthesis (TLS) polymerase; TLS is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Rev1 has both structural and enzymatic roles. Structurally, it is believed to interact with other nonclassical polymerases and replication machinery to act as a scaffold. The C-terminal domain modeled here is essential for TLS and has been shown to mediate interactions with the Rev7 subunit of the B-family TLS polymerase Pol zeta (Rev3/Rev7), as well as with the RIRs (Rev1-interacting regions) of polymerases kappa, iota, and eta. Rev1 is known to actively promote the introduction of mutations, potentially making it a significant target for cancer treatment. 94
30151 213389 cd12146 STING_C C-terminal domain of STING. STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain. 181
30152 213390 cd12147 Cep3_C C-terminal domain of Cep3, a subunit of the yeast centromere-binding factor 3. Cep3, together with Skp1, Ctf13, and Ndc10, forms the yeast centromere-binding factor 3 (CBF3) which initiates kinetochore assembly by binding to the CDEIII locus of centromeric DNA. Cep3 is comprised of two domains: the N-terminal DNA-binding module, a Zn2Cys6 cluster, and the C-terminal domain, which dimerizes and is believed to be involved in the recruitment of the Skp1-Ctf13 heterodimer. 552
30153 213391 cd12148 fungal_TF_MHR fungal transcription factor regulatory middle homology region. This domain is present in the large family of fungal zinc cluster transcription factors that contain an N-terminal GAL4-like C6 zinc binuclear cluster DNA-binding domain. Examples of members of this large fungal group are the following Saccharomyces cerevisiae transcription factors, GAL4, STB5, DAL81, CAT8, RDR1, HAL9, PUT3, PPR1, ASG1, RSF2, PIP2, as well as the C-terminal domain of the Cep3, a subunit of the yeast centromere-binding factor 3. It has been suggested that this region plays a regulatory role. 410
30154 213392 cd12149 Flavi_E_C Immunoglobulin-like domain III (C-terminal domain) of Flavivirus envelope glycoprotein E. The C-terminal domain (domain III) of Flavivirus glycoprotein E appears to be involved in low-affinity interactions with negatively charged glycoaminoglycans on the host cell surface. Domain III may also play a role in interactions with alpha-v-beta-3 integrins in West Nile virus, Japanese encephalitis virus, and Dengue virus. The interface between domain I and domain III appears to be destabilized by the low-pH environment of the endosome, and domain III may play a vital role in the conformational changes of envelope glycoprotein E that follow the clathrin-mediated endocytosis of viral particles and are a prerequisite to membrane fusion. 91
30155 213393 cd12150 talin-RS rod-segment of the talin C-terminal domain. The talin rod-segment characterized by this model interacts with its N-terminal FERM domain to mask its integrin-binding site and interferes with interactions between the FERM domain and the cellular membrane. Talin is a large and ubiquitous cytoskeletal protein concentrated at focal adhesion sites. It is involved in linking integrins to the actin cytoskeleton. 172
30156 213394 cd12151 F1-ATPase_gamma mitochondrial ATP synthase gamma subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. 282
30157 213395 cd12152 F1-ATPase_delta mitochondrial ATP synthase delta subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. In bacteria, which lack a eukaryotic epsilon subunit homolog, this subunit is called the epsilon subunit. 123
30158 213396 cd12153 F1-ATPase_epsilon eukaryotic mitochondrial ATP synthase epsilon subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes, and in chloroplast thylakoid membranes. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits (only found in eukaryotes, lacking in bacteria) with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. The epsilon subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1. 45
30159 240631 cd12154 FDH_GDH_like Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases. The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. 310
30160 240632 cd12155 PGDH_1 Phosphoglycerate Dehydrogenase, 2-hydroxyacid dehydrogenase family. Phosphoglycerate Dehydrogenase (PGDH) catalyzes the NAD-dependent conversion of 3-phosphoglycerate into 3-phosphohydroxypyruvate, which is the first step in serine biosynthesis. Over-expression of PGDH has been implicated as supporting proliferation of certain breast cancers, while PGDH deficiency is linked to defects in mammalian central nervous system development. PGDH is a member of the 2-hydroxyacid dehydrogenase family, enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 314
30161 240633 cd12156 HPPR Hydroxy(phenyl)pyruvate Reductase, D-isomer-specific 2-hydroxyacid-related dehydrogenase. Hydroxy(phenyl)pyruvate reductase (HPPR) catalyzes the NADP-dependent reduction of hydroxyphenylpyruvates, hydroxypyruvate, or pyruvate to its respective lactate. HPPR acts as a dimer and is related to D-isomer-specific 2-hydroxyacid dehydrogenases, a superfamily that includes groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 301
30162 240634 cd12157 PTDH Thermostable Phosphite Dehydrogenase. Phosphite dehydrogenase (PTDH), a member of the D-specific 2-hydroxyacid dehydrogenase family, catalyzes the NAD-dependent formation of phosphate from phosphite (hydrogen phosphonate). PTDH has been suggested as a potential enzyme for cofactor regeneration systems. The D-specific 2-hydroxyacid dehydrogenase superfamily includes groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD-binding domain. 318
30163 240635 cd12158 ErythrP_dh D-Erythronate-4-Phosphate Dehydrogenase NAD-binding and catalytic domains. D-Erythronate-4-phosphate Dehydrogenase (E. coli gene PdxB), a D-specific 2-hydroxyacid dehydrogenase family member, catalyzes the NAD-dependent oxidation of erythronate-4-phosphate, which is followed by transamination to form 4-hydroxy-L-threonine-4-phosphate within the de novo biosynthesis pathway of vitamin B6. D-Erythronate-4-phosphate dehydrogenase has the common architecture shared with D-isomer specific 2-hydroxyacid dehydrogenases but contains an additional C-terminal dimerization domain in addition to an NAD-binding domain and the "lid" domain. The lid domain corresponds to the catalytic domain of phosphoglycerate dehydrogenase and other proteins of the D-isomer specific 2-hydroxyacid dehydrogenase family, which include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. 343
30164 240636 cd12159 2-Hacid_dh_2 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 303
30165 240637 cd12160 2-Hacid_dh_3 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 310
30166 240638 cd12161 GDH_like_1 Putative glycerate dehydrogenase and related proteins of the D-specific 2-hydroxy dehydrogenase family. This group contains a variety of proteins variously identified as glycerate dehydrogenase (GDH, aka Hydroxypyruvate Reductase) and other enzymes of the 2-hydroxyacid dehydrogenase family. GDH catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 315
30167 240639 cd12162 2-Hacid_dh_4 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 307
30168 240640 cd12163 2-Hacid_dh_5 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 334
30169 240641 cd12164 GDH_like_2 Putative glycerate dehydrogenase and related proteins of the D-specific 2-hydroxy dehydrogenase family. This group contains a variety of proteins variously identified as glycerate dehydrogenase (GDH, also known as hydroxypyruvate reductase) and other enzymes of the 2-hydroxyacid dehydrogenase family. GDH catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 306
30170 240642 cd12165 2-Hacid_dh_6 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 314
30171 240643 cd12166 2-Hacid_dh_7 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 300
30172 240644 cd12167 2-Hacid_dh_8 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 330
30173 240645 cd12168 Mand_dh_like D-Mandelate Dehydrogenase-like dehydrogenases. D-Mandelate dehydrogenase (D-ManDH), identified as an enzyme that interconverts benzoylformate and D-mandelate, is a D-2-hydroxyacid dehydrogenase family member that catalyzes the conversion of C3-branched 2-ketoacids. D-ManDH exhibits broad substrate specificities for 2-ketoacids with large hydrophobic side chains, particularly those with C3-branched side chains. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Glycerate dehydrogenase catalyzes the reaction (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 321
30174 240646 cd12169 PGDH_like_1 Putative D-3-Phosphoglycerate Dehydrogenases. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric. 308
30175 240647 cd12170 2-Hacid_dh_9 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 294
30176 240648 cd12171 2-Hacid_dh_10 Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 310
30177 240649 cd12172 PGDH_like_2 Putative D-3-Phosphoglycerate Dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric. 306
30178 240650 cd12173 PGDH_4 Phosphoglycerate dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. 304
30179 240651 cd12174 PGDH_like_3 Putative D-3-Phosphoglycerate Dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric. 305
30180 240652 cd12175 2-Hacid_dh_11 Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 311
30181 240653 cd12176 PGDH_3 Phosphoglycerate dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. 304
30182 240654 cd12177 2-Hacid_dh_12 Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 321
30183 240655 cd12178 2-Hacid_dh_13 Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 317
30184 240656 cd12179 2-Hacid_dh_14 Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 306
30185 240657 cd12180 2-Hacid_dh_15 Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 308
30186 240658 cd12181 ceo_syn N(5)-(carboxyethyl)ornithine synthase. N(5)-(carboxyethyl)ornithine synthase (ceo_syn) catalyzes the NADP-dependent conversion of N5-(L-1-carboxyethyl)-L-ornithine to L-ornithine + pyruvate. Ornithine plays a key role in the urea cycle, which in mammals is used in arginine biosynthesis, and is a precursor in polyamine synthesis. ceo_syn is related to the NAD-dependent L-alanine dehydrogenases. Like formate dehydrogenase and related enzymes, ceo_syn is comprised of 2 domains connected by a long alpha helical stretch, each resembling a Rossmann fold NAD-binding domain. The NAD-binding domain is inserted within the linear sequence of the more divergent catalytic domain. These ceo_syn proteins have a partially conserved NAD-binding motif and active site residues that are characteristic of related enzymes such as Saccharopine Dehydrogenase. 295
30187 240659 cd12183 LDH_like_2 D-Lactate and related Dehydrogenases, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2-domain structure of formate dehydrogenase. D-2-hydroxyisocaproate dehydrogenase-like (HicDH) proteins are NAD-dependent members of the hydroxycarboxylate dehydrogenase family, and share the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 328
30188 240660 cd12184 HGDH_like (R)-2-Hydroxyglutarate Dehydrogenase and related dehydrogenases, NAD-binding and catalytic domains. (R)-2-hydroxyglutarate dehydrogenase (HGDH) catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. HGDH is a member of the D-2-hydroxyacid NAD(+)-dependent dehydrogenase family; these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 330
30189 240661 cd12185 HGDH_LDH_like Putative Lactate dehydrogenase and (R)-2-Hydroxyglutarate Dehydrogenase-like proteins, NAD-binding and catalytic domains. This group contains various putative dehydrogenases related to D-lactate dehydrogenase (LDH), (R)-2-hydroxyglutarate dehydrogenase (HGDH), and related enzymes, members of the 2-hydroxyacid dehydrogenases family. LDH catalyzes the interconversion of pyruvate and lactate, and HGDH catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. Despite often low sequence identity within this 2-hydroxyacid dehydrogenase family, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 322
30190 240662 cd12186 LDH D-Lactate dehydrogenase and D-2-Hydroxyisocaproic acid dehydrogenase (D-HicDH), NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenases family. LDH is homologous to D-2-hydroxyisocaproic acid dehydrogenase(D-HicDH) and shares the 2 domain structure of formate dehydrogenase. D-HicDH is a NAD-dependent member of the hydroxycarboxylate dehydrogenase family, and shares the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 329
30191 240663 cd12187 LDH_like_1 D-Lactate and related Dehydrogenase like proteins, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-Hydroxyisocaproic acid dehydrogenase(D-HicDH) and shares the 2 domain structure of formate dehydrogenase. D-2-hydroxyisocaproate dehydrogenase-like (HicDH) proteins are NAD-dependent members of the hydroxycarboxylate dehydrogenase family, and share the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 329
30192 240664 cd12188 SDH Saccharopine Dehydrogenase NAD-binding and catalytic domains. Saccharopine Dehydrogenase (SDH) catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of related structure. 351
30193 240665 cd12189 LKR_SDH_like bifunctional lysine ketoglutarate reductase /saccharopine dehydrogenase enzyme. Bifunctional lysine ketoglutarate reductase /saccharopine dehydrogenase protein is a pair of enzymes linked on a single polypeptide chain that catalyze the initial, consecutive steps of lysine degradation. These proteins are related to the 2-domain saccharopine dehydrogenases. Along with formate dehydrogenase and similar enzymes, SDH consists of paired domains resembling Rossmann folds in which the NAD-binding domain is inserted within the linear sequence of the catalytic domain. In this bifunctional enzyme, the LKR domain is N-terminal of the SDH domain. These proteins have a close match to the active site motif of SDHs, and an NAD-binding site motif that is a partial match to that found in SDH and other FDH-related proteins. 433
30194 213397 cd12190 Bacova_04320_like Uncharacterized proteins similar to Bacteroides ovatus 4320. This model characterizes a family of proteins conserved in Bacteroidetes, similar to B. ovatus ATCC 8483 reading frame 04320. Structurally, the protein resembles members of the SRPBCC domain superfamily (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC). 159
30195 213398 cd12191 gal11_coact Gal11 coactivator domain. Gal11/MED15 acts in the general regulation of GAL structural genes and is required for full expression of several genes in this pathway, including GALs 1, 7, and 10 in Saccharomyces cerevisiae. GAL11 function is dependent on GCN4 functionality and binds GCN4 in a degenerate manner with multiple orientations found at the GCN4-Gal11 interface. 90
30196 213399 cd12192 GCN4_cent GCN4 central activation domain-like acidic activation domain. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region. 40
30197 269833 cd12193 bZIP_GCN4 Basic leucine zipper (bZIP) domain of General control protein GCN4: a DNA-binding and dimerization domain. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain. In amino acid-deprived cells, GCN4 is up-regulated leading to transcriptional activation of genes encoding amino acid biosynthetic enzymes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 54
30198 213379 cd12194 Kcc4p_like_C C-terminal kinase associated domain 1 (KA1), a phospholipid binding domain, of Kcc4p and similar proteins. This subfamily is composed of three Saccharomyces cerevisiae proteins, Kcc4p, Gin4p, and Hsl1p, as well as similar serine/threonine protein kinases (STKs). They catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. Kcc4p, Gin4p, and Hsl1p are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. They negatively regulate the Wee1-related kinase Swe1, which phosphorylates the cyclin-dependent kinase Cdc28, and is involved in regulating the entry of cells into mitosis. Kcc4p, Gin4p, and Hsl1p localize in the bud neck in a septin-dependent manner and display distinct but partially overlapping functions. They contain an N-terminal catalytic kinase domain and a C-terminal KA1 domain. The KA1 domain of Kcc4p, Gin4p, and Hsl1p binds acidic phospholipids including phosphatidylserine (PtdSer) and is required for bud neck localization. 122
30199 213380 cd12195 CIPK_C C-terminal regulatory domain of Calcineurin B-Like (CBL)-interacting protein kinases. CIPKs are serine/threonine protein kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They comprise a unique family in higher plants of proteins that interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decodes specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. The specificity of the response relies on differences in expression and localization of both CBLs and CIPKs, as well as on the interaction specificity of CBL-CIPK combinations. There are 25, 30, and 43 CIPK genes identified in the Arabidopsis thaliana, Oryza sativa, and Zea mays genomes, respectively. The founding member of the CIPK family is Arabidopsis thaliana CIPK24, also called SOS2 (Salt Overly Sensitive 2). CIPKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain that contains the FISL (also called NAF for Asn-Ala-Phe) and PPI-binding motifs, which are involved in the interaction with CBLs and PP2C-type protein phosphatases, respectively. Studies using SOS2, SOS3, and ABI2 phosphatase show that the binding of CBL and PP2C-type protein phosphatase to CIPK is mutually exclusive. The binding of CBL to CIPK is inhibitory to kinase activity. 116
30200 213381 cd12196 MARK1-3_C C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinases 1-3. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK1/2, through their activation by death-associated protein kinase (DAPK), modulate polarized neurite outgrowth. MARK1, also called Par-1c, is also involved in axon-dendrite specification, and SNPs in the MARK1 gene are associated with autism spectrum disorders. MARK2, also called Par-1b, is implicated in many physiological processes including fertility, immune system homeostasis, learning and memory, growth, and metabolism. MARK3, also called Par-1a, is implicated in gluconeogenesis and adiposity; mice deficient in MARK3 display reduced adiposity, resistance to hepatic steatosis, and defective gluconeogenesis. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain. 98
30201 213382 cd12197 MARK4_C C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinase 4. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK4 has two splicing isoforms: MARK4S, predominantly expressed in the brain; and MARK4L, expressed in all tissues. Unlike MARK1-3 that show cytoplasmic localization, MARK4 colocalizes with the centrosome and with microtubules. Decreased MARK4 expression in the brain may be involved in the pathogenesis of Prion diseases and may be correlated to PrP(Sc) deposits. MARK4 is also a component of the ectoplasmic specialization, a testis-specific adherens junction. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain. 99
30202 213383 cd12198 MELK_C C-terminal kinase associated domain 1 (KA1) of Maternal embryonic leucine zipper kinase. MELK, also called protein kinase 38 (PK38) or pEg3 kinase, is a cell cycle-regulated serine/threonine protein kinase (STK) that catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It is phosphorylated and maximally active during mitosis and is involved in regulating cell cycle progression, division, proliferation, tumor growth, and mRNA splicing. MELK shows a broad substrate specificity, including the zinc finger-like protein ZPR9, the transcription and splicing factor NIPP1, and the protein-tyrosine phosphatase Cdc25B, among others. MELK contains an N-terminal catalytic domain followed by a ubiquitin-associated (UBA) domain, a TP dipeptide-rich region, and a C-terminal KA1 domain. The KA1 domain of MELK, together with its TP dipeptide-rich region, functions as an autoinhibitory domain. The KA1 domain of the related microtubule affinity-regulating kinases (MARKs) has been shown to bind anionic phospholipids and may be involved in membrane localization. 96
30203 213384 cd12199 AMPKA1_C C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha 1 catalytic subunit. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively, and show varying expression patterns. AMPKalpha1 is the predominant isoform expressed in bone; it plays a role in bone remodeling in response to hormonal regulation. It is selectively regulated by nucleoside diphosphate kinase (NDPK)-A in an AMP-independent manner. AMPKalpha1 impacts the regulation of fat metabolism through its in vivo target, acetyl coenzyme A carboxylase (ACC). It also mediates the vasoprotective effects of estrogen through phosphorylation of another in vivo substrate, RhoA. The C-terminal RD of the AMPK alpha 1 subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit. 96
30204 213385 cd12200 AMPKA2_C C-terminal regulatory domain of 5'-AMP-activated serine/threonine kinase, subunit alpha. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively, and show varying expression patterns. AMPKalpha2 shows cytoplasmic and nuclear localization, whereas AMPKalpha1 is localized only in the cytoplasm. The C-terminal RD of the AMPK alpha 1 subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit. 102
30205 213386 cd12201 MARK2_C C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinase 2. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK2, also called Par-1b or ELKL motif kinase 1 (EMK-1), is implicated in many physiological processes including fertility, immune system homeostasis, learning and memory, growth, and metabolism. It also regulates axon formation and has been implicated in neurodegeneration. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain. 99
30206 213401 cd12202 CASP8AP2 Caspase 8-associated protein 2 myb-like domain. This domain is the SANT/myb-like domain of Caspase 8-associated protein 2 (CASP8AP2) / GON-4 like proteins. CASP8AP2 (aka Flice-Associated Huge Protein (FLASH)) is implicated in numerous gene regulatory roles including roles in embryogenesis, oncogenesis, down-regulation of replication-dependent histone genes, regulation of Caspase 8 activity at the death-inducing signaling complex (DISC), and as a useful marker in leukemia prognosis. Gon-4 is critical in Caenorhabditis elegans gonadogenesis. Danio rerio GON4 is a regulator of gene expression in hematopoietic development, possibly by repressing expression. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins. 66
30207 213402 cd12203 GT1 GT1, myb-like, SANT family. GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins. 66
30208 213176 cd12204 CBD_like Cellulose-binding domain, chitinase and related proteins. This group contains proteins related to the cellulose-binding domain of Erwinia chrysanthemi endoglucanase Z (EGZ) and Serratia marcescens chitinase B (ChiB). Gram negative plant parasite Erwinia chrysanthemi produces a variety of depolymerizing enzymes to metabolize pectin and cellulose on the host plant. Cellulase EGZ has a modular structure, with N-terminal catalytic domain linked to a C-terminal cellulose-binding domain (CBD). CBD mediates the secretion activity of EGZ. Chitinases allow certain bacteria to utilize chitin as an energy source. Typically, non-plant chitinases are of the glycosidase family 18. 48
30209 213344 cd12205 RasGAP_plexin Ras-GTPase Activating Domain of plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Ligand binding activates signal transduction pathways controlling axon guidance in the nervous system and other developmental processes, including cell migration and morphogenesis, immune function, and tumor progression. Plexins are divided into four types (A-D) according to sequence similarity. In vertebrates, type A Plexins serve as the co-receptors for neuropilins to mediate the signaling of class 3 semaphorins except Sema3E, which signals through Plexin D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B. Plexin C1 serves as the receptor of Sema7A and plays regulation roles in both immune and nervous systems. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Other proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator. 382
30210 213345 cd12206 RasGAP_IQGAP_related Ras-GTPase Activating Domain of proteins related to IQGAPs. RasGAP: Ras-GTPase Activating Domain. RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a myriad of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGap domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator. 359
30211 213346 cd12207 RasGAP_IQGAP3 Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 3. This family represents the IQ motif containing GTPase activating protein 3 (IQGAP3), which associates with Ras GTP-binding proteins. A primary function of IQGAP proteins is to modulate cytoskeletal architecture. There are three known IQGAP family members: IQGAP1, IQGAP2 and IQGAP3. Human IQGAP1 and IQGAP2 share 62% identity. IQGAPs are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP is an essential regulator of cytoskeletal function. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity, the protein actually lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite their similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3, only present in mammals, regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton. 350
30212 411994 cd12208 DIP1984-like DIP1984 family protein and similar proteins. DIP1984 is an uncharacterized protein from Corynebacterium diphtheriae. Some members of this family may have been misnamed as septicolysin. 150
30213 213404 cd12211 Bc2l-C_N N-Terminal Domain Of Bc2l-C Lectin. Lectin BC2L-C of Burkholderia cenocepacia is one of several lectins produced by this pathogen. BC2L-C has been shown to bind fucosylated human histo-blood group epitopes H-type 1, Lewis B, and Lewis Y. The C-terminal domain resembles BC2L-A, a calcium dependent mannose-binding protein. The N-terminal domain trimerizes and binds alpha-MeSeFuc in pockets between the monomeric units. The N-terminal domain has a similar structure to tumor necrosis factor (TNF). 131
30214 276936 cd12212 Fis1 Mitochondrial Fission Protein Fis1, cytosolic domain. Fis1, along with Dnm1 and Mdv1, is an essential protein in mediating mitochondrial fission. Dnm1 and Fis1 are highly conserved, with a common mechanism in disparate species. In mutants of these proteins, mitochondrial fission is impaired, resulting in networks of undivided mitochondria. The Fis1 N-terminus is cytosolic and tethered to the mitochondrial outer membrane via a C-terminal transmembrane domain. Fis1 appears to act via the recruitment of division complexes to the mitochondrial outer membrane, via interactions with Mdv1 or Caf4. Fis1 has tandem Tetratricopeptide repeat (TPR) motifs which are known to mediate protein-protein interactions. 115
30215 213406 cd12213 ABD Alpha-Mannosidase Binding Domain of Atg19/34. These proteins are related to the Alpha-mannosidase (Ams1) Binding Domain of Atg19/Atg34, a key component in the targeting pathway that directs alpha-mannosidase and aminopeptidase I to the vacuole, either through cytoplasm-to-vacuole trafficking or via autophagy in starvation conditions. Autophagy is a eukaryotic mechanism in which cytoplasm is enclosed in double-membraned autophagosomes which fuse with a vacuole for transport into the lumen. In Saccharomyces cerevisiae, alpha-mannosidase is selectively directed to the vacuole via the direct interaction with Atg19 (and paralog Atg34) in the Cvt pathway. The Ams1 binding domains (ABD) of Atg19/34 have an immunoglobulin fold with eight beta-strands. The ABD is responsible for Ams1 recognition, but its deletion does not affect the fusion of Atg19 with prApe1, and the transport of prApe1 to the vacuole. The Atg19 N-terminal region is a distinct coiled-coil domain. 112
30216 213177 cd12214 ChiA1_BD chitin-binding domain of Chi A1-like proteins. This group contains proteins related to the chitin binding domain of chitinase A1 (ChiA1) of Bacillus circulans WL-12. Glycosidase ChiA1 hydrolyzes chitin and is comprised of several domains: the C-terminal chitin binding domain, an N-terminal and catalytic domain, and 2 fibronectin type III-like domains. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source. Bacillus circulans WL-12 ChiA1 facilitates invasion of fungal cell walls. The ChiA1 chitin binding domain is required for the specific recognition of insoluble chitin. Although topologically and structurally related, ChiA1 lacks the characteristic aromatic residues of Erwinia chrysanthemi endoglucanase Z (CBD(EGZ)). 45
30217 213178 cd12215 ChiC_BD Chitin-binding domain of chitinase C. Chitin-binding domain of chitinase C (ChiC) of Streptomyces griseus and related proteins. Chitinase C is a family 19 chitinase, and consists of an N-terminal chitin binding domain and a C-terminal chitin-catalytic domain that effects degradation. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source. ChiC contains the characteristic chitin-binding aromatic residues. 42
30218 213409 cd12216 Csn2_like CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family. 217
30219 213410 cd12217 Stu0660_Csn2 Stu0660-like CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. This family of Csn2 proteins includes Stu0660; the proteins are larger than other (canonical) Csn2 proteins as they have an additional alpha-helical C-terminal domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family. 343
30220 213411 cd12218 Csn2 CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family. 219
30221 340518 cd12219 Ubl_TBK1_like ubiquitin-like (Ubl) domain found in non-canonical Inhibitor of kappa B kinases IKKepsilon and TBK1, and similar proteins. IKKepsilon and TBK1 (TRAF family member-associated NF-kappaB activator-binding kinase 1) are non-canonical members of IKK family. They have been characterized as activators of nuclear factor-kappaB (NF-kappaB), but they are not essential for NF-kappaB activation. They play critical roles in antiviral response via phosphorylation and activation of transcription factors IRF3, IRF7, STAT1 and STAT3. They are also involved in the survival, tumorigenesis and development of various cancers. Both IKKepsilon and TBK1 contain an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway. 77
30222 240617 cd12220 Pesticin_RB Pesticin Translocation And Receptor Binding Domain. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. The N-terminal domain is further divided into the TonB box (which binds TonB), the T (translocation domain) and the R (receptor binding domain). Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. 166
30223 240616 cd12221 Cin1 Cellophane induced protein repeats of fungus Venturia inaequalis. Cin1 (cellulose induced protein 1) repeat protein of Venturia inaequalis, the fungus responsible for scab disease of apple, encodes 8 cysteine-rich repeats and is greatly upregulated within the plant and on cellophane membranes. The crystal structure reveals a pair of disulfide bridges in each repeat. The repeats have been described as adopting a beads-on-a-string organization. Cin1 function is undetermined, however the alpha-helical structure may be involved in protein-protein or protein-carbohydrate interactions in the extracellular matrix. 114
30224 240615 cd12222 Caa3-IV Caa3-Type Cytochrome Oxidase subunit 4 interacts with cyt c subunits I/III. Cytochrome c oxidase, a haem copper oxidase superfamily member, is the final step in the electron-transport chain, linking O2 reduction to transmembrane pumping in mitochondria and aerobic prokaryotes. Cytochrome c oxidase (aka Complex IV) catalyzes the reduction of O2 to 2H2O, and acts downstream of Complexes I-III: NADH-Q oxidoreductase, succinate-Q reductase, and Q-cytochrome c oxidoreductase. In Thermus thermophilus caa3-oxidase is comprised of subunit (SU) I/III, a fusion of classical SU I and SU III, and IIc as well as SU IV, which is composed of 2 connected transmembrane helices that interface with SU I/III. 63
30225 409670 cd12223 RRM_SR140 RNA recognition motif (RRM) found in U2-associated protein SR140 and similar proteins. This subgroup corresponds to the RRM of SR140 (also termed U2 snRNP-associated SURP motif-containing protein or U2SURP, or 140 kDa Ser/Arg-rich domain protein) which is a putative splicing factor mainly found in higher eukaryotes. Although it was initially identified as one of the 17S U2 snRNP-associated proteins, the molecular and physiological function of SR140 remains unclear. SR140 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a SWAP/SURP domain that is found in a number of pre-mRNA splicing factors in the middle region, and a C-terminal arginine/serine-rich domain (RS domain). 84
30226 409671 cd12224 RRM_RBM22 RNA recognition motif (RRM) found in Pre-mRNA-splicing factor RBM22 and similar proteins. This subgroup corresponds to the RRM of RBM22 (also known as RNA-binding motif protein 22, or Zinc finger CCCH domain-containing protein 16), a newly discovered RNA-binding motif protein which belongs to the SLT11 gene family. SLT11 gene encoding protein (Slt11p) is a splicing factor in yeast, which is required for spliceosome assembly. Slt11p has two distinct biochemical properties: RNA-annealing and RNA-binding activities. RBM22 is the homolog of SLT11 in vertebrates. It has been reported to be involved in pre-spliceosome assembly and to interact with the Ca2+-signaling protein ALG-2. It also plays an important role in embryogenesis. RBM22 contains a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a zinc finger of the unusual type C-x8-C-x5-C-x3-H, and a C-terminus that is unusually rich in the amino acids Gly and Pro, including sequences of tetraprolines. 74
30227 409672 cd12225 RRM1_2_CID8_like RNA recognition motif 1 and 2 (RRM1, RRM2) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID13 and similar proteins. This subgroup corresponds to the RRM domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear. 76
30228 409673 cd12226 RRM_NOL8 RNA recognition motif (RRM) found in nucleolar protein 8 (NOL8) and similar proteins. This model corresponds to the RRM of NOL8 (also termed Nop132) encoded by a novel NOL8 gene that is up-regulated in the majority of diffuse-type, but not intestinal-type, gastric cancers. Thus, NOL8 may be a good molecular target for treatment of diffuse-type gastric cancer. Also, NOL8 is a phosphorylated protein that contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), suggesting NOL8 is likely to function as a novel RNA-binding protein. It may be involved in regulation of gene expression at the post-transcriptional level or in ribosome biogenesis in cancer cells. 77
30229 409674 cd12227 RRM_SCAF4_SCAF8 RNA recognition motif (RRM) found in SR-related and CTD-associated factor 4 (SCAF4), SR-related and CTD-associated factor 8 (SCAF8) and similar proteins. This subfamily corresponds to the RRM in a new class of SCAFs (SR-like CTD-associated factors), including SCAF4, SCAF8 and similar proteins. The biological role of SCAF4 remains unclear, but it shows high sequence similarity to SCAF8 (also termed CDC5L complex-associated protein 7, or RNA-binding motif protein 16, or CTD-binding SR-like protein RA8). SCAF8 is a nuclear matrix protein that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II). The pol II CTD plays a role in coupling transcription and pre-mRNA processing. In addition, SCAF8 co-localizes primarily with transcription sites that are enriched in nuclear matrix fraction, which is known to contain proteins involved in pre-mRNA processing. Thus, SCAF8 may play a direct role in coupling with both transcription and pre-mRNA processing. SCAF8 and SCAF4 both contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs. 77
30230 409675 cd12228 RRM_ENOX RNA recognition motif (RRM) found in the cell surface Ecto-NOX disulfide-thiol exchanger (ECTO-NOX or ENOX) proteins. This subgroup corresponds to the conserved RNA recognition motif (RRM) in ECTO-NOX proteins (also termed ENOX), comprising a family of plant and animal NAD(P)H oxidases exhibiting both oxidative and protein disulfide isomerase-like activities. They are growth-related and drive cell enlargement, and may play roles in aging and neurodegenerative diseases. ENOX proteins function as terminal oxidases of plasma membrane electron transport (PMET) through catalyzing electron transport from plasma membrane quinones to extracellular oxygen, forming water as a product. They are also hydroquinone oxidases that oxidize externally supplied NADH, hence NOX. ENOX proteins harbor a di-copper center that lacks flavin. ENOX proteins display protein disulfide interchange activity that is also possessed by protein disulfide isomerase. In contrast to the classic protein disulfide isomerases, ENOX proteins lack the double CXXC motif. This family includes two ENOX proteins, ENOX1 and ENOX2. ENOX1, also termed candidate growth-related and time keeping constitutive hydroquinone [NADH] oxidase (cCNOX), or cell proliferation-inducing gene 38 protein, or Constitutive Ecto-NOX (cNOX), is the constitutively expressed cell surface NADH (ubiquinone) oxidase that is ubiquitous and refractory to drugs. ENOX2, also termed APK1 antigen, or cytosolic ovarian carcinoma antigen 1, or tumor-associated hydroquinone oxidase (tNOX), is a cancer-specific variant of ENOX1 and plays a key role in cell proliferation and tumor progression. In contrast to ENOX1, ENOX2 is drug-responsive and harbors a drug binding site to which the cancer-specific S-peptide tagged pan-ENOX2 recombinant (scFv) is directed. Moreover, ENOX2 is specifically inhibited by a variety of quinone site inhibitors that have anticancer activity and is unique to the surface of cancer cells. ENOX proteins contain many functional motifs. 84
30231 409676 cd12229 RRM_G3BP RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein G3BP1, G3BP2 and similar proteins. This subfamily corresponds to the RRM domain in the G3BP family of RNA-binding and SH3 domain-binding proteins. G3BP acts at the level of RNA metabolism in response to cell signaling, possibly as RNA transcript stabilizing factors or an RNase. Members include G3BP1, G3BP2 and similar proteins. These proteins associate directly with the SH3 domain of GTPase-activating protein (GAP), which functions as an inhibitor of Ras. They all contain an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an acidic domain, a domain containing PXXP motif(s), an RNA recognition motif (RRM), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif). 81
30232 409677 cd12230 RRM1_U2AF65 RNA recognition motif 1 (RRM1) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. The subfamily corresponds to the RRM1 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs. 82
30233 409678 cd12231 RRM2_U2AF65 RNA recognition motif 2 (RRM2) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. This subfamily corresponds to the RRM2 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs. 77
30234 409679 cd12232 RRM3_U2AF65 RNA recognition motif 3 (RRM3) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. This subfamily corresponds to the RRM3 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs. 89
30235 240679 cd12233 RRM_Srp1p_AtRSp31_like RNA recognition motif (RRM) found in fission yeast pre-mRNA-splicing factor Srp1p, Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins. This subfamily corresponds to the RRM of Srp1p and RRM2 of plant SR splicing factors. Srp1p is encoded by gene srp1 from fission yeast Schizosaccharomyces pombe. It plays a role in the pre-mRNA splicing process, but is not essential for growth. Srp1p is closely related to the SR protein family found in Metazoa. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine hinge and a RS domain in the middle, and a C-terminal domain. The family also includes a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RRMs at their N-terminus and an RS domain at their C-terminus. 70
30236 409680 cd12234 RRM1_AtRSp31_like RNA recognition motif (RRM) found in Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins from plants. This subfamily corresponds to the RRM1 in a family that represents a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at their N-terminus, and an RS domain at their C-terminus. 72
30237 409681 cd12235 RRM_PPIL4 RNA recognition motif (RRM) found in peptidyl-prolyl cis-trans isomerase-like 4 (PPIase) and similar proteins. This subfamily corresponds to the RRM of PPIase, also termed cyclophilin-like protein PPIL4, or rotamase PPIL4, a novel nuclear RNA-binding protein encoded by cyclophilin-like PPIL4 gene. The precise role of PPIase remains unclear. PPIase contains a conserved N-terminal peptidyl-prolyl cis-trans isomerase (PPIase) motif, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a lysine rich domain, and a pair of bipartite nuclear targeting sequences (NLS) at the C-terminus. 83
30238 409682 cd12236 RRM_snRNP70 RNA recognition motif (RRM) found in U1 small nuclear ribonucleoprotein 70 kDa (U1-70K) and similar proteins. This subfamily corresponds to the RRM of U1-70K, also termed snRNP70, a key component of the U1 snRNP complex, which is one of the key factors facilitating the splicing of pre-mRNA via interaction at the 5' splice site, and is involved in regulation of polyadenylation of some viral and cellular genes, enhancing or inhibiting efficient poly(A) site usage. U1-70K plays an essential role in targeting the U1 snRNP to the 5' splice site through protein-protein interactions with regulatory RNA-binding splicing factors, such as the RS protein ASF/SF2. Moreover, U1-70K protein can specifically bind to stem-loop I of the U1 small nuclear RNA (U1 snRNA) contained in the U1 snRNP complex. It also mediates the binding of U1C, another U1-specific protein, to the U1 snRNP complex. U1-70K contains a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by an adjacent glycine-rich region at the N-terminal half, and two serine/arginine-rich (SR) domains at the C-terminal half. The RRM is responsible for the binding of stem-loop I of U1 snRNA molecule. Additionally, the most prominent immunodominant region that can be recognized by auto-antibodies from autoimmune patients may be located within the RRM. The SR domains are involved in protein-protein interaction with SR proteins that mediate 5' splice site recognition. For instance, the first SR domain is necessary and sufficient for ASF/SF2 binding. The family also includes Drosophila U1-70K that is an essential splicing factor required for viability in flies, but its SR domain is dispensable. The yeast U1-70k doesn't contain easily recognizable SR domains and shows low sequence similarity in the RRM region with other U1-70k proteins and is therefore not included in this family. The RRM domain is dispensable for yeast U1-70K function. 91
30239 409683 cd12237 RRM_snRNP35 RNA recognition motif (RRM) found in U11/U12 small nuclear ribonucleoprotein 35 kDa protein (U11/U12-35K) and similar proteins. This subfamily corresponds to the RRM of U11/U12-35K, also termed protein HM-1, or U1 snRNP-binding protein homolog, and is one of the components of the U11/U12 snRNP, which is a subunit of the minor (U12-dependent) spliceosome required for splicing U12-type nuclear pre-mRNA introns. U11/U12-35K is highly conserved among bilateria and plants, but is absent from some organisms, such as Saccharomyces cerevisiae and Caenorhabditis elegans. Moreover, U11/U12-35K shows significant sequence homology to U1 snRNP-specific 70 kDa protein (U1-70K or snRNP70). It contains a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by an adjacent glycine-rich region, and Arg-Asp and Arg-Glu dipeptide repeats rich domain, making U11/U12-35K a possible functional analog of U1-70K. It may facilitate 5' splice site recognition in the minor spliceosome and play a role in exon bridging, interacting with components of the major spliceosome bound to the pyrimidine tract of an upstream U2-type intron. The family corresponds to the RRM of U11/U12-35K that may directly contact the U11 or U12 snRNA through the RRM domain. 94
30240 409684 cd12238 RRM1_RBM40_like RNA recognition motif 1 (RRM1) found in RNA-binding protein 40 (RBM40) and similar proteins. This subfamily corresponds to the RRM1 of RBM40, also known as RNA-binding region-containing protein 3 (RNPC3) or U11/U12 small nuclear ribonucleoprotein 65 kDa protein (U11/U12-65K protein). It serves as a bridging factor between the U11 and U12 snRNPs. It contains two repeats of RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), connected by a linker that includes a proline-rich region. It binds to the U11-associated 59K protein via its RRM1 and employs the RRM2 to bind hairpin III of the U12 small nuclear RNA (snRNA). The proline-rich region might be involved in protein-protein interactions. 73
30241 409685 cd12239 RRM2_RBM40_like RNA recognition motif 2 (RRM2) found in RNA-binding protein 40 (RBM40) and similar proteins. This subfamily corresponds to the RRM2 of RBM40 and the RRM of RBM41. RBM40, also known as RNA-binding region-containing protein 3 (RNPC3) or U11/U12 small nuclear ribonucleoprotein 65 kDa protein (U11/U12-65K protein), serves as a bridging factor between the U11 and U12 snRNPs. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), connected by a linker that includes a proline-rich region. It binds to the U11-associated 59K protein via its RRM1 and employs the RRM2 to bind hairpin III of the U12 small nuclear RNA (snRNA). The proline-rich region might be involved in protein-protein interactions. RBM41 contains only one RRM. Its biological function remains unclear. 82
30242 409686 cd12240 RRM_NCBP2 RNA recognition motif (RRM) found in nuclear cap-binding protein subunit 2 (CBP20) and similar proteins. This subfamily corresponds to the RRM of CBP20, also termed nuclear cap-binding protein subunit 2 (NCBP2), or cell proliferation-inducing gene 55 protein, or NCBP-interacting protein 1 (NIP1). CBP20 is the small subunit of the nuclear cap binding complex (CBC), which is a conserved eukaryotic heterodimeric protein complex binding to 5'-capped polymerase II transcripts and plays a central role in the maturation of pre-mRNA and uracil-rich small nuclear RNA (U snRNA). CBP20 is most likely responsible for the binding of capped RNA. It contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and interacts with the second and third domains of CBP80, the large subunit of CBC. 78
30243 409687 cd12241 RRM_SF3B14 RNA recognition motif (RRM) found in pre-mRNA branch site protein p14 (SF3B14) and similar proteins. This subfamily corresponds to the RRM of SF3B14 (also termed p14), a 14 kDa protein subunit of SF3B which is a multiprotein complex that is an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA and has been involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B14 associates directly with another SF3B subunit called SF3B155. It is also present in both U2- and U12-dependent spliceosomes and may contribute to branch site positioning in both the major and minor spliceosome. Moreover, SF3B14 interacts directly with the pre-mRNA branch adenosine early in spliceosome assembly and within the fully assembled spliceosome. SF3B14 contains one well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 77
30244 409688 cd12242 RRM_SLIRP RNA recognition motif (RRM) found in SRA stem-loop-interacting RNA-binding protein (SLIRP) and similar proteins. This subfamily corresponds to the RRM of SLIRP, a widely expressed small steroid receptor RNA activator (SRA) binding protein, which binds to STR7, a functional substructure of SRA. SLIRP is localized predominantly to the mitochondria and plays a key role in modulating several nuclear receptor (NR) pathways. It functions as a co-repressor to repress SRA-mediated nuclear receptor coactivation. It modulates SHARP- and SKIP-mediated co-regulation of NR activity. SLIRP contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is required for SLIRP's corepression activities. 73
30245 409689 cd12243 RRM1_MSSP RNA recognition motif 1 (RRM1) found in the c-myc gene single-strand binding proteins (MSSP) family. This subfamily corresponds to the RRM1 of c-myc gene single-strand binding proteins (MSSP) family, including single-stranded DNA-binding protein MSSP-1 (also termed RBMS1 or SCR2) and MSSP-2 (also termed RBMS2 or SCR3). All MSSP family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity. Both MSSP-1 and MSSP-2 have been identified as protein factors binding to a putative DNA replication origin/transcriptional enhancer sequence present upstream from the human c-myc gene in both single- and double-stranded forms. Thus, they have been implicated in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. Moreover, the family includes a new member termed RNA-binding motif, single-stranded-interacting protein 3 (RBMS3), which is not a transcriptional regulator. RBMS3 binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. In addition, a putative meiosis-specific RNA-binding protein termed sporulation-specific protein 5 (SPO5, or meiotic RNA-binding protein 1, or meiotically up-regulated gene 12 protein), encoded by Schizosaccharomyces pombe Spo5/Mug12 gene, is also included in this family. SPO5 is a novel meiosis I regulator that may function in the vicinity of the Mei2 dot. 71
30246 409690 cd12244 RRM2_MSSP RNA recognition motif 2 (RRM2) found in the c-myc gene single-strand binding proteins (MSSP) family. This subfamily corresponds to the RRM2 of c-myc gene single-strand binding proteins (MSSP) family, including single-stranded DNA-binding protein MSSP-1 (also termed RBMS1 or SCR2) and MSSP-2 (also termed RBMS2 or SCR3). All MSSP family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity. Both, MSSP-1 and -2, have been identified as protein factors binding to a putative DNA replication origin/transcriptional enhancer sequence present upstream from the human c-myc gene in both single- and double-stranded forms. Thus, they have been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. Moreover, the family includes a new member termed RNA-binding motif, single-stranded-interacting protein 3 (RBMS3), which is not a transcriptional regulator. RBMS3 binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. In addition, a putative meiosis-specific RNA-binding protein termed sporulation-specific protein 5 (SPO5, or meiotic RNA-binding protein 1, or meiotically up-regulated gene 12 protein), encoded by Schizosaccharomyces pombe Spo5/Mug12 gene, is also included in this family. SPO5 is a novel meiosis I regulator that may function in the vicinity of the Mei2 dot. 82
30247 409691 cd12245 RRM_scw1_like RNA recognition motif (RRM) found in yeast cell wall integrity protein scw1 and similar proteins. This subfamily corresponds to the RRM of the family including yeast cell wall integrity protein scw1, yeast Whi3 protein, yeast Whi4 protein and similar proteins. The strong cell wall protein 1, scw1, is a nonessential cytoplasmic RNA-binding protein that regulates septation and cell-wall structure in fission yeast. It may function as an inhibitor of septum formation, such that its loss of function allows weak SIN signaling to promote septum formation. Its RRM domain shows high homology to two budding yeast proteins, Whi3 and Whi4. Whi3 is a dose-dependent modulator of cell size and has been implicated in cell cycle control in the yeast Saccharomyces cerevisiae. It functions as a negative regulator of ceroid-lipofuscinosis, neuronal 3 (Cln3), a G1 cyclin that promotes transcription of many genes to trigger the G1/S transition in budding yeast. It specifically binds the CLN3 mRNA and localizes it into discrete cytoplasmic loci that may locally restrict Cln3 synthesis to modulate cell cycle progression. Moreover, Whi3 plays a key role in cell fate determination in budding yeast. The RRM domain is essential for Whi3 function. Whi4 is a partially redundant homolog of Whi3, also containing one RRM. Some uncharacterized family members of this subfamily contain two RRMs; their RRM1 shows high sequence homology to the RRM of RNA-binding protein with multiple splicing (RBP-MS)-like proteins. 79
30248 409692 cd12246 RRM1_U1A_like RNA recognition motif 1 (RRM1) found in the U1A/U2B"/SNF protein family. This subfamily corresponds to the RRM1 of U1A/U2B"/SNF protein family which contains Drosophila sex determination protein SNF and its two mammalian counterparts, U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A) and U2 small nuclear ribonucleoprotein B" (U2 snRNP B" or U2B"), all of which consist of two RNA recognition motifs (RRMs), connected by a variable, flexible linker. SNF is an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila where it is essential in sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). U1A is an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. Moreover, U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins. U2B", initially identified to bind to stem-loop IV (SLIV) at the 3' end of U2 snRNA, is a unique protein that comprises of the U2 snRNP. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. Moreover, U2B" does not require an auxiliary protein for binding to RNA, and its nuclear transport is independent of U2 snRNA binding. 78
30249 409693 cd12247 RRM2_U1A_like RNA recognition motif 2 (RRM2) found in the U1A/U2B"/SNF protein family. This subfamily corresponds to the RRM2 of U1A/U2B"/SNF protein family, containing Drosophila sex determination protein SNF and its two mammalian counterparts, U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A) and U2 small nuclear ribonucleoprotein B" (U2 snRNP B" or U2B"), all of which consist of two RNA recognition motifs (RRMs) connected by a variable, flexible linker. SNF is an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila where it is essential in sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). U1A is an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. Moreover, U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins. U2B", initially identified to bind to stem-loop IV (SLIV) at the 3' end of U2 snRNA, is a unique protein that comprises of the U2 snRNP. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA, and its nuclear transport is independent of U2 snRNA binding. 72
30250 409694 cd12248 RRM_RBM44 RNA recognition motif (RRM) found in RNA-binding protein 44 (RBM44) and similar proteins. This subgroup corresponds to the RRM of RBM44, a novel germ cell intercellular bridge protein that is localized in the cytoplasm and intercellular bridges from pachytene to secondary spermatocyte stages. RBM44 interacts with itself and testis-expressed gene 14 (TEX14). Unlike TEX14, RBM44 does not function in the formation of stable intercellular bridges. It carries an RNA recognition motif (RRM) that could potentially bind a multitude of RNA sequences in the cytoplasm and help to shuttle them through the intercellular bridge, facilitating their dispersion into the interconnected neighboring cells. 77
30251 409695 cd12249 RRM1_hnRNPR_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM1 in hnRNP R, hnRNP Q, APOBEC-1 complementation factor (ACF), and dead end protein homolog 1 (DND1). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. It has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone, and play a key role in cell growth and differentiation. DND1 is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members in this family, except for DND1, contain three conserved RNA recognition motifs (RRMs); DND1 harbors only two RRMs. 78
30252 409696 cd12250 RRM2_hnRNPR_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM2 in hnRNP R, hnRNP Q, APOBEC-1 complementation factor (ACF), and dead end protein homolog 1 (DND1). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. It has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. DND1 is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members in this family, except for DND1, contain three conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains); DND1 harbors only two RRMs. 82
30253 409697 cd12251 RRM3_hnRNPR_like RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM3 in hnRNP R, hnRNP Q, and APOBEC-1 complementation factor (ACF). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. It has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members contain three conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 72
30254 409698 cd12252 RRM_DbpA RNA recognition motif (RRM) found in the DbpA subfamily of prokaryotic DEAD-box rRNA helicases. This subfamily corresponds to the C-terminal RRM homology domain of dbpA proteins implicated in ribosome biogenesis. They bind with high affinity and specificity to RNA substrates containing hairpin 92 of 23S rRNA (HP92), which is part of the ribosomal A-site. The majority of dbpA proteins contain two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding. Several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond the C-terminal domain. 71
30255 240699 cd12253 RRM_PIN4_like RNA recognition motif (RRM) found in yeast RNA-binding protein PIN4, fission yeast RNA-binding post-transcriptional regulators cip1, cip2 and similar proteins. This subfamily corresponds to the RRM in PIN4, also termed psi inducibility protein 4 or modifier of damage tolerance Mdt1, a novel phosphothreonine (pThr)-containing protein that specifically interacts with the pThr-binding site of the Rad53 FHA1 domain. It is encoded by gene MDT1 (YBL051C) from yeast Saccharomyces cerevisiae. PIN4 is involved in normal G2/M cell cycle progression in the absence of DNA damage and functions as a novel target of checkpoint-dependent cell cycle arrest pathways. It contains an N-terminal RRM, a nuclear localization signal, a coiled coil, and a total of 15 SQ/TQ motifs. cip1 (Csx1-interacting protein 1) and cip2 (Csx1-interacting protein 2) are novel cytoplasmic RRM-containing proteins that counteract Csx1 function during oxidative stress. They are not essential for viability in fission yeast Schizosaccharomyces pombe. Both cip1 and cip2 contain one RRM. Like PIN4, Cip2 also possesses an R3H motif that may function in sequence-specific binding to single-stranded nucleic acids. 79
30256 409699 cd12254 RRM_hnRNPH_ESRPs_RBM12_like RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family, epithelial splicing regulatory proteins (ESRPs), Drosophila RNA-binding protein Fusilli, RNA-binding protein 12 (RBM12) and similar proteins. The family includes RRM domains in the hnRNP H protein family, G-rich sequence factor 1 (GRSF-1), ESRPs (also termed RBM35), Drosophila Fusilli, RBM12 (also termed SWAN), RBM12B, RBM19 (also termed RBD-1) and similar proteins. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. GRSF-1 is a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. Fusilli shows high sequence homology to ESRPs. It can regulate endogenous FGFR2 splicing and functions as a splicing factor. The biological roles of both, RBM12 and RBM12B, remain unclear. RBM19 is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. Members in this family contain 2~6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 73
30257 409700 cd12255 RRM1_LKAP RNA recognition motif 1 (RRM1) found in Limkain-b1 (LKAP) and similar proteins. This subfamily corresponds to the RRM1 of LKAP, a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 (ATP-binding cassette subfamily D member 3, known previously as PMP-70) and/or PXF (peroxisomal farnesylated protein, known previously as PEX19). It associates with LIM kinase 2 (LIMK2) and may serve as a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. LKAP contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). However, whether those RRMs are bona fide RNA binding sites remains unclear. Moreover, there is no evidence of LKAP localization in the nucleus. Therefore, if the RRMs are functional, their interaction with RNA species would be restricted to the cytoplasm and peroxisomes. 73
30258 409701 cd12256 RRM2_LKAP RNA recognition motif 2 (RRM2) found in Limkain-b1 (LKAP) and similar proteins. This subfamily corresponds to the RRM2 of LKAP, a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 (ATP-binding cassette subfamily D member 3, known previously as PMP-70) and/or PXF (peroxisomal farnesylated protein, known previously as PEX19). It associates with LIM kinase 2 (LIMK2) and may serve as a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. LKAP contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). However, whether those RRMs are bona fide RNA binding sites remains unclear. Moreover, there is no evidence of LKAP localization in the nucleus. Therefore, if the RRMs are functional, their interaction with RNA species would be restricted to the cytoplasm and peroxisomes. 89
30259 409702 cd12257 RRM1_RBM26_like RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 26 (RBM26) and similar proteins. This subfamily corresponds to the RRM1 of RBM26, and the RRM of RBM27. RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, represents a cutaneous lymphoma (CL)-associated antigen. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions. RBM27 contains only one RRM; its biological function remains unclear. 72
30260 409703 cd12258 RRM2_RBM26_like RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 26 (RBM26) and similar proteins. This subfamily corresponds to the RRM2 of RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, which represents a cutaneous lymphoma (CL)-associated antigen. RBM26 contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions. 72
30261 409704 cd12259 RRM_SRSF11_SREK1 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 11 (SRSF11), splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subfamily corresponds to the RRM domain of SRSF11 (SRp54 or p54), SREK1 (SFRS12 or SRrp86) and similar proteins, a group of proteins containing regions rich in serine-arginine dipeptides (SR protein family). These are involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. SR proteins have been identified as crucial regulators of alternative splicing. Different SR proteins display different substrate specificity, have distinct functions in alternative splicing of different pre-mRNAs, and can even negatively regulate splicing. All SR family members are characterized by the presence of one or two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and the C-terminal regions rich in serine and arginine dipeptides (SR domains). The RRM domain is responsible for RNA binding and specificity in both alternative and constitutive splicing. In contrast, SR domains are thought to be protein-protein interaction domains that are often interchangeable. 76
30262 409705 cd12260 RRM2_SREK1 RNA recognition motif 2 (RRM2) found in splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subfamily corresponds to the RRM2 of SREK1, also termed serine/arginine-rich-splicing regulatory protein 86-kDa (SRrp86), or splicing factor arginine/serine-rich 12 (SFRS12), or splicing regulatory protein 508 amino acid (SRrp508). SREK1 belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family), which is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. It is a unique SR family member and it may play a crucial role in determining tissue specific patterns of alternative splicing. SREK1 can alter splice site selection by both positively and negatively modulating the activity of other SR proteins. For instance, SREK1 can activate SRp20 and repress SC35 in a dose-dependent manner both in vitro and in vivo. In addition, SREK1 contains two (some contain only one) RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and two serine-arginine (SR)-rich domains (SR domains) separated by an unusual glutamic acid-lysine (EK) rich region. The RRM and SR domains are highly conserved among other members of the SR superfamily. However, the EK domain is unique to SREK1. It plays a modulatory role controlling SR domain function by involvement in the inhibition of both constitutive and alternative splicing and in the selection of splice-site. 85
30263 240707 cd12261 RRM1_3_MRN1 RNA recognition motif 1 (RRM1) and 3 (RRM3) found in RNA-binding protein MRN1 and similar proteins. This subfamily corresponds to the RRM1 and RRM3 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 73
30264 409706 cd12262 RRM2_4_MRN1 RNA recognition motif 2 (RRM2) and 4 (RRM4) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM2 and RRM4 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, and is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 78
30265 409707 cd12263 RRM_ABT1_like RNA recognition motif (RRM) found in activator of basal transcription 1 (ABT1) and similar proteins. This subfamily corresponds to the RRM of novel nuclear proteins termed ABT1 and its homologous counterpart, pre-rRNA-processing protein ESF2 (eighteen S factor 2), from yeast. ABT1 associates with the TATA-binding protein (TBP) and enhances basal transcription activity of class II promoters. Meanwhile, ABT1 could be a transcription cofactor that can bind to DNA in a sequence-independent manner. The yeast ABT1 homolog, ESF2, is a component of 90S preribosomes and 5' ETS-based RNPs. It is previously identified as a putative partner of the TATA-element binding protein. However, it is primarily localized to the nucleolus and physically associates with pre-rRNA processing factors. ESF2 may play a role in ribosome biogenesis. It is required for normal pre-rRNA processing, as well as for SSU processome assembly and function. Both ABT1 and ESF2 contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 98
30266 409708 cd12264 RRM_AKAP17A RNA recognition motif (RRM) found in A-kinase anchor protein 17A (AKAP-17A) and similar proteins. This subfamily corresponds to the RRM domain of AKAP-17A, also termed 721P, or splicing factor, arginine/serine-rich 17A (SFRS17A). It was originally reported as the pseudoautosomal or X inactivation escape gene 7 (XE7) and as B-lymphocyte antigen precursor. It has been suggested that AKAP-17A is an alternative splicing factor and an SR-related splicing protein that interacts with the classical SR protein ASF/SF2 and the SR-related factor ZNF265. Additional studies have indicated that AKAP-17A is a dual-specific protein kinase A anchoring protein (AKAP) that can bind both type I and type II protein kinase A (PKA) with high affinity and co-localizes with the catalytic subunit of PKA in nuclear speckles as well as the splicing factor SC35 in splicing factor compartments. It is involved in regulation of pre-mRNA splicing possibly by docking a pool of PKA in splicing factor compartments. AKAP-17A contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 122
30267 409709 cd12265 RRM_SLT11 RNA recognition motif (RRM) found in pre-mRNA-splicing factor SLT11 and similar proteins. This subfamily corresponds to the RRM of SLT11, also known as extracellular mutant protein 2, or synthetic lethality with U2 protein 11, and is a splicing factor required for spliceosome assembly in yeast. It contains a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). SLT11 can facilitate the cooperative formation of U2/U6 helix II in association with stem II in the yeast spliceosome by utilizing its RNA-annealing and -binding activities. 86
30268 409710 cd12266 RRM_like_XS RNA recognition motif (RRM)-like XS domain found in plants. This XS (named after rice gene X and SGS3) domain is a single-stranded RNA-binding domain (RBD) and possesses a unique version of a RNA recognition motif (RRM) fold. It is conserved in a family of plant proteins including gene X and SGS3. Although its function is still unknown, the plant SGS3 proteins are thought to be involved in post-transcriptional gene silencing (PTGS) pathways. In addition, they contain a conserved aspartate residue that may be functionally important. 107
30269 409711 cd12267 RRM_YRA1_MLO3 RNA recognition motif (RRM) found in yeast RNA annealing protein YRA1 (Yra1p), yeast mRNA export protein mlo3 and similar proteins. This subfamily corresponds to the RRM of Yra1p and mlo3. Yra1p is an essential nuclear RNA-binding protein encoded by Saccharomyces cerevisiae YRA1 gene. It belongs to the evolutionarily conserved REF (RNA and export factor binding proteins) family of hnRNP-like proteins. Yra1p possesses potent RNA annealing activity and interacts with a number of proteins involved in nuclear transport and RNA processing. It binds to the mRNA export factor Mex67p/TAP and couples transcription to export in yeast. Yra1p is associated with Pse1p and Kap123p, two members of the beta-importin family, further mediating transport of Yra1p into the nucleus. In addition, the co-transcriptional loading of Yra1p is required for autoregulation. Yra1p consists of two highly conserved N- and C-terminal boxes and a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). This subfamily includes RNA-annealing protein mlo3, also termed mRNA export protein mlo3, which has been identified in fission yeast as a protein that causes defects in chromosome segregation when overexpressed. It shows high sequence similarity with Yra1p. 78
30270 240714 cd12268 RRM_Vip1 RNA recognition motif (RRM) found in fission yeast protein Vip1 and similar proteins. This subfamily corresponds to Vip1, an RNA-binding protein encoded by gene vip1 from fission yeast Schizosaccharomyces pombe. Its biological role remains unclear. Vip1 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 68
30271 409712 cd12269 RRM_Vip1_like RNA recognition motif (RRM) found in a group of uncharacterized plant proteins similar to fission yeast Vip1. This subfamily corresponds to the Vip1-like, uncharacterized proteins found in plants. Although their biological roles remain unclear, these proteins show high sequence similarity to the fission yeast Vip1. Like Vip1 protein, members in this family contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 69
30272 409713 cd12270 RRM_MTHFSD RNA recognition motif (RRM) found in vertebrate methenyltetrahydrofolate synthetase domain-containing proteins. This subfamily corresponds to methenyltetrahydrofolate synthetase domain (MTHFSD), a putative RNA-binding protein found in various vertebrate species. It contains an N-terminal 5-formyltetrahydrofolate cyclo-ligase domain and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The biological role of MTHFSD remains unclear. 72
30273 409714 cd12271 RRM1_PHIP1 RNA recognition motif 1 (RRM1) found in Arabidopsis thaliana phragmoplastin interacting protein 1 (PHIP1) and similar proteins. This subfamily corresponds to the RRM1 of PHIP1. A. thaliana PHIP1 and its homologs represent a novel class of plant-specific RNA-binding proteins that may play a unique role in the polarized mRNA transport to the vicinity of the cell plate. The family members consist of multiple functional domains, including a lysine-rich domain (KRD domain) that contains three nuclear localization motifs (KKKR/NK), two RNA recognition motifs (RRMs), and three CCHC-type zinc fingers. PHIP1 is a peripheral membrane protein and is localized at the cell plate during cytokinesis in plants. In addition to phragmoplastin, PHIP1 interacts with two Arabidopsis small GTP-binding proteins, Rop1 and Ran2. However, PHIP1 interacted only with the GTP-bound form of Rop1 but not the GDP-bound form. It also binds specifically to Ran2 mRNA. 72
30274 409715 cd12272 RRM2_PHIP1 RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana phragmoplastin interacting protein 1 (PHIP1) and similar proteins. The CD corresponds to the RRM2 of PHIP1. A. thaliana PHIP1 and its homologs represent a novel class of plant-specific RNA-binding proteins that may play a unique role in the polarized mRNA transport to the vicinity of the cell plate. The family members consist of multiple functional domains, including a lysine-rich domain (KRD domain) that contains three nuclear localization motifs (KKKR/NK), two RNA recognition motifs (RRMs), and three CCHC-type zinc fingers. PHIP1 is a peripheral membrane protein and is localized at the cell plate during cytokinesis in plants. In addition to phragmoplastin, PHIP1 interacts with two Arabidopsis small GTP-binding proteins, Rop1 and Ran2. However, PHIP1 interacted only with the GTP-bound form of Rop1 but not the GDP-bound form. It also binds specifically to Ran2 mRNA. 73
30275 409716 cd12273 RRM1_NEFsp RNA recognition motif 1 (RRM1) found in vertebrate putative RNA exonuclease NEF-sp. This subfamily corresponds to the RRM1 of NEF-sp, including uncharacterized putative RNA exonuclease NEF-sp found in vertebrates. Although its cellular functions remain unclear, NEF-sp contains an exonuclease domain and two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), suggesting it may possess both exonuclease and RNA-binding activities. 71
30276 409717 cd12274 RRM2_NEFsp RNA recognition motif 2 (RRM2) found in vertebrate putative RNA exonuclease NEF-sp. This subfamily corresponds to the RRM2 of NEF-sp, including uncharacterized putative RNA exonuclease NEF-sp found in vertebrates. Although its cellular functions remain unclear, NEF-sp contains an exonuclease domain and two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), suggesting it may possess both exonuclease and RNA-binding activities. 71
30277 240721 cd12275 RRM1_MEI2_EAR1_like RNA recognition motif 1 (RRM1) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM1 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 71
30278 409718 cd12276 RRM2_MEI2_EAR1_like RNA recognition motif 2 (RRM2) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM2 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 71
30279 409719 cd12277 RRM3_MEI2_EAR1_like RNA recognition motif 3 (RRM3) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM3 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 86
30280 409720 cd12278 RRM_eIF3B RNA recognition motif (RRM) found in eukaryotic translation initiation factor 3 subunit B (eIF-3B) and similar proteins. This subfamily corresponds to the RRM domain in eukaryotic translation initiation factor 3 (eIF-3), a large multisubunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3B, also termed eIF-3 subunit 9, or Prt1 homolog, eIF-3-eta, eIF-3 p110, or eIF-3 p116, is the major scaffolding subunit of eIF-3. It interacts with eIF-3 subunits A, G, I, and J. eIF-3B contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is involved in the interaction with eIF-3J. The interaction between eIF-3B and eIF-3J is crucial for the eIF-3 recruitment to the 40 S ribosomal subunit. eIF-3B also binds directly to domain III of the internal ribosome-entry site (IRES) element of hepatitis-C virus (HCV) RNA through its N-terminal RRM, which may play a critical role in both cap-dependent and cap-independent translation. Additional research has shown that eIF-3B may function as an oncogene in glioma cells and can serve as a potential therapeutic target for anti-glioma therapy. This family also includes the yeast homolog of eIF-3 subunit B (eIF-3B, also termed PRT1 or eIF-3 p90) that interacts with the yeast homologs of eIF-3 subunits A(TIF32), G(TIF35), I(TIF34), J(HCR1), and E(Pci8). In yeast, eIF-3B (PRT1) contains an N-terminal RRM that is directly involved in the interaction with eIF-3A (TIF32) and eIF-3J (HCR1). In contrast to its human homolog, yeast eIF-3B (PRT1) may have potential to bind its total RNA through its RRM domain. 84
30281 409721 cd12279 RRM_TUT1 RNA recognition motif (RRM) found in speckle targeted PIP5K1A-regulated poly(A) polymerase (Star-PAP) and similar proteins. This subfamily corresponds to the RRM of Star-PAP, also termed RNA-binding motif protein 21 (RBM21), which is a ubiquitously expressed U6 snRNA-specific terminal uridylyltransferase (U6-TUTase) essential for cell proliferation. Although it belongs to the well-characterized poly(A) polymerase protein superfamily, Star-PAP is highly divergent from both, the poly(A) polymerase (PAP) and the terminal uridylyl transferase (TUTase), identified within the editing complexes of trypanosomes. Star-PAP predominantly localizes at nuclear speckles and catalyzes RNA-modifying nucleotidyl transferase reactions. It functions in mRNA biosynthesis and may be regulated by phosphoinositides. It binds to glutathione S-transferase (GST)-PIPKIalpha. Star-PAP preferentially uses ATP as a nucleotide substrate and possesses PAP activity that is stimulated by PtdIns4,5P2. It contains an N-terminal C2H2-type zinc finger motif followed by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a split PAP domain linked by a proline-rich region, a PAP catalytic and core domain, a PAP-associated domain, an RS repeat, and a nuclear localization signal (NLS). 74
30282 409722 cd12280 RRM_FET RNA recognition motif (RRM) found in the FET family of RNA-binding proteins. This subfamily corresponds to the RRM of FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA-binding proteins. This ubiquitously expressed family of similarly structured proteins predominantly localizing to the nucleus, includes FUS (also known as TLS or Pigpen or hnRNP P2), EWS (also known as EWSR1), TAF15 (also known as hTAFII68 or TAF2N or RPB56), and Drosophila Cabeza (also known as SARFH). The corresponding coding genes of these proteins are involved in deleterious genomic rearrangements with transcription factor genes in a variety of human sarcomas and acute leukemias. All FET proteins interact with each other and are therefore likely to be part of the very same protein complexes, which suggests a general bridging role for FET proteins coupling RNA transcription, processing, transport, and DNA repair. The FET proteins contain multiple copies of a degenerate hexapeptide repeat motif at the N-terminus. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a putative zinc-finger domain, and a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is flanked by 3 arginine-glycine-glycine (RGG) boxes. FUS and EWS might have similar sequence specificity; both bind preferentially to GGUG-containing RNAs. FUS has also been shown to bind strongly to human telomeric RNA and to small low-copy-number RNAs tethered to the promoter of cyclin D1. To date, nothing is known about the RNA binding specificity of TAF15. 82
30283 409723 cd12281 RRM1_TatSF1_like RNA recognition motif 1 (RRM1) found in HIV Tat-specific factor 1 (Tat-SF1) and similar proteins. This subfamily corresponds to the RRM1 of Tat-SF1 and CUS2. Tat-SF1 is the cofactor for stimulation of transcriptional elongation by human immunodeficiency virus-type 1 (HIV-1) Tat. It is a substrate of an associated cellular kinase. Tat-SF1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a highly acidic carboxyl-terminal half. The family also includes CUS2, a yeast homolog of human Tat-SF1. CUS2 interacts with U2 RNA in splicing extracts and functions as a splicing factor that aids assembly of the splicing-competent U2 snRNP in vivo. CUS2 also associates with PRP11 that is a subunit of the conserved splicing factor SF3a. Like Tat-SF1, CUS2 contains two RRMs as well. 92
30284 409724 cd12282 RRM2_TatSF1_like RNA recognition motif 2 (RRM2) found in HIV Tat-specific factor 1 (Tat-SF1) and similar proteins. This subfamily corresponds to the RRM2 of Tat-SF1 and CUS2. Tat-SF1 is the cofactor for stimulation of transcriptional elongation by human immunodeficiency virus-type 1 (HIV-1) Tat. It is a substrate of an associated cellular kinase. Tat-SF1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a highly acidic carboxyl-terminal half. The family also includes CUS2, a yeast homolog of human Tat-SF1. CUS2 interacts with U2 RNA in splicing extracts and functions as a splicing factor that aids assembly of the splicing-competent U2 snRNP in vivo. CUS2 also associates with PRP11 that is a subunit of the conserved splicing factor SF3a. Like Tat-SF1, CUS2 contains two RRMs as well. 91
30285 409725 cd12283 RRM1_RBM39_like RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 39 (RBM39) and similar proteins. This subfamily corresponds to the RRM1 of RNA-binding protein 39 (RBM39), RNA-binding protein 23 (RBM23) and similar proteins. RBM39 (also termed HCC1) is a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Although the cellular function of RBM23 remains unclear, it shows high sequence homology to RBM39 and contains two RRMs. It may possibly function as a pre-mRNA splicing factor. 73
30286 409726 cd12284 RRM2_RBM23_RBM39 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein RBM23, RBM39 and similar proteins. This subfamily corresponds to the RRM2 of RBM39 (also termed HCC1), a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Although the cellular function of RBM23 remains unclear, it shows high sequence homology to RBM39 and contains two RRMs. It may possibly function as a pre-mRNA splicing factor. 78
30287 409727 cd12285 RRM3_RBM39_like RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 39 (RBM39) and similar proteins. This subfamily corresponds to the RRM3 of RBM39, also termed hepatocellular carcinoma protein 1, or RNA-binding region-containing protein 2, or splicing factor HCC1, a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Based on the specific domain composition, RBM39 has been classified into a family of non-snRNP (small nuclear ribonucleoprotein) splicing factors that are usually not complexed to snRNAs. 85
30288 409728 cd12286 RRM_Man1 RNA recognition motif (RRM) found in inner nuclear membrane protein Man1 (Man1) and similar proteins. This subfamily corresponds to the RRM of Man1, also termed LEM domain-containing protein 3 (LEMD3), an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. It is part of a protein complex essential for chromatin organization and cell division. It also functions as an important negative regulator for the transforming growth factor (TGF) beta/activin/Nodal signaling pathway by directly interacting with chromatin-associated proteins and transcriptional regulators, including the R-Smads, Smad1, Smad2, and Smad3. Moreover, Man1 is a unique type of left-right (LR) signaling regulator that acts on the inner nuclear membrane. Man1 plays a crucial role in angiogenesis. The vascular remodeling can be regulated at the inner nuclear membrane through the interaction between Man1 and Smads. Man1 contains an N-terminal LEM domain, two putative transmembrane domains, a MAN1-Src1p C-terminal (MSC) domain, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The LEM domain interacts with the DNA and chromatin-binding protein Barrier-to-Autointegration Factor, and is also necessary for efficient localization of MAN1 in the inner nuclear membrane. Research has indicated that C-terminal nucleoplasmic region of Man1 exhibits a DNA binding winged helix domain and is responsible for both DNA- and Smad-binding. 92
30289 409729 cd12287 RRM_U2AF35_like RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor U2AF 35 kDa subunit (U2AF35) and similar proteins. This subfamily corresponds to the RRM in U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF) which has been implicated in the recruitment of U2 snRNP to pre-mRNAs. It is a highly conserved heterodimer composed of large and small subunits; this family includes the small subunit of U2AF (U2AF35 or U2AF1) and U2AF 35 kDa subunit B (U2AF35B or C3H60). U2AF35 directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. It promotes U2 snRNP binding to the branch-point sequences of introns through association with the large subunit of U2AF (U2AF65 or U2AF2). Although the biological role of U2AF35B remains unclear, it shows high sequence homology to U2AF35, which contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR) -rich segment interrupted by glycines. In contrast to U2AF35, U2AF35B has a plant-specific conserved C-terminal region containing SERE motif(s), which may have an important function specific to higher plants. 101
30290 409730 cd12288 RRM_La_like_plant RNA recognition motif (RRM) found in plant proteins related to the La autoantigen. This subfamily corresponds to the RRM of plant La-like proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. Members in this family contain an LAM domain followed by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 90
30291 409731 cd12289 RRM_LARP6 RNA recognition motif (RRM) found in La-related protein 6 (LARP6) and similar proteins. This subfamily corresponds to the RRM of LARP6, also termed Acheron (Achn), a novel member of the lupus antigen (La) family. It is expressed predominantly in neurons and muscle in vertebrates. LARP6 functions as a key regulatory protein that may play a role in mediating a variety of developmental and homeostatic processes in animals, including myogenesis, neurogenesis and possibly metastasis. LARP6 binds to Ca2+/calmodulin-dependent serine protein kinase (CASK), and forms a complex with inhibitor of differentiation transcription factors. It is structurally related to the La autoantigen and contains a La motif (LAM), nuclear localization and export (NLS and NES) signals, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 93
30292 409732 cd12290 RRM1_LARP7 RNA recognition motif 1 (RRM1) found in La-related protein 7 (LARP7) and similar proteins. This subfamily corresponds to the RRM1 of LARP7, also termed La ribonucleoprotein domain family member 7, or P-TEFb-interaction protein for 7SK stability (PIP7S), an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. LARP7 is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP). It intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. It plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. LARP7 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminal region, which mediates binding to the U-rich 3' terminus of 7SK RNA. LARP7 also carries another putative RRM domain at its C-terminus. 79
30293 409733 cd12291 RRM1_La RNA recognition motif 1 in La autoantigen (La or LARP3) and similar proteins. This subfamily corresponds to the RRM1 of La autoantigen, also termed Lupus La protein, or La ribonucleoprotein, or Sjoegren syndrome type B antigen (SS-B), a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. La contains an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also possesses a short basic motif (SBM) and a nuclear localization signal (NLS) at the C-terminus. 73
30294 409734 cd12292 RRM2_La_like RNA recognition motif 2 in La autoantigen (La or SS-B or LARP3), La-related protein 7 (LARP7 or PIP7S) and similar proteins. This subfamily corresponds to the RRM2 of La and LARP7. La is a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. LARP7 is an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. It is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP), intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. LARP7 plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. Both La and LARP7 contain an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 74
30295 410983 cd12293 dRRM_Rrp7p deviant RNA recognition motif (dRRM) in yeast ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. Rrp7p contains a deviant RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a RRP7 domain. The classic RRM fold has a topology of beta1-alpha1-beta2-beta3-alpha2-beta4 with juxtaposed N- and C-termini. By contrast, the N-terminal region of Rrp7 displays a cyclic permutation of RRM topology: the strand equivalent to RRM beta4 is shuffled to the N-terminus of the strand equivalent to RRM beta1. Moreover, Rrp7 has an extra strand beta1, which, together with other four beta-strands, forms an antiparallel five-stranded beta-sheet. 105
30296 409735 cd12294 RRM_Rrp7A RNA recognition motif in ribosomal RNA-processing protein 7 homolog A (Rrp7A) and similar proteins. This subfamily corresponds to the RRM of Rrp7A, also termed gastric cancer antigen Zg14, a homolog of yeast ribosomal RNA-processing protein 7 (Rrp7p), and mainly found in Metazoa. Rrp7p is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. In contrast, the cellular function of Rrp7A remains unclear currently. Rrp7A harbors an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal Rrp7 domain. 103
30297 409736 cd12295 RRM_YRA2 RNA recognition motif in yeast RNA annealing protein YRA2 (Yra2p) and similar proteins. This subfamily corresponds to the RRM of Yra2p, a nonessential nuclear RNA-binding protein encoded by Saccharomyces cerevisiae YRA2 gene. It may share some overlapping functions with Yra1p, and is able to complement a YRA1 deletion when overexpressed in yeast. Yra2p belongs to the evolutionarily conserved REF (RNA and export factor binding proteins) family of hnRNP-like proteins. It is a major component of endogenous Yra1p complexes. It interacts with Yra1p and functions as a negative regulator of Yra1p. Yra2p consists of two highly conserved N- and C-terminal boxes and a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30298 409737 cd12296 RRM1_Prp24 RNA recognition motif 1 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM1 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 71
30299 409738 cd12297 RRM2_Prp24 RNA recognition motif 2 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM2 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 78
30300 409739 cd12298 RRM3_Prp24 RNA recognition motif 3 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM3 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 78
30301 409740 cd12299 RRM4_Prp24 RNA recognition motif 4 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM4 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 71
30302 409741 cd12300 RRM1_PAR14 RNA recognition motif 1 in vertebrate poly [ADP-ribose] polymerase 14 (PARP-14). This subfamily corresponds to the RRM1 of PARP-14, also termed aggressive lymphoma protein 2, a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. It is expressed in B lymphocytes and interacts with the IL-4-induced transcription factor Stat6. It plays a fundamental role in the regulation of IL-4-induced B-cell protection against apoptosis after irradiation or growth factor withdrawal. It mediates IL-4 effects on the levels of gene products that regulate cell survival, proliferation, and lymphomagenesis. PARP-14 acts as a transcriptional switch for Stat6-dependent gene activation. In the presence of IL-4, PARP-14 activates transcription by facilitating the binding of Stat6 to the promoter and release of HDACs from the promoter with an IL-4 signal. In contrast, in the absence of a signal, PARP-14 acts as a transcriptional repressor by recruiting HDACs. Moreover, the absence of PARP-14 protects against Myc-induced developmental block and lymphoma. Thus, PARP-14 may play an important role in Myc-induced oncogenesis. Research indicates that PARP-14 is also a binding partner with phosphoglucose isomerase (PGI)/ autocrine motility factor (AMF). It can inhibit PGI/AMF ubiquitination, thus contributing to its stabilization and secretion. PARP-14 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), three tandem macro domains, and C-terminal region with sequence homology to PARP catalytic domain. 82
30303 409742 cd12301 RRM1_2_PAR10_like RNA recognition motif 1 and 2 in poly [ADP-ribose] polymerase PARP-10, RNA recognition motif 2 in PARP-14, RNA recognition motif in N-myc-interactor (Nmi), interferon-induced 35 kDa protein (IFP 35), RNA-binding protein 43 (RBM43) and similar proteins. This subfamily corresponds to the RRM1 and RRM2 of PARP-10, RRM2 of PARP-14, RRM of N-myc-interactor (Nmi), interferon-induced 35 kDa protein (IFP 35) and RNA-binding protein 43 (RBM43). PARP-10 is a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. It is localized to the nuclear and cytoplasmic compartments. In addition to PARP activity, PARP-10 is also involved in the control of cell proliferation by inhibiting c-Myc- and E1A-mediated cotransformation of primary cells. PARP-10 may also play a role in nuclear processes including the regulation of chromatin, gene transcription, and nuclear/cytoplasmic transport. PARP-10 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two overlapping C-terminal domains composed of a glycine-rich region and a region with homology to catalytic domains of PARP enzymes (PARP domain). In addition, PARP-10 contains two ubiquitin-interacting motifs (UIM). PARP-14, also termed aggressive lymphoma protein 2, is a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. Like PARP-10, PARP-14 also includes two RRMs at the N-terminus. Nmi, also termed N-myc and STAT interactor, is an interferon inducible protein that interacts with c-Myc, N-Myc, Max and c-Fos, and other transcription factors containing bHLH-ZIP, bHLH or ZIP domains. Besides binding Myc proteins, Nmi also associates with all the Stat family of transcription factors except Stat2. In response to cytokine (e.g. IL-2 and IFN-gamma) stimulation, Nmi can enhance Stat-mediated transcriptional activity through recruiting the Stat1 and Stat5 transcriptional coactivators, CREB-binding protein (CBP) and p300. IFP 35 is an interferon-induced leucine zipper protein that can specifically form homodimers. Distinct from known bZIP proteins, IFP 35 lacks a basic domain critical for DNA binding. In addition, IFP 35 may negatively regulate other bZIP transcription factors by protein-protein interaction. For instance, it can form heterodimers with B-ATF, a member of the AP1 transcription factor family. Both Nmi and IFP35 harbor one RRM. RBM43 is a putative RNA-binding protein containing one RRM, but its biological function remains unclear. 74
30304 409743 cd12302 RRM_scSet1p_like RNA recognition motif in budding yeast Saccharomyces cerevisiae SET domain-containing protein 1 (scSet1p) and similar proteins. This subfamily corresponds to the RRM of scSet1p, also termed H3 lysine-4 specific histone-lysine N-methyltransferase, or COMPASS component SET1, or lysine N-methyltransferase 2, which is encoded by SET1 from the yeast S. cerevisiae. It is a nuclear protein that may play a role in both silencing and activating transcription. scSet1p is closely related to the SET domain proteins of multicellular organisms, which are implicated in diverse aspects of cell morphology, growth control, and chromatin-mediated transcriptional silencing. scSet1p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a conserved SET domain that may play a role in DNA repair and telomere function. 110
30305 409744 cd12303 RRM_spSet1p_like RNA recognition motif in fission yeast Schizosaccharomyces pombe SET domain-containing protein 1 (spSet1p) and similar proteins. This subfamily corresponds to the RRM of spSet1p, also termed H3 lysine-4 specific histone-lysine N-methyltransferase, or COMPASS component SET1, or lysine N-methyltransferase 2, or Set1 complex component, which is encoded by SET1 from the fission yeast S. pombe. It is essential for H3 lysine-4 methylation in vivo, and plays an important role in telomere maintenance and DNA repair in an ATM kinase Rad3-dependent pathway. spSet1p is the homologous counterpart of Saccharomyces cerevisiae Set1p (scSet1p). However, it is more closely related to Set1 found in mammals. Moreover, unlike scSet1p, spSet1p is not required for heterochromatin assembly in fission yeast. spSet1p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a conserved SET domain that may play a role in DNA repair and telomere function. 86
30306 409745 cd12304 RRM_Set1 RNA recognition motif in the Set1-like family of histone-lysine N-methyltransferases. This subfamily corresponds to the RRM of the Set1-like family of histone-lysine N-methyltransferases which includes Set1A and Set1B that are ubiquitously expressed vertebrate histone methyltransferases exhibiting high homology to yeast Set1. Set1A and Set1B proteins exhibit a largely non-overlapping subnuclear distribution in euchromatic nuclear speckles, strongly suggesting that they bind to a unique set of target genes and thus make non-redundant contributions to the epigenetic control of chromatin structure and gene expression. With the exception of the catalytic component, the subunit composition of the Set1A and Set1B histone methyltransferase complexes is identical. Each complex contains six human homologs of the yeast Set1/COMPASS complex, including Set1A or Set1B, Ash2 (homologous to yeast Bre2), CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). The genomic targeting of these complexes is determined by the identity of the catalytic subunit present in each histone methyltransferase complex. Thus, the Set1A and Set1B complexes may exhibit both overlapping and non-redundant properties. Both Set1A and Set1B contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. In contrast to Set1B, Set1A additionally contains an HCF-1 binding motif that interacts with HCF-1 in vivo. 93
30307 409746 cd12305 RRM_NELFE RNA recognition motif in negative elongation factor E (NELF-E) and similar proteins. This subfamily corresponds to the RRM of NELF-E, also termed RNA-binding protein RD. NELF-E is the RNA-binding subunit of cellular negative transcription elongation factor NELF (negative elongation factor) involved in transcriptional regulation of HIV-1 by binding to the stem of the viral transactivation-response element (TAR) RNA which is synthesized by cellular RNA polymerase II at the viral long terminal repeat. NELF is a heterotetrameric protein consisting of NELF A, B, C or the splice variant D, and E. NELF-E contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It plays a role in the control of HIV transcription by binding to TAR RNA. In addition, NELF-E is associated with the NELF-B subunit, probably via a leucine zipper motif. 75
30308 409747 cd12306 RRM_II_PABPs RNA recognition motif in type II polyadenylate-binding proteins. This subfamily corresponds to the RRM of type II polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 2 (PABP-2 or PABPN1), embryonic polyadenylate-binding protein 2 (ePABP-2 or PABPN1L) and similar proteins. PABPs are highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. ePABP-2 is predominantly located in the cytoplasm and PABP-2 is located in the nucleus. In contrast to the type I PABPs containing four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), the type II PABPs contain a single highly-conserved RRM. This subfamily also includes Saccharomyces cerevisiae RBP29 (SGN1, YIR001C) gene encoding cytoplasmic mRNA-binding protein Rbp29 that binds preferentially to poly(A). Although not essential for cell viability, Rbp29 plays a role in modulating the expression of cytoplasmic mRNA. Like other type II PABPs, Rbp29 contains one RRM only. 73
30309 409748 cd12307 RRM_NIFK_like RNA recognition motif in nucleolar protein interacting with the FHA domain of pKI-67 (NIFK) and similar proteins. This subgroup corresponds to the RRM of NIFK and Nop15p. NIFK, also termed MKI67 FHA domain-interacting nucleolar phosphoprotein, or nucleolar phosphoprotein Nopp34, is a putative RNA-binding protein interacting with the forkhead associated (FHA) domain of pKi-67 antigen in a mitosis-specific and phosphorylation-dependent manner. It is nucleolar in interphase but associates with condensed mitotic chromosomes. This family also includes Saccharomyces cerevisiae YNL110C gene encoding ribosome biogenesis protein 15 (Nop15p), also termed nucleolar protein 15. Both, NIFK and Nop15p, contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30310 409749 cd12308 RRM1_Spen RNA recognition motif 1 (RRM1) found in the Spen (split end) protein family. This subfamily corresponds to the RRM1 domain in the Spen (split end) family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B), and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also known as one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which share a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 78
30311 240755 cd12309 RRM2_Spen RNA recognition motif 2 (RRM2) found in the Spen (split end) protein family. This subfamily corresponds to the RRM2 domain in the Spen (split end) protein family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B), and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also termed one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which share a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 79
30312 409750 cd12310 RRM3_Spen RNA recognition motif 3 (RRM3) found in the Spen (split end) protein family. This subfamily corresponds to the RRM3 domain in the Spen (split end) protein family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B) and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also termed one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 72
30313 409751 cd12311 RRM_SRSF2_SRSF8 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor SRSF2, SRSF8 and similar proteins. This subfamily corresponds to the RRM of SRSF2 and SRSF8. SRSF2, also termed protein PR264, or splicing component, 35 kDa (splicing factor SC35 or SC-35), is a prototypical SR protein that plays important roles in the alternative splicing of pre-mRNA. It is also involved in transcription elongation by directly or indirectly mediating the recruitment of elongation factors to the C-terminal domain of polymerase II. SRSF2 is exclusively localized in the nucleus and is restricted to nuclear processes. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. The RRM is responsible for the specific recognition of 5'-SSNG-3' (S=C/G) RNA. In the regulation of alternative splicing events, it specifically binds to cis-regulatory elements on the pre-mRNA. The RS domain modulates SRSF2 activity through phosphorylation, directly contacts RNA, and promotes protein-protein interactions with the spliceosome. SRSF8, also termed SRP46 or SFRS2B, is a novel mammalian SR splicing factor encoded by a PR264/SC35 functional retropseudogene. SRSF8 is localized in the nucleus and does not display the same activity as PR264/SC35. It functions as an essential splicing factor in complementing a HeLa cell S100 extract deficient in SR proteins. Like SRSF2, SRSF8 contains a single N-terminal RRM and a C-terminal RS domain. 73
30314 240758 cd12312 RRM_SRSF10_SRSF12 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor SRSF10, SRSF12 and similar proteins. This subfamily corresponds to the RRM of SRSF10 and SRSF12. SRSF10, also termed 40 kDa SR-repressor protein (SRrp40), or FUS-interacting serine-arginine-rich protein 1 (FUSIP1), or splicing factor SRp38, or splicing factor, arginine/serine-rich 13A (SFRS13A), or TLS-associated protein with Ser-Arg repeats (TASR), is a serine-arginine (SR) protein that acts as a potent and general splicing repressor when dephosphorylated. It mediates global inhibition of splicing both in M phase of the cell cycle and in response to heat shock. SRSF10 emerges as a modulator of cholesterol homeostasis through the regulation of low-density lipoprotein receptor (LDLR) splicing efficiency. It also regulates cardiac-specific alternative splicing of triadin pre-mRNA and is required for proper Ca2+ handling during embryonic heart development. In contrast, the phosphorylated SRSF10 functions as a sequence-specific splicing activator in the presence of a nuclear cofactor. It activates distal alternative 5' splice site of adenovirus E1A pre-mRNA in vivo. Moreover, SRSF10 strengthens pre-mRNA recognition by U1 and U2 snRNPs. SRSF10 localizes to the nuclear speckles and can shuttle between nucleus and cytoplasm. SRSF12, also termed 35 kDa SR repressor protein (SRrp35), or splicing factor, arginine/serine-rich 13B (SFRS13B), or splicing factor, arginine/serine-rich 19 (SFRS19), is a serine/arginine (SR) protein-like alternative splicing regulator that antagonizes authentic SR proteins in the modulation of alternative 5' splice site choice. For instance, it activates distal alternative 5' splice site of the adenovirus E1A pre-mRNA in vivo. Both SRSF10 and SRSF12 contain a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 84
30315 409752 cd12313 RRM1_RRM2_RBM5_like RNA recognition motif 1 (RRM1) and 2 (RRM2) found in RNA-binding protein 5 (RBM5) and similar proteins. This subfamily includes the RRM1 and RRM2 of RNA-binding protein 5 (RBM5 or LUCA15 or H37) and RNA-binding protein 10 (RBM10 or S1-1), and the RRM2 of RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). These RBMs share high sequence homology and may play an important role in regulating apoptosis. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM6 has been predicted to be a nuclear factor based on its nuclear localization signal. Both, RBM6 and RBM5, specifically bind poly(G) RNA. RBM10 is a paralog of RBM5. It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. All family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 85
30316 409753 cd12314 RRM1_RBM6 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 6 (RBM6). This subfamily corresponds to the RRM1 of RBM6, also termed lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, which has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both, RBM6 and RBM5, specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has two additional unique domains: the decamer repeat occurring more than 20 times, and the POZ (poxvirus and zinc finger) domain. The POZ domain may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. 78
30317 409754 cd12315 RRM1_RBM19_MRD1 RNA recognition motif 1 (RRM1) found in RNA-binding protein 19 (RBM19), yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subfamily corresponds to the RRM1 of RBM19 and MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 81
30318 409755 cd12316 RRM3_RBM19_RRM2_MRD1 RNA recognition motif 3 (RRM3) found in RNA-binding protein 19 (RBM19) and RNA recognition motif 2 found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM3 of RBM19 and RRM2 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 74
30319 409756 cd12317 RRM4_RBM19_RRM3_MRD1 RNA recognition motif 4 (RRM4) found in RNA-binding protein 19 (RBM19) and RNA recognition motif 3 (RRM3) found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM4 of RBM19 and the RRM3 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well conserved in yeast and its homologues exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 72
30320 409757 cd12318 RRM5_RBM19_like RNA recognition motif 5 (RRM5) found in RNA-binding protein 19 (RBM19 or RBD-1) and similar proteins. This subfamily corresponds to the RRM5 of RBM19 and RRM4 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 80
30321 409758 cd12319 RRM4_MRD1 RNA recognition motif 4 (RRM4) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subfamily corresponds to the RRM4 of MRD1, which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 84
30322 409759 cd12320 RRM6_RBM19_RRM5_MRD1 RNA recognition motif 6 (RRM6) found in RNA-binding protein 19 (RBM19 or RBD-1) and RNA recognition motif 5 (RRM5) found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM6 of RBM19 and RRM5 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 76
30323 409760 cd12321 RRM1_TDP43 RNA recognition motif 1 (RRM1) found in TAR DNA-binding protein 43 (TDP-43) and similar proteins. This subfamily corresponds to the RRM1 of TDP-43 (also termed TARDBP), a ubiquitously expressed pathogenic protein whose normal function and abnormal aggregation are directly linked to the genetic disease cystic fibrosis, and two neurodegenerative disorders: frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). TDP-43 binds both DNA and RNA, and has been implicated in transcriptional repression, pre-mRNA splicing and translational regulation. TDP-43 is a dimeric protein with two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal glycine-rich domain. The RRMs are responsible for DNA and RNA binding; they bind to TAR DNA and RNA sequences with UG-repeats. The glycine-rich domain can interact with the hnRNP family proteins to form the hnRNP-rich complex involved in splicing inhibition. It is also essential for the cystic fibrosis transmembrane conductance regulator (CFTR) exon 9-skipping activity. 74
30324 409761 cd12322 RRM2_TDP43 RNA recognition motif 2 (RRM2) found in TAR DNA-binding protein 43 (TDP-43) and similar proteins. This subfamily corresponds to the RRM2 of TDP-43 (also termed TARDBP), a ubiquitously expressed pathogenic protein whose normal function and abnormal aggregation are directly linked to the genetic disease cystic fibrosis, and two neurodegenerative disorders: frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). TDP-43 binds both DNA and RNA, and has been implicated in transcriptional repression, pre-mRNA splicing and translational regulation. TDP-43 is a dimeric protein with two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal glycine-rich domain. The RRMs are responsible for DNA and RNA binding; they bind to TAR DNA and RNA sequences with UG-repeats. The glycine-rich domain can interact with the hnRNP family proteins to form the hnRNP-rich complex involved in splicing inhibition. It is also essential for the cystic fibrosis transmembrane conductance regulator (CFTR) exon 9-skipping activity. 71
30325 240769 cd12323 RRM2_MSI RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homologs Musashi-1, Musashi-2 and similar proteins. This subfamily corresponds to the RRM2 in Musashi-1 (also termed Msi1), a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. It is evolutionarily conserved from invertebrates to vertebrates. Musashi-1 is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). It has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, Musashi-1 represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-2 (also termed Msi2) has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Both Musashi-1 and Musashi-2 contain two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 74
30326 409762 cd12324 RRM_RBM8 RNA recognition motif (RRM) found in RNA-binding protein RBM8A, RBM8B and similar proteins. This subfamily corresponds to the RRM of RBM8, also termed binder of OVCA1-1 (BOV-1), or RNA-binding protein Y14, which is one of the components of the exon-exon junction complex (EJC). It has two isoforms, RBM8A and RBM8B, both of which are identical except that RBM8B is 16 amino acids shorter at its N-terminus. RBM8, together with other EJC components (such as Magoh, Aly/REF, RNPS1, Srm160, and Upf3), plays critical roles in postsplicing processing, including nuclear export and cytoplasmic localization of the mRNA, and the nonsense-mediated mRNA decay (NMD) surveillance process. RBM8 binds to mRNA 20-24 nucleotides upstream of a spliced exon-exon junction. It is also involved in spliced mRNA nuclear export, and the process of nonsense-mediated decay of mRNAs with premature stop codons. RBM8 forms a specific heterodimer complex with the EJC protein Magoh which then associates with Aly/REF, RNPS1, DEK, and SRm160 on the spliced mRNA, and inhibits ATP turnover by eIF4AIII, thereby trapping the EJC core onto RNA. RBM8 contains an N-terminal putative bipartite nuclear localization signal, one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), in the central region, and a C-terminal serine-arginine rich region (SR domain) and glycine-arginine rich region (RG domain). 88
30327 409763 cd12325 RRM1_hnRNPA_hnRNPD_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP A and hnRNP D subfamilies and similar proteins. This subfamily corresponds to the RRM1 in the hnRNP A subfamily which includes hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). hnRNP A3 is also an RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The hnRNP D subfamily includes hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus, plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this subfamily contain two putative RRMs and a glycine- and tyrosine-rich C-terminus. The family also contains DAZAP1 (Deleted in azoospermia-associated protein 1), RNA-binding protein Musashi homolog Musashi-1, Musashi-2 and similar proteins. They all harbor two RRMs. 72
30328 409764 cd12326 RRM1_hnRNPA0 RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A0 (hnRNP A0) and similar proteins. This subfamily corresponds to the RRM1 of hnRNP A0 which is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 79
30329 409765 cd12327 RRM2_DAZAP1 RNA recognition motif 2 (RRM2) found in Deleted in azoospermia-associated protein 1 (DAZAP1) and similar proteins. This subfamily corresponds to the RRM2 of DAZAP1 or DAZ-associated protein 1, also termed proline-rich RNA binding protein (Prrp), a multi-functional ubiquitous RNA-binding protein expressed most abundantly in the testis and essential for normal cell growth, development, and spermatogenesis. DAZAP1 is a shuttling protein whose acetylated form is predominantly nuclear and whose nonacetylated form is in the cytoplasm. DAZAP1 also functions as a translational regulator that activates translation in an mRNA-specific manner. DAZAP1 was initially identified as a binding partner of Deleted in Azoospermia (DAZ). It also interacts with numerous hnRNPs, including hnRNP U, hnRNP U like-1, hnRNPA1, hnRNPA/B, and hnRNP D, suggesting DAZAP1 might associate and cooperate with hnRNP particles to regulate adenylate-uridylate-rich elements (AU-rich element or ARE)-containing mRNAs. DAZAP1 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal proline-rich domain. 80
30330 409766 cd12328 RRM2_hnRNPA_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A subfamily. This subfamily corresponds to the RRM2 of hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A3 is also a RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 73
30331 240775 cd12329 RRM2_hnRNPD_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. This subfamily corresponds to the RRM2 of hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. It has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this family contain two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 75
30332 409767 cd12330 RRM2_Hrp1p RNA recognition motif 2 (RRM2) found in yeast nuclear polyadenylated RNA-binding protein 4 (Hrp1p or Nab4p) and similar proteins. This subfamily corresponds to the RRM2 of Hrp1p and similar proteins. Hrp1p or Nab4p, also termed cleavage factor IB (CFIB), is a sequence-specific trans-acting factor that is essential for mRNA 3'-end formation in yeast Saccharomyces cerevisiae. It can be UV cross-linked to RNA and specifically recognizes the (UA)6 RNA element required for both the cleavage and poly(A) addition steps. Moreover, Hrp1p can shuttle between the nucleus and the cytoplasm, and play an additional role in the export of mRNAs to the cytoplasm. Hrp1p also interacts with Rna15p and Rna14p, two components of CF1A. In addition, Hrp1p functions as a factor directly involved in modulating the activity of the nonsense-mediated mRNA decay (NMD) pathway; it binds specifically to a downstream sequence element (DSE)-containing RNA and interacts with Upf1p, a component of the surveillance complex, further triggering the NMD pathway. Hrp1p contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an arginine-glycine-rich region harboring repeats of the sequence RGGF/Y. 78
30333 409768 cd12331 RRM_NRD1_SEB1_like RNA recognition motif (RRM) found in Saccharomyces cerevisiae protein Nrd1, Schizosaccharomyces pombe Rpb7-binding protein seb1 and similar proteins. This subfamily corresponds to the RRM of Nrd1 and Seb1. Nrd1 is a novel heterogeneous nuclear ribonucleoprotein (hnRNP)-like RNA-binding protein encoded by gene NRD1 (for nuclear pre-mRNA down-regulation) from yeast S. cerevisiae. It is implicated in 3' end formation of small nucleolar and small nuclear RNAs transcribed by polymerase II, and plays a critical role in pre-mRNA metabolism. Nrd1 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a short arginine-, serine-, and glutamate-rich segment similar to the regions rich in RE and RS dipeptides (RE/RS domains) in many metazoan splicing factors, and a proline- and glutamine-rich C-terminal domain (P+Q domain) similar to domains found in several yeast hnRNPs. Disruption of NRD1 gene is lethal to yeast cells. Its N-terminal domain is sufficient for viability, which may facilitate interactions with RNA polymerase II where Nrd1 may function as an auxiliary factor. By contrast, the RRM, RE/RS domains, and P+Q domain are dispensable. Seb1 is an RNA-binding protein encoded by gene seb1 (for seven binding) from fission yeast S. pombe. It is essential for cell viability and bound directly to Rpb7 subunit of RNA polymerase II. Seb1 is involved in processing of polymerase II transcripts. It also contains one RRM motif and a region rich in arginine-serine dipeptides (RS domain). 79
30334 409769 cd12332 RRM1_p54nrb_like RNA recognition motif 1 (RRM1) found in the p54nrb/PSF/PSP1 family. This subfamily corresponds to the RRM1 of the p54nrb/PSF/PSP1 family, including 54 kDa nuclear RNA- and DNA-binding protein (p54nrb or NonO or NMT55), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF or POMp100), paraspeckle protein 1 (PSP1 or PSPC1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. This subfamily also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, such as the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior, and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. All family members contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction module. PSF has an additional large N-terminal domain that differentiates it from other family members. 71
30335 409770 cd12333 RRM2_p54nrb_like RNA recognition motif 2 (RRM2) found in the p54nrb/PSF/PSP1 family. This subfamily corresponds to the RRM2 of the p54nrb/PSF/PSP1 family, including 54 kDa nuclear RNA- and DNA-binding protein (p54nrb or NonO or NMT55), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF or POMp100), paraspeckle protein 1 (PSP1 or PSPC1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, such as the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. All family members contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction module. PSF has an additional large N-terminal domain that differentiates it from other family members. 80
30336 409771 cd12334 RRM1_SF3B4 RNA recognition motif 1 (RRM1) found in splicing factor 3B subunit 4 (SF3B4) and similar proteins. This subfamily corresponds to the RRM1 of SF3B4, also termed pre-mRNA-splicing factor SF3b 49 kDa (SF3b50), or spliceosome-associated protein 49 (SAP 49). SF3B4 is a component of the multiprotein complex splicing factor 3b (SF3B), an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA, and is involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B4 functions to tether U2 snRNP with pre-mRNA at the branch site during spliceosome assembly. It is an evolutionarily highly conserved protein with orthologs across diverse species. SF3B4 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It binds directly to pre-mRNA and also interacts directly and highly specifically with another SF3B subunit called SAP 145. 74
30337 409772 cd12335 RRM2_SF3B4 RNA recognition motif 2 (RRM2) found in splicing factor 3B subunit 4 (SF3B4) and similar proteins. This subfamily corresponds to the RRM2 of SF3B4, also termed pre-mRNA-splicing factor SF3b 49 kDa (SF3b50), or spliceosome-associated protein 49 (SAP 49). SF3B4 is a component of the multiprotein complex splicing factor 3b (SF3B), an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA, and is involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B4 functions to tether U2 snRNP with pre-mRNA at the branch site during spliceosome assembly. It is an evolutionarily highly conserved protein with orthologs across diverse species. SF3B4 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It binds directly to pre-mRNA and also interacts directly and highly specifically with another SF3B subunit called SAP 145. 83
30338 409773 cd12336 RRM_RBM7_like RNA recognition motif (RRM) found in RNA-binding protein 7 (RBM7) and similar proteins. This subfamily corresponds to the RRM of RBM7, RBM11 and their eukaryotic homologs. RBM7 is a ubiquitously expressed pre-mRNA splicing factor that enhances messenger RNA (mRNA) splicing in a cell-specific manner or in a certain developmental process, such as spermatogenesis. It interacts with splicing factors SAP145 (the spliceosomal splicing factor 3b subunit 2) and SRp20, and may play a more specific role in meiosis entry and progression. Together with additional testis-specific RNA-binding proteins, RBM7 may regulate the splicing of specific pre-mRNA species that are important in the meiotic cell cycle. RBM11 is a novel tissue-specific splicing regulator that is selectively expressed in brain, cerebellum and testis, and to a lower extent in kidney. It is localized in the nucleoplasm and enriched in SRSF2-containing splicing speckles. It may play a role in the modulation of alternative splicing during neuron and germ cell differentiation. Both RBM7 and RBM11 contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. The RRM is responsible for RNA binding, whereas the C-terminal region permits nuclear localization and homodimerization. 75
30339 409774 cd12337 RRM1_SRSF4_like RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 4 (SRSF4) and similar proteins. This subfamily corresponds to the RRM1 in three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS), serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). SRSF4 plays an important role in both constitutive and alternative splicing of many pre-mRNAs. It can shuttle between the nucleus and cytoplasm. SRSF5 regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and essential for enhancer activation. SRSF6 preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. Members in this family contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 70
30340 409775 cd12338 RRM1_SRSF1_like RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM1 in three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 1 (SRSF1 or ASF-1), serine/arginine-rich splicing factor 9 (SRSF9 or SRp30C), and plant pre-mRNA-splicing factor SF2 (SR1). SRSF1 is a shuttling SR protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF9 has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. It can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. Both SRSF1 and SRSF9 contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RS domain rich in serine-arginine dipeptides. In contrast, SF2 contains two N-terminal RRMs and a C-terminal PSK domain rich in proline, serine and lysine residues. 72
30341 409776 cd12339 RRM2_SRSF1_4_like RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor SRSF1, SRSF4 and similar proteins. This subfamily corresponds to the RRM2 of several serine/arginine (SR) proteins that have been classified into two subgroups. The first subgroup consists of serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS) and serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). The second subgroup is composed of serine/arginine-rich splicing factor 1 (SRSF1 or ASF-1), serine/arginine-rich splicing factor 9 (SRSF9 or SRp30C) and plant pre-mRNA-splicing factor SF2 (SR1). These SR proteins are mainly involved in regulating constitutive and alternative pre-mRNA splicing. They also have been implicated in transcription, genomic stability, mRNA export and translation. All SR proteins in this family, except SRSF5, undergo nucleocytoplasmic shuttling, suggesting their widespread roles in gene expression. These SR proteins share a common domain architecture comprising two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. Both domains can directly contact RNA. The RRMs appear to determine the binding specificity and the SR domain also mediates protein-protein interactions. In addition, this subfamily includes the yeast nucleolar protein 3 (Npl3p), also termed mitochondrial targeting suppressor 1 protein, or nuclear polyadenylated RNA-binding protein 1. It is a major yeast RNA-binding protein that competes with 3'-end processing factors, such as Rna15, for binding to the nascent RNA, protecting the transcript from premature termination and coordinating transcription termination and the packaging of the fully processed transcript for export. It specifically recognizes a class of G/U-rich RNAs. Npl3p is a multi-domain protein with two RRMs, separated by a short linker and a C-terminal domain rich in glycine, arginine and serine residues. 70
30342 409777 cd12340 RBD_RRM1_NPL3 RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 3 (Npl3p) and similar proteins. This subfamily corresponds to the RRM1 of Npl3p, also termed mitochondrial targeting suppressor 1 protein, or nuclear polyadenylated RNA-binding protein 1. Npl3p is a major yeast RNA-binding protein that competes with 3'-end processing factors, such as Rna15, for binding to the nascent RNA, protecting the transcript from premature termination and coordinating transcription termination and the packaging of the fully processed transcript for export. It specifically recognizes a class of G/U-rich RNAs. Npl3p is a multi-domain protein containing two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a short linker and a C-terminal domain rich in glycine, arginine and serine residues. 69
30343 409778 cd12341 RRM_hnRNPC_like RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein C (hnRNP C)-related proteins. This subfamily corresponds to the RRM in the hnRNP C-related protein family, including hnRNP C proteins, Raly, and Raly-like protein (RALYL). hnRNP C proteins, C1 and C2, are produced by a single coding sequence. They are the major constituents of the heterogeneous nuclear RNA (hnRNA) ribonucleoprotein (hnRNP) complex in vertebrates. They bind hnRNA tightly, suggesting a central role in the formation of the ubiquitous hnRNP complex; they are involved in the packaging of the hnRNA in the nucleus and in processing of pre-mRNA such as splicing and 3'-end formation. Raly, also termed autoantigen p542, is an RNA-binding protein that may play a critical role in embryonic development. The biological role of RALYL remains unclear. It shows high sequence homology with hnRNP C proteins and Raly. Members of this family are characterized by an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain. The Raly proteins contain a glycine/serine-rich stretch within the C-terminal regions, which is absent in the hnRNP C proteins. Thus, the Raly proteins represent a newly identified class of evolutionarily conserved autoepitopes. 68
30344 240788 cd12342 RRM_Nab3p RNA recognition motif (RRM) found in yeast nuclear polyadenylated RNA-binding protein 3 (Nab3p) and similar proteins. This subfamily corresponds to the RRM of Nab3p, an acidic nuclear polyadenylated RNA-binding protein encoded by Saccharomyces cerevisiae NAB3 gene that is essential for cell viability. Nab3p is predominantly localized within the nucleoplasm and essential for growth in yeast. It may play an important role in packaging pre-mRNAs into ribonucleoprotein structures amenable to efficient nuclear RNA processing. Nab3p contains an N-terminal aspartic/glutamic acid-rich region, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal region rich in glutamine and proline residues. 71
30345 409779 cd12343 RRM1_2_CoAA_like RNA recognition motif 1 (RRM1) and 2 (RRM2) found in RRM-containing coactivator activator/modulator (CoAA) and similar proteins. This subfamily corresponds to the RRM in CoAA (also known as RBM14 or PSP2) and RNA-binding protein 4 (RBM4). CoAA is a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner, and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. RBM4 is a ubiquitously expressed splicing factor with two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may also function as a translational regulator of stress-associated mRNAs as well as play a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RRMs, a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. This family also includes Drosophila RNA-binding protein lark (Dlark), a homolog of human RBM4. It plays an important role in embryonic development and in the circadian regulation of adult eclosion. Dlark shares high sequence similarity with RBM4 at the N-terminal region. However, Dlark has three proline-rich segments instead of three alanine-rich segments within the C-terminal region. 66
30346 409780 cd12344 RRM1_SECp43_like RNA recognition motif 1 (RRM1) found in tRNA selenocysteine-associated protein 1 (SECp43) and similar proteins. This subfamily corresponds to the RRM1 in tRNA selenocysteine-associated protein 1 (SECp43), yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8, and similar proteins. SECp43 is an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. Yeast proteins, NGR1 and NAM8, show high sequence similarity with SECp43. NGR1 is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA). It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains three RRMs, two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the C-terminus which also harbors a methionine-rich region. NAM8 is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. NAM8 also contains three RRMs. 82
30347 409781 cd12345 RRM2_SECp43_like RNA recognition motif 2 (RRM2) found in tRNA selenocysteine-associated protein 1 (SECp43) and similar proteins. This subfamily corresponds to the RRM2 in tRNA selenocysteine-associated protein 1 (SECp43), yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8, and similar proteins. SECp43 is an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. Yeast proteins, NGR1 and NAM8, show high sequence similarity with SECp43. NGR1 is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA). It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains three RRMs, two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the C-terminus which also harbors a methionine-rich region. NAM8 is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. NAM8 also contains three RRMs. 80
30348 409782 cd12346 RRM3_NGR1_NAM8_like RNA recognition motif 3 (RRM3) found in yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8 and similar proteins. This subfamily corresponds to the RRM3 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA) in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The family also includes protein NAM8, which is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 72
30349 409783 cd12347 RRM_PPIE RNA recognition motif (RRM) found in cyclophilin-33 (Cyp33) and similar proteins. This subfamily corresponds to the RRM of Cyp33, also termed peptidyl-prolyl cis-trans isomerase E (PPIase E), or cyclophilin E, or rotamase E. Cyp33 is a nuclear RNA-binding cyclophilin with an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal PPIase domain. Cyp33 possesses RNA-binding activity and preferentially binds to polyribonucleotide polyA and polyU, but hardly to polyG and polyC. It binds specifically to mRNA, which can stimulate its PPIase activity. Moreover, Cyp33 interacts with the third plant homeodomain (PHD3) zinc finger cassette of the mixed lineage leukemia (MLL) proto-oncoprotein and a poly-A RNA sequence through its RRM domain. It further mediates downregulation of the expression of MLL target genes HOXC8, HOXA9, CDKN1B, and C-MYC, in a proline isomerase-dependent manner. Cyp33 also possesses a PPIase activity that catalyzes cis-trans isomerization of the peptide bond preceding a proline, which has been implicated in the stimulation of folding and conformational changes in folded and unfolded proteins. The PPIase activity can be inhibited by the immunosuppressive drug cyclosporin A. 75
30350 409784 cd12348 RRM1_SHARP RNA recognition motif 1 (RRM1) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM1 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 75
30351 409785 cd12349 RRM2_SHARP RNA recognition motif 2 (RRM2) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM2 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 74
30352 409786 cd12350 RRM3_SHARP RNA recognition motif 3 (RRM3) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM3 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 74
30353 409787 cd12351 RRM4_SHARP RNA recognition motif 4 (RRM4) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM4 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 77
30354 409788 cd12352 RRM1_TIA1_like RNA recognition motif 1 (RRM1) found in granule-associated RNA binding proteins p40-TIA-1 and TIAR. This subfamily corresponds to the RRM1 of nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR), both of which are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. TIA-1 and TIAR share high sequence similarity. They are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both TIA-1 and TIAR bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains. 73
30355 409789 cd12353 RRM2_TIA1_like RNA recognition motif 2 (RRM2) found in granule-associated RNA binding proteins p40-TIA-1 and TIAR. This subfamily corresponds to the RRM2 of nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR), both of which are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. TIA-1 and TIAR share high sequence similarity. They are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both TIA-1 and TIAR bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains. 75
30356 409790 cd12354 RRM3_TIA1_like RNA recognition motif 3 (RRM3) found in granule-associated RNA binding proteins (p40-TIA-1 and TIAR), and yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1. This subfamily corresponds to the RRM3 of TIA-1, TIAR, and PUB1. Nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR) are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. They share high sequence similarity and are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both TIA-1 and TIAR bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains. This subfamily also includes a yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1, termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein, which has been identified as both a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP). 
It may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. PUB1 is distributed in both the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single-stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RRMs, and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 71
30357 409791 cd12355 RRM_RBM18 RNA recognition motif (RRM) found in eukaryotic RNA-binding protein 18 and similar proteins. This subfamily corresponds to the RRM of RBM18, a putative RNA-binding protein containing a well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The biological role of RBM18 remains unclear. 80
30358 409792 cd12356 RRM_PPARGC1B RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator 1-beta (PGC-1-beta) and similar proteins. This subfamily corresponds to the RRM of PGC-1beta, also termed PPAR-gamma coactivator 1-beta, or PPARGC-1-beta, or PGC-1-related estrogen receptor alpha coactivator, which is a member of the PGC-1 family of transcriptional coactivators, which also includes PGC-1alpha and PGC-1-related coactivator (PRC). PGC-1beta plays a nonredundant role in controlling mitochondrial oxidative energy metabolism and affects both insulin sensitivity and mitochondrial biogenesis, and functions in a number of oxidative tissues. It is involved in maintaining baseline mitochondrial function and cardiac contractile function following pressure overload hypertrophy by preserving glucose metabolism and preventing oxidative stress. PGC-1beta induces hypertriglyceridemia in response to dietary fats through activating hepatic lipogenesis and lipoprotein secretion. It can stimulate apolipoprotein C3 (APOC3) expression, further mediating the hypolipidemic effect of nicotinic acid. PGC-1beta also drives nuclear respiratory factor 1 (NRF-1) target gene expression and NRF-1 and estrogen related receptor alpha (ERRalpha)-dependent mitochondrial biogenesis. The modulation of the expression of PGC-1beta can trigger ERRalpha-induced adipogenesis. PGC-1beta is also a potent regulator inducing angiogenesis in skeletal muscle. The transcriptional activity of PGC-1beta can be increased through binding to host cell factor (HCF), a cellular protein involved in herpes simplex virus (HSV) infection and cell cycle regulation. PGC-1beta is a multi-domain protein containing an N-terminal activation domain, an LXXLL coactivator signature, a tetrapeptide motif (DHDY) responsible for HCF binding, two glutamic/aspartic acid-rich acidic domains, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 
In contrast to PGC-1alpha, PGC-1beta lacks most of the arginine/serine (SR)-rich domain that is responsible for the regulation of RNA processing. 97
30359 409793 cd12357 RRM_PPARGC1A_like RNA recognition motif (RRM) found in the peroxisome proliferator-activated receptor gamma coactivator 1A (PGC-1alpha) family of regulated coactivators. This subfamily corresponds to the RRM of PGC-1alpha, PGC-1beta, and PGC-1-related coactivator (PRC), which serve as mediators between environmental or endogenous signals and the transcriptional machinery governing mitochondrial biogenesis. They play an important integrative role in the control of respiratory gene expression through interacting with a number of transcription factors, such as NRF-1, NRF-2, ERR, CREB and YY1. All family members are multi-domain proteins containing the N-terminal activation domain, an LXXLL coactivator signature, a tetrapeptide motif (DHDY) responsible for HCF binding, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In contrast to PGC-1alpha and PRC, PGC-1beta possesses two glutamic/aspartic acid-rich acidic domains, but lacks most of the arginine/serine (SR)-rich domain that is responsible for the regulation of RNA processing. 91
30360 240804 cd12358 RRM1_VICKZ RNA recognition motif 1 (RRM1) found in the VICKZ family proteins. This subfamily corresponds to the RRM1 of IGF-II mRNA-binding proteins (IGF2BPs or IMPs) in the VICKZ family that have been implicated in the post-transcriptional regulation of several different RNAs and in subcytoplasmic localization of mRNAs during embryogenesis. IGF2BPs are composed of two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and four hnRNP K homology (KH) domains. 73
30361 409794 cd12359 RRM2_VICKZ RNA recognition motif 2 (RRM2) found in the VICKZ family proteins. This subfamily corresponds to the RRM2 of IGF-II mRNA-binding proteins (IGF2BPs or IMPs) in the VICKZ family that have been implicated in the post-transcriptional regulation of several different RNAs and in subcytoplasmic localization of mRNAs during embryogenesis. IGF2BPs are composed of two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and four hnRNP K homology (KH) domains. 76
30362 409795 cd12360 RRM_cwf2 RNA recognition motif (RRM) found in yeast pre-mRNA-splicing factor Cwc2 and similar proteins. This subfamily corresponds to the RRM of yeast protein Cwc2, also termed Complexed with CEF1 protein 2, or PRP19-associated complex protein 40 (Ntc40), or synthetic lethal with CLF1 protein 3, one of the components of the Prp19-associated complex [nineteen complex (NTC)] that can bind to RNA. NTC is composed of the scaffold protein Prp19 and a number of associated splicing factors, and plays a crucial role in intron removal during pre-mRNA splicing in eukaryotes. Cwc2 functions as an RNA-binding protein that can bind both small nuclear RNAs (snRNAs) and pre-mRNA in vitro. It interacts directly with the U6 snRNA to link the NTC to the spliceosome during pre-mRNA splicing. In the N-terminal half, Cwc2 contains a CCCH-type zinc finger (ZnF domain), an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an intervening loop, also termed RNA-binding loop or RB loop, between ZnF and RRM, all of which are necessary and sufficient for RNA binding. The ZnF is also responsible for mediating protein-protein interaction. The C-terminal flexible region of Cwc2 interacts with the WD40 domain of Prp19. 79
30363 409796 cd12361 RRM1_2_CELF1-6_like RNA recognition motif 1 (RRM1) and 2 (RRM2) found in CELF/Bruno-like family of RNA binding proteins and plant flowering time control protein FCA. This subfamily corresponds to the RRM1 and RRM2 domains of the CUGBP1 and ETR-3-like factors (CELF) as well as plant flowering time control protein FCA. CELF, also termed BRUNOL (Bruno-like) proteins, is a family of structurally related RNA-binding proteins involved in regulation of pre-mRNA splicing in the nucleus, and control of mRNA translation and deadenylation in the cytoplasm. The family contains six members: CELF-1 (also known as BRUNOL-2, CUG-BP1, NAPOR, EDEN-BP), CELF-2 (also known as BRUNOL-3, ETR-3, CUG-BP2, NAPOR-2), CELF-3 (also known as BRUNOL-1, TNRC4, ETR-1, CAGH4, ER DA4), CELF-4 (BRUNOL-4), CELF-5 (BRUNOL-5) and CELF-6 (BRUNOL-6). They all contain three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The low sequence conservation of the linker region is highly suggestive of a large variety in the co-factors that associate with the various CELF family members. Based on both sequence similarity and function, the CELF family can be divided into two subfamilies, the first containing CELFs 1 and 2, and the second containing CELFs 3, 4, 5, and 6. The different CELF proteins may act through different sites on at least some substrates. Furthermore, CELF proteins may interact with each other in varying combinations to influence alternative splicing in different contexts. This subfamily also includes plant flowering time control protein FCA that functions in the posttranscriptional regulation of transcripts involved in the flowering process. FCA contains two RRMs, and a WW protein interaction domain. 77
30364 409797 cd12362 RRM3_CELF1-6 RNA recognition motif 3 (RRM3) found in CELF/Bruno-like family of RNA binding proteins CELF1, CELF2, CELF3, CELF4, CELF5, CELF6 and similar proteins. This subgroup corresponds to the RRM3 of the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) proteins, a family of structurally related RNA-binding proteins involved in the regulation of pre-mRNA splicing in the nucleus and in the control of mRNA translation and deadenylation in the cytoplasm. The family contains six members: CELF-1 (also termed BRUNOL-2, or CUG-BP1, or NAPOR, or EDEN-BP), CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR-2), CELF-3 (also termed BRUNOL-1, or TNRC4, or ETR-1, or CAGH4, or ER DA4), CELF-4 (also termed BRUNOL-4), CELF-5 (also termed BRUNOL-5), CELF-6 (also termed BRUNOL-6). They all contain three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The low sequence conservation of the linker region is highly suggestive of a large variety in the co-factors that associate with the various CELF family members. Based on both sequence similarity and function, the CELF family can be divided into two subfamilies, the first containing CELFs 1 and 2, and the second containing CELFs 3, 4, 5, and 6. The different CELF proteins may act through different sites on at least some substrates. Furthermore, CELF proteins may interact with each other in varying combinations to influence alternative splicing in different contexts. 73
30365 409798 cd12363 RRM_TRA2 RNA recognition motif (RRM) found in transformer-2 protein homolog TRA2-alpha, TRA2-beta and similar proteins. This subfamily corresponds to the RRM of two mammalian homologs of Drosophila transformer-2 (Tra2), TRA2-alpha, TRA2-beta (also termed SFRS10), and similar proteins found in eukaryotes. TRA2-alpha is a 40-kDa serine/arginine-rich (SR) protein that specifically binds to gonadotropin-releasing hormone (GnRH) exonic splicing enhancer on exon 4 (ESE4) and is necessary for enhanced GnRH pre-mRNA splicing. It strongly stimulates GnRH intron A excision in a dose-dependent manner. In addition, TRA2-alpha can interact with either 9G8 or SRp30c, which may also be crucial for ESE-dependent GnRH pre-mRNA splicing. TRA2-beta is a serine/arginine-rich (SR) protein that controls the pre-mRNA alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP), the survival motor neuron 1 (SMN1) protein and the tau protein. Both TRA2-alpha and TRA2-beta contain a well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. 80
30366 409799 cd12364 RRM_RDM1 RNA recognition motif (RRM) found in RAD52 motif-containing protein 1 (RDM1) and similar proteins. This subfamily corresponds to the RRM of RDM1, also termed RAD52 homolog B, a novel factor involved in the cellular response to the anti-cancer drug cisplatin in vertebrates. RDM1 contains a small RD motif that it shares with the recombination and repair protein RAD52, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The RD motif is responsible for the acidic pH-dependent DNA-binding properties of RDM1. It interacts with ss- and dsDNA, and may act as a DNA-damage recognition factor by recognizing the distortions of the double helix caused by cisplatin-DNA adducts in vitro. In addition, due to the presence of the RRM, RDM1 can bind to RNA as well as DNA. 81
30367 409800 cd12365 RRM_RNPS1 RNA recognition motif (RRM) found in RNA-binding protein with serine-rich domain 1 (RNPS1) and similar proteins. This subfamily corresponds to the RRM of RNPS1 and its eukaryotic homologs. RNPS1, also termed RNA-binding protein prevalent during the S phase, or SR-related protein LDC2, was originally characterized as a general pre-mRNA splicing activator, which activates both constitutive and alternative splicing of pre-mRNA in vitro. It has been identified as a protein component of the splicing-dependent mRNP complex, or exon-exon junction complex (EJC), and is directly involved in mRNA surveillance. Furthermore, RNPS1 is a splicing regulator whose activator function is controlled in part by CK2 (casein kinase II) protein kinase phosphorylation. It can also function as a squamous-cell carcinoma antigen recognized by T cells-3 (SART3)-binding protein, and is involved in the regulation of mRNA splicing. RNPS1 contains an N-terminal serine-rich (S) domain, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and the C-terminal arginine/serine/proline-rich (RS/P) domain. 73
30368 409801 cd12366 RRM1_RBM45 RNA recognition motif 1 (RRM1) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM1 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which is expressed under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 81
30369 409802 cd12367 RRM2_RBM45 RNA recognition motif 2 (RRM2) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM2 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which is expressed under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 74
30370 409803 cd12368 RRM3_RBM45 RNA recognition motif 3 (RRM3) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM3 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which is expressed under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 75
30371 409804 cd12369 RRM4_RBM45 RNA recognition motif 4 (RRM4) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM4 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which is expressed under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 68
30372 409805 cd12370 RRM1_PUF60 RNA recognition motif 1 (RRM1) found in poly(U)-binding-splicing factor PUF60 and similar proteins. This subfamily corresponds to the RRM1 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1). PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. Research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 76
30373 409806 cd12371 RRM2_PUF60 RNA recognition motif 2 (RRM2) found in poly(U)-binding-splicing factor PUF60 and similar proteins. This subfamily corresponds to the RRM2 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1). PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. Research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 77
30374 409807 cd12372 RRM_CFIm68_CFIm59 RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 68 kDa subunit (CFIm68 or CPSF6), pre-mRNA cleavage factor Im 59 kDa subunit (CFIm59 or CPSF7), and similar proteins. This subfamily corresponds to the RRM of cleavage factor Im (CFIm) subunits. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. Structurally related CFIm68 and CFIm59, also termed cleavage and polyadenylation specificity factor subunit 7 (CPSF7), or cleavage and polyadenylation specificity factor 59 kDa subunit (CPSF59), are functionally redundant. Both contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. Their N-terminal RRM mediates the interaction with CFIm25, and also serves to enhance RNA binding and facilitate RNA looping. 76
30375 409808 cd12373 RRM_SRSF3_like RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 3 (SRSF3) and similar proteins. This subfamily corresponds to the RRM of two serine/arginine (SR) proteins, serine/arginine-rich splicing factor 3 (SRSF3) and serine/arginine-rich splicing factor 7 (SRSF7). SRSF3, also termed pre-mRNA-splicing factor SRp20, modulates alternative splicing by interacting with RNA cis-elements in a concentration- and cell differentiation-dependent manner. It is also involved in termination of transcription, alternative RNA polyadenylation, RNA export, and protein translation. SRSF3 is critical for cell proliferation, and tumor induction and maintenance. It can shuttle between the nucleus and cytoplasm. SRSF7, also termed splicing factor 9G8, plays a crucial role in both constitutive splicing and alternative splicing of many pre-mRNAs. Its localization and functions are tightly regulated by phosphorylation. SRSF7 is predominantly present in the nucleus and can shuttle between nucleus and cytoplasm. It cooperates with the export protein, Tap/NXF1, helps mRNA export to the cytoplasm, and enhances the expression of unspliced mRNA. Moreover, SRSF7 inhibits tau E10 inclusion through directly interacting with the proximal downstream intron of E10, a clustering region for frontotemporal dementia with Parkinsonism (FTDP) mutations. Both SRSF3 and SRSF7 contain a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 73
30376 409809 cd12374 RRM_UHM_SPF45_PUF60 RNA recognition motif (RRM) found in UHM domain of 45 kDa-splicing factor (SPF45) and similar proteins. This subfamily corresponds to the RRM found in UHM domain of 45 kDa-splicing factor (SPF45 or RBM17), poly(U)-binding-splicing factor PUF60 (FIR or Hfp or RoBP1 or Siah-BP1), and similar proteins. SPF45 is an RNA-binding protein consisting of an unstructured N-terminal region, followed by a G-patch motif and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and an Arg-Xaa-Phe sequence motif. SPF45 regulates alternative splicing of the apoptosis regulatory gene FAS (also known as CD95). It induces exon 6 skipping in FAS pre-mRNA through the UHM domain that binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) present in the 3' splice site-recognizing factors U2AF65, SF1 and SF3b155. PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RRMs and a C-terminal UHM domain. 85
30377 409810 cd12375 RRM1_Hu_like RNA recognition motif 1 (RRM1) found in the Hu proteins family, Drosophila sex-lethal (SXL), and similar proteins. This subfamily corresponds to the RRM1 of Hu proteins and SXL. The Hu proteins family represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. 
RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. This family also includes the sex-lethal protein (SXL) from Drosophila melanogaster. SXL governs sexual differentiation and X chromosome dosage compensation in flies. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds to its own pre-mRNA and promotes female-specific alternative splicing. It contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RRMs that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 76
30378 240822 cd12376 RRM2_Hu_like RNA recognition motif 2 (RRM2) found in the Hu proteins family, Drosophila sex-lethal (SXL), and similar proteins. This subfamily corresponds to the RRM2 of Hu proteins and SXL. The Hu proteins family represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. 
RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. Also included in this subfamily is the sex-lethal protein (SXL) from Drosophila melanogaster. SXL governs sexual differentiation and X chromosome dosage compensation in flies. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL also binds to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RRMs that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 79
30379 409811 cd12377 RRM3_Hu RNA recognition motif 3 (RRM3) found in the Hu proteins family. This subfamily corresponds to the RRM3 of the Hu proteins family, which represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. 
RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 76
30380 409812 cd12378 RRM1_I_PABPs RNA recognition motif 1 (RRM1) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM1 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. 
PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 80
30381 409813 cd12379 RRM2_I_PABPs RNA recognition motif 2 (RRM2) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM2 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. 
It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 77
30382 409814 cd12380 RRM3_I_PABPs RNA recognition motif 3 (RRM3) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM3 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. 
It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 80
30383 409815 cd12381 RRM4_I_PABPs RNA recognition motif 4 (RRM4) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM4 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The CD corresponds to the RRM. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. 
It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 79
30384 409816 cd12382 RRM_RBMX_like RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein G (hnRNP G), Y chromosome RNA recognition motif 1 (hRBMY), testis-specific heterogeneous nuclear ribonucleoprotein G-T (hnRNP G-T) and similar proteins. This subfamily corresponds to the RRM domain of hnRNP G, also termed glycoprotein p43 or RBMX, an RNA-binding motif protein located on the X chromosome. It is expressed ubiquitously and has been implicated in the splicing control of several pre-mRNAs. Moreover, hnRNP G may function as a regulator of transcription for SREBP-1c and GnRH1. Research has shown that hnRNP G may also act as a tumor-suppressor since it upregulates the Txnip gene and promotes the fidelity of DNA end-joining activity. In addition, hnRNP G appears to play a critical role in proper neural development of zebrafish and frog embryos. The family also includes several paralogs of hnRNP G, such as hRBMY and hnRNP G-T (also termed RNA-binding motif protein, X-linked-like-2). Both hRBMY and hnRNP G-T are exclusively expressed in testis and critical for male fertility. Like hnRNP G, hRBMY and hnRNP G-T interact with factors implicated in the regulation of pre-mRNA splicing, such as hTra2-beta1 and T-STAR. Although members in this family share a highly conserved N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), they appear to recognize different RNA targets. For instance, hRBMY interacts specifically with a stem-loop structure in which the loop is formed by the sequence CA/UCAA. In contrast, hnRNP G associates with single stranded RNA sequences containing a CCA/C motif. In addition to the RRM, hnRNP G contains a nascent transcripts targeting domain (NTD) in the middle region and a novel auxiliary RNA-binding domain (RBD) in its C-terminal region. 
The C-terminal RBD exhibits distinct RNA binding specificity, and would play a critical role in the regulation of alternative splicing by hnRNP G. 80
30385 409817 cd12383 RRM_RBM42 RNA recognition motif (RRM) found in RNA-binding protein 42 (RBM42) and similar proteins. This subfamily corresponds to the RRM of RBM42 which has been identified as a heterogeneous nuclear ribonucleoprotein K (hnRNP K)-binding protein. It also directly binds the 3' untranslated region of p21 mRNA that is one of the target mRNAs for hnRNP K. Both, hnRNP K and RBM42, are components of stress granules (SGs). Under nonstress conditions, RBM42 predominantly localizes within the nucleus and co-localizes with hnRNP K. Under stress conditions, hnRNP K and RBM42 form cytoplasmic foci where the SG marker TIAR localizes, and may play a role in the maintenance of cellular ATP level by protecting their target mRNAs. RBM42 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 83
30386 409818 cd12384 RRM_RBM24_RBM38_like RNA recognition motif (RRM) found in eukaryotic RNA-binding protein RBM24, RBM38 and similar proteins. This subfamily corresponds to the RRM of RBM24 and RBM38 from vertebrates, SUPpressor family member SUP-12 from Caenorhabditis elegans and similar proteins. Both RBM24 and RBM38 are preferentially expressed in cardiac and skeletal muscle tissues. They regulate myogenic differentiation by controlling the cell cycle in a p21-dependent or -independent manner. RBM24, also termed RNA-binding region-containing protein 6, interacts with the 3'-untranslated region (UTR) of myogenin mRNA and regulates its stability in C2C12 cells. RBM38, also termed CLL-associated antigen KW-5, or HSRNASEB, or RNA-binding region-containing protein 1 (RNPC1), or ssDNA-binding protein SEB4, is a direct target of the p53 family. It is required for maintaining the stability of the basal and stress-induced p21 mRNA by binding to their 3'-UTRs. It also binds the AU-/U-rich elements in p63 3'-UTR and regulates p63 mRNA stability and activity. SUP-12 is a novel tissue-specific splicing factor that controls muscle-specific splicing of the ADF/cofilin pre-mRNA in C. elegans. All family members contain a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30387 409819 cd12385 RRM1_hnRNPM_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM1 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 76
30388 409820 cd12386 RRM2_hnRNPM_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM2 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. It functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 74
30389 409821 cd12387 RRM3_hnRNPM_like RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM3 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 71
30390 409822 cd12388 RRM1_RAVER RNA recognition motif 1 (RRM1) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM1 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 70
30391 409823 cd12389 RRM2_RAVER RNA recognition motif 2 (RRM2) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM2 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 77
30392 409824 cd12390 RRM3_RAVER RNA recognition motif 3 (RRM3) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM3 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 91
30393 409825 cd12391 RRM1_SART3 RNA recognition motif 1 (RRM1) found in squamous cell carcinoma antigen recognized by T-cells 3 (SART3) and similar proteins. This subfamily corresponds to the RRM1 of SART3, also termed Tat-interacting protein of 110 kDa (Tip110), an RNA-binding protein expressed in the nucleus of the majority of proliferating cells, including normal cells and malignant cells, but not in normal tissues except for the testes and fetal liver. It is involved in the regulation of mRNA splicing probably via its complex formation with RNA-binding protein with a serine-rich domain (RNPS1), a pre-mRNA-splicing factor. SART3 has also been identified as a nuclear Tat-interacting protein that regulates Tat transactivation activity through direct interaction and functions as an important cellular factor for HIV-1 gene expression and viral replication. In addition, SART3 is required for U6 snRNP targeting to Cajal bodies. It binds specifically and directly to the U6 snRNA, interacts transiently with the U6 and U4/U6 snRNPs, and promotes the reassembly of U4/U6 snRNPs after splicing in vitro. SART3 contains an N-terminal half-a-tetratricopeptide repeat (HAT)-rich domain, a nuclear localization signal (NLS) domain, and two C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 72
30394 409826 cd12392 RRM2_SART3 RNA recognition motif 2 (RRM2) found in squamous cell carcinoma antigen recognized by T-cells 3 (SART3) and similar proteins. This subfamily corresponds to the RRM2 of SART3, also termed Tat-interacting protein of 110 kDa (Tip110), an RNA-binding protein expressed in the nucleus of the majority of proliferating cells, including normal cells and malignant cells, but not in normal tissues except for the testes and fetal liver. It is involved in the regulation of mRNA splicing probably via its complex formation with RNA-binding protein with a serine-rich domain (RNPS1), a pre-mRNA-splicing factor. SART3 has also been identified as a nuclear Tat-interacting protein that regulates Tat transactivation activity through direct interaction and functions as an important cellular factor for HIV-1 gene expression and viral replication. In addition, SART3 is required for U6 snRNP targeting to Cajal bodies. It binds specifically and directly to the U6 snRNA, interacts transiently with the U6 and U4/U6 snRNPs, and promotes the reassembly of U4/U6 snRNPs after splicing in vitro. SART3 contains an N-terminal half-a-tetratricopeptide repeat (HAT)-rich domain, a nuclear localization signal (NLS) domain, and two C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 81
30395 409827 cd12393 RRM_ZCRB1 RNA recognition motif (RRM) found in Zinc finger CCHC-type and RNA-binding motif-containing protein 1 (ZCRB1) and similar proteins. This subfamily corresponds to the RRM of ZCRB1, also termed MADP-1, or U11/U12 small nuclear ribonucleoprotein 31 kDa protein (U11/U12 snRNP 31 or U11/U12-31K), a novel multi-functional nuclear factor, which may be involved in morphine dependence, cold/heat stress, and hepatocarcinoma. It is located in the nucleoplasm, but outside the nucleolus. ZCRB1 is one of the components of U11/U12 snRNPs that bind to U12-type pre-mRNAs and form a di-snRNP complex, simultaneously recognizing the 5' splice site and branchpoint sequence. ZCRB1 is characterized by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a CCHC-type Zinc finger motif. In addition, it contains core nucleocapsid motifs, and Lys- and Glu-rich domains. 76
30396 409828 cd12394 RRM1_RBM34 RNA recognition motif 1 (RRM1) found in RNA-binding protein 34 (RBM34) and similar proteins. This subfamily corresponds to the RRM1 of RBM34, a putative RNA-binding protein containing two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Although the function of RBM34 remains unclear currently, its RRM domains may participate in mRNA processing. RBM34 may act as an mRNA processing-related protein. 91
30397 409829 cd12395 RRM2_RBM34 RNA recognition motif 2 (RRM2) found in RNA-binding protein 34 (RBM34) and similar proteins. This subfamily corresponds to the RRM2 of RBM34, a putative RNA-binding protein containing two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Although the function of RBM34 remains unclear currently, its RRM domains may participate in mRNA processing. RBM34 may act as an mRNA processing-related protein. 73
30398 409830 cd12396 RRM1_Nop13p_fungi RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 13 (Nop13p) and similar proteins. This subfamily corresponds to the RRM1 of Nop13p encoded by YNL175c from Saccharomyces cerevisiae. It shares high sequence similarity with nucleolar protein 12 (Nop12p). Both Nop12p and Nop13p are not essential for growth. However, unlike Nop12p that is localized to the nucleolus, Nop13p localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent. Nop13p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 85
30399 409831 cd12397 RRM2_Nop13p_fungi RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 13 (Nop13p) and similar proteins. This subfamily corresponds to the RRM2 of Nop13p encoded by YNL175c from Saccharomyces cerevisiae. It shares high sequence similarity with nucleolar protein 12 (Nop12p). Both Nop12p and Nop13p are not essential for growth. However, unlike Nop12p that is localized to the nucleolus, Nop13p localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent. Nop13p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 76
30400 409832 cd12398 RRM_CSTF2_RNA15_like RNA recognition motif (RRM) found in cleavage stimulation factor subunit 2 (CSTF2), yeast ortholog mRNA 3'-end-processing protein RNA15 and similar proteins. This subfamily corresponds to the RRM domain of CSTF2, its tau variant and eukaryotic homologs. CSTF2, also termed cleavage stimulation factor 64 kDa subunit (CstF64), is the vertebrate counterpart of yeast mRNA 3'-end-processing protein RNA15. It is expressed in all somatic tissues and is one of three cleavage stimulatory factor (CstF) subunits required for polyadenylation. CstF64 contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a CstF77-binding domain, a repeated MEARA helical region and a conserved C-terminal domain reported to bind the transcription factor PC-4. During polyadenylation, CstF interacts with the pre-mRNA through the RRM of CstF64 at U- or GU-rich sequences within 10 to 30 nucleotides downstream of the cleavage site. CSTF2T, also termed tauCstF64, is a paralog of the X-linked cleavage stimulation factor CstF64 protein that supports polyadenylation in most somatic cells. It is expressed during meiosis and subsequent haploid differentiation in a more limited set of tissues and cell types, largely in meiotic and postmeiotic male germ cells, and to a lesser extent in brain. The loss of CSTF2T will cause male infertility, as it is necessary for spermatogenesis and fertilization. Moreover, CSTF2T is required for expression of genes involved in morphological differentiation of spermatids, as well as for genes having products that function during interaction of motile spermatozoa with eggs. It promotes germ cell-specific patterns of polyadenylation by using its RRM to bind to different sequence elements downstream of polyadenylation sites than does CstF64. The family also includes yeast ortholog mRNA 3'-end-processing protein RNA15 and similar proteins. 
RNA15 is a core subunit of cleavage factor IA (CFIA), an essential transcriptional 3'-end processing factor from Saccharomyces cerevisiae. RNA recognition by CFIA is mediated by an N-terminal RRM, which is contained in the RNA15 subunit of the complex. The RRM of RNA15 has a strong preference for GU-rich RNAs, mediated by a binding pocket that is entirely conserved in both yeast and vertebrate RNA15 orthologs. 77
30401 409833 cd12399 RRM_HP0827_like RNA recognition motif (RRM) found in Helicobacter pylori HP0827 protein and similar proteins. This subfamily corresponds to the RRM of H. pylori HP0827, a putative ssDNA-binding protein 12rnp2 precursor, containing one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The ssDNA binding may be important in activation of HP0827. 75
30402 409834 cd12400 RRM_Nop6 RNA recognition motif (RRM) found in Saccharomyces cerevisiae nucleolar protein 6 (Nop6) and similar proteins. This subfamily corresponds to the RRM of Nop6, also known as Ydl213c, a component of 90S pre-ribosomal particles in yeast S. cerevisiae. It is enriched in the nucleolus and is required for 40S ribosomal subunit biogenesis. Nop6 is a non-essential putative RNA-binding protein with two N-terminal putative nuclear localisation sequences (NLS-1 and NLS-2) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It binds to the pre-rRNA early during transcription and plays an essential role in pre-rRNA processing. 74
30403 409835 cd12401 RRM_eIF4H RNA recognition motif (RRM) found in eukaryotic translation initiation factor 4H (eIF-4H) and similar proteins. This subfamily corresponds to the RRM of eIF-4H, also termed Williams-Beuren syndrome chromosomal region 1 protein, which, together with eIF-4B/eIF-4G, serves as the accessory protein of RNA helicase eIF-4A. eIF-4H contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It stimulates protein synthesis by enhancing the helicase activity of eIF-4A in the initiation step of mRNA translation. 84
30404 409836 cd12402 RRM_eIF4B RNA recognition motif (RRM) found in eukaryotic translation initiation factor 4B (eIF-4B) and similar proteins. This subfamily corresponds to the RRM of eIF-4B, a multi-domain RNA-binding protein that has been primarily implicated in promoting the binding of 40S ribosomal subunits to mRNA during translation initiation. It contains two RNA-binding domains; the N-terminal well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), binds the 18S rRNA of the 40S ribosomal subunit and the C-terminal basic domain (BD), including two arginine-rich motifs (ARMs), binds mRNA during initiation, and is primarily responsible for the stimulation of the helicase activity of eIF-4A. eIF-4B also contains a DRYG domain (a region rich in Asp, Arg, Tyr, and Gly amino acids) in the middle, which is responsible for both self-association of eIF-4B and binding to the p170 subunit of eIF3. Additional research indicates that eIF-4B can interact with the poly(A) binding protein (PABP) in mammalian cells, which can stimulate both the eIF-4B-mediated activation of the helicase activity of eIF-4A and binding of poly(A) by PABP. eIF-4B has also been shown to interact specifically with the internal ribosome entry sites (IRES) of several picornaviruses which facilitate cap-independent translation initiation. 81
30405 409837 cd12403 RRM1_NCL RNA recognition motif 1 (RRM1) found in vertebrate nucleolin. This subfamily corresponds to the RRM1 of ubiquitously expressed protein nucleolin, also termed protein C23. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop. 75
30406 409838 cd12404 RRM2_NCL RNA recognition motif 2 (RRM2) found in vertebrate nucleolin. This subfamily corresponds to the RRM2 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM2, together with RRM1, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop. 77
30407 409839 cd12405 RRM3_NCL RNA recognition motif 3 (RRM3) found in vertebrate nucleolin. This subfamily corresponds to the RRM3 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. 72
30408 409840 cd12406 RRM4_NCL RNA recognition motif 4 (RRM4) found in vertebrate nucleolin. This subfamily corresponds to the RRM4 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. 78
30409 409841 cd12407 RRM_FOX1_like RNA recognition motif (RRM) found in vertebrate RNA binding protein fox-1 homologs and similar proteins. This subfamily corresponds to the RRM of several tissue-specific alternative splicing isoforms of vertebrate RNA binding protein Fox-1 homologs, which show high sequence similarity to the Caenorhabditis elegans feminizing locus on X (Fox-1) gene encoding Fox-1 protein. RNA binding protein Fox-1 homolog 1 (RBFOX1), also termed ataxin-2-binding protein 1 (A2BP1), or Fox-1 homolog A, or hexaribonucleotide-binding protein 1 (HRNBP1), is predominantly expressed in neurons, skeletal muscle and heart. It regulates alternative splicing of tissue-specific exons by binding to UGCAUG elements. Moreover, RBFOX1 binds to the C-terminus of ataxin-2 and forms an ataxin-2/A2BP1 complex involved in RNA processing. RNA binding protein fox-1 homolog 2 (RBFOX2), also termed Fox-1 homolog B, or hexaribonucleotide-binding protein 2 (HRNBP2), or RNA-binding motif protein 9 (RBM9), or repressor of tamoxifen transcriptional activity, is expressed in ovary, whole embryo, and human embryonic cell lines in addition to neurons and muscle. RBFOX2 activates splicing of neuron-specific exons through binding to downstream UGCAUG elements. RBFOX2 also functions as a repressor of tamoxifen activation of the estrogen receptor. RNA binding protein Fox-1 homolog 3 (RBFOX3 or NeuN or HRNBP3), also termed Fox-1 homolog C, is a nuclear RNA-binding protein that regulates alternative splicing of the RBFOX2 pre-mRNA, producing a message encoding a dominant negative form of the RBFOX2 protein. Its message is detected exclusively in post-mitotic regions of embryonic brain. Like RBFOX1, both RBFOX2 and RBFOX3 bind to the hexanucleotide UGCAUG elements and modulate brain and muscle-specific splicing of exon EIIIB of fibronectin, exon N1 of c-src, and calcitonin/CGRP. 
Members in this family also harbor one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30410 409842 cd12408 RRM_eIF3G_like RNA recognition motif (RRM) found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. This subfamily corresponds to the RRM of eIF-3G and similar proteins. eIF-3G, also termed eIF-3 subunit 4, or eIF-3-delta, or eIF3-p42, or eIF3-p44, is the RNA-binding subunit of eIF3, a large multisubunit complex that plays a central role in the initiation of translation by binding to the 40S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3G binds 18S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. eIF-3G is one of the cytosolic targets and interacts with mature apoptosis-inducing factor (AIF). eIF-3G contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). This family also includes yeast eIF3-p33, a homolog of vertebrate eIF-3G, which plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RRM near its C-terminus. 76
30411 409843 cd12409 RRM1_RRT5 RNA recognition motif 1 (RRM1) found in yeast regulator of rDNA transcription protein 5 (RRT5) and similar proteins. This subfamily corresponds to the RRM1 of the lineage specific family containing a group of uncharacterized yeast regulators of rDNA transcription protein 5 (RRT5), which may play roles in the modulation of rDNA transcription. RRT5 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 84
30412 409844 cd12410 RRM2_RRT5 RNA recognition motif 2 (RRM2) found in yeast regulator of rDNA transcription protein 5 (RRT5) and similar proteins. This subfamily corresponds to the RRM2 of the lineage specific family containing a group of uncharacterized yeast regulators of rDNA transcription protein 5 (RRT5), which may play roles in the modulation of rDNA transcription. RRT5 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 93
30413 409845 cd12411 RRM_ist3_like RNA recognition motif (RRM) found in ist3 family. This subfamily corresponds to the RRM of the ist3 family that includes fungal U2 small nuclear ribonucleoprotein (snRNP) component increased sodium tolerance protein 3 (ist3), X-linked 2 RNA-binding motif proteins (RBMX2) found in Metazoa and plants, and similar proteins. Gene IST3 encoding ist3, also termed U2 snRNP protein SNU17 (Snu17p), is a novel yeast Saccharomyces cerevisiae protein required for the first catalytic step of splicing and for progression of spliceosome assembly. It binds specifically to the U2 snRNP and is an intrinsic component of prespliceosomes and spliceosomes. Yeast ist3 contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In the yeast pre-mRNA retention and splicing complex, the atypical RRM of ist3 functions as a scaffold that organizes the other two constituents, Bud13p (bud site selection 13) and Pml1p (pre-mRNA leakage 1). Fission yeast Schizosaccharomyces pombe gene cwf29 encoding ist3, also termed cell cycle control protein cwf29, is an RNA-binding protein complexed with cdc5 protein 29. It also contains one RRM. The biological function of RBMX2 remains unclear. It shows high sequence similarity to yeast ist3 protein and harbors one RRM as well. 89
30414 409846 cd12412 RRM_DAZL_BOULE RNA recognition motif (RRM) found in AZoospermia (DAZ) autosomal homologs, DAZL (DAZ-like) and BOULE. This subfamily corresponds to the RRM domain of two Deleted in AZoospermia (DAZ) autosomal homologs, DAZL (DAZ-like) and BOULE. BOULE is the founder member of the family and DAZL arose from BOULE in an ancestor of vertebrates. The DAZ gene subsequently originated from a duplication transposition of the DAZL gene. Invertebrates contain a single DAZ homolog, BOULE, while vertebrates, other than catarrhine primates, possess both BOULE and DAZL genes. The catarrhine primates possess BOULE, DAZL, and DAZ genes. The family members encode closely related RNA-binding proteins that are required for fertility in numerous organisms. These proteins contain an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a varying number of copies of a DAZ motif, believed to mediate protein-protein interactions. DAZL and BOULE contain a single copy of the DAZ motif, while DAZ proteins can contain 8-24 copies of this repeat. Although their specific biochemical functions remain to be investigated, DAZL proteins may interact with poly(A)-binding proteins (PABPs), and act as translational activators of specific mRNAs during gametogenesis. 81
30415 409847 cd12413 RRM1_RBM28_like RNA recognition motif 1 (RRM1) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM1 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs. 79
30416 409848 cd12414 RRM2_RBM28_like RNA recognition motif 2 (RRM2) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM2 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs. 76
30417 409849 cd12415 RRM3_RBM28_like RNA recognition motif 3 (RRM3) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM3 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs. 83
30418 409850 cd12416 RRM4_RBM28_like RNA recognition motif 4 (RRM4) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM4 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs. 98
30419 409851 cd12417 RRM_SAFB_like RNA recognition motif (RRM) found in the scaffold attachment factor (SAFB) family. This subfamily corresponds to the RRM domain of the SAFB family, including scaffold attachment factor B1 (SAFB1), scaffold attachment factor B2 (SAFB2), SAFB-like transcriptional modulator (SLTM), and similar proteins, which are ubiquitously expressed. SAFB1, SAFB2 and SLTM have been implicated in many diverse cellular processes including cell growth and transformation, stress response, and apoptosis. They share high sequence similarities and all contain a scaffold attachment factor-box (SAF-box, also known as SAP domain) DNA-binding motif, an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region rich in glutamine and arginine residues. SAFB1 is a nuclear protein with a distribution similar to that of SLTM, but unlike that of SAFB2, which is also found in the cytoplasm. To a large extent, SAFB1 and SLTM might share similar functions, such as the inhibition of an oestrogen reporter gene. The additional cytoplasmic localization of SAFB2 implies that it could play additional roles in the cytoplasmic compartment which are distinct from the nuclear functions shared with SAFB1 and SLTM. 74
30420 409852 cd12418 RRM_Aly_REF_like RNA recognition motif (RRM) found in the Aly/REF family. This subfamily corresponds to the RRM of Aly/REF family which includes THO complex subunit 4 (THOC4, also termed Aly/REF), S6K1 Aly/REF-like target (SKAR, also termed PDIP3 or PDIP46) and similar proteins. THOC4 is an mRNA transporter protein with a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It is involved in RNA transportation from the nucleus, and was initially identified as a transcription coactivator of LEF-1 and AML-1 for the TCRalpha enhancer function. In addition, THOC4 specifically binds to rhesus (RH) promoter in erythroid, and might be a novel transcription cofactor for erythroid-specific genes. SKAR shows high sequence homology with THOC4 and possesses one RRM as well. SKAR is widely expressed and localizes to the nucleus. It may be a critical player in the function of S6K1 in cell and organism growth control by binding the activated, hyperphosphorylated form of S6K1 but not S6K2. Furthermore, SKAR functions as a protein partner of the p50 subunit of DNA polymerase delta. In addition, SKAR may have particular importance in pancreatic beta cell size determination and insulin secretion. 75
30421 409853 cd12419 RRM_Ssp2_like RNA recognition motif (RRM) found in yeast sporulation-specific protein 2 (Ssp2) and similar proteins. This subfamily corresponds to the RRM of the lineage specific yeast sporulation-specific protein 2 (Ssp2) and similar proteins. Ssp2 is encoded by a sporulation-specific gene necessary for outer spore wall assembly in the yeast Saccharomyces cerevisiae. It localizes to the spore wall and may play an important role after meiosis II and during spore wall formation. Ssp2 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 85
30422 409854 cd12420 RRM_RBPMS_like RNA recognition motif (RRM) found in RNA-binding protein with multiple splicing (RBP-MS)-like proteins. This subfamily corresponds to the RRM of RNA-binding proteins with multiple splicing (RBP-MS)-like proteins, including protein products of RBPMS genes (RBP-MS and its paralogue RBP-MS2), the Drosophila couch potato (cpo), and Caenorhabditis elegans Mec-8 genes. RBP-MS may be involved in regulation of mRNA translation and localization during Xenopus laevis development. It has also been shown to physically interact with Smad2, Smad3 and Smad4, and stimulates Smad-mediated transactivation. Cpo may play an important role in regulating normal function of the nervous system, whereas mutations in Mec-8 affect mechanosensory and chemosensory neuronal function. All members contain a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Some uncharacterized family members contain two RRMs; this subfamily includes their RRM1. Their RRM2 shows high sequence homology to the RRM of yeast proteins scw1, Whi3, and Whi4. 76
30423 409855 cd12421 RRM1_PTBP1_hnRNPL_like RNA recognition motif (RRM) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), and similar proteins. This subfamily corresponds to the RRM1 of the majority of family members that include polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), polypyrimidine tract-binding protein homolog 3 (PTBPH3), polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2), and similar proteins. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. Rod1 is a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. 
hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL protein plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. The family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plants. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to other family members, all of which contain four RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Although their biological roles remain unclear, both PTBPH1 and PTBPH2 show significant sequence similarity to PTB. However, in contrast to PTB, they have three RRMs. In addition, this family also includes RNA-binding motif protein 20 (RBM20) that is an alternative splicing regulator associated with dilated cardiomyopathy (DCM) and contains only one RRM. 74
30424 409856 cd12422 RRM2_PTBP1_hnRNPL_like RNA recognition motif (RRM) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), and similar proteins. This subfamily corresponds to the RRM2 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), polypyrimidine tract-binding protein homolog 3 (PTBPH3), polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2), and similar proteins, and RRM3 of PTBPH1 and PTBPH2. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. Rod1 is a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. 
hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL protein plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. This family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plant. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to other family members, all of which contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although their biological roles remain unclear, both PTBPH1 and PTBPH2 show significant sequence similarity to PTB. However, in contrast to PTB, they have three RRMs. 85
30425 409857 cd12423 RRM3_PTBP1_like RNA recognition motif 3 (RRM3) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM3 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30426 409858 cd12424 RRM3_hnRNPL_like RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM3 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plants. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB), which is an important negative regulator of alternative splicing in mammalian cells and also functions in several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RRMs. 74
30427 409859 cd12425 RRM4_PTBP1_like RNA recognition motif 4 (RRM4) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM4 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30428 409860 cd12426 RRM4_PTBPH3 RNA recognition motif 4 (RRM4) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM4 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 79
30429 409861 cd12427 RRM4_hnRNPL_like RNA recognition motif 4 (RRM4) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM4 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 84
30430 409862 cd12428 RRM_PARN RNA recognition motif (RRM) found in poly(A)-specific ribonuclease PARN and similar proteins. The subfamily corresponds to the RRM of PARN, also termed deadenylating nuclease, or deadenylation nuclease, or polyadenylate-specific ribonuclease, a processive poly(A)-specific 3'-exoribonuclease involved in the decay of eukaryotic mRNAs. It specifically binds both, the poly(A) tail at the 3' end and the 7-methylguanosine (m7G) cap located at the 5' end of eukaryotic mRNAs, and catalyzes the 3'- to 5'-end deadenylation of single-stranded mRNA with a free 3' hydroxyl group both in the nucleus and in the cytoplasm. PARN belongs to the DEDD superfamily of exonucleases. It contains a nuclease domain, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an R3H domain. PARN exists as a homodimer. The nuclease domain is involved in the dimerization. RRM and R3H domains are essential for the RNA-binding. 66
30431 409863 cd12429 RRM_DNAJC17 RNA recognition motif (RRM) found in the DnaJ homolog subfamily C member 17. The CD corresponds to the RRM of some eukaryotic DnaJ homolog subfamily C member 17 and similar proteins. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Members in this family contain an N-terminal DnaJ domain or J-domain, which mediates the interaction with Hsp70. They also contain an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the C-terminus, which may play an essential role in RNA binding. 74
30432 409864 cd12430 RRM_LARP4_5_like RNA recognition motif (RRM) found in La-related protein 4 (LARP4), La-related protein 5 (LARP5 or LARP4B) and similar proteins. This subfamily corresponds to the RRM of LARP4 and LARP5. LARP4 is a cytoplasmic factor that can bind poly(A) RNA and interact with poly(A) binding protein (PABP). It may play a role in promoting translation by stabilizing mRNA. LARP5 is a cytosolic protein that co-sediments with polysomes and accumulates upon stress induction in cellular stress granules. It can interact with the cytosolic poly(A) binding protein 1 (PABPC1) and the receptor for activated C Kinase (RACK1), a component of the 40S ribosomal subunit. LARP5 may function as a stimulatory factor of translation through bridging mRNA factors of the 3' end with initiating ribosomes. Both, LARP4 and LARP5, are structurally related to the La autoantigen. Like other La-related proteins (LARPs) family members, LARP4 and LARP5 contain a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30433 409865 cd12431 RRM_ALKBH8 RNA recognition motif (RRM) found in alkylated DNA repair protein alkB homolog 8 (ALKBH8) and similar proteins. This subfamily corresponds to the RRM of ALKBH8, also termed alpha-ketoglutarate-dependent dioxygenase ABH8, or S-adenosyl-L-methionine-dependent tRNA methyltransferase ABH8, expressed in various types of human cancers. It is essential in urothelial carcinoma cell survival mediated by NOX-1-dependent ROS signals. ALKBH8 has also been identified as a tRNA methyltransferase that catalyzes methylation of tRNA to yield 5-methylcarboxymethyl uridine (mcm5U) at the wobble position of the anticodon loop. Thus, ALKBH8 plays a crucial role in the DNA damage survival pathway through a distinct mechanism involving the regulation of tRNA modification. ALKBH8 localizes to the cytoplasm. It contains the characteristic AlkB domain that is composed of a tRNA methyltransferase motif, a motif homologous to the bacterial AlkB DNA/RNA repair enzyme, and a dioxygenase catalytic core domain encompassing cofactor-binding sites for iron and 2-oxoglutarate. In addition, unlike other AlkB homologs, ALKBH8 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal S-adenosylmethionine (SAM)-dependent methyltransferase (MT) domain. 80
30434 409866 cd12432 RRM_ACINU RNA recognition motif (RRM) found in apoptotic chromatin condensation inducer in the nucleus (acinus) and similar proteins. This subfamily corresponds to the RRM of Acinus, a caspase-3-activated nuclear factor that induces apoptotic chromatin condensation after cleavage by caspase-3 without inducing DNA fragmentation. It is essential for apoptotic chromatin condensation and may also participate in nuclear structural changes occurring in normal cells. Acinus contains a P-loop motif and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which indicates Acinus might have ATPase and DNA/RNA-binding activity. 90
30435 409867 cd12433 RRM_Yme2p_like RNA recognition motif (RRM) found in yeast mitochondrial escape protein 2 (Yme2p) and similar proteins. This subfamily corresponds to the RRM of Yme2p, also termed protein RNA12, an inner mitochondrial membrane protein that plays a critical role in mitochondrial DNA transactions. It may serve as a mediator of nucleoid structure and number in mitochondria of the yeast Saccharomyces cerevisiae. Yme2p contains an exonuclease domain, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal domain. 86
30436 409868 cd12434 RRM_RCAN_like RNA recognition motif (RRM) found in regulators of calcineurin (RCANs) and similar proteins. This subfamily corresponds to the RRM of RCANs, a novel family of calcineurin regulators that are key factors contributing to Down syndrome in humans. They can stimulate and inhibit the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C) signaling in vivo through direct interactions with its catalytic subunit. Overexpressed RCANs may bind and inhibit calcineurin. In contrast, low levels of phosphorylated RCANs may stimulate the calcineurin signaling. RCANs are characterized by harboring a central short, unique serine-proline motif containing FLIISPPxSPP box, which is strongly conserved from yeast to human but is absent in bacteria. They consist of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 75
30437 409869 cd12435 RRM_GW182_like RNA recognition motif (RRM) found in the GW182 family proteins. This subfamily corresponds to the RRM of the GW182 family which includes three paralogs of TNRC6 (GW182-related) proteins comprising GW182/TNGW1, TNRC6B (containing three isoforms) and TNRC6C in mammal, a single Drosophila ortholog (dGW182, also called Gawky) and two Caenorhabditis elegans orthologs AIN-1 and AIN-2, which contain multiple miRNA-binding sites and have important functions in miRNA-mediated translational repression, as well as mRNA degradation in Metazoa. The GW182 family proteins directly interact with Argonaute (Ago) proteins, and thus function as downstream effectors in the miRNA pathway, responsible for inhibition of translation and acceleration of mRNA decay. Members in this family are characterized by an abnormally high content of glycine/tryptophan (G/W) repeats, one or more glutamine (Q)-rich motifs, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The only exception is the worm protein that does not contain a recognizable RRM domain. The GW182 family proteins are recruited to miRNA targets through an interaction between their N-terminal domain and an Argonaute protein. Then they promote translational repression and/or degradation of miRNA targets through their C-terminal silencing domain. 71
30438 409870 cd12436 RRM1_2_MATR3_like RNA recognition motif 1 (RRM1) and 2 (RRM2) found in the matrin 3 family of nuclear proteins. This subfamily corresponds to the RRM of the matrin 3 family of nuclear proteins consisting of Matrin 3 (MATR3), nuclear protein 220 (NP220) and similar proteins. MATR3 is a highly conserved inner nuclear matrix protein that has been implicated in various biological processes. NP220 is a large nucleoplasmic DNA-binding protein that binds to cytidine-rich sequences, such as CCCCC (G/C), in double-stranded DNA (dsDNA). Both Matrin 3 and NP220 contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Cys2-His2 zinc finger-like motif at the C-terminal region. 76
30439 409871 cd12437 RRM_BRAP2_like RNA recognition motif (RRM) found in BRCA1-associated protein (BRAP2) and similar proteins. This subfamily corresponds to the RRM domain of BRAP2, also termed impedes mitogenic signal propagation (IMP), or ring finger protein 52, or renal carcinoma antigen NY-REN-63, a novel cytoplasmic protein interacting with the two functional nuclear localisation signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 may serve as a cytoplasmic retention protein and play a role in the regulation of nuclear protein transport. The family also includes RING finger protein ETP1 and its homologs found in fungi. ETP1, also termed BRAP2 homolog, or ethanol tolerance protein 1, is the yeast homolog of BRCA1-associated protein (BRAP2) found in vertebrates. It may be involved in ethanol and salt-induced transcriptional activation of the NHA1 promoter and heat shock protein genes (HSP12 and HSP26), and participate in ethanol-induced turnover of the low-affinity hexose transporter Hxt3p. Members in this family contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 82
30440 409872 cd12438 RRM_CNOT4 RNA recognition motif (RRM) found in eukaryotic CCR4-NOT transcription complex subunit 4 (NOT4) and similar proteins. This subfamily corresponds to the RRM of NOT4, also termed CCR4-associated factor 4, or E3 ubiquitin-protein ligase CNOT4, or potential transcriptional repressor NOT4Hp, a component of the CCR4-NOT complex, a global negative regulator of RNA polymerase II transcription. NOT4 functions as a ubiquitin-protein ligase (E3). It contains an N-terminal C4C4 type RING finger motif, followed by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The RING fingers may interact with a subset of ubiquitin-conjugating enzymes (E2s), including UbcH5B, and mediate protein-protein interactions. T 98
30441 409873 cd12439 RRM_TRMT2A RNA recognition motif (RRM) found in tRNA (uracil-5-)-methyltransferase homolog A (TRMT2A) and similar proteins. This subfamily corresponds to the RRM of TRMT2A, also known as HpaII tiny fragments locus 9c protein (HTF9C), a novel cell cycle regulated protein. It is an independent biologic factor expressed in tumors associated with clinical outcome in HER2 expressing breast cancer. The function of TRMT2A remains unclear, although by sequence homology it has an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), related to RNA methyltransferases. 79
30442 409874 cd12440 RRM_SYNJ RNA recognition motif (RRM) found in synaptojanin-1, synaptojanin-2 and similar proteins. This subfamily corresponds to the RRM of two active phosphatidylinositol phosphate phosphatases, synaptojanin-1 and synaptojanin-2. They have different interaction partners and are likely to have different biological functions. Synaptojanin-1 was originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis. It also acts as a Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase with a putative role in clathrin-mediated endocytosis. Synaptojanin-2 is a ubiquitously expressed homolog of synaptojanin-1. It is a novel Rac1 effector regulating the early step of clathrin-mediated endocytosis. Synaptojanin-2 directly and specifically interacts with Rac1 in a GTP-dependent manner. It mediates the inhibitory effect of Rac1 on endocytosis and plays an important role in the Rac1-mediated control of cell growth. Both, synaptojanin-1 and synaptojanin-2, have two tissue-specific alternative splicing isoforms, a shorter isoform expressed in brain and a longer isoform in peripheral tissues. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p, a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2. Synaptojanin-2 shows high sequence homology to the N-terminal Sac1p homology domain, the central inositol 5-phosphatase domain, the putative RNA recognition motif (RRM) of synaptojanin-1, but differs in the proline-rich region. 77
30443 409875 cd12441 RRM_Nup53_like RNA recognition motif (RRM) found in nucleoporin Nup53 and similar proteins. This subfamily corresponds to the RRM domain of nucleoporin Nup53, also termed mitotic phosphoprotein 44 (MP-44), or nuclear pore complex protein Nup53, required for normal cell growth and nuclear morphology in vertebrate. It tightly associates with the nuclear envelope membrane and the nuclear lamina where it interacts with lamin B. It may also interact with a group of nucleoporins including Nup93, Nup155, and Nup205 and play a role in the association of the mitotic checkpoint protein Mad1 with the nuclear pore complex (NPC). The family also includes Saccharomyces cerevisiae Nup53p, an ortholog of vertebrate nucleoporin Nup53. A unique property of yeast Nup53p is that it contains an additional Kap121p-binding domain and interacts specifically with the karyopherin Kap121p, which is involved in the assembly of Nup53p into NPCs. Both, vertebrate Nup35 and yeast Nup53p, contain an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. This family corresponds to the RRM domain which lacks the conserved residues that typically bind RNA in canonical RRM domains. 73
30444 409876 cd12442 RRM_RBM48 RNA recognition motif (RRM) found in RNA-binding protein 48 (RBM48) and similar proteins. This subfamily corresponds to the RRM of RBM48, a putative RNA-binding protein of unknown function. It contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 100
30445 409877 cd12443 RRM_MCM3A_like RNA recognition motif (RRM) found in 80 kDa MCM3-associated protein (Map80) and similar proteins. This subfamily corresponds to the RRM of Map80, also termed germinal center-associated nuclear protein (GANP), involved in the nuclear localization pathway of MCM3, a protein necessary for the initiation of DNA replication and also involved in the controls that ensure DNA replication is initiated once per cell cycle. Map80 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 75
30446 409878 cd12444 RRM1_CPEBs RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein CPEB-1, CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subfamily corresponds to the RRM1 of the CPEB family of proteins that bind to defined groups of mRNAs and act as either translational repressors or activators to regulate their translation. CPEB proteins are well conserved in both vertebrates and invertebrates. Based on sequence similarity, RNA-binding specificity, and functional regulation of translation, the CPEB proteins have been classified into two subfamilies. The first subfamily includes CPEB-1 and related proteins. CPEB-1 is an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. The second subfamily includes CPEB-2, CPEB-3, CPEB-4, and related proteins. Due to high sequence similarity, members in this subfamily may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. 
CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All CPEB proteins are nucleus-cytoplasm shuttling proteins. They contain an N-terminal unstructured region, followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. CPEB-2, -3, and -4 have conserved nuclear export signals that are not present in CPEB-1. 95
30447 409879 cd12445 RRM2_CPEBs RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein CPEB-1, CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subfamily corresponds to the RRM2 of CPEB family of proteins that bind to defined groups of mRNAs and act as either translational repressors or activators to regulate their translation. CPEB proteins are well conserved in both vertebrates and invertebrates. Based on sequence similarity, RNA-binding specificity, and functional regulation of translation, the CPEB proteins have been classified into two subfamilies. The first subfamily includes CPEB-1 and related proteins. CPEB-1 is an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. The second subfamily includes CPEB-2, CPEB-3, CPEB-4, and related proteins. Due to the high sequence similarity, members in this subfamily may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation. 
It directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All CPEB proteins are nucleus-cytoplasm shuttling proteins. They contain an N-terminal unstructured region, followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. CPEB-2, -3, and -4 have conserved nuclear export signals that are not present in CPEB-1. 81
30448 409880 cd12446 RRM_RBM25 RNA recognition motif (RRM) found in eukaryotic RNA-binding protein 25 and similar proteins. This subfamily corresponds to the RRM of RBM25, also termed Arg/Glu/Asp-rich protein of 120 kDa (RED120), or protein S164, or RNA-binding region-containing protein 7, an evolutionarily conserved splicing coactivator SRm160 (SR-related nuclear matrix protein of 160 kDa)-interacting protein. RBM25 belongs to a family of RNA-binding proteins containing a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminus, a RE/RD-rich (ER) central region, and a C-terminal proline-tryptophan-isoleucine (PWI) motif. It localizes to the nuclear speckles and associates with multiple splicing components, including splicing cofactors SRm160/300, U snRNAs, assembled splicing complexes, and spliced mRNAs. It may play an important role in pre-mRNA processing by coupling splicing with mRNA 3'-end formation. Additional research indicates that RBM25 is one of the RNA-binding regulators that direct the alternative splicing of apoptotic factors. It can activate proapoptotic Bcl-xS 5'ss by binding to the exonic splicing enhancer, CGGGCA, and stabilize the pre-mRNA-U1 snRNP through interaction with hLuc7A, a U1 snRNP-associated factor. 83
30449 409881 cd12447 RRM1_gar2 RNA recognition motif 1 (RRM1) found in yeast protein gar2 and similar proteins. This subfamily corresponds to the RRM1 of yeast protein gar2, a novel nucleolar protein required for 18S rRNA and 40S ribosomal subunit accumulation. It shares similar domain architecture with nucleolin from vertebrates and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of gar2 is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of gar2 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RGG (or GAR) domain of gar2 is rich in glycine, arginine and phenylalanine residues. 76
30450 409882 cd12448 RRM2_gar2 RNA recognition motif 2 (RRM2) found in yeast protein gar2 and similar proteins. This subfamily corresponds to the RRM2 of yeast protein gar2, a novel nucleolar protein required for 18S rRNA and 40S ribosomal subunit accumulation. It shares similar domain architecture with nucleolin from vertebrates and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of gar2 is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of gar2 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RGG (or GAR) domain of gar2 is rich in glycine, arginine and phenylalanine residues. 73
30451 409883 cd12449 RRM_CIRBP_RBM3 RNA recognition motif (RRM) found in cold inducible RNA binding protein (CIRBP), RNA binding motif protein 3 (RBM3) and similar proteins. This subfamily corresponds to the RRM domain of two structurally related heterogeneous nuclear ribonucleoproteins, CIRBP (also termed CIRP or A18 hnRNP) and RBM3 (also termed RNPL), both of which belong to a highly conserved cold shock protein family. The cold shock proteins can be induced after exposure to a moderate cold-shock and other cellular stresses such as UV radiation and hypoxia. CIRBP and RBM3 may function in posttranscriptional regulation of gene expression by binding to different transcripts, thus allowing the cell to respond rapidly to environmental signals. However, the kinetics and degree of cold induction are different between CIRBP and RBM3. Tissue distribution of their expression is different. CIRBP and RBM3 may be differentially regulated under physiological and stress conditions and may play distinct roles in cold responses of cells. CIRBP, also termed glycine-rich RNA-binding protein CIRP, is localized in the nucleus and mediates the cold-induced suppression of cell cycle progression. CIRBP also binds DNA and possibly serves as a chaperone that assists in the folding/unfolding, assembly/disassembly and transport of various proteins. RBM3 may enhance global protein synthesis and the formation of active polysomes while reducing the levels of ribonucleoprotein complexes containing microRNAs. RBM3 may also serve to prevent the loss of muscle mass by its ability to decrease cell death. Furthermore, RBM3 may be essential for cell proliferation and mitosis. Both CIRBP and RBM3 contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), that is involved in RNA binding, and a C-terminal glycine-rich domain (RGG motif) that probably enhances RNA-binding via protein-protein and/or protein-RNA interactions. Like CIRBP, RBM3 can also bind to both RNA and DNA via its RRM domain. 80
30452 409884 cd12450 RRM1_NUCLs RNA recognition motif 1 (RRM1) found in nucleolin-like proteins mainly from plants. This subfamily corresponds to the RRM1 of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 78
30453 409885 cd12451 RRM2_NUCLs RNA recognition motif 2 (RRM2) found in nucleolin-like proteins mainly from plants. This subfamily corresponds to the RRM2 of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 79
30454 409886 cd12452 RRM_ARP_like RNA recognition motif (RRM) found in yeast asparagine-rich protein (ARP) and similar proteins. This subfamily corresponds to the RRM of ARP, also termed NRP1, encoded by Saccharomyces cerevisiae YDL167C. Although its exact biological function remains unclear, ARP contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), two Ran-binding protein zinc fingers (zf-RanBP), and an asparagine-rich region. It may possess RNA-binding and zinc ion binding activities. Additional research has indicated that ARP may function as a factor involved in the stress response. 83
30455 409887 cd12453 RRM1_RIM4_like RNA recognition motif 1 (RRM1) found in yeast meiotic activator RIM4 and similar proteins. This subfamily corresponds to the RRM1 of RIM4, also termed regulator of IME2 protein 4, a putative RNA binding protein that is expressed at elevated levels early in meiosis. It functions as a meiotic activator required for both the IME1- and IME2-dependent pathways of meiotic gene expression, as well as early events of meiosis, such as meiotic division and recombination, in Saccharomyces cerevisiae. RIM4 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes a putative RNA-binding protein termed multicopy suppressor of sporulation protein Msa1. It is a putative RNA-binding protein encoded by a novel gene, msa1, from the fission yeast Schizosaccharomyces pombe. Msa1 may be involved in the inhibition of sexual differentiation by controlling the expression of Ste11-regulated genes, possibly through the pheromone-signaling pathway. Like RIM4, Msa1 also contains two RRMs, both of which are essential for the function of Msa1. 86
30456 409888 cd12454 RRM2_RIM4_like RNA recognition motif 2 (RRM2) found in yeast meiotic activator RIM4 and similar proteins. This subfamily corresponds to the RRM2 of RIM4, also termed regulator of IME2 protein 4, a putative RNA binding protein that is expressed at elevated levels early in meiosis. It functions as a meiotic activator required for both the IME1- and IME2-dependent pathways of meiotic gene expression, as well as early events of meiosis, such as meiotic division and recombination, in Saccharomyces cerevisiae. RIM4 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes a putative RNA-binding protein termed multicopy suppressor of sporulation protein Msa1. It is a putative RNA-binding protein encoded by a novel gene, msa1, from the fission yeast Schizosaccharomyces pombe. Msa1 may be involved in the inhibition of sexual differentiation by controlling the expression of Ste11-regulated genes, possibly through the pheromone-signaling pathway. Like RIM4, Msa1 also contains two RRMs, both of which are essential for the function of Msa1. 80
30457 409889 cd12455 RRM_like_Smg4_UPF3 RNA recognition motif (RRM)-like Smg4_UPF3 domain in yeast up-frameshift suppressor 3 (Upf3p), Caenorhabditis elegans SMG-4, their human orthologs Upf3A and Upf3B, and similar proteins. This subfamily corresponds to the RRM-like Smg4_UPF3 domain found in yeast up-frameshift suppressor 3 (Upf3p), Caenorhabditis elegans SMG-4, their human orthologs Upf3A and Upf3B, and similar proteins. Upf3p, also termed nonsense-mediated mRNA decay protein 3, or Sua6p, is a surveillance factor encoded by the UPF3 gene from Saccharomyces cerevisiae. It is required for nonsense-mediated mRNA decay (NMD) in yeast. Upf3p is primarily cytoplasmic but accumulates inside the nucleus. Its nuclear import is mediated by the Srp1p (importin-alpha)/beta heterodimer while its nuclear export is mediated by a leucine-rich nuclear export sequence (NES-A), but not the Crm1p exportin. C. elegans SMG-4 is a nuclear shuttling protein that shuttles between the cytoplasm and nucleus through nuclear import and export signals similar to those of the yeast Upf3p. It is regulated by phosphorylation. Human orthologs of yeast Upf3p and C. elegans SMG-4 include Upf3A and Upf3B, which derive from two genes, UPF3A and X-linked UPF3B, respectively. Both Upf3A (Up-frameshift suppressor 3 homolog A, also termed regulator of nonsense transcripts 3A, or nonsense mRNA reducing factor 3A) and Upf3B (Up-frameshift suppressor 3 homolog B on chromosome X, also termed regulator of nonsense transcripts 3B, or nonsense mRNA reducing factor 3B) are nucleocytoplasmic shuttling proteins. They associate selectively with spliced beta-globin mRNA in vivo, and tethering of any human Upf protein to the 3'UTR of beta-globin mRNA prevents NMD. The function of the Upf proteins in identifying and targeting nonsense mRNAs for rapid decay is conserved among eukaryotes. In addition, all Upf proteins in this family contain a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that they may be RNA binding proteins. 88
30458 240902 cd12456 RRM_p65 RNA recognition motif (RRM) found in the holoenzyme La family protein p65. This subfamily corresponds to the RRM of a lineage specific family containing the essential La family protein p65 found in Tetrahymena thermophila. It is a telomerase holoenzyme protein necessary for telomerase RNA (TER) accumulation in vivo. p65, together with TER and telomerase reverse transcriptase (TERT), comprise a ternary catalytic core complex of Tetrahymena telomerase, which is a ribonucleoprotein complex essential for maintenance of telomere DNA at linear chromosome ends. p65 harbors a cryptic, atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which displays high structural homology to the RRM in genuine La and LARP7 proteins. 76
30459 409890 cd12457 RRM_XMAS2 RNA recognition motif (RRM) found in X-linked male sterile 2 (Xmas-2) and similar proteins. This subfamily corresponds to the RRM in Xmas-2, the Drosophila homolog of yeast Sac3p protein, together with E(y)2, the Drosophila homologue of yeast Sus1p protein, forming an endogenous complex that is required in the regulation of mRNA transport and also involved in the efficient transcription regulation of the heat-shock protein 70 (hsp70) loci. All family members are found in insects and contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a PCI domain. 71
30460 409891 cd12458 RRM_AtC3H46_like RNA recognition motif (RRM) found in Arabidopsis thaliana zinc finger CCCH domain-containing protein 46 (AtC3H46) and similar proteins. This subfamily corresponds to the RRM domain in AtC3H46, a putative RNA-binding protein that contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a CCCH class of zinc finger, typically C-X8-C-X5-C-X3-H. It may possess ribonuclease activity. 70
30461 409892 cd12459 RRM1_CID8_like RNA recognition motif 1 (RRM1) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID 13 and similar proteins. This subgroup corresponds to the RRM1 domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID 13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear. 80
30462 409893 cd12460 RRM2_CID8_like RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID 13 and similar proteins. This subgroup corresponds to the RRM2 domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID 13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear. 82
30463 409894 cd12461 RRM_SCAF4 RNA recognition motif (RRM) found in SR-related and CTD-associated factor 4 (SCAF4) and similar proteins. The CD corresponds to the RRM of SCAF4 (also termed splicing factor, arginine/serine-rich 15 or SFR15, or CTD-binding SR-like protein RA4) that belongs to a new class of SCAFs (SR-like CTD-associated factors). Although its biological function remains unclear, SCAF4 shows high sequence similarity to SCAF8 that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II) and may play a direct role in coupling transcription and pre-mRNA processing. SCAF4 and SCAF8 both contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs. 81
30464 409895 cd12462 RRM_SCAF8 RNA recognition motif (RRM) found in SR-related and CTD-associated factor 8 (SCAF8) and similar proteins. This subgroup corresponds to the RRM of SCAF8 (also termed CDC5L complex-associated protein 7, or RNA-binding motif protein 16, or CTD-binding SR-like protein RA8), a nuclear matrix protein that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II). The pol II CTD plays a role in coupling transcription and pre-mRNA processing. SCAF8 co-localizes primarily with transcription sites that are enriched in nuclear matrix fraction, which is known to contain proteins involved in pre-mRNA processing. Thus, SCAF8 may play a direct role in coupling transcription and pre-mRNA processing. SCAF8, together with SCAF4, represents a new class of SCAFs (SR-like CTD-associated factors). They contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs. 79
30465 409896 cd12463 RRM_G3BP1 RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein 1 (G3BP1) and similar proteins. This subgroup corresponds to the RRM of G3BP1, also termed ATP-dependent DNA helicase VIII (DH VIII), or GAP SH3 domain-binding protein 1, which has been identified as a phosphorylation-dependent endoribonuclease that interacts with the SH3 domain of RasGAP, a multi-functional protein controlling Ras activity. The acidic RasGAP binding domain of G3BP1 harbors an arsenite-regulated phosphorylation site and dominantly inhibits stress granule (SG) formation. G3BP1 also contains an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an RNA recognition motif (RRM domain), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif). The RRM domain and RGG-rich region are canonically associated with RNA binding. G3BP1 co-immunoprecipitates with mRNAs. It binds to and cleaves the 3'-untranslated region (3'-UTR) of the c-myc mRNA in a phosphorylation-dependent manner. Thus, G3BP1 may play a role in coupling extra-cellular stimuli to mRNA stability. It has been shown that G3BP1 is a novel Dishevelled-associated protein that is methylated upon Wnt3a stimulation and that arginine methylation of G3BP1 regulates both Ctnnb1 mRNA and canonical Wnt/beta-catenin signaling. Furthermore, G3BP1 can be associated with the 3'-UTR of beta-F1 mRNA in cytoplasmic RNA-granules, demonstrating that G3BP1 may specifically repress the translation of the transcript. 80
30466 409897 cd12464 RRM_G3BP2 RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein 2 (G3BP2) and similar proteins. This subgroup corresponds to the RRM of G3BP2, also termed GAP SH3 domain-binding protein 2, a cytoplasmic protein that interacts with both IkappaBalpha and IkappaBalpha/NF-kappaB complexes, indicating that G3BP2 may play a role in the control of nucleocytoplasmic distribution of IkappaBalpha and cytoplasmic anchoring of the IkappaBalpha/NF-kappaB complex. G3BP2 contains an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an acidic domain, a domain containing five PXXP motifs, an RNA recognition motif (RRM domain), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif). It binds to the SH3 domain of RasGAP, a multi-functional protein controlling Ras activity, through its N-terminal NTF2-like domain. The acidic domain is sufficient for the interaction of G3BP2 with the IkappaBalpha cytoplasmic retention sequence. Furthermore, G3BP2 might influence stability or translational efficiency of particular mRNAs by binding to RNA-containing structures within the cytoplasm through its RNA-binding domain. 83
30467 409898 cd12465 RRM_UHMK1 RNA recognition motif (RRM) found in U2AF homology motif kinase 1 (UHMK1) and similar proteins. This subgroup corresponds to the RRM of UHMK1. UHMK1, also termed kinase interacting with stathmin (KIS) or P-CIP2, is a serine/threonine protein kinase functionally related to RNA metabolism and neurite outgrowth. It contains an N-terminal kinase domain and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), with high homology to the corresponding motif of the mammalian U2 small nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65 or U2AF2). UHMK1 targets two key regulators of cell proliferation and migration, the cyclin-dependent kinase (CDK) inhibitor p27Kip1 and the microtubule-destabilizing protein stathmin. It plays a critical role during vascular wound repair by preventing excessive vascular smooth muscle cell (VSMC) migration into the vascular lesion. Moreover, UHMK1 may control cell migration and neurite outgrowth by interacting with and phosphorylating the splicing factor SF1, thereby probably contributing to the control of protein expression. Furthermore, UHMK1 may be functionally related to microtubule dynamics and axon development. It localizes to RNA granules, interacts with three proteins found in RNA granules (KIF3A, NonO, and eEF1A), and further enhances the local translation. UHMK1 is highly expressed in regions of the brain implicated in schizophrenia and may play a role in susceptibility to schizophrenia. 88
30468 409899 cd12466 RRM2_AtRSp31_like RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins from plants. This subgroup corresponds to the RRM2 in a family that represents a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at their N-terminus, and an RS domain at their C-terminus. 70
30469 240913 cd12467 RRM_Srp1p_like RNA recognition motif 1 (RRM1) found in fission yeast pre-mRNA-splicing factor Srp1p and similar proteins. This subgroup corresponds to the RRM domain in Srp1p encoded by gene srp1 from fission yeast Schizosaccharomyces pombe. It plays a role in the pre-mRNA splicing process, but is not essential for growth. Srp1p is closely related to the SR protein family found in metazoa. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine hinge and an RS domain in the middle, and a C-terminal domain. Some family members also contain another RRM domain. 78
30470 409900 cd12470 RRM1_MSSP1 RNA recognition motif 1 (RRM1) found in vertebrate single-stranded DNA-binding protein MSSP-1. This subgroup corresponds to the RRM1 of MSSP-1, also termed RNA-binding motif, single-stranded-interacting protein 1 (RBMS1), or suppressor of CDC2 with RNA-binding motif 2 (SCR2), a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence CT(A/T)(A/T)T, and stimulates DNA replication in the system using SV40 DNA. MSSP-1 is identical with Scr2, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-1 has been implicated in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 86
30471 409901 cd12471 RRM1_MSSP2 RNA recognition motif 1 (RRM1) found in vertebrate single-stranded DNA-binding protein MSSP-2. This subgroup corresponds to the RRM1 of MSSP-2, also termed RNA-binding motif, single-stranded-interacting protein 2 (RBMS2), or suppressor of CDC2 with RNA-binding motif 3 (SCR3), a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence T(C/A)TT, and stimulates DNA replication in the system using SV40 DNA. MSSP-2 is identical with Scr3, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-2 has been implicated in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 84
30472 409902 cd12472 RRM1_RBMS3 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding motif, single-stranded-interacting protein 3 (RBMS3). This subgroup corresponds to the RRM1 of RBMS3, a new member of the c-myc gene single-strand binding proteins (MSSP) family of DNA regulators. Unlike other MSSP proteins, RBMS3 is not a transcriptional regulator. It binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. RBMS3 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and its C-terminal region is acidic and enriched in prolines, glutamines and threonines. 80
30473 409903 cd12473 RRM2_MSSP1 RNA recognition motif 2 (RRM2) found in vertebrate single-stranded DNA-binding protein MSSP-1. This subgroup corresponds to the RRM2 of MSSP-1, also termed RNA-binding motif, single-stranded-interacting protein 1 (RBMS1), or suppressor of CDC2 with RNA-binding motif 2 (SCR2). MSSP-1 is a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence CT(A/T)(A/T)T, and stimulates DNA replication in the system using SV40 DNA. MSSP-1 is identical with Scr2, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-1 has been implicated in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. MSSP-1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 85
30474 409904 cd12474 RRM2_MSSP2 RNA recognition motif 2 (RRM2) found in vertebrate single-stranded DNA-binding protein MSSP-2. This subgroup corresponds to the RRM2 of MSSP-2, also termed RNA-binding motif, single-stranded-interacting protein 2 (RBMS2), or suppressor of CDC2 with RNA-binding motif 3 (SCR3). MSSP-2 is a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence T(C/A)TT, and stimulates DNA replication in the system using SV40 DNA. MSSP-2 is identical with Scr3, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-2 has been implicated in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 86
30475 240919 cd12475 RRM2_RBMS3 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding motif, single-stranded-interacting protein 3 (RBMS3). This subgroup corresponds to the RRM2 of RBMS3, a new member of the c-myc gene single-strand binding proteins (MSSP) family of DNA regulators. Unlike other MSSP proteins, RBMS3 is not a transcriptional regulator. It binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. RBMS3 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and its C-terminal region is acidic and enriched in prolines, glutamines and threonines. 88
30476 409905 cd12476 RRM1_SNF RNA recognition motif 1 (RRM1) found in Drosophila melanogaster sex determination protein SNF and similar proteins. This subgroup corresponds to the RRM1 of SNF (Sans fille), also termed U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A), an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila. It is essential in Drosophila sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). SNF contains two RNA recognition motifs (RRMs); it can self-associate through RRM1, and each RRM can recognize poly(U) RNA binding independently. 85
30477 409906 cd12477 RRM1_U1A RNA recognition motif 1 (RRM1) found in vertebrate U1 small nuclear ribonucleoprotein A (U1A). This subgroup corresponds to the RRM1 of U1A (also termed U1 snRNP A or U1-A), an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein and it also shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins, including polypyrimidine tract binding protein (PTB), polypyrimidine-tract binding protein-associated factor (PSF), and non-POU-domain-containing, octamer-binding (NONO), DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (DDX5). It also binds to a flavivirus NS5 protein and plays an important role in virus replication. U1A contains two RNA recognition motifs (RRMs); the N-terminal RRM (RRM1) binds tightly and specifically to the U1 snRNA SLII and its own 3'-UTR, while in contrast, the C-terminal RRM (RRM2) does not appear to associate with any RNA and may be free to bind other proteins. U1A also contains a proline-rich region, and a nuclear localization signal (NLS) in the central domain that is responsible for its nuclear import. 89
30478 409907 cd12478 RRM1_U2B RNA recognition motif 1 in U2 small nuclear ribonucleoprotein B" (U2B") and similar proteins. This subgroup corresponds to the RRM1 of U2B" (also termed U2 snRNP B"), a unique protein that comprises the U2 snRNP. It was initially identified as binding to stem-loop IV (SLIV) at the 3' end of U2 snRNA. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA. In addition, the nuclear transport of U2B" is independent of U2 snRNA binding. U2B" contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also contains a nuclear localization signal (NLS) in the central domain. However, nuclear import of U2B" does not depend on this NLS. The N-terminal RRM is sufficient to direct U2B" to the nucleus. 91
30479 240923 cd12479 RRM2_SNF RNA recognition motif 2 (RRM2) found in Drosophila melanogaster sex determination protein SNF and similar proteins. This subgroup corresponds to the RRM2 of SNF (Sans fille), also termed U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A), an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila. It is essential in Drosophila sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). SNF contains two RNA recognition motifs (RRMs); it can self-associate through RRM1, and each RRM can recognize poly(U) RNA binding independently. 80
30480 409908 cd12480 RRM2_U1A RNA recognition motif 2 (RRM2) found in vertebrate U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A). This subgroup corresponds to the RRM2 of U1A (also termed U1 snRNP A or U1-A), an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins, including polypyrimidine tract binding protein (PTB), polypyrimidine-tract binding protein-associated factor (PSF), and non-POU-domain-containing, octamer-binding (NONO), DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (DDX5). U1A also binds to a flavivirus NS5 protein and plays an important role in virus replication. It contains two RNA recognition motifs (RRMs); the N-terminal RRM (RRM1) binds tightly and specifically to the U1 snRNA SLII and its own 3'-UTR, while in contrast, the C-terminal RRM (RRM2) does not appear to associate with any RNA and it may be free for binding other proteins. U1A also contains a proline-rich region, and a nuclear localization signal (NLS) in the central domain that is responsible for its nuclear import. 86
30481 240925 cd12481 RRM2_U2B RNA recognition motif 2 (RRM2) found in vertebrate U2 small nuclear ribonucleoprotein B" (U2B"). This subgroup corresponds to the RRM2 of U2B" (also termed U2 snRNP B"), a unique protein that comprises the U2 snRNP. It was initially identified to bind to stem-loop IV (SLIV) at the 3' end of U2 snRNA. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA and its nuclear transport is independent of U2 snRNA binding. U2B" contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also contains a nuclear localization signal (NLS) in the central domain. However, nuclear import of U2B" does not depend on this NLS. The N-terminal RRM is sufficient to direct U2B" to the nucleus. 80
30482 409909 cd12482 RRM1_hnRNPR RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM1 of hnRNP R, which is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. It is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, and in retinal development and light-elicited cellular activities. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; it binds RNA through its RRM domains. 79
30483 409910 cd12483 RRM1_hnRNPQ RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). This subgroup corresponds to the RRM1 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP, a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains. 84
30484 409911 cd12484 RRM1_RBM46 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM1 of RBM46, also termed cancer/testis antigen 68 (CT68), a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 78
30485 240929 cd12485 RRM1_RBM47 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM1 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 78
30486 409912 cd12486 RRM1_ACF RNA recognition motif 1 (RRM1) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM1 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone, and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. It contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 78
30487 409913 cd12487 RRM1_DND1 RNA recognition motif 1 (RRM1) found in vertebrate dead end protein homolog 1 (DND1). This subgroup corresponds to the RRM1 of DND1, also termed RNA-binding motif, single-stranded-interacting protein 4, an RNA-binding protein that is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. For instance, DND1 binds cell cycle inhibitor, P27 (p27Kip1, CDKN1B), and cell cycle regulator and tumor suppressor, LATS2 (large tumor suppressor, homolog 2 of Drosophila). It helps maintain their protein expression through blocking the inhibitory function of microRNAs (miRNA) from these transcripts. DND1 may also impose another level of translational regulation to modulate expression of critical factors in embryonic stem (ES) cells. DND1 interacts specifically with apolipoprotein B editing complex 3 (APOBEC3), a multi-functional protein inhibiting retroviral replication. The DND1-APOBEC3 interaction may play a role in maintaining viability of germ cells and for preventing germ cell tumor development. DND1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 78
30488 240932 cd12488 RRM2_hnRNPR RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM2 of hnRNP R, a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP R is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, as well as in retinal development and light-elicited cellular activities. It contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif. hnRNP R binds RNA through its RRM domains. 85
30489 240933 cd12489 RRM2_hnRNPQ RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). This subgroup corresponds to the RRM2 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP, a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains. 85
30490 409914 cd12490 RRM2_ACF RNA recognition motif 2 (RRM2) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM2 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. ACF contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 89
30491 409915 cd12491 RRM2_RBM47 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM2 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 95
30492 240936 cd12492 RRM2_RBM46 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM2 of RBM46, also termed cancer/testis antigen 68 (CT68). It is a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 85
30493 409916 cd12493 RRM2_DND1 RNA recognition motif 2 (RRM2) found in vertebrate dead end protein homolog 1 (DND1). This subgroup corresponds to the RRM2 of DND1, also termed RNA-binding motif, single-stranded-interacting protein 4. It is an RNA-binding protein that is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. For instance, DND1 binds cell cycle inhibitor, P27 (p27Kip1, CDKN1B), and cell cycle regulator and tumor suppressor, LATS2 (large tumor suppressor, homolog 2 of Drosophila). It helps maintain their protein expression through blocking the inhibitory function of microRNAs (miRNA) from these transcripts. DND1 may also impose another level of translational regulation to modulate expression of critical factors in embryonic stem (ES) cells. Moreover, DND1 interacts specifically with apolipoprotein B editing complex 3 (APOBEC3), a multi-functional protein inhibiting retroviral replication. The DND1-APOBEC3 interaction may play a role in maintaining viability of germ cells and for preventing germ cell tumor development. DND1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 83
30494 409917 cd12494 RRM3_hnRNPR RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM3 of hnRNP R, a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP R is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, as well as in retinal development and light-elicited cellular activities. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP R binds RNA through its RRM domains. 72
30495 409918 cd12495 RRM3_hnRNPQ RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). This subgroup corresponds to the RRM3 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP that is a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains. 72
30496 409919 cd12496 RRM3_RBM46 RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM3 of RBM46, also termed cancer/testis antigen 68 (CT68), a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 74
30497 409920 cd12497 RRM3_RBM47 RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM3 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 74
30498 409921 cd12498 RRM3_ACF RNA recognition motif 3 (RRM3) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM3 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. ACF contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 83
30499 409922 cd12499 RRM_EcCsdA_like RNA recognition motif (RRM) found in Escherichia coli cold-shock DEAD box protein A (CsdA) and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of E. coli CsdA, also termed ATP-dependent RNA helicase deaD, or translation factor W2, a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. CsdA may be involved in translation initiation, gene regulation after cold-shock, mRNA decay and biogenesis of the large or small ribosomal subunit. It contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding. 73
30500 409923 cd12500 RRM_BsYxiN_like RNA recognition motif (RRM) found in Bacillus subtilis ATP-dependent RNA helicase YxiN and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of YxiN. B. subtilis YxiN is a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. It binds with high affinity and specificity to RNA substrates containing hairpin 92 of 23S rRNA (HP92) with either 3' or 5' extensions in an ATP-dependent manner. YxiN contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding. 73
30501 409924 cd12501 RRM_EcDbpA_like RNA recognition motif (RRM) found in Escherichia coli RNA helicase dbpA and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of dbpA. E. coli dbpA is a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. It binds with high affinity and specificity for RNA substrates containing hairpin 92 of 23S rRNA (HP92) with either 3' or 5' extensions. As a non-processive ATP-dependent helicase, DbpA destabilizes and unwinds short <9bp (base pairs) RNA duplexes as well as long duplex RNA stretches. It disrupts RNA helices exclusively in a 3'- 5' direction and requires a single-stranded loading site 3' of the substrate helix. dbpA contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain binds specifically to hairpin 92. 73
30502 409925 cd12502 RRM2_RMB19 RNA recognition motif 2 (RRM2) found in RNA-binding protein 19 (RBM19) and similar proteins. This subfamily corresponds to the RRM2 of RBM19, also termed RNA-binding domain-1 (RBD-1), a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is also essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 72
30503 409926 cd12503 RRM1_hnRNPH_GRSF1_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family, G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM1 of hnRNP H proteins and GRSF-1. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. These proteins have similar RNA binding affinities and specifically recognize the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. 
The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. Members in this family have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. They also include a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. They may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity. 77
30504 409927 cd12504 RRM2_hnRNPH_CRSF1_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family. This subfamily corresponds to the RRM2 of hnRNP H protein family which includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9). They represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing, having similar RNA binding affinities and specifically recognizing the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in the splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. 
In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. The family also includes a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 also contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity. 77
30505 409928 cd12505 RRM2_GRSF1 RNA recognition motif 2 (RRM2) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM2 of GRSF-1, a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 77
30506 409929 cd12506 RRM3_hnRNPH_CRSF1_like RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein hnRNP H protein family, G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM3 of hnRNP H proteins and GRSF-1. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. These proteins have similar RNA binding affinities and specifically recognize the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in the splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. For instance, members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. 
The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. The family also includes a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 also contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity. 75
30507 240951 cd12507 RRM1_ESRPs_Fusilli RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM1 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding sites. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. It shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 75
30508 409930 cd12508 RRM2_ESRPs_Fusilli RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM2 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding sites. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous FGFR2 splicing and functions as a splicing factor. It shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 80
30509 409931 cd12509 RRM3_ESRPs_Fusilli RNA recognition motif 3 (RRM3) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM3 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding site. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous FGFR2 splicing and functions as a splicing factor. Fusilli shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 81
30510 409932 cd12510 RRM1_RBM12_like RNA recognition motif 1 (RRM1) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM1 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 74
30511 409933 cd12511 RRM2_RBM12_like RNA recognition motif 2 (RRM2) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM2 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 73
30512 409934 cd12512 RRM3_RBM12 RNA recognition motif 3 (RRM3) found in RNA-binding protein 12 (RBM12) and similar proteins. This subfamily corresponds to the RRM3 of RBM12. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 101
30513 409935 cd12513 RRM3_RBM12B RNA recognition motif 3 (RRM3) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM3 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 81
30514 409936 cd12514 RRM4_RBM12_like RNA recognition motif 4 (RRM4) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM4 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 73
30515 409937 cd12515 RRM5_RBM12_like RNA recognition motif 5 (RRM5) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM5 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 75
30516 409938 cd12516 RRM1_RBM26 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 26 (RBM26). This subgroup corresponds to the RRM1 of RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, which represents a cutaneous lymphoma (CL)-associated antigen. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions. 76
30517 409939 cd12517 RRM_RBM27 RNA recognition motif (RRM) found in vertebrate RNA-binding protein 27 (RBM27). This subgroup corresponds to the RRM of RBM27 which contains a single RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although the specific function of the RRM in RBM27 remains unclear, it shows high sequence similarity with RRM1 of RBM26, which functions as a cutaneous lymphoma (CL)-associated antigen. 76
30518 409940 cd12518 RRM_SRSF11 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 11 (SRSF11) and similar proteins. This subgroup corresponds to the RRM of SRSF11, also termed arginine-rich 54 kDa nuclear protein (SRp54 or p54), which belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family). It is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. SRSF11 has been identified as a tau exon 10 splicing repressor. It interacts with a purine-rich element in exon 10, and suppresses exon 10 inclusion by antagonizing Tra2beta, an SR-domain-containing protein that enhances exon 10 inclusion. SRSF11 is a unique SR family member and may regulate the alternative splicing in a tissue- and substrate-dependent manner. It can directly interact with the U2 auxiliary factor 65-kDa subunit (U2AF65), a protein associated with the 3' splice site. In addition, unlike the typical SR proteins, SRSF11 associates with other SR proteins but not with the U1 small nuclear ribonucleoprotein U1-70K or the U2 auxiliary factor 35-kDa subunit (U2AF35). SREK1 has unique properties in regulating alternative splicing of different pre-mRNAs; it promotes the use of the distal 5' splice site in E1A pre-mRNA alternative splicing. It also inhibits cryptic splice site selection on the beta-globin pre-mRNA containing competing 5' splice sites. SREK1 contains an RNA recognition motif (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and one serine-arginine (SR)-rich domains (SR domains). 80
30519 409941 cd12519 RRM1_SREK1 RNA recognition motif 1 (RRM1) found in splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subgroup corresponds to the RRM1 of SREK1, also termed serine/arginine-rich-splicing regulatory protein 86-kDa (SRrp86), or splicing factor arginine/serine-rich 12 (SFRS12), or splicing regulatory protein 508 amino acid (SRrp508). SREK1 belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family), and is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. It is a unique SR family member and may play a crucial role in determining tissue specific patterns of alternative splicing. SREK1 can alter splice site selection by both positively and negatively modulating the activity of other SR proteins. For instance, SREK1 can activate SRp20 and repress SC35 in a dose-dependent manner both in vitro and in vivo. In addition, SREK1 generally contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and two serine-arginine (SR)-rich domains (SR domains) separated by an unusual glutamic acid-lysine (EK) rich region. The RRM and SR domains are highly conserved among other members of the SR superfamily. However, the EK domain is unique to SREK1; it plays a modulatory role controlling SR domain function by involvement in the inhibition of both constitutive and alternative splicing and in the selection of splice sites. 80
30520 240964 cd12520 RRM1_MRN1 RNA recognition motif 1 (RRM1) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM1 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is a RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30521 240965 cd12521 RRM3_MRN1 RNA recognition motif 3 (RRM3) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM3 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is a RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30522 409942 cd12522 RRM4_MRN1 RNA recognition motif 4 (RRM4) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM4 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is a RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 81
30523 409943 cd12523 RRM2_MRN1 RNA recognition motif 2 (RRM2) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM2 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is a RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 78
30524 409944 cd12524 RRM1_MEI2_like RNA recognition motif 1 (RRM1) found in plant Mei2-like proteins. This subgroup corresponds to the RRM1 of Mei2-like proteins that represent an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 77
30525 409945 cd12525 RRM1_MEI2_fungi RNA recognition motif 1 (RRM1) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM1 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 91
30526 409946 cd12526 RRM1_EAR1_like RNA recognition motif 1 (RRM1) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM1 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 71
30527 409947 cd12527 RRM2_EAR1_like RNA recognition motif 2 (RRM2) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM2 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 71
30528 240972 cd12528 RRM2_MEI2_fungi RNA recognition motif 2 (RRM2) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM2 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 81
30529 409948 cd12529 RRM2_MEI2_like RNA recognition motif 2 (RRM2) found in plant Mei2-like proteins. This subgroup corresponds to the RRM2 of Mei2-like proteins that represent an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 71
30530 240974 cd12530 RRM3_EAR1_like RNA recognition motif 3 (RRM3) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM3 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 101
30531 240975 cd12531 RRM3_MEI2_like RNA recognition motif 3 (RRM3) found in plant Mei2-like proteins. This subgroup corresponds to the RRM3 of Mei2-like proteins, representing an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 86
30532 409949 cd12532 RRM3_MEI2_fungi RNA recognition motif 3 (RRM3) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM3 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 90
30533 409950 cd12533 RRM_EWS RNA recognition motif (RRM) found in vertebrate Ewing Sarcoma Protein (EWS). This subgroup corresponds to the RRM of EWS, also termed Ewing sarcoma breakpoint region 1 protein, a member of the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a multifunctional protein and may play roles in transcription and RNA processing. EWS is involved in transcriptional regulation by interacting with the preinitiation complex TFIID and the RNA polymerase II (RNAPII) complexes. It is also associated with splicing factors, such as the U1 snRNP protein U1C, suggesting its implication in pre-mRNA splicing. Additionally, EWS has been shown to regulate DNA damage-induced alternative splicing (AS). Like other members in the FET family, EWS contains an N-terminal Ser, Gly, Gln and Tyr-rich region composed of multiple copies of a degenerate hexapeptide repeat motif. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a C2/C2 zinc-finger motif, a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and at least 1 arginine-glycine-glycine (RGG)-repeat region. EWS specifically binds to poly G and poly U RNA. It also binds to the proximal-element DNA of the macrophage-specific promoter of the CSF-1 receptor gene. 84
30534 240978 cd12534 RRM_SARFH RNA recognition motif (RRM) found in Drosophila melanogaster RNA-binding protein cabeza and similar proteins. This subgroup corresponds to the RRM in cabeza, also termed P19, or sarcoma-associated RNA-binding fly homolog (SARFH). It is a putative homolog of human RNA-binding proteins FUS (also termed TLS or Pigpen or hnRNP P2), EWS (also termed EWSR1), TAF15 (also termed hTAFII68 or TAF2N or RPB56), and belongs to the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a nuclear RNA binding protein that may play an important role in the regulation of RNA metabolism during fly development. Cabeza contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 83
30535 409951 cd12535 RRM_FUS_TAF15 RNA recognition motif (RRM) found in vertebrate fused in Ewing's sarcoma protein (FUS), TATA-binding protein-associated factor 15 (TAF15) and similar proteins. This subgroup corresponds to the RRM of FUS and TAF15. FUS (TLS or Pigpen or hnRNP P2), also termed 75 kDa DNA-pairing protein (POMp75), or oncoprotein TLS (Translocated in liposarcoma), is a member of the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a multi-functional protein and has been implicated in pre-mRNA splicing, chromosome stability, cell spreading, and transcription. FUS was originally identified in human myxoid and round cell liposarcomas as an oncogenic fusion with the stress-induced DNA-binding transcription factor CHOP (CCAAT enhancer-binding homologous protein) and later as hnRNP P2, a component of hnRNP H complex assembled on pre-mRNA. It can form ternary complexes with hnRNP A1 and hnRNP C1/C2. Additional research indicates that FUS binds preferentially to GGUG-containing RNAs. In the presence of Mg2+, it can bind both single- and double-stranded DNA (ssDNA/dsDNA) and promote ATP-independent annealing of complementary ssDNA and D-loop formation in superhelical dsDNA. FUS has been shown to be recruited by single stranded noncoding RNAs to the regulatory regions of target genes such as cyclin D1, where it represses transcription by disrupting complex formation. TAF15 (TAFII68), also termed TATA-binding protein-associated factor 2N (TAF2N), or RNA-binding protein 56 (RBP56), originally identified as a TAF in the general transcription initiation TFIID complex, is a novel RNA/ssDNA-binding protein with homology to the proto-oncoproteins FUS and EWS (also termed EWSR1), belonging to the FET family as well. TAF15 likely functions in RNA polymerase II (RNAP II) transcription by interacting with TFIID and subunits of RNAP II itself. TAF15 is also associated with U1 snRNA, chromatin and RNA, in a complex distinct from the Sm-containing U1 snRNP that functions in splicing. Like other members in the FET family, both FUS and TAF15 contain an N-terminal Ser, Gly, Gln and Tyr-rich region composed of multiple copies of a degenerate hexapeptide repeat motif. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a C2/C2 zinc-finger motif, a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and at least 1 arginine-glycine-glycine (RGG)-repeat region. 86
30536 409952 cd12536 RRM1_RBM39 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 39 (RBM39). This subgroup corresponds to the RRM1 of RBM39, also termed hepatocellular carcinoma protein 1, or RNA-binding region-containing protein 2, or splicing factor HCC1, a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Based on the specific domain composition, RBM39 has been classified into a family of non-snRNP (small nuclear ribonucleoprotein) splicing factors that are usually not complexed to snRNAs. 83
30537 409953 cd12537 RRM1_RBM23 RNA recognition motif 1 (RRM1) found in vertebrate probable RNA-binding protein 23 (RBM23). This subgroup corresponds to the RRM1 of RBM23, also termed RNA-binding region-containing protein 4, or splicing factor SF2, which may function as a pre-mRNA splicing factor. It shows high sequence homology to RNA-binding protein 39 (RBM39 or HCC1), a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In contrast to RBM39, RBM23 contains only two RRMs. 85
30538 409954 cd12538 RRM_U2AF35 RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor U2AF 35 kDa subunit (U2AF35). This subgroup corresponds to the RRM of U2AF35, also termed U2AF1, which is one of the small subunits of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF). It has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF35 directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. It promotes U2 snRNP binding to the branch-point sequences of introns through association with the large subunit of U2AF, U2AF65 (also termed U2AF2). U2AF35 contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich segment interrupted by glycines. U2AF35 binds both U2AF65 and the pre-mRNA through its RRM domain. 104
30539 409955 cd12539 RRM_U2AF35B RNA recognition motif (RRM) found in splicing factor U2AF 35 kDa subunit B (U2AF35B). This subgroup corresponds to the RRM of U2AF35B, also termed zinc finger CCCH domain-containing protein 60 (C3H60), which is one of the small subunits of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF). It has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. Members in this family are mainly found in plants. They show high sequence homology to vertebrate U2AF35 that directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. U2AF35B contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich domain. In contrast to U2AF35, U2AF35B has a plant-specific conserved C-terminal region containing SERE motif(s), which may have an important function specific to higher plants. 102
30540 409956 cd12540 RRM_U2AFBPL RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 1 (U2AFBPL) and similar proteins. This subgroup corresponds to the RRM of U2AFBPL, a human homolog of the imprinted mouse gene U2afbp-rs, which encodes a U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 1 (U2AFBPL), also termed CCCH type zinc finger, RNA-binding motif and serine/arginine rich protein 1 (U2AF1RS1), or U2 small nuclear RNA auxiliary factor 1-like 1 (U2AF1L1). Although the biological role of U2AFBPL remains unclear, it shows high sequence homology to splicing factor U2AF 35 kDa subunit (U2AF35 or U2AF1) that directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. Like U2AF35, U2AFBPL contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich domain. 105
30541 409957 cd12541 RRM2_La RNA recognition motif 2 in La autoantigen (La or LARP3) and similar proteins. This subgroup corresponds to the RRM2 of La autoantigen, also termed Lupus La protein, or La ribonucleoprotein, or Sjoegren syndrome type B antigen (SS-B), a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. La contains an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition, it possesses a short basic motif (SBM) and a nuclear localization signal (NLS) at the C-terminus. 77
30542 409958 cd12542 RRM2_LARP7 RNA recognition motif 2 in La-related protein 7 (LARP7) and similar proteins. This subgroup corresponds to the RRM2 of LARP7, also termed La ribonucleoprotein domain family member 7, or P-TEFb-interaction protein for 7SK stability (PIP7S), an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. LARP7 is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP). It intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. LARP7 plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. LARP7 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminal region, which mediates binding to the U-rich 3' terminus of 7SK RNA. LARP7 also carries another putative RRM domain at its C-terminus. 78
30543 409959 cd12543 RRM2_PAR14 RNA recognition motif 2 in vertebrate poly [ADP-ribose] polymerase 14 (PARP-14). This subgroup corresponds to the RRM2 of PARP-14, also termed aggressive lymphoma protein 2, a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. It is expressed in B lymphocytes and interacts with the IL-4-induced transcription factor Stat6. It plays a fundamental role in the regulation of IL-4-induced B-cell protection against apoptosis after irradiation or growth factor withdrawal. It mediates IL-4 effects on the levels of gene products that regulate cell survival, proliferation, and lymphomagenesis. PARP-14 acts as a transcriptional switch for Stat6-dependent gene activation. In the presence of IL-4, PARP-14 activates transcription by facilitating the binding of Stat6 to the promoter and release of HDACs from the promoter with an IL-4 signal. In contrast, in the absence of a signal, PARP-14 acts as a transcriptional repressor by recruiting HDACs. Absence of PARP-14 protects against Myc-induced developmental block and lymphoma. Thus, PARP-14 may play an important role in Myc-induced oncogenesis. Additional research indicates that PARP-14 is also a binding partner with phosphoglucose isomerase (PGI)/ autocrine motility factor (AMF). It can inhibit PGI/AMF ubiquitination, thus contributing to its stabilization and secretion. PARP-14 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), three tandem macro domains, and C-terminal region with sequence homology to PARP catalytic domain. 75
30544 409960 cd12544 RRM_NMI RNA recognition motif in N-myc-interactor (Nmi) and similar proteins. This subgroup corresponds to the RRM in Nmi, also termed N-myc and STAT interactor, an interferon inducible protein that interacts with c-Myc, N-Myc, Max and c-Fos, and other transcription factors containing bHLH-ZIP, bHLH or ZIP domains. In addition to binding Myc proteins, Nmi also associates with all the Stat family of transcription factors except Stat2. In response to cytokine (e.g. IL-2 and IFN-gamma) stimulation, Nmi can enhance Stat-mediated transcriptional activity through recruiting the Stat1 and Stat5 transcriptional coactivators, CREB-binding protein (CBP) and p300. Nmi contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 81
30545 409961 cd12545 RRM_IN35 RNA recognition motif in interferon-induced 35 kDa protein (IFP 35) and similar proteins. This subgroup corresponds to the RRM in IFP 35, an interferon-induced leucine zipper protein that can specifically form homodimers. Distinct from known bZIP proteins, IFP 35 lacks a basic domain critical for DNA binding. IFP 35 may negatively regulate other bZIP transcription factors by protein-protein interaction. For instance, it can form heterodimers with B-ATF, a member of the AP1 transcription factor family. IFP 35 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 79
30546 409962 cd12546 RRM_RBM43 RNA recognition motif in vertebrate RNA-binding protein 43 (RBM43). This subgroup corresponds to the RRM of RBM43, a putative RNA-binding protein containing one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although its biological function remains unclear, RBM43 shows high sequence homology to poly [ADP-ribose] polymerase 10 (PARP-10), which is a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. 77
30547 409963 cd12547 RRM1_2_PAR10 RNA recognition motif 1 and 2 in poly [ADP-ribose] polymerase 10 (PARP-10) and similar proteins. This subgroup corresponds to the RRM1 and RRM2 of PARP-10, a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. It is localized to the nuclear and cytoplasmic compartments. In addition to the PARP activity, PARP-10 is also involved in the control of cell proliferation by inhibiting c-Myc- and E1A-mediated cotransformation of primary cells. PARP-10 may play a role in nuclear processes including the regulation of chromatin, gene transcription, and nuclear/cytoplasmic transport. It contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two overlapping C-terminal domains composed of a glycine-rich region and a region with homology to catalytic domains of PARP enzymes (PARP domain). In addition, PARP-10 contains two ubiquitin-interacting motifs (UIM). 72
30548 409964 cd12548 RRM_Set1A RNA recognition motif in vertebrate histone-lysine N-methyltransferase Setd1A (Set1A). This subgroup corresponds to the RRM of Setd1A, also termed SET domain-containing protein 1A (Set1A), or lysine N-methyltransferase 2F, or Set1/Ash2 histone methyltransferase complex subunit Set1, a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1A is localized to euchromatic nuclear speckles and associates with a complex containing six human homologs of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). Set1A contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. In contrast to Set1B, Set1A additionally contains an HCF-1 binding motif that interacts with HCF-1 in vivo. 95
30549 409965 cd12549 RRM_Set1B RNA recognition motif in vertebrate histone-lysine N-methyltransferase Setd1B (Set1B). This subgroup corresponds to the RRM of Setd1B, also termed SET domain-containing protein 1B (Set1B), or lysine N-methyltransferase 2G, a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1B is localized to euchromatic nuclear speckles and associates with a complex containing six human homologs of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). Set1B complex is a histone methyltransferase that produces trimethylated histone H3 at Lys4. Set1B contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. 93
30550 409966 cd12550 RRM_II_PABPN1 RNA recognition motif in type II polyadenylate-binding protein 2 (PABP-2) and similar proteins. This subgroup corresponds to the RRM of PABP-2, also termed poly(A)-binding protein 2, or nuclear poly(A)-binding protein 1 (PABPN1), or poly(A)-binding protein II (PABII), which is a ubiquitously expressed type II nuclear poly(A)-binding protein that directs the elongation of mRNA poly(A) tails during pre-mRNA processing. Although PABP-2 binds poly(A) with high affinity and specificity as type I poly(A)-binding proteins, it contains only one highly conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is responsible for the poly(A) binding. In addition, PABP-2 possesses an acidic N-terminal domain that is essential for the stimulation of PAP, and an arginine-rich C-terminal domain. 76
30551 409967 cd12551 RRM_II_PABPN1L RNA recognition motif in vertebrate type II embryonic polyadenylate-binding protein 2 (ePABP-2). This subgroup corresponds to the RRM of ePABP-2, also termed embryonic poly(A)-binding protein 2, or poly(A)-binding protein nuclear-like 1 (PABPN1L). ePABP-2 is a novel embryonic-specific cytoplasmic type II poly(A)-binding protein that is expressed during the early stages of vertebrate development and in adult ovarian tissue. It may play an important role in the poly(A) metabolism of stored mRNAs during early vertebrate development. ePABP-2 shows significant sequence similarity to the ubiquitously expressed nuclear polyadenylate-binding protein 2 (PABP-2 or PABPN1). Like PABP-2, ePABP-2 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is responsible for the poly(A) binding. In addition, it possesses an acidic N-terminal domain predicted to form a coiled-coil and an arginine-rich C-terminal domain. 77
30552 409968 cd12552 RRM_Nop15p RNA recognition motif in yeast ribosome biogenesis protein 15 (Nop15p) and similar proteins. This subgroup corresponds to the RRM of Nop15p, also termed nucleolar protein 15, which is encoded by YNL110C from Saccharomyces cerevisiae, and localizes to the nucleoplasm and nucleolus. Nop15p has been identified as a component of a pre-60S particle. It interacts with RNA components of the early pre-60S particles. Furthermore, Nop15p binds directly to a pre-rRNA transcript in vitro and is required for pre-rRNA processing. It functions as a ribosome synthesis factor required for the 5' to 3' exonuclease digestion that generates the 5' end of the major, short form of the 5.8S rRNA as well as for processing of 27SB to 7S pre-rRNA. Nop15p also plays a specific role in cell cycle progression. Nop15p contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 77
30553 409969 cd12553 RRM1_RBM15 RNA recognition motif 1 (RRM1) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM1 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties. 78
30554 409970 cd12554 RRM1_RBM15B RNA recognition motif 1 (RRM1) found in putative RNA binding motif protein 15B (RBM15B) from vertebrates. This subfamily corresponds to the RRM1 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 80
30555 409971 cd12555 RRM2_RBM15 RNA recognition motif 2 (RRM2) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM2 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties. 87
30556 409972 cd12556 RRM2_RBM15B RNA recognition motif 2 (RRM2) found in putative RNA binding motif protein 15B (RBM15B) from vertebrates. This subgroup corresponds to the RRM2 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 85
30557 409973 cd12557 RRM3_RBM15 RNA recognition motif 3 (RRM3) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM3 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties. 73
30558 409974 cd12558 RRM3_RBM15B RNA recognition motif 3 (RRM3) found in putative RNA-binding protein 15B (RBM15B) from vertebrates. This subgroup corresponds to the RRM3 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 76
30559 409975 cd12559 RRM_SRSF10 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 10 (SRSF10) and similar proteins. This subgroup corresponds to the RRM of SRSF10, also termed 40 kDa SR-repressor protein (SRrp40), or FUS-interacting serine-arginine-rich protein 1 (FUSIP1), or splicing factor SRp38, or splicing factor, arginine/serine-rich 13A (SFRS13A), or TLS-associated protein with Ser-Arg repeats (TASR). SRSF10 is a serine-arginine (SR) protein that acts as a potent and general splicing repressor when dephosphorylated. It mediates global inhibition of splicing both in M phase of the cell cycle and in response to heat shock. SRSF10 emerges as a modulator of cholesterol homeostasis through the regulation of low-density lipoprotein receptor (LDLR) splicing efficiency. It also regulates cardiac-specific alternative splicing of triadin pre-mRNA and is required for proper Ca2+ handling during embryonic heart development. In contrast, the phosphorylated SRSF10 functions as a sequence-specific splicing activator in the presence of a nuclear cofactor. It activates distal alternative 5' splice site of adenovirus E1A pre-mRNA in vivo. Moreover, SRSF10 strengthens pre-mRNA recognition by U1 and U2 snRNPs. SRSF10 localizes to the nuclear speckles and can shuttle between nucleus and cytoplasm. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 95
30560 409976 cd12560 RRM_SRSF12 RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 12 (SRSF12) and similar proteins. This subgroup corresponds to the RRM of SRSF12, also termed 35 kDa SR repressor protein (SRrp35), or splicing factor, arginine/serine-rich 13B (SFRS13B), or splicing factor, arginine/serine-rich 19 (SFRS19). SRSF12 is a serine/arginine (SR) protein-like alternative splicing regulator that antagonizes authentic SR proteins in the modulation of alternative 5' splice site choice. For instance, it activates distal alternative 5' splice site of the adenovirus E1A pre-mRNA in vivo. SRSF12 contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 84
30561 409977 cd12561 RRM1_RBM5_like RNA recognition motif 1 (RRM1) found in RNA-binding protein 5 (RBM5) and similar proteins. This subgroup corresponds to the RRM1 of RNA-binding protein 5 (RBM5 or LUCA15 or H37), RNA-binding protein 10 (RBM10 or S1-1) and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both, RBM5 and RBM10, contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 81
30562 409978 cd12562 RRM2_RBM5_like RNA recognition motif 2 (RRM2) found in RNA-binding protein 5 (RBM5) and similar proteins. This subgroup corresponds to the RRM2 of RNA-binding protein 5 (RBM5 or LUCA15 or H37), RNA-binding protein 10 (RBM10 or S1-1) and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both, RBM5 and RBM10, contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 86
30563 409979 cd12563 RRM2_RBM6 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 6 (RBM6). This subgroup corresponds to the RRM2 of RBM6, also termed lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, which has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both, RBM6 and RBM5, specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has two additional unique domains: the decamer repeat occurring more than 20 times, and the POZ (poxvirus and zinc finger) domain. The POZ domain may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. 87
30564 409980 cd12564 RRM1_RBM19 RNA recognition motif 1 (RRM1) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM1 of RBM19, also termed RNA-binding domain-1 (RBD-1), a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 76
30565 409981 cd12565 RRM1_MRD1 RNA recognition motif 1 (RRM1) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM1 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 76
30566 409982 cd12566 RRM2_MRD1 RNA recognition motif 2 (RRM2) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM2 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 79
30567 409983 cd12567 RRM3_RBM19 RNA recognition motif 3 (RRM3) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM3 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 79
30568 241012 cd12568 RRM3_MRD1 RNA recognition motif 3 (RRM3) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM3 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 72
30569 409984 cd12569 RRM4_RBM19 RNA recognition motif 4 (RRM4) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM4 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 72
30570 241014 cd12570 RRM5_MRD1 RNA recognition motif 5 (RRM5) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM5 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 76
30571 409985 cd12571 RRM6_RBM19 RNA recognition motif 6 (RRM6) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM6 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 79
30572 409986 cd12572 RRM2_MSI1 RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homolog 1 (Musashi-1) and similar proteins. This subgroup corresponds to the RRM2 of Musashi-1. The mammalian MSI1 gene encoding Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. Musashi-1 is evolutionarily conserved from invertebrates to vertebrates. It is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1) and has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. It represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-1 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 74
30573 409987 cd12573 RRM2_MSI2 RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homolog 2 (Musashi-2) and similar proteins. This subgroup corresponds to the RRM2 of Musashi-2 (also termed Msi2) which has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Musashi-2 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 76
30574 409988 cd12574 RRM1_DAZAP1 RNA recognition motif 1 (RRM1) found in Deleted in azoospermia-associated protein 1 (DAZAP1) and similar proteins. This subfamily corresponds to the RRM1 of DAZAP1 or DAZ-associated protein 1, also termed proline-rich RNA binding protein (Prrp), a multi-functional ubiquitous RNA-binding protein expressed most abundantly in the testis and essential for normal cell growth, development, and spermatogenesis. DAZAP1 is a shuttling protein whose acetylated form is predominantly nuclear and the nonacetylated form is in cytoplasm. It also functions as a translational regulator that activates translation in an mRNA-specific manner. DAZAP1 was initially identified as a binding partner of Deleted in Azoospermia (DAZ). It also interacts with numerous hnRNPs, including hnRNP U, hnRNP U like-1, hnRNPA1, hnRNPA/B, and hnRNP D, suggesting DAZAP1 might associate and cooperate with hnRNP particles to regulate adenylate-uridylate-rich elements (AU-rich element or ARE)-containing mRNAs. DAZAP1 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal proline-rich domain. 82
30575 409989 cd12575 RRM1_hnRNPD_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. This subfamily corresponds to the RRM1 in hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this family contain two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 72
30576 409990 cd12576 RRM1_MSI RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog Musashi-1, Musashi-2 and similar proteins. This subfamily corresponds to the RRM1 in Musashi-1 and Musashi-2. Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. It is evolutionarily conserved from invertebrates to vertebrates. Musashi-1 is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). It has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, Musashi-1 represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-2 (also termed Msi2) has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Both, Musashi-1 and Musashi-2, contain two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 76
30577 409991 cd12577 RRM1_Hrp1p RNA recognition motif 1 (RRM1) found in yeast nuclear polyadenylated RNA-binding protein 4 (Hrp1p or Nab4p) and similar proteins. This subfamily corresponds to the RRM1 of Hrp1p and similar proteins. Hrp1p or Nab4p, also termed cleavage factor IB (CFIB), is a sequence-specific trans-acting factor that is essential for mRNA 3'-end formation in yeast Saccharomyces cerevisiae. It can be UV cross-linked to RNA and specifically recognizes the (UA)6 RNA element required for both the cleavage and poly(A) addition steps. Moreover, Hrp1p can shuttle between the nucleus and the cytoplasm, and plays an additional role in the export of mRNAs to the cytoplasm. Hrp1p also interacts with Rna15p and Rna14p, two components of CF1A. In addition, Hrp1p functions as a factor directly involved in modulating the activity of the nonsense-mediated mRNA decay (NMD) pathway. It binds specifically to a downstream sequence element (DSE)-containing RNA and interacts with Upf1p, a component of the surveillance complex, further triggering the NMD pathway. Hrp1p contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an arginine-glycine-rich region harboring repeats of the sequence RGGF/Y. 76
30578 409992 cd12578 RRM1_hnRNPA_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A subfamily. This subfamily corresponds to the RRM1 in hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A3 is also a RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 78
30579 409993 cd12579 RRM2_hnRNPA0 RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A0 (hnRNP A0) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A0, a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 80
30580 409994 cd12580 RRM2_hnRNPA1 RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A1, also termed helix-destabilizing protein, or single-strand RNA-binding protein, or hnRNP core protein A1, an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A1 has been characterized as a splicing silencer, often acting in opposition to an activating hnRNP H. It silences exons when bound to exonic elements in the alternatively spliced transcripts of c-src, HIV, GRIN1, and beta-tropomyosin. hnRNP A1 can shuttle between the nucleus and the cytoplasm. Thus, it may be involved in transport of cellular RNAs, including the packaging of pre-mRNA into hnRNP particles and transport of poly A+ mRNA from the nucleus to the cytoplasm. The cytoplasmic hnRNP A1 has high affinity with AU-rich elements, whereas the nuclear hnRNP A1 has high affinity with a polypyrimidine stretch bordered by AG at the 3' ends of introns. hnRNP A1 is also involved in the replication of an RNA virus, such as mouse hepatitis virus (MHV), through an interaction with the transcription-regulatory region of viral RNA. Moreover, hnRNP A1, together with the scaffold protein septin 6, serves as host proteins to form a complex with NS5b and viral RNA, and further play important roles in the replication of Hepatitis C virus (HCV). hnRNP A1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The RRMs of hnRNP A1 play an important role in silencing the exon and the glycine-rich domain is responsible for protein-protein interactions. 77
30581 409995 cd12581 RRM2_hnRNPA2B1 RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNP A2/B1) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A2/B1, an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A2/B1 also functions as a splicing factor that regulates alternative splicing of the tumor suppressors, such as BIN1, WWOX, the antiapoptotic proteins c-FLIP and caspase-9B, the insulin receptor (IR), and the RON proto-oncogene among others. Overexpression of hnRNP A2/B1 has been described in many cancers. It functions as a nuclear matrix protein involving in RNA synthesis and the regulation of cellular migration through alternatively splicing pre-mRNA. It may play a role in tumor cell differentiation. hnRNP A2/B1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 80
30582 409996 cd12582 RRM2_hnRNPA3 RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A3, a novel RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE) independently of hnRNP A2 and participates in the trafficking of A2RE-containing RNA. hnRNP A3 can shuttle between the nucleus and the cytoplasm. It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 80
30583 241027 cd12583 RRM2_hnRNPD RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein D0 (hnRNP D0) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP D0, also termed AU-rich element RNA-binding protein 1, a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP D0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), in the middle and an RGG box rich in glycine and arginine residues in the C-terminal part. Each of RRMs can bind solely to the UUAG sequence specifically. 75
30584 409997 cd12584 RRM2_hnRNPAB RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A/B (hnRNP A/B) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A/B, also termed APOBEC1-binding protein 1 (ABBP-1), an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP A/B contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long C-terminal glycine-rich domain that contains a potential ATP/GTP binding loop. 80
30585 409998 cd12585 RRM2_hnRPDL RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein D-like (hnRNP DL) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP DL (or hnRNP D-like), also termed AU-rich element RNA-binding factor, or JKT41-binding protein (protein laAUF1 or JKTBP), a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. hnRNP DL binds single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA) in a non-sequence-specific manner, and interacts with poly(G) and poly(A) tenaciously. It contains two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 75
30586 409999 cd12586 RRM1_PSP1 RNA recognition motif 1 (RRM1) found in vertebrate paraspeckle protein 1 (PSP1). This subgroup corresponds to the RRM1 of PSPC1, also termed paraspeckle component 1 (PSPC1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Its cellular function remains unknown currently, however, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO), which localizes to paraspeckles in an RNA-dependent manner. PSPC1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 71
30587 410000 cd12587 RRM1_PSF RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF). This subgroup corresponds to the RRM1 of PSF, also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex, a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. Besides, it promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. Additionally, PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. 71
30588 410001 cd12588 RRM1_p54nrb RNA recognition motif 1 (RRM1) found in vertebrate 54 kDa nuclear RNA- and DNA-binding protein (p54nrb). This subgroup corresponds to the RRM1 of p54nrb, also termed non-POU domain-containing octamer-binding protein (NonO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is a multifunctional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. It is ubiquitously expressed and highly conserved in vertebrates. p54nrb binds both, single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. It forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner, as well as with polypyrimidine tract-binding protein-associated-splicing factor (PSF). p54nrb contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 71
30589 410002 cd12589 RRM2_PSP1 RNA recognition motif 2 (RRM2) found in vertebrate paraspeckle protein 1 (PSP1 or PSPC1). This subgroup corresponds to the RRM2 of PSPC1, also termed paraspeckle component 1 (PSPC1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Although its cellular function remains unknown currently, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO), which localizes to paraspeckles in an RNA-dependent manner. PSPC1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 80
30590 410003 cd12590 RRM2_PSF RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF). This subgroup corresponds to the RRM2 of PSF, also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex, a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. It promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). Moreover, PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. 80
30591 410004 cd12591 RRM2_p54nrb RNA recognition motif 2 (RRM2) found in vertebrate 54 kDa nuclear RNA- and DNA-binding protein (p54nrb). This subgroup corresponds to the RRM2 of p54nrb, also termed non-POU domain-containing octamer-binding protein (NonO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is a multifunctional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. It is ubiquitously expressed and highly conserved in vertebrates. It binds both, single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. p54nrb forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner. It also forms a heterodimer with polypyrimidine tract-binding protein-associated-splicing factor (PSF). p54nrb contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 80
30592 410005 cd12592 RRM_RBM7 RNA recognition motif (RRM) found in vertebrate RNA-binding protein 7 (RBM7). This subfamily corresponds to the RRM of RBM7, a ubiquitously expressed pre-mRNA splicing factor that enhances messenger RNA (mRNA) splicing in a cell-specific manner or in a certain developmental process, such as spermatogenesis. RBM7 interacts with splicing factors SAP145 (the spliceosomal splicing factor 3b subunit 2) and SRp20. It may play a more specific role in meiosis entry and progression. Together with additional testis-specific RNA-binding proteins, RBM7 may regulate the splicing of specific pre-mRNA species that are important in the meiotic cell cycle. RBM7 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. 75
30593 410006 cd12593 RRM_RBM11 RNA recognition motif (RRM) found in vertebrate RNA-binding protein 11 (RBM11). This subfamily corresponds to the RRM of RBM11, a novel tissue-specific splicing regulator that is selectively expressed in brain, cerebellum and testis, and to a lower extent in kidney. RBM11 is localized in the nucleoplasm and enriched in SRSF2-containing splicing speckles. It may play a role in the modulation of alternative splicing during neuron and germ cell differentiation. RBM11 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. The RRM of RBM11 is responsible for RNA binding, whereas the C-terminal region permits nuclear localization and homodimerization. 75
30594 410007 cd12594 RRM1_SRSF4 RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 4 (SRSF4). This subgroup corresponds to the RRM1 of SRSF4, also termed pre-mRNA-splicing factor SRp75, or SRP001LB, or splicing factor, arginine/serine-rich 4 (SFRS4). SRSF4 is a splicing regulatory serine/arginine (SR) protein that plays an important role in both constitutive splicing and alternative splicing of many pre-mRNAs. For instance, it interacts with heterogeneous nuclear ribonucleoproteins, hnRNP G and hnRNP E2, and further regulates the 5' splice site of tau exon 10, whose misregulation causes frontotemporal dementia. SRSF4 also induces production of HIV-1 vpr mRNA through the inhibition of the 5'-splice site of exon 3. In addition, it activates splicing of the cardiac troponin T (cTNT) alternative exon by direct interactions with the cTNT exon 5 enhancer RNA. SRSF4 can shuttle between the nucleus and cytoplasm. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine-rich region, an internal region homologous to the RRM, and a very long, highly phosphorylated C-terminal SR domain rich in serine-arginine dipeptides. 87
30595 410008 cd12595 RRM1_SRSF5 RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 5 (SRSF5). This subgroup corresponds to the RRM1 of SRSF5, also termed delayed-early protein HRS, or pre-mRNA-splicing factor SRp40, or splicing factor, arginine/serine-rich 5 (SFRS5). SRSF5 is an essential splicing regulatory serine/arginine (SR) protein that regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and it is necessary for enhancer activation. SRSF5 also functions as a factor required for insulin-regulated splice site selection for protein kinase C (PKC) betaII mRNA. It is involved in the regulation of PKCbetaII exon inclusion by insulin via its increased phosphorylation by a phosphatidylinositol 3-kinase (PI 3-kinase) signaling pathway. Moreover, SRSF5 can regulate alternative splicing in exon 9 of glucocorticoid receptor pre-mRNA in a dose-dependent manner. SRSF5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. The specific RNA binding by SRSF5 requires the phosphorylation of its SR domain. 70
30596 410009 cd12596 RRM1_SRSF6 RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 6 (SRSF6). This subfamily corresponds to the RRM1 of SRSF6, also termed pre-mRNA-splicing factor SRp55, which is an essential splicing regulatory serine/arginine (SR) protein that preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. For instance, it does not bind to the purine-rich sequence in the calcitonin-specific ESE, but binds to a region adjacent to the purine tract. Moreover, cellular levels of SRSF6 may control tissue-specific alternative splicing of the calcitonin/ calcitonin gene-related peptide (CGRP) pre-mRNA. SRSF6 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal SR domains rich in serine-arginine dipeptides. 72
30597 410010 cd12597 RRM1_SRSF1 RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM1 of SRSF1, also termed alternative-splicing factor 1 (ASF-1), or pre-mRNA-splicing factor SF2, P33 subunit. SRSF1 is a splicing regulatory serine/arginine (SR) protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF1 is a shuttling SR protein and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a long glycine-rich spacer, and a C-terminal RS domains rich in serine-arginine dipeptides. 79
30598 241042 cd12598 RRM1_SRSF9 RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 9 (SRSF9). This subgroup corresponds to the RRM1 of SRSF9, also termed pre-mRNA-splicing factor SRp30C. SRSF9 is an essential splicing regulatory serine/arginine (SR) protein that has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. SRSF9 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by an unusually short C-terminal RS domains rich in serine-arginine dipeptides. 72
30599 410011 cd12599 RRM1_SF2_plant_like RNA recognition motif 1 (RRM1) found in plant pre-mRNA-splicing factor SF2 and similar proteins. This subgroup corresponds to the RRM1 of SF2, also termed SR1 protein, a plant serine/arginine (SR)-rich phosphoprotein similar to the mammalian splicing factor SF2/ASF. It promotes splice site switching in mammalian nuclear extracts. SF2 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal domain rich in proline, serine and lysine residues (PSK domain), a composition reminiscent of histones. This PSK domain harbors a putative phosphorylation site for the mitotic kinase cyclin/p34cdc2. 72
30600 410012 cd12600 RRM2_SRSF4_like RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor 4 (SRSF4) and similar proteins. This subfamily corresponds to the RRM2 of three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS), serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). SRSF4 plays an important role in both, constitutive and alternative, splicing of many pre-mRNAs. It can shuttle between the nucleus and cytoplasm. SRSF5 regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and is essential for enhancer activation. SRSF6 preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. Members in this family contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domains rich in serine-arginine dipeptides. 72
30601 410013 cd12601 RRM2_SRSF1_like RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor SRSF1, SRSF9 and similar proteins. This subfamily corresponds to the RRM2 of serine/arginine-rich splicing factor SRSF1, SRSF9 and similar proteins. SRSF1, also termed ASF-1, is a shuttling SR protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF9, also termed SRp30C, has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. Both, SRSF1 and SRSF9, contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RS domains rich in serine-arginine dipeptides. 74
30602 410014 cd12602 RRM2_SF2_plant_like RNA recognition motif 2 (RRM2) found in plant pre-mRNA-splicing factor SF2 and similar proteins. This subfamily corresponds to the RRM2 of SF2, also termed SR1 protein, a plant serine/arginine (SR)-rich phosphoprotein similar to the mammalian splicing factor SF2/ASF. It promotes splice site switching in mammalian nuclear extracts. SF2 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal domain rich in proline, serine and lysine residues (PSK domain), a composition reminiscent of histones. This PSK domain harbors a putative phosphorylation site for the mitotic kinase cyclin/p34cdc2. 76
30603 410015 cd12603 RRM_hnRNPC RNA recognition motif (RRM) found in vertebrate heterogeneous nuclear ribonucleoprotein C1/C2 (hnRNP C1/C2). This subgroup corresponds to the RRM of heterogeneous nuclear ribonucleoprotein C (hnRNP) proteins C1 and C2, produced by a single coding sequence. They are the major constituents of the heterogeneous nuclear RNA (hnRNA) ribonucleoprotein (hnRNP) complex in vertebrates. They bind hnRNA tightly, suggesting a central role in the formation of the ubiquitous hnRNP complex. They are involved in the packaging of hnRNA in the nucleus and in processing of pre-mRNA such as splicing and 3'-end formation. hnRNP C proteins contain two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain that includes the variable region, the basic region and the KSG box rich in repeated Lys-Ser-Gly sequences, the leucine zipper, and the acidic region. The RRM is capable of binding poly(U). The KSG box may bind to RNA. The leucine zipper may be involved in dimer formation. The acidic and hydrophilic C-terminus harbors a putative nucleoside triphosphate (NTP)-binding fold and a protein kinase phosphorylation site. 84
30604 410016 cd12604 RRM_RALY RNA recognition motif (RRM) found in vertebrate RNA-binding protein Raly. This subgroup corresponds to the RRM of Raly, also termed autoantigen p542, or heterogeneous nuclear ribonucleoprotein C-like 2, or hnRNP core protein C-like 2, or hnRNP associated with lethal yellow protein homolog, an RNA-binding protein that may play a critical role in embryonic development. It is encoded by Raly, a ubiquitously expressed gene of unknown function. Raly shows a high degree of identity with the 5' sequences of p542 gene encoding autoantigen, which can cross-react with EBNA-1 of the Epstein Barr virus. Raly contains two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain that includes a unique glycine/serine-rich stretch. 76
30605 410017 cd12605 RRM_RALYL RNA recognition motif (RRM) found in vertebrate RNA-binding Raly-like protein (RALYL). This subgroup corresponds to the RRM of RALYL, also termed heterogeneous nuclear ribonucleoprotein C-like 3, or hnRNP core protein C-like 3, a putative RNA-binding protein that shows high sequence homology with Raly, an RNA-binding protein playing a critical role in embryonic development. The biological role of RALYL remains unclear. Like Raly, RALYL contains two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain. 69
30606 410018 cd12606 RRM1_RBM4 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 4 (RBM4). This subgroup corresponds to the RRM1 of RBM4, a ubiquitously expressed splicing factor that has two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may function as a translational regulator of stress-associated mRNAs and also plays a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. The C-terminal region may be crucial for nuclear localization and protein-protein interaction. The RRMs, in combination with the C-terminal region, are responsible for the splicing function of RBM4. 67
30607 410019 cd12607 RRM2_RBM4 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 4 (RBM4). This subgroup corresponds to the RRM2 of RBM4, a ubiquitously expressed splicing factor that has two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may function as a translational regulator of stress-associated mRNAs and also plays a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. The C-terminal region may be crucial for nuclear localization and protein-protein interaction. The RRMs, in combination with the C-terminal region, are responsible for the splicing function of RBM4. 67
30608 410020 cd12608 RRM1_CoAA RNA recognition motif 1 (RRM1) found in vertebrate RRM-containing coactivator activator/modulator (CoAA). This subgroup corresponds to the RRM1 of CoAA, also termed RNA-binding protein 14 (RBM14), or paraspeckle protein 2 (PSP2), or synaptotagmin-interacting protein (SYT-interacting protein), a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. It stimulates transcription through its interactions with coactivators, such as TRBP and CREB-binding protein CBP/p300, via the TRBP-interacting domain and interaction with an RNA-containing complex, such as DNA-dependent protein kinase-poly(ADP-ribose) polymerase complexes, via the RRMs. 69
30609 410021 cd12609 RRM2_CoAA RNA recognition motif 2 (RRM2) found in vertebrate RRM-containing coactivator activator/modulator (CoAA). This subgroup corresponds to the RRM2 of CoAA, also termed RNA-binding protein 14 (RBM14), or paraspeckle protein 2 (PSP2), or synaptotagmin-interacting protein (SYT-interacting protein), a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. It stimulates transcription through its interactions with coactivators, such as TRBP and CREB-binding protein CBP/p300, via the TRBP-interacting domain and interaction with an RNA-containing complex, such as DNA-dependent protein kinase-poly(ADP-ribose) polymerase complexes, via the RRMs. 68
30610 410022 cd12610 RRM1_SECp43 RNA recognition motif 1 (RRM1) found in tRNA selenocysteine-associated protein 1 (SECp43). This subgroup corresponds to the RRM1 of SECp43, an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. 84
30611 410023 cd12611 RRM1_NGR1_NAM8_like RNA recognition motif 1 (RRM1) found in yeast negative growth regulatory protein NGR1, yeast protein NAM8 and similar proteins. This subgroup corresponds to the RRM1 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both, RNA and single-stranded DNA (ssDNA), in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The subgroup also includes NAM8, a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 84
30612 410024 cd12612 RRM2_SECp43 RNA recognition motif 2 (RRM2) found in tRNA selenocysteine-associated protein 1 (SECp43). This subgroup corresponds to the RRM2 of SECp43, an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. 82
30613 410025 cd12613 RRM2_NGR1_NAM8_like RNA recognition motif 2 (RRM2) found in yeast negative growth regulatory protein NGR1, yeast protein NAM8 and similar proteins. This subgroup corresponds to the RRM2 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both, RNA and single-stranded DNA (ssDNA), in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The family also includes protein NAM8, which is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 80
30614 410026 cd12614 RRM1_PUB1 RNA recognition motif 1 (RRM1) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subgroup corresponds to the RRM1 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. It is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 74
30615 410027 cd12615 RRM1_TIA1 RNA recognition motif 1 (RRM1) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM1 of TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1) and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 74
30616 410028 cd12616 RRM1_TIAR RNA recognition motif 1 (RRM1) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM1 of nucleolysin TIAR, also termed TIA-1-related protein, and a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 81
30617 410029 cd12617 RRM2_TIAR RNA recognition motif 2 (RRM2) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM2 of nucleolysin TIAR, also termed TIA-1-related protein, a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal, highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 80
30618 410030 cd12618 RRM2_TIA1 RNA recognition motif 2 (RRM2) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM2 of p40-TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1), and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 78
30619 410031 cd12619 RRM2_PUB1 RNA recognition motif 2 (RRM2) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subgroup corresponds to the RRM2 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. It is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA). However, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 80
30620 241064 cd12620 RRM3_TIAR RNA recognition motif 3 (RRM3) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM3 of nucleolysin TIAR, also termed TIA-1-related protein, a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 73
30621 410032 cd12621 RRM3_TIA1 RNA recognition motif 3 (RRM3) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM3 of p40-TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1) and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 72
30622 410033 cd12622 RRM3_PUB1 RNA recognition motif 3 (RRM3) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subfamily corresponds to the RRM3 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. PUB1 is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 74
30623 410034 cd12623 RRM_PPARGC1A RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator 1-alpha (PGC-1alpha, or PPARGC-1-alpha) and similar proteins. This subgroup corresponds to the RRM of PGC-1alpha, also termed PPARGC-1-alpha, or ligand effect modulator 6, a member of a family of transcription coactivators that plays a central role in the regulation of cellular energy metabolism. As an inducible transcription coactivator, PGC-1alpha can interact with a broad range of transcription factors involved in a wide variety of biological responses, such as adaptive thermogenesis, skeletal muscle fiber type switching, glucose/fatty acid metabolism, and heart development. PGC-1alpha stimulates mitochondrial biogenesis and promotes oxidative metabolism. It participates in the regulation of both carbohydrate and lipid metabolism and plays a role in disorders such as obesity, diabetes, and cardiomyopathy. PGC-1alpha is a multi-domain protein containing an N-terminal activation domain region, a central region involved in the interaction with at least a nuclear receptor, and a C-terminal domain region. The N-terminal domain region consists of three leucine-rich motifs (L1, NR box 2 and 3), among which the two last are required for interaction with nuclear receptors, potential nuclear localization signals (NLS), and a proline-rich region overlapping a putative repression domain. The C-terminus of PGC-1alpha is composed of two arginine/serine-rich regions (SR domains), a putative dimerization domain, and an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). PGC-1alpha could interact favorably with single-stranded RNA. 91
30624 410035 cd12624 RRM_PRC RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator-related protein 1 (PRC) and similar proteins. This subgroup corresponds to the RRM of PRC, also termed PGC-1-related coactivator, a member of the PGC-1 family of transcriptional coactivators, which also includes peroxisome proliferator-activated receptor gamma coactivators PGC-1alpha and PGC-1beta. Unlike PGC-1alpha and PGC-1beta, PRC is ubiquitous and more abundantly expressed in proliferating cells than in growth-arrested cells. PRC has been implicated in the regulation of several metabolic pathways, mitochondrial biogenesis, and cell growth. It functions as a growth-regulated transcriptional cofactor activating many nuclear genes specifying mitochondrial respiratory function. PRC directly interacts with nuclear transcriptional factors implicated in respiratory chain expression including nuclear respiratory factors 1 and 2 (NRF-1 and NRF-2), CREB (cAMP-response element-binding protein), and estrogen-related receptor alpha (ERRalpha). It interacts indirectly with the NRF-2beta subunit through host cell factor (HCF), a cellular protein involved in herpes simplex virus (HSV) infection and cell cycle regulation. Furthermore, like PGC-1alpha and PGC-1beta, PRC can transactivate a number of NRF-dependent nuclear genes required for mitochondrial respiratory function, including those encoding cytochrome c, 5-aminolevulinate synthase, Tfam, TFB1M, and TFB2M. Further research indicates that PRC may also act as a sensor of metabolic stress that orchestrates a redox-sensitive program of inflammatory gene expression. 
PRC is a multi-domain protein containing an N-terminal activation domain, an LXXLL coactivator signature, a central proline-rich region, a tetrapeptide motif (DHDY) responsible for HCF binding, a C-terminal arginine/serine-rich (SR) domain, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 91
30625 241069 cd12625 RRM1_IGF2BP1 RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1). This subgroup corresponds to the RRM1 of IGF2BP1 (IGF2 mRNA-binding protein 1 or IMP-1), also termed coding region determinant-binding protein (CRD-BP), or VICKZ family member 1, or zipcode-binding protein 1 (ZBP-1). IGF2BP1 is a multi-functional regulator of RNA metabolism that has been implicated in the control of aspects of localization, stability, and translation for many mRNAs. It is predominantly located in cytoplasm and was initially identified as a trans-acting factor that interacts with the zipcode in the 3'- untranslated region (UTR) of the beta-actin mRNA, which is important for its localization and translational regulation. It inhibits IGF-II mRNA translation through binding to the 5'-UTR of the transcript. IGF2BP1 also acts as human immunodeficiency virus type 1 (HIV-1) Gag-binding factor that interacts with HIV-1 Gag protein and blocks the formation of infectious HIV-1 particles. IGF2BP1 promotes mRNA stabilization; it functions as a coding region determinant (CRD)-binding protein that binds to the coding region of betaTrCP1 mRNA and prevents miR-183-mediated degradation of betaTrCP1 mRNA. It also promotes c-myc mRNA stability by associating with the CRD and stabilizes CD44 mRNA via interaction with the 3'-UTR of the transcript. In addition, IGF2BP1 specifically interacts with both Hepatitis C virus (HCV) 5'-UTR and 3'-UTR, further recruiting eIF3 and enhancing HCV internal ribosome entry site (IRES)-mediated translation initiation via the 3'-UTR. IGF2BP1 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. It also contains two putative nuclear export signals (NESs) and a putative nuclear localization signal (NLS). 77
30626 241070 cd12626 RRM1_IGF2BP2 RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2). This subgroup corresponds to the RRM1 of IGF2BP2 (IGF2 mRNA-binding protein 2 or IMP-2), also termed hepatocellular carcinoma autoantigen p62, or VICKZ family member 2, which is a ubiquitously expressed RNA-binding protein involved in the stimulation of insulin action. It is predominantly nuclear. SNPs in IGF2BP2 gene are implicated in susceptibility to type 2 diabetes. IGF2BP2 plays an important role in cellular motility; it regulates the expression of PINCH-2, an important mediator of cell adhesion and motility, and MURF-3, a microtubule-stabilizing protein, through direct binding to their mRNAs. IGF2BP2 may be involved in the regulation of mRNA stability through the interaction with the AU-rich element-binding factor AUF1. IGF2BP2 binds initially to nascent beta-actin transcripts and facilitates the subsequent binding of the shuttling IGF2BP1. IGF2BP2 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 77
30627 410036 cd12627 RRM1_IGF2BP3 RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3). This subgroup corresponds to the RRM1 of IGF2BP3 (IGF2 mRNA-binding protein 3 or IMP-3), also termed KH domain-containing protein overexpressed in cancer (KOC), or VICKZ family member 3, an RNA-binding protein that plays an important role in the differentiation process during early embryogenesis. It is known to bind to and repress the translation of IGF2 leader 3 mRNA. IGF2BP3 also acts as a Glioblastoma-specific proproliferative and proinvasive marker acting through IGF2 resulting in the activation of oncogenic phosphatidylinositol 3-kinase/mitogen-activated protein kinase (PI3K/MAPK) pathways. IGF2BP3 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 77
30628 410037 cd12628 RRM2_IGF2BP1 RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1). This subgroup corresponds to the RRM2 of IGF2BP1 (IGF2 mRNA-binding protein 1 or IMP-1), also termed coding region determinant-binding protein (CRD-BP), or VICKZ family member 1, or zipcode-binding protein 1 (ZBP-1). IGF2BP1 is a multi-functional regulator of RNA metabolism that has been implicated in the control of aspects of localization, stability, and translation for many mRNAs. It is predominantly located in cytoplasm and was initially identified as a trans-acting factor that interacts with the zipcode in the 3'- untranslated region (UTR) of the beta-actin mRNA, which is important for its localization and translational regulation. It inhibits IGF-II mRNA translation through binding to the 5'-UTR of the transcript. IGF2BP1 also acts as human immunodeficiency virus type 1 (HIV-1) Gag-binding factor that interacts with HIV-1 Gag protein and blocks the formation of infectious HIV-1 particles. It promotes mRNA stabilization and functions as a coding region determinant (CRD)-binding protein that binds to the coding region of betaTrCP1 mRNA and prevents miR-183-mediated degradation of betaTrCP1 mRNA. It also promotes c-myc mRNA stability by associating with the CRD. It stabilizes CD44 mRNA via interaction with the 3'-UTR of the transcript. In addition, IGF2BP1 specifically interacts with both Hepatitis C virus (HCV) 5'-UTR and 3'-UTR, further recruiting eIF3 and enhancing HCV internal ribosome entry site (IRES)-mediated translation initiation via the 3'-UTR. IGF2BP1 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. It also contains two putative nuclear export signals (NESs) and a putative nuclear localization signal (NLS). 76
30629 410038 cd12629 RRM2_IGF2BP2 RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2). This subgroup corresponds to the RRM2 of IGF2BP2 (IGF2 mRNA-binding protein 2 or IMP-2), also termed hepatocellular carcinoma autoantigen p62, or VICKZ family member 2, a ubiquitously expressed RNA-binding protein involved in the stimulation of insulin action. It is predominantly nuclear. SNPs in IGF2BP2 gene are implicated in susceptibility to type 2 diabetes. IGF2BP2 plays an important role in cellular motility; it regulates the expression of PINCH-2, an important mediator of cell adhesion and motility, and MURF-3, a microtubule-stabilizing protein, through direct binding to their mRNAs. IGF2BP2 may be involved in the regulation of mRNA stability through the interaction with the AU-rich element-binding factor AUF1. In addition, IGF2BP2 binds initially to nascent beta-actin transcripts and facilitates the subsequent binding of the shuttling IGF2BP1. IGF2BP2 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 76
30630 410039 cd12630 RRM2_IGF2BP3 RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3). This subgroup corresponds to the RRM2 of IGF2BP3 (IGF2 mRNA-binding protein 3 or IMP-3), also termed KH domain-containing protein overexpressed in cancer (KOC), or VICKZ family member 3, an RNA-binding protein that plays an important role in the differentiation process during early embryogenesis. It is known to bind to and repress the translation of IGF2 leader 3 mRNA. IGF2BP3 also acts as a Glioblastoma-specific proproliferative and proinvasive marker acting through IGF2 resulting in the activation of oncogenic phosphatidylinositol 3-kinase/mitogen-activated protein kinase (PI3K/MAPK) pathways. IGF2BP3 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 76
30631 410040 cd12631 RRM1_CELF1_2_Bruno RNA recognition motif 1 (RRM1) found in CUGBP Elav-like family member CELF-1, CELF-2, Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM1 of CELF-1, CELF-2 and Bruno protein. CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP) and CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR) belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in regulation of pre-mRNA splicing, and control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. The human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. 
The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both, RRM1 and RRM2 of CELF-2, can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. This subgroup also includes Drosophila melanogaster Bruno protein, which plays a central role in regulation of Oskar (Osk) expression in flies. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 84
30632 410041 cd12632 RRM1_CELF3_4_5_6 RNA recognition motif 1 (RRM1) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subfamily corresponds to the RRM1 of CELF-3, CELF-4, CELF-5, CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. On the other hand, both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. 
CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 87
30633 241077 cd12633 RRM1_FCA RNA recognition motif 1 (RRM1) found in plant flowering time control protein FCA and similar proteins. This subgroup corresponds to the RRM1 of FCA, a gene controlling flowering time in Arabidopsis, encoding a flowering time control protein that functions in the posttranscriptional regulation of transcripts involved in the flowering process. FCA contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNP (ribonucleoprotein domains), and a WW protein interaction domain. 80
30634 410042 cd12634 RRM2_CELF1_2 RNA recognition motif 2 (RRM2) found in CUGBP Elav-like family member CELF-1, CELF-2 and similar proteins. This subgroup corresponds to the RRM2 of CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP), CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR), both of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. Human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). 
Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it preferentially binds to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both, RRM1 and RRM2 of CELF-2, can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. 81
30635 410043 cd12635 RRM2_CELF3_4_5_6 RNA recognition motif 2 (RRM2) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subgroup corresponds to the RRM2 of CELF-3, CELF-4, CELF-5, and CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, being highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. On the other hand, both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. 
CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, being strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 81
30636 410044 cd12636 RRM2_Bruno_like RNA recognition motif 2 (RRM2) found in Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM2 of Bruno, a Drosophila RNA recognition motif (RRM)-containing protein that plays a central role in regulation of Oskar (Osk) expression. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 81
30637 410045 cd12637 RRM2_FCA RNA recognition motif 2 (RRM2) found in plant flowering time control protein FCA and similar proteins. This subgroup corresponds to the RRM2 of FCA, a gene controlling flowering time in Arabidopsis, which encodes a flowering time control protein that functions in the posttranscriptional regulation of transcripts involved in the flowering process. The flowering time control protein FCA contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNP (ribonucleoprotein domains), and a WW protein interaction domain. 81
30638 241082 cd12638 RRM3_CELF1_2 RNA recognition motif 3 (RRM3) found in CUGBP Elav-like family member CELF-1, CELF-2 and similar proteins. This subgroup corresponds to the RRM3 of CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP) and CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR), both of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. Human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It specifically binds to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). 
Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. 92
30639 241083 cd12639 RRM3_CELF3_4_5_6 RNA recognition motif 3 (RRM3) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subgroup corresponds to the RRM3 of CELF-3, CELF-4, CELF-5, and CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. Both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. 
CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 79
30640 241084 cd12640 RRM3_Bruno_like RNA recognition motif 3 (RRM3) found in Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM3 of Bruno protein, a Drosophila RNA recognition motif (RRM)-containing protein that plays a central role in regulation of Oskar (Osk) expression. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 79
30641 410046 cd12641 RRM_TRA2B RNA recognition motif (RRM) found in Transformer-2 protein homolog beta (TRA-2 beta) and similar proteins. This subgroup corresponds to the RRM of TRA2-beta or TRA-2-beta, also termed splicing factor, arginine/serine-rich 10 (SFRS10), or transformer-2 protein homolog B, a mammalian homolog of Drosophila transformer-2 (Tra2). TRA2-beta is a serine/arginine-rich (SR) protein that controls the pre-mRNA alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP), the survival motor neuron 1 (SMN1) protein and the tau protein. It contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. TRA2-beta specifically binds to two types of RNA sequences, the CAA and (GAA)2 sequences, through the RRMs in different RNA binding modes. 87
30642 410047 cd12642 RRM_TRA2A RNA recognition motif (RRM) found in transformer-2 protein homolog alpha (TRA-2 alpha) and similar proteins. This subgroup corresponds to the RRM of TRA2-alpha or TRA-2-alpha, also termed transformer-2 protein homolog A, a mammalian homolog of Drosophila transformer-2 (Tra2). TRA2-alpha is a 40-kDa serine/arginine-rich (SR) protein (SRp40) that specifically binds to gonadotropin-releasing hormone (GnRH) exonic splicing enhancer on exon 4 (ESE4) and is necessary for enhanced GnRH pre-mRNA splicing. It strongly stimulates GnRH intron A excision in a dose-dependent manner. In addition, TRA2-alpha can interact with either 9G8 or SRp30c, which may also be crucial for ESE-dependent GnRH pre-mRNA splicing. TRA2-alpha contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. 84
30643 410048 cd12643 RRM_CFIm68 RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 68 kDa subunit (CFIm68 or CPSF6) and similar proteins. This subgroup corresponds to the RRM of CFIm68. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. The family includes CFIm68, also termed cleavage and polyadenylation specificity factor subunit 6 (CPSF6), or cleavage and polyadenylation specificity factor 68 kDa subunit (CPSF68), or protein HPBRII-4/7. CFIm68 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. The N-terminal RRM of CFIm68 mediates the interaction with CFIm25. It also serves to enhance RNA binding and facilitate RNA looping. 77
30644 410049 cd12644 RRM_CFIm59 RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 59 kDa subunit (CFIm59 or CPSF7) and similar proteins. This subgroup corresponds to the RRM of CFIm59. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. The family includes CFIm59, also termed cleavage and polyadenylation specificity factor subunit 7 (CPSF7), or cleavage and polyadenylation specificity factor 59 kDa subunit (CPSF59). CFIm59 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. The N-terminal RRM of CFIm59 mediates the interaction with CFIm25. It also serves to enhance RNA binding and facilitate RNA looping. 90
30645 241089 cd12645 RRM_SRSF3 RNA recognition motif (RRM) found in vertebrate serine/arginine-rich splicing factor 3 (SRSF3). This subgroup corresponds to the RRM of SRSF3, also termed pre-mRNA-splicing factor SRp20, a splicing regulatory serine/arginine (SR) protein that modulates alternative splicing by interacting with RNA cis-elements in a concentration- and cell differentiation-dependent manner. It is also involved in termination of transcription, alternative RNA polyadenylation, RNA export, and protein translation. SRSF3 is critical for cell proliferation and tumor induction and maintenance. SRSF3 can shuttle between the nucleus and cytoplasm. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 81
30646 410050 cd12646 RRM_SRSF7 RNA recognition motif (RRM) found in vertebrate serine/arginine-rich splicing factor 7 (SRSF7). This subgroup corresponds to the RRM of SRSF7, also termed splicing factor 9G8, a splicing regulatory serine/arginine (SR) protein that plays a crucial role in both constitutive splicing and alternative splicing of many pre-mRNAs. Its localization and functions are tightly regulated by phosphorylation. SRSF7 is predominantly present in the nucleus and can shuttle between the nucleus and cytoplasm. It cooperates with the export protein, Tap/NXF1, helps mRNA export to the cytoplasm, and enhances the expression of unspliced mRNA. SRSF7 inhibits tau E10 inclusion through directly interacting with the proximal downstream intron of E10, a clustering region for frontotemporal dementia with Parkinsonism (FTDP) mutations. SRSF7 contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a CCHC-type zinc knuckle motif in its median region, and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 77
30647 410051 cd12647 RRM_UHM_SPF45 RNA recognition motif (RRM) found in UHM domain of 45 kDa-splicing factor (SPF45) and similar proteins. This subgroup corresponds to the RRM of SPF45, also termed RNA-binding motif protein 17 (RBM17), an RNA-binding protein consisting of an unstructured N-terminal region, followed by a G-patch motif and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an Arg-Xaa-Phe sequence motif. SPF45 regulates alternative splicing of the apoptosis regulatory gene FAS (also known as CD95). It induces exon 6 skipping in FAS pre-mRNA through the UHM domain that binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) present in the 3' splice site-recognizing factors U2AF65, SF1 and SF3b155. 95
30648 410052 cd12648 RRM3_UHM_PUF60 RNA recognition motif 3 (RRM3) found in UHM domain of poly(U)-binding-splicing factor PUF60 and similar proteins. This subgroup corresponds to the RRM3 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1), an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. The research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 98
30649 241093 cd12649 RRM1_SXL RNA recognition motif 1 (RRM1) found in Drosophila sex-lethal (SXL) and similar proteins. This subfamily corresponds to the RRM1 of SXL which governs sexual differentiation and X chromosome dosage compensation in Drosophila melanogaster. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds also to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 81
30650 410053 cd12650 RRM1_Hu RNA recognition motif 1 (RRM1) found in the Hu proteins family. This subfamily corresponds to the RRM1 of the Hu proteins family which represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. 
They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 77
30651 410054 cd12651 RRM2_SXL RNA recognition motif 2 (RRM2) found in Drosophila sex-lethal (SXL) and similar proteins. This subfamily corresponds to the RRM2 of the sex-lethal protein (SXL) which governs sexual differentiation and X chromosome dosage compensation in Drosophila melanogaster. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds also to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 81
30652 410055 cd12652 RRM2_Hu RNA recognition motif 2 (RRM2) found in the Hu proteins family. This subfamily corresponds to the RRM2 of the Hu proteins family which represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Moreover, HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. 
They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 79
30653 410056 cd12653 RRM3_HuR RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM3 of HuR, also termed ELAV-like protein 1 (ELAV-1), the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 85
30654 241098 cd12654 RRM3_HuB RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM3 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. It is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 86
30655 410057 cd12655 RRM3_HuC RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM3 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 85
30656 241100 cd12656 RRM3_HuD RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM3 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells. And it also regulates the neurite elongation and morphological differentiation. HuD specifically bound poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 86
30657 410058 cd12657 RRM1_hnRNPM RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM1 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 76
30658 410059 cd12658 RRM1_MYEF2 RNA recognition motif 1 (RRM1) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM1 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity. 76
30659 410060 cd12659 RRM2_hnRNPM RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM2 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. It functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 76
30660 410061 cd12660 RRM2_MYEF2 RNA recognition motif 2 (RRM2) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM2 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity. 76
30661 410062 cd12661 RRM3_hnRNPM RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM3 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 77
30662 410063 cd12662 RRM3_MYEF2 RNA recognition motif 3 (RRM3) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM3 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity. 77
30663 410064 cd12663 RRM1_RAVER1 RNA recognition motif 1 (RRM1) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM1 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 71
30664 410065 cd12664 RRM1_RAVER2 RNA recognition motif 1 (RRM1) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM1 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 70
30665 410066 cd12665 RRM2_RAVER1 RNA recognition motif 2 (RRM2) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM2 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 77
30666 410067 cd12666 RRM2_RAVER2 RNA recognition motif 2 (RRM2) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM2 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 77
30667 410068 cd12667 RRM3_RAVER1 RNA recognition motif 3 (RRM3) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM3 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 92
30668 410069 cd12668 RRM3_RAVER2 RNA recognition motif 3 (RRM3) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM3 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 98
30669 410070 cd12669 RRM1_Nop12p_like RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 12 (Nop12p) and similar proteins. This subgroup corresponds to the RRM1 of Nop12p, which is encoded by YOL041C from Saccharomyces cerevisiae. It is a novel nucleolar protein required for pre-25S rRNA processing and normal rates of cell growth at low temperatures. Nop12p shares high sequence similarity with nucleolar protein 13 (Nop13p). Both, Nop12p and Nop13p, are not essential for growth. However, unlike Nop13p that localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent, Nop12p is localized to the nucleolus. Nop12p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 100
30670 410071 cd12670 RRM2_Nop12p_like RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 12 (Nop12p) and similar proteins. This subgroup corresponds to the RRM2 of Nop12p, which is encoded by YOL041C from Saccharomyces cerevisiae. It is a novel nucleolar protein required for pre-25S rRNA processing and normal rates of cell growth at low temperatures. Nop12p shares high sequence similarity with nucleolar protein 13 (Nop13p). Both, Nop12p and Nop13p, are not essential for growth. However, unlike Nop13p that localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent, Nop12p is localized to the nucleolus. Nop12p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 77
30671 410072 cd12671 RRM_CSTF2_CSTF2T RNA recognition motif (RRM) found in cleavage stimulation factor subunit 2 (CSTF2), cleavage stimulation factor subunit 2 tau variant (CSTF2T) and similar proteins. This subgroup corresponds to the RRM domain of CSTF2, its tau variant and eukaryotic homologs. CSTF2, also termed cleavage stimulation factor 64 kDa subunit (CstF64), is the vertebrate counterpart of yeast mRNA 3'-end-processing protein RNA15. It is expressed in all somatic tissues and is one of three cleavage stimulatory factor (CstF) subunits required for polyadenylation. CstF64 contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a CstF77-binding domain, a repeated MEARA helical region and a conserved C-terminal domain reported to bind the transcription factor PC-4. During polyadenylation, CstF interacts with the pre-mRNA through the RRM of CstF64 at U- or GU-rich sequences within 10 to 30 nucleotides downstream of the cleavage site. CSTF2T, also termed tauCstF64, is a paralog of the X-linked cleavage stimulation factor CstF64 protein that supports polyadenylation in most somatic cells. It is expressed during meiosis and subsequent haploid differentiation in a more limited set of tissues and cell types, largely in meiotic and postmeiotic male germ cells, and to a lesser extent in brain. The loss of CSTF2T will cause male infertility, as it is necessary for spermatogenesis and fertilization. Moreover, CSTF2T is required for expression of genes involved in morphological differentiation of spermatids, as well as for genes having products that function during interaction of motile spermatozoa with eggs. It promotes germ cell-specific patterns of polyadenylation by using its RRM to bind to different sequence elements downstream of polyadenylation sites than does CstF64. 85
30672 410073 cd12672 RRM_DAZL RNA recognition motif (RRM) found in vertebrate deleted in azoospermia-like (DAZL) proteins. This subgroup corresponds to the RRM of DAZL, also termed SPGY-like-autosomal, encoded by the autosomal homolog of DAZ gene, DAZL. It is ancestral to the deleted in azoospermia (DAZ) protein. DAZL is a germ-cell-specific RNA-binding protein that contains an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a DAZ motif, a protein-protein interaction domain. Although their specific biochemical functions remain to be investigated, DAZL proteins may interact with poly(A)-binding proteins (PABPs), and act as translational activators of specific mRNAs during gametogenesis. 82
30673 410074 cd12673 RRM_BOULE RNA recognition motif (RRM) found in protein BOULE. This subgroup corresponds to the RRM of BOULE, the founder member of the human DAZ gene family. Invertebrates contain a single BOULE, while vertebrates, other than catarrhine primates, possess both BOULE and DAZL genes. The catarrhine primates possess BOULE, DAZL, and DAZ genes. BOULE encodes an RNA-binding protein containing an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a single copy of the DAZ motif. Although its specific biochemical functions remain to be investigated, BOULE protein may interact with poly(A)-binding proteins (PABPs), and act as a translational activator of specific mRNAs during gametogenesis. 81
30674 410075 cd12674 RRM1_Nop4p RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM1 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 80
30675 410076 cd12675 RRM2_Nop4p RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM2 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 83
30676 410077 cd12676 RRM3_Nop4p RNA recognition motif 3 (RRM3) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM3 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 107
30677 410078 cd12677 RRM4_Nop4p RNA recognition motif 4 (RRM4) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM4 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 158
30678 410079 cd12678 RRM_SLTM RNA recognition motif (RRM) found in Scaffold attachment factor (SAF)-like transcription modulator (SLTM) and similar proteins. This subgroup corresponds to the RRM domain of SLTM, also termed modulator of estrogen-induced transcription, which shares high sequence similarity with scaffold attachment factor B1 (SAFB1). It contains a scaffold attachment factor-box (SAF-box, also known as SAP domain) DNA-binding motif, an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region rich in glutamine and arginine residues. To a large extent, SLTM co-localizes with SAFB1 in the nucleus, which suggests that they share similar functions, such as the inhibition of an oestrogen reporter gene. However, rather than mediating a specific inhibitory effect on oestrogen action, SLTM is shown to exert a generalized inhibitory effect on gene expression associated with induction of apoptosis in a wide range of cell lines. 74
30679 410080 cd12679 RRM_SAFB1_SAFB2 RNA recognition motif (RRM) found in scaffold attachment factor B1 (SAFB1), scaffold attachment factor B2 (SAFB2), and similar proteins. This subgroup corresponds to RRM of SAFB1, also termed scaffold attachment factor B (SAF-B), heat-shock protein 27 estrogen response element ERE and TATA-box-binding protein (HET), or heterogeneous nuclear ribonucleoprotein hnRNP A1- associated protein (HAP), a large multi-domain protein with well-described functions in transcriptional repression, RNA splicing and metabolism, and a proposed role in chromatin organization. Based on the numerous functions, SAFB1 has been implicated in many diverse cellular processes including cell growth and transformation, stress response, and apoptosis. SAFB1 specifically binds to AT-rich scaffold or matrix attachment region DNA elements (S/MAR DNA) by using its N-terminal scaffold attachment factor-box (SAF-box, also known as SAP domain), a homeodomain-like DNA binding motif. The central region of SAFB1 is composed of an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a nuclear localization signal (NLS). The C-terminus of SAFB1 contains Glu/Arg- and Gly-rich regions that might be involved in protein-protein interaction. Additional studies indicate that the C-terminal region contains a potent and transferable transcriptional repression domain. Another family member is SAFB2, a homolog of SAFB1. Both SAFB1 and SAFB2 are ubiquitously coexpressed and share very high sequence similarity, suggesting that they might function in a similar manner. However, unlike SAFB1, exclusively existing in the nucleus, SAFB2 is also present in the cytoplasm. The additional cytoplasmic localization of SAFB2 implies that it could play additional roles in the cytoplasmic compartment which are distinct from the nuclear functions shared with SAFB1. 76
30680 410081 cd12680 RRM_THOC4 RNA recognition motif (RRM) found in THO complex subunit 4 (THOC4) and similar proteins. This subgroup corresponds to the RRM of THOC4, also termed transcriptional coactivator Aly/REF, or ally of AML-1 and LEF-1, or bZIP-enhancing factor BEF, an mRNA transporter protein with a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It is involved in RNA transportation from the nucleus. THOC4 was initially identified as a transcription coactivator of LEF-1 and AML-1 for the TCRalpha enhancer function. In addition, THOC4 specifically binds to rhesus (RH) promoter in erythroid. It might be a novel transcription cofactor for erythroid-specific genes. 75
30681 410082 cd12681 RRM_SKAR RNA recognition motif (RRM) found in S6K1 Aly/REF-like target (SKAR) and similar proteins. This subgroup corresponds to the RRM of SKAR, also termed polymerase delta-interacting protein 3 (PDIP3), 46 kDa DNA polymerase delta interaction protein (PDIP46), belonging to the Aly/REF family of RNA binding proteins that have been implicated in coupling transcription with pre-mRNA splicing and nucleo-cytoplasmic mRNA transport. SKAR is widely expressed and localizes to the nucleus. It may be a critical player in the function of S6K1 in cell and organism growth control by binding the activated, hyperphosphorylated form of S6K1 but not S6K2. Furthermore, SKAR functions as a protein partner of the p50 subunit of DNA polymerase delta. In addition, SKAR may have particular importance in pancreatic beta cell size determination and insulin secretion. SKAR contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 69
30682 410083 cd12682 RRM_RBPMS RNA recognition motif (RRM) found in vertebrate RNA-binding protein with multiple splicing (RBP-MS). This subfamily corresponds to the RRM of RBP-MS, also termed heart and RRM expressed sequence (hermes), an RNA-binding protein found in various vertebrate species. It contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RBP-MS physically interacts with Smad2, Smad3 and Smad4 and plays a role in regulation of Smad-mediated transcriptional activity. In addition, RBP-MS may be involved in regulation of mRNA translation and localization during Xenopus laevis development. 76
30683 410084 cd12683 RRM_RBPMS2 RNA recognition motif (RRM) found in vertebrate RNA-binding protein with multiple splicing 2 (RBP-MS2). This subfamily corresponds to the RRM of RBP-MS2, encoded by RBPMS2 gene, a paralog of RNA-binding protein with multiple splicing (RBP-MS). The biological function of RBP-MS2 remains unclear. Like RBP-MS, RBP-MS2 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30684 410085 cd12684 RRM_cpo RNA recognition motif (RRM) found in Drosophila couch potato (cpo) coding RNA-binding protein and similar proteins. This subfamily corresponds to the RRM of Cpo, an RNA-binding protein encoded by Drosophila couch potato (cpo) gene. Cpo contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It may control the processing of RNA molecules required for the proper functioning of the peripheral nervous system (PNS). 83
30685 410086 cd12685 RRM_RBM20 RNA recognition motif (RRM) found in vertebrate RNA-binding protein 20 (RBM20). This subfamily corresponds to the RRM of RBM20, an alternative splicing regulator associated with dilated cardiomyopathy (DCM). It contains only one copy of RNA-recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30686 410087 cd12686 RRM1_PTBPH1_PTBPH2 RNA recognition motif 1 (RRM1) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM1 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both, PTBPH1 and PTBPH2, contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 81
30687 410088 cd12687 RRM1_PTBPH3 RNA recognition motif 1 (RRM1) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM1 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 75
30688 410089 cd12688 RRM1_PTBP1_like RNA recognition motif 1 (RRM1) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM1 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and functions at several aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein and negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 81
30689 410090 cd12689 RRM1_hnRNPL_like RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM1 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 80
30690 410091 cd12690 RRM3_PTBPH1_PTBPH2 RNA recognition motif 3 (RRM3) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM3 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both PTBPH1 and PTBPH2 contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 97
30691 241135 cd12691 RRM2_PTBPH1_PTBPH2 RNA recognition motif 2 (RRM2) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM2 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both PTBPH1 and PTBPH2 contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 95
30692 410092 cd12692 RRM2_PTBPH3 RNA recognition motif 2 (RRM2) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM2 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 88
30693 410093 cd12693 RRM2_PTBP1_like RNA recognition motif 2 (RRM2) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM2 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 96
30694 410094 cd12694 RRM2_hnRNPL_like RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM2 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 86
30695 410095 cd12695 RRM3_PTBP1 RNA recognition motif 3 (RRM3) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM3 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 93
30696 410096 cd12696 RRM3_PTBP2 RNA recognition motif 3 (RRM3) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM3 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 107
30697 410097 cd12697 RRM3_ROD1 RNA recognition motif 3 (RRM3) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM3 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 76
30698 410098 cd12698 RRM3_PTBPH3 RNA recognition motif 3 (RRM3) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subgroup corresponds to the RRM3 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 76
30699 410099 cd12699 RRM3_hnRNPL RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM3 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology with polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 77
30700 410100 cd12700 RRM3_hnRPLL RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM3 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 74
30701 410101 cd12701 RRM4_PTBP1 RNA recognition motif 4 (RRM4) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM4 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 76
30702 241146 cd12702 RRM4_PTBP2 RNA recognition motif 4 (RRM4) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM4 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 80
30703 410102 cd12703 RRM4_ROD1 RNA recognition motif 4 (RRM4) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM4 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 91
30704 410103 cd12704 RRM4_hnRNPL RNA recognition motif 4 (RRM4) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM4 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology with polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 84
30705 410104 cd12705 RRM4_hnRPLL RNA recognition motif 4 (RRM4) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM4 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 85
30706 410105 cd12706 RRM_LARP5 RNA recognition motif (RRM) found in vertebrate La-related protein 5 (LARP5 or LARP4B). This subgroup corresponds to the RRM of LARP5, a cytosolic protein that co-sediments with polysomes and accumulates upon stress induction in cellular stress granules. It can interact with the cytosolic poly(A) binding protein 1 (PABPC1) and the receptor for activated C Kinase (RACK1), a component of the 40S ribosomal subunit. LARP5 may function as a stimulatory factor of translation through bridging mRNA factors of the 3' end with initiating ribosomes. Like other La-related proteins (LARPs) family members, LARP5 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 77
30707 410106 cd12707 RRM_LARP4 RNA recognition motif (RRM) found in vertebrate La-related protein 4 (LARP4). This subgroup corresponds to the RRM of LARP4, a cytoplasmic factor that can bind poly(A) RNA and interact with poly(A) binding protein (PABP). It may play a role in promoting translation by stabilizing mRNA. LARP4 is structurally related to the La autoantigen. Like other La-related proteins (LARPs) family members, LARP4 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 77
30708 410107 cd12708 RRM_RCAN1 RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 1 (RCAN1). This subgroup corresponds to the RRM of RCAN1, also termed calcipressin-1, or Adapt78, or Down syndrome critical region protein 1, or myocyte-enriched calcineurin-interacting protein 1 (MCIP1), encoded by the Down syndrome critical region 1 (DSCR1) gene that is abundantly expressed in human brain, heart and muscles. Overexpressed RCAN1 functions as an inhibitor of the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), and is associated with Alzheimer's disease (AD) and Down syndrome (DS). RCAN1 can be phosphorylated by several kinases such as big MAP kinase 1 (BMK1), glycogen synthase kinase-3 (GSK-3), NF-kappaB inducing kinase (NIK), and protein kinase A (PKA). The phosphorylation of RCAN1 can positively or negatively regulate calcineurin-mediated gene transcription, and also affect its protein stability in the ubiquitin-proteasome pathway. RCAN1 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 93
30709 410108 cd12709 RRM_RCAN2 RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 2 (RCAN2). This subgroup corresponds to the RRM of RCAN2, also termed calcipressin-2, or Down syndrome candidate region 1-like 1 (DSCR1L1), or myocyte-enriched calcineurin-interacting protein 2 (MCIP2), or thyroid hormone-responsive protein ZAKI-4, encoded by a novel thyroid hormone-responsive gene ZAKI-4 that is abundantly expressed in human brain, heart and muscles. RCAN2 binds to the catalytic subunit of Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), calcineurin A, and inhibits its phosphatase activity through its C-terminal region. RCAN2 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 77
30710 410109 cd12710 RRM_RCAN3 RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 3 (RCAN3). This subgroup corresponds to the RRM of RCAN3, also termed calcipressin-3, or Down syndrome candidate region 1-like protein 2 (DSCR1L2), or myocyte-enriched calcineurin-interacting protein 3 (MCIP3), encoded by a ubiquitously expressed DSCR1L2 gene. Overexpressed RCAN3 binds and inhibits the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), and further down-regulates nuclear factor of activated T cells (NFAT)-dependent cytokine gene expression in activated human Jurkat T cells. Moreover, RCAN3 interacts with cardiac troponin I (TNNI3), a heart-specific inhibitory subunit of the troponin complex, and may play a role in cardiac contraction. RCAN3 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 77
30711 410110 cd12711 RRM_TNRC6A RNA recognition motif (RRM) found in vertebrate GW182 autoantigen. This subgroup corresponds to the RRM of the GW182 autoantigen, also termed trinucleotide repeat-containing gene 6A protein (TNRC6A), or CAG repeat protein 26, or EMSY interactor protein, or protein GW1, or glycine-tryptophan protein of 182 kDa, a phosphorylated cytoplasmic autoantigen involved in stabilizing and/or regulating translation and/or storing several different mRNAs. GW182 is characterized by multiple glycine/tryptophan (G/W) repeats and is a critical component of GW bodies (GWBs, also called mammalian processing bodies, or P bodies). The mRNAs associated with GW182 are presumed to reside within GWBs. GW182 has been shown to bind multiple Ago-miRNA complexes, and thus plays a key role in miRNA-mediated translational repression and mRNA degradation. In the absence of Ago2, GW182 may induce a translational silencing effect. GW182 is composed of an N-terminal G/W-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as a silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. 92
30712 410111 cd12712 RRM_TNRC6B RNA recognition motif (RRM) found in vertebrate trinucleotide repeat-containing gene 6B protein (TNRC6B). This subgroup corresponds to the RRM of TNRC6B, one of three GW182 paralogs in mammalian genomes. It is involved in miRNA-mediated mRNA degradation. TNRC6B is composed of an N-terminal glycine/tryptophan (G/W)-rich region; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. TNRC6B directly interacts with Argonaute (Ago) proteins through its N-terminal glycine/tryptophan (G/W)-rich region that is called Ago protein-binding domain. TNRC6B is enriched in P-bodies and its Q-rich domain is responsible for P-body localization. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as a silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half of TNRC6B comprising an RRM domain exerts a strong translation inhibition potential, which does not require either association with Agos or localization to P-bodies. 83
30713 410112 cd12713 RRM_TNRC6C RNA recognition motif (RRM) found in vertebrate trinucleotide repeat-containing gene 6C protein (TNRC6C). This subgroup corresponds to the RRM of TNRC6C, one of three GW182 paralogs in mammalian genomes. It is enriched in P-bodies and important for efficient miRNA-mediated repression. TNRC6C is composed of an N-terminal glycine/tryptophan (G/W)-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as a silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half containing the RRM domain functions as a key effector domain mediating protein synthesis repression by TNRC6C. 88
30714 410113 cd12714 RRM1_MATR3 RNA recognition motif 1 (RRM1) found in vertebrate matrin-3. This subgroup corresponds to the RRM1 of Matrin 3 (MATR3 or P130), a highly conserved inner nuclear matrix protein with a bipartite nuclear localization signal (NLS), two zinc finger domains predicted to bind DNA, and two RNA recognition motifs (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that are known to interact with RNA. MATR3 has been implicated in various biological processes. It is involved in RNA processing by interacting with other nuclear proteins to anchor hyperedited RNAs to the nuclear matrix. It plays a role in mRNA stabilization through maintaining the stability of certain mRNA species. Besides, it modulates the activity of proximal promoters by binding to highly repetitive sequences of matrix/scaffold attachment region (MAR/SAR). The phosphorylation of MATR3 is assumed to cause neuronal death. It is phosphorylated by the protein kinase ATM, which activates the cellular response to double strand breaks in the DNA. Its phosphorylation by protein kinase A (PKA) is responsible for the activation of the N-methyl-d-aspartic acid (NMDA) receptor. Furthermore, MATR3 has been identified as both a Ca2+-dependent CaM-binding protein and a downstream substrate of caspases. Additional research indicates that matrin 3 also binds Rev/Rev responsive element (RRE)-containing viral RNA and functions as a cofactor that mediates the post-transcriptional regulation of HIV-1. 76
30715 410114 cd12715 RRM2_MATR3 RNA recognition motif 2 (RRM2) found in vertebrate matrin-3. This subgroup corresponds to the RRM2 of Matrin 3 (MATR3 or P130), a highly conserved inner nuclear matrix protein with a bipartite nuclear localization signal (NLS), two zinc finger domains predicted to bind DNA, and two RNA recognition motifs (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that are known to interact with RNA. MATR3 has been implicated in various biological processes. It is involved in RNA processing by interacting with other nuclear proteins to anchor hyperedited RNAs to the nuclear matrix. It plays a role in mRNA stabilization through maintaining the stability of certain mRNA species. Besides, it modulates the activity of proximal promoters by binding to highly repetitive sequences of matrix/scaffold attachment region (MAR/SAR). The phosphorylation of MATR3 is assumed to cause neuronal death. It is phosphorylated by the protein kinase ATM, which activates the cellular response to double strand breaks in the DNA. Its phosphorylation by protein kinase A (PKA) is responsible for the activation of the N-methyl-d-aspartic acid (NMDA) receptor. Furthermore, MATR3 has been identified as both a Ca2+-dependent CaM-binding protein and a downstream substrate of caspases. Additional research indicates that matrin 3 also binds Rev/Rev responsive element (RRE)-containing viral RNA and functions as a cofactor that mediates the post-transcriptional regulation of HIV-1. 80
30716 410115 cd12716 RRM1_2_NP220 RNA recognition motif 1 (RRM1) and 2 (RRM2) found in vertebrate nuclear protein 220 (NP220). This subgroup corresponds to RRM1 and RRM2 of NP220, also termed zinc finger protein 638 (ZN638), or cutaneous T-cell lymphoma-associated antigen se33-1, or zinc finger matrin-like protein, a large nucleoplasmic DNA-binding protein that binds to cytidine-rich sequences, such as CCCCC (G/C), in double-stranded DNA (dsDNA). NP220 contains multiple domains, including MH1, MH2, and MH3, domains homologous to the acidic nuclear protein matrin 3; RS, an arginine/serine-rich domain commonly found in pre-mRNA splicing factors; PstI-HindIII, a domain essential for DNA binding; acidic repeat, a domain with nine repeats of the sequence LVTVDEVIEEEDL; and a Cys2-His2 zinc finger-like motif that is also present in matrin 3. It may be involved in packaging, transferring, or processing transcripts. This subgroup corresponds to the domain of MH2 that contains two tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 76
30717 410116 cd12717 RRM_ETP1 RNA recognition motif (RRM) found in yeast RING finger protein ETP1 and similar proteins. This subgroup corresponds to the RRM of ETP1, also termed BRAP2 homolog, or ethanol tolerance protein 1, the yeast homolog of BRCA1-associated protein (BRAP2) found in vertebrates. It may be involved in ethanol and salt-induced transcriptional activation of the NHA1 promoter and heat shock protein genes (HSP12 and HSP26), and participate in ethanol-induced turnover of the low-affinity hexose transporter Hxt3p. ETP1 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 83
30718 410117 cd12718 RRM_BRAP2 RNA recognition motif (RRM) found in BRCA1-associated protein (BRAP2). This subgroup corresponds to the RRM of BRAP2, also termed impedes mitogenic signal propagation (IMP), or ring finger protein 52, or renal carcinoma antigen NY-REN-63, a novel cytoplasmic protein interacting with the two functional nuclear localisation signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 may serve as a cytoplasmic retention protein and play a role in the regulation of nuclear protein transport. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 84
30719 410118 cd12719 RRM_SYNJ1 RNA recognition motif (RRM) found in synaptojanin-1 and similar proteins. This subgroup corresponds to the RRM of synaptojanin-1, also termed synaptojanin, or synaptic inositol-1,4,5-trisphosphate 5-phosphatase 1, originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis. It also acts as a Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase with a putative role in clathrin-mediated endocytosis. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p, a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2. Synaptojanin-1 has two tissue-specific alternative splicing isoforms, synaptojanin-145 expressed in brain and synaptojanin-170 expressed in peripheral tissues. Synaptojanin-145 is very abundant in nerve terminals and may play an essential role in the clathrin-mediated endocytosis of synaptic vesicles. In contrast to synaptojanin-145, synaptojanin-170 contains three unique asparagine-proline-phenylalanine (NPF) motifs in the C-terminal region and may function as a potential binding partner for Eps15, a clathrin coat-associated protein acting as a major substrate for the tyrosine kinase activity of the epidermal growth factor receptor. 77
30720 410119 cd12720 RRM_SYNJ2 RNA recognition motif (RRM) found in synaptojanin-2 and similar proteins. This subgroup corresponds to the RRM of synaptojanin-2, also termed synaptic inositol-1,4,5-trisphosphate 5-phosphatase 2, a ubiquitously expressed central regulatory enzyme in the phosphoinositide-signaling cascade. As a novel Rac1 effector regulating the early step of clathrin-mediated endocytosis, synaptojanin-2 acts as a polyphosphoinositide phosphatase directly and specifically interacting with Rac1 in a GTP-dependent manner. It mediates the inhibitory effect of Rac1 on endocytosis and plays an important role in the Rac1-mediated control of cell growth. Synaptojanin-2 shows high sequence homology to the N-terminal Sac1p homology domain, the central inositol 5-phosphatase domain, and the putative RNA recognition motif (RRM) of synaptojanin-1, but differs in the proline-rich region. 78
30721 410120 cd12721 RRM_Nup53p_fungi RNA recognition motif (RRM) found in yeast nucleoporin Nup53p and similar proteins. This subgroup corresponds to the RRM of Saccharomyces cerevisiae Nup53p, the ortholog of vertebrate nucleoporin Nup53. A unique property of yeast Nup53p is that it contains an additional Kap121p-binding domain and interacts specifically with the karyopherin Kap121p, which is involved in the assembly of Nup53p into NPCs. Like vertebrate Nup35, yeast Nup53p contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. The RRM domain lacks the conserved residues that typically bind RNA in canonical RRM domains. 86
30722 410121 cd12722 RRM_Nup53 RNA recognition motif (RRM) found in nucleoporin Nup53. This subgroup corresponds to the RRM of nucleoporin Nup53, also termed mitotic phosphoprotein 44 (MP-44), or nuclear pore complex protein Nup53, required for normal cell growth and nuclear morphology in vertebrate. It tightly associates with the nuclear envelope membrane and the nuclear lamina where it interacts with lamin B. It may also interact with a group of nucleoporins including Nup93, Nup155, and Nup205 and play a role in the association of the mitotic checkpoint protein Mad1 with the nuclear pore complex (NPC). Nup35 contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. This RRM lacks the conserved residues that typically bind RNA in canonical RRM domains. 74
30723 410122 cd12723 RRM1_CPEB1 RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein 1 (CPEB-1) and similar proteins. This subgroup corresponds to the RRM1 of CPEB-1 (also termed CPE-BP1 or CEBP), an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. CPEB-1 contains an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. Both of the RRMs and the Zn finger are required for CPEB-1 to bind CPE. The N-terminal regulatory region may be responsible for CPEB-1 interacting with other proteins. 101
30724 410123 cd12724 RRM1_CPEB2_like RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subgroup corresponds to the RRM1 of the paralog proteins CPEB-2, CPEB-3 and CPEB-4, all well conserved in both vertebrates and invertebrates. Due to the high sequence similarity, members in this family may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All family members contain an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. In addition, they do have conserved nuclear export signals that are not present in CPEB-1. 92
30725 410124 cd12725 RRM2_CPEB1 RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein 1 (CPEB-1) and similar proteins. This subgroup corresponds to the RRM2 of CPEB-1 (also termed CPE-BP1 or CEBP), an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. CPEB-1 contains an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. Both of the RRMs and the Zn finger are required for CPEB-1 to bind CPE. The N-terminal regulatory region may be responsible for CPEB-1 interacting with other proteins. 84
30726 410125 cd12726 RRM2_CPEB2_like RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subgroup corresponds to the RRM2 of the paralog proteins CPEB-2, CPEB-3 and CPEB-4, all well conserved in both vertebrates and invertebrates. Due to the high sequence similarity, members in this family may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All family members contain an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. In addition, they do have conserved nuclear export signals that are not present in CPEB-1. 81
30727 410126 cd12727 RRM_like_Smg4_UPF3A RNA recognition motif (RRM)-like Smg4_UPF3 domain in up-frameshift suppressor 3 homolog A (Upf3A). This subgroup corresponds to the RRM-like Smg4_UPF3 domain in Upf3A, also termed regulator of nonsense transcripts 3A, or nonsense mRNA reducing factor 3A, a human ortholog of yeast Upf3p and Caenorhabditis elegans SMG-4. It derives from gene UPF3A and is required for nonsense-mediated mRNA decay (NMD) in human. Upf3A is a nucleocytoplasmic shuttling protein that associates selectively with spliced beta-globin mRNA in vivo. Like other Upf3 proteins, Upf3A contains nuclear import and export signals, and a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that it may be an RNA binding protein. 87
30728 410127 cd12728 RRM_like_Smg4_UPF3B RNA recognition motif (RRM)-like Smg4_UPF3 domain in up-frameshift suppressor 3 homolog B on chromosome X (Upf3B). This subgroup corresponds to the RRM-like Smg4_UPF3 domain in Upf3B, also termed regulator of nonsense transcripts 3B, or nonsense mRNA reducing factor 3B, a human ortholog of yeast Upf3p and Caenorhabditis elegans SMG-4. It derives from X-linked gene UPF3B and is required for nonsense-mediated mRNA decay (NMD) in human. Upf3B is a nucleocytoplasmic shuttling protein that associates selectively with spliced beta-globin mRNA in vivo. Like other Upf3 proteins, Upf3B contains nuclear import and export signals, and a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that it may be an RNA binding protein. 89
30729 410128 cd12729 RRM1_hnRNPH_hnRNPH2_hnRNPF RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM1 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. These represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical. Both of them have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 79
30730 410129 cd12730 RRM1_GRSF1 RNA recognition motif 1 (RRM1) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subgroup corresponds to the RRM1 of GRSF-1, a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 79
30731 410130 cd12731 RRM2_hnRNPH_hnRNPH2_hnRNPF RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM2 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. These represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 90
30732 410131 cd12732 RRM2_hnRNPH3 RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP H3 (also termed hnRNP 2H9), a nuclear RNA binding protein that belongs to the hnRNP H protein family that also includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. This family is involved in mRNA processing and exhibits extensive sequence homology. Currently, little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock. In addition, the typical hnRNP H proteins contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C-terminus, which may allow it to homo- or heterodimerize. 96
30733 410132 cd12733 RRM3_GRSF1 RNA recognition motif 3 (RRM3) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subgroup corresponds to the RRM3 of G-rich sequence factor 1 (GRSF-1), a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 75
30734 410133 cd12734 RRM3_hnRNPH_hnRNPH2_hnRNPF RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM3 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F, which represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 76
30735 241179 cd12735 RRM3_hnRNPH3 RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3) and similar proteins. This subgroup corresponds to the RRM3 of hnRNP H3 (also termed hnRNP 2H9), a nuclear RNA binding protein that belongs to the hnRNP H protein family that also includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), and hnRNP F. This family is involved in mRNA processing and exhibits extensive sequence homology. Currently, little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock. In addition, the typical hnRNP H proteins contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C-terminus, which may allow it to homo- or heterodimerize. 75
30736 410134 cd12736 RRM1_ESRP1 RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein 1 (ESRP1) and similar proteins. This subgroup corresponds to the RRM1 of ESRP1, also termed RNA-binding motif protein 35A (RBM35A), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (p120-Catenin) and ENAH (hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. Additional research indicated that ESRP1 functions as a tumor suppressor in colon cancer cells. It may be involved in posttranscriptional regulation of various genes by exerting a differential effect on protein translation via 5' untranslated regions (UTRs) of mRNAs. ESRP1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 93
30737 410135 cd12737 RRM1_ESRP2 RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein 2 (ESRP2) and similar proteins. This subgroup corresponds to the RRM1 of ESRP2, also termed RNA-binding motif protein 35B (RBM35B), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. ESRP2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 80
30738 241182 cd12738 RRM1_Fusilli RNA recognition motif 1 (RRM1) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM1 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 80
30739 410136 cd12739 RRM2_ESRP1 RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein 1 (ESRP1) and similar proteins. This subgroup corresponds to the RRM2 of ESRP1, also termed RNA-binding motif protein 35A (RBM35A), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. Additional research indicated that ESRP1 functions as a tumor suppressor in colon cancer cells. It may be involved in posttranscriptional regulation of various genes by exerting a differential effect on protein translation via 5' untranslated regions (UTRs) of mRNAs. ESRP1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 111
30740 241184 cd12740 RRM2_ESRP2 RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein 2 (ESRP2) and similar proteins. This subgroup corresponds to the RRM2 of ESRP2, also termed RNA-binding motif protein 35B (RBM35B), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. ESRP2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 107
30741 410137 cd12741 RRM2_Fusilli RNA recognition motif 2 (RRM2) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM2 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 99
30742 410138 cd12742 RRM3_ESRP1_ESRP2 RNA recognition motif 3 (RRM3) found in epithelial splicing regulatory protein ESRP1, ESRP2 and similar proteins. This subgroup corresponds to the RRM3 of ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B). These are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding sites. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 81
30743 241187 cd12743 RRM3_Fusilli RNA recognition motif 3 (RRM3) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM3 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 85
30744 410139 cd12744 RRM1_RBM12B RNA recognition motif 1 (RRM1) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM1 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 79
30745 241189 cd12745 RRM1_RBM12 RNA recognition motif 1 (RRM1) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM1 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 92
30746 410140 cd12746 RRM2_RBM12B RNA recognition motif 2 (RRM2) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM2 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 86
30747 410141 cd12747 RRM2_RBM12 RNA recognition motif 2 (RRM2) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM2 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 75
30748 410142 cd12748 RRM4_RBM12B RNA recognition motif 4 (RRM4) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM4 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 76
30749 410143 cd12749 RRM4_RBM12 RNA recognition motif 4 (RRM4) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM4 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 88
30750 410144 cd12750 RRM5_RBM12B RNA recognition motif 5 (RRM5) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM5 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 77
30751 410145 cd12751 RRM5_RBM12 RNA recognition motif 5 (RRM5) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM5 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 76
30752 410146 cd12752 RRM1_RBM5 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 5 (RBM5). This subgroup corresponds to the RRM1 of RBM5, also termed protein G15, or putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9, a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both RBM5 and RBM6 specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 87
30753 410147 cd12753 RRM1_RBM10 RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 10 (RBM10). This subgroup corresponds to the RRM1 of RBM10, also termed G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), a paralog of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 is structurally related to RBM5 and RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 84
30754 410148 cd12754 RRM2_RBM10 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 10 (RBM10). This subgroup corresponds to the RRM2 of RBM10, also termed G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), a paralog of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 is structurally related to RBM5 and RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 87
30755 410149 cd12755 RRM2_RBM5 RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 5 (RBM5). This subgroup corresponds to the RRM2 of RBM5, also termed protein G15, or putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9, a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both, RBM5 and RBM6, specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 86
30756 410150 cd12756 RRM1_hnRNPD RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein D0 (hnRNP D0) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP D0, also termed AU-rich element RNA-binding protein 1, which is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP D0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), in the middle and an RGG box rich in glycine and arginine residues in the C-terminal part. Each of RRMs can bind solely to the UUAG sequence specifically. 74
30757 410151 cd12757 RRM1_hnRNPAB RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A/B (hnRNP A/B) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A/B, also termed APOBEC1-binding protein 1 (ABBP-1), which is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP A/B contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long C-terminal glycine-rich domain that contains a potential ATP/GTP binding loop. 80
30758 410152 cd12758 RRM1_hnRPDL RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein D-like (hnRNP D-like or hnRNP DL) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP DL (or hnRNP D-like), also termed AU-rich element RNA-binding factor, or JKT41-binding protein (protein laAUF1 or JKTBP), which is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. hnRNP DL binds single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA) in a non-sequence-specific manner, and interacts with poly(G) and poly(A) tenaciously. It contains two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 76
30759 241203 cd12759 RRM1_MSI1 RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog 1 (Musashi-1) and similar proteins. This subgroup corresponds to the RRM1 of Musashi-1. The mammalian MSI1 gene encoding Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells and associated with asymmetric divisions in neural progenitor cells. Musashi-1 is evolutionarily conserved from invertebrates to vertebrates. It is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). Musashi-1 has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, it represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-1 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 77
30760 410153 cd12760 RRM1_MSI2 RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog 2 (Musashi-2) and similar proteins. This subgroup corresponds to the RRM1 of Musashi-2 (also termed Msi2) which has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Musashi-2 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 93
30761 410154 cd12761 RRM1_hnRNPA1 RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A1, also termed helix-destabilizing protein, or single-strand RNA-binding protein, or hnRNP core protein A1, and is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A1 has been characterized as a splicing silencer, often acting in opposition to an activating hnRNP H. It silences exons when bound to exonic elements in the alternatively spliced transcripts of c-src, HIV, GRIN1, and beta-tropomyosin. hnRNP A1 can shuttle between the nucleus and the cytoplasm. Thus, it may be involved in transport of cellular RNAs, including the packaging of pre-mRNA into hnRNP particles and transport of poly A+ mRNA from the nucleus to the cytoplasm. The cytoplasmic hnRNP A1 has high affinity with AU-rich elements, whereas the nuclear hnRNP A1 has high affinity with a polypyrimidine stretch bordered by AG at the 3' ends of introns. hnRNP A1 is also involved in the replication of an RNA virus, such as mouse hepatitis virus (MHV), through an interaction with the transcription-regulatory region of viral RNA. hnRNP A1, together with the scaffold protein septin 6, serves as host protein to form a complex with NS5b and viral RNA, and further plays important roles in the replication of Hepatitis C virus (HCV). hnRNP A1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The RRMs of hnRNP A1 play an important role in silencing the exon and the glycine-rich domain is responsible for protein-protein interactions. 81
30762 410155 cd12762 RRM1_hnRNPA2B1 RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNP A2/B1) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A2/B1 which is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A2/B1 also functions as a splicing factor that regulates alternative splicing of the tumor suppressors, such as BIN1, WWOX, the antiapoptotic proteins c-FLIP and caspase-9B, the insulin receptor (IR), and the RON proto-oncogene among others. Moreover, the overexpression of hnRNP A2/B1 has been described in many cancers. It functions as a nuclear matrix protein involving in RNA synthesis and the regulation of cellular migration through alternatively splicing pre-mRNA. It may play a role in tumor cell differentiation. hnRNP A2/B1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 81
30763 410156 cd12763 RRM1_hnRNPA3 RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A3 which is a novel RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE) independently of hnRNP A2 and participates in the trafficking of A2RE-containing RNA. hnRNP A3 can shuttle between the nucleus and the cytoplasm. It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 81
30764 410157 cd12764 RRM2_SRSF4 RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 4 (SRSF4). This subgroup corresponds to the RRM2 of SRSF4, also termed pre-mRNA-splicing factor SRp75, or SRP001LB, or splicing factor, arginine/serine-rich 4 (SFRS4), a splicing regulatory serine/arginine (SR) protein that plays an important role in both constitutive splicing and alternative splicing of many pre-mRNAs. For instance, it interacts with heterogeneous nuclear ribonucleoproteins, hnRNP G and hnRNP E2, and further regulates the 5' splice site of tau exon 10, whose misregulation causes frontotemporal dementia. SFRS4 also induces production of HIV-1 vpr mRNA through the inhibition of the 5'-splice site of exon 3. In addition, SRSF4 activates splicing of the cardiac troponin T (cTNT) alternative exon by direct interactions with the cTNT exon 5 enhancer RNA. SRSF4 can shuttle between the nucleus and cytoplasm. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine-rich region, an internal region homologous to the RRM, and a very long, highly phosphorylated C-terminal RS domains rich in serine-arginine dipeptides. 97
30765 410158 cd12765 RRM2_SRSF5 RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 5 (SRSF5). This subgroup corresponds to the RRM2 of SRSF5, also termed delayed-early protein HRS, or pre-mRNA-splicing factor SRp40, or splicing factor, arginine/serine-rich 5 (SFRS5), an essential splicing regulatory serine/arginine (SR) protein that regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and it is necessary for enhancer activation. SRSF5 also functions as a factor required for insulin-regulated splice site selection for protein kinase C (PKC) betaII mRNA. It is involved in the regulation of PKCbetaII exon inclusion by insulin via its increased phosphorylation by a phosphatidylinositol 3-kinase (PI 3-kinase) signaling pathway. Moreover, SRSF5 can regulate alternative splicing in exon 9 of glucocorticoid receptor pre-mRNA in a dose-dependent manner. SRSF5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domains rich in serine-arginine dipeptides. The specific RNA binding by SRSF5 requires the phosphorylation of its SR domain. 81
30766 410159 cd12766 RRM2_SRSF6 RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 6 (SRSF6). This subgroup corresponds to the RRM2 of SRSF6, also termed pre-mRNA-splicing factor SRp55, an essential splicing regulatory serine/arginine (SR) protein that preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. For instance, it does not bind to the purine-rich sequence in the calcitonin-specific ESE, but binds to a region adjacent to the purine tract. Moreover, cellular levels of SRSF6 may control tissue-specific alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP) pre-mRNA. SRSF6 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domains rich in serine-arginine dipeptides. 73
30767 410160 cd12767 RRM2_SRSF1 RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM2 of SRSF1, also termed alternative-splicing factor 1 (ASF-1), or pre-mRNA-splicing factor SF2, P33 subunit, a splicing regulatory serine/arginine (SR) protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF1 is a shuttling SR protein and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a long glycine-rich spacer, and a C-terminal SR domains rich in serine-arginine dipeptides. 84
30768 410161 cd12768 RRM2_SRSF9 RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 9 (SRSF9). This subgroup corresponds to the RRM2 of SRSF9, also termed pre-mRNA-splicing factor SRp30C, an essential splicing regulatory serine/arginine (SR) protein that has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. SRSF9 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by an unusually short C-terminal RS domains rich in serine-arginine dipeptides. 84
30769 410162 cd12769 RRM1_HuR RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM1 of HuR, also termed ELAV-like protein 1 (ELAV-1), a ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response; it binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. Meanwhile, HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 82
30770 410163 cd12770 RRM1_HuD RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM1 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells, as well as the neurite elongation and morphological differentiation. HuD specifically binds poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 81
30771 410164 cd12771 RRM1_HuB RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM1 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads and is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 83
30772 410165 cd12772 RRM1_HuC RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM1 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 85
30773 410166 cd12773 RRM2_HuR RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM2 of HuR, also termed ELAV-like protein 1 (ELAV-1), the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 84
30774 410167 cd12774 RRM2_HuD RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM2 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells and also regulates the neurite elongation and morphological differentiation. HuD specifically binds poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 84
30775 410168 cd12775 RRM2_HuB RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM2 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. It is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 84
30776 241220 cd12776 RRM2_HuC RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM2 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 81
30777 410169 cd12777 RRM1_PTBP1 RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM1 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 81
30778 410170 cd12778 RRM1_PTBP2 RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM1 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 82
30779 410171 cd12779 RRM1_ROD1 RNA recognition motif 1 (RRM1) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM1 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 90
30780 410172 cd12780 RRM1_hnRNPL RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM1 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology to polypyrimidine tract-binding protein (PTB or hnRNP I). Both, hnRNP-L and PTB, are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 80
30781 410173 cd12781 RRM1_hnRPLL RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). This subgroup corresponds to the RRM1 of hnRNP-LL, which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 84
30782 410174 cd12782 RRM2_PTBP1 RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM2 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 108
30783 410175 cd12783 RRM2_PTBP2 RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM2 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 107
30784 410176 cd12784 RRM2_ROD1 RNA recognition motif 2 (RRM2) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM2 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein and negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 108
30785 410177 cd12785 RRM2_hnRNPL RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM2 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology to polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 100
30786 241230 cd12786 RRM2_hnRPLL RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM2 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 96
30787 213347 cd12787 RasGAP_plexin_B Ras-GTPase Activating Domain of type B plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. There are three members of the Plexin-B subfamily, namely B1, B2 and B3. Plexins-B1, B2 and B3 are receptors for Sema4D, Sema4C and Sema4G, and Sema5A, respectively. The activation of plexin-B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. By signaling the effect of Sema4C and Sema4G, the plexin-B2 receptor is critically involved in neural tube closure and cerebellar granule cell development. Plexin-B3, the receptor of Sema5A, is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin-B3 has been linked to verbal performance and white matter volume in human brain. Small GTPases play important roles in plexin-B signaling. Plexin-B1 activates Rho through Rho-specific guanine nucleotide exchange factors, leading to neurite retraction. Plexin-B1 possesses an intrinsic GTPase-activating protein activity for R-Ras and induces growth cone collapse through R-Ras inactivation. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator. 391
30788 213348 cd12788 RasGAP_plexin_D1 Ras-GTPase Activating Domain of plexin-D1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-D1 has been identified as the receptor of Sema3E. It binds to Sema3E directly with high affinity. Sema3E is implicated in axonal path finding and inhibition of developmental and postischemic angiogenesis. Plexin-D1 is broadly expressed on tumor vessels and tumor cells in a number of different types of human tumors. The Plexin-D1 and Sema3E interaction inhibits tumor growth but promotes invasiveness and metastasis. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 419
30789 213349 cd12789 RasGAP_plexin_C1 Ras-GTPase Activating Domain of plexin-C1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-C1 has been identified as the receptor of semaphorin 7A, which plays regulatory roles in both the immune and nervous systems. Unlike other semaphorins which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Plexin-C1 is a potential tumor suppressor for melanoma progression. The expression of Plexin-C1 is diminished or absent in human melanoma cell lines. Cofilin, an actin-binding protein involved in cell migration, is a downstream target of Sema7A and Plexin-C1 signaling. Melanoma invasion and metastasis may be promoted through the loss of Plexin-C1 inhibitory signaling on cofilin activation. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 393
30790 213350 cd12790 RasGAP_plexin_A Ras-GTPase Activating Domain of type A plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. They are divided into four types (A-D) according to sequence similarity. In vertebrates, there are four type A plexins (A1-A4) that serve as the co-receptors for neuropilins to mediate the signaling of class 3 semaphorins except Sema3E, which signals through Plexin-D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 1 and class 6 semaphorins signal through type A plexins, which mediate diverse biological functions including axon guidance, cardiovascular development, and immune function. Guanylyl cyclase Gyc76C and Off-track kinase (OTK), a putative receptor tyrosine kinase, modulate Sema1a and Plexin-A mediated axon repulsion. In their complex with Sema6s, type A plexins serve as signal-transducing subunits. An increasing number of molecules that interact with the intracellular region of Plexin-A have been identified; among them are IgCAMs (in axon guidance events) and Trem2-DAP12 (in immune responses). Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 385
30791 213351 cd12791 RasGAP_plexin_B3 Ras-GTPase Activating Domain of plexin-B3. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B3 is the receptor of semaphorin 5A. It is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin-B3 has been linked to verbal performance and white matter volume in human brain. Furthermore, Sema5A and plexin-B3 have been implicated in the progression of various types of cancer. They play an important role in the invasion and metastasis of gastric carcinoma. The protein and mRNA expression of Sema5A and its receptor plexin-B3 increased gradually in non-neoplastic mucosa, primary gastric carcinoma, and lymph node metastasis, and their expression is correlated. The stimulation of plexin-B3 by Sema5A binding in human glioma cells results in the inhibition of cell migration and invasion. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 397
30792 213352 cd12792 RasGAP_plexin_B2 Ras-GTPase Activating Domain of plexin-B2. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B2 serves as the receptor of Sema4C and Sema4G. By signaling the effect of Sema4C and Sema4G, the plexin-B2 receptor is critically involved in neural tube closure and cerebellar granule cell development. Mice lacking Plexin-B2 demonstrated defects in closure of the neural tube and disorganization of the embryonic brain. In developing kidney, Sema4C and Plexin-B2 signaling modulates ureteric branching. Plexin-B2 is expressed both in the pretubular aggregates and the ureteric epithelium in the developing kidney. Deletion of Plexin-B2 results in renal hypoplasia and occasional double ureters. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 400
30793 213353 cd12793 RasGAP_plexin_B1 Ras-GTPase Activating Domain of plexin-B1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B1 serves as the Semaphorin 4D receptor and functions as a regulator of developing neurons and a tumor suppressor protein for melanoma. The Sema4D and plexin-B1 signaling complex regulates dendritic and axonal complexity. The activation of Plexin-B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. As a tumor suppressor, plexin-B1 abrogates activation of the oncogenic receptor, c-Met, by its ligand, hepatocyte growth factor (HGF), in melanoma. Furthermore, plexin-B1 suppresses integrin-dependent migration and activation of pp125FAK and inhibits Rho activity. Plexin-B1 is highly expressed in endothelial cells and its activation by Sema4D elicits a potent proangiogenic response. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at the amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator. 394
30794 240614 cd12794 Hsm3_like Hsm3 is a yeast Proteasome chaperone of the 19S regulatory particle and related proteins. This group contains proteins related to the Hsm3 protein (Yeast Proteasome Interacting Protein) of Saccharomyces cerevisiae. S. cerevisiae Hsm3 is a chaperone of regulatory particles involved in proteasome assembly. The 26S Proteasome is a large, 2.5 MDa complex comprised of at least 33 subunits, and relies on chaperones to facilitate correct assembly. The proteasome contains a cylindrical 20S core particle and 1-2 19S regulatory particles, comprised of AAA-ATPase and non-ATPase subunits. The proteasome acts in ubiquitin-dependent proteolysis. The 19S RP targets and opens the ubiquitin-tagged substrate and releases ubiquitin. Hsm3 acts as a 19S chaperone, binding to the C-terminal domain of Rpt1 (one of the 6 ATPase subunits of the 19S regulatory particle). Hsm3 has a C-shape composed of 11 HEAT repeats. Mutations in the Hsm3-Rpt interface disrupt formation of the 26S Proteasome complex. 455
30795 240613 cd12795 FILIA_N_like FILIA-N KH-like domain. This group contains the N-terminal atypical KH domain of FILIA and related domains. FILIA is expressed in oocytes and embryo, and contains an atypical KH domain at the N-terminus with an N-terminal extension that interacts with RNA. RNA-binding may mediate RNA transcript regulation in oogenesis and embryogenesis. FILIA-N differs from typical KH domains by forming a stable dimer in solution and crystal structure. 114
30796 240609 cd12796 LbR_Ice_bind Ice-binding protein, left-handed beta-roll. The ice-binding protein of the grass Lolium perenne (LpIBP) discourages the recrystallization of ice. Ice-binding proteins produced by organisms to prevent the growth of ice are termed anti-freeze proteins. LpIBP consists of an unusual left-handed beta roll. Ice-binding is mediated by a flat beta-sheet on one side of the helix. 114
30797 410984 cd12797 M23_peptidase M23 family metallopeptidase, also known as beta-lytic metallopeptidase, and similar proteins. This model describes the metallopeptidase M23 family, which includes beta-lytic metallopeptidase and lysostaphin. Members of this family are zinc endopeptidases that lyse bacterial cell wall peptidoglycans; they cleave either the N-acylmuramoyl-Ala bond between the cell wall peptidoglycan and the cross-linking peptide (e.g. beta-lytic endopeptidase) or a bond within the cross-linking peptide (e.g. staphylolysin and lysostaphin). Beta-lytic metallopeptidase, formerly known as beta-lytic protease, has a preference for cleavage of Gly-X bonds and favors hydrophobic or apolar residues on either side. It inhibits growth of sensitive organisms and may potentially serve as an antimicrobial agent. Lysostaphin, produced by Staphylococcus genus, cleaves pentaglycine cross-bridges of cell wall peptidoglycan, acting as autolysins to maintain cell wall metabolism or as toxins and weapons against competing strains. Staphylolysin (also known as LasA) is implicated in a range of processes related to Pseudomonas virulence, including stimulating shedding of the ectodomain of cell surface heparan sulphate proteoglycan syndecan-1, and elastin degradation in connective tissue. Its active site is less constricted and contains a five-coordinate zinc ion with trigonal bipyramidal geometry and two metal-bound water molecules, possibly contributing to its activity against a wider range of substrates than those used by related lytic enzymes, consistent with its multiple roles in Pseudomonas virulence. The family includes members that do not appear to have the conserved zinc-binding site and might be lipoproteins lacking proteolytic activity. 85
30798 213998 cd12798 Alt_A1 Alternaria alternata allergen Alt a 1. Alt a 1 defines a new homologous protein family with unknown function exclusively found in fungi. The unique structure of Alt a 1 contains intramolecular disulfide bonds that are conserved among the Alt a 1 homologs. Residues reported to be IgE antibody-binding epitopes are exposed through dimerization via a conserved disulfide bond and hydrophobic and polar interactions. Further mechanistic structure/function studies will give insight into immunologic studies directed toward new forms of immunotherapy for Alternaria species-sensitive allergic patients. 132
30799 340366 cd12799 pesticin_lyz-like lysozyme-like C-terminal domain of pesticin and related proteins. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides. 129
30800 213999 cd12800 Sol_i_2 Sol i 2, a major allergen from fire ant venom. Sol i 2, one of four known potent allergens from the venom of red imported fire ant, is a powerful trigger of anaphylaxis. It causes production of IgE antibody in many individuals stung by fire ants. The closest structural homolog of Sol i 2 is the sequence-unrelated odorant binding protein and pheromone binding protein LUSH of the fruit fly Drosophila, suggesting a possible similar biological function. 118
30801 214000 cd12801 HopAB_KID Kinase-interacting domains of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular pattern)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB has an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain. 77
30802 214001 cd12802 HopAB_PID Pto-interacting domain of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular pattern)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB has an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain. 79
30803 214002 cd12803 HopAB_BID BAK1-interacting domain of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular pattern)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB has an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain. 80
30804 214003 cd12804 AKAP10_AKB PKA-binding (AKB) domain of A Kinase Anchor Protein 10. AKAPs coordinate the specificity of PKA signaling by facilitating the localization of the kinase to subcellular sites through their binding to regulatory (R) subunits of PKA. AKAP-10, also called PRKA10 or Dual-specific AKAP 2 (D-AKAP2), is a multisubunit protein containing two regulator of G protein signaling (RGS)-like domains and a PKA-binding (AKB) domain. The AKB domain of AKAP10 can bind to the dimerization/docking (D/D) domains of both RI and RII regulatory subunits of PKA. This model also includes a C-terminal PDZ-binding motif that binds to PDZK1 and NHERF-1, allowing AKAP10 to link indirectly to membrane proteins. Mutations in AKAP10 can alter its binding to R subunits, which may alter the targeting of PKA; some AKAP10 mutations are associated with abnormalities including hypertension, increased risk of severe arrhythmias during kidney transplantation, and familial breast cancer. 45
30805 214004 cd12805 Allergen_V_VI Group V, VI major allergens from grass, including Phl p 5, Phl p 6, Pha a 5 and Lol p 5. This family contains major allergens from various grass pollen, including Phl p 5 and Phl p 6 (timothy grass), Lol p 5 (rye grass) and Pha a 5 (canary grass). They induce allergic rhinitis and bronchial asthma in millions of allergic patients worldwide. These group V and group VI grass-pollen allergens belong to a new class of protease-resistant four-helix-bundle domains, which also have internal helix-turn-helix homology pointing to a special type of four-helix bundle topology, defined as twinned two-helix bundle. IgE binding experiments with recombinant Phl p 6 fragments indicated that the N terminus of the allergen is required for IgE recognition. Immunotherapy treatment for these allergies generally involves administration of grass pollen extracts which induce an initial rise in specific immunoglobulin E (sIgE) production followed by a progressive decline during the treatment. 85
30806 214005 cd12806 Esterase_713_like Novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. This enzyme is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized. 261
30807 214006 cd12807 Esterase_713 Novel bacterial esterase 713 that cleaves esters on halogenated cyclic compounds. This family contains proteins similar to a novel bacterial esterase (esterase 713) with the alpha/beta hydrolase fold that cleaves esters on halogenated cyclic compounds. This Alcaligenes esterase, however, does not contain the GXSXXG pentapeptide around the active site serine residue as seen in other esterase families. This enzyme is active as a dimer though its natural substrate is unknown. It has two distinct disulfide bridges; one formed between adjacent cysteines appears to facilitate the correct formation of the oxyanion cleft in the catalytic site. Esterase 713 also resembles human pancreatic lipase in its location of the acidic residue of the catalytic triad. It is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized. 315
30808 214007 cd12808 Esterase_713_like-1 Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. 309
30809 214008 cd12809 Esterase_713_like-2 Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. 280
30810 214009 cd12810 Esterase_713_like-3 Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. 328
30811 411995 cd12811 MALA Mala s 1 allergenic protein and similar proteins. This family includes the yeast Malassezia sympodialis allergen Mala s 1 which is localized in the cell wall and exposed on the cell surface. It can elicit specific IgE and T-cell activity in patients with atopic eczema (AE), a chronic inflammatory disease. Mala s 1 does not show any significant sequence homology to characterized proteins. However, its structure is a beta-propeller which is a novel fold among allergens. 304
30812 214010 cd12812 BPSL1549 Burkholderia Lethal Factor 1. BPSL1549, also suggested to be called Burkholderia lethal factor 1, is a protein of unknown function from Burkholderia pseudomallei, a causative agent of melioidosis (also called Whitmore's disease). This protein shows similarity to Escherichia coli cytotoxic necrotizing factor 1 which has been found to act as a potent cytotoxin against eukaryotic cells and is lethal when administered to mice. BPSL1549 expression levels correlate with suppression or promotion of pathogenic conditions. BPSL1549 inhibits helicase activity of translation initiation factor eIF4A. As yet, there is no vaccine and the organism is multidrug resistant. 203
30813 240610 cd12813 LbR-like Left-handed beta-roll, including virulence factors and various other proteins. This family contains a variety of protein domains with a left-handed beta-roll structure, including cell surface adhesion proteins, bacterial virulence factors, ice-binding proteins, and proteins with other activities. UspA1 Head And Neck Domain and YadA of Yersinia are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric beta-rolls of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. The collagen-binding virulence factor YadA is an adhesion protein found in several Yersinia species, alongside related cell surface proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. The ice-binding protein of the grass Lolium perenne (LpIBP) discourages the recrystallization of ice. Ice-binding proteins produced by organisms to prevent the growth of ice are termed anti-freeze proteins. LpIBP consists of an unusual left-handed beta roll. Ice-binding is mediated by a flat beta-sheet on one side of the helix. These domains form a left-handed beta roll made up of a series of short repeated elements. 99
30814 240611 cd12819 LbR_vir_like Cell adhesion-like domain, left-handed beta-roll. This group contains proteins of unknown function related to characterized cell surface adhesion proteins with a left-handed beta-roll, like the UspA1 Head And Neck Domain and YadA of Yersinia. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. The collagen-binding virulence factor YadA is an adhesion protein found in several Yersinia species, alongside related cell surface proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left-handed beta roll made up of a series of short repeated elements. 111
30815 240612 cd12820 LbR_YadA-like YadA-like, left-handed beta-roll. This group contains the collagen-binding domain virulence factor YadA, an adhesion protein of several Yersinia species, and related cell surface proteins, including Moraxella catarrhalis UspA-like proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left-handed beta roll made up of a series of short repeated elements. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric left-handed parallel beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. 126
30816 213355 cd12821 EcCorA_ZntB-like Escherichia coli CorA-Salmonella typhimurium ZntB_like family. A family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of this family are found in all three kingdoms of life. It is a functionally diverse family, including the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+, and Ni2+ ), and the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). It also includes two Saccharomyces cerevisiae members: the inner membrane Mg2+ transporters Mfm1p/Lpe10p, and Mrs2p, and a family of Arabidopsis thaliana members (AtMGTs) some of which are localized to distinct tissues, and not all of which can transport Mg2+. Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 285
30817 213356 cd12822 TmCorA-like Thermotoga maritima CorA-like family. This family belongs to the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the Thermotoga maritima CorA_like family are found in all three kingdoms of life. It is a functionally diverse family, in addition to the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, it includes three Saccharomyces cerevisiae members: two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 289
30818 213357 cd12823 Mrs2_Mfm1p-like Saccharomyces cerevisiae inner mitochondrial membrane Mg2+ transporters Mfm1p and Mrs2p-like family. A eukaryotic subfamily belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like family (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This functionally diverse subfamily includes the inner mitochondrial membrane Mg2+ transporters Saccharomyces cerevisiae Mfm1p/Lpe10p, Mrs2p, and human MRS2/ MRS2L. It also includes a family of Arabidopsis thaliana proteins (AtMGTs) some of which are localized to distinct tissues, and not all of which can transport Mg2+. Structures of the intracellular domain of two EcCorA_ZntB-like family transporters: Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 323
30819 213358 cd12824 ZntB-like Salmonella typhimurium Zn2+ transporter ZntB-like subfamily. A bacterial subfamily belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like family (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 290
30820 213359 cd12825 EcCorA-like Escherichia coli Mg2+ transporter CorA_like subfamily. A bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB_like(EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Mg2+ transporters Escherichia coli, Salmonella typhimurium, and Helicobacter pylori CorAs (which can also transport Co2+, and Ni2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 287
30821 213360 cd12826 EcCorA_ZntB-like_u1 uncharacterized bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB family. An uncharacterized subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB (EcCorA-ZntB_like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA-ZntB_like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA-ZntB_like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 281
30822 213361 cd12827 EcCorA_ZntB-like_u2 uncharacterized bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB family. An uncharacterized subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB (EcCorA-ZntB_like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA-ZntB-like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA-ZntB-like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 289
30823 213362 cd12828 TmCorA-like_1 Thermotoga maritima CorA_like subfamily. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of this subfamily are found in all three kingdoms of life. It is a functionally diverse subfamily, in addition to the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, it includes Methanosarcina mazei CorA which may be involved in transport of copper and/or other divalent metal ions. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 294
30824 213363 cd12829 Alr1p-like Saccharomyces cerevisiae Alr1p-like subfamily. This eukaryotic subfamily belongs to the Thermotoga maritima CorA (TmCorA)-family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes three Saccharomyces cerevisiae members: two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 305
30825 213364 cd12830 MtCorA-like Mycobacterium tuberculosis CorA-like subfamily. This bacterial subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Mg2+ transporter Mycobacterium tuberculosis CorA (which also transports Co2+). Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 292
30826 213365 cd12831 TmCorA-like_u2 Uncharacterized bacterial subfamily of the Thermotoga maritima CorA-like family. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the TmCorA-like family are found in all three kingdoms of life. It is a functionally diverse family which includes the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and three Saccharomyces cerevisiae proteins: two located in the plasma membrane: the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 287
30827 213366 cd12832 TmCorA-like_u3 Uncharacterized subfamily of the Thermotoga maritima CorA-like family. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the TmCorA-like family are found in all three kingdoms of life. It is a functionally diverse family which includes the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and three Saccharomyces cerevisiae proteins: two located in the plasma membrane: the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport. 287
30828 213367 cd12833 ZntB-like_1 Salmonella typhimurium Zn2+ transporter ZntB-like subgroup. A bacterial subgroup belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like family (EcCorA_ZntB-like) of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 290
30829 213368 cd12834 ZntB_u1 Uncharacterized bacterial subgroup of the Salmonella typhimurium Zn2+ transporter ZntB-like subfamily. The MIT superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. The ZntB-like subfamily includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 290
30830 213369 cd12835 EcCorA-like_1 Escherichia coli Mg2+ transporter CorA_like subgroup. A bacterial subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Mg2+ transporters Escherichia coli CorA and Salmonella typhimurium CorA (which can also transport Co2+, and Ni2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 287
30831 213370 cd12836 HpCorA-like Mg2+ transporter Helicobacter pylori CorA-like subgroup. A bacterial subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Mg2+ transporter Helicobacter pylori CorAs (which can also transport Co2+, and Ni2+); CorA plays an important role in the viability of this pathogen. Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB (members of the EcCorA_ZntB-like family) form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 288
30832 213371 cd12837 EcCorA-like_u1 uncharacterized subgroup of the Escherichia coli Mg2+ transporter CorA_like subfamily. An uncharacterized subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like family (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA_ZntB-like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA_ZntB-like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport. 298
30833 214011 cd12838 Killer_toxin_alpha Alpha subunit of killer toxin from halotolerant yeast. This family contains the alpha subunit of killer toxins that are secreted by several strains of yeasts and fungi. These toxins are proteinaceous substances that kill sensitive strains. The halotolerant yeast Pichia farinosa KK1 strain produces the SMK toxin, with maximum killer activity under acidic pH and high salt concentration. This toxin is composed of alpha and beta subunits that interact tightly with each other under acidic conditions but easily dissociate and lose activity under neutral conditions. Its topology is similar to that of the fungal killer toxin, KP4, which contains a rare structural motif, suggesting that these toxins may be evolutionarily and/or functionally related. 62
30834 214012 cd12839 Killer_toxin_beta Beta subunit of killer toxin from halotolerant yeast. This family contains the beta subunit of killer toxins that are secreted by several strains of yeasts and fungi. These toxins are proteinaceous substances that kill sensitive strains. The halotolerant yeast Pichia farinosa KK1 strain produces the SMK toxin, with maximum killer activity under acidic pH and high salt concentration. This toxin is composed of alpha and beta subunits that interact tightly with each other under acidic conditions but easily dissociate and lose activity under neutral conditions. Its topology is similar to that of the fungal killer toxin, KP4, which contains a rare structural motif, suggesting that these toxins may be evolutionarily and/or functionally related. 74
30835 214013 cd12840 CarS Antirepressor CarS. CarS, an antirepressor present in Cystobacterineae, recognizes repressors to turn on the photo-inducible promoter P(B). In the dark, access to the P(B) promoter is blocked by the repressor CarA. Blue light causes expression of CarS, leading the way to the CarA-CarS interaction which dismantles the CarA-operator complex, resulting in the derepression of the P(B) promoter. A parallel pathway for regulating P(B) involves the interaction of CarS with the repressor CarH, which shares the domain architecture of CarA. CarH and CarA contain an N-terminal, MerR-type winged-helix DNA-binding domain that recognizes CarS. CarS adopts an SH3-like fold with loop length variations and acts as an operator DNA mimic. 80
30836 214014 cd12841 TM_EphA1 Transmembrane domain of Ephrin Receptor A1 Protein Tyrosine Kinase. Ephrin receptors (EphRs) comprise the largest subfamily of receptor PTKs, and are classified into two classes (EphA and EphB), corresponding to binding preferences for either GPI-anchored ephrin-A ligands or transmembrane ephrin-B ligands. Vertebrates have ten EphA and six EphB receptors, which display promiscuous ligand interactions within each class. EphA1 has been associated with late-onset Alzheimer's disease and certain cancers such as colorectal and gastric carcinomas. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a single-span transmembrane (TM) domain, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. This allows ephrin/EphR dimers to form, leading to the activation of the intracellular tyr kinase domain. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The main effect of ephrin/EphR interaction is cell-cell repulsion or adhesion. Ephrin/EphR signaling is important in neural development and plasticity, cell morphogenesis and proliferation, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The TM domain mediates dimerization. 38
30837 410985 cd12842 IGCP_Hfx_cass2 integron gene cassette protein (IGCP) Hfx_cass2 and similar proteins. This family contains the unique integron gene cassette protein Hfx_cass2 and similar proteins that have yet to be characterized. The structure of Hfx_cass2 depicts a homodimer incorporating a compact all-alpha fold of six helical segments with a core central bundle of helices. It has a surface cleft reminiscent of an enzyme active site. This family may allow an assessment of the impact of the integron/gene cassette system on the emergence of new phenotypes, such as drug resistance or virulence. 110
30838 240608 cd12843 Bvu_2165_C_like The C-terminal domain of uncharacterized bacterial proteins. This family contains the C-terminal domain of uncharacterized hypothetical proteins from bacteria, including Bacteroides vulgatus Bvu_2165. The structure of Bvu_2165 is dimeric, with an extensive binding interface. 105
30839 240607 cd12869 MqsR Motility quorum-sensing regulator (MqsR). This family includes domains similar to the motility quorum-sensing regulator MqsR, a toxin that is highly upregulated in persisters (dormant cells found in biofilms that are a source of antibiotic resistance). MqsR pairs with its antitoxin MqsA, forming a unique family of toxin:antitoxin (TA) systems. MqsR has been found to be structurally homologous to the bacterial ribonuclease (RelE) toxins; however, its sequence is not similar to any other known toxins and therefore its molecular function is as yet unknown. 98
30840 240606 cd12870 MqsA antitoxin MqsA for MqsR toxin. This family includes domains similar to the antitoxin MqsA that binds motility quorum-sensing regulator MqsR, a toxin that is highly upregulated in persisters (dormant cells found in biofilms that are a source of antibiotic resistance), thus forming a unique toxin:antitoxin (TA) pair. MqsA neutralizes MqsR toxicity. It binds its own promoter as well as those of genes important for E. coli physiology, such as mcbR and spy. It also binds zinc and has been shown to coordinate DNA via its C-terminal domain. This family also includes the B. subtilis YokU protein, which is functionally uncharacterized. 66
30841 214015 cd12871 Bacuni_01323_like Uncharacterized protein conserved in Bacteroidetes. A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure. 231
30842 293932 cd12872 SPRY_Ash2 SPRY domain in Ash2. This SPRY domain is found at the C-terminus of Ash2 (absent, small, or homeotic discs 2) -like proteins, core components of all mixed-lineage leukemia (MLL) family histone methyltransferases. Ash2 is a member of the trithorax group of transcriptional regulators of the Hox genes. Recent studies show that the SPRY domain of Ash2 mediates the interaction with RbBP5 and has an important role in regulating the methyltransferase activity of MLL complexes. In yeast, Ash2 is involved in histone methylation and is required for the earliest stages of embryogenesis. 150
30843 293933 cd12873 SPRY_DDX1 SPRY domain associated with DEAD box gene DDX1. This SPRY domain is associated with the DEAD box gene, DDX1, an RNA-dependent ATPase involved in HIV-1 Rev function and virus replication. It is suggested that DDX1 acts as a cellular cofactor by promoting oligomerization of Rev on the Rev response element (RRE). DDX1 RNA is overexpressed in breast cancer, data showing a strong and independent association between poor prognosis and deregulation of the DEAD box protein DDX1, thus potentially serving as an effective prognostic biomarker for early recurrence in primary breast cancer. DDX1 also interacts with RelA and enhances nuclear factor kappaB-mediated transcription. DEAD-box proteins are associated with all levels of RNA metabolism and function, and have been implicated in translation initiation, transcription, RNA splicing, ribosome assembly, RNA transport, and RNA decay. 155
30844 293934 cd12874 SPRY_PRY PRY/SPRY domain, also known as B30.2. This domain contains residues in the N-terminus that form a distinct PRY domain structure such that the B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Among the TRIM proteins, also known as the N-terminal RING finger/B-box/coiled coil (RBCC) family, only Classes I and II contain the B30.2 domain that has evolved under positive selection. Class I TRIM proteins include multiple members involved in antiviral immunity at various levels of interferon signaling cascade. Among the 75 human TRIMs, roughly half enhance immune response, which they do at multiple levels in signaling pathways. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. 168
30845 293935 cd12875 SPRY_SOCS_Fbox SPRY domain in Fbxo45 and suppressors of cytokine signaling (SOCS) proteins. This family consists of the SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) as well as F-box protein 45 (Fbxo45), a novel synaptic E3 and ubiquitin ligase. The SPSB protein is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. SPSB1, SPSB2, and SPSB4 interact with prostate apoptosis response protein 4 (Par-4) and are negative regulators that recruit the ECS E3 ubiquitin ligase complex to polyubiquitinate inducible nitric-oxide synthase (iNOS), resulting in its proteasomal degradation. Fbxo45 is related to this family; it is located N-terminal to the SPRY domain, and known to induce the degradation of a synaptic vesicle-priming factor, Munc13-1, via the SPRY domain, thus playing an important role in the regulation of neurotransmission by modulating Munc13-1 at the synapse. Suppressor of cytokine signaling (SOCS) proteins negatively regulate signaling from JAK-associated cytokine receptor complexes, and play key roles in the regulation of immune homeostasis. 169
30846 293936 cd12876 SPRY_SOCS3 SPRY domain in the suppressor of cytokine signaling 3 (SOCS3) family. The SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. All four SPSB proteins interact with c-Met, the hepatocyte growth factor receptor, but SOCS3 regulates cellular response to a variety of cytokines such as leukemia inhibitory factor (LIF) and interleukin 6. SOCS3, along with SOCS1, are expressed by immune cells and cells of the central nervous system (CNS) and have the potential to impact immune processes within the CNS. In non-small cell lung cancer (NSCLC), SOCS3 is silenced and proline-rich tyrosine kinase 2 (Pyk2) is over-expressed; it has been suggested that SOCS3 could be an effective way to prevent the progression of NSCLC due to its role in regulating Pyk2 expression. 185
30847 240457 cd12877 SPRY1_RyR SPRY domain 1 (SPRY1) of ryanodine receptor (RyR). This SPRY domain is the first of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs, but no specific function has been found for this first SPRY domain of the RyRs. 151
30848 240458 cd12878 SPRY2_RyR SPRY domain 2 (SPRY2) of ryanodine receptor (RyR). This SPRY domain (SPRY2) is the second of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs. The SPRY2 domain has been shown to bind to the dihydropyridine receptor (DHPR) II-III loop and the ASI region of RyR1. 133
30849 293937 cd12879 SPRY3_RyR SPRY domain 3 (SPRY3) of ryanodine receptor (RyR). This SPRY domain (SPRY3) is the third of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs, but no specific function has been found for this third SPRY domain of the RyRs. 151
30850 293938 cd12880 SPRYD7 SPRY domain-containing protein 7. This family contains SPRY domain-containing protein 7 (also known as CLL deletion region gene 6 protein homolog, CLLD6, or chronic lymphocytic leukemia deletion region gene 6 protein homolog). In humans, CLLD6 is highly expressed in heart, skeletal muscle, and testis as well as cancer cell lines. It also has cross-species conservation, suggesting that it is likely to carry out important cellular processes. 160
30851 293939 cd12881 SPRY_HERC1 SPRY domain in HERC1. This SPRY domain is found in the HERC1, a large protein related to chromosome condensation regulator RCC1. It is widely expressed in many tissues, playing an important role in intracellular membrane trafficking in the cytoplasm as well as Golgi apparatus. HERC1 also interacts with tuberous sclerosis 2 (TSC2, tuberin), which suppresses cell growth, and results in the destabilization of TSC2. However, the biological function of HERC1 has yet to be defined. 162
30852 293940 cd12882 SPRY_RNF123 SPRY domain at N-terminus of ring finger protein 123. This SPRY domain is found at the N-terminus of RING finger protein 123 domain (also known as E3 ubiquitin-protein ligase RNF123). The ring finger domain motif is present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. RNF123 displays E3 ubiquitin ligase activity toward the cyclin-dependent kinase inhibitor p27 (Kip1). 128
30853 293941 cd12883 SPRY_RING SPRY domain at N-terminus of Really Interesting New Gene (RING) finger domain. This SPRY domain is found at the N-terminus of RING finger domains which are present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. RING-finger domain is a type of Zn-finger that binds two Zn atoms and is identified in proteins with a wide range of functions such as viral replication, signal transduction, and development. 121
30854 293942 cd12884 SPRY_hnRNP SPRY domain in heterogeneous nuclear ribonucleoprotein U-like (hnRNP) protein 1. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of heterogeneous nuclear ribonucleoprotein U-like (hnRNP) protein 1 (also known as HNRPUL1), which is a major constituent of nuclear matrix or scaffold and binds directly to DNA sequences through the N-terminal acidic region named serum amyloid P (SAP). Its function is specifically modulated by E1B-55kDa in adenovirus-infected cells. HNRPUL1 also participates in ATR protein kinase signaling pathways during adenovirus infection. Two transcript variants encoding different isoforms have been found for this gene. When associated with bromodomain-containing protein 7 (BRD7), it activates transcription of glucocorticoid-responsive promoter in the absence of ligand-stimulation. 177
30855 293943 cd12885 SPRY_RanBP_like SPRY domain in Ran binding proteins, SSH4, HECT E3 and SPRYD3. This family includes SPRY domains found in Ran binding proteins (RBP or RanBPM) 9 and 10, SSH4 (suppressor of SHR3 null mutation protein 4), SPRY domain-containing protein 3 (SPRYD3) as well as HECT, a C-terminal catalytic domain of a subclass of ubiquitin-protein ligase (E3). RanBP9 and RanBP10 act as androgen receptor (AR) coactivators. Both consist of the N-terminal proline- and glutamine-rich regions, the SPRY domain, and LisH-CTLH and CRA motifs. The SPRY domain in SSH4 may be involved in cargo recognition, either directly or by combination with other adaptors, possibly leading to a higher selectivity. SPRYD3 is highly expressed in most tissues in humans, possibly involved in important cellular processes. HECT E3 mediates the direct transfer of ubiquitin from E2 to substrate. 132
30856 293944 cd12886 SPRY_like SPRY domain-like in bacteria. This family contains SPRY-like domains that are found only in bacteria and are mostly uncharacterized. SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 eukaryotic protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). 129
30857 293945 cd12887 SPRY_NHR_like SPRY domain in neuralized homology repeat. This family contains the neuralized homology repeat 1 (NHR1) domain similar to the SPRY domain (known to mediate specific protein-protein interactions) at the C-terminus of a conserved region within eukaryotic neuralized and neuralized-like proteins. In Drosophila, the neuralized protein (Neur) belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the nervous system. Neur binds to the Notch receptor ligand Delta through its first NHR1 domain and mediates its ubiquitination for endocytosis. Multiple copies of this region are found in some members of the family. 161
30858 293946 cd12888 SPRY_PRY_TRIM7_like PRY/SPRY domain in tripartite motif-binding protein 7 (TRIM7)-like, including TRIM7, TRIM10, TRIM15, TRIM26, TRIM39, TRIM41. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several tripartite motif-containing (TRIM) proteins, including TRIM7 (also referred to as glycogenin-interacting protein, RING finger protein 90 or RNF90), TRIM10, TRIM15, TRIM26, TRIM39 and TRIM41. TRIM7 or GNIP interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. TRIM10 (also known as hematopoietic RING finger 1 (HERF1) or TRIM10/HERF1) plays a key role in definitive erythroid development; downregulation of the Spi-1/PU.1 oncogene induces the expression of TRIM10/HERF1, a key factor required for terminal erythroid cell differentiation and survival. Antiviral activity of TRIM15 is dependent on the ability of its B-box to interact with the MLV Gag precursor protein; downregulation of TRIM15, along with TRIM11, enhances virus release suggesting that these proteins contribute to the endogenous restriction of retroviruses in cells. Tripartite motif-containing 26 (TRIM26) function is as yet unknown; however, since it is localized in the human histocompatibility complex (MHC) class I region, TRIM26 may play a role in immune response although studies show no association between TRIM26 polymorphisms and the risk of aspirin-exacerbated respiratory disease. TRIM39 is a MOAP-1 (Modulator of Apoptosis)-binding protein that stabilizes MOAP-1 through inhibition of its poly-ubiquitination process. TRIM41 (also known as RING finger-interacting protein with C kinase or RINCK) functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C. 169
30859 293947 cd12889 SPRY_PRY_TRIM67_9 PRY/SPRY domain in tripartite motif-containing proteins, TRIM9 and TRIM67. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM9 proteins. TRIM9 protein is expressed mainly in the cerebral cortex, and functions as an E3 ubiquitin ligase. It has been shown that TRIM9 is localized to the neurons in the normal human brain and its immunoreactivity in affected brain areas in Parkinson's disease and dementia with Lewy bodies is severely decreased, possibly playing an important role in the regulation of neuronal function and participating in pathological process of Lewy body disease through its ligase. TRIM67 negatively regulates Ras activity via degradation of 80K-H, leading to neural differentiation, including neuritogenesis. 172
30860 293948 cd12890 SPRY_PRY_TRIM16 PRY/SPRY domain in tripartite motif-containing protein 16 (TRIM16). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM16 and TRIM-like proteins. TRIM16 (also known as estrogen-responsive B box protein or EBBP) does not possess a RING domain like the other TRIM proteins, but contains two B-box domains and can heterodimerize with other TRIM proteins such as TRIM24, Promyelocytic leukemia (PML) protein and Midline-1 (MID1 or TRIM18). It is a regulator of keratinocyte differentiation and a tumor suppressor in retinoid-sensitive neuroblastoma. It has been shown that loss of TRIM16 expression plays an important role in the development of cutaneous squamous cell carcinoma (SCC) and is a determinant of retinoid sensitivity. TRIM16 also has E3 ubiquitin ligase activity. 182
30861 293949 cd12891 SPRY_PRY_C-I_2 PRY/SPRY domain in tripartite motif-containing (TRIM) proteins, including TRIM14-like, TRIM16-like, TRIM25-like, TRIM47-like, TRIM65 and RNF135, and stonustoxin. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class I TRIM proteins, including TRIM14, TRIM16 and TRIM25, TRIM47 as well as RING finger protein RNF135 and stonustoxin, a secreted poisonous protein of the stonefish Synanceja horrida. TRIM16 (also known as estrogen-responsive B box protein or EBBP) has E3 ubiquitin ligase activity. It is a regulator of keratinocyte differentiation and a tumor suppressor in retinoid-sensitive neuroblastoma. TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function. TRIM47, also known as GOA (Gene overexpressed in astrocytoma protein) or RNF100 (RING finger protein 100), is highly expressed in kidney tubular cells, but low expressed in most tissue. It is overexpressed in astrocytoma tumor cells and plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. RNF135 ubiquitinates RIG-I (retinoic acid-inducible gene-I) to promote interferon-beta induction during the early phase of viral infection. Stonustoxin (STNX) is a hypotensive and lethal protein factor that also possesses other biological activities such as species-specific hemolysis (due to its ability to form pores in the cell membrane) and platelet aggregation, edema-induction, and endothelium-dependent vasorelaxation (mediated by the nitric oxide pathway and activation of potassium channels). The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. 167
30862 240472 cd12892 SPRY_PRY_TRIM18 PRY/SPRY domain of TRIM18/MID1, also known as FXY or RNF59. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is at the C-terminus of the overall domain architecture of MID1 (also known as FXY, RNF59, TRIM18) gene represented by a RING finger domain (RING), two B-box motifs (BBOX), coiled-coil C-terminal to Bbox domain (BBC) and fibronectin type 3 domain (FN3). Mutations in the human MID1 gene result in X-linked Opitz G/BBB syndrome (OS), a disorder affecting development of midline structures, causing craniofacial, urogenital, gastrointestinal and cardiovascular abnormalities. A unique MID1 gene mutation located in a variable loop in the SPRY domain alters conformation of the binding pocket and may affect the binding affinity to the PRY/SPRY domain. 177
30863 293950 cd12893 SPRY_PRY_TRIM35 PRY/SPRY domain in tripartite motif-containing protein 35 (TRIM35). This PRY/SPRY domain is found at the C-terminus of the overall domain architecture of tripartite motif 35, TRIM35 (also known as hemopoietic lineage switch protein), which includes a RING finger domain (RING) and a B-box motif (BBOX). TRIM35 may play a role as a tumor suppressor and is implicated in the cell death mechanism. 171
30864 293951 cd12894 SPRY_PRY_TRIM36 PRY/SPRY domain in tripartite motif-containing protein 36 (TRIM36). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM36, a Class I TRIM protein. TRIM36 (also known as Haprin or RNF98) has a ubiquitin ligase activity and interacts with centromere protein-H, one of the kinetochore proteins. It has been shown that TRIM36 is potentially associated with chromosome segregation and that an excess of TRIM36 may cause chromosomal instability. In Xenopus laevis, TRIM36 is expressed during early embryogenesis and plays an important role in the arrangement of somites during their formation. 204
30865 293952 cd12895 SPRY_PRY_TRIM46 PRY/SPRY domain in tripartite motif-containing protein 46 (TRIM46). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM46 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). The SPRY/PRY combination is a possible component of immune defense. This protein family has not yet been characterized. 209
30866 293953 cd12896 SPRY_PRY_TRIM65 PRY/SPRY domain in tripartite motif-containing domain 65 (TRIM65). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM65 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). The SPRY/PRY combination is a possible component of immune defense. This protein family has not been characterized. 182
30867 293954 cd12897 SPRY_PRY_TRIM50_72 PRY/SPRY domain in tripartite motif-binding (TRIM) proteins TRIM50 and TRIM72. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several TRIM proteins, including TRIM72 and TRIM50. TRIM72 (also known as MG53) has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. It is expressed specifically in skeletal muscle and heart, and tethered to the plasma membrane and cytoplasmic vesicles via its interaction with phosphatidylserine. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23. 191
30868 293955 cd12898 SPRY_PRY_TRIM76 PRY/SPRY domain in tripartite motif-containing protein 76 (TRIM76), also called cardiomyopathy-associated protein 5. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM76, a Class I TRIM protein. TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5 or myospryn or SPRYD2) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It has been suggested that TRIM76 is involved in two distinct processes, protein kinase A signaling and vesicular trafficking. It has also been implicated in Duchenne muscular dystrophy and cardiac disease; gene polymorphism of TRIM76 is associated with left ventricular wall thickness in patients with hypertension while its interactions with M-band titin and calpain 3 link it to tibial and limb-girdle muscular dystrophies. 171
30869 293956 cd12899 SPRY_PRY_TRIM76_like PRY/SPRY domain in tripartite motif-containing protein 76 (TRIM76)-like. This domain is similar to the distinct PRY/SPRY subdomain found at the C-terminus of TRIM76, a Class I TRIM protein. TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5 or myospryn or SPRYD2) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It has been suggested that TRIM76 is involved in two distinct processes, protein kinase A signaling and vesicular trafficking. 176
30870 293957 cd12900 SPRY_PRY_TRIM21 PRY/SPRY domain in tripartite motif-binding protein 21 (TRIM21) also known as 52kD Ribonucleoprotein Autoantigen (Ro52). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM21, which is also known as Sjogren Syndrome Antigen A (SSA), SSA1, 52kD Ribonucleoprotein Autoantigen (Ro52, Ro/SSA, SS-A/Ro) or RING finger protein 81 (RNF81). TRIM21 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. As an E3 ligase, TRIM21 mediates target specificity in ubiquitination; it regulates type 1 interferon and proinflammatory cytokines via ubiquitination of interferon regulatory factors (IRFs). It is up-regulated at the site of autoimmune inflammation, such as cutaneous lupus lesions, indicating a central role in the tissue destructive inflammatory process. It interacts with auto-antigens in patients with Sjogren syndrome and systemic lupus erythematosus, a chronic systemic autoimmune disease characterized by the presence of autoantibodies against the protein component of the human intracellular ribonucleoprotein-RNA complexes and more specifically TRIM21, Ro60/TROVE2 and La/SSB proteins. It binds the Fc part of IgG molecules via its PRY-SPRY domain with unexpectedly high affinity. 180
30871 293958 cd12901 SPRY_PRY_FSD1 Fibronectin type III and SPRY containing 1 (FSD1) domain includes PRY at the N-terminus. This domain is part of the fibronectin type III and SPRY domain containing 1 (FSD1) and FSD1-like (FSD1L) proteins. These are centrosome-associated proteins that are characterized by an N-terminal coiled-coil region downstream of B-box (BBC) domain, a central fibronectin type III (FN3) domain, and C-terminal repeats in PRY/SPRY domain. The FSD1 protein associates with a subset of microtubules and may be involved in the stability and organization of microtubules during cytokinesis. 207
30872 293959 cd12902 SPRY_PRY_RNF135 PRY/SPRY domain in RING finger protein RNF135. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of the RING finger protein RNF135 (also known as Riplet/RNF135), which ubiquitinates RIG-I (retinoic acid-inducible gene-I) to promote interferon-beta induction during the early phase of viral infection. Normally, RIG-I is activated by TRIM25 in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. However, RNF135, consisting of an N-terminal RING finger domain, C-terminal SPRY and PRY motifs and showing sequence similarity to TRIM25, acts as an alternative factor that promotes RIG-I activation independent of TRIM25. 168
30873 293960 cd12903 SPRY_PRY_SPRYD4 PRY/SPRY domain containing protein 4 (SPRYD4). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is encoded by the SPRYD4 gene. SPRYD4 (SPRY containing domain 4) is ubiquitously expressed in many human tissues, most strongly in kidney, bladder, brain, thymus and stomach. Subcellular localization demonstrates that SPRYD4 protein is localized in the nucleus when overexpressed in COS-7 green monkey cells. It has remained uncharacterized thus far. 169
30874 293961 cd12904 SPRY_BSPRY SPRY domain in Ro-Ret family. This domain, named BSPRY, has been identified in the Ro-Ret family, since the protein is composed of a B-box, an alpha-helical coiled coil and a SPRY domain. The gene for BSPRY resides on human chromosome 9 and is specifically expressed in testis. The function of BSPRY is not known, but several related proteins of the RING-Box-coiled-coil (RBCC) family have been implicated in cell transformation. 171
30875 293962 cd12905 SPRY_PRY_A33L zinc-binding protein A33-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM69 and TRIM proteins NF7 and bloodthirsty (bty). TRIM69 is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. TRIM protein NF7, which also contains a chromodomain (CHD) at the N-terminus and an RFP (Ret finger protein)-like domain at the C-terminus, is required for its association with transcriptional units of RNA polymerase II which is mediated by a trimeric B box. In Xenopus oocyte, xNF7 has been identified as a nuclear microtubule-associated protein (MAP) whose microtubule-bundling activity, but not E3-ligase activity, contributes to microtubule organization and spindle integrity. Bloodthirsty (bty) is a novel gene identified in zebrafish and has been shown to likely play a role in regulation of the terminal steps of erythropoiesis. 178
30876 293963 cd12906 SPRY_SOCS1-2-4 SPRY domain in the suppressor of cytokine signaling 1, 2, 4 families (SOCS1, SOCS2, SOCS4). The SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. All four SPSB proteins interact with c-Met, the hepatocyte growth factor receptor, but only SPSB1, SPSB2, and SPSB4 interact with prostate apoptosis response protein 4 (Par-4). They are negative regulators that recruit the ECS E3 ubiquitin ligase complex to polyubiquitinate inducible nitric-oxide synthase (iNOS), resulting in its proteasomal degradation, thus contributing to protection against the cytotoxic effect of iNOS in activated macrophages. It has been shown that SPSB1 and SPSB4 induce the degradation of iNOS more strongly than SPSB2. The Drosophila melanogaster SPSB1 homolog, GUSTAVUS, interacts with the DEAD box RNA helicase Vasa. Suppressor of cytokine signaling (SOCS) proteins negatively regulate signaling from JAK-associated cytokine receptor complexes, and play key roles in the regulation of immune homeostasis. 174
30877 293964 cd12907 SPRY_Fbox SPRY domain in the F-box family Fbxo45. Fbxo45 is a novel synaptic E3 and ubiquitin ligase, related to the suppressor of cytokine signaling (SOCS) proteins and located N-terminal to a SPRY (SPla and the ryanodine receptor) domain. Fbxo45 induces the degradation of a synaptic vesicle-priming factor, Munc13-1, via the SPRY domain, thus playing an important role in the regulation of neurotransmission by modulating Munc13-1 at the synapse. F-box motifs are found in proteins that function as the substrate recognition component of SCF E3 complexes. 175
30878 293965 cd12908 SPRYD3 SPRY domain-containing protein 3. This family contains SPRY domain-containing protein 3 (SPRYD3). In humans, it is highly expressed in most tissues, including brain, kidney, heart, intestine, skeletal muscle, and testis. It also has cross-species conservation, suggesting that it is likely to carry out important cellular processes. 171
30879 293966 cd12909 SPRY_RanBP9_10 SPRY domain in Ran binding proteins 9 and 10. This family includes SPRY domain in Ran binding protein (RBP or RanBPM) 9 and 10, and similar proteins. RanBP9 (also known as RanBPM), a binding partner of Ran, is a small Ras-like GTPase that exerts multiple functions via interactions with various proteins. RanBP9 and RanBP10 also act as androgen receptor (AR) coactivators. Both consist of the N-terminal proline- and glutamine-rich regions, the SPRY domain, and LisH-CTLH and CRA motifs. SPRY domain of RanBPM forms a complex with CD39, a prototypic member of the NTPDase family, thus down-regulating activity substantially. RanBP10 enhances the transcriptional activity of AR in a ligand-dependent manner and exhibits a protein expression pattern different from RanBPM in various cell lines. RanBP10 is highly expressed in AR-positive prostate cancer LNCaP cells, while RanBPM is abundant in WI-38 and MCF-7 cells. 144
30880 293967 cd12910 SPRY_SSH4_like SPRY domain in SSH4 and similar proteins. This family includes SPRY domain in SSH4 (suppressor of SHR3 null mutation protein 4) and similar proteins. SSH4 is a component of the endosome-vacuole trafficking pathway that regulates nutrient transport and may be involved in processes determining whether plasma membrane proteins are degraded or routed to the plasma membrane. The SPRY domain in SSH4 may be involved in cargo recognition, either directly or by combination with other adaptors, possibly leading to a higher selectivity. In yeast, SSH4 and the homologous protein EAR1 (endosomal adapter of RSP5) recruit Rsp5p, an essential ubiquitin ligase of the Nedd4 family, and assist it in its function at multivesicular bodies by directing the ubiquitylation of specific cargoes. 192
30881 350336 cd12911 HK_sensor Sensor domains of Histidine Kinase receptors. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution. 100
30882 350337 cd12912 PDC2_MCP_like second PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the second PDC domain of Methyl-accepting chemotaxis proteins (MCPs), Histidine kinases (HKs), and other similar domains. Many members contain both HAMP (HK, Adenylyl cyclase, MCP, and Phosphatase) and MCP domains, which are signalling domains that interact with protein partners to relay a signal. MCPs are part of a transmembrane protein complex that controls bacterial chemotaxis. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. 92
30883 350338 cd12913 PDC1_MCP_like first PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the first PDC domain of Methyl-accepting chemotaxis proteins (MCPs), Histidine kinases (HKs), and other similar domains. Many members contain both HAMP (HK, Adenylyl cyclase, MCP, and Phosphatase) and MCP domains, which are signalling domains that interact with protein partners to relay a signal. MCPs are part of a transmembrane protein complex that controls bacterial chemotaxis. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. 139
30884 350339 cd12914 PDC1_DGC_like first PDC (PhoQ/DcuS/CitA) domain of diguanylate-cyclase and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the first PDC domain of Diguanylate-cyclases (DGCs), Histidine kinases (HKs), and other similar domains. Many members of this subfamily contain a C-terminal DGC (also called GGDEF) domain. DGCs regulate the turnover of cyclic diguanosine monophosphate. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. 123
30885 350340 cd12915 PDC2_DGC_like second PDC (PhoQ/DcuS/CitA) domain of diguanylate-cyclase and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the second PDC domain of Diguanylate-cyclases (DGCs), Histidine kinases (HKs), and other similar domains. Many members of this subfamily contain a C-terminal DGC (also called GGDEF) domain. DGCs regulate the turnover of cyclic diguanosine monophosphate. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. 96
30886 240599 cd12916 VKOR_1 Vitamin K epoxide reductase family in bacteria and plants. This family includes vitamin K epoxide reductase (VKOR) present in bacteria and plants. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some plant and bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 133
30887 240600 cd12917 VKOR_euk Vitamin K epoxide reductase family in eukaryotes, excluding plants. This family includes vitamin K epoxide reductase (VKOR) present in eukaryotes, excluding plants. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. Warfarin, an oral anticoagulant widely used in medicine as well as in rodenticides, inhibits the activity of VKOR, resulting in decreased levels of reduced vitamin K, which is required for the function of several clotting factors. However, the anticoagulation effect of warfarin is significantly associated with polymorphism of certain genes, including VKORC1. Interestingly, in rodents, an adaptive trait appears to have evolved convergently by selection on new or standing genetic polymorphisms in VKORC1 as well as by adaptive introgressive hybridization between species, likely brought about by human-mediated dispersal. 140
30888 240601 cd12918 VKOR_arc Vitamin K epoxide reductase family in archaea and some bacteria. This family includes vitamin K epoxide reductase (VKOR) mostly present in archaea and some bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 126
30889 240602 cd12919 VKOR_2 Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present only in bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 169
30890 240603 cd12920 VKOR_3 Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present in proteobacteria and spirochetes. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 134
30891 240604 cd12921 VKOR_4 Vitamin K epoxide reductase (VKOR) family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present only in bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. This family also has a cysteine peptidase domain present at the N-terminus of the VKOR domain. 128
30892 240605 cd12922 VKOR_5 Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) mostly present in actinobacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 133
30893 214016 cd12923 iSH2_PI3K_IA_R Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunits. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. In vertebrates, there are three genes (PIK3R1, PIK3R2, and PIK3R3) that encode for different Class IA PI3K R subunits. 152
30894 214017 cd12924 iSH2_PIK3R1 Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 1, PIK3R1, also called p85alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. In addition, p85alpha, also called PIK3R1, contains N-terminal SH3 and GAP domains. p85alpha carries out functions independent of its PI3K regulatory role. It can independently stimulate signaling pathways involved in cytoskeletal rearrangements. Insulin-sensitive tissues express splice variants of the PIK3R1 gene, p50alpha and p55alpha, which may play important roles in insulin signaling during lipid and glucose metabolism. Mice deficient in PIK3R1 die perinatally, indicating its importance in development. 161
30895 214018 cd12925 iSH2_PIK3R3 Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 3, PIK3R3, also called p55gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p55gamma, also called PIK3R3 or p55PIK, also contains a unique N-terminal 24-amino acid residue (N24) that interacts with cell cycle modulators to promote cell cycle progression. 161
30896 214019 cd12926 iSH2_PIK3R2 Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 2, PIK3R2, also called p85beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p85beta, also called PIK3R2, contains N-terminal SH3 and GAP domains. It is expressed ubiquitously but at lower levels than p85alpha. Its expression is increased in breast and colon cancer and correlates with tumor progression and enhanced invasion. During viral infection, the viral nonstructural (NS1) protein binds p85beta specifically, which leads to PI3K activation and the promotion of viral replication. Mice deficient in PIK3R2 develop normally and exhibit moderate metabolic and immunological defects. 161
30897 240571 cd12927 MMP_TTHA0227_like Minimal MMP-like domain found in Thermus thermophilus TTHA0227, Acidothermus cellulolyticus ACEL2062 and similar proteins. The family includes hypothetical proteins from bacteria that contain a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets. These proteins may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD (x could be any amino acid) motif. However, some family members carry a shorter HExxHxxG motif or HExxH motif. Some others do not have such a motif, but still share very high sequence similarity. 97
30898 240592 cd12929 GUCT RNA-binding GUCT domain found in the RNA helicase II/Gu protein family. This family includes vertebrate RNA helicase II/Gualpha (RH-II/Gualpha) and RNA helicase II/Gubeta (RH-II/Gubeta), both of which consist of a DEAD box helicase domain (DEAD), a helicase conserved C-terminal domain, and a Gu C-terminal (GUCT) domain. They localize to nucleoli, suggesting roles in ribosomal RNA production, but RH-II/Gubeta also localizes to nuclear speckles containing the splicing factor SC35, suggesting its possible involvement in pre-mRNA splicing. In contrast to RH-II/Gualpha, RH-II/Gubeta has RNA-unwinding activity, but no RNA-folding activity. The family also contains plant DEAD-box ATP-dependent RNA helicase 7 (RH7 or PRH75), Thermus thermophilus heat resistant RNA-dependent ATPase (Hera) and similar proteins. RH7 is a new nucleus-localized member of the DEAD-box protein family from higher plants. It displays a weak ATPase activity which is barely stimulated by RNA ligands. RH7 contains an N-terminal KDES domain rich in lysine, glutamic acid, aspartic acid, and serine residues, seven highly conserved helicase motifs in the central region, a GUCT domain, and a C-terminal GYR domain harboring a large number of glycine residues interrupted by either arginines or tyrosines. Thermus thermophilus Hera is a DEAD box helicase that binds fragments of 23S rRNA and RNase P RNA via its C-terminal domain. It contains a helicase core that harbors two RecA-like domains termed RecA_N and RecA_C, a dimerization domain (DD), and a C-terminal RNA-binding domain (RBD) that reveals a compact, RRM-like fold and shows sequence similarity with the typical GUCT domain found in the RNA helicase II/Gu protein family. 72
30899 410577 cd12930 GAT_SF GAT domain found in eukaryotic GGAs, metazoan Tom1-like proteins, metazoan STAMs, fungal Vps27, and similar proteins. The GAT (GGA and Tom1) domain superfamily includes the canonical GAT domain found in ADP-ribosylation factor (Arf)-binding proteins (GGAs) from eukaryotes, myb protein 1 (Tom1)-like proteins from metazoa, and LAS seventeen-binding protein 5 (Lsb5p)-like proteins from fungi. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS-domain followed by a GAT domain. Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through their GAT domain. They associate neither with Arf GTPases through their GAT domain nor with acidic cluster-dileucine sequences through their VHS domain. The GAT domain superfamily also includes the non-canonical GAT domain found in several components of the ESCRT-0 complex, including signal transducing adapter molecules (STAMs) and hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs) from metazoa, as well as vacuolar protein sorting-associated protein 27 (Vps27) and class E vacuolar protein-sorting machinery protein Hse1 from fungi. Hrs, together with STAM, forms a Hrs/STAM core complex. Vps27, together with Hse1, forms a Vps27/Hse1 core complex. Those complexes consist of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The intertwined GAT heterodimer acts as a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 77
30900 240579 cd12931 eNOPS_SF NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family. All members in this family contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRM1 and RRM2), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain with a long helical C-terminal extension. The NOPS domain specifically binds to RRM2 domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. PSF has an additional large N-terminal domain that differentiates it from other family members. The p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein 1 (PSP1), which are ubiquitously expressed and are well conserved in vertebrates. p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1, also termed PSPC1, is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 currently remains unknown. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, for instance the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and the Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. 90
30901 240576 cd12932 RRP7_like RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A (Rrp7A), and similar proteins. This CD corresponds to the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. Rrp7A, also termed gastric cancer antigen Zg14, is the Rrp7p homolog mainly found in Metazoans. The cellular function of Rrp7A remains unclear currently. Both Rrp7p and Rrp7A harbor an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain. 118
30902 240597 cd12933 eIF3G eIF3G domain found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. eIF-3G, also termed eIF-3 subunit 4, or eIF-3-delta, or eIF3-p42, or eIF3-p44, is the RNA-binding subunit of eIF3. eIF3 is a large multi-subunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3G binds 18 S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. Besides, eIF-3G is one of the cytosolic targets; it interacts with mature apoptosis-inducing factor (AIF). This family also includes yeast eIF3-p33, a homolog of vertebrate eIF-3G; it plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RNA recognition motif near its C-terminus. 114
30903 240585 cd12934 LEM LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins. The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions. 37
30904 240596 cd12935 LEM_like LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and postmitotic reassembly. Some of the LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are nonmembrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal lamina-associated polypeptide-Emerin-MAN1 (LEM)-domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Both LEM and LEM-like domains share the same structural fold, mainly composed of two large parallel alpha helices. However, their biochemical nature of the solvent-accessible residues is completely different, which indicates the two domains may target different protein surfaces. The LEM domain is responsible for the interaction with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF), and the LEM-like domain is involved in chromosome binding. The family also includes the yeast helix-extension-helix domain-containing proteins, Heh1p (formerly called Src1p) and Heh2p, and their uncharacterized homologs found mainly in fungi and several in bacteria. Heh1p and Heh2p are inner nuclear membrane proteins that might interact with nuclear pore complexes (NPCs). Heh1p is involved in mitosis. It functions at the interface between subtelomeric gene expression and transcription export (TREX)-dependent messenger RNA export through NPCs. The function of Heh2p remains ill-defined. Both Heh1p and Heh2p contain a LEM-like domain (also termed HeH domain), but lack a LEM domain. 36
30905 240593 cd12936 GUCT_RHII_Gualpha_beta RNA-binding GUCT domain found in vertebrate RNA helicase II/Gualpha (RH-II/Gualpha), RNA helicase II/Gubeta (RH-II/Gubeta) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT) domain of RH-II/Gualpha and RH-II/Gubeta, two paralogues found in vertebrates. RH-II/Gualpha, also termed nucleolar RNA helicase 2, or DEAD box protein 21, or nucleolar RNA helicase Gu, is a bifunctional enzyme that displays independent RNA-unwinding and RNA-folding activities. It unwinds double-stranded RNA in the 5' to 3' direction in the presence of Mg2+ through the domains in its N-terminal region. In contrast, it folds single-stranded RNA in an ATP-dependent manner and its C-terminal region is responsible for the Mg2+ independent RNA-foldase activity. RH-II/Gualpha consists of a DEAD box helicase domain (DEAD), a helicase conserved C-terminal domain (helicase_C), and a GUCT followed by three FRGQR repeats and one PRGQR sequence. The DEAD and helicase_C domains may play critical roles in the RNA-helicase activity of RH-II/Gualpha. The function of GUCT domain remains unclear. The C-terminal region responsible for the RNA-foldase activity does not overlap with the GUCT domain. RH-II/Gubeta, also termed ATP-dependent RNA helicase DDX50, or DEAD box protein 50, or nucleolar protein Gu2, shows significant sequence homology with RH-II/Gualpha. It contains a DEAD domain, a helicase_C domain, and a GUCT domain followed by an arginine-serine-rich sequence but not (F/P)RGQR repeats in RH-II/Gualpha. Both RH-II/Gualpha and RH-II/Gubeta localize to nucleoli, suggesting roles in ribosomal RNA production, but RH-II/Gubeta also localizes to nuclear speckles containing the splicing factor SC35, suggesting its possible involvement in pre-mRNA splicing. In contrast to RH-II/Gualpha, RH-II/Gubeta has RNA-unwinding activity, but no RNA-folding activity. 93
30906 240594 cd12937 GUCT_RH7_like RNA-binding GUCT domain found in plant DEAD-box ATP-dependent RNA helicase 7 (RH7) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT) domain of RH7 and similar proteins. RH7, also termed plant RNA helicase 75 (PRH75), is a new nucleus-localized member of the DEAD-box protein family from higher plants. It displays a weak ATPase activity which is barely stimulated by RNA ligands. RH7 contains an N-terminal KDES domain rich in lysine, glutamic acid, aspartic acid, and serine residues, seven highly conserved helicase motifs in the central region, a GUCT domain, and a C-terminal GYR domain harboring a large number of glycine residues interrupted by either arginines or tyrosines. RH7 is RNA specific and harbors two possible RNA-binding motifs, the helicase motif VI (HRIGRTGR) and the C-terminal glycine-rich GYR domain. 86
30907 240595 cd12938 GUCT_Hera RNA-binding GUCT-like domain found in Thermus thermophilus heat resistant RNA-dependent ATPase (Hera) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT)-like domain of Hera and similar proteins. Thermus thermophilus Hera is a DEAD box helicase that binds fragments of 23S rRNA and RNase P RNA via its C-terminal domain. It contains a helicase core that harbors two RecA-like domains termed RecA_N and RecA_C, a dimerization domain (DD), and a C-terminal RNA-binding domain (RBD) that reveals a compact, RRM-like fold and shows sequence similarity with GUCT domain found in vertebrate RNA helicase II/Gualpha (RH-II/Gualpha), RNA helicase II/Gubeta (RH-II/Gubeta) and plant DEAD-box ATP-dependent RNA helicase 7 (RH7 or PRH75). 74
30908 240586 cd12939 LEM_emerin LEM (Lap2/Emerin/Man1) domain found in emerin. This CD corresponds to the LEM domain that is critical for binding to lamin A/C and is also involved in interaction with the DNA binding protein barrier-to-autointegration factor (BAF). Emerin is an inner nuclear membrane protein that occurs in four differently phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. It contains an N-terminal LEM domain followed by a poly-serine segment, a region rich in hydrophobic amino acids comprising the nuclear localization signal (NLS) followed by another poly-serine segment, and a C-terminal transmembrane region. 43
30909 240587 cd12940 LEM_LAP2_LEMD1 LEM (Lap2/Emerin/Man1) domain found in lamina-associated polypeptide 2 (LAP2), LEM domain-containing protein 1 (LEMP-1) and similar proteins. This CD corresponds to the LEM domain of LAP2, LEMP-1 and similar proteins. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some of LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal LEM domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Although LEM and LEM-like domains share the same structural fold composed of two large parallel alpha helices, the biochemical nature of the solvent-accessible residues is completely different, indicating that the two domains may target different protein surfaces. The LEM domain interacts with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF) while the LEM-like domain is involved in chromosome binding. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. It may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEMP-1 contains an N-terminal LEM domain. 42
30910 240588 cd12941 LEM_LEMD2 LEM (Lap2/Emerin/Man1) domain found in LEM domain-containing protein 2 (LEM2). This CD corresponds to the LEM domain that is responsible for the interaction with chromatin protein barrier-to-autointegration factor (BAF). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization. It also binds to lamin C and plays an important role in the organization of lamin A/C complexes. LEM2 contains an N-terminal LEM domain, two putative transmembrane domains and a MAN1-Src1p C-terminal (MSC) domain, but lacks the Man1-specific C-terminal RNA recognition motif (RRM). 38
30911 240589 cd12942 LEM_Man1 LEM (Lap2/Emerin/Man1) domain found in inner nuclear membrane protein Man1. This CD corresponds to the LEM domain of Man1 and similar proteins. Man1, also termed LEM domain-containing protein 3 (LEMD3), is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. It is part of a protein complex essential for chromatin organization and cell division. It also functions as an important negative regulator for the transforming growth factor beta (TGF-beta) /activin/Nodal signaling pathway and bone morphogenetic protein (BMP) signaling pathway by directly interacting with chromatin-associated proteins and transcriptional regulators, including the R-Smads, Smad1, Smad2, and Smad3. Man1 is a unique type of left/right (LR) signaling regulator that acts on the inner nuclear membrane. Furthermore, Man1 plays a crucial role in angiogenesis. The vascular remodeling can be regulated at the inner nuclear membrane through interactions between Man1 and Smads. Man1 contains an N-terminal LEM domain, two putative transmembrane domains, a Man1-Src1p C-terminal (MSC) domain, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The LEM domain interacts with DNA and chromatin-binding protein Barrier-to-Autointegration Factor, and is also necessary for efficient localization of Man1 in the inner nuclear membrane. It has been shown that the C-terminal nucleoplasmic region of Man1 exhibits a DNA binding winged helix domain and is responsible for both, DNA- and Smad-binding. 44
30912 240590 cd12943 LEM_ANKL1 LEM (Lap2/Emerin/Man1) domain found in ankyrin repeat and LEM domain-containing protein 1 (ANKL1). The family includes ANKL1, also termed ankyrin repeat domain-containing protein 41 (ANKRD41), or LEM-domain containing protein 3 (LEM3), and similar proteins. Although their biological roles remain unclear, the family members contain an N-terminal ankyrin repeat region, LEM domain and C-terminal GIY-YIG nuclease domain. The ankyrin repeats are unique motifs mediating protein-protein interactions. The LEM domain, mainly found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. 38
30913 240591 cd12944 LEM_ANKL2 LEM (Lap2/Emerin/Man1) domain found in ankyrin repeat and LEM domain-containing protein 2 (ANKL2). The family includes ANKL2 and similar proteins. Although their biological roles remain unclear, the family members share an N-terminal LEM domain and an ankyrin repeat region. The LEM domain, mainly found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The ankyrin repeats are unique motifs mediating protein-protein interactions. 43
30914 240580 cd12945 NOPS_NONA_like NOPS domain, including C-terminal coiled-coil region, in p54nrb/PSF/PSP1 homologs from invertebrate species. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA), Chironomus tentans hrp65 gene encoding protein Hrp65 and similar proteins. D. melanogaster NONA is involved in eye development and behavior, and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. 100
30915 240581 cd12946 NOPS_p54nrb_PSF_PSPC1 NOPS domain, including C-terminal coiled-coil region, in p54nrb/PSF/PSPC1 family proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in the p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. Members in the family include 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein component 1 (PSPC1 or PSP1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSPC1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSPC1 remains unknown currently. PSF has an additional large N-terminal domain that differentiates it from other family members. 93
30916 240582 cd12947 NOPS_p54nrb NOPS domain, including C-terminal coiled-coil region, in 54 kDa nuclear RNA- and DNA-binding protein (p54nrb) and similar proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in p54nrb, also termed non-POU domain-containing octamer-binding protein (NONO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. It is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. p54nrb is ubiquitously expressed and highly conserved in vertebrates. It binds both single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. p54nrb forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner. It also forms a heterodimer with polypyrimidine tract-binding protein-associated-splicing factor (PSF). The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for paraspeckle localization to subnuclear bodies. 94
30917 240583 cd12948 NOPS_PSF NOPS domain, including C-terminal coiled-coil region, in polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and similar proteins. This model contains the NOPS (NONA and PSP1) domain of PSF (also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex), with a long helical C-terminal extension. PSF is a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also single-stranded DNA (ssDNA) as well as double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. Additionally, it promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). Moreover, PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. As an RNA-binding component of spliceosomes, PSF binds to the insulin-like growth factor response element (IGFRE), and acts as an independent negative regulator of the transcriptional activity of the porcine P-450 cholesterol side-chain cleavage enzyme gene (P450scc) IGFRE. PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. In addition, PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NONO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. 97
30918 240584 cd12949 NOPS_PSPC1 NOPS domain, including C-terminal coiled-coil region, in paraspeckle protein component 1 (PSPC1) and similar proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, of paraspeckle component 1 (PSPC1, also termed PSP1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Although its cellular function remains unknown currently, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NONO), which localizes to paraspeckles in an RNA-dependent manner. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. 94
30919 240577 cd12950 RRP7_Rrp7p RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. This CD corresponds to the RRP7 domain of Rrp7p. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly. Rrp7p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain. 128
30920 240578 cd12951 RRP7_Rrp7A RRP7 domain ribosomal RNA-processing protein 7 homolog A (Rrp7A) and similar proteins. The family corresponds to the RRP7 domain of Rrp7A, also termed gastric cancer antigen Zg14, and similar proteins which are yeast ribosomal RNA-processing protein 7 (Rrp7p) homologs mainly found in Metazoans. The cellular function of Rrp7A remains unclear currently. Rrp7A harbors an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain. 129
30921 240572 cd12952 MMP_ACEL2062 Minimal MMP-like domain found in Acidothermus cellulolyticus hypothetical protein ACEL2062 and similar proteins. The subfamily includes an uncharacterized protein from Acidothermus cellulolyticus (ACEL2062) and its homologs from bacteria. Although its biological role remains unclear, ACEL2062 contains a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets and a HExxHxxGxxD/S (x could be any amino acid) motif. It may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD motif. 117
30922 240573 cd12953 MMP_TTHA0227 Minimal MMP-like domain found in Thermus thermophilus hypothetical protein TTHA0227 and similar proteins. The subfamily includes an uncharacterized protein from Thermus thermophilus (TTHA0227) and its homologs from bacteria. Although its biological role remains unclear, TTHA0227 contains a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets and a HExxH (x could be any amino acid) motif. It may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD motif. 112
30923 240574 cd12954 MMP_TTHA0227_like_1 Minimal MMP-like domain found in a group of hypothetical proteins from alphaproteobacteria and actinobacteria. The subfamily includes some uncharacterized bacterial proteins which show high sequence similarity with Thermus thermophilus hypothetical protein TTHA0227. However, they do not contain the conserved HExxH (x could be any amino acid) motif. They may not have any zinc metallo-peptidase activity. 99
30924 214020 cd12955 SKA2 Spindle and kinetochore-associated protein 2. SKA2, also called FAM33A, is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. SKA2 has also been identified as a glucocorticoid receptor-interacting protein and may be involved in regulating cancer cell proliferation. 116
30925 240562 cd12956 CBM_SusE-F_like carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins. This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF), does not bind insoluble starch; CBM-Fb and CBM-Fc both do, deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum. 93
30926 240570 cd12957 SKA3_N Spindle and kinetochore-associated protein 3, N-terminal domain. SKA3, also called RAMA1 or C13orf3, is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. SKA3 contributes to SAC (spindle-assembly checkpoint) signaling through its interaction with Bub3. This model represents the N-terminal domain of SKA3, which is involved in interactions with SKA1 and SKA2 to form the SKA complex. The C-terminal portion of SKA3 is involved in creating a microtubule-binding surface. 100
30927 214021 cd12958 SKA1_N Spindle and kinetochore-associated protein 1, N-terminal domain. SKA1 is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. This model represents the N-terminal domain of SKA1, which is involved in interactions with SKA2 and SKA3 to form the SKA complex. The C-terminal portion of SKA1 is involved in creating a microtubule-binding surface. 89
30928 214022 cd12959 MMACHC-like Methylmalonic aciduria and homocystinuria type C protein and similar proteins. MMACHC, also called CblC, is involved in the intracellular processing of vitamin B12 by catalyzing two reactions: the reductive decyanation of cyanocobalamin in the presence of a flavoprotein oxidoreductase and the dealkylation of alkylcobalamins through the nucleophilic displacement of the alkyl group by glutathione. Mutations in MMACHC cause combined methylmalonic acidemia/aciduria and homocystinuria (CblC type), the most common inherited disorder of cobalamin metabolism. The structure of MMACHC reveals it to be the most divergent member of the NADPH-dependent flavin reductase family that can use FMN or FAD to catalyze reductive decyanation; it is also the first enzyme with glutathione transferase (GST) activity that is unrelated to the GST superfamily in structure and sequence. 226
30929 240575 cd12960 Spider_toxin Spider neurotoxins including agatoxin, purotoxin and ctenitoxin. This domain family contains spider toxins that include the omega-Aga-IVB, a P-type calcium channel antagonist from venom of the funnel web spider, Agelenopsis aperta, as well as purotoxin-1 (PT1), a spider peptide venom of the Central Asian spider Geolycosa sp., which specifically exerts inhibitory action on P2X3 purinoreceptors at nanomolar concentrations. These spider toxins, which are ion channel blockers, share a common structural motif composed of a triple-stranded antiparallel beta-sheet, stabilized by internal disulfide bonds known as cystine knots. 36
30930 240569 cd12961 CBM58_SusG Carbohydrate-binding module 58 from Bacteroides thetaiotaomicron SusG and similar CBMs. This group includes the starch-specific CBM (carbohydrate-binding module) of SusG, a cell surface lipoprotein within the Sus (Starch-utilization system) system of the Human gut symbiont Bacteroides thetaiotaomicron. It represents the CBM58 class of CBMs in the carbohydrate active enzymes (CAZy) database. SusG is an alpha-amylase, and is essential for growth on high molecular weight starch. SusG-CBM58 binds maltooligosaccharide distal to, and on the opposite side of, the amylase catalytic site; it is one of two starch-binding sites in SusG, the other being adjacent to the active site. SusG-CBM58 is required for efficient degradation of insoluble starch by the purified enzyme. Its starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. It may play a role in product exchange with other Sus components. 110
30931 240568 cd12962 X25_BaPul_like X25 domain of Bacillus acidopullulyticus pullulanase and similar proteins. Pullulanase (EC 3.2.1.41) cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha- and beta-amylase limit-dextrins of amylopectin and glycogen. BaPul is used industrially in the production of high fructose corn syrup, high maltose content syrups and low calorie and ''light'' beers. Pullulanases, in addition to the catalytic domain, include several carbohydrate-binding domains (CBMs) as well as domains of unknown function (termed ''X'' modules). X25 was identified in Bacillus acidopullulyticus pullulanase, and splits another domain of unknown function (X45). X25 is present in multiple copies in some pullulanases. It has been suggested that X25 and X45 are CBMs which target mixed alpha-1,6/alpha-1,4 linked D-glucan polysaccharides. 95
30932 240567 cd12963 X45_BaPul_like X45 domain of Bacillus acidopullulyticus pullulanase and similar proteins. Pullulanase (EC 3.2.1.41) cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha- and beta-amylase limit-dextrins of amylopectin and glycogen. BaPul is used industrially in the production of high fructose corn syrup, high maltose content syrups and low calorie and ''light'' beers. Pullulanases, in addition to the catalytic domain, include several carbohydrate-binding domains (CBMs) as well as domains of unknown function (termed ''X'' modules). X45 was identified in Bacillus acidopullulyticus pullulanase; it is interrupted by another domain of unknown function (X25). It has been suggested that X25 and X45 are CBMs which target mixed alpha-1,6/alpha-1,4 linked D-glucan polysaccharides. 89
30933 240563 cd12964 CBM-Fa carbohydrate-binding module Fa from Bacteroides thetaiotaomicron SusF, and similar CBMs. CBM-Fa is the first of three starch-specific CBMs (carbohydrate-binding modules) of SusF, a cell surface lipoprotein within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. The precise mechanistic role of SusF in starch metabolism is unclear. SusF has an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by three tandem starch-binding CBMs: CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus. These CBMs have no enzymatic activity. Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These three CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fa does not bind insoluble starch, and can bind smaller maltooligosaccharides. Proteins in this subgroup are present in the species of the Gram-negative Bacteroidetes phylum. 110
30934 240564 cd12965 CBM-Eb_CBM-Fb carbohydrate-binding modules Eb and Fb from SusE and SusF, respectively, and similar CBMs. Included in this subgroup are CBM-Eb and CBM-Fb, starch-specific carbohydrate-binding modules of SusE and SusF, cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fb and CBM-Fc both bind insoluble starch; deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum. 98
30935 240565 cd12966 CBM-Ec_CBM-Fc carbohydrate-binding modules Ec and Fc from SusE and SusF, respectively, and similar CBMs. Included in this subgroup are CBM-Ec and CBM-Fc, starch-specific carbohydrate-binding modules of SusE and SusF, cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fb and CBM-Fc both bind insoluble starch, deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum. 98
30936 240566 cd12967 CBM_SusE-F_like_u1 Uncharacterized subgroup of the CBM-SusE-F_like superfamily. The CBM SusE-F_like superfamily includes starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they also contribute differently to binding insoluble starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum. 91
30937 240556 cd13112 POLO_box Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 76
30938 381706 cd13113 Wnt Wnt domain found in the WNT signaling gene family, also called Wingless-type mouse mammary tumor virus (MMTV) integration site family. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about the structure of Wnt proteins, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling mediated by Wnt proteins orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 288
30939 240557 cd13114 POLO_box_Plk4_1 First (cryptic) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 112
30940 240558 cd13115 POLO_box_Plk4_2 Second (cryptic) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 108
30941 240559 cd13116 POLO_box_Plk4_3 C-terminal (third) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 81
30942 240560 cd13117 POLO_box_2 Second polo-box domain (PBD) of polo-like kinases Plk1, Plk2, Plk3, and Plk5. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 81
30943 240561 cd13118 POLO_box_1 First polo-box domain (PBD) of polo-like kinases Plk1, Plk2, Plk3, and Plk5. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 91
30944 240524 cd13119 BF2867_like Tandemly repeated domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity form a protein that may have a role in cell adhesion. This family also includes BF1858 and overlaps with DUF3988. 115
30945 240525 cd13120 BF2867_like_N N-terminal domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity in a tandem repeat arrangement form a protein that may have a role in cell adhesion. This family overlaps with DUF3988. 156
30946 240526 cd13121 BF2867_like_C C-terminal domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity in a tandem repeat arrangement form a protein that may have a role in cell adhesion. This family overlaps with DUF3988. 138
30947 240555 cd13122 MSL2_CXC DNA-binding cysteine-rich domain of male-specific lethal 2 and related proteins. The CXC domain of Drosophila melanogaster MSL2 forms a Zn(3)Cys(9) cluster and is involved in recruiting members of the dosage compensation complex (DCC) to sites on the X chromosome. 50
30948 240528 cd13123 MATE_MurJ_like MurJ/MviN, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli MurJ (MviN) has been identified as essential for murein biosynthesis. It has been suggested that MurJ functions as the peptidoglycan lipid II flippase which is involved in translocation of lipid-anchored peptidoglycan precursors across the cytoplasmic membrane, though results obtained in Bacillus subtilis seem to indicate that its MurJ homologs are not essential for growth. Some MviN family members (e.g. in Mycobacterium tuberculosis) possess an extended C-terminal region that contains an intracellular pseudo-kinase domain and an extracellular domain resembling carbohydrate-binding proteins. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR). 420
30949 240529 cd13124 MATE_SpoVB_like Stage V sporulation protein B, also known as Stage III sporulation protein F, and related proteins. The integral membrane protein SpoVB has been implicated in the biosynthesis of the peptidoglycan component of the spore cortex in Bacillus subtilis. This model represents a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR). 434
30950 240530 cd13125 MATE_like_10 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides, such as O-antigen. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 409
30951 240531 cd13126 MATE_like_11 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides, such as O-antigen. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 396
30952 240532 cd13127 MATE_tuaB_like Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides and participate in the biosynthesis of cell wall components such as teichuronic acid. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 406
30953 240533 cd13128 MATE_Wzx_like Wzx, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli Wzx and related proteins from other gram-negative bacteria are thought to act as flippases, assisting in the membrane translocation of lipopolysaccharides including those containing O-antigens. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR). 402
30954 240534 cd13129 MATE_epsE_like Multidrug and toxic compound extrusion family and similar proteins. This model represents a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins, including Ralstonia solanacearum GMI1000 epsE, which may be involved in exporting exopolysaccharide EPS I, a virulence factor. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR). 411
30955 240535 cd13130 MATE_rft1 Rft1-like subfamily of the multidrug and toxic compound extrusion family (MATE). This eukaryotic family may function as a transporter, shuttling phospholipids, lipopolysaccharides or oligosaccharides from cytoplasmic to the lumenal side of the endoplasmic reticulum. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. 441
30956 240536 cd13131 MATE_NorM_like Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Vibrio cholerae NorM. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. This subfamily includes Vibrio cholerae NorM and functions most likely as a multidrug efflux pump, removing xenobiotics from the interior of the cell. The pump utilizes a cation gradient across the membrane to facilitate the export process. NorM appears to bind monovalent cations in an outward-facing conformation and may subsequently cycle through an inward-facing and outward-facing conformation to capture and release its substrate. 435
30957 240537 cd13132 MATE_eukaryotic Eukaryotic members of the multidrug and toxic compound extrusion (MATE) family. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. This subfamily, which is restricted to eukaryotes, contains vertebrate solute transporters responsible for secretion of cationic drugs across the brush border membranes, yeast proteins located in the vacuole membrane, and plant proteins involved in disease resistance and iron homeostasis under osmotic stress. 436
30958 240538 cd13133 MATE_like_7 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 438
30959 240539 cd13134 MATE_like_8 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 438
30960 240540 cd13135 MATE_like_9 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 429
30961 240541 cd13136 MATE_DinF_like DinF and similar proteins, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli DinF is a membrane protein that has been found to protect cells against oxidative stress and bile salts. The expression of DinF is regulated as part of the SOS system. It may act by detoxifying oxidizing molecules that have the potential to damage DNA. Some members of this family have been reported to enhance the virulence of plant pathogenic bacteria by enhancing their ability to grow in the presence of toxic compounds. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR). 424
30962 240542 cd13137 MATE_NorM_like Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Thermotoga marina NorM. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 432
30963 240543 cd13138 MATE_yoeA_like Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Bacillus subtilis yoeA. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 431
30964 240544 cd13139 MATE_like_14 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 448
30965 240545 cd13140 MATE_like_1 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 435
30966 240546 cd13141 MATE_like_13 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 443
30967 240547 cd13142 MATE_like_12 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 444
30968 240548 cd13143 MATE_MepA_like Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Staphylococcus aureus MepA. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. This subfamily includes Staphylococcus aureus MepA and Vibrio vulnificus VmrA and functions most likely as a multidrug efflux pump. 426
30969 240549 cd13144 MATE_like_4 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 434
30970 240550 cd13145 MATE_like_5 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 440
30971 240551 cd13146 MATE_like_6 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 433
30972 240552 cd13147 MATE_MJ0709_like Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins, similar to Methanocaldococcus jannaschii MJ0709. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 441
30973 240553 cd13148 MATE_like_3 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 441
30974 240554 cd13149 MATE_like_2 Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. 434
30975 240523 cd13150 DAXX_histone_binding Histone binding domain of the death-domain associated protein (DAXX). DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. This alignment models a functional domain of DAXX that interacts with the histone H3.3-H4 dimer, and in doing so competes with DNA binding and interactions between the histone chaperone ASF1/CIA and the H3-H4 dimer. 198
30976 240522 cd13151 DAXX_helical_bundle Helical bundle domain of the death-domain associated protein (DAXX). DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. This alignment models the N-terminal helical bundle domain of DAXX, which was shown to interact with the tumor suppressor Ras-association domain family 1C (RASSF1C). 88
30977 240516 cd13152 KOW_GPKOW_A KOW motif of the "G-patch domain and KOW motifs-containing protein" (GPKOW) repeat A. GPKOW contains one G-patch domain and two KOW motifs. GPKOW is a nuclear protein that is regulated by the catalytic (C) subunit of Protein Kinase A (PKA) and binds RNA in vivo. PKA may be involved in regulating multiple steps in post-transcriptional processing of pre-mRNAs. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. GPKOW is also known as the T54 protein or MOS2 homolog. 57
30978 240517 cd13153 KOW_GPKOW_B KOW motif of the "G-patch domain and KOW motifs-containing protein" (GPKOW) repeat B. GPKOW contains one G-patch domain and two KOW motifs. GPKOW is a nuclear protein that is regulated by the catalytic (C) subunit of Protein Kinase A (PKA) and binds RNA in vivo. PKA may be involved in regulating multiple steps in post-transcriptional processing of pre-mRNAs. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. GPKOW is also known as the T54 protein or MOS2 homolog. 51
30979 240518 cd13154 KOW_Mtr4 KOW_Mtr4 is an inserted domain in the Mtr4 globular domain. Mtr4 is a conserved helicase with a core DExH region that cooperates with the eukaryotic nuclear exosome in RNA processing and degradation. The KOW_Mtr4 motif might be involved in presenting RNA substrates to the helicase core. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif is located at the extended insertion of the Mtr4 protein. 129
30980 240519 cd13155 KOW_KIN17 KOW_Kin17 is an RNA-binding motif. The KOW domain of the KIN17 protein contributes to the RNA-binding properties of the whole protein. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KIN17 is conserved from yeast to human, is ubiquitously expressed at low levels in mammalian tissues, and functions in DNA replication, DNA repair and cell cycle control. 54
30981 240520 cd13156 KOW_RPL6 KOW motif of Ribosomal Protein L6. RPL6 contains a KOW motif that has an extraribosomal, oncogenic role. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. 152
30982 269979 cd13157 PTB_tensin-related Tensin-related Phosphotyrosine-binding (PTB) domain. Tensin plays critical roles in renal function, muscle regeneration, and cell migration. It binds to actin filaments and interacts with the cytoplasmic tails of beta-integrin via its PTB domain, allowing tensin to link actin filaments to integrin receptors. Tensin functions as a platform for assembly and disassembly of signaling complexes at focal adhesions by recruiting tyrosine-phosphorylated signaling molecules, and also by providing interaction sites for other proteins. In addition to its PTB domain, it contains a C-terminal SH2 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. 129
30983 269980 cd13158 PTB_APPL Adaptor protein containing PH domain, PTB domain, and Leucine zipper motif (APPL; also called DCC-interacting protein (DIP)-13alpha) Phosphotyrosine-binding (PTB) domain. APPL interacts with oncoprotein serine/threonine kinase AKT2, tumor suppressor protein DCC (deleted in colorectal cancer), Rab5, GIPC (GAIP-interacting protein, C terminus), human follicle-stimulating hormone receptor (FSHR), and the adiponectin receptors AdipoR1 and AdipoR2. There are two isoforms of human APPL: APPL1 and APPL2, which share about 50% sequence identity. APPL has a BAR and a PH domain near its N terminus, and the two domains are thought to function as a unit (BAR-PH domain). C-terminal to this is a PTB domain. Lipid binding assays show that the BAR, PH, and PTB domains can bind phospholipids. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. 135
30984 269981 cd13159 PTB_LDLRAP-mammal-like Low Density Lipoprotein Receptor Adaptor Protein 1 (LDLRAP1) in mammals and similar proteins Phosphotyrosine-binding (PTB) PH-like fold. Null mutations in the LDL receptor adaptor protein 1 (LDLRAP1) gene, whose product serves as an adaptor for LDLR endocytosis in the liver, cause autosomal recessive hypercholesterolemia (ARH). LDLRAP1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd contains mammals, insects, and sponges. 123
30985 269982 cd13160 PTB_LDLRAP_insect-like Low Density Lipoprotein Receptor Adaptor Protein 1 (LDLRAP1) in insects and similar proteins Phosphotyrosine-binding (PTB) PH-like fold. Null mutations in the LDL receptor adaptor protein 1 (LDLRAP1) gene, whose product serves as an adaptor for LDLR endocytosis in the liver, cause autosomal recessive hypercholesterolemia (ARH). LDLRAP1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd contains insects, ticks, sea urchins, and nematodes. 125
30986 269983 cd13161 PTB_TK_HMTK Tyrosine-specific kinase/HM-motif TK (TM/HMTK) Phosphotyrosine-binding (PTB) PH-like fold. TK kinases catalyze the transfer of the terminal phosphate of ATP to a specific tyrosine residue on their target proteins. TK kinases play significant roles in development and cell division. Tyrosine-protein kinases can be divided into two subfamilies: receptor tyrosine kinases, which have an intracellular tyrosine kinase domain, a transmembrane domain and an extracellular ligand-binding domain; and non-receptor (cytoplasmic) tyrosine kinases, which are soluble, cytoplasmic kinases. In HMTK the conserved His-Arg-Asp sequence within the catalytic loop is replaced by a His-Met sequence. TM/HMTK have 2-3 N-terminal PTB domains. PTB domains in TKs are thought to function analogously to the membrane targeting (PH, myristoylation) and pTyr binding (SH2) domains of Src subgroup kinases. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 120
30987 269984 cd13162 PTB_RGS12 Regulator of G-protein signaling 12 Phosphotyrosine-binding (PTB) PH-like fold. RGS12 functions as a GTPase-activating protein and a transcriptional repressor. It is thought to play a role in tumorigenesis. RGS12 specifically interacts with guanine nucleotide-binding protein G(i), alpha-1 subunit and guanine nucleotide-binding protein G(k) subunit alpha. RGS proteins are multi-functional, GTPase-accelerating proteins that promote GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways. Upon activation by GPCRs, heterotrimeric G proteins exchange GDP for GTP, are released from the receptor, and dissociate into free, active GTP-bound alpha subunit and beta-gamma dimer, both of which activate downstream effectors. The response is terminated upon GTP hydrolysis by the alpha subunit, which can then bind the beta-gamma dimer and the receptor. RGS proteins markedly reduce the lifespan of GTP-bound alpha subunits by stabilizing the G protein transition state. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 131
30988 269985 cd13163 PTB_ICAP1 Integrin beta-1-binding protein 1 Phosphotyrosine-binding (PTB) PH-like fold. ICAP1 (also called Integrin cytoplasmic domain-associated protein 1) binds specifically to the beta1 integrin subunit cytoplasmic domain and the cerebral cavernous malformation (CCM) protein CCM1. It regulates beta1 integrin-dependent cell migration by affecting the pattern of focal adhesion formation. ICAP1 recruits CCM1 to the cell membrane and activates CCM1 by changing its conformation. Since CCM1 plays a role in cardiovascular development, it is hypothesized that ICAP1 is involved in vascular differentiation. ICAP-1 has an N-terminal domain that is rich in serine and threonine and a C-terminal PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 129
30989 241318 cd13164 PTB_DOK4_DOK5_DOK6 Downstream of tyrosine kinase 4, 5, and 6 proteins phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. 
This domain was initially shown to binds peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup. 103
30990 269986 cd13165 PTB_DOK7 Downstream of tyrosine kinase 7 phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup. 101
30991 269987 cd13166 PTB_CCM2 Cerebral cavernous malformation 2 FERM domain C-lobe. CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM) along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signalling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours. CCM2 possesses an N-terminal PTB domain and a C-terminal Karet domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 193
30992 269988 cd13167 PTB_P-CLI1 PTB-containing, cubilin and LRP1-interacting protein Phosphotyrosine-binding (PTB) PH-like fold. P-CLI1 (also called Phosphotyrosine interaction domain-containing protein 1) increases proliferation of preadipocytes without affecting adipocytic differentiation. It forms a complex with PID1/PCLI1, LRP1 and CUBNI. It is found in subcutaneous fat, heart, skeletal muscle, brain, colon, thymus, spleen, kidney, liver, small intestine, placenta, lung and peripheral blood leukocytes. P-CLI1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. 139
30993 269989 cd13168 PTB_LOC417372 uncharacterized protein LOC417372 Phosphotyrosine-binding (PTB) PH-like fold. The function of LOC417372 and its related proteins is unknown to date. Members here contain an N-terminal RUN domain, followed by a PDZ domain, and a C-terminal PTB domain. The RUN domain is involved in Ras-like GTPase signaling. The PDZ domain (also called DHR/Dlg homologous region or GLGF after its conserved sequence motif) binds C-terminal polypeptides, internal (non-C-terminal) polypeptides, and lipids. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup. 125
30994 269990 cd13169 RanBD_NUP50_plant Ran-binding protein 2, repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The first RanBD2 is present in this hierarchy. 117
30995 269991 cd13170 RanBD_NUP50 Nucleoporin 50 Ran-binding domain. NUP50 acts as a cofactor for the importin-alpha:importin-beta heterodimer, which allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. It is thought to function primarily at the terminal stages of nuclear protein import to coordinate import complex disassembly and importin recycling. NUP50 is composed of an N-terminal NUP50 domain which binds the C-terminus of importin-beta, a central domain which binds importin-beta, and a C-terminal RanBD which binds importin-beta through Ran-GTP. NUP50:importin-alpha then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to displace nuclear localization signals from importin-alpha. NUP50 interacts with cyclin-dependent kinase inhibitor 1B which binds to cyclin E-CDK2 or cyclin D-CDK4 complexes and prevents their activation, thereby controlling cell cycle progression at G1. Fungal Nup2 transiently associates with nuclear pore complexes (NPCs) and, when artificially tethered to DNA, can prevent the spread of transcriptional activation or repression between flanking genes, a function termed boundary activity (BA). Nup2 and the Ran guanylyl-nucleotide exchange factor, Prp20, interact at specific chromatin regions and enable the NPC to play an active role in chromatin organization. Nup60p, the nup responsible for anchoring Nup2 and the Mlp proteins to the NPC, is required for Nup2-dependent BA. Nup2 contains an N-terminal Nup50 family domain and a C-terminal RanBD. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. 111
30996 269992 cd13171 RanBD1_RanBP2_insect-like Ran-binding protein 2, Ran binding domain repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 1 is present in this hierarchy. 117
30997 269993 cd13172 RanBD2_RanBP2_insect-like Ran-binding protein 2, Ran binding domain repeat 2. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 2 is present in this hierarchy. 118
30998 269994 cd13173 RanBD3_RanBP2_insect-like Ran-binding protein 2, Ran binding domain repeat 3. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 3 is present in this hierarchy. 115
30999 269995 cd13174 RanBD4_RanBP2_insect-like Ran-binding protein 2, Ran binding domain repeat 4. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 4 is present in this hierarchy. 118
31000 269996 cd13175 RanBD5_RanBP2_insect-like Ran-binding protein 2, Ran binding domain repeat 5. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 5 is present in this hierarchy. 114
31001 269997 cd13176 RanBD_RanBP2-like Ran-binding protein 2, Ran binding domains. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeats 1 and 3 are present in this hierarchy. 117
31002 269998 cd13177 RanBD2_RanBP2-like Ran-binding protein 2, Ran binding domain repeat 2. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 2 is present in this hierarchy. 117
31003 269999 cd13178 RanBD4_RanBP2-like Ran-binding protein 2, Ran binding domain repeat 4. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 4 is present in this hierarchy. 117
31004 270000 cd13179 RanBD_RanBP1 Ran-binding domain. RanBP1 interacts specifically with GTP-charged Ran. RanBP1 does not activate GTPase activity of Ran, but does markedly increase GTP hydrolysis by the RanGTPase-activating protein (RanGAP1). In both mammalian cells and in yeast, RanBP1 acts as a negative regulator of Regulator of chromosome condensation 1 (RCC1) by inhibiting RCC1-stimulated guanine nucleotide release from Ran. In addition to Ran, RanBP1 has been shown to interact with Exportin-1 and Importin subunit beta-1 which docks the NPC at the cytoplasmic side of the nuclear pore complex. RanBP1 contains a single RanBD. The RanBD is present in RanBP1, RanBP2, RanBP3, Nup2, and Nup50. Most of these proteins have a single RanBD, with the exception of RanBP2 which has 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The Ran-binding domain is found in multiple copies in nuclear pore complex proteins. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. 136
31005 270001 cd13180 RanBD_RanBP3 Ran-binding protein 3 Ran-binding domain. RanBP3, a Ran-interacting nuclear protein, unlike the related proteins RanBP1 and RanBP2, which promote disassembly of the export complex in the cytosol, acts as a CRM1 cofactor, enhancing nuclear export signal (NES) export by stabilizing the export complex in the nucleus. CRM1/Exportin1 is responsible for exporting many proteins and ribonucleoproteins from the nucleus to the cytosol. RanBP3 also alters the cargo selectivity of CRM1, promoting recognition of the NES of HIV-1 Rev and of other cargos while deterring recognition of the import adaptor protein Snurportin1. RanBP3 contains an N-terminal nuclear localization signal (NLS), 2 FxFG motifs, and a single RanBD. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. 113
31006 270002 cd13181 RanBD_NUP2 Nucleoporin 2 Ran-binding domain. Yeast protein Nup2 transiently associates with nuclear pore complexes (NPCs) and, when artificially tethered to DNA, can prevent the spread of transcriptional activation or repression between flanking genes, a function termed boundary activity (BA). Nup2 and the Ran guanylyl-nucleotide exchange factor, Prp20, interact at specific chromatin regions and enable the NPC to play an active role in chromatin organization. Nup60p, the nup responsible for anchoring Nup2 and the Mlp proteins to the NPC, is required for Nup2-dependent BA. Nup2 contains an N-terminal Nup50 family domain and a C-terminal RanBD. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. 115
31007 270003 cd13182 EVH1-like_Dcp1 Decapping enzyme EVH1-like domain. Dcp1 is a small protein containing an EVH1 domain. The Dcp1-Dcp2 complex carries out a critical step in mRNA degradation, the removal of the 5' cap structure. Dcp1 stimulates the activity of Dcp2 by promoting and/or stabilizing the closed complex. The interface of Dcp1 and Dcp2 is not fully conserved and in higher eukaryotes it requires an additional factor. The proline-rich sequence (PRS)-binding sites in Dcp1p indicate that it belongs to a novel class of EVH1 domains. Dcp1 has 2 prominent sites, one required for the function of the Dcp1p-Dcp2p complex, and the other, the PRS-binding site of EVH1 domains, a binding site for decapping regulatory proteins. It also has a conserved hydrophobic patch that has been shown to be critical for decapping. The EVH1 domains are part of the PH domain superfamily. 116
31008 270004 cd13183 FERM_C_FRMPD1_FRMPD3_FRMPD4 FERM domain C-lobe of FERM and PDZ domain containing proteins 1, 3, and 4 (FRMPD1, 3, 4). The function of FRMPD1, FRMPD3, and FRMPD4 is unknown at present. These proteins contain an N-terminal PDZ (post synaptic density protein (PSD95), Drosophila disc large tumor suppressor (Dlg1), and zonula occludens-1 protein (zo-1)) domain and a C-terminal FERM domain. PDZ domains (also known as DHR (Dlg homologous region) or GLGF (glycine-leucine-glycine-phenylalanine) domains) help anchor transmembrane proteins to the cytoskeleton and hold together signaling complexes. PDZ domains bind to a short region of the C-terminus of other specific proteins. The FERM domain is composed of three subdomains: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3), which form a clover leaf fold. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 105
31009 270005 cd13184 FERM_C_4_1_family FERM domain C-lobe of Protein 4.1 family. The protein 4.1 family includes four well-defined members: erythroid protein 4.1 (4.1R), the best known and characterized member, 4.1G (general), 4.1N (neuronal), and 4.1B (brain). The less well understood 4.1O/FRMD3 is not a true member of this family and is not included in this hierarchy. Besides three highly conserved domains, FERM, SAB (spectrin and actin binding domain) and CTD (C-terminal domain), the proteins from this family contain several unique domains: U1, U2 and U3. FERM domains, like other members of the FERM domain superfamily, have a cloverleaf architecture with three distinct lobes: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The brain is a particularly rich source of protein 4.1 isoforms. The various 4.1R, 4.1G, 4.1N, and 4.1B mRNAs are all expressed in distinct patterns within the brain. It is likely that 4.1 proteins play important functional roles in the brain including motor coordination and spatial learning, postmitotic differentiation, and synaptic architecture and function. In addition they are found in nonerythroid, nonneuronal cells where they may play a general structural role in nuclear architecture and/or may interact with splicing factors. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 94
31010 270006 cd13185 FERM_C_FRMD1_FRMD6 FERM domain C-lobe of FERM domain containing 1 and 6 proteins. FRMD6 (also called willin and hEx/human expanded) is localized throughout the cytoplasm or along the plasma membrane. The Drosophila protein Ex is a regulator of the Hippo/SWH (Sav/Wts/Hpo) signaling pathway, a signaling pathway that plays a pivotal role in organ size control and in tumor suppression by restricting proliferation and promoting apoptosis. Surprisingly, hEx is thought to function independently of the Hippo pathway. Instead it is hypothesized that hEx inhibits progression through the S phase of the cell cycle by upregulating p21(Cip1) and downregulating Cyclin A. It is also implicated in the progression of Alzheimer disease. Not much is known about FRMD1 to date. Both FRMD1 and FRMD6 contain a single FERM domain which has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe is a member of the PH superfamily. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 107
31011 270007 cd13186 FERM_C_NBL4_NBL5 FERM domain C-lobe of Novel band 4.1-like protein 4 and 5 (NBL4 and 5). NBL4 (also called Erythrocyte protein band 4.1-like 4; Epb4 1l4) plays a role in the beta-catenin/Tcf signaling pathway and is thought to be involved in establishing cell polarity or proliferation. NBL4 may also be involved in adhesion, in cell motility and/or in cell-to-cell communication. No role for NBL5 has been proposed to date. Both NBL4 and NBL5 contain an N-terminal FERM domain which has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe is a member of the PH superfamily. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 92
31012 270008 cd13187 FERM_C_PTPH13 FERM domain C-lobe of Protein tyrosine phosphatase non-receptor 13 (PTPH13). There are many functions of PTPN13 (also called PTPL1, PTP-BAS, hPTP1E, FAP1, or PTPL1). Mice lacking PTPN13 activity have abnormal regulation of signal transducer and activator of transcription signaling in their T cells, mild impairment of motor nerve repair, and a significant reduction in the growth of retinal glia cultures. It also plays a role in adipocyte differentiation. PTPN13 contains a kinase non-catalytic C-lobe domain (KIND), a FERM domain with two potential phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P2]-binding motifs, 5 PDZ domains, and a carboxy-terminal catalytic domain. There is an interaction between the FERM domain of PTPL1 and PtdIns(4,5)P2 which is thought to regulate the membrane localization of PTPN13. PDZ are protein/protein interaction domains so there is the potential for numerous partners that can actively participate in the regulation of its phosphatase activity or can permit direct or indirect recruitment of tyrosine phosphorylated PTPL1 substrates. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 103
31013 270009 cd13188 FERM_C_PTPN14_PTPN21 FERM domain C-lobe of Protein tyrosine phosphatase non-receptor proteins 14 and 21 (PTPN14 and 21). This CD contains PTP members: pez/PTPN14 and PTPN21. A number of mutations in Pez have been shown to be associated with breast and colorectal cancer. The PTPN protein family belong to larger family of PTPs. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. The members are composed of a N-terminal FERM domain and a C-terminal PTP catalytic domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. Like most other ERM members they have a phosphoinositide-binding site in their FERM domain. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 91
31014 270010 cd13189 FERM_C_PTPN4_PTPN3_like FERM domain C-lobe of Protein tyrosine phosphatase non-receptor proteins 3 and 4 (PTPN4 and PTPN3). PTPN4 (also called PTPMEG, protein tyrosine phosphatase, megakaryocyte) is a cytoplasmic protein-tyrosine phosphatase (PTP) thought to play a role in cerebellar function. PTPMEG-knockout mice have impaired memory formation and cerebellar long-term depression. PTPN3/PTPH1 is a membrane-associated PTP that is implicated in regulating tyrosine phosphorylation of growth factor receptors, p97 VCP (valosin-containing protein, or Cdc48 in Saccharomyces cerevisiae), and HBV (Hepatitis B Virus) gene expression; it is mutated in a subset of colon cancers. PTPMEG and PTPN3/PTPH1 contains a N-terminal FERM domain, a middle PDZ domain, and a C-terminal phosphatase domain. PTP1/Tyrosine-protein phosphatase 1 from nematodes and a FERM_C repeat 1 from Tetraodon nigroviridis are also included in this cd. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 95
31015 270011 cd13190 FERM_C_FAK1 FERM domain C-lobe of Focal Adhesion Kinase 1 and 2. FAK1 (also called FRNK/Focal adhesion kinase-related nonkinase; p125FAK/pp125FAK; PTK2/Protein-tyrosine kinase 2) is a non-receptor tyrosine kinase that localizes to focal adhesions in adherent cells. It has been implicated in diverse cellular roles including cell locomotion, mitogen response and cell survival. The N-terminal region of FAK1 contains a FERM domain, a linker, a kinase domain, and a C-terminal FRNK (FAK-related-non-kinase) domain. Three subdomains of FERM: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3), form a cloverleaf fold, similar to those of known FERM structures despite the low sequence conservation. The C-lobe/F3 within the FERM domain is part of the PH domain family. The phosphoinositide-binding site found in ERM family proteins is not present in the FERM domain of FAK1. The adjacent Src SH3 and SH2 binding sites in the linker of FAK1 associate with the F3 and F1 lobes and are thought to be involved in regulation. The FERM domain of FAK1 can inhibit enzymatic activity and repress FAK signaling. In an inactive state of FAK1, the FERM domain is thought to interact with the catalytic domain of FAK1 to repress its activity. Upon activation this interaction is disrupted and its kinase activity restored. The FRNK domain is thought to function as a negative regulator of kinase activity. The C-lobe/F3 is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 111
31016 270012 cd13191 FERM_C_FRMD4A_FRMD4B FERM domain C-lobe of FERM domain-containing protein 4A and 4B (FRMD4A and 4B). FRMD4A is part of the Par-3/FRMD4A/cytohesin-1 complex that activates Arf6, a central player in actin cytoskeleton dynamics and membrane trafficking, during junctional remodeling and epithelial polarization. The Par-3/Par-6/aPKC/Cdc42 complex regulates the conversion of primordial adherens junctions (AJs) into belt-like AJs and the formation of linear actin cables. When primordial AJs are formed, Par-3 recruits scaffolding protein FRMD4A which connects Par-3 and the Arf6 guanine-nucleotide exchange factor (GEF), cytohesin-1. FRMD4B (also called GRP1-binding protein, GRSP1) is a novel member of GRP1 signaling complexes that are recruited to plasma membrane ruffles in response to insulin receptor signaling. The GRSP1/FRMD4B protein contains a FERM protein domain as well as two coiled coil domains and may function as a scaffolding protein. GRP1 and GRSP1 interact through the coiled coil domains in the two proteins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 113
31017 270013 cd13192 FERM_C_FRMD3_FRMD5 FERM domain C-lobe of FERM domain-containing protein 3 and 5 (FRMD3 and 5). FRMD3 (also called Band 4.1-like protein 4O/4.1O though it is not a true member of that family) is a novel putative tumor suppressor gene that is implicated in the origin and progression of lung cancer. In humans there are 5 isoforms that are produced by alternative splicing. Less is known about FRMD5, though there are 2 isoforms of the human protein are produced by alternative splicing. Both FRMD3 and FRMD5 contain a N-terminal FERM domain, followed by a FERM adjacent (FA) domain, and 4.1 protein C-terminal domain (CTD). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 105
31018 270014 cd13193 FERM_C_FARP1-like FERM domain C-lobe of FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FRMD7(FERM domain containing 7). FARP1 and FARP2 are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. These members are composed of a N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. Other members in this family do not contain the DH domains such as the Human FERM domain containing protein 7 and Caenorhabditis elegans CFRM3, both of which have unknown functions. They contain an N-terminal FERM domain, a PH domain, followed by a FA (FERM adjacent) domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 122
31019 270015 cd13194 FERM_C_ERM FERM domain C-lobe/F3 of the ERM family. The ERM family includes ezrin, radixin, moesin and merlin. They are composed of a N-terminal FERM (ERM) domain (also called N-ERMAD (N-terminal ERM association domain)), a coiled coil region (CRR), and a C-terminal domain CERMAD (C-terminal ERM association domain) which has an F-actin-binding site (ABD). Two actin-binding sites have been identified in the middle and N-terminal domains. Merlin is structurally similar to the ERM proteins, but instead of an actin-binding domain (ABD), it contains a C-terminal domain (CTD), just like the proteins from the 4.1 family. Activated ezrin, radixin and moesin are thought to be involved in the linking of actin filaments to CD43, CD44, ICAM1-3 cell adhesion molecules, various membrane channels and receptors, such as the Na+/H+ exchanger-3 (NHE3), cystic fibrosis transmembrane conductance regulator (CFTR), and the beta2-adrenergic receptor. The ERM proteins exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain of ERM is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 97
31020 270016 cd13195 FERM_C_MYLIP_IDOL FERM domain C-lobe of E3 ubiquitin ligase myosin regulatory light chain-interacting protein (MYLIP; also called inducible degrader of the LDL receptor, IDOL). MYLIP/IDOL is a regulator of the LDL receptor (LDLR) pathway via the nuclear receptor liver X receptor (LXR). In response to cellular cholesterol loading, the activation of LXR leads to the induction of MYLIP expression. MYLIP stimulates ubiquitination of the LDLR on its cytoplasmic tail, directing its degradation. The LXR-MYLIP-LDLR pathway provides a complementary pathway to sterol regulatory element-binding proteins for the feedback inhibition of cholesterol uptake. MYLIP has an N-terminal FERM domain and in some cases a C-terminal RING domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 111
31021 275394 cd13196 FERM_C_JAK FERM domain C-lobe of Janus kinase (JAK). JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7) . The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 109
31022 270018 cd13197 FERM_C_CCM1 FERM domain C-lobe of Cerebral cavernous malformation 1. CCM1 (also called KRIT-1/Krev interaction trapped 1;ankyrin repeat-containing protein Krit1; CAM), a Rap1-binding protein, is expressed in endothelial cells where it is present in cell-cell junctions and associated with junctional proteins. Together with CCM2/MGC4607 and CCM3/PDCD10, KRIT1 constitutes a set of proteins, mutations of which are found in cerebral cavernous malformations which are characterized by cerebral hemorrhages and vascular malformations in the central nervous system. KRIT-1 possesses four ankyrin repeats, a FERM domain, and multiple NPXY sequences, one of which is essential for integrin cytoplasmic domain-associated protein-1alpha (ICAP1alpha) binding and all of which mediate binding of CCM2. KRIT-1 localization is mediated by its FERM domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 100
31023 270019 cd13198 FERM_C1_MyoVII FERM domain C-lobe, repeat 1, of Myosin VII (MyoVII/Myo7). MyoVII, a MyTH-FERM myosin, is an actin-based motor protein essential for a variety of biological processes in the actin cytoskeleton function. Mutations in MyoVII leads to problems in sensory perception: deafness and blindness in humans (Usher Syndrome), retinal defects and deafness in mice (shaker 1), and aberrant auditory and vestibular function in zebrafish. Myosin VIIAs have plus (barbed) end-directed motor activity on actin filaments and a characteristic actin-activated ATPase activity. MyoVII consists of a conserved spectrin-like, SH3 subdomain N-terminal region, a motor/head region, a neck made of 4-5 IQ motifs, and a tail consisting of a coiled-coil domain, followed by a tandem repeat of myosin tail homology 4 (MyTH4) domains and partial FERM domains that are separated by an SH3 subdomain and are thought to mediate dimerization and binding to other proteins or cargo. Members include: MyoVIIa, MyoVIIb, and MyoVII members that do not have distinct myosin VIIA and myosin VIIB genes. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 99
31024 270020 cd13199 FERM_C2_MyoVII FERM domain C-lobe, repeat 2, of Myosin VII (MyoVII, Myo7). MyoVII, a MyTH-FERM myosin, is an actin-based motor protein essential for a variety of biological processes in the actin cytoskeleton function. Mutations in MyoVII leads to problems in sensory perception: deafness and blindness in humans (Usher Syndrome), retinal defects and deafness in mice (shaker 1), and aberrant auditory and vestibular function in zebrafish. Myosin VIIAs have plus (barbed) end-directed motor activity on actin filaments and a characteristic actin-activated ATPase activity. MyoVII consists of a conserved spectrin-like, SH3 subdomain N-terminal region, a motor/head region, a neck made of 4-5 IQ motifs, and a tail consisting of a coiled-coil domain, followed by a tandem repeat of myosin tail homology 4 (MyTH4) domains and partial FERM domains that are separated by an SH3 subdomain and are thought to mediate dimerization and binding to other proteins or cargo. Members include: MyoVIIa, MyoVIIb, and MyoVII members that do not have distinct myosin VIIA and myosin VIIB genes. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 96
31025 270021 cd13200 FERM_C_KCBP FERM domain C-lobe of Kinesin-like calmodulin binding protein. KCBPs (also called KIPK/Kinesin-like Calmodulin-Binding Protein-Interacting Protein Kinase), a member of the Kinesin-14 family, is a C-terminal microtubule motor with three unique domains including a myosin tail homology region 4 (MyTH4), a talin-like domain, and a calmodulin-binding domain (CBD). Binding of the Ca2+-activated calmodulin to KCBP causes the motor to dissociate from microtubules. The microtubule binding of KCBP is controlled by the calcium binding protein KIC containing a single EF-hand motif. KCBPs are unique to land plants and green algae. The MyTH4 and talin-like domains are not found in other kinesins, while the CBD domain is also only found in Strongylocentrotus purpuratus kinesin-C (SpKinC). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 109
31026 270022 cd13201 FERM_C_MyoXV FERM domain C-lobe of Myosin XV (MyoXV/Myo15). MyoXV, a MyTH-FERM myosin, are actin-based motor proteins essential for a variety of biological processes in actin cytoskeleton function. Specifically MyoXV functions in the actin organization in hair cells of the organ of Corti. Mutations in Human MyoXVa causes non-syndromic deafness, DFNB3 and the mouse shaker-2 mutation. MyoXV consists of a N-terminal motor/head region, a neck made of 1-3 IQ motifs, and a tail that consists of either a myosin tail homology 4 (MyTH4) domains, followed by an SH3 domain, and a MyTH-FERM domains as in rat Myo15 or two MyTH-FERM domains separated by a SH3 domain as in human Myo15A. The MyTH-FERM domains are thought to mediate dimerization and binding to other proteins or cargo. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 101
31027 270023 cd13202 FERM_C_MyoX FERM domain C-lobe of Myosin X (MyoX, Myo10). MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of a N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The MyoX FERM domain binds to the NPXY motif of several beta-integrins, a key family of cell surface receptors that are involved in cell adhesion and migration. In addition the FERM domain binds to the cytoplasmic domains of the netrin receptors DCC (deleted in colorectal cancer) and neogenin. The FERM domain also forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 90
31028 270024 cd13203 FERM_C1_myosin_like FERM domain C-lobe, repeat 1, of Myosin-like proteins. These myosin-like proteins are unidentified though they are sequence similar to myosin 1/myo1, myosin 7/myoVII, and myosin 10/myoX. These myosin-like proteins contain an N-terminal motor/head region and a C-terminal tail consisting of two myosin tail homology 4 (MyTH4) and two FERM domains. In myoX the FERM domain forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules and a similar thing might happen in these myosins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The first FERM_N repeat is present in this hierarchy. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 97
31029 270025 cd13204 FERM_C2_myosin_like FERM domain C-lobe, repeat 2, of Myosin-like proteins. These myosin-like proteins are unidentified though they are sequence similar to myosin 1/myo1, myosin 7/myoVII, and myosin 10/myoX. These myosin-like proteins contain an N-terminal motor/head region and a C-terminal tail consisting of two myosin tail homology 4 (MyTH4) and two FERM domains. In myoX the FERM domain forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules and a similar thing might happen in these myosins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The second FERM_N repeat is present in this hierarchy. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 93
31030 270026 cd13205 FERM_C_fermitin FERM domain C-lobe of the Fermitin family. Fermitin functions as a mediator of integrin inside-out signalling. The recruitment of Fermitin proteins and Talin to the membrane mediates the terminal event of integrin signalling, via interaction with integrin beta subunits. Fermitin has a FERM domain interrupted with a pleckstrin homology (PH) domain. Fermitin family homologs (Fermt1, 2, and 3, also known as Kindlins) are each encoded by a different gene. In mammalian studies, Fermt1 is generally expressed in epithelial cells, Fermt2 is expressed in muscle tissues, and Fermt3 is expressed in hematopoietic lineages. Specifically Fermt2 is expressed in smooth and striated muscle tissues in mice and in the somites (a trunk muscle precursor) and neural crest in Xenopus embryos. As such it has been proposed that Fermt2 plays a role in cardiomyocyte and neural crest differentiation. Expression of mammalian Fermt3 is associated with hematopoietic lineages: the anterior ventral blood islands, vitelline veins, and early myeloid cells. In Xenopus embryos this expression also includes the notochord and cement gland. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). This cd is not included in the C-lobe hierarchy based on its position in the tree. One thing to note is that unlike the other members of the C-lobe hierarchy it contains 2 FERM M domains which might also reflect a difference in its evolutionary history. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 91
31031 241360 cd13206 FERM_C-lobe_PLEKHH1_PLEKHH2 FERM domain C-lobe of Pleckstrin homology domain-containing family H. PLEKHH1 and PLEKHH2 (also called PLEKHH1L) are thought to function in phospholipid binding and signal transduction. There are 3 Human PLEKHH genes: PLEKHH1, PLEKHH2, and PLEKHH3. There are many isoforms, the longest of which contain a FERM domain, a MyTH4 domain, two PH domains, a peroximal domain, a vacuolar domain, and a coiled coil stretch. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 100
31032 275395 cd13207 FERM-like_C_SNX Atypical FERM-like domain C-lobe of Sorting nexin family. Sorting nexins function in regulating recycling from endosomes to the cell surface. SNX17, SNX27, and SNX31 contain a N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. All three proteins are able to bind the Ras GTPase through their FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 116
31033 275396 cd13208 PH-GRAM_MTMR5_MTMR13 Myotubularin (MTM) related 5 and 13 proteins (MTMR5 and MTMR13) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR5 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It lacks several amino acids in the dsPTPase catalytic pocket which renders it catalytically inactive as a phosphatase. MTMR5 is the most well-studied inactive member of this family and has been implicated in cellular growth control and oncogenic transformation. MTMR13 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Leu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR13 has high sequence similarity to MTMR5 and has recently been shown to be a second gene mutated in type 4B Charcot-Marie-Tooth syndrome. Both MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules.
Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Although the majority of the sequences are MTMR 5 and 13, this cd also contains MTM5 nematode sequences. 120
31034 275397 cd13209 PH-GRAM_MTMR3_MTMR4 Myotubularin (MTM) related 3 and 4 proteins (MTMR3 and MTMR4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR3 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR3 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein can self-associate and also form heteromers with MTMR4. MTMR4 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR4 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein forms heteromers with MTMR3. Both MTMR3 and MTMR4 contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region.
In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 94
31035 270030 cd13210 PH-GRAM_MTMR6-like Myotubularin (MTM) related (MTMR) 7 and 8 proteins Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR6, MTMR7, and MTMR8 are all members of the myotubularin dual specificity protein phosphatase gene family. They bind to phosphoinositide lipids through their PH-GRAM domain. These proteins also interact with each other as well as MTMR9. They contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The lipid-binding FYVE domain has been shown to bind phosphatidylinositol-3-phosphate. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins.
However, no phosphoinositide binding sites have been found for the MTMRs to date. 98
31036 275398 cd13211 PH-GRAM_MTMR9 Myotubularin (MTM) related 9 protein (MTMR9) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR9 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Gly residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR9 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins.
However, no phosphoinositide binding sites have been found for the MTMRs to date. 99
31037 275399 cd13212 PH-GRAM_MTMR10-like Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10, MTMR11, and MTMR12 are catalytically inactive phosphatases that play a role as adapters for the phosphatase myotubularin to regulate myotubularin intracellular location. They contain a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders them catalytically inactive as phosphatases. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 125
31038 275400 cd13213 PH-GRAM_MTMR14 Myotubularin (MTM) related 14 protein (MTMR14) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR14 is a member of the myotubularin protein phosphatase gene family. MTMR14 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. MTMR14 plays a role in the regulation of autophagy and mutations in MTMR14 result in autosomal dominant centronuclear myopathy. MTMR14 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain (SID), a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain (SID), and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold.
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 116
31039 275401 cd13214 PH-GRAM_WBP2 WW binding protein 2 (WBP2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. WBP2 plays a number of roles including: acting as a tyrosine kinase substrate, activation of estrogen receptor alpha (ERalpha)/progesterone receptor (PR) transcription, and playing a role in breast cancer. WBP2 contains a N-terminal PH-GRAM domain and a C-terminal WWbp domain. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The WWbp domain is characterized by several short PY and PT-like motifs of the PPPPY form and binds to WW domains. WW domains contain two highly conserved tryptophans that are spaced 20-23 residues apart. They bind proline-rich peptide motifs [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. 103
31040 275402 cd13215 PH-GRAM1_AGT26 Autophagy-related protein 26/Sterol 3-beta-glucosyltransferase Pleckstrin homology (PH) domain, repeat 1. ATG26 (also called UGT51/UDP-glycosyltransferase 51) is a member of the glycosyltransferase 28 family and catalyzes the biosynthesis of sterol glucoside. ATG26 is involved in decane metabolism and autophagy. There are 32 known autophagy-related (ATG) proteins: 17 are components of the core autophagic machinery essential for all autophagy-related pathways and 15 are the additional components required only for certain pathways or species. The core autophagic machinery includes 1) the ATG9 cycling system (ATG1, ATG2, ATG9, ATG13, ATG18, and ATG27), 2) the phosphatidylinositol 3-kinase complex (ATG6/VPS30, ATG14, VPS15, and ATG34), and 3) the ubiquitin-like protein system (ATG3, ATG4, ATG5, ATG7, ATG8, ATG10, ATG12, and ATG16). Less is known about how the core machinery is adapted or modulated with additional components to accommodate the nonselective sequestration of bulk cytosol (autophagosome formation) or selective sequestration of specific cargos (Cvt vesicle, pexophagosome, or bacteria-containing autophagosome formation). The pexophagosome-specific additions include the ATG30-ATG11-ATG17 receptor-adaptor complex, the coiled-coil protein ATG25, and the sterol glucosyltransferase ATG26. ATG26 is necessary for the degradation of medium peroxisomes. It contains 2 GRAM domains and a single PH domain. PH domains are only found in eukaryotes. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 116
31041 275403 cd13216 PH-GRAM2_AGT26 Autophagy-related protein 26/Sterol 3-beta-glucosyltransferase Pleckstrin homology (PH) domain, repeat 2. ATG26 (also called UGT51/UDP-glycosyltransferase 51) is a member of the glycosyltransferase 28 family and catalyzes the biosynthesis of sterol glucoside. ATG26 is involved in decane metabolism and autophagy. There are 32 known autophagy-related (ATG) proteins: 17 are components of the core autophagic machinery essential for all autophagy-related pathways and 15 are the additional components required only for certain pathways or species. The core autophagic machinery includes 1) the ATG9 cycling system (ATG1, ATG2, ATG9, ATG13, ATG18, and ATG27), 2) the phosphatidylinositol 3-kinase complex (ATG6/VPS30, ATG14, VPS15, and ATG34), and 3) the ubiquitin-like protein system (ATG3, ATG4, ATG5, ATG7, ATG8, ATG10, ATG12, and ATG16). Less is known about how the core machinery is adapted or modulated with additional components to accommodate the nonselective sequestration of bulk cytosol (autophagosome formation) or selective sequestration of specific cargos (Cvt vesicle, pexophagosome, or bacteria-containing autophagosome formation). The pexophagosome-specific additions include the ATG30-ATG11-ATG17 receptor-adaptor complex, the coiled-coil protein ATG25, and the sterol glucosyltransferase ATG26. ATG26 is necessary for the degradation of medium peroxisomes. It contains 2 GRAM domains and a single PH domain. PH domains are only found in eukaryotes. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 93
31042 275404 cd13217 PH-GRAM1_TCB1D8_TCB1D9_family TBC1D8 and TBC1D9 family Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8, TBC1D8B, TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). They all contain an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 99
31043 275405 cd13218 PH-GRAM2_TCB1D8_TCB1D9_family TBC1D8 and TBC1D9 family Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8, TBC1D8B, TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). They all contain an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 96
31044 270039 cd13219 PH-GRAM_C2-GRAM C2 and GRAM domain-containing protein Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. C2GRAM contains two N-terminal C2 domains followed by a single PH-GRAM domain. Since it contains both of these domains it is assumed that this gene cross-links both calcium and phosphoinositide signaling pathways. In general the C2 domain is involved in binding phospholipids in a calcium-dependent or calcium-independent manner. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 111
31045 275406 cd13220 PH-GRAM_GRAMDC GRAM domain-containing protein (GRAMDC) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. The GRAMDC proteins are membrane proteins. Nothing is known about their function. Members include: GRAMDC1A, GRAMDC1B, GRAMDC1C, GRAMDC2, GRAMDC3, GRAMDC4, and GRAMDC-like proteins. All of the members, except for GRAMDC4, are included in this hierarchy. Each contains a single PH-GRAM domain at its N-terminus. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 94
31046 270041 cd13221 PH-GRAM_GRAMDC4 GRAM domain-containing protein 4 (GRAMDC4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. GRAMDC4 is a membrane protein. Nothing is known about its function. Paralogs include: GRAMDC1A, GRAMDC1B, GRAMDC1C, GRAMDC2, GRAMDC3, and GRAMDC-like proteins. It contains a single PH-GRAM domain at its N-terminus. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 104
31047 270042 cd13222 PH-GRAM_GEM GLABRA 2 expression modulator (GEM) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. GEM interacts with CDT1, a pre-replication complex component that is involved in DNA replication, and with TTG1 (Transparent Testa GLABRA 1), a transcriptional regulator of epidermal cell fate. GEM controls the level of histone H3K9 methylation at the promoters of the GLABRA 2 and CAPRICE (CPC) genes, which are essential for epidermis patterning. GEM also regulates cell division in different root cell types. GEM regulates proliferation-differentiation decisions by integrating DNA replication, cell division and transcriptional controls. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 109
31048 275407 cd13223 PH-GRAM_MTM-like Myotubularin 1 and related proteins Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase. MTM1, MTMR1, and MTMR2 are members of the myotubularin protein phosphatase gene family. They contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. In addition MTMR1 (Myotubularin related 1 protein) and MTMR2 (Myotubularin related 2 protein) contain a C-terminal PDZ domain. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 100
31049 270044 cd13224 PH_Net1 Neuroepithelial cell transforming 1 Pleckstrin homology (PH) domain. Net1 (also called ArhGEF8) is part of the family of Rho guanine nucleotide exchange factors. Members of this family activate Rho proteins by catalyzing the exchange of GDP for GTP. The protein encoded by this gene interacts with RhoA within the cell nucleus and may play a role in repairing DNA damage after ionizing radiation. Net1 binds to caspase activation and recruitment domain (CARD)- and membrane-associated guanylate kinase-like domain-containing (CARMA) proteins and regulates nuclear factor kB activation. Net1 contains a RhoGEF domain N-terminal to a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 135
31050 270045 cd13225 PH-like_bacteria Pleckstrin homology (PH)-like domains in bacteria (PHb). Pleckstrin homology (PH) domains were first identified in eukaryotic proteins. Recently PH-like domains have been identified in bacteria as well. These PHb form dome-shaped oligomeric rings with a conserved hydrophilic surface at the intersection of the beta-strands of adjacent protomers that likely mediates protein-protein interactions. It is now thought that the PH domain superfamily is more widespread than previously thought and appears to have existed before prokaryotes and eukaryotes diverged. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 95
31051 275408 cd13226 PH-GRAM-like_Eap45 Pleckstrin homology-like domain or GLUE (GRAM-like ubiquitin-binding in Eap45) domain of Eap45. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. Human/yeast ESCRT-I consists of Tsg101/Vps23, Vps28/Vps28, and a Vps37 homolog/Vps37. Human/yeast ESCRT-II is composed of EAP20/Vps25, EAP30/Vps22, and EAP45/Vps36. Yeast ESCRT-III consists of Vps2, Vps20, Vps24, and Snf7 subunits. In contrast, there are three human paralogs of Snf7 (hSnf7-1/CHMP4A, hSnf7-2/CHMP4B, and hSnf7-3/CHMP4C) and two paralogs of Vps2 (CHMP2A and CHMP2B). Yeast ESCRT-I links directly to ESCRT-II, through a tight interaction of Vps28 (ESCRT-I) with the yeast-specific zinc-finger insertion within the GLUE domain of Vps36. The Vps36 subunit (ESCRT-II) binds ubiquitin using one of its two NZF zinc fingers in its N-terminal region. Human Vps36, EAP45, also binds ubiquitin despite having no NZF domain. Instead, mammalian ESCRT-II interacts with Ub through the Eap45 GLUE domain directly. Yeast Vps36 GLUE shows a preference for the singly phosphorylated PI(3)P, while Eap45 GLUE preferentially binds the triply phosphorylated phosphatidylinositol PI(3,4,5)P3. Structurally, Eap45 GLUE only has a PH-like fold since it lacks the secondary structure element corresponding to the 4 strand, unlike that of yeast Vps36 GLUE. ESCRT-II also interacts with ESCRT-III via an EAP20(Vps25)/CHMP6(Vps20) interaction. The interactions of ESCRT-II GLUE domain with membranes, ESCRT-I, and ubiquitin are critical for ubiquitinated cargo progression from early to late endosomes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized.
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 129
31052 275409 cd13227 PH-GRAM-like_Vps36 Pleckstrin homology-like domain or GLUE (GRAM-like ubiquitin-binding in Eap45) domain of Vps36. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. Yeast/human ESCRT-I consists of Vps23/Tsg101, Vps28/Vps28, and Vps37/Vps37 homolog. Yeast/human ESCRT-II is composed of Vps25/EAP20, Vps22/EAP30, and Vps36/EAP45. Yeast ESCRT-III consists of Vps2, Vps20, Vps24, and Snf7 subunits. In contrast, there are three human paralogs of Snf7 (hSnf7-1/CHMP4A, hSnf7-2/CHMP4B, and hSnf7-3/CHMP4C) and two paralogs of Vps2 (CHMP2A and CHMP2B). Yeast ESCRT-I links directly to ESCRT-II, through a tight interaction of Vps28 (ESCRT-I) with the yeast-specific zinc-finger insertion within the GLUE domain of Vps36. The Vps36 subunit (ESCRT-II) binds ubiquitin using one of its two NZF zinc fingers in its N-terminal region. Human Vps36, EAP45, also binds ubiquitin despite having no NZF domain. Instead, mammalian ESCRT-II interacts with Ub through the Eap45 GLUE domain itself. The yeast Vps36 GLUE has a complete PH domain, whereas Eap45 GLUE only has a PH-like fold since it lacks the secondary structure element corresponding to the 4 strand. ESCRT-II also interacts with ESCRT-III via a Vps25(EAP20)/Vps20(CHMP6) interaction. Structure 2CAY is missing this insertion that contains 2 NZF zinc fingers. It is a split PH domain, with a noncanonical lipid binding pocket that binds PI(3)P. The interactions of ESCRT-II GLUE domain with membranes, ESCRT-I, and ubiquitin are critical for ubiquitinated cargo progression from early to late endosomes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 119
31053 270048 cd13228 PHear_NECAP NECAP (adaptin-ear-binding coat-associated protein) Pleckstrin Homology (PH) fold with ear-like function (PHear) domain. NECAPs are alpha-ear-binding proteins that enrich on clathrin-coated vesicles (CCVs). NECAP 1 is expressed in brain and non-neuronal tissues and cells while NECAP 2 is ubiquitously expressed. The PH-like domain of NECAPs is a protein-binding interface that mimics the FxDxF motif binding properties of the alpha-ear and is called PHear (PH fold with ear-like function) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 120
31054 270049 cd13229 PH_TFIIH Transcription Factor II H (TFIIH) Pleckstrin homology (PH) domain. The transcription factor II H (TFIIH) is one of the general transcription factors (GTFs) known to be a target of the transactivation domain (TAD) of p53. Human TFIIH and its homologous yeast counterpart (factor b) are composed of ten subunits that can be divided into two groups, the core TFIIH (XPB/Ssl2, p62/Tfb1, p52/Tfb2, p44/Ssl1, p34/Tfb4, and TTDA/Tfb5 in human/yeast) and the CAK complex (cdk7/Kin28, cyclin H/Ccl1, and MAT1/Tfb3). These two complexes are linked by the XPD/Rad3 subunit. The helicase activities of XPB and XPD are essential to the formation of the open complex during transcription initiation and the kinase activity of cdk7 phosphorylates the C-terminal domain (CTD) of the RNA Pol II largest subunit, enabling RNA Pol II to progress from the initiation phase to the elongation phase of transcription. The PH domain of p62/Tfb1 has been shown to interact with herpes simplex virus protein 16 (VP16) TAD and the binding of p53 TAD is mediated by the TAD2 subdomain. TFIIE recruits TFIIH to complete the preinitiation complex (PIC) formation and regulates enzymatic activities of TFIIH. The PH domain of the human TFIIH p62 subunit binds to the C-terminal acidic (AC) domain of the human TFIIEalpha subunit. This interaction could be a switch to replace p53 with TFIIE on TFIIH in transcription. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 93
31055 270050 cd13230 PH1_SSRP1-like Structure Specific Recognition protein 1 (SSRP1) Pleckstrin homology (PH) domain, repeat 1. SSRP1 is a component of FACT (facilitator of chromatin transcription), an essential chromatin reorganizing factor. In yeast FACT (yFACT) is composed of three proteins: Spt16/Cdc68, Pob3, and Nhp6. In metazoans the Pob3 and Nhp6 orthologs are fused to form SSRP1/T160 in human and mouse, respectively. The middle domain of the Pob3 subunit (Pob3-M) has an unusual double pleckstrin homology (PH) architecture. yFACT interacts in a physiologically important way with the central single-strand DNA binding factor RPA to promote a step in DNA Replication. Coordinated function by yFACT and RPA is important during nucleosome deposition. These results support the model that the FACT family has an essential role in constructing nucleosomes during DNA replication, and suggest that RPA contributes to this process. Members of this cd are composed of the first PH-like repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 137
31056 270051 cd13231 PH2_SSRP1-like Structure Specific Recognition protein 1 (SSRP1) Pleckstrin homology (PH) domain, repeat 2. SSRP1 is a component of FACT (facilitator of chromatin transcription), an essential chromatin reorganizing factor. In yeast FACT (yFACT) is composed of three proteins: Spt16/Cdc68, Pob3, and Nhp6. In metazoans the Pob3 and Nhp6 orthologs are fused to form SSRP1/T160 in human and mouse, respectively. The middle domain of the Pob3 subunit (Pob3-M) has an unusual double pleckstrin homology (PH) architecture. yFACT interacts in a physiologically important way with the central single-strand DNA binding factor RPA to promote a step in DNA Replication. Coordinated function by yFACT and RPA is important during nucleosome deposition. These results support the model that the FACT family has an essential role in constructing nucleosomes during DNA replication, and suggest that RPA contributes to this process. Members of this cd are composed of the second PH-like repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
31057 270052 cd13232 Ig-PH_SCAB1 Stomatal Closure Related Actin-Binding Protein 1 Pleckstrin homology-like domain. SCAB1 is an actin-binding protein that interacts with actin filaments and regulates stomatal movement. SCAB1 is composed of an actin-binding domain, two coiled-coil (CC) domains, and a fused immunoglobulin (Ig) and PH (Ig-PH) domain. SCAB1 homologs are widely present, often in multiple copies (three in Arabidopsis), in plants including eudicots, monocots, ferns and mosses, but are not found in algae and non-plant species. The C-terminal PH domain binds weakly with inositol phosphates via an atypical basic surface patch. SCAB1 forms a dimeric structure via its coiled-coil domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 119
31058 270053 cd13233 PH_ARHGAP9-like Beta-spectrin pleckstrin homology (PH) domain. ARHGAP family genes encode Rho/Rac/Cdc42-like GTPase activating proteins with RhoGAP domain. The ARHGAP members here all have a PH domain upstream of their C-terminal RhoGAP domain. Some have additional N-terminal SH3 and WW domains. The members here include: ARHGAP9, ARHGAP12, ARHGAP15, and ARHGAP27. ARHGAP27 and ARHGAP12 shared the common-domain structure, consisting of SH3, WW, PH, and RhoGAP domains. The PH domain of ArhGAP9 employs a non-canonical phosphoinositide binding mechanism, a variation of the spectrin- Ins(4,5)P2-binding mode, that gives rise to a unique PI binding profile, namely a preference for both PI(4,5)P2 and the PI 3-kinase products PI(3,4,5)P3 and PI(3,4)P2. This lipid binding mechanism is also employed by the PH domain of Tiam1 and Slm1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31059 270054 cd13234 PHsplit_PLC_gamma Phospholipase C-gamma Split pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) is activated by receptor and non-receptor tyrosine kinases due to the presence of its SH2 and SH3 domains. There are two main isoforms of PLC-gamma expressed in human specimens, PLC-gamma1 and PLC-gamma2. PLC-gamma consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. The split PH domain is present in this hierarchy. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31060 270055 cd13235 PH2_FARP1-like FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins Pleckstrin Homology (PH) domain, repeat 2. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FARP6 (also called Zinc finger FYVE domain-containing protein 24). They are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. Little is known about FARP1 and FARP6, though FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. FARP1 and FARP2 are composed of a N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. FARP6 is composed of Dbl-homology (DH), and two C-terminal PH domains separated by a FYVE domain. This hierarchy contains the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 98
31061 270056 cd13236 PH2_FGD1-4 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins pleckstrin homology (PH) domain, C-terminus. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Not much is known about FGD2. FGD1 is the best characterized member of the group with mutations here leading to the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles to regulate cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. 
PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31062 270057 cd13237 PH2_FGD5_FGD6 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 5 and 6 pleckstrin homology (PH) domain, C-terminus. FGD5 regulates the pro-angiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 91
31063 270058 cd13238 PH2_FGD4_insect-like FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 4 pleckstrin homology (PH) domain, C-terminus, in insect and related arthropods. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. This cd contains insects, crustaceans, and chelicerates. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 97
31064 270059 cd13239 PH_Obscurin Obscurin pleckstrin homology (PH) domain. Obscurin (also called Obscurin-RhoGEF; Obscurin-myosin light chain kinase/Obscurin-MLCK) is a giant muscle protein that is concentrated at the peripheries of Z-disks and M-lines. It binds small ankyrin I, a component of the sarcoplasmic reticulum (SR) membrane. It is associated with the contractile apparatus through binding with titin and sarcomeric myosin. It plays important roles in the organization and assembly of the myofibril and the SR. Obscurin has been observed as alternatively-spliced isoforms. The major isoform in skeletal muscle, approximately 800 kDa in size, is composed of many adhesion modules and signaling domains. It harbors 49 Ig and 2 FNIII repeats at the N-terminus, a complex middle region with additional Ig domains, an IQ motif, and a conserved SH3 domain near RhoGEF and PH domains, and a non-modular C-terminus with phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not part of the 800 kDa form of the protein, but are part of smaller spliced products that are present in heart muscle. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31065 270060 cd13240 PH1_Kalirin_Trio_like Triple functional domain pleckstrin homology (PH) domain, repeat 1. The RhoGEFs Kalirin and Trio, the mammalian homologs of Drosophila Trio and Caenorhabditis elegans UNC-73, regulate a novel step in secretory granule maturation. Their signaling modulates the extent to which regulated cargo enter and remain in the regulated secretory pathway. This allows for fine tuning of peptides released by a single secretory cell type with impaired signaling leading to pathological states. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Kalirin and Trio are encoded by separate genes in mammals and by a single one in invertebrates. Kalirin and Trio share the same complex multidomain structure and display several splice variants. The longest Kalirin and Trio proteins have a Sec14 domain, a stretch of spectrin repeats, a RhoGEF(DH)/PH cassette (also called GEF1), an SH3 domain, a second RhoGEF(DH)/PH cassette (also called GEF2), a second SH3 domain, Ig/FNIII domains, and a kinase domain. The first RhoGEF(DH)/PH cassette catalyzes exchange on Rac1 and RhoG while the second RhoGEF(DH)/PH cassette is specific for RhoA. Kalirin and Trio are closely related to p63RhoGEF and have PH domains of similar function. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 123
31066 270061 cd13241 PH2_Kalirin_Trio_p63RhoGEF p63RhoGEF pleckstrin homology (PH) domain, repeat 2. The guanine nucleotide exchange factor p63RhoGEF is an effector of the heterotrimeric G protein Galphaq, linking Galphaq-coupled receptors (GPCRs) to the activation of RhoA. The Dbl(DH) and PH domains of p63RhoGEF interact with the effector-binding site and the C-terminal region of Galphaq and appear to relieve autoinhibition of the catalytic DH domain by the PH domain. Trio, Duet, and p63RhoGEF are shown to constitute a family of Galphaq effectors that appear to activate RhoA both in vitro and in intact cells. Dbs is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a rhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Trio is a multidomain signaling protein that contains two RhoGEF(DH)-PH domains in tandem. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 140
31067 270062 cd13242 PH_puratrophin-1 Puratrophin-1 pleckstrin homology (PH) domain. Puratrophin-1 (also called Purkinje cell atrophy-associated protein 1 or PLEKHG4/Pleckstrin homology domain-containing family G member 4) contains a spectrin repeat, a RhoGEF (DH) domain, and a PH domain. It is thought to function in intracellular signaling and cytoskeleton dynamics at the Golgi. Puratrophin-1 is expressed in kidney, Leydig cells in the testis, epithelial cells in the prostate gland and Langerhans islet in the pancreas. A single nucleotide substitution in the puratrophin-1 gene was once thought to result in autosomal dominant cerebellar ataxia (ADCA), but now it has been demonstrated that this ataxia is a result of defects in the BEAN gene. Puratrophin contains a domain architecture similar to that of Dbl family members Dbs and Trio. Dbs is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a RhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Trio is a multidomain signaling protein that contains two RhoGEF(DH)-PH domains in tandem. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 136
31068 270063 cd13243 PH_PLEKHG1_G2_G3 Pleckstrin homology domain-containing family G members 1, 2, and 3 pleckstrin homology (PH) domain. PLEKHG1 (also called ARHGEF41), PLEKHG2 (also called ARHGEF42 or CLG/common-site lymphoma/leukemia guanine nucleotide exchange factor2), and PLEKHG3 (also called ARHGEF43) have RhoGEF DH/double-homology domains in tandem with a PH domain which is involved in phospholipid binding. They function as guanine nucleotide exchange factors (GEFs) and are involved in the regulation of Rho protein signal transduction. Mutations in PLEKHG1 have been associated with panic disorder (PD), an anxiety disorder characterized by panic attacks and anticipatory anxiety. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 147
31069 270064 cd13244 PH_PLEKHG5_G6 Pleckstrin homology domain-containing family G member 5 and 6 pleckstrin homology (PH) domain. PLEKHG5 has a RhoGEF DH/double-homology domain in tandem with a PH domain which is involved in phospholipid binding. PLEKHG5 activates the nuclear factor kappa B (NFKB1) signaling pathway. Mutations in PLEKHG5 are associated with autosomal recessive distal spinal muscular atrophy. PLEKHG6 (also called MyoGEF) has no known function to date. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
31070 270065 cd13245 PH_PLEKHG7 Pleckstrin homology domain-containing family G member 7 pleckstrin homology (PH) domain. PLEKHG7 has a RhoGEF DH/double-homology domain in tandem with a PH domain which is involved in phospholipid binding. PLEKHG7 is proposed to function as a guanine nucleotide exchange factor (GEF) and is involved in the regulation of Rho protein signal transduction. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 128
31071 270066 cd13246 PH_Scd1 Shape and Conjugation Deficiency 1 Pleckstrin homology (PH) domain. Fission yeast Scd1 is an exchange factor for Cdc42 and an effector of Ras1, the homolog of the human H-Ras. Scd2/Bem1 mediates Cdc42 activation by binding to Scd1/Cdc24 and to Cdc42. Ras1 regulates Scd1/Cdc24/Ral1, which is a putative guanine nucleotide exchange factor for Cdc42, a member of the Rho family of Ras-like proteins. Cdc42 then activates the Shk1/Orb2 protein kinase. Scd1 interacts with Klp5 and Klp6 kinesins to mediate cytokinesis. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 148
31072 270067 cd13247 BAR-PH_APPL Adaptor protein containing PH domain, PTB domain, and Leucine zipper motif Bin1/amphiphysin/Rvs167 (BAR)-Pleckstrin homology (PH) domain. APPL (also called DCC-interacting protein (DIP)-13alpha) interacts with oncoprotein serine/threonine kinase AKT2, tumor suppressor protein DCC (deleted in colorectal cancer), Rab5, GIPC (GAIP-interacting protein, C terminus), human follicle-stimulating hormone receptor (FSHR), and the adiponectin receptors AdipoR1 and AdipoR2. There are two isoforms of human APPL: APPL1 and APPL2, which share about 50% sequence identity. APPL has a BAR and a PH domain near its N terminus, and the two domains are thought to function as a unit (BAR-PH domain). C-terminal to this is a PTB domain. Lipid binding assays show that the BAR, PH, and PTB domains can bind phospholipids. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31073 270068 cd13248 PH_PEPP1_2_3 Phosphoinositol 3-phosphate binding proteins 1, 2, and 3 pleckstrin homology (PH) domain. PEPP1 (also called PLEKHA4/PH domain-containing family A member 4 and RHOXF1/Rhox homeobox family member 1), and related homologs PEPP2 (also called PLEKHA5/PH domain-containing family A member 5) and PEPP3 (also called PLEKHA6/PH domain-containing family A member 6), have PH domains that interact specifically with PtdIns(3,4)P3. Other proteins that bind PtdIns(3,4)P3 specifically are: TAPP1 (tandem PH-domain-containing protein-1) and TAPP2, PtdIns3P AtPH1, and PtdIns(3,5)P2 (centaurin-beta2). All of these proteins contain at least 5 of the 6 conserved amino acids that make up the putative phosphatidylinositol 3,4,5-trisphosphate-binding motif (PPBM) located at their N-terminus. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 104
31074 270069 cd13249 PH_rhotekin2 Anillin Pleckstrin homology (PH) domain. Anillin (Rhotekin/RTKN; also called PLEKHK/Pleckstrin homology domain-containing family K) is an actin binding protein involved in cytokinesis. It interacts with GTP-bound Rho proteins and results in the inhibition of their GTPase activity. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Anillin proteins have an N-terminal HRI domain/ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain that binds small GTPases from the Rho family. The C-terminal PH domain helps target anillin to ectopic septin containing foci. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 111
31075 270070 cd13250 PH_ACAP ArfGAP with coiled-coil, ankyrin repeat and PH domains Pleckstrin homology (PH) domain. ACAP (also called centaurin beta) functions both as a Rab35 effector and as an Arf6-GTPase-activating protein (GAP) by which it controls actin remodeling and membrane trafficking. ACAPs contain an NH2-terminal bin/amphiphysin/Rvs (BAR) domain, a phospholipid-binding domain, a PH domain, a GAP domain, and four ankyrin repeats. The AZAPs constitute a family of Arf GAPs that are characterized by an NH2-terminal pleckstrin homology (PH) domain and a central Arf GAP domain followed by two or more ankyrin repeats. On the basis of sequence and domain organization, the AZAP family is further subdivided into four subfamilies: 1) the ACAPs contain an NH2-terminal bin/amphiphysin/Rvs (BAR) domain (a phospholipid-binding domain that is thought to sense membrane curvature), a single PH domain followed by the GAP domain, and four ankyrin repeats; 2) the ASAPs also contain an NH2-terminal BAR domain, the tandem PH domain/GAP domain, three ankyrin repeats, two proline-rich regions, and a COOH-terminal Src homology 3 domain; 3) the AGAPs contain an NH2-terminal GTPase-like domain (GLD), a split PH domain, and the GAP domain followed by four ankyrin repeats; and 4) the ARAPs contain both an Arf GAP domain and a Rho GAP domain, as well as an NH2-terminal sterile-a motif (SAM), a proline-rich region, a GTPase-binding domain, and five PH domains (PMID 18003747 and 19055940). Centaurin can bind to phosphatidylinositol (3,4,5)P3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 98
31076 270071 cd13251 PH_ASAP ArfGAP with SH3 domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain. ASAPs (ASAP1, ASAP2, and ASAP3) function as Arf-specific GAPs, participate in rhodopsin trafficking, are associated with tumor cell metastasis, modulate phagocytosis, promote cell proliferation, facilitate vesicle budding and Golgi exocytosis, and regulate vesicle coat assembly via a Bin/Amphiphysin/Rvs domain. ASAPs contain an NH2-terminal BAR domain, a tandem PH domain/GAP domain, three ankyrin repeats, two proline-rich regions, and a COOH-terminal Src homology 3 (SH3) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31077 270072 cd13252 PH1_ADAP ArfGAP with dual PH domains Pleckstrin homology (PH) domain, repeat 1. ADAP (also called centaurin alpha) is a phosphatidylinositide binding protein consisting of an N-terminal ArfGAP domain and two PH domains. In response to growth factor activation, PI3K phosphorylates phosphatidylinositol 4,5-bisphosphate to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 1 is recruited to the plasma membrane following growth factor stimulation by specific binding of its PH domain to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 2 is constitutively bound to the plasma membrane since it binds phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 3,4,5-trisphosphate with equal affinity. This cd contains the first PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
31078 270073 cd13253 PH1_ARAP ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 1. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the first PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 94
31079 270074 cd13254 PH2_ARAP ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 2. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the second PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 90
31080 270075 cd13255 PH_TAAP2-like Tandem PH-domain-containing protein 2 Pleckstrin homology (PH) domain. TAPP2 (also called PLEKHA2) adaptors bind PtdIns(3,4)P2, but not PI(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP2 contains two sequential PH domains in which the C-terminal PH domain specifically binds PtdIns(3,4)P2 with high affinity. The N-terminal PH domain does not interact with any phosphoinositide tested. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. The members here are most similar in sequence to TAPP2 proteins, but may not be actual TAPP2 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31081 270076 cd13256 PH3_ARAP ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 3. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the third PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31082 270077 cd13257 PH4_ARAP ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 4. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the fourth PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 91
31083 270078 cd13258 PH_PLEKHJ1 Pleckstrin homology domain containing, family J member 1 Pleckstrin homology (PH) domain. PLEKHJ1 (also called GNRPX2/Guanine nucleotide-releasing protein x) contains a single PH domain. Very little information is known about PLEKHJ1. PLEKHJ1 has been shown to interact with IKBKG (inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase gamma) and KRT33B (keratin 33B). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
31084 270079 cd13259 PH5_ARAP ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 5. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the fifth PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 121
31085 270080 cd13260 PH_RASA1 RAS p21 protein activator (GTPase activating protein) 1 Pleckstrin homology (PH) domain. RASA1 (also called RasGap1 or p120) is a member of the RasGAP family of GTPase-activating proteins. RASA1 contains N-terminal SH2-SH3-SH2 domains, followed by two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Splice variants lack the N-terminal domains. It is a cytosolic vertebrate protein that acts as a suppressor of RAS via its C-terminal GAP domain function, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. Additionally, it is involved in mitogenic signal transmission towards downstream interacting partners through its N-terminal SH2-SH3-SH2 domains. RASA1 interacts with a number of proteins including: G3BP1, SOCS3, ANXA6, Huntingtin, KHDRBS1, Src, EPHB3, EPH receptor B2, Insulin-like growth factor 1 receptor, PTK2B, DOK1, PDGFRB, HCK, Caveolin 2, DNAJA3, HRAS, GNB2L1 and NCK1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31086 270081 cd13261 PH_RasGRF1_2 Ras-specific guanine nucleotide-releasing factors 1 and 2 Pleckstrin homology (PH) domain. RasGRF1 (also called GRF1; CDC25Mm/Ras-specific nucleotide exchange factor CDC25; GNRP/Guanine nucleotide-releasing protein) and RasGRF2 (also called GRF2; Ras guanine nucleotide exchange factor 2) are a family of guanine nucleotide exchange factors (GEFs). They both promote the exchange of Ras-bound GDP by GTP, thereby regulating the RAS signaling pathway. RasGRF1 and RasGRF2 form homooligomers and heterooligomers. GRF1 has 3 isoforms and GRF2 has 2 isoforms. The longest isoforms of RasGRF1 and RasGRF2 contain the following domains: a Rho-GEF domain sandwiched between 2 PH domains, IQ domains, a REM (Ras exchanger motif) domain, and a Ras-GEF domain, which gives them the capacity to activate both Ras and Rac GTPases in response to signals from a variety of neurotransmitter receptors. Their IQ domains allow them to act as calcium sensors to mediate the actions of NMDA-type and calcium-permeable AMPA-type glutamate receptors. GRF1 also mediates the action of dopamine receptors that signal through cAMP. GRF1 and GRF2 play strikingly different roles in regulating MAP kinase family members, neuronal synaptic plasticity, specific forms of learning and memory, and behavioral responses to psychoactive drugs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 136
31087 270082 cd13262 PH_RasSynGAP-like Synaptic Ras-GTPase activating protein family Pleckstrin homology (PH) domain. The RasSynGAP family is composed of members: DAB2IP, nGAP, and SynGAP. Neuronal growth-associated proteins (nGAPs) are growth cone markers found in multiple types of neurons. There are many nGAPs including Cap1 (Adenylate cyclase-associated protein 1), Capzb (Capping protein (actin filament) muscle Z-line, beta), Clptm1 (Cleft lip and palate associated transmembrane protein 1), Cotl1 (Coactosin-like 1), Crmp1 (Collapsin response mediator protein 1), Cyfip1 (Cytoplasmic FMR1 interacting protein 1), Fabp7 (Fatty acid binding protein 7, brain), Farp2 (FERM, RhoGEF and pleckstrin domain protein 2), Gap43 (Growth associated protein 43), Gnao1 (Guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O), Gnai2 (Guanine nucleotide binding protein (G protein), alpha inhibiting 2), Pacs1 (Phosphofurin acidic cluster sorting protein 1), Rtn1 (Reticulon 1), Sept2 (Septin 2), Snap25 (Synaptosomal-associated protein 25), Strap (Serine/threonine kinase receptor associated protein), Stx7 (Syntaxin 7), and Tmod2 (Tropomodulin 2). SynGAP, a neuronal Ras-GAP, has been shown to display both Ras-GAP activity and Ras-related protein (Rap)-GAP activity. Saccharomyces cerevisiae Bud2 and GAP1 members CAPRI (Ca2+-promoted Ras inactivator) and RASAL (Ras-GTPase-activating-like protein) also possess this dual activity. Human DOC-2/DAB2-interacting protein (DAB2IP) is encoded by a tumor suppressor gene and is a newly recognized member of the Ras-GTPase-activating family. DAB2IP is a critical component of many signal transduction pathways mediated by Ras and tumor necrosis factors including apoptosis pathways, and it is involved in the formation of many types of tumors. DAB2IP participates in regulation of gene expression and pluripotency of cells. It has been reported that DAB2IP was expressed in different tumor tissues. 
Little information is available concerning the expression levels of DAB2IP in normal tissues and cells, however, and no studies of its expression patterns during the development of human embryos have been reported. DAB2IP was expressed primarily in cell cytoplasm throughout the fetal development. The expression levels varied among tissues and different gestational ages. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31088 270083 cd13263 PH_RhoGap25-like Rho GTPase activating protein 25 and related proteins Pleckstrin homology (PH) domain. RhoGAP25 (also called ArhGap25), like other RhoGAPs, is involved in cell polarity, cell morphology and cytoskeletal organization. These proteins act as GTPase activators for the Rac-type GTPases by converting them to an inactive GDP-bound state. They control actin remodeling by inactivating Rac downstream of Rho, which suppresses leading edge protrusion and promotes cell retraction to achieve cellular polarity, and they are able to suppress RAC1 and CDC42 activity in vitro. Overexpression of these proteins induces cell rounding with partial or complete disruption of actin stress fibers and formation of membrane ruffles, lamellipodia, and filopodia. This hierarchy contains RhoGAP22, RhoGAP24, and RhoGAP25. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
31089 270084 cd13264 PH_ITSN Intersectin Pleckstrin homology (PH) domain. ITSNs, an adaptor protein family, play a role in endo- and exocytosis, actin cytoskeleton rearrangement and signal transduction. There are two human ITSN genes: ITSN1 and ITSN2. They share significant sequence identity and a similar domain structure having both short and long isoforms produced by alternative splicing. The short isoform (ITSN-S) consists of two Eps15 homology domains (EH1 and EH2), a coiled-coil region (CCR) and five Src homology 3 domains (SH3A-E). The EH domains bind to Asn-Pro-Phe motifs and are implicated in endocytosis and vesicle transport. The SH3 domains bind to proline-rich sequences and are commonly found in proteins implicated in cell signalling pathways, cytoskeletal organization and membrane traffic. The long isoform (ITSN-L) contains three additional C-terminal domains, a Dbl homology domain (DH), a Pleckstrin homology domain (PH) and a C2 domain. The tandem DH-PH domains are present in all Dbl family of GEFs. ITSN acts specifically on Cdc42 through its DH domain with no portion of the PH domain making contact with Cdc42. This is in contrast to Dbs which requires the PH domain for full catalytic activity. The ITSN PH domain binds phosphoinositides. C2 domains are usually involved in Ca2+-dependent and Ca2+-independent phospholipid binding. There are more than 30 proteins that interact with ITSNs. ITSN-S is present in mammals, frogs, flies and nematodes, while ITSN-L is present only in vertebrates. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 132
31090 270085 cd13265 PH_evt Evectin Pleckstrin homology (PH) domain. There are 2 members of the evectin family (also called pleckstrin homology domain containing, family B): evt-1 (also called PLEKHB1) and evt-2 (also called PLEKHB2). evt-1 is specific to the nervous system, where it is expressed in photoreceptors and myelinating glia. evt-2 is widely expressed in both neural and nonneural tissues. Evectins possess a single N-terminal PH domain and a C-terminal hydrophobic region. evt-1 is thought to function as a mediator of post-Golgi trafficking in cells that produce large membrane-rich organelles. It is a candidate gene for the inherited human retinopathy autosomal dominant familial exudative vitreoretinopathy and a susceptibility gene for multiple sclerosis. evt-2 is essential for retrograde endosomal membrane transport from the plasma membrane (PM) to the Golgi. Two membrane trafficking pathways pass through recycling endosomes: a recycling pathway and a retrograde pathway that links the PM to the Golgi/ER. Its PH domain is unique in that it specifically recognizes phosphatidylserine (PS), but not polyphosphoinositides. PS is an anionic phospholipid class in eukaryotic biomembranes, is highly enriched in the PM, and plays key roles in various physiological processes such as the coagulation cascade, recruitment and activation of signaling molecules, and clearance of apoptotic cells. PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31091 270086 cd13266 PH_Skap_family Src kinase-associated phosphoprotein family Pleckstrin homology (PH) domain. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Src kinase-associated phosphoprotein of 55 kDa (Skap55)/Src kinase-associated phosphoprotein 1 (Skap1), Skap2, and Skap-homology (Skap-hom) have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. Their PH domains bind 3'-phosphoinositides and also directly affect targets: in Skap55 the PH domain directly affects integrin regulation by ADAP and NF-kappaB activation, while in Skap-hom the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31092 270087 cd13267 PH_DOCK-D Dedicator of cytokinesis-D subfamily Pleckstrin homology (PH) domain. DOCK-D subfamily (also called Zizimin subfamily) consists of Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2. DOCK-D has a N-terminal DUF3398 domain, a PH-like domain, a Dock Homology Region 1, DHR1 (also called CZH1), a C2 domain, and a C-terminal DHR2 domain (also called CZH2). Zizimin1 is enriched in the brain, lung, and kidney; zizimin2 is found in B and T lymphocytes, and zizimin3 is enriched in brain, lung, spleen and thymus. Zizimin1 functions in autoinhibition and membrane targeting. Zizimin2 is an immune-related and age-regulated guanine nucleotide exchange factor, which facilitates filopodial formation through activation of Cdc42, which results in activation of cell migration. No function has been determined for Zizimin3 to date. The N-terminal half of zizimin1 binds to the GEF domain through three distinct areas, including CZH1, to inhibit the interaction with Cdc42. In addition its PH domain binds phosphoinositides and mediates zizimin1 membrane targeting. DOCK is a family of proteins involved in intracellular signalling networks. They act as guanine nucleotide exchange factors for small G proteins of the Rho family, such as Rac and Cdc42. There are 4 subfamilies of DOCK family proteins based on their sequence homology: A-D. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 126
31093 270088 cd13268 PH_Brdg1 BCR downstream signaling 1 Pleckstrin homology (PH) domain. Brdg1 is thought to function as a docking protein acting downstream of Tec, a protein tyrosine kinase (PTK), in B-cell antigen receptor (BCR) signaling. BRDG1 contains a proline-rich (PR) motif which is thought to bind SH3 or WW domains, a PH domain, and multiple tyrosine residues which are potential target sites for SH2 domains. Since PH domains bind phospholipids, the PH domain is thought to be involved in the tethering of Tec and BRDG1 to the cell membrane. Tec and Pyk2, but not Btk, Bmx, Lyn, Syk, or c-Abl, induce phosphorylation of BRDG1 on tyrosine residues. Efficient phosphorylation requires both the PH and SH2 domains of BRDG1 and the kinase domain of Tec. The overexpression of BRDG1 increases the BCR-mediated activation of cAMP-response element binding protein (CREB). Phosphorylated BRDG1 is hypothesized to recruit CREB either directly or through its recruitment of downstream effectors which then recruit CREB. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 127
31094 241423 cd13269 PH_alsin Alsin Pleckstrin homology (PH) domain. The ALS2 gene encodes alsin, a GEF, that has dual specificity for Rac1 and Rab5 GTPases. Alsin mutations in the form of truncated proteins are responsible for motor function disorders including juvenile-onset amyotrophic lateral sclerosis, familial juvenile primary lateral sclerosis, and infantile-onset ascending hereditary spastic paralysis. The alsin protein is widely expressed in the developing CNS including neurons of the cerebral cortex, brain stem, spinal cord, and cerebellum. Alsin contains a regulator of chromosome condensation 1 (RCC1) domain, a Rho guanine nucleotide exchanging factor (RhoGEF) domain, a PH domain, a Membrane Occupation and Recognition Nexus (MORN), a vacuolar protein sorting 9 (Vps9) domain, and a Dbl homology (DH) domain. Alsin interacts with Rab5 through its Vps9 domain and through this interaction modulates early endosome fusion and trafficking. The GEF activity of alsin towards Rab5 is regulated by Rac1 function. The GEF activity of alsin for Rac1 occurs via its DH domain and this interaction plays a role in promoting spinal motor neuron survival via multiple Rac-dependent signaling pathways. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31095 270089 cd13270 PH1_TAPP1_2 Tandem PH-domain-containing proteins 1 and 2 Pleckstrin homology (PH) domain, N-terminal repeat. TAPP1 (also called PLEKHA1/pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1) and TAPP2 (also called PLEKHA2) are adaptors that bind PtdIns(3,4)P2, but not PtdIns(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP1 and TAPP2 contain two sequential PH domains in which the C-terminal PH domain binds PtdIns(3,4)P2. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 118
31096 270090 cd13271 PH2_TAPP1_2 Tandem PH-domain-containing proteins 1 and 2 Pleckstrin homology (PH) domain, C-terminal repeat. TAPP1 (also called PLEKHA1/pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1) and TAPP2 (also called PLEKHA2) are adaptors that bind PtdIns(3,4)P2, but not PtdIns(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP1 and TAPP2 contain two sequential PH domains in which the C-terminal PH domain specifically binds PtdIns(3,4)P2 with high affinity. The N-terminal PH domain does not interact with any phosphoinositide tested. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
31097 270091 cd13272 PH_INPP4A_INPP4B Type I inositol 3,4-bisphosphate 4-phosphatase and Type II inositol 3,4-bisphosphate 4-phosphatase Pleckstrin homology (PH) domain. INPP4A (also called Inositol polyphosphate 4-phosphatase type I) and INPP4B (also called Inositol polyphosphate 4-phosphatase type II) both catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 3,4-bisphosphate and inositol 1,3,4-trisphosphate. They differ in that INPP4A additionally catalyzes the hydrolysis of the 4-position phosphate of inositol 3,4-bisphosphate, while INPP4B catalyzes the hydrolysis of the 4-position phosphate of inositol 1,4-bisphosphate. They both have a single PH domain followed by a C2 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 144
31098 270092 cd13273 PH_SWAP-70 Switch-associated protein-70 Pleckstrin homology (PH) domain. SWAP-70 (also called Differentially expressed in FDCP 6/DEF-6 or IRF4-binding protein) functions in cellular signal transduction pathways (in conjunction with Rac), regulates cell motility through actin rearrangement, and contributes to the transformation and invasion activity of mouse embryo fibroblasts. Metazoan SWAP-70 is found in B lymphocytes, mast cells, and in a variety of organs. Metazoan SWAP-70 contains an N-terminal EF-hand motif, a centrally located PH domain, and a C-terminal coiled-coil domain. The PH domain of Metazoan SWAP-70 contains a phosphoinositide-binding site and a nuclear localization signal (NLS), which localize SWAP-70 to the plasma membrane and nucleus, respectively. The NLS is a sequence of four Lys residues located at the N-terminus of the C-terminal a-helix; this is a unique characteristic of the Metazoan SWAP-70 PH domain. The SWAP-70 PH domain binds PtdIns(3,4,5)P3 and PtdIns(4,5)P2 embedded in lipid bilayer vesicles. There are additional plant SWAP70 proteins, but these are not included in this hierarchy. Rice SWAP70 (OsSWAP70) exhibits GEF activity toward its Rho GTPase, OsRac1, and regulates chitin-induced production of reactive oxygen species and defense gene expression in rice. Arabidopsis SWAP70 (AtSWAP70) plays a role in both PAMP- and effector-triggered immunity. Plant SWAP70 contains both DH and PH domains, but their arrangement is the reverse of that in typical DH-PH-type Rho GEFs, wherein the DH domain is flanked by a C-terminal PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31099 270093 cd13274 PH_DGK_type2 Type 2 Diacylglycerol kinase Pleckstrin homology (PH) domain. DGK (also called DAGK) catalyzes the conversion of diacylglycerol (DAG) to phosphatidic acid (PA) utilizing ATP as a source of the phosphate. In non-stimulated cells, DGK activity is low and DAG is used for glycerophospholipid biosynthesis. Upon receptor activation of the phosphoinositide pathway, DGK activity increases which drives the conversion of DAG to PA. DGK acts as a switch by terminating the signalling of one lipid while simultaneously activating signalling by another. There are 9 mammalian DGK isoforms all with conserved catalytic domains and two cysteine rich domains. These are further classified into 5 groups according to the presence of additional functional domains and substrate specificity: Type 1 - DGK-alpha, DGK-beta, DGK-gamma - contain EF-hand motifs and a recoverin homology domain; Type 2 - DGK-delta, DGK-eta, and DGK-kappa- contain a pleckstrin homology domain, two cysteine-rich zinc finger-like structures, and a separated catalytic region; Type 3 - DGK-epsilon - has specificity for arachidonate-containing DAG; Type 4 - DGK-zeta, DGK-iota- contain a MARCKS homology domain, ankyrin repeats, a C-terminal nuclear localization signal, and a PDZ-binding motif; Type 5 - DGK-theta - contains a third cysteine-rich domain, a pleckstrin homology domain and a proline rich region. The type 2 DGKs are present as part of this Metazoan DGK hierarchy. They have a N-terminal PH domain, two cysteine rich domains, followed by bipartite catalytic domains, and a C-terminal SAM domain. Their catalytic domains and perhaps other DGK catalytic domains may function as two independent units in a coordinated fashion. They may also require other motifs for maximal activity because several DGK catalytic domains have very little DAG kinase activity when expressed as isolated subunits. 
PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 97
31100 270094 cd13275 PH_M-RIP Myosin phosphatase-RhoA Interacting Protein Pleckstrin homology (PH) domain. M-RIP is proposed to play a role in myosin phosphatase regulation by RhoA. M-RIP contains 2 PH domains followed by a Rho binding domain (Rho-BD), and a C-terminal myosin binding subunit (MBS) binding domain (MBS-BD). The amino terminus of M-RIP with its adjacent PH domains and polyproline motifs mediates binding to both actin and Galpha. M-RIP brings RhoA and MBS into close proximity where M-RIP can target RhoA to the myosin phosphatase complex to regulate the myosin phosphorylation state. M-RIP does this via its C-terminal coiled-coil domain which interacts with the MBS leucine zipper domain of myosin phosphatase, while its Rho-BD, directly binds RhoA in a nucleotide-independent manner. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 104
31101 270095 cd13276 PH_AtPH1 Arabidopsis thaliana Pleckstrin homolog (PH) 1 (AtPH1) PH domain. AtPH1 is expressed in all plant tissue and is proposed to be the plant homolog of human pleckstrin. Pleckstrin consists of two PH domains separated by a linker region, while AtPH has a single PH domain with a short N-terminal extension. AtPH1 binds PtdIns3P specifically and is thought to be an adaptor molecule since it has no obvious catalytic functions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31102 270096 cd13277 PH_Bem3 Bud emergence protein 3 (Bem3) Pleckstrin homology (PH) domain. Bud emergence in Saccharomyces cerevisiae involves cell cycle-regulated reorganizations of cortical cytoskeletal elements and requires the action of the Rho-type GTPase Cdc42. Bem3 contains a RhoGAP domain and a PH domain. Though Bem3 and Bem2 both contain a RhoGAP, only Bem3 is able to stimulate the hydrolysis of GTP on Cdc42. Bem3 is thought to be the GAP for Cdc42. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 111
31103 241432 cd13278 PH_Bud4 Bud4 Pleckstrin homology (PH) domain. Bud4 is an anillin-like yeast protein involved in the formation and the disassembly of the double ring structure formed by the septins during cytokinesis. Bud4 acts with Bud3 and in parallel with septin phosphorylation by the p21-activated kinase Cla4 and the septin-dependent kinase Gin4. Bud4 is regulated by the cyclin-dependent protein kinase Cdk1, the master regulator of cell cycle progression. Bud4 contains an anillin-like domain followed by a PH domain. In addition there are two consensus Cdk phosphorylation sites: one at the N-terminus and one right before the C-terminal PH domain. Anillins also have C-terminal PH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 139
31104 270097 cd13279 PH_Cla4_Ste20 Pleckstrin homology (PH) domain. Budding yeast contain two main p21-activated kinases (PAKs), Cla4 and Ste20. The yeast Ste20 protein kinase is involved in pheromone response, though the function of Ste20 mammalian homologs is unknown. Cla4 is involved in budding and cytokinesis and interacts with Cdc42, a GTPase required for polarized cell growth as is Pak. Cla4 and Ste20 kinases share a function in localizing cell growth with respect to the septin ring. They both contain a PH domain, a Cdc42/Rac interactive binding (CRIB) domain, and a C-terminal Protein Kinase catalytic (PKc) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 92
31105 270098 cd13280 PH_SIP3 Snf1p-interacting protein 3 Pleckstrin homology (PH) domain. SIP3 interacts with SNF1 protein kinase and activates transcription when anchored to DNA. It may function in the SNF1 pathway. SIP3 contain an N-terminal Bin/Amphiphysin/Rvs (BAR) domain followed by a PH domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31106 270099 cd13281 PH_PLEKHD1 Pleckstrin homology (PH) domain containing, family D (with coiled-coil domains) member 1 PH domain. Human PLEKHD1 (also called UPF0639, pleckstrin homology domain containing, family D (with M protein repeats) member 1) is a single transcript and contains a single PH domain. PLEKHD1 is conserved in human, chimpanzee, dog, cow, mouse, chicken, zebrafish, and Caenorhabditis elegans. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 139
31107 241436 cd13282 PH1_PLEKHH1_PLEKHH2 Pleckstrin homology (PH) domain containing, family H (with MyTH4 domain) members 1 and 2 (PLEKHH1) PH domain, repeat 1. PLEKHH1 and PLEKHH2 (also called PLEKHH1L) are thought to function in phospholipid binding and signal transduction. There are 3 Human PLEKHH genes: PLEKHH1, PLEKHH2, and PLEKHH3. There are many isoforms, the longest of which contain a FERM domain, a MyTH4 domain, two PH domains, a peroximal domain, a vacuolar domain, and a coiled coil stretch. The FERM domain has a cloverleaf tripart structure (FERM_N, FERM_M, FERM_C/N, alpha-, and C-lobe/A-lobe, B-lobe, C-lobe/F1, F2, F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 96
31108 270100 cd13283 PH_GPBP Goodpasture antigen binding protein Pleckstrin homology (PH) domain. The GPBP (also called Collagen type IV alpha-3-binding protein/hCERT; START domain-containing protein 11/StARD11; StAR-related lipid transfer protein 11) is a kinase that phosphorylates an N-terminal region of the alpha 3 chain of type IV collagen, which is commonly known as the goodpasture antigen. Its splice variant the ceramide transporter (CERT) mediates the cytosolic transport of ceramide. There have been additional splice variants identified, but all of them function as ceramide transport proteins. GPBP and CERT both contain an N-terminal PH domain, followed by a serine rich domain, and a C-terminal START domain. However, GPBP has an additional serine rich domain just upstream of its START domain. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
31109 270101 cd13284 PH_OSBP_ORP4 Human Oxysterol binding protein and OSBP-related protein 4 Pleckstrin homology (PH) domain. Human OSBP is proposed to function in sterol-dependent regulation of ERK dephosphorylation and sphingomyelin synthesis as well as modulation of insulin signaling and hepatic lipogenesis. It contains a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. OSBPs and Osh1p PH domains specifically localize to the Golgi apparatus in a PtdIns4P-dependent manner. ORP4 is proposed to function in Vimentin-dependent sterol transport and/or signaling. Human ORP4 has 2 forms, a long (ORP4L) and a short (ORP4S). ORP4L contains a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP4S is truncated and contains only an OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 99
31110 270102 cd13285 PH_ORP1 Human Oxysterol binding protein related protein 1 Pleckstrin homology (PH) domain. Human ORP1 has 2 forms, a long (ORP1L) and a short (ORP1S). ORP1L contains 3 N-terminal ankyrin repeats, followed by a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP1S is truncated and contains only an OSBP-related domain. ORP1L is proposed to function in motility and distribution of late endosomes, autophagy, and macrophage lipid metabolism. ORP1S is proposed to function in vesicle transport from Golgi. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31111 270103 cd13286 PH_OPR5_ORP8 Human Oxysterol binding protein related proteins 5 and 8 Pleckstrin homology (PH) domain. Human ORP5 is proposed to function in efficient nonvesicular transfer of low-density lipoproteins-derived cholesterol (LDL-C) from late endosomes/lysosomes to the endoplasmic reticulum (ER). Human ORP8 is proposed to modulate lipid homeostasis and sterol regulatory element binding proteins (SREBP) activity. Both ORP5 and ORP8 contain a N-terminal PH domain, a C-terminal OSBP-related domain, followed by a transmembrane domain that localizes ORP5 to the ER. Unlike all the other human OSBP/ORPs they lack a FFAT motif (two phenylalanines in an acidic tract). Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 130
31112 270104 cd13287 PH_ORP3_ORP6_ORP7 Human Oxysterol binding protein related proteins 3, 6, and 7 Pleckstrin homology (PH) domain. Human ORP3 is proposed to function in regulating the cell-matrix and cell-cell adhesion. A proposed specific function for Human ORP6 was not found at present. Human ORP7 is proposed to function in negatively regulating the Golgi soluble NSF attachment protein receptor (SNARE) of 28kDa (GS28) protein stability via sequestration of Golgi-associated ATPase enhancer of 16 kDa (GATE-16). ORP3 has 2 isoforms: the longer ORP3(1) and the shorter ORP3(2). ORP3(1), ORP6, and ORP7 all contain a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. The shorter ORP3(2) is missing the C-terminal portion of its OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. 
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
31113 270105 cd13288 PH_Ses Sesquipedalian family Pleckstrin homology (PH) domain. The sesquipedalian family has 2 mammalian members: Ses1 and Ses2, which are also called 7 kDa inositol polyphosphate phosphatase-interacting protein 1 and 2. They play a role in endocytic trafficking and are required for receptor recycling from endosomes, both to the trans-Golgi network and the plasma membrane. Members of this family form homodimers and heterodimers. Sesquipedalian interacts with inositol polyphosphate 5-phosphatase OCRL-1 (INPP5F) also known as Lowe oculocerebrorenal syndrome protein, a phosphatase enzyme that is involved in actin polymerization and is found in the trans-Golgi network and INPP5B. Sesquipedalian contains a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 120
31114 241443 cd13289 PH_Osh3p_yeast Yeast oxysterol binding protein homolog 3 Pleckstrin homology (PH) domain. Yeast Osh3p is proposed to function in sterol transport and regulation of nuclear fusion during mating and of pseudohyphal growth as well as sphingolipid metabolism. Osh3 contains a N-GOLD (Golgi dynamics) domain, a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. GOLD domains are thought to mediate protein-protein interactions, but their role in ORPs is unknown. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 90
31115 241444 cd13290 PH_ORP9 Human Oxysterol binding protein related protein 9 Pleckstrin homology (PH) domain. Human ORP9 is proposed to function in regulation of Akt phosphorylation. ORP9 has 2 forms, a long (ORP9L) and a short (ORP9S). ORP9L contains an N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP9S is truncated and contains a FFAT motif and an OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
31116 270106 cd13291 PH_ORP10_ORP11 Human Oxysterol binding protein (OSBP) related proteins 10 and 11 (ORP10 and ORP11) Pleckstrin homology (PH) domain. Human ORP10 is involved in intracellular transport or organelle positioning and is proposed to function as a regulator of cellular lipid metabolism. Human ORP11 localizes at the Golgi-late endosome interface and is thought to form a dimer with ORP9 functioning as an intracellular lipid sensor or transporter. Both ORP10 and ORP11 contain a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
31117 241446 cd13292 PH_Osh1p_Osh2p_yeast Yeast oxysterol binding protein homologs 1 and 2 Pleckstrin homology (PH) domain. Yeast Osh1p is proposed to function in postsynthetic sterol regulation, piecemeal microautophagy of the nucleus, and cell polarity establishment. Yeast Osh2p is proposed to function in sterol metabolism and cell polarity establishment. Both Osh1p and Osh2p contain 3 N-terminal ankyrin repeats, a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. OSBP and Osh1p PH domains specifically localize to the Golgi apparatus in a PtdIns4P-dependent manner. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31118 241447 cd13293 PH_CpORP2-like Cryptosporidium-like Oxysterol binding protein related protein 2 Pleckstrin homology (PH) domain. There are 2 types of ORPs found in Cryptosporidium: CpORP1 and CpORP2. Cryptosporidium differs from other apicomplexans like Plasmodium, Toxoplasma, and Eimeria which possess only a single long-type ORP consisting of an N-terminal PH domain followed by a C-terminal ligand binding (LB) domain. CpORP2 is like this, but CpORP1 differs and has a truncated N-terminus resulting in only having a LB domain present. The exact functions of these proteins are largely unknown though CpORP1 is thought to be involved in lipid transport across the parasitophorous vacuole membrane. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 88
31119 241448 cd13294 PH_ORP_plant Plant Oxysterol binding protein related protein Pleckstrin homology (PH) domain. Plant ORPs contain a N-terminal PH domain and a C-terminal OSBP-related domain. Not much is known about their specific function in plants to date. Members here include: Arabidopsis, spruce, and petunia. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 100
31120 270107 cd13295 PH_EFA6 Exchange Factor for ARF6 Pleckstrin homology (PH) domain. EFA6 (also called PSD/pleckstrin and Sec7 domain containing) is a guanine nucleotide exchange factor for ADP ribosylation factor 6 (ARF6), which is involved in membrane recycling. EFA6 has four structurally related polypeptides: EFA6A, EFA6B, EFA6C and EFA6D. It consists of a N-terminal proline rich region (PR), a SEC7 domain, a PH domain, a PR, a coiled-coil region, and a C-terminal PR. The EFA6 PH domain regulates its association with the plasma membrane. EFA6 activates Arf6 through its Sec7 catalytic domain and modulates this activity through its C-terminal domain, which rearranges the actin cytoskeleton in fibroblastic cell lines. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 126
31121 270108 cd13296 PH2_MyoX Myosin X Pleckstrin homology (PH) domain, repeat 2. MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of a N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The first PH domain in the MyoX tail is a split-PH domain, interrupted by the second PH domain such that PH 1a and PH 1b flank PH 2. The third PH domain (PH 3) follows the PH 1b domain. This cd contains the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31122 270109 cd13297 PH3_MyoX-like Myosin X-like Pleckstrin homology (PH) domain, repeat 3. MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of a N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The first PH domain in the MyoX tail is a split-PH domain, interrupted by the second PH domain such that PH 1a and PH 1b flank PH 2. The third PH domain (PH 3) follows the PH 1b domain. This cd contains the third MyoX PH repeat. PLEKHH3/Pleckstrin homology (PH) domain containing, family H (with MyTH4 domain) member 3 is also part of this CD and like MyoX contains a FERM domain, a MyTH4 domain, and a single PH domain. Not much is known about the function of PLEKHH3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 126
31123 270110 cd13298 PH1_PH_fungal Fungal proteins Pleckstrin homology (PH) domain, repeat 1. The functions of these fungal proteins are unknown, but they all contain 2 PH domains. This cd represents the first PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31124 270111 cd13299 PH2_PH_fungal Fungal proteins Pleckstrin homology (PH) domain, repeat 2. The functions of these fungal proteins are unknown, but they all contain 2 PH domains. This cd represents the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
31125 270112 cd13300 PH1_TECPR1 Tectonin beta-propeller repeat-containing protein 1 Pleckstrin homology (PH) domain, repeat 1. TECPR1 is a tethering factor involved in autophagy. It promotes the autophagosome fusion with lysosomes by associating with both the ATG5-ATG12 conjugate and phosphatidylinositol-3-phosphate (PtdIns3P) present at the surface of autophagosomes. TECPR1 is also involved in selective autophagy against bacterial pathogens, by being required for phagophore/preautophagosomal structure biogenesis and maturation. It contains 2 DysFN (Dysferlin domains of unknown function, N-terminal), 2 Hyd_WA domains that probably form a beta-propeller, a PH-like domain, a TECPR domain, and a DysFC (C-terminal). The PH domain mediates the binding to phosphatidylinositol-3-phosphate (PtdIns3P). Binding to the ATG5-ATG12 conjugate exposes the PH domain, allowing the association with PtdIns3P. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 122
31126 270113 cd13301 PH1_Pleckstrin_2 Pleckstrin 2 Pleckstrin homology (PH) domain, repeat 1. Pleckstrin is a protein found in platelets. This name is derived from platelet and leukocyte C kinase substrate and the KSTR string of amino acids. Pleckstrin 2 contains two PH domains and a DEP (dishevelled, egl-10, and pleckstrin) domain. Unlike pleckstrin 1, pleckstrin 2 does not contain obvious sites of PKC phosphorylation. Pleckstrin 2 plays a role in actin rearrangement, large lamellipodia and peripheral ruffle formation, and may help orchestrate cytoskeletal arrangement. The PH domains of pleckstrin 2 are thought to contribute to lamellipodia formation. This cd contains the first PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31127 270114 cd13302 PH2_Pleckstrin_2 Pleckstrin 2 Pleckstrin homology (PH) domain, repeat 2. Pleckstrin is a protein found in platelets. This name is derived from platelet and leukocyte C kinase substrate and the KSTR string of amino acids. Pleckstrin 2 contains two PH domains and a DEP (dishevelled, egl-10, and pleckstrin) domain. Unlike pleckstrin 1, pleckstrin 2 does not contain obvious sites of PKC phosphorylation. Pleckstrin 2 plays a role in actin rearrangement, large lamellipodia and peripheral ruffle formation, and may help orchestrate cytoskeletal arrangement. The PH domains of pleckstrin 2 are thought to contribute to lamellipodia formation. This cd contains the second PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
31128 241457 cd13303 PH1-like_Rtt106 Pleckstrin homology-like domain, repeat 1, of Histone chaperone RTT106 (regulator of Ty1 transposition protein 106). Rtt106 is a histone chaperone. The binding of Rtt106 to H3K56-acetylated (H3-H4)2 tetramers contributes to nucleosome assembly in terms of DNA replication, gene silencing and maintenance of genomic stability. Rtt106 contains an N-terminal homodimerization domain and two C-terminal pleckstrin-homology (PH) domains (PH1 and PH2). The N-terminal domain homodimerizes and interacts with H3-H4 independently of acetylation while the double PH domain binds the K56-containing region of H3. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). This model contains the first PH-like domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 139
31129 241458 cd13304 PH2-like_Rtt106 Pleckstrin homology-like domain, repeat 2, of Histone chaperone RTT106 (regulator of Ty1 transposition protein 106). Rtt106 is a histone chaperone. Rtt106 contains an N-terminal homodimerization domain and two C-terminal pleckstrin-homology (PH) domains (PH1 and PH2). The binding of Rtt106 to H3K56-acetylated (H3-H4)2 tetramers contributes to nucleosome assembly in terms of DNA replication, gene silencing and maintenance of genomic stability. The N-terminal domain homodimerizes and interacts with H3-H4 independently of acetylation while the double PH domain binds the K56-containing region of H3. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). This model contains the second PH-like domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 89
31130 270115 cd13305 PH_SHARPIN SHANK-associated RH domain interacting protein Pleckstrin homology (PH) domain. SHARPIN has a variety of roles including: a role as a scaffolding partner of the anchoring/scaffold protein Shank1, a role in carcinogenesis through the interaction with FYN binding protein (FYB), which binds to oncogene FYN, a role in apoptosis by interacting with AIFM1, a mitochondrial regulator of cell death, CAPN13, and NSD1, as well as a role in immune disease and inflammation. SHARPIN has at its N-terminus a PH domain, followed by a E3 ubiquitin ligase domain, and a C-terminal RanBP-type and C3HC4-type zinc finger containing 1 domain (RBCK1, also known as HOIP), which functions as a protein kinase C (PKC) binding protein as well as a transcriptional activator. SHARPIN's PH domain functions as a dimerization module rather than a ligand recognition domain, extending the functional applications of this superfold. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
31131 270116 cd13306 PH1_AFAP Actin filament associated protein family Pleckstrin homology (PH) domain, repeat 1. There are 3 members of the AFAP family of adaptor proteins: AFAP1, AFAP1L1, and AFAP1L2/XB130. AFAP1 is a cSrc binding partner and actin cross-linking protein. AFAP1L1 is thought to play a similar role to AFAP1 in terms of being an actin cross-linking protein, but it preferentially binds to cortactin and not cSrc, thereby playing a role in invadosome formation. AFAP1L2 is a cSrc binding protein, but does not bind to actin filaments. AFAP1L2 acts as an intermediary between the RET/PTC kinase and PI-3kinase pathway in the thyroid. The AFAPs share a similar structure of a SH3 binding motif, 3 SH2 binding motifs, 2 PH domains, a coiled-coil region corresponding to the AFAP1 leucine zipper, and an actin binding domain. The amino terminal PH1 domain of AFAP1 has been known to function in intra-molecular regulation of AFAP1. In addition, the PH1 domain is a binding partner for PKCa and phospholipids. This cd is the first PH domain of AFAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
31132 270117 cd13307 PH2_AFAP Actin filament associated protein family Pleckstrin homology (PH) domain, repeat 2. There are 3 members of the AFAP family of adaptor proteins: AFAP1, AFAP1L1, and AFAP1L2/XB130. AFAP1 is a cSrc binding partner and actin cross-linking protein. AFAP1L1 is thought to play a similar role to AFAP1 in terms of being an actin cross-linking protein, but it preferentially binds to cortactin and not cSrc, thereby playing a role in invadosome formation. AFAP1L2 is a cSrc binding protein, but does not bind to actin filaments. AFAP1L2 acts as an intermediary between the RET/PTC kinase and PI-3kinase pathway in the thyroid. The AFAPs share a similar structure of a SH3 binding motif, 3 SH2 binding motifs, 2 PH domains, a coiled-coil region corresponding to the AFAP1 leucine zipper, and an actin binding domain. This cd is the second PH domain of AFAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 101
31133 270118 cd13308 PH_3BP2 SH3 domain-binding protein 2 Pleckstrin homology (PH) domain. SH3BP2 (the gene that encodes the adaptor protein 3BP2), HD, ITU, IT10C3, and ADD1 are located near the Huntington's Disease Gene on Human Chromosome 4p16.3. SH3BP2 lies in a region that is often missing in individuals with Wolf-Hirschhorn syndrome (WHS). Gain of function mutations in SH3BP2 cause enhanced B-cell antigen receptor (BCR)-mediated activation of nuclear factor of activated T cells (NFAT), resulting in a rare, genetic disorder called cherubism. This results in an increase in the signaling complex formation with Syk, phospholipase C-gamma2 (PLC-gamma2), and Vav1. It was recently discovered that Tankyrase regulates 3BP2 stability through ADP-ribosylation and ubiquitylation by the E3-ubiquitin ligase. Cherubism mutations uncouple 3BP2 from Tankyrase-mediated protein destruction, which results in its stabilization and subsequent hyperactivation of the Src, Syk, and Vav signaling pathways. SH3BP2 is also a potential negative regulator of the abl oncogene. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 113
31134 270119 cd13309 PH_SKIP SifA and kinesin-interacting protein Pleckstrin homology (PH) domain. SKIP (also called PLEKHM2/Pleckstrin homology domain-containing family M member 2) is a soluble cytosolic protein that contains a RUN domain and a PH domain separated by an unstructured linker region. SKIP is a target of the Salmonella effector protein SifA and the SifA-SKIP complex regulates kinesin-1 on the bacterial vacuole. The PH domain of SKIP binds to the N-terminal region of SifA while the N-terminus of SKIP is proposed to bind the TPR domain of the kinesin light chain. The opposite side of the SKIP PH domain is proposed to bind phosphoinositides. SifA, SKIP, SseJ, and RhoA family GTPases are also thought to promote host membrane tubulation. Recently, it was shown that the lysosomal GTPase Arl8 binds to the kinesin-1 linker SKIP and that both are required for the normal intracellular distribution of lysosomes. Interestingly, two kinesin light chain binding motifs (WD) in SKIP have now been identified to match a consensus sequence for a kinesin light chain binding site found in several proteins including calsyntenin-1/alcadein, caytaxin, and vaccinia virus A36. SKIP has also been shown to interact with Rab1A. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31135 270120 cd13310 PH_RalGPS1_2 Ral GEF with PH domain and SH3 binding motif 1 and 2 Pleckstrin homology (PH) domain. RalGPS1 (also called Ral GEF with PH domain and SH3 binding motif 1;RALGEF2/ Ral guanine nucleotide exchange factor 2; RalA exchange factor RalGPS1; Ral guanine nucleotide exchange factor RalGPS1A2; ras-specific guanine nucleotide-releasing factor RalGPS1) and RalGPS2 (also called Ral GEF with PH domain and SH3 binding motif 2; Ral-A exchange factor RalGPS2; ras-specific guanine nucleotide-releasing factor RalGPS22). They activate small GTPase Ral proteins such as RalA and RalB by stimulating the exchange of Ral bound GDP to GTP, thereby regulating various downstream cellular processes. Structurally they contain an N-terminal Cdc25-like catalytic domain, followed by a PXXP motif and a C-terminal PH domain. The Cdc25-like catalytic domain interacts with Ral and its PH domain ensures the correct membrane localization. Its PXXP motif is thought to interact with the SH3 domain of Grb2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 116
31136 270121 cd13311 PH_Slm1 Slm1 Pleckstrin homology (PH) domain. Slm1 is a component of the target of rapamycin complex 2 (TORC2) signaling pathway. It plays a role in the regulation of actin organization and is a target of sphingolipid signaling during the heat shock response. Slm1 contains a single PH domain that binds PtdIns(4,5)P2, PtdIns(4)P, and dihydrosphingosine 1-phosphate (DHS-1P). Slm1 possesses two binding sites for anionic lipids. The non-canonical binding site of the PH domain of Slm1 is used for ligand binding, and it is proposed that beta-spectrin, Tiam1 and ArhGAP9 also have this type of phosphoinositide binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31137 270122 cd13312 PH_USP37_like Pleckstrin homology-like domain of Ubiquitin carboxyl-terminal hydrolase 37. Members here include USP37, USP29, and USP26. All of these contain a single PH-like domain. USP37 (also called ubiquitin carboxyl-terminal hydrolase 37, ubiquitin thiolesterase 37, deubiquitinating enzyme 37, and tmp_locus_50) is a deubiquitinase that antagonizes the anaphase-promoting complex (APC/C) during G1/S transition by mediating deubiquitination of cyclin-A (CCNA1 and CCNA2), thereby promoting S phase entry. USP37 mediates deubiquitination of 'Lys-11'-linked polyubiquitin chains, a specific ubiquitin-linkage type mediated by the APC/C complex and 'Lys-48'-linked polyubiquitin chains in vitro. Phosphorylation at Ser-628 during G1/S phase maximizes the deubiquitinase activity, preventing degradation of cyclin-A (CCNA1 and CCNA2). USP29 (also called ubiquitin carboxyl-terminal hydrolase 29, ubiquitin thiolesterase 29, deubiquitinating enzyme 29, and HOM-TES-84/86) plays a role in apoptosis and oxidative stress. In response to oxidative stress, JTV1 dissociates from the ARS complex, translocates to the nucleus, associates with far upstream element binding protein (FBP) and co-activates the transcription of USP29 which binds to, cleaves poly-ubiquitin chains from, and stabilizes p53 leading to apoptosis. The X-linked deubiquitination enzyme USP26 (also called ubiquitin carboxyl-terminal hydrolase 26, ubiquitin thiolesterase 26, and deubiquitinating enzyme 26) is a regulator of androgen receptor (AR) signaling. It binds to AR using three nuclear receptor interaction motifs (LXXLL, FXXLF and FXXFF) and modulates AR ubiquitination. Polymorphism of Usp26 correlates with idiopathic male infertility. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner.
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31138 270123 cd13313 PH_NF1 Neurofibromin-1 Pleckstrin homology-like domain. Neurofibromin (NF1) contains a N-terminal RasGAP domain, followed by a Sec14-like domain, and a PH domain. Surprisingly, in neurofibromin the PH domain alone is not sufficient for phospholipid binding and instead requires the presence of the Sec-14 domain. The Sec-14 domain has been shown to bind 1-(3-sn-phosphatidyl)-sn-glycerol (PtdGro), (3-sn-phosphatidyl)-ethanolamine (PtdEtn) and -choline (PtdCho) and to a minor extent to (3-sn-phosphatidyl)-l-serine (PtdSer) and 1-(3-sn-phosphatidyl)-d-myo-inositol (PtdIns). Neurofibromatosis type 1 (also known as von Recklinghausen neurofibromatosis or NF1) is a genetic disorder caused by alterations in the tumor suppressor gene NF1. Hallmark symptoms include neural crest derived tumors, pigmentation anomalies, bone deformations, and learning disabilities. Mutations of the tumour suppressor gene NF1 are responsible for disease pathogenesis, with 90% of the alterations being nonsense codons. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 110
31139 270124 cd13314 PH_Rpn13 Pleckstrin homology-like domain of Regulatory Particle Non-ATPase 13. Targeted protein degradation is performed to a great extent by the ubiquitin-proteasome pathway, in which substrate proteins are marked by covalently attached ubiquitin chains that mediate recognition by the proteasome. Rpn13 (also called ADRM1/ARM1) is one of the two major ubiquitin receptors of the proteasome, the other being S5a/Rpn10 which is not essential for ubiquitin-mediated protein degradation in budding yeast. S5a has two ubiquitin interacting motifs (UIMs) that bind simultaneously to ubiquitin moieties to increase affinity while Rpn13 binds ubiquitin with a single, high affinity surface within its N-terminal PH domain. Rpn13 also binds and activates deubiquitinating enzyme Uch37, one of the proteasome's three deubiquitinating enzymes. Recently it was discovered that the ubiquitin-binding domain (BD) and Uch37 BD of human (h) Rpn13 pack against each other when it is not incorporated into the proteasome, reducing hRpn13's affinity for ubiquitin. However, when hRpn13 binds to hRpn2/S1 this abrogates its interdomain interactions, thus activating hRpn13 for ubiquitin binding. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31140 270125 cd13315 PH_Sec3 Sec 3 Pleckstrin homology-like domain. The Sec3 subunit of the exocyst, a complex involved in polarized exocytosis, binds phospholipids and GTPase Cdc42 and therefore functions as a coincidence detector at the plasma membrane. Unlike most PH domains, Sec3 contains an additional alpha-helix at its N-terminus and two beta-strands at its C-terminus that mediate dimerization through domain swapping. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 141
31141 270126 cd13316 PH_Boi Boi family Pleckstrin homology domain. Yeast Boi proteins Boi1 and Boi2 are functionally redundant and important for cell growth with Boi mutants displaying defects in bud formation and in the maintenance of cell polarity. They appear to be linked to Rho-type GTPase, Cdc42 and Rho3. Boi1 and Boi2 display two-hybrid interactions with the GTP-bound ("active") form of Cdc42, while Rho3 can suppress the lethality caused by deletion of Boi1 and Boi2. These findings suggest that Boi1 and Boi2 are targets of Cdc42 that promote cell growth in a manner that is regulated by Rho3. Boi proteins contain an N-terminal SH3 domain, followed by a SAM (sterile alpha motif) domain, a proline-rich region, which mediates binding to the second SH3 domain of Bem1, and a C-terminal PH domain. The PH domain is essential for its function in cell growth and is important for localization to the bud, while the SH3 domain is needed for localization to the neck. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 97
31142 270127 cd13317 PH_PLEKHO1_PLEKHO2 Pleckstrin homology domain-containing family O Pleckstrin homology domain. The PLEKHO family members are PLEKHO1 (also called CKIP-1/Casein kinase 2-interacting protein 1/CK2-interacting protein 1) and PLEKHO2 (PLEKHQ1/PH domain-containing family Q member 1). They both contain a single PH domain. PLEKHO1 acts as a scaffold protein that functions in plasma membrane recruitment, transcriptional activity modulation, and posttranscriptional modification regulation. As an adaptor protein it is involved in signaling pathways, apoptosis, differentiation, cytoskeleton, and bone formation. Not much is known about PLEKHO2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
31143 270128 cd13318 PH_IQSEC IQ motif and SEC7 domain-containing protein family Pleckstrin homology domain. The IQSEC (also called BRAG/Brefeldin A-resistant Arf-guanine nucleotide exchange factor) family are a subset of Arf GEFs that have been shown to activate Arf6, which acts in the endocytic pathway to control the trafficking of a subset of cargo proteins including integrins and have key roles in the function and organization of distinct excitatory and inhibitory synapses in the retina. The family consists of 3 members: IQSEC1 (also called BRAG2/GEP100), IQSEC2 (also called BRAG1), and IQSEC3 (also called SynArfGEF, BRAG3, or KIAA1110). IQSEC1 interacts with clathrin and modulates cell adhesion by regulating integrin surface expression and in addition to Arf6, it also activates the class II Arfs, Arf4 and Arf5. Mutations in IQSEC2 cause non-syndromic X-linked intellectual disability as well as reduced activation of Arf substrates (Arf1, Arf6). IQSEC3 regulates Arf6 at inhibitory synapses and associates with the dystrophin-associated glycoprotein complex and S-SCAM. These members contain an IQ domain that may bind calmodulin, a PH domain that is thought to mediate membrane localization by binding of phosphoinositides, and a SEC7 domain that can promote GEF activity on ARF. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 128
31144 270129 cd13319 PH_RARhoGAP RA and RhoGAP domain-containing protein Pleckstrin homology PH domain. RARhoGAP (also called Rho GTPase-activating protein 20 and ARHGAP20) is thought to function in rearrangements of the cytoskeleton and cell signaling events that occur during spermatogenesis. RARhoGAP was also shown to be activated by Rap1 and to induce inactivation of Rho, resulting in neurite outgrowth. Recent findings show that ARHGAP20, although it is located in the middle of the MDR on 11q22-23, is expressed at higher levels in chronic lymphocytic leukemia patients with 11q22-23 and/or 13q14 deletions and its expression pattern suggests a functional link between cases with 11q22-23 and 13q14 deletions. The mechanism needs to be further studied. RARhoGAP contains a PH domain, a Ras-associating domain, a Rho-GAP domain, and ANXL repeats. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 97
31145 270130 cd13320 PH_OCRL-like oculocerebrorenal syndrome of Lowe family Pleckstrin homology-like domain. The OCRL family has two members: OCRL1 (also called INPP5F, LOCR, NPHL2, or phosphatidylinositol polyphosphate 5-phosphatase) and OCRL2 (also called IPNNB5, inositol polyphosphate-5-phosphatase, phosphoinositide 5-phosphatase, 5PTase, or type II inositol-1,4,5-trisphosphate 5-phosphatase). The OCRL proteins hydrolyze phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtdIns(1,4,5)P3), and thereby modulate cellular signaling events. They interact with APPL1, FAM109A and FAM109B and several Rab GTPases, which might both target them to specific membranes and stimulate the phosphatase activity. All OCRL family members contain a PH domain and a Rho-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31146 241475 cd13321 PH_PLEKHM1 Pleckstrin homology domain-containing family M member 1 Pleckstrin homology (PH) domain. PLEKHM1 is thought to function in vesicular transport in osteoclasts. Mutations in the PLEKHM1 gene are associated with osteopetrosis OPTB6. PLEKHM1 contains an N-terminal RUN domain (RPIP8/RaP2 interacting protein 8, UNC-14 and NESCA/new molecule containing SH3 at the carboxyl-terminus), followed by a PH domain, and either a C1 domain or a DUF4206 domain at its C-terminus. The RUN domain is thought to be involved in Rab-mediated membrane trafficking, possibly as a Rab-binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 132
31147 270131 cd13322 PH_PHLPP-like PH domain leucine-rich repeat protein phosphatase family Pleckstrin homology-like domain. The PHLPP family has members PHLPP1 (also called hSCOP/Suprachiasmatic nucleus circadian oscillatory protein; PLEKHE1/Pleckstrin homology domain-containing family E member 1) and PHLPP2 (PHLPP-like/PHLPPL). The PHLPP family of novel Ser/Thr phosphatases serve as important regulators of cell survival and apoptosis. PHLPP isozymes catalyze the dephosphorylation of a conserved regulatory motif, the hydrophobic motif, on the AGC kinases Akt, PKC, and S6 kinase, as well as an inhibitory site on the kinase Mst1, to inhibit cellular proliferation and induce apoptosis and negatively regulates ERK1/2 activation. Reductions in their expression have been detected in several cancers and linked to cancer progression. PHLPP1 and PHLPP2 both contain an N-terminal PH domain, followed by 21 LRR (leucine-rich) repeats, and a C-terminal PP2C-like domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 95
31148 270132 cd13323 PH_PLEKHN1 Pleckstrin homology domain containing family N member 1 Pleckstrin homology-like domain. Not much is known about PLEKHN1. It is found in a wide range of animals including humans, green anole, frog, and zebrafish. It contains a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 121
31149 270133 cd13324 PH_Gab-like Grb2-associated binding protein family Pleckstrin homology (PH) domain. Gab proteins are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. There are 3 families: Gab1, Gab2, and Gab3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 112
31150 270134 cd13325 PH_unc89 unc89 pleckstrin homology (PH) domain. unc89 is a myofibrillar protein. unc89-B the largest isoform is composed of 53 immunoglobulin (Ig) domains, 2 Fn3 domains, a triplet of SH3, DH and PH domains at its N-terminus, and 2 protein kinase domains (PK1 and PK2) at its C-terminus. unc-89 mutants display disorganization of muscle A-bands, and usually lack M-lines. The COOH-terminal region of obscurin, the human homolog of unc89, interacts via two specific Ig-like domains with the NH(2)-terminal Z-disk region of titin, a protein that connects the Z line to the M line in the sarcomere and contributes to the contraction of striated muscle. obscurin is also thought to be involved in Ca2+/calmodulin via its IQ domains, as well as G protein-coupled signal transduction in the sarcomere via its RhoGEF/DH domain. The DH-PH region of OBSCN and unc89, the C. elegans homolog, has exchange activity for RhoA and Rho-1 respectively, but not for the small GTPases homologous to Cdc42 or Rac. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
31151 270135 cd13326 PH_CNK_insect-like Connector enhancer of KSR (Kinase suppressor of ras) (CNK) pleckstrin homology (PH) domain. CNK family members function as protein scaffolds, regulating the activity and the subcellular localization of RAS activated RAF. There is a single CNK protein present in Drosophila and Caenorhabditis elegans in contrast to mammals which have 3 CNK proteins (CNK1, CNK2, and CNK3). All of the CNK members contain a sterile a motif (SAM), a conserved region in CNK (CRIC) domain, and a PSD-95/DLG-1/ZO-1 (PDZ) domain, and a PH domain. A CNK2 splice variant CNK2A also has a PDZ domain-binding motif at its C terminus and Drosophila CNK (D-CNK) also has a domain known as the Raf-interacting region (RIR) that mediates binding of the Drosophila Raf kinase. This cd contains CNKs from insects, spiders, mollusks, and nematodes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 91
31152 270136 cd13327 PH_PLEKHM3_2 Pleckstrin homology domain-containing family M member 3 Pleckstrin homology domain 2. PLEKHM3 (also called differentiation associated protein/DAPR) exists as three alternatively spliced isoforms that participate in metal ion binding. It contains 2 PH domains and 1 phorbol-ester/DAG-type zinc finger domain. PLEKHM3 is found in humans, canines, bovine, mouse, rat, chicken and zebrafish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 88
31153 275410 cd13328 PH1_FDG_family FYVE, RhoGEF and PH domain containing/faciogenital dysplasia family proteins, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activates the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 92
31154 275411 cd13329 PH_RhoGEF Rho guanine nucleotide exchange factor Pleckstrin homology domain. RhoGEFs belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. The members here all contain Dbl homology (DH)-PH domains. In addition some members contain N-terminal C1 (Protein kinase C conserved region 1) domains, PDZ (also called DHR/Dlg homologous regions) domains, ANK (ankyrin) domains, and RGS (Regulator of G-protein signalling) domains or C-terminal ATP-synthase B subunit. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. RhoGEF2/Rho guanine nucleotide exchange factor 2, p114RhoGEF/p114 Rho guanine nucleotide exchange factor, p115RhoGEF, p190RhoGEF, PRG/PDZ Rho guanine nucleotide exchange factor, RhoGEF 11, RhoGEF 12, RhoGEF 18, AKAP13/A-kinase anchoring protein 13, and LARG/Leukemia-associated Rho guanine nucleotide exchange factor are included in this CD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
31155 241484 cd13330 PH_CARM1 Coactivator-Associated Methyltransferase 1 Pleckstrin homology (PH) domain. CARM1 (also known as protein arginine methyltransferase 4/PRMT4) is a protein arginine methyltransferase recruited by several transcription factors. It methylates a variety of proteins and plays a role in gene expression. The N-terminal domain of CARM1 contains a N-terminal PH domain, a catalytic core module composed of two parts (a Rossmann fold topology (RF) and a beta-barrel), and a C-terminal domain. The N-terminal and the C-terminal end of CARM1 catalytic module contain molecular switches that may explain how CARM1 regulates its biological activities by protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 107
31156 270139 cd13331 PH_Avo1 Avo1 Pleckstrin homology (PH) domain. Target of rapamycin (TOR) is a highly conserved serine/threonine protein kinase and a central controller of the growth, metabolism and ageing of eukaryotic cells. TOR assembles into two protein complexes termed TOR complex 1 (TORC1) and TOR complex 2 (TORC2) which function as central nodes in a complex network of signal transduction pathways that are involved in normal physiological as well as pathogenic events. TORC1 mediates the rapamycin-sensitive signalling branch, which positively regulates anabolic processes and negatively regulates catabolic processes. TORC2 signalling is rapamycin-insensitive and is involved in the spatial aspects of cell growth by controlling the actin cytoskeleton and cell polarity. In Saccharomyces cerevisiae, TORC2 is involved in the regulation of ceramide metabolism. In S. cerevisiae, TORC1 consists of the proteins Kog1, Lst8, Tco89 and either Tor1 or Tor2, while TORC2 consists of the proteins Avo1, Avo2, Avo3, Bit61, Lst8 and Tor2. The C-terminal domain of the Saccharomyces cerevisiae TORC2 component Avo1 is required for plasma-membrane localization of TORC2 and is essential for yeast viability. The C-termini of Avo1 and Sin1, its human ortholog, both have the pleckstrin homology (PH) domain fold. Comparison with known PH-domain structures suggests a putative binding site for phosphoinositides. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31157 275412 cd13332 FERM_C_JAK1 FERM domain C-lobe of Janus kinase 1. JAK1 is a tyrosine kinase protein essential for signaling by type I and type II cytokines. It interacts with the gamma chain of type I cytokine receptors to elicit signals from the IL-2 receptor family, the IL-4 receptor family, the gp130 receptor family, ciliary neurotrophic factor receptor (CNTF-R), neurotrophin-1 receptor (NNT-1R) and Leptin-R. It is also involved in transducing a signal by type I (IFN-alpha/beta) and type II (IFN-gamma) interferons, and members of the IL-10 family via type II cytokine receptors. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. 
This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 144
31158 270141 cd13333 FERM_C_JAK2 FERM domain C-lobe of Janus kinase (JAK) 2. JAK2 has been implicated in signaling by members of the type II cytokine receptor family, the GM-CSF receptor family, the gp130 receptor family, and the single chain receptors. JAK2 orthologs have been identified in all mammals. Mutations in JAK2 have been implicated in polycythemia vera, essential thrombocythemia, myelofibrosis as well as other myeloproliferative disorders. JAK2 gene fusions with the PCM1 and TEL(ETV6) (TEL-JAK2) genes have been found in leukemia patients. Researchers are evaluating JAK2 inhibitors for the treatment of patients with prostate cancer. JAK2 has been shown to interact with a variety of proteins including growth hormone receptor, STAT5A, STAT5B, interleukin 5 receptor alpha subunit, interleukin 12 receptor, SOCS3, PTPN6, PTPN11, Grb2, VAV1, and YES1. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. 
These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 113
31159 275413 cd13334 FERM_C_JAK3 FERM domain C-lobe of Janus kinase (JAK) 3. JAK3 functions in signal transduction and interacts with members of the STAT (signal transduction and activators of transcription) family. It is required for signaling of the type I receptors that use the common gamma chain: IL-2, IL-4, IL-7, IL-9, IL-15 and IL-21. Cytokine binding induces the association of separate cytokine receptor subunits and the activation of the receptor-associated JAKs. In the absence of cytokine, JAKs lack protein tyrosine kinase activity. Once activated, the JAKs create docking sites for the STAT transcription factors by phosphorylation of specific tyrosine residues on the cytokine receptor subunits. Unlike the ubiquitous expression of JAK1, JAK2 and Tyk2, JAK3 is predominantly expressed in hematopoietic cells, such as NK cells, T cells and B cells. Mutations of JAK3 result in severe combined immunodeficiency (SCID). In addition to its well-known roles in T cells and NK cells, JAK3 has recently been found to inhibit IL-8-mediated chemotaxis. JAK3 interacts with CD247, TIAF1, and IL2RG. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). 
The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 110
31160 275414 cd13335 FERM_C_TYK2 FERM domain C-lobe of Non-receptor tyrosine-protein kinase TYK2. Tyk2 functions primarily in IL-12 and type I-IFN signaling as well as transduction of IL-23, IL-10, and IL-6 signals. A mutation in the Tyk2 gene has been associated with hyperimmunoglobulin E syndrome (HIES), a primary immunodeficiency characterized by elevated serum immunoglobulin E. Tyk2 has been shown to interact with FYN, PTPN6, IFNAR1, Ku80 and GNB2L1. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7) . The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. 
This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 158
31161 275415 cd13336 FERM-like_C_SNX31 Atypical FERM-like domain C-lobe of Sorting nexin 31. SNX31 functions in regulating recycling from endosomes to the cell surface. SNX31 contains an N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. It binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 113
31162 270145 cd13337 FERM-like_C_SNX17 Atypical FERM-like domain C-lobe of Sorting nexin 17. SNX17 is a beta1-integrin-tail-binding protein that interacts with the free kindlin-binding site in endosomes to stabilize beta1 integrins, resulting in their recycling to the cell surface where they can be reused. SNX17 contains a N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. SNX17 binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 113
31163 270146 cd13338 FERM-like_C_SNX27 Atypical FERM-like domain C-lobe of Sorting nexin 27. SNX27 is localized to early endosomes and known to regulate the intracellular trafficking of ion channels and receptors. SNX27 contains an N-terminal PDZ domain, a PX domain, and a FERM-like domain. SNX27 regulates trafficking of a PAK interacting exchange factor-G protein-coupled receptor kinase interacting protein complex via its PDZ domain interaction. Sorting nexin 27 interacts with multidrug resistance-associated protein 4 (MRP4). SNX27 binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 102
31164 275416 cd13339 PH-GRAM_MTMR13 Myotubularin (MTM) related 13 protein (MTMR13) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR13 (also called SBF2/SET binding factor 2) is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Leu residue instead of a conserved Cys residue in the dsPTPase catalytic loop, which renders it catalytically inactive as a phosphatase. MTMR13 has high sequence similarity to MTMR5 and has recently been shown to be a second gene mutated in type 4B Charcot-Marie-Tooth syndrome. Both MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 119
31165 275417 cd13340 PH-GRAM_MTMR5 Myotubularin (MTM) related 5 protein (MTMR5) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR5 (also called SBF1/SET binding factor 1) is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It lacks several amino acids in the dsPTPase catalytic pocket, which renders it catalytically inactive as a phosphatase. MTMR5 is the best-studied inactive member of this family and has been implicated in cellular growth control and oncogenic transformation. MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 119
31166 270149 cd13341 PH-GRAM_MTMR3 Myotubularin (MTM) related 3 protein (MTMR3) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR3 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR3 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein can self-associate and also form heteromers with MTMR4. Both MTMR3 and MTMR4 contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 94
31167 270150 cd13342 PH-GRAM_MTMR4 Myotubularin (MTM) related 4 protein (MTMR4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR4 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR4 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein forms heteromers with MTMR3. Both MTMR3 and MTMR4 contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 114
31168 270151 cd13343 PH-GRAM_MTMR6 Myotubularin (MTM) related 6 protein (MTMR6) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR6 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR6 binds to phosphoinositide lipids through its PH-GRAM domain. It acts as a negative regulator of KCNN4/KCa3.1 channel activity in CD4+ T-cells, possibly by decreasing intracellular levels of phosphatidylinositol 3-phosphate, and negatively regulates proliferation of reactivated CD4+ T-cells. MTMR6 interacts with MTMR7, MTMR8 and MTMR9. MTMR6 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 101
31169 270152 cd13344 PH-GRAM_MTMR7 Myotubularin (MTM) related 7 protein (MTMR7) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR7 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR7 binds to phosphoinositide lipids through its PH-GRAM domain and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate. MTMR7 interacts with MTMR6, MTMR8 and MTMR9. MTMR7 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. 
However, no phosphoinositide binding sites have been found for the MTMRs to date. 103
31170 270153 cd13345 PH-GRAM_MTMR8 Myotubularin (MTM) related 8 protein (MTMR8) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR8 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR8 binds to phosphoinositide lipids through its PH-GRAM domain. MTMR8 can self-associate and interacts with MTMR6, MTMR7 and MTMR9. MTMR8 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 103
31171 270154 cd13346 PH-GRAM_MTMR10 Myotubularin (MTM) related 10 protein (MTMR10) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop, which renders it catalytically inactive as a phosphatase. MTMR10 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, and a SET interaction domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. 
However, no phosphoinositide binding sites have been found for the MTMRs to date. 177
31172 275418 cd13348 PH-GRAM_MTMR12 Myotubularin (MTM) related 12 protein (MTMR12) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR12 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop, which renders it catalytically inactive as a phosphatase. MTMR12 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. 
However, no phosphoinositide binding sites have been found for the MTMRs to date. 178
31173 270156 cd13349 PH-GRAM1_TBC1D8 TBC1 domain family member 8 (TBC1D8; also called Vascular Rab-GAP/TBC-containing protein) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8 may act as a GTPase-activating protein for Rab family protein(s). TBC1D8 contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 99
31174 275419 cd13350 PH-GRAM1_TBC1D8B TBC1 domain family member 8B (TBC1D8B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8B may act as a GTPase-activating protein for Rab family protein(s). TBC1D8B contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 99
31175 275420 cd13351 PH-GRAM1_TCB1D9_TCB1D9B TBC1 domain family members 9 and 9B (TBC1D9 and TBC1D9B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). TBC1D9 and TBC1D9B contain two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 99
31176 270159 cd13352 PH-GRAM2_TBC1D8B TBC1 domain family member 8B (TBC1D8B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8B may act as a GTPase-activating protein for Rab family protein(s). TBC1D8B contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 93
31177 270160 cd13353 PH-GRAM2_TBC1D8 TBC1 domain family member 8 (TBC1D8; also called Vascular Rab-GAP/TBC-containing protein) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8 may act as a GTPase-activating protein for Rab family protein(s). TBC1D8 contains two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 96
31178 270161 cd13354 PH-GRAM2_TCB1D9_TCB1D9B TBC1 domain family members 9 and 9B (TBC1D9 and TBC1D9B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). TBC1D9 and TBC1D9B contain two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold. 97
31179 270162 cd13355 PH-GRAM_MTM1 Myotubularin 1 protein (MTM1) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTM1 is a member of the myotubularin protein phosphatase gene family. It is required for muscle cell differentiation, and mutations in this gene have been identified as being responsible for X-linked myotubular myopathy, a severe congenital muscle disorder characterized by defective muscle cell development. Since its initial discovery, an additional 14 myotubularin-related proteins have been identified. MTM1 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein can self-associate and form heteromers with MTMR12. MTM1 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. All MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE and PH domains C-terminal to the coiled-coil region. 100
31180 270163 cd13356 PH-GRAM_MTMR2_mammal-like Myotubularin related 2 protein (MTMR2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR2 is a member of the myotubularin protein phosphatase gene family. MTMR2 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. MTMR2 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Members in this cd include mammals, chickens, anoles, human body lice, and aphids. 115
31181 270164 cd13357 PH-GRAM_MTMR2_insect-like Myotubularin related 2 protein (MTMR2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR2 is a member of the myotubularin protein phosphatase gene family. MTMR2 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. MTMR2 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Members in this cd include Drosophila, sea urchins, mosquitos, bees, ticks, and anemones. 100
31182 270165 cd13358 PH-GRAM_MTMR1 Myotubularin related 1 protein (MTMR1) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR1 is a member of the myotubularin protein phosphatase gene family. MTMR1 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. MTMR1 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. 100
31183 270166 cd13359 PH_ELMO1_CED-12 Engulfment and cell motility protein 1 pleckstrin homology (PH) domain. DOCK2 (Dedicator of cytokinesis 2), a hematopoietic cell-specific, atypical GEF, controls lymphocyte migration through Rac activation. A DOCK2-ELMO1 complex is necessary for DOCK2-mediated Rac signaling. DOCK2 contains an SH3 domain at its N-terminus, followed by a lipid binding DHR1 domain, and a Rac-binding DHR2 domain at its C-terminus. ELMO1, a mammalian homolog of C. elegans CED-12, contains the N-terminal RhoG-binding region, the ELMO domain, the PH domain, and the C-terminal sequence with three PxxP motifs. The C-terminal region of ELMO1, including the Pro-rich sequence, binds the SH3-containing region of DOCK2, forming an intermolecular five-helix bundle along with the PH domain of ELMO1. Autoinhibition of ELMO1 and DOCK2 is accomplished by the interactions of the EID and EAD domains and SH3 and DHR2 domains, respectively. The interaction of DOCK2 and ELMO1 mutually relieves their autoinhibition and results in the activation of Rac1. The PH domain of ELMO1 does not bind phosphoinositides due to the absence of key binding residues. It more closely resembles the FERM domain than other PH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 126
31184 241514 cd13360 PH_PLC_fungal Fungal Phospholipase C (PLC) pleckstrin homology (PH) domain. Fungal PLCs have mostly been characterized in the yeast Saccharomyces cerevisiae via deletion studies which resulted in a pleiotropic phenotype, with defects in growth, carbon source utilization, and sensitivity to osmotic stress and high temperature. Unlike Saccharomyces, several other fungi, including Neurospora crassa, Cryphonectria parasitica, and Magnaporthe oryzae (Mo), have several PLC proteins, some of which lack a PH domain, with varied functions. MoPLC1-mediated regulation of Ca2+ level is important for conidiogenesis and appressorium formation while both MoPLC2 and MoPLC3 are required for asexual reproduction, cell wall integrity, appressorium development, and pathogenicity. The fungal PLCs in this hierarchy contain an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves, and a C-terminal C2 domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized.
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 118
31185 270167 cd13361 PH_PLC_beta Phospholipase C-beta (PLC-beta) pleckstrin homology (PH) domain. PLC-beta (PLCbeta) is regulated by heterotrimeric G protein-coupled receptors through their C2 domain and long C-terminal extension which forms an autoinhibitory helix. There are four isoforms: PLC-beta1-4. The PH domain of PLC-beta2 and PLC-beta3 plays a dual role, much like PLC-delta1, by binding to the plasma membrane, as well as serving as the interaction site for the catalytic activator. However, PLC-beta binds to the lipid surface independent of PIP2. PLC-beta1 seems to play unspecified roles in cellular proliferation and differentiation. PLC-beta consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves, a C2 domain and a C-terminal PDZ. Members of the Rho GTPase family (e.g., Rac1, Rac2, Rac3, and cdc42) have been implicated in their activation by binding to an alternate site on the N-terminal PH domain. A basic amino acid region within the enzyme's long C-terminal tail appears to function as a Nuclear Localization Signal for import into the nucleus. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner.
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 127
31186 270168 cd13362 PH_PLC_gamma Phospholipase C-gamma (PLC-gamma) pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) is activated by receptor and non-receptor tyrosine kinases due to the presence of its SH2 and SH3 domains. There are two main isoforms of PLC-gamma expressed in human specimens, PLC-gamma1 and PLC-gamma2. PLC-gamma consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. Only the first PH domain is present in this hierarchy. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 121
31187 270169 cd13363 PH_PLC_delta Phospholipase C-delta (PLC-delta) pleckstrin homology (PH) domain. PLC-delta (PLCdelta) consists of three family members, delta 1, 2, and 3. PLC-delta1 is the most well studied. PLC-delta is activated by high calcium levels generated by other PLC family members, and functions as a calcium amplifier within the cell. PLC-delta consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves, and a C-terminal C2 domain. The PH domain binds PIP2 and promotes activation of the catalytic core as well as tethering the enzyme to the plasma membrane. The C2 domain has been shown to mediate calcium-dependent phospholipid binding as well. The PH and C2 domains operate in concert as a "tether and fix" apparatus necessary for processive catalysis by the enzyme. A leucine-rich nuclear export signal (NES) in its EF hand motif and a nuclear localization signal within its linker region allow PLC-delta1 to actively translocate into and out of the nucleus. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner.
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 117
31188 270170 cd13364 PH_PLC_eta Phospholipase C-eta (PLC-eta) pleckstrin homology (PH) domain. PLC-eta (PLCeta) consists of two enzymes, PLCeta1 and PLCeta2. They hydrolyze phosphatidylinositol 4,5-bisphosphate, are more sensitive to Ca2+ than other PLC isozymes, and are involved in PKC activation in the brain and neuroendocrine systems. PLC-eta consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves by a variable linker, a C2 domain, and a C-terminal PDZ domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 109
31189 270171 cd13365 PH_PLC_plant-like Plant-like Phospholipase C (PLC) pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) was the second class of PLC discovered. PLC-gamma consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). This cd contains PLC members from fungi and plants. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 115
31190 270172 cd13366 PH_ABR Active breakpoint cluster region-related protein pleckstrin homology (PH) domain. The ABR protein contains multiple domains including a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. It is related to a slightly larger protein, BCR, which is structurally similar, but has an additional N-terminal kinase domain. ABR has GAP activity for both Rac and Cdc42. It promotes the exchange of RAC or CDC42-bound GDP by GTP, thereby activating them. It is highly enriched in the brain and found to a lesser extent in heart, lung and muscle. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 185
31191 270173 cd13367 PH_BCR_vertebrate Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. This hierarchy is composed of vertebrate BCRs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 194
31192 270174 cd13368 PH_BCR_arthropod Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. This hierarchy is composed of arthropod BCRs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 180
31193 270175 cd13369 PH_RASAL1 Ras-GTPase-activating-like protein pleckstrin homology (PH) domain. RASAL1 is a member of the GAP1 family of GTPase-activating proteins, along with GAP1(m), GAP1(IP4BP) and CAPRI. RASAL1 contains two fully conserved C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its catalytic GAP domain has dual RasGAP and RapGAP activities, while its C2 domains bind phospholipids in the presence of Ca2+. Both CAPRI and RASAL1 are calcium-activated RasGAPs that inactivate Ras at the plasma membrane, thereby enhancing the weak intrinsic GTPase activity of RAS proteins, resulting in the inactive GDP-bound form of RAS and allowing control of cellular proliferation and differentiation. CAPRI and RASAL1 differ in that CAPRI is an amplitude sensor while RASAL1 senses calcium oscillations. This difference between them resides not in their C2 domains, but in their PH domains, leading to speculation that this might reflect an association with either phosphoinositides and/or proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 138
31194 241521 cd13370 PH_GAP1m_mammal-like GTPase activating protein 1 m pleckstrin homology (PH) domain. GAP1(m) (also called RASA2/RAS p21 protein activator (GTPase activating protein) 2) is a member of the GAP1 family of GTPase-activating proteins, along with RASAL1, GAP1(IP4BP), and CAPRI. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. GAP1(m) contains two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its C2 domains, like those of GAP1(IP4BP), do not contain the C2 motif that is known to be required for calcium-dependent phospholipid binding. GAP1(m) is regulated by the binding of its PH domains to phosphoinositides, PIP3 (phosphatidylinositol 3,4,5-trisphosphate). It suppresses RAS, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. GAP1(m) binds inositol tetrakisphosphate (IP4). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 133
31195 241522 cd13371 PH_GAP1_mammal-like GAP1(IP4BP) pleckstrin homology (PH) domain. GAP1 (also called IP4BP, RASA3/Ras GTPase-activating protein 3, and RAS p21 protein activator (GTPase activating protein) 3/GAPIII/MGC46517/MGC47588) is a member of the GAP1 family of GTPase-activating proteins, along with RASAL1, GAP1(m), and CAPRI. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. GAP1(IP4BP) contains two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its C2 domains, like those of GAP1(m), do not contain the C2 motif that is known to be required for calcium-dependent phospholipid binding. GAP1(IP4BP) is regulated by the binding of its PH domains to phosphoinositides, PIP3 (phosphatidylinositol 3,4,5-trisphosphate) and PIP2 (phosphatidylinositol 4,5-bisphosphate). It suppresses RAS, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. GAP1(IP4BP) binds tyrosine-protein kinase, HCK. Members here include humans, chickens, frogs, and fish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding.
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31196 241523 cd13372 PH_CAPRI Ca2+ promoted Ras inactivator pleckstrin homology (PH) domain. CAPRI (also called RASA4/RAS p21 protein activator (GTPase activating protein) 4/GAPL/FLJ59070/KIAA0538/MGC131890) is a member of the GAP1 family of GTPase-activating proteins. CAPRI contains two fully conserved C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its catalytic GAP domain has dual RasGAP and RapGAP activities, while its C2 domains bind phospholipids in the presence of Ca2+. Both CAPRI and RASAL are calcium-activated RasGAPs that inactivate Ras at the plasma membrane, thereby enhancing the weak intrinsic GTPase activity of RAS proteins, resulting in the inactive GDP-bound form of RAS and allowing control of cellular proliferation and differentiation. CAPRI and RASAL differ in that CAPRI is an amplitude sensor while RASAL senses calcium oscillations. This difference between them resides not in their C2 domains, but in their PH domains, leading to speculation that this might reflect an association with either phosphoinositides and/or proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 140
31197 270176 cd13373 PH_nGAP Neuronal growth-associated proteins Pleckstrin homology (PH) domain. nGAP (also called RASAL2/RAS protein activator like-2) is a member of the RasSynGAP family along with DOC-2/DAB2-interacting protein (DAB2IP) and synaptic RasGAP (SynGAP). nGAPs are growth cone markers found in multiple types of neurons. There are many nGAPs including Cap1 (Adenylate cyclase-associated protein 1), Capzb (Capping protein (actin filament) muscle Z-line, beta), Clptm1 (Cleft lip and palate associated transmembrane protein 1), Cotl1 (Coactosin-like 1), Crmp1 (Collapsin response mediator protein 1), Cyfip1 (Cytoplasmic FMR1 interacting protein 1), Fabp7 (Fatty acid binding protein 7, brain), Farp2 (FERM, RhoGEF and pleckstrin domain protein 2), Gap43 (Growth associated protein 43), Gnao1 (Guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O), Gnai2 (Guanine nucleotide binding protein (G protein), alpha inhibiting 2), Pacs1 (Phosphofurin acidic cluster sorting protein 1), Rtn1 (Reticulon 1), Sept2 (Septin 2), Snap25 (Synaptosomal-associated protein 25), Strap (Serine/threonine kinase receptor associated protein), Stx7 (Syntaxin 7), and Tmod2 (Tropomodulin 2). PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity.
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 138
31198 270177 cd13374 PH_RASAL3 RAS protein activator like-3 Pleckstrin homology (PH) domain. RASAL3 is thought to be a Ras GTPase-activating protein. It is involved in positive regulation of Ras GTPase activity and of small GTPase mediated signal transduction as well as negative regulation of Ras protein signal transduction. It contains a PH domain, a C2 domain, and a Ras-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 146
31199 270178 cd13375 PH_SynGAP Synaptic Ras-GTPase activating protein Pleckstrin homology (PH) domain. SynGAP is a member of the RasSynGAP family along with DOC-2/DAB2-interacting protein (DAB2IP) and neuronal growth-associated protein (nGAP/RASAL2). SynGAP, a neuronal Ras-GAP, has been shown to display both Ras-GAP activity and Ras-related protein (Rap)-GAP activity. Saccharomyces cerevisiae Bud2 and GAP1 members CAPRI (Ca2+-promoted Ras inactivator) and RASAL (Ras-GTPase-activating-like protein) also possess this dual activity. Human DOC-2/DAB2-interacting protein (DAB2IP) is encoded by a tumor suppressor gene and is a newly recognized member of the Ras-GTPase-activating family. Members here include mammals, amphibians, and bony fish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 189
31200 270179 cd13376 PH_DAB2IP DOC-2/Disabled homolog 2-interacting protein Pleckstrin homology (PH) domain. DAB2IP (also called AIP1/ASK1-interacting protein-1 and DIP1/2) is a member of the RasSynGAP family along with Synaptic Ras-GTPase activating protein (SynGAP) and neuronal growth-associated protein (nGAP/RASAL2). DAB2IP is a critical component of many signal transduction pathways mediated by Ras and tumor necrosis factors including apoptosis pathways, and it is involved in the formation of many types of tumors. DAB2IP participates in regulation of gene expression and pluripotency of cells. Human DAB2IP is expressed in the adrenal gland, pancreas, endocardium, stomach, kidney, testis, small intestine, liver, trachea, skin, ovary, endometrium, lung, esophagus and bladder. No expression was observed in the cerebrum, parotid gland, thymus, thyroid gland and spleen. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 182
31201 241529 cd13378 PH_RhoGAP2 Rho GTPase activating protein 2 Pleckstrin homology (PH) domain. RhoGAP2 (also called RhoGap22 or ArhGap22) are involved in cell polarity, cell morphology and cytoskeletal organization. They activate a GTPase belonging to the RAS superfamily of small GTP-binding proteins. The encoded protein is insulin-responsive, is dependent on the kinase Akt, and requires the Akt-dependent 14-3-3 binding protein which binds sequentially to two serine residues resulting in regulation of cell motility. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 116
31202 241530 cd13379 PH_RhoGap24 Rho GTPase activating protein 24 Pleckstrin homology (PH) domain. RhoGap24 (also called ARHGAP24, p73RhoGAP, and Filamin-A-associated RhoGAP), like other RhoGAPs, is involved in cell polarity, cell morphology and cytoskeletal organization. These proteins act as GTPase activators for the Rac-type GTPases by converting them to an inactive GDP-bound state. They control actin remodeling by inactivating Rac downstream of Rho, which suppresses leading edge protrusion and promotes cell retraction to achieve cellular polarity, and they are able to suppress RAC1 and CDC42 activity in vitro. Overexpression of these proteins induces cell rounding with partial or complete disruption of actin stress fibers and formation of membrane ruffles, lamellipodia, and filopodia. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 114
31203 270180 cd13380 PH_Skap1 Src kinase-associated phosphoprotein 1 Pleckstrin homology (PH) domain. Adaptor protein Skap1 (also called Skap55/Src kinase-associated phosphoprotein of 55 kDa) and its partner, ADAP (adhesion and degranulation promoting adapter protein) help reorganize the cytoskeleton and/or promote integrin-mediated adhesion upon immunoreceptor activation. Skap1 is also involved in T Cell Receptor (TCR)-induced RapL-Rap1 complex formation and LFA-1 activation. Skap1 has an N-terminal coiled-coil conformation which is proposed to be involved in homodimer formation, a central PH domain and a C-terminal SH3 domain that associates with ADAP. The Skap1 PH domain plays a role in controlling integrin function via recruitment of ADAP-SKAP complexes to integrins as well as in controlling the ability of ADAP to interact with the CBM signalosome and regulate NF-kappaB. SKAP1 is necessary for RapL binding to membranes in a manner dependent on its PH domain and the PI3K pathway. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Skap55/Skap1, Skap2, and Skap-homology (Skap-hom) have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. Their PH domains bind 3'-phosphoinositides as well as directly affecting targets, such as in Skap55, where it directly affects integrin regulation by ADAP and NF-kappaB activation, or in Skap-hom, where the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31204 270181 cd13381 PH_Skap-hom_Skap2 Src kinase-associated phosphoprotein homolog and Skap 2 Pleckstrin homology (PH) domain. Adaptor protein Skap-hom, a homolog of Skap55, which interacts with actin and with ADAP (adhesion and degranulation promoting adapter protein) undergoes tyrosine phosphorylation in response to plating of bone marrow-derived macrophages on fibronectin. Skap-hom has an N-terminal coiled-coil conformation that is involved in homodimer formation, a central PH domain and a C-terminal SH3 domain that associates with ADAP. The Skap-hom PH domain regulates intracellular targeting; its interaction with the DM domain inhibits Skap-hom actin-based ruffles in macrophages and its binding to 3'-phosphoinositides reverses this autoinhibition. The Skap-hom PH domain binds PI[3,4]P2 and PI[3,4,5]P3, but not PI[3]P, PI[5]P, or PI[4,5]P2. Skap2 is a downstream target of Heat shock transcription factor 4 (HSF4) and functions in the regulation of actin reorganization during lens differentiation. It is thought that SKAP2 anchors the complex of tyrosine kinase adaptor protein 2 (NCK2)/focal adhesion to fibroblast growth factor receptors at the lamellipodium in lens epithelial cells. Skap2 has an N-terminal coiled-coil conformation which interacts with the SH2 domain of NCK2, a central PH domain and a C-terminal SH3 domain that associates with ADAP (adhesion and degranulation promoting adapter protein)/FYB (the Fyn binding protein). Skap2 PH domain binds to membrane lipids. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Src kinase-associated phosphoprotein of 55 kDa (Skap55)/Src kinase-associated phosphoprotein 1 (Skap1), Skap2, and Skap-hom have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. 
Their PH domains bind 3'-phosphoinositides as well as directly affecting targets, such as in Skap55, where it directly affects integrin regulation by ADAP and NF-kappaB activation, or in Skap-hom, where the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 106
31205 270182 cd13382 PH_OCRL1 oculocerebrorenal syndrome of Lowe 1 Pleckstrin homology-like domain. OCRL1 (also called INPP5F, LOCR, NPHL2, or phosphatidylinositol polyphosphate 5-phosphatase) hydrolyzes phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtdIns(1,4,5)P3), and thereby modulates cellular signaling events. It interacts with APPL1, FAM109A, FAM109B, and several Rab GTPases, which might both target it to specific membranes and stimulate its phosphatase activity. OCRL1 contains a PH domain and a Rho-GAP domain. Patients with Lowe syndrome suffer primarily from congenital cataracts, neonatal hypotonia, intellectual disability and Fanconi syndrome. Mutations in OCRL are also found in a subset of patients with type 2 Dent disease, who selectively suffer from renal proximal tubular dysfunction. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
31206 270183 cd13383 PH_OCRL2 oculocerebrorenal syndrome of Lowe 2 Pleckstrin homology-like domain. OCRL2 (also called INPP5B, inositol polyphosphate-5-phosphatase, phosphoinositide 5-phosphatase, 5PTase, or type II inositol-1,4,5-trisphosphate 5-phosphatase) hydrolyzes phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtdIns(1,4,5)P3), and thereby modulates cellular signaling events. It interacts with APPL1, FAM109A, FAM109B, and several Rab GTPases, which might both target it to specific membranes and stimulate its phosphatase activity. OCRL2 contains a PH domain and a Rho-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31207 241535 cd13384 PH_Gab2_2 Grb2-associated binding protein family pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. Members here include insect, nematodes, and crustacean Gab2s. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 115
31208 270184 cd13385 PH_Gab3 Grb2-associated binding protein 3 pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. The members in this cd include the Gab1, Gab2, and Gab3 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
31209 275421 cd13386 PH1_FGD2 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 2, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Not much is known about FGD2. FGD1 is the best characterized member of the group, with mutations in it leading to the X-linked disorder known as faciogenital dysplasia (FGDY). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31210 275422 cd13387 PH1_FGD3 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 3, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles in regulating cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 108
31211 275423 cd13388 PH1_FGD1-4_like FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 1-4 and similar proteins, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. They play different roles in regulating cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 94
31212 275424 cd13389 PH1_FGD5_FGD6 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 5 and 6, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the pro-angiogenic effects of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 124
31213 275425 cd13390 PH_LARG Leukemia-associated Rho guanine nucleotide exchange factor Pleckstrin homology (PH) domain. LARG (also called RhoGEF12) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. RhoGEFs activate Rho GTPases regulating cytoskeletal structure, gene transcription, and cell migration. LARG contains a N-terminal extension, followed by Dbl homology (DH)-PH domains which bind and catalyze the exchange of GDP for GTP on RhoA in addition to a RGS domain. The active site of RhoA adopts two distinct GDP-excluding conformations among the four unique complexes in the asymmetric unit. The LARG PH domain also contains a potential protein-docking site. LARG forms a homotetramer via its DH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 138
31214 275426 cd13391 PH_PRG PDZ Rho guanine nucleotide exchange factor Pleckstrin homology (PH) domain. PRG (also called RhoGEF11) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. RhoGEFs activate Rho GTPases regulating cytoskeletal structure, gene transcription, and cell migration. PRG contains an N-terminal PDZ domain, a regulators of G-protein signaling-like (RGSL) domain, a linker region, and C-terminal Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. As is the case in p115-RhoGEF, it is thought that PRG is activated by relieving autoinhibition caused by the linker region. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 142
31215 275427 cd13392 PH_AKAP13 A-kinase anchoring protein 13 Pleckstrin homology (PH) domain. The Rho-specific GEF activity of AKAP13 (also called Brx-1, AKAP-Lbc, and proto-Lbc) mediates signaling downstream of G-protein coupled receptors and Toll-like receptor 2. It plays a role in cell growth, cell development and actin fiber formation. Protein kinase A (PKA) binds and phosphorylates AKAP13, regulating its Rho-GEF activity. Alternative splicing of this gene in humans has at least 3 transcript variants encoding different isoforms (i.e. proto-/onco-Lymphoid blast crisis, Lbc and breast cancer nuclear receptor-binding auxiliary protein, Brx) containing a dbl oncogene homology (DH) domain and PH domain which are required for full transforming activity. The DH domain is associated with guanine nucleotide exchange activation, while PH domains have multiple functions: some determine protein sub-cellular localisation via phosphoinositide interactions, while others bind protein partners. Other ligands include protein kinase C which is bound by the PH domain of AKAP13, serving to activate protein kinase D and mobilize a cardiac hypertrophy signaling pathway. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 103
31216 275428 cd13393 PH_ARHGEF2 Rho guanine nucleotide exchange factor 2 Pleckstrin homology (PH) domain. ARHGEF2, also called GEF-H1, acts as guanine nucleotide exchange factor (GEF) for RhoA GTPases. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. ARHGEF2 contains a C1 domain followed by Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 116
31217 240521 cd13394 Syo1_like Fungal symportin 1 (syo1) and similar proteins. This family of eukaryotic proteins includes Saccharomyces cerevisiae Ydl063c and Chaetomium thermophilum Syo1, which mediate the co-import of two ribosomal proteins, Rpl5 and Rpl11 (which both interact with 5S rRNA) into the nucleus. Import precedes their association with rRNA and subsequent ribosome assembly in the nucleolus. The primary structure of syo1 is a mixture of Armadillo- (ARM, N-terminal part of syo1) and HEAT-repeats (C-terminal part of syo1). 597
31218 381602 cd13399 Slt35-like Slt35-like lytic transglycosylase. Lytic transglycosylase similar to Escherichia coli lytic transglycosylase Slt35 and Pseudomonas aeruginosa Sltb1. Lytic transglycosylase (LT) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). 108
31219 381603 cd13400 LT_IagB-like Escherichia coli invasion protein IagB and similar proteins. Lytic transglycosylase-like protein, similar to Escherichia coli invasion protein IagB. IagB is encoded within a pathogenicity island in Salmonella enterica and has been shown to degrade polymeric peptidoglycan. IagB-like invasion proteins are implicated in the invasion of eukaryotic host cells by bacteria. Lytic transglycosylase (LT) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Members of this family resemble the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda. 109
31220 381604 cd13401 Slt70-like 70kDa soluble lytic transglycosylase (Slt70) and similar proteins. Catalytic domain of the 70kDa soluble lytic transglycosylase (LT)-like proteins, which also have an N-terminal U-shaped U-domain and a linker L-domain. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda. 152
31221 381605 cd13402 LT_TF-like lytic transglycosylase-like domain of tail fiber-like proteins and similar domains. These tail fiber-like proteins are multi-domain proteins that include a lytic transglycosylase (LT) domain. Members of the LT family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, and the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. 117
31222 381606 cd13403 MLTF-like membrane-bound lytic murein transglycosylase F (MLTF) and similar proteins. This subfamily includes membrane-bound lytic murein transglycosylase F (MltF, murein lyase F) that degrades murein glycan strands. It is responsible for catalyzing the release of 1,6-anhydromuropeptides from peptidoglycan. Lytic transglycosylase catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do goose-type lysozymes. However, in addition, it also makes a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. 161
31223 259831 cd13404 UreI_AmiS_like UreI/AmiS family, proton-gated urea channel and putative amide transporters. This family includes UreI proton-gated urea channels as well as putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity. 167
31224 276910 cd13405 TNFRSF14_teleost Tumor necrosis factor receptor superfamily member 14 (TNFRSF14) in teleost; also known as herpes virus entry mediator (HVEM). This subfamily of TNFRSF14 (also known as herpes virus entry mediator or HVEM, ATAR, CD270, HVEA, LIGHTR, TR2) is found in teleosts, many of which are as yet uncharacterized. It regulates T-cell immune responses by activating inflammatory as well as inhibitory signaling pathways. HVEM acts as a receptor for the canonical TNF-related ligand LIGHT (lymphotoxin-like), which exhibits inducible expression, and competes with herpes simplex virus glycoprotein D for HVEM. It also acts as a ligand for the immunoglobulin superfamily proteins BTLA (B and T lymphocyte attenuator) and CD160, a feature distinguishing HVEM from other immune regulatory molecules, thus creating a functionally diverse set of intrinsic and bidirectional signaling pathways. HVEM is highly expressed in the gut epithelium. Genome-wide association studies have shown that HVEM is an inflammatory bowel disease (IBD) risk gene, suggesting that HVEM could have a role in regulating the epithelial barrier, host defense, and the microbiota. Mouse models have revealed that HVEM is involved in colitis pathogenesis, mucosal host defense, and epithelial immunity, thus acting as a mucosal gatekeeper with multiple regulatory functions in the mucosa. HVEM plays a critical role in both tumor progression and resistance to antitumor immune responses, possibly through direct and indirect mechanisms. It is known to be expressed in several human malignancies, including esophageal squamous cell carcinoma, follicular lymphoma, and melanoma. The HVEM network may therefore be an attractive target for drug intervention. In Asian seabass, the up-regulation of differentially expressed TNFRSF14 gene has been observed. 111
31225 276911 cd13406 TNFRSF4 Tumor necrosis factor receptor superfamily member 4 (TNFRSF4), also known as CD134 or OX40. TNFRSF4 (also known as OX40, ACT35, CD134, IMD16, TXGP1L) activates NF-kappaB through its interaction with adaptor proteins TRAF2 and TRAF5. It also promotes the expression of apoptosis inhibitors BCL2 and BCL2L1/BCL2-XL, and thus suppresses apoptosis. It is primarily expressed on activated CD4+ and CD8+ T cells, where it is transiently expressed and upregulated on the most recently antigen-activated T cells within inflammatory lesions. This makes it an attractive target to modulate immune responses, i.e. TNFRSF4 (OX40) blocking agents to inhibit adverse inflammation or agonists to enhance immune responses. An artificially created biologic fusion protein, OX40-immunoglobulin (OX40-Ig), prevents OX40 from reaching the T-cell receptors, thus reducing the T-cell response. Some single nucleotide polymorphisms (SNPs) of its natural ligand OX40 ligand (OX40L, CD252), which is also found on activated T cells, have been associated with systemic lupus erythematosus. 142
31226 276912 cd13407 TNFRSF5 Tumor necrosis factor receptor superfamily member 5 (TNFRSF5), also known as CD40. TNFRSF5 (commonly known as CD40 and also as CDW40, p50, Bp50) is widely expressed in diverse cell types including B lymphocytes, dendritic cells, platelets, monocytes, endothelial cells, and fibroblasts. It is essential in mediating a wide variety of immune and inflammatory responses, including T cell-dependent immunoglobulin class switching, memory B cell development, and germinal center formation. Its natural immunomodulating ligand is CD40L, and a primary defect in the CD40/CD40L system is associated with X-linked hyper-IgM (XHIM) syndrome. It is also involved in tumorigenesis; CD40 expression is significantly higher in gastric carcinomas and it is associated with the lymphatic metastasis of cancer cells and their tumor node metastasis (TNM) classification. Upregulated levels of CD40/CD40L on B cells and T cells may play an important role in the immune pathogenesis of breast cancer. Consequently, the CD40/CD40L system serves as a link between tumorigenesis, atherosclerosis, and the immune system, and offers a potential target for drug therapy for related diseases, such as cancer, atherosclerosis, diabetes mellitus, and immunological rejection. 161
31227 276913 cd13408 TNFRSF7 Tumor necrosis factor receptor superfamily member 7 (TNFRSF7), also known as CD27. TNFRSF7 (also known as CD27, T14, S152, Tp55, LPFS2) has a key role in the generation of immunological memory via effects on T-cell expansion and survival, and B cell development. It binds to ligand CD70, and plays a key role in regulating B-cell activation and immunoglobulin synthesis. CD27 transduces signals that lead to the activation of NF-kappaB and MAPK8/JNK, and mediates the signaling process through adaptor proteins TRAF2 and TRAF5. CD27-binding protein (SIVA), a pro-apoptotic protein, can bind to CD27 and may play an important role in the apoptosis induced by this receptor. The potential role of the CD27/CD70 pathway in the course of inflammatory diseases, such as arthritis and inflammatory bowel disease, suggests that CD70 may be a target for immune intervention. The expression of CD27 and CD44 molecules correlates with the differentiation stage of B cell precursors and has been shown to have a biological significance in acute lymphoblastic leukemia. 121
31228 276914 cd13409 TNFRSF8 Tumor necrosis factor receptor superfamily member 8 (TNFRSF8), also known as CD30. TNFRSF8 (also known as CD30, Ki-1, D1S166E) is expressed by activated T and B cells. It transduces signals that lead to the activation of NF-kappaB, mediated by the adaptor proteins TRAF2 and TRAF5. This receptor is a positive regulator of apoptosis, and has been shown to limit the proliferative potential of auto-reactive CD8 effector T cells and protect the body against autoimmunity. Two alternatively spliced transcript variants of this gene encoding distinct isoforms have been reported. CD30 is expressed in malignant Hodgkin and Reed-Sternberg cells on the surface of extracellular vesicles, facilitating CD30-CD30L interaction between cell types. This receptor is also associated with anaplastic large cell lymphoma. It is expressed in embryonal carcinoma, but not in seminoma, making it a useful marker in distinguishing between these germ cell tumors. Since CD30 has restricted expression in normal tissues, it is an optimal target for selectively eliminating CD30-expressing neoplastic cells by specific toxin-conjugated monoclonal antibodies (mAbs). 130
31229 276915 cd13410 TNFRSF9 Tumor necrosis factor receptor superfamily member 9 (TNFRSF9), also known as CD137. TNFRSF9 (also known as CD137, ILA, 4-1BB) plays a role in the immunobiology of human cancer where it is preferentially expressed on the tumor-reactive subset of tumor-infiltrating lymphocytes. It can be expressed by activated T cells, but to a larger extent on CD8 than on CD4 T cells. In addition, CD137 expression is found on dendritic cells, follicular dendritic cells, natural killer cells, granulocytes and cells of blood vessel walls at sites of inflammation. It transduces signals that lead to the activation of NF-kappaB, mediated by the TRAF adaptor proteins. CD137 contributes to the clonal expansion, survival, and development of T cells. It can also induce proliferation in peripheral monocytes, enhance T cell apoptosis induced by TCR/CD3 triggered activation, and regulate CD28 co-stimulation to promote Th1 cell responses. CD137 is modulated by SAHA treatment in breast cancer cells, suggesting that the combination of SAHA with this receptor could be a new therapeutic approach for the treatment of tumors. 138
31230 276916 cd13411 TNFRSF11A Tumor necrosis factor receptor superfamily member 11A (TNFRSF11A), also known as receptor activator of nuclear factor-kappaB (RANK). TNFRSF11A (also known as RANK, FEO, OFE, ODFR, OSTS, PDB2, CD265, OPTB7, TRANCER, LOH18CR1) induces the activation of NF-kappaB and MAPK8/JNK through interactions with various TRAF adaptor proteins. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. The receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and juvenile Paget's disease (JPD) of bone. Alternatively spliced transcript variants have been described for this locus. Mutation analysis may improve diagnosis, prognostication, recurrence risk assessment, and perhaps treatment selection among the monogenic disorders of RANKL/OPG/RANK activation. 163
31231 276917 cd13412 TNFRSF11B_teleost Tumor necrosis factor receptor superfamily 11B (TNFRSF11B) in teleost; also known as Osteoprotegerin (OPG). This subfamily of TNFRSF11B (also known as Osteoprotegerin, OPG, TR1, OCIF) is found in teleosts. It is a secreted glycoprotein that regulates bone resorption. It binds to two ligands, RANKL (receptor activator of nuclear factor kappaB ligand, also known as osteoprotegerin ligand, OPGL, TRANCE, TNF-related activation induced cytokine), a critical cytokine for osteoclast differentiation, and TRAIL (TNF-related apoptosis-inducing ligand), involved in immune surveillance. Therefore, acting as a decoy receptor for RANKL and TRAIL, OPG inhibits the regulatory effects of nuclear factor-kappaB on inflammation, skeletal, and vascular systems, and prevents TRAIL-induced apoptosis. Studies of the mouse counterparts suggest that this protein and its ligand also play a role in lymph-node organogenesis and vascular calcification. Circulating OPG levels have emerged as independent biomarkers of cardiovascular disease (CVD) in patients with acute or chronic heart disease. OPG has also been implicated in various inflammatory conditions and linked to diabetes and poor glycemic control. Alternatively spliced transcript variants of this gene have been reported, although their full length nature has not been determined. Genetic analysis of the Japanese rice fish medaka (Oryzias latipes) has shown that entire networks for bone formation are conserved between teleosts and mammals, enabling medaka to be used as a genetic model to monitor bone homeostasis in vivo. 129
31232 276918 cd13413 TNFRSF12A Tumor necrosis factor receptor superfamily member 12A (TNFRSF12A), also known as receptor fibroblast growth factor inducible 14 (FN14). TNFRSF12A (also known as receptor fibroblast growth factor inducible 14, FN14, CD266, TWEAKR) is induced by a large variety of growth factors including Fibroblast Growth Factor 1 (FGF1), FGF2, Platelet-Derived Growth Factor (PDGF), Epidermal Growth Factor (EGF) and Vascular Endothelial Growth Factor (VEGF), as well as cytokines such as tumor necrosis factor alpha (TNFalpha), Interleukin-1beta (IL-1beta), Interferon gamma (IFNgamma), and transforming growth factor-beta (TGF-beta). FN14 is expressed on a wide variety of different cell types and binds the ligand TWEAK (tumor necrosis factor-like weak inducer of apoptosis) to activate several signaling cascades through activation of NF-kappaB signaling mediated by adaptor TRAF proteins. The FN14/TWEAK pathway controls a range of cellular activities such as proliferation, differentiation, and apoptosis, and has diverse biological functions in pathological mechanisms like inflammation and fibrosis that are associated with cardiovascular diseases (CVDs). The complex is a positive regulator of cardiac hypertrophy and it has been shown that deletion of FN14 receptor protects from right heart fibrosis and dysfunction; the TWEAK/Fn14 axis could be a potential new therapeutic target for achieving cardiac protection in patients with CVDs. FN14 expression is also stimulated under specific atrophic conditions, such as denervation, immobilization, and starvation, leading to activation of TWEAK/Fn14 signaling and eventually skeletal muscle atrophy. FN14 is also a factor that promotes prostate cancer bone metastasis. 117
31233 276919 cd13414 TNFRSF17 Tumor necrosis factor receptor superfamily member 17 (TNFRSF17), also known as B cell maturation antigen (BCMA), as well as TNFRSF13A. TNFRSF17 (also known as TNFRSF13A, B cell maturation antigen or BCMA, CD269) is predominantly expressed on terminally differentiated B cells, including multiple myeloma cells, and is important for B cell development and autoimmune response. Upon binding to its ligands, B cell activator of the TNF family (BAFF, also known as TNFSF13B, TALL-1, BLyS, zTNF4), and a proliferation inducing ligand (APRIL), BCMA activates NF-kappaB and MAPK8/JNK; it has a higher affinity for APRIL than for BAFF. This receptor may transduce signals for cell survival and proliferation by binding to TRAF1, TRAF2, and TRAF3. BCMA expression has also been linked to a number of cancers, autoimmune disorders, and infectious diseases. It has been shown that although BCMA does not play a role in normal B cell homeostasis, it is critical for the long-term survival of bone marrow plasma cells. BCMA is expressed in a number of hematologic malignancies, including both Hodgkin's and non-Hodgkin's lymphomas, as well as primary tumor cells and cell lines of multiple myeloma, playing a critical role in protecting myeloma cells from apoptosis. BCMA has been identified as a promising chimeric antigen receptor (CAR) target for multiple myeloma; CARs are synthetic transmembrane proteins used to redirect autologous T cells with a new specificity for antigens on the surface of cancer cells. BCMA may also be implicated in the context of both viral and fungal infections; peripheral blood B cells isolated from HIV+ viremic patients have increased expression levels of BCMA, and significantly decreased levels are found during fungal infection with C. neoformans. BCMA has been linked to mucosal immunity; its signaling in B cells and non-B cells is important for driving protective IgA responses. Also, abnormal expression or signaling of BCMA in the gut may be relevant to diseases, such as irritable bowel disease and ulcerative colitis. 165
31234 276920 cd13415 TNFRSF13B Tumor necrosis factor receptor superfamily member 13B (TNFRSF13B), also known as transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI). TNFRSF13B (also known as transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI), CVID, RYZN, CD267, CVID2, TNFRSF14B) is mainly expressed on B cells and binds strongly to B cell activating factor (BAFF) and weakly to a proliferation-inducing ligand (APRIL). TACI-APRIL interactions induce B-cell differentiation, whereas TACI-BAFF ligation negatively regulates B-cell functions. In humans, TACI is expressed on memory B cells and TACI mutations are detected in 8-10% of common variable immunodeficiency (CVID) patients, making it the most frequently mutated gene for the disease. Coexisting morbidities in CVID include bronchiectasis, autoimmunity, and malignancies. However, TNFRSF13B/TACI defects alone do not result in CVID but may also be found frequently in distinct clinical phenotypes, including benign lymphoproliferation and IgG subclass deficiencies. Over-expression of TACI has been detected in multiple myeloma and thyroid carcinoma; correlative analyses suggest that TACI expression is a useful prognostic marker for lymphoma. 212
31235 276921 cd13416 TNFRSF16 Tumor necrosis factor receptor superfamily member 16 (TNFRSF16), also known as p75 neurotrophin receptor (p75NTR) or CD271. TNFRSF16 (also known as nerve growth factor receptor (NGFR) or p75 neurotrophin receptor (p75NTR or p75(NTR)), CD271, Gp80-LNGFR) is a common receptor for both neurotrophins and proneurotrophins, and plays a diverse role in many tissues, including the nervous system. It has been shown to be expressed in various types of stem cells and has been used to prospectively isolate stem cells with different degrees of potency. p75NTR owes its signaling to the recruitment of intracellular binding proteins, leading to the activation of different signaling pathways. It binds nerve growth factor (NGF) and the complex can initiate a signaling cascade which has been associated with both neuronal apoptosis and neuronal survival of discrete populations of neurons, depending on the presence or absence of intracellular signaling molecules downstream of p75NTR (e.g. NF-kB, JNK, or p75NTR intracellular death domain). p75NTR can also bind NGF in concert with the neurotrophic tyrosine kinase receptor type 1 (TrkA) protein where it is thought to modulate the formation of the high-affinity neurotrophin binding complex. On melanoma cells, p75NTR is an immunosuppressive factor, induced by interferon (IFN)-gamma, and mediates down-regulation of melanoma antigens. It can interact with the aggregated form of amyloid beta (Abeta) peptides, and plays an important role in etiopathogenesis of Alzheimer's disease by influencing protein tau hyper-phosphorylation. p75(NTR) is involved in the formation and progression of retinal diseases; its expression is induced in retinal pigment epithelium (RPE) cells and its knockdown rescues RPE cell proliferation activity and inhibits RPE apoptosis induced by hypoxia. It can therefore be a potential therapeutic target for RPE hypoxia or oxidative stress diseases. 159
31236 276922 cd13417 TNFRSF18 Tumor necrosis factor receptor superfamily member 18 (TNFRSF18), also known as glucocorticoid-induced tumor necrosis factor receptor family-related protein (GITR). TNFRSF18 (also known as activation-inducible TNF receptor (AITR), glucocorticoid-induced tumor necrosis factor receptor family-related protein (GITR), CD357, GITR-D) has increased expression upon T-cell activation, and is thought to play a key role in dominant immunological self-tolerance maintained by CD25(+)CD4(+) regulatory T cells. In inflammatory cells, GITR expression indicates a possible molecular link between steroid use and complicated acute sigmoid diverticulitis; increased MMP-9 expression by GITR signaling might explain morphological changes in the colonic wall in diverticulitis. Its ligand, GITRL, activates GITR which could then influence the activity of effector and regulatory T cells, participating in the development of several autoimmune and inflammatory diseases, including autoimmune thyroid disease and rheumatoid arthritis. In systemic lupus erythematosus (SLE) patients, serum GITRL levels are increased compared with healthy controls. GITR and its ligand, GITRL, are possibly involved in the pathogenesis of primary Sjogren's syndrome (pSS). GITR is inactivated during tumor progression in Multiple Myeloma (MM); restoration of GITR expression in GITR deficient MM cells leads to inhibition of MM proliferation and induction of apoptosis, thus playing a pivotal role in MM pathogenesis and disease progression. Regulatory T-cells (Tregs) in liver tumor up-regulate the expression of GITR compared with Tregs in tumor-free liver tissue and blood. Regulatory single nucleotide polymorphisms (SNPs) in the promoter regions of the TNFRSF18 gene have been identified in a group of male Gabonese individuals exposed to a wide array of parasitic diseases such as malaria, filariasis and schistosomiasis, and may serve as a basis to study parasite susceptibility in association studies. 130
31237 276923 cd13418 TNFRSF19 Tumor necrosis factor receptor superfamily member 19 (TNFRSF19), also known as TROY. TNFRSF19 (also known as TAJ; TROY; TRADE; TAJ-alpha) is expressed in progenitor cells of the hippocampus, thalamus, and cerebral cortex and highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. It is frequently overexpressed in colorectal cancer cell lines and primary colorectal carcinomas. TNFRSF19 is a beta-catenin target gene in mesenchymal stem cells and also activates NF-kappaB signaling, showing that beta-catenin regulates NF-kappaB activity via TNFRSF19. Since Wnt/beta-catenin signaling plays a crucial role in the regulation of colon tissue regeneration and the development of colon tumors, TNFRSF19 may contribute to the development of colorectal tumors. The death receptors DR6 and TROY also have a role in CNS-specific vascular development. TNFRSF19 has been shown to promote glioblastoma (GBM) survival signaling and therefore targeting it may increase tumor vulnerability and improve therapeutic response in glioblastoma. It may play an important role in myelin-associated inhibitory factors (MAIFs)-induced inhibition of neurite outgrowth in the postnatal central nervous system (CNS) or on axon regeneration following CNS injury. 117
31238 276924 cd13419 TNFRSF19L tumor necrosis factor receptor superfamily member 19-like (TNFRSF19L), also known as receptor expressed in lymphoid tissues (RELT). TNFRSF19L (also known as receptor expressed in lymphoid tissues (RELT)) is especially abundant in hematologic tissues and can stimulate the proliferation of T-cells. It serves as a substrate for the closely related kinases, oxidative stress-responsive 1 (OSR1) and STE20/SPS1-related proline/alanine-rich kinase (SPAK); RELT binds SPAK and uses it to mediate p38 and JNK activation, rather than rely on the canonical TRAF pathways for its function. RELT is capable of stimulating T-cell proliferation in the presence of CD3 signaling, which suggests its regulatory role in immune response. It interacts with phospholipid scramblase 1 (PLSCR1), an interferon-inducible protein that mediates antiviral activity against DNA and RNA viruses; PLSCR1 is a regulator of hepatitis B virus X (HBV X) protein. RELT and PLSCR1 co-localize in intracellular regions of human embryonic kidney-293 cells, with RELT over-expression appearing to alter the localization of PLSCR1. 91
31239 276925 cd13420 TNFRSF25 tumor necrosis factor receptor superfamily member 25 (TNFRSF25), also known as death receptor 3 (DR3). TNFRSF25 (also known as death receptor 3 (DR3), death domain receptor 3 (DDR3), apoptosis-mediating receptor, lymphocyte associated receptor of death (LARD), apoptosis inducing receptor (AIR), APO-3, translocating chain-association membrane protein (TRAMP), WSL-1, WSL-LR or TNFRSF12) is preferentially expressed in thymocytes and lymphocytes, and may play a role in regulating lymphocyte homeostasis. It has been detected in lymphocyte-rich tissues such as colon, intestine, thymus and spleen, as well as in the prostate. Various death domain containing adaptor proteins mediate the signal transduction of this receptor; it activates nuclear factor kappa-B (NFkB) and induces cell apoptosis by associating with TNFRSF1A-associated via death domain (TRADD), which is known to mediate signal transduction of tumor necrosis factor receptors. DR3 associates with tumor necrosis factor (TNF)-like cytokine 1A (TL1A also known as TNFSF15) on activated lymphocytes and induces pro-inflammatory signals; TL1A also binds decoy receptor DcR3 (also known as TNFRSF6B). DR3/DcR3/TL1A expression is increased in both serum and inflamed tissues in several autoimmune diseases, including inflammatory bowel disease (IBD), rheumatoid arthritis (RA), allergic asthma, experimental autoimmune encephalomyelitis, type 1 diabetes, ankylosing spondylitis (AS), and primary biliary cirrhosis (PBC), making modulation of TL1A-DR3 interaction a potential therapeutic target. 114
31240 276926 cd13421 TNFRSF_EDAR Tumor necrosis factor receptor superfamily member ectodysplasin A receptor (EDAR). Ectodysplasin A receptor (EDAR, also known as DL, ED3, ED5, ED1R, EDA3, HRM1, EDA1R, ECTD10A, ECTD10B, EDA-A1R) binds the soluble ligand ectodysplasin A and can activate the nuclear factor-kappaB, JNK, and caspase-independent cell death pathways. It is required for the development of hair, teeth, and other ectodermal derivatives. Mutations in this gene result in autosomal dominant and recessive forms of hypohidrotic ectodermal dysplasia. Patients present defects in the development of ectoderm-derived structures resulting in sparse hair, too few teeth (oligodontia), the absence or reduction in the ability to sweat as well as problems with mucous and saliva and the production and formation of pigment cells. 136
31241 276927 cd13422 TNFRSF5_teleost Tumor necrosis factor receptor superfamily member 5 (TNFRSF5) in teleosts; also known as CD40. TNFRSF5 (commonly known as CD40 and also as CDW40, p50, Bp50) is widely expressed in diverse cell types including B lymphocytes, dendritic cells, platelets, monocytes, endothelial cells, and fibroblasts. It is essential in mediating a wide variety of immune and inflammatory responses, including T cell-dependent immunoglobulin class switching, memory B cell development, and germinal center formation. Its natural immunomodulating ligand is CD40L, and a primary defect in the CD40/CD40L system is associated with X-linked hyper-IgM (XHIM) syndrome. It is also involved in tumorigenesis; CD40 expression is significantly higher in gastric carcinomas and it is associated with the lymphatic metastasis of cancer cells and their tumor node metastasis (TNM) classification. Upregulated levels of CD40/CD40L on B cells and T cells may play an important role in the immune pathogenesis of breast cancer. Consequently, the CD40/CD40L system serves as a link between tumorigenesis, atherosclerosis, and the immune system, and offers a potential target for drug therapy for related diseases, such as cancer, atherosclerosis, diabetes mellitus, and immunological rejection. Salmon CD40 and CD40L are widely expressed, particularly in immune tissues, and their importance for the immune response is indicated by their relatively high expression in salmon lymphoid organs and gills. 161
31242 276928 cd13423 TNFRSF6_teleost Tumor necrosis factor receptor superfamily member 6 (TNFRSF6) in teleosts; also known as fas cell surface death receptor (FasR). This subfamily of TNFRSF6 (also known as fas cell surface death receptor (FasR) or Fas; APT1; CD95; FAS1; APO-1; FASTM; ALPS1A) is found in teleosts. It contains a death domain and plays a central role in the physiological regulation of programmed cell death. In humans, it has been implicated in the pathogenesis of various malignancies and diseases of the immune system. The receptor interacts with the Fas ligand (FasL), allowing the formation of a death-inducing signaling complex that includes Fas-associated death domain protein (FADD), caspase 8, and caspase 10; autoproteolytic processing of the caspases in the complex triggers a downstream caspase cascade, leading to apoptosis. This receptor has also been shown to activate NF-kappaB, MAPK3/ERK1, and MAPK8/JNK, and is involved in transducing the proliferating signals in normal diploid fibroblast and T cells. In channel catfish and the Japanese rice fish, medaka, homologs of Fas receptor (FasR), as well as FADD and caspase 8, have been identified and characterized, and likely constitute the teleost equivalent of the death-inducing signaling complex (DISC). FasL/FasR are involved in the initiation of apoptosis and suggest that mechanisms of cell-mediated cytotoxicity in teleosts are similar to those used by mammals; presumably, the mechanism of apoptosis induction via death receptors was evolutionarily established during the appearance of vertebrates. 103
31243 276929 cd13424 TNFRSF9_teleost Tumor necrosis factor receptor superfamily member 9 (TNFRSF9) in teleosts; also known as CD137. This subfamily of TNFRSF9 (also known as CD137, ILA, 4-1BB) is found in teleosts. CD137 plays a role in the immunobiology of human cancer where it is preferentially expressed on the tumor-reactive subset of tumor-infiltrating lymphocytes. It can be expressed by activated T cells, but to a larger extent on CD8 than on CD4 T cells. In addition, CD137 expression is found on dendritic cells, follicular dendritic cells, natural killer cells, granulocytes and cells of blood vessel walls at sites of inflammation. It transduces signals that lead to the activation of NF-kappaB, mediated by the TRAF adaptor proteins. CD137 contributes to the clonal expansion, survival, and development of T cells. It can also induce proliferation in peripheral monocytes, enhance T cell apoptosis induced by TCR/CD3 triggered activation, and regulate CD28 co-stimulation to promote Th1 cell responses. CD137 is modulated by SAHA treatment in breast cancer cells, suggesting that the combination of SAHA with this receptor could be a new therapeutic approach for the treatment of tumors. CD137 in teleosts remains largely uncharacterized. 150
31244 240448 cd13425 Peptidase_G1_like Peptidases of the G1 family and homologs that might lack peptidase activity. Some members of this family had been classified earlier as carboxyl peptidases insensitive to pepstatin, and the family has also been called the eqolisin family, due to the fact that the conserved catalytic dyad of the family consists of a glutamate (E) and glutamine (Q) residue. The family is found in fungi and bacteria. This family also includes homologous uncharacterized proteins that might lack peptidase activity. 195
31245 240449 cd13426 Peptidase_G1 Peptidases of the G1 family, including scytalidoglutamic peptidase and aspergillopepsin. Some members of this family had been classified earlier as carboxyl peptidases insensitive to pepstatin, and the family has also been called the eqolisin family, due to the fact that the conserved catalytic dyad of the family consists of a glutamate (E) and glutamine (Q) residue. The family is found in fungi and bacteria. 206
31246 240450 cd13427 YncM_like Uncharacterized proteins similar to Bacillus subtilis YncM. Members of this family share close structural similarity with peptidases of the Peptidase_G1 family and may be homologous. They do not appear to share the peptidases' active site, though a bound sulfate ion in the single available structure suggests a functional site at a matching location. 204
31247 259832 cd13428 UreI_AmiS UreI/AmiS family, proton-gated urea channel and putative amide transporters. This subfamily includes UreI proton-gated urea channels as well as putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity. 162
31248 259833 cd13429 UreI_AmiS_like_2 UreI/AmiS family, subgroup 2. Putative transporters related to proton-gated urea channel and putative amide transporters. This subfamily includes putative UreI proton-gated urea channels and putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity. 165
31249 240445 cd13430 LDT_IgD_like IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain. 98
31250 240446 cd13431 LDT_IgD_like_1 IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the first (N-terminal) repeat in LdtMt2 and related proteins. 95
31251 240447 cd13432 LDT_IgD_like_2 IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the repeat adjacent to the catalytic domain. 99
31252 240441 cd13433 Na_channel_gate Inactivation gate of the voltage-gated sodium channel alpha subunits. This region is part of the intracellular linker between domains III and IV of the alpha subunits of voltage-gated sodium channels. It is responsible for fast inactivation of the channel and essential for proper physiological function. 54
31253 259812 cd13434 SPFH_SLPs Stomatin-like proteins (slipins) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes proteins similar to stomatin, podocin, and other members of the stomatin-like protein family (SLPs or slipins). The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome. Bacterial and archaebacterial SLPs and many of the eukaryotic family members remain uncharacterized. 108
31254 259813 cd13435 SPFH_SLP-4 Slipin-4 (SLP-4), an uncharacterized subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in arthropods. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Members of this divergent slipin subgroup remain largely uncharacterized. It contains Drosophila Mec2, the gene for which was identified in a screen for genes required for nephrocyte function; it may function together with Sns in maintaining nephrocyte diaphragm. 208
31255 259814 cd13436 SPFH_SLP-1 Stomatin-like protein 1 (SLP-1), a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in animals. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. The family contains human SLP-1, which has been found to be expressed in the brain, and Caenorhabditis elegans UNC-24, which is a lipid raft-associated protein required for normal locomotion. It may mediate the correct localization of UNC-1. Mutations in the unc-24 gene result in abnormal motion and altered patterns of sensitivity to volatile anesthetics. 131
31256 259815 cd13437 SPFH_alloslipin Alloslipin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in some eukaryotes and viruses. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. This diverse subgroup of the SLPs remains largely uncharacterized. 222
31257 259816 cd13438 SPFH_eoslipins_u2 Uncharacterized prokaryotic subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial SLPs remain uncharacterized. 215
31258 240442 cd13439 CamS_repeat Repeat domain of CamS sex pheromone cAM373 precursor and related proteins. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria and an uncharacterized protein with a single repeat from Desulfovibrio piger, which is structurally similar and might be homologous. 106
31259 240443 cd13440 CamS_repeat_2 C-terminal repeat domain of CamS sex pheromone cAM373 precursor. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria. 115
31260 240444 cd13441 CamS_repeat_1 N-terminal repeat domain of CamS sex pheromone cAM373 precursor. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria. 204
31261 412041 cd13442 CDI_toxin_Bp1026b-like C-terminal (CT) toxin domain of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei 1026b, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular example from Burkholderia pseudomallei 1026b and other bacteria appears to function as a Mg2+-dependent RNAse cleaving tRNA, most likely in the aminoacyl acceptor stem. This CdiA-Ct is structurally similar to another CDI toxin domain from B. pseudomallei E479 which is unrelated in sequence but has a similar nuclease domain, and shares similar fold and active-site architecture; it contains a core alpha/beta-fold that is characteristic of PD(D/E)XK superfamily nucleases. 129
31262 259836 cd13443 CDI_inhibitor_Bp1026b_like Inhibitor of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei 1026b, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor of the CdiA effector protein from Burkholderia pseudomallei 1026b (which is a tRNAse). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. The inhibitors are intracellular proteins that inactivate the toxin/effector protein. 100
31263 259837 cd13444 CDI_toxin_EC869_like Zn-dependent DNAse of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring bacteria. This model represents the C-terminal toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. A wide variety of C-terminal toxin domains appear to exist; this particular example from Escherichia coli EC869 and other bacteria appears to function as a Zn2+-dependent DNAse degrading the genome of target cells. 143
31264 259838 cd13445 CDI_inhibitor_EC869_like Inhibitor of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring bacteria. This model represents the inhibitor of the CdiA effector protein from Escherichia coli EC869 (which is a DNAse). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. The inhibitors are intracellular proteins that inactivate the toxin/effector protein. This domain is also known as DUF1436. 157
31265 259825 cd13516 HHD_CCM2 harmonin-homology domain (harmonin_N_like domain) of malcavernin (CCM2). CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM) along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signaling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours. CCM2 possesses an N-terminal PTB domain. The C-terminal domain of malcavernin, which is represented here, appears similar to the N-terminal domain of the scaffolding protein harmonin. It has also been referred to as the Karet domain. 97
31266 270235 cd13517 PBP2_ModA3_like Substrate binding domain of molybdate binding protein-like (ModA3), a member of the type 2 periplasmic binding fold superfamily. This subfamily contains molybdate binding protein-like (ModA3) domain of an ABC-type transporter. Molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea. ModA transporters import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has shown a binding site for molybdate and tungstate where the central metal atom is in a hexacoordinate configuration. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 223
31267 270236 cd13518 PBP2_Fe3_thiamine_like Substrate binding domain of iron and thiamine transporters-like, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. On the other hand, thiamin is an essential cofactor in all living systems. Thiamin diphosphate (ThDP)-dependent enzymes play an important role in carbohydrate and branched-chain amino acid metabolism. Most prokaryotes, plants, and fungi can synthesize thiamin, but it is not synthesized in vertebrates. These periplasmic domains have high affinities for their respective substrates and serve as the primary receptor for transport. After binding iron and thiamine with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The iron- and thiamine-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 260
31268 270237 cd13519 PBP2_PEB3_AcfC Ligand-binding domain of a glycoprotein adhesin and an accessory colonization factor, a member of the type 2 periplasmic binding fold superfamily. PEB3 is a glycoprotein adhesin from Campylobacter jejuni whose structure suggests a functional role in transport, and resembles PEB1a, an Asp/Glu transporter and an adhesin. The overall structure of PEB3 is a dimer and is similar to that of other type 2 periplasmic transport proteins such as the molybdate/tungstate, sulfate, and ferric iron transporters. PEB3 has high sequence identity to Paa, an Escherichia coli adhesin, and to AcfC, an accessory colonization factor from Vibrio cholerae. 227
31269 270238 cd13520 PBP2_TAXI_TRAP Substrate binding domain of TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This group includes Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family and closely related proteins. TRAP transporters are ubiquitous in prokaryotes, but absent from eukaryotes. They are comprised of an SBP (substrate-binding protein) of the DctP or TAXI families and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. The substrate-binding domain of TAXI proteins belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 285
31270 270239 cd13521 PBP2_AlgQ_like Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This family represents the periplasmic-binding component of high molecular weight (HMW) alginate uptake system found in gram-negative soil bacteria and related proteins. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. In Sphingomonas sp. A1, the transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins AlgQ1 and AlgQ2. Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 483
31271 270240 cd13522 PBP2_ABC_oligosaccharides The periplasmic-binding component of ABC transport systems specific for maltose and related oligosaccharides; possess type 2 periplasmic binding fold. This family represents the periplasmic binding component of ABC transport systems involved in uptake of oligosaccharides including maltose, trehalose, maltodextrin, and cyclodextrin. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 368
31272 270241 cd13523 PBP2_polyamines The periplasmic-binding component of ABC transporters involved in uptake of polyamines; possess the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding proteins that function as the primary high-affinity receptors of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 268
31273 270242 cd13524 PBP2_Thiaminase_I Thiaminase-I has high structural homology to the type 2 periplasmic binding proteins of active transport systems. Thiaminase-I, a thiamin-(vitamin B1) degrading enzyme, is a monomer in its biologically active form, with two distinct globular domains (N- and C-domains) separated by a deep groove. It has a structural topology similar to the periplasmic substrate-domains of ABC-type transport systems, such as thiamin-binding protein (TbpA), that possess the type 2 periplasmic binding protein fold. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 363
31274 270243 cd13525 PBP2_ATP-Prtase_HisG The catalytic domain of ATP phosphoribosyltransferase contains the type 2 periplasmic substrate-binding fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 208
31275 270244 cd13526 PBP2_lipoprotein_MetQ_like The periplasmic-binding component of ABC-type methionine uptake transporter system and its related lipoproteins; the type 2 periplasmic-binding protein fold. This family represents the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ) and its related homologs. Members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 228
31276 270245 cd13527 PBP2_TRAP Substrate-binding component of Tripartite ATP-independent Periplasmic transporters and related proteins; contains the type 2 periplasmic-binding protein fold. This family represents the TRAP Transporters that are specific to various ligands, including sialic acid (N-acetyl neuraminic acid), glutamate, ectoine, xylulose, C4-dicarboxylates such as succinate, malate and fumarate, and keto acids such as pyruvate and alpha-ketobutyrate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This family also includes some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 301
31277 270246 cd13528 PBP2_osmoprotectants Substrate-binding domain of osmoregulatory ABC-type transporters; the type 2 periplasmic-binding protein fold. This family represents the periplasmic substrate-binding component of ABC transport systems that are involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. To counteract the efflux of water, bacteria and archaea accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 264
31278 270247 cd13529 PBP2_transferrin Transferrin family of the type 2 periplasmic-binding protein superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helical and beta sheet domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 298
31279 270248 cd13530 PBP2_peptides_like Peptide-binding protein and related homologs; type 2 periplasmic binding protein fold. This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating. The PBP2 proteins share the same architecture as periplasmic binding proteins type 1, but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction. 217
31280 270249 cd13531 PBP2_MxaJ Methanol oxidation system protein MoxJ; the type 2 periplasmic binding fold. This predicted periplasmic protein, called MoxJ or MxaJ, is required for methanol oxidation in Methylobacterium extorquens. Homology suggests it is the substrate-binding protein of an ABC transporter associated with methanol oxidation. Other evidence also suggests that MoxJ is an accessory factor or additional subunit of methanol dehydrogenase itself. Mutational studies show a dependence on this protein for expression of the PQQ-dependent, two-subunit methanol dehydrogenase (MxaF and MxaI) in Methylobacterium extorquens, as if it is a chaperone for enzyme assembly or a third subunit. A homologous N-terminal sequence was found in Paracoccus denitrificans as a 32Kd third subunit. MoxJ may be both, a component of a periplasmic enzyme that converts methanol to formaldehyde and a component of an ABC transporter that delivers the resulting formaldehyde to the cell's interior. 242
31281 270250 cd13532 PBP2_PDT_like Catalytic domain of prephenate dehydratase and similar proteins; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 184
31282 270251 cd13533 PBP2_Yhfz Substrate-binding domain of uncharacterized protein Yhfz from Shigella flexneri; the type 2 periplasmic-binding protein fold. This subfamily contains periplasmic binding protein type II (PBPII). This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating. The PBPII proteins share the same architecture as periplasmic binding proteins type I (PBPI), but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBPII proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction. 222
31283 270252 cd13534 PBP2_MqnD_like Menaquinone biosynthetic enzyme and related hypothetical proteins; the type 2 periplasmic-binding protein fold. This family represents MqnD, an enzyme within the alternative menaquinone biosynthetic pathway, and related conserved hypothetical proteins. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. The members include Ttha1568, MqnD from Thermus thermophilus HB8, and the conserved hypothetical proteins SCO4506 from Streptomyces coelicolor, Af1704 from Archaeoglobus DSM 4304, Dr0370 from Deinococcus radiodurans, and Ca3427 from Candida albicans. They all have significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 261
31284 270253 cd13535 PBP2_Osm_BCP_like Substrate binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system and related proteins; the type 2 periplasmic binding protein fold. This family is part of a high affinity multicomponent binding-protein-dependent ATP-binding cassette transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as betaines, choline, and L-proline. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 277
31285 270254 cd13536 PBP2_EcModA Substrate binding domain of ModA from Escherichia coli and its closest homologs;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 227
31286 270255 cd13537 PBP2_YvgL_like Substrate binding domain of putative molybdate-binding protein YvgL and similar proteins;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate and tungstate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 225
31287 270256 cd13538 PBP2_ModA_like_1 Substrate binding domain of putative molybdate-binding protein;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. Molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea. After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 230
31288 270257 cd13539 PBP2_AvModA Substrate binding domain of ModA/WtpA from Azotobacter vinelandii and its closest homologs;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has shown that a binding site for molybdate and tungstate is where the central metal atom is in a hexacoordinate configuration. This octahedral geometry was rather unexpected. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 226
31289 270258 cd13540 PBP2_ModA_WtpA Substrate binding domain of ModA/WtpA from Pyrococcus furiosus and its closest homologs;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has shown a binding site for molybdate and tungstate in which the central metal atom is in a hexacoordinate configuration. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 263
31290 270259 cd13541 PBP2_ModA_like_2 Substrate binding domain of molybdate-binding proteins;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate and tungstate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 238
31291 270260 cd13542 PBP2_FutA1_ilke Substrate binding domain of ferric iron-binding protein, a member of the type 2 periplasmic binding fold superfamily. FutA1 is the periplasmic component of an ABC-type iron transporter and serves as the primary receptor in Synechocystis species. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria and is critical for survival of these pathogens within the host. After binding iron with high affinity, FutA1 interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The iron- and thiamine-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 314
31292 270261 cd13543 PBP2_Fbp Substrate binding domain of ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic protein (Fbp) has high affinities for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 306
31293 270262 cd13544 PBP2_Fbp_like_1 Substrate binding domain of a putative ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 292
31294 270263 cd13545 PBP2_TbpA Substrate binding domain of thiamin transporter, a member of the type 2 periplasmic binding fold superfamily. Thiamin-binding protein TbpA is the periplasmic component of ABC-type transporter in E. coli, while the transmembrane permease and ATPase are ThiP and ThiQ, respectively. Thiamin (vitamin B1) is an essential cofactor in all living systems; most prokaryotes, plants, and fungi can synthesize thiamin. However, in vertebrates, thiamine cannot be synthesized and must therefore be obtained through dietary absorption. In addition to thiamin biosynthesis, most organisms can import thiamin using specific transporters. After binding thiamine with high affinity, TbpA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The thiamine-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 269
31295 270264 cd13546 PBP2_BitB Substrate binding domain of a putative iron transporter BitB, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 258
31296 270265 cd13547 PBP2_Fbp_like_2 Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 259
31297 270266 cd13548 PBP2_AEPn_like Substrate binding domain of a putative 2-aminoethylphosphonate-binding transporter, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 310
31298 270267 cd13549 PBP2_Fbp_like_3 Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 263
31299 270268 cd13550 PBP2_Fbp_like_4 Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 265
31300 270269 cd13551 PBP2_Fbp_like_5 Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 267
31301 270270 cd13552 PBP2_Fbp_like_6 Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 266
31302 270271 cd13553 PBP2_NrtA_CpmA_like Substrate binding domain of ABC-type nitrate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes nitrate (NrtA) and bicarbonate (CmpA) receptors. These domains are found in eubacterial periplasmic-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. These binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 212
31303 270272 cd13554 PBP2_DszB Substrate binding domain of 2'-hydroxybiphenyl-2-sulfinate desulfinase, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes DszB, which converts 2'-hydroxybiphenyl-2-sulfinate to 2-hydroxybiphenyl and sulfinate at the rate-limiting step of the microbial dibenzothiophene desulfurization pathway. The overall fold of DszB is highly similar to those of periplasmic substrate-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates. After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The DszB protein belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 246
31304 270273 cd13555 PBP2_sulfate_ester_like Sulfate ester binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 268
31305 270274 cd13556 PBP2_SsuA_like_1 Substrate binding domain of putative sulfonate binding protein, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 265
31306 270275 cd13557 PBP2_SsuA Substrate binding domain of sulfonate binding protein, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the sulfonate binding domains SsuA found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 275
31307 270276 cd13558 PBP2_SsuA_like_2 Putative substrate binding domain of sulfonate binding protein, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 267
31308 270277 cd13559 PBP2_SsuA_like_3 Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 258
31309 270278 cd13560 PBP2_taurine Taurine-binding periplasmic protein; the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 218
31310 270279 cd13561 PBP2_SsuA_like_4 Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 212
31311 270280 cd13562 PBP2_SsuA_like_5 Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes sulfonate binding domains found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 215
31312 270281 cd13563 PBP2_SsuA_like_6 Putative substrate binding domain of sulfonate binding protein-like, a member of the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 208
31313 270282 cd13564 PBP2_ThiY_THI5_like Substrate binding domain of ABC-type transporter for thiamin biosynthetic pathway intermediates and similar proteins; the type 2 periplasmic binding protein fold. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into the cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 214
31314 270283 cd13565 PBP2_PstS Substrate binding domain of ABC-type phosphate transporter, a member of the type 2 periplasmic-binding fold superfamily. This subfamily contains the phosphate binding domain found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 254
31315 270284 cd13566 PBP2_phosphate Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 245
31316 270285 cd13567 PBP2_TtGluBP Substrate binding domain of Thermus thermophilus GluBP (TtGluBP) of TAXI family of the tripartite ATP-independent periplasmic transporters; contains the type 2 periplasmic binding protein fold. This subgroup includes TtGluBP of TAXI-TRAP family and closely related proteins. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. 284
31317 270286 cd13568 PBP2_TAXI_TRAP_like_3 Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. 289
31318 270287 cd13569 PBP2_TAXI_TRAP_like_1 Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. 283
31319 270288 cd13570 PBP2_TAXI_TRAP_like_2 Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. 281
31320 270289 cd13571 PBP2_PnhD_1 Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding components of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 253
31321 270290 cd13572 PBP2_PnhD_2 Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 249
31322 270291 cd13573 PBP2_PnhD_3 Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 253
31323 270292 cd13574 PBP2_PnhD_4 Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 250
31324 270293 cd13575 PBP2_PnhD Substrate binding domain of ABC-type phosphonate uptake system; contains the type 2 periplasmic binding fold. This subfamily includes the Escherichia coli PhnD (EcPhnD) which exhibits high affinity for the environmentally abundant 2-aminoethylphosphonate (2-AEP), a precursor in the biosynthesis of phosphonolipids, phosphonoproteins, and phosphonoglycans. The Escherichia coli phn operon encodes 14 genes involved in binding, uptake and metabolism of phosphonate, and is activated under phosphate-limiting conditions. PhnD belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. The PBP2 have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. PhnD is the periplasmic binding component of an ABC-type phosphonate uptake system (PhnCDE) that recognizes and binds phosphonate. 259
31325 270294 cd13576 PBP2_BugD_Asp Aspartic acid transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter. 294
31326 270295 cd13577 PBP2_BugE_Glu Glutamate transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter. 292
31327 270296 cd13578 PBP2_Bug27 Aromatic solutes transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. Bug27 binds the non-carboxylated solute nicotinamide, in contrast to BugD (aspartic acid transporter) and BugE (glutamate transporter) which both bind aliphatic carboxylated ligands. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. 291
31328 270297 cd13579 PBP2_Bug_NagM Uncharacterized NagM-like protein of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter. 292
31329 270298 cd13580 PBP2_AlgQ_like_1 Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of alpha-L-mannuronate and its 5'-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacteria genera Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 471
31330 270299 cd13581 PBP2_AlgQ_like_2 Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of alpha-L-mannuronate and its 5'-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacteria genera Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 490
31331 270300 cd13582 PBP2_AlgQ_like_3 Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of alpha-L-mannuronate and its 5'-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacteria genera Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 504
31332 270301 cd13583 PBP2_AlgQ_like_4 Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of alpha-L-mannuronate and its 5'-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacteria genera Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 478
31333 270302 cd13584 PBP2_AlgQ1_2 Periplasmic-binding component of alginate-specific ABC uptake system; contains the type 2 periplasmic binding fold. This group represents the periplasmic-binding component of high molecular weight (HMW) alginate uptake system found in gram-negative soil bacteria such as Sphingomonas sp. A1. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that includes alpha-L-mannuronate and its 5'-epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacteria genera Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide. 481
31334 270303 cd13585 PBP2_TMBP_like The periplasmic-binding component of ABC transport systems specific for trehalose/maltose and similar oligosaccharides; possess type 2 periplasmic binding fold. This family includes the periplasmic trehalose/maltose-binding component of an ABC transport system and related proteins from archaea and bacteria. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 383
31335 270304 cd13586 PBP2_Maltose_binding_like The periplasmic-binding component of ABC transport systems specific for maltose and related polysaccharides; possess type 2 periplasmic binding fold. This subfamily represents the periplasmic binding component of ABC transport systems involved in uptake of polysaccharides including maltose, maltodextrin, and cyclodextrin. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 367
31336 270305 cd13587 PBP2_polyamine_2 The periplasmic-binding component of an uncharacterized ABC transporter involved in uptake of polyamines; contains the type 2 periplasmic binding fold. This family represents the periplasmic binding domain that functions as the primary polyamine receptor of an uncharacterized ABC-type transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 292
31337 270306 cd13588 PBP2_polyamine_1 The periplasmic-binding component of an uncharacterized ABC transporter involved in uptake of polyamines; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that functions as the primary high-affinity receptor of an uncharacterized ABC-type polyamine transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 279
31338 270307 cd13589 PBP2_polyamine_RpCGA009 The periplasmic-binding component of an uncharacterized ABC transport system from Rhodopseudomonas palustris CGA009 and related proteins; contains the type 2 periplasmic-binding fold. This group represents the periplasmic binding domain that serves as the primary high-affinity receptor of an uncharacterized ABC-type polyamine transporter from Rhodopseudomonas palustris CGA009 and related proteins from other bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 268
31339 270308 cd13590 PBP2_PotD_PotF_like The periplasmic-binding component of ABC transporters involved in uptake of polyamines; possess the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain that functions as the primary high-affinity receptors of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 315
31340 270309 cd13591 PBP2_HisGL1 The catalytic domain of hexameric long form HisGL1; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 204
31341 270310 cd13592 PBP2_HisGL2 The catalytic domain of hexameric long form HisGL2; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 208
31342 270311 cd13593 PBP2_HisGL3 The catalytic domain of hexameric long form HisGL3; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 220
31343 270312 cd13594 PBP2_HisGL4 The catalytic domain of hexameric long form HisGL4; contains the type 2 periplasmic binding fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 207
31344 270313 cd13595 PBP2_HisGs The catalytic domain of hetero-octameric short form HisGs; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 205
31345 270314 cd13596 PBP2_lipoprotein_GmpC The periplasmic substrate-binding domain of the membrane-associated lipoprotein-9 GmpC; contains the type 2 periplasmic-binding protein fold. This group includes the membrane-associated lipoprotein-9 from Staphylococcus aureus that binds the dipeptide glycylmethionine (GlyMet). The lipoprotein-9 has both structural and sequential homology to the MetQ family of substrate-binding protein. The GlyMet binding protein belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 230
31346 270315 cd13597 PBP2_lipoprotein_Tp32 The substrate-binding domain of the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum binds L-methionine; the type 2 periplasmic-binding protein fold. This group includes the lipoprotein Tp32, a periplasmic component of a methionine uptake transporter system, and its closely related homologs. The Tp32 has both structural and sequential homology to the MetQ family of substrate-binding protein, and thus it belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 236
31347 270316 cd13598 PBP2_lipoprotein_IlpA_like Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus and similar lipoproteins; the type 2 periplasmic binding protein fold. This group includes the IlpA protein which has both structural and sequential homology to the MetQ family of substrate-binding protein, and thus belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 227
31348 270317 cd13599 PBP2_lipoprotein_Gna1946 The membrane-associated lipoprotein Gna1946 from Neisseria meningitidis; the type 2 periplasmic binding protein fold. Gna1946 shares significant structural and sequence homology with the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ). The members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 228
31349 270318 cd13600 PBP2_lipoprotein_like_1 Putative periplasmic-binding component of ABC-type methionine uptake transporter system-like; the type 2 periplasmic binding protein fold. This subgroup shares significant sequence homology with the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ). The members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. 228
31350 270319 cd13601 PBP2_TRAP_DctP1_3_4_like Periplasmic substrate-binding component of uncharacterized TRAP-type C4-dicarboxylate transporter subfamilies; the type 2 periplasmic-binding protein fold. This model includes uncharacterized DctP subfamilies of the TRAP Transporters. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 302
31351 270320 cd13602 PBP2_TRAP_BpDctp6_7 Substrate-binding domain of a pyroglutamic acid binding DctP subfamily of the tripartite ATP-independent periplasmic transporters; contains the type 2 periplasmic binding protein fold. This model includes the DctP6 and DctP7 groups of the TRAP transporters that are involved in pyroglutamic acid transport. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 300
31352 270321 cd13603 PBP2_TRAP_Siap_TeaA_like Substrate-binding domain of a sialic acid binding Tripartite ATP-independent Periplasmic transport system (SiaP) and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic-binding component of TRAP transport systems such as SiaP (a sialic acid binding virulence factor), TeaA (an ectoine binding protein), and an uncharacterized TM0322 from hyperthermophilic bacterium Thermotoga maritima. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 297
31353 270322 cd13604 PBP2_TRAP_ketoacid_lactate_like Substrate-binding domain of an alpha-keto acid binding Tripartite ATP-independent Periplasmic transporter and related proteins; the type 2 periplasmic-binding protein fold. This family constitutes TRAP transporters that bind to ketoacids such as pyruvate and alpha-ketobutyrate, xylulose, and other unknown ligands. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 306
31354 270323 cd13605 PBP2_TRAP_DctP_like_2 Substrate-binding component of uncharacterized Tripartite ATP-independent Periplasmic transporter; the type 2 periplasmic-binding protein fold. This family represents the TRAP Transporters that are specific to various ligands, including sialic acid (N-acetyl neuraminic acid), glutamate, ectoine, xylulose, C4-dicarboxylates such as succinate, malate and fumarate, and keto acids such as pyruvate and alpha-ketobutyrate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 303
31355 270324 cd13606 PBP2_ProX_like Bacterial substrate-binding protein ProX of ABC-type osmoregulated transporter and its related proteins; the type 2 periplasmic-binding protein fold. This group includes periplasmic substrate-binding component of ABC transport systems from gram-negative and -positive bacteria that are involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 260
31356 270325 cd13607 PBP2_AfProX_like Substrate-binding protein ProX of ABC-type osmoregulatory transporter from Archaeoglobus fulgidus and its related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic substrate-binding protein ProX from the hyperthermophilic archaeon Archaeoglobus fulgidus and its related proteins. AfProX is involved in uptake of compatible solutes such as the trimethylammonium compound glycine betaine and the dimethylammonium compound proline betaine, but the relative substrate preference is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. AfProX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 261
31357 270326 cd13608 PBP2_OpuCC_like Substrate-binding protein OpuCC of ABC-type osmoregulatory transporter and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic substrate-binding protein OpuCC of the ABC transporter OpuC (where Opu is osmoprotectant uptake), which can recognize a broad spectrum of compatible solutes, and its paralog OpuBC that can solely bind choline. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 265
31358 270327 cd13609 PBP2_Opu_like_1 Substrate-binding domain of putative ABC-type osmoprotectant uptake system; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 263
31359 270328 cd13610 PBP2_ChoS Substrate-binding domain ChoS of an osmoregulated ABC-type transporter and related proteins; type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ChoS of Lactococcus lactis is predicted to be involved in uptake of compatible solutes such as choline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ChoS belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 264
31360 270329 cd13611 PBP2_YehZ Substrate-binding domain YehZ of an osmoregulated ABC-type transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein YehZ of Clostridium sticklandii is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. YehZ belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 267
31361 270330 cd13612 PBP2_ProWX Substrate-binding protein ProWX of ABC-type osmoregulated transporter and its related proteins; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ProWX of Helicobacter pylori is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ProWX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 267
31362 270331 cd13613 PBP2_Opu_like_2 Substrate-binding domain of putative ABC-type osmoprotectant uptake system; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 264
31363 270332 cd13614 PBP2_QAT_like Substrate-binding domain of quaternary amine ABC-type transporter; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative quaternary amine ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 264
31364 270333 cd13615 PBP2_ProWY Substrate-binding domain of ABC-type osmoregulated transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ProWY of Streptococcus thermophilus is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ProWY belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 262
31365 270334 cd13616 PBP2_OsmF Substrate-binding domain OsmF of an osmoregulated ABC-type transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein OsmF of an ABC transporter (YehZYXW) from Escherichia coli is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. OsmF belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 274
31366 270335 cd13617 PBP2_transferrin_C The C-lobe of transferrin, a member of the type 2 periplasmic binding protein fold superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helices and beta sheets domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 331
31367 270336 cd13618 PBP2_transferrin_N The N-lobe of transferrin, a member of the type 2 periplasmic binding protein fold superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helices and beta sheets domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 324
31368 270337 cd13619 PBP2_GlnP Glutamine-binding domain of ABC transporter, a member of the type 2 periplasmic binding fold protein superfamily. Periplasmic glutamine binding domain GlnP serves as an initial receptor in the ABC transport of glutamine in eubacteria. GlnP belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 220
31369 270338 cd13620 PBP2_GltS Substrate binding domain of glutamate or arginine ABC transporter, a member of the type 2 periplasmic binding fold protein superfamily. This family comprises the periplasmic-binding protein component (GltS) of an ABC transporter specific for glutamate or arginine from Lactococcus lactis, as well as its closely related proteins. The GltS domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 227
31370 270339 cd13621 PBP2_AA_binding_like_3 Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 229
31371 270340 cd13622 PBP2_Arg_3 Substrate binding domain of an arginine 3rd transport system; the type 2 periplasmic binding fold. This subgroup is similar to the HisJ-like family that comprises the periplasmic substrate-binding proteins, including the lysine-, arginine-, ornithine-binding protein (LAO) and the histidine-binding protein (HisJ), which serve as initial receptors for active transport. HisJ and LAO proteins belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 222
31372 270341 cd13623 PBP2_AA_hypothetical Substrate-binding domain of putative amino-acid transport system; the type 2 periplasmic binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 220
31373 270342 cd13624 PBP2_Arg_Lys_His Substrate binding domain of the arginine-, lysine-, histidine-binding protein ArtJ; the type 2 periplasmic binding protein fold. This group includes the periplasmic substrate-binding protein ArtJ of the ATP-binding cassette (ABC) transport system from the thermophilic bacterium Geobacillus stearothermophilus, which is specific for arginine, lysine, and histidine. ArtJ belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 219
31374 270343 cd13625 PBP2_AA_binding_like_1 Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 230
31375 270344 cd13626 PBP2_Cystine_like Substrate binding domain of cystine ABC transporters; the type 2 periplasmic binding protein fold. Cystine-binding domain of periplasmic receptor-dependent ATP-binding cassette (ABC) transporters. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Also, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 219
31376 270345 cd13627 PBP2_AA_binding_like_2 Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 243
31377 270346 cd13628 PBP2_Ala Periplasmic substrate binding domain of ABC-type transporter specific to alanine; the type 2 periplasmic binding protein. This periplasmic substrate component serves as an initial receptor in the ABC transport of glutamine in eubacteria and archaea. After binding the alanine with high affinity, this domain interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. This alanine specific domain belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 219
31378 270347 cd13629 PBP2_Dsm1740 Amino acid-binding domain of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic binding protein type II (PBPII). This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating. The PBPII proteins share the same architecture as periplasmic binding proteins type I (PBPI), but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBPII proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction. 221
31379 270348 cd13630 PBP2_PDT_1 Catalytic domain of prephenate dehydratase and similar proteins, subgroup 1; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 180
31380 270349 cd13631 PBP2_Ct-PDT_like Catalytic domain of prephenate dehydratase from Chlorobium tepidum and similar proteins, subgroup 2; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 182
31381 270350 cd13632 PBP2_Aa-PDT_like Catalytic domain of prephenate dehydratase from Arthrobacter aurescens and similar proteins, subgroup 3; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 183
31382 270351 cd13633 PBP2_Sa-PDT_like Catalytic domain of prephenate dehydratase from Staphylococcus aureus and similar proteins, subgroup 4; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 184
31383 270352 cd13634 PBP2_Sco4506 The conserved hypothetical protein SCO4506 exhibits the type 2 periplasmic-binding protein fold. This group includes the SCO4506 protein from Streptomyces coelicolor and related hypothetical proteins. SCO4506 is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. SCO4506 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 256
31384 270353 cd13635 PBP2_Ttha1568_Mqnd A menaquinone biosynthetic enzyme exhibits the type 2 periplasmic-binding protein fold. This group includes Ttha1568 (MqnD) from Thermus thermophilus HB8, an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Ttha1568 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 260
31385 270354 cd13636 PBP2_Af1704 The conserved hypothetical protein Af1704 exhibits the type 2 periplasmic-binding protein fold. This group includes the Af1704 protein from Archaeoglobus fulgidus DSM 4304, which is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Af1704 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 259
31386 270355 cd13637 PBP2_Ca3427_like The conserved hypothetical protein Ca3427 exhibits the type 2 periplasmic-binding protein fold. This group includes the Ca3427 protein from Candida albicans, which is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8, and other related hypothetical proteins. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Ca3427 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 273
31387 270356 cd13638 PBP2_EcProx_like Substrate binding domain of Escherichia coli betaine transport system-like; the type 2 periplasmic binding protein fold. This group includes the periplasmic substrate-binding protein ProX. ProX from the Escherichia coli ATP-binding cassette transport system ProU binds the compatible solutes glycine betaine and proline betaine with high affinity and specificity. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. The ProX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 299
31388 270357 cd13639 PBP2_OpuAC_like Substrate binding domain of Lactococcus lactis ABC-type transporter OpuA and related proteins; the type 2 periplasmic binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to betaine compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine and proline betaine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 254
31389 270358 cd13640 PBP2_ChoX Substrate binding domain of ABC-type choline transport system; the type 2 periplasmic binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to choline and acetylcholine for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as choline and betaines. Choline is necessary for the biosynthesis of glycine betaine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. In the case of the Sinorhizobium meliloti choline uptake system ChoVWX, ChoV is the nucleotide-binding domain that provides energy for the transport process via ATP hydrolysis, ChoW is the integral transmembrane protein that forms the substrate translocation pathway, and ChoX is the substrate-binding domain. ChoX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 266
31390 270359 cd13641 PBP2_HisX_like Substrate-binding domain of ABC-type histidine transporter involved in betaine and proline uptake; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 261
31391 270360 cd13642 PBP2_BCP_1 Substrate-binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system-like; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 292
31392 270361 cd13643 PBP2_BCP_2 Substrate-binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system-like; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 283
31393 270362 cd13644 PBP2_HemC_archaea Archaeal HemC of hydroxymethylbilane synthase family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 273
31394 270363 cd13645 PBP2_HuPBGD_like Human porphobilinogen deaminase possesses the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of human PBGD and its closely related proteins. Mutations in human PBGD cause AIP (acute intermittent porphyria), an inherited autosomal dominant disorder. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 282
31395 270364 cd13646 PBP2_EcHMBS_like cd00494. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of Escherichia coli HMBS and its closely related proteins. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 274
31396 270365 cd13647 PBP2_PBGD_2 An uncharacterized subgroup of the PBGD family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 282
31397 270366 cd13648 PBP2_PBGD_1 An uncharacterized subgroup of the PBGD family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. 278
31398 270367 cd13649 PBP2_Cae31940 Substrate binding domain of an uncharacterized protein similar to ABC-type transporter for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic-binding protein Cae31940 which is phylogenetically similar to the ThiY/THI5 family. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 223
31399 270368 cd13650 PBP2_THI5 Substrate binding domain of ABC-type transporters for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport, as well as THI5 which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 251
31400 270369 cd13651 PBP2_ThiY Substrate binding domain of ABC-type transporters for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport, as well as THI5 which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 214
31401 270370 cd13652 PBP2_ThiY_THI5_like_1 Putative substrate binding domain of an ABC-type transporter similar to ThiY/THI5; the type 2 periplasmic binding protein fold. This subfamily is phylogenetically similar to ThiY, which is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 217
31402 270371 cd13653 PBP2_phosphate_like_1 Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 240
31403 270372 cd13654 PBP2_phosphate_like_2 Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 259
31404 270373 cd13655 PBP2_oligosaccharide_1 The periplasmic binding component of ABC transport system specific for an unknown oligosaccharide; possesses the type 2 periplasmic binding fold. This group represents an uncharacterized periplasmic-binding protein of an ATP-binding cassette transporter predicted to be involved in uptake of an unknown oligosaccharide molecule. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 363
31405 270374 cd13656 PBP2_MBP The periplasmic binding component of ABC transport system specific for maltose; possesses the type 2 periplasmic binding fold. This group includes the periplasmic maltose-binding protein of an ATP-binding cassette transporter. Maltose is a disaccharide formed from two units of glucose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 364
31406 270375 cd13657 PBP2_Maltodextrin The periplasmic binding component of ABC transport system specific for maltodextrin. This group includes the periplasmic maltodextrin-binding protein of a binding protein-dependent ATP-binding cassette transporter. Maltodextrin is a polysaccharide that is used as a food additive and can be enzymatically produced from any starch. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 368
31407 270376 cd13658 PBP2_CMBP The periplasmic binding component of ABC transport systems specific for cyclo/maltodextrin; possesses the type 2 periplasmic binding fold. This group includes the periplasmic cyclo/maltodextrin-binding protein of Thermoactinomyces vulgaris ATP-binding cassette transporter and related proteins. Cyclodextrins are a family of compounds composed of glucose units connected by 1,4 glycosidic linkages to form a series of oligosaccharide rings, and their cavity is hydrophobic which allows cyclodextrins to accommodate hydrophobic molecules/moieties in the cavity. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 372
31408 270377 cd13659 PBP2_PotF The periplasmic substrate-binding component of an ABC putrescine transport system and related proteins; contains the type 2 periplasmic-binding fold. This group represents the periplasmic substrate-binding domain that serves as the primary polyamine receptor of ABC-type putrescine-preferential transporter from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 331
31409 270378 cd13660 PBP2_PotD The periplasmic substrate-binding component of an active spermidine-preferential transport system; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that serves as the primary polyamine receptor of ABC-type spermidine-preferential transport system from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 315
31410 270379 cd13661 PBP2_PotD_PotF_like_1 The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that serves as a primary polyamine receptor of an uncharacterized ABC-type transport system from plants and plant-symbiotic cyanobacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 319
31411 270380 cd13662 PBP2_TpPotD_like The periplasmic substrate-binding component of an ABC-type polyamine transport system from Treponema pallidum and related proteins; contains the type 2 periplasmic binding fold. This group includes the polyamine-binding component of an ABC-type polyamine transport system from Treponema pallidum and closely related proteins, which is homologous to the spermidine-preferring periplasmic substrate-binding protein component (PotD) of ABC transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 312
31412 270381 cd13663 PBP2_PotD_PotF_like_2 The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This group represents the periplasmic substrate-binding domain that serves as a primary polyamine receptor of an uncharacterized ABC-type transport system from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 323
31413 270382 cd13664 PBP2_PotD_PotF_like_3 The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain that functions as the primary high-affinity receptors of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 315
31414 270383 cd13665 PBP2_TRAP_Dctp3_4 Periplasmic substrate-binding component of TRAP-type C4-dicarboxylate transport system DctP3 and DctP4; the type 2 periplasmic-binding protein fold. This group includes uncharacterized DctP3 and DctP4 subfamilies of TRAP Transporters specific to C4-dicarboxylates such as succinate, malate and fumarate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 302
31415 270384 cd13666 PBP2_TRAP_DctP_like_1 Substrate-binding component of an uncharacterized TRAP-type C4-dicarboxylate transport system; the type 2 periplasmic-binding protein fold. This group includes a DctP subfamily of TRAP Transporters specific to C4-dicarboxylates such as succinate, malate and fumarate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 303
31416 270385 cd13667 PBP2_TRAP_DctP1 Periplasmic substrate-binding component of an uncharacterized TRAP-type C4-dicarboxylate transport system DctP1; contains the type 2 periplasmic-binding protein fold. This group includes an uncharacterized DctP1 subfamily of the TRAP Transporters. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 295
31417 270386 cd13668 PBP2_TRAP_UehA_TeaA Periplasmic substrate-binding component of osmoregulatory TRAP transporters TeaA and UehA; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic-binding component of the ectoine-specific TRAP transporters TeaA from Halomonas elongata and UehA from Ruegeria pomeroyi. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 305
31418 270387 cd13669 PBP2_TRAP_TM0322_like Periplasmic component of TRAP-type C4-dicarboxylate transport system TM0322 from Thermotoga maritima and similar proteins; the type 2 periplasmic binding protein fold. This subgroup includes the hyperthermophilic bacterium Thermotoga maritima TRAP-type C4-dicarboxylate transport system TM0322 and its closely related proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 296
31419 270388 cd13670 PBP2_TRAP_Tp0957_like Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes the putative periplasmic substrate-binding protein Tp0957 from Treponema pallidum, which is similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 298
31420 270389 cd13671 PBP2_TRAP_SBP_like_3 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 296
31421 270390 cd13672 PBP2_TRAP_Siap Substrate-binding domain of a sialic acid binding Tripartite ATP-independent Periplasmic transport system (SiaP); the type 2 periplasmic-binding protein fold. This subfamily represents the periplasmic-binding component of TRAP transport system SiaP, a sialic acid binding virulence factor. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 295
31422 270391 cd13673 PBP2_TRAP_SBP_like_2 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 301
31423 270392 cd13674 PBP2_TRAP_SBP_like_1 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 299
31424 270393 cd13675 PBP2_TRAP_SBP_like_5 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 296
31425 270394 cd13676 PBP2_TRAP_DctP2_like Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP2 and related proteins; the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP2 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 297
31426 270395 cd13677 PBP2_TRAP_SBP_like_6 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 304
31427 270396 cd13678 PBP2_TRAP_DctP10 Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP10; the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP10 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 300
31428 270397 cd13679 PBP2_TRAP_YiaO_like Substrate-binding domain of 2,3-diketo-L-gulonate-binding Tripartite ATP-independent Periplasmic transport system and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the solute receptor protein YiaO of TRAP transport system. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 298
31429 270398 cd13680 PBP2_TRAP_SBP_like_4 Uncharacterized substrate-binding protein of the Tripartite ATP-independent Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 300
31430 270399 cd13681 PBP2_TRAP_lactate Substrate-binding component of a lactate binding Tripartite ATP-independent Periplasmic transporter and related proteins; the type 2 periplasmic-binding protein fold. This subgroup includes a lactate binding TRAP transporter and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 311
31431 270400 cd13682 PBP2_TRAP_alpha-ketoacid Substrate-binding component of an alpha-keto acid binding Tripartite ATP-independent Periplasmic transporter and related proteins; contains the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporters that bind to ketoacids such as pyruvate and alpha-ketobutyrate, xylulose, and other unknown ligands. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 323
31432 270401 cd13683 PBP2_TRAP_DctP6_7 Substrate-binding domain of Tripartite ATP-independent Periplasmic transporter DctP6 and DctP7; type 2 periplasmic-binding protein fold. This subgroup includes TRAP-type mannitol/chloroaromatic compound transport system (Dctp6) and similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the PBP2 superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 304
31433 270402 cd13684 PBP2_TRAP_Dctp5_like Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP5 and related proteins; the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP5 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 314
31434 270403 cd13685 PBP2_iGluR_non_NMDA_like The ligand-binding domain of non-NMDA (N-methyl-D-aspartate) type ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This subfamily represents the ligand-binding domain of non-NMDA (N-methyl-D-aspartate) type ionotropic glutamate receptors including AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) receptors (GluR1-4), kainate receptors (GluR5-7 and KA1/2), and orphan receptors delta 1/2. iGluRs form tetrameric ligand-gated ion channels, which are concentrated at postsynaptic sites in excitatory synapses where they fulfill a variety of different functions. While this ligand-binding domain of iGluRs is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain belongs to the periplasmic-binding fold type I. 252
31435 270404 cd13686 GluR_Plant Plant glutamate receptor domain; the type 2 periplasmic binding protein fold. This subfamily contains the glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 232
31436 270405 cd13687 PBP2_iGluR_NMDA The ligand-binding domain of the NMDA (N-methyl-D-aspartate) subtype of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. The ligand-binding domain of the ionotropic NMDA subtype is structurally homologous to the periplasmic-binding fold type II superfamily, while the N-terminal domain belongs to the periplasmic-binding fold type I. The NMDA subtype receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer comprising two NR1 and two NR2 (A, B, C, and D) or NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 239
31437 270406 cd13688 PBP2_GltI_DEBP Substrate-binding domain of ABC aspartate-glutamate transporter; the type 2 periplasmic binding protein fold. This subfamily represents the periplasmic-binding protein component of ABC transporter specific for carboxylic amino acids, including GltI from Escherichia coli. The aspartate-glutamate binding domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 238
31438 270407 cd13689 PBP2_BsGlnH Substrate binding domain of ABC glutamine transporter from Bacillus subtilis; the type 2 periplasmic-binding protein fold. This group includes periplasmic glutamine-binding domain GlnP from Bacillus subtilis and its related proteins. The GlnP domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 229
31439 270408 cd13690 PBP2_GluB Substrate binding domain of ABC glutamate transporter; the type 2 periplasmic binding protein fold. This group includes periplasmic glutamate-binding domain GluB from Corynebacterium efficiens and its related proteins. The GluB domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 231
31440 270409 cd13691 PBP2_Peb1a_like Substrate binding domain of an ABC aspartate/glutamate transporter; the type 2 periplasmic-binding protein fold. This group includes periplasmic aspartate/glutamate binding domain Peb1a and its closely related protein. The Peb1a is an important virulence factor in the food-borne human pathogen Campylobacter jejuni, which has a major role in adherence and host colonization. The Peb1a domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 228
31441 270410 cd13692 PBP2_BztA Substrate binding domain of ABC glutamate/glutamine/aspartate/asparagine transporter; the type 2 periplasmic binding protein fold. BztA is the periplasmic-binding protein component of ABC transporter specific for carboxylic amino acids, glutamine and asparagine. The BztA domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 236
31442 270411 cd13693 PBP2_polar_AA Substrate binding domain of polar amino-acid uptake ABC transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of putative polar amino acid ABC transporter. The polar amino-acid binding domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 228
31443 270412 cd13694 PBP2_Cysteine Substrate binding domain of ABC cysteine transporter; the type 2 periplasmic binding protein fold. This subfamily comprises the periplasmic-binding protein component of ABC transporter specific for cysteine and its closely related proteins. The cysteine-binding domains belong to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 229
31444 270413 cd13695 PBP2_Mlr3796_like The substrate-binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Mesorhizobium loti and its related proteins. The putative Mlr3796-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 232
31445 270414 cd13696 PBP2_Atu4678_like The substrate binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Agrobacterium tumefaciens and its related proteins. The putative Atu4678-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 227
31446 270415 cd13697 PBP2_ArtJ_like Putative substrate-binding domain of ABC arginine transporter; the type 2 periplasmic-binding protein fold. The ArtJ domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 228
31447 270416 cd13698 PBP2_HisGluGlnArgOpine Substrate binding domain of ABC-type histidine/glutamate/glutamine/arginine/opine transporter; the type 2 periplasmic-binding protein fold. This group includes the periplasmic-binding component of the His/Glu/Gln/Arg/Opine ATP-binding cassette transport system. This substrate-binding domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 214
31448 270417 cd13699 PBP2_OccT_like Substrate binding domain of ABC-type octopine transporter-like; the type 2 periplasmic-binding protein fold. This group includes periplasmic octopine-binding protein and related proteins. This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 211
31449 270418 cd13700 PBP2_Arg_STM4351 Substrate binding domain of arginine-specific ABC transporter; type 2 periplasmic-binding protein fold. This group includes domains similar to Escherichia coli arginine third transport system. STM4351 is the high arginine specific periplasmic-binding protein of ABC transport system. STM4351 belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 222
31450 270419 cd13701 PBP2_ml15202_like Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporter-like; the type 2 periplasmic-binding protein fold. This group includes uncharacterized periplasmic substrate-binding protein similar to HisJ and LAO proteins which are involved in the ABC transport of histidine-, arginine, and lysine-arginine-ornithine amino acids. This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 227
31451 270420 cd13702 PBP2_mlr5654_like Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporter-like; the type 2 periplasmic-binding protein fold. This group includes uncharacterized periplasmic substrate-binding protein similar to HisJ and LAO proteins which serve as initial receptors in the ABC transport of histidine-, arginine, and lysine-arginine-ornithine amino acids. This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 223
31452 270421 cd13703 PBP2_HisJ_LAO Substrate binding domain of ABC-type histidine- and lysine/arginine/ornithine transporters; the type 2 periplasmic-binding protein fold. This subgroup includes the periplasmic-binding proteins, HisJ and LAO, that serve as initial receptors in the ABC transport of histidine and lysine-arginine-ornithine amino acids. They belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 229
31453 270422 cd13704 PBP2_HisK The periplasmic sensor domain of histidine kinase receptors; the type 2 periplasmic binding fold protein. This subfamily includes the periplasmic sensor domain of the histidine kinase receptors (HisK) which are elements of the two-component signal transduction systems commonly found in bacteria and lower eukaryotes. Typically, the two-component system consists of a membrane-spanning histidine kinase sensor and a cytoplasmic response regulator. The two-component systems serve as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. Extracellular stimuli such as small molecule ligands and ions are detected by the N-terminal periplasmic sensing domain of the sensor kinase receptor, which regulate the catalytic activity of the cytoplasmic kinase domain and promote ATP-dependent autophosphorylation of a conserved histidine residue. The phosphate is then transferred to a conserved aspartate in the response regulator through a phospho-transfer mechanism, and the activity of the response regulator is in turn regulated. The sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space through their function as an initial high-affinity binding component. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 220
31454 270423 cd13705 PBP2_BvgS_D1 The first of the two tandem periplasmic domains of sensor-kinase BvgS; the type 2 periplasmic-binding fold protein. This group contains the first domain of the periplasmic solute-binding domains of BvgS and related proteins. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by cytoplasmic PAS (Per/ARNT/SIM), histidine-kinase (HK), receiver, and histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 221
31455 270424 cd13706 PBP2_HisK_like_1 Putative sensor domain similar to HisK; the type 2 periplasmic binding fold protein. This group includes periplasmic sensor domain of the histidine kinase receptors (HisK) which are elements of the two-component signal transduction systems commonly found in bacteria and lower eukaryotes. Typically, the two-component system consists of a membrane-spanning histidine kinase sensor and a cytoplasmic response regulator. The two-component systems serve as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. Extracellular stimuli such as small molecule ligands and ions are detected by the N-terminal periplasmic sensing domain of the sensor kinase receptor, which regulate the catalytic activity of the cytoplasmic kinase domain and promote ATP-dependent autophosphorylation of a conserved histidine residue. The phosphate is then transferred to a conserved aspartate in the response regulator through a phospho-transfer mechanism, and the activity of the response regulator is in turn regulated. The sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space through their function as an initial high-affinity binding component. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 219
31456 270425 cd13707 PBP2_BvgS_D2 The second of the two tandem periplasmic domains of sensor-kinase BvgS; the type 2 periplasmic-binding fold protein. This group contains the second domain of the periplasmic solute-binding domains of BvgS and related proteins. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by cytoplasmic PAS (Per/ARNT/SIM), histidine-kinase (HK), receiver, and histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 221
31457 270426 cd13708 PBP2_BvgS_like_1 Putative sensor domain similar to BvgS; the type 2 periplasmic binding protein domain. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by a cytoplasmic PAS (Per/ARNT/SIM), a Histidine-kinase (HK), a receiver and a Histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 220
31458 270427 cd13709 PBP2_YxeM Substrate binding domain of an ABC transporter YxeMNO; the type 2 periplasmic binding protein fold. This group contains cystine-binding domain (YxeM) of a periplasmic receptor-dependent ATP-binding cassette transporter and its closely related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 227
31459 270428 cd13710 PBP2_TcyK Substrate binding domain of an ABC transporter TcyJKLMN; the type 2 periplasmic binding protein fold. This group contains periplasmic cystine-binding domain (TcyK) of an ATP-binding cassette transporter from Bacillus subtilis and its closely related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 233
31460 270429 cd13711 PBP2_Ngo0372_TcyA Substrate binding domain of ABC transporters involved in cystine import; the type 2 periplasmic binding protein fold. This subgroup includes cystine-binding domain of periplasmic receptor-dependent ATP-binding cassette transporters from Neisseria gonorrhoeae and Bacillus subtilis and their related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 222
31461 270430 cd13712 PBP2_FliY Substrate binding domain of an Escherichia coli ABC transporter; the type 2 periplasmic binding protein fold. This group contains cystine binding domain FliY and its related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 219
31462 270431 cd13713 PBP2_Cystine_like_1 Substrate binding domain of putative ABC transporters involved in cystine import; the type 2 periplasmic binding protein fold. This group contains uncharacterized periplasmic cystine-binding domain of ATP-binding cassette (ABC) transporters. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 218
31463 270432 cd13714 PBP2_iGluR_Kainate Kainate receptor of the type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 251
31464 270433 cd13715 PBP2_iGluR_AMPA The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtypes of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This family represents the ligand-binding domain of the AMPA receptor subunits, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 261
31465 270434 cd13716 PBP2_iGluR_delta_like The ligand-binding domain of the delta family of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This subfamily represents the ligand-binding domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of iGluRs belongs to the periplasmic-binding fold type I. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of the GluRdelta2 gene impaired motor coordination, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans. 257
31466 270435 cd13717 PBP2_iGluR_putative The ligand-binding domain of putative ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 360
31467 270436 cd13718 PBP2_iGluR_NMDA_Nr2 The ligand-binding domain of the NR2 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the NR2 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 283
31468 270437 cd13719 PBP2_iGluR_NMDA_Nr1 The ligand-binding domain of the NR1 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand binding domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore can be classified as excitatory glycine receptors. NR1/NR3 receptors are calcium-impermeable and unaffected by ligands acting at the NR2 glutamate-binding site. 277
31469 270438 cd13720 PBP2_iGluR_NMDA_Nr3 The ligand-binding domain of the NR3 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the NR3 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain. 283
31470 270439 cd13721 PBP2_iGluR_Kainate_GluR6 GluR6 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 251
31471 270440 cd13722 PBP2_iGluR_Kainate_GluR5 GluR5 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 250
31472 270441 cd13723 PBP2_iGluR_Kainate_GluR7 GluR7 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. 369
31473 270442 cd13724 PBP2_iGluR_kainate_KA1 The ligand-binding domain of the kainate subtype KA1 of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the KA1 subunit of kainate receptor. While this ligand-binding domain is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal domain of kainate receptors belongs to the periplasmic-binding fold type I. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined. 333
31474 270443 cd13725 PBP2_iGluR_kainate_KA2 The ligand-binding domain of the kainate subtype KA2 of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the KA2 subunit of kainate receptor. While this ligand-binding domain is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal domain of kainate receptors belongs to the periplasmic-binding fold type I. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined. 250
31475 270444 cd13726 PBP2_iGluR_AMPA_GluR2 The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR2 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR2, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 259
31476 270445 cd13727 PBP2_iGluR_AMPA_GluR4 The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR4 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR4, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 259
31477 270446 cd13728 PBP2_iGluR_AMPA_GluR3 The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR3 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR3, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 259
31478 270447 cd13729 PBP2_iGluR_AMPA_GluR1 The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR1 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR1, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. 260
31479 270448 cd13730 PBP2_iGluR_delta_1 The ligand-binding domain of an orphan ionotropic glutamate receptor delta-1, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the delta-1 receptor of an orphan glutamate receptor family. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of delta receptors belongs to the periplasmic-binding fold type I. Although the delta receptors are a member of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of the GluRdelta2 gene impaired motor coordination, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family which is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans. 257
31480 270449 cd13731 PBP2_iGluR_delta_2 The ligand-binding domain of an orphan ionotropic glutamate receptor delta-2, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the delta-2 receptor of an orphan glutamate receptor family. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of delta receptors belongs to the periplasmic-binding fold type I. Although the delta receptors are a member of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of the GluRdelta2 gene impaired motor coordination, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family which is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans. 257
31481 293968 cd13733 SPRY_PRY_C-I_1 PRY/SPRY domain in tripartite motif-containing (TRIM) proteins, including TRIM5, TRIM6, TRIM7, TRIM10, TRIM11, TRIM17, TRIM20, TRIM21, TRIM27, TRIM35, TRIM38, TRIM41, TRIM50, TRIM58, TRIM60, TRIM62, TRIM69, TRIM72, NF7 and bloodthirsty. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class IV TRIM proteins, including TRIM7, TRIM35, TRIM41, TRIM50, TRIM62, TRIM69, TRIM72, TRIM protein NF7 and bloodthirsty (bty). TRIM7 interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. TRIM35 may play a role as a tumor suppressor and is implicated in the cell death mechanism. TRIM41 is localized to speckles in the cytoplasm and nucleus, and functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23. TRIM62 is involved in the morphogenesis of the mammary gland; loss of TRIM62 gene expression in breast is associated with increased risk of recurrence in early-onset breast cancer. TRIM69 is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. TRIM protein NF7, which also contains a chromodomain (CHD) at the N-terminus and an RFP (Ret finger protein)-like domain at the C-terminus, is required for its association with transcriptional units of RNA polymerase II which is mediated by a trimeric B box. In Xenopus oocyte, xNF7 has been identified as a nuclear microtubule-associated protein (MAP) whose microtubule-bundling activity, but not E3-ligase activity, contributes to microtubule organization and spindle integrity. Bloodthirsty (bty) is a novel gene identified in zebrafish and has been shown to likely play a role in regulation of the terminal steps of erythropoiesis. TRIM72 has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. 174
31482 293969 cd13734 SPRY_PRY_C-II PRY/SPRY domain in tripartite motif-containing proteins 1, 9, 18, 36, 46, 67, 76 (TRIM1, TRIM9, TRIM18, TRIM36, TRIM46, TRIM67, TRIM76). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class I TRIM proteins, including TRIM1, TRIM9, TRIM18, TRIM36, TRIM46, TRIM67 and TRIM76. TRIM1 (also known as MID2) and its close homolog, TRIM18 (also known as MID1), both contain a B30.2-like domain at their C-terminus and a single fibronectin type III (FN3) motif between it and their N-terminal RBCC domain. Their coiled-coil motifs mediate both homo- and heterodimerization, a prerequisite for association of the rapamycin-sensitive PP2A regulatory subunit Alpha 4 with microtubules. Mutations in TRIM18 have been shown to cause Opitz syndrome, a disorder causing congenital anomalies such as cleft lip and palate as well as heart defects. TRIM9 is expressed mainly in the cerebral cortex, and functions as an E3 ubiquitin ligase. Its immunoreactivity is severely decreased in affected brain areas in Parkinson's disease and dementia with Lewy bodies, possibly playing an important role in the regulation of neuronal function and participating in pathological process of Lewy body disease through its ligase. TRIM36 interacts with centromere protein-H, one of the kinetochore proteins and possibly associates with chromosome segregation; an excess of TRIM36 may cause chromosomal instability. TRIM46 has not yet been characterized. TRIM67 negatively regulates Ras activity via degradation of 80K-H, leading to neural differentiation, including neuritogenesis. TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It is possibly involved in protein kinase A signaling as well as vesicular trafficking. It has also been implicated in Duchenne muscular dystrophy and cardiac disease. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. 166
31483 293970 cd13735 SPRY_HECT_like SPRY domain in HECT E3. This domain consists of the SPRY subdomain similar to those found at the N-terminus of the HECT (homologous to the E6AP carboxyl terminus) protein, a C-terminal catalytic domain of a subclass of ubiquitin-protein ligase (E3). HECT E3 binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains. It has a prominent role in protein trafficking and immune response, and is involved in crucial signaling pathways implicated in tumorigenesis. 150
31484 293971 cd13736 SPRY_PRY_TRIM25 PRY/SPRY domain in tripartite motif-containing domain 25 (TRIM25). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM25 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function. 169
31485 293972 cd13737 SPRY_PRY_TRIM25-like PRY/SPRY domain in tripartite motif-containing domain 25 (TRIM25)-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of proteins similar to TRIM25 (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function. 172
31486 293973 cd13738 SPRY_PRY_TRIM14 PRY/SPRY domain of tripartite motif-binding protein 14 (TRIM14). This TRIM14 domain family contains residues in the N-terminus that form a distinct PRY domain structure such that the B30.2 domain consists of PRY and SPRY subdomains. TRIM14 domains have yet to be characterized. These B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. It belongs to the Class IV TRIM protein family, which has members involved in antiviral immunity at various levels of the interferon signaling cascade. 173
31487 293974 cd13739 SPRY_PRY_TRIM1 PRY/SPRY domain of tripartite motif-binding protein 1 (TRIM1) or MID2. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM1 (also known as MID2 or midline 2). MID2 and its close homolog, TRIM18 (also known as MID1), both contain a B30.2-like domain at their C-terminus and a single fibronectin type III (FN3) motif between it and their N-terminal RBCC domain. MID2 and MID1 coiled-coil motifs mediate both homo- and heterodimerization, a prerequisite for association of the rapamycin-sensitive PP2A regulatory subunit Alpha 4 with microtubules. Mutations in MID1 have been shown to cause Opitz syndrome, a disorder causing congenital anomalies such as cleft lip and palate as well as heart defects. 170
31488 293975 cd13740 SPRY_PRY_TRIM7 PRY/SPRY domain in tripartite motif-binding protein 7 (TRIM7). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 7 (TRIM7), also referred to as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90). TRIM7 or GNIP interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. The GNIP gene encodes at least four distinct isoforms of GNIP, of which three (GNIP1, GNIP2, and GNIP3) have the B30.2 domain. 169
31489 240499 cd13741 SPRY_PRY_TRIM41 PRY/SPRY domain in tripartite motif-binding protein 41 (TRIM41). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 41 (TRIM41). TRIM41 (also known as RING finger-interacting protein with C kinase or RINCK) is localized to speckles in the cytoplasm and nucleus, and functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C. 199
31490 293976 cd13742 SPRY_PRY_TRIM72 PRY/SPRY domain in tripartite motif-binding protein 72 (TRIM72). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM72. Muscle-specific TRIM72 (also known as Mitsugumin 53 or MG53) has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. It is expressed specifically in skeletal muscle and heart, and tethered to the plasma membrane and cytoplasmic vesicles via its interaction with phosphatidylserine. TRIM72 interacts with dysferlin, a sarcolemmal protein whose deficiency causes Miyoshi myopathy (MM) and limb girdle muscular dystrophy type 2B (LGMD2B); this coordination plays an important role in the repair of sarcolemma damage. 192
31491 293977 cd13743 SPRY_PRY_TRIM50 PRY/SPRY domain in tripartite motif-binding protein 50 (TRIM50). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM50. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23. It is specifically expressed in gastric parietal cells and may play an essential role in tubulovesicular dynamics. It also interacts with and increases the level of p62, a multifunctional adaptor protein that is implicated in various cellular processes such as the autophagy clearance of polyubiquitinated protein aggregates. 189
31492 293978 cd13744 SPRY_PRY_TRIM62 PRY/SPRY domain in tripartite motif-binding protein 62 (TRIM62). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM62. It is also called DEAR1 (ductal epithelium-associated RING chromosome 1) and is involved in the morphogenesis of the mammary gland; loss of TRIM62 gene expression in breast is associated with increased risk of recurrence in early-onset breast cancer, thus making TRIM62 a predictive biomarker. Non-small cell lung cancer lesions show a step-wise loss of TRIM62 levels during disease progression, indicating that it may play a role in the evolution of lung cancer. Decreased levels of TRIM62 also represent an independent adverse prognostic factor in AML. 188
31493 293979 cd13745 SPRY_PRY_TRIM39 PRY/SPRY domain in tripartite motif-binding protein 39 (TRIM39) and TRIM39-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of pyrin, several tripartite motif-containing proteins (TRIMs), including E3 ubiquitin-protein ligase (TRIM21), RET finger protein (RFP)/tripartite motif protein 27 (TRIM27), as well as butyrophilin (Btns) and butyrophilin-like (Btnl) family members, with the exception of Btnl2. Btn and Btnl family members are novel regulators of immune responses, with many of the genes located within the MHC. They are implicated in T-cell inhibition and modulation of epithelial cell-T cell interactions. TRIM21 (also known as RO52, SSA1 or RNF81) is a major autoantigen in autoimmune diseases such as rheumatoid arthritis, systemic lupus erythematosus, and Sjogren's syndrome. TRIM27 (also known as Ret finger protein, RFP or RNF76) negatively regulates CD4 T-cells by ubiquitinating and inhibiting the class II phosphatidylinositol 3 kinase C2beta (PI3K-C2beta), a kinase critical for KCa3.1 channel activation. The PRY/SPRY domain of Pyrin, which is mutated in familial Mediterranean fever patients, interacts with inflammasome components and inhibits proIL-1beta processing. 177
31494 259839 cd13746 Sir4p-SID_like The SID domain of Saccharomyces cerevisiae silent information regulator 4, a Sir2p interaction domain; and related domains. Saccharomyces cerevisiae Sir2p, Sir3p, and Sir4p form a heterotrimeric complex which binds chromatin and represses transcription at the homothallic mating type (HM) loci and at subtelomeric regions. This domain model spans residues 742-893 of Sir4p. Sir4p forms a stable heterodimer with Sir2p, mediated by Sir4p residues included in this domain, and a pocket between Sir2p's catalytic domain and its non-conserved N-terminus. Sir4p also interacts with an array of additional factors, including Yku80p, a subunit of the telomeric Ku complex (Yku70p-Yku80p), which binds two sites within Sir4p, one at the N-terminus and one in the C-terminal residues, 731-1358. Other interaction factors include Esc1p (Establishes silent chromatin 1) which binds the Sir4p PAD domain (partitioning and anchoring domain, residues 950-1262), and Sir3p, Yku70p, and Rap1p (Repressor Activator Protein) which bind in its C-terminal coiled-coil (residues 1257-1358). Other Sir4p interacting factors include the Ty5 retrotransposon. Additional roles for Sir4p include roles in DNA repair, and in aging. A SIR4 mutant having a truncated Sir4p lacking a C-terminal coiled-coil domain, has an extended mean life span; deletion of the SIR4 gene leads to a decreased mean life span. 115
31495 259834 cd13747 UreI_AmiS_like_1 UreI/AmiS family, subgroup 1. Putative proton-gated urea channel and putative amide transporters. This subfamily includes putative UreI proton-gated urea channels and putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity. 167
31496 259840 cd13748 CBM29_CBM65 family 29 and family 65 carbohydrate binding modules. Members of this family bind to polysaccharides that are components of plant cell walls. CBM29 is present in cell-wall degrading multi-enzyme complexes from the anaerobic fungus Piromyces equi; CBM65 can be found in endoglucanases expressed by Eubacterium cellulosolvens and has a preference for xyloglucans. 106
31497 259796 cd13749 Zn-ribbon_TFIIS domain III/zinc ribbon domain of Transcription Factor IIS. TFIIS is a zinc-containing transcription factor. It has been shown in vitro to have distinct biochemical activities, including binding to RNA polymerases, stimulation of transcript elongation, and activation of a nascent RNA cleavage activity in the RNA polymerase II (Pol II) elongation complex. TFIIS consists of three domains. Domains II and III are sufficient for all known TFIIS activities. Domain III is a zinc ribbon that is separated from domain II by a long linker and is indispensable for TFIIS function. The TFIIS homologs, subunits A12.2, B9, and C11, of Pol I, II, and III respectively, are required for RNA cleavage by the polymerases. In a single organism, there are tissue-specific TFIIS related proteins. 47
31498 381628 cd13750 TGF_beta_GDNF_like transforming growth factor beta (TGF-beta) like domain found in the glial cell-line-derived neurotrophic factor (GDNF) family of ligands. The GDNF family of ligands includes GDNF, Artemin, Neurturin, and Persephin. They play an important role in the development and maintenance of the central and peripheral nervous system, renal morphogenesis, and spermatogenesis. 95
31499 381629 cd13751 TGF_beta_GDF8_like transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factors, GDF8 and GDF11, and similar proteins. The family includes GDF8 and GDF11. GDF8, also termed myostatin, acts specifically as a negative regulator of skeletal muscle growth. GDF11, also termed bone morphogenetic protein 11 (BMP-11), is a secreted signal that acts globally to specify positional identity along the anterior/posterior axis during development. 96
31500 381630 cd13752 TGF_beta_INHB transforming growth factor beta (TGF-beta) like domain found in inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), E chain (INHBE) and similar proteins. The family includes inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), and E chain (INHBE). INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress. 100
31501 381631 cd13753 TGF_beta_TGFbeta1_2_3 transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-1 (TGF-beta-1), beta-2 (TGF-beta-2), beta-3 (TGF-beta-3) and similar proteins. The family includes TGF-beta-1, TGF-beta-2 and TGF-beta-3, which are polypeptide members of the transforming growth factor beta superfamily of cytokines. TGF-beta-1 is a secreted protein that performs many cellular functions, including the control of cell growth, cell proliferation, cell differentiation, and apoptosis. TGF-beta-2 is a secreted protein that performs many cellular functions and has a vital role during embryonic development. It can suppress the effects of interleukin-2 dependent T-cell growth. TGF-beta-3 is involved in embryogenesis and cell differentiation. It regulates molecules involved in cellular adhesion and extracellular matrix (ECM) formation during the process of palate development. 97
31502 381632 cd13754 TGF_beta_INHA transforming growth factor beta (TGF-beta) like domain found in inhibin alpha chain (INHA) and similar proteins. INHA is a component of inhibins (inhibin A or inhibin B) that inhibit the secretion of follitropin by the pituitary gland. 89
31503 381633 cd13755 TGF_beta_maverick transforming growth factor beta (TGF-beta) like domain found in Drosophila melanogaster maverick and similar proteins. Maverick, also termed MAV, is a novel member of the TGF-beta superfamily in Drosophila. It's a bone morphogenetic protein (BMP)/TGF-beta related ligand. 102
31504 381634 cd13756 TGF_beta_BMPs_GDFs transforming growth factor beta (TGF-beta) like domain found in the BMP/GDF family. The BMP/GDF family consists of bone morphogenetic proteins (BMPs), growth and differentiation factors (GDFs) and similar proteins. BMPs are a group of growth factors also known as cytokines and as metabologens. They induce the formation of bone and cartilage and function as pivotal morphogenetic signals, orchestrating tissue architecture throughout the body. GDFs have functions predominantly in development. 102
31505 381635 cd13757 TGF_beta_AMH transforming growth factor beta (TGF-beta) like domain found in anti-Muellerian hormone (AMH) and similar proteins. AMH, also termed Muellerian-inhibiting factor, or Muellerian-inhibiting substance (MIS), is a glycoprotein that causes regression of the Muellerian duct. It can also inhibit the growth of tumors derived from tissues of Muellerian duct origin. 99
31506 381636 cd13758 TGF_beta_LEFTY1_2 transforming growth factor beta (TGF-beta) like domain found in left-right determination factor 1 (lefty-1), factor 2 (lefty-2) and similar proteins. Lefty-1, also termed left-right determination factor B, or protein lefty-B, is required for left-right axis determination as a regulator of Lefty-2 and NODAL. Lefty-2, also termed endometrial bleeding-associated factor, or left-right determination factor A, or protein lefty-A, or transforming growth factor beta-4 (TGF-beta-4), is required for left-right (L-R) asymmetry determination of organ systems in mammals. It may play a role in endometrial bleeding. 90
31507 381637 cd13759 TGF_beta_NODAL transforming growth factor beta (TGF-beta) like domain found in Nodal (NODAL)-related proteins. NODAL is essential for mesoderm formation and axial patterning during embryonic development. 103
31508 381638 cd13760 TGF_beta_BMP2_like transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 2 (BMP-2), 4 (BMP-4) and similar proteins. The family includes BMP2 and BMP4 (also known as BMP2B), both of which induce cartilage and bone formation. BMP-2 stimulates the differentiation of myoblasts into osteoblasts via the EIF2AK3-EIF2A-ATF4 pathway. BMP-4 acts in mesoderm induction, tooth development, limb formation and fracture repair. 102
31509 381639 cd13761 TGF_beta_BMP5_like transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic proteins BMP-5, BMP-6, BMP-7, BMP-8A/B and similar proteins. The family includes BMP-5, BMP-6, BMP-7 and BMP-8A/B, which may induce cartilage and bone formation. 103
31510 381640 cd13762 TGF_beta_GDP9_9B_like transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9 (GDF-9), growth/differentiation factor 9B (GDF-9B) and similar proteins. The family includes GDF-9B (also known as BMP15) and GDF9. GDF-9B acts as oocyte-specific growth/differentiation factor that stimulates folliculogenesis and granulosa cell (GC) growth. GDF-9 is required for ovarian folliculogenesis. It promotes primordial follicle development and stimulates granulosa cell proliferation. 104
31511 381641 cd13763 TGF_beta_BMP3_like transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 3 (BMP-3), growth/differentiation factor 10 (GDF10) and similar proteins. The family includes BMP-3 (also known as BMP-3A or osteogenin) and GDF10 (also known as BMP-3B). BMP-3 negatively regulates bone density. It antagonizes the ability of certain osteogenic BMPs to induce osteoprogenitor differentiation and ossification. GDF10 is a growth factor involved in osteogenesis and adipogenesis. 103
31512 381642 cd13764 TGF_beta_GDF1_3_like transforming growth factor beta (TGF-beta) like domain found in embryonic growth/differentiation factor 1 (GDF1), factor 3 (GDF3) and similar proteins. The family includes GDF-1 and GDF-3. GDF1 may mediate cell differentiation events during embryonic development. GDF3 is a growth factor involved in early embryonic development and adipose-tissue homeostasis. The family also contains protein DVR-1, also termed vegetal hemisphere VG1 protein (VG-1), which serves to facilitate the differentiation of either mesoderm or endoderm either as a cofactor in an instructive signal or by providing permissive environment. 102
31513 381643 cd13765 TGF_beta_ADMP transforming growth factor beta (TGF-beta) like domain found in anti-dorsalizing morphogenetic protein (ADMP) and similar proteins. ADMP is a bone morphogenetic protein (BMP)-like transforming growth factor beta ligand that functions in the trunk organizer to antagonize head formation, thereby regulating organizer patterning. It negatively affects the formation of the organizer, although it is robustly expressed within the organizer itself. The organizer-promoting signal of ADMP is mediated by the activin A type I receptor, ACVR1 (also known as activin receptor-like kinase-2, ALK2). 105
31514 381644 cd13766 TGF_beta_GDF5_6_7 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 5 (GDF5), factor 6 (GDF6), factor 7 (GDF7) and similar proteins. The family includes GDF5, GDF6 and GDF7. GDF5, also termed bone morphogenetic protein 14 (BMP-14), or cartilage-derived morphogenetic protein 1 (CDMP-1), or lipopolysaccharide-associated protein 4 (LAP-4), or LPS-associated protein 4, or radotermin, is a growth factor involved in bone and cartilage formation. GDF6, also termed bone morphogenetic protein 13 (BMP-13), or growth/differentiation factor 16, is a growth factor that controls proliferation and cellular differentiation in the retina and bone formation. GDF7, also termed bone morphogenetic protein 12 (BMP-12), may play an active role in the motor area of the primate neocortex. 102
31515 381645 cd13767 TGF_beta_BMP9_like transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic proteins, BMP-9, BMP-10 and similar proteins. The family includes BMP9 (also known as GDF2) and BMP10. BMP-9 is a potent circulating inhibitor of angiogenesis. It signals through the type I activin receptor ACVRL1 but not other activin receptor-like kinases (ALKs). BMP-10 is required for maintaining the proliferative activity of embryonic cardiomyocytes by preventing premature activation of the negative cell cycle regulator CDKN1C/p57KIP and maintaining the required expression levels of cardiogenic factors such as MEF2C and NKX2-5. It inhibits endothelial cell migration and growth. It may reduce cell migration and cell matrix adhesion in breast cancer cell lines. 105
31516 259841 cd13768 DSS1_Sem1 proteasome complex subunit DSS1/Sem1. The evolutionarily conserved deleted in split hand/split foot protein 1 (DSS1)/Sem1 is a subunit of the regulatory particle (RP) of the proteasome. It is implicated in ubiquitin-mediated proteolysis, is required for the maintenance of genomic stability, and functions in DNA damage response. DSS1/Sem1 also displays RP-independent functions; it serves as a functional component of the nuclear pore associated TREX-2 transcription-export complex and is required for proper nuclear export of mRNA. In mammalian cells, DSS1 binds and stabilizes the tumor suppressor BRCA2, and contributes to its function in mediating homologous recombinational repair. In yeast, Sem1 also complexes with the COP9 signalosome, which is involved in de-neddylation. DSS1/Sem1 may be a versatile protein which contributes to the functional integrity of multiple protein complexes involved in various biological processes. 61
31517 259842 cd13769 ApoLp-III_like Apolipophorin-III and similar insect proteins. Exchangeable apolipoproteins play vital roles in the transport of lipids and lipoprotein metabolism. Apolipophorin III (apoLp-III) assists in the loading of diacylglycerol, generated from triacylglycerol stores in the fat body through the action of adipokinetic hormone, into lipophorin, the hemolymph lipoprotein. ApoLp-III increases the lipid carrying capacity of lipophorin by covering the expanding hydrophobic surface resulting from diacylglycerol uptake. It plays a critical role in the transport of lipids during insect flight, and may also play a role in defense mechanisms and innate immunity. 158
31518 259817 cd13775 SPFH_eoslipins_u3 Uncharacterized prokaryotic subfamily of the stomatin-like proteins (slipins), a subgroup of the SPFH family (stomatin, prohibitin, flotillin, and HflK/C). This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria and archaebacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial and archaebacterial SLPs remain uncharacterized. 177
31519 260099 cd13777 Aar2_N N-terminal domain of Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor. This family consists of the N-terminal domain of eukaryotic Aar2 and Aar2-like proteins. Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor and part of Prp8, which forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8. Aar2p binds directly with the RNaseH-like domain in the C-terminal region of Prp8p. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth. 126
31520 260100 cd13778 Aar2_C C-terminal domain of Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor. This family consists of the C-terminal domain of eukaryotic Aar2 and Aar2-like proteins. Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor and part of Prp8, which forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8. Aar2p binds directly with the RNaseH-like domain in the C-terminal region of Prp8p. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth. 155
31521 260101 cd13783 SPACA1 Sperm acrosome membrane-associated protein 1. SPACA1 (aka SAMP32, due to its 32kDa M.W.) is localized to the acrosome of spermatozoa. The acrosome is an organelle transformed from the Golgi apparatus to form a cap over the anterior portion of the spermatozoa head, which contains the sperm nucleus. Mammalian acrosomes contain digestive enzymes that degrade the ovum outer membrane (zona pellucida) to allow fusion of the sperm and ovum nuclei via the acrosomal reaction. In mammals, the acrosome releases hyaluronidase and acrosin. Antibodies generated against recombinant SPACA1 have been shown to inhibit human sperm binding and membrane fusion in vitro vs. zona-free hamster ova. Male mice lacking SPACA1 are infertile, and exhibit globozoospermia-like misformed sperm heads. SPACA1 content has been reported to be diminished in a comparison of round-headed vs normal spermatozoa. 248
31522 260102 cd13784 SP_1775_like Uncharacterized protein conserved in Streptococci. Streptococcus pneumoniae SP_1775 and related proteins from other Streptococci; may form homooctamers that may bind hydrophobic ligands. 67
31523 260079 cd13785 CARD_BinCARD_like BinCARD (Bcl10-interacting protein with CARD). BinCARD is a CARD (Caspase activation and recruitment domain) protein that is ubiquitously expressed in all tissues. CARD proteins play an important role in apoptosis by functioning as direct regulators of death-inducing caspases. BinCARD interacts with apoptosis inducer CARD protein Bcl10 through CARD. It inhibits Bcl10-mediated activation of NF-kappa B and suppresses Bcl10 phosphorylation. Caspase activation and recruitment domains (CARDs) are death domains (DDs) found associated with caspases. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 86
31524 259853 cd13831 HU histone-like DNA-binding protein HU. This subfamily includes HU and HU-like domains. HU is a conserved nucleoid-associated protein (NAP) which binds non-specifically to duplex DNA with a particular preference for targeting nicked and bent DNA. It is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. HU can induce DNA bends, condense DNA in a fiber and also interact with single stranded DNA. It contains two homologous subunits, alpha and beta, typically forming homodimers (alpha-alpha and beta-beta), except in E. coli and other enterobacteria, which form heterodimers (alpha-beta). In E. coli, HU binds uniformly to the chromosome, with a preference for damaged or distorted DNA structures and can introduce negative supercoils into closed circular DNA in the presence of topoisomerase I. Anabaena HU (AHU) shows preference for A/T-rich region in the center of its DNA binding site. 86
31525 259854 cd13832 IHF Integration host factor (IHF) and similar proteins. This subfamily includes integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms. This subfamily also includes the protein Hbb from tick-borne spirochete Borrelia burgdorferi, responsible for causing Lyme disease in humans. Hbb, a homodimer, shows DNA sequence preferences that are related, yet distinct from those of IHF. 85
31526 259855 cd13833 HU_IHF_like Uncharacterized proteins similar to DNA sequence specific (IHF) and non-specific (HU) domains. This subfamily consists of uncharacterized proteins similar to integration host factor (IHF) and HU domains, including hypothetical protein Bvu_2165 from Bacteroides vulgatus. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms. 97
31527 259856 cd13834 HU_like DNA-binding proteins similar to HU domains. This subfamily consists of DNA-binding proteins similar to HU domains. HU is a conserved nucleoid-associated protein (NAP) which binds non-specifically to duplex DNA with a particular preference for targeting nicked and bent DNA. It is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. HU can induce DNA bends, condense DNA in a fiber and also interact with single stranded DNA. It contains two homologous subunits, alpha and beta, typically forming homodimers (alpha-alpha and beta-beta), except in E. coli and other enterobacteria, which form heterodimers (alpha-beta). 94
31528 259857 cd13835 IHF_A Alpha subunit of integration host factor (IHFA). This subfamily consists of the alpha subunit of integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms. 88
31529 259858 cd13836 IHF_B Beta subunit of integration host factor (IHFB). This subfamily consists of the beta subunit of integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms. 89
31530 260013 cd13838 RNase_H_like_Prp8_IV Ribonuclease-like Prp8 domain IV core. This family contains Prp8 domain IV, which adopts a RNase H like fold within its core structure but with little sequence similarity. Prp8, a spliceosome protein, interacts directly with the splice sites and branch regions of precursor-mRNAs and spliceosomal RNAs associated with catalysis of the two steps of splicing. Catalysis of RNA cleavage by RNase H-like proteins involves a two-metal mechanism in which adjacently-bound divalent magnesium ions promote hydrolysis by activation of a water nucleophile and stabilization of the transition-state. However, the Prp8 domain IV contains only one of the canonical metal-binding sites and the coordinating side chains are spatially conserved with respect to Mg2+-coordinating residues within the RNase H fold. 251
31531 260103 cd13839 MEF2_binding Myocyte enhancer factor-2 (MEF2) binding domain of the calcineurin-binding protein cabin-1. The myocyte enhancer factor-2 (MEF2) binding domain, as found in the calcineurin-binding protein cabin-1, adopts an amphipathic alpha-helical structure, which allows it to bind to a hydrophobic groove on the MEF2S domain, forming a triple-helical interaction. Interaction of this domain with MEF2 causes repression of transcription. Cabin-1 inhibits calcineurin-mediated signal transduction in T-cell receptor-mediated signalling pathways, by binding to the activated form of calcineurin. Cabin-1 acts as a co-repressor of MEF2, the myocyte enhancer factor-2, which regulates transcription in a calcium-dependent manner and plays vital roles in T-cell development and function. 35
31532 260104 cd13840 SMBP_like Small metal-binding protein conserved in proteobacteria. This periplasmic protein appears capable of binding multiple equivalents of a variety of divalent and trivalent metals, including Cu(2+) and Fe(3+) but also Mn(2+), Ni(2+), Mg(2+), and Zn(2+). It has been suggested that SMBP is a metal scavenging protein that plays a role in cellular copper management in Nitrosomonas europaea. 89
31533 260105 cd13841 ABBA-PTs ABBA-type aromatic prenyltransferases (PTases). ABBA-type aromatic prenyltransferases (PTases) are a subgroup of prenyltransferases that are characterized by an unusual type of beta/alpha fold with antiparallel beta strands. They lack the (N/D)DxxD motif which is characteristic for many other prenyltransferases. Generally, aromatic prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto aromatic substrates, forming C-C bonds between C-1 or C-3 of the isoprenoid substrate and one of the aromatic carbons of the acceptor substrate by an electrophilic alkylation, or Friedel-Crafts alkylation mechanism. 294
31534 259911 cd13842 CuRO_HCO_II_like Cupredoxin domain of Heme-copper oxidase subunit II. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 95
31535 259912 cd13843 Azurin_like Azurin and similar redox proteins. Azurin is a bacterial blue copper-binding protein. It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The copper of Azurin is tetrahedrally coordinated by a cysteine, 2 histidines, and a methionine residue. The electron transfer reactions are carried out with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Azurin can function as a tumor suppressor; it forms a complex with p53 that triggers apoptosis in various human cancer cells. Auracyanins A and B are from photosynthetic bacteria. They are very similar blue copper proteins with 38% sequence identity and they are homologous to the bacterial redox protein Azurin. However, auracyanin A is expressed only when C. aurantiacus cells are grown in light, whereas auracyanin B is expressed in both dark and light. Thus, auracyanin A may function as a redox partner in photosynthesis, while auracyanin B may function in aerobic respiration. 124
31536 259913 cd13844 CuRO_1_BOD_CotA_like The first Cupredoxin domain of Bilirubin oxidase (BOD), the bacterial endospore coat component CotA, and similar proteins. Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. CotA protein is an abundant component of the outer coat layer of bacterial endospores and it is required for spore resistance against hydrogen peroxide and UV light. Also included in this subfamily is phenoxazinone synthase (PHS), which catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS has been shown to participate in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. These are Laccase-like multicopper oxidases (MCOs) that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 162
31537 259914 cd13845 CuRO_1_AAO The first cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to the MCO family, which couples oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 120
31538 259915 cd13846 CuRO_1_AAO_like_1 The first cupredoxin domain of plant Ascorbate oxidase homologs. This subfamily is composed of plant pollen multicopper oxidases homologous to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to the MCO family, which couples oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbor trinuclear copper binding histidines. 118
31539 259916 cd13847 CuRO_1_AAO_like_2 The first cupredoxin domain of Ascorbate oxidase homologs. This family includes fungal proteins with similarity to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to the multicopper oxidase (MCO) family, which couples oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 117
31540 259917 cd13848 CuRO_1_CopA The first cupredoxin domain of CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. CopA mutant causes a loss of function including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 116
31541 259918 cd13849 CuRO_1_LCC_plant The first cupredoxin domain of plant laccases. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 117
31542 259919 cd13850 CuRO_1_Abr2_like The first cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 117
31543 259920 cd13851 CuRO_1_Fet3p The first Cupredoxin domain of multicopper oxidase Fet3P. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) and a four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the exocellular space and the carboxyl terminus in the cytoplasm. The periplasmic produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 121
31544 259921 cd13852 CuRO_1_McoP_like The first cupredoxin domain of multicopper oxidase McoP and similar proteins. This family includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as the electron acceptor than when using dioxygen, the typical oxidizing substrate of MCOs. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 114
31545 259922 cd13853 CuRO_1_Tth-MCO_like The first cupredoxin domain of the bacterial laccases similar to Tth-MCO from Thermus thermophilus. The subfamily of bacterial laccases includes Tth-MCO and similar proteins. Tth-MCO is a hyperthermophilic multicopper oxidase (MCO) from Thermus thermophilus HB27. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 139
31546 259923 cd13854 CuRO_1_MaLCC_like The first cupredoxin domain of the fungal laccases similar to Ma-LCC from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 122
31547 259924 cd13855 CuRO_1_McoC_like The first cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from the pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic multicopper oxidase, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. They are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Most MCOs have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 121
31548 259925 cd13856 CuRO_1_Tv-LCC_like The first cupredoxin domain of fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 125
31549 259926 cd13857 CuRO_1_Diphenol_Ox The first cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 119
31550 259927 cd13858 CuRO_1_tcLCC2_insect_like The first cupredoxin domain of insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) family includes the majority of insect laccases. One member of the family is laccase 2 from Tribolium castaneum. Laccase 2 is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic - notably phenolic and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 105
31551 259928 cd13859 CuRO_D1_2dMcoN_like The first cupredoxin domain of bacterial two domain multicopper oxidase McoN and similar proteins. This family includes bacterial two domain multicopper oxidases (2dMCOs) represented by the McoN from Nitrosomonas europaea. McoN is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. Its biological function has not been characterized. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. 122
31552 259929 cd13860 CuRO_1_2dMco_1 The first cupredoxin domain of bacteria two domain multicopper oxidase. This subfamily includes bacterial two domain multicopper oxidases (2dMCOs) with similarity to McoN from Nitrosomonas europaea. 2dMCO is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. 119
31553 259930 cd13861 CuRO_1_CumA_like The first cupredoxin domain of CumA like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida, which is involved in the oxidation of Mn(II). However, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCO catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water and has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 119
31554 259931 cd13862 CuRO_1_MCO_like_1 The first cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 123
31555 259932 cd13864 CuRO_1_MCO_like_2 The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 139
31556 259933 cd13865 CuRO_1_LCC_like_3 The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 115
31557 259934 cd13866 CuRO_2_BOD The second cupredoxin domain of Bilirubin oxidase (BOD). Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. It is used in diagnosing jaundice through the determination of bilirubin in serum. BOD is a member of the multicopper oxidase (MCO) family that also includes laccase, ascorbate oxidase and ceruloplasmin. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 152
31558 259935 cd13867 CuRO_2_CueO_FtsP The second Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve as a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the second domain. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 146
31559 259936 cd13868 CuRO_2_CotA_like The second Cupredoxin domain of bacterial laccases including CotA, a bacterial endospore coat component. CotA protein is an abundant component of the outer coat layer in bacterial endospore coat and it is required for spore resistance against hydrogen peroxide and UV light. Laccase is composed of three cupredoxin-like domains and includes one mononuclear and one trinuclear copper center. It is a member of the multicopper oxidase (MCO) family, which couples the oxidation of a substrate with a four-electron reduction of molecular oxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 155
31560 259937 cd13869 CuRO_2_PHS The second Cupredoxin domain of phenoxazinone synthase (PHS). Phenoxazinone synthase (PHS, 2-aminophenol:oxygen oxidoreductase) catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS participates in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. It is a member of the multicopper oxidase (MCO) family, which couples the oxidation of a substrate with a four-electron reduction of molecular oxygen to water. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 166
31561 259938 cd13870 CuRO_2_CopA_like_1 The second cupredoxin domain of CopA copper resistance protein like family. The members of this family are copper resistance protein (CopA) homologs. CopA is multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. CopA is involved in copper resistance in bacteria. CopA mutant causes a loss of function, including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 117
31562 259939 cd13871 CuRO_2_AAO The second cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. MCOs couple oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 166
31563 259940 cd13872 CuRO_2_AAO_like_1 The second cupredoxin domain of plant pollen multicopper oxidase homologous to ascorbate oxidase. The proteins in this subfamily are expressed in plant pollen. They share homology to ascorbate oxidase and other members of the blue copper oxidase family. The expression of the protein is detected during germination and pollen tube growth. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It is a member of the multicopper oxidase (MCO) family that couples oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 141
31564 259941 cd13873 CuRO_2_AAO_like_2 The second cupredoxin domain of plant Ascorbate oxidase homologs. This family includes plant laccases similar to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to multicopper oxidase (MCO) family which couples oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 161
31565 259942 cd13874 CuRO_2_CopA The second cupredoxin domain of CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. CopA mutant causes a loss of function including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 112
31566 259943 cd13875 CuRO_2_LCC_plant The second cupredoxin domain of the plant laccases. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 148
31567 259944 cd13876 CuRO_2_Abr2_like The second cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 138
31568 259945 cd13877 CuRO_2_Fet3p_like The second Cupredoxin domain of multicopper oxidase Fet3P. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) with the four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the extracellular space and the carboxyl terminus in the cytoplasm. The periplasmic produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 148
31569 259946 cd13879 CuRO_2_McoP_like The second cupredoxin domain of multicopper oxidase McoP and similar proteins. This family includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as electron acceptor than when using dioxygen, the typical oxidizing substrate of multicopper oxidases. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 162
31570 259947 cd13880 CuRO_2_MaLCC_like The second cupredoxin domain of the fungal laccases similar to Ma-LCC from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 167
31571 259948 cd13881 CuRO_2_McoC_like The second cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from the pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic MCO, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with the reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. They are composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 142
31572 259949 cd13882 CuRO_2_Tv-LCC_like The second cupredoxin domain of the fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Laccase is a multicopper oxidase (MCO) composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 159
31573 259950 cd13883 CuRO_2_Diphenol_Ox The second cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Laccase is a multicopper oxidase (MCO) composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 164
31574 259951 cd13884 CuRO_2_tcLCC_insect_like The second cupredoxin domain of the insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) subfamily includes the majority of insect laccases. One member is laccase 2 from Tribolium castaneum, which is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic - notably phenolic and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 150
31575 259952 cd13885 CuRO_2_CumA_like The second cupredoxin domain of CumA like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida. CumA is involved in the oxidation of Mn(II) in Pseudomonas putida; however, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCOs catalyze the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water and has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. The MCOs in this subfamily are composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 132
31576 259953 cd13886 CuRO_2_MCO_like_1 The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This family of MCOs is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 163
31577 259954 cd13887 CuRO_2_MCO_like_2 The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This family of MCOs is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 114
31578 259955 cd13888 CuRO_3_McoP_like The third cupredoxin domain of multicopper oxidase McoP and similar proteins. This subfamily includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as electron acceptor than when using dioxygen, the typical oxidizing substrate of multicopper oxidases. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Members of this subfamily contain three cupredoxin domain repeats. The copper ions are bound in several sites; Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 139
31579 259956 cd13889 CuRO_3_BOD The third cupredoxin domain of Bilirubin oxidase (BOD). Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. It is used in diagnosing jaundice through the determination of bilirubin in serum. BOD is a member of the multicopper oxidase (MCO) family that also includes laccase, ascorbate oxidase and ceruloplasmin. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 124
31580 259957 cd13890 CuRO_3_CueO_FtsP The third Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the third domain. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. FtsP does not contain any copper binding sites. 124
31581 259958 cd13891 CuRO_3_CotA_like The third Cupredoxin domain of bacterial laccases including CotA, a bacterial endospore coat component. CotA protein is an abundant component of the outer coat layer in bacterial endospore coat and is required for spore resistance against hydrogen peroxide and UV light. CotA belongs to the laccase-like multicopper oxidase (MCO) family, which are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 143
31582 259959 cd13892 CuRO_3_PHS The third Cupredoxin domain of phenoxazinone synthase (PHS). Phenoxazinone synthase (PHS, 2-aminophenol:oxygen oxidoreductase) catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS has been shown to participate in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. PHS is a member of the laccase-like multicopper oxidase (MCO) family, which are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 184
31583 259960 cd13893 CuRO_3_AAO The third cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 155
31584 259961 cd13894 CuRO_3_AAO_like_1 The third cupredoxin domain of plant Ascorbate oxidase homologs. This subfamily is composed of plant pollen multicopper oxidase homologous to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbor T1 copper or trinuclear copper binding sites. 123
31585 259962 cd13895 CuRO_3_AAO_like_2 The third cupredoxin domain of Ascorbate oxidase homologs. This family includes fungal proteins with similarity to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to multicopper oxidase (MCO) family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 188
31586 259963 cd13896 CuRO_3_CopA The third cupredoxin domain of CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. CopA mutant causes a loss of function including copper tolerance and oxidase activity and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 115
31587 259964 cd13897 CuRO_3_LCC_plant The third cupredoxin domain of the plant laccases. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 139
31588 259965 cd13898 CuRO_3_Abr2_like The third cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 164
31589 259966 cd13899 CuRO_3_Fet3p The third Cupredoxin domain of multicopper oxidase Fet3p. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) with the four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the extracellular space and the carboxyl terminus in the cytoplasm. The periplasmic produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 160
31590 259967 cd13900 CuRO_3_Tth-MCO_like The third cupredoxin domain of the bacterial laccases similar to Tth-MCO from Thermus thermophilus. The subfamily of bacterial laccases includes Tth-MCO and similar proteins. Tth-MCO is a hyperthermophilic multicopper oxidase (MCO) from Thermus thermophilus HB27. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 123
31591 259968 cd13901 CuRO_3_MaLCC_like The third cupredoxin domain of the fungal laccases similar to Ma-LCC from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 157
31592 259969 cd13902 CuRO_3_McoC_like The third cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic multicopper oxidase, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. They are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Most MCOs have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 125
31593 259970 cd13903 CuRO_3_Tv-LCC_like The third cupredoxin domain of the fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 147
31594 259971 cd13904 CuRO_3_Diphenol_Ox The third cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 158
31595 259972 cd13905 CuRO_3_tcLLC2_insect_like The third cupredoxin domain of the insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) family includes the majority of insect laccases. One member of the family is laccase 2 from Tribolium castaneum. Laccase 2 is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic - notably phenolic and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 174
31596 259973 cd13906 CuRO_3_CumA_like The third cupredoxin domain of CumA like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida, which is involved in the oxidation of Mn(II). However, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCO catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water and has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 138
31597 259974 cd13907 CuRO_3_MCO_like_1 The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 154
31598 259975 cd13908 CuRO_3_MCO_like_2 The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 122
31599 259976 cd13909 CuRO_3_MCO_like_3 The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 137
31600 259977 cd13910 CuRO_3_MCO_like_4 The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 166
31601 259978 cd13911 CuRO_3_MCO_like_5 The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. 119
31602 259979 cd13912 CcO_II_C C-terminal domain of Cytochrome c Oxidase subunit II. Cytochrome c Oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit II contains a copper-copper binuclear site called CuA, which is believed to be involved in electron transfer from cytochrome c to the binuclear center (active site) in subunit I. 130
31603 259980 cd13913 ba3_CcO_II_C C-terminal cupredoxin domain of Ba3-like heme-copper oxidase subunit II. The ba3 family of heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and some archaea, which catalyze the reduction of O2 and simultaneously pump protons across the membrane. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. The ba3 family contains oxidases that lack the conserved residues that form the D- and K-pathways in CcO and ubiquinol oxidase. Instead, they contain a potential alternative K-pathway. Additional proton channels have been proposed for this family of oxidases but none have been identified definitively. 99
31604 259981 cd13914 CuRO_HCO_II_like_3 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 108
31605 259982 cd13915 CuRO_HCO_II_like_2 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 98
31606 259983 cd13916 CuRO_HCO_II_like_1 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 93
31607 259984 cd13917 CuRO_HCO_II_like_4 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 88
31608 259985 cd13918 CuRO_HCO_II_like_6 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 139
31609 259986 cd13919 CuRO_HCO_II_like_5 Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits. Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I. 107
31610 259987 cd13920 Stellacyanin Stellacyanin is a subclass of phytocyanins, a plant type I copper protein. Stellacyanin is a subclass of the phytocyanins, a ubiquitous family of plant cupredoxins. Stellacyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The copper is tetrahedrally coordinated by a cysteine, 2 histidines, and a glutamine residue. The glutamine residue substitutes for a methionine ligand typically found in other blue copper proteins. The exact function of stellacyanin is unknown. However, stellacyanin appears to be associated with the plant cell wall; it may be involved in oxidative reactions to build polymeric material making up the cell wall. 101
31611 259988 cd13921 Amicyanin Amicyanin is a type I blue copper protein that plays an essential role in electron transfer. In Paracoccus denitrificans bacteria, amicyanin acts as an intermediary of a three-member redox complex along with methylamine dehydrogenase (MADH) and cytochrome c-551i. The electron is transferred from the active site of MADH via the amicyanin copper ion to the cytochrome heme iron. The electron transfer from MADH to cytochrome c-551i does not involve a ternary complex but occurs via a ping-pong mechanism in which amicyanin uses the same interface for the reactions with MADH and cytochrome c-551i. 81
31612 259989 cd13922 Azurin Azurin is a redox partner for enzymes such as nitrite reductase or arsenite oxidase. Azurin is a bacterial blue copper-binding protein. It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The copper of Azurin is tetrahedrally coordinated by a cysteine, 2 histidines, and a methionine residue. The electron transfer reactions are carried out with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Azurin can function as a tumor suppressor; it forms a complex with p53 that triggers apoptosis in various human cancer cells. 125
31613 381607 cd13925 RPF core lysozyme-like domain of resuscitation-promoting factor proteins. Resuscitation-promoting factor (RPF) proteins, found in various (G+C)-rich Gram-positive bacteria, act to reactivate cultures from stationary phase. This protein shares elements of the structural core of lysozyme and related proteins. Furthermore, it shares a conserved active site glutamate which is required for activity, and has a polysaccharide binding cleft that corresponds to the peptidoglycan binding cleft of lysozyme. Muralytic activity of Rpf in Micrococcus luteus correlates with resuscitation, supporting a mechanism dependent on cleavage of peptidoglycan by RPF. 71
31614 381608 cd13926 N-acetylmuramidase_GH108 N-acetylmuramidase domain of the glycosyl hydrolase 108 family. This domain acts as a lysozyme (N-acetylmuramidase), EC:3.2.1.17. It contains a conserved EGGY motif near the N-terminus; the glutamic acid within this motif is essential for catalytic activity. In bacteria, it may activate the secretion of large proteins via the breaking and rearrangement of the peptidoglycan layer during secretion. It is frequently found at the N-terminus of proteins containing a peptidoglycan binding domain. 91
31615 260106 cd13929 PT-DMATS_CymD aromatic prenyltransferases (PTases) of the DMATS/CymD family. Members of the DMATS/CymD family of ABBA prenyltransferases prenylate indole, tyrosine, and xanthone derivatives. This family of fungal proteins includes cyclic dipeptide N-prenyltransferase (CdpNPT), Brevianamide F prenyltransferase (ftmPT1), fumigaclavine C synthase (FgaPT1), dimethylallyltryptophan synthase (DMATS) and related proteins. CdpNPT accepts a variety of tryptophan-containing cyclic dipeptides, including L-tryptophan itself, and prenylates these substrates inverse at the N-1 position of the indole group. FtmPT1 catalyzes the prenylation of brevianamide F in the biosynthesis of fumitremorgin-type alkaloids. FgaPT1 catalyzes the prenylation of fumigaclavine A. Dimethylallyltryptophan synthase (DMATS) catalyzes the prenylation of L-tryptophan at C-4 of the indole ring during the biosynthesis of ergot alkaloids. 392
31616 260107 cd13930 PT-Tnase Aromatic Prenyltransferases (PTases) associated with tryptophanase. This group of bacterial and fungal proteins shows homology to the DMATS/CymD family of ABBA prenyltransferases, which prenylates indole, tyrosine, and xanthone derivatives. Some of the members, mostly fungal proteins, are associated with tryptophanase-like domains (Tnase) which catalyze the degradation of L-tryptophan to yield indole, pyruvate and ammonia, or the degradation of L-tyrosine to yield phenol, pyruvate and ammonia. This suggests that these otherwise uncharacterized proteins may exhibit multiple functions. 348
31617 260108 cd13931 PT-CloQ_NphB Aromatic Prenyltransferases (PTases) of the CloQ/NphB family. Members of the CloQ/NphB family of ABBA prenyltransferases catalyze the prenylation of phenols, naphthalenes, and phenazines. This family of fungal and bacterial proteins includes dihydrophenazine-1-carboxylate dimethylallyltransferase PpzP, the aromatic prenyltransferase from the clorobiocin biosynthetic pathway CloQ, and related proteins. CloQ catalyzes the attachment of a dimethylallyl moiety to 4-hydroxyphenylpyruvate, part of the biosynthetic pathway of the Streptomyces roseochromogenes antibiotic clorobiocin. PpzP, as well as EpzP, are important for the biosynthesis of endophenazines; they catalyze the prenylation of 5,10-dihydrophenazine-1-carboxylic acid (dhPCA). Streptomyces NphB catalyzes the addition of a 10-carbon geranyl group to small organic aromatic substrates and is involved in the biosynthesis of the antioxidant naphterpin. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto aromatic substrates in biosynthetic pathways of microbial secondary metabolites. 274
31618 259826 cd13932 HN_RTEL1 harmonin_N_like domain of regulator of telomere elongation helicase 1 (also known as RTEL). Mouse Rtel is an essential protein required for the maintenance of both telomeric and genomic stability. RTEL1 appears to maintain genome stability by suppressing homologous recombination (HR). In vitro, purified human and insect RTEL1 have been shown to promote the disassembly of D loop recombination intermediates, in a reaction dependent upon ATP hydrolysis. Human RTEL1 is implicated in the etiology of dyskeratosis congenita (DC), an inherited bone marrow failure and cancer predisposition syndrome. Point mutations in its helicase domains, and truncations which result in loss of its C-terminus, have been discovered in DC families. RTEL1 is also a candidate gene influencing glioma susceptibility. The C-terminal domain of RTEL1, represented here, appears similar to the N-terminal domain of the scaffolding protein harmonin. 99
31619 259827 cd13933 harmonin_N_like_u1 domain similar to the N-terminal protein-binding module of harmonin; uncharacterized subgroup. This domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. Harmonin (not belonging to this group) is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein, which organizes the Usher protein network of the inner ear and the retina. This domain is also related to domains found in several other scaffold proteins which organize supramolecular complexes. 78
31620 260014 cd13934 RNase_H_Dikarya_like Fungal (dikarya) Ribonuclease H, uncharacterized. This family contains dikarya RNase H, many of which are uncharacterized. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove Okazaki fragments during DNA replication. 153
31621 260015 cd13935 RNase_H_bacteria_like RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. This family includes bacterial ribonuclease H (RNase H) enzymes. RNases are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription. 133
31622 260110 cd13936 PANDER_like Domains similar to the Pancreatic-derived factor. FAM3B or PANDER (PANcreatic DERived factor) has been identified as a regulator of glucose homeostasis and beta cell function. The protein is expressed in the endocrine pancreas and co-secreted with insulin in response to glucose, particularly under conditions of insulin resistance. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3B designation. This wider family contains FAM3B and FAM3C, N-terminal domains of N-acetylglucosaminyltransferases, and domains in poorly characterized proteins that have been associated with deafness and the progression of cancer. 149
31623 260111 cd13937 PANDER_GnT-1_2_like PANDER-like domain of N-acetylglucosaminyltransferases. O-linked-mannose beta-1,2-N-acetylglucosaminyltransferase 1 participates in O-mannosyl glycosylation and may be responsible for creating GlcNAc(beta1-2)Man(alpha1-)O-Ser/Thr moieties on alpha dystroglycan and other O-mannosylated proteins. The domain characterized by this model lies N-terminal to the catalytic domain. Its function has not been determined. 148
31624 260112 cd13938 PANDER_like_TMEM2 PANDER-like domain of the transmembrane protein TMEM2. TMEM2 has been characterized as a transmembrane protein that maps to the DFNB7-DFNB11 deafness locus on human chromosome 9. It contains a domain similar to the Pancreatic-derived factor PANDER, C-terminal to a glycine rich G8-domain. The function of the PANDER-like domain in TMEM2 has not been characterized. 168
31625 260113 cd13939 PANDER_FAM3B Pancreatic derived factor. FAM3B or PANDER (PANcreatic DERived factor) has been identified as a regulator of glucose homeostasis and beta cell function. The protein is expressed in the endocrine pancreas and co-secreted with insulin in response to glucose, particularly under conditions of insulin resistance. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3B designation. PANDER induces apoptosis of insulin-secreting beta-cells when over-expressed in vitro. It has been associated with the progression of type 2 diabetes by downregulating beta cell function as well as insulin sensitivity in the liver. 175
31626 260114 cd13940 ILEI_FAM3C Interleukin-like EMT inducer. The secreted factor FAM3C or ILEI (InterLeukin-like Emt Inducer) has been identified as a protein involved in the epithelial-mesenchymal transition (EMT) and in processes associated with metastasis formation and the progression of cancer. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3C designation. ILEI has been found to be widely expressed, and to be involved in retinal development. 171
31627 260115 cd13941 PANDER_like_KIAA1199 PANDER-like domain of KIAA1199 and similar proteins. KIAA1199 has been characterized as a protein associated with poor survival when upregulated in human cancer, as well as with nonsyndromic loss of hearing when mutated. It contains a C-terminal domain similar to the Pancreatic-derived factor PANDER; the function of this PANDER-like domain has not been characterized. 157
31628 260116 cd13944 lytB_ispH 4-hydroxy-3-methylbut-2-enyl diphosphate reductase. The 4-hydroxy-3-methylbut-2-enyl diphosphate (HMBPP) reductase (called lytB or ispH) is the terminal enzyme of the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway, one of the two metabolic routes for isoprenoid biosynthesis. The MEP pathway is essential in many eubacteria, plants, and the malaria parasite. LytB converts HMBPP into isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). 275
31629 260117 cd13945 Chs5_N N-terminal dimerization domain of Chs5 and similar proteins. Chs5/6 is a multi-protein complex conserved in fungi that interacts with chitin synthase III (Chs3p) and is involved in its transport to the cell surface from the trans-Golgi network, functioning as an exomer cargo adapter. Chs5p appears to form a complex with Chs6p and its paralogs Bch1p, Bud7p, and Bch2p. In this complex, Chs5p may act as a central scaffold. The N-terminal domain characterized by this model forms a homodimer and has been shown to interact with Chs6p and Bch1p. It may function as a flexible hinge domain that allows the exomer to interact with both proteins and the Golgi membrane as the latter undergoes changes in curvature during the formation of transport vesicles. The dimerization domain sits N-terminally to a conserved FBE (FN3-BRCT) unit, which binds Arf1 and is involved in the recruitment of the exomer to the membrane. 73
31630 260118 cd13946 LysW Lysine biosynthesis protein LysW. LysW functions as a carrier protein in the biosynthesis pathway of lysine. The C-terminal glutamate sidechain of LysW attaches to the amino group of alpha-aminoadipate (AAA); this peptide bond formation is catalyzed by the ligase LysX. AAA remains associated with LysW throughout its biosynthetic conversion to lysine. LysW also acts to protect the amino group of glutamate in arginine biosynthesis. 54
31631 320087 cd13949 7tm_V1R_pheromone vomeronasal organ pheromone receptor type-1 family, member of the seven-transmembrane G protein-coupled receptor superfamily. This family represents vomeronasal type-1 receptors (V1Rs) that are specifically expressed in the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and other rodents, but it is non-functional or absent in humans, apes and monkeys. The VNO detects pheromones, chemicals released from animals that can influence social and reproductive behaviors, such as male-male aggression or sexual mating, in other members of the same species. On the other hand, the olfactory epithelium, which contains olfactory receptor neurons inside the nasal cavity, is responsible for detecting odor molecules (smells). There are two types of vertebrate pheromones: (1) small volatile molecules such as 2-heptanone, a substance in the urine of both male and female that extends estrous cycle length in female mice; and (2) water-soluble molecules such as the major histocompatibility complex (MHC) class-I peptide, which can induce the pregnancy block effect, the tendency for female rodents to abort their pregnancies upon exposure to the scent of an unknown male. While V1Rs and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO, V2Rs (type-2 vomeronasal receptors) and G-alpha(o) protein are coexpressed in the basal layer of the VNO. Activation of V1R or V2R causes stimulation of the phospholipase pathway, generating diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). V1Rs have a short N-terminal extracellular domain, whereas V2Rs contain a long N-terminal extracellular domain, which is believed to bind pheromones. Although V1Rs share the seven-transmembrane domain structure with V2Rs and olfactory receptors, they share little sequence similarity with each other. 295
31632 320088 cd13950 7tm_TAS2R mammalian taste receptors type 2, member of the seven-transmembrane G protein-coupled receptor superfamily. This group represents a family of mammalian taste receptors (TAS2Rs), which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 288
31633 320089 cd13951 7tmF_Frizzled_SMO class F frizzled/smoothened family, member of the 7-transmembrane G protein-coupled receptor superfamily. The class F G protein-coupled receptors includes the frizzled (FZD) family of seven-transmembrane proteins consisting of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. Also included in the class F family is the closely related smoothened (SMO), which is a transmembrane G protein-coupled receptor that acts as the transducer of the hedgehog (HH) signaling pathway. SMO is activated by the hedgehog (HH) family of proteins acting on the 12-transmembrane domain receptor patched (PTCH), which constitutively inhibits SMO. Thus, in the absence of HH proteins, PTCH inhibits SMO signaling. On the other hand, binding of HH to the PTCH receptor activates its internalization and degradation, thereby releasing the PTCH inhibition of SMO. This allows SMO to trigger intracellular signaling and the subsequent activation of the Gli family of zinc finger transcriptional factors and induction of HH target gene expression (PTCH, Gli1, cyclin, Bcl-2, etc). 
The WNT and HH signaling pathways play critical roles in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 314
31634 410627 cd13952 7tm_classB class B family of seven-transmembrane G protein-coupled receptors. The class B of seven-transmembrane GPCRs is classified into three major subfamilies: subfamily B1 (secretin-like receptor family), B2 (adhesion family), and B3 (Methuselah-like family). The class B receptors have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi or prokaryotes. The B1 subfamily comprises receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the subfamily B1 receptors preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The subfamily B2 consists of cell-adhesion receptors with 33 members in humans and vertebrates. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing a variety of structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, linked to a class B seven-transmembrane domain. 
These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and five or four thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. Furthermore, the subfamily B3 includes Methuselah (Mth) protein, which was originally identified in Drosophila as a GPCR affecting stress resistance and aging, and its closely related proteins. 260
31635 320091 cd13953 7tm_classC_mGluR-like metabotropic glutamate receptor-like class C family of seven-transmembrane G protein-coupled receptors superfamily. The class C GPCRs consist of glutamate receptors (mGluR1-8), the extracellular calcium-sensing receptors (CaSR), the gamma-amino-butyric acid type B receptors (GABA-B), the vomeronasal type-2 pheromone receptors (V2R), the type 1 taste receptors (TAS1R), and the promiscuous L-alpha-amino acid receptor (GPRC6A), as well as several orphan receptors. Structurally, these receptors are typically composed of a large extracellular domain containing a Venus flytrap module which possesses the orthosteric agonist-binding site, a cysteine-rich domain (CRD) with the exception of GABA-B receptors, and the seven-transmembrane domains responsible for G protein activation. Moreover, the Venus flytrap module shows high structural homology with bacterial periplasmic amino acid-binding proteins, which serve as primary receptors in transport of a variety of soluble substrates such as amino acids and polysaccharides, among many others. The class C GPCRs exist as either homo- or heterodimers, which are essential for their function. The GABA-B1 and GABA-B2 receptors form a heterodimer via interactions between the N-terminal Venus flytrap modules and the C-terminal coiled-coil domains. On the other hand, heterodimeric CaSRs and Tas1Rs and homodimeric mGluRs utilize Venus flytrap interactions and intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD), which can also act as a molecular link to mediate the signal between the Venus flytrap and the 7TMs. Furthermore, members of the class C GPCRs bind a variety of endogenous ligands, ranging from amino acids and ions to pheromones and sugar molecules, and play important roles in many physiological processes such as synaptic transmission, calcium homeostasis, and the sensation of sweet and umami tastes. 251
31636 320092 cd13954 7tmA_OR olfactory receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
31637 260119 cd13956 PT_UbiA UbiA family of prenyltransferases (PTases). Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 271
31638 260120 cd13957 PT_UbiA_Cox10 Protoheme IX farnesyltransferase. Protoheme IX farnesyltransferase (also called heme O synthase, heme A:farnesyltransferase, cytochrome c oxidase subunit X [Cox10]) converts heme B (protoheme IX) to heme O by substitution of the vinyl group on carbon 2 of the heme B porphyrin ring with a hydroxyethyl farnesyl side group. It is localized at the mitochondrial inner membrane. Eukaryotic Cox10 is important for the maturation of the heme A prosthetic group of cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, that catalyzes the electron transfer from reduced cytochrome c to oxygen. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 271
31639 260121 cd13958 PT_UbiA_chlorophyll Bacteriochlorophyll/chlorophyll synthetase. Chlorophyll synthase catalyzes the last step of chlorophyll (Chl) biosynthesis, the addition of the tetraprenyl (phytyl or geranylgeranyl) side chain. In plant chloroplast, the chlorophyll synthase is located in thylakoid membrane and has been shown to also have a regulatory or channeling function. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 277
31640 260122 cd13959 PT_UbiA_COQ2 4-Hydroxybenzoate polyprenyltransferase. 4-Hydroxybenzoate polyprenyltransferase, also known as Coq2, catalyzes the prenylation of p-hydroxybenzoate with an all-trans polyprenyl group, an important step in ubiquinone (CoQ) biosynthesis. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 272
31641 260123 cd13960 PT_UbiA_HPT1 Tocopherol phytyltransferase. Tocopherol polyprenyltransferase (TPT1), also known as homogentisate phytyltransferase 1 (HPT1), tocopherol phytyltransferase, or VTE2, catalyzes the first step in the biosynthesis of the tocopherol forms of vitamin E, which involves the prenylation of homogentisate using phytyl diphosphate (PDP) as the prenyl donor. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 289
31642 260124 cd13961 PT_UbiA_DGGGPS Geranylgeranylglycerol-phosphate geranylgeranyltransferase. Digeranylgeranylglyceryl phosphate synthase (DGGGPS) transfers a geranylgeranyl group from geranylgeranyl diphosphate to (S)-3-O-geranylgeranylglyceryl phosphate to form (S)-2,3-di-O-geranylgeranylglyceryl phosphate, as part of the isoprenoid ether lipid biosynthesis. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 270
31643 260125 cd13962 PT_UbiA_UBIAD1 1,4-Dihydroxy-2-naphthoate octaprenyltransferase. Human UBIAD1 is an enzyme involved in the synthesis of MK-4. Menaquinones (MKs, also called bacterial forms) are one of the two forms of natural vitamin K, the other being the plant form, phylloquinone (PK). All forms of vitamin K have a 2-methyl-1,4-naphthoquinone (menadione; K3) ring structure in common. At the 3-position of the ring, PK has a phytyl side chain while MKs have several repeating prenyl units. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. 283
31644 260126 cd13963 PT_UbiA_2 UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown. 278
31645 260127 cd13964 PT_UbiA_1 UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown. 282
31646 260128 cd13965 PT_UbiA_3 UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown. 273
31647 260129 cd13966 PT_UbiA_4 UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown. 272
31648 260130 cd13967 PT_UbiA_5 UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown. 277
31649 270870 cd13968 PKc_like Catalytic domain of the Protein Kinase superfamily. The PK superfamily contains the large family of typical PKs that includes serine/threonine kinases (STKs), protein tyrosine kinases (PTKs), and dual-specificity PKs that phosphorylate both serine/threonine and tyrosine residues of target proteins, as well as pseudokinases that lack crucial residues for catalytic activity and/or ATP binding. It also includes phosphoinositide 3-kinases (PI3Ks), aminoglycoside 3'-phosphotransferases (APHs), choline kinase (ChoK), Actin-Fragmin Kinase (AFK), and the atypical RIO and Abc1p-like protein kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to their target substrates; these include serine/threonine/tyrosine residues in proteins for typical or atypical PKs, the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives for PI3Ks, the 4-hydroxyl of PtdIns for PI4Ks, and other small molecule substrates for APH/ChoK and similar proteins such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. 136
31650 270871 cd13969 ADCK1-like aarF domain containing kinase 1 and similar proteins. This subfamily is composed of uncharacterized ABC1 kinase-like proteins including the human protein called aarF domain containing kinase 1 (ADCK1). Eukaryotes contain at least three ABC1-like proteins: in humans, these are ADCK3 and the putative protein kinases named ADCK1 and ADCK2. Yeast Abc1p and its human homolog ADCK3 are atypical protein kinases required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Plant subfamilies 14 and 15 (ABC1K14-15) belong to the same group of ABC1 kinases as human ADCK1. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family. 253
31651 270872 cd13970 ABC1_ADCK3 Activator of bc1 complex (ABC1) kinases, also called aarF domain containing kinase 3. This subfamily is composed of the atypical yeast protein kinase Abc1p, its human homolog ADCK3 (also called CABC1), and similar proteins. Abc1p (also called Coq8p) is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is necessary for the formation of a multi-subunit Q-biosynthetic complex and may also function in the regulation of Q synthesis. Human ADCK3 is able to rescue defects in Q synthesis and the phosphorylation state of Coq proteins in yeast Abc1 (or Coq8) mutants. Mutations in ADCK3 cause progressive cerebellar ataxia and atrophy due to Q10 deficiency. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Subfamily 13 (ABC1K13) of plant ABC1 kinases belongs in this subfamily with yeast Abc1p and human ADCK3. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family. 251
31652 270873 cd13971 ADCK2-like aarF domain containing kinase 2 and similar proteins. This subfamily is composed of uncharacterized ABC1 kinase-like proteins including the human protein called aarF domain containing kinase 2 (ADCK2). Eukaryotes contain at least three ABC1-like proteins; in humans, these are ADCK3 and the putative protein kinases named ADCK1 and ADCK2. Yeast Abc1p and its human homolog ADCK3 are atypical protein kinases required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Plant subfamily 10 (ABC1K10) belongs to the same group of ABC1 kinases as human ADCK2. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family. 298
31653 270874 cd13972 UbiB Ubiquinone biosynthetic protein UbiB. UbiB is the prokaryotic homolog of yeast Abc1p and human ADCK3 (aarF domain containing kinase 3). It is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is required in the first monooxygenase step in Q biosynthesis. Mutant strains with disrupted ubiB genes lack Q and accumulate octaprenylphenol, a Q biosynthetic intermediate. 247
31654 270875 cd13973 PK_MviN-like Pseudokinase domain of the peptidoglycan biosynthetic protein MviN. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. This family is composed of the mycobacterial protein MviN and similar proteins. MviN is an integral membrane protein that is essential for growth and is required for cell wall integrity and peptidoglycan (PG) biosynthesis. It comprises 14 predicted transmembrane (TM) helices at the N-terminus, followed by an intracellular pseudokinase domain linked through a single TM helix to a carbohydrate binding extracellular domain. Phosphorylation of the MviN pseudokinase domain by the PG-sensitive serine/threonine protein kinase PknB recruits a forkhead associated (FHA) domain protein FhaA, which modulates local PG synthesis at cell poles and the septum. The MviN pseudokinase forms a canonical receptor kinase dimer. 236
31655 270876 cd13974 STKc_SHIK Catalytic domain of the Serine/Threonine kinase, SINK-homologous inhibitory kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SHIK, also referred to as STK40 or LYK4, is a cytoplasmic and nuclear protein that is involved in the negative regulation of NF-kappaB- and p53-mediated transcription. It was identified as a protein related to SINK, a p65-interacting protein that inhibits p65 phosphorylation by the catalytic subunit of PKA, thereby inhibiting transcriptional competence of NF-kappaB. The SHIK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31656 270877 cd13975 PKc_Dusty Catalytic domain of the Dual-specificity Protein Kinase, Dusty. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. Dusty protein kinase is also called Receptor-interacting protein kinase 5 (RIPK5 or RIP5) or RIP-homologous kinase. It is widely distributed in the central nervous system, and may be involved in inducing both caspase-dependent and caspase-independent cell death. The Dusty subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31657 270878 cd13976 PK_TRB Pseudokinase domain of Tribbles Homolog proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Tribbles Homolog (TRB) proteins interact with many proteins involved in signaling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, differentiation, and gene expression. TRB proteins bind to the middle kinase in mitogen activated protein kinase (MAPK) signaling cascades, MAPK kinases. They regulate the activity of MAPK kinases and thus affect MAPK signaling. In Drosophila, Tribbles regulates String, the ortholog of mammalian Cdc25, during morphogenesis. String is implicated in the progression of mitosis during embryonic development. Vertebrates contain three TRB proteins encoded by three separate genes: Tribbles-1 (TRB1 or TRIB1), Tribbles-2 (TRB2 or TRIB2), and Tribbles-3 (TRB3 or TRIB3). The TRB subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 242
31658 270879 cd13977 STKc_PDIK1L Catalytic domain of the Serine/Threonine kinase, PDLIM1 interacting kinase 1 like. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PDIK1L is also called STK35 or CLIK-1. It is predominantly a nuclear protein which is capable of autophosphorylation. Through its interaction with the PDZ-LIM protein CLP-36, it is localized to actin stress fibers. The PDIK1L subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 322
31659 270880 cd13978 STKc_RIP Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP kinases serve as essential sensors of cellular stress. They are involved in regulating NF-kappaB and MAPK signaling, and are implicated in mediating cellular processes such as apoptosis, necroptosis, differentiation, and survival. RIP kinases contain a homologous N-terminal kinase domain and varying C-terminal domains. Higher vertebrates contain multiple RIP kinases, with mammals harboring at least five members. RIP1 and RIP2 harbor C-terminal domains from the Death domain (DD) superfamily while RIP4 contains ankyrin (ANK) repeats. RIP3 contains a RIP homotypic interaction motif (RHIM) that facilitates binding to RIP1. RIP1 and RIP3 are important in apoptosis and necroptosis, while RIP2 and RIP4 play roles in keratinocyte differentiation and inflammatory immune responses. The RIP subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31660 270881 cd13979 STKc_Mos Catalytic domain of the Serine/Threonine kinase, Oocyte maturation factor Mos. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mos (or c-Mos) is a germ-cell specific kinase that plays roles in both the release of primary arrest and the induction of secondary arrest in oocytes. It is expressed towards the end of meiosis I and is quickly degraded upon fertilization. It is a component of the cytostatic factor (CSF), which is responsible for metaphase II arrest. In addition, Mos activates a phosphorylation cascade that leads to the activation of the p34 subunit of MPF (mitosis-promoting factor or maturation promoting factor), a cyclin-dependent kinase that is responsible for the release of primary arrest in meiosis I. The Mos subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31661 270882 cd13980 STKc_Vps15 Catalytic domain of the Serine/Threonine kinase, Vacuolar protein sorting-associated protein 15. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Vps15 is a large protein consisting of an N-terminal kinase domain, a C-terminal WD-repeat containing domain, and an intermediate bridge domain that contains HEAT repeats. The kinase domain is necessary for the signaling functions of Vps15. Human Vps15 was previously called p150. It associates with and regulates Vps34, also called Class III phosphoinositide 3-kinase (PI3K), which catalyzes the phosphorylation of D-myo-phosphatidylinositol (PtdIns). Vps34 is the only PI3K present in yeast. It plays an important role in the regulation of protein and vesicular trafficking and sorting, autophagy, trimeric G-protein signaling, and phagocytosis. The Vps15 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and PI3K. 278
31662 270883 cd13981 STKc_Bub1_BubR1 Catalytic domain of the Serine/Threonine kinases, Spindle assembly checkpoint proteins Bub1 and BubR1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Bub1 (Budding uninhibited by benzimidazoles 1), BubR1, and similar proteins. They contain an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding and a C-terminal kinase domain. Bub1 and BubR1 are involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. Impaired SAC leads to genomic instabilities and tumor development. Bub1 and BubR1 facilitate the localization of SAC proteins to kinetochores and regulate kinetochore-microtubule (K-MT) attachments. Repression studies of Bub1 and BubR1 show that they exert an additive effect in misalignment phenotypes and may function cooperatively or in parallel pathways in regulating K-MT attachments. The Bub1/BubR1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 298
31663 270884 cd13982 STKc_IRE1 Catalytic domain of the Serine/Threonine kinase, Inositol-requiring protein 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is an ER-localized type I transmembrane protein with kinase and endoribonuclease domains in the cytoplasmic side. It acts as an ER stress sensor and is the oldest and most conserved component of the unfolded protein response (UPR) in eukaryotes. The UPR is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. During ER stress, IRE1 dimerizes and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and XBP1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). The Ire1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31664 270885 cd13983 STKc_WNK Catalytic domain of the Serine/Threonine kinase, With No Lysine (WNK) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNKs comprise a subfamily of STKs with an unusual placement of a catalytic lysine relative to all other protein kinases. They are critical in regulating ion balance and are thus important components in the control of blood pressure. They are also involved in cell signaling, survival, proliferation, and organ development. WNKs are activated by hyperosmotic or low-chloride hypotonic stress and they function upstream of SPAK and OSR1 kinases, which regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. There are four vertebrate WNKs which show varying expression patterns. WNK1 and WNK2 are widely expressed while WNK3 and WNK4 show a more restricted expression pattern. Because mutations in human WNK1 and WNK4 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension (due to increased sodium reabsorption) and hyperkalemia (due to impaired renal potassium secretion), more studies have been conducted on these two proteins than on WNK2 and WNK3. The WNK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31665 270886 cd13984 PK_NRBP1_like Pseudokinase domain of Nuclear Receptor Binding Protein 1 and similar proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. This subfamily is composed of NRBP1, also called MLF1-adaptor molecule (MADM), and MADML. NRBP1 was originally named based on the presence of nuclear binding and localization motifs prior to functional analyses. It is expressed ubiquitously and is found to localize in the cytoplasm, not the nucleus. NRBP1 is an adaptor protein that interacts with myeloid leukemia factor 1 (MLF1), an oncogene that enhances myeloid development of hematopoietic cells. It also interacts with the small GTPase Rac3. NRBP1 may also be involved in Golgi to ER trafficking. MADML (for MADM-Like) has been shown to be expressed throughout development in Xenopus laevis with highest expression found in the developing lens and retina. The NRBP1-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31666 270887 cd13985 STKc_GAK_like Catalytic domain of cyclin G-Associated Kinase-like proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes cyclin G-Associated Kinase (GAK), Drosophila melanogaster Numb-Associated Kinase (NAK)-like proteins, and similar protein kinases. GAK plays regulatory roles in clathrin-mediated membrane trafficking, the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses. NAK plays a role in asymmetric cell division through its association with Numb. It also regulates the localization of Dlg, a protein essential for septate junction formation. The GAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
31667 270888 cd13986 STKc_16 Catalytic domain of Serine/Threonine Kinase 16. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK16 is associated with many names including Myristylated and Palmitylated Serine/threonine Kinase 1 (MPSK1), Kinase related to cerevisiae and thaliana (Krct), and Protein Kinase expressed in day 12 fetal liver (PKL12). It is widely expressed in mammals with highest levels found in liver, testis, and kidney. It is localized in the Golgi but is translocated to the nucleus upon disorganization of the Golgi. STK16 is constitutively active and is capable of phosphorylating itself and other substrates. It may be involved in regulating stromal-epithelial interactions during mammary gland ductal morphogenesis. It may also function as a transcriptional co-activator of type-C natriuretic peptide and VEGF. The STK16 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 282
31668 270889 cd13987 STKc_SBK1 Catalytic domain of the Serine/Threonine kinase, SH3 Binding Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SBK1, also called BSK146, is predominantly expressed in the brain. Its expression is increased in the developing brain during the late embryonic stage, coinciding with dramatic neuronal proliferation, migration, and maturation. SBK1 may play an important role in regulating brain development. The SBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31669 270890 cd13988 STKc_TBK1 Catalytic domain of the Serine/Threonine kinase, TANK Binding Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TBK1 is also called T2K and NF-kB-activating kinase. It is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. In addition, TBK1 may also play roles in cell transformation and oncogenesis. The TBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 316
31670 270891 cd13989 STKc_IKK Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The IKK complex functions as a master regulator of Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. It is composed of two kinases, IKKalpha and IKKbeta, and the regulatory subunit IKKgamma or NEMO (NF-kB Essential MOdulator). IKKs facilitate the release of NF-kB dimers from an inactive state, allowing them to migrate to the nucleus where they regulate gene transcription. There are two IKK pathways that regulate NF-kB signaling, called the classical (involving IKKbeta and NEMO) and non-canonical (involving IKKalpha) pathways. The classical pathway regulates the majority of genes activated by NF-kB. The IKK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 289
31671 270892 cd13990 STKc_TLK Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. They phosphorylate and regulate Anti-silencing function 1 protein (Asf1), a histone H3/H4 chaperone that helps facilitate the assembly of chromatin following DNA replication during S phase. TLKs also phosphorylate the H3 histone tail and are essential in transcription. Vertebrates contain two subfamily members, TLK1 and TLK2. The TLK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
31672 270893 cd13991 STKc_NIK Catalytic domain of the Serine/Threonine kinase, NF-kappaB Inducing Kinase (NIK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NIK, also called mitogen activated protein kinase kinase kinase 14 (MAP3K14), phosphorylates and activates Inhibitor of NF-KappaB Kinase (IKK) alpha, which is a regulator of NF-kB proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. NIK is essential in the IKKalpha-mediated non-canonical NF-kB signaling pathway, in which IKKalpha processes the IkB-like C-terminus of NF-kB2/p100 to produce p52, allowing the p52/RelB dimer to migrate to the nucleus where it regulates gene transcription. NIK also plays an important role in Toll-like receptor 7/9 signaling cascades. The NIK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
31673 270894 cd13992 PK_GC Pseudokinase domain of membrane Guanylate Cyclase receptors. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs lacks a critical aspartate involved in ATP binding and does not exhibit kinase activity. It functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
31674 270895 cd13993 STKc_Pat1_like Catalytic domain of Fungal Pat1-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Pat1 (also called Ran1), Saccharomyces cerevisiae VHS1 and KSP1, and similar fungal STKs. Pat1 blocks Mei2, an RNA-binding protein which is indispensable in the initiation of meiosis. Pat1 is inactivated and Mei2 activated, which initiates meiosis, under nutrient-deprived conditions through a signaling cascade involving Ste11. Meiosis induced by Pat1 inactivation may show different characteristics than normal meiosis including aberrant positioning of centromeres. VHS1 was identified in a screen for suppressors of cell cycle arrest at the G1/S transition, while KSP1 may be involved in regulating PRP20, which is required for mRNA export and maintenance of nuclear structure. The Pat1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31675 270896 cd13994 STKc_HAL4_like Catalytic domain of Fungal Halotolerance protein 4-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of HAL4, Saccharomyces cerevisiae Ptk2/Stk2, and similar fungal proteins. Proteins in this subfamily are involved in regulating ion transporters. In budding and fission yeast, HAL4 promotes potassium ion uptake, which increases cellular resistance to other cations such as sodium, lithium, and calcium ions. HAL4 stabilizes the major high-affinity K+ transporter Trk1 at the plasma membrane under low K+ conditions, which prevents endocytosis and vacuolar degradation. Budding yeast Ptk2 phosphorylates and regulates the plasma membrane H+ ATPase, Pma1. The HAL4-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31676 270897 cd13995 STKc_MAP3K8 Catalytic domain of the Serine/Threonine kinase, Mitogen-Activated Protein Kinase (MAPK) Kinase Kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP3K8 is also called Tumor progression locus 2 (Tpl2) or Cancer Osaka thyroid (Cot), and was first identified as a proto-oncogene in T-cell lymphoma induced by MoMuL virus and in breast carcinoma induced by MMTV. Activated MAP3K8 induces various MAPK pathways including Extracellular Regulated Kinase (ERK) 1/2, c-Jun N-terminal kinase (JNK), and p38. It plays a pivotal role in innate immunity, linking Toll-like receptors to the production of TNF and the activation of ERK in macrophages. It is also required in interleukin-1beta production and is critical in host defense against Gram-positive bacteria. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MAP3K8 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31677 270898 cd13996 STKc_EIF2AK Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis under different stress conditions: General Control Non-derepressible-2 (GCN2) which is activated during amino acid or serum starvation; protein kinase regulated by RNA (PKR) which is activated by double stranded RNA; heme-regulated inhibitor kinase (HRI) which is activated under heme-deficient conditions; and PKR-like endoplasmic reticulum kinase (PERK) which is activated when misfolded proteins accumulate in the ER. The EIF2AK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 273
31678 270899 cd13997 PKc_Wee1_like Catalytic domain of the Wee1-like Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. This subfamily is composed of the dual-specificity kinase Myt1, the protein tyrosine kinase Wee1, and similar proteins. These proteins are cell cycle checkpoint kinases that are involved in the regulation of cyclin-dependent kinase CDK1, the master engine for mitosis. CDK1 is kept inactivated through phosphorylation of N-terminal thr (T14 by Myt1) and tyr (Y15 by Myt1 and Wee1) residues. Mitosis progression is ensured through activation of CDK1 by dephosphorylation and inactivation of Myt1/Wee1. The Wee1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31679 270900 cd13998 STKc_TGFbR-like Catalytic domain of Transforming Growth Factor beta Receptor-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of receptors for the TGFbeta family of secreted signaling molecules including TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. There are two types of TGFbeta receptors included in this subfamily, I and II, that play different roles in signaling. For signaling to occur, the ligand first binds to the high-affinity type II receptor, which is followed by the recruitment of the low-affinity type I receptor to the complex and its activation through trans-phosphorylation by the type II receptor. The active type I receptor kinase starts intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. Different ligands interact with various combinations of types I and II receptors to elicit a specific signaling pathway. Activins primarily signal through combinations of ACVR1b/ALK7 and ACVR2a/b; myostatin and GDF11 through TGFbR1/ALK4 and ACVR2a/b; BMPs through ACVR1/ALK1 and BMPR2; and TGFbeta through TGFbR1 and TGFbR2. The TGFbR-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31680 270901 cd13999 STKc_MAP3K-like Catalytic domain of Mitogen-Activated Protein Kinase (MAPK) Kinase Kinase-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed mainly of MAP3Ks and similar proteins, including TGF-beta Activated Kinase-1 (TAK1, also called MAP3K7), MAP3K12, MAP3K13, Mixed lineage kinase (MLK), MLK-Like mitogen-activated protein Triple Kinase (MLTK), and Raf (Rapidly Accelerated Fibrosarcoma) kinases. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Also included in this subfamily is the pseudokinase Kinase Suppressor of Ras (KSR), which is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway. 245
31681 270902 cd14000 STKc_LRRK Catalytic domain of the Serine/Threonine kinase, Leucine-Rich Repeat Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. Vertebrates contain two members, LRRK1 and LRRK2, which show complementary expression in the brain. Mutations in LRRK2 are linked to both familial and sporadic forms of Parkinson's disease. The normal roles of LRRKs are not clearly defined. They may be involved in mitogen-activated protein kinase (MAPK) pathways, protein translation control, programmed cell death pathways, and cytoskeletal dynamics. The LRRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31682 270903 cd14001 PKc_TOPK Catalytic domain of the Dual-specificity protein kinase, Lymphokine-activated killer T-cell-originated protein kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TOPK, also called PDZ-binding kinase (PBK), is activated at the early stage of mitosis and plays a critical role in cytokinesis. It partly functions as a mitogen-activated protein kinase (MAPK) kinase and is capable of phosphorylating p38, JNK1, and ERK2. TOPK also plays a role in DNA damage sensing and repair through its phosphorylation of histone H2AX. It contributes to cancer development and progression by downregulating the function of tumor suppressor p53 and reducing cell-cycle regulatory proteins. TOPK is found highly expressed in breast and skin cancer cells. The TOPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
31683 270904 cd14002 STKc_STK36 Catalytic domain of Serine/Threonine Kinase 36. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK36, also called Fused (or Fu) kinase, is involved in the Hedgehog signaling pathway. It is activated by the Smoothened (SMO) signal transducer, resulting in the stabilization of GLI transcription factors and the phosphorylation of SUFU to facilitate the nuclear accumulation of GLI. In Drosophila, Fused kinase is maternally required for proper segmentation during embryonic development and for the development of legs and wings during the larval stage. In mice, STK36 is not necessary for embryonic development, although mice deficient in STK36 display growth retardation postnatally. The STK36 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31684 270905 cd14003 STKc_AMPK-like Catalytic domain of AMP-activated protein kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The AMPK-like subfamily is composed of AMPK, MARK, BRSK, NUAK, MELK, SNRK, TSSK, and SIK, among others. LKB1 serves as a master upstream kinase that activates AMPK and most AMPK-like kinases. AMPK, also called SNF1 (sucrose non-fermenting1) in yeasts and SnRK1 (SNF1-related kinase1) in plants, is a heterotrimeric enzyme composed of a catalytic alpha subunit and two regulatory subunits, beta and gamma. It is a stress-activated kinase that serves as master regulator of glucose and lipid metabolism by monitoring carbon and energy supplies, via sensing the cell's AMP:ATP ratio. MARKs phosphorylate tau and related microtubule-associated proteins (MAPs), and regulate microtubule-based intracellular transport. They are involved in embryogenesis, epithelial cell polarization, cell signaling, and neuronal differentiation. BRSKs play important roles in establishing neuronal polarity. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. The AMPK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31685 270906 cd14004 STKc_PASK Catalytic domain of the Serine/Threonine kinase, Per-ARNT-Sim (PAS) domain Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PASK (or PASKIN) is a nutrient and energy sensor and thus plays an important role in maintaining cellular energy homeostasis. It coordinates the utilization of glucose in response to metabolic demand. It contains an N-terminal PAS domain which directly interacts with and inhibits a C-terminal catalytic kinase domain. The PAS domain serves as a sensory module for different environmental signals such as light, redox state, and various metabolites. Binding of ligands to the PAS domain causes structural changes which lead to kinase activation and the phosphorylation of substrates to trigger the appropriate cellular response. The PASK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31686 270907 cd14005 STKc_PIM Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are two PIM1 and three PIM2 isoforms as a result of alternative translation initiation sites, while there is only one PIM3 protein. Compound knockout mice deficient of all three PIM kinases that survive the perinatal period show a profound reduction in body size, indicating that PIMs are important for body growth. The PIM subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31687 270908 cd14006 STKc_MLCK-like Catalytic kinase domain of Myosin Light Chain Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This family is composed of MLCKs and related MLCK-like kinase domains from giant STKs such as titin, obscurin, SPEG, Unc-89, Trio, kalirin, and Twitchin. Also included in this family are Death-Associated Protein Kinases (DAPKs) and Death-associated protein kinase-Related Apoptosis-inducing protein Kinase (DRAKs). MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. Titin, obscurin, Twitchin, and SPEG are muscle proteins involved in the contractile apparatus. The giant STKs are multidomain proteins containing immunoglobulin (Ig), fibronectin type III (FN3), SH3, RhoGEF, PH and kinase domains. Titin, obscurin, Twitchin, and SPEG contain many Ig domain repeats at the N-terminus, while Trio and Kalirin contain spectrin-like repeats. The MLCK-like family is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 247
31688 270909 cd14007 STKc_Aurora Catalytic domain of the Serine/Threonine kinase, Aurora kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Yeast contains only one Aurora kinase while most higher eukaryotes have two. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). Aurora-A regulates cell cycle events from the late S-phase through the M-phase including centrosome maturation, mitotic entry, centrosome separation, spindle assembly, chromosome alignment, cytokinesis, and mitotic exit. Aurora-A activation depends on its autophosphorylation and binding to the microtubule-associated protein TPX2. Aurora-B is most active at the transition during metaphase to the end of mitosis. It is critical for accurate chromosomal segregation, cytokinesis, protein localization to the centrosome and kinetochore, correct microtubule-kinetochore attachments, and regulation of the mitotic checkpoint. Aurora-C is mainly expressed in meiotically dividing cells; it was originally discovered in mice as a testis-specific STK called Aie1. Both Aurora-B and -C are chromosomal passenger proteins that can form complexes with INCENP and survivin, and they may have redundant cellular functions. The Aurora subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31689 270910 cd14008 STKc_LKB1_CaMKK Catalytic domain of the Serine/Threonine kinases, Liver Kinase B1, Calmodulin Dependent Protein Kinase Kinase, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Both LKB1 and CaMKKs can phosphorylate and activate AMP-activated protein kinase (AMPK). LKB1, also called STK11, serves as a master upstream kinase that activates AMPK and most AMPK-like kinases. LKB1 and AMPK are part of an energy-sensing pathway that links cell energy to metabolism and cell growth. They play critical roles in the establishment and maintenance of cell polarity, cell proliferation, cytoskeletal organization, as well as T-cell metabolism, including T-cell development, homeostasis, and effector function. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CamKIV. They may also phosphorylate other substrates including PKB and AMPK. Vertebrates contain two CaMKKs, CaMKK1 (or alpha) and CaMKK2 (or beta). CaMKK1 is involved in the regulation of glucose uptake in skeletal muscles. CaMKK2 is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. The LKB1/CaMKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31690 270911 cd14009 STKc_ATG1_ULK_like Catalytic domain of the Serine/Threonine kinases, Autophagy-related protein 1 and Unc-51-like kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes yeast ATG1 and metazoan homologs including vertebrate ULK1-3. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. It is involved in nutrient sensing and signaling, the assembly of autophagy factors and the execution of autophagy. In metazoans, ATG1 homologs display additional functions. Unc-51 and ULKs have been implicated in neuronal and axonal development. The ATG1/ULK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 251
31691 270912 cd14010 STKc_ULK4 Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ULK4 is a functionally uncharacterized kinase that shows similarity to ATG1/ULKs. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. The ULK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31692 270913 cd14011 PK_SCY1_like Pseudokinase domain of Scy1-like proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. This subfamily is composed of the catalytically inactive kinases with similarity to yeast Scy1. It includes four mammalian proteins called SCY1-like protein 1 (SCYL1), SCYL2, SCYL3, as well as Testis-EXpressed protein 14 (TEX14). SCYL1 binds to and co-localizes with the membrane trafficking coatomer I (COPI) complex, and regulates COPI-mediated vesicle trafficking. Null mutations in the SCYL1 gene are responsible for the pathology in mdf (muscle-deficient) mice which display progressive motor neuropathy. SCYL2, also called coated vesicle-associated kinase of 104 kDa (CVAK104), is involved in the trafficking of clathrin-coated vesicles. It also binds the HIV-1 accessory protein Vpu and acts as a regulatory factor that promotes the dephosphorylation of Vpu, facilitating the restriction of HIV-1 release. SCYL3, also called ezrin-binding protein PACE-1, may be involved in regulating cell adhesion and migration. TEX14 is required for spermatogenesis and male fertility. It localizes to kinetochores (KT) during mitosis and is a target of the mitotic kinase PLK1. It regulates the maturation of the outer KT and the KT-microtubule attachment. The SCY1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
31693 270914 cd14012 PK_eIF2AK_GCN2_rpt1 Pseudokinase domain, repeat 1, of eukaryotic translation Initiation Factor 2-Alpha Kinase 4 or General Control Non-derepressible-2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the overall downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis under different stress conditions: GCN2, protein kinase regulated by RNA (PKR), heme-regulated inhibitor kinase (HRI), and PKR-like endoplasmic reticulum kinase (PERK). GCN2 is activated by amino acid or serum starvation and UV irradiation. It induces GCN4, a transcriptional activator of amino acid biosynthetic genes, leading to increased production of amino acids under amino acid-deficient conditions. In serum-starved cells, GCN2 activation induces translation of the stress-responsive transcription factor ATF4, while under UV stress, GCN2 triggers transcriptional rescue via NF-kappaB signaling. GCN2 contains an N-terminal RWD, a degenerate kinase-like (repeat 1), the catalytic kinase (repeat 2), a histidyl-tRNA synthetase (HisRS)-like, and a C-terminal ribosome-binding and dimerization (RB/DD) domains. The degenerate pseudokinase domain of GCN2 may function as a regulatory domain. The GCN2 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
31694 270915 cd14013 STKc_SNT7_plant Catalytic domain of the Serine/Threonine kinase, Plant SNT7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SNT7 is a plant thylakoid-associated kinase that is essential in short- and long-term acclimation responses to cope with various light conditions in order to maintain photosynthetic redox poise for optimal photosynthetic performance. Short-term response involves state transitions over periods of minutes while the long-term response (LTR) occurs over hours to days and involves changing the relative amounts of photosystems I and II. SNT7 acts as a redox sensor and a signal transducer for both responses, which are triggered by the redox state of the plastoquinone (PQ) pool. It is positioned at the top of a phosphorylation cascade that induces state transitions by phosphorylating light-harvesting complex II (LHCII), and triggers the LTR through the phosphorylation of chloroplast proteins. The SNT7 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 318
31695 270916 cd14014 STKc_PknB_like Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes many bacterial eukaryotic-type STKs including Staphylococcus aureus PknB (also called PrkC or Stk1), Bacillus subtilis PrkC, and Mycobacterium tuberculosis Pkn proteins (PknB, PknD, PknE, PknF, PknL, and PknH), among others. S. aureus PknB is the only eukaryotic-type STK present in this species, although many microorganisms encode for several such proteins. It is important for the survival and pathogenesis of S. aureus as it is involved in the regulation of purine and pyrimidine biosynthesis, cell wall metabolism, autolysis, virulence, and antibiotic resistance. M. tuberculosis PknB is essential for growth and it acts on diverse substrates including proteins involved in peptidoglycan synthesis, cell division, transcription, stress responses, and metabolic regulation. B. subtilis PrkC is located at the inner membrane of endospores and functions to trigger spore germination. Bacterial STKs in this subfamily show varied domain architectures. The well-characterized members such as S. aureus and M. tuberculosis PknB, and B. subtilis PrkC, contain an N-terminal cytosolic kinase domain, a transmembrane (TM) segment, and multiple C-terminal extracellular PASTA domains. The PknB subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
31696 270917 cd14015 STKc_VRK Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins (VRK1, VRK2, and VRK3) while invertebrates, specifically fruit flies and nematodes, seem to carry only a single ortholog. Mutations of VRK in Drosophila and Caenorhabditis elegans showed varying phenotypes ranging from embryonic lethality to mitotic and meiotic defects resulting in sterility. In vertebrates, VRK1 is implicated in cell cycle progression and proliferation, nuclear envelope assembly, and chromatin condensation. VRK2 is involved in modulating JNK signaling. VRK3 is an inactive pseudokinase that inhibits ERK signaling. The VRK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 300
31697 270918 cd14016 STKc_CK1 Catalytic domain of the Serine/Threonine protein kinase, Casein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. Some isoforms have several splice variants such as the long (L) and short (S) variants of CK1alpha. CK1 proteins are involved in the regulation of many cellular processes including membrane transport processes, circadian rhythm, cell division, apoptosis, and the development of cancer and neurodegenerative diseases. The CK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
31698 270919 cd14017 STKc_TTBK Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. TTBK1 has been linked to Alzheimer's disease (AD) while TTBK2 is associated with spinocerebellar ataxia type 11 (SCA11). Both AD and SCA11 patients show the presence of neurofibrillary tangles in the brain. The Drosophila TTBK homolog, Asator, is an essential protein that localizes to the mitotic spindle during mitosis and may be involved in regulating microtubule dynamics and function. The TTBK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31699 270920 cd14018 STKc_PINK1 Catalytic domain of the Serine/Threonine protein kinase, Pten INduced Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PINK1 contains an N-terminal mitochondrial targeting sequence, a catalytic domain, and a C-terminal regulatory region. It plays an important role in maintaining mitochondrial homeostasis. It protects cells against oxidative stress-induced apoptosis by phosphorylating the chaperone TNFR-associated protein 1 (TRAP1), also called Hsp75. Phosphorylated TRAP1 prevents cytochrome c release and peroxide-induced apoptosis. PINK1 interacts with Omi/HtrA2, a serine protease, and Parkin, an E3 ubiquitin ligase, in different pathways to promote mitochondrial health. The parkin gene is the most commonly mutated gene in autosomal recessive familial parkinsonism. Mutations within the catalytic domain of PINK1 are also associated with Parkinson's disease. The PINK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 313
31700 270921 cd14019 STKc_Cdc7 Catalytic domain of the Serine/Threonine Kinase, Cell Division Cycle 7 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Cdc7 kinase (or Hsk1 in fission yeast) is a critical regulator in the initiation of DNA replication. It forms a complex with a Dbf4-related regulatory subunit, a cyclin-like molecule that activates the kinase in late G1 phase, and is also referred to as Dbf4-dependent kinase (DDK). Its main targets are mini-chromosome maintenance (MCM) proteins. Cdc7 kinase may also have additional roles in meiosis, checkpoint responses, the maintenance and repair of chromosome structures, and cancer progression. The Cdc7 kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31701 270922 cd14020 STKc_KIS Catalytic domain of the Serine/Threonine Kinase, Kinase Interacting with Stathmin (also called U2AF homology motif (UHM) kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. KIS (or UHMK1) contains an N-terminal kinase domain and a C-terminal domain with a UHM motif, a protein interaction motif initially found in the pre-mRNA splicing factor U2AF. It phosphorylates the splicing factor SF1, which enhances binding to the splice site to promote spliceosome assembly. KIS was first identified as a kinase that interacts with stathmin, a phosphoprotein that plays a role in axon development and microtubule dynamics. It localizes in RNA granules in neurons and is important in neurite outgrowth. The KIS/UHMK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
31702 270923 cd14021 ChoK-like_euk Eukaryotic Choline Kinase and similar proteins. This group is composed of eukaryotic choline kinase, ethanolamine kinase, and similar proteins. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC), and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate and displays negligible activity towards N-methylated derivatives of Etn. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 229
31703 270924 cd14022 PK_TRB2 Pseudokinase domain of Tribbles Homolog 2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB2 binds and negatively regulates the mitogen activated protein kinase (MAPK) kinases, MKK7 and MEK1, which are activators of the MAPKs, ERK and JNK. It controls the activation of inflammatory monocytes, which is essential in innate immune responses and the pathogenesis of inflammatory diseases such as atherosclerosis. TRB2 expression is down-regulated in human acute myeloid leukaemia (AML), which may lead to enhanced cell survival and pathogenesis of the disease. TRB2 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB2 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 242
31704 270925 cd14023 PK_TRB1 Pseudokinase domain of Tribbles Homolog 1. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB1 interacts directly with the mitogen activated protein kinase (MAPK) kinase MKK4, an activator of JNK. It regulates vascular smooth muscle cell proliferation and chemotaxis through the JNK signaling pathway. It is found to be down-regulated in human acute myeloid leukaemia (AML) and may play a role in the pathogenesis of the disease. It has also been identified as a potential biomarker for antibody-mediated allograft failure. TRB1 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB1 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 242
31705 270926 cd14024 PK_TRB3 Pseudokinase domain of Tribbles Homolog 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB3 binds and regulates ATF4, p65/RelA, and PKB (or Akt). It negatively regulates ATF4-mediated gene expression including that of CHOP (C/EBP homologous protein) and HO-1, which are both involved in modulating apoptosis. It also inhibits insulin-mediated phosphorylation of PKB and is a possible determinant of insulin resistance and related disorders. In osteoarthritic chondrocytes where it inhibits insulin-like growth factor 1-mediated cell survival, TRB3 is overexpressed, resulting in increased cell death. TRB3 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB3 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 242
31706 270927 cd14025 STKc_RIP4_like Catalytic domain of the Serine/Threonine kinases, Receptor Interacting Protein 4 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of RIP4, ankyrin (ANK) repeat and kinase domain containing 1 (ANKK1), and similar proteins, all of which harbor C-terminal ANK repeats. RIP4, also called Protein Kinase C-associated kinase (PKK), regulates keratinocyte differentiation and cutaneous inflammation. It activates NF-kappaB and is important in the survival of diffuse large B-cell lymphoma cells. The ANKK1 protein, also called PKK2, has not been studied extensively. The ANKK1 gene, located less than 10kb downstream of the D2 dopamine receptor (DRD2) locus, is altered in the Taq1 A1 polymorphism, which is related to a reduced DRD2 binding affinity and consequently, to mental disorders. The RIP4-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31707 270928 cd14026 STKc_RIP2 Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP2, also called RICK or CARDIAK, harbors a C-terminal Caspase Activation and Recruitment domain (CARD) belonging to the Death domain (DD) superfamily. It functions as an effector kinase downstream of the pattern recognition receptors from the Nod-like (NLR) family, Nod1 and Nod2, which recognize bacterial peptidoglycans released upon infection. RIP2 may also be involved in regulating wound healing and keratinocyte proliferation. RIP kinases serve as essential sensors of cellular stress. The RIP2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
31708 270929 cd14027 STKc_RIP1 Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP1 harbors a C-terminal Death domain (DD), which binds death receptors (DRs) including TNF receptor 1, Fas, TNF-related apoptosis-inducing ligand receptor 1 (TRAILR1), and TRAILR2. It also interacts with other DD-containing adaptor proteins such as TRADD and FADD. RIP1 can also recruit other kinases including MEKK1, MEKK3, and RIP3 through an intermediate domain (ID) that bears a RIP homotypic interaction motif (RHIM). RIP1 plays a crucial role in determining a cell's fate, between survival or death, following exposure to stress signals. It is important in the signaling of NF-kappaB and MAPKs, and it links DR-associated signaling to reactive oxygen species (ROS) production. Abnormal RIP1 function may result in ROS accumulation affecting inflammatory responses, innate immunity, stress responses, and cell survival. RIP kinases serve as essential sensors of cellular stress. The RIP1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31709 270930 cd14028 STKc_Bub1_vert Catalytic domain of the Serine/Threonine kinase, Vertebrate Spindle assembly checkpoint protein Bub1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Bub1 (Budding uninhibited by benzimidazoles 1) contains an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding, a GLEBS motif for Bub3/kinetochore binding, and a C-terminal kinase domain. It is involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. Bub1 contributes to the inhibition of APC/C by phosphorylating its crucial cofactor, Cdc20, rendering it unable to activate APC/C. In addition, Bub1 facilitates the localization to kinetochores of other SAC and motor proteins including Mad1, Mad2, BubR1, and Plk1. It acts as the master organizer of the functional inner centromere. Bub1 also plays roles in protecting sister chromatid cohesion and normal metaphase congression. The Bub1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31710 270931 cd14029 STKc_BubR1_vert Catalytic domain of the Serine/Threonine kinase, Vertebrate Spindle assembly checkpoint protein BubR1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BubR1 (Budding uninhibited by benzimidazoles R1) is also called Bub1 beta (Bub1b). It contains an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding and a C-terminal kinase domain. It is involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. BubR1 inhibits APC/C through direct binding. It also plays an important role in stabilizing kinetochore-microtubule attachments. Mutant mice expressing only 10% normal BubR1 protein are viable and develop into adult mice, but display many early aging-associated phenotypes including reduced lifespan, muscle atrophy, cataracts, impaired wound healing, and infertility. The BubR1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 304
31711 270932 cd14030 STKc_WNK1 Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK1 is widely expressed and is most abundant in the testis. In hyperosmotic or hypotonic low-chloride stress conditions, WNK1 is activated and it phosphorylates its substrates including SPAK and OSR1 kinases, which regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. Mutations in WNK1 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension and hyperkalemia. WNK1 negates WNK4-mediated inhibition of the sodium-chloride cotransporter NCC and activates the epithelial sodium channel ENaC by activating SGK1. WNK1 also decreases the surface expression of renal outer medullary potassium channel (ROMK) by stimulating their endocytosis. Hypertension and hyperkalemia in PHAII patients with WNK1 mutations may be due partly to increased activity of NCC and ENaC, and impaired renal potassium secretion by ROMK, respectively. In addition, WNK1 interacts with MEKK2/3 and acts as an activator of extracellular signal-regulated kinase (ERK) 5. It also negatively regulates TGFbeta signaling. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. The WNK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31712 270933 cd14031 STKc_WNK3 Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK3 shows a restricted expression pattern; it is found at high levels in the pituitary glands and is also expressed in the kidney and brain. It has been shown to regulate many ion transporters including members of the SLC12A family of cation-chloride cotransporters such as NCC and NKCC2, the renal potassium channel ROMK, and the epithelial calcium channels TRPV5 and TRPV6. WNK3 appears to sense low-chloride hypotonic stress and under these conditions, it activates SPAK, which directly interacts and phosphorylates cation-chloride cotransporters. WNK3 has also been shown to promote cell survival, possibly through interaction with procaspase-3 and HSP70. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. The WNK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31713 270934 cd14032 STKc_WNK2_like Catalytic domain of With No Lysine (WNK) 2-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK2 is widely expressed and has been shown to be epigenetically silenced in gliomas. It inhibits cell growth by acting as a negative regulator of MEK1-ERK1/2 signaling. WNK2 modulates growth factor-induced cancer cell proliferation, suggesting that it may be a tumor suppressor gene. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. They are critical in regulating ion balance and are thus, important components in the control of blood pressure. The WNK2-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
31714 270935 cd14033 STKc_WNK4 Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK4 shows a restricted expression pattern and is usually found in epithelial cells. It is expressed in nephrons and in extrarenal tissues including intestine, eye, mammary glands, and prostate. WNK4 regulates a variety of ion transport proteins including apical or basolateral ion transporters, ion channels in the transcellular pathway, and claudins in the paracellular pathway. Mutations in WNK4 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension and hyperkalemia. WNK4 inhibits the activity of the thiazide-sensitive Na-Cl cotransporter (NCC), which is responsible for about 15% of NaCl reabsorption in the kidney. It also inhibits the renal outer medullary potassium channel (ROMK) and decreases its surface expression. Hypertension and hyperkalemia in PHAII patients with WNK4 mutations may be partly due to increased NaCl reabsorption through NCC and impaired renal potassium secretion by ROMK, respectively. The WNK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
31715 270936 cd14034 PK_NRBP1 Pseudokinase domain of Nuclear Receptor Binding Protein 1. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. NRBP1, also called MLF1-adaptor molecule (MADM), was originally named based on the presence of nuclear binding and localization motifs prior to functional analyses. It is expressed ubiquitously and is found to localize in the cytoplasm, not the nucleus. NRBP1 is an adaptor protein that interacts with myeloid leukemia factor 1 (MLF1), an oncogene that enhances myeloid development of hematopoietic cells. It also interacts with the small GTPase Rac3. NRBP1 may also be involved in Golgi to ER trafficking and actin dynamics. The NRBP1-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
31716 270937 cd14035 PK_MADML Pseudokinase domain of MLF1-ADaptor Molecule-Like. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. MADML has been shown to be expressed throughout development in Xenopus laevis with highest expression found in the developing lens and retina. It may play an important role in embryonic eye development. The MADML subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31717 270938 cd14036 STKc_GAK Catalytic domain of the Serine/Threonine protein kinase, cyclin G-Associated Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GAK, also called auxilin-2, contains an N-terminal kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2. In addition, it contains an auxilin-1-like domain structure consisting of PTEN-like, clathrin-binding, and J domains. Like auxilin-1, GAK facilitates Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, unlike auxilin-1 which is nerve-specific. GAK also plays regulatory roles outside of clathrin-mediated membrane traffic including the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses through interaction with the interleukin 12 receptor. It also interacts with the androgen receptor, acting as a transcriptional coactivator, and its expression is significantly increased with the progression of prostate cancer. The GAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 282
31718 270939 cd14037 STKc_NAK_like Catalytic domain of Numb-Associated Kinase (NAK)-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Drosophila melanogaster NAK, human BMP-2-inducible protein kinase (BMP2K or BIKe) and similar vertebrate proteins, as well as the Saccharomyces cerevisiae proteins Prk1, Actin-regulating kinase 1 (Ark1), and Akl1. NAK was the first characterized member of this subfamily. It plays a role in asymmetric cell division through its association with Numb. It also regulates the localization of Dlg, a protein essential for septate junction formation. BMP2K contains a nuclear localization signal and a kinase domain that is capable of phosphorylating itself and myelin basic protein. The expression of the BMP2K gene is increased during BMP-2-induced osteoblast differentiation. It may function to control the rate of differentiation. Prk1, Ark1, and Akl1 comprise a subfamily of yeast proteins that are important regulators of the actin cytoskeleton and endocytosis. They share an N-terminal kinase domain but no significant homology in other regions of their sequences. The NAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
31719 270940 cd14038 STKc_IKK_beta Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK) beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IKKbeta is involved in the classical pathway of regulating Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. The classical pathway regulates the majority of genes activated by NF-kB including those encoding cytokines, chemokines, leukocyte adhesion molecules, and anti-apoptotic factors. It involves NEMO (NF-kB Essential MOdulator)- and IKKbeta-dependent phosphorylation and degradation of the Inhibitor of NF-kB (IkB), which liberates NF-kB dimers (typified by the p50-p65 heterodimer) from an inactive IkB/dimeric NF-kB complex, enabling them to migrate to the nucleus where they regulate gene transcription. The IKKbeta subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31720 270941 cd14039 STKc_IKK_alpha Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK) alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IKKalpha is involved in the non-canonical or alternative pathway of regulating Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. The non-canonical pathway functions in cells lacking NEMO (NF-kB Essential MOdulator) and IKKbeta. It is induced by a subset of TNFR family members including CD40, RANK, and B cell-activating factor receptor. IKKalpha processes the Inhibitor of NF-kB (IkB)-like C-terminus of NF-kB2/p100 to produce p52, allowing the p52/RelB dimer to migrate to the nucleus. This pathway is dependent on NIK (NF-kB Inducing Kinase) which phosphorylates and activates IKKalpha. The IKKalpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31721 270942 cd14040 STKc_TLK1 Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. A splice variant of TLK1, called TLK1B, is expressed in the presence of double strand breaks (DSBs). It lacks the N-terminal part of TLK1, but is expected to phosphorylate the same substrates. TLK1/1B interacts with Rad9, which is critical in DNA damage-activated checkpoint response, and plays a role in the repair of linearized DNA with incompatible ends. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. The TLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 299
31722 270943 cd14041 STKc_TLK2 Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. They phosphorylate and regulate Anti-silencing function 1 protein (Asf1), a histone H3/H4 chaperone that helps facilitate the assembly of chromatin following DNA replication during S phase. TLKs also phosphorylate the H3 histone tail and are essential in transcription. Vertebrates contain two subfamily members, TLK1 and TLK2. The TLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 309
31723 270944 cd14042 PK_GC-A_B Pseudokinase domain of the membrane Guanylate Cyclase receptors, GC-A and GC-B. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-A binds and is activated by the atrial and B-type natriuretic peptides, ANP and BNP, which are important in blood pressure regulation and cardiac pathophysiology. GC-B binds the C-type natriuretic peptide, CNP, which is a potent vasorelaxant and functions in vascular remodeling and bone growth regulation. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-A/B subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
31724 270945 cd14043 PK_GC-2D Pseudokinase domain of the membrane Guanylate Cyclase receptor, GC-2D. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-2D is also called Retinal Guanylyl Cyclase 1 (RETGC-1) or Rod Outer Segment membrane Guanylate Cyclase (ROS-GC). It is found in the photoreceptors of the retina where it anchors the reciprocal feedback loop between calcium and cGMP, which regulates the dark, light, and recovery phases in phototransduction. It is also found in other sensory neurons and may be a universal transduction component that plays a role in the perception of all senses. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-2D subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31725 270946 cd14044 PK_GC-C Pseudokinase domain of the membrane Guanylate Cyclase receptor, GC-C. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-C binds and is activated by the intestinal hormones, guanylin (GN) and uroguanylin (UGN), which are secreted after salty meals to inhibit sodium absorption and induce the secretion of chloride, bicarbonate, and water. GN and UGN are also present in the kidney, where they induce increased salt and water secretion. This prevents the development of hypernatremia and hypervolemia after ingestion of high amounts of salt. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-C subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31726 270947 cd14045 PK_GC_unk Pseudokinase domain of the unknown subfamily of membrane Guanylate Cyclase receptors. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs lacks a critical aspartate involved in ATP binding and does not exhibit kinase activity. It functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31727 270948 cd14046 STKc_EIF2AK4_GCN2_rpt2 Catalytic domain, repeat 2, of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 4 or General Control Non-derepressible-2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GCN2 (or EIF2AK4) is activated by amino acid or serum starvation and UV irradiation. It induces GCN4, a transcriptional activator of amino acid biosynthetic genes, leading to increased production of amino acids under amino acid-deficient conditions. In serum-starved cells, GCN2 activation induces translation of the stress-responsive transcription factor ATF4, while under UV stress, GCN2 triggers transcriptional rescue via NF-kB signaling. GCN2 contains an N-terminal RWD, a degenerate kinase-like (repeat 1), the catalytic kinase (repeat 2), a histidyl-tRNA synthetase (HisRS)-like, and a C-terminal ribosome-binding and dimerization (RB/DD) domains. Its kinase domain is activated via conformational changes as a result of the binding of uncharged tRNA to the HisRS-like domain. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the overall downregulation of protein synthesis. The GCN2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 278
31728 270949 cd14047 STKc_EIF2AK2_PKR Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 2 or Protein Kinase regulated by RNA. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKR (or EIF2AK2) contains an N-terminal double-stranded RNA (dsRNA) binding domain and a C-terminal catalytic kinase domain. It is activated by dsRNA, which is produced as a replication intermediate in virally infected cells. It plays a key role in mediating innate immune responses to viral infection. PKR is also directly activated by PACT (protein activator of PKR) and heparin, and is inhibited by viral proteins and RNAs. PKR also regulates transcription and signal transduction in diseased cells, playing roles in tumorigenesis and neurodegenerative diseases. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The PKR subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31729 270950 cd14048 STKc_EIF2AK3_PERK Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 3 or PKR-like Endoplasmic Reticulum Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PERK (or EIF2AK3) is a type-I ER transmembrane protein containing a luminal domain bound with the chaperone BiP under unstressed conditions and a cytoplasmic catalytic kinase domain. In response to the accumulation of misfolded or unfolded proteins in the ER, PERK is activated through the release of BiP, allowing it to dimerize and autophosphorylate. It functions as the central regulator of translational control during the Unfolded Protein Response (UPR) pathway. In addition to the eIF-2 alpha subunit, PERK also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The PERK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 281
31730 270951 cd14049 STKc_EIF2AK1_HRI Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 1 or Heme-Regulated Inhibitor kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HRI (or EIF2AK1) contains an N-terminal regulatory heme-binding domain and a C-terminal catalytic kinase domain. It is suppressed under normal conditions by binding of the heme iron, and is activated during heme deficiency. It functions as a critical regulator that ensures balanced synthesis of globins and heme, in order to form stable hemoglobin during erythroid differentiation and maturation. HRI also protects cells and enhances survival under iron-deficient conditions. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The HRI subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
31731 270952 cd14050 PKc_Myt1 Catalytic domain of the Dual-specificity protein kinase, Myt1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. Myt1 is a cytoplasmic cell cycle checkpoint kinase that can keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of N-terminal thr (T14) and tyr (Y15) residues, leading to the delay of meiosis I entry. Meiotic progression is ensured by a two-step inhibition and downregulation of Myt1 by CDK1/XRINGO and p90Rsk during oocyte maturation. In addition, Myt1 targets cyclin B1/B2 and is essential for Golgi and ER assembly during telophase. In Drosophila, Myt1 may be a downstream target of Notch during eye development. The Myt1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 249
31732 270953 cd14051 PTKc_Wee1 Catalytic domain of the Protein Tyrosine Kinase, Wee1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Wee1 is a nuclear cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. There are two distinct Wee1 proteins in vertebrates showing different expression patterns, called Wee1a and Wee1b. They are functionally distinct and are implicated in different steps of egg maturation and embryo development. The Wee1 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31733 270954 cd14052 PTKc_Wee1_fungi Catalytic domain of the Protein Tyrosine Kinases, Fungal Wee1 proteins. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of fungal Wee1 proteins, also called Swe1 in budding yeast and Mik1 in fission yeast. Yeast Wee1 is required to control cell size. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The fungal Wee1 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 278
31734 270955 cd14053 STKc_ACVR2 Catalytic domain of the Serine/Threonine Kinase, Activin Type II Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors, such as ACVR2, are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptor for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. Vertebrates contain two ACVR2 proteins, ACVR2a (or ActRIIA) and ACVR2b (or ActRIIB). The ACVR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31735 270956 cd14054 STKc_BMPR2_AMHR2 Catalytic domain of the Serine/Threonine Kinases, Bone Morphogenetic Protein and Anti-Muellerian Hormone Type II Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR2 and AMHR2 belong to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors (GDFs), and AMH, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. BMPR2 and AMHR2 act primarily as receptors for BMPs and AMH, respectively. BMPs induce bone and cartilage formation, as well as regulate tooth, kidney, skin, hair, haematopoietic, and neuronal development. Mutations in BMPR2A are associated with familial pulmonary arterial hypertension. AMH is mainly responsible for the regression of Mullerian ducts during male sex differentiation. It is expressed exclusively by somatic cells of the gonads. Mutations in either AMH or AMHR2 cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism characterized by the presence of Mullerian derivatives (ovary and tubes) in otherwise normally masculine males. The BMPR2/AMHR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 300
31736 270957 cd14055 STKc_TGFbR2_like Catalytic domain of the Serine/Threonine Kinase, Transforming Growth Factor beta Type II Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TGFbR2 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors, such as TGFbR2, are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. TGFbR2 acts as the receptor for TGFbeta, which is crucial in growth control and homeostasis in many different tissues. It plays roles in regulating apoptosis and in maintaining the balance between self renewal and cell loss. It also plays a key role in maintaining vascular integrity and in regulating responses to genotoxic stress. Mutations in TGFbR2 can cause aortic aneurysm disorders such as Loeys-Dietz and Marfan syndromes. The TGFbR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 295
31737 270958 cd14056 STKc_TGFbR_I Catalytic domain of the Serine/Threonine Kinases, Transforming Growth Factor beta family Type I Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of type I receptors for the TGFbeta family of secreted signaling molecules including TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation through trans-phosphorylation by type II receptors, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. They are inhibited by the immunophilin FKBP12, which is thought to control leaky signaling caused by receptor oligomerization in the absence of ligand. The TGFbR-I subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
31738 270959 cd14057 PK_ILK Pseudokinase domain of Integrin Linked Kinase. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. ILK contains N-terminal ankyrin repeats, a Pleckstrin Homology (PH) domain, and a C-terminal pseudokinase domain. It is a component of the IPP (ILK/PINCH/Parvin) complex that couples beta integrins to the actin cytoskeleton, and plays important roles in cell adhesion, spreading, invasion, and migration. ILK was initially thought to be an active kinase despite the lack of key conserved residues because of in vitro studies showing that it can phosphorylate certain protein substrates. However, in vivo experiments in Caenorhabditis elegans, Drosophila melanogaster, and mice (ILK-null and knock-in) proved that ILK is not an active kinase. In addition to actin cytoskeleton regulation, ILK also influences the microtubule network and mitotic spindle orientation. The pseudokinase domain of ILK binds several adaptor proteins including the parvins and paxillin. The ILK subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 251
31739 270960 cd14058 STKc_TAK1 Catalytic domain of the Serine/Threonine Kinase, Transforming Growth Factor beta Activated Kinase-1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAK1 is also known as mitogen-activated protein kinase kinase kinase 7 (MAPKKK7 or MAP3K7), TAK, or MEKK7. As a MAPKKK, it is an important mediator of cellular responses to extracellular signals. It regulates both the c-Jun N-terminal kinase and p38 MAPK cascades by activating the MAPK kinases, MKK4 and MKK3/6. In addition, TAK1 plays diverse roles in immunity and development, in different biological contexts, through many signaling pathways including TGFbeta/BMP, Wnt/Fz, and NF-kB. It is also implicated in the activation of the tumor suppressor kinase, LKB1. The TAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31740 270961 cd14059 STKc_MAP3K12_13 Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinase Kinase Kinases 12 and 13. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP3K12 is also called MAPK upstream kinase (MUK), dual leucine zipper-bearing kinase (DLK) or leucine-zipper protein kinase (ZPK). It is involved in the c-Jun N-terminal kinase (JNK) pathway that directly regulates axonal regulation through the phosphorylation of microtubule-associated protein 1B (MAP1B). It also regulates the differentiation of many cell types including adipocytes and may play a role in adipogenesis. MAP3K13, also called leucine zipper-bearing kinase (LZK), directly phosphorylates and activates MKK7, which in turn activates the JNK pathway. It also activates NF-kB through IKK activation and this activity is enhanced by antioxidant protein-1 (AOP-1). MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAP2Ks (MAPKKs or MKKs), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MAP3K12/13 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 237
31741 270962 cd14060 STKc_MLTK Catalytic domain of the Serine/Threonine Kinase, Mixed lineage kinase-Like mitogen-activated protein Triple Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLTK, also called zipper sterile-alpha-motif kinase (ZAK), contains a catalytic kinase domain and a leucine zipper. There are two alternatively-spliced variants, MLTK-alpha and MLTK-beta. MLTK-alpha contains a sterile-alpha-motif (SAM) at the C-terminus. MLTK regulates the c-Jun N-terminal kinase, extracellular signal-regulated kinase, p38 MAPK, and NF-kB pathways. ZAK is the MAP3K involved in the signaling cascade that leads to the ribotoxic stress response initiated by cellular damage due to Shiga toxins and ricin. It may also play a role in cell transformation and cancer development. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MLTK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 242
31742 270963 cd14061 STKc_MLK Catalytic domain of the Serine/Threonine Kinases, Mixed Lineage Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Mammals have four MLKs (MLK1-4), mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31743 270964 cd14062 STKc_Raf Catalytic domain of the Serine/Threonine Kinases, Raf (Rapidly Accelerated Fibrosarcoma) kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Raf kinases act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Aberrant expression or activation of components in this pathway is associated with tumor initiation, progression, and metastasis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain, and a catalytic kinase domain. Vertebrates have three Raf isoforms (A-, B-, and C-Raf) with different expression profiles, modes of regulation, and abilities to function in the ERK cascade, depending on cellular context and stimuli. They have essential and non-overlapping roles during embryo- and organogenesis. Knockout of each isoform results in a lethal phenotype or abnormality in most mouse strains. The Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31744 270965 cd14063 PK_KSR Pseudokinase domain of Kinase Suppressor of Ras. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. KSR is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases, but there is some debate in this designation as a few groups have reported detecting kinase catalytic activity for KSRs, specifically KSR1. Vertebrates contain two KSR proteins, KSR1 and KSR2. The KSR subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31745 270966 cd14064 PKc_TNNI3K Catalytic domain of the Dual-specificity protein kinase, TNNI3-interacting kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TNNI3K, also called cardiac ankyrin repeat kinase (CARK), is a cardiac-specific troponin I-interacting kinase that promotes cardiac myogenesis, improves cardiac performance, and protects the myocardium from ischemic injury. It contains N-terminal ankyrin repeats, a catalytic kinase domain, and a C-terminal serine-rich domain. TNNI3K exerts a disease-accelerating effect on cardiac dysfunction and reduced survival in mouse models of cardiomyopathy. The TNNI3K subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
31746 270967 cd14065 PKc_LIMK_like Catalytic domain of the LIM domain kinase-like protein kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. Members of this subfamily include LIMK, Testicular or testis-specific protein kinase (TESK), and similar proteins. LIMKs are characterized as serine/threonine kinases (STKs) while TESKs are dual-specificity protein kinases. Both LIMK and TESK phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They are implicated in many cellular functions including cell spreading, motility, morphogenesis, meiosis, mitosis, and spermatogenesis. The LIMK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31747 270968 cd14066 STKc_IRAK Catalytic domain of the Serine/Threonine kinases, Interleukin-1 Receptor Associated Kinases and related STKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. Some IRAKs may also play roles in T- and B-cell signaling, and adaptive immunity. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK-1, -2, and -4 are ubiquitously expressed and are active kinases, while IRAK-M is only induced in monocytes and macrophages and is an inactive kinase. Variations in IRAK genes are linked to diverse diseases including infection, sepsis, cancer, and autoimmune diseases. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain (a pseudokinase domain in the case of IRAK3), and a C-terminal domain; IRAK-4 lacks the C-terminal domain. This subfamily includes plant receptor-like kinases (RLKs) including Arabidopsis thaliana BAK1 and CLAVATA1 (CLV1). BAK1 functions in BR (brassinosteroid)-regulated plant development and in pathways involved in plant resistance to pathogen infection and herbivore attack. CLV1 directly binds small signaling peptides, CLAVATA3 (CLV3) and CLAVATA3/EMBRYO SURROUNDING REGION (CLE), to restrict stem cell proliferation: the CLV3-CLV1-WUS (WUSCHEL) module influences stem cell maintenance in the shoot apical meristem, and the CLE40 (CLAVATA3/EMBRYO SURROUNDING REGION40) -ACR4 (CRINKLY4) -CLV1- WOX5 (WUSCHEL-RELATED HOMEOBOX5) module at the root apical meristem. The IRAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
31748 270969 cd14067 STKc_LRRK1 Catalytic domain of the Serine/Threonine Kinase, Leucine-Rich Repeat Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRK1 is one of two vertebrate LRRKs which show complementary expression in the brain. It can form heterodimers with LRRK2, and may influence the age of onset of LRRK2-associated Parkinson's disease. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. The LRRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 276
31749 270970 cd14068 STKc_LRRK2 Catalytic domain of the Serine/Threonine Kinase, Leucine-Rich Repeat Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRK2 is one of two vertebrate LRRKs which show complementary expression in the brain. Mutations in LRRK2, found in the kinase, ROC-COR, and WD40 domains, are linked to both familial and sporadic forms of Parkinson's disease. The most prevalent mutation, G2019S located in the activation loop of the kinase domain, increases kinase activity. The R1441C/G mutations in the GTPase domain have also been reported to influence kinase activity. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. The LRRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31750 270971 cd14069 STKc_Chk1 Catalytic domain of the Serine/Threonine kinase, Checkpoint kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chk1 is implicated in many major checkpoints of the cell cycle, providing a link between upstream sensors and the cell cycle engine. It plays an important role in DNA damage response and maintaining genomic stability. Chk1 acts as an effector of the sensor kinase, ATR (ATM and Rad3-related), a member of the PI3K family, which is activated upon DNA replication stress. Chk1 delays mitotic entry in response to replication blocks by inhibiting cyclin dependent kinase (Cdk) activity. In addition, Chk1 contributes to the function of centrosome and spindle-based checkpoints, inhibits firing of origins of DNA replication (Ori), and represses transcription of cell cycle proteins including cyclin B and Cdk1. The Chk1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
31751 270972 cd14070 STKc_HUNK Catalytic domain of the Serine/Threonine Kinase, Hormonally up-regulated Neu-associated kinase (also called MAK-V). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HUNK/MAK-V was identified from a mammary tumor in an MMTV-neu transgenic mouse. It is required for the metastasis of c-myc-induced mammary tumors, but is not necessary for c-myc-induced primary tumor formation or normal development. It is required for HER2/neu-induced tumor formation and maintenance of the cells' tumorigenic phenotype. It is over-expressed in aggressive subsets of ovary, colon, and breast carcinomas. HUNK interacts with synaptopodin, and may also play a role in synaptic plasticity. The HUNK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31752 270973 cd14071 STKc_SIK Catalytic domain of the Serine/Threonine Kinases, Salt-Inducible kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SIKs are part of a complex network that regulates Na,K-ATPase to maintain sodium homeostasis and blood pressure. Vertebrates contain three forms of SIKs (SIK1-3) from three distinct genes, which display tissue-specific effects. SIK1, also called SNF1LK, controls steroidogenic enzyme production in adrenocortical cells. In the brain, both SIK1 and SIK2 regulate energy metabolism. SIK2, also called QIK or SNF1LK2, is involved in the regulation of gluconeogenesis in the liver and lipogenesis in adipose tissues, where it phosphorylates the insulin receptor substrate-1. In the liver, SIK3 (also called QSK) regulates cholesterol and bile acid metabolism. In addition, SIK2 plays an important role in the initiation of mitosis and regulates the localization of C-Nap1, a centrosome linker protein. The SIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31753 270974 cd14072 STKc_MARK Catalytic domain of the Serine/Threonine Kinases, MAP/microtubule affinity-regulating kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MARKs, also called Partitioning-defective 1 (Par1) proteins, function as regulators of diverse cellular processes in nematodes, Drosophila, yeast, and vertebrates. They are involved in embryogenesis, epithelial cell polarization, cell signaling, and neuronal differentiation. MARKs phosphorylate tau and related microtubule-associated proteins (MAPs), and regulate microtubule-based intracellular transport. Vertebrates contain four isoforms, namely MARK1 (or Par1c), MARK2 (or Par1b), MARK3 (Par1a), and MARK4 (or MARKL1). Known substrates of MARKs include the cell cycle-regulating phosphatase Cdc25, tyrosine phosphatase PTPH1, MAPK scaffolding protein KSR1, class IIa histone deacetylases, and plakophilin 2. The MARK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31754 270975 cd14073 STKc_NUAK Catalytic domain of the Serine/Threonine Kinase, novel (nua) kinase family NUAK. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NUAK proteins are classified as AMP-activated protein kinase (AMPK)-related kinases, which like AMPK are activated by the major tumor suppressor LKB1. Vertebrates contain two NUAK proteins, called NUAK1 and NUAK2. NUAK1, also called ARK5 (AMPK-related protein kinase 5), regulates cell proliferation and displays tumor suppression through direct interaction and phosphorylation of p53. It is also involved in cell senescence and motility. High NUAK1 expression is associated with invasiveness of nonsmall cell lung cancer (NSCLC) and breast cancer cells. NUAK2, also called SNARK (Sucrose, non-fermenting 1/AMP-activated protein kinase-related kinase), is involved in energy metabolism. It is activated by hyperosmotic stress, DNA damage, and nutrients such as glucose and glutamine. NUAK2-knockout mice develop obesity, altered serum lipid profiles, hyperinsulinaemia, hyperglycaemia, and impaired glucose tolerance. The NUAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
31755 270976 cd14074 STKc_SNRK Catalytic domain of the Serine/Threonine Kinase, SNF1-related kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SNRK is a kinase highly expressed in testis and brain that is found inactive in cells that lack the LKB1 tumour suppressor protein kinase. The regulatory subunits STRAD and MO25 are required for LKB1 to activate SNRK. The SNRK mRNA is increased 3-fold when granule neurons are cultured in low potassium, and may thus play a role in the survival responses in these cells. In some vertebrates, a second SNRK gene (snrkb or snrk-1) has been sequenced and/or identified. Snrk-1 is expressed specifically in embryonic zebrafish vasculature; it plays an essential role in angioblast differentiation, maintenance, and migration. The SNRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31756 270977 cd14075 STKc_NIM1 Catalytic domain of the Serine/Threonine Kinase, NIM1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NIM1 is a widely-expressed kinase belonging to the AMP-activated protein kinase (AMPK) subfamily. Although present in most tissues, NIM1 kinase activity is only observed in the brain and testis. NIM1 is capable of autophosphorylating and activating itself, but may be present in other tissues in the inactive form. The physiological function of NIM1 has yet to be elucidated. The NIM1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31757 270978 cd14076 STKc_Kin4 Catalytic domain of the yeast Serine/Threonine Kinase, Kin4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Kin4 is a central component of the spindle position checkpoint (SPOC), which monitors spindle position and regulates the mitotic exit network (MEN). Kin4 associates with spindle pole bodies in mother cells to inhibit MEN signaling and delay mitosis until the anaphase nucleus is properly positioned along the mother-bud axis. Kin4 activity is regulated by both the bud neck-associated kinase Elm1 and protein phosphatase 2A. The Kin4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
31758 270979 cd14077 STKc_Kin1_2 Catalytic domain of Kin1, Kin2, and similar Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of yeast Kin1, Kin2, and similar proteins. Fission yeast Kin1 is a membrane-associated kinase that is involved in regulating cell surface cohesiveness during interphase. It also plays a role during mitosis, linking actomyosin ring assembly with septum synthesis and membrane closure to ensure separation of daughter cells. Budding yeast Kin1 and Kin2 act downstream of the Rab-GTPase Sec4 and are associated with the exocytic apparatus; they play roles in the secretory pathway. The Kin1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31759 270980 cd14078 STKc_MELK Catalytic domain of the Serine/Threonine Kinase, Maternal Embryonic Leucine zipper Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MELK is a cell cycle dependent protein which functions in cytokinesis, cell cycle, apoptosis, cell proliferation, and mRNA processing. It is found upregulated in many types of cancer cells, playing an indispensable role in cancer cell survival. It makes an attractive target in the design of inhibitors for use in the treatment of a wide range of human cancer. The MELK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31760 270981 cd14079 STKc_AMPK_alpha Catalytic domain of the Alpha subunit of the Serine/Threonine Kinase, AMP-activated protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. AMPK, also called SNF1 (sucrose non-fermenting1) in yeasts and SnRK1 (SNF1-related kinase1) in plants, is a heterotrimeric enzyme composed of a catalytic alpha subunit and two regulatory subunits, beta and gamma. It is a stress-activated kinase that serves as master regulator of glucose and lipid metabolism by monitoring carbon and energy supplies, via sensing the cell's AMP:ATP ratio. In response to decreased ATP levels, it enhances energy-producing processes and inhibits energy-consuming pathways. Once activated, AMPK phosphorylates a broad range of downstream targets, with effects in carbohydrate metabolism and uptake, lipid and fatty acid biosynthesis, carbon energy storage, and inflammation, among others. Defects in energy homeostasis underlie many human diseases including Type 2 diabetes, obesity, heart disease, and cancer. As a result, AMPK has emerged as a therapeutic target in the treatment of these diseases. The AMPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31761 270982 cd14080 STKc_TSSK-like Catalytic domain of testis-specific serine/threonine kinases and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK1 and TSSK2 are expressed specifically in meiotic and postmeiotic spermatogenic cells, respectively. TSSK3 has been reported to be expressed in the interstitial Leydig cells of adult testis. TSSK4, also called TSSK5, is expressed in testis from haploid round spermatids to mature spermatozoa. TSSK6, also called SSTK, is expressed at the head of elongated sperm. TSSK1/TSSK2 double knock-out and TSSK6 null mice are sterile without manifesting other defects, making these kinases viable targets for male contraception. The TSSK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31762 270983 cd14081 STKc_BRSK1_2 Catalytic domain of Brain-specific serine/threonine-protein kinases 1 and 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BRSK1, also called SAD-B or SAD1 (Synapses of Amphids Defective homolog 1), and BRSK2, also called SAD-A, are highly expressed in mammalian forebrain. They play important roles in establishing neuronal polarity. BRSK1/2 double knock-out mice die soon after birth, showing thin cerebral cortices due to disordered subplate layers and neurons that lack distinct axons and dendrites. BRSK1 regulates presynaptic neurotransmitter release. Its activity fluctuates during cell cycle progression and it acts as a regulator of centrosome duplication. BRSK2 is also abundant in pancreatic islets, where it is involved in the regulation of glucose-stimulated insulin secretion. The BRSK1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31763 270984 cd14082 STKc_PKD Catalytic domain of the Serine/Threonine kinase, Protein Kinase D. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They contain N-terminal cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. The PKD subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
31764 270985 cd14083 STKc_CaMKI Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31765 270986 cd14084 STKc_Chk2 Catalytic domain of the Serine/Threonine kinase, Cell cycle Checkpoint Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Checkpoint Kinase 2 (Chk2) plays an important role in cellular responses to DNA double-strand breaks and related lesions. It is phosphorylated and activated by ATM kinase, resulting in its dissociation from sites of damage to phosphorylate downstream targets such as BRCA1, p53, cell cycle transcription factor E2F1, the promyelocytic leukemia protein (PML) involved in apoptosis, and CDC25 phosphatases, among others. Mutations in Chk2 are linked to a variety of cancers including familial breast cancer, myelodysplastic syndromes, prostate cancer, lung cancer, and osteosarcomas. Chk2 contains an N-terminal SQ/TQ cluster domain (SCD), a central forkhead-associated (FHA) domain, and a C-terminal catalytic kinase domain. The Chk2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31766 270987 cd14085 STKc_CaMKIV Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type IV. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKIV is found predominantly in neurons and immune cells. It is activated by the binding of calcium/CaM and phosphorylation by CaMKK (alpha or beta). The CaMKK-CaMKIV cascade participates in regulating several transcription factors like CREB, MEF2, and retinoid orphan receptors. It is also implicated in T-cell development and signaling, cytokine secretion, and signaling through Toll-like receptors, and is thus pivotal in immune response and inflammation. The CaMKIV subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 294
31767 270988 cd14086 STKc_CaMKII Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type II. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKs contain an N-terminal catalytic domain followed by a regulatory domain that harbors a CaM binding site. In addition, CaMKII contains a C-terminal association domain that facilitates oligomerization. There are four CaMKII proteins (alpha, beta, gamma, delta) encoded by different genes; each gene undergoes alternative splicing to produce more than 30 isoforms. CaMKII-alpha and -beta are enriched in neurons while CaMKII-gamma and -delta are predominant in myocardium. CaMKII is a signaling molecule that translates upstream calcium and reactive oxygen species (ROS) signals into downstream responses that play important roles in synaptic function and cardiovascular physiology. It is a major component of the postsynaptic density and is critical in regulating synaptic plasticity including long-term potentiation. It is critical in regulating ion channels and proteins involved in myocardial excitation-contraction and excitation-transcription coupling. Excessive CaMKII activity promotes processes that contribute to heart failure and arrhythmias. The CaMKII subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 292
31768 270989 cd14087 STKc_PSKH1 Catalytic domain of the Protein Serine/Threonine kinase H1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PSKH1 is an autophosphorylating STK that is expressed ubiquitously and exhibits multiple intracellular localizations including the centrosome, Golgi apparatus, and splice factor compartments. It contains a catalytic kinase domain and an N-terminal SH4-like motif that is acylated to facilitate membrane attachment. PSKH1 plays a role in the maintenance of the Golgi apparatus, an important organelle within the secretory pathway. It may also function as a novel splice factor and a regulator of prostate cancer cell growth. The PSKH1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31769 270990 cd14088 STKc_CaMK_like Catalytic domain of an Uncharacterized group of Serine/Threonine kinases with similarity to Calcium/calmodulin-dependent protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of uncharacterized STKs with similarity to CaMKs, which are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). CaMKs contain an N-terminal catalytic domain followed by a regulatory domain that harbors a CaM binding site. This uncharacterized subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31770 270991 cd14089 STKc_MAPKAPK Catalytic domain of the Serine/Threonine kinases, Mitogen-activated protein kinase-activated protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPK-activated protein kinases MK2, MK3, MK5 (also called PRAK for p38-regulated/activated protein kinase), and related proteins. These proteins contain a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. In addition, MK2 and MK3 contain an N-terminal proline-rich region that can bind to SH3 domains. MK2 and MK3 are bona fide substrates for the MAPK p38, while MK5 plays a functional role in the p38 MAPK pathway although their direct interaction has been difficult to detect. MK2 and MK3 are closely related and show, thus far, indistinguishable substrate specificity, while MK5 shows a distinct spectrum of substrates. MK2 and MK3 are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer, and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. MK5 is a ubiquitous protein that is implicated in neuronal morphogenesis, cell migration, and tumor angiogenesis. It interacts with PKA, which induces cytoplasmic translocation of MK5. Its substrates include p53, ERK3/4, Hsp27, and cytosolic phospholipase A2 (cPLA2). The MAPKAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31771 270992 cd14090 STKc_Mnk Catalytic domain of the Serine/Threonine kinases, Mitogen-activated protein kinase signal-integrating kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases and comprise a group of four proteins, produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4E, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31772 270993 cd14091 STKc_RSK_C C-terminal catalytic domain of the Serine/Threonine Kinases, Ribosomal S6 kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. Mammals possess four RSK isoforms (RSK1-4) from distinct genes. RSK proteins are also referred to as MAP kinase-activated protein kinases (MAPKAPKs), 90 kDa ribosomal protein S6 kinases (p90-RSKs), or p90S6Ks. The RSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
31773 270994 cd14092 STKc_MSK_C C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, in response to various stimuli such as growth factors, hormones, neurotransmitters, cellular stress, and pro-inflammatory cytokines. This triggers phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) in the C-terminal extension of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. MSKs are predominantly nuclear proteins. They are widely expressed in many tissues including heart, brain, lung, liver, kidney, and pancreas. There are two isoforms of MSK, called MSK1 and MSK2. The MSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 311
31774 270995 cd14093 STKc_PhKG Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). Each subunit has tissue-specific isoforms or splice variants. Vertebrates contain two isoforms of the gamma subunit (gamma 1 and gamma 2). The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
31775 270996 cd14094 STKc_CASK Catalytic domain of the Serine/Threonine Kinase, Calcium/calmodulin-dependent serine protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CASK belongs to the MAGUK (membrane-associated guanylate kinase) protein family, which functions as multiple domain adaptor proteins and is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The enzymatically inactive GuK domain in MAGUK proteins mediates protein-protein interactions and associates intramolecularly with the SH3 domain. In addition, CASK contains a catalytic kinase and two L27 domains. It is highly expressed in the nervous system and plays roles in synaptic protein targeting, neural development, and regulation of gene expression. Binding partners include parkin (a Parkinson's disease molecule), neurexin (adhesion molecule), syndecans, calcium channel proteins, CINAP (nucleosome assembly protein), transcription factor Tbr-1, and the cytoplasmic adaptor proteins Mint1, Veli/mLIN-7/MALS, SAP97, caskin, and CIP98. Deletion or mutations in the CASK gene have been implicated in X-linked mental retardation. The CASK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 300
31776 270997 cd14095 STKc_DCKL Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase (also called Doublecortin-like and CAM kinase-like). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL (or DCAMKL) proteins belong to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL proteins contain a C-terminal kinase domain with similarity to CAMKs. They are involved in the regulation of cAMP signaling. Vertebrates contain three DCKL proteins (DCKL1-3); DCKL1 and 2 also contain a serine, threonine, and proline rich domain (SP), while DCKL3 contains only a single DCX domain instead of tandem domains. The DCKL subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31777 270998 cd14096 STKc_RCK1-like Catalytic domain of RCK1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of fungal STKs including Saccharomyces cerevisiae RCK1 and RCK2, Schizosaccharomyces pombe Sty1-regulated kinase 1 (Srk1), and similar proteins. RCK1, RCK2 (or Rck2p), and Srk1 are MAPK-activated protein kinases. RCK1 and RCK2 are involved in oxidative and metal stress resistance in budding yeast. RCK2 also regulates rapamycin sensitivity in both S. cerevisiae and Candida albicans. Srk1 is activated by Sty1/Spc1 and is involved in negatively regulating cell cycle progression by inhibiting Cdc25. The RCK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 295
31778 270999 cd14097 STKc_STK33 Catalytic domain of Serine/Threonine Kinase 33. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK33 is highly expressed in the testis and is present in low levels in most tissues. It may be involved in spermatogenesis and organ ontogenesis. It interacts with and phosphorylates vimentin and may be involved in regulating intermediate filament cytoskeletal dynamics. Its role in promoting the cell viability of KRAS-dependent cancer cells is under debate; some studies have found STK33 to promote cancer cell viability, while other studies have found it to be non-essential. KRAS is the most commonly mutated human oncogene, thus, studies on the role of STK33 in KRAS mutant cancer cells are important. The STK33 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
31779 271000 cd14098 STKc_Rad53_Cds1 Catalytic domain of the yeast Serine/Threonine Kinases, Rad53 and Cds1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Rad53 and Cds1 are the checkpoint kinase 2 (Chk2) homologs found in budding and fission yeast, respectively. They play a central role in the cell's response to DNA lesions to prevent genome rearrangements and maintain genome integrity. They are phosphorylated in response to DNA damage and incomplete replication, and are essential for checkpoint control. They help promote DNA repair by stalling the cell cycle prior to mitosis in the presence of DNA damage. The Rad53/Cds1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31780 271001 cd14099 STKc_PLK Catalytic domain of the Serine/Threonine Kinases, Polo-like kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general, PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which comprises two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. PLKs derive their names from homology to polo, a kinase first identified in Drosophila. There are five mammalian PLKs (PLK1-5) from distinct genes. There is good evidence that PLK1 may function as an oncogene while PLK2-5 have tumor suppressive properties. PLK1 functions as a positive regulator of mitosis, meiosis, and cytokinesis. PLK2 functions in G1 progression, S-phase arrest, and centriole duplication. PLK3 regulates angiogenesis and responses to DNA damage. PLK4 is required for late mitotic progression, cell survival, and embryonic development. PLK5 was first identified as a pseudogene containing a stop codon within the kinase domain; however, both murine and human genes encode expressed proteins. PLK5 functions in cell cycle arrest. 258
31781 271002 cd14100 STKc_PIM1 Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are two PIM1 isoforms resulting from alternative translation initiation sites. PIM1 is the founding member of the PIM subfamily. It is involved in regulating cell growth, differentiation, and apoptosis. It promotes cancer development when overexpressed by inhibiting apoptosis, promoting cell proliferation, and promoting genomic instability. The PIM1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 254
31782 271003 cd14101 STKc_PIM2 Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are three PIM2 isoforms resulting from alternative translation initiation sites. PIM2 is highly expressed in leukemia and lymphomas and has been shown to promote the survival and proliferation of tumor cells. The PIM2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31783 271004 cd14102 STKc_PIM3 Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3). PIM3 can inhibit apoptosis and promote cell survival and protein translation; therefore, it can enhance the proliferation of normal and cancer cells. Mice deficient in PIM3 show minimal effects, suggesting that PIM3 may not be essential. Since its expression is enhanced in several cancers, it may make a good molecular target for cancer drugs. The PIM3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31784 271005 cd14103 STKc_MLCK Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. In vertebrates, different MLCKs function in smooth (MLCK1), skeletal (MLCK2), and cardiac (MLCK3) muscles. A fourth protein, MLCK4, has also been identified through comprehensive genome analysis although it has not been biochemically characterized. The MLCK1 gene expresses three transcripts in a cell-specific manner: a short MLCK1 which contains three immunoglobulin (Ig)-like and one fibronectin type III (FN3) domains, PEVK and actin-binding regions, and a kinase domain near the C-terminus; a long MLCK1 containing six additional Ig-like domains at the N-terminus compared to the short MLCK1; and the C-terminal Ig module. MLCK2, MLCK3, and MLCK4 share a simpler domain architecture of a single kinase domain near the C-terminus and the absence of Ig-like or FN3 domains. The MLCK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 250
31785 271006 cd14104 STKc_Titin Catalytic domain of the Giant Serine/Threonine Kinase Titin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Titin, also called connectin, is a muscle-specific elastic protein and is the largest known protein to date. It contains multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains, and a single kinase domain near the C-terminus. It spans half of the sarcomere, the repeating contractile unit of striated muscle, and performs mechanical and catalytic functions. Titin contributes to the passive force generated when muscle is stretched during relaxation. Its kinase domain phosphorylates and regulates the muscle protein telethonin, which is required for sarcomere formation in differentiating myocytes. In addition, titin binds many sarcomere proteins and acts as a molecular scaffold for filament formation during myofibrillogenesis. The Titin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
31786 271007 cd14105 STKc_DAPK Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK1 is the prototypical member of the subfamily and is also simply referred to as DAPK. DAPK2 is also called DAPK-related protein 1 (DRP-1), while DAPK3 has also been named DAP-like kinase (DLK) and zipper-interacting protein kinase (ZIPk). These proteins are ubiquitously expressed in adult tissues, are capable of cross talk with each other, and may act synergistically in regulating cell death. The DAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31787 271008 cd14106 STKc_DRAK Catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs, also called STK17, were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 and DRAK2. Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. They may play a role in apoptotic signaling. The DRAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
31788 271009 cd14107 STKc_obscurin_rpt1 Catalytic kinase domain, first repeat, of the Giant Serine/Threonine Kinase Obscurin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Obscurin, approximately 800 kDa in size, is one of three giant proteins expressed in vertebrate striated muscle, together with titin and nebulin. It is a multidomain protein composed of tandem adhesion and signaling domains, including 49 immunoglobulin (Ig) and 2 fibronectin type III (FN3) domains at the N-terminus followed by a more complex region containing more Ig domains, a conserved SH3 domain near a RhoGEF and PH domains, non-modular regions, as well as IQ and phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not expressed as part of the 800 kDa protein, but as a smaller, alternatively spliced product present mainly in the heart muscle, also called obscurin-MLCK. Obscurin is localized at the peripheries of Z-disks and M-lines, where it is able to communicate with the surrounding myoplasm. It interacts with diverse proteins including sAnk1, myosin, titin, and MyBP-C. It may act as a scaffold for the assembly of elements of the contractile apparatus. The obscurin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31789 271010 cd14108 STKc_SPEG_rpt1 Catalytic kinase domain, first repeat, of Giant Serine/Threonine Kinase Striated muscle preferentially expressed protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Striated muscle preferentially expressed gene (SPEG) generates 4 different isoforms through alternative promoter use and splicing in a tissue-specific manner: SPEGalpha and SPEGbeta are expressed in cardiac and skeletal striated muscle; Aortic Preferentially Expressed Protein-1 (APEG-1) is expressed in vascular smooth muscle; and Brain preferentially expressed gene (BPEG) is found in the brain and aorta. SPEG proteins have multiple immunoglobulin (Ig), 2 fibronectin type III (FN3), and two kinase domains. They are necessary for cardiac development and survival. The SPEG subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31790 271011 cd14109 PK_Unc-89_rpt1 Pseudokinase domain, first repeat, of the Giant Serine/Threonine Kinase Uncoordinated protein 89. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. The nematode Unc-89 gene, through alternative promoter use and splicing, encodes at least six major isoforms (Unc-89A to Unc-89F) of giant muscle proteins that are homologs of the vertebrate obscurin. In flies, five isoforms of Unc-89 have been detected: four in the muscles of adult flies (two in the indirect flight muscle and two in other muscles) and another isoform in the larva. Unc-89 in nematodes is required for normal muscle cell architecture. In flies, it is necessary for the development of a symmetrical sarcomere in the flight muscles. Unc-89 proteins contain several adhesion and signaling domains including multiple copies of the immunoglobulin (Ig) domain, as well as fibronectin type III (FN3), SH3, RhoGEF, and PH domains. The nematode Unc-89 isoforms B, C, D, and F contain two kinase domains, with B and F having two complete kinase domains while the first repeats of C and D are partial domains. Homology modeling suggests that the first kinase repeat of Unc-89 may be catalytically inactive, a pseudokinase, while the second kinase repeat may be active. The pseudokinase domain may function as a regulatory domain or a protein interaction domain. The Unc-89 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31791 271012 cd14110 STKc_obscurin_rpt2 Catalytic kinase domain, second repeat, of the Giant Serine/Threonine Kinase Obscurin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Obscurin, approximately 800 kDa in size, is one of three giant proteins expressed in vertebrate striated muscle, together with titin and nebulin. It is a multidomain protein composed of tandem adhesion and signaling domains, including 49 immunoglobulin (Ig) and 2 fibronectin type III (FN3) domains at the N-terminus followed by a more complex region containing more Ig domains, a conserved SH3 domain near a RhoGEF and PH domains, non-modular regions, as well as IQ and phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not expressed as part of the 800 kDa protein, but as a smaller, alternatively spliced product present mainly in the heart muscle, also called obscurin-MLCK. Obscurin is localized at the peripheries of Z-disks and M-lines, where it is able to communicate with the surrounding myoplasm. It interacts with diverse proteins including sAnk1, myosin, titin, and MyBP-C. It may act as a scaffold for the assembly of elements of the contractile apparatus. The obscurin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31792 271013 cd14111 STKc_SPEG_rpt2 Catalytic kinase domain, second repeat, of Giant Serine/Threonine Kinase Striated muscle preferentially expressed protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Striated muscle preferentially expressed gene (SPEG) generates 4 different isoforms through alternative promoter use and splicing in a tissue-specific manner: SPEGalpha and SPEGbeta are expressed in cardiac and skeletal striated muscle; Aortic Preferentially Expressed Protein-1 (APEG-1) is expressed in vascular smooth muscle; and Brain preferentially expressed gene (BPEG) is found in the brain and aorta. SPEG proteins have multiple immunoglobulin (Ig), 2 fibronectin type III (FN3), and two kinase domains. They are necessary for cardiac development and survival. The SPEG subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31793 271014 cd14112 STKc_Unc-89_rpt2 Catalytic kinase domain, second repeat, of the Giant Serine/Threonine Kinase Uncoordinated protein 89. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The nematode Unc-89 gene, through alternative promoter use and splicing, encodes at least six major isoforms (Unc-89A to Unc-89F) of giant muscle proteins that are homologs of the vertebrate obscurin. In flies, five isoforms of Unc-89 have been detected: four in the muscles of adult flies (two in the indirect flight muscle and two in other muscles) and another isoform in the larva. Unc-89 in nematodes is required for normal muscle cell architecture. In flies, it is necessary for the development of a symmetrical sarcomere in the flight muscles. Unc-89 proteins contain several adhesion and signaling domains including multiple copies of the immunoglobulin (Ig) domain, as well as fibronectin type III (FN3), SH3, RhoGEF, and PH domains. The nematode Unc-89 isoforms B, C, D, and F contain two kinase domains, with B and F having two complete kinase domains while the first repeats of C and D are partial domains. Homology modeling suggests that the first kinase repeat of Unc-89 may be catalytically inactive, a pseudokinase, while the second kinase repeat may be active. The Unc-89 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31794 271015 cd14113 STKc_Trio_C C-terminal kinase domain of the Large Serine/Threonine Kinase and Rho Guanine Nucleotide Exchange Factor, Triple functional domain protein. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Triple functional domain protein (Trio), also called PTPRF-interacting protein, is a large multidomain protein containing a series of spectrin-like repeats, two each of RhoGEF and SH3 domains, an immunoglobulin-like (Ig) domain and a C-terminal kinase. Trio plays important roles in neuronal cell migration and axon guidance. It was originally identified as an interacting partner of the receptor-like tyrosine phosphatase (RPTP) LAR (leukocyte-antigen-related protein), a family of receptors that function in the signaling to the actin cytoskeleton during development. Trio functions as a GEF for Rac1, RhoG, and RhoA, and is involved in the regulation of lamellipodia formation, mediating Rac1-dependent cell spreading and migration. The Trio subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31795 271016 cd14114 STKc_Twitchin_like The catalytic domain of the Giant Serine/Threonine Kinases, Twitchin and Projectin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Caenorhabditis elegans and Aplysia californica Twitchin, Drosophila melanogaster Projectin, and similar proteins. These are very large muscle proteins containing multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains and a single kinase domain near the C-terminus. Twitchin and Projectin are both associated with thick filaments. Twitchin is localized in the outer parts of A-bands and is involved in regulating muscle contraction. It interacts with the myofibrillar proteins myosin and actin in a phosphorylation-dependent manner, and may be involved in regulating the myosin cross-bridge cycle. The kinase activity of Twitchin is activated by Ca2+ and the Ca2+ binding protein S100A1. Projectin is associated with the end of thick filaments and is a component of flight muscle connecting filaments. The kinase domain of Projectin may play roles in autophosphorylation and transphosphorylation, which impact the formation of myosin filaments. The Twitchin-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31796 271017 cd14115 STKc_Kalirin_C C-terminal kinase domain of the Large Serine/Threonine Kinase and Rho Guanine Nucleotide Exchange Factor, Kalirin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Kalirin, also called Duo or Duet, is a large multidomain protein containing a series of spectrin-like repeats, two each of RhoGEF and SH3 domains, an immunoglobulin-like (Ig) domain and a C-terminal kinase. As a GEF, it activates Rac1, RhoA, and RhoG. It is highly expressed in neurons and is required for spine formation. The kalirin gene produces at least 10 isoforms from alternative promoter use and splicing. Of the major isoforms (Kalirin-7, -9, and -12), only kalirin-12 contains the C-terminal kinase domain. Kalirin-12 is highly expressed during embryonic development and it plays an important role in axon outgrowth. The Kalirin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 248
31797 271018 cd14116 STKc_Aurora-A Catalytic domain of the Serine/Threonine kinase, Aurora-A kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). Aurora-A regulates cell cycle events from the late S-phase through the M-phase including centrosome maturation, mitotic entry, centrosome separation, spindle assembly, chromosome alignment, cytokinesis, and mitotic exit. Aurora-A activation depends on its autophosphorylation and binding to the microtubule-associated protein TPX2, which also localizes the kinase to spindle microtubules. Aurora-A is overexpressed in many cancer types such as prostate, ovarian, breast, bladder, gastric, and pancreatic. The Aurora subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31798 271019 cd14117 STKc_Aurora-B_like Catalytic domain of the Serine/Threonine kinase, Aurora-B kinase and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). This subfamily includes Aurora-B and Aurora-C. Aurora-B is most active at the transition during metaphase to the end of mitosis. It associates with centromeres, relocates to the midzone of the central spindle, and concentrates at the midbody during cell division. It is critical for accurate chromosomal segregation, cytokinesis, protein localization to the centrosome and kinetochore, correct microtubule-kinetochore attachments, and regulation of the mitotic checkpoint. Aurora-C is mainly expressed in meiotically dividing cells; it was originally discovered in mice as a testis-specific STK called Aie1. Both Aurora-B and -C are chromosomal passenger proteins that can form complexes with INCENP and survivin, and they may have redundant cellular functions. INCENP participates in the activation of Aurora-B in a two-step process: first by binding to form an intermediate state of activation and the phosphorylation of its C-terminal TSS motif to generate the fully active kinase. The Aurora-B subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
31799 271020 cd14118 STKc_CAMKK Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CaMKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). Vertebrates contain two CaMKKs, CaMKK1 (or alpha) and CaMKK2 (or beta). CaMKK1 is involved in the regulation of glucose uptake in skeletal muscles. CaMKK2 is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. The CaMKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31800 271021 cd14119 STKc_LKB1 Catalytic domain of the Serine/Threonine kinase, Liver Kinase B1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LKB1, also called STK11, was first identified as a tumor suppressor responsible for Peutz-Jeghers syndrome, a disorder that leads to an increased risk of spontaneous epithelial cancer. It serves as a master upstream kinase that activates AMP-activated protein kinase (AMPK) and most AMPK-like kinases. LKB1 and AMPK are part of an energy-sensing pathway that links cell energy to metabolism and cell growth. They play critical roles in the establishment and maintenance of cell polarity, cell proliferation, cytoskeletal organization, as well as T-cell metabolism, including T-cell development, homeostasis, and effector function. To be activated, LKB1 requires the adaptor proteins STe20-Related ADaptor (STRAD) and mouse protein 25 (MO25). The LKB1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31801 271022 cd14120 STKc_ULK1_2-like Catalytic domain of the Serine/Threonine kinases, Unc-51-like kinases 1 and 2, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK1 is required for efficient amino acid starvation-induced autophagy and mitochondrial clearance. ULK2 is ubiquitously expressed and is essential in autophagy induction. ULK1 and ULK2 have unique and cell-type specific roles, but also display partially redundant roles in starvation-induced autophagy. They both display neuron-specific functions: ULK1 is involved in non-clathrin-coated endocytosis in growth cones, filopodia extension, and axon branching; ULK2 plays a role in axon development. The ULK1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31802 271023 cd14121 STKc_ULK3 Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK3 mRNA is up-regulated in fibroblasts after Ras-induced senescence, and its overexpression induces both autophagy and senescence in a fibroblast cell line. ULK3, through its kinase activity, positively regulates Gli proteins, mediators of the Sonic hedgehog (Shh) signaling pathway that is implicated in tissue homeostasis maintenance and neurogenesis. It is inhibited by binding to Suppressor of Fused (Sufu). The ULK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 252
31803 271024 cd14122 STKc_VRK1 Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. Vertebrates contain three VRK proteins. Human VRK1 is implicated in the regulation of many cellular processes including cell cycle progression and proliferation, stress responses, nuclear envelope assembly and chromatin condensation. It regulates cell cycle progression during the DNA replication period by inducing cyclin D1 expression. VRK1 also phosphorylates and regulates some transcription factors including p53, c-Jun, ATF2, and nuclear factor BAF. VRK1 stabilizes p53 by interfering with its mdm2-mediated degradation. Accumulation of p53, which blocks cell growth and division, is modulated by an autoregulatory loop between p53 and VRK1 (accumulated p53 downregulates VRK1). This autoregulatory loop has been found to be nonfunctional in some lung carcinomas. The VRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 301
31804 271025 cd14123 STKc_VRK2 Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins. VRK2 exists as two alternative splice forms, A and B, which differ in their C-terminal regions. VRK2A, the predominant isoform, contains a hydrophobic tail and is anchored to the ER and mitochondria. It is expressed in all cell types. VRK2B lacks a membrane-anchor tail and is detected in the cytosol and the nucleus. Like VRK1, it can stabilize p53. VRK2B functionally replaces VRK1 in the nucleus of cell types where VRK1 is absent. VRK2 modulates hypoxia-induced stress responses by interacting with TAK1, an atypical MAPK kinase kinase which triggers cascades that activate JNK following oxidative stress. VRK2 also interacts with JIP1, a scaffold protein that assembles three consecutive members of a MAPK pathway. This interaction prevents the association of JNK with the signaling complex, leading to reduced phosphorylation and AP1-dependent transcription. The VRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 302
31805 271026 cd14124 PK_VRK3 Pseudokinase domain of Vaccinia Related Kinase 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins. VRK3 is an inactive pseudokinase that is unable to bind ATP. It achieves its regulatory function through protein-protein interactions. It negatively regulates ERK signaling by binding directly and enhancing the activity of the MAPK phosphatase VHR (vaccinia H1-related), which dephosphorylates and inactivates ERK. The VRK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 298
31806 271027 cd14125 STKc_CK1_delta_epsilon Catalytic domain of the Serine/Threonine protein kinases, Casein Kinase 1 delta and epsilon. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. The delta and epsilon isoforms of CK1 play important roles in circadian rhythm and cell growth. They phosphorylate PERIOD proteins (PER1-3), which are circadian clock proteins that fulfill negative regulatory functions. PER phosphorylation leads to its degradation. However, CRY proteins form a complex with PER and CK1delta/epsilon that protects PER from degradation and leads to nuclear accumulation of the complex, which inhibits BMAL1-CLOCK dependent transcription activation. CK1delta/epsilon also phosphorylate the tumor suppressor p53 and the cellular oncogene Mdm2, which are key regulators of cell growth, genome integrity, and the development of cancer. This subfamily also includes the CK1 fungal proteins Saccharomyces cerevisiae HRR25 and Schizosaccharomyces pombe HHP1. These fungal proteins are involved in DNA repair. The CK1 delta/epsilon subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 275
31807 271028 cd14126 STKc_CK1_gamma Catalytic domain of the Serine/Threonine protein kinase, Casein Kinase 1 gamma. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. CK1gamma proteins are unique within the CK1 subfamily in that they are palmitoylated at the C-termini and are anchored to the plasma membrane. CK1gamma is involved in transducing the signaling of LDL-receptor-related protein 6 (LRP6) through direct phosphorylation following Wnt stimulation, resulting in the recruitment of the scaffold protein Axin. In Xenopus embryos, CK1gamma is required during anterio-posterior patterning. In higher vertebrates, three CK1gamma (gamma1-3) isoforms exist. In mammalian cells, CK1gamma2 has been implicated in regulating the synthesis of sphingomyelin, a phospholipid that is found in the outer leaflet of the plasma membrane, by hyperphosphorylating and inactivating the ceramide transfer protein CERT. CK1gamma2 also phosphorylates the transcription factor Smad-3 resulting in its ubiquitination and degradation. It inhibits Smad-3 mediated responses of Transforming Growth Factor-beta (TGF-beta) including cell growth arrest. The CK1 gamma subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
31808 271029 cd14127 STKc_CK1_fungal Catalytic domain of the Serine/Threonine protein kinase, Fungal Casein Kinase 1 homolog 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. This subfamily is composed of fungal CK1 homolog 1 proteins, also called Yck1 in Saccharomyces cerevisiae and Cki1 in Schizosaccharomyces pombe. Yck1 (or Yck1p) and Cki1 are plasma membrane-anchored proteins. Yck1 phosphorylates and regulates Khd1p, an RNA-binding protein that represses translation of bud-localized mRNA. Cki1 phosphorylates and regulates phosphatidylinositol (PI)-(4)P-5-kinase, which catalyzes the last step in the synthesis of PI(4,5)P2, which is involved in actin cytoskeleton remodeling and membrane traffic. The fungal CK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
31809 271030 cd14128 STKc_CK1_alpha Catalytic domain of the Serine/Threonine protein kinases, Casein Kinase 1 alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. CK1alpha plays a role in cell cycle progression, spindle dynamics, and chromosome segregation. It is also involved in regulating apoptosis mediated by Fas or the retinoid X receptor (RXR), and is a positive regulator of Wnt signaling. CK1alpha phosphorylates the NS5A protein of flaviviruses such as the Hepatitis C virus (HCV) and yellow fever virus (YFV), and influences flaviviral replication. The CK1 alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 266
31810 271031 cd14129 STKc_TTBK2 Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. Mutations in TTBK2 are associated with the development of spinocerebellar ataxia type 11 (SCA11), belonging to a group of neurodegenerative disorders characterized by progressive incoordination, dysarthria and impairment of eye movements. Brain tissues of SCA11 patients show the presence of neurofibrillary tangles and tau deposition, similar to Alzheimer's disease (AD) patients. The TTBK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31811 271032 cd14130 STKc_TTBK1 Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. Genetic variations in TTBK1 are linked to Alzheimer's disease (AD). Hyperphosphorylated tau is a major component of paired helical filaments that accumulate in the brain of AD patients. Studies in transgenic mice show that TTBK1 is involved in the phosphorylation-dependent pathogenic aggregation of tau. The TTBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31812 271033 cd14131 PKc_Mps1 Catalytic domain of the Dual-specificity Mitotic checkpoint protein kinase, Monopolar spindle 1 (also called TTK). Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TTK/Mps1 is a spindle checkpoint kinase that was first discovered due to its necessity in centrosome duplication in budding yeast. It was later found to function in the spindle assembly checkpoint, which monitors the proper attachment of chromosomes to the mitotic spindle. In yeast, substrates of Mps1 include the spindle pole body components Spc98p, Spc110p, and Spc42p. The TTK/Mps1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31813 271034 cd14132 STKc_CK2_alpha Catalytic subunit (alpha) of the Serine/Threonine Kinase, Casein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK2 is a tetrameric protein with two catalytic (alpha) and two regulatory (beta) subunits. It is constitutively active and ubiquitously expressed, and is found in the cytoplasm, nucleus, as well as in the plasma membrane. It phosphorylates a wide variety of substrates including glycogen synthase, cell cycle proteins, nuclear proteins (e.g. DNA topoisomerase II), and ion channels (e.g. ENaC), among others. It may be considered a master kinase controlling the activity or lifespan of many other kinases and exerting its effect over cell fate, gene expression, protein synthesis and degradation, and viral infection. CK2 is implicated in every stage of the cell cycle and is required for cell cycle progression. It plays crucial roles in cell differentiation, proliferation, and survival, and is thus implicated in cancer. CK2 is not an oncogene by itself but elevated CK2 levels create an environment that enhances the survival of tumor cells. The CK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 306
31814 271035 cd14133 PKc_DYRK_like Catalytic domain of Dual-specificity tYrosine-phosphorylated and -Regulated Kinase-like protein kinases. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of the dual-specificity DYRKs and YAK1, as well as the S/T kinases (STKs), HIPKs. DYRKs and YAK1 autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. Proteins in this subfamily play important roles in cell proliferation, differentiation, survival, growth, and development. The DYRK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 262
31815 271036 cd14134 PKc_CLK Catalytic domain of the Dual-specificity protein kinases, CDC-like kinases. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to their assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on S/T residues. In Drosophila, the CLK homolog DOA (Darkener of apricot) is essential for embryogenesis and its mutation leads to defects in sexual differentiation, eye formation, and neuronal development. In fission yeast, the CLK homolog Lkh1 is a negative regulator of filamentous growth and asexual flocculation, and is also involved in oxidative stress response. Vertebrates contain multiple CLK proteins and mammals have four (CLK1-4). The CLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 332
31816 271037 cd14135 STKc_PRP4 Catalytic domain of the Serine/Threonine Kinase, Pre-mRNA-Processing factor 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PRP4 phosphorylates a number of factors involved in the formation of active spliceosomes, which catalyze pre-mRNA splicing. It phosphorylates PRP6 and PRP31, components of the U4/U6-U5 tri-small nuclear ribonucleoprotein (snRNP), during spliceosomal complex formation. In fission yeast, PRP4 phosphorylates the splicing factor PRP1 (U5-102 kD in mammals). Thus, PRP4 plays a key role in regulating spliceosome assembly and pre-mRNA splicing. It also plays an important role in mitosis by acting as a spindle assembly checkpoint kinase that is required for chromosome alignment and the recruitment of the checkpoint proteins MPS1, MAD1, and MAD2 at kinetochores. The PRP4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 318
31817 271038 cd14136 STKc_SRPK Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. They play important roles in mediating pre-mRNA processing and mRNA maturation, as well as other cellular functions such as chromatin reorganization, cell cycle and p53 regulation, and metabolic signaling. Vertebrates contain three distinct SRPKs, called SRPK1-3. The SRPK homolog in budding yeast, Sky1p, recognizes and phosphorylates its substrate Npl3p, which lacks a classic RS domain but contains a single RS dipeptide at the C-terminus of its RGG domain. Npl3p is a shuttling heterogeneous nuclear ribonucleoprotein (hnRNP) that exports a distinct class of mRNA from the nucleus to the cytoplasm. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 320
31818 271039 cd14137 STKc_GSK3 The catalytic domain of the Serine/Threonine Kinase, Glycogen Synthase Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GSK3 is a multifunctional kinase involved in many cellular processes including cell division, proliferation, differentiation, adhesion, and apoptosis. In plants, GSK3 plays a role in the response to osmotic stress. In Caenorhabditis elegans, it plays a role in regulating normal oocyte-to-embryo transition and response to oxidative stress. In Chlamydomonas reinhardtii, GSK3 regulates flagellar length and assembly. In mammals, there are two isoforms, GSK3alpha and GSK3beta, which show both distinct and redundant functions. The two isoforms differ mainly in their N-termini. They are both involved in axon formation and in Wnt signaling. They play distinct roles in cardiogenesis, with GSK3alpha being essential in cardiomyocyte survival, and GSK3beta regulating heart positioning and left-right symmetry. GSK3beta was first identified as a regulator of glycogen synthesis, but has since been determined to play other roles. It regulates the degradation of beta-catenin and IkB. Beta-catenin is the main effector of Wnt, which is involved in normal haematopoiesis and stem cell function. IkB is a central inhibitor of NF-kB, which is critical in maintaining leukemic cell growth. GSK3beta is enriched in the brain and is involved in regulating neuronal signaling pathways. It is implicated in the pathogenesis of many diseases including Type II diabetes, obesity, mood disorders, Alzheimer's disease, osteoporosis, and some types of cancer, among others. The GSK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 293
31819 271040 cd14138 PTKc_Wee1a Catalytic domain of the Protein Tyrosine Kinase, Wee1a. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of human Wee1a, Xenopus laevis Wee1b (XeWee1b) and similar vertebrate proteins. Members of this subfamily show a wide expression pattern. XeWee1b functions after the first zygotic cell divisions. It is expressed in all tissues and is also present after the gastrulation stage of embryos. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The Wee1a subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 276
31820 271041 cd14139 PTKc_Wee1b Catalytic domain of the Protein Tyrosine Kinase, Wee1b. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of human Wee1b (also called Wee2), Xenopus laevis Wee1a (XeWee1a) and similar vertebrate proteins. XeWee1a accumulates after exiting the metaphase II stage in oocytes and in early mitotic cells. It functions during the first zygotic cell division and not during subsequent divisions. Mammalian Wee2/Wee1b is an oocyte-specific inhibitor of meiosis that functions downstream of cAMP. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The Wee1b subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 274
31821 271042 cd14140 STKc_ACVR2b Catalytic domain of the Serine/Threonine Kinase, Activin Type IIB Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2b (or ActRIIB) belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. ACVR2b is one of two ACVR2 receptors found in vertebrates. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptors for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. The ACVR2b subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
31822 271043 cd14141 STKc_ACVR2a Catalytic domain of the Serine/Threonine Kinase, Activin Type IIA Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2a (or ActRIIA) belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. ACVR2a is one of two ACVR2 receptors found in vertebrates. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptors for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. The ACVR2a subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31823 271044 cd14142 STKc_ACVR1_ALK1 Catalytic domain of the Serine/Threonine Kinases, Activin Type I Receptor and Activin receptor-Like Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR1, also called Activin receptor-Like Kinase 2 (ALK2), and ALK1 act as receptors for bone morphogenetic proteins (BMPs) and they activate SMAD1/5/8. ACVR1 is widely expressed while ALK1 is limited mainly to endothelial cells. The specificity of BMP binding to type I receptors is affected by type II receptors. ACVR1 binds BMP6/7/9/10 and can also bind anti-Mullerian hormone (AMH) in the presence of AMHR2. ALK1 binds BMP9/10 as well as TGFbeta in endothelial cells. A missense mutation in the GS domain of ACVR1 causes fibrodysplasia ossificans progressiva, a complex and disabling disease characterized by congenital skeletal malformations and extraskeletal bone formation. ACVR1 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and AMH, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like ACVR1 and ALK1, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The ACVR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 298
31824 271045 cd14143 STKc_TGFbR1_ACVR1b_ACVR1c Catalytic domain of the Serine/Threonine Kinases, Transforming Growth Factor beta Type I Receptor and Activin Type IB/IC Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TGFbR1, also called Activin receptor-Like Kinase 5 (ALK5), functions as a receptor for TGFbeta and phosphorylates SMAD2/3. TGFbeta proteins are cytokines that regulate cell growth, differentiation, and survival, and are critical in the development and progression of many human cancers. Mutations in TGFbR1 (and TGFbR2) can cause aortic aneurysm disorders such as Loeys-Dietz and Marfan syndromes. ACVR1b (also called ALK4) and ACVR1c (also called ALK7) act as receptors for activin A and B, respectively. TGFbR1, ACVR1b, and ACVR1c belong to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like TGFbR1, ACVR1b, and ACVR1c, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The TGFbR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
31825 271046 cd14144 STKc_BMPR1 Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type I Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1 functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Vertebrates contain two type I BMP receptors, BMPR1a and BMPR1b. BMPR1 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that also includes TGFbeta, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
31826 271047 cd14145 STKc_MLK1 Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK1 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK) and is also called MAP3K9. MAP3Ks phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Little is known about the specific function of MLK1. It is capable of activating the c-Jun N-terminal kinase pathway. Mice lacking both MLK1 and MLK2 are viable, fertile, and have normal life spans. There could be redundancy in the function of MLKs. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
31827 271048 cd14146 STKc_MLK4 Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK4 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The specific function of MLK4 is yet to be determined. Mutations in the kinase domain of MLK4 have been detected in colorectal cancers. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
31828 271049 cd14147 STKc_MLK3 Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK3 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLK3 activates multiple MAPK pathways and plays a role in apoptosis, proliferation, migration, and differentiation, depending on the cellular context. It is highly expressed in breast cancer cells and its signaling through c-Jun N-terminal kinase has been implicated in the migration, invasion, and malignancy of cancer cells. MLK3 also functions as a negative regulator of Inhibitor of Nuclear Factor-KappaB Kinase (IKK) and consequently, it also impacts inflammation and immunity. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31829 271050 cd14148 STKc_MLK2 Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK2 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK) and is also called MAP3K10. MAP3Ks phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLK2 is abundant in brain, skeletal muscle, and testis. It functions upstream of the MAPK, c-Jun N-terminal kinase. It binds hippocalcin, a calcium-sensor protein that protects neurons against calcium-induced cell death. Both MLK2 and hippocalcin may be associated with the pathogenesis of Parkinson's disease. MLK2 also binds to normal huntingtin (Htt), which is important in neuronal transcription, development, and survival. MLK2 does not bind to the polyglutamine-expanded Htt, which is implicated in the pathogenesis of Huntington's disease, leading to neuronal toxicity. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 258
31830 271051 cd14149 STKc_C-Raf Catalytic domain of the Serine/Threonine Kinase, C-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. C-Raf, also known as Raf-1 or c-Raf-1, is ubiquitously expressed and was the first Raf identified. It was characterized as the acquired oncogene from an acutely transforming murine sarcoma virus (3611-MSV) and the transforming agent from the avian retrovirus MH2. C-Raf-deficient mouse embryos die around midgestation with increased apoptosis of embryonic tissues, especially in the fetal liver. One of the main functions of C-Raf is restricting caspase activation to promote survival in response to specific stimuli such as Fas stimulation, macrophage apoptosis, and erythroid differentiation. C-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The C-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 283
31831 271052 cd14150 STKc_A-Raf Catalytic domain of the Serine/Threonine Kinase, A-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. A-Raf cooperates with C-Raf in regulating ERK transient phosphorylation that is associated with cyclin D expression and cell cycle progression. Mice deficient in A-Raf are born alive but show neurological and intestinal defects. A-Raf demonstrates low kinase activity to MEK, compared with B- and C-Raf, and may also have alternative functions other than in the ERK signaling cascade. It regulates the M2 type pyruvate kinase, a key glycolytic enzyme. It also plays a role in endocytic membrane trafficking. A-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The A-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31832 271053 cd14151 STKc_B-Raf Catalytic domain of the Serine/Threonine Kinase, B-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. B-Raf activates ERK with the strongest magnitude, compared with other Raf kinases. Mouse embryos deficient in B-Raf die around midgestation due to vascular hemorrhage caused by apoptotic endothelial cells. Mutations in B-Raf have been implicated in initiating tumorigenesis and tumor progression, and are found in malignant cutaneous melanoma, papillary thyroid cancer, as well as in ovarian and colorectal carcinomas. Most oncogenic B-Raf mutations are located at the activation loop of the kinase and surrounding regions; the V600E mutation accounts for around 90% of oncogenic mutations. The V600E mutant constitutively activates MEK, resulting in sustained activation of ERK. B-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The B-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 274
31833 271054 cd14152 STKc_KSR1 Catalytic domain of the Serine/Threonine Kinase, Kinase Suppressor of Ras 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. KSR1 functions as a transducer of TNFalpha-stimulated C-Raf activation of ERK1/2 and NF-kB. Detected activity of KSR1 is cell type specific and context dependent. It is inactive in normal colon epithelial cells and becomes activated at the onset of inflammatory bowel disease (IBD). Similarly, KSR1 activity is undetectable prior to stimulation by EGF or ceramide in COS-7 or YAMC cells, respectively. KSR proteins are widely regarded as pseudokinases; however, this matter is up for debate, as catalytic activity has been detected for KSR1 in some systems. The KSR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
31834 271055 cd14153 PK_KSR2 Pseudokinase domain of Kinase Suppressor of Ras 2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. KSR2 interacts with the protein phosphatase calcineurin and functions in calcium-mediated ERK signaling. It also functions in energy metabolism by regulating AMP kinase and AMPK-dependent processes such as glucose uptake and fatty acid oxidation. KSR proteins act as scaffold proteins that function downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases. The KSR2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
31835 271056 cd14154 STKc_LIMK Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. Vertebrates have two members, LIMK1 and LIMK2. The LIMK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
31836 271057 cd14155 PKc_TESK Catalytic domain of the Dual-specificity protein kinase, Testicular protein kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TESK proteins phosphorylate cofilin and induce actin cytoskeletal reorganization. In the Drosophila eye, TESK is required for epithelial cell organization. Mammals contain two TESK proteins, TESK1 and TESK2, which are highly expressed in testis and play roles in spermatogenesis. TESK1 is found in testicular germ cells while TESK2 is expressed mainly in nongerminal Sertoli cells. TESK1 is stimulated by integrin-mediated signaling pathways. It regulates cell spreading and focal adhesion formation. The TESK subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 253
31837 271058 cd14156 PKc_LIMK_like_unk Catalytic domain of an unknown subfamily of LIM domain kinase-like protein kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. This group is composed of uncharacterized proteins with similarity to LIMK and Testicular or testis-specific protein kinase (TESK). LIMKs are characterized as serine/threonine kinases (STKs) while TESKs are dual-specificity protein kinases. Both LIMK and TESK phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They are implicated in many cellular functions including cell spreading, motility, morphogenesis, meiosis, mitosis, and spermatogenesis. The LIMK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31838 271059 cd14157 STKc_IRAK2 Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK2 plays a role in mediating NFkB activation by TLR3, TLR4, and TLR8. It is specifically targeted by the viral protein A52, which is important for virulence, to inhibit all IL-1/TLR pathways, indicating that IRAK2 has a predominant role in NFkB activation. It is redundant with IRAK1 in early signaling but is critical for late and sustained activation. The IRAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31839 271060 cd14158 STKc_IRAK4 Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK4 plays a critical role in NFkB activation by its interaction with MyD88, which acts as a scaffold that enables IRAK4 to phosphorylate and activate IRAK1 and/or IRAK2. It also plays an important role in type I IFN production induced by TLR7/8/9. The IRAK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
31840 271061 cd14159 STKc_IRAK1 Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK1 plays a role in the activation of IRF3/7, STAT, and NFkB. It mediates IL-6 and IFN-gamma responses following IL-1 and IL-18 stimulation, respectively. It also plays an essential role in IFN-alpha induction downstream of TLR7 and TLR9. The IRAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 296
31841 271062 cd14160 PK_IRAK3 Pseudokinase domain of Interleukin-1 Receptor Associated Kinase 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain (a pseudokinase in the case of IRAK3), and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK3 (or IRAK-M) is the only IRAK that does not show kinase activity. It is found only in monocytes and macrophages in humans, and functions as a negative regulator of TLR signaling including TLR-2 induced p38 activation. It also negatively regulates the alternative NFkB pathway in a TLR-2 specific manner. IRAK3 is downregulated in the monocytes of obese people, and is associated with high SOD2, a marker of mitochondrial oxidative stress. It is an important inhibitor of inflammation in association with obesity and metabolic syndrome. The IRAK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 276
31842 271063 cd14161 STKc_NUAK2 Catalytic domain of the Serine/Threonine Kinase, novel (nua) kinase family NUAK 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NUAK proteins are classified as AMP-activated protein kinase (AMPK)-related kinases, which like AMPK are activated by the major tumor suppressor LKB1. Vertebrates contain two NUAK proteins, called NUAK1 and NUAK2. NUAK2, also called SNARK (Sucrose, non-fermenting 1/AMP-activated protein kinase-related kinase), is involved in energy metabolism. It is activated by hyperosmotic stress, DNA damage, and nutrients such as glucose and glutamine. NUAK2-knockout mice develop obesity, altered serum lipid profiles, hyperinsulinaemia, hyperglycaemia, and impaired glucose tolerance. NUAK2 is implicated in regulating actin stress fiber assembly through its association with myosin phosphatase Rho-interacting protein (MRIP), which leads to an increase in myosin regulatory light chain (MLC) phosphorylation. It is also associated with tumor growth, migration, and oncogenicity of melanoma cells. The NUAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31843 271064 cd14162 STKc_TSSK4-like Catalytic domain of testis-specific serine/threonine kinase 4 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK4, also called TSSK5, is expressed in testis from haploid round spermatids to mature spermatozoa. It phosphorylates Cre-Responsive Element Binding protein (CREB), facilitating the binding of CREB to the specific cis cAMP responsive element (CRE), which is important in activating genes related to germ cell differentiation. Mutations in the human TSSK4 gene are associated with infertile Chinese men with impaired spermatogenesis. The TSSK4-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31844 271065 cd14163 STKc_TSSK3-like Catalytic domain of testis-specific serine/threonine kinase 3 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK3 has been reported to be expressed in the interstitial Leydig cells of adult testis. Its mRNA level is low at birth, increases at puberty, and remains high throughout adulthood. The TSSK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
31845 271066 cd14164 STKc_TSSK6-like Catalytic domain of testis-specific serine/threonine kinase 6 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK6, also called SSTK, is expressed at the head of elongated sperm. It can phosphorylate histones and associate with heat shock proteins HSP90 and HSC70. Male mice deficient in TSSK6 are infertile, showing spermatogenic impairment including reduced sperm counts, impaired DNA condensation, abnormal morphology and decreased motility rates. The TSSK6-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31846 271067 cd14165 STKc_TSSK1_2-like Catalytic domain of testis-specific serine/threonine kinase 1, TSSK2, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK1 and TSSK2 are expressed specifically in meiotic and postmeiotic spermatogenic cells, respectively. TSSK2 is localized in the sperm neck, equatorial segment, and mid-piece of the sperm tail. Both TSSK1 and TSSK2 phosphorylate their common substrate TSKS (testis-specific-kinase-substrate). TSSK1/TSSK2 double knock-out mice are sterile without manifesting other defects, making these kinases viable targets for male contraception. The TSSK1/2-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31847 271068 cd14166 STKc_CaMKI_gamma Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I gamma. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-gamma subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 285
31848 271069 cd14167 STKc_CaMKI_alpha Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 263
31849 271070 cd14168 STKc_CaMKI_delta Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I delta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-delta subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 301
31850 271071 cd14169 STKc_CaMKI_beta Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-beta subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 277
31851 271072 cd14170 STKc_MAPKAPK2 Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 2 (MAPKAP2 or MK2) contains an N-terminal proline-rich region that can bind to SH3 domains, a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK2 is a bona fide substrate for the MAPK p38. It is closely related to MK3 and thus far, MK2/3 show indistinguishable substrate specificity. They are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer, and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. The MK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 303
31852 271073 cd14171 STKc_MAPKAPK5 Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 5 (MAPKAP5 or MK5) is also called PRAK (p38-regulated/activated protein kinase). It contains a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK5 is a ubiquitous protein that is implicated in neuronal morphogenesis, cell migration, and tumor angiogenesis. It interacts with PKA, which induces cytoplasmic translocation of MK5. Its substrates include p53, ERK3/4, Hsp27, and cytosolic phospholipase A2 (cPLA2). The MAPKAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31853 271074 cd14172 STKc_MAPKAPK3 Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 3 (MAPKAP3 or MK3) contains an N-terminal proline-rich region that can bind to SH3 domains, a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK3 is a bona fide substrate for the MAPK p38. It is closely related to MK2 and thus far, MK2/3 show indistinguishable substrate specificity. They are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer, and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. MK3 activity is only significant when MK2 is absent. The MK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31854 271075 cd14173 STKc_Mnk2 Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase signal-integrating kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases and comprise a group of four proteins, produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4G, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 288
31855 271076 cd14174 STKc_Mnk1 Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase signal-integrating kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases and comprise a group of four proteins, produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4G, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 289
31856 271077 cd14175 STKc_RSK1_C C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 1 (also called Ribosomal protein S6 kinase alpha-1 or 90kDa ribosomal protein S6 kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK1 is also called S6K-alpha-1, RPS6KA1, p90RSK1 or MAPK-activated protein kinase 1a (MAPKAPK-1a). It is a component of the insulin transduction pathway, regulating the function of IRS1. It also interacts with PKA and promotes its inactivation. RSK1 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 291
31857 271078 cd14176 STKc_RSK2_C C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 2 (also called 90kDa ribosomal protein S6 kinase 3 or Ribosomal protein S6 kinase alpha-3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK2 is also called p90RSK3, RPS6KA3, S6K-alpha-3, or MAPK-activated protein kinase 1b (MAPKAPK-1b). RSK2 is expressed highly in the regions of the brain with high synaptic activity. It plays a role in the maintenance and consolidation of excitatory synapses. It is a specific modulator of phospholipase D in calcium-regulated exocytosis. Mutations in the RSK2 gene, RPS6KA3, cause Coffin-Lowry syndrome (CLS), a rare syndromic form of X-linked mental retardation characterized by growth and psychomotor retardation and skeletal abnormalities. RSK2 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 339
31858 271079 cd14177 STKc_RSK4_C C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 4 (also called Ribosomal protein S6 kinase alpha-6 or 90kDa ribosomal protein S6 kinase 6). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK4 is also called S6K-alpha-6, RPS6KA6, p90RSK6 or pp90RSK4. RSK4 is a substrate of ERK and is a modulator of p53-dependent proliferation arrest in human cells. Deletion of the RSK4 gene, RPS6KA6, frequently occurs in patients of X-linked deafness type 3, mental retardation and choroideremia. Studies of RSK4 in cancer cells and tissues suggest that it may be oncogenic or tumor suppressive depending on many factors. RSK4 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 295
31859 271080 cd14178 STKc_RSK3_C C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 3 (also called Ribosomal protein S6 kinase alpha-2 or 90kDa ribosomal protein S6 kinase 2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK3 is also called S6K-alpha-2, RPS6KA2, p90RSK2 or MAPK-activated protein kinase 1c (MAPKAPK-1c). RSK3 binds muscle A-kinase anchoring protein (mAKAP)-b directly and regulates concentric cardiac myocyte growth. The RSK3 gene, RPS6KA2, is a putative tumor suppressor gene in sporadic epithelial ovarian cancer and variations to the gene may be associated with rectal cancer risk. RSK3 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 293
31860 271081 cd14179 STKc_MSK1_C C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK1 plays a role in the regulation of translational control and transcriptional activation. It phosphorylates the transcription factors, CREB and NFkB. It also phosphorylates the nucleosomal proteins H3 and HMG-14. Increased phosphorylation of MSK1 is associated with the development of cerebral ischemic/hypoxic preconditioning. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 310
31861 271082 cd14180 STKc_MSK2_C C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK2 and MSK1 play nonredundant roles in activating histone H3 kinases, which play pivotal roles in compaction of the chromatin fiber. MSK2 is the required H3 kinase in response to stress stimuli and activation of the p38 MAPK pathway. MSK2 also plays a role in the pathogenesis of psoriasis. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family, similar to 90 kDa ribosomal protein S6 kinases (RSKs). MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 309
31862 271083 cd14181 STKc_PhKG2 Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma 2 subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). The gamma 2 subunit (PhKG2) is also referred to as the testis/liver gamma isoform. Mutations in its gene cause autosomal-recessive glycogenosis of the liver. The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 279
31863 271084 cd14182 STKc_PhKG1 Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma 1 subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). The gamma 1 subunit (PhKG1) is also referred to as the muscle gamma isoform. The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 276
31864 271085 cd14183 STKc_DCKL1 Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 1 (also called Doublecortin-like and CAM kinase-like 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL1 (or DCAMKL1) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL1 contains a serine, threonine, and proline rich domain (SP) and a C-terminal kinase domain with similarity to CAMKs. DCKL1 interacts with tubulin, glucocorticoid receptor, dynein, JIP1/2, caspases (3 and 8), and calpain, among others. It plays roles in neurogenesis, neuronal migration, retrograde transport, and neuronal apoptosis. The DCKL1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 268
31865 271086 cd14184 STKc_DCKL2 Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 2 (also called Doublecortin-like and CAM kinase-like 2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL2 (or DCAMKL2) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL2 contains a serine, threonine, and proline rich domain (SP) and a C-terminal kinase domain with similarity to CAMKs. DCKL2 has been shown to interact with tubulin, JIP1/2, JNK, neurabin 2, and actin. It is associated with the terminal segments of axons and dendrites, and may function as a phosphorylation-dependent switch to control microtubule dynamics in neuronal growth cones. The DCKL2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31866 271087 cd14185 STKc_DCKL3 Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 3 (also called Doublecortin-like and CAM kinase-like 3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL3 (or DCAMKL3) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. DCKL3 contains a single DCX domain (instead of a tandem) and a C-terminal kinase domain with similarity to CAMKs. It has been shown to interact with tubulin and JIP1/2. The DCKL3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 258
31867 271088 cd14186 STKc_PLK4 Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is composed of two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK4, also called SAK or STK18, is structurally different from other PLKs in that it contains only one polo box that can form two adjacent polo boxes and a functional PBD by homodimerization. It is required for late mitotic progression, cell survival, and embryonic development. It localizes to centrosomes and is required for centriole duplication and chromosomal stability. Overexpression of PLK4 may be associated with colon tumors. The PLK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
31868 271089 cd14187 STKc_PLK1 Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK1 functions as a positive regulator of mitosis, meiosis, and cytokinesis. Its localization changes during mitotic progression; associating first with centrosomes in prophase, with kinetochores in prometaphase and metaphase, at the central spindle in anaphase, and in the midbody during telophase. It carries multiple functions throughout the cell cycle through interactions with different substrates at these specific subcellular locations. PLK1 is overexpressed in many human cancers and is associated with poor prognosis. The PLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 265
31869 271090 cd14188 STKc_PLK2 Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK2, also called Snk (serum-inducible kinase), functions in G1 progression, S-phase arrest, and centriole duplication. Its gene is responsive to both growth factors and cellular stress, is a transcriptional target of p53, and activates a G2-M checkpoint. The PLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31870 271091 cd14189 STKc_PLK3 Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK3, also called Prk or Fnk (FGF-inducible kinase), regulates angiogenesis and responses to DNA damage. Activated PLK3 mediates Chk2 phosphorylation by ATM and the resulting checkpoint activation. PLK3 phosphorylates DNA polymerase delta and may be involved in DNA repair. It also inhibits Cdc25c, thereby regulating the onset of mitosis. The PLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 255
31871 271092 cd14190 STKc_MLCK2 Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK2 (or MYLK2) phosphorylates myosin regulatory light chain and controls the contraction of skeletal muscles. MLCK2 contains a single kinase domain near the C-terminus followed by a regulatory segment containing an autoinhibitory Ca2+/calmodulin binding site. The MLCK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
31872 271093 cd14191 STKc_MLCK1 Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK1 (or MYLK1) phosphorylates myosin regulatory light chain and controls the contraction of smooth muscles. The MLCK1 gene expresses three transcripts in a cell-specific manner: a short MLCK1 which contains three immunoglobulin (Ig)-like and one fibronectin type III (FN3) domains, PEVK and actin-binding regions, and a kinase domain near the C-terminus followed by a regulatory segment containing an autoinhibitory Ca2+/calmodulin binding site; a long MLCK1 containing six additional Ig-like domains at the N-terminus compared to the short MLCK1; and the C-terminal Ig module which results in the expression of telokin in phasic smooth muscles, leading to Ca2+ desensitization by cyclic nucleotides of smooth muscle force. MLCK1 is also responsible for myosin regulatory light chain phosphorylation in nonmuscle cells and may play a role in regulating myosin II ATPase activity. The MLCK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 259
31873 271094 cd14192 STKc_MLCK3 Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK3 (or MYLK3) phosphorylates myosin regulatory light chain 2 and controls the contraction of cardiac muscles. It is expressed specifically in both the atrium and ventricle of the heart and its expression is regulated by the cardiac protein Nkx2-5. MLCK3 plays an important role in cardiogenesis by regulating the assembly of cardiac sarcomeres, the repeating contractile unit of striated muscle. MLCK3 contains a single kinase domain near the C-terminus and a unique N-terminal half, and unlike MLCK1/2, it does not appear to be regulated by Ca2+/calmodulin. The MLCK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
31874 271095 cd14193 STKc_MLCK4 Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. In vertebrates, different MLCKs function in smooth (MLCK1), skeletal (MLCK2), and cardiac (MLCK3) muscles. A fourth protein, MLCK4, has also been identified through comprehensive genome analysis although it has not been biochemically characterized. MLCK4 (or MYLK4 or SgK085) contains a single kinase domain near the C-terminus. The MLCK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 261
31875 271096 cd14194 STKc_DAPK1 Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK1 is the prototypical member of the subfamily and is also simply referred to as DAPK. It is a Ca2+/calmodulin (CaM)-regulated and actin-associated protein that contains an N-terminal kinase domain followed by an autoinhibitory CaM binding region and a large C-terminal extension with multiple functional domains including ankyrin (ANK) repeats, a cytoskeletal binding domain, a Death domain, and a serine-rich tail. Loss of DAPK1 expression, usually because of DNA methylation, is implicated in many tumor types. DAPK1 is highly abundant in the brain and has also been associated with neurodegeneration. The DAPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31876 271097 cd14195 STKc_DAPK3 Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK3, also called DAP-like kinase (DLK) and zipper-interacting protein kinase (ZIPk), contains an N-terminal kinase domain and a C-terminal region with nuclear localization signals (NLS) and a leucine zipper motif that mediates homodimerization and interaction with other leucine zipper proteins. It interacts with Par-4, a protein that contains a death domain and interacts with actin filaments. DAPK3 is present in both the cytoplasm and nucleus. Its co-expression with Par-4 results in the co-localization of the two proteins to actin filaments. In addition to cell death, DAPK3 is also implicated in mediating cell motility and the contraction of smooth muscles. The DAPK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31877 271098 cd14196 STKc_DAPK2 Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK2, also called DAPK-related protein 1 (DRP-1), is a Ca2+/calmodulin (CaM)-regulated protein containing an N-terminal kinase domain, a CaM autoinhibitory site and a dimerization module. It lacks the cytoskeletal binding regions of DAPK1 and the exogenous protein has been shown to be soluble and cytoplasmic. FLAG-tagged DAPK2, however, accumulated within membrane-enclosed autophagic vesicles. It is unclear where endogenous DAPK2 is localized. DAPK2 participates in TNF-alpha and FAS-receptor induced cell death and enhances neutrophilic maturation in myeloid leukemic cells. It contributes to the induction of anoikis and its down-regulation is implicated in the beta-catenin induced resistance of malignant epithelial cells to anoikis. The DAPK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 269
31878 271099 cd14197 STKc_DRAK1 Catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 (also called STK17A) and DRAK2. Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. Rabbit DRAK1 has been shown to induce apoptosis in osteoclasts and overexpression of human DRAK1 induces apoptosis in cultured fibroblast cells. DRAK1 may be involved in apoptotic signaling. The DRAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31879 271100 cd14198 STKc_DRAK2 The catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 and DRAK2 (also called STK17B). Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. DRAK2 has been implicated in inducing or enhancing apoptosis in beta cells, fibroblasts, and lymphoid cells, where it is highly expressed. It is involved in regulating many immune processes including the germinal center (GC) reaction, responses to thymus-dependent antigens, activated T cell survival, and memory T cell responses. It may be involved in the development of autoimmunity. The DRAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
31880 271101 cd14199 STKc_CaMKK2 Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CamKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). CaMKK2, also called CaMKK beta, is one of the most versatile CaMKs. It is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. CaMKK2 contains unique N- and C-terminal domains and a central catalytic kinase domain that is followed by a regulatory domain that bears overlapping autoinhibitory and CaM-binding regions. It can be activated by signaling through G-coupled receptors, IP3 receptors, plasma membrane ion channels, and Toll-like receptors. Thus, CaMKK2 acts as a molecular hub that is capable of receiving and decoding signals from diverse pathways. The CaMKK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 286
31881 271102 cd14200 STKc_CaMKK1 Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CamKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). CaMKK1, also called CaMKK alpha, is involved in the regulation of glucose uptake in skeletal muscles, independently of AMPK and PKB activation. It also plays roles in learning and memory. Studies on CaMKK1 knockout mice reveal deficits in fear conditioning. The CaMKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
31882 271103 cd14201 STKc_ULK2 Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK2 is ubiquitously expressed and is essential in autophagy induction. It displays partially redundant functions with ULK1 and is able to compensate for the loss of ULK1 in non-selective autophagy. It also displays neuron-specific functions and is important in axon development. The ULK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 271
31883 271104 cd14202 STKc_ULK1 Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK1 is required for efficient amino acid starvation-induced autophagy and mitochondrial clearance. It associates with three autophagy-related proteins (Atg13, FIP200 and Atg101) to form the ULK1 complex. All four proteins are essential for autophagosome formation. ULK1 is regulated by both mammalian target of rapamycin complex 1 (mTORC1) and AMP-activated protein kinase (AMPK). mTORC1 negatively regulates the ULK1 complex in a nutrient-dependent manner while AMPK stimulates autophagy by inhibiting mTORC1. ULK1 also plays neuron-specific roles and is involved in non-clathrin-coated endocytosis in growth cones, filopodia extension, neurite extension, and axon branching. The ULK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31884 271105 cd14203 PTKc_Src_Fyn_like Catalytic domain of a subset of Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily includes a subset of Src-like PTKs including Src, Fyn, Yrk, and Yes, which are all widely expressed. Yrk has been detected only in chickens. It is primarily found in neuronal and epithelial cells and in macrophages. It may play a role in inflammation and in response to injury. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. They are also implicated in acute inflammatory responses and osteoclast function. The Src/Fyn-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 248
31885 271106 cd14204 PTKc_Mer Catalytic Domain of the Protein Tyrosine Kinase, Mer. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Mer (or Mertk) is named after its original reported expression pattern (monocytes, epithelial, and reproductive tissues). It is required for the ingestion of apoptotic cells by phagocytes such as macrophages, retinal pigment epithelial cells, and dendritic cells. Mer is also important in maintaining immune homeostasis. Mer is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Mer subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
31886 271107 cd14205 PTKc_Jak2_rpt2 Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak2 is widely expressed in many tissues and is essential for the signaling of hormone-like cytokines such as growth hormone, erythropoietin, thrombopoietin, and prolactin, as well as some IFNs and cytokines that signal through the IL-3 and gp130 receptors. Disruption of Jak2 in mice results in an embryonic lethal phenotype with multiple defects including erythropoietic and cardiac abnormalities. It is the only Jak gene that results in a lethal phenotype when disrupted in mice. A mutation in the pseudokinase domain of Jak2, V617F, is present in many myeloproliferative diseases, including almost all patients with polycythemia vera, and 50% of patients with essential thrombocytosis and myelofibrosis. Jak2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 284
31887 271108 cd14206 PTKc_Aatyk3 Catalytic domain of the Protein Tyrosine Kinases, Apoptosis-associated tyrosine kinase 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk3, also called lemur tyrosine kinase 3 (Lmtk3) is a receptor kinase containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. The function of Aatyk3 is still unknown. The Aatyk3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). 276
31888 271109 cd14207 PTKc_VEGFR1 Catalytic domain of the Protein Tyrosine Kinases, Vascular Endothelial Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR1 (or Flt1) binds VEGFA, VEGFB, and placenta growth factor (PLGF). It regulates monocyte and macrophage migration, vascular permeability, haematopoiesis, and the recruitment of haematopoietic progenitor cells from the bone marrow. VEGFR1 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 340
31889 271110 cd14208 PTK_Jak3_rpt1 Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 3. Jak3 is expressed only in hematopoietic cells. It binds the shared receptor subunit, common gamma chain and thus, is essential in the signaling of cytokines that use it such as IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21. Jak3 is important in lymphoid development and myeloid cell differentiation. Inactivating mutations in Jak3 have been reported in humans with severe combined immunodeficiency (SCID). Jak3 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. Jaks are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Jak3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 260
31890 271111 cd14209 STKc_PKA Catalytic subunit of the Serine/Threonine Kinase, cAMP-dependent protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. The PKA subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 290
31891 271112 cd14210 PKc_DYRK Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase. Protein Kinases (PKs), Dual-specificity tYrosine-phosphorylated and -Regulated Kinase (DYRK) subfamily, catalytic (c) domain. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. The DYRK subfamily is part of a larger superfamily that includes the catalytic domains of other protein S/T PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. They play important roles in cell proliferation, differentiation, survival, and development. Vertebrates contain multiple DYRKs (DYRK1-4) and mammals contain two types of DYRK1 proteins, DYRK1A and DYRK1B. DYRK1A is involved in neuronal differentiation and is implicated in the pathogenesis of DS (Down syndrome). DYRK1B plays a critical role in muscle differentiation by regulating transcription, cell motility, survival, and cell cycle progression. It is overexpressed in many solid tumors where it acts as a tumor survival factor. DYRK2 promotes apoptosis in response to DNA damage by phosphorylating the tumor suppressor p53, while DYRK3 promotes cell survival by phosphorylating SIRT1 and promoting p53 deacetylation. DYRK4 is a testis-specific kinase that may function during spermiogenesis. 311
31892 271113 cd14211 STKc_HIPK Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). They show speckled localization in the nucleus, apart from the nucleoli. They play roles in the regulation of many nuclear pathways including gene transcription, cell survival, proliferation, differentiation, development, and DNA damage response. Vertebrates contain three HIPKs (HIPK1-3) and mammals harbor an additional family member HIPK4, which does not contain a homeobox-interacting domain and is localized in the cytoplasm. HIPK2, the most studied HIPK, is a coregulator of many transcription factors and cofactors and it regulates gene transcription during development and in DNA damage response. The HIPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 329
31893 271114 cd14212 PKc_YAK1 Catalytic domain of the Dual-specificity protein kinase, YAK1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of proteins with similarity to Saccharomyces cerevisiae YAK1 (or Yak1p), a dual-specificity kinase that autophosphorylates at tyrosine residues and phosphorylates substrates on S/T residues. YAK1 phosphorylates and activates the transcription factors Hsf1 and Msn2, which play important roles in cellular homeostasis during stress conditions including heat shock, oxidative stress, and nutrient deficiency. It also phosphorylates the protein POP2, a component of a complex that regulates transcription, under glucose-deprived conditions. It functions as a part of a glucose-sensing system that is involved in controlling growth in yeast. The YAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 330
31894 271115 cd14213 PKc_CLK1_4 Catalytic domain of the Dual-specificity protein kinases, CDC-like kinases 1 and 4. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK1 plays a role in neuronal differentiation. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to its assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK1/4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 330
31895 271116 cd14214 PKc_CLK3 Catalytic domain of the Dual-specificity protein kinase, CDC-like kinase 3. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK3 is predominantly expressed in mature spermatozoa, and might play a role in the fertilization process. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to its assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 331
31896 271117 cd14215 PKc_CLK2 Catalytic domain of the Dual-specificity protein kinase, CDC-like kinase 2. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK2 plays a role in hepatic insulin signaling and glucose metabolism. It is induced by the insulin/Akt pathway as part of the hepatic refeeding response, and it directly phosphorylates the SR domain of PGC-1alpha, which results in decreased gluconeogenic gene expression and glucose output. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to its assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 330
31897 271118 cd14216 STKc_SRPK1 Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK1 binds with high affinity the alternative splicing factor, SRSF1 (serine/arginine-rich splicing factor 1), and regiospecifically phosphorylates 10-12 serines in its RS domain. It plays a role in the regulation of pre-mRNA splicing, chromatin structure, and germ cell development. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 349
31898 271119 cd14217 STKc_SRPK2 Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK2 mediates neuronal cell cycle and cell death through regulation of nuclear cyclin D1. It has also been found to promote leukemia cell proliferation by regulating cyclin A1. SRPK2 also plays a role in regulating pre-mRNA splicing and is required for spliceosomal B complex formation. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 366
31899 271120 cd14218 STKc_SRPK3 Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK3 is highly expressed in the heart and skeletal muscles, and is controlled by a muscle-specific enhancer that is regulated by MEF2. It may play an important role in muscle development. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 365
31900 271121 cd14219 STKc_BMPR1b Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type IB. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1b, also called Activin receptor-Like Kinase 6 (ALK6), functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Mutations in BMPR1b that led to inhibition of chondrogenesis can cause Brachydactyly (BD) type A2, a dominant hand malformation characterized by shortening and lateral deviation of the index fingers. A point mutation in the BMPR1b kinase domain is also associated with the Booroola phenotype, characterized by precocious differentiation of ovarian follicles. BMPR1b belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1b, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1b subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 305
31901 271122 cd14220 STKc_BMPR1a Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type IA Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1a, also called Activin receptor-Like Kinase 3 (ALK3), functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Germline mutations in BMPR1a are associated with an increased risk to Juvenile Polyposis Syndrome, a hamartomatous disorder that may lead to gastrointestinal cancer. BMPR1a may also play an indirect role in the development of hematopoietic stem cells (HSCs) as osteoblasts are a major component of the HSC niche within the bone marrow. BMPR1a belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1a, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1a subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 287
31902 271123 cd14221 STKc_LIMK1 Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMK1 activation is induced by bone morphogenic protein, vascular endothelial growth factor, and thrombin. It plays roles in microtubule disassembly and cell cycle progression, and is critical in the regulation of neurite outgrowth. LIMK1 knockout mice show abnormalities in dendritic spine morphology and synaptic function. LIMK1 is one of the genes deleted in patients with Williams Syndrome, which is characterized by distinct craniofacial features, cardiovascular problems, as well as behavioral and neurological abnormalities. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. The LIMK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 267
31903 271124 cd14222 STKc_LIMK2 Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMK2 activation is induced by transforming growth factor-beta 1 (TGFb-1) and shares the same subcellular location as the cofilin family member twinfilin, which may be its biological substrate. LIMK2 plays a role in spermatogenesis, and may contribute to tumor progression and metastasis formation in some cancer cells. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. The LIMK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 272
31904 271125 cd14223 STKc_GRK2 Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK2, also called beta-adrenergic receptor kinase (beta-ARK) or beta-ARK1, is important in regulating several cardiac receptor responses. It plays a role in cardiac development and in hypertension. Deletion of GRK2 in mice results in embryonic lethality, caused by hypoplasia of the ventricular myocardium. GRK2 also plays important roles in the liver (as a regulator of portal blood pressure), in immune cells, and in the nervous system. Altered GRK2 expression has been reported in several disorders including major depression, schizophrenia, bipolar disorder, and Parkinsonism. GRK2 contains an N-terminal RGS homology (RH) domain, a central catalytic domain, and C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 321
31905 271126 cd14224 PKc_DYRK2_3 Catalytic domain of the protein kinases, Dual-specificity tYrosine-phosphorylated and -Regulated Kinases 2 and 3. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of DYRK2 and DYRK3, and similar proteins. Drosophila DYRK2 interacts and phosphorylates the chromatin remodelling factor, SNR1 (Snf5-related 1), and also interacts with the essential chromatin component, trithorax. It may play a role in chromatin remodelling. Vertebrate DYRK2 phosphorylates and regulates the tumor suppressor p53 to induce apoptosis in response to DNA damage. It can also phosphorylate the transcription factor, nuclear factor of activated T cells (NFAT). DYRK2 is overexpressed in lung adenocarcinoma and esophageal carcinomas, and is a predictor for favorable prognosis in lung adenocarcinoma. DYRK3, also called regulatory erythroid kinase (REDK), is highly expressed in erythroid cells and the testis, and is also present in adult kidney and liver. It promotes cell survival by phosphorylating and activating SIRT1, an NAD(+)-dependent protein deacetylase, which promotes p53 deacetylation, resulting in the inhibition of apoptosis. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. The DYRK2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other S/T kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 380
31906 271127 cd14225 PKc_DYRK4 Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase 4. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. DYRK4 is a testis-specific kinase with restricted expression to postmeiotic spermatids. It may function during spermiogenesis, however, it is not required for male fertility. DYRK4 has also been detected in a human teratocarcinoma cell line induced to produce postmitotic neurons. It may have a role in neuronal differentiation. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. They play important roles in cell proliferation, differentiation, survival, and development. The DYRK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 341
31907 271128 cd14226 PKc_DYRK1 Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase 1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. Mammals contain two types of DYRK1 proteins, DYRK1A and DYRK1B. DYRK1A was previously called minibrain kinase homolog (MNBH) or dual-specificity YAK1-related kinase. It phosphorylates various substrates and is involved in many cellular events. It phosphorylates and inhibits the transcription factors, nuclear factor of activated T cells (NFAT) and forkhead in rhabdomyosarcoma (FKHR). It regulates neuronal differentiation by targeting CREB (cAMP response element-binding protein). It also targets many endocytic proteins including dynamin and amphiphysin and may play a role in the endocytic pathway. The gene encoding DYRK1A is located in the DSCR (Down syndrome critical region) of human chromosome 21 and DYRK1A has been implicated in the pathogenesis of DS. DYRK1B, also called minibrain-related kinase (MIRK), is highly expressed in muscle and plays a critical role in muscle differentiation by regulating transcription, cell motility, survival, and cell cycle progression. It is overexpressed in many solid tumors where it acts as a tumor survival factor. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. The DYRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 339
31908 271129 cd14227 STKc_HIPK2 Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK2, the most studied HIPK, is a coregulator of many transcription factors and cofactors including homeodomain proteins (Nkx and HOX families), Smad1-4, Pax6, c-Myb, AML1, the histone acetyltransferase p300, and the tumor repressor p53, among others. It regulates gene transcription during development and in DNA damage response (DDR), and mediates cell processes such as apoptosis, survival, differentiation, and proliferation. HIPK2 mediates apoptosis by phosphorylating and activating p53 during DDR, resulting in the activation of apoptotic genes. In the absence of p53, HIPK2 targets the anti-apoptotic corepressor C-terminal binding protein (CtBP), leading to CtBP's degradation and the promotion of apoptosis. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 355
31909 271130 cd14228 STKc_HIPK1 Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK1 has been implicated in regulating eye size, lens formation, and retinal morphogenesis during late embryogenesis. It also contributes to the regulation of haematopoiesis and leukaemogenesis by phosphorylating and repressing the transcription factor c-Myb, which is crucial in T- and B-cell development. In glucose-deprived conditions, HIPK1 phosphorylates Daxx, leading to its relocalization from the nucleus to the cytoplasm, where it binds and stabilizes ASK1 (apoptosis signal-regulating kinase 1), a mitogen-activated protein kinase (MAPK) kinase kinase that activates the JNK and p38 MAPK pathways. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 355
31910 271131 cd14229 STKc_HIPK3 Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK3 is a Fas-interacting protein that induces FADD (Fas-associated death domain) phosphorylation and mediates FasL-induced JNK activation. Overexpression of HIPK3 does not affect cell death, however its expression in prostate cancer cells contributes to increased resistance to Fas receptor-mediated apoptosis. HIPK3 also plays a role in regulating steroidogenic gene expression. In response to cAMP, HIPK3 activates the phosphorylation of JNK and c-Jun, leading to increased activity of the transcription factor SF-1 (Steroidogenic factor 1), a key regulator for steroid biosynthesis in the gonad and adrenal gland. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). 330
31911 410578 cd14230 GAT_GGA canonical GAT domain found in metazoan and fungal ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs also play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. The family includes three GGAs (GGA1, GGA2, and GGA3) identified in mammals and two GGAs (Gga1p and Gga2p) identified in the budding yeast Saccharomyces cerevisiae. All these GGAs have a multidomain structure consisting of: an N-terminal VHS (Vps27/Hrs/Stam) domain that binds acidic-cluster dileucine (DxxLL)-type sorting signals (where x is any amino acid) found in the cytoplasmic tail of TGN sorting receptors; a GAT (GGA and TOM) domain that interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and the tumor susceptibility gene 101 product (TSG101); a largely unstructured hinge region that contains clathrin-binding motifs; and a C-terminal GAE (gamma-adaptin ear homology) domain that binds accessory proteins. In contrast to other GGAs-like proteins, members of this family contain a GAT N-terminal region, a helix-loop-helix in the complex with Arf1-GTP. 80
31912 410579 cd14231 GAT_GGA-like_plant canonical GAT domain found in uncharacterized Golgi-localized gamma ear-containing Arf-binding protein (GGA)-like proteins mainly found in plants. The family includes a group of uncharacterized plant proteins containing an N-terminal VHS (Vps27p/Hrs/STAM)-domain and a GAT (GGA and TOM1) domain. Both domains are also present in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs), which belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. In contrast to GGA proteins, members in this family do not have either a GAE (gamma-adaptin ear homology) domain or a clathrin-binding motif. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 79
31913 410580 cd14232 GAT_LSB5 canonical GAT domain found in yeast LAS seventeen-binding protein 5 (Lsb5p) and similar proteins. Lsb5p, also called LAS17-binding protein 5, is a Golgi-localized gamma ear-containing Arf-binding protein (GGA)-like protein located to the plasma membrane in an actin-independent manner. It plays important roles in membrane-trafficking events through association with the actin regulators, the yeast Wiskott-Aldrich syndrome protein (WASP) homologue Las17p and the cortical protein Sla1p, the yeast Arf3p (orthologous with mammalian Arf6), and ubiquitin. Lsb5p contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain and a GAT (GGA and TOM1) domain. In contrast to GGA proteins, Lsb5p harbors a C-terminal NPF (Asn-Pro-Phe) motif, but does not have either a GAE (gamma-adaptin ear homology) domain or a clathrin-binding motif. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 78
31914 410581 cd14233 GAT_TOM1_like canonical GAT domain found in target of myb protein 1 (Tom1) protein family. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS (Vps27p/Hrs/STAM)-domain followed by a GAT (GGA and TOM1) domain, both of which are also conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). In contrast to GGAs, the Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through its GAT domain, but do not associate with Arf GTPases through its GAT domain nor with acidic cluster-dileucine sequences through its VHS domain. In addition, the Tom1 family proteins recruit clathrin onto endosomes through their C-terminal region. In their C-terminal clathrin-binding regions, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The yeast S. cerevisiae does not contain homologous proteins of the Tom1 family. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 87
31915 410582 cd14234 GAT_GGA_meta canonical GAT domain found in metazoan ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. Moreover, GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Three GGAs (GGA1, GGA2, and GGA3) have been identified in mammals. They may appear to behave similarly, since all of them have a multidomain structure consisting of: an N-terminal VHS (Vps27/Hrs/Stam) domain that binds the acidic-cluster dileucine (DxxLL)-type sorting signals (where x is any amino acid) found in the cytoplasmic tail of TGN sorting receptors; a GAT (GGA and TOM) domain that interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and the tumor susceptibility gene 101 product (TSG101); a largely unstructured hinge region that contains clathrin-binding motifs; and a C-terminal GAE (gamma-adaptin ear homology) domain that binds accessory proteins. However, the three GGAs have some differences, which suggest they may possess their own distinct roles. For instance, both GGA1 and GGA3, but not GGA2, contain an internal DxxLL motif that binds to its own VHS domain. Only a portion of the VHS domain of GGA2 possesses distant structural homology to that of GGA1 or GGA3. Moreover, the binding affinity of GGA2 to ubiquitin is considerably lower than that of GGA1 or GGA3. In addition, GGA3 has a short splicing variant that is predominantly expressed in human cell lines and tissues except the brain. It does have a VHS domain, but it is unable to bind to the DxxLL motif. GGA2 and GGA3 undergo epidermal growth factor (EGF)-induced phosphorylation. 84
31916 410583 cd14235 GAT_GGA_fungi canonical GAT domain found in fungal ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. Two GGAs (Gga1p and Gga2p) have been identified in the budding yeast Saccharomyces cerevisiae. Yeast GGAs play important roles in the carboxypeptidase Y (CPY) pathway, vacuole biogenesis, alpha-factor maturation, and interactions with clathrin. They have a multidomain structure consisting of VHS (Vps27/Hrs/ STAM), GAT (GGA and TOM), hinge, and GAE (gamma-adaptin ear) domains. Both Gga1p and Gga2p function as effectors of Arf in yeast. They interact with Arf1p and Arf2p in a GTP-dependent manner. Moreover, Gga2p mediates sequential ubiquitin-independent and ubiquitin-dependent steps in the trafficking of ARN1, a ferrichrome transporter in S. cerevisiae, from the TGN to the vacuole. It also acts as a phosphatidylinositol 4-phosphate effector at the Golgi exit, which binds directly to the TGN PtdIns(4)-kinase Pik1p and contributes to Pik1p recruitment. In addition, Gga2p is required for sorting of the yeast siderophore iron transporter1 (Sit1) to the vacuolar pathway. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101). 92
31917 260094 cd14236 GAT_TOM1 canonical GAT domain found in target of Myb protein 1 (Tom1). Tom1 was originally identified by its induced expression by the v-Myb oncogene. It is predominantly present in the cytosol and can interact with clathrin, endofin, Toll-interacting protein (Tollip), and ubiquitinated proteins. It acts as a linker protein to regulate the ability of endofin to recruit clathrin onto the sorting endosome. Moreover, Tom1 functions as a negative regulator of IL-1beta and tumor necrosis factor (TNF)-alpha-induced signaling pathways. It also plays a role in the TLR2/4 signaling pathways. Tom1 contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). In contrast to GGAs, Tom1 binds to ubiquitin, ubiquitinated proteins, and Tollip through its GAT domain, but does not associate with Arf GTPases through its GAT domain nor with acidic cluster-dileucine sequences through its VHS domain. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 95
31918 410584 cd14237 GAT_TM1L1 canonical GAT domain found in target of Myb-like protein 1 (Tom1L1). Tom1L1, also called Src-activating and signaling molecule protein (Srcasm), was identified as a substrate of the Src family of protein kinases. It is tyrosine-phosphorylated by Src family kinases and modulates growth factor and Src-mediated signaling pathways. It also plays a potential role in endosomal sorting and ligand-stimulated endocytosis of EGF receptors (EGFR). Tom1L1 is predominantly present in the cytosol and can interact with Toll-interacting protein (Tollip), Hrs or TSG101, clathrin, and ubiquitinated proteins. It contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain, and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). It interacts with Tollip through its GAT domain and recruits clathrin onto endosomes through its C-terminal region. However, in the C-terminal clathrin-binding region, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 92
31919 410585 cd14238 GAT_TM1L2 canonical GAT domain found in target of Myb-like protein 2 (Tom1L2). Tom1L2, together with Myb protein 1 (Tom1) and target of Myb-like protein 1 (Tom1L1), constitute the Tom1 family. Tom1L2 can interact with Toll-interacting protein (Tollip), clathrin, and ubiquitin. It may play a potential role in endosomal sorting, as well as in the regulation of membrane trafficking that is linked to immunity and cell proliferation. Tom1L2 contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain, and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). It interacts with Tollip through its GAT domain and recruits clathrin onto endosomes through its C-terminal region. However, in the C-terminal clathrin-binding region, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. 92
31920 410586 cd14239 GAT_GGA1_GGA2 canonical GAT domain found in ADP-ribosylation factor (Arf)-binding proteins GGA1 and GGA2. This subfamily includes GGA1 and GGA2, both of which belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGA1, also called gamma-adaptin-related protein 1, or Golgi-localized gamma ear-containing Arf-binding protein 1, regulates the low-density lipoprotein and sorting receptor LR11/SorLA endocytic traffic and further alters amyloid-beta precursor protein (APP) intracellular distribution and amyloid-beta production. It is also critical for the effects of beta-site APP-cleaving enzyme-1 (BACE1) on amyloid-beta generation. It interacts with BACE1 and promotes its traffic from early endosomes to late endocytic compartments or the TGN. Moreover, GGA1 acts as a clathrin assembly protein with the ability to polymerize clathrin into tubules. GGA2, also called gamma-adaptin-related protein 2, Golgi-localized gamma ear-containing Arf-binding protein 2, or VHS domain and ear domain of gamma-adaptin (Vear), interacts with the acidic cluster-dileucine motif in the cytoplasmic tail of the cation-independent mannose 6-phosphate receptor (CI-MPR) and further plays a major role in the sorting of lysosomal enzymes. It also mediates a vital function that cannot be compensated for by GGA1 and/or GGA3. Both GGA1 and GGA2 have a multidomain structure consisting of an N-terminal VHS (Vps27/Hrs/Stam) domain, a GAT (GGA and TOM) domain, a largely unstructured hinge region that contains clathrin-binding motifs, and a C-terminal GAE (gamma-adaptin ear homology) domain. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101). 88
31921 410587 cd14240 GAT_GGA3 canonical GAT domain found in ADP-ribosylation factor-binding protein GGA3. GGA3, also called Golgi-localized gamma ear-containing Arf-binding protein 3, belongs to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGA3 interacts selectively with the Met/Hepatocyte Growth Factor receptor tyrosine kinase (RTK) when stimulated. It functions as a specific cargo adaptor to target the Met RTK into recycling tubules, and further coordinates the recycling, signaling and degradative fates of the Met RTK. Moreover, GGA3, together with PACS-1 and the protein kinase CK2, forms a complex that regulates cation-independent mannose-6-phosphate receptor (CI-MPR) trafficking. Furthermore, GGA3 has been identified as an interacting protein of the beta-site APP-cleaving enzyme-1 (BACE1), a stress-related protease that is involved in Alzheimer's disease (AD) pathology. GGA3 has a multidomain structure consisting of an N-terminal VHS (Vps27/Hrs/Stam) domain, a GAT (GGA and TOM) domain, a largely unstructured hinge region that contains clathrin-binding motifs, and a C-terminal GAE (gamma-adaptin ear homology) domain. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101). 87
31922 260131 cd14241 PAD Phenolic Acid Decarboxylase. This family of bacterial and fungal phenolic acid decarboxylases catalyzes the non-oxidative decarboxylation of phenolic acids to produce 4-vinyl derivatives. Phenolic acids, such as ferulic, p-coumaric, and caffeic acids, are important lignin-related aromatic acids and are natural constituents of plant cell walls. They act as crosslinkers between lignin polymers and hemicellulose/cellulose in plants. Their degradation is important from a biotechnological viewpoint. 144
31923 260109 cd14243 PT-AcyF_like Putative ABBA-type prenyltransferases acting on cyanobactins. Members of this family are found in gene clusters responsible for the production and posttranslational modification of cyanobactins, small ribosomal cyclic peptides produced by cyanobacteria. The AcyF_like proteins are structurally similar to the ABBA-type aromatic prenyltransferases, and may be responsible for the reverse- and forward-O-prenylation of tyrosine, serine, and threonine in cyanobactins. ABBA-type aromatic prenyltransferases (PTases) are a subgroup of prenyltransferases that are characterized by an unusual type of beta/alpha fold with antiparallel beta strands. They lack the (N/D)DxxD motif which is characteristic for many other prenyltransferases. 294
31924 271203 cd14244 GH_101_like Endo-a-N-acetylgalactosaminidase and related glycosyl hydrolases. This family contains the enzymatically active domain of cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins (EC:3.2.1.97). It has been classified as glycosyl hydrolase family 101 in the CAZy resource. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae and other commensal human bacteria is largely determined by their ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. 298
31925 271204 cd14245 DMP12 Putative DNA mimic protein DMP12. The Neisseria sp. protein DMP12 has been shown to interact with the bacterial histone-like protein HU and may do so by acting as a DNA mimic. It is likely to play a regulatory role, but not via direct competition for binding of HU, which is involved in maintenance of the bacterial nucleoid structure. 114
31926 271205 cd14246 ADAM17_MPD Membrane-proximal domain of a disintegrin and metalloprotease 17 (ADAM17). ADAM17 is a multi-domain protein that acts as a sheddase; is involved in the cleavage and release of the soluble ectodomain of tumor necrosis factor alpha from the cell surface and in the trans-Golgi network, as well as in the release of various other targets such as cytokines and cell adhesion molecules. This links ADAM17 to a variety of biological processes, including cellular differentiation and the progression of cancer. It was shown that the enzymatic activity of ADAM17 is regulated via a protein-disulfide isomerase (PDI). Specifically, the disulfide bridges within a CxxC motif of the membrane-proximal domain (MPD) are isomerized by PDI; the conversion triggers a conformational change between a closed and an opened form of the MPD, which may constitute a molecular switch that triggers the shedding activity of ADAM17. 60
31927 271206 cd14247 Lmo2686_like Uncharacterized hexameric protein conserved in Bacilli. This family conserved in bacilli contains proteins that form an unusual hexameric arrangement via circular domain-swapping of a beta-hairpin-beta unit. 138
31928 271207 cd14248 ESP Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP1 is a peptide pheromone found in male mouse tear fluid, which is recognized by a specific G-protein-coupled receptor in the vomeronasal sensory neurons and affects female mouse sexual receptive behaviour. This small family appears restricted to rodents. ESP36 is expressed only in the female mouse extraorbital lacrimal gland. The juvenile pheromone ESP22 is secreted from the lacrimal gland and released into the tears of 2-3 week old mice; it activates the vomeronasal response pathway, and inhibits male sexual behavior. Information regarding other members of this family is not yet available. 52
31929 271208 cd14249 ESP1_like Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP1 is a peptide pheromone found in male mouse tear fluid, which is recognized by a specific G-protein-coupled receptor in the vomeronasal sensory neurons and affects female mouse sexual receptive behaviour. This small family appears restricted to rodents; the functions of members other than mouse ESP1 have not yet been determined. 46
31930 271209 cd14250 ESP36_like Exocrine gland-secreting peptide 36 (ESP36) and similar pheromones. ESP36 is a peptide pheromone expressed only in the female mouse extraorbital lacrimal gland. This family also includes the juvenile pheromone ESP22 which is secreted from the lacrimal gland and released into the tears of 2-3 week old mice. ESP22 activates the vomeronasal response pathway, and inhibits male sexual behavior. This small family appears restricted to rodents; the functions of other members have not yet been determined. 55
31931 271210 cd14251 PL-6 Polysaccharide Lyase Family 6. Polysaccharide Lyase Family 6 is a family of beta-helical polysaccharide lyases. Members include alginate lyase (EC 4.2.2.3) and chondroitinase B (EC 4.2.2.19). Chondroitinase B is an enzyme that only cleaves the beta-(1,4)-linkage of dermatan sulfate (DS), leading to 4,5-unsaturated dermatan sulfate disaccharides as the product. DS is a highly sulfated, unbranched polysaccharide belonging to a family of glycosaminoglycans (GAGs) composed of alternating hexosamine (gluco- or galactosamine) and uronic acid (D-glucuronic or L-iduronic acid) moieties. DS contains alternating 1,4-beta-D-galactosamine (GalNAc) and 1,3-alpha-L-iduronic acid units. The related chondroitin sulfate (CS) contains alternating GalNAc and 1,3-beta-D-glucuronic acid units. Alginate lyases (known as either mannuronate (EC 4.2.2.3) or guluronate lyases (EC 4.2.2.11)) catalyze the degradation of alginate, a copolymer of alpha-L-guluronate and its C5 epimer beta-D-mannuronate. 369
31932 271211 cd14252 Dockerin_like Dockerin repeat domains and domains resembling dockerin repeats. Dockerins are modules in the cellulosome complex that often anchor catalytic subunits by binding to cohesin domains of scaffolding proteins. Three types of dockerins and their corresponding cohesin have been described in the literature. This alignment models two consecutive dockerin repeats, the functional unit. 57
31933 271212 cd14253 Dockerin Dockerin repeat domain. Dockerins are modules in the cellulosome complex that often anchor catalytic subunits by binding to cohesin domains of scaffolding proteins. Three types of dockerins and their corresponding cohesin have been described in the literature. This alignment models two consecutive dockerin repeats, the functional unit. 56
31934 271213 cd14254 Dockerin_II Type II dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type II dockerins, which are responsible for mediating attachment of the cellulosome complex to the bacterial cell wall. 54
31935 271214 cd14255 Dockerin_III Type III dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents the atypical type III dockerins and related domains. 65
31936 271215 cd14256 Dockerin_I Type I dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I dockerins, which are responsible for anchoring a variety of enzymatic domains to the complex. 57
31937 271221 cd14257 CttA_X X module of the carbohydrate-binding protein CttA and similar proteins. This model represents a putative carbohydrate-binding domain conserved in Ruminococcus, which sits N-terminal to a dockerin repeat; the protein may be a component of the Ruminococcus cellulosome system. This X module does not share similarities with other known X modules from cellulolytic bacteria and may have a structural role. 116
31938 271222 cd14259 PUFD_like PCGF Ub-like fold discriminator and related domains. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor (BCOR) is a transcriptional repressor required for germinal center formation and is possibly involved in apoptosis. 106
31939 271223 cd14260 PUFD_like_1 PCGF Ub-like fold discriminator of BCOR-like 1. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor-like protein 1 (BCoR-L1) is largely uncharacterized; it contains ankyrin repeats. 115
31940 271224 cd14261 PUFD PCGF Ub-like fold discriminator of BCOR. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor (BCOR) is a transcriptional repressor required for germinal center formation and is possibly involved in apoptosis. 117
31941 271354 cd14262 VirB5_like VirB5 protein family. This family contains VirB5 domains, including TraC, a VirB5 homolog encoded by the pKM101 plasmid, and similar proteins. VirB5 is one of 11 conserved proteins (VirB1-VirB11) in Agrobacterium tumefaciens, the causative agent of crown gall disease, that span the inner and the outer membrane, and is involved in type IV DNA secretion systems (T4SS) which mediate the translocation of virulence factors (proteins and/or DNA) from Gram-negative bacteria into eukaryotic cells. VirB5 assembles extracellular pili by interacting with several essential proteins. VirB2-VirB5 complex formation precedes incorporation into pili; it depends on the inner membrane protein VirB4 to interact directly with and stabilize VirB8 in order for VirB5 to bind to VirB8 and VirB10. Mutagenesis studies show that VirB5 proteins participate in protein-protein interactions important for pilus assembly and function. 173
31942 260132 cd14263 DAGK_IM_like Integral membrane diacylglycerol kinase and similar enzymes. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of substrates such as diacylglycerol to phosphatidic acid or of undecaprenol to undecaprenyl phosphate. They are not related to other cytosolic or membrane-associated kinases, including the eukaryotic diacylglycerol kinases. 106
31943 260133 cd14264 DAGK_IM Integral membrane diacylglycerol kinase. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of diacylglycerol to phosphatidic acid. Escherichia coli DAGK participates in the membrane-derived oligosaccharide cycle (MDO cycle) by recycling lipids to restore phosphatidylglycerols that were used up in the biosynthesis of MDOs. DAGK also recycles diacylglycerols that are produced during the biosynthesis of lipopolysaccharides (LPS) back to phospholipids. DAGK is not the main source of phosphatidic acid in de-novo biosynthesis of glycerophospholipids. Escherichia coli DAGK has low activity as an undecaprenol kinase. 109
31944 260134 cd14265 UDPK_IM_like Integral membrane undecaprenol kinase and similar enzymes. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. C55-isoprenyl (undecaprenyl) pyrophosphate acts as a scaffold for the assembly of peptidoglycan components; undecaprenol kinase (UDPK) is involved in recycling undecaprenyl units for re-use in the peptidoglycan biosynthesis. UDPK does not participate in the de-novo biosynthesis of undecaprenyl phosphate. Gram-positive bacteria have a large pool of free undecaprenol, in contrast to gram-negative bacteria. UDPK may also play a role in a stress-induced pathway that affects the function of ribosomes. In Streptococcus mutans, UDPK has been shown to be required for biofilm formation, such as in the case of smooth surface dental caries. Members of the UDPK family have low activity as diacylglycerol kinases (DAGK), and many of them are annotated as DAGKs. 106
31945 260135 cd14266 UDPK_IM_PAP2_like Integral membrane undecaprenol kinase domain co-occurring with type 2 phosphatidic acid phosphatase-like domains. This bacterial family of homo-trimeric integral membrane enzyme domains catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. They sit N-terminally to phosphatase domains that are members of the type 2 phosphatidic acid phosphatase superfamily, and the function of members of this domain architecture was determined to be undecaprenyl pyrophosphate phosphatases. The bi-functional enzymes might generate undecaprenyl phosphate via two mechanisms - the phosphorylation of undecaprenol or the cleavage of the terminal phosphate group of undecaprenyl pyrophosphate. 106
31946 341312 cd14267 Rif1_CTD_C-II_like Saccharomyces cerevisiae Rap1-interacting factor 1 CTD domain, metazoan Rif1 C-II domains and related domains. This model includes Saccharomyces cerevisiae Rif1_CTD (carboxy-terminal domain) and metazoan Rif1 C-II (C-terminal subdomain II). Rif1 was originally identified in S. cerevisiae where it negatively regulates telomere length homeostasis via interaction with the C-terminal domain of Rap1. A protective capping structure (telosome) comprised of Rap1, Rif1, and Rif2, inhibits telomerase, counteracts SIR-mediated transcriptional silencing, and prevents inadvertent recognition of telomeres as DNA double-strand breaks (DSBs). S. cerevisiae Rif1 has two Rap1 binding sites: the Rap1-binding module (RBM), and the CTD domain. The latter, represented here, has a lower Rap1 affinity, and provides trans binding through tetramerization. In mammals, Rif1 has been implicated in various cellular processes including pluripotency of stem cells, breast cancer development, and DSB repair pathway choice. A mutual antagonism between the nonhomologous end joining factors (53BP1-RIF1) and the homologous recombination factors (BRCA1-CtIP) ensures correct repair pathway choice. 46
31947 270456 cd14270 UBA UBA domain found in proteins involved in ubiquitin-mediated proteolysis. The ubiquitin-associated (UBA) domains are commonly occurring sequence motifs found in proteins involved in ubiquitin-mediated proteolysis. They contribute to ubiquitin (Ub) binding or ubiquitin-like (UbL) domain binding. However, some kinds of UBA domains can bind only the UbL domain, but not the Ub domain. UBA domains are normally comprised of compact three-helix bundles which contain a conserved GF/Y-loop. They can bind polyubiquitin with high affinity. They also bind monoubiquitin and other proteins. Most UBA domain-containing proteins have one UBA domain, but some harbor two or three UBA domains. 30
31948 270457 cd14271 UBA_YLR419W_like UBA domain found in Saccharomyces cerevisiae putative ATP-dependent RNA helicase YLR419W and similar proteins. The group includes some uncharacterized hypothetical proteins which show a high level of sequence similarity with Saccharomyces cerevisiae putative ATP-dependent RNA helicase YLR419W. All family members contain a ubiquitin-associated (UBA) domain, RWD domain, DEAD-box (DEXDc), helicase superfamily c-terminal domain (HELICc), Helicase associated domain (HA2), and a C-terminal oligonucleotide/oligosaccharide-binding (OB)-fold. 41
31949 270458 cd14272 UBA_AMPK-RKs UBA domain of AMPK related kinases. The AMPK-RK family comprises AMP-activated protein kinases (AMPKs), MAP/microtubule affinity-regulating kinases (MARKs), Brain-specific kinases (BRSKs), Salt inducible kinases (SIKs), maternal embryonic leucine zipper kinase (MELK), and SNF-related serine/threonine-protein kinase (SNRK). It is the only kinase family in the human genome containing a ubiquitin-associated (UBA) or UBA-like domain which is located immediately C-terminal to their N-terminal protein kinase catalytic domain. In addition, most family members contain a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), but some lack this region. AMPK-RKs play central roles in metabolic control, energy homeostasis and stress responses in eukaryotes. They require phosphorylation by liver kinase B1 (LKB1) for full activity. Normally, AMPK-RKs appear to exist as heterotrimeric complexes consisting of a catalytic alpha-subunit and regulatory beta- and gamma-subunits. This model corresponds to the catalytic subunit. The UBA domain, previously called SNF1 homology (SNH) domain, regulates the conformation, LKB1-mediated phosphorylation and activation, and localization of the AMPK-RKs, but does not interact with ubiquitin-like molecules. In AMPKalpha subunits, the UBA-like autoinhibitory domain (AID) is responsible for AMPKalpha subunit autoinhibition. Because they lack a UBA domain, NUAK1 kinase, also called ARK5 (AMPK-related kinase 5), and NUAK2 kinase, also called SNARK (SNF1/AMPK-related kinase), are not included in this family. 38
31950 270459 cd14273 UBA_TAP-C_like UBA-like domain found in the NXF family of mRNA nuclear export factors and similar proteins. This family includes nuclear RNA export factors (NXF1/NXF2), FAS-associated factors (FAF1/2), tyrosyl-DNA phosphodiesterase 2 (TDP2), OTU domain-containing proteins (OTU7A/OTU7B), NSFL1 cofactor p47, defective in cullin neddylation protein 1 (DCN1)-like protein (DCNL1/DCNL2), yeast defective in cullin neddylation protein 1 (DCN1) and similar proteins. NXF proteins can stimulate nuclear export of mRNAs and facilitate the export of unspliced viral mRNA containing the constitutive transport element. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not to a Fas mutant that is deficient in signal transduction. FAF2 is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. Its biological function remains unclear. TDP2 is a 5'-Tyr-DNA phosphodiesterase required for the efficient repair of topoisomerase II-induced DNA double strand breaks. OTU7A and OTU7B are zinc finger proteins that function as deubiquitinating enzymes. p47 is a major cofactor of the cytosolic AAA ATPase p97. It is required for the p97-regulated membrane reassembly of the endoplasmic reticulum (ER), the nuclear envelope and the Golgi apparatus. DCNL1 plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. The biological function of DCNL2 remains unclear. Yeast DCN1 is a scaffold-type E3 ligase for cullin neddylation. It can bind directly to cullins and the ubiquitin-like protein Nedd8-specific E2 (Ubc12), and regulate cullin neddylation and thus display ubiquitin ligase activity. 31
31951 270460 cd14274 UBA_ACK1 UBA domain found in activated Cdc42 kinase 1 (ACK1) and similar proteins. ACK1, also called tyrosine kinase non-receptor protein 2, is an intracellular non-receptor tyrosine kinase that specifically interacts with Cdc42 and acts as a Cdc42 effector. It forms a signaling complex with Cdc42, p130(Cas), and Crk, and mediates Cdc42-dependent cell migration and signaling to p130(Cas). Ack1 also stimulates prostate tumorigenesis in part by inhibiting the proapoptotic tumor suppressor WW domain containing oxidoreductase (Wwox). Moreover, ACK1 associates directly with the heavy chain of clathrin and further participates in trafficking, underlying an ability to increase receptor-mediated transferrin uptake. It may function as a regulator of the guanine nucleotide exchange factor Dbl that can activate Rho family proteins. ACK1 consists of an N-terminal tyrosine kinase catalytic domain followed by an SH3 domain, a Cdc42/Rac interactive binding (CRIB) domain, a proline-rich region, and a C-terminal ubiquitin-association (UBA) domain. The proline-rich region of ACK1 is responsible for the binding to the adaptor proteins Nck, Grb2, sorting nexin protein 9 (SH3PX1), and Hck. 45
31952 270461 cd14275 UBA_EF-Ts UBA domain found in elongation factor Ts (EF-Ts) from bacteria, chloroplasts and mitochondria of eukaryotes. EF-Ts functions as a nucleotide exchange factor in the functional cycle of EF-Tu, another translation elongation factor that facilitates the binding of aminoacylated transfer RNAs (aminoacyl-tRNA) to the ribosomal A site as a ternary complex with guanosine triphosphate during the elongation cycle of protein biosynthesis, and then catalyzes the hydrolysis of GTP and releases itself in the GDP-bound form. EF-Ts forms a complex with EF-Tu and catalyzes the nucleotide exchange reaction promoting the formation of EF-Tu in GTP-bound form from EF-Tu in GDP-bound form. EF-Ts from Thermus thermophilus is shorter than EF-Ts from Escherichia coli, but it has higher thermostability. The translational EF-Ts proteins from chloroplasts and mitochondria display high similarity to the bacterial EF-Ts. The majority of family members contain one ubiquitin-associated (UBA) domain, but some family members from plants harbor two tandem UBA domains. 37
31953 270462 cd14276 UBA_UBP25_like UBA domain found in ubiquitin carboxyl-terminal hydrolase UBP25, UBP28, and similar proteins. UBP25, also called deubiquitinating enzyme 25, USP on chromosome 21, ubiquitin thioesterase 25, or ubiquitin-specific-processing protease 25, belongs to the deubiquitinating enzyme (DUB) family that specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. USP25 has one muscular isoform and two ubiquitous isoforms. The longer muscular isoform can bind to muscle-restricted cytoskeletal and sarcomeric proteins, such as myosin binding protein C1 (MyBPC1), actin alpha-1 (ACTA1) and filamin C (FLNC), and further prevent their degradation. USP25 harbors three potential ubiquitin-binding domains (UBDs), one ubiquitin-associated (UBA) domain and two ubiquitin-interacting motifs (UIMs) in the N-terminal region. Its C-terminal tyrosine-rich region is responsible for the binding of the second SH2 domain of SYK, a non-receptor tyrosine kinase that specifically phosphorylates USP25 and alters its cellular levels. UBP28, also called deubiquitinating enzyme 28, ubiquitin thioesterase 28, or ubiquitin-specific-processing protease 28, is also an ubiquitin-specific protease belonging to the DUB family. UBP28 can form a ternary complex with nucleoplasmic Fbw7alpha, an F-box protein that is part of an SCF-type ubiquitin ligase, and MYC, a transcription factor encoded by MYC proto-oncogene. UBP28 is required for the stability of MYC, and this stabilization is necessary for tumour-cell proliferation. Besides, UBP28 plays a critical role in the regulation of the Chk2-p53-PUMA pathway. It specifically interacts with 53BP1 and is essential to stabilize Chk2 and 53BP1 in response to DNA damage. 38
31954 270463 cd14277 UBA_UBP2_like UBA domain found in ubiquitin-associated protein 2 (UBAP-2) like proteins. The family contains some uncharacterized ubiquitin-associated proteins, including UBAP-2 and its homolog, UBAP2-like [UBP2L, also called protein NICE-4 (for newly identified cDNA from the epidermal differentiation complex EDC)], both of which contain an N-terminal ubiquitin-associated (UBA) domain along with a highly conserved domain of unknown function (DUF3697). 38
31955 270464 cd14278 UBA_NAC_like UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and similar proteins. The family contains nascent polypeptide-associated complex subunit alpha (NACA), putative NACA-like protein (NACP1), nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD), and similar proteins found in archaea and bacteria. NACA, also called NAC-alpha or Alpha-NAC, together with BTF3, also called Beta-NAC, form the nascent polypeptide-associated complex (NAC) which is a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. The biological function of NACP1 (also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1) and NACAD remain unclear. The family also includes huntingtin-interacting protein K (HYPK), also called Huntingtin yeast partner K or Huntingtin yeast two-hybrid protein K. It is an intrinsically unstructured Huntingtin (HTT)-interacting protein with chaperone-like activity. It may be involved in regulating cell growth, cell cycle, unfolded protein response and cell death. All members in this family contain an ubiquitin-associated (UBA) domain. 37
31956 270465 cd14279 CUE CUE domain found in ubiquitin-binding CUE proteins. This family includes many coupling of ubiquitin conjugation to endoplasmic reticulum degradation (CUE) domain containing proteins that are characterized by an FP and a di-leucine-like sequence and bind to monoubiquitin with varying affinities. Some higher eukaryotic CUE domain proteins do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains. CUE domains form three-helix bundle structures and are distantly related to the ubiquitin-associated (UBA) domains which are widely occurring ubiquitin-binding motifs found in a broad range of cellular proteins in species ranging from yeast to human. The majority of family members contain one CUE domain, but some family members from fungi harbor two CUE domains. 38
31957 270466 cd14280 UBA1_Rad23_like UBA1 domain of Rad23 proteins found in eukaryotes. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA1 domain. 39
31958 270467 cd14281 UBA2_Rad23_like UBA2 domain of Rad23 proteins found in eukaryotes. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA2 domain. 38
31959 270468 cd14282 UBA_TDRD3 UBA domain of Tudor domain-containing protein 3 (TDRD3) and similar proteins. TDRD3 is a modular protein containing Tudor domain, a DUF/OB-fold motif and a ubiquitin-associated (UBA) domain. It shows both nucleic acid- and methyl-binding properties and can interact with methylated RNA-binding proteins, such as fragile X mental retardation protein (FMRP) and DEAD/H box-3 (also known as DDX3X/Y, DBX/Y, HLP2 and DDX14) which is implicated in human genetic diseases. At this point, TDRD3 may play a central role in RNA processing regulatory pathways involving arginine methylation. TDRD3 localizes predominantly to the cytoplasm stress granules (SGs). The Tudor domain is essential and sufficient for its recruitment to SGs. 39
31960 270469 cd14283 UBA_TNR6C UBA domain found in trinucleotide repeat-containing gene 6C protein (TNRC6C) and similar proteins. TNRC6C is one of three GW182 paralogs in mammalian genomes. It is enriched in P-bodies and important for efficient miRNA-mediated repression. TNRC6C is composed of an N-terminal glycine/tryptophan (G/W)-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as the silencing domain, which triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half containing the RRM domain functions as a key effector domain mediating protein synthesis repression by TNRC6C. 38
31961 270470 cd14284 UBA_GAWKY UBA domain found in Drosophila melanogaster protein Gawky (GW) and similar proteins. GW is the D. melanogaster GW182 homolog (dGW182) which belongs to the GW182 protein family. The GW182 proteins directly interact with Argonaute (Ago) proteins, and thus function as downstream effectors in the miRNA pathway, responsible for inhibition of translation and acceleration of mRNA decay. They are characterized by an abnormally high content of glycine/tryptophan (G/W) repeats, one or more glutamine (Q)-rich motifs, and a C-terminal RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The GW182 proteins are recruited to miRNA targets through an interaction between their N-terminal domain and an Argonaute protein. Then they promote translational repression and/or degradation of miRNA targets through their C-terminal silencing domain. In addition to a G/W repeats region, a Q-rich region, and a RRM domain, GW also contains a ubiquitin-associated domain (UBA). 35
31962 270471 cd14285 UBA_scEDE1_like UBA domain found in Saccharomyces cerevisiae EH domain-containing and endocytosis protein 1 (Ede1) and similar proteins. Ede1, also called bud site selection protein 15, is the yeast homolog of the mammalian protein Eps15 and functions at the internalization step of endocytosis. Both Ede1 and Eps15 are endocytic scaffold proteins that may be involved in stabilization of the adaptor-cargo complex. They both contain three N-terminal Eps15 homology (EH) domains and C-terminal ubiquitin-binding motifs. Whereas Eps15 has two ubiquitin interacting motifs (UIM), Ede1 harbors a single ubiquitin-associated (UBA) domain. This model corresponds to the Ede1 UBA domain that is responsible for the binding of monoubiquitinated proteins and negatively regulates EH domain-mediated protein-protein interactions. 35
31963 270472 cd14286 UBA_UBP24 UBA domain found in ubiquitin carboxyl-terminal hydrolase 24 (UBP24) and similar proteins. UBP24, also called deubiquitinating enzyme 24, ubiquitin thioesterase 24, or ubiquitin-specific-processing protease 24, is a deubiquitinating protein that interacts with damage-specific DNA-binding protein 2 (DDB2) and regulates DDB2 stability. It may also play a role in the pathogenesis of Parkinson's disease (PD). UBP24 proteins contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal peptidase C19 domain. 37
31964 270473 cd14287 UBA_At3g58460_like UBA domain found in uncharacterized protein At3g58460 from Arabidopsis thaliana and its homologs from other plants. The uncharacterized protein At3g58460 from Arabidopsis thaliana is also known as rhomboid-like protein 15 which is encoded by RBL15 gene. Although the biological function of the family members remains unclear, they all contain an N-terminal rhomboid-like domain and a C-terminal ubiquitin-associated (UBA) domain. 36
31965 270474 cd14288 UBA_HUWE1 UBA domain found in eukaryotic E3 ubiquitin-protein ligase HUWE1 and similar proteins. HUWE1, also called ARF-binding protein 1 (ARF-BP1), HECT, UBA and WWE domain-containing protein 1, homologous to E6AP carboxyl terminus homologs protein 9 (HectH9), large structure of UREB1 (LASU1), Mcl-1 ubiquitin ligase E3 (Mule), upstream regulatory element-binding protein 1 (URE-B1), or URE-binding protein 1, may function as a ubiquitin-protein ligase that is involved in the ubiquitination cascade that targets specific substrate proteins in proteolysis. It can ubiquitylate DNA polymerase beta (Pol beta), the major BER DNA polymerase and modulates base excision repair (BER). HUWE1 also acts as a critical mediator of both the p53-independent and p53-dependent tumor suppressor functions of ARF tumor suppressor in p53 regulation. Moreover, HUWE1 is both required and sufficient for the polyubiquitination of Mcl-1, an anti-apoptotic Bcl-2 family member involved in DNA damage-induced apoptosis. Furthermore, HUWE1 plays an important role in the regulation of Cdc6 stability after DNA damage. In addition, HUWE1 works as a partner of N-Myc oncoprotein in neural cells. It ubiquitinates N-Myc and primes it for proteasomal-mediated degradation. HUWE1 contains a ubiquitin-associated (UBA) domain, a WWE domain, and a Bcl-2 homology region 3 (BH3) domain at the N-terminus and a HECT domain at the C-terminus. WWE domain plays a role in the regulation of specific protein-protein interactions in a ubiquitin conjugation system. BH3 domain is responsible for the specific binding to Mcl-1. HECT domain is involved in the inhibition of the transcriptional activity of p53 via a ubiquitin-dependent degradation pathway. It also controls neural differentiation and proliferation by destabilizing the N-Myc oncoprotein. 40
31966 270475 cd14289 UBA_RHBD3 UBA domain found in vertebrate rhomboid domain-containing protein 3 (RHBD3). RHBD3 is encoded by a novel chromosome 22 CpG island-associated gene (C22orf3) that is not expressed in a significant proportion of pituitary tumors. C22orf3 is also called pituitary tumor apoptosis gene (PTAG) or RHBDD3 which is located directly upstream of EWSR1. Although its biological function remains unclear, RHBD3 contains an N-terminal rhomboid domain and a C-terminal ubiquitin-associated (UBA) domain. 44
31967 270476 cd14290 UBA_PUB_plant UBA domain found in plant PNGase/UBA or UBX (PUB) domain-containing proteins. This family includes some uncharacterized hypothetical proteins found in plants. Although their biological functions remain unclear, all family members contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal PUB domain. UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover. PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. 49
31968 270477 cd14291 UBA1_NUB1_like UBA1 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA1 domain. 36
31969 270478 cd14292 UBA2_NUB1 UBA2 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA2 domain. 35
31970 270479 cd14293 UBA3_NUB1 UBA3 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA3 domain. 36
31971 270480 cd14294 UBA1_UBP5_like UBA1 domain found in ubiquitin carboxyl-terminal hydrolase UBP5, UBP13 and similar proteins. UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5. It has similar domain architecture, but functions differently from USP5 in cellular deubiquitination processes. It exhibits a weak deubiquitinating activity with a preference for Lys63-linked polyubiquitin in a non-activation manner. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA1 domain. 44
31972 270481 cd14295 UBA1_atUBP14 UBA1 domain found in Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and similar proteins. atUBP14, also called deubiquitinating enzyme 14, TITAN-6 protein, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is related to the isopeptidase T class of deubiquitinating enzymes that recycle polyubiquitin chains following protein degradation. atUBP14 is essential for early plant development. It can disassemble multi-ubiquitin chains linked internally via epsilon-amino isopeptide bonds using Lys48 and can process some, but not all, translational fusions of ubiquitin linked via alpha-amino peptide bonds. atUBP14 contains two ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain. 45
31973 270482 cd14296 UBA1_scUBP14_like UBA1 domain found in Saccharomyces cerevisiae ubiquitin carboxyl-terminal hydrolase 14 (scUBP14) and similar proteins. scUBP14, also called deubiquitinating enzyme 14, glucose-induced degradation protein 6, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is the yeast ortholog of human Isopeptidase T (USP5), a deubiquitinating enzyme known to bind the 29-linked polyubiquitin chains. scUBP14 has been identified as a K29-linked polyubiquitin binding protein as well. It is involved in K29-linked polyubiquitin metabolism by binding to the 29-linked Ub4 resin and serving as an internal positive control in budding yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain. 39
31974 270483 cd14297 UBA2_spUBP14_like UBA2 domain found in Schizosaccharomyces pombe ubiquitin carboxyl-terminal hydrolase 14 (spUBP14) and similar proteins. spUBP14, also called deubiquitinating enzyme 14, UBA domain-containing protein 2, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, functions as a deubiquitinating enzyme that is involved in protein degradation in fission yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain. 39
31975 270484 cd14298 UBA2_scUBP14_like UBA2 domain found in Saccharomyces cerevisiae ubiquitin carboxyl-terminal hydrolase 14 (scUBP14) and similar proteins. scUBP14, also called deubiquitinating enzyme 14, glucose-induced degradation protein 6, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is the yeast ortholog of human Isopeptidase T (USP5), a deubiquitinating enzyme known to bind the 29-linked polyubiquitin chains. scUBP14 has been identified as a K29-linked polyubiquitin binding protein as well. It is involved in K29-linked polyubiquitin metabolism by binding to the 29-linked Ub4 resin and serving as an internal positive control in budding yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain. 38
31976 270485 cd14300 UBA_UBS3A_like UBA domain found in ubiquitin-associated and SH3 domain-containing protein A (UBS3A) and similar proteins. UBS3A, also called Cbl-interacting protein 4 (CLIP4), suppressor of T-cell receptor signaling 2 (Sts-2), or T-cell ubiquitin ligand 1 (TULA-1), is a lymphoid protein only detected in thymus, spleen, and bone marrow. UBS3A exhibits extremely low phosphatase activity, but is capable of promoting T-cell apoptosis independent of either T cell receptor (TCR)/CD3-mediated signaling or caspase activity. It functions as a negative regulator of TCR signaling. UBS3A can also inhibit HIV-1 biogenesis through the binding of ATP-binding cassette protein family E member 1 (ABCE-1), a host factor of HIV-1 assembly. Moreover, UBS3A acts as the Cbl- and ubiquitin-interacting protein that can inhibit endocytosis and downregulation of ligand-activated epidermal growth factor receptor (EGFR) by impairing Cbl-induced ubiquitination, as well as inhibit clathrin-dependent endocytosis in general. This family also includes Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and some uncharacterized AAA-type ATPase-like proteins found in plants. 37
31977 270486 cd14301 UBA_UBS3B UBA domain found in ubiquitin-associated and SH3 domain-containing protein B (UBS3B) and similar proteins. UBS3B, or Cbl-interacting protein p70, suppressor of T-cell receptor signaling 1 (Sts-1), T-cell ubiquitin ligand 2 (TULA-2), or tyrosine-protein phosphatase STS1/TULA2, is ubiquitously expressed in mammalian tissues in a variety of cell types. It exhibits high phosphatase activity, but demonstrates no proapoptotic activity. It negatively regulates the tyrosine kinase Zap-70 activation and T cell receptor (TCR) signaling pathways that modulate T cell activation. Moreover, UBS3B acts as a Cbl- and ubiquitin-interacting protein that inhibits endocytosis of epidermal growth factor receptor (EGFR) and platelet-derived growth factor receptor. 38
31978 270487 cd14302 UBA_UBXN1 UBA domain found in UBX domain-containing protein 1 (UBXN1) and similar proteins. UBXN1, also called SAPK substrate protein 1 (SAKS1) or UBA/UBX 33.3 kDa protein, is a widely expressed protein containing an N-terminal ubiquitin-associated (UBA) domain, a coiled-coil region, and a C-terminal ubiquitin-like (UBX) domain. It binds polyubiquitin and valosin-containing protein (VCP), and has been identified as a substrate for stress-activated protein kinases (SAPKs). Moreover, UBXN1 specifically binds to Homer2b. It may also interact with ubiquitin (Ub) and may be involved in the Ub-proteasome proteolytic pathways. In addition, UBXN1 can associate with autoubiquitinated BRCA1 tumor suppressor and inhibit its enzymatic function through its UBA domain. 41
31979 270488 cd14303 UBA1_KPC2 UBA1 domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also called ubiquitin-associated domain-containing protein 1 (UBAC1), or glialblastoma cell differentiation-related protein 1, is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (UBL) domain and two ubiquitin-associated (UBA) domains. This model corresponds to the UBA1 domain. 41
31980 270489 cd14304 UBA2_KPC2 UBA2 domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also called ubiquitin-associated domain-containing protein 1 (UBAC1), or glialblastoma cell differentiation-related protein 1, is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (UBL) domain and two ubiquitin-associated (UBA) domains. This model corresponds to the UBA2 domain. 39
31981 270490 cd14305 UBA_UBAC2 UBA domain found in ubiquitin-associated domain-containing protein 2 (UBAC2) and similar proteins. UBAC2, also called phosphoglycerate dehydrogenase-like protein 1, is a ubiquitin-associated domain (UBA)-domain containing protein encoded by gene UBAC2 (or PHGDHL1), a risk gene for Behcet's disease (BD). It may play an important role in the development of BD through its transcriptional modulation. Members in this family contain an N-terminal rhomboid-like domain and a C-terminal UBA domain. 38
31982 270491 cd14306 UBA_VP13D UBA domain found in vacuolar protein sorting-associated protein 13D (VP13D) and similar proteins. VP13D is a chorea-acanthocytosis (CHAC)-similar protein encoded by gene VPS13D. It contains two putative domains, ubiquitin-associated (UBA) domain and lectin domain of ricin B chain profile (ricin-B-lectin), suggesting it may interact with, and be involved in the trafficking of, proteins modified with ubiquitin and/or carbohydrate molecules. Further investigation is required. 36
31983 270492 cd14307 UBA_RUP1p UBA domain found in yeast UBA domain-containing protein RUP1p and similar proteins. RUP1p is a ubiquitin-associated (UBA) domain-containing protein encoded by a nonessential yeast gene RUP1. It can mediate the association of Rsp5 and Ubp2. The N-terminal UBA domain is responsible for antagonizing Rsp5 function, as well as bridging the Rsp5-Ubp2 interaction. No other characterized functional domains or motifs are found in RUP1p. 38
31984 270493 cd14308 UBA_Mud1_like UBA domain found in Schizosaccharomyces pombe UBA domain-containing protein mud1 and similar proteins. Schizosaccharomyces pombe mud1 is an ortholog of the Saccharomyces cerevisiae DNA-damage response protein Ddi1. S. cerevisiae Ddi1, also called v-SNARE-master 1 (Vsm1), belongs to a family of proteins known as the ubiquitin receptors which can bind ubiquitinated substrates and the proteasome. It is involved in the degradation of the F-box protein Ufo1, involved in the G1/S transition. It also participates in Mec1-mediated degradation of Ho endonuclease. Both S. pombe mud1 and S. cerevisiae Ddi1 contain an N-terminal ubiquitin-like (UBL) domain, an aspartyl protease-like domain, and a C-terminal ubiquitin-associated (UBA) domain. S. pombe mud1 binds to K48-linked polyubiquitin (polyUb) through UBA domain. 36
31985 270494 cd14309 UBA_scDdi1_like UBA domain found in Saccharomyces cerevisiae DNA-damage response protein Ddi1 and similar proteins. Ddi1, also called v-SNARE-master 1 (Vsm1), is a ubiquitin receptor involved in regulation of the cell cycle and late secretory pathway in Saccharomyces cerevisiae. It functions as a ubiquitin association domain (UBA)- ubiquitin-like-domain (UBL) shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. For instance, Ddi1 plays an essential role in the final stages of proteasomal degradation of Ho endonuclease and of its cognate FBP, Ufo1. Moreover, Ddi1 and its associated protein Rad23p play a cooperative role as negative regulators in yeast PHO pathway. Ddi1 contains an N-terminal UBL domain and a C-terminal UBA domain. It also harbors a central retroviral aspartyl-protease-like domain (RVP) which may be important in cell-cycle control. At this point, Ddi1 may function proteolytically during regulated protein turnover in the cell. This family also includes mammalian regulatory solute carrier protein family 1 member 1 (RSC1A1), also called transporter regulator RS1 (RS1) which mediates transcriptional and post-transcriptional regulation of Na(+)-D-glucose cotransporter SGLT1. 36
31986 270495 cd14310 UBA_cnDdi1_like UBA domain found in Cryptococcus neoformans DNA-damage response protein Ddi1 and similar proteins. The family includes some uncharacterized Ddi and similar proteins which show a high level of sequence similarity with yeast Ddi1. Ddi1, also called v-SNARE-master 1 (Vsm1), is a ubiquitin receptor involved in regulation of the cell cycle and late secretory pathway in yeast. It functions as a ubiquitin association domain (UBA)- ubiquitin-like-domain (UBL) shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. Ddi1 contains an N-terminal UBL domain and a C-terminal UBA domain. It also harbors a central retroviral aspartyl-protease-like domain (RVP) which may be important in cell-cycle control. At this point, Ddi1 may function proteolytically during regulated protein turnover in the cell. 30
31987 270496 cd14311 UBA_II_E2_UBC1 UBA domain of yeast ubiquitin-conjugating enzyme E2 1 (UBC1) and similar proteins. UBC1, also called ubiquitin-conjugating enzyme E2-24 kDa, or ubiquitin-protein ligase, is the yeast homolog of mammalian ubiquitin-conjugating enzyme E2 K (UBE2K or E2-25K). UBC1 and UBE2K are unique class II E2 conjugating enzymes, both of which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. The yeast UBC1 plays an important role in the degradation of short-lived proteins especially during the G0-G1 transition accompanying spore germination. 48
31988 270497 cd14312 UBA_II_E2_UBC27_like UBA domain found in plant ubiquitin-conjugating enzyme E2 27 and similar proteins. UBC27, also called ubiquitin carrier protein 27, functions as a class II ubiquitin-conjugating (UBC) enzyme (E2). E2, together with E1 (ubiquitin-activating enzyme UBA) and E3 (ubiquitin ligase), is required in the multi-step reaction of ubiquitin conjugation. Unlike other Arabidopsis UBCs, in addition to an N-terminal ubiquitin-conjugating enzyme E2 catalytic domain (UBCc), UBC27 has an additional C-terminal ubiquitin-associated domain (UBA). 36
31989 270498 cd14313 UBA_II_E2_UBE2K_like UBA domain found in vertebrate ubiquitin-conjugating enzyme E2 K (UBE2K), Drosophila melanogaster ubiquitin-conjugating enzyme E2-22 kDa (UbcD4) and similar proteins. UBE2K, also called Huntingtin-interacting protein 2 (HIP-2), ubiquitin carrier protein, ubiquitin-conjugating enzyme E2-25 kDa (E2-25K), or ubiquitin-protein ligase, is a multi-ubiquitinating enzyme with the ability to synthesize Lys48-linked polyubiquitin chains which is involved in the ubiquitin (Ub)-dependent proteolytic pathway. It interacts with the frameshift mutant of ubiquitin B and functions as a crucial factor regulating amyloid-beta neurotoxicity. It has also been characterized as Huntingtin-interacting protein that modulates the neurotoxicity of Amyloid-beta (Abeta), the principal protein involved in Alzheimer's disease pathogenesis. Moreover, E2-25K increases the aggregate formation of expanded polyglutamine proteins and polyglutamine-induced cell death in the pathology of polyglutamine diseases. UbcD4, also called ubiquitin carrier protein, or ubiquitin-protein ligase, is encoded by Drosophila E2 gene which is only expressed in pole cells in embryos. It is a putative E2 enzyme homologous to the Huntingtin interacting protein-2 (HIP2) of human. UbcD4 specifically interacts with the polyubiquitin-binding subunit of the proteasome. This family also includes a putative ubiquitin conjugating enzyme from Plasmodium yoelii (pyUCE). It shows a high level of sequence similarity with UBE2K and may also play a role in the ubiquitin-mediated protein degradation pathway. All family members are class II E2 conjugating enzymes which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. 36
31990 270499 cd14314 UBA_II_E2_pyUCE_like UBA domain found in a putative ubiquitin conjugating enzyme from Plasmodium yoelii (pyUCE) and similar proteins. P. yoelii ubiquitin-conjugating enzyme and other uncharacterized family members show high sequence similarity to the human Huntingtin interacting protein-2 (HIP2) which belongs to a class II E2 ubiquitin-conjugating enzyme family. These proteins may play roles in the ubiquitin-mediated protein degradation pathway. They all contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. 37
31991 270500 cd14315 UBA1_UBAP1 UBA1 domain found in vertebrate ubiquitin-associated protein 1 (UBAP-1). UBAP-1, also called nasopharyngeal carcinoma-associated gene 20 protein, is a ubiquitously expressed protein that may play an important role in the ubiquitin pathway and cell progression. It co-localizes with TDP-43 proteins in neuronal cytoplasmic inclusions and acts as a genetic risk factor for frontotemporal lobar degeneration (FTLD). Moreover, UBAP-1, together with VPS37A, forms an endosome-specific endosomal sorting complexes I required for transport (ESCRT-I) complex that displays a restricted cellular function, ubiquitin-dependent endosomal sorting and multivesicular body (MVB) biogenesis. UBAP-1 contains an N-terminal UBAP-1-MVB12-associated (UMA) domain, and two tandem ubiquitin-associated (UBA) domains that may be responsible for the binding of ubiquitin-conjugating enzymes. This model corresponds to UBA1 domain. 43
31992 270501 cd14316 UBA2_UBAP1_like UBA2 domain found in ubiquitin-associated protein 1 (UBAP-1) and similar proteins. UBAP-1, also called nasopharyngeal carcinoma-associated gene 20 protein, is a ubiquitously expressed protein that may play an important role in the ubiquitin pathway and cell progression. It co-localizes with TDP-43 proteins in neuronal cytoplasmic inclusions and acts as a genetic risk factor for frontotemporal lobar degeneration (FTLD). Moreover, UBAP-1, together with VPS37A, forms an endosome-specific endosomal sorting complexes I required for transport (ESCRT-I) complex that displays a restricted cellular function, ubiquitin-dependent endosomal sorting and multivesicular body (MVB) biogenesis. UBAP-1 contains an N-terminal UBAP-1-MVB12-associated (UMA) domain, and two tandem ubiquitin-associated (UBA) domains that may be responsible for the binding of ubiquitin-conjugating enzymes. This model corresponds to UBA2 domain. 37
31993 270502 cd14317 UBA_DHX57 UBA domain found in putative ATP-dependent RNA helicase DHX57 and similar proteins. DHX57, also called DEAH box protein 57, is a multi-domain protein with an N-terminal ubiquitin-association (UBA) domain, a Zinc finger domain, a RWD domain, a DEAD-like helicase domain and two C-terminal helicase associated domains. Although the precise biological function of DHX57 remains unclear, it may function as a putative ATP-dependent RNA helicase. 38
31994 270503 cd14318 UBA_Cbl_like UBA domain found in casitas B-lineage lymphoma (Cbl) proteins. The Cbl adaptor proteins family contains a small class of RING-type E3 ubiquitin ligases with oncogenic activity which is represented by three mammalian members, c-Cbl, Cbl-b and Cbl-3. Cbl proteins function as potent negative regulators of various signaling cascades in a wide range of cell types. They play roles in ubiquitinating the activated tyrosine kinases and targeting them for degradation. Cbl proteins in this family consists of a highly conserved N-terminal half that includes a tyrosine-kinase-binding (TKB) domain and a RING finger domain, both of which are required for Cbl-mediated downregulation of RTKs, and a C-terminal half that includes a ubiquitin-associated (UBA) domain and other protein interaction motifs. The UBA domain contains leucine/isoleucine repeats and may play a role in dimerization of Cbl proteins. In addition, although both c-Cbl and Cbl-b have the C-terminal UBA domain, only the UBA domain from Cbl-b can bind ubiquitin. 40
31995 270504 cd14319 UBA_NBR1 UBA domain of next to BRCA1 gene 1 protein (NBR1) and similar proteins. NBR1, also called cell migration-inducing gene 19 protein, membrane component chromosome 17 surface marker 2, neighbor of BRCA1 gene 1 protein, or protein 1A1-3B, is a scaffold protein that may be involved in signal transmission downstream of the serine/protein kinase from the giant muscle protein titin. Moreover, NBR1 functions as an autophagic receptor for ubiquitinated cargo. It interacts with ATG8-family proteins for its degradation by autophagy. NBR1 contains an N-terminal Phox and Bem1p (PB1) domain that plays a critical role in mediating protein-protein interactions with both titin kinase and with another scaffold protein, p62. NBR1 also has a LC3-interaction region (LIR) and a ubiquitin-associated (UBA) domain. The LIR is required for the autophagic clearance of NBR1. UBA domain is responsible for the ubiquitin binding which is necessary for the puromycin-induced formation of ubiquitinated protein aggregates. 39
31996 270505 cd14320 UBA_SQSTM UBA domain of sequestosome-1 (SQSTM) and similar proteins. SQSTM, also called EBI3-associated protein of 60 kDa (EBIAP /p60), phosphotyrosine-independent ligand for the Lck SH2 domain of 62 kDa, or ubiquitin-binding protein p62, is a widely expressed multifunctional cytoplasmic protein that is able to noncovalently bind ubiquitin and several signaling proteins, suggesting a regulatory role connected to the ubiquitin-proteasome pathway. It functions as a scaffolding protein that regulates a diverse range of signaling pathways leading to activation of the nuclear factor kappa B (NF-kappaB) family of transcription factors. It also plays a novel role in connecting receptor signals with the endosomal signaling network required for mediating TrkA-induced differentiation. SQSTM contains a PB1 dimerization domain, a tumor necrosis factor receptor-associated factor 6 (TRAF6) binding site, and a C-terminal ubiquitin-associated (UBA) domain that mediates the recognition of polyubiquitin chains and ubiquitylated substrates. 40
31997 270506 cd14321 UBA_IAPs UBA domain found in inhibitor of apoptosis proteins (IAPs). IAPs are frequently overexpressed in cancer and associated with tumor cell survival, chemoresistance, disease progression and poor prognosis. They function primarily as negative regulators of cell death. They regulate caspases and apoptosis through the inhibition of specific members of the caspase family of cysteine proteases. In addition, IAPs have been implicated in a multitude of other cellular processes, including inflammatory signalling and immunity, mitogenic kinase signalling, proliferation and mitosis, as well as cell invasion and metastasis. IAPs in this family include cellular inhibitor of apoptosis protein c-IAP1 and c-IAP2, XIAP, and BIRC8, all of which contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), and a RING domain at the carboxyl terminus that is required for ubiquitin ligase activity. c-IAPs contain an additional caspase activation and recruitment domain (CARD) between UBA and RING domains. CARD domain may serve as a protein interaction surface. 44
31998 270507 cd14322 UBA_LATS UBA domain found in serine/threonine-protein kinase LATS and similar proteins. The LATS proteins family consists of two isoforms, LATS1 and LATS2, both of which are mammalian homologs of the Drosophila tumor suppressor gene lats/warts. LATS1, also called large tumor suppressor homolog 1, or WARTS protein kinase (warts), is a serine/threonine-protein kinase that is highly conserved from fly to human. LATS2, also called kinase phosphorylated during mitosis protein, or large tumor suppressor homolog 2, or serine/threonine-protein kinase KPM, or Warts-like kinase, inhibits the G1/S transition and is essential for embryonic development, proliferation control and genomic integrity. LATS proteins contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain. 39
31999 270508 cd14323 UBA_PLCs_like UBA domain of eukaryotic protein linking integrin-associated protein with cytoskeleton (PLIC) proteins, Saccharomyces cerevisiae proteins Dsk2p and Gts1p, and similar proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like Dsk2 protein, PLIC-1 (also called ubiquilin-1), PLIC-2 (also called ubiquilin-2 or Chap1), PLIC-3 (also called ubiquilin-3) and PLIC-4 (also called ubiquilin-4, Ataxin-1 interacting ubiquitin-like protein, A1Up, Connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. Saccharomyces cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Gts1p, also called protein LSR1, is encoded by a pleiotropic gene GTS1 in budding yeast. The formation of Gts1p-mediated protein aggregates may induce reactive oxygen species (ROS) production and apoptosis. Gts1p also plays an important role in the regulation of heat and other stress responses under glucose-limited or -depleted conditions in either batch or continuous culture. 39
32000 270509 cd14324 UBA_Dsk2p_like UBA domain of Saccharomyces cerevisiae proteasome interacting protein Dsk2p and its homologs found in fungi. The family contains several fungal multi-ubiquitin receptors, including Saccharomyces cerevisiae Dsk2p and Schizosaccharomyces pombe Dph1p, both of which have been characterized as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. They interact with the proteasome through their N-terminal ubiquitin-like domain (UBL) and with ubiquitin (Ub) through their C-terminal ubiquitin-associated domain (UBA). S. cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Moreover, it has been implicated in spindle pole duplication through assisting in Cdc31 assembly into the new spindle pole body (SPB). S. pombe Dph1p is a ubiquitin receptor working in concert with the class V myosin, Myo52, to target the degradation of the S. pombe CLIP-170 homolog, Tip1. It also can protect ubiquitin chains against disassembly by deubiquitinating enzymes. 42
32001 270510 cd14325 UBA_RNF31 UBA domain found in E3 ubiquitin-protein ligase RING finger protein 31 and similar proteins. RNF31, also called HOIL-1-interacting protein (HOIP), or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. RNF31 contains a central ubiquitin-associated (UBA) domain that is responsible for the interaction with the N-terminal ubiquitin-like domain (UBL) of HOIL-1L. In addition, RNF31 can interact with the atypical mammalian orphan receptor DAX-1, trigger DAX-1 ubiquitination and stabilization, and participate in repressing steroidogenic gene expression. 55
32002 270511 cd14326 UBA_UBL7 UBA domain found in ubiquitin-like protein 7 (UBL7) and similar proteins. UBL7, also called bone marrow stromal cell ubiquitin-like protein (BMSC-UbP), or ubiquitin-like protein SB132, is a novel ubiquitin-like protein that may play roles in regulation of bone marrow stromal cell (BMSC) function or cell differentiation via an evocator-associated and cell-specific pattern. UBL7 contains an N-terminal ubiquitin domain (UBQ) and a C-terminal ubiquitin-associated (UBA) domain. UBQ domain interacts with 26S proteasome-dependent degradation, and UBA domain links cellular processes and the ubiquitin system. 38
32003 270512 cd14327 UBA_atUPL1_2_like UBA domain found in Arabidopsis thaliana E3 ubiquitin-protein ligase UPL1 (atUPL1), UPL2 (atUPL2) and similar proteins. The family includes two highly similar 405-kDa HECT E3 ubiquitin-protein ligases (UPLs), UPL1 and UPL2, from Arabidopsis thaliana. The HECT E3 UPL family plays a prominent role in the ubiquitination of plant proteins. The biological functions of UPL1 and UPL2 remain unclear. Both of them contain a ubiquitin-associated (UBA) domain and a C-terminal HECT domain. UBA domain may be involved in ubiquitin metabolism. HECT domain is necessary and sufficient for their E3 catalytic activity, but requires ATP, E1 and an E2 of the Arabidopsis UBC8 family to ubiquitinate proteins. 38
32004 270513 cd14328 UBA_TNK1 UBA domain found in non-receptor tyrosine-protein kinase TNK1 and similar proteins. TNK1, also called CD38 negative kinase 1, is a non-receptor protein tyrosine kinase (NRPTK) that has been implicated in the regulation of apoptosis, cell growth, nuclear factor-kappaB, and Ras. It associates with phospholipase C (PLC)-gamma1 and may play a role in phospholipid signal transduction. TNK1 contains an NH2-terminal kinase, a Src Homology 3 (SH3) domain, a proline-rich (PR) region, and a C-terminal ubiquitin-association (UBA) domain. 40
32005 270514 cd14329 UBA_SWA2p_like UBA domain found in yeast auxilin-like clathrin uncoating factor SWA2 (Swa2p) and similar proteins. The lineage specific group includes Swa2p and other uncharacterized hypothetical proteins from Saccharomyces. Swa2p, also called bud site selection protein 24, DnaJ-related protein SWA2, or synthetic lethal with ARF1 protein 2, is the yeast auxilin ortholog that is a multifunctional protein with three N-terminal clathrin-binding (CB) motifs, a ubiquitin-association (UBA) domain, a tetratricopeptide repeat (TPR) domain, and a C-terminal J-domain. It is required for disassembly of clathrin-coated vesicles (CCVs) in an ATP-dependent manner, as well as for cortical endoplasmic reticulum (ER) inheritance. 36
32006 270515 cd14330 UBA_atDRM2_like UBA domain found in Arabidopsis thaliana DNA (cytosine-5)-methyltransferase DRM2 (atDRM2) and similar proteins. atDRM2, also called protein domains rearranged methylase 2, is a homolog of the mammalian de novo methyltransferase DNMT3. It is the major de novo methyltransferase targeted to DNA by small interfering RNAs (siRNAs) in the RNA-directed DNA methylation (RdDM) pathway in Arabidopsis thaliana. atDRM2 is a part of the RdDM effector complex and plays a catalytic role in RdDM. It contains an N-terminal UBA domains and a C-terminal methyltransferase domain, both of which are required for normal RdDM. 37
32007 270516 cd14331 UBA_HERC1_2 UBA domain found in probable E3 ubiquitin-protein ligase HERC1, HERC2 and similar proteins. HERC1, also called HECT domain and RCC1-like domain-containing protein 1, p532, or p619, is an ubiquitously expressed giant protein involved in ubiquitin-dependent intracellular membrane trafficking through its interaction with vesicle coat proteins such as clathrin and ARF. Moreover, it has been identified as a tuberous sclerosis complex TSC2-interacting protein that may play a role in TSC-mTOR (mammalian target of rapamycin) pathway. HERC2, also called HECT domain and RCC1-like domain-containing protein 2, is a SUMO-regulated E3 ubiquitin ligase that plays an important role in the SUMO-dependent pathway which orchestrates the DNA double-strand break (DSB) response. Moreover, HERC2 functions as a RNF8 auxiliary factor that regulates ubiquitin-dependent retention of repair proteins on damaged chromosomes. HERC1 and HERC2 are multi-domain proteins with different domain organizations. Both of them contain a ubiquitin-association (UBA) domain, more than one RCC1-like domains (RLDs) and a C-terminal HECT E3 ubiquitin ligase domain. 40
32008 270517 cd14332 UBA_RuvA_C C-terminal UBA-like domain of Holliday junction ATP-dependent DNA helicase RuvA. RuvA, along with RuvB and RuvC proteins, is involved in branch migration of heteroduplex DNA in homologous recombination that is a crucial process for maintaining genomic integrity and generating biological diversity in all living organisms. RuvA has a tetrameric architecture in which each subunit is comprised of three distinct domains. This model corresponds to the C-terminal domain of RuvA which is distantly related to the ubiquitin-associated (UBA) domain. It plays a significant role in the ATP-dependent branch migration of the hetero-duplex through direct contact with RuvB. Within the Holliday junction, the C-terminal domain makes no interaction with DNA. 45
32009 270518 cd14333 UBA_unchar_Eumetazoa UBA domain found in some hypothetical proteins from Eumetazoa. The family includes some uncharacterized Eumetazoan proteins. Although their biological functions remain unclear, they all contain a very conserved ubiquitin-associated (UBA) domain which is a commonly occurring sequence motif found in proteins involved in ubiquitin-mediated proteolysis. 38
32010 270519 cd14334 UBA_SNF1_fungi UBA domain of yeast carbon catabolite-derepressing protein kinases (Snf1) and similar proteins found in fungi. Snf1, also called yeast adenosine monophosphate (AMP)-activated protein kinase (AMPK), is a global regulator of carbon metabolism in the yeast Saccharomyces cerevisiae. Its phosphorylation is essential for the regulation by carbon catabolite repression in eukaryotic cells. Snf1 is involved in the cellular responses to nutrient stress, as well as other environmental stresses, including sodium ion stress, heat shock, alkaline pH, oxidative stress, and genotoxic stress. It plays roles in various nutrient-responsive, cellular developmental processes, including meiosis and sporulation, aging, haploid invasive growth, and diploid pseudohyphal growth. It is required for transcription of glucose-repressed genes, glycogen storage, thermotolerance, and peroxisome biogenesis. The catalytic activity of Snf1 can be regulated by upstream kinases, Sak1, Elm1, and Tos3, by the Reg1-Glc7 protein phosphatase 1, and by autoinhibition. In addition to an N-terminal protein kinase domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), Snf1 contains an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain, in the middle region. 48
32011 270520 cd14335 UBA_SnRK1_plant UBA domain found in the plant sucrose nonfermenting-1-related kinase (SnRK1) proteins. The plant SnRK1 proteins (also known as AKIN10/11) family contains plant orthologs of the yeast sucrose non-fermenting (Snf1) kinase and mammalian AMP-activated protein kinase (AMPK), including two catalytic alpha-subunits of plant Snf1-related kinases (SnRKs): SNF1-related protein kinase catalytic subunit alpha KIN10 (also called AKIN10 or AKIN alpha2) and SNF1-related protein kinase catalytic subunit alpha KIN11 (also called AKIN11 or AKIN alpha1). AKIN10 and AKIN11 function as central integrators of sugar, metabolic, stress, and developmental signals in plants. They form different complexes with the regulatory AKINbeta2, a plant ortholog of conserved Snf1/AMPK beta-subunits. In addition to an N-terminal protein kinase domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), Snf1 contains an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain, in the middle region. 41
32012 270521 cd14336 UBA_AID_AMPKalpha UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic alpha (AMPKalpha) subunits. The family corresponds to the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK) which includes two isoforms encoded by two distinct genes, AMPKalpha-1 (PRKAA1) and AMPKalpha-2 (PRKAA2). Skeletal muscle predominantly expresses the AMPKalpha-2, whereas the liver expresses approximately equal amounts of both AMPKalpha subunits. One AMPKalpha subunit and two regulatory subunits, beta (beta1, beta2, beta3) and gamma (gamma1, gamma2, gamma3) form a heterotrimeric AMPK complex that plays a central role in the regulation of cellular energy metabolism, activates energy-producing pathways and inhibits energy-consuming processes through responding to a fall in intracellular ATP levels. It is activated in beta-cells at low glucose concentrations, but inhibited as glucose levels increase. AMPKalpha subunits show significant similarity in the catalytic core region, but have divergent COOH-terminal tails, suggesting they may interact with different proteins within this region. Both of AMPKalpha subunits have an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID, and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunits autoinhibition. The C-terminal regulatory domain of the alpha-subunit is essential for binding the beta- and gamma-subunits. 65
32013 270522 cd14337 UBA_MARK_Par1 UBA domain found in microtubule-associated protein (MAP)/microtubule affinity-regulating kinase (MARK)/ partitioning-defective 1 (Par-1) and similar proteins. The MARK/Par-1 subfamily contains serine/threonine-protein kinases including mammal MARKs, and polarity kinases Par-1 found in Caenorhabditis elegans and Drosophila melanogaster. Those proteins are frequently found associated with membrane structures and participate in diverse processes from control of the cell cycle and polarity to intracellular signaling and microtubule stability. They are involved in nematode embryogenesis, cell cycle control, epithelial cell polarization, cell signaling, and neuronal migration and differentiation. The mammals MARKs have been implicated in carcinomas, Alzheimer's disease (through tau hyperphosphorylation), and autism. Four MARK isoforms exist in humans. Members in this subfamily contain an N-terminal protein kinase catalytic domain, followed by an ubiquitin-associated (UBA) domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK). 40
32014 270523 cd14338 UBA_SIK UBA domain found in salt-inducible kinase SIK1, SIK2, SIK3 and similar proteins. Salt-inducible kinase SIK1, SIK2, SIK3 are serine/threonine kinases that belong to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. SIK1, also called serine/threonine-protein kinase SNF1-like kinase 1 (SNF1LK), is required for myogenic differentiation. It is degraded by the proteasome in myoblasts which is regulated by cAMP signaling. Moreover, SIK1 acts as a class II histone deacetylase (HDAC) kinase, triggering the cytoplasmic export of the HDACs and activation of myocyte enhancer factor 2 (MEF2)-dependent transcription. It also regulates transcription through inhibitory phosphorylation of a family of cAMP responsive element binding protein (CREB) coactivators, called TORCs/CRTCs. In addition, SIK1 links LKB1 to p53-dependent anoikis and suppresses metastasis. It is also involved in a cell sodium-sensing network that regulates active sodium transport through a calcium-dependent process. SIK2, also called Qin-induced kinase or serine/threonine-protein kinase SNF1-like kinase 2 (SNF1LK2), plays an important role in the insulin-signaling pathway during adipocyte differentiation, as well as in autophagy progression. Moreover, SIK2 plays a critical role in neuronal survival and modulates cAMP responsive element binding protein (CREB)-mediated gene expression in response to hormones and nutrients. SIK2 acts as a critical determinant in autophagy progression. In addition, SIK2 localizes at the centrosome and functions as a centrosome kinase required for bipolar mitotic spindle formation. It is involved in the initiation of mitosis, and regulates the localization of the centrosome linker protein, C-Nap1, through S2392 phosphorylation. SIK3, also called salt-inducible kinase 3 or serine/threonine-protein kinase QSK, acts as a novel energy regulator that modulates cholesterol and bile acid metabolism by coupling with retinoid metabolism. It also plays an essential role in facilitating chondrocyte hypertrophy during skeletogenesis and growth plate maintenance. Members in this family contain an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain. 45
32015 270524 cd14339 UBA_SNRK UBA domain of SNF-related serine/threonine-protein kinase (SNRK) and similar proteins mainly found in metazoa. SNRK, also called Sucrose nonfermenting 1 (Snf1)-related kinase, is a serine/threonine kinase highly expressed in the testis. It is a distant member of the largely adenosine monophosphate (AMP)-activated protein kinase (AMPK) family. SNRK can be phosphorylated and activated by LKB1 and may mediate cellular effects regulated by LKB1. It is also involved in the regulation of colon cancer cell proliferation and beta-catenin signaling. It inhibits colon cancer cell proliferation through calcyclin-binding protein (CacyBP)-dependent reduction of beta-catenin. In addition to an N-terminal protein kinase domain, it harbors an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain which is conserved in other Snf1-related kinases, but not in any other protein kinase. 48
32016 270525 cd14340 UBA_BRSK UBA domain found in serine/threonine-protein kinase BRSK1, BRSK2 and similar proteins. The family includes brain-specific kinases BRSK1 and BRSK2. They are AMP-activated protein kinase (AMPK)-related kinases that are highly expressed in mammalian forebrain and crucial for establishing neuronal polarity. BRSK1, also called brain-selective kinase 1, brain-specific serine/threonine-protein kinase 1, BR serine/threonine-protein kinase 1, serine/threonine-protein kinase SAD-B, or synapses of Amphids Defective homolog 1 (SAD1 homolog), is associated with synaptic vesicles and is tightly associated with the presynaptic cytomatrix in nerve terminals. It can regulate neurotransmitter release presynaptically. BRSK2, also called brain-selective kinase 2, brain-specific serine/threonine-protein kinase 2, BR serine/threonine-protein kinase 2, serine/threonine-protein kinase 29, or serine/threonine-protein kinase SAD-A, is an AMP-activated protein kinase (AMPK)-related kinase exclusively expressed in brain and pancreas. It plays an essential role in neuronal polarization. It interacts with CDK-related protein kinase PCTAIRE1, a kinase involved in neurite outgrowth and neurotransmitter release, and further negatively regulates glucose-stimulated insulin secretion (GSIS) in pancreatic beta-cells through activation of p21-activated kinase-1 (PAK1). BRSK2 also regulates cell-cycle progression controlled by APC/C(Cdh1) through the ubiquitin-proteasome pathway. Moreover, BRSK2 is regulated by endoplasmic reticulum (ER) stress at the protein level and involved in ER stress-induced apoptosis. Both BRSK1 and BRSK2 contain an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain. 54
32017 270526 cd14341 UBA_MELK UBA domain found in maternal embryonic leucine zipper kinase (MELK) and similar proteins. MELK, also called protein kinase Eg3 (pEg3 kinase), protein kinase PK38 (PK38), or tyrosine-protein kinase MELK, is a cell cycle dependent protein kinase involved in diverse cell processes including stem cell renewal, cell cycle progression, cell proliferation, apoptosis and mRNA processing. It is expressed in normal tissues and especially in cancer cells. It is upregulated in cancer tissues and thus may act as potential anticancer target in diverse tumor entities. MELK comprises an N-terminal protein kinase catalytic domain, followed by an ubiquitin-associated (UBA) domain, and a C-terminal autoinhibitory domain of 5'-AMP-activated protein kinase (AMPK). 52
32018 270527 cd14342 UBA_TAP-C UBA-like domain found in nuclear RNA export factor NXF1, NXF2 and similar proteins. The NXF family of mRNA nuclear export factors includes vertebrate NXF1 (also called tip-associated protein or mRNA export factor TAP), NXF2 (also called cancer/testis antigen CT39 or TAP-like protein TAPL-2), Caenorhabditis elegans NXF1 (ceNXF1), Saccharomyces cerevisiae mRNA nuclear export factor Mex67p and similar proteins. NXF proteins can stimulate nuclear export of mRNAs and facilitate the export of unspliced viral mRNA containing the constitutive transport element. It is a multi-domain protein with a nuclear localization sequence (NLS), a non-canonical mRNA-binding domain, and four leucine-rich repeats (LRR) at the N-terminal region. Its C-terminal part contains a NTF2-like domain and a ubiquitin-associated (UBA)-like domain, joined by flexible Pro-rich linker. Caenorhabditis elegans NXF1 is essential for the nuclear export of poly(A)+mRNA. In budding yeast, Mex67p binds mRNAs through its adaptor Yra1/REF. It also interacts directly with Nab2, an essential shuttling mRNA-binding protein required for export. Moreover, Mex67p associates with both nuclear pore protein (nucleoporin) FG repeats and Hpr1, a component of the TREX/THO complex linking transcription and export. 51
32019 270528 cd14343 UBA_F100B_like UBA-like domain found in protein FAM100B. The family corresponds to the uncharacterized protein FAM100B and its homologs mainly found in Metazoa. Although their biological roles remain unclear, all family members contain a ubiquitin-associated (UBA)-like domain that may be involved in the binding of ubiquitin. 39
32020 270529 cd14344 UBA_TYDP2 UBA-like domain found in tyrosyl-DNA phosphodiesterase 2 (TDP2) and similar proteins. TDP2, also called ETS1-associated protein II (EAPII) or TRAF and TNF receptor-associated protein (Ttrap), is a 5'-Tyr-DNA phosphodiesterase, a member of the Mg(2+)/Mn(2+)-dependent family of phosphodiesterases which contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal phosphodiesterase domain. TDP2 is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini needed for subsequent DNA ligation and hence repair of the break. Tyrosyl-DNA phosphodiesterase 1 (TDP1), an enzyme that cleaves 3'-phosphotyrosyl bonds, and TDP2 are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TDP2 has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation. It can associate with CD40, tumor necrosis factor receptor-75 (TNF-R75) and TNF receptor-associated factors (TRAFs) and may inhibit the activation of nuclear factor-kappa B (NF-kappaB). 37
32021 270530 cd14345 UBA_UBXD7 UBA-like domain found in UBX domain-containing protein 7 (UBXD7) and similar proteins. UBXD7, also known as UBXN7, functions as a ubiquitin-binding adaptor that mediates the interaction between the AAA+ ATPase p97 (also known as VCP or Cdc48) and the transcription factor HIF1alpha. It binds only to the active, NEDD8- or Rub1-modified form of cullins. UBXD7 contains the ubiquitin-associated (UBA), ubiquitin-associating (UAS), ubiquitin regulatory X (UBX), and ubiquitin-interacting motif (UIM) domains. Either UBA or UIM could serve as a docking site for neddylated-cullins. Moreover, UBA-like domain is required for binding ubiquitylated-protein substrates, UIM motif is responsible for the binding to cullin RING ligases (CRLs), and UBX domain is essential for p97 binding. 37
32022 270531 cd14346 UBA_Ubx5_like UBA-like domain found in Saccharomyces cerevisiae UBX domain-containing protein 5 (Ubx5) and similar proteins. Ubx5 is a ubiquitin regulatory X (UBX) domain-containing protein encoded by the open reading frame (ORF) YDR330W in yeast. As the yeast ortholog of mammalian UBXD7, Ubx5 functions as the cofactor of AAA+ ATPase p97, also known as VCP or Cdc48. It binds only to the active, NEDD8- or Rub1-modified form of cullins. Ubx5 contains the ubiquitin-associated (UBA), ubiquitin-associating (UAS), ubiquitin regulatory X (UBX) and ubiquitin-interacting motif (UIM) domains and its UIM domain is required to promote UV-dependent degradation of polyubiquitinated Rpb1. 39
32023 270532 cd14347 UBA_Cezanne_like UBA-like domain found in OTU domain-containing proteins OTU7A, OTU7B and similar proteins. OTU7A, also called zinc finger protein Cezanne 2, belongs to a family of proteins that have been characterized as highly specific ubiquitin iso-peptidases removing ubiquitin from proteins. OTU7B, also called cellular zinc finger anti-NF-kappaB protein, zinc finger A20 domain-containing protein 1, or zinc finger protein Cezanne, is a novel deubiquitinating enzyme that acts as a negative regulator of NF-kappaB and may play a role in the control of the inflammatory process. Both OTU7A and OTU7B contain an N-terminal ubiquitin-associated (UBA)-like domain, followed by an ovarian tumor (OTU) domain and a ubiquitin binding domain, A20-like zinc finger. In addition, they both display proteolytic activity. 43
32024 270533 cd14348 UBA_p47 UBA-like domain found in NSFL1 cofactor p47 and similar proteins. p47, also called UBX domain-containing protein 2C, is a major cofactor of the cytosolic AAA ATPase p97. It is required for the p97-regulated membrane reassembly of the endoplasmic reticulum (ER), the nuclear envelope and the Golgi apparatus. p47, together with p97, forms the p97-p47 complex that plays an important role in regulation of membrane fusion events. p47 contains an N-terminal ubiquitin-associated (UBA)-like domain, a central SEP (named after shp1, eyc and p47) domain, and a ubiquitin-like (UBX) domain. UBA-like domain is responsible for forming a highly stable complex with ubiquitin. SEP domain and UBX domain may be involved in p47 trimerization or form a stable complex with the p97 N-terminal domain. 40
32025 270534 cd14349 UBA_CF106 UBA-like domain found in uncharacterized protein C6orf106 and similar proteins. The family corresponds to a group of uncharacterized protein C6orf106 and its homologs mainly found in Metazoa. All family members contain a ubiquitin-associated (UBA)-like domain. 41
32026 270535 cd14350 UBA_DCNL UBA-like domain found in DCN1-like protein DCNL1, DCNL2 and similar proteins. DCNL1 (defective in cullin neddylation protein 1-like protein 1), also called DCUN1 domain-containing protein 1, is encoded by squamous cell carcinoma-related oncogene SCCRO (DCUN1D1). It interacts with known cullin isoforms as well as ROC1, Ubc12 and CAND1, the components of the neddylation pathway. It plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. DCNL1 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that binds to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. DCNL2 (defective in cullin neddylation protein 1-like protein 2), also called DCUN1 domain-containing protein 2, is encoded by gene DCUN1D2. Although its biological function remains unclear, DCNL2 shows high sequence similarity with DCNL1 and may also contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. Like DCNL1, DCNL2 contains an N-terminal UBA-like domain and a C-terminal cullin binding domain. 42
32027 270536 cd14351 UBA_Ubx1_like UBA-like domain found in yeast UBX domain-containing protein 1 (Ubx1) and similar proteins. Ubx1, also called suppressor of high-copy PP1 protein (Shp1), is the substrate-recruiting cofactor of AAA-adenosine triphosphatase Cdc48 in Saccharomyces cerevisiae. In concert with ubiquitin-like Atg8, Cdc48 and Ubx1 are involved in the regulation of autophagosome biogenesis. Ubx1 also functions as a regulator of phosphoprotein phosphatase 1 (PP1) with differential effects on glycogen metabolism, meiotic differentiation, and mitotic cell cycle progression. All family members contain an N-terminal ubiquitin-associated (UBA)-like domain. 37
32028 270537 cd14352 UBA_DCN1 UBA-like domain found in yeast defective in cullin neddylation protein 1 (DCN1) and similar proteins. DCN1 is a scaffold-type E3 ligase for cullin neddylation. It can bind directly to cullins and the ubiquitin-like protein Nedd8-specific E2 (Ubc12), and regulate cullin neddylation and thus display ubiquitin ligase activity. It contains an N-terminal ubiquitin-associated (UBA)-like domain and a unique C-terminal PONY domain that is essential for the neddylation function of DCN1. 36
32029 270538 cd14353 UBA_FAF UBA-like domain found in FAS-associated factor FAF1, FAF2 and similar proteins. FAF1, also called UBX domain-containing protein 12 or UBX domain-containing protein 3A, is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP) which is involved in the ubiquitin-proteasome pathway. FAF2, also called protein ETEA, UBX domain-containing protein 3B, or UBX domain-containing protein 8, is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. FAF2 shows homology to Fas-associated factor 1 (FAF1). Both of them contain N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains. Compared to FAF1, however, FAF2 lacks the nuclear targeting domain. The function of FAF2 remains unclear. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that FAF2 could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis. 32
32030 270539 cd14354 UBA_UBP25 UBA domain found in ubiquitin carboxyl-terminal hydrolase 25 (UBP25) and similar proteins. UBP25, also called deubiquitinating enzyme 25, USP on chromosome 21, ubiquitin thioesterase 25, or ubiquitin-specific-processing protease 25, belongs to the deubiquitinating enzyme (DUB) family that specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. USP25 has one muscular isoform and two ubiquitous isoforms. The longer muscular isoform can bind to muscle-restricted cytoskeletal and sarcomeric proteins, such as myosin binding protein C1 (MyBPC1), actin alpha-1 (ACTA1) and filamin C (FLNC), and further prevent their degradation. USP25 harbors three potential ubiquitin-binding domains (UBDs), one ubiquitin-associated (UBA) domain and two ubiquitin-interacting motifs (UIMs) in the N-terminal region. Its C-terminal tyrosine-rich region is responsible for the binding of the second SH2 domain of SYK, a non-receptor tyrosine kinase that specifically phosphorylates USP25 and alters its cellular levels. 46
32031 270540 cd14355 UBA_UBP28 UBA domain found in ubiquitin carboxyl-terminal hydrolase 28 (UBP28) and similar proteins. UBP28, also called deubiquitinating enzyme 28, ubiquitin thioesterase 28, or ubiquitin-specific-processing protease 28, is an ubiquitin-specific protease that belongs to the deubiquitinating enzyme (DUB) family which specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. UBP28 can form a ternary complex with nucleoplasmic Fbw7alpha, an F-box protein that is part of an SCF-type ubiquitin ligase, and MYC, a transcription factor encoded by MYC proto-oncogene. UBP28 is required for the stability of MYC, and this stabilization is necessary for tumour-cell proliferation. Besides, UBP28 plays a critical role in the regulation of the Chk2-p53-PUMA pathway. It specifically interacts with 53BP1 and is essential to stabilize Chk2 and 53BP1 in response to DNA damage. 42
32032 270541 cd14358 UBA_NAC_euk UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and its homologs mainly found in eukaryotes. The subfamily contains nascent polypeptide-associated complex subunit alpha (NACA), putative NACA-like protein (NACP1), nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD), and similar proteins. NACA, also called NAC-alpha or Alpha-NAC, together with BTF3, also called Beta-NAC, form the nascent polypeptide-associated complex (NAC) which is a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. The biological function of NACP1 (also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1) and NACAD remain unclear. All family members contain an NAC domain and a C-terminal ubiquitin-associated (UBA) domain. 37
32033 270542 cd14359 UBA_AeNAC UBA-like domain found in archaeal nascent polypeptide-associated complex homolog from Methanothermobacter marburgensis (AeNAC) and similar proteins. AeNAC is a functional archaeal homolog of eukaryotic nascent polypeptide-associated complex (NAC). Both AeNAC and eukaryotic NAC function as the cytosolic chaperone that can bind to ribosomal RNA, interact with the nascent polypeptide chains as they emerge from the ribosome, and assist in post-translational processes. They all contain a NAC domain and an ubiquitin-associated (UBA) domain in the C-terminus. However, unlike eukaryotic NAC, AeNAC forms a ribosome associated homodimer, but not heterodimer. The NAC domain of AeNAC is responsible for the homodimer formation. 40
32034 270543 cd14360 UBA_NAC_like_bac UBA-like domain found in uncharacterized bacteria proteins similar to eukaryotic nascent polypeptide-associated complex proteins (NAC). This subfamily contains a group of uncharacterized proteins found in bacteria. They all contain an N-terminal ubiquitin-associated (UBA) that shows high sequence similarity with that of eukaryotic nascent polypeptide-associated complex proteins (NAC) which is one of the cytosolic chaperones that contact the nascent polypeptide chains as they emerge from the ribosome and assist in post-translational processes. 38
32035 270544 cd14361 UBA_HYPK UBA-like domain found in Huntingtin-interacting protein K (HYPK) and similar proteins. HYPK, also called Huntingtin yeast partner K or Huntingtin yeast two-hybrid protein K, is an intrinsically unstructured Huntingtin (HTT)-interacting protein with chaperone-like activity. It is involved in regulating cell growth, cell cycle, unfolded protein response, and cell death. All members in this subfamily contain an N-terminal ubiquitin-associated (UBA) that shows high sequence similarity with that of eukaryotic nascent polypeptide-associated complex proteins (NAC) which is one of the cytosolic chaperones that contact the nascent polypeptide chains as they emerge from the ribosome and assist in post-translational processes. 41
32036 270545 cd14362 CUE_TAB2_TAB3 CUE domain found in the N-terminal of TGF-beta-activated kinase 1 and MAP3K7-binding proteins TAB2, TAB3 and similar proteins. TAB2, also called mitogen-activated protein kinase kinase kinase 7-interacting protein 2, TAK1-binding protein 2, or TGF-beta-activated kinase 1-binding protein 2, is an adaptor protein that regulates activation of TAK1, a MAP kinase kinase kinase (MAPKKK), through linking TAK1 to TRAF6 in the Interleukin-1 (IL-1) induced NF-kappaB activation pathway. TAB3, also called mitogen-activated protein kinase kinase kinase 7-interacting protein 3, NF-kappa-B-activating protein 1, TAK1-binding protein 3, or TGF-beta-activated kinase 1-binding protein 3, is a TAB2-like TAK1-binding protein that activates NF-kappaB similar to TAB2. It activates TAK1 and regulates its association with TRAF2 and TRAF6. Moreover, TAB3 interacts with TRAF6 and TRAF2 in an IL-1- and a TNF-dependent manner, respectively. In summary, TAB2 and TAB3 function redundantly as mediators of TAK1 activation in IL-1 and TNF signal transduction. Both of them contain an N-terminal CUE domain, a coiled-coil (CC) region, a TAK1-binding domain and a C-terminal Npl4 zinc finger (NZF) ubiquitin-binding domain (UBD). 42
32037 270546 cd14363 CUE_TOLIP CUE domain found in the C-terminal of toll-interacting protein (Tollip) and similar proteins. Tollip is a new component of the IL-1RI pathway which contains an N-terminal C2 domain and a C-terminal CUE domain. Tollip binds to the cytoplasmic TIR domain of IL-1Rs after IL-1 stimulation. It is sufficient for recruitment of IRAK to IL-1Rs and negatively regulates IL-1-induced signaling by inhibiting IRAK phosphorylation. In addition, Tollip directly interacts with toll-like receptors TLR2 and TLR4, and plays an inhibitory role in TLR-mediated cell activation through suppressing phosphorylation and kinase activity of IRAK. Moreover, Tollip can associate with GAT domains of Tom1 and its related proteins Tom1L1 and Tom1L2, and facilitate the recruitment of clathrin onto endosomes. 41
32038 270547 cd14364 CUE_ASCC2 CUE domain found in activating signal cointegrator 1 complex subunit 2 (ASCC2) and similar proteins. ASCC2, also called ASC-1 complex subunit p100 or Trip4 complex subunit p100, together with ASCC1 (also called p50) and ASCC3 (also called p300), form the activating signal cointegrator complex (ASCC). ASCC plays an essential role in activating protein 1 (AP-1), serum response factor (SRF), and nuclear factor kappaB (NF-kappaB) transactivation. It acts as a transcriptional coactivator of nuclear receptors and regulates the transrepression between nuclear receptors and either AP-1 or NF-kappaB in vivo. Members in this family all contain a CUE domain. 40
32039 270548 cd14365 CUE_N4BP2 CUE domain found in NEDD4-binding protein 2 (N4BP2) and similar proteins. N4BP2 has been identified as an oncogene bcl-3 coding protein BCL-3-binding protein (B3BP) that participates in connecting transcriptional activation and genetic recombination of the Ig gene. In addition to BCL-3, it also interacts with p300/CBP histone acetyltransferases. N4BP2 shows intrinsic ATP binding and hydrolyzing activity. It contains an N-terminal ATP-binding region that is responsible for the interaction with BCL-3 and p300/CBP. N4BP2 also functions as a 5'-polynucleotide kinase that can transfer a phosphate group to the 5' end of DNA and RNA substrates. Moreover, N4BP2 contains a C-terminal MutS-related domain that possesses nicking endonuclease activity and may play a role in DNA mismatch repair (MMR). This model corresponds to CUE domain in the N-terminus of N4BP2. 42
32040 270549 cd14366 CUE_CUED1 CUE domain found in CUE domain-containing protein 1 (CUED1) and similar proteins. The subfamily includes a group of uncharacterized CUE domain-containing protein termed CUED1. Their biological function remains unknown. 42
32041 270550 cd14367 CUE_CUED2 CUE domain found in CUE domain-containing protein 2 (CUED2) and similar proteins. CUEDC2 is a novel negative regulator of progesterone receptor (PR) and functions to promote the progesterone-induced PR degradation by the ubiquitin-proteasome pathway. It also acts as the regulator of JAK1/STAT3 signaling through inhibiting cytokine-induced phosphorylation of JAK1 and STAT3 and the subsequent STAT3 transcriptional activity. All members in this subfamily contain a CUE domain. 42
32042 270551 cd14368 CUE_DEF1_like CUE domain found in fungal RNA polymerase II degradation factor 1 (DEF1) and similar proteins. DEF1, also called RRM3-interacting protein 1, is a RNA Polymerase II (RNAPII) degradation factor that may be required to couple arrested RNAPII to the proteasome to facilitate its degradation. It contains a CUE domain that is responsible for the binding of ubiquitin. The family also includes many uncharacterized hypothetical proteins. They show a high level of sequence similarity with DEF1. 41
32043 270552 cd14369 CUE_VPS9_like CUE domain found in vacuolar protein sorting-associated protein 9 (VPS9) and similar proteins. VPS9, also called vacuolar protein-targeting protein 9, is a cytosolic yeast protein required for localization of vacuolar proteins, such as the soluble vacuolar hydrolases CPY and PrA. It may bind and act as an effector of a rab GTPase and plays a role in vacuolar protein sorting (VPS) pathway. VPS9 contains a region called GBH domain that is related to mammalian Ras-binding proteins, Rin1 and JC265, and may negatively regulate Ras-mediated signaling in yeast Saccharomyces cerevisiae. This model corresponds to the N-terminal CUE domain that interacts specifically with monoubiquitin and regulates intramolecular monoubiquitylation. 42
32044 270553 cd14370 CUE_DMA CUE-like DMA domain found in the DM domain gene family encodes putative transcription factors DMRTA1, DMRTA2 and DMRTA3. The DM domain proteins are related to the sexual regulators doublesex from Drosophila melanogaster and MAB-3 from Caenorhabditis elegans. Thus, they have been named as doublesex- and mab-3-related transcription factors and may be involved in sexual development or in somite development. All DM domain proteins contain a DM domain which is an unusual zinc finger motif. In addition to an N-terminal DM domain, members in this family, including DMRTA1, DMRTA2 and DMRTA3, also harbor additional CUE-like DMA domain. DMRTA1 is encoded by gene DMRT1, a vertebrate equivalent of the D. melanogaster master sex regulator gene, doublesex. In D. melanogaster, doublesex controls the terminal switch of the pathway leading to sex fate choice. DMRT1 may function as regulator of sex differentiation in vertebrate. Especially, it is required for testis differentiation, but is not involved in the gonadal sex fate choice. DMRTA2, also called Doublesex- and mab-3-related transcription factor 5 (DMRT5), is encoded by gene DMRT2. In the zebrafish, DMRT2 is involved in somite development. DMRTA2 may act as an activator of cyclin-dependent kinase inhibitor 2C (cdkn2c) during spermatogenesis. It may also play significant roles in embryonic neurogenesis. DMRTA3 is encoded by tumor suppressor gene DMRT3 which serves as a novel potential target for homozygous deletion in squamous cell carcinoma of the lung. 40
32045 270554 cd14371 CUE_CID7_like CUE domain found in CTC-interacting domain proteins CID5, CID6, CID7 and similar proteins. CID7 is encoded by the ubiquitously expressed gene CID7. It contains an N-terminal PABC-interacting domain (PAM2 or PABP-interacting motif 2) which is also found in the human Paip1 and Paip2. At this point, it functions as an interaction partner of the PABC domain of Arabidopsis thaliana Poly(A)-binding proteins. It also harbors a ubiquitin-associated (UBA)-like CUE domain and a C-terminal small MutS-related (SMR) domain. CID5 and CID6 are encoded by the genes CID5 and CID6, respectively. CID5 is only expressed in immature siliques. The biological functions of CID5 and CID6 remain unclear. 43
32046 270555 cd14372 CUE_Cue5p_like CUE domain found in yeast ubiquitin-binding protein CUE5 (Cue5p), donuts protein 1 (DON1p) and similar proteins. Cue5p, also called coupling of ubiquitin conjugation to ER degradation protein 5, is encoded by the open reading frame (ORF) Yor042. It contains a CUE domain which exhibits weak ubiquitin binding properties. Donuts protein 1 (DON1p) is encoded by the ORF YDR273w. It localizes specifically to the prospore membrane and is expressed exclusively during meiosis. DON1p may function as a unique marker to investigate the defects associated with the impaired function of the meiotic plaque in the mpc- mutants. 45
32047 270556 cd14373 CUE_Cue3p_like CUE domain found in yeast ubiquitin-binding protein CUE3 (Cue3p) and similar proteins. Cue3p, also called coupling of ubiquitin conjugation to ER degradation protein 3, is encoded by the open reading frame (ORF) YGL110C. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue3p contains a CUE domain. 41
32048 270557 cd14374 CUE1_Cue2p_like CUE1 domain found in yeast ubiquitin-binding protein CUE2 (Cue2p) and similar proteins. Cue2p, also called coupling of ubiquitin conjugation to ER degradation protein 2, is encoded by the open reading frame (ORF) YKL090W. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue2p contains two tandem CUE domains at the N-terminus. Both of them can bind monoubiquitin independently. This model corresponds to the first CUE domain. 42
32049 270558 cd14375 CUE2_Cue2p_like CUE2 domain found in yeast ubiquitin-binding protein CUE2 (Cue2p) and similar proteins. Cue2p, also called coupling of ubiquitin conjugation to ER degradation protein 2, is encoded by the open reading frame (ORF) YKL090W. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue2p contains two tandem CUE domains at the N-terminus. Both of them can bind monoubiquitin independently. This model corresponds to the second CUE domain. 38
32050 270559 cd14376 CUE_AUP1_AMFR_like CUE domain found in ancient ubiquitous protein 1 (AUP1), autocrine motility factor receptor (AMFR) and similar proteins. AUP1 is a component of the HRD1-SEL1L endoplasmic reticulum (ER) quality control complex and is essential for US11-mediated dislocation of class I MHC heavy chains. AMFR is an internalizing cell surface glycoprotein that is localized in both plasma membrane caveolae and the ER, and is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. Cue1p is an N-terminally membrane-anchored endoplasmic reticulum (ER) protein essential for the activity of the two major yeast RING finger ubiquitin ligases (E3s) implicated in ER-associated degradation (ERAD). This family also includes plant E3 ubiquitin protein ligases RIN2, RIN3, and similar proteins. Compared with other CUE domain-containing proteins, some family members from higher eukaryotes do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains. 37
32051 270560 cd14377 UBA1_Rad23 UBA1 domain of Rad23 proteins found in metazoa. The family includes mammalian orthologs of yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe). Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA1 domain. 40
32052 270561 cd14378 UBA1_Rhp23p_like UBA1 domain of Schizosaccharomyces pombe UV excision repair protein Rhp23p and its homologs. The subfamily contains several fungal multi-ubiquitin receptors, including Schizosaccharomyces pombe Rhp23p and Saccharomyces cerevisiae Rad23p, both of which are orthologs of human HR23A. They play roles in nucleotide excision repair (NER) and in cell cycle regulation. They also function as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. For instance, S. pombe Rhp23p forms a complex with Rhp41p to recognize photolesions and help initiate DNA repair, and it also protects ubiquitin chains against disassembly by deubiquitinating enzymes. Like human HR23A, members in this subfamily interact with the proteasome through their N-terminal ubiquitin-like domain (UBL), and with ubiquitin (Ub), or multi-ubiquitinated substrates, through their two ubiquitin-associated domains (UBA), termed internal UBA1 and C-terminal UBA2. In addition, they contain a xeroderma pigmentosum group C (XPC) protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain. 47
32053 270562 cd14379 UBA1_Rad23_plant UBA1 domain of putative DNA repair proteins Rad23 found in plant. The radiation sensitive 23 (Rad23) subfamily consists of four isoforms of putative DNA repair proteins from Arabidopsis thaliana and similar proteins from other plants. The nuclear-enriched Rad23 proteins function in the cell cycle, morphology, and fertility of plants through their delivery of ubiquitin (Ub)/26S proteasome system (UPS) substrates to the 26S proteasome. Rad23 proteins contain an N-terminal ubiquitin-like (UBL) domain that associates with the 26S proteasome Ub receptor RPN10, and two C-terminal ubiquitin-associated (UBA) domains that bind Ub conjugates. This model corresponds to the UBA1 domain. 50
32054 270563 cd14380 UBA2_Rad23 UBA2 domain of Rad23 proteins found in metazoa. The family includes mammalian orthologs of yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe). Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA2 domain. 39
32055 270564 cd14381 UBA2_Rhp23p_like UBA2 domain of Schizosaccharomyces pombe UV excision repair protein Rhp23p and its fungal homologs. The subfamily contains several fungal multiubiquitin receptors, including Schizosaccharomyces pombe Rhp23p and Saccharomyces cerevisiae Rad23p, both of which are orthologs of human HR23A. They play roles in nucleotide excision repair (NER) and in cell cycle regulation. They also function as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. For instance, S. pombe Rhp23p forms a complex with Rhp41p to recognize photolesions and help initiate DNA repair, and it also protects ubiquitin chains against disassembly by deubiquitinating enzymes. Like human HR23A, members in this subfamily interact with the proteasome through their N-terminal ubiquitin-like domain (UBL), and with ubiquitin (Ub), or multi-ubiquitinated substrates, through their two ubiquitin-associated domains (UBA), termed internal UBA1 and C-terminal UBA2. In addition, they contain a xeroderma pigmentosum group C (XPC) protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain. 40
32056 270565 cd14382 UBA2_RAD23_plant UBA2 domain of putative DNA repair proteins RAD23 found in plant. The radiation sensitive 23 (RAD23) subfamily consists of four isoforms of putative DNA repair proteins from Arabidopsis thaliana and similar proteins from other plants. The nuclear-enriched RAD23 proteins function in the cell cycle, morphology, and fertility of plants through their delivery of ubiquitin (Ub)/26S proteasome system (UPS) substrates to the 26S proteasome. RAD23 proteins contain an N-terminal ubiquitin-like (UBL) domain that associates with the 26S proteasome Ub receptor RPN10, and two C-terminal ubiquitin-associated (UBA) domains that bind Ub conjugates. This model corresponds to the UBA2 domain. 43
32057 270566 cd14383 UBA1_UBP5 UBA1 domain found in ubiquitin carboxyl-terminal hydrolase 5 (UBP5). UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. This model corresponds to the UBA1 domain. 49
32058 270567 cd14384 UBA1_UBP13 UBA1 domain found in ubiquitin carboxyl-terminal hydrolase 13 (UBP13). UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5 implicated in catalyzing hydrolysis of various ubiquitin (Ub)-chains. It contains a zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. Due to the non-activating catalysis for K63-polyubiquitin chains, UBP13 may function differently from USP5 in cellular deubiquitination processes. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA1 domain. 49
32059 270568 cd14385 UBA1_spUBP14_like UBA1 domain found in Schizosaccharomyces pombe ubiquitin carboxyl-terminal hydrolase 14 (spUBP14) and similar proteins. spUBP14, also called deubiquitinating enzyme 14, UBA domain-containing protein 2, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, functions as a deubiquitinating enzyme that is involved in protein degradation in fission yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain. 47
32060 270569 cd14386 UBA2_UBP5 UBA2 domain found in ubiquitin carboxyl-terminal hydrolase 5 (UBP5). UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. This model corresponds to the UBA2 domain. 43
32061 270570 cd14387 UBA2_UBP13 UBA2 domain found in ubiquitin carboxyl-terminal hydrolase 13 (UBP13). UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5 implicated in catalyzing hydrolysis of various ubiquitin (Ub)-chains. It contains a zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. Due to the non-activating catalysis for K63-polyubiquitin chains, UBP13 may function differently from USP5 in cellular deubiquitination processes. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA2 domain. 35
32062 270571 cd14388 UBA2_atUBP14 UBA2 domain found in Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and similar proteins. atUBP14, also called deubiquitinating enzyme 14, TITAN-6 protein, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is related to the isopeptidase T class of deubiquitinating enzymes that recycle polyubiquitin chains following protein degradation. atUBP14 is essential for early plant development. It can disassemble multi-ubiquitin chains linked internally via epsilon-amino isopeptide bonds using Lys48 and can process some, but not all, translational fusions of ubiquitin linked via alpha-amino peptide bonds. atUBP14 contains two ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain, which shows a high level of sequence similarity with mammalian ubiquitin-associated and SH3 domain-containing protein A (UBS3A). 38
32063 270572 cd14389 UBA_AAA_plant UBA domain found in plant AAA-type ATPase-like proteins. This family includes some uncharacterized AAA-type ATPase-like proteins found in plants. The AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. Members of this family contain an N-terminal ubiquitin-association (UBA) domain, an AAA-type ATPase domain and a C-terminal MgsA AAA+ ATPase domain. This model corresponds to the UBA domain, which shows a high level of sequence similarity with mammalian ubiquitin-associated and SH3 domain-containing protein A (UBS3A). 37
32064 270573 cd14390 UBA_II_E2_UBE2K UBA domain of vertebrate ubiquitin-conjugating enzyme E2 K (UBE2K). UBE2K, also called Huntingtin-interacting protein 2 (HIP-2), ubiquitin carrier protein, ubiquitin-conjugating enzyme E2-25 kDa (E2-25K), or ubiquitin-protein ligase, is a multi-ubiquitinating enzyme with the ability to synthesize Lys48-linked polyubiquitin chains which is involved in the ubiquitin (Ub)-dependent proteolytic pathway. It interacts with the frameshift mutant of ubiquitin B and functions as a crucial factor regulating amyloid-beta neurotoxicity. It has also been characterized as a Huntingtin-interacting protein that modulates the neurotoxicity of Amyloid-beta (Abeta), the principal protein involved in Alzheimer's disease pathogenesis. Moreover, E2-25K increases the aggregate formation of expanded polyglutamine proteins and polyglutamine-induced cell death in the pathology of polyglutamine diseases. UBE2K and its yeast homolog UBC1 are unique class II E2 conjugating enzymes, both of which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. 38
32065 270574 cd14391 UBA_II_E2_UBCD4 UBA domain found in Drosophila melanogaster ubiquitin-conjugating enzyme E2-22 kDa (UbcD4) and similar proteins. UbcD4, also called ubiquitin carrier protein or ubiquitin-protein ligase, is a class II E2 ubiquitin-conjugating enzyme encoded by Drosophila E2 gene which is only expressed in pole cells in embryos. It is a putative E2 enzyme homologous to the Huntingtin interacting protein-2 (HIP2) of human. UbcD4 specifically interacts with the polyubiquitin-binding subunit of the proteasome. It contains a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. 36
32066 270575 cd14392 UBA_Cbl-b UBA domain found in E3 ubiquitin-protein ligase Cbl-b and similar proteins. Cbl-b, also called casitas B-lineage lymphoma proto-oncogene b, RING finger protein 56, SH3-binding protein Cbl-b, or signal transduction protein Cbl-b, has been identified as a regulator of antigen-specific, T cell-intrinsic, peripheral immune tolerance (a state also called clonal anergy). It may inhibit activation of the p85 subunit of phosphoinositide 3-kinase (PI3K), protein kinase C-theta (PKC-theta), and phospholipase C-gamma1 (PLC-gamma1) and negatively regulates T-cell receptor-induced transcription factor nuclear factor-kappaB (NF-kappaB) activation. In addition, Cbl-b may target multiple signaling molecules involved in transforming growth factor (TGF)-beta-mediated transactivation pathways. Cbl-b contains a proline rich domain, a nuclear localization signal, a C3HC4 zinc finger and a ubiquitin-associated (UBA) domain. 41
32067 270576 cd14393 UBA_c-Cbl UBA domain found in E3 ubiquitin-protein ligase Cbl and similar proteins. Cbl, also called casitas B-lineage lymphoma proto-oncogene, proto-oncogene c-Cbl, RING finger protein 55, or signal transduction protein Cbl, is a multi-domain protein that acts as a key negative regulator of various receptor and non-receptor tyrosine kinases signaling. It contains a tyrosine kinase-binding domain (TKB), a proline-rich domain, a RING domain, and a ubiquitin-associated (UBA) domain. The TKB is responsible for the interactions with many tyrosine kinases, such as the colony-stimulating factor-1 (CSF-1) receptor, Syk/ZAP-70 and Src-family of protein tyrosine kinases. The proline-rich domain can recruit proteins with SH3 domain. Moreover, Cbl functions as an E3 ubiquitin ligase that can bind ubiquitin-conjugating enzymes (E2s) through RING domain. 40
32068 270577 cd14394 UBA_BIRC2_3 UBA domain found in baculoviral IAP repeat-containing protein BIRC2, BIRC3 and similar proteins. The subfamily includes cellular inhibitor of apoptosis protein 1 (c-IAP1) and c-IAP2. c-IAPs function as ubiquitin E3 ligases that mediate the ubiquitination of the substrates involved in apoptosis, nuclear factor-kappaB (NF-kappaB) signaling, and oncogenesis. Unlike other inhibitor of apoptosis proteins (IAPs), such as XIAP, c-IAPs exhibit minimal binding to caspases and may not play an important role in the inhibition of these proteases. c-IAP1, also called baculoviral IAP repeat-containing protein BIRC2, IAP-2, RING finger protein 48, or TNFR2-TRAF-signaling complex protein 2, is a potent regulator of the tumor necrosis factor (TNF) receptor family and NF-kappaB signaling pathways in the cytoplasm. It can also regulate E2F1 transcription factor-mediated control of cyclin transcription in the nucleus. c-IAP2, also called BIRC3, IAP-1, apoptosis inhibitor 2 (API2), or IAP homolog C, also influences ubiquitin-dependent pathways that modulate innate immune signalling by activation of NF-kappaB. c-IAPs contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), a caspase activation and recruitment domain (CARD) that serves as a protein interaction surface, and a RING domain at the carboxyl terminus that is required for ubiquitin ligase activity. 50
32069 270578 cd14395 UBA_BIRC4_8 UBA domain found in E3 ubiquitin-protein ligase XIAP, baculoviral IAP repeat-containing protein 8 (BIRC8) and similar proteins. XIAP, also called baculoviral IAP repeat-containing protein 4 (BIRC4), IAP-like protein (ILP), inhibitor of apoptosis protein 3 (IAP-3), or X-linked inhibitor of apoptosis protein (X-linked IAP), is a potent suppressor of apoptosis that directly inhibits specific members of the caspase family of cysteine proteases, including caspase-3, -7, and -9. It promotes proteasomal degradation of caspase-3 and enhances its anti-apoptotic effect in Fas-induced cell death. The ubiquitin-protein ligase (E3) activity of XIAP also exhibits in the ubiquitination of second mitochondria-derived activator of caspases (Smac). The mitochondrial proteins, Smac/DIABLO and Omi/HtrA2, can inhibit the antiapoptotic activity of XIAP. XIAP has also been implicated in several intracellular signaling cascades involved in the cellular response to stress, such as the c-Jun N-terminal kinase (JNK) pathway, the nuclear factor-kappaB (NF-kappaB) pathway, and the transforming growth factor-beta (TGF-beta) pathway. Moreover, XIAP can regulate copper homeostasis through interacting with MURR1. BIRC8, also called inhibitor of apoptosis-like protein 2 (IAP-like protein 2 or ILP-2), or testis-specific inhibitor of apoptosis, is a tissue-specific homolog of E3 ubiquitin-protein ligase XIAP. It has been implicated in the control of apoptosis in the testis by direct inhibition of caspase 9. Both XIAP and BIRC8 contain three N-terminal baculoviral IAP repeat (BIR) domains, a ubiquitin-association (UBA) domain and a RING domain at the carboxyl terminus. 50
32070 270579 cd14396 UBA_XtBIRC7_like UBA domain found in Xenopus tropicalis baculoviral IAP repeat-containing protein BIRC7, BIRC71A and similar proteins. X. tropicalis BIRC7, also called E3 ubiquitin-protein ligase EIAP, embryonic/Egg IAP (xEIAP/XLX), inhibitor of apoptosis (IAP)-like protein, XIAP homolog XLX, is a weak apoptosis inhibitor that exhibits caspase inhibition and autoubiquitylation. It is uniquely modified by MAPK- and Cdc2/Cyclin B-dependent phosphorylation during oocyte maturation. Its caspase-dependent cleavage is altered when it is phosphorylated. X. tropicalis BIRC7 contains two N-terminal baculoviral IAP repeats (BIRs) and a C-terminal RING domain. Based on sequence homology, it also harbors a ubiquitin-associated (UBA) domain which is not detected in human BIRC7. 44
32071 270580 cd14397 UBA_LATS1 UBA domain found in vertebrate serine/threonine-protein kinase LATS1. LATS1, also called large tumor suppressor homolog 1 or WARTS protein kinase (warts), is a serine/threonine-protein kinase that is highly conserved from fly to human. It plays a crucial role in the prevention of tumor formation by controlling mitosis progression. Human LATS1 is the mammalian homolog of the Drosophila lats/warts gene that could suppress tumor growth and rescue all developmental defects in flies, including embryonic lethality. It forms a regulatory complex with zyxin, a regulator of actin filament assembly. The LATS1/zyxin complex plays a role in controlling mitosis progression on mitotic apparatus. LATS1 is phosphorylated in a cell-cycle-dependent manner and complexes with CDC2 in early mitosis. It can negatively modulate tumor cell growth by inducing G(2)/M cell cycle transition or apoptosis. It also functions as a mitotic exit network kinase interacting with MOB1A, a protein whose homolog in budding yeast associates with kinases involved in mitotic exit. Moreover, LATS1 acts as a novel cytoskeleton regulator that affects cytokinesis by regulating actin polymerization through inhibiting LIMK1. LATS1 can also inhibit transcription regulation and transformation functions of oncogene YAP by inhibiting its nuclear translocation through phosphorylation. In addition, LATS1 can regulate the transcriptional activity of forkhead L2 (FOXL2) via phosphorylation. It also acts as an actin-binding protein that can negatively regulate actin polymerization. LATS1 contains an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain. 41
32072 270581 cd14398 UBA_LATS2 UBA domain found in vertebrate serine/threonine-protein kinase LATS2. LATS2, also called kinase phosphorylated during mitosis protein, or large tumor suppressor homolog 2, or serine/threonine-protein kinase KPM, or Warts-like kinase, is a novel mammalian homolog of the Drosophila tumor suppressor gene lats/warts. It inhibits the G1/S transition and is essential for embryonic development, proliferation control, and genomic integrity. LATS2 is a serine/threonine kinase that negatively regulates CyclinE/CDK2 and plays a role in tumor suppression. It also acts as the negative regulator of androgen receptor (AR) through inhibiting androgen-regulated gene expression and thus plays an important role in AR -regulated transcription and in the development of prostate cancer. Moreover, LATS2 induces apoptosis via down-regulation of anti-apoptotic proteins, BCL-2 and BCL-x(L), in human lung cancer cells. It is a centrosomal protein and forms a complex with Ajuba, a LIM protein, to regulate organization of the spindle apparatus through recruitment of gamma-tubulin to the centrosome during mitosis. Furthermore, LATS2 interacts with Mdm2 to inhibit p53 ubiquitination and promote p53 activation. It stabilizes the cellular protein level of Snail1, a central regulator of epithelial cell adhesion and movement in epithelial-to-mesenchymal transitions (EMTs) during embryo development, and enhances its EMT activity. LATS2 contains an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain. 41
32073 270582 cd14399 UBA_PLICs UBA domain of eukaryotic protein linking integrin-associated protein (IAP, also known as CD47) with cytoskeleton (PLIC) proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like Dsk2 protein, PLIC-1 (also called ubiquilin-1), PLIC-2 (also called ubiquilin-2 or Chap1), PLIC-3 (also called ubiquilin-3) and PLIC-4 (also called ubiquilin-4, Ataxin-1 interacting ubiquitin-like protein, A1Up, Connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. PLIC-1 regulates the function of the thrombospondin receptor CD47 and G protein signaling. It plays a role in TLR4-mediated signaling through interacting with the Toll/interleukin-1 receptor (TIR) domain of TLR4. It also inhibits the TLR3-Trif antiviral pathway by reducing the abundance of Trif. Moreover, PLIC-1 binds to gamma-aminobutyric acid receptors (GABAARs) and modulates the ubiquitin-dependent, proteasomal degradation of GABAARs. Furthermore, PLIC-1 acts as a molecular chaperone regulating amyloid precursor protein (APP) biosynthesis, trafficking, and degradation by stimulating K63-linked polyubiquitination of lysine 688 in the APP intracellular domain. In addition, PLIC-1 is involved in the protein aggregation-stress pathway via associating with the ubiquitin-interacting motif (UIM) proteins ataxin 3, HSJ1a, and epidermal growth factor substrate 15 (EPS15). PLIC-2 is a protein that binds the ATPase domain of the HSP70-like Stch protein. It functions as a negative regulator of G protein-coupled receptor (GPCR) endocytosis. It is also involved in amyotrophic lateral sclerosis (ALS)-related dementia. PLIC-3 is encoded by UBQLN3, a testis-specific gene. It shows high sequence similarity with the Xenopus protein XDRP1, a nuclear phosphoprotein that binds to the N-terminus of cyclin A and inhibits Ca2+-induced degradation of cyclin A, but not cyclin B. PLIC-4 is a ubiquitin-like nuclear protein that interacts with ataxin-1 and further links ataxin-1 with the chaperone and ubiquitin-proteasome pathways. It also binds to the non-ubiquitinated gap junction protein connexin43 (Cx43) and regulates the turnover of Cx43 through the proteasomal pathway. PLIC proteins contain an N-terminal ubiquitin-like (UBL) domain that is responsible for the binding of ubiquitin-interacting motifs (UIMs) expressed by proteasomes and endocytic adaptors, and a C-terminal ubiquitin-associated (UBA) domain that interacts with ubiquitin chains present on proteins destined for proteasomal degradation. In addition, mammalian PLIC2 proteins have an extra collagen-like motif region which is absent in other PLIC proteins and the yeast Dsk2 protein. 40
32074 270583 cd14400 UBA_Gts1p_like UBA domain found in Saccharomyces cerevisiae protein GTS1 (Gts1p) and similar proteins. Gts1p, also called protein LSR1, is encoded by a pleiotropic gene GTS1 in budding yeast. The formation of Gts1p-mediated protein aggregates may induce reactive oxygen species (ROS) production and apoptosis. Gts1p also plays an important role in the regulation of heat and other stress responses under glucose-limited or -depleted conditions in either batch or continuous culture. Gts1p contains an N-terminal zinc finger motif similar to that of GATA-transcription factors, a ubiquitin-associated (UBA) domain and a C-terminal glutamine-rich strand. The zinc finger is responsible for the binding to the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH) which is required for the maintenance of the metabolic oscillations of budding yeast. The polyglutamine sequence is indispensable for the pleiotropy and nuclear localization of Gts1p. It is essential for the transcriptional activation, whereas Gts1p lacks DNA binding activity. 39
32075 270584 cd14401 UBA_HERC1 UBA domain found in probable E3 ubiquitin-protein ligase HERC1 and similar proteins. HERC1, also called HECT domain and RCC1-like domain-containing protein 1, or p532, or p619, is a ubiquitously expressed multi-domain protein involved in ubiquitin-dependent intracellular membrane trafficking through its interaction with vesicle coat proteins such as clathrin and ARF. Moreover, it has been identified as a tuberous sclerosis complex TSC2-interacting protein that may play a role in the TSC-mTOR (mammalian target of rapamycin) pathway. In addition to a ubiquitin-association (UBA) domain, HERC1 contains more than one RCC1-like domain (RLD) and a C-terminal HECT E3 ubiquitin ligase domain. It may therefore function as both an E3 ubiquitin ligase and a guanine nucleotide exchange factor (GEF). 44
32076 270585 cd14402 UBA_HERC2 UBA domain found in probable E3 ubiquitin-protein ligase HERC2 and similar proteins. HERC2, also called HECT domain and RCC1-like domain-containing protein 2, is a SUMO-regulated E3 ubiquitin ligase that plays an important role in the SUMO-dependent pathway which orchestrates the DNA double-strand break (DSB) response. Moreover, HERC2 functions as an RNF8 auxiliary factor that regulates ubiquitin-dependent retention of repair proteins on damaged chromosomes. In addition to a ubiquitin-association (UBA) domain, HERC2 contains more than one RCC1-like domain (RLD) and a C-terminal HECT E3 ubiquitin ligase domain. 45
32077 270586 cd14403 UBA_AID_AAPK1 UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic subunit alpha-1 (AMPKalpha-1). AMPKalpha-1, also called acetyl-CoA carboxylase kinase (ACACA kinase), hydroxymethylglutaryl-CoA reductase kinase (HMGCR kinase), or Tau-protein kinase PRKAA1, is one of the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK). It has been implicated in a number of important cellular processes. For instance, it functions as a glucose sensor controlling CD8 T-cell memory, as well as a new kinase for RhoA and a new mediator of the vasoprotective effects of estrogen. It also plays a significant role in cervical malignant growth, in regulating oxidative stress and life span in erythrocytes, in modulating the antioxidant status of vascular endothelial cells, in limiting skeletal muscle overgrowth during hypertrophy through inhibition of the mammalian target of rapamycin (mTOR)-signaling pathway. AMPKalpha-1 has an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunit autoinhibition. The C-terminal regulatory domain of the alpha1-subunit is essential for binding the beta1- and gamma1-subunits. 65
32078 270587 cd14404 UBA_AID_AAPK2 UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic subunit alpha-2 (AMPKalpha-2). AMPKalpha-2, also called acetyl-CoA carboxylase kinase (ACACA kinase) or hydroxymethylglutaryl-CoA reductase kinase (HMGCR kinase), is one of the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK). It shows a wide expression pattern and is highly expressed in skeletal muscle, heart, and liver. It may be involved in the regulation of glucose and lipid metabolism and protein synthesis in peripheral tissues, as well as in regulation of energy intake and body weight. AMPKalpha-2 has an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID, and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunit autoinhibition. The C-terminal regulatory domain is essential for binding the beta- and gamma-subunits. 65
32079 270588 cd14405 UBA_MARK1 UBA domain found in serine/threonine-protein kinase MARK1 and similar proteins. MARK1, also called MAP/microtubule affinity-regulating kinase 1 or PAR1 homolog c (Par-1c), is a kinase that regulates microtubule-dependent transport in axons and dendrites. It is involved in the specification of neuronal polarity, in axon-dendrite specification, and in the synaptic plasticity in adult neurons. It has been implicated in Alzheimer's disease, cancer, and autism. 41
32080 270589 cd14406 UBA_MARK2 UBA domain found in serine/threonine-protein kinase MARK2 and similar proteins. MARK2, also called ELKL motif kinase 1 (EMK-1), MAP/microtubule affinity-regulating kinase 2, PAR1 homolog, or PAR1 homolog b (Par-1b), is enriched in brain. It belongs to the AMPK family of Ser/Thr kinases. MARK2 has been implicated in regulating fertility, immune homeostasis, learning, and memory as well as adiposity, insulin hypersensitivity, and glucose metabolism. The activity of MARK2 is necessary for the outgrowth of cell processes, neurites, and dendritic spines. It is a TORC2 (also known as Crtc2) Ser-275 kinase that blocks TORC2-induced cAMP response element binding protein (CREB) activity. It regulates axon formation via phosphorylation of a kinesin-like motor protein GAKIN/KIF13B. It also acts as a positive regulator of Wnt-beta-catenin signaling. 42
32081 270590 cd14407 UBA_MARK3_4 UBA domain found in MAP/microtubule affinity-regulating kinase MARK3, MARK4, and similar proteins. MARK3, also called C-TAK1, Cdc25C-associated protein kinase 1, ELKL motif kinase 2 (EMK-2), protein kinase STK10, Ser/Thr protein kinase PAR-1 (Par-1a), or serine/threonine-protein kinase p78, is a known regulator of KSR1, a molecular scaffold of the Raf/MEK/ERK MAP kinase cascade that regulates the intensity and duration of ERK activation. It binds plakophilin 2 (PKP2), phosphorylates human Cdc25C on serine 216, and promotes 14-3-3 protein binding and protein localization. It also interacts with microphthalmia-associated transcription factor, Mitf which is necessary for regulating genes involved in osteoclast differentiation. Moreover, MARK3 is involved in regulating localization and activity of class IIa histone deacetylases. The lack of MARK3 leads to reduced adiposity, resistance to hepatic steatosis, and defective gluconeogenesis. MARK4, also called MAP/microtubule affinity-regulating kinase-like 1 (MARKL1), or Par-1d, is a member of the AMP-activated protein kinase (AMPK)-related family of kinases. It plays a key role in energy metabolism and may act as a novel drug target for the treatment of obesity and type 2 diabetes. MARK4 also functions as the substrate of ubiquitin specific protease-9 (USP9X) and can be regulated by unusual Lys(29)/Lys(33)-linked polyubiquitin chains. Furthermore, MARK4 may play some role in hepatocellular carcinogenesis. 43
32082 270591 cd14408 UBA_SIK1 UBA domain found in salt-inducible kinase 1 (SIK1). SIK1, also called serine/threonine-protein kinase SNF1-like kinase 1 (SNF1LK), is a serine/threonine kinase abundant in adrenal glands. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. SIK1 is required for myogenic differentiation. It is degraded by the proteasome in myoblasts which is regulated by cAMP signaling. Moreover, SIK1 acts as a class II histone deacetylase (HDAC) kinase, triggering the cytoplasmic export of the HDACs and activation of myocyte enhancer factor 2 (MEF2)-dependent transcription. It also regulates transcription through inhibitory phosphorylation of a family of cAMP responsive element binding protein (CREB) coactivators, called TORCs/CRTCs. In addition, SIK1 links LKB1 to p53-dependent anoikis and suppresses metastasis. It is also involved in a cell sodium-sensing network that regulates active sodium transport through a calcium-dependent process. SIK1 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain and a putative PEST domain. 50
32083 270592 cd14409 UBA_SIK2 UBA domain found in salt-inducible kinase 2 (SIK2). SIK2, also called Qin-induced kinase or serine/threonine-protein kinase SNF1-like kinase 2 (SNF1LK2), is a serine/threonine kinase highly expressed in adipocytes. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. It plays an important role in the insulin-signaling pathway during adipocyte differentiation, as well as in autophagy progression. Moreover, SIK2 plays a critical role in neuronal survival and modulates cAMP responsive element binding protein (CREB)-mediated gene expression in response to hormones and nutrients. SIK2 acts as a critical determinant in autophagy progression. In addition, SIK2 localizes at the centrosome and functions as a centrosome kinase required for bipolar mitotic spindle formation. It is involved in the initiation of mitosis, and regulates the localization of the centrosome linker protein, C-Nap1, through S2392 phosphorylation. SIK2 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain. 45
32084 270593 cd14410 UBA_SIK3 UBA domain found in salt-inducible kinase 3 (SIK3). SIK3, also called serine/threonine-protein kinase QSK, is a ubiquitously expressed serine/threonine kinase. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. It acts as a novel energy regulator that modulates cholesterol and bile acid metabolism by coupling with retinoid metabolism. It also plays an essential role in facilitating chondrocyte hypertrophy during skeletogenesis and growth plate maintenance. SIK3 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain. 45
32085 270594 cd14411 UBA_DCNL1 UBA-like domain found in DCN1-like protein 1 (DCNL1) and similar proteins. DCNL1 (defective in cullin neddylation protein 1-like protein 1), also called DCUN1 domain-containing protein 1, is encoded by squamous cell carcinoma-related oncogene SCCRO (DCUN1D1). It interacts with known cullin isoforms as well as ROC1, Ubc12 and CAND1, the components of the neddylation pathway. It plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. DCNL1 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that binds to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. 51
32086 270595 cd14412 UBA_DCNL2 UBA-like domain found in DCN1-like protein 2 (DCNL2) and similar proteins. DCNL2 (defective in cullin neddylation protein 1-like protein 2), also called DCUN1 domain-containing protein 2, is encoded by gene DCUN1D2. Although its biological function remains unclear, DCNL2 shows high sequence similarity with DCNL1, a protein that plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. At this point, DCNL2 may also contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. Like DCNL1, DCNL2 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that is responsible for the binding to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. 47
32087 270596 cd14413 UBA_FAF1 UBA-like domain found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also called UBX domain-containing protein 12 or UBX domain-containing protein 3A, is a multi-functional Fas associating protein that contains an N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains, p150 subunit of a chromatin assembly factor like domain (CAF) and a novel nuclear localization signal (NLS). FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP) which is involved in the ubiquitin-proteasome pathway. 33
32088 270597 cd14414 UBA_FAF2 UBA-like TAP-C domain found in FAS-associated factor 2 (FAF2) and similar proteins. FAF2, also called protein ETEA, UBX domain-containing protein 3B, or UBX domain-containing protein 8, is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. FAF2 shows homology to Fas-associated factor 1 (FAF1). Both of them contain an N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains. Compared to FAF1, however, FAF2 lacks the nuclear targeting domain. The function of FAF2 remains unclear. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that FAF2 could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis. 38
32089 270598 cd14415 UBA_NACA_NACP1 UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and putative NACA-like protein (NACP1). NACA, also called NAC-alpha or alpha-NAC, together with BTF3, also called Beta-NAC, form the nascent polypeptide-associated complex (NAC) which is a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. Both NACA and BTF3 harbor an NAC domain that mediates the dimerization of the two subunits. By contrast, NACA has an extra ubiquitin-associated (UBA) domain in the C-terminus. In addition to NACA, the family includes NACP1, also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1. The biological function of NACP1 remains unclear. 46
32090 270599 cd14416 UBA_NACAD UBA-like domain found in nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD). The subfamily includes a group of uncharacterized proteins mainly found in vertebrates. Their biological function remains unknown, but they show high sequence similarity to the nascent polypeptide-associated complex (NAC) subunit alpha (NACA) that exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. Like NACA, NACAD contains an NAC domain and a C-terminal ubiquitin-associated (UBA) domain. 44
32091 270600 cd14417 CUE_DMA_DMRTA1 CUE-like DMA domain found in doublesex- and mab-3-related transcription factor A1 (DMRTA1) and similar proteins. DMRTA1 is encoded by gene DMRT1, a vertebrate equivalent of the Drosophila melanogaster master sex regulator gene, doublesex. In D. melanogaster, doublesex controls the terminal switch of the pathway leading to sex fate choice. DMRT1 may function as a regulator of sex differentiation in vertebrates. In particular, it is required for testis differentiation, but is not involved in the gonadal sex fate choice. 40
32092 270601 cd14418 CUE_DMA_DMRTA2 CUE-like DMA domain found in doublesex- and mab-3-related transcription factor A2 (DMRTA2). DMRTA2, also called Doublesex- and mab-3-related transcription factor 5 (DMRT5), is encoded by gene DMRT2. In the zebrafish, DMRT2 is involved in somite development. DMRTA2 may act as an activator of cyclin-dependent kinase inhibitor 2C (cdkn2c) during spermatogenesis. It may also play significant roles in embryonic neurogenesis. 42
32093 270602 cd14419 CUE_DMA_DMRTA3 CUE-like DMA domain found in doublesex- and mab-3-related transcription factor 3 (DMRTA3). DMRTA3 is encoded by tumor suppressor gene DMRT3 which serves as a novel potential target for homozygous deletion in squamous cell carcinoma of the lung. 43
32094 270603 cd14420 CUE_AUP1 CUE domain found in ancient ubiquitous protein 1 (AUP1) and similar proteins. AUP1 is a component of the HRD1-SEL1L endoplasmic reticulum (ER) quality control complex and is essential for US11-mediated dislocation of class I MHC heavy chains. It also binds to the membrane-proximal KVGFFKR motif of the cytoplasmic tail of the integrin alphaCTs that plays a crucial role in the inside-out signaling of alpha(IIb)beta(3). AUP1 is found in both the ER and in lipid droplets. It contains two conserved cytoplasmic domains, an acyltransferase domain, a CUE domain and an E2 ubiquitin conjugase G2 (Ube2g2)-binding domain (G2BR). The acyltransferase domain transfers fatty acids onto phospholipids and CUE domain participates in ubiquitin binding or in recruitment of ubiquitin-conjugating enzymes to the site of dislocation. 45
32095 270604 cd14421 CUE_AMFR CUE domain found in autocrine motility factor receptor (AMFR) and similar proteins. AMFR is an internalizing cell surface glycoprotein that is localized in both plasma membrane caveolae and the endoplasmic reticulum (ER), and is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. It is also called ER-protein gp78 and has been identified as a RING finger-dependent ubiquitin protein ligase (E3) implicated in degradation from the ER. AMFR contains an N-terminal RING-finger domain and a C-terminal CUE domain. 41
32096 270605 cd14422 CUE_RIN3_plant CUE domain found in plant E3 ubiquitin protein ligases RIN2, RIN3 and similar proteins. RIN2 and RIN3 are two closely related RPM1-interacting proteins conserved in higher eukaryotes. They are orthologs of the mammalian autocrine motility factor receptor (AMFR), a cytokine receptor localized in both plasma membrane caveolae and the endoplasmic reticulum (ER). RIN2 and RIN3 have been identified as membrane-bound RING-finger type ubiquitin ligases with six apparent transmembrane domains, a RING-finger domain and a CUE domain. They act as positive regulators of RPM1- and RPS2-dependent hypersensitive response (HR). 38
32097 270606 cd14423 CUE_UBR5 CUE domain found in E3 ubiquitin-protein ligase UBR5 and similar proteins. UBR5, also called E3 ubiquitin-protein ligase, HECT domain-containing 1, hyperplastic discs protein homolog (HYD), progestin-induced protein, EDD, or Rat100, belongs to the E3 protein family of HECT (homologous to E6-AP C-terminus) ligases. It is frequently overexpressed in breast and ovarian cancer, suggesting a role in cancer development. UBR5 is involved in DNA-damage signaling. It can ubiquitinate DNA topoisomerase II-binding protein 1 (TopBP1) in the presence of the E2 enzyme UBCH4. It also activates the DNA-damage checkpoint kinase CHK2. Moreover, UBR5 interacts with the calcium and integrin-binding protein (CIB) in a DNA-damage-dependent manner. It functions as the substrate of the extracellular signal-regulated kinases (ERKs) 1 and 2. It also acts as a ubiquitin ligase that controls the levels of poly(A)-binding protein-interacting protein 2. In addition, UBR5 ubiquitinates and up-regulates beta-catenin, regulates transcription, and activates smooth-muscle differentiation through its ability to stabilize myocardin. UBR5 contains an N-terminal CUE domain, a zinc-finger-like domain termed the ubiquitin-recognin (UBR) box, a MLLE (mademoiselle) domain, and a C-terminal catalytic HECT domain. 47
32098 270607 cd14424 CUE_Cue1p_like CUE domain found in yeast ubiquitin-binding protein CUE1 (Cue1p), CUE4 (Cue4p) and similar proteins. Cue1p, also called coupling of ubiquitin conjugation to ER degradation protein 1 or kinetochore-defect suppressor 4, is encoded by the open reading frame (ORF) YMR264W in yeast. It is an N-terminally membrane-anchored endoplasmic reticulum (ER) protein essential for the activity of the two major yeast RING finger ubiquitin ligases (E3s) implicated in ER-associated degradation (ERAD). It interacts with the ERAD ubiquitin-conjugating enzyme (E2) Ubc7p in vivo, stimulates Ubc7p E2 activity, and further activates ER-associated protein degradation. Cue1p contains a CUE domain which binds ubiquitin much more weakly than those of other CUE domain containing proteins. It also has an Ubc7p binding-domain at the C-terminal region which is required for Ubc7p-dependent ubiquitylation and for degradation of substrates in the ER. This family also includes Cue4p, also called coupling of ubiquitin conjugation to ER degradation protein 4. It is encoded by the open reading frame (ORF) YML101C in yeast. Cue4p contains a CUE domain which shows high level of similarity with that of Cue1p. 37
32099 270608 cd14425 UBA1_HR23A UBA1 domain of UV excision repair protein RAD23 homolog A (HR23A) found in vertebrates. HR23A, also called Rad23A, is a DNA repair protein that binds to 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain. 40
32100 270609 cd14426 UBA1_HR23B UBA1 domain of UV excision repair protein RAD23 homolog B (HR23B) found in vertebrates. HR23B, also called xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26 S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain. 46
32101 270610 cd14427 UBA2_HR23A UBA2 domain of UV excision repair protein RAD23 homolog A (HR23A) found in vertebrates. HR23A, also called Rad23A, is a DNA repair protein that binds to 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain. 41
32102 270611 cd14428 UBA2_HR23B UBA2 domain of UV excision repair protein RAD23 homolog B (HR23B) found in vertebrates. HR23B, also called xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26 S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain. 45
32103 259859 cd14435 SPO1_TF1_like Bacteriophage SPO1-encoded TF1 binds and bends DNA. This group contains proteins related to Bacillus phage SPO1-encoded transcription factor 1 (TF1), a type II DNA-binding protein related to the DNA sequence specific (IHF) and non-specific (HU) domains. Type II DNA-binding proteins bind and bend DNA as dimers. Like IHF, TF1 binds DNA specifically and bends DNA sharply. Bacteriophage SPO1-encoded TF1 recognizes SPO1 phage DNA containing 5-(hydroxymethyl)-2'-deoxyuridine as opposed to thymine. Related family members include integration host factor (IHF) and HU, also called type II DNA-binding proteins (DNABII), which are small dimeric proteins that specifically bind the DNA minor groove, inducing large bends in the DNA and serving as architectural factors in a variety of cellular processes such as recombination, initiation of replication/transcription and gene regulation. IHF binds DNA in a sequence specific manner while HU displays little or no sequence preference. IHF homologs are usually heterodimers, while HU homologs are typically homodimers (except HU heterodimers from E. coli and other enterobacteria). HU is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, and is thus often referred to as a histone-like protein. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). 87
32104 271226 cd14436 LepB Legionella Rab1-specific GAP LepB. LepB of Legionella, a human pathogen, is a specific RabGAP for Rab1, a member of the largest subfamily of small GTPases. RabGTPases play a role in the control of vesicular trafficking and are switched off by GTPase-activating enzymes (GAPs) that stimulate the intrinsic GTP hydrolysis activity. Legionella LepB is unrelated to the TBC family of human Rab1 RabGAPs. 272
32105 271227 cd14437 nt01cx_1156_like Uncharacterized proteins conserved in Clostridia. Some members of this uncharacterized protein family have been annotated as putative lipoproteins. The structure resembles that of a partial beta-propeller (3 out of 6 blades), suggesting that family members might form dimers. 199
32106 271228 cd14438 Hip_N N-terminal dimerization domain of the Hsp70-interacting protein (Hip) and similar proteins. The Hsc70/Hsp70-interacting protein (Hip, also p48 or suppressor of tumorigenicity ST13) functions as a regulator of the cyclic action of Hsp70. Hip forms homodimers, and this model characterizes the N-terminal dimerization domain, which may not be directly involved in its regulatory function. A central domain of Hip that contains TPR repeats binds the ATPase domain of Hsp70 and slows the release of ADP. 41
32107 270205 cd14439 AlgX_N_like N-terminal catalytic domain of putative alginate O-acetyltransferase and similar proteins. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. This wider family includes AlgX, AlgJ, AlgV, and a number of uncharacterized families, some of which may have been mis-annotated in sequence databases. 316
32108 270206 cd14440 AlgX_N_like_3 Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this uncharacterized protein family resemble AlgX_N. 315
32109 270207 cd14441 AlgX_N N-terminal catalytic domain of the putative alginate O-acetyltransferase AlgX. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. This N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. AlgX contains a C-terminal carbohydrate binding domain that belongs to the wider family of CBM6-CBM35-CBM36_like domains. 310
32110 270208 cd14442 AlgJ_like Putative alginate O-acetyltransferases AlgJ, AlgV, and similar proteins. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this family have been annotated as AlgJ or AlgV, and they closely resemble AlgX in sequence and function, although they lack the C-terminal carbohydrate binding domain of AlgX. 321
32111 270209 cd14443 AlgX_N_like_2 Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Some members of this uncharacterized family, which resembles AlgX_N, have been annotated as twin-arginine translocation signal, although they share little or no similarity with experimentally characterized proteins that bear the same name. 313
32112 270210 cd14444 AlgX_N_like_1 Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this uncharacterized family similar to AlgX_N have been annotated as cell division proteins FtsQ, although they share little or no similarity with experimentally characterized members of the FtsQ family. 298
32113 271220 cd14445 RILP-like Rab interacting lysosomal protein-like 1 and 2 (Rilpl1 and Rilpl2). This domain is found in Rab interacting lysosomal protein-like 1 and 2, and appears to be conserved in Bilateria. The Rilp-like proteins regulate the concentration of ciliary membrane proteins in the primary cilium. Rilpl2 interacts with myosin-Va and has been linked to the regulation of cellular morphology in neurons; it forms a complex with Rac1 and activates Rac1-Pak signaling, dependent on myosin-Va. 89
32114 271219 cd14446 bt3222_like Uncharacterized proteins similar to Bacteroides thetaiotaomicron bt3222. This family appears to be specific to Bacteroidetes; the two-domain protein forms a homodimer. 266
32115 269894 cd14447 SPX Domain found in Syg1, Pho81, XPR1, and related proteins. This region has been named the SPX domain after Syg1, Pho81, and XPR1. This domain is found at the amino terminus of a variety of proteins. In the yeast protein Syg1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. Similarly, the N-terminus of the human XPR1 protein binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The SPX domain of S. cerevisiae low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. NUC-2 contains several ankyrin repeats. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with xenotropic and polytropic murine leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signaling and result in cell toxicity and death. The similarity between Syg1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, S. cerevisiae, and many other diverse organisms. 143
32116 259990 cd14448 CuRO_2_BOD_CotA_like Cupredoxin domain 2 of Bilirubin oxidase (BOD), the bacterial endospore coat component CotA, and similar proteins. Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. CotA protein is an abundant component of the outer coat layer in bacterial endospore coat and is required for spore resistance against hydrogen peroxide and UV light. Also included in this subfamily are phenoxazinone synthase (PHS), which catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones, and FtsP (also named SufI), which is a component of the cell division apparatus. These proteins are laccase-like multicopper oxidases (MCOs) that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper. 144
32117 259991 cd14449 CuRO_1_2DMCO_NIR_like_2 The cupredoxin domain 1 of a two-domain laccase related to nitrite reductase. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles the two domain nitrite reductase in both sequence and structure. It consists of two cupredoxin domains and forms trimers, and hence resembles the quaternary structure of nitrite reductases more than that of large laccases. There are three trinuclear copper clusters in the enzyme localized between domains 1 and 2 of each pair of neighbor chains. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic, notably phenolic, and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities. This subfamily has lost the type 1 (T1) copper binding site in domain 1 that is present in other two-domain laccases. 135
32118 259992 cd14450 CuRO_3_FV_like The third cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 3 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 181
32119 259993 cd14451 CuRO_5_FV_like The fifth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 5 of unprocessed Factor V or the first cupredoxin domain of the light chain of coagulation factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 173
32120 259994 cd14452 CuRO_1_FVIII_like The first cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 1 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins. 173
32121 259995 cd14453 CuRO_2_FV_like The second cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 2 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 123
32122 259996 cd14454 CuRO_4_FV_like The fourth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 4 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 144
32123 259997 cd14455 CuRO_6_FV_like The sixth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 6 of unprocessed Factor V or the second cupredoxin domain of the light chain of coagulation factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom. 140
32124 271218 cd14456 Menin Scaffolding protein menin encoded by the MEN1 gene. MEN1 is the gene responsible for multiple endocrine neoplasia type 1, and it has been characterized as a tumor suppressor gene that encodes a protein called menin. Menin is mostly found in the nucleus and can regulate gene expression in a positive and in a negative way, and it has been shown to interact with transcription activators, transcription repressors, cell signaling proteins, and various other proteins. It plays major roles in DNA repair, the regulation of the cell cycle, and chromatin remodeling. 437
32125 271217 cd14458 DP_DD Dimerization domain of DP. DP functions as a binding partner for E2F transcription factors. DP and E2F form heterodimers and play important roles in regulating genes involved in DNA synthesis, cell cycle progression, proliferation and apoptosis. The transcriptional activity of E2F is inhibited by the retinoblastoma protein (Rb) which binds to the E2F-DP heterodimer, blocks the transactivation domain, and negatively regulates the G1-S transition. DP is distantly related to E2F. In humans, there are at least six closely related E2F and two DP family members, all containing a DNA binding domain, a coiled-coil (CC) region, and a marked-box domain. E2F1 to E2F5 also contain a C-terminal transactivation domain. 105
32126 270615 cd14472 mltA_B_like Domain B insert of mltA_like lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct. 134
32127 271216 cd14473 FERM_B-lobe FERM domain B-lobe. The FERM domain has a cloverleaf tripart structure (FERM_N, FERM_M, FERM_C/N, alpha-, and C-lobe/A-lobe, B-lobe, C-lobe/F1, F2, F3). The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases, the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains and consequently is capable of binding to both peptides and phospholipids at different sites. 99
32128 269895 cd14474 SPX_YDR089W SPX domain of the yeast protein YDR089W and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The uncharacterized yeast protein YDR089W has not been shown to be involved in phosphate homeostasis, in contrast to most of the other SPX-domain containing proteins. 144
32129 269896 cd14475 SPX_SYG1_like SPX domain of the yeast plasma protein Syg1 and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. In the yeast protein Syg1, the N-terminus binds directly to the G-protein beta subunit and inhibits transduction of the mating pheromone signal, and it co-occurs with a C-terminal domain from the EXS family. 139
32130 269897 cd14476 SPX_PHO1_like SPX domain of the plant protein PHOSPHATE1 (PHO1). This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The PHO1 gene family conserved in plants is involved in a variety of processes, most notably the transport of inorganic phosphate from the root to the shoot of the plant and mediating the response to low levels of inorganic phosphate. More recently it has become evident that PHO1 gene families have diverged in various plants and may play roles in stress response as well as the stomatal response to abscisic acid. 139
32131 269898 cd14477 SPX_XPR1_like SPX domain of the xenotropic and polytropic retrovirus receptor 1 (XPR1) and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-terminus of the human XPR1 protein (xenotropic and polytropic retrovirus receptor 1) binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that all members of this family are involved in G-protein associated signal transduction. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with xenotropic and polytropic murine leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signaling and result in cell toxicity and death. Similarity between Syg1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae, and many other diverse organisms. 161
32132 269899 cd14478 SPX_PHO87_PHO90_like SPX domain of the phosphate transporters Pho87, Pho90, Pho91, and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The SPX domain of the Saccharomyces cerevisiae membrane-localized low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX-dependent inhibition is mediated by the physical interaction with Spl2. Pho91 is involved in the export of inorganic phosphate from the vacuole to the cytosol. While both Pho87 and Pho90 transport phosphate into the cell, only Pho87 appears to also function as a sensor for high extracellular phosphate concentrations. 148
32133 269900 cd14479 SPX-MFS_plant SPX domain of proteins found in plants and stramenopiles; most have a C-terminal MFS domain. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The SPX domain is found at the amino terminus of a variety of proteins. This family, mostly found in plants, contains a C-terminal MFS domain (major facilitator superfamily), suggesting a function as a secondary transporter. The function of this N-terminal region is unclear, although it might be involved in regulating transport. 140
32134 269901 cd14480 SPX_VTC2_like SPX domain of the vacuolar transport chaperone Vtc2 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. Vtc2 is part of the Saccharomyces cerevisiae membrane-integral VTC complex, together with Vtc1, Vtc3, and Vtc4. It contains an N-terminal SPX domain next to a central polyphosphate polymerase domain and a C-terminal domain of unknown function. 135
32135 269902 cd14481 SPX_AtSPX1_like SPX domain of the plant protein SPX1 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. This family of plant proteins contains a single SPX domain. Arabidopsis thaliana SPX1 and SPX3 have been reported to play roles in the adaptation to low-phosphate conditions; SPX3 may be involved in the regulation of SPX1 activity. Oryza sativa SPX1 suppresses the regulation of expression of OsPT2, a low-affinity phosphate transporter, by the MYB-like OsPHR2. 149
32136 269903 cd14482 SPX_BAH1-like SPX domain of the E3 ubiquitin-protein ligase BAH1/NLA and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. BAH1 (benzoic acid hypersensitive 1) appears to function as an E3 ubiquitin ligase; the protein contains an SPX and a RING finger domain. It has been suggested that BAH1/NLA is involved in the regulation of plant immune responses, probably via a pathway of salicylic acid biosynthesis that includes benzoic acid as an intermediate. 156
32137 269904 cd14483 SPX_PHO81_NUC-2_like SPX domain of Pho81, NUC-2, and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 plays an important role in the phosphate-regulated signal transduction pathway in N. crassa. It shows high similarity to a cyclin-dependent kinase inhibitory protein Pho81, which is part of the phosphate regulatory cascade in S. cerevisiae. Both NUC-2 and Pho81 have a multi-domain architecture, including the SPX N-terminal domain followed by several ankyrin repeats and a putative C-terminal glycerophosphodiester phosphodiesterase domain (GDPD) with unknown function. 162
32138 269905 cd14484 SPX_GDE1_like SPX domain of Gde1 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The yeast protein Gde1/Ypl110c is similar to both NUC-2 and Pho81 in sharing their multi-domain architecture, which includes the SPX N-terminal domain followed by several ankyrin repeats and a C-terminal glycerophosphodiester phosphodiesterase domain (GDPD). Gde1 hydrolyzes intracellular glycerophosphocholine into glycerolphosphate and choline, and plays a role in the utilization of glycerophosphocholine as a source for phosphate. 134
32139 270618 cd14485 mltA_like_LT_A Domain A of MltA and related lytic transglycosylase; domain A is interrupted by domain B. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct. 159
32140 270619 cd14486 3D_domain 3D domain, named for 3 conserved aspartate residues, is found in mltA-like lytic transglycosylases and numerous other contexts. This family contains the 3D domain, named for its 3 conserved aspartates. It is found in conjunction with numerous other domains such as MltA (membrane-bound lytic murein transglycosylase A). These aspartates are critical active site residues of mltA-like lytic transglycosylases. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. MltA has 2 domains, separated by a large groove, where the peptidoglycan strand binds. The C-terminus has a double-psi beta barrel fold within the 3D domain, which forms the larger A domain along with the N-terminal region of Mlts, but is also found in various other domain architectures. Peptidoglycan (also known as murein) chains, the primary structural component of bacterial cell walls, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc); lytic transglycosylases (LTs) cleave this beta-1-4 bond. Typically, LTs are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, membrane-bound lytic murein transglycosylase E (MltE) is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and family 4 of bacteriophage origin. While most LTs are related members of the lysozyme-like lytic transglycosylase family, MltA represents a distinct fold and sequence conservation. 104
32141 271153 cd14487 AlgX_C C-terminal carbohydrate-binding domain of the alginate O-acetyltransferase AlgX. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. This N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. AlgX contains a C-terminal carbohydrate binding domain that belongs to the wider family of CBM6-CBM35-CBM36_like domains. 128
32142 271154 cd14488 CBM6-CBM35-CBM36_like_2 uncharacterized members of the carbohydrate binding module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules. 132
32143 271155 cd14489 CBM_SBP_bac_1_like Putative Carbohydrate Binding Module (CBM) of extracellular solute-binding protein family 1. Domains in this family co-occur with extracellular solute-binding domains which are periplasmic components of ABC-type sugar transport systems involved in carbohydrate transport and metabolism. Carbohydrate binding modules of family 6 (CBM6), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36) are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of co-occurring (catalytic) modules to their insoluble substrates. 150
32144 271156 cd14490 CBM6-CBM35-CBM36_like_1 uncharacterized members of the carbohydrate binding module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules. 156
32145 381186 cd14491 lipocalin_MxiM-like Shigella pilot protein MxiM and similar proteins. Shigella flexneri MxiM is a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin and component of the type-III secretion system. MxiM, also an OM protein, binds lipids and MxiD. MxiM binds and affects several features of the secretin MxiD, including its stability in the periplasm, OM association, as well as assembly into multimeric structure. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 134
32146 381187 cd14492 lipocalin_MxiM-like MxiM-like lipocalin found in Bacteroidetes. Uncharacterized proteins in this family conserved in Bacteroidetes are similar to Shigella flexneri MxiM, a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin and component of the type-III secretion system. MxiM, also an OM protein, binds lipids and MxiD. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 142
32147 381188 cd14493 lipocalin_MxiM Shigella Pilot protein MxiM. Shigella flexneri MxiM is a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin. MxiM, also an OM protein, binds lipids and MxiD. MxiD is a component of the type-III secretion system that translocates proteins through both membranes of gram-negative bacterial pathogens into host cells and requires the formation of an integral OM secretin ring. MxiM binds and affects several features of the secretin MxiD, including its stability in the periplasm, OM association, as well as assembly into multimeric structure. MxiM belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 115
32148 350344 cd14494 PTP_DSP_cys cys-based protein tyrosine phosphatase and dual-specificity phosphatase superfamily. This superfamily is composed of cys-based phosphatases, which includes classical protein tyrosine phosphatases (PTPs) as well as dual-specificity phosphatases (DUSPs or DSPs). They are characterized by a CxxxxxR conserved catalytic loop (where C is the catalytic cysteine, x is any amino acid, and R is an arginine). PTPs are part of the tyrosine phosphorylation/dephosphorylation regulatory mechanism, and are important in the response of the cells to physiologic and pathologic changes in their environment. DUSPs show more substrate diversity (including RNA and lipids) and include pTyr, pSer, and pThr phosphatases. 113
32149 350345 cd14495 PTPLP-like Protein tyrosine phosphatase-like domains of phytases and similar domains. This subfamily contains the tandem protein tyrosine phosphatase (PTP)-like domains of protein tyrosine phosphatase-like phytases (PTPLPs) and similar domains including the PTP domain of Pseudomonas syringae tyrosine-protein phosphatase hopPtoD2. PTPLPs, also known as cysteine phytases, are one of four known classes of phytases, enzymes that degrade phytate (inositol hexakisphosphate [InsP(6)]) to less-phosphorylated myo-inositol derivatives. Phytate is the most abundant cellular inositol phosphate and plays important roles in a broad scope of cellular processes, including DNA repair, RNA processing and export, development, apoptosis, and pathogenicity. PTPLPs adopt a PTP fold, including the active-site signature sequence (CX5R(S/T)) and utilize a classical PTP reaction mechanism. However, these enzymes display no catalytic activity against classical PTP substrates due to several unique structural features that confer specificity for myo-inositol polyphosphates. 278
32150 350346 cd14496 PTP_paladin protein tyrosine phosphatase-like domains of paladin. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two protein tyrosine phosphatase (PTP)-like domains. This model represents both repeats. 185
32151 350347 cd14497 PTP_PTEN-like protein tyrosine phosphatase-like domain of phosphatase and tensin homolog and similar proteins. Phosphatase and tensin homolog (PTEN) is a tumor suppressor that acts as a dual-specificity protein phosphatase and as a lipid phosphatase. It dephosphorylates phosphoinositide trisphosphate. In addition to PTEN, this family includes tensins, voltage-sensitive phosphatases (VSPs), and auxilins. They all contain a protein tyrosine phosphatase-like domain although not all are active phosphatases. Tensins are intracellular proteins that act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility, and they may or may not have phosphatase activity. VSPs are phosphoinositide phosphatases with substrates that include phosphatidylinositol-4,5-diphosphate and phosphatidylinositol-3,4,5-trisphosphate. Auxilins are J domain-containing proteins that facilitate Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles, and they do not exhibit phosphatase activity. 160
32152 350348 cd14498 DSP dual-specificity phosphatase domain. The dual-specificity phosphatase domain is found in typical and atypical dual-specificity phosphatases (DUSPs), which function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Typical DUSPs, also called mitogen-activated protein kinase (MAPK) phosphatases (MKPs), deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. Atypical DUSPs contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. Also included in this family are dual specificity phosphatase-like domains of catalytically inactive members such as serine/threonine/tyrosine-interacting protein (STYX) and serine/threonine/tyrosine interacting like 1 (STYXL1), as well as active phosphatases with substrates that are not phosphoproteins such as PTP localized to the mitochondrion 1 (PTPMT1), which is a lipid phosphatase, and laforin, which is a glycogen phosphatase. 135
32153 350349 cd14499 CDC14_C C-terminal dual-specificity phosphatase domain of CDC14 family proteins. The cell division control protein 14 (CDC14) family is highly conserved in all eukaryotes, although the roles of its members seem to have diverged during evolution. Yeast Cdc14, the best characterized member of this family, is a dual-specificity phosphatase that plays key roles in cell cycle control. It preferentially dephosphorylates cyclin-dependent kinase (CDK) targets, which makes it the main antagonist of CDK in the cell. Cdc14 functions at the end of mitosis and it triggers the events that completely eliminate the activity of CDK and other mitotic kinases. It is also involved in coordinating the nuclear division cycle with cytokinesis through the cytokinesis checkpoint, and in chromosome segregation. Cdc14 phosphatases also function in DNA replication, DNA damage checkpoint, and DNA repair. Vertebrates may contain more than one Cdc14 homolog; humans have three (CDC14A, CDC14B, and CDC14C). CDC14 family proteins contain a highly conserved N-terminal pseudophosphatase domain that contributes to substrate specificity and a C-terminal catalytic dual-specificity phosphatase domain with the PTP signature motif. 174
32154 350350 cd14500 PTP-IVa protein tyrosine phosphatase type IVA family. Protein tyrosine phosphatases type IVA (PTP-IVa), also known as protein-tyrosine phosphatases of regenerating liver (PRLs), constitute a family of small, prenylated phosphatases that are the most oncogenic of all PTPs. They stimulate progression from G1 into S phase during mitosis, enhance cell proliferation, cell motility and invasive activity, and promote cancer metastasis. They associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation. Vertebrates contain three members: PRL-1, PRL-2, and PRL-3. 156
32155 350351 cd14501 PFA-DSP plant and fungi atypical dual-specificity phosphatase. Plant and fungi atypical dual-specificity phosphatases (PFA-DSPs) are a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds. They share structural similarity with atypical- and lipid phosphatase DSPs from mammals. The PFA-DSP group is composed of active as well as inactive phosphatases. The best characterized member is Saccharomyces Siw14, also known as Oca3, which plays a role in actin filament organization and endocytosis. Siw14 has been shown to be an inositol pyrophosphate phosphatase, hydrolyzing the beta-phosphate from 5-diphosphoinositol pentakisphosphate (5PP-IP5 or IP7). 149
32156 350352 cd14502 RNA_5'-triphosphatase RNA 5'-triphosphatase domain. This family of RNA-specific cysteine phosphatases includes baculovirus RNA 5'-triphosphatase, dual specificity protein phosphatase 11 (DUSP11), and the RNA triphosphatase domains of metazoan and plant mRNA capping enzymes. RNA/polynucleotide 5'-triphosphatase (EC 3.1.3.33) catalyzes the removal of the gamma-phosphate from the 5'-triphosphate end of nascent mRNA to yield a diphosphate end. mRNA capping enzyme is a bifunctional enzyme that catalyzes the first two steps of cap formation. DUSP11 has RNA 5'-triphosphatase and diphosphatase activity, but only poor protein-tyrosine phosphatase activity. 167
32157 350353 cd14503 PTP-bact bacterial tyrosine-protein phosphatases similar to Neisseria NMA1982. This subfamily is composed of bacterial tyrosine-protein phosphatases similar to Neisseria meningitidis NMA1982, which displays phosphatase activity but whose biological function is still unknown. 136
32158 350354 cd14504 DUSP23 dual specificity phosphatase 23. Dual specificity phosphatase 23 (DUSP23), also known as VH1-like phosphatase Z (VHZ) or low molecular mass dual specificity phosphatase 3 (LDP-3), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP23 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is able to enhance activation of JNK and p38 MAPK, and has been shown to dephosphorylate p44-ERK1 (MAPK3) in vitro. It has been associated with cell growth and human primary cancers. It has also been identified as a cell-cell adhesion regulatory protein; it promotes the dephosphorylation of beta-catenin at Tyr 142 and enhances the interaction between alpha- and beta-catenin. 142
32159 350355 cd14505 CDKN3-like cyclin-dependent kinase inhibitor 3 and similar proteins. This family is composed of eukaryotic cyclin-dependent kinase inhibitor 3 (CDKN3) and related archaeal and bacterial proteins. CDKN3 is also known as kinase-associated phosphatase (KAP), CDK2-associated dual-specificity phosphatase, cyclin-dependent kinase interactor 1 (CDI1), or cyclin-dependent kinase-interacting protein 2 (CIP2). It has been characterized as a dual-specificity phosphatase, which functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and protein-tyrosine-phosphatase (EC 3.1.3.48). It dephosphorylates CDK2 at a threonine residue in a cyclin-dependent manner, resulting in the inhibition of G1/S cell cycle progression. It also interacts with CDK1 and controls progression through mitosis by dephosphorylating CDC2. CDKN3 may also function as a tumor suppressor; its loss of function was found in a variety of cancers including glioblastoma and hepatocellular carcinoma. However, it has also been found over-expressed in many cancers such as breast, cervical, lung and prostate cancers, and may also have an oncogenic function. 163
32160 350356 cd14506 PTP_PTPDC1 protein tyrosine phosphatase domain of PTP domain-containing protein 1. Protein tyrosine phosphatase domain-containing protein 1 (PTPDC1) is an uncharacterized non-receptor class protein-tyrosine phosphatase (PTP). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Small interfering RNA (siRNA) knockdown of the ptpdc1 gene is associated with elongated cilia. 206
32161 350357 cd14507 PTP-MTM-like protein tyrosine phosphatase-like domain of myotubularins. Myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Not all members are catalytically active proteins; some function as adaptors for the active members. 226
32162 350358 cd14508 PTP_tensin protein tyrosine phosphatase-like domain of tensins. The tensin family of intracellular proteins (tensin-1, -2, -3 and -4) act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Dysregulation of tensin expression has been implicated in human cancer. Tensin-1, -2, and -3 contain an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. In addition, tensin-2 contains a zinc finger N-terminal to its PTP domain. Tensin-4 is not included in this model as it does not contain a PTP-like domain. 159
32163 350359 cd14509 PTP_PTEN protein tyrosine phosphatase-like catalytic domain of phosphatase and tensin homolog. Phosphatase and tensin homolog (PTEN), also phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN or mutated in multiple advanced cancers 1 (MMAC1), is a tumor suppressor that acts as a dual-specificity protein phosphatase and as a lipid phosphatase. It is a critical endogenous inhibitor of phosphoinositide signaling. It dephosphorylates phosphoinositide trisphosphate, and therefore, has the function of negatively regulating Akt. The PTEN/PI3K/AKT pathway regulates the signaling of multiple biological processes such as apoptosis, metabolism, cell proliferation, and cell growth. PTEN contains an N-terminal PIP-binding domain, a protein tyrosine phosphatase (PTP)-like catalytic domain, a regulatory C2 domain responsible for its cellular location, a C-tail containing phosphorylation sites, and a C-terminal PDZ domain. 158
32164 350360 cd14510 PTP_VSP_TPTE protein tyrosine phosphatase-like catalytic domain of voltage-sensitive phosphatase/transmembrane phosphatase with tensin homology. Voltage-sensitive phosphatase (VSP) proteins comprise a family of phosphoinositide phosphatases with substrates that include phosphatidylinositol-4,5-diphosphate and phosphatidylinositol-3,4,5-trisphosphate. This family is conserved in deuterostomes; VSP was first identified as a sperm flagellar plasma membrane protein in Ciona intestinalis. Gene duplication events in primates resulted in the presence of paralogs, transmembrane phosphatase with tensin homology (TPTE) and TPTE2, that retain protein domain architecture but, in the case of TPTE, have lost catalytic activity. TPTE, also called cancer/testis antigen 44 (CT44), may play a role in the signal transduction pathways of the endocrine or spermatogenic function of the testis. TPTE2, also called TPTE and PTEN homologous inositol lipid phosphatase (TPIP), occurs in several differentially spliced forms; TPIP alpha displays phosphoinositide 3-phosphatase activity and is localized on the endoplasmic reticulum, while TPIP beta is cytosolic and lacks detectable phosphatase activity. VSP/TPTE proteins contain an N-terminal voltage sensor consisting of four transmembrane segments, a protein tyrosine phosphatase (PTP)-like phosphoinositide phosphatase catalytic domain, followed by a regulatory C2 domain. 177
32165 350361 cd14511 PTP_auxilin-like protein tyrosine phosphatase-like domain of auxilin and similar proteins. This subfamily contains proteins similar to auxilin, characterized by also containing a J domain. It includes auxilin, also called auxilin-1, and cyclin-G-associated kinase (GAK), also called auxilin-2. Auxilin-1 and -2 facilitate Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, while auxilin-1 is nerve-specific. Both proteins contain a protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN (a phosphoinositide 3-phosphatase), and a C-terminal region with clathrin-binding and J domains. In addition, GAK contains an N-terminal protein kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2. 164
32166 350362 cd14512 DSP_MKP dual specificity phosphatase domain of mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs, which are involved in gene regulation, cell proliferation, programmed cell death and stress responses, as an important feedback control mechanism that limits MAPK cascades. MKPs, also referred to as typical DUSPs, function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). 136
32167 350363 cd14513 DSP_slingshot dual specificity phosphatase domain of slingshot family phosphatases. The slingshot (SSH) family of dual specificity protein phosphatases is composed of Drosophila slingshot phosphatase and its vertebrate homologs: SSH1, SSH2 and SSH3. Its members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. In Drosophila, loss of ssh gene function causes prominent elevation in the levels of P-cofilin and filamentous actin and disorganized epidermal cell morphogenesis, including bifurcation phenotypes of bristles and wing hairs. SSH family phosphatases contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, many members contain a C-terminal tail. The SSH-N domain plays critical roles in P-cofilin recognition, F-actin-mediated activation, and subcellular localization of SSHs. 139
32168 350364 cd14514 DUSP14-like dual specificity protein phosphatases 14, 18, 21, 28 and similar proteins. This family is composed of dual specificity protein phosphatase 14 (DUSP14, also known as MKP-6), 18 (DUSP18), 21 (DUSP21), 28 (DUSP28), and similar proteins. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48), and are atypical DUSPs. They contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP14 directly interacts with and dephosphorylates TGF-beta-activated kinase 1 (TAK1)-binding protein 1 (TAB1) in T cells, and negatively regulates TCR signaling and immune responses. DUSP18 has been shown to interact with and dephosphorylate SAPK/JNK, and may play a role in regulating the SAPK/JNK pathway. DUSP18 and DUSP21 target to opposing sides of the mitochondrial inner membrane. DUSP28 has been implicated in hepatocellular carcinoma progression and in migratory activity and drug resistance of pancreatic cancer cells. 133
32169 350365 cd14515 DUSP3-like dual specificity protein phosphatases 3, 13, 26, 27, and similar domains. This family is composed of dual specificity protein phosphatase 3 (DUSP3, also known as VHR), 13B (DUSP13B, also known as TMDP), 26 (DUSP26, also known as MPK8), 13A (DUSP13A, also known as MDSP), dual specificity phosphatase and pro isomerase domain containing 1 (DUPD1), and inactive DUSP27. In general, DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Members of this family are atypical DUSPs; they contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. Inactive DUSP27 contains a dual specificity phosphatase-like domain with the active site cysteine substituted to serine. 148
32170 350366 cd14516 DSP_fungal_PPS1 dual specificity phosphatase domain of fungal dual specificity protein phosphatase PPS1-like. This subfamily contains fungal proteins with similarity to dual specificity protein phosphatase PPS1 from Saccharomyces cerevisiae, which has a role in the DNA synthesis phase of the cell cycle. As a dual specificity protein phosphatase, PPS1 functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It contains a C-terminal catalytic dual specificity phosphatase domain. 177
32171 350367 cd14517 DSP_STYXL1 dual specificity phosphatase-like domain of serine/threonine/tyrosine interacting like 1. Serine/threonine/tyrosine interacting like 1 (STYXL1), also known as DUSP24 and MK-STYX, is a catalytically inactive phosphatase with homology to the mitogen-activated protein kinase (MAPK) phosphatases (MKPs). STYXL1 plays a role in regulating pathways by competing with active phosphatases for binding to MAPKs. Similar to MKPs, STYXL1 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, however its C-terminal dual specificity phosphatase-like domain is a pseudophosphatase missing the catalytic cysteine. 155
32172 350368 cd14518 DSP_fungal_YVH1 dual specificity phosphatase domain of fungal YVH1-like dual specificity protein phosphatase. This family is composed of Saccharomyces cerevisiae dual specificity protein phosphatase Yvh1 and similar fungal proteins. Yvh1 could function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It regulates cell growth, sporulation, and glycogen accumulation. It plays an important role in ribosome assembly. Yvh1 associates transiently with late pre-60S particles and is required for the release of the nucleolar/nuclear pre-60S factor Mrt4, which is necessary to construct a translation-competent 60S subunit and mature ribosome stalk. Yvh1 contains an N-terminal catalytic dual specificity phosphatase domain and a C-terminal tail. 153
32173 350369 cd14519 DSP_DUSP22_15 dual specificity phosphatase domain of dual specificity protein phosphatase 22, 15, and similar proteins. Dual specificity protein phosphatase 22 (DUSP22, also known as VHX) and 15 (DUSP15, also known as VHY) function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). They are atypical DUSPs; they contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. They both contain N-terminal myristoylation recognition sequences and myristoylation regulates their subcellular location. DUSP22 negatively regulates the estrogen receptor-alpha-mediated signaling pathway and the IL6-leukemia inhibitory factor (LIF)-STAT3-mediated signaling pathway. DUSP15 has been identified as a regulator of oligodendrocyte differentiation. DUSP22 is a single domain protein containing only the catalytic dual specificity phosphatase domain while DUSP15 contains a short C-terminal tail. 136
32174 350370 cd14520 DSP_DUSP12 dual specificity phosphatase domain of dual specificity protein phosphatase 12 and similar proteins. Dual specificity protein phosphatase 12 (DUSP12), also called YVH1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP12 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It targets p38 MAPK to regulate macrophage response to bacterial infection. It also ameliorates cardiac hypertrophy in response to pressure overload through c-Jun N-terminal kinase (JNK) inhibition. DUSP12 has been identified as a modulator of cell cycle progression, a function independent of phosphatase activity and mediated by its C-terminal zinc-binding domain. 144
32175 350371 cd14521 DSP_fungal_SDP1-like dual specificity phosphatase domain of fungal dual specificity protein phosphatase SDP1, MSG5, and similar proteins. This family is composed of fungal dual specificity protein phosphatases (DUSPs) including Saccharomyces cerevisiae SDP1 and MSG5, and Schizosaccharomyces pombe Pmp1. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. SDP1 is oxidative stress-induced and dephosphorylates MAPK substrates such as SLT2. MSG5 dephosphorylates the Fus3 and Slt2 MAPKs operating in the mating and cell wall integrity (CWI) pathways, respectively. Pmp1 is responsible for dephosphorylating the CWI MAPK Pmk1. These phosphatases bind to their target MAPKs through a conserved IYT motif located outside of the dual specificity phosphatase domain. 155
32176 350372 cd14522 DSP_STYX dual specificity phosphatase-like domain of serine/threonine/tyrosine-interacting protein. Serine/threonine/tyrosine-interacting protein (STYX), also called protein tyrosine phosphatase-like protein, is a catalytically inactive member of the protein tyrosine phosphatase family that plays an integral role in regulating pathways by competing with active phosphatases for binding to MAPKs. It acts as a nuclear anchor for MAPKs, affecting their nucleocytoplasmic shuttling. 151
32177 350373 cd14523 DSP_DUSP19 dual specificity phosphatase domain of dual specificity protein phosphatase 19. Dual specificity protein phosphatase 19 (DUSP19), also called low molecular weight dual specificity phosphatase 3 (LMW-DSP3) or stress-activated protein kinase (SAPK) pathway-regulating phosphatase 1 (SKRP1), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP19 interacts with the MAPK kinase MKK7, a JNK activator, and inactivates the JNK MAPK pathway. 137
32178 350374 cd14524 PTPMT1 protein-tyrosine phosphatase mitochondrial 1. Protein-tyrosine phosphatase mitochondrial 1 or PTP localized to the mitochondrion 1 (PTPMT1), also called phosphoinositide lipid phosphatase (PLIP), phosphatidylglycerophosphatase and protein-tyrosine phosphatase 1, or PTEN-like phosphatase, is a lipid phosphatase or phosphatidylglycerophosphatase (EC 3.1.3.27) which dephosphorylates phosphatidylglycerophosphate (PGP) to phosphatidylglycerol (PG). It is targeted to the mitochondrion by an N-terminal signal sequence and is found anchored to the matrix face of the inner membrane. It is essential for the biosynthesis of cardiolipin, a mitochondrial-specific phospholipid regulating the membrane integrity and activities of the organelle. PTPMT1 also plays a crucial role in hematopoietic stem cell (HSC) function, and has been shown to display activity toward phosphoprotein substrates. 149
32179 350375 cd14526 DSP_laforin-like dual specificity phosphatase domain of laforin and similar domains. This family is composed of glucan phosphatases including vertebrate dual specificity protein phosphatase laforin, also called lafora PTPase (LAFPTPase), and plant starch excess4 (SEX4). Laforin is a glycogen phosphatase; its gene is mutated in Lafora progressive myoclonus epilepsy or Lafora disease (LD), a fatal autosomal recessive neurodegenerative disorder characterized by the presence of progressive neurological deterioration, myoclonus, and epilepsy. One characteristic of LD is the accumulation of insoluble glucans. Laforin prevents LD by at least two mechanisms: by preventing hyperphosphorylation of glycogen by dephosphorylating it, allowing proper glycogen formation, and by promoting the ubiquitination of proteins involved in glycogen metabolism via its interaction with malin. Laforin contains an N-terminal CBM20 (carbohydrate-binding module, family 20) domain and a C-terminal catalytic dual specificity phosphatase (DSP) domain. Plant SEX4 regulates starch metabolism by selectively dephosphorylating glucose moieties within starch glucan chains. It contains an N-terminal catalytic DSP domain and a C-terminal Early (E) set domain. 146
32180 350376 cd14527 DSP_bac unknown subfamily of bacterial and plant dual specificity protein phosphatases. This subfamily is composed of uncharacterized bacterial and plant dual-specificity protein phosphatases. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). 136
32181 350377 cd14528 PFA-DSP_Siw14 atypical dual specificity phosphatases similar to yeast Siw14. This subfamily contains Saccharomyces Siw14 and a novel phosphatase from the Arabidopsis thaliana gene locus At1g05000. Siw14, also known as Oca3, plays a role in actin filament organization and endocytosis. Siw14 has been shown to be an inositol pyrophosphate phosphatase, hydrolyzing the beta-phosphate from 5-diphosphoinositol pentakisphosphate (5PP-IP5 or IP7). The At1g05000 protein, also called AtPFA-DSP1, has been shown to have highest activity toward polyphosphate (poly-P(12-13)) and deoxyribo- and ribonucleoside triphosphates, and less activity toward phosphoenolpyruvate, phosphotyrosine, phosphotyrosine-containing peptides, and phosphatidylinositols. This subfamily belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). 148
32182 350378 cd14529 TpbA-like bacterial protein tyrosine and dual-specificity phosphatases related to Pseudomonas aeruginosa TpbA. This subfamily contains bacterial protein tyrosine phosphatases (PTPs) and dual-specificity phosphatases (DUSPs) related to Pseudomonas aeruginosa TpbA, a DUSP that negatively regulates biofilm formation by converting extracellular quorum sensing signals and to Mycobacterium tuberculosis PtpB, a PTP virulence factor that attenuates host immune defenses by interfering with signal transduction pathways in macrophages. PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides, while DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and PTPs. 158
32183 350379 cd14531 PFA-DSP_Oca1 atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 1. Oxidant-induced cell-cycle arrest protein 1 (Oca1) is an atypical dual specificity phosphatase whose gene is required for G1 arrest in response to the lipid oxidation product linoleic acid hydroperoxide. It may function in linking growth, stress responses, and the cell cycle. Oca1 belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). 149
32184 350380 cd14532 PTP-MTMR6-like protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatases 6, 7, and 8. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTMR6, MTMR7 and MTMR8, and related domains. Beside the phosphatase domain, they contain a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR6, MTMR7 and MTMR8 form complexes with catalytically inactive MTMR9, and display differential substrate preferences. In cells, the MTMR6/R9 complex significantly increases the cellular levels of PtdIns(5)P, the product of PI(3,5)P(2) dephosphorylation, whereas the MTMR8/R9 complex reduces cellular PtdIns(3)P levels. The MTMR6/R9 complex serves to inhibit stress-induced apoptosis while the MTMR8/R9 complex inhibits autophagy. 301
32185 350381 cd14533 PTP-MTMR3-like protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatases 3 and 4. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTMR3, also known as ZFYVE10, and MTMR4, also known as ZFYVE11, and related domains. Beside the phosphatase domain, they contain a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. 229
32186 350382 cd14534 PTP-MTMR5-like protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatases 5 and 13. This subgroup of enzymatically inactive phosphatase domains of myotubularins consists of MTMR5, also known as SET binding factor 1 (SBF1) and MTMR13, also known as SET binding factor 2 (SBF2), and similar domains. Beside the pseudophosphatase domain, they contain a variety of other domains, including a DENN and a PH-like domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR5 and MTMR13 are pseudophosphatases that lack the catalytic cysteine in their catalytic pocket. Mutations in MTMR13 cause Charcot-Marie-Tooth type 4B2, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a similar phenotype as mutations in MTMR2. Mutations in the MTMR5 gene cause Charcot-Marie-Tooth disease type 4B3. MTMR5 and MTMR13 interact with MTMR2 and stimulate its phosphatase activity. 274
32187 350383 cd14535 PTP-MTM1-like protein tyrosine phosphatase-like domain of myotubularin, and myotubularin related phosphoinositide phosphatases 1 and 2. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTM1, MTMR1 and MTMR2. All contain an additional N-terminal PH-GRAM domain and C-terminal coiled-coil domain and PDZ binding site. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. 249
32188 350384 cd14536 PTP-MTMR9 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 9. Myotubularin related phosphoinositide phosphatase 9 (MTMR9) is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. Mutations have been associated with obesity and metabolic syndrome. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR9 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It forms complexes with catalytically active MTMR6, MTMR7 and MTMR8, and regulates their activities; the complexes display differential substrate preferences. The MTMR6/R9 complex serves to inhibit stress-induced apoptosis while the MTMR8/R9 complex inhibits autophagy. 224
32189 350385 cd14537 PTP-MTMR10-like protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatases 10, 11, and 12. This subgroup of enzymatically inactive phosphatase domains of myotubularins consists of MTMR10, MTMR11, MTMR12, and similar proteins. Beside the phosphatase domain, they contain an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR10, MTMR11, and MTMR12 are pseudophosphatases that lack the catalytic cysteine in their catalytic pocket. MTMR12 functions as an adapter for the catalytically active myotubularin to regulate its intracellular location. 200
32190 350386 cd14538 PTPc-N20_13 catalytic domain of tyrosine-protein phosphatase non-receptor type 20 and type 13. Tyrosine-protein phosphatase non-receptor type 20 (PTPN20) and type 13 (PTPN13, also known as PTPL1) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN20 is a widely expressed phosphatase with a dynamic subcellular distribution that is targeted to sites of actin polymerization. Human PTPN13 is an important regulator of tumor aggressiveness. 207
32191 350387 cd14539 PTP-N23 PTP-like domain of tyrosine-protein phosphatase non-receptor type 23. Tyrosine-protein phosphatase non-receptor type 23 (PTPN23), also called His domain-containing protein tyrosine phosphatase (HD-PTP) or protein tyrosine phosphatase TD14 (PTP-TD14), is a catalytically inactive member of the tyrosine-specific protein tyrosine phosphatase (PTP) family. Human PTPN23 may be involved in the regulation of small nuclear ribonucleoprotein assembly and pre-mRNA splicing by modifying the survival motor neuron (SMN) complex. It plays a role in ciliogenesis and is part of endosomal sorting complex required for transport (ESCRT) pathways. PTPN23 contains five domains: a BRO1-like domain that plays a role in endosomal sorting; a V-domain that interacts with Lys63-linked polyubiquitinated substrates; a central proline-rich region that might recruit SH3-containing proteins; a PTP-like domain; and a proteolytic degradation-targeting motif, also known as a PEST sequence. 205
32192 350388 cd14540 PTPc-N21_14 catalytic domain of tyrosine-protein phosphatase non-receptor type 21 and type 14. Tyrosine-protein phosphatase non-receptor type 21 (PTPN21) and type 14 (PTPN14) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Both PTPN21 and PTPN14 contain an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence. 219
32193 350389 cd14541 PTPc-N3_4 catalytic domain of tyrosine-protein phosphatase non-receptor type 3 and type 4. Tyrosine-protein phosphatase non-receptor type 3 (PTPN3) and type 4 (PTPN4) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN3 and PTPN4 are large modular proteins containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain. PTPN3 interacts with mitogen-activated protein kinase p38gamma and serves as its specific phosphatase. PTPN4 functions in TCR cell signaling, apoptosis, cerebellar synaptic plasticity, and innate immune responses. 212
32194 350390 cd14542 PTPc-N22_18_12 catalytic domain of tyrosine-protein phosphatase non-receptor type 22, type 18 and type 12. Tyrosine-protein phosphatase non-receptor type 22 (PTPN22), type 18 (PTPN18) and type 12 (PTPN12) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN22 is expressed in hematopoietic cells and it functions as a key regulator of immune homeostasis by inhibiting T-cell receptor signaling through the direct dephosphorylation of Src family kinases (Lck and Fyn), ITAMs of the TCRz/CD3 complex, and other signaling molecules. PTPN18 regulates HER2-mediated cellular functions through defining both its phosphorylation and ubiquitination states. PTPN12 is characterized as a tumor suppressor and a pivotal regulator of EGFR/HER2 signaling. 202
32195 350391 cd14543 PTPc-N9 catalytic domain of tyrosine-protein phosphatase non-receptor type 9. Tyrosine-protein phosphatase non-receptor type 9 (PTPN9), also called protein-tyrosine phosphatase MEG2, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN9 plays an important role in promoting intracellular secretory vesicle fusion in hematopoietic cells and promotes the dephosphorylation of ErbB2 and EGFR in breast cancer cells, leading to impaired activation of STAT5 and STAT3. It also directly dephosphorylates STAT3 at the Tyr705 residue, resulting in its inactivation. PTPN9 has been found to be dysregulated in various human cancers, including breast, colorectal, and gastric cancer. 271
32196 350392 cd14544 PTPc-N11_6 catalytic domain of tyrosine-protein phosphatase non-receptor type 11 and type 6. Tyrosine-protein phosphatase non-receptor type 11 (PTPN11) and type 6 (PTPN6) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN11 and PTPN6 are also called SH2 domain-containing tyrosine phosphatase 2 (SHP2) and 1 (SHP1), respectively. They contain two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties. Although structurally similar, they have different localization and different roles in signal transduction. PTPN11/SHP2 is expressed ubiquitously and plays a positive role in cell signaling, leading to cell activation, while PTPN6/SHP1 expression is restricted mainly to hematopoietic and epithelial cells and functions as a negative regulator of signaling events. 251
32197 350393 cd14545 PTPc-N1_2 catalytic domain of tyrosine-protein phosphatase non-receptor type 1 and type 2. Tyrosine-protein phosphatase non-receptor type 1 (PTPN1) and type 2 (PTPN2) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN1 (or PTP-1B) is the first PTP to be purified and characterized and is the prototypical intracellular PTP found in a wide variety of human tissues. It dephosphorylates and regulates the activity of a number of receptor tyrosine kinases, including the insulin receptor, the EGF receptor, and the PDGF receptor. PTPN2 (or TCPTP), a tumor suppressor, dephosphorylates and inactivates EGFRs, Src family kinases, Janus-activated kinases (JAKs)-1 and -3, and signal transducer and activators of transcription (STATs)-1, -3 and -5, in a cell type and context-dependent manner. 231
32198 350394 cd14546 R-PTP-N-N2 PTP-like domain of receptor-type tyrosine-protein phosphatase-like N and N2. Receptor-type tyrosine-protein phosphatase-like N (PTPRN) and N2 (PTPRN2) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). They consist of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. They are mainly expressed in neuropeptidergic neurons and peptide-secreting endocrine cells, including insulin-producing pancreatic beta-cells, and are involved in the generation, cargo storage, traffic, exocytosis and recycling of insulin secretory granules, as well as in beta-cell proliferation. They also are major autoantigens in type 1 diabetes and are involved in the regulation of insulin secretion. 208
32199 350395 cd14547 PTPc-KIM catalytic domain of the kinase interaction motif (KIM) family of protein-tyrosine phosphatases. The kinase interaction motif (KIM) family of protein-tyrosine phosphatases (PTPs) includes tyrosine-protein phosphatases non-receptor type 7 (PTPN7) and non-receptor type 5 (PTPN5), and protein-tyrosine phosphatase receptor type R (PTPRR). PTPN7 is also called hematopoietic protein-tyrosine phosphatase (HePTP) while PTPN5 is also called striatal-enriched protein-tyrosine phosphatase (STEP). They belong to the family of classical tyrosine-specific PTPs (EC 3.1.3.48) that catalyze the dephosphorylation of phosphotyrosine peptides. KIM-PTPs are characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. They are highly specific to the MAPKs ERK1/2 (extracellular-signal-regulated kinase 1/2) and p38, over JNK (c-Jun N-terminal kinase); they dephosphorylate these kinases and thereby critically modulate cell proliferation and differentiation. 224
32200 350396 cd14548 R3-PTPc catalytic domain of R3 subfamily receptor-type tyrosine-protein phosphatases and similar proteins. R3 subfamily receptor-type phosphotyrosine phosphatases (RPTP) are characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. Vertebrate members include receptor-type tyrosine-protein phosphatase-like O (PTPRO), J (PTPRJ), Q (PTPRQ), B (PTPRB), V (PTPRV) and H (PTPRH). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Most members are PTPs, except for PTPRQ, which dephosphorylates phosphatidylinositide substrates. PTPRV is characterized only in rodents; its function has been lost in humans. Both vertebrate and invertebrate R3 subfamily RPTPs are involved in the control of a variety of cellular processes, including cell growth, differentiation, mitotic cycle and oncogenic transformation. 222
32201 350397 cd14549 R5-PTPc-1 catalytic domain of R5 subfamily receptor-type tyrosine-protein phosphatases, repeat 1. The R5 subfamily of receptor-type phosphotyrosine phosphatases (RPTP) is composed of receptor-type tyrosine-protein phosphatase Z (PTPRZ) and G (PTPRG). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. They are type 1 integral membrane proteins consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1). 204
32202 350398 cd14550 R5-PTP-2 PTP-like domain of R5 subfamily receptor-type tyrosine-protein phosphatases, repeat 2. The R5 subfamily of receptor-type phosphotyrosine phosphatases (RPTP) is composed of receptor-type tyrosine-protein phosphatase Z (PTPRZ) and G (PTPRG). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. They are type 1 integral membrane proteins consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2). 200
32203 350399 cd14551 R-PTPc-A-E-1 catalytic domain of receptor-type tyrosine-protein phosphatase A and E, repeat 1. Receptor-type tyrosine-protein phosphatase A (PTPRA) and E (PTPRE) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA and PTPRE share several functions including regulation of Src family kinases and voltage-gated potassium (Kv) channels. They both contain a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the first catalytic PTP domain (repeat 1). 202
32204 350400 cd14552 R-PTPc-A-E-2 catalytic domain of receptor-type tyrosine-protein phosphatase A and E, repeat 2. Receptor-type tyrosine-protein phosphatase A (PTPRA) and E (PTPRE) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA and PTPRE share several functions including regulation of Src family kinases and voltage-gated potassium (Kv) channels. They both contain a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2). 202
32205 350401 cd14553 R-PTPc-LAR-1 catalytic domain of LAR family receptor-type tyrosine-protein phosphatases, repeat 1. The LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs) include three vertebrate members: LAR (or PTPRF), R-PTP-delta (or PTPRD), and R-PTP-sigma (or PTPRS). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules; they bind to distinct synaptic membrane proteins and are physiologically responsible for mediating presynaptic development by shaping various synaptic adhesion pathways. They play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. LAR-RPTPs contain an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1). 238
32206 350402 cd14554 R-PTP-LAR-2 PTP-like domain of the LAR family receptor-type tyrosine-protein phosphatases, repeat 2. The LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs) include three vertebrate members: LAR (or PTPRF), R-PTP-delta (or PTPRD), and R-PTP-sigma (or PTPRS). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules; they bind to distinct synaptic membrane proteins and are physiologically responsible for mediating presynaptic development by shaping various synaptic adhesion pathways. They play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. LAR-RPTPs contain an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). 238
32207 350403 cd14555 R-PTPc-typeIIb-1 catalytic domain of type IIb (or R2B) subfamily receptor-type tyrosine-protein phosphatases, repeat 1. The type IIb (or R2B) subfamily of receptor protein tyrosine phosphatases (RPTPs) include the prototypical member PTPmu (or PTPRM), PCP-2 (or PTPRU), PTPrho (or PTPRT), and PTPkappa (or PTPRK). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Type IIb RPTPs mediate cell-cell adhesion through homophilic interactions; their ligand is an identical molecule on an adjacent cell. No heterophilic interactions between the subfamily members have been observed. They also commonly function as tumor suppressors. They contain an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain. 204
32208 350404 cd14556 R-PTPc-typeIIb-2 PTP domain of type IIb (or R2B) subfamily receptor-type tyrosine-protein phosphatases, repeat 2. The type IIb (or R2B) subfamily of receptor protein tyrosine phosphatases (RPTPs) include the prototypical member PTPmu (or PTPRM), PCP-2 (or PTPRU), PTPrho (or PTPRT), and PTPkappa (or PTPRK). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Type IIb RPTPs mediate cell-cell adhesion through homophilic interactions; their ligand is an identical molecule on an adjacent cell. No heterophilic interactions between the subfamily members have been observed. They also commonly function as tumor suppressors. They contain an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain. 201
32209 350405 cd14557 R-PTPc-C-1 catalytic domain of receptor-type tyrosine-protein phosphatase C, repeat 1. Receptor-type tyrosine-protein phosphatase C (PTPRC), also known as CD45, leukocyte common antigen (LCA) or GP180, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRC/CD45 is found in all nucleated hematopoietic cells and is an essential regulator of T- and B-cell antigen receptor signaling. It controls immune response, both positively and negatively, by dephosphorylating a number of signaling molecules such as the Src family kinases, the CD3zeta chain of TCY, and ZAP-70 kinase. Mutations in the human PTPRC/CD45 gene are associated with severe combined immunodeficiency (SCID) and multiple sclerosis. PTPRC/CD45 contains an extracellular receptor-like region with fibronectin type III (FN3) repeats, a short transmembrane segment, and a cytoplasmic region comprising a membrane proximal catalytically active PTP domain (repeat 1 or D1) and a membrane distal catalytically impaired PTP-like domain (repeat 2, or D2). This model represents repeat 1. 201
32210 350406 cd14558 R-PTP-C-2 PTP-like domain of receptor-type tyrosine-protein phosphatase C, repeat 2. Receptor-type tyrosine-protein phosphatase C (PTPRC), also known as CD45, leukocyte common antigen (LCA) or GP180, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRC/CD45 is found in all nucleated hematopoietic cells and is an essential regulator of T- and B-cell antigen receptor signaling. It controls immune response, both positively and negatively, by dephosphorylating a number of signaling molecules such as the Src family kinases, the CD3zeta chain of TCY, and ZAP-70 kinase. Mutations in the human PTPRC/CD45 gene are associated with severe combined immunodeficiency (SCID) and multiple sclerosis. PTPRC/CD45 contains an extracellular receptor-like region with fibronectin type III (FN3) repeats, a short transmembrane segment, and a cytoplasmic region comprising a membrane proximal catalytically active PTP domain (repeat 1 or D1) and a membrane distal catalytically impaired PTP-like domain (repeat 2, or D2). This model represents repeat 2. 203
32211 350407 cd14559 PTP_YopH-like YopH and related bacterial protein tyrosine phosphatases. Yersinia outer protein H (YopH) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. YopH is an essential virulence determinant of the pathogenic bacterium by dephosphorylating several focal adhesion proteins including p130Cas in human epithelial cells, resulting in the disruption of focal adhesions and cell detachment from the extracellular matrix. It contains an N-terminal domain that contains signals required for TTSS-mediated delivery of YopH into host cells and a C-terminal catalytic PTP domain. 227
32212 350408 cd14560 PTP_tensin-1 protein tyrosine phosphatase-like domain of tensin-1. Tensin-1 (TNS1) is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. It plays an essential role in TGF-beta-induced myofibroblast differentiation and myofibroblast-mediated formation of extracellular fibronectin and collagen matrix. It also positively regulates RhoA activity through its interaction with DLC1, a RhoGAP-containing tumor suppressor; the tensin-1-DLC1-RhoA signaling axis is critical in regulating cellular functions that lead to angiogenesis. Tensin-1 contains an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. 159
32213 350409 cd14561 PTP_tensin-3 protein tyrosine phosphatase-like domain of tensin-3. Tensin-3 (TNS3) is also called tensin-like SH2 domain-containing protein 1 (TENS1) or tumor endothelial marker (TEM6). It is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Tensin-3 contributes to cell migration, anchorage-independent growth, tumorigenesis, and metastasis of cancer cells. It cooperates with Dock5, an exchange factor for the small GTPase Rac, for osteoclast activity to ensure the correct organization of podosomes. Tensin-3 contains an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. 159
32214 350410 cd14562 PTP_tensin-2 protein tyrosine phosphatase-like domain of tensin-2. Tensin-2 (TNS2) is also called tensin-like C1 domain-containing phosphatase (TENC1) or C1 domain-containing phosphatase and tensin homolog (C1-TEN). It is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Tensin-2 is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It also modulates cell contractility and remodeling of collagen fibers through the DLC1, a RhoGAP that binds to tensins in focal adhesions. Tensin-2 may have phosphatase activity; it reduces AKT1 phosphorylation. It contains an N-terminal region with a zinc finger, a protein tyrosine phosphatase (PTP)-like domain and a protein kinase 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. 159
32215 350411 cd14563 PTP_auxilin_N N-terminal protein tyrosine phosphatase-like domain of auxilin. Auxilin, also called auxilin-1 or DnaJ homolog subfamily C member 6 (DNAJC6), is a J-domain containing protein that recruits the ATP-dependent chaperone Hsc70 to newly budded clathrin-coated vesicles and promotes uncoating of clathrin-coated vesicles, driving the clathrin assembly/disassembly cycle. Mutations in the DNAJC6 gene, encoding auxilin, are associated with early-onset Parkinson's disease. Auxilin contains an N-terminal protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN, a phosphoinositide 3-phosphatase, and a C-terminal region with clathrin-binding and J domains. 163
32216 350412 cd14564 PTP_GAK protein tyrosine phosphatase-like domain of cyclin-G-associated kinase. Cyclin-G-associated kinase (GAK), also called auxilin-2, contains an N-terminal protein kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2. In addition, it contains an auxilin-1-like domain structure consisting of a protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN (a phosphoinositide 3-phosphatase), and a C-terminal region with clathrin-binding and J domains. Like auxilin-1, GAK facilitates Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, unlike auxilin-1 which is nerve-specific. GAK also plays regulatory roles outside of clathrin-mediated membrane traffic including the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses through interaction with the interleukin 12 receptor. 163
32217 350413 cd14565 DSP_MKP_classI dual specificity phosphatase domain of class I mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class I MKPs consist of DUSP1/MKP-1, DUSP2 (PAC1), DUSP4/MKP-2 and DUSP5. They are all mitogen- and stress-inducible nuclear MKPs. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 138
32218 350414 cd14566 DSP_MKP_classII dual specificity phosphatase domain of class II mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class II MKPs consist of DUSP6/MKP-3, DUSP7/MKP-X and DUSP9/MKP-4, and are ERK-selective cytoplasmic MKPs. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 137
32219 350415 cd14567 DSP_DUSP10 dual specificity phosphatase domain of dual specificity protein phosphatase 10. Dual specificity protein phosphatase 10 (DUSP10), also called mitogen-activated protein kinase (MAPK) phosphatase 5 (MKP-5), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP10/MKP-5 coordinates skeletal muscle regeneration by negatively regulating mitochondria-mediated apoptosis. It is also an important regulator of intestinal epithelial barrier function and a suppressor of colon tumorigenesis. DUSP10/MKP-5 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 152
32220 350416 cd14568 DSP_MKP_classIII dual specificity phosphatase domain of class III mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class III MKPs consist of DUSP8, DUSP10/MKP-5 and DUSP16/MKP-7, and are JNK/p38-selective phosphatases, which are found in both the cell nucleus and cytoplasm. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 140
32221 350417 cd14569 DSP_slingshot_2 dual specificity phosphatase domain of slingshot homolog 2. Dual specificity protein phosphatase slingshot homolog 2 (SSH2), also called SSH-like protein 2, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. SSH2 has been identified as a target of protein kinase D1 that regulates cofilin phosphorylation and remodeling of the actin cytoskeleton during neutrophil chemotaxis. There are at least two human SSH2 isoforms reported: hSSH-2L (long) and hSSH-2. As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, hSSH-2L contains a long C-terminal tail while hSSH-2 does not. 144
32222 350418 cd14570 DSP_slingshot_1 dual specificity phosphatase domain of slingshot homolog 1. Dual specificity protein phosphatase slingshot homolog 1 (SSH1), also called SSH-like protein 1, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. SSH1 links NOD1 signaling to actin remodeling, facilitating the changes that lead to NF-kappaB activation and innate immune responses. There are at least two human SSH1 isoforms reported: hSSH-1L (long) and hSSH-1S (short). As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. They also contain C-terminal tails, differing in the lengths of the tail. 144
32223 350419 cd14571 DSP_slingshot_3 dual specificity phosphatase domain of slingshot homolog 3. Dual specificity protein phosphatase slingshot homolog 3 (SSH3), also called SSH-like protein 3, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. The Xenopus homolog (xSSH) is involved in the gastrulation movement. Mouse SSH3 dephosphorylates actin-depolymerizing factor (ADF) and cofilin but is dispensable for development. There are at least two human SSH3 isoforms reported: hSSH-3L (long) and hSSH-3. As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, hSSH-3L contains a C-terminal tail while hSSH-3 does not. 144
32224 350420 cd14572 DUSP14 dual specificity protein phosphatase 14. Dual specificity protein phosphatase 14 (DUSP14), also called mitogen-activated protein kinase (MAPK) phosphatase 6 (MKP-6) or MKP-1-like protein tyrosine phosphatase (MKP-L), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP14 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP14 dephosphorylates JNK, ERK, and p38 in vitro. It also directly interacts with and dephosphorylates TGF-beta-activated kinase 1 (TAK1)-binding protein 1 (TAB1) in T cells, and negatively regulates TCR signaling and immune responses. 150
32225 350421 cd14573 DUSP18_21 dual specificity protein phosphatases 18 and 21. This subfamily contains dual specificity protein phosphatase 18 (DUSP18), dual specificity protein phosphatase 21 (DUSP21), and similar proteins. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48), and are atypical DUSPs. They contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP18, also called low molecular weight dual specificity phosphatase 20 (LMW-DSP20), is a catalytically active phosphatase with a preference for phosphotyrosine over phosphoserine/threonine oligopeptides in vitro. In vivo, it has been shown to interact with and dephosphorylate SAPK/JNK, and may play a role in regulating the SAPK/JNK pathway. DUSP21 is also called low molecular weight dual specificity phosphatase 21 (LMW-DSP21). Its gene has been identified as a potential therapeutic target in human hepatocellular carcinoma. DUSP18 and DUSP21 target to opposing sides of the mitochondrial inner membrane. 158
32226 350422 cd14574 DUSP28 dual specificity protein phosphatase 28. Dual specificity protein phosphatase 28 (DUSP28), also called VHP, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It is an atypical DUSP that contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It has been implicated in hepatocellular carcinoma progression and in migratory activity and drug resistance of pancreatic cancer cells. DUSP28 has an exceptionally low phosphatase activity due to the presence of bulky residues in the active site pocket resulting in low accessibility. 140
32227 350423 cd14575 DUPD1 dual specificity phosphatase and pro isomerase domain containing 1. Dual specificity phosphatase and pro isomerase domain containing 1 (DUPD1) was initially named as such because computational prediction appeared to encode a protein of 446 amino acids in length that included two catalytic domains: a proline isomerase and a dual specificity phosphatase (DUSP). However, it was subsequently shown that the true open reading frame only encompassed the DUSP domain and the gene product was therefore renamed DUSP27. This is distinct from inactive DUSP27. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). DUPD1/DUSP27 has been shown to have catalytic activity with preference for phosphotyrosine over phosphothreonine and phosphoserine residues. It associates with the short form of the prolactin (PRL) receptor and plays a role in PRL-mediated MAPK inhibition in ovarian cells. 160
32228 350424 cd14576 DSP_iDUSP27 dual specificity phosphatase-like domain of inactive dual specificity protein phosphatase 27. Inactive dual specificity protein phosphatase 27 (DUSP27) may play a role in myofiber maturation. It is a pseudophosphatase containing a substitution of the active site cysteine into a serine. It is a large protein of more than 1000 amino acids in length with an N-terminal dual specificity phosphatase-like domain. 159
32229 350425 cd14577 DUSP13B dual specificity protein phosphatase 13 isoform B. Dual specificity protein phosphatase 13 isoform B (DUSP13B), also called testis- and skeletal-muscle-specific DSP (TMDP) or dual specificity phosphatase SKRP4, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP13B is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP13B inactivates MAPK activation in the order of selectivity, JNK = p38 > ERK in cells. It may play a role in protection from external stress during spermatogenesis. 163
32230 350426 cd14578 DUSP26 dual specificity protein phosphatase 26. Dual specificity protein phosphatase 26 (DUSP26), also called mitogen-activated protein kinase (MAPK) phosphatase 8 (MKP-8) or low-molecular-mass dual-specificity phosphatase 4 (LDP-4), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP26 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is a brain phosphatase highly overexpressed in neuroblastoma and has also been identified as a p53 phosphatase, dephosphorylating phospho-Ser20 and phospho-Ser37 in the p53 transactivation domain. 144
32231 350427 cd14579 DUSP3 dual specificity protein phosphatase 3. Dual specificity protein phosphatase 3 (DUSP3), also called vaccinia H1-related phosphatase (VHR), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP3 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It favors bisphosphorylated substrates over monophosphorylated ones, and prefers pTyr peptides over pSer/pThr peptides. Reported physiological substrates include MAPKs ERK1/2, JNK, and p38, as well as STAT5, EGFR, and ErbB2. DUSP3 has been linked to breast and prostate cancer, and may also play a role in thrombosis. 168
32232 350428 cd14580 DUSP13A dual specificity protein phosphatase 13 isoform A. Dual specificity protein phosphatase 13 isoform A (DUSP13A), also called branching-enzyme interacting DSP or muscle-restricted DSP (MDSP), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP13A is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP13A also functions as a regulator of apoptosis signal-regulating kinase 1 (ASK1), a MAPK kinase kinase, by interacting with its N-terminal domain and inducing ASK1-mediated apoptosis through the activation of caspase-3. This function is independent of phosphatase activity. 145
32233 350429 cd14581 DUSP22 dual specificity protein phosphatase 22. Dual specificity protein phosphatase 22 (DUSP22), also called JNK-stimulatory phosphatase-1 (JSP-1), low molecular weight dual specificity phosphatase 2 (LMW-DSP2), mitogen-activated protein kinase phosphatase x (MKP-x) or VHR-related MKPx (VHX), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP22 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP22 negatively regulates the estrogen receptor-alpha-mediated signaling pathway and the IL6-leukemia inhibitory factor (LIF)-STAT3-mediated signaling pathway. It also regulates cell death by acting as a scaffold protein for the ASK1-MKK7-JNK signal transduction pathway independently of its phosphatase activity. 149
32234 350430 cd14582 DSP_DUSP15 dual specificity phosphatase domain of dual specificity protein phosphatase 15. Dual specificity protein phosphatase 15 (DUSP15), also called Vaccinia virus VH1-related dual-specific protein phosphatase Y (VHY) or VH1-related member Y, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). DUSP15 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is highly expressed in the testis and is located in the plasma membrane in a myristoylation-dependent manner. It may be involved in the regulation of meiotic signal transduction in testis cells. It is also expressed in the brain and has been identified as a regulator of oligodendrocyte differentiation. DUSP15 contains an N-terminal catalytic dual specificity phosphatase domain and a short C-terminal tail. 146
32235 350431 cd14583 PTP-MTMR7 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 7. Myotubularin related phosphoinositide phosphatase 7 (MTMR7) is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. In neuronal cells, MTMR7 forms a complex with catalytically inactive MTMR9 and dephosphorylates phosphatidylinositol 3-phosphate and Ins(1,3)P2. 302
32236 350432 cd14584 PTP-MTMR8 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 8. Myotubularin related phosphoinositide phosphatase 8 (MTMR8) is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR8 forms a complex with catalytically inactive MTMR9 and preferentially dephosphorylates PtdIns(3)P; the MTMR8/R9 complex inhibits autophagy. In zebrafish, it cooperates with PI3K to regulate actin filament modeling and muscle development. 308
32237 350433 cd14585 PTP-MTMR6 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 6. Myotubularin related phosphoinositide phosphatase 6 (MTMR6) is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR6 forms a complex with catalytically inactive MTMR9 and preferentially dephosphorylates PtdIns(3,5)P(2); the MTMR6/R9 complex serves to inhibit stress-induced apoptosis. 302
32238 350434 cd14586 PTP-MTMR3 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 3. Myotubularin related phosphoinositide phosphatase 3 (MTMR3), also known as FYVE domain-containing dual specificity protein phosphatase 1 (FYVE-DSP1) or Zinc finger FYVE domain-containing protein 10 (ZFYVE10), is enzymatically active and contains a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Together with phosphoinositide 5-kinase PIKfyve, phosphoinositide 3-phosphatase MTMR3 constitutes a phosphoinositide loop that produces PI(5)P via PI(3,5)P2 and regulates cell migration. 317
32239 350435 cd14587 PTP-MTMR4 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 4. Myotubularin related phosphoinositide phosphatase 4 (MTMR4), also known as FYVE domain-containing dual specificity protein phosphatase 2 (FYVE-DSP2) or zinc finger FYVE domain-containing protein 11 (ZFYVE11), is enzymatically active and contains a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR4 localizes at the interface of early and recycling endosomes to regulate trafficking through this pathway. It plays a role in bacterial pathogenesis by stabilizing the integrity of bacteria-containing vacuoles. 308
32240 350436 cd14588 PTP-MTMR5 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 5. Myotubularin related phosphoinositide phosphatase 5 (MTMR5), also known as SET binding factor 1 (SBF1), is enzymatically inactive and contains a variety of other domains, including a DENN and a PH-like domain. Mutations in the MTMR5 gene cause Charcot-Marie-Tooth disease type 4B3. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR5 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It interacts with MTMR2, an active myotubularin related phosphatidylinositol phosphatase, and regulates its enzymatic activity and subcellular location. 291
32241 350437 cd14589 PTP-MTMR13 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 13. Myotubularin related phosphoinositide phosphatase 13 (MTMR13), also known as SET binding factor 2 (SBF2), is enzymatically inactive and contains a variety of other domains, including a DENN and a PH-like domain. Mutations in MTMR13 cause Charcot-Marie-Tooth type 4B2, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a phenotype similar to that caused by mutations in MTMR2. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR13 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It is believed to interact with MTMR2 and stimulate its phosphatase activity. It is also a guanine nucleotide exchange factor (GEF) which may activate RAB28, promoting the exchange of GDP for GTP and converting inactive GDP-bound Rab proteins into their active GTP-bound form. 297
32242 350438 cd14590 PTP-MTMR2 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 2. Myotubularin related phosphoinositide phosphatase 2 (MTMR2) is enzymatically active and contains an additional N-terminal PH-GRAM domain, a C-terminal coiled-coil domain, and a PDZ binding site. Mutations in MTMR2 cause Charcot-Marie-Tooth type 4B1, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a phenotype similar to that caused by mutations in MTMR13. MTMR13, an inactive phosphatase, is believed to interact with MTMR2 and stimulate its phosphatase activity. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. 262
32243 350439 cd14591 PTP-MTM1 protein tyrosine phosphatase-like domain of myotubularin phosphoinositide phosphatase 1. Myotubularin phosphoinositide phosphatase 1 (MTM1), also called myotubularin, is enzymatically active and contains an N-terminal PH-GRAM domain, a C-terminal coiled-coil domain, and a PDZ binding site. Mutations in MTM1 cause X-linked myotubular myopathy. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. 249
32244 350440 cd14592 PTP-MTMR1 protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 1. Myotubularin-related phosphoinositide phosphatase 1 (MTMR1) is enzymatically active and contains an N-terminal PH-GRAM domain, a C-terminal coiled-coil domain and a PDZ binding site. MTMR1 is associated with myotonic dystrophy. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. 249
32245 350441 cd14593 PTP-MTMR10 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 10. Myotubularin related phosphoinositide phosphatase 10 (MTMR10) is enzymatically inactive and contains an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR10 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. 195
32246 350442 cd14594 PTP-MTMR12 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 12. Myotubularin related phosphoinositide phosphatase 12 (MTMR12), also called phosphatidylinositol 3 phosphate 3-phosphatase adapter subunit (3-PAP), is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR12 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It functions as an adapter for the catalytically active myotubularin to regulate its intracellular location. 203
32247 350443 cd14595 PTP-MTMR11 protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 11. Myotubularin related phosphoinositide phosphatase 11 (MTMR11), also called cisplatin resistance-associated protein (hCRA) in humans, is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR11 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. 195
32248 350444 cd14596 PTPc-N20 catalytic domain of tyrosine-protein phosphatase non-receptor type 20. Tyrosine-protein phosphatase non-receptor type 20 (PTPN20) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN20 is a widely expressed phosphatase with a dynamic subcellular distribution that is targeted to sites of actin polymerization. 207
32249 350445 cd14597 PTPc-N13 catalytic domain of tyrosine-protein phosphatase non-receptor type 13. Tyrosine-protein phosphatase non-receptor type 13 (PTPN13, also known as PTPL1) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN13 is an important regulator of tumor aggressiveness. It regulates breast cancer cell aggressiveness through direct inactivation of Src kinase. In hepatocellular carcinoma, PTPN13 is a tumor suppressor. PTPN13 contains a FERM domain, five PDZ domains, and a C-terminal catalytic PTP domain. With its PDZ domains, PTPN13 has numerous interacting partners that can actively participate in the regulation of its phosphatase activity or can permit direct or indirect recruitment of tyrosine phosphorylated substrates. Its FERM domain is necessary for localization to the membrane. 234
32250 350446 cd14598 PTPc-N21 catalytic domain of tyrosine-protein phosphatase non-receptor type 21. Tyrosine-protein phosphatase non-receptor type 21 (PTPN21), also called protein-tyrosine phosphatase D1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN21 is a component of a multivalent scaffold complex nucleated by focal adhesion kinase (FAK) at specific intracellular sites. It promotes cytoskeleton events that induce cell adhesion and migration by modulating Src-FAK signaling. It can also selectively associate with and stimulate Tec family kinases and modulate Stat3 activation. Human PTPN21 may also play a pathologic role in gastrointestinal tract tumorigenesis. PTPN21 contains an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence. 220
32251 350447 cd14599 PTPc-N14 catalytic domain of tyrosine-protein phosphatase non-receptor type 14. Tyrosine-protein phosphatase non-receptor type 14 (PTPN14), also called protein-tyrosine phosphatase pez, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN14 is a potential tumor suppressor and plays a regulatory role in the Hippo and Wnt/beta-catenin signaling pathways. It contains an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence. 287
32252 350448 cd14600 PTPc-N3 catalytic domain of tyrosine-protein phosphatase non-receptor type 3. Tyrosine-protein phosphatase non-receptor type 3 (PTPN3), also called protein-tyrosine phosphatase H1 (PTP-H1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN3 interacts with mitogen-activated protein kinase p38gamma and serves as its specific phosphatase. PTPN3 and p38gamma cooperate to promote Ras-induced oncogenesis. PTPN3 is a large modular protein containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain. Its PDZ domain binds with the PDZ-binding motif of p38gamma and enables efficient tyrosine dephosphorylation. 274
32253 350449 cd14601 PTPc-N4 catalytic domain of tyrosine-protein phosphatase non-receptor type 4. Tyrosine-protein phosphatase non-receptor type 4 (PTPN4), also called protein-tyrosine phosphatase MEG1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN4 functions in TCR cell signaling, apoptosis, cerebellar synaptic plasticity, and innate immune responses. It specifically inhibits the TRIF-dependent TLR4 pathway by suppressing tyrosine phosphorylation of TRAM. It is a large modular protein containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain; the PDZ domain regulates the catalytic activity of PTPN4. 212
32254 350450 cd14602 PTPc-N22 catalytic domain of tyrosine-protein phosphatase non-receptor type 22. Tyrosine-protein phosphatase non-receptor type 22 (PTPN22), also called lymphoid phosphatase (LyP), PEST-domain phosphatase (PEP), or hematopoietic cell protein-tyrosine phosphatase 70Z-PEP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN22 is expressed in hematopoietic cells and it functions as a key regulator of immune homeostasis by inhibiting T-cell receptor signaling through the direct dephosphorylation of Src family kinases (Lck and Fyn), ITAMs of the TCRz/CD3 complex, and other signaling molecules. Mutations in the PTPN22 gene are associated with multiple connective tissue and autoimmune diseases including type 1 diabetes mellitus, rheumatoid arthritis, and systemic lupus erythematosus. PTPN22 contains an N-terminal catalytic PTP domain and four proline-rich regions at the C-terminus. 234
32255 350451 cd14603 PTPc-N18 catalytic domain of tyrosine-protein phosphatase non-receptor type 18. Tyrosine-protein phosphatase non-receptor type 18 (PTPN18), also called brain-derived phosphatase, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN18 regulates HER2-mediated cellular functions through defining both its phosphorylation and ubiquitination states. The N-terminal catalytic PTP domain of PTPN18 blocks lysosomal routing and delays the degradation of HER2 by dephosphorylation, and its C-terminal PEST domain promotes K48-linked HER2 ubiquitination and its destruction via the proteasome pathway. 266
32256 350452 cd14604 PTPc-N12 catalytic domain of tyrosine-protein phosphatase non-receptor type 12. Tyrosine-protein phosphatase non-receptor type 12 (PTPN12), also called PTP-PEST or protein-tyrosine phosphatase G1 (PTPG1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN12 is characterized as a tumor suppressor and a pivotal regulator of EGFR/HER2 signaling. It regulates various physiological processes, including cell migration, immune response, and neuronal activity, by dephosphorylating multiple substrates including HER2, FAK, PYK2, PSTPIP, WASP, p130Cas, paxillin, Shc, catenin, c-Abl, ArgBP2, p190RhoGAP, RhoGDI, cell adhesion kinase beta, and Rho GTPase. 297
32257 350453 cd14605 PTPc-N11 catalytic domain of tyrosine-protein phosphatase non-receptor type 11. Tyrosine-protein phosphatase non-receptor type 11 (PTPN11), also called SH2 domain-containing tyrosine phosphatase 2 (SHP-2 or SHP2), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN11 promotes the activation of the RAS/Mitogen-Activated Protein Kinases (MAPK) Extracellular-Regulated Kinases 1/2 (ERK1/2) pathway, a canonical signaling cascade that plays key roles in various cellular processes, including proliferation, survival, differentiation, migration, or metabolism. It also regulates the phosphoinositide 3-kinase (PI3K)/AKT pathway, a fundamental cascade that functions in cell survival, proliferation, migration, morphogenesis, and metabolism. PTPN11 dysregulation is associated with several developmental diseases and malignancies, such as Noonan syndrome and juvenile myelomonocytic leukemia. It contains two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties. 253
32258 350454 cd14606 PTPc-N6 catalytic domain of tyrosine-protein phosphatase non-receptor type 6. Tyrosine-protein phosphatase non-receptor type 6 (PTPN6), also called SH2 domain-containing protein-tyrosine phosphatase 1 (SHP1 or SHP-1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN6 expression is restricted mainly to hematopoietic and epithelial cells. It is an important regulator of hematopoietic cells, downregulating pathways that promote cell growth, survival, adhesion, and activation. It regulates glucose homeostasis by modulating insulin signalling in the liver and muscle, and it also negatively regulates bone resorption, affecting both the formation and the function of osteoclasts. PTPN6 contains two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties. 266
32259 350455 cd14607 PTPc-N2 catalytic domain of tyrosine-protein phosphatase non-receptor type 2. Tyrosine-protein phosphatase non-receptor type 2 (PTPN2), also called T-cell protein-tyrosine phosphatase (TCPTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN2, a tumor suppressor, dephosphorylates and inactivates EGFRs, Src family kinases, Janus-activated kinases (JAKs)-1 and -3, and signal transducer and activators of transcription (STATs)-1, -3 and -5, in a cell type and context-dependent manner. It is deleted in 6% of all T-cell acute lymphoblastic leukemias and is associated with constitutive JAK1/STAT5 signaling and tumorigenesis. 257
32260 350456 cd14608 PTPc-N1 catalytic domain of tyrosine-protein phosphatase non-receptor type 1. Tyrosine-protein phosphatase non-receptor type 1 (PTPN1), also called protein-tyrosine phosphatase 1B (PTP-1B), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN1/PTP-1B is the first PTP to be purified and characterized and is the prototypical intracellular PTP found in a wide variety of human tissues. It contains an N-terminal catalytic PTP domain, followed by two tandem proline-rich motifs that mediate interaction with SH3-domain-containing proteins, and a small hydrophobic stretch that localizes the enzyme to the endoplasmic reticulum (ER). It dephosphorylates and regulates the activity of a number of receptor tyrosine kinases, including the insulin receptor, the EGF receptor, and the PDGF receptor. 277
32261 350457 cd14609 R-PTP-N PTP-like domain of receptor-type tyrosine-protein phosphatase N. Receptor-type tyrosine-protein phosphatase-like N (PTPRN or R-PTP-N), also called islet cell antigen 512 (ICA512) or PTP IA-2, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). It consists of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. PTPRN is located in secretory granules of neuroendocrine cells and is involved in the generation, cargo storage, traffic, exocytosis and recycling of insulin secretory granules, as well as in beta-cell proliferation. It is a major autoantigen in type 1 diabetes and is involved in the regulation of insulin secretion. 281
32262 350458 cd14610 R-PTP-N2 PTP-like domain of receptor-type tyrosine-protein phosphatase N2. Receptor-type tyrosine-protein phosphatase N2 (PTPRN2 or R-PTP-N2), also called islet cell autoantigen-related protein (IAR), ICAAR, phogrin, or IA-2beta, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). It consists of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. It is mainly expressed in neuropeptidergic neurons and peptide-secreting endocrine cells, including insulin-producing pancreatic beta-cells. It may function as a phosphatidylinositol phosphatase to regulate insulin secretion. It is also required for normal accumulation of the neurotransmitters norepinephrine, dopamine and serotonin in the brain. 283
32263 350459 cd14611 R-PTPc-R catalytic domain of receptor-type tyrosine-protein phosphatase R. Receptor-type tyrosine-protein phosphatase-like R (PTPRR or R-PTP-R), also called protein-tyrosine phosphatase PCPTP1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRR is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. The human and mouse PTPRR genes produce multiple neuronal protein isoforms of varying sizes (in human, PTPPBS-alpha, beta, gamma and delta). All isoforms contain the KIM motif and the catalytic PTP domain. PTPRR-deficient mice show significant defects in fine motor coordination and balance skills that are reminiscent of a mild ataxia. 226
32264 350460 cd14612 PTPc-N7 catalytic domain of tyrosine-protein phosphatase non-receptor type 7. Tyrosine-protein phosphatase non-receptor type 7 (PTPN7), also called hematopoietic protein-tyrosine phosphatase (HePTP) or LC-PTP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN7/HePTP is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. PTPN7/HePTP is found exclusively in the white blood cells in bone marrow, thymus, spleen, lymph nodes and all myeloid and lymphoid cell lines. It negatively regulates T-cell activation and proliferation, and is often dysregulated in the preleukemic disorder myelodysplastic syndrome, as well as in acute myelogenous leukemia. 247
32265 350461 cd14613 PTPc-N5 catalytic domain of tyrosine-protein phosphatase non-receptor type 5. Tyrosine-protein phosphatase non-receptor type 5 (PTPN5), also called striatum-enriched protein-tyrosine phosphatase (STEP) or neural-specific PTP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN5/STEP is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. It is a CNS-enriched protein that regulates key signaling proteins required for synaptic strengthening, as well as NMDA and AMPA receptor trafficking. PTPN5 is implicated in multiple neurologic and neuropsychiatric disorders, such as Alzheimer's disease, Parkinson's disease, schizophrenia, and fragile X syndrome. 258
32266 350462 cd14614 R-PTPc-O catalytic domain of receptor-type tyrosine-protein phosphatase O. Receptor-type tyrosine-protein phosphatase O (PTPRO or R-PTP-O), also known as glomerular epithelial protein 1 or protein tyrosine phosphatase U2 (PTP-U2), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRO is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is essential for sustaining the structure and function of foot processes by regulating tyrosine phosphorylation of podocyte proteins. It has been identified as a synaptic cell adhesion molecule (CAM) that serves as a potent initiator of synapse formation. It is also a tumor suppressor in several types of cancer, such as hepatocellular carcinoma, lung cancer, and breast cancer. 245
32267 350463 cd14615 R-PTPc-J catalytic domain of receptor-type tyrosine-protein phosphatase J. Receptor-type tyrosine-protein phosphatase J (PTPRJ or R-PTP-J), also known as receptor-type tyrosine-protein phosphatase eta (R-PTP-eta), density-enhanced phosphatase 1 (DEP-1), or CD148, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRJ is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats (eight in PTPRJ) and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is expressed in various cell types including epithelial, hematopoietic, and endothelial cells. It plays a role in cell adhesion, migration, proliferation and differentiation. It dephosphorylates or contributes to the dephosphorylation of various substrates including protein kinases such as FLT3, PDGFRB, MET, RET (variant MEN2A), VEGFR-2, LYN, SRC, MAPK1, MAPK3, and EGFR, as well as PIK3R1 and PIK3R2. 229
32268 350464 cd14616 R-PTPc-Q catalytic domain of receptor-type tyrosine-protein phosphatase Q. Receptor-type tyrosine-protein phosphatase Q (PTPRQ or R-PTP-Q), also called phosphatidylinositol phosphatase PTPRQ, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRQ is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats (18 in PTPRQ) and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It displays low tyrosine-protein phosphatase activity; rather, it functions as a phosphatidylinositol phosphatase required for auditory processes. It regulates the levels of phosphatidylinositol 4,5-bisphosphate (PIP2) in the basal region of hair bundles. It can dephosphorylate a broad range of phosphatidylinositol phosphates, including phosphatidylinositol 3,4,5-trisphosphate and most phosphatidylinositol monophosphates and diphosphates. 224
32269 350465 cd14617 R-PTPc-B catalytic domain of receptor-type tyrosine-protein phosphatase B. Receptor-type tyrosine-protein phosphatase B (PTPRB), also known as receptor-type tyrosine-protein phosphatase beta (R-PTP-beta) or vascular endothelial protein tyrosine phosphatase (VE-PTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRB/VE-PTP is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is expressed specifically in vascular endothelial cells and it plays an important role in blood vessel remodeling and angiogenesis. 228
32270 350466 cd14618 R-PTPc-V catalytic domain of receptor-type tyrosine-protein phosphatase V. Receptor-type tyrosine-protein phosphatase V (PTPRV or R-PTP-V), also known as embryonic stem cell protein-tyrosine phosphatase (ES cell phosphatase) or osteotesticular protein-tyrosine phosphatase (OST-PTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRV is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. In rodents, it may play a role in the maintenance of pluripotency and may function in signaling pathways during bone remodeling. It is the only PTP whose function has been lost between rodent and human. The human OST-PTP gene is a pseudogene. 230
32271 350467 cd14619 R-PTPc-H catalytic domain of receptor-type tyrosine-protein phosphatase H. Receptor-type tyrosine-protein phosphatase H (PTPRH or R-PTP-H), also known as stomach cancer-associated protein tyrosine phosphatase 1 (SAP-1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRH is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is localized specifically at microvilli of the brush border in gastrointestinal epithelial cells. It plays a role in intestinal immunity by regulating CEACAM20 through tyrosine dephosphorylation. It is also a negative regulator of integrin-mediated signaling and may contribute to contact inhibition of cell growth and motility. 233
32272 350468 cd14620 R-PTPc-E-1 catalytic domain of receptor-type tyrosine-protein phosphatase E, repeat 1. Receptor-type tyrosine-protein phosphatase E (PTPRE), also known as receptor-type tyrosine-protein phosphatase epsilon (R-PTP-epsilon), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. The PTPRE gene contains two distinct promoters that generate the two major isoforms: transmembrane (receptor type RPTPe or PTPeM) and cytoplasmic (cyt-PTPe or PTPeC). Receptor type RPTPe plays a critical role in signal transduction pathways and phosphoprotein network topology in red blood cells, and may also play a role in osteoclast formation and function. It also negatively regulates PDGFRbeta-mediated signaling pathways that are crucial for the pathogenesis of atherosclerosis. cyt-PTPe acts as a negative regulator of insulin receptor signaling in skeletal muscle. It regulates insulin-induced phosphorylation of proteins downstream of the insulin receptor. Receptor type RPTPe contains a small extracellular region, a single transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the first PTP domain (repeat 1). 229
32273 350469 cd14621 R-PTPc-A-1 catalytic domain of receptor-type tyrosine-protein phosphatase A, repeat 1. Receptor-type tyrosine-protein phosphatase A (PTPRA), also known as receptor-type tyrosine-protein phosphatase alpha (R-PTP-alpha), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA is a positive regulator of Src and Src family kinases via dephosphorylation of the Src-inhibitory tyrosine 527. Thus, it affects transformation and tumorigenesis, inhibition of proliferation, cell cycle arrest, integrin signaling, neuronal differentiation and outgrowth, and ion channel activity. It is also involved in interleukin-1 signaling in fibroblasts through its interaction with the focal adhesion targeting domain of focal adhesion kinase. PTPRA comprises a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the first catalytic PTP domain (repeat 1). 296
32274 350470 cd14622 R-PTPc-E-2 catalytic domain of receptor-type tyrosine-protein phosphatase E, repeat 2. Receptor-type tyrosine-protein phosphatase E (PTPRE), also known as receptor-type tyrosine-protein phosphatase epsilon (R-PTP-epsilon), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. The PTPRE gene contains two distinct promoters that generate the two major isoforms: transmembrane (receptor type RPTPe or PTPeM) and cytoplasmic (cyt-PTPe or PTPeC). Receptor type RPTPe plays a critical role in signal transduction pathways and phosphoprotein network topology in red blood cells, and may also play a role in osteoclast formation and function. It also negatively regulates PDGFRbeta-mediated signaling pathways that are crucial for the pathogenesis of atherosclerosis. cyt-PTPe acts as a negative regulator of insulin receptor signaling in skeletal muscle. It regulates insulin-induced phosphorylation of proteins downstream of the insulin receptor. Receptor type RPTPe contains a small extracellular region, a single transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2). 205
32275 350471 cd14623 R-PTPc-A-2 catalytic domain of receptor-type tyrosine-protein phosphatase A, repeat 2. Receptor-type tyrosine-protein phosphatase A (PTPRA), also known as receptor-type tyrosine-protein phosphatase alpha (R-PTP-alpha), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA is a positive regulator of Src and Src family kinases via dephosphorylation of the Src-inhibitory tyrosine 527. Thus, it affects transformation and tumorigenesis, inhibition of proliferation, cell cycle arrest, integrin signaling, neuronal differentiation and outgrowth, and ion channel activity. It is also involved in interleukin-1 signaling in fibroblasts through its interaction with the focal adhesion targeting domain of focal adhesion kinase. PTPRA comprises a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2). 228
32276 350472 cd14624 R-PTPc-D-1 catalytic domain of receptor-type tyrosine-protein phosphatase D, repeat 1. Receptor-type tyrosine-protein phosphatase D (PTPRD), also known as receptor-type tyrosine-protein phosphatase delta (R-PTP-delta), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules that play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. PTPRD is involved in pre-synaptic differentiation through interaction with SLITRK2. It contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1). 284
32277 350473 cd14625 R-PTPc-S-1 catalytic domain of receptor-type tyrosine-protein phosphatase S, repeat 1. Receptor-type tyrosine-protein phosphatase S (PTPRS), also known as receptor-type tyrosine-protein phosphatase sigma (R-PTP-sigma), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRS is a receptor for glycosaminoglycans, including heparan sulfate proteoglycan and neural chondroitin sulfate proteoglycans (CSPGs), which present a barrier to axon regeneration. It also plays a role in stimulating neurite outgrowth in response to the heparan sulfate proteoglycan GPC2. PTPRS contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1). 282
32278 350474 cd14626 R-PTPc-F-1 catalytic domain of receptor-type tyrosine-protein phosphatase F, repeat 1. Receptor-type tyrosine-protein phosphatase F (PTPRF), also known as leukocyte common antigen related (LAR), is the prototypical member of the LAR family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRF/LAR plays a role in cadherin complexes, where it associates with and dephosphorylates beta-catenin, a pathway which may be critical for cadherin complex stability and cell-cell association. It also regulates focal adhesions through cyclin-dependent kinase-1 and is involved in axon guidance in the developing nervous system. It also functions in regulating insulin signaling. PTPRF contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1). 276
32279 350475 cd14627 R-PTP-S-2 PTP-like domain of receptor-type tyrosine-protein phosphatase S, repeat 2. Receptor-type tyrosine-protein phosphatase S (PTPRS), also known as receptor-type tyrosine-protein phosphatase sigma (R-PTP-sigma), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRS is a receptor for glycosaminoglycans, including heparan sulfate proteoglycan and neural chondroitin sulfate proteoglycans (CSPGs), which present a barrier to axon regeneration. It also plays a role in stimulating neurite outgrowth in response to the heparan sulfate proteoglycan GPC2. PTPRS contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG. 290
32280 350476 cd14628 R-PTP-D-2 PTP-like domain of receptor-type tyrosine-protein phosphatase D, repeat 2. Receptor-type tyrosine-protein phosphatase D (PTPRD), also known as receptor-type tyrosine-protein phosphatase delta (R-PTP-delta), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules that play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. PTPRD is involved in pre-synaptic differentiation through interaction with SLITRK2. It contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG. 292
32281 350477 cd14629 R-PTP-F-2 PTP-like domain of receptor-type tyrosine-protein phosphatase F, repeat 2. Receptor-type tyrosine-protein phosphatase F (PTPRF), also known as leukocyte common antigen related (LAR), is the prototypical member of the LAR family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRF/LAR plays a role in cadherin complexes, where it associates with and dephosphorylates beta-catenin, a pathway which may be critical for cadherin complex stability and cell-cell association. It also regulates focal adhesions through cyclin-dependent kinase-1 and is involved in axon guidance in the developing nervous system. It also functions in regulating insulin signaling. PTPRF contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG. 291
32282 350478 cd14630 R-PTPc-T-1 catalytic domain of receptor-type tyrosine-protein phosphatase T, repeat 1. Receptor-type tyrosine-protein phosphatase T (PTPRT), also known as receptor-type tyrosine-protein phosphatase rho (RPTP-rho or PTPrho), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRT is highly expressed in the nervous system and it plays a critical role in regulation of synaptic formation and neuronal development. It dephosphorylates a specific tyrosine residue in syntaxin-binding protein 1, a key component of synaptic vesicle fusion machinery, and regulates its binding to syntaxin 1. PTPRT has been identified as a potential candidate gene for autism spectrum disorder (ASD) susceptibility. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain. 237
32283 350479 cd14631 R-PTPc-K-1 catalytic domain of receptor-type tyrosine-protein phosphatase K, repeat 1. Receptor-type tyrosine-protein phosphatase K (PTPRK), also known as receptor-type tyrosine-protein phosphatase kappa (RPTP-kappa or PTPkappa), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRK is widely expressed and has been shown to stimulate cell motility and neurite outgrowth. It is required for anti-proliferative and pro-migratory effects of TGF-beta, suggesting a role in regulation, maintenance, and restoration of cell adhesion. It is a potential tumour suppressor in primary central nervous system lymphomas, colorectal cancer, and breast cancer. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain. 218
32284 350480 cd14632 R-PTPc-U-1 catalytic domain of receptor-type tyrosine-protein phosphatase U, repeat 1. Receptor-type tyrosine-protein phosphatase U (PTPRU), also known as pancreatic carcinoma phosphatase 2 (PCP-2), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRU/PCP-2 is the most distant member of the type IIb subfamily and may have a distinct biological function other than cell-cell aggregation. It localizes to the adherens junctions and directly binds and dephosphorylates beta-catenin, and regulates the balance between signaling and adhesive beta-catenin. It plays an important role in the maintenance of epithelial integrity. PTPRU contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain. 205
32285 350481 cd14633 R-PTPc-M-1 catalytic domain of receptor-type tyrosine-protein phosphatase M, repeat 1. Receptor-type tyrosine-protein phosphatase M (PTPRM), also known as protein-tyrosine phosphatase mu (R-PTP-mu or PTPmu), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRM/PTPmu is a homophilic cell adhesion molecule expressed in CNS neurons and glia. It is required for E-, N-, and R-cadherin-dependent neurite outgrowth. Loss of PTPmu contributes to tumor cell migration and dispersal of human glioblastomas. PTPRM contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain. 273
32286 350482 cd14634 R-PTPc-T-2 PTP domain of receptor-type tyrosine-protein phosphatase T, repeat 2. Receptor-type tyrosine-protein phosphatase T (PTPRT), also known as receptor-type tyrosine-protein phosphatase rho (RPTP-rho or PTPrho), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRT is highly expressed in the nervous system and it plays a critical role in regulation of synaptic formation and neuronal development. It dephosphorylates a specific tyrosine residue in syntaxin-binding protein 1, a key component of synaptic vesicle fusion machinery, and regulates its binding to syntaxin 1. PTPRT has been identified as a potential candidate gene for autism spectrum disorder (ASD) susceptibility. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain. 206
32287 350483 cd14635 R-PTPc-M-2 PTP domain of receptor-type tyrosine-protein phosphatase M, repeat 2. Receptor-type tyrosine-protein phosphatase M (PTPRM), also known as protein-tyrosine phosphatase mu (R-PTP-mu or PTPmu), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRM/PTPmu is a homophilic cell adhesion molecule expressed in CNS neurons and glia. It is required for E-, N-, and R-cadherin-dependent neurite outgrowth. Loss of PTPmu contributes to tumor cell migration and dispersal of human glioblastomas. PTPRM contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain. 206
32288 350484 cd14636 R-PTPc-K-2 PTP domain of receptor-type tyrosine-protein phosphatase K, repeat 2. Receptor-type tyrosine-protein phosphatase K (PTPRK), also known as receptor-type tyrosine-protein phosphatase kappa (RPTP-kappa or PTPkappa), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRK is widely expressed and has been shown to stimulate cell motility and neurite outgrowth. It is required for anti-proliferative and pro-migratory effects of TGF-beta, suggesting a role in regulation, maintenance, and restoration of cell adhesion. It is a potential tumour suppressor in primary central nervous system lymphomas, colorectal cancer, and breast cancer. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain. 206
32289 350485 cd14637 R-PTPc-U-2 PTP domain of receptor-type tyrosine-protein phosphatase U, repeat 2. Receptor-type tyrosine-protein phosphatase U (PTPRU), also known as pancreatic carcinoma phosphatase 2 (PCP-2), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRU/PCP-2 is the most distant member of the type IIb subfamily and may have a distinct biological function other than cell-cell aggregation. It localizes to the adherens junctions and directly binds and dephosphorylates beta-catenin, and regulates the balance between signaling and adhesive beta-catenin. It plays an important role in the maintenance of epithelial integrity. PTPRU contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain. 207
32290 350486 cd14638 DSP_DUSP1 dual specificity phosphatase domain of dual specificity protein phosphatase 1. Dual specificity protein phosphatase 1 (DUSP1), also called mitogen-activated protein kinase (MAPK) phosphatase 1 (MKP-1), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. Human MKP-1 dephosphorylates MAPK1/ERK2, regulating its activity during the meiotic cell cycle. Although initially MKP-1 was considered to be ERK-specific, it has been shown that MKP-1 also dephosphorylates both JNK and p38 MAPKs. DUSP1/MKP-1 is involved in various functions, including proliferation, differentiation, and apoptosis in normal cells. It is a central regulator of a variety of functions in the immune, metabolic, cardiovascular, and nervous systems. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 151
32291 350487 cd14639 DSP_DUSP5 dual specificity phosphatase domain of dual specificity protein phosphatase 5. Dual specificity protein phosphatase 5 (DUSP5) functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other mitogen-activated protein kinase (MAPK) phosphatases (MKPs), it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP5 preferentially dephosphorylates extracellular signal-regulated kinase (ERK), and is involved in ERK signaling and ERK-dependent inflammatory gene expression in adipocytes. It also plays a role in regulating pressure-dependent myogenic cerebral arterial constriction, which is crucial for the maintenance of constant cerebral blood flow to the brain. DUSP5 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 138
32292 350488 cd14640 DSP_DUSP4 dual specificity phosphatase domain of dual specificity protein phosphatase 4. Dual specificity protein phosphatase 4 (DUSP4), also called mitogen-activated protein kinase (MAPK) phosphatase 2 (MKP-2), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP4 regulates either ERK or c-JUN N-terminal kinase (JNK), depending on the cell type. It dephosphorylates nuclear JNK and induces apoptosis in diffuse large B cell lymphoma (DLBCL) cells. It acts as a negative regulator of macrophage M1 activation and inhibits inflammation during macrophage-adipocyte interaction. It has been linked to different aspects of cancer: it may have a role in the development of ovarian cancers, oesophagogastric rib metastasis, and pancreatic tumours; it may also be a candidate tumor suppressor gene, with its deletion implicated in breast cancer, prostate cancer, and gliomas. DUSP4/MKP-2 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 141
32293 350489 cd14641 DSP_DUSP2 dual specificity phosphatase domain of dual specificity protein phosphatase 2. Dual specificity protein phosphatase 2 (DUSP2), also called dual specificity protein phosphatase PAC-1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other mitogen-activated protein kinase (MAPK) phosphatases (MKPs), it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP2 can preferentially dephosphorylate ERK1/2 and p38, but not JNK in vitro. It is predominantly expressed in hematopoietic tissues with high T-cell content, such as thymus, spleen, lymph nodes, peripheral blood and other organs such as the brain and liver. It has a critical and positive role in inflammatory responses. DUSP2 mRNA and protein are significantly reduced in most solid cancers including breast, colon, lung, ovary, kidney and prostate, and the suppression of DUSP2 is associated with tumorigenesis and malignancy. DUSP2 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 144
32294 350490 cd14642 DSP_DUSP6 dual specificity phosphatase domain of dual specificity protein phosphatase 6. Dual specificity protein phosphatase 6 (DUSP6), also called mitogen-activated protein kinase (MAPK) phosphatase 3 (MKP-3) or dual specificity protein phosphatase PYST1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP6/MKP-3 plays an important role in obesity-related hyperglycemia by promoting hepatic glucose output. MKP-3 deficiency attenuates body weight gain induced by a high-fat diet, protects mice from developing obesity-related hepatosteatosis, and reduces adiposity, possibly by repressing adipocyte differentiation. It also contributes to p53-controlled cellular senescence. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 143
32295 350491 cd14643 DSP_DUSP7 dual specificity phosphatase domain of dual specificity protein phosphatase 7. Dual specificity protein phosphatase 7 (DUSP7), also called mitogen-activated protein kinase (MAPK) phosphatase X (MKP-X) or dual specificity protein phosphatase PYST2, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP7 has been shown to be an essential regulator of multiple steps in oocyte meiosis. Due to alternative promoter usage, the PYST2 gene gives rise to two isoforms, PYST2-S and PYST2-L. PYST2-L is over-expressed in leukocytes derived from AML and ALL patients as well as in some solid tumors and lymphoblastoid cell lines; it plays a role in cell-crowding. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 149
32296 350492 cd14644 DSP_DUSP9 dual specificity phosphatase domain of dual specificity protein phosphatase 9. Dual specificity protein phosphatase 9 (DUSP9), also called mitogen-activated protein kinase (MAPK) phosphatase 4 (MKP-4), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP9 is a mediator of bone morphogenetic protein (BMP) signaling to control the appropriate ERK activity critical for the determination of embryonic stem cell fate. Down-regulation of DUSP9 expression has been linked to severe pre-eclamptic placenta as well as cancers such as hepatocellular carcinoma. DUSP9 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 145
32297 350493 cd14645 DSP_DUSP8 dual specificity phosphatase domain of dual specificity protein phosphatase 8. Dual specificity protein phosphatase 8 (DUSP8), also called DUSP hVH-5 or M3/6, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP8 controls basal and acute stress-induced ERK1/2 signaling in adult cardiac myocytes, which impacts contractility, ventricular remodeling, and disease susceptibility. It also plays a role in decreasing ureteric branching morphogenesis by inhibiting p38MAPK. DUSP8 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 151
32298 350494 cd14646 DSP_DUSP16 dual specificity phosphatase domain of dual specificity protein phosphatase 16. Dual specificity protein phosphatase 16 (DUSP16), also called mitogen-activated protein kinase (MAPK) phosphatase 7 (MKP-7), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP16/MKP-7 plays an essential role in perinatal survival and selectively controls the differentiation and cytokine production of myeloid cells. It is acetylated by Mycobacterium tuberculosis Eis protein, which leads to the inhibition of JNK-dependent autophagy, phagosome maturation, and ROS generation, and thus, initiating suppression of host immune responses. DUSP16/MKP-7 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. 145
32299 271236 cd14651 ZIP_Put3 Leucine zipper Dimerization domain of transcription factor Put3. Put3p activates the transcription of PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. These genes encode for proteins that convert proline to glutamate, which is a metabolically more useful form of nitrogen. Put3p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 31
32300 271237 cd14653 ZIP_Gal4p-like Leucine zipper Dimerization domain of Gal4p-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Included in this family are Saccharomyces cerevisiae Gal4p, Hap1p, Put3p, Ppr1p and Sip4p, Neurospora crassa acu-15, and Colletotrichum acutatum Nir1, among others. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of the PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Ppr1p activates transcription of the URA1, URA3, and URA4 genes, which encode enzymes involved in the regulation of pyrimidine levels. Sip4p activates target genes under conditions of glucose deprivation. Acu-15 is involved in regulating acetate utilization while Nir1 plays a role during nitrogen-starvation conditions. 24
32301 271238 cd14654 ZIP_Gal4 Leucine zipper Dimerization domain of transcription factor Gal4 and similar fungal proteins. Gal4p is one of several GAL proteins required for the growth of yeast on galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Gal4p functions in the induction of GAL genes in the presence of galactose through an upstream activating sequence (UAS) present in their promoters. The Gal4p family of transcriptional activators contain an N-terminal DNA binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 47
32302 271239 cd14655 ZIP_Hap1 Leucine zipper Dimerization domain of transcription factor Hap1 and similar fungal proteins. Hap1p mediates oxygen sensing and heme signaling in yeast. In response to heme, it promotes transcription of genes required for respiration and controlling oxidative damage. It is a member of the Gal4/Gal4p family of transcriptional activators which contain an N-terminal DNA binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Hap1p binds to DNA containing a direct repeat of two CGG triplets. It is a large protein that contains repression modules (RPMs) and heme-responsive motifs (HRMs) in addition to the DNA-binding and dimerization domains. 32
32303 271139 cd14656 Imelysin-like_EfeO EfeO is a component of the EfeUOB operon. This family includes the EfeO domain, an essential component of the EfeUOB operon which is highly conserved in bacteria. However, its biochemical function is unknown. EfeO contains an N-terminal cupredoxin (CUP)-like domain and C-terminal imelysin-like domain that may bind iron. Algp7, a member of EfeO family protein from Sphingomonas sp. A1, is found to bind alginate at neutral pH, but does not contain the CUP domain, thus having a role that does not seem to be related to iron uptake. Some members of this family are fused to an N-terminal putative EfeU ion permease domain. The imelysin-like domain of this family also contains the GxHxxE sequence motif and a highly conserved functional site, suggesting a similar role to other imelysin family proteins containing the same motif. 239
32304 271140 cd14657 Imelysin_IrpA-like Imelysin-like iron-regulated protein A-like. This family includes putative iron-regulated protein A (IrpA) mainly from Bacteroides, proteobacteria and cyanobacteria, as well as uncharacterized proteins with domains similar to insulin-cleaving membrane protease (imelysin, ICMP) protein. IrpA has been shown to be essential for growth under iron-deficient conditions in the cyanobacteria Synechococcus sp. The conserved GxHxxE motif is similar to other known imelysin-like proteins that are regulated by iron, such as ICMP, IrpA and EfeO. Imelysin is a membrane protein with the active site outside the cell envelope. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event. 345
32305 271141 cd14658 Imelysin-like_IrpA Imelysin-like domain in iron-regulated protein A. This family includes putative iron-regulated protein A (IrpA) mainly from Bacteroides, proteobacteria and cyanobacteria, with domain similar to insulin-cleaving membrane protease (imelysin, ICMP) protein. It has been shown to be essential for growth under iron-deficient conditions in the cyanobacteria Synechococcus sp. The conserved GxHxxE motif is similar to other known imelysin-like proteins that are regulated by iron, such as ICMP, IrpA and EfeO. Imelysin is a membrane protein with the active site outside the cell envelope. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event. 282
32306 271142 cd14659 Imelysin-like_IPPA Imelysin-like protein. This family includes insulin-cleaving membrane protease (imelysin, ICMP)-like protein (IPPA from Psychrobacter arcticus), the Pseudomonas aeruginosa PA4372 and Vibrio cholerae VC1266 Fur-regulated imelysin-like protein. They share the overall fold and a similar functional site as the insulin-cleaving membrane protease (ICMP). However, IPPA adopts a structure distinctive from the known HxxE metallopeptidases or iron-binding proteins, suggesting this protein may not be a peptidase; the histidine in the GxHxxE motif region is no longer conserved (GxxxxE), indicating a possible loss of enzymatic function or a change in substrate preference (compared to imelysin and IrpA families). A putative functional site for this non-peptidase homolog is located at the domain interface. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event. 331
32307 271137 cd14660 E2F_DD Dimerization domain of E2F transcription factors. E2F transcription factors are involved in the regulation of DNA synthesis, cell cycle progression, proliferation and apoptosis. It associates with the retinoblastoma (Rb) protein, negatively regulating the G1-S transition until cyclin-dependent kinases phosphorylate Rb, which causes E2F release. E2F forms heterodimers with DP, a distantly related protein. Heterodimerization enhances the Rb binding, DNA binding, and transactivation activities of E2Fs. In humans, there are at least six closely related E2F and two DP family members, all containing a DNA-binding domain, a coiled-coil (CC) region, and a marked-box domain. E2F1 to E2F5 also contain a C-terminal transactivation domain. 104
32308 271136 cd14661 Imelysin_like_PIBO Permuted imelysin-like protein from Bacteroides ovatus (PIBO) and similar proteins. This family includes imelysin-like proteins such as imelysin-like protein from gut bacteria Bacteroides ovatus (PIBO) that have a circularly permuted topology compared with the canonical imelysin fold, such that the N-terminal and C-terminal regions are swapped in the primary sequence. PIBO is highly similar to imelysin-like protein from Psychrobacter arcticus (IPPA) despite low sequence similarity and circular permutation. PIBO is functionally equivalent to insulin-cleaving membrane protease (ICMP or imelysin), although the permutation results in the conserved GxHxxE motif to be at the C-terminus. It may have a conserved role in iron uptake although it adopts a structure distinctive from known metallopeptidases or iron-binding proteins. 347
32309 271132 cd14662 STKc_SnRK2 Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK2 is represented in this cd. SnRK2s are involved in plant response to abiotic stresses and abscisic acid (ABA)-dependent plant development. The SnRK2s subfamily is in turn classed into three subgroups, all 3 of which are represented in this CD. Group 1 comprises kinases not activated by ABA, group 2 - kinases not activated or activated very weakly by ABA (depending on plant species), and group 3 - kinases strongly activated by ABA. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
32310 271133 cd14663 STKc_SnRK3 Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK3 is represented in this cd. The SnRK3 group contains members also known as CBL-interacting protein kinase, salt overly sensitive 2, SOS3-interacting proteins and protein kinase S. These kinases interact with calcium-binding proteins such as SOS3, SCaBPs, and CBL proteins, and are involved in responses to salt stress and in sugar and ABA signaling. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 256
32311 271134 cd14664 STK_BAK1_like Catalytic domain of the Serine/Threonine Kinase, BRI1 associated kinase 1 and related STKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes three leucine-rich repeat receptor-like kinases (LRR-RLKs): Arabidopsis thaliana BAK1 and CLAVATA1 (CLV1), and Physcomitrella patens CLL1B clavata1-like receptor S/T protein kinase. BAK1 functions in various signaling pathways. It plays a role in BR (brassinosteroid)-regulated plant development as a co-receptor of BRASSINOSTEROID (BR) INSENSITIVE 1 (BRI1), the receptor for BRs, and is required for full activation of BR signaling. It also modulates pathways involved in plant resistance to pathogen infection (pattern-triggered immunity, PTI) and herbivore attack (wound- or herbivore feeding-induced accumulation of jasmonic acid (JA) and JA-isoleucine). CLV1 directly binds small signaling peptides, CLAVATA3 (CLV3) and CLAVATA3/EMBRYO SURROUNDING REGION (CLE), to restrict stem cell proliferation: the CLV3-CLV1-WUS (WUSCHEL) module influences stem cell maintenance in the shoot apical meristem, and the CLE40 (CLAVATA3/EMBRYO SURROUNDING REGION40) -ACR4 (CRINKLY4) -CLV1- WOX5 (WUSCHEL-RELATED HOMEOBOX5) module at the root apical meristem. The STK_BAK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 270
32312 271135 cd14665 STKc_SnRK2-3 Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 2, group 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK2 is represented in this cd. SnRK2s are involved in plant response to abiotic stresses and abscisic acid (ABA)-dependent plant development. The SnRK2s subfamily is in turn classed into three subgroups, all 3 of which are represented in this CD. Group 1 comprises kinases not activated by ABA, group 2 - kinases not activated or activated very weakly by ABA (depending on plant species), and group 3 - kinases strongly activated by ABA. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. 257
32313 270620 cd14667 3D_containing_proteins Non-mltA associated 3D domain containing proteins, named for 3 conserved aspartate residues. This family contains the 3D domain, named for its 3 conserved aspartates, including similar uncharacterized proteins. These proteins contain the critical active site aspartate of mltA-like lytic transglycosylases where the 3D domain forms a larger domain with the N-terminal region. This domain is also found in conjunction with numerous other domains such as the Escherichia coli MltA, a membrane-bound lytic transglycosylase comprised of 2 domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, corresponding to the 3D domain and Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial LTs, which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. 90
32314 270616 cd14668 mlta_B Domain B insert of mltA_like lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct. 159
32315 270617 cd14669 mlta_related_B putative domain B insert of mltA_type lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct. 128
32316 270614 cd14670 BslA_like Bacterial immunoglobulin-like hydrophobin BslA and similar proteins. BslA (YuaB) is a protein from Bacillus subtilis acting as a hydrophobin, which forms surface layers around biofilms and participates in biofilm assembly. BslA contains an unusually hydrophobic "cap structure", which is essential for its activity and for the ability of bacteria to form hydrophobic, non-wetting biofilms. A number of domains in various proteins from Bacilli and other bacterial lineages appear related to BslA, but do not conserve the hydrophobic cap. 128
32317 269821 cd14671 PAAR_like proline-alanine-alanine-arginine (PAAR) repeat superfamily. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat superfamily, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. The PAAR-repeat proteins form a diverse superfamily with several subgroups extended both N- and C-terminally by domains with various predicted functions; the termini are exposed to solution, and do not distort the VgrG binding site. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli. 77
32318 270612 cd14672 UBA_ceTYDP2_like UBA-like domain found in Caenorhabditis elegans tyrosyl-DNA phosphodiesterase 2 (TDP2) and similar proteins. The family includes C. elegans TDP2 and its homologs found in bilateria. TDP2 (also known as TTRAP or EAPII) belongs to the Mg(2+)/Mn(2+)-dependent family of phosphodiesterases which contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal phosphodiesterase domain. It is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini needed for subsequent DNA ligation and hence repair of the break. Tyrosyl-DNA phosphodiesterase 1 (TDP1), an enzyme that cleaves 3'-phosphotyrosyl bonds, and TDP2 are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TDP2 has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation. 37
32319 270192 cd14673 PH_PHLDB1_2 Pleckstrin homology-like domain-containing family B member 2 pleckstrin homology (PH) domain. PHLDB2 (also called LL5beta) and PHLDB1 (also called LL5alpha) are cytoskeleton- and membrane-associated proteins. PHLDB2 has been identified as a key component of the synaptic podosomes that play an important role in postsynaptic maturation. Both are large proteins containing an N-terminal pleckstrin (PH) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
32320 270193 cd14674 PH_PLEKHM3_1 Pleckstrin homology domain-containing family M member 3 Pleckstrin homology domain 1. PLEKHM3 (also called differentiation associated protein/DAPR) exists as three alternatively spliced isoforms that participate in metal ion binding. It contains 2 PH domains and 1 phorbol-ester/DAG-type zinc finger domain. PLEKHM3 is found in Humans, canines, bovine, mouse, rat, chicken and zebrafish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2, or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 90
32321 270194 cd14675 PH-SEC3_like PH-like domain of Sec3-like protein. Fungal Sec3, as well as its homolog in higher eukaryotes Exocyst complex component 1 (EXOC1), are part of the exocyst, a conserved octameric complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 115
32322 270195 cd14676 PH_DOK1,2,3 Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 1, 2, and 3. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 113
32323 270196 cd14677 PH_DOK7 Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 7. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
32324 270197 cd14678 PH_DOK4_DOK5_DOK6 Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 4, 5, and 6 proteins. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 105
32325 275429 cd14679 PH_p115RhoGEF Rho guanine nucleotide exchange factor Pleckstrin homology domain. p115RhoGEF (also called LSC, GEF1 or LBCL2) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. In addition to the Dbl homology (DH)-PH domain, p115RhoGEF contains an N-terminal RGS (Regulator of G-protein signalling) domain. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 125
32326 275430 cd14680 PH_p190RhoGEF Rho guanine nucleotide exchange factor Pleckstrin homology domain. p190RhoGEF (also called RIP2 or ARHGEF28) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. In addition to the Dbl homology (DH)-PH domain, p190RhoGEF contains an N-terminal C1 (Protein kinase C conserved region 1) domain. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 101
32327 270200 cd14681 PH-STXBP6 PH-like domain of Syntaxin binding protein 6. Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains, beside the N-terminal PH-like domain, a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, with STXBP6 being a R-SNARE. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 130
32328 270201 cd14682 PH-EXOC1_like PH-like domain of Exocyst complex component 1-like. Exocyst complex component 1-like proteins are short, higher eukaryotic proteins that show homology to the PH-domain of higher eukaryotic EXOC1 and yeast SEC3 which are part of the exocyst complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. Their function is unknown. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 118
32329 270202 cd14683 PH-EXOC1 PH-like domain of Exocyst complex component 1. Exocyst complex component 1 (EXOC1, also known as SEC3) is the higher eukaryotes homolog of yeast Sec3. The Exocyst is a conserved octameric complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 117
32330 270203 cd14684 RanBD1_RanBP2-like Ran-binding protein 2, Ran binding domain repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 1 is present in this hierarchy. 117
32331 270204 cd14685 RanBD3_RanBP2-like Ran-binding protein 2, Ran binding domain repeat 3. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 3 is present in this hierarchy. 117
32332 269834 cd14686 bZIP Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32333 269835 cd14687 bZIP_ATF2 Basic leucine zipper (bZIP) domain of Activating Transcription Factor-2 (ATF-2) and similar proteins: a DNA-binding and dimerization domain. ATF-2 is a sequence-specific DNA-binding protein that belongs to the Basic leucine zipper (bZIP) family of transcription factors. In response to stress, it activates a variety of genes including cyclin A, cyclin D, and c-Jun. ATF-2 also plays a role in the DNA damage response that is independent of its transcriptional activity. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32334 269836 cd14688 bZIP_YAP Basic leucine zipper (bZIP) domain of Yeast Activator Protein (YAP) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed predominantly of AP-1-like transcription factors including Saccharomyces cerevisiae YAPs, Schizosaccharomyces pombe PAP1, and similar proteins. Members of this subfamily belong to the Basic leucine zipper (bZIP) family of transcription factors. The YAP subfamily is composed of eight members (YAP1-8) which may all be involved in stress responses. YAP1 is the major oxidative stress regulator and is also involved in iron metabolism (like YAP5) and detoxification of arsenic (like YAP8). YAP2 is involved in cadmium stress responses while YAP4 and YAP6 play roles in osmotic stress. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 63
32335 269837 cd14689 bZIP_CREB3 Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein 3 (CREB3) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed of CREB3 (also called LZIP or Luman), and the CREB3-like proteins CREB3L1 (or OASIS), CREB3L2, CREB3L3 (or CREBH), and CREB3L4 (or AIbZIP). They are type II membrane-associated members of the Basic leucine zipper (bZIP) family of transcription factors, with their N-termini facing the cytoplasm and their C-termini penetrating through the ER membrane. They contain an N-terminal transcriptional activation domain followed by bZIP and transmembrane domains, and a C-terminal tail. They play important roles in ER stress and the unfolded protein response (UPR), as well as in many other biological processes such as cell secretion, bone and cartilage formation, and carcinogenesis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32336 269838 cd14690 bZIP_CREB1 Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein 1 (CREB1) and similar proteins: a DNA-binding and dimerization domain. CREB1 is a Basic leucine zipper (bZIP) transcription factor that plays a role in propagating signals initiated by receptor activation through the induction of cAMP-responsive genes. Because it responds to many signal transduction pathways, CREB1 is implicated to function in many processes including learning, memory, circadian rhythm, immune response, and reproduction, among others. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 55
32337 269839 cd14691 bZIP_XBP1 Basic leucine zipper (bZIP) domain of X-box binding protein 1 (XBP1) and similar proteins: a DNA-binding and dimerization domain. XBP1, a member of the Basic leucine zipper (bZIP) family, is the key transcription factor that orchestrates the unfolded protein response (UPR). It is the most conserved component of the UPR and is critical for cell fate determination in response to ER stress. The inositol-requiring enzyme 1 (IRE1)-XBP1 pathway is one of the three major sensors at the ER membrane that initiates the UPR upon activation. IRE1, a type I transmembrane protein kinase and endoribonuclease, oligomerizes upon ER stress leading to its increased activity. It splices the XBP1 mRNA, producing a variant that translocates to the nucleus and activates its target genes, which are involved in protein folding, degradation, and trafficking. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 58
32338 269840 cd14692 bZIP_ATF4 Basic leucine zipper (bZIP) domain of Activating Transcription Factor-4 (ATF-4) and similar proteins: a DNA-binding and dimerization domain. ATF-4 was also isolated and characterized as the cAMP-response element binding protein 2 (CREB2). It is a Basic leucine zipper (bZIP) transcription factor that has been reported to act as either an activator or a repressor. It is a critical component in both the unfolded protein response (UPR) and amino acid response (AAR) pathways. Under certain stress conditions, ATF-4 transcription is increased; accumulation of ATF-4 induces the expression of genes involved in amino acid metabolism and transport, mitochondrial function, redox chemistry, and others that ensure protein synthesis and recovery from stress. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 63
32339 269841 cd14693 bZIP_CEBP Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein (CEBP) and similar proteins: a DNA-binding and dimerization domain. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate the cell cycle, differentiation, growth, survival, energy metabolism, innate and adaptive immunity, and inflammation, among others. They are also associated with cancer and viral disease. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. Each possesses unique properties to regulate cell type-specific growth and differentiation. The sixth isoform, CEBPZ (zeta), lacks an intact DNA-binding domain and is excluded from this subfamily. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 60
32340 269842 cd14694 bZIP_NFIL3 Basic leucine zipper (bZIP) domain of Nuclear factor interleukin-3-regulated protein (NFIL3): a DNA-binding and dimerization domain. NFIL3, also called E4 promoter-binding protein 4 (E4BP4), is a Basic leucine zipper (bZIP) transcription factor that was independently identified as a transactivator of the IL3 promoter in T-cells and as a transcriptional repressor that binds to a DNA sequence site in the adenovirus E4 promoter. Its expression levels are regulated by cytokines and it plays crucial functions in the immune system. It is required for the development of natural killer cells and CD8+ conventional dendritic cells. In B-cells, NFIL3 mediates immunoglobulin heavy chain class switching that is required for IgE production, thereby influencing allergic and pathogenic immune responses. It is also involved in the polarization of T helper responses. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 60
32341 269843 cd14695 bZIP_HLF Basic leucine zipper (bZIP) domain of Hepatic leukemia factor (HLF) and similar proteins: a DNA-binding and dimerization domain. HLF, also called vitellogenin gene-binding protein (VBP) in birds, is a circadian clock-controlled Basic leucine zipper (bZIP) transcription factor which is a direct transcriptional target of CLOCK/BMAL1. It is implicated, together with bZIPs DBP and TEF, in the regulation of genes involved in the metabolism of endobiotic and xenobiotic agents. Triple knockout mice display signs of early aging and suffer premature death, likely due to impaired defense against xenobiotic stress. A leukemogenic translocation results in the chimeric fusion protein E2A-HLF that results in a rare form of pro-B-cell acute lymphoblastic leukemia (ALL). bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 60
32342 269844 cd14696 bZIP_Jun Basic leucine zipper (bZIP) domain of Jun proteins and similar proteins: a DNA-binding and dimerization domain. Jun is a member of the activator protein-1 (AP-1) complex, which is mainly composed of Basic leucine zipper (bZIP) dimers of the Jun and Fos families, and to a lesser extent, the activating transcription factor (ATF) and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. There are three Jun proteins: c-Jun, JunB, and JunD. c-Jun is the most potent transcriptional activator of the AP-1 proteins. Both c-Jun and JunB are essential during development; deletion of either results in embryonic lethality in mice. c-Jun is essential in hepatogenesis and liver erythropoiesis, while JunB is required in vasculogenesis and angiogenesis in extraembryonic tissues. While JunD is dispensable in embryonic development, it is involved in transcription regulation of target genes that help cells to cope with environmental signals. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32343 269845 cd14697 bZIP_Maf Basic leucine zipper (bZIP) domain of musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The large Mafs (c-Maf, MafA, MafB, NRL) contain an N-terminal transactivation domain, a linker region of varying size, an ancillary DNA-binding domain, and a C-terminal bZIP domain. They function as critical regulators of terminal differentiation in the blood and in many tissues such as bone, brain, kidney, pancreas, and retina. The small Mafs (MafF, MafK, MafG) do not contain a transactivation domain. They form dimers with cap'n'collar (CNC) proteins that harbor transactivation domains, and they act either as activators or repressors depending on their dimerization partner. They play roles in stress response and detoxification pathways. They have been implicated in various diseases such as diabetes, neurological diseases, thrombocytopenia and cancer. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 70
32344 269846 cd14698 bZIP_CNC Basic leucine zipper (bZIP) domain of Cap'n'Collar (CNC) transcription factors: a DNA-binding and dimerization domain. CNC proteins form a subfamily of Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. This subfamily includes Drosophila Cnc and four vertebrate counterparts, NFE2 (nuclear factor, erythroid-derived 2), NFE2-like 1 or NFE2-related factor 1 (NFE2L1 or Nrf1), NFE2L2 (or Nrf2), and NFE2L3 (or Nrf3). It also includes BACH1 and BACH2, which contain an additional BTB domain (Broad complex/Tramtrack/Bric-a-brac domain, also known as the POZ [poxvirus and zinc finger] domain). CNC proteins function during development and/or contribute in maintaining homeostasis during stress responses. In flies, Cnc functions both in development and in stress responses. In vertebrates, several CNC proteins encoded by distinct genes show varying functions and expression patterns. NFE2 is required for the proper development of platelets while the three Nrfs function in stress responses. Nrf2, the most extensively studied member of this subfamily, acts as a xenobiotic-activated receptor that regulates the adaptive response to oxidants and electrophiles. BACH1 forms heterodimers with small Mafs such as MafK to function as a repressor of heme oxygenase-1 (HO-1) gene (Hmox-1) enhancers. BACH2 is a B-cell specific transcription factor that plays a critical role in oxidative stress-mediated apoptosis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 68
32345 269847 cd14699 bZIP_Fos_like Basic leucine zipper (bZIP) domain of the oncogene Fos (Fos)-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of Fos proteins (c-Fos, FosB, Fos-related antigen 1 (Fra-1), and Fra-2), Activating Transcription Factor-3 (ATF-3), and similar proteins. Fos proteins are members of the activator protein-1 (AP-1) complex, which is mainly composed of bZIP dimers of the Jun and Fos families, and to a lesser extent, ATF and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. ATF3 is induced by various stress signals such as cytokines, genotoxic agents, or physiological stresses. It is implicated in cancer and host defense against pathogens. It negatively regulates the transcription of pro-inflammatory cytokines and is critical in preventing acute inflammatory syndromes. ATF3 dimerizes with Jun and other ATF proteins; the heterodimers function either as activators or repressors depending on the promoter context. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 59
32346 269848 cd14700 bZIP_ATF6 Basic leucine zipper (bZIP) domain of Activating Transcription Factor-6 (ATF-6) and similar proteins: a DNA-binding and dimerization domain. ATF-6 is a type I membrane-bound Basic leucine zipper (bZIP) transcription factor that binds to the consensus ER stress response element (ERSE) and enhances the transcription of genes encoding glucose-regulated proteins Grp78, Grp94, and calreticulin. ATF-6 is one of three sensors of the unfolded protein response (UPR) in metazoans; the others being the kinases Ire1 and PERK. It contains an ER-lumenal domain that detects unfolded proteins. In response to ER stress, ATF-6 translocates from the ER to the Golgi with simultaneous cleavage in a process called regulated intramembrane proteolysis (Rip) to its transcriptionally competent form, which enters the nucleus and upregulates target UPR genes. The three UPR sensor branches cross-communicate to form a signaling network. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32347 269849 cd14701 bZIP_BATF Basic leucine zipper (bZIP) domain of BATF proteins: a DNA-binding and dimerization domain. Basic leucine zipper (bZIP) transcription factor ATF-like (BATF or SFA2), BATF2 (or SARI) and BATF3 form heterodimers with Jun proteins. They function as inhibitors of AP-1-driven transcription. Unlike most bZIP transcription factors that contain additional domains, BATF and BATF3 contain only the bZIP DNA-binding and dimerization domain. BATF2 contains an additional C-terminal domain of unknown function. BATF:Jun heterodimers preferentially bind to TPA response elements (TREs) with the consensus sequence TGA(C/G)TCA, and can also bind to a TGACGTCA cyclic AMP response element (CRE). In addition to negative regulation, BATF proteins also show positive transcriptional activities in the development of classical dendritic cells and T helper cell subsets, and in antibody production. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 58
32348 269850 cd14702 bZIP_plant_GBF1 Basic leucine zipper (bZIP) domain of Plant G-box binding factor 1 (GBF1)-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of plant bZIP transcription factors including Arabidopsis thaliana G-box binding factor 1 (GBF1), Zea mays Opaque-2 and Ocs element-binding factor 1 (OCSBF-1), Triticum aestivum Histone-specific transcription factor HBP1 (or HBP-1a), Petroselinum crispum Light-inducible protein CPRF3 and CPRF6, and Nicotiana tabacum BZI-3, among many others. bZIP G-box binding factors (GBFs) contain an N-terminal proline-rich domain in addition to the bZIP domain. GBFs are involved in developmental and physiological processes in response to stimuli such as light or hormones. Opaque-2 plays a role in affecting lysine content and carbohydrate metabolism, acting indirectly on starch/amino acid ratio. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32349 269851 cd14703 bZIP_plant_RF2 Basic leucine zipper (bZIP) domain of Plant RF2-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of plant bZIP transcription factors with similarity to Oryza sativa RF2a and RF2b, which are important for plant development. As homodimers or heterodimers with each other, they interact with and activate transcription from the RTBV (rice tungro bacilliform virus) promoter, which is regulated by sequence-specific DNA-binding proteins that bind to the essential cis element BoxII. RF2a and RF2b show differences in binding affinities to BoxII, expression patterns in different rice organs, and subcellular localization. Transgenic rice with increased RF2a and RF2b display increased resistance to rice tungro disease (RTD) with no impact on plant development. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32350 269852 cd14704 bZIP_HY5-like Basic leucine zipper (bZIP) domain of Plant Elongated/Long Hypocotyl5 (HY5)-like transcription factors and similar proteins: a DNA-binding and dimerization domain. This subfamily is predominantly composed of plant Basic leucine zipper (bZIP) transcription factors with similarity to Solanum lycopersicum and Arabidopsis thaliana HY5. Also included are the Dictyostelium discoideum bZIP transcription factors E and F. HY5 plays an important role in seedling development and is a positive regulator of photomorphogenesis. Plants with decreased levels of HY5 show defects in light responses including inhibited photomorphogenesis, loss of thylakoid organization, and reduced carotenoid accumulation. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32351 269853 cd14705 bZIP_Zip1 Basic leucine zipper (bZIP) domain of Fungal Zip1-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of fungal bZIP transcription factors including Schizosaccharomyces pombe Zip1, Saccharomyces cerevisiae Methionine-requiring protein 28 (Met28p), and Neurospora crassa cys-3, among others. Zip1 is required for the production of key proteins involved in sulfur metabolism and also plays a role in cadmium response. Met28p acts as a cofactor of Met4p, a transcriptional activator of the sulfur metabolic network; it stabilizes DNA:Met4 complexes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 55
32352 269854 cd14706 bZIP_CREBZF Basic leucine zipper (bZIP) domain of CREBZF/Zhangfei transcription factor and similar proteins: a DNA-binding and dimerization domain. CREBZF (also called Zhangfei, ZF, LAZip, or SMILE) is a neuronal bZIP transcription factor that is involved in the infection cycle of herpes simplex virus (HSV) and related cellular processes. It suppresses the ability of the HSV transactivator VP16 to initiate the viral replicative cycle. CREBZF has also been implicated in the regulation of the human nerve growth factor receptor trkA and the tumor suppressor p53. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 54
32353 269855 cd14707 bZIP_plant_BZIP46 Basic leucine zipper (bZIP) domain of uncharacterized Plant BZIP transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of uncharacterized plant bZIP transcription factors with similarity to Glycine max BZIP46, which may be a drought-responsive gene. Plant bZIPs are involved in developmental and physiological processes in response to stimuli/stresses such as light, hormones, and temperature changes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 55
32354 269856 cd14708 bZIP_HBP1b-like Basic leucine zipper (bZIP) domain of uncharacterized BZIP transcription factors with similarity to Triticum aestivum HBP-1b: a DNA-binding and dimerization domain. This subfamily is composed primarily of uncharacterized bZIP transcription factors from flowering plants, mosses, clubmosses, and algae. Included in this subfamily is wheat HBP-1b, which contains a C-terminal DOG1 domain, which is a specific plant regulator for seed dormancy. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 53
32355 269857 cd14709 bZIP_CREBL2 Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein-like 2 (CREBL2): a DNA-binding and dimerization domain. CREBL2 is a bZIP transcription factor that interacts with CREB and plays a critical role in adipogenesis and lipogenesis. Its overexpression upregulates the expression of PPARgamma and CEBPalpha to promote adipogenesis as well as accelerate lipogenesis by increasing GLUT1 and GLUT4. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 56
32356 269858 cd14710 bZIP_HAC1-like Basic leucine zipper (bZIP) domain of Fungal HAC1-like transcription factors: a DNA-binding and dimerization domain. HAC1 (also called Hac1p or HacA) is a bZIP transcription factor that plays a critical role in the unfolded protein response (UPR). The UPR is initiated by the ER-resident protein kinase and endonuclease IRE1, which promotes non-conventional splicing of the HAC1 mRNA, facilitating its translation. HAC1 binds to and activates promoters of genes that encode chaperones and other targets of the UPR. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 53
32357 269859 cd14711 bZIP_CEBPA Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein alpha (CEBPA): a DNA-binding and dimerization domain. CEBPA is a critical regulator of myeloid development; it directs granulocyte and monocyte differentiation. It is highly expressed in early myeloid progenitors and is found mutated in over half of patients with acute myeloid leukemia (AML). It is also a key regulator in energy homeostasis; mice deficient of CEBPA show abnormalities in glycogen/lipid synthesis and storage. CEBPA is the longest CEBP protein containing two transactivation domains at the N-terminus followed by a regulatory domain, a bZIP domain, and C-terminal tail. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32358 269860 cd14712 bZIP_CEBPB Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein beta (CEBPB): a DNA-binding and dimerization domain. CEBPB is a key regulator of metabolism, adipocyte differentiation, myogenesis, and macrophage activation. It is expressed as three distinct isoforms from an intronless gene through alternative translation initiation: CEBPB1 (or liver-enriched activator protein 1, LAP1); CEBPB2 (or LAP2); and CEBPB3 (or liver-enriched inhibitory protein, LIP). LAP1/2 function as transcriptional activators while LIP is a repressor due to its lack of a transactivation domain. The relative expression of LAP and LIP has effects on inflammation, ER stress, and insulin resistance. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 71
32359 269861 cd14713 bZIP_CEBPG Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein gamma (CEBPG): a DNA-binding and dimerization domain. CEBPG is an important regulator of cellular senescence; mouse embryonic fibroblasts deficient of CEBPG proliferated poorly, entered senescence prematurely, and expressed elevated levels of proinflammatory genes. It is also the primary transcription factor that regulates antioxidant and DNA repair transcripts in normal bronchial epithelial cells. In a subset of AML patients with CEBPA hypermethylation, CEBPG is significantly overexpressed. CEBPG is the shortest CEBP protein and it lacks a transactivation domain. It acts as a regulator and buffering reservoir against the transcriptional activities of other CEBP proteins. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32360 269862 cd14714 bZIP_CEBPD Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein delta (CEBPD): a DNA-binding and dimerization domain. CEBPD is an inflammatory response gene that is induced by Toll-like receptor 4 (TLR4) and is essential in the expression of many lipopolysaccharide (LPS)-induced genes and the clearance of bacterial infection. Its expression is increased in response to various extracellular stimuli and it induces growth arrest and apoptosis in cancer cells. It is thought to function as a tumor suppressor and its expression is found reduced by site-specific methylation in many cancers including breast, cervical, and hepatocellular carcinoma. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 65
32361 269863 cd14715 bZIP_CEBPE Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein epsilon (CEBPE): a DNA-binding and dimerization domain. CEBPE is a critical regulator of terminal granulocyte differentiation or granulopoiesis. It is expressed only in myeloid cells. Mice deficient in CEBPE are normal at birth and fertile, but they do not produce normal neutrophils or eosinophils, and show impaired inflammatory and bactericidal responses. Functional loss of CEBPE causes the rare congenital disorder, Neutrophil-specific granule deficiency (SGD), which is characterized by patients' neutrophils with atypical nuclear morphology, abnormal migration and bactericidal activity, and the lack of specific granules. Patients with SGD suffer from severe and frequent bacterial infections. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 61
32362 269864 cd14716 bZIP_CEBP-like_1 Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein (CEBP)-like proteins: a DNA-binding and dimerization domain. This group is an uncharacterized subfamily of CEBP-like proteins. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate the cell cycle, differentiation, growth, survival, energy metabolism, innate and adaptive immunity, and inflammation, among others. They are also associated with cancer and viral disease. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. Each possesses unique properties to regulate cell type-specific growth and differentiation. The sixth isoform, CEBPZ (zeta), lacks an intact DNA-binding domain and is excluded from this subfamily. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 60
32363 269865 cd14717 bZIP_Maf_small Basic leucine zipper (bZIP) domain of small musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The small Mafs (MafF, MafK, and MafG) do not contain a transactivation domain but do harbor the ancillary DNA-binding domain and a C-terminal bZIP domain. They form dimers with cap'n'collar (CNC) proteins that harbor transactivation domains, and they act either as activators or repressors depending on their dimerization partner. CNC transcription factors include NFE2 (nuclear factor, erythroid-derived 2) and similar proteins NFE2L1 (NFE2-like 1), NFE2L2, and NFE2L3, as well as BACH1 and BACH2. Small Mafs play roles in stress response and detoxification pathways. They also regulate the expression of betaA-globin and other genes activated during erythropoiesis. They have been implicated in various diseases such as diabetes, neurological diseases, thrombocytopenia and cancer. Triple deletion of the three small Mafs is embryonically lethal. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 70
32364 269866 cd14718 bZIP_Maf_large Basic leucine zipper (bZIP) domain of large musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The large Mafs (c-Maf, MafA, MafB, and neural retina leucine zipper or NRL) contain an N-terminal transactivation domain, a linker region of varying size, an ancillary DNA-binding domain, and a C-terminal bZIP domain. They function as critical regulators of terminal differentiation in the blood and in many tissues such as bone, brain, kidney, pancreas, and retina. MafA and MafB also play crucial roles in islet beta cells; they regulate genes essential for glucose sensing and insulin secretion cooperatively and sequentially. Large Mafs are also implicated in oncogenesis; MafB and c-Maf chromosomal translocations result in multiple myelomas. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 70
32365 269867 cd14719 bZIP_BACH Basic leucine zipper (bZIP) domain of BTB and CNC homolog (BACH) proteins: a DNA-binding and dimerization domain. BACH proteins are Cap'n'Collar (CNC) Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. In addition, they contain a BTB domain (Broad complex-Tramtrack-Bric-a-brac domain, also known as the POZ [poxvirus and zinc finger] domain) that is absent in other CNC proteins. Vertebrates contain two members, BACH1 and BACH2. BACH1 forms heterodimers with small Mafs such as MafK to function as a repressor of heme oxygenase-1 (HO-1) gene (Hmox-1) enhancers. It has also been implicated as the master regulator of breast cancer bone metastasis. The BACH1 bZIP transcription factor should not be confused with the protein originally named as BRCA1-Associated C-terminal Helicase1 (BACH1), which has been renamed BRIP1 (BRCA1 Interacting Protein C-terminal Helicase1) and also called FANCJ. BACH2 is a B-cell specific transcription factor that plays a critical role in oxidative stress-mediated apoptosis. It plays an important role in class switching and somatic hypermutation of immunoglobulin genes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 71
32366 269868 cd14720 bZIP_NFE2-like Basic leucine zipper (bZIP) domain of Nuclear Factor, Erythroid-derived 2 (NFE2) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed of NFE2 and NFE2-like proteins including NFE2-like 1 or NFE2-related factor 1 (NFE2L1 or Nrf1), NFE2L2 (or Nrf2), and NFE2L3 (or Nrf3). These are Cap'n'Collar (CNC) Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. NFE2 functions in development; it is required for the proper development of platelets. The three Nrfs function in stress responses. Nrf2, the most extensively studied member of this subfamily, acts as a xenobiotic-activated receptor that regulates the adaptive response to oxidants and electrophiles. As the master regulator of the antioxidant defense pathway, it plays roles in the biology of inflammation, obesity, and cancer. Nrf1 is an essential protein that binds to the antioxidant response element (ARE) and is also involved in regulating oxidative stress. In addition, it also regulates genes involved in cell and tissue differentiation, inflammation, and hepatocyte homeostasis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 68
32367 269869 cd14721 bZIP_Fos Basic leucine zipper (bZIP) domain of the oncogene Fos (Fos): a DNA-binding and dimerization domain. Fos proteins are members of the activator protein-1 (AP-1) complex, which is mainly composed of Basic leucine zipper (bZIP) dimers of the Jun and Fos families, and to a lesser extent, the activating transcription factor (ATF) and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. There are four Fos proteins: c-Fos, FosB, Fos-related antigen 1 (Fra-1), and Fra-2. In addition, FosB also exists as smaller splice variants FosB2 and deltaFosB2. They all contain an N-terminal region and a bZIP domain. c-Fos and FosB also contain a C-terminal transactivation domain which is absent in Fra-1/2 and the smaller FosB variants. Fos proteins can only heterodimerize with Jun and other AP-1 proteins, but cannot homodimerize. Fos:Jun heterodimers are more stable and can bind DNA with higher affinity than Jun:Jun homodimers. Fos proteins can enhance the trans-activating and transforming properties of Jun proteins. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 62
32368 269870 cd14722 bZIP_ATF3 Basic leucine zipper (bZIP) domain of Activating Transcription Factor-3 (ATF-3) and similar proteins: a DNA-binding and dimerization domain. ATF-3 is a Basic leucine zipper (bZIP) transcription factor that is induced by various stress signals such as cytokines, genotoxic agents, or physiological stresses. It is implicated in cancer and host defense against pathogens. It negatively regulates the transcription of pro-inflammatory cytokines and is critical in preventing acute inflammatory syndromes. Mice deficient in ATF3 display increased susceptibility to endotoxic shock-induced death. ATF3 dimerizes with Jun and other ATF proteins; the heterodimers function either as activators or repressors depending on the promoter context. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 62
32369 271240 cd14723 ZIP_Ppr1 Leucine zipper Dimerization coil of Ppr1-like transcription factors. Ppr1/Ppr1p activates transcription of the URA1, URA3, and URA4 genes, which encode enzymes involved in the regulation of pyrimidine levels. Also included in this subfamily is Colletotrichum acutatum Nir1 which plays a role during nitrogen-starvation conditions. Proteins in this subfamily are members of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 25
32370 271241 cd14724 ZIP_Gal4-like_1 Leucine zipper Dimerization domain of Gal4-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Gal4p family members are involved in the activation of genes in response to a specific signal. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Sip4p activates target genes under conditions of glucose deprivation while Nir1 plays a role during nitrogen-starvation conditions. This subfamily is composed of uncharacterized members of the Gal4p family. 24
32371 271242 cd14725 ZIP_Gal4-like_2 Leucine zipper Dimerization coil of Gal4-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Gal4p family members are involved in the activation of genes in response to a specific signal. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of the PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Sip4p activates target genes under conditions of glucose deprivation while Nir1 plays a role during nitrogen-starvation conditions. This subfamily is composed of uncharacterized members of the Gal4p family. 24
32372 350608 cd14726 TraB_PrgY-like TraB/PrgY confer plasmid-borne pheromone resistance. TraB/PrgY proteins, identified in the gut bacterium Enterococcus faecalis, are plasmid-borne homologs that are induced by pheromones. Induction renders the host bacterium insensitive to self-induction by its own pheromones, and prevents the transfer of the pheromone-inducible conjugative plasmids to bacteria that already contain it. Based on homology to Tiki activity, it has been proposed that TraB acts as a protease in the inactivation of mating pheromone, cleaving at the amino-terminus. The pheromones are small peptides (7-8 residues) encoded by the bacterial genome, and are specific for particular plasmids, or class of plasmids, which may contain several virulence factors and disseminate rapidly. Plasmid-borne antibiotic resistance and virulence determinants make these elements important contributors to medical problems. TraB/PrgY is a member of a Tiki-like superfamily. Tiki is a membrane-associated metalloprotease (MEROPS family M96) that inhibits Wnt via the cleavage of its amino terminus. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, Tiki inhibits Wnt-signaling. Tiki proteins are also related to erythromycin esterase, gumN plant pathogens, RtxA containing toxins, and Campylobacter jejuni ChaN heme-binding protein. 177
32373 350609 cd14727 ChanN-like ChaN is an iron-regulated, heme-binding protein. This family represents a domain found in ChaN, a heme-binding/iron-regulated lipoprotein from Campylobacter jejuni. ChaN, possibly involved in the uptake of heme-iron, contains a pair of cofacial heme groups situated between two ChaN monomers. A single tyrosine residue contacts the heme-bound iron atom while the heme-binding regions of each monomer also have contacts to the heme in the complementary monomer. ChaN presumably associates with an outer membrane-associated receptor, ChaR. Campylobacter jejuni is an important cause of food-borne illness, and is dependent on iron uptake from the host. ChaN like proteins are related to the Tiki/TraB like family of proteases. Proteins containing this domain also include protein reticulata-related from Arabidopsis which may play a role in leaf development. 211
32374 350610 cd14728 Ere-like Erythromycin esterase and succinoglycan biosynthesis related proteins. This group contains erythromycin esterase, which shares conserved active site residues of the Tiki/TraB family. Erythromycin esterases (EreA and EreB) disrupt erythromycin via the hydrolysis of the macrolactone ring. A critical catalytic histidine acts as a general base in the activation of a water molecule. Macrolides act by inhibiting bacterial protein synthesis by binding at the exit tunnel of ribosomal subunit 50s, blocking the translation of the polypeptide. Erythromycin esterase, typically found in integrons and transposons, confers antibiotic resistance through the disruption of the drug ring structure. EreB substrate profile is substantially broader than that for EreA, being able to also metabolize semisynthetic derivatives such as azalide azithromycin. 367
32375 350611 cd14729 RtxA-like C2-2 like domain of various multidomain toxins, including RTX-containing like proteins. This group contains mostly poorly-characterized C2-2 like domains of multidomain bacterial toxins, including Pasteurella multocida toxin PMT (also known as dermonecrotic toxin), MARTX (multifunctional-autoprocessing repeats-in-toxin holotoxin RtxA) proteins from Vibrio vulnificus, as well as bacterial effector protein from Pseudomonas syringae (type III effector HopAC1). MARTX domains at the N- and C- termini act in the translocation of the central domain across the eukaryotic plasma membrane, where it is proteolytically released. These are related to Pasteurella multocida toxin, which has structural and sequence similarity to the TIKI/TraB family of proteases. However, while this group of multidomain proteins shows fairly strong conservation of the active site residues of this family, the Pasteurella multocida toxin does not. 170
32376 269830 cd14730 LodA_like L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms. 509
32377 269831 cd14731 LodA_like_1 Uncharacterized proteins similar to L-lysine epsilon-oxidase from Marinomonas mediterranea. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is known for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. Although members of this related family share features of the active site, their functions are not known. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms. 587
32378 269832 cd14732 LodA L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. The dimerization interface observed in the available 3D structure does not seem to be conserved. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms. 639
32379 350515 cd14733 BACK BACK (BTB and C-terminal Kelch) domain. The BACK domain is found in architectures C-terminal to a BTB domain, in a diverse set of architectures together with Kelch, MATH, and/or TAZ domains. It is involved in interactions with the Cullin3 (Cul3) ubiquitin ligase complex, as well as in homo-oligomerization. Most proteins containing the BACK domain are understood to function as adaptor proteins that play a role in ubiquitination of various substrates. 55
32380 350516 cd14736 BACK_AtBPM-like BACK (BTB and C-terminal Kelch) domain found in plant BTB/POZ-MATH (BPM) protein family. The BPM protein family includes Arabidopsis thaliana BTB/POZ and MATH domain-containing proteins, AtBPM1-6, and similar proteins. BPM protein, also termed protein BTB-POZ and MATH domain, may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. 62
32381 269822 cd14737 PAAR_1 proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli. 94
32382 269823 cd14738 PAAR_2 proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli. 94
32383 269824 cd14739 PAAR_3 proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli. 90
32384 269825 cd14740 PAAR_4 proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of bacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). A few members contain C-terminal domain extensions corresponding to Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences as well as uncharacterized domains such as DUF4150. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. 121
32385 269826 cd14741 PAAR_5 proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family in bacteria as well as some archaea, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli. 95
32386 269827 cd14742 PAAR_RHS proline-alanine-alanine-arginine (PAAR) domain, also containing C-terminal Rearrangement hotspot (Rhs) extensions. This PAAR (proline-alanine-alanine-arginine) repeat subfamily, which forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS), contains C- and N-terminal domain extensions. These include Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences at the C-terminal, and various predicted functions at N- and C-terminal extensions. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. 86
32387 269828 cd14743 PAAR_CT_1 proline-alanine-alanine-arginine (PAAR) domain with C-terminal extension. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of mostly gamma-proteobacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). Some members contain C-terminal domain extensions corresponding to Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences as well as uncharacterized domains. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. 78
32388 269829 cd14744 PAAR_CT_2 proline-alanine-alanine-arginine (PAAR) domain with uncharacterized C-terminal extension. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of mostly beta- and gamma-proteobacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). Most members contain C-terminal domain extensions corresponding to several uncharacterized domains such as S-type pyocin, DUF2235, DUF2345 and cytotoxic proteins. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. 78
32389 270613 cd14745 GH66 Glycoside Hydrolase Family 66. Glycoside Hydrolase Family 66 contains proteins characterized as cycloisomaltooligosaccharide glucanotransferase (CITase) and dextranases from a variety of bacteria. CITase cyclizes part of a (1-6)-alpha-D-glucan (dextrans) chain by formation of a (1-6)-alpha-D-glucosidic bond. Dextranases catalyze the endohydrolysis of (1-6)-alpha-D-glucosidic linkages in dextran. Some members contain Carbohydrate Binding Module 35 (CBM35) domains, either C-terminal or inserted in the domain or both. 331
32390 270450 cd14747 PBP2_MalE Maltose-binding protein MalE; possesses type 2 periplasmic binding fold. This group includes the periplasmic maltose-binding component of an ABC transport system from the phytopathogen Xanthomonas citri and its related bacterial proteins. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 386
32391 270451 cd14748 PBP2_UgpB The periplasmic-binding component of ABC transport system specific for sn-glycerol-3-phosphate; possesses type 2 periplasmic binding fold. This group includes the periplasmic component of an ABC transport system specific for sn-glycerol-3-phosphate (G3P) and closely related proteins from archaea and bacteria. Under phosphate starvation conditions, Escherichia coli can utilize G3P as a phosphate source when exclusively imported by an ATP-binding cassette (ABC) transporter composed of the periplasmic binding protein, UgpB, the transmembrane subunits, UgpA and UgpE, and a homodimer of the nucleotide binding subunit, UgpC. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 385
32392 270452 cd14749 PBP2_XBP1_like The periplasmic-binding component of ABC transport systems specific for xylo-oligosaccharides; possesses type 2 periplasmic binding fold. This group represents the periplasmic component of an ABC transport system XBP1 that shows preference for xylo-oligosaccharides in the order of xylotriose > xylobiose > xylotetraose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 388
32393 270453 cd14750 PBP2_TMBP The periplasmic-binding component of ABC transport systems specific for trehalose/maltose; possesses type 2 periplasmic binding fold. This group represents the periplasmic trehalose/maltose-binding component of an ABC transport system and related proteins from archaea and bacteria. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 385
32394 270454 cd14751 PBP2_GacH The periplasmic-binding component of the putative oligosaccharide ABC transporter GacHFG; possesses type 2 periplasmic binding fold. This group represents the periplasmic component GacH of an ABC import system. GacH is identified as a maltose/maltodextrin-binding protein with a low affinity for acarbose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. 376
32395 270212 cd14752 GH31_N N-terminal domain of glycosyl hydrolase family 31 (GH31). This family is found N-terminal to the glycosyl-hydrolase domain of Glycoside hydrolase family 31 (GH31). GH31 includes the glycoside hydrolases alpha-glucosidase (EC 3.2.1.20), alpha-1,3-glucosidase (EC 3.2.1.84), alpha-xylosidase (EC 3.2.1.177), sucrase-isomaltase (EC 3.2.1.48 and EC 3.2.1.10), as well as alpha-glucan lyase (EC 4.2.2.13). All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite-1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues of the catalytic domain have been identified as the catalytic nucleophile and the acid/base, respectively. A loop of the N-terminal beta-sandwich domain is part of the active site pocket. 122
32396 271288 cd14755 GS_BA2291-HK-like Non-heme globin sensor domain of BA2291 histidine kinase and related domains. This subfamily includes the sensor domain of Bacillus anthracis BA2291 histidine kinase. BA2291 is one of the most active kinases in promoting sporulation, and is found in most members of the Bacillus cereus subfamily of the genus Bacillus, which includes B. anthracis and Bacillus thuringiensis, but not Bacillus subtilis. This subfamily also includes two sensor-only plasmid encoded sporulation inhibitors pXO1-118 and pXO2-61 found only in B. anthracis and various strains of Bacillus cereus having similar plasmids. The pXO1-118 and pXO2-61 sensor domains form homodimers, and in vitro bind fatty acid and halide, and not heme; there may be roles for fatty acid (or similar molecule), chloride ion, and possibly pH, as signaling cues. It has been proposed that BA2291 senses the same environmental cue in vivo, and that pXO1-118 and pXO2-61 act by titrating out an environmental signal that might cause an ill-timed sporulation. 132
32397 381270 cd14756 TrHb Truncated Mb-fold globins, T family. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). Typical of the TrHb1s (N) group is a protein matrix tunnel. An example of a TrHb1 is Mycobacterium tuberculosis TrHb1/Mt-trHbN which is expressed during the Mycobacterium stationary phase, and plays a specific defense role against nitrosative stress. TrHb2s include the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins. TrHb3s include Campylobacter jejuni Ctb, encoded by Cj0465c, which may play a role in moderating O2 flux within C. jejuni. 111
32398 381271 cd14757 GS_EcDosC-like_GGDEF Globin sensor domain of Escherichia coli Direct Oxygen Sensing Cyclase and related proteins; coupled to a C-terminal GGDEF domain. Globin-coupled-sensors belonging to this subfamily have a C-terminal diguanylate cyclase (DGC/GGDEF) domain coupled to the globin sensor domain. DGC/GGDEF likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP). Members include Escherichia coli DosC (also known as YddV), the gene for which is found in a two-gene operon, dosCP. In DosC, the sensory globin domain is coupled to a GGDEF-class diguanylate cyclase, while in DosP, a heme-containing PAS domain is coupled to an EAL-class c-di-GMP phosphodiesterase. DosP and DosC associate in a di-GMP-responsive Escherichia coli RNA processing complex along with polynucleotide phosphorylase (PNPase), enolase, RNase E, and RNA. 149
32399 381272 cd14758 GS_GGDEF_1 Globin sensor domain, coupled to DGC/GGDEF domains; uncharacterized subgroup. Globin-coupled-sensors belonging to this subfamily have a sensor domain coupled to a C-terminal diguanylate cyclase (DGC/GGDEF) domain. DGC/GGDEF likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP). 148
32400 381273 cd14759 GS_GGDEF_2 Globin sensor domain, coupled to DGC/GGDEF domains; uncharacterized subgroup. The majority of globin-coupled-sensors in this subfamily have diguanylate cyclase (DGC/GGDEF) domains N-terminal to the globin sensor domain and/or C-terminal EAL domains. DGC/GGDEF and EAL domains are involved in the synthesis and degradation of the second messenger c-di-GMP, respectively. Some members have GAF small-molecule-binding domains in addition. 150
32401 271293 cd14760 GS_PAS-GGDEF-EAL Globin sensor domain; coupled to PAS, DGC/GGDEF and EAL domains. In addition to the N-terminal sensing domain, globin-coupled-sensors in this bacterial subfamily have a signal-sensing PAS domain, and diguanylate cyclase (DGC/GGDEF) and EAL domains. The latter two domains are involved in the synthesis and degradation of c-di-GMP, respectively, and may be involved in regulating cell surface adhesiveness, and in the transition between planktonic and biofilm growth modes. 148
32402 381274 cd14761 GS_GsGCS-like Globin sensor domain of Geobacter sulfurreducens globin-coupled-sensor and related proteins. GsGCS is a GCS of unknown function, comprised of an N-terminal globin sensor domain and a C-terminal transmembrane signal-transduction domain. For GCSs in general, the first signal O2 binds to/dissociates from the heme iron complex inducing a structural change in the globin domain, which is then transduced to the functional domain, switching on (or off) the function of the latter. Ferric GsGCS is bis-histidyl hexa-coordinated (provided by a His residue located at the E11 topological site, as distinct from the E7 site). Ferrous GsGCS is a penta- and hexa-coordinated mixture. The C-terminal domains of other members of this subfamily include histidine kinase and PsiE domains. 149
32403 381275 cd14762 GS_STAS Globin sensor domain; coupled to a STAS domain. Globin-coupled-sensors in this subfamily have a C-terminal sulphate transporter and anti-sigma factor antagonist (STAS) domain coupled to the globin sensor domain. 143
32404 381276 cd14763 SSDgbs_1 Sensor single-domain globins; uncharacterized bacterial subgroup. This subfamily of sensor single-domain globins, belongs to a family that includes GCSs (globin-coupled-sensors) and single-domain protoglobins (Pgbs). For GCSs, an N-terminal heme-bound oxygen-sensing/binding globin domain is coupled to a C-terminal functional/signaling domain. The first signal O2 binds to/dissociates from the heme in its sensor domain inducing a conformational change in that domain and ultimately in the signaling domain. It has been demonstrated that the Pgbs and other single domain globins can function as sensors, when coupled to an appropriate regulatory domain. 144
32405 381277 cd14764 SSDgbs_2 Sensor single-domain globins; uncharacterized subgroup. This subfamily of sensor single-domain globins, belongs to a family that includes GCSs (globin-coupled-sensors) and single-domain protoglobins (Pgbs). For GCSs, an N-terminal heme-bound oxygen-sensing/binding globin domain is coupled to a C-terminal functional/signaling domain. The first signal O2 binds to/dissociates from the heme in its sensor domain inducing a conformational change in that domain and ultimately in the signaling domain. It has been demonstrated that the Pgbs and other single domain globins can function as sensors, when coupled to an appropriate regulatory domain. 145
32406 381278 cd14765 Hb Hemoglobins. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T(tense state) Hb having low oxygen affinity, and oxygenated /liganded/R(relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2) which, in normal individuals, is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary Hb throughout most of gestation. These Hb types have differences in O2 affinity and in their interactions with allosteric effectors. 131
32407 381279 cd14766 CeGLB25-like Caenorhabditis elegans globin GLB-25, and related globins. The C. elegans genome contains 33 genes encoding globins that are all transcribed. These are very diverse in gene and protein structure and are localized in a variety of cells. The C. elegans globin GLB-25 (locus tag T06A1.3), like the majority of them, was expressed in neuronal cells in the head and tail portions of the body and in the nerve cord. 137
32408 271300 cd14767 PE_beta-like Phycoerythrin beta subunit, a component of the phycobilisome rod; and related proteins. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This subfamily also includes the beta subunits of cryptophyte phycobiliproteins which represent another type of biliprotein antenna with different structure and organization. The beta subunits of cryptophyte PBPs share a high degree of sequence identity with both the alpha and beta subunits of the cyanobacterial and red algal PBPs; however, the alpha cryptophyte subunits are shorter and unrelated. There is only one type of PBP present in a single species, either phycocyanin or phycoerythrin, but not allophycocyanin. Structurally, phycoerythrin in cryptophytes is an alpha1alpha2betabeta dimer and not a trimer as in the PBS. 176
32409 271301 cd14768 PC_PEC_beta Beta subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 171
32410 271302 cd14769 PE_alpha Phycoerythrin alpha subunit, a phycobilisome rod component. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 164
32411 271303 cd14770 PC-PEC_alpha Alpha subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). 162
32412 381280 cd14771 TrHb2_Mt-trHbO-like_O Truncated hemoglobins, group 2 (O); Mycobacterium tuberculosis hemoglobin O like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). This group includes a Mycobacterium tuberculosis TrHb2, Mt-trHbO, encoded by the Mycobacterium tuberculosis glbO gene, which is expressed throughout the Mycobacterium growth phase. It also includes a TrHb2 from the thermophilic Thermobifida fusca (Tf-trHb), which has a high thermostability; at the optimal growth temperature for Thermobifida fusca (between 55 and 60 degrees C), it is capable of efficient O2 binding and release. Tf-trHb shares a relatively slow rate of oxygen binding with Mt-trHbO. 119
32413 381281 cd14772 TrHb2_Bs-trHb-like_O Truncated hemoglobins, group 2 (O); Bacillus subtilis TrHb like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s belonging to this group include monomeric Bacillus subtilis trHb (Bs-trHb), which exhibits an extremely high oxygen affinity, and a dimeric TrHb2 from the thermophilic aerobic spore forming bacterium Geobacillus stearothermophilus (Gs-trHb). 116
32414 381282 cd14773 TrHb2_PhHbO-like_O Truncated hemoglobins, group 2 (O); Pseudoalteromonas haloplanktis PhHbO like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s belonging to this group include Pseudoalteromonas haloplanktis PhHbO (encoded by the PSHAa0030 gene), which appears to be involved in oxidative and nitrosative stress resistance. 119
32415 381283 cd14774 TrHb2_HGbIV-like_O Hell's Gate globin IV and similar truncated hemoglobins, group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). 131
32416 381284 cd14775 TrHb2_O-like Truncated hemoglobins, group 2 (O); uncharacterized subgroup. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). 119
32417 271309 cd14776 HmpEc-globin-like Globin domain of Escherichia coli flavohemoglobin (Hmp) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. This subfamily includes Vibrio fischeri Hmp and E.coli Hmp. NO scavenging by flavoHb affects the swarming behavior of Escherichia coli, and protects against NO during initiation of the squid-Vibrio symbiosis. E.coli Hmp can catalyze the reduction of several alkylhydroperoxide substrates into their corresponding alcohols using NADH as an electron donor, and it has been suggested that it participates in the repair of the lipid membrane oxidative damage generated during oxidative/nitrosative stress. 138
32418 381285 cd14777 Yhb1-globin-like Globin domain of Saccharomyces cerevisiae flavohemoglobin (Yhb1p) and related domains. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. S. cerevisiae Yhb1p has been shown to protect against nitrosative stress and to control ferric reductase activity; it may participate in regulating the activity of plasma membrane ferric reductase(s). Also included in this subfamily is Dictyostelium discoideum FlavoHb, the expression of which affects D. discoideum development. 140
32419 381286 cd14778 VtHb-like_SDgb Vitreoscilla stercoraria hemoglobin and related proteins; single-domain globins. VtHb is homodimeric, and may both transport oxygen to terminal respiratory oxidases, and provide resistance to nitrosative stress. It has medium oxygen affinity and displays cooperative ligand-binding properties. VtHb has biotechnological applications: its expression in heterologous hosts (bacteria and plants) has improved growth and productivity under microaerobic conditions. Another member of this subfamily, Campylobacter jejuni hemoglobin (Cgb), is monomeric and plays a role in detoxifying NO. Along with a truncated globin Ctb, it is up-regulated by the transcription factor NssR in response to nitrosative stress. 140
32420 381287 cd14779 FHP_Ae-globin-like Globin domain of Alcaligenes eutrophus flavohemoglobin (FHP) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb maintains Medicago truncatula-Sinorhizobium meliloti symbiosis. Alcaligenes eutrophus FHP contains a phospholipid-binding site. 140
32421 381288 cd14780 HmpPa-globin-like Globin domain of Pseudomonas aeruginosa flavohemoglobin (HmpPa) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. The physiological role of HmpPa is thought to be detoxification of NO under aerobic conditions. 140
32422 381289 cd14781 FHb-globin_1 Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. This subfamily may contain some single-domain globins (SDgbs). 139
32423 381290 cd14782 FHb-globin_2 Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. 143
32424 271316 cd14783 FHb-globin_3 Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. 140
32425 381291 cd14784 class1_nsHb-like Class 1 nonsymbiotic hemoglobins and related proteins. Class1 nsHbs include the dimeric hexacoordinate Trema tomentosa nsHb and the dimeric hexacoordinate nsHb from monocot barley. This subfamily also includes ParaHb, a dimeric pentacoordinate Hb from the root nodules of Parasponia andersonii, a non-legume capable of symbiotic nitrogen fixation. ParaHb is unusual in that it has different heme redox potentials for each subunit. 149
32426 270211 cd14785 V-ATPase_C Subunit C of vacuolar H+-ATPase (V-ATPase). This family contains subunit C of vacuolar H+-ATPase (V-ATPase), a protein that plays a crucial role in the vacuolar system of eukaryotic cells. The main function of V-ATPase is to generate a proton-motive force at the expense of ATP and to cause limited acidification in the internal space (lumen) of several organelles of the vacuolar system. V-ATPases are multi-subunit protein complexes made up of two distinct structures: a peripheral catalytic sector (V1) and a hydrophobic membrane sector (V0) responsible for driving protons; subunit C is one of five polypeptides composing V1. The C subunit is intimately involved in the reversible dissociation of the V1 and V0 structures. It has also been identified as a mediator of the acidic microenvironment of tumors which it controls by proton extrusion to the extracellular medium. The acidic environment causes tissue damage, activates destructive enzymes in the extracellular matrix, and promotes the acquisition of metastatic cell phenotypes. 368
32427 341075 cd14786 STAT_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription (STAT), also called alpha domain. This family consists of the coiled-coil (alpha) domain of the STAT proteins (Signal Transducer and Activator of Transcription, or Signal Transduction And Transcription), which are latent cytoplasmic transcriptional factors that play an important role in cytokine and growth factor signaling. STAT proteins regulate several aspects of growth, survival and differentiation in cells. The transcription factors of this family are activated by JAK (Janus kinase) and dysregulation of this pathway is frequently observed in primary tumors and leads to immunosuppression, increased angiogenesis and enhanced survival of tumors. There are seven mammalian STAT family members that have been identified: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B and STAT6. STAT proteins consist of six structural regions: N-domain (ND)/protein interaction domain, coiled-coil domain (CCD)/STAT all alpha domain, DNA-binding domain (DBD), linker domain (LK), a Src homology 2 (SH2) domain, and C-terminal transcriptional activation domain (TA) that includes two conserved phosphorylation sites (tyrosine and serine residues). The coiled-coil or alpha domain is an interacting region with other proteins, including IRF-9/p48 for STAT1, c-Jun, StIP1, and GRIM-19 for STAT3, and SMRT with STAT5A and STAT5B. A functional STAT1 mutant (phenylalanine to serine) in this domain region shows significantly decreased protein expression caused by translational/post-translational mechanisms independent of proteasome machinery. The phenylalanine is not conserved in STAT4 and STAT6 that have tight specificity, suggesting a novel potential mechanism of specific activation of STAT proteins. Specifically, STAT3, STAT5, and STAT6, which are continually imported to the nucleus independent of tyrosine phosphorylation, require the conformational structure of their coiled-coil domains. 125
32428 350612 cd14787 Tiki_TraB-like diverse proteins related to the Tiki and TraB protease domains. The extracellular domain of Tiki family proteins shares homology with bacterial TraB/PrgY proteins which are known for their roles in the inhibition of mating pheromones. Tiki and TraB/PrgY proteins share limited sequence identity, but their predicted secondary structures reveal that several catalytic residues are anchored in a similar manner, consistent with a common evolutionary origin. Tiki domains are related to the erythromycin esterase, gumN plant pathogens, RtxA toxins, and Campylobacter jejuni heme-binding, ChaN-like proteins. Tiki is a membrane-associated metalloprotease (MEROPS family M96) that inhibits Wnt via the cleavage of its amino terminus, diminishing Wnt's binding to receptors. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, Tiki inhibits Wnt-signaling, which is important in embryogenesis, homeostasis, and regeneration. Deregulation of Wnt contributes to birth defects, cancer and various diseases. TraB/PrgY protein has been identified in gut bacterium Enterococcus faecalis, but its function has not been well characterized. Plasmid-borne TraB has been implicated in the regulation of pheromone sensitivity and specificity. Based on homology to Tiki activity, it has been proposed that TraB acts as a metalloprotease in the inactivation of mating pheromone. Pasteurella multocida toxin has structural and sequence similarity to the Tiki/TraB family of proteases. However, unlike related multidomain toxins in this family, they do not exhibit conservation of the typical active site residues. 127
32429 350613 cd14788 GumN poorly characterized family of proteins related to gumN pathogenicity factor of Xanthomonas. GumN, a poorly characterized protein, is part of the large gum cluster of pathogenicity factors of the plant pathogen Xanthomonas. Except for GumN, the gum cluster is conserved, and proteins of this operon are involved in the production of xanthan, an extracellular polysaccharide that promotes plant disease. Xanthomonas campestris is responsible for 'black rot' disease in certain crop plants. GumN has sequence similarity to the Tiki/TraB protease family, but lacks the typical conserved residues of the active site. 286
32430 350614 cd14789 Tiki Tiki homology domain antagonizes Wnt function via cleavage of amino-terminal residues. Tiki is a membrane-associated metalloprotease that inhibits Wnt via the cleavage of its amino terminus, diminishing Wnt's binding to receptors. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, TIKI inhibits Wnt-signaling, which is important in embryogenesis, homeostasis, and regeneration. Deregulation of Wnt contributes to birth defects, cancer and various diseases. TIKI homology domains are part of the TraB family and are related to the erythromycin esterase, GumN plant pathogens, RtxA toxins, and Campylobacter jejuni heme-binding, ChaN-like proteins. TraB/PrgY protein has been identified in gut bacterium Enterococcus faecalis, but its function has not been well characterized. Plasmid-borne TraB has been implicated in the regulation of pheromone sensitivity and specificity. Based on homology to TIKI activity, it has been proposed that TraB acts as a metalloprotease in the inactivation of mating pheromone. The TIKI/TraB family has 2 conserved GxxH motifs and conserved glutamate and arginine residues that may be catalytic. 259
32431 269891 cd14790 GH_D Glycoside hydrolases, clan D. This group of glycosyl hydrolase families is comprised of glycosyl hydrolase family 31 (GH31), family 36 (GH36), and family 27 (GH27). These structurally and mechanistically related protein families are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively. They have a wide range of functions including alpha-glucosidase, alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase, alpha-N-acetylgalactosaminidase, stachyose synthase, raffinose synthase, and alpha-1,4-glucan lyase. 253
32432 269892 cd14791 GH36 glycosyl hydrolase family 36 (GH36). GH36 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-galactosidase, alpha-N-acetylgalactosaminidase, stachyose synthase, and raffinose synthase. All GH36 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. GH36 members are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively. 299
32433 269893 cd14792 GH27 glycosyl hydrolase family 27 (GH27). GH27 enzymes occur in eukaryotes, prokaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-N-acetylgalactosaminidase, and 3-alpha-isomalto-dextranase. All GH27 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. GH27 members are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively. 271
32434 269816 cd14793 DUF302_like Domains similar to DUF302 and the N-terminal domains found in some bacterial RNAses. DUF302 is an uncharacterized domain with widespread phylogenetic distribution. It appears homologous to the N-terminal domains of RNAse H3 and the Escherichia coli toxin RnlA. 81
32435 269817 cd14794 RNLA_N_1 N-terminal repeat domain of toxin RnlA; first out of two repeats. The Escherichia coli toxin RnlA functions as an mRNA endoribonuclease and is part of a two-component toxin-antitoxin system that promotes resistance to phage infections, together with the antitoxin RnlB. RNAse activity is located in the C-terminal domain. This N-terminal domain appears to participate in homodimerization and is the first out of two repeats. 90
32436 269818 cd14795 RNLA_N_2 N-terminal repeat domain of toxin RnlA; second out of two repeats. The Escherichia coli toxin RnlA functions as an mRNA endoribonuclease and is part of a two-component toxin-antitoxin system that promotes resistance to phage infections, together with the antitoxin RnlB. RNAse activity is located in the C-terminal domain. This N-terminal domain appears to participate in homodimerization and is the second out of two repeats. 87
32437 269819 cd14796 RNAse_HIII_N N-terminal domain of ribonuclease H3. RNAse H3 (HIII) is a bacterial type 2 ribonuclease, which endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, and plays a role in DNA replication and repair. The N-terminal domain characterized by this model has been shown to be important in substrate binding; it might form initial contacts with the substrate and not be part of the active complex that involves the C-terminal ribonuclease domain. This domain has also been characterized as DUF3378. 66
32438 269820 cd14797 DUF302 Uncharacterized domain family DUF302. These domains are mostly found in bacterial single-domain proteins and have been shown to form homodimers; they may also bind zinc. Also characterized as COG3439. 124
32439 271353 cd14798 RX-CC_like Coiled-coil domain of the potato virus X resistance protein and similar proteins. The potato virus X resistance protein (RX) confers resistance against potato virus X. It is a member of a family of resistance proteins with a domain architecture that includes an N-terminal coiled-coil domain (modeled here), a nucleotide-binding domain, and leucine-rich repeats (CC-NB-LRR). These intracellular resistance proteins recognize pathogen effector proteins and will subsequently trigger a response that may be as severe as localized cell death. The N-terminal coiled-coil domain of RX has been shown to interact with RanGAP2, which is a necessary co-factor in the resistance response. 124
32440 341082 cd14801 STAT_DBD DNA-binding domain of Signal Transducer and Activator of Transcription (STAT). This family consists of the DNA binding domain (DBD) of the STAT proteins (Signal Transducer and Activator of Transcription, or Signal Transduction And Transcription), which are latent cytoplasmic transcriptional factors that play an important role in cytokine and growth factor signaling. STAT proteins regulate several aspects of growth, survival and differentiation in cells. The transcription factors of this family are activated by JAK (Janus kinase) and dysregulation of this pathway is frequently observed in primary tumors and leads to immunosuppression, increased angiogenesis and enhanced survival of tumors. There are seven mammalian STAT family members that have been identified: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B and STAT6. STAT proteins consist of six structural regions: N-terminal domain (ND)/protein interaction domain, coiled-coil domain (CCD)/STAT all alpha domain, DNA-binding domain (DBD), linker domain (LK), a Src homology 2 (SH2) domain, and C-terminal transcriptional activation domain (TA) that includes two conserved phosphorylation sites (tyrosine and serine residues). STAT1 and STAT3 have the greatest diversity of biological functions among the 7 known members of the STAT family. The DNA binding domain of STAT has an Ig-like fold. DNA binding specificity experiments of different STAT proteins show that STAT5A specificity is more similar to that of STAT6 than that of STAT1, as also seen from the evolutionary relationships. 157
32441 269812 cd14803 RAP Receptor-associated protein (RAP). Receptor-associated protein, RAP, is an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. RAP associates with (LDL) receptor-related protein (LRP) early in the secretory pathway, reducing its ligand binding capacity, and then dissociates from LRP in the low-pH environment of the Golgi; studies have shown that histidine residues in RAP D3 serve as a switch that facilitates its uncoupling from the receptor. RAP is a modular protein identified as having an internal triplication, with domains, D1, D2, and D3, each thought to have distinct functions; these domains are independent and do not interact. The carboxyl-terminal domain (D3) of RAP is required for folding and trafficking of LRP, while the amino-terminal tandem D1D2 domains of RAP are essential for blocking LRP from binding of certain ligands, such as activated forms of alpha2-macroglobulin. 97
32442 271351 cd14804 Tra_M TraM mediates signalling between transferosome and relaxosome. TraM is a plasmid encoded DNA-binding protein that is essential for conjugative transfer of F-like plasmids (e.g. F, R1, R100 and pED208) between bacterial cells. Bacterial conjugation, a form of horizontal gene transfer between cells, is an important contributor to bacterial genetic diversity, enabling virulence and antibiotics resistance factors to rapidly spread in medically important human pathogens. Mutation studies have shown that TraM is required for normal levels of transfer gene expression as well as for efficient site-specific single-stranded DNA cleavage at the origin of transfer (oriT). TraM tetramers bridge oriT to a key component of the conjugative pore, the coupling protein TraD. The N-terminal ribbon-helix-helix (RHH) domain of TraM is able to cooperatively bind DNA in a staggered arrangement without interaction between tetramers. This allows the C-terminal TraM tetramerization domains to be free to make multiple interactions with TraD, thus driving plasmid recruitment to the conjugative pore. 122
32443 271348 cd14805 Translin-like Translin and translin-associated factor-X (TRAX). Translin (also known as TB-RBP), and its binding partner protein TRAX (translin-associated factor-X) are a paralogous pair of conserved proteins, and oligomeric complexes of TRAX and translin are known as C3PO proteins (for component 3 promoter of RNA-induced silencing complex or RISC). The Translin-Trax complex enhances the removal of the passenger strand in RNAi and the formation of active RISC. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear. It has been shown that Trax subunit, but not Translin, possesses a Glu-Glu-Asp catalytic center with the capacity to digest RNA as well as DNA; this catalytic activity is required for passenger-strand removal and RISC activation in RNAi. In Archaeoglobus fulgidus, Trax-like-subunits assemble into an octameric structure, highly similar to human C3PO; its complex with duplex RNA reveals that the octamer entirely encapsulates a single 13-base-pair RNA duplex inside a large inner cavity. 197
32444 269813 cd14806 RAP_D1 Receptor-associated protein (RAP), Domain 1. This subfamily is the N-terminal domain (D1) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D1 as well as domain 2 (D2) are essential for blocking low-density lipoprotein receptor-related protein (LRP) from binding of certain ligands, such as activated forms of alpha2-macroglobulin; D1 and D2 each bind LRP weakly but the tandem D1D2 binds much more tightly, suggesting the avidity effects arising from amino acid residues contributed from each domain. The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease. 71
32445 269814 cd14807 RAP_D2 Domain 2 of receptor-associated protein (RAP). This subfamily is the N-terminal domain (D2) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D2, along with RAP domain 1 (D1), is essential for blocking low-density lipoprotein receptor-related protein (LRP) from binding of certain ligands, such as alpha2-macroglobulin; D1 and D2 each bind LRP weakly but the tandem D1D2 binds much more tightly to the second and the fourth ligand-binding clusters present on LRP, suggesting the avidity effects arising from amino acid residues contributed from each domain. Also, RAP has regions that interact weakly with heparin, one located in D2 and two located in D3. The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease. 98
32446 269815 cd14808 RAP_D3 C-terminal receptor-associated protein (RAP), Domain 3. This subfamily is the C-terminal domain (D3) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D3 is required for folding and trafficking of low-density lipoprotein receptor-related protein (LRP). In the mildly acidic pH of the Golgi, unfolding of RAP-D3 helical bundle facilitates dissociation of RAP from the LDL receptor type A (LA) repeats of LDLR family proteins. Also, RAP has 3 regions that interact weakly with heparin, two regions located in D3 and one in RAP domain 2 (D2). The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease. 100
32447 269871 cd14809 bZIP_AUREO-like Basic leucine zipper (bZIP) domain of blue light (BL) receptor aureochrome (AUREO) and similar bZIP domains. AUREO is a BL-activated transcription factor specific to phototrophic stramenopiles. It has a bZIP and a BL-sensing light-oxygen voltage (LOV) domain. It has been shown to mediate BL-induced branching and regulate the development of the sex organ in Vaucheria frigida. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. This subgroup also includes the Epstein-Barr virus (EBV) immediate-early transcription factor ZEBRA (BZLF1, Zta, Z, EB1). ZEBRA exhibits a variant of the bZIP fold, it has a unique dimer interface and a substantial hydrophobic pocket; it has a C-terminal moiety which stabilizes the coiled coil involved in dimer formation. ZEBRA functions to trigger the switch of EBV's biphasic infection cycle from latent to lytic infection. It activates the promoters of EBV lytic genes by binding ZEBRA response elements (ZREs) and inducing a cascade of expression of over 50 viral genes. It also down regulates latency-associated promoters, is an essential replication factor, induces host cell cycle arrest, and alters cellular immune responses and transcription factor activity. 52
32448 269872 cd14810 bZIP_u1 Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32449 269873 cd14811 bZIP_u2 Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32450 269874 cd14812 bZIP_u3 Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32451 269875 cd14813 bZIP_BmCbz-like Basic leucine zipper (bZIP) domain of Bombyx mori chorion b-ZIP transcription factor and similar bZIP domains. Bombyx mori chorion b-ZIP transcription factor, is encoded by the Cbz gene. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. 52
32452 350615 cd14814 Peptidase_M15 Metalloproteases including zinc D-Ala-D-Ala carboxypeptidase, L-Ala-D-Glu peptidase, L,D-carboxypeptidase, bacteriophage endolysins, and related proteins. This family summarizes zinc-binding metallopeptidases which are mostly carboxypeptidases and dipeptidases, and includes zinc-dependent D-Ala-D-Ala carboxypeptidases, VanX, L-Ala-D-Glu peptidase, L,D-carboxypeptidase and bacteriophage endolysins, amongst other family members. These peptidases belong to MEROPS family M15 which are involved in bacterial cell wall biosynthesis and metabolism. 111
32453 271352 cd14815 BA_2398_like Putative Bacillus anthracis lipoprotein and related proteins. Uncharacterized protein family found in Bacilli and Gammaproteobacteria 145
32454 350616 cd14817 D-Ala-D-Ala_dipeptidase_VanX D-Ala-D-Ala dipeptidase VanX. D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. It belongs in the MEROPS peptidase family M15, subfamily D. 199
32455 341426 cd14818 longin-like Longin-like domains. Longin-like domains are small protein domains present in a variety of proteins and members of protein complexes involved in or required for different steps during the transport of proteins from the ribosome to the ER to the plasma membrane, via the Golgi apparatus. Examples are mu and sigma subunits of the heterotetrameric adaptor protein (AP) complex, zeta and delta subunits of the heterotetrameric F-COPI complex, a subgroup of R-SNARE proteins, a subfamily of the transport protein particle (TRAPP), and the signal recognition particle receptor subunit alpha (SR-alpha). 117
32456 271349 cd14819 Translin Translin, also known as TB-RBP (testis brain RNA-binding protein). Translin (also known as TB-RBP for Testis Brain RNA-binding protein, a mouse ortholog), is a paralog of its binding partner protein TRAX (translin-associated factor-X) and together they form oligomeric complexes known as C3PO proteins (for component 3 promoter of RNA-induced silencing complex or RISC). DNA damage has been proposed to stimulate transport of Translin into nuclei. It binds to RNA and single-stranded DNA, and its selectivity is modulated by interactions with GTP and TRAX. Translin may also regulate dendritic trafficking of BDNF RNAs as well as function as a key activator of siRNA-mediated silencing in drosophila. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear. 206
32457 271350 cd14820 TRAX Translin-associated factor-X (TRAX). TRAX (translin-associated factor-X) is a paralog of its binding partner protein Translin and together they form oligomeric complexes known as C3PO proteins (for component 3 promoter of RNA-induced silencing complex or RISC). TRAX complexed with Translin is possibly involved in dendritic RNA processing and in DNA double-strand break repair as an interacting partner with C1D, an activator of the DNA-dependent protein kinase involved in the repair of DNA-double strand breaks. It has been shown that Trax subunit, but not Translin, possesses a Glu-Glu-Asp catalytic center with the capacity to digest RNA; this catalytic activity is required for passenger-strand removal and RISC activation in RNAi. In Archaeoglobus fulgidus, Trax-like-subunits assemble into an octameric structure, highly similar to human C3PO; its complex with duplex RNA reveals that the octamer entirely encapsulates a single 13-base-pair RNA duplex inside a large inner cavity. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear. 182
32458 350517 cd14821 BACK_SPOP_like BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein (SPOP) and similar proteins. This family includes speckle-type POZ protein (SPOP), speckle-type POZ protein-like (SPOPL), TD and POZ domain-containing proteins (TDPOZ), Drosophila melanogaster protein roadkill, and similar proteins. Both SPOP and SPOPL serve as adaptors of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins involved in transcription regulation in early development and other cellular processes. Drosophila melanogaster protein roadkill, also termed Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor. 59
32459 350518 cd14822 BACK_BTBD9 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 9 (BTBD9). BTBD9 is a risk factor for Restless Legs Syndrome (RLS) encoding a Cullin-3 substrate adaptor. The BTBD9 gene may be associated with antipsychotic-induced RLS in schizophrenia. Mutations in BTBD9 lead to reduced dopamine, increased locomotion and sleep fragmentation. 101
32460 341427 cd14823 AP_longin-like Longin-like domains of AP complex subunits. AP complex sigma subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 131
32461 341428 cd14824 Longin longin domain. Longin-domain is N-terminal domain of a subgroup of R-SNARE proteins, including VAMP7, Ykt6, and Sec22. Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain. 122
32462 341429 cd14825 TRAPPC2_sedlin Trafficking protein particle complex subunit 2. Trafficking protein particle complex subunit 2 (TRAPPC2), also known as Sedlin (SEDL) or TRS20, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. In humans, deletions or point mutations in the SEDL gene cause the genetic disease spondyloepiphyseal dysplasia tarda (SEDT), an X-linked skeletal disorder. 135
32463 341430 cd14826 SR_alpha_SRX SRX domain of signal recognition particle receptor subunit alpha. Signal recognition particle receptor subunit alpha (SR-alpha) is part of the membrane-associated heterodimeric receptor for the signal recognition particle (SRP). The signal recognition particle (SRP) pathway is highly conserved and plays an important role in the translocation of proteins across and insertion into membranes by targeting the translating ribosome to the endoplasmic reticulum. The N-terminal SRX domain of SR-alpha has a profilin-like fold and has been shown to be the interaction site with the second subunit, SR-beta. 118
32464 341431 cd14827 AP_sigma AP complex subunit sigma. AP complex sigma subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 138
32465 341432 cd14828 AP_Mu_N AP complex subunit mu N-terminal domain. AP complex mu subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal. 136
32466 341433 cd14829 Zeta-COP zeta subunit of the F-COPI complex. Zeta subunit of the heterotetrameric F-COPI complex, which consists of one beta-, one gamma-, one delta-, and one zeta subunit, where beta- and gamma- subunits are related to the large adaptor protein (AP) complex subunits, and delta- and zeta- subunits are related to the medium and small AP subunits, respectively. F-COPI forms a coatomer together with the B-COPI subcomplex, which assembles with a small GTPase, ADP-ribosylation factor 1 (ARF1), playing an important role in the formation of COPI complex-coated vesicles. COPI complex-coated vesicles function in the early secretory pathway mediating the retrograde transport from the Golgi to the ER, and intra-Golgi transport. 132
32467 341434 cd14830 Delta_COP_N delta subunit of the F-COPI complex, N-terminal domain. Delta subunit of the heterotetrameric F-COPI complex, which consists of one beta-, one gamma-, one delta-, and one zeta subunit, where beta- and gamma- subunits are related to the large adaptor protein (AP) complex subunits, and delta- and zeta- subunits are related to the medium and small AP subunits, respectively. F-COPI forms a coatomer together with the B-COPI subcomplex, which assembles with a small GTPase, ADP-ribosylation factor 1 (ARF1), playing an important role in the formation of COPI complex-coated vesicles. COPI complex-coated vesicles function in the early secretory pathway mediating the retrograde transport from the Golgi to the ER, and intra-Golgi transport. 130
32468 341435 cd14831 AP1_sigma AP-1 complex subunit sigma. AP-1 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-1 complex which consists of one large gamma-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-1 the coat protein is clathrin. AP-1 binds the phospholipid PI(4)P which plays a role in its localisation to the trans-Golgi network (TGN)/endosome. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 143
32469 341436 cd14832 AP4_sigma AP-4 complex subunit sigma. AP-4 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-4 complex which consists of one large epsilon-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-4 does not bind the coat protein clathrin; it is associated with nonclathrin coats. Its phospholipid binding partner is unknown and it is localized in the trans-Golgi network (TGN). The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 138
32470 341437 cd14833 AP2_sigma AP-2 complex subunit sigma. AP-2 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-2 complex which consists of one large alpha-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-2 the coat protein is clathrin. AP-2 binds the phospholipid PI(4,5)P2 which is important for its localisation to the plasma membrane. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 141
32471 341438 cd14834 AP3_sigma AP-3 complex subunit sigma. AP-3 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-3 complex which consists of one large delta-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-3 binds the coat protein clathrin and the phospholipid PI(3)P and it is localized in the endosome. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals. 146
32472 341439 cd14835 AP1_Mu_N AP-1 complex subunit mu N-terminal domain. AP-1 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-1 complex which consists of one large gamma-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-1 the coat protein is clathrin. AP-1 binds the phospholipid PI(4)P which plays a role in its localisation to the trans-Golgi network (TGN)/endosome. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal. 139
32473 341440 cd14836 AP2_Mu_N AP-2 complex subunit mu N-terminal domain. AP-2 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-2 complex which consists of one large alpha-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-2 the coat protein is clathrin. AP-2 binds the phospholipid PI(4,5)P2 which is important for its localisation to the plasma membrane. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal. 140
32474 341441 cd14837 AP3_Mu_N AP-3 complex subunit mu N-terminal domain. AP-3 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-3 complex which consists of one large delta-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-3 binds the coat protein clathrin and the phospholipid PI(3)P and it is localized in the endosome. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal. 139
32475 341442 cd14838 AP4_Mu_N AP-4 complex subunit mu N-terminal domain. AP-4 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-4 complex which consists of one large epsilon-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-4 does not bind the coat protein clathrin; it is associated with nonclathrin coats. Its phospholipid binding partner is unknown and it is localized in the trans-Golgi network (TGN). The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal. 137
32476 350617 cd14840 D-Ala-D-Ala_dipeptidase_Aad D-Ala-D-Ala dipeptidase (includes Lactobacillus plantarum Aad peptidase). D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. This subfamily includes Lactobacillus Aad peptidase and belongs in the MEROPS peptidase family M15, subfamily D. 158
32477 350618 cd14843 D-Ala-D-Ala_dipeptidase_like D-Ala-D-Ala dipeptidase, includes uncharacterized enzymes. This subfamily of D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) also includes several uncharacterized proteins. This is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. It belongs in the MEROPS peptidase family M15, subfamily D. 160
32478 350619 cd14844 Zn-DD-carboxypeptidase_like Proteins similar to the zinc-containing D-Ala-D-Ala dipeptidase. The zinc D-Ala-D-Ala carboxypeptidase (Streptomyces-type) (also known as D-alanyl-D-alanine hydrolase; D-alanyl-D-alanine-cleaving carboxypeptidase; DD-carboxypeptidase; DD-carboxypeptidase-transpeptidase; Zn2+ G peptidase; G enzyme; EC 3.4.17.14) is a zinc enzyme that belongs to the peptidase M15 subfamily A. The enzyme catalyzes carboxypeptidation but not transpeptidation reactions involved in bacterial cell wall metabolism. Its specificity with substrates of the type Xaa-Yaa-Zaa shows that the enzyme requires the substrate N-terminus to be blocked and C-terminus to be free, and Yaa and Zaa should be in the D-configuration. It is weakly inhibited by beta-lactams most likely caused by the enzyme active site geometry. 108
32479 350620 cd14845 L-Ala-D-Glu_peptidase_like L-Ala-D-Glu peptidase, also known as L-alanyl-D-glutamate endopeptidase. This L-Ala-D-Glu peptidase family includes L-alanyl-D-glutamate peptidase (bacteriophage T5) (also known as L-alanoyl-D-glutamate endopeptidase), and Ply118 and Ply500 L-Ala-D-Glu peptidase. Bacteriophage endolysin degrades the peptidoglycan of the bacterial host from within, leading to cell lysis and release of progeny virions. The bacteriophage endolysin Ply118 cleaves between L-Ala and D-Glu residues of Listeria cell wall peptidoglycan. This family belongs to the MEROPS peptidase M15 subfamily C. 126
32480 350621 cd14846 Peptidase_M15_like Uncharacterized family of the peptidase family M15, subfamily B. This family of uncharacterized proteins, similar to endolysin lys (Clavibacter phage CMP1) and VanYn peptidase, are zinc-binding enzymes that belong to the peptidase M15 subfamily B, involved in bacterial cell wall metabolism. 104
32481 350622 cd14847 DD-carboxypeptidase_like Uncharacterized proteins of the MEROPS peptidase family M15, subfamily B. This family of uncharacterized proteins similar to D-Ala-D-Ala carboxypeptidase pdcA (Myxococcus-type) are zinc-binding enzymes that belong to the peptidase M15 subfamily B. The enzyme D-Ala-D-Ala carboxypeptidase catalyzes carboxypeptidation reactions involved in bacterial cell wall metabolism. 162
32482 350623 cd14849 DD-dipeptidase_VanXYc D-Ala-D-Ala dipeptidase/D-Ala-D-Ala carboxypeptidase (VanXYc) and related proteins. VanXYc peptidase (also known as vanXY(C) peptidase, D-alanyl-D-alanine carboxypeptidase D,D-dipeptidase/D,D-carboxypeptidase, vancomycin resistance D,D-dipeptidase) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci. Some of the vancomycin resistance operons encode VanXY D,D-carboxypeptidase which hydrolyzes both dipeptide (D-Ala-D-Ala) and pentapeptide (UDP-MurNAc-L-Ala-D-Glu-L-Lys-D-Ala-D-Ala). It is a bifunctional enzyme that catalyzes D,D-peptidase and D,D-carboxypeptidase activities. VanXY has higher sequence similarity to VanY than to VanX and hydrolyzes D,D-dipeptides such as D-Ala-D-Ala, whereas VanY is inactive against this substrate; thus having a less restrictive active site to accommodate larger substrates such as UDP-MurNAc-pentapeptide[Ala]. This family belongs to the MEROPS family M15, subfamily B, and includes the D,D-dipeptidases VanXYg and VanXYe. 127
32483 350624 cd14852 LD-carboxypeptidase L,D-carboxypeptidase DacB and LdcB, and related proteins. This L,D-carboxypeptidase family includes LdcB LD-Carboxypeptidase from Streptococcus pneumoniae, Bacillus anthracis, and Bacillus subtilis, and L,D-carboxypeptidase DacB from Streptococcus pneumoniae and Lactococcus lactis. These enzymes are active against cell-wall-derived tetrapeptides and synthetic tetrapeptides lacking the sugar moiety but are inactive against tetrapeptides terminating in L-alanine. L,D-carboxypeptidase DacB plays a key role in the remodeling of S. pneumoniae peptidoglycan during cell division. It adopts a zinc-dependent carboxypeptidase fold and acts as an L,D-carboxypeptidase towards the tetrapeptide L-Ala-D-iGln-L-Lys-D-Ala of the peptidoglycan stem. This family also includes vanY D-Ala-D-Ala carboxypeptidase which is vancomycin-inducible and penicillin-resistant. VanY hydrolyzes depsipeptide- and D-alanyl-D-alanine-containing peptidoglycan precursors; it is insensitive to beta-lactams. All these enzymes belong to the MEROPS family M15 subfamily B. 162
32484 341443 cd14853 TRAPPC_longin-like Longin-like domains of Trafficking protein particle complex. Longin-like domains of a subfamily of core components of the trafficking protein particle complex (TRAPP), including TRAPPC2, TRAPPC4, TRAPPC1 and a TRAPPC2L, whose function is not known. TRAPP complexes are required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. 132
32485 341444 cd14854 TRAPPC2L Trafficking protein particle complex subunit 2-like. Trafficking protein particle complex subunit 2-like (TRAPPC2L) is related to TRAPPC2. Its function is not known, but there are indications that it is part of the TRAPP II complex, which is required for distinct tethering events at Golgi membranes. TRAPPC2 has been identified as a general component of transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. 135
32486 341445 cd14855 TRAPPC1_MUM2 Trafficking protein particle complex subunit 1. Trafficking protein particle complex subunit 1 (TRAPPC1), also known as MUM2 and BET5, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. 132
32487 341446 cd14856 TRAPPC4_synbindin Trafficking protein particle complex subunit 4. Trafficking protein particle complex subunit 4 (TRAPPC4), also known as synbindin or TRS23, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. 127
32488 410986 cd14858 TrmE_N N-terminal domain of TrmE, a tRNA modification GTPase. This family contains the N-terminal domain of TrmE (also known as MnmE, ThdF, MSS1), a guanine nucleotide-binding protein conserved in all three kingdoms of life. It is involved in the modification of uridine bases (U34) at the first anticodon (wobble) position of tRNAs decoding two-family box triplets. TrmE is a three-domain protein comprising an N-terminal alpha/beta domain, a helical domain, and the GTPase domain which is nested within the helical domain. The N-terminal domain induces dimerization for self-assembly and is topologically homologous to the tetrahydrofolate (THF)-binding domain of N,N-dimethylglycine oxidase (DMGO). However, the THF-binding site in DMGO is encoded on a single polypeptide, while homodimerization would be required to create a similar THF-binding site in TrmE. Dimerization also creates a second, symmetry-related THF-binding site. Biochemical and structural studies show that TrmE indeed binds formyl-THF. A cysteine residue, necessary for modification of U34, is located close to the C1-group donor 5-formyl-tetrahydrofolate, suggesting a direct role of TrmE in the modification analogous to DNA modification enzymes. 117
32489 275438 cd14859 PMEI_like pectin methylesterase inhibitor and related proteins. Pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) catalyzes the demethylesterification of homogalacturonans in the cell wall. Its activity is regulated by the proteinaceous PME inhibitor (PMEI) which inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context. CIF (cell-wall inhibitor of beta-fructosidase from tobacco) is structurally similar to PMEI and these members are also included in this model. Comparison of the CIF/INV1 structure with the complex between PMEI/PME suggests a common targeting mechanism in PMEI and CIF. However, CIF and PMEI use distinct surface areas to selectively inhibit very different enzymatic scaffolds. 140
32490 341482 cd14860 4HBD_NAD 4-hydroxybutyrate dehydrogenase, also called gamma-hydroxybutyrate dehydrogenase, catalyzes the reduction of succinic semialdehyde to 4-hydroxybutyrate in the succinate degradation pathway. 4-hydroxybutyrate dehydrogenase (4HBD) is an iron-containing (type III) NAD-dependent alcohol dehydrogenase. It plays a role in the succinate metabolism biochemical pathway. It catalyzes the reduction of succinic semialdehyde to 4-hydroxybutyrate in the succinate degradation pathway. This succinate degradation pathway is present in some bacteria which can use succinate as sole carbon source. 371
32491 341483 cd14861 Fe-ADH-like Iron-containing alcohol dehydrogenases-like. This family contains proteins similar to iron-containing alcohol dehydrogenase (Fe-ADH), most of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. 374
32492 341484 cd14862 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 375
32493 341485 cd14863 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 380
32494 341486 cd14864 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 376
32495 341487 cd14865 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 383
32496 341488 cd14866 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 384
32497 271246 cd14867 uS7_Eukaryote Eukaryota homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Eukaryotic RPS7 (also named RPS5) have variable N-terminal regions that affect the efficiency of initiation translation process by impacting small ribosomal subunit to function. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit which is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 185
32498 271247 cd14868 uS7_Mitochondria_Fungi Fungal Mitochondrial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Fungal and plant mitochondrial RPS7 shows less homology to the mammalian than to bacterial RPS7. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 151
32499 271248 cd14869 uS7_Bacteria Bacterial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Prokaryotic RPS7 is lacking the variable N-terminal region of eukaryotic RPS7. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit which is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 138
32500 271249 cd14870 uS7_Mitochondria_Mammalian Mammalian Mitochondrial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. MRPS7 shows more homology to bacterial RPS7 than mitochondrial proteins from plants and fungi. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The ribosomes present in mammalian mitochondria have more proteins and a lower percentage of ribosomal RNA than bacterial ribosomes. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 199
32501 271250 cd14871 uS7_Chloroplast Chloroplast homolog of Ribosomal Protein S7. Chloroplast RPS7 has both general and specific regulatory roles in the chloroplast translation process. uS7, also known as Ribosomal protein (RP)S7, is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The chloroplasts of plants and algae have bacterial ancestry, but they have adopted novel mechanisms in order to execute their roles within a eukaryotic cell. Chloroplast RPS7 is more homologous to bacterial RPS7 than other eukaryotic mitochondrial proteins. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The chloroplast translation regulation is more complex than in bacteria with additional RNA and chloroplast-unique proteins. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 146
32502 276839 cd14872 MYSc_Myo4 class IV myosin, motor domain. These myosins all possess a WW domain either N-terminal or C-terminal to their motor domain and a tail with a MyTH4 domain followed by a SH3 domain in some instances. The monomeric Acanthamoebas were the first identified members of this group and have been joined by Stramenopiles. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 644
32503 276840 cd14873 MYSc_Myo10 class X myosin, motor domain. Myosin X is an unconventional myosin motor that functions as a monomer. In mammalian cells, the motor is found to localize to filopodia. Myosin X walks towards the barbed ends of filaments and is thought to walk on bundles of actin, rather than single filaments, a unique behavior. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. C-terminal to the head domain are a variable number of IQ domains, 2 PH domains, a MyTH4 domain, and a FERM domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 651
32504 276841 cd14874 MYSc_Myo12 class XXXIII myosin, motor domain. Little is known about the XXXIII class of myosins. They are found predominately in nematodes. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 628
32505 276842 cd14875 MYSc_Myo13 class XIII myosin, motor domain. These myosins have an N-terminal motor domain, a light-chain binding domain, and a C-terminal GPA/Q-rich domain. There is little known about the function of this myosin class. Two of the earliest members identified in this class are green alga Acetabularia cliftonii, Aclmyo1 and Aclmyo2. They are striking with their short tail of Aclmyo1 of 18 residues and the maximum of 7 IQ motifs in Aclmyo2. It is thought that these myosins are involved in organelle transport and tip growth. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 664
32506 276843 cd14876 MYSc_Myo14 class XIV myosin, motor domain. These myosins localize to plasma membranes of the intracellular parasites and may be involved in the cell invasion process. Their known functions include: transporting phagosomes to the nucleus and perturbing the developmentally regulated elimination of the macronucleus during conjugation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. C-terminal to their motor domain these myosins have a MyTH4-FERM protein domain combination. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 649
32507 276844 cd14878 MYSc_Myo16 class XVI myosin, motor domain. These XVI type myosins are also known as Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter 3/NYAP3. Myo16 is thought to play a regulatory role in cell cycle progression and has been recently implicated in Schizophrenia. Class XVI myosins are characterized by an N-terminal ankyrin repeat domain and some with chitin synthase domains that arose independently from the ones in the class XVII fungal myosins. They bind protein phosphatase 1 catalytic subunits 1alpha/PPP1CA and 1gamma/PPP1CC. Human Myo16 interacts with ACOT9, ARHGAP26 and PIK3R2 and with components of the WAVE1 complex, CYFIP1 and NCKAP1. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 656
32508 276845 cd14879 MYSc_Myo17 class XVII myosin, motor domain. This fungal myosin which is also known as chitin synthase uses its motor domain to tether its vesicular cargo to peripheral actin. It works in opposition to dynein, contributing to the retention of Mcs1 vesicles at the site of cell growth and increasing vesicle fusion necessary for polarized growth. Class 17 myosins consist of an N-terminal myosin motor domain with Cyt-b5, chitin synthase 2, and DEK_C domains at its C-terminus. The chitin synthase region contains several transmembrane domains by which myosin 17 is thought to bind secretory vesicles. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 647
32509 276846 cd14880 MYSc_Myo19 class XIX myosin, motor domain. Monomeric myosin-XIX (Myo19) functions as an actin-based motor for mitochondrial movement in vertebrate cells. It contains a variable number of IQ domains. Human myo19 contains a motor domain, three IQ motifs, and a short tail. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 658
32510 276847 cd14881 MYSc_Myo20 class XX myosin, motor domain. These class 20 myosins are primarily insect myosins with such members as Drosophila, Daphnia, and mosquitoes. These myosins contain a single IQ motif in the neck region. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 633
32511 276848 cd14882 MYSc_Myo21 class XXI myosin, motor domain. The myosins here are comprised of insects. Leishmania class XXI myosins do not group with them. Myo21, unlike other myosin proteins, contains UBA-like protein domains and has no structural or functional relationship with the myosins present in other organisms possessing cilia or flagella. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. They have diverse tails with IQ, WW, PX, and Tub domains. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 642
32512 276849 cd14883 MYSc_Myo22 class XXII myosin, motor domain. These myosins possess an extended neck with multiple IQ motifs such as found in class V, VIII, XI, and XIII myosins. These myosins are defined by two tandem MyTH4 and FERM domains. The apicomplexan, but not diatom myosins contain 4-6 WD40 repeats near the end of the C-terminal tail which suggests a possible function of these myosins in signal transduction and transcriptional regulation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 661
32513 276850 cd14884 MYSc_Myo23 class XXIII myosin, motor domain. These myosins are predicted to have a neck region with 1-2 IQ motifs and a single MyTH4 domain in its C-terminal tail. The lack of a FERM domain here is odd since MyTH4 domains are usually found alongside FERM domains where they bind to microtubules. At any rate these Class XXIII myosins are still proposed to function in the apicomplexan microtubule cytoskeleton. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 685
32514 276851 cd14886 MYSc_Myo25 class XXV myosin, motor domain. These myosins are MyTH-FERM myosins that play a role in cell adhesion and filopodia formation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 650
32515 276852 cd14887 MYSc_Myo26 class XXVI myosin, motor domain. These MyTH-FERM myosins are thought to be related to the other myosins that have a MyTH4 domain such as class III, VII, IX, X, XV, XVI, XVII, XX, XXII, XXV, and XXXIV. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 725
32516 276853 cd14888 MYSc_Myo27 class XXVII myosin, motor domain. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 667
32517 276854 cd14889 MYSc_Myo28 class XXVIII myosin, motor domain. These myosins are found in fish, chicken, and mollusks. The tail regions of these class-XXVIII myosins consist of an IQ motif, a short coiled-coil region, and an SH2 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 659
32518 276855 cd14890 MYSc_Myo29 class XXIX myosin, motor domain. Class XXIX myosins are comprised of Stramenopiles and have very long tail domains consisting of three IQ motifs, short coiled-coil regions, up to 18 CBS domains, a PB1 domain, and a carboxy-terminal transmembrane domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 662
32519 276856 cd14891 MYSc_Myo30 class XXX myosin, motor domain. Myosins of class XXX are composed of an amino-terminal SH3-like domain, two IQ motifs, a coiled-coil region and a PX domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 645
32520 276857 cd14892 MYSc_Myo31 class XXXI myosin, motor domain. Class XXXI myosins have a very long neck region consisting of 17 IQ motifs and 2 tandem ANK repeats that are separated by a PH domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 656
32521 276858 cd14893 MYSc_Myo32 class XXXII myosin, motor domain. Class XXXII myosins do not contain any IQ motifs, but possess tandem MyTH4 and FERM domains. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 741
32522 276859 cd14894 MYSc_Myo33 class XXXIII myosin, motor domain. Class XXXIII myosins have a variable number of IQ domains and 2 tandem ANK repeats that are separated by a PH domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 871
32523 276860 cd14895 MYSc_Myo34 class XXXIV myosin, motor domain. Class XXXIV myosins are composed of an IQ motif, a short coiled-coil region, 5 tandem ANK repeats, and a carboxy-terminal FYVE domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 704
32524 276861 cd14896 MYSc_Myo35 class XXXV myosin, motor domain. This class of metazoan myosins contains 2 IQ motifs, 2 MyTH4 domains, a single FERM domain, and an SH3 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 644
32525 276862 cd14897 MYSc_Myo36 class XXXVI myosin, motor domain. This class of molluscan myosins contains a motor domain followed by a GlcAT-I (Beta1,3-glucuronyltransferase I) domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 635
32526 276863 cd14898 MYSc_Myo37 class XXXVII myosin, motor domain. The class XXXVII myosins are comprised of fungi. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 578
32527 276864 cd14899 MYSc_Myo38 class XXXVIII myosin. The class XXXVIII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 717
32528 276865 cd14900 MYSc_Myo39 class XXXIX myosin, motor domain. The class XXXIX myosins are found in Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 627
32529 276866 cd14901 MYSc_Myo40 class XL myosin, motor domain. The class XL myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 655
32530 276867 cd14902 MYSc_Myo41 class XLI myosin, motor domain. The class XLI myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 716
32531 276868 cd14903 MYSc_Myo42 class XLII myosin, motor domain. The class XLII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 658
32532 276869 cd14904 MYSc_Myo43 class XLIII myosin, motor domain. The class XLIII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 653
32533 276870 cd14905 MYSc_Myo44 class XLIV myosin, motor domain. There is little known about the function of the myosin XLIV class. Members here include cellular slime mold Polysphondylium and soil-living amoeba Dictyostelium. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 673
32534 276871 cd14906 MYSc_Myo45 class XLV myosin, motor domain. The class XLV myosins are comprised of slime molds Dictyostelium and Polysphondylium. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 715
32535 276872 cd14907 MYSc_Myo46 class XLVI myosin, motor domain. The class XLVI myosins are comprised of Alveolata. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 669
32536 276873 cd14908 MYSc_Myo47 class XLVII myosin, motor domain. The class XLVII myosins are comprised of Stramenopiles. Not much is known about this myosin class. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 682
32537 276874 cd14909 MYSc_Myh1_insects_crustaceans class II myosin heavy chain 1, motor domain. Myosin motor domain of type IIx skeletal muscle myosin heavy chain 1 (also called MYHSA1, MYHa, MyHC-2X/D, MGC133384) in insects and crustaceans. Myh1 is a type I skeletal muscle myosin that in humans is encoded by the MYH1 gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 666
32538 276875 cd14910 MYSc_Myh1_mammals class II myosin heavy chain 1, motor domain. Myosin motor domain of type IIx skeletal muscle myosin heavy chain 1 (also called MYHSA1, MYHa, MyHC-2X/D, MGC133384) in mammals. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 671
32539 276876 cd14911 MYSc_Myh2_insects_mollusks class II myosin heavy chain 2, motor domain. Myosin motor domain of type IIa skeletal muscle myosin heavy chain 2 (also called MYH2A, MYHSA2, MyHC-IIa, MYHas8, MyHC-2A) in insects and mollusks. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. Mutations in this gene result in inclusion body myopathy-3 and familial congenital myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 674
32540 276877 cd14912 MYSc_Myh2_mammals class II myosin heavy chain 2, motor domain. Myosin motor domain of type IIa skeletal muscle myosin heavy chain 2 (also called MYH2A, MYHSA2, MyHC-IIa, MYHas8, MyHC-2A) in mammals. Mutations in this gene result in inclusion body myopathy-3 and familial congenital myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 673
32541 276878 cd14913 MYSc_Myh3 class II myosin heavy chain 3, motor domain. Myosin motor domain of fetal skeletal muscle myosin heavy chain 3 (MYHC-EMB, MYHSE1, HEMHC, SMHCE) in tetrapods including mammals, lizards, and frogs. This gene is a member of the MYH family and encodes a protein with an IQ domain and a myosin head-like domain. Mutations in this gene have been associated with two congenital contracture (arthrogryposis) syndromes, Freeman-Sheldon syndrome and Sheldon-Hall syndrome. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 668
32542 276879 cd14915 MYSc_Myh4 class II myosin heavy chain 4, motor domain. Myosin motor domain of skeletal muscle myosin heavy chain 4 (also called MYH2B, MyHC-2B, MyHC-IIb). Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 671
32543 276880 cd14916 MYSc_Myh6 class II myosin heavy chain 6, motor domain. Myosin motor domain of alpha (or fast) cardiac muscle myosin heavy chain 6. Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 670
32544 276881 cd14917 MYSc_Myh7 class II myosin heavy chain 7, motor domain. Myosin motor domain of beta (or slow) type I cardiac muscle myosin heavy chain 7 (also called CMH1, MPD1, and CMD1S). Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. It is expressed predominantly in normal human ventricle and in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 668
32545 276882 cd14918 MYSc_Myh8 class II myosin heavy chain 8, motor domain. Myosin motor domain of perinatal skeletal muscle myosin heavy chain 8 (also called MyHC-peri, MyHC-pn). Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. A mutation in this gene results in trismus-pseudocamptodactyly syndrome. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 668
32546 276883 cd14919 MYSc_Myh9 class II myosin heavy chain 9, motor domain. Myosin motor domain of non-muscle myosin heavy chain 9 (also called NMMHCA, NMHC-II-A, MHA, FTNS, EPSTS, and DFNA17). Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 670
32547 276952 cd14920 MYSc_Myh10 class II myosin heavy chain 10, motor domain. Myosin motor domain of non-muscle myosin heavy chain 10 (also called NMMHCB). Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 673
32548 276885 cd14921 MYSc_Myh11 class II myosin heavy chain 11, motor domain. Myosin motor domain of smooth muscle myosin heavy chain 11 (also called SMMHC, SMHC). The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3' end overlaps with that of the latter. Inversion of the MYH11 locus is one of the most frequent chromosomal aberrations found in acute myeloid leukemia. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Mutations in MYH11 have been described in individuals with thoracic aortic aneurysms leading to acute aortic dissections with patent ductus arteriosus. MYH11 mutations are also thought to contribute to human colorectal cancer and are also associated with Peutz-Jeghers syndrome. The mutations found in human intestinal neoplasia result in unregulated proteins with constitutive motor activity, similar to the mutant myh11 zebrafish. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 673
32549 276887 cd14923 MYSc_Myh13 class II myosin heavy chain 13, motor domain. Myosin motor domain of skeletal muscle myosin heavy chain 13 (also called MyHC-eo) in mammals, chicken, and green anole. Myh13 is a myosin whose expression is restricted primarily to the extrinsic eye muscles which are specialized for function in eye movement. Class II myosins, also called conventional myosins, are the myosin type responsible for producing muscle contraction in muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 671
32550 276953 cd14927 MYSc_Myh7b class II myosin heavy chain 7b, motor domain. Myosin motor domain of cardiac muscle, beta myosin heavy chain 7b (also called KIAA1512, dJ756N5.1, MYH14, MHC14). MYH7B is a slow-twitch myosin. Mutations in this gene result in one form of autosomal dominant hearing impairment. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 676
32551 276892 cd14929 MYSc_Myh15_mammals class II myosin heavy chain 15, motor domain. Myosin motor domain of sarcomeric myosin heavy chain 15 in mammals (also called KIAA1000). MYH15 is a slow-twitch myosin. Myh15 is a ventricular myosin heavy chain. Myh15 is absent in embryonic and fetal muscles and is found in orbital layer of extraocular muscles at birth. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 662
32552 276893 cd14930 MYSc_Myh14_mammals class II myosin heavy chain 14 motor domain. Myosin motor domain of non-muscle myosin heavy chain 14 (also called FLJ13881, KIAA2034, MHC16, MYH17). Its members include mammals, chickens, and turtles. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Some of the data used for this classification were produced by the CyMoBase team at the Max-Planck-Institute for Biophysical Chemistry. The sequence names are composed of the species abbreviation followed by the protein abbreviation and optional protein classifier and variant designations. 670
32553 276895 cd14932 MYSc_Myh18 class II myosin heavy chain 18, motor domain. Myosin motor domain of muscle myosin heavy chain 18. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 676
32554 276896 cd14934 MYSc_Myh16 class II myosin heavy chain 16, motor domain. Myosin motor domain of myosin heavy chain 16 (also called MHC20, MYH16, and myh5). The gene encodes a sarcomeric myosin heavy chain expressed in nonhuman primate masticatory muscles and is an inactivated pseudogene in humans. This cd contains Myh16 in mammals. MYH16 fibres are intermediate between slow type 1 and fast type 2B fibres, but exert more force than any other fibre type examined. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Some of the data used for this classification were produced by the CyMoBase team at the Max-Planck-Institute for Biophysical Chemistry. The sequence names are composed of the species abbreviation followed by the protein abbreviation and optional protein classifier and variant designations. 659
32555 276897 cd14937 MYSc_Myo24A class XXIV A myosin, motor domain. These myosins have 1-2 IQ motifs in their neck and a coiled-coil region in their C-terminal tail. The function of the class XXIV myosins remains elusive. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 637
32556 276898 cd14938 MYSc_Myo24B class XXIV B myosin, motor domain. These myosins have 1-2 IQ motifs in their neck and a coiled-coil region in their C-terminal tail. The functions of these myosins remain elusive. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 713
32557 320093 cd14939 7tmD_STE2 fungal alpha-factor pheromone receptor STE2, member of the class D family of seven-transmembrane G protein-coupled receptors. This subfamily represents the alpha-factor pheromone receptor encoded by the STE2 gene, which is required for pheromone sensing and mating in haploid cells of the yeast Saccharomyces cerevisiae. The STE2-encoded seven-transmembrane domain receptor is a member of the class D GPCRs. Class D receptors are composed of two major subfamilies: Ste2 and Ste3. These two GPCRs (Ste2 and Ste3) sense the polypeptide mating pheromones, alpha-factor and a-factor, which activate G protein-coupled receptors on the surface of the opposite yeast-mating haploid-types (MATa and MAT-alpha), respectively. Activation of these receptors by pheromones leads to activation of the mitogen-activated protein kinase (MAPK) signal transduction cascades, G1 cell cycle arrest, and polarized cell growth in the direction of the partner cell (a process called shmooing), which ultimately induces cell-cell fusion and the formation of a diploid zygote. Like all GPCRs, these pheromone mating factor receptors possess the same basic architecture of seven-transmembrane (7TM) domains and share common signaling mechanisms; however, there is no significant sequence similarity either between Ste2 and Ste3, or between these two receptors and the other 7TM GPCRs. Thus, STE2 and STE3 represent phylogenetically distinct groups. 265
32558 320094 cd14940 7tmE_cAMP_R_Slime_mold slime mold cyclic AMP receptor, member of the class E family of seven-transmembrane G protein-coupled receptors. This family represents the class E of seven-transmembrane G-protein coupled receptors found in soil-living amoebas, commonly referred to as slime molds. The class E family includes cAMP receptors (cAR1-4) and cAMP receptor-like proteins (CrlA-C) from Dictyostelium discoideum, and their highly homologous cAMP receptors (TasA and TasB) from Polysphondylium pallidum. So far, four subtypes of cAMP receptors (cAR1-4) have been identified that play an essential role in the detection and transmission of the periodic extracellular cAMP waves that regulate chemotactic cell movement during Dictyostelium development, as unicellular amoebae aggregate into multicellular slugs and then differentiate into a sporocarp, a fruiting body with cells specialized for different functions. These four subtypes differ in their expression levels and patterns during development. cAR1 is a high-affinity receptor that is the first one to be expressed highly during early aggregation and continues to be expressed at low levels during later developmental stages. cAR1 detects extracellular cAMP and is coupled to G-alpha2 protein. Cells lacking cAR1 fail to aggregate, demonstrating that cAR1 is responsible for aggregation. During later aggregation the high-affinity cAR3 receptor is expressed at low levels. Nonetheless, cells lacking cAR3 do not show an obviously altered pattern of development and are still able to aggregate into fruiting bodies. In contrast, cAR2 and cAR4 are low affinity receptors expressed predominantly after aggregation in pre-stalk cells. cAR2 is essential for normal tip formation and deletion of the receptor arrests development at the mound stage. On the other hand, cAR4 regulates axial patterning and cellular differentiation, and deletion of the receptor results in defects during culmination. Furthermore, three cAMP receptor-like proteins (CrlA-C) were identified in Dictyostelium that show limited sequence similarity to the cAMP receptors. Of these, CrlA is thought to be required for normal cell growth and tip formation in developing aggregates. 256
32559 271344 cd14941 TRAPPC_bet3-like Bet3-like domains of TRAPP. Bet3-like domains of a subfamily of core components of the trafficking protein particle complex (TRAPP) include TRAPPC3, TRAPPC5, and TRAPPC6A. TRAPP complexes play a key role in the regulation of ER-to-Golgi and intra-Golgi transport by tethering the vesicle membrane to the target membrane. TRAPPs are large multimeric protein complexes which contain six core subunits that belong to two distinct structural families, the bet3-like family and the sedlin-like family. 152
32560 271345 cd14942 TRAPPC3_bet3 Bet3-TRAPPC3 subunit of the TRAPP complex. Bet3 (also known as TRAPPC3) subunit of the trafficking protein particle complex (TRAPP). Bet3 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi and intra-Golgi transport by tethering the vesicle membrane to the target membrane. TRAPPC3 has also been shown to be additionally important for membrane fusion during the formation of vesicular tubular clusters (VTC). In its core, Bet3 forms a hydrophobic channel that also contains a conserved acylation site. 155
32561 271346 cd14943 TRAPPC5_Trs31 Trs31 subunit of the TRAPP complex. TRS31 (also known as TRAPPC5) subunit of the trafficking protein particle complex (TRAPP). TRS31 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi and intra-Golgi transport by tethering the vesicle membrane to the target membrane. 158
32562 271347 cd14944 TRAPPC6A_Trs33 Trs33 subunit of the TRAPP complex. TRS33 (also known as TRAPPC6A) subunit of the trafficking protein particle complex (TRAPP). TRS33 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi and intra-Golgi transport by tethering the vesicle membrane to the target membrane. In mammals, mutations in TRAPPC6a cause mosaic loss of coat pigment. 167
32563 271253 cd14945 Myo5-like_CBD Cargo binding domain of myosin 5 and similar proteins. Class V myosins are well-studied unconventional myosins, represented by three paralogs (Myo5 a,b,c) in vertebrates and two (myo2 and myo4) in fungi and related to plant class XI myosins. Their C-terminal cargo binding domain is important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. MyoV-CBDs interact with several adaptor proteins that in turn interact with the cargo. 288
32564 380670 cd14946 Tet_JBP oxygenase domain of ten-eleven translocation (TET) enzymes, J-binding proteins (JBPs), and similar proteins. TET proteins are involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. Alterations in TET protein function have been linked to cancer, and TETs influence many cell differentiation processes. J binding protein (JBP) 1 and JBP2 are thymidine hydroxylases that catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and a thymidine hydroxylase domain. Members of this TET/JBP family of dioxygenases require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 264
32565 271343 cd14947 NBR1_like Functionally uncharacterized domain in neighbor of Brca1 Gene 1 and related proteins. NBR1 has been characterized as a specific late endosomal protein, which might play a role in receptor (RTK) trafficking. Specifically, NBR1 was shown to inhibit ligand-mediated receptor internalization from the cell surface. The region covered by this domain model may be involved in that function, as the C-terminus (which contains a UBA domain) was shown to be essential but not sufficient by itself. In an earlier yeast two-hybrid study, the region in mouse NBR1 covered by this domain has been shown to interact with CIB (calcium and integrin-binding protein) and FEZ1 (fasciculation and elongation protein zeta-1). Thus, NBR1 may play a role in cellular signalling pathways and possibly in neural development. 112
32566 271342 cd14948 BACON Bacteroidetes-Associated Carbohydrate-binding (putative) Often N-terminal (BACON) domain. The BACON domain is found in diverse domain architectures and associated with a wide variety of domains, including carbohydrate-active enzymes and proteases. It was named for its suggested function of carbohydrate binding; the latter was inferred from domain architectures, sequence conservation, and phyletic distribution. However, recent experimental data suggest that its primary function in Bacteroides ovatus endo-xyloglucanase BoGH5A is to distance the catalytic module from the cell surface and confer additional mobility to the catalytic domain for attack of the polysaccharide. No evidence for a direct role in carbohydrate binding could be found in that case. The large majority of BACON domains are found in Bacteroidetes. 83
32567 271340 cd14949 Asparaginase_2_like_3 Uncharacterized bacterial subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. 280
32568 271341 cd14950 Asparaginase_2_like_2 Uncharacterized archaebacterial subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. 251
32569 271321 cd14951 NHL-2_like NHL repeat domain of NHL repeat-containing protein 2 and similar proteins. NHL repeat-containing protein 2 (NHLRC2) and related bacterial proteins; members of this eukaryotic and bacterial family are uncharacterized, the NHL repeat domain is found C-terminally of a thioredoxin domain. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 334
32570 271322 cd14952 NHL_PKND_like NHL repeat domain of the protein kinase PknD. PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 247
32571 271323 cd14953 NHL_like_1 Uncharacterized NHL-repeat domain in bacterial proteins. This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 323
32572 271324 cd14954 NHL_TRIM71_like NHL repeat domain of the tripartite motif-containing protein 71 (TRIM71) and related proteins. The E3 ubiquitin-protein ligase TRIM71 (LIN-41) is a RING-finger domain containing protein that has been associated with a variety of activities. The NHL repeat domain appears responsible for targeting TRIM71 to mRNAs, and TRIM71 appears responsible for translational repression and mRNA decay. Together with BRAT, TRIM71 may be part of a family of mRNA repressors that regulate proliferation and differentiation. TRIM71 has been shown to negatively regulate stability of Lin28B, which inhibits the pre-let-7 miRNA precursor from maturing by recruiting the terminal uridylyltransferase TUT4. This family also contains the Caenorhabditis elegans NHL repeat containing 1 (NHL-1), a RING-finger-containing protein that was shown to interact with E2 ubiquitin conjugating enzymes in two-hybrid screens. Its domain architecture resembles that of the E3 ubiquitin protein ligases TRIM2, TRIM32, and TRIM71. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 285
32573 271325 cd14955 NHL_like_4 Uncharacterized NHL-repeat domain in bacterial and archaeal proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 279
32574 271326 cd14956 NHL_like_3 Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 274
32575 271327 cd14957 NHL_like_2 Uncharacterized NHL-repeat domain in bacterial and archaeal proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 280
32576 271328 cd14958 NHL_PAL_like Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL, EC 4.3.2.5). PAL catalyzes the N-dealkylation of peptidyl-alpha-hydroxyglycine, which results in an alpha-amidated peptide and glyoxylate. Amidation of the C-terminus is required for the activity of many peptide hormones and neuropeptides. The catalytic residues of PAL are located on several NHL-repeats. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 300
32577 271329 cd14959 NHL_brat_like NHL repeat domain of the Drosophila brain-tumor protein (brat) and similar proteins. Drosophila brain-tumor (brat) has been identified as a tumor suppressor that negatively regulates cell proliferation during development of the Drosophila larval brain. It appears to be recruited to the 3'-untranslated region of hunchback RNA and regulates its translation by forming a complex with Pumilio (Pum) and Nanos (Nos). The NHL domain of brat appears to be involved by interacting with the RNA-binding Puf repeats of Pumilio, a sequence-specific RNA binding protein. This family also contains the Caenorhabditis elegans homolog NCL-1. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 274
32578 271330 cd14960 NHL_TRIM2_like NHL repeat domain of the tripartite motif-containing protein 2 (TRIM2) and related proteins. The E3 ubiquitin-protein ligase TRIM2 is responsible for ubiquitinating the apoptosis-inducing Bcl-2-interacting mediator of cell death (Bim), when the latter is phosphorylated by p42/p44 MAPK. TRIM2 regulates the ubiquitination of neurofilament light subunit (NF-L); deficiencies in TRIM2 result in increased NF-L levels in axons and subsequent axonopathy. TRIM2 is also involved in regulating axon outgrowth during development; it contains RING and BBOX domains, and the NHL repeat domain is located at its C-terminus. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 274
32579 271331 cd14961 NHL_TRIM32_like NHL repeat domain of the tripartite motif-containing protein 32 (TRIM32) and related proteins. The E3 ubiquitin-protein ligase TRIM32 (HT2A) is widely expressed and is responsible for ubiquitinating a large variety of targets, including dysbindin (DTNBP1), NPHP7/Glis2, TAp73, and others. TRIM32 promotes dissociation of the plakoglobin-PI3K complex and reduces PI3K-Akt-FoxO signaling. Mutations in TRIM32 have been implicated in two diverse diseases: limb-girdle muscular dystrophy type 2H (LGMD2H) or sarcotubular myopathy (STM), and Bardet-Biedl syndrome type 11 (BBS11). The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 273
32580 271332 cd14962 NHL_like_6 Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 271
32581 271333 cd14963 NHL_like_5 Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. 268
32582 410628 cd14964 7tm_GPCRs seven-transmembrane G protein-coupled receptor superfamily. This hierarchical evolutionary model represents the seven-transmembrane (7TM) receptors, often referred to as G protein-coupled receptors (GPCRs), which transmit physiological signals from the outside of the cell to the inside via G proteins. GPCRs constitute the largest known superfamily of transmembrane receptors across the three kingdoms of life that respond to a wide variety of extracellular stimuli including peptides, lipids, neurotransmitters, amino acids, hormones, and sensory stimuli such as light, smell and taste. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. However, some 7TM receptors, such as the type 1 microbial rhodopsins, do not activate G proteins. Based on sequence similarity, GPCRs can be divided into six major classes: class A (the rhodopsin-like family), class B (the Methuselah-like, adhesion and secretin-like receptor family), class C (the metabotropic glutamate receptor family), class D (the fungal mating pheromone receptors), class E (the cAMP receptor family), and class F (the frizzled/smoothened receptor family). Nearly 800 human GPCR genes have been identified and are involved in essentially all major physiological processes. Approximately 40% of clinically marketed drugs mediate their effects through modulation of GPCR function for the treatment of a variety of human diseases including bacterial infections. 267
32583 410629 cd14965 7tm_Opsins_type1 type 1 opsins, member of the seven-transmembrane GPCR superfamily. This group represents the microbial rhodopsin family, also known as type 1 rhodopsins, which can function as light-dependent ion pumps, cation channels, and sensors. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. Members of the type I rhodopsin family include: light-driven inward chloride pump halorhodopsin (HR); light-driven outward proton pump bacteriorhodopsin (BR); light-gated cation channel channelrhodopsin (ChR); light-sensor activating transmembrane transducer proteins, sensory rhodopsin I and II (SRI and II); light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR); and other light-driven proton pumps such as blue-light-absorbing and green-light absorbing proteorhodopsins, among others. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins. 214
32584 320097 cd14966 7tmD_STE3 fungal a-factor pheromone receptor STE3, member of the class D family of seven-transmembrane G protein-coupled receptors. This subfamily represents the a-factor pheromone receptor encoded by the STE3 gene, which is required for pheromone sensing and mating in haploid cells of the yeast Saccharomyces cerevisiae. The STE3-encoded seven-transmembrane domain receptor is a member of the class D GPCRs. Class D receptors are composed of two major subfamilies: Ste2 and Ste3. These two GPCRs (Ste2 and Ste3) sense the polypeptide mating pheromones, alpha-factor and a-factor, which activate G protein-coupled receptors on the surface of the opposite yeast-mating haploid-types (MATa and MAT-alpha), respectively. Activation of these receptors by pheromones leads to activation of the mitogen-activated protein kinase (MAPK) signal transduction cascades, G1 cell cycle arrest, and polarized cell growth in the direction of the partner cell (a process called shmooing), which ultimately induces cell-cell fusion and the formation of a diploid zygote. Like all GPCRs, these pheromone mating factor receptors possess the same basic architecture of seven-transmembrane (7TM) domains and share common signaling mechanisms; however, there is no significant sequence similarity either between Ste2 and Ste3, or between these two receptors and the other 7TM GPCRs. Thus, STE2 and STE3 represent phylogenetically distinct groups. 259
32585 320098 cd14967 7tmA_amine_R-like amine receptors and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Amine receptors of the class A family of GPCRs include adrenoceptors, 5-HT (serotonin) receptors, muscarinic cholinergic receptors, dopamine receptors, histamine receptors, and trace amine receptors. The receptors of the amine subfamily are major therapeutic targets for the treatment of neurological disorders and psychiatric diseases. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 259
32586 341316 cd14968 7tmA_Adenosine_R adenosine receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. The adenosine receptors (or P1 receptors), a family of G protein-coupled purinergic receptors, bind adenosine as their endogenous ligand. There are four types of adenosine receptors in humans, designated as A1, A2A, A2B, and A3. Each type is encoded by a different gene and has distinct functions with some overlap. For example, both A1 and A2A receptors are involved in regulating myocardial oxygen consumption and coronary blood flow in the heart, while the A2A receptor also has a broad spectrum of anti-inflammatory effects in the body. These two receptors are also expressed in the brain, where they have important roles in the release of other neurotransmitters such as dopamine and glutamate, while the A2B and A3 receptors are found primarily in the periphery and play important roles in inflammation and immune responses. The A1 and A3 receptors preferentially interact with G proteins of the G(i/o) family, thereby lowering the intracellular cAMP levels, whereas the A2A and A2B receptors interact with G proteins of the G(s) family, activating adenylate cyclase to elevate cAMP levels. 285
32587 381741 cd14969 7tmA_Opsins_type2_animals type 2 opsins in animals, member of the class A family of seven-transmembrane G protein-coupled receptors. This rhodopsin family represents the type 2 opsins found in vertebrates and invertebrates except sponges. Type 2 opsins primarily function as G protein coupled receptors and are responsible for vision as well as for circadian rhythm and pigment regulation. In contrast, type 1 opsins such as bacteriorhodopsin and proteorhodopsin are found in both prokaryotic and eukaryotic microbes, functioning as light-gated ion channels, proton pumps, sensory receptors and in other unknown functions. Although these two opsin types share seven-transmembrane domain topology and a conserved lysine residue in the seventh helix, type 1 opsins do not activate G-proteins and are not evolutionarily related to type 2. Type 2 opsins can be classified into six distinct subfamilies including the vertebrate opsins/encephalopsins, the G(o) opsins, the G(s) opsins, the invertebrate G(q) opsins, the photoisomerases, and the neuropsins. 284
32588 320101 cd14970 7tmA_Opioid_R-like opioid receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes opioid receptors, somatostatin receptors, melanin-concentrating hormone receptors (MCHRs), and neuropeptides B/W receptors. Together they constitute the opioid receptor-like family, members of the class A G-protein coupled receptors. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and are involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. G protein-coupled somatostatin receptors (SSTRs), which display strong sequence similarity with opioid receptors, bind somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. MCHR binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Neuropeptides B/W receptors are primarily expressed in the CNS and stimulate cortisol secretion by activating adenylate cyclase- and phospholipase C-dependent signaling pathways. 282
32589 320102 cd14971 7tmA_Galanin_R-like galanin receptor and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes G-protein coupled galanin receptors, kisspeptin receptor and allatostatin-A receptor (AstA-R) in insects. These receptors, which are members of the class A of seven transmembrane GPCRs, share a high degree of sequence homology among themselves. The galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, eating disorders, and epilepsy, among many others. KiSS1-derived peptide receptor (also known as GPR54 or kisspeptin receptor) binds the peptide hormone kisspeptin (metastin), which is encoded by the metastasis suppressor gene (KISS1) expressed in various endocrine and reproductive tissues. AstA-R is a G-protein coupled receptor that binds allatostatin A. Three distinct types of allatostatin have been identified in insects and crustaceans: AstA, AstB, and AstC. They all inhibit the biosynthesis of juvenile hormone and exert an inhibitory influence on food intake. Therefore, allatostatins are considered as potential targets for insect control. 281
32590 341317 cd14972 7tmA_EDG-like endothelial differentiation gene family, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the endothelial differentiation gene (Edg) family of G-protein coupled receptors, melanocortin/ACTH receptors, and cannabinoid receptors as well as their closely related receptors. The Edg GPCRs bind blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). Melanocortin receptors bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. Two types of cannabinoid receptors, CB1 and CB2, are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system. 275
32591 320104 cd14973 7tmA_Mrgpr mas-related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. Also included in this family is Mas-related G-protein coupled receptor 1-like (MAS1L) which is only found in primates. The angiotensin-II metabolite angiotensin is an endogenous ligand for MAS1L. 272
32592 320105 cd14974 7tmA_Anaphylatoxin_R-like anaphylatoxin receptors and related G protein-coupled chemokine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily of G-protein coupled receptors includes anaphylatoxin receptors, formyl peptide receptors (FPR), prostaglandin D2 receptor 2, GPR1, and related chemokine receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors that bind anaphylatoxins. The members of this group include C3a and C5a receptors. The formyl peptide receptors (FPRs) are chemoattractant GPCRs that are involved in mediating immune responses to infection. They are expressed mainly on polymorphonuclear and mononuclear phagocytes and bind N-formyl-methionyl peptides (FMLP), which are derived from the mitochondrial proteins of ruptured host cells or invading pathogens. Chemokine receptor-like 1 (also known as chemerin receptor 23) is a GPCR for the chemoattractant adipokine chemerin, also known as retinoic acid receptor responder protein 2 (RARRES2), and for the omega-3 fatty acid derived molecule resolvin E1. Interaction with chemerin induces activation of the MAPK and PI3K signaling pathways leading to downstream functional effects, such as a decrease in immune responses, stimulation of adipogenesis, and angiogenesis. On the other hand, resolvin E1 negatively regulates the cytokine production in macrophages by reducing the activation of MAPK1/3 and NF-kB pathways. Prostaglandin D2 receptor, also known as CRTH2, is a chemoattractant G-protein coupled receptor expressed on T helper type 2 cells that binds prostaglandin D2 (PGD2). PGD2 functions as a mast cell-derived mediator to trigger asthmatic responses and also causes vasodilation. PGD2 exerts its inflammatory effects by binding to two G-protein coupled receptors, the D-type prostanoid receptor (DP) and PD2R2 (CRTH2). PD2R2 couples to the G protein G(i/o) type which leads to a reduction in intracellular cAMP levels and an increase in intracellular calcium. GPR1 is an orphan receptor that can be activated by the leukocyte chemoattractant chemerin, thereby suggesting that some of the anti-inflammatory actions of chemerin may be mediated through GPR1. 274
32593 320106 cd14975 7tmA_LTB4R leukotriene B4 receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the G(q)-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation. 278
32594 320107 cd14976 7tmA_RNL3R relaxin-3 like peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This G protein-coupled receptor subfamily is composed of the relaxin-3 like peptide receptors, RNL3R1 and RNL3R2, and similar proteins. The relaxin-3 like peptide family includes relaxin-1, -2, -3, as well as insulin-like (INSL) peptides 3 to 6. RNL3/relaxin-3 and INSL5 are the endogenous ligands for RNL3R1 and RNL3R2, respectively. RNL3R1, also called GPCR135 or RXFP3, is predominantly expressed in the brain and is implicated in stress, anxiety, feeding, and metabolism. Insulin-like peptide 5 (INSL5), the endogenous ligand for RNL3R2 (also called GPCR142 or RXFP4), plays a role in fat and glucose metabolism. INSL5 is highly expressed in human rectal and colon tissues. Both RNL3R1 and RNL3R2 signal through G(i) protein and inhibit adenylate cyclase, thereby inhibiting cAMP accumulation. RNL3R1 is shown to activate Erk1/2 signaling pathway. 290
32595 320108 cd14977 7tmA_ET_R-like endothelin receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily of G-protein coupled receptors includes endothelin receptors, bombesin receptor subtype 3 (BRS-3), gastrin-releasing peptide receptor (GRPR), neuromedin B receptor (NMB-R), endothelin B receptor-like 2 (ETBR-LP-2), and GPR37. The endothelin receptors and related proteins are members of the seven transmembrane rhodopsin-like G-protein coupled receptor family (class A GPCRs) which activate multiple effectors via different types of G protein. 292
32596 410630 cd14978 7tmA_FMRFamide_R-like FMRFamide (Phe-Met-Arg-Phe) receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Drosophila melanogaster G-protein coupled FMRFamide (Phe-Met-Arg-Phe-NH2) receptor DrmFMRFa-R and related invertebrate receptors, as well as the vertebrate proteins GPR139 and GPR142. DrmFMRFa-R binds with high affinity to FMRFamide and intrinsic FMRFamide-related peptides. FMRFamide is a neuropeptide from the family of FMRFamide-related peptides (FaRPs), which all contain a C-terminal RFamide (Arg-Phe-NH2) motif and have diverse functions in the central and peripheral nervous systems. FMRFamide is an important neuropeptide in many types of invertebrates such as insects, nematodes, molluscs, and worms. In invertebrates, the FMRFamide-related peptides are involved in the regulation of heart rate, blood pressure, gut motility, feeding behavior, and reproduction. On the other hand, in vertebrates such as mice, they play a role in the modulation of morphine-induced antinociception. Orphan receptors GPR139 and GPR142 are very closely related G protein-coupled receptors, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and mediates enhancement of glucose-stimulated insulin secretion, whereas GPR139 is mostly expressed in the brain and is suggested to play a role in the control of locomotor activity. Tryptophan and phenylalanine have been identified as putative endogenous ligands of GPR139. 299
32597 320110 cd14979 7tmA_NTSR-like neurotensin receptors and related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the neurotensin receptors and related G-protein coupled receptors, including neuromedin U receptors, growth hormone secretagogue receptor, motilin receptor, the putative GPR39 and the capa receptors from insects. These receptors all bind peptide hormones with diverse physiological effects. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 300
32598 320111 cd14980 7tmA_Glycoprotein_LRR_R-like glycoprotein hormone receptors and leucine-rich repeats containing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the glycoprotein hormone receptors (GPHRs), vertebrate receptors containing 17 leucine-rich repeats (LGR4-6), and the relaxin family peptide receptors (also known as LGR7 and LGR8). They are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone receptor family contains receptors for the pituitary hormones, thyrotropin (thyroid-stimulating hormone receptor), follitropin (follicle-stimulating hormone receptor), and lutropin (luteinizing hormone receptor). Glycoprotein hormone receptors couple primarily to the G(s)-protein and promote cAMP production, but also to the G(i)- or G(q)-protein. Two orphan GPCRs, LGR7 and LGR8, have been recently identified as receptors for the relaxin peptide hormones. 286
32599 320112 cd14981 7tmA_Prostanoid_R G protein-coupled receptors for prostanoids, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters. 288
32600 341318 cd14982 7tmA_purinoceptor-like purinoceptor and its related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Members of this subfamily include lysophosphatidic acid receptor, P2 purinoceptor, protease-activated receptor, platelet-activating factor receptor, Epstein-Barr virus induced gene 2, proton-sensing G protein-coupled receptors, GPR35, and GPR55, among others. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 283
32601 320114 cd14983 7tmA_FFAR free fatty acid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the free fatty acid receptors (FFARs) which bind free fatty acids (FFAs). They belong to the class A G-protein coupled receptors and are composed of three members, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFAR1 is a receptor for medium- and long-chain FFAs, whereas FFAR2 and FFAR3 are receptors for short chain FFAs (SCFAs), which have different ligand affinities. FFAR1 directly mediates FFA stimulation of glucose-stimulated insulin secretion and also indirectly increases insulin secretion by enhancing the release of incretin. FFAR2 activation by SCFA suppresses adipose insulin signaling, which leads to the inhibition of fat accumulation in adipose tissue. FFAR3 is expressed in intestinal L cells, which produce glucagon-like peptide 1 (GLP-1) and peptide YY (PYY), suggesting that this receptor may be involved in energy homeostasis. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity. 278
32602 341319 cd14984 7tmA_Chemokine_R classical and atypical chemokine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. In addition to these classical chemokine receptors, there exists a subfamily of atypical chemokine receptors (ACKRs) that are unable to couple to G-proteins and, instead, they preferentially mediate beta-arrestin dependent processes, such as receptor internalization, after ligand binding. The classical chemokine receptors contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling. However, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors. 278
32603 341320 cd14985 7tmA_Angiotensin_R-like angiotensin receptor family and its related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the angiotensin receptors, the bradykinin receptors, apelin receptor as well as putative G-protein coupled receptors (GPR15 and GPR25). Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors. Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2 receptor, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Bradykinins (BK) are pro-inflammatory peptides that mediate various vascular and pain responses to tissue injury through their B1 and B2 receptors. Apelin (APJ) receptor binds the endogenous peptide ligands, apelin and Toddler/Elabela. Apelin is an adipocyte-derived hormone that is ubiquitously expressed throughout the human body, and Toddler/Elabela is a short secretory peptide that is required for normal cardiac development in zebrafish. Activation of APJ receptor plays key roles in diverse physiological processes including vasoconstriction and vasodilation, cardiac muscle contractility, angiogenesis, and regulation of water balance and food intake. Orphan receptors, GPR15 and GPR25, share strong sequence homology to the angiotensin II type AT1 and AT2 receptors. 284
32604 320117 cd14986 7tmA_Vasopressin-like vasopressin receptors and their related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Members of this group form a subfamily within the class A G-protein coupled receptors (GPCRs), which includes the vasopressin and oxytocin receptors, the gonadotropin-releasing hormone receptors (GnRHRs), the neuropeptide S receptor (NPSR), and orphan GPR150. These receptors share significant sequence homology with each other, suggesting that they have a common evolutionary origin. Vasopressin, also known as arginine vasopressin or anti-diuretic hormone, is a neuropeptide synthesized in the hypothalamus. The actions of vasopressin are mediated by the interaction of this hormone with three tissue-specific subtypes: V1AR, V1BR, and V2R. Although vasopressin differs from oxytocin by only two amino acids, they have divergent physiological functions. Vasopressin is involved in regulating osmotic and cardiovascular homeostasis, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. Neuropeptide S (NPS) promotes arousal and anxiolytic-like effects by activating its cognate receptor NPSR. NPSR has also been associated with asthma and allergy. GPR150 is an orphan receptor closely related to the oxytocin and vasopressin receptors. 295
32605 320118 cd14987 7tmA_ACKR3_CXCR7 CXC chemokine receptor 7, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR3, also known as CXCR7, is an atypical chemokine receptor for CXCL12 and CXCL11. Unlike the classical chemokine receptors, ACKR3 contains a DRYLSIT-sequence instead of the conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling. Thus, ACKR3 does not activate classical GPCR signaling, but instead induces beta-arrestin recruitment, leading to ligand internalization and MAP-kinase activation. It acts as a scavenger for CXCL12 and, to a lesser degree, for CXCL11. ACKR3 is highly expressed by blood vascular endothelial cells in brain, in numerous embryonic and neonatal tissues, in inflamed tissues and in a variety of cancers such as lymphomas, sarcomas, prostate and breast cancers, and gliomas. Five receptors have been identified for the ACKR family, including CC-Chemokine Receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, DARC, and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors. 282
32606 320119 cd14988 7tmA_GPR182 G protein-coupled receptor 182, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR182 is an orphan G-protein coupled receptor that belongs to the class A of seven-transmembrane GPCR superfamily. When GPR182 gene was first cloned, it was proposed to encode an adrenomedullin receptor. However when the corresponding protein was expressed, it was found not to respond to adrenomedullin (ADM). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 278
32607 320120 cd14989 7tmA_GPER1 G protein-coupled estrogen receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled estrogen receptor 1 (GPER1), also known as the G-protein coupled receptor 30 (GPR30), is a high affinity receptor for estrogen. This receptor is a member of the class A of seven-transmembrane GPCRs. Estrogen binding results in intracellular calcium mobilization and synthesis of phosphatidylinositol (3,4,5)-trisphosphate in the nucleus. GPR30 plays an important role in development of tamoxifen resistance in breast cancer cells. The distribution of GPR30 is well established in the rodent, with high expression observed in the hypothalamus, pituitary gland, adrenal medulla, kidney medulla and developing follicles of the ovary. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 276
32608 320121 cd14990 7tmA_GPR146 G protein-coupled receptor 146, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR146 is an orphan G-protein coupled receptor that belongs to the class A of seven-transmembrane GPCR superfamily. The endogenous ligand for GPR146 is not known. It has been suggested that GPR146 may be a part of the C-peptide signaling complex. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 280
32609 320122 cd14991 7tmA_HCAR-like hydroxycarboxylic acid receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the hydroxycarboxylic acid receptors (HCARs) as well as their closely related receptors, GPR31 and oxoeicosanoid receptor 1 (OXER1). HCARs are members of the class A family of G-protein coupled receptors (GPCRs). The HCAR subfamily contains three receptor subtypes: HCAR1, HCAR2, and HCAR3. The endogenous ligand of HCAR1 (also known as lactate receptor 1, GPR104, or GPR81) is L-lactic acid. The endogenous ligands of HCAR2 (also known as niacin receptor 1, GPR109A, nicotinic acid receptor) and HCAR3 (also known as niacin receptor 2, or GPR109B) are 3-hydroxybutyric acid and 3-hydroxyoctanoic acid, respectively. All three HCA receptors are expressed in adipocytes, and are coupled to G(i)-proteins mediating anti-lipolytic effects in fat cells. OXER1 is a receptor for eicosanoids and polyunsaturated fatty acids such as 5-oxo-6E,8Z,11Z,14Z-eicosatetraenoic acid (5-OXO-ETE), 5(S)-hydroperoxy-6E,8Z,11Z,14Z-eicosatetraenoic acid (5(S)-HPETE) and arachidonic acid, whereas GPR31 is a high-affinity receptor for 12-(S)-hydroxy-5,8,10,14-eicosatetraenoic acid (12-S-HETE). 280
32610 320123 cd14992 7tmA_TACR_family tachykinin receptor and closely related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes G-protein coupled receptors for a variety of neuropeptides of the tachykinin (TK) family as well as closely related receptors. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty. 291
32611 320124 cd14993 7tmA_CCKR-like cholecystokinin receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents four G-protein coupled receptors that are members of the RFamide receptor family, including cholecystokinin receptors (CCK-AR and CCK-BR), orexin receptors (OXR), neuropeptide FF receptors (NPFFR), and pyroglutamylated RFamide peptide receptor (QRFPR). These RFamide receptors are activated by their endogenous peptide ligands that share a common C-terminal arginine (R) and an amidated phenylalanine (F) motif. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors. Orexins (OXs; also referred to as hypocretins) are neuropeptide hormones that regulate the sleep-wake cycle and potently influence homeostatic systems regulating appetite and feeding behavior or modulating emotional responses such as anxiety or panic. OXs are synthesized as prepro-orexin (PPO) in the hypothalamus and then proteolytically cleaved into two isoforms: orexin-A (OX-A) and orexin-B (OX-B). OX-A is a 33 amino-acid peptide with N-terminal pyroglutamyl residue and two intramolecular disulfide bonds, whereas OX-B is a 28 amino-acid linear peptide with no disulfide bonds. OX-A binds orexin receptor 1 (OX1R) with high-affinity, but also binds with somewhat low-affinity to OX2R, and signals primarily through Gq coupling, whereas OX-B shows a strong preference for the orexin receptor 2 (OX2R) and signals through Gq or Gi/o coupling. The 26RFa, also known as QRFP (Pyroglutamylated RFamide peptide), is a 26-amino acid residue peptide that exerts similar orexigenic activity including the regulation of feeding behavior in mammals. It is the ligand for G-protein coupled receptor 103 (GPR103), which is predominantly expressed in paraventricular (PVN) and ventromedial (VMH) nuclei of the hypothalamus. GPR103 shares significant protein sequence homology with orexin receptors (OX1R and OX2R), which have recently been shown to produce a neuroprotective effect in Alzheimer's disease by forming a functional heterodimer with GPR103. Neuropeptide FF (NPFF) is a mammalian octapeptide that has been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of NPFF are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. 296
32612 320125 cd14994 7tmA_GPR141 orphan G protein-coupled receptor 141, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 141 of unknown function. Several ESTs for GPR141 were found in marrow and cancer cells. GPR141 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 275
32613 320126 cd14995 7tmA_TRH-R thyrotropin-releasing hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. TRH-R is a member of the class A rhodopsin-like G protein-coupled receptors, which binds the tripeptide thyrotropin releasing hormone. The TRH-R activates phosphoinositide metabolism through a pertussis-toxin-insensitive G-protein, the G(q)/G(11) class. TRH stimulates the synthesis and release of thyroid-stimulating hormone in the anterior pituitary. TRH is produced in many other tissues, especially within the nervous system, where it appears to act as a neurotransmitter/neuromodulator. It also stimulates the synthesis and release of prolactin. In the CNS, TRH stimulates a number of behavioral and pharmacological actions, including increased turnover of catecholamines in the nucleus accumbens. There are two thyrotropin-releasing hormone receptors in some mammals, thyrotropin-releasing hormone receptor 1 (TRH1) which has been found in a number of species including rat, mouse, and human and thyrotropin-releasing hormone receptor 2 (TRH2) which has only been found in rodents. These TRH receptors are found in high levels in the anterior pituitary, and are also found in the retina and in certain areas of the brain. 269
32614 320127 cd14996 7tmA_GPR82 orphan G protein-coupled receptor 82, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 82 of unknown function. GPR82 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 305
32615 320128 cd14997 7tmA_ETH-R ecdysis-triggering hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the ecdysis-triggering hormone receptors found in insects, which are members of the class A family of seven-transmembrane G-protein coupled receptors. Ecdysis-triggering hormones are vital regulatory signals that govern the stereotypic physiological sequence leading to cuticle shedding in insects. Thus, the ETH signaling system has been a target for the design of more sophisticated insect-selective pest control strategies. Two subtypes of ecdysis-triggering hormone receptor were identified in Drosophila melanogaster. Blood-borne ecdysis-triggering hormone (ETH) activates the behavioral sequence through direct actions on the central nervous system. In insects, ecdysis is thought to be controlled by the interaction between peptide hormones; in particular between ecdysis-triggering hormone (ETH) from the periphery and eclosion hormone (EH) and crustacean cardioactive peptide (CCAP) from the central nervous system. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 294
32616 320129 cd14998 7tmA_GPR153_GPR162-like orphan G protein-coupled receptors 153 and 162, member of the class A family of seven-transmembrane G protein-coupled receptors. This group contains the G-protein coupled receptor 153 (GPR153), GPR162, and similar proteins. These are orphan GPCRs with unknown endogenous ligand and function. GPR153 and GPR162 are widely expressed in the central nervous system (CNS) and share a common evolutionary ancestor due to a gene duplication event. Although categorized as members of the rhodopsin-like class A GPCRs, both GPR162 and GPR153 contain an HRM-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors which is important for efficient G protein-coupled signal transduction. Moreover, the LPxF motif, a variant of NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in both GPR162 and GPR153. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 301
32617 320130 cd14999 7tmA_UII-R urotensin-II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The urotensin-II receptor (UII-R, also known as the hypocretin receptor) is a member of the class A rhodopsin-like G-protein coupled receptors, which binds the peptide hormone urotensin-II. Urotensin II (UII) is a vasoactive somatostatin-like or cortistatin-like peptide hormone. However, despite the apparent structural similarity to these peptide hormones, they are not homologous to UII. Urotensin II was first identified in fish spinal cord, but later found in humans and other mammals. In fish, UII is secreted at the back part of the spinal cord, in a neurosecretory centre called uroneurapophysa, and is involved in the regulation of the renal and cardiovascular systems. In mammals, urotensin II is the most potent mammalian vasoconstrictor identified to date and causes contraction of arterial blood vessels, including the thoracic aorta. The urotensin II receptor is a rhodopsin-like G-protein coupled receptor, which binds urotensin-II. The receptor was previously known as GPR14, or sensory epithelial neuropeptide-like receptor (SENR). The UII receptor is expressed in the CNS (cerebellum and spinal cord), skeletal muscle, pancreas, heart, endothelium and vascular smooth muscle. It is involved in the pathophysiological control of cardiovascular function and may also influence CNS and endocrine functions. Binding of urotensin II to the receptor leads to activation of phospholipase C, through coupling to G(q/11) family proteins. The resulting increase in intracellular calcium may cause the contraction of vascular smooth muscle. 282
32618 320131 cd15000 7tmA_BNGR-A34-like putative neuropeptide receptor BNGR-A34 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes putative neuropeptide receptor BNGR-A34 found in silkworm and its closely related proteins from invertebrates. They are members of the class A rhodopsin-like GPCRs, which represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 285
32619 320132 cd15001 7tmA_GPRnna14-like GPRnna14 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the orphan G-protein coupled receptor GPRnna14 found in body louse (Pediculus humanus humanus) as well as its closely related proteins of unknown function. These receptors are members of the class A rhodopsin-like G-protein coupled receptors. As an obligatory parasite of humans, the body louse is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. GPRnna14 shares significant sequence similarity with the members of the neurotensin receptor family. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 266
32620 320133 cd15002 7tmA_GPR151 G protein-coupled receptor 151, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 151 (GPR151) is an orphan receptor of unknown function. Its expression is conserved in habenular axonal projections of vertebrates and may be a promising novel target for psychiatric drug development. GPR151 shows high sequence similarity with galanin receptors (GALR). GPR151 is a member of the class A rhodopsin-like GPCRs, which represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 280
32621 320134 cd15005 7tmA_SREB-like super conserved receptor expressed in brain and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are very highly conserved G protein-coupled receptors throughout vertebrate evolution; however, no endogenous ligands have yet been identified. SREB2 is highly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which suggest a potential link between SREB2 and schizophrenia. All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 329
32622 320135 cd15006 7tmA_GPR176 orphan G protein-coupled receptor 176, member of the rhodopsin-like class A GPCR family. GPR176 is a putative G protein-coupled receptor that belongs to the class A GPCR superfamily; no endogenous ligand for GPR176 has yet been identified. The class A rhodopsin-like GPCRs represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 289
32623 320136 cd15007 7tmA_GPR75 G protein-coupled receptor 75, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 75 (GPR75) is an atypical chemokine receptor that is expressed by mouse and human islets. Although GPR75 shows low sequence homology to C-C chemokine receptors, chemokine (C-C motif) ligand 5 (CCL5) has been shown to act as an endogenous ligand for GPR75. CCL5 plays a key role in recruiting lymphocytes to sites of inflammation and infection through promiscuous binding to the C-C chemokine G-protein-coupled receptors. Although categorized as a member of the rhodopsin-like class A GPCRs, GPR75 contains an HRL-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors and important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. GPR75 is coupled to the G-protein G(q), which elevates intracellular calcium. 261
32624 320137 cd15008 7tmA_GPR19 G protein-coupled receptor 19, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 19 is an orphan receptor that is expressed predominantly in neuronal cells during mouse embryogenesis. Its mRNA is found frequently over-expressed in patients with small cell lung cancer. GPR19 shares a significant amino acid sequence identity with the D2 dopamine and neuropeptide Y families of receptors. Human GPR19 gene, intronless in the coding region, also has a distribution in brain overlapping that of the D2 dopamine receptor gene, and is located on chromosome 12. GPR19 is a member of the class A family of GPCRs, which represents a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 275
32625 410631 cd15010 7tmA_ACKR1_DARC Duffy antigen receptor for chemokines, member of the class A family of seven-transmembrane G protein-coupled receptors. Atypical chemokine receptor 1 (ACKR1), also known as DARC (Duffy antigen receptor for chemokines) or Fy glycoprotein (GpFy), was originally identified on erythrocytes. ACKR1 is also ubiquitously expressed by endothelial cells of venules and is highly promiscuous among all chemokine receptors. It binds many proinflammatory chemokines from both the CC and CXC subfamilies, including CCL2, CCL5, CCL7, CCL11, CXCL1, CXCL2, CXCL3, and CXCL5. Erythrocyte ACKR1 is thought to act as a chemokine sink, limiting the levels of circulating chemokines, thereby controlling leukocyte activation. ACKR1-deficient erythrocytes are shown to confer resistance to the malarial parasite, Plasmodium vivax. On the other hand, ACKR1-expressing endothelial cells can internalize chemokines. ACKR1-internalized chemokines can be moved intact across the endothelium and promote neutrophil transmigration. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-Chemokine Receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, DARC, and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors. 257
32626 320139 cd15011 7tmA_GPR149 G protein-coupled receptor 149, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR149 is predominantly expressed in the ovary and is present at low levels in the brain and the digestive tract (stomach and small intestine). GPR149-null mice are viable and have normal maturation of the ovarian follicle, but show enhanced fertility and ovulation. Additionally, the null mice showed increased expression levels of growth differentiation factor 9 (Gdf9) in oocytes, and upregulated expression of cyclin D2, a downstream target of FSH (follicle-stimulating hormone) receptor signaling pathways that promotes granulosa cell proliferation. GPR149 is an orphan receptor with no known endogenous ligand as yet identified. Although categorized as a member of the class A GPCRs, GPR149 lacks the first two charged amino acids of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors which is important for efficient G protein-coupled signal transduction. Moreover, the transmembrane domains and carboxyl terminus of GPR149 show low similarities to other GPCRs. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 256
32627 320140 cd15012 7tmA_Trissin_R trissin receptor and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the Drosophila melanogaster trissin receptor and closely related invertebrate proteins which are members of the class A family of seven-transmembrane G-protein coupled receptors. The cysteine-rich trissin has been shown to be an endogenous ligand for the orphan CG34381 in Drosophila melanogaster. Trissin is a peptide composed of 28 amino acids with three intrachain disulfide bonds with no significant structural similarities to known endogenous peptides. Cysteine-rich peptides are known to have antimicrobial or toxicant activities, although frequently their mechanism of action is poorly understood. Since the expression of trissin and its receptor is reported to predominantly localize to the brain and thoracicoabdominal ganglion, trissin is predicted to behave as a neuropeptide. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 277
32628 320141 cd15013 7tm_TAS2R4 mammalian taste receptor 2, subtype 4, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 4, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 286
32629 320142 cd15014 7tm_TAS2R40 mammalian taste receptor 2, subtype 40, member of the seven-transmembrane G-protein coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 40, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 290
32630 320143 cd15015 7tm_TAS2R39 mammalian taste receptor 2, subtype 39, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 39, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 289
32631 320144 cd15016 7tm_TAS2R1 mammalian taste receptor 2, subtype 1, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 1, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 283
32632 320145 cd15017 7tm_TAS2R16 mammalian taste receptor 2, subtype 16, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 16, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 285
32633 320146 cd15018 7tm_TAS2R41-like mammalian taste receptor 2, subtype 41, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 41, which functions as a bitter taste receptor. Also included is the closely related TAS2R60. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 290
32634 320147 cd15019 7tm_TAS2R14-like mammalian taste receptor 2, subtype 14, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 14, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 290
32635 320148 cd15020 7tm_TAS2R3 mammalian taste receptor 2, subtype 3, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 3, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 290
32636 320149 cd15021 7tm_TAS2R10 mammalian taste receptor 2, subtype 10, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 10, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 285
32637 320150 cd15022 7tm_TAS2R8 mammalian taste receptor 2, subtype 8, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 8, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 292
32638 320151 cd15023 7tm_TAS2R7-like mammalian taste receptor 2, subtypes 7 and 9, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtypes 7 and 9, which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 291
32639 320152 cd15024 7tm_TAS2R42 mammalian taste receptor 2, subtype 42, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 42, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 288
32640 320153 cd15025 7tm_TAS2R38 mammalian taste receptor 2, subtype 38, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 38, which functions as a bitter taste receptor. Genetic variants of human TAS2R38 influence the ability to taste synthetic compounds 6-n-propylthiouracil (PROP) and phenylthiocarbamide (PTC). Thus, sensitivity to the bitter taste of PROP is often used as a marker for individual differences in taste perception that can affect food preferences and intake. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 293
32641 320154 cd15026 7tm_TAS2R13 mammalian taste receptor 2, subtype 13, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 13, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 287
32642 320155 cd15027 7tm_TAS2R43-like mammalian taste receptor 2, subtype 43, and related proteins, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 43, which functions as a bitter taste receptor. Also included are the closely related taste receptors TAS2R19, TAS2R20, TAS2R30, TAS2R31, TAS2R45 and TAS2R50. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 292
32643 320156 cd15028 7tm_Opsin-1_euk proton pumping rhodopsins in fungi and algae, member of the seven-transmembrane GPCR superfamily. This subgroup represents uncharacterized proton pumping rhodopsins found in fungi and algae. They belong to the microbial rhodopsin family, also known as type I rhodopsins, consisting of the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 231
32644 320157 cd15029 7tm_SRI_SRII light-sensor activating transmembrane transducer protein sensory rhodopsin I and II; member of the seven-transmembrane GPCR superfamily. This subgroup includes the light-sensor activating transmembrane transducer proteins, sensory rhodopsin I (SRI) and II (SRII, also called phoborhodopsin). SRI and SRII are responsible for positive (attractive) and negative (repellent) phototaxis in halobacteria, respectively, thereby controlling the cell's directional movement in response to changes in light intensity by swimming either towards or away from the light. Both sensory rhodopsins belong to the family of microbial rhodopsins, also known as type I rhodopsins, consisting of the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 214
32645 320158 cd15030 7tmF_SMO_homolog class F smoothened family membrane region, a homolog of frizzled receptors. This group represents smoothened (SMO), a transmembrane G protein-coupled receptor that acts as the transducer of the hedgehog (HH) signaling pathway. SMO is activated by the hedgehog (HH) family of proteins acting on the 12-transmembrane domain receptor patched (PTCH), which constitutively inhibits SMO. Thus, in the absence of HH proteins, PTCH inhibits SMO signaling. On the other hand, binding of HH to the PTCH receptor activates its internalization and degradation, thereby releasing the PTCH inhibition of SMO. This allows SMO to trigger intracellular signaling and the subsequent activation of the Gli family of zinc finger transcriptional factors and induction of HH target gene expression (PTCH, Gli1, cyclin, Bcl-2, etc). SMO is closely related to the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate family of G-protein coupled receptors. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The WNT and HH signaling pathways play critical roles in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 331
32646 320159 cd15031 7tmF_FZD3_insect class F insect frizzled subfamily 3, member of 7-transmembrane G protein-coupled receptors. This group represents subfamily 3 of the frizzled (FZD) family of seven transmembrane-spanning G protein-coupled proteins that is found in insects such as Drosophila melanogaster. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 311
32647 320160 cd15032 7tmF_FZD6 class F frizzled subfamily 6, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 6 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 321
32648 320161 cd15033 7tmF_FZD3 class F frizzled subfamily 3, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 3 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 321
32649 320162 cd15034 7tmF_FZD1_2_7-like class F frizzled subfamilies 1, 2 and 7; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 1, 2 and 7 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors, as well as their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 322
32650 320163 cd15035 7tmF_FZD5_FZD8-like class F frizzled subfamilies 5, 8 and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 5 and 8 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, as well as their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 307
32651 320164 cd15036 7tmF_FZD9 class F frizzled subfamily 9, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 9 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 320
32652 320165 cd15037 7tmF_FZD10 class F frizzled subfamily 10, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 10 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 320
32653 320166 cd15038 7tmF_FZD4 class F frizzled subfamily 4, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 4 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 304
32654 410632 cd15039 7tmB3_Methuselah-like Methuselah-like subfamily B3, member of the class B family of seven-transmembrane G protein-coupled receptors. The subfamily B3 of class B GPCRs consists of Methuselah (Mth) and its closely related proteins found in bilateria. Mth was originally identified in Drosophila as a GPCR affecting stress resistance and aging. In addition to the seven transmembrane helices, Mth contains an N-terminal extracellular domain involved in ligand binding, and a third intracellular loop (IC3) required for the specificity of G-protein coupling. Drosophila Mth mutants showed an increase in average lifespan by 35% and greater resistance to a variety of stress factors, including starvation, high temperature, and paraquat-induced oxidative toxicity. Moreover, mutations in two endogenous peptide ligands of Methuselah, Stunted A and B, showed an increase in lifespan and resistance to oxidative stress induced by dietary paraquat. These results strongly suggest that the Stunted-Methuselah system plays important roles in stress response and aging. 270
32655 320168 cd15040 7tmB2_Adhesion adhesion receptors, subfamily B2 of the class B family of seven-transmembrane G protein-coupled receptors. The B2 subfamily of class B GPCRs consists of cell-adhesion receptors with 33 members in humans and vertebrates. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing a variety of structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, linked to a class B seven-transmembrane domain. These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and four or five thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 253
32656 341321 cd15041 7tmB1_hormone_R The subfamily B1 of hormone receptors (secretin-like), member of the class B family of seven-transmembrane G protein-coupled receptors. The B1 subfamily of class B GPCRs, also referred to as the secretin-like receptor family, includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain a large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of this subfamily preferentially couple to G proteins of the G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. Moreover, the B1 subfamily receptors play key roles in hormone homeostasis and are promising drug targets in various human diseases including diabetes, osteoporosis, obesity, neurodegenerative conditions (Alzheimer's and Parkinson's), cardiovascular disease, migraine, and psychiatric disorders (anxiety, depression). Furthermore, the subfamilies B2 and B3 consist of receptors that are capable of interacting with epidermal growth factors (EGF) and the Drosophila melanogaster Methuselah gene product (Mth), respectively. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes. 273
32657 320170 cd15042 7tmC_Boss Bride of sevenless, member of the class C family of seven-transmembrane G protein-coupled receptors. Bride of Sevenless (Boss) is a putative Drosophila melanogaster G protein-coupled receptor that functions as a glucose-responding receptor to regulate energy metabolism. Boss is expressed predominantly in the fly's fat body, a nutrient-sensing tissue functionally analogous to the mammalian liver and adipose tissues, and in photoreceptor cells. Boss, which is expressed on the surface of the R8 photoreceptor cell, binds and activates the Sevenless receptor tyrosine kinase on the neighboring R7 precursor cell. Activation of Sevenless results in phosphorylation of Sevenless, triggering a signal transduction cascade through the Ras pathway that ultimately leads to the differentiation of the R7 precursor into a fully functional R7 photoreceptor, the last of eight photoreceptors to differentiate in each ommatidium of the developing Drosophila eye. In the absence of either Sevenless or Boss, the R7 precursor fails to differentiate as a photoreceptor and instead develops into a non-neuronal cone cell. Moreover, Boss mutants in Drosophila showed elevated food intake, but reduced stored triglyceride levels, suggesting that Boss may play a role in regulating energy homeostasis in nutrient-sensing tissues. Furthermore, GPRC5B, a mammalian Boss homolog, activates obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice show resistance to high-fat diet-induced obesity and insulin resistance. 238
32658 320171 cd15043 7tmC_RAIG_GPRC5 retinoic acid-inducible orphan G-protein-coupled receptors; class C family of seven-transmembrane G protein-coupled receptors, group 5. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at their N-terminal domain. Instead, agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of the RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG1 is evolutionarily conserved from mammals to fish. RAIG1 has been shown to act as a tumor suppressor in non-small cell lung carcinoma as well as oral squamous cell carcinoma, but it could also act as an oncogene in breast cancer, colorectal cancer, and pancreatic cancer. Studies have shown that overexpression of RAIG1 decreases intracellular cAMP levels. Moreover, knocking out RAIG1 induces the activation of the NF-kB and STAT3 signaling pathways leading to cell proliferation and resistance to apoptosis. RAIG2 (GPRC5B), a mammalian Boss (Bride of sevenless) homolog, activates obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice show resistance to high-fat diet-induced obesity and insulin resistance. 
The specific functions of RAIG3 and RAIG4 are unknown; however, they may play roles in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interactions with G-protein signaling pathways. 248
32659 320172 cd15044 7tmC_V2R_AA_sensing_receptor-like vomeronasal type-2 pheromone receptors, amino acid-sensing receptors and closely related proteins; member of the class C family of seven-transmembrane G protein-coupled receptors. This group is composed of vomeronasal type-2 pheromone receptors (V2Rs), a subgroup of broad-spectrum amino-acid sensing receptors including calcium-sensing receptor (CaSR) and GPRC6A, as well as their closely related proteins. Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between members of the same species. V2Rs and G-alpha(o) protein are co-expressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and other rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of the phospholipase pathway, producing the second messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain a long N-terminal extracellular domain, which is believed to bind pheromones. CaSR is a widely expressed GPCR that is involved in sensing small changes in extracellular levels of calcium ion to maintain a constant level of the extracellular calcium via modulating the synthesis and secretion of calcium regulating hormones, such as parathyroid hormone (PTH), in order to regulate Ca2+ transport into or out of the extracellular fluid via kidney, intestine, and/or bone. For instance, when Ca2+ is high, CaSR downregulates PTH synthesis and secretion, leading to an increase in renal Ca2+ excretion, a decrease in intestinal Ca2+ absorption, and a reduction in release of skeletal Ca2+. 
GPRC6A (GPCR, class C, group 6, subtype A) is a widely expressed amino acid-sensing GPCR that is most closely related to CaSR. GPRC6A is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine and less potently by small aliphatic amino acids. Moreover, the receptor can be either activated or modulated by divalent cations such as Ca2+. GPRC6A is expressed in the testis, but not the ovary, and specifically also binds to the osteoblast-derived hormone osteocalcin (OCN), which regulates testosterone production by the testis and male fertility independently of the hypothalamic-pituitary axis. Furthermore, GPRC6A knockout studies suggest that GPRC6A is involved in regulation of bone metabolism, male reproduction, energy homeostasis, glucose metabolism, and in activation of the inflammatory response, as well as prostate cancer growth and progression, among others. 251
32660 320173 cd15045 7tmC_mGluRs metabotropic glutamate receptors, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. 
Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and Group 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 253
32661 320174 cd15046 7tmC_TAS1R type 1 taste receptors, member of the class C of seven-transmembrane G protein-coupled receptors. This subfamily represents the type I taste receptors (TAS1Rs) that belong to the class C family of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive monosodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, the alpha subunit of a heterotrimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, a cysteine-rich domain (CRD) and a seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 253
32662 320175 cd15047 7tmC_GABA-B-like gamma-aminobutyric acid type B receptor and related proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, the GABA-B receptor produces slow and sustained inhibitory responses by decreasing neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) domain via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM domain via a 10-15 amino acid linker, suggesting that the GABA-B receptor may utilize a different activation mechanism. 
Also included in this group are orphan receptors, GPR156 and GPR158, which are closely related to the GABA-B receptor family. 263
32663 320176 cd15048 7tmA_Histamine_H3R_H4R histamine receptor subtypes H3R and H4R, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtypes H3R and H4R, members of the histamine receptor family, which belong to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to the G(i)-proteins, which leads to the inhibition of cAMP formation. The H3R receptor functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 296
32664 341322 cd15049 7tmA_mAChR muscarinic acetylcholine receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of mAChRs by the agonist acetylcholine leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32665 320178 cd15050 7tmA_Histamine_H1R histamine subtype H1 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine receptor subtype H1R, a member of the histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). H1R selectively interacts with the G(q)-type G protein that activates phospholipase C and the phosphatidylinositol pathway. Antihistamines, a widely used anti-allergy medication, act on the H1 subtype and produce drowsiness as a side effect. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 263
32666 320179 cd15051 7tmA_Histamine_H2R histamine subtype H2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine receptor subtype H2R, a member of the histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H2R subtype selectively interacts with the G(s)-type G protein that activates adenylate cyclase, leading to increased cAMP production and activation of Protein Kinase A. H2R is found in various tissues such as the brain, stomach, and heart. Its most prominent role is in histamine-induced gastric acid secretion. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 287
32667 320180 cd15052 7tmA_5-HT2 serotonin receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, and is implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32668 320181 cd15053 7tmA_D2-like_dopamine_R D2-like dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. In contrast, activation of D2-like family receptors is linked to G proteins of the G(i) family, which inhibit adenylate cyclase. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 263
32669 320182 cd15054 7tmA_5-HT6 serotonin receptor subtype 6, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT6 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the mammalian central nervous system (CNS). 5-HT6 receptors are selectively linked to G proteins of the G(s) family, which positively stimulate adenylate cyclase, causing cAMP formation and activation of protein kinase A. The 5-HT6 receptors mediate excitatory neurotransmission and are involved in learning and memory; thus they are promising targets for the treatment of cognitive impairment. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 267
32670 320183 cd15055 7tmA_TAARs trace amine-associated receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptors (TAARs) are a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 285
32671 320184 cd15056 7tmA_5-HT4 serotonin receptor subtype 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT4 subtype is a member of the serotonin receptor family that belongs to the class A G protein-coupled receptors, and binds the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the mammalian central nervous system (CNS). 5-HT4 receptors are selectively linked to G proteins of the G(s) family, which positively stimulate adenylate cyclase, causing cAMP formation and activation of protein kinase A. 5-HT4 receptor-specific agonists have been shown to enhance learning and memory in animal studies. Moreover, hippocampal 5-HT4 receptor expression has been reported to be inversely correlated with memory performance in humans. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 294
32672 320185 cd15057 7tmA_D1-like_dopamine_R D1-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. In contrast, activation of D2-like family receptors is linked to G proteins of the G(i) family, which inhibit adenylate cyclase. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 299
32673 320186 cd15058 7tmA_Beta_AR beta adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta adrenergic receptor (beta adrenoceptor), also known as beta AR, is activated by the hormone adrenaline (epinephrine) and plays important roles in regulating cardiac function and heart rate, as well as pulmonary physiology. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 305
32674 320187 cd15059 7tmA_alpha2_AR alpha-2 adrenergic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both the central and peripheral nervous systems and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 261
32675 320188 cd15060 7tmA_tyramine_octopamine_R-like tyramine/octopamine receptor-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes tyramine/octopamine receptors and similar proteins found in insects and other invertebrates. Both octopamine and tyramine mediate their actions via G protein-coupled receptors (GPCRs) and are the invertebrate equivalent of vertebrate adrenergic neurotransmitters. In Drosophila, octopamine is involved in ovulation by mediating an egg release from the ovary, while a physiological role for tyramine in this process is not fully understood. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 260
32676 320189 cd15061 7tmA_tyramine_R-like tyramine receptors and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes tyramine-specific receptors and similar proteins found in insects and other invertebrates. These tyramine receptors form a distinct receptor family that is phylogenetically different from the other tyramine/octopamine receptors which are also found in invertebrates. Both octopamine and tyramine mediate their actions via G protein-coupled receptors (GPCRs) and are the invertebrate equivalent of vertebrate adrenergic neurotransmitters. In Drosophila, octopamine is involved in ovulation by mediating an egg release from the ovary, while a physiological role for tyramine in this process is not fully understood. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 256
32677 320190 cd15062 7tmA_alpha1_AR alpha-1 adrenergic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats. 261
32678 320191 cd15063 7tmA_Octopamine_R octopamine receptors in invertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor for octopamine (OA), which functions as a neurotransmitter, neurohormone, and neuromodulator in invertebrate nervous system. Octopamine (also known as beta, 4-dihydroxyphenethylamine) is an endogenous trace amine that is highly similar to norepinephrine, but lacks a hydroxyl group, and has effects on the adrenergic and dopaminergic nervous systems. Based on the pharmacological and signaling profiles, the octopamine receptors can be classified into at least two groups: OA1 receptors elevate intracellular calcium levels in muscle, whereas OA2 receptors activate adenylate cyclase and increase cAMP production. 266
32679 320192 cd15064 7tmA_5-HT1_5_7 serotonin receptor subtypes 1, 5 and 7, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes serotonin receptor subtypes 1, 5, and 7 that are activated by the neurotransmitter serotonin. The 5-HT1 and 5-HT5 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family. The 5-HT1 receptor subfamily includes 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as 5-HT2C receptor. The 5-HT5A and 5-HT5B receptors have been cloned from rat and mouse, but only the 5-HT5A isoform has been identified in human because of the presence of premature stop codons in the human 5-HT5B gene, which prevents a functional receptor from being expressed. The 5-HT7 receptor is coupled to Gs, which positively stimulates adenylate cyclase activity, leading to increased intracellular cAMP formation and calcium influx. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 258
32680 320193 cd15065 7tmA_Ap5-HTB1-like serotonin receptor subtypes B1 and B2 from Aplysia californica and similar proteins; member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes Aplysia californica serotonin receptors Ap5-HTB1 and Ap5-HTB2, and similar proteins from bilateria including insects, mollusks, annelids, and worms. Ap5-HTB1 is one of the several different receptors for 5-hydroxytryptamine (5HT, serotonin). In Aplysia, serotonin plays important roles in a variety of behavioral and physiological processes mediated by the central nervous system. These include circadian clock, feeding, locomotor movement, cognition and memory, synaptic growth and synaptic plasticity. Both Ap5-HTB1 and Ap5-HTB2 receptors are coupled to G-proteins that stimulate phospholipase C, leading to the activation of phosphoinositide metabolism. Ap5-HTB1 is expressed in the reproductive system, whereas Ap5-HTB2 is expressed in the central nervous system. 300
32681 320194 cd15066 7tmA_DmOct-betaAR-like Drosophila melanogaster beta-adrenergic receptor-like octopamine receptors and similar receptors in bilateria; member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Drosophila beta-adrenergic-like octopamine receptors and similar proteins. The biogenic amine octopamine is the invertebrate equivalent of vertebrate adrenergic neurotransmitters and exerts its effects through different G protein-coupled receptor types. Insect octopamine receptors are involved in the modulation of carbohydrate metabolism, muscular tension, cognition and memory. The activation of octopamine receptors mediating these actions leads to an increase in adenylate cyclase activity, thereby increasing cAMP levels. In Drosophila melanogaster, three subgroups have been classified on the basis of their structural homology and functional equivalents with vertebrate beta-adrenergic receptors: DmOctBeta1R, DmOctBeta2R, and DmOctBeta3R. 265
32682 320195 cd15067 7tmA_Dop1R2-like dopamine 1-like receptor 2 from Drosophila melanogaster and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled dopamine 1-like receptor 2 is expressed in Drosophila heads and it shows significant sequence similarity with vertebrate and invertebrate dopamine receptors. Although the Drosophila Dop1R2 receptor does not cluster into the D1-like structural group, it does show pharmacological properties similar to D1-like receptors. As shown in vertebrate D1-like receptors, agonist stimulation of Dop1R2 activates adenylyl cyclase to increase cAMP levels and also generates a calcium signal through stimulation of phospholipase C. 262
32683 320196 cd15068 7tmA_Adenosine_R_A2A adenosine receptor subtype A2A, member of the class A family of seven-transmembrane G protein-coupled receptors. The A2A receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand and is involved in regulating myocardial oxygen consumption and coronary blood flow. High-affinity A2A and low-affinity A2B receptors are preferentially coupled to G proteins of the stimulatory (Gs) family, which lead to activation of adenylate cyclase, thereby increasing the intracellular cAMP levels. The A2A receptor activation protects against tissue injury and acts as an anti-inflammatory agent. In human skin endothelial cells, activation of the A2B receptor, but not the A2A receptor, promotes angiogenesis. Alternatively, activated A2A receptor, but not the A2B receptor, promotes angiogenesis in human umbilical vein and lung microvascular endothelial cells. The A2A receptor alters cardiac contractility indirectly by modulating the anti-adrenergic effect of the A1 receptor, while the A2B receptor exerts direct effects on cardiac contractile function, but does not modulate beta-adrenergic or A1 anti-adrenergic effects. 293
32684 320197 cd15069 7tmA_Adenosine_R_A2B adenosine receptor subtype A2B, member of the class A family of seven-transmembrane G protein-coupled receptors. The A2B receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand and is involved in regulating myocardial oxygen consumption and coronary blood flow. High-affinity A2A and low-affinity A2B receptors are preferentially coupled to G proteins of the stimulatory (Gs) family, which lead to activation of adenylate cyclase, thereby increasing the intracellular cAMP levels. The A2A receptor activation protects against tissue injury and acts as an anti-inflammatory agent. In human skin endothelial cells, activation of the A2B receptor, but not the A2A receptor, promotes angiogenesis. Alternatively, activated A2A receptor, but not the A2B receptor, promotes angiogenesis in human umbilical vein and lung microvascular endothelial cells. The A2A receptor alters cardiac contractility indirectly by modulating the anti-adrenergic effect of the A1 receptor, while the A2B receptor exerts direct effects on cardiac contractile function, but does not modulate beta-adrenergic or A1 anti-adrenergic effects. 294
32685 320198 cd15070 7tmA_Adenosine_R_A3 adenosine receptor subtype A3, member of the class A family of seven-transmembrane G protein-coupled receptors. The A3 receptor, a member of the adenosine receptor family of G protein-coupled receptors, is coupled to G proteins of the inhibitory G(i) family, which lead to inhibition of adenylate cyclase, thereby lowering the intracellular cAMP levels. The A3 receptor has a sustained protective function in the heart during cardiac ischemia and contributes to inhibition of neutrophil degranulation in neutrophil-mediated tissue injury. Moreover, activation of the A3 receptor by adenosine protects astrocytes from cell death induced by hypoxia. 280
32686 341323 cd15071 7tmA_Adenosine_R_A1 adenosine receptor subtype A1, member of the class A family of seven-transmembrane G protein-coupled receptors. The adenosine A1 receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand. The A1 receptor has primarily inhibitory function on the tissues in which it is located. The A1 receptor slows metabolic activity in the brain and has strong anti-adrenergic effects in the heart. Thus, it antagonizes beta1-adrenergic receptor-induced stimulation and thereby reduces cardiac contractility. The A1 receptor preferentially couples to G proteins of the G(i/o) family, which lead to inhibition of adenylate cyclase, thereby lowering the intracellular cAMP levels. 290
32687 320200 cd15072 7tmA_Retinal_GPR retinal G protein coupled receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the retinal G-protein coupled receptor (RGR) found exclusively in retinal pigment epithelium (RPE) and Muller cells. RGR is a member of the class A rhodopsin-like receptor family. As with other opsins, RGR binds all-trans retinal and contains a conserved lysine residue on the seventh helix. RGR functions as a photoisomerase to catalyze the conversion of all-trans-retinal to 11-cis-retinal. Two mutations in the RGR gene are found in patients with retinitis pigmentosa, indicating that RGR is essential to the visual process. 260
32688 320201 cd15073 7tmA_Peropsin retinal pigment epithelium-derived rhodopsin homolog, member of the class A family of seven-transmembrane G protein-coupled receptors. Peropsin, also known as a retinal pigment epithelium-derived rhodopsin homolog (RRH), is a visual pigment-like protein found exclusively in the apical microvilli of the retinal pigment epithelium. Peropsin belongs to the type 2 opsin family of the class A G-protein coupled receptors. Peropsin presumably plays a physiological role in the retinal pigment epithelium either by detecting light directly or monitoring the levels of retinoids, the primary light absorber in visual perception, or other pigment-related compounds in the eye. 280
32689 320202 cd15074 7tmA_Opsin5_neuropsin neuropsin (Opsin-5), member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropsin, also known as Opsin-5, is a photoreceptor protein expressed in the retina, brain, testes, and spinal cord. Neuropsin belongs to the type 2 opsin family of the class A G-protein coupled receptors. Mammalian neuropsin activates Gi protein-mediated photo-transduction pathway in a UV-dependent manner, whereas, in non-mammalian vertebrates, neuropsin is involved in regulating the photoperiodic control of seasonal reproduction in birds such as quail. As with other opsins, it may also act as a retinal photoisomerase. 284
32690 320203 cd15075 7tmA_Parapinopsin non-visual parapinopsin, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the non-visual pineal pigment, parapinopsin, which is a member of the class A of the seven transmembrane G protein-coupled receptors. Parapinopsin serves as a UV-sensitive pigment for the wavelength discrimination in the pineal-related organs of lower vertebrates such as reptiles, amphibians, and fish. Although parapinopsin is phylogenetically related to vertebrate visual pigments such as rhodopsin, which releases its retinal chromophore and bleaches, the parapinopsin photoproduct is stable and does not bleach. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. 279
32691 320204 cd15076 7tmA_SWS1_opsin short wave-sensitive 1 opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Short Wave-Sensitive opsin 1 (SWS1), which mediates visual transduction in response to light at short wavelengths (ultraviolet to blue). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 280
32692 320205 cd15077 7tmA_SWS2_opsin short wave-sensitive 2 opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Short Wave-Sensitive opsin 2 (SWS2), which mediates visual transduction in response to light at short wavelengths (violet to blue). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 280
32693 320206 cd15078 7tmA_Encephalopsin encephalopsins (opsin-3), member of the class A family of seven-transmembrane G protein-coupled receptors. Encephalopsin, also called Opsin-3 or Panopsin, is a mammalian extra-retinal opsin that is highly localized in the brain. It is thought to play a role in encephalic photoreception. Encephalopsin belongs to the class A of the G protein-coupled receptors and shows strong homology to the vertebrate visual opsins. 279
32694 320207 cd15079 7tmA_photoreceptors_insect insect photoreceptors R1-R6 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the insect photoreceptors and their closely related proteins. The Drosophila eye is composed of about 800 unit eyes called ommatidia, each of which contains eight photoreceptor cells (R1-R8). The six outer photoreceptors (R1-R6) function like the vertebrate rods and are responsible for motion detection in dim light and image formation. The R1-R6 photoreceptors express a blue-absorbing pigment, Rhodopsin 1 (Rh1). The inner photoreceptors (R7 and R8) are considered the equivalent of the color-sensitive vertebrate cone cells, which express a range of different pigments. The R7 photoreceptors express one of two different UV absorbing pigments, either Rh3 or Rh4. Likewise, the R8 photoreceptors express either the blue absorbing pigment Rh5 or the green absorbing pigment Rh6. These photoreceptors belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 292
32695 381742 cd15080 7tmA_MWS_opsin medium wave-sensitive opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Medium Wave-Sensitive opsin, which mediates visual transduction in response to light at medium wavelengths (green). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 280
32696 320209 cd15081 7tmA_LWS_opsin long wave-sensitive opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Long Wave-Sensitive opsin is also called red-sensitive opsin or red cone photoreceptor pigment, which mediates visual transduction in response to light at long wavelengths. Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 292
32697 320210 cd15082 7tmA_VA_opsin non-visual VA (vertebrate ancient) opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. The vertebrate ancient (VA) opsin photopigments were originally identified in salmon and they appear to have diverged early in the evolution of vertebrate opsins. VA opsins are localized in the inner retina and the brain in teleosts. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extraretinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity, and body color change. The VA opsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 291
32698 320211 cd15083 7tmA_Melanopsin-like vertebrate melanopsins and related opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the Gq-coupled rhodopsin subfamily, which consists of melanopsins, insect photoreceptors R1-R6, invertebrate Gq opsins as well as their closely related opsins. Melanopsins (also called Opsin-4) are the primary photoreceptor molecules for non-visual functions such as the photo-entrainment of the circadian rhythm and pupillary constriction in mammals. Mammalian melanopsins are expressed only in the inner retina, whereas non-mammalian vertebrate melanopsins are localized in various extra-retinal tissues such as iris, brain, pineal gland, and skin. The outer photoreceptors (R1-R6) are the insect Drosophila equivalent to the vertebrate rods and are responsible for image formation and motion detection. The invertebrate G(q) opsins include the arthropod and mollusk visual opsins as well as invertebrate melanopsins, which are also found in vertebrates. Arthropods possess color vision by the use of multiple opsins sensitive to different light wavelengths. Members of this subfamily belong to the class A of the G protein-coupled receptors and have seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 291
32699 320212 cd15084 7tmA_Pinopsin non-visual pinopsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Pinopsins are found in the pineal organ of birds, reptiles and amphibians, but are absent from teleosts and mammals. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Pinopsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 295
32700 320213 cd15085 7tmA_Parietopsin non-visual parietopsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Parietopsin is a non-visual green light-sensitive opsin that was initially identified in the parietal eye of lizards. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Parietopsin belongs to the class A of the G protein-coupled receptors and shows strong homology to the vertebrate visual opsins. 280
32701 320214 cd15086 7tmA_tmt_opsin teleost multiple tissue (tmt) opsin, member of the class A family of seven-transmembrane G protein-coupled receptors. Teleost multiple tissue (tmt) opsins are homologs of encephalopsin. Mouse encephalopsin (or panopsin) is highly expressed in the brain and testes, whereas the teleost homologs are localized to multiple tissues. The exact functions of the encephalopsins and tmt-opsins are unknown. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Tmt opsins belong to the class A of the G protein-coupled receptors and show strong homology to the vertebrate visual opsins. 276
32702 320215 cd15087 7tmA_NPBWR neuropeptide B/W receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide B/W receptor 1 and 2 are members of the class A G-protein coupled receptors that bind the neuropeptides B and W, respectively. NPBWR1 (previously known as GPR7) is expressed predominantly in cerebellum and frontal cortex, while NPBWR2 (previously known as GPR8) is located mostly in the frontal cortex and is present in human, but not in rats and mice. These receptors are suggested to be involved in the regulation of food intake, neuroendocrine function, and modulation of inflammatory pain, among many others. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32703 320216 cd15088 7tmA_MCHR-like melanin concentrating hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 278
32704 320217 cd15089 7tmA_Delta_opioid_R opioid receptor subtype delta, member of the class A family of seven-transmembrane G protein-coupled receptors. The delta-opioid receptor binds the endogenous pentapeptide ligands such as enkephalins and produces antidepressant-like effects. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. 281
32705 320218 cd15090 7tmA_Mu_opioid_R opioid receptor subtype mu, member of the class A family of seven-transmembrane G protein-coupled receptors. The mu-opioid receptor binds endogenous opioids such as beta-endorphin and endomorphin. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. 279
32706 320219 cd15091 7tmA_Kappa_opioid_R opioid receptor subtype kappa, member of the class A family of seven-transmembrane G protein-coupled receptors. The kappa-opioid receptor binds the opioid peptide dynorphin as the primary endogenous ligand. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. 282
32707 320220 cd15092 7tmA_NOFQ_opioid_R nociceptin/orphanin FQ peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The nociceptin (NOP) receptor binds nociceptin or orphanin FQ, a 17 amino acid endogenous neuropeptide. The NOP receptor is involved in the modulation of various brain activities including instinctive and emotional behaviors. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. 279
32708 320221 cd15093 7tmA_SSTR somatostatin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. They share common signaling cascades such as inhibition of adenylyl cyclase, activation of phosphotyrosine phosphatase activity, and G-protein-dependent regulation of MAPKs. 280
32709 320222 cd15094 7tmA_AstC_insect somatostatin-like receptor for allatostatin C, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. In Drosophila melanogaster and other insects, a 15-amino-acid peptide named allatostatin C(AstC) binds the somatostatin-like receptors. Two AstC receptors have been identified in Drosophila with strong sequence homology to human somatostatin and opioid receptors. 282
32710 320223 cd15095 7tmA_KiSS1R KiSS1-derived peptide (kisspeptin) receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled KiSS1-derived peptide receptor (GPR54 or kisspeptin receptor) binds the peptide hormone kisspeptin (previously known as metastin), which is encoded by the metastasis suppressor gene (KISS1) expressed in various endocrine and reproductive tissues. The KiSS1 receptor is coupled to G proteins of the G(q/11) family, which lead to activation of phospholipase C and increase of intracellular calcium. This signaling cascade plays an important role in reproduction by regulating the secretion of gonadotropin-releasing hormone. 288
32711 320224 cd15096 7tmA_AstA_R_insect allatostatin-A receptor in insects, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled AstA receptor binds allatostatin A. Three distinct types of allatostatin have been identified in the insects and crustaceans: AstA, AstB, and AstC. They both inhibit the biosynthesis of juvenile hormone and exert an inhibitory influence on food intake. Therefore, allatostatins are considered as potential targets for insect control. 284
32712 320225 cd15097 7tmA_Gal2_Gal3_R galanin receptor subtypes 2 and 3, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Three receptor subtypes have been identified so far: GAL1, GAL2, and GAL3. The specific functions of each subtype remain mostly unknown, although galanin is thought to be involved in a variety of neuronal functions such as hormone release and food intake. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, depression, eating disorders, epilepsy and stroke, among many others. 279
32713 320226 cd15098 7tmA_Gal1_R galanin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Three receptor subtypes have been identified so far: GAL1, GAL2, and GAL3. The specific functions of each subtype remain mostly unknown, although galanin is thought to be involved in a variety of neuronal functions such as hormone release and food intake. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, depression, eating disorders, epilepsy and stroke, among many others. 282
32714 320227 cd15099 7tmA_Cannabinoid_R cannabinoid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in various physiological processes such as appetite, mood, memory, and pain sensation. The CB1 receptor is expressed predominantly in central and peripheral neurons, while the CB2 receptor is found mainly in the immune system. 281
32715 320228 cd15100 7tmA_GPR3_GPR6_GPR12-like G protein-coupled receptors 3, 6, 12, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3. Also included in this subfamily is GPRx, also known as GPR185, which is involved in the maintenance of meiotic arrest in frog oocytes. 268
32716 341325 cd15101 7tmA_LPAR lysophosphatidic acid receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 274
32717 320230 cd15102 7tmA_S1PR sphingosine-1-phosphate receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 270
32718 320231 cd15103 7tmA_MCR melanocortin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 270
32719 320232 cd15104 7tmA_GPR119_R_insulinotropic_receptor G protein-coupled receptor 119, also called glucose-dependent insulinotropic receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR119 is activated by oleoylethanolamide (OEA), a naturally occurring bioactive lipid with hypophagic and anti-obesity effects. Immunohistochemistry and double-immunofluorescence studies revealed the predominant GPR119 localization in pancreatic polypeptide (PP)-cells of islets. In addition, GPR119 expression is elevated in islets of obese hyperglycemic mice as compared to control islets, suggesting a possible involvement of this receptor in the development of obesity and diabetes. GPR119 has a significant sequence similarity with the members of the endothelial differentiation gene family. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 283
32720 320233 cd15105 7tmA_MrgprA mas-related G protein-coupled receptor subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 276
32721 320234 cd15106 7tmA_MrgprX-like primate-specific mas-related G protein-coupled receptor subtype X-like, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 274
32722 320235 cd15107 7tmA_MrgprB mas-related G protein-coupled receptor subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 276
32723 320236 cd15108 7tmA_MrgprD mas-related G protein-coupled receptor subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 276
32724 320237 cd15109 7tmA_MrgprF mas-related G protein-coupled receptor subtype F, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 274
32725 320238 cd15110 7tmA_MrgprH mas-related G protein-coupled receptor subtype H, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 274
32726 320239 cd15111 7tmA_MrgprG mas-related G protein-coupled receptor subtype G, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 263
32727 320240 cd15112 7tmA_MrgprE mas-related G protein-coupled receptor subtype E, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 272
32728 320241 cd15113 7tmA_MAS1L mas-related G protein-coupled receptor 1-like (MAS1L), member of the class A family of seven-transmembrane G protein-coupled receptors. MAS1L is also called MAS1 oncogene-like (MAS1-like) or mas-related G-protein coupled receptor MRG. MAS1L is a G protein-coupled receptor that is found only in primates. The angiotensin-II metabolite angiotensin is an endogenous ligand for MAS1L. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. 265
32729 320242 cd15114 7tmA_C5aR complement component 5a anaphylatoxin chemotactic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors which bind anaphylatoxins; members of this group include C3a receptors and C5a receptors. Anaphylatoxins are also known as complement peptides (C3a, C4a and C5a) that are produced from the activation of the complement system cascade. These complement anaphylatoxins can trigger degranulation of endothelial cells, mast cells, or phagocytes, which induce a local inflammatory response and stimulate smooth muscle cell contraction, histamine release, and increased vascular permeability. They are potent mediators involved in chemotaxis, inflammation, and generation of cytotoxic oxygen-derived free radicals. In humans, a single receptor for C3a (C3AR1) and two receptors for C5a (C5AR1 and C5AR2, also known as C5L2 or GPR77) have been identified, but there is no known receptor for C4a. 274
32730 320243 cd15115 7tmA_C3aR complement component 3a anaphylatoxin chemotactic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors which bind anaphylatoxins; members of this group include C3a receptors and C5a receptors. Anaphylatoxins are also known as complement peptides (C3a, C4a and C5a) that are produced from the activation of the complement system cascade. These complement anaphylatoxins can trigger degranulation of endothelial cells, mast cells, or phagocytes, which induce a local inflammatory response and stimulate smooth muscle cell contraction, histamine release, and increased vascular permeability. They are potent mediators involved in chemotaxis, inflammation, and generation of cytotoxic oxygen-derived free radicals. In humans, a single receptor for C3a (C3AR1) and two receptors for C5a (C5AR1 and C5AR2, also known as C5L2 or GPR77) have been identified, but there is no known receptor for C4a. 265
32731 320244 cd15116 7tmA_CMKLR1 chemokine-like receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokine receptor-like 1 (also known as Chemerin receptor 23) is a GPCR for the chemoattractant adipokine chemerin, also known as retinoic acid receptor responder protein 2 (RARRES2), and for the omega-3 fatty acid derived molecule resolvin E1. Interaction with chemerin induces activation of the MAPK and PI3K signaling pathways leading to downstream functional effects, such as a decrease in immune responses, stimulation of adipogenesis, and angiogenesis. On the other hand, resolvin E1 negatively regulates the cytokine production in macrophages by reducing the activation of MAPK1/3 and NF-kB pathways. CMKLR1 is prominently expressed in dendritic cells and macrophages. 284
32732 320245 cd15117 7tmA_FPR-like N-formyl peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The formyl peptide receptors (FPRs) are chemoattractant GPCRs that are involved in mediating immune responses to infection. They are expressed at elevated levels on polymorphonuclear and mononuclear phagocytes. FPRs bind N-formyl peptides, which are derived from the mitochondrial proteins of ruptured host cells or invading pathogens. Activation of FPRs by N-formyl peptides such as N-formyl-Met-Leu-Phe (FMLP) triggers a signaling cascade that stimulates neutrophil accumulation, phagocytosis and superoxide production. These responses are mediated through a pertussis toxin-sensitive G(i) protein that activates a PLC-IP3-calcium signaling pathway. While FPRs are involved in host defense responses to bacterial infection, they can also suppress the immune system under certain conditions. Yet, the physiological role of the FPR family is not fully understood. 288
32733 320246 cd15118 7tmA_PD2R2_CRTH2 prostaglandin D2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin D2 receptor, also known as CRTH2, is a chemoattractant G-protein coupled receptor expressed on T helper type 2 cells that binds prostaglandin D2 (PGD2). PGD2 functions as a mast cell-derived mediator to trigger asthmatic responses and also causes vasodilation. PGD2 exerts its inflammatory effects by binding to two G-protein coupled receptors, the D-type prostanoid receptor (DP) and PD2R2 (CRTH2). PD2R2 couples to the G protein G(i/o) type which leads to a reduction in intracellular cAMP levels and an increase in intracellular calcium. PD2R2 is involved in mediating chemotaxis of Th2 cells, eosinophils, and basophils generated during allergic inflammatory processes. CRTH2 (PD2R2), but not DP receptor, undergoes agonist-induced internalization which is one of the key processes that regulate the signaling of the GPCR. 284
32734 320247 cd15119 7tmA_GPR1 G protein-coupled receptor 1 for chemerin, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 1 (GPR1) belongs to the class A of the seven transmembrane domain receptors. This is an orphan receptor that can be activated by the leukocyte chemoattractant chemerin, thereby suggesting that some of the anti-inflammatory actions of chemerin may be mediated through GPR1. GPR1 is most closely related to another chemerin receptor CMKLR1. In an in-vitro study, GPR1 has been shown to act as a co-receptor to allow replication of HIV viruses. 278
32735 320248 cd15120 7tmA_GPR33 orphan receptor GPR33, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor GPR33, an orphan member of the chemokine-like receptor family, was originally identified as a pseudogene in humans as well as in several apes and rodent species. Although the intact GPR33 allele is still present in a small fraction of the human population, the human GPR33 contains a premature stop codon. The amino acid sequence of GPR33 shares a high degree of sequence identity with the members of the chemokine and chemoattractant receptors that control leukocyte chemotaxis. The human GPR33 is expressed in spleen, lung, heart, kidney, pancreas, thymus, gonads, and leukocytes. 282
32736 320249 cd15121 7tmA_LTB4R1 leukotriene B4 receptor subtype 1 (LTB4R1 or BLT1), member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the Gq-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation. 278
32737 320250 cd15122 7tmA_LTB4R2 leukotriene B4 receptor subtype 2 (LTB4R2 or BLT2), member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the Gq-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation. 281
32738 320251 cd15123 7tmA_BRS-3 bombesin receptor subtype 3, member of the class A family of seven-transmembrane G protein-coupled receptors. BRS-3 is classified as an orphan receptor and belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include neuromedin B receptor (NMBR) and gastrin-releasing peptide receptor (GRPR). Bombesin is a tetradecapeptide, originally isolated from frog skin. Mammalian bombesin-related peptides are widely distributed in the gastrointestinal and central nervous systems. The bombesin family receptors couple primarily to the G proteins of G(q/11) family. BRS-3 interacts with known naturally-occurring bombesin-related peptides with low affinity; however, no endogenous high-affinity ligand to the receptor has been identified. BRS-3 is suggested to play a role in sperm cell division and maturation. 294
32739 320252 cd15124 7tmA_GRPR gastrin-releasing peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The gastrin-releasing peptide receptor (GRPR) is a G-protein coupled receptor whose endogenous ligand is gastrin-releasing peptide. GRP shares high sequence homology with the neuropeptide neuromedin B in the C-terminal region. This receptor is highly glycosylated and couples to a pertussis-toxin-insensitive G protein of the family of Gq/11, which leads to the activation of phospholipase C. Gastrin-releasing peptide (GRP) is a potent mitogen for neoplastic tissues and involved in regulating multiple functions of the gastrointestinal and central nervous systems. These include the release of gastrointestinal hormones, the contraction of smooth muscle cells, and the proliferation of epithelial cells. GRPR belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include neuromedin B receptor (NMBR) and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin. 293
32740 320253 cd15125 7tmA_NMBR neuromedin B receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neuromedin B receptor (NMBR), also known as BB1, is a G-protein coupled receptor whose endogenous ligand is the neuropeptide neuromedin B. Neuromedin B is a potent mitogen and growth factor for normal and cancerous lung and for gastrointestinal epithelial tissues. NMBR is widely distributed in the CNS, with especially high levels in olfactory nucleus and thalamic regions. The receptor couples primarily to a pertussis-toxin-insensitive G protein of the Gq/11 family, which leads to the activation of phospholipase C. NMBR belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include gastrin-releasing peptide receptor (GRPR) and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin. 292
32741 320254 cd15126 7tmA_ETBR-LP2 endothelin B receptor-like protein 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelin B receptor-like protein 2, also called GPR37L1, is almost exclusively expressed in the nervous system. It has recently been shown to act as a receptor for the neuropeptide prosaptide, the active fragment of the secreted neuroprotective and glioprotective factor prosaposin (also called sulfated glycoprotein-1). Both prosaptide and prosaposin protect primary astrocytes against oxidative stress. GPR37L1 is part of the class A family of GPCRs that includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 298
32742 320255 cd15127 7tmA_GPR37 G protein-coupled receptor 37, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR37, also called parkin-associated endothelin-like receptor (Pael-R), was isolated from a set of human brain frontal lobe expressed sequence tags. It is highly expressed in the mammalian CNS. It is a substrate of parkin and is involved in the pathogenesis of Parkinson's disease. GPR37 has recently been shown to act as a receptor for the neuropeptide prosaptide, the active fragment of the secreted neuroprotective and glioprotective factor prosaposin (also called sulfated glycoprotein-1). Both prosaptide and prosaposin protect primary astrocytes against oxidative stress. GPR37 is part of the class A family of GPCRs that includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 298
32743 320256 cd15128 7tmA_ET_R endothelin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are 21-amino acid peptides which are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain. 300
32744 320257 cd15129 7tmA_GPR142 G-protein-coupled receptor GPR142, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR142, a vertebrate orphan receptor, is very closely related to GPR139, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and plays an important role in mediating enhancement of glucose-stimulated insulin secretion and maintaining glucose homeostasis, whereas GPR139 is expressed almost exclusively in the brain and is suggested to play a role in the control of locomotor activity. These orphan receptors are phylogenetically clustered with invertebrate FMRFamide receptors such as Drosophila melanogaster DrmFMRFa-R. 270
32745 320258 cd15130 7tmA_NTSR neurotensin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3, also called sortilin, is not a G protein-coupled receptor. 281
32746 320259 cd15131 7tmA_GHSR growth hormone secretagogue receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Growth hormone secretagogue receptor, GHSR, is also known as GH-releasing peptide receptor (GHRP) or Ghrelin receptor. Ghrelin, the endogenous ligand for GHSR, is an acylated 28-amino acid peptide hormone produced by ghrelin cells in the gastrointestinal tract. Ghrelin, also called hunger hormone, is involved in the regulation of growth hormone release, appetite and feeding, gut motility, lipid and glucose metabolism, and energy balance. It also plays a role in the cardiovascular, immune, and reproductive systems. GHSR couples to G-alpha-11 proteins. Both ghrelin and GHSR are expressed in a wide range of cancer tissues. Recent studies suggested that ghrelin may play a role in processes associated with cancer progression, including cell proliferation, metastasis, apoptosis, and angiogenesis. 291
32747 320260 cd15132 7tmA_motilin_R motilin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Motilin receptor, also known as GPR38, is a G-protein coupled receptor that binds the endogenous ligand motilin. Motilin is a 22 amino acid peptide hormone expressed throughout the gastrointestinal tract and stimulates contraction of gut smooth muscle. Motilin is also called the housekeeper of the gut because it is responsible for the proper filling and emptying of the gastrointestinal tract in response to food intake, and for stimulating the production of pepsin. Motilin receptor shares significant amino acid sequence identity with the growth hormone secretagogue receptor (GHSR) and neurotensin receptors (NTS-R1 and 2). 289
32748 320261 cd15133 7tmA_NMU-R neuromedin U receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure. 298
32749 320262 cd15134 7tmA_capaR neuropeptide capa receptor and similar invertebrate proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. CapaR is a G-protein coupled receptor for the Drosophila melanogaster capa neuropeptides (Drm-capa-1 and -2), which act on the Malpighian tubules to increase fluid transport. The capa peptides are evolutionarily related to vertebrate Neuromedin U neuropeptide and contain a C-terminal FPRXamide motif. CapaR regulates fluid homeostasis through its ligands, thereby acting as a desiccation stress-responsive receptor. CapaR undergoes desensitization, with internalization mediated by beta-arrestin-2. 298
32750 320263 cd15135 7tmA_GPR39 G protein-coupled receptor 39, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR39 is an orphan G protein-coupled receptor that belongs to the growth hormone secretagogue and neurotensin receptor subfamily. GPR39 is expressed in peripheral tissues such as pancreas, gut, gastrointestinal tract, liver, kidney as well as certain regions of the brain. The divalent metal ion Zn(2+) has been shown to be a ligand capable of activating GPR39. Thus, it has been suggested that GPR39 functions as a G(q)-coupled Zn(2+)-sensing receptor which is involved in the regulation of endocrine pancreatic function, body weight, gastrointestinal motility, and cell death. 320
32751 320264 cd15136 7tmA_Glyco_hormone_R glycoprotein hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors (GPHRs) are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG) and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. GPHRs couple primarily to the G(s)-protein and promote cAMP production, but also to the G(i)- or G(q)-protein. 275
32752 320265 cd15137 7tmA_Relaxin_R relaxin family peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes relaxin/insulin-like family peptide receptor 1 (RXFP1 or LGR7) and 2 (RXFP2 or LGR8), which contain a very large extracellular N-terminal domain with numerous leucine-rich repeats responsible for hormone recognition and binding. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural but low sequence similarity, and exert their physiological functions by activating a group of four GPCRs, RXFP1-4. Relaxin and INSL3 are the endogenous ligands for RXFP1 and RXFP2, respectively. Upon receptor binding, relaxin activates a variety of signaling pathways to produce second messengers such as cAMP. 284
32753 320266 cd15138 7tmA_LRR_GPR orphan leucine-rich repeat-containing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes leucine-rich repeat-containing G-protein coupled receptor 4 (LGR4), 5 (LGR5), and 6 (LGR6). These receptors constitute a subfamily of receptors related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The RSPO proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR4 is broadly expressed in proliferating cells, and LGR4-deficient mice display developmental defects in multiple organs. LGR5 acts as a marker for resident stem cells in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 also serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins. 274
32754 320267 cd15139 7tmA_PGE2_EP2 prostaglandin E2 receptor EP2 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP2, also called prostanoid EP2 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has unique but overlapping tissue distributions that activate different intracellular signaling pathways. Stimulation of the EP2 receptor by PGE2 causes cAMP accumulation through G(s) protein activation, which subsequently produces smooth muscle relaxation and mediates the systemic vasodepressor response to PGE2. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters. 299
32755 320268 cd15140 7tmA_PGD2 prostaglandin D2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin D2 receptor (also called prostanoid DP receptor, DP1, or PGD2R1) is a G-protein coupled receptor whose endogenous ligand is prostaglandin D2 (PGD2). PGD2, the major cyclooxygenase metabolite of arachidonic acid produced by mast cells, mediates inflammatory reactions in response to allergen challenge and causes peripheral vasodilation. PGD2 exerts its biological effects by binding to two types of cell surface receptors: a DP1 receptor that belongs to the prostanoid receptor family and a chemoattractant receptor-homologous molecule expressed on the T-helper type 2 cells (CRTH2 or PD2R2). 312
32756 320269 cd15141 7tmA_PGI2 prostaglandin I2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin I2 receptor (also called prostacyclin receptor or prostanoid IP receptor) is a class A, G protein-coupled receptor whose endogenous ligand is prostacyclin, which is the major product of cyclooxygenase metabolite of arachidonic acid that is found predominantly in platelets and vascular smooth muscle cells (VSMCs). The PGI2 receptor is coupled to both G(s) and G(q) protein subtypes, resulting in increased cAMP formation, phosphoinositide turnover, and Ca2+ signaling. PGI2 receptor activation by prostacyclin induces VSMC differentiation and produces a potent vasodilation and inhibition of platelet aggregation. 301
32757 320270 cd15142 7tmA_PGE2_EP4 prostaglandin E2 receptor EP4 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP4, also called prostanoid EP4 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has unique but overlapping tissue distributions that activate different intracellular signaling pathways. Like the EP2 receptor, stimulation of the EP4 receptor by PGE2 causes cAMP accumulation through G(s) protein activation. Knockout studies in mice suggest that EP4 receptor may be involved in the maintenance of bone mass and fracture healing. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters. 302
32758 320271 cd15143 7tmA_TXA2_R thromboxane A2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The thromboxane receptor, also known as the prostanoid TP receptor, is a class A G-protein coupled receptor whose endogenous ligand is thromboxane A2 (TXA2). TXA2 is the major product of cyclooxygenase metabolite of arachidonic acid that is found predominantly in platelets and stimulates platelet aggregation, Ca2+ influx into platelets, and also causes vasoconstriction. TXA2 has been shown to be involved in immune regulation, angiogenesis and metastasis, among many others. Activation of TXA2 receptor is coupled to G(q) and G(13), resulting in the activations of phospholipase C and RhoGEF, respectively. TXA2 receptor is widely distributed in the body and is abundantly expressed in thymus and spleen. 296
32759 320272 cd15144 7tmA_PGE2_EP1 prostaglandin E2 receptor EP1 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP1, also called prostanoid EP1 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has unique but overlapping tissue distributions that activate different intracellular signaling pathways. It has been shown that stimulation of the EP1 receptor by PGE2 causes smooth muscle contraction and increased intracellular Ca2+ levels; however, it is still unclear whether EP1 receptor is exclusively coupled to G(q/11), which leads to activation of phospholipase C and phosphatidylinositol hydrolysis. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters. 294
32760 320273 cd15145 7tmA_FP prostaglandin F2-alpha receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The PGF2-alpha receptor, also called prostanoid FP receptor, is a class A G-protein coupled receptor whose endogenous ligand is prostaglandin F2-alpha. PGF2-alpha binding to this receptor is coupled to the stimulation of phospholipase C (PLC) pathway via G-protein subunit G(q). This leads to the release of inositol trisphosphate (IP3) and diacylglycerol (DAG) which results in increased intracellular Ca2+ levels and activation of PKC. The receptor activation primarily induces uterine contraction and bronchoconstriction, and stimulates luteolysis. Like most prostanoid receptors, the PGF2-alpha receptor has also been implicated in tumor angiogenesis and metastasis. 290
32761 320274 cd15146 7tmA_PGE2_EP3 prostaglandin E2 receptor EP3 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP3, also called prostanoid EP3 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has unique but overlapping tissue distributions that activate different intracellular signaling pathways. Stimulation of the EP3 receptor by PGE2 preferentially couples to G(i) protein. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels, which subsequently produces smooth muscle contraction. Knockout mice studies suggest that the EP3 receptor may act as a systemic vasopressor. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters. 308
32762 320275 cd15147 7tmA_PAFR platelet-activating factor receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The platelet-activating factor receptor is a G(q/11)-protein coupled receptor, which is linked to p38 MAPK and PI3K signaling pathways. PAF is a phospholipid (1-0-alkyl-2-acetyl-sn-glycero-3-phosphorylcholine) which is synthesized by cells especially involved in host defense such as platelets, macrophages, neutrophils, and monocytes. PAF is well-known for its ability to induce platelet aggregation and anaphylaxis, and also plays important roles in allergy, asthma, and inflammatory responses, among many others. 291
32763 320276 cd15148 7tmA_GPR34-like putative G protein-coupled receptor 34, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 34 of unknown function. Orphan GPR34 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32764 320277 cd15149 7tmA_P2Y14 P2Y purinoceptor 14, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y14 receptor is activated by UDP-sugars and belongs to the G(i) class of the P2Y family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14). P2Y14 receptor has been reported to be involved in a diverse set of physiological responses in many epithelia as well as in immune and inflammatory cells. 284
32765 341326 cd15150 7tmA_P2Y12 P2Y purinoceptor 12, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y12 receptor (P2Y12R) is found predominantly on the surface of blood platelets and is activated by adenosine diphosphate (ADP). P2Y12R plays an important role in the regulation of blood clotting and belongs to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14). 285
32766 341327 cd15151 7tmA_P2Y13 P2Y purinoceptor 13, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y13 receptor (P2Y13R) is activated by adenosine diphosphate (ADP) and belongs to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14). 284
32767 320280 cd15152 7tmA_GPR174-like putative purinergic receptor GPR174, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR174 has been recently identified as a lysophosphatidylserine receptor that enhances intracellular cAMP formation by coupling to a G(s) protein. GPR174 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32768 320281 cd15153 7tmA_P2Y10 P2Y purinoceptor 10, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y10 receptor is a G-protein coupled receptor that is activated by both sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA). Phylogenetic analysis of the class A GPCRs shows that P2Y10 is grouped into the cluster comprising nucleotide and lipid receptors. Although the mouse P2Y10 was found to be expressed in brain, lung, reproductive organs, and skeletal muscle, the physiological function of this receptor is not yet known. S1P and LPA are bioactive lipid molecules that induce a variety of cellular responses through G proteins: adhesion, invasion, cell migration and proliferation, among many others. 283
32769 320282 cd15154 7tmA_LPAR5 lysophosphatidic acid receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 5 (LPAR5) is a G protein-coupled receptor that binds the bioactive lipid lysophosphatidic acid (LPA) and is involved in maintenance of human hair growth. Phylogenetic analysis of the class A GPCRs shows that LPAR5 is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production. Activation of LPAR5 is coupled to G(q) and G(12/13) proteins. 285
32770 320283 cd15155 7tmA_LPAR4 lysophosphatidic acid receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 4 (LPAR4) is a G protein-coupled receptor that binds and is activated by the bioactive lipid lysophosphatidic acid (LPA), which is released by activated platelets and constitutively found in serum. Phylogenetic analysis of the class A GPCRs shows that LPAR4 is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production. Activation of LPAR5 is coupled to G(12/13) proteins, leading to neurite retraction and stress fiber formation, whereas coupling to G(q) protein leads to increases in calcium levels. 283
32771 320284 cd15156 7tmA_LPAR6_P2Y5 lysophosphatidic acid receptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 6 (LPAR6), also known as P2Y5, is a G(i), G(12/13) G protein-coupled receptor that is activated by the bioactive lipid lysophosphatidic acid (LPA), which is released by activated platelets and constitutively present in serum. LPAR6 plays an important role in maintenance of human hair growth. Thus, mutations in the receptor are responsible for both autosomal recessive wooly hair and hypotrichosis. Phylogenetic analysis of the class A GPCRs shows that LPAR6 (P2Y5) is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production. 285
32772 320285 cd15157 7tmA_CysLTR2 cysteinyl leukotriene receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in the leucocytes (cells of immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. They belong to the class A GPCR superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 278
32773 320286 cd15158 7tmA_CysLTR1 cysteinyl leukotriene receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in the leucocytes (cells of immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. They belong to the class A GPCR superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 285
32774 320287 cd15159 7tmA_EBI2 Epstein-Barr virus (EBV)-induced gene 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Epstein-Barr virus-induced G-protein coupled receptor 2 (EBI2), also called GPR183, is activated by 7alpha, 25-dihydroxycholesterol (7alpha, 25-OHC), an oxysterol. EBI2 was originally identified as one of the major genes induced in the Burkitt's lymphoma cell line BL41 by EBV infection. EBI2 is involved in regulating B cell migration and responses, and is also implicated in human diseases such as type I diabetes, multiple sclerosis, and cancers. 286
32775 320288 cd15160 7tmA_Proton-sensing_R proton-sensing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Proton/pH-sensing G-protein coupled receptors sense pH of 7.6 to 6.0. They mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. The proton/pH-sensing receptor family includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 (TDAG8, GPR65) receptor, ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). 280
32776 320289 cd15161 7tmA_GPR17 G protein-coupled receptor 17, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR17 is a Forkhead box protein O1 (FOXO1) target and abundantly expressed in agouti-related peptide (AGRP) neurons. FOXO1 is a transcription factor that plays key roles in regulation of gluconeogenesis and glycogenolysis by insulin signaling. For instance, food intake and body weight increase when hypothalamic FOXO1 is activated, whereas they both decrease when FOXO1 is inhibited. However, a recent study reported that GPR17 deficiency in mice did not affect food intake or glucose homeostasis. Thus, GPR17 may not play a role in the control of food intake, body weight, or glycemic control. GPR17 is phylogenetically closely related to purinergic P2Y and cysteinyl-leukotriene receptors. 277
32777 341328 cd15162 7tmA_PAR protease-activated receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes purinergic receptor P2Y8 and protease-activated receptors. P2Y8 (or P2RY8) expression is often increased in leukemia patients, and it plays a role in the pathogenesis of acute leukemia. P2Y8 is phylogenetically closely related to the protease-activated receptors (PARs), which are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. Four different types of the protease-activated receptors have been identified (PAR1-4) and are predominantly expressed in platelets. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 280
32778 320291 cd15163 7tmA_GPR20 G protein-coupled receptor 20, member of the class A family of seven-transmembrane G protein-coupled receptors. Orphan GPR20 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. GPR20 has been shown to constitutively activate G(i) proteins in the absence of a ligand; however its functional role is not known. GPR20 is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A common feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others. 258
32779 320292 cd15164 7tmA_GPR35-like G protein-coupled receptor 35 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR35 shares closest homology with GPR55, and they belong to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A number of studies have suggested that GPR35 may play important physiological roles in hypertension, atherosclerosis, nociception, asthma, glucose homeostasis and diabetes, and inflammatory bowel disease. GPR35 is thought to be responsible for brachydactyly mental retardation syndrome, which is associated with a deletion comprising chromosome 2q37 in human, and is also implicated as a potential oncogene in stomach cancer. Several endogenous ligands for GPR35 have been identified including kynurenic acid, 2-oleoyl lysophosphatidic acid, and zaprinast. GPR35 couples to G(13) and G(i/o) proteins. 272
32780 320293 cd15165 7tmA_GPR55-like G protein-coupled receptor 55 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR55 shares closest homology with GPR35, and they belong to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. GPR55 has been reported to couple to G(13), G(12), or G(q) proteins. Activation of GPR55 leads to activation of phospholipase C, RhoA, ROCK, ERK, p38MAPK, and calcium release. Lysophosphatidylinositol (LPI) is currently considered as the endogenous ligand for GPR55, although the receptor was initially de-orphanized as a cannabinoid receptor and binds many cannabinoid ligands. 277
32781 320294 cd15166 7tmA_NAGly_R_GPR18 N-arachidonyl glycine receptor, GPR18, member of the class A family of seven-transmembrane G protein-coupled receptors. N-arachidonyl glycine (NAGly), an endogenous metabolite of the endocannabinoid anandamide, has been identified as an endogenous ligand of the G(i/o) protein-coupled receptor 18 (GPR18). NAGly is involved in directing microglial migration in the CNS through activation of GPR18. NAGly-GPR18 signaling is thought to play an important role in microglial-neuronal communication. Recent studies also show that GPR18 functions as the abnormal cannabidiol (Abn-CBD) receptor. Abn-CBD is a synthetic isomer of cannabidiol and is inactive at cannabinoid receptors (CB1 or CB2), but acts as a selective agonist at GPR18. The NAGly receptor is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others. 275
32782 320295 cd15167 7tmA_GPR171 orphan G protein-coupled receptor 171, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR171 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. A recent study reported that the peptide LENSSPQAPARRLLPP (BigLEN) activates GPR171 to regulate body weight in mice; however the biological role of the receptor remains unknown. GPR171 is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A common feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others. 282
32783 341329 cd15168 7tmA_P2Y1-like P2Y purinoceptors 1, 2, 4, 6, 11 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). This cluster only includes P2Y1-like receptors as well as other closely related orphan receptors, such as GPR91 (a succinate receptor) and GPR80/GPR99 (an alpha-ketoglutarate receptor). 284
32784 320297 cd15169 7tmA_FFAR1 free fatty acid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the mammalian free fatty acid receptor 1 (FFAR1), also called GPR40. FFAR1 is a cell-surface receptor for medium- and long-chain free fatty acids (FFAs). The receptor is most potently activated by eicosatrienoic acid (C20:3), but can also be activated at micromolar concentrations of various fatty acids. FFAR1 directly mediates FFA stimulation of glucose-stimulated insulin secretion and indirectly increases insulin secretion by enhancing the release of incretin. Free fatty acid receptors (FFARs) belong to the class A G-protein coupled receptors and are comprised of three members, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity. 284
32785 320298 cd15170 7tmA_FFAR2_FFAR3 free fatty acid receptors 2, 3, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes free fatty acid receptor 2 (FFAR2), FFAR3, and similar proteins. They are members of the class A G-protein coupled receptors that bind free fatty acids. The FFAR subfamily is composed of three receptors, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFAR2 and FFAR3 are cell-surface receptors for short chain FFAs (SCFAs) with different ligand affinities, whereas FFAR1 is a receptor for medium- and long-chain FFAs. FFAR2 activation by SCFA suppresses adipose insulin signaling, which leads to inhibition of fat accumulation in adipose tissue. FFAR3 is expressed in intestinal L cells, which produce glucagon-like peptide 1 (GLP-1) and peptide YY (PYY), thus suggesting that this receptor may be involved in energy homeostasis. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity. 278
32786 320299 cd15171 7tmA_CCRL2 CC chemokine receptor-like 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokine (CC-motif) receptor-like 2 (CCRL2) is a member of the atypical chemokine receptor family. CCRL2, like other atypical receptors, has an alteration in the conserved DRYLAIV motif in the third intracellular loop, which is essential for GPCR coupling and signaling. CCRL2 is expressed in most hematopoietic cells and many lymphoid organs as well as in heart and lung. CCRL2 was initially reported to promote chemotaxis and calcium fluxes in response to chemokines (CCL2, CCL5, CCL7, and CCL8); however, these results are still controversial. More recently, chemerin, a chemotactic agonist of CMKLR1 (chemokine-like receptor-1) and GPR1, was identified as a novel non-signaling ligand for both human and mouse CCRL2. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). 279
32787 341330 cd15172 7tmA_CCR6 CC chemokine receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR6 is the only known receptor identified for the chemokine CCL20 (also known as macrophage inflammatory protein-3alpha, MIP-3alpha). CCR6 is expressed by all mature human B cells, effector memory T-cells, and dendritic cells found in the gut mucosal immune system. CCL20 contributes to recruitment of CCR6-expressing cells to Peyer's patches and isolated lymphoid follicles in the intestine, thereby promoting the assembly and maintenance of organized lymphoid structures. Also, CCL20 expression is highly inducible in response to inflammatory signals. Thus, CCL20 is involved in both inflammatory and homeostatic functions in the immune system. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors are all activating the G protein Gi. 281
32788 320301 cd15173 7tmA_CXCR6 CXC chemokine receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR6 binds specifically to the chemokine CXCL16, which is expressed on dendritic cells, monocyte/macrophages, activated T cells, fibroblastic reticular cells, and cancer cells. CXCR6 is phylogenetically more closely related to CC-type chemokine receptors (CCR6 and CCR9) than other CXC receptors. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 270
32789 320302 cd15174 7tmA_CCR9 CC chemokine receptor type 9, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR9 is a homeostatic receptor specific for CCL25 (formerly known as thymus expressed chemokine) and is highly expressed on both immature and mature thymocytes as well as on intestinal homing T Lymphocytes and mucosal Lymphocytes. In cutaneous melanoma, activation of CCR9-CCL25 has been shown to stimulate metastasis to the small intestine. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors are all activating the G protein Gi. 280
32790 341331 cd15175 7tmA_CCR7 CC chemokine receptor type 7, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR7 is a major homeostatic receptor responsible for lymph node development and effective adaptive immune responses and plays a critical role in trafficking of dendritic cells and B and T lymphocytes. Its only two ligands, CCL19 and CCL21, are primarily produced by stromal cells in the T cell zones of lymph nodes and spleen. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors are all activating the G protein Gi. 278
32791 320304 cd15176 7tmA_ACKR4_CCR11 atypical chemokine receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR4 was first reported to bind several CC chemokines including CCL19, CCL21, and CCL25 and was originally designated CCR11. ACKR4 is unable to couple to G-protein and, instead, it preferentially mediates beta-arrestin dependent processes, such as receptor internalization, after ligand binding. Thus, ACKR4 may act as a scavenger receptor to suppress the effects of proinflammatory chemokines. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors. 276
32792 341332 cd15177 7tmA_CCR10 CC chemokine receptor type 10, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR10 is a homeostatic receptor specific for two C-C motif chemokines, CCL27 and CCL28. Activation of CCR10 by its two ligands mediates diverse activities, ranging from leukocyte trafficking to skin cancer. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors are all activating the G protein Gi. 280
32793 341333 cd15178 7tmA_CXCR1_2 CXC chemokine receptor types 1 and 2, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR1 and CXCR2 are closely related chemotactic receptors for a group of CXC chemokines distinguished by the presence of the amino acid motif ELR immediately adjacent to their CXC motif. Expression of CXCR1 and CXCR2 is strictly controlled in neutrophils by external stimuli such as lipopolysaccharide (LPS), tumor necrosis factor (TNF)-alpha, Toll-like receptor agonists, and nitric oxide. CXCL8 (formerly known as interleukin-8) binds with high-affinity and activates both receptors. CXCR1 also binds CXCL7 (neutrophil-activating protein-2), whereas CXCR2 non-selectively binds to all seven ELR-positive chemokines (CXCL1-7). Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 279
32794 341334 cd15179 7tmA_CXCR4 CXC chemokine receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR4 is the only known G protein-coupled chemokine receptor for the key homeostatic ligand CXCL12, which is constitutively secreted by bone marrow stromal cells. Atypical chemokine receptor CXCR7 (ACKR3) also binds CXCL12, but activates signaling in a G protein-independent manner. CXCR4 is also a co-receptor for HIV infection and plays critical roles in the development of immune system during both lymphopoiesis and myelopoiesis. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 278
32795 341335 cd15180 7tmA_CXCR3 CXC chemokine receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR3 is an inflammatory chemotactic receptor for a group of CXC chemokines distinguished by the presence of the amino acid motif ELR immediately adjacent to their CXC motif. CXCR3 specifically binds three chemokines CXCL9 (monokine induced by gamma-interferon), CXCL10 (interferon induced protein of 10 kDa), and CXCL11 (interferon inducible T-cell alpha-chemoattractant, I-TAC). CXCR3 is expressed on CD4+ Th1 and CD8+ cytotoxic T lymphocytes as well as highly on innate lymphocytes, such as NK cells and NK T cells, where it may mediate the recruitment of these cells to the sites of infection and inflammation. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 280
32796 341336 cd15181 7tmA_CXCR5 CXC chemokine receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR5 is a B-cell selective receptor that binds specifically to the homeostatic chemokine CXCL13 and regulates adaptive immunity. The receptor is found on all peripheral blood and tonsillar B cells and is involved in lymphocyte migration (homing) to specific tissues and development of normal lymphoid tissue. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 281
32797 341337 cd15182 7tmA_XCR1 XC chemokine receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. XCR1 is a chemokine receptor specific for XCL1 and XCL2 (previously called lymphotactin alpha/beta), which differ in only two amino acids. XCL1/2 is the only member of the C chemokine subfamily, which is unique as containing only two of the four cysteines that are found in other chemokine families. Human XCL1/2 has been shown to be secreted by activated CD8+ T cells and upon activation of the innate immune system. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. 271
32798 320311 cd15183 7tmA_CCR1 CC chemokine receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR1 is widely expressed on both hematopoietic and non-hematopoietic cells and binds to the inflammatory CC chemokines CCL3, CCL5, CCL6, CCL9, CCL15, and CCL23. CCR1 activates the typical chemokine signaling pathway through the G(i/o) type of G proteins, causing inhibition of adenylate cyclase and stimulation of phospholipase C, PKC, calcium flux, and PLA2. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 278
32799 341338 cd15184 7tmA_CCR5_CCR2 CC chemokine receptor types 5 and 2, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR2 and CCR5 share very high amino acid sequence identity. Both receptors play important roles in the trafficking of monocytes/macrophages and are implicated in the pathogenesis of immunologic diseases (rheumatoid arthritis, celiac disease, and transplant rejection) and cardiovascular diseases (atherosclerosis and autoimmune hepatitis). CCR2 is a receptor specific for members of the monocyte chemotactic protein family, including CCL2, CCL7, and CCL13. Conversely, CCR5 is a major co-receptor for HIV infection and binds many CC chemokine ligands, including CCL2, CCL3, CCL4, CCL5, CCL11, CCL13, CCL14, and CCL16. CCR2 is expressed primarily on blood monocytes and memory T cells, whereas CCR5 is expressed on antigen-presenting cells (macrophages and dendritic cells) and activated T effector cells. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 278
32800 341339 cd15185 7tmA_CCR3 CC chemokine receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR3 is a highly promiscuous receptor that binds a variety of inflammatory CC-type chemokines, including CCL11 (eotaxin-1), CCL3L1, CCL5 (regulated on activation, normal T cell expressed and secreted; RANTES), CCL7 (monocyte-specific chemokine 3 or MCP-3), CCL8 (MCP-2), CCL11, CCL13 (MCP-4), CCL15, CCL24 (eotaxin-2), CCL26 (eotaxin-3), and CCL28. Among these, the eosinophil chemotactic chemokines (CCL11, CCL24, and CCL26) are the most potent and specific ligands. In addition to eosinophil, CCR3 is expressed on cells involved in allergic responses, such as basophils, Th2 lymphocytes, and mast cells. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 278
32801 320314 cd15186 7tmA_CX3CR1 CX3C chemokine receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. CX3CR1 is an inflammatory receptor specific for CX3CL1 (also known as fractalkine in human), which is involved in the adhesion and migration of leukocytes. The CX3C chemokine subfamily is only represented by CX3CL1, which exists in both soluble and membrane-anchored forms. Membrane-anchored form promotes strong adhesion of receptor-bearing leukocytes to CX3CL1-expressing endothelial cells. On the other hand, soluble CX3CL1, which is released by the proteolytic cleavage of membrane-anchored CX3CL1, is a potent chemoattractant for CX3CR1-expressing T cells and monocytes. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. 273
32802 320315 cd15187 7tmA_CCR8 CC chemokine receptor type 8, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR8, the receptor for the CC chemokines CCL1 and CCL16, is highly expressed on allergen-specific T-helper type 2 cells, and is implicated in the pathogenesis of human asthma. CCL1- and CCR8-expressing CD4+ effector T lymphocytes are shown to have a critical role in lung mucosal inflammatory responses. CCR8 is also a functional receptor for CCL16, a liver-expressed CC chemokine that is involved in attracting lymphocytes, dendritic cells, and monocytes. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. 276
32803 320316 cd15188 7tmA_ACKR2_D6 atypical chemokine receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR2 (also known as D6) binds non-selectively to all inflammatory CC-chemokines, but not to homeostatic CC-chemokines involved in controlling the migration of cells. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors. 278
32804 320317 cd15189 7tmA_Bradykinin_R bradykinin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways. 284
32805 341340 cd15190 7tmA_Apelin_R apelin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Apelin (APJ) receptor is a G protein-coupled receptor that binds the endogenous peptide ligands, apelin and Toddler/Elabela. Apelin is an adipocyte-derived hormone that is ubiquitously expressed throughout the human body and Toddler/Elabela is a short secretory peptide that is required for normal cardiac development in zebrafish. Activation of APJ receptor plays key roles in diverse physiological processes including vasoconstriction and vasodilation, cardiac muscle contractility, angiogenesis, and regulation of water balance and food intake. 304
32806 341341 cd15191 7tmA_AT2R type 2 angiotensin II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors. Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2R, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Moreover, AT1R promotes cell proliferation, whereas AT2R inhibits proliferation and stimulates cell differentiation. The AT2R is highly expressed during fetal development, however it is scarcely present in adult tissues and is induced in pathological conditions. Generally, the AT1R mediates many actions of Ang II, while the AT2R is involved in the regulation of blood pressure and renal function. 285
32807 320320 cd15192 7tmA_AT1R type 1 angiotensin II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors. Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2R, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Moreover, AT1R promotes cell proliferation, whereas AT2R inhibits proliferation and stimulates cell differentiation. The AT2R is highly expressed during fetal development, however it is scarcely present in adult tissues and is induced in pathological conditions. Generally, the AT1R mediates many actions of Ang II, while the AT2R is involved in the regulation of blood pressure and renal function. 285
32808 320321 cd15193 7tmA_GPR25 G protein-coupled receptor 25, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR25 is an orphan G-protein coupled receptor that shares strong sequence homology to GPR15 and the angiotensin II receptors. These closely related receptors form a group within the class A G-protein coupled receptors (GPCRs). GPR15 controls homing of T cells, especially FOXP3(+) regulatory T cells, to the large intestine mucosa and thereby mediates local immune homeostasis. Moreover, GPR15-deficient mice were shown to be prone to develop more severe large intestine inflammation. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors. 279
32809 320322 cd15194 7tmA_GPR15 G protein-coupled receptor 15, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR15, also called Brother of Bonzo (BOB), is an orphan G-protein coupled receptor that was originally identified as a co-receptor for human immunodeficiency virus. GPR15 is upregulated in patients with rheumatoid arthritis and shares high sequence homology with angiotensin II type AT1 and AT2 receptors; however, its endogenous ligand is unknown. GPR15 controls homing of T cells, especially FOXP3(+) regulatory T cells, to the large intestine mucosa and thereby mediates local immune homeostasis. Moreover, GPR15-deficient mice were shown to be prone to develop more severe large intestine inflammation. 281
32810 320323 cd15195 7tmA_GnRHR-like gonadotropin-releasing hormone and adipokinetic hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Gonadotropin-releasing hormone (GnRH) and adipokinetic hormone (AKH) receptors share strong sequence homology to each other, suggesting that they have a common evolutionary origin. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. Adipokinetic hormone (AKH) is a lipid-mobilizing hormone that is involved in control of insect metabolism. Generally, AKH behaves as a typical stress hormone by mobilizing lipids, carbohydrates and/or certain amino acids such as proline. Thus, it utilizes the body's energy reserves to fight the immediate stress problems and subdue processes that are less important. Although AKH is known to be responsible for regulating the energy metabolism during insect flight, it is also found in insects that have lost their functional wings and predominantly walk for their locomotion. Both GnRH and AKH receptors are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily. 293
32811 320324 cd15196 7tmA_Vasopressin_Oxytocin vasopressin and oxytocin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) and oxytocin are synthesized in the hypothalamus and are released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. 264
32812 320325 cd15197 7tmA_NPSR neuropeptide S receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide S (NPS) promotes arousal and anxiolytic-like effects by activating its cognate receptor NPSR. NPSR is widely expressed in the brain, and its activation induces an elevation of intracellular calcium and cAMP concentrations, presumably by coupling to G(s) and G(q) proteins. Mutations in NPSR have been associated with an increased susceptibility to asthma. NPSR was originally identified as an orphan receptor GPR154 and is also known as G protein receptor for asthma susceptibility (GPRA) or vasopressin receptor-related receptor 1 (VRR1). 294
32813 320326 cd15198 7tmA_GPR150 G protein-coupled receptor 150, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR150 is an orphan receptor closely related to the oxytocin and vasopressin receptors. Its endogenous ligand is not known. These receptors share a significant amino acid sequence similarity, suggesting that they have a common evolutionary origin. 299
32814 320327 cd15199 7tmA_GPR31 G protein-coupled receptor 31, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR31, also known as 12-(S)-HETE receptor, is a high-affinity receptor for 12-(S)-hydroxy-5,8,10,14-eicosatetraenoic acid. Phylogenetic analysis showed that GPR31 and oxoeicosanoid receptor 1 (OXER1, GPR170) are the most closely related receptors to the hydroxycarboxylic acid receptor family (HCARs). GPR31, like OXER1, activates the ERK1/2 (MAPK3/MAPK1) pathway of intracellular signaling, but unlike the OXER1, does not cause increase in the cytosolic calcium level. GPR31 is also shown to activate NFkB. 12-(S)-HETE is a 12-lipoxygenase metabolite of arachidonic acid produced by mammalian platelets and tumor cells. It promotes tumor cells adhesion to endothelial cells and sub-endothelial matrix, which is a critical step for metastasis. 278
32815 320328 cd15200 7tmA_OXER1 oxoeicosanoid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. OXER1, also called GPR170, is a receptor for eicosanoids and polyunsaturated fatty acids such as 5-oxo-6E,8Z,11Z,14Z-eicosatetraenoic acid (5-OXO-ETE), 5(S)-hydroperoxy-6E,8Z,11Z,14Z-eicosatetraenoic acid (5(S)-HPETE) and arachidonic acid. OXER1 is a member of the class A family of seven-transmembrane G-protein coupled receptors and appears to be coupled to the G(i/o) protein. The receptor is expressed in various tissues except brain. Phylogenetic analysis showed that GPR31 and OXER1 are the most closely related receptors to the hydroxycarboxylic acid receptor family (HCARs). OXER1, like GPR31, activates the ERK1/2 (MAPK3/MAPK1) pathway of intracellular signaling, but unlike GPR31, does cause increase in the cytosolic calcium level. 276
32816 320329 cd15201 7tmA_HCAR1-3 hydroxycarboxylic acid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Hydroxycarboxylic acid receptor (HCAR) subfamily, a member of the class A G-protein coupled receptors (GPCRs), contains three receptor subtypes: HCAR1, HCAR2, and HCAR3. The endogenous ligand of HCAR1 (also known as lactate receptor 1, GPR104, or GPR81) is L-lactic acid. The endogenous ligands of HCAR2 (also known as niacin receptor 1, GPR109A, or nicotinic acid receptor) and HCAR3 (also known as niacin receptor 2 or GPR109B) are 3-hydroxybutyric acid and 3-hydroxyoctanoic acid, respectively. Because nicotinic acid is capable of stimulating HCAR2 at higher concentrations only (in the range of sub-micromolar concentration), it is unlikely that nicotinic acid acts as a physiological ligand of HCAR2. All three receptors are expressed in adipocytes and mediate anti-lipolytic effects in fat cells through G(i) type G protein-dependent inhibition of adenylate cyclase. 281
32817 320330 cd15202 7tmA_TACR-like tachykinin receptors and related receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the neurokinin/tachykinin receptors and its closely related receptors such as orphan GPR83 and leucokinin-like peptide receptor. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty. 288
32818 320331 cd15203 7tmA_NPYR-like neuropeptide Y receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to Gi or Go proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. Also included in this subgroup is prolactin-releasing peptide (PrRP) receptor (previously known as GPR10), which is activated by its endogenous ligand PrRP, a neuropeptide possessing C-terminal Arg-Phe-amide motif. There are two active isoforms of PrRP in mammals: one consists of 20 amino acid residues (PrRP-20) and the other consists of 31 amino acid residues (PrRP-31). PrRP receptor shows significant sequence homology to the NPY receptors, and a micromolar level of NPY can bind and completely inhibit the PrRP-evoked intracellular calcium response in PrRP receptor-expressing cells, suggesting that the PrRP receptor shares a common ancestor with the NPY receptors. 293
32819 320332 cd15204 7tmA_prokineticin-R prokineticin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Prokineticins 1 (PROK1) and 2 (PROK2), also known as endocrine gland vascular endothelial factor and Bombina variegata 8, respectively, are multifunctional chemokine-like peptides that are highly conserved across species. Prokineticins can bind with similar affinities to two closely homologous 7-transmembrane G protein coupled receptors, PROKR1 and PROKR2, which are phylogenetically related to the tachykinin receptors. Prokineticins and their GPCRs are widely distributed in human tissues and are involved in numerous physiological roles, including gastrointestinal motility, generation of circadian rhythms, neuron migration and survival, pain sensation, angiogenesis, inflammation, and reproduction. Moreover, different point mutations in genes encoding PROK2 or its receptor (PROKR2) can lead to Kallmann syndrome, a disease characterized by delayed or absent puberty and impaired olfactory function. 288
32820 320333 cd15205 7tmA_QRFPR pyroglutamylated RFamide peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. 26RFa, also known as QRFP (Pyroglutamylated RFamide peptide), is a 26-amino acid residue peptide that belongs to a family of neuropeptides containing an Arg-Phe-NH2 (RFamide) motif at its C-terminus. 26RFa/QRFP exerts similar orexigenic activity including the regulation of feeding behavior in mammals. It is the ligand for G-protein coupled receptor 103 (GPR103), which is predominantly expressed in paraventricular (PVN) and ventromedial (VMH) nuclei of the hypothalamus. GPR103 shares significant protein sequence homology with orexin receptors (OX1R and OX2R), which have recently been shown to produce a neuroprotective effect in Alzheimer's disease by forming a functional heterodimer with GPR103. 298
32821 320334 cd15206 7tmA_CCK_R cholecystokinin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors. 269
32822 320335 cd15207 7tmA_NPFFR neuropeptide FF receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R. 291
32823 320336 cd15208 7tmA_OXR orexin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Orexins (OXs, also referred to as hypocretins) are neuropeptide hormones that regulate the sleep-wake cycle and potently influence homeostatic systems regulating appetite and feeding behavior or modulating emotional responses such as anxiety or panic. OXs are synthesized as prepro-orexin (PPO) in the hypothalamus and then proteolytically cleaved into two isoforms: orexin-A (OX-A) and orexin-B (OX-B). OX-A is a 33 amino-acid peptide with N-terminal pyroglutamyl residue and two intramolecular disulfide bonds, whereas OX-B is a 28 amino-acid linear peptide with no disulfide bonds. OX-A binds orexin receptor 1 (OX1R) with high-affinity, but also binds with somewhat low-affinity to OX2R, and signals primarily to Gq coupling, whereas OX-B shows a strong preference for the orexin receptor 2 (OX2R) and signals through Gq or Gi/o coupling. Thus, activation of OX1R or OX2R will activate phospholipase activity and the phosphatidylinositol and calcium signaling pathways. Additionally, OX2R activation can also lead to inhibition of adenylate cyclase. 303
32824 320337 cd15209 7tmA_Mel1 melatonin receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase. 279
32825 320338 cd15210 7tmA_GPR84-like G protein-coupled receptor 84 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR84, also known as the inflammation-related G-Protein coupled receptor EX33, is a receptor for medium-chain free fatty acid (FFA) with carbon chain lengths of C9 to C14. Among these medium-chain FFAs, capric acid (C10:0), undecanoic acid (C11:0), and lauric acid (C12:0) are the most potent endogenous agonists of GPR84, whereas short-chain and long-chain saturated and unsaturated FFAs do not activate this receptor. GPR84 contains a [G/N]RY-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors and important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. In the case of GPR84, activation of the receptor couples to a pertussis toxin sensitive G(i/o)-protein pathway. GPR84 knockout mice showed increased Th2 cytokine production including IL-4, IL-5, and IL-13 compared to wild-type mice. It has been also shown that activation of GPR84 augments lipopolysaccharide-stimulated IL-8 production in polymorphonuclear leukocytes and TNF-alpha production in macrophages, suggesting that GPR84 may function as a proinflammatory receptor. 254
32826 320339 cd15211 7tmA_GPR88-like G protein-coupled receptor 88, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR88, an orphan G protein-coupled receptor, is predominantly and almost exclusively expressed within medium spiny neurons (MSNs) of the brain's striatum in both human and rodents; thus it is also called Striatum-specific GPCR (STRG). The striatum is known to involve in motor coordination, reward-based decision making, and response learning. GPR88 is shown to co-localize with both dopamine D1 and D2 receptors and displays the highest sequence similarity to receptors for biogenic amines such as dopamine and serotonin. GPR88 knockout mice showed abnormal behaviors observed in schizophrenia, such as disrupted sensorimotor gating, increased stereotypic behavior and locomotor activity in response to treatment with dopaminergic compounds such as apomorphine and amphetamine, respectively, suggesting a role for GPR88 in dopaminergic signaling. Furthermore, the transcriptional profiling studies showed that GPR88 expression is altered in a number of psychiatric disorders such as depression, drug addiction, bipolar and schizophrenia, providing further evidence that GPR88 plays an important role in CNS signaling pathways related to psychiatric disorder. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 283
32827 320340 cd15212 7tmA_GPR135 G protein-coupled receptor 135, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR135, also known as the somatostatin- and angiotensin-like peptide receptor (SALPR), is found in various tissues including eye, brain, cervix, stomach, and testis. Pharmacological studies have shown that relaxin-3 (R3) is a high-affinity endogenous ligand for GPR135. R3 has recently been identified as a new member of the insulin/relaxin family of peptide hormones and is exclusively expressed in the brain neurons. In addition to GPR135, R3 also acts as an agonist for GPR142, a pseudogene in the rat, and can activate LGR7 (leucine repeat-containing G-protein receptor-7), which is the main receptor for relaxin-1 (R1) and relaxin-2 (R2). While R1 and R2 are hormones primarily associated with reproduction and pregnancy, R3 is involved in neuroendocrine and sensory processing. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 285
32828 320341 cd15213 7tmA_PSP24-like G protein-coupled receptor PSP24 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes two human orphan receptors, GPR45 and GPR63, and their closely related proteins found in vertebrates and invertebrates. GPR45 and GPR63 are also called PSP24-alpha (or PSP24-1) and PSP24-beta (or PSP24-2) in other vertebrates, respectively. These receptors exhibit the highest sequence homology to each other. PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Instead, sphingosine 1-phosphate and dioleoylphosphatidic acid have been shown to act as low affinity agonists for GPR63. PSP24 receptors are highly expressed in neuronal cells of cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting the important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32829 320342 cd15214 7tmA_GPR161 orphan G protein-coupled receptor 161, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR161, an orphan GPCR, is a negative regulator of Sonic hedgehog (Shh) signaling, which promotes the processing of zinc finger protein GLI3 into its transcriptional repressor form (GLI3R) during neural tube development. In the absence of Shh, this proteolytic processing is normally mediated by cAMP-dependent protein kinase A (PKA). GPR161 is recruited to primary cilia by a mechanism that depends on TULP3 (tubby-related protein 3) and the intraflagellar complex A (IFT-A). Moreover, Gpr161 knockout mice show phenotypes observed in Tulp3/IFT-A mutants, and cause increased Shh signaling in the neural tube. Taken together, GPR161 negatively regulates the PKA-dependent GLI3 processing in the absence of Shh signal by coupling to G(s) protein, which causes activation of adenylate cyclase, elevated cAMP levels, and activation of PKA. Conversely, in the presence of Shh, GPR161 is removed from the cilia by internalization into the endosomal recycling compartment, leading to downregulation of its activity and thereby allowing Shh signaling to proceed. In addition, GPR161 is over-expressed in triple-negative breast cancer (lacking estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (HER2) expression) and correlates with poor prognosis. Mutations of GPR161 have also been implicated as a novel cause for pituitary stalk interruption syndrome (PSIS), a rare congenital disease of the pituitary gland. GPR161 is a member of the class A family of GPCRs, which contains receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 261
32830 320343 cd15215 7tmA_GPR101 orphan G protein-coupled receptor 101, member of the class A family of seven-transmembrane G protein-coupled receptors. Gpr101, an orphan GPCR, is predominantly expressed in the brain within discrete nuclei and is predicted to couple to the stimulatory G(s) protein, a potent activator of adenylate cyclase. GPR101 has been implicated in mediating the actions of GnRH-(1-5), a pentapeptide formed by metallopeptidase cleavage of the decapeptide gonadotropin-releasing hormone (GnRH), which plays a critical role in the regulation of the hypothalamic-pituitary-gonadal axis. GnRH-(1-5) acts on GPR101 to stimulate epidermal growth factor (EGF) release and EGF-receptor (EGFR) phosphorylation, leading to enhanced cell migration and invasion in the Ishikawa endometrial cancer cell line. Furthermore, these effects of GnRH-(1-5) are also dependent on enzymatic activation of matrix metallopeptidase-9 (MMP-9). GPR101 is a member of the class A family of GPCRs, which includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 261
32831 320344 cd15216 7tmA_SREB1_GPR27 super conserved receptor expressed in brain 1 (or GPR27), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are very highly conserved G protein-coupled receptors throughout vertebrate evolution, however no endogenous ligands have yet been identified. SREB2 is greatly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which are suggesting a potential link between SREB2 and schizophrenia. All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 332
32832 320345 cd15217 7tmA_SREB3_GPR173 super conserved receptor expressed in brain 3 (or GPR173), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are very highly conserved G protein-coupled receptors throughout vertebrate evolution, however no endogenous ligands have yet been identified. SREB2 is greatly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which are suggesting a potential link between SREB2 and schizophrenia. All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 329
32833 320346 cd15218 7tmA_SREB2_GPR85 super conserved receptor expressed in brain 2 (or GPR85), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are very highly conserved G protein-coupled receptors throughout vertebrate evolution, however no endogenous ligands have yet been identified. SREB2 is greatly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which are suggesting a potential link between SREB2 and schizophrenia. All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 330
32834 320347 cd15219 7tmA_GPR26_GPR78-like G protein-coupled receptors 26 and 78, member of the class A family of seven-transmembrane G protein-coupled receptors. Orphan G-protein coupled receptor 26 (GPR26) and GPR78 are constitutively active and coupled to increased cAMP formation. They are closely related based on sequence homology and comprise a conserved subgroup within the class A G-protein coupled receptor (GPCR) superfamily. Both receptors are widely expressed in selected tissues of the brain but their endogenous ligands are unknown. GPR26 knockout mice showed increased levels of anxiety- and depression-like behaviors, whereas GPR78 has been implicated in susceptibility to bipolar affective disorder and schizophrenia. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 264
32835 410633 cd15220 7tmA_GPR61_GPR62-like G protein-coupled receptors 61 and 62, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the orphan receptors GPR61 and GPR62, which are both constitutively active and predominantly expressed in the brain. While GPR61 couples to the G(s) subtype of G proteins, the signaling pathway and function of GPR62 are unknown. GPR61-deficient mice displayed significant hyperphagia and heavier body weight compared to wild-type mice, suggesting that GPR61 is involved in the regulation of food intake and body weight. GPR61 transcript expression was found in the caudate, putamen, and thalamus of human brain, whereas GPR62 transcript expression was found in the basal forebrain, frontal cortex, caudate, putamen, thalamus, and hippocampus. Both receptors share the highest sequence homology with each other and comprise a conserved subgroup within the class A family of GPCRs, which includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. Members of this subgroup contain the [A/E]RY motif, a variant of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of the class A GPCRs and important for efficient G protein-coupled signal transduction. 264
32836 320349 cd15221 7tmA_OR52B-like olfactory receptor subfamily 52B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor (OR) subfamilies 52B, 52D, 52H and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
32837 320350 cd15222 7tmA_OR51-like olfactory receptor family 51 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 51 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
32838 320351 cd15223 7tmA_OR56-like olfactory receptor family 56 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 56 and related proteins in other mammals, sauropsids, and fishes. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
32839 320352 cd15224 7tmA_OR6B-like olfactory receptor subfamily 6B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6B, 6A, 6Y, 6P, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
32840 320353 cd15225 7tmA_OR10A-like olfactory receptor subfamily 10A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10A, 10C, 10H, 10J, 10V, 10R, 10W, among others, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32841 320354 cd15226 7tmA_OR4-like olfactory receptor family 4 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 4 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 267
32842 320355 cd15227 7tmA_OR14-like olfactory receptor family 14 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 14 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
32843 320356 cd15228 7tmA_OR10D-like olfactory receptor subfamily 10D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 10D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
32844 320357 cd15229 7tmA_OR8S1-like olfactory receptor subfamily 8S1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 8S1 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32845 320358 cd15230 7tmA_OR5-like olfactory receptor family 5 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 5, some subfamilies from families 8 and 9, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
32846 320359 cd15231 7tmA_OR5V1-like olfactory receptor subfamily 5V1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5V1 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32847 320360 cd15232 7tmA_OR13-like olfactory receptor family 13 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 13 (subfamilies 13A1 and 13G1) and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
32848 320361 cd15233 7tmA_OR3A-like olfactory receptor subfamily 3A3 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 3A3 and 3A4, and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32849 320362 cd15234 7tmA_OR7-like olfactory receptor family 7 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 7 and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32850 320363 cd15235 7tmA_OR1A-like olfactory receptor subfamily 1A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 1A, 1B, 1K, 1L, 1Q and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 278
32851 320364 cd15236 7tmA_OR1E-like olfactory receptor subfamily 1E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 1E, 1J, and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
32852 320365 cd15237 7tmA_OR2-like olfactory receptor family 2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 2 and 13, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
32853 320366 cd15238 7tm_ARII-like Acetabularia rhodopsin II and similar proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes the eukaryotic light-driven proton-pumping Acetabularia rhodopsin II from the giant unicellular marine alga Acetabularia acetabulum, as well as its closely related proteins. They belong to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 219
32854 320367 cd15239 7tm_YRO2_fungal-like fungal YRO2 and related proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes the yeast YRO2 protein and its closely related proteins. Although the exact function of these proteins is unknown, they show strong sequence homology to the family of microbial rhodopsins, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 227
32855 320368 cd15240 7tm_ASR-like Anabaena sensory rhodopsin and similar proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes eubacterial sensory rhodopsin from the freshwater cyanobacterium Anabaena and its closely related proteins. Unlike other sensory rhodopsins (SRI and SRII), the Anabaena sensory rhodopsin (ASR) activates a soluble transducer protein (ASRT), which may lead to transcriptional control of several genes. Although ASRT was shown to interact with DNA in vitro, the exact mechanism of photosensory transduction is not clearly understood. Moreover, the regulation of CRP (cAMP receptor protein) expression by ASR has been reported demonstrating a direct interaction of the C-terminal region of ASR with DNA, suggesting that ASR itself may also work as a transcription factor. ASR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 221
32856 320369 cd15241 7tm_ChRs channelrhodopsins, member of the seven-transmembrane GPCR superfamily. Channelrhodopsins (ChRs) are light-gated ion channels acting as sensory photoreceptors in unicellular green algae, controlling phototaxis (directional movement toward or away from light). ChRs are large seven-transmembrane proteins with large C-terminal extensions, which have been implicated in localizing the channel to the algal eyespot, a single layer of pigmented granules, overlaying part of the plasma membrane but are not required for ion channel function. ChRs belong to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 219
32857 320370 cd15242 7tm_Proteorhodopsin green- and blue-light absorbing proteorhodopsins, member of the seven-transmembrane GPCR superfamily. This subgroup represents blue-light absorbing and green-light absorbing proteorhodopsins (PRs), which act as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. PRs are found in most marine bacteria in surface waters, as well as in archaea and eukaryotes. They belong to the microbial rhodopsin family, also known as type 1 rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 229
32858 320371 cd15243 7tm_Halorhodopsin light-driven inward chloride pump halorhodopsin, member of the seven-transmembrane GPCR superfamily. Halorhodopsin (HR) acts as a light-driven inward-directed chloride pump. When activated by yellow light, HR pumps chloride ions into the cell cytoplasm, generating a negative-inside membrane potential which drives proton uptake. The resulting electrochemical ion gradient provides an energy source to the cell and contributes to pH homeostasis. HR is found in phylogenetically ancient archaea, known as halobacteria, which live in highly saline environments. HR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising light-driven retinal-binding outward pump bacteriorhodopsin (BR), light-gated cation channel channelrhodopsin (ChR), light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 226
32859 320372 cd15244 7tm_bacteriorhodopsin light-driven outward proton pump bacteriorhodopsin, member of the seven-transmembrane GPCR superfamily. Bacteriorhodopsin (BR) serves as a light-driven retinal-binding outward proton pump, generating an outside positive membrane potential and thus creating an inwardly directed proton motive force (PMF) necessary for ATP synthesis. BR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising light-driven inward chloride pump halorhodopsin (HR), light-gated cation channel channelrhodopsin (ChR), light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 221
32860 320373 cd15245 7tmF_FZD2 class F frizzled subfamily 2, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 2 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 330
32861 320374 cd15246 7tmF_FZD7 class F frizzled subfamily 7, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 7 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 331
32862 320375 cd15247 7tmF_FZD1 class F mammalian frizzled subfamily 1, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 1 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 341
32863 320376 cd15248 7tmF_FZD1_insect class F insect frizzled subfamily 1, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 1 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors, found in insects such as Drosophila melanogaster. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 332
32864 320377 cd15249 7tmF_FZD5 class F frizzled subfamily 5, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 5 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 310
32865 320378 cd15250 7tmF_FZD8 class F frizzled subfamily 8, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 8 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 314
32866 320379 cd15251 7tmB2_BAI_Adhesion_VII brain-specific angiogenesis inhibitors, group VII adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. The three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions. 253
32867 320380 cd15252 7tmB2_Latrophilin_Adhesion_I Latrophilins and similar receptors, group I adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. Group I adhesion GPCRs consist of latrophilins (also called lectomedins or latrotoxin receptors) and ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven transmembrane receptor containing EGF-like repeats, is highly expressed in heart, where it is developmentally regulated, as well as in normal smooth cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 257
32868 320381 cd15253 7tmB2_GPR113 orphan adhesion receptor GPR113, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR113 is an orphan receptor that belongs to group VI adhesion-GPCRs along with GPR110, GPR111, GPR115, and GPR116. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR113 contains a hormone binding domain and one EGF (epidermal growth factor) domain, and is primarily expressed in a subset of taste receptor cells. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 271
32869 320382 cd15254 7tmB2_GPR116_Ig-Hepta The immunoglobulin-repeat-containing receptor Ig-hepta/GPR116, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR116 (also known as Ig-hepta) is an orphan receptor that belongs to group VI adhesion-GPCRs along with GPR110, GPR111, GPR113, and GPR115. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR116 has two C2-set immunoglobulin-like repeats, which are found in the members of the immunoglobulin superfamily of cell surface proteins, and a SEA (sea urchin sperm protein, enterokinase, and agrin)-box, which is present in the extracellular domain of the transmembrane mucin (MUC) family and known to enhance O-glycosylation. GPR116 is highly expressed in fetal and adult lung, and it has been shown to regulate lung surfactant levels as well as to stimulate breast cancer metastasis through a G(q)-p63-RhoGEF-Rho GTPase signaling pathway. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 275
32870 320383 cd15255 7tmB2_GPR144 orphan adhesion receptor GPR144, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR144 is an orphan receptor that belongs to the group V adhesion-GPCRs together with GPR133. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the Gs protein, leading to activation of the adenylyl cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 263
32871 320384 cd15256 7tmB2_GPR133 orphan adhesion receptor GPR133, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR133 is an orphan receptor that belongs to the group V adhesion-GPCRs together with GPR144. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the Gs protein, leading to activation of the adenylyl cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 260
32872 320385 cd15257 7tmB2_GPR128 orphan adhesion receptor GPR128, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR128 is an orphan receptor of the adhesion family (subclass B2) that belongs to the class B GPCRs. Expression of GPR128 was detected in the mouse intestinal mucosa and is thought to be involved in energy balance, as its knockout mice showed a decrease in body weight gain and an increase in intestinal contraction frequency compared to wild-type controls. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and five or four thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 303
32873 320386 cd15258 7tmB2_GPR126-like_Adhesion_VIII orphan GPR126 and related proteins, group VIII adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group VIII adhesion GPCRs include orphan GPCRs such as GPR56, GPR64, GPR97, GPR112, GPR114, and GPR126. GPR56 is involved in the regulation of oligodendrocyte development and myelination in the central nervous system via coupling to G(12/13) proteins, which leads to the activation of RhoA GTPase. GPR126, on the other hand, is required for Schwann cell, but not oligodendrocyte, myelination in the peripheral nervous system. GPR64 is mainly expressed in the epididymis of the male reproductive tract, and targeted deletion of GPR64 causes sperm stasis and efferent duct blockage due to abnormal fluid reabsorption, resulting in male infertility. GPR64 is also over-expressed in Ewing's sarcoma (ES), as well as upregulated in other carcinomas from kidney, prostate or lung, and promotes invasiveness and metastasis in ES via the upregulation of placental growth factor (PGF) and matrix metalloproteinase (MMP) 1. GPR97 is identified as a lymphatic adhesion receptor that is specifically expressed in lymphatic endothelium, but not in blood vascular endothelium, and is shown to regulate migration of lymphatic endothelial cells via the small GTPases RhoA and cdc42. GPR112 is specifically expressed in normal enterochromaffin cells and gastrointestinal neuroendocrine carcinoma cells, but its biological function is unknown. GPR114 is mainly found in granulocytes (polymorphonuclear leukocytes), and GPR114-transfected cells induced an increase in cAMP levels via coupling to G(s) protein. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. 
Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 267
32874 320387 cd15259 7tmB2_GPR124-like_Adhesion_III orphan GPR124 and related proteins, group III adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. Group III adhesion GPCRs include orphan GPR123, GPR124, GPR125, and their closely related proteins. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. GPR123 is predominantly expressed in the CNS including thalamus, brain stem and regions containing large pyramidal cells. GPR124, also known as tumor endothelial marker 5 (TEM5), is highly expressed in tumor vessels and in the vasculature of the developing embryo. GPR124 is essentially required for proper angiogenic sprouting into neural tissue, CNS-specific vascularization, and formation of the blood-brain barrier. GPR124 also interacts with the PDZ domain of DLG1 (discs large homolog 1) through its PDZ-binding motif. Recently, studies of double-knockout mice showed that GPR124 functions as a co-activator of Wnt7a/Wnt7b-dependent beta-catenin signaling in brain endothelium. Furthermore, WNT7-stimulated beta-catenin signaling is regulated by GPR124's intracellular PDZ binding motif and leucine-rich repeats (LRR) in its N-terminal extracellular domain. 
GPR125 directly interacts with dishevelled (Dvl) via its intracellular C-terminus, and together, GPR125 and Dvl recruit a subset of planar cell polarity (PCP) components into membrane subdomains, a prerequisite for activation of Wnt/PCP signaling. Thus, GPR125 influences the noncanonical WNT/PCP pathway, which does not involve beta-catenin, through interacting with and modulating the distribution of Dvl. 260
32875 320388 cd15260 7tmB1_NPR_B4_insect-like insect neuropeptide receptor subgroup B4 and related proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Nilaparvata lugens (brown planthopper) and its closely related proteins from mollusks and annelid worms. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes. 267
32876 320389 cd15261 7tmB1_PDFR The pigment dispersing factor receptor, member of the class B seven-transmembrane G protein-coupled receptors. The pigment dispersing factor receptor (PDFR) is a G protein-coupled receptor that binds the circadian clock neuropeptide PDF, a functional ortholog of the mammalian vasoactive intestinal peptide (VIP), on the pacemaker neurons. The PDFR is implicated in regulating flight circuit development and in modulating acute flight in Drosophila melanogaster. The PDFR activation stimulates adenylate cyclase, thereby increasing cAMP levels in many different pacemakers, and the receptor signaling has been shown to regulate behavioral circadian rhythms and geotaxis in Drosophila. The PDFR belongs to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. They play key roles in hormone homeostasis in mammals and are promising drug targets in various human diseases including diabetes, osteoporosis, obesity, neurodegenerative conditions (Alzheimer's and Parkinson's), cardiovascular disease, migraine, and psychiatric disorders (anxiety, depression). 282
32877 320390 cd15262 7tmB1_NPR_B3_insect-like insect neuropeptide receptor subgroup B3 and related proteins belong to subfamily B1 of hormone receptors; member of the class B secretin-like seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Bombyx mori (silk worm) and its closely related proteins from arthropods. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes. 270
32878 320391 cd15263 7tmB1_DH_R insect diuretic hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes G protein-coupled receptors that specifically bind to insect diuretic hormones found in Manduca sexta (moth) and Acheta domesticus (the house cricket), among others. Insect diuretic hormones and their GPCRs play critical roles in the regulation of water and ion balance. Thus they are attractive targets for developing new insecticides. Activation of the diuretic hormone receptors stimulates adenylate cyclase, thereby increasing cAMP levels in Malpighian tubules. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. 272
32879 320392 cd15264 7tmB1_CRF-R corticotropin-releasing factor receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. This evidence suggests that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 265
32880 320393 cd15265 7tmB1_PTHR parathyroid hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone (PTH) receptor family has three subtypes: PTH1R, PTH2R and PTH3R. PTH1R is expressed in bone and kidney and is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to a G(s)-protein that in turn activates adenylate cyclase thereby producing cAMP, but it can also couple to several G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple intracellular signaling pathways. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39), but not by PTHrP. PTH also strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs, suggesting that TIP-39 is a natural ligand for PTH2R. On the other hand, PTH3R binds and responds to both PTH and PTHrP, but not the TIP-39. Moreover, the PTH3R is more closely related to the PTH1R than PTH2R. PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. The PTH3R is found in chicken and fish, but it is absent in mammals. The PTH receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. 
On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. 289
32881 320394 cd15266 7tmB1_GLP2R glucagon-like peptide-2 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon-like peptide-2 receptor (GLP2R) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon receptor (GCGR) and GLP1R. GLP2R is activated by glucagon-like peptide 2, which is derived from the large proglucagon precursor. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. GLP2R belongs to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 280
32882 320395 cd15267 7tmB1_GCGR glucagon receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon receptor (GCGR) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon-like peptide-1 receptor (GLP1R) and GLP2R. GCGR is activated by glucagon, which is derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR belongs to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 281
32883 341342 cd15268 7tmB1_GLP1R glucagon-like peptide-1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon-like peptide-1 receptor (GLP1R) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon receptor and GLP2R. GLP1R is activated by glucagon-like peptide 1 (GLP1), which is derived from the large proglucagon precursor. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 279
32884 320397 cd15269 7tmB1_VIP-R1 vasoactive intestinal polypeptide (VIP) receptor 1, member of the class B family of seven-transmembrane G protein-coupled receptors. Vasoactive intestinal peptide (VIP) receptor 1 is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. VIP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. However, depending on its cellular location, VIP-R1 is also capable of coupling to additional G proteins such as G(q) protein, thus leading to the activation of phospholipase C and intracellular calcium influx. 268
32885 320398 cd15270 7tmB1_GHRHR growth-hormone-releasing hormone receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Growth hormone-releasing hormone receptor (GHRHR) is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, pituitary adenylate cyclase activating polypeptide (PACAP), and vasoactive intestinal peptide. These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action. GHRHR is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. GHRHR is found in mammals as well as zebrafish and chicken, whereas the GHRHR type 2, an ortholog of the GHRHR, has only been identified in ray-finned fish, chicken and Xenopus. Xenopus laevis GHRHR2 has been shown to interact with both endogenous GHRH and PACAP-related peptide (PRP). 268
32886 320399 cd15271 7tmB1_GHRHR2 growth-hormone-releasing hormone receptor type 2, member of the class B family of seven-transmembrane G protein-coupled receptors. Growth hormone-releasing hormone receptor type 2 (GHRHR2) is found in non-mammalian vertebrates such as chicken and frog. It is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, pituitary adenylate cyclase activating polypeptide (PACAP), vasoactive intestinal peptide, and mammalian growth hormone-releasing hormone. These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Mammalian GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action. Mammalian GHRHR is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. GHRHR is found in mammals as well as zebrafish and chicken, whereas the GHRHR type 2, an ortholog of the GHRHR, has only been identified in ray-finned fish, chicken and Xenopus. Xenopus laevis GHRHR2 has been shown to interact with both endogenous GHRH and PACAP-related peptide (PRP). 267
32887 320400 cd15272 7tmB1_PTH-R_related invertebrate parathyroid hormone-related receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes parathyroid hormone (PTH)-related receptors found in invertebrates such as mollusks and annelid worms. The PTH family receptors are members of the B1 subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. The parathyroid hormone type 1 receptor (PTH1R) is found in all vertebrate species and is activated by two polypeptide ligands: parathyroid hormone (PTH), an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to G(s) protein, which in turn activates adenylyl cyclase, thereby producing cAMP, but it can also couple to several other G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple signaling pathways. 285
32888 320401 cd15273 7tmB1_NPR_B7_insect-like insect neuropeptide receptor subgroup B7 and related proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Nilaparvata lugens (brown planthopper) and its closely related proteins from invertebrates. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes. 285
32889 341343 cd15274 7tmB1_calcitonin_R calcitonin receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes G protein-coupled receptors for calcitonin (CT) and calcitonin gene-related peptides (CGRPs). Calcitonin, a 32-amino acid peptide hormone, is involved in calcium metabolism in many mammalian species; it acts to reduce blood calcium levels and directly inhibits bone resorption by acting on osteoclasts. Thus, CT acts as an antagonist to parathyroid hormone and is commonly used in the treatment of bone disorders. The CT receptor is predominantly found in osteoclasts, kidney, and brain, and is primarily coupled to stimulatory G(s) protein, which leads to activation of adenylate cyclase, thereby increasing cAMP production. CGRP, a member of the calcitonin family of peptides, is a potent vasodilator and may contribute to migraine. It is expressed in the peripheral and central nervous system and exists in two forms in humans (alpha-CGRP and beta-CGRP). CGRP mediates its physiological effects through calcitonin receptor-like receptor (CRLR) and receptor activity-modifying protein 1 (RAMP1), a single transmembrane domain protein. Thus, the CRLR/RAMP1 complex serves as a functional CGRP receptor. On the other hand, the CRLR/RAMP2 and CRLR/RAMP3 complexes function as adrenomedullin-specific receptors. The CT and CGRP receptors belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. 274
32890 320403 cd15275 7tmB1_secretin secretin receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Secretin receptor is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include vasoactive intestinal peptide (VIP), growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors, and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Secretin, a polypeptide secreted by entero-endocrine S cells in the small intestine, is involved in maintaining body fluid balance. This polypeptide regulates the secretion of bile and bicarbonate into the duodenum from the pancreatic and biliary ducts, and also regulates the duodenal pH by controlling gastric acid secretion. Studies with secretin receptor-null mice indicate that secretin plays a role in regulating renal water reabsorption. Secretin mediates its biological actions by elevating intracellular cAMP via the G protein-coupled secretin receptor, which is expressed in the brain, pancreas, stomach, kidney, and liver. 271
32891 320404 cd15277 7tmC_RAIG3_GPRC5C retinoic acid-inducible orphan G-protein-coupled receptor 3; class C family of seven-transmembrane G protein-coupled receptors, group 5, member C. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. The specific function of RAIG3 is unknown; however, this protein may play a role in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interaction with a G-protein signaling cascade. 250
32892 320405 cd15278 7tmC_RAIG2_GPRC5B retinoic acid-inducible orphan G-protein-coupled receptor 2; class C family of seven-transmembrane G protein-coupled receptors, group 5, member B. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG2 (GPRC5B), a mammalian Boss (Bride of sevenless) homolog, has been shown to activate obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice have been shown to be resistant to high-fat diet-induced obesity and insulin resistance. 244
32893 320406 cd15279 7tmC_RAIG1_4_GPRC5A_D retinoic acid-inducible orphan G-protein-coupled receptors 1 and 4; class C family of seven-transmembrane G protein-coupled receptors, group 5, member A and D. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG1 is evolutionarily conserved from mammals to fish. RAIG1 has been shown to act as a tumor suppressor in non-small cell lung carcinoma as well as oral squamous cell carcinoma, but it could also act as an oncogene in breast cancer, colorectal cancer, and pancreatic cancer. Studies have shown that overexpression of RAIG1 decreases intracellular cAMP levels. Moreover, knocking out RAIG1 induces the activation of the NF-kB and STAT3 signaling pathways leading to cell proliferation and resistance to apoptosis. The specific function of RAIG4 is unknown; however, this protein may play a role in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interaction with a G-protein signaling cascade. 248
32894 320407 cd15280 7tmC_V2R-like vomeronasal type-2 receptor-like proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. This group represents vomeronasal type-2 receptor-like proteins that are closely related to the V2R family of vomeronasal GPCRs. Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between the same species. V2Rs and G-alpha(o) protein are coexpressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of phospholipase pathway, generating the secondary messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain the long N-terminal extracellular domain, which is believed to bind pheromones. Human V2R1-like protein, also known as putative calcium-sensing receptor-like 1 (CASRL1), is not included here because it is a nonfunctional pseudogene. 253
32895 320408 cd15281 7tmC_GPRC6A class C of seven-transmembrane G protein-coupled receptors, subtype 6A. GPRC6A (GPCR, class C, group 6, subtype A) is a widely expressed amino acid-sensing GPCR that is most closely related to CaSR. GPRC6A is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine and less potently by small aliphatic amino acids. Moreover, the receptor can be either activated or modulated by divalent cations such as Ca2+ and Mg2+. GPRC6A is expressed in the testis, but not the ovary, and also specifically binds to the osteoblast-derived hormone osteocalcin (OCN), which regulates testosterone production by the testis and male fertility independently of the hypothalamic-pituitary axis. Furthermore, GPRC6A knockout studies suggest that GPRC6A is involved in regulation of bone metabolism, male reproduction, energy homeostasis, glucose metabolism, and in activation of inflammation response, as well as prostate cancer growth and progression, among others. GPRC6A has been suggested to couple to the Gq subtype of G proteins, leading to IP3 production and intracellular calcium mobilization. GPRC6A contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD), and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B, GPRC6A, mGlu, and TAS1R receptors. 249
32896 320409 cd15282 7tmC_CaSR calcium-sensing receptor, member of the class C of seven-transmembrane G protein-coupled receptors. CaSR is a widely expressed GPCR that is involved in sensing small changes in extracellular levels of calcium ion to maintain a constant level of the extracellular calcium via modulating the synthesis and secretion of calcium regulating hormones, such as parathyroid hormone (PTH), in order to regulate Ca(2+) transport into or out of the extracellular fluid via kidney, intestine, and/or bone. For instance, when Ca2+ is high, CaSR downregulates PTH synthesis and secretion, leading to an increase in renal Ca2+ excretion, a decrease in intestinal Ca2+ absorption, and a reduction in release of skeletal Ca2+. CaSR is coupled to both G(q/11)-dependent activation of phospholipase and, subsequently, intracellular calcium mobilization and protein kinase C activation as well as G(i/o)-dependent inhibition of adenylate cyclase leading to inhibition of cAMP formation. CaSR is closely related to GPRC6A (GPCR, class C, group 6, subtype A), which is an amino acid-sensing GPCR that is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine. These receptors contain a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD), and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 252
32897 320410 cd15283 7tmC_V2R_pheromone vomeronasal type-2 pheromone receptors, member of the class C family of seven-transmembrane G protein-coupled receptors. This group represents vomeronasal type-2 pheromone receptors (V2Rs). Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between the same species. V2Rs and G-alpha(o) protein are coexpressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are coexpressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of phospholipase pathway, producing the second messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain the long N-terminal extracellular domain, which is believed to bind pheromones. 252
32898 320411 cd15284 7tmC_mGluR_group2 metabotropic glutamate receptors in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR 2 and 3. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 254
32899 320412 cd15285 7tmC_mGluR_group1 metabotropic glutamate receptors in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 250
32900 320413 cd15286 7tmC_mGluR_group3 metabotropic glutamate receptors in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 271
32901 320414 cd15287 7tmC_TAS1R2a-like type 1 taste receptor subtype 2a and similar proteins, member of the class C of seven-transmembrane G protein-coupled receptors. This group includes TAS1R2a and its similar proteins found in fish. They are members of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 252
32902 320415 cd15288 7tmC_TAS1R2 type 1 taste receptor subtype 2, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R2, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 254
32903 320416 cd15289 7tmC_TAS1R1 type 1 taste receptor subtype 1, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R1, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 253
32904 320417 cd15290 7tmC_TAS1R3 type 1 taste receptor subtype 3, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R3, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors. 253
32905 320418 cd15291 7tmC_GABA-B-R1 gamma-aminobutyric acid type B receptor subunit 1, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, GABA-B receptor produces slow and sustained inhibitory responses by decreased neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. The GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM via a 10-15 amino acid linker, suggesting that GABA-B receptor may utilize a different activation mechanism. 274
32906 320419 cd15292 7tmC_GPR156 orphan GPR156, member of the class C family of seven-transmembrane G protein-coupled receptors. This subgroup represents orphan GPR156 that is closely related to the type B receptor for gamma-aminobutyric acid (GABA-B), which is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, GABA-B receptor produces slow and sustained inhibitory responses by decreased neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. The GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM via a 10-15 amino acid linker, suggesting that GABA-B receptor may utilize a different activation mechanism. 268
32907 320420 cd15293 7tmC_GPR158-like orphan GPR158 and similar proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. This group includes orphan receptors GPR158, GPR158-like (also called GPR179) and similar proteins. These orphan receptors are closely related to the type B receptor for gamma-aminobutyric acid (GABA-B), which is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, GABA-B receptor produces slow and sustained inhibitory responses by decreased neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. The GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM via a 10-15 amino acid linker, suggesting that GABA-B receptor may utilize a different activation mechanism. 252
32908 320421 cd15294 7tmC_GABA-B-R2 gamma-aminobutyric acid type B receptor subunit 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, GABA-B receptor produces slow and sustained inhibitory responses by decreased neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. The GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM via a 10-15 amino acid linker, suggesting that GABA-B receptor may utilize a different activation mechanism. 270
32909 320422 cd15295 7tmA_Histamine_H4R histamine receptor subtype H4R, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtype H4R, a member of the histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to the G(i)-proteins, leading to the inhibition of cAMP formation. The H3R receptor functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 267
32910 320423 cd15296 7tmA_Histamine_H3R histamine receptor subtypes H3R and H3R-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtypes H3R and H3R-like, members of the histamine receptor family, which belong to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to the G(i)-proteins, leading to the inhibition of cAMP formation. The H3R receptor functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 271
32911 320424 cd15297 7tmA_mAChR_M2 muscarinic acetylcholine receptor subtype M2, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of M2 receptor causes a decrease in cAMP production, generally leading to inhibitory-type effects. This causes an outward current of potassium in the heart, resulting in a decreased heart rate. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32912 341344 cd15298 7tmA_mAChR_M4 muscarinic acetylcholine receptor subtype M4, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to G(i/o) types of G proteins. The M4 receptor is mainly found in the CNS and functions as an inhibitory autoreceptor regulating acetylcholine release. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32913 320426 cd15299 7tmA_mAChR_M3 muscarinic acetylcholine receptor subtype M3, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. The M3 receptor is mainly located in smooth muscle, exocrine glands and vascular endothelium. It induces vomiting in the central nervous system and is a critical regulator of glucose homeostasis by modulating insulin secretion. Generally, M3 receptor causes contraction of smooth muscle resulting in vasoconstriction and increased glandular secretion. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 274
32914 320427 cd15300 7tmA_mAChR_M5 muscarinic acetylcholine receptor subtype M5, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. M5 mAChR is primarily found in the central nervous system and mediates acetylcholine-induced dilation of cerebral blood vessels. Activation of M5 receptor triggers a variety of cellular responses, including inhibition of adenylate cyclase, breakdown of phosphoinositides, and modulation of potassium channels. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
32915 320428 cd15301 7tmA_mAChR_DM1-like muscarinic acetylcholine receptor DM1, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes muscarinic acetylcholine receptor DM1-like from invertebrates. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 270
32916 320429 cd15302 7tmA_mAChR_GAR-2-like muscarinic acetylcholine receptor GAR-2 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 266
32917 341345 cd15304 7tmA_5-HT2A serotonin receptor subtype 2A, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 267
32918 341346 cd15305 7tmA_5-HT2C serotonin receptor subtype 2C, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 275
32919 341347 cd15306 7tmA_5-HT2B serotonin receptor subtype 2B, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 277
32920 320433 cd15307 7tmA_5-HT2_insect-like serotonin receptor subtype 2 from insects, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 279
32921 320434 cd15308 7tmA_D4_dopamine_R D4 dopamine receptor of the D2-like family, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 258
32922 320435 cd15309 7tmA_D2_dopamine_R D2 subtype of the D2-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 254
32923 320436 cd15310 7tmA_D3_dopamine_R D3 subtype of the D2-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 259
32924 320437 cd15312 7tmA_TAAR2_3_4 trace amine-associated receptors 2, 3, 4, and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. TAAR2, TAAR3, and TAAR4 are among the 15 identified trace amine-associated receptor subtypes, which form a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 289
32925 320438 cd15314 7tmA_TAAR1 trace amine-associated receptor 1 and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptor 1 (TAAR1) is one of the 15 identified trace amine-associated receptor subtypes, which form a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. TAAR1 is coupled to the Gs protein, which leads to activation of adenylate cyclase, and is thought to play a functional role in the regulation of brain monoamines. TAAR1 is also shown to be activated by psychoactive compounds such as Ecstasy (MDMA), amphetamine and LSD. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32926 320439 cd15316 7tmA_TAAR6_8_9 trace amine-associated receptors 6, 8, and 9, member of the class A family of seven-transmembrane G protein-coupled receptors. Included in this group are mammalian TAAR6, TAAR8, TAAR9, and similar proteins. They are among the 15 identified amine-associated receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 290
32927 320440 cd15317 7tmA_TAAR5-like trace amine-associated receptor 5 and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Included in this group are mammalian TAAR5, TAAR6, TAAR8, TAAR9, and similar proteins. They are among the 15 identified trace amine-associated receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 290
32928 320441 cd15318 7tmA_TAAR5 trace amine-associated receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptor 5 is one of the 15 identified amine-activated G protein-coupled receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32929 320442 cd15319 7tmA_D1B_dopamine_R D1B (or D5) subtype dopamine receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 317
32930 320443 cd15320 7tmA_D1A_dopamine_R D1A (or D1) subtype dopamine receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5. The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. 319
32931 320444 cd15321 7tmA_alpha2B_AR alpha-2 adrenergic receptors subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. 268
32932 320445 cd15322 7tmA_alpha2A_AR alpha-2 adrenergic receptors subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. 259
32933 320446 cd15323 7tmA_alpha2C_AR alpha-2 adrenergic receptors subtype C, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. 261
32934 320447 cd15324 7tmA_alpha-2D_AR alpha-2 adrenergic receptors subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. 256
32935 320448 cd15325 7tmA_alpha1A_AR alpha-1 adrenergic receptors subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats. 261
32936 320449 cd15326 7tmA_alpha1B_AR alpha-1 adrenergic receptors subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats. 261
32937 320450 cd15327 7tmA_alpha1D_AR alpha-1 adrenergic receptors subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats. 261
32938 320451 cd15328 7tmA_5-HT5 serotonin receptor subtype 5, member of the class A family of seven-transmembrane G protein-coupled receptors. 5-HT5 receptor, one of 14 mammalian 5-HT receptors, is activated by the neurotransmitter and peripheral signal mediator serotonin (also known as 5-hydroxytryptamine or 5-HT). The 5-HT5A and 5-HT5B receptors have been cloned from rat and mouse, but only the 5-HT5A isoform has been identified in human because of the presence of premature stop codons in the human 5-HT5B gene, which prevents a functional receptor from being expressed. 5-HT5 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 259
32939 320452 cd15329 7tmA_5-HT7 serotonin receptor subtype 7, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT7 receptor, one of 14 mammalian serotonin receptors, is a member of the class A of GPCRs and is activated by the neurotransmitter serotonin (5-hydroxytryptamine, 5-HT). 5-HT7 receptor mainly couples to Gs protein, which positively stimulates adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. 5-HT7 receptor is expressed in various human tissues, mainly in the brain, the lower gastrointestinal tract and in vital blood vessels including the coronary artery. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 260
32940 320453 cd15330 7tmA_5-HT1A_vertebrates serotonin receptor subtype 1A from vertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 260
32941 320454 cd15331 7tmA_5-HT1A_invertebrates serotonin receptor subtype 1A from invertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 261
32942 320455 cd15333 7tmA_5-HT1B_1D serotonin receptor subtypes 1B and 1D, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 265
32943 320456 cd15334 7tmA_5-HT1F serotonin receptor subtype 1F, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 259
32944 320457 cd15335 7tmA_5-HT1E serotonin receptor subtype 1E, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression. 258
32945 320458 cd15336 7tmA_Melanopsin vertebrate melanopsins (Opsin-4), member of the class A family of seven-transmembrane G protein-coupled receptors. Melanopsin (also called Opsin-4) is the G protein-coupled photopigment that mediates non-visual responses to light. In mammals, these photoresponses include the photo-entrainment of circadian rhythm, pupillary constriction, and acute nocturnal melatonin suppression. Mammalian melanopsins are expressed only in the inner retina, whereas non-mammalian vertebrate melanopsins are localized in various extra-retinal tissues such as iris, brain, pineal gland, and skin. Melanopsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 290
32946 320459 cd15337 7tmA_Opsin_Gq_invertebrates invertebrate Gq opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. The invertebrate Gq-coupled opsin subfamily includes the arthropod and mollusc visual opsins. Like the vertebrate visual opsins, arthropods possess color vision by the use of multiple opsins sensitive to different light wavelengths. The invertebrate Gq opsins are closely related to the vertebrate melanopsins, the primary photoreceptor molecules for non-visual responses to light, and the R1-R6 photoreceptors, which are the fly equivalent to the vertebrate rods. The Gq opsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. 292
32947 320460 cd15338 7tmA_MCHR1 melanin concentrating hormone receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 282
32948 320461 cd15339 7tmA_MCHR2 melanin concentrating hormone receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 283
32949 320462 cd15340 7tmA_CB1 cannabinoid receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system. 292
32950 320463 cd15341 7tmA_CB2 cannabinoid receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system. 279
32951 320464 cd15342 7tmA_LPAR2_Edg4 lysophosphatidic acid receptor subtype 2 (LPAR2 or LPA2), also called Endothelial differentiation gene 4 (Edg4), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 274
32952 320465 cd15343 7tmA_LPAR3_Edg7 lysophosphatidic acid receptor subtype 3 (LPAR3 or LPA3), also called endothelial differentiation gene 7 (Edg7), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 274
32953 341348 cd15344 7tmA_LPAR1_Edg2 lysophosphatidic acid receptor subtype 1 (LPAR1 or LPA1), also called endothelial differentiation gene 2 (Edg2), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 273
32954 320467 cd15345 7tmA_S1PR3_Edg3 sphingosine-1-phosphate receptor subtype 3 (S1PR3 or S1P3), also called endothelial differentiation gene 3 (Edg3), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 270
32955 320468 cd15346 7tmA_S1PR1_Edg1 sphingosine-1-phosphate receptor subtype 1 (S1PR1 or S1P1), also called endothelial differentiation gene 1 (Edg1), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 277
32956 320469 cd15347 7tmA_S1PR2_Edg5 sphingosine-1-phosphate receptor subtype 2 (S1PR2 or S1P2), also called endothelial differentiation gene 5 (Edg5), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 266
32957 320470 cd15348 7tmA_S1PR5_Edg8 sphingosine-1-phosphate receptor subtype 5 (S1PR5 or S1P5), also called endothelial differentiation gene 8 (Edg8), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 277
32958 320471 cd15349 7tmA_S1PR4_Edg6 sphingosine-1-phosphate receptor subtype 4 (S1PR4 or S1P4), also called endothelial differentiation gene 6 (Edg6), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8). The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13). 271
32959 320472 cd15350 7tmA_MC2R_ACTH_R melanocortin receptor subtype 2, also called adrenocorticotropic hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 270
32960 320473 cd15351 7tmA_MC1R melanocortin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 271
32961 320474 cd15352 7tmA_MC3R melanocortin receptor subtype 3, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 272
32962 320475 cd15353 7tmA_MC4R melanocortin receptor subtype 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 269
32963 320476 cd15354 7tmA_MC5R melanocortin receptor subtype 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function. 270
32964 320477 cd15355 7tmA_NTSR1 neurotensin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3 or also called sortilin, is not a G protein-coupled receptor. 310
32965 320478 cd15356 7tmA_NTSR2 neurotensin receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3 or also called sortilin, is not a G protein-coupled receptor. 285
32966 320479 cd15357 7tmA_NMU-R2 neuromedin U receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure. 293
32967 320480 cd15358 7tmA_NMU-R1 neuromedin U receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure. 305
32968 320481 cd15359 7tmA_LHCGR luteinizing hormone-choriogonadotropin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. LHCGR is expressed predominantly in the ovary and testis, and plays an essential role in sexual development and reproductive processes. LHCGR couples primarily to the G(s)-protein and activates adenylate cyclase, thereby promoting cAMP production. 275
32969 320482 cd15360 7tmA_FSH-R follicle-stimulating hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. FSH-R functions in gonad development and is found in the ovary, testis, and uterus. Defects in this receptor cause ovarian dysgenesis type 1, and also ovarian hyperstimulation syndrome. The FSH-R activation couples to the G(s)-protein and stimulates adenylate cyclase, thereby promoting cAMP production. 275
32970 320483 cd15361 7tmA_LGR4 leucine-rich repeats-containing G protein-coupled receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The leucine-rich repeat containing G-protein coupled receptor Lgr4 (formerly known as Gpr48), together with its close family members LGR5 and LGR6, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. LGR5 acts as a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 also serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins. 274
32971 320484 cd15362 7tmA_LGR6 leucine-rich repeats-containing G protein-coupled receptor 6, the class A of 7-transmembrane GPCRs. The leucine-rich repeat containing G-protein coupled receptor LGR5, together with its family members LGR4 and LGR6, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR5 serves as a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 is a marker for multipotent stem cells in the hair follicle that generate all skin cell lineages. In addition, LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins. 276
32972 320485 cd15363 7tmA_LGR5 leucine-rich repeats-containing G protein-coupled receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The leucine-rich repeat containing G-protein coupled receptor LGR6, together with its family members LGR4 and LGR5, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR6 serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages, whereas LGR5 is a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. In addition, LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins. 274
32973 320486 cd15364 7tmA_GPR132_G2A proton-sensing G protein-coupled receptor 132, member of the class A family of seven-transmembrane G protein-coupled receptors. The G2 accumulation receptor (G2A, also known as GPR132) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the T cell death associated gene-8 (TDAG8, GPR65) receptor, ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediates a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. G2A was originally identified as a stress-inducible receptor that causes the cell cycle arrest at G2/M phase when serum is deprived. Lysophosphatidylcholine was identified as a ligand for G2A, and whose overexpression was shown to induce cell proliferation, oncogenic transformation, and apoptosis. 279
32974 320487 cd15365 7tmA_GPR65_TDAG8 proton-sensing G protein-coupled receptor 65, member of the class A family of seven-transmembrane G protein-coupled receptors. The T cell death associated gene-8 receptor (TDAG8, also known as GPR65) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediates a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. Activation of TDAG8 by extracellular acidosis increases the cAMP production, stimulates Rho, and induces stress fiber formation. TDAG8 has also been shown to regulate the extracellular acidosis-induced inhibition of pro-inflammatory cytokine production in peritoneal macrophages. 277
32975 320488 cd15366 7tmA_GPR4 proton-sensing G protein-coupled receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein-coupled receptor 4 (GPR4) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 receptor (TDAG8, GPR65), ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediates a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. GPR4 overexpression in melanoma cells was shown to reduce cell migration, membrane ruffling, and cell spreading under acidic pH conditions. Activation of GPR4 via extracellular acidosis is coupled to the G(s), G(q), and G(12/13) pathways. 280
32976 320489 cd15367 7tmA_GPR68_OGR1 G protein-coupled receptor 68, member of the class A family of seven-transmembrane G protein-coupled receptors. The ovarian cancer G-protein receptor 1 (OGR1, also known as GPR68) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 receptor (TDAG8, GPR65), and the G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediates a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. Knock-out mice studies have suggested that OGR1 plays a role in the regulation of insulin secretion and glucose metabolism. OGR1 couples to G(q/11) proteins and activates phospholipase C and Ca2+ signaling pathways. 276
32977 320490 cd15368 7tmA_P2Y8 purinergic receptor P2Y8, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y8 (or P2RY8) expression is often increased in leukemia patients, and it plays a role in the pathogenesis of acute leukemia. P2Y8 is phylogenetically closely related to the protease-activated receptors (PARs), which are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. Four different types of the protease-activated receptors have been identified (PAR1-4) and are predominantly expressed in platelets. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 281
32978 320491 cd15369 7tmA_PAR1 protease-activated receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 281
32979 341349 cd15370 7tmA_PAR2 protease-activated receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 280
32980 320493 cd15371 7tmA_PAR3 protease-activated receptor 3, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 274
32981 320494 cd15372 7tmA_PAR4 protease-activated receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects. 274
32982 320495 cd15373 7tmA_P2Y2 P2Y purinoceptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y2 belongs to the P2Y receptor family of purinergic G-protein coupled receptors and is implicated to play a role in the control of the cell cycle of endometrial carcinoma cells. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 283
32983 320496 cd15374 7tmA_P2Y4 P2Y purinoceptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y4 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. This family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 285
32984 320497 cd15375 7tmA_OXGR1 2-oxoglutarate receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. 2-oxoglutarate receptor 1 (OXGR1) is also known as GPR80, GPR99, or P2Y15. OXGR1 functions as a receptor for alpha-ketoglutarate, a citric acid cycle intermediate, and acts exclusively through a G(q)-dependent pathway. OXGR1 belongs to the class A GPCR superfamily and is phylogenetically related to the purinergic P2Y1-like receptor subfamily, whose members are coupled to G(q) protein to activate phospholipase C (PLC). OXGR1 has also been reported as a potential third cysteinyl leukotriene receptor with specificity for leukotriene E4. 280
32985 320498 cd15376 7tmA_P2Y11 P2Y purinoceptor 11, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y11 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. The activation of P2Y11 is a major pathway of macrophage activation that leads to the release of cytokines. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 284
32986 341350 cd15377 7tmA_P2Y1 P2Y purinoceptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y1 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. This family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 289
32987 320500 cd15378 7tmA_SUCNR1_GPR91 succinate receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Succinate receptor (SUCNR1) GPR91 exclusively couples to G(i) protein to inhibit cAMP production and also activates PLC-beta to increase intracellular calcium concentrations in an inositol phosphate dependent mechanism. Succinate, an intermediate molecule of the citric acid cycle, is shown to cause cardiac hypertrophy via GPR91 activation. Furthermore, succinate-induced GPR91 activation is involved in the regulation of renin-angiotensin system and is suggested to play an important role in the development of renovascular hypertension and diabetic nephropathy. SUCNR1 belongs to the class A GPCR superfamily and is phylogenetically related to the purinergic P2Y1-like receptor subfamily, whose members are coupled to G(q) protein to activate phospholipase C (PLC). 283
32988 320501 cd15379 7tmA_P2Y6 P2Y purinoceptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes mammalian P2Y6, avian P2Y3, and similar proteins. P2Y3 is the avian homolog of mammalian P2Y6. They belong to the G(i) class of a family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 288
32989 320502 cd15380 7tmA_BK-1 bradykinin receptor B1, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways. 286
32990 320503 cd15381 7tmA_BK-2 bradykinin receptor B2, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways. 284
32991 320504 cd15382 7tmA_AKHR adipokinetic hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Adipokinetic hormone (AKH) is a lipid-mobilizing hormone that is involved in control of insect metabolism. Generally, AKH behaves as a typical stress hormone by mobilizing lipids, carbohydrates and/or certain amino acids such as proline. Thus, it utilizes the body's energy reserves to fight the immediate stress problems and subdue processes that are less important. Although AKH is known to be responsible for regulating the energy metabolism during insect flight, it is also found in insects that have lost their functional wings and predominantly walk for their locomotion. AKH is structurally related to the mammalian gonadotropin-releasing hormone (GnRH) and they share a common ancestor. Both GnRH and AKH receptors are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily. 298
32992 320505 cd15383 7tmA_GnRHR_vertebrate vertebrate gonadotropin-releasing hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. GnRHR is expressed predominantly in the gonadotrope membrane of the anterior pituitary as well as found in numerous extrapituitary tissues including lymphocytes, breast, ovary, prostate, and cancer cell lines. There are at least two types of GnRH receptors, GnRHR1 and GnRHR2, which couple primarily to G proteins of the Gq/11 family. GnRHR is closely related to the adipokinetic hormone receptor (AKH), which binds to a lipid-mobilizing hormone that is involved in control of insect metabolism. They share a common ancestor and are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily. 295
32993 320506 cd15384 7tmA_GnRHR_invertebrate invertebrate gonadotropin-releasing hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. GnRHR is expressed predominantly in the gonadotrope membrane of the anterior pituitary as well as found in numerous extrapituitary tissues including lymphocytes, breast, ovary, prostate, and cancer cell lines. There are at least two types of GnRH receptors, GnRHR1 and GnRHR2, which couple primarily to G proteins of the Gq/11 family. GnRHR is closely related to the adipokinetic hormone receptor (AKH), which binds to a lipid-mobilizing hormone that is involved in control of insect metabolism. They share a common ancestor and are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily. 293
32994 320507 cd15385 7tmA_V1aR vasopressin receptor subtype 1A, member of the class A family of seven-transmembrane G protein-coupled receptors. V1a-type receptor is a G(q/11)-coupled receptor that mediates blood vessel constriction. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. 301
32995 320508 cd15386 7tmA_V1bR vasopressin receptor subtype 1B, member of the class A family of seven-transmembrane G protein-coupled receptors. The V1b receptor is specifically expressed in corticotropes of the anterior pituitary and plays a critical role in regulating the activity of hypothalamic-pituitary-adrenal axis, a key part of the neuroendocrine system that controls reactions to stress, by maintaining adrenocorticotropic hormone (ACTH) and corticosterone levels. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. 302
32996 320509 cd15387 7tmA_OT_R oxytocin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Oxytocin is a peptide of nine amino acids synthesized in the hypothalamus and is released from the posterior pituitary gland. Oxytocin plays an important role in sexual reproduction of both sexes and is structurally very similar to vasopressin. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. 297
32997 320510 cd15388 7tmA_V2R vasopressin receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. The vasopressin type 2 receptor (V2R) is a G(s)-coupled receptor that controls balance of water and sodium ion by regulating their reabsorption in the renal collecting duct. Mutations of V2R are responsible for nephrogenic diabetes insipidus. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. 295
32998 320511 cd15389 7tmA_GPR83 G protein-coupled receptor 83, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR83, also known as GPR72, is widely expressed in the brain, including hypothalamic nuclei which are involved in regulating energy balance and food intake. The hypothalamic expression of GPR83 is tightly regulated in response to nutrient availability and is decreased in obese mice. A recent study suggests that GPR83 has a critical role in the regulation of systemic energy metabolism via ghrelin-dependent and ghrelin-independent mechanisms. GPR83 shares a significant amino acid sequence identity with the tachykinin receptors; however, its endogenous ligand is unknown. 285
32999 320512 cd15390 7tmA_TACR neurokinin receptors (or tachykinin receptors), member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents G-protein coupled receptors for a variety of neuropeptides of the tachykinin (TK) family. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty. 289
33000 320513 cd15391 7tmA_NPR-like_invertebrate invertebrate neuropeptide receptor-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes putative neuropeptide receptor found in invertebrates, which is a member of class A of 7-transmembrane G protein-coupled receptors. This orphan receptor shares a significant amino acid sequence identity with the neurokinin 1 receptor (NK1R). The endogenous ligand for NK1R is substance P, an 11-amino acid peptide that functions as a vasodilator and neurotransmitter and is released from the autonomic sensory nerve fibers. 289
33001 320514 cd15392 7tmA_PR4-like neuropeptide Y receptor-like found in insect and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes a novel G protein-coupled receptor (also known as PR4 receptor) from Drosophila melanogaster, which can be activated by the members of the neuropeptide Y (NPY) family, including NPY, peptide YY (PYY) and pancreatic polypeptide (PP), when expressed in Xenopus oocytes. These homologous peptides of 36-amino acids in length contain a hairpin-like structural motif, which is referred to as the pancreatic polypeptide fold, and function as gastrointestinal hormones and neurotransmitters. The PR4 receptor also shares strong sequence homology to the mammalian tachykinin receptors (NK1R, NK2R, and NK3R), whose endogenous ligands are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB), respectively. The tachykinins function as excitatory transmitters on neurons and cells in the gastrointestinal tract. 287
33002 320515 cd15393 7tmA_leucokinin-like leucokinin-like peptide receptor from tick and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes a leucokinin-like peptide receptor from the Southern cattle tick, Boophilus microplus, a pest of cattle world-wide. Leucokinins are invertebrate neuropeptides that exhibit myotropic and diuretic activity. This receptor is the first neuropeptide receptor known from the Acari and the second known in the subfamily of leucokinin-like peptide G-protein-coupled receptors. The other known leucokinin-like peptide receptor is a lymnokinin receptor from the mollusc Lymnaea stagnalis. 288
33003 320516 cd15394 7tmA_PrRP_R prolactin-releasing peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prolactin-releasing peptide (PrRP) receptor (previously known as GPR10) is expressed in the central nervous system with the highest levels located in the anterior pituitary and is activated by its endogenous ligand PrRP, a neuropeptide possessing a C-terminal Arg-Phe-amide motif. There are two active isoforms of PrRP in mammals: one consists of 20 amino acids (PrRP-20) and the other consists of 31 amino acids (PrRP-31), where PrRP-20 is a C-terminal fragment of PrRP-31. Binding of PrRP to the receptor coupled to G(i/o) proteins activates the extracellular signal-regulated kinase (ERK) and it can also couple to G(q) protein leading to an increase in intracellular calcium and activation of c-Jun N-terminal protein kinase (JNK). The PrRP receptor shares significant sequence homology with the neuropeptide Y (NPY) receptor, and micromolar levels of NPY can bind and completely inhibit the PrRP-evoked intracellular calcium response in PrRP receptor-expressing cells, suggesting that the PrRP receptor shares a common ancestor with the NPY receptors. PrRP has been shown to reduce food intake and body weight and modify body temperature when administered in rats. It also has been shown to decrease circulating growth hormone levels by activating somatostatin-secreting neurons in the hypothalamic periventricular nucleus. 286
33004 320517 cd15395 7tmA_NPY1R neuropeptide Y receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other NPY family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis. 293
33005 320518 cd15396 7tmA_NPY6R neuropeptide Y receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other NPY family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. 293
33006 320519 cd15397 7tmA_NPY4R neuropeptide Y receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other NPY family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. 293
33007 320520 cd15398 7tmA_NPY5R neuropeptide Y receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other NPY family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis. 273
33008 320521 cd15399 7tmA_NPY2R neuropeptide Y receptor type 2, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other NPY family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis. 285
33009 320522 cd15400 7tmA_Mel1B melatonin receptor subtype 1B, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase. 279
33010 320523 cd15401 7tmA_Mel1C melatonin receptor subtype 1C, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase. 279
33011 320524 cd15402 7tmA_Mel1A melatonin receptor subtype 1A, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase. 279
33012 320525 cd15403 7tmA_GPR45 G protein-coupled receptor 45, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the human orphan receptor GPR45 and closely related proteins found in vertebrates. GPR45 is also called PSP24 in Xenopus and PSP24-alpha (or PSP24-1) in mammals. GPR45 shows the highest sequence homology with GPR63 (PSP24-beta, or PSP24-2). PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Mammalian PSP24 receptors are highly expressed in neuronal cells of cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting the important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 301
33013 320526 cd15404 7tmA_GPR63 G protein-coupled receptor 63, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the human orphan receptor GPR63, which is also called PSP24-beta or PSP24-2, and its closely related proteins found in vertebrates. GPR63 shares the highest sequence homology with GPR45 (Xenopus PSP24, mammalian PSP24-alpha or PSP24-1). PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Mammalian PSP24 receptors are highly expressed in neuronal cells of cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting the important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 265
33014 320527 cd15405 7tmA_OR8B-like olfactory receptor subfamily 8B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8B and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33015 320528 cd15406 7tmA_OR8D-like olfactory receptor subfamily 8D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8D and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 290
33016 320529 cd15407 7tmA_OR5B-like olfactory receptor subfamily 5B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5B and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33017 320530 cd15408 7tmA_OR5AK3-like olfactory receptor subfamily 5AK3, 5AU1, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AK3, 5AU1, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 287
33018 320531 cd15409 7tmA_OR5H-like olfactory receptor subfamily 5H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5H, 5K, 5AC, 5T and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33019 320532 cd15410 7tmA_OR5D-like olfactory receptor subfamily 5D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5D, 5L, 5W, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 294
33020 320533 cd15411 7tmA_OR8H-like olfactory receptor subfamily 8H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8H, 8I, 5F and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33021 320534 cd15412 7tmA_OR5M-like olfactory receptor subfamily 5M and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5M and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33022 320535 cd15413 7tmA_OR8K-like olfactory receptor subfamily 8K and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8K, 8U, 8J, 5R, 5AL and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33023 320536 cd15414 7tmA_OR5G-like olfactory receptor subfamily 5G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5G and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 285
33024 320537 cd15415 7tmA_OR5J-like olfactory receptor subfamily 5J and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5J and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33025 320538 cd15416 7tmA_OR5P-like olfactory receptor subfamily 5P and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5P and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33026 320539 cd15417 7tmA_OR5A1-like olfactory receptor subfamily 5A1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5A1, 5A2, 5AN1, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33027 320540 cd15418 7tmA_OR9G-like olfactory receptor subfamily 9G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 9G and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 281
33028 320541 cd15419 7tmA_OR9K2-like olfactory receptor subfamily 9K2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes transmembrane olfactory receptor subfamily 9K2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 279
33029 320542 cd15420 7tmA_OR2A-like olfactory receptor subfamily 2A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2A and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33030 320543 cd15421 7tmA_OR2T-like olfactory receptor subfamily 2T and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamilies 2T, 2M, 2L, 2V, 2Z, 2AE, 2AG, 2AK, 2AJ, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33031 320544 cd15424 7tmA_OR2_unk olfactory receptor family 2, unknown subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents an unknown subfamily, conserved in some mammalia and sauropsids, in family 2 of olfactory receptors. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33032 320545 cd15428 7tmA_OR2D-like olfactory receptor subfamily 2D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33033 320546 cd15429 7tmA_OR2F-like olfactory receptor subfamily 2F and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2F and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33034 320547 cd15430 7tmA_OR13-like olfactory receptor family 13 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 13 (subfamilies 13C, 13D, 13F, and 13J), some subfamilies from OR family 2 (2K and 2S), and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33035 320548 cd15431 7tmA_OR13H-like olfactory receptor subfamily 13H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 13H and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 269
33036 320549 cd15432 7tmA_OR2B2-like olfactory receptor subfamily 2B2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes transmembrane olfactory receptor subfamily 2B2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33037 320550 cd15433 7tmA_OR2Y-like olfactory receptor subfamily 2Y and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2Y, 2I, and related protein in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33038 320551 cd15434 7tmA_OR2W-like olfactory receptor subfamily 2W and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2W and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33039 320552 cd15436 7tmB2_Latrophilin Latrophilins, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are a member of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. The latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven transmembrane receptor containing EGF-like repeats is highly expressed in heart, where developmentally regulated, as well as in normal smooth cells. The function of the ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 258
33040 320553 cd15437 7tmB2_ETL Epidermal Growth Factor, latrophilin and seven transmembrane domain-containing protein 1; member of the class B2 family of seven-transmembrane G protein-coupled receptors. ETL (EGF-TM7-latrophilin-related protein) belongs to Group I adhesion GPCRs, which also include latrophilins (also called lectomedins or latrotoxin receptors). All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. ETL, for instance, contains EGF-like repeats, which also present in other EGF-TM7 adhesion GPCRs, such as Cadherin EGF LAG seven-pass G-type receptors (CELSR1-3), EGF-like module receptors (EMR1-3), CD97, and Flamingo. ETL is highly expressed in heart, where developmentally regulated, as well as in normal smooth cells. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 258
33041 320554 cd15438 7tmB2_CD97 CD97 antigen, member of the class B2 family of seven-transmembrane G protein-coupled receptors. group II adhesion GPCRs, including the leukocyte cell-surface antigen CD97 and the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptor (EMR1-4), are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily B2 of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying numbers of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of CD97, alternative splicing results in three isoforms possessing either three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. For example, CD97, which is involved in angiogenesis and the migration and invasion of tumor cells, has been shown to promote cell aggregation in a GPS proteolysis-dependent manner. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal pancreatic, and thyroid carcinoma. EMR2 shares strong sequence homology with CD97, differing by only six amino acids. 
However, unlike CD97, EMR2 is not found in CD97-positive tumor cells and is expressed not on lymphocytes but on monocytes, macrophages, and granulocytes. CD97 has three known ligands: CD55, the decay-accelerating factor that regulates the complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment. 261
33042 320555 cd15439 7tmB2_EMR epidermal growth factor-like module-containing mucin-like hormone receptors, member of the class B2 family of seven-transmembrane G protein-coupled receptors. group II adhesion GPCRs, including the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptor (EMR1-4) and the leukocyte cell-surface antigen CD97, are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying number of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of EMR2, alternative splicing results in four isoforms possessing either two (EGF1,2), three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. EMR2 shares strong sequence homology with CD97, differing by only six amino acids. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal pancreatic, and thyroid carcinoma. However, unlike CD97, EMR2 is not found in those of CD97-positive tumor cells and is not expressed on lymphocytes but instead on monocytes, macrophages and granulocytes. 
CD97 has three known ligands: CD55, the decay-accelerating factor that regulates the complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment. 263
33043 320556 cd15440 7tmB2_latrophilin-like_invertebrate invertebrate latrophilin-like receptors, member of the class B2 family of seven-transmembrane G protein-coupled receptors. This subgroup includes latrophilin-like proteins that are found in invertebrates such as insects and worms. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are a member of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of vertebrate latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. The latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven transmembrane receptor containing EGF-like repeats is highly expressed in heart, where developmentally regulated, as well as in normal smooth cells. The function of the ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. 
In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 259
33044 320557 cd15441 7tmB2_CELSR_Adhesion_IV cadherin EGF LAG seven-pass G-type receptors, group IV adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as that belongs to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon, guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. 
Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. Celsr3 is expressed in both the developing and adult mouse brain. It has been functionally implicated in proper neuron migration and axon guidance in the CNS. 254
33045 320558 cd15442 7tmB2_GPR97 orphan adhesion receptor GPR97, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR97 is an orphan receptor that has been classified into Group VIII of adhesion GPCRs. Other members of Group VIII include GPR56, GPR64, GPR112, GPR114, and GPR126. GPR97 is identified as a lymphatic adhesion receptor that is specifically expressed in lymphatic endothelium, but not in blood vascular endothelium, and is shown to regulate migration of lymphatic endothelial cells via the small GTPases RhoA and cdc42. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 277
33046 320559 cd15443 7tmB2_GPR114 orphan adhesion receptor GPR114, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR114 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include GPR56, GPR64, GPR97, GPR112, and GPR126. GPR114 is mainly found in granulocytes (polymorphonuclear leukocytes), and GPR114-transfected cells induced an increase in cAMP levels via coupling to G(s) protein. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 268
33047 320560 cd15444 7tmB2_GPR64 orphan adhesion receptor GPR64 and related proteins, member of subfamily B2 of the class B secretin-like receptors of seven-transmembrane G protein-coupled receptors. GPR64 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include orphan GPCRs such as GPR56, GPR97, GPR112, GPR114, and GPR126. GPR64 is mainly expressed in the epididymis of male reproductive tract, and targeted deletion of GPR64 causes sperm stasis and efferent duct blockage due to abnormal fluid reabsorption, resulting in male infertility. GPR64 is also over-expressed in Ewing's sarcoma (ES), as well as upregulated in other carcinomas from kidney, prostate or lung, and promotes invasiveness and metastasis in ES via the upregulation of placental growth factor (PGF) and matrix metalloproteinase (MMP) 1. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 271
33048 320561 cd15445 7tmB1_CRF-R1 corticotropin-releasing factor receptor 1, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. These evidence suggest that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on its cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 265
33049 320562 cd15446 7tmB1_CRF-R2 corticotropin-releasing factor receptor 2, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. These evidence suggest that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on its cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 264
33050 320563 cd15447 7tmC_mGluR2 metabotropic glutamate receptor 2 in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR 2 and 3. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 254
33051 320564 cd15448 7tmC_mGluR3 metabotropic glutamate receptor 3 in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR 2 and 3. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 254
33052 320565 cd15449 7tmC_mGluR1 metabotropic glutamate receptor 1 in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 250
33053 320566 cd15450 7tmC_mGluR5 metabotropic glutamate receptor 5 in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 250
33054 320567 cd15451 7tmC_mGluR7 metabotropic glutamate receptor 7 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 307
33055 320568 cd15452 7tmC_mGluR4 metabotropic glutamate receptor 4 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 327
33056 320569 cd15453 7tmC_mGluR6 metabotropic glutamate receptor 6 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 273
33057 320570 cd15454 7tmC_mGluR8 metabotropic glutamate receptor 8 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 311
33058 271319 cd15457 NADAR Escherichia coli swarming motility protein YbiA and related proteins. This family of uncharacterized domains was initially classified as Domain of Unknown Function (DUF1768). It contains members such as the E. coli swarming motility protein YbiA. Mutations in YbiA cause defects in Escherichia coli swarming, but not necessarily in motility. More recently, this family has been predicted to be involved in NAD-utilizing pathways, likely to act on ADP-ribose derivatives, and has been named the NADAR (NAD and ADP-ribose) superfamily. 148
33059 271230 cd15464 HN_like Haemagglutinin-neuraminidase (HN) of paramyxoviridae and similar proteins. Most paramyxoviridae have two membrane-anchored glycoproteins that mediate entry of the virus into the host cell. The protein characterized by this model is called hemagglutinin-neuraminidase (HN), hemagglutinin glycoprotein (H), or glycoprotein (G). Typically it has a variety of functions during viral infection; it participates in virus attachment to host cells, may cleave sialic acid off host oligosaccharides, and has a stimulating effect on membrane fusion during the entry of the virus into the host cell. 391
33060 275386 cd15465 bS6_mito Mitochondrial Ribosomal Protein (MRP) S6. bS6_MRPS6 is one of the proteins of the small subunit of the mitochondrial ribosome. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. The ribosome is a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptide. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). Mitochondrial and chloroplastic ribosomes synthesize proteins that are involved in oxidative phosphorylation (ATP generation) and photosynthesis. bS6_MRPS6 is one of the fourteen mitochondrial ribosomal proteins that are known to have significant sequence homology with their bacterial counterparts. 95
33061 271318 cd15466 CLU-central An uncharacterized central domain of CLU mitochondrial proteins. Mutations in the mitochondrial CLU proteins have been shown to result in clustered mitochondria. CLU proteins include Saccharomyces cerevisiae clustered mitochondria protein (Clu1p, alias translation initiation factor 31/TIF31p), Dictyostelium discoideum clustered mitochondria protein homolog (CluA), Caenorhabditis elegans clustered mitochondria protein homolog (CLUH/ Protein KIAA0664), Drosophila clueless (alias clustered mitochondria protein homolog), Arabidopsis clustered mitochondria protein (CLU, alias friendly mitochondria protein/FMT), and human clustered mitochondria protein homolog (CLUH). Dictyostelium CluA is involved in mitochondrial dynamics and is necessary for both mitochondrial fission and fusion. Drosophila clueless is essential for cytoplasmic localization and function of cellular mitochondria. The Drosophila clu gene interacts genetically with parkin (park, the Drosophila ortholog of a human gene responsible for many familial cases of Parkinson's disease). Arabidopsis CLU/FMT is required for correct mitochondrial distribution and morphology. The specific role CLU proteins play in mitochondrial processes is not yet known. In an early study, S. cerevisiae Clu1/TIF31p was reported as sometimes being associated with the eIF3 translation initiation factor. The authors noted, however, that its tentative assignment as a subunit of eIF3 was uncertain, and to date there has been no direct evidence for a role of this protein in translation. 159
33062 271231 cd15467 MV-h Measles virus hemagglutinin. The hemagglutinin (H) of measles virus plays several roles during viral infection; it participates in virus attachment to host cells via binding to a proteinaceous receptor and it has a stimulating effect on membrane fusion during the entry of the virus into the host cell by interacting with the fusion protein F. This model characterizes the globular ectodomain of measles hemagglutinin, minus the stalk region. Receptors for measles virus have been identified as the signaling lymphocyte activation molecule (SLAM, in particular its distal ectodomain), CD46, and nectin-4 in epithelial cells. 415
33063 271232 cd15468 HeV-G Glycoprotein G, or hemagglutinin-neuraminidase of Hendravirus and Nipah virus. The glycoprotein (G) of Hendravirus and Nipah virus has a variety of functions during viral infection; it participates in virus attachment to host cells and has a stimulating effect on membrane fusion during the entry of the virus into the host cell. This model characterizes the globular ectodomain of glycoprotein G. The receptors for Hendravirus and Nipah virus have been identified as ephrin B2 (EFNB2) and ephrin B3 (EFNB3). 413
33064 271233 cd15469 HN Hemagglutinin-neuraminidase (HN) of parainfluenza virus 5, Newcastle disease virus, and related paramyxoviridae. The hemagglutinin-neuraminidase (HN) found in this family of viruses has a variety of functions during viral infection; it participates in virus attachment to host cells, cleaves sialic acid off host oligosaccharides, and has a stimulating effect on membrane fusion during the entry of the virus into the host cell. This model characterizes the globular ectodomain of HN. Hemagglutinin-neuraminidase ectodomains of these viruses attach the virion to sialic acid receptors on host cells; the neuraminidase cleaves sialic acid moieties from host cell molecules as well as virus particles, and this removal may happen in the trans Golgi network. 408
33065 271254 cd15470 Myo5_CBD Cargo binding domain of myosin 5. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates. Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. The MyoV-CBDs directly interact with several adaptor proteins, in case of Myo5a, melanophilin (MLPH), Rab interacting lysosomal protein-like 2 (RILPL2), and granuphilin, and in case of Myo5b, Rab11-family interacting protein 2. 332
33066 271255 cd15471 Myo5p-like_CBD_afadin cargo binding domain of myosin 5-like of afadin. Afadin is an actin filament (F-actin) and Rap1 small G protein-binding protein, found in cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. It interacts with cell adhesion molecules and signaling molecules and plays a role in the formation of cell junctions, cell polarization, migration, survival, proliferation, and differentiation. Afadin is a multi-domain protein that contains, besides a myosin5-like CBD, two Ras-associated domains, a forkhead-associated domain, a PDZ domain, three proline-rich domains, and an F-actin-binding domain. 322
33067 271256 cd15472 Myo5p-like_CBD_Rasip1 cargo binding domain of myosin 5-like of Ras-interacting protein 1. Ras-interacting protein 1 (Rasip1 or RAIN) is an effector of the small G protein Rap1 and plays an important role in endothelial junction stabilization. Rasip1, like afadin, is a multi-domain protein that contains, besides a myosin5-like CBD, a Ras-associated domain and a PDZ domain. 366
33068 271257 cd15473 Myo5p-like_CBD_DIL_ANK cargo binding domain of myosin 5-like of dil and ankyrin domain containing protein. DIL and ankyrin domain-containing proteins are a group of fungal proteins that contain a domain homologous to the cargo binding domain of class V myosins and ankyrin repeats. Their function is unknown. 316
33069 271258 cd15474 Myo5p-like_CBD_fungal cargo binding domain of fungal myosin V-like proteins. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. In case of Myo4 it has been shown to bind to the adapter protein She3p (Swi5p-dependent HO expression 3), which in turn anchors myosin 4 to its cargos, zip-coded mRNP (messenger ribonucleoprotein particles) and tER (tubular endoplasmic reticulum). Myo2 binds to Vac17, vacuole-specific cargo adaptor, and Mmr1, mitochondria-specific cargo adaptor. Both adaptors bind competitively at the same site. 352
33070 271259 cd15475 MyosinXI_CBD cargo binding domain of myosin XI. Class XI myosins are a plant specific group, homologous to class V myosins. The C-terminal domain of Arabidopsis myosin XI has been shown to be homologous to the cargo-binding domain of yeast myosin V myo2p, which targets myosin to vacuoles and mitochondria, as well as secretory vesicles. 326
33071 271260 cd15476 Myo5c_CBD Cargo binding domain of myosin 5C. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates. Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. The MyoV-CBDs directly interact with several adaptor proteins. MyoVb and MyoVc are primarily expressed in epithelial cells, and have been implicated as motors involved in recycling endosomes. 332
33072 271261 cd15477 Myo5b_CBD Cargo binding domain of myosin 5b. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates. Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. They interact with several adaptor proteins, in case of Myo5b-CBD, Rab11-family interacting protein 2. 372
33073 271262 cd15478 Myo5a_CBD Cargo binding domain of myosin 5a. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates. Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. They interact with several adaptor proteins, in case of Myo5a-CBD, melanophilin (MLPH), Rab interacting lysosomal protein-like 2 (RILPL2), and granuphilin. Mutations in human Myo5a (many of which map to the cargo binding domain) lead to Griscelli syndrome, a severe neurological disease. 375
33074 271263 cd15479 fMyo4p_CBD cargo binding domain of fungal myosin 4. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. In case of Myo4 it has been shown to bind to the adapter protein She3p (Swi5p-dependent HO expression 3), which in turn anchors myosin 4 to its cargos, zip-coded mRNP (messenger ribonucleoprotein particles) and tER (tubular endoplasmic reticulum). 329
33075 271264 cd15480 fMyo2p_CBD cargo binding domain of fungal myosin 2. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. Myo2 binds to Vac17, vacuole-specific cargo adaptor, and Mmr1, mitochondria-specific cargo adaptor. Both adaptors bind competitively at the same site. 363
33076 271252 cd15481 SRP68-RBD RNA-binding domain of signal recognition particle subunit 68. Signal recognition particles (SRPs) are ribonucleoprotein complexes that target particular nascent pre-secretory proteins to the endoplasmic reticulum. SRP68 is one of the two largest proteins found in SRPs (the other being SRP72), and it forms a heterodimer with SRP72. Heterodimer formation is essential for SRP function. This model characterizes the N-terminal RNA-binding domain SRP68-RBD, a tetratricopeptide-like module. Interactions between SRP68-RBD and SRP RNA (7SL RNA) are thought to facilitate a conformation of SRP RNA that is required for interactions with ribosomal RNA. 195
33077 271234 cd15482 Sialidase_non-viral Non-viral sialidases. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates; they play vital roles in pathogenesis, bacterial nutrition and cellular interactions. They have a six-bladed, beta-propeller fold with the non-viral sialidases containing 2-5 Asp-box motifs (most commonly Ser/Thr-X-Asp-[X]-Gly-X-Thr-Trp/Phe). This CD includes eubacterial and eukaryotic sialidases. 339
33078 271235 cd15483 Influenza_NA Sialidase or neuraminidase (EC 3.2.1.18) of Influenza viruses A and B. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates. Viral neuraminidases, such as this family from Influenza viruses A and B, play a vital role in pathogenesis. Influenza neuraminidase cleaves an alpha-ketosidic linkage between sialic acid and a neighboring sugar residue. During budding of virus particles from the infected cell, the sialidase helps to prevent the newly formed viral particles from aggregating. The viral sialidase cleaves terminal sialic acid from glycan structures on the infected cell surface, promoting virus release and the spread of virus to neighboring cells that are not yet infected. Also, sialidase modifies mucins in the respiratory tract and may improve access of the viral particle to its target cells. Sialidases have a six-bladed beta-propeller fold. 386
33079 271251 cd15484 uS7_plant plant ribosomal protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis. 147
33080 271243 cd15485 ZIP_Cat8 Leucine zipper Dimerization domain of transcription factor Cat8 and similar proteins. Cat8p binds to carbon source-responsive element (CSRE) motifs and activates target genes under conditions of glucose deprivation. It mediates the transcriptional control of at least nine genes (ACS1, FBP1, ICL1, IDP2, JEN1, MLS1, PCK1, SFC1, and SIP4) under non-fermentative growth conditions in yeast. Studies show another 25 genes or open reading frames whose expression at the transition between the fermentative and the oxidative metabolism (diauxic shift) is altered in the absence of Cat8p. This Cat8p-dependent control results in a parallel alteration in mRNA and protein synthesis. The biochemical functions of proteins encoded by Cat8p-dependent genes are essentially related to the first steps of ethanol utilization, the glyoxylate cycle, and gluconeogenesis. Cat8p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 27
33081 271244 cd15486 ZIP_Sip4 Leucine zipper Dimerization domain of transcription factor Sip4p and similar fungal proteins. Sip4p binds to carbon source-responsive element (CSRE) motifs and activates transcription of target genes under conditions of glucose deprivation. Its function is modulated through phosphorylation by SNF1 protein kinase, a protein essential for expression of glucose-repressed genes in response to glucose deprivation. Sip4p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 27
33082 275387 cd15487 bS6_chloro_cyano 30S ribosomal protein S6 of chloroplasts and cyanobacteria. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18. 94
33083 350626 cd15488 Tm-1-like ATP-binding domain found in plant Tm-1-like (Tm-1L) and similar proteins. Members of this family have been annotated as UPF0261-domain containing proteins. They are found in plants, fungi, bacteria, and archaea. A three-dimensional structure of a complex between the tomato resistance gene product Tm-1 and tomato mosaic virus helicase reveals an organization encompassing two distinct structurally similar domains and an ATP-binding site present in the N-terminal subdomain. The Tm-1-like domain is found co-occurring with a C-terminal TIM-barrel signal transduction (TBST) domain in some plant proteins like Tm-1, and with an N-terminal ABC-transporter ATP-binding domain in a few bacterial proteins. 399
33084 276966 cd15489 PHD_SF PHD finger superfamily. The PHD finger superfamily includes a canonical plant homeodomain (PHD) finger typically characterized as Cys4HisCys3, and a non-canonical extended PHD finger, characterized as Cys2HisCys5HisCys2His. Variations include the RAG2 PHD finger characterized by Cys3His2Cys2His and the PHD finger 5 found in nuclear receptor-binding SET domain-containing proteins characterized by Cys4HisCys2His. The PHD finger is also termed LAP (leukemia-associated protein) motif or TTC (trithorax consensus) domain. Single or multiple copies of PHD fingers have been found in a variety of eukaryotic proteins involved in the control of gene transcription and chromatin dynamics. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. They also function as epigenome readers controlling gene expression through molecular recruitment of multi-protein complexes of chromatin regulators and transcription factors. The PHD finger domain SF is structurally similar to the RING and FYVE_like superfamilies. 48
33085 294011 cd15490 eIF2_gamma_III Domain III of eukaryotic initiation factor eIF2 gamma. This family represents the C-terminal domain of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in eukaryotes and archaea. eIF2 is a G protein that delivers the methionyl initiator tRNA to the small ribosomal subunit and releases it upon GTP hydrolysis after the recognition of the initiation codon. eIF2 is composed of three subunits, alpha, beta and gamma. Subunit gamma shows strongest conservation, and it confers both tRNA binding and GTP/GDP binding. 90
33086 294012 cd15491 selB_III Domain III of selenocysteine-specific translation elongation factor. This family represents domain III of bacterial selenocysteine (Sec)-specific elongation factor (EFSec), homologous to domain III of EF-Tu. SelB is a specialized translation elongation factor responsible for the co-translational incorporation of selenocysteine into proteins by recoding of a UGA stop codon in the presence of a downstream mRNA hairpin loop, called Sec insertion sequence (SECIS) element. 87
33087 276967 cd15492 PHD_BRPF_JADE_like PHD finger found in BRPF proteins, Jade proteins, protein AF-10, AF-17, and similar proteins. The family includes BRPF proteins, Jade proteins, protein AF-10 and AF-17. BRPF proteins are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. Jade proteins are required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6, to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. AF-10, also termed ALL1 (acute lymphoblastic leukemia)-fused gene from chromosome 10 protein, is a transcription factor that has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is a putative transcription factor that may play a role in multiple signaling pathways. All Jade proteins, AF-10, and AF-17 contain a canonical PHD finger followed by a non-canonical ePHD finger. This model corresponds to the canonical PHD finger. 46
33088 276968 cd15493 PHD_JMJD2 PHD finger found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also termed lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a Cys4HisCys3 canonical PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. JMJD2D is not included in this family, since it lacks both PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth. This model corresponds to the Cys4HisCys3 canonical PHD finger. 42
33089 276969 cd15494 PHD_ATX1_2_like PHD finger found in Arabidopsis thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like protein ATX1, ATX2, and similar proteins. The family includes A. thaliana ATX1 and ATX2, both of which are sister paralogs originating from a segmental chromosomal duplication. They are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1, also termed protein SET domain group 27, or trithorax-homolog protein 1 (TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2, also termed protein SET domain group 30, or trithorax-homolog protein 2 (TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). Both ATX1 and ATX2 are multi-domain containing proteins that consist of an N-terminal PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a C-terminal SET domain; this model corresponds to the Cys4HisCys3 canonical PHD finger. 47
33090 276970 cd15495 PHD_ATX3_4_5_like PHD finger found in Arabidopsis thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like protein ATX3, ATX4, ATX5, and similar proteins. The family includes A. thaliana ATX3 (also termed protein SET domain group 14, or trithorax-homolog protein 3), ATX4 (also termed protein SET domain group 16, or trithorax-homolog protein 4) and ATX5 (also termed protein SET domain group 29, or trithorax-homolog protein 5), which belong to the histone-lysine methyltransferase family. They show distinct phylogenetic origins from the ATX1 and ATX2 family. They are multi-domain containing proteins that consist of an N-terminal PWWP domain, a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a C-terminal SET domain; this model corresponds to the Cys4HisCys3 canonical PHD finger. 47
33091 276971 cd15496 PHD_PHF7_G2E3_like PHD finger found in PHD finger protein 7 (PHF7) and G2/M phase-specific E3 ubiquitin-protein ligase (G2E3). PHF7, also termed testis development protein NYD-SP6, is a testis-specific plant homeodomain (PHD) finger-containing protein that associates with chromatin and binds histone H3 N-terminal tails with a preference for dimethyl lysine 4 (H3K4me2). It may play an important role in stimulating transcription involved in testicular development and/or spermatogenesis. PHF7 contains a canonical Cys4HisCys3 PHD finger and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which may be involved in activating transcriptional regulation. G2E3 is a dual function ubiquitin ligase (E3) that may play a possible role in cell cycle regulation and the cellular response to DNA damage. It is essential for prevention of apoptosis in early embryogenesis. It is also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization. G2E3 contains two distinct RING-like ubiquitin ligase domains that catalyze lysine 48-linked polyubiquitination, and a C-terminal catalytic HECT domain that plays an important role in ubiquitin ligase activity and in the dynamic subcellular localization of the protein. The RING-like ubiquitin ligase domains consist of a PHD finger and an ePHD finger. This model corresponds to the Cys4HisCys3 canonical PHD finger. 54
33092 276972 cd15497 PHD1_Snt2p_like PHD finger 1 found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical Cys4HisCys3 plant homeodomain (PHD) fingers, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain; this model corresponds to the first canonical Cys4HisCys3 PHD finger. 48
33093 276973 cd15498 PHD2_Snt2p_like PHD finger 2 found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. This group corresponds to Snt2p and similar proteins. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical Cys4HisCys3 plant homeodomain (PHD) fingers, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain; this model corresponds to the second canonical Cys4HisCys3 PHD finger. 55
33094 276974 cd15499 PHD1_MTF2_PHF19_like PHD finger 1 found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins MTF2, PHF19, and similar proteins. The family includes two PCL family proteins, metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are homologs of PHD finger protein1 (PHF1). PCL family proteins are accessory components of the polycomb repressive complex 2 (PRC2) core complex and all contain an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. They specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD finger-containing proteins, the first PHD fingers of PCL proteins do not display histone H3K4 binding affinity and they do not affect the Tudor domain binding to histones. This model corresponds to the first PHD finger. 53
33095 276975 cd15500 PHD1_PHF1 PHD finger 1 found in PHD finger protein1 (PHF1). PHF1, also termed polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of Lysine trimethylation at Lysine 36 of Histone H3 and Lysine 27 of Histone variant H3t. PHF1 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3. Moreover, PHF1 is required for efficient H3K27me3 and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. This model corresponds to the first PHD finger. 51
33096 276976 cd15501 PHD_Int12 PHD finger found in integrator complex subunit 12 (Int12) and similar proteins. Int12, also termed IntS12, or PHD finger protein 22, is a component of integrator, a multi-protein mediator of small nuclear RNA processing. The integrator complex directly interacts with the C-terminal domain of RNA polymerase II (RNAPII) largest subunit and mediates the 3' end processing of small nuclear RNAs (snRNAs) U1 and U2. Different from other components of integrator, Int12 contains a PHD finger, which is not required for snRNA 3' end cleavage. Instead, Int12 harbors a small microdomain at its N-terminus which is necessary and sufficient for Int12 function; this microdomain facilitates Int12 binding to Int1 and promotes snRNA 3' end formation. 52
33097 276977 cd15502 PHD_Phf1p_Phf2p_like PHD finger found in Schizosaccharomyces pombe SWM histone demethylase complex subunits Phf1 (Phf1p) and Phf2 (Phf2p). Phf1p and Phf2p are components of the SWM histone demethylase complex that specifically demethylates histone H3 at lysine 9 (H3K9me2), a specific tag for epigenetic transcriptional activation. They function as corepressors and play roles in regulating heterochromatin propagation and euchromatic transcription. Both Phf1p and Phf2p contain a plant homeodomain (PHD) finger. 52
33098 276978 cd15503 PHD2_MTF2_PHF19_like PHD finger 2 found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins MTF2, PHF19, and similar proteins. The PCL family includes PHD finger protein1 (PHF1) and its homologs metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are accessory components of the Polycomb repressive complex 2 (PRC2) core complex and all contain an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD finger-containing proteins, the first PHD fingers of PCL proteins do not display histone H3K4 binding affinity and they do not affect the Tudor domain binding to histones. This model corresponds to the second PHD finger. 52
33099 276979 cd15504 PHD_PRHA_like PHD finger found in Arabidopsis thaliana pathogenesis-related homeodomain protein (PRHA) and similar proteins. PRHA is a homeodomain protein encoded by a single-copy Arabidopsis thaliana homeobox gene, prha. It shows the capacity to bind to TAATTG core sequence elements but requires additional adjacent bases for high-affinity binding. PRHA contains a plant homeodomain (PHD) finger, a homeodomain, peptide repeats and a putative leucine zipper dimerization domain. 53
33100 276980 cd15505 PHD_ING PHD finger found in the inhibitor of growth (ING) protein family. The ING family includes a group of tumor suppressors, ING1-5, which act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger. 45
33101 276981 cd15506 PHD1_KMT2A_like PHD finger 1 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger. 47
33102 276982 cd15507 PHD2_KMT2A_like PHD finger 2 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger. 50
33103 276983 cd15508 PHD3_KMT2A_like PHD finger 3 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger. 57
33104 276984 cd15509 PHD1_KMT2C_like PHD finger 1 found in Histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the first PHD finger. 48
33105 276985 cd15510 PHD2_KMT2C_like PHD finger 2 found in Histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the second PHD finger. 46
33106 276986 cd15511 PHD3_KMT2C PHD finger 3 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the third PHD finger. 52
33107 276987 cd15512 PHD4_KMT2C_like PHD finger 4 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD domain 3 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fourth PHD finger of KMT2C and the third domain of KMT2D. 49
33108 276988 cd15513 PHD5_KMT2C_like PHD finger 5 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD finger 4 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fifth PHD finger of KMT2C and the fourth PHD finger of KMT2D. 47
33109 276989 cd15514 PHD6_KMT2C_like PHD finger 6 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD finger 5 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the sixth PHD finger of KMT2C and the fifth PHD finger of KMT2D. 51
33110 276990 cd15515 PHD1_KDM5A_like PHD finger 1 found in Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and similar proteins. The JARID subfamily within the JmjC proteins includes Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog, protein little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me2 and H3K4me3), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. This family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent H3K4me3 demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. 
Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two or three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger. 46
33111 276991 cd15516 PHD2_KDM5A_like PHD finger 2 found in Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D, and similar proteins. The JARID subfamily within the JmjC proteins includes Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog protein, little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. The family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. 
Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two or three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger. 53
33112 276992 cd15517 PHD_TCF19_like PHD finger found in Transcription factor 19 (TCF-19), Lysine-specific demethylase KDM5A and KDM5B, and other similar proteins. TCF-19 was identified as a putative trans-activating factor with expression beginning at the late G1-S boundary in dividing cells. It functions as a novel islet factor necessary for proliferation and survival in the INS-1 beta cell line. It plays an important role in susceptibility to both Type 1 Diabetes Mellitus (T1DM) and Type 2 Diabetes Mellitus (T2DM); it has been suggested that it may positively impact beta cell mass under conditions of beta cell stress and increased insulin demand. KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interaction with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. This family also includes Caenorhabditis elegans Lysine-specific demethylase 7 homolog (ceKDM7A). ceKDM7A (also termed JmjC domain-containing protein 1.2, PHD finger protein 8 homolog, or PHF8 homolog) is a plant homeodomain (PHD)- and JmjC domain-containing protein that functions as a histone demethylase specific for H3K9me2 and H3K27me2. The binding of the PHD finger to H3K4me3 guides H3K9me2- and H3K27me2-specific demethylation by its catalytic JmjC domain in a trans-histone regulation mechanism. In addition, this family includes plant protein OBERON 1 and OBERON 2, Alfin1-like (AL) proteins, histone acetyltransferases (HATs) HAC, and AT-rich interactive domain-containing protein 4 (ARID4). 49
33113 276993 cd15518 PHD_Ecm5p_Lid2p_like PHD finger found in Saccharomyces cerevisiae extracellular matrix protein 5 (Ecm5p), Schizosaccharomyces pombe Lid2 complex component Lid2p, and similar proteins. The family includes Saccharomyces cerevisiae Ecm5p, Schizosaccharomyces pombe Lid2 complex component Lid2p, and similar proteins. Ecm5p is a JmjC domain-containing protein that directly removes histone lysine methylation via a hydroxylation reaction. It associates with the yeast Snt2p and Rpd3 deacetylase, which may play a role in regulating transcription in response to oxidative stress. Ecm5p promotes oxidative stress tolerance, while Snt2p ultimately decreases tolerance. Ecm5p contains an N-terminal ARID domain, a JmjC domain, and a C-terminal plant homeodomain (PHD) finger. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1 and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. This model includes the second PHD finger of Lid2p. 45
33114 276994 cd15519 PHD1_Lid2p_like PHD finger 1 found in Schizosaccharomyces pombe Lid2 complex component Lid2p and similar proteins. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1 and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. This model corresponds to the first PHD finger. 46
33115 276995 cd15520 PHD3_Lid2p_like PHD finger 3 found in Schizosaccharomyces pombe Lid2 complex component Lid2p and similar proteins. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1, and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. The family corresponds to the third PHD finger. 47
33116 276996 cd15521 PHD_VIN3_plant PHD finger found in Arabidopsis thaliana protein Vernalization Insensitive 3 (VIN3) and similar proteins. The lineage-specific VIN3 family of proteins includes VIN3, VIN3-like1 (VIL1, or Vernalization5 (VRN5)), VIN3-like2 (VIL2, or Vernalization5/VIN3-like protein 1 (VEL1)), VIN3-like3 (VIL3 or Vernalization5/VIN3-like protein 2 (VEL2)), and similar proteins. They contain a plant homeodomain (PHD) finger, and collectively repress different sets of members of the FLOWERING LOCUS C (FLC) gene family during the course of vernalization. Both VIN3 and VIL1 are required for modifying the histone architecture of the MADS box floral repressor FLC in response to prolonged cold exposure in Arabidopsis. VIN3 is required for both Histone H3 Lys 9 (H3K9) and Histone H3 Lys 27 (H3K27) methylation at FLC chromatin, ultimately leading to its repression. It is regulated by the components of Polycomb Repressive Complex 2 (PRC2), which trimethylates histone H3 Lys 27 (H3K27me3). VIL1 appears to play a prominent role in regulating FLC by vernalization. VIL2 acts together with PRC2 to repress the floral repressor MAF5, an FLC clade member, in a photoperiod-dependent manner to accelerate flowering under non-inductive photoperiods. 64
33117 276997 cd15522 PHD_TAF3 PHD finger found in transcription initiation factor TFIID subunit 3 (TAF3). TAF3, also termed 140 kDa TATA box-binding protein-associated factor, TBP-associated factor 3, transcription initiation factor TFIID 140 kDa subunit (TAF140), or TAFII-140, is an integral component of TFIID, a general initiation factor (GTF) that plays a key role in preinitiation complex (PIC) assembly through core promoter recognition. The interaction of H3K4me3 with TAF3 directs global TFIID recruitment to active genes, which regulates gene-selective functions of p53 in response to genotoxic stress. TAF3 is highly enriched in embryonic stem cells, is required for endoderm lineage differentiation, and prevents premature specification of neuroectoderm and mesoderm. Moreover, TAF3, along with TRF3, forms a complex that is essential for myogenic differentiation. TAF3 contains a plant homeodomain (PHD) finger. This family also includes Drosophila melanogaster BIP2 (Bric-a-brac interacting protein 2) protein, which functions as an interacting partner of D. melanogaster p53 (Dmp53). 46
33118 276998 cd15523 PHD_PHF21A PHD finger found in PHD finger protein 21A (PHF21A). PHF21A (also termed BHC80a or BRAF35-HDAC complex protein BHC80), along with HDAC1/2, CtBP1, CoREST, and BRAF35, is associated with LSD1, a lysine (K)-specific histone demethylase. It inhibits LSD1-mediated histone demethylation in vitro. PHF21A is predominantly present in the central nervous system and spermatogenic cells and is one of the six components of BRAF-HDAC complex (BHC) involved in REST-dependent transcriptional repression of neuron-specific genes in non-neuronal cells. It acts as a scaffold protein in BHC in neuronal as well as non-neuronal cells and also plays a role in spermatogenesis. PHF21A contains a C-terminal plant homeodomain (PHD) finger that is responsible for binding directly to each of the five other components of BHC and for organizing BHC to mediate transcriptional repression. 43
33119 276999 cd15524 PHD_PHF21B PHD finger found in PHD finger protein 21B (PHF21B). PHF21B is a plant homeodomain (PHD) finger-containing protein whose biological function remains unclear. It shows high sequence similarity with PHF21A, which is associated with LSD1, a lysine (K)-specific histone demethylase and inhibits LSD1-mediated histone demethylation in vitro. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. 43
33120 277000 cd15525 PHD_UHRF1_2 PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. 
It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. 47
33121 277001 cd15526 PHD1_MOZ_d4 PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ), MOZ-related factor (MORF), and d4 gene family proteins. MOZ is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF) is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. This family also includes three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. All family members contain two plant homeodomain (PHD) fingers. 
This model corresponds to the first PHD finger. 56
33122 277002 cd15527 PHD2_KAT6A_6B PHD finger 2 found in monocytic leukemia zinc-finger protein (MOZ) and MOZ-related factor (MORF). MOZ, also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), or runt-related transcription factor-binding protein 2, or zinc finger protein 220, is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF), also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains linker histone 1 and histone 5 domains and two plant homeodomain (PHD) fingers. In contrast, MORF contains an N-terminal region containing two PHD fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. This model corresponds to the second PHD finger. 46
33123 277003 cd15528 PHD1_PHF10 PHD finger 1 found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is a ubiquitously expressed transcriptional regulator that is required for maintaining the undifferentiated status of neuroblasts. It contains a SAY (supporter of activation of yellow) domain and two adjacent plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger. 54
33124 277004 cd15529 PHD2_PHF10 PHD finger 2 found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is a ubiquitously expressed transcriptional regulator that is required for maintaining the undifferentiated status of neuroblasts. It contains a SAY (supporter of activation of yellow) domain and two adjacent plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger. 44
33125 277005 cd15530 PHD2_d4 PHD finger 2 found in d4 gene family proteins. The family includes proteins coded by three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. The d4 family proteins show distinct domain organization with domain 2/3 in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc finger in the central part and two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the second PHD finger. 46
33126 277006 cd15531 PHD1_CHD_II PHD finger 1 found in class II Chromodomain-Helicase-DNA binding (CHD) proteins. Class II CHD proteins include chromodomain-helicase-DNA-binding protein CHD3, CHD4, and CHD5, which are nuclear and ubiquitously expressed chromatin remodelling ATPases generally associated with histone deacetylases (HDACs). They are involved in DNA Double Strand Break (DSB) signaling, DSB repair and/or p53-dependent pathways such as apoptosis and senescence, as well as in the maintenance of genomic stability, and/or cancer prevention. They function as subunits of the Nucleosome Remodelling and Deacetylase (NuRD) complex, which is generally associated with gene repression, heterochromatin formation, and overall chromatin compaction. In contrast to the class I CHD enzymes (CHD1 and CHD2), class II CHD proteins lack identifiable DNA-binding domains, but possess a C-terminal coiled-coil region. Moreover, in addition to the tandem chromodomains and a helicase domain, they all harbor tandem plant homeodomain (PHD) zinc fingers involved in the recognition of methylated histone tails. This model corresponds to the first PHD finger. 43
33127 277007 cd15532 PHD2_CHD_II PHD finger 2 found in class II Chromodomain-Helicase-DNA binding (CHD) proteins. Class II CHD proteins include chromodomain-helicase-DNA-binding protein CHD3, CHD4, and CHD5, which are nuclear and ubiquitously expressed chromatin remodelling ATPases generally associated with histone deacetylases (HDACs). They are involved in DNA Double Strand Break (DSB) signaling, DSB repair and/or p53-dependent pathways such as apoptosis and senescence, as well as in the maintenance of genomic stability, and/or cancer prevention. They function as subunits of the Nucleosome Remodelling and Deacetylase (NuRD) complex, which is generally associated with gene repression, heterochromatin formation, and overall chromatin compaction. In contrast to the class I CHD enzymes (CHD1 and CHD2), class II CHD proteins lack identifiable DNA-binding domains, but possess a C-terminal coiled-coil region. Moreover, in addition to the tandem chromodomains and a helicase domain, they all harbor tandem plant homeodomain (PHD) zinc fingers involved in the recognition of methylated histone tails. This model corresponds to the second PHD finger. 43
33128 277008 cd15533 PHD1_PHF12 PHD finger 1 found in PHD finger protein 12 (PHF12). PHF12, also termed PHD factor 1 (Pf1), is a plant homeodomain (PHD) zinc finger-containing protein that bridges the transducin-like enhancer of split (TLE) corepressor to the mSin3A-histone deacetylase (HDAC)-complex, and further represses transcription at targeted genes. PHF12 also interacts with MRG15 (mortality factor-related genes on chromosome 15), a member of the mortality factor (MORF) family of proteins implicated in regulating cellular senescence. PHF12 contains two plant-homeodomain (PHD) zinc fingers followed by a polybasic region. The PHD fingers function downstream of phosphoinositide signaling triggered by the interaction between polybasic regions and phosphoinositides. This model corresponds to the first PHD finger. 45
33129 277009 cd15534 PHD2_PHF12_Rco1 PHD finger 2 found in PHD finger protein 12 (PHF12), yeast Rco1, and similar proteins. PHF12, also termed PHD factor 1 (Pf1), is a plant homeodomain (PHD) zinc finger-containing protein that bridges the transducin-like enhancer of split (TLE) corepressor to the mSin3A-histone deacetylase (HDAC)-complex, and further represses transcription at targeted genes. PHF12 also interacts with MRG15 (mortality factor-related genes on chromosome 15), a member of the mortality factor (MORF) family of proteins implicated in regulating cellular senescence. PHF12 contains two plant homeodomain (PHD) zinc fingers followed by a polybasic region. The PHD fingers function downstream of phosphoinositide signaling triggered by the interaction between polybasic regions and phosphoinositides. This subfamily also includes yeast transcriptional regulatory protein Rco1 and similar proteins. Rco1 is a component of the Rpd3S histone deacetylase complex that plays an important role at actively transcribed genes. Rco1 contains two PHD fingers, which are required for recognition of histone H3 lysine 36 (H3K36)-methylated nucleosomes by Rpd3S. This model corresponds to the second PHD finger. 47
33130 277010 cd15535 PHD1_Rco1 PHD finger 1 found in Saccharomyces cerevisiae transcriptional regulatory protein Rco1 and similar proteins. Rco1 is a component of the Rpd3S histone deacetylase complex that plays an important role at actively transcribed genes. Rco1 contains two plant homeodomain (PHD) fingers, which are required for recognition of histone H3 lysine 36 (H3K36)-methylated nucleosomes by Rpd3S. This model corresponds to the first PHD finger. 45
33131 277011 cd15536 PHD_PHRF1 PHD finger found in PHD and RING finger domain-containing protein 1 (PHRF1). PHRF1, also termed KIAA1542, or CTD-binding SR-like protein rA9, is a ubiquitin ligase that induces the ubiquitination of TGIF (TG-interacting factor) at lysine 130. It acts as a tumor suppressor that promotes the transforming growth factor (TGF)-beta cytostatic program through selective release of TGIF-driven promyelocytic leukemia protein (PML) inactivation. PHRF1 contains a plant homeodomain (PHD) finger and a RING finger. 46
33132 277012 cd15537 PHD_BS69 PHD finger found in protein BS69. Protein BS69, also termed zinc finger MYND domain-containing protein 11 (ZMYND11 or ZMY11), is a ubiquitously expressed nuclear protein acting as a transcriptional co-repressor in association with various transcription factors. It was originally identified as an adenovirus 5 E1A-binding protein that inhibits E1A transactivation, as well as c-Myb transcription. It also mediates repression, at least in part, through interaction with the co-repressor N-CoR. Moreover, it interacts with Toll-interleukin 1 receptor domain (TIR)-containing adaptor molecule-1 (TICAM-1, also named TRIF) to facilitate NF-kappaB activation and type I IFN induction. It associates with PIAS1, a SUMO E3 enzyme, and Ubc9, a SUMO E2 enzyme, and plays an inhibitory role in muscle and neuronal differentiation. Moreover, BS69 regulates Epstein-Barr virus (EBV) latent membrane protein 1 (LMP1)/C-terminal activation region 2 (CTAR2)-mediated NF-kappaB activation by interfering with the complex formation between TNFR-associated death domain protein (TRADD) and LMP1/CTAR2. It also cooperates with tumor necrosis factor receptor (TNFR)-associated factor 3 (TRAF3) in the regulation of EBV-derived LMP1/CTAR1-induced NF-kappaB activation. Furthermore, BS69 is involved in the p53-p21Cip1-mediated senescence pathway. BS69 contains a plant homeodomain (PHD) finger, a bromodomain, a proline-tryptophan-tryptophan-proline (PWWP) domain, and a Myeloid translocation protein 8, Nervy and DEAF-1 (MYND) domain. 43
33133 277013 cd15538 PHD_PRKCBP1 PHD finger found in protein kinase C-binding protein 1 (PRKCBP1). PRKCBP1, also termed cutaneous T-cell lymphoma-associated antigen se14-3 (CTCL-associated antigen se14-3), or Rack7, or zinc finger MYND domain-containing protein 8 (ZMYND8), is a novel receptor for activated C-kinase (RACK)-like protein that may play an important role in the activation and regulation of PKC-beta I, and the PKC signaling cascade. It also has been identified as a formin homology-2-domain containing protein 1 (FHOD1)-binding protein that may be involved in FHOD1-regulated actin polymerization and transcription. Moreover, PRKCBP1 may function as a REST co-repressor 2 (RCOR2)-interacting factor; the RCOR2/ZMYND8 complex might be involved in the regulation of neural differentiation. PRKCBP1 contains a plant homeodomain (PHD) finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. 41
33134 277014 cd15539 PHD1_AIRE PHD finger 1 found in autoimmune regulator (AIRE). AIRE, also termed autoimmune polyendocrinopathy candidiasis ectodermal dystrophy (APECED) protein, functions as a regulator of gene transcription in the thymus. It is essential for prevention of autoimmunity. AIRE plays a critical role in the induction of central tolerance. It promotes self-tolerance through tissue-specific antigen (TSA) expression. It also acts as an active regulator of chondrocyte differentiation. AIRE contains a homogeneously-staining region (HSR) or caspase-recruitment domain (CARD), a nuclear localization signal (NLS), a SAND (for Sp100, AIRE, nuclear phosphoprotein 41/75 or NucP41/75, and deformed epidermal auto regulatory factor 1 or Deaf1) domain, two plant homeodomain (PHD) fingers, and four LXXLL (where L stands for leucine) motifs. This model corresponds to the first PHD finger that recognizes the unmethylated tail of histone H3 and targets AIRE-dependent genes. 43
33135 277015 cd15540 PHD2_AIRE PHD finger 2 found in autoimmune regulator (AIRE). AIRE, also termed autoimmune polyendocrinopathy candidiasis ectodermal dystrophy (APECED) protein, functions as a regulator of gene transcription in the thymus. It is essential for prevention of autoimmunity. AIRE plays a critical role in the induction of central tolerance. It promotes self-tolerance through tissue-specific antigen (TSA) expression. It also acts as an active regulator of chondrocyte differentiation. AIRE contains a homogeneously-staining region (HSR) or caspase-recruitment domain (CARD), a nuclear localization signal (NLS), a SAND (for Sp100, AIRE, nuclear phosphoprotein 41/75 or NucP41/75, and deformed epidermal auto regulatory factor 1 or Deaf1) domain, two plant homeodomain (PHD) fingers, and four LXXLL (where L stands for leucine) motifs. This model corresponds to the second PHD finger that may play a critical role in the activation of gene transcription. 42
33136 277016 cd15541 PHD_TIF1_like PHD finger found in the transcriptional intermediary factor 1 (TIF1) family and similar proteins. The TIF1 family of transcriptional cofactors includes TIF1alpha (TRIM24), TIF1beta (TRIM28), TIF1gamma (TRIM33), and TIF1delta (TRIM66), which are characterized by an N-terminal RING-finger B-box coiled-coil (RBCC/TRIM) motif and plant homeodomain (PHD) finger followed by a bromodomain in the C-terminal region. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha, TIF1beta, and TIF1delta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. TIF1alpha and TIF1beta bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. In contrast, TIF1delta appears to lack nuclear receptor- and KRAB-binding activity. Moreover, TIF1delta is specifically involved in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatin proteins HP1alpha, beta, and gamma. It cannot bind to nuclear receptors (NRs). This family also includes Sp100/Sp140 family proteins, the nuclear body proteins Sp110 and Sp140. Sp110 is a leukocyte-specific component of the nuclear body. It may function as a nuclear hormone receptor transcriptional coactivator that may play a role in inducing differentiation of myeloid cells. It is also involved in resisting intracellular pathogens and functions as an important drug target for preventing intracellular pathogen diseases, such as tuberculosis, hepatic veno-occlusive disease, and intracellular cancers. 
SP140 is an interferon inducible nuclear leukocyte-specific protein involved in primary biliary cirrhosis and a risk factor in chronic lymphocytic leukemia. It is also implicated in innate immune response to human immunodeficiency virus type 1 (HIV-1) by binding to the virus viral infectivity factor (Vif) protein. Both Sp110 and Sp140 contain a SAND domain, a plant homeodomain (PHD) finger, and a bromodomain (BRD). 43
33137 277017 cd15542 PHD_UBR7 PHD finger found in putative E3 ubiquitin-protein ligase UBR7. UBR7, also termed N-recognin-7, is a UBR box-containing protein that belongs to the E3 ubiquitin ligase family that recognizes N-degrons or structurally related molecules for ubiquitin-dependent proteolysis or related processes through the UBR box motif. In addition to the UBR box, UBR7 also harbors a plant homeodomain (PHD) finger. The biochemical properties of UBR7 remain unclear. 54
33138 277018 cd15543 PHD_RSF1 PHD finger found in Remodeling and spacing factor 1 (Rsf-1). Rsf-1, also termed HBV pX-associated protein 8, or Hepatitis B virus X-associated protein alpha (HBxAPalpha), or p325 subunit of RSF chromatin-remodeling complex, is a novel nuclear protein with histone chaperon function. It is a subunit of an ISWI chromatin remodeling complex, remodeling and spacing factor (RSF), and plays a role in mediating ATPase-dependent chromatin remodeling and conferring tumor aggressiveness in common carcinomas. As an ataxia-telangiectasia mutated (ATM)-dependent chromatin remodeler, Rsf-1 facilitates DNA damage checkpoints and homologous recombination repair. It regulates the mitotic spindle checkpoint and chromosome instability through the association with serine/threonine kinase BubR1 (BubR1) and Hepatitis B virus (HBV) X protein (HBx) in the chromatin fraction during mitosis. It also interacts with cyclin E1 and promotes tumor development. Rsf-1 contains a plant homeodomain (PHD) finger. 46
33139 277019 cd15544 PHD_BAZ1A_like PHD finger found in bromodomain adjacent to zinc finger domain protein BAZ1A and BAZ1B. BAZ1A, also termed ATP-dependent chromatin-remodeling protein, or ATP-utilizing chromatin assembly and remodeling factor 1 (ACF1), or CHRAC subunit ACF1, or Williams syndrome transcription factor-related chromatin-remodeling factor 180 (WCRF180), or WALp1, is a subunit of the conserved imitation switch (ISWI)-family ATP-dependent chromatin assembly and remodeling factor (ACF)/chromatin accessibility complex (CHRAC) chromatin remodeling complex, which is required for DNA replication through heterochromatin. It alters the remodeling properties of the ATPase motor protein sucrose nonfermenting-2 homolog (SNF2H). Moreover, BAZ1A and its complexes play important roles in DNA double-strand break (DSB) repair. It is essential for averting improper gene expression during spermatogenesis. It also regulates transcriptional repression of vitamin D3 receptor-regulated genes. BAZ1B, also termed Tyrosine-protein kinase BAZ1B, or Williams syndrome transcription factor (WSTF), or Williams-Beuren syndrome chromosomal region 10 protein, Williams-Beuren syndrome chromosomal region 9 protein, or WALp2, is a multifunctional protein implicated in several nuclear processes, including replication, transcription, and the DNA damage response. BAZ1B/WSTF, together with the imitation switch (ISWI) ATPase, forms a WSTF-ISWI chromatin remodeling complex (WICH), which transiently associates with the human inactive X chromosome (Xi) during late S-phase prior to BRCA1 and gamma-H2AX. Moreover, BAZ1B/WSTF, SNF2h, and nuclear myosin 1 (NM1) form the chromatin remodeling complex B-WICH that is involved in regulating rDNA transcription. Both BAZ1A and BAZ1B contain a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. 46
33140 277020 cd15545 PHD_BAZ2A_like PHD finger found in bromodomain adjacent to zinc finger domain protein 2A (BAZ2A) and 2B (BAZ2B). BAZ2A, also termed transcription termination factor I-interacting protein 5 (TTF-I-interacting protein 5, or Tip5), or WALp3, is an epigenetic regulator. It has been implicated in epigenetic rRNA gene silencing, as the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2A has also been shown to be broadly overexpressed in prostate cancer, to regulate numerous protein-coding genes and to cooperate with EZH2 (enhancer of zeste homolog 2) to maintain epigenetic silencing at genes repressed in prostate cancer metastasis. Its overexpression is tightly associated with a prostate cancer subtype displaying CpG island methylator phenotype (CIMP) in tumors and with prostate cancer recurrence in patients. BAZ2B, also termed WALp4, is a bromodomain-containing protein whose biological role is still elusive. It shows high sequence similarity with BAZ2A. Both BAZ2A and BAZ2B contain a TAM (TIP5/ARBP/MBD) domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. BAZ2B also harbors an extra Apolipophorin-III like domain in its N-terminal region. 46
33141 277021 cd15546 PHD_PHF13_like PHD finger found in PHD finger proteins PHF13 and PHF23. PHF13, also termed survival time-associated PHD finger protein in ovarian cancer 1 (SPOC1), is a novel plant homeodomain (PHD) finger-containing protein that shows strong expression in spermatogonia and ovarian cancer cells, modulates chromatin structure and mitotic chromosome condensation, and is important for proper cell division. It is also required for spermatogonial stem cell differentiation and sustained spermatogenesis. The overexpression of PHF13 associates with unresectable carcinomas and shorter survival in ovarian cancer. PHF23, also termed PHD-containing protein JUNE-1, is a hypothetical protein with a PHD finger. It is encoded by gene PHF23 that acts as a candidate fusion partner for the nucleoporin gene NUP98. The NUP98-PHF23 fusion results from a cryptic translocation t(11;17)(p15;p13) in acute myeloid leukemia (AML). 44
33142 277022 cd15547 PHD_SHPRH PHD finger found in E3 ubiquitin-protein ligase SHPRH. SHPRH, also termed SNF2, histone-linker, PHD and RING finger domain-containing helicase, belongs to the SWI2/SNF2 family of ATP-dependent chromatin remodeling enzymes, containing the Cys3HisCys4 RING-finger characteristic of E3 ubiquitin ligases. It plays a key role in the error-free branch of DNA damage tolerance. As functional homologs of Saccharomyces cerevisiae Rad5, SHPRH and its closely-related protein, helicase like transcription factor (HLTF), act as ubiquitin ligases that cooperatively mediate Ubc13-Mms2-dependent polyubiquitination of proliferating cell nuclear antigen (PCNA) and maintain genomic stability. SHPRH contains a SNF2 domain, a H1.5 (linker histone H1 and H5) domain, a plant homeodomain (PHD) finger, a Cys3HisCys4 RING-finger, and a C-terminal helicase domain. 47
33143 277023 cd15548 PHD_ASH1L PHD finger found in histone-lysine N-methyltransferase ASH1L. ASH1L, also termed ASH1-like protein, or absent small and homeotic disks protein 1 homolog, or lysine N-methyltransferase 2H, is a protein belonging to the Trithorax family. It methylates Lys36 of histone H3 independently of transcriptional elongation to promote the establishment of Hox gene expression by counteracting Polycomb silencing. It can suppress interleukin-6 (IL-6), and tumor necrosis factor (TNF) production in Toll-like receptor (TLR)-triggered macrophages, and inflammatory autoimmune diseases by inducing the ubiquitin-editing enzyme A20. ASH1L contains an associated with SET domain (AWS), a SET domain, a post-SET domain, a bromodomain, a bromo-adjacent homology domain (BAH), and a plant homeodomain (PHD) finger. 43
33144 277024 cd15549 PHD_PHF20_like PHD finger found in PHD finger protein 20 (PHF20) and PHD finger protein 20-like protein 1 (P20L1). PHF20, also termed Glioma-expressed antigen 2, or hepatocellular carcinoma-associated antigen 58, or novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53 mediated signaling. P20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (Cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. Both PHF20 and PHF20L1 contain an N-terminal MBT domain, two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger. 45
33145 277025 cd15550 PHD_MLL5 PHD finger found in mixed lineage leukemia 5 (MLL5). MLL5 is a histone methyltransferase that plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. It contains a single plant homeodomain (PHD) finger followed by a catalytic SET domain. MLL5 can be recruited to E2F1-responsive promoters to stimulate H3K4 trimethylation and transcriptional activation by binding to the cell cycle regulator host cell factor (HCF-1), thereby facilitating the cell cycle G1 to S phase transition. It is also involved in mitotic fidelity and genomic integrity by modulating the stability of the chromosomal passenger complex (CPC) via the interaction with Borealin. Moreover, MLL5 is a component of a complex associated with retinoic acid receptor that requires GlcNAcylation of its SET domain in order to activate its histone lysine methyltransferase activity. It also participates in the camptothecin (CPT)-induced p53 activation. Furthermore, MLL5 indirectly regulates H3K4 methylation, represses cyclin A2 (CycA) expression, and promotes myogenic differentiation. 44
33146 277026 cd15551 PHD_PYGO PHD finger found in PYGO proteins. The family includes Drosophila melanogaster protein pygopus (dPYGO) and its two homologs, PYGO1 and PYGO2. dPYGO is a fundamental Wnt signaling transcriptional component in Drosophila. PYGO1 is essential for the association with Legless (Lgs)/Bcl9 that acts as an adaptor between Pygopus (Pygo) and Arm/beta-catenin. dPYGO and PYGO2 function as context-dependent beta-catenin coactivators, and they bind di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). Moreover, PYGO2 acts as a histone methylation reader, and a chromatin remodeler in a testis-specific and Wnt-unrelated manner. It also mediates chromatin regulation and links Wnt signaling and Notch signaling to suppress the luminal/alveolar differentiation competence of mammary stem and basal cells. PYGO2 also plays a new role in rRNA transcription during cancer cell growth. It regulates mammary tumor initiation and heterogeneity in MMTV-Wnt1 mice. All family members contain a plant homeodomain (PHD) finger. 54
33147 277027 cd15552 PHD_PHF3_like PHD finger found in PHD finger protein 3 (PHF3), and death-inducer obliterator variants Dido1, Dido2, and Dido3. PHF3 is a human homolog of yeast protein bypass of Ess1 (Bye1), a nuclear protein with a domain resembling the central domain in the transcription elongation factor TFIIS. It is ubiquitously expressed in normal tissues including brain, but its expression is significantly reduced or lost in glioblastomas. PHF3 contains an N-terminal plant homeodomain (PHD) finger, a central RNA polymerase II (Pol II)-binding TFIIS-like domain (TLD) domain, and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain. This family also includes Dido gene encoding three alternative splicing variants (Dido1, 2, and 3), which have been implicated in a number of cellular processes such as apoptosis and chromosomal segregation, particularly in the hematopoietic system. Dido1 is important for maintaining embryonic stem (ES) cells and directly regulates the expression of pluripotency factors. It is the shortest isoform that contains only a highly conserved PHD finger responsible for the binding of histone H3 with a higher affinity for trimethylated lysine4 (H3K4me3). Gene Dido1 is a Bone morphogenetic protein (BMP) target gene and promotes BMP-induced melanoma progression. It also triggers apoptosis after nuclear translocation and caspase upregulation. Dido3 is the largest isoform and is ubiquitously expressed in all human tissues. It is dispensable for ES cell self-renewal and pluripotency, but is involved in the maintenance of stem cell genomic stability and tumorigenesis. Dido3 contains a PHD finger, a transcription elongation factor S-II subunit M (TFSIIM) domain, a SPOC module, and a long C-terminal region (CT) of unknown homology. 50
33148 277028 cd15553 PHD_Cfp1 PHD finger found in CXXC-type zinc finger protein 1 (Cfp1). Cfp1, also termed CpG-binding protein, or PHD finger and CXXC domain-containing protein 1 (PCCX1), is a specificity factor that binds to unmethylated CpGs and links H3K4me3 with CpG islands (CGIs). It integrates both promoter CpG content and gene activity for accurate trimethylation of histone H3 Lys 4 (H3K4me3) deposition in embryonic stem cells. Moreover, Cfp1 is an essential component of the SETD1 histone H3K4 methyltransferase complex and functions as a critical regulator of histone methylation, cytosine methylation, cellular differentiation, and vertebrate development. Cfp1 contains a plant homeodomain (PHD) finger, a CXXC domain, and a CpG binding protein zinc finger C-terminal domain. Its CXXC domain selectively binds to non-methylated CpG islands, followed by a preference for a guanosine nucleotide. 46
33149 277029 cd15554 PHD_PHF2_like PHD finger found in PHF2, PHF8 and KDM7. This family includes PHF2, PHF8, KDM7, and similar proteins. PHF2, also termed GRC5, or PHD finger protein 2, is a histone lysine demethylase ubiquitously expressed in various tissues. PHF8, also termed PHD finger protein 8, or KDM7B, is a monomethylated histone H4 lysine 20(H4K20me1) demethylase that transcriptionally regulates many cell cycle genes. It also preferentially acts on H3K9me2 and H3K9me1. PHF8 is modulated by CDC20-containing anaphase-promoting complex (APC (cdc20)) and plays an important role in the G2/M transition. It acts as a critical molecular sensor for mediating retinoic acid (RA) treatment response in RAR alpha-fusion-induced leukemia. Moreover, PHF8 is essential for cytoskeleton dynamics and is associated with X-linked mental retardation. KDM7, also termed JmjC domain-containing histone demethylation protein 1D (JHDM1D), or KIAA1718, is a dual histone demethylase that catalyzes demethylation of monomethylated and dimethylated H3K9 (H3K9me2/me1) and H3K27 (H3K27me2/me1), which functions as an eraser of silencing marks on chromatin during brain development. It also plays a tumor-suppressive role by regulating angiogenesis. All family members contain a plant homeodomain (PHD) finger and a JmjC domain. 47
33150 277030 cd15555 PHD_KDM2A_2B PHD finger found in Lysine-specific demethylase KDM2A, KDM2B, and similar proteins. This family includes KDM2A, KDM2B, and F-box and leucine-rich repeat protein 19 (FBXL19). KDM2A is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. KDM2B is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. Both KDM2A and KDM2B belong to the JmjC-domain-containing histone demethylase family. They consist of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain. FBXL19 belongs to the Skp1-Cullin-F-box (SCF) family of E3 ubiquitin ligases. It mediates ubiquitination and interleukin 33 (IL-33)-induced degradation of ST2L receptor in lung epithelia, blocks IL-33-mediated apoptosis, and prevents endotoxin-induced acute lung injury. FBXL19 consists of FBXHA and FBXHB domains, similar to KDM2A and KDM2B. 55
33151 277031 cd15556 PHD_MMD1_like PHD finger found in Arabidopsis thaliana PHD finger protein MALE MEIOCYTE DEATH 1 (MMD1), PHD finger protein MALE STERILITY 1 (MS1), and similar proteins. MMD1 is a plant homeodomain (PHD) finger protein expressed in male meiocytes. It is encoded by the gene DUET, which is required for male meiotic chromosome organization and progression. MMD1 has been implicated in the regulation of gene expression during meiosis. The mmd1 mutation triggers cell death in male meiocytes. MS1 is a nuclear transcriptional activator that is important for tapetal development and pollen wall biosynthesis. It contains a Leu zipper-like domain and a PHD finger motif, both of which are essential for its function. 46
33152 277032 cd15557 PHD_CBP_p300 PHD finger found in CREB-binding protein (CBP) and histone acetyltransferase p300. This p300/CBP family includes two highly homologous histone acetyltransferases (HATs), CREB-binding protein (CBP) and p300. CBP is also known as KAT3A or CREBBP. It specifically interacts with the phosphorylated form of cyclic adenosine monophosphate-responsive element-binding protein (CREB). p300, also termed KAT3B, or E1A-associated protein p300 (EP300), is a paralog of CBP and is involved in E1A function in cell cycle progression and cellular differentiation. Both CBP and p300 are co-activator proteins that have been implicated in cell cycle regulation, apoptosis, embryonic development, cellular differentiation and cancer. They associate with a number of DNA-binding transcription activators as well as general transcription factors (GTFs), thus mediating recruitment of basal transcription machinery to the promoter. They contain a cysteine-histidine rich region, KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain. 37
33153 277033 cd15558 PHD_Hop1p_like PHD finger found in Schizosaccharomyces pombe meiosis-specific protein hop1 (Hop1p) and similar proteins. Fission yeast Hop1p, also termed linear element-associated protein hop1, is an S. pombe homolog of the synaptonemal complex (SC)-associated protein Hop1 in Saccharomyces cerevisiae. In contrast to S. cerevisiae, S. pombe forms thin threads, known as linear elements (LinEs), in meiotic nuclei, instead of a canonical synaptonemal complex. LinEs contain Rec10 protein and are evolutionary relics of SC axial elements. Fission yeast Hop1p is a linear element (LinE)-associated protein. It also associates with Rec10, which plays a role in recruiting the recombination machinery to chromatin. Hop1p contains an N-terminal HORMA (for Hop1p, Rev7p, and MAD2) domain and a C-terminal plant homeodomain (PHD) finger. 47
33154 277034 cd15559 PHD1_BPTF PHD finger 1 found in bromodomain and PHD finger-containing transcription factor (BPTF). BPTF, also termed nucleosome-remodeling factor subunit BPTF, or fetal Alz-50 clone 1 protein (FAC1), or fetal Alzheimer antigen, functions as a transcriptional regulator that exhibits altered expression and subcellular localization during neuronal development and neurodegenerative diseases such as Alzheimer's disease. It interacts with the human orthologue of the Kelch-like Ech-associated protein (Keap1). Its function and subcellular localization can be regulated by Keap1. Moreover, BPTF is a novel DNA-binding protein that recognizes the DNA sequence CACAACAC and represses transcription through this site in a phosphorylation-dependent manner. Furthermore, BPTF interacts with the Myc-associated zinc finger protein (ZF87/MAZ) and alters its transcriptional activity, which has been implicated in gene regulation in neurodegeneration. Some family members contain two or three plant homeodomain (PHD) fingers, which may be involved in complex formation with histone H3 trimethylated at K4 (H3K4me3). This family corresponds to the first PHD finger. 43
33155 277035 cd15560 PHD2_3_BPTF PHD finger 2 and 3 found in bromodomain and PHD finger-containing transcription factor (BPTF). BPTF, also termed nucleosome-remodeling factor subunit BPTF, or fetal Alz-50 clone 1 protein (FAC1), or fetal Alzheimer antigen, functions as a transcriptional regulator that exhibits altered expression and subcellular localization during neuronal development and neurodegenerative diseases such as Alzheimer's disease. It interacts with the human orthologue of the Kelch-like Ech-associated protein (Keap1). Its function and subcellular localization can be regulated by Keap1. Moreover, BPTF is a novel DNA-binding protein that recognizes the DNA sequence CACAACAC and represses transcription through this site in a phosphorylation-dependent manner. Furthermore, BPTF interacts with the Myc-associated zinc finger protein (ZF87/MAZ) and alters its transcriptional activity, which has been implicated in gene regulation in neurodegeneration. Some family members contain two or three plant homeodomain (PHD) fingers, which may be involved in complex formation with histone H3 trimethylated at K4 (H3K4me3). This family corresponds to the second and third PHD fingers. 47
33156 277036 cd15561 PHD1_PHF14 PHD finger 1 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the first PHD finger. 56
33157 277037 cd15562 PHD2_PHF14 PHD finger 2 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the second PHD finger. 50
33158 277038 cd15563 PHD3_PHF14 PHD finger 3 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the third PHD finger. 49
33159 277039 cd15564 PHD1_NSD PHD finger 1 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the first PHD finger. 43
33160 277040 cd15565 PHD2_NSD PHD finger 2 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the second PHD finger. 51
33161 277041 cd15566 PHD3_NSD PHD finger 3 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the third PHD finger. 48
33162 277042 cd15567 PHD4_NSD PHD finger 4 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the fourth PHD finger. 41
33163 277043 cd15568 PHD5_NSD PHD finger 5 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the fifth PHD finger. 43
33164 277044 cd15569 PHD_RAG2 PHD finger found in V(D)J recombination-activating protein 2 (RAG-2) and similar proteins. RAG-2 is an essential component of the lymphoid-specific recombination activating gene RAG1/2 V(D)J recombinase mediating antigen-receptor gene assembly. It contains an acidic hinge region implicated in histone-binding, a non-canonical plant homeodomain (PHD) finger followed by a C-terminal extension of 40 amino acids that is essential for phosphoinositide (PtdIns)-binding. The PHD finger is a chromatin-binding module that specifically recognizes histone H3 trimethylated at lysine 4 (H3K4me3) and influences V(D)J recombination. 67
33165 277045 cd15570 PHD_Bye1p_SIZ1_like PHD domain found in Saccharomyces cerevisiae bypass of ESS1 protein 1 (Bye1p), the E3 Sumo Ligase SIZ1, and similar proteins. Yeast Bye1p is a nuclear transcription factor with a domain resembling the central domain in the transcription elongation factor TFIIS and plays an inhibitory role during transcription elongation. It functions as a multicopy suppressor of Ess1, a peptidyl-prolyl cis-trans isomerase involved in proline isomerization of the C-terminal domain (CTD) of RNA polymerase II (Pol II). Bye1p contains an N-terminal plant homeodomain (PHD) finger, a central Pol II-binding TFIIS-like domain (TLD) domain, and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain. The PHD domain binds to a histone H3 tail peptide containing trimethylated lysine 4 (H3K4me3). The TLD domain is responsible for the association with chromatin. Plant SIZ1 protein is a SUMO (small ubiquitin-related modifier) E3 ligase that facilitates conjugation of SUMO to substrate target proteins (sumoylation) and belongs to the protein inhibitor of activated STAT (PIAS) protein family. It negatively regulates abscisic acid (ABA) signaling, which is dependent on the bZIP transcription factor ABI5. It also modulates plant growth and plays a role in drought stress response likely through the regulation of gene expression. SIZ1 functions as a floral repressor that not only represses the salicylic acid (SA)-dependent pathway, but also promotes FLOWERING LOCUS C (FLC) expression by repressing FLOWERING LOCUS D (FLD) activity through sumoylation. SIZ1 contains a PHD finger, which specifically binds methylated histone H3 at lysine 4 and arginine 2. 50
33166 277046 cd15571 ePHD Extended plant homeodomain (PHD) finger, characterized by Cys2HisCys5HisCys2His. PHD finger is also termed LAP (leukemia-associated protein) motif or TTC (trithorax consensus) domain. The extended PHD finger is characterized as Cys2HisCys5HisCys2His, which has been found in a variety of eukaryotic proteins involved in the control of gene transcription and chromatin dynamics. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. They also function as epigenome readers controlling gene expression through molecular recruitment of multi-protein complexes of chromatin regulators and transcription factors. 112
33167 277047 cd15572 PHD_BRPF PHD finger found in bromodomain and PHD finger-containing (BRPF) proteins. The family of BRPF proteins includes BRPF1, BRD1/BRPF2, and BRPF3. They are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. This model corresponds to the canonical Cys4HisCys3 PHD finger. 54
33168 277048 cd15573 PHD_JADE PHD finger found in proteins Jade-1, Jade-2, Jade-3, and similar proteins. This family includes proteins Jade-1 (PHF17), Jade-2 (PHF15), and Jade-3 (PHF16), each of which is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. This family also contains Drosophila melanogaster PHD finger protein rhinoceros (RNO). It is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating transcription of key EGFR/Ras pathway regulators in the Drosophila eye. All Jade proteins contain a canonical Cys4HisCys3 PHD finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger. 46
33169 277049 cd15574 PHD_AF10_AF17 PHD finger found in protein AF-10 and AF-17. This family includes protein AF-10 and AF-17. AF-10, also termed ALL1 (acute lymphoblastic leukemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukemia) oncogene in leukemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with the human counterpart of the yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as translocation partners of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. Both AF-10 and AF-17 contain an N-terminal canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. The PHD finger is involved in their homo-oligomerization. In the C-terminal region, they possess a leucine zipper domain and a glutamine-rich region. This family also includes ZFP-1, the Caenorhabditis elegans AF10 homolog. It was originally identified as a factor promoting RNA interference in C. elegans. It also acts as a Dot1-interacting protein that opposes H2B ubiquitination to reduce polymerase II (Pol II) transcription. This model corresponds to the canonical Cys4HisCys3 PHD finger. 48
33170 277050 cd15575 PHD_JMJD2A PHD finger found in Jumonji domain-containing protein 2A (JMJD2A). JMJD2A, also termed lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor co-repressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER) and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger. 100
33171 277051 cd15576 PHD_JMJD2B PHD finger found in Jumonji domain-containing protein 2B (JMJD2B). JMJD2B, also termed lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the trimethyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger. 99
33172 277052 cd15577 PHD_JMJD2C PHD finger found in Jumonji domain-containing protein 2C (JMJD2C). JMJD2C, also termed lysine-specific demethylase 4C (KDM4C), or gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications through modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger. 104
33173 277053 cd15578 PHD1_MTF2 PHD finger 1 found in metal-response element-binding transcription factor 2 (MTF2). MTF2, also termed metal regulatory transcription factor 2, or metal-response element DNA-binding protein M96, or polycomb-like protein 2 (PCL2), complexes with the polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. This model corresponds to the first PHD finger. 53
33174 277054 cd15579 PHD1_PHF19 PHD finger 1 found in PHD finger protein 19 (PHF19). PHF19, also termed Polycomb-like protein 3 (PCL3), is a component of the polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. It binds trimethylated histone H3 Lys36 (H3K36me3) through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states. This model corresponds to the first PHD finger. 53
33175 277055 cd15580 PHD2_MTF2 PHD finger 2 found in metal-response element-binding transcription factor 2 (MTF2). MTF2, also termed metal regulatory transcription factor 2, or metal-response element DNA-binding protein M96, or Polycomb-like protein 2 (PCL2), complexes with the Polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. This model corresponds to the second PHD finger. 52
33176 277056 cd15581 PHD2_PHF19 PHD finger 2 found in PHD finger protein 19 (PHF19). PHF19, also termed Polycomb-like protein 3 (PCL3), is a component of the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. It binds H3K36me3 through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states. This model corresponds to the second PHD finger. 52
33177 277057 cd15582 PHD2_PHF1 PHD finger 2 found in PHD finger protein 1 (PHF1). PHF1, also termed Polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of lysine trimethylation at lysine 36 of histone H3 and lysine 27 of histone variant H3t. PHF1 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3. Moreover, PHF1 is required for efficient H3K27me3 and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. This model corresponds to the second PHD finger. 52
33178 277058 cd15583 PHD_ash2p_like PHD finger found in Schizosaccharomyces pombe Set1 complex component ash2 (spAsh2p) and similar proteins. spAsh2p, also termed Set1C component ash2, or COMPASS component ash2, or complex proteins associated with set1 protein ash2, or Lid2 complex component ash2, or Lid2C component ash2, is orthologous to Drosophila melanogaster Ash2 protein. Both spAsh2p and D. melanogaster Ash2 contain a plant homeodomain (PHD) finger and a SPRY domain. In contrast, its counterpart in Saccharomyces cerevisiae, Bre2p, has no PHD finger and is not included in this family. spAsh2p shows histone H3 Lys4 (H3K4) methyltransferase activity through its PHD finger. It also interacts with Lid2p in S. pombe. Human Ash2L contains an atypical PHD finger that lacks part of the Cys4HisCys3 signature characteristic of PHD fingers; it binds to only one zinc ion through the second half of the motif and does not have histone tail binding activity. 50
33179 277059 cd15584 PHD_ING1_2 PHD finger found in inhibitor of growth protein 1 (ING1) and 2 (ING2). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs Growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, belongs to the inhibitor of growth (ING) family of type II tumor suppressors. It is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. Both ING1 and ING2 contain an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger. 45
33180 277060 cd15585 PHD_ING3 PHD finger found in inhibitor of growth protein 3 (ING3) and similar proteins. ING3, also termed p47ING3, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It is ubiquitously expressed and has been implicated in transcription modulation, cell cycle control, and the induction of apoptosis. It is an important subunit of human NuA4 histone acetyltransferase complex, which regulates the acetylation of histones H2A and H4. Moreover, ING3 promotes ultraviolet (UV)-induced apoptosis through the Fas/caspase-8-dependent pathway in melanoma cells. It physically interacts with subunits of E3 ligase Skp1-Cullin-F-box protein complex (SCF complex) and is degraded by the SCF (F-box protein S-phase kinase-associated protein 2, Skp2)-mediated ubiquitin-proteasome system. It also acts as a suppression factor during tumorigenesis and progression of hepatocellular carcinoma (HCC). ING3 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger. 45
33181 277061 cd15586 PHD_ING4_5 PHD finger found in inhibitor of growth protein 4 (ING4) and 5 (ING5). ING4, also termed p29ING4, and ING5, also termed p28ING5, belong to the inhibitor of growth (ING) family of type II tumor suppressors. ING4 acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). ING5 is a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. Both ING4 and ING5 contain an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger. They associate with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further direct the MOZ/MORF and HBO1 complexes to chromatin. 45
33182 277062 cd15587 PHD_Yng1p_like PHD finger found in yeast orthologs of ING tumor suppressor family. The yeast orthologs of the plant homeodomain (PHD) finger-containing ING tumor suppressor family consist of chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Its PHD finger binding to H3 trimethylated at K4 (H3K4me3) promotes NuA3 H3 HAT activity at K14 of H3 on chromatin. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of the Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. All family members contain an N-terminal ING histone-binding domain and a C-terminal PHD finger. 47
33183 277063 cd15588 PHD1_KMT2A PHD finger 1 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger. 47
33184 277064 cd15589 PHD1_KMT2B PHD finger 1 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger. 47
33185 277065 cd15590 PHD2_KMT2A PHD finger 2 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger. 50
33186 277066 cd15591 PHD2_KMT2B PHD domain 2 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger. 50
33187 277067 cd15592 PHD3_KMT2A PHD finger 3 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger. 57
33188 277068 cd15593 PHD3_KMT2B PHD finger 3 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger. 57
33189 277069 cd15594 PHD2_KMT2C PHD finger 2 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the second PHD finger. 46
33190 277070 cd15595 PHD2_KMT2D PHD finger 2 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the second PHD finger. 46
33191 277071 cd15596 PHD4_KMT2C PHD finger 4 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the fourth PHD finger. 57
33192 277072 cd15597 PHD3_KMT2D PHD finger 3 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the third PHD finger. 51
33193 277073 cd15600 PHD6_KMT2C PHD finger 6 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the sixth PHD finger. 51
33194 277074 cd15601 PHD5_KMT2D PHD finger 5 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is downregulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fifth PHD finger. 51
33195 277075 cd15602 PHD1_KDM5A PHD finger 1 found in Lysine-specific demethylase 5A (KDM5A). KDM5A (also termed Histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2)) was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger. 49
33196 277076 cd15603 PHD1_KDM5B PHD finger 1 found in lysine-specific demethylase 5B (KDM5B). KDM5B (also termed Cancer/testis antigen 31 (CT31), Histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A)) is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of pregnant females and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger. 46
33197 277077 cd15604 PHD1_KDM5C_5D PHD finger 1 found in Lysine-specific demethylase 5C (KDM5C) and 5D (KDM5D). The family includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C (also termed Histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, SmcX, or Xe169) is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D (also termed Histocompatibility Y antigen (H-Y), Histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or SmcY) is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger. 46
33198 277078 cd15605 PHD1_Lid_like PHD finger 1 found in Drosophila melanogaster protein little imaginal discs (Lid) and similar proteins. Drosophila melanogaster Lid, also termed Retinoblastoma-binding protein 2 homolog, is identified genetically as a trithorax group (trxG) protein that is a Drosophila homolog of the human protein JARID1A/kdm5A, a member of the JARID subfamily within the JmjC proteins. Lid functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Lid contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger of Lid. 46
33199 277079 cd15606 PHD2_KDM5A PHD finger 2 found in Lysine-specific demethylase 5A (KDM5A). KDM5A (also termed Histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2)) was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger. 56
33200 277080 cd15607 PHD2_KDM5B PHD finger 2 found in lysine-specific demethylase 5B (KDM5B). KDM5B (also termed Cancer/testis antigen 31 (CT31), Histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), or PLU-1) is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger. 44
33201 277081 cd15608 PHD2_KDM5C_5D PHD finger 2 found in Lysine-specific demethylase 5C (KDM5C) and 5D (KDM5D). The family includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C (also termed Histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, SmcX, or Xe169) is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D (also termed Histocompatibility Y antigen (H-Y), Histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or SmcY) is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger. 58
33202 277082 cd15609 PHD_TCF19 PHD finger found in Transcription factor 19 (TCF-19) and similar proteins. TCF-19, also termed transcription factor SC1, was identified as a putative trans-activating factor with expression beginning at the late G1-S boundary in dividing cells. It also functions as a novel islet factor necessary for proliferation and survival in the INS-1 beta cell line. It plays an important role in susceptibility to both Type 1 Diabetes Mellitus (T1DM) and Type 2 Diabetes Mellitus (T2DM); it has been suggested that it may positively impact beta cell mass under conditions of beta cell stress and increased insulin demand. TCF-19 contains an N-terminal fork head association domain (FHA), a proline rich region, and a C-terminal plant homeodomain (PHD) finger. The FHA domain may serve as a nuclear signaling domain or as a phosphoprotein binding domain. The proline rich region is a common characteristic of trans-activating factors. The PHD finger may allow TCF-19 to interact with chromatin via methylated histone H3. 50
33203 277083 cd15610 PHD3_KDM5A_like PHD finger 3 found in Lysine-specific demethylase 5A (KDM5A), 5B (KDM5B), and similar proteins. The family includes KDM5A and KDM5B, both of which belong to the JARID subfamily within the JmjC proteins. KDM5A, also termed Histone demethylase JARID1A, or Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as the trimethylated histone H3 lysine 4 (H3K4me3) demethylase. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5B, also termed Cancer/testis antigen 31 (CT31), or Histone demethylase JARID1B, or Jumonji/ARID domain-containing protein 1B (JARID1B), or PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. The family also includes the Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger. 50
33204 277084 cd15612 PHD_OBE1_like PHD finger found in Arabidopsis thaliana protein OBERON 1, OBERON 2, and similar proteins mainly found in plants. Included in this family are OBERON 1 (OBE1, or potyvirus VPg-interacting protein 2) and OBERON 2 (OBE2, or potyvirus VPg-interacting protein 1), which have been involved in the maintenance and/or establishment of the meristems in Arabidopsis. They interact with potyvirus VPg-interacting proteins (PVIP1 and 2) and act as central regulators in auxin-mediated control of development. Both OBE1 and OBE2 contain a plant homeodomain (PHD) finger. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. 60
33205 277085 cd15613 PHD_AL_plant PHD finger found in plant Alfin1-like (AL) proteins. AL proteins are ubiquitously expressed nuclear proteins existing only in plants. They are involved in chromatin regulation by binding to tri- and dimethylated histone H3 at lysine 4 (H3K4me3/2), the active histone markers, through their plant homeodomain (PHD) fingers. 51
33206 277086 cd15614 PHD_HAC_like PHD finger found in Arabidopsis thaliana histone acetyltransferases (HATs) HAC and similar proteins. This family includes A. thaliana HACs (HAC1/2/4/5/12), which are histone acetyltransferases of the p300/CREB-binding protein (CBP) co-activator family. CBP-type HAT proteins are also found in animals, but absent in fungi. The domain architecture of CBP-type HAT proteins differs between plants and animals. Members in this family contain an N-terminal partially conserved KIX domain, a Zf-TAZ domain, a Cysteine rich CBP-type HAT domain that harbors a plant homeodomain (PHD) finger, a Zf-ZZ domain, and a Zf-TAZ domain. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. 73
33207 277087 cd15615 PHD_ARID4_like PHD finger found in Arabidopsis thaliana AT-rich interactive domain-containing protein 4 (ARID4) and similar proteins. This family includes A. thaliana ARID4 (ARID domain-containing protein 4) and similar proteins. Their biological roles remain unclear, but they all contain an AT-rich interactive domain (ARID) and a plant homeodomain (PHD) finger at the C-terminus. ARID is a helix-turn-helix motif-based DNA-binding domain conserved in all eukaryotes. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. 57
33208 277088 cd15616 PHD_UHRF1 PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1 (also termed inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1) is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET and RING finger associated (SRA) domain, and a C-terminal RING-finger domain. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintaining DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD finger targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. 47
33209 277089 cd15617 PHD_UHRF2 PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2 (also termed Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2) was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. 47
33210 277090 cd15618 PHD1_MOZ_MORF PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ) and its factor (MORF). MOZ (also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), runt-related transcription factor-binding protein 2, or zinc finger protein 220) is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF), also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains a linker histone 1 and histone 5 domains and two plant homeodomain (PHD) fingers. In contrast, MORF contains an N-terminal region containing two PHD fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. The model corresponds to the first PHD finger. 58
33211 277091 cd15619 PHD1_d4 PHD finger 1 found in d4 gene family proteins. The family includes proteins coded by three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. The d4 family proteins show distinct domain organization with domain 2/3 in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc finger in the central part and two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the first PHD finger. 56
33212 277092 cd15622 PHD_TIF1alpha PHD finger found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also termed tripartite motif-containing protein 24 (TRIM24), or E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the TRIM/RBCC protein family. It interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta) and MOD2 (HP1gamma), as well as vertebrate Kruppel-type (C2H2) zinc finger proteins that contain transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif, and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53 and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal plant homeodomain (PHD)-Bromodomain (Bromo) region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development. TIF1-alpha contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region. 43
33213 277093 cd15623 PHD_TIF1beta PHD finger found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also termed Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), or KRAB-interacting protein 1 (KRIP-1), or nuclear co-repressor KAP-1, or RING finger protein 96, or tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, acts as a nuclear co-repressor that plays a role in transcription and in DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. In addition, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. TIF1-beta contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), which can interact with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and mediates homo- and heterodimerization, a plant homeodomain (PHD) finger followed by a bromodomain in the C-terminal region, which interact with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity. 43
33214 277094 cd15624 PHD_TIF1gamma PHD finger found in transcriptional intermediary factor 1 gamma (TIF1gamma). TIF1gamma, also termed tripartite motif-containing 33 (trim33), or ectodermin, or RFG7, or PTC7, is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling; it inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key factor of tumorigenesis. Like other TIF1 family members, TIF1gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis. TIF1gamma contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region. 46
33215 277095 cd15625 PHD_TIF1delta PHD finger found in transcriptional intermediary factor 1 delta (TIF1delta). TIF1delta, also termed tripartite motif-containing protein 66 (TRIM66), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family expressed by elongating spermatids. Like other TIF1 proteins, TIF1delta displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TIF1delta plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TIF1delta contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region. 49
33216 277096 cd15626 PHD_SP110_140 PHD finger found in the Sp100/Sp140 family of nuclear body components. The Sp100/Sp140 family includes nuclear body proteins SP100, SP140, and similar proteins. Sp110, also termed interferon-induced protein 41/75, or speckled 110 kDa, or transcriptional coactivator Sp110, is a leukocyte-specific component of the nuclear body. It may function as a nuclear hormone receptor transcriptional coactivator that may play a role in inducing differentiation of myeloid cells. It is also involved in resisting intracellular pathogens and functions as an important drug target for preventing intracellular pathogen diseases, such as tuberculosis, hepatic veno-occlusive disease, and intracellular cancers. Sp110 gene polymorphisms may be associated with susceptibility to tuberculosis in Chinese population. Sp110 contains a Sp100-like domain, a SAND domain, a plant homeodomain (PHD) finger, and a bromodomain (BRD). SP140, also termed lymphoid-restricted homolog of Sp100 (LYSp100), or nuclear autoantigen Sp-140, or speckled 140 kDa, is an interferon inducible nuclear leukocyte-specific protein involved in primary biliary cirrhosis and a risk factor in chronic lymphocytic leukemia. It is also implicated in innate immune response to human immunodeficiency virus type 1 (HIV-1) by binding to the virus's viral infectivity factor (Vif) protein. Sp140 contains a nuclear localization signal, a dimerization domain (HSR or CARD domain), a SAND domain, a PHD finger, and a BRD. 42
33217 277097 cd15627 PHD_BAZ1A PHD finger found in bromodomain adjacent to zinc finger domain protein 1A (BAZ1A). BAZ1A, also termed ATP-dependent chromatin-remodeling protein, or ATP-utilizing chromatin assembly and remodeling factor 1 (ACF1), or CHRAC subunit ACF1, or Williams syndrome transcription factor-related chromatin-remodeling factor 180 (WCRF180), or WALp1, is a subunit of the conserved imitation switch (ISWI)-family ATP-dependent chromatin assembly and remodeling factor (ACF)/chromatin accessibility complex (CHRAC) chromatin remodeling complex, which is required for DNA replication through heterochromatin. It alters the remodeling properties of the ATPase motor protein sucrose nonfermenting-2 homolog (SNF2H). Moreover, BAZ1A and its complexes play important roles in DNA double-strand break (DSB) repair. It is essential for averting improper gene expression during spermatogenesis. It also regulates transcriptional repression of vitamin D3 receptor-regulated genes. BAZ1A contains a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. 46
33218 277098 cd15628 PHD_BAZ1B PHD finger found in bromodomain adjacent to zinc finger domain protein 1B (BAZ1B). BAZ1B, also termed Tyrosine-protein kinase BAZ1B, or Williams syndrome transcription factor (WSTF), or Williams-Beuren syndrome chromosomal region 10 protein, or Williams-Beuren syndrome chromosomal region 9 protein, or WALp2, is a multifunctional protein implicated in several nuclear processes, including replication, transcription, and the DNA damage response. BAZ1B/WSTF, together with the imitation switch (ISWI) ATPase, forms a WSTF-ISWI chromatin remodeling complex (WICH), which transiently associates with the human inactive X chromosome (Xi) during late S-phase prior to BRCA1 and gamma-H2AX. Moreover, BAZ1B/WSTF, SNF2h, and nuclear myosin 1 (NM1) form the chromatin remodeling complex B-WICH that is involved in regulating rDNA transcription. BAZ1B contains a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. 46
33219 277099 cd15629 PHD_BAZ2A PHD finger found in bromodomain adjacent to zinc finger domain protein 2A (BAZ2A). BAZ2A, also termed transcription termination factor I-interacting protein 5 (TTF-I-interacting protein 5, or Tip5), or WALp3, is an epigenetic regulator. It has been implicated in epigenetic rRNA gene silencing, as the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2A has also been shown to be broadly overexpressed in prostate cancer, to regulate numerous protein-coding genes and to cooperate with EZH2 (enhancer of zeste homolog 2) to maintain epigenetic silencing at genes repressed in prostate cancer metastasis. Its overexpression is tightly associated with a prostate cancer subtype displaying CpG island methylator phenotype (CIMP) in tumors and with prostate cancer recurrence in patients. It contains a TAM (TIP5/ARBP/MBD) domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. 47
33220 277100 cd15630 PHD_BAZ2B PHD finger found in bromodomain adjacent to zinc finger domain protein 2B (BAZ2B). BAZ2B, also termed WALp4, is a bromodomain-containing protein whose biological role is still elusive. It shows high sequence similarity with BAZ2A, which is the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2B contains a TAM (TIP5/ARBP/MBD) domain, an Apolipophorin-III like domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. 49
33221 277101 cd15631 PHD_PHF23 PHD finger found in PHD finger protein 23 (PHF23). PHF23, also termed PHD-containing protein JUNE-1, is a hypothetical protein with a plant homeodomain (PHD) finger. It is encoded by gene PHF23 that acts as a candidate fusion partner for the nucleoporin gene NUP98. The NUP98-PHF23 fusion results from a cryptic translocation t(11;17)(p15;p13) in acute myeloid leukemia (AML). 44
33222 277102 cd15632 PHD_PHF13 PHD finger found in PHD finger protein 13 (PHF13). PHF13, also termed survival time-associated PHD finger protein in ovarian cancer 1 (SPOC1), is a novel plant homeodomain (PHD) finger-containing protein that shows strong expression in spermatogonia and ovarian cancer cells, modulates chromatin structure and mitotic chromosome condensation, and is important for proper cell division. It is also required for spermatogonial stem cell differentiation and sustained spermatogenesis. The overexpression of PHF13 associates with unresectable carcinomas and shorter survival in ovarian cancer. 47
33223 277103 cd15633 PHD_PHF20L1 PHD finger found in PHD finger protein 20-like protein 1 (P20L1). P20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (Cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. In addition to the MBT domain, PHF20L1 also contains two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger. 46
33224 277104 cd15634 PHD_PHF20 PHD finger found in PHD finger protein 20 (PHF20). PHF20, also termed Glioma-expressed antigen 2, or hepatocellular carcinoma-associated antigen 58, or novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53 mediated signaling. PHF20 contains an N-terminal malignant brain tumor (MBT) domain, two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger. 44
33225 277105 cd15635 PHD_PYGO1 PHD finger found in pygopus homolog 1 (PYGO1). PYGO1 is a homolog of Drosophila melanogaster protein pygopus (dPYGO), which is a fundamental Wnt signaling transcriptional component in Drosophila. It functions as a context-dependent beta-catenin coactivator, and binds di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). PYGO1 is essential for the association with Legless (Lgs)/Bcl9 that acts as an adaptor between Pygopus (Pygo) and Arm/beta-catenin. PYGO1 contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway. 57
33226 277106 cd15636 PHD_PYGO2 PHD finger found in pygopus homolog 2 (PYGO2). PYGO2 is a homolog of Drosophila melanogaster protein pygopus (dPYGO), which is a fundamental Wnt signaling transcriptional component in Drosophila. It functions as a context-dependent beta-catenin coactivator, as well as a histone methylation reader that binds di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). Moreover, PYGO2 acts as a chromatin remodeler in a testis-specific and Wnt-unrelated manner. It also mediates chromatin regulation and links Wnt signaling and Notch signaling to suppress the luminal/alveolar differentiation competence of mammary stem and basal cells. Furthermore, PYGO2 plays a new role in rRNA transcription during cancer cell growth. It regulates mammary tumor initiation and heterogeneity in MMTV-Wnt1 mice. PYGO2 contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway. 54
33227 277107 cd15637 PHD_dPYGO PHD finger found in Drosophila melanogaster protein pygopus (dPYGO) and similar proteins. dPYGO, also termed protein gammy legs, is a nuclear adapter protein encoded by pygopus (pygo). It is a fundamental Wnt signaling transcriptional component in Drosophila, and has both Wnt-related and Wnt-independent functions. It plays a critical role in aging-related cardiac dysfunction that is canonical Wnt signaling independent. dPYGO contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway. 54
33228 277108 cd15638 PHD_PHF3 PHD finger found in PHD finger protein 3 (PHF3). PHF3 is a human homolog of yeast protein bypass of Ess1 (Bye1), a nuclear protein with a domain resembling the central domain in the transcription elongation factor TFIIS. It is ubiquitously expressed in normal tissues including brain, but its expression is significantly reduced or lost in glioblastomas. PHF3 contains an N-terminal plant homeodomain (PHD) finger, a central RNA polymerase II (Pol II)-binding TFIIS-like domain (TLD), and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain. 51
33229 277109 cd15639 PHD_DIDO1_like PHD finger found in death-inducer obliterator variants Dido1, Dido2, and Dido3. This family includes three alternative splicing variants (Dido1, 2, and 3) encoded by the Dido gene, which have been implicated in a number of cellular processes such as apoptosis and chromosomal segregation, particularly in the hematopoietic system. Dido1, also termed DIO-1, or death-associated transcription factor 1 (DATF-1), is important for maintaining embryonic stem (ES) cells and directly regulates the expression of pluripotency factors. It is the shortest isoform that contains only a highly conserved plant homeodomain (PHD) finger responsible for the binding of histone H3 with a higher affinity for trimethylated lysine 4 (H3K4me3). Gene Dido is a bone morphogenetic protein (BMP) target gene, which promotes BMP-induced melanoma progression. It also triggers apoptosis after nuclear translocation and caspase upregulation. Dido3 is the largest isoform ubiquitously expressed in all human tissues. It is dispensable for ES cell self-renewal and pluripotency, but involved in the maintenance of stem cell genomic stability and tumorigenesis. Dido3 contains a PHD finger, a transcription elongation factor S-II subunit M (TFSIIM) domain, a Spen paralog and ortholog (SPOC) module, and a long C-terminal region (CT) of unknown homology. Its PHD finger interacts with H3K4me3. 54
33230 277110 cd15640 PHD_KDM7 PHD finger found in lysine-specific demethylase 7 (KDM7). KDM7, also termed JmjC domain-containing histone demethylation protein 1D (JHDM1D), or KIAA1718, is a dual histone demethylase that catalyzes demethylation of monomethylated and dimethylated H3K9 (H3K9me2/me1) and H3K27 (H3K27me2/me1), which functions as an eraser of silencing marks on chromatin during brain development. It also plays a tumor-suppressive role by regulating angiogenesis. KDM7 contains a plant homeodomain (PHD) that binds Lys4-trimethylated histone 3 (H3K4me3) and a jumonji domain that demethylates either H3K9me2 or H3K27me2. 50
33231 277111 cd15641 PHD_PHF2 PHD finger found in lysine-specific demethylase PHF2. PHF2, also termed GRC5, or PHD finger protein 2, is a histone lysine demethylase ubiquitously expressed in various tissues. It contains a plant homeodomain (PHD) finger and a JmjC domain and plays an important role in adipogenesis. The PHD finger domain can recognize trimethylated histone H3 lysine 4 (H3K4me3). PHF2 also has dimethylated histone H3 lysine 9 (H3K9me2) demethylase activity and acts as a coactivator of several metabolism-related transcription factors. Moreover, it can demethylate ARID5B and further forms a complex with demethylated ARID5B to bind the promoter regions of target genes. The overexpression of PHF2 is involved in the progression of esophageal squamous cell carcinoma (ESCC). 50
33232 277112 cd15642 PHD_PHF8 PHD finger found in histone lysine demethylase PHF8. PHF8, also termed PHD finger protein 8, or KDM7B, is a monomethylated histone H4 lysine 20 (H4K20me1) demethylase that transcriptionally regulates many cell cycle genes. It also preferentially acts on H3K9me2 and H3K9me1. PHF8 is modulated by CDC20-containing anaphase-promoting complex (APC (cdc20)) and plays an important role in the G2/M transition. It acts as a critical molecular sensor for mediating retinoic acid (RA) treatment response in RAR alpha-fusion-induced leukemia. Moreover, PHF8 is essential for cytoskeleton dynamics and is associated with X-linked mental retardation. PHF8 contains an N-terminal plant homeodomain (PHD) finger followed by a JmjC domain. The PHD finger mediates binding to nucleosomes at active gene promoters and the JmjC domain catalyzes the demethylation of mono- or dimethyl-lysines. 52
33233 277113 cd15643 PHD_KDM2A PHD finger found in Lysine-specific demethylase 2A (KDM2A). KDM2A, also termed CXXC-type zinc finger protein 8, or F-box and leucine-rich repeat protein 11 (FBXL11), or F-box protein FBL7, or F-box protein Lilina, or F-box/LRR-repeat protein 11, or JmjC domain-containing histone demethylation protein 1A (Jhdm1a), or [Histone-H3]-lysine-36 demethylase 1A, is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. It acts as a key negative regulator of gluconeogenic gene expression and plays a critical role in the invasiveness, proliferation, and anchorage-independent growth of non-small cell lung cancer (NSCLC) cells, as well as in the osteo/dentinogenic differentiation of Mesenchymal stem cells (MSCs). It regulates rRNA transcription in response to starvation. Meanwhile, it is a negative regulator of NFkappaB. Moreover, KDM2A is a heterochromatin-associated and HP1-interacting protein that promotes HP1 localization to chromatin. It is specifically recruited to CpG islands to define a unique chromatin architecture, which requires direct and specific interaction with linker DNA. It also functions as a H3K4 demethylase that regulates cell proliferation through p15 (INK4B) and p27 (Kip1) in stem cells from apical papilla (SCAPs). KDM2A belongs to the JmjC-domain-containing histone demethylase family. KDM2A consists of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain. 57
33234 277114 cd15644 PHD_KDM2B PHD finger found in Lysine-specific demethylase 2B (KDM2B). KDM2B, also termed Ndy1, or CXXC-type zinc finger protein 2, or F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), or F-box protein FBL10, or JmjC domain-containing histone demethylation protein 1B (Jhdm1b), or Jumonji domain-containing EMSY-interactor methyltransferase motif protein (Protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B, is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). Moreover, it acts as an oncogene that plays a critical role in leukemia development and maintenance. KDM2B consists of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain. 62
33235 277115 cd15645 PHD_FXL19 PHD finger found in F-box and leucine-rich repeat protein 19 (FBXL19). FBXL19, also termed F-box/LRR-repeat protein 19, is a novel homolog of KDM2A and KDM2B. It belongs to the Skp1-Cullin-F-box (SCF) family of E3 ubiquitin ligases. FBXL19 mediates ubiquitination and interleukin 33 (IL-33)-induced degradation of ST2L receptor in lung epithelia, blocks IL-33-mediated apoptosis, and prevents endotoxin-induced acute lung injury. It also functions as a RhoA antagonist during cell proliferation and cytoskeleton rearrangement, and regulates RhoA ubiquitination and degradation in lung epithelial cells. Moreover, FBXL19 regulates cell migration by targeting Rac1 for its polyubiquitination and proteasomal degradation. It plays an essential role in regulating TGFbeta1-induced E-cadherin down-regulation by mediating Rac3 site-specific ubiquitination and stability. FBXL19 consists of FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain. 62
33236 277116 cd15646 PHD_p300 PHD finger found in histone acetyltransferase p300. p300, also termed KAT3B, or E1A-associated protein p300 (EP300), is a paralog of CREB-binding protein (CBP). It is involved in E1A function in cell cycle progression and cellular differentiation. It functions as an intrinsic HAT, as well as a factor acetyltransferase (FAT) for many transcription regulators. Thus, p300 serves as a scaffold or bridge for transcription factors and other components of the basal transcription machinery to facilitate chromatin remodeling and to activate gene transcription. p300 contains a cysteine-histidine rich region, a KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain. 40
33237 277117 cd15647 PHD_CBP PHD finger found in CREB-binding protein (CBP). CBP, also termed KAT3A, is an acetyltransferase acting on histone, which gives a specific tag for transcriptional activation and also acetylates non-histone proteins. CBP is also known as CREBBP, since it specifically interacts with the phosphorylated form of cyclic adenosine monophosphate-responsive element-binding protein (CREB). It augments the activity of phosphorylated CREB to activate transcription of cAMP-responsive genes. CBP contains a cysteine-histidine rich region, a KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain. 40
33238 277118 cd15648 PHD1_NSD1_2 PHD finger 1 found in nuclear receptor-binding SET domain-containing protein NSD1 and NSD2. NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). Both NSD1 and NSD2 contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). In addition, NSD2 harbors a high mobility group (HMG) box. The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the first PHD finger. 43
33239 277119 cd15649 PHD1_NSD3 PHD finger 1 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the first PHD finger. 44
33240 277120 cd15650 PHD2_NSD1 PHD finger 2 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (C5HCH). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the second PHD finger. 47
33241 277121 cd15651 PHD2_NSD2 PHD finger 2 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the second PHD finger. 47
33242 277122 cd15652 PHD2_NSD3 PHD finger 2 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the second PHD finger. 47
33243 277123 cd15653 PHD3_NSD1 PHD finger 3 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the third PHD finger. 54
33244 277124 cd15654 PHD3_NSD2 PHD finger 3 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the third PHD finger. 54
33245 277125 cd15655 PHD3_NSD3 PHD finger 3 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the third PHD finger. 53
33246 277126 cd15656 PHD4_NSD1 PHD finger 4 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fourth PHD finger. 40
33247 277127 cd15657 PHD4_NSD2 PHD finger 4 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant-homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the fourth PHD finger. 41
33248 277128 cd15658 PHD4_NSD3 PHD finger 4 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fourth PHD finger. 40
33249 277129 cd15659 PHD5_NSD1 PHD finger 5 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fifth PHD finger. 43
33250 277130 cd15660 PHD5_NSD2 PHD finger 5 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5, or Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant-homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the fifth PHD finger. 43
33251 277131 cd15661 PHD5_NSD3 PHD finger 5 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fifth PHD finger. 43
33252 277132 cd15662 ePHD_ATX1_2_like Extended PHD finger found in Arabidopsis thaliana ATX1, -2, and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of A. thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like proteins ATX1, -2, and similar proteins. ATX1 and -2 are sister paralogs originating from a segmental chromosomal duplication; they are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1 (also known as protein SET domain group 27, or trithorax-homolog protein 1/TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2 (also known as protein SET domain group 30, or trithorax-homolog protein 2/TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). ATX1 and ATX2 are multi-domain proteins that consist of an N-terminal PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical PHD finger, this non-canonical ePHD finger, and a C-terminal SET domain. 115
33253 277133 cd15663 ePHD_ATX3_4_5_like Extended PHD finger found in Arabidopsis thaliana ATX3, -4, -5, and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of A. thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like proteins ATX3 (also termed protein SET domain group 14, or trithorax-homolog protein 3), ATX4 (also termed protein SET domain group 16, or trithorax-homolog protein 4) and ATX5 (also termed protein SET domain group 29, or trithorax-homolog protein 5), which belong to the histone-lysine methyltransferase family. These proteins show distinct phylogenetic origins from the family of ATX1 and ATX2. They are multi-domain proteins that consist of an N-terminal PWWP domain, a canonical PHD finger, this non-canonical extended PHD finger, and a C-terminal SET domain. 112
33254 277134 cd15664 ePHD_KMT2A_like Extended PHD finger found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A/MLL1 and KMT2B/MLL2. KMT2A and KMT2B comprise the mammalian Trx branch of COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation by mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three PHD fingers, this extended PHD (ePHD) finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. 105
33255 277135 cd15665 ePHD1_KMT2C_like Extended PHD finger 1 found in histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2C and KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. 90
33256 277136 cd15666 ePHD2_KMT2C_like Extended PHD finger 2 found in histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2C and KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. 105
33257 277137 cd15667 ePHD_Snt2p_like Extended PHD finger found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Snt2p. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical PHD fingers, a non-canonical ePHD finger, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain. 141
33258 277138 cd15668 ePHD_RAI1_like Extended PHD finger found in retinoic acid-induced protein 1 (RAI1), transcription factor 20 (TCF-20) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of RAI1 and TCF-20. RAI1, a homolog of stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPBP, also termed TCF-20), is a chromatin-binding protein implicated in the regulation of gene expression. TCF-20 is involved in transcriptional activation of the MMP3 (matrix metalloprotease 3) promoter. It also functions as a transcriptional co-regulator that enhances or represses the transcriptional activity of certain transcription factors/cofactors, such as specificity protein 1 (Sp1), E twenty-six 1 (Ets1), paired box protein 6 (Pax6), small nuclear RING-finger (SNURF)/RNF4, c-Jun, androgen receptor (AR) and estrogen receptor alpha (ERalpha). Both RAI1 and TCF-20 are strongly enriched in chromatin in interphase HeLa cells, and display low nuclear mobility, and have been implicated in Smith-Magenis syndrome and Potocki-Lupski syndrome. 103
33259 277139 cd15669 ePHD_PHF7_G2E3_like Extended PHD finger found in PHD finger protein 7 (PHF7) and G2/M phase-specific E3 ubiquitin-protein ligase (G2E3). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF7 and G2E3. PHF7, also termed testis development protein NYD-SP6, is a testis-specific PHD finger-containing protein that associates with chromatin and binds histone H3 N-terminal tails with a preference for dimethyl lysine 4 (H3K4me2). It may play an important role in stimulating transcription involved in testicular development and/or spermatogenesis. PHF7 contains a PHD finger and a non-canonical ePHD finger, both of which may be involved in activating transcriptional regulation. G2E3 is a dual function ubiquitin ligase (E3) that may play a possible role in cell cycle regulation and the cellular response to DNA damage. It is essential for prevention of apoptosis in early embryogenesis. It is also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization. G2E3 contains two distinct RING-like ubiquitin ligase domains that catalyze lysine 48-linked polyubiquitination, and a C-terminal catalytic HECT domain that plays an important role in ubiquitin ligase activity and in the dynamic subcellular localization of the protein. The RING-like ubiquitin ligase domains consist of a PHD finger and an ePHD finger. 112
33260 277140 cd15670 ePHD_BRPF Extended PHD finger found in BRPF proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the ePHD finger of the family of BRPF proteins, which includes BRPF1, BRD1/BRPF2, and BRPF3. These are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a plant homeodomain (PHD) zinc finger followed by a non-canonical ePHD finger, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. This PHD finger binds to lysine 4 of histone H3 (K4H3), the bromodomain interacts with acetylated lysines on N-terminal tails of histones and other proteins, and the PWWP domain shows histone-binding and chromatin association properties. 116
33261 277141 cd15671 ePHD_JADE Extended PHD finger found in protein Jade-1, Jade-2, Jade-3 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-1 (PHF17), Jade-2 (PHF15), and Jade-3 (PHF16); each of these proteins is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, has reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. This family also contains Drosophila melanogaster PHD finger protein rhinoceros (RNO). It is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating transcription of key EGFR/Ras pathway regulators in the Drosophila eye. All Jade proteins contain a canonical PHD finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs. 112
33262 277142 cd15672 ePHD_AF10_like Extended PHD finger found in protein AF-10 and AF-17. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-10 and AF-17. AF-10, also termed ALL1 (acute lymphoblastic leukaemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukaemia) oncogene in leukaemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with the human counterpart of yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated protein SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as translocation partners of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. Both AF-10 and AF-17 contain an N-terminal plant homeodomain (PHD) finger followed by this non-canonical ePHD finger. The PHD finger is involved in their homo-oligomerization. In the C-terminal region, they possess a leucine zipper domain and a glutamine-rich region. This family also includes ZFP-1, the Caenorhabditis elegans AF10 homolog. It was originally identified as a factor promoting RNAi interference in C. elegans. It also acts as Dot1-interacting protein that opposes H2B ubiquitination to reduce polymerase II (Pol II) transcription. 116
33263 277143 cd15673 ePHD_PHF6_like Extended PHD finger found in PHD finger protein 6 (PHF6) and PHD finger protein 11 (PHF11). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the two ePHD fingers of PHF6 and the single ePHD finger of PHF11. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4. PHF6 contains two non-canonical ePHD fingers. PHF11, also termed BRCA1 C-terminus-associated protein, or renal carcinoma antigen NY-REN-34, is a transcriptional co-activator of the Th1 effector cytokine genes, interleukin-2 (IL2) and interferon-gamma (IFNG), co-operating with nuclear factor kappa B (NF-kappaB). It is involved in T-cell activation and viability. Polymorphisms within PHF11 are associated with total IgE, allergic asthma and eczema. 116
33264 277144 cd15674 ePHD_PHF14 Extended PHD finger found in PHD finger protein 14 (PHF14) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF14. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and this non-canonical ePHD finger. It can interact with histones through its PHD fingers. 114
33265 277145 cd15675 ePHD_JMJD2 Extended PHD finger found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2 proteins. JMJD2 proteins, also termed lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain. JMJD2D is not included in this family, since it lacks both PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth. 112
33266 277146 cd15676 PHD_BRPF1 PHD finger found in bromodomain and PHD finger-containing protein 1 (BRPF1) and similar proteins. BRPF1, also termed peregrin or protein Br140, is a multi-domain protein that binds histones, mediates monocytic leukemic zinc-finger protein (MOZ)-dependent histone acetylation, and is required for Hox gene expression and segmental identity. It is a close partner of the MOZ histone acetyltransferase (HAT) complex and a novel Trithorax group (TrxG) member with a central role during development. BRPF1 is primarily a nuclear protein that has a broad tissue distribution and is abundant in testes and spermatogonia. It contains a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. BRPF1 may be involved in chromatin remodeling. This model corresponds to the canonical Cys4HisCys3 PHD finger. 62
33267 277147 cd15677 PHD_BRPF2 PHD finger found in bromodomain and PHD finger-containing protein 2 (BRPF2) and similar proteins. BRPF2, also termed bromodomain-containing protein 1 (BRD1), or BR140-like protein, is encoded by BRL (BR140 Like gene). It is responsible for the bulk of the acetylation of H3K14 and forms a novel monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with HBO1 and ING4. The complex is required for full transcriptional activation of the erythroid-specific regulator genes essential for terminal differentiation and survival of erythroblasts in fetal liver. BRPF2 shows widespread expression and localizes to the nucleus within spermatocytes. It contains a cysteine rich region harboring a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. This model corresponds to the canonical Cys4HisCys3 PHD finger. 54
33268 277148 cd15678 PHD_BRPF3 PHD finger found in bromodomain and PHD finger-containing protein 3 (BRPF3) and similar proteins. BRPF3 is a homolog of BRPF1 and BRPF2. It is a scaffold protein that forms a novel monocytic leukemic zinc finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with other regulatory subunits. BRPF3 contains a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. This model corresponds to the canonical Cys4HisCys3 PHD finger. 55
33269 277149 cd15679 PHD_JADE1 PHD finger found in protein Jade-1 and similar proteins. Jade-1, also termed PHD finger protein 17 (PHF17), is a novel binding partner of von Hippel-Lindau (VHL) tumor suppressor pVHL, a key regulator of the cellular oxygen sensing pathway. It is highly expressed in renal proximal tubules. Jade-1 functions as an essential regulator of multiple cell signaling pathways. It may be involved in the serine/threonine kinase AKT/AKT1 pathway during renal cancer pathogenesis and normally prevents renal epithelial cell proliferation and transformation. It also acts as a pro-apoptotic and growth suppressive ubiquitin ligase to inhibit canonical Wnt downstream effector beta-catenin for proteasomal degradation, and as a transcription factor associated with histone acetyltransferase activity and with increased abundance of cyclin-dependent kinase inhibitor p21. Moreover, Jade-1 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex, and plays a role in epithelial cell regeneration. It has also been identified as a novel component of the nephrocystin protein (NPHP) complex and interacts with the ciliary protein nephrocystin-4 (NPHP4). Jade-1 contains a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger. 46
33270 277150 cd15680 PHD_JADE2 PHD finger found in protein Jade-2 and similar proteins. Jade-2, also termed PHD finger protein 15 (PHF15), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-2 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-2 contains a canonical Cys4HisCys3 PHD finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger. 46
33271 277151 cd15681 PHD_JADE3 PHD finger found in protein Jade-3 and similar proteins. Jade-3, also termed PHD finger protein 16 (PHF16), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-3 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-3 contains a canonical Cys4HisCys3 PHD domain followed by a non-canonical extended PHD (ePHD) domain, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger. 50
33272 277152 cd15682 PHD_ING1 PHD finger found in inhibitor of growth protein 1 (ING1). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs Growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for the cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulate its activity. ING1 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger. 49
33273 277153 cd15683 PHD_ING2 PHD finger found in inhibitor of growth protein 2 (ING2). ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. ING2 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger. 49
33274 277154 cd15684 PHD_ING4 PHD finger found in inhibitor of growth protein 4 (ING4). ING4, also termed p29ING4, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). In addition, ING4 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING4 contains an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger. 48
33275 277155 cd15685 PHD_ING5 PHD finger found in inhibitor of growth protein 5 (ING5). ING5, also termed p28ING5, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. In addition, ING5 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING5 contains an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger. 49
33276 277156 cd15686 PHD3_KDM5A PHD finger 3 found in Lysine-specific demethylase 5A (KDM5A). KDM5A, also termed Histone demethylase JARID1A, or Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger. 52
33277 277157 cd15687 PHD3_KDM5B PHD finger 3 found in lysine-specific demethylase 5B (KDM5B). KDM5B, also termed Cancer/testis antigen 31 (CT31), or Histone demethylase JARID1B, or Jumonji/ARID domain-containing protein 1B (JARID1B), or PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger. 50
33278 277158 cd15688 PHD1_MOZ PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ). MOZ, also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), or runt-related transcription factor-binding protein 2, or zinc finger protein 220, is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ and MOZ-related factor (MORF) are catalytic subunits of histone acetyltransferase (HAT) complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and implicated in human leukemias. It is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains a linker histone 1 and histone 5 domains and two plant homeodomain (PHD) fingers. The model corresponds to the first PHD finger. 59
33279 277159 cd15689 PHD1_MORF PHD finger 1 found in monocytic leukemia zinc finger protein-related factor (MORF). MORF, also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic histone acetyltransferase (HAT) activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. MORF and monocytic leukemia zinc-finger protein (MOZ) are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MORF contains an N-terminal region containing two plant homeodomain (PHD) fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. The model corresponds to the first PHD finger. 59
33280 277160 cd15690 PHD1_DPF1 PHD finger 1 found in D4, zinc and double PHD fingers family 1 (DPF1). DPF1, also termed zinc finger protein neuro-d4, or BRG1-associated factor 45B (BAF45B), is encoded by a neuro-specific gene, neuro-d4. It may be involved in the transcription regulation of neuro-specific gene clusters. DPF1 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central region, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the first PHD finger. 58
33281 277161 cd15691 PHD1_DPF2_like PHD finger 1 found in D4, zinc and double PHD fingers family 2 (DPF2). DPF2 (also termed zinc finger protein ubi-d4, apoptosis response zinc finger protein, BRG1-associated factor 45D (BAF45D), or protein requiem) is a transcription factor that is encoded by the ubiquitously expressed gene, ubi-d4, and may be involved in leukemia or other cancers with other genes connected with cancer. It recognizes acetylated histone H3 and suppresses the function of estrogen-related receptor alpha (ERRalpha) through histone deacetylase 1 (HDAC1). Moreover, DPF2 functions as a linker protein between the SWI/SNF complex and RelB/p52 NF-kappaB heterodimer and plays important roles in NF-kappaB transactivation via its non-canonical pathway. It is also required as a transcriptional coactivator in SWI/SNF complex-dependent activation of NF-kappaB RelA/p50 heterodimer. DPF2 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central region, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This subfamily also includes DPF3 from zebrafish. This model describes the first PHD finger. 56
33282 277162 cd15692 PHD1_DPF3 PHD finger 1 found in D4, zinc and double PHD fingers family 3 (DPF3). DPF3, also termed BRG1-associated factor 45C (BAF45C), or zinc finger protein cer-d4, is encoded by a neuro-specific gene, cer-d4. It functions as a new epigenetic key factor for heart and muscle development and may be involved in the transcription regulation of neuro-specific gene clusters. It interacts with the BAF chromatin remodeling complex and binds methylated and acetylated lysine residues of histone 3 and 4. DPF3 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central region, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the first PHD finger. 57
33283 277163 cd15693 ePHD_KMT2A Extended PHD finger found in histone-lysine N-methyltransferase 2A (KMT2A). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of KMT2A. KMT2A, also termed ALL-1, or CXXC-type zinc finger protein 7, or myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), or trithorax-like protein (Htrx), or zinc finger protein HRX, is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, a bromodomain, this extended PHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. 113
33284 277164 cd15694 ePHD_KMT2B Extended PHD finger found in histone-lysine N-methyltransferase 2B (KMT2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of KMT2B. KMT2B is also called trithorax homolog 2 or WW domain-binding protein 7 (WBP-7). KMT2B is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation by mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, this ePHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. 105
33285 277165 cd15695 ePHD1_KMT2D Extended PHD finger 1 found in histone-lysine N-methyltransferase 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2D. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 at Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. 90
33286 277166 cd15696 ePHD1_KMT2C Extended PHD finger 1 found in histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2C. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. 90
33287 277167 cd15697 ePHD2_KMT2C Extended PHD finger 2 found in histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2C. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. 105
33288 277168 cd15698 ePHD2_KMT2D Extended PHD finger 2 found in histone-lysine N-methyltransferase 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2D. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. 107
33289 277169 cd15699 ePHD_TCF20 Extended PHD finger (ePHD) found in transcription factor 20 (TCF-20). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of TCF-20. TCF-20, also termed nuclear factor SPBP, or protein AR1, or stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPRE-binding protein), is involved in transcriptional activation of the MMP3 (matrix metalloprotease 3) promoter. It is strongly enriched on chromatin in interphase HeLa cells, and displays low nuclear mobility, and has been implicated in Smith-Magenis syndrome and Potocki-Lupski syndrome. As a chromatin-binding protein, TCF-20 plays a role in the regulation of gene expression. It also functions as a transcriptional co-regulator that enhances or represses the transcriptional activity of certain transcription factors/cofactors, such as specificity protein 1 (Sp1), E twenty-six 1 (Ets1), paired box protein 6 (Pax6), small nuclear RING-finger (SNURF)/RNF4, c-Jun, androgen receptor (AR) and estrogen receptor alpha (ERalpha). TCF-20 contains an N-terminal transactivation domain, a novel DNA-binding domain with an AT-hook motif, three nuclear localization signals (NLSs) and a C-terminal ePHD/ADD domain. 103
33290 277170 cd15700 ePHD_RAI1 Extended PHD finger (ePHD) found in retinoic acid-induced protein 1 (RAI1). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of RAI1. RAI1, a homolog of stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPBP, also termed TCF-20), is a chromatin-binding protein implicated in the regulation of gene expression. It is strongly enriched on chromatin in interphase HeLa cells, and displays low nuclear mobility, and has been implicated in Smith-Magenis syndrome, Potocki-Lupski syndrome, and non-syndromic autism. RAI1 contains a region with homology to the novel nucleosome-binding region SPBP and an ePHD/ADD domain with ability to bind nucleosomes. 104
33291 277171 cd15701 ePHD_BRPF1 Extended PHD finger found in bromodomain and PHD finger-containing protein 1 (BRPF1) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF1. BRPF1, also termed peregrin, or protein Br140, is a multi-domain protein that binds histones, mediates monocytic leukemic zinc-finger protein (MOZ) -dependent histone acetylation, and is required for Hox gene expression and segmental identity. It is a close partner of the MOZ histone acetyltransferase (HAT) complex and a novel Trithorax group (TrxG) member with a central role during development. BRPF1 is primarily a nuclear protein that has a broad tissue distribution and is abundant in testes and spermatogonia. It contains a plant homeodomain (PHD) zinc finger followed by a non-canonical ePHD finger, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. This PHD finger binds to methylated lysine 4 of histone H3 (H3K4me), the bromodomain interacts with acetylated lysines on N-terminal tails of histones and other proteins, and the PWWP domain shows histone-binding and chromatin association properties. BRPF1 may be involved in chromatin remodeling. 121
33292 277172 cd15702 ePHD_BRPF2 Extended PHD finger found in bromodomain and PHD finger-containing protein 2 (BRPF2) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF2. BRPF2, also termed bromodomain-containing protein 1 (BRD1), or BR140-like protein, is encoded by BRL (BR140 Like gene). It is responsible for the bulk of the acetylation of H3K14 and forms a novel monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with HBO1 and ING4. The complex is required for full transcriptional activation of the erythroid-specific regulator genes essential for terminal differentiation and survival of erythroblasts in fetal liver. BRPF2 shows widespread expression and localizes to the nucleus within spermatocytes. It contains a cysteine-rich region harboring a plant homeodomain (PHD) finger followed by a non-canonical ePHD finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. 118
33293 277173 cd15703 ePHD_BRPF3 Extended PHD finger found in bromodomain and PHD finger-containing protein 3 (BRPF3) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF3. BRPF3 is a homolog of BRPF1 and BRPF2. It is a scaffold protein that forms a novel monocytic leukemic zinc finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with other regulatory subunits. BRPF3 contains a plant homeodomain (PHD) finger followed by this non-canonical ePHD finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. 118
33294 277174 cd15704 ePHD_JADE1 Extended PHD finger found in protein Jade-1 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-1. Jade-1, also termed PHD finger protein 17 (PHF17), is a novel binding partner of von Hippel-Lindau (VHL) tumor suppressor pVHL, a key regulator of cellular oxygen sensing pathway. It is highly expressed in renal proximal tubules. Jade-1 functions as an essential regulator of multiple cell signaling pathways. It may be involved in the Serine/threonine kinase AKT/AKT1 pathway during renal cancer pathogenesis and normally prevents renal epithelial cell proliferation and transformation. It also acts as a pro-apoptotic and growth suppressive ubiquitin ligase to inhibit canonical Wnt downstream effector beta-catenin for proteasomal degradation and as a transcription factor associated with histone acetyltransferase activity and with increased abundance of cyclin-dependent kinase inhibitor p21. Moreover, Jade-1 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex, and plays a role in epithelial cell regeneration. It has also been identified as a novel component of the nephrocystin protein (NPHP) complex and interacts with the ciliary protein nephrocystin-4 (NPHP4). Jade-1 contains a canonical plant homeodomain (PHD) finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs. 118
33295 277175 cd15705 ePHD_JADE2 Extended PHD finger found in protein Jade-2 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-2. Jade-2, also termed PHD finger protein 15 (PHF15), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-2 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-2 contains a canonical PHD finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs. 111
33296 277176 cd15706 ePHD_JADE3 Extended PHD finger found in protein Jade-3 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-3. Jade-3, also termed PHD finger protein 16 (PHF16), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-3 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-3 contains a canonical PHD domain followed by this non-canonical ePHD domain, both of which are zinc-binding motifs. 111
33297 277177 cd15707 ePHD_RNO Extended PHD finger found in Drosophila melanogaster PHD finger protein rhinoceros (RNO) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Drosophila melanogaster RNO. RNO is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating the transcription of key EGFR/Ras pathway regulators in the Drosophila eye. RNO contains a canonical PHD domain followed by this non-canonical ePHD domain, both of which are zinc-binding motifs. 113
33298 277178 cd15708 ePHD_AF10 Extended PHD finger found in protein AF-10 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-10. AF-10, also termed ALL1 (acute lymphoblastic leukaemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukaemia) oncogene in leukaemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with human counterpart of the yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated protein SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-10 contains an N-terminal plant homeodomain (PHD) finger followed by this non-canonical ePHD finger. 129
33299 277179 cd15709 ePHD_AF17 Extended PHD finger found in protein AF-17 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-17. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as a translocation partner of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. AF-17 contains an N-terminal plant homeodomain (PHD) finger followed by a non-canonical ePHD finger. 125
33300 277180 cd15710 ePHD1_PHF6 Extended PHD finger 1 found in PHD finger protein 6 (PHF6). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. PHF6 contains two non-canonical ePHD fingers; this model corresponds to the first ePHD finger. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4. 115
33301 277181 cd15711 ePHD2_PHF6 Extended PHD finger 2 found in PHD finger protein 6 (PHF6). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. PHF6 contains two non-canonical ePHD fingers; this model corresponds to the second ePHD finger. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4. 118
33302 277182 cd15712 ePHD_PHF11 Extended PHD finger found in PHD finger protein 11 (PHF11). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF11. PHF11, also termed BRCA1 C-terminus-associated protein, or renal carcinoma antigen NY-REN-34, is a transcriptional co-activator of the Th1 effector cytokine genes, interleukin-2 (IL2) and interferon-gamma (IFNG), co-operating with nuclear factor kappa B (NF-kappaB). It is involved in T-cell activation and viability. Polymorphisms within PHF11 are associated with total IgE, allergic asthma and eczema. 115
33303 277183 cd15713 ePHD_JMJD2A Extended PHD finger (ePHD) found in Jumonji domain-containing protein 2A (JMJD2A). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2A. JMJD2A, also termed lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor co-repressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER) and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain. 110
33304 277184 cd15714 ePHD_JMJD2B Extended PHD finger found in Jumonji domain-containing protein 2B (JMJD2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2B. JMJD2B, also termed lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the trimethyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain. 110
33305 277185 cd15715 ePHD_JMJD2C Extended PHD finger found in Jumonji domain-containing protein 2C (JMJD2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2C. JMJD2C, also termed lysine-specific demethylase 4C (KDM4C), or gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) finger, this non-canonical ePHD finger, and a Tudor domain. 110
33306 277256 cd15716 FYVE_RBNS5 FYVE domain found in FYVE finger-containing Rab5 effector protein rabenosyn-5 (Rbsn-5) and similar proteins. Rbsn-5, also termed zinc finger FYVE domain-containing protein 20, is a novel Rab5 effector that is complexed to the Sec1-like protein VPS45 and recruited in a phosphatidylinositol-3-kinase-dependent fashion to early endosomes. It also binds to Rab4 and EHD1/RME-1, two regulators of the recycling route, and is involved in cargo recycling to the plasma membrane. Moreover, Rbsn-5 regulates endocytosis at the apical side of the wing epithelium and plays a role in the apical endocytic trafficking of Fmi in the establishment of planar cell polarity (PCP). 61
33307 277257 cd15717 FYVE_PKHF FYVE domain found in protein containing both PH and FYVE domains 1 (phafin-1), 2 (phafin-2), and similar proteins. This family includes protein containing both PH and FYVE domains 1 (phafin-1) and 2 (phafin-2). Phafin-1 is a representative of a novel family of PH and FYVE domain-containing proteins called phafins. It is a ubiquitously expressed pro-apoptotic protein via translocating to lysosomes, facilitating apoptosis induction through a lysosomal-mitochondrial apoptotic pathway. Phafin-2 is a ubiquitously expressed endoplasmic reticulum-associated protein that facilitates tumor necrosis factor alpha (TNF-alpha)-triggered cellular apoptosis through endoplasmic reticulum (ER)-mitochondrial apoptotic pathway. It is an endosomal phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) effector, as well as an interactor of the endosomal-tethering protein EEA1. It regulates endosome fusion upstream of Rab5. Phafin-2 also functions as a novel regulator of endocytic epidermal growth factor receptor (EGFR) degradation through a role in endosomal fusion. 61
33308 277258 cd15718 FYVE_WDFY1_like FYVE domain found in WD40 repeat and FYVE domain-containing protein WDFY1 and WDFY2, and similar proteins. This family includes WD40 repeat and FYVE domain-containing protein WDFY1 and WDFY2. WDFY1, also termed FYVE domain containing protein localized to endosomes-1 (FENS-1), or phosphoinositide-binding protein 1, or zinc finger FYVE domain-containing protein 17, is a novel single FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. Localization of WDFY1 to early endosomes requires an intact FYVE domain and is inhibited by wortmannin, a PI3-kinase inhibitor. WDFY2, also termed zinc finger FYVE domain-containing protein 22, or ProF (propeller-FYVE protein), is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that is localized to a distinct subset of early endosomes close to the plasma membrane. It interacts preferentially with endogenous serine/threonine kinase Akt2, but not Akt1, and plays a specific role in modulating signaling through Akt downstream of the interaction of this kinase with the endosomal proteins APPL (adaptor protein containing PH domain, PTB domain, and leucine zipper motif). In addition to Akt, WDFY2 serves as a binding partner for protein kinase C, zeta (PRKCZ), and its substrate vesicle-associated membrane protein 2 (VAMP2), and is involved in vesicle cycling in various secretory pathways. Moreover, silencing of WDFY2 by siRNA produces a strong inhibition of endocytosis. Both WDFY1 and WDFY2 contain a FYVE domain and multiple WD-40 repeats. 70
33309 277259 cd15719 FYVE_WDFY3 FYVE domain found in WD40 repeat and FYVE domain-containing protein 3 (WDFY3) and similar proteins. WDFY3, also termed autophagy-linked FYVE protein (Alfy), is a ubiquitously expressed phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein required for selective macroautophagic degradation of aggregated proteins. It regulates the protein degradation through the direct interaction with the autophagy protein Atg5. Moreover, WDFY3 acts as a scaffold that bridges its cargo to the macroautophagic machinery via the creation of a greater complex with Atg12, Atg16L, and LC3. It also functionally associates with sequestosome-1/p62 (SQSTM1) in osteoclasts. WDFY3 shuttles between the nucleus and cytoplasm. It predominantly localizes to the nucleus and nuclear membrane under basal conditions, but is recruited to cytoplasmic ubiquitin-positive protein aggregates under stress conditions. WDFY3 contains a PH-BEACH domain assemblage, five WD40 repeats and a PtdIns3P-binding FYVE domain. 65
33310 277260 cd15720 FYVE_Hrs FYVE domain found in hepatocyte growth factor (HGF)-regulated tyrosine kinase substrate (Hrs) and similar proteins. Hrs, also termed protein pp110, is a tyrosine phosphorylated protein that plays an important role in the signaling pathway of HGF. It is localized to early endosomes and is an essential component of the endosomal sorting and trafficking machinery. Hrs interacts with hypertonia-associated protein Trak1, a novel regulator of endosome-to-lysosome trafficking. It can also form an Hrs/actinin-4/BERP/myosin V protein complex that is required for efficient transferrin receptor (TfR) recycling but not for epidermal growth factor receptor (EGFR) degradation. Moreover, Hrs, together with STAM proteins, STAM1 and STAM2, and Eps15, forms a multivalent ubiquitin-binding complex that sorts ubiquitinated proteins into the multivesicular body pathway, and plays a regulatory role in endocytosis/exocytosis. Furthermore, Hrs functions as an interactor of the neurofibromatosis 2 tumor suppressor protein schwannomin/merlin. It is also involved in the inhibition of citron kinase-mediated HIV-1 budding. Hrs contains a single ubiquitin-interacting motif (UIM) that is crucial for its function in receptor sorting, and a FYVE domain that harbors double Zn2+ binding sites. 61
33311 277261 cd15721 FYVE_RUFY1_like FYVE domain found in RUN and FYVE domain-containing protein RUFY1, RUFY2 and similar proteins. This family includes RUN and FYVE domain-containing protein RUFY1 and RUFY2. RUFY1, also termed FYVE-finger protein EIP1, or La-binding protein 1, or Rab4-interacting protein (Rabip4), or Zinc finger FYVE domain-containing protein 12 (ZFY12), a human homologue of mouse Rabip4, an effector of Rab4 GTPase that regulates recycling of endocytosed cargo. RUFY1 is an endosomal protein that functions as a dual effector of Rab4 and Rab14 and is involved in efficient recycling of transferrin (Tfn). It is a downstream effector of Etk, a downstream tyrosine kinase of PI3-kinase that is involved in regulation of vesicle trafficking. RUFY2, also termed Rab4-interacting protein related, is a novel embryonic factor that is present in the nucleus at early stages of embryonic development. It may have both endosomal functions in the cytoplasm and nuclear functions. Both RUFY1 and RUFY2 contain an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between. 58
33312 277262 cd15723 FYVE_protrudin FYVE-related domain found in protrudin and similar proteins. Protrudin, also termed zinc finger FYVE domain-containing protein 27 (ZFY27 or ZFYVE27), is a FYVE domain-containing protein involved in transport of neuronal cargoes and implicated in the onset of hereditary spastic paraplegia (HSP). It is involved in neurite outgrowth through binding to spastin. Moreover, it functions as a key regulator of the Rab11-dependent membrane trafficking during neurite extension. It serves as an adaptor molecule that links its associated proteins, such as Rab11-GDP, VAP-A and -B, Surf4, and RTN3, to KIF5, a motor protein that mediates anterograde vesicular transport in neurons, and thus plays a key role in the maintenance of neuronal function. The FYVE domain of protrudin resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. In addition, unlike canonical FYVE domains that are located to early endosomes and specifically bind to phosphatidylinositol 3-phosphate (PtdIns3P or PI3P), the FYVE domain of protrudin is located to plasma membrane and preferentially binds phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2), and phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3). In addition to the FYVE-related domain, protrudin also contains a Rab11-binding domain (RBD11), two hydrophobic domains, HP-1 and HP-2, an FFAT motif, and a coiled-coil domain. 62
33313 277263 cd15724 FYVE_ZFY26 FYVE domain found in FYVE domain-containing protein 26 (ZFY26 or ZFYVE26). ZFY26, also termed FYVE domain-containing centrosomal protein (FYVE-CENT), or spastizin, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that localizes to the centrosome and midbody. ZFY26 and its interacting partners TTC19 and KIF13A are required for cytokinesis. It also interacts with Beclin 1, a subunit of class III phosphatidylinositol 3-kinase complex, and may have potential implications for carcinogenesis. In addition, it has been considered as the causal agent of a rare form of hereditary spastic paraplegia. ZFY26 contains a FYVE domain that is important for targeting of FYVE-CENT to the midbody. 61
33314 277264 cd15725 FYVE_PIKfyve_Fab1 FYVE domain found in metazoan PIKfyve, fungal and plant Fab1, and similar proteins. PIKfyve, also termed FYVE finger-containing phosphoinositide kinase, or 1-phosphatidylinositol 3-phosphate 5-kinase, or phosphatidylinositol 3-phosphate 5-kinase (PIP5K3), or phosphatidylinositol 3-phosphate 5-kinase type III (PIPkin-III or type III PIP kinase), is a phosphoinositide 5-kinase that forms a complex with its regulators, the scaffolding protein Vac14 and the lipid phosphatase Fig4. The complex is responsible for synthesizing phosphatidylinositol 3,5-bisphosphate [PtdIns(3,5)P2] from phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). Then phosphatidylinositol-5-phosphate (PtdIns5P) is generated directly from PtdIns(3,5)P2. PtdIns(3,5)P2 and PtdIns5P regulate endosomal trafficking and responses to extracellular stimuli. At this point, PIKfyve is vital in early embryonic development. Moreover, PIKfyve forms a complex with ArPIKfyve (associated regulator of PIKfyve) and SAC3 at the endomembranes, which plays a role in receptor tyrosine kinase (RTK) degradation. The phosphorylation of PIKfyve by AKT can facilitate Epidermal growth factor receptor (EGFR) degradation. In addition, PIKfyve may participate in the regulation of the glutamate transporters EAAT2, EAAT3 and EAAT4, and the cystic fibrosis transmembrane conductance regulator (CFTR). It is also essential for systemic glucose homeostasis and insulin-regulated glucose uptake/GLUT4 translocation in skeletal muscle. It can be activated by protein kinase B (PKB/Akt) and further up-regulates human ether-a-go-go (hERG) channels. This family also includes the yeast and plant orthologs of human PIKfyve, Fab1. PIKfyve and its orthologs share a similar architecture. 
They contain an N-terminal FYVE domain, a middle region related to the CCT/TCP-1/Cpn60 chaperonins that are involved in productive folding of actin and tubulin, a second middle domain that contains a number of conserved cysteine residues (CCR) unique to this family, and a C-terminal lipid kinase domain related to PtdInsP kinases. 62
33315 277265 cd15726 FYVE_FYCO1 FYVE domain found in FYVE and coiled-coil domain-containing protein 1 (FYCO1) and similar proteins. FYCO1, also termed zinc finger FYVE domain-containing protein 7, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that is associated with the exterior of autophagosomes and mediates microtubule plus-end-directed vesicle transport. It acts as an effector of GTP-bound Rab7, a GTPase that recruits FYCO1 to autophagosomes and has been implicated in autophagosome-lysosomal fusion. FYCO1 also interacts with two microtubule motor proteins, kinesin (KIF) 5B and KIF23, and thus functions as a platform for assembly of vesicle fusion and trafficking factors. FYCO1 contains an N-terminal alpha-helical RUN domain followed by a long central coiled-coil region, a FYVE domain and a GOLD (Golgi dynamics) domain in C-terminus. 58
33316 277266 cd15727 FYVE_ZF21 FYVE domain found in zinc finger FYVE domain-containing protein 21 (ZF21) and similar proteins. ZF21 is a phosphoinositide-binding protein that functions as a regulator of focal adhesions and cell movement through interaction with focal adhesion kinase. It can also bind to the cytoplasmic tail of membrane type 1 matrix metalloproteinase, a potent invasion-promoting protease, and play a key role in regulating multiple aspects of cancer cell migration and invasion. ZF21 contains a FYVE domain, which corresponds to this model. 64
33317 277267 cd15728 FYVE_ANFY1 FYVE domain found in ankyrin repeat and FYVE domain-containing protein 1 (ANFY1) and similar proteins. ANFY1, also termed ankyrin repeats hooked to a zinc finger motif (Ankhzn), is a novel cytoplasmic protein that belongs to a new group of double zinc finger proteins involved in vesicle or protein transport. It is ubiquitously expressed in a spatiotemporal-specific manner and is located on endosomes. ANFY1 contains an N-terminal coiled-coil region and a BTB/POZ domain, ankyrin repeats in the middle, and a C-terminal FYVE domain. 63
33318 277268 cd15729 FYVE_endofin FYVE domain found in endofin and similar proteins. Endofin, also termed zinc finger FYVE domain-containing protein 16 (ZFY16), or endosome-associated FYVE domain protein, is a FYVE domain-containing protein that is localized to EEA1-containing endosomes. It is regulated by phosphoinositol lipid and engaged in endosome-mediated receptor modulation. Endofin is involved in Bone morphogenetic protein (BMP) signaling through interacting with Smad1 preferentially and enhancing Smad1 phosphorylation and nuclear localization upon BMP stimulation. It also functions as a scaffold protein that brings Smad4 to the proximity of the receptor complex in Transforming growth factor (TGF)-beta signaling. Moreover, endofin is a novel tyrosine phosphorylation target downstream of epidermal growth factor receptor (EGFR) in EGF-signaling. In addition, endofin plays a role in endosomal trafficking by recruiting cytosolic TOM1, an important molecule for membrane recruitment of clathrin, onto endosomal membranes. 68
33319 277269 cd15730 FYVE_EEA1 FYVE domain found in early endosome antigen 1 (EEA1) and similar proteins. EEA1, also termed endosome-associated protein p162, or zinc finger FYVE domain-containing protein 2, is an essential component of the endosomal fusion machinery and required for the fusion and maturation of early endosomes in endocytosis. It forms a parallel coiled-coil homodimer in cells. EEA1 serves as the p97 ATPase substrate and the p97 ATPase may regulate the size of early endosomes by governing the oligomeric state of EEA1. It can interact with the GTP-bound form of Rab22a and be involved in endosomal membrane trafficking. EEA1 also functions as an obligate scaffold for angiotensin II-induced Akt activation in early endosomes. It can be phosphorylated by p38 mitogen-activated protein kinase (MAPK) and further regulate mu opioid receptor endocytosis. EEA1 consists of an N-terminal C2H2 Zn2+ finger, four long heptad repeats, and a C-terminal region containing a calmodulin binding (IQ) motif, a Rab5 interaction site, and a FYVE domain. This model corresponds to the FYVE domain that is responsible for binding phosphatidyl inositol-3-phosphate (PtdIns3P or PI3P) on the membrane. 63
33320 277270 cd15731 FYVE_LST2 FYVE domain found in lateral signaling target protein 2 homolog (Lst2) and similar proteins. Lst2, also termed zinc finger FYVE domain-containing protein 28, is a monoubiquitinylated phosphoprotein that functions as a negative regulator of epidermal growth factor receptor (EGFR) signaling. Unlike other FYVE domain-containing proteins, Lst2 displays primarily non-endosomal localization. Its endosomal localization is regulated by monoubiquitinylation. Lst2 physically binds Trim3, also known as BERP or RNF22, which is a coordinator of endosomal trafficking and interacts with Hrs and a complex that biases cargo recycling. 65
33321 277271 cd15732 FYVE_MTMR3 FYVE domain found in myotubularin-related protein 3 (MTMR3) and similar proteins. MTMR3, also termed Myotubularin-related phosphatase 3, or FYVE domain-containing dual specificity protein phosphatase 1 (FYVE-DSP1), or zinc finger FYVE domain-containing protein 10, is a ubiquitously expressed phosphoinositide 3-phosphatase specific for phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) and phosphatidylinositol 3,5-bisphosphate (PtdIns(3,5)P2) and PIKfyve, which produces PtdIns(3,5)P2 from PtdIns3P. It regulates cell migration through modulating phosphatidylinositol 5-phosphate (PtdIns5P) levels. MTMR3 contains an N-terminal PH-GRAM (PH-G) domain, a MTM phosphatase domain, a coiled-coil region, and a C-terminal FYVE domain. Unlike conventional FYVE domains, the FYVE domain of MTMR3 neither confers endosomal localization nor binds to PtdIns3P. It is also not required for the enzyme activity of MTMR3. In contrast, the PH-G domain binds phosphoinositides. 61
33322 277272 cd15733 FYVE_MTMR4 FYVE domain found in myotubularin-related protein 4 (MTMR4) and similar proteins. MTMR4, also termed FYVE domain-containing dual specificity protein phosphatase 2 (FYVE-DSP2), or zinc finger FYVE domain-containing protein 11, is a dual specificity protein phosphatase that specifically dephosphorylates phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). It localizes to early endosomes, as well as to Rab11- and Sec15-positive recycling endosomes, and regulates sorting from early endosomes. Moreover, MTMR4 preferentially associates with and dephosphorylates the activated regulatory Smad proteins (R-Smads) in cytoplasm to keep transforming growth factor (TGF) beta signaling in homeostasis. It also functions as an essential negative modulator for the homeostasis of bone morphogenetic protein (BMP)/decapentaplegic (Dpp) signaling. In addition, MTMR4 acts as a novel interactor of the ubiquitin ligase Nedd4 (neural-precursor-cell-expressed developmentally down-regulated 4) and may play a role in the biological process of muscle breakdown. MTMR4 contains an N-terminal PH-GRAM (PH-G) domain, a MTM phosphatase domain, a coiled-coil region, and a C-terminal FYVE domain. 60
33323 277273 cd15734 FYVE_ZFYV1 FYVE domains found in zinc finger FYVE domain-containing protein 1 (ZFYV1) and similar proteins. ZFYV1, also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYV1 to Golgi, endoplasmic reticulum (ER) and vesicular compartments is governed in part by its FYVE domains but unaffected by wortmannin, a PI3-kinase inhibitor. In addition to the C-terminal tandem FYVE domains, ZFYV1 contains an N-terminal putative C2H2 type zinc finger and a possible nucleotide binding P-loop. 61
33324 277274 cd15735 FYVE_spVPS27p_like FYVE domain found in Schizosaccharomyces pombe vacuolar protein sorting-associated protein 27 (spVps27p) and similar proteins. spVps27p, also termed suppressor of ste12 deletion protein 4 (Sst4p), is a conserved homolog of budding Saccharomyces cerevisiae Vps27 and of mammalian Hrs. It functions as a downstream factor for phosphatidylinositol 3-kinase (PtdIns 3-kinase) in forespore membrane formation with normal morphology. It colocalizes and interacts with Hse1p, a homolog of Saccharomyces cerevisiae Hse1p and of mammalian STAM, to form a complex whose ubiquitin-interacting motifs (UIMs) are important for sporulation. spVps27p contains a VHS (Vps27p/Hrs/Stam) domain, a FYVE domain, and two UIMs. 59
33325 277275 cd15736 FYVE_scVPS27p_Vac1p_like FYVE domain found in Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and FYVE-related domain 1 found in yeast protein VAC1 (Vac1p) and similar proteins. The family includes Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and protein VAC1 (Vac1p). scVps27p, also termed Golgi retention defective protein 11, is the putative yeast counterpart of the mammalian protein Hrs and is involved in endosome maturation. It is a mono-ubiquitin-binding protein that interacts with ubiquitinated cargoes, such as Hse1p, and is required for protein sorting into the multivesicular body. Vps27p forms a complex with Hse1p. The complex binds ubiquitin and mediates endosomal protein sorting. At the endosome, Vps27p and a trimeric protein complex, ESCRT-1, bind ubiquitin and are important for multivesicular body (MVB) sorting. Vps27p contains an N-terminal VHS (Vps27/Hrs/STAM) domain, a FYVE domain that binds PtdIns3P, followed by two ubiquitin-interacting motifs (UIMs), and a C-terminal clathrin-binding motif. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. 
Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The FYVE domain in both Vps27p and Vac1p harbors a zinc-binding site composed of seven Cysteines and one Histidine, which is different from that of other FYVE domain containing proteins. 56
33326 277276 cd15737 FYVE2_Vac1p_like FYVE domain 2 found in yeast protein VAC1 (Vac1p) and similar proteins. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The family corresponds to the second FYVE domain that is responsible for the ability of Pep7p to efficiently interact with Vps21p and Vps45p. 83
33327 277277 cd15738 FYVE_MTMR_unchar FYVE-related domain found in uncharacterized myotubularin-related proteins mainly from eumetazoa. This family includes a group of uncharacterized myotubularin-related proteins mainly found in eumetazoa. Although their biological functions remain unclear, they share similar domain architecture that consists of an N-terminal pleckstrin homology (PH) domain, a highly conserved region related to myotubularin proteins, and a C-terminal FYVE domain. The model corresponds to the FYVE domain, which resembles the FYVE-related domain as it has an altered sequence in the basic ligand binding patch. 61
33328 277278 cd15739 FYVE_RABE_unchar FYVE domain found in uncharacterized rab GTPase-binding effector proteins from bilateria. This family includes a group of uncharacterized rab GTPase-binding effector proteins found in bilateria. Although their biological functions remain unclear, they all contain a FYVE domain that harbors a putative phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding site. 73
33329 277279 cd15740 FYVE_FGD3 FYVE-like domain found in FYVE, RhoGEF and PH domain-containing protein 3 (FGD3) and similar proteins. FGD3, also termed zinc finger FYVE domain-containing protein 5, is a putative Cdc42-specific guanine nucleotide exchange factor (GEF) that undergoes the ubiquitin ligase SCFFWD1/beta-TrCP-mediated proteasomal degradation. It is a homologue of FGD1 and contains a DBL homology (DH) domain and pleckstrin homology (PH) domain in the middle region, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. Due to this difference, FGD3 may play different roles from that of FGD1 to regulate cell morphology or motility. The FYVE domain of FGD3 resembles a FYVE-like domain that is different from the canonical FYVE domains, since it lacks one of the three conserved signature motifs (the WxxD motif) that are involved in phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding and exhibits altered lipid binding specificities. 54
33330 277280 cd15741 FYVE_FGD1_2_4 FYVE domain found in FYVE, RhoGEF and PH domain-containing protein facio-genital dysplasia FGD1, FGD2, FGD4. This family represents a group of Rho GTPase cell division cycle 42 (Cdc42)-specific guanine nucleotide exchange factors (GEFs), including FYVE, RhoGEF and PH domain-containing protein FGD1, FGD2 and FGD4. FGD1, also termed faciogenital dysplasia 1 protein, or Rho/Rac guanine nucleotide exchange factor FGD1 (Rho/Rac GEF), or zinc finger FYVE domain-containing protein 3, is a central regulator of extracellular matrix remodeling and belongs to the DBL family of GEFs that regulate the activation of the Rho GTPases. FGD1 is encoded by gene FGD1. Disabling mutations in the FGD1 gene cause the human X-linked developmental disorder faciogenital dysplasia (FGDY, also known as Aarskog-Scott syndrome). FGD2, also termed zinc finger FYVE domain-containing protein 4, is expressed in antigen-presenting cells, including B lymphocytes, macrophages, and dendritic cells. It localizes to early endosomes and active membrane ruffles. It plays a role in leukocyte signaling and vesicle trafficking in cells specialized to present antigen in the immune system. FGD4, also termed actin filament-binding protein frabin, or FGD1-related F-actin-binding protein, or zinc finger FYVE domain-containing protein 6, functions as an F-actin-binding (FAB) protein showing significant homology to FGD1. It induces the formation of filopodia through the activation of Cdc42 in fibroblasts. Those FGD proteins possess a similar domain organization that contains a DBL homology (DH) domain, a pleckstrin homology (PH) domain, a FYVE domain, and another PH domain in the C-terminus. However, each FGD has a unique N-terminal region that may directly or indirectly interact with F-actin. FGD1 and FGD4 have an N-terminal proline-rich domain (PRD) and an N-terminal F-actin binding (FAB) domain, respectively. 
This model corresponds to the FYVE domain, which has been found in many proteins involved in membrane trafficking and phosphoinositide metabolism, and has been defined by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCR patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding site. FGD1 possesses a FYVE-like domain that lacks the N-terminal WxxD motif. Moreover, FGD2 is the only known RhoGEF family member shown to have a functional FYVE domain and endosomal binding activity. 65
33331 277281 cd15742 FYVE_FGD5 FYVE-like domain found in FYVE, RhoGEF and PH domain-containing protein 5 (FGD5) and similar proteins. FGD5, also termed zinc finger FYVE domain-containing protein 23, is an endothelial cell (EC)-specific guanine nucleotide exchange factor (GEF) that regulates endothelial adhesion, survival, and angiogenesis by modulating phosphatidylinositol 3-kinase signaling. It functions as a novel genetic regulator of vascular pruning by activation of endothelial cell-targeted apoptosis. FGD5 is a homologue of FGD1 and contains a DBL homology (DH) domain, a pleckstrin homology (PH) domain, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. The FYVE domain of FGD5 resembles a FYVE-like domain that is different from the canonical FYVE domains, since it lacks one of the three conserved signature motifs (the WxxD motif) that are involved in phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding and exhibits altered lipid binding specificities. 67
33332 277282 cd15743 FYVE_FGD6 FYVE domain found in FYVE, RhoGEF and PH domain-containing protein 6 (FGD6) and similar proteins. FGD6, also termed zinc finger FYVE domain-containing protein 24, is a putative Cdc42-specific guanine nucleotide exchange factor (GEF) whose biological function remains unclear. It is a homologue of FGD1 and contains a DBL homology (DH) domain and pleckstrin homology (PH) domain in the middle region, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. Moreover, the FYVE domain of FGD6 is a canonical FYVE domain, which has been found in many proteins involved in membrane trafficking and phosphoinositide metabolism, and has been defined by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCR patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding site. 61
33333 277283 cd15744 FYVE_RUFY3 FYVE-related domain found in RUN and FYVE domain-containing protein 3 (RUFY3) and similar proteins. RUFY3, also termed Rap2-interacting protein x (RIPx or RPIPx), or single axon-regulated protein (singar), is an N-terminal RUN domain and a C-terminal FYVE domain containing protein predominantly expressed in the brain. It suppresses formation of surplus axons for neuronal polarity. Unlike other RUFY proteins, RUFY3 can associate with the GTP-bound active form of Rab5. Moreover, the FYVE domain of RUFY3 resembles the FYVE-related domain as it lacks the WxxD motif (x for any residue). 52
33334 277284 cd15745 FYVE_RUFY4 FYVE-related domain found in RUN and FYVE domain-containing protein 4 (RUFY4) and similar proteins. RUFY4 belongs to the RUFY protein family, which is characterized by the presence of an N-terminal RUN domain and a C-terminal FYVE domain. The FYVE domain of RUFY4 resembles the FYVE-related domain as it lacks the WxxD motif (x for any residue). The biological function of RUFY4 remains unclear. 52
33335 277285 cd15746 FYVE_RP3A_like FYVE-related domain found in rabphilin-3A, Rab effector Noc2, and similar proteins. This family includes rabphilin-3A and Rab effector Noc2. Rabphilin-3A, also termed exophilin-1, is an effector protein that binds to the GTP-bound form of Rab3A, which is one of the most abundant Rab proteins in neurons and predominantly localized to synaptic vesicles. Rabphilin-3A is homologous to alpha-Rab3-interacting molecules (RIMs). It is a multi-domain protein containing an N-terminal Rab3A effector domain, a proline-rich linker region, and two tandem C2 domains. The effector domain binds specifically to the activated GTP-bound state of Rab3A and harbors a conserved FYVE zinc finger. The C2 domains are responsible for the binding of phosphatidylinositol-4,5-bisphosphate (PIP2) , a key player in the neurotransmitter release process. Thus, Rabphilin-3A has also been implicated in vesicle trafficking. Rab effector Noc2, also termed No C2 domains protein, or rabphilin-3A-like protein (RPH3AL), is a Rab3 effector that mediates the regulation of secretory vesicle exocytosis in neurons and certain endocrine cells. It also functions as a Rab27 effector and is involved in isoproterenol (IPR)-stimulated amylase release from acinar cells. Noc2 contains an N-terminal Rab3A effector domain which only harbors a conserved FYVE zinc finger. The FYVE domains of Rabphilin-3A and Noc2 resemble a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 55
33336 277286 cd15747 FYVE_Slp3_4_5 FYVE-related domain found in the synaptotagmin-like proteins 3, 4, 5. The synaptotagmin-like proteins 1-5 (Slp1-5) family belongs to the carboxyl-terminal-type (C-type) tandem C2 proteins superfamily, which also contains the synaptotagmin and the Doc2 families. Slp proteins are putative membrane trafficking proteins that are characterized by the presence of a unique N-terminal Slp homology domain (SHD), and C-terminal tandem C2 domains (known as the C2A domain and C2B domain). The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2. The SHD1 and SHD2 of Slp3, Slp4 and Slp5 are separated by a putative FYVE zinc finger. By contrast, Slp1 and Slp2 lack such zinc finger and their SHD1 and SHD2 are linked together. This model corresponds to the FYVE zinc finger. At this point, Slp1 and Slp2 are not included in this model. Moreover, the FYVE domains of Slp3, Slp4 and Slp5 resemble a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 48
33337 277287 cd15748 FYVE_SPIR FYVE-related domain found in Spir proteins, Spire1 and Spire2. Spir proteins were originally discovered as the protein products of the Drosophila spire gene. They are Jun N-terminal kinase (JNK)-interacting proteins that have exclusively been identified in metazoans. They may play roles in membrane trafficking and cortical filament crosslinking. This family includes Spire1 and Spire2, which function as new essential factors in asymmetric division of oocytes. They mediate asymmetric spindle positioning by assembling a cytoplasmic actin network. They are also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, they cooperate synergistically with Fmn2 to assemble F-actin in oocytes. Both Spire1 and Spire2 contain an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. Their FYVE domains resemble FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). 42
33338 277288 cd15749 FYVE_ZFY19 FYVE-related domain found in FYVE domain-containing protein 19 (ZFY19) and similar proteins. ZFY19, also termed mixed lineage leukemia (MLL) partner containing FYVE domain, is encoded by a novel gene, MLL partner containing FYVE domain (MPFYVE). The FYVE domain of ZFY19 resembles FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. The biological function of ZFY19 remains unclear. 51
33339 277289 cd15750 FYVE_CARP FYVE-like domain found in caspase-associated ring proteins, CARP1 and CARP2. CARP1 and CARP2 are a novel group of caspase regulators characterized by the presence of a FYVE-type zinc finger domain. They do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which distinguishes them from other FYVE-type proteins. Moreover, these proteins have an altered sequence in the basic ligand binding patch and lack the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus they constitute a family of unique FYVE-type domains called FYVE-like domains. 47
33340 277290 cd15751 FYVE_BSN_PCLO FYVE-related domain found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 62
33341 277291 cd15752 FYVE_SlaC2-a FYVE-related domain found in Slp homolog lacking C2 domains a (SlaC2-a) and similar proteins. SlaC2-a, also termed melanophilin, or exophilin-3, is a GTP-bound form of Rab27A-, myosin Va-, and actin-binding protein present on melanosomes. It is involved in the control of transferring of melanosomes from microtubules to actin filaments. It also functions as a melanocyte type myosin Va (McM5) binding partner and directly activates the actin-activated ATPase activity of McM5 through forming a tripartite protein complex with Rab27A and an actin-based motor myosin Va. SlaC2-a belongs to the Slp homolog lacking C2 domains (Slac2) family. It contains an N-terminal Slp homology domain (SHD), but lacks tandem C2 domains. The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of SlaC2-a are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. Moreover, Slac2-a has a middle myosin-binding domain and a C-terminal actin-binding domain. 76
33342 277292 cd15753 FYVE_SlaC2-c FYVE-related domain found in Slp homolog lacking C2 domains c (SlaC2-c) and similar proteins. SlaC2-c, also termed Rab effector MyRIP, or exophilin-8, or myosin-VIIa- and Rab-interacting protein, or synaptotagmin-like protein lacking C2 domains c, is a GTP-bound form of Rab27A-, myosin Va/VIIa-, and actin-binding protein mainly present on retinal melanosomes and secretory granules. It may play a role in insulin granule exocytosis. It is also involved in the control of isoproterenol (IPR)-induced amylase release from parotid acinar cells. SlaC2-c belongs to the Slp homolog lacking C2 domains (Slac2) family. It contains an N-terminal Slp homology domain (SHD), but lacks tandem C2 domains. The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of SlaC2-c are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. Moreover, Slac2-c has a middle myosin-binding domain and a C-terminal actin-binding domain. 49
33343 277293 cd15754 FYVE_PKHF1 FYVE domain found in protein containing both PH and FYVE domains 1 (phafin-1) and similar proteins. Phafin-1, also termed lysosome-associated apoptosis-inducing protein containing PH (pleckstrin homology) and FYVE domains (LAPF), or pleckstrin homology domain-containing family F member 1 (PKHF1), or PH domain-containing family F member 1, or apoptosis-inducing protein, or PH and FYVE domain-containing protein 1, or zinc finger FYVE domain-containing protein 15, is a representative of a novel family of PH and FYVE domain-containing proteins called phafins. It is a ubiquitously expressed pro-apoptotic protein that translocates to lysosomes and facilitates apoptosis induction through a lysosomal-mitochondrial apoptotic pathway. 64
33344 277294 cd15755 FYVE_PKHF2 FYVE domain found in protein containing both PH and FYVE domains 2 (phafin-2) and similar proteins. Phafin-2, also termed endoplasmic reticulum-associated apoptosis-involved protein containing PH and FYVE domains (EAPF), or pleckstrin homology domain-containing family F member 2 (PKHF2), or PH domain-containing family F member 2, or PH and FYVE domain-containing protein 2, or zinc finger FYVE domain-containing protein 18, is a ubiquitously expressed endoplasmic reticulum-associated protein that facilitates tumor necrosis factor alpha (TNF-alpha)-triggered cellular apoptosis through endoplasmic reticulum (ER)-mitochondrial apoptotic pathway. It is an endosomal phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) effector, as well as an interactor of the endosomal-tethering protein EEA1. It regulates endosome fusion upstream of Rab5. Phafin-2 also functions as a novel regulator of endocytic epidermal growth factor receptor (EGFR) degradation through a role in endosomal fusion. 64
33345 277295 cd15756 FYVE_WDFY1 FYVE domain found in WD40 repeat and FYVE domain-containing protein 1 (WDFY1) and similar proteins. WDFY1, also termed FYVE domain containing protein localized to endosomes-1 (FENS-1), or phosphoinositide-binding protein 1, or zinc finger FYVE domain-containing protein 17, is a novel single FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. Localization of WDFY1 to early endosomes requires an intact FYVE domain and is inhibited by wortmannin, a PI3-kinase inhibitor. In addition to the FYVE domain, WDFY1 harbors multiple WD-40 repeats. 76
33346 277296 cd15757 FYVE_WDFY2 FYVE domain found in WD40 repeat and FYVE domain-containing protein 2 (WDFY2). WDFY2, also termed zinc finger FYVE domain-containing protein 22, or ProF (propeller-FYVE protein), is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that is localized to a distinct subset of early endosomes close to the plasma membrane. It interacts preferentially with endogenous serine/threonine kinase Akt2, but not Akt1, and plays a specific role in modulating signaling through Akt downstream of the interaction of this kinase with the endosomal proteins APPL (adaptor protein containing PH domain, PTB domain, and leucine zipper motif). In addition to Akt, WDFY2 serves as a binding partner for protein kinase C, zeta (PRKCZ), and its substrate vesicle-associated membrane protein 2 (VAMP2), and is involved in vesicle cycling in various secretory pathways. Moreover, silencing of WDFY2 by siRNA produces a strong inhibition of endocytosis. WDFY2 contains WD40 motifs and a FYVE domain. 70
33347 277297 cd15758 FYVE_RUFY1 FYVE domain found in RUN and FYVE domain-containing protein 1 (RUFY1) and similar proteins. RUFY1, also termed FYVE-finger protein EIP1, or La-binding protein 1, or Rab4-interacting protein (Rabip4), or Zinc finger FYVE domain-containing protein 12 (ZFY12), is a human homologue of mouse Rabip4, an effector of Rab4 GTPase that regulates recycling of endocytosed cargo. RUFY1 is an endosomal protein that functions as a dual effector of Rab4 and Rab14 and is involved in efficient recycling of transferrin (Tfn). It is a downstream effector of Etk, a downstream tyrosine kinase of PI3-kinase that is involved in regulation of vesicle trafficking. RUFY1 contains an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between. 71
33348 277298 cd15759 FYVE_RUFY2 FYVE domain found in RUN and FYVE domain-containing protein 2 (RUFY2) and similar proteins. RUFY2, also termed Rab4-interacting protein related, is a novel embryonic factor that contains an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between. It is present in the nucleus at early stages of embryonic development. It may have both endosomal functions in the cytoplasm and nuclear functions. 71
33349 277299 cd15760 FYVE_scVPS27p_like FYVE domain found in Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and similar proteins. scVps27p, also termed Golgi retention defective protein 11, is the putative yeast counterpart of the mammalian protein Hrs and is involved in endosome maturation. It is a mono-ubiquitin-binding protein that interacts with ubiquitinated cargoes, such as Hse1p, and is required for protein sorting into the multivesicular body. Vps27p forms a complex with Hse1p. The complex binds ubiquitin and mediates endosomal protein sorting. At the endosome, Vps27p and a trimeric protein complex, ESCRT-1, bind ubiquitin and are important for multivesicular body (MVB) sorting. Vps27p contains an N-terminal VHS (Vps27/Hrs/STAM) domain, a FYVE domain that binds PtdIns3P, followed by two ubiquitin-interacting motifs (UIMs), and a C-terminal clathrin-binding motif. 59
33350 277300 cd15761 FYVE1_Vac1p_like FYVE-related domain 1 found in yeast protein VAC1 (Vac1p) and similar proteins. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The family corresponds to the first FYVE domain, which resembles the FYVE-related domain as it has an altered sequence in the basic ligand binding patch. 76
33351 277301 cd15762 FYVE_RP3A FYVE-related domain found in rabphilin-3A and similar proteins. Rabphilin-3A, also termed exophilin-1, is an effector protein that binds to the GTP-bound form of Rab3A, which is one of the most abundant Rab proteins in neurons and predominantly localized to synaptic vesicles. Rabphilin-3A is homologous to alpha-Rab3-interacting molecules (RIMs). It is a multi-domain protein containing an N-terminal Rab3A effector domain, a proline-rich linker region, and two tandem C2 domains. The effector domain binds specifically to the activated GTP-bound state of Rab3A and harbors a conserved FYVE zinc finger. The C2 domains are responsible for the binding of phosphatidylinositol-4,5-bisphosphate (PIP2), a key player in the neurotransmitter release process. Thus, Rabphilin-3A has also been implicated in vesicle trafficking. The FYVE domain of Rabphilin-3A resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 80
33352 277302 cd15763 FYVE_RPH3L FYVE-related domain found in Rab effector Noc2 and similar proteins. Rab effector Noc2, also termed No C2 domains protein, or rabphilin-3A-like protein (RPH3AL), is a Rab3 effector that mediates the regulation of secretory vesicle exocytosis in neurons and certain endocrine cells. It also functions as a Rab27 effector and is involved in isoproterenol (IPR)-stimulated amylase release from acinar cells. Noc2 contains an N-terminal Rab3A effector domain which harbors a conserved zinc finger, but lacks tandem C2 domains. The FYVE domain of Noc2 resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 64
33353 277303 cd15764 FYVE_Slp4 FYVE-related domain found in synaptotagmin-like protein 4 (Slp4) and similar proteins. Slp4, also termed exophilin-2, or granuphilin, has been characterized as a regulator of the release of insulin granules from pancreatic beta-cells and dense core granules from PC12 neuronal cells by binding to Rab27A, and amylase granules from parotid gland acinar cells through interaction with syntaxin-2/3 in a Munc18-2-dependent manner on the apical plasma membrane. It can bind to syntaxin 2 in parotid acinar cells. It is also involved in granule transport by recruitment of the motor protein myosin Va. Moreover, it requires Rab8 to increase granule release in platelets. Slp4 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp4 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 50
33354 277304 cd15765 FYVE_Slp3 FYVE-related domain found in synaptotagmin-like protein 3 (Slp3) and similar proteins. Slp3, also termed exophilin-6, functions as a Rab27A-specific effector in cytotoxic T lymphocytes. It binds to kinesin-1 motor through interaction with the tetratricopeptide repeat of the kinesin-1 light chain (KLC1). The kinesin-1/Slp3/Rab27a complex plays a role in mediating the terminal transport of lytic granules to the immune synapse. Slp3 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp3 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. In addition, the Slp3 C2A domain showed Ca2+-dependent phospholipid binding activity. At this point, Slp3 is a Ca2+-dependent isoform in Slp proteins family. 48
33355 277305 cd15766 FYVE_Slp5 FYVE-related domain found in synaptotagmin-like protein 5 (Slp5) and similar proteins. Slp5 is a novel Rab27A-specific effector that is highly expressed in placenta and liver. Slp5 specifically interacted with the GTP-bound form of Rab27A and is involved in Rab27A-dependent membrane trafficking in specific tissues. Slp5 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp5 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 47
33356 277306 cd15767 FYVE_SPIR1 FYVE-related domain found in protein spire homolog 1 (Spire1) and similar proteins. Spire1 is encoded by gene spir-1, which is primarily found to be expressed in the developing nervous system and in neuronal cells of the adult brain, as well as in the fetal liver and in the adult spleen. It functions as a new essential factor in asymmetric division of oocytes. It mediates asymmetric spindle positioning by assembling a cytoplasmic actin network. It is also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, it cooperates synergistically with Fmn2 to assemble F-actin in oocytes. Spire1 contains an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. The FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). 79
33357 277307 cd15768 FYVE_SPIR2 FYVE-related domain found in protein spire homolog 2 (Spire2) and similar proteins. Spire2 is encoded by gene spir-2, which is expressed in the nervous system and highly expressed in the colonic epithelium. It functions as a new essential factor in asymmetric division of oocytes. It mediates asymmetric spindle positioning by assembling a cytoplasmic actin network. It is also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, it cooperates synergistically with Fmn2 to assemble F-actin in oocytes. Spire2 contains an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. The FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). 112
33358 277308 cd15769 FYVE_CARP1 FYVE-like domain found in caspase regulator CARP1 and similar proteins. CARP1, also termed E3 ubiquitin-protein ligase RNF34, or caspases-8 and -10-associated RING finger protein 1, or FYVE-RING finger protein Momo, or RING finger homologous to inhibitor of apoptosis protein (RFI), or RING finger protein 34, or RING finger protein RIFF, is a nuclear protein that functions as a specific E3 ubiquitin ligase for the transcriptional coactivator PGC-1alpha, a master regulator of energy metabolism and adaptive thermogenesis in the brown fat cell, and negatively regulates brown fat cell metabolism. It is preferentially expressed in esophageal, gastric and colorectal cancers, suggesting a possible association with the development of the digestive tract cancers. It regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. CARP1 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which distinguishes it from other FYVE-type proteins. Moreover, CARP1 has an altered sequence in the basic ligand binding patch and lacks the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus it belongs to a family of unique FYVE-type domains called FYVE-like domains. In addition to the N-terminal FYVE-like domain, CARP1 harbors a C-terminal RING domain. 47
33359 277309 cd15770 FYVE_CARP2 FYVE-like domain found in caspase regulator CARP2 and similar proteins. CARP2, also termed E3 ubiquitin-protein ligase rififylin, or caspases-8 and -10-associated RING finger protein 2, or FYVE-RING finger protein Sakura (Fring), or RING finger and FYVE-like domain-containing protein 1, or RING finger protein 189, or RING finger protein 34-like, is a novel caspase regulator containing a FYVE-type zinc finger domain. It regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. CARP2 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which distinguishes it from other FYVE-type proteins. Moreover, CARP2 has an altered sequence in the basic ligand binding patch and lacks the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus it belongs to a family of unique FYVE-type domains called FYVE-like domains. In addition to the N-terminal FYVE-like domain, CARP2 harbors a C-terminal RING domain. 49
33360 277310 cd15771 FYVE1_BSN_PCLO FYVE-related domain 1 found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. This model corresponds to the first FYVE-related domain. 61
33361 277311 cd15772 FYVE2_BSN_PCLO FYVE-related domain 2 found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. This model corresponds to the second FYVE-related domain. 64
33362 277312 cd15773 FYVE1_BSN FYVE-related domain 1 found in protein bassoon. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Bassoon contains two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. This family corresponds to the first FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 64
33363 277313 cd15774 FYVE1_PCLO FYVE-related domain 1 found in protein piccolo. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the first FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 62
33364 277314 cd15775 FYVE2_BSN FYVE-related domain 2 found in protein bassoon. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Bassoon contains two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 65
33365 277315 cd15776 FYVE2_PCLO FYVE-related domain 2 found in protein piccolo. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 64
33366 276940 cd15777 CRBN_C_like Thalidomide-binding C-terminal domain of cereblon (CRBN) and similar protein domains. Cereblon is part of an E3 ubiquitin ligase complex, together with damaged DNA-binding protein 1 (DDB1), CUL4A and ROC1. Cereblon interacts directly with DDB1, although the C-terminal domain characterized here does not contribute to that interaction. Ubiquitination of cellular targets by this complex increases levels of FGF8 and FGF10, which was shown to affect the development of limbs and auditory vesicles in embryogenesis. The C-terminal domain of Cereblon was shown to contain the binding site for thalidomide and its analogs, a class of teratogenic drugs that exhibit an antiproliferative effect on myelomas. Mutations in CRBN, some of which map onto the C-terminal domain, were associated with autosomal recessive mental retardation, which may have to do with interactions between CRBN and ion channels in the brain. 101
33367 275446 cd15778 Lreu_0056_like Proteins similar to Lactobacillus reuteri ORF 0056. This family of Lactobacillus proteins has not been characterized. The 3D structure has been solved for a hypothetical protein with a predicted signal peptide, as part of a wider examination of the structural biology of commensal human gut flora and pathogens. 112
33368 294014 cd15783 SA1633_like Uncharacterized protein family conserved in Staphylococci. Some proteins in this family have been described as putative beta-lactamases. They are structurally similar to the C-terminal beta-grasp domains of Staphylococcal and Streptococcal superantigens. 143
33369 275431 cd15784 PH_RUTBC Rab-binding Pleckstrin homology domain (PH) of small G-protein signaling modulator 1 and similar proteins. Small G-protein signaling modulator 1, or RUN and TBC1 domain containing 2 (RUTBC2), as well as RUTBC1, bind to Rab9A via their Pleckstrin homology (PH) domain. They do not seem to act as GAP proteins that stimulate GTP hydrolysis by Rab9A, and RUTBC2 has been shown to also interact with Rab9B, most likely in a similar manner. RUTBC1 does stimulate GTP hydrolysis by Rab32 and Rab33B, however, while RUTBC2 appears to be a GAP for Rab36. Rab9A and associated proteins control the recycling of mannose-6-phosphate receptors from late endosomes to the trans-Golgi. 176
33370 276946 cd15785 YycH_N_like N-terminal domain of YycH and structurally similar proteins conserved in Firmicutes. These protein domains appear to be members of a somewhat larger structural family conserved in Firmicutes, including the N-terminal domain of YycH. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers. 113
33371 276947 cd15786 CPF_1278_like Uncharacterized protein conserved in Clostridia. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. The 3D structure is available for one protein, CPF_1278, which has been labelled a putative lipoprotein. CPF_1278 displays structural similarity to the N-terminal domain of YycH, which plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers. 124
33372 276948 cd15787 YycH_N N-terminal domain of YycH and similar proteins. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers. This model describes the N-terminal domain of YycH. 143
33373 276949 cd15788 Clospo_01618_like Uncharacterized protein conserved in Clostridia. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. It displays structural similarity to the N-terminal domain of YycH, which plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers. 129
33374 275432 cd15789 PH_ARHGEF2_18_like rho guanine nucleotide exchange factor. These RhoGEFs belong to the regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. The members here all contain Dbl homology (DH)-PH domains. In addition some members contain N-terminal C1 (Protein kinase C conserved region 1) domains, PDZ (also called DHR/Dlg homologous regions) domains, ANK (ankyrin) domains, and RGS (Regulator of G-protein signalling) domains or C-terminal ATP-synthase B subunit. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. RhoGEF2/Rho guanine nucleotide exchange factor 2, p114RhoGEF/p114 Rho guanine nucleotide exchange factor, p115RhoGEF, p190RhoGEF, PRG/PDZ Rho guanine nucleotide exchange factor, RhoGEF 11, RhoGEF 12, RhoGEF 18, AKAP13/A-kinase anchoring protein 13, and LARG/Leukemia-associated Rho guanine nucleotide exchange factor are included in this CD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 102
33375 275433 cd15790 PH-GRAM_MTMR11 Myotubularin (MTM) related 11 protein (MTMR11) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10, MTMR11, and MTMR12 are catalytically inactive phosphatases that play a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. They contain a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop, which renders them catalytically inactive as phosphatases. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 123
33376 275434 cd15791 PH1_FDG4 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 4, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 94
33377 275435 cd15792 PH1_FGD5 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 5, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the proangiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
33378 275436 cd15793 PH1_FGD6 FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 6, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the proangiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 123
33379 275437 cd15794 PH_ARHGEF18 Rho guanine nucleotide exchange factor 18 Pleckstrin homology (PH) domain. ARHGEF18, also called p114RhoGEF, is a key regulator of RhoA-Rock2 signaling that is crucial for maintenance of polarity in the vertebrate retinal epithelium, and consequently is essential for cellular differentiation, morphology and eventually organ function. ARHGEF18 contains Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 119
33380 275439 cd15795 PMEI-Pla_a_1_like Pollen allergen Pla a 1 and similar plant proteins. The major Platanus acerifolia pollen allergen Pla a 1 belongs to a class of allergens related to proteinaceous invertase and pectin methylesterase inhibitors. Platanus acerifolia is an important cause of pollinosis; Pla a 1 has a prevalence of about 80% among plane tree pollen-allergic patients. Recombinant Pla a 1 binds IgE in vitro, similar to its natural counterpart, rendering it suitable for specific diagnosis and structural studies. Invertase inhibitors are structurally similar to those of pectin methylesterase (PMEIs), an enzyme that is involved in the control of pectin metabolism and is structurally unrelated to invertases. All inhibitors share a size of about 18 kDa, two strictly conserved disulfide bridges and only moderate sequence homology (about 20% sequence identity). 148
33381 275440 cd15796 CIF_like Cell-wall inhibitor of beta fructosidase and similar proteins. Cell-wall invertases (CWIs) are secreted apoplastic enzymes belonging to the glycoside hydrolase family 32 (EC 3.2.1.26) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. Their activity is tightly regulated by compartment-specific inhibitor proteins at transcriptional and post-transcriptional levels. Invertase inhibitors are structurally similar to those of pectin methylesterase (PMEIs), an enzyme that is involved in the control of pectin metabolism and is structurally unrelated to invertases. All inhibitors share a size of about 18 kDa, two strictly conserved disulfide bridges and only moderate sequence homology (about 20% sequence identity). Interaction of invertase inhibitor Nt-CIF (Nicotiana tabacum cell-wall inhibitor of beta-fructosidase) with CWI is strictly pH-dependent, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF. 148
33382 275441 cd15797 PMEI Pectin methylesterase inhibitor. Pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) catalyzes the demethylesterification of homogalacturonans in the cell wall. Its activity is regulated by the proteinaceous PME inhibitor (PMEI) which inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context. 149
33383 275442 cd15798 PMEI-like_3 Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitor domains. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context. 154
33384 275443 cd15799 PMEI-like_4 plant invertase/pectin methylesterase inhibitor domain-containing protein. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF. 151
33385 275444 cd15800 PMEI-like_2 Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitors. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF. 148
33386 275445 cd15801 PMEI-like_1 Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitors. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF. 146
33387 276805 cd15802 RING_CBP-p300 atypical RING domain found in CREB-binding protein and p300 histone acetyltransferases. CBP and p300 (also known as CREBBP or KAT3A and EP300 or KAT3B, respectively) are two histone acetyltransferases (HATs) that associate with and acetylate transcriptional regulators and chromatin. The catalytic core of animal CBP-p300 contains a bromodomain, a CH2 region containing a discontinuous PHD domain interrupted by this RING domain, and a HAT domain. Bromodomain-RING-PHD forms a compact module in which the RING domain is juxtaposed with the HAT substrate-binding site. This RING domain contains only a single zinc ion-binding cluster instead of two; instead of a second zinc atom, a network of hydrophobic interactions stabilizes the domain. The RING domain has an inhibitory role. Disease mutations that disrupt RING attachment lead to upregulation of HAT activity. HAT regulation may require repositioning of the RING domain to facilitate access to an otherwise partially occluded HAT active site. Plant CBP-p300 type HATs lack a bromodomain, whose role in the animal CBP-p300 proteins is to bind acetylated histones; it has been suggested that these plant proteins may utilize a different domain or another bromodomain protein to perform this function. This RING domain has also been referred to as DUF902. 73
33388 276941 cd15803 RLR_C_like C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors, Cereblon (CRBN), and similar protein domains. Retinoic acid-inducible gene (RIG)-I-like Receptors (RLRs) are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They play crucial roles in innate antiviral responses, including the production of proinflammatory cytokines and type I interferon. There are three RLRs in vertebrates, RIG-I, LGP2, and MDA5. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. Cereblon is part of an E3 ubiquitin ligase complex, together with damaged DNA binding protein 1 (DDB1), CUL4A and ROC1. Cereblon interacts directly with DDB1, although the C-terminal domain characterized here does not contribute to that interaction. The C-terminal domain of Cereblon was shown to contain the binding site for thalidomide and its analogs, a class of teratogenic drugs that exhibit an antiproliferative effect on myelomas. Mutations in CRBN, some of which map onto the C-terminal domain, were associated with autosomal recessive mental retardation, which may have to do with interactions between CRBN and ion channels in the brain. RLRs and Cereblon contain a common conserved zinc binding site in their C-terminal domains. 84
33389 276942 cd15804 RLR_C C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors. Retinoic acid-inducible gene (RIG)-I-like Receptors (RLRs) are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They play crucial roles in innate antiviral responses, including the production of proinflammatory cytokines and type I interferon. There are three RLRs in vertebrates, RIG-I, LGP2, and MDA5. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing. They may detect partially overlapping viral substrates, including dengue virus, West Nile virus (WNV), reoviruses, and several paramyxoviruses (such as measles virus and Sendai virus). LGP2 lacks CARD and may play a regulatory role in RLR signaling. It may cooperate with either RIG-I or MDA5 to sense viral RNA. 111
33390 276943 cd15805 RIG-I_C C-terminal domain of Retinoic acid-inducible gene (RIG)-I protein, a cytoplasmic viral RNA receptor. Retinoic acid-inducible gene (RIG)-I protein, also called DEAD box protein 58 (DDX58), is one of three members of the RIG-I-like Receptor (RLR) family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. RIG-I is activated by blunt-ended double-stranded RNA with or without a 5'-triphosphate (ppp), by single-stranded RNA marked by a 5'-ppp and by polyuridine sequences. It has been found to confer resistance to many negative-sense RNA viruses, including orthomyxoviruses, rhabdoviruses, bunyaviruses, and paramyxoviruses, as well as the positive-strand hepatitis C virus. RLRs are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing. 112
33391 276944 cd15806 LGP2_C C-terminal domain of Laboratory of Genetics and Physiology 2 (LGP2), a cytoplasmic viral RNA receptor. Laboratory of Genetics and Physiology 2 (LGP2) is one of three members of the RIG-I-like Receptor (RLR) family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. LGP2 lacks the caspase activation and recruitment domains (CARDs) that are present in other RLRs, which initiate downstream signaling upon viral RNA sensing. LGP2 may play a regulatory role in RLR signaling, and may cooperate with either RIG-I or MDA5 to sense viral RNA. 112
33392 276945 cd15807 MDA5_C C-terminal domain of Melanoma differentiation-associated protein 5, a cytoplasmic viral RNA receptor. Melanoma differentiation-associated protein 5 (MDA5) is also called Interferon-induced helicase C domain-containing protein 1 (IFIH1) or RIG-I-like receptor 2 (RLR-2). It is one of three members of the RLR family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. It has been shown to detect viruses from the Picornaviridae and Caliciviridae families. RLRs are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing. 117
33393 293980 cd15808 SPRY_PRY_TRIM47 PRY/SPRY domain in tripartite motif-containing protein 47 (TRIM47), also known as RING finger protein 100 (RNF100) or Gene overexpressed in astrocytoma protein (GOA). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM47, also known as GOA (Gene overexpressed in astrocytoma protein) or RNF100 (RING finger protein 100). TRIM47 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is highly expressed in kidney tubular cells, but lowly expressed in most tissue. It is overexpressed in astrocytoma tumor cells and plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis; astrocytoma, also known as cerebral astrocytoma, is a malignant glioma that arises from astrocytes. Genome wide studies on white matter lesions have identified a novel locus on chromosome 17q25 harboring several genes such as TRIM47 and TRIM65 which pinpoints to possible novel mechanisms leading to these lesions. 206
33394 293981 cd15809 SPRY_PRY_TRIM4 PRY/SPRY domain in tripartite motif-binding protein 4 (TRIM4), also known as RING finger protein 87 (RNF87). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM4 which is also known as RING finger protein 87 (RNF87). TRIM4 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It is a positive regulator of RIG-I-mediated interferon (IFN) induction. It regulates virus-induced IFN induction and cellular antiviral innate immunity by targeting RIG-I for K63-linked poly-ubiquitination. Over-expression of TRIM4 enhances virus-triggered activation of transcription factors IRF3 and NF-kappaB, as well as IFN-beta induction. Expression of TRIM4 differs significantly in Huntington's Disease (HD) neural cells when compared with wild-type controls, possibly impacting down-regulation of the Huntingtin (HTT) gene, which is involved in the regulation of diverse cellular activities that are impaired in Huntington's Disease (HD) cells. 191
33395 293982 cd15810 SPRY_PRY_TRIM5_6_22_34 PRY/SPRY domain of tripartite motif-binding protein 5, 6, 22 and 34 (TRIM5, TRIM6, TRIM22 and TRIM34). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of very close paralogs, TRIM5, TRIM6, TRIM22 and TRIM34. These domains are composed of RING/B-box/coiled-coil core and are also known as RBCC proteins. They form a locus of four closely related TRIM genes within an olfactory receptor-rich region on chromosome 11 of the human genome. Genetic analysis of this locus indicates that these four genes have evolved by gene duplication from a common ancestral gene. All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. TRIM5 promotes innate immune signaling by activating the TAK1 kinase complex by cooperating with the heterodimeric E2, UBC13/UEV1A. It also stimulates NFkB and AP-1 signaling, and the transcription of inflammatory cytokines and chemokines, amplifying these activities upon retroviral infection. Interaction of its PRY-SPRY or cyclophilin domains with the retroviral capsid lattice stimulates the formation of a complementary lattice by TRIM5, with greatly increased TRIM5 E3 activity, and host cell signal transduction. TRIM6 is selectively expressed in embryonic stem (ES) cells and interacts with the proto-oncogene product Myc, maintaining the pluripotency of the ES cells. TRIM6, together with E2 Ubiquitin conjugase (UbE2K) and K48-linked poly-Ub chains, is critical for the IkappaB kinase epsilon (IKKepsilon) branch of type I interferon (IFN-I) signaling pathway and subsequent establishment of a protective antiviral response. TRIM22 plays an integral role in the host innate immune response to viruses; it has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. Altered TRIM22 expression has also been associated with multiple sclerosis, cancer, and autoimmune disease. While the PRY-SPRY domain of TRIM5a provides specificity and the capsid recognition motif to retroviral restriction, TRIM34 binds HIV-1 capsid but does not restrict HIV-1 infection. 189
33396 293983 cd15811 SPRY_PRY_TRIM11 PRY/SPRY domain of tripartite motif-binding protein 11 (TRIM11), also known as RING finger protein 92 (RNF92). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM11, also known as RING finger protein 92 (RNF92) or BIA1. TRIM11 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It localizes to the nucleus and the cytoplasm; it is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 increases expression of dopamine beta-hydroxylase gene by interacting with the homeodomain transcription factor, PHOX2B, via the B30.2/SPRY domain, thus playing a potential role in the specification of noradrenergic (NA) neuron phenotype. It has also been shown that TRIM11 plays a critical role in the clearance of mutant PHOX2B, which causes congenital central hypoventilation syndrome, via the proteasome. TRIM11 binds a key component of the activator-mediated cofactor complex (ARC105), and destabilizes it, through the ubiquitin-proteasome system; ARC105 mediates chromatin-directed transcription activation and is a key regulatory factor for transforming growth factor beta (TGFbeta) signaling. 169
33397 293984 cd15812 SPRY_PRY_TRIM17 PRY/SPRY domain of tripartite motif-binding protein 17 (TRIM17), also known as testis RING finger protein (terf). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (terf). TRIM17 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein, expressed almost exclusively in the testis. It exhibits E3 ligase activity, causing protein degradation of ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for the mitotic spindle checkpoint, and negatively regulates proliferation of breast cancer cells. TRIM17 undergoes ubiquitination in COS7 fibroblast-like cells but is inhibited and stabilized by TRIM44. 176
33398 293985 cd15813 SPRY_PRY_TRIM20 PRY/SPRY domain in tripartite motif-binding protein 20 (TRIM20), also known as pyrin. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM20, which is also known as pyrin or marenostrin. Unlike TRIM domains that are composed of RING/B-box/coiled-coil core, the N-terminal RING domain in TRIM20 is exchanged by a PYRIN domain (PYD), a prime mediator of protein interactions necessary for apoptosis, inflammation and innate immune signaling pathway, and it also harbors a C-terminal B30.2 domain. Mutations in pyrin (TRIM20) are associated with familial Mediterranean fever (FMF), a recessively hereditary periodic fever syndrome, characterized by episodes of inflammation and fever. These mutations cluster in the C-terminal B30.2 domain and therefore it is assumed that pyrin plays a role in the innate immune system by possibly effecting caspase-1-dependent IL-1beta maturation. 184
33399 293986 cd15814 SPRY_PRY_TRIM27 PRY/SPRY domain in tripartite motif-containing protein 27 (TRIM27), also known as RING finger protein 76 (RNF76). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM27, also known as RING finger protein 76 (RNF76) or RET finger protein (RFP). TRIM27 domain is composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is highly expressed in the spleen, thymus and in cells of the hematopoietic compartment. TRIM27 exhibits either nuclear or cytosolic localization depending on the cell type. TRIM27 negatively regulates nucleotide-binding oligomerization domain containing 2 (NOD2)-mediated signaling by proteasomal degradation of NOD2, suggesting that TRIM27 could be a new target for therapeutic intervention in NOD2-associated diseases such as Crohn's. High expression of TRIM27 is observed in several human cancers, including breast and endometrial cancer, where elevated TRIM27 expression predicts poor prognosis. Also, TRIM27 forms an oncogenic fusion protein with Ret proto-oncogene. It is involved in different stages of spermatogenesis and its significant expression in male germ cells and seminomas, suggests that TRIM27 may be associated with the regulation of testicular germ cell proliferation and histological-type of germ cell tumors. TRIM27 could also be a predictive marker for chemoresistance in ovarian cancer patients. In the neurotoxin model of Parkinson's disease (PD), deficiency of TRIM27 decreases apoptosis and protects dopaminergic neurons, making TRIM27 an effective potential target during the treatment of PD. 177
33400 293987 cd15815 SPRY_PRY_TRIM38 PRY/SPRY domain of tripartite motif-binding protein 38 (TRIM38), also known as Ring finger protein 15 (RNF15). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM38, which is also known as RING finger protein 15 (RNF15) or RORET. TRIM38 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM38 has been shown to act as a suppressor in TOLL-like receptor (TLR)-mediated interferon (IFN)-beta induction by promoting degradation of TRAF6 and NAP1 through the ubiquitin-proteasome system. Another study has shown that TRIM38 may act as a novel negative regulator for TLR3-mediated IFN-beta signaling by targeting TRIF for degradation. TRIM38 has been identified as a critical negative regulator in TNFalpha- and IL-1beta-triggered activation of NF-kappaB and MAP Kinases (MAPKs); it causes degradation of two essential cellular components, TGFbeta-associated kinase 1 (TAK1)-associating chaperones 2 and 3 (TAB2/3). The degradation is promoted through a lysosomal-dependent pathway, which requires the C-terminal PRY-SPRY of TRIM38. Enterovirus 71 infection induces degradation of TRIM38, suggesting that TRIM38 may play a role in viral infections. 182
33401 293988 cd15816 SPRY_PRY_TRIM58 PRY/SPRY domain in tripartite motif-binding protein 58 (TRIM58), also known as BIA2. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM58, also known as BIA2. TRIM58 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is implicated by genome-wide association studies (GWAS) to regulate erythrocyte traits, including cell size and number. Trim58 facilitates erythroblast enucleation by inducing proteolytic degradation of the microtubule motor dynein. 168
33402 293989 cd15817 SPRY_PRY_TRIM60_75 PRY/SPRY domain of tripartite motif-binding protein 60 and 75 (TRIM60 and TRIM75). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM60 and TRIM75, both composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM60 domain is also known as RING finger protein 33 (RNF33) or 129 (RNF129). Based on its expression profile, RNF33 likely plays an important role in the spermatogenesis process, the development of the pre-implantation embryo, and in testicular functions; Rnf33 is temporally transcribed in the unfertilized egg and the pre-implantation embryo, and is permanently silenced before the blastocyst stage. Mice experiments have shown that RNF33 associates with the cytoplasmic motor proteins, kinesin-2 family members 3A (KIF3A) and 3B (KIF3B), suggesting possible contribution to cargo movement along the microtubule in the expressed sites. TRIM75, also known as Gm794, has a single site of positive selection in its RING domain associated with E3 ubiquitin ligase activity. It has not been detectably expressed experimentally due to its constant turnover by the proteasome, and has therefore not been characterized. 168
33403 293990 cd15818 SPRY_PRY_TRIM69 PRY/SPRY domain in tripartite motif-binding protein 69 (TRIM69), also known as RING finger protein 36 (RNF36). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM69, which is also known as RING finger protein 36 (RNF36) or testis-specific ring finger (Trif). TRIM69 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. The mouse ortholog of this gene is specifically expressed in germ cells at the round spermatid stages during spermatogenesis and, when overexpressed, induces apoptosis. TRIM69 has been shown to be a novel regulator of mitotic spindle assembly in tumor cells; it associates with spindle poles and promotes centrosomal clustering, and is therefore essential for formation of a bipolar spindle. 187
33404 293991 cd15819 SPRY_PRY_BTN1_2 butyrophilin subfamily member A1 and A2 (BTN1A and BTN2A). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of butyrophilin family 1A and 2A (BTN1A and BTN2A). BTNs belong to receptor glycoproteins of immunoglobulin (Ig) superfamily, characterized by the presence of extracellular Ig-like domains (IgV and/or IgC). BTN1A plays a role in the secretion, formation and stabilization of milk fat globules. The B30.2 domain of BTN1A1 binds the enzyme xanthine oxidoreductase (XOR) in order to participate in milk fat globule secretion; this interaction may lead to the production of reactive oxygen species, which have immunomodulatory and antimicrobial functions. Duplication events have led to three paralogs of BTN2A in primates: BTN2A1, BTN2A2, and BTN2A3. In humans, only BTN2A1 has been functionally characterized; it has been detected on epithelial cells and leukocytes, and identified as a novel ligand of dendritic cell-specific ICAM-3 grabbing nonintegrin (DCSIGN), a C-type lectin receptor that acts as an internalization receptor for HIV-1, HCV, and other pathogens. BTN2A2 mRNA has been shown to be expressed in circulating human immune cells. 172
33405 293992 cd15820 SPRY_PRY_BTN3 PRY/SPRY domain of butyrophilin 3 (BTN3), includes BTN3A1, BTN3A2, BTN3A3 as well as BTN-like 3 (BTNL3); BTN3A also known as CD277. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of butyrophilin family 3A (BTN3A); duplication events have led to three paralogs in primates: BTN3A1, BTN3A2, and BTN3A3. BTNs belong to receptor glycoproteins of immunoglobulin (Ig) superfamily, characterized by the presence of extracellular Ig-like domains (IgV and/or IgC). BTN3 transcripts are ubiquitously present in all immune cells (T cells, B cells, NK cells, monocytes, dendritic cells, and hematopoietic precursors) with different expression levels; BTN3A1 and BTN3A2 are expressed mainly by CD4+ and CD8+ T cells, BTN3A2 is the major form expressed in NK cells, and BTN3A3 is poorly expressed in these immune cells. The PRY/SPRY domain of the BTN3A1 isoform mediates phosphoantigen (pAg)-induced activation by binding directly to the pAg. 176
33406 293993 cd15821 SPRY_PRY_RFPL Ret finger protein-like (RFPL), includes RFP1, 2, 3, 4. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of RFPL protein family, which includes RFPL1, RFPL2, RFPL3 and RFPL4. In humans, RFPL transcripts can be detected at the onset of neurogenesis in differentiating human embryonic stem cells, and in the developing human neocortex. The human RFPL1, 2, 3 genes have a role in neocortex development. RFPL1 is a primate-specific target gene of Pax6, a key transcription factor for pancreas, eye and neocortex development; human RFPL1 decreases cell number through its RFPL-defining motif (RDM) and SPRY domains. The RFPL4 (also known as RFPL4A) gene encodes a putative E3 ubiquitin-protein ligase expressed in adult germ cells and interacts with oocyte proteins of the ubiquitin-proteasome degradation pathway. 178
33407 293994 cd15822 SPRY_PRY_TRIM5 PRY/SPRY domain in tripartite motif-binding protein 5 (TRIM5), also known as RING finger protein 88 (RNF88). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM5 which is also known as RING finger protein 88 (RNF88) or TRIM5alpha (TRIM5a), an antiretroviral restriction factor and a retrovirus capsid sensor in immune signaling. TRIM5 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It blocks retrovirus infection soon after the virion core enters the cell cytoplasm by recognizing the capsid protein lattice that encases the viral genomic RNA; the SPRY domain provides the capsid recognition motif that dictates specificity to retroviral restriction. TRIM5a, an E3 ubiquitin ligase, promotes innate immune signaling by activating the TAK1 kinase complex by cooperating with the heterodimeric E2, UBC13/UEV1A. It also stimulates NFkB and AP-1 signaling, and the transcription of inflammatory cytokines and chemokines, and amplifies these activities upon retroviral infection. Interaction of its PRY-SPRY or cyclophilin domains with the retroviral capsid lattice stimulates the formation of a complementary lattice by TRIM5, with greatly increased TRIM5 E3 activity, and host cell signal transduction. 200
33408 293995 cd15823 SPRY_PRY_TRIM6 PRY/SPRY domain in tripartite motif-binding protein 6 (TRIM6), also known as RING finger protein 89 (RNF89). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM6, also known as RING finger protein 89 (RNF89). TRIM6 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It is selectively expressed in embryonic stem (ES) cells and interacts with the proto-oncogene product Myc, maintaining the pluripotency of the ES cells. TRIM6, together with E2 Ubiquitin conjugase (UbE2K) and K48-linked poly-Ub chains, is critical for the IkappaB kinase epsilon (IKKepsilon) branch of type I interferon (IFN-I) signaling pathway and subsequent establishment of a protective antiviral response. 188
33409 293996 cd15824 SPRY_PRY_TRIM22 PRY/SPRY domain in tripartite motif-containing protein 22 (TRIM22), also known as RING finger protein 94 (RNF94) or Stimulated trans-acting factor of 50 kDa (STAF50). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM22, also known as RING finger protein 94 (RNF94) or STAF50 (Stimulated trans-acting factor of 50 kDa). TRIM22 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. TRIM22 is an interferon-induced protein, predominantly expressed in peripheral blood leukocytes, in lymphoid tissue such as spleen and thymus, and in the ovary. TRIM22 plays an integral role in the host innate immune response to viruses; it has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 inhibits influenza A virus (IAV) infection by targeting the viral nucleoprotein for degradation; it represents a novel restriction factor up-regulated upon IAV infection that curtails its replicative capacity in epithelial cells. Altered TRIM22 expression has also been associated with multiple sclerosis, cancer, and autoimmune disease. A large number of high-risk non-synonymous (ns)SNPs have been identified in the highly polymorphic TRIM22 gene, most of which are located in the SPRY domain and could possibly alter critical regions of the SPRY structural and functional residues, including several sites that undergo post-translational modification. TRIM22 is a direct p53 target gene and inhibits the clonogenic growth of leukemic cells. Its expression in Wilms tumors is negatively associated with disease relapse. It is greatly under-expressed in breast cancer cells as compared to non-malignant cell lines; p53 dysfunction may be one of the mechanisms for its down-regulation. 198
33410 293997 cd15825 SPRY_PRY_TRIM34 PRY/SPRY domain in tripartite motif-containing protein 34 (TRIM34), also known as RING finger protein 21 (RNF21) or interferon-responsive finger protein (IFP1). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM34, also known as RING finger protein 21 (RNF21) or interferon-responsive finger protein (IFP1). TRIM34 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. The TRIM21 cDNA possesses at least three kinds of isoforms, due to alternative splicing, of which only the long and medium forms contain the SPRY domain. It is an interferon-induced protein, predominantly expressed in the testis, kidney, and ovary. The SPRY domain provides the capsid recognition motif that dictates specificity to retroviral restriction. While the PRY-SPRY domain provides specificity and the capsid recognition motif to retroviral restriction, TRIM34 binds HIV-1 capsid but does not restrict HIV-1 infection. 185
33411 293998 cd15826 SPRY_PRY_TRIM15 PRY/SPRY domain in tripartite motif-binding protein 15 (TRIM15). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 15 (TRIM15), also referred to as RING finger protein 93 (RNF93) or Zinc finger protein B7 or 178 (ZNFB7 or ZNF178). TRIM15 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. The PRY and SPRY/B30.2 domains can function as immune defense components and in pathogen sensing. TRIM15 has been shown to regulate inflammatory and innate immune signaling, in addition to displaying antiviral activities. Down-regulation of TRIM15, as well as TRIM11, enhances virus release, suggesting that these proteins contribute to the endogenous restriction of retroviruses in cells. TRIM15 is also a regulatory component of focal adhesion turnover and cell migration. 170
33412 293999 cd15827 SPRY_PRY_TRIM10 PRY/SPRY domain of tripartite motif-binding protein 10 (TRIM10) also known as hematopoietic RING finger 1 (HERF1). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM10, also known as RING finger protein 9 (RNF9) or hematopoietic RING finger 1 (HERF1). TRIM10 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. TRIM10/HERF1 is predominantly expressed during definitive erythropoiesis and in embryonic liver, and minimally expressed in adult liver, kidney, and colon. It is critical for erythroid cell differentiation and its down-regulation leads to cell death; inhibition of TRIM10 expression blocks terminal erythroid differentiation, while its over-expression in erythroid cells induces beta-major globin expression and erythroid differentiation. 172
33413 294000 cd15828 SPRY_PRY_TRIM60 PRY/SPRY domain of tripartite motif-binding protein 60 (TRIM60) also known as RING finger protein 33 (RNF33). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM60, which is also known as RING finger protein 33 (RNF33) or 129 (RNF129). TRIM60 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. Based on its expression profile, RNF33 likely plays an important role in the spermatogenesis process, the development of the pre-implantation embryo, and in testicular functions; Rnf33 is temporally transcribed in the unfertilized egg and the pre-implantation embryo, and is permanently silenced before the blastocyst stage. Mice experiments have shown that RNF33 associates with the cytoplasmic motor proteins, kinesin-2 family members 3A (KIF3A) and 3B (KIF3B), suggesting possible contribution to cargo movement along the microtubule in the expressed sites. 180
33414 294001 cd15829 SPRY_PRY_TRIM75 PRY/SPRY domain of tripartite motif-binding protein 75 (TRIM75). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM75, also known as Gm794. TRIM75 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM75 has a single site of positive selection in its RING domain associated with E3 ubiquitin ligase activity. It has not been detectably expressed experimentally due to its constant turnover by the proteasome, and has therefore not been characterized. 187
33415 276939 cd15830 BamD BamD lipoprotein, a component of the beta-barrel assembly machinery. BamD, also called YfiO, is part of the beta-barrel assembly machinery (BAM), which is essential for the folding and insertion of outer membrane proteins (OMPs) in the OM of Gram-negative bacteria. Transmembrane OMPs carry out important functions including nutrient and waste management, cell adhesion, and structural roles. The BAM complex is composed of the beta-barrel OMP BamA (also called Omp85/YaeT) and four lipoproteins BamBCDE. BamD is the only BAM lipoprotein required for viability. Both BamA and BamD are broadly distributed in Gram-negative bacteria, and may constitute the core of the BAM complex. BamD contains five Tetratricopeptide repeats (TPRs). The three TPRs at the N-terminus may participate in interaction with substrates, while the two TPRs in the C-terminus may be involved in binding with other BAM components. 213
33416 276938 cd15831 BTAD Bacterial Transcriptional Activation (BTA) domain. The Bacterial Transcriptional Activation (BTA) domain is present in the putative transcriptional regulator Mycobacterium EmbR and the related Streptomyces antibiotic regulatory protein (SARP) family of transcription factors, which includes DnrI and AfsR, among others. Members of this family contain an N-terminal DNA-binding domain, followed by the BTA domain, and many have diverse domains at the C-terminus. EmbR contains a C-terminal forkhead-associated (FHA) domain, which mediates binding to threonine-phosphorylated sites in a sequence-specific manner. The BTA domain of EmbR contains three Tetratricopeptide repeats (TPRs) and two C-terminal helices. The TPR motif typically contains 34 amino acids, and 5 or 6 tandem repeats of the motif generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein. 146
33417 276937 cd15832 SNAP Soluble N-ethylmaleimide-sensitive factor (NSF) Attachment Protein family. Members of the soluble NSF attachment protein (SNAP) family are involved in intracellular membrane trafficking, including vesicular transport between the endoplasmic reticulum and Golgi apparatus. Higher eukaryotes contain three isoforms of SNAPs: alpha, beta, and gamma. Alpha-SNAP is universally present in eukaryotes and acts as an adaptor protein between SNARE (integral membrane SNAP receptor) and NSF for recruitment to the 20S complex. Beta-SNAP is brain-specific and shares high sequence identity (about 85%) with alpha-SNAP. Gamma-SNAP is weakly related (about 20-25% identity) to the two other isoforms, and is ubiquitous. It may help regulate the activity of the 20S complex. The X-ray structures of vertebrate gamma-SNAP and yeast Sec17, a SNAP family member, show similar all-helical structures consisting of an N-terminal extended twisted sheet of four Tetratricopeptide repeat (TPR)-like helical hairpins and a C-terminal helical bundle. 278
33418 276930 cd15834 TNFRSF1A_teleost Tumor necrosis factor receptor superfamily member 1A (TNFRSF1A) in teleosts; also known as TNFR1. This subfamily of TNFRSF1 (also known as type I TNFR, TNFR1, DR1, TNFRSF1A, CD120a, p55) is found in teleosts. It binds TNF-alpha, through the death domain (DD), and activates NF-kappaB, mediates apoptosis and activates signaling pathways controlling inflammatory, immune, and stress responses. It mediates signal transduction by interacting with antiapoptotic protein BCL2-associated athanogene 4 (BAG4/SODD) and adaptor proteins TRAF2 and TRADD that play regulatory roles. The human genetic disorder called tumor necrosis factor associated periodic syndrome (TRAPS), or periodic fever syndrome, is associated with germline mutations of the extracellular domains of this receptor, possibly due to impaired receptor clearance. Serum levels of TNFRSF1A are elevated in schizophrenia and bipolar disorder, and high levels are also associated with cognitive impairment and dementia. Knockout studies in zebrafish embryos have shown that a signaling balance between TNFRSF1A and TNFRSF1B is required for endothelial cell integrity. TNFRSF1A signals apoptosis through caspase-8, whereas TNFRSF1B signals survival via NF-kappaB in endothelial cells. Thus, this apoptotic pathway seems to be evolutionarily conserved, as TNFalpha promotes apoptosis of human endothelial cells and triggers caspase-2 and P53 activation in these cells via TNFRSF1A. 150
33419 276931 cd15835 TNFRSF1B_teleost Tumor necrosis factor receptor superfamily member 1B (TNFRSF1B) in teleost; also known as TNFR2. This subfamily of TNFRSF1B (also known as TNFR2, type 2 TNFR, TNFBR, TNFR80, TNF-R75, TNF-R-II, p75, CD120b) is found in teleosts. It binds TNF-alpha, but lacks the death domain (DD) that is associated with the cytoplasmic domain of TNFRSF1A (TNFR1). It is inducible and expressed exclusively by oligodendrocytes, astrocytes, T cells, thymocytes, myocytes, endothelial cells, and in human mesenchymal stem cells. TNFRSF1B protects oligodendrocyte progenitor cells (OLGs) against oxidative stress, and induces the up-regulation of cell survival genes. While pro-inflammatory and pathogen-clearing activities of TNF are mediated mainly through activation of TNFRSF1A, a strong activator of NF-kappaB, TNFRSF1B is more responsible for suppression of inflammation. Although the affinities of both receptors for soluble TNF are similar, TNFRSF1B is sometimes more abundantly expressed and thought to associate with TNF, thereby increasing its concentration near TNFRSF1A receptors, and making TNF available to activate TNFRSF1A (a ligand-passing mechanism). Knockout studies in zebrafish embryos have shown that a signaling balance between TNFRSF1A and TNFRSF1B is required for endothelial cell integrity. TNFRSF1A signals apoptosis through caspase-8, whereas TNFRSF1B signals survival via NF-kB in endothelial cells. In goldfish (Carassius auratus L.), TNFRSF1B expression is substantially higher than that of TNFRSF1 in tissues and various immune cell types. Both receptors are most robustly expressed in monocytes; mRNA levels of TNFRSF1B are lowest in peripheral blood leukocytes. 130
33420 276932 cd15836 TNFRSF11A_teleost Tumor necrosis factor receptor superfamily member 11A (TNFRSF11A) in teleost; also known as RANK. TNFRSF11A (also known as RANK, FEO, OFE, ODFR, OSTS, PDB2, CD26, OPTB7, TRANCER, LOH18CR1) induces the activation of NF-kappa B and MAPK8/JNK through interactions with various TRAF adaptor proteins. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. This receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and Juvenile Paget's disease (JPD) of bone. Alternatively spliced transcript variants have been described for this locus. Mutation analysis may improve diagnosis, prognostication, recurrence risk assessment, and perhaps treatment selection among the monogenic disorders of RANKL/OPG/RANK activation. 122
33421 276933 cd15837 TNFRSF26 Tumor necrosis factor receptor superfamily member 26 (TNFRSF26), also known as tumor necrosis factor receptor homolog 3 (TNFRH3). TNFRSF26 (also known as tumor necrosis factor receptor homolog 3 (TNFRH3) or TNFRSF24) is predominantly expressed in embryos and lymphoid cell types, along with its closely related TNFRSF22 and TNFRSF23 orthologs, and is developmentally regulated. Unlike TNFRSF22/23, TNFRSF26 does not serve as a TRAIL decoy receptor; it remains an orphan receptor. 118
33422 276934 cd15838 TNFRSF27 Tumor necrosis factor receptor superfamily member 27 (TNFRSF27), also known as ectodysplasin A2 receptor (EDA2R) or X-linked ectodermal dysplasia receptor (XEDAR). TNFRSF27 (also known as ectodysplasin A2 receptor (EDA2R), X-linked ectodermal dysplasia receptor (XEDAR), EDAA2R, EDA-A2R) has two isoforms, EDA-A1 and EDA-A2, that are encoded by the anhidrotic ectodermal dysplasia (EDA) gene. It is highly expressed during embryonic development and binds to ectodysplasin-A2 (EDA-A2), playing a crucial role in the p53-signaling pathway. EDA2R is a direct p53 target that is frequently down-regulated in colorectal cancer tissues due to its epigenetic alterations or through the p53 gene mutations. Mutations in the EDA-A2/XEDAR signaling give rise to ectodermal dysplasia, characterized by loss of hair, sweat glands, and teeth. A non-synonymous SNP on EDA2R, along with genetic variants in human androgen receptor is associated with androgenetic alopecia (AGA). 116
33423 276935 cd15839 TNFRSF_viral Tumor necrosis factor receptor superfamily members, virus-encoded. This family contains viral TNFR homologs that include vaccinia virus (VACV) cytokine response modifier E (CrmE), an encoded TNFR that shares significant sequence similarity with mammalian type 2 TNF receptors (TNFSFR1B, p75, TNFR type 2), a cowpox virus encoded cytokine-response modifier B (crmB), which is a secreted form of TNF receptor that can contribute to the modification of TNF-mediated antiviral processes, and a myxoma virus (MYXV) T2 (M-T2) protein that binds and inhibits rabbit TNF-alpha. The CrmE structure confirms that the canonical TNFR fold is adopted, but only one of the two "ligand-binding" loops of TNFRSF1A is conserved, suggesting a mechanism for the higher affinity of poxvirus TNFRs for TNFalpha over lymphotoxin-alpha. CrmB protein specifically binds TNF-alpha and TNF-beta indicating that cowpox virus seeks to invade antiviral processes mediated by TNF. Intracellular M-T2 blocks virus-induced lymphocyte apoptosis via a highly conserved viral preligand assembly domain (vPLAD), which controls receptor signaling competency prior to ligand binding. 125
33424 277193 cd15840 SNARE_Qa SNARE motif, subgroup Qa. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qa SNAREs are syntaxin 18, syntaxin 5, syntaxin 16, and syntaxin 1. 59
33425 277194 cd15841 SNARE_Qc SNARE motif, subgroup Qc. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qc SNAREs are C-terminal domains of SNAP23 and SNAP25, syntaxin 8, syntaxin 6, and Bet1. 59
33426 277195 cd15842 SNARE_Qb SNARE motif, subgroup Qb. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27. 62
33427 277196 cd15843 R-SNARE SNARE motif, subgroup R-SNARE. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). In contrast to Qa-, Qb- and Qc-SNAREs that are localized to target organelle membranes, R-SNAREs are localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qa SNAREs are syntaxin 18, syntaxin 5, syntaxin 16, and syntaxin 1. 60
33428 277197 cd15844 SNARE_syntaxin5 SNARE motif of syntaxin 5. Syntaxin 5 (Syn5) regulates the transport from the ER to the Golgi, as well as the early/recycling endosomes to the trans-Golgi network and participates in the assembly of transitional ER and the Golgi, lipid droplet fusion, and cytokinesis. Syn5 exists in 2 isoforms, long (42 kDa) and short (35 kDa). The short form is localized in the Golgi complex, whereas the long form is additionally found in the endoplasmic reticulum (ER). The syntaxin-5 SNARE complexes, which also contain Bet1 (Qc) and either GS27 (Qb) and Sec22B (R-SNARE) or GS28 (Qb) and Ykt6 (R-SNARE), regulate the early secretory pathway of eukaryotic cells at the level of endoplasmic reticulum (ER) to Golgi transport. The syntaxin-5 SNARE complex, which also contains GS15 (Qc), GS28 (Qb) and Ykt6 (R-SNAREs) is involved in the transport from the trans-Golgi network to the cis-Golgi. Syn5 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 86
33429 277198 cd15845 SNARE_syntaxin16 SNARE motif of syntaxin 16. Syntaxin 16 is located in trans-Golgi network (TGN) and regulated by the SM protein Vps45p. It forms a complex with syntaxin 6 (Qc), Vti1a (Qb) and VAMP4 (R-SNARE) and is involved in the regulation of recycling of early endosomes to the trans-Golgi network (TGN). Syntaxin 16 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 59
33430 277199 cd15846 SNARE_syntaxin17 SNARE motif of syntaxin 17. Syntaxin 17 (STX17) belongs to the Qa subgroup of SNAREs and interacts with SNAP29 (Qb/Qc) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials, including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 62
33431 277200 cd15847 SNARE_syntaxin7_like SNARE motif of syntaxin 7, 12 and related sequences. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. This subgroup of the Qa SNAREs includes syntaxin 7, syntaxin 12, TSNARE1 and related proteins. 60
33432 277201 cd15848 SNARE_syntaxin1-like SNARE motif of syntaxin 1 and related proteins. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. This subgroup of the Qa SNAREs includes syntaxin 1, syntaxin 11, syntaxin 19, syntaxin 2, syntaxin 3, syntaxin 4 and related proteins. 63
33433 277202 cd15849 SNARE_Sso1 SNARE motif of Sso1. Saccharomyces cerevisiae SNARE protein Sso1p forms a complex with synaptobrevin homolog Snc1p (R-SNARE) and the SNAP-25 homolog Sec9p (Qb/c) which is involved in exocytosis. Sso1 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 64
33434 277203 cd15850 SNARE_syntaxin18 SNARE motif, subgroup Qa. Syntaxin18 (also known as Ufe1p) is involved in retrograde transport of CopI coatomer coated vesicles from the Golgi to the ER. It forms a complex with USE1 (SLT1, Qc), Bnip1 (Sec20p, Qb) and Sec22b (R-SNARE). Syntaxin18 is a member of the Qa subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 59
33435 277204 cd15851 SNARE_Syntaxin6 SNARE motif of syntaxin 6. Syntaxin 6 forms a complex with syntaxin 16 (Qa), Vti1a (Qb) and VAMP4 (R-SNARE) and is involved in the regulation of recycling of early endosomes to the trans-Golgi network (TGN). Syntaxin 6 and its yeast homolog TLG1 are members of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 66
33436 277205 cd15852 SNARE_Syntaxin8 SNARE motif of syntaxin 8. Syntaxin 8 forms a complex with syntaxin 7 (Qa), Vti1b (Qb) and either VAMP7 or VAMP8 (R-SNARE) and is involved in the transport from early endosomes to the lysosome. Syntaxin 8 is a member of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 59
33437 277206 cd15853 SNARE_Bet1 SNARE motif of Bet1. Bet1 forms complexes with GS27 (Qb), syntaxin-5 (Qa) and Sec22B (R-SNARE) or GS28 (Qb), syntaxin-5 (Qa) and Ykt6 (R-SNARE). These complexes regulate the early secretory pathway of eukaryotic cells at the level of the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC) and from ERGIC to the cis-Golgi, respectively. Bet1 is a member of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 59
33438 277207 cd15854 SNARE_SNAP47C C-terminal SNARE motif of SNAP47. C-terminal SNARE motif of SNAP47, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. The exact function of SNAP47 is unknown. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29 and SEC9. 59
33439 277208 cd15855 SNARE_SNAP25C_23C C-terminal SNARE motif of SNAP25 and SNAP23. C-terminal SNARE motifs of SNAP25 and SNAP23, members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with STX4 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP29, SNAP47 and SEC9. 59
33440 277209 cd15856 SNARE_SNAP29C C-terminal SNARE motif of SNAP29. C-terminal SNARE motif of SNAP29, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP29 interacts with STX17 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SEC9. 59
33441 277210 cd15857 SNARE_SEC9C C-terminal SNARE motif of SEC9. C-terminal SNARE motif of fungal SEC9, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SEC9 interacts with Sso1(Qa) and the lysosomal R-SNARE Snc1. The complex plays a role in post-Golgi transport. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SNAP29. 59
33442 277211 cd15858 SNARE_VAM7 SNARE motif of VAM7. Fungal VAM7 (vacuolar morphogenesis protein 7) is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family involved in vacuolar protein transport and membrane fusion. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 59
33443 277212 cd15859 SNARE_SYN8 SNARE motif of SYN8. Fungal SYN8 is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family present in the endosomes. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 68
33444 277213 cd15860 SNARE_USE1 SNARE motif of USE1. USE1 (unconventional SNARE in the ER 1 homolog, also known as SNARE-like tail-anchored protein 1 or SLT1) is involved in retrograde transport of CopI coatomer coated vesicles from the Golgi to the ER. It forms a complex with syntaxin18 (Ufe1p, Qa), Bnip1 (Sec20p, Qb) and Sec22b (R-SNARE). USE1 is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 60
33445 277214 cd15861 SNARE_SNAP25N_23N_29N_SEC9N N-terminal SNARE motif of SNAP25, SNAP23, SNAP29, and SEC9. N-terminal SNARE motif of members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29, SNAP47 and SEC9. 65
33446 277215 cd15862 SNARE_Vti1 SNARE motif of Vti1. Vti1 (vesicle transport through interaction with t-SNAREs homolog 1) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1b interacts with syntaxin 7 (Qa), syntaxin 8 (Qc), and the lysosomal R-SNARE VAMP8 or VAMP7 to form the endosomal SNARE core complex that mediates transport from the early endosomes and the MVBs (multivesicular bodies), and from the MVBs to the lysosomes, respectively. Vti1a interacts with syntaxin 16 (Qa), syntaxin 6 (Qc), and the lysosomal R-SNARE VAMP4 to form an endosomal SNARE core complex that mediates transport from the early endosomes to the TGN (trans-Golgi network). SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27. 62
33447 277216 cd15863 SNARE_GS27 SNARE motif of GS27. GS27 (also known as Bos1, EPM6, golgi SNAP receptor complex member 2 or GOSR2) is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. GS27 forms a complex together with Bet1 (Qc), syntaxin-5 (Qa) and Sec22B (R-SNARE). This complex regulates the early secretory pathway of eukaryotic cells at the level of the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC). SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 66
33448 277217 cd15864 SNARE_GS28 SNARE motif of GS28. GS28 (also known as golgi SNAP receptor complex member 1 or GOSR1) forms complexes with syntaxin-5 (Qa), Ykt6 (R-SNARE) and either Bet1 (Qc) or GS15 (Qc). These complexes regulate the early secretory pathway of eukaryotic cells at the level of the transport from the ER-Golgi intermediate compartment (ERGIC) to the cis-Golgi and transport from the trans-Golgi network to the cis-Golgi, respectively. GS28 is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 66
33449 277218 cd15865 SNARE_SEC20 SNARE motif of SEC20. SEC20 (also known as BNIP1, NIP1, or TRG-8) forms a complex with syntaxin 18 (Qa), SEC22 (R-SNARE) and USE1 (Qc), and is involved in the transport from cis-Golgi to the endoplasmic reticulum (ER). SEC20 is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 93
33450 277219 cd15866 R-SNARE_SEC22 SNARE motif of SEC22. SEC22 forms complexes with syntaxin 18 (Qa), Sec20 (Qb) and USE1 (Qc), and with syntaxin 5 (Qa), GS27 (Qb) and Bet1 (Qc). These complexes are involved in the transport from cis-Golgi to the endoplasmic reticulum (ER) and in the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC), respectively. SEC22 is a member of the R-SNARE subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 64
33451 277220 cd15867 R-SNARE_YKT6 SNARE motif of YKT6. Ykt6 forms complexes with syntaxin-5 (Qa), GS28 (Qb) and either Bet1 (Qc) or GS15 (Qc). This complex regulates the early secretory pathway of eukaryotic cells at the level of the transport from the ER-Golgi intermediate compartment (ERGIC) to the cis-Golgi and transport from the trans-Golgi network to the cis-Golgi, respectively. Ykt6 is a member of the R-SNARE subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 61
33452 277221 cd15868 R-SNARE_VAMP8 SNARE motif of VAMP8. The lysosomal VAMP8 (vesicle-associated membrane protein 8, also called endobrevin) protein belongs to the R-SNARE subgroup of SNAREs and interacts with STX17 (Qa) and SNAP29 (Qb/Qc). The complex plays a role in autophagosome-lysosome fusion via regulating the transport from early endosomes to multivesicular bodies. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 68
33453 277222 cd15869 R-SNARE_VAMP4 SNARE motif of VAMP4. The VAMP-4 (vesicle-associated membrane protein 4) protein belongs to the R-SNARE subgroup of SNAREs and interacts with syntaxin 16 (Qa), Vti1a (Qb) and syntaxin 6 (Qc). This complex plays a role in maintenance of Golgi ribbon structure and normal retrograde trafficking from the early endosome to the trans-Golgi network (TGN). SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 67
33454 277223 cd15870 R-SNARE_VAMP2 SNARE motif of VAMP2. The VAMP-2 (vesicle-associated membrane protein 2, also called synaptobrevin-2) protein belongs to the R-SNARE subgroup of SNAREs and interacts with Syntaxin-1 (Qa) and SNAP-25 (Qb/Qc), as well as syntaxin 12 (Qa) and SNAP23 (Qb/Qc). The complexes play a role in transport of secretory granules from trans-Golgi network to the plasma membrane, and in the transport from early endosomes to and from the plasma membrane, respectively. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 63
33455 277224 cd15871 R-SNARE_VAMP7 SNARE motif of VAMP7. The VAMP-7 (vesicle-associated membrane protein 7, also called synaptobrevin-like protein 1) protein belongs to the R-SNARE subgroup of SNAREs and interacts with syntaxin 7 (Qa), syntaxin 8 (Qc) and Vti1b (Qb). The complex is involved in the transport from early endosomes to the lysosome via regulating the transport from multivesicular bodies to the lysosomes. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 65
33456 277225 cd15872 R-SNARE_VAMP5 SNARE motif of VAMP5. The VAMP-5 (vesicle-associated membrane protein 5) protein belongs to the R-SNARE subgroup of SNAREs. Its function is unknown. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. 68
33457 277226 cd15873 R-SNARE_STXBP5_6 SNARE domain of STXBP5, STXBP6 and related proteins. Syntaxin binding protein 5 (STXBP5, also called Tomosyn), as well as its relative Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 61
33458 277227 cd15874 R-SNARE_Snc1 SNARE motif of Snc1. Saccharomyces cerevisiae SNARE protein Snc1p forms a complex with syntaxin homolog Sso1p (Qa) and the SNAP-25 homolog Sec9p (Qb/c) which is involved in exocytosis. Snc1 is a member of the R-SNARE subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 60
33459 277228 cd15875 SNARE_syntaxin7 SNARE motif of syntaxin 7. Syntaxin 7 forms a complex with syntaxin 8 (Qc), Vti1b (Qb) and either VAMP7 or VAMP8 (R-SNARE) and is involved in the transport from early endosomes to the lysosome. Syntaxin 7 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 60
33460 277229 cd15876 SNARE_syntaxin12 SNARE motif of syntaxin 12. Syntaxin 12 (STX12, also known as STX13 and STX14) forms a complex with SNAP25 (Qb/Qc) or SNAP29 (Qb/Qc) and VAMP2 or VAMP3 (R-SNARE) and plays a role in plasma membrane to early endosome transport. Syntaxin 12 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 67
33461 277230 cd15877 SNARE_TSNARE1 SNARE motif of TSNARE1. TSNARE1 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. Its function is unknown, but polymorphisms in human TSNARE1 have been associated with schizophrenia susceptibility. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. TSNARE1 is part of a subgroup of the Qa SNAREs that also includes syntaxin 7, syntaxin 12 and related proteins. 64
33462 277231 cd15878 SNARE_syntaxin11 SNARE motif of syntaxin 11. Syntaxin 11 (also known as STX11, FHL4, HLH4, HPLH4) is present on endosomal membranes, including late endosomes and lysosomes in macrophages, and has been shown to bind Vti1b and regulate the availability of Vti1b to form other SNARE-complexes. Mutations in human STX11 have been linked to familial hemophagocytic lymphohistiocytosis type-4 (FHL-4), an autosomal recessive disorder of immune dysregulation. Syntaxin 11 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 63
33463 277232 cd15879 SNARE_syntaxin19 SNARE motif of syntaxin 19. Syntaxin 19 has been shown to have the potential to form SNARE complexes with SNAP-23, 25 and 29 (Qb/Qc) and VAMP3 and VAMP8 (R-SNARE), indicating a role in post-Golgi trafficking or plasma membrane fusion. Syntaxin 19 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 63
33464 277233 cd15880 SNARE_syntaxin1 SNARE motif of syntaxin 1. Syntaxin-1 belongs to the Qa subgroup of SNAREs and interacts with SNAP-25 (Qb/Qc) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in exocytosis of synaptic vesicles. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 69
33465 277234 cd15881 SNARE_syntaxin3 SNARE motif of syntaxin 3. Syntaxin 3 (STX3) has been shown to form a complex with VAMP8 (R-SNARE) and SNAP-23 (Qb/c) in mast cells. Mutations have been implicated in human microvillus inclusion disease (MVID), a disorder of the differentiation of intestinal epithelium. Syntaxin 3 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 69
33466 277235 cd15882 SNARE_syntaxin2 SNARE motif of syntaxin 2. Syntaxin 2 (STX2), also known as epimorphin (EPM or EPIM), may interact with SNAP-23 (Qb/c), and genetic variations are associated with type 1 von Willebrand disease (VWD). Syntaxin 2 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 69
33467 277236 cd15883 SNARE_syntaxin4 SNARE motif of syntaxin 4. Syntaxin-4 forms a complex with SNAP-23 (Qb/Qc) and R-SNAREs VAMP8, VAMP2 and VAMP7 which plays a role in exocytosis of secretory granules. Syntaxin 4 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 63
33468 277237 cd15884 SNARE_SNAP23C C-terminal SNARE motif of SNAP23. C-terminal SNARE motifs of SNAP23, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with Syntaxin-4 (Qa) and the R-SNARE VAMP8. The complex plays a role in exocytosis of secretory granule. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP25, SNAP29, SNAP47 and SEC9. 59
33469 277238 cd15885 SNARE_SNAP25C C-terminal SNARE motif of SNAP25. C-terminal SNARE motifs of SNAP25, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP29, SNAP47 and SEC9. 59
33470 277239 cd15886 SNARE_SEC9N N-terminal SNARE motif of SEC9. N-terminal SNARE motif of fungal SEC9, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SEC9 interacts with Sso1 (Qa) and the lysosomal R-SNARE Snc1. The complex plays a role in post-Golgi transport. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SNAP29. 70
33471 277240 cd15887 SNARE_SNAP29N N-terminal SNARE motif of SNAP29. N-terminal SNARE motif of SNAP29, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP29 interacts with STX17 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SEC9. 65
33472 277241 cd15888 SNARE_SNAP47N N-terminal SNARE motif of SNAP47. N-terminal SNARE motif of SNAP47, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. The exact function of SNAP47 is unknown. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29 and SEC9. 65
33473 277242 cd15889 SNARE_SNAP25N_23N N-terminal SNARE motif of SNAP25 and SNAP23. N-terminal SNARE motifs of SNAP25 and SNAP23, members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with STX4 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP29, SNAP47 and SEC9. 65
33474 277243 cd15890 SNARE_Vti1b SNARE motif of Vti1b-like. Vti1b (vesicle transport through interaction with t-SNAREs homolog 1B) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1b interacts with syntaxin 7 (Qa), syntaxin 8 (Qc), and the lysosomal R-SNARE VAMP8 or VAMP7 to form the endosomal SNARE core complexes that mediate transport from the early endosomes and the MVBs (multivesicular bodies), and from the MVBs to the lysosomes, respectively. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27. 62
33475 277244 cd15891 SNARE_Vti1a SNARE motif of Vti1a-like. Vti1a (vesicle transport through interaction with t-SNAREs homolog 1A) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1a interacts with syntaxin 16 (Qa), syntaxin 6 (Qc), and the lysosomal R-SNARE VAMP4 to form an endosomal SNARE core complex that mediates transport from the early endosomes to the TGN (trans-Golgi network). SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27. 62
33476 277245 cd15892 R-SNARE_STXBP6 SNARE domain of STXBP6. Syntaxin binding protein 6 (STXBP6, also called Amisyn), as well as its relative Syntaxin binding protein 5 (STXBP5, also called Tomosyn), contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 62
33477 277246 cd15893 R-SNARE_STXBP5 SNARE domain of STXBP5. Syntaxin binding protein 5 (STXBP5, also called Tomosyn), as well as its relative Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. Tomosyn contains an N-terminal WD40 repeat region and has been shown to form complexes with SNAP-25 and syntaxin 1a, as well as SNAP-23 and syntaxin 4. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. 61
33478 277247 cd15894 SNARE_SNAP25N N-terminal SNARE motif of SNAP25. N-terminal SNARE motifs of SNAP25, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP29, SNAP47 and SEC9. 73
33479 277248 cd15895 SNARE_SNAP23N N-terminal SNARE motif of SNAP23. N-terminal SNARE motifs of SNAP23, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with Syntaxin-4 (Qa) and the R-SNARE VAMP8. The complex plays a role in exocytosis of secretory granule. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP25, SNAP29, SNAP47 and SEC9. 67
33480 276899 cd15896 MYSc_Myh19 class II myosin heavy chain 19, motor domain. Myosin motor domain of muscle myosin heavy chain 19. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 675
33481 320054 cd15897 EFh_PEF The penta-EF hand (PEF) family. The penta-EF hand (PEF) family contains a group of five EF-hand calcium-binding proteins, including several classical calpain large catalytic subunits (CAPN1, 2, 3, 8, 9, 11, 12, 13, 14), two calpain small subunits (CAPNS1 and CAPNS2), as well as non-calpain PEF proteins, ALG-2 (apoptosis-linked gene 2, also termed programmed cell death protein 6, PDCD6), peflin, sorcin, and grancalcin. Based on the sequence similarity of EF1 hand, ALG-2 and peflin have been classified into group I PEF proteins. Calcium-dependent protease calpain subfamily members, sorcin and grancalcin, are group II PEF proteins. Calpains (EC 3.4.22.17) are calcium-activated intracellular cysteine proteases that play important roles in the degradation or functional modulation of a variety of substrates. They have been implicated in a number of physiological processes such as cell cycle progression, remodeling of cytoskeletal-cell membrane attachments, signal transduction, gene expression and apoptosis. ALG-2 is a pro-apoptotic factor that forms a homodimer in the cell or a heterodimer with its closest paralog peflin through their EF5s. Peflin is a 30-kD PEF protein with a longer N-terminal hydrophobic domain than any other member of the PEF family, and it contains nine nonapeptide (A/PPGGPYGGP) repeats. It exists only as a heterodimer with ALG-2. Dissociation of the heterodimer occurs in the presence of Ca2+. ALG-2 interacts with various proteins in a Ca2+-dependent manner. Sorcin (for soluble resistance-related calcium binding protein) participates in the regulation of calcium homeostasis in cells. Grancalcin is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It plays a key role in leukocyte-specific functions that are responsible for host defense. Grancalcin can form a heterodimer together with sorcin. Members in this family contain five EF-hand motifs attached to an N-terminal region of variable length containing one or more short Gly/Pro-rich sequences. These proteins form homodimers or heterodimers through pairing between the 5th EF-hands from the two molecules. Unlike calmodulin, the PEF domains do not undergo major conformational changes upon binding Ca2+. 165
33482 320029 cd15898 EFh_PI-PLC EF-hand motif found in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) isozymes. PI-PLC isozymes are signaling enzymes that hydrolyze the membrane phospholipid phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. This family corresponds to the four EF-hand motifs containing PI-PLC isozymes, including PI-PLC-beta (1-4), -gamma (1-2), -delta (1,3,4), -epsilon (1), -zeta (1), -eta (1-2). Lower eukaryotes such as yeast and slime molds contain only delta-type isozymes. In contrast, other types of isoforms are present in higher eukaryotes. This family also includes 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase 1 (PLC1) from fungi. Some homologs from plants contain only two atypical EF-hand motifs and they are not included. All PI-PLC isozymes except sperm-specific PI-PLC-zeta share a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. PI-PLC-zeta lacks the PH domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Most EF-hand motifs found in PI-PLCs consist of a helix-loop-helix structure, but lack residues critical to metal binding. Moreover, the EF-hand region of most PI-PLCs may have an important regulatory function, but it has yet to be identified. However, PI-PLC-zeta is a key exception. It is responsible for Ca2+ oscillations in fertilized oocytes and exhibits a high sensitivity to Ca2+ mediated through its EF-hand domain. In addition, PI-PLC-eta2 shows a canonical EF-loop directing Ca2+-sensitivity and thus can amplify transient Ca2+ signals. It also appears that PI-PLC-delta1 can regulate the binding of PH domain to PIP2 in a Ca2+-dependent manner through its functionally important EF-hand domains. PI-PLCs can be activated by a variety of extracellular ligands, such as growth factors, hormones, cytokines and lipids. Their activation has been implicated in tumorigenesis and/or metastasis linked to migration, proliferation, growth, inflammation, angiogenesis and actin cytoskeleton reorganization. PI-PLC-beta isozymes are activated by G-protein coupled receptor (GPCR) through different mechanisms. However, PI-PLC-gamma isozymes are activated by receptor tyrosine kinase (RTK), such as Rho and Ras GTPases. In contrast, PI-PLC-epsilon is activated by both GPCR and RTK. PI-PLC-delta1 and PI-PLC-eta1 are activated by GPCR-mediated calcium mobilization. The activation mechanism for PI-PLC-zeta remains unclear. 137
33483 320021 cd15899 EFh_CREC EF-hand, calcium binding motif, found in CREC-EF hand family. The CREC (Cab45/reticulocalbin/ERC45/calumenin)-EF hand family contains a group of six EF-hand, low-affinity Ca2+-binding proteins, including reticulocalbin (RCN-1), ER Ca2+-binding protein of 55 kDa (ERC-55, also known as TCBP-49 or E6BP), reticulocalbin-3 (RCN-3), Ca2+-binding protein of 45 kDa (Cab45 and its splice variant Cab45b), and calumenin (also known as crocalbin or CBP-50). The proteins are not only localized in various parts of the secretory pathway, but also found in the cytosolic compartment and at the cell surface. They interact with different ligands or proteins and have been implicated in the secretory process, chaperone activity, signal transduction as well as in a large variety of disease processes. 267
33484 320080 cd15900 EFh_MICU EF-hand, calcium binding motif, found in mitochondrial calcium uptake proteins MICU1, MICU2, MICU3, and similar proteins. This family includes mitochondrial calcium uptake protein MICU1 and its two additional paralogs, MICU2 and MICU3. MICU1 localizes to the inner mitochondrial membrane (IMM). It functions as a gatekeeper of the mitochondrial calcium uniporter (MCU) and regulates MCU-mediated mitochondrial Ca2+ uptake, which is essential for maintaining mitochondrial homoeostasis. MICU1 and MICU2 are physically associated within the uniporter complex and are co-expressed across all tissues. They may play non-redundant roles in the regulation of the mitochondrial calcium uniporter. At present, the precise molecular functions of MICU2 and MICU3 remain unclear. MICU2 may play possible roles in Ca2+ sensing and regulation of MCU, calcium buffering with a secondary impact on transport or assembly and stabilization of MCU. MICU3 likely has a role in mitochondrial calcium handling. All members in this family contain an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices. 152
33485 319999 cd15901 EFh_DMD_DYTN_DTN EF-hand-like motif found in the dystrophin/dystrobrevin/dystrotelin family. The dystrophin/dystrobrevin/dystrotelin family has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. Dystrophin is the founder member of this family. It is a sub-membrane cytoskeletal protein associated with the inner surface membrane. Dystrophin and its close paralog utrophin have a large N-terminal extension of actin-binding CH domains, up to 24 spectrin repeats, and a WW domain. Its further paralog, dystrophin-related protein 2 (DRP-2), retains only two of the spectrin repeats. Dystrophin, utrophin or DRP2 can form the core of a membrane-bound complex consisting of dystroglycan, sarcoglycans and syntrophins, known as the dystrophin-glycoprotein complex (DGC) that plays an important role in brain development and disease, as well as in the prevention of muscle damage. Dystrobrevins, including alpha- and beta-dystrobrevin, lack the large N-terminal extension found in dystrophin, but alpha-dystrobrevin has a characteristic C-terminal extension. Dystrobrevins are part of the DGC. They physically associate with members of the dystrophin family and with the syntrophins through their homologous C-terminal coiled coil motifs. In contrast, dystrotelins lack both the large N-terminal extension found in dystrophin and the obvious syntrophin-binding sites (SBSs). Dystrotelins are not critical for mammalian development. They may be involved in other forms of cytokinesis. Moreover, dystrotelin is unable to heterodimerize with members of the dystrophin or dystrobrevin families, or to homodimerize. 163
33486 320075 cd15902 EFh_HEF EF-hand, calcium binding motif, found in the hexa-EF hand proteins family. The hexa-EF hand proteins family, also named the calbindin sub-family, contains a group of six EF-hand Ca2+-binding proteins, including calretinin (CR, also termed 29 kDa calbindin), calbindin D28K (CB, also termed vitamin D-dependent calcium-binding protein, avian-type), and secretagogin (SCGN). CR is a cytosolic hexa-EF-hand calcium-binding protein predominantly expressed in a variety of normal and tumorigenic t-specific neurons of the central and peripheral nervous system. It is a multifunctional protein implicated in many biological processes, including cell proliferation, differentiation, and cell death. CB is highly expressed in brain tissue. It is a strong calcium-binding and buffering protein responsible for preventing neuronal death as well as maintaining and controlling calcium homeostasis. SCGN is a six EF-hand calcium-binding protein expressed in neuroendocrine, pancreatic endocrine and retinal cells. It plays a crucial role in cell apoptosis, receptor signaling and differentiation. It is also involved in vesicle secretion through binding to various proteins, including SNAP25, SNAP23, DOC2alpha, ARFGAP2, rootletin, KIF5B, beta-tubulin, DDAH-2, ATP-synthase and myeloid leukemia factor 2. SCGN functions as a Ca2+ sensor/coincidence detector modulating vesicular exocytosis of neurotransmitters, neuropeptides or hormones. Although the family members share a significant amount of secondary sequence homology, they display altered structural and biochemical characteristics, and operate in distinct fashions. CB contains six EF-hand motifs in a single globular domain, where EF-hands 1, 3, 4, 5 bind four calcium ions. CR contains six EF-hand motifs within two independent domains, CR I-II and CR III-VI. They harbor two and four EF-hand motifs, respectively. The first 5 EF-hand motifs are capable of binding calcium ions, while the EF-hand 6 is inactive. SCGN consists of three globular domains, each of which contains a pair of EF-hand motifs. Human SCGN simultaneously binds four calcium ions through its EF-hands 3, 4, 5 and 6 in one high affinity and three low affinity calcium-binding sites. In contrast, SCGNs in other lower eukaryotes, such as D. rerio, X. laevis, M. domestica, G. gallus, O. anatinus, are fully competent to bind calcium at all six EF-hands. 254
33487 277191 cd15903 Dicer_PBD Partner-binding domain of the endoribonuclease Dicer. The endoribonuclease Dicer plays a central role in RNA interference by breaking down RNA molecules into fragments of about 22 nucleotides (miRNAs and siRNAs). Loading of RNA onto Dicer and the enzymatic cleavage are supported by dsRNA-binding proteins, including trans-activation response (TAR) RNA-binding protein (TRBP) or protein activator of PKR (PACT). Together with Argonaute, this constitutes the RNA-induced silencing complex (RISC) which functions to load the small RNA fragments onto Argonaute. The Partner-binding domain of Dicer is responsible for interactions with the dsRNA-binding proteins. This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases. 104
33488 320706 cd15904 TSPO_MBR Translocator protein (TSPO)/peripheral-type benzodiazepine receptor (MBR) family. This family contains tryptophan-rich translocator protein (TSPO), an integral membrane protein that is highly conserved from bacteria to mammals. In eukaryotes, it is mainly found in the outer mitochondrial membranes of steroid-synthesizing cells of the nervous system where it transports cholesterol into mitochondria. It is known to be highly expressed in metastatic cancer, steroidogenic tissues, as well as inflammatory and neurological diseases such as Alzheimer's and Parkinson's. TSPO is also known as the peripheral benzodiazepine receptor (MBR) and its ligands include benzodiazepine drugs, implicated in regulating apoptosis. In humans, a single polymorphism A147T is associated with psychiatric disorders; the mutation causes structural changes in a region implicated in cholesterol binding. TSPO is homologous to bacterial tryptophan-rich sensory proteins, and their tryptophan residues are believed to be functionally important. In bacteria, TSPO acts as a negative regulator of expression of specific photosynthesis genes in response to oxygen/light; it catalyzes a photooxidative degradation of protoporphyrin IX (PpIX). R. sphaeroides TSPO (RsTSPO) is involved in porphyrin transport, similar to human, while Arabidopsis translocator protein (AtTSPO) is regulated at multiple levels in response to salt stress and perturbations in tetrapyrrole metabolism. 142
33489 320571 cd15905 7tmA_GPBAR1 G protein-coupled bile acid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The G-protein coupled bile acid receptor GPBAR1 is also known as BG37, TGR5 (Takeda G-protein-coupled receptor 5), M-BAR (membrane-type receptor for bile acids), and GPR131. GPBAR1 is highly expressed in the gastrointestinal tract, but also found in many other tissues including liver, colon, heart, skeletal muscle, and brown adipose tissue. GPBAR1 functions as a membrane-bound receptor specific for bile acids, which are the end products of cholesterol metabolism that facilitate digestion and absorption of lipids or fat-soluble vitamins. Bile acids act as liver-specific metabolic signaling molecules and stimulate liver regeneration by activating GPBAR1 and nuclear receptors such as the farnesoid X receptor (FXR). Upon bile acid binding, GPBAR1 activation causes release of the G-alpha(s) subunit and activation of adenylate cyclase. The increase in intracellular cAMP level then stimulates the expression of many genes via the PKA-mediated phosphorylation of cAMP-response element binding protein (CREB). Thus, GPBAR1 signaling exerts various biological effects in immune cells, liver, and metabolic tissues. For example, GPBAR1 activation leads to enhanced energy expenditure in brown adipose tissue and skeletal muscle; stimulation of glucagon-like peptide-1 (GLP-1) production in enteroendocrine L-cells; and inhibition of pro-inflammatory cytokine production in macrophages and attenuation of atherosclerosis development. GPBAR1 is a member of the class A rhodopsin-like family of GPCRs, which comprises receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. 272
33490 320572 cd15906 7tmA_GPR162 G protein-coupled receptor 162, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the orphan G-protein coupled receptor 162 (GPR162), also called A-2 or GRCA, with unknown endogenous ligand and function. Phylogenetic analysis indicates that GPR162 and GPR153 share a common evolutionary ancestor due to a gene duplication event. Although categorized as members of the rhodopsin-like class A GPCRs, both GPR162 and GPR153 contain an HRM motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors and important for efficient G protein-coupled signal transduction. Moreover, the LPxF motif, a variant of the NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in GPR162 and GPR153. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 315
33491 320573 cd15907 7tmA_GPR153 orphan G protein-coupled receptor 153, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 153 (GPR153) with unknown endogenous ligand and function. GPR153 shares a common evolutionary origin with GPR162 and is highly expressed in central nervous system (CNS) including the thalamus, cerebellum, and the arcuate nucleus. Although categorized as a member of the rhodopsin-like class A GPCRs, GPR153 contains an HRM motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors and important for efficient G protein-coupled signal transduction. Moreover, the LPxFL motif, a variant of the NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in GPR153. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 301
33492 320574 cd15908 7tm_TAS2R40-like taste receptor 2, subtypes 39 and 40, and similar receptors, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtypes 39 and 40, which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (taste of glutamate MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol trisphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release. 289
33493 320575 cd15909 7tmF_FZD4_9_10-like class F frizzled subfamilies 4, 9, 10, and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 4, 9 and 10 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 320
33494 320576 cd15910 7tmF_FZD3_FZD6-like class F frizzled subfamilies 3, 6 and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 3 and 6 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others. 321
33495 320577 cd15911 7tmA_OR11A-like olfactory receptor subfamily 11A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 11A and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33496 320578 cd15912 7tmA_OR6C-like olfactory receptor subfamily 6C and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6C, 6X, 6J, 6T, 6V, 6M, 9A, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33497 320579 cd15913 7tmA_OR11G-like olfactory receptor OR11G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 11G, 11H, and related proteins in other mammals, and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33498 320580 cd15914 7tmA_OR6N-like olfactory receptor OR6N and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6N, 6K, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33499 320581 cd15915 7tmA_OR12D-like olfactory receptor subfamily 12D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 12D and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 271
33500 320582 cd15916 7tmA_OR10G-like olfactory receptor subfamily 10G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 10G, 10S, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 276
33501 341351 cd15917 7tmA_OR51_52-like olfactory receptor family 51, 52, 56 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 51, 52, 56, and related proteins in other mammals, sauropsids, amphibians, and fishes. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33502 320584 cd15918 7tmA_OR1_7-like olfactory receptor families 1, 7, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 1 and 7, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33503 320585 cd15919 7tmA_GPR139 G-protein-coupled receptor GPR139, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR139, a vertebrate orphan receptor, is very closely related to GPR142, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and plays an important role in mediating insulin secretion and maintaining glucose homeostasis, whereas GPR139 is expressed almost exclusively in the brain and is suggested to play a role in the control of locomotor activity. Tryptophan and phenylalanine have been identified as putative endogenous ligands of GPR139. These orphan receptors are phylogenetically clustered with invertebrate FMRFamide receptors such as Drosophila melanogaster DrmFMRFa-R. 270
33504 320586 cd15920 7tmA_GPR34-like P2Y-like receptor and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR34 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. GPR34 has been shown to couple to G(i/o) protein and is highly expressed in microglia. Recently, lysophosphatidylserine has been identified as a ligand for GPR34. This group belongs to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others. 278
33505 320587 cd15921 7tmA_CysLTR cysteinyl leukotriene receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in leukocytes (cells of the immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. These receptors belong to the class A GPCR superfamily, which all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 283
33506 320588 cd15922 7tmA_P2Y-like P2Y purinoceptor-like proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y-like proteins are an uncharacterized group that is phylogenetically related to a family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 284
33507 320589 cd15923 7tmA_GPR35_55-like G protein-coupled receptor 35, GPR55, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily is composed of GPR35, GPR55, and similar proteins. GPR35 shares closest homology with GPR55, and they belong to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A number of studies have suggested that GPR35 may play important physiological roles in hypertension, atherosclerosis, nociception, asthma, glucose homeostasis and diabetes, and inflammatory bowel disease. GPR35 is thought to be responsible for brachydactyly mental retardation syndrome, which is associated with a deletion comprising chromosome 2q37 in humans, and is also implicated as a potential oncogene in stomach cancer. GPR35 couples to G(13) and G(i/o) proteins, whereas GPR55 has been reported to couple to G(13), G(12), or G(q) proteins. Activation of GPR55 leads to activation of phospholipase C, RhoA, ROCK, ERK, p38MAPK, and calcium release. Recently, lysophosphatidylinositol (LPI) has been identified as an endogenous ligand for GPR55, while several endogenous ligands for GPR35 have been identified including kynurenic acid, 2-oleoyl lysophosphatidic acid, and zaprinast. 273
33508 341352 cd15924 7tmA_P2Y12-like P2Y purinoceptors 12, 13, 14, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). This cluster only includes P2Y12-like receptors as well as closely related orphan receptor, GPR87. 284
33509 320591 cd15925 7tmA_RNL3R2 relaxin-3 receptor 2 (RNL3R2), member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled receptor RNL3R2 is also known as GPR100, GPR142, and relaxin family peptide receptor 4 (RXFP4). Insulin-like peptide 5 (INSL5) is an endogenous ligand for RNL3R2 and plays a role in fat and glucose metabolism. INSL5 is highly expressed in human rectal and colon tissues. RNL3R2 signals through G(i) protein and inhibits adenylate cyclase, thereby inhibiting cAMP accumulation. 283
33510 320592 cd15926 7tmA_RNL3R1 relaxin 3 receptor 1 (RNL3R1), member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled receptor RNL3R1 is also known as GPCR135, relaxin family peptide receptor 3 (RXFP3), and somatostatin- and angiotensin-like peptide receptor (SALPR). RNL3/relaxin-3, a member of the insulin superfamily, is an endogenous neuropeptide ligand for RNL3R1. RNL3R1 is predominantly expressed in brain regions and implicated in stress, anxiety, feeding, and metabolism. RNL3R1 signals through G(i) protein and inhibits adenylate cyclase, thereby inhibiting cAMP accumulation, and also activates the Erk1/2 signaling pathway. 288
33511 320593 cd15927 7tmA_Bombesin_R-like bombesin receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. This bombesin subfamily of G-protein coupled receptors consists of neuromedin B receptor (NMBR), gastrin-releasing peptide receptor (GRPR), and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin. Mammalian bombesin-related peptides are widely distributed in the gastrointestinal and central nervous systems. The bombesin family receptors couple mainly to the G proteins of G(q/11) family. NMBR functions as the receptor for the neuropeptide neuromedin B, a potent mitogen and growth factor for normal and cancerous lung and for gastrointestinal epithelial tissues. Gastrin-releasing peptide is an endogenous ligand for GRPR and shares high sequence homology with NMB in the C-terminal region. Both NMB and GRP possess bombesin-like biochemical properties. BRS-3 is classified as an orphan receptor and suggested to play a role in sperm cell division and maturation. BRS-3 interacts with known naturally-occurring bombesin-related peptides with low affinity; however, no endogenous high-affinity ligand to the receptor has been identified. The bombesin receptor family belongs to the seven transmembrane rhodopsin-like G-protein coupled receptors (class A GPCRs), which perceive extracellular signals and transduce them to guanine nucleotide-binding (G) proteins. 294
33512 320594 cd15928 7tmA_GHSR-like growth hormone secretagogue receptor, motilin receptor, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes growth hormone secretagogue receptor (GHSR or ghrelin receptor), motilin receptor (also called GPR38), and related proteins. Both GHSR and GPR38 bind peptide hormones. Ghrelin, the endogenous ligand for GHSR, is an acylated 28-amino acid peptide hormone produced by ghrelin cells in the gastrointestinal tract. Ghrelin is also called the hunger hormone and is involved in the regulation of growth hormone release, appetite and feeding, gut motility, lipid and glucose metabolism, and energy balance. Motilin, the ligand for GPR38, is a 22 amino acid peptide hormone expressed throughout the gastrointestinal tract and stimulates contraction of gut smooth muscle. It is involved in the regulation of digestive tract motility. 288
33513 341353 cd15929 7tmB1_GlucagonR-like glucagon receptor-like subfamily, member of the class B family of seven-transmembrane G protein-coupled receptors. This group represents the glucagon receptor family of G protein-coupled receptors, which includes glucagon receptor (GCGR), glucagon-like peptide-1 receptor (GLP1R), GLP2R, and closely related receptors. These receptors are activated by the members of the glucagon (GCG) peptide family including GCG, glucagon-like peptide 1 (GLP1), and GLP2, which are derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 279
33514 320596 cd15930 7tmB1_Secretin_R-like secretin receptor-like group of hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group represents G protein-coupled receptors for structurally similar peptide hormones that include secretin, growth-hormone-releasing hormone (GHRH), pituitary adenylate cyclase activating polypeptide (PACAP), and vasoactive intestinal peptide (VIP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Secretin, a polypeptide secreted by entero-endocrine S cells in the small intestine, is involved in maintaining body fluid balance. This polypeptide regulates the secretion of bile and bicarbonate into the duodenum from the pancreatic and biliary ducts, as well as regulates the duodenal pH by the control of gastric acid secretion. Studies with secretin receptor-null mice indicate that secretin plays a role in regulating renal water reabsorption. Secretin mediates its biological actions by elevating intracellular cAMP via G protein-coupled secretin receptors, which are expressed in the brain, pancreas, stomach, kidney, and liver. GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. All B1 subfamily GPCRs are able to increase intracellular cAMP levels by coupling to adenylate cyclase via a stimulatory Gs protein. However, depending on their cellular location, some members of subfamily B1 are also capable of coupling to additional G proteins such as G(i/o) and/or G(q) proteins, thereby leading to activation of phospholipase C and intracellular calcium influx. 268
33515 320597 cd15931 7tmB2_EMR_Adhesion_II EGF-like module receptors, group II adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. group II adhesion GPCRs, including the leukocyte cell-surface antigen CD97 and the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptor (EMR1-4), are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily B2 of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying numbers of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of CD97, alternative splicing results in three isoforms possessing either three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. On the other hand, EMR2 generates four isoforms possessing either two (EGF1,2), three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. For example, CD97, which is involved in angiogenesis and the migration and invasion of tumor cells, has been shown to promote cell aggregation in a GPS proteolysis-dependent manner. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal pancreatic, and thyroid carcinoma. 
EMR2 shares strong sequence homology with CD97, differing by only six amino acids. However, unlike CD97, EMR2 is not found in CD97-positive tumor cells and is not expressed on lymphocytes but instead on monocytes, macrophages and granulocytes. CD97 has three known ligands: CD55, a decay-accelerating factor that regulates the complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment. 262
33516 320598 cd15932 7tmB2_GPR116-like_Adhesion_VI orphan GPR116 and related proteins, group VI adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group VI adhesion GPCRs consist of orphan receptors GPR110, GPR111, GPR113, GPR115, GPR116, and closely related proteins. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR110, which possesses a SEA box in its N-terminal region, has been identified as an oncogene overexpressed in lung and prostate cancer. GPR113 contains a hormone binding domain and one EGF (epidermal growth factor) domain. GPR112 has an extremely long N-terminus (about 2,400 amino acids) containing a number of Ser/Thr-rich glycosylation sites and a pentraxin (PTX) domain. GPR116 has two C2-set immunoglobulin-like repeats, which are found in the members of the immunoglobulin superfamily of cell surface proteins, and a SEA (sea urchin sperm protein, enterokinase, and agrin)-box, which is present in the extracellular domain of the transmembrane mucin (MUC) family and known to enhance O-glycosylation. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 268
33517 320599 cd15933 7tmB2_GPR133-like_Adhesion_V orphan GPR133 and related proteins, group V adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. Group V adhesion GPCRs include orphan receptors GPR133, GPR144, and closely related proteins. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the G(s) protein, leading to activation of the adenylate cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. 252
33518 320600 cd15934 7tmC_mGluRs_group2_3 metabotropic glutamate receptors in group 2 and 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. The mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity. 252
33519 320601 cd15935 7tmA_OR4Q3-like olfactory receptor 4Q3 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 4Q3 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 268
33520 320602 cd15936 7tmA_OR4D-like olfactory receptor 4D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 267
33521 320603 cd15937 7tmA_OR4N-like olfactory receptor 4N, 4M, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4N, 4M, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 267
33522 320604 cd15938 7tmA_OR4Q2-like olfactory receptor 4Q2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 4Q2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 265
33523 320605 cd15939 7tmA_OR4A-like olfactory receptor 4A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4A, 4C, 4P, 4S, 4X and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 267
33524 320606 cd15940 7tmA_OR4E-like olfactory receptor 4E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4E and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 267
33525 320607 cd15941 7tmA_OR10S1-like olfactory receptor subfamily 10S1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10S1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33526 320608 cd15942 7tmA_OR10G6-like olfactory receptor subfamily 10G6 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10G6 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33527 320609 cd15943 7tmA_OR5AP2-like olfactory receptor subfamily 5AP2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AP2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 295
33528 320610 cd15944 7tmA_OR5AR1-like olfactory receptor subfamily 5AR1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AR1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 294
33529 320611 cd15945 7tmA_OR5C1-like olfactory receptor subfamily 5C1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5C1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 292
33530 320612 cd15946 7tmA_OR1330-like olfactory receptor 1330 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes olfactory receptors 1330 from mouse, Olr859 from rat, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33531 320613 cd15947 7tmA_OR2B-like olfactory receptor subfamily 2B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 2 (subfamilies 2B, 2C, 2G, 2H, 2I, 2J, 2W, 2Y) and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 270
33532 320614 cd15948 7tmA_OR52K-like olfactory receptor subfamily 52K and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52K and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 277
33533 320615 cd15949 7tmA_OR52M-like olfactory receptor subfamily 52M and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52M and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 292
33534 320616 cd15950 7tmA_OR52I-like olfactory receptor subfamily 52I and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52I and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33535 320617 cd15951 7tmA_OR52R_52L-like olfactory receptor subfamily 52R, 52L, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamilies 52R, 52L and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33536 320618 cd15952 7tmA_OR52E-like olfactory receptor subfamily 52E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52E and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 274
33537 341354 cd15953 7tmA_OR52P-like olfactory receptor subfamily 52P and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52P and related proteins in other mammals, sauropsids and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33538 320620 cd15954 7tmA_OR52N-like olfactory receptor subfamily 52N and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52N and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 276
33539 320621 cd15955 7tmA_OR52A-like olfactory receptor subfamily 52A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52A and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 276
33540 320622 cd15956 7tmA_OR52W-like olfactory receptor subfamily 52W and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52W and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily. 275
33541 341355 cd15957 7tmA_Beta2_AR beta-2 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. Beta-2 AR is activated by adrenaline and plays important roles in cardiac function and pulmonary physiology. While beta-1 AR and beta-2 AR are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway, beta-2 AR can couple to both G(s) and G(i) proteins in the heart. Moreover, beta-2 AR activation leads to smooth muscle relaxation and bronchodilation in the lung. The beta adrenergic receptors are a subfamily of the class A rhodopsin-like G protein-coupled receptors. 301
33542 320624 cd15958 7tmA_Beta1_AR beta-1 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta-1 adrenergic receptor (beta-1 adrenoceptor), also known as beta-1 AR, is activated by adrenaline (epinephrine) and plays important roles in regulating cardiac function and heart rate. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure. 298
33543 320625 cd15959 7tmA_Beta3_AR beta-3 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta-3 adrenergic receptor (beta-3 adrenoceptor), also known as beta-3 AR, is activated by adrenaline and plays important roles in regulating cardiac function and heart rate. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure. 302
33544 320626 cd15960 7tmA_GPR185-like G protein-coupled receptor 185 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR185, also called GPRx, is a member of the constitutively active GPR3/6/12 subfamily of G protein-coupled receptors. It plays a role in the maintenance of meiotic arrest in Xenopus laevis oocytes through G(s) protein, which leads to increased cAMP levels. In Xenopus laevis, GPR185 is primarily expressed in brain, ovary, and testis; however, its ortholog has not been identified in other vertebrate genomes. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. 268
33545 320627 cd15961 7tmA_GPR12 G protein-coupled receptor 12, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3. 268
33546 320628 cd15962 7tmA_GPR6 G protein-coupled receptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3. 268
33547 320629 cd15963 7tmA_GPR3 G protein-coupled receptor 3, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3. 268
33548 320630 cd15964 7tmA_TSH-R thyroid-stimulating hormone receptor (or thyrotropin receptor), member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. TSH-R plays an important role in thyroid physiology, and its activation stimulates the production of thyroxine (T4) and triiodothyronine (T3). Defects in TSH-R are a cause of several types of hyperthyroidism. The receptor is predominantly found on the surface of the thyroid epithelial cells and couples to the G(s)-protein and activates adenylate cyclase, thereby promoting cAMP production. TSH and cAMP stimulate thyroid cell proliferation, differentiation, and function. 275
33549 320631 cd15965 7tmA_RXFP1_LGR7 relaxin receptor 1 (or LGR7), member of the class A family of seven-transmembrane G protein-coupled receptors. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural but low sequence similarity, and exert their physiological functions by activating a group of four G protein-coupled receptors, RXFP1-4. Relaxin is the endogenous ligand for RXFP1, which has a large extracellular N-terminal domain containing 10 leucine-rich repeats and a unique low-density lipoprotein type A (LDLa) module which is necessary for receptor activation. Upon receptor binding, relaxin activates a variety of signaling pathways to produce second messengers such as cAMP and nitric oxide. RXFP1 is expressed in various tissues including uterus, ovary, placenta, cerebral cortex, heart, lung and kidney, among others. 287
33550 320632 cd15966 7tmA_RXFP2_LGR8 relaxin receptor 2 (or LGR8), member of the class A family of seven-transmembrane G protein-coupled receptors. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural similarity, but low sequence similarity, and exert their physiological functions by activating a group of four G protein-coupled receptors, RXFP1-4. INSL3 is the endogenous ligand for RXFP2, which couples to the G(s) protein to increase intracellular cAMP levels, but also to the GoB protein to decrease cAMP formation. RXFP2 (or LGR8) is expressed in various tissues including the brain, kidney, muscle, testis, thyroid, uterus, and peripheral blood cells, among others. 287
33551 320633 cd15967 7tmA_P2Y1-like P2Y purinoceptor 1-like. P2Y1-like is an uncharacterized group that is phylogenetically related to a family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 281
33552 320634 cd15968 7tmA_P2Y6_P2Y3-like P2Y purinoceptors 6 and 3, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes P2Y receptor 6 (P2Y6), P2Y3, and P2Y3-like proteins. These receptors belong to the G(i) class of a family of purinergic G-protein coupled receptors. In the CNS, P2Y6 plays a role in microglia activation and phagocytosis, and is involved in the secretion of interleukin from monocytes and macrophages in the immune system. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 285
33553 320635 cd15969 7tmA_GPR87 G protein-coupled receptor 87, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR87 acts as one of multiple receptors for lysophosphatidic acid (LPA). This orphan receptor has been shown to be over-expressed in several malignant tumors including lung squamous cell carcinoma and regulated by p53. GPR87 is phylogenetically closely related to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. 283
33554 320636 cd15970 7tmA_SSTR1 somatostatin receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR1 is coupled to a Na/H exchanger, voltage-dependent calcium channels, and AMPA/kainate glutamate channels. SSTR1 is expressed in the normal human pituitary and in nearly half of all pituitary adenoma subtypes. 276
33555 320637 cd15971 7tmA_SSTR2 somatostatin receptor type 2, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs), which display strong sequence similarity with opioid receptors, bind somatostatin, a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, endocrine secretion, cell proliferation, and smooth muscle contractility. SSTRs are composed of five distinct subtypes (SSTR1-5) which are encoded by separate genes on different chromosomes. SSTR2 plays critical roles in growth hormone secretion, glucagon secretion, and immune responses. SSTR2 is expressed in the normal human pituitary and in nearly all pituitary growth hormone adenomas. 279
33556 320638 cd15972 7tmA_SSTR3 somatostatin receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR3 is coupled to inward rectifying potassium channels. SSTR3 plays critical roles in growth hormone secretion, endothelial cell cycle arrest and apoptosis. Furthermore, SSTR3 is expressed in the normal human pituitary and in nearly half of pituitary growth hormone adenomas. 279
33557 320639 cd15973 7tmA_SSTR4 somatostatin receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR4 plays a critical role in mediating inflammation. Unlike other SSTRs, SSTR4 subtype is not detected in all pituitary adenomas while it is expressed in the normal human pituitary. 274
33558 320640 cd15974 7tmA_SSTR5 somatostatin receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR5 is coupled to inward rectifying K channels and phospholipase C, and plays critical roles in growth hormone and insulin secretion. SSTR5 acts as a negative regulator of PDX-1 (pancreatic and duodenal homeobox-1) expression, which is a conserved homeodomain-containing beta cell-specific transcription factor essentially involved in pancreatic development, among many other functions. 277
33559 320641 cd15975 7tmA_ET-AR endothelin A (or endothelin-1) receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain. 300
33560 320642 cd15976 7tmA_ET-BR endothelin B receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain. 296
33561 320643 cd15977 7tmA_ET-CR endothelin C receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain. The ET-C receptor is specific for endothelin-3 on frog dermal melanophores; its activation causes dispersion of pigment granules. 296
33562 320644 cd15978 7tmA_CCK-AR cholecystokinin receptor type A, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors. 278
33563 320645 cd15979 7tmA_CCK-BR cholecystokinin receptor type B, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors. 275
33564 320646 cd15980 7tmA_NPFFR2 neuropeptide FF receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R. 299
33565 320647 cd15981 7tmA_NPFFR1 neuropeptide FF receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R. 299
33566 320648 cd15982 7tmB1_PTH2R parathyroid hormone 2 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone 2 receptor (PTH2R), one of the three subtypes of PTH receptor family, is found in mammals and fish, but not in chicken or frog. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39) but not by PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs. These results suggest that TIP-39 is a natural ligand for PTH2R. Conversely, PTH1R is activated by PTH and PTHrP, but not by TIP-39. The PTH family receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. 289
33567 320649 cd15983 7tmB1_PTH3R parathyroid hormone 3 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone 3 receptor (PTH3R), one of the three subtypes of PTH receptor family, is found in chicken and fish, but it is absent in mammals. On the other hand, the PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. PTH1R is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39), but not by PTHrP. PTH also strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs, suggesting that TIP-39 is a natural ligand for PTH2R. Conversely, PTH3R binds and responds to both PTH and PTHrP, but not the TIP-39. The PTH family receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. 285
33568 320650 cd15984 7tmB1_PTH1R parathyroid hormone 1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone (PTH) receptor family has three subtypes: PTH1R, PTH2R and PTH3R. PTH1R is expressed in bone and kidney and is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to G(s)-protein that in turn activates adenylate cyclase thereby producing cAMP, but it can also couple to several G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple intracellular signaling pathways. PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. PTH3R is found in chicken and fish, but it is absent in mammals. The PTH receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. 290
33569 320651 cd15985 7tmB1_GlucagonR-like_1 uncharacterized group of glucagon receptor-like proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This group consists of uncharacterized proteins with similarity to members of the glucagon receptor family of G protein-coupled receptors, which include glucagon receptor (GCGR), and glucagon-like peptide-1 receptor (GLP1R), and GLP2R. The glucagon receptors are activated by the members of the glucagon (GCG) peptide family including GCG, glucagon-like peptide 1 (GLP1), and GLP2, which are derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. 
However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways. 280
33570 320652 cd15986 7tmB1_VIP-R2 vasoactive intestinal polypeptide (VIP) receptor 2, member of the class B family of seven-transmembrane G protein-coupled receptors. Vasoactive intestinal peptide (VIP) receptor 2 is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. VIP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. However, depending on its cellular location, VIP-R1 is also capable of coupling to additional G proteins such as G(q) protein, thus leading to the activation of phospholipase C and intracellular calcium influx. 269
33571 320653 cd15987 7tmB1_PACAP-R1 pituitary adenylate cyclase-activating polypeptide type 1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Pituitary adenylate cyclase-activating polypeptide type 1 receptor (PACAP-R1) is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and vasoactive intestinal peptide (VIP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. PACAP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. 268
33572 320654 cd15988 7tmB2_BAI2 brain-specific angiogenesis inhibitor 2, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. Three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. 
Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions. 291
33573 320655 cd15989 7tmB2_BAI3 brain-specific angiogenesis inhibitor 3, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. Three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. 
Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions. 293
33574 320656 cd15990 7tmB2_BAI1 brain-specific angiogenesis inhibitor 1, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. Three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. 
Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions. 267
33575 320657 cd15991 7tmB2_CELSR1 Cadherin EGF LAG seven-pass G-type receptor 1, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. 
Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 254
33576 320658 cd15992 7tmB2_CELSR2 Cadherin EGF LAG seven-pass G-type receptor 2, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. 
Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 255
33577 320659 cd15993 7tmB2_CELSR3 Cadherin EGF LAG seven-pass G-type receptor 3, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. Celsr3 is expressed in both the developing and adult mouse brain. It has been functionally implicated in proper neuronal migration and axon guidance in the CNS. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. 
The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 254
33578 320660 cd15994 7tmB2_GPR111_115 orphan adhesion receptors GPR111 and GPR115, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR111 and GPR115 are highly homologous orphan receptors that belong to group VI adhesion-GPCRs along with GPR110, GPR113, and GPR116. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. Both GPR111 and GPR115 are present only in land-living animals and are predominantly expressed in the developing skin. 267
33579 320661 cd15995 7tmB2_GPR56 orphan adhesion receptor GPR56, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR56 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include orphan GPCRs such as GPR64, GPR97, GPR112, GPR114, and GPR126. GPR56 is involved in the regulation of oligodendrocyte development and myelination in the central nervous system via coupling to G(12/13) proteins, which leads to the activation of RhoA GTPase. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 269
33580 320662 cd15996 7tmB2_GPR126 orphan adhesion receptor GPR126, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR126 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include orphan GPCRs such as GPR56, GPR64, GPR97, GPR112, and GPR114. GPR126 is required in Schwann cells for proper differentiation and myelination via G-protein activation. GPR126 is believed to couple to G(s)-protein, which leads to activation of adenylate cyclase for cAMP production. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 271
33581 320663 cd15997 7tmB2_GPR112 Probable G protein-coupled receptor 112, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR112 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include orphan GPCRs such as GPR56, GPR64, GPR97, GPR114, and GPR126. GPR112 is specifically expressed in normal enterochromaffin cells and gastrointestinal neuroendocrine carcinoma cells, but its biological function is unknown. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 269
33582 320664 cd15998 7tmB2_GPR124 G protein-coupled receptor 124, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR124 is an orphan receptor that has been classified as belonging to group III of adhesion GPCRs, which also includes orphan receptors GPR123 and GPR125. GPR124, also known as tumor endothelial marker 5 (TEM5), is highly expressed in tumor vessels and in the vasculature of the developing embryo. GPR124 is required for proper angiogenic sprouting into neural tissue, CNS-specific vascularization, and formation of the blood-brain barrier. GPR124 interacts with the PDZ domain of DLG1 (discs large homolog 1) through its PDZ-binding motif. Recently, studies of double-knockout mice showed that GPR124 functions as a co-activator of Wnt7a/Wnt7b-dependent beta-catenin signaling in brain endothelium. Moreover, WNT7-stimulated beta-catenin signaling is regulated by GPR124's intracellular PDZ binding motif and leucine-rich repeats (LRR) in its N-terminal extracellular domain. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 268
33583 320665 cd15999 7tmB2_GPR125 G protein-coupled receptor 125, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR125 is an orphan receptor that has been classified as that belongs to the group III of adhesion GPCRs, which also includes orphan receptors GPR123 and GPR124. GPR125 directly interacts with dishevelled (Dvl) via its intracellular C-terminus, and together, GPR125 and Dvl recruit a subset of planar cell polarity (PCP) components into membrane subdomains, a prerequisite for activation of Wnt/PCP signaling. Thus, GPR125 influences the noncanonical WNT/PCP pathway, which does not involve beta-catenin, through interacting with and modulating the distribution of Dvl. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 312
33584 320666 cd16000 7tmB2_GPR123 G protein-coupled receptor 123, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR123 is an orphan receptor that has been classified as that belongs to the group III of adhesion GPCRs, and also includes orphan receptors GPR124 and GPR125. GPR123 is predominantly expressed in the CNS including thalamus, brain stem and regions containing large pyramidal cells, yet its biological function remains to be determined. Adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR- autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 275
33585 320667 cd16001 7tmA_P2Y3-like P2Y purinoceptor 3-like proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y3-like proteins are an uncharacterized group that belongs to the G(i) class of a family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). 284
33586 320668 cd16002 7tmA_NK1R neurokinin 1 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neurokinin 1 receptor (NK1R), also known as tachykinin receptor 1 (TACR1) or substance P receptor (SPR), is a G-protein coupled receptor found in the mammalian central nervous and peripheral nervous systems. The tachykinins (TKs) act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. SP is an extremely potent vasodilator acting through an endothelium-dependent mechanism and is released from the autonomic sensory nerves. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. 284
33587 320669 cd16003 7tmA_NKR_NK3R neuromedin-K receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neuromedin-K receptor (NKR), also known as tachykinin receptor 3 (TACR3) or neurokinin B receptor or NK3R, is a G-protein coupled receptor that specifically binds to neurokinin B. The tachykinins (TKs) act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty. 282
33588 320670 cd16004 7tmA_SKR_NK2R substance-K receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The substance-K receptor (SKR), also known as tachykinin receptor 2 (TACR2) or neurokinin A receptor or NK2R, is a G-protein coupled receptor that specifically binds to neurokinin A. The tachykinins (TKs) are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. 285
33589 320671 cd16005 7tmB2_Latrophilin-3 Latrophilin-3, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 258
33590 320672 cd16006 7tmB2_Latrophilin-2 Latrophilin-2, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 258
33591 320673 cd16007 7tmB2_Latrophilin-1 Latrophilin-1, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. 258
33592 293733 cd16009 PPM Bacterial phosphopentomutase. Bacterial phosphopentomutases (PPMs) are alkaline phosphatase superfamily members that interconvert alpha-D-ribose 5-phosphate (ribose 5-phosphate) and alpha-D-ribose 1-phosphate (ribose 1-phosphate). This reaction bridges glucose metabolism and RNA biosynthesis. PPM is a Mn(2+)-dependent enzyme and protein phosphorylation activates the enzyme. 382
33593 293734 cd16010 iPGM 2 3 bisphosphoglycerate independent phosphoglycerate mutase iPGM. The 2,3-diphosphoglycerate-independent phosphoglycerate mutase (iPGM) catalyzes the interconversion of 3-phosphoglycerate (3PGA) and 2-phosphoglycerate (2PGA). iPGMs are the predominant PGMs in plants and some bacteria, including endospore-forming Gram-positive bacteria and their close relatives. The two-step catalysis consists of a phosphatase reaction removing the phosphate from 2- or 3-phosphoglycerate, generating an enzyme-bound phosphoserine intermediate, followed by a phosphotransferase reaction in which the phosphate is transferred from the enzyme back to the glycerate moiety. The iPGM exists as a dimer, each monomer binding 2 magnesium atoms, which are essential for enzymatic activity. 503
33594 293735 cd16011 iPGM_like uncharacterized subfamily of alkaline phosphatase, homologous to 2 3 bisphosphoglycerate independent phosphoglycerate mutase (iPGM) and bacterial phosphopentomutases. The proteins in this subfamily of alkaline phosphatase are not characterized. Their sequences show similarity to 2 3 bisphosphoglycerate independent phosphoglycerate mutase (iPGM) which catalyzes the interconversion of 3-phosphoglycerate to 2-phosphoglycerate, and to bacterial phosphopentomutases (PPMs) which interconvert alpha-D-ribose 5-phosphate (ribose 5-phosphate) and alpha-D-ribose 1-phosphate (ribose 1-phosphate). 368
33595 293736 cd16012 ALP Alkaline Phosphatase. Alkaline phosphatases are non-specific membrane-bound phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Mammalian alkaline phosphatase is divided into four isozymes depending upon the site of tissue expression. They are Intestinal ALP, Placental ALP, Germ cell ALP and tissue nonspecific alkaline phosphatase or liver/bone/kidney (L/B/K) ALP. 283
33596 293737 cd16013 AcpA acid phosphatase A. Acid phosphatase A catalyzes the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at low pH. AcpA hydrolyzes a variety of substrates, including p-nitrophenylphosphate (pNPP), p-nitrophenylphosphorylcholine (pNPPC), peptides containing phosphotyrosine, inositol phosphates, AMP, ATP, fructose 1,6-bisphosphate, glucose and fructose 6-phosphates, NADP, and ribose 5-phosphate. AcpA is distinct from histidine ACPs and purple ACPs, as well as class A, B, and C bacterial nonspecific ACPs. 370
33597 293738 cd16014 PLC non-hemolytic phospholipase C. Nonhemolytic phospholipase C is produced by pathogenic bacteria. The toxic phospholipases C can interact with eukaryotic cell membranes and hydrolyze phosphatidylcholine and sphingomyelin, leading to cell lysis. 287
33598 293739 cd16015 LTA_synthase Lipoteichoic acid synthase like. Lipoteichoic acid (LTA) is an important cell wall polymer found in Gram-positive bacteria. It may contain long chains of ribitol or glycerol phosphate. LTA synthase catalyzes the reaction to extend the polymer by the repeated addition of glycerolphosphate (GroP) subunits to the end of the growing chain. 283
33599 293740 cd16016 AP-SPAP SPAP is a subclass of alkaline phosphatase (AP). Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Although SPAP is a subclass of alkaline phosphatase, SPAP has many differences from other APs: 1) the catalytic residue is a threonine instead of serine, 2) there is no binding pocket for the third metal ion, and 3) the arginine residue forming bidentate hydrogen bonding is deleted in SPAP. A lysine and an asparagine residue, recruited together for the first time into the active site, bind the substrate phosphoryl group in a manner not observed before in any other AP. 457
33600 293741 cd16017 LptA Lipooligosaccharide Phosphoethanolamine Transferase A (LptA) or Lipid A Phosphoethanolamine Transferase. Lipooligosaccharide Phosphoethanolamine Transferase A (LptA) or Lipid A Phosphoethanolamine Transferase catalyzes the modification of the lipid A headgroups by phosphoethanolamine (PEA) or 4-amino-arabinose residues. Lipopolysaccharides, also called endotoxins, protect bacterial pathogens from antimicrobial peptides and have roles in virulence. The PEA modified lipid A increases resistance to the cationic cyclic polypeptide antibiotic, polymyxin. Lipid A PEA transferases usually consist of a transmembrane domain anchoring the enzyme to the periplasmic face of the cytoplasmic membrane. 288
33601 293742 cd16018 Enpp Ectonucleotide pyrophosphatase/phosphodiesterase, also called autotaxin. Ecto-nucleotide pyrophosphatases/phosphodiesterases (ENPPs) hydrolyze 5'-phosphodiester bonds in nucleotides and their derivatives, resulting in the release of 5'-nucleotide monophosphates. ENPPs have multiple physiological roles, including nucleotide recycling, modulation of purinergic receptor signaling, regulation of extracellular pyrophosphate levels, stimulation of cell motility, and possible roles in regulation of insulin receptor (IR) signaling and activity of ecto-kinases. The eukaryotic ENPP family contains at least five members that have different tissue distribution and physiological roles. 267
33602 293743 cd16019 GPI_EPT GPI ethanolamine phosphate transferase. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation. 292
33603 293744 cd16020 GPI_EPT_1 GPI ethanolamine phosphate transferase 1; PIG-N. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation. 294
33604 293745 cd16021 ALP_like uncharacterized Alkaline phosphatase subfamily. Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. 278
33605 293746 cd16022 sulfatase_like sulfatase. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 236
33606 293747 cd16023 GPI_EPT_3 GPI ethanolamine phosphate transferase 3, PIG-O. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation. 289
33607 293748 cd16024 GPI_EPT_2 GPI ethanolamine phosphate transferase 2; PIG-G. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation. 274
33608 293749 cd16025 PAS_like Bacterial Arylsulfatase of Pseudomonas aeruginosa and related proteins. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 402
33609 293750 cd16026 GALNS_like galactosamine-6-sulfatase; also known as N-acetylgalactosamine-6-sulfatase (GALNS). Lysosomal galactosamine-6-sulfatase removes sulfate groups from a terminal N-acetylgalactosamine-6-sulfate (or galactose-6-sulfate) in mucopolysaccharides such as keratan sulfate and chondroitin-6-sulfate. Defects in GALNS lead to accumulation of substrates, resulting in the development of the lysosomal storage disease mucopolysaccharidosis IV A. 399
33610 293751 cd16027 SGSH N-sulfoglucosamine sulfohydrolase (SGSH; sulfamidase). N-sulfoglucosamine sulfohydrolase (SGSH) belongs to the sulfatase family and catalyses the cleavage of N-linked sulfate groups from the GAGs heparan sulfate and heparin. The active site is characterized by the amino-acid sequence motif C(X)PSR that is highly conserved among most sulfatases. The cysteine residue is post-translationally converted to a formylglycine (FGly) residue, which is crucial for the catalytic process. Loss of function of SGSH results in a disease called mucopolysaccharidosis type IIIA (Sanfilippo A syndrome), a fatal childhood-onset neurodegenerative disease with mild facial, visceral and skeletal abnormalities. 373
33611 293752 cd16028 PMH Phosphonate monoester hydrolase/phosphodiesterase. Phosphonate monoester hydrolase/phosphodiesterase hydrolyses phosphonate monoesters or phosphate diesters using a posttranslationally formed formylglycine as the catalytic nucleophile. PMH is a member of the alkaline phosphatase superfamily. The structure of PMH is more homologous to arylsulfatase than alkaline phosphatase. Sulfatases also use formylglycine as catalytic nucleophile. 449
33612 293753 cd16029 4-S N-acetylgalactosamine 4-sulfatase, also called arylsulfatase B. Sulfatases catalyze the hydrolysis of sulfuric acid esters from a wide variety of substrates. N-acetylgalactosamine 4-sulfatase catalyzes the removal of the sulfate ester group from position 4 of an N-acetylgalactosamine sugar at the non-reducing terminus of the polysaccharide in the degradative pathways of the glycosaminoglycans dermatan sulfate and chondroitin-4-sulfate. N-acetylgalactosamine 4-sulfatase is a lysosomal enzyme. 393
33613 293754 cd16030 iduronate-2-sulfatase iduronate-2-sulfatase. Iduronate 2-sulfatase is a sulfatase enzyme that catalyzes the hydrolysis of sulfate ester bonds from a wide variety of substrates, including steroids, carbohydrates and proteins. Iduronate 2-sulfatase is required for the lysosomal degradation of heparan sulfate and dermatan sulfate. Mutations in the iduronate 2-sulfatase gene that result in enzymatic deficiency lead to the sex-linked mucopolysaccharidosis type II, also known as Hunter syndrome. 435
33614 293755 cd16031 G6S_like uncharacterized sulfatase homologous to glucosamine (N-acetyl)-6-sulfatase (G6S, GNS). N-acetylglucosamine-6-sulfatase, also known as glucosamine (N-acetyl)-6-sulfatase, hydrolyzes the 6-sulfate groups of the N-acetyl-D-glucosamine 6-sulfate units of heparan sulfate and keratan sulfate. Deficiency of N-acetylglucosamine-6-sulfatase results in Sanfilippo syndrome type IIID or Mucopolysaccharidosis III (MPS-III), a rare autosomal recessive lysosomal storage disease. 429
33615 293756 cd16032 choline-sulfatase choline-sulfatase. Choline-sulphatase is involved in the synthesis of glycine betaine from choline. The symbiotic soil bacterium Rhizobium meliloti can synthesize glycine betaine from choline-O-sulphate and choline to protect itself from osmotic stress. This biosynthetic pathway is encoded by the betICBA locus, which comprises a regulatory gene, betI, and three structural genes, betC (choline sulfatase), betB (betaine aldehyde dehydrogenase), and betA (choline dehydrogenase). betICBA genes constitute a single operon. 327
33616 293757 cd16033 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 411
33617 293758 cd16034 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 399
33618 293759 cd16035 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 311
33619 293760 cd16037 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 321
33620 277186 cd16039 PHD_SPP1 PHD finger found in Set1 complex component SPP1. Set1C component SPP1, also called COMPASS component Spp1, or Complex proteins associated with set1 protein Spp1, or Suppressor of PRP protein 1, is a component of the COMPASS complex that links histone methylation to initiation of meiotic recombination. It induces double-strand break (DSB) formation by tethering to recombinationally cold regions. SPP1 interacts with H3K4me3 and Mer2, a protein required for DSB formation, to promote recruitment of potential meiotic DSB sites to the chromosomal axis. SPP1 contains a PHD finger, a zinc binding motif. 46
33621 294002 cd16040 SPRY_PRY_SNTX Stonustoxin subunit alpha or SNTX subunit alpha. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of Stonustoxin alpha proteins. Stonustoxin (SNTX) is a multifunctional lethal protein isolated from venom elaborated by the stonefish. It comprises two subunits, termed alpha and beta. SNTX elicits an array of biological responses, particularly a potent hypotension and respiratory difficulties. 180
33622 293880 cd16074 OCRE OCRE domain. The OCRE (OCtamer REpeat) domain contains 5 repeats of an 8-residue motif, which were shown to form beta-strands. Based on the architectures of proteins containing OCRE domains, a role in RNA metabolism and/or signalling has been proposed. 54
33623 293922 cd16075 ORC6_CTD C-terminal domain of the eukaryotic origin recognition complex subunit ORC6. In eukaryotes, a complex consisting of six subunits promotes the onset of DNA replication. The 6th subunit, ORC6, does not belong to the wider AAA+ family of nucleotide hydrolases, but contains a tandemly repeated domain resembling the transcription factor TFIIB, as well as this C-terminal domain which harbors a helical segment that is responsible for interactions with the complex by binding to ORC3. Mutations in this C-terminal helix interfere with the formation of the ORC and have been linked to Meier-Gorlin syndrome, a dwarfism disorder. 53
33624 293923 cd16076 TSPcc Coiled coil region of thrombospondin. This domain family contains coiled coil region of subgroup B of thrombospondins, comprising TSP-3, TSP-4, and TSP-5, that assemble as pentamers. This region is located adjacent to the N-terminal domain (NTD) of thrombospondin (TSP), that mediates co-translational oligomerization via formation of a left-handed super-helix which binds hydrophilic signaling molecules such as vitamin D3 and vitamin A. Pentameric TSPs are stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end. TSP-5 is also known as cartilage oligomeric matrix protein (COMP). TSPs comprise a conserved family of extracellular, oligomeric, multidomain, calcium-binding glycoproteins. In mammals, they have several complex tissue-specific roles, including activities in wound healing and angiogenesis, connective tissue organization, vessel wall biology, and synaptogenesis, all mechanistically derived from interactions with cell surfaces, cytokines, growth factors, or components of the extracellular matrix (ECM) that together regulate many aspects of cell phenotype. In invertebrates, TSPs may have ancient functions such as bridging activities in cell-cell and cell-ECM interactions. Most protostomes and inferred basal metazoa encode a single TSP with the general domain organization of subgroup B TSPs and with a pentamerizing coiled coil. 40
33625 293924 cd16077 TSP-5cc Coiled coil region of thrombospondin-5 (TSP-5). This family contains the N-terminal coiled coil region of TSP-5, also known as cartilage oligomeric matrix protein (COMP). It forms a pentameric left-handed coiled coil (COMPcc) with a channel that is a unique carrier for lipophilic compounds. It is known to bind hydrophobic signaling molecules such as vitamin D3 and vitamin A, making it a possible targeted drug delivery system. TSP-5/COMP is expressed in all types of cartilage as well as in the vitreous of the eye, tendons, vascular smooth muscle cells, and heart. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-5 is essential for modulating the phenotypic transition of vascular smooth muscle cells and vascular remodeling. Mutations in TSP-5 result in two different inherited chondrodysplasias and osteoarthritic phenotypes: pseudoachondroplasia and multiple epiphyseal dysplasia. Deficiency of TSP-5 causes dilated cardiomyopathy (DCM), a common cause of congestive heart failure. Early increase in serum TSP-5 is associated with joint damage progression in patients with rheumatoid arthritis, thus representing a novel indicator of an activated destructive process in the joint. 43
33626 293925 cd16079 TSP-3cc Coiled coil region of thrombospondin-3 (TSP-3). This family contains the N-terminal coiled coil region of TSP-3, which is highly expressed in osteosarcomas and associated with metastasis. TSP-3, along with TSP-5 and type IX collagen, is also expressed in the growth plate and all operate in concert and participate in growth plate organization that directly modulates linear growth. It forms a pentameric left-handed coiled coil with a channel that is a unique carrier for lipophilic compounds. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-3 knockout mice have been shown to display accelerated endochondral ossification and increased trabecular bone in the femoral head. 43
33627 293926 cd16080 TSP-4cc Coiled coil region of thrombospondin-4 (TSP-4). This family contains the N-terminal coiled coil region of TSP-4, which is abundantly expressed in tendon and muscle, as well as in neural and osteogenic tissues, and has also been detected in brain capillaries. It forms a pentameric left-handed coiled coil with a channel that is a unique carrier for lipophilic compounds. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-4 regulates the composition of the deposition of extracellular matrix (ECM) in tendon and skeletal muscle. The absence of TSP-4 alters the organization, composition and physiological functions of these tissues. TSP-4 deficiency causes incorrect modification of heparan-sulfate (HS), resulting in decreased activity of lipoprotein lipase (LpL) and loss of beta-glycan; HS is involved in a wide variety of cellular functions, LpL is an endothelial enzyme responsible for the uptake and hydrolysis of lipoproteins, and beta-glycan has inhibiting effect on TGF-beta signaling in skeletal muscle. The human gene THBS4 that encodes for TSP-4 contains a single nucleotide polymorphism (SNP), which is expressed at high frequency in Caucasians and associated with a significantly increased risk of premature myocardial infarction. TSP-4 also binds stromal interaction molecule 1 (STIM1), a transmembrane protein that functions in the endoplasmic reticulum (ER), and regulates calcium channel activity. Studies show that TSP-4 may act as an organizer of adhesive and axon outgrowth-promoting molecules in the ECM to optimize retinal ganglion cell responses. TSP-4 is also involved in the post-translational modification of collagen and may assist in collagen fibril assembly. 44
33628 293927 cd16081 TSPcc_insect Coiled coil region of thrombospondin in protostomes. This family contains the N-terminal coiled coil region of thrombospondin (TSP) in some protostomes, which suggest ancient functions that include bridging activities in cell-cell and cell-ECM interactions. It appears that most protostomes and inferred basal metazoa encode a single TSP with the general domain organization of subgroup B TSPs and with a pentamerizing coiled coil. This region has heparin-binding activity and is a component of extracellular matrix (ECM), showing that the pentameric TSPs are of earlier origin and that the trimeric TSP subfamily A form is associated with higher chordates. The left-handed coiled coil pentamer forms a channel that is a unique carrier for lipophilic compounds, and is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. Several heparan sulphate (HS) proteoglycans are known in D. melanogaster, including both transmembrane and matrix forms, which could contribute to its retention in pericellular matrix. 42
33629 409504 cd16082 IgC_CRIg Immunoglobulin (Ig) constant domain of the complement receptor of the immunoglobulin superfamily (CRIg). The members here are composed of the Immunoglobulin (Ig) constant domain of the complement receptor of the immunoglobulin superfamily (CRIg). The N-terminal domain of CRIg (also referred to as Z39Ig and V-set and Ig domain-containing 4 (VSIG4)) belongs to the IgV family of immunoglobulin-like domains while the C-terminal domain of CRIg belongs to the IgC family of immunoglobulin-like domains. CRIg plays a role in the complement system as an inhibitor of the alternative pathway convertases, and is a negative regulator of T cell activation. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. 86
33630 409505 cd16083 IgC1_CD80 Immunoglobulin constant (IgC)-like domain of antigen receptor Cluster of Differentiation (CD) 80; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin constant (IgC)-like domain of the antigen receptor Cluster of Differentiation (CD) 80. CD80 (also known as glycoprotein B7-1) and CD86 (also known as glycoprotein B7-2) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. CD80 contains two Ig-like domains, an amino-terminal immunoglobulin variable (IgV)-like domain characteristic of adhesion molecules, and a membrane proximal immunoglobulin constant (IgC)-like domain similar to the constant domains of antigen receptors. Members of the Ig family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions. 91
33631 409506 cd16084 IgC1_CH2_IgD CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin constant domain (IgC) in delta heavy chains. The IgC family includes immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions. 97
33632 409507 cd16085 IgC1_SIRP_domain_3 Signal-regulatory protein (SIRP) immunoglobulin-like domain 3; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in Signal-Regulatory Protein (SIRP), domain 3 (C1 repeat 2). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly found on NK cells, but are also on many myeloid cells. Their extracellular region contains three immunoglobulin superfamily domains: a single V-set and two C1-set IgSF domains. They contain either cytoplasmic tails with ITIMs or transmembrane regions with positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma. SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs; it also binds CD47, but with much lower affinity. 96
33633 319335 cd16086 IgV_CD80 Immunoglobulin variable domain (IgV) in Cluster of Differentiation (CD) 80. The members here are composed of the immunoglobulin variable region (IgV) in the Cluster of Differentiation (CD) 80. Glycoproteins B7-1 (also known as cluster of differentiation (CD) 80) and B7-2 (also known as CD86) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (also known as cluster of differentiation 152/CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. CD80 contains two Ig-like domains, an amino-terminal immunoglobulin variable (IgV)-like domain characteristic of adhesion molecules and a membrane proximal immunoglobulin constant (IgC)-like domain similar to the constant domains of antigen receptors. Members of the Ig family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and Major Histocompatibility Complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions. 105
33634 409508 cd16087 IgV_CD86 Immunoglobulin variable domain (IgV) in Cluster of Differentiation (CD) 86. The members here are composed of the immunoglobulin variable region (IgV) in the Cluster of Differentiation (CD) 86. Glycoproteins B7-1 (also known as cluster of differentiation (CD) 80) and B7-2 (also known as CD86) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (also known as CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. The CTLA-4 and B7-2 monomers are both two-layer beta-sandwiches that display the chain topology characteristic of the immunoglobulin variable (V-type) domains present in antigen receptors. The front and back sheets of B7-2 are composed of AGFCC'C" and BED strands, respectively. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. 108
33635 409509 cd16088 IgV_PD1 Immunoglobulin (Ig)-like domain of Programmed Cell Death 1 (PD1). The members here are composed of the immunoglobulin (Ig)-like domain of Programmed Cell Death 1 (PD1; also known as CD279/cluster of differentiation 279). PD1 is a cell surface receptor that is expressed on T cells and pro-B cells. The protein's structure includes an extracellular IgV domain followed by a transmembrane region and an intracellular tail. Activation of CD4+ T cells, CD8+ T cells, NKT cells, B cells, and monocytes induces PD-1 expression, immediately after which it binds two distinct ligands, PD-L1 (also known as B7-H1 or CD274/cluster of differentiation 274) and PD-L2 (also known as B7-DC). PD-1 plays an important role in down-regulating the immune system by preventing the activation of T-cells, reducing autoimmunity and promoting self-tolerance. The inhibitory effect of PD-1 is accomplished by promoting apoptosis in antigen specific T-cells in lymph nodes while simultaneously reducing apoptosis in regulatory T cells. A class of drugs that target PD-1, known as the PD-1 inhibitors, activate the immune system to attack tumors and treat cancer. Comparisons between mouse PD-1 (mPD-1) and human PD-1 (hPD-1) reveal that, unlike mPD-1, which has a conventional IgSF V-set domain, hPD-1 lacks a C" strand, and instead the C' and D strands are connected by a long and flexible loop. In addition, the BC loop is not stabilized by disulfide bonding to the F strand of the ligand binding beta sheet. These differences result in different binding affinities of human and mouse PD-1 for their ligands. 112
33636 409510 cd16089 IgV_CRIg Immunoglobulin variable (IgV)-like domain in complement receptor of the immunoglobulin superfamily (CRIg). The members here are composed of the immunoglobulin variable (IgV) region of the complement receptor of the immunoglobulin superfamily (CRIg). The N-terminal domain of CRIg (also known as Z39Ig and V-set and Ig domain-containing 4 (VSIG4)) belongs to the IgV family of immunoglobulin-like domains while the C-terminal domain of CRIg belongs to the IgC family of immunoglobulin-like domains. Like all members of this family, the CRIg domain contains two beta-sheets: one composed of strands A', G, F, C, C' and C", and the other of strands B, E and D. The complement system is an important part of the innate immune system and is required for removal of pathogens from the bloodstream. After exposure to pathogens, the third component of the complement system, C3, is cleaved to C3b which, after recruitment of factor B, initiates formation of the alternative pathway convertases. CRIg, a complement receptor expressed on macrophages, binds to C3b and iC3b mediating phagocytosis of the particles. It is also a potent inhibitor of the alternative pathway convertases and a negative regulator of T cell activation. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. 117
33637 409511 cd16090 IgV_CD47 Immunoglobulin variable region (IgV) in Cluster of Differentiation (CD) 47. The members here are composed of immunoglobulin variable (IgV) region in the Cluster of Differentiation (CD) 47 (also known as integrin associated protein/IAP). CD47 partners with membrane integrins and binds thrombospondin-1 (TSP-1) and signal-regulatory protein alpha (SIRP alpha). It is involved in apoptosis, proliferation, adhesion, migration, and immune and angiogenic responses. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. 112
33638 409512 cd16091 IgV_HHLA2 Immunoglobulin Variable (IgV) domain in HERV-H LTR-associating 2 (HHLA2). The members here are composed of the immunoglobulin variable (IgV) region in HERV-H LTR-associating 2 (HHLA2; also known as B7-H7/B7 homolog 7). HHLA2 is a member of the B7 family of immune regulatory proteins. Mature human HHLA2 consists of an extracellular domain (ECD) with three immunoglobulin-like domains, a transmembrane segment, and a cytoplasmic domain. HHLA2 is widely expressed in human cancers including non-small cell lung carcinoma (NSCLC), triple negative breast cancer (TNBC), and melanoma, but has limited expression on normal tissues. Interestingly, unlike other members of the B7 family, HHLA2 is not expressed in mice or rats. HHLA2 functions as a T cell coinhibitory molecule, as it inhibits the proliferation of activated CD4(+) and CD8(+) T cells and their cytokine production. Furthermore, HHLA2 is constitutively expressed on the surface of human monocytes and is induced on B cells after stimulation; however, it is not inducible on T cells. 107
33639 319341 cd16092 IgC1_CH1_IgD CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of delta chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 96
33640 409513 cd16093 IgC1_CH2_Mu CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin mu chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin constant domain (IgC) of mu heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. 99
33641 319343 cd16094 IgC1_CH3_IgD CH3 domain (third constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily domains. The members here are composed of the third immunoglobulin constant domain (IgC) of delta heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. C1-set Ig domains have one beta sheet that is formed by strands A, B, E, and D and the other strands by G, F, C, and C'. 100
33642 409514 cd16095 IgV_H_TCR_mu T-cell receptor Mu, Heavy chain, variable (V) domain. The members here are composed of the immunoglobulin (Ig) heavy chain (H), variable (V) domain of the T-cell receptor Mu. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which can associate with any of the heavy chains. This family includes alpha, gamma, delta, epsilon, and mu heavy chains. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 115
33643 409515 cd16096 IgV_CD79b_beta Immunoglobulin variable domain (IgV) Cluster of Differentiation (CD) 79B. The members here are composed of the immunoglobulin variable domain (IgV) of the Cluster of Differentiation (CD) 79B (also known as CD79b molecule, immunoglobulin-associated beta (Ig-beta), and B29). The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-beta protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 96
33644 409516 cd16097 IgV_SIRP Immunoglobulin (Ig)-like variable (V) domain of the Signal-Regulatory Protein (SIRP). The members here are composed of the immunoglobulin (Ig)-like domain of the Signal-Regulatory Protein (SIRP). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions, but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly found on NK cells, but are also on many myeloid cells. Their extracellular region contains three immunoglobulin superfamily domains: a single V-set and two C1-set IgSF domains. They contain either cytoplasmic tails with ITIMs or transmembrane regions with positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma. SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs; it also binds CD47, but with much lower affinity. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 111
33645 294015 cd16098 FliS flagellar export chaperone FliS. This family contains flagellar export chaperone FliS, a protein critical for flagellar assembly and bacterial colonization. FliS prevents premature polymerization of flagellins, the major protein of the filament, by regulating interactions between structural components of the bacterial flagellum in the cytosol. It binds specifically to FliC (flagellin) which is sequentially secreted in large numbers through the central channel of the flagellum and polymerized to form the tail filament. FliS protects FliC from degradation and aggregation by binding to the FliC C-terminal helical domain, which contributes to stabilization of flagellin subunit interactions during polymerization. FliS has been shown to interact specifically with FlgM, whose role is to inhibit FliA, a flagellum-specific RNA polymerase responsible for flagellin transcription; FliA competes with FliS for FlgM binding. 102
33646 381691 cd16099 TenA_PqqC-like TenA-like proteins including TenA_C and TenA_E proteins, as well as pyrroloquinoline quinone (PQQ) synthesis protein C. TenA proteins participate in thiamin metabolism and can be classified into two classes: TenA_C which has an active site Cys, and TenA_E which does not; TenA_E proteins often have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product amino-HMP (4-amino-5-amino-methyl-2-methylpyrimidine) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Arabidopsis thaliana TenA_E hydrolyzes amino-HMP to AMP, and the N-formyl derivative of amino-HMP to amino-HMP, but does not hydrolyze thiamin. Bacillus subtilis TenA_C can hydrolyze amino-HMP to AMP and can catalyze the hydrolysis of thiamin. Saccharomyces cerevisiae THI20 includes a C-terminal tetrameric TenA-like domain fused to an N-terminal ThiD domain, and participates in thiamin biosynthesis, degradation and salvage; the TenA-like domain catalyzes the production of HMP from thiamin degradation products (salvage). Bacillus halodurans TenA_C participates in a salvage pathway where the thiamine degradation product 2-methyl-4-formylamino-5-aminomethylpyrimidine (formylamino-HMP) is hydrolyzed first to amino-HMP by the YlmB protein, and the amino-HMP is then hydrolyzed by TenA to produce HMP. Helicobacter pylori TenA_C is also thought to catalyze a salvage reaction but the pyrimidine substrate has not yet been identified. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect; Pyrococcus furiosus TenA_E lacks appropriate surface charges for DNA interactions. 
This family also includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C (PQQC), an oxidase involved in the final step of PQQ biosynthesis, and CADD, a Chlamydia protein that interacts with death receptors. 196
33647 350627 cd16100 ARID ARID/BRIGHT DNA binding domain family. The AT-rich interaction domain (ARID) family of transcription factors, found in a broad array of organisms from fungi to mammals, is characterized by a highly conserved, helix-turn-helix DNA binding domain that binds to the major groove of DNA. The ARID domain, also called BRIGHT, was first identified in the mouse B-cell-specific transcription factor Bright and in the product of the dead ringer (dri) gene of Drosophila melanogaster. ARID family members are implicated in normal development, differentiation, cell cycle regulation, transcriptional activation and chromatin remodeling. Different family members exhibit different DNA-binding properties. Drosophila Dri, mammalian ARID3A/3B/3C and ARID5A/5B, selectively bind AT-rich sites. However, ARID1A/1B, Drosophila Osa, yeast SWI1, ARID2, ARID4A/4B, JARID1A/1B/1C/1D, and JARID2, bind DNA without sequence specificity. 87
33648 341089 cd16101 ING Inhibitor of growth (ING) domain family. The Inhibitor of growth (ING) family includes a group of tumor suppressors, ING1-5, which act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation, and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, which binds lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING domain for the H3 tail. The ING family also includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. 88
33649 340519 cd16102 RAWUL_PCGF_like RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in PCGF1-6, RING1 and -2, DRIP and similar proteins; structurally similar to a beta-grasp ubiquitin-like fold. The family includes six Polycomb Group (PcG) RING finger homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) that use epigenetic mechanisms to maintain or repress expression of their target genes. They were first discovered in fruit flies, can remodel chromatin such that epigenetic silencing of genes takes place, and are well known for silencing Hox genes through modulation of chromatin structure during embryonic development in fruit flies. PCGF homologs play important roles in cell proliferation, differentiation, and tumorigenesis. They all have been found to associate with ring finger protein 2 (RNF2). The RNF2-PCGF heterodimer is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF homologs are critical components in the assembly of distinct Polycomb Repression Complex 1 (PRC1) related complexes which are involved in the maintenance of gene repression and target different genes through distinct mechanisms. The Drosophila PRC1 core complex is formed by the Polycomb (Pc), Polyhomeotic (Ph), Posterior sex combs (Psc), and Sex combs extra (Sce, also known as Ring) subunits. In mammals, the composition of PRC1 is much more diverse and varies depending on the cellular context. All PRC1 complexes contain homologs of the Drosophila Ring protein. Ring1A/RNF1 and Ring1B/RNF2 are E3 ubiquitin ligases that mark lysine 119 of histone H2A with a single ubiquitin group (H2AK119ub). Mammalian homologs of the Drosophila Psc protein, such as PCGF2/Mel-18 or PCGF4/BMI1, regulate PRC1 enzymatic activity. PRC1 complexes can be divided into at least two classes according to the presence or absence of CBX proteins, which are homologs of Drosophila Pc. Canonical PRC1 complexes contain CBX proteins that recognize and bind H3K27me3, the mark deposited by PRC2. Therefore, canonical PRC1 complexes and PRC2 can act together to repress gene transcription and maintain this repression through cell division. Non-canonical PRC1 complexes, containing RYBP (together with additional proteins, such as L3mbtl2 or Kdm2b) rather than the CBX proteins, have recently been described in mammals. PCGF homologs contain a C3HC4-type RING-HC finger, and a RAWUL domain that might be responsible for interaction with Cbx members of the Polycomb repression complexes. 87
33650 340520 cd16103 Ubl2_OASL ubiquitin-like (Ubl) domain 2 found in 2'-5'-oligoadenylate synthase-like protein (OASL) and similar proteins. OASL, also termed 2'-5'-OAS-related protein (2'-5'-OAS-RP), or 59 kDa 2'-5'-oligoadenylate synthase-like protein, or thyroid receptor-interacting protein 14, or TR-interacting protein 14 (TRIP-14), or p59 OASL (p59OASL), is an interferon (IFN)-induced antiviral protein that plays an important role in the IFNs-mediated antiviral signaling pathway. It inhibits respiratory syncytial virus replication and is targeted by the viral nonstructural protein 1 (NS1). It also displays antiviral activity against encephalomyocarditis virus (EMCV) and hepatitis C virus (HCV) via an alternative antiviral pathway independent of RNase L. Moreover, OASL does not have 2'-5'-OAS activity, but can bind double-stranded RNA (dsRNA) to enhance RIG-I signaling. OASL belongs to the 2'-5' oligoadenylate synthase (OAS) family. While each member of this family has a conserved N-terminal OAS catalytic domain, only OASL has two tandem C-terminal ubiquitin-like (Ubl) repeats, which are required for its antiviral activity. This family corresponds to the second Ubl domain. 72
33651 340521 cd16104 Ubl_USP14_like ubiquitin-like (Ubl) domain found in ubiquitin carboxyl-terminal hydrolase 14 (USP14) and similar proteins. USP14 (EC 3.4.19.12), also termed deubiquitinating enzyme 14, or ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, or ubiquitin carboxyl-terminal hydrolase 14, is a component of proteasome regulatory subunit 19S that regulates deubiquitinated proteins entering inside the proteasome core 20S, which plays an inhibitory role in protein degradation. USP14 is also associated with various signal transduction pathways and tumorigenesis, and thus plays an essential role in the development of various types of cancer. Moreover, USP14 mediates the development of cardiac hypertrophy by promoting GSK-3beta phosphorylation, suggesting a role in cardiac hypertrophy treatment. USP14 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and a C-terminal ubiquitin-specific protease (USP) domain. 75
33652 340522 cd16105 Ubl_ASPSCR1_like Ubiquitin-like (Ubl) domain found at the N-terminus of mammalian ASPSCR1 (alveolar soft part sarcoma chromosomal region candidate gene 1 protein), Saccharomyces cerevisiae Ubx4p, and similar proteins. ASPSCR1 (alveolar soft part sarcoma chromosomal region candidate gene 1 protein) is also known as alveolar soft part sarcoma locus protein (ASPL), tether containing UBX domain for GLUT4, TUG, UBX domain protein 9, UBXD9, UBXN9 or renal papillary cell carcinoma protein 17 (RCC17). The majority of members of this family contain two beta-grasp ubiquitin-like fold domains: the N-terminal UBL domain (described in this CD), and a C-terminal UBX domain. This UBL domain lacks the characteristic C-terminal double glycine motif. ASPSCR1 functions as a cofactor of the hexameric AAA (ATPase associated with various activities) ATPase complex, known as p97 or VCP in mammals and Cdc48p in yeast. In mammalian cells, ASPSCR1 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4 and assembly of the Golgi apparatus; ASPSCR1 also plays a role in controlling vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. Ubx4p and ASPSCR1 have only partially overlapping functions: both interact with p97/Cdc48p; however, Ubx4p is important for the ERAD (endoplasmic reticulum-associated protein degradation) pathway while ASPSCR1 appears not to be. 71
33653 340523 cd16106 Ubl_Dsk2p_like ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae proteasome interacting protein Dsk2p and similar proteins. The family contains several fungal multiubiquitin receptors, including Saccharomyces cerevisiae Dsk2p and Schizosaccharomyces pombe Dph1p, both of which have been characterized as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. They interact with the proteasome through their N-terminal ubiquitin-like domain (Ubl) and with ubiquitin (Ub) through their C-terminal Ub-associated domain (UBA). S. cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Moreover, it has been implicated in spindle pole duplication through assisting in Cdc31 assembly into the new spindle pole body (SPB). S. pombe Dph1p is a ubiquitin (Ub) receptor working in concert with the class V myosin, Myo52, to target the degradation of the S. pombe CLIP-170 homolog, Tip1. It also can protect Ub chains against disassembly by deubiquitinating enzymes. 73
33654 340524 cd16107 Ubl_AtUPL5_like ubiquitin-like (Ubl) domain found in Arabidopsis thaliana ubiquitin-protein ligase 5 (AtUPL5) and similar proteins. Arabidopsis thaliana AtUPL5, also termed HECT-type E3 ubiquitin transferase UPL5, is an E3 ubiquitin-protein ligase that contains a ubiquitin-like domain (Ubl), a C-type lectin-binding domain, a leucine zipper and a HECT domain. HECT domain-containing ubiquitin-protein ligases have more than one member in different genomes; these proteins have been classified into four sub-families (UPL1/2, UPL3/4, UPL5 and UPL6/7). AtUPL5 regulates leaf senescence in Arabidopsis through degradation of the transcription factor WRKY53. 70
33655 340525 cd16108 Ubl_ATG8_like ubiquitin-like (Ubl) domain found in autophagy-related 8 (ATG8) and similar proteins. The ATG8 family of proteins constitutes a single member in Saccharomyces cerevisiae, Atg8p, and multiple homologs in higher eukaryotes; they are multifunctional ubiquitin-like (Ubl) key regulators of autophagy. The ATG8 system is a Ubl conjugation system that is essential for autophagosome formation. In the ATG8 system, a cysteine protease (ATG4) cleaves a C-terminal arginine from ATG8, and then the exposed C-terminal glycine is conjugated to phosphatidylethanolamine (PE) by ATG7, an E1-like enzyme, and ATG3, an E2-like enzyme. The mammalian ATG8 family is classified into three subfamilies: i) MAP1LC3 (microtubule associated protein 1 light chain 3) which includes MAP1LC3A, MAP1LC3B, MAP1LC3B2, and MAP1LC3C, ii) GABARAP (GABA type A receptor associated protein) which includes GABARAP, GABARAPL1, and GABARAPL3, and iii) GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa). 85
33656 340526 cd16109 DCX1 Doublecortin-like domain 1. Members of the doublecortin (DCX) gene family are microtubule-associated proteins (MAPs). Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its protein domains can occur in double tandem or single repeats. The family represents the first repeat of the DCX domain which has a stable ubiquitin-like tertiary fold. Proteins with DCX double tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK). 85
33657 340527 cd16110 DCX1_RP_like Doublecortin-like domain 1 found in retinitis pigmentosa (RP)-like protein. RP-like protein family is part of doublecortin (DCX) family. It has double tandem DCX repeats that are associated with retinitis pigmentosa. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. RP-like proteins are colocalized to the photoreceptor and share a function in outer segment disc morphogenesis. 75
33658 340528 cd16111 DCX_DCLK3 Doublecortin-like domain found in doublecortin-like kinase 3 (DCLK3). DCLK3 is a member of doublecortin (DCX) protein family. It functions as a microtubule-associated protein (MAP). DCLK3 contains only one N-terminal doublecortin domain (DCX), unlike DCLK1 and DCLK2 which each have two conserved DCX domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK3 has a serine/threonine kinase domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. 85
33659 340529 cd16112 DCX1_DCX Doublecortin-like domain 1 found in neuronal migration protein doublecortin (DCX). DCX, also termed doublin or lissencephalin-X (Lis-XDCX), is a microtubule-associated protein (MAP). It belongs to the doublecortin (DCX) family, has double tandem DCX repeats, and is expressed in migrating neurons. Structure studies show that the N-terminal DCX domain has a stable ubiquitin-like fold. DCX is not only a unique MAP in terms of structure, it also interacts with multiple additional proteins. Mutations in the human DCX genes are associated with abnormal neuronal migration, epilepsy, and mental retardation. 89
33660 340530 cd16113 DCX2_DCDC2_like Doublecortin-like domain 2 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 74
33661 340531 cd16114 Ubl_SUMO1 ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier 1 (SUMO-1) and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. Four SUMO paralogs exist in mammals, SUMO1 through SUMO4. SUMO2-SUMO4 are more closely related to each other than they are to SUMO1. SUMO1 is a binding partner of the RAD51/52 nucleoprotein filament proteins, which mediate DNA strand exchange. SUMO1 conjugation to cellular proteins has been implicated in multiple important cellular processes, such as nuclear transport, cell cycle control, oncogenesis, inflammation, and the response to virus infection. 76
33662 340532 cd16115 Ubl_SUMO2_3_4 ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier SUMO-2, SUMO-3, SUMO-4, and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. The mammalian SUMOs have four paralogs, SUMO1 through SUMO4. SUMO2 and SUMO3 are more closely related to each other than they are to SUMO1. SUMO2/3 are capable of forming chains on substrate proteins through internal lysine residues. The basic biology of SUMO4 remains unclear. A M55V polymorphism in SUMO4 has been associated with susceptibility to type I diabetes in some genetic studies. 72
33663 340533 cd16116 Ubl_Smt3_like ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae ubiquitin-like protein Smt3p and similar proteins. Smt3 (Suppressor of Mif Two 3) was originally isolated as a high-copy suppressor of a mutation in MIF2, the gene of a centromere binding protein in S. cerevisiae. Smt3p is the yeast homolog of small ubiquitin-related modifier (SUMO) proteins that are involved in post-translational protein modification called SUMOylation, covalently attaching to and detaching from other proteins in cells to modify their function. SUMO resembles ubiquitin (Ub) in its structure, its ability to be ligated to other proteins, as well as in the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. Smt3p plays essential roles in cell-cycle regulation and chromosome segregation in budding yeast. It interacts with different modification enzymes, and regulates their functions through linking covalently to its targets. 74
33664 340534 cd16117 UBX_UBXN4 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 4 (UBXN4) and similar proteins. UBXN4, also termed ERAD (endoplasmic-reticulum-associated protein degradation) substrate erasing protein (erasin), or UBX domain-containing protein 2 (UBXD2), or UBXDC1, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN4 is an endoplasmic reticulum (ER) localized protein that interacts with p97 (also known as VCP or Cdc48) via its UBX domain. Erasin exists in a complex with other p97/VCP-associated factors involved in endoplasmic-reticulum-associated protein degradation (ERAD). p97 is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The overexpression of UBXN4 increases degradation of a classical ERAD substrate and UBXN4 levels are increased in ER stressed cells. Anti-UBXN4 staining is increased in neuropathological lesions in brains of patients with Alzheimer's disease. 77
33665 340535 cd16118 UBX2_UBXN9 Ubiquitin regulatory domain X (UBX) 2 found in UBX domain protein 9 (UBXN9, UBXD9, or ASPSCR1) and similar proteins. UBXN9, also termed tether containing UBX domain for GLUT4 (TUG), or alveolar soft part sarcoma chromosomal region candidate gene 1 protein (ASPSCR1), or alveolar soft part sarcoma locus (ASPL), or renal papillary cell carcinoma protein 17 (RCC17), belongs to the UBXD family of proteins that contains two ubiquitin regulatory domains X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, UBXN9 contains an N-terminal ubiquitin-like (Ubl) domain. UBXN9 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. However, high-affinity interacting protein ASPL efficiently promotes p97 hexamer disassembly, resulting in the formation of stable p97:ASPL heterotetramers; the extended UBX domain (eUBX) in ASPL is critical for p97 hexamer disassembly and facilitates the assembly of p97:ASPL heterotetramers. UBXN9 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4 and assembly of the Golgi apparatus. In addition to GLUT4, UBXN9 also controls vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. UBXN9 and its budding yeast ortholog, Ubx4p, are multifunctional proteins that share some, but not all functions. Yeast Ubx4p is important for endoplasmic reticulum-associated protein degradation (ERAD) but UBXN9 appears not to share this function. 74
33666 340536 cd16119 UBX_UBXN6 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 6 (UBXN6) and similar proteins. UBXN6, also termed UBX domain-containing protein 1 (UBXD1), and UBXDC2, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN6 acts as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. Unlike other p97 cofactors that bind the N-domain of p97 through their UBX domain, UBXN6 binds p97 in two regions, at the p97 C terminus via a PUB domain and at the p97 N-domain with a short linear interaction motif termed VIM. Its UBX domain is not functional for the binding of p97. The UBXN6-p97 complex regulates the endolysosomal sorting of ubiquitylated plasma membrane protein caveolin-1 (CAV1), as well as the trafficking of ERGIC-53-containing vesicles by controlling the interaction of transport factors with the cytoplasmic tail of ERGIC-53. In addition, UBXN6 is a regulatory component of endoplasmic reticulum-associated degradation (ERAD) that may modulate the adaptor binding to p97. 73
33667 340537 cd16120 UBX_UBXN3B Ubiquitin regulatory domain X (UBX) found in FAS associated factor 2 (FAF2, also known as UBXN3B) and similar proteins. UBX domain-containing protein 3B (UBXN3B), also termed protein ETEA, or FAF2, or UBX domain-containing protein 8 (UBXD8), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. FAF2 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The p97-UBXD8 complex destabilizes mRNA by promoting release of the ubiquitinated RNA-binding protein HuR from messenger ribonucleoprotein (mRNP). Moreover, FAF2 is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. A yeast two-hybrid assay showed that FAF2 can interact with Fas. 80
33668 340538 cd16121 FERM_F1_SNX17 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 17 (SNX17). SNX17 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. It localizes to early endosomes, and plays an important role in mediating endocytic internalization, recycling, and/or protection from lysosomal degradation of NPxY-motif containing cell surface proteins including amyloid precursor protein (APP), P-selectin, beta1-integrin, low density lipoprotein receptor (LDLR), LDLR related protein (Lrp1), ApoER2, and FEEL1. SNX17 also affects T cell activation by regulating T cell receptor and integrin recycling. SNX17 contains a PX (Phox homology) domain and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 93
33669 340539 cd16122 FERM_F1_SNX31 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 31 (SNX31). SNX31 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. It is a novel sorting nexin associated with the uroplakin-degrading multivesicular bodies in terminally differentiated urothelial cells. SNX31 binds multiple beta integrin cytoplasmic domains and regulates beta1 integrin surface levels and stability. SNX31 contains a PX (Phox homology) domain and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 98
33670 340540 cd16123 RA_RASSF7_like Ras-associating (RA) domain found in Ras-association domain family members, RASSF7, RASSF8, RASSF9, and RASSF10. The RASSF family of proteins shares a conserved RalGDS/AF6 Ras association (RA) domain either in the C-terminus (RASSF1-6) or N-terminus (RASSF7-10). RASSF7-10 lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The structural differences between the C-terminus and N-terminus RASSF subgroups have led to the suggestion that they are two distinct families. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ras proteins are small GTPases that are involved in cellular signal transduction. The N-terminus RASSF proteins are potential Ras effectors that have been linked to key biological processes, including cell death, proliferation, microtubule stability, promoter methylation, vesicle trafficking and response to hypoxia. 81
33671 340541 cd16124 RA_GRB7_10_14 Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 7/10/14. The RA domain is highly conserved among the members of the Grb proteins family which includes Grb7, Grb10 and Grb14. Grb7/10/14 are multi-domain cytoplasmic adaptor proteins that are recruited to activated receptor tyrosine kinases. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. 85
33672 340542 cd16125 RA_ASPP1_2 Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 (ASPP) 1 and 2. The ASPP protein (apoptosis-stimulating protein of p53; also called ankyrin repeat-, Src homology 3 domain- and Pro-rich region-containing protein) plays a critical role in regulating apoptosis. The ASPP family consists of three members, ASPP1, ASPP2 and iASPP, all of which bind to p53 and regulate p53-mediated apoptosis. ASPP1 and ASPP2 have an RA domain at their N-terminus and have pro-apoptotic functions, while iASPP is involved in anti-apoptotic responses. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. 80
33673 340543 cd16126 Ubl_HR23B ubiquitin-like (Ubl) domain found in UV excision repair protein RAD23 homolog B (HR23B). HR23B, also termed xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (Ubl) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. 78
33674 340544 cd16127 Ubl_ATG8_GABARAP_like ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein (GABARAP) and similar proteins; sub-family of the autophagy-related 8 (ATG8) protein family. GABARAP (also termed GABA(A) receptor-associated protein, ATG8A, or MM46) has been implicated in intracellular protein trafficking. It is a cytosolic protein, localized to transport vesicles, the Golgi network and the endoplasmic reticulum. It interacts with the intracellular domain of the gamma2 subunit of GABA(A) receptors, and thus, functions as a trafficking modulator implicated in the intracellular trafficking of GABA(A) receptor. GABARAP also acts as a Ubl modifier belonging to the ATG8 (autophagy-related 8) protein family which is essential for autophagosome biogenesis and maturation. GABARAP recruits phosphatidylinositol 4-kinase II alpha (PI4KIIalpha) as a specific downstream effector, and regulates phosphatidylinositol 4-phosphate (PI4P)-dependent autophagosome lysosome fusion. This sub-family also includes GABARAPL1 (also termed GABA(A) receptor-associated protein-like, or GEC1), GABARAPL2/GATE-16, and GABARAPL3. GABARAPL1 has been involved in the intracellular transport of receptors via interactions with tubulin and GABA(A) or kappa opioid receptors. GABARAPL1 is also a Ubl modifier that functions as a mediator involved in androgen-regulated autophagy process. It is transcriptionally modulated by androgen receptor (AR) and has a repressive role in autophagy. In addition, GABARAPL1 is required for increased membrane expression of epidermal growth factor receptor (EGFR) during hypoxia, suggesting a possible role in the trafficking of these membrane proteins. GABARAPL1 may also play a key role in several important biological processes such as cancer or neurodegenerative diseases. Low expression of GABARAPL1 is associated with poor prognosis of patients with hepatocellular carcinoma. 107
33675 340545 cd16128 Ubl_ATG8 ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae Atg8p and related proteins; sub-family of the autophagy-related 8 (ATG8) family. The ATG8 family of proteins constitutes a single member in Saccharomyces cerevisiae, Atg8p, and multiple homologs in higher eukaryotes. These proteins are multifunctional ubiquitin-like (Ubl) key regulators of autophagy. ATG8 is characterized by a C-terminal ubiquitin-like (Ubl) domain with a short N-terminal extension. The covalent attachment of ATG8 to phosphatidylethanolamine (PtdEth) at the autophagosomal membrane places it at a crucial juncture during autophagosome formation. ATG Ubl proteins such as Saccharomyces cerevisiae Atg8p undergo a unique Ubl conjugation, a process essential for autophagosome formation. 103
33676 340546 cd16129 Ubl_ATG8_MAP1LC3 ubiquitin-like (Ubl) domain found in microtubule-associated protein 1 light chain 3 (MAP1LC3). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3 (also known as LC3) has a ubiquitin-like (Ubl) fold and belongs to the ATG8 autophagy protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by the cysteine protease ATG4 and is then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. This sub-family includes MAP1LC3A, MAP1LC3B, and MAP1LC3C, each encoded by a different MAP1LC3 gene. 105
33677 340547 cd16130 RA_Rin3 Ras-associating (RA) domain found in Ras and Rab interactor 3 (Rin3). Rin3, also termed Ras interaction/interference protein 3, is a RAS effector and a RAB5-activating guanine nucleotide exchange factor (GEF) specifically for GTPase Rab31. It functions as a negative regulator of mast cell responses to Stem Cell Factor (SCF). Rin3 contains the Vps9p-like guanine nucleotide exchange factor and Ras-association (RA) domains. 88
33678 340548 cd16131 RA_Rin2 Ras-associating (RA) domain found in Ras and Rab interactor 2 (Rin2). Rin2, also termed Ras association domain family 4, or Ras inhibitor JC265, or Ras interaction/interference protein 2, is a Rab5 GDP/GTP exchange factor with the Vps9p-like guanine nucleotide exchange factor and Ras-association (RA) domains. Rin2 connects three GTPases, R-Ras, Rab5 and Rac1, to promote endothelial cell adhesion through the regulation of integrin internalization and Rac1 activation. Rin2 is involved in the regulation of Rab5-mediated early endocytosis. The deficiency of Rin2 can cause the RIN2 syndrome, an autosomal recessive connective tissue disorder. 91
33679 340549 cd16132 RA_RASSF10 Ras-associating (RA) domain found in N-terminal Ras-association domain family 10 (RASSF10). RASSF10 is a member of a family of N-terminus RASSF7-10 proteins. RASSF7-10 has an RA domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. RA domain of N-terminal RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF10 is expressed in a wide variety of tissues and its expression in human thyroid, pancreas, placenta, heart, lung and kidney has been observed. RASSF10 is the most frequently methylated of the N-terminal RASSFs in some cancers such as in childhood acute lymphoblastic leukemia and both, thyroid cancer cell lines and primary thyroid carcinomas. 102
33680 340550 cd16133 RA_RASSF9 Ras-associating (RA) domain of N-terminal Ras-association domain family 9 (RASSF9). RASSF9, also termed PAM COOH-terminal interactor protein 1 (P-CIP1), or peptidylglycine alpha-amidating monooxygenase COOH-terminal interactor, is a member of N-terminus RASSF7-10 protein family. RASSF7-10 has an RA domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of the N-terminal RASSF proteins family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF9 was formerly known as PAM COOH-terminal interactor-1 (P-CIP1) because of its interaction with peptidylglycine alpha-amidating mono-oxygenase (PAM) and possibility of its role in regulating the trafficking of integral membrane PAM. RASSF9 is widely expressed in multiple organs such as testis, kidney, skeletal muscle, liver, lung, brain, and heart. Cloned RASSF9 showed preferential binding to N-Ras and K-Ras. 93
33681 340551 cd16134 RA_RASSF8 Ras-associating (RA) domain found in N-terminal Ras-association domain family 8 (RASSF8). RASSF8, also termed carcinoma-associated protein HOJ-1, is a member of the N-terminus RASSF7-10 protein family. RASSF7-10 has an RA-domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of N-terminal RASSF proteins family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF8 has been described as a potential tumor suppressor. RASSF8 might have a role in the regulation of cell-cell adhesion and cell growth. 82
33682 340552 cd16135 RA_RASSF7 Ras-associating (RA) domain found in N-terminal Ras-association domain family 7 (RASSF7). RASSF7, also termed HRAS1-related cluster protein 1, is a member of the N-terminus RASSF7-10 protein family. RASSF7-10 has an RA-domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of N-terminal RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF7 is a potential Ras effector as its function has been linked to some key biological processes including the regulation of cell death and proliferation; for example, RASSF7 is up-regulated in pancreatic cancer. 83
33683 340553 cd16136 RA_MRL_Lpd Ras-associating (RA) domain found in the adapter protein lamellipodin (Lpd). Lpd, also termed Ras-associated and pleckstrin homology domains-containing protein 1 (RAPH1), or amyotrophic lateral sclerosis 2 chromosomal region candidate gene 18 protein, or amyotrophic lateral sclerosis 2 chromosomal region candidate gene 9 protein, or proline-rich EVH1 ligand 2 (PREL-2), or protein RMO1, is a member of MRL (Mig10/RIAM/Lpd) family proteins that regulates cell migration and promote lamellipodia protrusion in fibroblast by interacting with Ena/VASP proteins. MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. Lpd also contains a helical region at the amino terminus for talin binding. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH domain in Lpd form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. Lpd also exhibits other unique enzymatic functions including its catalytic activity of butyrylcholinesterase, a potent therapeutic treatment targeting cocaine abuse. 90
33684 340554 cd16137 RA_MRL_RIAM Ras-associating (RA) domain found in Rap1-GTP-interacting adapter molecule (RIAM). RIAM, also termed amyloid beta A4 precursor protein-binding family B member 1-interacting protein, or APBB1-interacting protein 1, or proline-rich EVH1 ligand 1 (PREL-1), or proline-rich protein 73, or retinoic acid-responsive proline-rich protein 1 (RARP-1), is a member of MRL (Mig10/RIAM/Lpd) family proteins that regulates cell migration and promote lamellipodia protrusion in fibroblast by interacting with Ena/VASP proteins. RIAM regulates cell migration and mediates Rap1-induced cell adhesion. MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RIAM also contains a helical region at the amino terminus for talin binding. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. 89
33685 340555 cd16138 RA_MRL_MIG10 Ras-associating (RA) domain found in Caenorhabditis elegans abnormal cell migration protein 10 (MIG-10) and similar proteins. MIG-10 is lamellipodin (Lpd) found in C. elegans. It stabilizes invading cell adhesion to basement membrane and is a negative transcriptional target of Evi-1 proto-oncogene, EGL-43, in C. elegans. It also shows netrin-independent functions and is a transcriptional target of FOS-1A, a transcription factor that promotes basement membrane breaching, during anchor cell invasion in C. elegans. MIG-10 is a member of MRL (Mig10/RIAM/Lpd) family of proteins that is involved in antero-posterior migration of embryonic neurons CAN (canal-associated neurons), ALM (anterior lateral microtubule cells) and HSN (hermaphrodite-specific neurons). MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. 86
33686 340556 cd16139 RA_GRB14 Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 14. Grb14, a member of cytoplasmic adaptor proteins, is a tissue-specific negative regulator of insulin signaling. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. A novel function of Grb14 RA domain is to interact with the nucleotide binding pocket of a cyclic nucleotide gated channel alpha subunit (CNGA1) and inhibit its activity. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and PH domains, and a carboxyl-terminal SH2 domain. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. 85
33687 340557 cd16140 RA_GRB7 Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 7. GRB7, also termed B47, or epidermal growth factor receptor GRB-7, or GRB7 adapter protein, is a signal-transducing cytoplasmic adaptor protein that is co-opted by numerous tyrosine kinases involved in various cellular signaling and functions. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. Grb7 could interact with activated N-Ras in transfected cells. 88
33688 340558 cd16141 RA_GRB10 Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 10. GRB10, also termed insulin receptor-binding protein Grb-IR, is a multi-domain cytoplasmic adaptor protein that binds to the insulin-like growth factor 1 receptor (IGF-1R) and inhibits insulin signaling. Grb10 and its related family members Grb7 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. Grb14 binds to both GTPase-defective mutant Rab5 as well as CNGA1, whereas Grb10 binds only to GTP-bound form of active Rab5. 92
33689 293761 cd16142 ARS_like uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 372
33690 293762 cd16143 ARS_like uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 395
33691 293763 cd16144 ARS_like uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 421
33692 293764 cd16145 ARS_like uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 415
33693 293765 cd16146 ARS_like uncharacterized arylsulfatase. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 409
33694 293766 cd16147 G6S glucosamine (N-acetyl)-6-sulfatase(G6S, GNS) AND sulfatase 1(SULF1). N-acetylglucosamine-6-sulfatase, also known as glucosamine (N-acetyl)-6-sulfatase, hydrolyzes the 6-sulfate groups of the N-acetyl-D-glucosamine 6-sulfate units of heparan sulfate and keratan sulfate. Deficiency of N-acetylglucosamine-6-sulfatase results in Sanfilippo Syndrome type IIId or Mucopolysaccharidosis III (MPS-III), a rare autosomal recessive lysosomal storage disease. SULF1 encodes an extracellular heparan sulfate endosulfatase that removes 6-O-sulfate groups from heparan sulfate chains of heparan sulfate proteoglycans (HSPGs). 396
33695 293767 cd16148 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 271
33696 293768 cd16149 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 257
33697 293769 cd16150 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 423
33698 293770 cd16151 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 377
33699 293771 cd16152 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 373
33700 293772 cd16153 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 282
33701 293773 cd16154 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 372
33702 293774 cd16155 sulfatase_like uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 372
33703 293775 cd16156 sulfatase_like uncharacterized sulfatase subfamily; includes Escherichia coli YidJ. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 468
33704 293776 cd16157 GALNS galactosamine-6-sulfatase; also known as N-acetylgalactosamine-6-sulfatase (GALNS). Lysosomal galactosamine-6-sulfatase removes sulfate groups from a terminal N-acetylgalactosamine-6-sulfate (or galactose-6-sulfate) in mucopolysaccharides such as keratan sulfate and chondroitin-6-sulfate. Defects in GALNS lead to accumulation of substrates, resulting in the development of the lysosomal storage disease mucopolysaccharidosis IV A. 466
33705 293777 cd16158 ARSA Arylsulfatase A or cerebroside-sulfatase. Arylsulfatase A breaks down sulfatides, namely cerebroside 3-sulfate into cerebroside and sulfate. It is a member of the sulfatase family. The arylsulfatase A was located in lysosome-like structures and transported to dense lysosomes in a mannose 6-phosphate receptor-dependent manner. Deficiency of arylsulfatase A leads to the accumulation of cerebroside sulfate, which causes a lethal progressive demyelination. Arylsulfatase A requires the posttranslational oxidation of the -CH2SH group of a conserved cysteine to an aldehyde, yielding a formylglycine to be in an active form. 479
33706 293778 cd16159 ES Estrone sulfatase. Human estrone sulfatase (ES) is responsible for maintaining high levels of the active estrogen in tumor cells. ES catalyzes the hydrolysis of E1 sulfate, which is a component of the three-enzyme system that has been implicated in intracrine biosynthesis of estradiol. It is associated with the membrane of the endoplasmic reticulum (ER). The structure of ES includes two antiparallel alpha helices that protrude from the roughly spherical molecule. These highly hydrophobic helices anchor the functional domain on the membrane surface facing the ER lumen. 521
33707 293779 cd16160 spARS_like sea urchin arylsulfatase-like. This family includes sea urchin arylsulfatase and its homologous proteins. Sulfatases catalyze the hydrolysis of sulfate esters from wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatase includes the cycling of sulfur in the environment, in the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and in remodeling sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 445
33708 293780 cd16161 ARSG arylsulfatase G. Arylsulfatase G is a subfamily of sulfatases which specifically hydrolyze sulfate esters in a wide variety of substrates such as glycosaminoglycans, steroid sulfates, or sulfolipids. ARSG has arylsulfatase activity toward different pseudosubstrates like p-nitrocatechol sulfate and 4-methylumbelliferyl sulfate. An active site Cys is post-translationally converted to the critical active site C(alpha)-formylglycine. ARSG mRNA expression was found to be tissue-specific with highest expression in liver, kidney, and pancreas, suggesting a metabolic role of ARSG that might be associated with a non-classified lysosomal storage disorder. 383
33709 293881 cd16162 OCRE_RBM5_like OCRE domain found in RNA-binding protein RBM5, RBM10, and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing, and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both RBM5 and RBM10, contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, and a G-patch/D111 domain. 56
33710 293882 cd16163 OCRE_RBM6 OCRE domain found in RNA-binding protein 6 (RBM6) and similar proteins. RBM6, also called lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both specifically bind poly(G) RNA. RBM6 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has an additional unique domain, the POZ (poxvirus and zinc finger) domain, which may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. 57
33711 293883 cd16164 OCRE_VG5Q OCRE domain found in angiogenic factor VG5Q and similar proteins. VG5Q, also called angiogenic factor with G patch and FHA domains 1 (AGGF1), or G patch domain-containing protein 7, or vasculogenesis gene on 5q protein, functions as a potent angiogenic factor in promoting angiogenesis through interacting with TWEAK (also known as TNFSF12), which is a member of the tumor necrosis factor (TNF) superfamily that induces angiogenesis in vivo. VG5Q can bind to the surface of endothelial cells and promote cell proliferation, suggesting that it may act in an autocrine fashion. The chromosomal translocation t(5;11) and the E133K variant in VG5Q are associated with Klippel-Trenaunay syndrome (KTS), a disorder characterized by diverse effects in the vascular system. In addition to a forkhead-associated (FHA) domain and a G-patch motif, VG5Q contains an N-terminal OCtamer REpeat (OCRE) domain that is characterized by a 5-fold, imperfectly repeated octameric sequence. 54
33712 293884 cd16165 OCRE_ZOP1_plant OCRE domain found in Zinc-finger and OCRE domain-containing Protein 1 (ZOP1) and similar proteins found in plant. ZOP1 is a novel plant-specific nucleic acid-binding protein required for both RNA-directed DNA methylation (RdDM) and pre-mRNA splicing. It promotes RNA polymerase IV (Pol IV)-dependent siRNA accumulation, DNA methylation, and transcriptional silencing. As a pre-mRNA splicing factor, ZOP1 associates with several typical splicing proteins as well as with RNA polymerase II (RNAP II and Pol II). It also shows both RdDM-dependent and -independent roles in transcriptional silencing. ZOP1 contains an N-terminal C2H2-type ZnF domain and an OCtamer REpeat (OCRE) domain that is usually related to RNA processing. 55
33713 293885 cd16166 OCRE_SUA_like OCRE domain found in Suppressor of ABI3-5 (SUA) and similar proteins. SUA is an RNA-binding protein located in the nucleus and expressed in all plant tissues. It functions as a splicing factor that influences seed maturation by controlling alternative splicing of ABI3. The suppression of the cryptic ABI3 intron indicates a role of SUA in mRNA processing. SUA also interacts with the prespliceosomal component U2AF65, the larger subunit of the conserved pre-mRNA splicing factor U2AF. SUA contains two RNA recognition motifs surrounding a zinc finger domain, an OCtamer REpeat (OCRE) domain, and a Gly-rich domain close to the C-terminus. 54
33714 293886 cd16167 OCRE_RBM10 OCRE domain found in RNA-binding protein 10 (RBM10) and similar proteins. RBM10, also called G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), is a paralogue of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, and a G-patch/D111 domain. 64
33715 293887 cd16168 OCRE_RBM5 OCRE domain found in RNA-binding protein 5 (RBM5) and similar proteins. RBM5 is also called protein G15, H37, putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9. It is a known modulator of apoptosis. It acts as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both of them specifically bind poly(G) RNA. RBM5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 56
33716 320085 cd16169 Tau138_eWH extended winged-helix domain of tau138 and related proteins. Tau138 is one of three subunits of the tauB subcomplex of yeast transcription factor IIIC. This extended winged-helix domain of tau138 appears to interact with the TPR (tetratricopeptide repeat) array of tauA subunit tau131, and may therefore play a role in linking tauA, tauB, and TFIIIB to regulate the formation of the RNA polymerase III pre-initiation complex. 97
33717 320084 cd16170 MvaT_DBD DNA-binding domain of the bacterial xenogeneic silencer MvaT. MvaT is a xenogeneic silencer conserved in Pseudomonas which assists in distinguishing foreign from self DNA. It prefers binding to flexible DNA segments with multiple TpA steps, and forms nucleoprotein filaments through cooperative polymerization. 42
33718 293781 cd16171 ARSK arylsulfatase family, member K (arylsulfatase K, ASK). ARSK is a lysosomal sulfatase which exhibits an acidic pH optimum for catalytic activity against arylsulfate substrates. Other names for ARSK include arylsulfatase K and TSULF. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases. 366
33719 293930 cd16172 TorS_sensor_domain sensor domain of the sensor histidine kinase TorS. TorS is part of the trimethylamine-N-oxide (TMAO) reductase (Tor) pathway, which consists of TorT, a periplasmic binding protein that binds TMAO; TorS, a sensor histidine kinase that forms a complex with TorT; and TorR, the response regulator. The Tor pathway is involved in regulating a cellular response to TMAO, a terminal electron acceptor in anaerobic respiration. TorS consists of a periplasmic sensor domain, as well as a HAMP domain, a histidine kinase domain, and a receiver domain. 261
33720 320081 cd16173 EFh_MICU1 EF-hand, calcium binding motif, found in calcium uptake protein 1, mitochondrial (MICU1) and similar proteins. MICU1, also termed atopy-related autoantigen CALC (ara CALC), or calcium-binding atopy-related autoantigen 1 (CBARA1), or Hom s 4, or EFHA3, localizes to the inner mitochondrial membrane (IMM). It functions as a gatekeeper of the mitochondrial calcium uniporter (MCU) and regulates MCU-mediated mitochondrial Ca2+ uptake, which is essential for maintaining mitochondrial homoeostasis. MICU1 and its paralog, MICU2, are physically associated within the uniporter complex and are co-expressed across all tissues. They may operate together with MCU to regulate the channel. The mutations in MICU1 are associated with neuromuscular abnormalities in children. MICU1 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices. 153
33721 320082 cd16174 EFh_MICU2 EF-hand, calcium binding motif, found in calcium uptake protein 2, mitochondrial (MICU2) and similar proteins. MICU2, also termed EF-hand domain-containing family member A1 (EFHA1), is a mitochondrial-localized paralog of MICU1. MICU2 and its paralog, MICU1, are physically associated within the mitochondrial calcium uniporter (MCU) complex and are co-expressed across all tissues. They may operate together with MCU to regulate the channel. At present, the precise molecular function of MICU2 remains unclear. It may play possible roles in Ca2+ sensing and regulation of MCU, calcium buffering with a secondary impact on transport or assembly and stabilization of MCU. MICU2 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices. 154
33722 320083 cd16175 EFh_MICU3 EF-hand, calcium binding motif, found in calcium uptake protein 3, mitochondrial (MICU3) and similar proteins. MICU3, also termed EF-hand domain-containing family member A2 (EFHA2), is a paralog of MICU1 and notably found in the central nervous system (CNS) and skeletal muscle. At present, the precise molecular function of MICU3 remains unclear. It likely has a role in mitochondrial calcium handling. MICU3 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices. 128
33723 320076 cd16176 EFh_HEF_CB EF-hand, calcium binding motif, found in calbindin (CB). CB, also termed calbindin D28, or D-28K, or avian-type vitamin D-dependent calcium-binding protein, is a unique intracellular calcium binding protein that functions as both a calcium sensor and buffer in eukaryotic cells. It undergoes a conformational change upon calcium binding and protects cells against insults of high intracellular calcium concentration. CB is highly expressed in brain and sensory neurons. It plays essential roles in neural functioning, altering synaptic interactions in the hippocampus, modulating calcium channel activity, calcium transients, and intrinsic neuronal firing activity. It prevents neuronal death, and maintains and controls calcium homeostasis. CB also modulates the activity of proteins participating in the development of neurodegenerative disorders such as Alzheimer's disease, Huntington's disease, and bipolar disorder. Moreover, CB interacts with Ran-binding protein M, a protein known to be involved in microtubule function. It also interacts with alkaline phosphatase and myo-inositol monophosphatase, as well as caspase 3, an enzyme that plays an important role in the regulation of apoptosis. CB contains six EF-hand motifs in a single globular domain, where EF-hands 1, 3, 4, 5 bind four calcium ions with high affinity. 243
33724 320077 cd16177 EFh_HEF_CR EF-hand, calcium binding motif, found in calretinin (CR). CR, also termed 29 kDa calbindin, is a cytosolic hexa-EF-hand calcium-binding protein predominantly expressed in a variety of normal and tumorigenic t specific neurons of the central and peripheral nervous system. It possibly functions as a calcium buffer, calcium sensor, and apoptosis regulator, which may be implicated in many biological processes, including cell proliferation, differentiation, and cell death. CR contains six EF-hand motifs within two independent domains, CR I-II and CR III-VI. CR I-II consists of EF-hand motifs 1 and 2, and CR III-VI consists of EF-hand motifs 3-6. The first 5 EF-hand motifs are capable of binding calcium ions, while the EF-hand 6 is inactive. Thus, CR has two pairs of cooperative binding sites (I-II and III-IV), which display high affinity calcium-binding sites, and one independent calcium ion-binding site (V), which displays lower affinity binding. 248
33725 320078 cd16178 EFh_HEF_SCGN EF-hand, calcium binding motif, found in secretagogin (SCGN). SCGN is a six EF-hand calcium-binding protein expressed in neuroendocrine, pancreatic endocrine and retinal cells. It plays a crucial role in cell apoptosis, receptor signaling and differentiation. It is also involved in vesicle secretion through binding to various proteins, including SNAP25, SNAP23, DOC2alpha, ARFGAP2, rootletin, KIF5B, beta-tubulin, DDAH-2, ATP-synthase and myeloid leukemia factor 2. SCGN functions as a calcium sensor/coincidence detector modulating vesicular exocytosis of neurotransmitters, neuropeptides or hormones. It also serves as a calcium buffer in neurons. Thus, SCGN may be linked to the pathogenesis of neurological diseases such as Alzheimer's, and also acts as a serum marker of neuronal damage, or as a tumor biomarker. SCGN consists of three globular domains, each of which contains a pair of EF-hand motifs. All six EF-hand motifs of SCGN in some eukaryotes, including D. rerio, X. laevis, M. domestica, G. gallus, O. anatinus, could potentially bind six calcium ions. In contrast, SCGNs from higher eukaryotes have at least one non-functional EF-hand motif due to mutations or deletions. For instance, the EF1 loop does not coordinate a calcium ion in SCGNs of many mammalian species because the key asparagine residue is replaced by lysine. Moreover, the EF2 loop seems to be competent for calcium-binding in most mammalian SCGNs except for human and chimpanzee orthologs. 257
33726 320079 cd16179 EFh_HEF_CBN EF-hand, calcium binding motif, found in Drosophila melanogaster calbindin-32 (CBN) and similar proteins. CBN, the product of the cbn gene, is a Drosophila homolog to vertebrate neuronal six EF-hand calcium binding proteins. It is expressed through most of ontogenesis with a selective distribution in the nervous system and in a few small adult thoracic muscles. Its precise biological role remains unclear. CBN contains six EF-hand motifs, but some of them may not bind calcium ions due to the lack of key residues. 261
33727 320055 cd16180 EFh_PEF_Group_I Penta-EF hand, calcium binding motifs, found in Group I PEF proteins. The family corresponds to Group I PEF proteins that have been found not only in higher animals but also in lower animals, plants, fungi and protists. Group I PEF proteins include apoptosis-linked gene 2 protein (ALG-2), peflin and similar proteins. ALG-2, also termed programmed cell death protein 6 (PDCD6), is a widely expressed calcium-binding modulator protein associated with cell proliferation and death, as well as cell survival. It forms a homodimer in the cell or a heterodimer with its closest paralog peflin. Among the PEF proteins, ALG-2 can bind three Ca2+ ions through its EF1, EF3, and EF5 hands, where it is unique in that its EF5 hand binds Ca2+ ion in a canonical coordination. Peflin is a ubiquitously expressed 30-kD PEF protein containing five EF-hand motifs in its C-terminal domain and a longer N-terminal hydrophobic domain (NHB domain) than any other member of the PEF family. The NHB domain harbors nine repeats of a nonapeptide (A/PPGGPYGGP). Peflin may modulate the function of ALG-2 in Ca2+ signaling. It exists only as a heterodimer with ALG-2, and binds two Ca2+ ions through its EF1 and EF3 hands. Its additional EF5 hand is unpaired and does not bind Ca2+ ion but mediates the heterodimerization with ALG-2. The dissociation of heterodimer occurs in the presence of Ca2+. 164
33728 320056 cd16181 EFh_PEF_Group_II_sorcin_like Penta-EF hand, calcium binding motifs, found in sorcin, grancalcin, and similar proteins. The family corresponds to the second group of penta-EF hand (PEF) proteins that includes sorcin, grancalcin, and similar proteins. Sorcin, also termed 22 kDa Ca2+-binding protein, CP-22, or V19, is a soluble resistance-related calcium-binding protein that is expressed in normal mammalian tissues, such as the liver, lungs and heart. It contains a flexible glycine and proline-rich N-terminal extension and five EF-hand motifs that associate with membranes in a calcium-dependent manner. It may harbor three potential Ca2+ binding sites through its EF1, EF2 and EF3 hands. However, binding of only two Ca2+/monomer suffices to trigger the conformational change that exposes hydrophobic regions and leads to interaction with the respective targets. Sorcin forms homodimers through the association of the unpaired EF5 hand. Among the PEF proteins, sorcin is unique in that it contains potential phosphorylation sites by cAMP-dependent protein kinase (PKA), and it can form a tetramer at slightly acid pH values although remaining a stable dimer at neutral pH. Grancalcin (GCA) is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It can strongly interact with sorcin to form a heterodimer and further modulate the function of sorcin. GCA exists as homodimers in solution. It contains five EF-hand motifs attached to an N-terminal region of an approximately 50 residue-long segment rich in glycines and prolines. In contrast with sorcin, GCA binds two Ca2+ ions through its EF1 and EF3 hands. 165
33729 320057 cd16182 EFh_PEF_Group_II_CAPN_like Penta-EF hand, calcium binding motifs, found in PEF calpain family. The PEF calpain family belongs to the second group of penta-EF hand (PEF) proteins. It includes classical (also called conventional or typical) calpain (referring to calcium-dependent papain-like enzymes, EC 3.4.22.17) large catalytic subunits (CAPN1, 2, 3, 8, 9, 11, 12, 13, 14) and two calpain small subunits (CAPNS1 and CAPNS2), which are largely confined to animals (metazoans). These PEF-containing proteins are nonlysosomal calcium-activated intracellular cysteine proteases that play important roles in the degradation or functional modulation of a variety of substrates in response to calcium signalling. The classical mu- and m-calpains are heterodimers consisting of homologous but distinct (large) L-subunits/chains (CAPN1 or CAPN2) and a common (small) S-subunit/chain (CAPNS1 or CAPNS2). These L-subunits (CAPN1 and CAPN2) and S-subunit CAPNS1 are ubiquitously found in all tissues. Other calpains likely consist of an isolated L-subunit/chain alone. Many of them, such as CAPNS2, CAPN3 (in skeletal muscle, or lens), CAPN8 (in stomach), CAPN9 (in digestive tracts), CAPN11 (in testis), CAPN12 (in follicles), are tissue-specific and have specific functions in distinct organs. L-subunits of similar structure (called CALPA and B) have also been found in Drosophila melanogaster. The S-subunit seems to have a chaperone-like function for proper folding of the L-subunit. The catalytic L-subunits contain a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The S-subunits only have the PEF domain following an N-terminal Gly-rich hydrophobic domain. The calpains undergo a rearrangement of the protein backbone upon Ca2+-binding. 167
33730 320058 cd16183 EFh_PEF_ALG-2 EF-hand, calcium binding motif, found in apoptosis-linked gene 2 protein (ALG-2) and similar proteins. ALG-2, also termed programmed cell death protein 6 (PDCD6), or probable calcium-binding protein ALG-2, is one of the prototypic members of the penta EF-hand protein family. It is a widely expressed calcium-binding modulator protein associated with cell proliferation and death, as well as cell survival. ALG-2 acts as a pro-apoptotic factor participating in T cell receptor-, Fas-, and glucocorticoid-induced programmed cell death, and also serves as a useful molecular marker for the prognosis of cancers. Moreover, ALG-2 functions as a calcium ion sensor at endoplasmic reticulum (ER) exit sites, and modulates ER-stress-stimulated cell death and neuronal apoptosis during organ formation. Furthermore, ALG-2 can mediate the pro-apoptotic activity of cisplatin or tumor necrosis factor alpha (TNFalpha) through the down-regulation of nuclear factor-kappaB (NF-kappaB) expression. It also inhibits angiogenesis through the PI3K/mTOR/p70S6K pathway by interacting with vascular endothelial growth factor receptor-2 (VEGFR-2). In addition, nuclear ALG-2 may participate in the post-transcriptional regulation of Inositol Trisphosphate Receptor Type 1 (IP3R1) pre-mRNA at least in part by interacting with CHERP (Ca2+ homeostasis endoplasmic reticulum protein) calcium-dependently. ALG-2 contains five serially repeated EF-hand motifs and interacts with various proteins, including ALG-2-interacting protein X (Alix), Fas, annexin XI, death-associated protein kinase 1 (DAPk1), Tumor susceptibility gene 101 (TSG101), Sec31A, phospholipid scramblase 3 (PLSCR3), the P-body component PATL1, and endosomal sorting complex required for transport (ESCRT)-III-related protein IST1, in a calcium-dependent manner. It forms a homodimer in the cell or a heterodimer with its closest paralog peflin. Among the PEF proteins, ALG-2 can bind three Ca2+ ions through its EF1, EF3, and EF5 hands, where it is unique in that its EF5 hand binds Ca2+ ion in a canonical coordination. 165
33731 320059 cd16184 EFh_PEF_peflin EF-hand, calcium binding motif, found in peflin and similar proteins. Peflin, also termed penta-EF hand (PEF) protein with a long N-terminal hydrophobic domain, or penta-EF hand domain-containing protein 1, is a ubiquitously expressed 30-kD PEF protein containing five EF-hand motifs in its C-terminal domain and a longer N-terminal hydrophobic domain (NHB domain) than any other member of the PEF family. The NHB domain harbors nine repeats of a nonapeptide (A/PPGGPYGGP). Peflin may modulate the function of ALG-2 in Ca2+ signaling. It exists only as a heterodimer with ALG-2, and binds two Ca2+ ions through its EF1 and EF3 hands. Its additional EF5 hand is unpaired and does not bind Ca2+ ion but mediates the heterodimerization with ALG-2. The dissociation of heterodimer occurs in the presence of Ca2+. In lower vertebrates, peflin may interact with transient receptor potential N (TRPN1), suggesting a potential role of peflin in fast transducer channel adaptation. 165
33732 320060 cd16185 EFh_PEF_ALG-2_like EF-hand, calcium binding motif, found in homologs of mammalian apoptosis-linked gene 2 protein (ALG-2). The family includes some homologs of mammalian apoptosis-linked gene 2 protein (ALG-2) mainly found in lower eukaryotes, such as the parasitic protist Leishmania major and the cellular slime mold Dictyostelium discoideum. These homologs contain five EF-hand motifs. Due to the presence of unfavorable residues at the Ca2+-coordinating positions, their non-canonical EF4 and EF5 hands may not bind Ca2+. Two Dictyostelium PEF proteins are the prototypes of this family. They may bind to cytoskeletal proteins and/or signal-transducing proteins localized to detergent-resistant membranes named lipid rafts, and occur as monomers or weak homo- or heterodimers like ALG-2. They can serve as mediators of Ca2+ signaling-related Dictyostelium programmed cell death (PCD). 163
33733 320061 cd16186 EFh_PEF_grancalcin Penta-EF hand, calcium binding motifs, found in grancalcin. Grancalcin (GCA) is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It displays a Ca2+-dependent translocation to granules and plasma membrane upon neutrophil activation, suggesting roles in granule-membrane fusion and degranulation of neutrophils. It may also play a role in the regulation of vesicle/granule exocytosis through the reversible binding of secretory vesicles and plasma membranes upon the presence of calcium. Moreover, GCA is involved in inflammation, as well as in the process of adhesion of neutrophils to fibronectin. It plays a key role in leukocyte-specific functions that are responsible for host defense, and affects the function of integrin receptors on immune cells through binding to L-plastin in the absence of calcium. Furthermore, GCA can strongly interact with sorcin to form a heterodimer, and further modulate the function of sorcin. GCA exists as homodimers in solution. It contains five EF-hand motifs attached to an N-terminal region of an approximately 50 residue-long segment rich in glycines and prolines. GCA binds two Ca2+ ions through its EF1 and EF3 hands. 165
33734 320062 cd16187 EFh_PEF_sorcin Penta-EF hand, calcium binding motifs, found in sorcin. Sorcin, also termed 22 kDa Ca2+-binding protein, CP-22, or V19, is a soluble resistance-related calcium-binding protein that is expressed in normal mammalian tissues, such as the liver, lungs and heart. The up-regulation of sorcin is correlated with a number of cancer types, including colorectal, gastric and breast cancer. It may represent a therapeutic target for reversing tumor multidrug resistance (MDR). Sorcin participates in the regulation of calcium homeostasis in cells and is necessary for the activation of mitosis and cytokinesis. It enhances metastasis and promotes epithelial-to-mesenchymal transition of colorectal cancer. Moreover, sorcin has been implicated in the regulation of intracellular Ca2+ cycling and cardiac excitation-contraction coupling. It displays anti-apoptotic properties via the modulation of mitochondrial Ca2+ handling in cardiac myocytes. It can target and activate the sarcolemmal Na+/Ca2+ exchanger (NCX1) in cardiac muscle. Meanwhile, sorcin modulates cardiac L-type Ca2+ current by functional interaction with the alpha1C subunit. It also associates with calcium/calmodulin-dependent protein kinase IIdeltaC (CaMKIIdelta(C)) and further modulates ryanodine receptor (RyR) function in cardiac myocytes. Furthermore, sorcin may act as a Ca2+ sensor for glucose-induced nuclear translocation and the activation of carbohydrate-responsive element-binding protein (ChREBP)-dependent genes. As an interactor of the mitochondrial chaperone TRAP1, sorcin is involved in mitochondrial metabolism through the TRAP1 pathway. In addition, sorcin may regulate the inhibition of type I interferon response in cells through interacting with foot-and-mouth disease virus (FMDV) VP1. Sorcin contains a flexible glycine and proline-rich N-terminal extension and five EF-hand motifs that associate with membranes in a calcium-dependent manner. It may harbor three potential Ca2+ binding sites through its EF1, EF2 and EF3 hands. However, binding of only two Ca2+/monomer suffices to trigger the conformational change that exposes hydrophobic regions and leads to interaction with the respective targets. Sorcin forms homodimers through the association of the unpaired EF5 hand. Among the PEF proteins, sorcin is unique in that it contains potential phosphorylation sites by cAMP-dependent protein kinase (PKA), and it can form a tetramer at slightly acid pH values although remaining a stable dimer at neutral pH. 165
33735 320063 cd16188 EFh_PEF_CPNS1_2 Penta-EF hand, calcium binding motifs, found in calcium-dependent protease small subunit CAPNS1 and CAPNS2. CAPNS1, also termed calpain small subunit 1 (CSS1), or calcium-activated neutral proteinase small subunit (CANP small subunit), or calcium-dependent protease small subunit (CDPS), or calpain regulatory subunit, is a common 28-kDa regulatory calpain subunit encoded by the calpain small 1 (Capns1, also known as Capn4) gene. It acts as a binding partner to form a heterodimer with the 80 kDa calpain large catalytic subunit and is required in maintaining the activity of calpain. CAPNS1 plays a significant role in tumor progression of human cancer, and functions as a potential therapeutic target in human hepatocellular carcinoma (HCC), nasopharyngeal carcinoma (NPC), glioma, and clear cell renal cell carcinoma (ccRCC). It may be involved in regulating migration and cell survival through binding to the SH3 domain of Ras GTPase-activating protein (RasGAP). It may also modulate Akt/FoxO3A signaling and apoptosis through PP2A. CAPNS1 contains an N-terminal glycine rich domain and a C-terminal PEF-hand domain. CAPNS2, also termed calpain small subunit 2 (CSS2), is a novel tissue-specific 30 kDa calpain small subunit that lacks two oligo-Gly stretches characteristic of the N-terminal Gly-rich domain of CAPNS1. CAPNS2 acts as a chaperone for the calpain large subunit, and appears to be the functional equivalent of CAPNS1. However, CAPNS2 binds the large subunit much more weakly than CAPNS1 and it does not undergo the autolytic conversion typical of CAPNS1. 169
33736 320064 cd16189 EFh_PEF_CAPN1_like Penta-EF hand, calcium binding motifs, found in mu-type calpain (CAPN1), m-type calpain (CAPN2), and similar proteins. The family includes mu-type calpain (CAPN1) and m-type calpain (CAPN2), both of which are ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine proteases that contain a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN1 or CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms a mu- or m-calpain heterodimer, respectively. 168
33737 320065 cd16190 EFh_PEF_CAPN3 Calcium-activated neutral. CAPN3, also termed calcium-activated neutral proteinase 3 (CANP 3), or calpain L3, or calpain p94, or muscle-specific calcium-activated neutral protease 3, or new calpain 1 (nCL-1), is a calpain large subunit that is mainly expressed in skeletal muscle or lens. The skeletal muscle-specific CAPN3 is pathologically associated with limb girdle muscular dystrophy type 2A (LGMD2A). Its autolytic activity can be positively regulated by calmodulin (CaM), a known transducer of the calcium signal. CAPN3 is also involved in human melanoma tumorigenesis and progression. It impairs cell proliferation and stimulates oxidative stress-mediated cell death in melanoma cells. Moreover, it plays an important role in sarcomere remodeling and mitochondrial protein turnover. Furthermore, the phosphorylated skeletal muscle-specific CAPN3 acts as a myofibril structural component and may participate in myofibril-based signaling pathways. In the eye, the lens-specific CAPN3, together with CAPN2, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation. CAPN3 exists as a homodimer, rather than a heterodimer with the calpain small subunit. It may also form heterodimers with other calpain large subunits. CAPN3 contains a long N-terminal region, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. Ca2+ binding at EF5 of the CAPN3 PEF domain is a distinct feature not observed in other calpain isoforms. 169
33738 320066 cd16191 EFh_PEF_CAPN8 Penta-EF hand, calcium binding motifs, found in calpain-8 (CAPN8). CAPN8, also termed new calpain 2 (nCL-2), or stomach-specific M-type calpain, is a calpain large subunit predominantly expressed in the stomach. It appears to be involved in membrane trafficking in the gastric surface mucus cells (pit cells), via its location at the Golgi and interaction with the beta-subunit of coatomer complex (beta-COP) of vesicles derived from the Golgi. Moreover, CAPN8, together with CAPN9, forms an active protease complex, G-calpain, in which both proteins are essential for stability and activity. The G-Calpain has been implicated in gastric mucosal defense. CAPN8 exists as both a monomer and homo-oligomer, but not as a heterodimer with the conventional calpain regulatory subunit (30K). The monomer and homodimer forms predominate. CAPN8 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. 168
33739 320067 cd16192 EFh_PEF_CAPN9 Penta-EF hand, calcium binding motifs, found in calpain-9 (CAPN9). CAPN9, also termed digestive tract-specific calpain, or new calpain 4 (nCL-4), or protein CG36, is a calpain large subunit predominantly expressed in gastrointestinal tract. It plays a physiological role in the suppression of tumorigenesis. It acts as an important biomolecule link for the regression of colorectal cancer via intracellular calcium homeostasis. CAPN9 may also play a critical role in lumen formation. Moreover, CAPN9, together with CAPN8, forms an active protease complex, G-calpain, in which both proteins are essential for stability and activity. The G-Calpain has been implicated in gastric mucosal defense. Furthermore, down-regulation of calpain 9 has been linked to hypertensive heart and kidney disease in salt-sensitive Dahl rats. CAPN9 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. 169
33740 320068 cd16193 EFh_PEF_CAPN11 Penta-EF hand, calcium binding motifs, found in calpain-11 (CAPN11). CAPN11, also termed calcium-activated neutral proteinase 11 (CANP 11), is a mammalian orthologue of micro/m calpain. It is one of the calpain large subunits that appears to be exclusively expressed in certain cells of the testis. It may be involved in regulating calcium-dependent signal transduction events during meiosis and sperm functional processes. CAPN11 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. 169
33741 320069 cd16194 EFh_PEF_CAPN12 Penta-EF hand, calcium binding motifs, found in calpain-12 (CAPN12). CAPN12, also termed calcium-activated neutral proteinase 12 (CANP 12), is a calpain large subunit mainly expressed in the cortex of the hair follicle. It may affect apoptosis regulation. CAPN12 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. 169
33742 320070 cd16195 EFh_PEF_CAPN13_14 Penta-EF hand, calcium binding motifs, found in calpain-13 (CAPN13), calpain-14 (CAPN14), and similar proteins. CAPN13, also termed calcium-activated neutral proteinase 13 (CANP 13), is a 63.6 kDa calpain large subunit that exhibits a restricted tissue distribution with low levels of expression detected only in human testis and lung. In the calpain family, CAPN13 is most closely related to calpain-14 (CAPN14). CAPN14, also termed calcium-activated neutral proteinase 14 (CANP 14), is a 76.7 kDa calpain large subunit that is most highly expressed in the oesophagus. Its expression and calpain activity can be induced by IL-13. Both CAPN13 and CAPN14 contain a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. 168
33743 320071 cd16196 EFh_PEF_CalpA_B Penta-EF hand, calcium binding motifs, found in Drosophila melanogaster calpain-A (CalpA), calpain-B (CalpB), and similar proteins. The family contains two calpains that have been found in Drosophila, CalpA and CalpB. CalpA, also termed calcium-activated neutral proteinase A (CANP A), or calpain-A catalytic subunit, is a Drosophila calpain homolog specifically expressed in a few neurons in the central nervous system, in scattered endocrine cells in the midgut, and in blood cells. CalpB, also termed calcium-activated neutral proteinase B (CANP B), contains calpain-B catalytic subunit 1 and calpain-B catalytic subunit 2. Both CalpA and CalpB are closely related to vertebrate calpains, and they share a similar domain architecture, which consists of four domains: the N-terminal domain I, the catalytic domain II carrying the three active site residues, Cys, His and Asn, the Ca2+-regulated phospholipid-binding domain III, and penta-EF-hand Ca2+-binding domain IV. In addition, CalpA and CalpB display some distinguishing structural features that are not found in typical mammalian calpains. CalpA harbors a 76 amino acid long hydrophobic stretch inserted in domain IV, which may be involved in membrane attachment of this enzyme. CalpB has an unusually long N-terminal tail of 224 amino acids, which belongs to the class of intrinsically unstructured proteins (IUP) and may become ordered upon binding to target protein(s). Moreover, they do not need small regulatory subunits for their catalytic activity, and their proteolytic function is not regulated by an intrinsic inhibitor as the Drosophila genome contains neither regulatory subunit nor calpastatin orthologs. As a result, they may exist as a monomer or perhaps as a homo- or heterodimer together with a second large subunit. Furthermore, both CalpA and CalpB are dispensable for viability and fertility and do not share vital functions during Drosophila development. Phosphatidylinositol 4,5-diphosphate, phosphatidylinositol 4-monophosphate, phosphatidylinositol, and phosphatidic acid can stimulate the activity and the rate of activation of CalpA, but not CalpB. Calpain A modulates Toll responses by limited Cactus/IkappaB proteolysis. CalpB directly interacts with talin, an important component of the focal adhesion complex, and functions as an important modulator in border cell migration within egg chambers, which may act via the digestion of talin. CalpB can be phosphorylated by cAMP-dependent protein kinase (protein kinase A, PKA; EC 2.7.11.11) at Ser240 and Ser845, as well as by mitogen-activated protein kinase (ERK1 and ERK2; EC 2.7.11.24) at Thr747. The activation of the ERK pathway by extracellular signals results in the phosphorylation and activation of calpain B. In Schneider cells (S2), calpain B was mainly in the cytoplasm and upon a rise in Ca2+ the enzyme adhered to intracellular membranes. 167
33744 320072 cd16197 EFh_PEF_CalpC Penta-EF hand, calcium binding motifs, found in Drosophila melanogaster calpain-C (CalpC) and similar proteins. CalpC, also termed calcium-activated neutral proteinase homolog C (CANP C), is a catalytically inactive homolog of CalpA and CalpB found in Drosophila. It is strongly expressed in the salivary glands. CalpA and CalpB are both closely related to vertebrate calpains and share a similar domain architecture, which consists of four domains: the N-terminal domain I, the catalytic domain II carrying the three active site residues, Cys, His and Asn, the Ca2+-regulated phospholipid-binding domain III, and penta-EF-hand Ca2+-binding domain IV. In contrast, CalpC is a truncated calpain form missing domain I and about 20 residues from domain II. Moreover, because all three active site residues are mutated (Cys to Arg, His to Val and Asn to Ser), it may not have proteolytic activity. 166
33745 320073 cd16198 EFh_PEF_CAPN1 Penta-EF hand, calcium binding motifs, found in mu-type calpain (CAPN1). CAPN1, also termed calpain-1 80-kDa catalytic subunit, or calpain-1 large subunit, or micromolar-calpain (muCANP), or calcium-activated neutral proteinase 1 (CANP 1), or cell proliferation-inducing gene 30 protein, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN1 in complex with a regulatory subunit encoded by CAPNS1 forms a mu-calpain heterodimer. CAPN1 plays a central role in postmortem proteolysis and meat tenderization processes, as well as in regulation of proliferation and survival of skeletal satellite cells. It also acts as a novel regulator in IgE-mediated mast cell activation and could serve as a potential therapeutic target for the management of allergic inflammation. Moreover, CAPN1 is involved in neutrophil motility and functions as a potential target for intervention in inflammatory disease. It also facilitates age-associated aortic wall calcification and fibrosis through the regulation of matrix metalloproteinase 2 activity in vascular smooth muscle cells, and thus plays a role in hypertension and atherosclerosis. The proteolytic cleavage of beta-amyloid precursor protein and tau protein by CAPN1 may be involved in plaque formation. Furthermore, CAPN1 is activated in the brains of individuals with Alzheimer's disease. It is involved in the maintenance of a proliferative neural stem cell pool. The activation and macrophage inflammation of CAPN1 in hypercholesterolemic nephropathy is promoted by nicotinic acetylcholine receptor alpha1 (nAChRalpha1). In addition, CAPN1 displays a functional role in hemostasis, as well as in sickle cell disease. 169
33746 320074 cd16199 EFh_PEF_CAPN2 Penta-EF hand, calcium binding motifs, found in m-type calpain (CAPN2). CAPN2, also termed millimolar-calpain (m-calpain), or calpain-2 catalytic subunit, or calcium-activated neutral proteinase 2 (CANP 2), or calpain large polypeptide L2, or calpain-2 large subunit, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms an m-calpain heterodimer. CAPN2 acts as the key protease responsible for N-methyl-d-aspartic acid (NMDA)-induced cytoplasmic polyadenylation element-binding protein 3 (CPEB3) degradation in neurons. It cleaves several components of the focal adhesion complex, such as FAK and talin, triggering disassembly of the complex at the rear of the cell. The stimulation of CAPN2 activity is required for Golgi antiapoptotic proteins (GAAPs) to promote cleavage of FA kinase (FAK), cell spreading, and enhanced migration. CAPN2 is also involved in the onset of glial differentiation. It regulates proliferation, survival, migration, and tumorigenesis of breast cancer cells through a PP2A-Akt-FoxO-p27(Kip1) signaling cascade. Its expression is associated with response to platinum-based chemotherapy, and with progression-free and overall survival in ovarian cancer. Moreover, CAPN2 may play a role in fundamental mitotic functions, such as the maintenance of sister chromatid cohesion. The activation of CAPN2 plays an essential role in hippocampal synaptic plasticity and in learning and memory. In the eye, CAPN2, together with a lens-specific variant of CAPN3, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation.
Sometimes, CAPN2 compensates for loss of CAPN1, and both calpain isoforms are involved in AngII-induced aortic aneurysm formation. The main phosphorylation sites in m-calpain are Ser50 and Ser369/Thr370. 168
33747 320030 cd16200 EFh_PI-PLCbeta EF-hand motif found in metazoan phosphoinositide-specific phospholipase C (PI-PLC)-beta isozymes. PI-PLC-beta isozymes represent a class of metazoan PI-PLCs that hydrolyze the membrane lipid phosphatidylinositol 4,5-bisphosphate (PIP2) to propagate diverse intracellular responses that underlie the physiological action of many hormones, neurotransmitters, and growth factors (EC 3.1.4.11). They have been implicated in numerous processes relevant to central nervous system (CNS), including chemotaxis, cardiovascular function, neuronal signaling, and opioid sensitivity. Like other PI-PLC isozymes, PI-PLC-beta isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, they have a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are four PI-PLC-beta isozymes (1-4). PI-PLC-beta1 and PI-PLC-beta3 are expressed in a wide range of tissues and cell types, whereas PI-PLC-beta2 and PI-PLC-beta4 have been found only in hematopoietic and neuronal tissues, respectively. All PI-PLC-beta isozymes are activated by the heterotrimeric G protein alpha subunits of the Gq class through their C2 domain and long C-terminal extension. They are GTPase-activating proteins (GAPs) for these G alpha(q) proteins. PI-PLC-beta2 and PI-PLC-beta3 can also be activated by beta-gamma subunits of the G alpha(i/o) family of heterotrimeric G proteins and the small GTPases such as Rac and Cdc42. This family also includes two invertebrate homologs of PI-PLC-beta, PLC21 from cephalopod retina and No receptor potential A protein (NorpA) from Drosophila melanogaster. 153
33748 320031 cd16201 EFh_PI-PLCgamma EF-hand motif found in phosphoinositide phospholipase C gamma isozymes (PI-PLC-gamma). PI-PLC-gamma isozymes represent a class of metazoan PI-PLCs that hydrolyze the membrane lipid phosphatidylinositol 4,5-bisphosphate (PIP2) to propagate diverse intracellular responses that underlie the physiological action of many hormones, neurotransmitters, and growth factors. They can form a complex with the phosphorylated cytoplasmic domains of the immunoglobulin Ig-alpha and Ig-beta subunits of the B cell receptor (BCR), the membrane-tethered Src family kinase Lyn, phosphorylated spleen tyrosine kinase (Syk), the phosphorylated adaptor protein B-cell linker (BLNK), and activated Bruton's tyrosine kinase (Btk). Like other PI-PLC isozymes, PI-PLC-gamma isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma, a second PH domain, which is split by two SH2 (Src homology 2) domains, and one SH3 (Src homology 3) domain, are present within this linker. The SH2 and SH3 domains are responsible for the binding of phosphotyrosine-containing sequences and proline-rich sequences, respectively. There are two PI-PLC-gamma isozymes (1-2), both of which are activated by receptor and non-receptor tyrosine kinases due to the presence of SH2 and SH3 domains. 145
33749 320032 cd16202 EFh_PI-PLCdelta EF-hand motif found in phosphoinositide phospholipase C delta (PI-PLC-delta). PI-PLC-delta isozymes represent a class of metazoan PI-PLCs that are some of the most sensitive to calcium among all PLCs. Their activation is modulated by intracellular calcium ion concentration, phospholipids, polyamines, and other proteins, such as RhoAGAP. Like other PI-PLC isozymes, PI-PLC-delta isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1, 3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Different PI-PLC-delta isozymes have different tissue distribution and different subcellular locations. PI-PLC-delta1 is mostly a cytoplasmic protein, PI-PLC-delta3 is located in the membrane, and PI-PLC-delta4 is predominantly detected in the cell nucleus. PI-PLC-delta isozymes are evolutionarily conserved even in non-mammalian species, such as yeast, slime molds and plants. 140
33750 320033 cd16203 EFh_PI-PLCepsilon EF-hand motif found in phosphoinositide phospholipase C epsilon 1 (PI-PLC-epsilon1). PI-PLC-epsilon1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase epsilon-1, or pancreas-enriched phospholipase C, or phospholipase C-epsilon-1 (PLC-epsilon-1), is dominant in connective tissues and brain. It has been implicated in carcinogenesis, such as in bladder and intestinal tumors, oesophageal squamous cell carcinoma, gastric adenocarcinoma, murine skin cancer, and head and neck cancer. PI-PLC-epsilon1 contains an N-terminal CDC25 homology domain with guanyl-nucleotide exchange factor (GEF) activity, a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and at least one and perhaps two C-terminal predicted RA (Ras association) domains that are implicated in the binding of small GTPases, such as Ras or Rap, from the Ras family. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is only one PI-PLC-epsilon isozyme. It is directly activated by G alpha(12/13), G beta gamma, and activated members of Ras and Rho small GTPases. 174
33751 320034 cd16204 EFh_PI-PLCzeta EF-hand motif found in phosphoinositide phospholipase C zeta 1 (PI-PLC-zeta1). PI-PLC-zeta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase zeta-1, or phospholipase C-zeta-1 (PLC-zeta-1), or testis-development protein NYD-SP27, is only found in the testis. The sperm-specific PI-PLC plays a fundamental role in vertebrate fertilization by initiating intracellular calcium oscillations that trigger embryo development. However, the mechanism of its activation remains unclear. PI-PLC-zeta1 contains four N-terminal atypical EF-hand motifs, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unlike other PI-PLCs, PI-PLC-zeta is responsible for Ca2+ oscillations in fertilized oocytes and exhibits a high sensitivity to Ca2+ mediated through its EF-hand domain. There is only one PLC-zeta isozyme. Aside from PI-PLC-zeta identified in mammals, its eukaryotic homologs have been classified with this family. 142
33752 320035 cd16205 EFh_PI-PLCeta EF-hand motif found in phosphoinositide phospholipase C eta (PI-PLC-eta). PI-PLC-eta isozymes represent a class of neuron-specific metazoan PI-PLCs that are most abundant in the brain, particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. They are phosphatidylinositol 4,5-bisphosphate-hydrolyzing enzymes that are more sensitive to Ca2+ than other PI-PLC isozymes. They function as calcium sensors activated by small increases in intracellular calcium concentrations. They are also activated through G-protein-coupled receptor (GPCR) stimulation, and further mediate GPCR signalling pathways. PI-PLC-eta isozymes contain an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases. There are two PI-PLC-eta isozymes (1-2). Aside from the PI-PLC-eta isozymes identified in mammals, their eukaryotic homologs are also present in this family. 141
33753 320036 cd16206 EFh_PRIP EF-hand motif found in phospholipase C-related but catalytically inactive proteins (PRIP). This family represents a class of metazoan phospholipase C-related, but catalytically inactive proteins (PRIP), which belong to a group of novel inositol 1,4,5-trisphosphate (InsP3) binding proteins. PRIP has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity. PRIP consists of two subfamilies, PRIP-1 (also known as p130 or PLC-L1), which is predominantly expressed in the brain, and PRIP-2 (also known as PLC-L2), which exhibits a relatively ubiquitous expression. Experiments show that both PRIP-1 and PRIP-2 are involved in the InsP3-mediated calcium signaling pathway and the GABA(A) receptor-mediated signaling pathway. In addition, PRIP-2 acts as a negative regulator of B-cell receptor signaling and immune responses. 143
33754 320037 cd16207 EFh_ScPlc1p_like EF-hand motif found in Saccharomyces cerevisiae phospholipase C-1 (ScPlc1p) and similar proteins. This family represents a group of putative phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) encoded by PLC1 genes from yeasts, which are homologs of the delta isoforms of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipid phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The prototype of this family is protein Plc1p (also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase 1) encoded by PLC1 genes from Saccharomyces cerevisiae. ScPlc1p contains both highly conserved X- and Y- regions of PLC catalytic core domain, as well as a presumptive EF-hand like calcium binding motif. Experiments show that ScPlc1p displays calcium dependent catalytic properties with high similarity to those of the mammalian PLCs, and plays multiple roles in modulating the membrane/protein interactions in filamentation control. CaPlc1p encoded by CAPLC1 from the closely related yeast Candida albicans, an orthologue of S. cerevisiae Plc1p, is also included in this group. Like ScPlc1p, CaPlc1p has a conserved presumptive catalytic domain, shows PLC activity when expressed in E. coli, and is involved in multiple cellular processes. There are two other gene copies of CAPLC1 in C. albicans, CAPLC2 (also named PIPLC) and CAPLC3. Experiments show CaPlc1p is the only enzyme in C. albicans which functions as PLC.
The biological functions of CAPLC2 and CAPLC3 gene products must be clearly different from CaPlc1p, but their exact roles remain unclear. Moreover, CAPLC2 and CAPLC3 gene products are more similar to extracellular bacterial PI-PLC than to the eukaryotic PI-PLC, and they are not included in this subfamily. 142
33755 320038 cd16208 EFh_PI-PLCbeta1 EF-hand motif found in phosphoinositide phospholipase C beta 1 (PI-PLC-beta1). PI-PLC-beta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-1, or PLC-154, or phospholipase C-I (PLC-I), or phospholipase C-beta-1 (PLC-beta1), is expressed at highest levels in specific regions of the brain, as well as in the cardiovascular system. It has two splice variants, PI-PLC-beta1a and PI-PLC-beta1b, both of which are present within the nucleus. Nuclear PI-PLC-beta1 is a key molecule for nuclear inositide signaling, where it plays a role in cell cycle progression, proliferation and differentiation. It also contributes to generate cell-specific Ca2+ signals evoked by G protein-coupled receptor stimulation. PI-PLC-beta1 acts as an effector and a GTPase activating protein (GAP) specifically activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It regulates neuronal activity in the cerebral cortex and hippocampus, and has been implicated for participations in diverse critical functions related to forebrain diseases such as schizophrenia. It may play an important role in maintenance of the status epilepticus, and in osteosarcoma-related signal transduction pathways. PI-PLC-beta1 also functions as a regulator of erythropoiesis in kinamycin F, a potent inducer of gamma-globin production in K562 cells. The G protein activation and the degradation of PI-PLC-beta1 can be regulated by the interaction of alpha-synuclein. As a result, it may reduce cell damage under oxidative stress. Moreover, PI-PLC-beta1 works as a new intermediate in the HIV-1 gp120-triggered phosphatidylcholine-specific phospholipase C (PC-PLC)-driven signal transduction pathway leading to cytoplasmic CCL2 secretion in macrophages. 
PI-PLC-beta1 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 151
33756 320039 cd16209 EFh_PI-PLCbeta2 EF-hand motif found in phosphoinositide phospholipase C beta 2 (PI-PLC-beta2). PI-PLC-beta2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-2, or phospholipase C-beta-2 (PLC-beta2), is expressed at highest levels in cells of hematopoietic origin. It is activated by the heterotrimeric G protein alpha q subunits (G alpha(q)) through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. PI-PLC-beta2 has two cellular binding partners, alpha- and gamma-synuclein. The binding of either alpha- or gamma-synuclein inhibits PI-PLC-beta2 activity by preventing the binding of its activator G alpha(q). However, the binding of gamma-synuclein to PI-PLC-beta2 does not affect its binding to G beta(gamma) subunits or small G proteins, but enhances these signals. Meanwhile, gamma-synuclein may protect PI-PLC-beta2 from protease degradation and contribute to its over-expression in breast cancer. In leukocytes, the G beta(gamma)-mediated activation of PI-PLC-beta2 can be promoted by a scaffolding protein WDR26, which is also required for the translocation of PI-PLC-beta2 from the cytosol to the membrane in polarized leukocytes. PI-PLC-beta2 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 151
33757 320040 cd16210 EFh_PI-PLCbeta3 EF-hand motif found in phosphoinositide phospholipase C beta 3 (PI-PLC-beta3). PI-PLC-beta3, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-3, or phospholipase C-beta-3 (PLC-beta3), is widely expressed at highest levels in brain, liver, and parotid gland. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. PI-PLC-beta3 associates with CXC chemokine receptor 2 (CXCR2) and Na+/H+ exchanger regulatory factor-1 (NHERF1) to form macromolecular complexes at the plasma membrane of pancreatic cancer cells, which functionally couple chemokine signaling to PI-PLC-beta3-mediated signaling cascade. Moreover, PI-PLC-beta3 directly interacts with the M3 muscarinic receptor (M3R), a prototypical G alpha-q-coupled receptor that promotes PI-PLC-beta3 localization to the plasma membrane. This binding can alter G alpha-q-dependent PLC activation. Furthermore, PI-PLC-beta3 inhibits the proliferation of hematopoietic stem cells (HSCs) and myeloid cells through the interaction of SH2-domain-containing protein phosphatase 1 (SHP-1) and signal transducer and activator of transcription 5 (Stat5), and the augment of the dephosphorylating activity of SHP-1 toward Stat5, leading to the inactivation of Stat5. It is also involved in atopic dermatitis (AD) pathogenesis via regulating the expression of periostin in fibroblasts and thymic stromal lymphopoietin (TSLP) in keratinocytes. In addition, PI-PLC-beta3 mediates the thrombin-induced Ca2+ response in glial cells. PI-PLC-beta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. 
The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 151
33758 320041 cd16211 EFh_PI-PLCbeta4 EF-hand motif found in phosphoinositide phospholipase C beta 4 (PI-PLC-beta4). PI-PLC-beta4, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-4, or phospholipase C-beta-4 (PLC-beta4), is expressed in high concentrations in cerebellar Purkinje and granule cells, the medial geniculate body, and the lateral geniculate nucleus. It may play a critical role in linking anxiety behaviors and theta rhythm heterogeneity. PI-PLC-beta4 is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It contributes to generate cell-specific Ca2+ signals evoked by G protein-coupled receptor stimulation. PI-PLC-beta4 functions as a downstream signaling molecule of type 1 metabotropic glutamate receptors (mGluR1s). The thalamic mGluR1-PI-PLC-beta4 cascade is essential for formalin-induced inflammatory pain by regulating the response of ventral posterolateral thalamic nucleus (VPL) neurons. Moreover, PI-PLC-beta4 is essential for long-term depression (LTD) in the rostral cerebellum, which may be required for the acquisition of the conditioned eyeblink response. Besides, PI-PLC-beta4 may play an important role in maintenance of the status epilepticus. Mutations of PI-PLC-beta4 have been identified as the major cause of autosomal dominant auriculocondylar syndrome (ACS). PI-PLC-beta4 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 153
33759 320042 cd16212 EFh_NorpA_like EF-hand motif found in Drosophila melanogaster No receptor potential A protein (NorpA) and similar proteins. NorpA, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase, is an eye-specific phosphoinositide phospholipase C (PI-PLC) encoded by norpA gene in Drosophila. It is expressed predominantly in photoreceptors and plays an essential role in the phototransduction pathway of Drosophila. A mutation within the norpA gene can render the fly blind without affecting any of the obvious structures of the eye. Like beta-class of vertebrate PI-PLCs, NorpA contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 153
33760 320043 cd16213 EFh_PI-PLC21 EF-hand motif found in phosphoinositide phospholipase PLC21 and similar proteins. The family includes invertebrate homologs of phosphoinositide phospholipase C beta (PI-PLC-beta) named PLC21 from cephalopod retina. It also includes PLC21 encoded by plc-21 gene, which is expressed in the central nervous system of Drosophila. Like beta-class of vertebrate PI-PLCs, PLC21 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. 154
33761 320044 cd16214 EFh_PI-PLCgamma1 EF-hand motif found in phosphoinositide phospholipase C gamma 1 (PI-PLC-gamma1). PI-PLC-gamma1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1, or PLC-148, or phospholipase C-II (PLC-II), or phospholipase C-gamma-1 (PLC-gamma-1), is abundantly expressed in embryonal cortical structures, neurons, oligodendrocytes and astrocytes, and is involved in various cellular events, including proliferation, differentiation, migration, survival, and cell death. It also associates with many diseases, including epilepsy, Huntington's disease (HD), depression, Alzheimer's disease (AD) and bipolar disorder. PI-PLC-gamma1 plays a critical role in cell migration and tumor cell invasiveness and metastasis. It can mediate the cell motility effects of growth factors, including platelet-derived growth factor (PDGF), epidermal growth factor (EGF), insulin-like growth factor (IGF) and hepatocyte growth factor (HGF), as well as adhesion receptors. Moreover, PI-PLC-gamma1 can modulate neurite outgrowth, neuronal cell migration and synaptic plasticity through the Trk receptor. PI-PLC-gamma1 contains an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Besides, PI-PLC-gamma1 has a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region, which are present within this linker. PI-PLC-gamma1 is activated by receptor and non-receptor tyrosine kinases via its two SH2 and a single SH3 domain. 146
33762 320045 cd16215 EFh_PI-PLCgamma2 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2. PI-PLC-gamma2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2, or phospholipase C-IV (PLC-IV), or phospholipase C-gamma-2 (PLC-gamma-2), is highly expressed in cells of hematopoietic origin. It has been implicated in cell motility important to invasion and dissemination of tumor cells. As an important component of the B cell receptor (BCR) signaling pathway, PI-PLC-gamma2 is required for efficient formation of germinal center (GC) and memory B cells. It works as a critical effector stimulating the increase of intracellular Ca2+ and activates various signaling pathways downstream of the BCR. Moreover, PI-PLC-gamma2 has been implicated in Fc receptor-mediated degranulation of mast cells, integrin signaling in platelets, as well as integrin and Fc receptor-mediated neutrophil functions. It also acts as a crucial signaling mediator modifying DC gene expression program to activate DC responses to beta-glucan-containing pathogens. PI-PLC-gamma2 contains an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Besides, PI-PLC-gamma2 has a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region, which are present within this linker. PI-PLC-gamma2 is activated by receptor and non-receptor tyrosine kinases via its two SH2 and a single SH3 domain. Unlike PI-PLC-gamma1, the activation of PI-PLC-gamma2 may require concurrent stimulation of PI 3-kinase. 154
33763 320046 cd16216 EFh_PI-PLCgamma1_like EF-hand motif found in 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1-like proteins. This family corresponds to a small group of uncharacterized 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1-like (PI-PLC-gamma1-like) proteins. Although their biological function remains unclear, they show high sequence similarity with other phosphoinositide phospholipase C gamma proteins. They contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. A second PH domain, which is split by two SH2 (Src homology 2) domains, and one SH3 (Src homology 3) domain, are present within this linker. 150
33764 320047 cd16217 EFh_PI-PLCdelta1 EF-hand motif found in phosphoinositide phospholipase C delta 1 (PI-PLC-delta1). PI-PLC-delta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-1 (PLCD1), or phospholipase C-III (PLC-III), or phospholipase C-delta-1 (PLC-delta-1), is present in high abundance in the brain, heart, lung, skeletal muscle and testis. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. PI-PLC-delta1 is required for maintenance of homeostasis in skin and metabolic tissues. Moreover, it is essential in trophoblasts for placental development. Simultaneous loss of PI-PLC-delta1 may cause placental vascular defects, leading to embryonic lethality. PI-PLC-delta1 can be positively or negatively regulated by several binding partners, including p122/Rho GTPase activating protein (RhoGAP), Gha/Transglutaminase II, RalA, and calmodulin. It is involved in Alzheimer's disease and hypertension. Furthermore, PI-PLC-delta1 regulates cell proliferation and cell-cycle progression from G1- to S-phase by control of cyclin E-CDK2 activity and p27 levels. It can be activated by alpha1-adrenoreceptors (AR) in a calcium-dependent manner and may be important for G protein-coupled receptor (GPCR) responses in vascular smooth muscle (VSM). PI-PLC-delta1 may also be involved in noradrenaline (NA)-induced phosphatidylinositol-4,5-bisphosphate (PIP2) hydrolysis and modulate sustained contraction of mesenteric small arteries. In addition, it inhibits thermogenesis and induces lipid accumulation, and therefore contributes to the development of obesity. PI-PLC-delta1 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain.
The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-delta1 can regulate the binding of PH domain to PIP2 in a Ca2+-dependent manner through its functionally important EF-hand domains. In addition, PI-PLC-delta1 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF hand motifs, as well as a nuclear localization signal within its linker region, both of which may be responsible for translocating PI-PLC-delta1 into and out of the cell nucleus. 139
33765 320048 cd16218 EFh_PI-PLCdelta3 EF-hand motif found in phosphoinositide phospholipase C delta 3 (PI-PLC-delta3). PI-PLC-delta3, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-3 (PLCD3), or phospholipase C-delta-3 (PLC-delta-3), is expressed abundantly in brain, skeletal muscle and heart. PI-PLC-delta3 gene expression is down-regulated by cAMP and calcium. PI-PLC-delta3 acts as an anchor for myosin VI on the plasma membrane, and further modulates Myosin IV expression and microvilli formation in enterocytes. It negatively regulates RhoA expression, inhibits RhoA/Rho kinase signaling, and plays an essential role in normal neuronal migration by promoting neuronal outgrowth in the developing brain. Moreover, PI-PLC-delta3 is essential in trophoblasts for placental development. Simultaneous loss of PI-PLC-delta3 may cause placental vascular defects, leading to embryonic lethality. PI-PLC-delta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. In addition, PI-PLC-delta3 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF hand motifs, which may be responsible for transporting PI-PLC-delta3 from the cell nucleus. 138
33766 320049 cd16219 EFh_PI-PLCdelta4 EF-hand motif found in phosphoinositide phospholipase C delta 4 (PI-PLC-delta4). PI-PLC-delta4, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-4 (PLCD4), or phospholipase C-delta-4 (PLC-delta-4), is expressed in various tissues with the highest levels detected selectively in the brain, skeletal muscle, testis and kidney. It plays a significant role in cell growth, cell proliferation, tumorigenesis, and in an early stage of fertilization. PI-PLC-delta4 may function as a key enzyme in the regulation of PtdIns(4,5)P2 levels and Ca2+ metabolism in nuclei in response to growth factors, and its expression may be partially regulated by an increase in cytoplasmic Ca2+. Moreover, PI-PLC-delta4 binds glutamate receptor-interacting protein1 (GRIP1) in testis and is required for calcium mobilization essential for the zona pellucida-induced acrosome reaction in sperm. Overexpression or dysregulated expression of PLCdelta4 may initiate oncogenesis in certain tissues through upregulating erbB1/2 expression, extracellular signal-regulated kinase (ERK) signaling pathway, and proliferation in MCF-7 cells. PI-PLC-delta4 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unlike PI-PLC-delta 1 and 3, a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus, is not present in PI-PLC-delta4. 140
33767 320050 cd16220 EFh_PI-PLCeta1 EF-hand motif found in phosphoinositide phospholipase C eta 1 (PI-PLC-eta1). PI-PLC-eta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase eta-1, or phospholipase C-eta-1 (PLC-eta-1), or phospholipase C-like protein 3 (PLC-L3), is a neuron-specific PI-PLC that is most abundant in the brain, particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. It is also expressed in the zona incerta and in the spinal cord. PI-PLC-eta1 may perform a fundamental role in the brain. It may also act in synergy with other PLC subtypes. For instance, it is activated via intracellular Ca2+ mobilization and then plays a role in the amplification of GPCR (G-protein-coupled receptor)-mediated PLC-beta signals. In addition, its activity can be stimulated by ionomycin. PI-PLC-eta1 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases. 141
33768 320051 cd16221 EFh_PI-PLCeta2 EF-hand motif found in phosphoinositide phospholipase C eta 2 (PI-PLC-eta2). PI-PLC-eta2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase eta-2, or phosphoinositide phospholipase C-like 4, or phospholipase C-like protein 4 (PLC-L4), or phospholipase C-eta-2 (PLC-eta2), is a neuron-specific PI-PLC that is most abundant in the brain, particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. It is also expressed in the pituitary gland, pineal gland, retina, and lung, as well as in neuroendocrine cells. PI-PLC-eta2 has been implicated in the regulation of neuronal differentiation/maturation. It is required for retinoic acid-stimulated neurite growth. It may also in part function downstream of G-protein-coupled receptors and play an important role in the formation and maintenance of the neuronal network in the postnatal brain. Moreover, PI-PLC-eta2 acts as a Ca2+ sensor that shows a canonical EF-loop directing Ca2+-sensitivity and thus can amplify transient Ca2+ signals. Its activation can be triggered either by intracellular calcium mobilization or by G beta-gamma signaling. PI-PLC-eta2 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases. 141
33769 320052 cd16222 EFh_PRIP1 EF-hand motif found in phospholipase C-related but catalytically inactive protein 1 (PRIP-1). PRIP-1, also termed phospholipase C-deleted in lung carcinoma, or inactive phospholipase C-like protein 1 (PLC-L1), or p130, is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that is predominantly expressed in the brain. It is involved in InsP3-mediated calcium signaling pathway and GABA(A)receptor-mediated signaling pathway. It interacts with the catalytic subunits of protein phosphatase 1 (PP1) and protein phosphatase 2A (PP2A), and functions as a scaffold to regulate the activities and subcellular localizations of both PP1 and PP2A in phospho-dependent cellular signaling. It also promotes the translocation of phosphatases to lipid droplets to trigger the dephosphorylation of hormone-sensitive lipase (HSL) and perilipin A, thus reducing protein kinase A (PKA)-mediated lipolysis. Moreover, PRIP-1 plays an important role in insulin granule exocytosis through the association with GABAA-receptor-associated protein (GABARAP) to form a complex to regulate KIF5B-mediated insulin secretion. It also inhibits regulated exocytosis through direct interactions with syntaxin 1 and synaptosomal-associated protein 25 (SNAP-25) via its C2 domain. Furthermore, PRIP-1 has been implicated in the negative regulation of bone formation. PRIP-1 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-1 does not have PLC enzymatic activity. 143
33770 320053 cd16223 EFh_PRIP2 EF-hand motif found in phospholipase C-related but catalytically inactive protein 2 (PRIP-2). PRIP-2, also termed phospholipase C-L2, or phospholipase C-epsilon-2 (PLC-epsilon-2), or inactive phospholipase C-like protein 2 (PLC-L2), is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that exhibits a relatively ubiquitous expression. It functions as a novel negative regulator of B-cell receptor (BCR) signaling and immune responses. PRIP-2 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-2 does not have PLC enzymatic activity. 144
33771 320022 cd16224 EFh_CREC_RCN2 EF-hand, calcium binding motif, found in reticulocalbin-2 (RCN2). RCN2, also termed calcium-binding protein ERC-55, or E6-binding protein (E6BP), or TCBP-49, is an endoplasmic reticulum resident low-affinity Ca2+-binding protein that has been implicated in immunity, redox homeostasis, cell cycle regulation and coagulation. It is associated with tumorigenesis, in particular with transformation of cells of the cervix induced by human papillomavirus (HPV), through binding to human papillomavirus (HPV) E6 oncogenic protein. It specifically interacts with vitamin D receptor among nuclear receptors. RCN2 contains an N-terminal signal sequence followed by six copies of the EF-hand Ca2+-binding motif, and a C-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide that is required for retention of RCN2 in the endoplasmic reticulum (ER). 268
33772 320023 cd16225 EFh_CREC_cab45 EF-hand, calcium binding motif, found in 45 kDa calcium-binding protein (Cab45). Cab45, also termed stromal cell-derived factor 4 (SDF-4), is a soluble, lumenal Golgi resident low-affinity Ca2+-binding protein that contains six copies of the EF-hand Ca2+-binding motif. It is required for secretory pathway calcium ATPase1 (SPCA1)-dependent Ca2+ import into the trans-Golgi network (TGN) and plays an essential role in Ca2+-dependent secretory cargo sorting at the TGN. 278
33773 320024 cd16226 EFh_CREC_Calumenin_like EF-hand, calcium binding motif, found in calumenin, reticulocalbin-1 (RCN-1), reticulocalbin-3 (RCN-3), and similar proteins. The family corresponds to a group of six EF-hand Ca2+-binding proteins, including calumenin (also known as crocalbin or CBP-50), reticulocalbin-1 (RCN-1), reticulocalbin-3 (RCN-3), and similar proteins. Calumenin is an endo/sarcoplasmic reticulum (ER/SR) resident low-affinity Ca2+-binding protein that contains six EF-hand domains and a C-terminal SR retention signal His-Asp-Glu-Phe (HDEF) tetrapeptide. It functions as a novel regulator of SERCA2, and its expressional changes are tightly coupled with Ca2+-cycling of cardiomyocytes. It is also broadly involved in haemostasis and in the pathophysiology of thrombosis. Moreover, the extracellular calumenin acts as a suppressor of cell migration and tumor metastasis. RCN-1 is an endoplasmic reticulum resident Ca2+-binding protein with a carboxyl-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide signal. It acts as a potential negative regulator of B-RAF activation and can negatively modulate cardiomyocyte hypertrophy by inhibition of the mitogen-activated protein kinase signalling cascade. It also plays a key role in the development of doxorubicin-associated resistance. RCN-3 is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal HDEL tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4. 264
33774 320025 cd16227 EFh_CREC_RCN2_like EF-hand, calcium binding motif, found in reticulocalbin-2 (RCN2) mainly from protostomes. This family corresponds to a group of uncharacterized RCN2-like proteins, which are mainly found in protostomes. Although their biological function remains unclear, they show high sequence similarity with RCN2 (also known as E6BP or TCBP-49), which is an endoplasmic reticulum resident low-affinity Ca2+-binding protein that has been implicated in immunity, redox homeostasis, cell cycle regulation and coagulation. Members in this family contain six copies of the EF-hand Ca2+-binding motif, but may lack a C-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide that is required for retention of RCN2 in the endoplasmic reticulum (ER). 263
33775 320026 cd16228 EFh_CREC_Calumenin EF-hand, calcium binding motif, found in calumenin. Calumenin, also termed crocalbin, or IEF SSP 9302, is an endo/sarcoplasmic reticulum (ER/SR) resident low-affinity Ca2+-binding protein that contains six EF-hand domains and a C-terminal SR retention signal His-Asp-Glu-Phe (HDEF) tetrapeptide. It is highly expressed in various brain regions. Thus it plays an important role in migration and differentiation of neurons, and/or in Ca2+ signaling between glial cells and neurons. Calumenin is involved in Ca2+ homeostasis through interacting with ryanodine receptor RyR2 and SERCA2. It acts as a novel regulator of SERCA2, and its expressional changes are tightly coupled with Ca2+-cycling of cardiomyocytes. Calumenin also forms a Ca2+-dependent complex with thrombospondin-1, which is broadly involved in haemostasis and thrombosis. Moreover, calumenin is a molecular chaperone that endogenously regulates the vitamin K-dependent gamma-carboxylation of several proteins, including blood coagulation factors (such as FII, FVII, FIX, FX, and proteins C, S and Z), cell survival factors (Gas6) and bone metabolism proteins (such as matrix Gla protein or MGP, osteocalcin and periostin), through targeting the gamma-glutamyl carboxylase. It also functions as a charged F508del-cystic fibrosis transmembrane regulator (CFTR) folding modulator, as well as a G551D-CFTR associated protein. Furthermore, the extracellular calumenin acts as a suppressor of cell migration and tumor metastasis. It binds to and stabilizes fibulin-1, and further inactivates extracellular signal-regulated kinases 1 and 2 (ERK1/2) signaling. 263
33776 320027 cd16229 EFh_CREC_RCN1 EF-hand, calcium binding motif, found in reticulocalbin-1 (RCN-1). RCN-1 is an endoplasmic reticulum resident low-affinity Ca2+-binding protein with six EF-hand motifs and a carboxyl-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide signal. It is expressed at the cell surface. RCN-1 acts as a potential negative regulator of B-RAF activation and can negatively modulate cardiomyocyte hypertrophy by inhibition of the mitogen-activated protein kinase signaling cascade. It also plays a key role in the development of doxorubicin-associated resistance. 267
33777 320028 cd16230 EFh_CREC_RCN3 EF-hand, calcium binding motif, found in reticulocalbin-3 (RCN-3). RCN-3, also termed EF-hand calcium-binding protein RLP49, is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal His-Asp-Glu-Leu (HDEL) tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4. 268
33778 320010 cd16231 EFh_SPARC_like EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC) and similar proteins. This family includes secreted protein acidic and rich in cysteine (SPARC), secreted protein, acidic and rich in cysteine-like 1 (SPARCL1), and similar proteins. SPARC is a prototypic collagen-binding matricellular protein that is involved in extracellular matrix (ECM) assembly and fibrosis through binding both fibrillar collagen and basal lamina collagen IV. It regulates the activity of matrix metalloproteinases (MMPs), as well as the growth factor signaling mediated by cell surface receptors including vascular endothelial growth factor (VEGF) receptor, basic fibroblast growth factor (bFGF), and transforming growth factor (TGF) beta1. It also shows survival activity in tumor progression. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. SPARCL1 is the closest family member to SPARC. It shares the three primary domains contained within SPARC with an expanded N-terminal domain. SPARCL1 may function as both a tumor suppressor and as a regulator of angiogenesis. It can bind to collagens and be counter-adhesive to wild-type dermal fibroblasts, but does not influence rates of cell proliferation. Moreover, SPARCL1 can influence central nervous system (CNS) development and synaptic rearrangement. 116
33779 320011 cd16232 EFh_SPARC_TICN EF-hand, extracellular calcium-binding (EC) motif, found in testicans. Testicans are nervous system-expressed proteoglycans that play important roles in the regulation of protease activity, as well as in the determination of age at menarche. Testican-1 (TICN1, also termed protein SPOCK) is a secreted chimeric proteoglycan that is highly expressed in brain and carries both chondroitin and heparan sulfate glycosaminoglycan side chains. It has been implicated in autoimmune disease. It also acts as a regulator of bone morphogenetic protein (BMP) signaling and shows critical functions in the nervous system. Testican-2 (TICN2, also termed protein SPOCK2) is an extracellular heparan sulphate proteoglycan highly expressed in brain. It may play regulatory roles in the development of the central nervous system. It also participates in diverse steps of neurogenesis. TICN1, but not TICN2, inhibits cathepsin L. TICN1 also inhibits attachment and neurite outgrowth in cultures of N2A neuroblastoma cells, while TICN2 is able to inhibit neurite outgrowth from primary cerebellar cells. Testicans contain an N-terminal signal peptide, a testican-specific domain followed by a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, a thyroglobulin-like domain (TY), and a C-terminal region with two putative glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Phe or Tyr in the +Y position of EF-hand 2 in testicans could prevent Ca2+ binding to this site and also cause EF-hand 1 to bind one Ca2+ ion with low affinity. 108
33780 320012 cd16233 EFh_SPARC_FSTL1 EF-hand, extracellular calcium-binding (EC) motif, found in follistatin-related protein 1 (FRP-1). FRP-1, also termed follistatin-like protein 1 (fstl-1), TGF-beta-stimulated clone 36 (TSC-36/Flik), or TGF-beta inducible protein, is a secreted glycoprotein that is overexpressed in certain inflammatory diseases and has been implicated in many autoimmune diseases. FRP-1 functions as an important proinflammatory factor in the pathogenesis of osteoarthritis (OA) by activating the canonical NF-kappaB-mediated inflammatory cytokines, including tumor necrosis factor alpha (TNF-alpha), interleukin-1beta (IL-1beta) and interleukin-6 (IL-6), and enhancing fibroblast like synoviocytes proliferation. It also acts as a critical mediator of collagen-induced arthritis (CIA), juvenile rheumatoid arthritis (JRA), as well as Lyme arthritis observed after Borrelia burgdorferi infection. Meanwhile, it enhances nod-like receptor family, pyrin domain containing 3 (NLRP3) inflammasome-mediated IL-1beta secretion from monocytes and macrophages. Moreover, FRP-1 shows critical functions in the nervous system. It differentially regulates transforming growth factor beta (TGF-beta) and bone morphogenetic protein (BMP) signaling, leading to epithelial injury and fibroblast activation. Furthermore, FRP-1 functions as a cardiokine with cardioprotective properties. It may play a potential role in ischemic stroke through decreasing neuronal apoptosis and improving neurological deficits via disco-interacting protein 2 homolog A (DIP2A)/Akt pathway after middle cerebral artery occlusion (MCAO). Plasma FRP-1 is elevated in Kawasaki disease (KD) and thus may play a possible role in the formation of coronary artery aneurysm (CAA). FRP-1 contains a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, and a von Willebrand factor type C (VWC) domain. The EC domain does not undergo characteristic structural changes upon calcium addition or depletion and therefore is not a functional calcium binding domain. 114
33781 320013 cd16234 EFh_SPARC_SMOC EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein SMOC-1, SMOC-2, and similar proteins. SMOC proteins correspond to a group of matricellular proteins that are involved in direct or indirect modulation of growth factor signaling pathways and play diverse roles in physiological processes involving extensive tissue remodeling, migration, proliferation, and angiogenesis. They may mediate intercellular signaling and cell type-specific differentiation during gonad and reproductive tract development. SMOC-1 is localized in basement membranes. Its mutations have been found to be associated with individuals with Waardenburg Anophthalmia Syndrome. SMOC-2 is ubiquitously expressed and is involved in angiogenesis and the regulation of cell cycle progression. It enhances the angiogenic effect of basic fibroblast growth factor (bFGF) and vascular endothelial growth factor (VEGF). It has also been implicated in generalized vitiligo. SMOC proteins consist of a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain conserved only in SMOC proteins, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs. 104
33782 320014 cd16235 EFh_SPARC_SPARC EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC). SPARC, also termed basement-membrane protein 40 (BM-40), or osteonectin (ON), is a prototypic collagen-binding matricellular protein that is essential for embryo development in invertebrates and highly expressed in bone. It participates in normal tissue remodeling as it regulates the deposition of extracellular matrix, as well as in neoplastic transformation. It is involved in extracellular matrix (ECM) assembly and fibrosis through binding both fibrillar collagen and basal lamina collagen IV. It regulates the activity of matrix metalloproteinases (MMPs), as well as the growth factor signaling mediated by cell surface receptors including vascular endothelial growth factor (VEGF) receptor, basic fibroblast growth factor (bFGF), and transforming growth factor (TGF) beta1. SPARC shows survival activity in tumor progression. It plays a role in the metastatic process to the lung during melanoma progression. It can suppress prostate cancer cell growth and survival. Moreover, SPARC is a bone-associated protein that has a major role in bone development and mineralisation. It is involved in the initiation and progression of vascular calcification and upregulated by adiponectin. Furthermore, SPARC may be one of the molecules that govern the uptake and delivery of proteins from blood to the cerebrospinal fluid (CSF) during brain development. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. Platelet-derived growth factor (PDGF) also interacts with its EC domain, but in a calcium-independent manner, whereas collagen binding is calcium-dependent. 96
33783 320015 cd16236 EFh_SPARC_SPARCL1 EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein, acidic and rich in cysteine-like 1 (SPARCL1). SPARCL1, also termed SPARC-like protein 1, or high endothelial venule protein (Hevin), or MAST 9, or SC-1, or RAGS-1, or QR1, or ECM 2, is a diversely expressed and developmentally regulated extracellular matrix glycoprotein involved in tissue repair and remodeling via interaction with the surrounding extracellular matrix (ECM) proteins. It plays a pivotal role in corneal wound healing. SPARCL1 may function as both a tumor suppressor and as a regulator of angiogenesis. It regulates cell migration/invasion and suppresses metastasis in many cancers, including prostate cancer, colorectal cancer, gastric cancer, and breast cancer. It can bind to collagens and be counter-adhesive to wild-type dermal fibroblasts, but does not influence rates of cell proliferation. Moreover, SPARCL1 contributes to neural development and participates in remodeling events associated with neuronal degeneration following neural injury. It can influence central nervous system (CNS) development and synaptic rearrangement. SPARCL1 is the closest family member to secreted protein acidic and rich in cysteine (SPARC), but does not compensate for the absence of SPARC in the CNS. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. SPARCL1 shares the three primary domains contained within SPARC with an expanded N-terminal domain. 93
33784 320016 cd16237 EFh_SPARC_TICN1 EF-hand, extracellular calcium-binding (EC) motif, found in testican-1 (TICN1). TICN1, also termed protein SPOCK, or SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 1 (Spock1), is a secreted chimeric proteoglycan that is highly expressed in brain and carries both chondroitin and heparan sulfate glycosaminoglycan side chains. It promotes resistance against Pseudomonas aeruginosa-induced keratitis through regulation of matrix metalloproteinase (MMP)-2 expression and activation. It also acts as a potential cancer prognostic marker that promotes the proliferation and metastasis of gallbladder cancer cells by activating the PI3K/Akt pathway. Moreover, TICN1 corresponding gene SPOCK1 is a novel transforming growth factor-beta target gene that regulates lung cancer cell epithelial-mesenchymal transition. It is also up-regulated by chromodomain helicase/adenosine triphosphatase DNA binding protein 1-like (CHD1L), and promotes human hepatocellular carcinoma (HCC) cell invasiveness and metastasis. Furthermore, TICN1 inhibits the lysosomal cysteine protease cathepsin L in intracellular vesicles and in the extracellular milieu. TICN1 contains an N-terminal signal sequence known to direct nascent polypeptides to the extracellular space, a region unique to the testicans, a follistatin (FS)-like domain generally involving five disulfide bridges, an extracellular calcium-binding (EC) domain including a pair of EF hands, and a thyroglobulin type-1 (TY) domain followed by a C-terminal acidic region with high density of negatively charged amino acids. The substitution of a ligating Asp residue by Phe291 in the +Y position of EF-hand 2 in TICN1 could prevent Ca2+ binding to this site and also cause EF-hand 1 to bind one Ca2+ ion with low affinity. 112
33785 320017 cd16238 EFh_SPARC_TICN2 EF-hand, extracellular calcium-binding (EC) motif, found in testican-2 (TICN2). TICN2, also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 2 (Spock2), is an extracellular heparan sulphate proteoglycan expressed in brain, lung, and testis. It inhibits neurite extension from cultured primary cerebellar neurons and may play regulatory roles in the development of the central nervous system. It also participates in diverse steps of neurogenesis. Moreover, TICN2 may contribute to ECM remodeling by regulating function(s) of other testican family members, which possess membrane-type matrix metalloproteinases (MT-MMPs) inhibitory function. Furthermore, TICN2 corresponding gene SPOCK2 acts as a susceptibility gene for bronchopulmonary dysplasia. TICN2 contains an N-terminal signal peptide, a testican-specific domain followed by a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, a thyroglobulin-like domain (TY), and a C-terminal region with two putative glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Tyr292 in the +Y position of EF-hand 2 in TICN2 could prevent Ca2+ binding to this site and also cause EF-hand 1 to bind one Ca2+ ion with low affinity. 112
33786 320018 cd16239 EFh_SPARC_TICN3 EF-hand, extracellular calcium-binding (EC) motif, found in testican-3 (TICN3). TICN3, also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 3 (Spock3), is a brain-specific heparan sulfate proteoglycan that shows a widespread distribution within the extracellular matrix of the brain. It plays an important role in the formation or maintenance of major neuronal structures in the brain. It also functions as a novel regulator to reduce the activity of matrix metalloproteinase (MMP) in adult T-cell leukemia (ATL). It suppresses membrane-type 1 MMP-mediated MMP-2 activation and tumor invasion. Moreover, TICN3 corresponding gene SPOCK3 acts as a risk gene for adult attention-deficit/hyperactivity disorder (ADHD) and personality disorders. TICN3 contains an N-terminal signal peptide, a testican-specific domain followed by the follistatin-like (FS) and extracellular calcium-binding (EC) domains characteristic of the BM-40 family. Towards the C-terminus they contain a thyroglobulin-like domain (TY) and a novel sequence (domain V), which includes two potential glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Tyr295 in the +Y position of EF-hand 2 in testican-3 could prevent Ca2+ binding to this site and also cause EF-hand 1 to bind one Ca2+ ion with low affinity. 113
33787 320019 cd16240 EFh_SPARC_SMOC1 EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein 1 (SMOC-1). SMOC-1, also termed SPARC-related modular calcium-binding protein 1, or smooth muscle-associated protein 1 (SMAP-1), is an Arf6 GTPase-activating protein (GAP) that directly interacts with clathrin and regulates the clathrin-dependent endocytosis of transferrin receptors from the plasma membrane. It is predominantly localized in basement membranes. SMOC-1 acts as a regulator of osteoblast differentiation and is involved in inhibition of transforming growth factor-beta (TGF-beta) signaling through production of nitric oxide. It also plays an essential role in ocular and limb development and functions as a regulator of bone morphogenic protein (BMP) signaling. It interacts with a matricellular protein, tenascin C, in addition to the serum proteins, fibulin-1 and C-reactive protein, but not collagens. Two point mutations in the SMOC1 gene may cause Waardenburg Anophthalmia Syndrome. Moreover, SMOC-1 is involved in direct or indirect modulation of growth factor signaling pathways and plays a role in physiological processes involving extensive tissue remodeling. SMOC-1 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-2, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs. 115
33788 320020 cd16241 EFh_SPARC_SMOC2 EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein 2 (SMOC-2). SMOC-2, also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2 (SMAP-2), is a ubiquitously expressed matricellular protein that enhances the response to angiogenic growth factors and mediates cell adhesion, keratinocyte migration, and metastasis. It is also associated with vitiligo and craniofacial and dental defects. Moreover, SMOC-2 acts as an Arf1 GTPase-activating protein (GAP) that interacts with clathrin heavy chain (CHC) and clathrin assembly protein CALM and functions in the retrograde, early endosome/trans-Golgi network (TGN) pathway in a clathrin- and AP-1-dependent manner. It also contributes to mitogenesis via activation of integrin-linked kinase (ILK). SMOC-2 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-1, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs. 114
33789 320000 cd16242 EFh_DMD_like EF-hand-like motif found in the dystrophins subfamily. This dystrophins subfamily includes dystrophin and its two paralogs, utrophin and DRP-2. Dystrophin is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscle. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin is also involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also termed dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homologue that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with the dystroglycans (DGs) and the sarcoglycan-dystroglycans, sarcoglycans and sarcospan (SG-SSPN) subcomplex. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. DRP-2 is mainly expressed in the vertebrate central nervous system (CNS). It is associated with brain membrane fractions and highly enriched in the postsynaptic density. DRP-2 plays a role in the organization of central cholinergic synapses. It interacts with dystroglycan and L-Periaxin to form a transmembrane complex, which plays a role in Schwann cell-basal lamina interactions and in the regulation of the terminal stages of myelination. The dystrophins subfamily has been characterized by a compact cluster of domains comprising a WW domain, four EF-hand-like motifs and a ZZ-domain, followed by two syntrophin binding sites (SBSs) and a looser region with two coiled-coils. 163
33790 320001 cd16243 EFh_DYTN EF-hand-like motif found in dystrotelin and similar proteins. Dystrotelin is the vertebrate orthologue of Drosophila DAH, which is involved in the synchronised cellularization of thousands of nuclei in the syncytial early fly embryo (a specialised form of cytokinesis). Dystrotelin is mainly expressed in the developing central nervous system (CNS) and adult nervous and muscular tissues. Heterologously expressed dystrotelin protein localizes spontaneously to the cytoplasmic membrane, and possibly to the endoplasmic reticulum (ER). Dystrotelin is not critical for mammalian development. It may be involved in other forms of cytokinesis. Its N-terminal region contains a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. The C-terminal region is extremely divergent. Unlike other superfamily members, dystrophin or dystrobrevin, the residues directly involved in beta-dystroglycan binding are not conserved in dystrotelin, which makes it unlikely that dystrotelin interacts with this ligand. Moreover, dystrotelin is unable to heterodimerize with members of the dystrophin or dystrobrevin families, or to homodimerize. 163
33791 320002 cd16244 EFh_DTN EF-hand-like motif found in dystrobrevins and similar proteins. Dystrobrevins are part of the dystrophin-glycoprotein complex (DGC). They physically associate with members of the dystrophin family and with the syntrophins through their homologous C-terminal coiled coil motifs. The family includes two paralogs dystrobrevins, alpha- and beta-dystrobrevin, both of which are cytoplasmic components of the dystrophin-associated protein complex that function as scaffold proteins in signal transduction and intracellular transport. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. The dystrobrevins subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrobrevins contain one or two syntrophin binding sites (SBSs). 161
33792 320003 cd16245 EFh_DAH EF-hand-like motif found in Drosophila melanogaster discontinuous actin hexagon (DAH) and similar proteins. DAH, the product of the dah (discontinuous actin hexagon) gene, is a Drosophila homolog of vertebrate dystrotelin. It is tightly membrane-associated and highly phosphorylated in a time-dependent fashion. DAH plays an essential role in the process of cellularization, and is associated with vesicles that convene at the cleavage furrow. The absence of DAH leads to severe disruption of the cleavage furrows around the nuclei, and development stalls. DAH contains a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. 164
33793 320004 cd16246 EFh_DMD EF-hand-like motif found in dystrophin. Dystrophin is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscle. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion and with acute Trypanosoma cruzi infection. 162
33794 320005 cd16247 EFh_UTRO EF-hand-like motif found in utrophin. Utrophin, also termed dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homologue that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with the dystroglycans (DGs) and the sarcoglycan-dystroglycans, sarcoglycans and sarcospan (SG-SSPN) subcomplex. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs) and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs. 162
33795 320006 cd16248 EFh_DRP-2 EF-hand-like motif found in dystrophin-related protein 2 (DRP-2). DRP-2 is a dystrophin homologue mainly expressed in the vertebrate central nervous system (CNS). It is associated with brain membrane fractions and highly enriched in the postsynaptic density. DRP-2 plays a role in the organization of central cholinergic synapses. It interacts with dystroglycan and L-Periaxin to form a transmembrane complex, which plays a role in Schwann cell-basal lamina interactions and in the regulation of the terminal stages of myelination. Like dystrophin, DRP-2 has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises only two spectrin repeats (SRs) and a WW domain. 162
33796 320007 cd16249 EFh_DTNA EF-hand-like motif found in alpha-dystrobrevin. Alpha-dystrobrevin, also termed dystrobrevin alpha (DTN-A), or dystrophin-related protein 3 (DRP-3), is the mammalian ortholog of the Torpedo 87 kDa postsynaptic protein that tightly associates with dystrophin. It is a cytoplasmic protein expressed predominantly in skeletal muscle, heart, lung, and brain. Alpha-dystrobrevin has been implicated in the regulation of acetylcholine receptor (AChR) aggregate density and patterning. It is also essential in the pathogenesis of dystrophin-dependent muscular dystrophies. It plays a critical role in the full functionality of dystrophin through increasing dystrophin's binding to the dystrophin-glycoprotein complex (DGC), and provides protection during cardiac stress. Alpha-dystrobrevin binds to the intermediate filament proteins syncoilin and beta-synemin, thereby linking the dystrophin-associated protein complex (DAPC) to the intermediate filament network. Moreover, alpha-dystrobrevin is involved in cell signaling via interaction with other proteins such as syntrophin, a modular adaptor protein that coordinates the assembly of the signaling proteins nitric oxide synthase, stress-activated protein kinase-3, and Grb2 to the DAPC. Furthermore, alpha-dystrobrevin plays an important role in muscle function, as well as in nuclear morphology maintenance through specific interaction with the nuclear lamina component lamin B1. In addition, alpha-dystrobrevin is required in dystrophin-associated protein scaffolding in brain. Absence of glial alpha-dystrobrevin causes abnormalities of the blood-brain barrier and progressive brain edema. Alpha-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, alpha-dystrobrevin contains two syntrophin binding sites (SBSs). 161
33797 320008 cd16250 EFh_DTNB EF-hand-like motif found in beta-dystrobrevin. Beta-dystrobrevin, also termed dystrobrevin beta (DTN-B), is a dystrophin-related protein that is restricted to non-muscle tissues and is abundantly expressed in brain, lung, kidney, and liver. It may be involved in regulating chromatin dynamics, possibly playing a role in neuronal differentiation, through the interactions with the high mobility group HMG20 proteins iBRAF/HMG20a and BRAF35/HMG20b. It also binds to and represses the promoter of synapsin I, a neuronal differentiation gene. Moreover, beta-dystrobrevin functions as a kinesin-binding receptor involved in brain development via the association with the extracellular matrix components pancortins. Furthermore, beta-dystrobrevin binds directly to dystrophin and is a cytoplasmic component of the dystrophin-associated glycoprotein complex, a multimeric protein complex that links the extracellular matrix to the cortical actin cytoskeleton and acts as a scaffold for signaling proteins such as protein kinase A. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. Beta-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, beta-dystrobrevin contains two syntrophin binding sites (SBSs). 161
33798 319994 cd16251 EFh_parvalbumin_like EF-hand, calcium binding motif, found in parvalbumin-like EF-hand family. The family includes alpha- and beta-parvalbumins, and a group of uncharacterized calglandulin-like proteins. Parvalbumins are small, acidic, cytosolic EF-hand-containing Ca2+-buffer and Ca2+ transporter/shuttle proteins belonging to the EF-hand superfamily. They are expressed by vertebrates in fast-twitch muscle cells, specific neurons of the central and peripheral nervous system, sensory cells of the mammalian auditory organ (Corti's cell), and some other cells, and characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Thus, they may play an additional role in Mg2+ handling. Moreover, parvalbumins represent one of the major animal allergens. In metal-bound states, parvalbumins possess a rigid and stable tertiary structure and display strong allergenicity. In contrast, the metal-free parvalbumins are intrinsically disordered, and the loss of metal ions results in a conformational change that decreases their IgE binding capacity. Furthermore, parvalbumins have been widely used as a neuronal marker for a variety of functional brain systems. They also function as a Ca2+ shuttle transporting Ca2+ from troponin-C (TnC) to the sarcoplasmic reticulum (SR) Ca2+ pump during muscle relaxation. Thus they may facilitate myocardial relaxation and play important roles in cardiac diastolic dysfunction. Parvalbumins consist of alpha- and beta- sublineages, which can be distinguished on the basis of isoelectric point (pI > 5 for alpha; pI 101
33799 319995 cd16252 EFh_calglandulin_like EF-hand, calcium binding motif, found in uncharacterized calglandulin-like proteins. The family corresponds to a group of uncharacterized calglandulin-like proteins. Although their biological functions remain unclear, they show high sequence similarity with human calglandulin-like protein GAGLP, which is an ortholog of calglandulin from the venom glands of Bothrops insularis snake. Both GAGLP and calglandulin are putative Ca2+-binding proteins with four EF-hand motifs. However, members in this family contain only three EF-hand motifs. In this respect, they may belong to the parvalbumin-like EF-hand family, which is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix). 106
33800 319996 cd16253 EFh_parvalbumins EF-hand, calcium binding motif, found in parvalbumins. Parvalbumins are small, acidic, cytosolic EF-hand-containing Ca2+-buffer and Ca2+ transporter/shuttle proteins belonging to the EF-hand superfamily. They are expressed by vertebrates in fast-twitch muscle cells, specific neurons of the central and peripheral nervous system, sensory cells of the mammalian auditory organ (Corti's cell), and some other cells, and characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Thus, they may play an additional role in Mg2+ handling. Moreover, parvalbumins represent one of the major animal allergens. In metal-bound states, parvalbumins possess a rigid and stable tertiary structure and display strong allergenicity. In contrast, the metal-free parvalbumins are intrinsically disordered, and the loss of metal ions results in a conformational change that decreases their IgE binding capacity. Furthermore, parvalbumins have been widely used as a neuronal marker for a variety of functional brain systems. They also function as a Ca2+ shuttle transporting Ca2+ from troponin-C (TnC) to the sarcoplasmic reticulum (SR) Ca2+ pump during muscle relaxation. Thus they may facilitate myocardial relaxation and play important roles in cardiac diastolic dysfunction. Parvalbumins consist of alpha- and beta- sublineages, which can be distinguished on the basis of isoelectric point (pI > 5 for alpha; pI 101
33801 319997 cd16254 EFh_parvalbumin_alpha EF-hand, calcium binding motif, found in alpha-parvalbumin. Alpha-parvalbumin is a cytosolic Ca2+/Mg2+-binding protein expressed mainly in fast-twitch skeletal myofibrils, where it may act as a soluble relaxing factor facilitating the Ca2+-mediated relaxation phase. It is also expressed in rapidly firing neurons, particularly GABA-ergic neurons, and thus may confer protection against Ca2+ toxicity. The major role of alpha-parvalbumin is metal buffering and transport of Ca2+. It binds different metal cations, and exhibits very high affinity for Ca2+ and physiologically significant affinity for Mg2+. Alpha-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Both metal ion-binding sites in alpha-parvalbumin are high-affinity sites. Additionally, in contrast to beta-parvalbumin, alpha-parvalbumin is less acidic and has an additional residue in the C-terminal helix. 101
33802 319998 cd16255 EFh_parvalbumin_beta EF-hand, calcium binding motif, found in beta-parvalbumin. Beta-parvalbumin, also termed Oncomodulin-1 (OM), is a small calcium-binding protein that is expressed in hepatomas, as well as in the blastocyst and the cytotrophoblasts of the placenta. It is also found to be expressed in the cochlear outer hair cells of the organ of Corti and frequently expressed in neoplasms. Mammalian beta-parvalbumin is secreted by activated macrophages and neutrophils. It may function as a tissue-specific Ca2+-dependent regulatory protein, and may also serve as a specialized cytosolic Ca2+ buffer. Beta-parvalbumin acts as a potent growth-promoting signal between the innate immune system and neurons in vivo. It has high and specific affinity for its receptor on retinal ganglion cells (RGC) and functions as the principal mediator of optic nerve regeneration. It exerts its effects in a cyclic adenosine monophosphate (cAMP)-dependent manner and can further elevate intracellular cAMP levels. Moreover, beta-parvalbumin is associated with efferent function and outer hair cell electromotility, and can identify different hair cell types in the mammalian inner ear. Beta-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. The EF site displays high affinity for Ca2+/Mg2+, and the CD site is a low-affinity Ca2+-specific site. In addition, beta-parvalbumin is distinguished from other parvalbumins by its unusually low isoelectric point (pI = 3.1) and sequence eccentricities (e.g., Y57-L58-D59 instead of F57-I58-E59). 101
33803 293929 cd16256 LumP lumazine protein. Lumazine protein (LumP) is involved in the bioluminescence of certain marine bacteria. It serves as an optical transponder in bioluminescence emission. The intense fluorescence of LumP is caused by non-covalently bound 6,7-dimethyl-8-ribityllumazine. Though its amino acid sequence is very similar to riboflavin synthase, it functions as a monomer, unlike the riboflavin synthases from eubacteria, yeasts and plants, which act as trimers. 186
33804 293914 cd16257 EFG_III-like Domain III of Elongation factor G (EF-G) and related proteins. Bacterial Elongation factor G (EF-G) and related proteins play a role in translation and share a similar domain architecture. Elongation factor EFG participates in the elongation phase during protein biosynthesis on the ribosome by stimulating translocation. Its functional cycles depend on GTP binding and its hydrolysis. Domain III is involved in the activation of GTP hydrolysis. This domain III, which is different from domain III in EF-TU and related elongation factors, is found in several translation factors, like bacterial release factors RF3, elongation factor 4, elongation factor 2, GTP-binding protein BipA and tetracycline resistance protein Tet. 71
33805 293915 cd16258 Tet_III Domain III of Tetracycline resistance protein Tet. Tetracycline resistance proteins, including TetM and TetO, catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. 71
33806 293916 cd16259 RF3_III Domain III of bacterial Release Factor 3 (RF3). The class II RF3 is a member of one of two release factor (RF) classes required for the termination of protein synthesis by the ribosome. RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. Sequence comparison of class II release factors with elongation factors shows that prokaryotic RF3 is more similar to EF-G whereas eukaryotic eRF3 is more similar to eEF1A, implying that their precise function may differ. 70
33807 293917 cd16260 EF4_III Domain III of Elongation Factor 4 (EF4). Elongation factor 4 (EF4 or LepA) is a highly conserved guanosine triphosphatase found in bacteria and eukaryotic mitochondria and chloroplasts. EF4 functions as a translation factor, which promotes back-translocation of tRNAs on posttranslocational ribosome complexes and competes with elongation factor G for interaction with pretranslocational ribosomes, inhibiting the elongation phase of protein synthesis. 76
33808 293918 cd16261 EF2_snRNP_III Domain III of Elongation Factor 2 (EF2). This model represents domain III of Elongation factor 2 (EF2) found in eukaryotes and archaea, and the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. 72
33809 293919 cd16262 EFG_III Domain III of Elongation Factor G (EFG). This model represents domain III of bacterial Elongation factor G (EF-G), and mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2), which play an important role during peptide synthesis and tRNA site changes. In bacteria, this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. mtEFG1 and mtEFG2 show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects, and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants of the yeast homolog of mtEFG2, MEF2. 76
33810 293920 cd16263 BipA_III Domain III of GTP-binding protein BipA (TypA). BipA (also called TypA) is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios. It is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. 79
33811 293921 cd16264 snRNP_III Domain III of the spliceosomal 116kD U5 small nuclear ribonucleoprotein (snRNP) component. Domain III of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p is homologous to domain III of the eukaryotic translational elongation factor EF-2. U5-116 kD is a GTPase component of the spliceosome complex which functions in the processing of precursor mRNAs to produce mature mRNAs. 72
33812 293910 cd16265 Translation_Factor_II Proteins related to domain II of EF-Tu and related translation factors. Elongation factor Tu consists of three structural domains; this family represents single domain proteins that are related to the second domain of EF-Tu. Domain II of EF-Tu adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is also found in other proteins such as elongation factor G and translation initiation factor IF-2. 80
33813 293911 cd16266 IF2_aeIF5B_IV Domain IV of prokaryotic Initiation Factor 2 and archaeal and eukaryotic Initiation Factor 5. This family represents the domain IV of prokaryotic Initiation Factor 2 (IF2) and its archaeal and eukaryotic homologs IF5B. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of the 60S ribosomal subunit. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains in EF1A, eEF1A and aeIF2gamma. 87
33814 293912 cd16267 HBS1-like_II Domain II of Hbs1-like proteins. S. cerevisiae Hbs1 is closely related to the eukaryotic class II release factor (eRF3). Hbs1, together with Dom34 (pelota), plays an important role in termination and recycling, but in contrast to eRF3/eRF1, Hbs1, together with Dom34 (pelota), functions on mRNA-bound ribosomes in a codon-independent manner and promotes subunit splitting on completely empty ribosomes. 84
33815 293913 cd16268 EF2_II Domain II of Elongation Factor 2. This subfamily represents domain II of elongation factor 2 (EF-2) found in eukaryotes and archaea. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. 96
33816 293879 cd16269 GBP_C Guanylate-binding protein, C-terminal domain. Guanylate-binding protein (GBP), C-terminal domain. Guanylate-binding proteins (GBPs) are synthesized after activation of the cell by interferons. The biochemical properties of GBPs are clearly different from those of Ras-like and heterotrimeric GTP-binding proteins. They bind guanine nucleotides with low affinity (micromolar range), are stable in their absence, and have a high turnover GTPase. In addition to binding GDP/GTP, they have the unique ability to bind GMP with equal affinity and hydrolyze GTP not only to GDP, but also to GMP. This C-terminal domain has been shown to mediate inhibition of endothelial cell proliferation by inflammatory cytokines. 291
33817 293878 cd16270 Apc5_N N-terminal domain of the anaphase-promoting complex subunit Apc5 (or Anapc5). The N-terminal domain of Apc5 interacts with subunits Apc4, Apc15, and CDC23. Apc5 is a subunit of the eukaryotic anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Although Apc5 does not contain a classical RNA binding domain, it binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. 143
33818 293830 cd16272 RNaseZ_MBL-fold Ribonuclease Z; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short (ELAC1); the former may have resulted from a duplication of the shorter enzyme. Only the short form exists in bacteria. This subgroup includes the C-terminus of human ELAC2 and Escherichia coli zinc phosphodiesterase (ZiPD, also known as ecoZ, tRNase Z, or RNase BN), a 3' tRNA-processing endonuclease encoded by the elaC gene. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 180
33819 293831 cd16273 SNM1A-1C-like_MBL-fold SNM1A, artemis/SNM1C, yeast Pso2p, and related proteins; MBL-fold metallo-hydrolase domain. Includes human SNM1A (SNM1 homolog A, also known as DNA cross-link repair 1A protein) and Saccharomyces cerevisiae Pso2 protein (PSOralen derivative sensitive 2, also known as SNM1, sensitive to nitrogen mustard 1); both proteins are 5'-exonucleases and function in interstrand cross-link (ICL) repair. Also includes the nuclease artemis (also known as SNM1C, SNM1 homolog C, SNM1-like protein, and DNA cross-link repair 1C protein) which plays a role in V(D)J recombination/DNA repair. Purified artemis protein possesses single-strand-specific 5' to 3' exonuclease activity. Upon complex formation with, and phosphorylation by, DNA-dependent protein kinase, artemis gains endonucleolytic activity on hairpins and 5' and 3' overhangs. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 160
33820 293832 cd16274 PQQB-like_MBL-fold Coenzyme pyrroloquinoline quinone (PQQ) synthesis protein B and related proteins; MBL-fold metallo hydrolase domain. PQQB is essential for the synthesis of the cofactor pyrroloquinoline quinone (PQQ) in Klebsiella pneumoniae. PqqB is not directly involved in the PQQ biosynthesis but may serve as a carrier for PQQ when PQQ is released from PqqC. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 220
33821 293833 cd16275 BaeB-like_MBL-fold Bacillus amyloliquefaciens BaeB and related proteins; MBL-fold metallo hydrolase domain. Bacillus amyloliquefaciens BaeB may play a role in the synthesis of the antibiotic polyketide bacillaene. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 174
33822 293834 cd16276 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 188
33823 293835 cd16277 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 222
33824 293836 cd16278 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 185
33825 293837 cd16279 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. Some members of this subgroup are named as octanoyltransferase (also known as lipoate-protein ligase B). 193
33826 293838 cd16280 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 251
33827 293839 cd16281 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 252
33828 293840 cd16282 metallo-hydrolase-like_MBL-fold uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 209
33829 293841 cd16283 RomA-like_MBL-fold Enterobacter cloacae RomA and related proteins; MBL-fold metallo hydrolase domain. Derepression of the romA-ramA locus results in a multidrug-resistance phenotype. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. 181
33830 293842 cd16284 UlaG-like_MBL-fold UlaG, a putative L-ascorbate-6-P lactonase, and related proteins; MBL-fold metallo hydrolase domain. UlaG is essential for L-ascorbate utilization under anaerobic conditions; it is a putative L-ascorbate-6-P lactonase thought to catalyze the hydrolysis of L-ascorbate-6-phosphate to 3-keto-L-gulonate-6-phosphate. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 178
33831 293843 cd16285 MBL-B1 metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. Includes chromosomally-encoded MBLs such as Bacillus cereus BcII, Bacteroides fragilis CcrA, and Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and acquired MBLs including IMP-1, VIM-1, VIM-2, GIM-1, NDM-1 and FIM-1. 210
33832 293844 cd16286 SPM-1-like_MBL-B1-B2-like Pseudomonas aeruginosa SPM-1 and related metallo-beta-lactamases, subclasses B1 and B2 like; MBL-fold metallo-hydrolase domain. SPM-1 was first identified in a Pseudomonas aeruginosa strain from a paediatric leukaemia patient and is a major clinical problem. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. SPM-1 appears to be a hybrid B1/B2 MBL. 236
33833 293845 cd16287 CphS_ImiS-like_MBL-B2 metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. Includes Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and Serratia fonticola Sfh-I. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site, binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis. 226
33834 293846 cd16288 BJP-1_FEZ-1-like_MBL-B3 BJP-1, FEZ-1, GOB-1, Mbl1b and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. This subgroup of B3 subclass MBLs includes Bradyrhizobium diazoefficiens BJP-1, Fluoribacter gormanii FEZ-1, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) GOB-1, Caulobacter crescentus Mbl1b. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 254
33835 293847 cd16289 L1_POM-1-like_MBL-B3 Stenotrophomonas maltophilia L1, Pseudomonas otitidis POM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of L1- and Pom-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 239
33836 293848 cd16290 AIM-1_SMB-1-like_MBL-B3 AIM-1, SMB-1, EVM-1, THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. This subgroup of B3 subclass MBLs includes Pseudomonas aeruginosa AIM-1, Serratia marcescens SMB-1, Erythrobacter vulgaris EVM-1, and Janthinobacterium lividum THIN-B. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of AIM-1-, SMB-1-, EVM-1-, and THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 256
33837 293849 cd16291 INTS11-like_MBL-fold Integrator complex subunit 11, and related proteins; MBL-fold metallo-hydrolase domain. Integrator is a metazoan-specific multisubunit, multifunctional protein complex composed of 14 subunits named Int1-Int14 (Integrator subunits). This subgroup includes Int11 (also known as cleavage and polyadenylation-specific factor (CPSF) 3-like protein, and protein related to CPSF subunits of 68 kDa (RC-68)). Integrator complex has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 199
33838 293850 cd16292 CPSF3-like_MBL-fold cleavage and polyadenylation specificity factor (CPSF) subunit 3 and related proteins; MBL-fold metallo-hydrolase domain. CPSF3 (also known as cleavage and polyadenylation specificity factor 73 kDa subunit/CPSF-73) functions as a 3' endonuclease in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and in 3' end processing of metazoan histone pre-mRNAs. This subgroup also contains the yeast homolog of CPSF-73, Ysh1/Brr5 which has roles in mRNA and snoRNA synthesis. In addition to this MBL-fold metallo-hydrolase domain, members of this subgroup contain a beta-CASP (named for metallo-beta-lactamase, CPSF, Artemis, Snm1, Pso2) domain, and a RMMBL domain (RNA-metabolizing metallo-beta-lactamase). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 194
33839 293851 cd16293 CPSF2-like_MBL-fold cleavage and polyadenylation specificity factor (CPSF) subunit 2 and related proteins; MBL-fold metallo-hydrolase domain. CPSF2, also known as cleavage and polyadenylation specificity factor 100 kDa subunit (CPSF-100), is a component of the CPSF complex, which plays a role in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and during processing of metazoan histone pre-mRNAs. This subgroup includes Ydh1p, the yeast homolog of CPSF2. In addition to this MBL-fold metallo-hydrolase domain, members of this subgroup contain a beta-CASP (named for metallo-beta-lactamase, CPSF, Artemis, Snm1, Pso2) domain. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 199
33840 293852 cd16294 Int9-like_MBL-fold integrator subunit 9, and related proteins; MBL-fold metallo-hydrolase domain. Integrator is a metazoan-specific multisubunit, multifunctional protein complex composed of 14 subunits named Int1-Int14 (Integrator subunits). This subgroup includes Int9, also known as protein related to CPSF subunits of 74 kDa (RC-74). Integrator complex has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 166
33841 293853 cd16295 TTHA0252-CPSF-like_MBL-fold Thermus thermophilus TTHA0252 and related cleavage and polyadenylation specificity factors; MBL-fold metallo-hydrolase domain. Includes the archaeal cleavage and polyadenylation specificity factors (CPSFs) such as Methanothermobacter thermautotrophicus MTH1203, and Pyrococcus horikoshii PH1404. In addition to the MBL-fold metallo-hydrolase nuclease and the beta-CASP domains, members of this subgroup contain two contiguous KH domains. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 197
33842 293854 cd16296 RNaseZ_ELAC2-N-term-like_MBL-fold Ribonuclease Z, N-terminus of human ELAC2 and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme. This eukaryotic subgroup includes the N-terminus of human ELAC2 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 175
33843 293855 cd16297 artemis-SNM1C-like_MBL-fold artemis-SNM1C and related proteins; MBL-fold metallo-hydrolase domain. Includes the nuclease artemis (also known as SNM1C, SNM1 homolog C, SNM1-like protein and DNA cross-link repair 1C protein) which plays a role in V(D)J recombination/DNA repair. Purified artemis protein possesses single-strand-specific 5' to 3' exonuclease activity. Upon complex formation with, and phosphorylation by, DNA-dependent protein kinase, artemis gains endonucleolytic activity on hairpins and 5' and 3' overhangs. Inactivation of Artemis causes severe combined immunodeficiency (SCID). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 171
33844 293856 cd16298 SNM1A-like_MBL-fold 5'-exonucleases human SNM1A and related proteins; MBL-fold metallo-hydrolase domain. Includes human SNM1A (SNM1 homolog A, also known as DNA cross-link repair 1A protein) which is a 5'-exonuclease and functions in interstrand cross-links (ICL) repair. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 157
33845 293857 cd16299 IND_BlaB-like_MBL-B1 IND1, IND2, BlaB-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded metallo-beta-lactamases Chryseobacterium indologenes IND-1, IND-2, and IND-7, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB, Chryseobacterium gleum CGB-1, and Empedobacter brevis EBR-1. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 212
33846 293858 cd16300 NDM_FIM-like_MBL-B1 NDM-1, FIM-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the ISCR-mediated MBLs NDM-1 (New Delhi metallo-beta-lactamase) and FIM-1 (Florence imipenemase). MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 214
33847 293859 cd16301 IMP_DIM-like_MBL-B1 IMP-1, DIM-1, GIM-1, SIM-1, TMB-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the acquired MBLs IMP-1 (a beta-lactamase that is active on imipenem), DIM-1 (Dutch imipenemase), GIM-1 (German imipenemase), KHM-1 (Kyorin Health Science MBL 1), SIM-1 (Seoul imipenemase), and TMB-1 (Tripoli metallo-beta-lactamase). IMP-1, DIM-1, GIM-1, SIM-1, and TMB-1 are Class 1 integron-mediated MBLs; KHM-1 is plasmid-mediated. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of acquired MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 215
33848 293860 cd16302 CcrA-like_MBL-B1 Bacteroides fragilis CcrA and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 212
33849 293861 cd16303 VIM_type_MBL-B1 VIM-type metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. VIM (Verona integron-encoded metallo-beta-lactamase)-type MBLs are integron-associated and are widely distributed acquired MBLs. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of VIM-type MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 218
33850 293862 cd16304 BcII-like_MBL-B1 Bacillus cereus Beta-lactamase 2 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Bacillus cereus Beta-lactamase 2, also called BcII. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. BcII is a chromosome-encoded B1 MBL. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 212
33851 293863 cd16305 Sfh-1-like_MBL-B2 Serratia fonticola Sfh-I and related metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site, binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis. 226
33852 293864 cd16306 CphA_ImiS-like_MBL-B2 Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and related metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site, binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis. 222
33853 293865 cd16307 FEZ-1-like_MBL-B3 Fluoribacter gormanii FEZ-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of FEZ-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 255
33854 293866 cd16308 GOB1-like_MBL-B3 Elizabethkingia meningoseptica GOB-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of GOB-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 254
33855 293867 cd16309 BJP-1-like_MBL-B3 Bradyrhizobium diazoefficiens BJP-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of BJP-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 252
33856 293868 cd16310 Mbl1b-like_MBL-B3 Caulobacter crescentus Mbl1b and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of Mbl1b-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 252
33857 293869 cd16311 THIN-B2-like_MBL-B3 Janthinobacterium lividum THIN-B2 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of THIN-B2-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 257
33858 293870 cd16312 THIN-B-like_MBL-B3 Janthinobacterium lividum THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 258
33859 293871 cd16313 SMB-1-like_MBL-B3 SMB-1, THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of SMB-1- and THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 254
33860 293872 cd16314 AIM-1-like_MBL-B3 Pseudomonas aeruginosa AIM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of AIM-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 255
33861 293873 cd16315 EVM-1-like_MBL-B3 Erythrobacter vulgaris EVM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of EVM-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His). 248
33862 293874 cd16316 BlaB-like_MBL-B1 Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBL Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 214
33863 293875 cd16317 IND_MBL-B1 Chryseobacterium indologenes IND-1, IND-2, IND-7 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBLs Chryseobacterium indologenes IND-1, IND-2, and IND-7 and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 215
33864 293876 cd16318 MUS_TUS_MBL-B1 Myroides odoratimimus MUS-1, MUS-2, TUS-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBLs Myroides odoratimimus MUS-1 and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. 214
33865 293782 cd16319 MraZ protein domain of unknown function (UPF0040) includes MraZ. This family contains proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis. Members of this family contain two tandem copies of the domain; the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site. 53
33866 293783 cd16320 MraZ_N N-terminal subdomain of transcriptional regulator MraZ. This family contains the N-terminal domain of proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis, including mraW, ftsI, murE, murF, ftsW and murG. Members of this family contain two tandem copies of the domain; the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site. 60
33867 293784 cd16321 MraZ_C C-terminal subdomain of transcriptional regulator MraZ. This family contains the C-terminal domain of proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis, including mraW, ftsI, murE, murF, ftsW and murG. Members of this family contain two tandem copies of the domain; the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site. 62
33868 293877 cd16322 TTHA1623-like_MBL-fold uncharacterized Thermus thermophilus TTHA1623 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized Thermus thermophilus TTHA1623 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. This family includes homologs present in a wide range of bacteria and archaea and some eukaryota. Members of the MBL-fold metallo-hydrolase superfamily exhibit a variety of active site metallo-chemistry, TTHA1623 exhibiting a uniquely shaped putative substrate-binding pocket with a glyoxalase II-type metal-coordination mode. 204
33869 319993 cd16323 Syd Syd, a SecY-interacting protein. This family contains the Syd protein that has been implicated in the Sec-dependent transport of polypeptides across the inner membrane in bacteria. Syd has been shown to bind the SecY subunit of membrane-embedded SecYEG heterotrimer (also known as core translocon or SecY complex) which is a conserved protein-conducting channel essential for the biogenesis of most of the secretory and integral membrane proteins. The SecY-binding site of Syd is a conserved concave and electronegative groove that forms interactions with the electropositive loops of the SecY subunit. Syd is also known to verify the proper assembly of the SecY complex in the membrane by interfering with protein translocation only when the channel displays abnormal SecY-SecE associations. Operon analysis has shown that Syd protein may function as immunity protein in bacterial toxin systems. 173
33870 319982 cd16324 LolA_fold-like family containing periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This family contains the periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the N-terminal domain of periplasmic protein RseB, all of which have similar unclosed beta-barrel structures that resemble a baseball glove-like scaffold consisting of an 11-stranded antiparallel sheet. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts outer membrane (OM)-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. It is proposed that the LolA/LolB complex forms a tunnel-like structure, where the hydrophobic insides of LolA and LolB are connected, which enables lipoproteins to transfer from LolA to LolB. RseB exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress. Its structural similarity to LolA and LolB suggests that RseB may act as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response. 162
33871 319983 cd16325 LolA LolA, a periplasmic chaperone. This family contains periplasmic molecular chaperone LolA which binds to outer-membrane specific lipoproteins and transports them from inner membrane to outer membrane (OM) through LolB, a lipoprotein anchored to outer membranes. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts OM-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. Studies have shown that hydrophobic surface patches large enough to accommodate acyl chains of the OM lipoproteins and the structural flexibility of LolA are important factors for its role as a periplasmic chaperone. 166
33872 319984 cd16326 LolB LolB, an outer membrane lipoprotein receptor. This family contains the outer membrane lipoprotein receptor, LolB, which catalyzes the last step of lipoprotein transfer from the inner to the outer membrane. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA transports lipoproteins through the periplasm to LolB, which then localizes them to outer membranes; the protruding loop of LolB has been shown to be essential for the localization of lipoproteins in the anchoring of bacterial triacylated proteins to the outer membrane. 163
33873 319985 cd16327 RseB RseB, a sensor in periplasmic stress. This family contains the periplasmic protein RseB (also known as MucB or Mucb/RseB) which exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress. RseB binds to RseA and inhibits its sequential cleavage, thereby functioning as a negative modulator of this response. The protein is composed of two domains, the larger N-terminal domain resembling an unclosed beta-barrel that is remarkably similar structurally to LolA and LolB, proteins capable of binding the lipid anchor of lipoproteins, suggesting that RseB acts as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response. 166
33874 319992 cd16328 RseA_N N-terminal domain of RseA. This family contains the cytoplasmic (N-terminal) domain of RseA, the transmembrane anti-sigma-E factor. RseA is degraded during sigma-E-dependent transcription caused by bacterial envelope stress such as heat shock. It is an inner membrane protein with an N-terminal cytoplasmic domain that binds sigma-E and blocks its transcriptional activity, and a C-terminal periplasmic domain that binds RseB, an auxiliary negative regulator. Under inducing conditions, RseA is rapidly degraded and sigma-E is released into the cytoplasm, where it can bind core RNAP and induce its regulon. It has been shown that just the N-terminal domain is sufficient to bind and inhibit sigma-E. The C-terminal domain may interact with other proteins that signal periplasmic stress. 65
33875 319986 cd16329 LolA_like proteins similar to periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This family contains uncharacterized proteins similar to the periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB, all of which have similar unclosed beta-barrel structures that resemble a baseball glove-like scaffold consisting of an 11-stranded antiparallel sheet. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts outer membrane (OM)-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. It is proposed that the LolA/LolB complex forms a tunnel-like structure, where the hydrophobic insides of LolA and LolB are connected, which enables lipoproteins to transfer from LolA to LolB. RseB exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress. Its structural similarity to LolA and LolB suggests that RseB may act as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response. 225
33876 319987 cd16330 LolA_VioE Proteins similar to violacein biosynthetic enzyme VioE that shares fold with periplasmic molecular chaperone LolA. This family includes the violacein biosynthetic enzyme VioE which shares a core fold with lipoprotein transporter proteins that include lipoprotein transporter proteins LolA and LolB. VioE is an enzyme with no characterized homologs that plays a key role in the biosynthesis of violacein, a naturally occurring bisindole product with various biological activities, including antitumor activity as well as antibacterial and cytotoxic properties. In Chromobacterium violaceum, VioE catalyzes the third step in violacein biosynthesis from a pair of Trp residues (i.e. mediates a 1,2 shift of an indole ring and oxidative chemistry) to generate prodeoxyviolacein, a precursor to violacein. Structural and mutagenesis studies suggest that VioE acts as a catalytic chaperone, using this fold associated with lipoprotein transporters to catalyze the production of its prodeoxyviolacein product. 175
33877 319991 cd16331 YjgA-like uncharacterized proteins similar to Escherichia coli YjgA. Family of conserved uncharacterized proteins similar to Escherichia coli YjgA, which has been identified as comigrating with the 50S ribosome. 153
33878 381747 cd16332 Prp-like ribosomal-processing cysteine protease Prp and similar proteins. This model represents a family of cysteine proteases that include members found to cleave the N-terminus extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease. Dependence of Staphylococcus aureus on L27 cleavage by Prp makes the enzyme a target for antibiotic development. 102
33879 319989 cd16333 RELM resistin-like molecule (RELM) hormone family. RELMs, secreted proteins with roles including insulin resistance and the activation of inflammatory processes, are also known as found in inflammatory zone (FIZZ), and include four members in mouse (RELM-alpha/FIZZ1/HIMF, RELM-beta/FIZZ2, Resistin/FIZZ3, and RELM-gamma/FIZZ4) and two members in human (resistin and RELM-beta). Little is yet known about the differences and similarities in function of the different isoforms. RELMs are potentially implicated in a wide range of physiological and pathological processes including obesity-associated diabetes, cardiovascular system function, cancer development and metastasis. There are significant differences between human and rodent RELMs with respect to gene and protein structure, differential gene regulation, different tissue distribution profiles, and insulin resistance induction. Resistin appears to convey insulin resistance in rodents, and to instigate inflammatory processes in humans. In the pathophysiology of obesity-associated diabetes, mouse resistin is secreted by adipocytes and increases hepatic gluconeogenesis, thereby promoting insulin resistance; human resistin is secreted by macrophages and may play a role through inflammatory contributions. Elevated levels of human resistin have been reported in various cancers including colorectal, endometrial, and postmenopausal breast cancers, and may initiate the production of further inflammatory cytokines, to promote tumor cell progression; contrary to this, in vitro overexpression of human RELM-beta abolishes invasion, metastasis and angiogenesis of gastric cancer cells. Resistin circulates as hexamers and trimers; structural similarity has been noted between the resistin homotrimer and the proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain. 86
33880 319988 cd16334 LppX-like family includes lipoproteins LppX, LprA, LprF and LprG from Mycobacterium tuberculosis. This family includes the homologous lipoproteins LppX, LprA, LprF and LprG from Mycobacterium tuberculosis (Mtb), all of which share a core fold with lipoprotein transporter proteins LolA and LolB. Mtb contains components such as glycolipids, lipoglycans and lipoproteins that play critical roles in regulating host responses and promoting survival of the pathogen. Mtb LprA is a lipoprotein agonist of Toll-like receptor 2 (TLR2) that regulates innate immunity and APC function. LprF, which is also found in Mycobacterium bovis but not in the nonpathogenic Mycobacterium smegmatis, has a central hydrophobic cavity that binds a diacylated glycolipid that it transfers from the plasma membrane to the cell wall, which might be related to the pathogenesis of the bacteria. Similarly, LprG functions as a carrier of glycolipids and lipoglycans, such as lipoarabinomannan (LAM), during their trafficking and delivery to the mycobacterial cell wall, contributing to virulence; LAM inhibits fusion of phagosomes with lysosomes as a means for mycobacteria to evade host defense. In addition, LprG has potent TLR2 agonist activity that modulates antigen processing of dendritic cells and macrophages. LppX is required for the translocation of the key virulence factors, the phthiocerol dimycocerosates (PDIMs), to the surface of Mtb. 196
33881 319981 cd16335 MukF_N bacterial condensin complex subunit MukF, N-terminal domain. MukF is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2. 315
33882 319980 cd16336 MukE bacterial condensin complex subunit MukE. MukE is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2. 204
33883 319979 cd16337 MukF_C bacterial condensin complex subunit MukF, C-terminal domain. MukF is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2. 97
33884 319976 cd16338 CpcT T-type phycobiliprotein (PBP) lyase. This family contains the T-type phycobiliprotein (PBP) lyase (includes CpcT/CpeT, also known as CpcT bilin lyase). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The T-type lyase is distantly related to CpcS and is responsible for covalent attachment of phycocyanobilin (PCB) or phycoerythrobilin to a specific cysteine residue in the beta-subunit of phycocyanin (CpcB) and the beta-subunit of phycoerythrocyanin (PecB), and with a different stereochemistry than CpcS. In CpcT (All5339) from Nostoc (Anabaena) sp. PCC7120, sequential binding studies indicate that beta-subunit chromophorylation with PCB at a specific C-terminal cysteine residue in cyanobacterial phycocyanin and phycoerythrocyanin is hindered by a preceding chromophorylation at a specific N-terminal cysteine residue by CpcS. T-type PBP lyases adopt a beta-barrel structure with a modified lipocalin fold, similar to S-type PBP lyases. 180
33885 319977 cd16339 CpcS S-type phycobiliprotein (PBP) lyase. This family contains the S-type phycobiliprotein (PBP) lyase (denoted CpcS/CpcU or CpeS/CpeU). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The S-type lyase is distantly related to CpcT and similarly adopts a beta-barrel structure with a modified lipocalin fold. Many members of the CpcS/CpcU family ligate phycocyanobilin (PCB) to a specific cysteine residue in the beta-subunits of phycocyanin (CpcB) or phycoerythrocyanin (PecB) and to a related cysteine residue in the alpha and beta subunits of allophycocyanin (AP); they are typically given the designation of "CpcS" or "CpcU". Other members which attach phycoerythrobilin (PEB) to the beta-subunit of phycoerythrin (PE) are given the designation "CpeS" or "CpeU". In Guillardia theta, a Cryptophyte, which has adopted phycoerythrobilin (PEB) biosynthesis from cyanobacteria, phycobiliprotein lyase has been shown to provide structural requirements for the transfer of this chromophore to the specific cysteine residue of the apophycobiliprotein. 166
33886 319978 cd16340 CpcS_T S- and T-type phycobiliprotein (PBP) lyases. This family contains the S- and T-type phycobiliprotein (PBP) lyases. PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs. These lyases are distinguishable in the clades of E/F-, S/U-, and T-type lyases; T-type lyases (which include CpcT) are distantly related to S-type lyases (which include CpcS and CpcU). S- and T-type PBP lyases differ in mechanistic details; the conformation and protonation state in which the chromophore is presented account for their differences in stereochemistry of the chromophore selectivity as well as corresponding binding sites. On the other hand, both lyases carry out the main functions of assisting site selectivity in the apo-PBP, protecting the chromophore and ensuring the regio- and stereoselectivity of the addition. The S- and T-type PBP lyases adopt a beta-barrel structure with a modified lipocalin fold. 166
33887 319975 cd16341 FdhE formate dehydrogenase accessory protein FdhE and similar proteins. This family contains formate dehydrogenase accessory protein FdhE and FdhE-like protein, found largely in gamma- and some beta-Proteobacteria, where the fdhE genes are almost always genetically-linked to the structural genes for formate dehydrogenases. FdhE is required for the assembly of formate dehydrogenase although not present in the final complex. In E. coli, FdhE interacts with the catalytic subunits of the respiratory formate dehydrogenases. Purification of recombinant FdhE demonstrates the protein is an iron-binding rubredoxin that can adopt monomeric and homodimeric forms. E. coli FdhE interacts with the catalytic subunits, FdnG and FdoG, of the Tat-dependent respiratory formate dehydrogenases. Site-directed mutagenesis has shown that conserved cysteine motifs are essential for the physiological activity of the FdhE protein and are also involved in Fe(III) ligation. The iron likely is redox active, suggesting that the switch from aerobic to anaerobic conditions may be important in modulating FdhE function. Alternatively, FdhE may be involved in an electron transfer reaction, similar to other rubredoxins. 257
33888 319974 cd16342 FusC_FusB Fusidic acid resistance protein (FusC/FusB). The fusidic acid resistance protein FusC (FusB) mediates resistance to the antibiotic fusidic acid. Its C-terminal domain, which contains a zinc-binding site, interacts with EF-G with high affinity, promoting the dissociation of stalled ribosome·EF-G·GDP complexes that form in the presence of fusidic acid, thus allowing the ribosomes to resume translation. 204
33889 319971 cd16343 LMWPTP Low molecular weight protein tyrosine phosphatase. Low molecular weight protein tyrosine phosphatases (LMW-PTP) are a family of small soluble single-domain enzymes that are characterized by a highly conserved active site motif (V/I)CXGNXCRS and share no sequence similarity with other types of protein tyrosine phosphatases (PTPs). LMW-PTPs play important roles in many biological processes and are widely distributed in prokaryotes and eukaryotes. 147
33890 319972 cd16344 LMWPAP low molecular weight protein arginine phosphatase. Low molecular weight protein arginine phosphatases are part of the low molecular weight phosphatase (LMWP) family. They share a highly conserved active site motif (V/I)CXGNTCRS. It has been shown that the conserved threonine, which in many LMWPTPs is an isoleucine, confers specificity to phosphoarginine over phosphotyrosine. 142
33891 319973 cd16345 LMWP_ArsC Arsenate reductase of the LMWP family. Arsenate reductase plays an important role in the reduction of intracellular arsenate to arsenite, an important step in arsenic detoxification. The reduction involves three different thiolate nucleophiles. In arsenate reductases of the LMWP family, reduction can be coupled with thioredoxin (Trx)/thioredoxin reductase (TrxR) or glutathione (GSH)/glutaredoxin (Grx). 132
33892 319957 cd16347 VOC_like uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 221
33893 319958 cd16348 VOC_YdcJ_like uncharacterized metal-dependent enzyme similar to Shigella flexneri YdcJ. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 310
33894 319959 cd16349 VOC_like uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 301
33895 319960 cd16350 VOC_like uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 254
33896 319750 cd16351 CheB_like methylesterase CheB domain family. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. CheB family members may also contain an N-terminal regulatory (REC) domain, which blocks the active site of the C-terminal domain until it is phosphorylated, or a CheR domain; typically cheB and cheR occur in the same operon. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR. 184
33897 319352 cd16352 CheD chemotaxis protein CheD stimulates methylation of methyl-accepting chemotaxis proteins. This family contains bacterial chemotaxis protein CheD that stimulates methylation of methyl-accepting chemotaxis proteins (MCPs). The CheD chemotaxis gene is not found in the Escherichia coli genome, but is present in many other organisms, including B. subtilis, where CheD appears to have two separate roles; it binds to chemoreceptors to activate them as part of the CheC/CheD/CheYp adaptation system, and it deamidates selected residues to activate chemoreceptors, enabling them to mediate amino acid chemotaxis. CheD has been shown to catalyze amide hydrolysis of specific glutaminyl side chains of the B. subtilis chemoreceptors McpA, McpB and McpC; deamination by CheD is required for the chemoreceptors to effectively transduce signals to the CheA kinase. CheD's ability to bind the receptors is controlled by CheC via a competitive binding mechanism; substituting Gln into the receptor motif of the signal-terminating phosphatase, CheC, turns the inhibitor into a receptor-modifying deamidase CheD substrate. Also, CheYp increases the affinity of CheD for CheC, controlling CheD binding to the receptors through its interactions with CheC. Thus, high levels of CheYp means CheC is a better binding target for CheD than the receptors, resulting in decreased CheA activity. The CheD structure reveals a distant homology with a class of bacterial toxins represented by the cytotoxic necrotizing factor 1 (CNF1) as well as a class of proteins of unknown function represented by B. subtilis YfiH. An invariant Cys-His pair forming a catalytic dyad is observed, and is required by the toxins for deamidation activity. 146
33898 381732 cd16353 CheC_CheX_FliY CheC/CheX/FliY (CXY) family phosphatases. The CXY family includes CheY-P-hydrolyzing proteins that function in bacterial chemotaxis, which involves cellular processes that control the movement of organisms toward favorable environments via rotating flagella, which in turn determines the sense of rotation by the intracellular response regulator CheY. When phosphorylated, CheY-P interacts directly with the flagellar motor, and this signal is terminated by the CXY family of phosphatases (Escherichia coli uses CheZ). CheC acts as a weak CheY-P phosphatase but increases activity in the presence of CheD. Bacillus subtilis has only CheC and FliY while many systems also have CheX. CheC and CheX appear to be primarily involved in restoring normal CheY-P levels, whereas FliY seems to act on CheY-P constitutively. Unlike CheC and CheX, FliY is localized in the flagellar switch complex, which also contains the stator-coupling protein FliG and the target of CheY-P, FliM. CheC, CheX, and FliY phosphatases share a consensus sequence ([DS]xxxExxNx(22)P) with four conserved residues thought to form the phosphatase active site. CheC class I and FliY each have two active sites, while CheC class II and III, and CheX have only one. This family also includes FliM, a component of the flagellar switch complex and a target of CheY, which lacks the phosphatase active site consensus sequence, and is not a CheY phosphatase. 162
33899 319961 cd16354 BAT Bleomycin N-Acetyltransferase and similar proteins. BlmB, encodes a bleomycin N-acetyltransferase, designated BAT, which inactivates Bm using acetyl-coenzyme A (AcCoA). BAT forms a dimer structure via interaction of its C-terminal domains in the monomers. The N-terminal domain of BAT has a tunnel with two entrances: a wide entrance that accommodates the metal-binding domain of Bm and a narrow entrance that accommodates acetyl-CoA (AcCoA). A groove formed on the dimer interface of two BAT C-terminal domains forms the DNA-binding domain of Bm. In a ternary complex of BAT, BmA(2), and CoA, a thiol group of CoA is positioned near the primary amine of Bm at the midpoint of the tunnel and ensures efficient transfer of an acetyl group from AcCoA to the primary amine of Bm. BAT belongs to vicinal oxygen chelate (VOC) superfamily that is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including thiocoraline, bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 114
33900 319962 cd16355 VOC_like uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 121
33901 319963 cd16356 PsjN_like Burkholderia Phytofirmans glyoxalase/bleomycin resistance protein/dioxygenase family enzyme and similar proteins. Burkholderia Phytofirmans glyoxalase/bleomycin resistance protein/dioxygenase family enzyme and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 119
33902 319964 cd16357 GLOD4_C C-terminal domain of human glyoxalase domain-containing protein 4 and similar proteins. Uncharacterized subfamily of the vicinal oxygen chelate (VOC) superfamily contains human glyoxalase domain-containing protein 4 and similar proteins. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 114
33903 319965 cd16358 GlxI_Ni Glyoxalase I that uses Ni(++) as cofactor. This family includes Escherichia coli and other prokaryotic glyoxalase I that uses nickel as cofactor. Glyoxalase I (also known as lactoylglutathione lyase; EC 4.4.1.5) is part of the glyoxalase system, a two-step system for detoxifying methylglyoxal, a side product of glycolysis. This system is responsible for the conversion of reactive, acyclic alpha-oxoaldehydes into the corresponding alpha-hydroxyacids and involves 2 enzymes, glyoxalase I and II. Glyoxalase I catalyses an intramolecular redox reaction of the hemithioacetal (formed from methylglyoxal and glutathione) to form the thioester, S-D-lactoylglutathione. This reaction involves the transfer of two hydrogen atoms from C1 to C2 of the methylglyoxal, and proceeds via an ene-diol intermediate. Glyoxalase I has a requirement for bound metal ions for catalysis. Eukaryotic glyoxalase I prefers the divalent cation zinc as cofactor, whereas Escherichia coli and other prokaryotic glyoxalase I uses nickel. However, eukaryotic Trypanosomatid parasites also use nickel as a cofactor, which could possibly be explained by acquiring their GLOI gene by horizontal gene transfer. Human glyoxalase I is a two-domain enzyme and it has the structure of a domain-swapped dimer with two active sites located at the dimer interface. In yeast, in various plants, insects and Plasmodia, glyoxalase I is four-domain, possibly the result of a further gene duplication and an additional gene fusing event. 122
33904 319966 cd16359 VOC_BsCatE_like_C C-terminal of Bacillus subtilis CatE like protein, a vicinal oxygen chelate subfamily. Uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily contains Bacillus subtilis CatE and similar proteins. CatE is proposed to function as Catechol-2,3-dioxygenase. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 110
33905 319967 cd16360 ED_TypeI_classII_N N-terminal domain of type I, class II extradiol dioxygenases. This family contains the N-terminal non-catalytic domain of type I, class II extradiol dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site; extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon, whereas intradiol enzymes cleave the aromatic ring between two hydroxyl groups. Extradiol dioxygenases are classified into type I and type II enzymes. Type I extradiol dioxygenases include class I and class II enzymes. These two classes of enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. The extradiol dioxygenases represented in this family are type I, class II enzymes, and are composed of the N- and C-terminal domains of similar structure fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. A catalytically essential metal, Fe(II) or Mn(II), is present in all the enzymes in this family. 111
33906 319968 cd16361 VOC_ShValD_like vicinal oxygen chelate (VOC) family protein similar to Streptomyces hygroscopicus ValD protein. This subfamily of vicinal oxygen chelate (VOC) family protein includes Streptomyces hygroscopicus ValD protein and similar proteins. ValD protein functions in validamycin biosynthetic pathway. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping. 150
33907 319969 cd16362 TflA Toxoflavin Lyase. Toxoflavin lyase (TflA) is a metalloenzyme that degrades toxoflavin in the presence of oxygen, Mn(II), and dithiothreitol. TflA is structurally homologous to proteins of the vicinal oxygen chelate superfamily. The structure of TflA contains a fold that displays a rare rearrangement of the structural modules indicative of domain permutation. Moreover, unlike the 2-His-1-carboxylate facial triad commonly utilized by vicinal oxygen chelate dioxygenases and other dioxygen-activating non-heme Fe(II) enzymes, the metal center in TflA consists of a 1-His-2-carboxylate facial triad. Toxoflavin is an azapteridine that is toxic to various plants, fungi, animals, and bacteria. 110
33908 319897 cd16363 Col_Im_like inhibitory immunity (Im) protein of colicin (Col) deoxyribonuclease (DNase) and pyocins. This family contains inhibitory immunity (Im) proteins that bind to colicin endonucleases (DNases) or pyocins with very high affinity and specificity; this is critical for the neutralization of endogenous DNase catalytic activity and for protection against exogenous DNase bacteriocins. The DNase colicin family (ColE2, ColE7, ColE8 and ColE9) in E. coli, and pyocin family (S1, S2, S3 and AP41) in P. aeruginosa, are potent bacteriocins where the immunity proteins (Ims) protect the colicin/pyocin producing (i.e. colicinogenic) bacteria by binding and inactivating colicin nucleases. The binding affinities between cognate and non-cognate nucleases by Im proteins can vary up to 10 orders of magnitude. 81
33909 409301 cd16364 T3SC_I-like class I type III secretion system (T3SS) chaperones and similar proteins. This family contains class I type III secretion system (T3SS) chaperones mainly found in Gram-negative bacteria such as Pseudomonas, Yersinia, Salmonella, Escherichia and Erwinia, among others. A wide variety of these bacterial pathogens and symbionts require a T3SS to inject eukaryotic host cells with effector proteins important for suppressing host defenses and establishing infection. Many of these effector proteins interact with specific type III secretion chaperones prior to secretion. These T3SS chaperones have been classified as class I type III secretion chaperones (T3SC), which are small structurally conserved dimers that interact specifically with T3SS effector proteins. Class I T3SC consists of two subclasses: IA and IB. Class IA T3SC binds a single effector, whereas class IB T3SC binds to several effectors. Class IA and Class IB T3SCs typically exhibit little sequence similarities, but share a common overall heart-shaped structure fold (alpha-beta-beta-beta-alpha-beta-beta-alpha) and features, such as a small size, an acidic pI and an amphipathic C-terminal alpha-helix. Chaperone protein CesT serves a chaperone function for the enteropathogenic Escherichia coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment. In Pseudomonas aeruginosa, chaperone ExsC binds small secreted protein ExsE as well as the non-secreted anti-activator protein ExsD; it relieves repression of the transcriptional activator ExsA (which activates expression of T3SS genes) by ExsD. P. aeruginosa SpcU binds the cytotoxin ExoU, which is a broad-specificity phospholipase A2 (PLA2) and lysophospholipase, and maintains the N-terminus of ExoU in an unfolded state which is required for secretion. Salmonella enterica chaperone SicP forms a complex with effector protein SptP at an early stage of its secretion process in order to avoid premature degradation, while chaperone SigE binds to effector SigD, which, upon translocation into the host cell, preferentially dephosphorylates specific inositol phospholipids that are thought to be crucial for subsequent activation of the host cell Ser-Thr kinase Akt. This family also includes Yersinia chaperone/escortee pairs SycE/YopE, SycH/YopH, SycT/YopT and SycN+YscB/YopN, all of which bind to specific Yersinia outer proteins (Yops). Also included are several DspF and related sequences from several plant pathogenic bacteria. The "disease-specific" (dsp) region next to the hrp gene cluster of Erwinia amylovora is required for pathogenicity but not for elicitation of the hypersensitive reaction. In addition, a group of proteins including Escherichia coli YbjN, Erwinia amylovora AmyR, and their homologs, are included in this family. They share a class I T3SC-like fold with T3SS chaperone proteins but appear to function independently of the T3SS. 117
33910 319887 cd16365 NarH_like beta FeS subunits DMSOR NarH-like family. This subfamily contains beta FeS subunits of several DMSO reductase superfamily members, including nitrate reductase A, ethylbenzene dehydrogenase and selenate reductase. DMSO Reductase (DMSOR) family members have a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit. The beta subunits of DMSOR contain four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. Nitrate reductase A contains three subunits (the catalytic subunit NarG, the catalytic subunit NarH with four [Fe-S] clusters, and integral membrane subunit NarI) and often forms a respiratory chain with the formate dehydrogenase via the lipid soluble quinol pool. Ethylbenzene dehydrogenase oxidizes the hydrocarbon ethylbenzene to (S)-1-phenylethanol. Selenate reductase catalyzes reduction of selenate to selenite in bacterial species that can obtain energy by respiring anaerobically with selenate as the terminal electron acceptor. 201
33911 319888 cd16366 FDH_beta_like beta FeS subunits of formate dehydrogenase N (FDH-N) and similar proteins. This family contains beta FeS subunits of several dehydrogenases in the DMSO reductase superfamily, including formate dehydrogenase N (FDH-N), tungsten-containing formate dehydrogenase (W-FDH) and other similar proteins. FDH-N is a major component of nitrate respiration of Escherichia coli; it catalyzes the oxidation of formate to carbon dioxide, donating the electrons to a second substrate to a cytochrome. W-FDH contains a tungsten instead of molybdenum at the catalytic center and seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It catalyzes the oxidation of formate to carbon dioxide, donating the electrons to a second substrate. 156
33912 319889 cd16367 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 138
33913 319890 cd16368 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 200
33914 319891 cd16369 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 172
33915 319892 cd16370 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 131
33916 319893 cd16371 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 140
33917 319894 cd16372 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 125
33918 319895 cd16373 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 154
33919 319896 cd16374 DMSOR_beta_like uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 139
33920 319867 cd16375 Avd_IVP_like proteins similar to the diversity-generating retroelement protein bAvd. A superfamily of four-helix bundles that form homopentamers, including the bacterial accessory variability determinant (bAvd) protein and a family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes. 103
33921 319868 cd16376 Avd_like diversity-generating retroelement protein bAvd and similar proteins. The bacterial accessory variability determinant (bAvd) protein, together with a reverse transcriptase, is involved in retrohoming processes as part of a diversity-generating retroelement (DGR), a type of system that is involved in generating sequence variation in bacterial proteins by inserting coding information from a template region into a variable region of a protein coding gene. bAvd forms homopentamers and interacts with the reverse transcriptase as well as with DNA and/or RNA. 106
33922 319869 cd16377 23S_rRNA_IVP_like 23S rRNA-intervening sequence protein and similar proteins. A family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes. The distantly related bAvd protein, which also forms a homopentamer of four-helix bundles, has been suggested to interact with nucleic acids and a reverse transcriptase. 108
33923 319866 cd16378 CcmH_N N-terminal domain of cytochrome c-type biogenesis protein CcmH and related proteins. Cytochrome c-type biogenesis protein CcmH is a membrane-anchored thiol-oxidoreductase that is essential in the maturation of c-type cytochromes. CcmH consists of an N-terminal catalytic domain with the active site CXXC motif and a C-terminal domain of unknown function which is predicted to contain TPR-like repeats. Other members of this family include NrfF, CycL, and Ccl2. 67
33924 319863 cd16379 YitT_C_like C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, found in various bacterial proteins, has no known function. It has been given the designation DUF2179 and occurs at the C-terminus of the Bacillus subtilis membrane protein YitT as well as in single-domain proteins. 80
33925 319864 cd16380 YitT_C C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, found in various bacterial proteins, has no known function. It has been given the designation DUF2179 and occurs at the C-terminus of the Bacillus subtilis membrane protein YitT. 85
33926 319865 cd16381 YitT_C_like_1 Proteins similar to the C-terminal domain of Bacillus subtilis YitT. This domain, characteristic of various bacterial proteins, has no known function. It has been given the designation DUF2179 and is similar to the C-terminus of the Bacillus subtilis membrane protein. 80
33927 341357 cd16382 XisI-like XisI is FdxN element excision controlling factor protein. This family contains XisI proteins, also known as FdxN element excision controlling factors, and similar proteins. FdxN element is excised from the chromosome during heterocyst differentiation in cyanobacteria. This is accomplished by the large serine recombinase XisF (fdxN element site-specific recombinase). The xisH as well as the xisF and xisI genes are required. XisI may function as recombination directionality factor (RDF), and needs XisH which may function as an endonuclease. 107
33928 319862 cd16383 GUN4 porphyrin-binding protein domain GUN4. GUN4 is a porphyrin-binding protein involved in chlorophyll biosynthesis regulation and intracellular signaling, found in aerobic photosynthetic organisms. It has been implicated in retrograde signalling between the chloroplast and nucleus. GUN4 can bind protoporphyrin IX (PIX) and magnesium protoporphyrin IX (MgPIX), substrate and product of the Mg-chelatase, as well as heme and cobalt protoporphyrin IX (CoPIX). It may play a role in protecting plants from reactive oxygen species (ROS)-mediated damage. 142
33929 319760 cd16384 VirB8_like virulence protein VirB8. This family contains bacterial virulence protein VirB8 and similar proteins which consist of cytoplasmic, transmembrane, and periplasmic regions. VirB8 is an essential assembly factor of type IV secretion system (T4SS) channel proteins that are highly diverse in function relative to other bacterial secretion systems. T4SS proteins are important virulence factors for many Gram-negative pathogens, such as Agrobacterium, Brucella, Legionella, and Helicobacter. This family also includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain has a similar fold to the nuclear transport factor-2 (NTF2) protein. 133
33930 319861 cd16385 IcmL inner membrane protein IcmL/DotI. This family contains the periplasmic domain of the inner membrane protein DotI of the Dot/Icm (defect in organelle trafficking/intracellular multiplication) type IVB secretion system, including its ortholog in the conjugation system of plasmid R64, TraM, and similar proteins. These domains share striking structural similarity to the type IV secretion system (T4ASS) component VirB8 suggesting DotI/TraM to be its functional counterpart. DotI is essential for intracellular growth of Legionella pneumophila (causing agent of Legionnaires' disease) within mammalian and protozoan cells; it translocates numerous effector proteins into its eukaryotic host. 132
33931 319860 cd16386 TcpC_N N-terminal domain of conjugative transposon protein TcpC. This family contains the N-terminal domain of conjugation protein TcpC and similar proteins. TcpC is required for efficient conjugative transfer, localizing to the cell membrane independently of other conjugation proteins, where membrane localization is important for its function, oligomerization and interaction with the conjugation proteins TcpA, TcpH, and TcpG. N-terminal region (sometimes referred to as central domain) of TcpC is required for efficient conjugation, oligomerization and protein-protein interactions. TcpC has low level sequence identity to proteins encoded by the conjugative transposon Tn916, which is responsible for a large proportion of the tetracycline resistance in different pathogens. 123
33932 319246 cd16387 ParB_N_Srx ParB N-terminal domain and sulfiredoxin protein-related families. The ParB N-terminal domain/Sulfiredoxin (Srx) superfamily contains proteins with diverse activities. Many of the families are involved in segregation and competition between plasmids and chromosomes. Several families share similar activities with the N-terminal domain of ParB (Spo0J in Bacillus subtilis), a DNA-binding component of the prokaryotic parABS partitioning system. Also within this superfamily is sulfiredoxin (Srx; reactivator of oxidatively inactivated 2-cys peroxiredoxins), RepB N-terminal domain (plasmid segregation replication protein B like protein), nucleoid occlusion protein, KorB N-terminal domain partition protein of low copy number plasmid RK2, irbB (immunoglobulin-binding regulator that activates eib genes), N-terminal domain of sopB protein (promotes proper partitioning of F1 plasmid), fertility inhibition factors OSA and FiwA,DNA sulfur modification protein DndB, and a ParB-like toxin domain. Other activities includes a StrR (regulator in the streptomycin biosynthetic gene cluster), and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators sbnI (Staphylococcus aureus siderophore biosynthetic gene cluster ). Nuclease activity has also been reported in Arabidopsis Srx. 54
33933 319247 cd16388 SbnI_like_N N-terminal domain of transcriptional regulators similar to SbnI. Siderophore staphylobactin biosynthesis protein SbnI of Staphylococcus aureus is a ParB/Spo0J like protein required for the expression of genes in the sbn operon, which is responsible for staphyloferrin B (SB) biosynthesis. SbnI forms dimers and binds DNA upstream of sdnD. SbnI binds heme, which inhibits DNA binding of SbnI, leading to a suppression of sbn operon expression. 77
33934 319248 cd16389 FIN fertility inhibition factors, including OSA and FiwA, related to the ParB/Srx superfamily. Osa and FiwA are fertility inhibition factors (FIN), which are employed by plasmids to block import of rival plasmids. Osa (oncogenic suppressive activity) of IncW group plasmid pSa gene inhibits the oncogenic properties of Agrobacterium tumefaciens. Osa is structurally similar to the ParB N-terminal domain/Srx superfamily of proteins: ParB acts in the bacterial and plasmid parABS partitioning systems. Osa has been shown to have ATPase and DNAse activities, and can block T-DNA transfer into plants. FiwA is encoded by plasmid RP1 and blocks the transfer of plasmid R388. The gene product of Haemophilus influenzae p1056.10c also blocks T-DNA transfer. 124
33935 319249 cd16390 ParB_N_Srx_like uncharacterized family distantly related to the N-terminal domain of the ParB/Srx superfamily. Uncharacterized proteins distantly related to the N-terminal domain of the ParB superfamily, primarily involved in bacterial and plasmid parABS-related partitioning systems. A small minority of proteins in this family include a C-terminal inorganic pyrophosphatase domain. Also within the ParB superfamily is sulfiredoxin (Srx), which is a reactivator of oxidatively inactivated 2-cys peroxiredoxins. Other families includes a putative regulator in the biosynthetic gene cluster and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators SbnI (Staphylococcus aureus siderophore biosynthetic gene cluster ) and EdeB (Brevibacillus brevis antimicrobial peptide edeine biosynthetic cluster). Nuclease activity has also been reported in Arabidopsis Srx. 162
33936 319250 cd16392 toxin-ParB toxin domain of the ParB/Srx superfamily. toxin domain with similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system and related proteins. Toxin found, for example, at the C-terminus of polymorphic toxin system members. 72
33937 319251 cd16393 SPO0J_N Thermus thermophilus stage 0 sporulation protein J-like N-terminal domain, ParB family member. Spo0J (stage 0 sporulation protein J) is a ParB family member, a critical component of the ParABS-type bacterial chromosome segregation system. The Spo0J N-terminal region acts in protein-protein interaction and is adjacent to the DNA-binding domain that binds to parS sites. Two Spo0J bind per parS site, and Spo0J interacts with neighbors via the N-terminal domain to form oligomers via an Arginine-rich patch (RRXR). This superfamily represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 97
33938 319252 cd16394 sopB_N N-terminal domain of sopB protein, which promotes proper partitioning of F1 plasmid. Escherichia coli SopB acts in the equitable partitioning of the F plasmid in the SopABC system. SopA binds to the sopAB promoter, while SopB binds SopC and helps stimulate polymerization of SopA in the presence of ATP and Mg(II). Mutation of SopA inhibits proper plasmid segregation. This N-terminal domain is related to the ParB N-terminal domain of bacterial and plasmid parABS partitioning systems, which binds parA. 67
33939 319253 cd16395 Srx Sulfiredoxin reactivates peroxiredoxins after oxidative inactivation. Sulfiredoxin reduces and thereby re-activates 2-cys peroxiredoxins. Peroxiredoxins act as molecular switches, inactivating in response to hyperoxidation from hydrogen peroxide and other free radicals. Sulfiredoxin reactivates Prx-SO(2)(-) via ATP-Mg(2+)-dependent reduction. Arabidopsis sulfiredoxin has been described as a dual function enzyme, having nuclease activity in addition to the sulfiredoxin activity. This protein is similar to ParB N-terminal-like domain of bacterial and plasmid parABS partitioning systems. 90
33940 319254 cd16396 Noc_N nucleoid occlusion protein, N-terminal domain, and related domains of the ParB partitioning protein family. Nucleoid occlusion protein has been shown in Bacillus subtilis to bind to specific DNA sequences on the chromosome (Noc-binding DNA sequences, NBS), inhibiting cell division near the nucleoid and thereby protecting the chromosome. This N-terminal domain is related to the N-terminal domain of ParB/repB partitioning system proteins. 95
33941 319255 cd16397 IbrB_like immunoglobulin-binding regulator IbrB activates eib genes. IbrB (along with IbrA) activates immunoglobulin-binding eib genes in Escherichia coli. IbrB is related to the ParB N-terminal domain family, which includes DNA-binding proteins involved in chromosomal/plasmid segregation and transcriptional regulation, consistent with a possible mechanism for IbrB in DNA binding-related regulation of eib expression. The ParB like family is a diverse domain superfamily with structural and sequence similarity to ParB of bacterial chromosomes/plasmid parABS partitioning system and Sulfiredoxin (Srx), which is a reactivator of oxidatively inactivated 2-cys peroxiredoxins. Other families includes proteins related to StrR, a putative regulator in the biosynthetic gene cluster and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators SbnI (Staphylococcus aureus siderophore biosynthetic gene cluster ) and EdeB (Brevibacillus brevis antimicrobial peptide edeine biosynthetic cluster). Nuclease activity has also been reported in arabidopsis Srx. 100
33942 319256 cd16398 KorB_N_like ParB-like partition protein of low copy number plasmid RK2, N-terminal domain and related domains. KorB, a member of the ParB like family, is present on the low copy number, broad host range plasmid RK2. KorB encodes a gene product involved in segregation of RK2 and acts as a transcriptional regulator, down-regulating at least 6 RK2 operons. KorB binds RNA polymerase and acts cooperatively with several co-repressors in modulating transcription. KorB is comprised of 3 domains, including a beta-strand C-terminal domain similar to SH3 domains and an alpha helical central domain that interacts with operator DNA. In ParB of P1 and SopB of F, the N-terminal region is responsible for interaction with the parA component. However, korB interaction with the RK2 parA-equivalent IncC has been mapped to the central HTH motif. This family is related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 91
33943 319257 cd16400 ParB_Srx_like_nuclease ParB/Srx_like nuclease and putative transcriptional regulators related to SbnI. This family contains a Pyrococcus Furiosus enzyme reported to have DNA nuclease activity and resembles the N-terminal domain of ParB proteins of the parABS bacterial chromosome partitioning system. This sub-family also includes siderophore staphylobactin biosynthesis protein SbnI. 60% of the P. furiosus nuclease activity was retained at 90 degree C, suggesting a physiological role in the organism, which can grow in temperatures as high as 100 degrees Celsius. The protein has endo- and exo-nuclease activity vs. single- and double-stranded DNA, and nuclease activity was lost in methylated proteins prepared for structure solution. This family has a fairly well-conserved DGHHR motif that corresponds to the same structural position as the phosphorylation site (a portion of the ATP-Mg-binding site) of sulfiredoxin and the arginine-rich ParB BoxII of ParB. 72
33944 319258 cd16401 ParB_N_like_MT ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 85
33945 319259 cd16402 ParB_N_like_MT ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains and DUF4417. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 87
33946 319260 cd16403 ParB_N_like_MT ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 88
33947 319261 cd16404 pNOB8_ParB_N_like pNOB8 ParB-like N-terminal domain, plasmid partitioning system protein domain. archaeal pNOB8 ParB acts in a plasmid partitioning system made up of 3 parts: AspA, ParA motor protein, and ParB, which links ParA to the protein-DNA superhelix. As demonstrated in Sulfolobus, AspA spreads along DNA, which allows ParB binding, and links to the Walker-motif containing ParA motor protein. The Sulfolobus ParB C-terminal domain resembles eukaryotic segregation protein CenpA, and other histones. This family is related to the N-terminal domain of ParB (Spo0J in Bacillus subtilis), a DNA-binding component of the prokaryotic parABS partitioning system and related proteins. 69
33948 319262 cd16405 RepB_like_N plasmid segregation replication protein B like protein, N-terminal domain. RepB, found on plasmids and secondary chromosomes, works along with repA in directing plasmid segregation, and has been shown in Rhizobium etli to require the parS centromere-like sequence for full transcriptional repression of the repABC operon, inducing plasmid incompatibility. RepA is a Walker-type ATPase that complexes with RepB to form DNA-protein complexes in the presence of ATP/ADP. RepC is an initiator protein for the plasmid. repA and repB are homologous to the parA and ParB genes of the parABS partitioning system found on primary chromosomes. 91
33949 319263 cd16406 ParB_N_like ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 82
33950 319264 cd16407 ParB_N_like ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 86
33951 319265 cd16408 ParB_N_like ParB N-terminal, parA -binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 84
33952 319266 cd16409 ParB_N_like ParB N-terminal-like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains an HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 74
33953 319267 cd16410 ParB_N_like ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains an HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 80
33954 319268 cd16411 ParB_N_like ParB N-terminal, parA-binding domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains an HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 90
33955 319269 cd16412 dndB DNA sulfur modification protein DndB. dndB acts in the regulation of DNA modifications, including DNA phosphorothioation. DndB may act by binding near the phosphorothioate modification site and regulating access of the Dnd modification machinery to DNA. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily. 333
33956 319270 cd16413 DGQHR_domain DGQHR motif containing domain. Uncharacterized diverse domain family with conserved DGQHR motif, in addition to QR and FXXXN motifs. Some proteins have been identified as parts of DNA phosphorothioation systems. Related to dndB, which acts in the regulation of DNA modifications, including DNA phosphorothioation. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily. 229
33957 319271 cd16414 dndB_like DNA-sulfur modification-associated domain. Family of proteins related to dndB. dndB acts in the regulation of DNA modifications, including DNA phosphorothioation. Both have a conserved DGQHR sequence motif. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily. 238
33958 319852 cd16415 HAD_dREG-2_like uncharacterized family of the haloacid dehalogenase-like superfamily, similar to uncharacterized Drosophila melanogaster rhythmically expressed gene 2 protein and human haloacid dehalogenase-like hydrolase domain-containing protein 3. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 128
33959 319853 cd16416 HAD_BsYqeG-like Uncharacterized family of the haloacid dehalogenase-like superfamily, similar to the uncharacterized protein Bacillus subtilis YqeG. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 108
33960 319854 cd16417 HAD_PGPase Escherichia coli Gph phosphoglycolate phosphatase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Phosphoglycolate phosphatase (PGP; EC 3.1.3.18) catalyzes the conversion of 2-phosphoglycolate into glycolate and phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 212
33961 319855 cd16418 HAD_Pase phosphatases, similar to human PHOSPHO1 and PHOSPHO2 phosphatases; belongs to the haloacid dehalogenase-like superfamily. This family includes phosphatases with different substrate specificities. Human PHOSPHO1 is a phosphoethanolamine/phosphocholine phosphatase associated with high levels of expression at mineralizing regions of bone and cartilage and is thought to be involved in the generation of inorganic phosphate for bone mineralization. Human PHOSPHO2 is a putative phosphatase which shows high specific activity toward pyridoxal-5-phosphate; PHOSPHO2 is not specific to bone but is expressed in a wide range of soft tissues. These belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 130
33962 319856 cd16419 HAD_SPS sucrose-phosphate synthase; belongs to the haloalkanoic acid dehalogenase (HAD) superfamily. Sucrose phosphate synthase (SPS; EC 2.4.1.14), also known as UDP-glucose-fructose-phosphate glucosyltransferase, catalyzes the transfer of a hexosyl group from UDP-glucose to D-fructose 6-phosphate to form UDP and D-sucrose-6-phosphate; this is the rate-limiting step of sucrose synthesis. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 174
33963 319857 cd16421 HAD_PGPase Rhodobacter capsulatus Cbbz phosphoglycolate phosphatase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Phosphoglycolate phosphatase (PGPase; EC 3.1.3.18) catalyzes the conversion of 2-phosphoglycolate into glycolate and phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 105
33964 319858 cd16422 HAD_Pase_UmpH-like uncharacterized subfamily of the UmpH/NagD phosphatase family, belongs to the haloacid dehalogenase-like superfamily. This uncharacterized subfamily belongs to the UmpH/NagD phosphatase family and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 247
33965 319859 cd16423 HAD_BPGM-like uncharacterized subfamily of beta-phosphoglucomutase-like family, similar to uncharacterized Bacillus subtilis YhcW. This subfamily includes the uncharacterized Bacillus subtilis YhcW. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. 169
33966 319761 cd16424 VirB8 periplasmic domain of VirB8 protein. This family contains the periplasmic domain of VirB8 protein which is an essential assembly factor of type IV secretion system (T4SS) channel proteins that are highly diverse in function relative to other bacterial secretion systems. T4SS proteins are important virulence factors for many Gram-negative pathogens, such as Agrobacterium, Brucella, Legionella, and Helicobacter. VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that VirB8 is a primary constituent of a DNA transporter. It is a crucial structural and functional component of the T4SS, with interactions between VirB8 and many other T4SS proteins, including VirB10, VirB9, VirB1, VirB4, and VirB11, as well as with itself. 137
33967 319762 cd16425 TrbF conjugal transfer protein TrbF. This family includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain is similar to the type IV secretion system (T4ASS) component VirB8 and possibly has a similar fold to the nuclear transport factor-2 (NTF-2)-like superfamily. 133
33968 319754 cd16426 VirB10_like VirB10 and similar proteins form part of core complex in Type IV secretion system (T4SS). This family contains VirB10, a component of the type IV secretion system (T4SS) and its homologs, including TraB, TraF, IcmE, and similar proteins. T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. In Gram-negative bacterial pathogen Helicobacter pylori, an important aetiological agent of gastroduodenal disease in humans, the comB3 gene encodes protein with best homologies to TraS/TraB from the Pseudomonas aeruginosa conjugative plasmid RP1 and TrbI of plasmid RP4 and VirB10 from the Ti plasmid of Agrobacterium tumefaciens, as well as DotG/IcmE of Legionella pneumophila. 151
33969 319758 cd16427 TraM-like C-terminal domain of transfer protein TraM. This family contains the C-terminal domain of transfer protein TraM of the G+ broad host range Enterococcus conjugative plasmid pIP501 and similar proteins. The protein localizes to the cell envelope and is part of the plasmid transfer system that is accessible from outside of the cell. TraM displays a fold similar to the type IV secretion system (T4SS) VirB8 proteins from A. tumefaciens and B. suis (G-) and to the transfer protein TcpC from C. perfringens plasmid pCW3 (G+), reinforcing the prediction that TraM performs a key role in the secretion process, which is underlined by its surface accessibility. It is also structurally related to members of the nuclear transport factor-2 (NTF-2)-like superfamily with a high similarity to the NTF-2 protein from Rattus norvegicus. TraM (categorized as T4SS VirB8-like class GAMMA) does not possess the binding pocket of classic VirB8 class ALPHA proteins, recognized by VirB8 interaction inhibitors. 108
33970 319759 cd16428 TcpC_C C-terminal domain of conjugative transposon protein TcpC. This family contains the C-terminal domain of conjugation protein TcpC and similar proteins. TcpC is required for efficient conjugative transfer, localizing to the cell membrane independently of other conjugation proteins, where membrane localization is important for its function, oligomerization and interaction with the conjugation proteins TcpA, TcpH, and TcpG. C-terminal domain is critical for interactions with these other conjugation proteins. TcpC has low level sequence identity to proteins encoded by the conjugative transposon Tn916, which is responsible for a large proportion of the tetracycline resistance in different pathogens. 97
33971 319755 cd16429 VirB10 VirB10 forms part of core complex in Type IV secretion system (T4SS). This family contains VirB10, a component of the type IV secretion system (T4SS), including homologs TrbI, TraF, TrwE and TraL. T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. TrwE in R33 plasmid has been shown to be anchored to the inner membrane and its C-terminal is necessary for conjugation; the transmembrane domains of TrwB and TrwE are involved in TrwB-TrwE interactions. TraF protein of the RP4 plasmid is involved in circularization of pilin subunits of P-type pili. In gonococcal genetic island (GGI) of Neisseria gonorrhoeae, T4SS encodes TrbI and circularization occurs via a covalent intermediate between the C terminus of putative pilin protein TraA and TrbI. 180
33972 319756 cd16430 TraB TraB is a homolog of VirB10 which forms part of core complex in Type IV secretion system (T4SS). This family contains TraB (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. TraB is localized similarly to related proteins in other systems, but unlike in other systems, Neisseria gonorrhoeae TraB does not require the presence of other T4SS components for proper localization. It has been shown to be expressed with TraK (VirB9 homolog) at low levels in wild-type cells, suggesting that gonococcal T4SS may be present in single copy per cell and the small amounts of these proteins are sufficient for DNA secretion. 203
33973 319757 cd16431 IcmE DotG/IcmE is a homolog of VirB10 which forms part of core complex in Type IV secretion system. This family contains DotG/IcmE (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. The Dot/Icm system is a T4SS found in the pathogens Legionella and Coxiella and the conjugative apparatus of IncI plasmids; T4SS is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. Similar to T4SS VirB/D components, the Legionella Dot/Icm secretion apparatus contains a critical five-protein sub-assembly that forms the membrane-spanning 'core-complex' (CC), around which all other components assemble. This transmembrane connection is mediated by protein dimer pairs consisting of two inner membrane proteins, DotF and DotG, each independently associating with DotH/DotC/DotD in the outer membrane. 179
33974 319751 cd16432 CheB_Rec Chemotaxis response regulator protein-glutamate methylesterase, CheB, with N-terminal REC domain. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain with a REC domain at the N-terminus. CheB is a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. The N-terminal regulatory (REC) domain blocks the active site of the C-terminal domain until it is phosphorylated. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR. 184
33975 319752 cd16433 CheB Chemotaxis response regulator protein-glutamate methylesterase, CheB. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheR and cheB have a strong preference to occur in the same operon, and a subgroup contains multidomain proteins with CheB-CheR fusions. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR. 181
33976 319753 cd16434 CheB-CheR_fusion Chemotaxis response regulator protein-glutamate methylesterase, CheB, fused with CheR domain. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors, fused with a CheR domain as well as other domains. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheB and cheR are typically found in the same operon. However, CheB and CheR are fused in multi-domain proteins in this subgroup. The CheR protein/domain includes an all-alpha N-terminal domain and an S-adenosylmethionine-dependent methyltransferase C-terminal domain. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR. 180
33977 319740 cd16435 BPL_LplA_LipB biotin-lipoate ligase family. This family includes biotin protein ligase (BPL), lipoate-protein ligase A (LplA) and octanoyl-[acyl carrier protein]-protein acyltransferase (LipB). Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LplA) catalyses the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. 198
33978 319744 cd16436 beta_Kdo_transferase beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. KpsC and KpsS are retaining beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferases. They are part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. 222
33979 319745 cd16437 beta_Kdo_transferase_KpsC beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication. 256
33980 319746 cd16438 beta_Kdo_transferase_KpsS_like beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsS like. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. 254
33981 319747 cd16439 beta_Kdo_transferase_KpsC_2 beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC, repeat 2. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication. 259
33982 319748 cd16440 beta_Kdo_transferase_KpsC_1 beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC, repeat 1. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication. 262
33983 319749 cd16441 beta_Kdo_transferase_KpsS beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsS. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. 307
33984 319741 cd16442 BPL biotin protein ligase. Biotin protein ligase (EC 6.3.4.15) catalyzes the synthesis of an activated form of biotin, biotinyl-5'-AMP, from substrates biotin and ATP followed by biotinylation of the biotin carboxyl carrier protein subunit of acetyl-CoA carboxylase. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. 173
33985 319742 cd16443 LplA lipoate-protein ligase. Lipoate-protein ligase A (LplA) catalyzes the formation of an amide linkage between free lipoic acid and a specific lysine residue of the lipoyl domain in lipoate dependent enzymes, similar to the biotinylation reaction mediated by biotinyl protein ligase (BPL). The two step reaction includes activation of exogenously supplied lipoic acid at the expense of ATP to lipoyl-AMP and then transfer to the epsilon-amino group of a specific lysine residue of the lipoyl domain of the target protein. 209
33986 319743 cd16444 LipB lipoyl/octanoyl transferase. Lipoate-protein ligase B is an octanoyl-[acyl carrier protein]-protein acyltransferase that catalyzes the first step of lipoic acid synthesis. It transfers endogenous octanoic acid attached via a thioester bond to acyl carrier protein (ACP) onto lipoyl domains, which is later converted by lipoate synthase LipA into lipoylated derivatives. 199
33987 319362 cd16448 RING-H2 H2 subclass of RING (RING-H2) finger and its variants. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized into two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have a different Cys/His pattern. Some lack a single Cys or His residue at typical Zn ligand positions. In particular, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can indeed chelate Zn in a RING finger as well. This family corresponds to H2 subclass of RING (RING-H2) finger proteins that are characterized by containing C3H2C3-type canonical RING-H2 fingers or noncanonical RING-H2 finger variants, including C4HC3- (RING-CH alias RINGv), C3H3C2-, C3H2C2D-, C3DHC3-, and C4HC2H-type modified RING-H2 fingers. The canonical RING-H2 finger has been defined as C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-H-X2-C-X(4-48)-C-X2-C, where X is any amino acid and the number of X residues varies in different fingers. It binds two Zn ions in a unique "cross-brace" arrangement, which distinguishes it from tandem zinc fingers and other similar motifs. RING-H2 finger can be found in a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serve as scaffolds for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates. 44
33988 319363 cd16449 RING-HC HC subclass of RING (RING-HC) finger and its variants. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized into two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have a different Cys/His pattern. Some lack a single Cys or His residue at typical Zn ligand positions, especially, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can chelate Zn in a RING finger as well. This family corresponds to HC subclass of RING (RING-HC) finger proteins that are characterized by containing C3HC4-type canonical RING-HC fingers or noncanonical RING-HC finger variants, including C4C4-, C3HC3D-, C2H2C4-, and C3HC5-type modified RING-HC fingers. The canonical RING-HC finger has been defined as C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C. It binds two Zn ions in a unique "cross-brace" arrangement, which distinguishes it from tandem zinc fingers and other similar motifs. RING-HC finger can be found in a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle, and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serve as scaffolds for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates. 39
33989 319364 cd16450 mRING-C3HGC3_RFWD3 Modified RING finger, C3HGC3-type, found in RING finger and WD repeat domain-containing protein 3 (RFWD3) and similar proteins. RFWD3, also known as RING finger protein 201 (RNF201) or FLJ10520, is an E3 ubiquitin-protein ligase that forms a complex with Mdm2 and p53 to synergistically ubiquitinate p53 and acts as a positive regulator of p53 stability in response to DNA damage. It is phosphorylated by checkpoint kinase ATM/ATR and the phosphorylation mutant fails to stimulate p53 ubiquitination. RFWD3 also functions as a novel replication protein A (RPA)-associated protein involved in DNA replication checkpoint control. RFWD3 contains an N-terminal SQ-rich region followed by a RING finger domain that exhibits robust E3 ubiquitin ligase activity toward p53, a coiled-coil domain and three WD40 repeats in the C-terminus, the latter two of which may be responsible for protein-protein interaction. The RING finger in this family is a modified C3HGC3-type RING finger, but not a canonical C3H2C3-type RING-H2 finger or C3HC4-type RING-HC finger. 49
33990 319365 cd16451 mRING_PEX12 Modified RING finger found in peroxin-12 (PEX12) and similar proteins. PEX12, also known as peroxisome assembly protein 12 or peroxisome assembly factor 3 (PAF-3), is a RING finger domain-containing integral membrane peroxin required for protein import into peroxisomes. Mutations in human PEX12 result in the peroxisome deficiency Zellweger syndrome of complementation group III (CG-III), a lethal neurological disorder. PEX12 also functions as an E3-ubiquitin ligase that facilitates the PEX4-dependent monoubiquitination of PEX5, a key player in peroxisomal matrix protein import, to control PEX5 receptor recycling or degradation. PEX12 contains a modified RING finger that lacks the third, fourth, and eighth zinc-binding residues of the consensus RING finger motif, suggesting PEX12 may only bind one zinc ion. 42
33991 319366 cd16452 SP-RING_like A group of variants of RING finger including SP-RING finger, SPL-RING finger, dRING finger, and RING-like Rtf2 domain. The family corresponds to a group of proteins with variants of RING fingers that are characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. They include SP-RING finger found in the Siz/PIAS RING (SP-RING) family of SUMO E3 ligases, SPL-RING finger found in E3 SUMO-protein ligase NSE2, degenerated RING (dRING) finger found in Saccharomyces cerevisiae required for meiotic nuclear division protein 5 (Rmd5p) and homologs, and RING-like Rtf2 domain found in the replication termination factor 2 (Rtf2) protein family. The SP-RING family includes PIAS (protein inhibitor of activated STAT) proteins, Zmiz proteins, and Siz proteins from plants and fungi. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. NSE2, also known as MMS21 homolog (MMS21) or non-structural maintenance of chromosomes element 2 homolog (Non-SMC element 2 homolog, NSMCE2), is an autosumoylating small ubiquitin-like modifier (SUMO) ligase required for the response to DNA damage. It regulates sumoylation and nuclear-to-cytoplasmic translocation of skeletal and heart muscle-specific variant of the alpha subunit of nascent polypeptide associated complex (skNAC)-Smyd1 in myogenesis. It is also required for resisting extrinsically induced genotoxic stress. Rmd5p, also known as glucose-induced degradation protein 2 (Gid2) or sporulation protein RMD5, is an E3 ubiquitin ligase that forms the heterodimeric E3 ligase unit of the glucose induced degradation deficient (GID) complex with Gid9 (also known as Fyv10), which has a degenerated RING finger as well. 
The GID complex triggers polyubiquitylation and subsequent proteasomal degradation of the gluconeogenic enzymes fructose-1, 6-bisphosphate by fructose-1, 6-bisphosphatase (FBPase), phosphoenolpyruvate carboxykinase (PEPCK), and cytoplasmic malate dehydrogenase (c-MDH). The Rtf2 protein family includes a group of conserved proteins found in eukaryotes ranging from fission yeast to humans. The defining member of the family is Schizosaccharomyces pombe Rtf2 (SpRtf2), which is a proliferating cell nuclear antigen-interacting protein that functions as a key requirement for efficient replication termination at the site-specific replication barrier RTS1. It promotes termination at RTS1 by preventing replication restart. The RING-like Rtf2 domain in fission yeast is required to stabilize a paused DNA replication fork during imprinting at the mating type locus, possibly by facilitating sumoylation of PCNA. The family also includes Arabidopsis RTF2 (AtRTF2), an essential nuclear protein required for both normal embryo development and for proper expression of the GFP reporter gene. It plays a critical role in splicing the GFP pre-mRNA, and may also have a more transient regulatory role during the spliceosome cycle. The biological function of Rtf2 homologs found in eumetazoa remains unclear. 42
33992 319367 cd16453 RING-Ubox U-box domain, a modified RING finger. The U-box protein family is a family of E3 enzymes that also includes the HECT family and the RING finger family. E3 enzyme is ubiquitin-protein ligase that cooperates with a ubiquitin-activating enzyme (E1) and a ubiquitin-conjugating enzyme (E2), and plays a central role in determining the specificity of the ubiquitination system. It removes the ubiquitin molecule from E2 enzyme and attaches it to the target substrate, forming a covalent bond between ubiquitin and the target. U-box proteins are characterized by the presence of a U-box domain of approximately 70 amino acids. U-box is a modified form of the RING finger domain that lacks metal chelating cysteines and histidines. It resembles the cross-brace RING structure consisting of three beta-sheets and a single alpha-helix, which would be stabilized by salt bridges instead of chelated metal ions. U-box proteins are widely distributed among eukaryotic organisms and show a higher prevalence in plants than in other organisms. 40
33993 319368 cd16454 RING-H2_PA-TM-RING RING finger, H2 subclass, found in the PA-TM-RING ubiquitin ligase family. The PA-TM-RING family represents a group of transmembrane-type E3 ubiquitin ligases, which has been characterized by an N-terminal transient signal peptide, a PA (protease-associated) domain, a TM (transmembrane) domain, as well as a C-terminal C3H2C3-type RING-H2 finger domain. It includes RNF13, RNF167, ZNRF4 (zinc and RING finger 4), GRAIL (gene related to anergy in lymphocytes)/RNF128, RNF130, RNF133, RNF148, RNF149 and RNF150 (which are more closely related), as well as RNF43 and ZNRF3 which have substantially longer C-terminal tail extensions compared with the others. PA-TM-RING proteins are expressed at low levels in all mammalian tissues and species, but they are not present in yeast. They play a common regulatory role in intracellular trafficking/sorting, suggesting that abrogation of their function may result in dysregulation of cellular signaling events in cancer. 43
33994 319369 cd16455 RING-H2_AMFR RING finger, H2 subclass, found in autocrine motility factor receptor (AMFR) and similar proteins. AMFR, also known as AMF receptor, or RING finger protein 45, or ER-protein gp78, is an internalizing cell surface glycoprotein localized in both plasma membrane caveolae and the endoplasmic reticulum (ER). It is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. AMFR also functions as a RING finger-dependent ubiquitin protein ligase (E3) implicated in degradation from the ER. AMFR contains an N-terminal RING-H2 finger and a C-terminal ubiquitin-associated (UBA)-like CUE domain. 44
33995 319370 cd16456 RING-H2_APC11 RING finger, H2 subclass, found in anaphase-promoting complex subunit 11 (APC11) and similar proteins. APC11, also known as cyclosome subunit 11, or hepatocellular carcinoma-associated RING finger protein, is a C3H2C3-type RING-H2 protein that facilitates ubiquitin chain formation by recruiting ubiquitin-charged ubiquitin conjugating enzymes (E2) through its RING-H2 domain. APC11 and its partner the cullin-like subunit APC2 form the dynamic catalytic core of the gigantic, multisubunit 1.2-MDa anaphase-promoting complex/cyclosome (APC), also known as the cyclosome, which is a ubiquitin-protein ligase (E3) composed of at least 12 subunits and controls cell division by ubiquitinating cell cycle regulators, such as cyclin B and securin, to drive their timely degradation. APC11 can be inhibited by hydrogen peroxide, which may contribute to the delay in cell cycle progression through mitosis that is characteristic of cells subjected to oxidative stress. APC11 contains a canonical RING-H2-finger domain, which includes one histidine and seven cysteine residues that coordinate two Zn2+ ions. In addition, it contains a third Zn2+-binding site and the third Zn2+ ion is not essential for its ligase activity. 63
33996 319371 cd16457 RING-H2_BRAP2 RING finger, H2 subclass, found in BRCA1-associated protein (BRAP2) and similar proteins. BRAP2, also known as impedes mitogenic signal propagation (IMP), RING finger protein 52, or renal carcinoma antigen NY-REN-63, is a novel cytoplasmic protein interacting with the two functional nuclear localization signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 serves as a cytoplasmic retention protein and plays a role in the regulation of nuclear protein transport. It contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3H2C3-type RING-H2 finger and a UBP-type zinc finger. 44
33997 319372 cd16458 RING-H2_Dmap_like RING finger, H2 subclass, found in defective in mitotic arrest proteins (Dmap) and similar proteins. The subfamily includes one Schizosaccharomyces pombe protein Dma1 (SpDma1p), two Saccharomyces cerevisiae proteins, Dma1 (ScDma1p) and Dma2 (ScDma2p), and their homologs from fungi. SpDma1p was originally isolated as multicopy suppressor of the temperature-sensitive growth phenotype caused by cdc16 mutations. It functions to prevent mitotic exit and cytokinesis during spindle checkpoint arrest by inhibiting septation initiation network (SIN) signaling. ScDma1p and ScDma2p, also known as checkpoint forkhead associated with RING domains-containing protein 1 and 2 respectively, seem to be functionally redundant. They are involved in proper septin ring positioning and cytokinesis. The simultaneous lack of Dma1 and Dma2 leads to spindle mispositioning and defects in the spindle position checkpoint. All members in this family contain a forkhead-associated domain (FHA) and a C3H2C3-type RING-H2 finger, the latter suggesting they may possibly possess E3 ubiquitin-ligase activities. 47
33998 319373 cd16459 RING-H2_DTX1_like RING finger, H2 subclass, found in E3 ubiquitin-protein ligase Deltex1 (DTX1), Deltex2 (DTX2), Deltex4 (DTX4), and similar proteins. This family includes Drosophila melanogaster Deltex, its vertebrate homologs, DTX1, DTX2, and DTX4, and other similar proteins mainly from eumetazoa. Deltex is a ubiquitously expressed cytoplasmic ubiquitin E3 ligase that mediates Notch activation in Drosophila. It selectively suppresses T-cell activation through degradation of a key signaling molecule, MAP kinase kinase kinase 1 (MEKK1). It also inhibits Jun-mediated transcription at the stage of Ras-dependent Jun N-terminal protein kinase (JNK) activation. Deltex contains N-terminal two Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a RING finger at the C-terminus. The vertebrate homologs of Deltex have been involved in Notch signaling and neurogenesis. The mammalian DTX1 is most closely related to the Drosophila Deltex. Both of them bind to SH3-domain containing protein Grb2 and further inhibit E2A. DTX1 functions as a Notch downstream transcription regulator. It interacts with the transcription coactivator p300 and inhibits transcription activation mediated by the neural specific transcription factor MASH1. It is also a transcription target of nuclear factor of activated T cells (NFAT) and participated in T cell anergy and Foxp3 protein level maintenance in vivo. Moreover, DTX1 promotes protein kinase C theta degradation and sustains Casitas B-lineage lymphoma expression. DTX4, also known as RING finger protein 155, shares the highest degree of sequence similarity with DTX1. So it likely interacts with the intracellular domain of Notch as well. Like DTX1 and DTX4, DTX2 is expressed in thymocytes. It interacts with the intracellular domain of Notch receptors and acts as a negative regulator of Notch signals in T cells. 
However, the endogenous levels of DTX1 and DTX2 are not important for regulating Notch signals during thymocyte development. In contrast to other DTXs, DTX3 does not contain N-terminal two Notch-binding WWE domains, but a short unique N-terminal domain. It does not interact with the intracellular domain of Notch. In addition, it has a different class of RING finger (C3HC4 type or RING-HC subclass) than do the other DTXs which harbor a C3H2C3-type RING-H2 finger. Thus DTX3 is not included in this family. 64
33999 319374 cd16460 RING-H2_DZIP3 RING finger, H2 subclass, found in DAZ (deleted in azoospermia)-interacting protein 3 (DZIP3) and similar proteins. DZIP3, also known as RNA-binding ubiquitin ligase of 138 kDa (RUL138) or 2A-HUB protein, is an RNA-binding E3 ubiquitin-protein ligase that interacts with coactivator-associated arginine methyltransferase 1 (CARM1) and acts as a transcriptional coactivator of estrogen receptor (ER) alpha. It is also a histone H2A ubiquitin ligase that catalyzes monoubiquitination of H2A at lysine 119, functioning as a combinatoric component of the repression machinery required for repressing a specific chemokine gene expression program, critically modulating migratory responses to Toll-like receptors (TLR) activation. DZIP3 contains a C3H2C3-type RING-H2 finger at the C-terminus. 43
34000 319375 cd16461 RING-H2_EL5_like RING finger, H2 subclass, found in rice E3 ubiquitin-protein ligase EL5 and similar proteins. EL5, also known as protein ELICITOR 5, is an E3 ubiquitin-protein ligase containing an N-terminal transmembrane domain and a C3H2C3-type RING-H2 finger that is a binding site for ubiquitin-conjugating enzyme (E2). It can be rapidly induced by N-acetylchitooligosaccharide elicitor. EL5 catalyzes polyubiquitination via the Lys48 residue of ubiquitin, and thus plays a crucial role as a membrane-anchored E3 in the maintenance of cell viability after the initiation of root primordial formation in rice. It also acts as an anti-cell death enzyme that might be responsible for mediating the degradation of cytotoxic proteins produced in root cells after the actions of phytohormones. Moreover, EL5 interacts with UBC5b, a rice ubiquitin carrier protein, through its RING-H2 finger. EL5 is an unstable protein, and its degradation is regulated by the C3H2C3-type RING-H2 finger in a proteasome-independent manner. 43
34001 319376 cd16462 RING-H2_Pep3p_like RING finger, H2 subclass, found in Saccharomyces cerevisiae vacuolar membrane protein PEP3 (Pep3p) and similar proteins. Pep3p, also known as carboxypeptidase Y-deficient protein 3, vacuolar morphogenesis protein 8, vacuolar protein sorting-associated protein 18 (Vps18p), or vacuolar protein-targeting protein 18, is a vacuolar membrane protein that affects late Golgi functions required for vacuolar protein sorting and efficient alpha-factor prohormone maturation. It is required for vacuolar biogenesis caused hypersensitivity to heat shock and ethanol stresses, probably due to disappearance of normal vacuoles. As a component of the homotypic fusion and vacuole protein sorting (HOPS) and class C core vacuole/endosome tethering (CORVET) complexes, its overexpression shortens lag phase but does not alter growth rate in Saccharomyces cerevisiae exposed to acetic acid stress. Moreover, Pep3p forms the Class C Vps protein complex (C-Vps complex) with Pep5p (also known as Vps11), Vps16, and Vps33, and is necessary for trafficking of hydrolase precursors to the vacuole by promoting vesicular docking reactions with SNARE proteins. Pep3p contains a C3H2C3-type RING-H2 finger at the C-terminus. 50
34002 319377 cd16463 RING-H2_PHR RING finger, H2 subclass, found in the PHR protein family. The PHR protein family represents an evolutionally conserved family of large proteins including human E3 ubiquitin ligase protein associated with Myc (Pam) and its homologs, Phr1 (for Pam/Highwire/RPM-1) in mouse, Highwire (HIW) in Drosophila, RPM-1 (regulator of presynaptic morphology 1) in Caenorhabditis elegans, and Esrom in zebrafish. Those proteins are large E3 ubiquitin ligases containing regulator of chromosome condensation (RCC) homology domains (RHD-1 and RHD-2) with inferred guanine exchange factor (GEF) activity, a Myc-binding domain, a B-box zinc finger, and a C-terminal C3H2C3-type RING-H2 finger with E3 ubiquitin (Ub) ligase activity. They play an important role in axon guidance and synaptogenesis. They regulate synapse formation and growth in mammals, zebrafish, Drosophila, and Caenorhabditis elegans, and may control a variety of signaling pathways, including cAMP signaling in mammalian cells, JNK/p38 MAPK signaling in Drosophila and C. elegans, and bone morphogenetic protein signaling in Drosophila. Pam also known as Myc-binding protein 2 (MYCBP2), or Pam/highwire/rpm-1 protein (PHR1), negatively regulates neuronal growth, synaptogenesis and synaptic plasticity by modulating several signaling pathways including the p38 MAPK signaling cascade. It also participates in receptor and ion channel internalization, such as regulating internalization of transient receptor potential vanilloid receptor 1 (TRPV1) in peripheral sensory neurons, as well as duration of thermal hyperalgesia through p38 MAPK. It interacts with neuron-specific electroneutral potassium (K+) and chloride (Cl-) cotransporter KCC2 and modulates its function. Moreover, Pam genetically interacts with Robo2 to modulate axon guidance in the olfactory system. 
It also associates with tuberous sclerosis complex (TSC) proteins, ubiquitinating TSC2 and regulating mammalian/mechanistic target of rapamycin (mTOR) signaling. Furthermore, Pam is the longest lasting nontranscriptional regulator of adenylyl cyclase activity, and can mediate sustained inhibition of cAMP signaling by sphingosine-1-phosphate. It is also involved in spinal nociceptive processing. Phr1 is an essential regulator of retinal ganglion cell projection during both dorsal lateral geniculate nucleus (dLGN) and superior colliculus (SC) topographic map development. RPM-1 positively regulates a Rab GTPase pathway to promote vesicular trafficking via late endosomes, thereby regulating synapse formation and axon termination. Esrom has E3 ligase activity and modulates the amount of phosphorylated Tuberin, a tumor suppressor, in growth cones. It is required in formation of the retinotectal projection. 55
34003 319378 cd16464 RING-H2_Pirh2 RING finger, H2 subclass, found in p53-induced RING-H2 protein (Pirh2) and similar proteins. Pirh2, also known as RING finger and CHY zinc finger domain-containing protein 1 (Rchy1), androgen receptor N-terminal-interacting protein, CH-rich-interacting match with PLAG1, RING finger protein 199 (RNF199), or zinc finger protein 363 (ZNF363), is a p53 inducible E3 ubiquitin-protein ligase that functions as a negative regulator of p53. It ubiquitylates preferably the tetrameric form of p53 in vitro and in vivo, suggesting a role of Pirh2 in downregulating the transcriptionally active form of p53 in the cell. Moreover, Pirh2 inhibits p73, a homolog of the tumor suppressor p53, transcriptional activity by promoting its ubiquitination. It also monoubiquitinates DNA polymerase eta (PolH) to suppress translesion DNA synthesis. Furthermore, Pirh2 functions as a negative regulator of the cyclin-dependent kinase inhibitor p27(Kip1) function by promoting ubiquitin-dependent proteasomal degradation. In addition, Pirh2 enhances androgen receptor (AR) signaling through inhibition of histone deacetylase 1 (HDAC1) and is overexpressed in prostate cancer. Pirh2 also interacts with TIP60 and the TIP60-Pirh2 association may regulate Pirh2 stability. In addition, the oncoprotein pleomorphic adenoma gene like 2 (PLAGL2) can bind to Pirh2 dimer and therefore control the stability of Pirh2. Pirh2 contains a total of nine zinc-binding sites with six located at the N-terminal region, two in the C3H2C3-type RING-H2 domain, and one in the C-terminal region. Nine zinc binding sites comprise three different zinc coordination schemes, including RING type cross-brace zinc coordination, C4 zinc finger, and a novel left-handed beta-spiral zinc-binding motif formed by three recurrent CCHC sequence motifs. 45
34004 319379 cd16465 RING-H2_PJA1_2 RING finger, H2 subclass, found in protein E3 ubiquitin-protein ligase Praja-1, Praja-2, and similar proteins. The family includes two highly similar E3 ubiquitin-protein ligases Praja-1 and Praja-2. Praja-1, also known as RING finger protein 70, is a RING-H2 finger ubiquitin ligase encoded by gene PJA1, a novel human X chromosome gene abundantly expressed in brain. It has been implicated in bone and liver development, as well as memory formation and X-linked mental retardation (MRX). Praja-1 interacts with and activates the ubiquitin-conjugating enzyme UbcH5B, and shows E2-dependent E3 ubiquitin ligase activity. It is a 3-deazaneplanocin A (DZNep)-induced ubiquitin ligase that directly ubiquitinates individual polycomb repressive complex 2 (PRC2) subunits in a cell free system, which leads to their proteasomal degradation. It also plays an important role in neuronal plasticity, which is the basis for learning and memory. Moreover, Praja-1 ubiquitinates embryonic liver fodrin (ELF) and Smad3, but not Smad4, in a transforming growth factor-beta (TGF-beta)-dependent manner. It controls ELF abundance through ubiquitin-mediated degradation, and further regulates TGF-beta signaling, which plays a key role in the suppression of gastric carcinoma. Furthermore, Praja-1 regulates the transcription function of the homeodomain protein Dlx5 by controlling the stability of the Dlx/Msx-interacting MAGE/Necdin family protein, Dlxin-1, via an ubiquitin-dependent degradation pathway. Praja-2, also known as RING finger protein 131, or NEURODAP1, or KIAA0438, is an E2-dependent E3 ubiquitin ligase that interacts with and activates the ubiquitin-conjugating enzyme UbcH5B. It functions as an A-kinase anchoring protein (AKAP)-like E3 ubiquitin ligase that plays a critical role in controlling cyclic AMP (cAMP) dependent PKA activity and pro-survival signaling, and further promotes cell proliferation and growth. 
Praja-2 is also involved in protein sorting at the postsynaptic density region of axosomatic synapses and possibly plays a role in synaptic communication and plasticity. Moreover, Praja-2, together with the AMPK-related kinase SIK2 and the CDK5 activator CDK5R1/p35, forms a SIK2-p35-PJA2 complex that plays an essential role for glucose homeostasis in pancreatic beta cell functional compensation. Furthermore, Praja-2 ubiquitylates and degrades Mob, a core component of NDR/LATS kinase and a positive regulator of the tumor-suppressor Hippo signaling. Both Praja-1 and Praja-2 contain a potential nuclear localization signal (NLS) and a C-terminal C3H2C3-type RING-H2 motif. 46
34005 319380 cd16466 RING-H2_RBX2 RING finger, H2 subclass, found in RING-box protein 2 (RBX2) and similar proteins. RBX2, also known as CKII beta-binding protein 1 (CKBBP1), RING finger protein 7 (RNF7), regulator of cullins 2 (ROC2), or sensitive to apoptosis gene protein (SAG), is an E3 ubiquitin-protein ligase that protects cells from apoptosis, confers radioresistance, and plays an essential and non-redundant role in embryogenesis and vasculogenesis. It promotes ubiquitination and degradation of a number of protein substrates, including c-JUN, DEPTOR, HIF-1alpha, IkappaBalpha, NF1, NOXA, p27, and procaspase-3, thus regulating various signaling pathways and biological processes. RBX2 is necessary for ubiquitin ligation activity of the multimeric cullin (Cul)-RING E3 ligases (CRLs). RBX2-containing CRLs are involved in NEDD8 pathway and RBX2 specifically regulates NEDD8ylation of Cul5. It can bind and activate HIV-1 Vif-Cullin5 E3 ligase complex in vitro. It is also a substrate of NEDD4-1 E3 ubiquitin ligase and mediates NEDD4-1 induced chemosensitization. The inactivation of RBX2 E3 ubiquitin ligase activity triggers senescence and inhibits Kras-induced immortalization. Endothelial deletion of RBX2 causes embryonic lethality and blocks tumor angiogenesis, suggesting a way for anti-angiogenesis therapy of human cancer. Moreover, as a component of Cullin 5-RING E3 ubiquitin ligase (CRL5) complex, RBX2 regulates neuronal migration through different CRL5 adaptors, such as SOCS7. Furthermore, RBX2 functions as a redox inducible antioxidant protein that scavenges oxygen radicals by forming inter- and intra-molecular disulfide bonds when acting alone. RBX2 contains a C-terminal C3H2C3-type RING-H2 finger that is essential for its ligase activity. 60
34006 319381 cd16467 RING-H2_RNF6_like RING finger, H2 subclass, found in E3 ubiquitin-protein ligase RNF6, RNF12, and similar proteins. RNF6 is an androgen receptor (AR)-associated protein that induces AR ubiquitination and promotes AR transcriptional activity. RNF6-induced ubiquitination may regulate AR transcriptional activity and specificity through modulating cofactor recruitment. RNF6 is overexpressed in hormone-refractory human prostate cancer tissues and required for prostate cancer cell growth under androgen-depleted conditions. Moreover, RNF6 regulates local serine/threonine kinase LIM kinase 1 (LIMK1) levels in axonal growth cones. RNF6-induced LIMK1 polyubiquitination is mediated via K48 of ubiquitin and leads to proteasomal degradation of the kinase. RNF6 also binds and upregulates the Inha promoter, and functions as a transcription regulatory protein in the mouse Sertoli cell. Furthermore, RNF6 acts as a potential tumor suppressor gene involved in the pathogenesis of esophageal squamous cell carcinoma (ESCC). RNF12, also known as LIM domain-interacting RING finger protein, or RING finger LIM domain-binding protein (R-LIM), is an E3 ubiquitin-protein ligase encoded by gene RLIM that is crucial for normal embryonic development in some species and for normal X inactivation in mice. It thus functions as a major sex-specific epigenetic regulator of female mouse nurturing tissues. RNF12 is widely expressed during embryogenesis, and mainly localizes to the cell nucleus, where it regulates the levels of many proteins, including CLIM, LMO, HDAC2, TRF1, SMAD7, and REX1, by proteasomal degradation. Both RNF6 and RNF12 contain a well conserved C3H2C3-type RING-H2 finger. 43
34007 319382 cd16468 RING-H2_RNF11 RING finger, H2 subclass, found in RING finger protein 11 (RNF11) and similar proteins. RNF11 is an E3 ubiquitin-protein ligase that acts both as an adaptor and a modulator of itch-mediated control of ubiquitination events underlying membrane traffic. It is the downstream of an enzymatic cascade for the ubiquitination of specific substrates. It is also a molecular adaptor of homologous to E6-associated protein C-terminus (HECT)-type ligases. RNF11 has been implicated in the regulation of several signaling pathways. It enhances the transforming growth factor receptor (TGFR) signaling by both abrogating Smurf2-mediated receptor ubiquitination and by promoting the Smurf2-mediated degradation of AMSH (associated molecule with the SH3 domain of STAM), a de-ubiquitinating enzyme that enhances transforming growth factor-beta (TGF-beta) signaling and epidermal growth factor receptor (EGFR) endosomal recycling. It also acts directly on Smad4 to enhance Smad4 function, and plays a role in prolonged TGF-beta signaling. Moreover, RNF11 functions as a critical component of the A20 ubiquitin-editing protein complex that negatively regulates tumor necrosis factor (TNF)-mediated nuclear factor (NF)-kappaB activation. It also interacts with Smad anchor for receptor activation (SARA) and the endosomal sorting complex required for transport (ESCRT)-0 complex, thus participating in the regulation of lysosomal degradation of EGFR. Furthermore, RNF11 acts as a novel GGA cargo actively participating in regulating the ubiquitination of the GGA protein family. In addition, RNF11 functions together with TAX1BP1 to target TANK-binding kinase 1 (TBK1)/IkappaB kinase IKKi, and further restricts antiviral signaling and type I interferon (IFN)-beta production. 
RNF11 contains an N-terminal PPPY motif that binds WW domain-containing proteins such as AIP4/itch, Nedd4 and Smurf1/2 (SMAD-specific E3 ubiquitin-protein ligase 1/2), and a C-terminal C3H2C3-type RING-H2 finger that functions as a scaffold for the coordinated transfer of ubiquitin to substrate proteins together with the E2 enzymes UbcH527 and Ubc13. 43
34008 319383 cd16469 RING-H2_RNF24_like RING finger, H2 subclass, found in RING finger proteins RNF24, RNF122, and similar proteins. The family includes RNF24, RNF122, and similar proteins. RNF24 is an intrinsic membrane protein localized in the Golgi apparatus. It specifically interacts with the ankyrin-repeats domains (ARDs) of TRPC1, -3, -4, -5, -6, and -7, and affects TRPC intracellular trafficking without affecting their activity. RNF122 is a RING finger protein associated with HEK 293T cell viability. It is localized to the endoplasmic reticulum (ER) and the Golgi apparatus, and overexpressed in anaplastic thyroid cancer cells. RNF122 functions as an E3 ubiquitin ligase that can ubiquitinate itself and undergoes degradation through its RING finger in a proteasome-dependent manner. Both RNF24 and RNF122 contain an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger. 47
34009 319384 cd16470 RING-H2_RNF25 RING finger, H2 subclass, found in RING finger protein 25 (RNF25) and similar proteins. RNF25, also known as AO7, is a putative E3 ubiquitin-protein ligase that was initially identified as an interacting protein with a ubiquitin-conjugating enzyme, Ubc5B. It is ubiquitously expressed in various tissues and predominantly localized in the nucleus. RNF25 activates the nuclear factor (NF)-kappaB-dependent gene expression upon stimulation with Interleukin-1 beta (IL-1beta), or tumor necrosis factor (TNF), or overexpression of NF-kappaB-inducing kinase. It interacts with the p65 transactivation domain (TAD) and modulates its transcriptional activity. RNF25 contains an N-terminal RWD domain, a C3H2C3-type RING-H2 finger, and a C-terminal Pro-rich region. Both the RING-H2 finger and the C-terminal regions of RNF25 are necessary for the transcriptional activation. 68
34010 319385 cd16471 RING-H2_RNF32 RING finger, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed in testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway. 49
34011 319386 cd16472 RING-H2_RNF38_like RING finger, H2 subclass, found in RING finger proteins RNF38, RNF44, and similar proteins. The family includes RING finger proteins RNF38, RNF44, and similar proteins. RNF38 is a nuclear E3 ubiquitin protein ligase that plays a role in regulating p53. RNF44 is an uncharacterized RING finger protein that shows high sequence similarity with RNF38. Both RNF38 and RNF44 contain a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger. In addition, RNF38 harbors two potential nuclear localization signals. 45
34012 319387 cd16473 RING-H2_RNF103 RING finger, H2 subclass, found in RING finger protein 103 (RNF103) and similar proteins. RNF103, also known as KF-1, or zinc finger protein 103 homolog (Zfp-103), is an endoplasmic reticulum (ER)-resident E3 ubiquitin-protein ligase that is widely expressed in many different organs, including brain, heart, kidney, spleen, and lung. It is involved in the ER-associated degradation (ERAD) pathway through interacting with components of the ERAD pathway, including Derlin-1 and VCP. RNF103 contains several hydrophobic regions at its N-terminal and middle regions, as well as a C-terminal C3H2C3-type RING-H2 finger. 46
34013 319388 cd16474 RING-H2_RNF111_like RING finger, H2 subclass, found in RING finger proteins RNF111, RNF165, and similar proteins. The family includes RING finger proteins RNF111, RNF165, and similar proteins. RNF111, also known as Arkadia, is a nuclear E3 ubiquitin-protein ligase that targets intracellular effectors and modulators of transforming growth factor beta (TGF-beta)/Nodal-related signaling for polyubiquitination and proteasome-dependent degradation. It also interacts with the clathrin-adaptor 2 (AP2) complex and regulates endocytosis of certain cell surface receptors, leading to modulation of epidermal growth factor (EGF) and possibly other signaling pathways. The N-terminal half of RNF111 harbors three SUMO-interacting motifs (SIMs). It thus functions as a SUMO-targeted ubiquitin ligase (STUbL) that directly links nonproteolytic ubiquitylation and SUMOylation in the DNA damage response, as well as triggers degradation of signal-induced polysumoylated proteins, such as the promyelocytic leukemia protein (PML). RNF165, also known as Arkadia-like 2, or Arkadia2, or Ark2C, is an E3 ubiquitin ligase with homology to C-terminal half of RNF111. It is expressed specifically in the nervous system, and can serve to amplify neuronal responses to specific signals. It thus acts as a positive regulator of bone morphogenetic protein (BMP)-Smad signaling that is involved in motor neuron (MN) axon elongation. Both RNF165 and RNF111 contain a C-terminal C3H2C3-type RING-H2 finger. 46
34014 319389 cd16475 RING-H2_RNF121_like RING finger, H2 subclass, found in RING finger proteins RNF121, RNF175 and similar proteins. The family includes RNF121, RNF175 and similar proteins. RNF121 is an E3-ubiquitin ligase present in the endoplasmic reticulum (ER) and cis-Golgi compartments. It is a novel regulator of apoptosis. It also facilitates the degradation and membrane localization of voltage-gated sodium (NaV) channels, and thus plays a role in the quality control of NaV channels during their synthesis and subsequent transport to the membrane. Moreover, RNF121 acts as a broad regulator of nuclear factor kappaB (NF-kappaB) signaling since its silencing also dampens NF-kappaB activation following stimulation of toll-like receptors (TLRs), nod-like receptors (NLRs), RIG-I-like Receptors (RLRs) or after DNA damage. RNF121 contains five conserved transmembrane (TM) domains and a C3H2C2-type RING-H2 finger. RNF175 is an uncharacterized RING finger protein that shows high sequence similarity with RNF121. This family also includes Arabidopsis RING finger E3 ligase RHA2A, RHA2B, and their homologs. RHA2A is a novel positive regulator of abscisic acid (ABA) signaling during seed germination and early seedling development. RHA2B may play a role in the ubiquitin-dependent proteolysis pathway that responds to proteasome inhibition. All family members contain a C3H2C3-type RING-H2 finger, which is responsible for E3-ubiquitin ligase activity. 55
34015 319390 cd16476 RING-H2_RNF139_like RING finger, H2 subclass, found in RING finger proteins RNF139, RNF145, and similar proteins. RNF139, also known as translocation in renal carcinoma on chromosome 8 protein (TRC8), is an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. It is a tumor suppressor that has been implicated in a novel regulatory relationship linking the cholesterol/lipid biosynthetic pathway with cellular growth control. The mutation of RNF139 has been identified in families with hereditary renal (RCC) and thyroid cancers. RNF145 is an uncharacterized RING finger protein encoded by RNF145 gene, which is expressed in T lymphocytes, and its expression is altered in acute myelomonocytic and acute promyelocytic leukemias. Although its biological function remains unclear, RNF145 shows high sequence similarity with RNF139. Both RNF139 and RNF145 contain a C3H2C3-type RING-H2 finger with possible E3-ubiquitin ligase activity. 41
34016 319391 cd16477 RING-H2_RNF214 RING finger, H2 subclass, found in RING finger protein 214 (RNF214) and similar proteins. RNF214 is an uncharacterized RING finger protein containing a C3H2C3-type RING-H2 finger, suggesting possible E3-ubiquitin ligase activity. 45
34017 319392 cd16478 RING-H2_Rapsyn RING finger, H2 subclass, found in 43 kDa receptor-associated protein of the synapse (Rapsyn) and similar proteins. Rapsyn, also known as acetylcholine receptor-associated 43 kDa protein or RING finger protein 205 (RNF205), is a 43 kDa postsynaptic protein that plays an essential role in the clustering and maintenance of acetylcholine receptor (AChR) in the postsynaptic membrane of the motor endplate (EP). AChRs enable the transport of rapsyn from the Golgi complex to the plasma membrane through a molecule-specific interaction. Rapsyn also mediates subsynaptic anchoring of protein kinase A (PKA) type I in close proximity to the postsynaptic membrane, which is essential for synapse maintenance. Its mutations in humans cause endplate acetylcholine-receptor deficiency and myasthenic syndrome. Rapsyn contains an N-terminal myristoylation signal required for membrane association, seven tetratricopeptide repeats (TPRs) that subserve rapsyn self-association, a coiled-coil domain responsible for the binding of determinants within the long cytoplasmic loop of each AChR subunit, a C3H2C3-type RING-H2 finger that binds to the cytoplasmic domain of beta-dystroglycan and to S-NRAP and links rapsyn to the subsynaptic cytoskeleton, and a serine phosphorylation site. 47
34018 319393 cd16479 RING-H2_synoviolin RING finger, H2 subclass, found in synoviolin and similar proteins. Synoviolin, also known as synovial apoptosis inhibitor 1 (Syvn1), Hrd1, or Der3, is an endoplasmic reticulum (ER)-anchoring E3 ubiquitin ligase that functions as a suppressor of ER stress-induced apoptosis and plays a role in homeostasis maintenance. It also targets tumor suppressor gene p53 for proteasomal degradation, suggesting the crosstalk between ER associated degradation (ERAD) and p53 mediated apoptotic pathway under ER stress. Moreover, Synoviolin controls body weight and mitochondrial biogenesis through negative regulation of the thermogenic coactivator peroxisome proliferator-activated receptor coactivator (PGC)-1beta. It upregulates amyloid beta production by targeting a negative regulator of gamma-secretase, Retention in endoplasmic reticulum 1 (Rer1), for degradation. It is also involved in the degradation of endogenous immature nicastrin, and affects amyloid beta-protein generation. Moreover, Synoviolin is highly expressed in rheumatoid synovial cells and may be involved in the pathogenesis of rheumatoid arthritis (RA). It functions as an anti-apoptotic factor that is responsible for the outgrowth of synovial cells during the development of RA. It promotes inositol-requiring enzyme 1 (IRE1) ubiquitination and degradation in synovial fibroblasts with collagen-induced arthritis. Furthermore, the upregulation of Synoviolin may represent a protective response against neurodegeneration in Parkinson's disease (PD). In addition, Synoviolin is involved in liver fibrogenesis. Synoviolin contains a C3H2C2-type RING-H2 finger. 43
34019 319394 cd16480 RING-H2_TRAIP RING finger, H2 subclass, found in TRAF-interacting protein (TRAIP) and similar proteins. TRAIP, also known as RING finger protein 206 (RNF206) or TRIP, is a ubiquitously expressed nucleolar E3 ubiquitin ligase important for cellular proliferation and differentiation. It is found near mitotic chromosomes and functions as a regulator of the spindle assembly checkpoint. TRAIP interacts with tumor necrosis factor (TNF)-receptor-associated factor (TRAF) proteins and inhibits TNF-alpha-mediated nuclear factor (NF)-kappaB activation. It also interacts with two tumor suppressors CYLD and spleen tyrosine kinase (Syk), and DNA polymerase eta, which facilitates translesional synthesis after DNA damage. TRAIP contains an N-terminal C3H2C2-type RING-H2 finger and an extended coiled-coil domain. 45
34020 319395 cd16481 RING-H2_TTC3 RING finger, H2 subclass, found in Tetratricopeptide repeat protein 3 (TTC3) and similar proteins. TTC3, also known as protein DCRR1, or TPR repeat protein D, or TPR repeat protein 3, or RING finger protein 105 (RNF105), is an E3 ubiquitin-protein ligase encoded by a gene within the Down syndrome (DS) critical region on chromosome 21. It affects differentiation and Golgi compactness in neurons through specific actin-regulating pathways. It inhibits the neuronal-like differentiation of pheochromocytoma cells by activating RhoA and by binding to Citron proteins. TTC3 is an Akt-specific E3 ligase that binds to phosphorylated Akt and facilitates its ubiquitination and degradation within the nucleus. TTC3 contains four N-terminal TPR motifs, a potential coiled-coil region and a Citron binding region in the central part, and a C-terminal C3H2C2-type RING-H2 finger. 42
34021 319396 cd16482 RING-H2_UBR1_like RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-1 (UBR1), E3-alpha-2 (UBR2), and similar proteins. Two UBR family members, UBR1 and UBR2, are major N-recognin ubiquitin ligases that both function in the N-end rule degradation pathway. They can recognize substrate proteins with type-1 (basic) and type-2 (bulky hydrophobic) N-terminal residues as part of N-degrons and an internal lysine residue for ubiquitin conjugation. They also function in a quality control pathway for degradation of unfolded cytosolic proteins. Their action is stimulated by Hsp70. Moreover, UBR1 and UBR2 are negative regulators of the leucine-mTOR signaling pathway. Leucine might activate this pathway in part through inhibition of their ubiquitin ligase activity. In yeast only one E3, encoded by UBR1, mediates the recognition of substrates by the N-end rule pathway. Saccharomyces cerevisiae UBR1 also functions as an additional E3 ligase in endoplasmic reticulum-associated protein degradation (ERAD) in yeast. It can provide ubiquitin ligation activity for the ERAD substrate mutated Ste6 (sterile). Schizosaccharomyces pombe UBR1 is a critical regulator that influences the oxidative stress response via degradation of active Pap1 basic leucine zipper (bZIP) transcription factor in the nucleus. Both UBR1 and UBR2 contain an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain. 67
34022 319397 cd16483 RING-H2_UBR3 RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-3 (UBR3) and similar proteins. UBR3, also known as N-recognin-3, E3alpha-III, or zinc finger protein 650, is an E3 ubiquitin-protein ligase targeting the essential DNA repair protein APE1, also known as Ref-1, for ubiquitylation. It regulates cellular levels of APE1 and is required for genome stability. It also plays a regulatory role in sensory pathways, including olfaction. Moreover, in Drosophila, UBR3 regulates apoptosis by controlling the activity of Drosophila inhibitor of apoptosis protein 1 (DIAP1), which is required to prevent caspase activation. UBR3 contains an N-terminal ubiquitin-recognin (UBR) box, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain. 90
34023 319398 cd16484 RING-H2_Vps RING finger, H2 subclass, found in vacuolar protein sorting-associated proteins Vps8, Vps11, Vps18, Vps41, and similar proteins. This family corresponds to a group of vacuolar protein sorting-associated proteins containing a C-terminal C3H2C3-type RING-H2 finger, which includes Vps8, Vps11, Vps18, and Vps41. Vps11 and Vps18 associate with Vps16 and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. 46
34024 319399 cd16485 mRING-H2-C3H2C2D_RBX1 modified RING finger, H2 subclass (C3H2C2D-type), found in RING-box protein 1 (RBX1) and similar proteins. RBX1, also known as Hrt1, protein ZYP, RING finger protein 75 (RNF75), or regulator of cullins 1 (ROC1), is an E3 ubiquitin-protein ligase necessary for ubiquitin ligation activity of the multimeric cullin (Cul)-RING E3 ligases (CRLs). RBX1-containing CRLs are involved in the NEDD8 pathway and RBX1 specifically regulates NEDD8ylation of Cul1-4. It can also bind and activate HIV-1 Vif-Cullin5 E3 ligase complex in vitro. Moreover, RBX1 is an essential element of Skp1/Cullins/F-box (SCF) E3-ubiquitin ligase complex that targets diverse proteins for proteasome-mediated degradation. It is a direct functional target of miR-194 and plays an important role in proliferation and migration of gastric cancer (GC) cells. RBX1 is also an essential component of KEAP1/CUL3/RBX1 E3-ubiquitin ligase complex that functions as a regulator of NFE2-related factor 2 (NRF2) and plays a key role in NRF2 pathway deregulation in multiple tumor types, including ovarian carcinomas (OVCA) and papillary thyroid carcinoma (PTC). Furthermore, RBX1 associates with DDB1, Cul4A, and Fbxw5 to form the Fbxw5-DDB1-Cul4A-Rbx1 complex that may function as a dual SUMO/ubiquitin ligase suppressing c-Myb activity through sumoylation or ubiquitination. RBX1 contains a C-terminal modified RING-H2 finger that is C3H2C2D-type, rather than the canonical C3H2C3-type. The modified RING-H2 finger is essential for its ligase activity. 62
34025 319400 cd16486 mRING-H2-C3H2C2D_ZSWM2 Modified RING finger, H2 subclass (C3H2C2D-type), found in zinc finger SWIM domain-containing protein 2 (ZSWIM2) and similar proteins. ZSWIM2, also known as MEKK1-related protein X (MEX) or ZZ-type zinc finger-containing protein 2, is a testis-specific E3 ubiquitin ligase that promotes death receptor-induced apoptosis through Fas, death receptor (DR) 3 and DR4 signaling. ZSWIM2 is self-ubiquitinated and targeted for degradation through the proteasome pathway. It also acts as an E3 ubiquitin ligase, through the E2, Ub-conjugating enzymes UbcH5a, UbcH5c, or UbcH6. ZSWIM2 contains four putative zinc-binding domains including an N-terminal SWIM (SWI2/SNF2 and MuDR) domain critical for its ubiquitination, and two modified RING-H2 fingers separated by a ZZ zinc finger domain, which was required for interaction with UbcH5a and its self-association. This family corresponds to the second RING-H2 finger, which is not a canonical C3H2C3-type, but a modified C3H2C2D-type. 44
34026 319401 cd16487 mRING-H2-C3DHC3_ZFPL1 Modified RING finger, H2 subclass (C3DHC3-type), found in zinc finger protein-like 1 (ZFPL1) and similar proteins. ZFPL1, also known as zinc finger protein MCG4, is a novel mitotic Golgi phosphoprotein required for cis-Golgi integrity and efficient endoplasmic reticulum (ER)-to-Golgi transport via directly interacting with the cis-Golgi matrix protein GM130. ZFPL1 is a widely expressed integral membrane protein with two predicted zinc fingers at its N-terminus. One is a novel type of zinc finger, and the other is a modified RING-H2 finger that lacks the fourth zinc-binding residue of the consensus C3H2C3-type RING-H2 finger. It also contains a bipartite nuclear localization signal (NLS), and a leucine zipper at the C-terminus. 55
34027 319402 cd16488 mRING-H2-C3H3C2_Mio_like Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein mio and its homologs. This family contains Mio, WDR24, WDR59, and their counterparts Sea4, Sea2, and Sea3 from yeast, respectively. Mio/Sea4, Sea2/WDR24, and Sea3/WDR59 are components of GATOR2 complex, which also includes another two subunits, Seh1 and Sec13. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. All family members contain an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. 44
34028 319403 cd16489 mRING-CH-C4HC2H_ZNRF Modified RING-CH finger, H2 subclass (C4HC2H-type), found in the ZNRF family. This ZNRF family includes zinc/RING finger proteins ZNRF1, ZNRF2, and similar proteins. It has been characterized by containing a unique combination zinc finger-RING finger motif in the C-terminal region, which is evolutionarily conserved in a wide range of species, including Caenorhabditis elegans and Drosophila. The ZNRF family of proteins function as E3 ubiquitin ligases and are highly expressed in central nervous system (CNS) and peripheral nervous system (PNS) neurons, particularly during development and in adulthood. In neurons, ZNRF1 and ZNRF2 are differentially localized within the synaptic region. ZNRF1 is associated with synaptic vesicle membranes, whereas ZNRF2 is present in presynaptic plasma membranes. They are N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts. ZNRF proteins may play a role in the establishment and maintenance of neuronal transmission and plasticity via their ubiquitin ligase activity, as well as in regulating Ca2+-dependent exocytosis. The RING fingers found in ZNRF proteins are modified as C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger. 43
34029 319404 cd16490 RING-CH-C4HC3_FANCL RING-CH finger, H2 subclass (C4HC3-type), found in Fanconi anemia group L protein (FANCL) and similar proteins. FANCL, also known as fanconi anemia-associated polypeptide of 43 kDa (FAAP43) or PHF9, is a monomeric RING E3 ubiquitin-protein ligase that monoubiquitinates FANCD2 and FANCI. The monoubiquitinated FANCD2-FANCI heterodimer complex in turn recruits key proteins involved in homologous recombination and DNA repair. FANCL is also one of seven components in Fanconi anemia (FA) nuclear core complex, which provides the essential E3 ligase function for spatially defined FANCD2 ubiquitination and FA pathway activation. In the FA core complex, FANCL associates with FANCB and FAAP100 to constitute a catalytic subcomplex that functions as the monoubiquitination module. FANCL specifically interacts with the E2 ubiquitin-conjugating (UBC) enzyme Ube2T to make an E3-E2 pair, which is the catalytic center of the Fanconi Anemia (FA) pathway required for DNA interstrand crosslink repair. Moreover, FANCL has a noncanonical function to regulate the Wnt/beta-catenin signaling, a pathway involved in hematopoietic stem cell self-renewal. It functionally enhances beta-catenin activity through ubiquitinating beta-catenin, with atypical ubiquitin chains (K11 linked). FANCL contains an N-terminal E2-like fold (ELF) domain, a novel double-RWD (DRWD) domain with a clear hydrophobic core, and a C-terminal C4HC3-type RING-CH finger. The DRWD domain is required for substrate binding. The RING-CH finger, also known as vRING or RINGv, is predicted to facilitate E2 binding. It has an unusual arrangement of zinc-coordinating residues. Its cysteines and histidines are arranged in the sequence as C4HC3-type, rather than the C3H2C3-type in canonical RING-H2 finger. 58
34030 319405 cd16491 RING-CH-C4HC3_LTN1 RING-CH finger, H2 subclass (C4HC3-type), found in E3 ubiquitin-protein ligase listerin and similar proteins. Listerin, also known as RING finger protein 160 or zinc finger protein 294, is the mammalian homolog of yeast Ltn1. It is widely expressed in all tissues, but motor and sensory neurons and neuronal processes in the brainstem and spinal cord are primarily affected in the mutant. Listerin is required for embryonic development and plays an important role in neurodegeneration. It also functions as a critical E3 ligase involving quality control of nonstop proteins. It mediates ubiquitylation of aberrant proteins that become stalled on ribosomes during translation. Ltn1 works with several cofactors to form a large ribosomal subunit-associated quality control complex (RQC), which mediates the ubiquitylation and extraction of ribosome-stalled nascent polypeptide chains for proteasomal degradation. It appears to first associate with nascent chain-stalled 60S subunits together with two proteins of unknown function, Tae2 and Rqc1. Listerin contains a long stretch of HEAT (Huntingtin, Elongation factor 3, PR65/A subunit of protein phosphatase 2A, and TOR) or ARM (Armadillo) repeats in the N terminus and middle region, and a catalytic RING-CH finger, also known as vRING or RINGv, with an unusual arrangement of zinc-coordinating residues in the C-terminus. Its cysteines and histidines are arranged in the sequence as C4HC3-type, rather than the C3H2C3-type in canonical RING-H2 finger. 50
34031 319406 cd16492 RING-CH-C4HC3_NFX1_like RING-CH finger, H2 subclass (C4HC3-type), found in transcriptional repressor NF-X1, NF-X1-type zinc finger protein NFXL1, and similar proteins. NF-X1, also known as nuclear transcription factor, X box-binding protein 1, is a novel cysteine-rich sequence-specific DNA-binding protein that interacts with the conserved X-box motif of the human major histocompatibility complex (MHC) class II genes via a repeated Cys-His domain. It functions as a cytokine-inducible transcriptional repressor that plays an important role in regulating the duration of an inflammatory response by limiting the period in which class II MHC molecules are induced by interferon gamma (IFN- gamma). NFXL1, also known as NF-X1-type zinc finger protein NFXL1 or ovarian zinc finger protein (OZFP), is encoded by a novel human cytoplasm-distribution zinc finger protein (CDZFP) gene. This family also includes human transcription factor NF-X1 homologs from insects, plants, and fungi. Drosophila melanogaster shuttle craft (STC) is a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system. It functions as a putative transcription factor and plays an essential role in the completion of embryonic development. In contrast to NF-X1, STC contains an RD domain. The Arabidopsis genome encodes two NF-X1 homologs, AtNFXL1 and AtNFXL2, both of which function as regulators of salt stress responses. The AtNFXL1 protein is a nuclear factor that positively affects adaptation to salt stress. It also functions as a negative regulator of the type A trichothecene phytotoxin-induced defense response. AtNFXL2 controls abscisic acid (ABA) levels and suppresses ABA responses. It may also prevent unnecessary and costly stress adaptation under favorable conditions. FKBP12-associated protein 1 (FAP1) is a dosage suppressor of rapamycin toxicity in budding yeast. It is localized in the cytoplasm, but upon rapamycin treatment translocates to the nucleus. 
FAP1 interacts with FKBP12 in a rapamycin-sensitive manner. It is a proline-rich protein containing a novel cysteine-rich DNA-binding motif. Unique structural features of the NFX1 and NFXL proteins are the Cys-rich region and a specific RING-CH finger motif with an unusual arrangement of zinc-coordinating residues. The Cys-rich region is required for binding to specific promoter elements. It frequently comprises more than 500 amino acids and harbors several NFX1-type zinc finger domains, characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The RING-CH finger, also known as vRING or RINGv, may have E3 ligase activity. It is characterized by a C4HC3-type Zn ligand signature and additional conserved amino acids, rather than C3H2C3-type cysteines and histidines arrangement in canonical RING-H2 finger. In addition to the Cys-rich region and RING-CH finger, NFX1 contains a PAM2 motif in the N-terminus and a R3H domain in the C-terminus. 58
34032 319407 cd16493 RING-CH-C4HC3_NSE1 RING-CH finger, H2 subclass (C4HC3-type), found in non-structural maintenance of chromosomes element 1 homolog (NSE1) and similar proteins. NSE1, also known as non-SMC element 1 homolog (NSMCE1), is an E3 ubiquitin ligase that contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger. Together with its partner proteins NSE3 and NSE4, it forms a tight subcomplex of the structural maintenance of chromosomes SMC5-6 complex, which includes another two subcomplexes, SMC6-SMC5-NSE2 and NSE5-NSE6. The vRING finger is essential for normal NSE1-NSE3-NSE4 trimer formation in vitro and for damage-induced recruitment of NSE4 and SMC5 to subnuclear foci in vivo. Thus it functions as a protein-protein interaction domain required for SMC5-6 holocomplex integrity and recruitment to, or retention at, DNA lesions. The C-terminal half of NSE1, including the vRING finger, is required for DNA damage resistance and mitotic fidelity of SMC5-6 complex in the fission yeast Schizosaccharomyces pombe. The RING-CH finger may play an important role in Rad52-dependent postreplication repair of UV-damaged DNA in Saccharomyces cerevisiae. 47
34033 319408 cd16494 RING-CH-C4HC3_ZSWM2 RING-CH finger, H2 subclass (C4HC3-type), found in zinc finger SWIM domain-containing protein 2 (ZSWIM2) and similar proteins. ZSWIM2, also known as MEKK1-related protein X (MEX) or ZZ-type zinc finger-containing protein 2, is a testis-specific E3 ubiquitin ligase that promotes death receptor-induced apoptosis through Fas, death receptor (DR) 3, and DR4 signaling. ZSWIM2 is self-ubiquitinated and targeted for degradation through the proteasome pathway. It also acts as an E3 ubiquitin ligase, through the E2, Ub-conjugating enzymes UbcH5a, UbcH5c, or UbcH6. ZSWIM2 contains four putative zinc-binding domains including an N-terminal SWIM (SWI2/SNF2 and MuDR) domain critical for its ubiquitination and two RING fingers separated by a ZZ zinc finger domain, which was required for interaction with UbcH5a and its self-association. This family corresponds to the first RING finger, which is a C4HC3-type RING-CH finger, also known as vRING or RINGv, rather than the canonical C3H2C3-type RING-H2 finger. 58
34034 319409 cd16495 RING_CH-C4HC3_MARCH RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH proteins (MARCH). The family corresponds to a novel family of membrane-associated E3 ubiquitin ligases, consisting of 11 members in mammals (MARCH1-11), which are characterized by containing an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger. Most family members have hydrophobic transmembrane spans and are localized to the plasma membrane and intracellular organelle membrane. Only MARCH7 and MARCH10 are predicted to have no transmembrane spanning region. MARCH proteins have been implicated in mediating the ubiquitination and subsequent down-regulation of cell-surface immune regulatory molecules, such as major histocompatibility complex class II and CD86, as well as in endoplasmic reticulum-associated degradation, endosomal protein trafficking, mitochondrial dynamics, and spermatogenesis. 52
34035 319410 cd16496 RING-HC_BARD1 RING finger, HC subclass, found in BRCA1-associated RING domain protein 1 (BARD-1) and similar proteins. BARD-1 is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains a C3HC4-type RING-HC finger that binds BRCA1 at its N-terminus and three tandem ankyrin repeats and tandem BRCT repeat domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage at its C-terminus. 45
34036 319411 cd16497 RING-HC_BAR RING finger, HC subclass, found in bifunctional apoptosis regulator (BAR). BAR, also known as RING finger protein 47, was originally identified as an inhibitor of Bax-induced apoptosis. It participates in the block of apoptosis induced by TNF-family death receptors (extrinsic pathway) and mitochondria-dependent apoptosis (intrinsic pathway). BAR is predominantly expressed by neurons in the central nervous system and is involved in the regulation of neuronal survival. It is an endoplasmic reticulum (ER)-associated RING-type E3 ubiquitin ligase that interacts with BI-1 protein and post-translationally regulates its stability as well as functions in ER stress. BAR contains an N-terminal C3HC4-type RING-HC finger, a SAM domain, a coiled-coil domain, and a C-terminal transmembrane (TM) domain. This family corresponds to the RING-HC finger responsible for the binding of ubiquitin conjugating enzymes (E2s). 46
34037 319412 cd16498 RING-HC_BRCA1 RING finger, HC subclass, found in breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also known as RING finger protein 53 (RNF53), is a RING finger protein encoded by tumor suppressor gene BRCA1 that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT (BRCA1 C-terminus domain) repeats at the C-terminus. 48
34038 319413 cd16499 RING-HC_BRE1_like RING finger, HC subclass, found in yeast Bre1 and its homologs from eukaryotes. Bre1 is an E3 ubiquitin-protein ligase that catalyzes monoubiquitination of histone H2B in concert with the E2 ubiquitin-conjugating enzyme, Rad6. The Rad6-Bre1-mediated histone H2B ubiquitylation modulates the formation of double-strand breaks (DSBs) during meiosis in yeast. It is also required, indirectly, for the methylation of histone 3 on lysine 4 (H3K4) and 79. RNF20, also known as BRE1A, and RNF40, also known as BRE1B, are the mammalian homologs of Bre1. They work together to form a heterodimeric Bre1 complex that facilitates the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. Moreover, Bre1 complex acts as a tumor suppressor, augmenting expression of select tumor suppressor genes and suppressing select oncogenes. Deficiency in the mammalian histone H2B ubiquitin ligase Bre1 leads to replication stress and chromosomal instability. All family members contain a C3HC4-type RING-HC finger at their C-terminus. 42
34039 319414 cd16500 RING-HC_CARP RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein CARP-1, CARP-2 and similar proteins. The CARPs family includes CARP-1 and CARP-2 proteins, both of which are E3 ubiquitin ligases that ubiquitinate apical caspases and target them for proteasome-mediated degradation. As a novel group of caspase regulators with a FYVE-type zinc finger domain, they do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8, and caspase 10. Moreover, they stabilize MDM2 by inhibiting MDM2 self-ubiquitination, as well as by targeting 14-3-3sigma for degradation. They work together with MDM2 enhancing p53 degradation, thereby inhibiting p53-mediated cell death. CARPs contain an N-terminal FYVE-like domain that can serve as a membrane-targeting or endosome localizing signal and a C-terminal C3HC4-type RING-HC finger domain. 39
34040 319415 cd16501 RING-HC_CblA_like RING finger, HC subclass, found in Dictyostelium discoideum Cbl-like protein A (CblA) and similar proteins. CblA is a Dictyostelium homolog of the Cbl proteins which are multi-domain proteins acting as key negative regulators of various receptor and non-receptor tyrosine kinases signaling. CblA upregulates STATc tyrosine phosphorylation by downregulating PTP3, the protein tyrosine phosphatase responsible for dephosphorylating STATc. STATc is a signal transducer and activator of transcription protein. Like other Cbl proteins, CblA contains a tyrosine-kinase-binding domain (TKB), a proline-rich domain, a C3HC4-type RING-HC finger, and an ubiquitin-associated (UBA) domain. TKB, also known as a phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain. This family also includes Drosophila melanogaster defense repressor 1 (Dnr1) that was identified as an inhibitor of Dredd activity in the absence of a microbial insult in Drosophila S2 cells. It inhibits the Drosophila initiator caspases Dredd and Dronc. Moreover, Dnr1 acts as a negative regulator of the Imd (immune deficiency) innate immune-response pathway. Its mutations cause neurodegeneration in Drosophila by activating the innate immune response in the brain. Dnr1 contains a FERM N-terminal domain followed by a region rich in glutamine and serine residues, a central FERM domain, and a C-terminal C3HC4-type RING-HC finger. 37
34041 319416 cd16502 RING-HC_Cbl_like RING finger, HC subclass, found in Casitas B-lineage lymphoma (Cbl) proteins. The Cbl adaptor proteins family contains a small class of RING-type E3 ubiquitin ligases with oncogenic activity, which is represented by three mammalian members, c-Cbl, Cbl-b and Cbl-c, as well as two invertebrate Cbl-family proteins, D-Cbl in Drosophila and Sli-1 in C. elegans. Cbl proteins function as potent negative regulators of various signaling cascades in a wide range of cell types. They play roles in ubiquitinating the activated tyrosine kinases and targeting them for degradation. D-Cbl associates with the Drosophila epidermal growth factor receptor (EGFR) and overexpression of D-Cbl in the eye of Drosophila embryos inhibits EGFR dependent photoreceptor cell development. Sli-1 is a negative regulator of the Let-23 receptor tyrosine kinase, an EGFR homolog, in vulva development. Cbl proteins in this family consist of a highly conserved N-terminal half that includes a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain) and a C3HC4-type RING-HC finger, both of which are required for Cbl-mediated downregulation of RTKs, and a divergent C-terminal region. 43
34042 319417 cd16503 RING-HC_CHFR RING finger, HC subclass, found in checkpoint with forkhead and RING finger domains protein (CHFR). CHFR, also known as RING finger protein 196 (RNF196), is a checkpoint protein that delays entry into mitosis in response to stress. It functions as an E3 ubiquitin ligase that ubiquitinates and degrades its target proteins, such as Aurora-A, Plk1, Kif22, and PARP-1, which are critical for proper mitotic transitions. It also plays an important role in cell cycle progression and tumor suppression, and is negatively regulated by SUMOylation-mediated proteasomal ubiquitylation. Moreover, CHFR is involved in the early stage of the DNA damage response, which mediates the crosstalk between ubiquitination and poly-ADP-ribosylation. CHFR contains a fork head associated- (FHA) and a C3HC4-type RING-HC finger. 44
34043 319418 cd16504 RING-HC_COP1 RING finger, HC subclass, found in constitutive photomorphogenesis protein 1 (COP1) and similar proteins. COP1, also known as RING finger and WD repeat domain protein 2 (RFWD2) or RING finger protein 200 (RNF200), was defined as a central regulator of photomorphogenic development in plants, which targets key transcription factors for proteasome-dependent degradation. It is localized predominantly in the nucleus, but may also be present in the cytosol. Mammalian COP1 functions as an E3 ubiquitin-protein ligase that interacts with Jun transcription factors and modulates their transcriptional activity. It also interacts with and negatively regulates the tumor-suppressor protein p53. Moreover, COP1 associates with COP9 signalosome subunit 6 (CSN6), and is involved in 14-3-3 delta ubiquitin-mediated degradation. The CSN6-COP1 link enhances ubiquitin-mediated degradation of p27(Kip1), a critical CDK inhibitor involved in cell cycle regulation, to promote cancer cell growth. Furthermore, COP1 functions as the negative regulator of ETV1 and influences prognosis in triple-negative breast cancer. COP1 contains an N-terminal extension, a C3HC4-type RING-HC finger, a coiled coil domain, and seven WD40 repeats. In human COP1, a classic leucine-rich NES, and a novel bipartite NLS is bridged by the RING-HC finger. 46
34044 319419 cd16505 RING-HC_CYHR1 RING finger, HC subclass, found in cysteine and histidine-rich protein 1 (CYHR1) and similar proteins. CYHR1, also known as cysteine/histidine-rich protein (Chrp), shows sequence similarity with the Drosophila RING finger protein Seven-in-Absentia (sina) and its murine and human siah homologs. It is a novel prognostic marker that may work as a therapeutic target in patients with esophageal squamous cell carcinoma. It is also a biomarker of the response to erythropoietin in hemodialysis patients. CYHR1 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD). 49
34045 319420 cd16506 RING-HC_DTX3_like RING finger, HC subclass, found in E3 ubiquitin-protein ligase Deltex3 (DTX3), Deltex-3-like (DTX3L) and similar proteins. The family contains Deltex3 (DTX3) and Deltex-3-like (DTX3L), both of which are E3 ubiquitin-protein ligases belonging to the Deltex (DTX) family. DTX3, also known as RING finger protein 154 (RNF154), has a biological function that remains unclear. DTX3L, also known as B-lymphoma- and BAL-associated protein (BBAP) or Rhysin-2 (Rhysin2), regulates endosomal sorting of the G protein-coupled receptor CXCR4 from endosomes to lysosomes. It also regulates subcellular localization of its partner protein, B aggressive lymphoma (BAL), by a dynamic nucleocytoplasmic trafficking mechanism. In contrast to other DTXs, both DTX3 and DTX3L contain a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain. DTX3L can associate with DTX1 through its unique N termini and further enhance self-ubiquitination. 41
34046 319421 cd16507 RING-HC_GEFO_like RING finger, HC subclass, found in Dictyostelium discoideum Ras guanine nucleotide exchange factor O (RasGEFO) and similar proteins. RasGEFO, also known as RasGEF domain-containing protein O, is one of the Ras guanine-nucleotide exchange factors (RasGEFs), which are the proteins that activate Ras through catalyzing the replacement of GDP with GTP. They are particularly important for signaling in development and chemotaxis in many organisms, including Dictyostelium. RasGEFO contains a C3HC4-type RING-HC finger that may be responsible for the E3 ubiquitin ligase activity. 40
34047 319422 cd16508 RING-HC_HAKAI_like RING finger, HC subclass, found in E3 ubiquitin-protein ligase Hakai, zinc finger protein 645 (ZNF645), and similar proteins. Hakai, also known as Casitas B-lineage lymphoma-transforming sequence-like protein 1, RING finger protein 188 (RNF188), or c-Cbl-like protein 1 (CBLL1), is an E3 ubiquitin ligase that disrupts cell-cell contacts in epithelial cells and is upregulated in human colon and gastric adenocarcinomas. It was identified to mediate the posttranslational downregulation of E-cadherin (CDH1), a major component of adherens junctions in epithelial cells and a potent tumor suppressor. It also promotes ubiquitination of several other tyrosine-phosphorylated Src substrates, including cortactin (CTTN) and DOK1. Hakai acts as a homodimer with a novel HYB (Hakai pTyr-binding) domain that forms a phosphotyrosine-binding pocket upon, and consists of a pair of monomers arranged in an anti-parallel configuration. Each monomer contains a C3HC4-type RING-HC finger and a short pTyr-B domain that incorporates a novel, atypical C2H2-type Zn-finger coordination motif. Both domains are important for dimerization. ZNF645 is a novel testis-specific E3 ubiquitin-protein ligase that plays a role in sperm production and quality control. It has a structure similar to that of the c-Cbl-like protein Hakai. In contrast to Hakai, its HYB domain demonstrates different target specificities. It interacts with v-Src-phosphorylated E-cadherin, but not to cortactin. 38
34048 319423 cd16509 RING-HC_HLTF RING finger, HC subclass, found in helicase-like transcription factor (HLTF) and similar proteins. HLTF, also known as DNA-binding protein/plasminogen activator inhibitor 1 regulator, or HIP116, or RING finger protein 80, or SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 3, or sucrose nonfermenting protein 2-like 3, is a yeast RAD5 homolog found in mammals. It has both E3 ubiquitin ligase and DNA helicase activities, and plays a pivotal role in the template-switching pathway of DNA damage tolerance. It is involved in Lys-63-linked poly-ubiquitination of proliferating cell nuclear antigen (PCNA) at Lys-164 and in the regulation of DNA damage tolerance. It shows double-stranded DNA translocase activity with 3'-5' polarity, thereby facilitating regression of the replication fork. HLTF contains an N-terminal HIRAN (HIP116 and RAD5 N-terminal) domain, a SWI/SNF helicase domain that is divided into N- and C-terminal parts by an insertion of a C3HC4-type RING-HC finger involved in the poly-ubiquitination of PCNA. 43
34049 319424 cd16510 RING-HC_IAPs RING finger, HC subclass, found in inhibitor of apoptosis proteins (IAPs). IAPs are frequently overexpressed in cancer and associated with tumor cell survival, chemoresistance, disease progression, and poor prognosis. They function primarily as negative regulators of cell death. They regulate caspases and apoptosis through the inhibition of specific members of the caspase family of cysteine proteases. In addition, IAPs have been implicated in a multitude of other cellular processes, including inflammatory signalling and immunity, mitogenic kinase signalling, proliferation and mitosis, as well as cell invasion and metastasis. IAPs in this family include cellular inhibitor of apoptosis protein c-IAP1 (BIRC2) and c-IAP2 (BIRC3), XIAP (BIRC4), BIRC7, and BIRC8, all of which contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), and a C3HC4-type RING-HC finger at the carboxyl terminus that is required for ubiquitin ligase activity. The UBA domain is only absent in mammalian homologs of BIRC7. Moreover, c-IAPs contain an additional caspase activation and recruitment domain (CARD) between UBA and C3HC4-type RING-HC domains. The CARD domain may serve as a protein interaction surface. 39
34050 319425 cd16511 vRING-HC_IRF2BP1_like variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein IRF-2BP1, IRF-2BP2, and similar proteins. The family includes IRF-2BP1, IRF-2BP2, and their homolog, IRF-2BP-like, also known as IRF-2BPL or C14orf4. IRF-2BP1 and IRF-2BP2 are nuclear proteins that bind to the C-terminal repression domain of IRF-2 and act as IRF-2-dependent transcriptional corepressors of both enhancer-activated and basal transcription. IRF-2BPL is expressed in the mediobasal hypothalamus and plays a critical function in regulating the female reproductive neuroendocrine axis. All family members contain a C-terminal C3HC4-type RING-HC finger with a partially new pattern. 56
34051 319426 cd16512 RING-HC_LNX3_like RING finger, HC subclass, found in ligand of Numb protein LNX3, LNX4, and similar proteins. The ligand of Numb protein X (LNX) family, also known as PDZ and RING (PDZRN) family, includes LNX1-5, which can interact with Numb, a key regulator of neurogenesis and neuronal differentiation. LNX5 (also known as PDZK4, or PDZRN4L) shows high sequence homology to LNX3 and LNX4, but it lacks the RING domain. LNX1-4 proteins function as E3 ubiquitin ligases and have a unique domain architecture consisting of an N-terminal RING-HC finger for E3 ubiquitin ligase activity and either two or four PDZ domains necessary for the substrate-binding. This family corresponds to LNX3/LNX4-like proteins, which contains a typical C3HC4-type RING-HC finger and two PDZ domains. 41
34052 319427 cd16513 RING1-HC_LONFs RING finger 1, HC subclass, found in the LON peptidase N-terminal domain and RING finger proteins family. The LON peptidase N-terminal domain and RING finger proteins family includes LONRF1 (also known as RING finger protein 191 or RNF191), LONRF2 (also known as RING finger protein 192 or RNF192, or neuroblastoma apoptosis-related protease), LONRF3 (also known as RING finger protein 127 or RNF127), which are characterized by containing two C3HC4-type RING-HC fingers, four tetratricopeptide (TPR) repeats, and one N-terminal domain of the ATP-dependent protease La (LON) domain at the C-terminus. Their biological functions remain unclear. This family corresponds to the first RING-HC finger. 42
34053 319428 cd16514 RING2-HC_LONFs RING finger 2, HC subclass, found in the LON peptidase N-terminal domain and RING finger proteins family. The LON peptidase N-terminal domain and RING finger proteins family includes LONRF1 (also known as RING finger protein 191 or RNF191), LONRF2 (also known as RING finger protein 192 or RNF192, or neuroblastoma apoptosis-related protease), LONRF3 (also known as RING finger protein 127 or RNF127), which are characterized by containing two C3HC4-type RING-HC fingers, four tetratricopeptide (TPR) repeats, and one N-terminal domain of the ATP-dependent protease La (LON) domain at the C-terminus. Their biological functions remain unclear. This family corresponds to the second RING-HC finger. 42
34054 319429 cd16515 RING-HC_LRSAM1 RING finger, HC subclass, found in leucine-rich repeat and sterile alpha motif-containing protein 1 (LRSAM1) and similar proteins. LRSAM1, also known as Tsg101-associated ligase (TAL), or RIFLE, is an E3 ubiquitin-protein ligase that physically associates with, and selectively ubiquitylates, Tsg101, an E2-like molecule that regulates vesicular trafficking processes in yeast and mammals. It regulates a Tsg101-associated complex responsible for the sorting of cargo into cytoplasm-containing vesicles that bud at the multivesicular body and at the plasma membrane. LRSAM1 is a multidomain protein containing an N-terminal leucine-rich repeat (LRR), followed by several recognizable motifs, including an ezrin-radixin-moesin (ERM) domain, a coiled-coil (CC) region, a SAM domain, and a C-terminal C3HC4-type RING-HC finger domain. 40
34055 319430 cd16516 RING-HC_malin RING finger, HC subclass, found in malin and similar proteins. Malin ("mal" for seizure in French), also known as NHL repeat-containing protein 1 (NHLRC1), or EPM2B, is a nuclear E3 ubiquitin-protein ligase that ubiquitinates and promotes the degradation of laforin (EPM2A encoding protein phosphatase). Malin and laforin operate as a functional complex that plays key roles in regulating cellular functions such as glycogen metabolism, unfolded cellular stress response, and proteolytic processes. They act as pro-survival factors that negatively regulate the Hipk2-p53 cell death pathway. They also negatively regulate cellular glucose uptake by preventing plasma membrane targeting of glucose transporters. Moreover, they degrade polyglucosan bodies in concert with glycogen debranching enzyme and brain isoform glycogen phosphorylase. Furthermore, they, together with Hsp70, form a new functional complex that suppresses the cellular toxicity of misfolded proteins by promoting their degradation through the ubiquitin-proteasome system. Defects in either malin or laforin may cause Lafora disease (LD), a fatal form of teenage-onset autosomal recessive progressive myoclonus epilepsy. In addition, malin may have a function independent of laforin in lysosomal biogenesis and/or lysosomal glycogen disposal. Malin contains six NHL-repeat protein-protein interaction domains and a C3HC4-type RING-HC finger. 48
34056 319431 cd16517 RING-HC_MAT1 RING finger, HC subclass, found in RING finger protein MAT1. MAT1, also known as CDK-activating kinase assembly factor MAT1, CDK7/cyclin-H assembly factor, cyclin-G1-interacting protein, menage a trois, RING finger protein 66 (RNF66), p35, or p36, is involved in cell cycle control and in RNA transcription by RNA polymerase II. It associates primarily with the catalytic subunit cyclin-dependent kinase 7 (CDK7) and the regulatory subunit cyclin H to form the CDK-activating kinase (CAK) complex that can further associate with the core-TFIIH to form the transcription factor IIH (TFIIH) basal transcription/DNA repair factor, which activates RNA polymerase II by serine phosphorylation of the repetitive C-terminal domain (CTD) of its large subunit (POLR2A), allowing its escape from the promoter and elongation of the transcripts. MAT1 contains an N-terminal C3HC4-type RING-HC finger, a central coiled coil domain, and a C-terminal domain rich in hydrophobic residues. 49
34057 319432 cd16518 RING-HC_MEX3 RING finger, HC subclass, found in RNA-binding proteins of the evolutionarily-conserved MEX-3 family. The family includes MEX-3 family phosphoproteins that have been found in vertebrates. They are mediators of post-transcriptional regulation in different organisms, and have been implicated in many core biological processes, including embryonic development, epithelial homeostasis, immune responses, metabolism, and cancer. They contain two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. They shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The RNA-binding protein MEX-3 from nematode Caenorhabditis elegans is the founding member of the MEX-3 family. Because it lacks the RING-HC finger, it is not included here. 41
34058 319433 cd16519 RING-HC_MIBs RING finger, HC subclass, found in mind bomb MIB1, MIB2, and similar proteins. MIBs are large, multi-domain E3 ubiquitin-protein ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. They are also responsible for TBK1 K63-linked ubiquitination and activation, promoting interferon production and controlling antiviral immunity. Moreover, MIBs selectively control responses to cytosolic RNA and regulate type I interferon transcription. Both MIB1 and MIB2 have similar domain architectures, which consist of two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region, where MIB1 and MIB2 contain three and two C3HC4-type RING-HC fingers, respectively. This family corresponds to the first RING-HC finger of MIB1 and MIB2, as well as the second RING-HC finger of MIB1. 37
34059 319434 cd16520 RING-HC_MIBs_like RING finger, HC subclass, found in mind bomb MIB1, MIB2, RGLG1, RGLG2, and similar proteins. MIBs are large, multi-domain E3 ubiquitin-protein ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. They are also responsible for TBK1 K63-linked ubiquitination and activation, promoting interferon production and controlling antiviral immunity. Moreover, MIBs selectively control responses to cytosolic RNA and regulate type I interferon transcription. Both MIB1 and MIB2 have similar domain architectures, which consist of two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region, where MIB1 and MIB2 contain three and two C3HC4-type RING-HC fingers, respectively. This family corresponds to the third RING-HC finger of MIB1, as well as the second RING-HC finger of MIB2. In addition to MIB1 and MIB2, RING domain ligase RGLG1, RGLG2 and similar proteins from plant have also been included in this family. RGLG1 is a ubiquitously expressed E3 ubiquitin-protein ligase that interacts with UBC13 and, together with UBC13, catalyzes the formation of K63-linked polyubiquitin chains, which is involved in DNA damage repair. RGLG1 mediates the formation of canonical, K48-linked polyubiquitin chains that target proteins for degradation. It also regulates apical dominance by acting on the auxin transport proteins abundance. RGLG1 has overlapping functions with its closest sequelog, RGLG2. They both function as RING E3 ligases that interact with ethylene response factor 53 (ERF53) in the nucleus and negatively regulate the plant drought stress response. All RGLG proteins contain a Von Willebrand factor type A (vWA) domain and a C3HC4-type RING-HC finger. 38
34060 319435 cd16521 RING-HC_MKRN RING finger, HC subclass, found in the makorin (MKRN) protein family. The MKRN protein family includes the ribonucleoproteins that are characterized by a variety of zinc-finger motifs, including typical arrays of one to four C3H1-type zinc fingers and a C3HC4-type RING-HC finger. Another motif rich in Cys and His residues (CH), with so far unknown function, is also generally present in MKRN proteins. MKRN proteins may have E3 ubiquitin ligase activity. 51
34061 319436 cd16522 RING-HC_MSL2 RING finger found in Drosophila melanogaster male-specific lethal-2 (MSL2) and similar proteins. MSL2, also known as RING finger protein 184 (RNF184), is a putative DNA-binding protein required for X chromosome dosage compensation in Drosophila males. Its expression is sex specifically regulated by Sex-lethal. Drosophila dosage compensation proteins MOF, MSL1, MSL2, and MSL3 are essential for elevating transcription of the single X chromosome in the male (X chromosome dosage compensation). MSL2 plays a critical role in translation and/or stability of MSL1 in males. In complex with MSL1, it acts as an E3 ubiquitin ligase that promotes ubiquitination of histone H2B. MSL2 contains a C3HC4-type RING-HC finger and a metallothionein-like domain with eight conserved and two non-conserved cysteines, as well as a positively and a negatively charged amino acid residue cluster and a coiled coil domain that may be involved in protein-protein interactions. This family also includes many male-specific lethal-2 homologs from bilaterians. 45
34062 319437 cd16523 RING-HC_MYLIP RING finger, HC subclass, found in myosin regulatory light chain interacting protein (MYLIP) and similar proteins. MYLIP, also known as inducible degrader of the low-density lipoprotein (LDL)-receptor (IDOL), or MIR, is an E3 ubiquitin-protein ligase that mediates ubiquitination and subsequent proteasomal degradation of myosin regulatory light chain (MRLC), LDLR, VLDLR, and LRP8. Its activity depends on E2 ubiquitin-conjugating enzymes of the UBE2D family. MYLIP stimulates clathrin-independent endocytosis and acts as a sterol-dependent inhibitor of cellular cholesterol uptake by binding directly to the cytoplasmic tail of the LDLR and promoting its ubiquitination via the UBE2D1/E1 complex. The ubiquitinated LDLR then enters the multivesicular body (MVB) protein-sorting pathway and is shuttled to the lysosome for degradation. Moreover, MYLIP has been identified as a novel ERM-like protein that affects cytoskeleton interactions regulating cell motility, such as neurite outgrowth. The ERM proteins includes ezrin, radixin, and moesin, which are cytoskeletal effector proteins linking actin to membrane-bound proteins at the cell surface. MYLIP contains an ERM-homology domain and a C-terminal C3HC4-type RING-HC finger. 38
34063 319438 cd16524 RING-HC_NHL-1_like RING finger, HC subclass, found in Caenorhabditis elegans RING finger protein NHL-1 and similar proteins. NHL-1 functions as an E3 ubiquitin-protein ligase in the presence of both UBC-13 and UBC-1 within the ubiquitin pathway of Caenorhabditis elegans. It acts in chemosensory neurons to promote stress resistance in distal tissues by the transcription factor DAF-16 activation but is dispensable for the activation of heat shock factor 1 (HSF-1). NHL-1 belongs to the TRIM (tripartite motif)-NHL family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. 45
34064 319439 cd16525 RING-HC_PCGF RING finger, HC subclass, found in Polycomb Group RING finger homologs (PCGF1, 2, 3, 4, 5 and 6), and similar proteins. The family includes six Polycomb Group (PcG) RING finger homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) that use epigenetic mechanisms to maintain or repress expression of their target genes. They were first discovered in fruit flies that can remodel chromatin such that epigenetic silencing of genes takes place, and are well known for silencing Hox genes through modulation of chromatin structure during embryonic development in fruit flies. PCGF homologs play important roles in cell proliferation, differentiation, and tumorigenesis. They all have been found to associate with ring finger protein 2 (RNF2). The RNF2-PCGF heterodimer is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF homologs are critical components in the assembly of distinct Polycomb Repression Complex 1 (PRC1) related complexes which is involved in the maintenance of gene repression and target different genes through distinct mechanisms. The Drosophila PRC1 core complex is formed by the Polycomb (Pc), Polyhomeotic (Ph), Posterior sex combs (Psc), and Sex combs extra (Sce, also known as Ring) subunits. In mammals, the composition of PRC1 is much more diverse and varies depending on the cellular context. All PRC1 complexes contain homologs of the Drosophila Ring protein. Ring1A/RNF1 and Ring1B/RNF2 are E3 ubiquitin ligases that mark lysine 119 of histone H2A with a single ubiquitin group (H2AK119ub). Mammalian homologs of the Drosophila Psc protein, such as PCGF2/Mel-18 or PCGF4/BMI1, regulate PRC1 enzymatic activity. PRC1 complexes can be divided into at least two classes according to the presence or absence of CBX proteins, which are homologs of Drosophila Pc. Canonical PRC1 complexes contain CBX proteins that recognize and bind H3K27me3, the mark deposited by PRC2. Therefore, canonical PRC1 complexes and PRC2 can act together to repress gene transcription and maintain this repression through cell division. Non-canonical PRC1 complexes, containing RYBP (together with additional proteins, such as L3mbtl2 or Kdm2b) rather than the CBX proteins have recently been described in mammals. PCGF homologs contain a C3HC4-type RING-HC finger. 42
34065 319440 cd16526 RING-HC_PEX2 RING finger, HC subclass, found in peroxin-2 (PEX2) and similar proteins. PEX2, also known as peroxisome biogenesis factor 2, 35 kDa peroxisomal membrane protein, peroxisomal membrane protein 3, peroxisome assembly factor 1 (PAF-1), or RING finger protein 72 (RNF72), is an integral peroxisomal membrane protein with two transmembrane regions and a C3HC4-type RING-HC finger within its cytoplasmically exposed C-terminus. It may be involved in the biogenesis of peroxisomes, as well as in peroxisomal matrix protein import. Mutations in the PEX2 gene are the primary defect in a subset of patients with Zellweger syndrome and related peroxisome biogenesis disorders. Moreover, PEX2 functions as an E3-ubiquitin ligase that mediates the UBC4-dependent polyubiquitination of PEX5, a key player in peroxisomal matrix protein import, to control PEX5 receptor recycling or degradation. 43
34066 319441 cd16527 RING-HC_PEX10 RING finger, HC subclass, found in peroxin-10 (PEX10) and similar proteins. PEX10, also known as peroxisome biogenesis factor 10, peroxisomal biogenesis factor 10, peroxisome assembly protein 10, or RING finger protein 69 (RNF69), is an integral peroxisomal membrane protein with two transmembrane regions and a C3HC4-type RING-HC finger within its cytoplasmically exposed C-terminus. It plays an essential role in peroxisome assembly, import of target substrates, and recycling or degradation of protein complexes and amino acids. It is an essential component of the spinal locomotor circuit, and thus its mutations may be involved in peroxisomal biogenesis disorders (PBD). Mutations in human PEX10 also result in autosomal recessive ataxia. Moreover, PEX10 functions as an E3-ubiquitin ligase with an E2, UBCH5C. It mono- or poly-ubiquitinates PEX5, a key player in peroxisomal matrix protein import, in a UBC4-dependent manner, to control PEX5 receptor recycling or degradation. It also links the E2 ubiquitin conjugating enzyme PEX4 to the protein import machinery of the peroxisome. 40
34067 319442 cd16528 RING-HC_prokRING RING finger, HC subclass, found in prokaryotic RING finger family proteins. The family corresponds to a group of uncharacterized prokaryotic C3HC4-type RING-HC finger containing proteins. The RING finger is fused to an N-terminal alpha-helical domain, ROT/Trove-like repeats, and a C-terminal TerD domain, suggesting a possible role in an RNA-processing complex. 39
34068 319443 cd16529 RING-HC_RAD18 RING finger, HC subclass, found in postreplication repair protein RAD18 and similar proteins. RAD18, also known as HR18 or RING finger protein 73 (RNF73), is an E3 ubiquitin-protein ligase involved in post replication repair of UV-damaged DNA via its recruitment to stalled replication forks. It associates to the E2 ubiquitin conjugating enzyme UBE2B to form the UBE2B-RAD18 ubiquitin ligase complex involved in mono-ubiquitination of DNA-associated PCNA on K164. It also interacts with another E2 ubiquitin conjugating enzyme RAD6 to form a complex that monoubiquitinates proliferating cell nuclear antigen at stalled replication forks in DNA translesion synthesis. Moreover, Rad18 is a key factor in double-strand break DNA damage response (DDR) pathways via its association with K63-linked polyubiquitylated chromatin proteins. It can function as a mediator for DNA damage response signals to activate the G2/M checkpoint in order to maintain genome integrity and cell survival after ionizing radiation (IR) exposure. RAD18 contains a C3HC4-type RING-HC finger, a ubiquitin-binding zinc finger domain (UBZ), a SAP (SAF-A/B, Acinus and PIAS) domain, and a RAD6-binding domain (R6BD). 42
34069 319444 cd16530 RING-HC_RAG1 RING finger, HC subclass, found in recombination activating gene-1 (RAG-1) and similar proteins. RAG-1, also known as V(D)J recombination-activating protein 1, RING finger protein 74 (RNF74), or endonuclease RAG1, is the catalytic component of the RAG complex, a multiprotein complex that mediates the DNA cleavage phase during V(D)J recombination. RAG1 is the lymphoid-specific factor that mediates the DNA-binding to the conserved recombination signal sequences (RSS) and catalyzes the DNA cleavage activities by introducing a double-strand break between the RSS and the adjacent coding segment. It also functions as an E3 ubiquitin-protein ligase that mediates monoubiquitination of histone H3, which is required for the joining step of V(D)J recombination. RAG-1 contains an N-terminal C3HC4-type RING-HC finger that mediates monoubiquitylation of Histone H3, an adjacent C2H2-type zinc finger, and a nonamer binding (NBD) DNA-binding domain. 46
34070 319445 cd16531 RING-HC_RING1_like RING finger, HC subclass, found in really interesting new gene proteins RING1, RING2 and similar proteins. RING1, also known as polycomb complex protein RING1, RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), was identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. RING2, also known as huntingtin-interacting protein 2-interacting protein 3, HIP2-interacting protein 3, protein DinG, RING finger protein 1B (RING1B), RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. Both RING1 and RING2 are core components of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING2 acts as the main E3 ubiquitin ligase on histone H2A of the PRC1 complex, while RING1 may rather act as a modulator of RNF2/RING2 activity. Members in this family contain a C3HC4-type RING-HC finger. 41
34071 319446 cd16532 RING-HC_RNFT1_like RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein RNFT1, RNFT2, and similar proteins. Both RNFT1 and RNFT2 are multi-pass membrane proteins containing a C3HC4-type RING-HC finger. Their biological roles remain unclear. 40
34072 319447 cd16533 RING-HC_RNF4 RING finger, HC subclass, found in RING finger protein 4 (RNF4) and similar proteins. RNF4, also known as small nuclear ring finger protein (SNURF), is a SUMO-targeted E3 ubiquitin-protein ligase with a pivotal function in the DNA damage response (DDR) through interacting with the deubiquitinating enzyme ubiquitin-specific protease 11 (USP11), a known DDR-component, and further facilitating DNA repair. It plays a novel role in preventing the loss of intact chromosomes and ensures the maintenance of chromosome integrity. Moreover, RNF4 is responsible for the UbcH5A-catalyzed formation of K48 chains that target SUMO-modified promyelocytic leukemia (PML) protein for proteasomal degradation in response to arsenic treatment. It also interacts with telomeric repeat binding factor 2 (TRF2) in a small ubiquitin-like modifiers (SUMO)-dependent manner and preferentially targets SUMO-conjugated TRF2 for ubiquitination through SUMO-interacting motifs (SIMs). Furthermore, RNF4 can form a complex with a Ubc13-ubiquitin conjugate and Ube2V2. It catalyzes K63-linked polyubiquitination by the Ube2V2-Ubc13 (ubiquitin-loaded) complex. Meanwhile, RNF4 negatively regulates nuclear factor kappa B (NF-kappaB) signaling by down-regulating transforming growth factor beta (TGF-beta)-activated kinase 1 (TAK1)-TAK1-binding protein2 (TAB2). RNF4 contains four SIMs followed by a C3HC4-type RING-HC finger at the C-terminus. 54
34073 319448 cd16534 RING-HC_RNF5_like RING finger, HC subclass, found in RING finger protein RNF5, RNF185 and similar proteins. RNF5 and RNF185 are E3 ubiquitin-protein ligases that are anchored to the outer membrane of the endoplasmic reticulum (ER). RNF5 acts at early stages of cystic fibrosis (CF) transmembrane conductance regulator (CFTR) biosynthesis, and functions as a target for therapeutic modalities to antagonize mutant CFTR proteins in CF patients carrying the F508del allele. RNF185 controls the degradation of CFTR and CFTR F508del allele in a RING- and proteasome-dependent manner, but does not control that of other classical endoplasmic reticulum-associated degradation (ERAD) model substrates. Moreover, both RNF5 and RNF185 play important roles in cell adhesion and migration through the modulation of cell migration by ubiquitinating paxillin. Arabidopsis thaliana RING membrane-anchor proteins (AtRMAs) are also included in this family. They possess E3 ubiquitin-protein ligase activity and may play a role in the growth and development of Arabidopsis. All members in this family contain a C3HC4-type RING-HC finger. 43
34074 319449 cd16535 RING-HC_RNF8 RING finger, HC subclass, found in RING finger protein 8 (RNF8) and similar proteins. RNF8 is a telomere-associated E3 ubiquitin-protein ligase that plays an important role in DNA double-strand break (DSB) repair via histone ubiquitination. It is localized in the nucleus and interacts with class III E2s (UBE2E2, UbcH6, and UBE2E3), but not with other E2s (UbcH5, UbcH7, UbcH10, hCdc34, and hBendless). It recruits UBC13 for lysine 63-based self polyubiquitylation. Its deficiency causes neuronal pathology and cognitive decline, and its loss results in neuron degeneration. RNF8, together with RNF168, catalyzes a series of ubiquitylation events on substrates such as H2A and H2AX, with the H2AK13/15 ubiquitylation being particularly important for recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of DSBs. Specifically, RNF8 mediates the ubiquitination of gammaH2AX, and recruits 53BP1 and BRCA1 to DNA damage sites, which promotes DNA damage response (DDR) and inhibits chromosomal instability. Moreover, RNF8 interacts with retinoid X receptor alpha (RXR alpha) and enhances its transcription-stimulating activity. It also regulates the rate of exit from mitosis and cytokinesis. RNF8 contains an N-terminal forkhead-associated (FHA) domain and a C-terminal C3HC4-type RING-HC finger. 42
34075 319450 cd16536 RING-HC_RNF10 RING finger, HC subclass, found in RING finger protein 10 (RNF10) and similar proteins. RNF10 is an E3 ubiquitin-protein ligase that interacts with mesenchyme Homeobox 2 (MEOX2) transcription factor, a regulator of the proliferation, differentiation and migration of vascular smooth muscle cells and cardiomyocytes, and enhances Meox2 activation of the p21 promoter. It also regulates the expression of myelin-associated glycoprotein (MAG) genes and is required for myelin production in Schwann cells of peripheral nervous system. Moreover, RNF10 regulates retinoic acid-induced neuronal differentiation and the cell cycle exit of P19 embryonic carcinoma cells. RNF10 contains a C3HC4-type RING-HC finger and three putative nuclear localization signals. 43
34076 319451 cd16537 RING-HC_RNF37 RING finger, HC subclass, found in RING finger protein 37 (RNF37). RNF37, also known as KIAA0860, U-box domain-containing protein 5 (UBOX5), UbcM4-interacting protein 5 (UIP5), or ubiquitin-conjugating enzyme 7-interacting protein 5, is an E3 ubiquitin-protein ligase found exclusively in the nucleus as part of a nuclear dot-like structure. It interacts with the molecular chaperone VCP/p97 protein. RNF37 contains a U-box domain followed by a potential nuclear location signal (NLS) and a C-terminal C3HC4-type RING-HC finger. The U-box domain is a modified RING finger domain that lacks the hallmark metal-chelating cysteines and histidines of the latter, and is likely to adopt a RING finger-like conformation. The presence of the U-box, but not of the RING finger, is required for the E3 activity. The U-box domain can directly interact with several E2 enzymes, including UbcM2, UbcM3, UbcM4, UbcH5, and UbcH8, suggesting a similar function as the RING finger in the ubiquitination pathway. This family corresponds to the RING-HC finger. 47
34077 319452 cd16538 RING-HC_RNF112 RING finger, HC subclass, found in RING finger protein 112 (RNF112) and similar proteins. RNF112, also known as brain finger protein (BFP), zinc finger protein 179 (ZNF179), or neurolastin, is a peripheral membrane protein that is predominantly expressed in the central nervous system and localizes to endosomes. It contains functional GTPase and C3HC4-type RING-HC finger domains and has been identified as a brain-specific dynamin family GTPase that affects endosome size and spine density. Moreover, RNF112 acts as a downstream target of sigma-1 receptor (Sig-1R) regulation and may play a novel role in neuroprotection by mediating the neuroprotective effects of dehydroepiandrosterone (DHEA) and its sulfated analog (DHEAS). 48
34078 319453 cd16539 RING-HC_RNF113A_B RING finger, HC subclass, found in RING finger proteins RNF113A, RNF113B, and similar proteins. RNF113A, also known as zinc finger protein 183 (ZNF183), is an E3 ubiquitin-protein ligase that physically interacts with the E2 protein, UBE2U. A nonsense mutation in RNF113A is associated with an X-linked trichothiodystrophy (TTD). Its yeast ortholog Cwc24p is predicted to have a spliceosome function and acts in a complex with Cef1p to participate in pre-U3 snoRNA splicing, indirectly affecting pre-rRNA processing. It is also important for the U2 snRNP binding to primary transcripts and co-migrates with spliceosomes. Moreover, the ortholog of RNF113A in fruit flies may also act as a spliceosome and is hypothesized to be involved in splicing, namely within the central nervous system. The ortholog in Caenorhabditis elegans is involved in DNA repair of inter-strand crosslinks. RNF113B, also known as zinc finger protein 183-like 1, shows high sequence similarity with RNF113A. Both RNF113A and RNF113B contain a CCCH-type zinc finger, which is commonly found in RNA-binding proteins involved in splicing, and a C3HC4-type RING-HC finger, which is frequently found in E3 ubiquitin ligases. 41
34079 319454 cd16540 RING-HC_RNF114 RING finger, HC subclass, found in RING finger protein 114 (RNF114) and similar proteins. RNF114, also known as zinc finger protein 228 (ZNF228) or zinc finger protein 313 (ZNF313), is a p21(WAF1)-targeting ubiquitin E3 ligase that interacts with X-linked inhibitor of apoptosis (XIAP)-associated factor 1 (XAF1) and may play a role in p53-mediated cell-fate decisions. It is involved in immune response to double-stranded RNA in disease pathogenesis. Moreover, RNF114 interacts with A20 and modulates its ubiquitylation. It negatively regulates nuclear factor-kappaB (NF-kappaB)-dependent transcription and positively regulates T-cell activation. RNF114 may play a putative role in the regulation of immune responses, since it corresponds to a novel psoriasis susceptibility gene, ZNF313. RNF114, together with three closely related proteins: RNF125, RNF138 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM). 42
34080 319455 cd16541 RING-HC_RNF123 RING finger, HC subclass, found in RING finger protein 123 (RNF123) and similar proteins. RNF123, also known as Kip1 ubiquitination-promoting complex protein 1 (KPC1), is an E3 ubiquitin-protein ligase that mediates ubiquitination and proteasomal processing of the nuclear factor-kappaB 1 (NF-kappaB1) precursor p105 to the p50 active subunit, which restricts tumor growth. It also regulates degradation of heterochromatin protein 1alpha (HP1alpha) and 1beta (HP1beta) in lamin A/C knock-down cells. Moreover, RNF123, together with Kip1 ubiquitylation-promoting complex 2 (KPC2), forms the Kip1 ubiquitination-promoting complex (KPC), acting as a cytoplasmic ubiquitin ligase that regulates degradation of the cyclin-dependent kinase inhibitor p27 (Kip1) at the G1 phase of the cell cycle. Furthermore, RNF123 may function as a clinically relevant, peripheral state marker of depression. RNF123 contains a C3HC4-type RING-HC finger at the C-terminus. 41
34081 319456 cd16542 RING-HC_RNF125 RING finger, HC subclass, found in RING finger protein 125 (RNF125). RNF125, also known as T-cell RING activation protein 1 (TRAC-1), is an E3 ubiquitin-protein ligase that is predominantly expressed in lymphoid cells, and functions as a positive regulator of T cell activation. It also down-modulates HIV replication and inhibits pathogen-induced cytokine production. It negatively regulates type I interferon signaling, which conjugates Lys(48)-linked ubiquitination to retinoic acid-inducible gene-I (RIG-I) and subsequently leads to the proteasome-dependent degradation of RIG-I. Further, RNF125 conjugates ubiquitin to melanoma differentiation-associated gene 5 (MDA5), a family protein of RIG-I. It thus acts as a negative regulator of RIG-I signaling, and is a direct target of miR-15b in the context of Japanese encephalitis virus (JEV) infection. Moreover, RNF125 binds to and ubiquitinates JAK1, prompting its degradation and inhibition of receptor tyrosine kinase (RTK) expression. It also negatively regulates p53 function through physical interaction and ubiquitin-mediated proteasome degradation. Mutations in RNF125 may lead to overgrowth syndromes (OGS). RNF125, together with three closely related proteins: RNF114, RNF138 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM). The UIM of RNF125 binds K48-linked poly-ubiquitin chains and is, together with the RING domain, required for auto-ubiquitination. 42
34082 319457 cd16543 RING-HC_RNF135_like RING finger, HC subclass, found in RING finger protein 135 (RNF135), tripartite motif-containing protein 15 (TRIM15) and similar proteins. RNF135, also known as RIG-I E3 ubiquitin ligase (REUL) or Riplet, is a widely expressed E3 ubiquitin-protein ligase that consists of an N-terminal C3HC4-type RING-HC finger and C-terminal B30.2/SPRY and PRY motifs, but lacks the B-box and coiled-coil domains that are also typically present in TRIM proteins. RNF135 serves as a specific retinoic acid-inducible gene-I (RIG-I)-interacting protein that ubiquitinates RIG-I and specifically stimulates RIG-I-mediated innate antiviral activity to produce antiviral type-I interferon (IFN) during the early phase of viral infection. It also has been identified as a bio-marker and therapy target of glioblastoma. It associates with the ERK signal transduction pathway and plays a role in glioblastoma cell proliferation, migration and cell cycle. TRIM15, also known as RING finger protein 93 (RNF93), zinc finger protein 178 (ZNF178), or zinc finger protein B7 (ZNFB7), is a focal adhesion protein that regulates focal adhesion disassembly. It localizes to focal contacts in a myosin-II-independent manner by an interaction between its coiled-coil domain and the LD2 motif of paxillin. TRIM15 can also associate with coronin 1B, cortactin, filamin binding LIM protein1, and vasodilator-stimulated phosphoprotein, which are involved in actin cytoskeleton dynamics. As an additional component of the integrin adhesome, it regulates focal adhesion turnover and cell migration. TRIM15 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. 37
34083 319458 cd16544 RING-HC_RNF138 RING finger, HC subclass, found in RING finger protein 138 (RNF138) and similar proteins. RNF138, also known as Nemo-like kinase-associated RING finger protein (NARF) or NLK-associated RING finger protein, is an E3 ubiquitin-protein ligase that plays an important role in glioma cell proliferation, apoptosis, and cell cycle. It specifically cooperates with the E2 conjugating enzyme E2-25K (Hip-2/UbcH1), regulates the ubiquitylation and degradation of T cell factor/lymphoid enhancer factor (TCF/LEF), and further suppresses Wnt-beta-catenin signaling. RNF138, together with three closely related proteins: RNF114, RNF125 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM). 46
34084 319459 cd16545 RING-HC_RNF141 RING finger, HC subclass, found in RING finger protein 141 (RNF141) and similar proteins. RNF141, also known as zinc finger protein 230 (ZNF230), is a RING finger protein present primarily in the nuclei of spermatogonia, the acrosome, and the tail of spermatozoa. It may have a broad function during early development of vertebrates. It plays an important role in spermatogenesis, including spermatogenic cell proliferation and sperm maturation, as well as motility and fertilization. It also exhibits DNA binding activity. Mutation of the corresponding ZNF230 gene may be associated with azoospermia. RNF141 contains a C3HC4-type RING finger domain that may function as an activator module in transcription. 39
34085 319460 cd16546 RING-HC_RNF146 RING finger, HC subclass, found in RING finger protein 146 (RNF146) and similar proteins. RNF146, also known as dactylidin, or iduna, is a cytoplasmic E3 ubiquitin-protein ligase that is responsible for PARylation-dependent ubiquitination (PARdU). It displays neuroprotective property due to its inhibition of Parthanatos, a PAR dependent cell death, via binding with Poly(ADP-ribose) (PAR). It also modulates PAR polymerase-1 (PARP-1)-mediated oxidative cell injury in cardiac myocytes. Moreover, RNF146 mediates tankyrase-dependent degradation of axin, thereby positively regulates Wnt signaling. It also facilitates DNA repair and protects against cell death induced by DNA damaging agents or gamma-irradiation through translocating to the nucleus after cellular injury and promoting the ubiquitination and degradation of various nuclear proteins involved in DNA damage repair. Furthermore, RNF146 is implicated in neurodegenerative disease and cancer development. It regulates the development and progression of non-small cell lung cancer (NSCLC) by enhancing cell growth, invasion, and survival. RNF146 contains an N-terminal C3HC4-type RING-HC finger followed by a WWE domain with a poly(ADP-ribose) (PAR) binding motif at the tail. 40
34086 319461 cd16547 RING-HC_RNF151 RING finger, HC subclass, found in RING finger protein 151 (RNF151) and similar proteins. RNF151 is a testis-specific RING finger protein that interacts with dysbindin, a synaptic and microtubular protein that binds brain snapin, a SNARE-binding protein that mediates intracellular membrane fusion in both neuronal and non-neuronal cells. Thus, it may be involved in acrosome formation of spermatids through interacting with multiple proteins participating in membrane biogenesis and microtubule organization. RNF151 contains a C3HC4-type RING finger domain, a putative nuclear localization signal (NLS), and a TNF receptor associated factor (TRAF)-type zinc finger domain. 39
34087 319462 cd16548 RING-HC_RNF152 RING finger, HC subclass, found in RING finger protein 152 (RNF152) and similar proteins. RNF152 is a lysosome-anchored E3 ubiquitin-protein ligase involved in apoptosis. It is polyubiquitinated through K48 linkage. It negatively regulates the activation of the mTORC1 pathway by targeting RagA GTPase for K63-linked ubiquitination. It interacts with and ubiquitinates RagA in an amino-acid-sensitive manner. The ubiquitination of RagA recruits its inhibitor GATOR1, a GAP complex for Rag GTPases to the Rag complex, thereby inactivating mTORC1 signaling. RNF152 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal transmembrane domain, both of which are responsible for its E3 ligase activity. 45
34088 319463 cd16549 RING-HC_RNF166 RING finger, HC subclass, found in RING finger protein 166 (RNF166) and similar proteins. RNF166 is encoded by gene RNF166 targeted by thyroid hormone receptor alpha1 (TRalpha1), which is important in brain development. It plays an important role in RNA virus-induced interferon-beta production by enhancing the ubiquitination of TRAF3 and TRAF6. RNF166, together with three closely related proteins: RNF114, RNF125 and RNF138, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM). 47
34089 319464 cd16550 RING-HC_RNF168 RING finger, HC subclass, found in RING finger protein 168 (RNF168) and similar proteins. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates, such as H2A and H2AX with H2AK13/15 ubiquitylation, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin. 42
34090 319465 cd16551 RING-HC_RNF169 RING finger, HC subclass, found in RING finger protein 169 (RNF169) and similar proteins. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to regulation of the DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU (motif interacting with ubiquitin) domain. 41
34091 319466 cd16552 RING-HC_NEURL3 RING finger, HC subclass, found in neuralized-like protein 3 (NEURL3) and similar proteins. NEURL3, also known as lung-inducible neuralized-related C3HC4 RING domain protein (LINCR), is a novel inflammation-induced E3 ubiquitin-protein ligase encoded by LINCR, a glucocorticoid-attenuated response gene induced in the lung during endotoxemia. It is expressed in alveolar epithelial type II cells, preferentially interacts with the ubiquitin-conjugating enzyme UbcH6, and generates polyubiquitin chains linked via non-canonical lysine residues. Overexpression of NEURL3 in the developing lung epithelium inhibits distal differentiation and induces cystic changes in the Notch signaling pathway. NEURL3 contains an N-terminal neuralized homology repeat (NHR) domain similar to the SPRY (SPla and the RYanodine receptor) domain and a C-terminal C3HC4-type RING-HC finger. 42
34092 319467 cd16553 RING-HC_RNF170 RING finger, HC subclass, found in RING finger protein 170 (RNF170) and similar proteins. RNF170, also known as putative LAG1-interacting protein, is an endoplasmic reticulum (ER) membrane-bound E3 ubiquitin-protein ligase that mediates ubiquitination-dependent degradation of type-I inositol 1,4,5-trisphosphate (IP3) receptors (ITPR1) via the endoplasmic-reticulum-associated protein degradation (ERAD) pathway. A point mutation (arginine to cysteine at position 199) of RNF170 gene is linked with autosomal-dominant sensory ataxia (ADSA), a disease characterized by neurodegeneration in the posterior columns of the spinal cord. RNF170 contains a C3HC4-type RING-HC finger. 44
34093 319468 cd16554 RING-HC_RNF180 RING finger, HC subclass, found in RING finger protein 180 (RNF180) and similar proteins. RNF180, also known as Rines, is a membrane-bound E3 ubiquitin-protein ligase well conserved among vertebrates. It is a critical regulator of the monoaminergic system, as well as emotional and social behavior. It interacts with brain monoamine oxidase A (MAO-A) and targets it for ubiquitination and degradation. It also functions as a novel tumor suppressor in gastric carcinogenesis. The hypermethylated CpG site count of RNF180 DNA promoter can be used to predict the survival of gastric cancer. RNF180 contains a novel conserved dual specificity protein phosphatase Rines conserved (DSPRC) domain, a basic coiled-coil domain, a C3HC4-type RING-HC finger, and a C-terminal hydrophobic region that is predicted to be a transmembrane domain. 44
34094 319469 cd16555 RING-HC_RNF182 RING finger, HC subclass, found in RING finger protein 182 (RNF182) and similar proteins. RNF182 is a brain-enriched E3 ubiquitin-protein ligase that stimulates E2-dependent polyubiquitination in vitro. It is upregulated in the Alzheimer's disease (AD) brains and neuronal cells exposed to injurious insults. It interacts with ATP6V0C and promotes its degradation by the ubiquitin-proteasome pathway, suggesting a very specific role in controlling the turnover of an essential component of neurotransmitter release machinery. RNF182 contains an N-terminal C3HC4-type RING-HC finger, and a C-terminal transmembrane domain. 51
34095 319470 cd16556 RING-HC_RNF183_like RING finger, HC subclass, found in RING finger protein RNF183, RNF223 and similar proteins. RNF183 is an E3 ubiquitin-protein ligase that is upregulated during intestinal inflammation and is negatively regulated by miR-7. It promotes intestinal inflammation by increasing the ubiquitination and degradation of inhibitor of kappa B, thereby resulting in secondary activation of the Nuclear factor-kappaB (NF-kB) pathway. The interaction between RNF183-mediated ubiquitination and miRNA may be an important novel epigenetic mechanism in the pathogenesis of inflammatory bowel disease (IBD). The biological function of RNF223 remains unclear. Both RNF183 and RNF223 contain an N-terminal C3HC4-type RING-HC finger and a C-terminal transmembrane domain. 54
34096 319471 cd16557 RING-HC_RNF186 RING finger, HC subclass, found in RING finger protein 186 (RNF186) and similar proteins. RNF186 is an E3 ubiquitin-protein ligase with an N-terminal C3HC4-type RING-HC finger and two putative C-terminal transmembrane domains which enable it to localize in a certain organelle. It regulates RING-dependent self-ubiquitination, as well as endoplasmic reticulum (ER) stress-mediated apoptosis through interaction with the Bcl-2 family protein BNip1. 51
34097 319472 cd16558 RING-HC_RNF207 RING finger, HC subclass, found in RING finger protein 207 (RNF207) and similar proteins. RNF207 is a cardiac-specific E3 ubiquitin-protein ligase that plays an important role in the regulation of cardiac repolarization. It regulates action potential duration, likely via effects on human ether-a-go-go-related gene (HERG) trafficking and localization in a heat shock protein-dependent manner. RNF207 contains a C3HC4-type RING-HC finger, Bbox 1 and Bbox C-terminal (BBC), as well as a C-terminal non-homologous region (CNHR). 43
34098 319473 cd16559 RING-HC_RNF208 RING finger, HC subclass, found in RING finger protein 208 (RNF208) and similar proteins. RNF208 is an E3 ubiquitin-protein ligase whose activity can be modulated by S-nitrosylation. It contains a C3HC4-type RING-HC finger. 50
34099 319474 cd16560 RING-HC_RNF212_like RING finger, HC subclass, found in RING finger proteins RNF212, RNF212B and similar proteins. The family includes RING finger protein RNF212, RNF212B, and their homologs. RNF212 is a dosage-sensitive regulator of crossing-over during mammalian meiosis. It plays a central role in designating crossover sites and coupling chromosome synapsis to the formation of crossover-specific recombination complexes. It also functions as an E3 ligase for small ubiquitin-related modifier (SUMO) modification. RNF212B shows high sequence similarity with RNF212, but its biological function remains unclear. Members in this family contain an N-terminal C3HC4-type RING-HC finger. The family also includes two homologs of RNF212, meiotic procrossover factors Zip3 and ZHP-3, which have been identified in Saccharomyces cerevisiae and Caenorhabditis elegans, respectively. Budding yeast Zip3 is a small ubiquitin-related modifier (SUMO) E3 ligase implicated in the SUMO pathway of post-translational modification. It sumoylates chromosome axis proteins, thus promoting synaptonemal complex polymerization. It also acts as a Smt3 E3 ligase. Zip3 includes a SUMO Interacting Motif (SIM) and a modified C3HCHC2-type RING-HC finger that are important for Zip3 in vitro E3 ligase activity and necessary for SC polymerization and correct sporulation. ZHP-3 acts at crossovers to couple meiotic recombination with synaptonemal complex disassembly and chiasma formation in Caenorhabditis elegans. It possesses a C3HC4-type RING-HC finger. 41
34100 319475 cd16561 RING-HC_RNF213 RING finger, HC subclass, found in RING finger protein 213 (RNF213) and similar proteins. RNF213, also known as ALK lymphoma oligomerization partner on chromosome 17 or Moyamoya steno-occlusive disease-associated AAA+ and RING finger protein (mysterin), is an intracellular soluble protein that functions as an E3 ubiquitin-protein ligase and AAA+ ATPase, which possibly contributes to vascular development through mechanical processes in the cell. It plays a unique role in endothelial cells for proper gene expression in response to inflammatory signals from the environment. Mutations in RNF213 may associate with Moyamoya disease (MMD), an idiopathic cerebrovascular occlusive disorder prevalent in East Asia. It also acts as a nuclear marker for acanthomorph phylogeny. RNF213 contains two tandem enzymatically active AAA+ ATPase modules and a C3HC4-type RING-HC finger. It can form a huge ring-shaped oligomeric complex. 41
34101 319476 cd16562 RING-HC_RNF219 RING finger, HC subclass, found in RING finger protein 219 (RNF219) and similar proteins. RNF219 may function as a modulator of late-onset Alzheimer's disease (LOAD) associated amyloid beta A4 precursor protein (APP) endocytosis and metabolism. It genetically interacts with apolipoprotein E epsilon4 allele (APOE4). Thus a genetic variant within RNF219 was found to affect amyloid deposition in human brain and LOAD age-of-onset. Moreover, common genetic variants at the RNF219 locus have been associated with alterations in lipid metabolism, cognitive performance and central nervous system ventricle volume. RNF219 contains a C3HC4-type RING-HC finger. 42
34102 319477 cd16563 RING-HC_RNF220 RING finger, HC subclass, found in RING finger protein 220 (RNF220) and similar proteins. RNF220 is an E3 ubiquitin-protein ligase that promotes the ubiquitination and proteasomal degradation of Sin3B, a scaffold protein of the Sin3/HDAC (histone deacetylase) corepressor complex. It can also bind E2 and mediate auto-ubiquitination of itself. Moreover, RNF220 specifically interacts with beta-catenin, and enhances canonical Wnt signaling through ubiquitin-specific protease 7 (USP7)-mediated deubiquitination and stabilization of beta-catenin, which is independent of its E3 ligase activity. RNF220 contains a characteristic C3HC4-type RING-HC finger at its C-terminus. 41
34103 319478 cd16564 RING-HC_RNF222 RING finger, HC subclass, found in RING finger protein 222 (RNF222) and similar proteins. RNF222 is an uncharacterized C3HC4-type RING-HC finger-containing protein. It may function as an E3 ubiquitin-protein ligase. 47
34104 319479 cd16565 RING-HC_RNF224_like RING finger, HC subclass, found in RING finger protein RNF224, RNF225 and similar proteins. Both RNF224 and RNF225 are uncharacterized C3HC4-type RING-HC finger-containing proteins. They may function as E3 ubiquitin-protein ligases. 49
34105 319480 cd16566 RING-HC_RSPRY1 RING finger, HC subclass, found in RING finger and SPRY domain-containing protein 1 (RSPRY1) and similar proteins. RSPRY1 is a hypothetical RING and SPRY domain-containing protein of unknown physiological function. Mutations in its corresponding gene RSPRY1 may associate with a distinct skeletal dysplasia syndrome. RSPRY1 contains a B30.2/SPRY domain and a C3HC4-type RING-HC finger. 41
34106 319481 cd16567 RING-HC_RAD16_like RING finger, HC subclass, found in Saccharomyces cerevisiae DNA repair protein RAD16, Schizosaccharomyces pombe rhp16, and similar proteins. Budding yeast RAD16, also known as ATP-dependent helicase RAD16, is encoded by a yeast excision repair gene homologous to the recombinational repair gene RAD54 and to the SNF2 gene involved in transcriptional activation. It is a component of the global genome repair (GGR) complex which promotes global genome nucleotide excision repair (GG-NER) that removes DNA damage from non-transcribing DNA. RAD16 is involved in differential repair of DNA after UV damage, and repairs preferentially the MAT-alpha locus compared with the HML-alpha locus. Fission yeast rhp16, also known as ATP-dependent helicase rhp16, is a RAD16 homolog. It is involved in GGR via nucleotide excision repair (NER), in conjunction with rhp7, after UV irradiation. Both RAD16 and rhp16 contain a C3HC4-type RING-HC finger, as well as a DEAD-like helicase domain and a helicase superfamily C-terminal domain. 47
34107 319482 cd16568 RING-HC_ScPSH1_like RING finger, HC subclass, found in Saccharomyces cerevisiae POB3/SPT16 histone-associated protein 1 (ScPSH1), Arabidopsis thaliana Protein KEEP ON GOING (AtKEG) and similar proteins. ScPSH1 is a Cse4-specific E3 ubiquitin ligase that interacts with the kinetochore protein Pat1 and targets the degradation of budding yeast centromeric histone H3 variant, CENP-ACse4, which is essential for faithful chromosome segregation. ScPSH1 contains a C3HC4-type RING-HC finger and a DNA directed RNA polymerase domain. AtKEG is an E3 ubiquitin ligase essential for Arabidopsis growth and development. It maintains low levels of ABSCISIC ACID-INSENSITIVE5 (ABI5) in the absence of stress and thus functions as a negative regulator of abscisic acid (ABA) signaling. AtKEG is a multidomain protein that includes a C3HC4-type RING-HC finger, a kinase domain, ankyrin repeats, and 12 HERC2-like (for HECT and RCC1-like) repeats. 45
34108 319483 cd16569 RING-HC_SHPRH RING finger, HC subclass, found in SNF2 histone-linker PHD finger RING finger helicase (SHPRH) and similar proteins. SHPRH is a yeast RAD5 homolog found in mammals. It functions as an E3 ubiquitin-protein ligase that associates with proliferating cell nuclear antigen (PCNA), RAD18, and the ubiquitin-conjugating enzyme UBC13 (E2) and suppresses genomic instability through proliferating methyl methanesulfonate (MMS)-induced PCNA polyubiquitination. SHPRH contains a SWI/SNF helicase domain that is divided into N- and C-terminal parts by an insertion of a linker histone domain (H15), a PHD-finger, and a C3HC4-type RING-HC finger involved in the poly-ubiquitination of PCNA. 51
34109 319484 cd16570 RING-HC_SH3RFs RING finger, HC subclass, found in SH3 domain-containing RING finger proteins SH3RF1, SH3RF2, SH3RF3, and similar proteins. SH3RF1, also known as plenty of SH3s (POSH), RING finger protein 142 (RNF142), or SH3 multiple domains protein 2 (SH3MD2), is a trans-Golgi network-associated pro-apoptotic scaffold protein with E3 ubiquitin-protein ligase activity. SH3RF2, also known as heart protein phosphatase 1-binding protein (HEPP1), plenty of SH3s (POSH)-eliminating RING protein (POSHER), protein phosphatase 1 regulatory subunit 39, or RING finger protein 158 (RNF158), is a putative E3 ubiquitin-protein ligase that acts as an anti-apoptotic regulator for the c-Jun N-terminal kinase (JNK) pathway by binding to and promoting the proteasomal degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. SH3RF3, also known as plenty of SH3s 2 (POSH2) or SH3 multiple domains protein 4 (SH3MD4), is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2) and may play a role in regulating c-Jun N-terminal kinase (JNK) mediated apoptosis in certain conditions. Members in this family contain an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 46
34110 319485 cd16571 RING-HC_SIAHs RING finger, HC subclass, found in Drosophila melanogaster protein Seven-in-Absentia (sina) and its homologs. The family includes the Drosophila melanogaster protein Seven-in-Absentia (sina), its mammalian orthologs, SIAH1 and SIAH2, plant SINA-related proteins, and similar proteins. The Drosophila homolog sina plays an important role in the phyllopod-dependent degradation of the transcriptional repressor tramtrack as for the formation of the R7 photoreceptor in the developing eye of Drosophila melanogaster. Both SIAH1 and SIAH2 are E3 ubiquitin-protein ligases, mediating the ubiquitinylation and subsequent proteasomal degradation of biologically important target proteins that regulate general functions, such as cell cycle control, apoptosis, and DNA repair. They are inducible by the tumor suppressor and transcription factor p53. SIAH2 can also be regulated by sex hormones and cytokine signaling. Moreover, they share high sequence similarity, but possess contrary roles in cancer, with Siah1 more often acting as a tumor suppressor while Siah2 functions as a proto-oncogene. Plant SINAT1-5 are putative E3 ubiquitin ligases involved in the regulation of stress responses. All family members possess two characteristic domains, an N-terminal C3HC4-type RING-HC finger and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD). 38
34111 319486 cd16572 RING-HC_SpRad8_like RING finger, HC subclass, found in Schizosaccharomyces pombe DNA repair protein Rad8 (SpRad8) and similar proteins. SpRad8 is a conserved protein homologous to Saccharomyces cerevisiae DNA repair protein Rad5 and human helicase-like transcription factor (HLTF) that is required for error-free postreplication repair by contributing to polyubiquitylation of PCNA. SpRad8 contains a C3HC4-type RING-HC finger responsible for the E3 ubiquitin ligase activity, a SNF2-family helicase domain including an ATP binding site, and a family-specific HIRAN domain (HIP116, Rad5p N-terminal domain) that contributes to nuclear localization. 49
34112 319487 cd16573 RING-HC_TFB3_like RING finger, HC subclass, found in RNA polymerase II transcription factor B subunit 3 (TFB3) from fungi. TFB3, also known as RNA polymerase II transcription factor B 38 kDa subunit, RNA polymerase II transcription factor B p38 subunit, or Rig2, is a component of the general transcription and DNA repair factor IIH (TFIIH or factor B), which is essential for both basal and activated transcription and is involved in nucleotide excision repair (NER) of damaged DNA. TFIIH has CTD kinase and DNA-dependent ATPase activity, and is essential for polymerase II transcription in vitro. TFB3 is a homolog of MAT1 of higher eukaryotes which forms a ternary complex with MO15 (cdk7) and cyclin H. It physically interacts with Ubc4 and the Nedd8-conjugating enzyme Ubc12 as well as the Hrt1/Rtt101 complex. It targets the yeast Cul4-type cullin Rtt101 for its neddylation and ubiquitylation, and regulates neddylation and activity of cullin-3, but not Cdc53. TFB3 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MAT1 domain responsible for the interaction with the transcription factor TFIIH. 54
34113 319488 cd16574 RING-HC_Topors RING finger, HC subclass, found in topoisomerase I-binding arginine/serine-rich protein (Topors) and similar proteins. Topors, also known as topoisomerase I-binding RING finger protein, tumor suppressor p53-binding protein 3, or p53-binding protein 3 (p53BP3), is a ubiquitously expressed nuclear E3 ubiquitin-protein ligase that can ligate both ubiquitin and small ubiquitin-like modifier (SUMO) to substrate proteins in the nucleus. It contains an N-terminal C3HC4-type RING-HC finger which ligates ubiquitin to its target proteins including DNA topoisomerase I, p53, NKX3.1, H2AX, and the AAV-2 Rep78/68 proteins. As a RING-dependent E3 ubiquitin ligase, Topors works with the E2 enzymes UbcH5a, UbcH5c, and UbcH6, but not with UbcH7, CDC34, or UbcH2b. Topors acts as a tumor suppressor in various malignancies. It regulates p53 modification, suggesting it may be responsible for astrocyte elevated gene-1 (AEG-1, also known as metadherin, or LYRIC) ubiquitin modification. Plk1-mediated phosphorylation of Topors inhibits Topors-mediated sumoylation of p53, whereas p53 ubiquitination is enhanced, leading to p53 degradation. It also functions as a negative regulator of the prostate tumor suppressor NKX3.1. Moreover, Topors is associated with promyelocytic leukemia nuclear bodies, and may be involved in the cellular response to camptothecin. It also plays a key role in the turnover of H2AX protein, discriminating the type of DNA damaging stress. Furthermore, Topors is a cilia-centrosomal protein associated with autosomal dominant retinal degeneration. Mutations in TOPORS cause autosomal dominant retinitis pigmentosa (adRP). 40
34114 319489 cd16575 RING-HC_MID_C-I RING finger, HC subclass, found in midline-1 (MID1), midline-2 (MID2) and similar proteins. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 52
34115 319490 cd16576 RING-HC_TRIM9_like_C-I RING finger, HC subclass, found in tripartite motif-containing proteins TRIM9, TRIM36, TRIM46, TRIM67, and similar proteins. Tripartite motif-containing proteins TRIM9, TRIM36, TRIM46, and TRIM67 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. TRIM36 (the human ortholog of mouse Haprin), also known as RING finger protein 98 (RNF98), or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in the carcinogenesis. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that forms parallel microtubule bundles in the proximal axon and plays a crucial role for the establishment and maintenance of neuronal polarity. TRIM67, also known as TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H, also known as glucosidase II beta, a protein kinase C substrate. 42
34116 319491 cd16577 RING-HC_MuRF_C-II RING finger, HC subclass, found in muscle-specific RING finger proteins TRIM63/MuRF-1, TRIM55/MuRF-2 and TRIM54/MuRF-3. This family corresponds to a group of striated muscle-specific tripartite motif (TRIM) proteins, including TRIM63/MuRF-1, TRIM55/MuRF-2, and TRIM54/MuRF-3, which function as E3 ubiquitin ligases in ubiquitin-mediated muscle protein turnover. They are tightly developmentally regulated in skeletal muscle and associate with different cytoskeleton components, such as microtubules, Z-disks and M-bands, as well as with metabolic enzymes and nuclear proteins. They also cooperate with diverse proteins implicated in selective protein degradation by the proteasome and autophagosome, and target proteins of metabolic regulation, sarcomere assembly and transcriptional regulation. Moreover, MURFs display variable fibre-type preferences. TRIM63/MuRF-1 is predominantly fast (type II) fibre-associated in skeletal muscle. TRIM55/MuRF-2 is predominantly slow-fibre associated. TRIM54/MuRF-3 is ubiquitously present. They play an active role in microtubule-mediated sarcomere assembly. MuRFs belong to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain positioned C-terminal to the RBCC domain. They also harbor a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. 51
34117 319492 cd16578 RING-HC_TRIM42_C-III RING finger, HC subclass, found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear. 51
34118 319493 cd16579 RING-HC_PML_C-V RING finger, HC subclass, found in promyelocytic leukemia protein (PML) and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cell processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. 37
34119 319494 cd16580 RING-HC_TRIM8_C-V RING finger, HC subclass, found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53 impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF- kappa B) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1. 44
34120 319495 cd16581 RING-HC_TRIM13_like_C-V RING finger, HC subclass, found in tripartite motif-containing proteins TRIM13, TRIM59 and similar proteins. TRIM13 and TRIM59, two closely related tripartite motif-containing proteins, belong to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, followed by a C-terminal transmembrane domain. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). TRIM59, also known as RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis. 45
34121 319496 cd16582 RING-HC_TRIM31_C-V RING finger, HC subclass, found in tripartite motif-containing protein 31 (TRIM31) and similar proteins. TRIM31 is an E3 ubiquitin-protein ligase that primarily localizes to the cytoplasm, but is also associated with the mitochondria. It can negatively regulate cell proliferation and may be a potential biomarker of gastric cancer as it is overexpressed from the early stage of gastric carcinogenesis. TRIM31 is downregulated in non-small cell lung cancer and serves as a potential tumor suppressor. It interacts with p52 (Shc) and inhibits Src-induced anchorage-independent growth. TRIM31 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. 41
34122 319497 cd16583 RING-HC_TRIM40-C-V RING finger, HC subclass, found in tripartite motif-containing protein 40 (TRIM40) and similar proteins. TRIM40, also known as probable E3 NEDD8-protein ligase or RING finger protein 35 (RNF35), is highly expressed in the gastrointestinal tract including the stomach, small intestine, and large intestine. It enhances neddylation of inhibitor of nuclear factor kappaB kinase subunit gamma (IKKgamma), inhibits the activity of nuclear factor-kappaB (NF-kappaB)-mediated transcription, and thus prevents inflammation-associated carcinogenesis in the gastrointestinal tract. TRIM40 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. 46
34123 319498 cd16584 RING-HC_TRIM56_C-V RING finger, HC subclass, found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. 44
34124 319499 cd16585 RING-HC_TIF1_C-VI RING finger, HC subclass, found in the transcriptional intermediary factor 1 (TIF1) family and similar proteins. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1alpha (TRIM24), TIF1beta (TRIM28), and TIF1gamma (TRIM33), which belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha and TIF1beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs). TIF1delta (TRIM66) doesn't have RING-HC finger and is not included here. 61
34125 319500 cd16586 RING-HC_TRIM2_like_C-VII RING finger, HC subclass, found in tripartite motif-containing protein TRIM2, TRIM3, and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It also plays an important role in the central nervous system (CNS). In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. Both TRIM2 and TRIM3 belong to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. 45
34126 319501 cd16587 RING-HC_TRIM32_C-VII RING finger, HC subclass, found in tripartite motif-containing protein 32 (TRIM32) and similar proteins. TRIM32, also known as 72 kDa Tat-interacting protein, or zinc finger protein HT2A, or BBS11, is an E3 ubiquitin-protein ligase that promotes degradation of several targets, including actin, PIASgamma, Abl interactor 2, dysbindin, X-linked inhibitor of apoptosis (XIAP), p73 transcription factor, thin filaments and Z-bands during fasting. It plays important roles in neuronal differentiation of neural progenitor cells, as well as in controlling cell fate in skeletal muscle progenitor cells. It reduces PI3K-Akt-FoxO signaling in muscle atrophy by promoting plakoglobin-PI3K dissociation. It also functions as a pluripotency-reprogramming roadblock that facilitates cellular transition towards differentiation via modulating the levels of Oct4 and cMyc. Moreover, TRIM32 is an intrinsic influenza A virus (IAV) restriction factor which senses and targets the polymerase basic protein 1 (PB1) polymerase for ubiquitination and protein degradation. It also plays a significant role in mediating the biological activity of the HIV-1 Tat protein in vivo, binds specifically to the activation domain of HIV-1 Tat, and can also interact with the HIV-2 and EIAV Tat proteins in vivo. Furthermore, Trim32 regulates myoblast proliferation by controlling turnover of NDRG2 (N-myc downstream-regulated gene). It negatively regulates tumor suppressor p53 to promote tumorigenesis. It also facilitates degradation of MYCN on spindle poles and induces asymmetric cell division in human neuroblastoma cells. In addition, TRIM32 plays important roles in regulation of hyperactivities and positively regulates the development of anxiety and depression disorders induced by chronic stress. It also plays a role in regeneration by affecting satellite cell cycle progression via modulation of the SUMO ligase PIASy (PIAS4). Defects in TRIM32 lead to limb-girdle muscular dystrophy type 2H (LGMD2H), sarcotubular myopathies (STM) and Bardet-Biedl syndrome. TRIM32 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The NHL domain mediates the interaction with Argonaute proteins and consequently allows TRIM32 to modulate the activity of certain miRNAs. 47
34127 319502 cd16588 RING-HC_TRIM45-C-VII RING finger, HC subclass, found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1 and downregulates mitogen-activated protein kinase (MAPK) signal transduction through inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription and suppresses cell proliferation. TRIM45 belongs to the C-VII subclass of TRIM (tripartite motif) family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain. 64
34128 319503 cd16589 RING-HC_TRIM71_C-VII RING finger, HC subclass, found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as a target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7) and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs through mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. 77
34129 319504 cd16590 RING-HC_TRIM4_C-IV_like RING finger, HC subclass, found in tripartite motif-containing proteins, TRIM4, TRIM75, tripartite motif family-like protein 1 (TRIML1) and similar proteins. TRIM4 and TRIM75, two closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM4, also known as RING finger protein 87 (RNF87), is a cytoplasmic E3 ubiquitin-protein ligase that has recently evolved and is present only in higher mammals. It transiently interacts with mitochondria, induces mitochondrial aggregation and sensitizes the cells to hydrogen peroxide (H2O2) induced death. Its interaction with peroxiredoxin 1 (PRX1) is critical for the regulation of H2O2 induced cell death. Moreover, TRIM4 functions as a positive regulator of RIG-I-mediated type I interferon induction. It regulates the K63-linked ubiquitination of RIG-I and assembly of antiviral signaling complex at mitochondria. TRIM75 mainly localizes within spindles, suggesting it may function in spindle organization and thereby affect meiosis. The family also includes TRIML1 that is identical to TRIM11 and TRIM17 except for the absence of a B-box domain. TRIML1, also known as RING finger protein 209 (RNF209), is a probable E3 ubiquitin-protein ligase expressed in embryo before implantation. It plays an important role in blastocyst development. By interacting with USP5 (also known as isoT), TRIML1 may exert its influence on debranching ubiquitin from multi-chains on the stability and activity of protein substrates in the preimplantation embryo. 45
34130 319505 cd16591 RING-HC_TRIM5_like-C-IV RING finger, HC subclass, found in tripartite motif-containing proteins TRIM5, TRIM6, TRIM22, TRIM34 and similar proteins. TRIM5, TRIM6, TRIM22, and TRIM34, four closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM5, also known as RING finger protein 88 (RNF88), is a capsid-specific restriction factor that prevents infection from non-host-adapted retroviruses in a species-specific manner by binding to and destabilizing the retroviral capsid lattice before reverse transcription is completed. Its retroviral restriction activity correlates with the ability to activate TAK1-dependent innate immune signaling. TRIM5 also acts as a pattern recognition receptor that activates innate immune signaling in response to the retroviral capsid lattice. Moreover, TRIM5 plays a role in regulating autophagy through activation of autophagy regulator BECN1 by causing its dissociation from its inhibitors BCL2 and TAB2. It also plays a role in autophagy by acting as a selective autophagy receptor which recognizes and targets HIV-1 capsid protein p24 for autophagic destruction. TRIM6, also known as RING finger protein 89 (RNF89), is an E3-ubiquitin ligase that cooperates with the E2-ubiquitin conjugase UbE2K to catalyze the synthesis of unanchored K48-linked polyubiquitin chains, and further stimulates the interferon-I kappa B kinase epsilon (IKKepsilon) kinase-mediated antiviral response. 
It also regulates the transcriptional activity of Myc during the maintenance of embryonic stem (ES) cell pluripotency, and may act as a novel regulator for Myc-mediated transcription in ES cells. TRIM22, also known as 50 kDa-stimulated trans-acting factor (Staf-50) or RING finger protein 94 (RNF94), is an E3 ubiquitin-protein ligase that plays an integral role in the host innate immune response to viruses. It has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 acts as a suppressor of basal HIV-1 long terminal repeat (LTR)-driven transcription by preventing the transcription factor specificity protein 1 (Sp1) binding to the HIV-1 promoter. It also controls FoxO4 activity and cell survival by directing Toll-like receptor 3 (TLR3)-stimulated cells toward type I interferon (IFN) gene induction or apoptosis. Moreover, TRIM22 can activate the noncanonical nuclear factor-kappaB (NF-kappaB) pathway by activating I kappa B kinase alpha (IKKalpha). It also regulates nucleotide binding oligomerization domain containing 2 (NOD2)-dependent activation of interferon-beta signaling and nuclear factor-kappaB. TRIM34, also known as interferon-responsive finger protein 1 or RING finger protein 21 (RNF21), may function as an antiviral protein that contributes to the defense against retroviral infections. 49
34131 319506 cd16592 RING-HC_TRIM7_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 7 (TRIM7) and similar proteins. TRIM7, also known as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90), is an E3 ubiquitin-protein ligase that mediates c-Jun/AP-1 activation by Ras signalling. Its phosphorylation and activation by MSK1 in response to direct activation by the Ras-Raf-MEK-ERK pathway can stimulate TRIM7 E3 ubiquitin ligase activity in mediating Lys63-linked ubiquitination of the AP-1 coactivator RACO-1, leading to RACO-1 protein stabilization. Moreover, TRIM7 binds and activates glycogenin, the self-glucosylating initiator of glycogen biosynthesis. TRIM7 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 45
34132 319507 cd16593 RING-HC_TRIM10_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 10 (TRIM10) and similar proteins. TRIM10, also known as B30-RING finger protein (RFB30), RING finger protein 9 (RNF9), or hematopoietic RING finger 1 (HERF1), is a novel hematopoiesis-specific RING finger protein required for terminal differentiation of erythroid cells. TRIM10 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 49
34133 319508 cd16594 RING-HC_TRIM11_like_C-IV RING finger, HC subclass, found in tripartite motif-containing proteins, TRIM11 and TRIM27, and similar proteins. TRIM11 and TRIM27, two closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM11, also known as protein BIA1, or RING finger protein 92 (RNF92), is an E3 ubiquitin-protein ligase involved in the development of the central nervous system. It is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 acts as a potential therapeutic target for congenital central hypoventilation syndrome (CCHS) through mediating the degradation of congenital central hypoventilation syndrome-associated polyalanine-expanded Phox2b. Trim11 modulates the function of neurogenic transcription factor Pax6 through the ubiquitin-proteasome system, and thus plays an essential role for the Pax6-dependent neurogenesis. It also binds to and destabilizes a key component of the activator-mediated cofactor complex (ARC105), humanin, a neuroprotective peptide against Alzheimer's disease-relevant insults, through the ubiquitin-proteasome system, and further regulates ARC105 function in transforming growth factor beta (TGFbeta) signaling. Moreover, TRIM11 negatively regulates retinoic acid-inducible gene-I (RIG-I)-mediated interferon-beta (IFNbeta) production and antiviral activity by targeting TANK-binding kinase-1 (TBK1). It may contribute to the endogenous restriction of retroviruses in cells. It enhances N-tropic murine leukemia virus (N-MLV) entry by interfering with Ref1 restriction. 
It also suppresses the early steps of human immunodeficiency virus HIV-1 transduction, resulting in decreased reverse transcripts. TRIM27, also known as RING finger protein 76 (RNF76), RET finger protein (RFP), or zinc finger protein RFP, is a nuclear E3 ubiquitin-protein ligase that is highly expressed in testis and in various tumor cell lines. Expression of TRIM27 is associated with prognosis of colon and endometrial cancers. TRIM27 was first identified as a fusion partner of the RET receptor tyrosine kinase. It functions as a transcriptional repressor and associates with several proteins involved in transcriptional activity, such as enhancer of polycomb 1 (Epc1), a member of the Polycomb group proteins, and Mi-2beta, a main component of the nucleosome remodeling and deacetylase (NuRD) complex, and the cell cycle regulator retinoblastoma protein (RB1). It also interacts with HDAC1, leading to downregulation of thioredoxin binding protein 2 (TBP-2), which inhibits the function of thioredoxin. Moreover, TRIM27 mediates Pax7-induced ubiquitination of MyoD in skeletal muscle atrophy. In addition, it inhibits muscle differentiation by modulating serum response factor (SRF) and Epc1. Furthermore, TRIM27 promotes non-canonical polyubiquitination of PTEN, a lipid phosphatase that catalyzes PtdIns(3,4,5)P3 (PIP3) to PtdIns(4,5)P2 (PIP2). It is an IKKepsilon-interacting protein that regulates IkappaB kinase (IKK) function and negatively regulates signaling involved in the antiviral response and inflammation. In addition, TRIM27 forms a protein complex with MBD4 or MBD2 or MBD3, and thus plays an important role in the enhancement of transcriptional repression through MBD proteins in tumorigenesis, spermatogenesis, and embryogenesis. It is also a component of an estrogen receptor 1 (ESR1) regulatory complex, and is involved in estrogen receptor-mediated transcription in MCF-7 cells. 
Meanwhile, TRIM27 interacts with the hinge region of chromosome 3 protein (SMC3), a component of the multimeric cohesin complex that holds sister chromatids together and prevents their premature separation during mitosis. 45
34134 319509 cd16595 RING-HC_TRIM17_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM17 and similar proteins. TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (Terf), is a crucial E3 ubiquitin ligase that is necessary and sufficient for neuronal apoptosis and contributes to Mcl-1 ubiquitination in cerebellar granule neurons (CGNs). It interacts in a SUMO-dependent manner with nuclear factor of activated T cell NFATc3 transcription factor, and thus inhibits the activity of NFATc3 by preventing its nuclear localization. In contrast, it binds to and inhibits NFATc4 transcription factor in a SUMO-independent manner. Moreover, TRIM17 stimulates degradation of kinetochore protein ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for the mitotic spindle checkpoint, and negatively regulates cell proliferation. TRIM17 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 48
34135 319510 cd16596 RING-HC_TRIM21_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM21 and similar proteins. TRIM21, also known as 52 kDa Ro protein, 52 kDa ribonucleoprotein autoantigen Ro/SS-A, Ro(SS-A), RING finger protein 81 (RNF81), or Sjoegren syndrome type A antigen (SS-A), is a ubiquitously expressed E3 ubiquitin-protein ligase and a high affinity antibody receptor uniquely expressed in the cytosol of mammalian cells. As a cytosolic Fc receptor, TRIM21 binds the Fc of virus-associated antibodies and targets the complex in the cytosol for proteasomal degradation in a process known as antibody-dependent intracellular neutralization (ADIN), and provides an intracellular immune response to protect host defense against pathogen infection. It shows remarkably broad isotype specificity as it does not only bind IgG, but also IgM and IgA. Moreover, TRIM21 promotes the cytosolic DNA sensor cGAS and the cytosolic RNA sensor RIG-I sensing of viral genomes during infection by antibody-opsonized virus. It stimulates inflammatory signaling and activates innate transcription factors, such as nuclear factor-kappaB (NF-kappaB). TRIM21 also plays an essential role in p62-regulated redox homeostasis, suggesting a viable target for treating pathological conditions resulting from oxidative damage. Furthermore, TRIM21 may have implications for various autoimmune diseases associated with uncontrolled antiviral signaling through the regulation of Nmi-IFI35 complex-mediated inhibition of innate antiviral response. TRIM21 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 44
34136 319511 cd16597 RING-HC_TRIM25_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM25 and similar proteins. TRIM25, also known as estrogen-responsive finger protein (EFP), RING finger protein 147 (RNF147), or RING-type E3 ubiquitin transferase, is an E3 ubiquitin/ISG15 ligase that is induced by estrogen and is therefore particularly abundant in placenta and uterus. TRIM25 regulates various cellular processes through E3 ubiquitin ligase activity, transferring ubiquitin and ISG15 to target proteins. It mediates K63-linked polyubiquitination of retinoic acid inducible gene I (RIG-I) that is crucial for downstream antiviral interferon signaling. It is also required for melanoma differentiation-associated gene 5 (MDA5) and mitochondrial antiviral signaling (MAVS, also known as IPS-1, VISA, Cardiff) mediated activation of nuclear factor-kappaB (NF-kappaB) and interferon production. Upon UV irradiation, TRIM25 interacts with mono-ubiquitinated PCNA and promotes its ISG15 modification (ISGylation), suggesting a crucial role in termination of error-prone translesion DNA synthesis. TRIM25 also functions as a novel regulator of p53 and Mdm2. It enhances p53 and Mdm2 abundance by inhibiting their ubiquitination and degradation in 26S proteasomes. Meanwhile, it inhibits p53's transcriptional activity and dampens the response to DNA damage, and is essential for medaka development and this dependence is rescued by silencing of p53. Moreover, TRIM25 is involved in the host cellular innate immune response against retroviral infection. It interferes with the late stage of feline leukemia virus (FeLV) replication. Furthermore, TRIM25 acts as an oncogene in gastric cancer. Its blockade by RNA interference inhibits migration and invasion of gastric cancer cells through transforming growth factor-beta (TGF-beta) signaling, suggesting it presents a novel target for the detection and treatment of gastric cancer. 
In addition, TRIM25 acts as an RNA-specific activator for Lin28a/TuT4-mediated uridylation. TRIM25 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 44
34137 319512 cd16598 RING-HC_TRIM26_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 26 (TRIM26) and similar proteins. TRIM26, also known as acid finger protein (AFP), RING finger protein 95 (RNF95), or zinc finger protein 173 (ZNF173), is an E3 ubiquitin-protein ligase that negatively regulates interferon-beta production and antiviral response through polyubiquitination and degradation of nuclear transcription factor IRF3. It functions as an important regulator for RNA virus-triggered innate immune response by bridging TBK1 to NEMO (NF-kappaB essential modulator, also known as IKKgamma) and mediating TBK1 activation. It also acts as a novel tumor suppressor of hepatocellular carcinoma by regulating cancer cell proliferation, colony forming ability, migration, and invasion. TRIM26 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 45
34138 319513 cd16599 RING-HC_TRIM35_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 35 (TRIM35) and similar proteins. TRIM35, also known as hemopoietic lineage switch protein 5 (HLS5), is a putative hepatocellular carcinoma (HCC) suppressor that inhibits phosphorylation of pyruvate kinase isoform M2 (PKM2), which is involved in aerobic glycolysis of cancer cells and further suppresses the Warburg effect and tumorigenicity in HCC. It also negatively regulates Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production by suppressing the stability of interferon regulatory factor 7 (IRF7). Moreover, TRIM35 regulates erythroid differentiation by modulating globin transcription factor 1 (GATA-1) activity. TRIM35 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 44
34139 319514 cd16600 RING-HC_TRIM38_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 38 (TRIM38) and similar proteins. TRIM38, also known as RING finger protein 15 (RNF15) or zinc finger protein RoRet, is an E3 ubiquitin-protein ligase that promotes K63- and K48-linked ubiquitination of cellular proteins and also catalyzes self-ubiquitination. It negatively regulates Tumor necrosis factor alpha (TNF-alpha)- and interleukin-1beta-triggered Nuclear factor-kappaB (NF-kappaB) activation by mediating lysosomal-dependent degradation of transforming growth factor beta (TGFbeta)-activated kinase 1 (TAK1)-binding protein (TAB)2/3, two critical components of the TAK1 kinase complex. It also inhibits TLR3/4-mediated activation of NF-kappaB and interferon regulatory factor 3 (IRF3) by mediating ubiquitin-proteasomal degradation of TNF receptor-associated factor 6 (Traf6) and NAK-associated protein 1 (Nap1), respectively. Moreover, TRIM38 negatively regulates TLR3-mediated interferon beta (IFN-beta) signaling by targeting ubiquitin-proteasomal degradation of TIR domain-containing adaptor inducing IFN-beta (TRIF). It functions as a valid target for autoantibodies in primary Sjogren's Syndrome. TRIM38 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 51
34140 319515 cd16601 RING-HC_TRIM39_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 39 (TRIM39) and similar proteins. TRIM39, also known as RING finger protein 23 (RNF23) or testis-abundant finger protein, is an E3 ubiquitin-protein ligase that plays a role in controlling DNA damage-induced apoptosis through inhibition of the anaphase promoting complex (APC/C), a multiprotein ubiquitin ligase that controls multiple cell cycle regulators, including cyclins, geminin, and others. TRIM39 also functions as a regulator of several key processes in the proliferative cycle. It directly regulates p53 stability. It modulates cell cycle progression and DNA damage responses via stabilizing p21. Moreover, TRIM39 negatively regulates the nuclear factor kappaB (NFkappaB)-mediated signaling pathway through stabilization of Cactin, an inhibitor of NFkappaB- and Toll-like receptor (TLR)-mediated transcriptions, which is induced by inflammatory stimulants such as tumor necrosis factor alpha (TNFalpha). Furthermore, TRIM39 is a MOAP-1-binding protein that can promote apoptosis signaling through stabilization of MOAP-1 via the inhibition of its poly-ubiquitination process. TRIM39 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 44
34141 319516 cd16602 RING-HC_TRIM41_like_C-IV RING finger, HC subclass, found in tripartite motif-containing proteins TRIM41, TRIM52 and similar proteins. TRIM41 and TRIM52, two closely related tripartite motif-containing proteins, have dramatically expanded RING domains compared with the rest of the TRIM family proteins. TRIM41 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM52 lacks the putative viral recognition SPRY/B30.2 domain, and thus has been classified to the C-V subclass of TRIM family that contains only RBCC domains. TRIM41, also known as RING finger-interacting protein with C kinase (RINCK), is an E3 ubiquitin-protein ligase that promotes the ubiquitination of protein kinase C (PKC) isozymes in cells. It specifically recognizes the C1 domain of PKC isozymes. It controls the amplitude of PKC signaling by controlling the amount of PKC in the cell. TRIM52, also known as RING finger protein 102 (RNF102), is encoded by a novel, noncanonical antiviral TRIM52 gene in primate genomes with unique specificity determined by the rapidly evolving RING domain. 41
34142 319517 cd16603 RING-HC_TRIM43_like_C-IV RING finger, HC subclass, found in tripartite motif-containing proteins TRIM43, TRIM48, TRIM49, TRIM51, TRIM64, TRIM77 and similar proteins. The family includes a group of closely related uncharacterized tripartite motif-containing proteins, TRIM43, TRIM43B, TRIM48/RNF101, TRIM49/RNF18, TRIM49B, TRIM49C/TRIM49L2, TRIM49D/TRIM49L, TRIM51/SPRYD5, TRIM64, TRIM64B, TRIM64C, and TRIM77, whose biological functions remain unclear. TRIM49, also known as testis-specific RING-finger protein, has moderate similarity with SS-A/Ro52 antigen, suggesting it may be one of the target proteins of autoantibodies in the sera of patients with these autoimmune disorders. All family members belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. 46
34143 319518 cd16604 RING-HC_TRIM47_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 47 (TRIM47) and similar proteins. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), belongs to a subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. It plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. 47
34144 319519 cd16605 RING-HC_TRIM50_like_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM50, TRIM73, TRIM74 and similar proteins. TRIM50 is a stomach-specific E3 ubiquitin-protein ligase, encoded by the Williams-Beuren syndrome (WBS) TRIM50 gene, which regulates vesicular trafficking for acid secretion in gastric parietal cells. It colocalizes, interacts with, and increases the level of p62/SQSTM1, a multifunctional adaptor protein implicated in various cellular processes including the autophagy clearance of polyubiquitinated protein aggregates. It also promotes the formation and clearance of aggresome-associated polyubiquitinated proteins through the interaction with the histone deacetylase 6 (HDAC6), a tubulin specific deacetylase that regulates microtubule-dependent aggresome formation. TRIM50 can be acetylated by PCAF and p300. TRIM50 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The family also includes two paralogs of TRIM50, tripartite motif-containing protein 73 (TRIM73), also known as tripartite motif-containing protein 50B (TRIM50B), and tripartite motif-containing protein 74 (TRIM74), also known as tripartite motif-containing protein 50C (TRIM50C), both of which are encoded by WBS-related genes and may also act as E3 ligases. In contrast with TRIM50, TRIM73 and TRIM74 belong to the C-V subclass of TRIM family of proteins that are defined by N-terminal RBCC domains only. 45
34145 319520 cd16606 RING-HC_TRIM58_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM58 and similar proteins. TRIM58, also known as protein BIA2, is an erythroid E3 ubiquitin-protein ligase induced during late erythropoiesis. It binds and ubiquitinates the intermediate chain of the microtubule motor dynein (DYNC1LI1/DYNC1LI2), stimulating the degradation of the dynein holoprotein complex. It may participate in the erythroblast enucleation process through regulation of nuclear polarization. TRIM58 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 51
34146 319521 cd16607 RING-HC_TRIM60_like_C-IV RING finger, HC subclass, found in tripartite motif-containing proteins TRIM60, TRIM61 and similar proteins. TRIM60 and TRIM61 are two closely related tripartite motif-containing proteins. TRIM60, also known as RING finger protein 129 (RNF129) or RING finger protein 33 (RNF33), is a cytoplasmic protein expressed in the testis. It may play an important role in the spermatogenesis process, the development of the preimplantation embryo, and in testicular functions. RNF33 interacts with the cytoplasmic kinesin motor proteins KIF3A and KIF3B suggesting possible contribution to cargo movement along the microtubule in the expressed sites. It is also involved in spermatogenesis in Sertoli cells under the regulation of nuclear factor-kappaB (NF-kappaB). TRIM60 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast to TRIM60, TRIM61 belongs to the C-V subclass of TRIM family that contains RBCC domains only. Its biological function remains unclear. 47
34147 319522 cd16608 RING-HC_TRIM62_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 62 (TRIM62) and similar proteins. TRIM62, also known as Ductal Epithelium Associated Ring Chromosome 1 (DEAR1), is a cytoplasmic E3 ubiquitin-protein ligase that was identified as a dominant regulator of acinar morphogenesis in the mammary gland. It is implicated in the inflammatory response of immune cells by regulating the Toll-like receptor 4 (TLR4) signaling pathway, leading to increased activity of the activator protein 1 (AP-1) transcription factor in primary macrophages. It is also involved in muscular protein homeostasis, especially during inflammation-induced atrophy, and may play a role in the pathogenesis of ICU-acquired weakness (ICUAW) by activating and maintaining inflammation in myocytes. Moreover, TRIM62 facilitates K27-linked poly-ubiquitination of CARD9 and also regulates CARD9-mediated anti-fungal immunity and intestinal inflammation. Furthermore, TRIM62 is involved in the regulation of apical-basal polarity and acinar morphogenesis. It also functions as a chromosome 1p35 tumor suppressor and negatively regulates transforming growth factor beta (TGFbeta)-driven epithelial-mesenchymal transition (EMT) through binding to and promoting the ubiquitination of SMAD3, a major effector of TGFbeta-mediated EMT. TRIM62 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 50
34148 319523 cd16609 RING-HC_TRIM65_C-IV RING finger, HC subclass, found in tripartite motif-containing protein TRIM65 and similar proteins. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. TRIM65 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 47
34149 319524 cd16610 RING-HC_TRIM68_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 68 (TRIM68) and similar proteins. TRIM68, also known as RING finger protein 137 (RNF137) or SSA protein SS-56 (SS-56), is an E3 ubiquitin-protein ligase that negatively regulates Toll-like receptor (TLR)- and RIG-I-like receptor (RLR)-driven type I interferon production by degrading TRK fused gene (TFG), a novel driver of IFN-beta downstream of anti-viral detection systems. It also functions as a cofactor for androgen receptor-mediated transcription through regulating ligand-dependent transcription of androgen receptor in prostate cancer cells. Moreover, TRIM68 is a cellular target of autoantibody responses in Sjogren's syndrome (SS), as well as systemic lupus erythematosus (SLE). It is also an auto-antigen for T cells in SS and SLE. TRIM68 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 49
34150 319525 cd16611 RING-HC_TRIM69_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 69 (TRIM69) and similar proteins. TRIM69, also known as RFP-like domain-containing protein trimless or RING finger protein 36 (RNF36), is a testis E3 ubiquitin-protein ligase that plays a specific role in apoptosis and may also play an important role in germ cell homeostasis during spermatogenesis. TRIM69 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 44
34151 319526 cd16612 RING-HC_TRIM72_C-IV RING finger, HC subclass, found in tripartite motif-containing protein 72 (TRIM72) and similar proteins. TRIM72, also known as Mitsugumin-53 (MG53), is a muscle-specific protein that plays a central role in cell membrane repair by nucleating the assembly of the repair machinery at muscle injury sites. It is required in repair of alveolar epithelial cells under plasma membrane stress failure. It interacts with dysferlin to regulate sarcolemmal repair. Upregulation of TRIM72 develops obesity, systemic insulin resistance, dyslipidemia, and hyperglycemia, as well as induces diabetic cardiomyopathy through transcriptional activation of peroxisome proliferation-activated receptor alpha (PPAR-alpha) signaling pathway. Compensation for the absence of AKT signaling by ERK signaling during TRIM72 overexpression leads to pathological hypertrophy. Moreover, TRIM72 functions as a novel negative feedback regulator of myogenesis via targeting insulin receptor substrate-1 (IRS-1). It is transcriptionally activated by the synergism of myogenin (MyoD) and myocyte enhancer factor 2 (MEF2). TRIM72 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 47
34152 319527 cd16613 RING-HC_UHRF RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing proteins, UHRF1 and UHRF2, and similar proteins. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation, but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. 
It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger. 46
34153 319528 cd16614 RING-HC_UNK_like RING finger, HC subclass, found in RING finger protein unkempt (UNK), unkempt-like (UNKL), and similar proteins. UNK, also known as zinc finger CCCH domain-containing protein 5, is a metazoan-specific zinc finger protein enriched in embryonic brains. It may play a broad regulatory role during the formation of the central nervous system (CNS). It is a sequence-specific RNA-binding protein required for the early neuronal morphology. UNK is a neurogenic component of the mTOR pathway, and functions as a negative regulator of the timing of photoreceptor differentiation. It also specifically binds to Brg/Brm-associated factor BAF60b and promotes its ubiquitination in a Rac1-dependent manner. UNKL, also known as zinc finger CCCH domain-containing protein 5-like, is a putative E3 ubiquitin-protein ligase that may participate in a protein complex showing an E3 ligase activity regulated by RAC1. Both UNK and UNKL contain several tandem CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at the C-terminus. 34
34154 319529 cd16615 RING-HC_ZNF598 RING finger, HC subclass, found in zinc finger protein 598 (ZNF598) and similar proteins. ZNF598 associates with eukaryotic initiation factor 4E (eIF4E) homologous protein from mammals (m4EHP) through binding to Grb10-interacting GYF protein 2 (GIGYF2). The m4EHP-GIGYF2 complex functions as a translational repressor and is essential for normal embryonic development of mammals. ZNF598 harbors a C3HC4-type RING-HC finger at its N-terminus. 41
34155 319530 cd16616 mRING-HC-C4C4_Asi1p_like Modified RING finger, HC subclass (C4C4-type), found in Saccharomyces cerevisiae amino acid sensor-independent protein Asi1p, Asi3p and similar proteins. Asi1p and Asi3p are inner nuclear membrane proteins that act as negative regulators of SPS (Ssy1-Ptr3-Ssy5)-sensor signaling in yeast. Together with Asi2p, they assemble into an Asi complex that functions in the SPS amino acid sensing pathway involved in degradation of Stp1 and Stp2 transcription factors. Both Asi1p and Asi3p contain five membrane-spanning domains, as well as highly conserved RING fingers at their extreme C termini, which are a C4C4-type RING finger motif whose overall folding is similar to that of the C3HC4-type RING-HC finger. 44
34156 319531 cd16617 mRING-HC-C4C4_CesA_plant Modified RING finger, HC subclass (C4C4-type), found in Arabidopsis thaliana cellulose synthase A (CesA) catalytic subunit 1-10, and similar proteins from plant. The family includes a group of plant catalytic subunits of cellulose synthase terminal complexes ("rosettes") required for beta-1,4-glucan microfibril crystallization, a major mechanism of the cell wall formation. CesA1, also known as protein RADIALLY SWOLLEN 1 (RSW1), is required during embryogenesis for cell elongation, orientation of cell expansion and complex cell wall formations, such as interdigitated pattern of epidermal pavement cells, stomatal guard cells, and trichomes. It plays a role in lateral roots formation, but seems unnecessary for the development of tip-growing cells such as root hairs. CesA2, also known as Ath-A, is involved in the primary cell wall formation. It forms a homodimer. CesA3, also known as constitutive expression of VSP1 protein 1, or isoxaben-resistant protein 1, or Ath-B, or protein ECTOPIC LIGNIN 1, or protein RADIALLY SWOLLEN 5 (RSW5), is involved in the primary cell wall formation, especially in roots. CesA4, also known as protein IRREGULAR XYLEM 5 (IRX5), is involved in the secondary cell wall formation, and required for the xylem cell wall thickening. CesA5 may be partially redundant with CesA6. CesA6, also known as AraxCelA, isoxaben-resistant protein 2, protein PROCUSTE 1, or protein QUILL, is involved in the primary cell wall formation. Like CesA1, CesA6 is critical for cell expansion. The CESA6-dependent cell elongation seems to be independent of gibberellic acid, auxin, and ethylene. CesA6 interacts with and moves along cortical microtubules for the process of cellulose deposition. CesA7, also known as protein FRAGILE FIBER 5, or protein IRREGULAR XYLEM 3 (IRX3) is involved in the secondary cell wall formation and required for the xylem cell wall thickening. 
CesA8, also known as protein IRREGULAR XYLEM 1 (IRX1) or protein LEAF WILTING 2, is involved in the secondary cell wall formation and required for the xylem cell wall thickening. The biological functions of CesA9 and CesA10 remain unclear. CesA1, CesA3, and CesA6 form a functional complex essential for primary cell wall cellulose synthesis, while CesA4, CesA7, and CesA8 form a functional complex located in secondary cell wall deposition sites. All family members contain an N-terminal C4C4-type RING-HC finger and a C-terminal glycosyltransferase family A (GT-A) domain. 51
34157 319532 cd16618 mRING-HC-C4C4_CNOT4 Modified RING finger, HC subclass (C4C4-type), found in CCR4-NOT transcription complex subunit 4 (NOT4) and similar proteins. NOT4, also known as CCR4-associated factor 4, E3 ubiquitin-protein ligase CNOT4, or potential transcriptional repressor NOT4, is a component of the multifunctional CCR4-NOT complex, a global regulator of RNA polymerase II transcription. It associates with polysomes and contributes to the negative regulation of protein synthesis. NOT4 functions as an E3 ubiquitin-protein ligase that interacts with a specific E2, Ubc4/5 in yeast, and the ortholog UbcH5B in humans, and ubiquitylates a wide range of substrates, including ribosome-associated factors. Thus, it plays a role in cotranslational quality control (QC) through ribosome-associated ubiquitination and degradation of aberrant peptides. NOT4 contains a C4C4-type RING finger motif, whose overall folding is similar to that of the C3HC4-type RING-HC finger, a central RNA recognition motif (RRM), and a C-terminal domain predicted to be unstructured. 45
34158 319533 cd16619 mRING-HC-C4C4_TRIM37_C-VIII Modified RING finger, HC subclass (C4C4-type), found in tripartite motif-containing protein 37 (TRIM37) and similar proteins. TRIM37, also known as mulibrey nanism protein, or MUL, is a peroxisomal E3 ubiquitin-protein ligase that is involved in the tumorigenesis of several cancer types, including pancreatic ductal adenocarcinoma (PDAC), hepatocellular carcinoma (HCC), breast cancer, and sporadic fibrothecoma. It mono-ubiquitinates histone H2A, a chromatin modification associated with transcriptional repression. Moreover, TRIM37 possesses anti-HIV-1 activity, and interferes with viral DNA synthesis. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction, and hepatomegaly. TRIM37 belongs to the C-VIII subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a MATH (meprin and TRAF-C homology) domain positioned C-terminal to the RBCC domain. Its MATH domain has been shown to interact with the TRAF (TNF-Receptor-Associated Factor) domain of six known TRAFs in vitro. 43
34159 319534 cd16620 vRING-HC-C4C4_RBBP6 Variant RING finger, HC subclass (C4C4-type), found in retinoblastoma-binding protein 6 (RBBP6) and similar proteins. RBBP6, also known as proliferation potential-related protein, protein P2P-R, retinoblastoma-binding Q protein 1 (RBQ-1), or p53-associated cellular protein of testis (PACT), is a nuclear E3 ubiquitin-protein ligase involved in multiple processes, such as the control of gene expression, mitosis, cell differentiation, and cell apoptosis. It plays a role in both promoting and inhibiting apoptosis in many human cancers, including esophageal, lung, hepatocellular, and colon cancers, familial myeloproliferative neoplasms, as well as in human immunodeficiency virus-associated nephropathy (HIVAN). It functions as an Rb- and p53-binding protein that plays an important role in chaperone-mediated ubiquitination and possibly in protein quality control. It acts as a scaffold protein to promote the assembly of the p53/TP53-MDM2 complex, resulting in an increase of MDM2-mediated ubiquitination and degradation of p53/TP53, and leading to both apoptosis and cell growth. It is also a double-stranded RNA-binding protein that plays a role in mRNA processing by regulating the human polyadenylation machinery and modulating expression of mRNAs with AU-rich 3' untranslated regions (UTRs). Moreover, RBBP6 ubiquitinates and destabilizes the transcriptional repressor ZBTB38 that negatively regulates transcription and levels of the MCM10 replication factor on chromatin. Furthermore, RBBP6 is involved in tunicamycin-induced apoptosis through mediating protein kinase (PKR) activation. RBBP6 contains an N-terminal ubiquitin-like domain and a C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger. RBBP6 interacts with chaperones Hsp70 and Hsp40 through its N-terminal ubiquitin-like domain. It promotes the ubiquitination of p53 by Hdm2 in an E4-like manner through its RING finger. 
It also interacts directly with the pro-proliferative transcription factor Y-box-binding protein-1 (YB-1) via its RING finger. 45
34160 319535 cd16621 vRING-HC-C4C4_RFPL1_like Variant RING finger, HC subclass (C4C4-type), found in Ret finger protein-like (RFPL) family. RFPL family, also known as RING-B30 family, represents a group of RFPL gene products, RFPL1, RFPL2, RFPL3, and RFPL4A, which are characterized by containing an N-terminal RFPL1, 2, 3-specifying helix (RSH), a C4C4- or a modified C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger, an RFPL-defining motif (RDM), and C-terminal PRY/SPRY-forming B30.2 domain. RFPL1, also known as RING finger protein 78 (RNF78), is expressed during cell differentiation, its impact on cell-cycle lengthening therefore provides novel insights into primate-specific development. RFPL2, also known as RING finger protein 79 (RNF79), shows high sequence similarity with other RFPL gene products. Its biological role remains unclear. RFPL3 interacts directly with CREB binding protein (CBP) in the nucleus of lung cancer cells. RFPL3 and CBP synergistically upregulate TERT activity and promote lung cancer growth. Moreover, RFPL3 acts as a novel E3 ubiquitin ligase modulating the integration activity of human immunodeficiency virus, type 1 (HIV-1) preintegration complex. RFPL4A, also known as RING finger protein 210 (RNF210), is a novel factor that increases the G1 population and decreases sensitivity to chemotherapy in human colorectal cancer cells. This family corresponds to the C4C4-type RING finger. RFPL4A lacks the fourth conserved zinc-binding residue, cysteine, and the eighth zinc-binding residue, cysteine, in RFPL2 is replaced by serine. 44
34161 319536 cd16622 vRING-HC-C4C4_RBR_RNF217 Variant RING finger, HC subclass (C4C4-type), found in RING finger protein 217 (RNF217) and similar proteins. RNF217, also known as IBR domain-containing protein 1 (IBRDC1), is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase mainly expressed in testis and skeletal muscle with different splice variants. It interacts with the anti-apoptotic protein HAX1, and is adjacent to a translocation breakpoint involving ETV6 in childhood acute lymphoblastic leukemia (ALL). RNF217 contains a RBR domain followed by TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a variant C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination. 47
34162 319537 cd16623 RING-HC_RBR_TRIAD1_like RING finger, HC subclass, found in two RING fingers and DRIL [double RING finger linked] 1 (TRIAD1), ankyrin repeat and IBR domain-containing protein 1 (ANKIB1) and similar proteins. TRIAD1, also known as ariadne-2 (ARI-2), protein ariadne-2 homolog, Ariadne RBR E3 ubiquitin protein ligase 2 (ARIH2), or UbcM4-interacting protein 48, is a RBR-type E3 ubiquitin-protein ligase that catalyzes the formation of polyubiquitin chains linked via lysine-48 as well as lysine-63 residues. Its auto-ubiquitylation can be catalyzed by the E2 conjugating enzyme UBCH7. TRIAD1 has been implicated in hematopoiesis, specifically in myelopoiesis, as well as in embryogenesis. ANKIB1 is a RBR-type E3 ubiquitin-protein ligase that may function as part of E3 complex, which accepts ubiquitin from specific E2 ubiquitin-conjugating enzymes and then transfers it to substrates. Both TRIAD1 and ANKIB1 contain a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain use an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. In contrast to TRIAD1, ANKIB1 harbors an extra N-terminal ankyrin repeats domain. 50
34163 319538 cd16624 RING-HC_RBR_CUL9 RING finger, HC subclass, found in cullin-9 (CUL-9) and similar proteins. CUL-9, also known as UbcH7-associated protein 1 (H7-AP1), p53-associated parkin-like cytoplasmic protein, or PARC, is a cytoplasmic RBR-type E3 ubiquitin-protein ligase that is a tumor suppressor and promotes p53-dependent apoptosis. It mediates the ubiquitination and degradation of cytosolic cytochrome c to promote survival in neurons and cancer cells. It is also a critical downstream effector of the 3M complex in the maintenance of microtubules and genome integrity. Moreover, CUL-9, together with CUL-7, forms homodimers and heterodimers, as well as some atypical cullin RING ligase complexes, which may exhibit ubiquitin ligase activity. CUL-9 contains a CPH domain (Cul7, PARC, and HERC2), a DOC (DOC1/APC10) domain, cullin homology (CH) domains linked with E3 ligase function, and a C-terminal RBR domain previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain use an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 52
34164 319539 cd16625 RING-HC_RBR_HEL2_like RING finger, HC subclass, found in Saccharomyces cerevisiae histone E3 ligase 2 (HEL2) and similar proteins. HEL2 is an E3 ubiquitin-protein ligase that interacts with the E2 ubiquitin-conjugating enzyme UBC4 and histones H3 and H4. It plays an important role in regulating histone protein levels and also likely to contribute to the maintenance of genomic stability in the budding yeast. HEL2 can be phosphorylated by the DNA damage checkpoint kinase and histone protein regulator Rad53. This family also includes Schizosaccharomyces pombe histone E3 ligase 1 (HEL1), also known as DNA-break-localizing protein 4 (dbl4), and Dictyostelium discoideum Ariadne-like ubiquitin ligase (RbrA). RbrA may act as an E3 ubiquitin-protein ligase that appears to be required for normal cell-type proportioning and cell sorting during multicellular development, and is also necessary for spore cell viability. Members in this family contain a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain use an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 54
34165 319540 cd16626 RING-HC_RBR_HHARI RING finger, HC subclass, found in human homolog of Drosophila ariadne (HHARI) and similar proteins. The family includes Drosophila melanogaster protein ariadne-1 (ARI-1), and its eukaryotic homologs, such as HHARI. ARI-1 is a novel widely expressed Drosophila RING-finger protein that localizes mainly in the cytoplasm and is required for neural development. It interacts with a novel ubiquitin-conjugating enzyme, UbcD10. HHARI, also known as H7-AP2, monocyte protein 6 (MOP-6), protein ariadne-1 homolog, Ariadne RBR E3 ubiquitin protein ligase 1 (ARIH1), ariadne-1 (ARI-1), UbcH7-binding protein, UbcM4-interacting protein, or ubiquitin-conjugating enzyme E2-binding protein 1, is a RBR-type E3 ubiquitin-protein ligase highly expressed in nuclei, where it is co-localized with nuclear bodies including Cajal, PML, and Lewy bodies. It interacts with the E2 conjugating enzymes UbcH7, UbcH8, UbcM4, and UbcD10 in human, mouse, and fly, and modulates the ubiquitylation of substrate proteins including single-minded 2 (SIM2) and translation initiation factor 4E homologous protein (4EHP). It functions as a potent mediator of DNA damage-induced translation arrest, which protects stem and cancer cells against genotoxic stress by initiating a 4EHP-mediated mRNA translation arrest. HHARI contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain use an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. 
This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 58
34166 319541 cd16627 RING-HC_RBR_parkin RING finger, HC subclass, found in parkin and similar proteins. Parkin, also known as Parkinson juvenile disease protein 2, is a RBR-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins, such as BCL2, SYT11, CCNE1, GPR37, RHOT1/MIRO1, MFN1, MFN2, STUB1, SNCAIP, SEPT5, TOMM20, USP30, ZNF746, and AIMP2. It mediates monoubiquitination, as well as Lys-6-, Lys-11-, Lys-48- and Lys-63-linked polyubiquitination of substrates depending on the context. Parkin may enhance cell viability and protect dopaminergic neurons from oxidative stress-mediated death by regulating mitochondrial function. It also limits the production of reactive oxygen species (ROS) and regulates cyclin-E during neuronal apoptosis. Moreover, parkin displays a ubiquitin ligase-independent function in transcriptional repression of p53. Parkin contains an N-terminal ubiquitin-like domain and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 59
34167 319542 cd16628 RING-HC_RBR_RNF14 RING finger, HC subclass, found in RING finger protein 14 (RNF14) and similar proteins. RNF14, also known as androgen receptor-associated protein 54 (ARA54), HFB30, or Triad2 protein, is a RBR-type E3 ubiquitin-protein ligase that is highly expressed in the testis and interacts with class III E2s (UBE2E2, UbcH6, and UBE2E3). Its differential localization may play an important role in testicular development and spermatogenesis in humans. RNF14 functions as a transcriptional regulator of mitochondrial and immune function in muscle. It is a ligand-dependent androgen receptor (AR) co-activator that enhances AR-dependent transcriptional activation. It also may participate in enhancing cell cycle progression and cell proliferation via induction of cyclin D1. Moreover, RNF14 is crucial for colon cancer cell survival. It acts as a new enhancer of the Wnt-dependent transcriptional outputs that acts at the level of the T-cell factor/lymphoid enhancer factor (TCF/LEF)-beta-catenin complex. RNF14 contains an N-terminal RWD domain and a C-terminal RBR domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 54
34168 319543 cd16629 RING-HC_RBR_RNF19 RING finger, HC subclass, found in the family of RING finger proteins RNF19A, RNF19B and similar proteins. The family includes RING finger protein RNF19A and RNF19B, both of which are transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligases. RNF19A, also known as double ring-finger protein (Dorfin) or p38, localizes to the ubiquitylated inclusions in Parkinson's disease (PD), dementia with Lewy bodies, multiple system atrophy, and amyotrophic lateral sclerosis (ALS). It interacts with Psmc3, a protein component of the 19S regulatory cap of the 26S proteasome, and further participates in the ubiquitin-proteasome system in acrosome biogenesis, spermatid head shaping, and development of the head-tail coupling apparatus and tail. It modulates the ubiquitination and degradation of calcium-sensing receptor (CaR), which may contribute to a general mechanism for CaR quality control during biosynthesis. Moreover, RNF19A can also ubiquitylate mutant superoxide dismutase 1 (SOD1), the causative gene of familial ALS. It may associate with endoplasmic reticulum-associated degradation (ERAD) pathway, which is related to the pathogenesis of neurodegenerative disorders, such as PD or Alzheimer's disease. It is also involved in the pathogenic process of PD and Lewy body (LB) formation by ubiquitylation of synphilin-1. RNF19B, also known as IBR domain-containing protein 3 or natural killer lytic-associated molecule (NKLAM), plays a role in controlling tumor dissemination and metastasis. It is involved in the cytolytic function of natural killer (NK) cells and cytotoxic T lymphocytes (CTLs). It interacts with ubiquitin conjugates UbcH7 and UbcH8, and ubiquitinates uridine kinase like-1 (URKL-1) protein, targeting it for degradation. Moreover, RNF19B is a novel component of macrophage phagosomes and plays a role in macrophage anti-bacterial activity. 
It functions as a novel modulator of macrophage inducible nitric oxide synthase (iNOS) expression. Both RNF19A and RNF19B contain a RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain use an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 55
34169 319544 cd16630 RING-HC_RBR_RNF216 RING finger, HC subclass, found in RING finger protein 216 (RNF216) and similar proteins. RNF216, also known as Triad domain-containing protein 3 (Triad3A), ubiquitin-conjugating enzyme 7-interacting protein 1, or zinc finger protein inhibiting NF-kappa-B (ZIN), is a RBR-type E3 ubiquitin-protein ligase that interacts with several components of Toll-like receptor (TLR) signaling and promotes their proteolytic degradation. It negatively regulates the RIG-I RNA sensing pathway through Lys48-linked, ubiquitin-mediated degradation of the tumor necrosis factor receptor-associated factor 3 (TRAF3) adapter following RNA virus infection. It also controls ubiquitination and proteasomal degradation of receptor-interacting protein 1 (RIP1), a serine/threonine protein kinase that is critically involved in tumor necrosis factor receptor-1 (TNF-R1)-induced NF-kappa B activation, following disruption of Hsp90 binding. Moreover, RNF216 is involved in inflammatory diseases through strongly inhibiting autophagy in macrophages. It interacts with and ubiquitinates BECN1, a key regulator in autophagy, thereby contributing to BECN1 degradation. It regulates synaptic strength by ubiquitination of Arc, resulting in its rapid proteasomal degradation. It is also a key negative regulator of sustained Killer cell Ig-like receptor (KIR) with two Ig-like domains and a long cytoplasmic domain 4 (2DL4)-mediated NF-kappaB signaling from internalized 2DL4, which functions by promoting ubiquitylation and degradation of endocytosed receptor from early endosomes. Furthermore, RNF216 interacts with human immunodeficiency virus type 1 (HIV-1) Virion infectivity factor (Vif) protein, which is essential for the productive infection of primary human CD4 T lymphocytes and macrophages. 
Mutations in RNF216 may result in Gordon Holmes syndrome, a condition defined by hypogonadotropic hypogonadism and cerebellar ataxia, as well as in autosomal recessive Huntington-like disorder. RNF216 contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger motif required for RBR-mediated ubiquitination. 58
34170 319545 cd16631 mRING-HC-C4C4_RBR_HOIP Modified RING finger, HC subclass (C4C4-type), found in HOIL-1-interacting protein (HOIP) and similar proteins. HOIP, also known as RING finger protein 31 (RNF31) or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. It also interacts with the atypical mammalian orphan receptor DAX-1, triggers DAX-1 ubiquitination and stabilization, and participates in repressing steroidogenic gene expression. HOIP contains three Npl4 zinc fingers, a central ubiquitin-associated (UBA) domain responsible for the interaction with the N-terminal ubiquitin-like domain (UBL) of HOIL-1L, a RBR domain, and a C-terminal linear chain determining domain (LDD). The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger motif whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination. 53
34171 319546 cd16632 mRING-HC-C4C4_RBR_RNF144 Modified RING finger, HC subclass (C4C4-type), found in the RNF144 protein family. The RNF144 family includes RNF144A and RNF144B, both of which are transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligases. RNF144A, also known as UbcM4-interacting protein 4 (UIP4) or ubiquitin-conjugating enzyme 7-interacting protein 4, targets DNA-dependent protein kinase catalytic subunit (DNA-PKcs), and thus promotes DNA damage-induced cell apoptosis. It is transcriptionally repressed by metastasis-associated protein 1 (MTA1) and inhibits MTA1-driven cancer cell migration and invasion. RNF144B, also known as PIR2, IBR domain-containing protein 2 (IBRDC2), or p53-inducible RING finger protein (p53RFP), induces p53-dependent, but caspase-independent apoptosis. It interacts with E2 ubiquitin-conjugating enzymes UbcH7 and UbcH8, but not with UbcH5. It is involved in ubiquitination and degradation of p21, a p53 downstream protein promoting growth arrest and antagonizing apoptosis, suggesting a role in switching a cell from p53-mediated growth arrest to apoptosis. Moreover, RNF144B regulates the levels of Bax, a pro-apoptotic protein from the Bcl-2 family, and protects cells from unprompted Bax activation and cell death. It also regulates epithelial homeostasis by mediating degradation of p21WAF1 and p63. Both RNF144A and RNF144B contain a RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. 
The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination. 51
34172 319547 cd16633 mRING-HC-C3HC3D_RBR_HOIL1 Modified RING finger, HC subclass (C3HC3D-type), found in heme-oxidized IRP2 ubiquitin ligase 1 (HOIL-1) and similar proteins. HOIL-1, also known as RBCK1, HOIL-1L, RanBP-type and C3HC4-type zinc finger-containing protein 1, HBV-associated factor 4, Hepatitis B virus X-associated protein 4, RING finger protein 54 (RNF54), ubiquitin-conjugating enzyme 7-interacting protein 3, or UbcM4-interacting protein 28 (UIP28), together with E3 ubiquitin-protein ligase RNF31 (also known as HOIP) and SHANK-associated RH domain interacting protein (SHARPIN), forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis through conjugation of linear polyubiquitin chains to NF-kappaB essential modulator (also known as NEMO or IKBKG). HOIL-1 plays a crucial role in TNF-alpha-mediated NF-kappaB activation. It also functions as an ubiquitin-protein ligase E3 that interacts with not only PKCbeta, but also PKCzeta. It can recognize heme-oxidized IRP2 (iron regulatory protein 2) and is thought to affect the turnover of oxidatively damaged proteins. HOIL-1 contains an N-terminal ubiquitin-like (UBL) domain and an Npl4 zinc-finger (NZF) domain, which regulate the interaction with the LUBAC subunit RNF31 and ubiquitin, respectively. The NZF domain belongs to RanBP2-type zinc finger (zf-RanBP2) domain superfamily. In addition, HOIL-1 has a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. 
The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a modified C3HC3D-type RING-HC finger required for RBR-mediated ubiquitination. 55
34173 319548 cd16634 mRING-HC-C3HC3D_Nrdp1 Modified RING finger, HC subclass (C3HC3D-type), found in neuregulin receptor degradation protein-1 (Nrdp1) and similar proteins. Nrdp1 (referred to as FLRF in mice), also known as RING finger protein 41 (RNF41), is an E3 ubiquitin-protein ligase that plays a critical role in the regulation of cell growth and apoptosis, inflammation and production of reactive oxygen species (ROS), as well as in doxorubicin (DOX)-induced cardiac injury. It promotes ubiquitination and degradation of the epidermal growth factor receptor (EGFR/ErbB) family member, ErbB3, which is independent of growth factor stimulation. It also promotes M2 macrophage polarization by ubiquitinating and activating transcription factor CCAAT/enhancer-binding Protein beta (C/EBPbeta) via Lys-63-linked ubiquitination. Moreover, Nrdp1 interacts with and modulates activity of Parkin, a causative protein for early onset recessive juvenile parkinsonism (AR-JP). It also interacts with ubiquitin-specific protease 8 (USP8), which is involved in trafficking of various transmembrane proteins. Furthermore, Nrdp1 inhibits basal lysosomal degradation and enhances ectodomain shedding of JAK2-associated cytokine receptors. Its phosphorylation by the kinase Par-1b (also known as MARK2) is required for epithelial cell polarity. Nrdp1 contains an N-terminal modified C3HC3D-type RING-HC finger required for enhancing ErbB3 degradation, a B-box, a coiled-coil domain responsible for Nrdp1 oligomerization, and a C-terminal ErbB3-binding domain. 43
34174 319549 cd16635 mRING-HC-C3HC3D_PHRF1 Modified RING finger, HC subclass (C3HC3D-type), found in PHD and RING finger domain-containing protein 1 (PHRF1) and similar proteins. PHRF1, also known as KIAA1542, or CTD-binding SR-like protein rA9, is a ubiquitin ligase which induces the ubiquitination of TGIF (TG-interacting factor) at lysine 130. It acts as a tumor suppressor that promotes the transforming growth factor (TGF)-beta cytostatic program through selective release of TGIF-driven promyelocytic leukemia protein (PML) inactivation. PHRF1 contains a plant homeodomain (PHD) finger and a modified C3HC3D-type RING-HC finger. 44
34175 319550 cd16636 mRING-HC-C3HC3D_SCAF11 Modified RING finger, HC subclass (C3HC3D-type), found in SR-related and CTD-associated factor 11 (SCAF11) and similar proteins. SCAF11, also known as CTD-associated SR protein 11 (CASP11), renal carcinoma antigen NY-REN-40, SC35-interacting protein 1 (Sip1), Serine/arginine-rich splicing factor 2 (SRSF2)-interacting protein, or splicing regulatory protein 129 (SRrp129), is a novel arginine-serine-rich (RS) domain-containing protein essential for pre-mRNA splicing. It functions as an auxiliary splice factor interacting with spliceosomal component SC35 promoting RNAPII elongation. In addition to SR proteins, such as SC35, ASF/SF2, SRp75, and SRp20, SCAF11 also associates with U1-70K and U2AF65, proteins associated with 5' and 3' splice sites, respectively. SCAF11 contains an N-terminal modified C3HC3D-type RING-HC finger, an internal serine-arginine rich domain (SR domain), and a C-terminal SRI domain. 43
34176 319551 cd16637 mRING-HC-C3HC3D_LNX1_like Modified RING finger, HC subclass (C3HC3D-type), found in ligand of Numb protein LNX1, LNX2, and similar proteins. The ligand of Numb protein X (LNX) family, also known as PDZ and RING (PDZRN) family, includes LNX1-5, which can interact with Numb, a key regulator of neurogenesis and neuronal differentiation. LNX5 (also known as PDZK4 or PDZRN4L) shows high sequence homology to LNX3 and LNX4, but it lacks the RING domain. LNX1-4 proteins function as E3 ubiquitin ligases and have a unique domain architecture consisting of an N-terminal RING-HC finger for E3 ubiquitin ligase activity and either two or four PDZ domains necessary for the substrate-binding. This family corresponds to LNX1/LNX2-like proteins, which contains a modified C3HC3D-type RING-HC finger and four PDZ domains. 42
34177 319552 cd16638 mRING-HC-C3HC3D_Roquin Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-1, Roquin-2, and similar proteins. This family corresponds to the ROQUIN family of proteins, including Roquin-1, Roquin-2, and similar proteins, which localize to the cytoplasm and upon stress are concentrated in stress granules. They may play essential roles in preventing T-cell-mediated autoimmune disease and in microRNA-mediated repression of inducible costimulator (Icos) mRNA. They function as E3 ubiquitin ligases consisting of an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 activity, a highly conserved ROQ domain required for RNA binding and localization to stress granules, and a CCCH-type zinc finger involved in RNA recognition. 44
34178 319553 cd16639 RING-HC_TRAF2 RING finger, HC subclass, found in tumor necrosis factor (TNF) receptor-associated factor 2 (TRAF2) and similar proteins. TRAF2, also known as tumor necrosis factor type 2 receptor-associated protein 3, is an E3 ubiquitin-protein ligase that was identified as a 75 kDa tumor necrosis factor receptor (TNF-R2)-associated signaling protein. It interacts with members of the TNF receptor superfamily and connects the receptors to downstream signaling proteins. It also mediates K63-linked polyubiquitination of RIP1, a kinase pivotal in TNFalpha-induced NF-kappaB activation. Moreover, TRAF2 regulates peripheral CD8(+) T-cell and NKT-cell homeostasis by modulating sensitivity to IL-15. It also acts as an important biological suppressor of necroptosis. It inhibits TNF-related apoptosis inducing ligand (TRAIL)- and CD95L-induced apoptosis and necroptosis. TRAF2 contains an N-terminal domain with a typical C3HC4-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain. 42
34179 319554 cd16640 RING-HC_TRAF3 RING finger, HC subclass, found in tumor necrosis factor (TNF) receptor-associated factor 3 (TRAF3) and similar proteins. TRAF3, also known as CAP-1, CD40 receptor-associated factor 1 (CRAF1), CD40-binding protein (CD40BP), or LMP1-associated protein 1 (LAP1), is a member of TRAF protein family, which mainly functions in the immune system, where it mediates signaling through tumor necrosis factor receptors (TNFRs) and interleukin-1/Toll-like receptors (IL-1/TLRs). It also plays a unique cell type-specific and critical role in the restraint of B-cell homeostatic survival, a role with important implications for both B-cell differentiation and the pathogenesis of B-cell malignancies. Meanwhile, TRAF3 differentially regulates differentiation of specific T cell subsets. It is required for iNKT cell development, restrains Treg cell development in the thymus, and plays an essential role in the homeostasis of central memory CD8+ T cells. TRAF3 contains an N-terminal domain with a typical C3HC4-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain, and a conserved TRAF-C domain. 42
34180 319555 cd16641 mRING-HC-C3HC3D_TRAF4 Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 4 (TRAF4) and similar proteins. TRAF4, also known as cysteine-rich domain associated with RING and Traf domains protein 1, or metastatic lymph node gene 62 protein (MLN 62), or RING finger protein 83 (RNF83), is a member of TRAF protein family, which mainly function in the immune system, where they mediate signaling through tumor necrosis factor receptors (TNFRs) and interleukin-1/Toll-like receptors (IL-1/TLRs). It also plays a critical role in the nervous system, as well as in carcinogenesis. TRAF4 promotes the growth and invasion of colon cancer through the Wnt/beta-catenin pathway. It contributes to the TNFalpha-induced activation of 70 kDa ribosomal protein S6 kinase (p70s6k) signaling pathway, and activation of transforming growth factor beta (TGF-beta)-induced SMAD-dependent signaling and non-SMAD signaling in breast cancer. It also enhances osteosarcoma cell proliferation and invasion by Akt signaling pathway. Moreover, TRAF4 is a novel phosphoinositide-binding protein modulating tight junctions and favoring cell migration. TRAF4 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain. 45
34181 319556 cd16642 mRING-HC-C3HC3D_TRAF5 Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 5 (TRAF5) and similar proteins. TRAF5, also known as RING finger protein 84 (RNF84), is an important signal transducer for a wide range of TNF receptor superfamily members, including tumor necrosis factor receptor 1 (TNFR1), tumor necrosis factor receptor 2 (TNFR2), CD40, and other lymphocyte costimulatory receptors, RANK/TRANCE-R, ectodysplasin-A Receptor (EDAR), lymphotoxin-beta receptor (LT-betaR), latent membrane protein 1 (LMP1), and IRE1. It functions as an activator of NF-kappaB, MAPK, and JNK, and is involved in both RANKL- and TNFalpha-induced osteoclastogenesis. It mediates CD40 signaling through associating with the cytoplasmic tail of CD40. It also negatively regulates Toll-like receptor (TLR) signaling and functions as a negative regulator of the interleukin 6 (IL-6) receptor signaling pathway that limits the differentiation of inflammatory CD4(+) T cells. TRAF5 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain. 43
34182 319557 cd16643 mRING-HC-C3HC3D_TRAF6 Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 6 (TRAF6) and similar proteins. TRAF6, also known as interleukin-1 signal transducer or RING finger protein 85 (RNF85), is a cytoplasmic adapter protein that mediates signals induced by the tumor necrosis factor receptor (TNFR) superfamily and Toll-like receptor (TLR)/interleukin-1 receptor (IL-1R) family. It functions as a mediator involved in the activation of mitogen-activated protein kinase (MAPK), phosphoinositide 3-kinase (PI3K), and interferon regulatory factor pathways, as well as in IL-1R-mediated activation of NF-kappaB. TRAF6 is also an oncogene that plays a vital role in K-RAS-mediated oncogenesis. TRAF6 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain. 58
34183 319558 cd16644 mRING-HC-C3HC3D_TRAF7 Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 7 (TRAF7) and similar proteins. TRAF7, also known as RING finger and WD repeat-containing protein 1 or RING finger protein 119 (RNF119), is an E3 ubiquitin-protein ligase involved in signal transduction pathways that lead either to activation or repression of NF-kappaB transcription factor through promoting K29-linked ubiquitination of several cellular targets, including the NF-kappaB essential modulator (NEMO) and the p65 subunit of NF-kappaB transcription factor. It is also involved in K29-linked polyubiquitination that has been implicated in lysosomal degradation of proteins. Moreover, TRAF7 is required for K48-linked ubiquitination of p53, a key tumor suppressor and a master regulator of various signaling pathways, such as those related to apoptosis, cell cycle and DNA repair. It is also required for tumor necrosis factor alpha (TNFalpha)-induced Jun N-terminal kinase activation and promotes cell death by regulating polyubiquitination and lysosomal degradation of c-FLIP protein. Furthermore, TRAF7 functions as small ubiquitin-like modifier (SUMO) E3 ligase involved in other post-translational modification, such as sumoylation. It binds to and stimulates sumoylation of the proto-oncogene product c-Myb, a transcription factor regulating proliferation and differentiation of hematopoietic cells. It potentiates MEKK3-induced AP1 and CHOP activation and induces apoptosis. Meanwhile, TRAF7 mediates MyoD1 regulation of the pathway and cell-cycle progression in myoblasts. It also plays a role in Toll-like receptors (TLR) signaling. TRAF7 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and an adjacent zinc finger, and a unique C-terminal domain that comprises a coiled coil domain and seven WD40 repeats. 39
34184 319559 cd16645 mRING-HC-C3HC3D_TRIM23_C-IX Modified RING finger, HC subclass (C3HC3D-type), found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a modified C3HC3D-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition. 50
34185 319560 cd16646 mRING-HC-C2H2C4_MDM2_like Modified RING finger, HC subclass (C2H2C4-type), found in E3 ubiquitin-protein ligase MDM2, protein MDM4 and similar proteins. MDM2 (also known as HDM2) and MDM4 (also known as MDMX or HDMX) are the primary p53 tumor suppressor negative regulators. They have non-redundant roles in the regulation of p53. MDM2 mainly functions to control p53 stability, while MDM4 controls p53 transcriptional activity. Both MDM2 and MDM4 contain an N-terminal p53-binding domain, a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region, and a C-terminal modified C2H2C4-type RING-HC finger. Mdm2 can form homo-oligomers through its RING domain and displays E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Despite its RING domain and structural similarity with MDM2, MDM4 does not homo-oligomerize and lacks ubiquitin-ligase function, but inhibits the transcriptional activity of p53. In addition, both their RING domains are responsible for the hetero-oligomerization, which is crucial for the suppression of P53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. Moreover, MDM2 and MDM4 can be phosphorylated and destabilized in response to DNA damage stress. In response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11, and L23. However, MDM4 is not bound to ribosomal proteins, suggesting its different response to regulation by small basic proteins such as ribosomal proteins and ARF. 44
34186 319561 cd16647 mRING-HC-C3HC5_NEU1 Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein NEURL1A, NEURL1B, and similar proteins. The family includes Drosophila neuralized (D-neu) protein, and its two mammalian homologs, NEURL1A and NEURL1B. D-neu is a regulator of the developmentally important Notch signaling pathway. NEURL1A, also known as NEURL1, NEU, neuralized 1, or RING finger protein 67 (RNF67), is a mammalian homolog of D-neu. It functions as an E3 ubiquitin-protein ligase that directly interacts with and monoubiquitinates cytoplasmic polyadenylation element-binding protein 3 (CPEB3), an RNA binding protein and a translational regulator of local protein synthesis, which facilitates hippocampal plasticity and hippocampal-dependent memory storage. It also acts as a potential tumor suppressor that causes apoptosis and downregulates Notch target genes in medulloblastoma. NEURL1B, also known as neuralized-2 (NEUR2) or neuralized-like protein 3, is another mammalian homolog of D-neu protein. It functions as an E3 ubiquitin-protein ligase that interacts with and ubiquitinates Delta. Thus, it plays a role in the endocytic pathways for Notch signaling through working cooperatively with another E3 ligase, Mind bomb-1 (Mib1), in Delta endocytosis to hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs)-positive vesicles. Members in this family contain two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 42
34187 319562 cd16648 mRING-HC-C3HC5_MAPL Modified RING finger, HC subclass (C3HC5-type), found in mitochondrial-anchored protein ligase (MAPL) and similar proteins. MAPL, also known as MULAN, mitochondrial ubiquitin ligase activator of NFKB 1, E3 SUMO-protein ligase MUL1, E3 ubiquitin-protein ligase MUL1, growth inhibition and death E3 ligase (GIDE), putative NF-kappa-B-activating protein 266, or RING finger protein 218 (RNF218), is a multifunctional mitochondrial outer membrane protein involved in several processes specific to metazoan (multicellular animal) cells, such as NF-kappaB activation, innate immunity and antiviral signaling, suppression of PINK1/parkin defects, mitophagy in skeletal muscle, and caspase-dependent apoptosis. MAPL contains a unique BAM (beside a membrane)/GIDE (growth inhibition death E3 ligase) domain and a C-terminal modified cytosolic C3HC5-type RING-HC finger which is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 40
34188 319563 cd16649 mRING-HC-C3HC5_CGRF1_like Modified RING finger, HC subclass (C3HC5-type), found in RING finger proteins, RNF26, RNF197 (CGRRF1), RNF156 (MGRN1), RNF157 and similar proteins. This family corresponds to a group of RING finger proteins containing a modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. Cell growth regulator with RING finger domain protein 1 (CGRRF1), also known as cell growth regulatory gene 19 protein (CGR19) or RING finger protein 197 (RNF197), functions as a novel biomarker of tissue to monitor endometrial sensitivity and response to insulin-sensitizing drugs, such as metformin, in the context of obesity. RNF26 is an E3 ubiquitin ligase that temporally regulates virus-triggered type I interferon induction by increasing the stability of Mediator of IRF3 activation, MITA, also known as STING, through K11-linked polyubiquitination of MITA after viral infection and promoting degradation of IRF3, another important component required for virus-triggered interferon induction. Mahogunin ring finger-1 (MGRN1), also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase MGRN1. In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. 
RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis. 41
34189 319564 cd16650 SP-RING_PIAS_like SP-RING finger found in the Siz/PIAS RING (SP-RING) family of SUMO E3 ligases. The SP-RING family includes PIAS (protein inhibitor of activated STAT) proteins, Zmiz proteins, and Siz proteins from plants and fungi. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. It consists of four members: PIAS1, PIAS2 (also known as PIASx), PIAS3, and PIAS4 (also known as PIASy). PIAS proteins were initially identified as inhibitors of activated STAT only, but are now known to interact with and modulate several other proteins, including androgen receptor (AR), tumor suppressor p53, and the transforming growth factor-beta (TGF-beta) signaling protein SMAD. They interact with STATs in a cytokine-dependent manner. PIAS proteins have SUMO E3-ligase activity and interaction of PIAS proteins with transcription factors often results in sumoylation of that protein. Zmiz1 (Zimp10) and its homolog Zmiz2 (Zimp7) were initially identified in humans as androgen receptor (AR) interacting proteins and function as transcriptional co-activators. They interact with BRG1, the catalytic subunit of the SWI-SNF remodeling complex. They also associate with other hormone nuclear receptors and transcription factors such as p53 and Smad3/Smad4, and regulate transcription of specific target genes by altering their chromatin structure. SIZ1 proteins from plants and fungi are also founding members of this family. SIZ1-mediated conjugation of SUMO1 and SUMO2 to other intracellular proteins is essential in Arabidopsis. Yeast SIZ proteins are SUMO E3 ligases involved in a novel pathway of chromosome maintenance. They enhance SUMO modification to many substrates in vivo, but also exhibit unique substrate specificity. 
All family members contain a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, which is essential for SUMO ligase activity. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. 48
34190 319565 cd16651 SPL-RING_NSE2 SPL-RING finger found in E3 SUMO-protein ligase NSE2 and similar proteins. NSE2, also known as MMS21 homolog (MMS21) or non-structural maintenance of chromosomes element 2 homolog (Non-SMC element 2 homolog, NSMCE2), is an autosumoylating small ubiquitin-like modifier (SUMO) ligase required for the response to DNA damage. It regulates sumoylation and nuclear-to-cytoplasmic translocation of skeletal and heart muscle-specific variant of the alpha subunit of nascent polypeptide associated complex (skNAC)-Smyd1 in myogenesis. It is also required for resisting extrinsically induced genotoxic stress. Moreover, NSE2 together with its partner proteins SMC6 and SMC5 form a tight subcomplex of the structural maintenance of chromosomes SMC5-6 complex, which includes another two subcomplexes, NSE1-NSE3-NSE4 and NSE5-NSE6. SMC6 and NSE3 are sumoylated in an NSE2-dependent manner, but SMC5 and NSE1 are not. NSE2-dependent E3 SUMO ligase activity is required for efficient DNA repair, but not for SMC5-6 complex stability. NSE2 contains a RING variant known as a Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription)-like RING (SPL-RING) finger that is likely shared by the SP-RING type SUMO E3 ligases, such as PIAS family proteins. The SPL-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. It harbors only one Zn2+-binding site and is required for the sumoylating activity. 67
34191 319566 cd16652 dRing_Rmd5p_like Degenerated RING (dRING) finger found in Saccharomyces cerevisiae required for meiotic nuclear division protein 5 (Rmd5p) and similar proteins. Rmd5p, also known as glucose-induced degradation protein 2 (Gid2) or sporulation protein RMD5, is an E3 ubiquitin ligase containing a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. It forms the heterodimeric E3 ligase unit of the glucose induced degradation deficient (GID) complex with Gid9 (also known as Fyv10), which has a degenerated RING finger as well. The GID complex triggers polyubiquitylation and subsequent proteasomal degradation of the gluconeogenic enzymes fructose-1,6-bisphosphatase (FBPase), phosphoenolpyruvate carboxykinase (PEPCK), and cytoplasmic malate dehydrogenase (c-MDH). Moreover, Rmd5p can form the GID complex with the other six Gid proteins, including Gid1/Vid30, Gid4/Vid24, Gid5/Vid28, Gid7, Gid8, and Gid9/Fyv10. The GID complex in which the seven Gid proteins reside functions as a novel ubiquitin ligase (E3) involved in the regulation of carbohydrate metabolism. 49
34192 319567 cd16653 RING-like_Rtf2 RING-like Rtf2 domain, C2HC2-type, found in the replication termination factor 2 (Rtf2) protein family. The Rtf2 protein family includes a group of conserved proteins found in eukaryotes ranging from fission yeast to humans. The defining member of the family is Schizosaccharomyces pombe Rtf2 (SpRtf2), which is a proliferating cell nuclear antigen-interacting protein that functions as a key requirement for efficient replication termination at the site-specific replication barrier RTS1. It promotes termination at RTS1 by preventing replication restart. SpRtf2 contains a RING-like Rtf2 domain that is characterized by a C2HC2 motif similar to the C3HC4 RING-HC finger motif known to bind two Zn2+ ions and mediate protein-protein interactions. The C2HC2 motif lacks three of the seven conserved cysteines of the C3HC4 motif, and forms only one functional Zn2+ ion-binding site. The RING-like Rtf2 domain in fission yeast is required to stabilize a paused DNA replication fork during imprinting at the mating type locus, possibly by facilitating sumoylation of PCNA. The family also includes Arabidopsis RTF2 (AtRTF2), an essential nuclear protein required for both normal embryo development and for proper expression of the GFP reporter gene. It plays a critical role in splicing the GFP pre-mRNA, and may also have a more transient regulatory role during the spliceosome cycle. The biological function of Rtf2 homologs found in eumetazoa remains unclear. They contain a variant C2HC2 motif where the middle conserved histidine has been replaced by cysteine. 46
34193 319568 cd16654 RING-Ubox_CHIP U-box domain, a modified RING finger, found in carboxyl terminus of HSP70-interacting protein (CHIP) and similar proteins. CHIP, also known as STIP1 homology and U box-containing protein 1 (STUB1), CLL-associated antigen KW-8, or Antigen NY-CO-7, is a multifunctional protein that functions both as a co-chaperone and an E3 ubiquitin-protein ligase. It couples protein folding and proteasome mediated degradation by interacting with heat shock proteins (e.g. HSC70) and ubiquitinating their misfolded client proteins thereby targeting them for proteasomal degradation. It is also important for cellular differentiation and survival (apoptosis), as well as susceptibility to stress. It targets a wide range of proteins, such as expanded ataxin-1, ataxin-3, huntingtin, and androgen receptor, which play roles in glucocorticoid response, tau degradation, and both p53 and cAMP signaling. CHIP contains an N-terminal tetratricopeptide repeat (TPR) domain responsible for protein-protein interaction, a highly charged middle coiled-coil (CC), and a C-terminal RING-like U-box domain acting as an ubiquitin ligase. 67
34194 319569 cd16655 RING-Ubox_WDSUB1_like U-box domain, a modified RING finger, found in WD repeat, SAM and U-box domain-containing protein 1 (WDSUB1) and similar proteins. WDSUB1 is an uncharacterized protein containing seven WD40 repeats and a SAM domain in addition to the U-box. Its biological role remains unclear. The family also includes many uncharacterized kinase domain-containing U-box (AtPUB) proteins and several MIF4G motif-containing AtPUB proteins from Arabidopsis. 42
34195 319570 cd16656 RING-Ubox_PRP19 U-box domain, a modified RING finger, found in pre-mRNA-processing factor 19 (Prp19) and similar proteins. Prp19, also known as nuclear matrix protein 200 (NMP200), senescence evasion factor (SNEV), or DNA repair protein Pso4 (psoralen-sensitive mutant 4), is a ubiquitously expressed multifunctional E3 ubiquitin ligase with pleiotropic activities in DNA damage signaling, repair, and replicative senescence. It functions as a critical component of DNA repair and DNA damage checkpoint complexes. It senses DNA damage, binds double-stranded DNA in a sequence-independent manner, facilitates processing of damaged DNA, promotes DNA end joining, regulates replication protein A (RPA2) phosphorylation and ubiquitination at damaged DNA, and regulates RNA splicing and mitotic spindle formation in its integral capacity as a scaffold for a multimeric core complex. Prp19 contains an N-terminal E3 ubiquitin ligase U-box domain with E2 recruitment function that facilitates dimerization and is essential for its auto-ubiquitination activity in vitro or when overexpressed, a coiled-coil Prp19 homology region that mediates its tetramerization and interaction with CDC5L and SPF27, and a C-terminal seven-bladed WD40 beta-propeller type of leucine-rich architectural repeats that form an asymmetrical barrel-shaped structure important for substrate recognition and recruitment. 53
34196 319571 cd16657 RING-Ubox_UBE4A U-box domain, a modified RING finger, found in ubiquitin conjugation factor E4 A (UBE4A) and similar proteins. The family includes yeast ubiquitin fusion degradation protein 2 (UFD2p) and its mammalian homolog, UBE4A. Yeast UFD2p, also known as ubiquitin conjugation factor E4 or UB fusion protein 2, is a polyubiquitin chain conjugation factor (E4) in the ubiquitin fusion degradation (UFD) pathway which catalyzes elongation of the ubiquitin chain through Lys48 linkage. It binds to substrates conjugated with one to three ubiquitin molecules and catalyzes the addition of further ubiquitin moieties in the presence of ubiquitin-activating enzyme (E1), ubiquitin-conjugating enzyme (E2) and ubiquitin ligase (E3), yielding multiubiquitylated substrates that are targets for the 26S proteasome. UFD2p is implicated in cell survival under stress conditions and is essential for homoeostasis of unsaturated fatty acids. It interacts with UBL-UBA proteins Rad23 and Dsk2, which are involved in the endoplasmic reticulum-associated degradation, ubiquitin fusion degradation, and OLE-1 gene induction pathway. UBE4A is a U-box-type ubiquitin-protein ligase that is located in common neuroblastoma deletion regions and may be subject to mutations in tumors. It may have a specific role in different biochemical processes other than ubiquitination, including growth or differentiation. Members in this family contain an N-terminal ubiquitin elongating factor core and a RING-like U-box domain at the C-terminus. 70
34197 319572 cd16658 RING-Ubox_UBE4B U-box domain, a modified RING finger, found in ubiquitin conjugation factor E4 B (UBE4B) and similar proteins. UBE4B, also known as UFD2a, is a U-box-type ubiquitin-protein ligase that functions as an E3 ubiquitin ligase and an E4 polyubiquitin chain elongation factor, which catalyzes formation of Lys27- and Lys33-linked polyubiquitin chains rather than the Lys48-linked chain. It is a mammalian homolog of yeast UFD2 ubiquitination factor and participates in the proteasomal degradation of misfolded or damaged proteins through association with chaperones. It is located in common neuroblastoma deletion regions and may be subject to mutations in tumors. UBE4B has contradictory functions upon tumorigenesis as an oncogene or tumor suppressor in different types of cancers. It is essential for Hdm2 (also known as Mdm2)-mediated p53 degradation. It mediates p53 polyubiquitination and degradation, as well as inhibits p53-dependent transactivation and apoptosis, and thus plays an important role in regulating phosphorylated p53 following DNA damage. UBE4B is also associated with other pathways independent of the p53 family, such as polyglutamine aggregation and Wallerian degeneration, both of which are critical in neurodegenerative diseases. Moreover, UBE4B acts as a regulator of epidermal growth factor receptor (EGFR) degradation. It is recruited to endosomes in response to EGFR activation by binding to Hrs, a key component of endosomal sorting complex required for transport (ESCRT) 0, and then regulates endosomal sorting, affecting cellular levels of the EGFR and its downstream signaling. UBE4B contains a ubiquitin elongating factor core and a RING-like U-box domain at the C-terminus. 75
34198 319573 cd16659 RING-Ubox_Emp U-box domain, a modified RING finger, found in erythroblast macrophage protein (Emp) and similar proteins. Emp, also known as cell proliferation-inducing gene 5 protein or macrophage erythroblast attacher (MAEA), is a key protein which functions in normal differentiation of erythroid cells and macrophages. It is a potential biomarker for hematopoietic evaluation of hematopoietic stem cell transplantation (HSCT) patients. Emp was initially identified as a heparin-binding protein involved in the association of erythroblasts with macrophages, which promotes erythroid proliferation and maturation. It also plays an important role in erythroblastic island formation. Absence of Emp leads to failure of erythroblast nuclear extrusion. It is required in definitive erythropoiesis and plays a cell intrinsic role in the erythroid lineage. Emp contains a Lissencephaly type-1-like homology (LisH) motif, a C-terminal to LisH (CTLH) domain, and a RING-like U-box domain at the C-terminus. 50
34199 319574 cd16660 RING-Ubox_RNF37 U-box domain, a modified RING finger, found in RING finger protein 37 (RNF37). RNF37, also known as KIAA0860, U-box domain-containing protein 5 (UBOX5), UbcM4-interacting protein 5 (UIP5), or ubiquitin-conjugating enzyme 7-interacting protein 5, is an E3 ubiquitin-protein ligase found exclusively in the nucleus as part of a nuclear dot-like structure. It interacts with the molecular chaperone VCP/p97 protein. RNF37 contains a U-box domain followed by a potential nuclear location signal (NLS), and a C-terminal C3HC4-type RING-HC finger. The U-box domain is a modified RING finger domain that lacks the hallmark metal-chelating cysteines and histidines of the latter, but is likely to adopt a RING finger-like conformation. The presence of the U-box, but not of the RING finger, is required for the E3 activity. The U-box domain can directly interact with several E2 enzymes, including UbcM2, UbcM3, UbcM4, UbcH5, and UbcH8, suggesting a similar function as the RING finger in the ubiquitination pathway. This family corresponds to the U-box domain. 53
34200 319575 cd16661 RING-Ubox1_NOSIP U-box domain 1, a modified RING finger, found in nitric oxide synthase-interacting protein (NOSIP) and similar proteins. NOSIP, also known as endothelial NO synthase (eNOS)-interacting protein, p33RUL, is an E3 ubiquitin-protein ligase implicated in the control of airway and vascular diameter, mucosal secretion, NO synthesis in ciliated epithelium, and, therefore, of mucociliary and bronchial function. The loss of NOSIP may cause holoprosencephaly and facial anomalies, including cleft lip/palate, cyclopia, and facial midline clefting. NOSIP interacts with neuronal nitric oxide synthase (nNOS) and eNOS by inhibiting the nitric oxide (NO) production. It acts as a novel type of modulator that promotes translocation of eNOS from the plasma membrane to intracellular sites, thereby uncoupling eNOS from plasma membrane caveolae and inhibiting NO synthesis. NOSIP also interacts with protein phosphatase PP2A and mediates the monoubiquitination of the PP2A catalytic subunit. Thus, it is a critical modulator of brain and craniofacial development in mice and a candidate gene for holoprosencephaly in humans. Moreover, NOSIP associates with the erythropoietin (Epo) receptor (EpoR), mediates ubiquitination of EpoR, and plays an essential role in erythropoietin-induced proliferation. NOSIP contains an atypical N-terminal RING-like U-box domain that is split into two parts by an interjacent stretch of 104 amino acid residues, as well as a C-terminal RING-like U-box domain. This family corresponds to the first U-box domain. 43
34201 319576 cd16662 RING-Ubox2_NOSIP U-box domain 2, a modified RING finger, found in nitric oxide synthase-interacting protein (NOSIP) and similar proteins. NOSIP, also known as endothelial NO synthase (eNOS)-interacting protein, p33RUL, is an E3 ubiquitin-protein ligase implicated in the control of airway and vascular diameter, mucosal secretion, NO synthesis in ciliated epithelium, and, therefore, of mucociliary and bronchial function. The loss of NOSIP may cause holoprosencephaly and facial anomalies including cleft lip/palate, cyclopia and facial midline clefting. NOSIP interacts with neuronal nitric oxide synthase (nNOS) and eNOS by inhibiting nitric oxide (NO) production. It acts as a novel type of modulator that promotes translocation of eNOS from the plasma membrane to intracellular sites, thereby uncoupling eNOS from plasma membrane caveolae and inhibiting NO synthesis. NOSIP also interacts with protein phosphatase PP2A and mediates the monoubiquitination of the PP2A catalytic subunit. Thus, it is a critical modulator of brain and craniofacial development in mice and a candidate gene for holoprosencephaly in humans. Moreover, NOSIP associates with the erythropoietin (Epo) receptor (EpoR), mediates ubiquitination of EpoR, and plays an essential role in erythropoietin-induced proliferation. NOSIP contains an atypical N-terminal RING-like U-box domain that is split into two parts by an interjacent stretch of 104 amino acid residues, as well as a C-terminal RING-like U-box domain. This family corresponds to the second U-box domain. 65
34202 319577 cd16663 RING-Ubox_PPIL2 U-box domain, a modified RING finger, found in peptidyl-prolyl cis-trans isomerase-like 2 (PPIL2) and similar proteins. PPIL2 (EC 5.2.1.8), also known as PPIase, CYC4, cyclophilin-60 (Cyp60), cyclophilin-like protein Cyp-60, or Rotamase PPIL2, is a nuclear-specific cyclophilin which interacts with the proteinase inhibitor eglin c and regulates gene expression. PPIL2 belongs to the cyclophilin family of peptidylprolyl isomerases and catalyzes cis-trans isomerization of proline-peptide bonds, which is often a rate-limiting step in protein folding. It positively regulates beta-site amyloid precursor protein cleaving enzyme (BACE1) expression and beta-secretase activity. Moreover, PPIL2 plays an important role in the translocation of CD147 to the cell surface, and thus may present a novel target for therapeutic interventions in diseases where CD147 functions as a pathogenic factor in cancer, human immunodeficiency virus infection, or rheumatoid arthritis. PPIL2 contains an N-terminal RING-like U-box domain and a C-terminal cyclophilin (Cyp)-like chaperone domain. 73
34203 319578 cd16664 RING-Ubox_PUB U-box domain, a modified RING finger, found in Arabidopsis plant U-box proteins (AtPUB) and similar proteins. The plant PUB proteins, also known as U-box domain-containing proteins, are much more numerous in Arabidopsis, which has 62, in comparison with the typical 6 in most animals. The majority of AtPUB of this family are known as ARM domain-containing PUB proteins which contain a C-terminally located tandem ARM (armadillo) repeat protein-interaction region in addition to the U-box domain. They have been implicated in the regulation of cell death and defense. They also play important roles in other plant-specific pathways, such as controlling both self-incompatibility and pseudo-self-incompatibility, as well as acting in abiotic stress. A subgroup of ARM domain-containing PUB proteins harbors a plant-specific U-box N-terminal domain. 43
34204 319579 cd16665 RING-H2_RNF13_like RING finger, H2 subclass, found in RING finger protein 13 (RNF13), RING finger protein 167 (RNF167), and similar proteins. This subfamily includes RING finger protein 13 (RNF13), RING finger protein 167 (RNF167), Zinc/RING finger protein 4 (ZNRF4), and similar proteins, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane domain (TM), and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF13 is a widely expressed membrane-associated E3 ubiquitin-protein ligase that is functionally significant in the regulation of cancer development, muscle cell growth, and neuronal development. Its expression is developmentally regulated during myogenesis and is upregulated in various tumors. RNF13 negatively regulates cell proliferation through its E3 ligase activity. RNF167, also known as RING105, is an endosomal/lysosomal E3 ubiquitin-protein ligase involved in alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor (AMPAR) ubiquitination. It acts as an endosomal membrane protein which ubiquitylates vesicle-associated membrane protein 3 (VAMP3) and regulates endosomal trafficking. Moreover, RNF167 plays a role in the regulation of TSSC5 (tumor-suppressing subchromosomal transferable fragment cDNA; also known as ORCTL2/IMPT1/BWR1A/SLC22A1L), which can function in concert with the ubiquitin-conjugating enzyme UbcH6. ZNRF4, also known as RING finger protein 204 (RNF204), or Nixin, is an endoplasmic reticulum (ER) membrane-anchored ubiquitin ligase that physically interacts with the ER-localized chaperone calnexin in a glycosylation-independent manner, induces calnexin ubiquitination, and p97-dependent degradation, indicating an ER-associated degradation-like mechanism of calnexin turnover. 
The murine protein sperizin (spermatid-specific ring zinc finger) is a homolog of human ZNRF4. It is specifically expressed in haploid germ cells and is involved in spermatogenesis. 46
34205 319580 cd16666 RING-H2_RNF43_like RING finger, H2 subclass, found in RING finger proteins RNF43, ZNRF3, and similar proteins. RNF43 and ZNRF3 (also known as RNF203) are transmembrane E3 ubiquitin-protein ligases that belong to the PA-TM-RING ubiquitin ligase family, which has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region. Both RNF43 and ZNRF3 function as tumor suppressors involved in the regulation of Wnt/beta-catenin signaling. They negatively regulate Wnt signaling through interacting with complexes of frizzled receptors (FZD) and low-density lipoprotein receptor-related protein (LRP) 5/6, which leads to ubiquitination of Frizzled receptors (FZD) and endocytosis of the Wnt receptor. Dishevelled (DVL), a positive Wnt regulator, is required for ZNRF3/RNF43-mediated ubiquitination and degradation of FZD. They also associate with R-spondin 1 (RSPO1). This interaction may block Frizzled ubiquitination and enhance Wnt signaling. 45
34206 319581 cd16667 RING-H2_RNF126_like RING finger, H2 subclass, found in RING finger proteins RNF126, RNF115, and similar proteins. The family includes RING finger proteins RNF126, RNF115, and similar proteins. RNF126 is a Bag6-dependent E3 ubiquitin ligase that is involved in the mislocalized protein (MLP) pathway of quality control. It regulates the retrograde sorting of the cation-independent mannose 6-phosphate receptor (CI-MPR). RNF126 promotes cancer cell proliferation by targeting the tumor suppressor p21 for ubiquitin-mediated degradation, and could be a novel therapeutic target in breast and prostate cancers. It is also able to ubiquitylate activation-induced cytidine deaminase (AID), a poorly soluble protein that is essential for antibody diversification. RNF115, also known as Rab7-interacting ring finger protein (Rabring 7), or zinc finger protein 364 (ZNF364), or breast cancer-associated gene 2 (BCA2), is an E3 ubiquitin-protein ligase that is an endogenous inhibitor of adenosine monophosphate-activated protein kinase (AMPK) activation and its inhibition increases the efficacy of metformin in breast cancer cells. It also functions as a co-factor in the restriction imposed by tetherin on HIV-1, and targets HIV-1 Gag for lysosomal degradation, impairing virus assembly and release, in a tetherin-independent manner. Moreover, RNF115 is a Rab7-binding protein that stimulates c-Myc degradation through mono-ubiquitination of MM-1. It also plays crucial roles as a Rab7 target protein in vesicle traffic to late endosome/lysosome and lysosome biogenesis. RNF115 and RNF126 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. Both of them contain an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger. 43
34207 319582 cd16668 RING-H2_GRAIL RING finger, H2 subclass, found in the GRAIL transmembrane proteins family. The GRAIL transmembrane proteins family includes RING finger proteins RNF128 (also known as GRAIL), RNF130, RNF133, RNF148, RNF149, and RNF150, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF128 is a type 1 transmembrane E3 ubiquitin-protein ligase that is a critical regulator of adaptive immunity and development. RNF130, also known as Goliath homolog (H-Goliath), is a paralog of RNF128. It is a transmembrane E3 ubiquitin-protein ligase expressed in leukocytes. It has a self-ubiquitination property, and controls the development of T cell clonal anergy by ubiquitination. RNF133 is a testis-specific endoplasmic reticulum-associated E3 ubiquitin ligase that may play a role in sperm maturation through an ER-associated degradation (ERAD) pathway. RNF148 is a testis-specific E3 ubiquitin ligase that is abundantly expressed in testes and slightly expressed in pancreas. Its expression is regulated by histone deacetylases. RNF149, also known as DNA polymerase-transactivated protein 2, is an E3 ubiquitin-protein ligase that induces the ubiquitination of wild-type v-Raf murine sarcoma viral oncogene homolog B1 (BRAF) and promotes its proteasome-dependent degradation. RNF150 is a RING finger protein whose polymorphisms may be associated with chronic obstructive pulmonary disease (COPD) risk in the Chinese population. The family also includes Drosophila melanogaster protein goliath (d-goliath), also known as protein g1, which is one of the founding members of the family. It was originally identified as a transcription factor involved in the embryo mesoderm formation. 48
34208 319583 cd16669 RING-H2_RNF181 RING finger, H2 subclass, found in RING finger protein 181 (RNF181) and similar proteins. RNF181, also known as HSPC238, is a platelet E3 ubiquitin-protein ligase containing a C3H2C3-type RING-H2 finger. It interacts with the KVGFFKR motif of platelet integrin alpha(IIb)beta3, suggesting a role for RNF181-mediated ubiquitination in integrin and platelet signaling. It also suppresses the tumorigenesis of hepatocellular carcinoma (HCC) through the inhibition of extracellular signal-regulated kinase/mitogen-activated protein kinase (ERK/MAPK) signaling in the liver. 46
34209 319584 cd16670 RING-H2_RNF215 RING finger, H2 subclass, found in RING finger protein 215 (RNF215) and similar proteins. This family includes uncharacterized protein RNF215 and similar proteins. Although its biological function remains unclear, RNF215 shares high sequence similarity with PA-TM-RING ubiquitin ligases, which have been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain. 50
34210 319585 cd16671 RING-H2_DTX1_4 RING finger, H2 subclass, found in E3 ubiquitin-protein ligase deltex1 (DTX1), deltex4 (DTX4), and similar proteins. DTX1 is a mammalian homolog of Drosophila Deltex that is a ubiquitously expressed cytoplasmic ubiquitin E3 ligase that mediates Notch activation in Drosophila. It functions as a Notch downstream transcription regulator that mediates a Notch signal to block differentiation of neural progenitor cells. DTX1 interacts with the transcription coactivator p300 and inhibits transcription activation mediated by the neural specific transcription factor MASH1. It is also a transcription target of nuclear factor of activated T cells (NFAT) and participates in T cell anergy and Foxp3 protein level maintenance in vivo. Moreover, Deltex1 appears to promote B-cell development at the expense of T-cell development. It also promotes protein kinase C theta degradation and sustains Casitas B-lineage lymphoma expression. DTX4, also known as RING finger protein 155, shares the highest degree of sequence similarity with DTX1 and likely interacts with the intracellular domain of Notch as well. Both DTX1 and DTX4 contain two N-terminal Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a C3H2C3-type RING-H2 finger at the C-terminus. They also harbor two nuclear localization signals. 69
34211 319586 cd16672 RING-H2_DTX2 RING finger, H2 subclass, found in E3 ubiquitin-protein ligase Deltex2 (DTX2) and similar proteins. DTX2, also known as RING finger protein 58, together with DTX1 and DTX4, forms a family of related proteins that are the mammalian homologs of Drosophila Deltex, a known regulator of Notch signals. Like DTX1 and DTX4, DTX2 is expressed in thymocytes. It interacts with the intracellular domain of Notch receptors and acts as a negative regulator of Notch signals in T cells. However, the endogenous levels of DTX1 and DTX2 are not important for regulating Notch signals during thymocyte development. DTX2 contains two N-terminal Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a C3H2C3-type RING-H2 finger at the C-terminus. It also harbors two nuclear localization signals. 72
34212 319587 cd16673 RING-H2_RNF6 RING finger, H2 subclass, found in E3 ubiquitin-protein ligase RNF6 and similar proteins. RNF6 is an androgen receptor (AR)-associated protein that induces AR ubiquitination and promotes AR transcriptional activity. RNF6-induced ubiquitination may regulate AR transcriptional activity and specificity through modulating cofactor recruitment. RNF6 is overexpressed in hormone-refractory human prostate cancer tissues and required for prostate cancer cell growth under androgen-depleted conditions. Moreover, RNF6 regulates local serine/threonine kinase LIM kinase 1 (LIMK1) levels in axonal growth cones. RNF6-induced LIMK1 polyubiquitination is mediated via K48 of ubiquitin and leads to proteasomal degradation of the kinase. RNF6 also binds and upregulates the Inha promoter, and functions as a transcription regulatory protein in the mouse sertoli cell. Furthermore, RNF6 acts as a potential tumor suppressor gene involved in the pathogenesis of esophageal squamous cell carcinoma (ESCC). RNF6 contains an N-terminal coiled-coil domain, a Lys-X-X-Leu/Ile-X-X-Leu/Ile (KIL) motif, and a C-terminal C3H2C3-type RING-H2 finger which is responsible for its ubiquitin ligase activity. The KIL motif is present in a subset of RING-H2 proteins from organisms as evolutionarily diverse as human, mouse, chicken, Drosophila, Caenorhabditis elegans, and Arabidopsis thaliana. 45
34213 319588 cd16674 RING-H2_RNF12 RING finger, H2 subclass, found in RING finger protein 12 (RNF12) and similar proteins. RNF12, also known as LIM domain-interacting RING finger protein or RING finger LIM domain-binding protein (R-LIM), is an E3 ubiquitin-protein ligase encoded by gene RLIM that is crucial for normal embryonic development in some species and for normal X inactivation in mice. It thus functions as a major sex-specific epigenetic regulator of female mouse nurturing tissues. RNF12 is widely expressed during embryogenesis, and mainly localizes to the cell nucleus, where it regulates the levels of many proteins, including CLIM, LMO, HDAC2, TRF1, SMAD7, and REX1, by proteasomal degradation. Its functional activity is regulated by phosphorylation-dependent nucleocytoplasmic shuttling. It is negatively regulated by pluripotency factors in embryonic stem cells. p53 represses its transcription through Sp1. RNF12 is the primary factor responsible for X chromosome inactivation (XCI) in female placental mammals. It is an indispensable factor in up-regulation of Xist transcription, thereby leading to initiation of random XCI. It also targets REX1, an inhibitor of XCI, for proteasomal degradation. Moreover, RNF12 acts as a co-regulator of a range of transcription factors, particularly those containing a LIM homeodomain, and modulates the formation of transcriptional multiprotein complexes. It is a negative regulator of Smad7, which in turn negatively regulates the type I receptors in transforming growth factor beta (TGF-beta) superfamily signaling. In addition, paternal RNF12 is a critical survival factor for milk-producing alveolar cells. RNF12 contains a nuclear localization signal (NLS) and a C3H2C3-type RING-H2 finger. 45
34214 319589 cd16675 RING-H2_RNF24 RING finger, H2 subclass, found in RING finger protein 24 (RNF24) and similar proteins. RNF24 is an intrinsic membrane protein localized in the Golgi apparatus. It specifically interacts with the ankyrin-repeats domains (ARDs) of TRPC1, -3, -4, -5, -6, and -7, and affects TRPC intracellular trafficking without affecting their activity. RNF24 contains an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger. 47
34215 319590 cd16676 RING-H2_RNF122 RING finger, H2 subclass, found in RING finger protein 122 (RNF122) and similar proteins. RNF122 is a RING finger protein associated with HEK 293T cell viability. It is localized to the endoplasmic reticulum (ER) and the Golgi apparatus, and overexpressed in anaplastic thyroid cancer cells. RNF122 functions as an E3 ubiquitin ligase that can ubiquitinate itself and undergoes degradation through its RING finger in a proteasome-dependent manner. It interacts with calcium-modulating cyclophilin ligand (CAML), which is not a substrate, but a stabilizer of RNF122. RNF122 contains an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger. 47
34216 319591 cd16677 RING1-H2_RNF32 RING finger 1, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed in testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway. This family corresponds to the first RING-H2 finger. 44
34217 319592 cd16678 RING2-H2_RNF32 RING finger 2, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed in testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway. This family corresponds to the second RING-H2 finger. 60
34218 319593 cd16679 RING-H2_RNF38 RING finger, H2 subclass, found in RING finger protein 38 (RNF38) and similar proteins. RNF38 is a nuclear E3 ubiquitin protein ligase that is widely expressed throughout the body in human, especially highly expressed in the heart, brain, placenta and the testis. It recognizes p53 as a substrate for ubiquitination, and thus plays a role in regulating p53. The overexpression of RNF38 increases p53 ubiquitination and alters p53 localization. It is also capable of autoubiquitination. Moreover, RNF38 expression is negatively regulated by the serotonergic system. Induction of RNF38 may be involved in the anxiety-like behavior or non-cell autonomous by the decline of serotonin (5-HT) levels. RNF38 contains a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger, as well as two potential nuclear localization signals. 49
34219 319594 cd16680 RING-H2_RNF44 RING finger, H2 subclass, found in RING finger protein 44 (RNF44) and similar proteins. RNF44 is an uncharacterized RING finger protein that shows high sequence similarity with RNF38, which is a nuclear E3 ubiquitin protein ligase that plays a role in regulating p53. RNF44 contains a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger. 45
34220 319595 cd16681 RING-H2_RNF111 RING finger, H2 subclass, found in RING finger protein 111 (RNF111) and similar proteins. RNF111, also known as Arkadia, is a nuclear E3 ubiquitin-protein ligase that targets intracellular effectors and modulators of transforming growth factor beta (TGF-beta)/Nodal-related signaling for polyubiquitination and proteasome-dependent degradation. It acts as an amplifier of Nodal signals, and enhances the dorsalizing activity of limiting amounts of Xnr1, a Nodal homolog, and requires Nodal signaling for its function. The loss of RNF111 results in early embryonic lethality, with defects attributed to compromised Nodal signaling. Moreover, RNF111 regulates tumor metastasis by modulation of the TGF-beta pathway. Its ubiquitination can be modulated by the four and a half LIM-only protein 2 (FHL2) that activates TGF-beta signal transduction. Furthermore, RNF111 interacts with the clathrin-adaptor 2 (AP2) complex and regulates endocytosis of certain cell surface receptors, leading to modulation of epidermal growth factor (EGF) and possibly other signaling pathways. In addition, RNF111 has been identified as a small ubiquitin-like modifier (SUMO)-binding protein with clustered SUMO-interacting motifs (SIMs) that together form a SUMO-binding domain (SBD). It thus functions as a SUMO-targeted ubiquitin ligase (STUbL) that directly links nonproteolytic ubiquitylation and SUMOylation in the DNA damage response, as well as triggers degradation of signal-induced polysumoylated proteins, such as the promyelocytic leukemia protein (PML). The N-terminal half of RNF111 harbors three SIMs. 
Its C-terminal half shows high sequence similarity with RING finger protein 165 (RNF165), where it contains two serine rich domains, two nuclear localization signals, a NRG-TIER domain, and a C-terminal C3H2C3-type RING-H2 finger that is required for polyubiquitination and proteasome-dependent degradation of phosphorylated forms of Smad2/3 and three major negative regulators of TGF-beta signaling, Smad7, SnoN and c-Ski. 46
34221 319596 cd16682 RING-H2_RNF165 RING finger, H2 subclass, found in RING finger protein 165 (RNF165) and similar proteins. RNF165, also known as Arkadia-like 2, or Arkadia2, or Ark2C, is an E3 ubiquitin ligase with homology to C-terminal half of RNF111. It is expressed specifically in the nervous system, and can serve to amplify neuronal responses to specific signals. It thus acts as a positive regulator of bone morphogenetic protein (BMP)-Smad signaling that is involved in motor neuron (MN) axon elongation. RNF165 contains two serine rich domains, a nuclear localization signal, a NRG-TIER domain, and a C-terminal C3H2C3-type RING-H2 finger that is responsible for the enhancement of BMP-Smad1/5/8 signaling in the spinal cord. 51
34222 319597 cd16683 RING-H2_RNF139 RING finger, H2 subclass, found in RING finger protein 139 (RNF139) and similar proteins. RNF139, also known as translocation in renal carcinoma on chromosome 8 protein (TRC8), is an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. It is a tumor suppressor that has been implicated in a novel regulatory relationship linking the cholesterol/lipid biosynthetic pathway with cellular growth control. The mutation of RNF139 has been identified in families with hereditary renal (RCC) and thyroid cancers. RNF139 physically and functionally interacts with von Hippel-Lindau (VHL), which is part of an SCF related E3-ubiquitin ligase complex with "gatekeeper" function in renal carcinoma and is defective in most sporadic clear-cell renal cell carcinomas (ccRCC). It suppresses growth and functions with VHL in a common pathway. RNF139 also suppresses tumorigenesis through targeting heme oxygenase-1 for ubiquitination and degradation. Moreover, RNF139 is a target of Translin (TSN), a posttranscriptional regulator of genes transcribed by the transcription factor CREM-tau in postmeiotic male germ cells, suggesting a role of RNF139 in dysgerminoma. In addition, RNF139 forms an integral part of a novel multi-protein ER complex, containing MHC I, US2, and signal peptide peptidase, which is associated with ER-associated degradation (ERAD) pathway. It is required for the ubiquitination of MHC class I molecules before dislocation from the ER. 
As a novel sterol-sensing ER membrane protein, RNF139 hinders sterol regulatory element-binding protein-2 (SREBP-2) processing through interaction with SREBP-2 and SREBP cleavage-activated protein (SCAP), regulating its own turnover rate via its E3 ubiquitin ligase activity. RNF139 shows two regions of similarity with the receptor for sonic hedgehog (SHH), Patched. The first region corresponds to the second extracellular domain of Patched, which is involved in binding SHH. The second region is a putative sterol-sensing domain (SSD). In addition, the C-terminal half of RNF139 contains a C3H2C3-type RING-H2 finger with E3-ubiquitin ligase activity in vitro. 42
34223 319598 cd16684 RING-H2_RNF145 RING finger, H2 subclass, found in RING finger protein 145 (RNF145) and similar proteins. RNF145 is an uncharacterized RING finger protein encoded by RNF145 gene, which is expressed in T lymphocytes, and its expression is altered in acute myelomonocytic and acute promyelocytic leukemias. Although its biological function remains unclear, RNF145 shows high sequence similarity with RNF139, an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. Like RNF139, RNF145 contains a C3H2C3-type RING-H2 finger with possible E3-ubiquitin ligase activity. 43
34224 319599 cd16685 RING-H2_UBR1 RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-1 (UBR1) and similar proteins. UBR1, also known as N-recognin-1 or E3alpha-I, is an E3 ubiquitin-protein ligase that is the E3 component of the N-end rule pathway. It also promotes degradation of proteins via distinct mechanism that detects a misfolded conformation. UBR1 associates with the RAD6-encoded E2 enzyme to form an E2-E3 complex that catalyzes the synthesis of a substrate-linked multi-ubiquitin chain and may also mediate the delivery of substrates to the 26S proteasome. Moreover, UBR1 promotes the degradation of a misfolded protein in the cytosol. It promotes protein kinase quality control and sensitizes cells to heat shock protein 90 (Hsp90) inhibition. Furthermore, UBR1 functions as a polyubiquitylation-enhancing component of the UBR1-UFD4 complex in its targeting of ubiquitin-fusion degradation (UFD) substrates. UBR1 harbors at least three distinct substrate-binding sites and functions in association with Ubc2/Rad6 and also Ubc4. It contains an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain. A missense mutation in UBR1 that is responsible for Johanson-Blizzard syndrome leads to UBR box unfolding and loss of function. 120
34225 319600 cd16686 RING-H2_UBR2 RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-2 (UBR2) and similar proteins. UBR2, also known as N-recognin-2 or E3alpha-II, is an E3 ubiquitin-protein ligase that plays an important role in maintaining genome integrity and in homologous recombination repair. It regulates the level of the transcription factor Rpn4 (also known as Son1 and Ufd5) through ubiquitylation. The ubiquitin-conjugating enzyme Rad6, another binding partner of UBR2, and an additional factor Mub1, are required for the ubiquitin-dependent degradation of Rpn4. UBR2 associates with Mub1 to form a Mub1/Ubr2 ubiquitin ligase complex that regulates the conserved Dsn1 kinetochore protein levels, which is a part of a quality control system that monitors kinetochore integrity, thus ensuring genomic stability. As the recognition component of a major cellular proteolytic system, UBR2 is associated with chromatin and controls chromatin dynamics and gene expression in both spermatocytes and somatic cells. Moreover, UBR2 mediates transcriptional silencing during spermatogenesis via histone ubiquitination. It functions as a scaffold E3 promoting HR6B/UbcH2-dependent ubiquitylation of H2A and H2B, but not H3 and H4. It also binds to Tex19.1, also known as Tex19, a germ cell-specific protein, and metabolically stabilizes it during spermatogenesis. Furthermore, UBR2 is involved in skeletal muscle (SKM) atrophy. Its expression can be modulated by the mouse ether-a-gogo-related gene 1a (MERG1a) potassium channel. In addition, UBR2 up-regulation in cachectic muscle is mediated by the p38beta-CCAAT/enhancer binding protein (C/EBP)-beta signaling pathway responsible for the bulk of tumor-induced muscle proteolysis. 
UBR2 contains an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain. 116
34226 319601 cd16687 RING-H2_Vps8 RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 8 (Vps8) and similar proteins. Vps8 is the Rab-specific subunit of the endosomal tethering complex CORVET (class C core vacuole/endosome transport) that also includes Vps3 and a Class C Vps core complex composed of Vps11, Vps16, Vps18, and Vps33. CORVET operates at endosomes, controls traffic into late endosomes, and interacts with the Rab5/Vps21-GTP form. The CORVET-specific Vps3 and Vps8 subunits belong to a class D Vps. They form a subcomplex that interacts with Rab5/Vps21, and are critical for localization and function of the CORVET tethering complex on endosomes. Vps8 contains an N-terminal WD40 repeat and a C-terminal C3H2C3-type RING-H2 finger. 55
34227 319602 cd16688 RING-H2_Vps11 RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 11 homolog (Vps11) and similar proteins. Vps11, also known as RING finger protein 108 (RNF108), is a soluble protein involved in regulation of glycolipid degradation and retrograde toxin transport. It is highly expressed in heart and pancreas. Vps11 associates with Vps16, Vps18, and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. Vps11 is a central scaffold protein upon which both HOPS and CORVET assemble. The HOPS and CORVET complexes disassemble in the absence of Vps11, resulting in a massive fragmentation of vacuoles. Vps11 contains a clathrin repeat domain and a C-terminal C3H2C3-type RING-H2 finger. This subfamily also includes Vps11 homologs found in fungi, such as Saccharomyces cerevisiae vacuolar membrane protein Pep5p, also known as carboxypeptidase Y-deficient protein 5, vacuolar morphogenesis protein 1, or vacuolar biogenesis protein END1. Pep5p is essential for vacuolar biogenesis in Saccharomyces cerevisiae. It associates with Pep3p to form a core Pep3p/Pep5p complex that promotes vesicular docking/fusion reactions in conjunction with SNARE proteins at multiple steps in transport routes to the vacuole. 44
34228 319603 cd16689 RING-H2_Vps18 RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 18 (Vps18) and similar proteins. Vps18 is an ubiquitin ligase E3 that is highly expressed in heart. It induces the ubiquitylation and degradation of serum-inducible kinase (SNK). Vps18 associates with Vps11, Vps16, and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. Vps18 deficiency inhibits dendritogenesis in Purkinje cells by blocking the lysosomal degradation of lysyl oxidase. Vps18 contains a clathrin heavy chain repeat, a coiled-coil domain, and a C3H2C3-type RING-H2 finger domain close to its C-terminus. This subfamily also includes Vps18 homologs found in insects and fungi, such as Drosophila melanogaster protein deep orange (dor) gene encoding protein Dor, and Saccharomyces cerevisiae vacuolar membrane protein Pep3p, also known as carboxypeptidase Y-deficient protein 3, or vacuolar morphogenesis protein 8. Drosophila Dor is part of a protein complex, which also includes the Sep1p homolog carnation (car), which localizes to endosomal compartments and is required not only for the biogenesis of pigment granules but also for the normal delivery of proteins to lysosomes. 
Pep3p is a vacuolar peripheral membrane protein that is required for vacuolar biogenesis in Saccharomyces cerevisiae. Pep3p associates with Pep5p to form a core Pep3p/Pep5p complex that promotes vesicular docking/fusion reactions in conjunction with SNARE proteins at multiple steps in transport routes to the vacuole. 38
34229 319604 cd16690 RING-H2_Vps41 RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 41 (Vps41) and similar proteins. Vps41, also known as S53, is a protein involved in trafficking of proteins from the late Golgi to the vacuole. It interacts with caspase-8, suggesting a potential role of Vps41 beyond lysosomal trafficking. It has been identified as a potential therapeutic target for human Parkinson's disease (PD). Vps41 and the soluble N-ethylmaleimide-sensitive factor attachment protein receptors protein VAMP7 are specifically involved in the fusion of the trans-Golgi network-derived lysosome-associated membrane protein carriers with late endosomes. Vps41 is a specific subunit of the lysosomal tethering complex HOPS (homotypic vacuole fusion and protein sorting) that also includes Vps39 and a Class C Vps core complex composed of Vps11, Vps16, Vps18, and Vps33. HOPS operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. The HOPS-specific Vps39 and Vps41 subunits belong to a class B Vps. They form a subcomplex that interacts with Rab7/Ypt7 and is required for homotypic and heterotypic late endosome fusion. Vps41 contains an N-terminal WD40 repeat, one or two clathrin repeats and a C3H2C3-type RING-H2 finger domain close to its C-terminus. This subfamily also includes Vps18 homologs found in insects, such as Drosophila melanogaster eye color gene light encoding protein. 51
34230 319605 cd16691 mRING-H2-C3H3C2_Mio Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein mio and similar proteins. This family contains Mio, its counterpart Sea4 from yeast, and other homologs. Mio/Sea4 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea2/WDR24, and Sea3/WDR59. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. Mio interacts with endogenous RagA and RagC, and plays an essential role in the activation of mTOR Complex 1 (mTORC1) by amino acids. In GATOR2, Mio and Seh1 localize to lysosomes and autolysosomes, and form a heterodimer that is required to oppose the TORC1 inhibitory activity of the Iml1/GATOR1 complex to prevent the constitutive down-regulation of TORC1 activity in later stages of oogenesis. A tissue-specific requirement is necessary for Mio to be involved in cell growth in the female germ line. Mio contains an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. 73
34231 319606 cd16692 mRING-H2-C3H3C2_WDR59 Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein 59 (WDR59) and similar proteins. WDR59 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea2/WDR24, and Mio/Sea4. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. WDR59 contains an N-terminal WD40 domain followed by a RWD domain, and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. Sea3 is the yeast counterpart of WDR59. It is not included in this subfamily. 47
34232 319607 cd16693 mRING-H2-C3H3C2_WDR24 Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein 24 (WDR24) and similar proteins. WDR24 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea3/WDR59, and Mio/Sea4. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. WDR24 contains an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. Sea2 is the yeast counterpart of WDR24. It is not included in this subfamily. 46
34233 319608 cd16694 mRING-CH-C4HC2H_ZNRF1 Modified RING-CH finger, H2 subclass (C4HC2H-type), found in zinc/RING finger protein 1 (ZNRF1) and similar proteins. ZNRF1, also known as Nerve injury-induced gene 283 protein (nin283), or peripheral nerve injury protein (PNIP), is an E3 ubiquitin-protein ligase that is highly expressed in the nervous system during development and is associated with synaptic vesicle membranes. It is N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts, suggesting it may participate in ubiquitin-mediated protein modification. It contains an N-terminal MAGE domain, and a special C-terminal domain that combines a zinc finger and a modified C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger. Only the RING finger of the zinc finger-RING finger motif is required for its E3 ubiquitin ligase activity. ZNRF1 regulates Schwann cell differentiation by proteasomal degradation of glutamine synthetase (GS). It also mediates regulation of neuritogenesis via interaction with beta-tubulin type 2 (Tubb2). Moreover, ZNRF1 promotes Wallerian degeneration by degrading AKT to induce glycogen synthase kinase-3beta (GSK3B)-dependent CRMP2 phosphorylation. Furthermore, ZNRF1 and its sister protein ZNRF2 regulate the ubiquitous Na+/K+ pump (Na+/K+ATPase). In addition, ZNRF1 may be associated with leukemogenesis of acute lymphoblastic leukemia (ALL) with paired box domain gene 5 (PAX5) alteration. 46
34234 319609 cd16695 mRING-CH-C4HC2H_ZNRF2 Modified RING-CH finger, H2 subclass (C4HC2H-type), found in zinc/RING finger protein 2 (ZNRF2) and similar proteins. ZNRF2, also known as protein Ells2 or RING finger protein 202 (RNF202), is an E3 ubiquitin-protein ligase that is highly expressed in the nervous system during development and is present in presynaptic plasma membranes. It is N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts. It contains an N-terminal MAGE domain, and a special C-terminal domain that combines a zinc finger and a modified C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger. Only the RING finger of the zinc finger-RING finger motif is required for its E3 ubiquitin ligase activity. Together with its sister protein ZNRF1, ZNRF2 regulates the ubiquitous Na+/K+ pump (Na+/K+ATPase). 45
34235 319610 cd16696 RING-CH-C4HC3_NFX1 RING-CH finger, H2 subclass (C4HC3-type), found in transcriptional repressor NF-X1 and similar proteins. NF-X1, also known as nuclear transcription factor, X box-binding protein 1, is a novel cysteine-rich sequence-specific DNA-binding protein that interacts with the conserved X-box motif of the human major histocompatibility complex (MHC) class II genes via a repeated Cys-His domain. It functions as a cytokine-inducible transcriptional repressor that plays an important role in regulating the duration of an inflammatory response by limiting the period in which class II MHC molecules are induced by interferon gamma (IFN- gamma). NFX1 contains an N-terminal PAM2 motif, a C4HC3-type RING-CH finger, a Cys-rich region that harbors several NFX1-type zinc fingers, and a C-terminal R3H domain. 55
34236 319611 cd16697 RING-CH-C4HC3_NFXL1 RING-CH finger, H2 subclass (C4HC3-type), found in nuclear transcription factor, X-box binding-like 1 (NFXL1) and similar proteins. NFXL1, also known as NF-X1-type zinc finger protein NFXL1, or ovarian zinc finger protein (OZFP), is encoded by a novel human cytoplasm-distribution zinc finger protein (CDZFP) gene. It is a putative zinc finger protein with a C4HC3-type RING-CH finger and a Cys-rich region that harbors several NFX1-type zinc fingers. 63
34237 319612 cd16698 RING_CH-C4HC3_MARCH1_like RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH1, MARCH8, and similar proteins. This family includes the closely related MARCH1 and MARCH8, both of which are located on endosomes and the plasma membrane and are implicated in regulating cell surface expression of their substrates. They ubiquitylate and downregulate many targets, including major histocompatibility complex class II (MHCII), CD86, transferrin receptor, HLA-DM, and Fas from the cell surface. MARCH1 is mainly expressed in cells of the immune system, while MARCH8 is more broadly expressed. Both of them contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and two transmembrane domains. The cytoplasmic RING-CH domain participates in the ubiquitin transfer from the E2 to its substrate. The transmembrane domains are implicated in target recognition and dimer formation. 52
34238 319613 cd16699 RING_CH-C4HC3_MARCH2_like RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH2, MARCH3, and similar proteins. MARCH2 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus. It is a Golgi-localized, membrane-associated E3 ubiquitin-protein ligase that is involved in endosomal trafficking through the binding of syntaxin 6 (STX6). It is involved in the cystic fibrosis transmembrane conductance regulator (CFTR)-associated ligand (CAL)-mediated ubiquitination and lysosomal degradation of mature CFTR through the association with adaptor proteins CAL and STX6. It also reduces the surface expression of CD86 and the transferrin receptor TFRC and regulates cell surface carvedilol-bound beta2-adrenergic receptor (beta2ARs) expression. Moreover, MARCH2 interacts with and ubiquitinates PDZ domains polarity determining scaffold protein DLG1 through its PDZ-binding motif, suggesting it may function as a molecular bridge with ubiquitin ligase activity connecting endocytic tumor suppressor proteins such as syntaxins to DLG1. MARCH3 is an E3 ubiquitin-protein ligase that is broadly expressed at relatively high levels in spleen, colon, and lung. It is localized to early endosomes, binds to MARCH2 and syntaxin 6, and is involved in the regulation of vesicular trafficking and fusion of the transport vesicles in endosomes. Its E2 specificity significantly overlaps that of MARCH2. 51
34239 319614 cd16700 RING_CH-C4HC3_MARCH4_like RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH4, MARCH9, MARCH11, and similar proteins. MARCH4 and MARCH9 are closely related to each other. They downregulate major histocompatibility complex-I (MHC-I). Moreover, MARCH4 and MARCH9, but not other MARCH proteins, can associate with Mult1 and prevent Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. MARCH11 is a transmembrane RING-finger ubiquitin ligase that is predominantly expressed in developing spermatids in a stage-specific manner and is localized to the trans-Golgi network (TGN) vesicles and multivesicular bodies (MVBs). It mediates selective protein sorting via the TGN-MVB transport pathway through its ubiquitin ligase activity. SAMT family proteins have been identified as substrates of MARCH11 in mouse spermatids, suggesting that MARCH11 plays a role in mammalian spermiogenesis. Moreover, MARCH11 functions as an E3 ubiquitin ligase that targets CD4 for ubiquitination. It also forms complexes with the adaptor protein complex-1 and with fucose-containing glycoproteins including ubiquitinated forms. All family members contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions. 51
34240 319615 cd16701 RING_CH-C4HC3_MARCH5 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH5 (MARCH5). MARCH5, also known as membrane-associated RING finger protein 5, membrane-associated RING-CH protein V (MARCH-V), RING finger protein 153 (RNF153), or mitochondrial ubiquitin ligase (MITOL), is a mitochondrial outer membrane-associated E3 ubiquitin-protein ligase that regulates mitochondrial dynamics including mitochondrial morphology, transport, and interaction with endoplasmic reticulum (ER), at least in part, through the ubiquitination of mitochondrial fission factor Drp1, microtubule-associated protein 1B (MAP1B) and mitofusin 2 (Mfn2), respectively. MARCH5 also mediates the cell cycle-dependent degradation of Mitofusin 1 (Mfn1) in G2/M phase, and thus serves as an upstream quality controller on Mitofusin 1 (Mfn1), preventing excessive accumulation of Mfn1 protein under stress conditions, which is crucial for mitochondrial homeostasis and cell viability. Moreover, MARCH5 is involved in maintaining mouse-embryonic stem cell (mESC) pluripotency via suppression of ERK signalling. It is also a positive regulator of Toll-like receptor 7 (TLR7)-mediated NF-kappaB activation in mammals. MARCH5 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and four C-terminal transmembrane spans. 61
34241 319616 cd16702 RING_CH-C4HC3_MARCH6 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH6 (MARCH6). MARCH6, also known as membrane-associated RING finger protein 6, membrane-associated RING-CH protein VI (MARCH-VI), RING finger protein 176 (RNF176), protein TEB-4, or Doa10 homolog, is an endoplasmic reticulum (ER)-localized E3 ubiquitin ligase that ubiquitinates ER-associated proteins with a cytoplasmic domain, such as Mps2, UBC6, and Ste6, in a ubiquitin-conjugating enzyme 7 (UBC7)-dependent manner. It also regulates its own UBC7-mediated degradation. MARCH6 interacts with ubiquitin-specific protease USP19, which deubiquitinates and stabilizes MARCH6 and inhibits p97-dependent proteasomal degradation. It is also involved in the cholesterol synthesis pathway through controlling the degradation of squalene monooxygenase (SM), and affects 3-hydroxy-3-methyl-glutaryl coenzyme A reductase (HMGCR). Furthermore, it may be a key regulator of thyroid hormone activation in a number of tissues, since it mediates the proteasomal degradation of type 2 iodothyronine deiodinase (D2). MARCH6 contains 14 transmembrane helices and a conserved N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that catalyzes ubiquitin Lys48-specific ligation. 50
34242 319617 cd16703 RING_CH-C4HC3_MARCH7_like RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated MARCH7, MARCH10, and similar proteins. The subfamily includes two closely related membrane-associated RING-CH proteins, MARCH7 and MARCH10, both of which are predicted to have no transmembrane spanning region, but harbor a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for E3 activity. MARCH7, also known as MARCH-VII, RNF177, or axotrophin, is a ubiquitin E3 ligase expressed in multiple types of cells and tissues, including stem cells and precursor cells, and predominantly localized on the plasma membrane and in the cytoplasm. MARCH7 is involved in T cell proliferation and neuronal development. It also participates in the regulation of cytoskeleton re-organization, cellular migration and invasion, cell proliferation, and tumorigenesis in ovarian carcinoma cells. Moreover, MARCH7 modulates nuclear factor kappaB (NF-kappaB) and Wnt/beta-catenin pathways. It has been identified as an authentic target of miR-101. Furthermore, it ubiquitinates tau protein in vitro, impairing microtubule binding. MARCH10, also known as MARCH-X or RNF190, is a microtubule-associated E3 ubiquitin ligase of developing spermatids. It is localized to the principal piece of elongating spermatids. MARCH10 is involved in spermiogenesis by regulating the formation and maintenance of the flagella in developing spermatids. 61
34243 319618 cd16704 RING-HC_RNF20_like RING finger, HC subclass, found in RING finger protein RNF20, RNF40, and similar proteins. RNF20, also known as BRE1A, and RNF40, also known as BRE1B, are E3 ubiquitin-protein ligases that work together to form a heterodimeric complex that facilitates the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. RNF20 regulates the cell cycle and differentiation of neural precursor cells (NPCs) and links histone H2B ubiquitylation with inflammation and inflammation-associated cancer. RNF40, also known as 95 kDa retinoblastoma-associated protein (RBP95), was identified as a novel leucine zipper retinoblastoma protein (pRb)-associated protein that may function as a regulation factor in the process of RNA polymerase II-mediated transcription and/or transcriptional processing. All family members contain a C3HC4-type RING-HC finger at their C-terminus. 46
34244 319619 cd16705 RING-HC_dBre1_like RING finger, HC subclass, found in Drosophila melanogaster Bre1 (dBre1) and similar proteins. dBre1 is the functional homolog of yeast Bre1, an E3 ubiquitin ligase required for the monoubiquitination of histone H2B and, indirectly, for H3K4 methylation. dBre1 acts as a nuclear component required cell autonomously for the expression of Notch target genes in Drosophila development. dBre1 contains a C3HC4-type RING-HC finger at its C-terminus. 42
34245 319620 cd16706 RING-HC_CARP1 RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein 1 (CARP1) and similar proteins. CARP1, also known as caspase regulator CARP1, FYVE-RING finger protein Momo, RING finger homologous to inhibitor of apoptosis protein (RFI), RING finger protein 34 (RNF34), or RING finger protein RIFF, is a nuclear protein that functions as a specific E3 ubiquitin ligase for the transcriptional coactivator PGC-1alpha, a master regulator of energy metabolism and adaptive thermogenesis in the brown fat cell which negatively regulates brown fat cell metabolism. It is preferentially expressed in esophageal, gastric, and colorectal cancers, suggesting a possible association with the development of the digestive tract cancers. It regulates the p53 signaling pathway by degrading 14-3-3 sigma and stabilizing MDM2. CARP1 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10. CARP1 contains an N-terminal FYVE-like domain and a C-terminal C3HC4-type RING-HC finger domain. 39
34246 319621 cd16707 RING-HC_CARP2 RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein 2 (CARP-2) and similar proteins. CARP-2, also known as rififylin, caspase regulator CARP2, FYVE-RING finger protein Sakura (Fring), RING finger and FYVE-like domain-containing protein 1, RING finger protein 189 (RNF189), or RING finger protein 34-like, is an endosome-associated E3 ubiquitin-protein ligase that targets internalized receptor interacting kinase (RIP) for proteasome-mediated degradation. It acts as a negative regulator of tumor necrosis factor (TNF)-induced nuclear factor (NF)-kappaB activation. It also regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. As a caspase regulator, CARP2 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10. CARP2 contains an N-terminal FYVE-like domain and a C-terminal C3HC4-type RING-HC finger domain. 40
34247 319622 cd16708 RING-HC_Cbl RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl and similar proteins. Cbl, also known as Casitas B-lineage lymphoma proto-oncogene, proto-oncogene c-Cbl, RING finger protein 55 (RNF55), or signal transduction protein Cbl, is a multi-domain protein that acts as a key negative regulator of various receptor and non-receptor tyrosine kinase signaling. It contains a tyrosine kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a proline-rich domain, a C3HC4-type RING-HC finger, and a ubiquitin-associated (UBA) domain. TKB is responsible for the interactions with many tyrosine kinases, such as the colony-stimulating factor-1 (CSF-1) receptor, Syk/ZAP-70, and Src-family of protein tyrosine kinases. The proline-rich domain can recruit proteins with a SH3 domain. Moreover, Cbl functions as an E3 ubiquitin ligase that can bind ubiquitin-conjugating enzymes (E2s) through the RING-HC finger. 57
34248 319623 cd16709 RING-HC_Cbl-b RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl-b and similar proteins. Cbl-b, also known as Casitas B-lineage lymphoma proto-oncogene b, RING finger protein 56 (RNF56), SH3-binding protein Cbl-b, or signal transduction protein Cbl-b, has been identified as a regulator of antigen-specific, T cell-intrinsic, peripheral immune tolerance, a state also known as clonal anergy. It may inhibit activation of the p85 subunit of phosphoinositide 3-kinase (PI3K), protein kinase C-theta (PKC-theta), and phospholipase C-gamma1 (PLC-gamma1) and negatively regulates T-cell receptor-induced transcription factor nuclear factor kappaB (NF-kappaB) activation. In addition, Cbl-b may target multiple signaling molecules involved in transforming growth factor (TGF)-beta-mediated transactivation pathways. Cbl-b contains a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a proline rich domain, a nuclear localization signal, a C3HC4-type RING-HC finger and an ubiquitin-associated (UBA) domain. 66
34249 319624 cd16710 RING-HC_Cbl-c RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl-c and similar proteins. Cbl-c, also known as RING finger protein 57 (RNF57), SH3-binding protein Cbl-3, SH3-binding protein Cbl-c, or signal transduction protein Cbl-c, is an E3 ubiquitin-protein ligase expressed exclusively in epithelial cells. It contains a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a C3HC4-type RING-HC finger, and a short proline-rich region, but lacks the ubiquitin-associated (UBA) leucine zipper motif that is present in Cbl and Cbl-b. Cbl-c acts as a regulator of epidermal growth factor receptor (EGFR) mediated signal transduction. It also suppresses v-Src-induced transformation through ubiquitin-dependent protein degradation. Moreover, Cbl-c ubiquitinates and downregulates RETMEN2A and implicates Enigma (PDLIM7) as a positive regulator of RETMEN2A through blocking of Cbl-mediated ubiquitination and degradation. The ubiquitin ligase activity of Cbl-c is increased via the interaction of its RING-HC finger domain with a LIM domain of the paxillin homolog, hydrogen peroxide Induced Construct 5 (Hic-5). 53
34250 319625 cd16711 RING-HC_DTX3 RING finger, HC subclass, found in E3 ubiquitin-protein ligase Deltex3 (DTX3) and similar proteins. DTX3, also known as RING finger protein 154 (RNF154), is an E3 ubiquitin-protein ligase that belongs to the Deltex (DTX) family. In contrast to other DTXs, DTX3 does not contain the two N-terminal Notch-binding WWE domains, but has a short unique N-terminal domain, suggesting it does not interact with the intracellular domain of Notch. Its C-terminal region includes a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain. 41
34251 319626 cd16712 RING-HC_DTX3L RING finger, HC subclass, found in protein Deltex-3-like (DTX3L) and similar proteins. DTX3L, also known as B-lymphoma- and BAL-associated protein (BBAP) or Rhysin-2 (Rhysin2), is a RING-domain E3 ubiquitin-protein ligase that regulates endosomal sorting of the G protein-coupled receptor CXCR4 from endosomes to lysosomes. It also regulates subcellular localization of its partner protein, B aggressive lymphoma (BAL), by a dynamic nucleocytoplasmic trafficking mechanism. DTX3L has a unique N-terminus, but lacks the highly basic N-terminal motif and the central proline-rich motif present in other Deltex (DTX) family members, such as DTX1, DTX2, and DTX4. Moreover, its C-terminal region is highly homologous to DTX3. It includes a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain. DTX3L can associate with DTX1 through its unique N-terminus and further enhance self-ubiquitination. 41
34252 319627 cd16713 RING-HC_BIRC2_3_7 RING finger, HC subclass, found in apoptosis protein c-IAP1, c-IAP2, livin, and similar proteins. The cellular inhibitor of apoptosis protein c-IAPs function as ubiquitin E3 ligases that mediate the ubiquitination of the substrates involved in apoptosis, nuclear factor-kappaB (NF-kappaB) signaling, and oncogenesis. Unlike other apoptosis proteins (IAPs), such as XIAP, c-IAPs exhibit minimal binding to caspases and may not play an important role in the inhibition of these proteases. c-IAP1, also known as baculoviral IAP repeat-containing protein BIRC2, IAP-2, RING finger protein 48, or TNFR2-TRAF-signaling complex protein 2, is a potent regulator of the tumor necrosis factor (TNF) receptor family and NF-kappaB signaling pathways in the cytoplasm. It can also regulate E2F1 transcription factor-mediated control of cyclin transcription in the nucleus. c-IAP2, also known as BIRC3, IAP-1, apoptosis inhibitor 2 (API2), or IAP homolog C, also influences ubiquitin-dependent pathways that modulate innate immune signalling by activation of NF-kappaB. c-IAPs contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), a caspase activation and recruitment domain (CARD) that serves as a protein interaction surface, and a C3HC4-type RING-HC finger at the carboxyl terminus that is required for ubiquitin ligase activity. Livin, also known as baculoviral IAP repeat-containing protein 7 (BIRC7), or kidney inhibitor of apoptosis protein (KIAP), or melanoma inhibitor of apoptosis protein (ML-IAP), or RING finger protein 50, was identified as the melanoma IAP. It plays crucial roles in apoptosis, cell proliferation, and cell cycle control. Its anti-apoptotic activity is regulated by the inhibition of caspase-3, -7, and -9. Its E3 ubiquitin-ligase-like activity promotes degradation of Smac/DIABLO, a critical endogenous regulator of all IAPs. Unlike other family members, mammalian livin contains a single BIR domain and a C3HC4-type RING-HC finger. The UBA domain can be detected in non-mammalian homologs of livin. 54
34253 319628 cd16714 RING-HC_BIRC4_8 RING finger, HC subclass, found in E3 ubiquitin-protein ligase XIAP, baculoviral IAP repeat-containing protein 8 (BIRC8) and similar proteins. XIAP, also known as baculoviral IAP repeat-containing protein 4 (BIRC4), IAP-like protein (ILP), inhibitor of apoptosis protein 3 (IAP-3), or X-linked inhibitor of apoptosis protein (X-linked IAP), is a potent suppressor of apoptosis that directly inhibits specific members of the caspase family of cysteine proteases, including caspase-3, -7, and -9. It promotes proteasomal degradation of caspase-3 and enhances its anti-apoptotic effect in Fas-induced cell death. The ubiquitin-protein ligase (E3) activity of XIAP also exhibits in the ubiquitination of second mitochondria-derived activator of caspases (Smac). The mitochondrial proteins, Smac/DIABLO and Omi/HtrA2, can inhibit the antiapoptotic activity of XIAP. XIAP has also been implicated in several intracellular signaling cascades involved in the cellular response to stress, such as the c-Jun N-terminal kinase (JNK) pathway, the nuclear factor-kappaB (NF-kappaB) pathway, and the transforming growth factor-beta (TGF-beta) pathway. Moreover, XIAP can regulate copper homeostasis through interacting with MURR1. BIRC8, also known as inhibitor of apoptosis-like protein 2, IAP-like protein 2, ILP-2, or testis-specific inhibitor of apoptosis, is a tissue-specific homolog of E3 ubiquitin-protein ligase XIAP. It has been implicated in the control of apoptosis in the testis by direct inhibition of caspase 9. Both XIAP and BIRC8 contain three N-terminal baculoviral IAP repeat (BIR) domains, a ubiquitin-association (UBA) domain and a C3HC4-type RING-HC finger at the carboxyl terminus. 62
34254 319629 cd16715 vRING-HC_IRF2BP1 variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein 1 (IRF-2BP1). IRF-2BP1, also known as IRF-2-binding protein 1, is a nuclear protein that binds to the C-terminal repression domain of IRF-2 and acts as an IRF-2-dependent transcriptional corepressor of both enhancer-activated and basal transcription. It binds to Jun-dimerization protein 2 (JDP2), a member of the activating protein-1 (AP-1) family of transcription factors, and enhances the polyubiquitination of JDP2. It also represses activating transcription factor-2 (ATF2)-mediated transcriptional activation from a cyclic AMP-responsive element (CRE)-containing promoter. IRF-2BP1 contains an N-terminal C4-type zinc finger and a C-terminal C3HC4-type RING-HC finger with a partially new pattern. 56
34255 319630 cd16716 vRING-HC_IRF2BP2 variant of RING finger, HC subclass, found in interferon regulatory factor 2 (IRF2)-binding protein 2 (IRF-2BP2). IRF-2BP2, also known as IRF-2-binding protein 2 or DIF-1, is a nuclear protein that binds to the C-terminal repression domain of IRF-2 and acts as an IRF-2-dependent transcriptional corepressor, both enhancer-activated and basal transcription. IRF-2BP2 also specifically interacts with the C-terminal domain of the nuclear factor of activated T cells NFAT1 transcription factor, and negatively regulates the NFAT1-dependent transactivation of NFAT-responsive promoters. Moreover, IRF2BP2 suppresses the transactivation activity of p53 on both Bax and p21 promoters. It also shows anti-apoptotic activity through the modulation of a death domain in NRIF3. In addition, IRF2BP2 functions as a cofactor of VGLL4 and plays a critical role controlling gene expression in skeletal, cardiac, and smooth muscle cells. It is a muscle-enriched transcription factor required to activate vascular endothelial growth factor-A (VEGF-A) expression in muscle. IRF-2BP2 contains an N-terminal C4-type zinc finger and a C-terminal C3HC4-type RING-HC finger with a partially new pattern. The zinc finger is responsible for the homo- and hetero-dimerization between different members of the IRF-2BP2 family. The RING-HC finger interacts with IRF2 and also with nuclear receptor interacting factor 3 (NRIF3). 56
34256 319631 cd16717 vRING-HC_IRF2BPL variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein-like (IRF-2BPL). IRF-2BPL, also known as C14orf4 or enhanced at puberty protein 1 (EAP1), is a homolog of interferon regulatory factor 2-binding proteins, IRF-2BP1 and IRF-2BP2. It is expressed in the mediobasal hypothalamus and plays a critical function in regulating the female reproductive neuroendocrine axis. IRF-2BPL is a proline-rich protein with polyglutamine and polyalanine tracts at the N-terminus and a C3HC4-type RING-HC finger domain with a partially new pattern at the C-terminus. 56
34257 319632 cd16718 RING-HC_LNX3 RING finger, HC subclass, found in ligand of numb protein X 3 (LNX3). LNX3, also known as PDZ domain-containing RING finger protein 3 (PDZRN3), or Semaphorin cytoplasmic domain-associated protein 3 (SEMACAP3), is an E3 ubiquitin-protein ligase that was first identified as a Semaphorin-binding partner. It is also responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX3 acts as a negative regulator of osteoblast differentiation by inhibiting Wnt-beta-catenin signaling. LNX3 also plays an important role in neuromuscular junction formation. It interacts with and ubiquitinates the muscle specific tyrosine kinase (MuSK), thus promoting its endocytosis and negatively regulating the cell surface expression of this key regulator of postsynaptic assembly. LNX3 contains an N-terminal typical C3HC4-type RING-HC finger, two PDZ domains, and a C-terminal LNX3 homology (LNX3H) domain. 42
34258 319633 cd16719 RING-HC_LNX4 RING finger, HC subclass, found in ligand of numb protein X 4 (LNX4). LNX4, also known as PDZ domain-containing RING finger protein 4 (PDZRN4), or SEMACAP3-like protein (SEMCAP3L), is an E3 ubiquitin-protein ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX4 contains an N-terminal typical C3HC4-type RING-HC finger, two PDZ domains, and a C-terminal LNX3 homology (LNX3H) domain. 42
34259 319634 cd16720 RING-HC_MEX3A RING finger, HC subclass, found in RNA-binding protein MEX3A. MEX3A, also known as RING finger and KH domain-containing protein 4 (RKHD4), is a RNA-binding phosphoprotein that localizes in P-bodies and stress granules, which are two structures involved in the storage and turnover of mRNAs. It has been implicated in the regulation of tumorigenesis. It controls the polarity and stemness of intestinal epithelial cells through the post-transcriptional regulation of the homeobox transcription factor CDX2, which plays a crucial role in intestinal cell fate specification, both during normal development and in tumorigenic processes involving intestinal reprogramming. Moreover, it exhibits a transforming activity when overexpressed in gastric epithelial cells. MEX3A contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3A shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway. 43
34260 319635 cd16721 RING-HC_MEX3B RING finger, HC subclass, found in RNA-binding protein MEX3B. MEX3B, also known as RING finger and KH domain-containing protein 3 (RKHD3), or RING finger protein 195 (RNF195), is a RNA-binding phosphoprotein that localizes in P-bodies and stress granules, which are two structures involved in the storage and turnover of mRNAs. It regulates the spatial organization of the Rap1 pathway that orchestrates Sertoli cell functions. It has a 3' long conserved untranslated region (3'LCU)-mediated fine-tuning system for mRNA regulation in early vertebrate development such as anteroposterior (AP) patterning and signal transduction. MEX3B contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3B shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway. 43
34261 319636 cd16722 RING-HC_MEX3C RING finger, HC subclass, found in RNA-binding protein MEX3C. MEX3C, also known as RING finger and KH domain-containing protein 2 (RKHD2), or RING finger protein 194 (RNF194), is a RNA-binding phosphoprotein that acts as a suppressor of chromosomal instability. It functions as a RNA-binding ubiquitin E3 ligase responsible for the post-transcriptional, HLA-A allotype-specific regulation of MHC class I molecules (MHC-I). It also modifies retinoic acid inducible gene-1 (RIG-I) in stress granules and plays a critical role in eliciting antiviral immune responses. Moreover, MEX3C plays an essential role in normal postnatal growth via enhancing the local expression of insulin-like growth factor 1 (IGF1) in bone. It may also be involved in metabolic regulation of energy balance. MEX3C contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3C shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway. 43
34262 319637 cd16723 RING-HC_MEX3D RING finger, HC subclass, found in RNA-binding protein MEX3D. MEX3D, also known as RING finger and KH domain-containing protein 1 (RKHD1), RING finger protein 193 (RNF193), or TINO, is a RNA-binding phosphoprotein that controls the stability of the transcripts coding for the anti-apoptotic protein BCL-2, and negatively regulates BCL-2 in HeLa cells. MEX3D contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3D shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway. 45
34263 319638 cd16724 RING1-HC_MIB1 RING finger 1, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the first RING-HC finger. 37
34264 319639 cd16725 RING2-HC_MIB1 RING finger 2, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the second RING-HC finger. 37
34265 319640 cd16726 RING1-HC_MIB2 RING finger 1, HC subclass, found in mind bomb 2 (MIB2) and similar proteins. MIB2, also known as novel zinc finger protein (Novelzin), putative NF-kappa-B-activating protein 002N, skeletrophin, or zinc finger ZZ type with ankyrin repeat domain protein 1, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands. It promotes Delta ubiquitylation and endocytosis in Notch activation. Overexpression of MIB2 activates NF-kappaB and interferon-stimulated response element (ISRE) reporter activity. Moreover, MIB2 acts as a novel component of the activated B-cell CLL/lymphoma 10 (BCL10) complex and controls BCL10-dependent NF-kappaB activation. It also functions as a founder myoblast-specific protein that regulates myoblast fusion and muscle stability. MIB2 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of two C3HC4-type RING-HC fingers. This family corresponds to the first RING-HC finger. 37
34266 319641 cd16727 RING3-HC_MIB1 RING finger 3, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the third RING-HC finger. 42
34267 319642 cd16728 RING2-HC_MIB2 RING finger 2, HC subclass, found in mind bomb 2 (MIB2) and similar proteins. MIB2, also known as novel zinc finger protein (Novelzin), putative NF-kappa-B-activating protein 002N, skeletrophin, or zinc finger ZZ type with ankyrin repeat domain protein 1, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands. In particular, it promotes Delta ubiquitylation and endocytosis in Notch activation. Overexpression of MIB2 activates NF-kappaB and interferon-stimulated response element (ISRE) reporter activity. Moreover, MIB2 acts as a novel component of the activated B-cell CLL/lymphoma 10 (BCL10) complex and controls BCL10-dependent NF-kappaB activation. It also functions as a founder myoblast-specific protein that regulates myoblast fusion and muscle stability. MIB2 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of two C3HC4-type RING-HC fingers. This family corresponds to the second RING-HC finger. 46
34268 319643 cd16729 RING-HC_RGLG_plant RING finger, HC subclass, found in RING domain ligase RGLG1, RGLG2 and similar proteins from plants. RGLG1 is a ubiquitously expressed E3 ubiquitin-protein ligase that interacts with UBC13 and, together with UBC13, catalyzes the formation of K63-linked polyubiquitin chains, which is involved in DNA damage repair. RGLG1 mediates the formation of canonical, K48-linked polyubiquitin chains that target proteins for degradation. It also regulates apical dominance by acting on the abundance of auxin transport proteins. RGLG1 has overlapping functions with its closest sequelog, RGLG2. They both function as RING E3 ligases that interact with ethylene response factor 53 (ERF53) in the nucleus and negatively regulate the plant drought stress response. All members in this family contain a Von Willebrand factor type A (vWA) domain and a C3HC4-type RING-HC finger. 45
34269 319644 cd16730 RING-HC_MKRN1_3 RING finger, HC subclass, found in makorin-1 (MKRN1), makorin-3 (MKRN3), and similar proteins. MKRN1, also known as makorin RING finger protein 1 or RING finger protein 61 (RNF61), is an E3 ubiquitin-protein ligase targeting the telomerase catalytic subunit (TERT) for proteasome processing. It regulates the ubiquitination and degradation of peroxisome-proliferator-activated receptor gamma (PPARgamma), a nuclear receptor that is linked to obesity and metabolic diseases. It also mediates the posttranslational regulation of p14ARF, and thus potentially regulates cellular senescence and tumorigenesis in gastric cancer. Moreover, MKRN1 functions as a differentially negative regulator of p53 and p21, and controls cell cycle arrest and apoptosis. It induces degradation of West Nile virus (WNV) capsid protein to protect cells from WNV. Furthermore, MKRN1 may represent a nuclear protein with multiple nuclear functions, including regulating RNA polymerase II-catalyzed transcription. It is a RNA-binding protein involved in the modulation of cellular stress and apoptosis. It predominantly associates with proteins involved in mRNA metabolism including regulators of mRNA turnover, transport, and/or translation, and acts as a component of a ribonucleoprotein complex in embryonic stem cells (ESCs) that is recruited to stress granules upon exposure to environmental stress. Meanwhile, MKRN1 interacts with poly(A)-binding protein (PABP), a key component of different ribonucleoprotein complexes, in an RNA-independent manner, and stimulates translation in nerve cells. In addition, MKRN1 is a novel SEREX (serological identification of antigens by recombinant cDNA expression cloning) antigen of esophageal squamous cell carcinoma (SCC). It may be involved in carcinogenesis of the well-differentiated type of tumors possibly via ubiquitination of filamin A interacting protein 1 (L-FILIP). Human MKRN1 contains three N-terminal C3H1-type zinc fingers, a motif rich in Cys and His residues (CH), a C3HC4-type RING-HC finger, and another C3H1-type zinc finger at the C-terminus. MKRN3, also known as makorin RING finger protein 3, RING finger protein 63 (RNF63), or zinc finger protein 127 (ZNF127), is a therian mammal-specific retrocopy of MKRN1. It acts as a putative E3 ubiquitin-protein ligase involved in ubiquitination and cell signaling. MKRN3 shows a potential inhibitory effect on hypothalamic gonadotropin-releasing hormone (GnRH) secretion. Its defects represent the most frequent known genetic cause of familial central precocious puberty (CPP). In contrast to human MKRN1, human MKRN3 lacks the second C3H1-type zinc finger at the N-terminal region. The RING-HC finger of mammalian MKRN4 shows high sequence similarity with that of MKRN3, and is also included in this subfamily. 61
34270 319645 cd16731 RING-HC_MKRN2 RING finger, HC subclass, found in makorin-2 (MKRN2) and similar proteins. MKRN2, also known as makorin RING finger protein 2, RING finger protein 62 (RNF62), or HSPC070, is a putative ribonucleoprotein that acts as a neurogenesis inhibitor acting upstream of glycogen synthase kinase-3beta (GSK-3beta) in the phosphatidylinositol 3-kinase (PI3K)/Akt pathway. It also functions on promoting cell proliferation of primary CD34+ progenitor cells and K562 cells, indicating its possible involvement in normal and malignant hematopoiesis. Mammalian MKRN2 contains three N-terminal C3H1-type zinc fingers, a motif rich in Cys and His residues (CH), a C3HC4-type RING-HC finger, and another C3H1-type zinc finger at the C-terminus. The third C3H1-type zinc finger, the CH motif, as well as the RING zinc finger are necessary for its anti-neurogenic activity. 58
34271 319646 cd16732 RING-HC_MKRN4 RING finger, HC subclass, found in makorin-4 (MKRN4) and similar proteins. MKRN4, also known as makorin RING finger protein pseudogene 4, makorin RING finger protein pseudogene 5, RING finger protein 64 (RNF64), zinc finger protein 127-Xp (ZNF127-Xp), or zinc finger protein 127-like 1, is a new divergent member of the makorin protein family in vertebrates. It may have an ancestral gonad-specific function and maternal embryonic expression before duplication in vertebrates. MKRN4 contains typical arrays of one to four C3H1-type zinc fingers, a motif rich in Cys and His residues (CH) and a C3HC4-type RING-HC finger. The RING-HC finger of mammalian MKRN4 shows high sequence similarity with that of MKRN3, and is not included in this subfamily. 61
34272 319647 cd16733 RING-HC_PCGF1 RING finger, HC subclass, found in polycomb group RING finger protein 1 (PCGF1) and similar proteins. PCGF1, also known as nervous system Polycomb-1 (NSPc1) or RING finger protein 68 (RNF68), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like BCOR complex that also contains RING1, RNF2, RYBP, SKP1, as well as the BCL6 co-repressor BCOR and the histone demethylase KDM2B, and is required to maintain the transcriptionally repressive state of some genes, such as Hox genes, BCL6 and the cyclin-dependent kinase inhibitor, CDKN1A. PCGF1 promotes cell cycle progression and enhances cell proliferation as well. It is a cell growth regulator that acts as a transcriptional repressor of p21Waf1/Cip1 via the retinoid acid response element (RARE element). Moreover, PCGF1 functions as an epigenetic regulator involved in hematopoietic cell differentiation. It cooperates with the transcription factor runt-related transcription factor 1 (Runx1) in regulating differentiation and self-renewal of hematopoietic cells. Furthermore, PCGF1 represents a physical and functional link between Polycomb function and pluripotency. PCGF1 contains a C3HC4-type RING-HC finger. 43
34273 319648 cd16734 RING-HC_PCGF2 RING finger found in polycomb group RING finger protein 2 (PCGF2) and similar proteins. PCGF2, also known as DNA-binding protein Mel-18, RING finger protein 110 (RNF110), or zinc finger protein 144 (ZNF144), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF2 uniquely regulates PRC1 to specify mesoderm cell fate in embryonic stem cells. It is required for PRC1 stability and maintenance of gene repression in embryonic stem cells (ESCs) and essential for ESC differentiation into early cardiac-mesoderm precursors. PCGF2 also plays a significant role in the angiogenic function of endothelial cells (ECs) by regulating endothelial gene expression. Furthermore, PCGF2 is a SUMO-dependent regulator of hormone receptors. It facilitates the deSUMOylation process by inhibiting PCGF4/BMI1-mediated ubiquitin-proteasomal degradation of SUMO1/sentrin-specific protease 1 (SENP1). It is also a novel negative regulator of breast cancer stem cells (CSCs) that inhibits the stem cell population and in vitro and in vivo self-renewal through the inactivation of Wnt-mediated Notch signaling. PCGF2 contains a C3HC4-type RING-HC finger. 43
34274 319649 cd16735 RING-HC_PCGF3 RING finger found in polycomb group RING finger protein 3 (PCGF3) and similar proteins. PCGF3, also known as RING finger protein 3A (RNF3A), is one of six PcG RING finger (PCGF) homologs (PCGF1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF3 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF3 contains a C3HC4-type RING-HC finger. 47
34275 319650 cd16736 RING-HC_PCGF4 RING finger found in polycomb group RING finger protein 4 (PCGF4) and similar proteins. PCGF4, also known as polycomb complex protein BMI-1 (B cell-specific Moloney murine leukemia virus integration site 1) or RING finger protein 51 (RNF51), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3), and plays important roles in chromatin compaction and H2AK119 monoubiquitination. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence target gene in a PRC2-independent manner. Moreover, PCGF4 is expressed in the hair cells and supporting cells. It can regulate cell survival by controlling mitochondrial function and reactive oxygen species (ROS) level in thymocytes and neurons, thus having an important role in the survival and sensitivity to ototoxic drug of auditory hair cells. Furthermore, PCGF4 controls memory CD4 T-cell survival through direct repression of Noxa gene in an Ink4a- and Arf-independent manner. It is required in neurons to suppress p53-induced apoptosis via regulating the antioxidant defensive response, and also involved in the tumorigenesis of various cancer types. PCGF4 contains a C3HC4-type RING-HC finger. 54
34276 319651 cd16737 RING-HC_PCGF5 RING finger found in polycomb group RING finger protein 5 (PCGF5) and similar proteins. PCGF5, also known as RING finger protein 159 (RNF159), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF5 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF5 contains a C3HC4-type RING-HC finger. 42
34277 319652 cd16738 RING-HC_PCGF6 RING finger found in polycomb group RING finger protein 6 (PCGF6) and similar proteins. PCGF6, also known as Mel18 and Bmi1-like RING finger (MBLR), or RING finger protein 134 (RNF134), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like L3MBTL2 complex, which is composed of some canonical components, such as RNF2, CBX3, CBX4, CBX6, CBX7, and CBX8, as well as some noncanonical components, such as L3MBTL2, E2F6, WDR5, HDAC1, and RYBP, and plays a critical role in epigenetic transcriptional silencing in higher eukaryotes. Like other PCGF homologs, PCGF6 possesses transcriptional repression activity, and also associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF6 can regulate the enzymatic activity of JARID1d/KDM5D, a trimethyl H3K4 demethylase, through direct interaction with it. Furthermore, PCGF6 is expressed predominantly in meiotic and post-meiotic male germ cells and may play important roles in mammalian male germ cell development. It also regulates mesodermal lineage differentiation in mammalian embryonic stem cells (ESCs) and functions in induced pluripotent stem (iPS) reprogramming. The activity of PCGF6 is found to be regulated by cell cycle dependent phosphorylation. PCGF6 contains a C3HC4-type RING-HC finger. 45
34278 319653 cd16739 RING-HC_RING1 RING finger, HC subclass, found in really interesting new gene 1 protein (RING1) and similar proteins. RING1, also known as polycomb complex protein RING1, RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), was identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. It is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING1 interacts with multiple PcG proteins and displays tumorigenic activity. It also shows zinc-dependent DNA binding activity. Moreover, RING1 inhibits transactivation of the DNA-binding protein recombination signal binding protein-Jkappa (RBP-J) by Notch through interaction with the LIM domains of KyoT2. RING1 contains a C3HC4-type RING-HC finger. 44
34279 319654 cd16740 RING-HC_RING2 RING finger, HC subclass, found in really interesting new gene 2 protein (RING2) and similar proteins. RING2, also known as huntingtin-interacting protein 2-interacting protein 3, HIP2-interacting protein 3, protein DinG, RING finger protein 1B (RING1B), RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. RING2 is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. The enzymatic activity of RING2 is enhanced by the interaction with BMI1/PCGF4, and it is dispensable for early embryonic development and much of the gene repression activity of PRC1. Moreover, RING2 plays a key role in terminating neural precursor cell (NPC)-mediated production of subcerebral projection neurons (SCPNs) during neocortical development. It also plays a critical role in nonhomologous end-joining (NHEJ)-mediated end-to-end chromosome fusions. Furthermore, RING2 is essential for expansion of hepatic stem/progenitor cells. It promotes hepatic stem/progenitor cell expansion through simultaneous suppression of cyclin-dependent kinase inhibitors (CDKIs) Cdkn1a and Cdkn2a, known negative regulators of cell proliferation. RING2 also negatively regulates p53 expression through directly binding with both p53 and MDM2 and promoting MDM2-mediated p53 ubiquitination in selective cancer cell types to stimulate tumor development. RING2 contains a C3HC4-type RING-HC finger. 47
34280 319655 cd16741 RING-HC_RNFT1 RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein 1 (RNFT1). RNFT1, also known as protein PTD016, is a multi-pass membrane protein containing a C3HC4-type RING-HC finger. Its biological role remains unclear. 40
34281 319656 cd16742 RING-HC_RNFT2 RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein 2 (RNFT2). RNFT2, also known as transmembrane protein 118 (TMEM118), is a multi-pass membrane protein containing a C3HC4-type RING-HC finger. Its biological role remains unclear. 41
34282 319657 cd16743 RING-HC_RNF5 RING finger, HC subclass, found in RING finger protein 5 (RNF5) and similar proteins. RNF5, also known as protein G16 or Ram1, is an E3 ubiquitin-protein ligase anchored to the outer membrane of the endoplasmic reticulum (ER). It acts at early stages of cystic fibrosis (CF) transmembrane conductance regulator (CFTR) biosynthesis and functions as a target for therapeutic modalities to antagonize mutant CFTR proteins in CF patients carrying the F508del allele. It also regulates the turnover of specific G protein-coupled receptors by ubiquitinating JNK-associated membrane protein (JAMP) and preventing proteasome recruitment. RNF5 limits basal levels of autophagy and influences susceptibility to bacterial infection through the regulation of ATG4B stability. It is also involved in the degradation of Pendrin, a transmembrane chloride/anion exchanger highly expressed in thyroid, kidney, and inner ear. RNF5 plays an important role in cell adhesion and migration. It can modulate cell migration by ubiquitinating paxillin. Furthermore, RNF5 interacts with virus-induced signaling adaptor (VISA) at mitochondria in a viral infection-dependent manner, and further targets VISA at K362 and K461 for K48-linked ubiquitination and degradation after viral infection. It also negatively regulates virus-triggered signaling by targeting MITA, also known as STING, for ubiquitination and degradation at the mitochondria. In addition, RNF5 determines breast cancer response to ER stress-inducing chemotherapies through the regulation of the L-glutamine carrier proteins SLC1A5 and SLC38A2 (SLC1A5/38A2). It also has been implicated in muscle organization and in recognition and processing of misfolded proteins. RNF5 contains a C3HC4-type RING-HC finger. 46
34283 319658 cd16744 RING-HC_RNF185 RING finger, HC subclass, found in RING finger protein 185 (RNF185) and similar proteins. RNF185 is an E3 ubiquitin-protein ligase of endoplasmic reticulum-associated degradation (ERAD) that targets cystic fibrosis transmembrane conductance regulator (CFTR). It controls the degradation of CFTR and CFTR F508del allele in a RING- and proteasome-dependent manner, but does not control that of other classical ERAD model substrates. It also negatively regulates osteogenic differentiation by targeting dishevelled2 (Dvl2), a key mediator of Wnt signaling pathway, for degradation. Moreover, RNF185 regulates selective mitochondrial autophagy through interaction with the Bcl-2 family protein BNIP1. It also plays an important role in cell adhesion and migration through the modulation of cell migration by ubiquitinating paxillin. RNF185 contains a C3HC4-type RING-HC finger. 43
34284 319659 cd16745 RING-HC_AtRMA_like RING finger, HC subclass, found in Arabidopsis thaliana RING membrane-anchor proteins (AtRMAs) and similar proteins. AtRMAs, including AtRma1, AtRma2, and AtRma3, are endoplasmic reticulum (ER)-localized Arabidopsis homologs of human outer membrane of the ER-anchor E3 ubiquitin-protein ligase, RING finger protein 5 (RNF5). AtRMAs possess E3 ubiquitin ligase activity, and may play a role in the growth and development of Arabidopsis. The AtRMA1 and AtRMA3 genes are predominantly expressed in major tissues, such as cotyledons, leaves, shoot-root junction, roots, and anthers, while AtRMA2 expression is restricted to the root tips and leaf hydathodes. AtRma1 probably functions with the Ubc4/5 subfamily of E2. AtRma2 is likely involved in the cellular regulation of ABP1 expression levels through interacting with auxin binding protein 1 (ABP1). AtRMA proteins contain an N-terminal C3HC4-type RING-HC finger and a trans-membrane-anchoring domain in their extreme C-terminal region. 45
34285 319660 cd16746 RING-HC_RNF212 RING finger, HC subclass, found in RING finger protein 212 (RNF212) and similar proteins. RNF212 is a dosage-sensitive regulator of crossing-over during mammalian meiosis. It plays a central role in designating crossover sites and coupling chromosome synapsis to the formation of crossover-specific recombination complexes. It also functions as an E3 ligase for SUMO modification. RNF212 contains an N-terminal C3HC4-type RING-HC finger. 48
34286 319661 cd16747 RING-HC_RNF212B RING finger, HC subclass, found in RING finger protein 212B (RNF212B) and similar proteins. RNF212B is an uncharacterized protein with high sequence similarity with RNF212, a dosage-sensitive regulator of crossing-over during mammalian meiosis. RNF212B contains an N-terminal C3HC4-type RING-HC finger. 41
34287 319662 cd16748 RING-HC_SH3RF1 RING finger, HC subclass, found in SH3 domain-containing RING finger protein 1 (SH3RF1) and similar proteins. SH3RF1, also known as plenty of SH3s (POSH), RING finger protein 142 (RNF142), or SH3 multiple domains protein 2 (SH3MD2), is a trans-Golgi network-associated pro-apoptotic scaffold protein with E3 ubiquitin-protein ligase activity. It also plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and c-Jun N-terminal kinase (JNK) mediated apoptosis, linking Rac1 to downstream components. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. Moreover, SH3RF1 assembles an inhibitory complex with the actomyosin regulatory protein Shroom3, which links to the actin-myosin network to regulate neuronal process outgrowth. It also forms a complex with apoptosis-linked gene-2 (ALG-2) and ALG-2-interacting protein (ALIX/AIP1) in a calcium-dependent manner to play a role in the regulation of the JNK pathway. Furthermore, direct interaction of SH3RF1 and another molecular scaffold JNK-interacting protein (JIP) is required for apoptotic activation of JNKs. Interaction of SH3RF1 and E3 ubiquitin-protein isopeptide ligases, Siah proteins, is also needed to promote JNK activation and apoptosis. In addition, SH3RF1 binds to and degrades TAK1, a crucial activator of both the JNK and the Relish signaling pathways. SH3RF1 contains an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 48
34288 319663 cd16749 RING-HC_SH3RF2 RING finger, HC subclass, found in SH3 domain-containing RING finger protein 2 (SH3RF2) and similar proteins. SH3RF2, also known as heart protein phosphatase 1-binding protein (HEPP1), plenty of SH3s (POSH)-eliminating RING protein (POSHER), protein phosphatase 1 regulatory subunit 39, or RING finger protein 158 (RNF158), is a putative E3 ubiquitin-protein ligase that acts as an anti-apoptotic regulator for the c-Jun N-terminal kinase (JNK) pathway by binding to and promoting the proteasomal degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 48
34289 319664 cd16750 RING-HC_SH3RF3 RING finger, HC subclass, found in SH3 domain-containing RING finger protein 3 (SH3RF3) and similar proteins. SH3RF3, also known as plenty of SH3s 2 (POSH2) or SH3 multiple domains protein 4 (SH3MD4), is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating c-Jun N-terminal kinase (JNK) mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1. Both of them contain an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 46
34290 319665 cd16751 RING-HC_SIAH1 RING finger, HC subclass, found in seven in absentia homolog 1 (SIAH1) and similar proteins. SIAH1, also known as Siah-1a, is an inducible E3 ubiquitin-protein ligase that contributes to proteasome-mediated degradation of multiple targets in numerous cellular processes including apoptosis, tumor suppression, cell cycle, axon guidance, transcription regulation, and tumor necrosis factor signaling. SIAH1 functions as a scaffolding protein and interacts with a variety of different substrates for ubiquitination and subsequent degradation. It regulates the oncoprotein p34SEI-1 polyubiquitination and its subsequent degradation in a p53-dependent manner, which mediates p53 preferential vitamin C cytotoxicity. It targets the nonreceptor tyrosine kinase activated Cdc42-associated kinase 1 (ACK1), a valid target in cancer therapy, for ubiquitinylation and proteasomal degradation. It also interacts with KLF10 and targets it for degradation. The CDK2 phosphorylation-mediated KLF10 dissociation from SIAH1 is linked to cell cycle progression. Moreover, Siah1 is downregulated and associated with apoptosis and invasion in human breast cancer. It targets TAp73, a homolog of the tumor suppressor p53, for degradation. It is suppressed by hypoxia-inducible factor 1-alpha (HIF-1alpha) under hypoxic conditions to regulate TAp73 levels. It also promotes the migration and invasion of human glioma cells by regulating HIF-1alpha signaling under hypoxia. Furthermore, Siah1 forms a protein complex with glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The apoptosis signal-regulating kinase 1 (ASK1) functions as an activator of the GAPDH-Siah1 stress-signaling cascade. It also plays an important role in ethanol-induced apoptosis in neural crest cells (NCCs). 
SIAH1 contains an N-terminal C3HC4-type RING-HC finger, two zinc-finger subdomains, and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD) responsible for dimer formation. 40
34291 319666 cd16752 RING-HC_SIAH2 RING finger, HC subclass, found in seven in absentia homolog 2 (SIAH2) and similar proteins. SIAH2 is an E3 ubiquitin-protein ligase that contributes to proteasome-mediated degradation of multiple targets in numerous cellular processes. It targets the ubiquitylation and degradation of tumor necrosis factor receptor-associated factor 2 (TRAF2) under stress conditions, which is required for the cell to commit to undergoing apoptosis. It is, therefore, a key regulator of TRAF2-dependent signaling in response to tumor necrosis factor-alpha (TNF-alpha) treatment and UV irradiation. SIAH2 modulates the polyubiquitination of G protein pathway suppressor 2 (GPS2), and targets it for proteasomal degradation. It is also a regulator of NF-E2-related factor 2 (Nrf2), a key regulator of cellular oxidative response, and contributes to the degradation of Nrf2 irrespective of its phosphorylation status. Moreover, SIAH2 contributes to castration-resistant prostate cancer (CRPC) by regulation of androgen receptor (AR) transcriptional activity. It enhances AR transcriptional activity and prostate cancer cell growth. Its stability can be regulated by AKR1C3. SIAH2 also inhibits tyrosine kinase-2 (TYK2)-STAT3 signaling in lung carcinoma cells. Furthermore, SIAH2 regulates obesity-induced adipose tissue inflammation through altering peroxisome proliferator-activated receptor gamma (PPAR gamma) protein levels and selectively regulating PPAR gamma activity. It also functions as a regulator of the nuclear hormone receptor RevErbalpha (Nr1d1) stability and rhythmicity, and overall circadian oscillator function. In addition, Siah2 is an essential component of the hypoxia response Hippo signaling pathway and has been implicated in normal development and tumorigenesis. 
It modulates the hypoxia pathway upstream of hypoxia-induced transcription factor subunit HIF-1alpha, and therefore may play an important role in angiogenesis in response to hypoxic stress in endothelial cells. It also stimulates transcriptional coactivator YAP1 by destabilizing serine/threonine-protein kinase LATS2, a critical component of the Hippo pathway, in response to hypoxia. Meanwhile, Siah2 is involved in regulation of tight junction integrity and cell polarity under hypoxia, through its regulation of apoptosis-stimulating proteins of p53 subunit 2 (ASPP2) stability. SIAH2 contains an N-terminal C3HC4-type RING-HC finger, two zinc-finger subdomains, and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD) responsible for dimer formation. 38
34292 319667 cd16753 RING-HC_MID1 RING finger, HC subclass, found in midline-1 (MID1) and similar proteins. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. MID1 hetero-dimerizes in vitro with its paralog MID2. 54
34293 319668 cd16754 RING-HC_MID2 RING finger, HC subclass, found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1 and associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. Loss-of-function mutations in MID2 lead to human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. MID2 hetero-dimerizes in vitro with its paralog MID1. 53
34294 319669 cd16755 RING-HC_TRIM9 RING finger, HC subclass, found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9, human ortholog of rat Spring, also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytic soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 52
34295 319670 cd16756 RING-HC_TRIM36 RING finger, HC subclass, found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in the carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequently dorsal axis formation. It is also potentially associated with chromosome segregation through interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 49
34296 319671 cd16757 RING-HC_TRIM46 RING finger, HC subclass, found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 43
34297 319672 cd16758 RING-HC_TRIM67 RING finger, HC subclass, found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also known as TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. 50
34298 319673 cd16759 RING-HC_MuRF1 RING finger, HC subclass, found in muscle-specific RING finger protein 1 (MuRF-1) and similar proteins. MuRF-1, also known as tripartite motif-containing protein 63 (TRIM63), RING finger protein 28 (RNF28), iris RING finger protein, or striated muscle RING zinc finger, is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is predominantly fast (type II) fibre-associated in skeletal muscle and can bind to many myofibrillar proteins, including titin, nebulin, the nebulin-related protein NRAP, troponin-I (TnI), troponin-T (TnT), myosin light chain 2 (MLC-2), myotilin, and T-cap. The early and robust upregulation of MuRF-1 is triggered by disuse, denervation, starvation, sepsis, or steroid administration resulting in skeletal muscle atrophy. It also plays a role in maintaining titin M-line integrity. It associates with the periphery of the M-line lattice and may be involved in the regulation of the titin kinase domain. It also participates in muscle stress response pathways and gene expression. MuRF-1 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. 63
34299 319674 cd16760 RING-HC_MuRF2 RING finger, HC subclass, found in muscle-specific RING finger protein 2 (MuRF-2) and similar proteins. MuRF-2, also known as tripartite motif-containing protein 55 (TRIM55) or RING finger protein 29 (RNF29), is a muscle-specific E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover and also a ligand of the transactivation domain of the serum response transcription factor (SRF). It is predominantly slow-fibre associated and highly expressed in embryonic skeletal muscle. MuRF-2 associates transiently with microtubules, myosin, and titin during sarcomere assembly. It has been implicated in microtubule, intermediate filament, and sarcomeric M-line maintenance in striated muscle development, as well as in signalling from the sarcomere to the nucleus. It plays an important role in the earliest stages of skeletal muscle differentiation and myofibrillogenesis. It is developmentally downregulated and is assembled at the M-line region of the sarcomere and with microtubules. MuRF-2 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. 64
34300 319675 cd16761 RING-HC_MuRF3 RING finger, HC subclass, found in muscle-specific RING finger protein 3 (MuRF-3) and similar proteins. MuRF-3, also known as tripartite motif-containing protein 54 (TRIM54), or RING finger protein 30 (RNF30), is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is ubiquitously detected in all fibre types, is developmentally upregulated, associates with microtubules, the sarcomeric M-line and Z-line, and is required for microtubule stability and myogenesis. It associates with glutamylated microtubules during skeletal muscle development, and is required for skeletal myoblast differentiation and development of cellular microtubular networks. MuRF-3 controls the degradation of four-and-a-half LIM domain (FHL2) and gamma-filamin and is required for maintenance of ventricular integrity after myocardial infarction (MI). MuRF-3 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. 59
34301 319676 cd16762 RING-HC_TRIM13_C-V RING finger, HC subclass, found in tripartite motif-containing protein 13 (TRIM13) and similar proteins. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). It also targets the known ER proteolytic substrate CD3-delta, but not the N-end rule substrate Ub-R-YFP (yellow fluorescent protein) for its degradation. Moreover, TRIM13 regulates ubiquitination and degradation of NEMO to suppress tumor necrosis factor (TNF) induced nuclear factor-kappaB (NF-kappaB) activation. It is also involved in NF-kappaB p65 activation and nuclear factor of activated T-cells (NFAT)-dependent activation of c-Rel upon T-cell receptor engagement. Furthermore, TRIM13 negatively regulates melanoma differentiation-associated gene 5 (MDA5)-mediated type I interferon production. It also regulates caspase-8 ubiquitination, translocation to autophagosomes, and activation during ER stress induced cell death. Meanwhile, TRIM13 enhances ionizing radiation-induced apoptosis by increasing p53 stability and decreasing AKT kinase activity through MDM2 and AKT degradation. TRIM13 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region. In addition, TRIM13 contains a C-terminal transmembrane domain. 57
34302 319677 cd16763 RING-HC_TRIM59_C-V RING finger, HC subclass, found in tripartite motif-containing protein 59 (TRIM59) and similar proteins. TRIM59, also known as RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis. It is upregulated in gastric cancer and promotes gastric carcinogenesis by interacting with and targeting the P53 tumor suppressor for its ubiquitination and degradation. It also acts as a novel accessory molecule involved in cytotoxicity of BCG-activated macrophages (BAM). Moreover, TRIM59 may serve as a multifunctional regulator for innate immune signaling pathways. It interacts with ECSIT and negatively regulates nuclear factor-kappaB (NF-kappaB) and interferon regulatory factor (IRF)-3/7-mediated signal pathways. TRIM59 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region. In addition, TRIM59 contains a C-terminal transmembrane domain. 56
34303 319678 cd16764 RING-HC_TIF1alpha RING finger, HC subclass, found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development. 77
34304 319679 cd16765 RING-HC_TIF1beta RING finger, HC subclass, found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity. 61
34305 319680 cd16766 RING-HC_TIF1gamma RING finger, HC subclass, found in transcriptional intermediary factor 1 gamma (TIF1gamma). TIF1gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling and inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promote mitosis. 67
34306 319681 cd16767 RING-HC_TRIM2 RING finger, HC subclass, found in tripartite motif-containing protein 2 (TRIM2). TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM2 also plays a role in mediating the p42/p44 MAPK-dependent ubiquitination of the cell death-promoting protein Bcl-2-interacting mediator of cell death (Bim) in rapid ischemic tolerance. TRIM2 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. 46
34307 319682 cd16768 RING-HC_TRIM3 RING finger, HC subclass, found in tripartite motif-containing protein 3 (TRIM3). TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It functions as a tumor suppressor that regulates asymmetric cell division in glioblastoma. It binds to the cdk inhibitor p21(WAF1/CIP1) and regulates its availability that promotes cyclin D1-cdk4 nuclear accumulation. Moreover, TRIM3 plays an important role in the central nervous system (CNS). It corresponds to gene BERP (brain-expressed RING finger protein), a unique p53-regulated gene that modulates seizure susceptibility and GABAAR cell surface expression. Furthermore, TRIM3 mediates activity-dependent turnover of postsynaptic density (PSD) scaffold proteins GKAP/SAPAP1 and is a negative regulator of dendritic spine morphology. In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. It also regulates the motility of the kinesin superfamily protein KIF21B. TRIM3 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. 45
34308 319683 cd16769 RING-HC_UHRF1 RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1, also known as inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also a N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET and RING finger associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-HC finger exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. 47
34309 319684 cd16770 RING-HC_UHRF2 RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2, also known as Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation, but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger. 46
34310 319685 cd16771 RING-HC_UNK RING finger, HC subclass, found in RING finger protein unkempt (UNK) and similar proteins. UNK, also known as zinc finger CCCH domain-containing protein 5, is a metazoan-specific zinc finger protein enriched in embryonic brains. It may play a broad regulatory role during the formation of the central nervous system (CNS). It is a sequence-specific RNA-binding protein required for the early neuronal morphology. UNK is a neurogenic component of the mTOR pathway, and functions as a negative regulator of the timing of photoreceptor differentiation. It also specifically binds to Brg/Brm-associated factor BAF60b and promotes its ubiquitination in a Rac1-dependent manner. UNK contains six tandem CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at its C-terminus. 37
34311 319686 cd16772 RING-HC_UNKL RING finger, HC subclass, found in RING finger protein unkempt-like (UNKL) and similar proteins. UNKL, also known as zinc finger CCCH domain-containing protein 5-like, is a putative E3 ubiquitin-protein ligase that may participate in a protein complex showing an E3 ligase activity regulated by RAC1. It shows high sequence similarity with RING finger protein unkempt (UNK), which is a metazoan-specific zinc finger protein enriched in embryonic brains, and may play a broad regulatory role during the formation of the central nervous system (CNS). UNKL contains several CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at its C-terminus. 38
34312 319687 cd16773 RING-HC_RBR_TRIAD1 RING finger, HC subclass, found in two RING fingers and DRIL [double RING finger linked] 1 (TRIAD1). TRIAD1, also known as ariadne-2 (ARI-2), protein ariadne-2 homolog, Ariadne RBR E3 ubiquitin protein ligase 2 (ARIH2), or UbcM4-interacting protein 48, is a RBR-type E3 ubiquitin-protein ligase that catalyzes the formation of polyubiquitin chains linked via lysine-48, as well as lysine-63 residues. Its auto-ubiquitylation can be catalyzed by the E2 conjugating enzyme UBCH7. TRIAD1 has been implicated in hematopoiesis, specifically in myelopoiesis, as well as in embryogenesis. It functions as a regulator of endosomal transport and is required for the proper function of multivesicular bodies. It also acts as a novel ubiquitination target for proteasome-dependent degradation by murine double minute 2 (MDM2). As a proapoptotic protein, TRIAD1 promotes p53 activation, and inhibits MDM2-mediated p53 ubiquitination and degradation. Furthermore, TRIAD1 can inhibit the ubiquitination and proteasomal degradation of growth factor independence 1 (Gfi1), a transcriptional repressor essential for the function and development of many different hematopoietic lineages. TRIAD1 contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 54
34313 319688 cd16774 RING-HC_RBR_ANKIB1 RING finger, HC subclass, found in ankyrin repeat and IBR domain-containing protein 1 (ANKIB1) and similar proteins. ANKIB1 is a RBR-type E3 ubiquitin-protein ligase that may function as part of E3 complex, which accepts ubiquitin from specific E2 ubiquitin-conjugating enzymes and then transfers it to substrates. It contains an N-terminal ankyrin repeats domain and a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 57
34314 319689 cd16775 RING-HC_RBR_RNF19A RING finger, HC subclass, found in RING finger protein 19A (RNF19A) and similar proteins. RNF19A, also known as double ring-finger protein (Dorfin) or p38, is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that localizes to the ubiquitylated inclusions in Parkinson's disease (PD), dementia with Lewy bodies, multiple system atrophy, and amyotrophic lateral sclerosis (ALS). It interacts with Psmc3, a protein component of the 19S regulatory cap of the 26S proteasome, and further participates in the ubiquitin-proteasome system in acrosome biogenesis, spermatid head shaping, and development of the head-tail coupling apparatus and tail. It modulates the ubiquitination and degradation of calcium-sensing receptor (CaR), which may contribute to a general mechanism for CaR quality control during biosynthesis. Moreover, RNF19A can also ubiquitylate mutant superoxide dismutase 1 (SOD1), the causative gene of familial ALS. It may associate with endoplasmic reticulum-associated degradation (ERAD) pathway, which is related to the pathogenesis of neurodegenerative disorders, such as PD or Alzheimer's disease. It is also involved in the pathogenic process of PD and Lewy body (LB) formation by ubiquitylation of synphilin-1. RNF19A contains a RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 55
34315 319690 cd16776 RING-HC_RBR_RNF19B RING finger, HC subclass, found in RING finger protein 19B (RNF19B) and similar proteins. RNF19B, also known as IBR domain-containing protein 3 or natural killer lytic-associated molecule (NKLAM), is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that plays a role in controlling tumor dissemination and metastasis. It is involved in the cytolytic function of natural killer (NK) cells and cytotoxic T lymphocytes (CTLs). It interacts with ubiquitin conjugates UbcH7 and UbcH8, and ubiquitinates uridine kinase like-1 (URKL-1) protein, targeting it for degradation. Moreover, RNF19B is a novel component of macrophage phagosomes and plays a role in macrophage anti-bacterial activity. It functions as a novel modulator of macrophage inducible nitric oxide synthase (iNOS) expression. RNF19B contains a RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. 55
34316 319691 cd16777 mRING-HC-C4C4_RBR_RNF144A Modified RING finger, HC subclass (C4C4-type), found in RING finger protein 144A (RNF144A). RNF144A, also known as UbcM4-interacting protein 4 (UIP4) or ubiquitin-conjugating enzyme 7-interacting protein 4, is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that targets DNA-dependent protein kinase catalytic subunit (DNA-PKcs) and thus promotes DNA damage-induced cell apoptosis. It is transcriptionally repressed by metastasis-associated protein 1 (MTA1) and inhibits MTA1-driven cancer cell migration and invasion. RNF144A contains a RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is responsible for the interaction of E2-conjugating enzymes UbcH7 and UbcH8. 54
34317 319692 cd16778 mRING-HC-C4C4_RBR_RNF144B Modified RING finger, HC subclass (C4C4-type), found in RING finger protein 144B (RNF144B). RNF144B, also known as PIR2, IBR domain-containing protein 2 (IBRDC2), or p53-inducible RING finger protein (p53RFP), is a transmembrane (TM) domain-containing RBR (RING1-IBR-RING2) E3 ubiquitin-protein ligase that induces p53-dependent, but caspase-independent apoptosis. It interacts with E2 ubiquitin-conjugating enzymes UbcH7 and UbcH8, but not with UbcH5. It is involved in ubiquitination and degradation of p21, a p53 downstream protein promoting growth arrest and antagonizing apoptosis, suggesting a role in switching a cell from p53-mediated growth arrest to apoptosis. Moreover, RNF144B regulates the levels of Bax, a pro-apoptotic protein from the Bcl-2 family, and protects cells from unprompted Bax activation and cell death. It also regulates epithelial homeostasis by mediating degradation of p21WAF1 and p63. RNF144B contains a RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been corrected to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination. 57
34318 319693 cd16779 mRING-HC-C3HC3D_LNX1 Modified RING finger, HC subclass (C3HC3D-type), found in ligand of numb protein X 1 (LNX1). LNX1, also known as numb-binding protein 1 or PDZ domain-containing RING finger protein 2, is a PDZ domain-containing RING-type E3 ubiquitin ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX1 contains an N-terminal modified C3HC3D-type RING-HC finger, a NPAY motif for Numb-LNX interaction, and four PDZ domains necessary for the binding of substrates, including CAR, ErbB2, SKIP, JAM4, CAST, c-Src, Claudins, RhoC, KCNA4, PAK6, PLEKHG5, PKC-alpha1, TYK2, PDZ-binding kinase (PBK), LNX2, and itself. 42
34319 319694 cd16780 mRING-HC-C3HC3D_LNX2 Modified RING finger, HC subclass (C3HC3D-type), found in ligand of numb protein X 2 (LNX2). LNX2, also known as numb-binding protein 2, or PDZ domain-containing RING finger protein 1 (PDZRN1), is a PDZ domain-containing RING-type E3 ubiquitin ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. It interacts with contactin-associated protein 4 (Caspr4, also known as CNTNAP4) in a PDZ domain-dependent manner, which modulates the proliferation and neuronal differentiation of neural progenitor cells (NPCs). LNX2 contains an N-terminal modified C3HC3D-type RING-HC finger, a NPAF motif for Numb/ Numblike-LNX interaction, and four PDZ domains necessary for the binding of substrates, including ErbB2, RhoC, the presynaptic protein CAST, the melanoma/cancer-testis antigen MAGEB18 and several proteins associated with cell junctions, such as JAM4 and the Coxsackievirus and adenovirus receptor (CAR). 45
34320 319695 cd16781 mRING-HC-C3HC3D_Roquin1 Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-1. Roquin-1, also known as RING finger and C3H zinc finger protein 1 (RC3H1), or RING finger protein 198 (RNF198), is a ubiquitously expressed RNA-binding protein essential for degradation of inflammation-related mRNAs and maintenance of immune homeostasis. It is localized in cytoplasmic granules and binds to the 3' untranslated region (3'UTR) of inducible costimulator (ICOS) mRNA to post-transcriptionally repress its expression. Roquin-1 interacts with 3'UTR of tumor necrosis factor receptor superfamily member 4 (TNFRSF4) and tumor-necrosis factor-alpha (TNFalpha), and post-transcriptionally regulates A20 mRNA and modulates the activity of the IKK/NF-kappaB pathway. Moreover, Roquin-1 shares functions with its paralog Roquin-2 in the repression of mRNAs controlling T follicular helper cells and systemic inflammation. Roquin-1 contains an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 ubiquitin-ligase function, a highly conserved ROQ domain required for RNA binding and localization to stress granules, and a CCCH-type zinc finger that is involved in RNA recognition, typically contacting AU-rich elements. In addition, both N- and C-terminal to the ROQ domain are combined to form a HEPN (higher eukaryotes and prokaryotes nucleotide-binding) domain that is highly likely to function as a RNA-binding domain. 44
34321 319696 cd16782 mRING-HC-C3HC3D_Roquin2 Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-2. Roquin-2, also known as membrane-associated nucleic acid-binding protein (MNAB), RING finger and CCCH-type zinc finger domain-containing protein 2 (RC3H2), or RING finger protein 164 (RNF164), is an E3 ubiquitin ligase that is localized to the cytoplasm and upon stress is concentrated in stress granules. It is required for reactive oxygen species (ROS)-induced ubiquitination and degradation of apoptosis signal-regulating kinase 1 (ASK1, also known as MAP3K5). Roquin-2 interacts with 3'UTR of tumor necrosis factor receptor superfamily member 4 (TNFRSF4) and tumor-necrosis factor-alpha (TNFalpha), and modulates immune responses. Moreover, Roquin-2 shares functions with its paralog Roquin-1 in the repression of mRNAs controlling T follicular helper cells and systemic inflammation. Roquin-2 contains an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 ubiquitin-ligase function, a highly conserved ROQ domain required for RNA binding and localization to stress granules, a coiled-coil (CC1), and a CCCH-type zinc finger that is involved in RNA recognition. 44
34322 319697 cd16783 mRING-HC-C2H2C4_MDM2 Modified RING finger, HC subclass (C2H2C4-type), found in E3 ubiquitin-protein ligase MDM2 and similar proteins. MDM2, also known as double minute 2 protein (Hdm2), oncoprotein MDM2, or p53-binding protein, exerts its oncogenic activity predominantly by binding p53 tumor suppressor and blocking its transcriptional activity. It forms homo-oligomers and displays E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Moreover, in response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through its interaction with ribosomal proteins L5, L11, and L23. MDM2 can be phosphorylated in response to DNA damage. Meanwhile, MDM2 has a p53-independent role in tumorigenesis and cell growth regulation. In addition, it binds interferon (IFN) regulatory factor-2 (IRF-2), an IFN-regulated transcription factor, and mediates its ubiquitination. MDM2 contains an N-terminal p53-binding domain, and a C-terminal modified C2H2C4-type RING-HC finger conferring E3 ligase activity that is required for ubiquitination and nuclear export of p53. It is also responsible for the hetero-oligomerization of MDM2, which is crucial for the suppression of p53 activity during embryonic development, and the recruitment of E2 ubiquitin-conjugating enzymes. MDM2 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain, as well as a nuclear localization signal (NLS) and a nuclear export signal (NES), near the central acidic region. The zf-RanBP2 domain plays an important role in mediating MDM2 binding to ribosomal proteins and thus is involved in MDM2-mediated p53 suppression. 57
34323 319698 cd16784 mRING-HC-C2H2C4_MDM4 Modified RING finger, HC subclass (C2H2C4-type), found in protein MDM4 and similar proteins. MDM4, also known as double minute 4 protein (Hdm4), or MDM2-like p53-binding protein, or protein MDMX, or HDMX, or p53-binding protein MDM4, exerts its oncogenic activity predominantly by binding p53 tumor suppressor and blocking its transcriptional activity. MDM4 is phosphorylated and destabilized in response to DNA damage stress. It can also be specifically dephosphorylated through directly interacting with protein phosphatase 1 (PP1), which may increase its stability and thus inhibit p53 activity. Meanwhile, MDM4 has a p53-independent role in tumorigenesis and cell growth regulation. MDM4 contains an N-terminal p53-binding domain and a C-terminal modified C2H2C4-type RING-HC finger responsible for its hetero-oligomerization, which is crucial for the suppression of p53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. MDM4 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region. 59
34324 319699 cd16785 mRING-HC-C3HC5_NEU1A Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein 1A (NEURL1A) and similar proteins. NEURL1A, also known as NEURL1, NEU, neuralized 1, or RING finger protein 67 (RNF67), is a mammalian homolog of the Drosophila neuralized (D-neu) protein. It functions as an E3 ubiquitin-protein ligase that directly interacts with and monoubiquitinates cytoplasmic polyadenylation element-binding protein 3 (CPEB3), an RNA binding protein and a translational regulator of local protein synthesis, which facilitates hippocampal plasticity and hippocampal-dependent memory storage. It also acts as a potential tumor suppressor that causes apoptosis and downregulates Notch target genes in the medulloblastoma. NEURL1A contains two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 44
34325 319700 cd16786 mRING-HC-C3HC5_NEU1B Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein 1B (NEURL1B). NEURL1B, also known as neuralized-2 (NEUR2) or neuralized-like protein 3, is a mammalian homolog of the Drosophila neuralized (D-neu) protein. It functions as an E3 ubiquitin-protein ligase that interacts with and ubiquitinates Delta. Thus, it plays a role in the endocytic pathways for Notch signaling through working cooperatively with another E3 ligase, Mind bomb-1 (Mib1), in Delta endocytosis to hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs)-positive vesicles. NEURL1B contains two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 43
34326 319701 cd16787 mRING-HC-C3HC5_CGRF1 Modified RING finger, HC subclass (C3HC5-type), found in cell growth regulator with RING finger domain protein 1 (CGRRF1) and similar proteins. CGRRF1, also known as cell growth regulatory gene 19 protein (CGR19) or RING finger protein 197 (RNF197), functions as a novel tissue biomarker for monitoring endometrial sensitivity and response to insulin-sensitizing drugs, such as metformin, in the context of obesity. CGRRF1 contains a C-terminal modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 37
34327 319702 cd16788 mRING-HC-C3HC5_RNF26 Modified RING finger, HC subclass (C3HC5-type), found in RING finger protein 26 (RNF26) and similar proteins. RNF26 is an E3 ubiquitin ligase that temporally regulates virus-triggered type I interferon induction by increasing the stability of Mediator of IRF3 activation, MITA, also known as STING, through K11-linked polyubiquitination of MITA after viral infection and promoting degradation of IRF3, another important component required for virus-triggered interferon induction. Although RNF26 substrates of ubiquitination remain unclear at present, RNF26 upregulation in gastric cancer might be implicated in carcinogenesis through dysregulation of growth regulators. RNF26 contains an N-terminal leucine zipper domain and a C-terminal modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 48
34328 319703 cd16789 mRING-HC-C3HC5_MGRN1_like---blasttree Modified RING finger, HC subclass (C3HC5-type), found in mahogunin RING finger protein 1 (MGRN1), RING finger protein 157 (RNF157) and similar proteins. MGRN1, also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. MGRN1 interacts with cytosolic prion proteins (PrPs) that are linked with neurodegeneration. It also interacts with expanded polyglutamine proteins, and suppresses misfolded polyglutamine aggregation and cytotoxicity. Moreover, MGRN1 inhibits melanocortin receptor signaling by competition with Galphas, suggesting a novel pathway for melanocortin signaling from the cell surface to the nucleus. Furthermore, MGRN1 interacts with and ubiquitylates TSG101, a key component of the endosomal sorting complex required for transport (ESCRT)-I, and regulates endosomal trafficking. A null mutation in the gene encoding MGRN1 causes spongiform neurodegeneration, suggesting a link between dysregulation of endosomal trafficking and spongiform neurodegeneration. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase mahogunin ring finger-1 (MGRN1). In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis. Both MGRN1 and RNF157 contain a modified C3HC5-type RING-HC finger, and a functionally uncharacterized region, known as domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 41
34329 319704 cd16790 SP-RING_PIAS SP-RING finger found in protein inhibitor of activated signal transducer and activator of transcription (PIAS) proteins. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. It consists of four members: PIAS1, PIAS2 (also known as PIASx), PIAS3, and PIAS4 (also known as PIASy). PIAS proteins were initially identified as inhibitors of activated STAT only, but are now known to interact with and modulate several other proteins, including androgen receptor (AR), tumor suppressor p53, and the transforming growth factor-beta (TGF-beta) signaling protein SMAD. They interact with STATs in a cytokine-dependent manner. PIAS1, PIAS2, and PIAS3 interact with STAT1, STAT3, and STAT4, respectively. In addition, PIAS4 is associated with STAT1. PIAS proteins have SUMO E3-ligase activity and interaction of PIAS proteins with transcription factors often results in sumoylation of that protein. PIAS proteins contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, which is required for the trans-repression of STAT1 activity by PIAS2, a PINT motif, which is essential for nuclear retention of PIAS3L (the long form of PIAS3), a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, which is essential for SUMO ligase activity, and the acidic C-terminal domain, which is involved in binding of PIAS3 to the nuclear coactivator TIF2. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. 48
34330 319705 cd16791 SP-RING_ZMIZ SP-RING finger found in zinc finger MIZ domain-containing protein Zmiz1, Zmiz2, and similar proteins. This family includes Zmiz1 (Zimp10) and its homolog Zmiz2 (Zimp7), both of which were initially identified in humans as androgen receptor (AR) interacting proteins and function as transcriptional co-activators. They interact with BRG1, the catalytic subunit of the SWI-SNF remodeling complex. They also associate with other hormone nuclear receptors and transcription factors, such as p53 and Smad3/Smad4, and regulate transcription of specific target genes by altering their chromatin structure. The family also includes tonalli (Tna), an ortholog identified in Drosophila. It genetically interacts with the ATP-dependent SWI/SNF and Mediator complexes, suggesting a potential role for the Zmiz proteins in chromatin remodeling. Zmiz proteins contain a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a strong transactivation domain within the C-terminus. The SP-RING/Miz domain is highly conserved in members of the PIAS family and confers SUMO-conjugating activity. It is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. The strong intrinsic transactivation domain facilitates Zmiz proteins to augment the transcriptional activity of nuclear hormone receptors and other transcriptional factors. They may act as transcriptional co-regulators. 48
34331 319706 cd16792 SP-RING_Siz_plant SP-RING finger found in Arabidopsis thaliana E3 SUMO-protein ligase SIZ1 (AtSIZ1) and similar proteins. SIZ1-mediated conjugation of SUMO1 and SUMO2 to other intracellular proteins is essential in Arabidopsis. AtSIZ1 negatively regulates abscisic acid (ABA) signaling through the sumoylation of bZIP transcription factor ABI5. It also mediates sumoylation of bromodomain GTE proteins. Moreover, AtSIZ1 regulates flowering by controlling a salicylic acid-mediated floral promotion pathway and by affecting FLOWERING LOCUS C (FLC) chromatin structure. It also plays a role in drought stress response likely through the regulation of gene expression. Members in this family contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box, a plant homeodomain (PHD) finger, and a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. 49
34332 319707 cd16793 SP-RING_ScSiz_like SP-RING finger found in Saccharomyces cerevisiae E3 SUMO-protein ligase SIZ1, SIZ2, and similar proteins. Saccharomyces cerevisiae SIZ proteins, also known as SAP and Miz-finger domain-containing proteins, are a Siz/PIAS RING (SP-RING) family of SUMO E3 ligases, and may be involved in a novel pathway of chromosome maintenance. They enhance SUMO modification with many substrates in vivo, but also exhibit unique substrate specificity. SIZ1, also known as ubiquitin-like protein ligase 1 (Ull1), modifies both cytoplasmic and nuclear proteins. It functions as an E3 factor specific for septin components. SIZ1-dependent substrates include Cdc3 and Cdc11 (septin subunits), Prp45 (a splicing factor), and the proliferating cell nuclear antigen (PCNA). SIZ2, also known as NFI1, interacts with Smt3, SUMO/Smt3 conjugating enzyme Ubc9, and a septin component Cdc3. Members in this family contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. 48
34333 319708 cd16794 dRING_RMD5A Degenerated RING finger found in protein RMD5 homolog A (RMD5A). RMD5A is one of the vertebrate homologs of yeast Rmd5p. The biological function of RMD5A remains unclear. RMD5A contains a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. 49
34334 319709 cd16795 dRING_RMD5B Degenerated RING finger found in protein RMD5 homolog B (RMD5B). RMD5B is one of the vertebrate homologs of yeast Rmd5p. The biological function of RMD5B remains unclear. RMD5B contains a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. 47
34335 319710 cd16796 RING-H2_RNF13 RING finger, H2 subclass, found in RING finger protein 13 (RNF13) and similar proteins. RNF13 is a widely expressed membrane-associated E3 ubiquitin-protein ligase that is functionally significant in the regulation of cancer development, muscle cell growth, and neuronal development. Its expression is developmentally regulated during myogenesis and is upregulated in various tumors. RNF13 negatively regulates cell proliferation through its E3 ligase activity. It functions as an important regulator of Inositol-requiring transmembrane kinase/endonuclease IRE1alpha, mediating endoplasmic reticulum (ER) stress-induced apoptosis through the activation of the IRE1alpha-TRAF2-JNK signaling pathway. Moreover, RNF13 is involved in the regulation of the soluble N-ethylmaleimide-sensitive fusion protein attachment protein receptor (SNARE) complex via the ubiquitination of snapin, a SNAP25-interacting protein, which thereby controls synaptic function. In addition, RNF13 participates in regulating the function of satellite cells by modulating cytokine composition. RNF13 is evolutionarily conserved among many metazoans and contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. 49
34336 319711 cd16797 RING-H2_RNF167 RING finger, H2 subclass, found in RING finger protein 167 (RNF167) and similar proteins. RNF167, also known as RING105, is an endosomal/lysosomal E3 ubiquitin-protein ligase involved in alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor (AMPAR) ubiquitination. It ubiquitinates GluA2 and regulates its surface expression, and thus acts as a selective regulator of AMPAR-mediated neurotransmission. It acts as an endosomal membrane protein which ubiquitylates vesicle-associated membrane protein 3 (VAMP3) and regulates endosomal trafficking. Moreover, RNF167 plays a role in the regulation of TSSC5 (tumor-suppressing subchromosomal transferable fragment cDNA, also known as ORCTL2/IMPT1/BWR1A/SLC22A1L), which can function in concert with the ubiquitin-conjugating enzyme UbcH6. RNF167 is widely conserved in metazoans and contains an N-terminal signal peptide, a protease-associated (PA) domain, two transmembrane domains (TM1 and TM2), and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. 46
34337 319712 cd16798 RING-H2_RNF43 RING finger, H2 subclass, found in RING finger protein 43 (RNF43) and similar proteins. RNF43 is a transmembrane E3 ubiquitin-protein ligase that plays an important role in frizzled-dependent regulation of the Wnt/beta-catenin pathway. It functions as a tumor suppressor that inhibits Wnt/beta-catenin signaling by ubiquitinating Frizzled receptor and targeting it to the lysosomal pathway for degradation. miR-550a-5p directly targets the 3'-UTR of the RNF43 gene and regulates its expression. Moreover, RNF43 interacts with NEDD-4-like ubiquitin-protein ligase-1 (NEDL1) and regulates p53-mediated transcription. It may also be involved in cell growth control potentially through the interaction with HAP95, a chromatin-associated protein interfacing the nuclear envelope. Mutations of RNF43 have been identified in various tumors, including colorectal cancer (CRC), endometrial cancer, mucinous ovarian tumors, gastric adenocarcinoma, pancreatic ductal adenocarcinoma, liver fluke-associated cholangiocarcinoma, hepatocellular carcinoma, and glioma. RNF43 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region. 47
34338 319713 cd16799 RING-H2_ZNRF3 RING finger, H2 subclass, found in zinc/RING finger protein 3 (ZNRF3) and similar proteins. ZNRF3, also known as RING finger protein 203 (RNF203), is a homolog of Ring finger protein 43 (RNF43). It is a transmembrane E3 ubiquitin-protein ligase that is associated with the Wnt receptor complex, and negatively regulates Wnt signaling by promoting the turnover of frizzled and lipoprotein receptor-related protein LRP6 in an R-spondin-sensitive manner. It inhibits gastric cancer cell growth and promotes the cell apoptosis by affecting the Wnt/beta-catenin/TCF signaling pathway. ZNRF3 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region. 45
34339 319714 cd16800 RING-H2_RNF115 RING finger, H2 subclass, found in RING finger protein 115 (RNF115) and similar proteins. RNF115, also known as Rab7-interacting ring finger protein (Rabring 7), or zinc finger protein 364 (ZNF364), or breast cancer-associated gene 2 (BCA2), is an E3 ubiquitin-protein ligase that is an endogenous inhibitor of adenosine monophosphate-activated protein kinase (AMPK) activation and its inhibition increases the efficacy of metformin in breast cancer cells. It also functions as a co-factor in the restriction imposed by tetherin on HIV-1, and targets HIV-1 Gag for lysosomal degradation, impairing virus assembly and release, in a tetherin-independent manner. Moreover, RNF115 is a Rab7-binding protein that stimulates c-Myc degradation through mono-ubiquitination of MM-1. It also plays crucial roles as a Rab7 target protein in vesicle traffic to late endosome/lysosome and lysosome biogenesis. Furthermore, RNF115 and the related protein, RNF126 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. RNF115 contains an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger. 47
34340 319715 cd16801 RING-H2_RNF126 RING finger, H2 subclass, found in RING finger protein 126 (RNF126) and similar proteins. RNF126 is a Bag6-dependent E3 ubiquitin ligase that is involved in the mislocalized protein (MLP) pathway of quality control. It regulates the retrograde sorting of the cation-independent mannose 6-phosphate receptor (CI-MPR). Moreover, RNF126 promotes cancer cell proliferation by targeting the tumor suppressor p21 for ubiquitin-mediated degradation, and could be a novel therapeutic target in breast and prostate cancers. It is also able to ubiquitylate cytidine deaminase (AID), a poorly soluble protein that is essential for antibody diversification. In addition, RNF126 and the related protein, RNF115 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. RNF126 contains an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger. 44
34341 319716 cd16802 RING-H2_RNF128_like RING finger, H2 subclass, found in RING finger protein 128 (RNF128) and similar proteins. This subfamily includes RING finger proteins RNF128, RNF133, RNF148, and similar proteins, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF128, also known as gene related to anergy in lymphocytes protein (GRAIL), is a type 1 transmembrane E3 ubiquitin-protein ligase that is a critical regulator of adaptive immunity and development. It inhibits cytokine gene transcription, is expressed in anergic CD4+ T cells, and has been implicated in primary T cell activation, survival, and differentiation, as well as in T cell anergy and oral tolerance. It induces T cell anergy through the ubiquitination activity of its cytosolic RING finger. It regulates expression of the costimulatory molecule CD40L on CD4 T cells, and ubiquitinates the costimulatory molecule CD40 ligand (CD40L) during the induction of T cell anergy. Moreover, RNF128 interacts with the luminal/extracellular portion of both CD151 and the related tetraspanin CD81 via its PA domain, which promotes ubiquitination of cytosolic lysine residues. It also down-modulates the expression of CD83 (previously described as a cell surface marker for mature dendritic cells) on CD4 T cells. Furthermore, Rho guanine dissociation inhibitor (RhoGDI) has been identified as a potential substrate of RNF128, suggesting a role for Rho effector molecules in T cell anergy. In addition, RNF128 plays a role in environmental stress responses. It promotes environmental salinity tolerance in euryhaline tilapia. RNF133 is a testis-specific endoplasmic reticulum-associated E3 ubiquitin ligase that is mainly present in the cytoplasm of elongated spermatids. It may play a role in sperm maturation through an ER-associated degradation (ERAD) pathway. RNF148 is a testis-specific E3 ubiquitin ligase that is abundantly expressed in testes and slightly expressed in pancreas. Its expression is regulated by histone deacetylases. 49
34342 319717 cd16803 RING-H2_RNF130 RING finger, H2 subclass, found in RING finger protein 130 (RNF130) and similar proteins. RNF130, also known as Goliath homolog (H-Goliath), is a paralog of RNF128, also known as gene related to anergy in lymphocytes protein (GRAIL). It is a transmembrane E3 ubiquitin-protein ligase expressed in leukocytes. It has a self-ubiquitination property, and controls the development of T cell clonal anergy by ubiquitination. RNF130 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. 49
34343 319718 cd16804 RING-H2_RNF149 RING finger, H2 subclass, found in RING finger protein 149 (RNF149) and similar proteins. RNF149, also known as DNA polymerase-transactivated protein 2, is an E3 ubiquitin-protein ligase that interacts with wild-type v-Raf murine sarcoma viral oncogene homolog B1 (BRAF), a RING domain-containing E3 ubiquitin ligase involved in control of gene transcription, translation, cytoskeletal organization, cell adhesion, and epithelial development. RNF149 induces the ubiquitination of wild-type BRAF and promotes its proteasome-dependent degradation. Mutated RNF149 has been found in some human breast, ovarian, and colorectal cancers. RNF149 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. 48
34344 319719 cd16805 RING-H2_RNF150 RING finger, H2 subclass, found in RING finger protein 150 (RNF150) and similar proteins. RNF150 is a RING finger protein whose polymorphisms may be associated with chronic obstructive pulmonary disease (COPD) risk in the Chinese population. Further studies with larger numbers of participants worldwide are needed for validation of the relationships between RNF150 genetic variants and the pathogenesis of COPD. RNF150 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. 49
34345 319720 cd16806 RING_CH-C4HC3_MARCH1 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH1 (MARCH1). MARCH1, also known as membrane-associated RING finger protein 1, membrane-associated RING-CH protein I (MARCH-I), or RING finger protein 171 (RNF171), is a membrane-anchored E3 ubiquitin ligase that is mainly expressed in cells of the immune system. It regulates antigen presentation and T-cell costimulatory functions of dendritic cells by down-regulating the cell surface expression of major histocompatibility complex class II (MHCII) and CD86 molecules. It mediates ubiquitination of MHCII and CD86 in dendritic cells (DCs). This ubiquitination induces MHCII and CD86 endocytosis, lysosomal transport, and degradation. MARCH1 also plays a regulatory role in T cell activation during immune responses, as well as in splenic DC homeostasis. Moreover, MARCH1 may regulate its own expression through dimerization and autoubiquitination. MARCH1 contains an N-terminal cytoplasmic C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger and two transmembrane domains. 53
34346 319721 cd16807 RING_CH-C4HC3_MARCH8 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH8 (MARCH8). MARCH8, also known as membrane-associated RING finger protein 8, membrane-associated RING-CH protein VIII (MARCH-VIII), RING finger protein 178 (RNF178), or cellular modulator of immune recognition (c-MIR), is a membrane-anchored E3 ubiquitin ligase that is broadly expressed. It is a functional homolog of the Kaposi's sarcoma-associated herpesvirus-encoded proteins modulator of immune recognition (MIR) 1 and 2, which are involved in the evasion of host immunity. MARCH8 mediates the ubiquitination and down-regulation of immune regulatory cell surface molecules, including major histocompatibility complex class II (MHCII), CD86, transferrin receptor, HLA-DM, and Fas in immune cells. Moreover, MARCH8 controls cell surface expression of some additional proteins. It regulates the ubiquitination and lysosomal degradation of the transferrin receptor (TfR). Tumor necrosis factor-related apoptosis inducing ligand receptor 1 (TRAIL-R1) is also a physiological substrate of the endogenous MARCH8, which regulates the steady-state cell surface expression of TRAIL-R1. Meanwhile, it negatively regulates interleukin-1 (IL-1) beta-induced NF-kappaB activation by targeting the IL-1 receptor accessory protein (IL1RAP) coreceptor for ubiquitination and degradation. Furthermore, MARCH8 functions in the embryo to modulate the strength of cell adhesion by regulating the localization of E-cadherin. In addition, MARCH8 plays a role in the inhibition of inflammatory cytokine production, suggesting a new therapeutic approach to the treatment of rheumatoid arthritis (RA). MARCH8 contains an N-terminal cytoplasmic C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger and two transmembrane domains. 55
34347 319722 cd16808 RING_CH-C4HC3_MARCH2 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH2 (MARCH2). MARCH2, also known as membrane-associated RING finger protein 2, membrane-associated RING-CH protein II (MARCH-II), or RING finger protein 172 (RNF172), is a Golgi-localized, membrane-associated E3 ubiquitin-protein ligase that is involved in endosomal trafficking through the binding of syntaxin 6 (STX6). It is involved in the cystic fibrosis transmembrane conductance regulator (CFTR)-associated ligand (CAL)-mediated ubiquitination and lysosomal degradation of mature CFTR through the association with adaptor proteins CAL and STX6. It also reduces the surface expression of CD86 and the transferrin receptor TFRC and regulates cell surface carvedilol-bound beta2-adrenergic receptor (beta2ARs) expression. Moreover, MARCH2 interacts with and ubiquitinates PDZ domains polarity determining scaffold protein DLG1 through its PDZ-binding motif, suggesting it may function as a molecular bridge with ubiquitin ligase activity connecting endocytic tumor suppressor proteins such as syntaxins to DLG1. MARCH2 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus. 52
34348 319723 cd16809 RING_CH-C4HC3_MARCH3 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH3 (MARCH3). MARCH3, also known as membrane-associated RING finger protein 3, or membrane-associated RING-CH protein III (MARCH-III), or RING finger protein 173 (RNF173), is an E3 ubiquitin-protein ligase that is broadly expressed at relatively high levels in spleen, colon, and lung. It is localized to early endosomes, binds to MARCH2 and syntaxin 6, and is involved in the regulation of vesicular trafficking and fusion of the transport vesicles in endosomes. MARCH3 is the closest homolog of MARCH2 and it is also a functional homolog of K3 and K5 viral ubiquitin E3 ligases related to immune-evasion strategies used by Kaposi's sarcoma-associated herpesvirus (KSHV). Its E2 specificity significantly overlaps that of MARCH2. MARCH3 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus. The RING-CH finger and PDZ-binding motif are essential for the subcellular localization of MARCH3 and the inhibitory effect on transferrin uptake. 51
34349 319724 cd16810 RING_CH-C4HC3_MARCH11 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH11 (MARCH11). MARCH11, also known as membrane-associated RING finger protein 11, or membrane-associated RING-CH protein XI (MARCH-XI), is a transmembrane RING-finger ubiquitin ligase that is predominantly expressed in developing spermatids in a stage-specific manner and is localized to the trans-Golgi network (TGN) vesicles and multivesicular bodies (MVBs). It mediates selective protein sorting via the TGN-MVB transport pathway through its ubiquitin ligase activity. SAMT family proteins have been identified as substrates of MARCH11 in mouse spermatids, suggesting that MARCH11 plays a role in mammalian spermiogenesis. Moreover, MARCH11 functions as an E3 ubiquitin ligase that targets CD4 for ubiquitination. It also forms complexes with the adaptor protein complex-1 and with fucose-containing glycoproteins including ubiquitinated forms. MARCH11 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and two transmembrane domains. In addition, it harbors a proline-rich region, a tyrosine-based motif, and a PDZ binding motif. 52
34350 319725 cd16811 RING_CH-C4HC3_MARCH4_9 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH4, MARCH9, and similar proteins. This subfamily includes the closely related MARCH4 and MARCH9, both belonging to the family of MARCH E3 ligases. They downregulate major histocompatibility complex-I (MHC-I). In the presence of MARCH4 or MARCH9, MHC-I can be ubiquitinated and rapidly internalized by endocytosis, whereas MHC-I molecules lacking lysines in their cytoplasmic tail are resistant to downregulation. Moreover, MARCH4 and MARCH9, but not other MARCH proteins, can associate with Mult1 and prevent Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. Both MARCH4 and MARCH9 contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions. 51
34351 319726 cd16812 RING_CH-C4HC3_MARCH7 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH7 (MARCH7). MARCH7, also known as membrane-associated RING finger protein 7, membrane-associated RING-CH protein VII (MARCH-VII), RING finger protein 177 (RNF177), or axotrophin, is a ubiquitin E3 ligase expressed in multiple types of cells and tissues, including stem cells and precursor cells, and is predominantly localized on the plasma membrane and cytoplasm. MARCH7 is involved in T cell proliferation and neuronal development. It also participates in the regulation of cytoskeleton re-organization, cellular migration and invasion, cell proliferation, and tumorigenesis in ovarian carcinoma cells. Moreover, MARCH7 modulates nuclear factor kappaB (NF-kappaB) and Wnt/beta-catenin pathways. It has been identified as an authentic target of miR-101. Furthermore, it ubiquitinates tau protein in vitro, impairing microtubule binding. Unlike other MARCH proteins, MARCH7 is predicted to have no transmembrane spanning region. It harbors a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for its E3 activity. 65
34352 319727 cd16813 RING_CH-C4HC3_MARCH10 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH10 (MARCH10). MARCH10, also known as membrane-associated RING finger protein 10, membrane-associated RING-CH protein X (MARCH-X), or RING finger protein 190 (RNF190), is a microtubule-associated E3 ubiquitin ligase of developing spermatids. It is localized to the principal piece of elongating spermatids. MARCH10 is involved in spermiogenesis by regulating the formation and maintenance of the flagella in developing spermatids. Unlike other MARCH proteins, MARCH10 is predicted to have no transmembrane spanning region. It harbors a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for its E3 activity. 64
34353 319728 cd16814 RING-HC_RNF20 RING finger, HC subclass, found in RING finger protein 20 (RNF20). RNF20, also known as BRE1A or BRE1, is an E3 ubiquitin-protein ligase that forms a heterodimeric complex together with BRE1B, also known as RNF40, to facilitate the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. It regulates the cell cycle and differentiation of neural precursor cells (NPCs), and links histone H2B ubiquitylation with inflammation and inflammation-associated cancer. Moreover, RNF20 promotes the polyubiquitination and proteasome-dependent degradation of transcription factor activator protein 2alpha (AP-2alpha), a negative regulator of adipogenesis by repressing the transcription of CCAAT/enhancer binding protein (C/EBPalpha) gene. Furthermore, RNF20 functions as an additional chromatin regulator that is necessary for mixed-lineage leukemia (MLL)-fusion-mediated leukemogenesis. It also inhibits TFIIS-facilitated transcriptional elongation to suppress pro-oncogenic gene expression. TFIIS is a factor capable of relieving stalled RNA polymerase II. RNF20 contains a C3HC4-type RING-HC finger at its C-terminus. 46
34354 319729 cd16815 RING-HC_RNF40 RING finger, HC subclass, found in RING finger protein 40 (RNF40). RNF40, also known as BRE1B or 95 kDa retinoblastoma-associated protein (RBP95), was identified as a novel leucine zipper retinoblastoma protein (pRb)-associated protein that may function as a regulation factor in the process of RNA polymerase II-mediated transcription and/or transcriptional processing. RNF40 also functions as an E3 ubiquitin-protein ligase that forms a heterodimeric complex together with BRE1A, also known as RNF20, to facilitate the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. It cooperates with SUPT16H to induce dynamic changes in chromatin structure during DSB repair. RNF40 contains a C3HC4-type RING-HC finger at the C-terminus. 55
34355 319730 cd16816 mRING-HC-C3HC5_MGRN1 Modified RING finger, HC subclass (C3HC5-type), found in mahogunin RING finger protein 1 (MGRN1) and similar proteins. MGRN1, also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. MGRN1 interacts with cytosolic prion proteins (PrPs) that are linked with neurodegeneration. It also interacts with expanded polyglutamine proteins, and suppresses misfolded polyglutamine aggregation and cytotoxicity. Moreover, MGRN1 inhibits melanocortin receptor signaling by competition with Galphas, suggesting a novel pathway for melanocortin signaling from the cell surface to the nucleus. Furthermore, MGRN1 interacts with and ubiquitylates TSG101, a key component of the endosomal sorting complex required for transport (ESCRT)-I, and regulates endosomal trafficking. A null mutation in the gene encoding MGRN1 causes spongiform neurodegeneration, suggesting a link between dysregulation of endosomal trafficking and spongiform neurodegeneration. MGRN1 contains a modified C3HC5-type RING-HC finger and a conserved PSAP motif necessary for interaction between MGRN1 and TSG101. In addition, MGRN1 harbors a functionally uncharacterized region, also known as the domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from the typical C3HC4 RING-HC finger by the existence of an additional cysteine residue in the middle portion of the RING finger domain. 41
34356 319731 cd16817 mRING-HC-C3HC5_RNF157 Modified RING finger, HC subclass (C3HC5-type), found in RING finger protein 157 (RNF157) and similar proteins. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase mahogunin ring finger-1 (MGRN1). In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis. RNF157 contains a modified C3HC5-type RING-HC finger, and a functionally uncharacterized region, known as domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. 41
34357 319732 cd16818 SP-RING_PIAS1 SP-RING finger found in protein inhibitor of activated STAT protein 1 (PIAS1) and similar proteins. PIAS1, also known as DEAD/H box-binding protein 1, Gu-binding protein (GBP), or RNA helicase II-binding protein, was initially identified as an inhibitor of STAT1 that blocks the DNA-binding activity of STAT1 and specifically inhibits STAT1-mediated gene transcription in response to cytokine stimulation. It selectively inhibits interferon-inducible gene expression and plays an important role in the IFN-gamma- or IFN-beta-mediated innate immune response through negative regulation of STAT1. It also regulates the activity of other transcription factors to regulate immune response, such as NF-kappaB and Smad4. Moreover, PIAS1 functions as an E3 small ubiquitin-like modifier (SUMO)-protein ligase specifying target proteins for SUMO conjugation by Ubc9. The sumoylation activity of PIAS1 can suppress cytokine transforming growth factor beta (TGFbeta)-induced epithelial mesenchymal transition (EMT) in non-transformed epithelial cells to promote activation of the matrix metalloproteinase 2 (MMP2). It thus regulates TGFbeta-induced cancer cell invasion and metastasis. PIAS1 may also be involved in spatial learning and memory formation through its SUMOylation of cAMP-responsive element binding protein (CREB). In addition, PIAS1 is the E3 ligase responsible for SUMOylation of High mobility group nucleosomal binding domain 2 (HMGN2), which is a small and unique non-histone protein that has many functions in a variety of cellular processes, including regulation of chromatin structure, transcription, and DNA repair, as well as antimicrobial activity, cell homing, and regulating cytokine release. Furthermore, PIAS1 is a genuine chromatin-bound androgen receptor (AR) co-regulator that functions in a target gene selective fashion to regulate prostate cancer cell growth. 
It also mediates the SUMOylation of c-Myc, which is the most frequently overexpressed oncogene in tumours, including breast cancer, colon cancer, and lung cancer. Necdin, a pleiotropic protein that promotes differentiation and survival of mammalian neurons, can suppress PIAS1 both by inhibiting SUMO E3 ligase activity and by promoting ubiquitin-dependent degradation. PIAS1 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. The SP-RING finger mediates the interaction of PIAS1 with the SUMO E2 conjugating enzyme Ubc9. 51
34358 319733 cd16819 SP-RING_PIAS2 SP-RING finger found in protein inhibitor of activated STAT protein 2 (PIAS2) and similar proteins. PIAS2, also known as androgen receptor-interacting protein 3 (ARIP3), DAB2-interacting protein (DIP), Msx-interacting zinc finger protein (Miz1), PIAS-NY protein, protein inhibitor of activated STAT x, protein inhibitor of activated STAT2, is an E3 SUMO-protein ligase highly expressed in the testis. It functions as a transcriptional activator of BCL2 and is essential for blocking c-MYC-induced apoptosis. It also acts as a negative regulator of cell proliferation, induces expression of the cell-cycle inhibitors p15(Ink4b) and p21(Cip1), and activates transcription of the p21(Cip1) gene in response to UV irradiation. Moreover, PIAS2 associates with topoisomerase II binding protein 1 (TopBP1), an essential activator of the Atr kinase. It thus affects the activity of the Atr checkpoint. Receptor of activated C kinase 1 (RACK1), glucocorticoid receptor (GR)-interacting protein 1 (GRIP1), friend leukemia integration-I (FLI-1), and ubiquitously expressed transcript (UXT) are binding partners of PIAS2. The interaction between UXT and PIAS2 may be important for the transcriptional activation of androgen receptor (AR). PIAS2 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus, and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. 49
34359 319734 cd16820 SP-RING_PIAS3 SP-RING finger found in protein inhibitor of activated STAT protein 3 (PIAS3) and similar proteins. PIAS3 is an E3 SUMO-protein ligase that was initially identified as an interleukin-6 (IL-6)-dependent repressor of signal transducer and activator of transcription 3 (STAT3) and has anti-proliferative properties. It binds specifically to phosphorylated STAT3 and inhibits its transcriptional activity by blocking its binding to DNA. It regulates STAT3-mediated induction of Snail expression, as well as suppresses acute graft-versus-host disease (GVHD) by modulating effector T and B cell subsets through inhibition of STAT3 activation. It activates the intrinsic apoptotic pathway in non-small cell lung cancer cells independent of p53 status. When overexpressed, it can interact with STAT5 to regulate prolactin-induced STAT5-mediated gene expression. Moreover, PIAS3 binds to and activates Smad3 transcriptional activity, resulting in the enhancement of transforming growth factor-beta (TGF-beta) signaling. It functions as a transcriptional corepressor of Erythroid Kruppel-like factor (EKLF or KLF1) and thus plays an important role in erythropoiesis. It also plays a significant role in DNA damage response (DDR) pathway through promoting homologous recombination (HR)- and non-homologous end joining (NHEJ)-mediated DNA double-strand break (DSB) repair. Furthermore, PIAS3 preferentially interacts with and enhances the SUMOylation of TAK1-binding protein 2 (TAB2), an upstream adaptor protein in the IL-1 signaling pathway. It also promotes SUMOylation and nuclear sequestration of ErbB4 receptor tyrosine kinase. In addition, PIAS3 may form a complex with microphthalmia-associated transcription factor, nuclear factor-kappaB, Smad, and estrogen receptor. Its other transcription factor binding partners include: ETS, EGR1, NR1I2, and GATA1. 
PIAS3 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. 51
34360 319735 cd16821 SP-RING_PIAS4 SP-RING finger found in protein inhibitor of activated STAT protein 4 (PIAS4) and similar proteins. PIAS4, also known as PIASy or protein inhibitor of activated STAT protein gamma (PIAS-gamma), is an E3 SUMO-protein ligase that interacts with the androgen receptor (AR) and is involved in ubiquitin signaling pathways. It is associated with macro/microcephaly in the novel interstitial 19p13.3 microdeletion/microduplication syndrome. It also regulates the hypoxia signalling pathway through interacting with the tumor suppressor von Hippel-Lindau (VHL) and leads to VHL sumoylation, oligomerization, and impaired function during growth of pancreatic cancer cells. Moreover, PIAS4 acts as a direct binding partner for vitamin D receptor (VDR) and facilitates its modification with SUMO2. The process of SUMOylation modulates VDR-mediated signaling. As components of the DNA-damage response (DDR), PIAS4 together with PIAS1 promote responses to DNA double-strand breaks (DSBs). They are required for effective ubiquitin-adduct formation mediated by RNF8, RNF168, and BRCA1 at sites of DNA damage. PIAS4 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. 49
34361 319736 cd16822 SP-RING_ZMIZ1 SP-RING finger found in zinc finger MIZ domain-containing protein 1 (Zmiz1) and similar proteins. Zmiz1, also known as PIAS-like protein Zimp10 (zinc finger-containing, Miz1, PIAS-like protein on chromosome 10) or retinoic acid-induced protein 17, is a novel PIAS-like protein that was initially identified as an androgen receptor (AR) interacting protein and functions as a transcriptional co-activator. It co-localizes with AR and small ubiquitin-like modifier SUMO-1, forms a protein complex at replication foci in the nucleus, and augments AR-mediated transcription. It also functions as a transcriptional co-activator of the p53 tumor suppressor that plays a critical role in the cell cycle progression, DNA repair, and apoptosis. Moreover, Zmiz1 is associated with multiple autoimmune diseases and has been implicated in the development, function, and survival of melanocytes. Zmiz1 also interacts with Smad3/4 proteins and augments Smad-mediated transcription, suggesting it is important in the regulation of the transforming growth factor beta (TGF-beta)/Smad signaling pathway and may have an inhibitory effect on the immune system. Furthermore, Zmiz1 is overexpressed in a significant percentage of human cutaneous squamous cell carcinoma (SCC), breast, ovarian, and colon cancers, suggesting it may play a broader role in epithelial cancers. It functionally interacts with NOTCH1 to promote C-MYC transcription and activity, and thus is involved in a variety of C-MYC-driven cancers. Zmiz1 contains a PAT domain, a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a putative nuclear localization sequence (NLS), as well as a strong intrinsic transactivation domain within the C-terminus. 49
34362 319737 cd16823 SP-RING_ZMIZ2 SP-RING finger found in zinc finger MIZ domain-containing protein 2 (Zmiz2) and similar proteins. Zmiz2, also known as PIAS-like protein Zimp7 (zinc finger-containing, Miz1, PIAS-like protein on chromosome 7), is a novel PIAS-like protein that was initially identified as an androgen receptor (AR) interacting protein and functions as a transcriptional co-activator. It interacts with beta-catenin and enhances Wnt/beta-catenin-mediated transcription. It also associates with BRG1 and BAF57, components of the ATP-dependent mammalian SWI/SNF-like BAF chromatin-remodeling complexes, and thus plays a potential role in modulation of AR and/or other nuclear receptor-mediated transcription. For instance, it can increase the effects of BRG1 on androgen receptor-mediated transcriptional activity. Moreover, Zmiz2 physically interacts with PIAS proteins, especially PIAS3. Through the interaction, PIAS3 augments Zmiz2-mediated transcription, suggesting PIAS proteins may play a regulatory role in Zmiz-mediated transcription. Furthermore, Zmiz2 is involved in transcriptional regulation of factors essential for patterning in the dorsoventral axis. It is required for the restriction of the zebrafish organizer and mesoderm development. Zmiz2 contains a PAT domain, a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a strong intrinsic transactivation domain within the C-terminus. 49
34363 319738 cd16824 RING_CH-C4HC3_MARCH4 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH4 (MARCH4). MARCH4, also known as membrane-associated RING finger protein 4, membrane-associated RING-CH protein IV (MARCH-IV), or RING finger protein 174 (RNF174), is a transmembrane E3 ubiquitin-protein ligase that down-regulates the tetraspanin CD81 and major histocompatibility complex-I (MHC). It also associates with Mult1, suppressing Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. MARCH4 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions. 51
34364 319739 cd16825 RING_CH-C4HC3_MARCH9 RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH9 (MARCH9). MARCH9, also known as membrane-associated RING finger protein 9, membrane-associated RING-CH protein IX (MARCH-IX), or RING finger protein 179 (RNF179), is a transmembrane E3 ubiquitin-protein ligase that down-regulates Mult1, CD4, major histocompatibility complex-I (MHC), and intercellular adhesion molecule (ICAM-1). It may also interact with receptor-type protein-tyrosine phosphatases (e.g. PTPRJ/CD148) as well as Fc gamma receptor IIB (CD32B), HLA-DQ, signaling lymphocytic activation molecule (CD150), and polio virus receptor (CD155). MARCH9 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions. 51
34365 319356 cd16827 ChuX-like heme utilization protein ChuX and similar proteins. This family contains ChuX, a member of the conserved heme utilization operon from pathogenic E. coli, and similar proteins, which include ChuS, HutX, HuvX, HugX, and ShuX in proteobacteria, among others. It forms a dimer which displays a very similar fold and organization to the monomeric structure of other heme utilization proteins such as HemS, ChuS, HmuS, PhuS; these latter occurring as duplicated domains. They all bind heme via a key conserved histidine. The genes encoded within these heme utilization operons enable the effective uptake and utilization of heme as an iron source in pathogenic microorganisms to enable multiplication and survival within hosts they invade. 141
34366 319357 cd16828 HemS-like N- and C-terminal domains of heme degrading enzyme HemS, and similar proteins. This family contains the N- and C-terminal domains of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. PhuS in Pseudomonas aeruginosa has been reported as a heme chaperone and as a heme degrading enzyme, and is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX. 152
34367 319358 cd16829 ChuX_HutX-like heme iron utilization protein ChuX and similar proteins. This family contains proteins similar to ChuX, a member of the conserved heme utilization operon from pathogenic E. coli, and includes ChuS, HutX, HuvX, HugX, and ShuX in proteobacteria, among others. It forms a dimer which displays a very similar fold and organization to the monomeric structure of other heme utilization proteins such as HemS, ChuS, HmuS, PhuS; these latter occurring as duplicated domains. They all bind heme via a key conserved histidine. The genes encoded within these heme utilization operons enable the effective uptake and utilization of heme as an iron source in pathogenic microorganisms to enable multiplication and survival within hosts they invade. ChuX, a member of the conserved heme utilization operon from pathogenic E. coli O157:H7, forms a dimer with a very similar fold to the monomer structure of two other heme utilization proteins, ChuS and HemS, despite low sequence homology. ChuX has been shown to bind heme in a 1:1 manner, implying that the ChuX homodimer could coordinate two heme molecules in contrast to only one heme molecule bound in ChuS and HemS. Similarly, cytoplasmic heme-binding protein HutX in Vibrio cholerae, an intracellular heme transport protein for the heme-degrading enzyme HutZ, forms a dimer, each domain binding heme that is transferred from HutX to HutZ via a specific protein-protein interaction. This family also includes AGR_C_4470p from Agrobacterium tumefaciens, which was found to be a dimer, with each subunit having strong structural homology and organization to the heme utilization protein ChuS from Escherichia coli and HemS from Yersinia enterocolitica. However, the heme binding site is not conserved in AGR_C_4470p, suggesting a possible different function. 143
34368 319359 cd16830 HemS-like_N N-terminal domain of heme degrading enzyme HemS, and similar proteins. This family contains the N-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX. 152
34369 319360 cd16831 HemS-like_C C-terminal domain of heme degrading enzyme HemS, and similar proteins. This family contains the C-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX. 155
34370 319353 cd16832 CNF1_CheD_YfiH-like cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and YfiH (DUF152) are distant homologs. This family contains distant homologs that include cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and a protein of unknown function YfiH. CNF-1 along with dermonecrotic toxin (DNT) from Bordetella species, and Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549) are Rho-activating toxins. The bacterial chemotaxis protein CheD stimulates methylation of methyl-accepting chemotaxis proteins (MCPs). YfiH, a domain of unknown function also included in this family, reveals a structure with distant homology to CNF1 and CheD, all having an invariant Cys-His pair forming a catalytic dyad that is required by the CNF-1 toxins for deamidation activity. 145
34371 319354 cd16833 YfiH protein of unknown function YfiH. This subfamily contains YfiH, a protein of unknown function from Shigella flexneri, E. coli, and many similar proteins which collectively are often called DUF152. The structure of YfiH reveals a distant homology to Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) as well as chemotaxis protein CheD that stimulates methylation of methyl-accepting chemotaxis proteins (MCPs), all having an invariant Cys-His pair forming a catalytic dyad, and is required by the CNF-1 toxins for deamidation activity. 185
34372 319355 cd16834 CNF1-like cytotoxic necrotizing factor 1 (CNF1) and similar proteins. This subfamily contains Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) and dermonecrotic toxin (DNT) from Bordetella species, as well as Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549), and similar proteins. CNF1 causes alteration of the host cell actin cytoskeleton and promotes bacterial invasion of blood-brain barrier endothelial cells. E. coli CNF1 constitutively activates host small G proteins such as RhoA and Cdc42 by deamidating a glutamine residue essential for GTP hydrolysis. DNT stimulates the assembly of actin stress fibers and focal adhesions by deamidation/polyamination of a specific glutamine of the small GTPase Rho. CNF1 and DNT are A-B toxins composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain; their homology is restricted to the catalytic domains at the C termini of the toxins, suggesting that they share a similar molecular mechanism. BLF1, a toxin that inhibits helicase activity of translation factor eIF4A, is similar to the catalytic domain of Escherichia coli CNF1 (CNF1-C); although CNF1-C and BLF1 show little sequence identity, the active sites have the conserved LSGC (Leu, Ser, Gly, Cys) motif. 168
34373 319351 cd16837 BldD_C_like C-terminal domain of BldD and similar transcription factors. The Streptomyces transcription factor BldD dimerizes via an unusual mechanism that involves a tetrameric c-di-GMP assembly. BldD is involved in controlling multicellular differentiation in sporulating actinomycetes. 73
34374 319350 cd16839 PCSK9_C-CRD proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain (CRD). PCSK9 post-translationally regulates hepatic low-density lipoprotein receptors (LDLRs) by binding to LDLRs on the cell surface, leading to their degradation. Other known PCSK9 targets include very-low-density lipoprotein receptor (VLDLR), apoE receptor2, lipoprotein receptor-related protein 1, etc. This PCSK9 C-terminal CRD may play an analogous role to the P (processing) domains of Furin and Kex2 (i.e. be required for the correct functioning/folding of the protein). Structural similarity has been noted between PCSK9 C-terminal CRD and the resistin homotrimer. This alignment model represents a three-fold repeat. 225
34375 411037 cd16840 toxin_MLD toxin effector region membrane localization domain. This MLD domain functions as a membrane-targeting domain for toxin effectors such as the Rho-inactivation domain of Vibrio MARTX, Pasteurella mitogenic toxin (PMT), where it has been termed PMT C1 domain, and clostridial glycosylating cytotoxins including Clostridium difficile toxins A (TcdA) and B (TcdB), Clostridium novyi alpha-toxin (TcnA), and Clostridium sordellii lethal toxin (TcsL). During infection, the C. difficile homologous exotoxins, TcdA and TcdB, target and disrupt the colonic epithelium, leading to diarrhea and colitis. They disrupt host cell function through a multistep process involving receptor binding, endocytosis, low pH-induced pore formation, and the translocation and delivery of a C-terminal glucosyltransferase domain (GTD) that inactivates host GTPases. Their N-terminal MLD domains confer membrane localization of adjacent effector domains via the 4-helix-bundle motif. 78
34376 319245 cd16841 RraA_family ribonuclease activity regulator RraA family. RraA protein family is named after the regulator of ribonuclease activity A (RraA), a protein that binds to RNase E and inhibits RNase E endonucleolytic cleavages. Members also include proteins with other functions, like a 4-hydroxy-4-methyl-2-oxoglutarate/4-carboxy-4-hydroxy-2-oxoadipate (HMG/CHA) aldolase from Pseudomonas putida, which catalyzes the last step of the bacterial protocatechuate 4,5-cleavage pathway and the uncharacterized YER010Cp protein from yeast, an organism lacking RNAse E. 150
34377 409517 cd16842 Ig_SLAM-like_N N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family and similar proteins. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. Members of this group include proteins such as CD84, SLAM (CD150), Ly-9 (CD229), NTB-A (ly-108, SLAM6), 19A (CRACC), and SLAMF9. The genes coding for the SLAM family are nested on chromosome 1, in humans at 1q23, and in mice at 1H2. The SLAM family is a subset of the CD2 family, which also includes CD2 and CD58 located on chromosome 1 at 1p13 in humans. In mice, CD2 is located on chromosome 3, and there is no CD58 homolog. The SLAM family proteins are organized as an extracellular domain with either two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region having Tyr-based motifs. The extracellular domain is organized as a membrane-distal Ig variable (IgV) domain that is responsible for ligand recognition and a membrane-proximal truncated Ig constant-2 (IgC2) domain. 102
34378 409518 cd16843 IgC2_D1_D2_LILR_KIR_like Immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors, Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the first and second immunoglobulin (Ig)-like domains found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains, for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. 
GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules. 90
34379 319272 cd16844 ParB_N_like_MT ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains and DUF4417. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci. 54
34380 341083 cd16845 STAT1_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 1 (STAT1). This family consists of the DNA-binding domain (DBD) of the STAT1 proteins (Signal Transducer and Activator of Transcription 1, or Signal Transduction And Transcription 1). The DNA binding domain has an Ig-like fold. STAT1 plays an essential role in mediating responses to all types of interferons (IFN), transducing signals from cytoplasmic domains of transmembrane receptors into the nucleus where it regulates gene expression. Thus STAT1 is involved in modulating diverse cellular processes, such as antimicrobial activities, cell proliferation and cell death. STAT1 function is crucial in the innate and adaptive arm of immunity and protects from pathogen infections; phosphorylation of a critical tyrosine by Janus kinases (JAKs) leads to its activation and nuclear translocation, while phosphorylation of a critical serine is required for full transcriptional activation upon IFN stimulation and in response to cellular stress. Transcription of protein-encoding genes (including Stat1 itself) as well as expression of microRNAs (miRNAs) is regulated by activated STAT1. Animal studies have shown that STAT1 is generally considered a tumor suppressor but it can also act as a tumor promoter; its functions are not restricted to tumor cells, but extend to parts of the tumor microenvironment such as immune cells, endothelial cells. STAT1 abundance is a reliable marker for good prognosis in selected tumor types, but it can also correlate with disease progression. In head and neck cancer (HNC) patients, upregulation of STAT1-induced HLA class I enhances immunogenicity and clinical response to anti-EGFR mAb cetuximab therapy. In systemic juvenile idiopathic arthritis (sJIA) characterized by systemic inflammation and arthritis, STAT1 phosphorylation downstream of IFNs is impaired. It exerts anti-oncogenic activities through interferon-gamma and interferon-alpha. 
STAT1 may inhibit hepatocellular carcinoma cell growth by regulating p53-related cell cycling and apoptosis. Studies also show a significant correlation of high STAT1 activity with longer colorectal cancer patient overall survival. Recent studies have shown that STAT1 suppresses mouse mammary gland tumorigenesis by immune regulatory as well as tumor cell-specific functions of STAT1. 161
34381 341084 cd16846 STAT2_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 2 (STAT2). This family consists of the DNA-binding domain (DBD) of the STAT2 proteins (Signal Transducer and Activator of Transcription 2, or Signal Transduction And Transcription 2). The DNA binding domain has an Ig-like fold. STAT2 activation is driven predominantly by only two classes of cell surface receptors: Type I and III interferon receptors, making it a unique STAT family of transcription factors. Thus, STAT2 plays a critical role in host defenses against viral infections since type I interferon (IFN-I) response inhibits viral replication, and sets the stage for the development of adaptive immunity; viruses target STAT2 by either inhibiting its expression, blocking its activity, or by targeting it for degradation, thus triggering remarkable divergence in the STAT2 gene across species compared to other STAT family members. STAT2 function is regulated by tyrosine phosphorylation which enables STAT dimerization, and subsequent nuclear translocation and transcriptional activation of IFN stimulated genes. Dengue virus (DENV)-mediated degradation of STAT2 has emerged as an important determinant of DENV pathogenesis and host tropism. This vector-borne flavivirus suppresses IFN1 signaling to replicate and cause disease in vertebrates via proteasome-dependent STAT2 degradation mediated by the nonstructural protein NS5 and its interaction partner UBR4, an E3 ubiquitin ligase. The mechanism of Zika virus (ZIKV) NS5 resembles that of DENV NS5 but differs in that ZIKV does not require UBR4 to induce STAT2 degradation. It has also been shown that the STAT2 and STAT4 genes are direct targets for transcription factor Oct-1 protein which is involved in the regulation of expression of genes of the JAK-STAT signaling pathway in the Namalwa Burkitt's lymphoma cell line. 160
34382 341085 cd16847 STAT3_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 3 (STAT3). This family consists of the DNA-binding domain (DBD) of the STAT3 proteins (Signal Transducer and Activator of Transcription 3, or Signal Transduction And Transcription 3). The DNA binding domain has an Ig-like fold. STAT3 plays key roles in vertebrate development and mature tissue function including control of inflammation and immunity. Mutations in human STAT3, especially in the DNA-binding and SH2 domains, are associated with diseases such as autoimmunity, immunodeficiency and cancer. STAT3 regulation is tightly controlled since either inactivation or hyperactivation results in disease. STAT3 activation is stimulated by several cytokines and growth factors, via diverse receptors. For example, IL-6 receptors depend on the tyrosine kinases JAK1 or JAK2, which associate with the cytoplasmic tail of gp130, and results in STAT3 phosphorylation, dimerization, and translocation to the nucleus; this leads to further IL-6 production and up-regulation of anti-apoptotic genes, thus promoting various cellular processes required for cancer progression. Other activators of STAT3 include IL-10, IL-23, and LPS activation of Toll-like receptors TLR4 and TLR9. STAT3 is constitutively activated in numerous cancer types, including over 40% of breast cancers. It has been shown to play a significant role in promoting acute myeloid leukemia (AML) through three mechanisms: promoting proliferation and survival, preventing AML differentiation to functional dendritic cells (DCs), and blocking T-cell function through other pathways. STAT3 also regulates mitochondrion functions, as well as gene expression through epigenetic mechanisms; its activation is induced by overexpression of Bcl-2 via an increase in mitochondrial superoxide. Thus, many of the regulators and functions of JAK-STAT3 in tumors are important therapeutic targets for cancer treatment. 164
34383 341086 cd16848 STAT4_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 4 (STAT4). This family consists of the DNA-binding domain (DBD) of the STAT4 proteins (Signal Transducer and Activator of Transcription 4, or Signal Transduction And Transcription 4). The DNA binding domain has an Ig-like fold. STAT4 acts as the major signaling transducing STATs in response to interleukin-12 (IL-12) by inducing interferon-gamma (IFNg), and is a central mediator in generating inflammation during protective immune responses and immune-mediated diseases. STAT4 is a critical regulator of Th1 differentiation and inflammatory disease. It is essential for the differentiation and function of many immune cells, including natural killer cells, dendritic cells, mast cells and T helper cells. STAT4-mediated signaling promotes the production of autoimmune-associated components, which are implicated in the pathogenesis of autoimmune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, systemic sclerosis and psoriasis, making STAT4 a promising therapeutic target for autoimmune diseases. Variations in STAT4 gene are linked to the development of systemic lupus erythematosus (SLE) in humans. STAT4 activation is detected in chronic liver diseases; polymorphism in STAT4 gene has been shown to be associated with the antiviral response in primary biliary cirrhosis (PBC), HCV-associated liver fibrosis, hepatocellular carcinoma (HCC), chronic hepatitis C and in drug-induced liver injury (DILI). STAT4 may inhibit HCC development by modulating HCC cell proliferation. Studies show that increased expression of STAT4 is positively correlated with the depth of invasion in colorectal cancer (CRC) patients, and the growth and invasion of CRC cells are repressed by inhibition of STAT4 expression, making STAT4 a promising therapeutic target for the treatment of CRC. 152
34384 341087 cd16849 STAT5_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 5 (STAT5). This family consists of the DNA-binding domain (DBD) of the STAT5 proteins (Signal Transducer and Activator of Transcription 5, or Signal Transduction And Transcription 5), which include STAT5A and STAT5B, both of which are >90% identical despite being encoded by separate genes. The DNA binding domain has an Ig-like fold. STAT5A and STAT5B regulate erythropoiesis, lymphopoiesis, and the maintenance of the hematopoietic stem cell population. STAT5A and STAT5B have overlapping and redundant functions; both isoforms can be activated by the same set of cytokines, but some cytokines preferentially activate either STAT5A or STAT5B, e.g. during pregnancy and lactation, STAT5A rather than STAT5B is required for the production of luminal progenitor cells from mammary stem cells and is essential for the differentiation of milk producing alveolar cells during pregnancy. STAT5 has been found to be constitutively phosphorylated in cancer cells, and therefore constantly activated, either by aberrant cell signaling expression or by mutations. It differentially regulates cellular behavior in human mammary carcinoma. Prolactin (PRL) in the prostate gland can induce growth and survival of prostate cancer cells and tissues through the activation of STAT5, its downstream target; PRL expression and STAT5 activation correlate with disease severity. STAT5A and STAT5B are central signaling molecules in leukemias driven by Abelson fusion tyrosine kinases, displaying unique nuclear shuttling mechanisms and having a key role in resistance of leukemic cells against treatment with tyrosine kinase inhibitors (TKI). In addition, STAT5A and STAT5B promote survival of leukemic stem cells. STAT5 is a key transcription factor for IL-3-mediated inhibition of RANKL-induced osteoclastogenesis via the induction of the expression of Id genes. 
Autosomal recessive STAT5B mutations are associated with severe growth failure, insulin-like growth factor (IGF) deficiency and growth hormone insensitivity (GHI) syndrome. STAT5B deficiency can lead to potentially fatal primary immunodeficiency. 159
34385 341088 cd16850 STAT6_DBD DNA-binding domain of Signal Transducer and Activator of Transcription 6 (STAT6). This family consists of the DNA-binding domain (DBD) of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). The DNA binding domain has an Ig-like fold. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 play important roles in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that Meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distance metastasis of brain tumors to extracranial sites. 160
34386 341076 cd16851 STAT1_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 1 (STAT1). This family consists of the coiled-coil (alpha) domain of the STAT1 proteins (Signal Transducer and Activator of Transcription 1, or Signal Transduction And Transcription 1). STAT1 plays an essential role in mediating responses to all types of interferons (IFN), transducing signals from cytoplasmic domains of transmembrane receptors into the nucleus where it regulates gene expression. Thus STAT1 is involved in modulating diverse cellular processes, such as antimicrobial activities, cell proliferation and cell death. STAT1 function is crucial in the innate and adaptive arm of immunity and protects from pathogen infections; phosphorylation of a critical tyrosine by Janus kinases (JAKs) leads to its activation and nuclear translocation, while phosphorylation of a critical serine is required for full transcriptional activation upon IFN stimulation and in response to cellular stress. Transcription of protein-encoding genes (including Stat1 itself) as well as expression of microRNAs (miRNAs) is regulated by activated STAT1. Animal studies have shown that STAT1 is generally considered a tumor suppressor but it can also act as a tumor promoter; its functions are not restricted to tumor cells, but extend to parts of the tumor microenvironment such as immune cells, endothelial cells. STAT1 abundance is a reliable marker for good prognosis in selected tumor types, but it can also correlate with disease progression. In head and neck cancer (HNC) patients, upregulation of STAT1-induced HLA class I enhances immunogenicity and clinical response to anti-EGFR mAb cetuximab therapy. In systemic juvenile idiopathic arthritis (sJIA) characterized by systemic inflammation and arthritis, STAT1 phosphorylation downstream of IFNs is impaired. It exerts anti-oncogenic activities through interferon-gamma and interferon-alpha. 
STAT1 may inhibit hepatocellular carcinoma cell growth by regulating p53-related cell cycling and apoptosis. Studies also show a significant correlation of high STAT1 activity with longer colorectal cancer patient overall survival. Recent studies have shown that STAT1 suppresses mouse mammary gland tumorigenesis by immune regulatory as well as tumor cell-specific functions of STAT1. 176
34387 341077 cd16852 STAT2_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 2 (STAT2). This family consists of the coiled-coil (alpha) domain of the STAT2 proteins (Signal Transducer and Activator of Transcription 2, or Signal Transduction And Transcription 2). STAT2 activation is driven predominantly by only two classes of cell surface receptors: Type I and III interferon receptors, making it a unique STAT family of transcription factors. It differs from other STAT family members in that it associates constitutively with a non-STAT protein, the interferon regulatory factor 9 (IRF9). The coiled-coil domain of STAT2 is necessary for binding the carboxyl terminus of IRF9, an association required for the constitutive nuclear import of unphosphorylated STAT2. STAT2 plays a critical role in host defenses against viral infections since type I interferon (IFN-I) response inhibits viral replication, and sets the stage for the development of adaptive immunity; viruses target STAT2 by either inhibiting its expression, blocking its activity, or by targeting it for degradation, thus triggering remarkable divergence in the STAT2 gene across species compared to other STAT family members. STAT2 function is regulated by tyrosine phosphorylation which enables STAT dimerization, and subsequent nuclear translocation and transcriptional activation of IFN stimulated genes. Dengue virus (DENV)-mediated degradation of STAT2 has emerged as an important determinant of DENV pathogenesis and host tropism. This vector-borne flavivirus suppresses IFN1 signaling to replicate and cause disease in vertebrates via proteasome-dependent STAT2 degradation mediated by the nonstructural protein NS5 and its interaction partner UBR4, an E3 ubiquitin ligase. The mechanism of Zika virus (ZIKV) NS5 resembles that of DENV NS5 but differs in that ZIKV does not require UBR4 to induce STAT2 degradation. 
It has also been shown that the STAT2 and STAT4 genes are direct targets for transcription factor Oct-1 protein which is involved in the regulation of expression of genes of the JAK-STAT signaling pathway in the Namalwa Burkitt's lymphoma cell line. 172
34388 341078 cd16853 STAT3_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 3 (STAT3). This family consists of the coiled-coil (alpha) domain of the STAT3 proteins (Signal Transducer and Activator of Transcription 3, or Signal Transduction And Transcription 3). STAT3 continuously shuttles between nuclear and cytoplasmic compartments. The coiled-coil domain (CCD) of STAT3 appears to be required for constitutive nuclear localization signals (NLS) function; small deletions within the STAT3 CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 in the testis, and importin-alpha6 NLS adapters in most cells. STAT3 plays key roles in vertebrate development and mature tissue function including control of inflammation and immunity. Mutations in human STAT3, especially in the DNA-binding and SH2 domains, are associated with diseases such as autoimmunity, immunodeficiency and cancer. STAT3 regulation is tightly controlled since either inactivation or hyperactivation results in disease. STAT3 activation is stimulated by several cytokines and growth factors, via diverse receptors. For example, IL-6 receptors depend on the tyrosine kinases JAK1 or JAK2, which associate with the cytoplasmic tail of gp130, and results in STAT3 phosphorylation, dimerization, and translocation to the nucleus; this leads to further IL-6 production and up-regulation of anti-apoptotic genes, thus promoting various cellular processes required for cancer progression. Other activators of STAT3 include IL-10, IL-23, and LPS activation of Toll-like receptors TLR4 and TLR9. STAT3 is constitutively activated in numerous cancer types, including over 40% of breast cancers. It has been shown to play a significant role in promoting acute myeloid leukemia (AML) through three mechanisms: promoting proliferation and survival, preventing AML differentiation to functional dendritic cells (DCs), and blocking T-cell function through other pathways. 
STAT3 also regulates mitochondrion functions, as well as gene expression through epigenetic mechanisms; its activation is induced by overexpression of Bcl-2 via an increase in mitochondrial superoxide. Thus, many of the regulators and functions of JAK-STAT3 in tumors are important therapeutic targets for cancer treatment. 180
34389 341079 cd16854 STAT4_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 4 (STAT4). This family consists of the coiled-coil (alpha) domain of the STAT4 proteins (Signal Transducer and Activator of Transcription 4, or Signal Transduction And Transcription 4). STAT4 expression is restricted to spermatozoa, myeloid cells, and T lymphocytes, making it distinct from other STATs. It acts as the major signaling transducing STATs in response to interleukin-12 (IL-12) by inducing interferon-gamma (IFNgamma), and is a central mediator in generating inflammation during protective immune responses and immune-mediated diseases. STAT4 is a critical regulator of Th1 differentiation and inflammatory disease. It is essential for the differentiation and function of many immune cells, including natural killer cells, dendritic cells, mast cells and T helper cells. STAT4-mediated signaling promotes the production of autoimmune-associated components, which are implicated in the pathogenesis of autoimmune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, systemic sclerosis and psoriasis, making STAT4 a promising therapeutic target for autoimmune diseases. Variations in STAT4 gene are linked to the development of systemic lupus erythematosus (SLE) in humans. STAT4 activation is detected in chronic liver diseases; polymorphism in STAT4 gene has been shown to be associated with the antiviral response in primary biliary cirrhosis (PBC), HCV-associated liver fibrosis, hepatocellular carcinoma (HCC), chronic hepatitis C and in drug-induced liver injury (DILI). STAT4 may inhibit HCC development by modulating HCC cell proliferation. Studies show that increased expression of STAT4 is positively correlated with the depth of invasion in colorectal cancer (CRC) patients, and the growth and invasion of CRC cells are repressed by inhibition of STAT4 expression, making STAT4 a promising therapeutic target for the treatment of CRC. 173
34390 341080 cd16855 STAT5_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 5 (STAT5). This family consists of the coiled-coil (alpha) domain of the STAT5 proteins (Signal Transducer and Activator of Transcription 5, or Signal Transduction And Transcription 5) which include STAT5A and STAT5B, both of which are >90% identical despite being encoded by separate genes. The coiled-coil domain (CCD) of STAT5A and STAT5B appears to be required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT5A and STAT5B regulate erythropoiesis, lymphopoiesis, and the maintenance of the hematopoietic stem cell population. STAT5A and STAT5B have overlapping and redundant functions; both isoforms can be activated by the same set of cytokines, but some cytokines preferentially activate either STAT5A or STAT5B, e.g. during pregnancy and lactation, STAT5A rather than STAT5B is required for the production of luminal progenitor cells from mammary stem cells and is essential for the differentiation of milk producing alveolar cells during pregnancy. STAT5 has been found to be constitutively phosphorylated in cancer cells, and therefore constantly activated, either by aberrant cell signaling expression or by mutations. It differentially regulates cellular behavior in human mammary carcinoma. Prolactin (PRL) in the prostate gland can induce growth and survival of prostate cancer cells and tissues through the activation of STAT5, its downstream target; PRL expression and STAT5 activation correlates with disease severity. STAT5A and STAT5B are central signaling molecules in leukemias driven by Abelson fusion tyrosine kinases, displaying unique nuclear shuttling mechanisms and having a key role in resistance of leukemic cells against treatment with tyrosine kinase inhibitors (TKI). 
In addition, STAT5A and STAT5B promote survival of leukemic stem cells. STAT5 is a key transcription factor for IL-3-mediated inhibition of RANKL-induced osteoclastogenesis via the induction of the expression of Id genes. Autosomal recessive STAT5B mutations are associated with severe growth failure, insulin-like growth factor (IGF) deficiency and growth hormone insensitivity (GHI) syndrome. STAT5B deficiency can lead to potentially fatal primary immunodeficiency. 194
34391 341081 cd16856 STAT6_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription 6 (STAT6). This family consists of the coiled-coil (alpha) domain of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). Similar to STAT3 and STAT5, the coiled-coil domain (CCD) of STAT6 is required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocytes mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 plays an important role in the pathogenesis of asthma, where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distant metastasis of brain tumors to extracranial sites. 167
34392 341090 cd16857 ING_ING1_2 Inhibitor of growth (ING) domain of inhibitor of growth protein ING1, ING2, and similar proteins. ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. 
ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. Both ING1 and ING2 contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 89
34393 341091 cd16858 ING_ING3_Yng2p Inhibitor of growth (ING) domain of inhibitor of growth protein 3 (ING3), Yng2p and similar proteins. ING3, also termed p47ING3, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It is ubiquitously expressed and has been implicated in transcription modulation, cell cycle control, and the induction of apoptosis. It is an important subunit of human NuA4 histone acetyltransferase complex, which regulates the acetylation of histones H2A and H4. Moreover, ING3 promotes ultraviolet (UV)-induced apoptosis through the Fas/caspase-8-dependent pathway in melanoma cells. It physically interacts with subunits of E3 ligase Skp1-Cullin-F-box protein complex (SCF complex) and is degraded by the SCF (F-box protein S-phase kinase-associated protein 2, Skp2)-mediated ubiquitin-proteasome system. It also acts as a suppression factor during tumorigenesis and progression of hepatocellular carcinoma (HCC). Yeast chromatin modification-related protein Yng2p, also termed ESA1-associated factor 4 or ING1 homolog 2, is a subunit of the NuA4 histone acetyltransferase (HAT) complex. It plays a critical role in intra-S-phase DNA damage response. Members of this family contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 92
34394 341092 cd16859 ING_ING4_5 Inhibitor of growth (ING) domain of inhibitor of growth protein ING4, ING5, and similar proteins. ING4, also termed p29ING4, and ING5, also termed p28ING5, belong to the inhibitor of growth (ING) family of type II tumor suppressors. ING4 acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53, and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). ING5 is a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with Inhibitor of cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. Both ING4 and ING5 contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. They associate with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further direct the MOZ/MORF and HBO1 complexes to chromatin. 91
34395 341093 cd16860 ING_ING1 Inhibitor of growth (ING) domain of inhibitor of growth protein 1 (ING1). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING1 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 88
34396 341094 cd16861 ING_ING2 Inhibitor of growth (ING) domain of inhibitor of growth protein 2 (ING2). ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. ING2 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 88
34397 341095 cd16862 ING_ING4 Inhibitor of growth (ING) domain of inhibitor of growth protein 4 (ING4). ING4, also termed p29ING4, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). In addition, ING4 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING4 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 94
34398 341096 cd16863 ING_ING5 Inhibitor of growth (ING) domain of inhibitor of growth protein 5 (ING5). ING5, also termed p28ING5, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with Inhibitor of cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. In addition, ING5 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING5 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. 93
34399 350628 cd16864 ARID_JARID ARID/BRIGHT DNA binding domain of JARID proteins. The JARID subfamily within the JmjC protein family includes lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog, protein little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. The family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. 
Members of this subfamily contain the catalytic JmjC domain, JmjN, the AT-rich domain interacting domain (ARID)/BRIGHT domain, a C5HC2 zinc finger, as well as two or three plant homeodomain (PHD) fingers. 87
34400 350629 cd16865 ARID_ARID1A-like ARID/BRIGHT DNA binding domain found in AT-rich interactive domain-containing proteins ARID1A, ARID1B and similar proteins. This subfamily contains ARID1A and its paralog ARID1B. They are mutually exclusive components of human SWItch/Sucrose NonFermentable (SWI/SNF) chromatin remodeling protein complexes, but display different functions in development and cell-cycle control. SWI/SNF complexes containing ARID1A have an antiproliferative function, whereas the one harboring ARID1B shows a pro-proliferative function. ARID1A functions as an important tumor suppressor in various tumor types. It has been implicated in cell-cycle arrest, as well as in the interactions with p53 and BRG1/BRM and with topoisomerase II alpha. ARID1B may be considered as a potential therapeutic target for ARID1A-mutant cancers. Moreover, mutations in the ARID1B gene cause Coffin-Siris syndrome, exhibiting developmental defects, and haplo-insufficiency of ARID1B is a frequent cause of intellectual disability. Mutations in the ARID1B gene also have been found in many cancers. Both ARID1A and ARID1B contain an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which binds DNA in a non-sequence-specific manner. 93
34401 350630 cd16866 ARID_ARID2 ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 2 (ARID2) and similar proteins. ARID2, also called BRG1-associated factor 200 (BAF200) or zinc finger protein with activation potential (Zipzap/p200), is a novel serum response factor (SRF)-binding protein with multiple conserved domains, including an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), RFX DNA-binding domain, a glutamine-rich domain, and two C2H2 zinc fingers. It binds DNA without sequence specificity. ARID2 is an intrinsic subunit of PBAF (SWI/SNF-B) remodeling complex, which needs ARID2 to play an essential role in promoting osteoblast differentiation, maintaining cellular identity and activating tissue-specific gene expression. Moreover, ARID2 may function as a tumor suppressor in many cancers. It may also serve as a transcription co-activator for the regulation of cardiac gene expression, and is required for heart morphogenesis and coronary artery development. 88
34402 350631 cd16867 ARID_ARID3 ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID3A, ARID3B, ARID3C, dead ringer (Dri) from Drosophila melanogaster, and similar proteins. The ARID3 subfamily includes AT-rich interactive domain (ARID, also known as BRIGHT)-containing proteins ARID3A, ARID3B and ARID3C, which are the most direct mammalian counterparts of the Drosophila "dead ringer" protein Dri. They consist of an acidic N-terminal region of unknown function, the central ARID matrix association (or attachment) region (MAR)-DNA binding domain, a SUMO-I conjugation (SUMO) motif, and a multifunctional homomerization/nuclear export REKLES domain in the C-terminal third of the molecule. The ARID domain in this subfamily has been described as the "extended" or e-ARID due to additional conserved sequences at both the N and C termini of the core ARID region. The REKLES domain is found only in the ARID3 subfamily. It has co-evolved with and regulates functional properties of the ARID DNA-binding domain. 118
34403 350632 cd16868 ARID_ARID4 ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID4A, ARID4B and similar proteins. This subfamily contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (Rb)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and Rb transcriptional activity, and play important roles in the AR and Rb pathways to control male fertility. They also act as the leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. 87
34404 350633 cd16869 ARID_ARID5 ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID5A, ARID5B, and similar proteins. This subfamily contains ARID5A and its paralog ARID5B. ARID5A, also called modulator recognition factor 1 (MRF-1), is an estrogen receptor alpha (ER alpha)-interacting protein that is expressed abundantly in cardiovascular tissues and suppresses ER alpha-induced transactivation. It also plays an important role in the promotion of inflammatory processes and autoimmune diseases. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Both ARID5A and ARID5B contain an AT-rich DNA-interacting domain (ARID, also known as BRIGHT). 87
34405 350634 cd16870 ARID_JARD2 ARID/BRIGHT DNA binding domain of Jumonji/ARID domain-containing protein 2 (JARID2) and similar proteins. JARID2, also called protein Jumonji, is a DNA-binding protein that contains both the Jumonji C (JmjC) domain and AT-rich DNA-interacting domain (ARID, also known as BRIGHT). It is an interacting component of Polycomb repressive complex-2 (PRC2) that catalyzes methylation of lysine 27 of histone H3 (H3K27) and regulates important gene expression patterns during development. It exhibits nucleosome-binding activity that contributes to PRC2 stimulation. However, unlike other JmjC domain-containing proteins, JARID2 is catalytically inactive due to the lack of conserved residues essential for histone demethylase activity. JARID2 is also involved in transforming growth factor-beta (TGF-beta)-induced epithelial-mesenchymal transition (EMT) of lung and colon cancer cell lines through the modulation of histone H3K27 methylation. Moreover, JARID2 is a part of GLP- and G9a-containing protein complex that promotes lysine 9 on histone H3 (H3K9) methylation on the cyclin D1 promoter and silences the expression of cyclin D1 and other cell cycle genes. It functions as a transcriptional repressor that plays critical roles in embryonic development including heart development in mice, and regulates cardiomyocyte proliferation via interaction with retinoblastoma protein (Rb), one of the master regulatory genes of the cell cycle. Furthermore, JARID2 acts as a transcriptional repressor of target genes, including Notch1. It directly binds to SETDB1 (SET domain, bifurcated 1) to form a complex that plays an important role in a novel process involving the modification of H3K9 methylation during heart development. Meanwhile, JARID2 is a key transcriptional repressor that plays a role in invariant natural killer T (iNKT) cell maturation. It regulates promyelocytic leukemia zinc finger (PLZF) expression by linking T-cell receptor (TCR) signaling to H3K9me3. 
JARID2 polymorphisms are associated with non-syndromic orofacial clefts (NSOC) susceptibility. 112
34406 350635 cd16871 ARID_Swi1p-like ARID/BRIGHT DNA binding domain of yeast SWI/SNF chromatin-remodeling complex subunit Swi1p and similar proteins. Saccharomyces cerevisiae Swi1p, also called SWI/SNF chromatin-remodeling complex subunit SWI1, regulatory protein GAM3, or transcription regulatory protein ADR6, is a transcription regulatory protein that is a subunit of the SWI/SNF complex, which plays critical roles in the regulation of gene transcription and expression. It can exist as a prion, [SWI(+)], which demonstrates a link between prionogenesis and global transcriptional regulation. Swi1p contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT) that binds DNA nonspecifically. This subfamily also includes Schizosaccharomyces pombe SWI/SNF chromatin-remodeling complex subunit sol1 (sol1p, also known as switch one-like protein). sol1p is a homolog of S. cerevisiae Swi1p and is also a part of SWI/SNF chromatin-remodeling complex. 90
34407 350636 cd16872 ARID_HMGB9-like ARID/BRIGHT DNA binding domain of Arabidopsis thaliana high mobility group B proteins HMGB9, HMGB10, HMGB11, HMGB15 and similar proteins. This subfamily includes a group of conserved plant DNA-binding proteins, including HMGB9 (or ARID-HMG1), HMGB10 (or ARID-HMG2), HMGB11, and HMGB15. They have been termed ARID-HMG proteins, due to containing two DNA-binding domains, an N-terminal AT-rich DNA-interacting domain (ARID, also known as BRIGHT), and a C-terminal high mobility group (HMG)-box domain. They are widely expressed in Arabidopsis and localize primarily to the nucleus. HMGB9/ARID-HMG1 binds specifically to A/T-rich DNA. HMGB15 is a transcription factor predominantly expressed in mature pollen grains and pollen tubes. It may work in the form of a homodimer, or interact with HMGB9, HMGB10 and HMGB11 to form heteromultimers in plant cells. HMGB15 is required for pollen tube growth in Arabidopsis and is involved in transcriptional regulation through the interaction with AGL66 and AGL104. 86
34408 350637 cd16873 ARID_KDM5A ARID/BRIGHT DNA binding domain of lysine-specific demethylase 5A (KDM5A). KDM5A, also called histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner; its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as the trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as three plant homeodomain (PHD) fingers. 92
34409 350638 cd16874 ARID_KDM5B ARID/BRIGHT DNA binding domain of lysine-specific demethylase 5B (KDM5B). KDM5B, also called cancer/testis antigen 31 (CT31), histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as three plant homeodomain (PHD) fingers. 90
34410 350639 cd16875 ARID_KDM5C_5D ARID/BRIGHT DNA binding domain of lysine-specific demethylase KDM5C and KDM5D. This group includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C, also called histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, protein SmcX, or protein Xe169, is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D, also called histocompatibility Y antigen (H-Y), histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or protein SmcY, is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as two plant homeodomain (PHD) fingers. 92
34411 350640 cd16876 ARID_ARID1A ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 1A (ARID1A) and similar proteins. ARID1A, also called B120, BRG1-associated factor 250a (BAF250A), Osa homolog 1 (OSA1), SWI-like protein, SWI/SNF complex protein p270, or SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin subfamily F member 1 (SWI1), has been identified as a novel tumor suppressor in various tumor types. It interacts with BRG1 adenosine triphosphatase to form a SWItch/Sucrose NonFermentable (SWI/SNF) chromatin remodeling protein complex, which plays a critical role in transcriptional control and gene expression. ARID1A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), and Eld/Osa homology domains (EHD) 1 and 2 within the C-terminus. The ARID in ARID1A binds nonspecific DNA in general and plays an important role in targeting SWI/SNF to chromatin. The EHD1 may be capable of mediating an intramolecular association with EHD2, and/or an intermolecular association resulting in homo- or hetero-dimerization. The EHD2 binds Swi2/Brahma homologue Brahma-related gene 1 (BRG1, also known as Snf2b), a human homologue of yeast Swi2. 93
34412 350641 cd16877 ARID_ARID1B ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 1B (ARID1B) and similar proteins. ARID1B, also called BRG1-associated factor 250b (BAF250B), BRG1-binding protein ELD/OSA1, Osa homolog 2 (Osa2), or p250R, is the largest subunit of ATP-dependent SWItch/sucrose nonfermentable (SWI/SNF) chromatin remodeling complex, which plays a critical role in transcriptional control and gene expression. ARID1B exhibits tumour-suppressor activities in pancreatic cancer cell lines. Mutations in the ARID1B gene cause Coffin-Siris syndrome, exhibiting developmental defects, and haplo-insufficiency of ARID1B is a frequent cause of intellectual disability. Moreover, mutations in the ARID1B gene have been found in many cancers. ARID1B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which binds DNA in a non-sequence-specific manner similar to ARID1A. 93
34413 350642 cd16878 ARID_ARID3A ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3A (ARID3A) and similar proteins. ARID3A, also called B-cell regulator of IgH transcription (Bright), dead ringer-like protein 1 (Dril1), or E2F-binding protein 1 (E2FBP1), is an ubiquitously expressed DNA-binding protein that has been implicated in embryonic patterning, cell lineage gene regulation, and cell cycle control, chromatin remodeling and transcriptional regulation. It was originally identified as a B cell-specific trans-activator of immunoglobulin heavy-chain (IgH) transcription, which increases immunoglobulin transcription in antigen-activated B cells and plays regulatory roles in hematopoiesis. It also functions as an E2F transcription regulator, inducing promyelocytic leukemia protein (PML) reduction and suppressing the formation of PML-nuclear bodies. It antagonizes the p16(INK4A)-Rb tumor suppressor machinery by regulating PML stability. ARID3A transcriptional activity can be modulated by SUMO (Small Ubiquitin-related Modifier) modification through the interaction with the SUMO-conjugating enzyme Ubc9. ARID3A also plays an important role in marginal zone B lymphocyte development and autoantibody production. Furthermore, ARID3A is a direct p53 target gene. It controls cell growth in a p53-dependent manner. ARID3A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta. 133
34414 350643 cd16879 ARID_ARID3B ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3B (ARID3B) and similar proteins. ARID3B, also called Bright and dead ringer protein, or Bright-Dri-like protein (Bdp), is a DNA binding protein involved in cellular immortalization, epithelial-mesenchymal transition (EMT), and tumorigenesis. Its expression is differentially regulated in normal and malignant tissues. It is required for heart development by regulating the motility and differentiation of heart progenitors. ARID3B is overexpressed in neuroblastoma and ovarian cancer. It acts as a novel target with roles in cell motility in breast cancer cells, promotes migration of mouse embryo fibroblasts (MEFs) and breast cancer cells, and induces tumor necrosis factor alpha (TNFalpha)-mediated apoptosis. ARID3B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta. 126
34415 350644 cd16880 ARID_ARID3C ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3C (ARID3C) and similar proteins. ARID3C, also called Brightlike, is a new ARID3 family transcription factor that co-activates ARID3A-mediated immunoglobulin gene transcription. It also functions as a potential regulator of early events in B cell antigen receptor (BCR) signaling. ARID3C contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta. 127
34416 350645 cd16881 ARID_Dri-like ARID/BRIGHT DNA binding domain of dead ringer (Dri) from Drosophila melanogaster and similar proteins. Dri, also termed retained (retn), is a nuclear protein with a sequence-specific DNA-binding domain termed AT-rich DNA-interacting domain (ARID, also known as BRIGHT). It is a founding member of the ARID family. Sequence comparison shows that DRI belongs to the "extended" or e-ARID subfamily, which exhibits an extended region of similarity either side of the ARID. Dri plays an important role in embryogenesis. It functions as an essential transcription factor involved in aspects of dorsal/ventral and anterior/posterior axis patterning, as well as myogenesis and hindgut development. 125
34417 350646 cd16882 ARID_ARID4A ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1, or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B (also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC-dependent and -independent repression activities. It is also involved in the pocket domain of pRb-mediated repression of E2F-dependent transcription and cellular proliferation. Moreover, it acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in Runx2-osterix transcriptional cascade. ARID4A contains a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation. 87
34418 350647 cd16883 ARID_ARID4B ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1, or RBBP1L1), is a tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4A (also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating the cell cycle. ARID4B contains a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. 92
34419 350648 cd16884 ARID_ARID5A ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 5A (ARID5A) and similar proteins. ARID5A, also called modulator recognition factor 1 (MRF-1), is an estrogen receptor alpha (ER alpha)-interacting protein that is expressed abundantly in cardiovascular tissues and suppresses ER alpha-induced transactivation. It also associates with thyroid receptor alpha (TR alpha) and retinoid X receptor alpha (RXR alpha) in a ligand-dependent manner, and with ER beta, androgen receptor (AR), and the retinoic acid receptor (RAR) in a ligand-independent manner. ARID5A functions as a negative regulator of RORgamma-induced Th17 cell differentiation and may be involved in the pathogenesis of rheumatoid arthritis (RA). Moreover, it is an important transcriptional partner of the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) in stimulation of chondrocyte-specific transcription. Meanwhile, ARID5A plays an important role in promotion of inflammatory processes and autoimmune diseases. It works as a unique RNA binding protein, which stabilizes interleukin-6 (IL-6) but not tumor necrosis factor-alpha (TNF-alpha) mRNA through binding to the 3' untranslated region (UTR) of IL-6 mRNA, and inhibits the destabilizing effect of regnase-1 on IL-6 mRNA. ARID5A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT). 87
34420 350649 cd16885 ARID_ARID5B ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 5B (ARID5B) and similar proteins. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex, which is a signal-sensing modulator of histone methylation and gene transcription. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Its polymorphism has been associated with risk for pediatric acute lymphoblastic leukemia (ALL). ARID5B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which can bind both the major and minor grooves of its target sequences. 95
34421 341123 cd16887 YEATS YEATS domain family, chromatin reader proteins. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein, YEATS2, Drosophila D12, and others. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. 120
34422 381609 cd16888 lyz_G-like_1 lysozyme G-like protein 1. Eukaryotic goose-type or G-type lysozymes (goose egg-white lysozyme; GEWL) catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc). Mammals have two lysozyme G-like proteins, and this family corresponds to human and mouse lysozyme G-like protein 1. In humans and some other species, the canonical catalytic glutamate residue is absent, suggesting a loss of muramidase activity. 160
34423 381610 cd16889 chitinase-like chitinase-like domain. This family includes proteins such as chitinases, chitosanase, pesticin, and endolysin, which are involved in the degradation of 1,4-N-acetyl D-glucosamine linkages in chitin polymers and related activities. Chitinases are enzymes that catalyze the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Chitosanase enzymes hydrolyze chitosan, a biopolymer of beta (1,4)-linked-D-glucosamine (GlcN) residues produced by partial or full deacetylation of chitin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. 105
34424 381611 cd16890 lyz_i I-type lysozyme. Invertebrate type (I-type) lysozyme, initially identified in starfish and marine bivalves, are found in various invertebrate phyla and are apparently ubiquitous in insects. Lysozymes cleave the beta-(1,4)-glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine in peptidoglycan, the major bacterial cell wall polymer. I-type enzymes share structural similarity and the conserved glutamate catalytic residue of the lysozyme family. 117
34425 381612 cd16891 CwlT-like CwlT-like N-terminal lysozyme domain and similar domains. CwlT is a bifunctional cell wall hydrolase containing an N-terminal lysozyme domain and a C-terminal NlpC/P60 endopeptidase domain (gamma-d-D-glutamyl-L-diamino acid endopeptidase), and has been implicated in the spread of transposons. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). 151
34426 381613 cd16892 LT_VirB1-like VirB1-like subfamily. This subfamily includes VirB1 protein, one of twelve proteins making up type IV secretion systems (T4SS). T4SS are macromolecular assemblies generally composed of VirB1-11 and VirD4 proteins, and are used by bacteria to transport material across their membranes. VirB1 acts as a lytic transglycosylase (LT), and is important with respect to piercing the peptidoglycan layer in the periplasm. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). 143
34427 381614 cd16893 LT_MltC_MltE membrane-bound lytic murein transglycosylases MltC and MltE, and similar proteins. MltC and MltE are periplasmic, outer membrane attached lytic transglycosylases (LTs), which cleave beta-1,4-glycosidic bonds joining N-acetylmuramic acid and N-acetylglucosamine in the cell wall peptidoglycan, yielding 1,6-anhydromuropeptides. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda. 162
34428 381615 cd16894 MltD-like Membrane-bound lytic murein transglycosylase D and similar proteins. Lytic transglycosylases (LT) catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc). Membrane-bound lytic murein transglycosylase D protein (MltD) family members may have one or more small LysM domains, which may contribute to peptidoglycan binding. Unlike the similar "goose-type" lysozymes, LTs also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). 129
34429 381616 cd16895 TraH-like conjugal transfer protein H and similar proteins. This subfamily consists of several TraH proteins, putative conjugal transfer proteins of uncharacterized function, apparently found only in alphaproteobacteria. They have similarity to lysozyme and preserve the critical glutamate residue which has catalytic activity in lysozyme-like proteins. 198
34430 381617 cd16896 LT_Slt70-like uncharacterized lytic transglycosylase subfamily with similarity to Slt70. Uncharacterized lytic transglycosylase (LT) with a conserved sequence pattern suggesting similarity to the Slt70, a 70kda soluble lytic transglycosylase which also has an N-terminal U-shaped U-domain and a linker L-domain. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. 146
34431 340383 cd16897 LYZ_C C-type lysozyme. C-type lysozyme (chicken or conventional type; 1,4-beta-N-acetylmuramidase). In humans, lysozyme is found in a wide variety of tissue types and body fluids. It has bacteriolytic properties through the hydrolysis of beta-1,4-glycosidic linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in a peptidoglycan, as well as between N-acetyl-D-glucosamine residues in chitodextrins. This family also includes digestive stomach lysozyme, which allows ruminants to digest bacteria in the foregut. The mammalian enzyme is related to lysozyme found in the egg whites of hens and related species. 126
34432 340384 cd16898 LYZ_LA alpha lactalbumin. alpha-lactalbumin (lactose synthase B protein, LA) is a calcium-binding metalloprotein that is expressed exclusively in the mammary gland during lactation. LA is the regulatory subunit of the enzyme lactose synthase. The association of LA with the catalytic component of lactose synthase, galactosyltransferase, alters the acceptor substrate specificity of this glycosyltransferase, facilitating biosynthesis of lactose. 123
34433 381618 cd16899 LYZ_C_invert C-type invertebrate lysozyme. C-type lysozyme proteins of invertebrates, including digestive lysozymes 1 and 2 from Musca domestica, which aid in the use of bacteria as a food source. These lysozymes have high expression in the gut and optimal lytic activity at a lower pH. Other lysozymes in this subfamily have immunological roles; for example, Anopheles gambiae has eight lysozymes, most of which seem to have immunological roles, though some may function as digestive enzymes in larvae. C-type lysozyme (chicken or conventional type; 1,4-beta-N-acetylmuramidase) has bacteriolytic properties through the hydrolysis of beta-1,4-glycosidic linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in a peptidoglycan, as well as between N-acetyl-D-glucosamine residues in chitodextrins. 121
34434 381619 cd16900 endolysin_R21-like endolysin R21-like proteins. Unlike T4 E phage lysozyme, the endolysin R21 from Enterobacteria phage P21 has an N-terminal SAR (signal-arrest-release) domain that anchors the endolysin to the membrane in an inactive form, which acts to prevent premature lysis of the infected bacterium. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan. 142
34435 381620 cd16901 lyz_P1 P1 lysozyme Lyz-like proteins. Enterobacteria phage P1 lysozyme Lyz is secreted to the Escherichia coli periplasm where it is membrane bound and inactive. Activation involves the release from the membrane, an intramolecular thiol-disulfide isomerization and extensive structural rearrangement of the N-terminal region. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan. 140
34436 381621 cd16902 pesticin_lyz lysozyme-like C-terminal domain of pesticin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides. 178
34437 340389 cd16903 pesticin_lyz-like pesticin C-terminal-like domain of uncharacterized proteins. This subfamily is composed of uncharacterized proteins containing a lysozyme-like domain similar to the C-terminal domain of pesticin. Some members also contain an EF hand domain. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides. 150
34438 340390 cd16904 pesticin_lyz-like pesticin C-terminal-like domain of uncharacterized proteins. This subfamily is composed of uncharacterized proteins containing a lysozyme-like domain similar to the C-terminal domain of pesticin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides. 138
34439 341124 cd16905 YEATS_Taf14_like YEATS domain found in Taf14 and similar proteins. Taf14 has been identified as a component of TFIIF and TFIID transcription factor complexes, various chromatin-remodeling complexes (such as SWI/SNF, INO80, and RSC) and the NuA3 histone acetyltransferase complex. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. Specifically, Taf14 has been shown to be a reader of lysine crotonylation, associated with active gene promoters and enhancers and binding acetyllysine on histone H3. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 118
34440 341125 cd16906 YEATS_AF-9_like YEATS domain found in ENL and AF-9-like proteins. Yeast AFF9 is a YEATS domain protein that binds to modified histones, with a preference for crotonyllysine over acetyllysine. Histone crotonylation upregulates gene expression in an AF9-dependent manner. This subfamily also includes eleven-nineteen-leukemia protein ENL, which binds histones H3 and H1. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 127
34441 341126 cd16907 YEATS_YEATS2_like YEATS domain found in YEATS2 and Drosophila D12. YEATS2 is a YEATS domain reader protein with a preference for recognition of histone H3 crotonylation on lysine 27 (H3K27cr). Histone crotonylation upregulates gene expression in an AF9-dependent manner. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 123
34442 341127 cd16908 YEATS_Yaf9_like YEATS domain found in Yaf9 and similar proteins. Yaf9 is a YEATS domain family protein essential for the function of the NuA4 histone acetyltransferase complex and the SWR1-C ATP-dependent chromatin remodeling complex. Yaf9 shares structural similarity with the histone chaperone Asf1; both exhibit histone H3 and H4 binding in vitro, and evidence supports that both play a role in the same histone acetylation mechanism. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 145
34443 341128 cd16909 YEATS_GAS41_like YEATS domain found in YEATS domain-containing protein 4 and similar proteins. Glioma amplified sequence 41 (GAS41, also known as YEATS domain-containing protein 4 or NuMA-binding protein 1) is a YEATS domain family protein that is amplified and acts as an oncogene in human glioma. GAS41 has been shown to interact with the general transcription factor TFIIF via the YEATS domain. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 137
34444 341129 cd16910 YEATS_TFIID14_like YEATS domain found in transcription initiation factor TFIID subunit 14 and similar proteins. This group of YEATS domain-containing proteins includes transcription initiation factor TFIID subunits 14 and 14b of Arabidopsis, which were shown to be part of the TFIID general transcriptional regulator complex in a two-hybrid screen. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 131
34445 350650 cd16911 AfaD_SafA-like AfaD-like family of invasins. This family consists of Escherichia coli AfaD, Salmonella SafA and SafD, Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. Saf-pilin pilus formation proteins SafA and SafD are the major and minor subunits, respectively, of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Also included is the enteroaggregative Escherichia coli AAF/IV pilus tip protein, which is implicated in adhesion as well. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one AfaD-like family monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE). 120
34446 341130 cd16913 YkuD_like L,D-transpeptidases/carboxypeptidases similar to Bacillus YkuD. Members of the YkuD-like family of proteins are found in a range of bacteria. The best studied member Bacillus YkuD has been shown to act as an L,D-transpeptidase that gives rise to an alternative pathway for peptidoglycan cross-linking. Another member Helicobacter pylori Csd6 functions as an L,D-carboxypeptidase and regulates helical cell shape and motility. The conserved region contains a conserved histidine and cysteine, with the cysteine thought to be an active site residue. 121
34447 410987 cd16914 EcfT T component of ECF-type transporters. The transmembrane component (T component) of the energy coupling-factor (ECF)-type transporter is a transmembrane protein important for vitamin uptake in prokaryotes. In addition to the T component, energy-coupling factor (ECF) transporters contain an energy-coupling module that consists of two ATP-binding proteins (known as the A and A' components) and a substrate-binding (S) component. ECF transporters comprise a subgroup of ATP-binding cassette (ABC) transporters that do not make use of water-soluble substrate binding proteins or domains, but instead employ integral membrane proteins for substrate binding, the S component, in contrast to classical ABC importers. The T component links the S component to the ATP-binding subcomplex that is composed of the A and A' components. 233
34448 340392 cd16915 HATPase_DpiB-CitA-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli K-12 DpiB, DcuS, and Bacillus subtilis CitS, DctS, and YufL. This family includes histidine kinase-like ATPase domains of Escherichia coli K-12 DpiB and DcuS, and Bacillus subtilis CitS, DctS and MalK histidine kinases (HKs), all of which belong to two-component signal transduction systems (TCSs). E. coli K-12 DpiB (also known as CitA) is the histidine kinase (HK) of DpiA-DpiB, a two-component signal transduction system (TCS) required for the expression of citrate-specific fermentation genes and genes involved in plasmid inheritance. E. coli K-12 DcuS (also known as YjdH) is the HK of DcuS-DcuR, a TCS that, in the presence of extracellular C4-dicarboxylates, activates the expression of the genes of anaerobic fumarate respiration and of aerobic C4-dicarboxylate uptake. CitS is the HK of Bacillus subtilis CitS-CitT, a TCS which regulates expression of CitM, the Mg-citrate transporter. Bacillus subtilis DctS forms a tripartite sensor unit (DctS/DctA/DctB) for sensing C4 dicarboxylates. Bacillus subtilis MalK (also known as YufL) is the HK of MalK-MalR (YufL-YufM), a TCS which regulates the expression of the malate transporters MaeN (YufR) and YflS, and is essential for utilization of malate in minimal medium. Proteins having this DpiB-CitA-like HATPase domain generally have sensor domains such as Cache and PAS, and a histidine kinase A (HisKA)-like SpoOB-type, alpha-helical domain. 104
34449 340393 cd16916 HATPase_CheA-like Histidine kinase-like ATPase domain of the chemotaxis protein histidine kinase CheA, and some hybrid sensor histidine kinases. This family includes the cytoplasmic histidine kinase (HK) CheA, a transmembrane receptor which, together with cytoplasmic adaptor protein (CheW), forms the lattice at the core of the chemosensory array that controls the cellular chemotaxis of motile bacteria and archaea. CheA forms a two-component signal transduction system (TCS) with the response regulator CheY. Proteins having this CheA-like HATPase domain generally also have a histidine-phosphotransfer domain, a histidine kinase homodimeric domain, and a regulatory domain; some are hybrid sensor histidine kinases as they contain a REC signal receiver domain. 178
34450 340394 cd16917 HATPase_UhpB-NarQ-NarX-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli UhpB, NarQ and NarX, and Bacillus subtilis YdfH, YhcY and YfiJ. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Escherichia coli UhpB, a HK of the UhpB-UhpA TCS, NarQ and NarX, HKs of the NarQ-NarP and NarX-NarL TCSs, respectively, and Bacillus YdfH, YhcY and YfiJ HKs, of the YdfH-YdfI, YhcY-YhcZ and YfiJ-YfiK TCSs, respectively. In addition, it includes Bacillus YxjM, ComP, LiaS and DesK, HKs of the YxjM-YxjML, ComP-ComA, LiaS-LiaR, DesR-DesK TCSs, respectively. Proteins having this HATPase domain have a histidine kinase dimerization and phosphoacceptor domain; some have accessory domains such as GAF, HAMP, PAS and MASE sensor domains. 87
34451 340395 cd16918 HATPase_Glnl-NtrB-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli GlnL (synonyms NtrB and NRII). This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs), similar to Escherichia coli GlnL/NtrB/NRII HK of the two-component regulatory system (TCS) GlnL/GlnG (NtrB-NtrC, or NRII-NRI), which regulates the transcription of genes encoding metabolic enzymes and permeases in response to carbon and nitrogen status in E. coli and related bacteria. Also included in this family are Rhodobacter capsulatus NtrB, Azospirillum brasilense NtrB, Vibrio alginolyticus NtrB, Rhizobium leguminosarum biovar phaseoli NtrB, and Herbaspirillum seropedicae NtrB. Escherichia coli GlnL/NtrB/NRII is both a kinase and a phosphatase, catalyzing the phosphorylation and dephosphorylation of GlnG/NtrC/NRI. The kinase and phosphatase activities of GlnL/NtrB/NRII are regulated by the PII signal transduction protein, which on binding to GlnL/NtrB/NRII, inhibits the kinase activity of GlnL/NtrB/NRII and activates the GlnL/NtrB/NRII phosphatase activity. Proteins having this HATPase domain also have a histidine kinase dimerization and phosphoacceptor domain (HisKA); some also contain PAS sensor domain(s). 109
34452 340396 cd16919 HATPase_CckA-like Histidine kinase-like ATPase domain of two-component sensor hybrid histidine kinases, similar to Brucella abortus 2308 CckA. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinase (HKs) similar to Brucella abortus 2308 CckA, which is a component of an essential protein phosphorelay that regulates expression of genes required for growth, division, and intracellular survival; phosphoryl transfer initiates from the sensor kinase CckA and proceeds via the ChpT phosphotransferase to two regulatory substrates: the DNA-binding response regulator CtrA and the phospho-receiver protein CpdR. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), a REC signal receiver domain, and some contain PAS or PAS and GAF sensor domain(s). 116
34453 340397 cd16920 HATPase_TmoS-FixL-DctS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Rhizobium meliloti FixL, and Rhodobacter capsulatus DctS; includes hybrid sensor histidine kinase similar to Pseudomonas mendocina TmoS. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs), such as Pseudomonas mendocina TmoS HK of the TmoS-TmoT TCS, which controls the expression of the toluene-4-monooxygenase pathway, Rhizobium meliloti FixL HK of the FixL-FixJ TCS, which regulates the expression of the genes related to nitrogen fixation in the root nodule in response to O(2) levels, and Rhodobacter capsulatus DctS of the DctS-DctR TCS, which controls synthesis of the high-affinity C4-dicarboxylate transport system. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and PAS sensor domain(s); many are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 104
34454 340398 cd16921 HATPase_FilI-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Methanosaeta harundinacea FilI and some hybrid sensor histidine kinases. This family includes FilI, the histidine kinase (HK) component of FilI-FilRs, a two-component signal transduction system (TCS) of the methanogenic archaeon, Methanosaeta harundinacea, which is involved in regulating methanogenesis. The cytoplasmic HK core consists of a C-terminal HK-like ATPase domain (represented here) and a histidine kinase dimerization and phosphoacceptor domain (HisKA) domain, which, in FilI, are coupled to CHASE, HAMP, PAS, and GAF sensor domains. FilI-FilRs catalyzes the phosphotransfer between FilI (HK) and FilRs (FilR1 and FilR2, response regulators) of the TCS. TCSs are predicted to be of bacterial origin, and acquired by archaea by horizontal gene transfer. This model also includes related HATPase domains such as that of Synechocystis sp. PCC6803 phytochrome-like protein Cph1. Proteins having this HATPase domain and HisKA domain also have accessory sensor domains such as CHASE, GAF, HAMP and PAS; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 105
34455 340399 cd16922 HATPase_EvgS-ArcB-TorS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases, many are hybrid sensor histidine kinases, similar to Escherichia coli EvgS, ArcB, TorS, BarA, RcsC. This family contains the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinases (HKs), including the following Escherichia coli HKs: EvgS, a HK of the EvgS-EvgA two-component system (TCS) that confers acid resistance; ArcB, a HK of the ArcB-ArcA TCS that modulates the expression of numerous genes in response to respiratory growth conditions; TorS, a HK of the TorS-TorR TCS which is involved in the anaerobic utilization of trimethylamine-N-oxide; BarA, a HK of the BarA-UvrY TCS involved in the regulation of carbon metabolism; and RcsC, a HK of the RcsB-RcsC TCS which regulates the expression of the capsule operon and of the cell division gene ftsZ. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), with most having accessory sensor domain(s) such as GAF, PAS and CHASE; many are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 110
34456 340400 cd16923 HATPase_VanS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Enterococcus faecium VanS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Enterococcus faecium VanS HK of the VanS-VanR two-component regulatory system (TCS) which activates the transcription of vanH, vanA and vanX vancomycin resistance genes. It also contains E. coli YedV and PcoS, probable members of YedW-YedV TCS and PcoS-PcoR TCS, respectively. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); most also have a HAMP sensor domain. 102
34457 340401 cd16924 HATPase_YpdA-YehU-LytS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YpdA, YehU, Bacillus subtilis LytS, and some hybrid sensor histidine kinases. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis LytS, a HK of the two-component system (TCS) LytS-LytR needed for growth on pyruvate, and Staphylococcus aureus LytS-LytR TCS involved in the adaptation of S. aureus to cationic antimicrobial peptides. It also includes the HATPase domains of Escherichia coli YpdA and YehU, HKs of YpdA-YpdB and YehU-YehT TCSs, which are involved together in a nutrient sensing regulatory network. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), some having accessory sensor domain(s) such as Cache, HAMP or GAF; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 103
34458 340402 cd16925 HATPase_TutC-TodS-like Histidine kinase-like ATPase domain of hybrid sensor histidine kinases similar to Pseudomonas putida TodS and Thauera aromatica TutC. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinase (HKs) such as Pseudomonas putida TodS HK of the TodS-TodT two-component regulatory system (TCS) which controls the expression of a toluene degradation pathway. Thauera aromatica TutC may be part of a TCS that is involved in anaerobic toluene metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), PAS sensor domain(s) and a REC domain. 110
34459 340403 cd16926 HATPase_MutL-MLH-PMS-like Histidine kinase-like ATPase domain of DNA mismatch repair proteins Escherichia coli MutL, human MutL homologs (MLH/ PMS), and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of Escherichia coli MutL, human MLH1 (mutL homolog 1), human PMS1 (PMS1 homolog 1, mismatch repair system component), human MLH3 (mutL homolog 3), and human PMS2 (PMS1 homolog 2, mismatch repair system component). MutL homologs (MLH/PMS) participate in MMR (DNA mismatch repair), and in addition have role(s) in DNA damage signaling and suppression of homologous recombination (recombination between partially homologous parental DNAs). The primary role of MutL in MMR is to mediate protein-protein interactions during mismatch recognition and strand removal; a ternary complex is formed between MutS, MutL, and the mismatched DNA, which activates the MutH endonuclease. 188
34460 340404 cd16927 HATPase_Hsp90-like Histidine kinase-like ATPase domain of human cytosolic Hsp90 and its homologs including Escherichia coli HtpG, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of 90 kilodalton heat-shock protein (Hsp90) eukaryotic homologs including cytosolic Hsp90, mitochondrial TRAP1 (tumor necrosis factor receptor-associated protein 1), GRP94 (94 kDa glucose-regulated protein) of the endoplasmic reticulum (ER), and chloroplast Hsp90C. It also includes the bacterial homologs of Hsp90, known as HtpG (High temperature protein G). The Hsp90 family of chaperones assists other proteins in folding correctly, stabilizes them against heat stress, and aids in protein degradation. 189
34461 340405 cd16928 HATPase_GyrB-like Histidine kinase-like ATPase domain of the B subunit of DNA gyrase. This family includes histidine kinase-like ATPase domain of the B subunit of DNA gyrase. Bacterial DNA gyrase is a type II topoisomerase (type II as it transiently cleaves both strands of DNA) which catalyzes the introduction of negative supercoils into DNA, possibly by a mechanism in which one segment of the double-stranded DNA substrate is passed through a transient break in a second segment. It consists of GyrA and GyrB subunits in an A2B2 stoichiometry; GyrA subunits catalyze strand-breakage and reunion reactions, and GyrB subunits hydrolyze ATP. DNA gyrase is found in bacteria, plants and archaea, but as it is absent in humans it is a possible drug target for the treatment of bacterial and parasite infections. 180
34462 340406 cd16929 HATPase_PDK-like Histidine kinase-like ATPase domain of pyruvate dehydrogenase kinase, branched-chain alpha-ketoacid dehydrogenase kinase and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of all four PDK isoforms (pyruvate dehydrogenase kinases 1-4) that have been described in mammals, and other PDKs including Saccharomyces Pkp1p and Pkp2p. PDKs and phosphatases tightly regulate the mitochondrial pyruvate dehydrogenase complex (PDC) by reversible phosphorylation. PDC catalyzes the oxidative decarboxylation of pyruvate to acetyl-CoA, connecting glycolysis and the TCA cycle. Also included in this family is mammalian branched-chain alpha-ketoacid dehydrogenase kinase (BDK), a mitochondrial protein kinase that phosphorylates a subunit of the branched-chain alpha-ketoacid dehydrogenase (BCKD) complex, which catalyzes the oxidative decarboxylation of branched-chain alpha-ketoacids derived from leucine, isoleucine, and valine, a rate-limiting step in the oxidative degradation of these branched-chain amino acids. 169
34463 340407 cd16930 HATPase_TopII-like Histidine kinase-like ATPase domain of eukaryotic topoisomerase II. This family includes the histidine kinase-like ATPase (HATPase) domains of human topoisomerase IIA (TopIIA) and TopIIB, Saccharomyces cerevisiae TOP2p, and related proteins. These proteins catalyze the passage of DNA double strands through a transient double-strand break in the presence of ATP. 147
34464 340408 cd16931 HATPase_MORC-like Histidine kinase-like ATPase domain of human microrchidia (MORC) family CW-type zinc finger proteins MORC1-4, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domain of human microrchidia (MORC) family CW-type zinc finger proteins MORC1-4, and related domains. In addition to the HATPase domain, MORC family proteins have a CW-type zinc finger domain containing four conserved cysteines and two conserved tryptophans, and coiled-coil domains at the carboxy-terminus. MORC1 has cross-species differential methylation in association with early life stress, and genome-wide association with major depressive disorder (MDD). MORC2 is involved in several nuclear processes, including transcription modulation and DNA damage repair, and exhibits a cytosolic function in lipogenesis, adipogenic differentiation, and lipid homeostasis by increasing the activity of ACLY. MORC3 regulates p53, and is an antiviral factor which plays an important role during HSV-1 and HCMV infection, and is a positive regulator of influenza virus transcription. MORC4 is highly expressed in a subset of diffuse large B-cell lymphomas and has potential as a lymphoma biomarker. 118
34465 340409 cd16932 HATPase_Phy-like Histidine kinase-like ATPase domain of plant phytochromes similar to Arabidopsis thaliana Phytochrome A, B, C, D and E. This family includes the histidine kinase-like ATPase (HATPase) domains of plant red/far-red photoreceptors, the phytochromes, and includes the Arabidopsis thaliana phytochrome family phyA-phyE. Following red light absorption, biologically inactive forms of phytochromes convert to active forms, which rapidly convert back to inactive forms upon far-red light irradiation. Phytochromes can be considered as having an N-terminal photosensory region to which a bilin chromophore is bound, and a C-terminal output region, which includes the HATPase domain represented here, and is involved in dimerization and presumably contributes to relaying the light signal to downstream signaling events. 113
34466 340410 cd16933 HATPase_TopVIB-like Histidine kinase-like ATPase domain of type IIB topoisomerase, Topo VI, subunit B. This family includes the histidine kinase-like ATPase (HATPase) domain of the B subunit of topoisomerase VI (Topo VIB). Topo VI is a heterotetrameric complex composed of two TopVIA and two TopVIB subunits and is categorized as a type II B DNA topoisomerase. It is found in archaea and also in plants. Type II enzymes cleave both strands of a DNA duplex and pass a second duplex through the resulting break in an ATP-dependent mechanism. DNA cleavage by Topo VI generates two-nucleotide 5'-protruding ends. 203
34467 340411 cd16934 HATPase_RsbT-like Histidine kinase-like ATPase domain of the anti sigma-B factor Bacillus subtilis serine/threonine-protein kinase RsbT, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domain of Bacillus subtilis serine/threonine-protein kinase RsbT, a component of the stressosome signaling complex of Bacillus subtilis. The stressosome is formed from multiple copies of three proteins, a sensor protein RsbR, a scaffold protein RsbS, and RsbT, and responds to environmental changes by initiating a protein partner switching cascade. Stress perception increases the phosphorylation of RsbR and RsbS by RsbT. Subsequent dissociation of RsbT from the stressosome activates the sigma-B cascade, leading to the release of the alternative sigma factor, sigma-B. 117
34468 340412 cd16935 HATPase_AgrC-ComD-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Staphylococcus aureus AgrC and Streptococcus pneumoniae ComD which are involved in quorum sensing. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) including Staphylococcus aureus AgrC which is an HK of the accessory gene regulator (agr) quorum sensing two-component regulatory system (TCS) AgrC-AgrA. The agr system plays a part in the transition from persistent to virulent phenotype. This family also includes Streptococcus pneumoniae ComD HK of the ComD-ComE TCS, involved in quorum sensing and genetic competence. 134
34469 340413 cd16936 HATPase_RsbW-like Histidine kinase-like ATPase domain of RsbW, an anti sigma-B factor and serine-protein kinase involved in regulating sigma-B during stress in Bacilli, and related domains. This family includes histidine kinase-like ATPase (HATPase) domain of RsbW, an anti sigma-B factor as well as a serine-protein kinase involved in regulating sigma-B during stress in Bacilli. The alternative sigma factor sigma-B is an important regulator of the general stress response of Bacillus cereus and B. subtilis. RsbW is an anti-sigma factor while RsbV is an anti-sigma factor antagonist (anti-anti-sigma factor). RsbW can also act as a kinase on RsbV. In a partner-switching mechanism, RsbW, RsbV, and sigma-B participate as follows: in non-stressed cells, sigma-B is present in an inactive form complexed with RsbW; in this form, sigma-B is unable to bind to RNA polymerase. Under stress, RsbV binds to RsbW, forming an RsbV-RsbW complex, and sigma-B is released to bind to RNA polymerase. RsbW may then act as a kinase on RsbV, phosphorylating a serine residue; RsbW is then released to bind to sigma-B, hence blocking its ability to bind RNA polymerase. A phosphatase then dephosphorylates RsbV so that it can again form a complex with RsbW, leading to the release of sigma-B. 91
34470 340414 cd16937 HATPase_SMCHD1-like Histidine kinase-like ATPase domain of structural maintenance of chromosomes flexible hinge domain containing 1 (SMCHD1) protein. This family includes histidine kinase-like ATPase (HATPase) domain of structural maintenance of chromosomes flexible hinge domain containing 1 (SMCHD1) protein, which is involved in gene silencing and in DNA damage. It has critical roles in X-chromosome inactivation and is also an important regulator of autosomal gene clusters. Upon DNA damage, SMCHD1 promotes non-homologous end joining and inhibits homologous recombination repair. SMCHD1 is implicated in the pathogenesis of facioscapulohumeral muscular dystrophy. 119
34471 340415 cd16938 HATPase_ETR2_ERS2-EIN4-like Histidine kinase-like ATPase domain of Arabidopsis thaliana ETR2, ERS2, and EIN4, and related domains. This family includes the histidine kinase-like ATPase domains (HATPase) of three out of the five receptors that recognize the plant hormone ethylene in Arabidopsis thaliana. These three proteins have been classified as belonging to subfamily 2: ETR2, ERS2, and EIN4. They lack most of the motifs characteristic of histidine kinases, and EIN4 is the only one in this group containing the conserved histidine that is phosphorylated in two-component and phosphorelay systems. This family also includes the HATPase domains of Escherichia coli RcsD phosphotransferase which is a component of the Rcs-signaling system, a complex multistep phosphorelay involving five proteins, and is involved in many transcriptional networks such as cell division, biofilm formation, and virulence, among others. Also included is Schizosaccharomyces pombe Mak3 (Phk1) which participates in a multi-step two-component related system which regulates H2O2-induced activation of the Sty1 stress-activated protein kinase pathway. Most proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a GAF sensor domain; most are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 133
34472 340416 cd16939 HATPase_RstB-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Salmonella typhimurium RstB. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Salmonella typhimurium RstB HK of the RstA-RstB two-component regulatory system (TCS), which regulates expression of the constituents participating in pyrimidine metabolism and iron acquisition, and may be required for regulation of Salmonella motility and invasion. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a HAMP sensor domain. 104
34473 340417 cd16940 HATPase_BasS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli BasS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) similar to Escherichia coli BasS HK of the BasS-BasR two-component regulatory system (TCS). Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some contain a HAMP sensory domain, while some an N-terminal two-component sensor kinase domain. 113
34474 340418 cd16942 HATPase_SpoIIAB-like Histidine kinase-like ATPase domain of SpoIIAB, an anti sigma-F factor and serine-protein kinase involved in regulating sigma-F during sporulation in Bacilli, and related domains. This family includes histidine kinase-like ATPase (HATPase) domain of SpoIIAB, an anti sigma-F factor and a serine-protein kinase involved in regulating sigma-F during sporulation in Bacilli where, early in sporulation, the cell divides into two unequal compartments: a larger mother cell and a smaller forespore. Sigma-F transcription factor is activated in the forespore directly after the asymmetric septum forms, and its spatial and temporal activation is required for sporulation. Free sigma-F can associate with the RNA polymerase core and activate transcription of the sigma-F regulon; its regulation may comprise a partner-switching mechanism involving SpoIIAB, SpoIIAA, and sigma-F as follows: SpoIIAB can form alternative complexes with either: i) sigma-F, holding it in an inactive form and preventing its association with RNA polymerase, or ii) unphosphorylated SpoIIAA and a nucleotide, either ATP or ADP. In the presence of ATP, SpoIIAB acts as a kinase to specifically phosphorylate a serine residue of SpoIIAA; this phosphorylated form has low affinity for SpoIIAB and dissociates, making SpoIIAB available to capture sigma-F. SpoIIAA may then be dephosphorylated by a SpoIIE serine phosphatase and be free to attack the SpoIIAB sigma-F complex to induce the release of sigma-F. 135
34475 340419 cd16943 HATPase_AtoS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli K-12 AtoS. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Escherichia coli AtoS, an HK of the AtoS-AtoC TCS. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have accessory domains such as HAMP or PAS sensor domains or CBS-pair domains. 105
34476 340420 cd16944 HATPase_NtrY-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Azorhizobium caulinodans NtrY. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Azorhizobium caulinodans ORS571 NtrY of the NtrY-NtrX TCS, which is involved in nitrogen fixation and metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also have PAS sensor domains. 108
34477 340421 cd16945 HATPase_CreC-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli CreC. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli CreC of the CreC-CreB two-component regulatory system (TCS) involved in catabolic regulation. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and accessory sensory domain(s) such as HAMP, CACHE or PAS. 106
34478 340422 cd16946 HATPase_BaeS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli BaeS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) similar to Escherichia coli BaeS HK of the BaeS/BaeR two-component regulatory system (TCS), which responds to envelope stress. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a HAMP sensory domain. 109
34479 340423 cd16947 HATPase_YcbM-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis YcbM. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis YcbM, a HK of the two-component system YcbM-YcbL. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA). 125
34480 340424 cd16948 HATPase_BceS-YxdK-YvcQ-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis BceS, YxdK, and Bacillus thuringiensis YvcQ. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis BceS and Bacillus thuringiensis YvcQ, the HKs of the two-component regulatory system (TCSs) BceS-BceR and YvcQ-YvcP, respectively, which are both involved in regulating bacitracin resistance. It also includes the HATPase domain of YxdK, the HK of YxdK-YxdJ TCS involved in sensing antimicrobial compounds. 109
34481 340425 cd16949 HATPase_CpxA-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli CpxA. This family includes the histidine kinase-like ATPase (HATPase) domains of two-component sensor histidine kinase (HKs) similar to Escherichia coli CpxA, HK of the CpxA-CpxR two-component regulatory system (TCS) which may function in acid stress and in cell wall stability. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also contain a CpxA family periplasmic domain. 104
34482 340426 cd16950 HATPase_EnvZ-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli EnvZ and Pseudomonas aeruginosa BfmS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli EnvZ of the EnvZ-OmpR two-component regulatory system (TCS), which functions in osmoregulation. It also contains the HATPase domain of Pseudomonas aeruginosa BfmS, the HK of the BfmSR TCS, which functions in the regulation of the rhl quorum-sensing system and bacterial virulence in P. aeruginosa. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also contain a periplasmic domain. 101
34483 340427 cd16951 HATPase_EL346-LOV-HK-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Erythrobacter litoralis blue light-activated histidine kinase 2. This domain family includes the histidine kinase-like ATPase (HATPase) domain of blue light-activated histidine kinase 2 of Erythrobacter litoralis (EL346). Signaling commonly occurs within HK dimers; however, EL346 functions as a monomer. Also included in this family are the HATPase domains of ethanolamine utilization sensory transduction histidine kinase (EutW), whereby regulation of ethanolamine, a carbon and nitrogen source for gut bacteria, results in autophosphorylation and subsequent phosphoryl transfer to a response regulator (EutV) containing an RNA-binding domain. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have an accessory PAS sensor domain, while some have an N-terminal histidine kinase domain. 131
34484 340428 cd16952 HATPase_EcPhoR-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli PhoR. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Escherichia coli or Vibrio cholerae PhoR, the histidine kinase (HK) of PhoB-PhoR, a two-component signal transduction system (TCS) involved in phosphate regulation. PhoR monitors extracellular inorganic phosphate (Pi) availability and PhoB, the response regulator, regulates transcription of genes of the phosphate regulon. PhoR is a bifunctional histidine autokinase/phospho-PhoB phosphatase; in phosphate deficiency, it autophosphorylates and Pi is transferred to PhoB, and when environmental Pi is abundant, it removes the phosphoryl group from phosphorylated PhoB. Other roles of PhoB-PhoR TCS have been described, including motility, biofilm formation, intestinal colonization, and virulence in V. cholerae. E. coli PhoR and Bacillus subtilis PhoR (whose HATPase domain belongs to a different family) sense very different signals in each bacterium. In E. coli the PhoR signal comes from phosphate transport mediated by the PstSCAB2 phosphate transporter and the PhoU chaperone-like protein while in B. subtilis, the PhoR activation signal comes from wall teichoic acid (WTA) metabolism. 108
34485 340429 cd16953 HATPase_BvrS-ChvG-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Brucella abortus BvrS and Sinorhizobium meliloti ChvG. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Brucella abortus BvrS of the BvrR-BvrS two-component regulatory system (TCS), which controls cell invasion and intracellular survival, as well as Sinorhizobium meliloti and Agrobacterium tumefaciens ChvG of the ChvI-ChvG TCS necessary for endosymbiosis and pathogenicity in plants. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), an accessory HAMP sensor domain, a periplasmic stimulus-sensing domain, and some also have a sensor N-terminal transmembrane domain. 110
34486 340430 cd16954 HATPase_PhoQ-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli PhoQ and Providencia stuartii AarG. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Escherichia coli PhoQ and Providencia stuartii AarG. PhoQ is the histidine kinase (HK) of the PhoP-PhoQ two-component regulatory system (TCS), which responds to the levels of Mg2+ and Ca2+, controls virulence, mediates the adaptation to Mg2+-limiting environments, and regulates numerous cellular activities. Providencia stuartii AarG is a putative sensor kinase which controls the expression of the 2'-N-acetyltransferase and an intrinsic multiple antibiotic resistance (Mar) response in Providencia stuartii. The AarG product is similar to PhoQ in that it is able to restore wild-type levels of resistance to a Salmonella typhimurium phoQ mutant. However, the expression of the 2'-N-acetyltransferase gene and of aarP (a gene encoding a transcriptional activator of 2'-N-acetyltransferase) are not significantly affected by the levels of Mg2+ or Ca2+. Most proteins in this group contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have an accessory HAMP sensor domain, and some have an intracellular membrane-interaction PhoQ sensor domain. 135
34487 340431 cd16955 HATPase_YpdA-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YpdA. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli YpdA, a HK of the two-component system (TCS) YpdA-YpdB which is involved in a nutrient sensing regulatory network with YehU-YehT. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), and some have a GAF sensor domain; some contain a DUF3816 domain; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain. 102
34488 340432 cd16956 HATPase_YehU-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YehU. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) including Escherichia coli YehU, a HK of the two-component system (TCS) YehU-YehT which is involved in a nutrient sensing regulatory network. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase); some have a GAF sensor domain while some have a cupin domain. 101
34489 340433 cd16957 HATPase_LytS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis LytS and Staphylococcus aureus LytS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis LytS, a HK of the two-component system (TCS) LytS-LytR needed for growth on pyruvate, and Staphylococcus aureus LytS-LytR TCS involved in the adaptation of S. aureus to cationic antimicrobial peptides. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), and a GAF sensor domain; most contain a DUF3816 domain. 106
34490 341131 cd16961 RMtype1_S_TRD-CR_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR) and similar domains. The restriction-modification (RM) system S subunit generally consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This superfamily represents a single TRD-CR unit; in addition to type I TRD-CR units, it includes RMtype1_S_TRD-CR_like domains of various putative Helicobacter type II restriction enzymes and methyltransferases, such as Hci611ORFHP and HfeORF12890P, as well as TRD-CR-like sequence-recognition domains of the M subunit of putative type I DNA methyltransferase such as M2.CinURNWORF2828P and M.Mae7806ORF3969P. 178
34491 340813 cd16962 RuvC Crossover junction endodeoxyribonuclease RuvC. Crossover junction endodeoxyribonuclease RuvC is also called Holliday junction resolvase RuvC. It is part of the RuvABC pathway in Escherichia coli and other Gram-negative bacteria that is involved in processing Holliday junctions, which are formed by the reciprocal exchange of strands between two DNA duplexes. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. RuvC is thought to bind either on the open, DNA exposed face of a single RuvA tetramer, or to replace one of the two tetramers. Binding is proposed to be mediated by an unstructured loop on RuvC, which becomes structured on binding RuvA. RuvC can be bound to the complex in either orientation, therefore resolving Holliday junctions in either a horizontal or vertical manner. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi. These are referred to as the RuvC family of Holliday junction resolvases, RuvC being the Escherichia coli HJR. RuvC and its orthologs are homodimers and display structural similarity to RNase H and Hsp70. 153
34492 340814 cd16963 CCE1 fungal mitochondrial Holliday junction resolvases similar to Saccharomyces cerevisiae CCE1. Saccharomyces cerevisiae Cruciform cutting endonuclease 1 (CCE1) is a Holliday junction resolvase specific for 4-way junctions. CCE1 is involved in the maintenance of mitochondrial DNA. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. Holliday junctions are formed by the reciprocal exchange of strands between two DNA duplexes. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi; they may form homodimers and display structural similarity to RNase H and Hsp70. 290
34493 340815 cd16964 YqgF putative pre-16S rRNA nuclease YqgF and RuvX family. Escherichia coli YqgF has been shown to act as a pre-16S rRNA nuclease, presumably as a monomer. It is involved in the processing of pre-16S rRNA during ribosome maturation. The RuvX gene product from Mycobacterium tuberculosis was shown to act, in a dimeric form, as a Holliday junction resolvase (HJR). HJRs are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. Holliday junctions are formed by the reciprocal exchange of strands between two DNA duplexes. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi; they may form homodimers and display structural similarity to RNase H and Hsp70. 132
34494 341215 cd16965 Alpha_kinase_ChaK Alpha-kinase domain of channel kinases. This group is composed of channel kinases 1 (ChaK1) and 2 (ChaK2), and similar proteins. ChaK1 and ChaK2 are also called transient receptor potential cation channel subfamily M members 7 (TRPM7) and 6 (TRPM6), respectively. They are fusion proteins containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. They are both cation-selective channels that preferentially permeate Zn2+, Mg2+, and Ca2+ ions. They are central regulators of Mg2+ and Ca2+ homeostasis. TRPM7 is ubiquitously expressed while TRPM6 is highly expressed in specific tissues such as the kidney and intestine. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34495 341216 cd16966 Alpha_kinase_ALPK2_3 Alpha-kinase domain of alpha-protein kinases 2 and 3. Alpha-protein kinases 2 (ALPK2) and 3 (ALPK3) are also called heart alpha-protein kinase (HAK) and muscle alpha-protein kinase (MAK), respectively. They both contain a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Loss of function mutations in ALPK3 can cause early-onset and familial cardiomyopathy in humans. The ALPK2 gene may also be a novel candidate gene for inherited hypertension in Dahl rats. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34496 341217 cd16967 Alpha_kinase_eEF2K Alpha-kinase domain of eukaryotic elongation factor-2 kinase. Eukaryotic elongation factor-2 kinase (eEF2K) is also called calcium/calmodulin (CaM)-dependent eEF2K. It phosphorylates eukaryotic elongation factor-2 (EEF2) at a single site, leading to its inactivation and inability to bind ribosomes, and slowing down the elongation stage of protein synthesis. It has been linked to many human diseases including cardiovascular conditions (atherosclerosis) and pulmonary arterial hypertension, as well as solid tumors and neurological disorders. eEF2K is an atypical protein kinase containing a CaM binding region, an alpha-kinase catalytic domain, and TPR-like Sel1 repeats at the C-terminus. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 216
34497 341218 cd16968 Alpha_kinase_MHCK_like Alpha-kinase domain of myosin heavy chain kinase and similar domains. This group is composed of alpha-kinase domains of Dictyostelium discoideum myosin heavy chain kinases A-D (MHCKA, MHCKB, MHCKC, MHCKD), alpha-protein kinase 1 (AK1), and similar proteins. The myosin heavy chain kinases are involved in regulating myosin II filament assembly in Dictyostelium discoideum. They phosphorylate target threonine residues located in the carboxyl-terminal portion of the myosin II heavy chain (MHC) tail, resulting in filament disassembly. The different MHCK isoforms display different spatial regulation, indicating specific roles for each isoform in fine tuning the Dictyostelium actomyosin cytoskeleton. They all contain an alpha-kinase domain as well as WD40 repeats at the C-terminus. AK1 contains an N-terminal Arf-GAP domain and a central alpha-kinase domain. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 202
34498 341219 cd16969 Alpha_kinase_ALPK1 Alpha-kinase domain of alpha-protein kinase 1. Alpha-protein kinase 1 is also called chromosome 4 kinase or lymphocyte alpha-protein kinase (LAK). ALPK1 is implicated in epithelial cell polarity and exocytic vesicular transport towards the apical plasma membrane. It resides on Golgi-derived vesicles where it phosphorylates myosin IA, a motor protein that regulates the delivery of vesicles to the plasma-membrane. It may be associated with inflammation-related diseases such as gout and type 2 diabetes mellitus. ALPK1 contains a C-terminal alpha-kinase domain, an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 227
34499 341220 cd16970 Alpha_kinase_VwkA_like Alpha-kinase domain of Dictyostelium discoideum VwkA and similar domains. Dictyostelium discoideum alpha-protein kinase VwkA is also called von Willebrand factor A alpha-kinase or vWF kinase. It influences myosin II abundance and assembly behavior as vWKA gene disruption leads to significant myosin II overassembly. VwkA also serves a critical conserved role in the periodic contractions of the contractile vacuole through its regulation of the myosin II cortical cytoskeleton. It contains a vWFa domain (named after its homology to von Willebrand factor A, a plasma glycoprotein essential for proper blood clotting) and a C-terminal alpha-kinase domain. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 227
34500 341221 cd16971 Alpha_kinase_ChaK1_TRMP7 Alpha-kinase domain of channel kinase 1, also called transient receptor potential cation channel subfamily M member 7. Channel kinase 1 (ChaK1), also called transient receptor potential cation channel subfamily M member 7 (TRPM7) or long transient receptor potential channel 7 (LTrpC7), is a fusion protein containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. It is ubiquitously expressed and is a cation-selective channel that preferentially permeates Zn2+, Mg2+, and Ca2+ ions. It is a central regulator of Mg2+ and Ca2+ homeostasis. TRPM7 plays a role in cancer proliferation, stroke, hydrogen peroxide dependent neurodegeneration, and heavy metal toxicity. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34501 341222 cd16972 Alpha_kinase_ChaK2_TRPM6 Alpha-kinase domain of channel kinase 2, also called transient receptor potential cation channel subfamily M member 6. Channel kinase 2 (ChaK2), also called transient receptor potential cation channel subfamily M member 6 (TRPM6) or melastatin-related TRP cation channel 6, is a fusion protein containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. It is highly expressed in the kidney and intestine. It is a cation-selective channel that preferentially permeates Zn2+, Mg2+, and Ca2+ ions. It is a central regulator of Mg2+ and Ca2+ homeostasis. TRPM6 is considered to be the Mg2+ entry pathway in the distal convoluted tubule of the kidney, where it functions as a gatekeeper for controlling the body's Mg2+ balance. Mutations in the TRPM6 gene cause the autosomal recessive disorder hypomagnesemia with secondary hypocalcemia, which often results in severe muscular and neurologic complications from early infancy that can lead to neurologic damage or cardiac arrest if left untreated. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34502 341223 cd16973 Alpha_kinase_ALPK3 Alpha-kinase domain of alpha-protein kinase 3. Alpha-protein kinase 3 (ALPK3) is also called muscle alpha-protein kinase (MAK) or myocytic induction/differentiation originator (Midori). Its expression is restricted to fetal and adult heart and adult skeletal muscle, and is localized in the nucleus. It is thought to act as a transcriptional regulator implicated in early cardiac development. Loss of function mutations in ALPK3 can cause early-onset and familial cardiomyopathy in humans. ALPK3 contains a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34503 341224 cd16974 Alpha_kinase_ALPK2 Alpha-kinase domain of alpha-protein kinase 2. Alpha-protein kinase 2 (ALPK2) is also called heart alpha-protein kinase (HAK). Little functional information is known about ALPK2. In a three-dimensional colonic-crypt model, it has been identified as crucial for luminal apoptosis and expression of DNA repair-related genes, possibly in the transition of normal colonic crypt to adenoma. The ALPK2 gene may also be a novel candidate gene for inherited hypertension in Dahl rats. ALPK2 contains a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 239
34504 340434 cd16975 HATPase_SpaK_NisK-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis SpaK and Lactococcus lactis NisK. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Bacillus subtilis SpaK and Lactococcus lactis NisK. SpaK is the histidine kinase (HK) of the SpaK-SpaR two-component regulatory system (TCS), which is involved in the regulation of the biosynthesis of lantibiotic subtilin. NisK is the HK of the NisK-NisR TCS, which is involved in the regulation of the biosynthesis of lantibiotic nisin. SpaK and NisK may function as membrane-associated protein kinases that phosphorylate SpaR and NisR, respectively, in response to environmental signals. 107
34505 340435 cd16976 HATPase_HupT_MifS-like Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Rhodobacter capsulatus HupT and Pseudomonas aeruginosa MifS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Rhodobacter capsulatus HupT of the HupT-HupR two-component regulatory system (TCS), which regulates the synthesis of HupSL, a membrane bound [NiFe]hydrogenase. It also contains the HATPase domain of Pseudomonas aeruginosa MifS, the HK of the MifS-MifR TCS, which may be involved in sensing alpha-ketoglutarate and regulating its transport and subsequent metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some also have a C-terminal PAS sensor domain. 102
34506 340774 cd16977 VHS_GGA VHS (Vps27/Hrs/STAM) domain of GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) subfamily. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 133
34507 340775 cd16978 VHS_HSE1 VHS (Vps27/Hrs/STAM) domain of Class E vacuolar protein-sorting machinery protein HSE1. Class E vacuolar protein-sorting machinery protein HSE1, together with Vps27, comprises the ESCRT-0 complex, the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB). The complex directly binds to ubiquitinated transmembrane proteins and recruits both ubiquitin ligases and deubiquitinating enzymes. It is also required for the efficient recycling of late Golgi proteins including the carboxypeptidase Y (CPY) sorting receptor, Vps10. Similar to metazoan STAMs, HSE1 contains: an N-terminal VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats; a Ubiquitin-Interacting Motif (UIM); a SH3 (Src Homology 3) domain, a well-established protein-protein interaction domain; and a GAT (GGA and TOM) domain, which is essential for the normal sorting function of HSE1. 134
34508 340776 cd16979 VHS_Vps27 VHS (Vps27/Hrs/STAM) domain of Vacuolar protein sorting-associated protein 27. Vacuolar protein sorting-associated protein 27 (Vps27 or Vps27p) is also called Golgi retention defective protein 11, and is the yeast homolog of Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate). Together with class E vacuolar protein-sorting machinery protein HSE1, it comprises the ESCRT-0 complex, the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB). The complex directly binds to ubiquitinated transmembrane proteins and recruits both ubiquitin ligases and deubiquitinating enzymes. It is also required for the efficient recycling of late Golgi proteins including the carboxypeptidase Y (CPY) sorting receptor, Vps10. Vps27 contains domains and motifs similar to those of Hrs; it contains an N-terminal VHS domain, which has a superhelical structure similar to the structure of ARM (Armadillo) repeats, a FYVE (Fab1p, YOTB, Vac1p, and EEA1) zinc finger domain, two Ubiquitin-Interacting Motifs (UIMs), a GAT (GGA and TOM) domain, two P(S/T)XP motifs that recruit ESCRT-I, and a short peptide motif near the C-terminus that recruits clathrin. 141
34509 340777 cd16980 VHS_Lsb5 VHS (Vps27/Hrs/STAM) domain of LAS seventeen-binding protein 5. LAS seventeen-binding protein 5 (LAS17-binding protein 5, Lsb5, or Lsb5p) localizes to the plasma membrane and plays a role in endocytosis in yeast. It interacts with actin regulators Sla1p and Las17p, ubiquitin, and Arf3p, coupling actin dynamics to membrane trafficking processes. Lsb5p contains an N-terminal VHS domain and a GAT (GGA and TOM) domain. The VHS domain has a superhelical structure similar to the structure of ARM (Armadillo) repeats. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. 132
34510 340778 cd16981 CID_RPRD_like CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing proteins. This family is composed of Regulation of nuclear pre-mRNA domain-containing proteins 1A (RPRD1A), 1B (RPRD1B), 2 (RPRD2), yeast Rtt103, and similar proteins. RPRD1A, RPRD1B, and RPRD2 are CID (CTD-Interacting Domain) containing proteins that co-purify with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. Yeast transcription termination factor Rtt103 is a CID containing protein that functions in DNA damage response. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 125
34511 340779 cd16982 CID_Pcf11 CID (CTD-Interacting Domain) of Pcf11. Pcf11 is conserved across eukaryotes. The best studied protein is Saccharomyces cerevisiae Pcf11, also called protein 1 of CF I, an essential subunit of the cleavage factor IA (CFIA) complex which is required for polyadenylation-dependent pre-mRNA 3'-end processing and RNA polymerase (Pol) II (RNAP II) transcription termination. Human Pcf11, also referred to as pre-mRNA cleavage complex 2 protein Pcf11, has been shown to enhance degradation of RNAP II-associated nascent RNA and transcriptional termination. The family also includes plant PCFS4 (Pcf11-similar-4 protein or Polyadenylation and cleavage factor homolog 4) and Caenorhabditis elegans Polyadenylation and cleavage factor homolog 11. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. Pcf11 CID preferentially interacts with CTD phosphorylated at Ser2. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 127
34512 340780 cd16983 CID_SCAF8_like CID (CTD-Interacting Domain) of SR-related and CTD-associated factor 8 and similar proteins. This subfamily includes SR-related and CTD-associated factors 8 (SCAF8) and 4 (SCAF4), and similar proteins. SCAF4 is also called Splicing factor arginine serine rich 15 (SFRS15). Members may play roles in mRNA processing. Both SCAF4 and SCAF8 contain a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 131
34513 340781 cd16984 CID_Nrd1_like CID (CTD-Interacting Domain) of Nrd1 and similar proteins. This subfamily includes Saccharomyces cerevisiae protein Nrd1, Schizosaccharomyces pombe Rpb7-binding protein Seb1, and similar proteins. Nrd1 cooperates with Nab3 and Sen1, also called the Nrd1-Nab3-Sen1 (NNS) complex, to terminate the transcription by RNA polymerase (Pol) II (RNAPII) of many noncoding RNAs (ncRNAs), including small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), and cryptic unstable transcripts (CUTs). Schizosaccharomyces pombe Seb1 does not function in an NNS-like termination pathway but promotes polyadenylation site selection of coding and noncoding genes. It cotranscriptionally controls alternative polyadenylation. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. Nrd1 CID preferentially interacts with CTD phosphorylated at Ser5. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 145
34514 340782 cd16985 ANTH_N_AP180 ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of adaptor protein 180 (AP180) subfamily. The Adaptor Protein 180 (AP180) subfamily members are phosphatidylinositol-binding clathrin assembly proteins, including mammalian clathrin coat assembly protein AP180 and Clathrin Assembly Lymphoid Myeloid Leukemia protein (CALM), Drosophila LAP (also called Like-AP180 or AP180), and Caenorhabditis elegans Uncoordinated protein 11 (unc-11, also called AP180-like adaptor protein). They are components of the adaptor complexes which link clathrin to receptors in coated vesicles. AP180 and CALM play important roles in clathrin-mediated endocytosis. AP180, also called 91 kDa synaptosomal-associated protein (SNAP91) or phosphoprotein F1-20, is a brain-specific clathrin-binding protein which stimulates clathrin assembly during the recycling of synaptic vesicles. CALM, also called phosphatidylinositol binding clathrin assembly protein (PICALM), is ubiquitously expressed. Members of this subfamily contain ANTH domains, which bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of the Adaptor Protein 180 (AP180) subfamily. 117
34515 340783 cd16986 ANTH_N_Sla2p_HIP1_like ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Sla2p/HIP1/HIP1R subfamily. Members of the Sla2p/HIP1/HIP1R subfamily share a common domain architecture, containing an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. HIP1 was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. Both HIP1 and HIP1R promote clathrin assembly in vitro. Yeast Sla2p is a regulator of membrane cytoskeleton assembly. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. While the ANTH domain of Sla2p preferentially binds PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome, mammalian HIP1 and HIP1R were found to preferentially bind PtdIns(3,4)P2 and PtdIns(3,5)P2, respectively. This model describes the N-terminal region of ANTH domains of the Sla2p/HIP1/HIP1R subfamily. 117
34516 340784 cd16987 ANTH_N_AP180_plant ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of plant Clathrin coat assembly protein AP180 and similar proteins. This subfamily is composed of plant clathrin coat assembly protein AP180 and other ANTH domain containing proteins that are yet to be characterized. Arabidopsis thaliana AP180 (At-AP180) is a binding partner of plant alphaC-adaptin; it functions as a clathrin assembly protein that promotes the formation of cages with an almost uniform size distribution. In addition to At-AP180, Arabidopsis thaliana contains many ANTH domain containing proteins labelled as putative clathrin assembly proteins included in this subfamily such as At4g02650, At5g10410, At2g25430, and At1g33340, among others. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of plant clathrin coat assembly protein AP180 and similar proteins. 122
34517 340785 cd16988 ANTH_N_YAP180 ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins. This subfamily includes yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins. There are two YAP180 proteins in Saccharomyces cerevisiae, AP180A (yAP180A or YAP1801) and AP180B (yAP180B or YAP1802). They are involved in endocytosis and clathrin cage assembly. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins. 117
34518 340786 cd16989 ENTH_EpsinR Epsin N-Terminal Homology (ENTH) domain of Epsin-related protein. Epsin-related protein (EpsinR) is also called clathrin interactor 1 (Clint), enthoprotin, or epsin-4. It is a clathrin-coated vesicle (CCV) protein that binds to membranes enriched in phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), clathrin, and the gamma appendage domain of the adaptor protein complex 1 (AP1). It contains an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. The ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. The ENTH domain of human epsinR binds directly to the helical bundle domain of the mouse SNARE Vti1b; soluble NSF attachment protein receptors (SNAREs) are type II transmembrane proteins that have critical roles in providing the specificity and energy for transport-vesicle fusion. Specific ENTH domains may also function as protein cargo selection/recognition modules. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 130
34519 340787 cd16990 ENTH_Epsin Epsin N-Terminal Homology (ENTH) domain of Epsin family. Members of the epsin family play an important role as accessory proteins in clathrin-mediated endocytosis. They are important factors in clathrin-coated vesicle (CCV) generation. They contribute to membrane deformation and play a key function as adaptor proteins, coupling various components of clathrin-mediated uptake. They also have an important role in selecting and recognizing cargo. Three isoforms have been identified in mammals, epsin-1 to -3, and these are conserved in vertebrates. Epsin-1 is highly enriched and represents the dominant isoform in the brain. It is required for proper synaptic vesicle retrieval and modulates the endocytic capacity of synaptic vesicles. Epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. The ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of CCVs. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 124
34520 340788 cd16991 ENTH_Ent1_Ent2 Epsin N-Terminal Homology (ENTH) domain of Yeast Ent1, Ent2, and similar proteins. This subfamily is composed of the two orthologs of epsin in Saccharomyces cerevisiae, Epsin-1 (Ent1 or Ent1p) and Epsin-2 (Ent2 or Ent2p), and similar proteins. Yeast single epsin knockouts, either Ent1 or Ent2, are viable while the double knockout is not. Yeast epsins are required for endocytosis and localization of actin. Ent2 also plays a signaling role during cell division. The ENTH domain of Ent2 interacts with the septin organizing, Cdc42 GTPase activating protein, Bem3, leading to increased cytokinesis failure when overexpressed. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 132
34521 340789 cd16992 ENTH_Ent3 Epsin N-Terminal Homology (ENTH) domain of Yeast Ent3 and similar proteins. This subfamily is composed of one of two epsinR orthologs present in Saccharomyces cerevisiae, Epsin-3 (Ent3 or Ent3p), and similar proteins. Ent3 is an adaptor protein at the Trans-Golgi Network (TGN); it cooperates with yeast SNARE Vti1p to regulate transport from the TGN to the prevacuolar endosome. Ent3 facilitates the interaction between Gga2p with both the endosomal syntaxin Pep12p and clathrin in the GGA-dependent transport to the late endosome. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. Similar to mammalian epsinR, the ENTH domain of Ent3 binds to the yeast SNARE Vti1p; soluble NSF attachment protein receptors (SNAREs) are type II transmembrane proteins that have critical roles in providing the specificity and energy for transport-vesicle fusion. Specific ENTH domains may also function as protein cargo selection/recognition modules. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 121
34522 340790 cd16993 ENTH_Ent5 Epsin N-Terminal Homology (ENTH) domain of Yeast Ent5 and similar proteins. This subfamily is composed of one of two epsinR orthologs present in Saccharomyces cerevisiae, Epsin-5 (Ent5 or Ent5p), and similar proteins. Ent5 is required, together with Ent3 and Vps27p for ubiquitin-dependent protein sorting into the multivesicular body. It is also required for protein transport from the Trans-Golgi Network (TGN) to the vacuole. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 158
34523 340791 cd16994 ENTH_Ent4 Epsin N-Terminal Homology (ENTH) domain of Yeast Ent4 and similar proteins. Yeast Epsin-4 (Ent4 or Ent4p) has been reported to be involved in the Trans-Golgi Network (TGN)-to-vacuole sorting of Arn1p, a transporter for the uptake of ferrichrome, an important nutritional source of iron. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 126
34524 340792 cd16995 VHS_Tom1 VHS (Vps27/Hrs/STAM) domain of Target of Myb protein 1. Tom1 (Target of myb1 - retroviral oncogene) is a novel negative regulator of interleukin-1 and tumor necrosis factor-induced signaling pathways. It also plays important roles in protein-degradation systems in Alzheimer's disease pathogenesis. Tom1 contains VHS and GAT domains in the N-terminal and central region, respectively. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. The VHS domain of Tom1 is essential for its function as a negative regulator. 137
34525 340793 cd16996 VHS_Tom1L2 VHS (Vps27/Hrs/STAM) domain of TOM1-like protein 2. TOM1-like protein 2 (Tom1L2) is a member of the Tom1 (Target of myb1) subfamily, characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Studies in Tom1L2 hypomorphic mice suggest that Tom1L2 may play roles in immune responses and tumor suppression. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. 137
34526 340794 cd16997 VHS_Tom1L1 VHS (Vps27/Hrs/STAM) domain of TOM1-like protein 1. TOM1-like protein 1 (Tom1L1) is also called Src-activating and signaling molecule protein (Srcasm). It is a member of the Tom1 (Target of myb1) subfamily, characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Tom1L1 has been implicated in multivesicular body (MVB) formation, viral egress from the cell, and cytokinesis. Its amplification enhances the metastatic progression of ERBB2-positive breast cancers. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. 137
34527 340795 cd16998 VHS_GGA_fungi VHS (Vps27/Hrs/STAM) domain of fungal GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) proteins. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. Yeast GGAs facilitate the specific and direct delivery of vacuolar sorting receptor Vps10p and the processing protease Kex2p from the TGN to the late endosome/prevacuolar compartment (PVC). The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 139
34528 340796 cd16999 VHS_STAM2 VHS (Vps27/Hrs/STAM) domain of Signal Transducing Adapter Molecule 2. Signal Transducing Adapter Molecule 2 (STAM2) is also called EAST (EGFR-Associated protein with SH3 and TAM domain) and Hbp (Hrs-binding protein). It is highly expressed in neurons, where it is localized in the nucleus. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMS, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. 139
34529 340797 cd17000 VHS_STAM1 VHS (Vps27/Hrs/STAM) domain of Signal Transducing Adapter Molecule 1. Signal Transducing Adapter Molecule 1 (STAM1) is part of a crucial regulatory axis for the ventral axonal trajectory of developing spinal motor neurons. It forms a complex with beta-arrestin, which regulates lysosomal trafficking of the chemokine receptor CXCR4 and also mediates CXCR4-dependent chemotaxis. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMS, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. 131
34530 340798 cd17001 CID_RPRD2 CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 2. Regulation of nuclear pre-mRNA domain-containing protein 2 (RPRD2) is a CID (CTD-Interacting Domain) domain containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 125
34531 340799 cd17002 CID_RPRD1 CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1 and similar proteins. This subfamily contains Regulation of nuclear pre-mRNA domain-containing proteins 1A (RPRD1A) and 1B (RPRD1B) from jawed vertebrates, CID domain-containing protein 1 (CIDS1 or cids-1) from Caenorhabditis elegans, and similar proteins. RPRD1A and RPRD1B are CID (CTD-Interacting Domain) containing proteins that co-purify with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1A and RPRD1B form homodimers and heterodimers through their coiled-coil domains. Both associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. The function of CIDS1 is not yet known. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 128
34532 340800 cd17003 CID_Rtt103 CID (CTD-Interacting Domain) of yeast transcription termination factor Rtt103 and similar proteins. Yeast transcription termination factor Rtt103 is a CID (CTD-Interacting Domain) containing protein that functions in DNA damage response. It associates with sites of DNA breaks and is essential for recovery from DNA double strand breaks in the chromosome. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). Rtt103 CID preferentially interacts with CTD phosphorylated at Ser2. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 127
34533 340801 cd17004 CID_SCAF8 CID (CTD-Interacting Domain) of SR-related and CTD-associated factor 8. SR-related and CTD-associated factor 8 (SCAF8) is also called CDC5L complex-associated protein 7 (CCAP7) or RNA-binding motif protein 16 (RBM16). It may play a role in mRNA processing. SCAF8 contains a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 131
34534 340802 cd17005 CID_SFRS15_SCAF4 CID (CTD-Interacting Domain) of Splicing factor arginine serine rich 15. Splicing factor arginine serine rich 15 (SFRS15) is also called CTD-binding SR-like protein RA4 or SR-related and CTD-associated factor 4 (SCAF4). It may act to physically and functionally link transcription and pre-mRNA processing. SFRS15/SCAF4 contains a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 131
34535 340803 cd17006 ANTH_N_HIP1_like ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1 and related proteins. This subfamily includes Huntingtin-interacting protein 1 (HIP1), HIP1-related protein (HIP1R), and similar proteins. Mammalian HIP1 was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. HIP1 is expressed only in neurons while HIP1R is ubiquitously expressed. Together with its interacting partner HIPPI, HIP1 regulates apoptosis and gene expression. Both HIP1 and HIP1R promote clathrin assembly in vitro, and they share a common domain architecture, containing an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. Mammalian HIP1 and HIP1R were found to preferentially bind PtdIns(3,4)P2 and PtdIns(3,5)P2, respectively, instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of the ANTH domain of Huntingtin-interacting protein 1 and related proteins. 114
34536 340804 cd17007 ANTH_N_Sla2p ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Sla2p and similar proteins. This subfamily is composed of Saccharomyces cerevisiae Sla2 protein (Sla2p, also called transmembrane protein MOP2), Schizosaccharomyces pombe endocytosis protein End4 (End4p, also called Sla2 protein homolog), and similar proteins. In yeast, cells lacking Sla2p have severe defects in actin organization, cell morphology, and endocytosis, suggesting roles in these processes. Sla2p regulates the Eps15-like Arp2/3 complex activator, Pan1p, controlling actin polymerization during endocytosis. In fission yeast, End4p has been implicated in cellular morphogenesis. Sla2p contains an N-terminal ANTH, a central coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of Sla2p preferentially binds PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domains of Sla2p and similar proteins. 115
34537 340805 cd17008 VHS_GGA3 VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA3. ADP-ribosylation factor-binding protein GGA3 (Golgi-localized, Gamma-ear-containing, Arf-binding 3) regulates the trafficking and is required for the lysosomal degradation of BACE (beta-site APP-cleaving enzyme), the protease that initiates the production of beta-amyloid, which causes Alzheimer's disease. It also plays a key role in GABA (+) transmission, which is important in the regulation of anxiety-like behaviors. GGA3 is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 141
34538 340806 cd17009 VHS_GGA1 VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA1. ADP-ribosylation factor-binding protein GGA1 (Golgi-localized, Gamma-ear-containing, Arf-binding 1) is also called Gamma-adaptin-related protein 1. It is expressed in human brain and affects the generation of amyloid beta-peptide, and may be involved in the pathogenesis of Alzheimer disease. It is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 139
34539 340807 cd17010 VHS_GGA2 VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA2. ADP-ribosylation factor-binding protein GGA2 (Golgi-localized, Gamma-ear-containing, Arf-binding 2) is also called Gamma-adaptin-related protein 2 and VHS domain and ear domain of gamma-adaptin (Vear). It is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system. 139
34540 340808 cd17011 CID_RPRD1A CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1A. Regulation of nuclear pre-mRNA domain-containing protein 1A (RPRD1A) is also called Cyclin-dependent kinase inhibitor 2B-related protein or p15INK4B-related protein (P15RS). RPRD1A is a CID (CTD-Interacting Domain) containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1A forms homodimers and heterodimers with RPRD1B through their coiled-coil domains. Both RPRD1A and RPRD1B associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 128
34541 340809 cd17012 CID_RPRD1B CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1B. Regulation of nuclear pre-mRNA domain-containing protein 1B (RPRD1B) is also called Cell cycle-related and expression-elevated protein in tumor (CREPT). RPRD1B is a CID (CTD-Interacting Domain) containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1B forms homodimers and heterodimers with RPRD1A through their coiled-coil domains. Both RPRD1A and RPRD1B associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. RPRD1B is highly expressed during tumorigenesis and, in endometrial cancer, has been shown to promote tumor growth by accelerating the cell cycle. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 129
34542 340810 cd17013 ANTH_N_HIP1 ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1. Huntingtin-interacting protein 1 (HIP1) was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. HIP1 promotes clathrin assembly in vitro. Together with its interacting partner HIPPI, it regulates apoptosis and gene expression. HIP1 contains an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of mammalian HIP1 was found to preferentially bind PtdIns(3,4)P2 instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domain of Huntingtin-interacting protein 1. 114
34543 340811 cd17014 ANTH_N_HIP1R ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1-related protein. Huntingtin-interacting protein 1-related protein (HIP1R), also called HIP12, promotes clathrin assembly in vitro. It is an endocytic protein involved in receptor trafficking, including regulating cell surface expression of receptor tyrosine kinases. Low HIP1R protein expression is associated with worse survival in diffuse large B-cell lymphoma (DLBCL) patients; it is preferentially expressed in germinal center B-cell (GCB)-like DLBCL, and may be potentially useful in subtyping DLBCL cases. HIP1R contains an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of mammalian HIP1R was found to preferentially bind PtdIns(3,5)P2 instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domain of Huntingtin-interacting protein 1-related protein. 114
34544 341097 cd17015 ING_plant Inhibitor of growth (ING) domain of plant inhibitor of growth proteins. This subfamily is composed of mainly plant inhibitor of growth proteins such as Arabidopsis thaliana ING1 (AtING1 or PHD finger protein ING1) and ING2 (AtING2 or PHD finger protein ING2). They are histone-binding components that specifically recognize H3 tails trimethylated on 'Lys-4' (H3K4me3), which mark transcription start sites of virtually all active genes. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation, and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, which binds lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING domain for the H3 tail. 98
34545 341098 cd17016 ING_Pho23p_like Inhibitor of growth (ING) domain of yeast Pho23p and similar proteins. This family is composed of Saccharomyces cerevisiae transcriptional regulatory protein PHO23 (Pho23p), Schizosaccharomyces pombe chromatin modification-related protein png2 (also called ING1 homolog 2), and similar proteins. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Pho23p inhibits p53-dependent transcription. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail. 89
34546 341099 cd17017 ING_Yng1p Inhibitor of growth (ING) domain of yeast Yng1p and similar proteins. The ING family includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail. 100
34547 409302 cd17018 T3SC_IA_ExsC-like Class IA type III secretion system chaperone protein, similar to Pseudomonas aeruginosa exoenzyme S synthesis protein C (ExsC). This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas aeruginosa and Aeromonas hydrophila ExsC (also known as exoenzyme S synthesis protein C). P. aeruginosa ExsC, a member of the type IA family of T3SS chaperones, is unique because, as part of the signaling process, it binds small secreted protein ExsE as well as the non-secreted anti-activator protein ExsD; it relieves repression of the transcriptional activator ExsA (which activates expression of T3SS genes) by ExsD. However, in Aeromonas, although ExsA is likely the master regulator of the T3SS, there is little evidence of ExsC and ExsE involvement in the regulation of the T3SS. 127
34548 409303 cd17019 T3SC_IA_ShcA-like Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae chaperone protein ShcA. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcA and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector HopA1 (previously known as HopPsyA or HrmA), a protein that has unknown functions in the host cell but possesses close homologs that trigger the plant hypersensitive response in resistant strains. Chaperone ShcA binding to HopA1 shows that interactions in animal pathogens are preserved in the Gram-negative pathogens of plants. 122
34549 409304 cd17020 T3SC_IA_ShcM-like Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae chaperone protein ShcM. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcM and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector protein HopPtoM (previously known as CEL ORF3), among known plant pathogen effectors, that makes a major contribution to the elicitation of lesion symptoms but not growth in host tomato leaves. Chaperone ShcM is required for efficient translocation and function of HopPtoM in the plant cell, consistent with the presence of customized chaperones in plant pathogenic bacteria. 122
34550 409305 cd17021 T3SC_IA_SicP-like Class IA type III secretion system chaperone protein, similar to Salmonella enterica chaperone protein SicP. This family includes type III secretion system (T3SS) chaperone proteins similar to Salmonella enterica SicP and similar proteins. In S. enterica, many of its serovars being serious human pathogens, the T3SS allows injection of the effector SptP, a virulence protein that is involved in bacterial invasion into a host cell. Chaperone SicP forms a complex with SptP at an early stage of the effector protein secretion process in order to avoid premature degradation; also, the complex is dissociated at a late stage to secrete only SptP with the help of the ATPase InvC which is part of the related T3SS injectisome. 121
34551 409306 cd17022 T3SC_IA_SigE-like Class IA type III secretion system chaperone protein, similar to Salmonella enterica SigE. This family includes type III secretion system (T3SS) chaperone proteins similar to Salmonella enterica chaperone SigE and similar proteins. In S. enterica, many of its serovars being serious human pathogens, the T3SS allows injection of the effector SigD (also known as SopB) which is an inositol phosphatase. Chaperone SigE binds to SigD, which, upon translocation into the host cell, preferentially dephosphorylates specific inositol phospholipids that are thought to be crucial for subsequent activation of the host cell Ser-Thr kinase Akt. 113
34552 409307 cd17023 T3SC_IA_CesT-like Class IA type III secretion system chaperone protein, similar to Escherichia coli CesT. This family includes type III secretion system (T3SS) chaperone proteins similar to Escherichia coli CesT and also contains Stm2138, a novel virulence chaperone in Salmonella enterica subsp. enterica serovar Typhimurium. In E. coli, the T3SS allows injection of the effector Tir (translocated intimin receptor), which plays a key role in enterohemorrhagic Escherichia coli (EHEC) infection, attaching and effacing (A/E) lesions, and intracellular signal transduction. CesT binds to Tir, which interacts with intimin and anchors the infected cell membrane inside the host cytoplasm for signaling. 133
34553 409308 cd17024 T3SC_IA_DspF-like Class IA type III secretion system chaperone protein, similar to Erwinia amylovora DspF (DspF/AvrF family protein). This family includes type III secretion system (T3SS) chaperone proteins similar to Erwinia amylovora DspF, Pantoea stewartii WtsE, Pseudomonas viridiflava AvrF, and similar proteins, all of which bind AvrE family type III effector proteins. In E. amylovora, a gram-negative enterobacterium that causes a devastating blight disease of apple and pear trees, the T3SS allows injection of effector DspE via the chaperone DspF. DspE has been shown to interact with several apple proteins, suppress salicylic acid-mediated host defenses and cause necrotic cell death in host and non-host plants. In Pectobacterium carotovorum, DspE is required early in Solanum tuberosum leaf infection to cause cell death. Effector WtsE in P. stewartii causes disease-associated cell death in corn and requires the chaperone protein WtsF for stability. 121
34554 409309 cd17025 T3SC_IA_ShcF-like Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae ShcF. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcF and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector protein AvrPphF into genetically susceptible host cells. Chaperone ShcF (originally known as AvrPphF ORD1) binds AvrPphF in a similar manner to type III chaperones from bacterial pathogens of animals, indicating structural conservation of these specialized chaperones, despite high sequence divergence. 124
34555 409310 cd17026 T3SC_IA_SpcU-like Class IA type III secretion system chaperone protein, similar to Pseudomonas aeruginosa SpcU. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas aeruginosa SpcU and similar proteins. In P. aeruginosa, a multidrug resistant pathogen associated with serious illnesses such as ventilator-associated pneumonia and various sepsis syndromes, the T3SS allows injection of effector protein ExoU, one of the most aggressive toxins injected by a T3SS, into the cytosol of target eukaryotic cells, leading to rapid cell necrosis. Chaperone SpcU binds the cytotoxin ExoU, which is a broad-specificity phospholipase A2 (PLA2) and lysophospholipase, and maintains the N-terminus of ExoU in an unfolded state which is required for secretion. 120
34556 409311 cd17027 T3SC_IA_YscB_AscB-like Class IA type III secretion system chaperone protein, similar to Yersinia pestis YscB. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis YscB and its homologs, Aeromonas hydrophila AscB and Photorhabdus luminescens LscB. In Yersinia pestis, which causes the deadly bubonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops) into macrophages and other immune cells, forming pores in the host cell membrane. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex. YscB acts, along with SycN, as a chaperone for YopN, a key part of a complex that regulates type III secretion so that it responds to contact with the eukaryotic target cell. 127
34557 409312 cd17028 T3SC_IA_SycE_Scc1-like Class IA type III secretion system chaperone protein, similar to Chlamydia SycE/Scc1. This family includes type III secretion system (T3SS) chaperone proteins similar to Chlamydia SycE (also known as Scc1) and similar proteins. Chlamydia SycE is homologous to that of the SycE chaperone protein of Yersinia, which is involved in promoting translocation of Yersinia outer protein E (YopE). In Chlamydia, two T3SS chaperones, Scc1 and Scc4, work together to promote secretion of the important effector and plug protein, CopN, whereas, the Scc3 chaperone represses its secretion. 134
34558 409313 cd17029 T3SC_IA_SycE_SpcS-like Class IA type III secretion system chaperone protein, similar to Yersinia SycE. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia SycE and its homolog Pseudomonas aeruginosa SpcS. Involvement of Yersinia chaperone SycE (also known as YerA) in promoting translocation of Yersinia outer protein E (YopE), a selective activator of mammalian Rho-family GTPases, into host macrophages is essential to Yersinia pathogenesis. In P. aeruginosa, which is an opportunistic pathogen that harbors multiple virulence factors that widely manipulate host cell signaling and immune response, the effector toxin proteins of T3SS are ExoT, ExoS, ExoU and ExoY. Chaperone SpcS (formerly known as Orf1) binds to ExoT as well as its homolog, ExoS, both known to be the actual virulence determinants due to the presence of bifunctional GTPase-activating (GAP) and ADP-ribosyltransferase (ADPRT) domains which are essential for inhibition of bacterial internalization and epithelial cell migration by altering the actin cytoskeleton. 115
34559 409314 cd17030 T3SC_IA_SycH-like Class IA type III secretion system chaperone protein, similar to Yersinia pestis SycH. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis SycH and similar proteins. In Yersinia pestis, the causative agent of bubonic and pneumonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops) into macrophages and other immune cells, forming pores in the host cell membrane and have been linked to cytolysis. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex. SycH is the chaperone for YopH, a potent eukaryotic-like protein tyrosine phosphatase that is essential for virulence. SycH also binds two negative regulators of type III secretion, YscM1 and YscM2, both sharing significant sequence homology with the chaperone-binding domain of YopH. 120
34560 409315 cd17031 T3SC_IA_SycN-like Class IA type III secretion system chaperone protein, similar to Yersinia pestis SycN. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis SycN and similar proteins. In Yersinia pestis, the causative agent of bubonic and pneumonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops) into macrophages and other immune cells, forming pores in the host cell membrane and have been linked to cytolysis. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex; SycN-YscB forms a heterodimeric secretion chaperone and binds YopN, a key part of a complex that regulates type III secretion, in response to calcium levels, so that secretion occurs only after contact with the targeted eukaryotic cell. Negative regulation is mediated by the complex by blocking the entrance to the secretion apparatus prior to contact with mammalian cells. 118
34561 409316 cd17032 T3SC_IA_SycT-like Class IA type III secretion system chaperone protein, similar to Yersinia enterocolitica SycT. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia enterocolitica SycT and similar proteins. In Y. enterocolitica, a food-borne pathogen causing gastroenteritis and mesenteric lymphadenitis, chaperone SycT promotes translocation of effector YopT (Yersinia outer protein T), a cysteine protease that inactivates the small GTPase RhoA of targeted host cells by cleaving its C-terminal, prenylated cysteine, thereby releasing the GTPase into the host cytosol. 121
34562 409317 cd17033 DR1245-like possible type III secretion system (T3SS) chaperone protein DR1245 found in Deinococcus radiodurans. This family includes a possible type III secretion system (T3SS) chaperone protein DR1245 found in Deinococcus radiodurans, a bacterium that is exceptionally resistant to the lethal effects of ionizing radiation (IR), ultraviolet light and other DNA-damaging agents. DR1245, a protein of unknown function conserved only in the Deinococcaceae, and with strong structural homology to YbjN proteins and T3SS chaperones, may display some chaperone activity towards DdrB, a protein found to be highly up-regulated following irradiation; DR1245 may also bind to other substrates. 138
34563 409318 cd17034 T3SC_IA_ShcO1-like Class IA type III secretion system chaperone proteins, similar to Pseudomonas syringae ShcO1, ShcS1, and ShcS2. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcO1 and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of effector Hrp-dependent outer proteins (HOPs) HopO1-1, HopS1, and HopS2. Three homologous chaperones ShcO1, ShcS1, and ShcS2 facilitate the translocation of their cognate effectors HopO1-1, HopS1, and HopS2, respectively. Interestingly, ShcS1 and ShcS2 are capable of substituting for ShcO1 in facilitating HopO1-1 secretion and translocation. ShcS1 and ShcO1 are exceptional class IA T3SS chaperones because they can bind more than one target effector. 123
34564 409319 cd17035 T3SC_IB_Spa15-like Class IB type III secretion system chaperone protein, similar to Shigella flexneri Spa15. This family includes type III secretion system (T3SS) chaperone proteins similar to Shigella flexneri Spa15, Salmonella enterica InvB, and similar proteins. In S. flexneri, which is a facultative intracellular pathogen that invades the colonic epithelium and causes bacillary dysentery, the T3SS allows injection of a number of effectors to ensure their stabilization prior to secretion. Spa15 is the chaperone for several TTS effectors, including IpaA, IpgB1, OspC3, OspB and OspD1. Effector IpgB, for which Spa15 is a chaperone, is a mimic of the human Ras-like Rho guanosine triphosphatase RhoG, thus activating Rac1 guanosine triphosphatase and setting off membrane ruffling of the cell, assisting the internalization of Shigella. Also, Spa15 is a chaperone for secreted anti-activator OspD1 which is involved in the control of transcription by the type III secretion apparatus (T3SA) activity in Shigella flexneri. 129
34565 409320 cd17036 T3SC_YbjN-like_1 T110839 is structurally similar to type III secretion system chaperones and YbjN family proteins. This family includes protein T110839 from Synechococcus elongatus that is structurally similar to type III secretion system (T3SS) chaperones (T3SC) that bind effector proteins, and is homologous to YbjN, a putative sensory transduction regulator protein found in Proteobacteria. 125
34566 409321 cd17037 T3SC_IA_ShcV-like Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae ShcV. This family includes type III secretion system (T3SS) chaperone protein similar to Pseudomonas syringae ShcV. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of effector HopPtoV which may play a subtle role in pathogenesis. Chaperone ShcV facilitates secretion of HopPtoV into plant cells via the amino-terminal third of the effector. 131
34567 341208 cd17038 Flavi_M Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with a membrane-anchored envelope comprised of 3 proteins called C, M and E. The envelope glycoprotein M is translated as a precursor, called prM. The precursor portion of the protein is the signal peptide for the protein's entry into the membrane. prM is cleaved to form M by the proprotein convertase furin in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus. 75
34568 340559 cd17039 Ubl_ubiquitin_like ubiquitin-like (Ubl) domain found in ubiquitin and ubiquitin-like Ubl proteins. Ubiquitin-like (Ubl) proteins have a similar ubiquitin (Ub) beta-grasp fold and attach to other proteins in a Ubl manner but with biochemically distinct roles. Ub and Ubl proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Some Ubl domains have adaptor roles in Ub-signaling by mediating protein-protein interaction. Prokaryotic sulfur carrier proteins are Ub-related proteins that can be activated in an ATP-dependent manner. Polyubiquitination signals for a diverse set of cellular events via different isopeptide linkages formed between the C terminus of one ubiquitin (Ub) and the epsilon-amine of K6, K11, K27, K29, K33, K48, or K63 of a second Ub. One of these seven lysine residues (K27, Ub numbering) is conserved in this Ubl_ubiquitin_like family. K27-linked Ub chains are versatile and can be recognized by several downstream receptor proteins. K27 has roles beyond chain linkage, such as in Ubl NEDD8 (which contains many of the same lysines (K6, K11, K27, K33, K48) as Ub) where K27 has a role (other than conjugation) in the mechanism of protein neddylation. 68
34569 340560 cd17040 Ubl_MoaD_like ubiquitin-like (Ubl) domain found in a group of small sulfide carrier proteins. Ubiquitin-like (Ubl) domain found in a group of small sulfide carrier proteins. This family includes ThiS, MoaD, CysO, QbsE, and their homologs, which are structurally homologous to ubiquitin (Ub) and may function as the sulfide donor for the biosynthesis of thiamin, molybdopterin, cysteine, thioquinolobactin, and other sulfur-containing natural products. Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. Like Ub, small sulfide carrier proteins in this family are adenylated at a diglycyl C-terminus by specific activating proteins. The adenylated C-terminus is subsequently converted to a thiocarboxylate, serving as the sulfide source. Those activating proteins are diverse and show little sequence similarity. This family also includes the small archaeal modifier protein (SAMP), including SAMP1, SAMP2 and SAMP3, which are Ub-like proteins that function as protein modifiers and are required for the production of sulfur-containing biomolecules in the archaeon Haloferax volcanii. SAMP1 and SAMP2 are involved in sulfur transfer during molybdenum cofactor biosynthesis and tRNA thiolation much like MoaD and Urm1, respectively. They can form covalent conjugates with their protein targets through an isopeptide linkage via their C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. SAMP2 also forms homo-conjugates through the intermolecular isopeptide bond between the C-terminal Gly and the Lys58 side chain, a feature that likely resembles polyubiquitination. SAMP3 conjugates are dependent on the Ub-activating E1 enzyme homolog of archaea (UbaA) for synthesis and are cleaved by the JAMM/MPN+ domain metalloprotease HvJAMM1. 88
34570 340561 cd17041 Ubl_WDR48 Ubiquitin-like (Ubl) domain found in WD repeat-containing protein 48 (WDR48) and similar proteins. WDR48, also termed USP1-associated factor 1 (UAF1), or WD repeat endosomal protein, or p80, is required for the histone deubiquitination activity. It stimulates activity of ubiquitin-specific proteases USP1, USP12, and USP46. As a potential tumor suppressor, WDR48 in complex with deubiquitinase USP12 suppresses Akt-dependent cell survival signaling by stabilizing PH domain leucine-rich repeat protein phosphatase 1 (PHLPP1). WDR48 also functions as a novel interaction partner of E1 helicase from anogenital human papillomavirus (HPV) types, and plays an essential role in anogenital HPV DNA replication. WDR48 contains a WD40 domain and a ubiquitin-like domain that shows high sequence and structural similarity with RING finger- and WD40-associated ubiquitin-like (RAWUL) domain. 97
34571 340562 cd17042 Ubl_TmoB Ubiquitin-like (Ubl) domain found in toluene-4-monooxygenase system protein B (TmoB). TmoB is a component of the multicomponent toluene-4-monooxygenase (T4MO) system that metabolizes toluene as a carbon source. The T4MO complex is composed of a diiron hydroxylase (T4MOH), a Rieske-type ferredoxin (T4MOC), an effector protein (T4MOD), and an NADH oxidoreductase (T4MOF). The T4MOH component consists of TmoA, TmoB, and TmoE. TmoB adopts a beta-grasp ubiquitin-like fold but its precise role remains unclear. 79
34572 340563 cd17043 RA Ras-associating (RA) domain, structurally similar to a beta-grasp ubiquitin-like fold. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in various functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. RA-containing proteins include RalGDS, AF6, RIN, RASSF1, SNX27, CYR1, STE50, and phospholipase C epsilon. 87
34573 340564 cd17044 Ubl_TBCE ubiquitin-like (Ubl) domain found in tubulin-folding cofactor E (TBCE) and similar proteins. TBCE, also termed tubulin-specific chaperone E, is a tubulin polymerizing protein involved in the second step of the tubulin folding pathway through cooperating in tubulin heterodimer dissociation both in vivo and in vitro. It may also be implicated in the maintenance of the neuronal microtubule network. Mutations in TBCE gene cause hypoparathyroidism, mental retardation and facial dysmorphism. TBCE contains an N-terminal cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain, a leucine-rich repeat protein-protein interaction domain followed by leucine-rich repeat (LRR) domains, and a C-terminal ubiquitin-like (Ubl) domain. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes. 83
34574 340565 cd17045 Ubl_TBCEL ubiquitin-like (Ubl) domain found in tubulin-specific chaperone cofactor E-like protein (TBCEL) and similar proteins. TBCEL, also termed leucine-rich repeat-containing protein 35 (LRRC35), or E-like (EL), is a novel regulator of tubulin stability, suggesting a link between tubulin turnover and vesicle transport. TBCEL is abundantly expressed in testis, but is also present in several tissues at a much lower level. It is required for the synchronous movement of the investment cones and is important for normal male fertility. TBCEL shows high sequence similarity to tubulin-specific chaperone cofactor E (TBCE), a component of the multimolecular complex required for tubulin heterodimer formation in all eukaryotic cells. It contains a leucine-rich repeat protein-protein interaction domain and a C-terminal ubiquitin-like (Ubl) domain, but does not harbor the cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain found in TBCE. 87
34575 340566 cd17046 Ubl_IKKA_like ubiquitin-like (Ubl) domain found in inhibitor of nuclear factor kappa-B kinases, IKK-alpha and IKK-beta, and similar proteins. IKK, also termed IkappaB kinase, is an enzyme complex involved in propagating the cellular response to inflammation. It is part of the upstream nuclear factor kappa-B kinase (NF-kappaB) signal transduction cascade, and plays an important role in regulating the NF-kappaB transcription factor. IKK is composed of three subunits, IKK-alpha/CHUK, IKK-beta/IKBKB, and IKK-gamma/NEMO. The IKK-alpha and IKK-beta subunits together are catalytically active whereas the IKK-gamma subunit serves a regulatory function. IKK-alpha and IKK-beta phosphorylate the IkappaB proteins, marking them for degradation via ubiquitination and allowing NF-kappaB transcription factors to go into the nucleus. IKK-alpha, also known as IKK-A, or IkappaB kinase A (IkBKA), or conserved helix-loop-helix ubiquitous kinase (CHUK), or I-kappa-B kinase 1 (IKK1), or nuclear factor NF-kappa-B inhibitor kinase alpha (NFKBIKA), or transcription factor 16 (TCF-16), belongs to the serine/threonine protein kinase family. In addition to NF-kappaB response, it has many additional cellular targets in an NF-kappaB-independent manner. For instance, it plays a role in epidermal differentiation, as well as in the regulation of the cell cycle protein cyclin D1. IKK-beta, also known as IKK-B, or IkappaB kinase B (IkBKB), or I-kappa-B kinase 2 (IKK2), or nuclear factor NF-kappa-B inhibitor kinase beta (NFKBIKB), belongs to the serine/threonine protein kinase family as well. It interacts with many different protein partners and has been implicated in the treatment of many inflammatory diseases and cancers. Both IKK-alpha and IKK-beta contain an N-terminal catalytic domain followed by a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 75
34576 340567 cd17047 Ubl_UBFD1 ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein UBFD1 and similar proteins. UBFD1, also termed ubiquitin-binding protein homolog (UBPH), is a polyubiquitin binding protein containing a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. It may play a role as nuclear factor-kappaB (NF-kappaB) regulator. 70
34577 340568 cd17048 Ubl_UBL3 ubiquitin-like (Ubl) domain found in ubiquitin-like protein 3 (UBL3) and similar proteins. UBL3, also termed membrane-anchored ubiquitin-fold protein (MUB), or protein HCG-1, belongs to a newly described MUB protein family with structural homology with ubiquitin. MUB proteins have a beta-grasp ubiquitin-like (Ubl) domain with longer N- and C-termini and extended loops. The Ubl domain contains a C-terminal CAAX-box, a canonical motif for protein prenylation, which is modified through protein lipidation with a hydrophobic membrane anchor. The lipidation and membrane localization inhibit attachment of MUBs to target proteins. 82
34578 340569 cd17049 Ubl_Sacsin ubiquitin-like (Ubl) domain found in Sacsin and similar proteins. Sacsin, also termed DnaJ homolog subfamily C member 29 (DNAJC29), is encoded by the SACS gene, which is highly expressed in the brain. Mutations in SACS can cause the neurodegenerative disease autosomal recessive spastic ataxia of Charlevoix Saguenay (ARSACS), which is characterized by early-onset spastic ataxia. Sacsin is a modular protein that is localized on the mitochondrial surface and possibly required for normal mitochondrial network organization. Sacsin knockdown resulted in a reduction in cells expressing polyglutamine-expanded ataxin-1, which correlated with a loss of cells with large nuclear ataxin-1 inclusions. At the N-terminus, sacsin contains a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which can interact with the proteasome. At the C-terminus, sacsin harbors a protein-protein interaction J-domain followed by a higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domain. The J-domain is typically associated with DnaJ-like co-chaperones involved in regulation of the Hsp70 heat shock system. 73
34579 340570 cd17050 Ubl1_ANKUB1 ubiquitin-like (Ubl) domain 1 found in Ankyrin repeat and ubiquitin domain-containing 1 (ANKUB1) and similar proteins. ANKUB1 is an uncharacterized protein with two tandem ubiquitin-like (Ubl) domains located at the N-terminal of Ankyrin repeats (ANK). The Ubl domain may have an adaptor role in ubiquitin (Ub)-signaling by mediating protein-protein interaction. Ubl proteins have a beta-grasp Ubl fold and attach to other proteins in a Ubl manner with biochemically distinct roles. The ankyrin repeats have been identified in numerous proteins with diverse functions. The family corresponds to the first Ubl domain. 79
34580 340571 cd17051 Ubl2_ANKUB1 ubiquitin-like (Ubl) domain 2 found in Ankyrin repeat and ubiquitin domain-containing 1 (ANKUB1) and similar proteins. ANKUB1 is an uncharacterized protein with two tandem ubiquitin-like (Ubl) domains located at the N-terminal of Ankyrin repeats (ANK). The Ubl domain may have an adaptor role in ubiquitin(Ub)-signaling by mediating protein-protein interaction. Ubl proteins have a beta-grasp Ubl fold and attach to other proteins in a Ubl manner with biochemically distinct roles. The ankyrin repeats have been identified in numerous proteins with diverse functions. The family corresponds to the second Ubl domain. 83
34581 340572 cd17052 Ubl1_FAT10 ubiquitin-like (Ubl) domain 1 found in leukocyte antigen F (HLA-F) adjacent transcript 10 (FAT10) and similar proteins. FAT10, also termed ubiquitin D (UBD), or diubiquitin, is a cytokine-inducible ubiquitin-like (Ubl) modifier that is highly expressed in the thymus, and targets substrates covalently for 26S proteasomal degradation. It is also associated with cancer development, antigen processing and antimicrobial defense, chromosomal stability and cell cycle regulation. FAT10 is presented on immune cells and, under inflammatory conditions, is synergistically induced by interferon gamma (IFNgamma) and tumor necrosis factor (TNFalpha) in non-immune (liver parenchymal) cells. FAT10 contains two Ubl domains. The family corresponds to the first Ubl domain of FAT10. Some family members contain only one Ubl domain. 74
34582 340573 cd17053 Ubl2_FAT10 ubiquitin-like (Ubl) domain 2 found in leukocyte antigen F (HLA-F) adjacent transcript 10 (FAT10) and similar proteins. FAT10, also termed ubiquitin D (UBD), or diubiquitin, is a cytokine-inducible ubiquitin-like (Ubl) modifier that is highly expressed in the thymus, and targets substrates covalently for 26S proteasomal degradation. It is also associated with cancer development, antigen processing and antimicrobial defense, chromosomal stability and cell cycle regulation. FAT10 is presented on immune cells and, under inflammatory conditions, is synergistically induced by interferon gamma (IFNgamma) and tumor necrosis factor (TNFalpha) in non-immune (liver parenchymal) cells. FAT10 contains two Ubl domains. The family corresponds to the second Ubl domain of FAT10. Some family members contain only one Ubl domain. 71
34583 340574 cd17054 Ubl_AtBAG1_like ubiquitin-like (Ubl) domain found in Arabidopsis thaliana Bcl-2-associated athanogenes AtBAG1, AtBAG2, AtBAG3, AtBAG4, and similar proteins. The family includes four Arabidopsis BAG family proteins (AtBAG1, AtBAG2, AtBAG3, AtBAG4) that have very similar domain organizations with a ubiquitin-like (Ubl) domain in the N-terminus and a BAG domain in the C-terminus. They may function as co-chaperones that regulate diverse cellular pathways, such as programmed cell death and stress responses. AtBAG1, AtBAG3, and AtBAG4 are predicted to localize in the cytoplasm, but the localization of AtBAG2 is the microbody. AtBAG4 can interact with Hsc70. The overexpression of AtBAG4 in tobacco plants confers tolerance to a wide range of abiotic stresses such as UV light, cold, oxidants, and salt treatments. 70
34584 340575 cd17055 Ubl_AtNPL4_like ubiquitin-like (Ubl) domain found in Arabidopsis thaliana NPL4-like proteins NPL4-1, NPL4-2, and similar proteins. The family includes a group of uncharacterized plant ubiquitin-like (Ubl) domain-containing proteins, including Arabidopsis thaliana NPL4-like protein 1 and NPL4-like protein 2. 73
34585 340576 cd17056 Ubl_FAF1 ubiquitin-like (Ubl) domain found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. This family corresponds to Ubl domains. 71
34586 340577 cd17057 Ubl_TMUB1_like ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing proteins TMUB1, TMUB2, and similar proteins. TMUB1, also termed dendritic cell-derived ubiquitin-like protein (DULP), or hepatocyte odd protein shuttling protein, or ubiquitin-like protein SB144, or HOPS, is highly expressed in the nervous system. It is involved in the termination of liver regeneration and plays a negative role in interleukin-6-induced hepatocyte proliferation. The overexpression of Tmub1 has been shown to play a role in the inhibition of cell proliferation. TMUB1 has been implicated in the regulation of locomotor activity and wakefulness in mice, perhaps acting through its interaction with CAMLG. It also facilitates the recycling of AMPA receptors into synaptic membrane in cultured primary neurons. TMUB1 contains transmembrane domains and a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. TMUB2 is an uncharacterized transmembrane domain and Ubl domain-containing protein that shows high sequence similarity to TMUB1. 74
34587 340578 cd17058 Ubl_SNRNP25 ubiquitin-like (Ubl) domain found in small nuclear ribonucleoprotein U11/U12 subunit 25 (SNRNP25) and similar proteins. SNRNP25, also termed U11/U12 small nuclear ribonucleoprotein 25 kDa protein, U11/U12 snRNP 25 kDa protein (U11/U12-25K), or minus-99 protein, is a component of the U11/U12 snRNPs that are part of the U12-type spliceosome. It contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 89
34588 340579 cd17059 Ubl_OTU1 ubiquitin-like (Ubl) domain found in ubiquitin thioesterase OTU1 and similar proteins. OTU1 (EC 3.4.19.12), also termed YOD1, or DUBA-8, or HIV-1-induced protease 7 (HIN-7), or OTU domain-containing protein 2 (OTUD2), is a p97-associated deubiquitinylase that functions as a key player in endoplasmic reticulum-associated degradation (ERAD). Its deubiquitinylase activity is also required for negatively regulating cholera toxin A1 (CTA1) retro-translocation. OTU1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a C2H2-type zinc finger, and an OTU domain. 75
34589 340580 cd17060 Ubl_RB1CC1 ubiquitin-like (Ubl) domain found in retinoblastoma 1-inducible coiled-coil protein 1 (RB1CC1) and similar proteins. RB1CC1, also termed FAK family kinase-interacting protein of 200 kDa (FIP200), is the mammalian counterpart of the yeast Atg17 gene and functions as a component of the ULK1/Atg13/RB1CC1/Atg101 complex essential for induction of autophagy. RB1CC1 is a key signaling node to regulate cellular proliferation and differentiation. As a DNA-binding transcription factor, RB1CC1 has been implicated in the regulation of retinoblastoma 1 (RB1) expression. RB1CC1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, as well as a nuclear localization signal (KPRK), a leucine zipper motif and a coiled-coil structure. 75
34590 340581 cd17061 Ubl_IQUB ubiquitin-like (Ubl) domain found in IQ and ubiquitin-like domain-containing protein (IQUB) and similar proteins. IQUB is an IQ motif and ubiquitin domain-containing protein that may play roles in cilia formation and/or maintenance. It contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 79
34591 340582 cd17062 Ubl_NUB1 ubiquitin-like (Ubl) domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also termed negative regulator of ubiquitin-like proteins 1, or renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1), which may function in the regulation of cell cycle progression. NUB1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. 78
34592 340583 cd17063 Ubl_ANKRD60 ubiquitin-like (Ubl) domain found in ankyrin repeat domain-containing protein 60 (ANKRD60) and similar proteins. ANKRD60 is an uncharacterized ankyrin repeat domain-containing protein which also harbors a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. 77
34593 340584 cd17064 Ubl_TAFs_like ubiquitin-like (Ubl) domain found in plant TBP-associated factors (TAFs) and similar proteins. TAFs, also termed transcription initiation factor TFIID subunits, or TAFII250 subunits, are components of the TFIID complex, a multisubunit protein complex involved in promoter recognition and essential for mediating regulation of RNA polymerase transcription. TAFs is the core scaffold of the TFIID complex, which is comprised of the TATA binding protein (TBP) and 12-15 TAFs. TAFs contain a ubiquitin-like (Ubl) domain and a Bromo domain. 72
34594 340585 cd17065 Ubl_UBP24 ubiquitin-like (Ubl) domain found in ubiquitin carboxyl-terminal hydrolase 24 (UBP24) and similar proteins. UBP24 (EC 3.4.19.12), also termed deubiquitinating enzyme 24, or ubiquitin thioesterase 24, or ubiquitin-specific-processing protease 24 (USP24), is a deubiquitinating protein that interacts with damage-specific DNA-binding protein 2 (DDB2) and regulates DDB2 stability. It may also play a role in the pathogenesis of Parkinson's disease (PD). UBP24 proteins contain an N-terminal ubiquitin-associated (UBA) domain, a ubiquitin-like (Ubl) domain, and a C-terminal peptidase C19 domain. 79
34595 340586 cd17066 Ubl_KPC2 ubiquitin-like (Ubl) domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also termed ubiquitin-associated domain-containing protein 1 or UBA domain containing 1 (UBAC1), or glialblastoma cell differentiation-related protein 1 (GBDR1), is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (Ubl) domain and two ubiquitin-associated (UBA) domains. 87
34596 340587 cd17067 RBD2_RGS12_like Ras-binding domain (RBD) 2 of regulator of G protein signaling 12 (RGS12) and similar proteins. Regulator of G-protein signaling (RGS) proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. The RGS12-like subfamily is composed of RGS12 and RGS14, with multidomain architectures including a RGS domain, two tandem Ras-binding domains (RBDs), and a second Galpha interacting domain, the GoLoco motif. The RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 72
34597 340588 cd17068 RBD_PLEKHG5 Ras-binding domain (RBD) found in pleckstrin homology (PH) and RhoGEF domain containing G5 (PLEKHG5) and similar proteins. PLEKHG5, also termed PH domain-containing family G member 5, or guanine nucleotide exchange factor 720 (GEF720), Syx, or Tech, is a novel Dbl-like protein related to p115Rho-GEF. It functions as a Rho guanine nucleotide exchange factor directly activating RhoA in vivo and potentially involved in the control of neuronal cell differentiation. It also regulates the balance of the RhoA downstream effector Dia and ROCK activities to promote polarized-cancer-cell migration. Moreover, PLEKHG5 activates the nuclear factor kappaB (NFkappaB) signaling pathway. Mutations in the PLEKHG5 gene are associated with autosomal recessive intermediate Charcot-Marie-Tooth disease (CMT) and lower motor neuron disease (LMND). 75
34598 340589 cd17069 DCX2 Doublecortin-like domain 2. Members of the doublecortin (DCX) gene family are microtubule-associated proteins (MAPs). Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its protein domains can occur in double tandem or as a single repeat. The first repeat of DCX domain has a stable ubiquitin-like tertiary fold. Proteins with DCX double tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK). 84
34599 340590 cd17070 DCX2_RP_like Doublecortin-like domain 2 found in retinitis pigmentosa (RP)-like protein. RP-like protein family is part of doublecortin (DCX) superfamily with double tandem DCX repeats that are associated with retinitis pigmentosa. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. RP-like proteins are colocalized to the photoreceptor and share a function in outer segment disc morphogenesis. 69
34600 340591 cd17071 DCX1_DCDC2_like Doublecortin-like domain 1 found in doublecortin domain-containing protein 2 (DCDC2) and similar proteins. DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 80
34601 340592 cd17072 DCX_DCDC5_like Doublecortin-like domain found in doublecortin domain-containing protein 5 (DCDC5) and similar proteins. DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8 as well as with the Rab8 nucleotide exchange factor Rabin8. This family also includes DCDC1, which is a hydrophilic intracellular protein that contains only one DCX repeat. Therefore, DCDC1 might only bind to microtubules without microtubule polymerization properties. DCDC1 is mainly expressed in adult testis. 71
34602 340593 cd17073 KHA KHA, dimerization domain of potassium ion channel, similar to doublecortin-like domain, found in potassium channel tetramerization domain containing 9 (KCTD9) and similar proteins. This family corresponds to KHA, the tetramerization domain of eukaryotic voltage-dependent potassium ion-channel proteins, mainly found in vertebrate KCTD9 and plant AKT proteins. In plants the domain lies at the C-terminus whereas in many chordates it lies at the N-terminus. KHA shows high sequence similarity with doublecortin-like domain, which has a stable ubiquitin-like tertiary fold. KCTD9, also termed BTB/POZ domain-containing protein 9, belongs to the KCTD protein family, which corresponds to potassium channel tetramerization domain proteins, a class of BTB-domain-containing proteins. It is involved in potassium channel formation. Moreover, KCTD9 contributes to liver injury through NK cell activation during hepatitis B virus (HBV)-induced acute-on-chronic liver failure. AKT proteins play crucial roles in K+ uptake and translocation in plant cells. 65
34603 340594 cd17074 Ubl_CysO_like ubiquitin-like (Ubl) domain found in Mycobacterium tuberculosis CysO and similar proteins. CysO, also termed 9.5 kDa culture filtrate antigen cfp10A, together with CysM (Cysteine synthase M), forms a protein complex CysM-CysO that represents a new cysteine biosynthetic pathway in Mycobacterium tuberculosis. The replacement of the acetyl group of O-acetylserine by CysO thiocarboxylate to generate a protein-bound cysteine is catalyzed by CysM in a pyridoxal 5'-phosphate (PLP)-dependent manner. The family also includes QbsE that functions as the sulfide donor for the biosynthesis of thioquinolobactin in Pseudomonas fluorescens. A JAMM motif protein QbsD catalyzes removal of the carboxy-terminal dipeptide from QbsE. Both CysO and QbsE are similar to prokaryotic sulfur carrier proteins such as ThiS and MoaD, containing the beta-grasp ubiquitin-like fold. 89
34604 340595 cd17075 UBX1_UBXN9 Ubiquitin regulatory domain X (UBX) 1 found in UBX domain protein 9 (UBXN9, UBXD9, or ASPSCR1) and similar proteins. UBXN9, also termed tether containing UBX domain for GLUT4 (TUG), or alveolar soft part sarcoma chromosomal region candidate gene 1 protein (ASPSCR1), or alveolar soft part sarcoma locus (ASPL), or renal papillary cell carcinoma protein 17 (RCC17), belongs to the UBXD family of proteins that contains two ubiquitin regulatory domains X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, UBXN9 contains an N-terminal ubiquitin-like (Ubl) domain. UBXN9 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN9 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4 and in assembly of the Golgi apparatus. In addition to GLUT4, UBXN9 also controls vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. UBXN9 and its budding yeast ortholog, Ubx4p, are multifunctional proteins that share some, but not all functions. Yeast Ubx4p is important for endoplasmic reticulum-associated protein degradation (ERAD) but UBXN9 appears not to share this function. 85
34605 340596 cd17076 UBX_UBXN10 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 10 (UBXN10) and similar proteins. UBXN10, also termed UBX domain-containing protein 3 (UBXD3), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN10 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN10 localizes to cilia in a p97-dependent manner, and both p97 and UBXN10 are required for ciliogenesis. Additionally, UBXN10 interacts with the intraflagellar transport B (IFT-B) and regulates anterograde transport into cilia. 76
34606 340597 cd17077 UBX_UBXN11 Ubiquitin regulatory domain X (UBX) found in UBX domain protein 11 (UBXN11) and similar proteins. UBXN11, also termed colorectal tumor-associated antigen COA-1, or socius, or UBX domain-containing protein 5 (UBXD5), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN11 may function as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN11 also acts as a novel interacting partner of Rnd proteins (Rnd1, Rnd2, and Rnd3/RhoE), new members of Rho family of small GTPases. It directly binds to Rnd GTPases through its C-terminal region, and further participates in disassembly of actin stress fibers. UBXN11 also binds directly to Galpha12 and Galpha13 through its N-terminal region. As a novel activator of the Galpha12 family, UBXN11 promotes the Galpha12-induced RhoA activation. 76
34607 340598 cd17078 Ubl_SLD1_NFATC2ip SUMO-like domain 1 (SLD1), structurally similar to a beta-grasp ubiquitin-like fold, found in nuclear factor of activated T-cells 2 interacting protein (NFATC2ip) and similar proteins. NFATC2ip, also termed nuclear factor of activated T cells (NFAT), cytoplasmic, calcineurin dependent 2 interacting protein, or 45 kDa NF-AT-interacting protein, or 45 kDa NFAT-interacting protein (Nip45), or nuclear factor of activated T-cells, or cytoplasmic 2-interacting protein, belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family. The family members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. NFATC2ip was first identified as a co-regulator with NFAT and the T helper 2 (Th2)-specific transcription factor, c-Maf, to induce IL-4 production. NFATC2ip has also been implicated in cellular differentiation and coordination of the immune response in humans and mice. 74
34608 340599 cd17079 Ubl_SLD2_NFATC2ip SUMO-like domain 2 (SLD2), structurally similar to a beta-grasp ubiquitin-like fold, found in nuclear factor of activated T-cells 2 interacting protein (NFATC2ip) and similar proteins. NFATC2ip, also termed nuclear factor of activated T cells (NFAT), cytoplasmic, calcineurin dependent 2 interacting protein, or 45 kDa NF-AT-interacting protein, or 45 kDa NFAT-interacting protein (Nip45), or nuclear factor of activated T-cells, or cytoplasmic 2-interacting protein, belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family. The family members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. NFATC2ip was first identified as a co-regulator with NFAT and the T helper 2 (Th2)-specific transcription factor, c-Maf, to induce IL-4 production. NFATC2ip has also been implicated in cellular differentiation and coordination of the immune response in humans and mice. NFATC2ip SLD2 domain binds to E2 SUMOylation enzyme, Ubc9, in an almost identical manner to that of SUMO and thereby inhibits elongation of poly-SUMO chains. 73
34609 340600 cd17080 Ubl_SLD2_Esc2_like SUMO-like domain 2 (SLD2), structurally similar to a beta-grasp ubiquitin-like fold, found in Saccharomyces cerevisiae establishes silent chromatin protein 2 (Esc2p) and similar proteins. Protein Esc2p belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family, whose members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. Yeast Esc2p was identified as a factor promoting gene silencing. It is also required for genome integrity during DNA replication and sister chromatid cohesion in Saccharomyces cerevisiae. Esc2p promotes Mus81p complex-activity via its SUMO-like and DNA binding domains. It also acts as a novel structure-specific DNA-binding factor implicated in the local regulation of the Srs2p helicase through promoting recombination at sites of stalled replication. In addition, Esc2p specifically promotes the accumulation of SUMOylated Mms21-specific substrates and functions with Mms21p to suppress gross chromosomal rearrangements (GCRs). This family also includes DNA repair protein Rad60p from Schizosaccharomyces pombe. It is a SUMO mimetic and SUMO-targeted ubiquitin ligase (STUbL)-interacting protein that is required for the repair of DNA double strand breaks, recovery from S phase replication arrest, and plays an essential role in cell viability. Like other RENi family members, Rad60p has two SLD domains. 74
34610 340601 cd17081 RAWUL_PCGF1 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 1 (PCGF1) and similar proteins. PCGF1, also termed nervous system Polycomb-1 (NSPc1), or RING finger protein 68 (RNF68), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like BCOR complex that also contains RING1, RNF2, RYBP, SKP1, as well as the BCL6 co-repressor BCOR and the histone demethylase KDM2B, and is required to maintain the transcriptionally repressive state of some genes, such as Hox genes, BCL6 and the cyclin-dependent kinase inhibitor, CDKN1A. PCGF1 promotes cell cycle progression and enhances cell proliferation as well. It is a cell growth regulator that acts as a transcriptional repressor of p21Waf1/Cip1 via the retinoid acid response element (RARE element). Moreover, PCGF1 functions as an epigenetic regulator involved in hematopoietic cell differentiation. It cooperates with the transcription factor runt-related transcription factor 1 (Runx1) in regulating differentiation and self-renewal of hematopoietic cells. Furthermore, PCGF1 represents a physical and functional link between Polycomb function and pluripotency. PCGF1 contains a C3HC4-type RING-HC finger and a RAWUL domain. 92
34611 340602 cd17082 RAWUL_PCGF2_like RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger proteins PCGF2, PCGF4, and similar proteins. This family includes polycomb group RING finger proteins, PCGF2 (also known as Mel-18, or RNF110, or ZNF144) and PCGF4 (also known as BMI-1, or RNF51), both serving as the core component of a canonical polycomb repressive complex 1 (PRC1). PRC1 is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence the target gene in a PRC2-independent manner. Both, PCGF2 and PCGF4, contain a C3HC4-type RING-HC finger and a RAWUL domain. 108
34612 340603 cd17083 RAWUL_PCGF3 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 3 (PCGF3) and similar proteins. PCGF3, also termed RING finger protein 3A (RNF3A), is one of six PcG RING finger (PCGF) homologs (PCGF1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF3 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF3 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 85
34613 340604 cd17084 RAWUL_PCGF5 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 5 (PCGF5) and similar proteins. PCGF5, also termed RING finger protein 159 (RNF159), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF5 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF5 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 101
34614 340605 cd17085 RAWUL_PCGF6 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 6 (PCGF6) and similar proteins. PCGF6, also termed Mel18 and Bmi1-like RING finger (MBLR), or RING finger protein 134 (RNF134), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5 and PCGF6/MBLR), and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like L3MBTL2 complex, which is composed of some canonical components, such as RNF2, CBX3, CBX4, CBX6, CBX7 and CBX8, as well as some noncanonical components, such as L3MBTL2, E2F6, WDR5, HDAC1 and RYBP, and plays critical roles in epigenetic transcriptional silencing in higher eukaryotes. Like other PCGF homologs, PCGF6 possesses the transcriptional repression activity, and also associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF6 can regulate the enzymatic activity of JARID1d/KDM5D, a trimethyl H3K4 demethylase, through the direct interaction with it. Furthermore, PCGF6 is expressed predominantly in meiotic and post-meiotic male germ cells and may play important roles in mammalian male germ cell development. It also regulates mesodermal lineage differentiation in mammalian embryonic stem cells (ESCs) and functions in induced pluripotent stem (iPS) reprogramming. The activity of PCGF6 is found to be regulated by cell cycle dependent phosphorylation. PCGF6 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 89
34615 340606 cd17086 RAWUL_RING1_like RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene proteins RING1, RING2 and similar proteins. RING1, also termed polycomb complex protein RING1, or RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), has been identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. RING2, also termed huntingtin-interacting protein 2-interacting protein 3, or HIP2-interacting protein 3, or protein DinG, or RING finger protein 1B (RING1B), or RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. Both RING1 and RING2 are core components of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING2 acts as the main E3 ubiquitin ligase on histone H2A of the PRC1 complex, while RING1 may rather act as a modulator of RNF2/RING2 activity. Members in this family contain a C3HC4-type RING-HC finger, and a RAWUL domain. 106
34616 340607 cd17087 RAWUL_DRIP_like RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in DREB2A-interacting protein (DRIP) and similar proteins. Dehydration-Responsive Element-Binding Protein 2A (DREB2A) regulates the expression of stress-inducible genes via the dehydration-responsive elements and requires posttranslational modification for its activation. DREB2A-Interacting Protein (DRIP) contains a RING finger, and a RING finger- and WD40-associated ubiquitin-like (RAWUL) domain. DRIP interacts with DREB2A and functions as an E3 ubiquitin ligase that negatively regulates DREB2A expression. 104
34617 340608 cd17088 FERM_F1_FRMPD1_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD3, FRMPD4, and similar proteins. This family includes FERM and PDZ domain-containing proteins FRMPD1, FRMPD3, and FRMPD4, which all contain a PDZ domain and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMPD1, also termed FERM domain-containing protein 2, is an activator of G-protein signaling 3 (AGS3)-binding protein that regulates the subcellular location of AGS3 and its interaction with G-proteins. FRMPD4, also termed PDZ domain-containing protein 10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a multiscaffolding protein that modulates both Homer1 and post-synaptic density protein 95 activity. Both FRMPD1 and FRMPD4 can associate with the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. The biological role of FRMPD3 remains unclear. 90
34618 340609 cd17089 FERM_F0_TLN FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin and similar proteins. Talin is a cytoskeletal protein that activates integrins and couples them to cytoskeletal actin. Talin consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain that is joined to the F1 domain in a novel fixed orientation by an extensive charged interface. It is required for maximal integrin-activation, by interacting with other FA components; no binding partner has yet been found for it. 84
34619 340610 cd17090 FERM_F1_TLN FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin and similar proteins. Talin is a cytoskeletal protein that activates integrins and couples them to cytoskeletal actin. Talin consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain that is joined to the F1 domain in a novel fixed orientation by an extensive charged interface. It is required for maximal integrin-activation, by interacting with other FA components; no binding partner has yet been found for it. 111
34620 340611 cd17091 FERM_F0_SHANK FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains proteins, SHANK1, SHANK2, and SHANK3. SHANK proteins (SHANK1, SHANK2, and SHANK3) are core components of the postsynaptic density (PSD) of excitatory synapses. They act as scaffolding molecules that cluster neurotransmitter receptors as well as cell adhesion molecules attaching them to the actin cytoskeleton. They play important roles in proper excitatory synapse and circuit function. Mutations in SHANK genes, especially in SHANK3 and SHANK2, may lead to neuropsychiatric disorders, such as autism spectrum disorder (ASD). SHANK proteins contain an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold. 84
34621 340612 cd17092 FERM1_F1_Myosin-VII FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain 1, F1 sub-domain, found in Myosin-VIIa, Myosin-VIIb, and similar proteins. This family includes two nontraditional members of the myosin superfamily, myosin-VIIa and myosin-VIIb. Myosin-VIIa, also termed myosin-7a (Myo7a), has been implicated in the structural organization of hair bundles at the apex of sensory hair cells (SHCs) where it serves mechanotransduction in the process of hearing and balance. Mutations in the MYO7A gene may be associated with Usher Syndrome type 1B (USH1B) and nonsyndromic hearing loss (DFNB2, DFNA11). Myosin-VIIb, also termed myosin-7b (Myo7b), is a high duty ratio motor adapted for generating and maintaining tension. It associates with harmonin and ANKS4B to form a stable ternary complex for anchoring microvilli tip-link cadherins. Like other unconventional myosins, myosin-VII is composed of a conserved motor head, a neck region and a tail region containing two MyTH4 domains, a SH3 domain, and two FERM domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the first FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 99
34622 340613 cd17093 FERM2_F1_Myosin-VII FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain 2, F1 sub-domain, found in Myosin-VIIa, Myosin-VIIb, and similar proteins. This family includes two nontraditional members of myosin superfamily, myosin-VIIa and myosin-VIIb. Myosin-VIIa, also termed myosin-7a (Myo7a), has been implicated in the structural organization of hair bundles at the apex of sensory hair cells (SHCs) where it serves mechanotransduction in the process of hearing and balance. Mutations in MYO7A gene may be associated with Usher Syndrome type 1B (USH1B) and nonsyndromic hearing loss (DFNB2, DFNA11). Myosin-VIIb, also termed myosin-7b (Myo7b), is a high duty ratio motor adapted for generating and maintaining tension. It associates with harmonin and ANKS4B to form a stable ternary complex for anchoring microvilli tip-link cadherins. Like other unconventional myosins, myosin-VII is composed of a conserved motor head, a neck region and a tail region containing two MyTH4 domains, a SH3 domain, and two FERM domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the second FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 98
34623 340614 cd17094 FERM_F1_Max1_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Caenorhabditis elegans max-1 and its homologs PLEKHH1 and PLEKHH2. Caenorhabditis elegans max-1 is expressed and functions in motor neurons. MAX-1 protein plays a possible role in netrin-induced axon repulsion by modulating the UNC-5 receptor signaling pathway. PLEKHH1 is critically required in vascular patterning in vertebrate species through acting upstream of the ephrin pathway. PLEKHH2 is highly enriched in renal glomerular podocytes, and acts as a novel, important component of the podocyte foot processes. It is involved in matrix adhesion and actin dynamics. Members in this family all contain two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 102
34624 340615 cd17095 FERM_F0_kindlins FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in the kindlin family. The kindlin family is composed of kindlin-1, 2 and 3, which are FERM domain-containing adaptor molecules that interact with the cytoplasmic component of integrins and regulate cell-matrix connections. Kindlins belong to the 4.1-ezrin-radixin-moesin (FERM) domain containing protein family. They contain F1, F2 and F3 subdomains that typify FERM family members, and these subdomains are preceded by an N-terminal F0 subdomain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain. In addition, a distinctive feature of kindlins is the insertion of a pleckstrin homology (PH) subdomain into the F2 subdomain. 80
34625 340616 cd17096 FERM_F1_kindlins FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the kindlin family. The kindlin family is composed of Kindlin-1, 2 and 3, which are FERM domain-containing adaptor molecules that interact with the cytoplasmic component of integrins and regulate cell-matrix connections. Kindlins belong to the 4.1-ezrin-radixin-moesin (FERM) domain containing protein family. They contain F1, F2 and F3 subdomains that typify FERM family members, and these subdomains are preceded by an N-terminal F0 subdomain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain. In addition, a distinctive feature of kindlins is the insertion of a pleckstrin homology (PH) subdomain into the F2 subdomain. 90
34626 340617 cd17097 FERM_F1_ERM_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the ERM family proteins. The ezrin-radixin-moesin (ERM) family includes a group of closely related cytoskeletal proteins that play an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. They exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Merlin, which is highly related to the members of the ezrin, radixin, and moesin (ERM) protein family that are directly attached to and functionally linked with NHE1, is included in this family. 83
34627 340618 cd17098 FERM_F1_FARP1_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, RhoGEF and pleckstrin domain-containing protein 1 (FARP1) and similar proteins. This family includes the F1 sub-domain of FERM, RhoGEF and pleckstrin domain-containing proteins FARP1, FARP2, and FERM domain-containing protein 7 (FRMD7). FARP1, also termed chondrocyte-derived ezrin-like protein (CDEP), or pleckstrin homology (PH) domain-containing family C member 2 (PLEKHC2), is a neuronal activator of the RhoA GTPase. It promotes outgrowth of developing motor neuron dendrites. It also regulates excitatory synapse formation and morphology, as well as activates the GTPase Rac1 to promote F-actin assembly. FARP2, also termed FERM domain including RhoGEF (FIR), or Pleckstrin homology (PH) domain-containing family C member 3, is a Dbl-family guanine nucleotide exchange factor (GEF) that activates Rac1 or Cdc42 in response to upstream signals, suggesting roles in regulating processes such as neuronal axon guidance and bone homeostasis. It is also a key molecule involved in the response of neuronal growth cones to class-3 semaphorins. FRMD7 plays an important role in neuronal development and is involved in the regulation of F-actin, neurofilament, and microtubule dynamics. All family members contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 85
34628 340619 cd17099 FERM_F1_PTPN14_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptors PTPN14, PTPN21, and similar proteins. This family includes tyrosine-protein phosphatase non-receptors PTPN14 and PTPN21, both of which are protein-tyrosine phosphatase (PTP). They belong to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN14 plays a role in the nucleus during cell proliferation. PTPN21 interacts with a Tec tyrosine kinase family member, the epithelial and endothelial tyrosine kinase (Etk, also known as Bmx), modulates Stat3 activation, and plays a role in the regulation of cell growth and differentiation. 85
34629 340620 cd17100 FERM_F1_PTPN3_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 3 (PTPN3) and similar proteins. This family includes two tyrosine-protein phosphatase non-receptors, PTPN3 and PTPN4, both of which belong to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 86
34630 340621 cd17101 FERM_F1_PTPN13_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 13 (PTPN13) and similar proteins. This family includes tyrosine-protein phosphatase non-receptor type 13 (PTPN13), FERM and PDZ domain-containing protein 2 (FRMPD2), and FERM domain-containing proteins FRMD1 and FRMD6. All family members contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 97
34631 340622 cd17102 FERM_F1_FRMD3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 3 (FRMD3) and similar proteins. FRMD3, also termed band 4.1-like protein 4O, or ovary type protein 4.1 (4.1O), belongs to the 4.1 protein superfamily, which share the highly conserved membrane-association FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMD3 is involved in maintaining cell shape and integrity. It also functions as a tumour suppressor in non-small cell lung carcinoma (NSCLC). Some single nucleotide polymorphisms (SNPs) located in FRMD3 have been associated with diabetic kidney disease (DKD) in different ethnicities. 82
34632 340623 cd17103 FERM_F1_FRMD4 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing proteins FRMD4A, FRMD4B, and similar proteins. This family includes FERM domain-containing proteins FRMD4A and FRMD4B, both of which contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMD4A is a cytohesin adaptor involved in cell structure, transport and signaling. It promotes the growth of cancer cells in tongue, head and neck squamous cell carcinomas. FRMD4B, also termed GRP1-binding protein GRSP1, interacts with the coil-coil domain of ARF exchange factor GRP1 to form the Grsp1-Grp1 complex that co-localizes with cortical actin rich regions in response to stimulation of CHO-T cells with insulin or epidermal growth factor (EGF). 91
34633 340624 cd17104 FERM_F1_MYLIP FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in E3 ubiquitin-protein ligase MYLIP and similar proteins. MYLIP, also termed inducible degrader of the LDL-receptor (Idol), or myosin regulatory light chain interacting protein (MIR), is an E3 ubiquitin-protein ligase that mediates ubiquitination and subsequent proteasomal degradation of myosin regulatory light chain (MRLC), LDLR, VLDLR and LRP8. Its activity depends on E2 ubiquitin-conjugating enzymes of the UBE2D family, including UBE2D1, UBE2D2, UBE2D3, and UBE2D4. MYLIP stimulates clathrin-independent endocytosis and acts as a sterol-dependent inhibitor of cellular cholesterol uptake by binding directly to the cytoplasmic tail of the LDLR and promoting its ubiquitination via the UBE2D1/E1 complex. The ubiquitinated LDLR then enters the multivesicular body (MVB) protein-sorting pathway and is shuttled to the lysosome for degradation. Moreover, MYLIP has been identified as a novel ERM-like protein that affects cytoskeleton interactions regulating cell motility, such as neurite outgrowth. The ERM proteins includes ezrin, radixin, and moesin, which are cytoskeletal effector proteins linking actin to membrane-bound proteins at the cell surface. MYLIP contains a FERM-domain and a C-terminal C3HC4-type RING-HC finger. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 81
34634 340625 cd17105 FERM_F1_EPB41 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1 (EPB41) and similar proteins. EPB41, also termed protein 4.1 (P4.1), or 4.1R, or Band 4.1, or EPB4.1, belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41 is a widely expressed cytoskeletal phosphoprotein that stabilizes the spectrin-actin cytoskeleton and anchors the cytoskeleton to the cell membrane. EPB41 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 83
34635 340626 cd17106 FERM_F1_EPB41L FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like proteins. The family includes erythrocyte membrane protein band 4.1-like proteins EPB41L1/4.1N, EPB41L2/4.1G, and EPB41L3/4.1B. They belong to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L1 is a cytoskeleton-associated protein that may serve as a tumor suppressor in solid tumors. EPB41L2 is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L3 also acts as a tumor suppressor implicated in a variety of meningiomas and carcinomas. Members in this family contain a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 84
34636 340627 cd17107 FERM_F1_EPB41L4A FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte band 4.1-like protein 4A (EPB41L4A) and similar proteins. EPB41L4A, also termed protein NBL4, is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). EPB41L4A is an important component of the beta-catenin/Tcf pathway. It may be related to determination of cell polarity or proliferation. 91
34637 340628 cd17108 FERM_F1_EPB41L5_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like 5 (EPB41L5) and similar proteins. This family includes FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like proteins, EPB41L5 and EPB41L4B. EPB41L5 is a mesenchymal-specific protein that is an integral component of the ARF6-based pathway. EPB41L4B is a positive regulator of keratinocyte adhesion and motility, suggesting a role in wound healing. It also promotes cancer metastasis in melanoma, prostate cancer and breast cancer. Both EPB41L5 and EPB41L4B contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 81
34638 340629 cd17109 FERM_F1_SNX17_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in PX-FERM family sorting nexin proteins. This family includes three endosome-associated PX (Phox homology) and FERM (Band 4.1, ezrin, radixin, moesin) domain-containing proteins called sorting nexin (SNX) 17, SNX27, and SNX31, which are modular peripheral membrane proteins acting as central scaffolds mediating protein-lipid interactions, cargo binding, and regulatory protein recruitment. They are key regulators of endosomal recycling and bind conserved NPX(Y/F) peptide sorting motifs in transmembrane cargos via an atypical FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 93
34639 340630 cd17110 FERM_F1_Myo10_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in unconventional myosin-X and similar proteins. Myosin-X, also termed myosin-10 (Myo10), is an untraditional member of myosin superfamily. It is an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. Myosin-X functions as an important regulator of cytoskeleton that modulates cell motilities in many different cellular contexts. It regulates neuronal radial migration through interacting with N-cadherin. Like other unconventional myosins, Myosin-X is composed of a conserved motor head, a neck region and a variable tail. The neck region consists of three IQ motifs (light chain-binding sites), and a predicted stalk of coiled coil. The tail contains three PEST regions, three PH domains, a MyTH4 domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Amoebozoan Dictyostelium discoideum myosin VII (DdMyo7) and uncharacterized pleckstrin homology domain-containing family H member 3 (PLEKHH3) are also included in this family. Like metazoan Myo10, DdMyo7 is essential for the extension of filopodia, plasma membrane protrusions filled with parallel bundles of F-actin. 97
34640 340631 cd17111 RA1_DAGK-theta Ras-associating (RA) domain 1 found in diacylglycerol kinase theta (DAGK-theta) and similar proteins. DAGK phosphorylates the second messenger diacylglycerol to phosphatidic acid as part of a protein kinase C pathway. DAGK-theta is characterized as a type V DAGK that has three cysteine-rich domains (all other isoforms have two), a proline/glycine-rich domain at its N-terminal, and a proposed Ras-associating (RA) domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Ten mammalian isoforms of DAGK have been identified to date which are organized into five categories based on the domain architecture. DAGK-theta also contains a pleckstrin homology (PH) domain. The subcellular localization and the activity of DAGK-theta are regulated in a complex (stimulation- and cell type-dependent) manner. This family corresponds to the first RA domain of DAGK-theta. 91
34641 340632 cd17112 RA_MRL_like Ras-associating (RA) domain found in Mig10/RIAM/Lpd (MRL) family and growth factor receptor-bound (Grb) protein 7/10/14. MRL proteins share a common structural architecture, including a central structural unit consisting of a Ras-associating (RA) domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. MRL proteins have distinct functions in cell migration and adhesion, signaling, and in cell growth. Grb7/10/14 are multi-domain cytoplasmic adaptor proteins that are recruited to activated receptor tyrosine kinases. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. 81
34642 340633 cd17113 RA_ARAPs Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing proteins ARAP1, ARAP2, ARAP3, and similar proteins. ARAPs are phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating proteins (GAPs). They contain multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 95
34643 340634 cd17114 RA_PLC-epsilon Ras-associating (RA) domain found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol trisphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Although PLC RA1 and RA2 have homologous ubiquitin-like folds, only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. 97
34644 340635 cd17115 RA_RHG20 Ras-associating (RA) domain found in Rho GTPase-activating protein 20 (RHG20) and similar proteins. RHG20, also termed ARHGAP20, is one of the GTPase-activating proteins for Rho family proteins (RhoGAPs). It contains a PH-like domain, an RA domain, a RhoGap domain, and two Annexin-like (ANXL) repeats. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin, a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 117
34645 340636 cd17116 RA_Radil_like Ras-associating (RA) domain found in ras-associating and dilute domain-containing protein (Radil) and similar proteins. Radil acts as an important small GTPase Rap1 effector required for cell spreading and migration. It regulates neutrophil adhesion and motility through linking Rap1 to beta2-integrin activation. This family also includes Ras-interacting protein 1 (Rain, also termed Rasip1), which is a novel Ras-interacting protein with a unique subcellular localization. It interacts with Ras in a GTP-dependent manner, and may serve as an effector for endomembrane-associated Ras. Radil contains RA, DIL, and PDZ domains. In contrast, Rain contains a myosin5-like cargo binding domain, a RA domain and a PDZ domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 121
34646 340637 cd17117 RA_ANKFN1_like Ras-associating (RA) domain found in Ankyrin repeat and fibronectin type-III domain-containing protein 1 (ANKFN1) and similar proteins. ANKFN1 is a multi-domain protein, with unknown function, that contains two ankyrin repeats and one fibronectin type-III domain. Except for the mammalian homologs, most metazoan ANKFN1 proteins harbor a RA domain at the C-terminus. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. 97
34647 340638 cd17118 Ubl_HERP1 ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP1 and similar proteins. HERP1, also termed homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain member 1 protein (HERPUD1), or methyl methanesulfonate (MMF)-inducible fragment protein 1 (MIF1), is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. HERP1 is a component of the ER quality control (ERQC) system, also called ER-associated degradation (ERAD), which is involved in ubiquitin-dependent degradation of misfolded ER proteins. It promotes the degradation of ER-resident Ca2+ channels. It is also involved in ubiquitin ligase HRD1-dependent protein degradation at the ER. Moreover, HERP1 plays a role in regulating the cell cycle, apoptosis and steroid hormone biosynthesis in mouse granulosa cells. 78
34648 340639 cd17119 Ubl_HERP2 ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP2 and similar proteins. HERP2, also termed homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain member 2 protein (HERPUD2), is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. It is homologous to HERP1, and could be involved in the unfolded protein response (UPR) pathway. It regulates the ubiquitin ligase HRD1-dependent protein degradation at the ER. 77
34649 340640 cd17120 Ubl_UBTD1 ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein 1 (UBTD1). UBTD1 is the mammalian homolog of the mitochondrial Dc-UbP/UBTD2 protein, both of which contain an N-terminal ubiquitin binding domain (UBD) and a C-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. UBTD1 stably interacts with the UBE2D family of E2 ubiquitin conjugating enzymes. As a tumor suppressor, UBTD1 plays a pivotal role in regulating cellular senescence that mediates p53 function. It is also involved in MDM2 ubiquitination and degradation. 69
34650 340641 cd17121 Ubl_UBTD2 ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein 2 (UBTD2). UBTD2, also termed dendritic cell-derived ubiquitin-like protein (DC-UbP), or ubiquitin-like protein SB72, is a potential ubiquitin shuttle protein first identified in dendritic cells and implicated in apoptosis. It binds proteins involved in the ubiquitination pathway and may play an important role in regulating protein ubiquitination and delivery of ubiquitinated substrates in eukaryotic cells. UBTD2 is expressed in tumor cells but not in normal human adult tissue, suggesting a role in tumorigenesis. It can potentially regulate the stability of proteins involved in various physiological processes relevant to many disease phenotypes, such as ageing and cancer. UBTD2 reconciles protein ubiquitination and deubiquitination via linking UbE1 and USP5 enzymes. 75
34651 340642 cd17122 Ubl_UHRF1 ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1, also termed inverted CCAAT box-binding protein of 90 kDa, or nuclear protein 95, or nuclear zinc finger protein Np95 (Np95), or RING finger protein 106, or transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma, gastric cancer, esophageal squamous cell carcinoma, colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET and RING finger associated (SRA) domain, and a C-terminal RING-finger domain.
It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitination has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD finger targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. 74
34652 340643 cd17123 Ubl_UHRF2 ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2, also termed Np95/ICBP90-like RING finger protein (NIRF), or Np95-like RING finger protein, or nuclear protein 97, or nuclear zinc finger protein Np97, or RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a set- and ring-associated (SRA) domain, and a C-terminal RING finger. 74
34653 340644 cd17124 Ubl_TECR ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase (TECR). TECR, also termed very-long-chain enoyl-CoA reductase, or synaptic glycoprotein SC2, or TER, or GPSN2, is a synaptic glycoprotein that catalyzes the fourth reaction in the synthesis of very long-chain fatty acids (VLCFA) which is the reduction step of the microsomal fatty acyl-elongation process. Diseases involving perturbations to normal synthesis and degradation of VLCFA (e.g. adrenoleukodystrophy and Zellweger syndrome) have significant neurological consequences. The mammalian TECR P182L mutation causes nonsyndromic mental retardation. Deletion of the yeast TECR homolog (TSC13p) is lethal. TECR contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain. 79
34654 340645 cd17125 Ubl_TECRL ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase-like (TECRL). TECRL, also termed steroid 5-alpha-reductase 2-like 2 protein (SRD5A2L2), is associated with life-threatening inherited arrhythmias displaying features of both long QT syndrome (LQTS) and catecholaminergic polymorphic ventricular tachycardia (CPVT). TECRL contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain. 78
34655 340646 cd17126 Ubl_HR23A ubiquitin-like (Ubl) domain found in UV excision repair protein RAD23 homolog A (HR23A). HR23A, also termed RAD23A, is a DNA repair protein that binds to the 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation, which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with the proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (Ubl) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has an XPC protein-binding domain that might be necessary for its efficient NER function. 76
34656 340647 cd17127 Ubl_TBK1 ubiquitin-like (Ubl) domain found in TRAF family member-associated NF-kappaB activator (TANK)-binding kinase 1 (TBK1). TBK1, also termed NF-kappa-B-activating kinase, or T2K, or TANK-binding kinase 1, is an interferon regulatory factor-activating kinase that is a non-canonical member of the IKK family. It plays a role in regulating innate immunity, inflammation and oncogenic signaling. TBK1 is involved in the regulation of type I interferons and of nuclear factor-kappaB (NF-kappaB) signal transduction. It regulates factors such as IRF3 and IRF7, promoting antiviral activity in the interferon signaling pathways. It modulates inflammatory hyperalgesia by regulating MAP kinases and NF-kappaB dependent genes. Moreover, TBK1 acts as a central player in the intracellular nucleic acid-sensing pathways involved in antiviral signaling. Dysregulation of TBK1 activity is often associated with autoimmune diseases and cancer. TBK1 contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, and a C-terminal elongated helical domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway. 78
34657 340648 cd17128 Ubl_IKKE ubiquitin-like (Ubl) domain found in inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E). IKK-E (EC 2.7.11.10), also termed I-kappa-B kinase epsilon (IKKepsilon), or IKK-epsilon, or IkBKE, or inducible I kappa-B kinase (IKK-i), is an interferon regulatory factor-activating kinase that is a non-canonical member of the IKK family. It is involved in the cellular innate immunity by inducing type I interferons. It is induced by the activation of nuclear factor-kappaB (NF-kappaB). IKK-E has also been implicated in antiviral immune response in higher vertebrates. It acts as a crucial pro-survival factor in human T cell leukemia virus type 1 (HTLV-1)-transformed T lymphocytes. Moreover, IKK-E plays an essential role in tumor initiation and progression. It inhibits protein kinase C (PKC) to promote Fascin-dependent actin bundling. IKK-E contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, and a C-terminal elongated helical domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway. 78
34658 340649 cd17129 Ubl1_FAF1 ubiquitin-like (Ubl) domain 1 found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin (Ub) regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. The UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with the UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not to a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. 
Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. The family corresponds to the first Ubl domain. 73
34659 340650 cd17130 Ubl2_FAF1 ubiquitin-like (Ubl) domain 2 found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. The UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with the UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not to a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. 
Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. The family corresponds to the second Ubl domain. 75
34660 340651 cd17131 Ubl_TMUB1 ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing protein 1 (TMUB1). TMUB1, also termed dendritic cell-derived ubiquitin-like protein (DULP), or hepatocyte odd protein shuttling protein, or ubiquitin-like protein SB144, or HOPS, is highly expressed in the nervous system. It is involved in the termination of liver regeneration and plays a negative role in interleukin-6-induced hepatocyte proliferation. The overexpression of Tmub1 has been shown to play a role in the inhibition of cell proliferation. TMUB1 has also been implicated in the regulation of locomotor activity and wakefulness in mice, perhaps acting through its interaction with CAMLG. It also facilitates the recycling of AMPA receptors into synaptic membrane in cultured primary neurons. TMUB1 contains transmembrane domains and a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. 71
34661 340652 cd17132 Ubl_TMUB2 ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing protein 2 (TMUB2). TMUB2 is an uncharacterized transmembrane and ubiquitin-like (Ubl) domain-containing protein that shows high sequence similarity to TMUB1; the latter is highly expressed in the nervous system and is involved in the termination of liver regeneration. 71
34662 340653 cd17133 RBD_ARAF Ras-binding domain (RBD) found in serine/threonine-protein kinase ARAF. ARAF, also termed proto-oncogene ARAF, or proto-oncogene ARAF1, or proto-oncogene PKS2, belongs to the RAF protein family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. ARAF is predominantly found in urogenital tissue with a low basal kinase activity. It directly cross-talks with NODAL/SMAD2 signaling in a MAPK-independent manner. It also promotes MAPK pathway activation and cell migration in a cell type-dependent manner. Moreover, ARAF acts as a scaffold to stabilize BRAF-CRAF heterodimers. Mice deleted for ARAF are viable but die perinatally. 73
34663 340654 cd17134 RBD_BRAF Ras-binding domain (RBD) found in serine/threonine-protein kinase BRAF. BRAF, also termed proto-oncogene B-Raf, or p94, or v-Raf murine sarcoma viral oncogene homolog B1, belongs to the RAF family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. BRAF is the most effective RAF kinase in terms of induction of MEK/ERK activity. It is somatically mutated in a number of human cancers. 79
34664 340655 cd17135 RBD_CRAF Ras-binding domain (RBD) found in RAF proto-oncogene serine/threonine-protein kinase RAF1/CRAF. RAF1/CRAF, also termed proto-oncogene c-RAF (cRaf), or Raf-1, belongs to the RAF family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. RAF1/CRAF is an important effector of Ras-mediated signaling and is a critical regulator of the MAPK/ERK pathway. 77
34665 340656 cd17136 RBD1_RGS12 Ras-binding domain (RBD) 1 of regulator of G protein signaling 12 (RGS12). RGS12 (regulator of G protein signaling 12) is a multidomain RGS protein with numerous signaling regulatory elements. In addition to a central RGS domain which is responsible for GAP activity, the long RGS12 splice variant contains a PDZ (PSD-95/Discs-large/ZO-1 homology) domain capable of binding the interleukin-8 receptor B (CXCR2) or its own C-terminus, a phosphotyrosine-binding (PTB) domain that associates with tyrosine-phosphorylated N-type calcium channel, two tandem Ras-binding domains (RBDs) that may integrate signaling pathways involving both heterotrimeric and monomeric G-proteins, and a GoLoco (G-alpha-i/o-Loco) interaction motif which has guanine nucleotide dissociation inhibitor (GDI) activity toward G-alpha-i subunits. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. 70
34666 340657 cd17137 RBD1_RGS14 Ras-binding domain (RBD) 1 of regulator of G protein signaling 14 (RGS14). RGS14 (regulator of G protein signaling 14) is a RGS protein with a multidomain structure that allows it to interact with binding partners from multiple signaling pathways. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. RGS14 contains an N-terminal RGS domain, two tandem Ras-binding domains (RBDs) and a G protein regulatory (GPR, also referred to as a GoLoco) motif. RGS14 binds activated H-Ras-GTP through its first RBD and interacts with Rap2-GTP and RAF kinases by the second tandem RBD. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS14 modulates neuronal physiology and all of its binding partners have roles in synaptic plasticity. 71
34667 340658 cd17138 RBD2_RGS12 Ras-binding domain (RBD) 2 of regulator of G protein signaling 12 (RGS12). RGS12 (regulator of G-protein signaling 12) is a multidomain RGS protein with numerous signaling regulatory elements. In addition to a central RGS domain which is responsible for GAP activity, the long RGS12 splice variant contains a PDZ (PSD-95/Discs-large/ZO-1 homology) domain capable of binding the interleukin-8 receptor B (CXCR2) or its own C-terminus, a phosphotyrosine-binding (PTB) domain that associates with tyrosine-phosphorylated N-type calcium channel, two tandem Ras-binding domains (RBDs) that may integrate signaling pathways involving both heterotrimeric and monomeric G-proteins, and a GoLoco (G-alpha-i/o-Loco) interaction motif which has guanine nucleotide dissociation inhibitor (GDI) activity toward G-alpha-i subunits. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. 73
34668 340659 cd17139 RBD2_RGS14 Ras-binding domain (RBD) 2 of regulator of G protein signaling 14 (RGS14). RGS14 (regulator of G protein signaling 14) is a RGS protein with a multidomain structure that allows it to interact with binding partners from multiple signaling pathways. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. RGS14 contains an N-terminal RGS domain, two tandem Ras-binding domains (RBDs) and a G protein regulatory (GPR, also referred to as a GoLoco) motif. RGS14 binds activated H-Ras-GTP through its first RBD and interacts with RAP2-GTP and RAF kinases by the second tandem RBD. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS14 modulates neuronal physiology and all of its binding partners have roles in synaptic plasticity. 73
34669 340660 cd17140 DCX1_DCLK1 Doublecortin-like domain 1 found in doublecortin-like kinase 1 (DCLK1). DCLK1 is a member of the doublecortin (DCX) protein superfamily that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule-binding domains, DCLK encodes a serine/threonine kinase domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK1 appears to regulate cyclic AMP signaling and is involved in neuronal migration, retrograde transport, neuronal apoptosis and neurogenesis. Unlike DCX, this DCLK has varying levels of expression throughout embryonic and adult life. 89
34670 340661 cd17141 DCX1_DCLK2 Doublecortin-like domain 1 found in doublecortin-like kinase 2 (DCLK2). DCLK2 is a member of the doublecortin (DCX) protein superfamily that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains, which typically occur in tandem. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier (ubiquitination) in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. The molecular actions of DCX family members are less well characterized, but DCLK2 members appear to regulate cyclic AMP signaling. Unlike DCX, this DCLK has varying levels of expression throughout embryonic and adult life. 85
34671 340662 cd17142 DCX2_DCX Doublecortin-like domain 2 found in neuronal migration protein doublecortin (DCX). DCX, also termed doublin or lissencephalin-X (Lis-XDCX), is a microtubule-associated protein (MAP). It belongs to the doublecortin (DCX) family, has double tandem DCX repeats, and is expressed in migrating neurons. Structural studies show that the N-terminal DCX domain has a stable ubiquitin-like fold. DCX is not only a unique MAP in terms of its structure, but also interacts with multiple additional proteins. Mutations in the human DCX gene are associated with abnormal neuronal migration, epilepsy, and mental retardation. 84
34672 340663 cd17143 DCX2_DCLK1 Doublecortin-like domain 2 found in doublecortin-like kinase 1 (DCLK1). DCLK1 is a member of the doublecortin (DCX) protein family that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK1 appears to regulate cyclic AMP signaling and is involved in neuronal migration, retrograde transport, neuronal apoptosis and neurogenesis. Unlike DCX, the DCLK has varying levels of expression throughout embryonic and adult life. 84
34673 340664 cd17144 DCX2_DCLK2 Doublecortin-like domain 2 found in doublecortin-like kinase 2 (DCLK2). DCLK2 is a member of the doublecortin (DCX) protein family that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains, which typically occur in double tandem. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK2 members regulate cyclic AMP signaling. Unlike DCX, the DCLK has varying levels of expression throughout embryonic and adult life. 84
34674 340665 cd17145 DCX1_RP1 Doublecortin-like domain 1 found in retinitis pigmentosa 1 (RP1)-like protein. RP1, also termed oxygen-regulated protein 1, is a member of the doublecortin (DCX) family. Its DCX domains occur in double tandem repeats. RP1 is associated with retinitis pigmentosa, which is a type of inherited blindness. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The RP1 protein is expressed in photoreceptors and is required for correct stacking of outer segment discs. It interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 79
34675 340666 cd17146 DCX1_RP1L1 Doublecortin-like domain 1 found in retinitis pigmentosa 1-like 1 (RP1L1) protein. RP1L1 is a member of the doublecortin (DCX) family. Its DCX domains occur in double tandem repeats. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX-domain of RP1L1 localizes to the photoreceptor and is genetically associated with retinitis pigmentosa. 79
34676 340667 cd17147 DCX2_RP1 Doublecortin-like domain 2 found in retinitis pigmentosa 1 (RP1)-like protein. RP1, also termed oxygen-regulated protein 1, is a member of the doublecortin (DCX) superfamily that contains double tandem repeats of the DCX domains. RP1 is associated with retinitis pigmentosa, which is a type of inherited blindness. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The RP1 protein is expressed in photoreceptors and is required for correct stacking of outer segment discs. RP1 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 76
34677 340668 cd17148 DCX2_RP1L1 Doublecortin-like domain 2 found in retinitis pigmentosa 1-like 1 (RP1L1) protein. RP1L1 is a member of the doublecortin (DCX) family. Its protein domains occur in tandem repeats. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX-domain of RP1L1 localizes to the photoreceptor and is genetically associated with retinitis pigmentosa. 76
34678 340669 cd17149 DCX1_DCDC2 Doublecortin-like domain 1 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 80
34679 340670 cd17150 DCX1_DCDC2B Doublecortin-like domain 1 found in doublecortin domain-containing protein 2B (DCDC2B). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 79
34680 340671 cd17151 DCX1_DCDC2C Doublecortin-like domain 1 found in doublecortin domain-containing protein 2C (DCDC2C). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 79
34681 340672 cd17152 DCX2_DCDC2 Doublecortin-like domain 2 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 80
34682 340673 cd17153 DCX2_DCDC2B Doublecortin-like domain 2 found in doublecortin domain-containing protein 2B (DCDC2B). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 80
34683 340674 cd17154 DCX2_DCDC2C Doublecortin-like domain 2 found in doublecortin domain-containing protein 2C (DCDC2C). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with. 80
34684 340675 cd17155 DCX_DCDC1 Doublecortin-like domain found in doublecortin domain-containing protein 1 (DCDC1). Doublecortin (DCX) is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCX gene family consists of eleven paralogs in human and mouse, such that its protein domains can occur in tandem or single repeats. Proteins with DCX tandem domains in general have roles in microtubule (MT) regulation and signal transduction while single DCX repeat proteins are normally localized to actin-rich subcellular structures or to the nucleus. DCDC1 is a hydrophilic intracellular protein that contains only one DCX repeat. Therefore, DCDC1 might only bind to microtubules without microtubule polymerization properties. DCDC1 is mainly expressed in adult testis. 72
34685 340676 cd17156 DCX1_DCDC5 Doublecortin-like domain 1 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and is involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8. 76
34686 340677 cd17157 DCX2_DCDC5 Doublecortin-like domain 2 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8. 86
34687 340678 cd17158 DCX3_DCDC5 Doublecortin-like domain 3 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8. 73
34688 340679 cd17159 DCX4_DCDC5 Doublecortin-like domain 4 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8. 73
34689 340680 cd17160 UBX_UBXN2A Ubiquitin regulatory domain X (UBX) found in UBX domain protein 2A (UBXN2A) and similar proteins. UBXN2A, also termed UBX domain-containing protein 4 (UBXD4), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2A functions as a p97 (also known as VCP or Cdc48) adaptor protein facilitating the regulation of the cell surface number and stability of alpha3-containing neuronal nicotinic acetylcholine receptors (nAChRs). It also regulates nicotinic receptor degradation by modulating the E3 ligase activity of carboxyl terminus of Hsc70 interacting protein (CHIP) that is involved in the degradation of several disease-related proteins. In addition, UBXN2A is an important anticancer factor that contributes to p53 localization and activation as a host defense mechanism against cancerous growth. It acts as a potential mortalin inhibitor, as well as a potential chemotherapy sensitizer for clinical application. It binds to the oncoprotein mortalin-2 (mot-2), and further unsequesters p53 from mot-2 in the cytoplasm, resulting in translocation of p53 to the nucleus where p53 proteins activate their downstream biological effects, including apoptosis. 84
34690 340681 cd17161 UBX_UBXN2B Ubiquitin regulatory domain X (UBX) found in UBX domain protein 2B (UBXN2B) and similar proteins. UBXN2B, also termed NSFL1 cofactor p37, or p97 cofactor p37, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2B is an adaptor protein of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN2B forms a tight complex with p97 in the cytosol and localizes to the Golgi and endoplasmic reticulum (ER). 82
34691 340682 cd17162 UBX_UBXN2C Ubiquitin regulatory domain X (UBX) found in NSFL1 cofactor (also known as UBX domain-containing protein 2C (UBXN2C)) and similar proteins. UBXN2C, also termed NSFL1C, or NSFL1 cofactor p47, or p97 cofactor p47, UBX1, or UBXD10, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2C is a major adaptor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The main role of the UBXN2C/p97 complex is in regulation of membrane fusion events. It plays an essential role in the reassembly of Golgi stacks at the end of mitosis. UBXN2C also functions as an essential factor for Golgi membrane fusion, which associates with the nuclear factor-kappaB essential modulator (NEMO) subunit of the IkappaB kinase (IKK) complex upon tumor necrosis factor (TNF)-alpha or interleukin (IL)-1 stimulation, induces the lysosome-dependent degradation of polyubiquitinated NEMO without p97, and thus inhibits IKK activation. Moreover, UBXN2C regulates a membrane fusion process, which is required for the maintenance of the endoplasmic reticulum (ER) network, through phosphorylation by Cdc2 kinase. 82
34692 340683 cd17163 Ubl_ATG8_GABARAPL2 ubiquitin-like (Ubl) domain found in GABA type A receptor associated protein like 2 (GABARAPL2, also known as GATE16). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa), has a ubiquitin-like (Ubl) domain and is a sub-family of the autophagy-related 8 (ATG8) protein family. GABARAPL2 participates in autophagosome maturation, and seems to be involved in constitutive transport under normal conditions and in autophagic processes during stress. GABARAPL2 is characterized as a membrane transport modulator that controls intra-Golgi transport by interacting with NSF and Golgi v-SNARE GOS-28. 112
34693 340684 cd17164 RAWUL_PCGF2 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 2 (PCGF2). PCGF2, also termed DNA-binding protein Mel-18, or RING finger protein 110 (RNF110), or zinc finger protein 144 (ZNF144), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF2 uniquely regulates PRC1 to specify mesoderm cell fate in embryonic stem cells. It is required for PRC1 stability and maintenance of gene repression in embryonic stem cells (ESCs) and essential for ESC differentiation into early cardiac-mesoderm precursors. PCGF2 also plays a significant role in the angiogenic function of endothelial cells (ECs) by regulating endothelial gene expression. Furthermore, PCGF2 is a SUMO-dependent regulator of hormone receptors. It facilitates the deSUMOylation process by inhibiting PCGF4/BMI1-mediated ubiquitin-proteasomal degradation of SUMO1/sentrin-specific protease 1 (SENP1). It is also a novel negative regulator of breast cancer stem cells (CSCs) that inhibits the stem cell population, and in vitro and in vivo self-renewal through the inactivation of Wnt-mediated Notch signaling. PCGF2 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 99
34694 340685 cd17165 RAWUL_PCGF4 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 4 (PCGF4). PCGF4, also termed polycomb complex protein BMI-1 (B cell-specific Moloney murine leukemia virus integration site 1), or RING finger protein 51 (RNF51), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3), and plays important roles in chromatin compaction and H2AK119 monoubiquitination. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence target gene in a PRC2-independent manner. Moreover, PCGF4 is expressed in the hair cells and supporting cells. It can regulate cell survival by controlling mitochondrial function and reactive oxygen species (ROS) level in thymocytes and neurons, thus having an important role in the survival and sensitivity to ototoxic drug of auditory hair cells. Furthermore, PCGF4 controls memory CD4 T-cell survival through direct repression of Noxa gene in an Ink4a- and Arf-independent manner. It is required in neurons to suppress p53-induced apoptosis via regulating the antioxidant defensive response, and also involved in the tumorigenesis of various cancer types. PCGF4 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 97
34695 340686 cd17166 RAWUL_RING1 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene 1 protein (RING1). RING1, also termed polycomb complex protein RING1, or RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), has been identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. It is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING1 interacts with multiple PcG proteins and displays tumorigenic activity. It also shows zinc-dependent DNA binding activity. Moreover, RING1 inhibits transactivation of the DNA-binding protein recombination signal binding protein-Jkappa (RBP-J) by Notch through interaction with the LIM domains of KyoT2. RING1 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 124
34696 340687 cd17167 RAWUL_RING2 RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene 2 protein (RING2). RING2, also termed huntingtin-interacting protein 2-interacting protein 3, or HIP2-interacting protein 3, or protein DinG, or RING finger protein 1B (RING1B), or RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. RING2 is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. The enzymatic activity of RING2 is enhanced by the interaction with BMI1/PCGF4, and it is dispensable for early embryonic development and much of the gene repression activity of PRC1. Moreover, RING2 plays a key role in terminating neural precursor cell (NPC)-mediated production of subcerebral projection neurons (SCPNs) during neocortical development. It also plays a critical role in nonhomologous end-joining (NHEJ)-mediated end-to-end chromosome fusions. Furthermore, RING2 is essential for expansion of hepatic stem/progenitor cells. It promotes hepatic stem/progenitor cell expansion through simultaneous suppression of cyclin-dependent kinase inhibitors (CDKIs) Cdkn1a and Cdkn2a, known negative regulators of cell proliferation. RING2 also negatively regulates p53 expression through directly binding with both p53 and MDM2 and promoting MDM2-mediated p53 ubiquitination in selective cancer cell types to stimulate tumor development. RING2 contains a C3HC4-type RING-HC finger, and a RAWUL domain. 106
34697 340688 cd17168 FERM_F1_FRMPD1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 1 (FRMPD1). FRMPD1, also termed FERM domain-containing protein 2, is an activator of G-protein signaling 3 (AGS3)-binding protein that regulates the subcellular location of AGS3 and its interaction with G-proteins. It also binds to the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. FRMPD1 contains a PDZ domain and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 90
34698 340689 cd17169 FERM_F1_FRMPD3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 3 (FRMPD3). FRMPD3 is an uncharacterized FERM and PDZ domain-containing protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 93
34699 340690 cd17170 FERM_F1_FRMPD4 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 4 (FRMPD4). FRMPD4, also termed PDZ domain-containing protein 10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a multiscaffolding protein that modulates both Homer1 and postsynaptic density protein 95 activity. It can associate with the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. Moreover, FRMPD4 is asymmetrically distributed in the cytosol and nuclei of neural stem/progenitor cells in the adult brain, suggesting a significant role in cell differentiation via association with cell polarity machinery. FRMPD4 contains a WW domain, a PDZ domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 94
34700 340691 cd17171 FERM_F0_TLN1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin-1 (TLN1). TLN1 is a cytoskeletal protein that plays a pivotal role in regulating the activity of the integrin family of cell adhesion proteins by coupling them to F-actin. It functions as a focal adhesion protein involved in the attachment of the bacterium. It binds to multiple adhesion molecules, including integrins, vinculin, focal adhesion kinase (FAK), and actin. TLN1 also plays an essential role in integrin activation. TLN1 interacts with the hepatitis B virus (HBV) accessory protein X (HBx), which induces the degradation of TLN1. It also acts as an adaptor protein that regulates leukocyte function-associated antigen-1 (LFA-1) affinity. In addition, TLN1 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN1 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain. 84
34701 340692 cd17172 FERM_F0_TLN2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin-2 (TLN2). TLN2 is a cytoskeletal protein that plays an important role in cell adhesion and recycling of synaptic vesicles. TLN2 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN2 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain. 84
34702 340693 cd17173 FERM_F1_TLN1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin-1 (TLN1). TLN1 is a cytoskeletal protein that plays a pivotal role in regulating the activity of the integrin family of cell adhesion proteins by coupling them to F-actin. It functions as a focal adhesion protein involved in the attachment of the bacterium. It binds to multiple adhesion molecules, including integrins, vinculin, focal adhesion kinase (FAK), and actin. TLN1 also plays an essential role in integrin activation. TLN1 interacts with the hepatitis B virus (HBV) accessory protein X (HBx), which induces the degradation of TLN1. It also acts as an adaptor protein that regulates leukocyte function-associated antigen-1 (LFA-1) affinity. In addition, TLN1 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN1 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain. 112
34703 340694 cd17174 FERM_F1_TLN2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin-2 (TLN2). TLN2 is a cytoskeletal protein that plays an important role in cell adhesion and recycling of synaptic vesicles. TLN2 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN2 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain. 112
34704 340695 cd17175 FERM_F0_SHANK1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 1 (SHANK1). SHANK1, also termed somatostatin receptor-interacting protein, or SSTR-interacting protein (SSTRIP), is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. Mutations in the SHANK1 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation. SHANK1 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold. 89
34705 340696 cd17176 FERM_F0_SHANK2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 2 (SHANK2). SHANK2, also termed cortactin-binding protein 1 (CortBP1), or proline-rich synapse-associated protein 1, is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. It is strongly expressed in the cerebellum. Moreover, SHANK2 acts as a component of the albumin endocytic pathway in podocytes, and regulates renal albumin endocytosis. It also associates with and regulates Na+/H+ exchanger 3 (NHE3) and is involved in the fine regulation of transepithelial salt and water transport through affecting NHE3 expression and activity. Mutations in the SHANK2 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation. SHANK2 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold. 88
34706 340697 cd17177 FERM_F0_SHANK3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 3 (SHANK3). SHANK3, also termed proline-rich synapse-associated protein 2 (ProSAP2), is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. It is critical for synaptic plasticity and the trans-synaptic coupling between the reliability of presynaptic neurotransmitter release and postsynaptic responsiveness. It is a key component of a zinc-sensitive signaling system that regulates excitatory synaptic strength. Mutations in the SHANK3 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation, and the cause of human Phelan-McDermid syndrome (22q13.3 deletion syndrome) has been isolated to loss of function of one copy of the SHANK3 gene. SHANK3 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also known as DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold. 87
34707 340698 cd17178 FERM_F1_PLEKHH1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 1 (PLEKHH1). PLEKHH1 is a homolog of Caenorhabditis elegans MAX-1 that has been implicated in motor neuron axon guidance. PLEKHH1 is critical in vascular patterning in vertebrate species through acting upstream of the ephrin pathway. PLEKHH1 contains a putative alpha-helical coiled-coil segment within the N-terminal half, and two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 106
34708 340699 cd17179 FERM_F1_PLEKHH2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 2 (PLEKHH2). PLEKHH2 is a novel podocyte protein downregulated in human focal segmental glomerulosclerosis. It is highly enriched in renal glomerular podocytes, and acts as a novel, important component of the podocyte foot processes. PLEKHH2 contains a putative alpha-helical coiled-coil segment within the N-terminal half, and two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PLEKHH2 is involved in matrix adhesion and actin dynamics. It directly interacts through its FERM domain with the focal adhesion protein Hic-5 and actin. 103
34709 340700 cd17180 FERM_F0_KIND1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-1 (KIND1). KIND1, also termed Kindlerin, or Kindler syndrome protein, or fermitin family homolog 1 (FERMT1), or Unc-112-related protein 1 (URP1), is an integrin-interacting protein that has been implicated in cell adhesion, proliferation, polarity, and motility. It is essential for maintaining the structure of cell-matrix adhesion, such as focal adhesions and podosomes. KIND1 is expressed primarily in epithelial cells. Loss or mutations of KIND1 gene may cause the Kindler syndrome (KS), an autosomal recessive skin disorder with an intriguing progressive phenotype comprising skin blistering, photosensitivity, progressive poikiloderma with extensive skin atrophy, and propensity to skin cancer. KIND1 forms a molecular complex with the key transforming growth factor (TGF)-beta/Smad3 signaling components including type I TGFbeta receptor (TbetaRI), Smad3 and Smad anchor for receptor activation (SARA) to control the activation of TGF-beta/Smad3 signaling pathway. KIND1 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain. 84
34710 340701 cd17181 FERM_F0_KIND2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-2 (KIND2). KIND2, also termed fermitin family homolog 2 (FERMT2), or mitogen-inducible gene 2 protein (MIG-2), or Pleckstrin homology (PH) domain-containing family C member 1, is an adaptor protein that is widely distributed and is particularly abundant in adherent cells. It binds to the integrin beta cytoplasmic tail to promote integrin activation. It promotes carcinogenesis through regulation of cell-cell and cell-extracellular matrix adhesion. In addition, KIND2 plays an important role in cardiac development. KIND2 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain. 80
34711 340702 cd17182 FERM_F0_KIND3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-3 (KIND3). KIND3, also termed fermitin family homolog 3 (FERMT3), or MIG2-like protein, or Unc-112-related protein 2, is an adaptor protein that is expressed primarily in hematopoietic cells. It plays a central role in cell adhesion in hematopoietic cells, and also promotes integrin activation, clustering and outside-in signaling. KIND3, together with talin-1, contributes essentially to the activation of beta2-integrins in neutrophils. In addition, KIND3 interacts with the ribosome and regulates c-Myc expression required for proliferation of chronic myeloid leukemia cells. Mutations in the KIND3 gene cause leukocyte adhesion deficiency type III (LAD III), which is characterized by high susceptibility to infections, spontaneous and episodic bleedings, and osteopetrosis. KIND3 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain. 83
34712 340703 cd17183 FERM_F1_KIND1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-1 (KIND1). KIND1, also termed Kindlerin, or Kindler syndrome protein, or fermitin family homolog 1 (FERMT1), or Unc-112-related protein 1 (URP1), is an integrin-interacting protein that has been implicated in cell adhesion, proliferation, polarity, and motility. It is essential for maintaining the structure of cell-matrix adhesion, such as focal adhesions and podosomes. KIND1 is expressed primarily in epithelial cells. Loss or mutations of KIND1 gene may cause the Kindler syndrome (KS), an autosomal recessive skin disorder with an intriguing progressive phenotype comprising skin blistering, photosensitivity, progressive poikiloderma with extensive skin atrophy, and propensity to skin cancer. KIND1 forms a molecular complex with the key transforming growth factor (TGF)-beta/Smad3 signaling components including type I TGFbeta receptor (TbetaRI), Smad3 and Smad anchor for receptor activation (SARA) to control the activation of TGF-beta/Smad3 signaling pathway. KIND1 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain. 93
34713 340704 cd17184 FERM_F1_KIND2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-2 (KIND2). KIND2, also termed fermitin family homolog 2 (FERMT2), or mitogen-inducible gene 2 protein (MIG-2), or Pleckstrin homology (PH) domain-containing family C member 1, is an adaptor protein that is widely distributed and is particularly abundant in adherent cells. It binds to the integrin beta cytoplasmic tail to promote integrin activation. It promotes carcinogenesis through regulation of cell-cell and cell-extracellular matrix adhesion. KIND2 also plays an important role in cardiac development. KIND2 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain. 101
34714 340705 cd17185 FERM_F1_KIND3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-3 (KIND3). KIND3, also termed fermitin family homolog 3 (FERMT3), or MIG2-like protein, or Unc-112-related protein 2, is an adaptor protein that is expressed primarily in hematopoietic cells. It plays a central role in cell adhesion in hematopoietic cells, and also promotes integrin activation, clustering and outside-in signaling. KIND3, together with talin-1, contributes essentially to the activation of beta2-integrins in neutrophils. In addition, KIND3 interacts with the ribosome and regulates c-Myc expression required for proliferation of chronic myeloid leukemia cells. Mutations in the KIND3 gene cause leukocyte adhesion deficiency type III (LAD III), which is characterized by high susceptibility to infections, spontaneous and episodic bleedings, and osteopetrosis. KIND3 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain. 91
34715 340706 cd17186 FERM_F1_Merlin FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in merlin and similar proteins. Merlin, also termed moesin-ezrin-radixin-like protein, or neurofibromin-2 (NF2), or Schwannomerlin, or Schwannomin, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif; merlin, however, lacks the typical actin-binding motif in the C-tail. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Merlin plays vital roles in controlling proper development of organ sizes by specifically binding to a large number of target proteins localized both in cytoplasm and nuclei. Merlin may function as a tumor suppressor that functions upstream of the core Hippo pathway kinases Lats1/2 (Wts in Drosophila) and Mst1/2 (Hpo in Drosophila), as well as the nuclear E3 ubiquitin ligase DDB1-and-Cullin 4-associated Factor 1 (DCAF1)-associated cullin 4-Roc1 ligase, CRL4(DCAF1). Merlin may also have a tumor suppressor function in melanoma cells through the inhibition of the proto-oncogenic Na(+)/H(+) exchanger isoform 1 (NHE1) activity. 85
34716 340707 cd17187 FERM_F1_ERM FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the ERM family proteins, Ezrin, Radixin, and Moesin. The ezrin-radixin-moesin (ERM) family includes a group of closely related cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. They exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 83
34717 340708 cd17188 FERM_F1_FRMD7 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 7 (FRMD7). FRMD7 plays an important role in neuronal development and is involved in the regulation of F-actin, neurofilament, and microtubule dynamics. It interacts with the Rho GTPase regulator, RhoGDIalpha, and activates the Rho subfamily member Rac1, which regulates reorganization of actin filaments and controls neuronal outgrowth. Mutations in the FRMD7 gene are responsible for the X-linked idiopathic congenital nystagmus (ICN), a disease which affects ocular motor control. FRMD7 contains a FERM domain, and a pleckstrin homology (PH) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 86
34718 340709 cd17189 FERM_F1_FARP1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, ARH/RhoGEF and pleckstrin domain-containing protein 1 (FARP1). FARP1, also termed chondrocyte-derived ezrin-like protein (CDEP), or pleckstrin homology (PH) domain-containing family C member 2 (PLEKHC2), is a neuronal activator of the RhoA GTPase. It promotes outgrowth of developing motor neuron dendrites. It also regulates excitatory synapse formation and morphology, as well as activates the GTPase Rac1 to promote F-actin assembly. As a novel downstream signaling partner of Rif, FARP1 is involved in the regulation of semaphorin signaling in neurons. FARP1 contains a FERM domain, a Dbl-homology (DH) domain and two pleckstrin homology (PH) domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 85
34719 340710 cd17190 FERM_F1_FARP2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, ARH/RhoGEF and pleckstrin domain-containing protein 2 (FARP2) and similar proteins. FARP2, also termed FERM domain including RhoGEF (FIR), or Pleckstrin homology (PH) domain-containing family C member 3, is a Dbl-family guanine nucleotide exchange factor (GEF) that activates Rac1 or Cdc42 in response to upstream signals, suggesting roles in regulating processes such as neuronal axon guidance and bone homeostasis. It is also a key molecule involved in the response of neuronal growth cones to class-3 semaphorins. FARP2 contains a FERM domain, a Dbl-homology (DH) domain and two pleckstrin homology (PH) domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 85
34720 340711 cd17191 FERM_F1_PTPN14 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 14 (PTPN14) and similar proteins. PTPN14, also termed protein-tyrosine phosphatase pez, or PTPD2, or PTP36, is a widely expressed non-transmembrane cytosolic protein tyrosine phosphatase (PTP). It belongs to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN14 plays a role in the nucleus during cell proliferation. It forms a complex with Kibra and LATS1 proteins and negatively regulates the key Hippo pathway protein Yes-associated protein (YAP) oncogenic function by controlling its localization. It specifically regulates p130 Crk-associated substrate (p130Cas) phosphorylation at tyrosine residue 128 (Y128) in colorectal cancer (CRC) cells. Moreover, PTPN14 may be a critical enzyme in regulating endothelial cell function. It plays a crucial role in organogenesis by inducing transforming growth factor beta (TGFbeta) and epithelial-mesenchymal transition (EMT). It also acts as a modifier of angiogenesis and hereditary haemorrhagic telangiectasia. It regulates the lymphatic function and choanal development through the interaction with the vascular endothelial growth factor receptor 3 (VEGFR3), a receptor tyrosine kinase essential for lymphangiogenesis. Furthermore, PTPN14 functions as a regulator of cell motility through its action on cell-cell adhesion. Beta-Catenin, a central component of adherens junctions, has been identified as a PTPN14 substrate. PTPN14 works as a novel sperm-motility biomarker and a potential mitochondrial protein. 87
34721 340712 cd17192 FERM_F1_PTPN21 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 21 (PTPN21) and similar proteins. PTPN21, also termed protein-tyrosine phosphatase D1 (PTPD1), is a cytosolic non-receptor protein-tyrosine phosphatase (PTP) that belongs to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN21 interacts with a Tec tyrosine kinase family member, the epithelial and endothelial tyrosine kinase (Etk, also known as Bmx), modulates Stat3 activation, and plays a role in the regulation of cell growth and differentiation. It also associates with and activates Src tyrosine kinase, and directs epidermal growth factor (EGF)/Src signaling to the nucleus through activating ERK1/2- and Elk1-dependent gene transcription. The PTPD1-Src complex interacts with the protein kinase A-anchoring protein AKAP121 to form a PTPD1-Src-AKAP121 complex, which is required for efficient maintenance of mitochondrial membrane potential and ATP oxidative synthesis. As a novel component of the endocytic pathway, PTPN21 supports EGF receptor stability and mitogenic signaling in bladder cancer cells. Moreover, PTPD1 regulates focal adhesion kinase (FAK) autophosphorylation and cell migration through modulating Src-FAK signaling at adhesion sites. 87
34722 340713 cd17193 FERM_F1_PTPN3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 3 (PTPN3). PTPN3, also termed protein-tyrosine phosphatase H1 (PTP-H1), belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN3 associates with the mitogen-activated protein kinase p38gamma (also known as MAPK12) to form a PTPN3-p38gamma complex that promotes Ras-induced oncogenesis. It may also act as a tumor suppressor in lung cancer through its modulation of epidermal growth factor receptor (EGFR) signaling. Moreover, PTPN3 shows sensitizing effect to anti-estrogens. It dephosphorylates the tyrosine kinase EGFR, disrupts its interaction with the nuclear estrogen receptor, and increases breast cancer sensitivity to small molecule tyrosine kinase inhibitors (TKIs). It also cooperates with vitamin D receptor to stimulate breast cancer growth through their mutual stabilization. 84
34723 340714 cd17194 FERM_F1_PTPN4 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 4 (PTPN4). PTPN4, also termed protein-tyrosine phosphatase MEG1 (MEG) or PTPase-MEG1, belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN4 protects cells against apoptosis. It associates with the mitogen-activated protein kinase p38gamma (also known as MAPK12) to form a PTPN4-p38gamma complex that promotes cellular signaling, preventing cell death induction. It also inhibits tyrosine phosphorylation and subsequent cytoplasm translocation of TRIF-related adaptor molecule (TRAM, also known as TICAM2), resulting in the disturbance of TRAM-TRIF interaction. Moreover, PTPN4 negatively regulates cell proliferation and motility through dephosphorylation of CrkI. 84
34724 340715 cd17195 FERM_F1_PTPN13 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 13 (PTPN13). PTPN13, also termed Fas-associated protein-tyrosine phosphatase 1 (FAP-1), or PTP-BAS, or protein-tyrosine phosphatase 1E (PTP-E1 or PTPE1), or protein-tyrosine phosphatase PTPL1, belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a KIND domain, a FERM domain, five PDZ domains, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN13 interacts with a variety of ligands, suggesting an important role as a scaffolding protein. It is also involved in the regulation of apoptosis, cytokinesis and cell cycle progression. 96
34725 340716 cd17196 FERM_F1_FRMPD2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 2 (FRMPD2). FRMPD2, also termed PDZ domain-containing protein 4 (PDZK4), or PDZ domain-containing protein 5C (PDZD5C), is a potential scaffold protein involved in basolateral membrane targeting in epithelial cells. It interacts with nucleotide-binding oligomerization domain-2 (NOD2) through leucine-rich repeats and forms a complex with the membrane-associated protein ERBB2IP. FRMPD2 contains an N-terminal KIND domain, a FERM domain and three PDZ domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 95
34726 340717 cd17197 FERM_F1_FRMD1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 1 (FRMD1). FRMD1 is an uncharacterized FERM domain-containing protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 98
34727 340718 cd17198 FERM_F1_FRMD6 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 6 (FRMD6). FRMD6, also termed willin, expanded or expanded homolog, is a FERM domain-containing protein that plays a critical role in regulating both cell proliferation and apoptosis. It acts as a tumor suppressor of human breast cancer cells independently of the Hippo pathway. It also inhibits human glioblastoma growth and progression by negatively regulating activity of receptor tyrosine kinases. As an upstream component of the hippo signaling pathway, FRMD6 orchestrates mammalian peripheral nerve fibroblasts. FRMD6 contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 98
34728 340719 cd17199 FERM_F1_FRMD4A FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 4A (FRMD4A). FRMD4A is a cytohesin adaptor involved in cell structure, transport and signaling. It promotes the growth of cancer cells in tongue, head and neck squamous cell carcinomas. It also regulates tau secretion by activating cytohesin-Arf6 signaling through connecting cytohesin family Arf6-specific guanine-nucleotide exchange factors (GEFs) and Par-3 at primordial adherens junctions during epithelial polarization. As a genetic risk factor for late-onset Alzheimer's disease (AD), FRMD4A may play a role in amyloidogenic and tau-related pathways in AD. FRMD4A contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 89
34729 340720 cd17200 FERM_F1_FRMD4B FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 4B (FRMD4B). FRMD4B, also termed GRP1-binding protein GRSP1, interacts with the coiled-coil domain of ARF exchange factor GRP1 to form the Grsp1-Grp1 complex that co-localizes with cortical actin rich regions in response to stimulation of CHO-T cells with insulin or epidermal growth factor (EGF). FRMD4B contains a FERM protein interaction domain as well as two coiled coil domains and may therefore function as a scaffolding protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 89
34730 340721 cd17201 FERM_F1_EPB41L1 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 1 (EPB41L1) and similar proteins. EPB41L1, also termed neuronal protein 4.1 (4.1N), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. It is a cytoskeleton-associated protein that may serve as a tumor suppressor in solid tumors. It suppresses hypoxia-induced epithelial-mesenchymal transition in epithelial ovarian cancer (EOC) cells. The down-regulation of EPB41L1 expression is a critical step for non-small cell lung cancer (NSCLC) development. Moreover, EPB41L1 functions as a linker protein between inositol 1,4,5-trisphosphate receptor type1 (IP3R1) and actin filaments in neurons. EPB41L1 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 84
34731 340722 cd17202 FERM_F1_EPB41L2 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 2 (EPB41L2) and similar proteins. EPB41L2, also termed generally expressed protein 4.1 (4.1G), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L2 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 84
34732 340723 cd17203 FERM_F1_EPB41L3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 3 (EPB41L3) and similar proteins. EPB41L3, also termed 4.1B, or differentially expressed in adenocarcinoma of the lung protein 1 (DAL-1), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L3 is a tumor suppressor that has been implicated in a variety of meningiomas and carcinomas. EPB41L3 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 84
34733 340724 cd17204 FERM_F1_EPB41L4B FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte band 4.1-like protein 4B (EPB41L4B). EPB41L4B, also termed FERM-containing protein CG1, or expressed in high metastatic cells (Ehm2), or Lulu2, is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). EPB41L4B is a positive regulator of keratinocyte adhesion and motility, suggesting a role in wound healing. It also promotes cancer metastasis in melanoma, prostate cancer and breast cancer. 84
34734 340725 cd17205 FERM_F1_EPB41L5 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like 5 (EPB41L5). EPB41L5 is a mesenchymal-specific protein that is an integral component of the ARF6-based pathway. It is normally induced during epithelial-mesenchymal transition (EMT) by an EMT-related transcriptional factor, ZEB1, which drives ARF6-based invasion, metastasis and drug resistance. EPB41L5 also binds to paxillin to enhance integrin/paxillin association, and thus promotes focal adhesion dynamics. Moreover, EPB41L5 acts as a substrate for the E3 ubiquitin ligase Mind bomb 1 (Mib1), which is essential for activation of Notch signaling. EPB41L5 is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 86
34735 340726 cd17206 FERM_F1_Myosin-X FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in unconventional myosin-X. Myosin-X, also termed myosin-10 (Myo10), is an untraditional member of myosin superfamily. It is an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. Myosin-X functions as an important regulator of cytoskeleton that modulates cell motilities in many different cellular contexts. It regulates neuronal radial migration through interacting with N-cadherin. Like other unconventional myosins, Myosin-X is composed of a conserved motor head, a neck region and a variable tail. The neck region consists of three IQ motifs (light chain-binding sites), and a predicted stalk of coiled coil. The tail contains three PEST regions, three PH domains, a MyTH4 domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 97
34736 340727 cd17207 FERM_F1_PLEKHH3 FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 3 (PLEKHH3). PLEKHH3 is an uncharacterized Pleckstrin homology (PH) domain-containing protein that shows high sequence similarity with unconventional myosin-X, an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. In addition to two PH domains, PLEKHH3 harbors a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 96
34737 340728 cd17208 FERM_F1_DdMyo7_like FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Dictyostelium discoideum Myosin-VIIa (DdMyo7) and similar proteins. DdMyo7, also termed Myosin-I heavy chain, or class VII unconventional myosin, or M7, plays a role in adhesion in Dictyostelium where it is a component of a complex of proteins that serve to link membrane receptors to the underlying actin cytoskeleton. It interacts with talinA, an actin-binding protein with a known role in cell-substrate adhesion. DdMyo7 is required for phagocytosis. It is also essential for the extension of filopodia, plasma membrane protrusions filled with parallel bundles of F-actin. Members in this family contain a myosin motor domain, two MyTH4 domains, two FERM (Band 4.1, ezrin, radixin, moesin) domains, and two Pleckstrin homology (PH) domains. Some family members contain an extra SH3 domain. Each FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). 98
34738 340729 cd17209 RA_RalGDS Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator (RalGDS) and similar proteins. RalGDS, also termed Ral guanine nucleotide exchange factor (RalGEF), is a guanine exchange factor (GEF) for the Ral family of small GTPases. It is the prototype of RalGDS family proteins that are involved in Ras and Ral signaling pathways as downstream effector proteins. RalGDS stimulates the dissociation of GDP from the Ras-related RalA and RalB GTPases which allows GTP binding and activation of the GTPases. It interacts and acts as an effector molecule for R-Ras, H-Ras, K-Ras, and Rap. Moreover, RalGDS functions as a novel interacting partner for Rab7-interacting lysosomal protein (RILP), a key regulator for late endosomal/lysosomal trafficking. RILP suppresses invasion of breast cancer cells by inhibiting the GEF activity for RalA of RalGDS. RalGDS also plays a vital role in the regulation of Ral-dependent Weibel-Palade bodies (WPB) exocytosis from endothelial cells. In addition, RalGDS couples growth factor signaling to Akt activation by promoting PDK1-induced Akt phosphorylation. Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 86
34739 340730 cd17210 RA_RGL Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 1 (RalGDS-like 1) and similar proteins. RalGDS-like 1 (RGL) is a Ral-specific guanine nucleotide exchange factor that belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. RGL has been identified as a possible effector protein of Ras. It also regulates c-fos promoter and the GDP/GTP exchange of Ral. Members in this family have similar structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 87
34740 340731 cd17211 RA_RGL2 Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 2 (RalGDS-like 2) and similar proteins. RalGDS-like 2 (RGL2), also termed RalGDS-like factor (RLF), or Ras-associated protein RAB2L, is a novel Ras and Rap 1A-associating protein that belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. RGL2 exhibits guanine nucleotide exchange activity towards the small GTPase Ral. Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domain of RGL2 is phosphorylated by protein kinase A and the phosphorylation affects the ability of RGL2 to bind both Ras and Rap1. 86
34741 340732 cd17212 RA_RGL3 Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 3 (RalGDS-like 3) and similar proteins. RalGDS-like 3 (RGL3), also termed Ras pathway modulator (RPM), interacts in a GTP- and effector loop-dependent manner with Rit and Ras. As a novel potential effector of both p21 Ras and M-Ras, RGL3 negatively regulates Elk-1-dependent gene induction downstream of p21 Ras or mitogen activated protein/extracellular signal regulated kinase Kinase 1 (MEKK1). It also functions as a potential binding partner for Rap-family small G-proteins and profilin II. RGL3 belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier (Ubiquitination) in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 87
34742 340733 cd17213 RA_PHLPP Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase PHLPP1, PHLPP2, and similar proteins. PHLPP represents a novel family of Ser/Thr protein phosphatases, which is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP targets oncogenic kinases and may act as a tumor suppressor in several types of cancers. Two PHLPP isoforms are included in this family, PHLPP1 and PHLPP2. They regulate Akt activation together when both phosphatases are expressed. PHLPP1 is also termed pleckstrin homology domain-containing family E member 1, or PH domain-containing family E member 1, or suprachiasmatic nucleus circadian oscillatory protein (SCOP). It plays a suppression role in inflammatory response of glioma. Its loss contributes to gliomas development and progression. Loss of PHLPP1 also protects against colitis by inhibiting intestinal epithelial cell apoptosis. The overexpression of PHLPP1 impairs hippocampus-dependent learning, suggesting a role for PHLPP1 in learning and memory. PHLPP2 is also termed PH domain leucine-rich repeat-containing protein phosphatase-like (PHLPP-like). Both PHLPP1 and PHLPP2 contain a putative Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain. 97
34743 340734 cd17214 RA_CYR1_like Ras-associating (RA) domain found in Saccharomyces cerevisiae adenylate cyclase and similar proteins. CYR1, also termed ATP pyrophosphate-lyase, or adenylyl cyclase, is a fungal adenylate cyclase that regulates developmental processes such as hyphal growth, biofilm formation, and phenotypic switching. CYR1 plays essential roles in regulation of cellular metabolism by catalyzing the synthesis of a second messenger, cAMP. It acts as a scaffold protein keeping Ras2 available for its regulatory factors, the Ira proteins. CYR1 has at least four domains, including an N-terminal adenylate cyclase G-alpha binding domain, a Ras-associating (RA) domain, a middle leucine-rich repeat region, and a catalytic domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domain of CYR1 post-translationally modifies a small GTPase called Ras, which is involved in cellular signal transduction. CYR1 activity is stimulated directly by regulatory proteins (Ras1 and Gpa2), peptidoglycan fragments and carbon dioxide. 99
34744 340735 cd17215 RA_Rin1 Ras-associating (RA) domain found in Ras and Rab interactor 1 (Rin1). Rin1, also termed Ras inhibitor JC99, or Ras interaction/interference protein 1, is a downstream Ras effector that represents a unique class of Ras effector connected to two independent signaling pathways. The first effector pathway is the direct activation of RAB5-mediated endocytosis and the second pathway involves direct activation of ABL tyrosine kinase activity. Rin1 functions as a guanine nucleotide exchange factor (GEF) for RAB5 GTPases. The RAB5 GEF activity of Rin1 promotes early endosome fusion, an early event in transit to the lysosome. Rin1 binds the SH3 and SH2 domains of ABL proteins, ABL1 and ABL2, and activates their tyrosine kinase activity. Rin1 contains SH2 and proline-rich domains in the N-terminal region, and RH, VPS9, and RA domains in the C-terminal region. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. 88
34745 340736 cd17216 RA_Myosin-IXa Ras-associating (RA) domain found in Myosin-IXa. Myosin-IXa, also termed myosin-9a (Myo9a), is a single-headed, actin-dependent motor protein of the unconventional myosin IX class. It is expressed in several tissues and is enriched in the brain and testes. Myosin-IXa contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating domain (RhoGAP). Its RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. Myosin-IXa binds the alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid receptor (AMPAR) GluA2 subunit, and plays a key role in controlling the molecular structure and function of hippocampal synapses. Moreover, Myosin-IXa functions in epithelial cell morphology and differentiation such that its knockout mice develop hydrocephalus and kidney dysfunction. Myosin-IXa regulates collective epithelial cell migration by targeting RhoGAP activity to cell-cell junctions. Myosin-IXa negatively regulates Rho GTPase signaling, and functions as a regulator of kidney tubule function. 96
34746 340737 cd17217 RA_Myosin-IXb Ras-associating (RA) domain found in Myosin-IXb. Myosin-IXb, also termed myosin-9b (Myo9b), is a motor protein with a Rho GTPase activating domain (RhoGAP); it is an actin-dependent motor protein of the unconventional myosin IX class. It is expressed abundantly in tissues of the immune system, like lymph nodes, thymus, and spleen and in several immune cells including dendritic cells, macrophages and CD4+ T cells. Myosin-IXb contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a RhoGAP domain. Its RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. Myosin-IXb acts as a motorized signaling molecule that links Rho signaling to the dynamic actin cytoskeleton. It regulates leukocyte migration by controlling RhoA signaling. Myosin-IXb is also involved in the development of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus and type 1 diabetes. Moreover, Myosin-IXb is a ROBO-interacting protein that suppresses RhoA activity in lung cancer cells. 96
34747 340738 cd17218 RA_RASSF1 Ras-associating (RA) domain found in Ras-association domain-containing protein 1 (RASSF1). RASSF1 is a member of a family of six related RASSF1-6 proteins (the classical RASSF proteins). RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. With the exception of some minor splice variants (RASSF1F and RASSF1G), RASSF1 contains an RA domain and a C-terminal SARAH protein-protein interaction motif. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF1A and 1C are the most extensively studied RASSF1 with both localized to microtubules and involved in regulation of growth and migration. 157
34748 340739 cd17219 RA_RASSF3 Ras-associating (RA) domain found in Ras-association domain-containing protein 3 (RASSF3). RASSF3 is a member of a family of six related classical RASSF1-6 proteins (the classical RASSF proteins). RASSF3 has three transcripts (A-C) due to alternative splicing of the exons. The RASSF3B and 3C isoforms are shorter than RASSF3A, and unlike RASSF3A do not contain the RA or SARAH domains. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF3A regulates apoptosis and cell cycle via p53 stabilization and possibly is involved in DNA repair. 141
34749 340740 cd17220 RA_RASSF5 Ras-associating (RA) domain of Ras-association domain family 5 (RASSF5). RASSF5, also called New ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is a member of a family of six related RASSF1-6 proteins (the classical RASSF proteins) and is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. All transcript variants of RASSF5 contain the RA or SARAH domains. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion. 152
34750 340741 cd17221 RA_RASSF2 Ras-associating (RA) domain found in Ras-association domain-containing protein 2 (RASSF2). RASSF2 is a member of a family of six related classical RASSF1-6 proteins. The RASSF2 gene is transcribed into two major isoforms (A and C). RASSF2 is structurally related to RASSF1A but unlike RASSF1A it is primarily a nuclear protein. RASSF2 contains the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF2 is inactivated in different cancers and cancer cell lines by promoter methylation and loss of expression, implicating the correlation and significance of RASSF2 in tumorigenesis. In addition to regulating apoptosis and proliferation RASSF2 may have other functions as RASSF2 knockout mice develop normally for the first two weeks but then develop growth retardation and die 4 weeks after birth. 87
34751 340742 cd17222 RA_RASSF4 Ras-associating (RA) domain found in Ras-association domain-containing protein 4 (RASSF4). RASSF4 is a member of a family of six related classical RASSF1-6 proteins and is broadly expressed in normal tissues. RASSF4 expression is reduced in tumor cell lines and primary tumors by promoter specific hypermethylation. RASSF4 contains the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF4 inhibits lung cancer cell proliferation and invasion. 87
34752 340743 cd17223 RA_RASSF6 Ras-associating (RA) domain found in Ras-association domain-containing protein 6 (RASSF6). RASSF6 is a member of a family of six related classical RASSF1-6 proteins and is expressed as four transcripts via alternative splicing. All transcript variants of RASSF6 contain the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF6 is ubiquitinated and degraded by interacting with MDM2 to stabilize P53 and regulates apoptosis and cell cycle. RASSF6 is a tumor suppressor protein and is epigenetically silenced in childhood leukemia and neuroblastomas. Overexpression of RASSF6 causes apoptosis in HeLa cells. 87
34753 340744 cd17224 RA_ASPP1 Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 protein 1 (ASPP1). ASPP1 is a member of the ASPP protein family (Apoptosis-Stimulating Protein of p53) that activates the p53-mediated apoptotic response. ASPP1 functions as a tumor suppressor and coordinates with p53 to protect hematopoietic stem cell (HSC) pool integrity, guarding against hematological malignancies. ASPP1 contains a RA domain at the N-terminus. The RA domain is a ubiquitin-like domain and RA domain-containing proteins are involved in several different functions ranging from tumor suppression to being oncoproteins. 85
34754 340745 cd17225 RA_ASPP2 Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 protein 2 (ASPP2). ASPP2, also termed Bcl2-binding protein (Bbp), or renal carcinoma antigen NY-REN-51, or tumor suppressor p53-binding protein 2 (53BP2), or p53-binding protein 2 (p53BP2), is a member of ASPP protein family and it functions as a tumor suppressor. ASPP2 binds to p53 and enhances p53-mediated transcription of proapoptotic genes. ASPP2 contains a RA domain at the N-terminus. The RA domain is a ubiquitin-like domain and RA domain-containing proteins are involved in several different functions ranging from tumor suppression to being oncoproteins. All p53 amino acids that are important for ASPP2 binding are mutated in human cancer, and ASPP2 is frequently downregulated in these tumor cells. 80
34755 340746 cd17226 RA_ARAP1 Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 1 (ARAP1). ARAP1, also termed Centaurin-delta-2 (Cnt-d2), is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP) that inhibits the trafficking of epidermal growth factor receptor (EGFR) to the early endosome. It associates with the Cbl-interacting protein of 85 kDa (CIN85), regulates endocytic trafficking of the EGFR, and thus affects ubiquitination of EGFR. It also regulates the ring size of circular dorsal ruffles through Arf1 and Arf5. ARAP1 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 93
34756 340747 cd17227 RA_ARAP2 Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 2 (ARAP2). ARAP2, also termed Centaurin-delta-1 (Cnt-d1), or Protein PARX, is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP), which promotes GLUT1-mediated basal glucose uptake by modifying sphingolipid metabolism through glucosylceramide synthase (GCS). ARAP2 signals through Arf6 and Rac1 to control focal adhesion morphology. ARAP2 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 98
34757 340748 cd17228 RA_ARAP3 Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 3 (ARAP3). ARAP3, also termed Centaurin-delta-3 (Cnt-d3), is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP) that modulates actin cytoskeleton remodeling by regulating ARF and RHO family members, ADP-ribosylation factor 6 (Arf6) and Ras homolog gene family member A (RhoA). It is regulated by phosphatidylinositol 3,4,5-trisphosphate and a small GTPase Rap1-GTP, and has been implicated in the regulation of cell shape and adhesion. ARAP3 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. 99
34758 340749 cd17229 RA1_PLC-epsilon Ras-associating (RA) domain 1 found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Although PLC RA1 and RA2 have homologous ubiquitin-like folds only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. This family corresponds to the first RA domain of PLC-epsilon. 108
34759 340750 cd17230 TGS_DRG1 TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein 1 (DRG-1). DRG-1 is a potassium-dependent GTPase that belongs to the DRG family GTP-binding proteins. It plays an important role in regulating cell growth. It functions as a potential oncogene in lung adenocarcinoma and promotes tumor progression via spindle checkpoint signaling regulation. It also plays an important role in melanoma cell growth and transformation, indicating a novel role in CD4(+) T cell-mediated immunotherapy in melanoma. In addition, DRG-1 is regulated by ZC3H15 (zinc finger CCCH-type containing 15, also known as Lerepo4), and displays a high temperature optimum of activity at 42C, suggesting the ability of being active under possible heat stress conditions. DRG-1 contains a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as this C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold and may be related to RNA binding. 80
34760 340751 cd17231 TGS_DRG2 TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein 2 (DRG-2). DRG-2 is a member of the DRG family GTP-binding proteins. It has been implicated in cell growth, differentiation and death. DRG-2 plays a critical role in control of the cell cycle and apoptosis in Jurkat T cells. It regulates G2/M progression via the cyclin B1-Cdk1 complex. Moreover, DRG-2 is an endosomal protein and a key regulator of the small GTPase Rab5 deactivation and transferrin recycling. It enhances experimental autoimmune encephalomyelitis (EAE) by suppressing the development of TH17 cells. It is also associated with survival and cytoskeleton organization of osteoclasts under influence of macrophage colony-stimulating factor, and its overexpression leads to elevated bone resorptive activity of osteoclasts, resulting in bone loss. DRG-2 contains a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as this C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold and may be involved in RNA binding. 79
34761 340752 cd17232 Ubl_ATG8_GABARAP ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein (GABARAP). GABARAP (also termed GABA(A) receptor-associated protein, ATG8A, or MM46) has been implicated in intracellular protein trafficking. It is a cytosolic protein that is localized to transport vesicles, the Golgi network and the endoplasmic reticulum. It interacts with the intracellular domain of the gamma2 subunit of GABA(A) receptors, and thus functions as a trafficking modulator implicated in the intracellular trafficking of GABA(A) receptor. GABARAP also acts as a Ubl modifier belonging to the ATG8 (autophagy-related 8) protein family, which is essential for autophagosome biogenesis and maturation. GABARAP recruits phosphatidylinositol 4-kinase II alpha (PI4KIIalpha) as a specific downstream effector, and regulates phosphatidylinositol 4-phosphate (PI4P)-dependent autophagosome lysosome fusion. 115
34762 340753 cd17233 Ubl_ATG8_GABARAPL1_like ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein-like 1 (GABARAPL1) and similar proteins. GABARAPL1, also termed GEC1, or GABA(A) receptor-associated protein-like, belongs to the small family of GABARAP proteins which includes GABARAP, GABARAPL1, GABARAPL2/GATE-16, and GABARAPL3. GABARAPL1 has been involved in the intracellular transport of receptors via interactions with tubulin and GABA(A) or kappa opioid receptors. It is also a Ubl modifier that functions as a mediator involved in androgen-regulated autophagy process. It is transcriptionally modulated by androgen receptor (AR) and has a repressive role in autophagy. In addition, GABARAPL1 is required for increased membrane expression of epidermal growth factor receptor (EGFR) during hypoxia, suggesting a possible role in the trafficking of these membrane proteins. GABARAPL1 may also play a key role in several important biological processes such as cancer or neurodegenerative diseases. Low expression of GABARAPL1 is associated with poor prognosis of patients with hepatocellular carcinoma. This family also includes GABARAPL3, a paralog of GABARAPL1. 107
34763 340754 cd17234 Ubl_ATG8_MAP1LC3A ubiquitin-like (Ubl) domain found in microtubule associate protein 1 light chain 3A (MAP1LC3A). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3A belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like fold (Ubl) that belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. MAP1LC3A staining patterns are used for different cancer diagnostics. 117
34764 340755 cd17235 Ubl_ATG8_MAP1LC3B ubiquitin-like (Ubl) domain found in microtubule associate protein 1 light chain 3B (MAP1LC3B). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3B belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like (Ubl) fold and belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. All MAP1LC3 proteins are dispensable for basal autophagy; however, it has been shown that MAP1LC3B is responsible for selective degradation of p62 through autophagy. 115
34765 340756 cd17236 Ubl_ATG8_MAP1LC3C ubiquitin-like (Ubl) domain found in microtubule associate protein 1 light chain 3C (MAP1LC3C). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3C belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like (Ubl) fold that belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. ATG8 proteins are ubiquitously expressed, although some subfamily members are expressed at increased levels in certain tissues, e.g. MAP1LC3C is transcribed at lower levels than other members of MAP1LC3 subfamily and expressed predominantly in the lung. 113
34766 340757 cd17237 FERM_F1_Moesin FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in moesin and similar proteins. Moesin, also termed membrane-organizing extension spike protein, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Moesin is involved in mitotic spindle function through stabilizing cell shape and microtubules at the cell cortex. It is required for the formation of F-actin networks that mediate endosome biogenesis or maturation and transport through the degradative pathway. 84
34767 340758 cd17238 FERM_F1_Radixin FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in radixin and similar proteins. Radixin is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Radixin plays important roles in cell polarity, cell motility, invasion and tumor progression. It mediates the binding of F-actin to the plasma membrane after a conformational activation through Akt2-dependent phosphorylation at Thr564. It is also involved in reversal learning and short-term memory by regulating synaptic GABAA receptor density. 83
34768 340759 cd17239 FERM_F1_Ezrin FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Ezrin and similar proteins. Ezrin, also termed cytovillin, or villin-2, or p81, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Ezrin is a tyrosine kinase substrate that functions as a cross-linker between actin cytoskeleton and plasma membrane. It has been implicated in the regulation of the proliferation, apoptosis, adhesion, invasion, metastasis and angiogenesis of cancer cells. 85
34769 340760 cd17240 RA_PHLPP1 Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase 1 (PHLPP1). PHLPP1, also termed pleckstrin homology domain-containing family E member 1, or PH domain-containing family E member 1, or suprachiasmatic nucleus circadian oscillatory protein (SCOP), is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP1 also plays critical roles in many cancers, such as gastric cancer, sacral chordoma, gallbladder cancer, hypopharyngeal squamous cell carcinomas, and non-small cell lung cancer. It plays a suppression role in inflammatory response of glioma. Its loss contributes to gliomas development and progression. Loss of PHLPP1 also protects against colitis by inhibiting intestinal epithelial cell apoptosis. The overexpression of PHLPP1 impairs hippocampus-dependent learning, suggesting a role in learning and memory. PHLPP1 contains a Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain. 90
34770 340761 cd17241 RA_PHLPP2 Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase 2 (PHLPP2). PHLPP2, also termed PH domain leucine-rich repeat-containing protein phosphatase-like (PHLPP-like), is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP2 also plays critical roles in many cancers, such as glioma, hypopharyngeal squamous cell carcinomas, and non-small cell lung cancer. PHLPP2 contains a Ras-associating (RA) domain followed by a PH domain, leucine-rich repeats and protein phosphatase 2C (PP2C) domain. 108
34771 410988 cd17242 MobM_relaxase relaxase domain of MobM and similar proteins. With some plasmids, recombination can occur in a site-specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre (plasmid recombination enzyme), also known as Mob (conjugative mobilization). The best characterized member of this family is encoded by the streptococcal plasmid pMV158 that recognizes the plasmid origin of transfer. MobM converts supercoiled plasmid DNA into relaxed DNA by cleaving a phosphodiester bond of a specific dinucleotide and remains bound to the 5'-end of the nick site. 196
34772 341132 cd17243 RMtype1_S_AchA6I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Arthrobacter chlorophenolicus A6 S subunit (S.AchA6I) TRD2-CR2. The S.AchA6I S subunit recognizes 5'... TGAANNNNNTCG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.AchA6I-TRD1 recognizes TGAA/TTCA, and TRD2 recognizes CGA/TCG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 182
34773 341133 cd17244 RMtype1_S_Apa101655I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acetobacter pasteurianus S subunit (S.Apa101655I) TRD2-CR2. The S. Apa101655I S subunit recognizes 5'... TTAGNNNNNNTTC... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 180
34774 341134 cd17245 RMtype1_S_TteMORF1547P-TRD2-CR2_Aco12261I-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P) TRD2-CR2 and Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD1-CR1. The S.Aco12261I S subunit recognizes 5'... GCANNNNNNTGT ... 3', while the recognition sequence is undetermined for S.TteMORF1547P TRD2-CR2. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. S.TteMORF1547P TRD1-CR1 and S.Aco12261I TRD2-CR2 do not belong to this family. This family may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 174
34775 341135 cd17246 RMtype1_S_SonII-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Shewanella oneidensis MR-1 S subunit (S.SonII) TRD2-CR2. This model contains Shewanella oneidensis MR-1 S subunit (S.SonII) TRD2-CR2 and similar TRD-CR's. S.SonII recognizes 5'... GTCANNNNNNRTCA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.SonII TRD1-CR1 does not belong to this subfamily. 189
34776 341136 cd17247 RMtype1_S_Eco2747I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli ST2747 S subunit (S.Eco2747I) TRD2-CR2. The S. Eco2747I S subunit recognizes 5'... CACNNNNNNNGTTG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 190
34777 341137 cd17248 RMtype1_S_AmiI-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Actinosynnema mirum DSM 43827 S subunit (S. AmiI) TRD2-CR2. The S. AmiI S subunit recognizes 5'... CAGNNNNNNNTCGA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S. AmiI -TRD1 recognizes CAG/CTG, and TRD2 recognizes TCGA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 196
34778 341138 cd17249 RMtype1_S_EcoR124I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoR124I TRD2-CR2, S.Eco540I TRD2-CR2, S.Eco540AI TRD2-CR2, and S.Eco540ANI TRD2-CR2. Escherichia coli (R124) S subunit (S.EcoR124I), E. coli ST540 S subunit (S.Eco540I), E. coli ST540A S subunit (S.Eco540AI), and Escherichia coli ST540AN S subunit (S.Eco540ANI) recognize the sequence 5'... GAANNNNNNRTCG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoR124I -TRD1 recognizes GAA/TTC, and -TRD2 recognizes CGAY/RTCG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 185
34779 341139 cd17250 RMtype1_S_Eco4255II_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255II) TRD2-CR2 and Shewanella oneidensis MR-1 S subunit (S.SonIV) TRD1-CR1. Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255II) recognizes 5'... TACNNNNNNNRTRTC ... 3' while Shewanella oneidensis MR-1 S subunit (S.SonIV) recognizes 5'... TACNNNNNNGTNGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.SonIV-TRD1 recognizes TAC/GTA and S.SonIV-TRD2 recognizes ACNAC/GTNGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 185
34780 341140 cd17251 RMtype1_S_HinAWORF1578P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to S.HinAWORF1578P TRD2-CR2. Haemophilus influenzae RdAW S subunit (S.HinAWORF1578P) recognizes 5'... CTANNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mostly TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 185
34781 341141 cd17252 RMtype1_S_EcoKI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoKI TRD1-CR1, S.StySPI TRD1-CR1, S.Ara36733II TRD1-CR1, and S.Eco3722I TRD1-CR1. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) and Escherichia coli NCM3722 S subunit (S.Eco3722I) recognize 5'... AACNNNNNNGTGC ... 3', Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3', and Actinomyces radicidentis S subunit (S.Ara36733II) recognizes 5'... CATCNNNNNNCTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 and S.StySPI-TRD1 recognize AAC/GTT, S.EcoKI-TRD2 recognizes GCAC/GTGC, and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Treponema pedis T A4 putative Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.TpeTA4ORF2695P. It may also include type I DNA methyltransferases. 189
34782 341142 cd17253 RMtype1_S_Eco933I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli O157:H7 EDL933 S subunit (S.Eco933I), Escherichia coli O104:H4 2009EL-2071 S subunit (S.Eco2071ORF3585P) TRD2-CR2, and Streptomyces species SirexAA-E S subunit (S.SspAAEORF2129P) TRD1-CR1 and TRD2-CR2. Escherichia coli O157:H7 EDL933 S subunit (S.Eco933I) recognizes 5'... CACNNNNNNNCTGG ... 3' and Escherichia coli O104:H4 2009EL-2071 S subunit (S.Eco2071ORF3585P) recognizes 5'... RTCANNNNNNNNGTGG ... 3'. The recognition sequence of Streptomyces species SirexAA-E S subunit (S.SspAAEORF2129P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Eco2071ORF3585P TRD1 recognizes RTCA/TGAY and S.Eco2071ORF3585P TRD2 recognizes CCAC/GTGG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 193
34783 341143 cd17254 RMtype1_S_FclI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.FclI TRD1-CR1. The recognition sequence of Flavobacterium columnare G4 S subunit (S.FclI) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also contains TRD-CR-like sequence-recognition domains of type I DNA methyltransferases, such as putative Type I N6-adenine DNA methyltransferases from Microbacterium ketosireducens (M.Msp12510ORF408P) and Treponema primitia ZAS-2 (M.TprZAS2ORF3630P). It may also include various type II restriction enzymes and methyltransferases. 173
34784 341144 cd17255 RMtype1_S_Fco49512ORF2615P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Flavobacterium columnare S subunit (S.Fco49512ORF2615P) TRD2-CR2. The recognition sequence of Flavobacterium columnare S subunit (S.Fco49512ORF2615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 166
34785 341145 cd17256 RMtype1_S_EcoJA65PI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoJA65PI TRD1-CR1, S.Fco49512ORF2615P TRD1-CR1, and S.SonIV TRD2-CR2. Escherichia coli UCD_JA65_pb S subunit (S.EcoJA65PI) recognizes 5'... AGCANNNNNNTGA ... 3' while Shewanella oneidensis MR-1 S subunit (S.SonIV) recognizes 5'... TACNNNNNNGTNGT ... 3'. The recognition sequence of Flavobacterium columnare S subunit (S.Fco49512ORF2615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoJA65PI TRD1 recognizes AGCA/TGCT and S.EcoJA65PI TRD2 recognizes TCA/TGA; S.SonIV TRD1 recognizes TAC/GTA and S.SonIV TRD2 recognizes ACNAC/GTNGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 182
34786 341146 cd17257 RMtype1_S_EcoBI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoBI TRD1-CR1, S.EcoSanI TRD1-CR1, and S.EcoVR50I TRD1-CR1. Escherichia coli B S subunit (S.EcoBI) and Escherichia coli VR50 S subunit (S.EcoVR50I) recognize 5'... TGANNNNNNNNTGCT ... 3', while Escherichia coli Sanji S subunit (S.EcoSanI) recognizes 5'... TGANNNNNNCTTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 176
34787 341147 cd17258 RMtype1_S_Sau13435ORF2165P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Sau13435ORF2165P TRD1-CR1 and S.SauL3067ORFAP TRD1-CR1. Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) recognizes 5'... TCTANNNNNNRTTC ... 3'; the recognition sequence of Staphylococcus aureus 3067 S.Sau3067ORFAP S subunit is as yet undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau13435ORF2165P TRD1 recognizes TCTA/TAGA, and S.Sau13435ORF2165P TRD2 recognizes GAAY/RTTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mostly TRD1-CR1. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 173
34788 341148 cd17259 RMtype1_S_StySKI-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to TRD2-CR2's of StySKI, S.EcoAI, S.EcoJA17PI, and S.EcoJA23PI. Salmonella kaduna CDC-388 S subunit (StySKI) recognizes 5'... CGATNNNNNNNGTTA ... 3' while Escherichia coli Type-1 restriction enzyme EcoAI specificity protein (S.EcoAI), Escherichia coli UCD_JA17_pb S subunit (S.EcoJA17PI) and Escherichia coli UCD_JA23_pb S subunit (S.EcoJA23PI) recognize 5'... GAGNNNNNNNGTCA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 189
34789 341149 cd17260 RMtype1_S_EcoEI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoEI TRD1-CR1, S.EcoJA17PI TRD1-CR1, S.EcoJA23PI TRD1-CR1, and S.StyLTIII TRD1-CR1. Escherichia coli A58 S subunit (S.EcoEI) recognizes 5'... GAGNNNNNNNATGC ... 3', Escherichia coli UCD_JA17_pb S subunit (S.EcoJA17PI) and Escherichia coli UCD_JA23_pb S subunit (S.EcoJA23PI) recognize 5'... GAGNNNNNNNGTCA ... 3', and Salmonella typhimurium LT7 S subunit (S.StyLTIII) recognizes 5'... GAGNNNNNNRTAYG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example: S.EcoEI TRD1 and S.StyLTIII TRD1 recognize GAG/CTC, S.EcoEI TRD2 recognizes GCAT/ATGC, and S.StyLTIII TRD2 recognizes CRTAY/RTAYG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.PpiI and Porphyromonas macacae COT-192 OH2631 RM.Pma2631ORF8845P, as well as type I DNA methyltransferases such as Chlorobium limicola M.Cli245ORF128P. RM.PpiI recognizes the sequence 5' ... GAACNNNNNCTC ... 3'. 165
34790 341150 cd17261 RMtype1_S_EcoKI-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) TRD2-CR2, Escherichia coli A58 S subunit (S.EcoEI) TRD2-CR2, and Aminomonas paucivorans S subunit (S.Apa12260I) TRD2-CR2. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) recognizes 5'... AACNNNNNNGTGC ... 3', Escherichia coli A58 S subunit (S.EcoEI) recognizes 5'... GAGNNNNNNNATGC ... 3', and Aminomonas paucivorans S subunit (S.Apa12260I) recognizes 5'... GCCNNNNNCTCC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 recognizes AAC/GTT and S.EcoKI-TRD2 recognizes GCAC/GTGC, S.EcoEI TRD1 recognizes GAG/CTC and S.EcoEI TRD2 recognizes GCAT/ATGC, and S.Apa12260I TRD1 recognizes GCC/GGC and S.Apa12260I TRD2 recognizes GGAG/CTCC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 191
34791 341151 cd17262 RMtype1_S_Aco12261I-TRD2-CR2 Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD2-CR2 and Moraxella catarrhalis S subunit (S.Mca353ORF290P) TRD2-CR2. Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) recognizes 5'... GCANNNNNNTGT ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 172
34792 341152 cd17263 RMtype1_S_AbaB8300I-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acinetobacter baumannii B8300 S subunit (S.AbaB8300I) TRD1-CR1. Acinetobacter baumannii B8300 S subunit (S.AbaB8300I) recognizes 5'... GAYNNNNNNNTCYC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 177
34793 341153 cd17264 RMtype1_S_Eco3763I-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O69:H11 07-3763 S subunit (S.Eco3763I) TRD2-CR2. Escherichia coli O69:H11 07-3763 S subunit (S.Eco3763I) recognizes 5'... TACNNNNNNNRTRTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 184
34794 341154 cd17265 RMtype1_S_Eco4255III-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255III) TRD2-CR2 and Escherichia coli ECONIH1 S subunit (S.EcoNIH1II) TRD2-CR2. Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255III) recognizes 5'... GAGNNNNNGTTY ... 3', and Escherichia coli ECONIH1 S subunit (S.EcoNIH1II) recognizes 5'... YTCANNNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Eco4255III-TRD1 recognizes GAG/CTC and S.EcoNIH1II-TRD1 recognizes YTCA/TGAR, while both S.EcoNIH1II-TRD2 and S.Eco4255III-TRD2 recognize RAAC/GTTY. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 181
34795 341155 cd17266 RMtype1_S_Sau1132ORF3780P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) TRD2-CR2. Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3'. The RM system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 159
34796 341156 cd17267 RMtype1_S_EcoAO83I-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoAO83I TRD1-CR1 and S.AbaB8342I TRD2-CR2. Escherichia coli strain A0 34/86 S subunit (S.EcoAO83I) recognizes 5'... GGANNNNNNNNATGC ... 3', and Acinetobacter baumannii B8342 S subunit (S.AbaB8342I) recognizes 5'... TTCANNNNNNTCC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example S.AbaB8342I-TRD1 recognizes TTCA/TGAA and S.AbaB8342I-TRD2 recognizes GGA/TCC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Type IIG restriction enzyme/N6-adenine DNA methyltransferases from Thermus scotoductus RFL1 (RM.TstI) and Acinetobacter lwoffi Ks 4-8 (RM.AloI), as well as type I DNA methyltransferases such as Sideroxydans lithotrophicus ES-1 Type I N6-adenine DNA methyltransferase (M.SliESORF1090P). RM.TstI recognizes 5' ... CACNNNNNNTCC ... 3' and RM.AloI recognizes 5' ... GAACNNNNNNTCC ... 3'. 158
34797 341157 cd17268 RMtype1_S_Ara36733I_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Ara36733I TRD1-CR1 and S.Ara36733I TRD2-CR2. Actinomyces radicidentis S subunit (S.Ara36733I) recognizes 5'... CGAGNNNNNCTG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 185
34798 341158 cd17269 RMtype1_S_PluTORF4319P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Photorhabdus luminescens S subunit (S.PluTORF4319P) TRD2-CR2. The recognition sequence of Photorhabdus luminescens S subunit (S.PluTORF4319P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 168
34799 341159 cd17270 RMtype1_S_Sba223ORF3470P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Sba223ORF3470P TRD1-CR1. The recognition sequence of Shewanella baltica OS223 S subunit (S.Sba223ORF3470P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 183
34800 341160 cd17271 RMtype1_S_NmaSCMORF606P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Nitrosopumilus maritimus SCM1 S subunit (S2.NmaSCMORF606P) TRD2-CR2, Corynebacterium jeikeium K411 S subunit (S.CjeKORF1254P) TRD2-CR2 and Porphyromonas canoris COT-108 OH2762 S subunit (S2.Pca2762ORF8685P) TRD1-CR1. The recognition sequences of Nitrosopumilus maritimus SCM1 S subunit (S2.NmaSCMORF606P), Corynebacterium jeikeium K411 S subunit (S.CjeKORF1254P), and Porphyromonas canoris COT-108 OH2762 S subunit (S2.Pca2762ORF8685P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 186
34801 341161 cd17272 RMtype1_S_Eco2747II-TRD2-CR2-like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Eco2747II TRD2-CR2 and S.Eco2747AII TRD2-CR2. Escherichia coli ST2747 S subunit (S.Eco2747II) and Escherichia coli ST2747A S subunit (S.Eco2747AII) recognize 5'... GAANNNNNNNTAAA ... 3'. Generally, the restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mainly TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 189
34802 341162 cd17273 RMtype1_S_EcoJA69PI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoJA69PI TRD1-CR1, MjaXIP/S.MjaORF132P TRD2-CR2, and S.HspDL1ORF16625P TRD2-CR2. Escherichia coli UCD_JA69_pb S subunit (S.EcoJA69PI) recognizes 5'... CCANNNNNNNCTTC ... 3'. The recognition sequences of Methanococcus jannaschii MjaXIP/S.MjaORF132P TRD2-CR2 and Halobacterium species DL1 S subunit (S.HspDL1ORF16625P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various putative type II restriction enzymes and methyltransferases and may also include type I DNA methyltransferases. 186
34803 341163 cd17274 RMtype1_S_Eco540ANI-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Eco540ANI TRD1-CR1, S.Eco2747AII TRD1-CR1, S.Eco540AI TRD1-CR1, S.Eco2747II TRD1-CR1, and S.Eco540I TRD1-CR1. Escherichia coli ST540AN S subunit (S.Eco540ANI), Escherichia coli ST540A S subunit (S.Eco540AI), and Escherichia coli ST540 S subunit (S.Eco540I) recognize 5'... GAANNNNNNRTCG ... 3'. Escherichia coli ST2747A S subunit (S.Eco2747AII) and Escherichia coli ST2747 S subunit (S.Eco2747II) recognize 5'... GAANNNNNNNTAAA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 171
34804 341164 cd17275 RMtype1_S_MjaORF132P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to MjaXIP/S.MjaORF132P TRD1-CR1. The recognition sequence of Methanococcus jannaschii S subunit (MjaXIP/S.MjaORF132P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 186
34805 341165 cd17276 RMtype1_S_Sau1132ORF3780P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to S.Sau1132ORF3780P TRD1-CR1, and S.Mca353ORF290P TRD1-CR1. The Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG, and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 187
34806 341166 cd17277 RMtype1_M_Cni19672ORF1405P_RMtype11G_Hci611ORFHP_TRD1-CR1_like Restriction modification N6-adenine DNA methyltransferase TRD-CR, similar to RMtype1 Calditerrivibrio nitroreducens M.Cni19672ORF1405P TRD1-CR1 and RMtype11G Helicobacter cinaedi PAGU611 RM.Hci611ORFHP TRD1-CR1. The recognition sequence of Calditerrivibrio nitroreducens M.Cni19672ORF1405P is undetermined, and the predicted recognition sequence of RM.Hci611ORFHP is 5'... GAGNNNNNGT ... 3'. M.Cni19672ORF1405P is a putative type I N6-adenine DNA methyltransferase. RM.Hci611ORFHP is a type II subtype gamma (also called type IIG and type IIC) N6-adenine DNA methyltransferase. Both are DNA methyltransferase-specificity subunit fusion proteins; they each have a domain corresponding to a HsdM methylation (M) subunit followed by a C-terminal, TRD-CR-like domain for sequence-recognition, which corresponds to the HsdS specificity (S) subunit. The latter consists of two variable TRDs, and two CRs which separate the TRDs; the TRDs each bind to different specific sequences in the DNA. RM.Hci611ORFHP has an additional N-terminal HSDR_N domain. Restriction-modification (RM) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit, two modification (M) subunits, and two restriction (R) subunits. 184
34807 341167 cd17278 RMtype1_S_LdeBORF1052P-TRD2-CR2 Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Lactobacillus delbrueckii subsp. bulgaricus S subunit (S2.LdeBORF1052P) TRD2-CR2. The recognition sequence of Lactobacillus delbrueckii subsp. bulgaricus S subunit (S2.LdeBORF1052P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 189
34808 341168 cd17279 RMtype1_S_BmuCF2ORF3362P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Burkholderia multivorans CF2 S subunit (S.BmuCF2ORF3362P) TRD1-CR1 and Halomonas campaniensis LS21 S subunit (S.HcaLS21ORF9970P) TRD1-CR1. The recognition sequences of Burkholderia multivorans CF2 S subunit (S.BmuCF2ORF3362P) and Halomonas campaniensis LS21 S subunit (S.HcaLS21ORF9970P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 184
34809 341169 cd17280 RMtype1_S_MspEN3ORF6650P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Marinobacter species EN3 S subunit (S.MspEN3ORF6650P) TRD1-CR1, Methanothermobacter marburgensis str. Marburg S subunit (S.Mma2133ORF14720P) TRD2-CR2 and Nostoc species NIES-3756 S subunit (S.Nsp3756ORF27100P) TRD1-CR1. The recognition sequences of Marinobacter species EN3 S subunit (S.MspEN3ORF6650P), Methanothermobacter marburgensis str. Marburg S subunit (S.Mma2133ORF14720P), and Nostoc species NIES-3756 S subunit (S.Nsp3756ORF27100P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 187
34810 341170 cd17281 RMtype1_S_HpyAXIII_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori 26695 S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) TRD1-CR1, Neisseria meningitidis 510612 S subunit (S.Nme510612ORF1157P) TRD1-CR1 and Streptococcus mitis SVGS_061 S subunit (S2.Smi61ORF7905P) TRD1-CR1. Helicobacter pylori 26695 S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) recognizes 5'... CTANNNNNNNNTGT ... 3', and the recognition sequences of Neisseria meningitidis 510612 S subunit (S.Nme510612ORF1157P) and Streptococcus mitis SVGS_061 S subunit (S2.Smi61ORF7905P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, Helicobacter pylori 26695 S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) TRD1 recognizes CTA/TAG, and TRD2 recognizes ACA/TGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 196
34811 341171 cd17282 RMtype1_S_Eco16444ORF1681_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli G4/9 S subunit (S.Eco16444ORF1681P) TRD1-CR1 and Zobellia galactanivorans DsiJT S subunit (S.ZgaJTORF2697P) TRD2-CR2. The recognition sequences of Escherichia coli G4/9 S subunit (S.Eco16444ORF1681P) and Zobellia galactanivorans DsiJT S subunit (S.ZgaJTORF2697P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various putative type II restriction enzymes and methyltransferases and may also include type I DNA methyltransferases. 186
34812 341172 cd17283 RMtype1_S_Hpy180ORF7835P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori SJM180 S subunit (S.Hpy180ORF7835P) TRD2-CR2 and Haemophilus influenzae PittGG S subunit (S.HinGGORF3080P) TRD2-CR2. The recognition sequences of Helicobacter pylori SJM180 S subunit (S.Hpy180ORF7835P) and Haemophilus influenzae PittGG S subunit (S.HinGGORF3080P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 181
34813 341173 cd17284 RMtype1_S_Cbo7060ORF11580P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Clostridium botulinum CFSAN024410 S subunit (S.Cbo7060ORF11580P) TRD2-CR2 and Shewanella xiamenensis BC01 S subunit (S.SxiBC01ORF77P) TRD1-CR1. The recognition sequences of Clostridium botulinum CFSAN024410 S subunit (S.Cbo7060ORF11580P) and Shewanella xiamenensis BC01 S subunit (S.SxiBC01ORF77P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 185
34814 341174 cd17285 RMtype1_S_Csp16704I_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Campylobacter species RM16704 S subunit (S.Csp16704I) TRD2-CR2, Aeromonas media WS S subunit (S.AmeWSORF2351P) TRD1-CR1, and Clostridium carboxidivorans P7 S subunit (S.CcaPORF573P) TRD2-CR2. Campylobacter species RM16704 S subunit (S.Csp16704I) recognizes 5'... ACANNNNNNNNTCG ... 3', and the recognition sequences of Aeromonas media WS TRD1-CR1 S subunit (S.AmeWSORF2351P) and Clostridium carboxidivorans P7 S subunit (S.CcaPORF573P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 181
34815 341175 cd17286 RMtype1_S_Lla161ORF747P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Lactococcus lactis subsp. lactis Dephy 1 S subunit (S.Lla161ORF747P) TRD1-CR1, and Lactococcus lactis IO-1 S subunit (S2.LlaIO1ORF1141P) TRD2-CR2. The recognition sequences of Lactococcus lactis subsp. lactis Dephy 1 S subunit (S.Lla161ORF747P) and Lactococcus lactis IO-1 S subunit (S2.LlaIO1ORF1141P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 179
34816 341176 cd17287 RMtype1_S_EcoN10ORF171P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli N10-0505 S subunit (S.EcoN10ORF171P) TRD2-CR2, and Herpetosiphon aurantiacus S subunit (S.HauORF5277P) TRD2-CR2. The recognition sequences of Escherichia coli N10-0505 S subunit (S.EcoN10ORF171P) and Herpetosiphon aurantiacus S subunit (S.HauORF5277P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 184
34817 341177 cd17288 RMtype1_S_LlaAI06ORF1089P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Lactococcus lactis S subunit (S.LlaAI06ORF1089P) TRD1-CR1 and Bacillus subtilis B4071 S subunit (S2.BsuCC16ORF609P) TRD2-CR2. The recognition sequences of Lactococcus lactis S subunit (S.LlaAI06ORF1089P) and Bacillus subtilis B4071 S subunit (S2.BsuCC16ORF609P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 163
34818 341178 cd17289 RMtype1_S_BamJRS5ORF1993P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus amyloliquefaciens JRS5 S subunit (S.BamJRS5ORF1993P) TRD1-CR1 and Bacillus pumilus Jo2 S subunit (S.BpuJo2I) TRD1-CR1. The recognition sequences of Bacillus amyloliquefaciens JRS5 S subunit (S.BamJRS5ORF1993P) and Bacillus pumilus Jo2 S subunit (S.BpuJo2I) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 191
34819 341179 cd17290 RMtype1_S_AleSS8ORF2795P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Algibacter lectus S subunit (S.AleSS8ORF2795P) TRD1-CR1 and Vibrio parahaemolyticus O1:K33 CDC_K4557 S subunit (S.Vpa4557ORF22590P) TRD2-CR2. The recognition sequences of Algibacter lectus S subunit (S.AleSS8ORF2795P) and Vibrio parahaemolyticus O1:K33 CDC_K4557 S subunit (S.Vpa4557ORF22590P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 184
34820 341180 cd17291 RMtype1_S_MgeORF438P-TRD-CR_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.MgeORF438P TRD1-CR1 and TRD2-CR2, and to Escherichia coli G5 S subunit (S.Eco16445ORF5013P) TRD2-CR2 and Acetobacter pasteurianus IFO 3283-01 S subunit (S2.Apa3283ORF14230P) TRD1-CR1. The recognition sequences of Mycoplasma genitalium G-37 S subunit (S.MgeORF438P), Escherichia coli G5 S subunit (S.Eco16445ORF5013P), and Acetobacter pasteurianus IFO 3283-01 S subunit (S2.Apa3283ORF14230P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 161
34821 341181 cd17292 RMtype1_S_LlaA17I_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to the S subunit TRD2-CR2 regions of Lactococcus lactis subsp. cremoris A17 (S.LlaA17I), Haemophilus influenzae Rd (S.HindORF215P) and Clostridium species ASF502 (S.Csp502ORF478P). Lactococcus lactis subsp. cremoris A17 S subunit (S.LlaA17I) recognizes 5'... CAANNNNNNNNTAYG... 3', while the recognition sequences of Clostridium species ASF502 S subunit (S.Csp502ORF478P) and Haemophilus influenzae Rd S subunit (S.HindORF215P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Porphyromonas species COT-108 OH1349 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.Psp1349ORF730P) of unknown recognition sequence. It may also include type I DNA methyltransferases. 149
34822 341182 cd17293 RMtype1_S_Ppo21ORF8840P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Paenibacillus polymyxa SQR-21 SQR21 S subunit (S.Ppo21ORF8840P) TRD1-CR1, Nitrosococcus halophilus Nc4 S subunit (S.NhaNc4ORF3964P) TRD1-CR1. The recognition sequences of Paenibacillus polymyxa SQR-21 SQR21 S subunit (S.Ppo21ORF8840P) and Nitrosococcus halophilus Nc4 S subunit (S.NhaNc4ORF3964P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This superfamily contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 180
34823 341183 cd17294 RMtype1_S_MmaC7ORF19P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) TRD1-CR1 and Mycoplasma gallinaceum S subunit (S3.Mme68BORF1125P) TRD2-CR2. The recognition sequences of Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) and Mycoplasma gallinaceum S subunit (S3.Mme68BORF1125P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 188
34824 341184 cd17296 RMtype1_S_MmaC5ORF1169P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Methanococcus maripaludis C5 S subunit (S.MmaC5ORF1169P) TRD1-CR1, and Methanobacterium formicicum S subunit (S.Mfo3637ORF3708P) TRD2-CR2. The recognition sequences of Methanococcus maripaludis C5 S subunit (S.MmaC5ORF1169P) and Methanobacterium formicicum S subunit (S.Mfo3637ORF3708P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 182
34825 341209 cd17297 AldB-like proteins similar to alpha-acetolactate dehydrogenase. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an H(x)H(x)nH motif (x could be any residue, n could be 9 or 10) that coordinates a zinc ion. The proteins are homologous to bacterial alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5), which converts acetolactate into acetoin. 209
34826 341210 cd17298 DUF1907 proteins similar to putative ester hydrolase C11orf54/PTD012. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif (x could be any residue) that coordinates a zinc ion, and an acetate anion at a site that may support the enzymatic activity of an ester hydrolase. In vitro hydrolytic activity towards para-nitrophenylacetate for the human enzyme was reported. The proteins are homologous to bacterial alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5), which converts acetolactate into acetoin. 287
34827 341211 cd17299 acetolactate_decarboxylase alpha-acetolactate decarboxylase. alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5) converts acetolactate ((2S)-2-hydroxy-2-methyl-3-oxobutanoate) into acetoin ((3R)-3-hydroxybutan-2-one) and CO(2). Acetoin may be secreted by the cells, perhaps in order to control the internal pH. AldB may function as a regulator in valine and leucine biosynthesis and in catalyzing the second step of the 2,3-butanediol pathway. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxxH motif (x could be any residue) that coordinates a zinc ion. 232
34828 340437 cd17300 PIPKc_PIKfyve Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in 1-phosphatidylinositol-3-phosphate 5-kinase and similar proteins. 1-phosphatidylinositol-3-phosphate 5-kinase (EC 2.7.1.150) is also called FYVE finger-containing phosphoinositide kinase, PIKfyve, phosphatidylinositol 3-phosphate 5-kinase (PIP5K3), or phosphatidylinositol 3-phosphate 5-kinase type III (PIPkin-III or type III PIP kinase). It forms a complex with its regulators, the scaffolding protein Vac14 and the lipid phosphatase Fig4. The complex is responsible for synthesizing phosphatidylinositol 3,5-bisphosphate [PtdIns(3,5)P2] by catalyzing the phosphorylation of phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) on the fifth hydroxyl of the myo-inositol ring. Then phosphatidylinositol-5-phosphate (PtdIns5P) is generated directly from PtdIns(3,5)P2. PtdIns(3,5)P2 and PtdIns5P regulate endosomal trafficking and responses to extracellular stimuli. PIKfyve is vital in early embryonic development. It forms a complex with ArPIKfyve (associated regulator of PIKfyve) and SAC3 at the endomembranes, playing a role in receptor tyrosine kinase (RTK) degradation. The phosphorylation of PIKfyve by AKT can facilitate epidermal growth factor receptor (EGFR) degradation. In addition, PIKfyve may participate in the regulation of the glutamate transporters EAAT2, EAAT3 and EAAT4, and the cystic fibrosis transmembrane conductance regulator (CFTR). It is also essential for systemic glucose homeostasis and insulin-regulated glucose uptake/GLUT4 translocation in skeletal muscle. It can be activated by protein kinase B (PKB/Akt) and further up-regulates human Ether-a-go-go-Related Gene (hERG) channels. This family also includes the yeast ortholog of human PIKfyve, Fab1. PIKfyve and its orthologs share a similar architecture. They contain an N-terminal FYVE domain, a middle region related to the CCT/TCP-1/Cpn60 chaperonins that are involved in productive folding of actin and tubulin, a second middle domain that contains a number of conserved cysteine residues (CCR) unique to this family, and a C-terminal catalytic lipid kinase domain related to PtdInsP kinases (or the PIPKc domain). 262
34829 340438 cd17301 PIPKc_PIP5KI Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in type I phosphatidylinositol 4-phosphate (PtdIns(4)P) 5-kinases (PIP5KI) and similar proteins. PIP5KIs, also known as PIPKIs, or PI4P5KIs, phosphorylate the head group of phosphatidylinositol 4-phosphate (PtdIns4P) to generate phosphatidylinositol 4,5-bisphosphate (PtdIns4,5P2), an essential lipid molecule in various cellular processes. Three distinct PIP5KIs have been characterized in erythrocytes, PIP5K1alpha, PIP5K1beta, and PIP5K1gamma isoforms. 320
34830 340439 cd17302 PIPKc_AtPIP5K_like Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Arabidopsis thaliana phosphatidylinositol 4-phosphate 5-kinases (PIP5Ks) and similar proteins. PIP5K (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase, or diphosphoinositide kinase, phosphorylates phosphatidylinositol-4-phosphate to produce phosphatidylinositol-4,5-bisphosphate as a precursor of two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol, and as a regulator of many cellular proteins involved in signal transduction and cytoskeletal organization. The family includes several PIP5Ks from Arabidopsis thaliana. AtPIP5K1 is involved in water-stress signal transduction. AtPIP5K2 interacts with all five Arabidopsis RAB-E proteins but not with other Rab subclasses residing at the Golgi or trans-Golgi network. AtPIP5K3 is a key regulator of root hair tip growth. AtPIP5K4 and AtPIP5K5 are type B PI4P 5-kinases expressed in pollen and have important functions in pollen germination and in pollen tube growth. AtPIP5K6 regulates clathrin-dependent endocytosis in pollen tubes. AtPIP5K9 interacts with a cytosolic invertase to negatively regulate sugar-mediated root growth. 314
34831 340440 cd17303 PIPKc_PIP5K_yeast_like Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in yeast phosphatidylinositol 4-phosphate 5-kinases (PIP5Ks) and similar proteins. PIP5K (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase, or diphosphoinositide kinase, phosphorylates phosphatidylinositol-4-phosphate to produce phosphatidylinositol-4,5-bisphosphate as a precursor of two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol, and as a regulator of many cellular proteins involved in signal transduction and cytoskeletal organization. The family includes Saccharomyces cerevisiae PIP5K MSS4 and Schizosaccharomyces pombe PIP5K Its3. MSS4 is required for organization of the actin cytoskeleton in budding yeast. Its3 is involved, together with the calcineurin ppb1, in cytokinesis of fission yeast. 318
34832 340441 cd17304 PIPKc_PIP5KL1 Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase-like protein 1 (PIP5KL1) and similar proteins. PIP5KL1 (EC 2.7.1.68), also known as PI(4)P 5-kinase-like protein 1, or PtdIns(4)P-5-kinase-like protein 1, may act as a scaffold to localize and regulate type I PI(4)P 5-kinases to specific compartments within the cell, where they generate PI(4,5)P2 for actin nucleation, signaling and scaffold protein recruitment, and conversion to PI(3,4,5)P3. 319
34833 340442 cd17305 PIPKc_PIP5KII Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in type II phosphatidylinositol 5-phosphate 4-kinase (PIP5KII) and similar proteins. PIP5KIIs, also known as PIPKIIs, or PI4P5KIIs, are responsible for the synthesis of phosphatidylinositol-4,5-bisphosphate (PtdIns4,5P2), an essential lipid molecule in various cellular processes, from phosphatidylinositol-5-phosphate (PtdIns5P). Three distinct PIP5KIIs have been characterized in erythrocytes, PIP5K2A, PIP5K2B, and PIP5K2C isoforms. 300
34834 340443 cd17306 PIPKc_PIP5K1A_like Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 alpha (PIP5K1alpha) and similar proteins. PIP5K1alpha (EC 2.7.1.68), also termed PIP5K1A, or PtdIns(4)P-5-kinase 1 alpha, or 68 kDa type I phosphatidylinositol 4-phosphate 5-kinase alpha, or PIPKI-alpha, catalyzes the phosphorylation of phosphatidylinositol 4-phosphate (PtdIns4P) to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2). It mediates extracellular calcium-induced keratinocyte differentiation. Unlike other type I phosphatidylinositol-4-phosphate 5-kinase (PIPKI) isoforms, PIP5K1alpha regulates directed cell migration by modulating Rac1 plasma membrane targeting and activation. This function is independent of its catalytic activity, and requires physical interaction of PIP5K1alpha with the Rac1 polybasic domain. The family also includes testis-specific PIP5K1A and PSMD4-like protein, also known as PIP5K1A-PSMD4 or PIPSL. It has negligible PIP5 kinase activity and binds to ubiquitinated proteins. 339
34835 340444 cd17307 PIPKc_PIP5K1B Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 beta (PIP5K1beta) and similar proteins. PIP5K1beta (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase 1 beta, or protein STM-7, or PIP5K1B, is encoded by the Friedreich's ataxia (FRDA) gene, STM7. FRDA is a progressive neurodegenerative disease characterized by ataxia, variously associating heart disease, diabetes mellitus, and/or glucose intolerance. PIP5K1beta is an enzyme functionally linked to actin cytoskeleton dynamics and it phosphorylates phosphatidylinositol 4-phosphate (PtdIns4P) to generate phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2). 321
34836 340445 cd17308 PIPKc_PIP5K1C Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 gamma (PIP5K1gamma) and similar proteins. PIP5K1gamma (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase 1 gamma, or PIP5K1gamma, or PIPKIgamma, or PtdInsPKI gamma, is a phosphatidylinositol-4-phosphate 5-kinase that catalyzes the phosphorylation of phosphatidylinositol 4-phosphate (PtdIns4P) to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), which is involved in a variety of cellular processes and is the substrate to form phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3), another second messenger. PIP5K1gamma is required for epidermal growth factor (EGF)-stimulated directional cell migration. It also modulates adherens junction and E-cadherin trafficking via a direct interaction with mu 1B adaptin. 323
34837 340446 cd17309 PIPKc_PIP5K2A Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 alpha (PIP5K2A) and similar proteins. PIP5K2A (EC 2.7.1.149), also known as PIP4K2A, or 1-phosphatidylinositol 5-phosphate 4-kinase 2-alpha, or diphosphoinositide kinase 2-alpha, or PIP5KIII, or phosphatidylinositol 5-phosphate 4-kinase type II alpha, or PI(5)P 4-kinase type II alpha, or PIP4KII-alpha, or PtdIns(4)P-5-kinase C isoform, or PtdIns(5)P-4-kinase isoform 2-alpha, catalyzes the phosphorylation of phosphatidylinositol 5-phosphate (PtdIns5P) on the fourth hydroxyl of the myo-inositol ring, to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), one of the key metabolic crossroads in phosphoinositide signaling. It is possibly involved in a mechanism protecting against tardive dyskinesia-inducing neurotoxicity. PIP5K2A is associated with schizophrenia. It controls the function of KCNQ channels via phosphatidylinositol-4,5-bisphosphate (PIP2) synthesis, and plays a potential role in the regulation of alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) receptors. 309
34838 340447 cd17310 PIPKc_PIP5K2B Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 beta (PIP5K2B) and similar proteins. PIP5K2B (EC 2.7.1.149), also known as 1-phosphatidylinositol 5-phosphate 4-kinase 2-beta, or diphosphoinositide kinase 2-beta, or phosphatidylinositol 5-phosphate 4-kinase type II beta, or PI(5)P 4-kinase type II beta, or PIP4KII-beta, or PtdIns(5)P-4-kinase isoform 2-beta, or PIP5KIIbeta, or PIP4K2B, participates in the biosynthesis of phosphatidylinositol 4,5-bisphosphate. It directly regulates the levels of two important phosphoinositide second messengers, PtdIns5P and phosphatidylinositol-(4,5)-bisphosphate (PtdIns(4,5)P2), one of the key metabolic crossroads in phosphoinositide signaling. It regulates the levels of nuclear PtdIns5P, which in turn modulates the acetylation of the tumour suppressor p53. It also interacts with and modulates nuclear localization of the high-activity PtdIns5P-4-kinase isoform PIP4Kalpha. Moreover, PIP5K2B is a molecular sensor that transduces changes in GTP into changes in the levels of the phosphoinositide PtdIns5P to modulate tumour cell growth. 311
34839 340448 cd17311 PIPKc_PIP5K2C Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 gamma (PIP5K2C) and similar proteins. PIP5K2C (EC 2.7.1.149), also known as 1-phosphatidylinositol 5-phosphate 4-kinase 2-gamma, or PI5P4Kgamma, or diphosphoinositide kinase 2-gamma, or phosphatidylinositol 5-phosphate 4-kinase type II gamma, or PI(5)P 4-kinase type II gamma, or PIP4KII-gamma, or PIP4K2C, may play an important role in the production of phosphatidylinositol bisphosphate (PIP2) in the endoplasmic reticulum. It contributes to the development and maintenance of epithelial cell functional polarity. It also plays a role in the regulation of the immune system via mTORC1 signaling. Moreover, PIP5K2C is involved in arsenic trioxide (ATO) cytotoxicity. It mediates PIP2 generation required for positioning and assembly of bipolar spindles and alteration of PIP5K2C function by ATO may thus lead to spindle abnormalities. 298
34840 340870 cd17312 MFS_OPA_SLC37 Organophosphate:Pi antiporter/Solute Carrier family 37 of the Major Facilitator Superfamily of transporters. Organophosphate:Pi antiporters (OPA) are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate in the opposite direction. The OPA family is also called solute carrier family 37 (SLC37) in vertebrates. Members include glucose-6-phosphate (Glc6P) transporter (also called translocase or exchanger), glycerol-3-phosphate permease, 2-phosphonopropionate transporter, phosphoglycerate transporter, as well as membrane sensor protein UhpC from Escherichia coli. UhpC is both a sensor and a transport protein; it recognizes external Glc6P and induces transport by UhpT, and it can also transport Glc6P. Vertebrates contain four SLC37 or sugar-phosphate exchange (SPX) proteins: SLC37A1 (SPX1), SLC37A2 (SPX2), SLC37A3 (SPX3), and SLC37A4 (SPX4). The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 364
34841 340871 cd17313 MFS_SLC45_SUC Solute carrier family 45 and similar sugar transporters of the Major Facilitator Superfamily of transporters. This group includes the solute carrier 45 (SLC45) family as well as plant sucrose transporters (SUCs or SUTs) and similar proteins such as Schizosaccharomyces pombe general alpha-glucoside permease. the SLC45 family is composed of four (A1-A4) vertebrate proteins as well as related insect proteins such as Drosophila sucrose transporter SCRT or Slc45-1. Members of this group transport sucrose and other sugars like maltose into the cell, with the concomitant uptake of protons (symport system). Plant sucrose transporters are crucial to carbon partitioning, playing a key role in phloem loading/unloading. They play a key role in loading and unloading of sucrose into the phloem and as a result, they control sucrose distribution throughout the whole plant and drive the osmotic flow system in the phloem. They also play a role in the exchange of sucrose between beneficial symbionts (mycorrhiza and Rhizobium) as well as pathogens such as nematodes and parasitic fungi. There are nine sucrose transporter genes in Arabidopsis and five in rice. Vertebrate SLC45 family proteins have been implicated in the regulation of glucose homoeostasis in the brain (SLC45A1), with skin and hair pigmentation (SLC45A2), and with prostate cancer and myelination (SLC45A3). Mutations in SLC45A2, also called MATP (membrane-associated transporter protein) or melanoma antigen AIM1, cause oculocutaneous albinism type 4 (OCA4), an autosomal recessive disorder of melanin biosynthesis that results in congenital hypopigmentation of ocular and cutaneous tissues. The SLC45 family and related sugar transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 421
34842 340872 cd17314 MFS_MCT_like Monocarboxylate transporter (MCT) family and similar transporters of the Major Facilitator Superfamily. The group is composed of the Monocarboxylate transporter (MCT) family in animals and similar transporters from fungi, plants, archaea, and bacteria. MCT is also called Solute carrier family 16 (SLC16 or SLC16A). It is composed of 14 members, MCT1-14. MCTs play an integral role in cellular metabolism via lactate transport and have been implicated in metabolic synergy in tumors. MCTs have been found to facilitate the transport across the plasma membrane not only of monocarboxylates (MCT1-4), but also thyroid hormones (MCT8/10), and aromatic acids (MCT10). Yeast MCT homologous (Mch) proteins are not involved in the uptake of monocarboxylates; their substrates are not known. The MCT-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 385
34843 340873 cd17315 MFS_GLUT_like Glucose transporters (GLUTs) and other similar sugar transporters of the Major Facilitator Superfamily. This family is composed of glucose transporters (GLUTs) and other sugar transporters including fungal hexose transporters (HXT), bacterial xylose transporter (XylE), plant sugar transport proteins (STP) and polyol transporters (PLT), H(+)-myo-inositol cotransporter (HMIT), and similar proteins. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. The GLUT-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 365
34844 340874 cd17316 MFS_SV2_like Metazoan Synaptic vesicle glycoprotein 2 (SV2) and related small molecule transporters of the Major Facilitator Superfamily. This family is composed of metazoan synaptic vesicle glycoprotein 2 (SV2) and related small molecule transporters including those that transport inorganic phosphate (Pht), aromatic compounds (PcaK and related proteins), proline/betaine (ProP), alpha-ketoglutarate (KgtP), citrate (CitA), shikimate (ShiA), and cis,cis-muconate (MucK), among others. SV2 is a transporter-like protein that serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. Also included in this family is synaptic vesicle 2 (SV2)-related protein (SVOP) and similar proteins. SVOP is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. The SV2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 353
34845 340875 cd17317 MFS_SLC22 Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily. The Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. The SLC22 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 331
34846 340876 cd17318 MFS_SLC17 Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily of transporters. The Solute carrier 17 (SLC17) family is primarily involved in the transport of organic anions. There are nine human proteins belonging to this family including: the type I phosphate transporters (SLC17A1-4) that were initially identified as sodium-dependent inorganic phosphate (Pi) transporters but are now known to be involved in the transport of organic anions; lysosomal acidic sugar transporter (SLC17A5 or sialin), vesicular glutamate transporters (VGluT1-3 or SLC17A7, SLC17A6, and SLC17A8, respectively), and a vesicular nucleotide transporter (VNUT or SLC17A9). SLC17A1 and SLC17A3 have roles in the transport of urate and para-aminohippurate, respectively. The SLC17 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
34847 340877 cd17319 MFS_ExuT_GudP_like Hexuronate transporter, Glucarate transporter, and similar transporters of the Major Facilitator Superfamily. This family is composed of predominantly bacterial transporters for hexuronate (ExuT), glucarate (GudP), galactarate (GarP), and galactonate (DgoT). They mediate the uptake of these compounds into the cell. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 358
34848 340878 cd17320 MFS_MdfA_MDR_like Multidrug transporter MdfA and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of bacterial multidrug resistance (MDR) transporters including several proteins from Escherichia coli such as MdfA (also called chloramphenicol resistance pump Cmr), EmrD, MdtM, MdtL, bicyclomycin resistance protein (also called sulfonamide resistance protein), and the uncharacterized inner membrane transport protein YdhC. EmrD is a proton-dependent secondary transporter, first identified as an efflux pump for uncouplers of oxidative phosphorylation. It expels a range of drug molecules and amphipathic compounds across the inner membrane of E. coli. Similarly, MdfA is a secondary multidrug transporter that exports a broad spectrum of structurally and electrically dissimilar toxic compounds. These MDR transporters are drug/H+ antiporters (DHA) belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 379
34849 340879 cd17321 MFS_MMR_MDR_like Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of bacterial, fungal, and archaeal multidrug resistance (MDR) transporters including several proteins from Bacilli such as methylenomycin A resistance protein (also called MMR peptide), tetracycline resistance protein (TetB), and lincomycin resistance protein LmrB, as well as fungal proteins such as vacuolar basic amino acid transporters, which are involved in the transport into vacuoles of the basic amino acids histidine, lysine, and arginine in Saccharomyces cerevisiae, and aminotriazole/azole resistance proteins. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, MMR confers resistance to the epoxide antibiotic methylenomycin while TetB confers resistance to tetracycline by an active tetracycline efflux. MMR-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 370
34850 340880 cd17322 MFS_ARN_like Yeast ARN family of Siderophore iron transporters and similar proteins of the Major Facilitator Superfamily. The ARN family of siderophore iron transporters includes ARN1 (or ferrichrome permease), ARN2 (or triacetylfusarinine C transporter 1 or TAF1), ARN3 (or siderophore iron transporter 1 or SIT1 or ferrioxamine B permease) and ARN4 (or Enterobactin permease or ENB1). They specifically recognize siderophore-iron chelates and are expressed under conditions of iron deprivation. They facilitate the uptake of both hydroxamate- and catecholate-type siderophores. This group also includes glutathione exchanger 1 (Gex1p) and Gex2p, which are proton/glutathione antiporters that import glutathione from the vacuole and export it through the plasma membrane. The ARN family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 514
34851 340881 cd17323 MFS_Tpo1_MDR_like Yeast Polyamine transporter 1 (Tpo1) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of fungal multidrug resistance (MDR) transporters including several proteins from Saccharomyces cerevisiae such as polyamine transporters 1-4 (Tpo1-4), quinidine resistance proteins 1-3 (Qdr1-3), dityrosine transporter 1 (Dtr1), fluconazole resistance protein 1 (Flr1), and protein HOL1. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, Flr1 confers resistance to the azole derivative fluconazole while Tpo1 confers resistance and adaptation to quinidine and ketoconazole. The polyamine transporters are involved in the detoxification of excess polyamines in the cytoplasm. Tpo1-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 376
34852 340882 cd17324 MFS_NepI_like Purine ribonucleoside efflux pump NepI and similar transporters of the Major Facilitator Superfamily. This family is composed of purine efflux pumps such as Escherichia coli NepI and Bacillus subtilis PbuE, sugar efflux transporters such as Corynebacterium glutamicum arabinose efflux permease, multidrug resistance (MDR) transporters such as Streptomyces lividans chloramphenicol resistance protein (CmlR), and similar proteins. NepI and PbuE are involved in the efflux of purine ribonucleosides such as guanosine, adenosine and inosine, as well as purine bases like guanine, adenine, and hypoxanthine, and purine base analogs. They play a role in the maintenance of cellular purine base pools, as well as in protecting the cells and conferring resistance against toxic purine base analogs such as 6-mercaptopurine. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. The NepI-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 370
34853 340883 cd17325 MFS_MdtG_SLC18_like Bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily of transporters. This family is composed of eukaryotic solute carrier 18 (SLC18) family transporters and related bacterial multidrug resistance (MDR) transporters including several proteins from Escherichia coli such as multidrug resistance protein MdtG, from Bacillus subtilis such as multidrug resistance proteins 1 (Bmr1) and 2 (Bmr2), and from Staphylococcus aureus such as quinolone resistance protein NorA. The family also includes Escherichia coli arabinose efflux transporters YfcJ and YhhS. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. The SLC18 transporter family includes vesicular monoamine transporters (VMAT1 and VMAT2), vesicular acetylcholine transporter (VAChT), and SLC18B1, which is proposed to be a vesicular polyamine transporter (VPAT). The MdtG/SLC18 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 375
34854 340884 cd17326 MFS_MFSD8 Major facilitator superfamily domain-containing protein 8. Major facilitator superfamily (MFS) domain-containing protein 8 (MFSD8) is also called ceroid-lipofuscinosis neuronal protein 7 (CLN7). It is a polytopic lysosomal membrane protein that may transport small solutes by using chemiosmotic ion gradients. Mutations in MFSD8/CLN7 cause a variant of late-infantile neuronal ceroid lipofuscinoses (vLINCL), a neurodegenerative lysosomal storage disorder. Some variants are associated with nonsyndromic autosomal recessive macular dystrophy. MFSD8/CLN7 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
34855 340885 cd17327 MFS_FEN2_like Pantothenate transporter FEN2 and similar transporters of the Major Facilitator Superfamily. This family is composed of Saccharomyces cerevisiae pantothenate transporter FEN2 (or fenpropimorph resistance protein 2) and similar proteins from fungi and bacteria including fungal vitamin H transporter, allantoate permease, and high-affinity nicotinic acid transporter, as well as Pseudomonas putida phthalate transporter and nicotinate degradation protein T (nicT). These proteins are involved in the uptake into the cell of specific substrates such as pantothenate, biotin, allantoate, and nicotinic acid, among others. The FEN2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 406
34856 340886 cd17328 MFS_spinster_like Protein spinster and spinster homologs of the Major Facilitator Superfamily of transporters. The protein spinster family includes Drosophila protein spinster, its vertebrate homologs, and similar proteins. Humans contain three homologs called protein spinster homologs 1 (SPNS1), 2 (SPNS2), and 3 (SPNS3). Protein spinster and its homologs may be sphingolipid transporters that play central roles in endosomes and/or lysosomes storage. SPNS2 is also called sphingosine 1-phosphate (S1P) transporter and is required for migration of myocardial precursors. S1P is a secreted lipid mediator that plays critical roles in cardiovascular, immunological, and neural development and function. The spinster-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 405
34857 340887 cd17329 MFS_MdtH_MDR_like Multidrug resistance protein MdtH and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli MdtH and similar multidrug resistance (MDR) transporters from bacteria and archaea, many of which remain uncharacterized. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. MdtH confers resistance to norfloxacin and enoxacin. MdtH-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 376
34858 340888 cd17330 MFS_SLC46_TetA_like Eukaryotic Solute carrier 46 (SLC46) family, Bacterial Tetracycline resistance proteins, and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of the eukaryotic proteins MFSD9, MFSD10, MFSD14, and SLC46 family proteins, as well as bacterial multidrug resistance (MDR) transporters such as tetracycline resistance protein TetA and multidrug resistance protein MdtG. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. TetA proteins confer resistance to tetracycline while MdtG confers resistance to fosfomycin and deoxycholate. The Solute carrier 46 (SLC46) family is composed of three vertebrate members (SLC46A1, SLC46A2, and SLC46A3), the best-studied of which is SLC46A1, which functions both as an intestinal proton-coupled high-affinity folate transporter involved in the absorption of folates and as an intestinal heme transporter which mediates heme uptake. MFSD10 facilitates the uptake of organic anions such as some non-steroidal anti-inflammatory drugs (NSAIDs) and confers resistance to such NSAIDs. The SLC46/TetA-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 349
34859 340889 cd17331 MFS_SLC22A18 Solute carrier family 22 member 18 of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 18 (SLC22A18) is also called Beckwith-Wiedemann syndrome chromosomal region 1 candidate gene A protein (BWR1A or BWSCR1A), efflux transporter-like protein, imprinted multi-membrane-spanning polyspecific transporter-related protein 1 (IMPT1), organic cation transporter-like protein 2 (ORCTL2), or tumor-suppressing subchromosomal transferable fragment candidate gene 5 protein (TSSC5). It is localized at the apical membrane surface of renal proximal tubules and may act as an organic cation/proton antiporter. It functions as a tumor suppressor in several cancer types including glioblastoma and colorectal cancer. SLC22A18 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 382
34860 340890 cd17332 MFS_MelB_like Salmonella enterica Na+/melibiose symporter MelB and similar transporters of the Major Facilitator Superfamily. This family is composed of Salmonella enterica Na+/melibiose symporter MelB, Major Facilitator Superfamily domain-containing proteins, MFSD2 and MFSD12, and other sugar transporters. MelB catalyzes the electrogenic symport of galactosides with Na+, Li+ or H+. The MFSD2 subfamily is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is more commonly called sodium-dependent lysophosphatidylcholine symporter 1 (NLS1). It is an LPC symporter that plays an essential role for blood-brain barrier formation and function. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MelB-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 424
34861 340891 cd17333 MFS_FucP_MFSD4_like Bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4, and similar proteins. This family is composed of bacterial L-fucose permease (FucP), eukaryotic Major facilitator superfamily domain-containing protein 4 (MFSD4) proteins, and similar proteins. L-fucose permease facilitates the uptake of L-fucose across the boundary membrane with the concomitant transport of protons into the cell; it can also transport L-galactose and D-arabinose. The MFSD4 subfamily consists of two vertebrate members: MFSD4A and MFSD4B. The function of MFSD4A is unknown. MFSD4B is more commonly known as Sodium-dependent glucose transporter 1 (NaGLT1), a primary fructose transporter in rat renal brush-border membranes that also facilitates sodium-independent urea uptake. The FucP/MFSD4 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 372
34862 340892 cd17334 MFS_SLC49 Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily of transporters. The Solute carrier 49 (SLC49) family is composed of four members: feline leukemia virus subgroup C receptor 1 (FLVCR1, SLC49A1); FLVCR2 (SLC49A2); major facilitator superfamily domain-containing protein 7 (MFSD7, SLC49A3); and disrupted in renal carcinoma protein 2 (DIRC2, SLC49A4). FLVCR1 and FLVCR2 are heme transporters. In addition, FLVCR2 also functions as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. The function of MFSD7 is unknown. DIRC2 is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. The SLC49 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 407
34863 340893 cd17335 MFS_MFSD6 Major facilitator superfamily domain-containing protein 6. Human Major facilitator superfamily domain-containing protein 6 (MFSD6) is also called macrophage MHC class I receptor 2 homolog (MMR2). It has been postulated as a possible receptor for human leukocyte antigen (HLA)-B62. MFSD6 is conserved through evolution and appeared before bilateral animals. It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 375
34864 340894 cd17336 MFS_SLCO_OATP Solute carrier organic anion transporters of the Major Facilitator Superfamily of transporters. Solute carrier organic anion transporters (SLCOs) are also called organic anion transporting polypeptides (OATPs) or SLC21 (Solute carrier family 21) proteins. They are sodium-independent transporters that mediate the transport of a broad range of endo- as well as xenobiotics. Their substrates are mainly amphipathic organic anions with a molecular weight of more than 300Da, although there are a few known neutral or positively charged substrates. These include drugs including statins, angiotensin-converting enzyme inhibitors, angiotensin receptor blockers, antibiotics, antihistaminics, antihypertensives, and anticancer drugs. SLCOs/OATPs can be classified into 6 families (SLCO1-6 or OATP1-6) and each family may have subfamilies (e.g. OATP1A, OATP1B, OATP1C). Within the subfamilies, individual members are numbered according to the chronology of their identification and if there is already an ortholog known, they are given the same number. For example, the first SLCO identified, is rat OATP1A1 (encoded by the Slco1a1 gene). The second SLCO identified is the first human SLCO from the same subfamily and is called OATP1A2 (encoded by the SLCO1A2 gene). There are 11 human SLCOs/OATPs. SLCOs belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 411
34865 340895 cd17337 MFS_CsbX CsbX family of the Major Facilitator Superfamily of transporters. The CsbX family is composed of Bacillus subtilis CsbX protein (also named alpha-ketoglutarate permease), Klebsiella pneumoniae D-arabinitol transporter (DalT), and similar proteins. The csbX gene is a sigmaB-controlled gene that is expressed during the stationary phase of cell growth. DalT is a pentose-specific ion symporter for D-arabinitol uptake. Most members of this family remain uncharacterized. The CsbX family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 388
34866 340896 cd17338 MFS_unc93_like Unc-93 family of the Major Facilitator Superfamily of transporters. The Unc-93 family is composed of Caenorhabditis elegans uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93) and similar proteins including three vertebrate members: protein unc-93 homolog A (UNC93A), protein unc-93 homolog B1 (UNC93B1), and UNC93-like protein MFSD11 (also called major facilitator superfamily domain-containing protein 11 or protein ET). Unc-93 acts as a regulatory subunit of a multi-subunit potassium channel complex that may function in coordinating muscle contraction in C. elegans. The human UNC93A gene is located in a region of the genome that is frequently associated with ovarian cancer; however, there is no evidence that UNC93A has a tumor suppressor function. UNC93B1 controls intracellular trafficking and transport of a subset of Toll-like receptors (TLRs), including TLR3, TLR7 and TLR9, from the endoplasmic reticulum to endolysosomes where they can engage pathogen nucleotides and activate signaling cascades. MFSD11 is ubiquitously expressed in the periphery and the central nervous system of mice, where it is expressed in excitatory and inhibitory mouse brain neurons. The unc93-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 388
34867 340897 cd17339 MFS_NIMT_CynX_like 2-nitroimidazole and cyanate transporters and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Escherichia coli 2-nitroimidazole transporter (NIMT) and cyanate transport protein CynX, and similar proteins. NIMT, also called YeaN, confers resistance to 2-nitroimidazole, the antibacterial and antifungal antibiotic, by mediating the active efflux of this compound. CynX is part of an active transport system that transports exogenous cyanate into E. coli cells. The NIMT/CynX-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
34868 340898 cd17340 MFS_MFSD1 Major facilitator superfamily domain-containing protein 1. Human major facilitator superfamily domain-containing protein 1 (MFSD1) is also called smooth muscle cell-associated protein 4 (SMAP-4). The function of MFSD1 is still unknown. Its expression is affected by altered nutrient intake. During starvation, expression of MFSD1 is downregulated in anterior brain sections in mice while it is upregulated in the brainstem. In mice raised on high-fat diet, MFSD1 is specifically downregulated in brainstem and hypothalamus. MFSD1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 394
34869 340899 cd17341 MFS_NRT2_like Plant Nitrate transporter NRT2 family and Bacterial Nitrate/Nitrite transporters of the Major Facilitator Superfamily. This family is composed of plant NRT2 family high-affinity nitrate transporters as well as nitrate and nitrite transporters from bacteria including Bacillus subtilis nitrate transporter NasA and nitrite extrusion protein NarK, Staphylococcus aureus NarT, Synechococcus sp. nitrate permease NapA, Mycobacterium tuberculosis NarK2 and nitrite extrusion protein NarU. NRT2 family proteins are involved in the uptake of nitrate by plant roots from the soil through the high-affinity transport system (HATS). There are seven Arabidopsis thaliana NRT2 proteins, called AtNRT2:1 to AtNRT2:7. The NRT2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 384
34870 340900 cd17342 MFS_SLC37A3 Solute carrier family 37 member 3 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 member 3 (SLC37A3) is also called sugar phosphate exchanger 3 (SPX3), and is one of four SLC37 family proteins in vertebrates. Its function and activity are unknown. The best characterized SLC37 family member is SLC37A4, also called the glucose-6-phosphate transporter (G6PT), a phosphate (Pi)-linked G6P antiporter. SLC37A3 is a member of the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 399
34871 340901 cd17343 MFS_SLC37A4 Solute carrier family 37 member 4 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 member 4 (SLC37A4), one of four SLC37 family proteins in vertebrates, is better known as glucose-6-phosphate transporter (G6PT). It is also called sugar phosphate exchanger 4 (SPX4), G6P translocase, or transformation-related gene 19 protein (TRG-19). G6PT is a phosphate (Pi)-linked G6P antiporter, catalyzing G6P:Pi and Pi:Pi exchanges. Deficiencies in human G6PT lead to glycogen storage disease type Ib (GSD-Ib), which is a metabolic and immune disorder. G6PT is a member of the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 409
34872 340902 cd17344 MFS_SLC37A1_2 Solute carrier family 37 members 1 and 2 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 members 1 (SLC37A1) and 2 (SLC37A2) are also called sugar phosphate exchangers 1 (SPX1) and 2 (SPX2). SLC37A1 and SLC37A2 are ER-associated, Pi-linked antiporters that can transport glucose-6-phosphate (G6P) but are insensitive to chlorogenic acid, a competitive inhibitor of physiological ER G6P transport, unlike SLC37A4, the best characterized SLC37 family member and the physiological G6P transporter (G6PT). SLC37A1 and SLC37A2 belong to the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 400
34873 340903 cd17345 MFS_GlpT Glycerol-3-Phosphate Transporter of the Major Facilitator Superfamily of transporters. Glycerol-3-Phosphate Transporter (also called GlpT or G-3-P permease) is responsible for glycerol-3-phosphate uptake. It is part of the Organophosphate:Pi antiporter (OPA) family of integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The GlpT group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 411
34874 340904 cd17346 MFS_DtpA_like Dipeptide and tripeptide permease A (DtpA)-like subfamily of the Major Facilitator Superfamily of transporters. The DtpA-like subfamily includes four Escherichia coli proteins: dipeptide and tripeptide permeases A (DtpA, TppB or YdgR), B (DtpB or YhiP), C (DtpC or YjdL), and D (DtpD or YbgH). They are proton-dependent permeases that transport di- and tripeptides. DtpA and DtpB display a preference for di- and tripeptides composed of L-amino acids. DtpC shows higher specificity for dipeptides compared to tripeptides, and prefers dipeptides containing a C-terminal lysine residue. The DtpA-like subfamily belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 399
34875 340905 cd17347 MFS_SLC15A1_2_like Solute carrier family 15 members 1 and 2, and similar Major Facilitator Superfamily transporters. Solute carrier family 15 member 1 (SLC15A1) and SLC15A2 are members of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. They mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A1, or peptide transporter 1 (PepT1), is primarily expressed in the brush border membranes of enterocytes of the small intestine and is also known as the intestinal isoform. SLC15A2, or peptide transporter 2 (PepT2), is abundantly expressed in the apical membrane of kidney proximal tubules and is also referred to as the renal isoform. Both proteins transport di/tripeptides, but not tetrapeptides or free amino acids, using the energy generated by an inwardly directed transmembrane proton gradient. The SLC15A1/SLC15A2-like group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 427
34876 340906 cd17348 MFS_SLC15A3_4 Solute Carrier family 15 members 3 and 4 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 members 3 (SLC15A3) and 4 (SLC15A4) are members of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. They are peptide/histidine transporters (PHTs) that transport free histidine in addition to di/tripeptides. SLC15A4, also called peptide transporter 4 or peptide/histidine transporter 1 (PHT1), is expressed in the human brain, retina, placenta, and immune cells. It is required for Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production in plasmacytoid dendritic cells (pDCs) and is involved in the pathogenesis of lupus-like autoimmunity. SLC15A3, also called osteoclast transporter, peptide transporter 3, or peptide/histidine transporter 2 (PHT2), is expressed in immune tissues including the spleen and thymus. The SLC15A3/SLC15A4 group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 435
34877 340907 cd17349 MFS_SLC15A5 Solute Carrier family 15 member 5 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 5 (SLC15A5) is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. The specific function of SLC15A5 is unknown. SLC15A5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 437
34878 340908 cd17350 MFS_PTR2 Peptide transporter PTR2 of the Major Facilitator Superfamily of transporters. Fungal peptide transporter or permease PTR2 is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. It is a 12-transmembrane domain (TMD) integral membrane protein that translocates di-/tripeptides. As with other POT family proteins, it displays characteristic substrate multispecificity. PTR2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 438
34879 340909 cd17351 MFS_NPF Plant NRT1/PTR family (NPF) of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 445
34880 340910 cd17352 MFS_MCT_SLC16 Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily of transporters. The animal Monocarboxylate transporter (MCT) family is also called Solute carrier family 16 (SLC16 or SLC16A). It is composed of 14 members, MCT1-14. MCTs play an integral role in cellular metabolism via lactate transport and have been implicated in metabolic synergy in tumors. MCT1-4 are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT8 and MCT10 are transporters which stimulate the cellular uptake of thyroid hormones such as thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). MCT10 also functions as a sodium-independent transporter that mediates the uptake or efflux of aromatic acids. Many members are orphan transporters whose substrates are yet to be determined. The MCT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 361
34881 340911 cd17353 MFS_OFA_like Oxalate:formate antiporter (OFA) and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Oxalobacter formigenes oxalate:formate antiporter (OFA or OxlT) and similar proteins. O. formigenes, a commensal found in the gut of animals and humans, plays an important role in clearing dietary oxalate from the intestinal tract, which is carried out by OFA/OxlT, an anion transporter that facilitates the exchange of divalent oxalate with monovalent formate, the product of oxalate decarboxylation. This exchange generates an electrochemical proton gradient and is the source of energy for ATP synthesis in this cell. The OFA-like subfamily belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
34882 340912 cd17354 MFS_Mch1p_like Monocarboxylate transporter-homologous (Mch) 1 protein and similar transporters of the Major Facilitator Superfamily of transporters. Yeast monocarboxylate transporter-homologous (Mch) proteins are putative transporters that do not transport monocarboxylic acids across the plasma membrane, and may play roles distinct from their mammalian counterparts. Their function has not been determined. The Mch1p-like group belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 385
34883 340913 cd17355 MFS_YcxA_like MFS-type transporter YcxA and similar proteins of the Major Facilitator Superfamily of transporters. This group is composed of uncharacterized bacterial MFS-type transporters including Bacillus subtilis YcxA and YbfB. YcxA has been shown to facilitate the export of surfactin in B. subtilis. The YcxA-like group belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 386
34884 340914 cd17356 MFS_HXT Fungal Hexose transporter subfamily of the Major Facilitator Superfamily of transporters and similar proteins. The fungal hexose transporter (HXT) subfamily is comprised of functionally redundant proteins that function mainly in the transport of glucose, as well as other sugars such as galactose and fructose. Saccharomyces cerevisiae has 20 genes that encode proteins in this family (HXT1 to HXT17, GAL2, SNF3, and RGT2). Seven of these (HXT1-7) encode functional glucose transporters. Gal2p is a galactose transporter, while Rgt2p and Snf3p act as cell surface glucose receptors that initiate signal transduction in response to glucose, functioning in an induction pathway responsible for glucose uptake. Rgt2p is activated by high levels of glucose and stimulates expression of low affinity glucose transporters such as Hxt1p and Hxt3p, while Snf3p generates a glucose signal in response to low levels of glucose, stimulating the expression of high affinity glucose transporters such as Hxt2p and Hxt4p. Schizosaccharomyces pombe contains eight GHT genes (GHT1-8) belonging to this family. Ght1, Ght2, and Ght5 are high-affinity glucose transporters; Ght3 is a high-affinity gluconate transporter; and Ght6 is a high-affinity fructose transporter. The substrate specificities for Ght4, Ght7, and Ght8 remain undetermined. The HXT subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 403
34885 340915 cd17357 MFS_GLUT_Class1_2_like Class 1 and Class 2 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. This subfamily includes Class 1 and Class 2 glucose transporters (GLUTs) including Solute carrier family 2, facilitated glucose transporter member 1 (SLC2A1, also called glucose transporter type 1 or GLUT1), SLC2A2-5 (GLUT2-5), SLC2A7 (GLUT7), SLC2A9 (GLUT9), SLC2A11 (GLUT11), SLC2A14 (GLUT14), and similar proteins. GLUTs are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUTs 1-5 are the most thoroughly studied and are well-established as glucose and/or fructose transporters in various tissues and cell types. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 447
34886 340916 cd17358 MFS_GLUT6_8_Class3_like Glucose transporter (GLUT) types 6 and 8, Class 3 GLUTs, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of glucose transporter type 6 (GLUT6), GLUT8, plant early dehydration-induced gene ERD6-like proteins, and similar insect proteins including facilitated trehalose transporter Tret1-1. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). Insect Tret1-1 is a low-capacity facilitative transporter for trehalose that mediates the transport of trehalose synthesized in the fat body and the incorporation of trehalose into other tissues that require a carbon source. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 436
34887 340917 cd17359 MFS_XylE_like D-xylose-proton symporter and similar transporters of the Major Facilitator Superfamily. This subfamily includes bacterial transporters such as D-xylose-proton symporter (XylE or XylT), arabinose-proton symporter (AraE), galactose-proton symporter (GalP), major myo-inositol transporter IolT, glucose transport protein, putative metabolite transport proteins YfiG, YncC, and YwtG, and similar proteins. The symporters XylE, AraE, and GalP facilitate the uptake of D-xylose, arabinose, and galactose, respectively, across the boundary membrane with the concomitant transport of protons into the cell. IolT is involved in polyol metabolism and myo-inositol degradation into acetyl-CoA. The XylE-like subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 383
34888 340918 cd17360 MFS_HMIT_like H(+)-myo-inositol cotransporter and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of myo-inositol/inositol transporters and similar transporters from vertebrates, plant, and fungi. The human protein is called H(+)-myo-inositol cotransporter/Proton myo-inositol cotransporter (HMIT), or H(+)-myo-inositol symporter, or Solute carrier family 2 member 13 (SLC2A13). HMIT is classified as a Class 3 GLUT (glucose transporter) based on sequence similarity with GLUTs, but it does not transport glucose. It specifically transports myo-inositol and is expressed predominantly in the brain, with high expression in the hippocampus, hypothalamus, cerebellum and brainstem. HMIT may be involved in regulating processes that require high levels of myo-inositol or its phosphorylated derivatives, such as membrane recycling, growth cone dynamics, and synaptic vesicle exocytosis. Arabidopsis Inositol transporter 4 (AtINT4) mediates high-affinity H+ symport of myo-inositol across the plasma membrane. The HMIT-like subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 362
34889 340919 cd17361 MFS_STP Plant Sugar transport protein subfamily of the Major Facilitator Superfamily of transporters. The plant Sugar transport protein (STP) subfamily includes STP1-STP14; they are also called hexose transporters. They mediate the active uptake of hexoses such as glucose, 3-O-methylglucose, fructose, xylose, mannose, galactose, fucose, 2-deoxyglucose and arabinose, by sugar/hydrogen symport. Several STP family transporters are expressed in a tissue-specific manner, or at specific developmental stages. STP1 is the member with the highest expression level of all members and high expression is detected in photosynthetic tissues, such as leaves and stems, while roots, siliques, and flowers show lower expression levels. It plays a major role in the uptake and response of Arabidopsis seeds and seedlings to sugars. The STP subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 390
34890 340920 cd17362 MFS_GLUT10_12_Class3_like Glucose transporter (GLUT) types 10 and 12, Class 3 GLUTs, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of glucose transporter type 10, GLUT12, plant polyol transporters (PLTs), and similar proteins. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
34891 340921 cd17363 MFS_SV2 Synaptic vesicle glycoprotein 2 of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, that is expressed in vertebrates and localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain regions. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. The SV2 family belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 474
34892 340922 cd17364 MFS_PhT Inorganic Phosphate Transporter of the Major Facilitator Superfamily of transporters. This subfamily is composed of predominantly fungal and plant high-affinity inorganic phosphate transporters (PhT or PiPT), which are involved in the uptake, translocation, and internal transport of inorganic phosphate. They also function in sensing external phosphate levels as transceptors. Phosphate is crucial for structural and metabolic needs, including nucleotide and lipid synthesis, signalling and chemical energy storage. The Pht subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
34893 340923 cd17365 MFS_PcaK_like 4-hydroxybenzoate transporter PcaK and similar transporters of the Major Facilitator Superfamily. This aromatic acid:H(+) symporter subfamily includes Acinetobacter sp. 4-hydroxybenzoate transporter PcaK, Pseudomonas putida gallate transporter (GalT), Corynebacterium glutamicum gentisate transporter (GenK), Nocardioides sp. 1-hydroxy-2-naphthoate transporter (PhdT), Escherichia coli 3-(3-hydroxy-phenyl)propionate (3HPP) transporter (MhpT), and similar proteins. These transporters are involved in the uptake across the cytoplasmic membrane of specific aromatic compounds such as 4-hydroxybenzoate, gallate, gentisate (2,5-dihydroxybenzoate), 1-hydroxy-2-naphthoate, and 3HPP, respectively. The PcaK-like aromatic acid:H(+) symporter subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 351
34894 340924 cd17366 MFS_ProP Proline/betaine transporter of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli proline/betaine transporter, also called proline porter II (PPII), and similar proteins. ProP is a proton symporter that senses osmotic shifts and responds by importing osmolytes such as proline, glycine betaine, stachydrine, pipecolic acid, ectoine and taurine. It is both an osmosensor and an osmoregulator which is available to participate early in the bacterial osmoregulatory response. The ProP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 377
34895 340925 cd17367 MFS_KgtP Alpha-ketoglutarate permease of the Major Facilitator Superfamily of transporters. This subfamily includes Escherichia coli alpha-ketoglutarate permease (KgtP) and similar proteins. KgtP is a constitutively expressed proton symporter that functions in the uptake of alpha-ketoglutarate across the boundary membrane. Also included is a putative transporter from Pseudomonas aeruginosa named dicarboxylic acid transporter PcaT. The KgtP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 407
34896 340926 cd17368 MFS_CitA Citrate-proton symporter of the Major Facilitator Superfamily of transporters. Citrate-proton symporter, also called citrate carrier protein or citrate transporter or citrate utilization protein A (CitA), is a proton symporter that functions in the uptake of citrate across the boundary membrane. It allows the utilization of citrate as a sole source of carbon and energy. In Klebsiella pneumoniae, the gene encoding this protein is called citH, instead of citA, which is the case for Escherichia coli and other organisms. CitA belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 407
34897 340927 cd17369 MFS_ShiA_like Shikimate transporter and similar proteins of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli shikimate transporter (ShiA), inner membrane metabolite transport protein YhjE, and other putative metabolite transporters. ShiA is involved in the uptake of shikimate, an aromatic compound involved in siderophore biosynthesis. It has been suggested that YhjE may mediate the uptake of osmoprotectants. The ShiA-like subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 408
34898 340928 cd17370 MFS_MJ1317_like MJ1317 and similar transporters of the Major Facilitator Superfamily. This family is composed of Methanocaldococcus jannaschii MFS-type transporter MJ1317, Mycobacterium bovis protein Mb2288, and similar proteins. They are uncharacterized transporters belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
34899 340929 cd17371 MFS_MucK Cis,cis-muconate transport protein and similar proteins of the Major Facilitator Superfamily. This subfamily is composed of Acinetobacter sp. Cis,cis-muconate transport protein (MucK), Escherichia coli putative sialic acid transporter 1, and similar proteins. MucK functions in the uptake of muconate and allows Acinetobacter calcoaceticus ADP1 (BD413) to grow on exogenous cis,cis-muconate as the sole carbon source. The MucK subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 389
34900 340930 cd17372 MFS_SVOP_like Synaptic vesicle 2-related protein (SVOP) and related proteins of the Major Facilitator Superfamily. This subfamily is composed of synaptic vesicle 2 (SV2)-related protein (SVOP), SVOP-like protein (SVOPL), and similar proteins. SVOP is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. Like SV2, SVOP is expressed in all brain regions, with highest levels in cerebellum, hindbrain and pineal gland. Studies with knockout mice suggest that SVOP may perform a subtle function that is not necessary for survival under normal conditions, since mice lacking SVOP are viable, fertile, and phenotypically normal. SVOP and SVOPL share structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. The SVOP-like subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 367
34901 340931 cd17373 MFS_SLC22A17_like Solute carrier family 22, member 17 and similar proteins of the Major Facilitator Superfamily. This group is composed of Solute carrier family 22, members 17, 23, and 31. They are members of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). SLC22A17 functions as a cell surface receptor for lipocalin-2 (LCN2), also called NGAL or 24p3, which plays a key role in iron homeostasis and transport. SLC22A23 and SLC22A31 are orphan members of the SLC22 family. SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. The SLC22A17-like group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 348
34902 340932 cd17374 MFS_OAT Organic anion transporters of the Major Facilitator Superfamily of transporters. Organic anion transporters (OATs) generally display broad substrate specificity and they facilitate the exchange of extracellular with intracellular organic anions (OAs). Several OATs have been characterized including OAT1-10 and urate anion exchanger 1 (URAT1, also called SLC22A12). Many OATs occur in renal proximal tubules, the site of active drug secretion. OATs mediate the absorption, distribution, and excretion of a diverse array of environmental toxins and clinically important drugs, including anti-HIV therapeutics, anti-tumor drugs, antibiotics, anti-hypertensives, and anti-inflammatories, and are therefore critical for the survival of the mammalian species. OATs fall into the SLC22 (solute carrier 22) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 341
34903 340933 cd17375 MFS_SLC22A16_CT2 Solute carrier family 22 member 16 (also called Carnitine transporter 2) of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 16 (SLC22A16) is also called carnitine transporter 2 (CT2), fly-like putative transporter 2 (FLIPT2), organic cation transporter OKB1, or organic cation/carnitine transporter 6 (OCT6). It is a partially sodium-ion dependent high affinity carnitine transporter. It also transports organic cations such as tetraethylammonium (TEA) and doxorubicin. It is one of several organic cation transporters (OCTs) that fall into the SLC22 (solute carrier 22) family. OCTs are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. SLC22A16 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 341
34904 340934 cd17376 MFS_SLC22A4_5_OCTN1_2 Solute carrier family 22 members 4 and 5 (also called Organic cation/carnitine transporters 1 and 2) of the Major Facilitator Superfamily of transporters. This subfamily is composed of solute carrier family 22 members 4 (SLC22A4) and 5 (SLC22A5), and similar proteins. SLC22A4 is also called ergothioneine transporter (ETT) or organic cation/carnitine transporter 1 (OCTN1). It is a sodium-ion dependent, low affinity carnitine transporter, and a highly specific transporter for the uptake of ergothioneine (ET), a thiolated derivative of histidine with antioxidant properties. ET is a natural compound produced only by certain fungi and bacteria and must be absorbed from the diet by humans and other vertebrates. SLC22A5, also called organic cation/carnitine transporter 2 (OCTN2), is a sodium-ion dependent, high affinity carnitine transporter involved in the active cellular uptake of carnitine. SLC22A4/5 belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 342
34905 340935 cd17377 MFS_SLC22A15 Solute carrier family 22 member 15 of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 15 (SLC22A15) is also called fly-like putative transporter 1 (FLIPT1). It is expressed at the highest levels in the kidney and brain. It is a member of the SLC22 family of transporters, which includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A15 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 353
34906 340936 cd17378 MFS_OCT_plant Plant organic cation/carnitine transporters of the Major Facilitator Superfamily of transporters. Plant organic cation/carnitine transporters (OCTs) are sequence-similar to their animal counterparts, which are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. Little is known about plant OCTs. In Arabidopsis, there are six genes belonging to this family, and the individual genes show distinct, organ-specific expression patterns. AtOCT1 has been found to affect root development and carnitine-related responses in Arabidopsis. AtOCT4, 5 and 6 are up-regulated during drought stress, AtOCT3 and 5 during cold stress and AtOCT5 and 6 during salt stress treatments. Plant OCTs belong to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 342
34907 340937 cd17379 MFS_SLC22A1_2_3 Solute carrier family 22 members 1, 2, and 3 (also called Organic cation transporters 1, 2, and 3) of the Major Facilitator Superfamily of transporters. This subfamily includes solute carrier family 22 member 1 (SLC22A1, also called organic cation transporter 1 or OCT1), SLC22A2 (or OCT2), SLC22A3 (or OCT3), and similar proteins. OCT1-3 have similar basic functional properties: they are able to translocate a variety of structurally different organic cations in both directions across the plasma membrane; to translocate organic cations independently from sodium, chloride or proton gradients; and to function as electrogenic uniporters for cations or as electroneutral cation exchangers. They show overlapping but distinct substrate and inhibitor specificities, and different tissue expression patterns. In humans, OCT1 is strongly expressed in the liver, OCT2 is highly expressed in the kidney where it is localized at the basolateral membrane of renal proximal tubules, and OCT3 is most strongly expressed in skeletal muscle. OCTs are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. The SLC22A1-3 subfamily belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 340
34908 340938 cd17380 MFS_SLC17A9_like Solute carrier family 17 member 9 and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily includes solute carrier family 17 member 9 (SLC17A9) and similar proteins including plant inorganic phosphate transporters (PHT4) that are also probably anion transporters. SLC17A9, also called vesicular nucleotide transporter (VNUT), is involved in vesicular storage and exocytosis of ATP. It facilitates the accumulation of ATP and other nucleotides in secretory vesicles such as adrenal chromaffin granules and synaptic vesicles. It also functions as a lysosomal ATP transporter and regulates cell viability. Plant PHT4 family transporters mediate the transport of inorganic phosphate and may also transport organic anions. The Arabidopsis protein AtPHT4;4 is a chloroplast-localized ascorbate transporter. PHT4 proteins show differential expression that suggests specialized functions. The SLC17A9-like subfamily belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 361
34909 340939 cd17381 MFS_SLC17A5 Solute carrier family 17 member 5 (also called sialin) of the Major Facilitator Superfamily of transporters. Solute carrier family 17 member 5 (SLC17A5) is also called sialin, H(+)/nitrate cotransporter, H(+)/sialic acid cotransporter (AST), membrane glycoprotein HP59, or vesicular H(+)/aspartate-glutamate cotransporter. It transports glucuronic acid and free sialic acid out of the lysosome after its cleavage from sialoglycoconjugates, which is required for normal CNS myelination. It also mediates the membrane potential-dependent uptake of aspartate and glutamate into synaptic vesicles and synaptic-like microvesicles. In the plasma membrane, it functions as a nitrate transporter. Recessive mutations in the SLC17A5 gene cause the allelic disorders, Infantile sialic acid storage disease (ISSD) and Salla disease (a predominantly neurological disorder). SLC17A5 belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 397
34910 340940 cd17382 MFS_SLC17A6_7_8_VGluT Solute carrier family 17 members 6, 7, and 8 (also called Vesicular glutamate transporters) of the Major Facilitator Superfamily of transporters. This subfamily is composed of solute carrier family 17 member 6 (SLC17A6), SLC17A7, SLC17A8, and similar proteins. SLC17A6 is also called vesicular glutamate transporter 2 (VGluT2), differentiation-associated BNPI, or differentiation-associated Na(+)-dependent inorganic phosphate cotransporter. SLC17A7 is also called VGluT1 or brain-specific Na(+)-dependent inorganic phosphate cotransporter. SLC17A8 is also called VGluT3. They mediate the uptake of glutamate into synaptic vesicles at presynaptic nerve terminals of excitatory neural cells, and may also mediate the transport of inorganic phosphate. VGluTs are also expressed and localized in various secretory vesicles in non-neuronal peripheral organelles such as hormone-containing secretory granules in endocrine cells, and thus, also act as metabolic regulators. The VGluT subfamily belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 380
34911 340941 cd17383 MFS_SLC18A3_VAChT Vesicular acetylcholine transporter (VAChT) and similar transporters of the Major Facilitator Superfamily. Vesicular acetylcholine transporter (VAChT) is also called solute carrier family 18 member 3 (SLC18A3) in vertebrates and uncoordinated protein 17 (unc-17) in Caenorhabditis elegans. It is a glycoprotein involved in acetylcholine transport into synaptic vesicles and is responsible for the accumulation of acetylcholine into pre-synaptic vesicles of cholinergic neurons. Variants in SLC18A3 are associated with congenital myasthenic syndrome in humans. VAChT belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 378
34912 340942 cd17384 MFS_SLC18A1_2_VAT1_2 Vesicular amine transporters 1 (VAT1) and 2 (VAT2), and similar transporters of the Major Facilitator Superfamily. Vesicular amine transporter 1 (VAT1 or VMAT1) is also called solute carrier family 18 member 1 (SLC18A1) or chromaffin granule amine transporter, while VAT2 (or VMAT2) is also called SLC18A2, synaptic vesicular amine transporter, or monoamine transporter. VATs (or VMATs) are responsible for the uptake of cytosolic monoamines into synaptic vesicles in monoaminergic neurons. VAT1 and VAT2 have distinct pharmacological properties and tissue distributions. VAT1 is preferentially expressed in neuroendocrine cells and endocrine cells, where it transports biogenic monoamines, such as serotonin, from the cytoplasm into the secretory vesicles. VAT2 is primarily expressed in the CNS and is involved in the ATP-dependent vesicular transport of biogenic amine neurotransmitters including dopamine, norepinephrine, serotonin, and histamine into synaptic vesicles. VATs belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 373
34913 340943 cd17385 MFS_SLC18B1 Solute carrier family 18 member B1 of the Major Facilitator Superfamily of transporters. Solute carrier family 18 member B1 (SLC18B1) is the fourth member of the SLC18 transporter family, which includes vesicular monoamine transporters and vesicular acetylcholine transporter. It is predominantly expressed in the hippocampus and is associated with vesicles in astrocytes. It actively transports spermine and spermidine by exchange of H(+), and has been suggested to be a vesicular polyamine transporter (VPAT). SLC18B1 belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 390
34914 340944 cd17386 MFS_SLC46 Solute carrier 46 (SLC46) family of the Major Facilitator Superfamily of transporters. The solute carrier 46 (SLC46) family is composed of three vertebrate members (SLC46A1, SLC46A2, and SLC46A3) and similar proteins from insects and nematodes. The best-studied member is SLC46A1, also called proton-coupled folate transporter (PCFT), which functions both as an intestinal proton-coupled high-affinity folate transporter involved in the absorption of folates and as an intestinal heme transporter which mediates heme uptake. SLC46A2, also called thymic stromal cotransporter protein (TSCOT), is a putative 12-transmembrane protein mainly expressed in the thymic cortex in a specific thymic epithelial cell (TEC) subpopulation. SLC46A3 is a lysosomal membrane protein that functions as a direct transporter of noncleavable antibody maytansine-based catabolites from the lysosome to the cytoplasm. The SLC46 family belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 360
34915 340945 cd17387 MFS_MFSD14 Major facilitator superfamily domain-containing 14A and 14B. This subfamily is composed of major facilitator superfamily domain-containing 14A (MFSD14A) and MFSD14B, and similar proteins. MFSD14A and MFSD14B are also called hippocampus abundant transcript 1 protein (HIAT1) and hippocampus abundant transcript-like protein 1 (HIATL1), respectively. They are both ubiquitously expressed with HIAT1 highly expressed in testis and HIATL1 most abundantly expressed in skeletal muscle. Gene disruption of MFSD14A causes globozoospermia and infertility in male mice. It has been suggested that MFSD14A may transport a solute from the bloodstream that is required for spermiogenesis. The function of MFSD14B is unknown. The MFSD14 subfamily belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 410
34916 340946 cd17388 MFS_TetA Tetracycline resistance protein TetA and related proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of tetracycline resistance proteins similar to Escherichia coli TetA(A), TetA(B), and TetA(E), which are metal-tetracycline/H(+) antiporters that confer resistance to tetracycline by an active tetracycline efflux, which is an energy-dependent process that decreases the accumulation of the antibiotic in cells. TetA-like tetracycline resistance proteins belong to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 385
34917 340947 cd17389 MFS_MFSD10 Major facilitator superfamily domain-containing protein 10. Major facilitator superfamily domain-containing protein 10 (MFSD10) is also called tetracycline transporter-like protein (TETRAN). It is expressed in various human tissues, including the kidney. In cultured cells, its overexpression facilitated the uptake of organic anions such as some non-steroidal anti-inflammatory drugs (NSAIDs). MFSD10/TETRAN overexpression causes resistance to some NSAIDs, suggesting that it may be an organic anion transporter that serves as an efflux pump for some NSAIDs and various other organic anions at the final excretion step. MFSD10 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 391
34918 340948 cd17390 MFS_MFSD9 Major facilitator superfamily domain-containing protein 9. Major facilitator superfamily domain-containing protein 9 (MFSD9) is expressed in the central nervous system (CNS) and in most peripheral tissues but at very low expression levels. The function of MFSD9 is unknown. MFSD9 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 350
34919 340949 cd17391 MFS_MdtG_MDR_like Multidrug resistance protein MdtG and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli multidrug resistance protein MdtG, Streptococcus pneumoniae multidrug resistance efflux pump PmrA, and similar multidrug resistance (MDR) transporters from bacteria. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. MdtG confers resistance to fosfomycin and deoxycholate. PmrA serves as an efflux pump for various substrates and is associated with fluoroquinolone resistance. MdtG-like MDR transporters belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 380
34920 340950 cd17392 MFS_MFSD2 Major facilitator superfamily domain-containing protein 2 subfamily. The major facilitator superfamily domain-containing protein 2 (MFSD2) subfamily is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is more commonly called sodium-dependent lysophosphatidylcholine symporter 1 (NLS1). It is an LPC symporter that plays an essential role in blood-brain barrier formation and function. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MFSD2 subfamily belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 446
34921 340951 cd17393 MFS_MosC_like Membrane protein MosC and similar proteins of the Major Facilitator Superfamily of transporters. The gene encoding Sinorhizobium meliloti membrane protein MosC is part of the mos locus, which encodes the biosynthesis of the rhizopine 3-O-methyl-scyllo-inosamine. MosC belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 373
34922 340952 cd17394 MFS_FucP_like Fucose permease and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of L-fucose permease (also called L-fucose-proton symporter) and similar proteins such as glucose/galactose transporter and N-acetyl glucosamine transporter NagP. L-fucose permease facilitates the uptake of L-fucose across the boundary membrane with the concomitant transport of protons into the cell; it can also transport L-galactose and D-arabinose. Glucose/galactose transporter functions in the uptake of glucose and galactose. The FucP-like subfamily belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 401
34923 340953 cd17395 MFS_MFSD4 Major facilitator superfamily domain-containing protein 4. The Major facilitator superfamily domain-containing protein 4 (MFSD4) subfamily consists of two vertebrate members: MFSD4A and MFSD4B. The function of MFSD4A is unknown. MFSD4B is more commonly known as sodium-dependent glucose transporter 1 (NaGLT1), a primary fructose transporter in rat renal brush-border membranes that also facilitates sodium-independent urea uptake. The MFSD4 subfamily belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 367
34924 340954 cd17396 MFS_YdiM_like Inner membrane transport protein YdiM and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily contains Escherichia coli inner membrane transport proteins YdiM and YdiN, which belong to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 384
34925 340955 cd17397 MFS_DIRC2 Disrupted in renal carcinoma protein 2 of the Major Facilitator Superfamily of transporters. Disrupted in renal carcinoma protein 2 or disrupted in renal cancer protein 2 (DIRC2), encoded by the SLC49A4 gene, was initially identified as a breakpoint-spanning gene in a chromosomal translocation associated with the development of renal cancer. It is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. DIRC2 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 381
34926 340956 cd17398 MFS_FLVCR_like Feline leukemia virus subgroup C receptor subfamily of the Major Facilitator Superfamily of transporters. The Feline leukemia virus subgroup C receptor (FLVCR) subfamily is conserved in metazoans and is composed of two vertebrate members, FLVCR1 and FLVCR2. FLVCR1 is a heme transporter and it has two isoforms: 1 (or FLVCR1a), which exports cytoplasmic heme as well as coproporphyrin and protoporphyrin IX; and 2 (FLVCR1b), which promotes heme efflux from the mitochondrion to the cytoplasm. FLVCR2 functions as a heme importer as well as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. The FLVCR subfamily belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 406
34927 340957 cd17399 MFS_MFSD7 Major facilitator superfamily domain-containing protein 7. Major facilitator superfamily domain-containing protein 7 (MFSD7) is also called myosin light polypeptide 5 regulatory protein (MYL5). Its function is unknown. It is encoded by the SLC49A3 gene and is a member of the Solute carrier 49 (SLC49) family, which also includes feline leukemia virus subgroup C receptor 1 (FLVCR1, SLC49A1), FLVCR2 (SLC49A2), as well as disrupted in renal carcinoma protein 2 (DIRC2, SLC49A4). FLVCR1 and FLVCR2 are heme transporters. DIRC2 is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. MFSD7 belongs to the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 419
34928 340958 cd17400 MFS_SLCO1_OATP1 Solute carrier organic anion transporter 1 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1 (SLCO1) or Organic anion transporting polypeptide 1 (OATP1) family contains three subfamilies: OATP1A, OATP1B, and OATP1C. OATP1A contains one human member, OATP1A2, which shows a broad spectrum of substrates including endogenous compounds (such as bile acids, steroid hormones and their conjugates, thyroid hormones) and various drugs (such as fexofenadine, ouabain and the cyanobacterial toxin microcystin). OATP1B contains two human proteins, OATP1B1 and OATP1B3, which can both accept a wide variety of structurally-unrelated compounds as substrates including clinically-important drugs such as hydroxymethylglutaryl (HMG)-CoA reductase inhibitors (statins), angiotensin II receptor blockers (sartans), angiotensin converting enzyme (ACE) inhibitors, and anti-diabetes drugs (glinides). OATP1C contains one mammalian member, OATP1C1, which is also called thyroxine transporter. It mediates the high affinity transport of the thyroid hormones, T4 (3,5,3',5'tetraiodo-L-thyronine or thyroxine), rT3 (3,3'5'-triiodo-L-thyronine), and T3 (3,5,3'tri-iodo-L-thyronine or triiodothyronine). The SLCO1/OATP1 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 436
34929 340959 cd17401 MFS_SLCO2_OATP2 Solute carrier organic anion transporter 2 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2 (SLCO2) or Organic anion transporting polypeptide 2 (OATP2) family contains two subfamilies: OATP2A and OATP2B, each containing one mammalian member, OATP2A1 and OATP2B1, respectively. OATP2A1 (encoded by SLCO2A1) is a lactate/prostaglandin anion exchanger that mediates the release of newly synthesized prostaglandins (PGD2, PGE1, PGE2, PGF2A and PGI2) from cells, the transepithelial transport of prostaglandins, and the clearance of prostaglandins from the circulation. OATP2B1 (encoded by SLCO2B1) mediates the Na(+)-independent transport of various organic anions such as taurocholate, the prostaglandins PGD2, PGE1, PGE2, leukotriene C4, thromboxane B2 and iloprost, as well as endogenous sex steroid conjugates such as dehydroepiandrosterone sulfate (DHEAS). The SLCO2/OATP2 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 440
34930 340960 cd17402 MFS_SLCO3_OATP3 Solute carrier organic anion transporter 3 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 3 (SLCO3) or Organic anion transporting polypeptide 3 (OATP3) family contains only one subfamily, OATP3A, which contains only one mammalian member OATP3A1 (encoded by SLCO3A1). It mediates the Na(+)-independent transport of organic anions such as estrone-3-sulfate, prostaglandins (PG) E1 and E2, thyroxine (T4), deltorphin II, BQ-123, and vasopressin. SLCO3A1 has been identified as a Crohn's disease (CD)-associated gene, which mediates inflammatory processes in intestinal epithelial cells through NF-kappaB transcription activation, resulting in a higher incidence of bowel perforation in CD patients. The SLCO3/OATP3 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 444
34931 340961 cd17403 MFS_SLCO4_OATP4 Solute carrier organic anion transporter 4 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4 (SLCO4) or Organic anion transporting polypeptide 4 (OATP4) family contains two subfamilies: OATP4A and OATP4C, each containing one mammalian member, OATP4A1 and OATP4C1, respectively. OATP4A1 (encoded by SLCO4A1) is ubiquitously expressed and mediates the Na(+)-independent transport of the thyroid hormones T3 (triiodo-L-thyronine), T4 (thyroxine) and rT3, and other organic anions such as estrone sulfate and taurocholate. OATP4C1 (encoded by SLCO4C1) is capable of transporting pharmacological substances such as digoxin, ouabain, thyroxine, methotrexate, cAMP, and uremic toxins, which accumulate in patients with chronic kidney diseases (CKDs). The SLCO4/OATP4 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 420
34932 340962 cd17404 MFS_SLCO5_OATP5 Solute carrier organic anion transporter 5 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 5 (SLCO5) or Organic anion transporting polypeptide 5 (OATP5) family contains only one subfamily, OATP5A, which contains only one mammalian member OATP5A1 (encoded by SLCO5A1). Deletion of the SLCO5A1 gene has been implicated in the pathogenesis of Mesomelia-synostoses syndrome (MSS), a rare autosomal-dominant disorder characterized by mesomelic limb shortening, acral synostoses, and multiple congenital malformations. OATP5A1 may be a non-classical OATP which is involved in biological processes that require the reorganization of the cell shape, such as differentiation and migration. It seems to affect intracellular transport of drugs and may participate in chemoresistance of small cell lung cancer (SCLC) by sequestration, rather than mediating cellular uptake. The SLCO5/OATP5 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 425
34933 340963 cd17405 MFS_SLCO6_OATP6 Solute carrier organic anion transporter 6 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 6 (SLCO6) or Organic anion transporting polypeptide 6 (OATP6) family contains only one subfamily, OATP6A, which contains only one human member OATP6A1 (encoded by SLCO6A1). The OATP6 family is the most diverged of the OATPs. OATP6A1 is also called cancer/testis antigen 48 (CT48) or gonad-specific transporter. It is strongly expressed only in normal testis, and weakly in spleen, brain, fetal brain, and placenta. It is found in tumor samples (lung, bladder, and esophageal) and cancer cell lines (lung), and may be of potential use as a target for therapy for a variety of tumor types. The SLCO6/OATP6 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 428
34934 340964 cd17406 MFS_unc93A_like Protein unc-93 homolog A and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Caenorhabditis elegans Uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93), human protein unc-93 homolog A (HmUnc-93A or UNC93A), and similar proteins. Unc-93 acts as a regulatory subunit of a multi-subunit potassium channel complex that may function in coordinating muscle contraction in C. elegans. The human UNC93A gene is located in a region of the genome that is frequently associated with ovarian cancer; however, there is no evidence that UNC93A has a tumor suppressor function. This unc93A-like subfamily belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 390
34935 340965 cd17407 MFS_MFSD11 UNC93-like Major facilitator superfamily domain-containing protein 11. This group is composed of UNC93-like protein MFSD11 (also called major facilitator superfamily domain-containing protein 11 or protein ET) and similar proteins, most of which are uncharacterized. MFSD11 is ubiquitously expressed in the periphery and the central nervous system of mice, where it is expressed in excitatory and inhibitory mouse brain neurons. Its expression is affected by altered energy homeostasis, suggesting plausible involvement in the energy regulation. MFSD11 belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 387
34936 340966 cd17408 MFS_unc93B1 Protein unc-93 homolog B1 of the Major Facilitator Superfamily of transporters. Protein unc-93 homolog B1 (UNC93B1) controls intracellular trafficking and transport of a subset of Toll-like receptors (TLRs), including TLR3, TLR7 and TLR9, from the endoplasmic reticulum to endolysosomes where they can engage pathogen nucleotides and activate signaling cascades. It regulates differential transport of TLR7 and TLR9 into signaling endosomes to prevent autoimmunity. UNC93B1 belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 456
34937 340967 cd17409 MFS_NIMT_like 2-nitroimidazole transporter and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli 2-nitroimidazole transporter (NIMT), also called YeaN, and similar proteins. NIMT confers resistance to 2-nitroimidazole, the antibacterial and antifungal antibiotic, by mediating the active efflux of this compound. The NIMT-like subfamily belongs to the 2-nitroimidazole and cyanate transporters like (NIMT/CynX-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
34938 340968 cd17410 MFS_CynX_like Cyanate transport protein CynX and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli cyanate transport protein CynX and similar proteins. CynX is part of an active transport system that transports exogenous cyanate into E. coli cells. The gene encoding CynX is part of the cyn operon that also includes cynS, encoding cynase, which catalyzes the reaction of cyanate with bicarbonate to give ammonia and carbon dioxide, and cynT, which encodes a carbonic anhydrase. The CynX-like subfamily belongs to the 2-nitroimidazole and cyanate transporters like (NIMT/CynX-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 372
34939 340969 cd17411 MFS_SLC15A2 Solute carrier family 15 member 2 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 2 (SLC15A2), also called peptide transporter 2 (PepT2), is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. SLC15A2, as well as SLC15A1, mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A2 is a high-affinity transporter and is abundantly expressed in the apical membrane of kidney proximal tubules and choroid plexus epithelial cells. It is the major transporter involved in the reclamation of peptide-bound amino acids and peptide-like drugs in the kidney, and is also called the renal isoform. In choroid plexus and the brain, it acts as an efflux transporter and plays a role in regulating peptide/neuropeptide homeostasis. SLC15A2/PepT2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 403
34940 340970 cd17412 MFS_SLC15A1 Solute Carrier family 15 member 1 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 1 (SLC15A1), also called peptide transporter 1 (PepT1), is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. SLC15A1, as well as SLC15A2, mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A1 is primarily expressed in the brush border membranes of enterocytes of the small intestine and is also known as the intestinal isoform. It is a high-capacity/low-affinity transporter that drives the transport of di- and tripeptides for metabolic purposes. Its expression is upregulated in the colon during chronic inflammation associated with inflammatory bowel disease. SLC15A1/PepT1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 415
34941 340971 cd17413 MFS_NPF6 NRT1/PTR family (NPF), subfamily 6 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter 1/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF6 includes the first identified member of the NRT1/PTR family: Arabidopsis thaliana NRT1.1, now called AtNPF6.3. It is a dual affinity nitrate influx transporter and a nitrate sensor. It also transports auxin and has nitrate efflux activity. NPF6 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 457
34942 340972 cd17414 MFS_NPF4 NRT1/PTR family (NPF), subfamily 4 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. Members of the NPF4 subfamily have been shown to transport ABA. NPF4 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 456
34943 340973 cd17415 MFS_NPF3 NRT1/PTR family (NPF), subfamily 3 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF3 is the smallest NPF subfamily and it includes Cucumis sativus nitrite transporter (CsNitr1), now named CsNPF3.2. It functions as a chloroplast nitrite uptake transporter to remove toxic nitrite from the cytosol. NPF3 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 448
34944 340974 cd17416 MFS_NPF1_2 NRT1/PTR family (NPF), subfamily 1 and 2 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF1 includes Medicago truncatula LATD/NIP, now named MtNPF1.7, which is a high-affinity nitrate transporter and is involved in nodulation and root architecture. NPF2 members are well-established nitrate and glucosinolate transporters, including Arabidopsis nitrate influx and efflux transporters with varied tissue and developmental specificity. Examples are AtNPF2.7, which is expressed in the cortex of mature roots, and AtNPF2.9, which is expressed in root companion cells where it is involved in phloem loading. NPF1/2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 444
34945 340975 cd17417 MFS_NPF5 NRT1/PTR family (NPF), subfamily 5 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF5 includes Arabidopsis thaliana PTR3 (AtPTR3, now named AtNPF5.2), which is a wound-induced peptide transporter that is necessary for defense against virulent bacterial pathogens. NPF5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 452
34946 340976 cd17418 MFS_NPF8 NRT1/PTR family (NPF), subfamily 8 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF8 contains the Arabidopsis dipeptide transporters AtNPF8.1 (PTR1), AtNPF8.2 (PTR5), and AtNPF8.3 (PTR2), as well as tonoplast-localized transporters AtNPF8.4 (PTR4) and AtNPF8.5 (PTR6). Oryza sativa NRT1 (now called OsNPF8.9) is a low-affinity nitrate transporter. NPF8 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 447
34947 340977 cd17419 MFS_NPF7 NRT1/PTR family (NPF), subfamily 7 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF7 includes the nitrate transporters AtNPF7.2 and AtNPF7.3, as well as the dipeptide transporter OsNPF7.3. AtNPF7.3 is a bidirectional transporter involved in nitrate influx and efflux. NPF7 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 447
34948 340978 cd17420 MFS_MCT8_10 Monocarboxylate transporters 8 and 10, and similar proteins of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 8 (MCT8) and 10 (MCT10) are transporters which stimulate the cellular uptake of thyroid hormones such as thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). MCT has a preference for T3 and is also a sodium-independent transporter that mediates the uptake or efflux of aromatic acids such as Phe, Tyr, and Trp, as well as L-3,4-di-hydroxy-phenylalanine. MCT8/10 and similar proteins belong to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 400
34949 340979 cd17421 MFS_MCT5 Monocarboxylate transporter 5 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 5 (MCT5) is also called Solute carrier family 16 member 4 (SLC16A4). It is an orphan transporter expressed in the brain, muscle, liver, kidney, lung, ovary, placenta, and heart. It is a member of the monocarboxylate transporter (MCT) family, whose members include MCT1-4, which are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT5 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 369
34950 340980 cd17422 MFS_MCT7 Monocarboxylate transporter 7 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 7 (MCT7) is also called Solute carrier family 16 member 6 (SLC16A6). Zebrafish MCT7 is required for hepatocyte secretion of ketone bodies during fasting; it has been shown to be a selective transporter of the major ketone body beta-hydroxybutyrate, whose abundance is increased during fasting. MCT7 is expressed in the brain, pancreas, muscle, and prostate. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 363
34951 340981 cd17423 MFS_MCT11_13 Monocarboxylate transporters 11 and 13 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 11 (MCT11) and 13 (MCT13) are also called Solute carrier family 16 members 11 (SLC16A11) and 13 (SLC16A13), respectively. They are orphan transporters whose substrates are yet to be determined. MCT11 is expressed in skin, lung, ovary, breast, pancreas, retinal pigment epithelium, and choroid plexus. Genetic variants in SLC16A11, the gene encoding MCT11, are associated with type 2 diabetes in Mexican and other Latin American populations. MCT13 is expressed in breast and bone marrow stem cells. MCT11/13 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 383
34952 340982 cd17424 MFS_MCT12 Monocarboxylate transporter 12 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 12 (MCT12) is also called Solute carrier family 16 member 12 (SLC16A12). It is a creatine transporter encoded by the cataract and glucosuria associated gene SLC16A12. A heterozygous mutation of the gene causes a syndrome with juvenile cataracts, microcornea, and glucosuria. MCT12 may function in a basolateral exit pathway for creatine in the proximal tubule. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 363
34953 340983 cd17425 MFS_MCT6 Monocarboxylate transporter 6 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 6 (MCT6) is also called Solute carrier family 16 member 5 (SLC16A5). MCT6 has been shown to transport bumetanide, nateglinide, probenecid, and prostaglandin F2a, but not L-lactic acid, in a pH- and membrane potential-dependent manner. It may be involved in the disposition and absorption of various drugs. MCT6 is expressed in the kidney, muscle, brain, heart, pancreas, prostate, lung, and placenta. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 364
34954 340984 cd17426 MFS_MCT1 Monocarboxylate transporter 1 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 1 (MCT1) is also called Solute carrier family 16 member 1 (SLC16A1). It is a proton-coupled transporter that facilitates the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. It is widely expressed in many tissues; its main function is to transport lactate into the cell. MCT1 deficiency has been identified as a cause of profound ketoacidosis, a potentially lethal condition caused by the imbalance between hepatic production and extrahepatic utilization of ketone bodies. This suggests that MCT1-mediated ketone-body transport is crucial in maintaining acid-base balance. MCT1 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
34955 340985 cd17427 MFS_MCT2 Monocarboxylate transporter 2 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 2 (MCT2) is also called Solute carrier family 16 member 7 (SLC16A7). It is a proton-coupled transporter that facilitates the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. It transports pyruvate and lactate outside and inside of sperm and plays roles in the regulation of spermatogenesis. Genetic variation in MCT2 has functional and clinical relevance with male infertility. MCT2 is consistently overexpressed in prostate cancer (PCa) cells and its location at peroxisomes is associated with malignant transformation. MCT2 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 367
34956 340986 cd17428 MFS_MCT9 Monocarboxylate transporter 9 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 9 (MCT9) is also called Solute carrier family 16 member 9 (SLC16A9). It is an orphan transporter that is expressed in a number of tissues including intestine and kidney. A missense variant of MCT9 (K258T) is associated with significant increase in susceptibility to renal overload (ROL) gout with intestinal urate underexcretion. This suggests that MCT9 may have a role in intestinal urate excretion; it is possible that it transports urate. MCT9 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 361
34957 340987 cd17429 MFS_MCT14 Monocarboxylate transporter 14 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 14 (MCT14) is also called Solute carrier family 16 member 14 (SLC16A14). It is an orphan transporter expressed in the brain, heart, muscle, ovary, prostate, breast, lung, pancreas, liver, spleen, and thymus. It may function as a neuronal aromatic-amino-acid transporter. MCT14 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 361
34958 340988 cd17430 MFS_MCT3_4 Monocarboxylate transporters 3 and 4, and similar proteins of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 3 (MCT3) and 4 (MCT4) are also called Solute carrier family 16 members 8 (SLC16A8) and 3 (SLC16A3), respectively. They are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT3 is preferentially expressed in the basolateral membrane of the retinal pigment epithelium and plays a role in pH and ion homeostasis of the outer retina by facilitating the transport of lactate and H(+) out of the retina. Mice deficient in MCT3 display altered visual function. MCT4 is highly expressed in tissues dependent on glycolysis, and it plays an important role in lactate efflux from cells. MCT4 is expressed in neurons and astrocytes; it has been found to play a role in neuroprotective mechanism of ischemic preconditioning in animals (in the gerbil) with transient cerebral ischemia. Increased MCT4 expression has also been correlated with worse prognosis across many cancer types. MCT3/4 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 368
34959 340989 cd17431 MFS_GLUT_Class1 Class 1 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUTs 1-4 are well-established as glucose and/or fructose transporters in various tissues and cell types. GLUT1, also called solute carrier family 2, facilitated glucose transporter member 1 (SLC2A1), displays broad substrate specificity and can transport a wide range of pentoses and hexoses including glucose, galactose, mannose, and glucosamine. It is found in the brain, erythrocytes, and in many fetal tissues. GLUT2 (or SLC2A2) is found in the liver, islet of Langerhans, intestine, and kidney, and is the isoform that likely mediates the bidirectional transfer of glucose across the plasma membrane of hepatocytes and is responsible for uptake of glucose by beta cells. GLUT3 (or SLC2A3) is found in the brain and can mediate the uptake of glucose, 2-deoxyglucose, galactose, mannose, xylose and fucose, and dehydroascorbate. GLUT4 (or SLC2A4) is an insulin-regulated facilitative glucose transporter found in adipose tissues, and in skeletal and cardiac muscle. GLUT14 (or SLC2A14) is an orphan transporter expressed mainly in the testis. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 445
34960 340990 cd17432 MFS_GLUT_Class2 Class 2 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT5, also called Solute carrier family 2, facilitated glucose transporter member 5 (SLC2A5), is a well-established fructose transporter found in the small intestine. GLUT7 (or SLC2A7) is a high-affinity glucose and fructose transporter expressed in the small intestine and colon. GLUT9 (or SLC2A9) transports urate and fructose, and is most strongly expressed in the basolateral membranes of proximal renal tubular cells, liver and placenta. It may play a role in urate reabsorption by proximal tubules. GLUT11 (or SLC2A11) is a facilitative glucose transporter expressed in heart and skeletal muscle. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 452
34961 340991 cd17433 MFS_GLUT8_Class3 Glucose transporter type 8, a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 8 (GLUT8) is also called Solute carrier family 2, facilitated glucose transporter member 8 (SLC2A8) or glucose transporter type X1 (GLUTX1). It is classified as a Class 3 GLUT protein and is an insulin-regulated facilitative glucose transporter predominantly expressed in testis and brain. It can also transport fructose and galactose. SLC2A8 knockout mice were viable, developed normally, and displayed only a very mild phenotype, including mild alterations in the brain (increased proliferation of hippocampal neurons), heart (impaired transmission of electrical wave through the atrium), and sperm cells (reduced number of motile sperm cells). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 416
34962 340992 cd17434 MFS_GLUT6_Class3 Glucose transporter type 6, a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 6 (GLUT6) is also called Solute carrier family 2, facilitated glucose transporter member 6 (SLC2A6). It is classified as a Class 3 GLUT protein, and is a facilitative glucose transporter that binds cytochalasin B with low affinity. It is found in the brain, spleen, and leucocytes. GLUT6 may function in oxalate secretion. SLC2A6 has been identified as an oxalate nephrolithiasis gene in mice; its deletion causes spontaneous calcium oxalate nephrolithiasis in the setting of hyperoxalaemia, hyperoxaluria, and nephrocalcinosis. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 417
34963 340993 cd17435 MFS_GLUT12_Class3 Glucose transporter type 12 (GLUT12), a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 12 (GLUT12) is also called Solute carrier family 2, facilitated glucose transporter member 12 (SLC2A12). It is a facilitative glucose transporter, classified as a Class 3 GLUT, and is expressed in the heart, skeletal muscle, prostate, and small intestine, and is highly upregulated in breast ductal cell carcinoma. It plays a role as a secondary insulin-sensitive glucose transporter in insulin-dependent tissues. The GLUT12 subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 376
34964 340994 cd17436 MFS_GLUT10_Class3 Glucose transporter type 10 (GLUT10), a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 10 (GLUT10) is also called Solute carrier family 2, facilitated glucose transporter member 10 (SLC2A10). It is classified as a Class 3 GLUT and is a facilitative glucose transporter that exhibits a wide tissue distribution. It is expressed in pancreas, placenta, heart, lung, liver, brain, fat, muscle, and kidney. GLUT10 facilitates the transport of dehydroascorbic acid (DHA), the oxidized form of vitamin C, into mitochondria, and also increases cellular uptake of DHA, which in turn protects cells against oxidative stress. Loss-of-function mutations in SLC2A10 cause arterial tortuosity syndrome (ATS), an autosomal recessive connective tissue disorder characterized by twisting and lengthening of the major arteries, hypermobility of the joints, and laxity of skin. The GLUT10 subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 376
34965 340995 cd17437 MFS_PLT Plant Polyol transporter family of the Major Facilitator Superfamily of transporters. The plant Polyol transporter (PLT) subfamily includes PLT1-6 from Arabidopsis thaliana and similar transporters. The best characterized member of the group is Polyol transporter 5, also called Sugar-proton symporter PLT5, which mediates the H+-symport of numerous substrates including linear polyols (such as sorbitol, xylitol, erythritol or glycerol), cyclic polyol myo-inositol, and different hexoses, pentoses (including ribose), tetroses, and sugar alcohols. It functions to transport a wide range of substrates into specific sink tissues in the plant. The PLT subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 387
34966 340996 cd17438 MFS_SV2B Synaptic vesicle glycoprotein 2B of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. SV2B is a key modulator of amyloid toxicity at the synaptic site and also has an essential role in the formation and maintenance of the glomerular capillary wall. SV2B belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 477
34967 340997 cd17439 MFS_SV2A Synaptic vesicle glycoprotein 2A of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. It is unclear how SV2A is involved in correct SV function, but it has been suggested to either act as a transporter or a regulator of exocytosis by mediating Ca2+ dynamics. SV2A has been identified as the molecular target of the antiepileptic drug levetiracetam (LEV). Its expression is decreased in patients with epilepsy and in epileptic animal models. SV2A belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 478
34968 340998 cd17440 MFS_SV2C Synaptic vesicle glycoprotein 2C of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. SV2C exhibits enriched expression in several basal ganglia nuclei, and has been found to be involved in normal operation of the basal ganglia network and could also be involved in system adaptation in basal ganglia pathological conditions. SV2C belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 479
34969 340999 cd17441 MFS_SVOP Synaptic vesicle 2-related protein (SVOP) of the Major Facilitator Superfamily. Synaptic vesicle 2 (SV2)-related protein (SVOP) is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. Like SV2, SVOP is expressed in all brain regions, with highest levels in cerebellum, hindbrain and pineal gland. Studies with knockout mice suggest that SVOP may perform a subtle function that is not necessary for survival under normal conditions, since mice lacking SVOP are viable, fertile, and phenotypically normal. SVOP shares structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. This SVOP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 372
34970 341000 cd17442 MFS_SVOPL Synaptic vesicle 2 (SV2)-related protein-like (SVOPL) of the Major Facilitator Superfamily. Synaptic vesicle 2 (SV2)-related protein-like (SVOPL) or SVOP-like protein is a transporter-like protein that shares structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. It belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 375
34971 341001 cd17443 MFS_SLC22A31 Solute carrier family 22, member 31 of the Major Facilitator Superfamily. Solute carrier family 22, member 31 (SLC22A31) is an uncharacterized member of the SLC22 family of transporters, which includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A31 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 343
34972 341002 cd17444 MFS_SLC22A23 Solute carrier family 22, member 23 of the Major Facilitator Superfamily. Solute carrier family 22, member 23 (SLC22A23) is an orphan member of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). It is abundantly expressed in brain and is also found in liver. Single-nucleotide polymorphisms in SLC22A23 are associated with inflammatory bowel disease (IBD) in a Canadian white population. SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A23 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 364
34973 341003 cd17445 MFS_SLC22A17 Solute carrier family 22, member 17 of the Major Facilitator Superfamily. Solute carrier family 22, member 17 (SLC22A17) is also called 24p3 receptor (24p3R), lipocalin-2 receptor, or neutrophil gelatinase-associated lipocalin (NGAL) receptor (NGALR). It functions as a cell surface receptor for lipocalin-2 (LCN2), also called NGAL or 24p3, which plays a key role in iron homeostasis and transport. LCN2 is a secreted protein of the lipocalin family that induces apoptosis in some types of cells and inhibits bacterial growth by sequestration of the iron-laden bacterial siderophore. Over-expressions of NGAL and NGALR have been found to be correlated with unfavorable clinicopathologic features and poor prognosis of patients with hepatocellular carcinoma. SLC22A17 is a member of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 346
34974 341004 cd17446 MFS_SLC22A6_OAT1_like Solute carrier family 22 member 6 (also called Organic anion transporter 1) and similar transporters of the Major Facilitator Superfamily. This subfamily includes solute carrier family 22 member 6 (SLC22A6, also called organic anion transporter 1 or OAT1 or para-aminohippurate (PAH) transporter), SLC22A8 (or OAT3), and SLC22A20 (or OAT6). OAT1 and OAT3 are involved in the renal elimination of endogenous and exogenous organic anions (OAs). They function as OA exchangers, coupling the uptake of OAs against an electrochemical gradient with the efflux of intracellular dicarboxylates. SLC22A20 is an OA transporter that mediates the uptake of estrone sulfate. The OAT1-like subfamily belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 339
34975 341005 cd17447 MFS_SLC22A7_OAT2 Solute carrier family 22 member 7 (also called Organic anion transporter 2) of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 7 (SLC22A7), also called organic anion transporter 2 (OAT2), mediates sodium-independent transport of a variety of organic anions including prostaglandin E2, prostaglandin F2, tetracycline, bumetanide, estrone sulfate, glutarate, dehydroepiandrosterone sulfate, allopurinol, 5-fluorouracil, paclitaxel, L-ascorbic acid, salicylate, methotrexate, and alpha-ketoglutarate. It also plays a role in renal uric acid uptake from blood as a first step of tubular secretion. OAT2 belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 341
34976 341006 cd17448 MFS_SLC46A3 Solute carrier family 46 member 3 of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 3 (SLC46A3) is a lysosomal membrane protein that functions as a direct transporter of noncleavable antibody maytansine-based catabolites from the lysosome to the cytoplasm. SLC46A3 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 396
34977 341007 cd17449 MFS_SLC46A1_PCFT Solute carrier family 46 member 1, also called Proton-coupled folate transporter, of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 1 (SLC46A1) is also called proton-coupled folate transporter (PCFT), G21, or heme carrier protein 1 (HCP1). It functions in two ways: as an intestinal proton-coupled high-affinity folate transporter that facilitates the absorption of folates across the brush-border membrane of the small intestine; and as an intestinal heme transporter which mediates heme uptake from the gut lumen into duodenal epithelial cells. It displays a higher affinity for folate than heme. It is also expressed in the choroid plexus and is required for transport of folates into the cerebrospinal fluid. Loss-of-function mutations in the SLC46A1 gene result in the autosomal recessive disorder "hereditary folate malabsorption" (HFM), characterized by severe systemic and cerebral folate deficiency. SLC46A1 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 425
34978 341008 cd17450 MFS_SLC46A2_TSCOT Solute carrier family 46 member 2, also called Thymic stromal cotransporter protein, of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 2 (SLC46A2) is also called thymic stromal cotransporter protein (TSCOT). It is a putative 12-transmembrane protein mainly expressed in the thymic cortex in a specific thymic epithelial cell (TEC) subpopulation. Polymorphisms in TSCOT are linked to cervical cancer in affected sib-pairs with high mean age at diagnosis. TSCOT belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 383
34979 341009 cd17451 MFS_NLS1_MFSD2A Sodium-dependent lysophosphatidylcholine symporter 1 of the Major Facilitator Superfamily of transporters. Sodium-dependent lysophosphatidylcholine (LPC) symporter 1 (NLS1) is also called major facilitator superfamily domain-containing protein 2A (MFSD2A). NLS1/MFSD2A is an LPC symporter that plays an essential role for blood-brain barrier formation and function. It also transports the essential omega-3 fatty acid docosahexaenoic acid (DHA), which is essential for normal brain growth and cognitive function, in the form of LPC into the brain across the blood-brain barrier. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. NLS1/MFSD2A belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 419
34980 341010 cd17452 MFS_MFSD2B Major facilitator superfamily domain-containing protein 2B. Major facilitator superfamily domain-containing protein 2B (MFSD2B) is closely related to MFSD2A, and their conserved genomic structure suggests that they are derived from the duplication of an ancestral gene. Variations of chromosome 2 gene expressions among patients with lung cancer or non-cancer identified MFSD2B as a potential risk or protective factor in the prognosis of lung adenocarcinoma. MFSD2B belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 416
34981 341011 cd17453 MFS_MFSD4A Major facilitator superfamily domain-containing protein 4A. Major facilitator superfamily domain-containing protein 4A (MFSD4A) belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 415
34982 341012 cd17454 MFS_NaGLT1_MFSD4B Sodium-dependent glucose transporter 1, also called Major facilitator superfamily domain-containing protein 4B. Sodium-dependent glucose transporter 1 (NaGLT1) is also called major facilitator superfamily domain-containing protein 4B (MFSD4B). NaGLT1 is a primary fructose transporter in rat renal brush-border membranes. It also facilitates sodium-independent urea uptake in assays performed on Xenopus oocytes. NaGLT1/MFSD4B belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 369
34983 341013 cd17455 MFS_FLVCR1 Feline leukemia virus subgroup C receptor-related protein 1 of the Major Facilitator Superfamily of transporters. Feline leukemia virus subgroup C receptor-related protein 1 (FLVCR1) is also called feline leukemia virus subgroup C receptor (FLVCR). FLVCR1 is a heme transporter and it has two isoforms: 1 (or FLVCR1a), which exports cytoplasmic heme as well as coproporphyrin and protoporphyrin IX; and 2 (FLVCR1b), which promotes heme efflux from the mitochondrion to the cytoplasm. Mutations in the FLVCR1 gene have been linked to vision impairment, posterior column ataxia, and sensory neurodegeneration with loss of pain perception. FLVCR1 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 407
34984 341014 cd17456 MFS_FLVCR2 Feline leukemia virus subgroup C receptor-related protein 2 of the Major Facilitator Superfamily of transporters. Feline leukemia virus subgroup C receptor-related protein 2 (FLVCR2) is also called calcium-chelate transporter (CCT). It functions as a heme importer as well as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. Mutations in the FLVCR2 gene cause Proliferative vasculopathy and hydranencephaly-hydrocephaly syndrome (PVHH), also known as Fowler syndrome, a rare autosomal recessive disorder characterized by glomerular vasculopathy in the central nervous system, severe hydrocephaly, hypokinesia and arthrogryposis. FLVCR2 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 406
34985 341015 cd17457 MFS_SLCO1B_OATP1B Solute carrier organic anion transporter 1B subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1B (SLCO1B), also called organic anion-transporting polypeptide 1B (OATP1B), subfamily is composed of two human proteins, OATP1B1 (encoded by SLCO1B1) and OATP1B3 (encoded by SLCO1B3), and one rodent member, OATP1B2 (encoded by Slco1b2). OATP1B1 and OATP1B3 are almost exclusively expressed on the basal side of hepatocytes in normal human organs. They both can accept a wide variety of structurally-unrelated compounds as substrates including clinically-important drugs such as hydroxymethylglutaryl (HMG)-CoA reductase inhibitors (statins), angiotensin II receptor blockers (sartans), angiotensin converting enzyme (ACE) inhibitors, and anti-diabetes drugs (glinides). Loss-of-function mutations in both SLCO1B1 and SLCO1B3 genes result in the Rotor syndrome, a hereditary hyperbilirubinemia. The SLCO1B/OATP1B subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 455
34986 341016 cd17458 MFS_SLCO1A_OATP1A Solute carrier organic anion transporter 1A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1A (SLCO1A), also called Organic anion-transporting polypeptide 1A (OATP1A), subfamily is composed of one human member OATP1A2 (encoded by SLCO1A2) and several rodent proteins encoded by the Slco1a1, Slco1a3, Slco1a4, Slco1a5, and Slco1a6 genes. OATP1A2, also known as human OATP-A or OATP1, shows a broad spectrum of substrates including endogenous compounds (such as bile acids, steroid hormones and their conjugates, thyroid hormones) and various drugs (such as fexofenadine, ouabain and the cyanobacterial toxin microcystin). It is expressed in the brain, kidney, intestine, liver, lung, testes, and the eye (ciliary body). The SLCO1A/OATP1A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 527
34987 341017 cd17459 MFS_SLCO1C_OATP1C Solute carrier organic anion transporter 1C subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1C (SLCO1C), also called Organic anion-transporting polypeptide 1C (OATP1C), subfamily contains one mammalian member, OATP1C1 (encoded by SLCO1C1), which is also called thyroxine transporter. It mediates the high affinity transport of the thyroid hormones, T4 (3,5,3',5'tetraiodo-L-thyronine or thyroxine), rT3 (3,3'5'-triiodo-L-thyronine), and T3 (3,5,3'tri-iodo-L-thyronine or triiodothyronine), as well as organic anions such as 17-beta-glucuronosyl estradiol, estrone-3-sulfate, and sulfobromophthalein (BSP), which are transported with much lower efficiency. The SLCO1C/OATP1C subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 498
34988 341018 cd17460 MFS_SLCO2B_OATP2B Solute carrier organic anion transporter 2B subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2B (SLCO2B), also called Organic anion-transporting polypeptide 2B (OATP2B), subfamily has one mammalian member, OATP2B1 (encoded by SLCO2B1). It mediates the Na(+)-independent transport of various organic anions such as taurocholate, the prostaglandins PGD2, PGE1, PGE2, leukotriene C4, thromboxane B2 and iloprost. It also mediates the transport of endogenous sex steroid conjugates such as dehydroepiandrosterone sulfate (DHEAS). SLCO2B1 variations result in differential expression and uptake of DHEAS, which impacts subsequent resistance to androgen-deprivation therapy (ADT), the primary treatment of metastatic prostate cancer. The SLCO2B/OATP2B subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 479
34989 341019 cd17461 MFS_SLCO2A_OATP2A Solute carrier organic anion transporter 2A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2A (SLCO2A), also called Organic anion-transporting polypeptide 2A (OATP2A), subfamily has one mammalian member, OATP2A1 (encoded by SLCO2A1), which is also called prostaglandin transporter. It is a lactate/prostaglandin anion exchanger that mediates the release of newly synthesized prostaglandins (PGD2, PGE1, PGE2, PGF2A and PGI2) from cells, the transepithelial transport of prostaglandins, and the clearance of prostaglandins from the circulation. Mutations in SLCO2A1 can cause primary hypertrophic osteoarthropathy (PHO), a rare multi-organic disease characterized by digital clubbing, pachydermia and periosteal reaction. The SLCO2A/OATP2A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 474
34990 341020 cd17462 MFS_SLCO4A_OATP4A Solute carrier organic anion transporter 4A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4A (SLCO4A), also called Organic anion-transporting polypeptide 4A (OATP4A), subfamily has one mammalian member, OATP4A1 (encoded by SLCO4A1). It is ubiquitously expressed and it mediates the Na(+)-independent transport of the thyroid hormones T3 (triiodo-L-thyronine), T4 (thyroxine) and rT3, and other organic anions such as estrone sulfate and taurocholate. OATP4A1 is the most abundantly expressed transporter in colorectal cancer (CRC) and its role in the transport of estrone sulfate, which is used in hormone replacement therapy (HRT), affects the outcome of the treatment. The SLCO4A/OATP4A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 427
34991 341021 cd17463 MFS_SLCO4C_OATP4C Solute carrier organic anion transporter 4C subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4C (SLCO4C), also called Organic anion-transporting polypeptide 4C (OATP4C), subfamily has one mammalian member, OATP4C1 (encoded by SLCO4C1). It is capable of transporting pharmacological substances such as digoxin, ouabain, thyroxine, methotrexate and cAMP. It is the only OATP expressed at the basolateral side of proximal tubular cells in human kidney and it mediates the excretion of uremic toxins, which accumulate in patients with chronic kidney diseases (CKDs) and cause further progression of renal damage and cardiovascular diseases. Overexpression of human SLCO4C1 in rat kidney promotes the renal excretion of uremic toxins and reduces hypertension, cardiomegaly, and renal inflammation in renal failure. The SLCO4C/OATP4C subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 429
34992 341022 cd17464 MFS_MCT10 Monocarboxylate transporter 10 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 10 (MCT10) is also called Solute carrier family 16 member 10 (SLC16A10). In addition, human MCT10 is also called T-type amino acid transporter 1 (TAT1). MCT10 is a sodium-independent transporter that mediates the uptake or efflux of aromatic acids such as Phe, Tyr, and Trp, as well as L-3,4-di-hydroxy-phenylalanine. It is also a thyroid hormone transporter with preference for triiodothyronine (T3). MCT10 is expressed in intestine, kidney, liver, muscle, and placenta, and appears predominantly localized in the basolateral membrane. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 395
34993 341023 cd17465 MFS_MCT8 Monocarboxylate transporter 8 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 8 (MCT8) is also called Solute carrier family 16 member 2 (SLC16A2) or X-linked PEST-containing transporter. MCT8 is a very active and specific thyroid hormone transporter which stimulates the cellular uptake of thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). Inactivating mutations in SLC16A2, the gene that encodes MCT8, lead to an X-linked syndrome with severe neurological impairment known as Allan-Herndon-Dudley syndrome (AHDS). AHDS is characterized by congenital hypotonia that progresses to spasticity with severe psychomotor delays, spastic paraplegia and dystonic movements, global developmental delay, and profound intellectual disability. MCT8 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 367
34994 350654 cd17466 T3SS_Flik_C_like C-terminal domain of type III secretion proteins FliK, HrpP, YscP, and similar domains. Type III secretion systems (T3SS) are essential components of two complex bacterial machineries: the flagellum, which drives cell motility, and the non-flagellar T3SS, which delivers effectors into eukaryotic cells. This model represents the C-terminal domain of T3SS proteins such as the flagellar hook-length control protein FliK, and non-flagellar Yop proteins translocation protein P (YscP) and HrpP. FliK is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate. HrpP is a type III secretion system substrate specificity switch-domain protein that is required for the export of pathogenicity factors into plant cells by pathogens. YscP is a needle-length sensing protein that controls the needle length of the injectisome, which is used by pathogenic bacteria to inject effector proteins across eukaryotic cell membranes. FliK, YscP, and HrpP contain a C-terminal globular domain that is necessary for the hierarchical switching of substrates during T3SS assembly and subsequent virulence effector secretion and is also referred to as the substrate-switching (SS) domain or the type III secretion substrate specificity switch (T3S4) domain. 87
34995 350655 cd17467 T3SS_YscP_C C-terminal substrate-switching domain of ruler proteins from the Ysc family, such as YscP and PscP. This subfamily includes needle-length sensing proteins, also called ruler proteins, in type 3 secretion systems (T3SS), such as Yersinia pestis Yop proteins translocation protein P (YscP) and Pseudomonas aeruginosa PscP. T3SS ruler proteins contain an N-terminal helical region that dictates needle length and is referred to as the length-sensing (LS) domain, and a C-terminal globular domain that is necessary for the hierarchical switching of substrates during T3SS assembly and subsequent virulence effector secretion and is also referred to as the substrate-switching (SS) domain or the type III secretion substrate specificity switch (T3S4) domain. The C-terminal SS domain is highly stable and sits on the extracellular side prior to needle assembly. 111
34996 350656 cd17468 T3SS_HrpP_C C-terminal domain of type III secretion protein HrpP and similar domains. This subfamily contains Pseudomonas syringae HrpP, a type III secretion system (T3SS) substrate specificity switch-domain protein that has an atypical T3SS translocation signal. HrpP is required for the export of pathogenicity factors into plant cells. HrpP contains a C-terminal domain similar to the globular C-terminal substrate-switching (SS) domain, also called the type III secretion substrate specificity switch (T3S4) domain, of Yersinia pestis YscP. 89
34997 350657 cd17470 T3SS_Flik_C C-terminal domain of flagellar hook-length control protein FliK and similar domains. The flagellar hook-length control protein FliK is a soluble cytoplasmic protein that is secreted during flagellar formation. It controls hook elongation by two successive events: by determining hook length and by stopping the supply of hook protein. It contains an N-terminal domain that determines hook length and a C-terminal domain that is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate. 86
34998 341024 cd17471 MFS_Set Sugar efflux transporter (Set) family of the Major Facilitator Superfamily of transporters. This family is composed of sugar transporters such as Escherichia coli Sugar efflux transporter SetA, SetB, SetC and other sugar transporters. SetA, SetB, and SetC are involved in the efflux of sugars such as lactose, glucose, IPTG, and substituted glucosides or galactosides. They may be involved in the detoxification of non-metabolizable sugar analogs. The Set family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
34999 341025 cd17472 MFS_YajR_like Escherichia coli inner membrane transport protein YajR and similar multidrug-efflux transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli inner membrane transport protein YajR and some uncharacterized multidrug-efflux transporters. YajR is a putative proton-driven major facilitator superfamily (MFS) transporter found in many gram-negative bacteria. Unlike most MFS transporters, YajR contains a C-terminal, cytosolic YAM domain, which may play an essential role for the proper functioning of the transporter. YajR-like transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
35000 341026 cd17473 MFS_arabinose_efflux_permease_like Putative arabinose efflux permease family transporters of the Major Facilitator Superfamily. This family includes a group of putative arabinose efflux permease family transporters, such as alpha proteobacterium quinolone resistance protein NorA (characterized Staphylococcus aureus Quinolone resistance protein NorA belongs to a different group), Desulfovibrio dechloracetivorans bacillibactin exporter, and Vibrio aerogenes antiseptic resistance protein. The biological function of those transporters remains unclear. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
35001 341027 cd17474 MFS_YfmO_like Bacillus subtilis multidrug efflux protein YfmO and similar transporters of the Major Facilitator Superfamily. This family is composed of Bacillus subtilis multidrug efflux protein YfmO, bacillibactin exporter YmfD/YmfE, uncharacterized MFS-type transporter YvmA, and similar proteins. YfmO acts to efflux copper or a copper complex, and could contribute to copper resistance. YmfD/YmfE is involved in secretion of bacillibactin. The YfmO-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 374
35002 341028 cd17475 MFS_MT3072_like Mycobacterium tuberculosis uncharacterized MFS-type transporter MT3072 and similar transporters of the Major Facilitator Superfamily. This family includes the Mycobacterium tuberculosis uncharacterized MFS-type transporter MT3072. It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 378
35003 341029 cd17476 MFS_Amf1_MDR_like Saccharomyces cerevisiae low affinity ammonium transporter Amf1p/YOR378W, aminotriazole resistance protein Atr1p, and similar transporters of the Major Facilitator Superfamily. Saccharomyces cerevisiae Amf1p/Ammonium Facilitator 1/YOR378W functions as a low affinity NH4+ transporter. S. cerevisiae aminotriazole resistance protein (Atr1p) is required for controlling sensitivity to aminotriazole; it is a putative component of the machinery responsible for pumping aminotriazole (and possibly other toxic compounds) out of the cell. This subfamily also includes S. cerevisiae YMR279C, a putative boron transporter involved in boron efflux and resistance, and Kluyveromyces lactis Knq1p which is involved in oxidative stress response and iron homeostasis. Amf1p, Atr1p, and YMR279C have been classified as group 1 members of the DHA2 (Drug:H+ Antiporter family 2) family, K. lactis Knq1 as group 2. This subfamily also includes two Aspergillus terreus terrein biosynthesis cluster proteins, efflux pump TerG and TerJ which may be required for efficient secretion of terrein or other secondary metabolites produced by the terrein gene cluster. The Amf1p-like subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 362
35004 341030 cd17477 MFS_YcaD_like YcaD and similar transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli MFS-type transporter YcaD, Bacillus subtilis MFS-type transporter YfkF, and similar proteins. They are uncharacterized transporters belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 360
35005 341031 cd17478 MFS_FsR Fosmidomycin resistance protein of the Major Facilitator Superfamily of transporters. Fosmidomycin resistance protein (FsR) confers resistance against fosmidomycin. It shows sequence similarity with the bacterial drug-export proteins that mediate resistance to tetracycline and chloramphenicol. This FsR family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 365
35006 341032 cd17479 MFS_MFSD6L Major facilitator superfamily domain-containing protein 6-like and similar transporters of the Major Facilitator Superfamily. The Major facilitator superfamily domain-containing protein 6-like (MFSD6L) protein family includes a group of uncharacterized proteins similar to human major facilitator superfamily domain-containing protein 6 (MFSD6). MFSD6 is also called Macrophage MHC class I receptor 2 homolog (MMR2). It has been postulated as a possible receptor for human leukocyte antigen (HLA)-B62. The MFSD6L family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 376
35007 341033 cd17480 MFS_SLC40A1_like Solute carrier family 40 member 1 of the Major Facilitator Superfamily of transporters. Solute carrier family 40 member 1 (SLC40A1 or SLC11A3) is also called ferroportin-1 (FPN1) or iron-regulated transporter 1 (IREG1). In the presence of a ferroxidase (hephaestin and/or ceruloplasmin), SLC40A1 acts as an iron exporter that releases Fe(2+) from cells into plasma, thereby maintaining iron homeostasis. Specifically, it is involved in iron export from duodenal epithelial cells and also in the transfer of iron between maternal and fetal circulation. The transport activity of SLC40A1 is suppressed by the peptide hormone hepcidin. This family also includes a bacterial homologue of SLC40A1 (Bdellovibrio bacteriovorus ferroportin). It adopts the major facilitator superfamily fold, but undergoes an intra-domain conformational rearrangement during the transport cycle. SLC40A1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 386
35008 341034 cd17481 MFS_MFSD13A Major facilitator superfamily domain containing 13A. Human major facilitator superfamily domain containing 13A (MFSD13A) protein is also called transmembrane protein 180. Its function is still unknown. MFSD13A belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 429
35009 341035 cd17482 MFS_YxiO_like Bacillus subtilis YxiO, Listeria monocytogenes BtlA, and similar transporters of the Major Facilitator Superfamily. This family is composed of Bacillus subtilis MFS-type transporter YxiO, and similar proteins including Listeria monocytogenes BtlA. The function of B. subtilis YxiO is still unknown, and L. monocytogenes BtlA is a putative secondary transporter involved in bile tolerance and general stress resistance. This family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 362
35010 341036 cd17483 MFS_Atg22_like Autophagy-related protein 22 and similar proteins; member of the Major Facilitator Superfamily of transporters. Atg22 (also known as Aut4) protein functions as a vacuolar effluxer which mediates the efflux of amino acids resulting from autophagic degradation. The release of autophagic amino acids allows the maintenance of protein synthesis and viability during nitrogen starvation. Members of this family belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 467
35011 341037 cd17484 MFS_FBT Folate-biopterin transporter of the Major Facilitator Superfamily of transporters. The Folate-biopterin transporter (FBT) family includes folate carriers related to those of trypanosomatids in higher plant plastids and cyanobacteria. FBT mediates folate monoglutamate transport involved in tetrahydrofolate biosynthesis. It also mediates transport of antifolates, such as methotrexate and aminopterin. The FBT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 399
35012 341038 cd17485 MFS_MFSD3 Major facilitator superfamily domain containing 3 protein. Major facilitator superfamily domain containing 3 protein (MFSD3) is a predicted acetyl-CoA transporter. As an atypical putative membrane-bound solute carrier (SLC), MFSD3 is most likely to be functionally active in the plasma membrane and not in any intracellular organelles. MFSD3 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 386
35013 341039 cd17486 MFS_AmpG_like AmpG and similar transporters of the Major Facilitator Superfamily. AmpG acts as an inner membrane permease in the beta-lactamase induction system and in peptidoglycan recycling. It transports muropeptide from the periplasm into the cytosol in gram-negative bacteria, which is essential for the induction of the ampC-encoded beta-lactamase. The AmpG family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 387
35014 341040 cd17487 MFS_MFSD5_like Major facilitator superfamily domain containing 5 protein. Human major facilitator superfamily domain containing 5 protein (MFSD5) is also called molybdate-anion transporter, or molybdate transporter 2 homolog (MOT2). It acts as an atypical solute carrier (SLC) that mediates high-affinity intracellular uptake of the rare oligo-element molybdenum. It may also play a role in maintaining the glucose homeostasis and pancreatic beta-cell proliferation, as well as in altered energy homeostasis. MFSD5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 385
35015 341041 cd17488 MFS_UhpC Membrane sensor protein UhpC of the Major Facilitator Superfamily of transporters. Membrane sensor protein UhpC acts as both a sensor and a transport protein. It is part of the UhpABC signaling cascade that controls the expression of the hexose phosphate transporter UhpT. UhpC recognizes external glucose-6-phosphate (Glc6P) and induces transport by UhpT. It can also transport and sense Glc6P, and interacts with the histidine kinase UhpB, leading to the stimulation of the autokinase activity of UhpB. This group also includes the hexose phosphate transport protein UhpT from Chlamydia pneumoniae; it is a transport protein for sugar phosphate uptake. It is part of the Organophosphate:Pi antiporter (OPA) family of integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The UhpC group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 364
35016 341042 cd17489 MFS_YfcJ_like Escherichia coli YfcJ, YhhS, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli membrane proteins, YfcJ and YhhS, Bacillus subtilis uncharacterized MFS-type transporter YwoG, and similar proteins. YfcJ and YhhS are putative arabinose efflux transporters. YhhS has been implicated in glyphosate resistance. YfcJ-like arabinose efflux transporters belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 367
35017 341043 cd17490 MFS_YxlH_like Bacillus subtilis YxlH and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of Bacillus subtilis uncharacterized MFS-type transporter YxlH and similar proteins. The biological function of YxlH remains unclear. The YxlH-like subfamily belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
35018 341044 cd17491 MFS_MFSD12 Major facilitator superfamily domain-containing protein 12. The Major facilitator superfamily domain-containing protein 12 (MFSD12) protein subfamily includes a group of uncharacterized proteins similar to human MFSD2. MFSD2 is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is an LPC symporter that plays an essential role in blood-brain barrier formation and function. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MFSD12 subfamily belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 438
35019 341212 cd17492 toxin_CptN type III toxin-antitoxin system toxin CptN and similar proteins. CptN-like toxin component of a type III toxin-antitoxin (TA) system, which consists of a ribonuclease (RNase) toxin that processes its structured and specific cognate RNA antitoxin, which in turn then directly inhibits the toxin. TA systems have been associated with many important phenotypes, like phage resistance, maintenance of genomic islands, and formation of bacterial persister cells. 149
35020 341213 cd17493 toxin_TenpN type III toxin-antitoxin system toxin TenpN and similar proteins. TenpN-like toxin component of a type III toxin-antitoxin (TA) system, which consists of a ribonuclease (RNase) toxin that processes its structured and specific cognate RNA antitoxin, which in turn then directly inhibits the toxin. TA systems have been associated with many important phenotypes, like phage resistance, maintenance of genomic islands, and formation of bacterial persister cells. 121
35021 341185 cd17494 RMtype1_S_Sma198ORF994P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Streptococcus macedonicus ACA-DC 198 S subunit (S1.Sma198ORF994P) TRD2-CR2 and Lactobacillus amylovorus GRL 1112 S subunit (S1.LamGRLORF5415P) TRD2-CR2. The recognition sequences of Streptococcus macedonicus ACA-DC 198 S subunit (S1.Sma198ORF994P) and Lactobacillus amylovorus GRL 1112 S subunit (S1.LamGRLORF5415P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 171
35022 341186 cd17495 RMtype1_S_Cep9333ORF4827P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Crinalium epipsammum S subunit (S.Cep9333ORF4827P) TRD2-CR2 and Corynebacterium genitalium sp. nov. S subunit (S.CgeORF10704P) TRD2-CR2. The recognition sequences for Crinalium epipsammum S subunit (S.Cep9333ORF4827P) and Corynebacterium genitalium sp. nov. S subunit (S.CgeORF10704P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 174
35023 341187 cd17496 RMtype1_S_BliBORF2384P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus licheniformis S subunit (S1.BliBORF2384P) TRD1-CR1 and Chlorobium tepidum TLS S subunit (S.CteTORF675P) TRD1-CR1. The recognition sequences for Bacillus licheniformis S subunit (S1.BliBORF2384P) and Chlorobium tepidum TLS S subunit (S.CteTORF675P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 175
35024 341188 cd17497 RMtype1_S_TteMORF1547P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P) TRD2-CR2. The recognition sequence is undetermined for Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P). The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.TteMORF1547P TRD1-CR1 does not belong to this subfamily. 174
35025 341189 cd17498 RMtype1_S_Aco12261I-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD1-CR1. The S.Aco12261I S subunit recognizes 5'... GCANNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.Aco12261I TRD2-CR2 does not belong to this subfamily. 173
35026 341190 cd17499 RMtype1_S_CloLW9ORF3270P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Cecembia lonarensis LW9 S subunit (S.CloLW9ORF3270P) TRD1-CR1 and Bacillus licheniformis 9945A S subunit (S.Bli9945ORF10320P) TRD1-CR1. The recognition sequences for Cecembia lonarensis LW9 S subunit (S.CloLW9ORF3270P) and Bacillus licheniformis 9945A S subunit (S.Bli9945ORF10320P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Helicobacter bizzozeronii CIII-1 putative Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.HbiCORF8670P, and may also contain type I DNA methyltransferases. 175
35027 341191 cd17500 RMtype1_S_MmaGORF2198P_TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Methanosarcina mazei Goe1 S subunit (S.MmaGORF2198P) TRD1-CR1, and Flavobacterium psychrophilum FPG3 S subunit (S.FpsFPG3ORF6820P) TRD1-CR1. The recognition sequences of Methanosarcina mazei Goe1 S subunit (S.MmaGORF2198P) and Flavobacterium psychrophilum FPG3 S subunit (S.FpsFPG3ORF6820P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains TRD1-CR1. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 186
35028 341192 cd17501 RMtype1_S_Vch69ORF1407P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Vibrio cholerae 1311-69 S subunit (S.Vch69ORF1407P) TRD2-CR2, and Methanococcoides methylutens MM1 S subunit (S.MmeMM1ORF456P) TRD2-CR2. The recognition sequences of Vibrio cholerae 1311-69 S subunit (S.Vch69ORF1407P) and Methanococcoides methylutens MM1 S subunit (S.MmeMM1ORF456P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 191
35029 341045 cd17502 MFS_Azr1_MDR_like Saccharomyces cerevisiae Azole resistance protein 1 (Azr1p), and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of multidrug resistance (MDR) transporters including various Saccharomyces cerevisiae proteins such as azole resistance protein 1 (Azr1p), vacuolar basic amino acid transporter 1 (Vba1p), vacuolar basic amino acid transporter 5 (Vba5p), and Sge1p (also known as Nor1p, 10-N-nonyl acridine orange resistance protein, and crystal violet resistance protein). MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 337
35030 341046 cd17503 MFS_LmrB_MDR_like Bacillus subtilis lincomycin resistance protein (LmrB) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of multidrug resistance (MDR) transporters including Bacillus subtilis lincomycin resistance protein LmrB, and several proteins from Escherichia coli such as the putative MDR transporters EmrB, MdtD, and YieQ. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, MMR confers resistance to the epoxide antibiotic methylenomycin. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 380
35031 341047 cd17504 MFS_MMR_MDR_like Methylenomycin A resistance protein (also called MMR peptide)-like multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of putative multidrug resistance (MDR) transporters including Chlamydia trachomatis antiseptic resistance protein QacA_2, and Serratia sp. DD3 Bmr3. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 371
35032 340762 cd17505 Ubl_SAMP1_like ubiquitin-like (Ubl) domain found in small archaeal modifier protein 1 (SAMP1). Ubiquitin-like small archaeal modifier protein 1 (SAMP1) shows a beta-grasp fold of Ub, suggesting that this archaeal Ubl molecule is more closely related to eukaryotic Ub and Ubls than to its prokaryotic counterpart. Several Ub-like structural features such as an N-terminal single lysine residue and di-glycine motif at the C-terminus, spatially isolated, implicate formation of a poly-SAMPylated chain. SAMP1 can form covalent conjugates with its protein targets through an isopeptide linkage via its C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. It is involved in sulfur transfer during molybdenum cofactor biosynthesis much like MoaD. This family also includes proteins such as Thermoplasma acidophilum TA0895 and others, all closely related to MoaD proteins. 90
35033 340763 cd17506 Ubl_SAMP2_like ubiquitin-like (Ubl) domain found in small archaeal modifier protein 2 (SAMP2). Ubiquitin-like small archaeal modifier protein 2 (SAMP2) shows a beta-grasp fold of Ub, suggesting that this archaeal Ubl molecule is more closely related to eukaryotic Ub and Ubls than to its prokaryotic counterpart. Several Ub-like structural features such as an N-terminal single lysine residue and di-glycine motif at the C-terminus, spatially isolated, implicate formation of a poly-SAMPylated chain. SAMP2 can form covalent conjugates with its protein targets through an isopeptide linkage via its C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. It also forms homo-conjugates through the intermolecular isopeptide bond between the C-terminal Gly and the Lys58 side chain, a feature that likely resembles polyubiquitination. SAMP2 is involved in sulfur transfer during tRNA thiolation much like Urm1. This family also includes uncharacterized proteins such as Methanothermococcus thermolithotrophicus Mth1743, Pyrococcus furiosus PF1061 and others, all closely related to MoaD proteins. 67
35034 340861 cd17507 GT28_Beta-DGS-like beta-diglucosyldiacylglycerol synthase and similar proteins. beta-diglucosyldiacylglycerol synthase (processive diacylglycerol beta-glucosyltransferase EC 2.4.1.315) is involved in the biosynthesis of both the bilayer- and non-bilayer-forming membrane glucolipids. This family of glycosyltransferases also contains plant major galactolipid synthase (chloroplastic monogalactosyldiacylglycerol synthase 1 EC 2.4.1.46). Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. 364
35035 341225 cd17508 Alpha_kinase Alpha kinase family; uncharacterized subgroup. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 243
35036 341226 cd17509 Alpha_kinase Alpha kinase family; uncharacterized subgroup. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 221
35037 409322 cd17510 T3SC_YbjN-like_2 Uncharacterized protein is structurally similar to type III secretion system chaperones and YbjN family proteins. This family includes an uncharacterized protein from Methanothermobacter thermautotrophicus that is structurally similar to type III secretion system (T3SS) chaperones (T3SC) that bind effector proteins, and is homologous to YbjN, a putative sensory transduction regulator protein found in Proteobacteria. 142
35038 409323 cd17511 YbjN_AmyR-like YbjN protein family is structurally similar to type III secretion system chaperones. This YbjN protein family includes Escherichia coli YbjN, Erwinia amylovora AmyR, and similar proteins. YbjN proteins share a class I type III secretion chaperone (T3SC)-like fold with type III secretion system (T3SS) chaperone proteins but appear to function independently of the T3SS. YbjN is an enterobacteria-specific protein. In E. coli, it acts as a sensory transduction regulator that may play important roles in regulating bacterial multicellular behavior, metabolism, and survival under stress conditions. E. amylovora AmyR, a functionally conserved ortholog of E. coli YbjN, is a stress and virulence associated protein that regulates the ams operon. Ams proteins are required for amylovoran biosynthesis. AmyR may also regulate the Rcs phosphorelay system, an atypical two-component signal transduction (TCST) system that is present only in Enterobacteriaceae and positively regulates amylovoran biosynthesis by activating ams operon transcription. 122
35039 341193 cd17512 RMtype1_S_BceB55ORF5615P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus cereus HuB5-5 S subunit (S.BceB55ORF5615P) TRD2-CR2. The recognition sequence of Bacillus cereus HuB5-5 S subunit (S.BceB55ORF5615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 195
35040 341194 cd17513 RMtype1_S_AveSPN6ORF1907P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Archaeoglobus veneficus SNP6 S subunit (S.AveSPN6ORF1907P) TRD2-CR2 and Bacillus subtilis JRS2 S subunit (S.BsuJRS7ORF3308P) TRD1-CR1. The recognition sequences of Archaeoglobus veneficus SNP6 S subunit (S.AveSPN6ORF1907P) and Bacillus subtilis JRS2 S subunit (S.BsuJRS7ORF3308P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 182
35041 341195 cd17514 RMtype1_S_Eco2747I_MmaC7ORF19P-TRD-CR_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli ST2747 S subunit (S.Eco2747I) TRD2-CR2, Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) TRD1-CR1, and related domains. The S.Eco2747I S subunit recognizes 5'... CACNNNNNNNGTTG ... 3'. The recognition sequence of Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. It also includes the TRD-CR-like domains of putative type II restriction enzymes and methyltransferases, such as Helicobacter cinaedi PAGU611 Hci611ORFHP which may recognize 5'... GAGNNNNNGT ... 3', and type I N6-adenine DNA methyltransferases, such as Calditerrivibrio nitroreducens M.Cni19672ORF1405P whose recognition sequence is undetermined. 183
35042 341196 cd17515 RMtype1_S_MjaORF132P_Sau1132ORF3780P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to MjaXIP/S.MjaORF132P TRD1-CR1, S.Sau1132ORF3780P TRD1-CR1, S.Mca353ORF290P TRD1-CR1, and other TRD-CR's. The Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The recognition sequence of Methanococcus jannaschii S subunit (MjaXIP/S.MjaORF132P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG, and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 181
35043 341197 cd17516 RMtype1_S_HinAWORF1578P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.HinAWORF1578P TRD2-CR2. Haemophilus influenzae RdAW S subunit (S.HinAWORF1578P) recognizes 5'... CTANNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 184
35044 341198 cd17517 RMtype1_S_EcoKI_StySPI-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) TRD2-CR2, Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) TRD2-CR2, and other TRD-CR's. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) recognizes 5'... AACNNNNNNGTGC ... 3' and Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 and S.StySPI-TRD1 both recognize AAC/GTT, S.EcoKI-TRD2 recognizes GCAC/GTGC and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.PpiI), and type I DNA methyltransferases such as Bacillus cereus BDRD-ST24 M subunit of Type I N6-adenine DNA methyltransferase (M.Bce24ORF51270P). RM.PpiI recognizes 5' ... GAACNNNNNCTC ... 3'. 192
35045 341199 cd17518 RMtype1_S_Asp27244ORF1181P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acinetobacter sp. S subunit (S.Asp27244ORF1181P) TRD1-CR1. The recognition sequence of Acinetobacter sp. S subunit (S.Asp27244ORF1181P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 180
35046 341200 cd17519 RMtype1_S_HpyCR35ORFAP-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori CR35 S subunit (S.HpyCR35ORFAP) TRD1-CR1 and Mycoplasma haemofelis str. Langford 1 S subunit (S2.Mha1ORF7190P) TRD1-CR1. The recognition sequences of Helicobacter pylori CR35 S subunit (S.HpyCR35ORFAP) and Mycoplasma haemofelis str. Langford 1 S subunit (S2.Mha1ORF7190P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 183
35047 341201 cd17520 RMtype1_S_HmoORF3075P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Heliobacterium modesticaldum Ice1 S subunit (S1.HmoORF3075P) TRD1-CR1. The recognition sequence of Heliobacterium modesticaldum Ice1 S subunit (S1.HmoORF3075P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This subfamily also includes the TRD-CR-like sequence-recognition domain of type I N6-adenine DNA methyltransferase (M) subunit of Clostridium intestinale URNW (M2.CinURNWORF2828P). The recognition sequence of M2.CinURNWORF2828P is undetermined. Type I methyltransferases included in this group include two domains: one for methylation, and another (TRD-CR-like) for sequence-recognition. 180
35048 341202 cd17521 RMtype1_S_Sau13435ORF2165P_TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) TRD2-CR2, Escherichia coli E24377A S subunit (S.EcoE24377ORF286P) TRD1-CR1 and Pseudoalteromonas species P1-13-1a S subunit (S.Psp1bORF2093P) TRD2-CR2. Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) recognizes 5'... TCTANNNNNNRTTC ... 3', and the recognition sequences of Escherichia coli E24377A S subunit (S.EcoE24377ORF286P) and Pseudoalteromonas species P1-13-1a S subunit (S.Psp1bORF2093P) are undetermined. The restriction-modification (RM) system S subunit generally consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) TRD1 recognizes TCTA/TAGA, and -TRD2 recognizes GAAY/RTTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. In addition, this family includes RMtype1_S_TRD-CR_like domains of various putative Helicobacter type II restriction enzymes and methyltransferases, such as Hci611ORFHP and HfeORF12890P. 187
35049 341203 cd17522 RMtype1_S_MjaORF1531P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Methanocaldococcus jannaschii DSM 2661 S subunit (S.MjaORF1531P/MjaXIIP) TRD1-CR1. The recognition sequence of Methanocaldococcus jannaschii DSM 2661 S subunit (S.MjaORF1531P, also called MjaXIIP) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 190
35050 341204 cd17523 RMtype1_S_StySPI-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) TRD2-CR2. Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.StySPI-TRD1 recognizes AAC/GTT and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 190
35051 341205 cd17524 RMtype1_S_EcoUTORF5051P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli UTI89 S subunit (S.EcoUTORF5051P) TRD2-CR2 and Archaeoglobus fulgidus VC-16 S subunit (S.AfuORF1715P) TRD2-CR2. Escherichia coli UTI89 S subunit (S.EcoUTORF5051P) recognizes 5'... CCANNNNNNNCTTC ... 3' and the recognition sequence of Archaeoglobus fulgidus VC-16 S subunit (S.AfuORF1715P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.PpiI), and type I DNA methyltransferases such as Bacillus cereus BDRD-ST24 M subunit of Type I N6-adenine DNA methyltransferase (M.Bce24ORF51270P). RM.PpiI recognizes 5' ... GAACNNNNNCTC ... 3'. 189
35052 341206 cd17525 RMtype1_S_Eco15ORF14057P-TRD1-CR1_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli 541-15 S subunit (S.Eco15ORF14057P) TRD1-CR1 and Desulfotignum phosphitoxidans S subunit (S.Dph13687ORF2110P) TRD2-CR2. The recognition sequences of Escherichia coli 541-15 S subunit (S.Eco15ORF14057P) and Desulfotignum phosphitoxidans S subunit (S.Dph13687ORF2110P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. 190
35053 341207 cd17526 RMtype1_S_Cje2232P-TRD2-CR2_like Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) TRD2-CR2 and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) TRD1-CR1. The recognition sequences of Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. Also included in this subfamily is the C-terminal TRD-CR-like sequence-recognition domain of Microcystis aeruginosa putative type I N6-adenine DNA methyltransferase M subunit (M.Mae7806ORF3969P). The recognition sequence of M.Mae7806ORF3969P is undetermined. 192
35054 381744 cd17527 HAMP_II second HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established. 46
35055 381745 cd17528 HAMP_III third HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established. 44
35056 381746 cd17529 HAMP_I first HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established. 44
35057 381086 cd17530 REC_RocR phosphoacceptor receiver (REC) domain of response regulator RocR. The response regulator RocR from some pathogens contains an N-terminal phosphoreceiver (REC) domain and a C-terminal EAL domain that possesses c-di-GMP specific phosphodiesterase activity. The RocR REC domain is phosphorylated and modulates its EAL domain enzymatic activity, regulating the local level of c-di-GMP. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 123
35058 381087 cd17532 REC_LytTR_AlgR-like phosphoacceptor receiver (REC) domain of LytTR/AlgR family response regulators similar to AlgR. Members of the LytTR/AlgR family of response regulators contain a REC domain and a unique LytTR DNA-binding output domain that lacks the helix-turn-helix motif and consists mostly of beta-strands. Transcriptional regulators with the LytTR-type output domains are involved in biosynthesis of extracellular polysaccharides, fimbriation, expression of exoproteins, including toxins, and quorum sensing. Included in this AlgR-like group of LytTR/AlgR family response regulators are Streptococcus agalactiae sensory transduction protein LytR, Pseudomonas aeruginosa positive alginate biosynthesis regulatory protein AlgR, Bacillus subtilis sensory transduction protein LytT, and Escherichia coli transcriptional regulatory protein BtsR, which are members of two-component regulatory systems. LytR and LytT are components of regulatory systems that regulate genes involved in cell wall metabolism. AlgR positively regulates the algD gene, which codes for a GDP-mannose dehydrogenase, a key enzyme in the alginate biosynthesis pathway. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35059 381088 cd17533 REC_LytTR_AgrA-like phosphoacceptor receiver (REC) domain of LytTR/AlgR family response regulators similar to AgrA. Members of the LytTR/AlgR family of response regulators contain a REC domain and a unique LytTR DNA-binding output domain that lacks the helix-turn-helix motif and consists mostly of beta-strands. Transcriptional regulators with the LytTR-type output domains are involved in biosynthesis of extracellular polysaccharides, fimbriation, expression of exoproteins, including toxins, and quorum sensing. Included in this AgrA-like group of LytTR/AlgR family response regulators are Staphylococcus aureus accessory gene regulator protein A (AgrA) and Streptococcus pneumoniae response regulator ComE, which are members of two-component regulatory systems. AgrA is a global regulator that controls the synthesis of virulence factors and other exoproteins. ComE is part of the ComD-ComE system that is part of a quorum-sensing signaling pathway that controls the development of competence, a physiological state required for genetic transformation. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 131
35060 381089 cd17534 REC_DC-like phosphoacceptor receiver (REC) domain of modulated diguanylate cyclase and similar domains. This group includes a modulated diguanylate cyclase containing a PAS sensor domain from Desulfovibrio desulfuricans G20. Members of this group contain N-terminal REC domains and various output domains including the GGDEF, histidine kinase, and helix-turn-helix (HTH) DNA binding domains. Also included in this family is Mycobacterium tuberculosis PdtaR, a transcriptional antiterminator that contains a REC domain and an ANTAR RNA-binding output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35061 381090 cd17535 REC_NarL-like phosphoacceptor receiver (REC) domain of NarL (Nitrate/Nitrite response regulator L) family response regulators. The NarL family is one of the more abundant families of DNA-binding response regulators (RRs). Members of the NarL family contain a REC domain and a helix-turn-helix (HTH) DNA-binding output domain, with a majority of members containing a LuxR-type HTH domain. They function as transcriptional regulators. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35062 381091 cd17536 REC_YesN-like phosphoacceptor receiver (REC) domain of YesN and related helix-turn-helix containing response regulators. This family is composed of uncharacterized response regulators that contain a REC domain and an AraC family helix-turn-helix (HTH) DNA-binding output domain, including Bacillus subtilis uncharacterized transcriptional regulatory protein YesN and Staphylococcus aureus uncharacterized response regulatory protein SAR0214. YesN is a member of the two-component regulatory system YesM/YesN and SAR0214 is a member of the probable two-component regulatory system SAR0215/SAR0214. Also included in this family is the AlgR-like group of LytTR/AlgR family response regulators, which includes Pseudomonas aeruginosa positive alginate biosynthesis regulatory protein AlgR and Bacillus subtilis sensory transduction protein LytT, among others. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 121
35063 381092 cd17537 REC_FixJ phosphoacceptor receiver (REC) domain of FixJ family response regulators. FixJ family response regulators contain an N-terminal receiver domain (REC) and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. The Sinorhizobium meliloti two-component system FixL/FixJ regulates nitrogen fixation in response to oxygen during symbiosis. Under microaerobic conditions, the kinase FixL phosphorylates the response regulator FixJ resulting in the regulation of nitrogen fixation genes such as nifA and fixK. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
35064 381093 cd17538 REC_D1_PleD-like first (D1) phosphoacceptor receiver (REC) domain of response regulator PleD and similar domains. PleD contains a REC domain (D1) with the phosphorylatable aspartate, a REC-like adaptor domain (D2), and the enzymatic diguanylate cyclase (DGC) domain, also called the GGDEF domain according to a conserved sequence motif, as its output domain. The GGDEF-containing PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes D1 of PleD and similar domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 104
35065 381094 cd17539 psREC-like_D2_PleD REC-like adaptor domain (D2) of response regulator PleD. PleD contains a REC domain (D1) with the phosphorylatable aspartate, a pseudo receiver (psREC)-like adaptor domain (D2), and the enzymatic diguanylate cyclase (DGC) domain, also called the GGDEF domain according to a conserved sequence motif, as its output domain. The GGDEF-containing PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes the REC-like adaptor domain D2 of PleD, which is an inactive domain. 124
35066 381095 cd17540 REC_PhyR phosphoacceptor receiver (REC) domain of response regulator PhyR and similar proteins. PhyR is a hybrid stress regulator that contains an N-terminal sigma-like (SL) domain and a C-terminal REC domain. Phosphorylation of the REC domain is known to promote binding of the SL domain to an anti-sigma factor. PhyR thus functions as an anti-anti-sigma factor in its phosphorylated state. It is involved in the general stress response. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35067 381096 cd17541 REC_CheB-like phosphoacceptor receiver (REC) domain of chemotaxis response regulator protein-glutamate methylesterase CheB and similar chemotaxis proteins. Methylesterase CheB is a chemotaxis response regulator with an N-terminal REC domain and a C-terminal methylesterase domain. Chemotaxis is a behavior known in motile bacteria that directs their movement in response to chemical gradients. CheB is a phosphorylation-activated response regulator involved in the reversible modification of bacterial chemotaxis receptors. It catalyzes the demethylation of specific methylglutamate residues introduced into the chemoreceptors (methyl-accepting chemotaxis proteins) by CheR. The CheB REC domain packs against the active site of the C-terminal domain and inhibits methylesterase activity by directly restricting access to the active site. Also included in this family is chemotaxis response regulator CheY, which contains a stand-alone REC domain, and an uncharacterized subfamily composed of proteins containing an N-terminal REC domain and a C-terminal CheY-P phosphatase (CheC) domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 125
35068 381097 cd17542 REC_CheY phosphoacceptor receiver (REC) domain of chemotaxis protein CheY. The chemotaxis response regulator CheY contains a stand-alone REC domain. Chemotaxis is a behavior known for motile bacteria that directs their movement in response to chemical gradients. CheY is involved in transmitting sensory signals from chemoreceptors to the flagellar motors. Phosphorylated CheY interacts with the flagella switch components FliM and FliY, which causes counterclockwise rotation of the flagella, resulting in smooth swimming. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35069 381098 cd17544 REC_2_GGDEF second phosphoacceptor receiver (REC) domain of uncharacterized GGDEF domain proteins. This family is composed of uncharacterized PleD-like response regulators that contain two N-terminal REC domains and a C-terminal diguanylate cyclase output domain with the characteristic GGDEF motif at the active site. Unlike PleD which contains a REC-like adaptor domain, the second REC domain of these uncharacterized GGDEF domain proteins, described in this model, contains characteristic metal-binding and active site residues. PleD response regulators are global regulators of cell metabolism in some important human pathogens. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 122
35070 381099 cd17546 REC_hyHK_CKI1_RcsC-like phosphoacceptor receiver (REC) domain of hybrid sensor histidine kinases/response regulators similar to Arabidopsis thaliana CKI1 and Escherichia coli RcsC. This family is composed of hybrid sensor histidine kinases/response regulators that are sensor histidine kinases (HKs) fused with a REC domain, similar to the sensor histidine kinase CKI1 from Arabidopsis thaliana, which is involved in multi-step phosphorelay (MSP) signaling that mediates responses to a variety of important stimuli in plants. MSP involves a signal being transferred from HKs via histidine phosphotransfer proteins (AHP1-AHP5) to nuclear response regulators. The CKI1 REC domain specifically interacts with the downstream signaling proteins AHP2, AHP3 and AHP5. The plant MSP system has evolved from the prokaryotic two-component system (TCS), which allows organisms to sense and respond to changes in environmental conditions. This family also includes bacterial hybrid sensor HKs such as Escherichia coli RcsC, which is a component of the Rcs signalling pathway that controls a variety of physiological functions like capsule synthesis, cell division, and motility. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 113
35071 381100 cd17548 REC_DivK-like phosphoacceptor receiver (REC) domain of DivK and similar proteins. Caulobacter crescentus DivK is an essential response regulator that is involved in the complex phosphorelay pathways controlling both cell division and motility. It localizes cell cycle regulators to specific poles of the cell during division. DivK contains a stand-alone REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35072 381101 cd17549 REC_DctD-like phosphoacceptor receiver (REC) domain of C4-dicarboxylic acid transport protein D (DctD) and similar proteins. C4-dicarboxylic acid transport protein D (DctD) is part of the two-component regulatory system DctB/DctD, which regulates C4-dicarboxylate transport via regulation of expression of the dctPQM operon and dctA. It is an activator of sigma(54)-RNA polymerase holoenzyme that uses the energy released from ATP hydrolysis to stimulate the isomerization of a closed promoter complex to an open complex capable of initiating transcription. DctD is a member of the NtrC family, characterized by a domain architecture containing an N-terminal REC domain, followed by a central sigma-54 interaction/ATPase domain, and a C-terminal DNA binding domain. The ability of the central domain to hydrolyze ATP and thus to interact effectively with a complex of RNA polymerase, sigma54, and promoter, is controlled by the phosphorylation status of the REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 130
35073 381102 cd17550 REC_NtrX-like phosphoacceptor receiver (REC) domain of nitrogen assimilation regulatory protein NtrX and similar proteins. NtrX is part of the two-component regulatory system NtrY/NtrX that is involved in the activation of nitrogen assimilatory genes such as Gln. It is phosphorylated by the histidine kinase NtrY and interacts with sigma-54. NtrX is a member of the NtrC family, characterized by a domain architecture containing an N-terminal REC domain, followed by a central sigma-54 interaction/ATPase domain, and a C-terminal DNA binding domain. NtrC family response regulators are sigma54-dependent transcriptional activators. Also included in this subfamily is Aquifex aeolicus NtrC4. The ability of the central domain to hydrolyze ATP and thus to interact effectively with a complex of RNA polymerase, sigma54, and promoter, is controlled by the phosphorylation status of the REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35074 381103 cd17551 REC_RpfG-like phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase response regulator RpfG and similar proteins. Cyclic di-GMP phosphodiesterase response regulator RpfG, together with sensory/regulatory protein RpfC, constitutes a two-component system implicated in sensing and responding to the diffusible signal factor (DSF) that is essential for cell-cell signaling. RpfC is a hybrid sensor/histidine kinase that phosphorylates and activates RpfG, which degrades cyclic di-GMP to GMP, leading to the activation of Clp, a global transcriptional regulator that regulates a large set of genes in the DSF pathway. RpfG contains a CheY-like receiver domain attached to a histidine-aspartic acid-glycine-tyrosine-proline (HD-GYP) cyclic di-GMP phosphodiesterase domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35075 381104 cd17552 REC_RR468-like phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator RR468 and similar domains. Thermotoga maritima RR468 (encoded by gene TM0468) is the cognate response regulator (RR) of the class I histidine kinase HK853 (product of gene TM0853). HK853/RR468 comprise a two-component system (TCS) that couples environmental stimuli to adaptive responses. This subfamily also includes Fremyella diplosiphon complementary adaptation response regulator homolog RcaF, a small RR that is involved in four-step phosphorelays of the complementary chromatic adaptation (CCA) system that occurs in many cyanobacteria. Both RR468 and RcaF are stand-alone RRs containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 121
35076 381105 cd17553 REC_Spo0F-like phosphoacceptor receiver (REC) domain of Spo0F and similar domains. Spo0F, a stand-alone response regulator containing only a REC domain with no output/effector domain, controls sporulation in Bacillus subtilis through the exchange of a phosphoryl group. Bacillus subtilis forms spores when conditions for growth become unfavorable. The initiation of sporulation is controlled by a phosphorelay (an expanded version of the two-component system) that consists of four main components: a histidine kinase (KinA), a secondary messenger (Spo0F), a phosphotransferase (Spo0B), and a transcription factor (Spo0A). REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35077 381106 cd17554 REC_TrrA-like phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator TrrA and similar domains. Thermotoga maritima contains a two-component signal transduction system (TCS) composed of the ThkA sensory histidine kinase (HK) and its cognate response regulator (RR) TrrA; the specific function of the system is unknown. TCSs couple environmental stimuli to adaptive responses. TrrA is a stand-alone RR containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 113
35078 381107 cd17555 REC_RssB-like phosphoacceptor receiver (REC) domain of Pseudomonas aeruginosa RssB and similar domains. Pseudomonas aeruginosa RssB is an orphan atypical response regulator containing a REC domain and a PP2C-type protein phosphatase output domain. Its function is still unknown. Escherichia RssB, which is not included in this subfamily, is a ClpX adaptor protein which alters ClpX specificity by mediating a specific interaction between ClpX and the substrates such as RpoS, an RNA polymerase sigma factor. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
35079 381108 cd17557 REC_Rcp-like phosphoacceptor receiver (REC) domain of cyanobacterial phytochrome response regulator Rcp and similar domains. This family is composed of response regulators (RRs) that are members of phytochrome-associated, light-sensing two-component signal transduction pathways such as Synechocystis sp. Rcp1, Tolypothrix sp. RcpA, and Agrobacterium tumefaciens bacteriophytochrome response regulator AtBRR. They are stand-alone RRs containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. Also included in this family is Methanosaeta harundinacea methanogenesis regulatory protein FilR2, also a stand-alone RR. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 129
35080 381109 cd17561 REC_Spo0A phosphoacceptor receiver (REC) domain of Spo0A. Spo0A is a response regulator of the phosphorelay system in the early stage of spore formation. It may be an element of the effector pathway responsible for the activation of sporulation genes in response to nutritional stress and may act in concert with sigma factor spo0H to control the expression of some genes that are critical to the sporulation process. Spo0A contains a regulatory N-terminal REC domain and a C-terminal DNA-binding transcription activation domain as its effector/output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 108
35081 381110 cd17562 REC_CheY4-like phosphoacceptor receiver (REC) domain of chemotaxis response regulator CheY4 and similar CheY family proteins. CheY family chemotaxis response regulators (RRs) comprise about 17% of bacterial RRs and almost half of all RRs in archaea. This subfamily contains Vibrio cholerae CheY4 and similar CheY family RRs. CheY proteins control bacterial motility and participate in signaling phosphorelays and in protein-protein interactions. CheY RRs contain only the REC domain with no output/effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35082 381111 cd17563 REC_RegA-like phosphoacceptor receiver (REC) domain of photosynthetic apparatus regulatory protein RegA. Rhodobacter sphaeroides RegA, also called response regulator PrrA, is the DNA binding regulatory protein of a redox-responsive two-component regulatory system RegB/RegA that is involved in transactivating anaerobic expression of the photosynthetic apparatus. It contains a REC domain and a DNA-binding helix-turn-helix output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 112
35083 381112 cd17565 REC_GlnL-like phosphoacceptor receiver (REC) domain of transcriptional regulatory protein GlnL and similar proteins. Bacillus subtilis GlnL is part of the GlnK-GlnL (formerly YcbA-YcbB) two-component system that positively regulates the expression of the glsA-glnT (formerly ybgJ-ybgH) operon in response to glutamine. It contains a REC domain and a DNA-binding output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 103
35084 381113 cd17569 REC_HupR-like phosphoacceptor receiver (REC) domain of hydrogen uptake protein regulator (HupR) and similar domains. This family is composed of mostly uncharacterized response regulators with similarity to the REC domains of response regulator components of two-component systems that regulate hydrogenase activity, including HupR and HoxA. HupR is part of the HupT/HupR system that controls the synthesis of the membrane-bound [NiFe]hydrogenase, HupSL, of the photosynthetic bacterium Rhodobacter capsulatus. It contains an N-terminal REC domain, a central sigma-54 interaction domain that lacks ATPase activity, and a C-terminal DNA-binding domain. Members of this family contain a REC domain and various output domains including the cyclase homology domain (CHD) and the c-di-GMP phosphodiesterase domains, HD-GYP and EAL. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35085 381114 cd17572 REC_NtrC1-like phosphoacceptor receiver (REC) domain of nitrogen regulatory protein C 1 (NtrC1) from Aquifex aeolicus and similar NtrC family response regulators. NtrC family proteins are transcriptional regulators that have REC, AAA+ ATPase/sigma-54 interaction, and DNA-binding output domains. This subfamily of NtrC proteins includes Aquifex aeolicus NtrC1 and Vibrio quorum-sensing signal integrator LuxO. The N-terminal REC domain of NtrC proteins regulates the activity of the protein and its phosphorylation controls the AAA+ domain oligomerization, while the central AAA+ domain participates in nucleotide binding, hydrolysis, oligomerization, and sigma54 interaction. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 121
35086 381115 cd17573 REC_HP-RR-like phosphoacceptor receiver (REC) domain of orphan response regulator HP-RR and similar proteins. Helicobacter pylori response regulator hp1043 (HP-RR) is an orphan response regulator which is phosphorylation-independent and is essential for growth. HP-RR functions as a cell growth-associated regulator in the absence of post-translational modification. Members of this subfamily contain REC and DNA-binding output domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 110
35087 381116 cd17574 REC_OmpR phosphoacceptor receiver (REC) domain of OmpR family response regulators. OmpR-like proteins are among the most widespread transcriptional regulators. OmpR family members contain REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. They are involved in the control of environmental stress tolerance (such as the oxidative, osmotic and acid stress response), motility, virulence, outer membrane biogenesis and other processes. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 99
35088 381117 cd17575 REC_WspR-like phosphoacceptor receiver (REC) domain of WspR response regulator and similar proteins. The GGDEF response regulator WspR is part of the Wsp system that is homologous to chemotaxis systems and also includes the membrane-bound receptor protein WspA. In response to growth on surfaces, WspR is phosphorylated by the Wsp signal transduction complex and is activated, functioning as a diguanylate cyclase (DGC) that catalyzes c-di-GMP synthesis. WspR is a hybrid response regulator-diguanylate cyclase, containing an N-terminal REC domain and a C-terminal GGDEF domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 128
35089 381118 cd17580 REC_2_DhkD-like second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains. Dictyostelium discoideum hybrid signal transduction histidine kinase D (DhkD) is a large protein that contains two histidine kinase (HK) and two REC domains on the intracellular side of a single pass transmembrane domain, and extracellular PAS and PAC domains that likely are involved in ligand binding. This model represents the second REC domain and similar domains. DhkD activates the cAMP phosphodiesterase RegA to ensure proper prestalk and prespore patterning, tip formation, and the vertical elongation of the mound into a finger, in Dictyostelium discoideum. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 112
35090 381119 cd17581 REC_typeA_ARR phosphoacceptor receiver (REC) domain of type A Arabidopsis response regulators (ARRs) and similar proteins. Type-A response regulators of Arabidopsis (ARRs) are involved in cytokinin signaling, which involves a phosphorelay cascade by histidine kinase receptors (AHKs), histidine phosphotransfer proteins (AHPs) and downstream ARRs. Cytokinin is a plant hormone implicated in many growth and development processes including shoot organogenesis, leaf senescence, sink/source relationships, vascular development, lateral bud release, and photomorphogenic development. Type-A ARRs function downstream of and are regulated by type-B ARRs, which are a class of MYB-type transcription factors. As primary cytokinin response genes, type-A ARRs act as redundant negative feedback regulators of cytokinin signaling by inactivating the phosphorelay. ARRs are divided into two groups, type-A and -B, according to their sequence and domain structure. Type-A ARRs are similar in domain structure to CheY, in that they lack a typical output domain and only contain a stand-alone receiver (REC) domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 122
35091 381120 cd17582 psREC_PRR pseudo receiver domain of pseudo-response regulators. In Arabidopsis, five pseudo-response regulators (PRRs), also called APRRs, comprise a core group of clock components that controls the pace of the central oscillator of the circadian clock, an endogenous time-keeping mechanism that enables organisms to adapt to external daily cycles. The coordinated sequential expression of PRR9 (APRR9), PRR7 (APRR7), PRR5 (APRR5), PRR3 (APRR3), and PRR1 (APRR1) results in circadian waves that may be at the basis of the endogenous circadian clock. PRRs contain an N-terminal pseudo receiver (psREC) domain that resembles the receiver domain of a two-component response regulator, but lacks an aspartate residue that accepts a phosphoryl group from the sensor kinase, and a CCT motif at the C-terminus that contains a putative nuclear localization signal. The psREC domain is involved in protein-protein interactions. 104
35092 381121 cd17584 REC_typeB_ARR-like phosphoacceptor receiver (REC) domain of type B Arabidopsis response regulators (ARRs) and similar domains. Type-B ARRs (Arabidopsis response regulators) are a class of MYB-type transcription factors that act as major players in the transcriptional activation of cytokinin-responsive genes. They directly regulate the expression of type-A ARR genes and other downstream target genes. Cytokinin is a plant hormone implicated in many growth and development processes including shoot organogenesis, leaf senescence, sink/source relationships, vascular development, lateral bud release, and photomorphogenic development. Cytokinin signaling involves a phosphorelay cascade by histidine kinase receptors (AHKs), histidine phosphotransfer proteins (AHPs) and downstream ARRs. ARRs are divided into two groups, type-A and -B, according to their sequence and domain structure. Type-B ARRs contain a receiver (REC) domain and a large C-terminal extension that has characteristics of an effector or output domain, with a Myb-like DNA binding domain referred to as the GARP domain. The GARP domain is a motif specific to plant transcription factors. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35093 381122 cd17586 REC_PFxFATGY phosphoacceptor receiver (REC) domain of PFxFATGY motif single-domain (stand-alone) response regulators. This subfamily is composed of stand-alone response regulators (RRs) containing the PFxFATG[G/Y] motif; RRs with such a motif are also called ''FAT GUY'' response regulators. Included in this subfamily are Sphingomonas melonis SdrG, Sinorhizobium meliloti Sma0114, and Erythrobacter litoralis EL_LovR. SdrG is involved in the control of the general stress response. Sma0114 is part of the Sma0113/Sma0114 two-component system (TCS) that is involved in catabolite repression and polyhydroxy butyrate synthesis. EL_LovR is involved in a light-regulated TCS. PFxFATG[G/Y] RRs are typically associated with histidine-tryptophan-glutamate (HWE) histidine kinases that constitute a subclass of the larger histidine kinase superfamily characterized by an altered ATP binding site, which lacks the F-box that is normally an integral component of the ATP lid. The PFxFATG[G/Y] motif is involved in conformational changes after phosphorylation that result in the activation of the RR. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 111
35094 381123 cd17589 REC_TPR phosphoacceptor receiver (REC) domain of uncharacterized tetratricopeptide repeat (TPR)-containing response regulators. Response regulators share the common phosphoacceptor REC domain and different output domains. This subfamily contains uncharacterized response regulators with TPR repeats as the effector or output domain, which might contain between 3 and 16 TPR repeats (each about 34 amino acids). TPR-containing proteins occur in all domains of life and the abundance of TPR-containing proteins in a bacterial proteome is not indicative of virulence. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. Some members in this subfamily may contain inactive REC domains lacking canonical metal-binding and active site residues. 115
35095 381124 cd17593 REC_CheC-like phosphoacceptor receiver (REC) domain of uncharacterized response regulators containing a CheC domain. This subfamily is composed of uncharacterized proteins containing an N-terminal REC domain and a C-terminal CheC domain that may function as the output/effector domain of a response regulator. CheC is a CheY-P phosphatase, affecting the level of phosphorylated CheY which controls the sense of flagella rotation and determines swimming behavior of chemotactic bacteria. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
35096 381125 cd17594 REC_OmpR_VirG phosphoacceptor receiver (REC) domain of VirG-like OmpR family response regulators. VirG is part of the VirA/VirG two-component system that regulates the expression of virulence (vir) genes. The histidine kinase VirA senses a phenolic wound response signal, undergoes autophosphorylation, and phosphorelays to the VirG response regulator, which induces transcription of the vir regulon. VirG belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 113
35097 381126 cd17595 REC_TrxB phosphoacceptor receiver (REC) domain of a fused response regulator with a thioredoxin reductase output domain. This family is composed of uncharacterized fusion proteins containing a REC domain and a thioredoxin reductase domain. Thioredoxin reductase catalyzes the reduction of thioredoxin and is thus a central component in the thioredoxin system. Fusion proteins containing REC and thioredoxin reductase domains could play an important role in the environmental regulation of the cellular dithiol-disulfide ratio. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 135
35098 381127 cd17596 REC_HupR phosphoacceptor receiver (REC) domain of hydrogen uptake protein regulator (HupR). Members of this subfamily are response regulator components of two-component systems that regulate hydrogenase activity, including HupR and HoxA. HupR is part of the HupT/HupR system that controls the synthesis of the membrane-bound [NiFe]hydrogenase, HupSL, of the photosynthetic bacterium Rhodobacter capsulatus. It belongs to the nitrogen regulatory protein C (NtrC) family of response regulators, which activate transcription by RNA polymerase (RNAP) in response to a change in the environment. HupR is an unusual member of this family as it activates transcription when unphosphorylated, and transcription is inhibited by phosphorylation. Proteins in this subfamily contain an N-terminal REC domain, a central sigma-54 interaction domain that lacks ATPase activity, and a C-terminal DNA-binding domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 133
35099 381128 cd17598 REC_hyHK phosphoacceptor receiver (REC) domain of uncharacterized hybrid sensor histidine kinase/response regulators. Typically, two-component regulatory systems (TCSs) consist of a sensor (histidine kinase) that responds to specific input(s) by modifying the output of a cognate response regulator (RR). TCSs allow organisms to sense and respond to changes in environmental conditions. Hybrid sensor histidine kinase/response regulators contain all the elements of a classical TCS in a single polypeptide chain. RRs share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35100 381129 cd17602 REC_PatA-like phosphoacceptor receiver (REC) domain of PatA and similar domains. Nostoc sp. (or Anabaena sp.) PatA is necessary for proper patterning of heterocysts along filaments. PatA contains a phosphoacceptor REC domain at its C-terminus and an N-terminal PATAN (PatA N-terminus) domain, which was proposed in a bioinformatics study to mediate protein-protein interactions. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. Some members of this group may have an inactive REC domain, lacking canonical metal-binding and active site residues. 102
35101 381130 cd17614 REC_OmpR_YycF-like phosphoacceptor receiver (REC) domain of YycF-like OmpR family response regulators. YycF appears to play an important role in cell wall integrity in a wide range of gram-positive bacteria, and may also modulate cell membrane integrity. It functions as part of a phosphotransfer system that ultimately controls the levels of competence within the bacteria. YycF belongs to the OmpR family of response regulators, which are characterized by a REC domain and a winged helix-turn-helix effector domain involved in DNA binding. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35102 381131 cd17615 REC_OmpR_MtPhoP-like phosphoacceptor receiver (REC) domain of MtPhoP-like OmpR family response regulators. Mycobacterium tuberculosis PhoP (MtPhoP) is part of the PhoP/PhoR two-component system that is involved in phosphate control by stimulating expression of genes involved in scavenging, transport and mobilization of phosphate, and repressing the utilization of nitrogen sources. Also included in this subfamily is Mycobacterium tuberculosis transcriptional regulatory protein TcrX, part of the two-component regulatory system TcrY/TcrX that may be involved in virulence. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35103 381132 cd17616 REC_OmpR_CtrA phosphoacceptor receiver (REC) domain of CtrA-like OmpR family response regulators. CtrA is part of the CckA-ChpT-CtrA phosphorelay that is conserved in alphaproteobacteria and is important in orchestrating the cell cycle, polar development, and flagellar biogenesis. CtrA is the master regulator of flagella synthesis genes and also regulates genes involved in the cell cycle, exopolysaccharide synthesis, and cyclic-di-GMP signaling. CtrA is active as a transcription factor when phosphorylated. It is a member of the OmpR family of DNA-binding response regulators, characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 114
35104 381133 cd17618 REC_OmpR_PhoB phosphoacceptor receiver (REC) domain of PhoB response regulator from the OmpR family. The transcription factor PhoB is a component of the PhoR/PhoB two-component system, a key regulatory protein network that facilitates response to inorganic phosphate (Pi) starvation conditions by turning on the phosphate (pho) regulon whose products are involved in phosphorus uptake and metabolism. PhoB is a member of the OmpR family of DNA-binding response regulators that contains REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
35105 381134 cd17619 REC_OmpR_ArcA_TorR-like phosphoacceptor receiver (REC) domain of ArcA- and TorR-like OmpR family response regulators. This subfamily includes Escherichia coli TorR and ArcA, both OmpR family response regulators that mediate adaptation to changes in various respiratory growth conditions. The TorS-TorR two-component system (TCS) is responsible for the tight regulation of the torCAD operon, which encodes the trimethylamine N-oxide (TMAO) reductase respiratory system in response to anaerobic conditions and the presence of TMAO. The ArcA-ArcB TCS is involved in cell growth during anaerobiosis. ArcA is a global regulator that controls more than 30 operons involved in redox regulation (the Arc modulon). OmpR family DNA-binding response regulators are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 113
35106 381135 cd17620 REC_OmpR_KdpE-like phosphoacceptor receiver (REC) domain of KdpE-like OmpR family response regulators. KdpE is a component of the KdpD/KdpE two-component system (TCS) and is activated when histidine kinase KdpD senses a drop in external K+ concentration or upshift in ionic osmolarity, resulting in the expression of a heterooligomeric transporter KdpFABC. In addition, the KdpD/KdpE TCS is also an adaptive regulator involved in the virulence and intracellular survival of pathogenic bacteria. KdpE is a member of the OmpR family of DNA-binding response regulators that contain REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 99
35107 381136 cd17621 REC_OmpR_RegX3-like phosphoacceptor receiver (REC) domain of RegX3-like OmpR family response regulators. RegX3 is a member of the SenX3-RegX3 two-component system that is involved in phosphate-sensing signal transduction. Phosphorylated RegX3 functions as a transcriptional activator of phoA. It induces transcription in phosphate limiting environment and also controls expression of several critical metabolic enzymes in aerobic condition. RegX3 belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 99
35108 381137 cd17622 REC_OmpR_kpRstA-like phosphoacceptor receiver (REC) domain of kpRstA-like OmpR family response regulators. Klebsiella pneumoniae RstA (kpRstA) is part of the RstA/RstB two-component regulatory system that may play a regulatory role in virulence. It belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
35109 381138 cd17623 REC_OmpR_CpxR phosphoacceptor receiver (REC) domain of CpxR-like OmpR family response regulators. CpxR is part of the CpxA/CpxR two-component regulatory system, which mediates envelope stress responses and is key for virulence and antibiotic resistance in several Gram negative pathogens. CpxR is a transcription factor/response regulator that controls the expression of numerous genes, including those of the classical porins OmpF and OmpC. It belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35110 381139 cd17624 REC_OmpR_PmrA-like phosphoacceptor receiver (REC) domain of PmrA-like OmpR family response regulators. This subfamily contains various OmpR family response regulators including PmrA, BasR, QseB, tctD, and RssB, which are components of two-component regulatory systems (TCSs). The PmrA/PmrB TCS controls transcription of genes that are involved in lipopolysaccharide modification in the outer membrane of bacteria, increasing bacterial resistance to host-derived antimicrobial peptides. The BasS/BasR TCS functions as an iron- and zinc-sensing transcription regulator. The QseB/QseC TCS activates the flagella regulon by activating transcription of FlhDC. The RssA/RssB TCS regulates swarming behavior in Serratia marcescens. OmpR family DNA-binding response regulators contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35111 381140 cd17625 REC_OmpR_DrrD-like phosphoacceptor receiver (REC) domain of DrrD-like OmpR family response regulators. DrrD is an OmpR/PhoB homolog from Thermotoga maritima whose function is not yet known. This subfamily also includes Streptococcus agalactiae transcriptional regulatory protein DltR, part of the DltS/DltR two-component system (TCS), and Pseudomonas aeruginosa transcriptional activator protein PfeR, part of the PfeR/PfeS TCS, which activates expression of the ferric enterobactin receptor. The DltS/DltR TCS regulates the expression of the dlt operon, which comprises four genes (dltA, dltB, dltC, and dltD) that catalyze the incorporation of D-alanine residues into the lipoteichoic acids. Members of this subfamily belong to the OmpR/PhoB family, whose members comprise two domains, an N-terminal receiver domain and a C-terminal DNA-binding winged helix-turn-helix effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35112 381141 cd17626 REC_OmpR_MtrA-like phosphoacceptor receiver (REC) domain of MtrA-like OmpR family response regulators. MtrA is part of MtrA/MtrB (or MtrAB), a highly conserved two-component system (TCS) implicated in the regulation of cell division in the actinobacteria. In unicellular Mycobacterium tuberculosis, MtrAB coordinates DNA replication with cell division and regulates the transcription of resuscitation-promoting factor B. In filamentous Streptomyces venezuelae, it links antibiotic production to sporulation. MtrA belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
35113 381142 cd17627 REC_OmpR_PrrA-like phosphoacceptor receiver (REC) domain of PrrA-like OmpR family response regulators. The Mycobacterium tuberculosis PrrA is part of the PrrA/PrrB two-component system (TCS) that has been implicated in early intracellular multiplication and is essential for viability. Also included in this subfamily is Mycobacterium tuberculosis MprA, part of the MprAB TCS that regulates EspR, a key regulator of the ESX-1 secretion system, and is required for establishment and maintenance of persistent infection in a tissue- and stage-specific fashion. PrrA and MprA belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
35114 341285 cd17630 OSB_MenE-like O-succinylbenzoic acid-CoA ligase. This family contains O-succinylbenzoyl-CoA (OSB-CoA) synthetase (also known as O-succinylbenzoic acid CoA ligase) that belongs to the ANL superfamily and catalyzes the ligation of CoA to o-succinylbenzoate (OSB). It includes MenE in the bacterial menaquinone biosynthesis pathway which is a promising target for the development of novel antibacterial agents. MenE catalyzes CoA ligation via an acyl-adenylate intermediate; tight-binding inhibitors of MenE based on stable acyl-sulfonyladenosine analogs of this intermediate provide a pathway toward the development of optimized MenE inhibitors. 325
35115 341286 cd17631 FACL_FadD13-like fatty acyl-CoA synthetase, including FadD13. This family contains fatty acyl-CoA synthetases, including Mycobacterium tuberculosis acid-induced operon MymA encoding the fatty acyl-CoA synthetase FadD13 which is essential for virulence and intracellular growth of the pathogen. The fatty acyl-CoA synthetase activates lipids before entering into the metabolic pathways and is also involved in transmembrane lipid transport. However, unlike soluble fatty acyl-CoA synthetases, but like the mammalian integral-membrane very-long-chain acyl-CoA synthetases, FadD13 accepts lipid substrates up to the maximum length of C26, and this is facilitated by an extensive hydrophobic tunnel from the active site to a positively charged patch. Also included is feruloyl-CoA synthetase (Fcs) in Rhodococcus strains where it is involved in biotechnological vanillin production from eugenol and ferulic acid via a non-beta-oxidative pathway. 435
35116 341287 cd17632 AFD_CAR-like adenylation domain of carboxylic acid reductase (CAR). This family contains the adenylation domain of carboxylic acid reductase enzymes (CARs), and performs an equivalent function to that of the ANL superfamily of adenylating enzymes. It takes a carboxylic acid substrate and ATP, and produces an AMP-acyl phosphoester intermediate, releasing pyrophosphate. Kinetic analysis using various substrates shows that this enzyme has a broad but similar substrate specificity, preferring electron-rich acids. This suggests that attack by the carboxylate on the alpha-phosphate of adenosine triphosphate (ATP) is the step that determines the substrate specificity and reaction kinetics. CAR is an important enzyme for use as a biocatalyst providing a regiospecific route to aldehydes from their respective carboxylic acids. 588
35117 341288 cd17633 AFD_YhfT-like fatty acid-CoA ligase VraA. This family of acyl-CoA ligases includes Bacillus subtilis YhfT, as well as long-chain fatty acid-CoA ligase VraA, all of which are as yet to be characterized. These proteins belong to the adenylate-forming enzymes which catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain. 320
35118 341289 cd17634 ACS-like acetate-CoA ligase. This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain. 587
35119 341290 cd17635 FADD10 adenylate forming domain, fatty acid CoA ligase (FadD10). This family contains long chain fatty acid CoA ligases, including FadD10 which is involved in the synthesis of a virulence-related lipopeptide. FadD10 is a fatty acyl-AMP ligase (FAAL) that transfers fatty acids to an acyl carrier protein. Structures of FadD10 in apo- and complexed form with dodecanoyl-AMP, show a novel open conformation, facilitated by its unique inter-domain and intermolecular interactions, which is critical for the enzyme to carry out the acyl transfer onto the acyl carrier protein (Rv0100) rather than coenzyme A. 340
35120 341291 cd17636 PtmA long-chain fatty acid CoA ligase (FadD). This family contains fatty acid CoA ligases, including acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II, most of which are yet to be characterized. Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 331
35121 341292 cd17637 ACLS-CaiC acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II. This family contains fatty acid CoA ligases, including acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II, most of which are yet to be characterized, but may be similar to Carnitine-CoA ligase (CaiC) which catalyzes the transfer of CoA to carnitine. Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. 333
35122 341293 cd17638 FadD3 acyl-CoA synthetase FadD3 and similar proteins. This family contains long chain fatty acid CoA ligases, including FadD3 which is an acyl-CoA synthetase that initiates catabolism of cholesterol rings C and D in actinobacteria. The cholesterol catabolic pathway occurs in most mycolic acid-containing actinobacteria, such as Rhodococcus jostii RHA1, and is critical for Mycobacterium tuberculosis (Mtb) during infection. FadD3 catalyzes the ATP-dependent CoA thioesterification of 3a-alpha-H-4alpha(3'-propanoate)-7a-beta-methylhexahydro-1,5-indanedione (HIP) to yield HIP-CoA. Hydroxylated analogs of HIP, 5alpha-OH HIP and 1beta-OH HIP, can also be used. 330
35123 341294 cd17639 LC_FACS_euk1 Eukaryotic long-chain fatty acid CoA synthetase (LC-FACS), including fungal proteins. The members of this family are eukaryotic fatty acid CoA synthetases (EC 6.2.1.3) that activate fatty acids with chain lengths of 12 to 20 and include fungal proteins. They act on a wide range of long-chain saturated and unsaturated fatty acids, but the enzymes from different tissues show some variation in specificity. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Organisms tend to have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. In Schizosaccharomyces pombe, lcf1 gene encodes a new fatty acyl-CoA synthetase that preferentially recognizes myristic acid as a substrate. 507
35124 341295 cd17640 LC_FACS_like Long-chain fatty acid CoA synthetase. This family includes long-chain fatty acid (C12-C20) CoA synthetases, including an Arabidopsis gene At4g14070 that plays a role in activation and elongation of exogenous fatty acids. FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Eukaryotes generally have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 468
35125 341296 cd17641 LC_FACS_bac1 bacterial long-chain fatty acid CoA synthetase. The members of this family are bacterial long-chain fatty acid CoA synthetases, most of which are as yet uncharacterized. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions. 569
35126 341297 cd17642 Firefly_Luc insect luciferase, similar to plant 4-coumarate: CoA ligases. This family contains insect firefly luciferases that share significant sequence similarity to plant 4-coumarate:coenzyme A ligases, despite their functional diversity. Luciferase catalyzes the production of light in the presence of MgATP, molecular oxygen, and luciferin. In the first step, luciferin is activated by acylation of its carboxylate group with ATP, resulting in an enzyme-bound luciferyl adenylate. In the second step, luciferyl adenylate reacts with molecular oxygen, producing an enzyme-bound excited state product (Luc=O*) and releasing AMP. This excited-state product then decays to the ground state (Luc=O), emitting a quantum of visible light. 532
35127 341298 cd17643 A_NRPS_Cytc1-like similar to adenylation domain of cytotrienin synthetase CytC1. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes Streptomyces sp. cytotrienin synthetase (CytC1), a relatively promiscuous adenylation enzyme that installs the aminoacyl moieties on the phosphopantetheinyl arm of the holo carrier protein CytC2. Also included are Streptomyces sp. Thr1, involved in the biosynthesis of 4-chlorothreonine, Pseudomonas aeruginosa pyoverdine synthetase D (PvdD), involved in the biosynthesis of the siderophore pyoverdine and Pseudomonas syringae syringopeptin synthetase, where syringopeptin is a necrosis-inducing phytotoxin that functions as a virulence determinant in the plant-pathogen interaction. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 450
35128 341299 cd17644 A_NRPS_ApnA-like similar to adenylation domain of anabaenopeptin synthetase (ApnA). This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes Planktothrix agardhii anabaenopeptin synthetase (ApnA A1), which is capable of activating two chemically distinct amino acids (Arg and Tyr). Structural studies show that the architecture of the active site forces Arg to adopt a Tyr-like conformation, thus explaining the bispecificity. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 465
35129 341300 cd17645 A_NRPS_LgrA-like adenylation (A) domain of linear gramicidin synthetase (LgrA) and similar proteins. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes linear gramicidin synthetase (LgrA) in Brevibacillus brevis. LgrA has a formylation domain fused to the N-terminal end that formylates its substrate for linear gramicidin synthesis to proceed. This formyl group is essential for the clinically important antibacterial activity of gramicidin by enabling head-to-head gramicidin dimers to make a beta-helical pore in gram-positive bacterial membranes, allowing free passage of monovalent cations, destroying the ion gradient and killing bacteria. This family also includes bacitracin synthetase 1 (known as ATP-dependent cysteine adenylase or BA1); it activates cysteine, incorporates two D-amino acids, releases and cyclizes the mature bacitracin, an antibiotic that is a mixture of related cyclic peptides that disrupt gram positive bacteria by interfering with cell wall and peptidoglycan synthesis. Also included is surfactin synthetase which activates and polymerizes the amino acids Leu, Glu, Asp, and Val to form the antibiotic surfactin. 440
35130 341301 cd17646 A_NRPS_AB3403-like Peptide Synthetase. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 488
35131 341302 cd17647 A_NRPS_alphaAR Alpha-aminoadipate reductase. This family contains L-2-aminoadipate reductase, also known as alpha-aminoadipate reductase (EC 1.2.1.95) or alpha-AR or L-aminoadipate-semialdehyde dehydrogenase (EC 1.2.1.31), which catalyzes the activation of alpha-aminoadipate by ATP-dependent adenylation and the reduction of activated alpha-aminoadipate by NADPH. The activated alpha-aminoadipate is bound to the phosphopantetheinyl group of the enzyme itself before it is reduced to (S)-2-amino-6-oxohexanoate. 520
35132 341303 cd17648 A_NRPS_ACVS-like N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase. This family contains ACV synthetase (ACVS, EC 6.3.2.26; also known as N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase or delta-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase), which is involved in medically important antibiotic biosynthesis. ACV synthetase is active in an early step in the penicillin G biosynthesis pathway which involves the formation of the tripeptide 6-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine (ACV); each of the constituent amino acids of the tripeptide ACV is activated as an aminoacyl-adenylate, with peptide bonds formed through the participation of amino acid thioester intermediates. ACV is then cyclized by the action of isopenicillin N synthase. 453
35133 341304 cd17649 A_NRPS_PvdJ-like non-ribosomal peptide synthetase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes pyoverdine biosynthesis protein PvdJ involved in the synthesis of pyoverdine, which consists of a chromophore group attached to a variable peptide chain and comprises around 6-12 amino acids that are specific for each Pseudomonas species, and for which the peptide might be first synthesized before the chromophore assembly. Also included is ornibactin biosynthesis protein OrbI; ornibactin is a tetrapeptide siderophore with an l-ornithine-d-hydroxyaspartate-l-serine-l-ornithine backbone. The adenylation domain at the N-terminal of OrbI possibly initiates the ornibactin with the binding of N5-hydroxyornithine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 450
35134 341305 cd17650 A_NRPS_PpsD_like similar to adenylation domain of plipastatin synthase (PpsD). This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes bacitracin synthetase 1 (BacA) in Bacillus licheniformis, tyrocidine synthetase in Brevibacillus brevis, plipastatin synthase (PpsD, an important antifungal protein) in Bacillus subtilis and mannopeptimycin peptide synthetase (MppB) in Streptomyces hygroscopicus. Plipastatin has strong fungitoxic activity and is involved in inhibition of phospholipase A2 and biofilm formation. Bacitracin, a mixture of related cyclic peptides, is used as a polypeptide antibiotic while function of tyrocidine is thought to be regulation of sporulation. MppB is involved in biosynthetic pathway of mannopeptimycin, a novel class of mannosylated lipoglycopeptides. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 447
35135 341306 cd17651 A_NRPS_VisG_like similar to adenylation domain of virginiamycin S synthetase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes virginiamycin S synthetase (VisG) in Streptomyces virginiae; VisG is involved in virginiamycin S (VS) biosynthesis as the provider of an L-pheGly molecule, a highly specific substrate for the last condensation step by VisF. This family also includes linear gramicidin synthetase B (LgrB) in Brevibacillus brevis. Substrate specificity analysis using residues of the substrate-binding pockets of all 16 adenylation domains has shown good agreement of the substrate amino acids predicted with the sequence of linear gramicidin. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 491
35136 341307 cd17652 A_NRPS_CmdD_like similar to adenylation domain of chondramide synthase cmdD. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes phosphinothricin tripeptide (PTT, phosphinothricylalanylalanine) synthetase, where PTT is a natural-product antibiotic and potent herbicide that is produced by Streptomyces hygroscopicus. This adenylation domain has been confirmed to directly activate beta-tyrosine, and fluorinated chondramides are produced through precursor-directed biosynthesis. Also included in this family is chondramide synthase D (also known as ATP-dependent phenylalanine adenylase or phenylalanine activase or tyrosine activase). Chondramides A-D, depsipeptide antitumor and antifungal antibiotics produced by C. crocatus, are a class of mixed peptide/polyketide depsipeptides composed of three amino acids (alanine, N-methyltryptophan, plus the unusual amino acid beta-tyrosine or alpha-methoxy-beta-tyrosine) and a polyketide chain ([E]-7-hydroxy-2,4,6-trimethyloct-4-enoic acid). 436
35137 341308 cd17653 A_NRPS_GliP_like nonribosomal peptide synthase GliP-like. This family includes the adenylation (A) domain of nonribosomal peptide synthases (NRPS) gliotoxin biosynthesis protein P (GliP), thioclapurine biosynthesis protein P (tcpP) and Sirodesmin biosynthesis protein P (SirP). In the filamentous fungus Aspergillus fumigatus, NRPS GliP is involved in the biosynthesis of gliotoxin, which is initiated by the condensation of serine and phenylalanine. Studies show that GliP is not required for invasive aspergillosis, suggesting that the principal targets of gliotoxin are neutrophils or other phagocytes. SirP is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). In the fungus Claviceps purpurea, NRPS tcpP catalyzes condensation of tyrosine and glycine, part of biosynthesis of an unusual class of epipolythiodioxopiperazines (ETPs) that lacks the reactive thiol group for toxicity. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 433
35138 341309 cd17654 A_NRPS_acs4 acyl-CoA synthetase family member 4. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) contains acyl-CoA synthetase family member 4, also known as 2-aminoadipic 6-semialdehyde dehydrogenase or aminoadipate-semialdehyde dehydrogenase, most of which are uncharacterized. Acyl-CoA synthetase catalyzes the initial reaction in fatty acid metabolism, by forming a thioester with CoA. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 449
35139 341310 cd17655 A_NRPS_Bac bacitracin synthetase and related proteins. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes bacitracin synthetases 1, 2, and 3 (BA1, also known as ATP-dependent cysteine adenylase or cysteine activase, BA2, also known as ATP-dependent lysine adenylase or lysine activase, and BA3, also known as ATP-dependent isoleucine adenylase or isoleucine activase) in Bacilli. Bacitracin is a mixture of related cyclic peptides used as a polypeptide antibiotic. This family also includes gramicidin synthetase 1 involved in synthesis of the cyclic peptide antibiotic gramicidin S via activation of phenylalanine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 490
35140 341311 cd17656 A_NRPS_ProA gramicidin S synthase 2, also known as ATP-dependent proline adenylase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) contains gramicidin S synthase 2 (also known as ATP-dependent proline adenylase or proline activase or ProA). ProA is a multifunctional enzyme involved in synthesis of the cyclic peptide antibiotic gramicidin S and able to activate and polymerize the amino acids proline, valine, ornithine and leucine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. 479
35141 350495 cd17657 CDC14_N N-terminal pseudophosphatase domain of CDC14 family proteins. The cell division control protein 14 (CDC14) family is highly conserved in all eukaryotes, although the roles of its members seem to have diverged during evolution. Yeast Cdc14, the best characterized member of this family, is a dual-specificity phosphatase that plays key roles in cell cycle control. It preferentially dephosphorylates cyclin-dependent kinase (CDK) targets, which makes it the main antagonist of CDK in the cell. Cdc14 functions at the end of mitosis and it triggers the events that completely eliminate the activity of CDK and other mitotic kinases. It is also involved in coordinating the nuclear division cycle with cytokinesis through the cytokinesis checkpoint, and in chromosome segregation. Cdc14 phosphatases also function in DNA replication, DNA damage checkpoint, and DNA repair. Vertebrates may contain more than one Cdc14 homolog; humans have three (CDC14A, CDC14B, and CDC14C). CDC14 family proteins contain a highly conserved N-terminal pseudophosphatase domain that contributes to substrate specificity and a C-terminal catalytic dual-specificity phosphatase domain with the PTP signature motif. The N-terminal pseudophosphatase domain lacks the catalytic residues. 144
35142 350496 cd17658 PTPc_plant_PTP1 protein tyrosine phosphatase 1 from Arabidopsis thaliana and similar plant PTPs. Arabidopsis thaliana protein tyrosine phosphatase 1 (AtPTP1) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. AtPTP1 dephosphorylates and inhibits MAP kinase 6 (MPK6) in non-oxidative stress conditions. Together with MAP kinase phosphatase 1 (MKP1) it represses salicylic acid (SA) and camalexin biosynthesis, thereby modulating the defense response. 206
35143 350497 cd17659 PTP_paladin_1 protein tyrosine phosphatase-like domain of paladin, repeat 1. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two tyrosine-protein phosphatase domains. This model represents repeat 1. 220
35144 350498 cd17660 PTP_paladin_2 protein tyrosine phosphatase-like domain of paladin, repeat 2. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two tyrosine-protein phosphatase domains. This model represents repeat 2. 216
35145 350499 cd17661 PFA-DSP_Oca2 atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 2. Oxidant-induced cell-cycle arrest protein 2 (Oca2) is an atypical dual specificity phosphatase of unknown function. It has been identified as a putative negative regulator acting on cell wall integrity and mating MAPK pathways in yeast. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca2 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif. 146
35146 350500 cd17662 PFA-DSP_Oca4 atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 4. Oxidant-induced cell-cycle arrest protein 4 (Oca4) is an atypical dual specificity phosphatase of unknown function. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca4 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif. 177
35147 350501 cd17663 PFA-DSP_Oca6 atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 6. Oxidant-induced cell-cycle arrest protein 6 (Oca6) is an atypical dual specificity phosphatase of unknown function. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca6 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif. 162
35148 350502 cd17664 Mce1_N N-terminal triphosphatase domain of mRNA capping enzyme. mRNA capping enzyme, also known as RNA guanylyltransferase and 5'-phosphatase (RNGTT) or mammalian mRNA capping enzyme (Mce1) in mammals, is a bifunctional enzyme that catalyzes the first two steps of cap formation: (1) by removing the gamma-phosphate from the 5'-triphosphate end of nascent mRNA to yield a diphosphate end using the polynucleotide 5'-phosphatase activity (EC 3.1.3.33) of the N-terminal triphosphatase domain; and (2) by transferring the GMP moiety of GTP to the 5'-diphosphate terminus through the C-terminal mRNA guanylyltransferase domain (EC 2.7.7.50). The enzyme is also referred to as CEL-1 in Caenorhabditis elegans. 167
35149 350503 cd17665 DSP_DUSP11 dual-specificity phosphatase domain of dual specificity protein phosphatase 11 and similar proteins. Dual specificity protein phosphatase 11 (DUSP11), also known as RNA/RNP complex-1-interacting phosphatase or phosphatase that interacts with RNA/RNP complex 1 (PIR1), has RNA 5'-triphosphatase and diphosphatase activity, but only poor protein-tyrosine phosphatase activity. It has activity for short RNAs but is less active toward mononucleotide triphosphates, suggesting that its primary function in vivo is to dephosphorylate RNA 5'-ends. It may play a role in nuclear mRNA metabolism. Also included in this subfamily is baculovirus RNA 5'-triphosphatase for Autographa californica nuclear polyhedrosis virus. 169
35150 350504 cd17666 PTP-MTM-like_fungal protein tyrosine phosphatase-like domain of fungal myotubularins. Myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Not all members are catalytically active proteins; some function as adaptors for the active members. 229
35151 350505 cd17667 R-PTPc-G-1 catalytic domain of receptor-type tyrosine-protein phosphatase G, repeat 1. Receptor-type tyrosine-protein phosphatase G (PTPRG), also called protein-tyrosine phosphatase gamma (R-PTP-gamma), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRG is an important tumor suppressor gene in multiple human cancers such as lung, ovarian, and breast cancers. It is widely expressed in many tissues, including the central nervous system, where it plays a role during neuroinflammation processes. It can dephosphorylate platelet-derived growth factor receptor beta (PDGFRB) and may play a role in PDGFRB-related infantile myofibromatosis. PTPRG has four splicing isoforms: three transmembrane isoforms, PTPRG-A, B, and C, and one secretory isoform, PTPRG-S, which are expressed in many tissues including the brain. PTPRG is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1). 274
35152 350506 cd17668 R-PTPc-Z-1 catalytic domain of receptor-type tyrosine-protein phosphatase Z, repeat 1. Receptor-type tyrosine-protein phosphatase Z (PTPRZ), also called receptor-type tyrosine-protein phosphatase zeta (R-PTP-zeta), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Three isoforms are generated by alternative splicing from a single PTPRZ gene: two transmembrane isoforms, PTPRZ-A and PTPRZ-B, and one secretory isoform, PTPRZ-S (also known as phosphacan); all are preferentially expressed in the central nervous system (CNS) as chondroitin sulfate (CS) proteoglycans. PTPRZ isoforms play important roles in maintaining oligodendrocyte precursor cells in an undifferentiated state. PTPRZ is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1). 209
35153 350507 cd17669 R-PTP-Z-2 catalytic domain of receptor-type tyrosine-protein phosphatase Z, repeat 2. Receptor-type tyrosine-protein phosphatase Z (PTPRZ), also called receptor-type tyrosine-protein phosphatase zeta (R-PTP-zeta), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Three isoforms are generated by alternative splicing from a single PTPRZ gene: two transmembrane isoforms, PTPRZ-A and PTPRZ-B, and one secretory isoform, PTPRZ-S (also known as phosphacan); all are preferentially expressed in the central nervous system (CNS) as chondroitin sulfate (CS) proteoglycans. PTPRZ isoforms play important roles in maintaining oligodendrocyte precursor cells in an undifferentiated state. PTPRZ is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2). 204
35154 350508 cd17670 R-PTP-G-2 PTP-like domain of receptor-type tyrosine-protein phosphatase G, repeat 2. Receptor-type tyrosine-protein phosphatase G (PTPRG), also called protein-tyrosine phosphatase gamma (R-PTP-gamma), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRG is an important tumor suppressor gene in multiple human cancers such as lung, ovarian, and breast cancers. It is widely expressed in many tissues, including the central nervous system, where it plays a role during neuroinflammation processes. It can dephosphorylate platelet-derived growth factor receptor beta (PDGFRB) and may play a role in PDGFRB-related infantile myofibromatosis. PTPRG is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2). 205
35155 349491 cd17672 MDM2 p53-binding domain found in E3 ubiquitin-protein ligase MDM2 and similar proteins. MDM2, also known as double minute 2 protein (Hdm2), or oncoprotein MDM2, or p53-binding protein, exerts its oncogenic activity predominantly by binding the p53 tumor suppressor and blocking its transcriptional activity. It forms homo-oligomers and displays E3 ubiquitin ligase activity, catalyzing the attachment of ubiquitin to p53 as an essential step in the regulation of its expression levels in cells. Moreover, in response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11, and L23. MDM2 also has a p53-independent role in tumorigenesis and cell growth regulation. In addition, it binds interferon (IFN) regulatory factor-2 (IRF-2), an IFN-regulated transcription factor, and mediates its ubiquitination. MDM2 contains an N-terminal p53-binding domain and a C-terminal zinc RING-finger domain conferring E3 ligase activity that is required for ubiquitination and nuclear export of p53. It is also responsible for the hetero-oligomerization of MDM2, which is crucial for the suppression of P53 activity during embryonic development, and the recruitment of E2 ubiquitin-conjugating enzymes. MDM2 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain, as well as a nuclear localization signal (NLS) and a nuclear export signal (NES), near the central acidic region. 83
35156 349492 cd17673 MDM4 p53-binding domain found in MDM4 and similar proteins. MDM4, also known as double minute 4 protein, MDM2-like p53-binding protein, protein MDMX, HDMX, or p53-binding protein MDM4, exerts its oncogenic activity predominantly by binding the p53 tumor suppressor and blocking its transcriptional activity. MDM4 is phosphorylated and destabilized in response to DNA damage stress. It can also be specifically dephosphorylated through directly interacting with protein phosphatase 1 (PP1), which may increase its stability and thus inhibit p53 activity. MDM4 also has a p53-independent role in tumorigenesis and cell growth regulation. MDM4 contains an N-terminal p53-binding domain and a C-terminal zinc RING-finger domain responsible for its hetero-oligomerization, which is crucial for the suppression of P53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. MDM4 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region. 79
35157 349493 cd17674 SWIB_BAF60A SWIB domain found in BRG1-associated factor 60A (BAF60A) and similar proteins. BAF60A, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 1 (SMARCD1), or 60 kDa BRG-1/Brm-associated factor subunit A, or SWI/SNF complex 60 kDa subunit, is a core subunit of the SWI/SNF chromatin-remodeling complex that activates the transcription of fatty acid oxidation genes during fasting. BAF60A is involved in chromatin remodeling and hepatic lipid metabolism. It mediates critical interactions between nuclear receptors and the BRG1 chromatin-remodeling complex for transactivation. It is also a key component of the transcriptional control in cardiac progenitors. Moreover, BAF60A interacts with p53 to recruit the SWI/SNF complex, suggesting that the SWI/SNF chromatin remodeling complex is involved in the suppression of tumors. 77
35158 349494 cd17675 SWIB_BAF60B SWIB domain found in BRG1-associated factor 60B (BAF60B) and similar proteins. BAF60B, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 2 (SMARCD2), or 60 kDa BRG-1/Brm-associated factor subunit B, is a component of the BAF complex. It is involved in transcriptional activation and repression of select genes by chromatin remodeling. It plays a role in the ATM-p53 pathway in sensing chromatin opening by facilitating ATM recruitment to the SWI/SNF complex, as well as ATM activation. It also regulates transcriptional networks controlling differentiation of neutrophil granulocytes. Thus, it acts as a key factor controlling myelopoiesis and is a potential tumor suppressor in leukemia. 80
35159 349495 cd17676 SWIB_BAF60C SWIB domain found in BRG1-associated factor 60C (BAF60C) and similar proteins. BAF60C, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 3 (SMARCD3), or 60 kDa BRG-1/Brm-associated factor subunit C, is a core subunit of the SWI/SNF chromatin-remodeling complex that activates the transcription of fatty acid oxidation genes during fasting. It is involved in chromatin remodeling and hepatic lipid metabolism. It is also essential for cardiomyocyte differentiation at the early heart development. Moreover, BAF60C drives glycolytic metabolism in the muscle and improves systemic glucose homeostasis through Deptor-mediated Akt activation. Furthermore, BAF60C epigenetically regulates epithelial-mesenchymal transition (EMT) by activating WNT signaling pathways. 74
35160 350658 cd17706 MCM MCM helicase family. MCM helicases are a family of helicases that play an important role in replication and homologous recombination repair. The heterohexameric ring-shaped Mcm2-7 complex is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. Mcm8 and Mcm9 form a complex required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs). 311
35161 349340 cd17707 BRCT_XRCC1_rpt2 Second (C-terminal) BRCT domain in X-ray repair cross-complementing protein 1 (XRCC1) and similar proteins. XRCC1 is a DNA repair protein that corrects defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents. It forms homodimers and interacts with polynucleotide kinase (PNK), DNA polymerase-beta (POLB), DNA ligase III (LIG3), APTX, APLF, and APEX1. XRCC1 contains an N-terminal XRCC1-specific domain and two BRCT domains. This model corresponds to the second BRCT domain. 94
35162 349341 cd17709 BRCT_pescadillo_like BRCT domain of pescadillo and related proteins. Pescadillo has been characterized in zebrafish as a protein involved in the control of cell proliferation, specifically in the developing embryo. Mammalian homologs have been linked to ribosome biogenesis and nucleologenesis, and yeast homologs have been shown to be required for synthesis of the 60S ribosomal subunit. Pescadillo contains a BRCT domain. 86
35163 349342 cd17710 BRCT_PAXIP1_rpt2 second BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the second BRCT domain. 81
35164 349343 cd17711 BRCT_PAXIP1_rpt3 third BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the third BRCT domain. 81
35165 349344 cd17712 BRCT_PAXIP1_rpt5 fifth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the fifth BRCT domain. 75
35166 349345 cd17713 BRCT_polymerase_mu_like BRCT domain of DNA-directed DNA/RNA polymerase mu (polymerase mu), DNA nucleotidylexotransferase and similar proteins. The family includes DNA-directed DNA/RNA polymerase mu (polymerase mu) and DNA nucleotidylexotransferase. Polymerase mu (EC 2.7.7.7), also termed Pol mu, or terminal transferase, is a gap-filling polymerase involved in repair of DNA double-strand breaks by non-homologous end joining (NHEJ). It participates in immunoglobulin (Ig) light chain gene rearrangement in V(D)J recombination. DNA nucleotidylexotransferase (EC 2.7.7.31), also termed terminal addition enzyme, or terminal deoxynucleotidyltransferase, or terminal transferase, is a template-independent DNA polymerase which catalyzes the random addition of deoxynucleoside 5'-triphosphate to the 3'-end of a DNA initiator. It adds nucleotides at the junction (N region) of rearranged Ig heavy chain and T-cell receptor gene segments during the maturation of B- and T-cells. All family members contain a BRCT domain. 87
35167 349346 cd17714 BRCT_PAXIP1_rpt1 first (N-terminal) BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the first BRCT domain. 76
35168 349347 cd17715 BRCT_polymerase_lambda BRCT domain of DNA polymerase lambda and similar proteins. DNA polymerase lambda, also termed Pol Lambda, or DNA polymerase beta-2 (Pol beta2), or DNA polymerase kappa, is involved in base excision repair (BER) and is responsible for repair of lesions that give rise to abasic (AP) sites in DNA. It also contributes to DNA double-strand break repair by non-homologous end joining and homologous recombination. DNA polymerase lambda has both template-dependent and template-independent (terminal transferase) DNA polymerase activities, as well as a 5'-deoxyribose-5-phosphate lyase (dRP lyase) activity. DNA polymerase lambda contains one BRCT domain. 80
35169 349348 cd17716 BRCT_microcephalin_rpt1 first (N-terminal) BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the first repeat. 78
35170 349349 cd17717 BRCT_DNA_ligase_IV_rpt2 second BRCT domain of DNA ligase 4 (LIG4) and similar proteins. LIG4 (EC 6.5.1.1), also termed DNA ligase IV, or polydeoxyribonucleotide synthase [ATP] 4, is involved in DNA non-homologous end joining (NHEJ) required for double-strand break repair and V(D)J recombination. It is a component of the LIG4-XRCC4 complex that is responsible for the NHEJ ligation step. LIG4 contains two BRCT domains. The family corresponds to the second one. 88
35171 349350 cd17718 BRCT_TopBP1_rpt3 third BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the third BRCT domain. 83
35172 349351 cd17719 BRCT_Rev1 BRCT domain of DNA repair protein Rev1 and similar proteins. REV1, also termed alpha integrin-binding protein 80, or AIBP80, or Rev1-like terminal deoxycytidyl transferase, is a DNA template-dependent dCMP transferase required for mutagenesis induced by UV light. 87
35173 349352 cd17720 BRCT_Bard1_rpt2 second (C-terminal) BRCT domain of BRCA1-associated RING domain protein 1 (Bard1) and similar proteins. Bard1, also termed BARD-1, or RING-type E3 ubiquitin transferase BARD1, is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains an N-terminal C3HC4-type RING-HC finger that binds BRCA1, and a C-terminal region with three ankyrin repeats and tandem BRCT domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage. The family corresponds to the second BRCT domain. 101
35174 349353 cd17721 BRCT_BRCA1_rpt2 second (C-terminal) BRCT domain of breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also termed RING finger protein 53 (RNF53), is a RING finger protein encoded by BRCA1, a tumor suppressor gene that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling, and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT repeats at the C-terminus. The family corresponds to the second BRCT domain. 98
35175 349354 cd17722 BRCT_DNA_ligase_IV_rpt1 first BRCT domain of DNA ligase 4 (LIG4) and similar proteins. LIG4 (EC 6.5.1.1), also termed DNA ligase IV, or polydeoxyribonucleotide synthase [ATP] 4, is involved in DNA non-homologous end joining (NHEJ) required for double-strand break repair and V(D)J recombination. It is a component of the LIG4-XRCC4 complex that is responsible for the NHEJ ligation step. LIG4 contains two BRCT domains. The family corresponds to the first one. 90
35176 349355 cd17723 BRCT_Rad4_rpt4 fourth BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system, which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the fourth one. 74
35177 349356 cd17724 BRCT_p53bp1_rpt2 Second (C-terminal) BRCT domain in p53-binding protein 1 (p53BP1) and similar proteins. p53BP1, also termed 53BP1, or TP53-binding protein 1 (TP53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics, and class-switch recombination (CSR) during antibody genesis. TP53BP1 contains two tandem BRCT repeats. This family corresponds to the second BRCT domain. 87
35178 349357 cd17725 BRCT_XRCC1_rpt1 First (central) BRCT domain in X-ray repair cross-complementing protein 1 (XRCC1) and similar proteins. XRCC1 is a DNA repair protein that corrects defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents. It forms homodimers and interacts with polynucleotide kinase (PNK), DNA polymerase-beta (POLB), DNA ligase III (LIG3), APTX, APLF, and APEX1. XRCC1 contains an N-terminal XRCC1-specific domain and two BRCT domains. This family corresponds to the first one. 80
35179 349358 cd17726 BRCT_PARP4_like BRCT domain of poly [ADP-ribose] polymerase 4 (PARP-4) and similar proteins. PARP-4, also termed 193 kDa vault protein, or ADP-ribosyltransferase diphtheria toxin-like 4 (ARTD4), or PARP-related/IalphaI-related H5/proline-rich (PH5P), or vault poly(ADP-ribose) polymerase (VPARP), shows poly(ADP-ribosyl)ation activity that catalyzes the formation of ADP-ribose polymers in response to DNA damage. PARP-4 is a component of the vault ribonucleoprotein particle, at least composed of MVP, PARP4 and one or more vault RNAs (vRNAs). The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group. 85
35180 349359 cd17727 BRCT_TopBP1_rpt6 sixth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the sixth BRCT domain. 75
35181 349360 cd17728 BRCT_TopBP1_rpt8 eighth (C-terminal) BRCT domain of DNA topoisomerase 2-binding protein 1. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the eighth BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group. 80
35182 349361 cd17729 BRCT_CTDP1 BRCT domain of RNA polymerase II subunit A C-terminal domain phosphatase (CTDP1) and similar proteins. CTDP1 (EC 3.1.3.16), also termed TFIIF-associating CTD phosphatase, or TFIIF-associating RNA polymerase C-terminal domain phosphatase (FCP1), promotes the activity of RNA polymerase II through processively dephosphorylating 'Ser-2' and 'Ser-5' of the heptad repeats YSPTSPS in the C-terminal domain of the largest RNA polymerase II subunit. It plays a role in the exit from mitosis by dephosphorylating crucial mitotic substrates (USP44, CDC20 and WEE1) that are required for M-phase-promoting factor (MPF)/CDK1 inactivation. 97
35183 349362 cd17730 BRCT_PAXIP1_rpt4 fourth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the fourth BRCT domain. 73
35184 349363 cd17731 BRCT_TopBP1_rpt2_like second BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the second BRCT domain. 77
35185 349364 cd17732 BRCT_Ect2_rpt2 second BRCT domain of epithelial cell-transforming sequence 2 protein (ECT2) and similar proteins. ECT2 is a guanine nucleotide exchange factor (GEF) for Rho GTPases, phosphorylated in G2/M phases, and is involved in the regulation of cytokinesis. It contains two tandem BRCT domains. The family corresponds to the second BRCT domain. 80
35186 349365 cd17733 BRCT_Ect2_rpt1 first BRCT domain of epithelial cell-transforming sequence 2 protein (ECT2) and similar proteins. ECT2 is a guanine nucleotide exchange factor (GEF) for Rho GTPases, phosphorylated in G2/M phases, and is involved in the regulation of cytokinesis. It contains two tandem BRCT domains. The family corresponds to the first BRCT domain. 76
35187 349366 cd17734 BRCT_Bard1_rpt1 first BRCT domain of BRCA1-associated RING domain protein 1 (Bard1) and similar proteins. Bard1, also termed BARD-1, or RING-type E3 ubiquitin transferase BARD1, is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains an N-terminal C3HC4-type RING-HC finger that binds BRCA1, and a C-terminal region with three ankyrin repeats and tandem BRCT domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage. The family corresponds to the first BRCT domain. 80
35188 349367 cd17735 BRCT_BRCA1_rpt1 first BRCT domain of breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also termed RING finger protein 53 (RNF53), is a RING finger protein encoded by BRCA1, a tumor suppressor gene that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling, and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT (BRCA1 C-terminus domain) repeats at the C-terminus. The family corresponds to the first BRCT domain. 97
35189 349368 cd17736 BRCT_microcephalin_rpt2 second BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the second repeat. 76
35190 349369 cd17737 BRCT_TopBP1_rpt1 first BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the first BRCT domain. 72
35191 349370 cd17738 BRCT_TopBP1_rpt7 seventh BRCT domain of DNA topoisomerase 2-binding protein 1. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the seventh BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is missing in this group. 75
35192 349371 cd17740 BRCT_Rad4_rpt1 first BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples the S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the first one. 82
35193 349372 cd17741 BRCT_nibrin BRCT domain of nibrin and similar proteins. Nibrin (NBN), also termed Nijmegen breakage syndrome protein 1 (NBS1), or cell cycle regulatory protein p95, is a novel DNA double-strand break repair protein that is mutated in Nijmegen breakage syndrome. It is a component of the MRE11-RAD50-NBN (MRN complex) which plays a critical role in the cellular response to DNA damage and the maintenance of chromosome integrity. The BRCT (Breast Cancer Suppressor Protein BRCA1, carboxy-terminal) domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is absent in this group. 74
35194 349373 cd17742 BRCT_CHS5_like BRCT domain of yeast chitin biosynthesis protein CHS5 and similar proteins. CHS5, also termed protein CAL3, is a component of the CHS5/6 complex which mediates export of specific cargo proteins, including chitin synthase CHS3. It is also involved in targeting FUS1 to sites of polarized growth. 77
35195 349374 cd17743 BRCT_BRC1_like_rpt5 fifth BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members in this family contain six BRCT domains. This family corresponds to the fifth one. 70
35196 349375 cd17744 BRCT_MDC1_rpt1 first BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the first BRCT domain. 72
35197 349376 cd17745 BRCT_p53bp1_rpt1 first (central) BRCT domain in p53-binding protein 1 (p53BP1) and similar proteins. p53BP1, also termed 53BP1, or TP53-binding protein 1 (TP53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics and class-switch recombination (CSR) during antibody genesis. TP53BP1 contains two tandem BRCT repeats. This family also includes Schizosaccharomyces pombe Crb2, which is a checkpoint mediator required for the cellular response to DNA damage. This model corresponds to the first BRCT domain. 99
35198 349377 cd17746 BRCT_Rad4_rpt2 second BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the second one. 91
35199 349378 cd17747 BRCT_PARP1 BRCT domain of poly [ADP-ribose] polymerase 1 (PARP-1) and similar proteins. PARP-1 (EC 2.4.2.30), also termed ADP-ribosyltransferase diphtheria toxin-like 1 (ARTD1), or NAD(+) ADP-ribosyltransferase 1 (ADPRT 1), or poly[ADP-ribose] synthase 1, is involved in the base excision repair (BER) pathway, by catalyzing the poly(ADP-ribosyl)ation of a limited number of acceptor proteins involved in chromatin architecture and in DNA metabolism. 76
35200 349379 cd17748 BRCT_DNA_ligase_like BRCT domain of bacterial NAD-dependent DNA ligase (LigA) and similar proteins. LigA, also called NAD(+)-dependent polydeoxyribonucleotide synthase, catalyzes the formation of phosphodiester linkages between 5'-phosphoryl and 3'-hydroxyl groups in double-stranded DNA using NAD as a coenzyme and as the energy source for the reaction. It is essential for DNA replication and repair of damaged DNA. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family. 76
35201 349380 cd17749 BRCT_TopBP1_rpt4 fourth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also called DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the fourth BRCT domain. 84
35202 349381 cd17750 BRCT_SLF1 BRCT domain of SMC5-SMC6 complex localization factor protein 1 (SLF1) and similar proteins. SLF1, also termed Smc5/6 localization factor 1, or ankyrin repeat domain-containing protein 32 (ANKRD32), or BRCT domain-containing protein 1 (BRCTD1), plays a role in the DNA damage response (DDR) pathway by regulating post replication repair of UV-damaged DNA and genomic stability maintenance. It is a component of the SLF1-SLF2 complex that acts to link RAD18 with the SMC5-SMC6 complex at replication-coupled interstrand cross-links (ICL) and DNA double-strand break (DSB) sites on chromatin during DNA repair in response to stalled replication forks. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is missing in this group. 81
35203 349382 cd17751 BRCT_microcephalin_rpt3 third BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the third repeat. 75
35204 349383 cd17752 BRCT_RFC1 BRCT domain of replication factor C subunit 1 (RFC1) and similar proteins. RFC1, also termed activator 1 140 kDa subunit, or A1 140 kDa subunit, or activator 1 large subunit, or activator 1 subunit 1, or replication factor C 140 kDa subunit, or RF-C 140 kDa subunit, or RFC140, is the large subunit of replication factor C (RFC), which is a heteropentameric protein essential for DNA replication and repair. RFC1 can bind single- or double-stranded DNA. It could play a role in DNA transcription regulation as well as DNA replication and/or repair. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family. 79
35205 350659 cd17753 MCM2 DNA replication licensing factor Mcm2. Mcm2 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 325
35206 350660 cd17754 MCM3 DNA replication licensing factor Mcm3. Mcm3 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 299
35207 350661 cd17755 MCM4 DNA replication licensing factor Mcm4. Mcm4 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 309
35208 350662 cd17756 MCM5 DNA replication licensing factor Mcm5. Mcm5 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 317
35209 350663 cd17757 MCM6 DNA replication licensing factor Mcm6. Mcm6 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 307
35210 350664 cd17758 MCM7 DNA replication licensing factor Mcm7. Mcm7 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. 306
35211 350665 cd17759 MCM8 DNA helicase Mcm8. Mcm8 plays an important role in homologous recombination repair. It forms a complex with Mcm9 that is required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs). 289
35212 350666 cd17760 MCM9 DNA helicase Mcm9. Mcm9 plays an important role in homologous recombination repair. It forms a complex with Mcm8 that is required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs). 299
35213 350667 cd17761 MCM_arch archaeal MCM protein. Archaeal MCM proteins form a homohexameric ring homologous to the eukaryotic Mcm2-7 helicase and also function as the replicative helicase at the replication fork. 308
35214 350162 cd17762 AMN AMP nucleosidase. AMP nucleosidase (AMN) catalyzes the hydrolysis of AMP to ribose 5-phosphate and adenine. It is a prokaryotic enzyme which plays a role in purine nucleoside salvage and intracellular AMP level regulation. AMN is active as a homohexamer; each monomer is comprised of a catalytic domain and a putative regulatory domain. This model represents the catalytic domain. AMN belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 242
35215 350163 cd17763 UP_hUPP-like uridine phosphorylases similar to human UPP1 and UPP2. Uridine phosphorylase (UP) catalyzes the reversible phosphorolysis of uracil ribosides and analogous compounds to their respective nucleobases and ribose 1-phosphate. Human UPP1 has a role in the activation of pyrimidine nucleoside analogues used in chemotherapy, such as 5-fluorouracil. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 276
35216 350164 cd17764 MTAP_SsMTAPI_like 5'-deoxy-5'-methylthioadenosine phosphorylases similar to Sulfolobus solfataricus MTAPI. 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP) catalyzes the reversible phosphorolysis of 5'-deoxy-5'-methylthioadenosine (MTA) to adenine and 5-methylthio-D-ribose-1-phosphate. Sulfolobus solfataricus MTAPI will utilize inosine, guanosine, and adenosine as substrates, in addition to MTA. Two MTAPs have been isolated from S. solfataricus: SsMTAPI and SsMTAPII. SsMTAPII belongs to a different subfamily of the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 220
35217 350165 cd17765 PNP_ThPNP_like purine nucleoside phosphorylases similar to Thermus thermophilus PNP. Purine nucleoside phosphorylase (PNP) catalyzes the reversible phosphorolysis of purine nucleosides. Thermus thermophilus PNP catalyzes the phosphorolysis of guanosine but not adenosine. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 234
35218 350166 cd17766 futalosine_nucleosidase_MqnB futalosine nucleosidase which catalyzes the hydrolysis of futalosine to dehypoxanthinylfutalosine and a hypoxanthine base; similar to Thermus thermophilus MqnB. Futalosine nucleosidase (MqnB, EC 3.2.2.26, also known as futalosine hydrolase) functions in an alternative menaquinone biosynthetic pathway (the futalosine pathway) which operates in some bacteria, including Streptomyces coelicolor and Thermus thermophilus. This domain model belongs to the PNP_UDP_1 superfamily which includes members which accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. PNP_UDP_1 includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Superfamily members have different physiologically relevant quaternary structures: hexameric such as the trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP, homotrimeric such as human PNP and Escherichia coli PNPII (XapA), homohexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD), or homodimeric such as human and Trypanosoma brucei UP. The PNP_UDP_2 (nucleoside phosphorylase-II family) is a different structural family. 217
35219 350167 cd17767 UP_EcUdp-like uridine phosphorylases similar to Escherichia coli Udp and related phosphorylases. Uridine phosphorylase (UP) is specific for pyrimidines, and is involved in pyrimidine salvage and in the maintenance of uridine homeostasis. In addition to E. coli Udp, this subfamily includes Shewanella oneidensis MR-1 UP and Plasmodium falciparum purine nucleoside phosphorylase (PfPNP). PfPNP is an outlier in terms of genetic distance from the other families of PNPs. PfPNP is catalytically active for inosine and guanosine, and in addition, has a weak UP activity. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 239
35220 350168 cd17768 adenosylhopane_nucleosidase_HpnG-like adenosylhopane nucleosidase which cleaves adenine from adenosylhopane to form ribosyl hopane; similar to Burkholderia cenocepacia HpnG. Adenosylhopane nucleosidase HpnG catalyzes the second step in hopanoid side-chain biosynthesis. Hopanoids are bacterial membrane lipids. This CD belongs to the PNP_UDP_1 superfamily which includes members which accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. PNP_UDP_1 includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Superfamily members have different physiologically relevant quaternary structures: hexameric such as the trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP, homotrimeric such as human PNP and Escherichia coli PNPII (XapA), homohexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD), or homodimeric such as human and Trypanosoma brucei UP. The PNP_UDP_2 (nucleoside phosphorylase-II family) is a different structural family. 188
35221 350169 cd17769 NP_TgUP-like nucleoside phosphorylases similar to Toxoplasma gondii uridine phosphorylase. This subfamily is composed of mostly uncharacterized proteins with similarity to Toxoplasma gondii uridine phosphorylase (TgUPase). Toxoplasma gondii appears to have a single non-specific uridine phosphorylase which catalyzes the reversible phosphorolysis of uridine, deoxyuridine and thymidine, rather than the two distinct enzymes of mammalian cells: uridine phosphorylase (nucleoside phosphorylase-I family) and thymidine phosphorylase (nucleoside phosphorylase-II family). TgUPase is a potential target for intervention against toxoplasmosis. It belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 255
35222 341407 cd17771 CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc CBS domain protein. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain. Members of CAP_ED include CAP, which binds cAMP; FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen; and CooA, a heme-containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly (A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. 
The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 115
35223 341408 cd17772 CBS_pair_DHH_polyA_Pol_assoc Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DHH and nucleotidyltransferase (NT) domains. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream nucleotidyltransferase (NT) domain of family X DNA polymerases. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 112
35224 341409 cd17773 CBS_pair_NeuB Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain present in N-acylneuraminate-9-phosphate synthase. This CD contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain present in N-acylneuraminate-9-phosphate synthase NeuB. NeuB catalyzes the condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine, directly forming N-acetylneuraminic acid (or sialic acid). The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 118
35225 341410 cd17774 CBS_two-component_sensor_histidine_kinase_repeat2 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins, repeat 2. This cd contains 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins. Two-component regulation is the predominant form of signal recognition and response coupling mechanism used by bacteria to sense and respond to diverse environmental stresses and cues ranging from common environmental stimuli to host signals recognized by pathogens and bacterial cell-cell communication signals. The structures of both sensors and regulators are modular, and numerous variations in domain architecture and composition have evolved to tailor to specific needs in signal perception and signal transduction. The simplest histidine kinase sensors consist of only sensing and kinase domains. The more complex hybrid sensors contain an additional REC domain typical of two-component regulators and in some cases a C-terminal histidine phosphotransferase (HPT) domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 137
35226 341411 cd17775 CBS_pair_bact_arch Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria and archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 117
35227 341412 cd17776 CBS_pair_arch Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 115
35228 341413 cd17777 CBS_arch_repeat1 CBS pair domains found in archaeal proteins, repeat 1. CBS pair domains found in archaeal proteins that contain 2 repeats. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 137
35229 341414 cd17778 CBS_arch_repeat2 CBS pair domains found in archaeal proteins, repeat 2. CBS pair domains found in archaeal proteins that contain 2 repeats. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 131
35230 341415 cd17779 CBS_archAMPK_gamma-repeat1 signal transduction protein with CBS domains. Archaeal gamma-subunit of 5'-AMP-activated protein kinase (AMPK) contains four CBS domains in tandem repeats, similar to eukaryotic homologs. AMPK is an important regulator of metabolism and of energy homeostasis. It is a heterotrimeric protein composed of a catalytic serine/threonine kinase subunit (alpha) and two regulatory subunits (beta and gamma). The gamma subunit senses the intracellular energy status by competitively binding AMP and ATP and is believed to be responsible for allosteric regulation of the whole complex. In humans, mutations in the gamma subunit of AMPK are associated with hypertrophic cardiomyopathy, Wolff-Parkinson-White syndrome and glycogen storage in the skeletal muscle. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 136
35231 341416 cd17780 CBS_pair_arch1_repeat1 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 1. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 106
35232 341417 cd17781 CBS_pair_MUG70_1 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains similar to MUG70 repeat1. Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain, present in MUG70. The MUG70 protein, encoded by the Meiotically Up-regulated Gene 70, plays a role in meiosis and contains, besides the two CBS pairs, a PB1 domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 118
35233 341418 cd17782 CBS_pair_MUG70_2 Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains similar to MUG70 repeat2. Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain, present in MUG70. The MUG70 protein, encoded by the Meiotically Up-regulated Gene 70, plays a role in meiosis and contains, besides the two CBS pairs, a PB1 domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 118
35234 341419 cd17783 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 108
35235 341420 cd17784 CBS_pair_Euryarchaeota Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in Euryarchaeota. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 120
35236 341421 cd17785 CBS_pair_bac_arch Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria and archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 136
35237 341422 cd17786 CBS_pair_Thermoplasmatales Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in Thermoplasmatales. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 114
35238 341423 cd17787 CBS_pair_ACT Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in Thermotoga in combination with an ACT domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 111
35239 341424 cd17788 CBS_pair_bac Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 137
35240 341425 cd17789 CBS_pair_plant_CBSX Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains from plant CBSX proteins. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains of plant single cystathionine beta-synthase (CBS) pair proteins (CBSX). CBSX1 and CBSX2 have been identified as redox regulators of the thioredoxin (Trx) system. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase). 141
35241 341356 cd17790 7tmA_mAChR_M1 muscarinic acetylcholine receptor subtype M1, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. M1 is the dominant mAChR subtype involved in learning and memory. It is linked to synaptic plasticity, neuronal excitability, and neuronal differentiation during early development. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 262
35242 341490 cd17791 HipA-like serine/threonine-protein kinases similar to HipA and CtkA. This family contains serine/threonine-protein kinases similar to Escherichia coli HipA, a type II toxin-antitoxin (TA) system HipA family toxin that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu), and is the toxin component of the HipA-HipB TA module, as well as similar to the Helicobacter pylori serine/threonine-protein kinase CtkA (proinflammatory kinase), which has been shown to be secreted by the bacteria and to induce cytokines in gastric epithelial cells relevant to chronic gastric inflammation. 284
35243 341491 cd17792 CtkA serine/threonine-protein kinase CtkA and similar proteins. The Helicobacter pylori serine/threonine-protein kinase CtkA (proinflammatory kinase), encoded by the jhp940 gene, has been shown to be secreted by the bacterium and to induce cytokines in gastric epithelial cells. It may play a role in chronic gastric inflammation. CtkA autophosphorylates itself at a threonine residue near the N-terminus and it translocates into cultured human cells. It also enhances phosphorylation of the NF-kappaB p65 subunit at Ser276 in human epithelial cancer cells; phosphorylation at this position is known to activate the transcriptional activity of NF-kappaB. 281
35244 341492 cd17793 HipA type II toxin-antitoxin system toxin HipA and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Escherichia coli and Shewanella oneidensis HipA, which is a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. Structures of HipAB:DNA complexes from both Escherichia coli and Shewanella oneidensis reveal distinct complex assembly. 358
35245 341493 cd17808 HipA_Ec_like type II toxin-antitoxin system toxin HipA from Escherichia coli and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Escherichia coli HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Escherichia coli HipAB:DNA promoter complex, HipA forms a dimer and each HipA monomer interacts with a HipB homodimer which binds DNA. The HipAB component of the complex is composed of two HipA and four HipB subunits. 401
35246 341494 cd17809 HipA_So_like type II toxin-antitoxin system toxin HipA from Shewanella oneidensis and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Shewanella oneidensis HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Shewanella oneidensis HipAB:DNA promoter complex, HipB forms a dimer that binds the duplex operator DNA, with each HipB monomer interacting with separate HipA monomers. The HipAB component of the complex is composed of two HipA and two HipB subunits. 405
35247 341489 cd17814 Fe-ADH-like iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized. 374
35248 349777 cd17868 GPN GPN-loop GTPase. GPN-loop GTPases are a deeply evolutionarily conserved family of three small GTPases, Gpn1, 2, and 3. They form heterodimers, interact with RNA polymerase II and may function in nuclear import of RNA polymerase II. 198
35249 349778 cd17869 TadZ-like pilus assembly protein TadZ. Pilus assembly protein TadZ is involved in the production of a variant of type IV pili. It is part of the SIMIBI superfamily which contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions. 219
35250 349779 cd17870 GPN1 GPN-loop GTPase 1. GPN-loop GTPase 1 (GPN1, also known as MBD2-interacting protein or MBDin, RNAPII-associated protein 4, and XPA-binding protein 1) is a GTPase required for nuclear targeting of RNA polymerase II. It forms heterodimers with GPN3. 241
35251 349780 cd17871 GPN2 GPN-loop GTPase 2. GPN-loop GTPase 2 (GPN2) is a small GTPase required for proper localization of RNA polymerase II and III (RNAPII and RNAPIII). It forms heterodimers with GPN1 or GPN3. 196
35252 349781 cd17872 GPN3 GPN-loop GTPase 3. GPN-loop GTPase 3 (GPN3) is a small GTPase that is required for nuclear targeting of RNA polymerase II. It forms heterodimers with GPN1. 196
35253 349782 cd17873 FlhF signal-recognition particle GTPase FlhF. FlhF protein is a signal-recognition particle (SRP)-type GTPase that is essential for the placement and assembly of polar flagella. It is similar to the 54 kd subunit (SRP54) of the signal recognition particle (SRP) that mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). 189
35254 349783 cd17874 FtsY signal recognition particle receptor FtsY. FtsY, the bacterial signal-recognition particle (SRP) receptor (SR), is homologous to the SRP receptor alpha-subunit (SRalpha) of the eukaryotic SR. It interacts with the signal-recognition particle (SRP) and is required for the co-translational membrane targeting of proteins. 199
35255 349784 cd17875 SRP54_G GTPase domain of the signal recognition 54 kDa subunit. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain. 193
35256 349785 cd17876 SRalpha_C C-terminal domain of signal recognition particle receptor alpha subunit. The signal-recognition particle (SRP) receptor (SR) alpha-subunit (SRalpha) of the eukaryotic SR interacts with the signal-recognition particle (SRP) and is essential for the co-translational membrane targeting of proteins. 204
35257 350170 cd17877 NP_MTAN-like nucleoside phosphorylases similar to 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidases. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs), as well as futalosine nucleosidase and adenosylhopane nucleosidase. Bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 210
35258 350625 cd17880 D-Ala-D-Ala_dipeptidase D-Ala-D-Ala_dipeptidase. This family contains D-Ala-D-Ala dipeptidase enzymes which include D-alanyl-D-alanine dipeptidase vanX and Aad, among others. VanX is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. This family includes Lactobacillus Aad peptidase and belongs in the MEROPS peptidase family M15, subfamily D. 110
35259 350087 cd17900 ArfGap_ASAP3 ArfGAP domain of ASAP3 (ArfGAP with ANK repeat and PH domain-containing protein 3). The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline rich domains. Unlike ASAP1 and ASAP2, ASAP3 does not have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma. 124
35260 350088 cd17901 ArfGap_ARAP1 ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 1. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factors stimulation, and focal adhesion dynamics. ARAP1 localizes to the plasma membrane, the Golgi complex, and endosomal compartments. It displays PI(3,4,5)P3-dependent ArfGAP activity that regulates Arf-, RhoA-, and Cdc42-dependent cellular events. For example, ARAP1 inhibits the trafficking of epidermal growth factor receptor (EGFR) to the early endosome. 116
35261 350089 cd17902 ArfGap_ARAP3 ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 3. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factors stimulation, and focal adhesion dynamics. ARAP3 possesses a unique dual-specificity GAP activity for Arf6 and RhoA regulated by PI(3,4,5)P3 and a small GTPase Rap1-GTP. The RhoGAP activity of ARAP3 is enhanced by direct binding of Rap1-GTP to the Ras-association (RA) domain. ARAP3 is involved in regulation of cell shape and adhesion. 116
35262 350090 cd17903 ArfGap_AGFG2 ArfGAP domain of AGFG2 (ArfGAP domain and FG repeat-containing protein 2). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG2 is a member of the HIV-1 Rev binding protein (HRB) family and contains one Arf-GAP zinc finger domain, several Phe-Gly (FG) motifs, and four Asn-Pro-Phe (NPF) motifs. AGFG2 interacts with Eps15 homology (EH) domains and plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication. 116
35263 380783 cd17904 PFM_monalysin-like pore-forming module of Pseudomonas entomophila monalysin and similar aerolysin-type beta-barrel pore-forming proteins. Monalysin plays a role in Pseudomonas entomophila virulence against Drosophila, contributing to host intestinal damage and lethality. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 206
35264 381733 cd17905 CheC-like chemotaxis protein CheC; includes CheC classes I, II, and III. This family contains chemotaxis protein CheC that acts as a weak CheY-P phosphatase but shows increased activity in the presence of CheD. This CheC family includes three classes: class I containing Bacillus subtilis CheC which might function as a regulator of CheD; class II CheCs that likely function as phosphatases in systems other than chemotaxis; and class III CheCs that are found chiefly in the archaeal class Halobacteria and might function similarly as class I CheCs. Class I CheCs contain two active sites with the consensus sequence ([DS]xxxExxNx(22)P), with four conserved residues thought to form the phosphatase active site; class II and class III CheCs have only one active site. 173
35265 381734 cd17906 CheX chemotaxis phosphatase CheX. This family contains CheX CheY-P phosphatase which is very closely related to CheC chemotaxis phosphatase; both dephosphorylate CheY, although CheC requires binding of CheD to achieve the level of activity of CheX. CheX has been shown to be the most powerful CheY-P phosphatase of the CheC-FliY-CheX (CXY) family. Structural and functional data of CheX and its CheY3 substrate in Borrelia burgdorferi (the causative agent of Lyme disease) bound to the phosphoryl analog BeF3(-) and Mg2+ reveal a unique mode of binding, but a catalytic mechanism which is virtually identical to that used by the structurally unrelated CheZ, providing a striking example of convergent evolution. Thus, CheX is quite divergent from the rest of the CXY family; it forms a dimer and some may function outside chemotaxis. The data also suggest a possible CheX regulatory mechanism through dissociation of the CheX homodimer. 148
35266 381735 cd17907 FliY_FliN-Y flagellar motor switch protein FliY. This family contains the flagellar rotor protein FliY, a highly conserved and essential member of the CheC phosphatase family, that distinguishes flagellar architecture and function in different types of bacteria. Unlike CheC and CheX, FliY is localized in the flagellar switch complex, which also contains the stator-coupling protein FliG and the target of CheY-P, FliM, all present in many copies, and together corresponding structurally to the C-ring of the flagellar basal body. FliY structure resembles that of the rotor protein FliM but contains two active centers for CheY dephosphorylation. In bacteria such as Thermotogae and Bacilli, FliY is fused to FliN. It incorporates properties of the FliM/FliN rotor proteins and the CheC/CheX phosphatases to serve multiple functions in the flagellar switch. FliY seems to act on CheY-P constitutively, as compared to CheC and CheX that appear to be primarily involved in restoring normal CheY-P levels. 191
35267 381736 cd17908 FliM flagellar protein FliM. This family contains bacterial flagellar protein FliM which is localized in the flagellar switch complex along with FliG and FliY; all are present in many copies, and together they correspond structurally to the C-ring of the flagellar basal body. FliM does not contain the CheC consensus sequence of the phosphatase active site ([DS]xxxExxNx(22)P) and is not a CheY-P phosphatase. FliM sits in the center of the rotor with the N-terminal region interacting with the signaling protein, phosphorylated CheY (CheY-P). The activated form of CheY destabilizes the parallel arrangement of FliM molecules, and perturbs FliG alignment in a process that may reflect the onset of rotation switching. This suggests a model of C-ring assembly in which intermolecular contacts among FliG domains provide a template for FliM assembly. Recent data show that binding of FliM to spermine synthase, SpeE, contributes to flagellar motility, an association that is unique to Helicobacter species. 181
35268 381737 cd17909 CheC_ClassI chemotaxis protein CheC, Class I. This subfamily contains Class I CheC proteins with phosphatase activity. The Class I cheC genes are generally found in firmicute and archaeal chemotaxis operons with cheD, usually translationally coupled. Class I CheCs interact with the CheD protein which is responsible for deamidation of certain glutamine residues to glutamates on the chemotaxis receptor proteins. This family contains two active sites with the consensus sequence ([DS]xxxExxNx(22)P), with four conserved residues thought to form the phosphatase active site. The C-terminal helix of CheC acts as a mimic of the natural enzymatic target of CheD, the alpha-helical receptors, and serves as the binding site for CheD. The CheC/CheD heterodimerization increases CheY-P phosphatase activity five-fold. Class I CheCs are involved in adaptation of the chemotaxis system. 189
35269 381738 cd17910 CheC_ClassII chemotaxis protein CheC, Class II. This family contains class II CheC proteins found in proteobacteria, which diverge from class I CheCs in sequence conservation and lack critical well-conserved residues for CheD binding. These proteins are likely to be dedicated phosphatases. The class II cheC genes are not found in chemotaxis operons, but in operons containing more archetypical two-component signaling components, non-signaling operons, or as orphans. Thus, class II CheCs appear to be involved in non-chemotactic two component systems. Class II CheCs lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site of class I CheCs. 187
35270 381739 cd17911 CheC_ClassIII chemotactic protein CheC, Class III. This family contains class III CheC proteins, present chiefly in the archaeal class Halobacteria. Sequence analysis shows that class III CheC proteins are structurally and functionally similar to class I CheCs, and not to CheX, despite the fact that both class III CheCs and CheX lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site. Mutation analysis shows that the second active site is more important for function than the first one, suggesting that class III proteins arose by loss of the unnecessary first active site through mutational shift. All chemotactic archaea have a CheC homologue. 187
35271 350670 cd17912 DEAD-like_helicase_N N-terminal helicase domain of the DEAD-box helicase superfamily. The DEAD-like helicase superfamily is a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. The N-terminal domain contains the ATP-binding region. 81
35272 350671 cd17913 DEXQc_Suv3 DEXQ-box helicase domain of Suv3. Suppressor of var1 3-like protein (Suv3) is a DNA/RNA unwinding enzyme belonging to the class of DexH-box helicases. It localizes predominantly in the mitochondria, where it forms an RNA-degrading complex called mitochondrial degradosome (mtEXO) with exonuclease PNP (polynucleotide phosphorylase), that degrades 3' overhang double-stranded RNA with a 3'-to-5' directionality in an ATP-dependent manner. Suv3 plays a role in the RNA surveillance system in mitochondria; it regulates the stability of mature mRNAs, the removal of aberrantly formed mRNAs and the rapid degradation of non coding processing intermediates. It also confers salinity and drought stress tolerance by maintaining both photosynthesis and antioxidant machinery, probably via an increase in plant hormone levels such as gibberellic acid (GA3), the cytokinin zeatin (Z), and indole-3-acetic acid (IAA). Suv3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 142
35273 350672 cd17914 DExxQc_SF1-N DEXQ-box helicase domain of superfamily 1 helicase. The superfamily (SF)1 family members include UvrD/Rep, Pif1-like, and Upf-1-like proteins. Like SF2, they do not form toroidal, predominantly hexameric structures like SF3-6. Their helicase core is surrounded by C and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains or domains engaged in protein-protein interactions. SF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 116
35274 350673 cd17915 DEAHc_XPD-like DEAH-box helicase domain of XPD family DEAD-like helicases. The xeroderma pigmentosum group D (XPD)-like family members are DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 138
35275 350674 cd17916 DEXHc_UvrB DEXH-box helicase domain of excinuclease ABC subunit B. Excinuclease ABC subunit B (or UvrB) plays a central role in nucleotide excision repair (NER). Together with other components of the NER system, like UvrA, UvrC, UvrD (helicase II) and DNA polymerase I, it recognizes and cleaves damaged DNA in a multistep ATP-dependent reaction. UvrB is critical for the second phase of damage recognition by verifying the nature of the damage and forming the pre-incision complex. Its ATPase site becomes activated in the presence of UvrA and damaged DNA, but its activity is strand destabilization via distortion of the DNA at lesion site, with very limited DNA unwinding. UvrB is a member of the DEAD-like helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 299
35276 350675 cd17917 DEXHc_RHA-like DEXH-box helicase domain of DEAD-like helicase RHA family proteins. The RNA helicase A (RHA) family includes RHA, also called DEAH-box helicase 9 (DHX9), DHX8, DHX15-16, DHX32-38, and many others. The RHA family belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 159
35277 350676 cd17918 DEXHc_RecG DEXH/Q-box helicase domain of DEAD-like helicase RecG family proteins. The DEAD-like helicase RecG family is part of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 180
35278 350677 cd17919 DEXHc_Snf DEXH/Q-box helicase domain of DEAD-like helicase Snf family proteins. Sucrose Non-Fermenting (SNF) proteins belong to the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 182
35279 350678 cd17920 DEXHc_RecQ DEXH-box helicase domain of RecQ family proteins. The RecQ family of the type II DEAD box helicase superfamily is a family of highly conserved DNA repair helicases. This domain contains the ATP-binding region. 200
35280 350679 cd17921 DEXHc_Ski2 DEXH-box helicase domain of DEAD-like helicase Ski2 family proteins. Ski2-like RNA helicases play an important role in RNA degradation, processing, and splicing pathways. They belong to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 181
35281 350680 cd17922 DEXHc_LHR-like DEXH-box helicase domain of LHR. Large helicase-related protein (LHR) is a DNA damage-inducible helicase that uses ATP hydrolysis to drive unidirectional 3'-to-5' translocation along single-stranded DNA (ssDNA) and to unwind RNA:DNA duplexes. This group also includes related bacterial and archaeal helicases from the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 166
35282 350681 cd17923 DEXHc_Hrq1-like DEAH-box helicase domain of Hrq1 and similar proteins. Yeast Hrq1, similar to RecQ4, plays a role in DNA inter-strand crosslink (ICL) repair and in telomere maintenance. Hrq1 lacks the Sld2-like domain found in RecQ4. Hrq1 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 182
35283 350682 cd17924 DDXDc_reverse_gyrase DDXD-box helicase domain of reverse gyrase. Reverse gyrase modifies the topological state of DNA by introducing positive supercoils in an ATP-dependent process. Reverse gyrase belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 189
35284 350683 cd17925 DEXDc_ComFA DEXD-box helicase domain of ComFA. ATP-dependent helicase ComFA (also called ComF operon protein 1) is part of the complex mediating the binding and uptake of single-stranded DNA. ComFA is required for DNA uptake but not for binding. It belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 143
35285 350684 cd17926 DEXHc_RE DEXH-box helicase domain of DEAD-like helicase restriction enzyme family proteins. This family is composed of helicase restriction enzymes and similar proteins such as TFIIH basal transcription factor complex helicase XPB subunit. These proteins are part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 146
35286 350685 cd17927 DEXHc_RIG-I DEXH-box helicase domain of DEAD-like helicase RIG-I family proteins. Members of the RIG-I family include FANCM, dicer, Hef, and the RIG-I-like receptors. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). Hef (helicase-associated endonuclease fork-structure) is involved in stalled replication fork repair. RIG-I-like receptors (RLRs) sense cytoplasmic viral RNA and comprise RIG-I, RLR-2/MDA5 (melanoma differentiation-associated protein 5) and RLR-3/LGP2 (laboratory of genetics and physiology 2). The RIG-I family is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 201
35287 350686 cd17928 DEXDc_SecA DEXD-box helicase domain of SecA. SecA is a part of the Sec translocase that transports the vast majority of bacterial and ER-exported proteins. SecA binds both the signal sequence and the mature domain of the preprotein emerging from the ribosome. SecA belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 230
35288 350687 cd17929 DEXHc_priA DEXH-box helicase domain of PriA. PriA, also known as replication factor Y or primosomal protein N', is a 3'-->5' superfamily 2 DNA helicase that acts to remodel stalled replication forks and as a specificity factor for origin-independent assembly of a new replisome at the stalled fork. PriA is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 178
35289 350688 cd17930 DEXHc_cas3 DEXH/Q-box helicase domain of Cas3. CRISPR-associated (Cas) 3 is a nuclease-helicase responsible for degradation of dsDNA. The two enzymatic units of Cas3, a histidine-aspartate (HD) nuclease and a Superfamily 2 (SF2) helicase, may be expressed from separate genes as Cas3' (SF2 helicase) and Cas3'' (HD nuclease) or may be fused as a single HD-SF2 polypeptide. The nucleolytic activity of most Cas3 enzymes is transition metal ion-dependent. Cas3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 186
35290 350689 cd17931 DEXHc_viral_Ns3 DEXH-box helicase domain of NS3 protease-helicase. NS3 is a nonstructural multifunctional protein found in pestiviruses that contains an N-terminal protease and a C-terminal helicase. The N-terminal domain is a chymotrypsin-like serine protease, which is responsible for most of the maturation cleavages of the polyprotein precursor in the cytosolic side of the endoplasmic reticulum membrane. The C-terminal domain, about two-thirds of NS3, is a helicase belonging to superfamily 2 (SF2) thought to be important for unwinding highly structured regions of the RNA genome during replication. NS3 plays an essential role in viral polyprotein processing and genome replication. NS3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 151
35291 350690 cd17932 DEXQc_UvrD DEXQD-box helicase domain of UvrD. UvrD is a highly conserved helicase involved in mismatch repair, nucleotide excision repair, and recombinational repair. It plays a critical role in maintaining genomic stability and facilitating DNA lesion repair in many prokaryotic species including Helicobacter pylori and Escherichia coli. UvrD is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 189
35292 350691 cd17933 DEXSc_RecD-like DEXS-box helicase domain of RecD and similar proteins. RecD is a member of the RecBCD (EC 3.1.11.5, Exonuclease V) complex. It is the alpha chain of the complex and functions as a 3'-5' helicase. The RecBCD enzyme is both a helicase that unwinds, or separates the strands of DNA, and a nuclease that makes single-stranded nicks in DNA. RecD is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 155
35293 350692 cd17934 DEXXQc_Upf1-like DEXXQ-box helicase domain of Upf1-like helicase. The Upf1-like helicase family includes UPF1, HELZ, Mov10L1, Aquarius, IGHMBP2 (SMUBP2), and similar proteins. They belong to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 133
35294 350693 cd17935 EEXXQc_AQR EEXXQ-box helicase domain of AQR. Aquarius (AQR) is a multifunctional RNA helicase that binds precursor-mRNA introns at a defined position and is part of a pentameric intron-binding complex (IBC). It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 207
35295 350694 cd17936 EEXXEc_NFX1 EEXXE-box helicase domain of NFX1. Human NFX1 protein was identified as a protein that represses class II MHC (major histocompatibility complex) gene expression. NFX1 binds a conserved cis-acting element, termed the X-box, in promoters of human class II MHC genes. The Cys-rich region contains several NFX1-type zinc finger domains. Frequently, a R3H domain is present in the C-terminus, and a RING finger domain and a PAM2 motif are present in the N-terminus. The lack of R3H and PAM2 motifs in the plant proteins indicates functional differences. Plant NFX1-like proteins are proposed to modulate growth and survival by coordinating reactive oxygen species, salicylic acid, further biotic stress and abscisic acid responses. A common feature of all members may be E3 ubiquitin ligase, due to the presence of a RING finger domain, as well as DNA binding. NFX1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 178
35296 350695 cd17937 DEXXYc_viral_SF1-N DEXXY-box helicase domain of viral superfamily 1 helicase. Superfamily 1 (SF1) helicases are nucleic acid motor proteins that couple ATP hydrolysis to translocation along with the concomitant unwinding of DNA or RNA. The members here contain arterivirus equine arteritis virus (EAV) non-structural protein (nsp)10. Nsp10 is composed of two domains, ZBD (ATPase) and HEL1 (helicase) along with 2 additional non-enzymatic domains that are thought to regulate HEL1 function. The helicase activity depends on the extensive relay of interactions between the ZBD and HEL1 domains. The arterivirus helicase structurally resembles the cellular Upf1 helicase, suggesting that nidoviruses may also use their helicases for post-transcriptional quality control of their large RNA genomes. The proteins here are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 137
35297 350696 cd17938 DEADc_DDX1 DEAD-box helicase domain of DEAD box protein 1. DEAD box protein 1 (DDX1) acts as an ATP-dependent RNA helicase, able to unwind both RNA-RNA and RNA-DNA duplexes. It possesses 5' single-stranded RNA overhang nuclease activity as well as ATPase activity on various RNA, but not DNA polynucleotides. DDX1 may play a role in RNA clearance at DNA double-strand breaks (DSBs), thereby facilitating the template-guided repair of transcriptionally active regions of the genome. It may also be involved in 3'-end cleavage and polyadenylation of pre-mRNAs. DDX1 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 204
35298 350697 cd17939 DEADc_EIF4A DEAD-box helicase domain of eukaryotic initiation factor 4A. The eukaryotic initiation factor-4A (eIF4A) family consists of 3 proteins EIF4A1, EIF4A2, and EIF4A3. These factors are required for the binding of mRNA to 40S ribosomal subunits. In addition these proteins are helicases that function to unwind double-stranded RNA. EIF4A proteins are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 199
35299 350698 cd17940 DEADc_DDX6 DEAD-box helicase domain of DEAD box protein 6. DEAD box protein 6 (DDX6, also known as Rck or p54) participates in mRNA regulation through miRNA-mediated silencing. It also plays a role in global and transcript-specific messenger RNA (mRNA) storage, translational repression, and decay. It is a member of the DEAD-box helicase family, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 201
35300 350699 cd17941 DEADc_DDX10 DEAD-box helicase domain of DEAD box protein 10. Fusion of the DDX10 gene and the nucleoporin gene, NUP98, by inversion 11 (p15q22) chromosome translocation is found in the patients with de novo or therapy-related myeloid malignancies. Diseases associated with DDX10 (also known as DDX10-NUP98 Fusion Protein Type 2) include myelodysplastic syndrome and leukemia, acute myeloid. DDX10 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 198
35301 350700 cd17942 DEADc_DDX18 DEAD-box helicase domain of DEAD box protein 18. This DDX18 gene encodes a DEAD box protein and is activated by Myc protein. DDX18 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 198
35302 350701 cd17943 DEADc_DDX20 DEAD-box helicase domain of DEAD box protein 20. DDX20 (also called DEAD Box Protein DP 103, Component Of Gems 3, Gemin-3, and SMN-Interacting Protein) interacts directly with SMN (survival of motor neurons), the spinal muscular atrophy gene product, and may play a catalytic role in the function of the SMN complex on ribonucleoproteins. Diseases associated with DDX20 include spinal muscular atrophy and muscular atrophy. DDX20 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 192
35303 350702 cd17944 DEADc_DDX21_DDX50 DEAD-box helicase domain of DEAD box proteins 21 and 50. DDX21 (also called Gu-Alpha and nucleolar RNA helicase 2) is an RNA helicase that acts as a sensor of the transcriptional status of both RNA polymerase (Pol) I and II. It promotes ribosomal RNA (rRNA) processing and transcription from polymerase II (Pol II) and binds various RNAs, such as rRNAs, snoRNAs, 7SK and, to a lesser extent, mRNAs. DDX50 is also called Gu-Beta, Nucleolar Protein Gu2, and malignant cell derived RNA helicase. DDX21 and DDX50 have similar genomic structures and are in tandem orientation on chromosome 10, suggesting that the two genes arose by gene duplication in evolution. Diseases associated with DDX21 include stomach disease and cerebral creatine deficiency syndrome 3. Diseases associated with DDX50 include rectal disease. Both are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. Their name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 202
35304 350703 cd17945 DEADc_DDX23 DEAD-box helicase domain of DEAD box protein 23. DDX23 (also called U5 snRNP 100kD protein and PRP28 homolog) is involved in pre-mRNA splicing and its phosphorylated form (by SRPK2) is required for spliceosomal B complex formation. Diseases associated with DDX23 include distal hereditary motor neuropathy, type II. DDX23 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 220
35305 350704 cd17946 DEADc_DDX24 DEAD-box helicase domain of DEAD box protein 24. The human DDX24 gene encodes a DEAD box protein, which shows little similarity to any of the other known human DEAD box proteins, but shows a high similarity to mouse Ddx24 at the amino acid level. MDM2 mediates nonproteolytic polyubiquitylation of the DEAD-Box RNA helicase DDX24. DDX24 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 235
35306 350705 cd17947 DEADc_DDX27 DEAD-box helicase domain of DEAD box protein 27. DDX27 (also called RHLP, deficiency of ribosomal subunits protein 1 homolog, and probable ATP-dependent RNA helicase DDX27) is involved in the processing of 5.8S and 28S ribosomal RNAs. More specifically, the encoded protein localizes to the nucleolus, where it interacts with the PeBoW complex to ensure proper 3' end formation of 47S rRNA. DDX27 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 196
35307 350706 cd17948 DEADc_DDX28 DEAD-box helicase domain of DEAD box protein 28. DDX28 (also called mitochondrial DEAD-box polypeptide 28) plays an essential role in facilitating the proper assembly of the mitochondrial large ribosomal subunit and its helicase activity is essential for this function. DDX28 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 231
35308 350707 cd17949 DEADc_DDX31 DEAD-box helicase domain of DEAD box protein 31. DDX31 (also called helicain or G2 helicase) plays a role in ribosome biogenesis and TP53/p53 regulation through its interaction with NPM1. DDX31 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 214
35309 350708 cd17950 DEADc_DDX39 DEAD-box helicase domain of DEAD box protein 39. DDX39A is involved in pre-mRNA splicing and is required for the export of mRNA out of the nucleus. DDX39B is an essential splicing factor required for association of U2 small nuclear ribonucleoprotein with pre-mRNA, and it also plays an important role in mRNA export from the nucleus to the cytoplasm. Diseases associated with DDX39A (also called UAP56-Related Helicase, 49 kDa) include gastrointestinal stromal tumor and inflammatory bowel disease 6, while diseases associated with DDX39B (also called 56 kDa U2AF65-Associated Protein) include Plasmodium vivax malaria. DDX39 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 208
35310 350709 cd17951 DEADc_DDX41 DEAD-box helicase domain of DEAD box protein 41. DDX41 (also called ABS and MPLPF) interacts with several spliceosomal proteins and may recognize the bacterial second messengers cyclic di-GMP and cyclic di-AMP, resulting in the induction of genes involved in the innate immune response. Diseases associated with DDX41 include "myeloproliferative/lymphoproliferative neoplasms, familial" and "Ddx41-related susceptibility to familial myeloproliferative/lymphoproliferative neoplasms". DDX41 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 206
35311 350710 cd17952 DEADc_DDX42 DEAD-box helicase domain of DEAD box protein 42. DDX42 (also called Splicing Factor 3B-Associated 125 kDa Protein, RHELP, or RNAHP) is an NTPase with a preference for ATP, the hydrolysis of which is enhanced by various RNA substrates. It acts as a non-processive RNA helicase with protein displacement and RNA annealing activities. DDX42 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 197
35312 350711 cd17953 DEADc_DDX46 DEAD-box helicase domain of DEAD box protein 46. DDX46 (also called Prp5-like DEAD-box protein) is a component of the 17S U2 snRNP complex. It plays an important role in pre-mRNA splicing and has a role in antiviral innate immunity. DDX46 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 222
35313 350712 cd17954 DEADc_DDX47 DEAD-box helicase domain of DEAD box protein 47. DDX47 (also called E4-DEAD box protein) can shuttle between the nucleus and the cytoplasm, and has an RNA-independent ATPase activity. DDX47 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 203
35314 350713 cd17955 DEADc_DDX49 DEAD-box helicase domain of DEAD box protein 49. DDX49 (also called Dbp8) is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 204
35315 350714 cd17956 DEADc_DDX51 DEAD-box helicase domain of DEAD box protein 51. DDX51 aids cancer cell proliferation by regulating multiple signalling pathways. Mammalian DEAD box protein Ddx51 acts in 3' end maturation of 28S rRNA by promoting the release of U8 snoRNA. It is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 231
35316 350715 cd17957 DEADc_DDX52 DEAD-box helicase domain of DEAD box protein 52. DDX52 (also called ROK1 and HUSSY19) is ubiquitously expressed in testis, endometrium, and other tissues in humans. DDX52 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 198
35317 350716 cd17958 DEADc_DDX43_DDX53 DEAD-box helicase domain of DEAD box proteins 43 and 53. DDX43 (also called cancer/testis antigen 13 or helical antigen) displays tumor-specific expression. Diseases associated with DDX43 include rheumatoid lung disease. DDX53 is also called cancer/testis antigen 26 or DEAD-Box Protein CAGE. Both DDX43 and DDX53 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 197
35318 350717 cd17959 DEADc_DDX54 DEAD-box helicase domain of DEAD box protein 54. DDX54 interacts in a hormone-dependent manner with nuclear receptors, and represses their transcriptional activity. DDX54 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 205
35319 350718 cd17960 DEADc_DDX55 DEAD-box helicase domain of DEAD box protein 55. DDX55 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 202
35320 350719 cd17961 DEADc_DDX56 DEAD-box helicase domain of DEAD box protein 56. DDX56 is a helicase required for assembly of infectious West Nile virus particles. New research suggests that DDX56 relocalizes to the site of virus assembly during WNV infection and that its interaction with WNV capsid in the cytoplasm may occur transiently during virion morphogenesis. DDX56 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 206
35321 350720 cd17962 DEADc_DDX59 DEAD-box helicase domain of DEAD box protein 59. DDX59 plays an important role in lung cancer development by promoting DNA replication. DDX59 knockdown mice showed reduced cell proliferation, anchorage-independent cell growth, and reduction of tumor formation. Recent work shows that EGFR and Ras regulate DDX59 during lung cancer development. Diseases associated with DDX59 (also called zinc finger HIT domain-containing protein 5) include orofaciodigital syndrome V and orofaciodigital syndrome. DDX59 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 193
35322 350721 cd17963 DEADc_DDX19_DDX25 DEAD-box helicase domain of ATP-dependent RNA helicases DDX19 and DDX25. DDX19 (also called DEAD box RNA helicase DEAD5) and DDX25 (also called gonadotropin-regulated testicular RNA helicase (GRTH)) are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 196
35323 350722 cd17964 DEADc_MSS116 DEAD-box helicase domain of DEAD-box helicase Mss116. Mss116 is an RNA chaperone important for mitochondrial group I and II intron splicing, translational activation, and RNA end processing. Mss116 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 211
35324 350723 cd17965 DEADc_MRH4 DEAD-box helicase domain of ATP-dependent RNA helicase MRH4. Mitochondrial RNA helicase 4 (MRH4) plays an essential role during the late stages of mitochondrial ribosome or mitoribosome assembly by promoting remodeling of the 21S rRNA-protein interactions. MRH4 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 251
35325 350724 cd17966 DEADc_DDX5_DDX17 DEAD-box helicase domain of ATP-dependent RNA helicases DDX5 and DDX17. DDX5 and DDX17 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 197
35326 350725 cd17967 DEADc_DDX3_DDX4 DEAD-box helicase domain of ATP-dependent RNA helicases DDX3 and DDX4. This subfamily includes Drosophila melanogaster Vasa, which is essential for development. DEAD box protein 3 (DDX3) has been reported to display a high level of RNA-independent ATPase activity stimulated by both RNA and DNA. DEAD box protein 4 (DDX4, also known as VASA homolog) is an ATP-dependent RNA helicase required during spermatogenesis and is essential for the germline integrity. DDX3 and DDX4 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 221
35327 350726 cd17968 DEAHc_DDX11_starthere DEAH-box helicase domain of ATP-dependent DNA helicase DDX11. DDX11 (also called ChlR1) encodes a protein of the conserved family of Iron-Sulfur (Fe-S) cluster DNA helicases and is thought to function in maintaining chromosome transmission fidelity and genome stability. Mutations in the Chl1 human homologs ChlR1/DDX11 and BACH1/BRIP1/FANCJ collectively result in Warsaw Breakage Syndrome, Fanconi anemia, cell aneuploidy and breast and ovarian cancers. DDX11 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 134
35328 350727 cd17969 DEAHc_XPD DEAH-box helicase domain of TFIIH basal transcription factor complex helicase XPD subunit. TFIIH can be resolved biochemically into a seven subunit core complex containing XPD/Rad3, XPB/Ssl2, p62/Tfb1, p52/Tfb2, p44/Ssl1, p34/Tfb4, and p8/Tfb5 and a three subunit Cdk Activating Kinase (CAK) complex containing CDK7/Kin28, cyclin H/Ccl1, and MAT1/Tfb3. XPD interacts directly with p44, which stimulates XPD helicase activity. XPD/Rad3 also interacts directly with the CAK via its MAT1/Tfb3 subunit inhibiting the helicase activity of XPD. XPD is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 157
35329 350728 cd17970 DEAHc_FancJ DEAH-box helicase domain of Fanconi anemia group J protein and similar proteins. Fanconi anemia group J protein (FACJ or FANCJ, also known as BRIP1) is a DNA helicase required for the maintenance of chromosomal stability. It plays a role in the repair of DNA double-strand breaks by homologous recombination dependent on its interaction with BRCA1. FANCJ belongs to the DEAD-box helicase family, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 181
35330 350729 cd17971 DEXHc_DHX8 DEXH-box helicase domain of DEAH-box helicase 8. DEAH-box helicase 8 (DHX8, also known as pre-mRNA-splicing factor ATP-dependent RNA helicase PRP22) acts late in the splicing of pre-mRNA and mediates the release of the spliced mRNA from spliceosomes. DHX8 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 179
35331 350730 cd17972 DEXHc_DHX9 DEXH-box helicase domain of DEAH-box helicase 9. DEAH-box helicase 9 (DHX9, also known as ATP-dependent RNA helicase A or RHA and leukophysin or LKP) plays an important role in many cellular processes, including regulation of DNA replication, transcription, translation, microRNA biogenesis, RNA processing and transport, and maintenance of genomic stability. DHX9 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 234
35332 350731 cd17973 DEXHc_DHX15 DEXH-box helicase domain of DEAH-box helicase 15. DEAH-box helicase 15 (DHX15) is a pre-mRNA processing factor involved in disassembly of spliceosomes after the release of mature mRNA. DHX15 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 186
35333 350732 cd17974 DEXHc_DHX16 DEXH-box helicase domain of DEAH-box helicase 16. DEAH-box helicase 16 (DHX16) is probably involved in pre-mRNA splicing. DHX16 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 174
35334 350733 cd17975 DEXHc_DHX29 DEXH-box helicase domain of DEAH-box helicase 29. DEAH-box helicase 29 (DHX29) is a part of the 43S pre-initiation complex involved in translation initiation of mRNAs with structured 5'-UTRs. DHX29 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 183
35335 350734 cd17976 DEXHc_DHX30 DEXH-box helicase domain of DEAH-box helicase 30. DEAH-box helicase 30 (DHX30) plays an important role in the assembly of the mitochondrial large ribosomal subunit. DHX30 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 178
35336 350735 cd17977 DEXHc_DHX32 DEXH-box helicase domain of DEAH-box helicase 32. DEAH-box helicase 32 (DHX32) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 176
35337 350736 cd17978 DEXHc_DHX33 DEXH-box helicase domain of DEAH-box helicase 33. DEAH-box helicase 33 (DHX33) stimulates RNA polymerase I transcription of the 47S precursor rRNA. DHX33 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 179
35338 350737 cd17979 DEXHc_DHX34 DEXH-box helicase domain of DEAH-box helicase 34. DEAH-box helicase 34 (DHX34) plays a role in the nonsense-mediated decay (NMD), a surveillance mechanism that degrades aberrant mRNAs. DHX34 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 170
35339 350738 cd17980 DEXHc_DHX35 DEXH-box helicase domain of DEAH-box helicase 35. DHX35 plays a role in colorectal cancers and seems to be associated with risk to thyroid cancers. It has also been shown to positively regulate poxviruses, such as Myxoma virus. DEAH-box helicase 35 (DHX35) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 185
35340 350739 cd17981 DEXHc_DHX36 DEXH-box helicase domain of DEAH-box helicase 36. DEAH-box helicase 36 (DHX36, also known as G4-resolvase 1 or G4R1, MLE-like protein 1 and RNA helicase associated with AU-rich element or RHAU) unwinds a G4-quadruplex in human telomerase RNA. DHX36 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 180
35341 350740 cd17982 DEXHc_DHX37 DEXH-box helicase domain of DEAH-box helicase 37. DHX37 plays a role in the development of the human nervous system and has been linked to schizophrenia. It also negatively regulates poxviruses such as Myxoma virus. DEAH-box helicase 37 (DHX37) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 191
35342 350741 cd17983 DEXHc_DHX38 DEXH-box helicase domain of DEAH-box helicase 38. DEAH-box helicase 38 (DHX38, also known as PRP16) is involved in pre-mRNA splicing. DHX38 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 173
35343 350742 cd17984 DEXHc_DHX40 DEXH-box helicase domain of DEAH-box helicase 40. DEAH-box helicase 40 (DHX40) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 178
35344 350743 cd17985 DEXHc_DHX57 DEXH-box helicase domain of DEAH-box helicase 57. DEAH-box helicase 57 (DHX57) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 177
35345 350744 cd17986 DEXQc_DQX1 DEXQ-box helicase domain of DEAQ-box RNA dependent ATPase 1. DEAQ-box RNA dependent ATPase 1 (DQX1) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 177
35346 350745 cd17987 DEXHc_YTHDC2 DEXH-box helicase domain of YTH domain containing 2. YTH domain containing 2 (YTHDC2) regulates mRNA translation and stability via binding to N6-methyladenosine, a modified RNA nucleotide enriched in the stop codons and 3' UTRs of eukaryotic messenger RNAs. YTHDC2 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 176
35347 350746 cd17988 DEXHc_TDRD9 DEXH-box helicase domain of tudor domain containing 9. Tudor domain containing 9 (TDRD9, also known as HIG-1 or NET54 or C14orf75) is a part of the nuclear PIWI-interacting RNA (piRNA) pathway essential for transposon silencing and male fertility. TDRD9 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 180
35348 350747 cd17989 DEXHc_HrpA DEXH-box helicase domain of ATP-dependent RNA helicase HrpA. HrpA is part of the HrpB-HrpA two-partner secretion (TPS) system, a secretion pathway important to the secretion of large virulence-associated proteins. HrpA belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 173
35349 350748 cd17990 DEXHc_HrpB DEXH-box helicase domain of ATP-dependent helicase HrpB. HrpB is part of the HrpB-HrpA two-partner secretion (TPS) system, a secretion pathway important to the secretion of large virulence-associated proteins. HrpB belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 174
35350 350749 cd17991 DEXHc_TRCF DEXH/Q-box helicase domain of the transcription-repair coupling factor. Transcription-repair coupling factor (TrcF) dissociates transcription elongation complexes blocked at nonpairing lesions and mediates recruitment of DNA repair proteins. TrcF is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 193
35351 350750 cd17992 DEXHc_RecG DEXH/Q-box helicase domain of RecG. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. It is a member of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 225
35352 350751 cd17993 DEXHc_CHD1_2 DEXH-box helicase domain of the chromodomain helicase DNA binding proteins 1 and 2, and similar proteins. Chromodomain-helicase-DNA-binding protein 1 (CHD1) is an ATP-dependent chromatin-remodeling factor which functions as the substrate recognition component of the transcription regulatory histone acetylation (HAT) complex SAGA. It regulates polymerase II transcription and is also required for efficient transcription by RNA polymerase I, and more specifically the polymerase I transcription termination step. It is not only involved in transcription-related chromatin-remodeling, but is also required to maintain a specific chromatin configuration across the genome. CHD1 is also associated with histone deacetylase (HDAC) activity. Chromodomain-helicase-DNA-binding protein 2 (CHD2) is a DNA-binding helicase that specifically binds to the promoter of target genes, leading to chromatin remodeling, possibly by promoting deposition of histone H3.3. It is involved in myogenesis via interaction with MYOD1; it binds to myogenic gene regulatory sequences and mediates incorporation of histone H3.3 prior to the onset of myogenic gene expression, promoting their expression. Both are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 218
35353 350752 cd17994 DEXHc_CHD3_4_5 DEAH-box helicase domain of the chromodomain helicase DNA binding proteins 3, 4 and 5. Chromodomain-helicase-DNA-binding protein 3 (CHD3) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. It is required for anchoring centrosomal pericentrin in both interphase and mitosis, for spindle organization and centrosome integrity. Chromodomain-helicase-DNA-binding protein 4 (CHD4) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. Chromodomain-helicase-DNA-binding protein 5 (CHD5) is a chromatin-remodeling protein that binds DNA through histones and regulates gene transcription. It is thought to specifically recognize and bind trimethylated 'Lys-27' (H3K27me3) and non-methylated 'Lys-4' of histone H3 and plays a role in the development of the nervous system by activating the expression of genes promoting neuron terminal differentiation. In parallel, it may also positively regulate the trimethylation of histone H3 at 'Lys-27' thereby specifically repressing genes that promote the differentiation into non-neuronal cell lineages. As a tumor suppressor, it regulates the expression of genes involved in cell proliferation and differentiation. In spermatogenesis, it probably regulates histone hyperacetylation and the replacement of histones by transition proteins in chromatin, a crucial step in the condensation of spermatid chromatin and the production of functional spermatozoa. CHD3, CHD4, and CHD5 are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 196
35354 350753 cd17995 DEXHc_CHD6_7_8_9 DEXH-box helicase domain of the chromodomain helicase DNA binding protein 6, 7, 8 and 9. Chromodomain-helicase-DNA-binding protein 6-9 (CHD6, CHD7, CHD8, and CHD9) are members of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 223
35355 350754 cd17996 DEXHc_SMARCA2_SMARCA4 DEXH-box helicase domain of SMARCA2 and SMARCA4. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, members 2 and 4 (SMARCA2 and SMARCA4) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 233
35356 350755 cd17997 DEXHc_SMARCA1_SMARCA5 DEAH-box helicase domain of SMARCA1 and SMARCA5. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 and 5 (SMARCA1 and SMARCA5) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 222
35357 350756 cd17998 DEXHc_SMARCAD1 DEXH-box helicase domain of SMARCAD1. SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A containing DEAD/H box 1 (SMARCAD1, also known as ATP-dependent helicase 1 or Hel1) possesses intrinsic ATP-dependent nucleosome-remodeling activity and is required for both DNA repair and heterochromatin organization. SMARCAD1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 187
35358 350757 cd17999 DEXHc_Mot1 DEXH-box helicase domain of Mot1. Modifier of transcription 1 (Mot1, also known as TAF172 in eukaryotes) regulates transcription in association with TATA binding protein (TBP). Mot1, Ino80C, and NC2 function coordinately to regulate pervasive transcription in yeast and mammals. Mot1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 232
35359 350758 cd18000 DEXHc_ERCC6 DEXH-box helicase domain of ERCC6. ERCC excision repair 6, chromatin remodeling factor (ERCC6, also known as Cockayne syndrome group B (CSB), Rad26 in Saccharomyces cerevisiae, and Rhp26 in Schizosaccharomyces pombe) is a DNA-binding protein that is important in transcription-coupled excision repair. ERCC6 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 193
35360 350759 cd18001 DEXHc_ERCC6L DEXH-box helicase domain of ERCC6L. ERCC excision repair 6 like, spindle assembly checkpoint helicase (ERCC6L, also known as RAD26L) is an essential component of the mitotic spindle assembly checkpoint, acting as a tension sensor that associates with catenated DNA which is stretched under tension until it is resolved during anaphase. ERCC6L is proposed to stimulate cancer cell proliferation by promoting the cell cycle via the RAB31-MAPK-CDK2 pathway. ERCC6L is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 232
35361 350760 cd18002 DEXQc_INO80 DEAQ-box helicase domain of INO80. INO80 is the catalytic ATPase subunit of the INO80 chromatin remodeling complex. INO80 removes histone H3-containing nucleosomes from associated chromatin, promotes CENP-ACnp1 chromatin assembly at the centromere in a redundant manner with another chromatin-remodeling factor Chd1Hrp1. INO80 mutants have severe defects in oxygen consumption and promiscuous cell division that is no longer coupled with metabolic status. INO80 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 229
35362 350761 cd18003 DEXQc_SRCAP DEXH/Q-box helicase domain of SRCAP. Snf2-related CBP activator (SRCAP, also known as SWR1 or DOMO1) is the core catalytic component of the multiprotein chromatin-remodeling SRCAP complex, that is necessary for the incorporation of the histone variant H2A.Z into nucleosomes. SRCAP is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 223
35363 350762 cd18004 DEXHc_RAD54 DEXH-box helicase domain of RAD54. RAD54 proteins play a role in recombination. They are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 240
35364 350763 cd18005 DEXHc_ERCC6L2 DEXH-box helicase domain of ERCC6L2. ERCC excision repair 6 like 2 (ERCC6L2, also known as RAD26L) may play a role in DNA repair and mitochondrial function. In humans, mutations in the ERCC6L2 gene are associated with bone marrow failure syndrome 2. ERCC6L2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 245
35365 350764 cd18006 DEXHc_CHD1L DEAH/Q-box helicase domain of CHD1L. Chromodomain helicase DNA binding protein 1 like (CHD1L, also known as ALC1) is involved in DNA repair by regulating chromatin relaxation following DNA damage. CHD1L is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 216
35366 350765 cd18007 DEXHc_ATRX-like DEXH-box helicase domain of ATRX-like proteins. This family includes ATRX-like members such as transcriptional regulator ATRX (also called alpha thalassemia/mental retardation syndrome X-linked and X-linked nuclear protein or XNP) which is involved in transcriptional regulation and chromatin remodeling, and ARIP4 (also called androgen receptor-interacting protein 4, RAD54 like 2 or RAD54L2) which modulates androgen receptor (AR)-dependent transactivation in a promoter-dependent manner. They are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 239
35367 350766 cd18008 DEXDc_SHPRH-like DEXH-box helicase domain of SHPRH-like proteins. The SHPRH-like subgroup belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 241
35368 350767 cd18009 DEXHc_HELLS_SMARCA6 DEXH-box helicase domain of HELLS. HELLS (helicase, lymphoid specific, also known as Lsh or SMARCA6) is a major epigenetic regulator crucial for normal heterochromatin structure and function. HELLS is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 236
35369 350768 cd18010 DEXHc_HARP_SMARCAL1 DEXH-box helicase domain of SMARCAL1. SMARCAL1 (SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a like 1, also known as HARP) is recruited to stalled replication forks to promote repair and helps restart replication. It plays a role in DNA repair, telomere maintenance and replication fork stability in response to DNA replication stress. Mutations cause Schimke Immunoosseous Dysplasia. SMARCAL1 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 213
35370 350769 cd18011 DEXDc_RapA DEXH-box helicase domain of RapA. In bacteria, RapA is an RNA polymerase (RNAP)-associated SWI2/SNF2 (switch/sucrose non-fermentable) protein that mediates RNAP recycling during transcription. The ATPase activity of RapA is stimulated by its interaction with RNAP and inhibited by its N-terminal domain. The conformational changes of RapA and its interaction with RNAP are essential for RNAP recycling. RapA is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 207
35371 350770 cd18012 DEXQc_arch_SWI2_SNF2 DEAQ-box helicase domain of archaeal and bacterial SNF2-related proteins. Proteins belonging to SNF2 family of DNA dependent ATPases are important members of the chromatin remodeling complexes that are implicated in epigenetic control of gene expression. The Snf2 family comprises a large group of ATP-hydrolyzing proteins that are ubiquitous in eukaryotes, but also present in eubacteria and archaea. Archaeal SWI2 and SNF2 are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 218
35372 350771 cd18013 DEXQc_bact_SNF2 DEXQ-box helicase domain of bacterial SNF2 family proteins. Proteins belonging to the SNF2 family of DNA dependent ATPases are important members of the chromatin remodeling complexes that are implicated in epigenetic control of gene expression. The Snf2 family comprises a large group of ATP-hydrolyzing proteins that are ubiquitous in eukaryotes, but also present in eubacteria and archaea. The bacterial SNF2 present in this family are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 218
35373 350772 cd18014 DEXHc_RecQ5 DEAH-box helicase domain of RecQ5. ATP-dependent DNA helicase Q5 (RecQ5) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 205
35374 350773 cd18015 DEXHc_RecQ1 DEXH-box helicase domain of RecQ1. ATP-dependent DNA helicase Q1 (RecQ1) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 209
35375 350774 cd18016 DEXHc_RecQ2_BLM DEAH-box helicase domain of RecQ2. ATP-dependent DNA helicase Q2 (RecQ2, also called Bloom syndrome protein homolog or BLM) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations in RecQ2 cause Bloom syndrome. 208
35376 350775 cd18017 DEXHc_RecQ3 DEAH-box helicase domain of RecQ3. DEAD-like helicase RecQ3 (also called Werner syndrome ATP-dependent helicase or WRN) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations cause Werner's syndrome. 193
35377 350776 cd18018 DEXHc_RecQ4-like DEAH-box helicase domain of RecQ4 and similar proteins. ATP-dependent DNA helicase Q4 (RecQ4) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations cause Rothmund-Thomson/RAPADILINO/Baller-Gerold syndrome. 201
35378 350777 cd18019 DEXHc_Brr2_1 N-terminal DEXH-box helicase domain of spliceosomal Brr2 RNA helicase. Brr2 is a type II DEAD box helicase that mediates spliceosome catalytic activation. It is a stable subunit of the spliceosome, required during splicing catalysis and spliceosome disassembly. Brr2 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 214
35379 350778 cd18020 DEXHc_ASCC3_1 N-terminal DEXH-box helicase domain of Activating signal cointegrator 1 complex subunit 3. Activating signal cointegrator 1 complex subunit 3 (ASCC3) is a type II DEAD box helicase that plays a role in the repair of N-alkylated nucleotides. ASCC3 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 199
35380 350779 cd18021 DEXHc_Brr2_2 C-terminal D[D/E]X[H/Q]-box helicase domain of spliceosomal Brr2 RNA helicase. Brr2 is a type II DEAD box helicase that mediates spliceosome catalytic activation. It is a stable subunit of the spliceosome, required during splicing catalysis and spliceosome disassembly. Brr2 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 191
35381 350780 cd18022 DEXHc_ASCC3_2 C-terminal DEXH-box helicase domain of Activating signal cointegrator 1 complex subunit 3. Activating signal cointegrator 1 complex subunit 3 (ASCC3) is a type II DEAD box helicase that plays a role in the repair of N-alkylated nucleotides. ASCC3 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 189
35382 350781 cd18023 DEXHc_HFM1 DEXH-box helicase domain of ATP-dependent DNA helicase HFM1. HFM1 is a type II DEAD box helicase, required for crossover formation and complete synapsis of homologous chromosomes during meiosis. HFM1 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 206
35383 350782 cd18024 DEXHc_Mtr4-like DEXH-box helicase domain of ATP-dependent RNA helicase Mtr4. Mtr4 (also known as DOB1 or SKIV2L2) is a type II DEAD box helicase that plays a role in the processing of structured RNAs, including the maturation of 5.8S ribosomal RNA (rRNA), and is part of the TRAMP complex that is involved in exosome-mediated degradation of aberrant RNAs. Mtr4 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 205
35384 350783 cd18025 DEXHc_DDX60 DEXH-box helicase domain of DEAD box protein 60. DEAD box protein 60 (DDX60) is an IFN-inducible cytoplasmic helicase that plays a role in RIG-I-mediated type I interferon (IFN) nuclease-mediated viral RNA degradation. DDX60 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 192
35385 350784 cd18026 DEXHc_POLQ-like DEXH-box helicase domain of DNA polymerase theta. DNA polymerase theta (POLQ) is important in the repair of genomic double-strand breaks (DSBs). POLQ contains an N-terminal type II DEAD box helicase domain which contains the ATP-binding region. 202
35386 350785 cd18027 DEXHc_SKIV2L DEXH-box helicase domain of SKIV2L. Superkiller viralicidic activity 2-like (SKIV2L, also called SKI2 or DHX13) plays a role in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. SKIV2L belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 179
35387 350786 cd18028 DEXHc_archSki2 DEXH-box helicase domain of archaeal Ski2-type helicase. Archaeal Ski2-type RNA helicases play an important role in RNA degradation, processing and splicing pathways. They belong to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 177
35388 350787 cd18029 DEXHc_XPB DEXH-box helicase domain of TFIIH XPB subunit and similar proteins. TFIIH basal transcription factor complex helicase XPB subunit (also known as DNA excision repair protein ERCC-3 or TFIIH 89 kDa subunit) is the ATP-dependent 3'-5' DNA helicase component of the core-TFIIH basal transcription factor, involved in nucleotide excision repair (NER) of DNA and, when complexed to CAK, in RNA transcription by RNA polymerase II. XPB is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 169
35389 350788 cd18030 DEXHc_RE_I_HsdR DEXH-box helicase domain of type I restriction enzyme HsdR subunit. The HsdR motor subunit of type I restriction-modification enzymes contains the DNA cleavage and ATP-dependent DNA translocation activities of the heteromeric complex. It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 208
35390 350789 cd18031 DEXHc_UvsW DEXH-box helicase domain of bacteriophage UvsW. Bacteriophage UvsW is part of the WXY system that repairs DNA damage by a process that involves homologous recombination. UvsW is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 161
35391 350790 cd18032 DEXHc_RE_I_III_res DEXH-box helicase domain of type III restriction enzyme res subunit. Members of this cd include both type I and type III restriction enzymes. Both are hetero-oligomeric proteins. Type I REs are encoded by three closely linked genes: a specificity subunit (HsdS or S) for recognizing a DNA sequence, a methylation subunit (HsdM or M) for methylating the recognized target bases, and a restriction subunit (HsdR or R) for the translocation and random cleavage of non-methylated DNA. They show diverse catalytic activities, including methyltransferase (MTase), ATP hydrolase (ATPase), DNA translocation and restriction activities. These enzymes cut at a site that differs, and is a random distance (at least 1000 bp) away, from their recognition site. Cleavage at these random sites follows a process of DNA translocation, which shows that these enzymes are also molecular motors. The recognition site is asymmetrical and is composed of two specific portions: one containing 3-4 nucleotides, and another containing 4-5 nucleotides, separated by a non-specific spacer of about 6-8 nucleotides. Type III enzymes are composed of two subunits, Res and Mod. The Mod subunit recognizes the DNA sequence specific for the system and is a modification methyltransferase; as such, it is functionally equivalent to the M and S subunits of type I restriction endonucleases. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognize short 5-6 bp-long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Both type I and type III REs are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 163
35392 350791 cd18033 DEXDc_FANCM DEAH-box helicase domain of FANCM. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. In complex with CENPS and CENPX, it binds double-stranded DNA (dsDNA), fork-structured DNA (fsDNA), and Holliday junction substrates. Its ATP-dependent DNA branch migration activity can process branched DNA structures such as a movable replication fork. This activity is strongly stimulated in the presence of CENPS and CENPX. In complex with FAAP24, it efficiently binds to single-strand DNA (ssDNA), splayed-arm DNA, and 3'-flap substrates. In vitro, on its own, it strongly binds ssDNA oligomers and weakly fsDNA, but does not bind to dsDNA. FANCM is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 182
35393 350792 cd18034 DEXHc_dicer DEXH-box helicase domain of endoribonuclease Dicer. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). In concert with Argonautes, these small RNAs bind complementary mRNAs to down-regulate their expression. miRNAs are processed by Dicer from small hairpins, while siRNAs are typically processed from longer dsRNA, from endogenous sources, or exogenous sources such as viral replication intermediates. Some organisms, such as Homo sapiens and Caenorhabditis elegans, encode one Dicer that generates miRNAs and siRNAs, but other organisms have multiple dicers with specialized functions. Dicers exist throughout eukaryotes, and a subset have an N-terminal helicase domain of the RIG-I-like receptor (RLR) subgroup. RLRs often function in innate immunity and Dicer helicase domains sometimes show differences in activity that correlate with roles in immunity. Dicer is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 200
35394 350793 cd18035 DEXHc_Hef DEXH-box helicase domain of Hef. Hef (helicase-associated endonuclease fork-structure) belongs to the XPF/MUS81/FANCM family of endonucleases and is involved in stalled replication fork repair. All archaea encode a protein of the XPF/MUS81/FANCM family of endonucleases. It exists in two forms: a long form, referred to as Hef, which consists of an N-terminal helicase fused to a C-terminal nuclease and is specific to euryarchaea, and a short form, referred to as XPF, which lacks the helicase domain and is specific to crenarchaea and thaumarchaea. Hef has the unique feature of having both active helicase and nuclease domains. This domain configuration is highly similar to that of human FANCM, a possible ortholog of archaeal Hef proteins. Hef is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 181
35395 350794 cd18036 DEXHc_RLR DEXH-box helicase domain of RIG-I-like receptors. RIG-I-like receptors (RLRs) sense cytoplasmic viral RNA and comprise RIG-I, RLR-2/MDA5 (melanoma differentiation-associated protein 5) and RLR-3/LGP2 (laboratory of genetics and physiology 2). RIG-I-like receptors (RLRs) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 204
35396 350795 cd18037 DEXSc_Pif1_like DEAD-box helicase domain of Pif1. Pif1 and other members of this family are RecD-like helicases involved in maintaining genome stability through unwinding double-stranded DNAs (dsDNAs), DNA/RNA hybrids, and G quadruplex (G4) structures. The members of the Pif1 helicase subfamily studied so far all appear to contribute to telomere maintenance. Pif1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 183
35397 350796 cd18038 DEXXQc_Helz-like DEXXQ/H-box helicase domain of Helz-like helicase. This subfamily contains HELZ, Mov10L1, and similar proteins. Helicase with zinc finger (HELZ) acts as a helicase that plays a role in RNA metabolism during development. Moloney leukemia virus 10-like protein 1 (Mov10L1) binds Piwi-interacting RNA (piRNA) precursors to initiate piRNA processing. All are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 229
35398 350797 cd18039 DEXXQc_UPF1 DEXXQ-box helicase domain of UPF1. UPF1 (also called RNA Helicase And ATPase, Regulator Of Nonsense Transcripts, or ATP-Dependent Helicase RENT1) is an RNA-dependent helicase and ATPase required for nonsense-mediated decay (NMD) of mRNAs containing premature stop codons. It is recruited to mRNAs upon translation termination and undergoes a cycle of phosphorylation and dephosphorylation; its phosphorylation appears to be a key step in NMD. It is recruited by release factors to stalled ribosomes together with the SMG1C protein kinase complex to form the transient SURF (SMG1-UPF1-eRF1-eRF3) complex. In EJC-dependent NMD, the SURF complex associates with the exon junction complex (EJC) located downstream from the termination codon through UPF2 and allows the formation of an UPF1-UPF2-UPF3 surveillance complex which is believed to activate NMD. Diseases associated with UPF1 include juvenile amyotrophic lateral sclerosis and epidermolysis bullosa, junctional, non-Herlitz type. UPF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 234
35399 350798 cd18040 DEXXc_HELZ2-C C-terminal DEXX-box helicase domain of HELZ2. Helicase with zinc finger 2 (HELZ2, also known as PPAR-alpha-interacting complex protein 285 or PRIC285 and PPAR-gamma DBD-interacting protein 1 or PDIP1) acts as a transcriptional coactivator for a number of nuclear receptors including PPARA, PPARG, THRA, THRB and RXRA. It belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 271
35400 350799 cd18041 DEXXQc_DNA2 DEXXQ-box helicase domain of DNA2. DNA2 (DNA Replication Helicase/Nuclease 2) possesses different enzymatic activities, such as single-stranded DNA (ssDNA)-dependent ATPase, 5'-3' helicase, and endonuclease activities, and is involved in DNA replication and DNA repair in the nucleus and mitochondrion. It is involved in Okazaki fragment processing by cleaving long flaps that escape FEN1: flaps that are longer than 27 nucleotides are coated by replication protein A complex (RPA), leading to recruitment of DNA2, which cleaves the flap until it is too short to bind RPA and becomes a substrate for FEN1. It is also involved in 5'-end resection of DNA during double-strand break (DSB) repair; it is recruited by BLM and mediates the cleavage of 5'-ssDNA, while the 3'-ssDNA cleavage is prevented by the presence of RPA. DNA2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 203
35401 350800 cd18042 DEXXQc_SETX DEXXQ-box helicase domain of SETX. The RNA/DNA helicase senataxin (SETX) plays a role in transcription, neurogenesis, and antiviral response. SETX is an R-loop-associated protein that is thought to function as an RNA/DNA helicase. R-loops consist of RNA/DNA hybrids, formed during transcription when nascent RNA hybridizes to the DNA template strand, displacing the non-template DNA strand. Mutations in SETX are linked to two neurodegenerative disorders: ataxia with oculomotor apraxia type 2 (AOA2) and amyotrophic lateral sclerosis type 4 (ALS4). S. cerevisiae homolog splicing endonuclease 1 (Sen1) is an exclusively nuclear protein, important for nucleolar organization. S. cerevisiae Sen1 and its ortholog, the Schizosaccharomyces pombe Sen1, share conserved domains and belong to the family I class of helicases. Both proteins translocate 5' to 3' and unwind both DNA and RNA duplexes and also RNA/DNA hybrids in vitro. SETX is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 217
35402 350801 cd18043 DEXXQc_SF1 DEXXQ-box helicase domain of Superfamily 1 helicases. Superfamily 1 (SF1) helicases are nucleic acid motor proteins that couple ATP hydrolysis to translocation along with the concomitant unwinding of DNA or RNA. This is central to many aspects of cellular DNA and RNA metabolism and accordingly, they are implicated in a wide range of nucleic acid processing events including DNA replication, recombination, and repair as well as many aspects of RNA metabolism. Superfamily 1 helicases are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 127
35403 350802 cd18044 DEXXQc_SMUBP2 DEXXQ-box helicase domain of SMUBP2. SMUBP2 (also called immunoglobulin mu-binding protein 2, or IGHMBP2) is a 5' to 3' helicase that unwinds RNA and DNA duplexes in an ATP-dependent reaction. It is a DNA-binding protein specific to 5'-phosphorylated single-stranded guanine-rich sequence (5'-GGGCT-3') related to the immunoglobulin mu chain switch region. The IGHMBP2 gene is responsible for Charcot-Marie-Tooth disease (CMT) type 2S and spinal muscular atrophy with respiratory distress type 1 (SMARD1). It is also thought to play a role in frontotemporal dementia (FTD) with amyotrophic lateral sclerosis (ALS) and major depressive disorder (MDD). SMUBP2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 191
35404 350803 cd18045 DEADc_EIF4AIII_DDX48 DEAD-box helicase domain of eukaryotic initiation factor 4A-III. Eukaryotic initiation factor 4A-III (EIF4AIII, also known as DDX48) is part of the exon junction complex (EJC) that plays a major role in posttranscriptional regulation of mRNA. EJC consists of four proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP. DDX48 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 201
35405 350804 cd18046 DEADc_EIF4AII_EIF4AI_DDX2 DEAD-box helicase domain of eukaryotic initiation factor 4A-I and 4A-II. Eukaryotic initiation factor 4A-I (DDX2A) and eukaryotic initiation factor 4A-II (DDX2B) are involved in cap recognition and are required for mRNA binding to the ribosome. They are DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 201
35406 350805 cd18047 DEADc_DDX19 DEAD-box helicase domain of DEAD box protein 19. DDX19 is an RNA helicase involved in both messenger RNA (mRNA) export from the nucleus into the cytoplasm and mRNA translation. DDX19 functions in the nucleus in resolving RNA:DNA hybrids (R-loops). Activation of a DNA damage response pathway dependent upon the ATR kinase, a major regulator of replication fork progression, stimulates translocation of DDX19 from the cytoplasm into the nucleus. Only nuclear DDX19 is competent to resolve R-loops. DDX19 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 205
35407 350806 cd18048 DEADc_DDX25 DEAD-box helicase domain of DEAD box protein 25. DDX25 (also called gonadotropin-regulated testicular RNA helicase, GRTH) is a testis-specific protein essential for completion of spermatogenesis. DDX25 is also a novel negative regulator of the IFN pathway and facilitates RNA virus infection. Diseases associated with DDX25 include hydrolethalus syndrome, an autosomal recessive lethal malformation syndrome characterized by multiple developmental defects of the fetus. DDX25 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 229
35408 350807 cd18049 DEADc_DDX5 DEAD-box helicase domain of DEAD box protein 5. DDX5 (also called RNA helicase P68, HLR1, G17P1, or HUMP68) is involved in pathways that include the alteration of RNA structures, plays a role as a coregulator of transcription, a regulator of splicing, and in the processing of small noncoding RNAs. It synergizes with DDX17 and SRA1 RNA to activate MYOD1 transcriptional activity and is involved in skeletal muscle differentiation. Dysregulation of this gene may play a role in cancer development. DDX5 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 234
35409 350808 cd18050 DEADc_DDX17 DEAD-box helicase domain of DEAD box protein 17. DDX17 (also called DEAD Box Protein P72 or DEAD Box Protein P82) has a wide variety of functions including regulating the alternative splicing of exons exhibiting specific features such as the inclusion of AC-rich alternative exons in CD44 transcripts, playing a role in innate immunity, and promoting mRNA degradation mediated by the antiviral zinc-finger protein ZC3HAV1 in an ATPase-dependent manner. DDX17 synergizes with DDX5 and SRA1 RNA to activate MYOD1 transcriptional activity and is involved in skeletal muscle differentiation. DDX17 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 271
35410 350809 cd18051 DEADc_DDX3 DEAD-box helicase domain of DEAD box protein 3. DDX3 (also called helicase-like protein, DEAD box, X isoform, or DDX14) has been reported to display a high level of RNA-independent ATPase activity stimulated by both RNA and DNA. This protein has multiple conserved domains and is thought to play roles in both the nucleus and cytoplasm. Nuclear roles include transcriptional regulation, mRNP assembly, pre-mRNA splicing, and mRNA export. In the cytoplasm, this protein is thought to be involved in translation, cellular signaling, and viral replication. Misregulation of this gene has been implicated in tumorigenesis. Diseases associated with DDX3 include mental retardation, X-linked 102 and agenesis of the corpus callosum, with facial anomalies and robin sequence. DDX3 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 249
35411 350810 cd18052 DEADc_DDX4 DEAD-box helicase domain of DEAD box protein 4. DEAD box protein 4 (DDX4, also known as VASA homolog) is an ATP-dependent RNA helicase required during spermatogenesis and is essential for the germline integrity. DEAD-box helicases are a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 264
35412 350811 cd18053 DEXHc_CHD1 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 1. Chromodomain-helicase-DNA-binding protein 1 (CHD1) is an ATP-dependent chromatin-remodeling factor which functions as substrate recognition component of the transcription regulatory histone acetylation (HAT) complex SAGA. It regulates polymerase II transcription and is also required for efficient transcription by RNA polymerase I, and more specifically the polymerase I transcription termination step. It is not only involved in transcription-related chromatin-remodeling, but also required to maintain a specific chromatin configuration across the genome. CHD1 is also associated with histone deacetylase (HDAC) activity. It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 237
35413 350812 cd18054 DEXHc_CHD2 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 2. Chromodomain-helicase-DNA-binding protein 2 (CHD2) is a DNA-binding helicase that specifically binds to the promoter of target genes, leading to chromatin remodeling, possibly by promoting deposition of histone H3.3. It is involved in myogenesis via interaction with MYOD1; it binds to myogenic gene regulatory sequences and mediates incorporation of histone H3.3 prior to the onset of myogenic gene expression, promoting their expression. CHD2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 237
35414 350813 cd18055 DEXHc_CHD3 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 3. Chromodomain-helicase-DNA-binding protein 3 (CHD3) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. It is required for anchoring centrosomal pericentrin in both interphase and mitosis, for spindle organization and centrosome integrity. CHD3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 232
35415 350814 cd18056 DEXHc_CHD4 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 4. Chromodomain-helicase-DNA-binding protein 4 (CHD4) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. CHD4 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 232
35416 350815 cd18057 DEXHc_CHD5 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 5. Chromodomain-helicase-DNA-binding protein 5 (CHD5) is a chromatin-remodeling protein that binds DNA through histones and regulates gene transcription. It is thought to specifically recognize and bind trimethylated 'Lys-27' (H3K27me3) and non-methylated 'Lys-4' of histone H3 and plays a role in the development of the nervous system by activating the expression of genes promoting neuron terminal differentiation. In parallel, it may also positively regulate the trimethylation of histone H3 at 'Lys-27' thereby specifically repressing genes that promote the differentiation into non-neuronal cell lineages. As a tumor suppressor, it regulates the expression of genes involved in cell proliferation and differentiation. In spermatogenesis, it probably regulates histone hyperacetylation and the replacement of histones by transition proteins in chromatin, a crucial step in the condensation of spermatid chromatin and the production of functional spermatozoa. CHD5 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 232
35417 350816 cd18058 DEXHc_CHD6 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 6. Chromodomain-helicase-DNA-binding protein 6 (CHD6) is a DNA-dependent ATPase that plays a role in chromatin remodeling. It regulates transcription by disrupting nucleosomes in a largely non-sliding manner which strongly increases the accessibility of chromatin. It activates transcription of specific genes in response to oxidative stress through interaction with NFE2L2.2 and acts as a transcriptional repressor of different viruses including influenza virus or papillomavirus. During influenza virus infection, the viral polymerase complex localizes CHD6 to inactive chromatin where it gets degraded in a proteasome-independent manner. CHD6 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 222
35418 350817 cd18059 DEXHc_CHD7 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 7. Chromodomain-helicase-DNA-binding protein 7 (CHD7) is a probable transcription regulator. It may be involved in the 45S precursor rRNA production. CHD7 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 222
35419 350818 cd18060 DEXHc_CHD8 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 8. Chromodomain-helicase-DNA-binding protein 8 (CHD8) is a DNA helicase that acts as a chromatin remodeling factor and regulates transcription. It also acts as a transcription repressor by remodeling chromatin structure and recruiting histone H1 to target genes. It suppresses p53/TP53-mediated apoptosis by recruiting histone H1 and preventing p53/TP53 transactivation activity and of STAT3 activity by suppressing the LIF-induced STAT3 transcriptional activity. It also acts as a negative regulator of Wnt signaling pathway and CTNNB1-targeted gene expression. CHD8 is also involved in both enhancer blocking and epigenetic remodeling at chromatin boundary via its interaction with CTCF. It also acts as a transcription activator via its interaction with ZNF143 by participating in efficient U6 RNA polymerase III transcription. CHD8 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 222
35420 350819 cd18061 DEXHc_CHD9 DEAH-box helicase domain of the chromodomain helicase DNA binding protein 9. Chromodomain-helicase-DNA-binding protein 9 (CHD9) acts as a transcriptional coactivator for PPARA and possibly other nuclear receptors. It is proposed to be an ATP-dependent chromatin remodeling protein. CHD9 has DNA-dependent ATPase activity and binds to A/T-rich DNA. It also associates with A/T-rich regulatory regions in promoters of genes that participate in the differentiation of progenitors during osteogenesis. CHD9 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 222
35421 350820 cd18062 DEXHc_SMARCA4 DEXH-box helicase domain of SMARCA4. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (SMARCA4, also known as transcription activator BRG1) is a component of the CREST-BRG1 complex that regulates promoter activation by orchestrating a calcium-dependent release of a repressor complex and a recruitment of an activator complex. Mutation of SMARCA4 (BRG1), the ATPase of BAF (mSWI/SNF) and PBAF complexes, contributes to a range of malignancies and neurologic disorders. SMARCA4 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 251
35422 350821 cd18063 DEXHc_SMARCA2 DEXH-box helicase domain of SMARCA2. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 (SMARCA2, also known as brahma homolog) is a component of the BAF complex. SMARCA2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 251
35423 350822 cd18064 DEXHc_SMARCA5 DEAH-box helicase domain of SMARCA5. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5 (SMARCA5, also called SNF2H) is the catalytic subunit of the four known chromatin-remodeling complexes: CHRAC, RSF, ACF/WCRF, and WICH. SMARCA5 plays a major role in organising arrays of nucleosomes adjacent to the binding sites for the architectural transcription factor CTCF and acts to promote CTCF binding. SMARCA5 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 244
35424 350823 cd18065 DEXHc_SMARCA1 DEAH-box helicase domain of SMARCA1. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 (SMARCA1, also called SNF2L) is a component of NURF (nucleosome-remodeling factor) and CERF (CECR2-containing-remodeling factor) complexes which promote the perturbation of chromatin structure in an ATP-dependent manner. SMARCA1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 233
35425 350824 cd18066 DEXHc_RAD54B DEXH-box helicase domain of RAD54B. DNA repair and recombination protein RAD54B, also known as RDH54, binds to double-stranded DNA, displays ATPase activity in the presence of DNA, and may have a role in meiotic and mitotic recombination. RAD54B is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 235
35426 350825 cd18067 DEXHc_RAD54A DEXH-box helicase domain of RAD54A. DNA repair and recombination protein RAD54A, also known as RAD54L or RAD54, plays a role in homologous recombination related repair of DNA double-strand breaks. RAD54A is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 243
35427 350826 cd18068 DEXHc_ATRX DEXH-box helicase domain of ATRX. Transcriptional regulator ATRX (also called alpha thalassemia/mental retardation syndrome X-linked and X-linked nuclear protein or XNP) is involved in transcriptional regulation and chromatin remodeling. Mutations in humans cause mental retardation, X-linked, syndromic, with hypotonic facies 1 (MRXSHF1) and alpha-thalassemia myelodysplasia syndrome (ATMDS). ATRX is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 246
35428 350827 cd18069 DEXHc_ARIP4 DEXH-box helicase domain of ARIP4. Androgen receptor-interacting protein 4 (ARIP4, also called RAD54 like 2 or RAD54L2) modulates androgen receptor (AR)-dependent transactivation in a promoter-dependent manner. ARIP4 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 227
35429 350828 cd18070 DEXQc_SHPRH DEXQ-box helicase domain of SHPRH. E3 ubiquitin-protein ligase SHPRH is a ubiquitously expressed protein that contains motifs characteristic of several DNA repair proteins, transcription factors, and helicases. SHPRH is a functional homolog of S. cerevisiae RAD5 and is involved in DNA repair. SHPRH is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 257
35430 350829 cd18071 DEXHc_HLTF1_SMARC3 DEXH-box helicase domain of HLTF1. Helicase like transcription factor (HLTF1, also known as HIP116 or SMARCA3) has both helicase and E3 ubiquitin ligase activities and ATP-dependent nucleosome-remodeling activity. HLTF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 239
35431 350830 cd18072 DEXHc_TTF2 DEAH-box helicase domain of TTF2. Transcription termination factor 2 (TTF2, also called Forkhead-box E1/FOXE1) is a transcription termination factor that couples ATP hydrolysis with the removal of RNA polymerase II from the DNA template. A single nucleotide polymorphism (SNP) within the 5'-UTR of TTF2 is associated with thyroid cancer risk. TTF2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 241
35432 350831 cd18073 DEXHc_RIG-I_DDX58 DEXH-box helicase domain of RIG-I. RIG-I (Retinoic acid-inducible gene I protein), also called DEAD box protein 58 (DDX58), is a pathogen-recognition receptor that recognizes viral 5'-triphosphates carrying double-stranded RNA. Upon binding to these microbe-associated molecular patterns (MAMPs), RIG-I forms oligomers and promotes downstream processes that result in type I interferon production and induction of an antiviral state. The optimal ligand for RIG-I has been found to be base-paired or double-stranded RNA (dsRNA) molecules containing a 5' triphosphate (5'-ppp-dsRNA). RIG-I contains two N-terminal caspase activation and recruitment domains (CARDs), which are required for interaction with IPS-1, a superfamily 2 helicase/translocase/ATPase (SF2) domain and a C-terminal regulatory/repressor domain (RD). RIG-I is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 202
35433 350832 cd18074 DEXHc_RLR-2 DEXH-box helicase domain of RLR-2. RIG-I-like receptor 2 (RLR-2, also known as melanoma differentiation-associated protein 5 or Mda5 and IFIH1) is a viral double-stranded RNA (dsRNA) receptor that shares sequence similarity and signaling pathways with RIG-I, yet plays essential functions in antiviral immunity through distinct specificity for viral RNA. RLR-2 recognizes the internal duplex structure, whereas RIG-I recognizes the terminus of dsRNA. RLR-2 uses direct protein-protein contacts to stack along dsRNA in a head-to-tail arrangement. The signaling domain (tandem CARD), which decorates the outside of the core RLR-2 filament, also has an intrinsic propensity to oligomerize into an elongated structure that activates the signaling adaptor, MAVS. RLR-2 uses long dsRNA as a signaling platform to cooperatively assemble the core filament, which in turn promotes stochastic assembly of the tandem CARD oligomers for signaling. LGP2 appears to positively and negatively regulate RLR-2 and RIG-I signaling, respectively. RLR-2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 216
35434 350833 cd18075 DEXHc_RLR-3 DEXH-box helicase domain of RLR-3. RIG-I-like receptor 3 (RLR-3, also known as laboratory of genetics and physiology 2 or LGP2 and DHX58) appears to positively and negatively regulate MDA5 and RIG-I signaling, respectively. RLR-3 resembles a chimera combining a MDA5-like helicase domain and RIG-I like CTD supporting both stem and end binding. RNA binding is required for RLR-3-mediated enhancement of MDA5 activation. RLR-3 end-binding may promote nucleation of MDA5 oligomerization on dsRNA. RLR-3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 200
35435 350834 cd18076 DEXXQc_HELZ2-N N-terminal DEXXQ-box helicase domain of HELZ2. Helicase with zinc finger 2 (HELZ2, also known as PPAR-alpha-interacting complex protein 285 or PRIC285 and PPAR-gamma DBD-interacting protein 1 or PDIP1) acts as a transcriptional coactivator for a number of nuclear receptors including PPARA, PPARG, THRA, THRB, and RXRA. It belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 230
35436 350835 cd18077 DEXXQc_HELZ DEXXQ-box helicase domain of HELZ. Helicase with zinc finger (HELZ) acts as a helicase that plays a role in RNA metabolism during development. HELZ is a member of the family I class of RNA helicases of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 226
35437 350836 cd18078 DEXXQc_Mov10L1 DEXXQ-box helicase domain of Mov10L1. Moloney leukemia virus 10-like protein 1 (Mov10L1) binds Piwi-interacting RNA (piRNA) precursors to initiate piRNA processing. Mov10L1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 230
35438 350837 cd18079 S-AdoMet_synt S-adenosylmethionine synthetase. S-adenosylmethionine synthetase (EC 2.5.1.6), also known as methionine adenosyltransferase, catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP in two steps, the formation of AdoMet and hydrolysis of the tripolyphosphate, which occurs prior to release of the product from the enzyme, which consists of three structural domains that have a similar alpha+beta fold. 371
35439 349953 cd18080 TrmD-like tRNA-M1G37-methyltransferase TrmD. The bacterial tRNA-(N(1)G37) methyltransferase (TrmD) catalyzes the transfer of a methyl group from S-adenosyl-L-methionine (AdoMet) to the N1 position of G37 in the anticodon loop of a subset of tRNA that contains a G at position 36. The presence of the modification prevents Watson-Crick base-pairing of this guanosine with cytosine in mRNA and translational frame-shifting. This family of proteins contains members of the SPOUT methyltransferases. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 219
35440 349954 cd18081 RlmH-like 23S-rRNA-pseudouridine1915-N3-methyltransferase RlmH. 23S rRNA (pseudouridine1915-N3)-methyltransferase RlmH catalyzes the addition of a methyl group at the N-3 position of pseudouridine Psi1915 in 23S rRNA to form m(3)Psi1915. This family of proteins belongs to the SPOUT methyltransferases. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 152
35441 349955 cd18082 SpoU-like_family SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 145
35442 349956 cd18083 aTrm56-like archaeal tRNA (cytidine(56)-2'-O)-methyltransferase Trm56. Archaeal tRNA (cytidine(56)-2'-O)-methyltransferase Trm56 catalyzes the 2'-O-ribose methylation of cytidine at position 56 in tRNAs. Trm56 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 169
35443 349957 cd18084 RsmE-like SPOUT superfamily RNA methyltransferase RsmE-like. 16S rRNA m3U1498 methyltransferase RsmE modifies nucleotides during ribosomal RNA maturation in a site-specific manner. The Escherichia coli member is specific for U1498 methylation. RsmE is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 159
35444 349958 cd18085 TM1570-like SPOUT superfamily RNA methyltransferase TM1570-like. DUF2168; This domain, found in various hypothetical prokaryotic proteins, has no known function. It is also found in a few prokaryotic tRNA (guanine-N(1)-)-methyltransferases. Proteins of this family are members of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 178
35445 349959 cd18086 HsC9orf114-like SPOUT superfamily RNA methyltransferase HsC9orf114-like. Human C9orf114 (also known as centromere protein 32 or CENP-32) is required for association of the centrosomes with the poles of the bipolar mitotic spindle during metaphase. CENP-32 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 187
35446 349960 cd18087 TrmY-like tRNA (pseudouridine(54)-N(1))-methyltransferase TrmY. tRNA (pseudouridine(54)-N(1))-methyltransferase TrmY catalyzes the N1-methylation of pseudouridine at position 54 (Psi54) in tRNAs. TrmY is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 193
35447 349961 cd18088 Nep1-like 18S rRNA (pseudouridine(1248)-N1)-methyltransferase Nep1. 18S rRNA (pseudouridine(1248)-N1)-methyltransferase Nep1 (also known as EMG1) methylates pseudouridine at position 1248 (Psi1248) in 18S rRNA and is required for small subunit (SSU) ribosomal RNA (rRNA) maturation. Mutations in human Nep1 cause Bowen-Conradi Syndrome. Nep1 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 204
35448 349962 cd18089 SPOUT_Trm10-like tRNA methyltransferase Trm10-like. Family of tRNA methyltransferase Trm10-like proteins catalyzes the N(1) methylation of guanine at position 9 (m(1)G9) of tRNA (eukaryotes) or N(1) methylation of guanine or adenine at position 9 (m1G9/m1A9) of tRNA (archaea), which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 171
35449 349963 cd18090 Arginine_MT_Sfm1 SAM-dependent arginine methyltransferase related to yeast Sfm1. Arginine methyltransferase Sfm1 methylates R146 of 40S ribosomal protein S3 (Rps3), which contacts 18S RNA. Sfm1 is part of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent mainly RNA MTases which are structurally characterized by a deep trefoil knot. 140
35450 349964 cd18091 SpoU-like_TRM3-like SAM-dependent tRNA methylase related to TRM3. Yeast tRNA (guanosine(18)-2'-O)-methyltransferase TRM3 catalyzes the formation of 2'-O-methylguanosine at position 18 (Gm18) in various tRNAs. TRM3 is similar to the C-terminal domain of TAR (HIV-1) RNA binding protein 1 (TARBP1), a protein binding to TAR, which functions as an RNA regulatory signal by forming a stable stem-loop structure to which transactivator protein Tat binds. The role of TARBP1 is believed to be to disengage RNA polymerase II from TAR during transcriptional elongation. TRM3 and the C-terminal methyltransferase domain of TARBP1 are members of the SPOUT methyltransferase superfamily. 145
35451 349965 cd18092 SpoU-like_TrmH SAM-dependent tRNA methylase related to TrmH. TrmH catalyzes the transfer of the methyl group from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of the ribose of the universally conserved guanosine 18 (G18) position in tRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 162
35452 349966 cd18093 SpoU-like_TrmJ SAM-dependent tRNA methylase related to TrmJ. tRNA methyltransferase TrmJ catalyzes the methyl transfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH at position 32 in both tRNASer1 and tRNAGln2. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 153
35453 349967 cd18094 SpoU-like_TrmL SAM-dependent tRNA methylase related to TrmL. tRNA (Um34/Cm34) methyltransferase TrmL catalyzes the methyl transfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH at position 34 in both tRNA(Leu)(CmAA) and tRNA(Leu)(cmnm5UmAA). It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 145
35454 349968 cd18095 SpoU-like_rRNA-MTase SAM-dependent rRNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 143
35455 349969 cd18096 SpoU-like SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 140
35456 349970 cd18097 SpoU-like SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 144
35457 349971 cd18098 SpoU-like SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 138
35458 349972 cd18099 Trm10arch archaeal tRNA(m1G9/m1A9)-methyltransferase Trm10. Archaeal tRNA(m1G9/m1A9)-methyltransferase Trm10 catalyzes the N(1) methylation of guanine or adenine at position 9 (m1G9/m1A9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 170
35459 349973 cd18100 Trm10euk_B eukaryotic tRNA m1G9 methyltransferase Trm10 homolog B. Eukaryotic tRNA m1G9 methyltransferase Trm10 homolog B (TM10B) catalyzes the N(1) methylation of guanine at position 9 (m(1)G9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 182
35460 349974 cd18101 Trm10euk_A eukaryotic tRNA m1G9 methyltransferase Trm10 homolog A. Eukaryotic tRNA m1G9 methyltransferase Trm10 homolog A (TM10A) catalyzes the N(1) methylation of guanine at position 9 (m(1)G9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 174
35461 349975 cd18102 Trm10_MRRP1 Mitochondrial ribonuclease P protein 1. Mitochondrial ribonuclease P protein 1 (or tRNA methyltransferase 10 homolog C) functions in mitochondrial tRNA maturation and is part of mitochondrial ribonuclease P, an enzyme composed of MRPP1/RG9MTD1, MRPP2/HSD17B10 and MRPP3/KIAA0391, which cleaves tRNA molecules in their 5'-ends. MRRP1 is related to Trm10, a tRNA m1G9 methyltransferase and is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 179
35462 349976 cd18103 SpoU-like_RlmB SAM-dependent rRNA methylase related to RlmB. 23S rRNA-M2G2251-MTase RlmB catalyzes the methylation of guanosine 2251, a modification conserved in the peptidyltransferase domain of 23S rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 143
35463 349977 cd18104 SpoU-like_RNA-MTase SAM-dependent RNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 146
35464 349978 cd18105 SpoU-like_MRM1 SAM-dependent rRNA methylase related to MRM1. MRM1 catalyzes the methylation of 2'-O-ribose residues G1145 to GmG residue of the mitochondrial 16S rRNA. MRM1 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 158
35465 349979 cd18106 SpoU-like_RNMTL1 SAM-dependent rRNA methylase related to RNMTL1. RNMTL1 (also known as HC90, MRM3 and RMTL1) catalyzes the methylation of 2'-O-ribose residues G1370 to GmG residue of the mitochondrial 16S rRNA. RNMTL1 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 148
35466 349980 cd18107 SpoU-like_AviRb SAM-dependent rRNA methylase related to AviRb. AviRb from Streptomyces viridochromogenes methylates the 2'-O atom of U2479 of the 23S ribosomal RNA. AviRb is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 148
35467 349981 cd18108 SpoU-like_NHR Nosiheptide-resistance methyltransferase (NHR). Nosiheptide-resistance methyltransferase (NHR) confers resistance to the thiazole antibiotic nosiheptide via catalyzing 2'O-methylation of 23S rRNA at the nucleotide A1067. NHR is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 144
35468 349982 cd18109 SpoU-like_RNA-MTase SAM-dependent RNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. 141
35469 349745 cd18110 ATP-synt_F1_beta_C F1-ATP synthase beta (B) subunit, C-terminal domain. The beta (B) subunit of the F1 complex of F0F1-ATP synthase, C-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic. 108
35470 349746 cd18111 ATP-synt_V_A-type_alpha_C V/A-type ATP synthase catalytic subunit A (alpha), C-terminal domain. The alpha (A) subunit of the V1/A1 complex of V/A-type ATP synthases, C-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. 105
35471 349747 cd18112 ATP-synt_V_A-type_beta_C V/A-type ATP synthase beta (B) subunit, C-terminal domain. The beta (B) subunit of the V1/A1 complexes of V/A-type ATP synthases, C-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. This subfamily consists of the non-catalytic beta subunit. 95
35472 349748 cd18113 ATP-synt_F1_alpha_C F1-ATP synthase alpha (A) subunit, C-terminal domain. The alpha (A) subunit of the F1 complex of F0F1-ATP synthase, C-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic. 126
35473 349749 cd18114 ATP-synt_flagellum-secretory_path_III_C Flagellum-specific ATP synthase, C-terminal domain. The C-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The flagellum-specific ATPase FliI is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of FoF1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway. 71
35474 349739 cd18115 ATP-synt_F1_beta_N F1-ATP synthase beta (B) subunit, N-terminal domain. The beta (B) subunit of the F1 complex of FoF1-ATP synthase, N-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic. 76
35475 349740 cd18116 ATP-synt_F1_alpha_N F1-ATP synthase alpha (A) subunit, N-terminal domain. The alpha (A) subunit of the F1 complex of FoF1-ATP synthase, N-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, in mitochondrial inner membranes, and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic. 67
35476 349741 cd18117 ATP-synt_flagellum-secretory_path_III_N Flagellum-specific ATP synthase, N-terminal domain. The N-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The FliI ATPase is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of F1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens, such as Salmonella and Chlamydia, also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway. 70
35477 349742 cd18118 ATP-synt_V_A-type_beta_N V/A-type ATP synthase beta (B) subunit, N-terminal domain. The beta (B) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex which contains three copies each of the alpha and beta subunits that form the soluble catalytic core, that is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex which forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. This subfamily consists of the non-catalytic beta subunit. 72
35478 349743 cd18119 ATP-synt_V_A-type_alpha_N V/A-type ATP synthase catalytic subunit A (alpha), N-terminal domain. The alpha (A) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex contains three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. 67
35479 349413 cd18120 ATP-synt_Vo_Ao_c Membrane-bound Vo/Ao complexes of V/A-type ATP synthases, subunit c. Vo/Ao-ATP synthase subunit c. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPases) is exclusively found in archaea and functions like the F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. The V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. The V1 complex consists of three A and three B subunits, two G subunits plus the C, D, E, F, and H subunits. The Vo complex consists of five different subunits: a, c, c', c'', and d. The Ao/A1 complexes are composed of nine subunits in a stoichiometry of A(3):B(3):C:D:E:F:H(2):a:c(x). ATP is synthesized on the A3:B3 hexamer and the energy released during that process is transferred to the Ao complex, which consists of the C-terminal segment of subunit a and subunit c. 62
35480 349414 cd18121 ATP-synt_Fo_c membrane-bound Fo complex of F-ATP synthase, subunit c. Subunit c (also called subunit 9, or proteolipid) of the Fo complex of F-ATP synthase. The F-ATP synthase (also called FoF1-ATPase) consists of two structural domains: the F1 (factor one) complex containing the soluble catalytic core, and the Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F1 is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1, while Fo consists of the three subunits a, b, and c (1:2:10-14). An oligomeric ring of 10-14 c subunits (c-ring) makes up the Fo rotor. The flux of protons through the ATPase channel (Fo) drives the rotation of the c-ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the c-ring of Fo. The F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. The F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. This group also includes F-ATP synthase that has also been found in the archaea Methanosarcina acetivorans. 65
35481 350838 cd18133 HLD_clamp helical lid domain of clamp loader-like AAA+ proteins. Clamp loader complexes are multisubunit complexes that play an important role in DNA replication. They open sliding clamps for assembly and close them around DNA, specifically targeting them to sites where DNA synthesis is initiated and orienting them correctly for replication. The subunits belong to the clamp loader clade of AAA+ superfamily. 65
35482 350839 cd18137 HLD_clamp_pol_III_gamma_tau helical lid domain of DNA polymerase III subunits gamma and tau. DNA polymerase III subunit gamma/tau is part of the DNA polymerase III holoenzyme. Gamma and tau subunits are isoforms, both containing the helical lid domain. Gamma interacts with the delta subunit to transfer the beta subunit on the DNA while tau serves as a scaffold to help in the dimerization of the core complex. Both are members of the clamp-loader clade of the AAA+ superfamily. 65
35483 350840 cd18138 HLD_clamp_pol_III_delta helical lid domain of DNA polymerase III subunits delta. DNA polymerase III subunit delta is part of the DNA polymerase III holoenzyme. The delta subunit is required for ring opening and binds the beta subunit. It is a member of the clamp-loader clade of the AAA+ superfamily. 65
35484 350841 cd18139 HLD_clamp_RarA helical lid domain of recombination factor protein RarA. Recombination factor RarA (Replication associated recombination gene/protein A, also known as MgsA (Maintenance of genome stability A) or Mgs1 in yeast and WRNIP1 in mammals) is a member of the clamp-loader clade of the AAA+ superfamily. It functions as a tetramer. RarA co-localizes with the replication fork throughout the cell cycle and may play a role in the rescue of stalled replication forks. 75
35485 350842 cd18140 HLD_clamp_RFC helical lid domain of replication factor C subunit. Replication factor C (RFC) is a five-protein clamp loader complex that forms a stable ATP-dependent complex with the sliding clamp, PCNA, which binds specifically to primed DNA. RFC subunits belong to the clamp loader clade of the AAA+ superfamily. 63
35486 381143 cd18159 REC_OmpR_NsrR-like phosphoacceptor receiver (REC) domain of Streptococcus agalactiae NsrR-like OmpR family response regulators. Streptococcus agalactiae NsrR is a lantibiotic resistance-associated response regulator and is part of the nisin resistance operon. It is a member of the NsrRK two-component system (TCS) that is involved in the regulation of lantibiotic resistance genes such as a membrane-associated lipoprotein of LanI, and the nsr gene cluster which encodes for the resistance protein NSR and the ABC transporter NsrFP, both conferring resistance against nisin. This subfamily also includes Staphylococcus epidermidis GraR, part of the GraR/GraS TCS involved in resistance against cationic antimicrobial peptides, and Bacillus subtilis BceR, part of the BceS/BceR TCS involved in the regulation of bacitracin resistance. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 113
35487 381144 cd18160 REC_CpdR_CckA-like phosphoacceptor receiver (REC) domain of Brucella abortus CpdR and CckA, and similar domains. Two-component systems (TCSs), consisting of a sensor and a response regulator, are used by bacteria to adapt to changing environments. Processes regulated by TCSs in bacteria include sporulation, pathogenicity, virulence, chemotaxis and membrane transport. Response regulators share the common phosphoacceptor REC domain and differ in their output domains, such as DNA-, RNA-, ligand-, and protein-binding, or enzymatic domains. CpdR is a stand-alone REC protein. CckA is a sensor histidine kinase containing N-terminal PAS domains and a C-terminal REC domain. CpdR and CckA are components of a regulatory phosphorelay system (composed of CckA, ChpT, CtrA and CpdR) that controls Brucella abortus cell growth, division, and intracellular survival inside mammalian host cells. CckA autophosphorylates in the presence of ATP and transfers a phosphoryl group to the conserved aspartic acid residue on its C-terminal REC domain, which is relayed to the ChpT phosphotransferase. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 103
35488 381145 cd18161 REC_hyHK_blue-like phosphoacceptor receiver (REC) domain of hybrid sensor histidine kinase/response regulators similar to Pseudomonas savastanoi blue-light-activated histidine kinase. Typically, two-component regulatory systems (TCSs) consist of a sensor (histidine kinase) that responds to specific input(s) by modifying the output of a cognate response regulator (RR). TCSs allow organisms to sense and respond to changes in environmental conditions. Hybrid sensor histidine kinase (HK)/response regulators contain all the elements of a classical TCS in a single polypeptide chain. Pseudomonas savastanoi blue-light-activated histidine kinase is a photosensitive HK and RR that is involved in increased bacterial virulence upon exposure to light. RRs share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 102
35489 349482 cd18172 M14_CP_plant Zinc carboxypeptidase, including SOL1, a carboxypeptidase D in plant. This family includes only plant members of the carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). It includes Arabidopsis thaliana SOL1 carboxypeptidase D which is known to possess enzymatic activity to remove the C-terminal arginine residue of CLE19 proprotein in vitro, and SOL1-dependent cleavage of the C-terminal arginine residue is necessary for CLE19 activity in vivo. The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. 276
35490 349483 cd18173 M14_CP_bacteria bacterial peptidase M14 carboxypeptidase, uncharacterized. This family contains only bacterial carboxypeptidase (CP) members of the M14 family of metallocarboxypeptidases (MCPs), most of which have yet to be characterized. The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. 281
35491 349484 cd18174 M14_ASTE_ASPA_like Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 187
35492 349415 cd18175 ATP-synt_Vo_c_ATP6C_rpt1 V-type proton ATPase 16 kDa proteolipid subunit (ATP6C/ATP6V0C/ATP6L/ATPL) and similar proteins. ATP6C (also called the V-ATPase 16 kDa proteolipid subunit, or vacuolar proton pump 16 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells. 68
35493 349416 cd18176 ATP-synt_Vo_c_ATP6C_rpt2 V-type proton ATPase 16 kDa proteolipid subunit (ATP6C/ATP6V0C/ATP6L/ATPL) and similar proteins. ATP6C (also called V-ATPase 16 kDa proteolipid subunit, or vacuolar proton pump 16 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells. 68
35494 349417 cd18177 ATP-synt_Vo_c_ATP6F_rpt1 V-type proton ATPase 21 kDa proteolipid subunit (ATP6F/ATP6V0B) and similar proteins. ATP6F (also called V-ATPase 21 kDa proteolipid subunit, or vacuolar proton pump 21 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells. 63
35495 349418 cd18178 ATP-synt_Vo_c_ATP6F_rpt2 V-type proton ATPase 21 kDa proteolipid subunit (ATP6F/ATP6V0B) and similar proteins. ATP6F (also called V-ATPase 21 kDa proteolipid subunit, or vacuolar proton pump 21 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells. 65
35496 349419 cd18179 ATP-synt_Vo_Ao_c_NTPK_rpt1 V-type sodium ATPase subunit K (NTPK) and similar proteins. NTPK (also called Na(+)-translocating ATPase subunit K, or sodium ATPase proteolipid component) is involved in ATP-driven sodium extrusion. 63
35497 349420 cd18180 ATP-synt_Vo_Ao_c_NTPK_rpt2 V-type sodium ATPase subunit K (NTPK) and similar proteins. NTPK (also called Na(+)-translocating ATPase subunit K, or sodium ATPase proteolipid component) is involved in ATP-driven sodium extrusion. 64
35498 349421 cd18181 ATP-synt_Vo_Ao_c_TtATPase_like Thermus thermophilus V/A-ATPase and similar proteins. This family includes a group of uncharacterized ATPase similar to Thermus thermophilus V/A-ATPase, which is homologous to the eukaryotic V-ATPase, but has a simpler subunit composition and functions in vivo to synthesize ATP rather than pump protons. 62
35499 349422 cd18182 ATP-synt_Fo_c_ATP5G3 ATP synthase F(0) complex subunit C3 (ATP5G3) and similar proteins. ATP5G3 (also called ATP synthase lipid-binding protein, ATP synthase proteolipid P3, ATP synthase proton-transporting mitochondrial F(o) complex subunit C3, ATPase protein 9, or ATPase subunit c) transports protons across the inner mitochondrial membrane to the F1-ATPase protruding on the matrix side, resulting in the generation of ATP. 65
35500 349423 cd18183 ATP-synt_Fo_c_ATPH F-type proton-translocating ATP synthase (ATPH) and similar proteins. This family includes subunit c of chloroplast F-ATP synthase (F1Fo-ATP synthase), also known as ATP synthase F(o) sector subunit c (also called ATPase subunit III, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpH. 75
35501 349424 cd18184 ATP-synt_Fo_c_NaATPase F-type sodium ion-translocating ATP synthase and similar proteins. This family includes F-type Na(+)-coupled ATP synthase and similar proteins. 65
35502 349425 cd18185 ATP-synt_Fo_c_ATPE F-type proton-translocating ATPase subunit c (ATPE) and similar proteins. This family includes subunit c of F-ATP synthase (also called ATP synthase F(o) sector subunit c, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpE. 65
35503 349497 cd18186 BTB_POZ_ZBTB_KLHL-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing (ZBTB) proteins, Kelch-like (KLHL) proteins, and similar proteins. This family includes a variety of BTB/POZ domain-containing proteins, such as zinc finger and BTB domain-containing (ZBTB) proteins and Kelch-like (KLHL) proteins. They have diverse functions, such as transcriptional regulation, chromatin remodeling, protein degradation and cytoskeletal regulation. Many BTB/POZ proteins contain one or two additional domains, such as kelch repeats, zinc-finger domains, FYVE (Fab1, YOTB, Vac1, and EEA1) fingers, or ankyrin repeats. These special additional domains or interaction partners provide unique characteristics and functions to BTB/POZ proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 82
35504 349498 cd18187 BTB_POZ_Kv_KCTD BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in voltage-gated potassium (Kv) channels and potassium channel tetramerization domain-containing (KCTD) proteins. This family includes two protein groups: voltage-gated potassium (Kv) channels and potassium channel tetramerization domain-containing (KCTD) proteins. Kv channels are membrane proteins with fundamental physiological roles. They are responsible for a variety of electrical phenomena, such as the repolarization of the action potential, spike frequency adaptation, synaptic repolarization, and smooth muscle contraction. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels, and others. All family members contain the BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. 83
35505 349499 cd18190 BTB_POZ_ETO1-like BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana ethylene-overproduction protein 1 (ETO1) and similar proteins. ETO1, also called protein ethylene overproducer 1, is an essential regulator of the ethylene pathway, which acts by regulating the stability of 1-aminocyclopropane-1-carboxylate synthase (ACS) enzymes. It may act as a substrate-specific adaptor that connects ACS enzymes, such as ACS5, to ubiquitin ligase complexes, leading to proteasomal degradation of ACS enzymes. The family also includes ETO1-like proteins 1 (EOL1) and 2 (EOL2). ETO1, EOL1, and EOL2 contain a BTB domain and tetratricopeptide (TPR) repeats. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 83
35506 349500 cd18191 BTB_POZ_ARMC5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in armadillo repeat-containing protein 5 (ARMC5). ARMC5 plays a role in steroidogenesis, and modulates the expression of steroidogenic enzymes and cortisol production. It negatively regulates adrenal cell survival. It contains armadillo (ARM) repeats and a BTB domain, which is a common protein-protein interaction motif of about 100 amino acids. 100
35507 349501 cd18192 BTB_POZ_ZBTB1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 1 (ZBTB1). ZBTB1 acts as a transcriptional repressor that represses cAMP-responsive element (CRE)-mediated transcriptional activation. It also has a role in translesion DNA synthesis, and is essential for lymphocyte development. ZBTB1 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 114
35508 349502 cd18193 BTB_POZ_ZBTB2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 2 (ZBTB2). ZBTB2 is a POZ domain Kruppel-like zinc finger (POK) family transcription factor acting as a potent repressor of the ARF-HDM2-p53-p21 pathway, which is important in cell cycle regulation. It represses transcription of the ARF, p53, and p21 genes, but activates the HDM2 gene. ZBTB2 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 115
35509 349503 cd18194 BTB_POZ_ZBTB3-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing proteins, ZBTB3, ZBTB18, ZBTB42 and similar proteins. The family includes zinc finger and BTB domain-containing proteins, ZBTB3, ZBTB18 and ZBTB42. ZBTB3 is a transcription factor essential for cancer cell growth via the regulation of the reactive oxygen species (ROS) detoxification pathway. ZBTB18 is a sequence-specific transrepressor associated with heterochromatin. ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. Members of this family contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35510 349504 cd18195 BTB_POZ_ZBTB4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 4 (ZBTB4). ZBTB4, also called KAISO-like zinc finger protein 1 (KAISO-L1), is a transcriptional repressor with bimodal DNA-binding specificity. It binds with a higher affinity to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3' but can also bind to the non-methylated consensus sequence 5'-CTGCNA-3', also known as the consensus kaiso binding site (KBS). It can also bind specifically to a single methyl-CpG pair and can bind hemimethylated DNA but with a lower affinity compared to methylated DNA. ZBTB4 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 124
35511 349505 cd18196 BTB_POZ_ZBTB5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 5 (ZBTB5). ZBTB5 is a POZ domain Kruppel-like zinc finger (POK) family transcription repressor of cell cycle arrest gene p21 and a potential proto-oncogene stimulating cell proliferation. ZBTB5 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 126
35512 349506 cd18197 BTB_POZ_ZBTB6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 6 (ZBTB6). ZBTB6, also called zinc finger protein 482 (ZNF482) or zinc finger protein with interaction domain, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 116
35513 349507 cd18198 BTB_POZ_ZBTB7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7 (ZBTB7). There are three ZBTB7 isoforms: ZBTB7A, ZBTB7B, and ZBTB7C. ZBTB7A is a transcription repressor of key glycolytic genes, including GLUT3, PFKP, and PKM, and its downregulation in human cancer contributes to tumor metabolism. ZBTB7B is a transcriptional regulator of extracellular matrix gene expression. ZBTB7C is a transcriptional repressor with a pro-oncogenic role that relies upon binding to p53 and inhibition of its transactivation function. ZBTB7 isoforms contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 120
35514 349508 cd18199 BTB_POZ_ZBTB8 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8 (ZBTB8). There are two ZBTB8 isoforms: ZBTB8A and ZBTB8B. ZBTB8A is a novel proto-oncoprotein that stimulates cell proliferation. ZBTB8B may be involved in transcriptional regulation. They both contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 113
35515 349509 cd18200 BTB_POZ_ZBTB9 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 9 (ZBTB9). ZBTB9 may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 111
35516 349510 cd18201 BTB_POZ_ZBTB10 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 10 (ZBTB10). ZBTB10, also called zinc finger protein RIN ZF, is an mRNA target of miR-27a and a transcriptional repressor of Specificity protein (Sp) expression. The microRNA-27a:ZBTB10-specificity protein pathway is involved in follicle stimulating hormone-induced VEGF, Cox2, and survivin expression in ovarian epithelial cancer cells. ZBTB10 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 122
35517 349511 cd18202 BTB_POZ_ZBTB11 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 11 (ZBTB11). ZBTB11 is a transcriptional repressor of TP53. It is critical for basal and emergency granulopoiesis. It regulates neutrophil development through its integrase-like zinc finger domain. ZBTB11 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 118
35518 349512 cd18203 BTB_POZ_ZBTB12 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 12 (ZBTB12). ZBTB12, also called protein G10, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 122
35519 349513 cd18204 BTB_POZ_ZBTB14 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 14 (ZBTB14). ZBTB14 is also called zinc finger protein 161 (Zfp-161), zinc finger protein 478, zinc finger protein 5 (ZF5), or Zfp-5. It is a novel transcriptional activator of the dopamine transporter, binding its promoter at the consensus sequence 5'-CCTGCACAGTTCACGGA-3'. It also binds to 5'-d(GCC)(n)-3' trinucleotide repeats in promoter regions and acts as a repressor of the FMR1 gene. ZBTB14 acts as a transcriptional repressor of MYC and thymidine kinase promoters. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 114
35520 349514 cd18205 BTB_POZ_ZBTB16_PLZF BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 16 (ZBTB16). ZBTB16 is also called promyelocytic leukemia zinc finger protein, zinc finger protein 145, or zinc finger protein PLZF. It is a DNA-binding transcription factor essential for undifferentiated cell maintenance. ZBTB16 also acts as a downstream transcriptional regulator of Osterix and can be useful as a late marker of osteoblastic differentiation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 107
35521 349515 cd18206 BTB_POZ_ZBTB17_MIZ1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 17 (ZBTB17). ZBTB17 is also called c-Myc-interacting zinc finger protein 1 (Miz-1), zinc finger protein 151, or zinc finger protein 60. It is a poly-Cys2His2 zinc finger (ZF) transcription factor that can function as an activator or repressor depending on its binding partners, and by targeting negative regulators of cell cycle progression. ZBTB17 has been implicated in cardiomyopathy and is important in cardiac stress response. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 112
35522 349516 cd18207 BTB_POZ_ZBTB19_PATZ1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in POZ-, AT hook-, and zinc finger-containing protein 1 (PATZ1). PATZ1 is also called zinc finger and BTB domain-containing protein 19 (ZBTB19), BTB/POZ domain zinc finger transcription factor, protein kinase A RI subunit alpha-associated protein, zinc finger protein 278, or zinc finger sarcoma gene protein. It is an important transcriptional regulatory factor that regulates divergent pathways depending on the cellular context. For instance, it acts as a transcriptional suppressor that functions in T lymphocytes. It is also a DNA damage-responsive transcription factor that inhibits p53 function. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35523 349517 cd18208 BTB_POZ_ZBTB20_DPZF BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 20 (ZBTB20). ZBTB20, also called dendritic-derived BTB/POZ zinc finger protein (DPZF) or zinc finger protein 288, may be a transcription factor involved in hematopoiesis, oncogenesis, and immune responses. It is an essential regulator of hepatic lipogenesis and may be a therapeutic target for the treatment of fatty liver disease. It also functions as a critical regulator of anterior pituitary development and lactotrope specification. Moreover, it promotes astrocytogenesis during neocortical development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 117
35524 349518 cd18209 BTB_POZ_ZBTB21_ZNF295 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 21 (ZBTB21). ZBTB21, also called zinc finger protein 295 (ZNF295), is a transcription repressor that acts in a selective manner on different promoters. It may be involved in the bi-directional control of gene expression in concert with another transcription factor ZFP161. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 112
35525 349519 cd18210 BTB_POZ_ZBTB22_BING1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 22 (ZBTB22). ZBTB22, also called protein BING1 or zinc finger protein 297, may be involved in transcriptional regulation. Its gene, together with BING 3-5, TAPASIN, DAXX, RGL2, and HKE2, forms a dense cluster at the centromeric end of the major histocompatibility complex class I region. ZBTB22 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 112
35526 349520 cd18211 BTB_POZ_ZBTB23_GZF1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in glial cell line-derived neurotrophic factor-inducible zinc finger protein 1 (GZF1). GZF1 is also called GDNF-inducible zinc finger protein 1, zinc finger and BTB domain-containing protein 23 (ZBTB23), or zinc finger protein 336 (ZNF336). It is a sequence-specific transcriptional repressor that binds the GZF1 responsive element (GRE), with the consensus sequence of 5'-TGCGCN[TG][CA]TATA-3'. It may play a role in renal branching morphogenesis. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35527 349521 cd18212 BTB_POZ_ZBTB24_ZNF450 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 24 (ZBTB24). ZBTB24, also called zinc finger protein 450, functions as a transcription factor essentially involved in B-cell functions in humans. The loss-of-function mutations in ZBTB24 can cause ICF2 (immunodeficiency, centromeric instability and facial anomalies syndrome 2) with immunological characteristics of greatly reduced serum antibodies and circulating memory B cells. ZBTB24 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 118
35528 349522 cd18213 BTB_POZ_ZBTB25_ZNF46_KUP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 25 (ZBTB25). ZBTB25, also called zinc finger protein 46 (ZNF46) or zinc finger protein KUP, is a transcription repressor that facilitates viral RNA transcription and replication. It interacts with viral RNA-dependent RNA polymerase (RdRp) proteins and modulates their transcription activity. It also functions as a viral RNA-binding protein, binding preferentially to the U-rich sequence within 5' UTR of vRNA. ZBTB25 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35529 349523 cd18214 BTB_POZ_ZBTB26_Bioref BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 26 (ZBTB26). ZBTB26, also called zinc finger protein 481 (ZNF481) or zinc finger protein Bioref, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 122
35530 349524 cd18215 BTB_POZ_ZBTB27-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell lymphoma 6 proteins, BCL-6 and BCL-6B. This family includes B-cell lymphoma 6 proteins, BCL-6 and BCL-6B. BCL-6 is a transcriptional repressor mainly required for germinal center (GC) formation and antibody affinity maturation, which have different mechanisms of action specific to the lineage and biological functions. BCL-6B is a sequence-specific transcriptional repressor in association with BCL-6. It may function in a narrow stage or be related to some events in the early B-cell development. Family members contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 113
35531 349525 cd18216 BTB_POZ_ZBTB29-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer proteins, Hic-1 and Hic-2. The family includes hypermethylated in cancer proteins, Hic-1 and Hic-2. Hic-1 is a sequence-specific transcriptional repressor that recognizes and binds to the consensus sequence 5'-[CG]NG[CG]GGGCA[CA]CC-3'. Hic-2 is a homolog of tumor suppressor Hic-1 that functions as a transcriptional regulator. Family members contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 118
35532 349526 cd18217 BTB_POZ_ZBTB31_myoneurin BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in myoneurin. Myoneurin, also called zinc finger and BTB domain-containing protein 31 (ZBTB31), is a novel member of the BTB/POZ-zinc finger family highly expressed in the neuromuscular system and is associated with neuromuscular junctions during the late embryonic period. It may function as a synaptic gene regulator. Myoneurin contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 111
35533 349527 cd18218 BTB_POZ_ZBTB32_FAZF_TZFP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 32 (ZBTB32). ZBTB32 is also called FANCC-interacting protein, fanconi anemia zinc finger protein (FAZF), testis zinc finger protein (TZFP), or zinc finger protein 538 (ZNF538). It is a DNA-binding transcription factor that binds to the 5'-TGTACAGTGT-3' core sequence. It acts as a transcription suppressor that controls T cell-mediated autoimmunity. ZBTB32 is essential for down-regulation of GATA3 via ZPO2; this promotes aggressive breast cancer development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 110
35534 349528 cd18219 BTB_POZ_ZBTB33_KAISO BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kaiso. Kaiso, also called zinc finger and BTB domain-containing protein 33 (ZBTB33), is a DNA methylation-dependent transcriptional repressor that binds to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3'. It also binds to the non-methylated consensus sequence 5'-CTGCNA-3', also known as the consensus kaiso binding site (KBS). It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 106
35535 349529 cd18220 BTB_POZ_ZBTB34 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 34 (ZBTB34). ZBTB34 acts as a transcriptional regulator. It downregulates specificity protein (Sp) transcription factors Sp1, Sp3, and Sp4 in pancreatic cancer cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 120
35536 349530 cd18221 BTB_POZ_ZBTB35_ZNF131 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger protein 131 (ZNF131). ZNF131, also called zinc finger and BTB domain-containing protein 35 (ZBTB35), is a transcriptional activator implicated as a regulator of Kaiso-mediated biological processes. It regulates cell growth of developing and mature T cells. It inhibits estrogen signaling by suppressing estrogen receptor alpha homo-dimerization, and plays a role in breast cancer cell proliferation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 113
35537 349531 cd18222 BTB_POZ_ZBTB37 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 37 (ZBTB37). ZBTB37 may be involved in transcriptional regulation. It is differentially expressed in aryl hydrocarbon receptor (AhR)-KO mice compared with WT mice, and may potentially contribute to the aging phenotype of AhR-KO mice. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 123
35538 349532 cd18223 BTB_POZ_ZBTB38_CIBZ BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 38 (ZBTB38). ZBTB38, also termed CIBZ, is a transcriptional regulator with bimodal DNA-binding specificity. It binds with a higher affinity to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3', as well as E-box elements (5'-CACGTG-3'). It can also bind specifically to a single methyl-CpG pair. ZBTB38 represses transcription in a methyl-CpG-dependent manner. It is a negative regulator of endoplasmic reticulum stress-associated apoptosis. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 114
35539 349533 cd18224 BTB_POZ_ZBTB39 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 39 (ZBTB39). ZBTB39 may be involved in transcriptional regulation. Its specific function is as yet unknown. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 123
35540 349534 cd18225 BTB_POZ_ZBTB40 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 40 (ZBTB40). ZBTB40 may be involved in transcriptional regulation. Single-nucleotide polymorphisms of ZBTB40 are associated with bone mineral density in European and East-Asian populations. ZBTB40 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 116
35541 349535 cd18226 BTB_POZ_ZBTB41 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 41 (ZBTB41). ZBTB41, also called FRBZ1, may be involved in transcriptional regulation. Its specific function is as yet unknown. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 114
35542 349536 cd18227 BTB_POZ_ZBTB43 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 43 (ZBTB43). ZBTB43, also called zinc finger and BTB domain-containing protein 22B (ZBTB22b), zinc finger protein 297B (ZNF297B), or ZnF-x, may be involved in transcriptional regulation. It interacts with BDP1, a subunit of transcription factor IIIB (TFIIIB). Since BDP1 is essential in Pol III transcription, ZBTB43 may also regulate these transcriptional pathways. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 121
35543 349537 cd18228 BTB_POZ_ZBTB44 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 44 (ZBTB44). ZBTB44, also called BTB/POZ domain-containing protein 15 (BTBD15) or zinc finger protein 851 (ZNF851), may be involved in transcriptional regulation. Single-nucleotide polymorphisms of ZBTB44 showed a suggestive association with disease progression of Crohn's disease. ZBTB44 has also preferentially been recognized by sera of patients with peripheral T-cell lymphoma (PTCL). It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 126
35544 349538 cd18229 BTB_POZ_ZBTB45 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 45 (ZBTB45). ZBTB45, also called zinc finger protein 499 (ZNF499), may act as a transcriptional regulator that is essential for proper glial differentiation of neural and oligodendrocyte progenitor cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 112
35545 349539 cd18230 BTB_POZ_ZBTB46 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 46 (ZBTB46). ZBTB46 is also called BTB-ZF protein expressed in effector lymphocytes (BZEL), BTB/POZ domain-containing protein 4 (BTBD4), or zinc finger protein 340 (ZNF340). It is a conventional dendritic cell (cDC) lineage specific transcription factor that acts as a negative regulator required to prevent activation of classical dendritic cells in the steady state. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 125
35546 349540 cd18231 BTB_POZ_ZBTB47_ZNF651 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 47 (ZBTB47). ZBTB47, also called zinc finger protein 651 (ZNF651), is a paralog of ZNF652, a novel zinc-finger transcriptional repressor. It interacts with CBFA2T3 via its carboxy-terminal proline-rich region. CBFA2T3-ZNF651 functions as a transcriptional co-repressor complex. ZBTB47 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 115
35547 349541 cd18232 BTB_POZ_ZBTB48_TZAP_KR3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in telomere zinc finger-associated protein (TZAP). TZAP is also called Krueppel-related zinc finger protein 3 (KR3), zinc finger and BTB domain-containing protein 48 (ZBTB48), or zinc finger protein 855 (ZNF855). It is a vertebrate telomere-binding protein involved in telomere length control. It directly binds the telomeric double-stranded 5'-TTAGGG-3' repeat. TZAP also acts as a transcription regulator that binds to promoter regions. It is a transcriptional activator of alternate reading frame (ARF) gene. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 108
35548 349542 cd18233 BTB_POZ_ZBTB49 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 49 (ZBTB49). ZBTB49, also called zinc finger protein 509 (ZNF509), is a transcription factor that inhibits cell proliferation by activating either CDKN1A/p21 transcription or RB1 transcription. There are four ZNF509 isoforms generated by alternative splicing. Short ZNF509 (ZNF509S1, -S2 and -S3) isoforms contain one or two out of the seven zinc-fingers contained in long ZNF509 (ZNF509L). ZBTB49 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 112
35549 349543 cd18234 BTB_POZ_KLHL1-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins KLHL1, KLHL4 and KLHL5. This family contains the Kelch-like proteins: KLHL1, KLHL4 and KLHL5, all of which share high identity and similarity with the Drosophila kelch protein, a component of ring canals. KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. Family members contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 105
35550 349544 cd18235 BTB_POZ_KLHL2-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins, KLHL2 and KLHL3. The family includes Kelch-like proteins, KLHL2 and KLHL3. KLHL2 is a novel actin-binding protein predominantly expressed in brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 and KLHL3 each function as a component of an E3 ubiquitin ligase complex that mediates the ubiquitination of target proteins. They contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35551 349545 cd18236 BTB_POZ_KLHL6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 6 (KLHL6). KLHL6 is a BTB-kelch protein with a lymphoid tissue-restricted expression pattern. It is involved in B-lymphocyte antigen receptor signaling and germinal center formation. It belongs to the KLHL gene family, which is composed of an N-terminal BTB-POZ domain and four to six Kelch motifs in tandem. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 129
35552 349546 cd18237 BTB_POZ_KLHL7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 7 (KLHL7). KLHL7 is a component of a Cul3-based E3 ubiquitin ligase complex and is involved in the ubiquitination of target proteins for proteasome-mediated degradation. Mutations in KLHL7 cause autosomal-dominant retinitis pigmentosa. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 126
35553 349547 cd18238 BTB_POZ_KLHL8 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 8 (KLHL8). KLHL8 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for the ubiquitination and degradation of rapsyn, a postsynaptic protein required for clustering of nicotinic acetylcholine receptors (nAChRs) at the neuromuscular junction. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35554 349548 cd18239 BTB_POZ_KLHL9_13 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins KLHL9 and KLHL13. KLHL9 and KLHL13 (also called BTB and kelch domain-containing protein 2, or BKLHD2) are substrate-specific adaptors of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL9-KLHL13) E3 ubiquitin ligase complex mediates the ubiquitination of AURKB and controls the dynamic behavior of AURKB on mitotic chromosomes, thereby coordinating faithful mitotic progression and completion of cytokinesis. They contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 128
35555 349549 cd18240 BTB_POZ_KLHL10 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 10 (KLHL10). KLHL10 is a substrate-specific adaptor of a CUL3-based E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins specifically in the testis during spermatogenesis. Haploinsufficiency of Klhl10 causes infertility in male mice. KLHL10 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35556 349550 cd18241 BTB_POZ_KLHL11 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 11 (KLHL11). KLHL11 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, leading most often to their proteasomal degradation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 135
35557 349551 cd18242 BTB_POZ_KLHL12_C3IP1_DKIR BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 12 (KLHL12). KLHL12, also called CUL3-interacting protein 1 (C3IP1) or DKIR homolog, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a negative regulator of the Wnt signaling pathway and ER-Golgi transport. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35558 349552 cd18243 BTB_POZ_KLHL14_printor BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 14 (KLHL14). KLHL14 is also called protein interactor of Torsin-1A (TOR1A), protein interactor of torsinA, or Printor. It is a novel torsinA-interacting protein that preferentially interacts with ATP-free form of TOR1A and is implicated in dystonia pathogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 130
35559 349553 cd18244 BTB_POZ_KLHL15 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 15 (KLHL15). KLHL15 is a substrate-specific adaptor for the Cullin3 E3 ubiquitin-protein ligase complex that targets the serine/threonine-protein phosphatase 2A (PP2A) subunit PPP2R5B for ubiquitination and subsequent proteasomal degradation, thus promoting exchange with other regulatory subunits. It also plays a key role in DNA damage response, favoring DNA double-strand repair through error-prone non-homologous end joining (NHEJ) over error-free, RBBP8-mediated homologous recombination (HR), by targeting the DNA-end resection factor RBBP8/CtIP for ubiquitination and subsequent proteasomal degradation. KLHL15 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 137
35560 349554 cd18245 BTB_POZ_KLHL16_gigaxonin BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in gigaxonin. Gigaxonin, also called Kelch-like protein 16 (KLHL16), may be a cytoskeletal component that directly or indirectly plays an important role in neurofilament architecture. It may also act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins, including tubulin folding cofactor B (TBCB), microtubule-associated protein MAP1B, and glial fibrillary acidic protein (GFAP). Gigaxonin is mutated in giant axonal neuropathy. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 111
35561 349555 cd18246 BTB_POZ_KLHL17_actinfilin BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 17 (KLHL17). KLHL17, also called actinfilin, is a substrate-recognition component of some cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes. It acts as a Cullin 3 (Cul3) substrate adaptor that links GLUR6 to the E3 ubiquitin-ligase complex, and mediates the ubiquitination and subsequent degradation of GLUR6. It may play a role in actin-based neuronal function. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 139
35562 349556 cd18247 BTB_POZ_KLHL18 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 18 (KLHL18). KLHL18 acts as a substrate-specific adaptor for a Cullin3 E3 ubiquitin-protein ligase complex that regulates mitotic entry and ubiquitylates Aurora-A. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 116
35563 349557 cd18248 BTB_POZ_KLHL19_KEAP1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like ECH-associated protein 1 (KEAP1). KEAP1, also called cytosolic inhibitor of Nrf2 (INrf2) or Kelch-like protein 19 (KLHL19), is a redox-regulated substrate adaptor protein for a Cullin3-dependent ubiquitin ligase complex that targets NFE2L2/NRF2 for ubiquitination and degradation by the proteasome, thus resulting in the suppression of its transcriptional activity and the repression of antioxidant response element-mediated detoxifying enzyme gene expression. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35564 349558 cd18249 BTB_POZ_KLHL20_KLEIP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 20 (KLHL20). KLHL20, also called Kelch-like ECT2-interacting protein (KLEIP) or Kelch-like protein X, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex involved in interferon response and anterograde Golgi to endosome transport. KLHL20 plays a role in actin assembly at cell-cell contact sites of Madin-Darby canine kidney cells. It also controls endothelial migration and sprouting angiogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 128
35565 349559 cd18250 BTB_POZ_KLHL21 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 21 (KLHL21). KLHL21 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for efficient chromosome alignment and cytokinesis. The BCR(KLHL21) E3 ubiquitin ligase complex regulates localization of the chromosomal passenger complex (CPC) from chromosomes to the spindle midzone in anaphase and mediates the ubiquitination of aurora B. KLHL21 also targets IkappaB kinase-beta to regulate nuclear factor kappa-light chain enhancer of activated B cells (NF-kappaB) signaling negatively. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35566 349560 cd18251 BTB_POZ_KLHL22 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 22 (KLHL22). KLHL22 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for chromosome alignment and localization of polo-like kinase 1 (PLK1) at kinetochores. The BCR(KLHL22) ubiquitin ligase complex mediates mono-ubiquitination of PLK1, leading to PLK1 dissociation from phosphoreceptor proteins and subsequent removal from kinetochores, allowing silencing of the spindle assembly checkpoint (SAC) and chromosome segregation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 125
35567 349561 cd18252 BTB_POZ_KLHL23 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 23 (KLHL23). KLHL23 overexpression is associated with increased cell proliferation and invasion in gastric cancer. Downregulation of KLHL23 is associated with invasion, metastasis, and poor prognosis of hepatocellular carcinoma and pancreatic cancer. KLHL23 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 127
35568 349562 cd18253 BTB_POZ_KLHL24_KRIP6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 24 (KLHL24). KLHL24, also called kainate receptor-interacting protein for GluR6 (KRIP6) or protein DRE1, is necessary to maintain the balance between intermediate filament stability and degradation, a process that is essential for skin integrity. KLHL24 is a component of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that mediates ubiquitination of KRT14 and controls its levels during keratinocyte differentiation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35569 349563 cd18254 BTB_POZ_KLHL25 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 25 (KLHL25). KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that is required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1). Cullin3-KLHL25 ubiquitin ligase also targets ATP-citrate lyase (ACLY), a key enzyme for lipid synthesis, for degradation to inhibit lipid synthesis and tumor progression. KLHL25 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 128
35570 349564 cd18255 BTB_POZ_KLHL26 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 26 (KLHL26). KLHL26 is encoded by the klhl26 gene, which is regulated by p53 via fuzzy tandem repeats. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35571 349565 cd18256 BTB_POZ_KLHL27_IPP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in intracisternal A particle-promoted polypeptide (IPP). IPP, also called Kelch-like protein 27 (KLHL27) or actin-binding protein IPP, is an actin-binding protein that may play a role in organizing the actin cytoskeleton. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 125
35572 349566 cd18257 BTB_POZ_KLHL28_BTBD5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 28 (KLHL28). KLHL28, also called BTB/POZ domain-containing protein 5 (BTBD5), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 118
35573 349567 cd18258 BTB_POZ_KLHL29_KBTBD9 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 29 (KLHL29). KLHL29 is also called Kelch repeat and BTB domain-containing protein 9 (KBTBD9). A novel fusion transcript NR5A2-KLHL29FT, resulting from transchromosomal insertion, may influence the origin or progression of colon cancer. KLHL29 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 125
35574 349568 cd18259 BTB_POZ_KLHL30 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 30 (KLHL30). KLHL30 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 137
35575 349569 cd18260 BTB_POZ_KLHL31_KBTBD1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 31 (KLHL31). KLHL31 is also called BTB and kelch domain-containing protein 6 (BKLHD6), Kelch repeat and BTB domain-containing protein 1 (KBTBD1), or Kelch-like protein KLHL. It is a transcriptional repressor in the MAPK/JNK signaling pathway to regulate cellular functions. Overexpression inhibits the transcriptional activities of both the TPA-response element (TRE) and serum response element (SRE). It is also a novel modulator of canonical Wnt signaling, which is important for vertebrate myogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35576 349570 cd18261 BTB_POZ_KLHL32 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 32 (KLHL32). KLHL32, also called BTB and kelch domain-containing protein 5 (BKLHD5), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. Deletion of KLHL32 may be associated with Tourette syndrome and obsessive-compulsive disorder. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 133
35577 349571 cd18262 BTB1_POZ_KLHL33 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 33 (KLHL33). KLHL33 contains BTB domains and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. KLHL33 gene expression in normal and tumor tissue suggests a significant association with prostate cancer risk. KLHL33 contains two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 101
35578 349572 cd18263 BTB2_POZ_KLHL33 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 33 (KLHL33). KLHL33 contains BTB domains and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. KLHL33 gene expression in normal and tumor tissue suggests a significant association with prostate cancer risk. KLHL33 contains two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 118
35579 349573 cd18264 BTB_POZ_KLHL34 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 34 (KLHL34). KLHL34 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The methylation status of KLHL34 cg14232291 appears to be predictive of pathologic response to preoperative chemoradiation therapy in rectal cancer patients. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 136
35580 349574 cd18265 BTB_POZ_KLHL35 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 35 (KLHL35). KLHL35 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. Significant differences in DNA methylation of the KLHL35 gene in abdominal aortic aneurysm (AAA) patients compared to non-AAA controls suggest a potential role in AAA pathology. Hypermethylation of the KLHL35 gene has also been associated with the development of hepatocellular carcinoma. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 128
35581 349575 cd18266 BTB_POZ_KLHL36 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 36 (KLHL36). KLHL36 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 135
35582 349576 cd18267 BTB_POZ_KLHL37_ENC1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ectoderm-neural cortex protein 1 (ENC-1). ENC-1 is also called Kelch-like protein 37 (KLHL37), nuclear matrix protein NRP/B, or p53-induced gene 10 protein. It is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 147
35583 349577 cd18268 BTB_POZ_KLHL38 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 38 (KLHL38). KLHL38 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The KLHL38 gene is significantly up-regulated during diapause, a temporary arrest of development during early ontogeny. It may also function in preadipocyte differentiation in the chicken. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 129
35584 349578 cd18269 BTB_POZ_KLHL40-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins, KLHL40 and KLHL41. This family includes Kelch-like proteins, KLHL40 and KLHL41. KLHL40 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. KLHL41 is a novel kelch related protein that is involved in pseudopod elongation in transformed cells. They both contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 133
35585 349579 cd18270 BTB_POZ_KBTBD2_BKLHD1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 2 (KBTBD2). KBTBD2, also called BTB and kelch domain-containing protein 1 (BKLHD1), plays an essential role in the regulation of insulin-signaling pathway. It is a BTB-Kelch family substrate recognition subunit of the Cullin-3-based E3 ubiquitin ligase, which targets p85alpha, the regulatory subunit of the phosphoinositol-3-kinase (PI3K) heterodimer, causing p85alpha ubiquitination and proteasome-mediated degradation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 133
35586 349580 cd18271 BTB_POZ_KBTBD3_BKLHD3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 3 (KBTBD3). KBTBD3, also called BTB and kelch domain-containing protein 3 (BKLHD3), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 130
35587 349581 cd18272 BTB_POZ_KBTBD4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 4 (KBTBD4). KBTBD4, also called BTB and kelch domain-containing protein 4 (BKLHD4), is a BTB-BACK-Kelch domain protein belonging to a large family of cullin-RING ubiquitin ligase adaptors that facilitate the ubiquitination of target substrates. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 140
35588 349582 cd18273 BTB_POZ_KBTBD6_7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing proteins KBTBD6 and KBTBD7. KBTBD6 and KBTBD7 are substrate adaptors of a cullin-3 RING ubiquitin ligase complex that mediates ubiquitylation and proteasomal degradation of T-lymphoma and metastasis gene 1 (TIAM1), a RAC1-specific guanine exchange factor (GEF), by cooperating with gamma-aminobutyric acid receptor-associated proteins (GABARAP). KBTBD7 may also act as a new transcriptional activator in mitogen-activated protein kinase (MAPK) signaling. They both contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 142
35589 349583 cd18274 BTB_POZ_KBTBD8 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 8 (KBTBD8). KBTBD8, also called T-cell activation kelch repeat protein (TA-KRP), is a BTB-kelch family protein that is located in the Golgi apparatus and translocates to the spindle apparatus during mitosis. It acts as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a regulator of neural crest specification. The BCR(KBTBD8) complex monoubiquitylates NOLC1 and its paralog TCOF1, the mutation of which underlies the neurocristopathy Treacher Collins syndrome. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 129
35590 349584 cd18275 BTB_POZ_KBTBD11 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 11 (KBTBD11). KBTBD11 is also called chronic myelogenous leukemia-associated protein (CMLAP) or Kelch domain-containing protein 7B, or KLHDC7C. It is a BTB-Kelch family protein whose function remains unclear. A novel polymorphism rs11777210 in KBTBD11 is significantly associated with colorectal cancer risk. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 104
35591 349585 cd18276 BTB_POZ_KBTBD12 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 12 (KBTBD12). KBTBD12, also called Kelch domain-containing protein 6 (KLHDC6), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 127
35592 349586 cd18277 BTB_POZ_BACH1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog 1 (BACH1). BACH1, also called BTB-basic leucine zipper transcription factor 1, belongs to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. It can act as repressor or activator. BACH1 is a heme-responsive transcriptional repressor of heme oxygenase (HO)-1. It represses genes involved in heme metabolism and oxidative stress response. It is also a negative regulator of nuclear factor erythroid 2-related factor 2 (Nrf2) that controls antioxidant response elements (ARE)-dependent gene expressions. BACH1 binds to NF-E2 binding sites in vitro, and plays important roles in coordinating transcription activation and repression by MafK. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35593 349587 cd18278 BTB_POZ_BACH2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog 2 (BACH2). BACH2, also called BTB-basic leucine zipper transcription factor 2, belongs to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. BACH2 is a lymphoid-specific transcription factor with a prominent role in B-cell development. It is transcriptionally regulated by the BCR/ABL oncogene. It represses the anti-apoptotic factor heme oxygenase-1 (HO-1). It is also a potent general repressor of effector differentiation in naive T cells. Moreover, BACH2 is required for pulmonary surfactant homeostasis and alveolar macrophage function. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35594 349588 cd18279 BTB_POZ_SPOP-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein (SPOP) and similar proteins. This family includes speckle-type POZ protein (SPOP), speckle-type POZ protein-like (SPOPL), TD and POZ domain-containing proteins (TDPOZ), Drosophila melanogaster protein roadkill and similar proteins. Both SPOP and SPOPL serve as adaptors of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes that mediate the ubiquitination and proteasomal degradation of target proteins. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. Drosophila melanogaster protein roadkill, also called Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading the Ci/Gli transcription factor. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35595 349589 cd18280 BTB_POZ_BPM_plant BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in plant BTB/POZ-MATH (BPM) protein family. The BPM protein family includes Arabidopsis thaliana BTB/POZ and MATH domain-containing proteins, AtBPM1-6, and similar proteins from other plants. BPM protein, also called protein BTB-POZ and MATH domain, may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35596 349590 cd18281 BTB_POZ_BTBD1_2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing proteins, BTBD1 and BTBD2. This family includes BTB/POZ domain-containing proteins BTBD1 and BTBD2, both of which are BTB-domain-containing Kelch-like proteins that interact with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 127
35597 349591 cd18282 BTB_POZ_BTBD3_6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing proteins, BTBD3 and BTBD6. This family includes BTB/POZ domain-containing proteins BTBD3 and BTBD6, both of which are BTB-domain-containing Kelch-like proteins. BTBD3 controls dendrite orientation toward active axons in mammalian neocortex. BTBD6 is required for proper embryogenesis and plays an essential evolutionary conserved role during neuronal development. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 108
35598 349592 cd18283 BTB1_POZ_BTBD7 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 92
35599 349593 cd18284 BTB2_POZ_BTBD7 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 146
35600 349594 cd18285 BTB1_POZ_BTBD8 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental processes. It may also act as a protein-protein adaptor in a transcription complex and thus be involved in brain development. BTBD8 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 104
35601 349595 cd18286 BTB2_POZ_BTBD8 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental processes. It may also act as a protein-protein adaptor in a transcription complex and thus be involved in brain development. BTBD8 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35602 349596 cd18287 BTB_POZ_BTBD9 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 9 (BTBD9). BTBD9 is a Cullin-3 substrate adaptor whose gene is a risk factor for Restless Legs Syndrome (RLS). The BTBD9 gene may be associated with antipsychotic-induced RLS in schizophrenia. Mutations in BTBD9 lead to reduced dopamine, increased locomotion and sleep fragmentation. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 119
35603 349597 cd18288 BTB_POZ_BTBD12_SLX4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Structure-specific endonuclease subunit SLX4. SLX4, also called BTB/POZ domain-containing protein 12 (BTBD12), is a Holliday junction resolvase subunit that binds multiple DNA repair/recombination endonucleases and is required for DNA repair. Mutations of the SLX4 gene are found in Fanconi anemia. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 116
35604 349598 cd18289 BTB_POZ_BTBD14A_NAC2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in nucleus accumbens-associated protein 2 (NAC-2). NAC-2, also called BTB/POZ domain-containing protein 14A (BTBD14A) or repressor with BTB domain and BEN domain (RBB), is a novel transcription repressor through its association with the NuRD complex. It recruits the NuRD complex to the promoter of MDM2, leading to the repression of MDM2 transcription and subsequent stability of p53/TP53. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 131
35605 349599 cd18290 BTB_POZ_BTBD14B_NAC1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in nucleus accumbens-associated protein 1 (NAC-1). NAC-1, also called BTB/POZ domain-containing protein 14B (BTBD14B), is a transcriptional repressor that contributes to tumor progression, tumor cell proliferation, and survival. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 123
35606 349600 cd18291 BTB_POZ_BTBD16 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 16 (BTBD16). BTBD16 is a BTB domain-containing protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 114
35607 349601 cd18292 BTB_POZ_BTBD17 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 17 (BTBD17). BTBD17, also called galectin-3-binding protein-like, is a BTB domain-containing protein. Its function remains unclear. It may be associated with hepatocellular carcinoma. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 114
35608 349602 cd18293 BTB_POZ_BTBD18 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 18 (BTBD18). BTBD18 acts as a specific controller for transcription activation through RNA polymerase II elongation at a subset of genomic PIWI-interacting RNA (piRNA) loci. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35609 349603 cd18294 BTB_POZ_BTBD19 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 19 (BTBD19). BTBD19 is a BTB domain-containing protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 111
35610 349604 cd18295 BTB1_POZ_ABTB1_BPOZ1 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein or bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homolog (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes. ABTB1 contains an ankyrin repeat and two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 119
35611 349605 cd18296 BTB2_POZ_ABTB1_BPOZ1 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein or bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homolog (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes. ABTB1 contains an ankyrin repeat and two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35612 349606 cd18297 BTB_POZ_ABTB2-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2) and similar proteins. This family includes ABTB2, BTBD11, plant ARM repeat protein interacting with ABF2 (ARIA), and similar proteins. ABTB2, also called bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation, and prevent translation. The BTBD11 gene has been recently identified as an all-trans retinoic acid (atRA)-responsive gene that lies downstream of atRA and its receptors in the regulation of neurite outgrowth and cell adhesion in neural as well as non-neural tissues. ARIA is an armadillo (ARM) repeat and BTB domain-containing protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 117
35613 349607 cd18298 BTB_POZ_RCBTB1_2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2. The RCC1-related guanine nucleotide exchange factor (GEF) family includes RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2, both of which are chromosome condensation regulator-like guanine nucleotide exchange factors. They contain an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 108
35614 349608 cd18299 BTB1_POZ_RhoBTB first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. In vertebrates, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 103
35615 349609 cd18300 BTB2_POZ_RhoBTB second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, a tandem of 2 BTB domains, and a conserved C-terminal region. In humans, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 108
35616 349610 cd18301 BTB1_POZ_IBtk first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor or negative regulator of Bruton tyrosine kinase (Btk), which is required for B-cell differentiation and development. IBtk binds to the PH domain of Btk and down-regulates the Btk kinase activity. It contains two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 99
35617 349611 cd18302 BTB2_POZ_IBtk second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor or negative regulator of Bruton tyrosine kinase (Btk), which is required for B-cell differentiation and development. IBtk binds to the PH domain of Btk and down-regulates the Btk kinase activity. It contains two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 113
35618 349612 cd18303 BTB_POZ_Rank-5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in rabankyrin-5 (Rank-5). Rank-5, also called ankyrin repeat and FYVE domain-containing protein 1 (ANKFY1) or ankyrin repeats hooked to a zinc finger motif (ANKHZN), is a Rab5 effector that regulates and coordinates different endocytic mechanisms. It contains an N-terminal BTB domain, followed by a BACK domain, several ankyrin (ANK) repeats and a C-terminal FYVE domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 120
35619 349613 cd18304 BTB_POZ_M2BP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Mac-2-binding protein (MAC2BP/M2BP). M2BP is also called galectin-3-binding protein, basement membrane autoantigen p105, lectin galactoside-binding soluble 3-binding protein (LGALS3BP), or tumor-associated antigen 90K. It promotes integrin-mediated cell adhesion and may stimulate host defense against viruses and tumor cells. It contains a scavenger receptor cysteine-rich domain, followed by BTB and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 114
35620 349614 cd18305 BTB_POZ_GCL BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster protein germ cell-less (GCL) and similar proteins. GCL proteins are nuclear envelope proteins highly conserved between the mammalian and Drosophila orthologs. Drosophila melanogaster GCL is a key regulator required for the specification of pole cells and primordial germ cell formation in embryos. Both human germ cell-less protein-like 1 (GMCL1) and germ cell-less protein-like 2 (GMCL2), also called germ cell-less protein-like 1-like (GMCL1P1 or GMCL1L), may function in spermatogenesis. They may also be substrate-specific adaptors of E3 ubiquitin-protein ligase complexes which mediate the ubiquitination and subsequent proteasomal degradation of target proteins. They contain BTB and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 115
35621 349615 cd18306 BTB_POZ_NS1BP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Influenza virus NS1A-binding protein (NS1-BP). NS1-BP is also called NS1-binding protein, aryl hydrocarbon receptor-associated protein 3 (ARA3), or IVNS1ABP. It is a novel protein that interacts with the influenza A virus nonstructural NS1 protein, which is relocalized in the nuclei of infected cells. It plays a role in cell division and in the dynamic organization of the actin skeleton as a stabilizer of actin filaments by association with F-actin through its kelch repeats. It also interacts with alpha-enolase/MBP-1 and is involved in c-Myc gene transcriptional control. NS1-BP contains BTB and BACK domains at the N-terminal region and kelch repeats at the C-terminal region. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35622 349616 cd18307 BTB_POZ_calicin BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in calicin. Calicin is a basic cytoskeletal protein involved in the formation and maintenance of the highly regular organization of the postacrosomal perinuclear theca, the calyx of mammalian spermatozoa. It contains BTB and BACK domains at the N-terminal region and kelch repeats at the C-terminal region. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 97
35623 349617 cd18308 BTB1_POZ_LZTR1 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. LZTR-1 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 156
35624 349618 cd18309 BTB2_POZ_LZTR1 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. LZTR-1 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 126
35625 349619 cd18310 BTB_POZ_NPR_plant BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in plant regulatory proteins, NPR1-4, and similar proteins. NPR1 and NPR2 are essential for pathogenicity and the utilization of many nitrogen sources. NPR1 is also called nitrogen pathogenicity regulation protein NPR1, non-inducible immunity protein 1 (Nim1), nonexpresser of PR genes 1, or salicylic acid insensitive 1 (Sai1). It acts as a transcription coactivator that plays dual roles in regulating plant immunity. NPR3 and NPR4 are involved in negative regulation of defense responses against pathogens in plants. NPR proteins contain a BTB domain, DUF3420, ankyrin (ANK) repeats, and a conserved C-terminal domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 145
35626 349620 cd18311 BTB_POZ_CP190-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster centrosomal protein 190kD (CP190) and similar proteins. CP190 is a large, multi-domain protein, first identified as a centrosome protein with oscillatory localization over the course of the cell cycle. It has an essential function in the nucleus as a chromatin insulator. It is known to associate with the nuclear matrix, components of the RNAi machinery, active promoters and borders of the repressive chromatin domains. CP190 contains an N-terminal BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 110
35627 349621 cd18312 BTB_POZ_NPY3-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana protein naked pins in YUC mutants 3 (NPY3), Root phototropism protein 3 (RPT3), and similar proteins. NPY3 may play an essential role in auxin-mediated organogenesis and in root gravitropic responses in Arabidopsis. RPT3 is a signal transducer of the phototropic response and photo-induced movements. It is necessary for root and hypocotyl phototropisms, but not for the regulation of stomata opening. Proteins in this subfamily contain an N-terminal BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 105
35628 349622 cd18313 BTB_POZ_BT BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana BTB/POZ and TAZ domain-containing proteins, BT1-5. BT1-5 may act as substrate-specific adaptors of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. They contain a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 102
35629 349623 cd18314 BTB_POZ_trishanku-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Dictyostelium discoideum trishanku and similar proteins. Trishanku is a novel regulator required for normal morphogenesis and cell-type stability in Dictyostelium discoideum. It contains a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 96
35630 349624 cd18315 BTB_POZ_BAB-like BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster proteins bric-a-brac 1 (BAB1), bric-a-brac 2 (BAB2), modifier of mdg4 (doom), and similar proteins. BAB1 and BAB2 probably act as transcriptional regulators that are required for specification of the tarsal segment and are involved in antenna development. Doom is a product of the Drosophila mod(mdg4) gene. It induces apoptosis and binds to baculovirus inhibitor-of-apoptosis proteins. This subfamily also includes Drosophila melanogaster sex determination protein fruitless (FRU), protein jim lovell (LOV), zinc finger protein chinmo, transcription factor GAGA, transcription factor Ken, and longitudinals lacking proteins (LOLA). FRU probably acts as a transcriptional regulator that plays a role in male courtship behavior and sexual orientation, and enhances male-specific expression of takeout in brain-associated fat body. LOV, also called tyrosine kinase-related (TKR), has a regulatory role during midline cell development. Chinmo is a functional effector of the JAK/STAT pathway that regulates eye development, tumor formation, and stem cell self-renewal in Drosophila. GAGA is a transcriptional activator that functions by regulating chromatin structure. Ken, also termed protein Ken and Barbie, is a transcription factor required for Terminalia development. LOLA proteins are putative transcription factors required for axon growth and guidance in the central and peripheral nervous systems. Proteins in this subfamily contain a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 85
35631 349625 cd18316 BTB_POZ_KCTD-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins. The potassium channel tetramerization domain (KCTD) family proteins contain the BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. Some others show Cullin-independent functions including binding and regulation of GABA (gamma-aminobutyric acid) receptors (KCTD8, KCTD12 and KCTD16) and inhibition of AP-2 function (KCTD15). KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 83
35632 349626 cd18317 BTB_POZ_Kv BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in voltage-gated potassium (Kv) channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. This family includes several groups of alpha subunits such as KCNA/Kv1 family of Shaker-type Kv channels, KCNB/Kv2 family of Shab-type Kv channels, KCNC/Kv3 family of Shaw-type Kv channels, KCND/Kv4 family of Shal-type Kv channels, KCNF/Kv5 subfamily of Kv channels, KCNG/Kv6 subfamily of Kv channels, KCNV/Kv8 subfamily of Kv channels, and KCNS/Kv9 subfamily of Kv channels. Kv alpha subunits form functional homo- or hetero-tetrameric channels (typically with other alpha subunits from the same subfamily) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. KCNQ/Kv7 channels are not included in this family, since they do not contain a BTB/POZ domain. 82
35633 349627 cd18318 BTB_POZ_KCTD20-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 20 (KCTD20) and BTB/POZ domain-containing protein 10 (BTBD10). KCTD20, also termed potassium channel tetramerization domain containing 20, is a positive regulator of Akt signaling. It may play an important role in regulating the death and growth of some non-nervous and nervous cells. BTBD10, also termed glucose metabolism-related protein 1 (GMRP1), plays a major role as an activator of AKT family members. It binds to Akt and protein phosphatase 2A (PP2A) and inhibits the PP2A-mediated dephosphorylation of Akt, thereby keeping Akt activated. It also plays a role in preventing motor neuronal death and accelerating the growth of pancreatic beta cells. 92
35634 349628 cd18319 BTB_POZ_KLHL42 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 42 (KLHL42). KLHL42, also called Cullin-3-binding protein 9 (Ctb9) or Kelch domain-containing protein 5, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL42) E3 ubiquitin ligase complex mediates the ubiquitination and subsequent degradation of katanin (KATNA1). KLHL42 is involved in microtubule dynamics throughout mitosis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 93
35635 349629 cd18320 BTB_POZ_KBTBD13 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 13 (KBTBD13). KBTBD13 is a muscle-specific protein. Its autosomal dominant mutations may cause Nemaline Myopathy (NEM). KBTBD13 may act as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that functions as a muscle-specific ubiquitin ligase, thereby implicating the ubiquitin proteasome pathway in the pathogenesis of KBTBD13-associated NEM. KBTBD13 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 83
35636 349630 cd18321 BTB_POZ_EloC BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in Elongin-C (EloC) and similar proteins. Elongin-C is also called elongin 15 kDa subunit, RNA polymerase II transcription factor SIII subunit C, SIII p15, or transcription elongation factor B polypeptide 1 (TCEB1). It is a component of SIII (also known as elongin), which is a general transcription elongation factor that increases the RNA polymerase II transcription elongation past template-encoded arresting sites. It forms a regulatory complex with subunit B or elongin-B (BC) that enhances the activity of the transcriptionally active subunit A. The BC complex also functions as an adaptor protein in the proteasomal degradation of target proteins via different E3 ubiquitin ligase complexes, including the von Hippel-Lindau ubiquitination complex CBC (VHL) and the suppressors of cytokine signaling (SOCS) box ubiquitin ligase family. Elongin-C belongs to the BTB/POZ domain family; the domain is a common protein-protein interaction motif of about 100 amino acids. 95
35637 349631 cd18322 BTB_POZ_SKP1 BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in S-phase kinase-associated protein 1 (SKP1) and similar proteins. SKP1 is also called cyclin-A/CDK2-associated protein p19 (p19A), organ of Corti protein 2 (OCP-2), organ of Corti protein II (OCP-II), RNA polymerase II elongation factor-like protein, transcription elongation factor B polypeptide 1-like, or p19skp1. It is an essential component of the SCF (SKP1-CUL1-F-box protein) ubiquitin ligase complex, which mediates the ubiquitination of proteins involved in cell cycle progression, signal transduction and transcription. SKP1 serves as an adaptor protein that links the F-box protein to CUL1. SKP1 and CUL1 are invariant components of all SCF complexes, while F-box proteins are variable substrate binding modules that determine specificity. SKP1 belongs to the BTB/POZ domain family; the domain is a common protein-protein interaction motif of about 100 amino acids. 120
35638 349632 cd18323 BTB_POZ_ZBTB3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 3 (ZBTB3). ZBTB3 is a transcription factor essential for cancer cell growth via the regulation of the reactive oxygen species (ROS) detoxification pathway. ZBTB3 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35639 349633 cd18324 BTB_POZ_ZBTB18_RP58 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 18 (ZBTB18). ZBTB18 is also called 58 kDa repressor protein (RP58), transcriptional repressor RP58, translin-associated zinc finger protein 1 (TAZ-1), zinc finger protein 238 (ZNF238), or zinc finger protein C2H2-171. It is a sequence-specific transrepressor associated with heterochromatin. It plays a role in various developmental processes such as myogenesis and brain development. It specifically binds the consensus DNA sequence 5'-[AC]ACATCTG[GT][AC]-3' which contains the E box core, and acts by recruiting chromatin remodeling multiprotein complexes. ZBTB18 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 147
35640 349634 cd18325 BTB_POZ_ZBTB18_2-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 18.2 and similar proteins. This subfamily is composed of Xenopus laevis zinc finger and BTB domain-containing protein 18.2, encoded by the znf238.2.L gene, and similar proteins. Many proteins in this group are annotated as zinc finger and BTB domain-containing protein 42 (ZBTB42). However, characterized mammalian ZBTB42 does not belong to this subfamily. ZBTB18.2, like ZBTB18, functions as a transcriptional repressor that plays a role in various developmental processes. Members of this family contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 128
35641 349635 cd18326 BTB_POZ_ZBTB7A BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7A (ZBTB7A). ZBTB7A is also called factor binding IST protein 1 (FBI-1), factor that binds to inducer of short transcripts protein 1, HIV-1 1st-binding protein 1, Leukemia/lymphoma-related factor (LRF), POZ and Krueppel erythroid myeloid ontogenic factor, POK erythroid myeloid ontogenic factor, Pokemon, TTF-I-interacting peptide 21 (TIP21), or zinc finger protein 857A (ZNF857A). It is a transcription repressor of key glycolytic genes, including GLUT3, PFKP, and PKM, and its downregulation in human cancer contributes to tumor metabolism. It has been implicated in carcinogenesis and cell differentiation and development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 120
35642 349636 cd18327 BTB_POZ_ZBTB7B_ZBTB15 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7B (ZBTB7B). ZBTB7B is also called Krueppel-related zinc finger protein cKrox, T-helper-inducing POZ/Krueppel-like factor, zinc finger and BTB domain-containing protein 15 (ZBTB15), zinc finger protein 67 (ZNF67), Zfp67, zinc finger protein 857B (ZNF857B), or zinc finger protein Th-POK. It is a transcriptional regulator of extracellular matrix gene expression. It plays widespread and critical roles in T-cell development, particularly as the master regulator of CD4 commitment. It also plays a role as a potent driver of brown fat development and thermogenesis, as well as cold-induced beige fat formation. ZBTB7B contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 127
35643 349637 cd18328 BTB_POZ_ZBTB7C_ZBTB36 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7C (ZBTB7C). ZBTB7C is also called affected by papillomavirus DNA integration in ME180 cells protein 1 (APM-1), zinc finger and BTB domain-containing protein 36 (ZBTB36), zinc finger protein 857C (ZNF857C), or kidney cancer-related POZ domain and Kruppel-like protein (Kr-POK). It is a transcriptional repressor with a pro-oncogenic role that relies upon binding to p53 and inhibition of its transactivation function. It may act as an important regulator of fatty acid synthesis and may induce rapid cancer cell proliferation by increasing palmitate synthesis. The ZBTB7C gene has been identified as a susceptibility gene to ischemic injury. ZBTB7C contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 120
35644 349638 cd18329 BTB_POZ_ZBTB8A_BOZF1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8A (ZBTB8A). ZBTB8A, also called BTB/POZ and zinc-finger domain-containing factor or BTB/POZ and zinc-finger domains factor on chromosome 1 (BOZ-F1), is a novel proto-oncoprotein that stimulates cell proliferation. It binds to all the proximal GC boxes to repress transcription, and it inhibits p53 acetylation without affecting p53 stability. It may be involved in gastric adenocarcinoma cell differentiation, cancer invasion, and metastasis. ZBTB8A contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 116
35645 349639 cd18330 BTB_POZ_ZBTB8B BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8B (ZBTB8B). ZBTB8B may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 113
35646 349640 cd18331 BTB_POZ_ZBTB27_BCL6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell lymphoma 6 protein (BCL-6). BCL-6 is also called B-cell lymphoma 5 protein (BCL-5), zinc finger and BTB domain-containing protein 27 (ZBTB27), protein LAZ-3, or zinc finger protein 51 (ZNF51). It is a transcriptional repressor mainly required for germinal center (GC) formation and antibody affinity maturation, which have different mechanisms of action specific to the lineage and biological functions. It represses its target genes by binding directly to the DNA sequence 5'-TTCCTAGAA-3' or indirectly by repressing the transcriptional activity of transcription factors. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 118
35647 349641 cd18332 BTB_POZ_ZBTB28_BCL6B BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell CLL/lymphoma 6 member B protein (BCL6B). BCL6B is also called Bcl6-associated zinc finger protein, zinc finger protein 62, or zinc finger and BTB domain-containing protein 28 (ZBTB28). It is a sequence-specific transcriptional repressor in association with BCL-6. It may function in a narrow stage or be related to some events in the early B-cell development. BCL6B plays an important role as a potential tumor suppressor in gastric cancer; it is found preferentially methylated in gastric cancer. It also inhibits both colorectal cancer growth and hepatocellular carcinoma metastases. BCL6B contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 114
35648 349642 cd18333 BTB_POZ_ZBTB29_HIC1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer 1 protein (Hic-1). Hic-1, also called zinc finger and BTB domain-containing protein 29 (ZBTB29), is a sequence-specific transcriptional repressor that recognizes and binds to the consensus sequence 5'-[CG]NG[CG]GGGCA[CA]CC-3'. It may act as a tumor suppressor, and is involved in regulatory loops modulating P53-dependent and E2F1-dependent cell survival, growth control, and stress responses. It also regulates intestinal immunity and homeostasis. Hic-1 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 121
35649 349643 cd18334 BTB_POZ_ZBTB30_HIC2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer 2 protein (Hic-2). Hic-2 is also called HIC1-related gene on chromosome 22 protein (HRG22), Hic-3, or zinc finger and BTB domain-containing protein 30 (ZBTB30). It is a homolog of tumor suppressor Hic-1. It functions as a transcriptional regulator. It is a dosage-dependent regulator of cardiac development located within the distal 22q11 deletion syndrome region. Hic-2 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 120
35650 349644 cd18335 BTB_POZ_KLHL1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 1 (KLHL1). KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. It may play a role in organizing the actin cytoskeleton of brain cells. KLHL1 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 126
35651 349645 cd18336 BTB_POZ_KLHL4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 4 (KLHL4). KLHL4 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It may be associated with X-linked cleft palate (CPX) and is also a candidate gene in the impairment of mullerian duct development. In addition, it has been identified as a target of insulin-like growth factor binding protein 5 (IGFBP5). KLHL4 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 126
35652 349646 cd18337 BTB_POZ_KLHL5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 5 (KLHL5). KLHL5 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It is abundantly expressed in the ovary, adrenal gland, and thymus. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 130
35653 349647 cd18338 BTB_POZ_KLHL2_Mayven BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 2 (KLHL2). KLHL2, also called actin-binding protein Mayven, is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, such as NPTXR, leading most often to their proteasomal degradation. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35654 349648 cd18339 BTB_POZ_KLHL3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 3 (KLHL3). KLHL3 is a component of an E3 ubiquitin ligase complex that regulates blood pressure by targeting With-No-Lysine (WNK) kinases for degradation. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35655 349649 cd18340 BTB_POZ_KLHL40_KBTBD5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 40 (KLHL40). KLHL40, also called Kelch repeat and BTB domain-containing protein 5 (KBTBD5) or sarcosynapsin, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. Mutations in KLHL40 may cause severe autosomal-recessive nemaline myopathy. KLHL40 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 134
35656 349650 cd18341 BTB_POZ_KLHL41_KBTBD10 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 41 (KLHL41). KLHL41 is also called Kel-like protein 23, Kelch repeat and BTB domain-containing protein 10 (KBTBD10), Kelch-related protein 1 (Krp1), or sarcosine. It is a novel kelch-related protein that is involved in pseudopod elongation in transformed cells. It is also involved in skeletal muscle development and differentiation. It regulates proliferation and differentiation of myoblasts and plays a role in myofibril assembly by promoting lateral fusion of adjacent thin fibrils into mature, wide myofibrils. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 133
35657 349651 cd18342 BTB_POZ_SPOP BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein (SPOP). SPOP, also called HIB homolog 1 or Roadkill homolog 1, is a novel nuclear speckle-type protein which serves as an adaptor of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins, such as BRMS1, DAXX, PDX1/IPF1, GLI2 and GLI3. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 125
35658 349652 cd18343 BTB_POZ_SPOPL BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein-like (SPOPL). SPOPL, also called HIB homolog 2 or Roadkill homolog 2, is a component of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes that mediate the ubiquitination and subsequent proteasomal degradation of target proteins. The complexes may contain homodimeric SPOPL or the heterodimers formed by speckle-type POZ protein (SPOP) and SPOPL, which are less efficient than ubiquitin ligase complexes containing only SPOP. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 123
35659 349653 cd18344 BTB_POZ_TDPOZ BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in TD and POZ domain-containing proteins (TDPOZ). TDPOZ is a family of bipartite animal and plant proteins that contains a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. This subfamily contains only mammalian members. Plant TDPOZ proteins contain a MATH domain at the N-terminal region and are named "BTB/POZ and MATH domain-containing proteins (BPM)", not included in this subfamily. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 128
35660 349654 cd18345 BTB_POZ_roadkill-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster protein roadkill and similar proteins. Drosophila melanogaster protein roadkill, also called Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 121
35661 349655 cd18346 BTB_POZ_BTBD1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 1 (BTBD1). BTBD1, also called Hepatitis C virus NS5A-transactivated protein 8 or HCV NS5A-transactivated protein 8, is a BTB-domain-containing Kelch-like protein that is expressed in skeletal muscle and interacts with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. BTBD1 may serve as substrate-specific adaptor of an E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 133
35662 349656 cd18347 BTB_POZ_BTBD2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 2 (BTBD2). BTBD2 is a BTB-domain-containing Kelch-like protein that interacts with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 127
35663 349657 cd18348 BTB_POZ_BTBD3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 3 (BTBD3). BTBD3 is a BTB-domain-containing Kelch-like protein that controls dendrite orientation toward active axons in the mammalian neocortex. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 131
35664 349658 cd18349 BTB_POZ_BTBD6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 6 (BTBD6). BTBD6, also termed lens BTB domain protein, is a BTB-domain-containing Kelch-like protein that is required for proper embryogenesis and plays an essential, evolutionarily conserved role during neuronal development. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 109
35665 349659 cd18350 BTB_POZ_ABTB2_BPOZ2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2). ABTB2, also called bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins with various functions ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications for Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation, and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homolog (PTEN). It contains an ankyrin repeat, BTB/POZ, and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 134
35666 349660 cd18351 BTB_POZ_BTBD11 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 11 (BTBD11). BTBD11, also called ankyrin repeat and BTB/POZ domain-containing protein BTBD11, is a BTB-domain-containing protein. The BTBD11 gene has been recently identified as an all-trans retinoic acid (atRA)-responsive gene that lies downstream of atRA and its receptors in the regulation of neurite outgrowth and cell adhesion in neural as well as non-neural tissues. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 131
35667 349661 cd18352 BTB_POZ_ARIA_plant BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in plant ARM repeat protein interacting with ABF2 (ARIA) and similar proteins. ARIA is an armadillo (ARM) repeat and BTB domain-containing protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2, a transcription factor which controls ABA-dependent gene expression via the G-box-type ABA-responsive elements. ARIA is a novel abscisic acid signaling component. It negatively regulates seed germination and young seedling growth. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 116
35668 349662 cd18353 BTB_POZ_RCBTB1_CLLD7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing protein 1 (RCBTB1). RCBTB1 is also called chronic lymphocytic leukemia deletion region gene 7 protein (CLLD7), CLL deletion region gene 7 protein, regulator of chromosome condensation and BTB domain-containing protein 1, or E4.5. It is a novel chromosome condensation regulator-like guanine nucleotide exchange factor that may be involved in cell cycle regulation by chromatin remodeling. It may also function as a tumor suppressor that regulates pathways of DNA damage/repair and apoptosis. RCBTB1 may also be a substrate adaptor for a cullin3 (CUL3) E3 ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. Biallelic mutations in RCBTB1 may cause isolated and syndromic retinal dystrophy. It contains an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 117
35669 349663 cd18354 BTB_POZ_RCBTB2_CHC1L BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing protein 2 (RCBTB2). RCBTB2 is also called chromosome condensation 1-like (CHC1-L), RCC1-like G exchanging factor, or regulator of chromosome condensation and BTB domain-containing protein 2. It is a chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) protein for the Ras-related GTPase Ran. It contains an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 117
35670 349664 cd18355 BTB1_POZ_RHOBTB1 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB1 functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 146
35671 349665 cd18356 BTB1_POZ_RHOBTB2 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2) or p83, is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB2 functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 148
35672 349666 cd18357 BTB1_POZ_RHOBTB3 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB3 is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. It is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) through facilitating hydroxylation and suppresses the Warburg effect. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 159
35673 349667 cd18358 BTB2_POZ_RHOBTB1 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB1 functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 126
35674 349668 cd18359 BTB2_POZ_RHOBTB2 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2) or p83, is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB2 functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 124
35675 349669 cd18360 BTB2_POZ_RHOBTB3 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB3 is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. It is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) through facilitating hydroxylation and suppresses the Warburg effect. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 110
35676 349670 cd18361 BTB_POZ_KCTD1-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD1 and KCTD15. This subfamily of KCTD proteins includes KCTD1 and KCTD15. KCTD1 is a nuclear BTB/POZ domain-containing protein that acts as a transcriptional repressor and mediates protein-protein interactions through a BTB domain. It represses the transcriptional activity of AP-2 family members, including TFAP2A, TFAP2B and TFAP2C. It also functions as a novel inhibitor of the Wnt signaling pathway. Mutations in KCTD1 cause scalp-ear-nipple (SEN) syndrome. KCTD15 is a BTB/POZ domain-containing protein that plays a role in the regulation of neural crest (NC) formation and other steps in embryonic development. It inhibits AP2 transcriptional activity by interaction with its activation domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains form pentamers. 94
35677 349671 cd18362 BTB_POZ_KCTD2-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins KCTD2, KCTD5, and KCTD17, and similar proteins. This subfamily includes potassium channel tetramerization domain-containing proteins KCTD2, KCTD5, and KCTD17, all of which function as adaptors of Cullin3 based ubiquitin E3 ubiquitin ligases. KCTD2 suppresses gliomagenesis by destabilizing c-Myc. KCTD5 is a negative regulator of the AKT pathway, a key signaling cascade frequently deregulated in cancer. KCTD5 does not impact the operation of Kv4.2, Kv3.4, Kv2.1, or Kv1.2 channels. KCTD17 polyubiquitylates trichoplein, a protein involved in ciliogenesis down-regulation. It is a positive regulator of ciliogenesis, playing a crucial role in the initial steps of axoneme extension. A missense mutation in KCTD17 causes autosomal dominant myoclonus-dystonia (M-D). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD5 and KCTD17 BTB domains form pentamer structures. 85
35678 349672 cd18363 BTB_POZ_KCTD3-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 3 (KCTD3) and SH3KBP1-binding protein 1 (SHKBP1). The group of KCTD proteins includes KCTD3 and SHKBP1. KCTD3, also called renal carcinoma antigen NY-REN-45, is a BTB/POZ domain-containing protein that is an accessory subunit of potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 3 (HCN3), upregulating its cell-surface expression and current density without affecting its voltage dependence and kinetics. SHKBP1, also called SETA-binding protein 1, interacts with cathepsin B and participates in tumor necrosis factor (TNF)-induced apoptosis in ovarian cancer cells. It can promote epidermal growth factor receptor (EGFR) signaling by interrupting c-Cbl-CIN85 complex and inhibiting EGFR degradation. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 86
35679 349673 cd18364 BTB_POZ_KCTD4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 4 (KCTD4). KCTD4 is a BTB/POZ domain-containing protein with an unknown biological function. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 86
35680 349674 cd18365 BTB_POZ_KCTD6_like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD6, KCTD21 and similar proteins. KCTD6, also called KCASH3 (KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 3), is a substrate-specific adaptor of cullin-3, effectively regulating protein levels of the muscle small ankyrin-1 isoform 5 (sAnk1.5). KCTD21, also called KCASH2, functions as a substrate-specific adaptor of cullin-3, promoting the ubiquitination and degradation of histone deacetylase HDAC1, thereby inhibiting the deacetylation-mediated transcriptional activation of the Hedgehog effectors Gli1 and Gli2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 94
35681 349675 cd18366 BTB_POZ_KCTD7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 7 (KCTD7). KCTD7 is a BTB/POZ domain-containing protein that has an impact on K+ fluxes, neurotransmitter synthesis, and neuronal function. It functions as a regulator of potassium conductance in neurons, and is involved in the control of excitability of cortical neurons. Mutations in KCTD7 may cause progressive myoclonus epilepsy (PME). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 92
35682 349676 cd18367 BTB_POZ_KCTD8-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD8, KCTD12, KCTD16 and similar proteins. This subfamily of KCTD proteins includes KCTD8, KCTD12 (also called predominantly fetal expressed T1 domain/Pfetin), and KCTD16. They act as auxiliary subunits of GABAB receptors associated with mood disorders. KCTD8 interacts as a tetramer with GABRB1 and GABRB2. KCTD12 regulates agonist potency and kinetics of GABAB receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. KCTD16 interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion, and axon guidance. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 100
35683 349677 cd18368 BTB_POZ_KCTD9 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 9 (KCTD9). KCTD9 is a BTB/POZ domain-containing protein that contributes to liver injury through NK cell activation during hepatitis B virus-induced acute-on-chronic liver failure. It functions as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex, which mediates the ubiquitination of target proteins, leading to their degradation by the proteasome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD9 BTB domain forms a pentameric structure. 100
35684 349678 cd18369 BTB_POZ_KCTD10-like_BACURD BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD10 (BACURD3), KCTD13 (BACURD1), and TNFAIP1 (BACURD2). This subfamily of KCTD proteins, also called the BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein (BACURD) subfamily, includes KCTD10 (BACURD3), KCTD13 (BACURD1), and TNFAIP1 (BACURD2). KCTD10 is a BTB/POZ domain-containing protein that interacts with proliferating cell nuclear antigen (PCNA) and polymerase delta, and participates in DNA repair, DNA replication, and cell-cycle control. Its down-regulation could inhibit cell proliferation. KCTD10 also plays crucial roles in embryonic angiogenesis and heart development in mammals by negatively regulating the Notch signaling pathway. KCTD13 is a BTB/POZ domain-containing protein that may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of RhoA, leading to its degradation by the proteasome, thereby regulating the actin cytoskeleton and cell migration. TNFAIP1, also called protein B12, is a BTB/POZ domain-containing protein that is involved in DNA replication, DNA damage repair, cell apoptosis, and is implicated in human diseases including cancer, Alzheimer's disease (AD) and type 2 diabetic nephropathy. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD10 and KCTD13 BTB domains form a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels. 91
35685 349679 cd18370 BTB_POZ_KCTD11 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein KCTD11. KCTD11 may function as an antagonist of the Hedgehog pathway of cell proliferation and differentiation by affecting the nuclear transfer of transcription factor GLI1, thus maintaining cerebellar granule cells in the undifferentiated state. It is a probable substrate-specific adapter for a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex towards HDAC1. It contains a BTB/POZ domain; in some cases the domain may be truncated. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. Variants of the human/mouse KCTD11 appear to contain truncated BTB/POZ domains. 88
35686 349680 cd18371 BTB_POZ_KCTD14 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 14 (KCTD14). KCTD14 is a BTB/POZ domain-containing protein with unknown biological function. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 99
35687 349681 cd18372 BTB_POZ_KCTD18 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 18 (KCTD18). KCTD18 is a BTB/POZ domain-containing protein with unknown biological function. A duplication of the KCTD18 gene has been found in a patient with epilepsy, developmental delay, and autistic behavior, which may contribute to the phenotype. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 101
35688 349682 cd18373 BTB1_POZ_KCTD19 first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 19 (KCTD19). KCTD19 is a BTB/POZ domain-containing protein with unclear biological function. It may be a host factor involved in Nef-induced downregulation of MHC-I. Nef is a HIV-1-encoded protein that plays a key role in the development of AIDS. KCTD19 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 98
35689 349683 cd18374 BTB2_POZ_KCTD19 second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 19 (KCTD19). KCTD19 is a BTB/POZ domain-containing protein with unclear biological function. It may be a host factor involved in Nef-induced downregulation of MHC-I. Nef is a HIV-1-encoded protein that plays a key role in the development of AIDS. KCTD19 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 99
35690 349684 cd18375 BTB_POZ_KCNRG BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel regulatory protein (KCNRG). KCNRG, also called potassium channel regulator or protein CLLD4, is an endoplasmic reticulum (ER)-associated tumor suppressor that regulates Kv1 family potassium channel proteins by retaining a fraction of the channels in endomembranes. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. 97
35691 349685 cd18376 BTB_POZ_FIP2-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana FH protein interacting protein FIP2 and similar proteins. FIP2 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. 89
35692 349686 cd18377 BTB_POZ_Kv1_KCNA BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNA/Kv1 subfamily of Shaker-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv1, also known as subfamily A, contains eight alpha subunit members, Kv1.1 (KCNA1), Kv1.2 (KCNA2), Kv1.3 (KCNA3), Kv1.4 (KCNA4), Kv1.5 (KCNA5), Kv1.6 (KCNA6), Kv1.7 (KCNA7), and Kv1.8 (KCNA10), which are orthologs of the Shaker gene in Drosophila. They are delayed rectifiers except for Kv1.4 (KCNA4), which is an A-type potassium channel. Delayed rectifiers are slow opening and closing voltage-gated potassium channels. Because of their delayed activation kinetics, they play an important role in controlling action potential duration. A-type channels are fast/rapidly inactivating potassium channels. Kv1/KCNA subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 85
35693 349687 cd18378 BTB_POZ_Kv2_KCNB BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNB/Kv2 subfamily of Shab-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv2, also known as subfamily B, contains two alpha subunit members, Kv2.1 (KCNB1) and Kv2.2 (KCNB2), which are orthologs of the Shab gene in Drosophila. They mediate delayed-rectifier potassium currents in various neurons, although their physiological roles often remain elusive. Kv2/KCNB subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 109
35694 349688 cd18379 BTB_POZ_Kv3_KCNC BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNC/Kv3 subfamily of Shaw-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv3, also known as subfamily C, contains four alpha subunit members, Kv3.1 (KCNC1), Kv3.2 (KCNC2), Kv3.3 (KCNC3), and Kv3.4 (KCNC4), which are orthologs of the Shaw gene in Drosophila. Unlike other Kv subfamilies, Kv3 channels typically open only at positive potentials and both, activation and deactivation, in response to changes in voltage are very rapid. They are uniquely associated with the ability of certain neurons to fire action potentials and to release neurotransmitter at high rates of up to 1,000 Hz. Kv3/KCNC subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 109
35695 349689 cd18380 BTB_POZ_Kv4_KCND BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCND/Kv4 subfamily of Shal-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv4, also known as subfamily D, contains three alpha subunit members, Kv4.1 (KCND1), Kv4.2 (KCND2), and Kv4.3 (KCND3), which are orthologs of the Shal gene in Drosophila. They are A-type potassium channels that mediate the native, fast inactivating (A-type) K+ current (IA) described both in the nervous system (A currents) and the heart (transient outward current). Kv4/KCND subfamily alpha subunits form functional homo- or hetero-tetrameric channels through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. They are modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains. 102
35696 349690 cd18381 BTB_POZ_Kv5_KCNF1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNF/Kv5 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv5, also known as subfamily F, only contains KCNF1 (also known as Kv5.1 or kH1), which functions as a regulatory alpha-subunit of voltage-gated potassium channel that when coassembled with Kv2.1 can modulate gating in a physiologically relevant manner. It forms hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 116
35697 349691 cd18382 BTB_POZ_Kv6_KCNG BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNG/Kv6 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv6, also known as subfamily G, includes KCNG1 (Kv6.1), KCNG2 (Kv6.2 or KCNF2), KCNG3 (Kv6.3) and KCNG4 (Kv6.4), which are regulatory alpha subunits and do not form functional channels on their own. KCNG1 can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. KCNG2, also called cardiac potassium channel subunit, can form functional heterodimeric channels with KCNB1, and further modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. KCNG3, also called voltage-gated potassium channel subunit Kv10.1, is an electrically silent modulatory subunit that can form functional heterotetrameric channels with KCNB1, and further promotes a reduction in the rate of activation and inactivation of the delayed rectifier voltage-gated potassium channel KCNB1. KCNG4 is a silent voltage-gated potassium (KvS) channel subunit that can form functional heterotetrameric channels with KCNB1, and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. 109
35698 349692 cd18384 BTB_POZ_Kv9_KCNS BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNS/Kv9 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv9, also known as subfamily S, includes KCNS1 (Kv9.1), KCNS2 (Kv9.2) and KCNS3 (Kv9.3). They are regulatory alpha subunits that cannot form functional homo-tetrameric channels. Both KCNS1 and KCNS2 are delayed-rectifier K(+) channel alpha subunits that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. KCNS3 is a delayed-rectifier K(+) channel alpha subunit linked to tissue oxygenation responses. It can form functional heterotetrameric channels with KCNB1, and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. 106
35699 349693 cd18385 BTB_POZ_BTBD10_GMRP1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 10 (BTBD10). BTBD10, also called glucose metabolism-related protein 1 (GMRP1), plays a major role as an activator of AKT family members. It binds to Akt and protein phosphatase 2A (PP2A) and inhibits the PP2A-mediated dephosphorylation of Akt, thereby keeping Akt activated. It also plays a role in preventing motor neuronal death and accelerating the growth of pancreatic beta cells. BTBD10 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 110
35700 349694 cd18386 BTB_POZ_KCTD20 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 20 (KCTD20). KCTD20, also called potassium channel tetramerization domain containing 20, is a positive regulator of Akt signaling. It may play an important role in regulating the death and growth of some non-nervous and nervous cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 104
35701 349695 cd18387 BTB_POZ_KCTD1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 1 (KCTD1). KCTD1 is a nuclear BTB/POZ domain-containing protein that acts as a transcriptional repressor and mediates protein-protein interactions through a BTB domain. It represses the transcriptional activity of AP-2 family members, including TFAP2A, TFAP2B and TFAP2C, to varying extents. It also functions as a novel inhibitor of the Wnt signaling pathway. Mutations in KCTD1 cause scalp-ear-nipple (SEN) syndrome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains form pentamers. 105
35702 349696 cd18388 BTB_POZ_KCTD15 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 15 (KCTD15). KCTD15 is a BTB/POZ domain-containing protein that plays a role in the regulation of neural crest (NC) formation and other steps in embryonic development. It inhibits AP2 transcriptional activity by interaction with its activation domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains, closely related to KCTD15, form pentamers. 99
35703 349697 cd18389 BTB_POZ_KCTD2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 2 (KCTD2). KCTD2 is a BTB/POZ domain-containing protein that functions as an adaptor of Cullin3 E3 ubiquitin ligase. It suppresses gliomagenesis by destabilizing c-Myc. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The BTB domains of KCTD5 and KCTD17, which are highly similar to KCTD2, form pentamer structures. 105
35704 349698 cd18390 BTB_POZ_KCTD5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 5 (KCTD5). KCTD5 is a BTB/POZ domain-containing protein that functions as a substrate adaptor for cullin3 based ubiquitin E3 ligases. It is a negative regulator of the AKT pathway, a key signaling cascade frequently deregulated in cancer. KCTD5 does not impact the operation of Kv4.2, Kv3.4, Kv2.1, or Kv1.2 channels. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD5 forms pentamers mediated by its BTB domain. 112
35705 349699 cd18391 BTB_POZ_KCTD17 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 17 (KCTD17). KCTD17 is a BTB/POZ domain-containing protein that functions as a substrate-adaptor for cullin3-RING ubiquitin ligases that polyubiquitylates trichoplein, a protein involved in ciliogenesis down-regulation. It is a positive regulator of ciliogenesis, playing a crucial role in the initial steps of axoneme extension. A missense mutation in KCTD17 causes autosomal dominant myoclonus-dystonia (M-D). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD17 BTB domains form pentamers. 101
35706 349700 cd18392 BTB_POZ_KCTD3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 3 (KCTD3). KCTD3, also called renal carcinoma antigen NY-REN-45, is a BTB/POZ domain-containing protein that is an accessory subunit of potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 3 (HCN3), upregulating its cell-surface expression and current density without affecting its voltage dependence and kinetics. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 88
35707 349701 cd18393 BTB_POZ_SHKBP1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in SH3KBP1-binding protein 1 (SHKBP1). SHKBP1, also called SETA-binding protein 1, interacts with cathepsin B and participates in tumor necrosis factor (TNF)-induced apoptosis in ovarian cancer cells. It can promote epidermal growth factor receptor (EGFR) signaling by interrupting c-Cbl-CIN85 complex and inhibiting EGFR degradation. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 103
35708 349702 cd18394 BTB_POZ_KCTD6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 6 (KCTD6). KCTD6, also called KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 3 (KCASH3), is a BTB/POZ domain-containing protein that functions as a substrate-specific adaptor of cullin-3, regulating protein levels of the muscle small ankyrin-1 isoform 5 (sAnk1.5) as well as suppressing histone deacetylase and Hedgehog activity in medulloblastoma. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 104
35709 349703 cd18395 BTB_POZ_KCTD21 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 21 (KCTD21). KCTD21, also called KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 2 (KCASH2), is a BTB/POZ domain-containing protein that functions as a substrate-specific adaptor of cullin-3, promoting the ubiquitination and degradation of histone deacetylase HDAC1, thereby inhibiting the deacetylation-mediated transcriptional activation of the Hedgehog effectors Gli1 and Gli2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 98
35710 349704 cd18396 BTB_POZ_KCTD8 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein KCTD8. KCTD8, a BTB/POZ domain-containing protein, is an auxiliary subunit of GABA-B receptors that determine the pharmacology and kinetics of the receptor response. It interacts as a tetramer with GABRB1 and GABRB2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 103
35711 349705 cd18397 BTB_POZ_KCTD12_Pfetin BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 12 (KCTD12). KCTD12, also called predominantly fetal expressed T1 domain (Pfetin), is a BTB/POZ domain-containing protein that is an auxiliary subunit of GABAB receptors associated with mood disorders. It regulates agonist potency and kinetics of GABAB receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. It also regulates colorectal cancer cell stemness through the ERK pathway. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 100
35712 349706 cd18398 BTB_POZ_KCTD16 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 16 (KCTD16). KCTD16 is a BTB/POZ domain-containing protein that is an auxiliary subunit of GABAB receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. 103
35713 349707 cd18399 BTB_POZ_KCTD10_BACURD3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 10 (KCTD10). KCTD10, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 3 (BACURD3), is a BTB/POZ domain-containing protein that interacts with proliferating cell nuclear antigen (PCNA) and polymerase delta, and participates in DNA repair, DNA replication, and cell-cycle control. Its down-regulation could inhibit cell proliferation. KCTD10 also plays crucial roles in embryonic angiogenesis and heart development in mammals by negatively regulating the Notch signaling pathway. Furthermore, KCTD10 may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex, which mediates the ubiquitination of target proteins, leading to their degradation by the proteasome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD10 BTB domain forms a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels. 110
35714 349708 cd18400 BTB_POZ_KCTD13_BACURD1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 13 (KCTD13). KCTD13, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 1 (BACURD1), or TNFAIP1-like protein, is a BTB/POZ domain-containing protein that may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of RhoA, leading to its degradation by the proteasome, thereby regulating the actin cytoskeleton and cell migration. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD13 BTB domain forms a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels. 103
35715 349709 cd18401 BTB_POZ_TNFAIP1_BACURD2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in tumor necrosis factor, alpha-induced protein 1, endothelial (TNFAIP1). TNFAIP1, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 2 (BACURD2), or protein B12, is a BTB/POZ domain-containing protein that is involved in DNA replication, DNA damage repair and cell apoptosis, and is implicated in human diseases including cancer, Alzheimer's disease (AD) and type 2 diabetic nephropathy. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The BTB domains of other BACURD subfamily members, KCTD10 and KCTD13, form a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels. 104
35716 349710 cd18402 BTB_POZ_KCNA1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 1 (KCNA1). KCNA1 is also called voltage-gated K(+) channel HuKI, voltage-gated potassium channel HBK1, or voltage-gated potassium channel subunit Kv1.1. It mediates transmembrane potassium transport in excitable membranes, primarily in the brain and the central nervous system, but also in the kidney. It is involved in the regulation of the membrane potential and nerve signaling, and prevents neuronal hyperexcitability. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 98
35717 349711 cd18403 BTB_POZ_KCNA2_KCNA3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A members 2 (KCNA2) and 3 (KCNA3). KCNA2 is also called NGK1, voltage-gated K(+) channel HuKIV, voltage-gated potassium channel HBK5, or voltage-gated potassium channel subunit Kv1.2. KCNA3 is also called HGK5, HLK3, HPCN3, voltage-gated K(+) channel HuKIII, or voltage-gated potassium channel subunit Kv1.3. KCNA2 and KCNA3 mediate transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA2 primarily functions in the brain and the central nervous system, but also in the cardiovascular system. It prevents aberrant action potential firing and regulates neuronal output. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA2 and KCNA3 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 99
35718 349712 cd18405 BTB_POZ_KCNA4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 4 (KCNA4). KCNA4 is also called HPCN2, or voltage-gated K(+) channel HuKII, voltage-gated potassium channel HBK4, voltage-gated potassium channel HK1, or voltage-gated potassium channel subunit Kv1.4. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA4 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 97
35719 349713 cd18406 BTB_POZ_KCNA5 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 5 (KCNA5). KCNA5, also called HPCN1, voltage-gated potassium channel HK2, or voltage-gated potassium channel subunit Kv1.5, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA5 may play a role in regulating the secretion of insulin in normal pancreatic islets. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA5 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 97
35720 349714 cd18407 BTB_POZ_KCNA6 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 6 (KCNA6). KCNA6, also called voltage-gated potassium channel HBK2 or voltage-gated potassium channel subunit Kv1.6, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA6 is distributed primarily in neurons of central and peripheral nervous systems. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA6 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 127
35721 349715 cd18408 BTB_POZ_KCNA7 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 7 (KCNA7). KCNA7, also called voltage-gated potassium channel subunit Kv1.7, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA7 plays an important role in the repolarization of cell membranes. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA7 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 115
35722 349716 cd18409 BTB_POZ_KCNA10 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 10 (KCNA10). KCNA10, also called voltage-gated potassium channel subunit Kv1.8, is a cyclic nucleotide-gated, voltage-activated potassium channel that mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA10 is expressed in proximal tubular cells, glomerular and vascular endothelial cells, as well as in vascular smooth muscle cells. It may facilitate proximal tubular sodium absorption by stabilizing cell membrane voltage. The channel activity is up-regulated by cAMP. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA10 is an alpha subunit that forms functional homotetrameric channels through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 87
35723 349717 cd18410 BTB_POZ_Shaker-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shaker and similar proteins. Shaker, also termed protein minisleep, represents a family of putative potassium channel proteins in the nervous system of Drosophila. It is a voltage-gated potassium channel that mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Shaker plays a role in the regulation of sleep need or efficiency. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shaker is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 100
35724 349718 cd18411 BTB_POZ_KCNB1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily B member 1 (KCNB1). KCNB1, also called delayed rectifier potassium channel 1 (DRK1) or voltage-gated potassium channel subunit Kv2.1, mediates transmembrane potassium transport in excitable membranes, primarily in the brain, but also in the pancreas and cardiovascular system. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNB1 is involved in the regulation of the action potential (AP) repolarization, duration and frequency of repetitive AP firing in neurons, muscle cells and endocrine cells and plays a role in homeostatic attenuation of electrical excitability throughout the brain. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNB1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 117
35725 349719 cd18412 BTB_POZ_KCNB2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily B member 2 (KCNB2). KCNB2, also called voltage-gated potassium channel subunit Kv2.2, mediates transmembrane potassium transport in excitable membranes, primarily in the brain and smooth muscle cells. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNB2 contributes to the delayed-rectifier voltage-gated potassium current in cortical pyramidal neurons and smooth muscle cells. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNB2 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 127
35726 349720 cd18413 BTB_POZ_Shab-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shab and similar proteins. Shab is a slow delayed rectifier voltage-gated potassium channel in Drosophila. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shab is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 109
35727 349721 cd18414 BTB_KCNC1_3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily C members KCNC1 and KCNC3. KCNC1 (also called NGK2, voltage-gated potassium channel subunit Kv3.1, or voltage-gated potassium channel subunit Kv4) and KCNC3 (also called KSHIIID or voltage-gated potassium channel subunit Kv3.3) play important roles in the rapid repolarization of fast-firing brain neurons. Assuming opened or closed conformations in response to the voltage difference across the membrane, the proteins form tetrameric potassium-selective channels through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNC1 and KCNC3 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 117
35728 349722 cd18415 BTB_KCNC2_4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily C members KCNC2 and KCNC4. KCNC2, also called Shaw-like potassium channel or voltage-gated potassium channel Kv3.2, is a delayed rectifier voltage-gated potassium channel that mediates transmembrane potassium transport in excitable membranes, primarily in the brain. It contributes to the regulation of the fast action potential repolarization and in sustained high-frequency firing in neurons of the central nervous system. KCNC4, also called KSHIIIC or voltage-gated potassium channel subunit Kv3.4, is a novel high-voltage-activating, tetraethylammonium (TEA)-sensitive, type-A potassium channel that mediates the voltage-dependent potassium ion permeability of excitable membranes. It plays a pivotal role in oxidative stress-related neural cell damage as an oxidation-sensitive channel. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNC2 and KCNC4 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 124
35729 349723 cd18416 BTB_Shaw-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel protein Shaw. Shaw, also called Shaw2, is a voltage-gated potassium channel in Drosophila. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shaw is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 112
35730 349724 cd18417 BTB_POZ_KCND1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 1 (KCND1). KCND1, also called voltage-gated potassium channel subunit Kv4.1, is a pore-forming subunit of voltage-gated rapidly inactivating A-type potassium channels. It may contribute to I (To) current in heart and I (Sa) current in neurons. Its properties are modulated by interactions with other alpha subunits and with regulatory subunits. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains. 138
35731 349725 cd18418 BTB_POZ_KCND2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 2 (KCND2). KCND2, also called voltage-gated potassium channel subunit Kv4.2, is a major pore-forming subunit in somatodendritic subthreshold A-type potassium current I(SA) channels. It mediates transmembrane potassium transport in excitable membranes, primarily in the brain. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND2 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains. 103
35732 349726 cd18419 BTB_POZ_KCND3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 3 (KCND3). KCND3, also called voltage-gated potassium channel subunit Kv4.3, is a pore-forming subunit of voltage-gated rapidly inactivating A-type potassium channels. Mutations in KCND3 cause spinocerebellar ataxia. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND3 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains. 138
35733 349727 cd18420 BTB_POZ_Shal-like BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shal and similar proteins. Drosophila melanogaster Shal, also called Shaker cognate l or Shal2, is a transient potassium current (I(A)) channel, which is required for maintaining excitability during repetitive firing and normal locomotion in Drosophila. It may play a role in the nervous system and in the regulation of beating frequency in pacemaker cells. Shal mediates the voltage-dependent potassium ion permeability of excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shal is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 139
35734 349728 cd18421 BTB_POZ_KCNG1_2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G members, KCNG1 and KCNG2. KCNG1, also called voltage-gated potassium channel subunit Kv6.1 or kH2, functions as a regulatory alpha-subunit of voltage-gated potassium channel that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. KCNG2, also called cardiac potassium channel subunit or voltage-gated potassium channel subunit Kv6.2, is a new gamma-subunit of voltage-gated potassium channels that can form functional heterodimeric channels with KCNB1, and further modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG1 and KCNG2 are regulatory alpha subunits and do not form homomultimers. They form heteromultimers (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 114
35735 349729 cd18422 BTB_POZ_KCNG3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G member 3 (KCNG3). KCNG3, also called voltage-gated potassium channel subunit Kv6.3 or voltage-gated potassium channel subunit Kv10.1, is an electrically silent modulatory subunit that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further promotes a reduction in the rate of activation and inactivation of the delayed rectifier voltage-gated potassium channel KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG3 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 111
35736 349730 cd18423 BTB_POZ_KCNG4 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G member 4 (KCNG4). KCNG4, also called voltage-gated potassium channel subunit Kv6.4, is a silent voltage-gated potassium (KvS) channel subunit that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG4 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 112
35737 349731 cd18424 BTB_POZ_KCNV1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily V member 1 (KCNV1). KCNV1, also called neuronal potassium channel alpha subunit HNKA or voltage-gated potassium channel subunit Kv8.1, is a new neuronal voltage-gated potassium channel alpha subunit with specific inhibitory properties towards Shab and Shaw channels. It modulates KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2) channel activity by shifting the threshold for inactivation to more negative values and by slowing the rate of inactivation. It can also down-regulate the channel activity of KCNB1, KCNB2, KCNC4 (also known as Kv3.4) and KCND1 (also known as Kv4.1), possibly by trapping them in intracellular membranes. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNV1 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 109
35738 349732 cd18425 BTB_POZ_KCNV2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily V member 2 (KCNV2). KCNV2, also called voltage-gated potassium channel subunit Kv8.2, is a modulatory voltage-gated potassium channel alpha subunit that modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. KCNV2 is essential for visual function and cone survival. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNV2 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 108
35739 349733 cd18426 BTB_POZ_KCNS1 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 1 (KCNS1). KCNS1, also called delayed-rectifier K(+) channel alpha subunit 1 or voltage-gated potassium channel subunit Kv9.1, is a modulatory alpha subunit of voltage-gated potassium channel that mediates neuropathic pain following nerve injury. It can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS1 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 106
35740 349734 cd18427 BTB_POZ_KCNS2 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 2 (KCNS2). KCNS2, also called delayed-rectifier K(+) channel alpha subunit 2 or voltage-gated potassium channel subunit Kv9.2, is a modulatory alpha subunit of voltage-gated potassium channel that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS2 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 107
35741 349735 cd18428 BTB_POZ_KCNS3 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 3 (KCNS3). KCNS3, also called delayed-rectifier K(+) channel alpha subunit 3 or voltage-gated potassium channel subunit Kv9.3, is an alpha subunit of voltage-gated potassium channel linked to tissue oxygenation responses. It can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS3 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. 108
35742 349485 cd18429 M14_Nna1-like Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases; uncharacterized bacterial subgroup. A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP),-like proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain. 253
35743 349486 cd18430 M14_ASTE_ASPA_like Succinylglutamate desuccinylase/aspartoacylase; uncharacterized. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. 168
35744 349384 cd18431 BRCT_DNA_ligase_III BRCT domain of DNA ligase 3 (LIG3) and similar proteins. LIG3 (EC 6.5.1.1), also termed DNA ligase III, or polydeoxyribonucleotide synthase [ATP] 3, functions as heterodimer with DNA-repair protein XRCC1 in the nucleus and can correct defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents. 78
35745 349385 cd18432 BRCT_PAXIP1_rpt6_like sixth BRCT domain of PAX-interacting protein 1 (PAXIP1), second BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the sixth BRCT domain of PAXIP1 and the second BRCT domain of MDC1. 85
35746 349386 cd18433 BRCT_Rad4_rpt3 third BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the third repeat. 83
35747 349387 cd18434 BRCT_TopBP1_rpt5 fifth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the fifth BRCT domain. 89
35748 349388 cd18435 BRCT_BRC1_like_rpt1 first (N-terminal) BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. Members of this family contain six BRCT domains. This family corresponds to the first repeat. 107
35749 349389 cd18436 BRCT_BRC1_like_rpt2 second BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the second repeat. 75
35750 349390 cd18437 BRCT_BRC1_like_rpt3 third BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the third repeat. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group; it contains a conserved Trp, but not the Cys/Ser residue. 78
35751 349391 cd18438 BRCT_BRC1_like_rpt4 fourth BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. Members of this family contain six BRCT domains. This family corresponds to the fourth repeat. 68
35752 349392 cd18439 BRCT_BRC1_like_rpt6 sixth (C-terminal) BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the sixth repeat. 116
35753 349393 cd18440 BRCT_PAXIP1_rpt6 sixth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the sixth BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family. 90
35754 349394 cd18441 BRCT_MDC1_rpt2 second BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the second BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family. 81
35755 349395 cd18442 BRCT_polymerase_mu BRCT domain of DNA-directed DNA/RNA polymerase mu (polymerase mu) and similar proteins. Polymerase Mu (EC 2.7.7.7), also termed Pol mu, or terminal transferase, is a gap-filling polymerase involved in repair of DNA double-strand breaks by non-homologous end joining (NHEJ). It participates in immunoglobulin (Ig) light chain gene rearrangement in V(D)J recombination. Polymerase Mu contains a BRCT domain. 98
35756 349396 cd18443 BRCT_DNTT BRCT domain of DNA nucleotidylexotransferase (DNTT) and similar proteins. DNTT (EC 2.7.7.31), also termed terminal addition enzyme, or terminal deoxynucleotidyltransferase, or terminal transferase, is a template-independent DNA polymerase which catalyzes the random addition of deoxynucleoside 5'-triphosphate to the 3'-end of a DNA initiator. It adds nucleotides at the junction (N region) of rearranged Ig heavy chain and T-cell receptor gene segments during the maturation of B- and T-cells. DNA nucleotidylexotransferase contains a BRCT domain. 95
35757 350519 cd18444 BACK_KLHL1_like BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins KLHL1, KLHL4 and KLHL5. This subfamily contains Kelch-like proteins: KLHL1, KLHL4 and KLHL5, all of which share high identity and similarity with the Drosophila kelch protein, a component of ring canals. Members of this subfamily contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. 106
35758 350520 cd18445 BACK_KLHL2_like BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL2 and KLHL3. This subfamily includes Kelch-like proteins, KLHL2 and KLHL3. KLHL2 is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. Both KLHL2 and KLHL3 function as a component of an E3 ubiquitin ligase complex that mediates the ubiquitination of target proteins. 114
35759 350521 cd18446 BACK_KLHL6 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 6 (KLHL6). KLHL6 is a BTB-kelch protein with a lymphoid tissue-restricted expression pattern. It belongs to the KLHL gene family, which is composed of an N-terminal BTB-POZ domain and four to six Kelch motifs in tandem. It is involved in B-lymphocyte antigen receptor signaling and germinal center formation. 108
35760 350522 cd18447 BACK_KLHL7 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 7 (KLHL7). KLHL7 is a BTB-Kelch protein that constitutes a Cul3-based E3 ubiquitin ligase complex and is involved in the ubiquitination of target proteins for proteasome-mediated degradation. Mutations in KLHL7 cause autosomal-dominant retinitis pigmentosa. 98
35761 350523 cd18448 BACK_KLHL8 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 8 (KLHL8). KLHL8 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex. The BCR(KLHL8) ubiquitin ligase complex mediates ubiquitination and degradation of RAPSN. 97
35762 350524 cd18449 BACK_KLHL9_13 BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL9 and KLHL13. KLHL9 and KLHL13 (also termed BTB and kelch domain-containing protein 2, or BKLHD2) are substrate-specific adaptors of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL9-KLHL13) E3 ubiquitin ligase complex mediates the ubiquitination of AURKB and controls the dynamic behavior of AURKB on mitotic chromosomes and thereby coordinates faithful mitotic progression and completion of cytokinesis. 95
35763 350525 cd18450 BACK_KLHL10 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 10 (KLHL10). KLHL10 may be a substrate-specific adapter of a CUL3-based E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins during spermatogenesis. 80
35764 350526 cd18451 BACK_KLHL11 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 11 (KLHL11). KLHL11 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, leading most often to their proteasomal degradation. 88
35765 350527 cd18452 BACK_KLHL12 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 12 (KLHL12). KLHL12, also termed CUL3-interacting protein 1 (C3IP1), or DKIR, is a substrate-specific adapter of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a negative regulator of Wnt signaling pathway and ER-Golgi transport. 136
35766 350528 cd18453 BACK_KLHL14 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 14 (KLHL14). KLHL14, also termed protein interactor of Torsin-1A, or Printor, or protein interactor of torsinA, is a novel ATP-free form of torsinA-interacting protein implicated in dystonia pathogenesis. 102
35767 350529 cd18454 BACK_KLHL15 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 15 (KLHL15). KLHL15 is a substrate-specific adaptor for Cullin3 E3 ubiquitin-protein ligase complex that targets the serine/threonine-protein phosphatase 2A (PP2A) subunit PPP2R5B for ubiquitination and subsequent proteasomal degradation, thus promoting exchange with other regulatory subunits. It also plays a key role in DNA damage response, favoring DNA double-strand repair through error-prone non-homologous end joining (NHEJ) over error-free, RBBP8-mediated homologous recombination (HR), by targeting the DNA-end resection factor RBBP8/CtIP for ubiquitination and subsequent proteasomal degradation. 108
35768 350530 cd18455 BACK_KLHL16_gigaxonin BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 16 (KLHL16). Gigaxonin, also termed Kelch-like protein 16 (KLHL16), may be a cytoskeletal component that directly or indirectly plays an important role in neurofilament architecture. It may also act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins, such as tubulin folding cofactor B (TBCB), microtubule-associated protein MAP1B and glial fibrillary acidic protein (GFAP). Gigaxonin is mutated in giant axonal neuropathy. 97
35769 350531 cd18456 BACK_KLHL17 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 17 (KLHL17). KLHL17, also termed actinfilin, is a substrate-recognition component of some cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes. It acts as a Cullin 3 (Cul3) substrate adaptor that links GluR6 to the E3 ubiquitin-ligase complex, and mediates the ubiquitination and subsequent degradation of GLUR6. It may play a role in the actin-based neuronal function. 102
35770 350532 cd18457 BACK_KLHL18 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 18 (KLHL18). KLHL18 acts as a substrate-specific adaptor for the Cullin3 E3 ubiquitin-protein ligase complex that regulates mitotic entry and ubiquitylates Aurora-A. 107
35771 350533 cd18458 BACK_KLHL19_KEAP1 BACK (BTB and C-terminal Kelch) domain found in Kelch-like ECH-associated protein 1 (KEAP1). KEAP1, also termed cytosolic inhibitor of Nrf2 (INrf2), or Kelch-like protein 19 (KLHL19), is a redox-regulated substrate adaptor protein for a Cullin3-dependent ubiquitin ligase complex that targets NFE2L2/NRF2 for ubiquitination and degradation by the proteasome, thus resulting in the suppression of its transcriptional activity and the repression of antioxidant response element-mediated detoxifying enzyme gene expression. 91
35772 350534 cd18459 BACK_KLHL20 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 20 (KLHL20). KLHL20, also termed Kelch-like ECT2-interacting protein (KLEIP), or Kelch-like protein X, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex involved in interferon response and anterograde Golgi to endosome transport. KLHL20 plays a role in actin assembly at cell-cell contact sites of Madin-Darby canine kidney cells. It also controls endothelial migration and sprouting angiogenesis. 100
35773 350535 cd18460 BACK_KLHL21 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 21 (KLHL21). KLHL21 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for efficient chromosome alignment and cytokinesis. The BCR(KLHL21) E3 ubiquitin ligase complex regulates localization of the chromosomal passenger complex (CPC) from chromosomes to the spindle midzone in anaphase and mediates the ubiquitination of aurora B. KLHL21 targets IkappaB kinase-beta to regulate nuclear factor kappa-light chain enhancer of activated B cells (NF-kappaB) signaling negatively. 101
35774 350536 cd18461 BACK_KLHL22 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 22 (KLHL22). KLHL22 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for chromosome alignment and localization of Polo-like kinase 1 (PLK1) at kinetochores. The BCR(KLHL22) ubiquitin ligase complex mediates monoubiquitination of PLK1, leading to PLK1 dissociation from phosphoreceptor proteins and subsequent removal from kinetochores, allowing silencing of the spindle assembly checkpoint (SAC) and chromosome segregation. 104
35775 350537 cd18462 BACK_KLHL23 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 23 (KLHL23). KLHL23 is involved in tumorigenesis and resistance to anticancer drug treatment. It also associates with cone-rod dystrophy. 102
35776 350538 cd18463 BACK_KLHL24 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 24 (KLHL24). KLHL24, also called kainate receptor-interacting protein for GluR6 (KRIP6), or protein DRE1, is necessary to maintain the balance between intermediate filament stability and degradation, a process that is essential for skin integrity. KLHL24 is a component of the BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that mediates ubiquitination of KRT14 and controls its levels during keratinocyte differentiation. 78
35777 350539 cd18464 BACK_KLHL25_like BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL25 and KLHL37. The family includes KLHL25 and KLHL37. KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1). KLHL37, also called ectoderm-neural cortex protein 1 (ENC-1), or nuclear matrix protein NRP/B, or p53-induced gene 10 protein, is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells. 98
35778 350540 cd18465 BACK_KLHL26 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 26 (KLHL26). KLHL26 is a kelch family protein encoded by gene klhl26, which is regulated by p53 via fuzzy tandem repeats. 97
35779 350541 cd18466 BACK_KLHL27_IPP BACK (BTB and C-terminal Kelch) domain found in intracisternal A particle-promoted polypeptide (IPP). IPP, also termed Kelch-like protein 27 (KLHL27), is an actin-binding protein that may play a role in organizing the actin cytoskeleton. 103
35780 350542 cd18467 BACK_KLHL28_BTBD5 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 28 (KLHL28). KLHL28, also termed BTB/POZ domain-containing protein 5 (BTBD5), belongs to the KLHL family. Its function remains unclear. 99
35781 350543 cd18468 BACK_KLHL29_KBTBD9 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 29 (KLHL29). KLHL29, also termed Kelch repeat and BTB domain-containing protein 9 (KBTBD9), belongs to the KLHL family. Its function remains unclear. A nuclear receptor subfamily 5, group A, member 2 (NR5A2)-Kelch-like family member 29 (KLHL29) fusion transcript may participate in the origin or progression of some colon cancers. 102
35782 350544 cd18469 BACK_KLHL30 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 30 (KLHL30). KLHL30 belongs to the KLHL family. Its function remains unclear. Differential expression of the KLHL30 gene has been observed in glioblastoma multiforme versus normal brain. 104
35783 350545 cd18470 BACK_KLHL31_KBTBD1 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 31 (KLHL31). KLHL31, also termed BTB and kelch domain-containing protein 6, or Kelch repeat and BTB domain-containing protein 1, or Kelch-like protein KLHL, is a transcriptional repressor in MAPK/JNK signaling pathway that regulates cellular functions. Overexpression inhibits the transcriptional activities of both the TPA-response element (TRE) and serum response element (SRE). 98
35784 350546 cd18471 BACK_KLHL32_BKLHD5 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 32 (KLHL32). KLHL32, also termed BTB and kelch domain-containing protein 5 (BKLHD5), belongs to the KLHL family. Its function remains unclear. KLHL32 SNPs may be associated with body mass index in individuals of African ancestry. 98
35785 350547 cd18472 BACK_KLHL33 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 33 (KLHL33). KLHL33 belongs to the KLHL family. Its function remains unclear. KLHL33 SNPs may be associated with prostate cancer risk. 75
35786 350548 cd18473 BACK_KLHL34 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 34 (KLHL34). KLHL34 belongs to the KLHL family. Its function remains unclear. The methylation status of KLHL34 cg14232291 may be a predictive candidate of sensitivity to preoperative chemoradiation therapy. 106
35787 350549 cd18474 BACK_KLHL35 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 35 (KLHL35). KLHL35 belongs to the KLHL family. Its function remains unclear. Hypermethylation of KLHL35 is associated with hepatocellular carcinoma and abdominal aortic aneurysm. 79
35788 350550 cd18475 BACK_KLHL36 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 36 (KLHL36). KLHL36 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. 100
35789 350551 cd18476 BACK_KLHL38 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 38 (KLHL38). KLHL38 belongs to the KLHL family. Its function remains unclear. The klhl38 gene has recently been identified as a possible diapause (a temporary arrest of development during early ontogeny) gene, as it is significantly up-regulated during diapause. It may also be involved in chicken preadipocyte differentiation. 99
35790 350552 cd18477 BACK_KLHL40_like BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL40 and KLHL41. The family includes Kelch-like proteins, KLHL40 and KLHL41. KLHL40 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. KLHL41 is a novel kelch related protein that is involved in pseudopod elongation in transformed cells. 99
35791 350553 cd18478 BACK_KLHL42_KLHDC5 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 42 (KLHL42). KLHL42, also called Cullin-3-binding protein 9 (Ctb9), or Kelch domain-containing protein 5, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL42) E3 ubiquitin ligase complex mediates the ubiquitination and subsequent degradation of KATNA1. KLHL42 is involved in microtubule dynamics throughout mitosis. 103
35792 350554 cd18479 BACK_KBTBD2 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 2 (KBTBD2). KBTBD2, also called BTB and kelch domain-containing protein 1 (BKLHD1), plays an essential role in regulating the insulin-signaling pathway. It is a BTB-Kelch family substrate recognition subunit of the Cullin-3-based E3 ubiquitin ligase, which targets p85alpha, the regulatory subunit of the phosphoinositol-3-kinase (PI3K) heterodimer, causing p85alpha ubiquitination and proteasome-mediated degradation. 96
35793 350555 cd18480 BACK_KBTBD3 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 3 (KBTBD3). KBTBD3, also termed BTB and kelch domain-containing protein 3 (BKLHD3), is a BTB-Kelch family protein. Its function remains unclear. 82
35794 350556 cd18481 BACK_KBTBD4 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 4 (KBTBD4). KBTBD4, also termed BTB and kelch domain-containing protein 4 (BKLHD4), is a BTB-BACK-Kelch domain protein belonging to a large family of cullin-RING ubiquitin ligase adaptors that facilitate the ubiquitination of target substrates. 88
35795 350557 cd18482 BACK_KBTBD6_7 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing proteins, KBTBD6 and KBTBD7. KBTBD6 and KBTBD7 are substrate adaptors of a cullin-3 RING ubiquitin ligase complex that mediates ubiquitylation and proteasomal degradation of T-lymphoma and metastasis gene 1 (TIAM1), a RAC1-specific guanine exchange factor (GEF), by cooperating with gamma-aminobutyric acid receptor-associated proteins (GABARAP). KBTBD7 may also act as a new transcriptional activator in mitogen-activated protein kinase (MAPK) signaling. 99
35796 350558 cd18483 BACK_KBTBD8 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 8 (KBTBD8). KBTBD8, also called T-cell activation kelch repeat protein (TA-KRP), is a BTB-kelch family protein that is located in the Golgi apparatus and translocates to the spindle apparatus during mitosis. It acts as a substrate-specific adaptor for a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a regulator of neural crest specification. The BCR(KBTBD8) complex monoubiquitylates NOLC1 and its paralogue TCOF1, the mutation of which underlies the neurocristopathy Treacher Collins syndrome. 97
35797 350559 cd18484 BACK_KBTBD11_CMLAP BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 11 (KBTBD11). KBTBD11, also termed chronic myelogenous leukemia-associated protein (CMLAP), or Kelch domain-containing protein 7B, or KLHDC7C, is a BTB-Kelch family protein. Its function remains unclear. A novel polymorphism rs11777210 in KBTBD11 is significantly associated with colorectal cancer risk; KBTBD11 may function as a tumor suppressor. KBTBD11 hypomethylation may also be a potential target for differentiating between the mostly fatal TCF3-HLF and curable TCF3-PBX1 pediatric acute lymphoblastic leukemia subtypes. 77
35798 350560 cd18485 BACK_KBTBD12 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 12 (KBTBD12). KBTBD12, also termed Kelch domain-containing protein 6 (KLHDC6), is a BTB-Kelch family protein. Its function remains unclear. 100
35799 350561 cd18486 BACK_KBTBD13 BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 13 (KBTBD13). KBTBD13 is a muscle-specific protein. Autosomal dominant mutations may cause nemaline myopathy (NEM); these disease-associated mutations are located in conserved Kelch repeats and are predicted to disrupt the beta-propeller structure. KBTBD13 may act as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that functions as a muscle specific ubiquitin ligase, and thereby implicate the ubiquitin proteasome pathway in the pathogenesis of KBTBD13-associated NEM. 89
35800 350562 cd18487 BACK_BTBD1_like BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing proteins, BTBD1 and BTBD2. This subfamily includes BTB/POZ domain-containing proteins BTBD1 and BTBD2, both of which are BTB-domain-containing Kelch-like proteins that interact with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. 95
35801 350563 cd18488 BACK_BTBD3_like BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing proteins, BTBD3 and BTBD6. This subfamily includes BTB/POZ domain-containing proteins BTBD3 and BTBD6, both of which are BTB-domain-containing Kelch-like proteins. BTBD3 controls dendrite orientation toward active axons in mammalian neocortex. BTBD6 is required for proper embryogenesis and plays an essential evolutionarily-conserved role during neuronal development. 95
35802 350564 cd18489 BACK_BTBD7 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains and a BACK domain. 98
35803 350565 cd18490 BACK_BTBD8 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental processes. It may also act as a protein-protein adaptor in a transcription complex and thus may be involved in brain development. 64
35804 350566 cd18491 BACK_ABTB2_like BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2) and similar proteins. ABTB2, also called Bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins involved in a range of functions from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications in Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homologue (PTEN). This subfamily also includes BTB/POZ domain-containing protein 11 (BTBD11), whose function is unclear. 72
35805 350567 cd18492 BACK_BTBD16 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 16 (BTBD16). BTBD16 is a BTB-domain-containing Kelch-like protein. Its function remains unclear. BTBD16 SNPs may be bipolar disorder (BD) genetic susceptibility variants exhibiting genetic background-dependent effects. 97
35806 350568 cd18493 BACK_BTBD17 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 17 (BTBD17). BTBD17, also termed galectin-3-binding protein-like, is a BTB-domain-containing Kelch-like protein. Its function remains unclear. It may be involved in hepatocellular carcinoma development and progression. 74
35807 350569 cd18494 BACK_BTBD19 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 19 (BTBD19). BTBD19 is a BTB-domain-containing Kelch-like protein. Its function remains unclear. 73
35808 350570 cd18495 BACK_GCL BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster protein germ cell-less (GCL) and similar proteins. The GCL protein is a nuclear envelope protein highly conserved between the mammalian and Drosophila orthologs. Drosophila melanogaster GCL is a key regulator required for the specification of pole cells and primordial germ cell formation in Drosophila embryos. Both human germ cell-less protein-like 1 (GMCL1) and germ cell-less protein-like 1-like (GMCL1P1 or GMCL1L) may function in spermatogenesis. They may also be substrate-specific adaptors of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. 78
35809 350571 cd18496 BACK_LGALS3BP BACK (BTB and C-terminal Kelch) domain found in lectin galactoside-binding soluble 3-binding protein (LGALS3BP). LGALS3BP, also called galectin-3-binding protein, or basement membrane autoantigen p105, or Mac-2-binding protein (MAC2BP/M2BP), or tumor-associated antigen 90K, promotes integrin-mediated cell adhesion. It may stimulate host defense against viruses and tumor cells. 74
35810 350572 cd18497 BACK_ABTB1_BPOZ BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein, or Bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homologue (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes. 72
35811 350573 cd18498 BACK_RCBTB1_2 BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2. The RCC1-related guanine nucleotide exchange factor (GEF) family includes RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2, both of which are chromosome condensation regulator-like guanine nucleotide exchange factors. 64
35812 350574 cd18499 BACK_RHOBTB BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline rich region, a tandem of 2 BTB domains, and a C-terminal BACK domain. In humans, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi. 76
35813 350575 cd18500 BACK_IBtk BACK (BTB and C-terminal Kelch) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor of Bruton's tyrosine kinase (Btk), thereby playing a role in B-cell development. 60
35814 350576 cd18501 BACK_ANKFY1_Rank5 BACK (BTB and C-terminal Kelch) domain found in rabankyrin-5 (Rank-5). Rank-5, also called ankyrin repeat and FYVE domain-containing protein 1 (ANKFY1), or ankyrin repeats hooked to a zinc finger motif (ANKHZN), is a Rab5 effector that regulates and coordinates different endocytic mechanisms. 89
35815 350577 cd18502 BACK_NS1BP_IVNS1ABP BACK (BTB and C-terminal Kelch) domain found in influenza virus NS1A-binding protein (NS1-BP). NS1-BP, also called NS1-binding protein, or Aryl hydrocarbon receptor-associated protein 3, or IVNS1ABP, is a novel protein that interacts with the influenza A virus nonstructural NS1 protein, which is relocalized in the nuclei of infected cells. It plays a role in cell division and in the dynamic organization of the actin skeleton as a stabilizer of actin filaments by association with F-actin through Kelch repeats. It also interacts with alpha-enolase/MBP-1 and is involved in c-Myc gene transcriptional control. 99
35816 350578 cd18503 BACK_calicin BACK (BTB and C-terminal Kelch) domain found in calicin. Calicin is a basic cytoskeletal protein involved in the formation and maintenance of the highly regular organization of the postacrosomal perinuclear theca, the calyx of mammalian spermatozoa. 78
35817 350579 cd18504 BACK_ARIA_like BACK (BTB and C-terminal Kelch) domain found in plant ARM repeat protein interacting with ABF2 (ARIA) and similar proteins. ARIA is an ARM repeat protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2, a transcription factor which controls ABA-dependent gene expression via the G-box-type ABA-responsive elements. ARIA is a novel abscisic acid signaling component. It negatively regulates seed germination and young seedling growth. 64
35818 350580 cd18505 BACK1_LZTR1 first BACK (BTB and C-terminal Kelch) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. 59
35819 350581 cd18506 BACK2_LZTR1 second BACK (BTB and C-terminal Kelch) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. 61
35820 350582 cd18507 BACK_GPRS_like BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster serine-enriched protein (GPRS) and similar proteins. The family includes uncharacterized Drosophila melanogaster serine-enriched protein (GPRS) and similar proteins. 80
35821 350583 cd18508 BACK_KEL_like BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster ring canal kelch protein (KEL) and similar proteins. KEL, also termed kelch short protein, is a component of ring canals that regulates the flow of cytoplasm between cells. It binds actin and may be involved in the regulation of cytoplasm flow from nurse cells to the oocyte during oogenesis. 77
35822 350584 cd18509 BACK_KLHL1 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 1 (KLHL1). KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. It may play a role in organizing the actin cytoskeleton in brain cells. KLHL1 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. 106
35823 350585 cd18510 BACK_KLHL4 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 4 (KLHL4). KLHL4 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. 106
35824 350586 cd18511 BACK_KLHL5 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 5 (KLHL5). KLHL5 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. It is abundantly expressed in ovary, adrenal gland, and thymus. 106
35825 350587 cd18512 BACK_KLHL2_Mayven BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 2 (KLHL2). KLHL2, also called actin-binding protein Mayven, is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, such as NPTXR, leading most often to their proteasomal degradation. 130
35826 350588 cd18513 BACK_KLHL3 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 3 (KLHL3). KLHL3 serves as a substrate adapter in Cullin3 (Cul3) E3 ubiquitin ligase complexes. It is a component of an E3 ubiquitin ligase complex that regulates blood pressure by targeting With-No-Lysine (WNK) kinases for degradation. 130
35827 350589 cd18514 BACK_KLHL25_ENC2 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 25 (KLHL25). KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1). 99
35828 350590 cd18515 BACK_KLHL37_ENC1 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 37 (KLHL37). KLHL37, also called ectoderm-neural cortex protein 1 (ENC-1), or nuclear matrix protein NRP/B, or p53-induced gene 10 protein, is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells. 98
35829 350591 cd18516 BACK_KLHL40_KBTBD5 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 40 (KLHL40). KLHL40, also called Kelch repeat and BTB domain-containing protein 5, or sarcosynapsin, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. Mutations in KLHL40 may cause severe autosomal-recessive nemaline myopathy. 99
35830 350592 cd18517 BACK_KLHL41_KBTBD10 BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 41 (KLHL41). KLHL41, also called Kel-like protein 23, or Kelch repeat and BTB domain-containing protein 10, or Kelch-related protein 1 (Krp1), or sarcosin, is a novel kelch related protein that is involved in pseudopod elongation in transformed cells. It is also involved in skeletal muscle development and differentiation. It regulates proliferation and differentiation of myoblasts and plays a role in myofibril assembly by promoting lateral fusion of adjacent thin fibrils into mature, wide myofibrils. 99
35831 350593 cd18518 BACK_SPOP BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein (SPOP). SPOP, also termed HIB homolog 1, or Roadkill homolog 1, is a novel nuclear speckle-type protein which serves as an adaptor of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins, such as BRMS1, DAXX, PDX1/IPF1, GLI2 and GLI3. 71
35832 350594 cd18519 BACK_SPOPL BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein-like (SPOPL). SPOPL, also termed HIB homolog 2, or Roadkill homolog 2, is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The complexes containing homodimeric SPOPL or the heterodimer formed by speckle-type POZ protein (SPOP) and SPOPL are less efficient than ubiquitin ligase complexes containing only SPOP. 96
35833 350595 cd18520 BACK_roadkill_like BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster protein roadkill and similar proteins. Drosophila melanogaster protein roadkill, also termed Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor. 74
35834 350596 cd18521 BACK_Tdpoz BACK (BTB and C-terminal Kelch) domain found in TD and POZ domain-containing proteins, Tdpoz1-4. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. This subfamily contains only mammalian members. Plant TDPOZ proteins contain a MATH domain at the N-terminal region and are named "BTB/POZ and MATH domain-containing proteins (BPM)", and are not included in this subfamily. 67
35835 350597 cd18522 BACK_BTBD1 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 1 (BTBD1). BTBD1, also called Hepatitis C virus NS5A-transactivated protein 8 or HCV NS5A-transactivated protein 8, is a BTB-domain-containing Kelch-like protein specifically expressed in skeletal muscle. It interacts with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. It may serve as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. 107
35836 350598 cd18523 BACK_BTBD2 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 2 (BTBD2). BTBD2 is a BTB-domain-containing Kelch-like protein that interacts with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. 106
35837 350599 cd18524 BACK_BTBD3 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 3 (BTBD3). BTBD3 is a BTB-domain-containing Kelch-like protein that controls dendrite orientation toward active axons in mammalian neocortex. 95
35838 350600 cd18525 BACK_BTBD6 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 6 (BTBD6). BTBD6, also called lens BTB domain protein, is a BTB-domain-containing Kelch-like protein that is required for proper embryogenesis and plays an essential evolutionarily-conserved role during neuronal development. 95
35839 350601 cd18526 BACK_ABTB2 BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2). ABTB2, also called Bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins with functions ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications in Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homologue (PTEN). 79
35840 350602 cd18527 BACK_BTBD11 BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 11 (BTBD11). BTBD11, also termed ankyrin repeat and BTB/POZ domain-containing protein BTBD11, is a BTB-domain-containing Kelch-like protein. Its function remains unclear. The BTBD11 gene has been identified as an all-trans retinoic acid-responsive gene that may play a role in neural development. 83
35841 350603 cd18528 BACK_RCBTB1 BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing protein 1 (RCBTB1). RCBTB1, also called chronic lymphocytic leukemia deletion region gene 7 protein (CLLD7), or CLL deletion region gene 7 protein, or regulator of chromosome condensation and BTB domain-containing protein 1, or E4.5, is a novel chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) that may be involved in cell cycle regulation by chromatin remodeling. It may also function as a tumor suppressor that regulates pathways of DNA damage/repair and apoptosis. Moreover, RCBTB1 acts as a putative substrate adaptor for a cullin3 (CUL3) E3 ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. Biallelic mutations in RCBTB1 may cause isolated and syndromic retinal dystrophy. 66
35842 350604 cd18529 BACK_RCBTB2 BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing protein 2 (RCBTB2). RCBTB2, also called chromosome condensation 1-like (CHC1-L), or RCC1-like G exchanging factor, or regulator of chromosome condensation and BTB domain-containing protein 2, is a chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) protein for the Ras-related GTPase Ran. 65
35843 350605 cd18530 BACK_RHOBTB1 BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. RhoBTB1 also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. 100
35844 350606 cd18531 BACK_RHOBTB2 BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2), or p83, is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. RhoBTB2 also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. 97
35845 350607 cd18532 BACK_RHOBTB3 BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. RhoBTB3 is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) by facilitating hydroxylation and ubiquitination. 83
35846 350509 cd18533 PTP_fungal fungal protein tyrosine phosphatases. This subfamily contains Saccharomyces cerevisiae protein-tyrosine phosphatases 1 (PTP1) and 2 (PTP2), Schizosaccharomyces pombe PTP1, PTP2, and PTP3, and similar fungal proteins. PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. PTP2, together with PTP3, is the major phosphatase that dephosphorylates and inactivates the MAP kinase HOG1 and also modulates its subcellular localization. 212
35847 350510 cd18534 DSP_plant_IBR5-like dual specificity phosphatase domain of plant IBR5-like protein phosphatases. This subfamily is composed of Arabidopsis thaliana INDOLE-3-BUTYRIC ACID (IBA) RESPONSE 5 (IBR5) and similar plant proteins. IBR5 protein is also called SKP1-interacting partner 33. The IBR5 gene encodes a dual-specificity phosphatase (DUSP) which acts as a positive regulator of plant responses to auxin and abscisic acid. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Typical DUSPs, also called mitogen-activated protein kinase (MAPK) phosphatases (MKPs), deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. IBR5 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs. It has been shown to target MPK12, which is a negative regulator of auxin signaling. 130
35848 350511 cd18535 PTP-IVa3 protein tyrosine phosphatase type IVA 3. Protein tyrosine phosphatase type IVA 3 (PTP-IVa3), also known as protein-tyrosine phosphatase of regenerating liver 3 (PRL-3), stimulates progression from G1 into S phase during mitosis and enhances cell proliferation, cell motility and invasive activity, and promotes cancer metastasis. It exerts its oncogenic functions through activation of PI3K/Akt, which is a key regulator of the rapamycin-sensitive mTOR complex 1. PRL-3 is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation. 154
35849 350512 cd18536 PTP-IVa2 protein tyrosine phosphatase type IVA 2. Protein tyrosine phosphatase type IVA 2 (PTP-IVa2), also known as protein-tyrosine phosphatase of regenerating liver 2 (PRL-2), stimulates progression from G1 into S phase during mitosis and promotes tumors. It regulates tumor cell migration and invasion through an ERK-dependent signaling pathway. Its overexpression correlates with breast tumor formation and progression. PRL-2 is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation. 155
35850 350513 cd18537 PTP-IVa1 protein tyrosine phosphatase type IVA 1. Protein tyrosine phosphatase type IVA 1 (PTP-IVa1), also known as protein-tyrosine phosphatase of regenerating liver 1 (PRL-1), stimulates progression from G1 into S phase during mitosis and enhances cell proliferation, cell motility and invasive activity, and promotes cancer metastasis. It may play a role in the development and maintenance of differentiating epithelial tissues. PRL-1 promotes cell growth and migration by activating both the ERK1/2 and RhoA pathways. It is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation. 167
35851 350514 cd18538 PFA-DSP_unk unknown subfamily of atypical dual-specificity phosphatases from fungi. This uncharacterized subfamily belongs to the plant and fungi atypical dual-specificity phosphatases (PFA-DSPs) group of atypical DSPs that are present in plants, fungi, kinetoplastids, and slime molds. They share structural similarity with atypical- and lipid phosphatase DSPs from mammals. The PFA-DSP group is composed of active as well as inactive phosphatases. This unknown subgroup contains the conserved CxxxxxR catalytic motif present in active cysteine phosphatases. 145
35852 349786 cd18539 SRP_G GTPase domain of signal recognition particle protein. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain. 193
35853 349984 cd18540 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 295
35854 349985 cd18541 ABC_6TM_TmrB_like Six-transmembrane helical domain (TmrB) of the heterodimeric Thermus thermophilus multidrug resistance proteins TmrAB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the heterodimeric Thermus thermophilus multidrug resistance proteins A and B (TmrAB), a homolog of the Antigen Translocation Complex Tap, and similar proteins. TmrAB has been shown to able to restore antigen processing in human TAP-deficient cells. The 6-transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 293
35855 349986 cd18542 ABC_6TM_YknU_like Six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknU and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknU and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 292
35856 349987 cd18543 ABC_6TM_Rv0194_D1_like Six-transmembrane helical domain 1 (TMD1) of the multidrug efflux ABC transporter Rv0194 and similar proteins. This group includes the six-transmembrane helical domain 1 (TMD1) of the multidrug efflux ATP-binding/permease protein Rv0194 from Mycobacterium tuberculosis and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 291
35857 349988 cd18544 ABC_6TM_TmrA_like Six-transmembrane helical domain (TmrA) of the heterodimeric Thermus thermophilus multidrug resistance proteins TmrAB, and similar proteins. This group represents the six-transmembrane helical domain (TrmA) of the heterodimeric Thermus thermophilus multidrug resistance proteins A and B (TmrAB), a homolog of the Antigen Translocation Complex Tap, and similar proteins. TmrAB has been shown to able to restore antigen processing in human TAP-deficient cells. The 6-transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 294
35858 349989 cd18545 ABC_6TM_YknV_like Six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknV and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknV and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 293
35859 349990 cd18546 ABC_6TM_Rv0194_D2_like Six-transmembrane helical domain 2 (TMD2) of the multidrug efflux ABC transporter Rv0194 and similar proteins. This group includes the six-transmembrane helical domain 2 (TMD2) of the multidrug efflux ATP-binding/permease protein Rv0194 from Mycobacterium tuberculosis and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 292
35860 349991 cd18547 ABC_6TM_Tm288_like Six-transmembrane helical domain Tm288 of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This group represents the six-transmembrane helical domain (Tm288) of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 298
35861 349992 cd18548 ABC_6TM_Tm287_like Six-transmembrane helical domain Tm287 of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This group represents the six-transmembrane helical domain (Tm287) of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 292
35862 349993 cd18549 ABC_6TM_YwjA_like Six-transmembrane helical domain of an uncharacterized ABC transporter YwjA and similar proteins. This group represents the six-transmembrane helical domain of an uncharacterized ABC transporter YwjA from Bacillus subtilis and similar proteins. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 295
35863 349994 cd18550 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 294
35864 349995 cd18551 ABC_6TM_LmrA_like Six-transmembrane helical domain of the multidrug resistance ABC transporter LmrA and similar proteins. This group represents the six-transmembrane helical domain of the multidrug resistance ABC transporter LmrA from Lactococcus lactis and similar proteins. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 289
35865 349996 cd18552 ABC_6TM_MsbA_like Six-transmembrane helical domain of the bacterial ABC lipid flippase MsbA and similar proteins. The bacterial lipid flippase MsbA is found in Gram-negative bacteria and transports lipid A and lipopolysaccharide (LPS) from the cytoplasmic leaflet to the periplasmic leaflet of the inner membrane. MsbA is also a polyspecific transporter capable of transporting a broad spectrum of drug molecules. Additionally, MsbA exhibits significant sequence similarity to mammalian multidrug resistance (MDR) proteins such as human MDR protein 1 (MDR1) and LmrA from Lactococcus lactis. This subgroup also contains a putative transporter Brevibacillus brevis TycD; the location of the tycD gene within the Tyc (tyrocidine) biosynthesis operon suggests that TycD may play a role in the secretion of the cyclic decapeptide antibiotic tyrocidine. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 292
35866 349997 cd18553 ABC_6TM_PglK_like Six-transmembrane helical domain of the ABC transporter PglK and similar proteins. This group represents the transmembrane (TM) domain of an active lipid-linked oligosaccharides flippase PglK (protein glycosylation K), which is a homodimeric ABC transporter that flips a lipid-linked oligosaccharide that serves as a glycan donor in N-linked protein glycosylation. Pglk mediates the ATP-dependent translocation of the undecaprenylpyrophosphate-linked heptasaccharide intermediate across the cell membrane; this is an essential step during the N-linked protein glycosylation pathway. This TM subunit exhibits the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. Bacterial ABC exporters are typically expressed as half-transporters that contain one transmembrane domain (TMD) fused to a nucleotide-binding domain (NBD), which dimerize to form the full transporter. 300
35867 349998 cd18554 ABC_6TM_Sav1866_like Six-transmembrane helical domain of the bacterial ABC multidrug exporter Sav1866 and similar proteins. This group represents the homodimeric bacterial ABC multidrug exporter Sav1866, which is homologous to the lipid flippase MsbA, and both of which are functionally related to the human P-glycoprotein multidrug transporter (ABCB1 or MDR1). This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. 299
35868 349999 cd18555 ABC_6TM_T1SS_like Six-transmembrane helical domain (6-TMD) of the ATP-binding cassette subunit in the type 1 secretion systems, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) and similar proteins. These transporter subunits include HylB, PrtD, CyaB, CvaB, RsaD, HasD, LipB, and LapB, among many others. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). Most targeted proteins are not cleaved at the N terminus, but rather carry signals located toward the extreme C terminus to direct type I secretion. However, the 10 kDa Escherichia coli colicin V (CvaB) targets the ABC transporter using a cleaved, N-terminal signal sequence. Almost all transport substrates of the type I system have critical functions in attacking host cells either directly or by being essential for host colonization. The ABC-dependent T1SS transports various molecules, from ions, drugs, to proteins of various sizes up to 900 kDa. The molecules secreted vary in size from the small Escherichia coli peptide colicin V, (10 kDa) to the Pseudomonas fluorescens cell adhesion protein LapA of 520 kDa. The best characterized are the RTX toxins such as the adenylate cyclase (CyaA) toxin from Bordetella pertussis, the causative agent of whooping cough, and the lipases such as LipA. Type I secretion is also involved in export of non-protein substrates such as cyclic beta-glucans and polysaccharides. 294
35869 350000 cd18556 ABC_6TM_McjD_like Six-transmembrane helical domain of the antibacterial peptide ATP-binding cassette transporter McjD and similar proteins. This group represents the 6-TM subunit of the ABC transporter McjD that exports the antibacterial peptide microcin J25, which is an antimicrobial peptide produced by Enterobacteriaceae against other microorganisms for survival under nutrient starvation. Thus, the ABC exporter McjD provides self-immunity of the producing bacteria through export of the toxic peptide out of the cell. Bacterial ABC exporters are typically expressed as half-transporters that contain one transmembrane domain (TMD) fused to a nucleotide-binding domain (NBD), which dimerize to form the full transporter. 298
35870 350001 cd18557 ABC_6TM_TAP_ABCB8_10_like Six-transmembrane helical domain (6-TMD) of the ABC transporter TAP, ABCB8 and ABCB10. This group includes ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection, as well as ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum(ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. 289
35871 350002 cd18558 ABC_6TM_Pgp_ABCB1 Six-transmembrane helical domain of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp) also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1(ABCB1) is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump, binds drugs with diverse chemical structures and pump them out of the drug resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as drug efflux pumps of anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells. 312
35872 350003 cd18559 ABC_6TM_ABCC Six-transmembrane helical domain of the ABC transporters, subfamily C. This group represents the 6-transmembrane (6TM) domain of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. 290
35873 350004 cd18560 ABC_6TM_ATM1_ABCB7_HMT1_ABCB6 Six-transmembrane helical domain (6-TMD) of the Atm1/ABCB7/HMT1/ABCB6 subfamily. This group represents the Atm1/ABCB7/HMT1/ABCB6 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. Yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7), which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, the Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron sulfur cluster containing proteins; mutations of ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia. ABCB6 is originally identified as a porphyrin transporter present in the outer membrane of mitochondria. It is highly expressed in cells resistance to arsenic and protects against arsenic cytotoxicity. Moreover, Heavy Metal Tolerance Factor-1 (HMT1) proteins are required for cadmium resistance in Caenorhabditis elegans and Drosophila melanogaster. 292
35874 350005 cd18561 ABC_6TM_AarD_CydDC_like Six-transmembrane helical domain (6-TMD) of the ABC cysteine/GSH transporter CydDC, and similar proteins. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristic consistent with a defect in the cytochrome d oxidase. The CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs). 289
35875 350006 cd18562 ABC_6TM_NdvA_beta-glucan_exporter_like Six-transmembrane helical domain of the cyclic beta-glucan ABC transporter NdvA, and similar proteins. This group represents the six-transmembrane domain of NdvA, an ATP-dependent exporter of cyclic beta glucans, and similar proteins. NdvA is required for nodulation of legume roots and is involved in beta-(1,2)-glucan export to the periplasm. NdvA mutants in Brucella abortus and Sinorhizobium meliloti have been shown to exhibit decreased virulence in mice and inhibit intracellular multiplication in HeLa cells. These results suggest that cyclic beta-(1,2)-glucan is required to transport into the periplasmatic space to function as a virulence factor. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. 289
35876 350007 cd18563 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 296
35877 350008 cd18564 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied among the different types of transporters, reflecting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 307
35878 350009 cd18565 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied among the different types of transporters, reflecting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 313
35879 350010 cd18566 ABC_6TM_PrtD_LapB_HlyB_like Six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin. 294
35880 350011 cd18567 ABC_6TM_CvaB_RaxB_like Six-transmembrane helical domain (6-TMD) of the ABC transporter subunit of the type 1 secretion systems, CvaB and RaxB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the peptidase-containing ABC transporter subunit of T1SS (Type 1 secretion systems), such as Escherichia coli colicin V secretion/processing ATP-binding protein CvaB and putative ABC transporter RaxB. These ABC-transporter proteins carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion. RaxB is part of the T1SS RaxABC, which is responsible for the type 1-dependent secretion of the bacterial quorum-sensing molecule AvrXa21. Both CvaB and RaxB belong to a subgroup of T1SS ABC transporters that contain a C39 peptidase domain. T1SS are found in pathogenic Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. 294
35881 350012 cd18568 ABC_6TM_HetC_like Six-transmembrane helical domain (6-TMD) of the ABC subunit of T1SS-like HetC and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit of T1SS (type 1 secretion systems), such as heterocyst differentiation protein HetC. HetC is similar to ABC protein exporters of T1SS (type 1 secretion systems) and is involved in early regulation of heterocyst differentiation in the filamentous cyanobacterium Anabaena sp. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. ABC-transporter proteins in this group carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion. 294
35882 350013 cd18569 ABC_6TM_NHLM_bacteriocin Six-transmembrane helical domain (6-TMD) of NHLP family bacteriocin export ABC transporters. This group includes the six-transmembrane helical domain (6-TMD) of the ABC subunit of NHLM (Nitrile Hydratase Leader Microcin) bacteriocin system, which contains ABC transporter (permease/ATP-binding fused protein) with a peptidase domain. ABC-transporter proteins in this group are predicted to be a subunit of a bacteriocin processing and export system, and they carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion. 294
35883 350014 cd18570 ABC_6TM_PCAT1_LagD_like Six-transmembrane helical domain (6-TMD) of the peptidase-containing ATP-binding cassette transporters. This group includes the 6-TMD of the peptidase-containing ATP-binding cassette transporters (PCATs) such as Clostridium thermocellum PCAT1, a polypeptide processing and secretion transporter, and LagD, a bacteriocin ABC transporter from Lactococcus lactis. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. The transporters involved in protein secretion often contain additional peptidase domains essential for substrate processing. These peptidase domains belong to the cysteine protease superfamily, classified as family C39, bacteriocin-processing peptidase. LagD is highly similar to the peptidase-containing ATP-binding cassette transporters (PCATs). In Gram-positive bacteria, the PCATs are responsible for exporting quorum-sensing or antimicrobial peptides called bacteriocins. 294
35884 350015 cd18571 ABC_6TM_peptidase_like Six-transmembrane helical domain (6-TMD) of an uncharacterized peptidase ABC transporter and similar proteins. This group includes the 6-TMD of an uncharacterized peptidase-containing ABC transporter of T1SS (type 1 secretion systems), similar to heterocyst differentiation protein HetC. HetC is involved in early regulation of heterocyst differentiation in the filamentous cyanobacterium Anabaena sp. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. ABC-transporter proteins in this group carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion. 294
35885 350016 cd18572 ABC_6TM_TAP Six-transmembrane helical domain (6-TMD) of the ABC transporter associated with antigen processing. This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation. 289
35886 350017 cd18573 ABC_6TM_ABCB10_like Six-transmembrane helical domain (6-TMD) of the mitochondrial transporter ABCB10 (subfamily B, member 10) and similar proteins. This group includes the 6-TM subunit of ABCB10 (also known as ABC mitochondrial erythroid, ABC-me, mABC2, or ABCBA), which is one of the three ATP-binding cassette (ABC) transporters found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. In mammals, ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied among the different types of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. 294
35887 350018 cd18574 ABC_6TM_ABCB8_like Six-transmembrane helical domain (6-TMD) of ATP-binding cassette transporter subfamily B member 8, mitochondrial, and similar proteins. This group includes ABCB8, which is one of the three ATP-binding cassette (ABC) transporters found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. ABCB8 is essential for maintenance of normal cardiac function, is involved in mitochondrial iron export, and plays a role in the maturation of cytosolic Fe/S cluster-containing enzymes. ABCB8 is a half-molecule ABC protein that contains one TMD fused to an NBD and dimerizes to form a functional transporter. 295
35888 350019 cd18575 ABC_6TM_bac_exporter_ABCB8_10_like Six-transmembrane helical domain of putative bacterial ABC exporters, similar to ABCB8 and ABCB10. This group includes putative bacterial ABC transporters similar to ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. 289
35889 350020 cd18576 ABC_6TM_bac_exporter_ABCB8_10_like Six-transmembrane helical domain of putative bacterial ABC exporters, similar to ABCB8 and ABCB10. This group includes putative bacterial ABC transporters similar to ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. 289
35890 350021 cd18577 ABC_6TM_Pgp_ABCB1_D1_like Six-transmembrane helical domain 1 (TMD1) of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp), also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1 (ABCB1), is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump that binds drugs with diverse chemical structures and pumps them out of drug-resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as drug efflux pumps of anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells. 300
35891 350022 cd18578 ABC_6TM_Pgp_ABCB1_D2_like Six-transmembrane helical domain 2 (TMD2) of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp), also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1 (ABCB1), is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump that binds drugs with diverse chemical structures and pumps them out of drug-resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as drug efflux pumps of anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells. 317
35892 350023 cd18579 ABC_6TM_ABCC_D1 Six-transmembrane helical domain 1 (TMD1) of the ABC transporters, subfamily C. This group represents the six-transmembrane domain 1 (TMD1) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids, and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are composed of two identical TMDs and two identical NBDs. 289
35893 350024 cd18580 ABC_6TM_ABCC_D2 Six-transmembrane helical domain 2 (TMD2) of the ABC transporters, subfamily C. This group represents the six-transmembrane domain 2 (TMD2) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids, and polypeptides. All ABC transporters share a common architecture of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are composed of two identical TMDs and two identical NBDs. 294
35894 350025 cd18581 ABC_6TM_ABCB6 Six-transmembrane helical domain of the ATP-binding cassette subfamily B member 6, mitochondrial. This group represents the ABCB6 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. ABCB6 was originally identified as a porphyrin transporter present in the outer membrane of mitochondria. It is highly expressed in cells resistant to arsenic and protects against arsenic cytotoxicity. Moreover, ABCB6 (ABC transporter subfamily B, member 6) is closely related to yeast ATM1 and human ABCB7, which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron-sulfur cluster-containing proteins; mutations of the ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia. 300
35895 350026 cd18582 ABC_6TM_ATM1_ABCB7 Six-transmembrane helical domain of the Atm1/ABCB7 transporters. This group represents the Atm1/ABCB7 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. Yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7) are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron-sulfur cluster-containing proteins; mutations of the ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia. 292
35896 350027 cd18583 ABC_6TM_HMT1 Six-transmembrane helical domain of the heavy metal tolerance protein. This group represents the HMT1 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. Heavy Metal Tolerance Factor-1 (HMT1) proteins are required for cadmium resistance in Caenorhabditis elegans and Drosophila melanogaster. HMT1 is closely related to Yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7), which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. 290
35897 350028 cd18584 ABC_6TM_AarD_CydD Six-transmembrane helical domain (6TM) of CydD, a component of the ABC cysteine/GSH transporter, and a homolog AarD. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristics consistent with a defect in the cytochrome d oxidase. CydDC forms a heterodimeric ABC transporter with two nucleotide binding domains (NBDs) and two transmembrane domains (TMDs), each TMD predicted to comprise six TM alpha-helices. 290
35898 350029 cd18585 ABC_6TM_CydC Six-transmembrane helical domain (6-TMD) of the CydC, a component of the ABC cysteine/GSH transporter. The CydC protein, together with the CydD protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. The CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs). 290
35899 350030 cd18586 ABC_6TM_PrtD_like Six-transmembrane helical domain (6TM) of the ABC subunit (PrtD) in the T1SS metalloprotease secretion system, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) such as PrtD, which is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. The Aquifex aeolicus PrtDEF of T1SS is composed of an inner-membrane ABC transporter (PrtD), a periplasmic membrane-fusion protein (PrtE), and an outer-membrane porin (PrtF). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. 291
35900 350031 cd18587 ABC_6TM_LapB_like Six-transmembrane helical domain of the ABC transporter subunit LapB and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), such as LapB. LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin. LapA is an RTX (repeats in toxin) protein found in Pseudomonas fluorescens and is required for biofilm formation in this organism. T1SS are found in pathogenic Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. In this T1SS system, LapB is a cytoplasmic membrane-localized ATPase, LapC is a membrane fusion protein, and LapE is an outer membrane protein. 293
35901 350032 cd18588 ABC_6TM_CyaB_HlyB_like Six-transmembrane helical domain of the ABC subunits of T1SS, CyaB/HlyB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunits of T1SS, such as CyaB and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. Additionally, CyaB is one of the three T1SS complex proteins for adenylate cyclase toxin CyaA, which is a primary virulence factor in Bordetella pertussis: CyaB (an ABC transporter), CyaD (a membrane fusion protein), and CyaE (an outer membrane protein). 294
35902 350033 cd18589 ABC_6TM_TAP1 Six-transmembrane helical domain 1 (6-TMD1) of the ABC transporter associated with antigen processing 1 (TAP1). This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation. 289
35903 350034 cd18590 ABC_6TM_TAP2 Six-transmembrane helical domain 2 (6-TMD2) of the ABC transporter associated with antigen processing 2 (TAP2). This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation. 289
35904 350035 cd18591 ABC_6TM_SUR1_D1_like Six-transmembrane helical domain 1 (TMD1) of the sulphonylurea receptors SUR1/2. This group represents the six-transmembrane domain 1 (TMD1) of the sulphonylurea receptors SUR1/2 (ABCC8), which function as a modulator of ATP-sensitive potassium channels and insulin release, and they belong to the ABCC subfamily. The ATP-sensitive (K-ATP) channel is an octameric complex of four pore-forming Kir6.2 subunits and four regulatory SUR subunits. Thus, in contrast to other ABC transporters, the SUR serves as the regulatory subunit of an ion channel. Mutations and deficiencies in the SUR proteins have been observed in patients with hyperinsulinemic hypoglycemia of infancy, an autosomal recessive disorder of unregulated and high insulin secretion. Mutations have also been associated with non-insulin-dependent diabetes mellitus type 2, an autosomal dominant disease of defective insulin secretion. 309
35905 350036 cd18592 ABC_6TM_MRP5_8_9_D1 Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have an additional N-terminal domain of five transmembrane segments (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 287
35906 350037 cd18593 ABC_6TM_MRP4_D1_like Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated protein 4 (MRP4) and similar proteins. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated protein 4 (MRP4), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have an additional N-terminal domain of five transmembrane segments (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 291
35907 350038 cd18594 ABC_6TM_CFTR_D1 Six-transmembrane helical domain 1 of Cystic Fibrosis Transmembrane Conductance Regulator. This group represents the six-transmembrane domain 1 (TMD1) of the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), which belongs to the ABCC subfamily. CFTR functions as a chloride channel, in contrast to other ABC transporters, and controls ion and water secretion and absorption in epithelial tissues. ABC proteins are formed from two homologous halves each containing a transmembrane domain (TMD) and a cytosolic nucleotide binding domain (NBD). In CFTR, these two TMD-NBD halves are linked by the unique regulatory (R) domain, which is not present in other ABC transporters. The ion channel only opens when its R-domain is phosphorylated by cyclic AMP-dependent protein kinase (PKA) and ATP is bound at the NBDs. Mutations in CFTR cause cystic fibrosis, the most common lethal genetic disorder in populations of Northern European descent. 291
35908 350039 cd18595 ABC_6TM_MRP1_2_3_6_D1_like Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 290
35909 350040 cd18596 ABC_6TM_VMR1_D1_like Six-transmembrane helical domain 1 (TMD1) of the yeast Vmr1p, Ybt1p and Nft1; ABCC subfamily. This group includes the six-transmembrane domain 1 (TMD1) of the yeast Vmr1p, Ybt1p and Nft1, all of which are ABC transporters of the MRP (multidrug resistance-associated protein) subfamily (ABCC). Yeast ABCC (also termed MRP/CFTR) subfamily includes six members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, Vmr1p, and Yor1p), of which three members (Ycf1p, Bpt1p and Yor1p) are not included here. While Yor1p, an oligomycin resistance ABC transporter, has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. Ybt1p was originally identified as a bile acid transporter and regulates membrane fusion through Ca2+ transport modulation. Ybt1p also plays a part in ade2 pigment transport. Moreover, Ybt1p has recently been shown to translocate phosphatidylcholine from the outer leaflet of the vacuole to the inner leaflet for degradation and choline recycling. Vmr1p, a vacuolar membrane protein, participates in the export of numerous growth inhibitors from the cell, such as cycloheximide, 2,4-dinitrophenol, cadmium and other toxic metals. Nft1p is not well characterized, but it is proposed to regulate Ycf1p, which is involved in heavy metal detoxification. 309
35910 350041 cd18597 ABC_6TM_YOR1_D1_like Six-transmembrane helical domain 1 (TMD1) of the yeast Yor1p and similar proteins; ABCC subfamily. This group includes the six-transmembrane domain 1 (TMD1) of the yeast Yor1p, an oligomycin resistance ABC transporter, and similar proteins. Members of this group belong to the MRP (multidrug resistance-associated protein) subfamily (ABCC). In addition to Yor1p, yeast ABCC (also termed MRP/CFTR) subfamily also comprises five other members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, and Vmr1p), which are not included in this group. Yor1p is a plasma membrane ATP-binding transporter that mediates export of many different organic anions including oligomycin. While Yor1p has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. 293
35911 350042 cd18598 ABC_6TM_MRP7_D1_like Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated protein 7, and similar proteins. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated protein 7 (MRP7), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 288
35912 350043 cd18599 ABC_6TM_MRP5_8_9_D2 Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 313
35913 350044 cd18600 ABC_6TM_CFTR_D2 Six-transmembrane helical domain 2 of Cystic Fibrosis Transmembrane Conductance Regulator. This group represents the six-transmembrane domain 2 (TMD2) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8/ABCC9), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids and polypeptides. All ABC transporters share a common architecture of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are comprised of two identical TMDs and two identical NBDs. 324
35914 350045 cd18601 ABC_6TM_MRP4_D2_like Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated protein 4 (MRP4) and similar proteins. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated protein 4 (MRP4), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 314
35915 350046 cd18602 ABC_6TM_SUR1_D2_like Six-transmembrane helical domain 2 (TMD2) of the sulphonylurea receptors SUR1/2. This group represents the six-transmembrane domain 2 (TMD2) of the sulphonylurea receptors SUR1/2 (ABCC8/ABCC9), which function as modulators of ATP-sensitive potassium channels and insulin release, and belong to the ABCC subfamily. The ATP-sensitive potassium (K-ATP) channel is an octameric complex of four pore-forming Kir6.2 subunits and four regulatory SUR subunits. Thus, in contrast to other ABC transporters, the SUR serves as the regulatory subunit of an ion channel. Mutations and deficiencies in the SUR proteins have been observed in patients with hyperinsulinemic hypoglycemia of infancy, an autosomal recessive disorder of unregulated and high insulin secretion. Mutations have also been associated with non-insulin-dependent diabetes mellitus type 2, an autosomal dominant disease of defective insulin secretion. 307
35916 350047 cd18603 ABC_6TM_MRP1_2_3_6_D2_like Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 296
35917 350048 cd18604 ABC_6TM_VMR1_D2_like Six-transmembrane helical domain 2 (TMD2) of the yeast Vmr1p, Ybt1p and Nft1; ABCC subfamily. This group includes the six-transmembrane domain 2 (TMD2) of the yeast Vmr1p, Ybt1p and Nft1, all of which are ABC transporters of the MRP (multidrug resistance-associated protein) subfamily (ABCC). Yeast ABCC (also termed MRP/CFTR) subfamily includes six members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, Vmr1p, and Yor1p), of which three members (Ycf1p, Bpt1p and Yor1p) are not included here. While Yor1p, an oligomycin resistance ABC transporter, has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. Ybt1p was originally identified as a bile acid transporter and regulates membrane fusion through Ca2+ transport modulation. Ybt1p also plays a part in ade2 pigment transport. Moreover, Ybt1p has recently been shown to translocate phosphatidylcholine from the outer leaflet of the vacuole to the inner leaflet for degradation and choline recycling. Vmr1p, a vacuolar membrane protein, participates in the export of numerous growth inhibitors from the cell, such as cycloheximide, 2,4-dinitrophenol, cadmium and other toxic metals. Nft1p is not well characterized, but it is proposed to regulate Ycf1p, which is involved in heavy metal detoxification. 297
35918 350049 cd18605 ABC_6TM_MRP7_D2_like Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated protein 7, and similar proteins. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated protein 7 (MRP7), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0). 300
35919 350050 cd18606 ABC_6TM_YOR1_D2_like Six-transmembrane helical domain 2 (TMD2) of the yeast Yor1p and similar proteins; ABCC subfamily. This group includes the six-transmembrane domain 2 (TMD2) of the yeast Yor1p, an oligomycin resistance ABC transporter, and similar proteins. Members of this group belong to the MRP (multidrug resistance-associated protein) subfamily (ABCC). In addition to Yor1p, yeast ABCC (also termed MRP/CFTR) subfamily also comprises five other members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, and Vmr1p), which are not included in this group. Yor1p is a plasma membrane ATP-binding transporter that mediates export of many different organic anions including oligomycin. While Yor1p has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. 290
35920 350119 cd18607 GH130 Glycoside hydrolase family 130. Members of the glycosyl hydrolase family 130, as classified by the carbohydrate-active enzymes database (CAZY), are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. 269
35921 350120 cd18608 GH43_F5-8_typeC-like Glycosyl hydrolase family 43 protein most having a F5/8 type C domain C-terminal to the GH43 domain. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having beta-xylosidase (EC 3.2.1.37), xylanase (EC 3.2.1.8), and beta-galactosidase (EC 3.2.1.145) activities, and some as F5/8 type C domain (also known as the discoidin (DS) domain)-containing proteins. Most contain a F5/8 type C domain C-terminal to the GH43 domain. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Characterized enzymes belonging to this subgroup include Lactobacillus brevis (LbAraf43) and Weissella sp (WAraf43) which show activity with similar catalytic efficiency on 1,5-alpha-L-arabinooligosaccharides with a degree of polymerization (DP) of 2-3; size is limited by an extended loop at the entrance to the active site. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 276
35922 350121 cd18609 GH32-like Glycosyl hydrolase family 32 family protein. The GH32 family contains glycosyl hydrolase family GH32 proteins that cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). This family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 303
35923 350122 cd18610 GH130_BT3780-like Glycosyl hydrolase family 130, such as beta-mannosidase BT3780 and BACOVA_03624. This subfamily contains glycosyl hydrolase family 130, as classified by the carbohydrate-active enzymes database (CAZY), and includes Bacteroides enzymes, BT3780 and BACOVA_03624. Members of this family possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. GH130 enzymes have also been shown to target beta-1,2- and beta-1,4-mannosidic linkages where these phosphorylases mediate bond cleavage by a single displacement reaction in which phosphate functions as the catalytic nucleophile. However, some lack the conserved basic residues that bind the phosphate nucleophile, as observed for the Bacteroides enzymes, BT3780 and BACOVA_03624, which are indeed beta-mannosidases that hydrolyze beta-1,2-mannosidic linkages through an inverting mechanism. 301
35924 350123 cd18611 GH130 Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. 289
35925 350124 cd18612 GH130_Lin0857-like Glycoside hydrolase family 130 such as Listeria innocua beta-1,2-mannobiose phosphorylase. This subfamily contains the glycosyl hydrolase family 130 (GH130), as classified by the carbohydrate-active enzymes database (CAZY), enzymes that are phosphorylases and hydrolases for beta-mannosides, and includes Listeria innocua beta-1,2-mannobiose phosphorylase (Lin0857). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Structure of Lin0857 shows beta-1,2-mannotriose bound in a U-shape, interacting with a phosphate analog at both ends. Lin0857 has a unique dimer structure connected by a loop, with a significant open-close loop displacement observed for substrate entry. A long loop, which is exclusively present in Lin0857, covers the active site to limit the pocket size. 261
35926 350125 cd18613 GH130 Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. 302
35927 350126 cd18614 GH130 Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. 276
35928 350127 cd18615 GH130 Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. 277
35929 350128 cd18616 GH43_ABN-like Glycosyl hydrolase family 43 such as arabinan endo-1,5-alpha-L-arabinosidase. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activity. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 291
35930 350129 cd18617 GH43_XynB-like Glycosyl hydrolase family 43, such as Bacteroides ovatus alpha-L-arabinofuranosidase (BoGH43, XynB). This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been characterized to have alpha-L-arabinofuranosidase (EC 3.2.1.55) and beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activities. Beta-1,4-xylosidases are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Also included in this subfamily are Bacteroides ovatus alpha-L-arabinofuranosidases, BoGH43A and BoGH43B, both having a two-domain architecture, consisting of an N-terminal 5-bladed beta-propeller domain harboring the catalytic active site, and a C-terminal beta-sandwich domain. However, despite significant functional overlap between these two enzymes, BoGH43A and BoGH43B share just 41% sequence identity. The latter appears to be significantly less active on the same substrates, suggesting that these paralogs may play subtly different roles during the degradation of xyloglucans from different sources, or may function most optimally at different stages in the catabolism of xyloglucan oligosaccharides (XyGOs), for example before or after hydrolysis of certain side-chain moieties. It also includes Phanerochaete chrysosporium BKM-F-1767 Xyl, a bifunctional xylosidase/arabinofuranosidase. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 285
35931 350130 cd18618 GH43_Xsa43E-like Glycosyl hydrolase family 43, including Butyrivibrio proteoclasticus arabinofuranosidase Xsa43E. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. This subgroup includes Cellvibrio japonicus arabinan-specific alpha-1,2-arabinofuranosidase, CjAbf43A, which confers its specificity by a surface cleft that is complementary to the helical backbone of the polysaccharide, and Butyrivibrio proteoclasticus GH43 enzyme Xsa43E, also an arabinofuranosidase, which has been shown to cleave arabinose side chains from short segments of xylan. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 275
35932 350131 cd18619 GH43_CoXyl43_like Glycosyl hydrolase family 43 protein such as metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Included in this subfamily is the metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43, which shows synergy with Trichoderma reesei cellulases and promotes plant biomass saccharification by degrading xylo-oligosaccharides, such as xylobiose and xylotriose, into the monosaccharide xylose. Studies show that the hydrolytic activity of CoXyl43 is stimulated in the presence of calcium. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 313
35933 350132 cd18620 GH43_XylA-like Glycosyl hydrolase family 43-like protein such as Clostridium stercorarium alpha-L-arabinofuranosidase XylA. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. The GH43_XylA-like subgroup includes Clostridium stercorarium alpha-L-arabinofuranosidase XylA, and enzymes that have been annotated as having beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), endo-alpha-L-arabinanase (EC 3.2.1.-) as well as arabinoxylan arabinofuranohydrolase (AXH) activities. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. 274
35934 350133 cd18621 GH32_XdINV-like glycoside hydrolase family 32 protein such as Xanthophyllomyces dendrorhous beta-fructofuranosidase (Inv;Xd-INV;XdINV). This subfamily of glycosyl hydrolase family GH32 includes fructan:fructan 1-fructosyltransferase (FT, EC 2.4.1.100) and beta-fructofuranosidase (invertase or Inv, EC 3.2.1.26), among others. These enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Xanthophyllomyces dendrorhous beta-fructofuranosidase (XdINV) also catalyzes the synthesis of fructooligosaccharides (FOS, a beneficial prebiotic), producing neo-FOS, making it an interesting biotechnology target. Structural studies show plasticity of its active site, having a flexible loop that is essential in binding sucrose and beta(2-1)-linked oligosaccharide, making it a valuable biocatalyst to produce novel bioconjugates. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 337
35935 350134 cd18622 GH32_Inu-like glycoside hydrolase family 32 protein such as Aspergillus ficuum endo-inulinase (Inu2). This subfamily of glycosyl hydrolase family GH32 includes endo-inulinase (Inu2, EC 3.2.1.7), exo-inulinase (Inu1, EC 3.2.1.80), invertase (EC 3.2.1.26), and levan fructotransferase (LftA, EC 4.2.2.16), among others. These enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 289
35936 350135 cd18623 GH32_ScrB-like glycoside hydrolase family 32 sucrose 6 phosphate hydrolase (sucrase). Glycosyl hydrolase family GH32 subgroup contains sucrose-6-phosphate hydrolase (sucrase, EC:3.2.1.26) among others. The enzyme cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 289
35937 350136 cd18624 GH32_Fruct1-like glycoside hydrolase family 32 protein such as Arabidopsis thaliana cell-wall invertase 1 (AtBFruct1;Fruct1;AtcwINV1;At3g13790). This subfamily of glycosyl hydrolase family GH32 includes fructan beta-(2,1)-fructosidase and fructan 1-exohydrolase IIa (1-FEH IIa, EC 3.2.1.153), cell-wall invertase 1 (EC 3.2.1.26), sucrose:fructan 6-fructosyltransferase (6-Sst/6-Dft, EC 2.4.1.10), and levan fructosyltransferases (EC 2.4.1.-) among others. This enzyme cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 296
35938 350137 cd18625 GH32_BfrA-like glycoside hydrolase family 32 protein such as Thermotoga maritima invertase (BfrA or Tm1414). This subfamily of glycosyl hydrolase family GH32 includes beta-fructosidase (invertase, EC 3.2.1.26) that cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 286
35939 349276 cd18626 CD_eEF3 chromodomain-like insertion in an ATPase domain of elongation factor eEF3. Eukaryotic elongation factor eEF3 (also known as EF-3, YEF3, and TEF3), a member of the ATP-binding cassette (ABC) family of proteins, is a ribosomal binding ATPase essential for fungal translation machinery. Until recently it was considered fungal-specific and therefore an attractive target for antifungal therapy; however, recent bioinformatics analysis indicates it may be more widely distributed among other unicellular eukaryotes, and translation elongation factor 3 activity has been demonstrated from a non-fungal species, Phytophthora infestans. eEF3 is a soluble factor lacking a transmembrane domain and having two ABC domains arranged in tandem, with a unique chromodomain inserted within the ABC2 domain. Chromodomain mutations in the ABC2 domain of eEF3 have been shown to reduce ATPase activity, but not ribosome binding. Thus, the chromodomain-like insertion is critical to eEF3 function. In addition to its elongation function, eEF3 has been shown to interact with mRNA in a translation independent manner, suggesting an additional, non-elongation function for this factor. 56
35940 349277 cd18627 CD_polycomb_like chromodomain of polycomb and chromobox family proteins. CHRomatin Organization Modifier (chromo) domain of Polycomb and Polycomb-group (PcG) chromobox (CBX) family proteins such as CBX2, CBX4, CBX6, CBX7, and CBX8. These CBX proteins are components of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. 49
35941 349278 cd18628 CD3_cpSRP43_like chromodomain 3 of chloroplast signal recognition particle 43 kDa protein, and similar proteins. This subgroup includes the chromodomain 3 of chloroplast SRP43 (cpSRP43), and similar proteins. CpSRP43 is a component of the chloroplast signal recognition particle (SRP) pathway. It forms a stable complex with cpSRP54 (cpSRP complex) which is required for the efficient posttranslational transport of members of the nuclearly encoded light harvesting chlorophyll-a/b-binding proteins (LHCPs) to the thylakoid membrane. Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromodomain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related chromo shadow domain. 51
35942 349279 cd18629 CD2_cpSRP43_like chromodomain 2 of chloroplast signal recognition particle 43 kDa protein, and similar proteins. This subgroup includes the chromodomain 2 of chloroplast SRP43 (cpSRP43), and similar proteins. CpSRP43 is a component of the chloroplast signal recognition particle (SRP) pathway. It forms a stable complex with cpSRP54 (cpSRP complex) which is required for the efficient posttranslational transport of members of the nuclearly encoded light harvesting chlorophyll-a/b-binding proteins (LHCPs) to the thylakoid membrane. Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromodomain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related chromo shadow domain. 48
35943 349280 cd18630 CD_Rhino chromodomain of Drosophila melanogaster Rhino, and similar proteins. N-terminal CHRomatin Organization Modifier (chromo) domain of Drosophila melanogaster Rhino (also known as heterochromatin protein 1-like), and similar proteins. Rhino is a female-specific protein that affects chromosome structure and egg polarity that is required for germline PIWI-interacting RNA (piRNA) production. In Drosophila the RDC (rhino, deadlock, and cutoff) complex, composed of rhino, the protein deadlock (Del) and the Rai1-like transcription termination cofactor cutoff (Cuff) binds to chromatin of dual-strand piRNA clusters, special genomic regions, which encode piRNA precursors. The RDC complex is anchored to H3K9me3-marked chromatin in part via the H3K9me3-binding activity of Rhino, and is required for transcription of piRNA precursors. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 51
35944 349281 cd18631 CD_HP1_like chromodomain of heterochromatin protein 1 proteins, including HP1alpha, HP1beta, and HP1gamma. CHRomatin Organization Modifier (chromo) domain of mammalian HP1alpha (Cbx5), HP1beta (Cbx1), HP1gamma (Cbx3), and similar proteins. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). 50
35945 349282 cd18632 CD_Clr4_like N-terminal chromodomain of the fission yeast histone methyltransferase Clr4, and similar proteins. N-terminal CHRomatin Organization Modifier (chromo) domain of cryptic loci regulator 4 (Clr4), a histone H3 lysine methyltransferase which targets H3K9. Clr4 regulates silencing and switching at the mating-type loci and affects chromatin structure at centromeres. Clr4 is a catalytic component of the rik1-associated E3 ubiquitin ligase complex that shows ubiquitin ligase activity and is required for histone H3K9 methylation. H3K9me represents a specific tag for epigenetic transcriptional repression by recruiting swi6/HP1 to methylated histones which leads to transcriptional silencing within centromeric heterochromatin, telomeric regions and at the silent mating-type loci. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 55
35946 349283 cd18633 CD_MMP8 chromodomain of M-phase phosphoprotein 8. The chromodomain of M-phase phosphoprotein 8 (MPP8), a component of the RanBPM-containing large protein complex, binds methylated H3K9. This may in turn recruit the H3K9 methyltransferases GLP and ESET, and DNA methyltransferase 3A to the promoter of the E-cadherin gene, mediating the E-cadherin gene silencing and promoting tumor cell motility and invasion. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 51
35947 349284 cd18634 CD_CDY chromodomain of the Chromodomain Y-like protein family. This group includes the chromodomain found in the mammalian chromodomain Y-like (CDY) protein family, and similar proteins. The human CDY family includes 6 proteins: the genes encoding four of these, two copies of CDY1 (CDY1a and CDY1b) and two copies of CDY2 (CDY2a and CDY2b), are located on chromosome Y, and the genes encoding the other two members (CDYL and CDYL2) are located on autosomes. The Y-chromosomal genes are only present in primates, whereas the CDYL and CDYL2 genes exist in most mammalian species. The CDY family proteins contain two functional domains: a chromodomain involved in chromatin binding and a catalytic domain found in many coenzyme A (CoA)-dependent acylation enzymes. CDYL is ubiquitously expressed, whereas CDYL2 shows selective expression in tissues of testis, prostate, spleen, and leukocyte. The CDYL genes are ubiquitously expressed, whereas the CDY genes are only expressed in the testis. Deletion of the CDY1b gene has been shown to be a risk factor for male infertility. Impairments in CDY2 expression could be implicated in the pathogenesis of maturation arrest (a failure of germ cell development). 52
35948 349285 cd18635 CD_CMT3_like chromodomain of chromomethylase 3, and similar proteins. CHRomatin Organization Modifier (chromo) domain of DNA (cytosine-5)-methyltransferase chromomethylase 3 (CMT3, EC:2.1.1.37), and similar proteins. CMT3 is primarily a CHG (where H is either A, T or C) methyltransferase and is predominantly expressed in actively replicating cells. The protein is involved in preferentially methylating transposon-related sequences, reducing their mobility. Studies suggest that in order to target DNA methylation, CMT3 associates with H3K9me2-containing nucleosomes through binding of its BAH- and chromo-domains to H3K9me2. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 57
35949 349286 cd18636 CD_Chp1_like chromodomain of chromodomain-containing protein 1, and similar proteins. CHRomatin Organization Modifier (chromo) domain of chromodomain-containing protein 1 (Chp1), and similar proteins. Chp1 is needed for RNA interference-dependent heterochromatin formation in fission yeast. Chp1 is a member of the RNA-induced transcriptional silencing (RITS) complex which maintains the heterochromatin regions. The chromodomain of the Chp1 component binds the histone H3 lysine 9 methylated tail (H3K9me) and the core of the nucleosome. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 52
35950 349287 cd18637 CD_Swi6_like chromodomain of fission yeast Swi6, and similar proteins. Fission yeast Swi6 protein is a structural and functional homolog of mammalian HP1 (heterochromatin protein 1) and is involved in the chromatin structure by binding to centromeres, telomeres, and the silent mating-type locus. Swi6 contains an N-terminal chromo (CHRomatin Organization MOdifier) domain and a C-terminal chromo shadow domain (CSD). Swi6 binds histone H3 tails methylated at Lys- and the cohesin subunit Psc3, leading to gene silencing and sister chromatid cohesion. It is also involved in the repression of the silent mating-type loci MAT2 and MAT3. Swi6 may compact MAT2/3 into a heterochromatin-like conformation which represses the transcription of these silent cassettes. Chromodomains mediate the interaction of the heterochromatin with other heterochromatin proteins, thereby affecting chromatin structure (e.g. Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2). CSDs have only been found in proteins that also possess a chromodomain. 54
35951 349288 cd18638 CD_EhHp1_like chromodomain of Entamoeba histolytica heterochromatin protein 1, and similar proteins. This subgroup includes the N-terminal CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 (HP1)-like protein from Entamoeba histolytica, and similar proteins. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 52
35952 349289 cd18639 CD_SUV39H1_like chromodomain of histone methyltransferase SUV39H1, and similar proteins. CHRomatin Organization Modifier (chromo) domain of human SUV39H1, a histone lysine methyltransferase (HMT) which catalyzes di- and tri-methylation of lysine 9 of histone H3 (H3K9me2/3), leading to heterochromatin formation and gene silencing. H3K9me2/3 represents a specific mark for epigenetic transcriptional repression by recruiting HP1 (CBX1, CBX3, and/or CBX5) proteins to methylated histones. SUV39H1 mainly functions in heterochromatin regions. The human SUV39H1/2, histone H3K9 methyltransferases, are the mammalian homologs of Drosophila Su(var)3-9 and Schizosaccharomyces pombe Clr4. SUV39H1 contains a chromodomain at its N-terminus and a SET domain at its C-terminus. Although the SET domain performs the catalytic activity, the chromodomain of SUV39H1 is essential for the catalytic activity of SUV39H1. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 49
35953 349290 cd18640 CD_Chro-like chromodomain of Drosophila melanogaster chromator chromodomain protein, and similar proteins. This subgroup includes the CHROMO (CHRomatin Organization Modifier) domain found in Drosophila melanogaster chromator (also known as Chriz/Chro) chromodomain protein, and similar proteins. Chromator is a nuclear protein that plays a role in proper spindle dynamics during mitosis. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 52
35954 350843 cd18641 CBD_RBP1_like chromo barrel domain of retinoblastoma binding protein 1, and similar proteins. Retinoblastoma-binding protein 1 (RBP1), also termed AT-rich interaction domain 4A, is a ubiquitously expressed nuclear protein. RBP1 is a tumor and leukemia suppressor that binds both methylated histone tails and DNA, and is involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. The chromo barrel domain of RBP1 has been reported to recognize histone H4K20me3 weakly, and this binding is enhanced by the simultaneous binding of DNA. RBP1 binds directly, with several other proteins, to retinoblastoma protein (pRB) which regulates cell proliferation; pRB represses transcription by recruiting RBP1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. 59
35955 350844 cd18642 CBD_MOF_like chromo barrel domain of Drosophila melanogaster males-absent on the first protein, and similar proteins. This subgroup includes the chromo barrel domains found in human Tat-interactive protein 60 (TIP60, also known as KAT5 or HTATIP), Drosophila melanogaster males-absent on the first (MOF) protein, and Saccharomyces ESA1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. The MOF-like chromo barrels may be auto-inhibited, i.e. they seem to have occluded peptide binding sites. 67
35956 350845 cd18643 CBD chromo barrel domain of MOF acetyltransferase, and similar proteins. This group includes the chromo barrel domains found in human Tat-interactive protein 60 (TIP60, also known as KAT5 or HTATIP), Drosophila melanogaster males-absent on the first (MOF) protein, human male-specific lethal (MSL) complex subunit 3 (MSL3), and retinoblastoma binding protein 1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. The chromo barrel domains include a MOF-like subgroup which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites. 61
35957 349291 cd18644 CD_polycomb chromodomain of polycomb. CHRomatin Organization Modifier (chromo) domain of the PcG (polycomb-group) chromodomain protein Polycomb (Pc) from Drosophila melanogaster, arthropod, worm, and sea cucumber, and similar proteins. Pc is a component of the Polycomb-group (PcG) multiprotein PRC1 complex, a complex class required to maintain the transcriptionally repressive state of many genes, including Hox genes, throughout development. The core subunits of PRC1 are polycomb (Pc), polyhomeotic (Ph), posterior sex combs (Psc), and sex comb extra (Sce, also known as dRing). Polycomb (Pc) plays a role in modulating life span in flies; it negatively regulates longevity. 54
35958 349292 cd18645 CD_Cbx4 chromodomain of chromobox homolog 4. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 4 (CBX4), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. In addition to a chromodomain with H3K27me3-binding activity, Cbx4 contains two SUMO-interacting motifs responsible for its small ubiquitin-related modifier (SUMO) E3 ligase activity. CBX proteins may act as an oncogene or tumor suppressor in a cell-type-dependent manner; for example, CBX8 promotes proliferation while suppressing metastasis in colorectal carcinoma progression. CBX4 may serve as a tumor suppressor in colorectal carcinoma, and has been shown to be an oncogene in osteosarcoma and breast cancer. 55
35959 349293 cd18646 CD_Cbx7 chromodomain of chromobox homolog 7. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 7 (CBX7), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. CBX proteins may act as an oncogene or tumor suppressor in a cell-type-dependent manner; for example, CBX8 promotes proliferation while suppressing metastasis in colorectal carcinoma progression. CBX7 has been shown to function as a tumor suppressor in lung carcinoma and an oncogene in gastric cancer and lymphoma. 56
35960 349294 cd18647 CD_Cbx2 chromodomain of chromobox homolog 2. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 2 (CBX2), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. 53
35961 349295 cd18648 CD_Cbx6 chromodomain of chromobox homolog 6. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 6 (CBX6), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. 58
35962 349296 cd18649 CD_Cbx8 chromodomain of chromobox homolog 8. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 8 (CBX8), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. CBX proteins may act as oncogenes or tumor suppressors in a cell-type-dependent manner; for example, CBX8 promotes proliferation while suppressing metastasis in colorectal carcinoma progression. 55
35963 349297 cd18650 CD_HP1beta_Cbx1 chromodomain of heterochromatin protein 1 homolog beta. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog beta (also known as HP1beta, CBX1, and chromobox 1), and related proteins. HP1beta is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta, and HP1gamma (also known as Cbx3). 50
35964 349298 cd18651 CD_HP1alpha_Cbx5 chromodomain of heterochromatin protein 1 homolog alpha. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog alpha (also known as HP1alpha, Cbx5, and Chromobox 5), and related proteins. HP1alpha has diverse functions in heterochromatin formation, gene regulation, and mitotic progression, and forms complex networks of gene, RNA, and protein interactions. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha, HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). 50
35965 349299 cd18652 CD_HP1gamma_Cbx3 chromodomain of heterochromatin protein 1 homolog gamma. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog gamma (also known as HP1gamma, Cbx3, and Chromobox 3), and related proteins. HP1gamma is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. In addition to being involved in transcriptional silencing in heterochromatin-like complexes, HP1gamma also binds lamin B receptor, an integral membrane protein found in the inner nuclear membrane. The dual binding functions of the protein may explain the association of heterochromatin with the inner nuclear membrane. HP1gamma is also recruited to sites of ultraviolet-induced DNA damage and double-strand breaks. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma. 50
35966 349300 cd18653 CD_HP1a_insect chromodomain of insect HP1a. CHRomatin Organization Modifier (chromo) domain of insect HP1a. HP1a is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. In Drosophila, there are at least five HP1 family proteins; this subgroup includes the CD of Drosophila melanogaster HP1a. 50
35967 349301 cd18654 CSD_HP1beta_Cbx1 chromo shadow domain of heterochromatin protein 1 homolog beta. Heterochromatin protein 1 homolog beta (also known as HP1beta, Cbx1, chromobox 1) is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1beta has a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1beta may play an important role in the epigenetic control of chromatin structure and gene expression. CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta, and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 58
35968 349302 cd18655 CSD_HP1alpha_Cbx5 chromo shadow domain of heterochromatin protein 1 homolog alpha. Chromo shadow domain (CSD) of heterochromatin protein 1 homolog alpha (also known as HP1alpha, Cbx5, and Chromobox 5), and related proteins. HP1alpha has diverse functions in heterochromatin formation, gene regulation, and mitotic progression, and forms complex networks of gene, RNA, and protein interactions. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha, HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 58
35969 349303 cd18656 CSD_HP1gamma_Cbx3 chromo shadow domain of heterochromatin protein 1 homolog gamma. Chromo shadow domain (CSD) of heterochromatin protein 1 homolog gamma (also known as HP1gamma, Cbx3, Chromobox 3), and related proteins. HP1gamma appears to be involved in transcriptional silencing in heterochromatin-like complexes. It binds histone H3 tails methylated at Lys-9, leading to epigenetic repression, and also binds lamin B receptor, an integral membrane protein found in the inner nuclear membrane. The dual binding functions of the protein may explain the association of heterochromatin with the inner nuclear membrane. HP1gamma is also recruited to sites of ultraviolet-induced DNA damage and double-strand breaks. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal CSD which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma. The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 58
35970 349304 cd18657 CSD_Swi6 chromo shadow domain of chromatin-associated protein Swi6. Chromo shadow domain (CSD) of fission yeast Swi6 protein. Swi6 is a structural and functional homolog of mammalian HP1 (heterochromatin protein 1) and is involved in the chromatin structure by binding to centromere, telomere and silent mating-type locus. Swi6 contains an N-terminal chromo (CHRomatin Organization MOdifier) domain and a C-terminal chromo shadow domain (CSD). Swi6 binds histone H3 tails methylated at Lys- and the cohesin subunit Psc3, leading to gene silencing and sister chromatid cohesion. It is also involved in the repression of the silent mating-type loci MAT2 and MAT3. Swi6 may compact MAT2/3 into a heterochromatin-like conformation which represses the transcription of these silent cassettes. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. 55
35971 349305 cd18658 CSD_HP1a_insect chromo shadow domain of insect heterochromatin protein 1a. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal CSD which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. In Drosophila, there are at least five HP1 family proteins; this subgroup includes the CSD of Drosophila melanogaster HP1a. 53
35972 349306 cd18659 CD2_tandem repeat 2 of paired tandem chromodomains. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 to CHD9, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 54
35973 349307 cd18660 CD1_tandem repeat 1 of paired tandem chromodomains. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 to CHD9, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 70
35974 349308 cd18661 CD2_tandem_CHD1-2_like repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 1 and 2, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 and CHD2. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 58
35975 349309 cd18662 CD2_tandem_CHD3-4_like repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 3 and 4, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD3 and CHD4, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. Human CHD3 (also named Mi-2 alpha) and CHD4 (also named Mi-2 beta) are coexpressed in many cell lines and tissues and may act as the motor subunit of the NuRD complex (nucleosome remodeling and deacetylase activities). The proteins form distinct CHD3- and CHD4-NuRD complexes that repress, as well as activate gene transcription of overlapping and specific target genes. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 55
35976 349310 cd18663 CD2_tandem_CHD5-9_like repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 5-9, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD5, CHD6, CHD7, CHD8, and CHD9. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. CHD6, CHD7, and CHD8 enzymes have been demonstrated to have different substrate specificities and remodeling activities. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 59
35977 349311 cd18664 CD2_tandem_ScCHD1_like repeat 2 of the paired tandem chromodomains of yeast chromodomain helicase DNA-binding protein 1, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 59
35978 349312 cd18665 CD1_tandem_CHD1_yeast_like repeat 1 of the paired tandem chromodomains of yeast chromodomain helicase DNA-binding protein 1, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 65
35979 349313 cd18666 CD1_tandem_CHD1-2_like repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 1 and 2, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 and CHD2. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 85
35980 349314 cd18667 CD1_tandem_CHD3-4_like repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 3 and 4, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD3 and CHD4. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. Human CHD3 (also named Mi-2 alpha) and CHD4 (also named Mi-2 beta) are coexpressed in many cell lines and tissues and may act as the motor subunit of the NuRD complex (nucleosome remodeling and deacetylase activities). The proteins form distinct CHD3- and CHD4-NuRD complexes that repress, as well as activate gene transcription of overlapping and specific target genes. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 79
35981 349315 cd18668 CD1_tandem_CHD5-9_like repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 5-9, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD5, CHD6, CHD7, CHD8, and CHD9. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. CHD6, CHD7, and CHD8 enzymes have been demonstrated to have different substrate specificities and remodeling activities. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 68
35982 349948 cd18669 M20_18_42 M20, M18 and M42 Zn-peptidases include aminopeptidases and carboxypeptidases. This family corresponds to the MEROPS MH clan families M18, M20, and M42. The peptidase M20 family contains exopeptidases, including carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase, dipeptidases such as bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). This family also includes the bacterial aminopeptidase peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. These peptidases generally hydrolyze the late products of protein degradation so as to complete the conversion of proteins to free amino acids. Glutamate carboxypeptidase hydrolyzes folate analogs such as methotrexate, and therefore can be used to treat methotrexate toxicity. Peptidase families M18 and M42 contain metallo-aminopeptidases. M18 (aspartyl aminopeptidase, DAP) family cleaves only unblocked N-terminal acidic amino-acid residues and is highly selective for hydrolyzing aspartate or glutamate residues. Some M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacyl-peptidase activity (i.e. hydrolysis of acylated N-terminal residues). 198
35983 350850 cd18670 PIN_Mut7-C-like PIN domain at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related domains. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily; PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this family includes some sequences belonging to two of these, PIN_10 and PIN_16. 65
35984 350238 cd18671 PIN_PRORP-Zc3h12a-like PIN domain of protein-only RNase P (PRORP), ribonuclease Zc3h12a, and related proteins. PRORPs catalyze the maturation of the 5' end of precursor tRNAs in eukaryotes. This family includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3; PRORP1 localizes to the chloroplast and the mitochondria, and PRORP2 and PRORP3 localize to the nucleus. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This PIN_PRORP-Zc3h12a-like family also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
35985 350239 cd18672 PIN_FAM120B-like FEN-like PIN domains of FAM120B (family with sequence similarity 120B) and related proteins. FAM120B (also known as CCPG, "constitutive coactivator of PPAR-gamma", PGCC1, "PPARgamma constitutive coactivator 1") is a constitutive coactivator of peroxisome proliferator-activated receptor (PPARgamma) that promotes adipogenesis in a PPARgamma-dependent manner. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 210
35986 350240 cd18673 PIN_XRN1-2-like FEN-like PIN domains of XRN1, XRN2, and related proteins. XRN1 (5'-3' exoribonuclease 1, also known as SEP1) is a processive 5'-3' exoribonuclease that degrades the body of transcripts in the major pathway of RNA decay; XRN2 (5'-3' exoribonuclease 2) is predominantly localized in the nucleus and recognizes single-stranded RNAs with a 5'-terminal monophosphate to degrade them processively to mononucleotides. XRN2 has a critical function to process maturation of 5.8S and 25S/28S rRNAs as well as degradation of some spacer fragments that are excised during rRNA maturation. Both XRN1 and XRN2 preferentially cleave 5'-monophosphorylated RNA. XRN2 is also known as Rat1p in yeast. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues); the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 240
35987 350241 cd18674 PIN_Pox_G5 FEN-like PIN domain of vaccinia virus G5 protein and related proteins. Poxvirus G5 nuclease is involved in DNA replication and double-strand break repair by homologous recombination. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 151
35988 350242 cd18675 PIN_SpAst1-like FEN-like PIN domain of Schizosaccharomyces pombe asteroid homolog 1 and related proteins. Schizosaccharomyces pombe Ast1 is a homologue of Drosophila Asteroid and human ASTE1. Ast1 appears to be involved in mounting a checkpoint response to endogenous damage in cells lacking Rad2 and Exo1. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 163
35989 350243 cd18676 PIN_asteroid-like FEN-like PIN domain of Drosophila melanogaster asteroid and related proteins. This subfamily includes Drosophila melanogaster asteroid protein which may function in EGF receptor signaling, and may play a role in compound eye morphogenesis. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 164
35990 350244 cd18677 PIN_MjVapC2-VapC6_like VapC-like PIN domain of Methanocaldococcus jannaschii VapC2, and VapC6, and related proteins. This subfamily includes Methanocaldococcus jannaschii VapC2 and VapC6. It belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 136
35991 350245 cd18678 PIN_MtVapC25_VapC33-like VapC-like PIN domain of Mycobacterium tuberculosis VapC25, VapC33, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC25, VapC29, VapC33, VapC37, and VapC39 toxins. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 140
35992 350246 cd18679 PIN_VapC-Af1683-like VapC-like PIN domain of VapC ribonuclease similar to Archaeoglobus fulgidus uncharacterized Af1683 protein. Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
35993 350247 cd18680 PIN_MtVapC20-like VapC-like PIN domain of Mycobacterium tuberculosis VapC20 and related proteins. M. tuberculosis VapC20 inhibits translation by site-specific cleavage of the universally conserved Sarcin-Ricin loop in 23S rRNA. This subfamily belongs to the VapC (virulence-associated protein C)-like nuclease family of the PIN domain-like superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. 131
35994 350248 cd18681 PIN_MtVapC27-VapC40_like VapC-like PIN domain of Mycobacterium tuberculosis VapC27, and VapC40, and related proteins. This subfamily includes Mycobacterium tuberculosis VapC27 and VapC40. It belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
35995 350249 cd18682 PIN_VapC-like Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 118
35996 350250 cd18683 PIN_VapC-like Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 125
35997 350251 cd18684 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 131
35998 350252 cd18685 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains and included distant subgroups; this subgroup includes some sequences belonging to one of these, PIN_14. 110
35999 350253 cd18686 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 119
36000 350254 cd18687 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_3. 118
36001 350255 cd18688 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 134
36002 350256 cd18689 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 125
36003 350257 cd18690 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_12. 134
36004 350258 cd18691 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 130
36005 350259 cd18692 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
36006 350260 cd18693 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_22. 129
36007 350261 cd18694 PIN_VapC-like uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_24. 133
36008 350262 cd18695 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_13. 118
36009 350263 cd18696 PIN_MtVapC26-like VapC-like PIN domain of Mycobacterium tuberculosis VapC26 and related proteins. Mycobacterium tuberculosis VapC26 cleaves 23S rRNA in the Sarcin-Ricin Loop; it is inhibited by the cognate VapB26 antitoxin. This subfamily belongs to the VapC (virulence-associated protein C)-like nuclease family of the PIN domain-like superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 132
36010 350264 cd18697 PIN_VapC_N-like VapC-like N-terminal PIN (DUF4935) domain of DUF4935 domain-containing proteins and related proteins. This subgroup includes the N-terminal PIN domain of DUF4935 domain-containing proteins, and is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 183
36011 350265 cd18698 PIN_VapC_C-like VapC-like C-terminus of DUF1308 domain in DUF1308 domain-containing proteins and related proteins. This subfamily includes the C-terminus of DUF1308 domain in DUF1308 domain-containing proteins, and is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 138
36012 350266 cd18699 PIN_VapC_like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_15. 129
36013 350267 cd18700 PIN_GNAT-like VapC-like PIN domain of uncharacterized GNAT family proteins. This subfamily includes uncharacterized GNAT family proteins having an N-terminal GNAT family N-acetyltransferase domain. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_17. 137
36014 350268 cd18701 PIN_VapC_like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_18. 144
36015 350269 cd18702 PIN_VapC_like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_19. 139
36016 350270 cd18703 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_20. 148
36017 350271 cd18704 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_21. 145
36018 350272 cd18705 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_23. 130
36019 350273 cd18706 PIN_STKc_like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily; includes the PIN domains of uncharacterized serine/threonine kinases. This subfamily includes the PIN domains of some uncharacterized proteins having serine/threonine kinase catalytic domains and annotated as serine/threonine kinases. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_25. 126
36020 350274 cd18707 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_26. 131
36021 350275 cd18708 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_28. 116
36022 350276 cd18709 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_4-1. 196
36023 350277 cd18710 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_4-2. 130
36024 350278 cd18711 PIN_VapC-like_DUF411 VapC-like PIN (DUF411) domain in DUF411 domain-containing proteins and related proteins. This subfamily includes the DUF411 PIN domain in proteins annotated as DUF411 domain-containing proteins. It is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 146
36025 350279 cd18712 PIN_VapC-like_DUF411 VapC-like PIN (DUF411) domain in DUF411 domain-containing proteins and related proteins. This subfamily includes the DUF411 PIN domain in proteins annotated as DUF411 domain-containing proteins. It is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 148
36026 350280 cd18713 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_27. 140
36027 350281 cd18714 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_8. 228
36028 350282 cd18715 PIN_VapC-like uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 125
36029 350283 cd18716 PIN_SSO1118-like VapC-like PIN domain of Sulfolobus solfataricus SSO1118 and related proteins. This subfamily includes the functionally uncharacterized protein SSO1118 from the hyperthermophilic archaeon Sulfolobus solfataricus P2. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 102
36030 350284 cd18717 PIN_ScNmd4p-like VapC-like PIN domain of Saccharomyces cerevisiae Nmd4p and related proteins. Saccharomyces cerevisiae Nmd4p may be involved in nonsense-mediated mRNA decay. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 150
36031 350285 cd18718 PIN_PRORP PIN domain of protein-only RNase P (PRORP) and related proteins. PRORPs catalyze the maturation of the 5' end of precursor tRNAs in eukaryotes. This family includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3; PRORP1 localizes to the chloroplast and the mitochondria, and PRORP2 and PRORP3 localize to the nucleus. This subfamily belongs to the PRORP-Zc3h12a-like PIN family which in addition includes Zc3h12a (also known as MCPIP1/MCP induced protein 1 and Regnase-1), Caenorhabditis elegans REGE-1 (REGnasE-1), Zc3h12b-d (also known as Regnase-2-4), N4BP1, and NYNRIN (also known as CGIN1). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 124
36032 350286 cd18719 PIN_Zc3h12a-N4BP1-like PRORP-like PIN domain of ribonuclease Zc3h12a, NEDD4-binding partner-1, and related proteins. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the PRORP-Zc3h12a-like PIN family which in addition includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36033 350287 cd18720 PIN_YqxD-like LabA-like PIN domain of uncharacterized Bacillus subtilis YqxD and related proteins. This subfamily includes the PIN domain of uncharacterized Bacillus subtilis YqxD (also known as YqfM) and Escherichia coli YaiI. In Firmicutes such as Bacillus and Listeria, YqxD proteins are encoded within RNA polymerase major sigma43 operons, whereas E. coli YaiI is transcribed as a monocistron. This subfamily belongs to the LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 96
36034 350288 cd18721 PIN_ZNF451-like LabA-like PIN domain of human zinc finger protein 451 and related proteins. Human ZNF451 (also known as COASTER) functions as a transcriptional cofactor in promyelocytic leukemia bodies in the nucleus, it acts as a coactivator or corepressor, depending on the factors with which it interacts. ZNF451 interacts with p300 by the PIN-like domain and down regulates TGF-beta signaling in a p300-dependent and sumoylation-independent manner. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_11. 117
36035 350289 cd18722 PIN_NicB-like LabA-like PIN domain of Pseudomonas putida S16 NicB and related proteins. Curiously NicB from Pseudomonas putida S16 is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine. This subfamily also includes the uncharacterized CPP15 (plasmid) protein from Campylobacter jejuni. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 117
36036 350290 cd18723 PIN_LabA-like uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_7. 110
36037 350291 cd18724 PIN_LabA-like uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 172
36038 350292 cd18725 PIN_LabA-like uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 131
36039 350293 cd18726 PIN_LabA-like uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 113
36040 350294 cd18727 PIN_Swt1-like VapC-like PIN domain of Saccharomyces cerevisiae Swt1p, human SWT1 and related proteins. Saccharomyces cerevisiae mRNA-processing endoribonuclease Swt1p plays an important role in quality control of nuclear mRNPs in eukaryotes. Human transcriptional protein SWT1 (RNA endoribonuclease homolog, also known as HsSwt1, C1orf26, and chromosome 1 open reading frame 26) is an RNA endonuclease that participates in quality control of nuclear mRNPs and can associate with the nuclear pore complex (NPC). This subfamily belongs to the Smg5 and Smg6-like PIN domain family. Smg5 and Smg6 are essential factors in NMD, a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal-ion dependent, degradation of single-stranded RNA, in vitro. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. 141
36041 350295 cd18728 PIN_N4BP1-like PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins. NEDD4-binding partner-1 (N4BP1) interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally down-regulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. This subfamily additionally includes NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36042 350296 cd18729 PIN_Zc3h12-like PRORP-like PIN domain of ribonuclease Zc3h12a and related proteins. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4. It belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 131
36043 350297 cd18730 PIN_PH0500-like VapC-like PIN-domain of Pyrococcus horikoshii protein PH0500 and related proteins. This subfamily includes Pyrococcus horikoshii protein PH0500, a protein with possible exonuclease activity and involvement in DNA or RNA editing. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36044 350298 cd18731 PIN_NgFitB-like VapC-like PIN domain of Neisseria gonorrhoeae FitB and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Neisseria gonorrhoeae FitB toxin of the FitAB toxin/antitoxin (TA) system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. In N. gonorrhoeae, FitA and FitB form a heterodimer: FitA is the DNA binding subunit and FitB contains a ribonuclease activity that is blocked by the presence of FitA. A tetramer of FitAB heterodimers binds DNA from the fitAB upstream promoter region with high affinity. This results in both sequestration of FitAB and repression of fitAB transcription. It is thought that FitAB release from the DNA and subsequent dissociation both slows N. gonorrhoeae replication and transcytosis by an as yet undefined mechanism. The toxin M. tuberculosis VapC is a structural homolog of N. gonorrhoeae FitB, but their antitoxin partners, VapB and FitA, respectively, differ structurally. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 136
36045 350299 cd18732 PIN_MtVapC4-C5_like VapC-like PIN domain of Mycobacterium tuberculosis VapC4, VapC5, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 toxin of the VapBC toxin/antitoxin (TA) system. This family belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. M. tuberculosis VapC4 interacts with, and cleaves tRNA44Cys-GCA. M. tuberculosis VapC5 has endonucleolytic activity with RNA, this activity is low with dsRNA, and no activity has been demonstrated on dsDNA. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 124
36046 350300 cd18733 PIN_RfVapC1-like VapC-like PIN domain of Rickettsia felis VapC1 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Rickettsia felis VapC1, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 131
36047 350301 cd18734 PIN_RfVapC2-like VapC-like PIN domain of Rickettsia felis VapC2 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Rickettsia felis VapC2, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. Rickettsia felis VapC2 cleaves single-stranded RNA. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
36048 350302 cd18735 PIN_HiVapC1-like VapC-like PIN domain of Haemophilus influenzae VapC1 and related proteins. Haemophilus influenzae VapC1 has endonucleolytic activity with RNA, it cleaves initiator tRNA between the anticodon stem and loop, but does not cleave mRNA, rRNA or tmRNA, and has no activity on ssDNA or dsDNA. This subfamily belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. FitB is a toxin of the FitAB TA system. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
36049 350303 cd18736 PIN_CcVapC1-like VapC-like PIN domain of Caulobacter crescentus VapC1 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Caulobacter crescentus VapC1, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. This subfamily belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. FitB is a toxin of the FitAB TA system. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 123
36050 350304 cd18737 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 115
36051 350305 cd18738 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 118
36052 350306 cd18739 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 124
36053 350307 cd18740 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36054 350308 cd18741 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 120
36055 350309 cd18742 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36056 350310 cd18743 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36057 350311 cd18744 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
36058 350312 cd18745 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 130
36059 350313 cd18746 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 133
36060 350314 cd18747 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 132
36061 350315 cd18748 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
36062 350316 cd18749 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36063 350317 cd18750 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36064 350318 cd18751 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 133
36065 350319 cd18752 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 130
36066 350320 cd18753 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 123
36067 350321 cd18754 PIN_VapC4-5_FitB-like uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
36068 350322 cd18755 PIN_MtVapC3_VapC21-like VapC-like PIN domain of Mycobacterium tuberculosis VapC3, VapC21 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC3 and VapC21 toxins. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36069 350323 cd18756 PIN_MtVapC15-VapC11-like VapC-like PIN domain of Mycobacterium tuberculosis VapC11, VapC15, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC11 and VapC15 toxins. M. tuberculosis VapC11 and VapC15 cleave tRNA3 Leu-CAG, VapC11 may additionally cleave tRNA13Leu-GAG and tRNA10Gln-CTG. This subgroup belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
36070 350324 cd18757 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
36071 350325 cd18758 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36072 350326 cd18759 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36073 350327 cd18760 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36074 350328 cd18761 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 127
36075 350329 cd18762 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 130
36076 350330 cd18763 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 128
36077 350331 cd18764 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 129
36078 350332 cd18765 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 131
36079 350333 cd18766 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 130
36080 350334 cd18767 PIN_MtVapC3-like uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 126
36081 350335 cd18768 PIN_MtVapC4-C5-like VapC-like PIN domain of Mycobacterium tuberculosis VapC4, VapC5, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 toxin of the VapBC toxin/antitoxin (TA) system. This family belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. M. tuberculosis VapC4 interacts with and cleaves tRNA44Cys-GCA. M. tuberculosis VapC5 has endonucleolytic activity with RNA; this activity is low with dsRNA, and no activity has been demonstrated on dsDNA. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 123
36082 350851 cd18769 PIN_Mut7-C-like uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and was originally named for its sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_10. 85
36083 350852 cd18770 PIN_Mut7-C-like uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and was originally named for its sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_16. 80
36084 350853 cd18771 PIN_Mut7-C-like uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and was originally named for its sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. 62
36085 350854 cd18772 PIN_Mut7-C-like Mut7-C-like family of the PIN domain superfamily similar to the PIN domain found at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related proteins. This Mut7-C-like subgroup of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and was originally named for its sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. 81
36086 350341 cd18773 PDC1_HK_sensor first PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins, diguanylate-cyclase and similar domains. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution. 125
36087 350342 cd18774 PDC2_HK_sensor second PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins, diguanylate-cyclase and similar domains. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution. 89
36088 350651 cd18775 SafA-like Saf-pilin pilus formation protein SafA. This subfamily is composed of Saf-pilin pilus formation protein SafA from Salmonella enterica and similar proteins. SafA is the major subunit of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. In addition to safA, the saf operon is also composed of safB (periplasmic chaperone), safC (outer membrane usher), and safD (minor subunit). SafA and SafD subunits are transported from the cytoplasm into the periplasm via the SEC machinery, and the periplasmic chaperone SafB donates its G1 strand to complete the correct folding of SafA or SafD. In Saf pili assembly, the N-terminal extension (NTE) of an incoming SafA replaces the G1 strand (in SafB) via a zip-in-zip-out mechanism (also called donor-strand complementation or exchange) to form the polymer of SafD-(SafA)n (n > 100). 122
36089 350652 cd18776 AfaD-like AfaD and similar proteins. This subfamily consists of Escherichia coli AfaD, Salmonella SafD, and similar proteins. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. SafD is the minor subunit of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. In addition to safD, the saf operon is also composed of safA (major subunit), safB (periplasmic chaperone), and safC (outer membrane usher). Also included is the enteroaggregative Escherichia coli AAF/IV pilus tip protein, which is implicated in adhesion as well. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE). 118
36090 350653 cd18777 PsaA_MyfA Fimbrial subunit PsaA, MyfA, and similar proteins. This subfamily is composed of Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Psa and Myf specifically recognize beta1-3- or beta1-4-linked galactose in glycosphingolipids, but while Psa also binds phosphatidylcholine, Myf does not. Psa has acquired a tyrosine-rich surface that enables it to bind to phosphatidylcholine and mediate adhesion of Y. pestis/pseudotuberculosis to alveolar cells. Myf has specialized as a carbohydrate-binding adhesin, facilitating the attachment of Y. enterocolitica to intestinal cells. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE). 110
36091 350051 cd18778 ABC_6TM_exporter_like Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis. 293
36092 350052 cd18779 ABC_6TM_T1SS_like uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ATP-binding cassette subunit in the type 1 secretion systems, and similar proteins. Uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) and similar proteins. These transporter subunits include HlyB, PrtD, CyaB, CvaB, RsaD, HasD, LipB, and LapB, among many others. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). Most targeted proteins are not cleaved at the N terminus, but rather carry signals located toward the extreme C terminus to direct type I secretion. However, the 10 kDa Escherichia coli colicin V (CvaB) targets the ABC transporter using a cleaved, N-terminal signal sequence. Almost all transport substrates of the type I system have critical functions in attacking host cells either directly or by being essential for host colonization. The ABC-dependent T1SS transports various molecules, from ions and drugs to proteins of various sizes up to 900 kDa. The molecules secreted vary in size from the small Escherichia coli peptide colicin V (10 kDa) to the Pseudomonas fluorescens cell adhesion protein LapA of 520 kDa. The best characterized are the RTX toxins such as the adenylate cyclase (CyaA) toxin from Bordetella pertussis, the causative agent of whooping cough, and the lipases such as LipA. Type I secretion is also involved in export of non-protein substrates such as cyclic beta-glucans and polysaccharides. 294
36093 350053 cd18780 ABC_6TM_AtABCB27_like Six-transmembrane helical domain (6-TMD) of the Arabidopsis ABC transporter B family member 27 and similar proteins. This group includes Arabidopsis ABC transporter B family member 27 (also known as AtABCB27, aluminum tolerance-related ATP-binding cassette transporter, transporter associated with antigen processing-like protein 2, AtTAP2, and ALS1) which may play a role in aluminum resistance. The ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family includes ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection, as well as ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. 295
36094 350054 cd18781 ABC_6TM_AarD_CydDC_like uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC cysteine/GSH transporter CydDC, and similar proteins. This subgroup belongs to the ABC_6TM_AarD_CydDC_like subgroup of the ABC_6TM exporter family. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristics consistent with a defect in the cytochrome d oxidase. The CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs). The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. 290
36095 350055 cd18782 ABC_6TM_PrtD_LapB_HlyB_like uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. Uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin. 294
36096 350056 cd18783 ABC_6TM_PrtD_LapB_HlyB_like uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. Uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin. 294
36097 350057 cd18784 ABC_6TM_ABCB9_like Six-transmembrane helical domain (6-TMD) of ATP-binding cassette sub-family B member 9 and similar proteins. ATP-binding cassette sub-family B member 9 is also known as transporter associated with antigen processing, TAP-like protein, TAPL, and ABCB9. It is a half-transporter that forms a homodimeric lysosomal peptide transport complex. It belongs to the ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit. 289
36098 350172 cd18785 SF2_C C-terminal helicase domain of superfamily 2 DEAD/H-box helicases. Superfamily (SF)2 helicases include DEAD-box helicases, UvrB, RecG, Ski2, Sucrose Non-Fermenting (SNF) family helicases, and dicer proteins, among others. Similar to SF1 helicases, they do not form toroidal structures like SF3-6 helicases. SF2 helicases are a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Their helicase core is surrounded by C- and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains, or domains engaged in protein-protein interactions. The core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 77
36099 350173 cd18786 SF1_C C-terminal helicase domain of superfamily 1 DEAD/H-box helicases. Superfamily (SF)1 family members include UvrD/Rep, Pif1-like, and Upf-1-like proteins. Similar to SF2 helicases, they do not form toroidal, predominantly hexameric structures like SF3-6. SF1 helicases are a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Their helicase core is surrounded by C- and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains, or domains engaged in protein-protein interactions. The core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 89
36100 350174 cd18787 SF2_C_DEAD C-terminal helicase domain of the DEAD box helicases. DEAD-box helicases comprise a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis, and RNA degradation. They are superfamily (SF)2 helicases that, similar to SF1, do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 131
36101 350175 cd18788 SF2_C_XPD C-terminal helicase domain of xeroderma pigmentosum group D (XPD) family DEAD-like helicases. The xeroderma pigmentosum group D (XPD)-like family members are DEAD-box helicases belonging to superfamily (SF)2. This family includes DDX11 (also called ChlR1), a protein involved in maintaining chromosome transmission fidelity and genome stability, the TFIIH basal transcription factor complex XPD subunit, and FANCJ (also known as BRIP1), a DNA helicase required for the maintenance of chromosomal stability. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 159
36102 350176 cd18789 SF2_C_XPB C-terminal helicase domain of XPB-like helicases. TFIIH basal transcription factor complex helicase XPB (xeroderma pigmentosum type B) subunit (also known as DNA excision repair protein ERCC-3 or TFIIH 89 kDa subunit) is the ATP-dependent 3'-5' DNA helicase component of the core-TFIIH basal transcription factor, involved in nucleotide excision repair (NER) of DNA and, when complexed to CAK, in RNA transcription by RNA polymerase II. XPB is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 153
36103 350177 cd18790 SF2_C_UvrB C-terminal helicase domain of the UvrB family helicases. Excinuclease ABC subunit B (or UvrB) plays a central role in nucleotide excision repair (NER). Together with other components of the NER system, like UvrA, UvrC, UvrD (helicase II), and DNA polymerase I, it recognizes and cleaves damaged DNA in a multistep ATP-dependent reaction. UvrB is critical for the second phase of damage recognition by verifying the nature of the damage and forming the pre-incision complex. Its ATPase site becomes activated in the presence of UvrA and damaged DNA. Its activity is strand destabilization via distortion of the DNA at lesion site, with very limited DNA unwinding. UvrB is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 171
36104 350178 cd18791 SF2_C_RHA C-terminal helicase domain of the RNA helicase A (RHA) family helicases. The RNA helicase A (RHA) family includes RHA, also called DEAH-box helicase 9 (DHX9), DHX8, DHX15-16, DHX32-38, and many others. The RHA family members are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 171
36105 350179 cd18792 SF2_C_RecG_TRCF C-terminal helicase domain of the RecG family helicases. The DEAD-like helicase RecG family contains recombination factor RecG and transcription-repair coupling factor TrcF. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 160
36106 350180 cd18793 SF2_C_SNF C-terminal helicase domain of the SNF family helicases. The Sucrose Non-Fermenting (SNF) family includes chromatin-remodeling factors, such as CHD proteins and SMARCA proteins, recombination proteins Rad54, and many others. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 135
36107 350181 cd18794 SF2_C_RecQ C-terminal helicase domain of the RecQ family helicases. The RecQ helicase family is an evolutionarily conserved class of enzymes, dedicated to preserving genomic integrity by operating in telomere maintenance, DNA repair, and replication. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 134
36108 350182 cd18795 SF2_C_Ski2 C-terminal helicase domain of the Ski2 family helicases. Ski2-like RNA helicases play an important role in RNA degradation, processing, and splicing pathways. This family includes spliceosomal Brr2 RNA helicase, ASCC3 (involved in the repair of N-alkylated nucleotides), Mtr4 (involved in processing of structured RNAs), DDX60 (involved in viral RNA degradation), and other proteins. Ski2-like RNA helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 154
36109 350183 cd18796 SF2_C_LHR C-terminal helicase domain of LHR family helicases. Large helicase-related protein (LHR) is a DNA damage-inducible helicase that uses ATP hydrolysis to drive unidirectional 3'-to-5' translocation along single-stranded DNA (ssDNA) and to unwind RNA:DNA duplexes. This group also includes related bacterial and archaeal helicases. LHR family helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 150
36110 350184 cd18797 SF2_C_Hrq C-terminal helicase domain of HrQ family helicases. Yeast Hrq1, similar to RecQ4, plays a role in DNA inter-strand crosslink (ICL) repair and in telomere maintenance. Hrq1 lacks the Sld2-like domain found in RecQ4. HrQ family helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 146
36111 350185 cd18798 SF2_C_reverse_gyrase C-terminal helicase domain of the reverse gyrase. Reverse gyrase modifies the topological state of DNA by introducing positive supercoils in an ATP-dependent process. Reverse gyrase is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 174
36112 350186 cd18799 SF2_C_EcoAI-like C-terminal helicase domain of EcoAI HsdR-like restriction enzyme family helicases. This family is composed of helicase restriction enzymes, including the HsdR subunit of restriction-modification enzymes such as Escherichia coli type I restriction enzyme EcoAI R protein (R.EcoAI). The EcoAI enzyme recognizes 5'-GAGN(7)GTCA-3'. The HsdR or R subunit is required for both nuclease and ATPase activities, but not for modification. These proteins are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 116
36113 350187 cd18800 SF2_C_EcoR124I-like C-terminal helicase domain of EcoR124I HsdR-like restriction enzyme family helicases. This family is composed of helicase restriction enzymes, including the HsdR subunit of restriction-modification enzymes such as Escherichia coli type I restriction enzyme EcoR124I R protein. EcoR124I recognizes the sequence, 5'-GAAN(6)RTCG-3', and cleaves at random sites. The HsdR or R subunit is required for both nuclease and ATPase activities, but not for modification. These proteins are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 82
36114 350188 cd18801 SF2_C_FANCM_Hef C-terminal helicase domain of Fanconi anemia group M family helicases. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. Hef (helicase-associated endonuclease fork-structure) belongs to the XPF/MUS81/FANCM family of endonucleases and is involved in stalled replication fork repair. FANCM and Hef are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 143
36115 350189 cd18802 SF2_C_dicer C-terminal helicase domain of the endoribonuclease Dicer. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). In concert with Argonautes, these small RNAs bind complementary mRNAs to down-regulate their expression. miRNAs are processed by Dicer from small hairpins, while siRNAs are typically processed from longer dsRNA, from endogenous sources, or exogenous sources such as viral replication intermediates. Some organisms, such as Homo sapiens and Caenorhabditis elegans, encode one Dicer that generates miRNAs and siRNAs, but other organisms have multiple dicers with specialized functions. Dicer exists throughout eukaryotes, and a subset has an N-terminal helicase domain of the RIG-I-like receptor (RLR) subgroup. RLRs often function in innate immunity and Dicer helicase domains sometimes show differences in activity that correlate with roles in immunity. Dicer helicase domains are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 142
36116 350190 cd18803 SF2_C_secA C-terminal helicase domain of the protein translocase subunit secA. SecA is a component of the Sec translocase that transports the vast majority of bacterial and ER-exported proteins. SecA binds both the signal sequence and the mature domain of the preprotein emerging from the ribosome. SecA is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 141
36117 350191 cd18804 SF2_C_priA C-terminal helicase domain of ATP-dependent helicase PriA. PriA, also known as replication factor Y or primosomal protein N', is a 3'-->5' DNA helicase that acts to remodel stalled replication forks and as a specificity factor for origin-independent assembly of a new replisome at the stalled fork. PriA is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 238
36118 350192 cd18805 SF2_C_suv3 C-terminal helicase domain of ATP-dependent RNA helicase. The SUV3 (suppressor of Var 3) gene encodes a DNA and RNA helicase, which is localized in mitochondria and is a subunit of the degradosome complex involved in regulation of RNA surveillance and turnover. SUV3 exhibits DNA and RNA-dependent ATPase, DNA and RNA-binding and DNA and RNA unwinding activities. SUV3 is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 135
36119 350193 cd18806 SF2_C_viral C-terminal helicase domain of viral helicase. Viral helicases in this family are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 145
36120 350194 cd18807 SF1_C_UvrD C-terminal helicase domain of UvrD family helicases. UvrD is a highly conserved helicase involved in mismatch repair, nucleotide excision repair, and recombinational repair. It plays a critical role in maintaining genomic stability and facilitating DNA lesion repair in many prokaryotic species including Helicobacter pylori and Escherichia coli. This family also includes ATP-dependent helicase/nuclease AddA and helicase/nuclease RecBCD subunit RecB, among others. UvrD family helicases are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 150
36121 350195 cd18808 SF1_C_Upf1 C-terminal helicase domain of Upf1-like family helicases. The Upf1-like helicase family includes UPF1, HELZ, Mov10L1, Aquarius, IGHMBP2 (SMUBP2), and similar proteins. They are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 184
36122 350196 cd18809 SF1_C_RecD C-terminal helicase domain of RecD family helicases. RecD is a member of the RecBCD (EC 3.1.11.5, Exonuclease V) complex. It is the alpha chain of the complex and functions as a 3'-5' helicase. The RecBCD enzyme is both a helicase that unwinds, or separates the strands of DNA, and a nuclease that makes single-stranded nicks in DNA. RecD family helicases are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 80
36123 350197 cd18810 SF2_C_TRCF C-terminal helicase domain of the transcription-repair coupling factor. Transcription-repair coupling factor (TrcF) dissociates transcription elongation complexes blocked at nonpairing lesions and mediates recruitment of DNA repair proteins. TrcF is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 151
36124 350198 cd18811 SF2_C_RecG C-terminal helicase domain of DNA helicase RecG. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. RecG helps process Holliday junction intermediates to mature products by catalyzing branch migration. It is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 159
36125 349406 cd18812 CAP_PI15-like CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 15 and similar proteins. This family is composed of peptidase inhibitor 15 (PI15), peptidase inhibitor R3HDML, cysteine-rich secretory protein LCCL domain-containing 1 (CRISPLD1), and cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2). PI15 is a serine protease inhibitor which displays weak inhibitory activity against trypsin and may play a role in facial patterning during embryonic development. The PI15 gene is a candidate gene for abdominal aortic internal elastic lamina ruptures in the rat. R3HDML is a putative serine protease inhibitor, whose gene may be associated with clinical dimensions of schizophrenia. CRISPLD1 may play a role in NSCLP (nonsyndromic cleft lip with or without cleft palate) through the interaction with CRISPLD2 and folate pathway genes. CRISPLD2 plays a role in the etiology of NSCLP and is required for neural crest cell migration and cell viability during craniofacial development. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 146
36126 349407 cd18813 CAP_CRISPLD1 CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory protein LCCL domain-containing 1. Cysteine-rich secretory protein LCCL domain-containing 1 (CRISPLD1) is also called cysteine-rich secretory protein 10 (CRISP-10), CocoaCrisp, LCCL domain-containing cysteine-rich secretory protein 1 (LCRISP1), or CAP and LCCL domain containing protein 1 (CAPLD1). CRISPLD1 is clearly distinct from CRISPs because it does not contain the 10 absolutely conserved cysteines or the ICR (ion channel regulator) domain of the CRISPs. It may play a role in NSCLP (nonsyndromic cleft lip with or without cleft palate) through the interaction with CRISPLD2 and folate pathway genes. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 146
36127 349408 cd18814 CAP_PI15 CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 15. Peptidase inhibitor 15 (PI15) is also called 25 kDa trypsin inhibitor (p25TI), cysteine-rich secretory protein 8 (CRISP-8), or SugarCrisp. It is a serine protease inhibitor which displays weak inhibitory activity against trypsin and may play a role in facial patterning during embryonic development. The PI15 gene is a candidate gene for abdominal aortic internal elastic lamina ruptures in the rat. PI15 may also participate in the regulation of drug resistance in ovarian cancer and serve as a potential target in targeted therapies. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 146
36128 349409 cd18815 CAP_R3HDML CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor R3HDML. Peptidase inhibitor R3HDML, also called cysteine-rich secretory protein R3HDML, is a putative serine protease inhibitor. The R3HDML gene may be associated with clinical dimensions of schizophrenia. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 146
36129 349410 cd18816 CAP_CRISPLD2 CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory protein LCCL domain-containing 2. Cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) is also called cysteine-rich secretory protein 11 (CRISP-11), LCCL domain-containing cysteine-rich secretory protein 2 (LCRISP2), or CAP and LCCL domain containing protein 2 (CAPLD2). It plays a role in the etiology of NSCLP (non-syndromic cleft lip with or without cleft palate). It is required for neural crest cell migration and cell viability during craniofacial development. The CRISPLD2 gene has been identified as a glucocorticoid responsive gene that modulates cytokine function in airway smooth muscle cells. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. 146
36130 350138 cd18817 GH43f_LbAraf43-like Glycosyl hydrolase family 43 such as Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with alpha-L-arabinofuranosidase (EC 3.2.1.55) activity. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Characterized enzymes belonging to this subgroup include Lactobacillus brevis (LbAraf43) and Weissella sp (WAraf43) which show activity with similar catalytic efficiency on 1,5-alpha-L-arabinooligosaccharides with a degree of polymerization (DP) of 2-3; size is limited by an extended loop at the entrance to the active site. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 262
36131 350139 cd18818 GH43_GbtXyl43B-like Glycosyl hydrolase family 43 such as Geobacillus thermoleovorans IT-08 beta-xylosidase/exo-xylanase (GbtXyl43B). This glycosyl hydrolase family 43 (GH43) subgroup includes the characterized enzymes Geobacillus thermoleovorans IT-08 beta-xylosidase (EC 3.2.1.37) / exo-xylanase (GbtXyl43B), and Paenibacillus sp. strain E18 alpha-L-arabinofuranosidase (EC 3.2.1.55) Abf43B. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 273
36132 350140 cd18819 GH43_LbAraf43-like Glycosyl hydrolase family 43 proteins similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities, similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 277
36133 350141 cd18820 GH43_LbAraf43-like Glycosyl hydrolase family 43 proteins similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities, similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 258
36134 350142 cd18821 GH43_Pc3Gal43A-like Glycosyl hydrolase family 43 protein such as Phanerochaete chrysosporium exo-beta-1,3-galactanase (Pc1,3Gal43A, 1,3Gal43A). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Phanerochaete chrysosporium 1,3Gal43A (Pc1,3Gal43A), Fusarium oxysporum 12S Fo/1 (3Gal), and Streptomyces sp. 19(2012) SGalase1 and SGalase2. It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 262
36135 350143 cd18822 GH43_CtGH43-like Glycosyl hydrolase family 43 protein such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43), Streptomyces avermitilis MA-4680 = NBRC 14893 (Sa1,3Gal43A;SAV2109) (1,3Gal43A), and Ruminiclostridium thermocellum ATCC 27405 (Ct1,3Gal43A;CtGH43;Cthe_0661) (1,3Gal43A). It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 266
36136 350144 cd18823 GH43_RcAra43A-like Glycosyl hydrolase family 43 such as Ruminococcus champanellensis arabinanase Ara43A. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis arabinanase Ara43A and Fibrobacter succinogenes subsp. succinogenes S85 Fisuc_1994 / FSU_2517. It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 289
36137 350145 cd18824 GH43_CtGH43-like Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 282
36138 350146 cd18825 GH43_CtGH43-like Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 285
36139 350147 cd18826 GH43_CtGH43-like Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 269
36140 350148 cd18827 GH43_XlnD-like Glycosyl hydrolase family 43 protein such as Aspergillus niger DMS1957 xylanase D (XlnD); includes mostly xylanases. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have mostly been annotated as xylanases (endo-1,4-beta-xylanase, EC 3.2.1.8). It belongs to the GH43_bXyl-like subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_bXyl-like subgroup includes enzymes that have been annotated as xylan-digesting beta-xylosidases (EC 3.2.1.37) and xylanases, as well as the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 277
36141 350149 cd18828 GH43_BT3675-like Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (BT3675;BT_3675). This glycosyl hydrolase family 43 (GH43) subgroup includes the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). It belongs to the GH43_bXyl subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_bXyl subgroup also includes enzymes annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-1,4-beta-xylanase, EC 3.2.1.8) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 283
36142 350150 cd18829 GH43_BsArb43A-like Glycosyl hydrolase family 43 protein such as Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase Arb43A. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes annotated as having endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities, and includes Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase (AbnA;BSU28810) (Arb43A). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the arabinofuranosidase (ABF; EC 3.2.1.55) enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 273
36143 350151 cd18830 GH43_CjArb43A-like Glycosyl hydrolase family 43 protein such as Cellvibrio japonicus Ueda107 endo-alpha-1,5-L-arabinanase / exo-alpha-1,5-L-arabinanase 43A (ArbA;CJA_0805) (Arb43A). This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes annotated with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities, and includes the bifunctional Cellvibrio japonicus Ueda107 endo-alpha-1,5-L-arabinanase / exo-alpha-1,5-L-arabinanase 43A (ArbA;CJA_0805) (Arb43A). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 291
36144 350152 cd18831 GH43_AnAbnA-like Glycosyl hydrolase family 43 protein such as Aspergillus niger endo-alpha-L-arabinanase (AbnA). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities such as Aspergillus niger AbnA, Aspergillus niveus AbnA, and Chrysosporium lucknowense Abn1. It belongs to the GH43_Arb43a subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_Arb43a subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. The GH43_Arb43a subgroup includes many enzymes such as Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, and are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 286
36145 350153 cd18832 GH43_GsAbnA-like Glycosyl hydrolase family 43 protein such as Geobacillus stearothermophilus endo-alpha-1,5-L-arabinanase AbnA. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. It includes Geobacillus stearothermophilus T-6 NCIMB 40222 AbnA, Bacillus subtilis subsp. subtilis str. 168 (Abn2;YxiA;J3A;BSU39330) (Arb43B), and Thermotoga petrophila RKU-1 (AbnA;TpABN;Tpet_0637). These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 332
36146 350154 cd18833 GH43_PcXyl-like Glycosyl hydrolase family 43 protein such as the bifunctional Phanerochaete chrysosporium xylosidase/arabinofuranosidase (Xyl;PcXyl). This glycosyl hydrolase family 43 (GH43) subgroup includes Phanerochaete chrysosporium BKM-F-1767 Xyl, a characterized bifunctional enzyme with beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37)/ alpha-L-arabinofuranosidase (EC 3.2.1.55) activities. This subgroup belongs to the GH43_XybB subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_XybB subgroup includes enzymes having beta-1,4-xylosidase and alpha-L-arabinofuranosidase activities. Beta-1,4-xylosidases are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43_XybB subgroup includes Bacteroides ovatus alpha-L-arabinofuranosidases, BoGH43A and BoGH43B, both having a two-domain architecture, consisting of an N-terminal 5-bladed beta-propeller domain harboring the catalytic active site, and a C-terminal beta-sandwich domain. However, despite significant functional overlap between these two enzymes, BoGH43A and BoGH43B share just 41% sequence identity. The latter appears to be significantly less active on the same substrates, suggesting that these paralogs may play subtly different roles during the degradation of xyloglucans from different sources, or may function most optimally at different stages in the catabolism of xyloglucan oligosaccharides (XyGOs), for example before or after hydrolysis of certain side-chain moieties. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 292
36147 380671 cd18892 TET oxygenase domain of ten-eleven translocation (TET)1, TET2, and TET3 methylcytosine dioxygenases and similar proteins. TET proteins are involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. Alterations in TET protein function have been linked to cancer, and TETs influence many cell differentiation processes. TET family genes have been implicated as tumor suppressors, for example mutations/deletions of the TET2 gene frequently occur in multiple spectra of myeloid malignancies. TET3 acts as a suppressor of ovarian cancer by demethylating the miR-30d precursor gene promoter to block TGF-beta1 induced epithelial-mesenchymal transition (EMT). TET3 (and TET2) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A). TET genes are downregulated in endometriosis. TET proteins belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 398
36148 380672 cd18893 TET-like oxygenase domain of ten-eleven translocation (TET)-like proteins such as Naegleria gruberi Tet-like protein (NgTet1) and similar proteins. Naegleria gruberi Tet1 can catalyze the iterative oxidation of both 5-methylcytosine (5mC) and thymidine (T) on various DNA forms. Like mammalian TETs, it catalyzes the oxidation of 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) in three consecutive, Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate, 2-OG)-dependent oxidation reactions. Like JBP1 and JBP2, NgTet1 can perform T-oxidation to form 5-hydroxymethyluridine (5hmU), but in addition it can catalyze the formation of 5-formyluridine (5fU) and 5-carboxyluridine (5caU). This family belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 243
36149 380673 cd18894 JBP-like oxygenase domain of J-binding protein (JBP) 1 and JBP2 thymidine hydroxylases and similar proteins, including uncharacterized bacterial and phage proteins. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 250
36150 380674 cd18895 TET1 oxygenase domain of ten-eleven translocation (TET)1 methylcytosine dioxygenase and similar proteins. TET1 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Human TET1 (and TET2) are more active on 5mC-DNA than 5hmC/5fC-DNA substrates. TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET1 plays multiple roles in tumor development and progression. TET1 serves as a tumor suppressor gene; loss of TET1 is associated with tumorigenesis and can be used as a potential biomarker for cancer therapy. In addition to its dioxygenase activity, it can induce epithelial-mesenchymal transition and act as a coactivator to regulate gene transcription. The regulation of TET1 is also correlated with microRNA in a posttranscriptional modification process. TET1 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 410
36151 380675 cd18896 TET2 oxygenase domain of ten-eleven translocation (TET)2 methylcytosine dioxygenase and similar proteins. TET2 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Human TET2 (and TET1) have been shown to be more active on 5mC-DNA than 5hmC/5fC-DNA substrates. TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET2 acts as a tumor suppressor in hematopoiesis; mutations/deletions of the TET2 gene frequently occur in multiple spectra of myeloid malignancies. TET2 (and TET3) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A), which play a functional role in the epithelial-mesenchymal transition process and metastasis. In addition, TET2 (and TET3) may be guardians of regulatory T cell stability and immune homeostasis. TET2 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 434
36152 380676 cd18897 TET3 oxygenase domain of ten-eleven translocation (TET)3 methylcytosine dioxygenase and similar proteins. TET3 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET3 serves as a tumor suppressor; it acts as a suppressor of ovarian cancer by demethylating the miR-30d precursor gene promoter to block TGF-beta1 induced epithelial-mesenchymal transition (EMT). TET3 (and TET2) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A), which play a functional role in the EMT process and metastasis. In addition, TET3 (and TET2) may be guardians of regulatory T cell stability and immune homeostasis. TET3 has been shown to prevent terminal differentiation of adult neural stem cells by a mechanism involving direct binding of TET3 to, and repression of, the imprinted gene Snrpn. TET3 has also been shown to mediate the activation of hepatic stellate cells via modulation of the long non-coding RNA HIF1A-AS1 expression. TET3 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 452
36153 381478 cd18908 bHLH_SOHLH1_2 basic helix-loop-helix (bHLH) domain found in the spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein (SOHLH) family. The SOHLH family includes two bHLH transcription factors, SOHLH1 and SOHLH2. They are specifically expressed in spermatogonia and oocytes and are essential for early spermatogonial and oocyte differentiation. 59
36154 381479 cd18909 bHLH_TCFL5 basic helix-loop-helix (bHLH) domain found in transcription factor-like 5 protein (TCFL5) and similar proteins. TCFL5, also termed Cha transcription factor, or HPV-16 E2-binding protein 1 (E2BP-1), is a bHLH transcription factor that plays a crucial role in spermatogenesis. It regulates cell proliferation or differentiation of cells through binding to a specific DNA sequence like other bHLH molecules. 60
36155 381480 cd18910 bHLHzip_USF3 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in basic helix-loop-helix domain-containing protein USF3 and similar proteins. USF3, also termed upstream transcription factor 3, is a bHLHzip protein that is involved in the negative regulation of epithelial-mesenchymal transition, the process by which epithelial cells lose their polarity and adhesion properties to become mesenchymal cells with enhanced migration and invasive properties. 65
36156 381481 cd18911 bHLHzip_MGA basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MAX gene-associated protein (MGA) and similar proteins. MGA, also termed MAX dimerization protein 5 (MAD5), is a dual specificity T-box/ bHLHzip transcription factor that regulates the expression of both Max-network and T-box family target genes. It contains a Myc-like bHLHZip motif and requires heterodimerization with Max for binding to the preferred Myc-Max-binding site CACGTG. In addition to the bHLHZip domain, MGA harbors a second DNA-binding domain, the T-box or T-domain. It thus binds the preferred Brachyury-binding sequence and represses transcription of reporter genes containing promoter-proximal Brachyury-binding sites. 65
36157 381482 cd18912 bHLH_TS_bHLHa9 basic helix-loop-helix (bHLH) domain found in Class A basic helix-loop-helix protein 9 (bHLHa9) and similar proteins. bHLHa9, also termed Class F basic helix-loop-helix factor 42 (bHLHf42), is a bHLH transcription factor that plays an essential role in limb development. 63
36158 381483 cd18913 bHLH-O_hairy_like basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster protein hairy, protein deadpan and similar proteins. Protein hairy is a bHLH transcriptional repressor of genes that require a bHLH-O protein for their transcription. It acts as a pair-rule protein that regulates embryonic segmentation and adult bristle patterning. Protein deadpan is closely related to the product of the segmentation gene hairy. It is a direct target of Notch signaling and regulates neuroblast self-renewal in Drosophila. 67
36159 381484 cd18914 bHLH_AtORG2_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana OBP3-responsive gene 2 (ORG2), 3 (ORG3) and similar proteins. The family includes ORG2 (also termed AtbHLH38, or EN 8) and ORG3 (also termed AtbHLH39, or EN 9), both of which act as bHLH transcription factors. 77
36160 381485 cd18915 bHLH_AtLHW_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein LONESOME HIGHWAY (LHW) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as LHW, and EMB1444. LHW, also termed AtbHLH156, or bHLH delta, is a bHLH transcription activator that regulates root development and promotes the production of stele cells in roots. It coordinately controls the number of all vascular cell types by regulating the size of the pool of cells from which they arise. EMB1444, also termed AtbHLH169, or lonesome highway-like protein 1, or protein embryo defective 1444, may regulate root development. 71
36161 381486 cd18916 bHLH-O_ESM5_like basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster Enhancer of split proteins, E(spl)m5, E(spl)m8 and similar proteins. The family includes two bHLH-O transcriptional repressors, E(spl)m5 and E(spl)m8, which participate in the control of cell fate choice by uncommitted neuroectodermal cells in the embryo. They bind DNA on N-box motifs, 5'-CACNAG-3'. 59
36162 381487 cd18917 bHLH_AtSAC51_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana suppressor of acaulis 51 (SAC51) and similar proteins. SAC51, also termed AtbHLH142, or EN 128, is a bHLH transcription factor that is involved in stem elongation, probably by regulating a subset of genes involved in this process. 53
36163 381488 cd18918 bHLH_AtMYC1_like basic Helix-Loop-Helix (bHLH) domain found in Arabidopsis thaliana MYC1 and similar proteins. MYC1, also termed AtbHLH12, or EN 58, acts as a transcription activator, when associated with MYB75/PAP1 or MYB90/PAP2. 70
36164 381489 cd18919 bHLH_AtBPE_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana BIG PETAL (BPE) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as BPE, HBI1 and BEE proteins (BEE1-3). BPE, also termed AtbHLH31, or EN 88, is involved in the control of Arabidopsis petal size, by interfering with postmitotic cell expansion to limit final petal cell size. HBI1, also termed AtbHLH64, or homolog of bee2 interacting with IBH1, or EN 79, is an atypical bHLH transcription factor that acts as positive regulator of cell elongation downstream of multiple external and endogenous signals by direct binding to the promoters and activation of the two expansin genes EXPA1 and EXPA8, encoding cell wall loosening enzymes. BEEs, also termed protein Brassinosteroid enhanced expression, are positive regulators of brassinosteroid signaling. 86
36165 381490 cd18920 bHLH-O_HEY2 basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif protein 2 (HEY2) and similar proteins. HEY2, also termed cardiovascular helix-loop-helix factor 1 (CHF-1), or Class B basic helix-loop-helix protein 32 (bHLHb32), or HES-related repressor protein 2, or hairy and enhancer of split-related protein 2 (HESR-2), or hairy-related transcription factor 2 (HRT-2), or protein gridlock homolog, is a bHLH-O transcriptional repressor expressed preferentially in the developing and adult cardiovascular system. As a downstream effector of Notch signaling, HEY2 may be required for cardiovascular development. It also plays an important role in neurologic development, as well as in the progression of human cancers. 82
36166 381491 cd18921 bHLHzip_SREBP1 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein 1 (SREBP1) and similar proteins. SREBP1, also termed Class D basic helix-loop-helix protein 1 (bHLHd1), or sterol regulatory element-binding transcription factor 1 (SREBF1), is a member of a family of bHLHzip transcription factors that recognize sterol regulatory element 1 (SRE-1). It acts as a transcriptional activator required for lipid homeostasis. It may control transcription of the low-density lipoprotein receptor gene as well as the fatty acid. SREBP1 has dual sequence specificity binding to both an E-box motif (5'-ATCACGTGA-3') and to SRE-1 (5'-ATCACCCCAC-3'). 75
36167 381492 cd18922 bHLHzip_SREBP2 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein 2 (SREBP2) and similar proteins. SREBP2, also termed Class D basic helix-loop-helix protein 2 (bHLHd2), or sterol regulatory element-binding transcription factor 2 (SREBF2), is a member of a family of bHLHzip transcription factors that recognize sterol regulatory element 1 (SRE-1). It acts as a transcription activator of cholesterol biosynthesis. 77
36168 381493 cd18923 bHLHzip_USF2 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factor 2 (USF2) and similar proteins. USF2, also termed Class B basic helix-loop-helix protein 12 (bHLHb12), or major late transcription factor 2, or FOS-interacting protein (FIP), or upstream transcription factor 2, is a bHLHzip transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters. 80
36169 381494 cd18924 bHLHzip_USF1 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factor 1 (USF1) and similar proteins. USF1, also termed Class B basic helix-loop-helix protein 11 (bHLHb11), or major late transcription factor 1, is a bHLHzip transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters. It is ubiquitously expressed and involved in the transcription activation of various functional genes implicated in lipid and glucose metabolism, stress response, immune response, cell cycle control and tumour suppression. USF-1 recruits chromatin remodeling enzymes and interacts with co-activators and the members of the transcription pre-initiation complex. Genetic polymorphisms of USF1 are associated with some metabolic and cardiovascular diseases, like diabetes, atherosclerosis, coronary artery calcifications and familial combined hyperlipidaemia (FCHL). 65
36170 381495 cd18925 bHLHzip_TFEC basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor EC (TFEC) and similar proteins. TFEC, also termed Class E basic helix-loop-helix protein 34 (bHLHe34), or transcription factor EC-like (TFEC-L), is a bHLHzip transcriptional regulator that acts as a repressor or an activator and regulates gene expression in macrophages. It plays an important role in the niche to expand hematopoietic progenitors through the modulation of several cytokines. 85
36171 381496 cd18926 bHLHzip_MITF basic Helix-Loop-Helix-zipper (bHLHzip) domain found in microphthalmia-associated transcription factor (MITF) and similar proteins. MITF, also termed Class E basic helix-loop-helix protein 32 (bHLHe32), is a bHLHzip transcription factor that is involved in the development of neural crest melanocytes as well as the pigmented retinal epithelium. It regulates the expression of genes with essential roles in cell differentiation, proliferation and survival. It binds to M-boxes (5'-TCATGTG-3') and symmetrical DNA sequences (E-boxes) (5'-CACGTG-3') found in the promoters of target genes, such as BCL2 and tyrosinase (TYR). 104
36172 381497 cd18927 bHLHzip_TFEB basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor EB (TFEB) and similar proteins. TFEB, also termed Class E basic helix-loop-helix protein 35 (bHLHe35), is a bHLHzip transcription factor that is required for vascularization of the mouse placenta. It specifically recognizes and binds E-box sequences (5'-CANNTG-3'). Its efficient DNA-binding requires dimerization with itself or with another MiT/TFE family member such as TFE3 or MITF. 91
36173 381498 cd18928 bHLHzip_TFE3 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor E3 (TFE3) and similar proteins. TFE3, also termed Class E basic helix-loop-helix protein 33 (bHLHe33), is a bHLHzip transcription factor that is involved in B cell function. It specifically recognizes and binds E-box sequences (5'-CANNTG-3'). Its efficient DNA-binding requires dimerization with itself or with another MiT/TFE family member such as TFEB or MITF. 91
36174 381499 cd18929 bHLHzip_Mad4 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-associated protein 4 (Mad4) and similar proteins. Mad4, also termed Max dimerization protein 4, or Max dimerizer 4 (MXD4), or Class C basic helix-loop-helix protein 12 (bHLHc12), or Max-interacting transcriptional repressor MAD4, is a bHLHZip Max-interacting transcriptional repressor that suppresses c-myc dependent transformation and is expressed during neural and epidermal differentiation. It is regulated by a transcriptional repressor complex that contains Miz-1 and c-Myc. 88
36175 381500 cd18930 bHLHzip_MXI1 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-interacting protein 1 (MXI1) and similar proteins. MXI1, also termed Max interactor 1, or Class C basic helix-loop-helix protein 11 (bHLHc11), is a bHLHZip transcriptional repressor that binds with MAX to form a sequence-specific DNA-binding protein complex which recognizes the core sequence 5'-CAC[GA]TG-3'. It thus antagonizes MYC transcriptional activity by competing for MAX. It plays an important role in the regulation of cell proliferation. 80
36176 381501 cd18931 bHLHzip_Mad1 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in protein Max-associated protein 1 (Mad1) and similar proteins. Mad1, also termed Max dimerization protein 1 (MXD1), or Max dimerizer 1, or protein MAD, is a bHLHZip transcriptional repressor that binds with MAX to form a sequence-specific DNA-binding protein complex which recognizes the core sequence 5'-CAC[GA]TG-3'. It thus antagonizes MYC transcriptional activity by competing for MAX. 80
36177 381502 cd18932 bHLHzip_Mad3 basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-associated protein 3 (Mad3) and similar proteins. Mad3, also termed Max dimerization protein 3, or Max dimerizer 3 (MXD3), or Class C basic helix-loop-helix protein 13 (bHLHc13), or Max-interacting transcriptional repressor MAD3, or Myx, is a bHLHZip Max-interacting transcriptional repressor that plays an important role in cellular proliferation. It suppresses c-myc dependent transformation and is expressed during neural and epidermal differentiation. 85
36178 381503 cd18933 bHLH-O_HES3 basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-3 and similar proteins. HES-3, also termed Class B basic helix-loop-helix protein 43 (bHLHb43), or hairy and enhancer of split 3, is a bHLH-O transcription factor expressed in neural stem and progenitor cells that is involved in tissue regeneration. It regulates gene expression, cell growth, and insulin release. HES-3 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability. 55
36179 381504 cd18934 bHLH_TS_MRF4_Myf6 basic helix-loop-helix (bHLH) domain found in muscle-specific regulatory factor 4 (MRF4) and similar proteins. MRF4, also termed Class C basic helix-loop-helix protein 4 (bHLHc4), or myogenic factor 6 (Myf-6), is a bHLH transcription factor associated with myogenesis. It plays a role in skeletal muscle differentiation. 64
36180 381505 cd18935 bHLH_TS_MYOG_Myf4 basic helix-loop-helix (bHLH) domain found in myogenin (MYOG) and similar proteins. MYOG, also termed Class C basic helix-loop-helix protein 3 (bHLHc3), or myogenic factor 4 (Myf-4), is a bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle differentiation, cell cycle exit and muscle atrophy. 59
36181 381506 cd18936 bHLH_TS_MYOD1_Myf3 basic helix-loop-helix (bHLH) domain found in myoblast determination protein 1 (MYOD1) and similar proteins. MYOD1, also termed Class C basic helix-loop-helix protein 1 (bHLHc1), or myogenic factor 3 (Myf-3), is a bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle differentiation. Together with Myf-5 and MYOG, MYOD1 co-occupies muscle-specific gene promoter core region during myogenesis. 61
36182 381507 cd18937 bHLH_TS_Myf5 basic helix-loop-helix (bHLH) domain found in myogenic factor 5 (Myf-5) and similar proteins. Myf-5, also termed Class C basic helix-loop-helix protein 2 (bHLHc2), is a nuclear bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle specification and differentiation. It also acts as an RNA-binding protein which enhances Ccnd1/Cyclin D1 mRNA translation during myogenesis. 64
36183 381508 cd18938 bHLH_TS_Mesp basic helix-loop-helix (bHLH) domain found in the mesoderm posterior protein (Mesp) family. Mesp, a bHLH tissue specific transcription factor, acts as a key regulator of the cardiovascular transcriptional network by inducing directly and/or indirectly the expression of the majority of key cardiovascular transcription factors. The Mesp family includes two bHLH transcription factors, Mesp1 and Mesp2. Mesp1, also termed Class C basic helix-loop-helix protein 5 (bHLHc5), promotes cardiovascular differentiation during embryonic development and embryonic stem cell differentiation. Mesp2, also termed Class C basic helix-loop-helix protein 6 (bHLHc6), plays an important role in somitogenesis. 65
36184 381509 cd18939 bHLH_TS_Msgn1 basic helix-loop-helix (bHLH) domain found in mesogenin-1 (Msgn1) and similar proteins. Msgn1, also termed paraxial mesoderm-specific mesogenin1, or pMesogenin1 (pMsgn1), is a bHLH transcription factor required for maturation and segmentation of paraxial mesoderm. It may regulate the expression of T-box transcription factors essential for mesoderm formation and differentiation. 66
36185 381510 cd18940 bHLH_TS_OLIG2 basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 2 (Oligo2) and similar proteins. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is a bHLH transcription factor that is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube. 85
36186 381511 cd18941 bHLH_TS_OLIG3 basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 3 (Oligo3) and similar proteins. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is a bHLH transcription factor that is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons. 81
36187 381512 cd18942 bHLH_TS_OLIG1 basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 1 (Oligo1) and similar proteins. Oligo1, also termed Class B basic helix-loop-helix protein 6 (bHLHb6), or Class E basic helix-loop-helix protein 21 (bHLHe21), is a bHLH transcription factor that promotes formation and maturation of oligodendrocytes, especially within the brain. 75
36188 381513 cd18943 bHLH_E-protein_E47-like basic helix-loop-helix (bHLH) domain found in transcription factor E47 and similar proteins. E47 is a class I bHLH transcriptional regulator that forms heterodimers with class II bHLH proteins to regulate distinct differentiation pathways. Its homodimers regulate B lymphocyte development. 74
36189 381514 cd18944 bHLH_E-protein_E2A_TCF3 basic helix-loop-helix (bHLH) domain found in transcription factor E2-alpha (E2A) and similar proteins. E2A, also termed Class B basic helix-loop-helix protein 21 (bHLHb21), or immunoglobulin enhancer-binding factor E12/E47, or immunoglobulin transcription factor 1, or Kappa-E2-binding factor, or transcription factor 3 (TCF-3), or transcription factor ITF-1, is a bHLH transcriptional regulator involved in the initiation of neuronal differentiation. 74
36190 381515 cd18945 bHLH_E-protein_TCF4_E2-2 basic helix-loop-helix (bHLH) domain found in transcription factor 4 (TCF-4) and similar proteins. TCF-4, also termed E2-2, or Class B basic helix-loop-helix protein 19 (bHLHb19), or immunoglobulin transcription factor 2 (ITF-2), or SL3-3 enhancer factor 2 (SEF-2), is a bHLH transcription factor that binds to the immunoglobulin enhancer Mu-E5/KE5-motif. It is involved in the initiation of neuronal differentiation. 85
36191 381516 cd18946 bHLH_E-protein_TCF12_HEB basic helix-loop-helix (bHLH) domain found in transcription factor 12 (TCF-12) and similar proteins. TCF-12, also termed HEB, or Class B basic helix-loop-helix protein 20 (bHLHb20), or DNA-binding protein HTF4, or E-box-binding protein, or transcription factor HTF-4, is a bHLH transcription factor that is involved in the initiation of neuronal differentiation. 83
36192 381517 cd18947 bHLH-PAS_ARNT basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator (ARNT) and similar proteins. ARNT, also termed Class E basic helix-loop-helix protein 2 (bHLHe2), or Dioxin receptor, nuclear translocator, or hypoxia-inducible factor 1-beta (HIF1b), or HIF-1-beta, or HIF1-beta, is a member of bHLH-PAS transcription regulators that acts as the heterodimeric partner for bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM). These bHLH-PAS transcription complexes are involved in transcriptional responses to xenobiotic, hypoxia, and developmental pathways. Heterodimerization of bHLH-PAS proteins with ARNT is mediated by contacts between both the bHLH and the tandem PAS domains. ARNT uses bHLH and/or PAS domains to interact with several transcriptional coactivators. It is required for activity of the aryl hydrocarbon (dioxin) receptor. 65
36193 381518 cd18948 bHLH-PAS_NCoA1_SRC1 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 1 (NCoA-1) and similar proteins. NCoA-1, also termed Class E basic helix-loop-helix protein 74 (bHLHe74), or protein Hin-2, or RIP160, or renal carcinoma antigen NY-REN-52, or steroid receptor coactivator 1 (SRC-1), is a bHLH-PAS nuclear receptor coactivator that directly binds nuclear receptors and stimulates the transcriptional activities in a hormone-dependent fashion. 61
36194 381519 cd18949 bHLH-PAS_NCoA3_SRC3 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 3 (NCoA-3) and similar proteins. NCoA-3, also termed ACTR, or amplified in breast cancer 1 protein (AIB-1), or CBP-interacting protein (pCIP), or Class E basic helix-loop-helix protein 42 (bHLHe42), or receptor-associated coactivator 3 (RAC-3), or steroid receptor coactivator protein 3 (SRC-3), or thyroid hormone receptor activator molecule 1 (TRAM-1), is a bHLH-PAS steroid/nuclear receptor-associated coactivator that directly binds nuclear receptors and stimulates the transcriptional activities in a hormone-dependent fashion. It also plays a central role in creating a multisubunit coactivator complex, which probably acts via remodeling of chromatin. 73
36195 381520 cd18950 bHLH-PAS_NCoA2_SRC2 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 2 (NCoA-2) and similar proteins. NCoA-2, also termed Class E basic helix-loop-helix protein 75 (bHLHe75), or transcriptional intermediary factor 2 (TIF2), or steroid receptor coactivator 2 (SRC-2), or glucocorticoid receptor interacting protein-1 (GRIP1), is a bHLH-PAS transcriptional coactivator for steroid receptors and nuclear receptors. It is required with NCoA-1 to control energy balance between white and brown adipose tissues. 64
36196 381521 cd18951 bHLH_TS_scleraxis basic helix-loop-helix (bHLH) domain found in scleraxis and similar proteins. Scleraxis, also termed SCX, or Class A basic helix-loop-helix protein 41 (bHLHa41), or Class A basic helix-loop-helix protein 48 (bHLHa48), is a bHLH transcription factor that is expressed in sclerotome limb bud cranial and body wall mesenchyme, pericardium and heart valves, ligaments and tendons. It is required for tendon formation ligaments, connective tissue, the diaphragm, and testis development. Scleraxis plays a central role in promoting fibroblast proliferation and matrix synthesis during the embryonic development of tendons. 68
36197 381522 cd18952 bHLH_TS_HAND1 basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein 1 (HAND1) and similar proteins. HAND1, also termed Class A basic helix-loop-helix protein 27 (bHLHa27), or extraembryonic tissues, heart, autonomic nervous system and neural crest derivatives-expressed protein 1 (eHAND), is a bHLH transcription factor that plays an essential role in both trophoblast giant cell differentiation and cardiac morphogenesis. 60
36198 381523 cd18953 bHLH_TS_bHLHe23_bHLHb4 basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein 23 (bHLHe23) and similar proteins. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is an OLIG-related bHLH transcription factor that is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 has roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types. 81
36199 381524 cd18954 bHLH_TS_bHLHe22_bHLHb5 basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein 22 (bHLHe22) and similar proteins. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is an OLIG-related bHLH neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types. 70
36200 349736 cd18955 BTB_POZ_BACH BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog (BACH) proteins. This subfamily includes BACH1 (also called BTB-basic leucine zipper transcription factor 1), BACH2 (also called BTB-basic leucine zipper transcription factor 2), and similar proteins. They belong to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. BACH1 is a heme-responsive transcriptional repressor of heme oxygenase (HO)-1. It represses genes involved in heme metabolism and oxidative stress response. BACH2 is a lymphoid-specific transcription factor with a prominent role in B-cell development. It is transcriptionally regulated by the BCR/ABL oncogene. It represses the anti-apoptotic factor heme oxygenase-1 (HO-1). Subfamily members contain a BTB domain and a basic leucine zipper (bZIP) domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids. 94
36201 349737 cd18956 BTB_POZ_ZBTB42 BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 42 (ZBTB42). ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. It is enriched in skeletal muscles, especially at the neuromuscular junction. A ZBTB42 mutation has been identified to define a novel lethal congenital contracture syndrome (LCCS6), a lethal autosomal recessive form of arthrogryposis multiplex congenita (AMC). ZBTB42 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 129
36202 349316 cd18960 CD_HP1_like chromodomain of heterochromatin protein 1 proteins, including HP1alpha, HP1beta, and HP1gamma; uncharacterized subgroup. CHRomatin Organization Modifier (chromo) domain of mammalian HP1alpha (Cbx5), HP1beta (Cbx1), HP1gamma (Cbx3), and similar proteins. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). 51
36203 349317 cd18961 CD_CEC-4_like chromodomain of Caenorhabditis elegans chromodomain protein 4, and similar proteins. CHRomatin Organization Modifier (chromo) domain of Caenorhabditis elegans CEC-4, and similar proteins. CEC-4 is a perinuclear heterochromatin anchor; it mediates the anchoring of H3K9 methylation-bearing chromatin at the nuclear periphery in early to mid-stage embryos. It is necessary for anchoring, but does not affect transcriptional repression. CEC-4 contributes to the efficiency with which muscle differentiation is induced following ectopic expression of the master regulator, HLH-1 (MyoD in mammals). A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 51
36204 349318 cd18962 CD_MT_like chromodomain of a putative Coemansia reversa NRRL 1564 methyltransferase, and similar proteins. This subgroup includes the CHROMO (CHRomatin Organization Modifier) domain found in a Coemansia reversa NRRL 1564 SET (Su(var)3-9, enhancer-of-zeste, trithorax) domain-containing protein, and similar proteins. The SU(VAR)3-9 protein is the main chromocenter-specific histone H3-K9 methyltransferase (HMTase) in Drosophila where it plays a role in heterochromatic gene silencing. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 52
36205 349319 cd18963 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 57
36206 349320 cd18964 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 54
36207 349321 cd18965 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 53
36208 349322 cd18966 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 49
36209 349323 cd18967 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 55
36210 349324 cd18968 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 57
36211 349325 cd18969 chromodomain CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup; for most members of this subgroup, the chromodomain is followed by a chromo shadow domain. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. For the majority of members of this subgroup, the chromodomain is followed by a chromo shadow domain (CSD). 56
36212 349326 cd18970 CD_POL_like chromodomain of Hypsizygus marmoreus TY3B-I_0 protein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Hypsizygus marmoreus TY3B-I_0 protein, a putative TY3/gypsy retrotransposon polyprotein, and similar proteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 49
36213 349327 cd18971 CD_POL_like chromodomain of a Magnaporthe grisea putative retrotransposon polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Magnaporthe grisea putative retrotransposon polyprotein which includes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 50
36214 349328 cd18972 CD_POL_like chromodomain of a Moniliophthora perniciosa FA553 putative retrotransposon polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Moniliophthora perniciosa FA553 putative retroelement polyprotein, which includes domains in the following order: a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 50
36215 349329 cd18973 CD_Tf2-1_POL_like chromodomain of Rhizoctonia solani AG-1 IB retrotransposable element Tf2 155 kDa protein type 1, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Rhizoctonia solani AG-1 IB retrotransposable element Tf2 155 kDa protein type 1 (Tf2-1), and similar proteins. It belongs to the Ty3/gypsy family of long terminal repeat (LTR) retrotransposons. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 50
36216 349330 cd18974 CD_POL_like chromodomain of Penicillium solitum protein PENSOL_c198G03123. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Penicillium solitum protein PENSOL_c198G03123 a putative polyprotein from a Ty3/Gypsy long terminal repeat (LTR) retroelement. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 50
36217 349331 cd18975 CD_MarY1_POL_like chromodomain of Tricholoma matsutake polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in the polyprotein from the MarY1 Ty3/Gypsy long terminal repeat (LTR) retroelement from the ectomycorrhizal basidiomycete Tricholoma matsutake. The pol gene in TY3/gypsy elements generally encodes domains in the following order: prt-reverse transcriptase-RNase H-integrase; in marY1 POL the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 49
36218 349332 cd18976 CD_POL_like chromodomain of uncharacterized putative retroelement polyprotein proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in uncharacterized putative retrotransposon proteins, and similar proteins. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 51
36219 349333 cd18977 CD_POL_like chromodomain of a Rhizoctonia solani AG-3 Rhs1AP polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Rhizoctonia solani AG-3 Rhs1AP, a putative Ty3/Gypsy polyprotein/retrotransposon which includes a protease, a reverse transcriptase, a ribonuclease H, and an integrase domain, in that order, with a chromodomain at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 57
36220 349334 cd18978 CD_DDE_transposase_like chromodomain of Rhizopus microsporus putative DDE transposases, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Rhizopus microsporus putative DDE transposases, and similar proteins. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 52
36221 349335 cd18979 CD_POL_like chromodomain of a Zea maize putative metaviridae (gypsy-type) retrotransposon polyproteins (Z195D10.9), and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Zea maize Z195D10.9 protein, and other putative TY3/gypsy retrotransposon polyproteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 48
36222 349336 cd18980 CD_NC-like chromodomain of a Trichosporon asahii var. asahii CBS 8904 retrotransposon nucleocapsid protein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Trichosporon asahii var. asahii CBS 8904 retrotransposon nucleocapsid protein, and similar proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. 56
36223 349337 cd18981 CSD_HP1e_insect chromo shadow domain of insect heterochromatin protein 1E. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRromatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 53
36224 349338 cd18982 CSD chromo shadow domain; uncharacterized subgroup. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRromatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. 50
36225 350846 cd18983 CBD_MSL3_like chromo barrel domain of human male-specific lethal complex subunit 3, and similar proteins. This subgroup includes human male-specific lethal (MSL) complex subunit 3 (MSL3, also known as MSL3L1). The MSL3 chromodomain specifically recognizes the H4K20 monomethyl mark, in a DNA-dependent manner, and may be involved in chromosomal targeting of the MSL complex. Also included is MORF-related gene on chromosome 15 (MRG15, also known as MORF4L1) which specifically binds to Lys36-methylated histone H3 and plays a role in transcriptional regulation and in DNA repair. This subgroup also includes Arabidopsis thaliana Morf Related Gene 2 (MRG2) which acts as a H3K4me3/H3K36me3 reader involved in the regulation of Arabidopsis flowering. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. 57
36226 350847 cd18984 CBD_MOF_like chromo barrel domain of Drosophila melanogaster males-absent on the first protein, and similar proteins. This subgroup includes the chromo barrel domain of Drosophila melanogaster males-absent on the first (MOF) protein. The histone H4 lysine 16 (H4K16)-specific acetyltransferase MOF is part of two distinct complexes involved in X chromosome dosage compensation and autosomal transcription regulation. Its chromobarrel domain is essential for H4K16 acetylation throughout the Drosophila genome and controls spreading of the male-specific lethal (MSL) complex on the X chromosome. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. The MOF-like chromo barrels may be auto-inhibited, i.e. they seem to have occluded peptide binding sites. 70
36227 350848 cd18985 CBD_TIP60_like chromo barrel domain of human tat-interactive protein 60, and similar proteins. Tat-interactive protein 60 (also known as KAT5 or HTATIP) catalyzes the acetylation of lysine side chains in various histone and nonhistone proteins, and in itself. It plays roles in multiple cellular processes including remodeling, transcription, DNA double-strand break repair, apoptosis, embryonic stem cell identity, and embryonic development. The TIP60 chromo barrel domain recognizes trimethylated lysine at site 9 of histone H3 (H3K9me3) which triggers TIP60 to acetylate and activate ataxia telangiectasia-mutated kinase, thereby promoting the DSB repair pathway. In a different study, the TIP60 chromo barrel domain was shown to bind H3K4me1, which stabilizes TIP60 recruitment to a subset of estrogen receptor alpha target genes, facilitating regulation of the associated gene transcription. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. This subgroup belongs to the MOF-like chromo barrels, which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites. 64
36228 350849 cd18986 CBD_ESA1_like chromo barrel domain of yeast NuA4 histone acetyltransferase complex catalytic subunit ESA1, and similar proteins. The subgroup includes the chromo barrel domain of NuA4 histone acetyltransferase (HAT) complex catalytic subunit Esa1 (also known as Tas1 and Kat5). Yeast Esa1p acetylates specific histones nonrandomly in H4, H3, and H2A. Esa1 also plays roles in cell cycle progression. In addition, its chromo barrel domain plays a role in the yeast Piccolo NuA4 complex's ability to distinguish between histones and nucleosomes; however, the chromodomain is not required for the Piccolo to bind to nucleosomes. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. This subgroup belongs to the MOF-like chromo barrels, which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites. 65
36229 349788 cd18987 LGIC_ECD_anion extracellular domain (ECD) of anionic Cys-loop neurotransmitter-gated ion channels. This family contains the extracellular domain (ECD) of anionic Cys-loop neurotransmitter-gated ion channels which include type-A gamma-aminobutyric acid receptor (GABAAR), glycine receptor (GlyR), invertebrate glutamate-gated chloride channel (GluCl), and histamine-gated chloride channel (HisCl). These neurotransmitter receptors directly mediate chloride permeability and constitute one half of the Cys-loop receptor family. Receptors in this family are composed of five either identical or homologous subunits, which generate diversity in functional profiles and pharmacological preferences. GABAAR and GlyR both mediate fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. GluCl channels are found only in protostomia, but are closely related to mammalian glycine receptors (GlyRs). They have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Ligand-gated chloride channels are not only critical for maintaining appropriate neuronal activity, but have also long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on this inhibitory half of the Cys-loop receptor family. Many of the therapeutically useful compounds acting at Cys-loop receptors target an allosteric site. The sites in Cys-loop receptors at which these allosteric ligands bind and their structure-based mechanisms of action are largely unresolved. 185
36230 349789 cd18988 LGIC_ECD_bact extracellular domain of prokaryotic pentameric ligand-gated ion channels (pLGIC). This family contains the extracellular domain (ECD) of bacterial pentameric ligand-gated ion channels (pLGICs), including ones from Gloeobacter violaceus (GLIC) and Erwinia chrysanthemi (ELIC). These prokaryotic homologs of Cys-loop receptors have been useful in understanding their eukaryotic counterparts. The largely beta-sheet ECD in this family is similar to other pLGICs, but lacks the cysteine loop and an intracellular domain. While most pLGICs undergo desensitization on prolonged exposure to the agonist, GLIC is activated by protons, but does not desensitize, even at proton concentrations eliciting maximal electrophysiological response (pH 4.5). Studies show that GLIC activation is inhibited by most general anaesthetics at clinical concentrations, including xenon which has been used in clinical practice as a potent gaseous anesthetic for decades. Xenon binding sites have been identified in three distinct regions of the TMD: in a large intra-subunit cavity, in the pore, and at the interface between adjacent subunits. 182
36231 349790 cd18989 LGIC_ECD_cation extracellular domain (ECD) of cationic Cys-loop neurotransmitter-gated ion channels. This superfamily contains the extracellular domain (ECD) of cationic Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), and zinc-activated ligand-gated ion channel (ZAC) receptor. These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+ and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown. 180
36232 349791 cd18990 LGIC_ECD_GABAAR gamma-aminobutyric acid receptor extracellular domain. This family contains extracellular domain (ECD) of type-A gamma-aminobutyric acid receptor (GABAAR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. This family includes 19 isoforms in human; six alpha, 3 beta, 3 gamma, one of delta, epsilon, pi, and theta, known to form heteropentameric GABAARs, and 3 rho subunits that only form homopentameric channels (also known as GABAA rho or GABAC receptor) or pseudoheteromeric if consisting of different rho subunits. The majority of GABAA receptor pentamers contain two alpha subunits, two beta subunits, and a gamma subunit, with different isoforms affecting potency of the neurotransmitter. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to its site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the ECD. The channels have to contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. 184
36233 349792 cd18991 LGIC_ECD_GlyR extracellular domain of glycine receptor (GlyR). This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine. GlyR has four known isoforms of the alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands and a single beta-subunit (encoded by GLRB), all of which have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). Functional chloride-permeable GlyR ion channels are formed by 5 alpha subunit homopentamers or by alpha and beta subunit heteropentamers, which form complexes with either a 2alpha-3beta or 3alpha-2beta stoichiometry. The receptor can be activated by glycine as well as beta-alanine and taurine, and can be selectively blocked by the high-affinity competitive antagonist strychnine. Caffeine is also a competitive antagonist of GlyR. In human, glycine receptor alpha1 and beta subunits are the major targets of mutations that cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions, leading to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli. Mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing. 185
36234 349793 cd18992 LGIC_ECD_HisCl extracellular domain of histamine-gated chloride channel (HisCl or HGCC). This family contains the extracellular domain (ECD) of histamine-gated chloride channel (HisCl), a member of the Cys-loop receptor superfamily of ligand-gated ion channels that is closely related to the mammalian GABAA receptor and glycine receptor (GlyR). Histamine (HA) is a neurotransmitter that activates GPCRs in vertebrates, but in arthropods, it is a photoreceptor neurotransmitter, directly gating chloride channels on large monopolar cells (LMCs), postsynaptic to photoreceptors in the lamina. It has also been reported to play important roles in mechanosensory reception, temperature preference, and sleep in insects. HA activates its receptor channels to cause an inward chloride flux in the insect nervous system. In Drosophila, HA acts on two histamine-gated chloride channel (HGCC) subunits called HisCl1 (HisClalpha2, HCLB) and HisCl2 (HisClalpha1, Ort, HCLA). HisCl1 (HCLB) and HisCl2 (HCLA) are expressed predominantly in the insect eye, sharing 60% sequence identity, and forming homomeric and heteromeric HGCCs. HCLA homomers are involved in synaptic transmission in the lamina, while HCLB homomers, localized in the glia cells, have a role in shaping the transmission. HCLB channels, but not HCLA channels, are also responsible for the activation and maintenance of wake state in D. melanogaster. In Manduca sexta, HCLB channels in the flight sensory-motor have been shown to be involved in olfactory processing circuit. Studies show that HCLB channels are more sensitive to agonists when compared with HCLA channels, but insensitive to known LGCC insecticides. 185
36235 349794 cd18993 LGIC_ECD_GluCl glutamate-gated chloride channel (GluCl) extracellular domain. This subfamily contains the extracellular domain of glutamate-gated chloride channels (GluCls), which are found only in protostomia but are closely related to mammalian glycine receptors. They have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Comparison of the GluCl gene families between organisms shows that the insect gene family is relatively simple, while that found in nematodes tends to be larger and more diverse. Glutamate is an inhibitory neurotransmitter that shapes the responses of projection neurons to olfactory stimuli in Drosophila. GluCls are targeted by the macrocyclic lactone family of anthelmintics and pesticides in arthropods and nematodes, thus making the GluCls of considerable medical and economic importance. In Drosophila melanogaster, GluCl mediates sensitivity to the antiparasitic agents ivermectin and nodulisporic acid, suggesting that their drug target is the same throughout the Ecdysozoa. 183
36236 349795 cd18994 LGIC_ECD_ZAC extracellular domain of zinc-activated ligand-gated ion channel. This family is the extracellular domain of zinc-activated ligand-gated ion channel (ZAC), a cationic ion channel belonging to the superfamily of Cys-loop receptors, which consists of pentameric ligand-gated ion channels. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown. 170
36237 349796 cd18995 LGIC_AChBP acetylcholine binding protein (AChBP). This family contains acetylcholine binding protein (AChBP) which is a soluble extracellular domain homolog secreted by protostomia, and has been widely recognized as a surrogate for the ligand binding domain of nicotinic acetylcholine receptors (nAChRs). AChBP forms a pentameric structure where the interfaces between the subunits provide an acetylcholine (ACh) binding pocket homologous to the binding pocket of nAChRs. Thus far, AChBPs have been characterized only in aquatic mollusks, which have shown low sensitivity to neonicotinoids, the insecticides targeting insect nAChRs. Lymnaea stagnalis acetylcholine binding protein (Ls-AChBP), which has been found in glial cells as a water-soluble protein modulating synaptic ACh concentration, has a binding pocket that shows better resemblance, as it contains all five aromatic residues fully conserved in nAChR. Five AChBP subunits have been characterized in Pardosa pseudoannulata, a predatory natural enemy of rice insect pests, and share higher sequence similarities with nAChR subunits of both insects and mammals compared with mollusk AChBP subunits. 180
36238 349797 cd18996 LGIC_ECD_5-HT3 extracellular domain of serotonin 5-HT3 receptor. This family contains the extracellular domain of serotonin 5-HT3 receptor which belongs to the Cys-loop superfamily of ligand-gated ion channels (LGICs). This ion channel is cation-selective and mediates neuronal depolarization and excitation within the central and peripheral nervous systems. Like other ligand gated ion channels, the 5-HT3 receptor consists of five subunits arranged around a central ion conducting pore, which is permeable to Na+, K+, and Ca2+ ions. Binding of the neurotransmitter 5-hydroxytryptamine (serotonin) to the 5-HT3 receptor opens the channel, which then leads to an excitatory response in neurons, and the rapidly activating, desensitizing, inward current is predominantly carried by Na+ and K+ ions. This receptor is most closely related by homology to the nicotinic acetylcholine receptor (nAChR). Five subunits have been identified for this family: 5-HT3A, 5-HT3B, 5-HT3C, 5-HT3D, and 5-HT3E, encoded by HTR3A-E genes. Only 5-HT3A subunits are able to form functional homomeric receptors, whereas the 5-HT3B, C, D, and E subunits form heteromeric receptors with 5-HT3A. Different receptor subtypes are important mediators of nausea and vomiting during chemotherapy, pregnancy, and following surgery, while some contribute to neuro-gastroenterologic disorders such as irritable bowel syndrome (IBS) and eating disorders as well as co-morbid psychiatric conditions. 5-HT3 receptor antagonists are established treatments for emesis and IBS, and are beneficial in the treatment of psychiatric diseases. 215
36239 349798 cd18997 LGIC_ECD_nAChR extracellular domain of nicotinic acetylcholine receptor. This family contains the extracellular domain of nicotinic acetylcholine receptor (nAChR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits, and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease. Among subtypes of muscle nAChRs, the heteromeric subunits (alpha1)2, beta, gamma, and delta in fetal muscle, and the gamma subunit replaced by epsilon in adult muscle have been implicated in congenital myasthenic syndromes and multiple pterygium syndromes due to various mutations. This family also includes alpha- and beta-like nAChRs found in protostomia. 181
36240 349799 cd18998 LGIC_ECD_GABAAR_A extracellular domain of gamma-aminobutyric acid receptor subunit alpha. This family contains extracellular domain (ECD) of type-A gamma-aminobutyric acid receptor (GABAAR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. This family includes 19 isoforms in human; six alpha, 3 beta, 3 gamma, one of delta, epsilon, pi, and theta, known to form heteromeric GABAARs, and 3 rho subunits that only form homomeric channels (also known as GABAA rho or GABAC receptor) or pseudoheteromeric if consisting of different rho subunits. GABAAR is assembled from a variety of different subunit subtypes which determines their pharmacology and physiology; the most abundant being 2alpha2beta1gamma stoichiometry. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to its site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the ECD. The channels have to contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRA1, GABRA3, GABRB3, GABRG2, and GABRD, encoding the alpha1-, alpha3-, beta3-, gamma2-, and delta-subunits have been directly associated with epilepsy. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity. 184
36241 349800 cd18999 LGIC_ECD_GABAAR_B extracellular domain of gamma-aminobutyric acid receptor subunit beta (GABAAR-B or GABRB). This family contains extracellular domain (ECD) of beta subunits of type-A gamma-aminobutyric acid receptor (GABAAR), which include beta1-beta4 in vertebrates. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the LBD. The channels must contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Mutations or genetic variations of the genes encoding the GABRB2 and GABRB3 have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2, and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. Mutations in the GABRB1 gene promote alcohol consumption through increased tonic inhibition. 182
36242 349801 cd19000 LGIC_ECD_GABAAR_G extracellular domain of gamma-aminobutyric acid receptor subunit gamma. This family contains extracellular domain (ECD) of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAA receptor theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes. 182
36243 349802 cd19001 LGIC_ECD_GABAAR_delta extracellular domain of gamma-aminobutyric acid receptor subunit delta. This family contains extracellular domain of delta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Receptors containing the delta subunit (GABRD) are expressed exclusively extra-synaptically (in the cortex, hippocampus, thalamus, striatum, and cerebellum) and mediate tonic inhibition. Studies suggest that delta subunits form heteropentamers in similar stoichiometry and arrangement as alpha/beta/gamma receptors, with the delta subunit replacing the gamma subunit (2alpha:2beta:1delta), although other stoichiometries have also been detected. The delta subunit is flexible in its positioning in the pentameric complex, producing receptors with diverse pharmacological properties. Mutations in GABRD have been associated with susceptibility to generalized epilepsy with febrile seizures, type 5. GABRD gene may also be associated with childhood-onset mood disorders. 184
36244 349803 cd19002 LGIC_ECD_GABAAR_E extracellular domain of gamma-aminobutyric acid receptor subunit epsilon (GABRE). This family contains extracellular domain of epsilon subunit of type-A gamma-aminobutyric acid receptor (GABAAR), a protein that is encoded by the GABRE gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The epsilon subunits form heteropentamers with other GABAAR subunits, possibly with alpha3, beta4, and theta subunits since their genes are clustered on the same human chromosome. Various combinations of alpha3-, theta-, and epsilon-subunits may be assembled at a regional and developmental level in the brain. Brainstem expression of epsilon subunit-containing GABAA receptors is upregulated during pregnancy, particularly in the ventral respiratory neurons, thus protecting breathing, despite increased neurosteroid levels during pregnancy. 182
36245 349804 cd19003 LGIC_ECD_GABAAR_theta extracellular domain of gamma-aminobutyric acid receptor subunit theta (GABRQ). This family contains extracellular domain (ECD) of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), and encoded by the GABRQ gene, which is mapped to chromosome Xq28 in a cluster of genes that also encode the alpha 3 and epsilon subunits. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAAR theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes. 183
36246 349805 cd19004 LGIC_ECD_GABAAR_pi extracellular domain of gamma-aminobutyric acid receptor subunit pi (GABRP). This family contains extracellular domain of pi subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRP is expressed mainly in non-neuronal tissues such as the mammary gland, prostate gland, lung, thymus, and uterus. It is also highly expressed in certain types of cancer such as basal-like breast cancer and pancreatic ductal adenocarcinoma. GABRP is involved in inhibitory synaptic transmission in the central nervous system. Its assembly with other GABAAR subunits alters the sensitivity of recombinant receptors to modulatory agents such as pregnanolone. Studies suggest that polymorphisms in the GABRP gene may be associated with the susceptibility to systemic lupus erythematosus (SLE). 182
36247 349806 cd19005 LGIC_ECD_GABAAR_rho extracellular domain of gamma-aminobutyric acid receptor subunit rho. This family contains extracellular domain of rho subunits (rho1, rho2, and rho3, encoded by GABRR1, GABRR2, and GABRR3, respectively) of type-A gamma-aminobutyric acid receptor (GABAAR). These subunits homo-oligomerize to form GABAA-rho receptors (formerly classified as GABA-rho or GABAC receptor), but do not co-assemble with any of the classical GABAA subunits. They are especially highly expressed in the retina and have distinctive pharmacological properties; they are not modulated by many GABAA receptor modulators such as barbiturates, benzodiazepines, and neuroactive steroids. In humans, mutations in the GABRR1 and GABRR2 genes may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder while a SNP in GABRR2 has been reported to show association with autism. 186
36248 349807 cd19006 LGIC_ECD_GABAAR_LCCH3-like gamma-aminobutyric acid receptor subunit beta-like extracellular domain in protostomia, such as LCCH3 (ligand-gated chloride channel homolog 3). This family contains extracellular domain of beta-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster ligand-gated chloride channel homolog 3 (LCCH3) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. LCCH3 has been shown to combine with subunit GRD to form cation-selective GABA-gated ion channels when coexpressed in Xenopus laevis oocytes. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing. 183
36249 349808 cd19007 LGIC_ECD_GABAR_GRD-like gamma-aminobutyric acid receptor subunit alpha-like extracellular domain in protostomia, such as GRD (GABA/glycine-like receptor of Drosophila). This family contains extracellular domain of alpha-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster GABA/ glycine-like receptor of Drosophila (GRD) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. LCCH3 has been shown to combine with subunit GRD to form cation-selective GABA-gated ion channels when co-expressed in Xenopus laevis oocytes. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing. 183
36250 349809 cd19008 LGIC_ECD_GABAR_RDL-like gamma-aminobutyric acid receptor subunit beta-like extracellular domain in protostomia, such as RDL (resistant to dieldrin). This family contains extracellular domain of beta-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster resistant to dieldrin (RDL) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing. Bombyx mori includes three RDL (RD1, RD2, RD3), one LCCH3, and one GRD subunits. Its RDL1 gene has RNA-editing sites, and the RDL1 and RDL3 genes possess alternative splicing, enhancing the diversity of its GABA-receptor gene family. The three RDL subunits may have arisen from two duplication events. 184
36251 349810 cd19009 LGIC_ECD_GlyR_alpha extracellular domain of glycine receptor alpha subunit. This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) alpha subunits of the amino acid neurotransmitter glycine. GlyR has four known isoforms of alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands, and, along with the GlyR beta subunit, have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). These alpha subunits are highly homologous, but differ in their kinetic properties, temporal and regional expression and physiological functions. They can form functional chloride-permeable GlyR ion channels by forming homopentamers with 5 alpha subunits or heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. In humans, mutations in glycine receptor alpha subunits cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions. Mutations in GlyR alpha1 subunit lead to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, while mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing. GlyR alpha1 and alpha2 subunits have an important role in regulation of the excitatory-inhibitory balance, control of motor actions, modulation of sedative ethanol effects and probably regulation of ethanol preference and consumption. 184
36252 349811 cd19010 LGIC_ECD_GlyR_beta extracellular domain of glycine receptor beta subunit. This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) beta subunit of the amino acid neurotransmitter glycine encoded by GLRB gene. These subunits form heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. While the alpha subunits contain binding sites for agonists and antagonists and are responsible for ion channel formation, the beta subunit displays structural and regulatory functions, such as GlyR clustering in synaptic locations by interaction between intracellular loop domains with the scaffolding protein gephyrin, and control of pharmacologic responses to agonist or allosteric modulators due in part to the presence of interfaces alpha/beta and beta/beta. GLRB gene mutations are associated with the neurological disorder hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, as well as agoraphobic cognitions. 187
36253 349812 cd19011 LGIC_ECD_5-HT3A extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit A (5HT3A). This subfamily contains extracellular domain of subunit A of serotonin 5-HT3 receptor (5-HT3AR), encoded by the HTR3A gene. 5-HT3A subunit forms a homopentameric complex or a heterologous combination with other subunits (B-E). Heteromeric combination of A and B subunits provides the full functional features of this receptor, since either subunit alone results in receptors with very low conductance and response amplitude. 5-HT3A receptors are located in the dorsal vagal complex of the brainstem and in the gastrointestinal (GI) tract, and form a channel circuit that controls gut motility, secretion, visceral perception, and the emesis reflex. These receptors are implicated in several GI and psychiatric disorder conditions including anxiety, depression, bipolar disorder, and irritable bowel syndrome (IBS). Several 5-HT3AR antagonists, such as the isoquinoline Palonosetron, are in clinical use to control emetic reflexes associated with gastrointestinal pathologies and cancer therapies. SNPs in the 5-HT3A serotonin receptor gene are associated with psychiatric disorders. 208
36254 349813 cd19012 LGIC_ECD_5-HT3B extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit B (5HT3B). This subfamily contains extracellular domain of subunit B of serotonin 5-HT3 receptor (5-HT3BR), encoded by the HTR3B gene. 5-HT3B is not functional as a homopentameric complex and is co-expressed with the 5-HT3A subunit, resulting in heteromeric 5-HT3AB receptors that are functionally distinct from homomeric 5-HT3A receptors. This receptor causes fast, depolarizing responses in neurons after activation, with affinities of competitive ligands at the two receptor subtypes extracellular domains mostly similar. HTR3B gene variants may contribute to variability in severity of and response to anti-emetic therapy for nausea and vomiting in pregnancy, as well as efficacy of ondansetron in cancer chemotherapy, radiation therapy, or surgery. 5-HT3B subunit affects high-potency inhibition of 5-HT3 receptors by morphine by reducing its affinity at its high-affinity, non-competitive site. 210
36255 349814 cd19013 LGIC_ECD_5-HT3C_E extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit E (5HT3E); may include subunits C and D (5-HT3C,D). This subfamily contains extracellular domain of subunit E of serotonin 5-HT3 receptor (5-HT3ER), encoded by the HTR3E gene, and may also contain subunits C and D, all three encoding genes forming a cluster on chromosome 3. Data show that 5-HT3C, 5-HT3D, and 5-HT3E subunits are co-expressed with 5-HT3A in cell bodies of myenteric neurons, and that 5-HT3A and 5-HT3D are expressed in submucosal plexus of the human large intestine while HTR3E is restricted to the colon, intestine, and stomach. None of these subunits can form functional homopentamers, but, upon co-expression with the 5-HT3A subunit, they give rise to functional receptors that differ in maximal responses to 5-HT, and thus modulate 5-HT3 receptor's pharmacological profile. HTR3A and HTR3E polymorphisms have been shown to remarkably up-regulate the expression of 5-HT3 receptors, which have been proved to cause the gastric functional disorders including emesis, eating disorders and IBS-D. 215
36256 349815 cd19014 LGIC_ECD_nAChR_A1 extracellular domain of nicotinic acetylcholine receptor subunit alpha 1 (CHRNA1). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 1 (alpha1), encoded by the CHRNA1 gene. These muscle type nicotinic subunits form heteropentamers with other nAChR subunits, most broadly expressed as combination of two alpha1, beta1, delta, and epsilon subunits in mature muscles, and of two alpha1, beta1, delta, and gamma in embryonic cells. The alpha1 subunit in human nAChR is the primary target of Myasthenia gravis antibodies that disrupt communication between the nervous system and the muscle, causing chronic muscle weakness. 210
36257 349816 cd19015 LGIC_ECD_nAChR_A2 extracellular domain of nicotinic acetylcholine receptor subunit alpha 2 (CHRNA2). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 2 (alpha2), encoded by the CHRNA2 gene. It is specifically expressed in medial subpallium-derived amygdalar nuclei from early developmental stages to adult. This subunit is incorporated in heteropentameric neuronal nAChRs mainly with beta2 or beta4 subunits and, along with the alpha4 and alpha7, is one of the main nAChR subunits expressed in primate brain. In Xenopus laevis oocytes, when alpha2 is co-expressed with the beta2 subunit, two subtypes of alpha2beta2 nAChR are formed with either low or high ACh sensitivity. Mouse mutation studies show that alpha2 subunits in the nAChRs influence hippocampus-dependent learning and memory as well as CA1 synaptic plasticity in adolescent mice. 207
36258 349817 cd19016 LGIC_ECD_nAChR_A3 extracellular domain of nicotinic acetylcholine receptor subunit alpha 3 (CHRNA3). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 3 (alpha3), encoded by the CHRNA3 gene, and likely plays a role in neurotransmission. The alpha3 subunit is expressed in the aorta and macrophages, and may play a regulatory role in the process of vascular inflammation. One of the most broadly expressed subtypes is the alpha3beta4 nAChR, also known as the ganglion-type nicotinic receptor, located in the autonomic ganglia and adrenal medulla, where activation yields post- and/or presynaptic excitation, mainly by increased Na+ and K+ permeability. The exact pentameric stoichiometry of alpha3beta4 receptor is not known and functional assemblies with varying subunit stoichiometries are possible. Alpha4 plays a pivotal role in regulating the inflammatory responses in endothelial cells and macrophages, via mechanisms involving the modulations of multiple cell signaling pathways. Polymorphisms in this gene (CHRNA3) have been associated with an increased risk of smoking initiation and an increased susceptibility to lung cancer. 207
36259 349818 cd19017 LGIC_ECD_nAChR_A4 extracellular domain of neuronal acetylcholine receptor subunit alpha 4 (CHRNA4). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 4 (alpha4), encoded by the CHRNA4 gene. Alpha4 forms a functional nAChR by interacting with either nAChR beta2 or beta4 subunits. Alpha4beta2, the major heteropentameric nAChR in the brain, exists in two isoforms, (alpha4)3(beta2)2 and (alpha4)2(beta2)3, with the latter believed to constitute the majority of alpha4beta2 nAChR in the cortex. Both isoforms contain two canonical alpha4:beta2 ACh-binding sites with either low or high ACh sensitivity. This protein is an integral membrane receptor subunit that can interact with either nAChR beta-2 or nAChR beta-4 to form a functional receptor. Mutations in this gene (CHRNA4) cause nocturnal frontal lobe epilepsy type 1. Polymorphisms in this gene may provide protection against nicotine addiction. 181
36260 349819 cd19018 LGIC_ECD_nAChR_A5 extracellular domain of nicotinic acetylcholine receptor subunit alpha 5 (CHRNA5). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 5 (alpha5), encoded by the CHRNA5 gene, which is part of the CHRNA5/A3/B4 gene cluster. Polymorphisms in this gene cluster have been identified as risk factors for nicotine dependence, lung cancer, chronic obstructive pulmonary disease, alcoholism, and peripheral arterial disease. A loss-of-function polymorphism in CHRNA5 is strongly linked to nicotine abuse and schizophrenia; the alpha5 nAChR subunit is strategically situated in the prefrontal cortex (PFC), where a loss-of-function in this subunit may contribute to cognitive disruptions in both disorders. Alpha5 forms heteropentamers with alpha3beta2 or alpha3beta4 nAChRs which increases the calcium permeability of the resulting receptors possibly playing significant roles in the initiation of ACh-induced signaling cascades under normal and pathological condition. Acetylcholine (ACh) release and signaling via alpha4/beta2 nAChR subunits plays a central role in the control of attention, but a subset of these oligomers also contains alpha5 subunit. A strong association is seen between a CHRNA5 polymorphism and the risk of lung cancer, especially in smokers. 207
36261 349820 cd19019 LGIC_ECD_nAChR_A6 extracellular domain of nicotinic acetylcholine receptor subunit alpha 6 (CHRNA6). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 6 (alpha6), encoded by the CHRNA6 gene. Human (alpha6beta2)(alpha4beta2)3 nicotinic acetylcholine receptors (AChRs) are essential for addiction to nicotine and a target for drug development for smoking cessation. In Xenopus oocytes, data show efficient expression of (alpha6beta2)2beta3 AChR subunits with only small changes in alpha6 subunits, while not altering AChR pharmacology or channel structure. Alternatively spliced transcript variants have been observed for this gene. Single nucleotide polymorphisms in this gene have been associated with both nicotine and alcohol dependence. CHRNA6 has a cellular expression signature for retinal ganglion cells with high correlation to Thy1, a known marker, and is preferentially expressed by retinal ganglion cells (RGCs) in the young and adult mouse retina and expression is reduced in glaucoma. A genetic variant in CHRNB3-CHRNA6 cluster is associated with esophageal adenocarcinoma. 181
36262 349821 cd19020 LGIC_ECD_nAChR_A7 extracellular domain of neuronal acetylcholine receptor subunit alpha 7 (CHRNA7). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 7 (alpha7), encoded by the CHRNA7 gene. Alpha7 subunits form a homo-pentameric channel that displays marked permeability to calcium ions and is a major component of brain nicotinic receptors that are blocked by, and highly sensitive to, alpha-bungarotoxin. This protein is ubiquitously expressed in both the central nervous system and in the periphery, in several tissues, including adrenal, small intestine, testis, and stomach. CHRNA7 is located in a region identified as a major susceptibility locus for juvenile myoclonic epilepsy and a chromosomal location involved in the genetic transmission of schizophrenia. It is also genetically linked to other disorders with cognitive deficits, including bipolar disorder, ADHD, Alzheimer's disease, and Rett syndrome. An evolutionarily recent partial duplication of CHRNA7 on chromosome 15 forms a new gene, CHRFAM7A or FAM7A, which encodes the protein dup-alpha7. This protein assembles with alpha7 subunits, resulting in fewer binding sites, and is a dominant negative regulator of alpha7 nAChR function. 180
36263 349822 cd19021 LGIC_ECD_nAChR_A7L extracellular domain of neuronal acetylcholine receptor subunit alpha-7-like. This family contains the extracellular domain of nicotinic acetylcholine receptor (nAChR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease. 179
36264 349823 cd19022 LGIC_ECD_nAChR_A9 extracellular domain of neuronal acetylcholine receptor subunit alpha 9 (CHRNA9). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 9 (alpha9), encoded by the CHRNA9 gene. This protein is involved in cochlea hair cell development and is also expressed in the outer hair cells (OHCs) of the adult cochlea as well as in keratinocytes, the pituitary gland, B-cells, and T-cells. Mammalian alpha9 subunits can form functional homomeric alpha9 receptors as well as the heteromeric alpha9alpha10 receptors, the latter being atypical since the heteromeric alpha9alpha10 receptor is composed only of alpha subunits compared to nAChRs typically assembled from alpha and beta subunits. A stoichiometry of (alpha9)2(alpha10)3 has been determined for the rat recombinant receptor. The alpha9alpha10 nAChR is an important therapeutic target for pain; selective block of alpha9alpha10 nicotinic acetylcholine receptors by the conotoxin RgIA has been shown to be analgesic in an animal model of nerve injury pain, and accelerates recovery of nerve function after injury, possibly through immune/inflammatory-mediated mechanisms. CHRNA9 polymorphisms are associated with non-small cell lung cancer, and effect of a particular SNP (rs73229797) and passive smoking exposure on risk of breast malignancy has been observed. 207
36265 349824 cd19023 LGIC_ECD_nAChR_A10 extracellular domain of neuronal acetylcholine receptor subunit alpha 10 (CHRNA10). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 10 (alpha10), encoded by the CHRNA10 gene. This protein is involved in cochlea hair cell development and is also expressed in the outer hair cells (OHCs) of the adult cochlea as well as in keratinocytes, the pituitary gland, B-cells, and T-cells. Unlike alpha9 nAChR subunits, alpha10 subunits do not generate functional channels when expressed heterologously, suggesting that alpha10 might serve as a structural subunit, much like a beta subunit of heteromeric receptors, providing only complementary components to the agonist binding site. Mammalian alpha10 subunits can form functional heteromeric alpha9alpha10 receptors, an atypical heteromeric receptor since it is composed only of alpha subunits compared to nAChRs typically assembled from alpha and beta subunits. A stoichiometry of (alpha9)2(alpha10)3 has been determined for the rat recombinant receptor. The alpha9alpha10 nAChR is an important therapeutic target for pain; selective block of alpha9alpha10 nicotinic acetylcholine receptors by the conotoxin RgIA has been shown to be analgesic in an animal model of nerve injury pain, and accelerates recovery of nerve function after injury, possibly through immune/inflammatory-mediated mechanisms. 181
36266 349825 cd19024 LGIC_ECD_nAChR_B1 extracellular domain of nicotinic acetylcholine receptor subunit beta 1 (CHRNB1). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 1 (beta1), encoded by the CHRNB1 gene. It is a muscle type subunit found predominantly in the neuromuscular junction (NMJ), but also in other tissues and cell lines such as adrenal glands, carcinomas, brain, and lung. Simultaneous mRNA and protein expression of beta1 nAChR subunit is present in human placenta and skeletal muscle. The beta1 nAChR subunit forms a heteropentamer with either (alpha1)2, gamma and delta subunits in embryonic type or (alpha1)2, epsilon and delta subunits in adult type receptors. nAChRs containing beta1 subunits have been attributed to efficient clustering and anchoring of the receptors to the cytoskeleton which is important for formation of synapses in the NMJ. Mutations in the transmembrane domain region of this gene are associated with slow-channel congenital myasthenic syndrome (CMS). 213
36267 349826 cd19025 LGIC_ECD_nAChR_B2 extracellular domain of nicotinic acetylcholine receptor subunit beta 2 (CHRNB2). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 2 (beta2), encoded by the CHRNB2 gene. The most abundant nicotinic subtype in the human brain is alpha4beta2 receptor which is known to assemble in two functional subunit stoichiometries, (alpha4)3(beta2)2 and (alpha4)2(beta2)3, the latter having a much higher affinity for both acetylcholine and nicotine. This subtype is implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism, and neurodegenerative disorders such as Parkinson's disease and Alzheimer's disease. Thus, pharmacological ligands targeting this subtype have been researched and developed as a treatment approach implicated in these diseases. They include agonists such as varenicline and cytisine used as smoking cessation aids, as well as positive allosteric modulators (PAMs) such as desformylflustrabromine (dFBr), which are ligands that bind to nicotinic receptors at sites other than the orthosteric site where acetylcholine binds, and are not able to act as agonists on nAChR. 204
36268 349827 cd19026 LGIC_ECD_nAChR_B3 extracellular domain of nicotinic acetylcholine receptor subunit beta 3 (CHRNB3). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 3 (beta3), encoded by the CHRNB3 gene. CHRNB3 polymorphisms have been reported to potentially affect nicotine-induced upregulation of nicotinic and to be associated with disorders such as schizophrenia, autism, and cancer. Beta3 subunit is depleted in the striatum of Parkinson's disease patients. Rare variants in CHRNB3 are also implicated in risk for alcohol and cocaine dependence and independently associated with bipolar disorder. Human alpha6beta2beta3* (* indicating possible additional assembly partners) nAChRs on dopaminergic neurons are important targets for drugs to treat nicotine addiction and Parkinson's disease; (alpha6beta2)(alpha4beta2)beta3 nAChR is essential for addiction to nicotine and a target for drug development for smoking cessation. 179
36269 349828 cd19027 LGIC_ECD_nAChR_B4 extracellular domain of nicotinic acetylcholine receptor subunit beta 4 (CHRNB4). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 4 (beta4), encoded by the CHRNB4 gene and ubiquitously expressed on lung epithelial cells. The cluster of human neuronal nicotinic receptor genes CHRNA5-CHRNA3-CHRNB4 is related to drug-related behaviors and the development of lung cancer. One of the most broadly expressed subtypes is the alpha-3 beta-4 nAChR, also known as the ganglion-type nicotinic receptor, located in the autonomic ganglia and adrenal medulla, where activation yields post- and/or pre-synaptic excitation, mainly by increased Na+ and K+ permeability. Beta4 forms heteromeric nAChRs to modulate receptor affinity for nicotine, but the exact pentameric stoichiometry of the alpha3beta4 receptor is not known; functional assemblies with varying subunit stoichiometries are possible. 178
36270 349829 cd19028 LGIC_ECD_nAChR_D extracellular domain of nicotinic acetylcholine receptor subunit delta (CHRND). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit delta (delta), encoded by the CHRND gene and found in the muscle. Delta nAChR subunit forms a heteropentamer with either (alpha1)2, beta and gamma subunits in embryonic type or (alpha1)2, beta and epsilon subunits in adult type receptors. Defects in this gene are a cause of multiple pterygium syndrome lethal type (MUPSL), congenital myasthenic syndrome slow-channel type (SCCMS), and congenital myasthenic syndrome fast-channel type (FCCMS). The slow-channel congenital myasthenic syndromes (SCCMS) are caused by prolonged opening episodes of AChR due to dominant gain-of-function mutations in the transmembrane regions of the heteropentamer. These mutations produce an increase in the channel opening rate, a decrease in the channel closing rate, or an increase in the affinity of ACh for the AChR, resulting in the stabilization of the open state or the destabilization of the closed state of the AChR. 221
36271 349830 cd19029 LGIC_ECD_nAChR_G extracellular domain of nicotinic acetylcholine receptor subunit gamma (CHRNG). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit gamma (gamma), encoded by the CHRNG gene expressed during early fetal development, and replaced by the epsilon subunit in the adult. The gamma subunit forms a heteropentamer with (alpha1)2, beta, and delta and plays a role in neuromuscular organogenesis and ligand binding. Disruption of gamma subunit expression prevents the correct localization of the receptor in cell membranes. Mutations in CHRNG may cause the non-lethal Escobar variant (EVMPS) and lethal form (LMPS) of multiple pterygium syndrome (MPS), a condition characterized by prenatal growth failure with pterygium and akinesia leading to muscle weakness and severe congenital contractures, as well as scoliosis. Muscle-type acetylcholine receptor is the major antigen in the autoimmune disease myasthenia gravis. 193
36272 349831 cd19030 LGIC_ECD_nAChR_E extracellular domain of nicotinic acetylcholine receptor subunit epsilon (CHRNE). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit epsilon (epsilon), encoded by the CHRNE gene and found in adult skeletal muscle. Epsilon subunit forms a heteropentamer with (alpha1)2, beta and delta after birth, replacing the gamma subunit seen in embryonic receptors. The adult-type epsilon-AChR has a higher conductance and a shorter open time compared to embryonic gamma-AChR and the open channel is non-selectively cation permeable. Mutations of the CHRNE gene are the most common causes of congenital myasthenic syndrome (CMS), most of which are autosomal recessive loss-of-function mutations, resulting in endplate AChR deficiency. A highly fatal fast-channel syndrome is caused by AChR epsilon subunit mutation (Trp to Arg; changing environment from anionic to cationic) at the agonist binding site at the alpha/epsilon interface of the receptor, thus disrupting agonist binding affinity and gating efficiency. 191
36273 349832 cd19031 LGIC_ECD_nAChR_proto_alpha-like extracellular domain of nicotinic acetylcholine receptor subunit alpha-like found in protostomia. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha-like in organisms that include arthropods, mollusks, annelid worms, and flat worms, and have their cholinergic system limited to the central nervous system. C. elegans genome encodes 29 acetylcholine receptor subunits, of which the levamisole-sensitive receptor (L-AChR) alpha-subunits, UNC-38, UNC-63, and LEV-8, included in this subfamily, form heteromers with the two non-alpha (also known as beta-like) subunits, UNC-29 and LEV-1. This receptor functions as the main excitatory postsynaptic receptor at neuromuscular junctions, indicating that many are expressed in neurons. Also included is the nicotinic alpha subunit MARA1 (Manduca ACh Receptor Alpha 1) which is expressed in Ca2+ responding neurons and contributes to the nicotinic responses in the neurons. In insects, the receptors supply fast synaptic excitatory transmission and represent a major target for several insecticides. In Drosophila, ten exclusively neuronal nAChRs have been identified, Dalpha1-Dalpha7 and Dbeta1-Dbeta3, and various combinations of these subunits and mutations are key to nAChR function. Alpha5 subunit is involved in alpha-bungarotoxin sensitivity while the alpha6 subunit is essential for the insecticidal effect of spinosad. nAChR agonists acetylcholine, nicotine, and neonicotinoids stimulate dopamine release in Drosophila larval ventral nerve cord and mutations in nAChR subunits affect how insecticides stimulate dopamine release. 222
36274 349833 cd19032 LGIC_ECD_nAChR_proto_beta-like extracellular domain of nicotinic acetylcholine receptor subunit beta-like found in protostomia. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta-like in organisms that include arthropods, mollusks, annelid worms, and flat worms, and have their cholinergic system limited to the central nervous system. C. elegans genome encodes 29 acetylcholine receptor subunits, of which the levamisole-sensitive receptor alpha-subunits (L-AChR), UNC-38, UNC-63, and LEV-8, form heteromers with the two non-alpha (also known as beta-like) subunits, UNC-29 and LEV-1 found in this subfamily. This receptor functions as the main excitatory postsynaptic receptor at neuromuscular junctions, indicating that many are expressed in neurons. In insects, the receptors supply fast synaptic excitatory transmission and represent a major target for several insecticides. In Drosophila, ten exclusively neuronal nAChR subunits have been identified, Dalpha1-Dalpha7 and Dbeta1-Dbeta3, and various combinations of these subunits and mutations are key to nAChR function. Dbeta1 subunits in dopaminergic neurons play a role in acute locomotor hyperactivity caused by nicotine in male Drosophila. Mutations of Dbeta2 or Dalpha1 nAChR subunits in Drosophila strains have significantly lower neonicotinoid-stimulated release, but no changes in nicotine-stimulated release; they are highly resistant to the neonicotinoids nitenpyram and imidacloprid. This family also includes a novel nAChR found in Aplysia bag cell neurons (neuroendocrine cells that control reproduction) which is a cholinergic ionotropic receptor that is both nicotine insensitive and acetylcholine sensitive. 208
36275 349834 cd19033 LGIC_ECD_nAChR_proto-like nicotinic acetylcholine receptor (nAChR) subunit extracellular domain in molluscs and annelids. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit found in molluscs, including several Lymnaea nAChRs, and annelids that are mostly uncharacterized. To date, 12 Lymnaea nAChRs have been identified which can be subdivided in two subtypes according to the residues that may be contributing to the selectivity of ion conductance. Phylogenetic analysis of the nAChR gene sequences suggests that anionic nAChRs in molluscs probably evolved from cationic ancestors through amino acid substitutions in the ion channel pore which is a mechanism different from acetylcholine-gated channels in other invertebrates. 183
36276 349835 cd19034 LGIC_ECD_GABAAR_A1 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-1 (GABAAR-A1 or GABRA1). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-1 (GABAAR-A1), a protein that is encoded by the GABRA1 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-1 subunits form heteropentamers with other GABAAR subunits, most broadly expressed as combination of two alpha1, beta1, gamma. Alpha1, beta2, and gamma2 subunits are clustered on the same human chromosome and may be why alpha1beta2gamma2 receptors are one of the most abundant GABAA receptor isoforms in CNS neurons. Mutations in this gene cause familial juvenile myoclonic epilepsy, sporadic childhood absence epilepsy type 4, and idiopathic familial generalized epilepsy. Polymorphisms in GABRA1 are also significantly associated with schizophrenia. GABRA1 has also been associated with methamphetamine abuse. The GABRA1 receptor is the specific target of the z-drug class of nonbenzodiazepine hypnotic agents and is responsible for their hypnotic and hallucinogenic effects. 194
36277 349836 cd19035 LGIC_ECD_GABAAR_A2 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-2 (GABAAR-A2 or GABRA2). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-2 (GABAAR-A2), a protein that is encoded by the GABRA2 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-2 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha2beta3gamma2. The alpha-2 (GABRA2) subunit is found primarily in the forebrain and hippocampus, and is more confined to areas of the brain compared to other alpha subunits. GABRA2 increases the risk of anxiety, making it a target for treating behavioral disorders including alcohol dependence, and drug use. GABRA2 is a binding site for benzodiazepines (psychoactive drugs known to reduce anxiety), causing chloride channels to open, leading to the hyper-polarization of the membrane. Other anxiolytic drugs such as Diazepam bind this subunit to induce inhibitory effects. GABRA2 is associated with reward behavior when it activates the insula, the part of the cerebral cortex responsible for emotions. GABA alpha2 and/or alpha3 receptor subtypes are also involved in GABAergic modulation of prolactin secretion. 203
36278 349837 cd19036 LGIC_ECD_GABAAR_A3 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-3 (GABAAR-A3 or GABRA3). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-3 (GABAAR-A3), a protein that is encoded by the GABRA3 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-3 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha3betagamma2, typically found post-synaptically. Rare loss-of-function variants in GABRA3 have been shown to increase the risk for a varying combination of epilepsy, intellectual disability/developmental delay, and dysmorphic features. GABRA3, normally exclusively expressed in adult brain, is also expressed in breast cancer, with high expression being inversely correlated with breast cancer survival. It activates the AKT pathway to promote breast cancer cell migration, invasion, and metastasis. GABRA3 promotes lymphatic metastasis in lung adenocarcinoma by mediating upregulation of matrix metalloproteinases, MMP-2 and MMP-9, through activation of the JNK/AP-1 signaling pathway. GABRA3 is overexpressed in human hepatocellular carcinoma growth and, with GABA, promotes the proliferation of cancer cells. 200
36279 349838 cd19037 LGIC_ECD_GABAAR_A4 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-4 (GABAAR-A4 or GABRA4). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-4 (GABAAR-A4), a protein that is encoded by the GABRA4 gene in humans, with biased expression in the brain and heart. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-4 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha2alpha4beta1gamma1, all four subunits existing on the same gene cluster. Alpha-4 is involved in the etiology of autism and eventually increases autism risk through interaction with the beta-1 (GABRB1) subunit. Polymorphism in GABRA4 may trigger migraine by ethanol, while another is associated to faster reaction times and with lower ethanol effects. A rare variant in GABRA4 may have modest physiological effect in autism spectrum disorder etiology. 199
36280 349839 cd19038 LGIC_ECD_GABAAR_A5 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-5 (GABAAR-A5 or GABRA5). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-5 (GABAAR-A5), a protein that is encoded by the GABRA5 gene in humans, with biased expression in the brain and heart. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-5 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as alpha5-beta-gamma2, and probably alpha5-beta3-gamma2, predominantly expressed in the hippocampus and localized extrasynaptically. These receptors have been demonstrated to play an important modulatory role in learning and memory processes, thus making them suitable targets for pharmacological intervention. Studies show that alpha5-containing GABAARs play an important part in tonic inhibition in hippocampal pyramidal neurons, and that these can also contribute to synaptic inhibition. Studies strongly suggest that amnesia is primarily mediated by alpha5-beta-gamma2. Polymorphisms in GABRA5 (and GABRA3) are linked to the susceptibility to panic disorder. A genetic association also exists between GABRA5 and bipolar affective disorder. 199
36281 349840 cd19039 LGIC_ECD_GABAAR_A6 extracellular domain of gamma-aminobutyric acid receptor subunit alpha-6 (GABAAR-A6 or GABRA6). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-6 (GABAAR-A6), a protein that is encoded by the GABRA6 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-6 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as alpha6-beta-gamma2 found extrasynaptically, alpha6-beta2/3-delta in the cerebellar granule cells and likely also forms alpha1-alpha6-beta-gamma/alpha1-alpha6-beta-delta. A GABRA6 mutation from Arg to Trp, has been identified as a susceptibility gene that may contribute to the pathogenesis of childhood absence epilepsy and cause neuronal disinhibition and increase in seizures via a reduction of alphabetagamma and alphabetadelta receptor function and expression. Polymorphism in the GABRA6 gene is associated with specific personality characteristics as well as a marked attenuation in hormonal and blood pressure responses to psychological stress. Alpha6-containing receptors lack high sensitivity to diazepam. 198
36282 349841 cd19040 LGIC_ECD_GABAAR_B1 extracellular domain of gamma-aminobutyric acid receptor subunit beta-1 (GABAAR-B1 or GABRB1). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-1 subunit, a protein that is encoded by the GABRB1 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-1 subunit forms heteropentamers with other GABAAR subunits, likely expressed as alpha-beta1-gamma/delta, mainly found in the brain. It is clustered on the chromosome with genes encoding alpha 4, alpha 2, and gamma 1 subunits of the GABAAR. GABRB1 expression is altered significantly in the lateral cerebellum of subjects with schizophrenia, major depression, and bipolar disorder. Mutations in the GABRB1 gene promote alcohol consumption through increased tonic inhibition. Epigenetic control of gene expression may affect the expression of GABRB1 and disrupt inhibitory synaptic transmission during embryonic development. The GABRB1 gene is also associated with thalamus volume and modulates the association between thalamus volume and intelligence. 182
36283 349842 cd19041 LGIC_ECD_GABAAR_B2 extracellular domain of gamma-aminobutyric acid receptor subunit beta-2 (GABAAR-B2 or GABRB2). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-2 subunit, a protein that is encoded by the GABRB2 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-2 subunit forms heteropentamers with other GABAAR subunits, with alpha1-beta2-gamma2 subtype being the most prevalent isoform (approximately 50%-60% of all GABAARs), and are expressed in almost all regions of the brain. It also assembles less abundantly as alpha4beta2/3delta and alpha6beta2/3delta. Mutations or genetic variations of the genes encoding the GABRB2 and GABRB3 have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2, and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. GABRB2 plays important tumorigenic functions and acts as a novel oncogene in papillary thyroid carcinoma (PTC). 182
36284 349843 cd19042 LGIC_ECD_GABAAR_B3 extracellular domain of gamma-aminobutyric acid receptor subunit beta-3 (GABAAR-B3 or GABRB3). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-3 subunit, a protein that is encoded by the GABRB3 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-3 subunit forms heteropentamers with other GABAAR subunits, with alpha2-beta3-gamma2 and alpha3-beta3-gamma2 subtypes highly enriched in hippocampal pyramidal neurons and cholinergic neurons of the basal forebrain, respectively. Other heteromers include alpha1-beta3-gamma2 and alpha5-beta3-gamma2. GABRB3 mutations are likely associated with a broad phenotypic spectrum of epilepsies and that reduced receptor function causing GABAergic disinhibition represents the relevant disease mechanism. GABRB3 might be associated with heroin dependence, and increased expression possibly contributing to the pathogenesis of heroin dependence. This gene may also be associated with the pathogenesis of other disorders such as Angelman syndrome, Prader-Willi syndrome, nonsyndromic orofacial clefts, schizophrenia, and autism. 183
36285 349844 cd19043 LGIC_ECD_GABAAR_G1 extracellular domain of gamma-aminobutyric acid receptor subunit gamma-1 (GABAAR-G1 or GABRG1). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-1 (GABAAR-G1), a protein that is encoded by the GABRG1 gene in humans, clustered with the alpha2 gene GABRA2, which is associated with alcohol dependence. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-1 subunit forms heteropentamers with other GABAAR subunits, likely expressed as combination of alpha1/2-beta-gamma1 subunits. A variant in GABRG1 shows the strongest statistical evidence of association of recovery from eating disorders. Studies show that upregulating or preserving GABAA gamma1/3 and gamma2 receptors may protect neurons against neurofibrillary pathology in Alzheimer's disease. 182
36286 349845 cd19044 LGIC_ECD_GABAAR_G2 extracellular domain of gamma-aminobutyric acid receptor subunit gamma-2 (GABAAR-G2 or GABRG2). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-2 (GABAAR-G2), a protein that is encoded by the GABRG2 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-2 subunit forms heteropentamers with other GABAAR subunits, most prevalently expressed as alpha1-beta2-gamma2. The gamma2 subunit also coassembles with other alpha and beta variants in the brain, but these receptors are found in considerably less abundance and are restricted in their regional distribution, e.g. the alpha2-beta3-gamma2 and alpha3-beta3-gamma2 subtypes are highly enriched in hippocampal pyramidal neurons and cholinergic neurons of the basal forebrain, respectively. Pathogenic missense and truncating variants in this gene have been associated with a spectrum of epilepsies, from Dravet syndrome to milder simple febrile seizures, while a recurrent GABRG2 missense variant is associated with early-onset seizures, significant motor and speech delays, intellectual disability, hypotonia, movement disorder, dysmorphic features, and vision/ocular issues. 184
36287 349846 cd19045 LGIC_ECD_GABAAR_G3 extracellular domain of gamma-aminobutyric acid receptor subunit gamma-3 (GABAAR-G3 or GABRG3). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-3 (GABAAR-G3), a protein that is encoded by the GABRG3 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-3 subunit forms heteropentamers with other GABAAR subunits, likely expressed as alpha1-beta3-gamma3. This subunit contains the benzodiazepine binding site. Polymorphisms in GABRG3 show consistent evidence of association with alcohol dependence. 182
36288 349847 cd19046 LGIC_ECD_GABAAR_rho1 extracellular domain of gamma-aminobutyric acid receptor subunit rho-1 (GABA-rho1 or GABRR1). This family contains extracellular domain (ECD) of the rho subunit 1 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR1 gene, expressed in many areas of the brain, but especially high in the retina. GABRR1 exists next to GABRR2 (encoding rho subunit 2) on the chromosome region thought to be associated with susceptibility for psychiatric disorders and epilepsy. Close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, mutations in the GABRR1 gene may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder, and may be associated with alcohol dependency. 186
36289 349848 cd19047 LGIC_ECD_GABAAR_rho2 extracellular domain of gamma-aminobutyric acid receptor subunit rho-2 (GABA-rho2 or GABRR2). This family contains extracellular domain (ECD) of the rho subunit 2 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR2 gene which exists next to GABRR1 (encoding rho subunit 1) on the chromosome region thought to be associated with susceptibility for psychiatric disorders and epilepsy. Close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event. Rho1 is expressed in many areas of the brain, but especially high in the retina. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, mutations in the GABRR2 gene may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR2 is also associated with susceptibility to bipolar schizoaffective disorder, as well as alcohol dependence and general cognitive ability. GABA-rho2 receptors expressed pre-synaptically in the spinal dorsal horn have been implicated in pain perception and identified as a novel target for analgesia. 186
36290 349849 cd19048 LGIC_ECD_GABAAR_rho3 extracellular domain of gamma-aminobutyric acid receptor subunit rho-3 (GABAA-rho3). This family contains extracellular domain (ECD) of the rho subunit 3 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR3 gene which maps to a different chromosome to that of GABRR1 and GABRR2. While close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event, GABRR3 may have arisen by duplication of a GABRR1/GABRR2 progenitor. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, some individuals contain a variant that is predicted to inactivate this gene product. 186
36291 349851 cd19049 LGIC_TM_anion transmembrane domain of anionic Cys-loop neurotransmitter-gated ion channels, includes GABAAR, GlyR and GluCl. This family contains transmembrane domain of type-A gamma-aminobutyric acid receptor (GABAAR) as well as glycine receptor (GlyR) subunits. Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine, or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity. 111
36292 349852 cd19050 LGIC_TM_bact transmembrane domain of prokaryotic pentameric ligand-gated ion channels (pLGIC). This family contains transmembrane (TM) domain of bacterial pentameric ligand-gated ion channels (pLGICs) including ones from Gloeobacter violaceus (GLIC) and Erwinia chrysanthemi (ELIC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. Studies show that GLIC activation is inhibited by most general anaesthetics at clinical concentrations, including xenon which has been used in clinical practice as a potent gaseous anesthetic for decades. Xenon binding sites have been identified in three distinct regions of the TMD: in a large intra-subunit cavity, in the pore, and at the interface between adjacent subunits. Propofol, the drug used for induction and maintenance of general anesthesia, and desflurane, a negative allosteric modulator of GLIC bind at the entrance in the intra-subunit cavity. Alzheimer's drug memantine, which blocks ion conduction at vertebrate pLGICs by plugging the channel pore, has been shown to have similar potency in ELIC. 119
36293 349853 cd19051 LGIC_TM_cation transmembrane domain of Cys-loop neurotransmitter-gated ion channels, includes 5HT3, nAChR, and ZAC. This superfamily contains the transmembrane (TM) domain of cationic Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), and zinc-activated ligand-gated ion channel (ZAC) receptor. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. The ligand-gated ion channels (LGICs) in this family are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+ and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown. 112
36294 349854 cd19052 LGIC_TM_GABAAR_alpha transmembrane domain of alpha subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane domain of type-A gamma-aminobutyric acid receptor (GABAAR) as well as glycine receptor (GlyR) subunits. Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity. 111
36295 349855 cd19053 LGIC_TM_GABAAR_beta transmembrane domain of beta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the beta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes beta1-beta4 in vertebrates. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Mutations or genetic variations of the genes encoding beta2 (GABRB2) and beta3 (GABRB3) have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2 and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. Mutations in the GABRB1 gene encoding beta1 promote alcohol consumption through increased tonic inhibition. 111
36296 349856 cd19054 LGIC_TM_GABAAR_gamma transmembrane domain of gamma subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the gamma subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes gamma1-gamma3 in vertebrates. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Studies show upregulating or preserving GABAA gamma1/3 and gamma2 receptors may protect neurons against neurofibrillary pathology in Alzheimer's disease. Pathogenic missense and truncating variants in GABRG2 have been associated with a spectrum of epilepsies, from Dravet syndrome to milder simple febrile seizures. Polymorphisms in GABRG3 show consistent evidence of association with alcohol dependence. 111
36297 349857 cd19055 LGIC_TM_GABAAR_delta transmembrane domain of delta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the delta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the gene GABRD. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Receptors containing the delta subunit (GABRD) are expressed exclusively extra-synaptically (in the cortex, hippocampus, thalamus, striatum, and cerebellum) and mediate tonic inhibition. Studies suggest that delta subunits form heteropentamers in similar stoichiometry and arrangement as alpha/beta/gamma receptors, with the delta subunit replacing the gamma subunit (2alpha:2beta:1delta), although other stoichiometries have also been detected. The delta subunit is flexible in its positioning in the pentameric complex, producing receptors with diverse pharmacological properties. Mutations in GABRD have been associated with susceptibility to generalized epilepsy with febrile seizures, type 5. GABRD gene may also be associated with childhood-onset mood disorders. 121
36298 349858 cd19056 LGIC_TM_GABAAR_theta transmembrane domain of theta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAA receptor theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes. 118
36299 349859 cd19057 LGIC_TM_GABAAR_epsilon transmembrane domain of epsilon subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of type-A gamma-aminobutyric acid receptor (GABAAR) subunits as well as glycine receptor (GlyR). Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine, or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity. 115
36300 349860 cd19058 LGIC_TM_GABAAR_pi transmembrane domain of pi subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the pi subunit of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the gene GABRP. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRP is expressed mainly in non-neuronal tissues such as the mammary gland, prostate gland, lung, thymus, and uterus. It is also highly expressed in certain types of cancer such as basal-like breast cancer and pancreatic ductal adenocarcinoma. GABRP is involved in inhibitory synaptic transmission in the central nervous system. Its assembly with other GABAAR subunits alters the sensitivity of recombinant receptors to modulatory agents such as pregnanolone. Studies suggest that polymorphisms in the GABRP gene may be associated with the susceptibility to systemic lupus erythematosus (SLE). 123
36301 349861 cd19059 LGIC_TM_GABAAR_rho transmembrane domain of rho subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the rho subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes rho1-3. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. These rho subunits homo-oligomerize to form GABAA-rho receptors (formerly classified as GABA-rho or GABAC receptor) but do not co-assemble with any of the classical GABAA subunits. They are especially highly expressed in the retina and have distinctive pharmacological properties; they are not modulated by many GABAA receptor modulators such as barbiturates, benzodiazepines, and neuroactive steroids. In humans, mutations in the rho-1 and rho-2 genes, GABRR1 and GABRR2, may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder while a SNP in GABRR2 has been reported to show association with autism. 113
36302 349862 cd19060 LGIC_TM_GlyR_alpha transmembrane domain of alpha subunits of glycine receptor (GlyR). This family contains transmembrane (TM) domain of the alpha subunit of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GlyR has four known isoforms of the alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands and, along with the GlyR beta subunit, have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). These alpha subunits are highly homologous but differ in their kinetic properties, temporal and regional expression and physiological functions. They can form functional chloride-permeable GlyR ion channels by forming homopentamers with 5 alpha subunits or heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. In humans, mutations in glycine receptor alpha subunits cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions. Mutations in GlyR alpha1 subunit lead to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, while mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing. GlyR alpha1 and alpha2 subunits have an important role in regulation of the excitatory-inhibitory balance, control of motor actions, modulation of sedative ethanol effects and probably regulation of ethanol preference and consumption. 120
36303 349863 cd19061 LGIC_TM_GlyR_beta transmembrane domain of beta subunits of glycine receptor (GlyR). This family contains transmembrane (TM) domain of the beta subunit of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine, encoded by GLRB gene. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. These subunits form heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. While the alpha subunits contain binding sites for agonists and antagonists and are responsible for ion channel formation, the beta subunit displays structural and regulatory functions, such as GlyR clustering in synaptic locations by interaction between intracellular loop domains with the scaffolding protein gephyrin, and control of pharmacologic responses to agonist or allosteric modulators due in part to the presence of interfaces alpha/beta and beta/beta. GLRB gene mutations are associated with the neurological disorder hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, as well as agoraphobic cognitions. 114
36304 349864 cd19062 LGIC_TM_GluCl transmembrane domain of glutamate-gated chloride channel (GluCl). This family contains transmembrane (TM) domain of the glutamate-gated chloride channel (GluCl), which is found only in protostomia but is closely related to mammalian glycine receptors. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. These GluCl channels have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Comparison of the GluCl gene families between organisms shows that the insect gene family is relatively simple, while that found in nematodes tends to be larger and more diverse. Glutamate is an inhibitory neurotransmitter that shapes the responses of projection neurons to olfactory stimuli in Drosophila. GluCls are targeted by the macrocyclic lactone family of anthelmintics and pesticides in arthropods and nematodes, thus making the GluCls of considerable medical and economic importance. In Drosophila melanogaster, GluCl mediates sensitivity to the antiparasitic agents ivermectin and nodulisporic acid, suggesting that their drug target is the same throughout the Ecdysozoa. 116
36305 349865 cd19063 LGIC_TM_5-HT3 transmembrane domain of 5-hydroxytryptamine 3 (5-HT3) receptor. This family contains transmembrane (TM) domain of the serotonin 5-HT3 receptors. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. The 5-HT3 channel is cation-selective and mediates neuronal depolarization and excitation within the central and peripheral nervous systems. Like other ligand gated ion channels, the 5-HT3 receptor consists of five subunits arranged around a central ion conducting pore, which is permeable to Na+, K+, and Ca2+ ions. Binding of the neurotransmitter 5-hydroxytryptamine (serotonin) to the 5-HT3 receptor opens the channel, which then leads to an excitatory response in neurons, and the rapidly activating, desensitizing, inward current is predominantly carried by Na+ and K+ ions. This receptor is most closely related by homology to the nicotinic acetylcholine receptor (nAChR). Five subunits have been identified for this family: 5-HT3A, 5-HT3B, 5-HT3C, 5-HT3D, and 5-HT3E, encoded by HTR3A-E genes. Only 5-HT3A subunits are able to form functional homomeric receptors, whereas the 5-HT3B, C, D, and E subunits form heteromeric receptors with 5-HT3A. Different receptor subtypes are important mediators of nausea and vomiting during chemotherapy, pregnancy, and following surgery, while some contribute to neuro-gastroenterologic disorders such as irritable bowel syndrome (IBS) and eating disorders as well as co-morbid psychiatric conditions. 5-HT3 receptor antagonists are established treatments for emesis and IBS, and are beneficial in the treatment of psychiatric diseases. 121
36306 349866 cd19064 LGIC_TM_nAChR transmembrane domain of nicotinic acetylcholine receptor (nAChR). This family contains transmembrane (TM) domain of the nicotinic acetylcholine receptor (nAChR). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease. Among subtypes of muscle nAChRs, the heteromeric subunits (alpha1)2, beta, gamma, and delta in fetal muscle, and the gamma subunit replaced by epsilon in adult muscle have been implicated in congenital myasthenic syndromes and multiple pterygium syndromes due to various mutations. This family also includes alpha- and beta-like nAChRs found in protostomia. 113
36307 349867 cd19065 LGIC_TM_ZAC transmembrane domain of zinc-activated ligand-gated ion channel. This family contains transmembrane (TM) domain of zinc-activated ligand-gated ion channel (ZAC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown. 176
36308 380453 cd19066 C_NRPS-like Condensation domain of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long, with various activities such as antibiotic, antifungal, antitumor and immunosuppression. There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 427
36309 410989 cd19067 PfuEndoQ-like lesion-specific endonuclease similar to Pyrococcus furiosus EndoQ. Pyrococcus furiosus EndoQ is a lesion-specific endonuclease which is assumed to be involved in DNA repair pathways in Thermococcales. It recognizes a deaminated base and hydrolyzes the phosphodiester bond 5' to the site of the lesion. Initially identified as a hypoxanthine-specific endonuclease, it has now been shown that EndoQ also recognizes uracil, xanthine, and apurinic/apyrimidinic (AP) sites in DNA, and that a homolog in Bacillus pumilus shares functional properties of the archaeal EndoQs. 395
36310 381297 cd19071 AKR_AKR1-5-like AKR1/2/3/4/5 family of aldo-keto reductase (AKR) and similar proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. The family includes AKR1A/B/C/D/E/G/I, AKR2A/B/C/D/E, AKR3A/B/C/D/E/G, AKR4A/B/C, AKR5A/B/C/D/E/F/G/H, and similar proteins. 251
36311 381298 cd19072 AKR_AKR3F1-like Thermotoga maritima Tm1743, Escherichia coli YeaE and similar proteins. Thermotoga maritima Tm1743 is a founding member of aldo-keto reductase family 3 member F1 (AKR3F1). It is an aldo/keto reductase family oxidoreductase. Escherichia coli YeaE may act as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. 263
36312 381299 cd19073 AKR_AKR3F2_3 Escherichia coli 2,5-diketo-D-gluconic acid reductase B (DkgB/YafB), Sinorhizobium meliloti isatin reductase and similar proteins. Escherichia coli DkgB/YafB (EC 1.1.1.346), also called 2,5-didehydrogluconate reductase (2-dehydro-L-gulonate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, is a founding member of aldo-keto reductase family 3 member F2 (AKR3F2). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). Sinorhizobium meliloti isatin reductase is a founding member of aldo-keto reductase family 3 member F3 (AKR3F3). It is an aldo/keto reductase family oxidoreductase. 243
36313 381300 cd19074 Aldo_ket_red_shaker-like Shaker potassium channel beta subunit family and similar proteins. This family includes voltage-gated potassium channel subunits, beta-1 (KCAB1B), beta-2 (KCAB2B) and beta-3 (KCAB3B). KCAB1B and KCAB2B are cytoplasmic potassium channel subunits that modulate the characteristics of the channel-forming alpha-subunits. KCAB3B is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. The family also includes Drosophila melanogaster Hk protein, a founding member of aldo-keto reductase family 6 member B1 (AKR6B1), as well as voltage-gated potassium channel subunit beta (KCAB) from Arabidopsis thaliana and Egeria densa, founding members of AKR6C1 and AKR6C2, respectively. Hk protein, also called hyperkinetic, is a beta subunit of Shaker (Sh) K+ channels and shows high sequence homology to aldoketoreductase. KCAB, also called Shaker channel b-subunit, or K(+) channel subunit beta, or potassium voltage beta 1, or KV-beta1, or KAB1, is a probable accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. 297
36314 381301 cd19075 AKR_AKR7A1-5 AKR7A family of aldo-keto reductase (AKR). Aflatoxin B1 aldehyde reductase member 1/3 (AKR7A1/AKR7A3/AFAR) from Rattus norvegicus, aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AFAR1/AFAR) and aflatoxin B1 aldehyde reductase member 3 (AKR7A3/AFAR2) from Homo sapiens, aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AFAR2) from Rattus norvegicus, and aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AKR7A5/AFAR) from Mus musculus, are founding members of aldo-keto reductase family 7 member A1-5 (AKR7A1-5), respectively. AKR7A2 (EC 1.1.1.n11), also called AFB1 aldehyde reductase 1, or AFB1-AR 1, or aldoketoreductase 7, or succinic semialdehyde reductase, or SSA reductase, catalyzes the NADPH-dependent reduction of succinic semialdehyde to gamma-hydroxybutyrate (GHB). It has NADPH-dependent aldehyde reductase activity towards 2-carboxybenzaldehyde, 2-nitrobenzaldehyde and pyridine-2-aldehyde (in vitro). AKR7A2, AKR7A3 (also called AFB1 aldehyde reductase 2 or AFB1-AR 2), and AKR7A4 (also called AFB1 aldehyde reductase 3, or AFB1-AR 3, or aldoketoreductase 7-like), may be involved in protection of liver against the toxic and carcinogenic effects of aflatoxin B1 (AFB1), a potent hepatocarcinogen. They can reduce the dialdehyde protein-binding form of AFB1 to the non-binding AFB1 dialcohol. 304
36315 381302 cd19076 AKR_AKR13A_13D AKR13A and AKR13D families of aldo-keto reductase (AKR). Schizosaccharomyces pombe aldo-keto reductase YakC is a founding member of aldo-keto reductase family 13 member A1 (AKR13A1). It catalyzes the reversible reduction of ketones to the respective alcohols using NADP(+) as a hydride donor. Rauvolfia serpentina PR is a founding member of aldo-keto reductase family 13 member D1 (AKR13D1). It catalyzes the NADPH-dependent reduction of the aldehyde perakine to yield the alcohol raucaffrinoline in the biosynthetic pathway of ajmaline in Rauvolfia, a key step in indole alkaloid biosynthesis. This family also includes Arabidopsis thaliana aldo-keto reductases, ALKR1-6. 303
36316 381303 cd19077 AKR_AKR8A1-2 AKR8A family of aldo-keto reductase (AKR). Schizosaccharomyces pombe PLR and PLR2 are founding members of aldo-keto reductase family 8 member A1-2 (AKR8A1-2), respectively. PLR (EC 1.1.1.65), also called PL reductase (PL-red), catalyzes the reduction of pyridoxal (PL) with NADPH and oxidation of pyridoxine (PN) with NADP(+). 302
36317 381304 cd19078 AKR_AKR13C1_2 AKR13C family of aldo-keto reductase (AKR). The AKR13C family includes Helicobacter pylori aldehyde reductase (AKR13C1) and Thermotoga maritima aldo-keto reductase (AKR13C2). Aldehyde reductase (EC 1.1.1.21), also called aldose reductase, is a cytosolic NADPH-dependent oxidoreductase that catalyzes the reduction of a variety of aldehydes and carbonyls, including monosaccharides. 301
36318 381305 cd19079 AKR_EcYajO-like Escherichia coli YajO and similar proteins. Escherichia coli YajO is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase. 312
36319 381306 cd19080 AKR_AKR9A_9B AKR9A and AKR9B families of aldo-keto reductase (AKR). The AKR9A family includes Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV, Aspergillus flavus norsolorinic acid reductase (NOR), and Phanerochaete chrysosporium aryl-alcohol dehydrogenase [NADP(+)] (AAD), which are founding members of aldo-keto reductase family 9 member A1-3 (AKR9A1-3), respectively. StcV may be involved in the dehydration of 5'-hydroxyaverantin to form averufin. NOR is involved in aflatoxin biosynthesis. AAD (EC 1.1.1.91) is involved in lignin degradation and reduces aromatic benzaldehydes to their respective alcohols in the presence of NADP(H). The AKR9B family includes Saccharomyces cerevisiae aryl-alcohol dehydrogenases AAD14p, AAD3p, AAD4p, and AAD10p, which are founding members of aldo-keto reductase family 9 member B1-4 (AKR9B1-4), respectively. 307
36320 381307 cd19081 AKR_AKR9C1 AKR9C family of aldo-keto reductase (AKR). Haloferax volcanii aldo-keto reductase is a founding member of aldo-keto reductase family 9 member C1 (AKR9C1). 308
36321 381308 cd19082 AKR_AKR10A1_2 AKR10A family of aldo-keto reductase (AKR). Streptomyces bluensis aldo-keto reductase (BlmT) and Streptomyces glaucescens aldo-keto reductase (StrT) are founding members of aldo-keto reductase family 10 member A1 (AKR10A1) and A2 (AKR10A2). BlmT is bluensomycin aldo-keto reductase (AKR) and StrT is streptomycin AKR. 291
36322 381309 cd19083 AKR_AKR11A1_11D1 AKR11A and AKR11D families of aldo-keto reductase (AKR). Bacillus subtilis aldo-keto reductase IolS, also called vegetative protein 147 (VEG147), is a founding member of aldo-keto reductase family 11 member A1 (AKR11A1). It is able to reduce the standard aldo-keto reductase (AKR) substrates DL-glyceraldehyde, D-erythrose, and methylglyoxal in the presence of NADPH, albeit with poor efficiency in vitro. Bacillus aryabhattai aldo-keto reductase is a founding member of aldo-keto reductase family 11 member D1 (AKR11D1). 307
36323 381310 cd19084 AKR_AKR11B1-like AKR11B1/AKR11B2 subfamily of aldo-keto reductase (AKR). Bacillus subtilis YhdN, also called general stress protein 69 (GSP69), is a founding member of aldo-keto reductase family 11 member B1 (AKR11B1). It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. Escherichia coli YdjG is a founding member of aldo-keto reductase family 11 member B2 (AKR11B2). It catalyzes the NADH-dependent reduction of methylglyoxal (2-oxopropanal) in vitro. It may play some role in intestinal colonization. 296
36324 381311 cd19085 AKR_AKR11B3 Synechococcus sp. aldo-keto reductase (SakR1) and similar proteins. Synechococcus sp. SakR1 is a founding member of aldo-keto reductase family 11 member B3 (AKR11B3). It is responsible for methylglyoxal detoxification. 292
36325 381312 cd19086 AKR_AKR11C1 AKR11C family of aldo-keto reductase (AKR). Bacillus subtilis uncharacterized oxidoreductase YqkF is a founding member of aldo-keto reductase family 11 member C1 (AKR11C1). It may function as oxidoreductase. This family also includes Bacillus halodurans AKR11C1, an NADPH-dependent 4-hydroxy-2,3-trans-nonenal reductase. 238
36326 381313 cd19087 AKR_AKR12A1_B1_C1 AKR12A, AKR12B, AKR12C families of aldo-keto reductase (AKR). Streptomyces fradiae TylCII, Saccharopolyspora erythraea EryBII, and Streptomyces avermitilis aveBVIII are founding members of aldo-keto reductase family 12 member A1 (AKR12A1), B1 (AKR12B1), and C1 (AKR12C1), respectively. TylCII acts as a NDP-hexose 2,3-enoyl reductase. EryBII is a mycarose/desosamine reductase involved in L-mycarose and D-desosamine production. aveBVIII functions as a dTDP-4-keto-6-deoxy-L-hexose-2,3-reductase. 310
36327 381314 cd19088 AKR_AKR13B1 AKR13B family of aldo-keto reductase (AKR). Xylella fastidiosa phenylacetaldehyde dehydrogenase is a founding member of aldo-keto reductase family 13 member B1 (AKR13B1). Phenylacetaldehyde dehydrogenase (EC 1.2.1.39) catalyzes the NAD+-dependent oxidation of phenylacetaldehyde to phenylacetic acid. 256
36328 381315 cd19089 AKR_AKR14A1_2 AKR14A family of aldo-keto reductase (AKR). Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ), also called GAP reductase, is a founding member of aldo-keto reductase family 14 member A1 (AKR14A1). It catalyzes the stereospecific, NADPH-dependent reduction of L-glyceraldehyde 3-phosphate (L-GAP). It is also involved in the stress response as a methylglyoxal reductase which converts the toxic metabolite methylglyoxal to acetol in vitro and in vivo. Salmonella enterica AKR is a founding member of aldo-keto reductase family 14 member A2 (AKR14A2). It catalyzes the conversion of 3-hydroxybutanal (3-HB) to 1,3-butanediol (1,3-BDO) by using NADPH as a cofactor. 308
36329 381316 cd19090 AKR_AKR15A-like AKR15A family of aldo-keto reductase and similar proteins. The AKR15 family includes Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD), Pseudomonas sp. D-threo-aldose 1-dehydrogenase (FDH) and similar proteins. PLD (EC 1.1.1.107) catalyzes the irreversible oxidation of pyridoxal. FDH (EC 1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose and, to a much lesser degree, D-arabinose. The family also includes L-galactose dehydrogenase (L-galDH) and D-arabinose 1-dehydrogenase (ARA2). L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+). ARA2 (EC 1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+). 278
36330 381317 cd19091 AKR_PsAKR Polaromonas sp. aldo-keto reductase and similar proteins. The prototype of this family is an uncharacterized aldo-keto reductase from Polaromonas sp. 319
36331 381318 cd19092 AKR_BsYcsN_EcYdhF-like Bacillus subtilis YcsN, Escherichia coli YdhF and similar proteins. Bacillus subtilis YcsN and Escherichia coli YdhF are prototypes of this family. They are uncharacterized aldo/keto reductase family oxidoreductases. 287
36332 381319 cd19093 AKR_AtPLR-like Arabidopsis thaliana pyridoxal reductase (PLR) and similar proteins. Arabidopsis thaliana PLR (EC 1.1.1.65) is the prototype of this family. It catalyzes the reduction of pyridoxal (PL) with NADPH and oxidation of pyridoxine (PN) with NADP(+), and is involved in the PLP salvage pathway. 293
36333 381320 cd19094 AKR_Tas-like Escherichia coli Tas protein and similar proteins. Escherichia coli Tas protein is the prototype of this family. It is an NADP(H)-dependent aldo-keto reductase that catalyzes the reversible reduction of ketones to the respective alcohols using NADP(H) as a hydride donor. 328
36334 381321 cd19095 AKR_PA4992-like Pseudomonas aeruginosa PA4992 and similar proteins. Pseudomonas aeruginosa PA4992 is the prototype of this family. It is a putative aldo-keto reductase that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. 253
36335 381322 cd19096 AKR_Fe-S_oxidoreductase Fe-S oxidoreductase and similar proteins. The family includes a group of uncharacterized Fe-S oxidoreductases that belong to the aldo-keto reductase (AKR) superfamily. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 255
36336 381323 cd19097 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 267
36337 381324 cd19098 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 318
36338 381325 cd19099 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 316
36339 381326 cd19100 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 238
36340 381327 cd19101 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 304
36341 381328 cd19102 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 302
36342 381329 cd19103 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 299
36343 381330 cd19104 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 321
36344 381331 cd19105 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 250
36345 381332 cd19106 AKR_AKR1A1-4 AKR1A family of aldo-keto reductase (AKR). The AKR1A family of AKR includes alcohol dehydrogenase [NADP(+)] (ALR, EC 1.1.1.2) from Homo sapiens (AKR1A1), Sus scrofa (AKR1A2), Rattus norvegicus (liver, AKR1A3), and Mus musculus (AKR1A4). ALR, also known as aldehyde reductase, or ALDR1, catalyzes the NADPH-dependent reduction of a variety of aromatic and aliphatic aldehydes to their corresponding alcohols. In vitro substrates include succinic semialdehyde, 4-nitrobenzaldehyde, 1,2-naphthoquinone, methylglyoxal, and D-glucuronic acid. 305
36346 381333 cd19107 AKR_AKR1B1-19 AKR1B family of aldo-keto reductase (AKR). The AKR1B family of AKR includes aldose reductase (AR, EC 1.1.1.21) from Homo sapiens (AKR1B1), Oryctolagus cuniculus (kidney, AKR1B2), Mus musculus (AKR1B3), Rattus norvegicus (lens, AKR1B4), Bos taurus (lens/testis, AKR1B5), and Sus scrofa (lens, AKR1B6), aldose reductase-related protein 1 (ALD1, EC1.1.1.21) from Mus musculus (AKR1B7), Rattus norvegicus (AKR1B14), and Homo sapiens (AKR1B15), Mus musculus fibroblast growth factor induced protein (FR-1 or AKR1B8, EC 1.1.1.21), Cricetulus griseus aldose reductase-related protein 2 (ALD2 or AKR1B9, EC 1.1.1.21), aldose reductase-like from Homo sapiens (ARL-1 or AKR1B10) and Rattus norvegicus (AKR1B13), aldo-keto reductase from Gallus domesticus (eye, tongue, esophagus, AKR1B12), and Oryctolagus cuniculus AR-like protein (3beta-HSD, AKR1B19). AR, also called aldehyde reductase, catalyzes the NADPH-dependent reduction of a wide variety of carbonyl-containing compounds to their corresponding alcohols with a broad range of catalytic efficiencies. ALD1 reduces a broad range of aliphatic and aromatic aldehydes to the corresponding alcohols. It may play a role in the metabolism of xenobiotic aromatic aldehydes. FR-1, also called aldose reductase-related protein 2, or fibroblast growth factor-regulated protein (FGFRP), is induced by fibroblast growth factor-1. It may play a role in the regulation of the cell cycle. FR-1 belongs to the NADPH-dependent aldo-keto reductase family. ALD2 is an inducible aldo-keto reductase with a preference for aliphatic substrates. It can also act on small aromatic aldehydes, steroid aldehydes and some ketone substrates. ARL-1, also called aldose reductase-like, or aldose reductase-related protein (ARP), or small intestine reductase, or SI reductase, acts as all-trans-retinaldehyde reductase that can efficiently reduce aliphatic and aromatic aldehydes, and is less active on hexoses (in vitro). 
It may be responsible for detoxification of reactive aldehydes in the digested food before the nutrients are passed on to other organs. AKR1B15, also called estradiol 17-beta-dehydrogenase AKR1B15, is a mitochondrial aldo-keto reductase that catalyzes the reduction of androgens and estrogens with high positional selectivity (shows 17-beta-hydroxysteroid dehydrogenase activity) as well as 3-keto-acyl-CoAs. It has a strong selectivity towards NADP(H). AKR1B19 is aldose reductase-like that may show 3-beta-hydroxysteroid dehydrogenase (3beta-HSD) activity. 307
36347 381334 cd19108 AKR_AKR1C1-35 AKR1C family of aldo-keto reductase (AKR). The AKR1C family of aldo-keto reductase (AKR) includes AKR1C1 (20-alpha-hydroxysteroid dehydrogenase, also known as 20alpha-HSD), AKR1C2 (3alpha-HSD type 3), AKR1C3 (17beta-HSD type 5), and AKR1C4 (3alpha-HSD type 1) from Homo sapiens; AKR1C5 (20alpha-HSD, also known as prostaglandin-E(2) 9-reductase) from Rattus norvegicus (ovary); AKR1C6 (estradiol 17beta-HSD type 5) from Mus musculus; AKR1C7 (prostaglandin F synthase 1 or PGF1) from Bos taurus (lung); AKR1C8 (20alpha-HSD) from Rattus norvegicus (ovary); AKR1C9 (3alpha-HSD) from Rattus norvegicus (liver); AKR1C10a (Rho crystallin) from Rana temporaria and AKR1C10b (Rho crystallin) from Rana catesbeiana; AKR1C11 (prostaglandin F synthase 2 or PGF2) from Bos taurus (liver); AKR1C12 (aldo-keto reductase or AKR), AKR1C13 (interleukin-3-regulated AKR), and AKR1C14 (3alpha-HSD) from Mus musculus; AKR1C15 (NADPH-dependent reductase), AKR1C16 (NAD+-preferring 3alpha/17beta/20alpha-HSD), and AKR1C17 (NAD+-dependent 3alpha-HSD) from Rattus norvegicus; AKR1C18 (20alpha-HSD), AKR1C19 (3-hydroxybutyrate dehydrogenase or 3HB dehydrogenase), AKR1C20 (3alpha(17beta)-HSD), AKR1C21 (3(17)alpha-HSD), AKR1C22 (dihydrodiol dehydrogenase or DD) from Mus musculus; AKR1C23 (20alpha-HSD) from Equus caballus; AKR1C24 (NAD+-dependent 17beta-HSD) from Rattus norvegicus; AKR1C25 (3(20)alpha-HSD) from Macaca fuscata; AKR1C26 (identical to morphine 6-dehydrogenase or M6DH, acts as NAD(+)-dependent 3alpha/17beta-HSD), AKR1C27/AKR1C28 (NAD(+)-dependent 3alpha/17beta-HSDs), AKR1C29 (identical to 3-hydroxyhexobarbital dehydrogenase or 3HBD, acts as NADPH-preferring reductase with 3alpha/3beta/17beta/20alpha-HSD activity), AKR1C30 (identical to naloxone reductase type 1 and acts as 17beta-HSD), AKR1C31 (3alpha/17beta/20alpha-HSD), AKR1C32 (identical to loxoprofen reductase and acts as 3alpha/20alpha-HSD), and AKR1C33 (identical to naloxone reductase type 2 and mainly acts 
as 3alpha-HSD) from Oryctolagus cuniculus; AKR1C34 (NAD+-dependent morphine 6-dehydrogenase or M6DH with 3beta/17beta/20alpha-HSD activity) and AKR1C35 (NAD+-dependent dehydrogenase with 3(17)beta-HSD activity) from Mesocricetus auratus. 303
36348 381335 cd19109 AKR_AKR1D1-3 AKR1D family of aldo-keto reductase (AKR). The AKR1D family of aldo-keto reductase includes 3-oxo-5-beta-steroid 4-dehydrogenase (EC 1.3.1.3) from Homo sapiens (AKR1D1), Rattus norvegicus (liver, AKR1D2), and Oryctolagus cuniculus (AKR1D3). 3-oxo-5-beta-steroid 4-dehydrogenase, also called delta(4)-3-ketosteroid 5-beta-reductase (EC 1.3.99.6), or delta(4)-3-oxosteroid 5-beta-reductase, or 5-beta-reductase, efficiently catalyzes the reduction of progesterone, androstenedione, 17-alpha-hydroxyprogesterone and testosterone to 5-beta-reduced metabolites. 308
36349 381336 cd19110 AKR_AKR1E1-2 AKR1E family of aldo-keto reductase (AKR). The AKR1E family of AKR includes 1,5-anhydro-D-fructose reductase (EC 1.1.1.263) from Mus musculus (liver, AKR1E1) and Homo sapiens (AKR1E2). 1,5-anhydro-D-fructose reductase, also called AF reductase, or aldo-keto reductase family 1 member C-like protein 2 (AKR1CL2), catalyzes the NADPH-dependent reduction of 1,5-anhydro-D-fructose (AF) to 1,5-anhydro-D-glucitol. AKR1E2 is a testis aldo-keto reductase (tAKR), which is also known as testis-specific protein (TSP), or LoopADR. 301
36350 381337 cd19111 AKR_AKR1G1_1I Caenorhabditis elegans aldo-keto reductase (CeAKR), Coptotermes gestroi aldo-keto reductase (CgAKR-1) and similar proteins. CeAKR is a founding member of aldo-keto reductase family 1 member G1 (AKR1G1). It may catalyze the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. Coptotermes gestroi aldo-keto reductase (CgAKR-1) is a founding member of aldo-keto reductase family 1 member I (AKR1I). It is a multipurpose enzyme with potential biotechnological applications. 286
36351 381338 cd19112 AKR_AKR2A1-2 AKR2A family of aldo-keto reductase (AKR). The AKR2A family of AKR includes AKR2A1 (NADP-dependent D-sorbitol-6-phosphate dehydrogenase or NADP-S6PDH) from Malus domestica, and AKR2A2 (NADPH-dependent mannose-6-phosphate reductase or NADPH-M6PR) from Apium graveolens. NADP-S6PDH (EC 1.1.1.200), also called aldose-6-phosphate reductase [NADPH], synthesizes sorbitol-6-phosphate, a key intermediate in the synthesis of sorbitol which is a major photosynthetic product in many members of the Rosaceae family. NADPH-M6PR (EC 1.1.1.224), also called NADPH-dependent M6P reductase, is a key enzyme involved in mannitol biosynthesis. 308
36352 381339 cd19113 AKR_AKR2B1-10 AKR2B family of aldo-keto reductase (AKR). The AKR2B family of AKR includes NAD(P)H-dependent D-xylose reductase (XR) from Pichia stipitis, Kluyveromyces lactis, Pachysolen tannophilus, Candida tropicalis, and Candida tenuis, Gre3p from Saccharomyces cerevisiae, XR from Candida tropicalis, Pichia guilliermondii, Debaryomyces hansenii, and Debaryomyces nepalensis, which correspond to aldo-keto reductase family 2 member B1-B10 (AKR2B1-10), respectively. XR (EC 1.1.1.307) catalyzes the NAD(P)H-dependent reduction of xylose to xylitol. 310
36353 381340 cd19114 AKR_AKR2C1 AKR2C family of aldo-keto reductase (AKR). Mucor mucedo NADP-dependent 4-dihydromethyl-trisporate dehydrogenase (TDH), also called 4-dihydromethyltrisporate dehydrogenase, or 4-dihydromethyl-TA dehydrogenase, is a founding member of aldo-keto reductase family 2 member C1 (AKR2C1). It is involved in the biosynthesis of trisporic acid, the sexual hormone of zygomycetes, which induces the first steps of zygophore development. TDH catalyzes the NADP-dependent oxidation of (+) mating-type specific precursor 4-dihydromethyl-trisporate to methyl-trisporate. 302
36354 381341 cd19115 AKR_AKR2D1 AKR2D family of aldo-keto reductase (AKR). Aspergillus niger NAD(P)H-dependent D-xylose reductase xyl1 (XR, EC 1.1.1.307) is a founding member of aldo-keto reductase family 2 member D1 (AKR2D1). It catalyzes the initial reaction in the xylose utilization pathway by reducing D-xylose into xylitol in an NAD(P)H-dependent manner. 311
36355 381342 cd19116 AKR_AKR2E1-5 AKR2E family of aldo-keto reductase (AKR). Bombyx mori 3-dehydroecdysone reductase is a founding member of aldo-keto reductase family 2 member E4 (AKR2E4). It is a NADP-dependent oxidoreductase with high 3-dehydroecdysone reductase activity. It may play a role in the regulation of molting and has lower activity with phenylglyoxal and isatin (in vitro). This family also includes 3-dehydroecdysone 3b-reductase from Spodoptera littoralis and Trichoplusia ni, DL-glyceraldehyde reductase from Drosophila melanogaster, aldo-keto reductase from Bombyx mori, which correspond to aldo-keto reductase family 2 member E1, E2, E3 and E5 (AKR2E1/2/3/5), respectively. 292
36356 381343 cd19117 AKR_AKR3A1-2 AKR3A family of aldo-keto reductase (AKR). Saccharomyces cerevisiae Gcy1p and Ypr1p are founding members of aldo-keto reductase family 3 member A1 (AKR3A1) and A2 (AKR3A2), respectively. Gcy1p, also called galactose-inducible crystallin-like protein 1, is a glycerol dehydrogenase involved in glycerol catabolism under microaerobic conditions. It has mRNA binding activity. Ypr1p acts as a 2-methylbutyraldehyde reductase that displays high specific activity towards 2-methylbutyraldehyde, as well as other aldehydes such as hexanal. 284
36357 381344 cd19118 AKR_AKR3B1-3 AKR3B family of aldo-keto reductase (AKR). Sporidiobolus salmonicolor NADPH-dependent aldehyde reductase 1 (ARI, EC 1.1.1.2) and Trichosporonoides megachiliensis NADPH-dependent erythrose reductase (ER) 1/2 and 3 are founding members of aldo-keto reductase family 3 member B1 (AKR3B1), B2 (AKR3B2), and B3 (AKR3B3), respectively. Sporidiobolus salmonicolor NADPH-ARI, also called alcohol dehydrogenase [NADP(+)], or aldehyde reductase I, or ALR 1, catalyzes the asymmetric reduction of aliphatic and aromatic aldehydes and ketones to an R-enantiomer. It reduces ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. Trichosporonoides megachiliensis NADPH-ERs catalyze the reduction of D-erythrose. 283
36358 381345 cd19119 AKR_AKR3C1 Saccharomyces cerevisiae D-arabinose dehydrogenase [NAD(P)+] heavy chain (Ara1p) and similar proteins. Saccharomyces cerevisiae Ara1p (EC 1.1.1.117), also called D-arabinose 1-dehydrogenase (NAD(P)(+)), is a founding member of aldo-keto reductase family 3 member C1 (AKR3C1). It catalyzes the oxidation of D-arabinose, L-xylose, L-fucose, and L-galactose in the presence of NADP(+). 294
36359 381346 cd19120 AKR_AKR3C2-3 Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase, Candida parapsilosis NADPH-dependent conjugated polyketone reductase C2 (CPR), and similar proteins. Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase (EC 1.1.1.190/EC 1.1.1.191) and Candida parapsilosis NADPH-dependent CPR (EC 1.1.1.358/EC 1.1.1.168) are founding members of aldo-keto reductase family 3 member C2 (AKR3C2) and C3 (AKR3C3), respectively. Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase catalyzes the conversion from (indol-3-yl)ethanol to (indol-3-yl)acetaldehyde in an NAD/NADP-dependent manner. CPR, also called 2-dehydropantolactone reductase, or 2-dehydropantolactone reductase (A-specific), or ketopantoyl-lactone reductase, acts as a NADPH-dependent conjugated polyketone reductase with broad substrate specificity and strict stereospecificity. It reduces ketopantoyl lactone and isatin. 269
36360 381347 cd19121 AKR_AKR3D1 AKR3D family of aldo-keto reductase (AKR). Trichoderma reesei D-galacturonate reductase (GAR1, EC 1.1.1.365), also called D-galacturonic acid reductase, or GalUR, is a founding member of aldo-keto reductase family 3 member D1 (AKR3D1). It mediates the reduction of D-galacturonate to L-galactonate, the first step in D-galacturonate catabolic process. It also has activity with D-glucuronate and DL-glyceraldehyde. Its activity is seen only with NADPH and not with NADH. 279
36361 381348 cd19122 AKR_AKR3E1 AKR3E family of aldo-keto reductase (AKR). Trichoderma reesei NADP(+)-dependent glycerol 2-dehydrogenase (GLD2, EC 1.1.1.156), also called dihydroxyacetone reductase, is a founding member of aldo-keto reductase family 3 member E1 (AKR3E1). It acts as a glycerol oxidoreductase probably involved in glycerol synthesis. 291
36362 381349 cd19123 AKR_AKR3G1 AKR3G family of aldo-keto reductase (AKR). Synechocystis sp. aldo/keto reductase slr0942 is a founding member of aldo-keto reductase family 3 member G1 (AKR3G1). It is an aldo/keto reductase that catalyzes the NADPH-dependent reduction of aldehyde- and ketone-groups of different classes of carbonyl compounds to the corresponding alcohols. 297
36363 381350 cd19124 AKR_AKR4A_4B AKR4A and AKR4B families of aldo-keto reductase (AKR). The AKR4A family of AKR includes Glycine max NAD(P)H-dependent 6'-deoxychalcone synthase (6DCS, EC 3.1.170), chalcone reductase (CHR, EC 2.3.1.74) from Medicago sativa, Glycyrrhiza echinata, and Glycyrrhiza glabra, which are founding members of aldo-keto reductase family 4 member A1 (AKR4A1), A2 (AKR4A2), A3 (AKR4A3), and A4 (AKR4A4), respectively. NAD(P)H-6DCS co-acts with chalcone synthase in formation of 4,2',4'-trihydroxychalcone, involved in the biosynthesis of glyceollin type phytoalexins. CHR, also called chalcone polyketide reductase, is a key enzyme of the flavonoid/isoflavonoid biosynthesis pathway. The AKR4B family of AKR includes Sesbania rostrata chalcone reductase (CHR, AKR4B1), Papaver somniferum codeinone reductase (COR, AKR4B2/AKR4B3), Fragaria x ananassa D-galacturonate reductase (GalUR, AKR4B4), deoxymugineic acid synthase 1 (DMAS1) from Zea mays (AKR4B5), Oryza sativa (AKR4B6), Hordeum vulgare (AKR4B7), Triticum aestivum (AKR4B8), and Erythroxylum coca methylecgonone reductase (MecgoR, AKR4B10). CHR, also called chalcone polyketide reductase, is a key enzyme of the flavonoid/isoflavonoid biosynthesis pathway. NADPH-dependent COR and non-functional NADPH-dependent COR from Papaver somniferum are founding members of aldo-keto reductase family 4 member B2 (AKR4B2) and B3 (AKR4B3), respectively. NADPH-dependent COR (EC 1.1.1.247) reduces codeinone to codeine in the penultimate step in morphine biosynthesis. It can use morphinone, hydrocodone, and hydromorphone as substrates during reductive reaction with NADPH as cofactor, and morphine and dihydrocodeine as substrates during oxidative reaction with NADP as cofactor. GalUR (EC 1.1.1.365), also called aldo-keto reductase 2 (AKR2), is involved in ascorbic acid (vitamin C) biosynthesis by catalyzing the conversion from L-galactonate and NADP(+) to D-galacturonate and NADPH. 
DMAS1 (EC 1.1.1.285) catalyzes the reduction of a 3''-keto intermediate during the biosynthesis of 2'-deoxymugineic acid (DMA) from L-Met. It is involved in the formation of phytosiderophores (MAs) belonging to the mugineic acid family and required to acquire iron. MecgoR catalyzes the stereospecific reduction of methylecgonone to methylecgonine, the penultimate step in cocaine biosynthesis. 281
36364 381351 cd19125 AKR_AKR4C1-15 AKR4C family of aldo-keto reductase (AKR). The AKR4C family of AKR includes aldose reductase (ALR) from Hordeum vulgare (AKR4C1), Bromus inermis (AKR4C2), Avena fatua (AKR4C3), and Xerophyta viscosa (AKR4C4), two aldose reductases, DpAR1 (AKR4C5) and DpAR2 (AKR4C6), from Digitalis purpurea, aldehyde reductase from Zea mays (AKR4C7), four aldo-keto reductases from Arabidopsis thaliana (AKR4C8-11), and another three aldo-keto reductases from Aloe arborescens (AKR4C12) and Oryza sativa (AKR4C14/15). ALR (EC 1.1.1.21), also called AR, aldehyde reductase, or polyol dehydrogenase (NADP(+)), is a cytosolic NADPH-dependent oxidoreductase that catalyzes the reduction of a variety of aldehydes and carbonyls, including monosaccharides. Both DpAR1 and DpAR2 reduce the ketone group of steroid structures. They may be involved in plant steroid metabolism in general and in cardenolide biosynthesis in particular. Plant aldo-keto reductases of the AKR4C subfamily play key roles during stress and are attractive targets for developing stress-tolerant crops. 287
36365 381352 cd19126 AKR_AKR5A_5G AKR5A and AKR5G families of aldo-keto reductase (AKR). The AKR5A family of AKR includes prostaglandin F2-alpha synthase (PGFS) from Leishmania major (AKR5A1) and Trypanosoma brucei (AKR5A2). PGFS, also called 9,11-endoperoxide prostaglandin H2 reductase, catalyzes the NADP-dependent formation of prostaglandin F2-alpha from prostaglandin H2. It also has aldo/keto reductase activity for the synthetic substrates 9,10-phenanthrenequinone and p-nitrobenzaldehyde. The AKR5G family of AKR includes Bacillus subtilis glyoxal reductase (GR), uncharacterized oxidoreductase YtbE, and Bacillus aryabhattai aldo-keto reductase, which correspond to aldo-keto reductase family 5 member G1-3 (AKR5G1-3), respectively. GR (YvgN, EC 1.1.1.283), also called methylglyoxal reductase, reduces glyoxal and methylglyoxal (2-oxopropanal). It is not involved in vitamin B6 biosynthesis. 254
36366 381353 cd19127 AKR_AKR5B1 AKR5B family of aldo-keto reductase (AKR). Pseudomonas putida morphine 6-dehydrogenase (M6DH) is a founding member of the aldo-keto reductase family 5 member B1 (AKR5B1). M6DH (EC 1.1.1.218), also called naloxone reductase, oxidizes the C-6 hydroxy group of morphine and codeine. 268
36367 381354 cd19128 AKR_GlAR-like Giardia lamblia aldose reductase (AR) and similar proteins. Giardia lamblia AR (EC 1.1.1.21), also called aldehyde reductase, is the prototype of this family. It catalyzes the NADPH-dependent reduction of a wide variety of carbonyl-containing compounds to their corresponding alcohols with a broad range of catalytic efficiencies. 277
36368 381355 cd19129 AKR_BaDH-like Bradyrhizobium diazoefficiens dehydrogenase (DH) and similar proteins. Bradyrhizobium diazoefficiens DH is the prototype of this family. It belongs to aldo/keto reductase family. 295
36369 381356 cd19130 AKR_AKR5C1 Corynebacterium sp. 2,5-diketo-D-gluconic acid reductase A (DkgA) and similar proteins. Corynebacterium sp. DkgA is a founding member of aldo-keto reductase family 5 member C1 (AKR5C1). DkgA (EC 1.1.1.346), also called 2,5-DKG reductase A, or 2,5-DKGR A, or 25DKGR-A, or AKR5C, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). 5-keto-D-fructose and dihydroxyacetone can also serve as substrates. 256
36370 381357 cd19131 AKR_AKR5C2 Escherichia coli 2,5-diketo-D-gluconic acid reductase A (DkgA/YqhE) and similar proteins. Escherichia coli DkgA/YqhE is a founding member of aldo-keto reductase family 5 member C2 (AKR5C2). DkgA/YqhE (EC 1.1.1.274), also called 2,5-DKG reductase A, or 2,5-DKGR A, or 25DKGR-A, or AKR5C, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). It is also capable of stereoselective beta-keto ester reductions on ethyl acetoacetate and other 2-substituted derivatives. 256
36371 381358 cd19132 AKR_AKR5D1_E1 AKR5D and AKR5E families of aldo-keto reductase (AKR). 2,5-diketo-D-gluconic acid reductase B (DkgB) from Corynebacterium sp. and 2,5-diketo-D-gluconic acid reductase from Zymomonas mobilis are founding members of aldo-keto reductase family 5 member D1 (AKR5D1) and E1 (AKR5E1), respectively. DkgB (EC 1.1.1.274), also called 2,5-didehydrogluconate reductase (2-dehydro-D-gluconate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). 255
36372 381359 cd19133 AKR_AKR5F1 the AKR5F family of aldo-keto reductase (AKR). Klebsiella sp. 2,5-diketo-D-gluconic acid reductase (2,5-DKG reductase) is a founding member of aldo-keto reductase family 5 member F1 (AKR5F1). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). 255
36373 381360 cd19134 AKR_AKR5H1 AKR5H family of aldo-keto reductase (AKR). Mycobacterium smegmatis MSMEG_2407 is a founding member of aldo-keto reductase family 5 member H1 (AKR5H1). It is a NADPH-dependent aldo-keto reductase that reduces methylglyoxal and phenylglyoxal. 263
36374 381361 cd19135 AKR_CeZK1290-like Caenorhabditis elegans ZK1290.5 and similar proteins. Caenorhabditis elegans ZK1290.5 is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase. 265
36375 381362 cd19136 AKR_DrGR-like Danio rerio glyoxal reductase-like (GR-like) protein and similar proteins. Danio rerio GR-like protein is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase similar to Bacillus subtilis glyoxal reductase (YvgN) that reduces glyoxal and methylglyoxal (2-oxopropanal). 262
36376 381363 cd19137 AKR_AKR3F1 Thermotoga maritima Tm1743 and similar proteins. Thermotoga maritima Tm1743 is a founding member of aldo-keto reductase family 3 member F1 (AKR3F1). It is an aldo/keto reductase family oxidoreductase. 260
36377 381364 cd19138 AKR_YeaE Escherichia coli YeaE and similar proteins. Escherichia coli YeaE is the prototype of this family. It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. 266
36378 381365 cd19139 AKR_AKR3F2 Escherichia coli 2,5-diketo-D-gluconic acid reductase B (DkgB/YafB) and similar proteins. Escherichia coli DkgB/YafB (EC 1.1.1.346), also called 2,5-didehydrogluconate reductase (2-dehydro-L-gulonate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, is a founding member of aldo-keto reductase family 3 member F2 (AKR3F2). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). 248
36379 381366 cd19140 AKR_AKR3F3 Sinorhizobium meliloti isatin reductase and similar proteins. Sinorhizobium meliloti isatin reductase is a founding member of aldo-keto reductase family 3 member F3 (AKR3F3). It is an aldo/keto reductase family oxidoreductase. 253
36380 381367 cd19141 Aldo_ket_red_shaker Shaker potassium channel beta subunit (AKR6A) family of aldo-keto reductase (AKR). This family includes voltage-gated potassium channel subunits, beta-1 (KCAB1B), beta-2 (KCAB2B) and beta-3 (KCAB3B). KCAB1B and KCAB2B are cytoplasmic potassium channel subunits that modulate the characteristics of the channel-forming alpha-subunits. KCAB3B is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. 310
36381 381368 cd19142 AKR_AKR6B1 AKR6B family of aldo-keto reductase (AKR). Drosophila melanogaster Hk protein is a founding member of aldo-keto reductase family 6 member B1 (AKR6B1). Hk protein, also called hyperkinetic, is a beta subunit of Shaker (Sh) K+ channels and shows high sequence homology to aldoketoreductase. 325
36382 381369 cd19143 AKR_AKR6C1_2 AKR6C family of aldo-keto reductase (AKR). Voltage-gated potassium channel subunit beta (KCAB) from Arabidopsis thaliana and Egeria densa are founding members of aldo-keto reductase family 6 member C1 (AKR6C1) and C2 (AKR6C2), respectively. KCAB, also called Shaker channel b-subunit, or K(+) channel subunit beta, or potassium voltage beta 1, or KV-beta1, or KAB1, is a probable accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. 319
36383 381370 cd19144 AKR_AKR13A1 AKR13A family of aldo-keto reductase (AKR). Schizosaccharomyces pombe aldo-keto reductase YakC is a founding member of aldo-keto reductase family 13 member A1 (AKR13A1). It catalyzes the reversible reduction of ketones to the respective alcohols using NADP(+) as a hydride donor. 323
36384 381371 cd19145 AKR_AKR13D1 AKR13D family of aldo-keto reductase (AKR). Rauvolfia serpentina PR is a founding member of aldo-keto reductase family 13 member D1 (AKR13D1). It catalyzes the NADPH-dependent reduction of the aldehyde perakine to yield the alcohol raucaffrinoline in the biosynthetic pathway of ajmaline in Rauvolfia, a key step in indole alkaloid biosynthesis. This family also includes Arabidopsis thaliana aldo-keto reductases, ALKR1-6. 304
36385 381372 cd19146 AKR_AKR9A1-2 Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV, Aspergillus flavus norsolorinic acid reductase (NOR), and similar proteins. Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV and Aspergillus flavus norsolorinic acid reductase (NOR), are founding members of aldo-keto reductase family 9 member A1-2 (AKR9A1-2), respectively. StcV may be involved in the dehydration of 5'-hydroxyaverantin to form averufin. NOR is involved in aflatoxin biosynthesis. 326
36386 381373 cd19147 AKR_AKR9A3_9B1-4 Phanerochaete chrysosporium aryl-alcohol dehydrogenase [NADP(+)] (AAD) and similar proteins. Phanerochaete chrysosporium AAD (EC 1.1.1.91) is a founding member of aldo-keto reductase family 9 member A3. It is involved in lignin degradation and reduces aromatic benzaldehydes to their respective alcohols in the presence of NADP(H). This family also includes Saccharomyces cerevisiae aryl-alcohol dehydrogenases AAD14p, AAD3p, AAD4p, and AAD10p, which are founding members of aldo-keto reductase family 9 member B1-4 (AKR9B1-4), respectively. 319
36387 381374 cd19148 AKR_AKR11B1 Bacillus subtilis aldo-keto reductase YhdN and similar proteins. Bacillus subtilis YhdN, also called general stress protein 69 (GSP69), is a founding member of aldo-keto reductase family 11 member B1 (AKR11B1). It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. 302
36388 381375 cd19149 AKR_AKR11B2 Escherichia coli NADH-specific methylglyoxal reductase (YdjG) and similar proteins. Escherichia coli YdjG is a founding member of aldo-keto reductase family 11 member B2 (AKR11B2). It catalyzes the NADH-dependent reduction of methylglyoxal (2-oxopropanal) in vitro. It may play some role in intestinal colonization. 315
36389 381376 cd19150 AKR_AKR14A1 Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ/AKR14A1) and similar proteins. Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ), also called GAP reductase, is a founding member of aldo-keto reductase family 14 member A1 (AKR14A1). It catalyzes the stereospecific, NADPH-dependent reduction of L-glyceraldehyde 3-phosphate (L-GAP). It is also involved in the stress response as a methylglyoxal reductase which converts the toxic metabolite methylglyoxal to acetol in vitro and in vivo. 309
36390 381377 cd19151 AKR_AKR14A2 Salmonella enterica aldo-keto reductase (AKR) and similar proteins. Salmonella enterica AKR is a founding member of aldo-keto reductase family 14 member A2 (AKR14A2). 309
36391 381378 cd19152 AKR_AKR15A AKR15A family of aldo-keto reductase. The AKR15 family includes Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD), Pseudomonas sp. D-threo-aldose 1-dehydrogenase (FDH), and similar proteins. PLD (EC1.1.1.107) catalyzes irreversible oxidation of pyridoxal. FDH (EC1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose, and to a much lesser degree, D-arabinose. 308
36392 381379 cd19153 AKR_galDH-like L-galactose dehydrogenase (L-galDH), D-arabinose 1-dehydrogenase (ARA2) and similar proteins. L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+). ARA2 (EC1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+). 294
36393 381380 cd19154 AKR_AKR1G1_CeAKR Caenorhabditis elegans aldo-keto reductase (CeAKR) and similar proteins. CeAKR is a founding member of aldo-keto reductase family 1 member G1 (AKR1G1). It may catalyze the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. 303
36394 381381 cd19155 AKR_AKR1I_CgAKR1 Coptotermes gestroi aldo-keto reductase (CgAKR-1) and similar proteins. Coptotermes gestroi aldo-keto reductase (CgAKR-1) is a founding member of aldo-keto reductase family 1 member I (AKR1I). It is a multipurpose enzyme with potential biotechnological applications. 307
36395 381382 cd19156 AKR_AKR5A1_2 AKR5A family of aldo-keto reductase (AKR). Prostaglandin F2-alpha synthase (PGFS) from Leishmania major and Trypanosoma brucei are founding members of aldo-keto reductase family 5 member A1 (AKR5A1) and A2 (AKR5A2), respectively. PGFS, also called 9,11-endoperoxide prostaglandin H2 reductase, catalyzes the NADP-dependent formation of prostaglandin F2-alpha from prostaglandin H2. It also has aldo/ketoreductase activity toward the synthetic substrates 9,10-phenanthrenequinone and p-nitrobenzaldehyde. 266
36396 381383 cd19157 AKR_AKR5G1-3 AKR5G family of aldo-keto reductase (AKR). Bacillus subtilis glyoxal reductase (GR), uncharacterized oxidoreductase YtbE, and Bacillus aryabhattai aldo-keto reductase are founding members of aldo-keto reductase family 5 member G1-3 (AKR5G1-3), respectively. GR (YvgN, EC 1.1.1.283), also called methylglyoxal reductase, reduces glyoxal and methylglyoxal (2-oxopropanal). It is not involved in vitamin B6 biosynthesis. 265
36397 381384 cd19158 AKR_KCAB2B_AKR6A1-like voltage-gated potassium channel subunit beta-2 (KCAB2B) and similar proteins. KCAB2B from Bos taurus, Rattus norvegicus, Mus musculus, Homo sapiens, and Oryctolagus cuniculus, are founding members of aldo-keto reductase family 6 member A1 (AKR6A1), A2 (AKR6A2), A4 (AKR6A4), A5 (AKR6A5), and A6 (AKR6A6), respectively. KCAB2B, also called Shaker channel b-subunit 2 (Kvb2), or K(+) channel subunit beta-2, or Kv-beta-2, or Kvbeta2, is a cytoplasmic potassium channel subunit that modulates the characteristics of the channel-forming alpha-subunits. It may be involved in the regulation of nerve signaling, and prevents neuronal hyperexcitability. 324
36398 381385 cd19159 AKR_KCAB1B_AKR6A3-like voltage-gated potassium channel subunit beta-1 (KCAB1B) and similar proteins. KCAB1B from Homo sapiens, Mus musculus, Mustela putorius, Rattus norvegicus, and Kvb1.1, Kvb1.2 from Oryctolagus cuniculus, are founding members of aldo-keto reductase family 6 member A3 (AKR6A3), A8 (AKR6A8), A10a (AKR6A10a), A13 (AKR6A13), A7 (AKR6A7) and A10b (AKR6A10b), respectively. KCAB1B, also called Shaker channel b-subunit 1(Kvb1), K(+) channel subunit beta-1, or Kv-beta-1, is a cytoplasmic potassium channel subunit that modulates the characteristics of the channel-forming alpha-subunits. It modulates action potentials via its effect on the pore-forming alpha subunits. 323
36399 381386 cd19160 AKR_KCAB3B_AKR6A9-like voltage-gated potassium channel subunit beta-3 (KCAB3B) and similar proteins. KCAB3B from Homo sapiens, Rattus norvegicus, and Mus musculus, are founding members of aldo-keto reductase family 6 member A9 (AKR6A9), A12 (AKR6A12), A14 (AKR6A14), respectively. KCAB3B, also called Shaker channel b-subunit 3 (Kvb3), K(+) channel subunit beta-3, or Kv-beta-3, is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. It alters the functional properties of Kv1.5. 325
36400 381387 cd19161 AKR_AKR15A1 Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD) and similar proteins. Microbacterium luteolum PLD (EC1.1.1.107) is a founding member of aldo-keto reductase family 15 member A1 (AKR15A1). It catalyzes irreversible oxidation of pyridoxal. 310
36401 381388 cd19162 AKR_FDH D-threo-aldose 1-dehydrogenase (FDH) and similar proteins. FDH (EC1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose, and to a much lesser degree, D-arabinose. 290
36402 381389 cd19163 AKR_galDH L-galactose dehydrogenase (L-galDH) and similar proteins. L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+). 293
36403 381390 cd19164 AKR_ARA2 D-arabinose 1-dehydrogenase (ARA2) and similar proteins. ARA2 (EC1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+). 298
36404 350856 cd19165 HemeO heme oxygenase in eukaryotes and some bacteria. This subfamily contains heme oxygenase (HO, EC 1.14.14.18) found in eukaryotes as well as some proteobacteria, including cyanobacteria. Heme oxygenase (HO) catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling of iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. In vertebrates, HO plays a role in heme homeostasis and oxidative stress response, and cellular signaling in mammals that include isoforms HO-1, HO-2 and HO-3. HO-1 is ubiquitously expressed after induction while HO-2 expression is constitutive, mostly limited to certain organs, such as the brain, testes, and the vascular system. HO-3 is non-functional in humans, suggesting that the Hmox3 gene is a pseudogene derived from HO-2 transcripts. In higher plants and cyanobacteria, heme oxygenase is required for the synthesis of light-harvesting pigments, which contain tetrapyrroles derived from biliverdin. Candida albicans expresses a heme oxygenase that is required for the utilization of heme as a nutritional iron source, whereas Saccharomyces cerevisiae responds to iron deprivation by increasing Hmx1p transcription, which is controlled by the major iron-dependent transcription factor, Aft1p, and promotes both the re-utilization of heme iron and the regulation of heme-dependent transcription during periods of iron scarcity. In pathogenic bacteria, HO is part of a pathway for iron acquisition from host heme. In Leptospira interrogans, a pathogenic spirochete that causes leptospirosis, HO is required for iron utilization when hemoglobin is the sole iron source, thus making HO an interesting target for novel antimicrobial agents. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology. 205
36405 350857 cd19166 HemeO-bac heme oxygenase found in pathogenic bacteria. This subfamily contains bacterial heme oxygenase (HO, EC 1.14.14.18), where HO is part of a pathway for iron acquisition from host heme and heme products. Most of these proteins have yet to be characterized. HO catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling of iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. This family includes heme oxygenase (pa-HO) from Pseudomonas aeruginosa, an opportunistic pathogen that causes a variety of systemic infections, particularly in those afflicted with cystic fibrosis, as well as cancer and AIDS patients who are immunosuppressed. Pa-HO, expressed by the PigA gene, is critical for the acquisition of host iron since there is essentially no free iron in mammals, and is unusual since it hydroxylates heme predominantly at the delta-meso heme carbon, while all other well-studied HOs hydroxylate the alpha-meso carbon. Also included in this family is Neisseria meningitidis HO which is substantially different from the human HO, with the reaction product being ferric biliverdin IXalpha rather than reduced iron and free biliverdin IXalpha. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology. 182
36406 380944 cd19167 SET_SMYD1_2_3-like SET domain (including post-SET domain) found in SET and MYND domain-containing proteins, SMYD1, SMYD2, SMYD3 and similar proteins. The family includes SET and MYND domain-containing proteins, SMYD1, SMYD2 and SMYD3. SMYD1 (EC 2.1.1.43; also termed BOP) is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and seems able to perform mono-, di-, and trimethylation. SMYD2 (also termed HSKM-B, or lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. 205
36407 380945 cd19168 SET_EZH-like SET domain found in enhancer of zeste homolog 1 (EZH1) and zeste homolog 2 (EZH2) of polycomb repressive complex 2 (PRC2), and similar proteins. The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both EZH1 and EZH2 can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in Non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics. 124
36408 380946 cd19169 SET_SETD1 SET domain (including post-SET domain) found in SET domain-containing protein 1 (SETD1) and similar proteins. This family includes SET domain-containing protein 1A (SETD1A) and SET domain-containing protein 1B (SETD1B). These proteins are histone-lysine N-methyltransferases that specifically methylate 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. 148
36409 380947 cd19170 SET_KMT2A_2B SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B) and similar proteins. This family includes KMT2A and KMT2B. Both KMT2A (also termed ALL-1 or CXXC7 or MLL or MLL1 or TRX1 or HRX) and KMT2B (also termed MLL4 or TRX2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me). 152
36410 380948 cd19171 SET_KMT2C_2D SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C), 2D (KMT2D) and similar proteins. This family includes KMT2C and KMT2D. Both KMT2C (also termed HALR or MLL3) and KMT2D (also termed ALR or MLL2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me). They are subunits of the MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation. 153
36411 380949 cd19172 SET_SETD2 SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2) and similar proteins. SETD2 (also termed HIF-1, huntingtin yeast partner B, huntingtin-interacting protein 1 (HIP-1), huntingtin-interacting protein B, lysine N-methyltransferase 3A or protein-lysine N-methyltransferase SETD2) acts as histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. It has been shown that methylation is a posttranslational modification of dynamic microtubules and that SETD2 methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy. 142
36412 380950 cd19173 SET_NSD SET domain (including post-SET domain) found in nuclear SET domain-containing proteins, NSD1, NSD2, NSD3 and similar proteins. The nuclear receptor-binding SET Domain (NSD) family of histone H3 lysine 36 methyltransferases is comprised of NSD1, NSD2, and NSD3, which are primarily known to be involved in chromatin integrity and gene expression through mono-, di-, or tri-methylating lysine 36 of histone H3 (H3K36), respectively. NSD1 (EC 2.1.1.43; also termed histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B) or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as histone-lysine N-methyltransferase with histone H3 'Lys-27' (H3K27me) methyltransferase activity. NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3. 142
36413 380951 cd19174 SET_ASH1L SET domain (including post-SET domain) found in ASH1-like protein (ASH1L) and similar proteins. ASH1L (EC 2.1.1.43; also termed absent small and homeotic disks protein 1 homolog, KMT2H, or lysine N-methyltransferase 2H) acts as histone-lysine N-methyltransferase that specifically methylates 'Lys-36' of histone H3 (H3K36me). It plays important roles in development; heterozygous mutation of ASH1L is associated with severe intellectual disability (ID) and multiple congenital anomaly (MCA). 141
36414 380952 cd19175 SET_ASHR3-like SET domain (including post-SET domain) found in Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins. This family includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3, also termed protein SET DOMAIN GROUP 4 or protein stamen loss), ASH1 homolog 3 (ASHH3, also termed protein SET DOMAIN GROUP 7) and homolog 4 (ASHH4, also termed protein SET DOMAIN GROUP 24). They all function as histone-lysine N-methyltransferases (EC 2.1.1.43). 139
36415 380953 cd19176 SET_SETD3 SET domain found in SET domain-containing protein 3 (SETD3) and similar proteins. SETD3 (EC 2.1.1.43) is a histone-lysine N-methyltransferase that methylates 'Lys-4' and 'Lys-36' of histone H3 (H3K4me and H3K36me). It functions as a transcriptional activator that plays an important role in the transcriptional regulation of muscle cell differentiation via interaction with MYOD1. 251
36416 380954 cd19177 SET_SETD4 SET domain found in SET domain-containing protein 4 (SETD4) and similar proteins. SETD4 is a cytosolic and nuclear functional lysine methyltransferase that plays a crucial role in breast carcinogenesis. However, its specific substrates and modification sites remain to be disclosed. 245
36417 380955 cd19178 SET_SETD6 SET domain found in SET domain-containing protein 6 (SETD6) and similar proteins. SETD6 is a lysine N-methyltransferase that monomethylates 'Lys-310' of the RELA subunit of NF-kappa-B complex, leading to down-regulation of NF-kappa-B transcription factor activity. It also monomethylates 'Lys-8' of H2AZ (H2AZK8me1). 250
36418 380956 cd19179 SET_RBCMT SET domain found in chloroplastic ribulose-1,5 bisphosphate carboxylase/oxygenase large subunit N-methyltransferase (RBCMT) and similar proteins. RBCMT (EC 2.1.1.127; also termed [Ribulose-bisphosphate carboxylase]-lysine N-methyltransferase, RuBisCO LSMT, RuBisCO methyltransferase, or rbcMT) methylates 'Lys-14' of the large subunit of RuBisCO. 237
36419 380957 cd19180 SET_SpSET10-like SET domain found in Schizosaccharomyces pombe SET domain-containing protein 10 (SETD10) and similar proteins. Schizosaccharomyces pombe SETD10 is a ribosomal S-adenosyl-L-methionine-dependent protein-lysine N-methyltransferase that methylates ribosomal protein L23 (rpl23a and rpl23b). 252
36420 380958 cd19181 SET_SETD5 SET domain (including post-SET domain) found in SET domain-containing protein 5 (SETD5) and similar proteins. SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. SETD5 loss-of-function mutations are a likely cause of a familial syndromic intellectual disability with variable phenotypic expression. 150
36421 380959 cd19182 SET_KMT2E SET domain found in inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins. KMT2E (also termed inactive lysine N-methyltransferase 2E, myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. It associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. Lack of key residues in the SET domain as well as the presence of an unusually large loop in the SET-I subdomain preclude the interaction of MLL5 SET with its cofactor and substrate thus making MLL5 devoid of any in vitro methyltransferase activity on full-length histones and histone H3 peptide. 129
36422 380960 cd19183 SET_SpSET3-like SET domain (including post-SET domain) found in Schizosaccharomyces pombe SET domain-containing protein 3 (SETD3) and similar proteins. Schizosaccharomyces pombe SETD3 functions as a transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. It is required for both gene activation and repression. 173
36423 380961 cd19184 SET_KMT5B SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 5B (KMT5B) and similar proteins. KMT5B (also termed lysine N-methyltransferase 5B, lysine-specific methyltransferase 5B, suppressor of variegation 4-20 homolog 1, Su(var)4-20 homolog 1 or Suv4-20h1) is a histone methyltransferase that specifically trimethylates 'Lys-20' of histone H4 (H4K20me3). It plays a central role in the establishment of constitutive heterochromatin in pericentric heterochromatin regions. 144
36424 380962 cd19185 SET_KMT5C SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 5C (KMT5C) and similar proteins. KMT5C (also termed lysine N-methyltransferase 5C, lysine-specific methyltransferase 5C, suppressor of variegation 4-20 homolog 2, Su(var)4-20 homolog 2 or Suv4-20h2) is a histone methyltransferase that specifically trimethylates 'Lys-20' of histone H4 (H4K20me3). It plays a central role in the establishment of constitutive heterochromatin in pericentric heterochromatin regions. 142
36425 380963 cd19186 SET_Suv4-20 SET domain (including post-SET domain) found in Drosophila melanogaster suppressor of variegation 4-20 (Suv4-20) and similar proteins. Suv4-20 (also termed Su(var)4-20) is a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-20' of histone H4. It acts as a dominant suppressor of position-effect variegation. 142
36426 380964 cd19187 PR-SET_PRDM1 PR-SET domain found in PR domain zinc finger protein 1 (PRDM1) and similar proteins. PRDM1 (also termed BLIMP-1, beta-interferon gene positive regulatory domain I-binding factor, PR domain-containing protein 1, positive regulatory domain I-binding factor 1, PRDI-BF1, or PRDI-binding factor 1) acts as a transcription factor that mediates a transcriptional program in various innate and adaptive immune tissue-resident lymphocyte T cell types such as tissue-resident memory T (Trm), natural killer (trNK) and natural killer T (NKT) cells and negatively regulates gene expression of proteins that promote the egress of tissue-resident T-cell populations from non-lymphoid organs. 128
36427 380965 cd19188 PR-SET_PRDM2 PR-SET domain found in PR domain zinc finger protein 2 (PRDM2) and similar proteins. PRDM2 (also termed GATA-3-binding protein G3B, lysine N-methyltransferase 8, MTB-or MTE-binding protein, PR domain-containing protein 2, retinoblastoma protein-interacting zinc finger protein, or zinc finger protein RIZ) is an S-adenosyl-L-methionine-dependent histone methyltransferase that specifically methylates 'Lys-9' of histone H3. It may function as a DNA-binding transcription factor. 123
36428 380966 cd19189 PR-SET_PRDM4 PR-SET domain found in PR domain zinc finger protein 4 (PRDM4) and similar proteins. PRDM4 (also termed PR domain-containing protein 4, or PFM1) may function as a transcription factor involved in cell differentiation. 133
36429 380967 cd19190 PR-SET_PRDM5 PR-SET domain found in PR domain zinc finger protein 5 (PRDM5) and similar proteins. PRDM5 (also termed PR domain-containing protein 5) is a sequence-specific DNA-binding transcription factor that represses transcription at least in part by recruitment of the histone methyltransferase EHMT2/G9A and histone deacetylases such as HDAC1. 127
36430 380968 cd19191 PR-SET_PRDM6 PR-SET domain found in PR domain zinc finger protein 6 (PRDM6) and similar proteins. PRDM6 (also termed PR domain-containing protein 6) is a putative histone-lysine N-methyltransferase that acts as a transcriptional repressor of smooth muscle gene expression. It may specifically methylate 'Lys-20' of histone H4 when associated with other proteins and in vitro. 128
36431 380969 cd19192 PR-SET_PRDM8 PR-SET domain found in PR domain zinc finger protein 8 (PRDM8) and similar proteins. PRDM8 (also termed PR domain-containing protein 8) may function as a histone methyltransferase, preferentially acting on 'Lys-9' of histone H3. 131
36432 380970 cd19193 PR-SET_PRDM7_9 PR-SET domain found in PR domain zinc finger protein 7 (PRDM7) and 9 (PRDM9) and similar proteins. PRDM7 (also termed PR domain-containing protein 7) is a primate-specific histone methyltransferase that is the result of a recent gene duplication of PRDM9. It selectively catalyzes the trimethylation of H3 lysine 4 (H3K4me3). PRDM9 (also termed PR domain-containing protein 9) is a histone methyltransferase that specifically trimethylates 'Lys-4' of histone H3 (H3K4me3) during meiotic prophase and is essential for proper meiotic progression. It also efficiently mono-, di-, and trimethylates H3K36. Aberrant PRDM9 expression is associated with genome instability in cancer. 129
36433 380971 cd19194 PR-SET_PRDM10 PR-SET domain found in PR domain zinc finger protein 10 (PRDM10) and similar proteins. PRDM10 (also termed PR domain-containing protein 10, or tristanin) may be involved in transcriptional regulation. 128
36434 380972 cd19195 PR-SET_PRDM11 PR-SET domain found in PR domain zinc finger protein 11 (PRDM11) and similar proteins. PRDM11 (also termed PR domain-containing protein 11) may be involved in transcription regulation. 127
36435 380973 cd19196 PR-SET_PRDM12 PR-SET domain found in PR domain zinc finger protein 12 (PRDM12) and similar proteins. PRDM12 (also termed PR domain-containing protein 12) acts as a transcription factor that is involved in the positive regulation of histone H3-K9 dimethylation. 130
36436 380974 cd19197 PR-SET_PRDM13 PR-SET domain found in PR domain zinc finger protein 13 (PRDM13) and similar proteins. PRDM13 (also termed PR domain-containing protein 13) may be involved in transcriptional regulation. It mediates the balance of inhibitory and excitatory neurons in somatosensory circuits. 103
36437 380975 cd19198 PR-SET_PRDM14 PR-SET domain found in PR domain zinc finger protein 14 (PRDM14) and similar proteins. PRDM14 (also termed PR domain-containing protein 14) acts as a transcription factor that has both positive and negative roles on transcription. It regulates epigenetic modifications in cells, playing a key role in the regulation of cell pluripotency, epigenetic reprogramming, differentiation and development. Aberrant PRDM14 expression is associated with tumorigenesis, cell migration and resistance of cells to chemotherapeutic drugs. 133
36438 380976 cd19199 PR-SET_PRDM15 PR-SET domain found in PR domain zinc finger protein 15 (PRDM15) and similar proteins. PRDM15 (also termed PR domain-containing protein 15, or zinc finger protein 298 (ZNF298)) may be involved in transcriptional regulation. It plays an essential role as a chromatin factor that modulates the transcription of upstream regulators of WNT and MAPK-ERK signaling to safeguard naive pluripotency. 126
36439 380977 cd19200 PR-SET_PRDM16_PRDM3 PR-SET domain found in PR domain zinc finger protein 16 (PRDM16), MDS1 and EVI1 complex locus protein and similar proteins. PRDM16 (also termed PR domain-containing protein 16, transcription factor MEL1, or MDS1/EVI1-like gene 1) functions as a transcriptional regulator. PRDM16 is preferentially expressed by hematopoietic and neuronal stem cells. It is closely related to its paralog PRDM3 (also termed MDS1 and EVI1 complex locus protein, ecotropic virus integration site 1 protein, EVI-1, myelodysplasia syndrome 1 protein, myelodysplasia syndrome-associated protein 1, or MECOM), which is a nuclear transcription factor essential for the proliferation/maintenance of hematopoietic stem cells (HSCs). PRDM3 and PRDM16 are both directly linked to various aspects of oncogenic transformation. 135
36440 380978 cd19201 PR-SET_ZFPM PR-SET domain found in zinc finger protein ZFPM1, ZFPM2 and similar proteins. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis. 122
36441 380979 cd19202 SET_SMYD2 SET domain (including post-SET domain) found in SET and MYND domain-containing protein 2 (SMYD2) and similar proteins. SMYD2 (also termed HSKM-B, lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). It plays a role in myofilament organization in both skeletal and cardiac muscles via Hsp90 methylation. SMYD2 overexpression is associated with tumor cell proliferation and a worse outcome in human papillomavirus-unrelated nonmultiple head and neck carcinomas. It regulates leukemia cell growth such that diminished SMYD2 expression upregulates SET7/9, thereby possibly shifting leukemia cells from growth to quiescence state associated with resistance to DNA damage associated with Acute Myeloid Leukemia (AML). 206
36442 380980 cd19203 SET_SMYD3 SET domain (including post-SET domain) found in SET and MYND domain-containing protein 3 (SMYD3) and similar proteins. SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. It is overexpressed in colorectal, breast, prostate, and hepatocellular tumors, and has been implicated as an oncogene in human malignancies. Methylation of MEKK2 by SMYD3 is important for regulation of the MEK/ERK pathway, suggesting the possibility of selectively targeting SMYD3 in RAS-driven cancers. 210
36443 380981 cd19204 SET_SETD1A SET domain (including post-SET domain) found in SET domain-containing protein 1A (SETD1A) and similar proteins. SETD1A (EC2.1.1.43), also termed lysine N-methyltransferase 2F, or Set1/Ash2 histone methyltransferase complex subunit SET1, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Human SET domain containing protein 1A (hSETD1A) expression occurs at a high rate in hepatocellular carcinoma patients and controls tumor metastasis in breast cancer by activating MMP expression. 153
36444 380982 cd19205 SET_SETD1B SET domain (including post-SET domain) found in SET domain-containing protein 1B (SETD1B) and similar proteins. SETD1B (EC2.1.1.43), also termed lysine N-methyltransferase 2G, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Loss of SETD1B occurs in up to half the gastric and colorectal cancers, most commonly via SETD1B mutations, while de novo variants in SETD1B are associated with intellectual disability, epilepsy and autism. 153
36445 380983 cd19206 SET_KMT2A SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A) and similar proteins. KMT2A (EC2.1.1.43; also termed lysine N-methyltransferase 2A, ALL-1, CXXC-type zinc finger protein 7 (CXXC7), myeloid/lymphoid or mixed-lineage leukemia (MLL), myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (TRX1), or zinc finger protein HRX) acts as a histone methyltransferase that plays an essential role in early development and hematopoiesis. It is a catalytic subunit of the MLL1/MLL complex, a multiprotein complex that mediates both methylation of 'Lys-4' of histone H3 (H3K4me) and acetylation of 'Lys-16' of histone H4 (H4K16ac). 154
36446 380984 cd19207 SET_KMT2B SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2B (KMT2B) and similar proteins. KMT2B (EC2.1.1.43; also termed lysine N-methyltransferase 2B, myeloid/lymphoid or mixed-lineage leukemia protein 4 (MLL2/MLL4), trithorax homolog 2 (TRX2), or WW domain-binding protein 7 (WBP-7)), acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is required during the transcriptionally active period of oocyte growth for the establishment and/or maintenance of bulk H3K4 trimethylation (H3K4me3), global transcriptional silencing that precedes resumption of meiosis, oocyte survival and normal zygotic genome activation. 154
36447 380985 cd19208 SET_KMT2C SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C) and similar proteins. KMT2C (EC2.1.1.43; also termed lysine N-methyltransferase 2C, homologous to ALR protein (HALR) myeloid/lymphoid, or mixed-lineage leukemia protein 3 (MLL3)), acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me) and may be involved in leukemogenesis and developmental disorder. KMT2C is a catalytic subunit of MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation. Overexpression of KMT2C is associated with estrogen receptor-positive breast cancer; KMT2C mediates the estrogen dependence of breast cancer through regulation of estrogen receptor alpha (ERalpha) enhancer function. KMT2C is frequently mutated in certain populations with diffuse-type gastric adenocarcinomas (DGA); its loss promotes epithelial-to-mesenchymal transition (EMT) and is associated with worse overall survival. 154
36448 380986 cd19209 SET_KMT2D SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2D (KMT2D) and similar proteins. KMT2D (EC2.1.1.43; also termed lysine N-methyltransferase 2D, ALL1-related protein (ALR), or myeloid/lymphoid or mixed-lineage leukemia protein 2 (MLL2)), acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is a coactivator for estrogen receptor by being recruited by ESR1, thereby activating transcription. KMT2D is a subunit of MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation. 155
36449 380987 cd19210 SET_NSD1 SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 1 (NSD1) and similar proteins. NSD1 (EC 2.1.1.43; also termed Histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD1 is altered in approximately 10% of head and neck cancer patients with 55% decrease in risk of death in NSD1-mutated versus non-mutated patients; its disruption promotes favorable chemotherapeutic responses linked to hypomethylation. 142
36450 380988 cd19211 SET_NSD2 SET domain (including post-SET domain) found in nuclear SET domain-containing protein 2 (NSD2) and similar proteins. NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as a histone-lysine N-methyltransferase with histone H3 'Lys-36' (H3K36me) methyltransferase activity. NSD2 has been shown to mediate di- and trimethylation of H3K36 and dimethylation of H4K20 in different systems, and has been characterized as a transcriptional repressor interacting with histone deacetylase HDAC1 and histone demethylase LSD1. NSD2 mediates constitutive NF-kappaB signaling for cancer cell proliferation, survival and tumor growth. It is highly overexpressed in several types of human cancers, including small-cell lung cancers, neuroblastoma, carcinomas of stomach and colon, and bladder cancers, and its overexpression tends to be associated with tumor aggressiveness. WHSC1 is frequently deleted in Wolf-Hirschhorn syndrome (WHS). 142
36451 380989 cd19212 SET_NSD3 SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 3 (NSD3) and similar proteins. NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3. NSD3 is amplified and overexpressed in multiple cancer types, including acute myeloid leukemia (AML), breast, lung, pancreatic and bladder cancers, as well as squamous cell carcinoma of the head and neck (SCCHN). NSD3 contributes to tumorigenesis by interacting with bromodomain-containing protein 4 (BRD4), the bromodomain and extraterminal (BET) protein, which is a potential therapeutic target in acute myeloid leukemia (AML). NSD3 is amplified in primary tumors and cell lines from breast carcinoma, and can promote the cell viability of small-cell lung cancer and pancreatic ductal adenocarcinoma. High NSD3 expression is implicated in poor grade and heavy smoking history in SCCHN. Thus, NSD3 may serve as a potential druggable target for selective cancer therapy. 142
36452 380990 cd19213 PR-SET_PRDM16 PR-SET domain found in PR domain zinc finger protein 16 (PRDM16) and similar proteins. PRDM16, also termed PR domain-containing protein 16, or transcription factor MEL1, or MDS1/EVI1-like gene 1, functions as a transcriptional regulator. PRDM16 is preferentially expressed by hematopoietic and neuronal stem cells and is closely related to its paralog PRDM3, both of which are directly linked to various aspects of oncogenic transformation. 162
36453 380991 cd19214 PR-SET_PRDM3 PR-SET domain found in MDS1 and EVI1 complex locus protein and similar proteins. PRDM3 (also termed MDS1 and EVI1 complex locus protein, ecotropic virus integration site 1 protein, EVI-1, myelodysplasia syndrome 1 protein, myelodysplasia syndrome-associated protein 1, or MECOM) is a nuclear transcription factor, which is essential for the proliferation/maintenance of hematopoietic stem cells (HSCs). It is closely related to its paralog PRDM16, both of which are directly linked to various aspects of oncogenic transformation. 158
36454 380992 cd19215 PR-SET_ZFPM1 PR-SET domain found in zinc finger protein ZFPM1 and similar proteins. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation. 110
36455 380993 cd19216 PR-SET_ZFPM2 PR-SET domain found in zinc finger protein ZFPM2 and similar proteins. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis. 111
36456 380994 cd19217 SET_EZH1 SET domain found in enhancer of zeste homolog 1 (EZH1) and similar proteins. EZH1 (EC 2.1.1.43), also termed ENX-2, or histone-lysine N-methyltransferase EZH1, is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. 136
36457 380995 cd19218 SET_EZH2 SET domain found in enhancer of zeste homolog 2 (EZH2) and similar proteins. EZH2 (EC 2.1.1.43), also termed lysine N-methyltransferase 6, or ENX-1, or histone-lysine N-methyltransferase EZH2, is a catalytic subunit of the polycomb repressive complex 2 (PRC2)/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in Non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics. 120
36458 412037 cd19318 Rev1_UBM2 Ubiquitin-Binding Motif 2 (UBM2) of Y-family polymerase Rev1. This model characterizes UBM2, the second ubiquitin-binding motif of Rev1, a DNA damage tolerance protein. Rev1 acts as a translesion synthesis (TLS) DNA polymerase and may also recruit other TLS polymerases to the site of DNA damage; in that process the UBMs are essential for Rev1 function, triggering TLS activation via recognition of ubiquitin moieties in PCNA, the proliferating cell nuclear antigen. 36
36459 381707 cd19333 Wnt_Wnt1 Wnt domain found in proto-oncogene Wnt-1 and similar proteins. Wnt-1, also called proto-oncogene Int-1, acts in the canonical Wnt signaling pathway by promoting beta-catenin-dependent transcriptional activation. It plays a role in osteoblast function, bone development and bone homeostasis. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 311
36460 381708 cd19334 Wnt_Wnt2_like Wnt domain found in protein Wnt-2, Wnt-2b and similar proteins. The family includes Wnt-2 and Wnt-2b. Wnt-2, also called Int-1-like protein 1 (INT1L1), or Int-1-related protein (IRP), functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It plays an important role in embryonic lung development. Wnt-2b, also called protein Wnt-13, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a redundant role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 314
36461 381709 cd19335 Wnt_Wnt3_Wnt3a Wnt domain found in proto-oncogene Wnt-3 and similar proteins. Wnt-3, also called proto-oncogene Int-4, functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It is required for normal embryonic development, and especially for limb development. Wnt-3a functions in the canonical Wnt signaling pathway and plays crucial roles in both proliferation and differentiation processes in several types of stem cells. Wnt-3a stimulates the migration and invasion of trophoblasts and induces the survival, proliferation, and migration of human embryonic kidney (HEK) 293 cells. It also up-regulates genes implicated in melanocyte differentiation and increases the expression and nuclear localization of the transcriptional co-activator with PDZ-binding motif (TAZ), a transcriptional modulator involved in activating osteoblastic differentiation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 314
36462 381710 cd19336 Wnt_Wnt4 Wnt domain found in protein Wnt-4 and similar proteins. Wnt-4 may function as a signaling molecule which affects the development of discrete regions of tissues. Its overexpression may be associated with abnormal proliferation in human breast tissue. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 310
36463 381711 cd19337 Wnt_Wnt5 Wnt domain found in protein Wnt-5a, Wnt-5b and similar proteins. The family includes Wnt-5a and Wnt-5b, both of which are secreted growth factors that belong to the noncanonical members of the Wingless-related MMTV-integration family. Wnt-5a can activate or inhibit canonical Wnt signaling, depending on receptor context. It specifically regulates dendritic spine formation in rodent hippocampal neurons, resulting in postsynaptic development that promotes the clustering of the postsynaptic density protein 95 (PSD-95). The overexpression of Wnt-5b is associated with cancer aggressiveness. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 313
36464 381712 cd19338 Wnt_Wnt6 Wnt domain found in protein Wnt-6 and similar proteins. Wnt-6 may function as a signaling molecule which affects the development of discrete regions of tissues. It may promote tumorigenesis in gastrointestinal cancer and cervical cancer. It can compensate for the absence of ectoderm and can induce the formation of muscle cells in the limb. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 310
36465 381713 cd19339 Wnt_Wnt7 Wnt domain found in protein Wnt-7a, Wnt-7b and similar proteins. The family includes Wnt-7a and Wnt-7b. Wnt-7a acts as a canonical Wnt ligand that modulates the synaptic vesicle cycle and synaptic transmission in hippocampal neurons. It also plays an important role in embryonic development, including dorsal versus ventral patterning during limb development, skeleton development, and urogenital tract development. Wnt-7b functions in the canonical Wnt/beta-catenin signaling pathway in vascular smooth muscle cells. It is required for normal fusion of the chorion and the allantois during placenta development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 313
36466 381714 cd19340 Wnt_Wnt8 Wnt domain found in protein Wnt-8a, Wnt-8b and similar proteins. The family includes Wnt-8a and Wnt-8b. Wnt-8a, also called protein Wnt-8d, plays a role in embryonic patterning. Wnt-8b may play an important role in the development and differentiation of certain forebrain structures, notably the hippocampus. It acts as a suppressor of early eye and retinal progenitor formation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 301
36467 381715 cd19341 Wnt_Wnt9 Wnt domain found in protein Wnt-9a, Wnt-9b and similar proteins. The family includes Wnt-9a and Wnt-9b, both of which function in the canonical Wnt/beta-catenin signaling pathway. Wnt-9a, also called protein Wnt-14, is required for normal timing of IHH expression during embryonic bone development, normal chondrocyte maturation and for normal bone mineralization during embryonic bone development. Wnt-9a plays a redundant role in maintaining joint integrity. It is a conserved regulator of hematopoietic stem and progenitor cell development. Wnt-9b, also called protein Wnt-14b, or protein Wnt-15, plays a central role in the regulation of mesenchymal to epithelial transitions underlying organogenesis of the mammalian urogenital system. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 299
36468 381716 cd19342 Wnt_Wnt10 Wnt domain found in protein Wnt-10a, Wnt-10b and similar proteins. The family includes protein Wnt-10a and Wnt-10b. Wnt-10a plays a role in normal ectoderm development. It is required for normal postnatal development and maintenance of tongue papillae and sweat ducts, as well as normal hair follicle function. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency, and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 316
36469 381717 cd19343 Wnt_Wnt11 Wnt domain found in protein Wnt-11 and similar proteins. Wnt-11 may be a signaling molecule which has possible roles in the development of skeleton, kidney and lung. It is a positive regulator of the Wnt signaling pathway, which plays a crucial role in carcinogenesis. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection. 305
36470 381718 cd19344 Wnt_Wnt16 Wnt domain found in protein Wnt-16 and similar proteins. Wnt-16 is a mixed canonical and noncanonical Wnt ligand involved in the regulation of postnatal bone homeostasis. It promotes bone formation and inhibits bone resorption. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 306
36471 381719 cd19345 Wnt_Wnt2 Wnt domain found in protein Wnt-2 and similar proteins. Wnt-2, also called Int-1-like protein 1 (INT1L1), or Int-1-related protein (IRP), functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It plays an important role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 314
36472 381720 cd19346 Wnt_Wnt2b Wnt domain found in protein Wnt-2b and similar proteins. Wnt-2b, also called protein Wnt-13, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a redundant role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 314
36473 381721 cd19347 Wnt_Wnt5a Wnt domain found in protein Wnt-5a and similar proteins. Wnt-5a is a secreted growth factor that belongs to the noncanonical members of the Wingless-related MMTV-integration family. It can activate or inhibit canonical Wnt signaling, depending on receptor context. Wnt-5a specifically regulates dendritic spine formation in rodent hippocampal neurons, resulting in postsynaptic development that promotes the clustering of the postsynaptic density protein 95 (PSD-95). Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 312
36474 381722 cd19348 Wnt_Wnt5b Wnt domain found in protein Wnt-5b and similar proteins. Wnt-5b is a secreted growth factor that belongs to the noncanonical members of the Wingless-related MMTV-integration family. Its overexpression is associated with cancer aggressiveness. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 312
36475 381723 cd19349 Wnt_Wnt7a Wnt domain found in protein Wnt-7a and similar proteins. Wnt-7a acts as a canonical Wnt ligand that modulates the synaptic vesicle cycle and synaptic transmission in hippocampal neurons. It also plays an important role in embryonic development, including dorsal versus ventral patterning during limb development, skeleton development and urogenital tract development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 318
36476 381724 cd19350 wnt_Wnt7b Wnt domain found in protein Wnt-7b and similar proteins. Wnt-7b functions in the canonical Wnt/beta-catenin signaling pathway in vascular smooth muscle cells. It is required for normal fusion of the chorion and the allantois during placenta development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 318
36477 381725 cd19351 Wnt_Wnt8a Wnt domain found in protein Wnt-8a and similar proteins. Wnt-8a, also called protein Wnt-8d, plays a role in embryonic patterning. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 307
36478 381726 cd19352 Wnt_Wnt8b Wnt domain found in protein Wnt-8b and similar proteins. Wnt-8b may play an important role in the development and differentiation of certain forebrain structures, notably the hippocampus. It acts as a suppressor of early eye and retinal progenitor formation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 305
36479 381727 cd19353 Wnt_Wnt9a Wnt domain found in protein Wnt-9a and similar proteins. Wnt-9a, also called protein Wnt-14, functions in the canonical Wnt/beta-catenin signaling pathway. It is required for normal timing of IHH expression during embryonic bone development, normal chondrocyte maturation and for normal bone mineralization during embryonic bone development. Wnt-9a plays a redundant role in maintaining joint integrity. It is a conserved regulator of hematopoietic stem and progenitor cell development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 298
36480 381728 cd19354 Wnt_Wnt9b Wnt domain found in protein Wnt-9b and similar proteins. Wnt-9b, also called protein Wnt-14b or Wnt-15, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a central role in the regulation of mesenchymal to epithelial transitions underlying organogenesis of the mammalian urogenital system. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 297
36481 381729 cd19355 Wnt_Wnt10a Wnt domain found in protein Wnt-10a and similar proteins. Wnt-10a plays a role in normal ectoderm development. It is required for normal postnatal development and maintenance of tongue papillae and sweat ducts, as well as normal hair follicle function. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 302
36482 381730 cd19356 Wnt_Wnt10b Wnt domain found in protein Wnt-10b and similar proteins. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 299
36483 381692 cd19357 TenA_E_At3g16990-like TenA_E proteins similar to Arabidopsis thaliana At3g16990. This family of TenA proteins belongs to the TenA_E class, and lacks the conserved active site Cys residue of the TenA_C class; most have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2; aminopyrimidine aminohydrolase, also known as thiaminase II) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Members of this family include Arabidopsis thaliana At3g16990, Zea mays GRMZM2G080501, and Pyrococcus furiosus PF1337, among others. Arabidopsis thaliana TenA_E hydrolyzes amino-HMP to AMP, and the N-formyl derivative of amino-HMP to amino-HMP, but does not hydrolyze thiamin; nor does it have activity with other thiamine degradation products such as thiamine mono- or diphosphate, oxythiamine, oxothiamine, thiamine disulfide, desthiothiamine or thiochrome as substrates. Structural studies of P. furiosus PF1337 strongly support its enzymatic function in thiamine biosynthesis. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. 217
36484 381693 cd19358 TenA_E_Spr0628-like TenA_E proteins similar to Streptococcus pneumoniae Spr0628 and Saccharomyces cerevisiae S288C PET18. This family of TenA proteins belongs to the TenA_E class, and lacks the conserved active site Cys residue of the TenA_C class; most have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2; aminopyrimidine aminohydrolase, also known as thiaminase II) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Arabidopsis thaliana TenA_E (not belonging to this family) hydrolyzes amino-HMP to AMP, and the N-formyl derivative of amino-HMP to amino-HMP. Members of this family include the putative thiaminase Streptococcus pneumoniae Spr0628, and Saccharomyces cerevisiae S288C PET18, a protein of unknown function whose expression is induced in the absence of thiamin. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. Many proteins in this family have yet to be characterized. 209
36485 381694 cd19359 TenA_C_Bt3146-like uncharacterized TenA_C proteins similar to Bacteroides thetaiotaomicron Bt3146. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. This family includes mostly uncharacterized TenA-like proteins such as Bacteroides thetaiotaomicron Bt3146. 206
36486 381695 cd19360 TenA_C_SaTenA-like TenA_C proteins similar to Staphylococcus aureus TenA (SaTenA). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Staphylococcus aureus TenA (SaTenA) which plays two essential roles in thiamin metabolism: in the deamination of aminopyrimidine to HMP, and in hydrolyzing thiamin into HMP and 5-(2-hydroxyethyl)4-methylthiazole (THZ). It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. SaTenA is then also a putative transcriptional regulator controlling the secretion of extracellular proteases such as subtilisin-type proteases in bacteria. This family includes mostly uncharacterized TenA like proteins. 211
36487 381696 cd19361 TenA_C_HP1287-like TenA_C proteins similar to Helicobacter pylori TenA (HP1287). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Helicobacter pylori TenA (HP1287) protein which is thought to catalyze a salvage reaction in thiamin metabolism, however its pyrimidine substrate has not yet been identified. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. HP1287 may contribute to stomach colonization and persistence. 212
36488 381697 cd19362 TenA_C_SsTenA-1-like uncharacterized TenA_C proteins similar to Sulfolobus solfataricus TenA-1 (Sso2206). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. This family includes mostly uncharacterized TenA like proteins such as Sulfolobus solfataricus putative TenA-like thiaminase (Tena-1, Sso2206). 200
36489 381698 cd19363 TenA_C_PH1161-like uncharacterized TenA_C proteins similar to Pyrococcus horikoshii PH1161. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. This family includes functionally uncharacterized TenA like proteins such as Pyrococcus horikoshii PH1161 protein. 210
36490 381699 cd19364 TenA_C_BsTenA-like TenA_C proteins similar to Bacillus subtilis TenA. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Bacillus subtilis TenA which has been shown to be a thiaminase II, catalyzing the hydrolysis of thiamine into HMP and 5-(2-hydroxyethyl)-4-methylthiazole (THZ), within thiamine metabolism. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. 212
36491 381700 cd19365 TenA_C-like uncharacterized TenA_C proteins. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. This family includes mostly uncharacterized TenA_C- like proteins. 205
36492 381701 cd19366 TenA_C_BhTenA-like TenA_C proteins similar to Bacillus halodurans TenA. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Bacillus halodurans TenA which participates in a salvage pathway where the thiamine degradation product 2-methyl-4-formylamino-5-aminomethylpyrimidine (formylamino-HMP) is hydrolyzed first to amino-HMP by the YlmB protein, and the amino-HMP is then hydrolyzed by TenA to produce HMP. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. 213
36493 381702 cd19367 TenA_C_ScTHI20-like TenA_C family similar to the C-terminal TenA_C domain of Saccharomyces cerevisiae THI20 protein. This TenA family belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. Saccharomyces cerevisiae THI20 includes a C-terminal tetrameric TenA-like domain fused to an N-terminal HMP kinase/HMP-P kinase (ThiD-like) domain, and participates in thiamin biosynthesis, degradation and salvage; the TenA-like domain catalyzes the production of HMP from thiamin degradation products (salvage). A majority of this family are single-domain TenA_C- like proteins; some however have additional domains such as a ThiD domain. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. 204
36494 381703 cd19368 TenA_C_AtTH2-like TenA_C family similar to the N-terminal TenA_C domain of Arabidopsis thaliana thiamine requiring 2. This TenA family belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. Arabidopsis thaliana TH2 is an orphan enzyme thiamin monophosphate phosphatase which has a haloacid dehalogenase (HAD) family domain fused to its TenA_C domain; its TenA_C domain has thiamin salvage hydrolase activity against amino-HMP. This family includes mostly uncharacterized single-domain TenA_C- like proteins; some however have additional domains such as a HAD family domain or a kinase domain. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. 210
36495 381704 cd19369 TenA_C-like uncharacterized TenA_C proteins. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect. This family includes mostly uncharacterized TenA_C- like proteins. 202
36496 381705 cd19370 TenA_PqqC TenA_like proteins, including PqqC and CADD. This family contains proteins with similarity to TenA, and includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C or PQQC proteins. PQQ is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC catalyzes the last step of PQQ biogenesis which involves a ring closure and an eight-electron oxidation of the substrate [3a-(2-amino-2-carboxyethyl)-4,5-dioxo-4,5,6,7,8,9-hexahydroquinoline-7,9-dicarboxylic acid (AHQQ)]. The exact molecular function of members of this family is unclear. Also belonging to this family is Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), a redox protein toxin unique to Chlamydia species, which modulates host cell apoptosis; its redox activity and death domain binding ability may be required for this biological activity. CADD may have a role in folate metabolism. 219
36497 381686 cd19371 UDG-F1-like Uracil DNA glycosylase family 1, includes Human uracil DNA glycosylase, Vaccinia virus protein D4, Nitratifractor salsuginis UNG and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. More distant members of UDG family 1 include Nitratifractor salsuginis UNG (NsaUNG) and Vaccinia virus (VAVC) protein D4 uracil-DNA glycosylase, a subunit of the VACV DNA polymerase holoenzyme. NsaUNG only exhibits robust enzymatic activity on uracil-containing DNAs, in particular double-stranded uracil-containing substrates; it does not act on hypoxanthine- and xanthine-containing substrates. NsUNG is not inhibited by Ugi protein that specifically inhibits conventional family 1 UDGs. D4, in addition to excising uracil residues from DNA, is part of a heterodimeric processivity factor which potentiates the DNA polymerase activity. 135
36498 381687 cd19372 UDG_F1_VAVC_D4-like Uracil DNA glycosylase family 1 subfamily, includes Vaccinia virus protein D4 and similar proteins. Vaccinia virus (VAVC) protein D4 uracil-DNA glycosylase, is a subunit of the VACV DNA polymerase holoenzyme, and a more distant member of uracil DNA glycosylase (UDG) family 1. D4, in addition to excising uracil residues from DNA, is part of a heterodimeric processivity factor which potentiates the DNA polymerase activity. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 200
36499 381688 cd19373 UDG-F1_NsUNG-like Uracil DNA glycosylase family 1 subfamily, includes Nitratifractor salsuginis UNG and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. Nitratifractor salsuginis UNG (NsaUNG) only exhibits robust enzymatic activity on uracil-containing DNAs, in particular double-stranded uracil-containing substrates, and does not act on hypoxanthine- and xanthine-containing substrates. NsUNG is not inhibited by Ugi protein that specifically inhibits conventional family 1 UDGs. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 174
36500 381689 cd19374 UDG-F3_SMUG1-like Uracil DNA glycosylase family 3 subfamily, includes single-strand-selective monofunctional uracil-DNA glycosylase 1 and similar proteins. Uracil DNA glycosylase family 3 includes human SMUG1, which can remove uracil and its oxidized pyrimidine derivatives from both single-stranded and double-stranded DNA, with a preference for single-stranded DNA substrates. The SMUG-targeted mismatched uracil derivatives include 5-hydroxyuracil, 5-hydroxymethyluracil and 5-formyluracil. Also included in this subfamily is Geobacter metallireducens SMUG1, which has dual substrate specificities for DNA with uracil or xanthine. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 232
36501 381690 cd19375 UDG-F3-like_SMUG2 Uracil DNA glycosylase family 3-like subfamily, includes single-strand-selective monofunctional uracil-DNA glycosylase 2 and similar proteins. Uracil DNA glycosylase family 3-like, which includes Pedobacter heparinus SMUG2, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. 218
36502 381646 cd19376 TGF_beta_GDF15 transforming growth factor beta (TGF-beta) like domain found in mammalian growth/differentiation factor 15 (GDF-15) and similar proteins. GDF-15, also termed macrophage inhibitory cytokine 1 (MIC-1), or NSAID-activated gene 1 protein (NAG-1), or NSAID-regulated gene 1 protein (NRG-1), or placental TGF-beta, or placental bone morphogenetic protein, or prostate differentiation factor, regulates food intake, energy expenditure and body weight in response to metabolic and toxin-induced stresses. 101
36503 381647 cd19377 TGF_beta_INHA_B_like transforming growth factor beta (TGF-beta) like domain found in inhibin alpha chain (INHA), beta chain (INHB) and similar proteins. INHA is a component of inhibins (inhibin A or inhibin B) that inhibit the secretion of follitropin by the pituitary gland. INHB includes inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), and E chain (INHBE). INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin B, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress. 101
36504 381648 cd19378 TGF_beta_DAF7 transforming growth factor beta (TGF-beta) like domain found in Caenorhabditis elegans Dauer larva development regulatory growth factor DAF-7 and similar proteins. DAF-7, also termed abnormal dauer formation protein 7, may act as a negative regulator of dauer larva development by transducing chemosensory information from ASI neurons. It is involved in sensitivity to CO2 levels. 100
36505 381649 cd19379 TGF_beta_GSDF transforming growth factor beta (TGF-beta) like domain found in Danio rerio gonadal somatic cell derived factor (GSDF) and similar proteins. GSDF is a member of the transforming growth factor beta (TGF-beta) superfamily. It is a teleost- and gonad-specific growth factor that controls sex determination in some fish and plays an important role in mediating germ cell/soma signaling. 94
36506 381650 cd19380 TGF_beta_GDNF transforming growth factor beta (TGF-beta) like domain found in glial cell line-derived neurotrophic factor (GDNF) and similar proteins. GDNF, also termed astrocyte-derived trophic factor (ATF), is a member of the glial cell-line-derived neurotrophic factor (GDNF) family. It acts as a neurotrophic factor that enhances survival and morphological differentiation of dopaminergic neurons and increases their high-affinity dopamine uptake. 96
36507 381651 cd19381 TGF_beta_Artemin transforming growth factor beta (TGF-beta) like domain found in Artemin and similar proteins. Artemin, also termed Enovin, or Neublastin, is a member of the glial cell-line-derived neurotrophic factor (GDNF) family with growth promoting activity on neuronal cells. It acts as the ligand for the GFR-alpha-3-RET receptor complex but can also activate the GFR-alpha-1-RET receptor complex. It supports peripheral and central neurons. 98
36508 381652 cd19382 TGF_beta_Persephin transforming growth factor beta (TGF-beta) like domain found in Persephin and similar proteins. Persephin is a member of the glial cell-line-derived neurotrophic factor (GDNF) family with neurotrophic activity on mesencephalic dopaminergic and motor neurons. 99
36509 381653 cd19383 TGF_beta_Neurturin transforming growth factor beta (TGF-beta) like domain found in Neurturin and similar proteins. Neurturin is a member of the glial cell-line-derived neurotrophic factor (GDNF) family. It acts as a neurotrophic factor that supports the survival of sympathetic neurons in culture and may regulate the development and maintenance of the CNS. It might control the size of non-neuronal cell population such as haemopoietic cells. 104
36510 381654 cd19384 TGF_beta_TGFB1 transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-1 (TGF-beta-1) and similar proteins. TGF-beta-1 is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is a secreted protein that performs many cellular functions, including the control of cell growth, cell proliferation, cell differentiation, and apoptosis. 99
36511 381655 cd19385 TGF_beta_TGFB2 transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-2 (TGF-beta-2) and similar proteins. TGF-beta-2, also termed BSC-1 cell growth inhibitor, or cetermin, or glioblastoma-derived T-cell suppressor factor (G-TSF), or polyergin, is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is a secreted protein that performs many cellular functions and has a vital role during embryonic development. It can suppress the effects of interleukin-2 dependent T-cell growth. 97
36512 381656 cd19386 TGF_beta_TGFB3 transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-3 (TGF-beta-3) and similar proteins. TGF-beta-3 is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is involved in embryogenesis and cell differentiation. It regulates molecules involved in cellular adhesion and extracellular matrix (ECM) formation during the process of palate development. 101
36513 381657 cd19387 TGF_beta_univin transforming growth factor beta (TGF-beta) like domain found in Strongylocentrotus purpuratus univin and similar proteins. Univin may have a critical role in early developmental decisions in the sea urchin embryo. 104
36514 381658 cd19388 TGF_beta_GDF8 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 8 (GDF8) and similar proteins. GDF8, also termed myostatin, acts specifically as a negative regulator of skeletal muscle growth. 108
36515 381659 cd19389 TGF_beta_GDF11 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 11 (GDF11) and similar proteins. GDF11, also termed bone morphogenetic protein 11 (BMP-11), is a secreted signal that acts globally to specify positional identity along the anterior/posterior axis during development. 109
36516 381660 cd19390 TGF_beta_BMP2 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 2 (BMP-2) and similar proteins. BMP-2, also termed BMP-2A, induces cartilage and bone formation. It stimulates the differentiation of myoblasts into osteoblasts via the EIF2AK3-EIF2A-ATF4 pathway. 103
36517 381661 cd19391 TGF_beta_BMP4_BMP2B transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 4 (BMP-4) and similar proteins. BMP-4, also termed BMP-2B, induces cartilage and bone formation. It also acts in mesoderm induction, tooth development, limb formation and fracture repair. 107
36518 381662 cd19392 TGF_beta_DPP transforming growth factor beta (TGF-beta) like domain found in Drosophila melanogaster protein decapentaplegic (Dpp) and similar proteins. Dpp, also termed protein DPP-C, is required later in embryogenesis for dorsal closure and patterning of the hindgut. It also functions postembryonically as a long-range morphogen during imaginal disk development and is responsible for the progression of the morphogenetic furrow during eye development. 109
36519 381663 cd19393 TGF_beta_BMP3 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 3 (BMP-3) and similar proteins. BMP-3, also termed BMP-3A, or osteogenin, negatively regulates bone density. It antagonizes the ability of certain osteogenic BMPs to induce osteoprogenitor differentiation and ossification. 110
36520 381664 cd19394 TGF_beta_GDF10 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 10 (GDF10) and similar proteins. GDF10, also termed bone morphogenetic protein 3B (BMP-3B), or bone-inducing protein (BIP), is a growth factor involved in osteogenesis and adipogenesis. 112
36521 381665 cd19395 TGF_beta_BMP5 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 5 (BMP-5) and similar proteins. BMP-5 induces cartilage and bone formation. 113
36522 381666 cd19396 TGF_beta_BMP6 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 6 (BMP-6) and similar proteins. BMP-6, also termed VG-1-related protein, or VG-1-R, or VGR-1, induces cartilage and bone formation. 103
36523 381667 cd19397 TGF_beta_BMP7 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 7 (BMP-7) and similar proteins. BMP-7, also termed osteogenic protein 1 (OP-1), or eptotermin alfa, induces cartilage and bone formation. It may act as the osteoinductive factor responsible for the phenomenon of epithelial osteogenesis and play a role in calcium regulation and bone homeostasis. 107
36524 381668 cd19398 TGF_beta_BMP8 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 8A (BMP-8A), 8B (BMP-8B) and similar proteins. BMP-8A plays a role in the maintenance of spermatogenesis and the integrity of the epididymis. BMP-8B, also termed BMP-8, or osteogenic protein 2 (OP-2), may act as a secreted factor in cancer progression. It also plays an essential role in bone metabolism and can regulate thermogenesis and energy balance. Like BMP-8A, BMP-8B plays a role in spermatogenesis and placental development. Mutation in either of the genes encoding BMP-8A or BMP-8B causes postnatal depletion of spermatogonia in mice. 105
36525 381669 cd19399 TGF_beta_GDF5 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 5 (GDF5) and similar proteins. GDF5, also termed bone morphogenetic protein 14 (BMP-14), or cartilage-derived morphogenetic protein 1 (CDMP-1), or lipopolysaccharide-associated protein 4 (LAP-4), or LPS-associated protein 4, or radotermin, is a growth factor involved in bone and cartilage formation. 103
36526 381670 cd19400 TGF_beta_BMP9 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 9 (BMP-9) and similar proteins. BMP-9, also termed growth/differentiation factor 2 (GDF-2), is a potent circulating inhibitor of angiogenesis. It signals through the type I activin receptor ACVRL1 but not other activin receptor-like kinases (ALKs). 105
36527 381671 cd19401 TGF_beta_BMP10 transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 10 (BMP-10) and similar proteins. BMP-10 is required for maintaining the proliferative activity of embryonic cardiomyocytes by preventing premature activation of the negative cell cycle regulator CDKN1C/p57KIP and maintaining the required expression levels of cardiogenic factors such as MEF2C and NKX2-5. It inhibits endothelial cell migration and growth. It may reduce cell migration and cell matrix adhesion in breast cancer cell lines. 105
36528 381672 cd19402 TGF_beta_GDF9B transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9B (GDF-9B) and similar proteins. GDF-9B, also termed bone morphogenetic protein 15 (BMP15), acts as an oocyte-specific growth/differentiation factor that stimulates folliculogenesis and granulosa cell (GC) growth. 104
36529 381673 cd19403 TGF_beta_GDF9 transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9 (GDF-9) and similar proteins. GDF-9 is required for ovarian folliculogenesis. It promotes primordial follicle development and stimulates granulosa cell proliferation. 106
36530 381674 cd19404 TGF_beta_INHBA transforming growth factor beta (TGF-beta) like domain found in inhibin beta A chain (INHBA) and similar proteins. INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. 108
36531 381675 cd19405 TGF_beta_INHBB transforming growth factor beta (TGF-beta) like domain found in inhibin beta B chain (INHBB) and similar proteins. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin B, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. 107
36532 381676 cd19406 TGF_beta_INHBC_E transforming growth factor beta (TGF-beta) like domain found in inhibin beta C chain (INHBC), inhibin beta E chain (INHBE) and similar proteins. The family includes INHBC and INHBE. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress. 104
36533 410990 cd19412 pMMO-AMO_C subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO) from methanotrophic bacteria, and of ammonia monooxygenase (AMO) from ammonia-oxidizing bacteria, and related proteins. This family contains subunit C of particulate methane monooxygenase (pMMO; EC 1.14.18.3), an integral membrane metalloenzyme that catalyzes the conversion of methane to methanol. MMO is the first enzyme in the metabolic pathway of methanotrophic bacteria. It also contains subunit C of AMO (EC 1.14.99.39) from ammonia-oxidizing bacteria such as Nitrosomonas europaea (AmoC1-AmoC3). AMO catalyzes the conversion of ammonia to hydroxylamine. pMMO, along with soluble MMO (sMMO; EC 1.14.13.25), and the related enzyme AMO are the only known enzymes capable of methane hydroxylation. pMMO is composed of three subunits, PmoB (B or alpha), PmoA (A or beta), and PmoC (C or gamma), each containing membrane-spanning helices, with three copies each of the subunits forming a cylindrical A3B3C3 oligomer with a hole in the center. This subunit of pMMO has a metal-binding site that is exposed to the center of the pMMO oligomer, the metal being zinc or copper. Although biochemical and mutagenesis data indicate that the active site is located at the dicopper site in subunit B, the metal-binding site in this transmembrane subunit C may also be functionally relevant since all ligands are conserved and best enzymatic activity is obtained from intact pMMO containing all three subunits. Zinc inhibition studies of several respiratory complexes in Methylococcus capsulatus and Methylosinus trichosporium suggest that zinc might inhibit proton transfer in pMMO by either replacing active site copper ions or another copper site that is involved in reducing the active site. Nitrosomonas europaea AMO is composed of three subunits, AmoA, AmoB, and AmoC; it has two nearly identical copies of AmoC encoded by duplicate amoCAB operons and a more divergent AmoC encoded by a monocistronic amoC. The significantly shorter related C subunits of AMO from ammonia-oxidizing archaea are not included in this model. 217
36534 381292 cd19413 RsbR_N-like globin-like domain of positive regulator of sigma-B activity (RsbRA). The globin-like domain of Bacillus subtilis RsbRA is a non-heme globin presumed to channel sensory input to the C-terminal sulfate transporter/anti-sigma factor antagonist (STAT) domain. RsbRA is a component of the sigma B-activating stressosome, and a regulator of the RNA polymerase sigma factor subunit sigma (B). 132
36535 381189 cd19414 lipocalin_1_3_4_13-like lipocalin-1, -3, -4, -13 and similar proteins. Lipocalin-1 (LCN1, also known as tear lipocalin, von Ebner's gland protein, or tear specific prealbumin), the main lipid carrier in human tears, is critical to functions involving lipids in protection of the ocular surface. Its large ligand pocket accommodates a range of ligands including alkyl alcohols, glycolipids, phospholipids, cholesterol, steroids, and siderophores. Lipocalin-3 (LCN3, also known as vomeronasal secretory protein 1) and lipocalin-4 (LCN4, also known as vomeronasal secretory protein 2) are involved in transport of lipophilic molecules, and are possibly pheromone-carriers. Lipocalin-13 (LCN13, also known as odorant binding protein 2A) may bind and transport small hydrophobic volatile molecules with a higher affinity for aldehydes and large fatty acids. Another member of this family is late lactation protein B (LLPB), a milk protein produced during the late phase of lactation, which may be involved in transporting a small ligand released during the hydrolysis of milk fat. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 147
36536 381190 cd19415 lipocalin_ApoM_AGP apolipoprotein M and alpha1-acid glycoprotein family. Apolipoprotein M (ApoM) is mainly found in high-density lipoproteins (HDL) and is expressed in the liver and in the kidney; it is associated to a lesser extent with low-density lipoproteins and triglyceride-rich lipoproteins. ApoM is involved in lipid transport and can bind sphingosine-1-phosphate, myristic acid, palmitic acid and stearic acid, retinol, all-trans-retinoic acid and 9-cis-retinoic acid. Alpha1-acid glycoprotein (AGP), also known as orosomucoid, has many important biological roles such as in the acute-phase reaction in response to inflammation, in immune regulation, in drug binding and transport, in regulating sphingolipid synthesis and metabolism, and in maintaining the capillary barrier. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 153
36537 381191 cd19416 lipocalin_beta-LG-like beta-lactoglobulin and similar proteins. Beta-lactoglobulin (beta-LG) is the major whey protein of ruminant species and is present in the milk of many other species, with the notable exception of humans. It is the major allergen of bovine milk. Beta-LG has been shown to bind hydrophobic ligands such as curcumin, vitamin E or fatty acids, and hydrophilic ligands such as vitamin B9. This group also includes human glycodelin (also known as placental protein 14, pregnancy-associated endometrial alpha-2 globulin, and progestagen-associated endometrial protein), which is involved in crucial biological processes such as reproduction and immune reaction. Four glycoforms of glycodelin have been identified in reproductive tissue that differ in glycosylation and biological activity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 160
36538 381192 cd19417 lipocalin_C8gamma complement protein C8 gamma. Human complement protein C8 gamma, together with C8alpha and C8beta, forms one of five components of the cytolytic membrane attack complex (MAC), a pore-like structure that assembles on bacterial membranes. C8alpha and C8gamma form a disulfide-linked heterodimer that is noncovalently associated with C8beta. MAC plays an important role in the defense against gram-negative bacteria and other pathogenic organisms. C8gamma belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 162
36539 381193 cd19418 lipocalin_A1M-like lipocalin domain of alpha1-microglobulin and similar proteins. Alpha(1)-microglobulin (A1M, also known as protein AMBP, alpha-1 microglycoprotein, and protein HC) has immunosuppressive properties, such as inhibition of antigen-induced lymphocyte proliferation, cytokine secretion, and oxidative burst of neutrophils. A1M may participate in the reducing and scavenging of biological pro-oxidants such as heme and heme-proteins. It binds heme strongly, and a C-terminally processed form of the protein degrades the heme. It can reduce cytochrome C, nitroblue tetrazolium, methemoglobin and free iron, using NADH, NADPH or ascorbate as a cofactor. Intravenous administration of recombinant A1M in animal models eliminates or significantly reduces the manifestations of preeclampsia. A1M is a useful biomarker in clinical diagnostics for monitoring preeclampsia, hepatitis E, renal tubular dysfunction, and renal toxicity. A1M belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 163
36540 381194 cd19419 lipocalin_L-PGDS lipocalin-type prostaglandin D synthase. Lipocalin-type prostaglandin D synthase (L-PGDS; EC:5.3.99.2) is a secreted enzyme and the second most abundant protein in human cerebrospinal fluid. L-PGDS acts as both an enzyme and a lipid transporter, converting prostaglandin H2 to prostaglandin D2 and serving as a carrier for hydrophobic ligands including retinoids, hemoglobin metabolites, thyroid hormones, gangliosides, and fatty acids. L-PGDS belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 158
36541 381195 cd19420 lipocalin_VDE lipocalin domain of violaxanthin deepoxidase and similar proteins. Plant violaxanthin de-epoxidase (VDE, EC 1.23.5.1) participates in the xanthophyll cycle for controlling the concentration of zeaxanthin in chloroplasts. It catalyzes the conversion of violaxanthin to antheraxanthin and zeaxanthin in strong light, and plays a central role in adjusting photosynthetic activity to changing light conditions. In addition, maize VDE has been shown to interact with sugarcane mosaic virus helper component-proteinase, HC-(SCMV), and to attenuate the RNA silencing suppression activity of the latter. VDE belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 177
36542 381196 cd19421 lipocalin_5_8-like lipocalin similar to human epididymal-specific lipocalin-8, mouse lipocalin-5 and -8, and similar proteins. Lipocalin 5 (LCN5; also known as epididymal retinoic acid binding protein Erabp, mouse epididymal protein 10, MEP10, and E-RABP) and Lipocalin 8 (LCN8; also known as mouse epididymal protein 17, MEP17) are homologous proteins belonging to the epididymis-specific lipocalins; they may play a role in male fertility, and may act as retinoid carrier proteins within the epididymis. In mice, genes encoding the two proteins are contiguous; in humans, there is one gene LCN8 (which has been previously called LCN5). This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 150
36543 381197 cd19422 lipocalin_15-like lipocalin 15 and similar proteins, such as chicken CALbeta. This subfamily includes uncharacterized human lipocalin 15, and chicken chondrogenesis-associated lipocalin (CAL) beta which is associated with chondrogenesis and inflammation. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 143
36544 381198 cd19423 lipocalin_LTBP1-like Triatominae salivary lipocalins such as Rhodnius prolixus LTBP1 and Meccus pallidipennis triabin, and similar proteins. This subfamily includes various insect proteins found in the saliva of Triatominae (kissing bugs), including Rhodnius prolixus leukotriene-binding LTBP1. Rhodnius prolixus, a vector of the pathogen Trypanosoma cruzi, sequesters cysteinyl leukotrienes during feeding to inhibit immediate inflammatory responses; LTBP1 binds leukotrienes C4 (LTC4), D4 (LTD4), and E4 (LTE4). Meccus pallidipennis (syn Triatoma pallidipennis) triabin is a potent and selective thrombin inhibitor. It also includes Triatoma protracta procalin, a major salivary allergen which causes an allergic reaction in humans. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 132
36545 381199 cd19424 lipocalin_NPs-like nitrophorins and similar proteins. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase, R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. Besides NO, NPs also show high affinities for histamine (Hm). This group also includes Rhodnius prolixus amine-binding protein (ABP) which plays an important role in biogenic amine binding; it binds serotonin and norepinephrine with high affinity. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 175
36546 381200 cd19425 lipocalin_10-like Epididymal-specific lipocalin-10 and similar proteins. Epididymal-specific lipocalin-10 (LCN10) may play a role in male fertility, and may act as a retinoid carrier protein within the epididymis. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 111
36547 381201 cd19426 lipocalin_6 Epididymal-specific lipocalin-6. Epididymal-specific lipocalin-6 (LCN6) may play a role in male fertility. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 144
36548 381202 cd19427 lipocalin_OBP-like Lipocalin odorant-binding protein and similar proteins. Odorant-binding proteins (OBPs) transport small hydrophobic molecules in the nasal mucosa of vertebrates. This subfamily includes mouse odorant-binding protein 1a (Obp1a), Obp1b, and probasin. Mouse Obp1a and Obp1b, which are expressed in the nasal mucosa, bind the chemical odorant 2-isobutyl-3-methoxypyrazine, and may form an Obp1a/Obp1b heterodimer. Mouse probasin may play a role in the biology of the prostate gland. This group also includes hamster female-specific lacrimal gland protein (FLP) and aphrodisin. FLP may bind tear lipids or lipid-like pheromones found in hamster tears; aphrodisin is found in hamster vaginal discharge, carries pheromones, and stimulates copulatory behavior in males. This group also includes dog allergen Can f 4 which is expressed by tongue epithelial tissue and found in saliva and dander. Bovine OBP is believed to act as a homodimer, having the C-terminal alpha-helix of each monomer stacking against the beta-barrel of the other monomer; this is possible due to its lack of cysteines and therefore lack of disulfide bonds. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 147
36549 381203 cd19428 lipocalin_MUP-like major urinary proteins (MUPs) and similar proteins. Mouse urine contains major urinary proteins (MUPs) which bind low molecular weight hydrophobic organic compounds, including urinary volatile pheromones such as the male-specific 2-sec-butyl-4,5-dihydrothiazole (SB2HT) which hastens puberty in female mice. The association between MUPs and these volatiles slows the release of the volatiles into the air from urine marks. MUPs may also act as pheromones themselves. MUPs, expressed in the nasal and vomeronasal mucosa, may be important for delivering urinary volatiles to receptors in the vomeronasal organ. This group includes MUPs encoded by central genes in the MUP cluster, as well as those encoded by peripheral genes such as Darcin/Mup20 which binds most of the male pheromone SB2HT in urine and was the first MUP shown to have male pheromonal activity in its own right. This group includes rat MUPs (also called alpha-2U globulins) and other lipocalins such as major horse allergen Equ c 1 and boar salivary lipocalin, a pheromone-binding protein specifically expressed in the submaxillary glands of the boar. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 158
36550 381204 cd19429 lipocalin_9 lipocalin 9. Lipocalin 9 (LCN9) is specifically expressed in the epididymis. It belongs to the lipocalin/cytosolic fatty-acid binding protein family. Lipocalins are typically small extracellular proteins that bind small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids and form covalent or non-covalent complexes with soluble macromolecules as well as membrane bound-receptors. They are involved in many important functions, like ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. 156
36551 381205 cd19430 lipocalin_trichosorin-like trichosurin and similar proteins. Trichosurin is a protein from the milk whey of the common brushtail possum, Trichosurus vulpecula, and shows a preference for binding small phenolic ligands. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 153
36552 381206 cd19431 lipocalin_Can_f_2 Minor allergen Can f 2. The minor dog lipocalin allergen Can f 2 is an important cause of allergic sensitization in humans worldwide. It is one of two major allergens present in dog dander extracts, and is produced by tongue and the parotid gland (a major salivary gland). Can f 2 belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 162
36553 381207 cd19432 lipocalin_2_12-like lipocalin 2 and 12 and similar proteins. Lipocalin-2 (LCN2, also known as siderocalin, uterocalin, neutrophil gelatinase-associated lipocalin) is expressed in renal, endothelial, liver, smooth muscle cells, cardiomyocytes, in various populations of immune cells and dendritic cells. Roles ascribed to LCN2 include chemotactic and bacteriostatic effects, and iron trafficking. LCN2 can also act as a growth factor. It plays a key role in the pathophysiology of renal and cardiovascular diseases, and is involved in various deleterious processes, such as inflammation and fibrosis. It is used as a renal injury biomarker. Lipocalin 12 (LCN12) is an epididymis-specific protein which binds all-trans retinoic acid. It may act as a retinoid carrier protein within the epididymis and play a role in male reproduction. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 154
36554 381208 cd19433 lipocalin_CpcS-CpeS CpcS/CpeS phycobiliprotein lyase family. Phycobilin lyases covalently attach a chromophore to the Cys residue(s) of cyanobacterial phycobiliproteins. They include Synechococcus sp. PCC 7002 phycocyanobilin lyase CpcS which attaches a phycocyanobilin chromophore to C-phycocyanin beta subunit and to allophycocyanin alpha and beta subunits, Synechococcus sp. PCC 7002 phycocyanobilin lyase subunit CpcU which forms a heterodimer with CpcS-I to attach phycocyanobilin to beta-phycocyanin and to allophycocyanin subunits, and Prochlorococcus marinus phycoerythrobilin lyase CpeS which attaches 3Z-phycoerythrobilin to phycoerythrin. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 169
36555 381209 cd19434 lipocalin_YxeF Lipocalin similar to uncharacterized Bacillus subtilis YxeF. Bacillus subtilis YxeF lacks the alpha-helix that packs in all lipocalins with known structure against the beta-barrel. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 108
36556 381210 cd19435 lipocalin_Bacteroides Bacteroides lipocalin. An uncharacterized Bacteroides subfamily of the lipocalin/cytosolic fatty-acid binding protein family, a characteristic of which is a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 131
36557 381211 cd19436 lipocalin_crustacyanin crustacyanin Type I CRTC and Type II CRTA subunits. Alpha crustacyanin bound with the carotenoid astaxanthin (AXT) is the predominant carotenoprotein generating the slate-grey/blue color of the lobster carapace. Crustacyanin forms heterodimers (beta-crustacyanin) or complexes of 16 subunits (alpha-crustacyanin) assembled from beta-crustacyanin. Beta-crustacyanin is formed from one type I CRTC lipocalin subunit, and one type II CRTA lipocalin subunit (and two bound astaxanthin molecules). Homarus gammarus (European lobster) crustacyanin has five distinct subunits evident on 6 M urea-PAGE gels: type I CRTC (A1, C1, C2) and type II CRTA (A2, A3). Homarus americanus crustacyanin consists of only two major subunits, namely type I CRTC (H1) and type II CRTA (H2), both of which behave like Ax subunits on a 6 M urea-PAGE gel. This family includes both type I CRTC subunit and type II CRTA subunits and belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 169
36558 381212 cd19437 lipocalin_apoD-like apolipoprotein D and similar proteins. Human apolipoprotein D (ApoD) is a small glycoprotein associated with high density lipoproteins (HDL) in plasma. It appears promiscuous since it can bind hydrophobic ligands belonging to different lipid groups, with different shapes and biochemical properties; however, it exhibits specificity between very similar lipidic species. Some ligands, such as progesterone and arachidonic acid, bind to the ligand-binding pocket with high affinity, while others may interact with ApoD via its region of surface hydrophobicity. This hydrophobic surface cluster may facilitate its association with HDL particles and facilitate its insertion into cellular lipid membranes. Drosophila NLaz and Schistocerca Laz belong to this group, and share functional properties with human ApoD, including regulation of lifespan, lipid and carbohydrate metabolism control, and protection against oxidative stress or starvation. This group also includes Sandercyanin, a blue protein secreted in the skin mucus of blue forms of walleye, Sander vitreus. Walleye is an important golden yellow commercial and sport fish; the findings of blue walleye are recent. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 160
36559 381213 cd19438 lipocalin_Blc-like bacterial lipocalin Blc, Arabidopsis thaliana temperature-induced lipocalin-1, and similar proteins. Escherichia coli bacterial lipocalin (Blc, also known as YjeL) is an outer membrane lipoprotein involved in the storage or transport of lipids necessary for membrane maintenance under stressful conditions. Blc has a binding preference for lysophospholipids. This group includes eukaryotic lipocalins such as Arabidopsis thaliana temperature-induced lipocalin-1 (TIL) which is involved in thermotolerance, oxidative, salt, drought and high light stress tolerance, and is needed for seed longevity by ensuring polyunsaturated lipids integrity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 143
36560 381214 cd19439 lipocalin_Ex-FABP-like extracellular fatty acid-binding protein. Ex-FABP (also known as siderocalin, lipocalin Q83 or protein Ch21) displays a dual ligand binding mode as it can bind siderophore and fatty acids simultaneously. ExFABP has a cavity which extends through the protein and has two separate ligand specificities, one for bacterial siderophores at one end, and other specifically binding co-purified lysophosphatidic acid (LPA), a potent cell signaling molecule, at the other end. As well as acting as an LPA "sensor", Ex-FABP is bacteriostatic, and tightly binds the 2,3-catechol-type ferric siderophores enterobactin, bacillibactin, and parabactin, associated with enteric bacteria and Gram-positive bacilli. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 142
36561 381215 cd19440 lipocalin_Bla_g_4_Per_a_4 major allergens Bla g 4 and Per a 4. Inhalant allergens from cockroaches are an important cause of asthma. Bla g 4 and Per a 4 are male pheromone transport lipocalins, and both are major allergens. Bla g 4 is produced by Blattella germanica (German cockroach) and has been shown to bind two biogenic amines, tyramine and octopamine which may be its physiological ligands. Per a 4 is produced by Periplaneta americana (American cockroach) and may bind different ligands from Bla g 4 or have different modes for tyramine/octopamine binding. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 148
36562 381216 cd19441 CRABP cellular retinoic acid-binding proteins (CRABP1 and CRABP2). Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all-trans retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 and CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. CRABP1 (also known as CRABP, CRABP-I, CRABPI, RBP5) is thought to play an important role in retinoic acid-mediated differentiation and proliferation processes. CRABP1 has been shown to modulate stem cell proliferation to affect learning and memory. It has also been shown to regulate CaMKII, excessive and/or persistent activation of which is detrimental in acute and chronic cardiac injury. CRABP2 (also known as CRABP-II, RBP6) transports retinoic acid to the nucleus, and delivers all-trans-retinoic acid to nuclear retinoic acid receptors. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRABPs, include the cellular retinol-binding proteins (CRBPs) and the fatty acid-binding proteins (FABPs). 135
36563 381217 cd19442 CRBP cellular retinol-binding protein. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPs, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs). 131
36564 381218 cd19443 FABP3-like fatty acid-binding protein 3 and similar proteins including FABP4, -5, -7, -8, -9, -11, and -12. This FABP3-like subfamily includes FABP3, -4, -5, -7, -8, -9, -11, -12, and similar proteins and belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36565 381219 cd19444 FABP1 fatty acid-binding protein 1. Fatty acid-binding protein 1, FABP1 (also known as fatty acid-binding protein 1, liver FABP, L-FABP) occurs at high cytosolic concentration in liver, intestine, and, in the case of humans, also in kidney. FABP1 binds to two molecules of long-chain fatty acids; the two binding sites appear to be inter-dependent. FABP1 binds to fatty acyl-CoAs, peroxisome proliferators, prostaglandins, bile acids, bilirubin, heme, hydroxyl and hydroperoxyl metabolites of fatty acids, lysophosphatidic acids, selenium, and other hydrophobic ligands. FABP1 is down-regulated in about ten percent of hepatocellular carcinoma (HCC) as well as in colorectal cancer at the adenoma stage, but can also be over-expressed in various cancers. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 126
36566 381220 cd19445 FABP2 fatty acid-binding protein 2. FABP2 (also known as fatty acid-binding protein 2, intestinal, and I-FABP) is a small cytosolic protein abundantly present in mature enterocytes of small and large intestine and responsible for the absorption and intracellular transport of fatty acids. It is present throughout the small intestine; its highest expression is in the jejunum. It is a sensitive marker for damage to the intestinal epithelium. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 130
36567 381221 cd19446 FABP6 fatty acid-binding protein 6. Human fatty acid-binding protein 6 (also known as gastrotropin, I-15P, I-BABP, I-BALB, I-BAP, ILBP, ILBP3, ileal bile acid-binding protein, ILLBP, ileal lipid-binding protein) is an intracellular carrier of bile salts in the epithelial cells of the distal small intestine and has a key role in the enterohepatic circulation of bile salts. It recognizes a series of physiological bile salts that vary in the number and position of steroidal hydroxyl groups, and the presence and type of side-chain conjugation. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 125
36568 381222 cd19447 L-BABP-like liver bile acid-binding protein and similar proteins. Liver bile acid-binding protein (also known as "fatty acid-binding protein, liver", LB-FABP, L-BABP, L-FABP, FABP1) is present in the liver of the vertebrates fish, amphibians, reptiles, and birds but not in mammals. L-BABPs bind free fatty acids and their coenzyme A derivatives, bilirubin, and some other small molecules in the cytoplasm. The role of L-BABPs may be that of cellular and metabolic trafficking of bile acids; they may be involved in intracellular lipid transport. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 124
36569 381223 cd19448 FABP_pancrustacea fatty acid-binding protein similar to Manduca sexta and Eriocheir sinensis fatty acid-binding protein 1. This subfamily includes fatty acid-binding protein found mainly in insects such as Manduca sexta FABP1 (also known as MFB1) and Luciola cerata FABP (LcFABP), and crustacea such as Eriocheir sinensis FABP (Es-FABP). MFB1, which is isolated from midgut cytosol, binds fatty acids in a 1:1 molar ratio. LcFABP, abundantly and specifically expressed in the cytosol as well as the nucleus of cells of the photogenic layer of firefly light organ, binds fatty acids of length C14-C18. Es-FABP plays a role in lipid transport during the period of rapid ovarian growth and is involved in lipid nutrient absorption and utilization processes in the hepatopancreas, ovary, and hemocytes. It is also expressed in gills, muscle, thoracic ganglia, heart, and intestine. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 130
36570 381224 cd19449 ReP1-NCXSQ-like fatty acid-binding protein ReP1-NCXSQ and similar proteins. Arthropod ReP1-NCXSQ (regulatory protein of the squid nerve sodium calcium exchanger) is required for MgATP stimulation of the squid nerve Na(+)/Ca(2+) exchanger NCXSQ1. ReP1-NCXSQ acts as a carrier of fatty acids; it is possible that its biological ligand is palmitic acid, which is abundant in squid axons. The mechanism for fine-tuning of the regulation of NCXSQ1 by ReP1-NCXSQ may then involve the transport of palmitic acid. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 129
36571 381225 cd19450 lipocalin_ApoM Apolipoprotein M. Apolipoprotein M (ApoM) is mainly found in high-density lipoproteins (HDL) and is expressed in the liver and the kidney; it is associated to a lesser extent with low density lipids and triglyceride rich lipoproteins. It is involved in lipid transport and can bind sphingosine-1-phosphate, myristic acid, palmitic acid and stearic acid, retinol, all-trans-retinoic acid and 9-cis-retinoic acid. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 161
36572 381226 cd19451 lipocalin_AGP-like alpha1-acid glycoprotein and similar proteins. Alpha1-acid glycoprotein (AGP), also known as orosomucoid, has many important biological roles such as in the acute-phase reaction in response to inflammation, in immune regulation, in drug-binding and drug-transportation, in regulating sphingolipid synthesis and metabolism, and in maintaining the capillary barrier. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 173
36573 381227 cd19452 lipocalin_ABP Rhodnius prolixus amine-binding protein and similar proteins. Rhodnius prolixus amine-binding protein (ABP) plays an important role in biogenic amine binding; it binds serotonin and norepinephrine with high affinity. It is a subgroup of the lipocalin NP-like family. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. The lipocalin NP-like family belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 189
36574 381228 cd19453 lipocalin_NP2 nitrophorin 2. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 179
36575 381229 cd19454 lipocalin_NP3 nitrophorin 3. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 189
36576 381230 cd19455 lipocalin_NP1 nitrophorin 1. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 184
36577 381231 cd19456 lipocalin_NP4 nitrophorin 4. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 184
36578 381232 cd19457 lipocalin_2-like lipocalin 2 and similar proteins. Lipocalin-2 (LCN2, also known as siderocalin, uterocalin, oncogene 24p3, and neutrophil gelatinase-associated lipocalin) is expressed in renal, endothelial, liver, smooth muscle cells, cardiomyocytes, in various populations of immune cells and dendritic cells. Roles ascribed to LCN2 include chemotactic and bacteriostatic effects, and iron trafficking. LCN2 can also act as a growth factor. It plays a key role in the pathophysiology of renal and cardiovascular diseases, and is involved in various deleterious processes, such as inflammation and fibrosis. It is used as a renal injury biomarker. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 173
36579 381233 cd19458 lipocalin_12 Lipocalin 12. Lipocalin 12 (LCN12) is an epididymis-specific protein which binds all-trans retinoic acid. It may act as a retinoid carrier protein within the epididymis and play a role in male reproduction. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 165
36580 381234 cd19459 lipocalin_C1_moubatin-like Ornithodoros moubata CI, O. moubata moubatin, and similar proteins. The soft tick Ornithodoros moubata complement inhibitor CI (OmCI, also known as coversin) specifically targets C5, a member of the C3/C4/C5 protein family that orchestrates the assembly of the terminal C multiprotein complexes. O. moubata moubatin is a specific inhibitor of collagen-induced platelet aggregation. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 146
36581 381235 cd19460 CRABP1 cellular retinoic acid-binding protein 1. Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all trans retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 AND CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. This subgroup includes CRABP1 (also known as CRABP, CRABP-I, CRABPI, RBP5), which is thought to play an important role in retinoic acid-mediated differentiation and proliferation processes. CRABP1 has been shown to modulate stem cell proliferation to affect learning and memory. It has also been shown to regulate CaMKII, excessive and/or persistent activation of which is detrimental in acute and chronic cardiac injury. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and besides CRABPS include the cellular retinol-binding protein (CRBPs) and the fatty acid-binding proteins (FABPs). 136
36582 381236 cd19461 CRABP2 Cellular retinoic acid-binding protein 2. Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all trans retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 AND CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. This subgroup includes CRABP2 (also known as CRABP-II, RBP6) which transports retinoic acid to the nucleus, and delivers all-trans-retinoic acid to nuclear retinoic acid receptors. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and besides CRABPS include the cellular retinol-binding protein (CRBPs) and the fatty acid-binding proteins (FABPs). 136
36583 381237 cd19462 CRBP1 cellular retinol-binding protein 1. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBP1 (also known as Retinol-Binding Protein 1, CRBP, RBPC, CRBP1, CRBPI, CRABP-I) is widely expressed in numerous tissues: it has highest abundance in the liver, kidney, lung, and retinal pigment epithelium cells of the eye. CRBP1 has a high affinity for retinol. It accepts retinol transported from the plasma to cytosol via a cell surface receptor named STRA6, which interacts with serum retinol-binding protein. CRBP1 can bind all-trans-retinol, all trans-retinal and 13-cis-retinol, but not 9-cis-retinol. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs). 131
36584 381238 cd19463 CRBP2 cellular retinol-binding protein 2. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBP2 is also known as: "retinol-binding protein 2, cellular", CRABP-II, CRBP2, CRBPII, and RBPC2. Expression of CRBP2 is limited to the small intestine. CRBP2 binds both retinol and retinal; rat CRBP2 appears to bind both with equal affinity, human CRBP2 showed a significantly higher affinity for retinol relative to retinal. CRBP2 can bind all-trans-retinol, all trans-retinal and 13-cis-retinol, but not 9-cis-retinol. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs). 131
36585 381239 cd19464 CRBP3 cellular retinol-binding protein 3. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. This group includes human CRBP3 (also known as retinol-binding protein 5, HRBPiso) which is expressed at highest levels in kidney and liver. CRBP3 binds retinol, and may be a human intracellular carrier of retinol in such tissues. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs). 131
36586 381240 cd19465 CRBP4 cellular retinol-binding protein 4. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. This group includes human CRBP4 (also known as retinoid-binding protein 7, CRABP4, CRBP4, CRBPIV) which is expressed primarily in kidney, heart, and transverse colon, and mouse CRBP4 which is highly expressed in white adipose tissue and mammary gland. Human CRBP4 binds retinol with an affinity lower than those for CRBP1, -2, -3. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs). 131
36587 381241 cd19466 FABP3 fatty acid binding protein 3. FABP3 (also known as heart-type fatty acid binding protein, H-FABP, MDGI, O-FABP) is a cytosolic protein mainly expressed in cardiac and skeletal muscle cells. In these tissues, it plays an important role in fatty acid transportation, cell growth, cell signaling, and gene transcription. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36588 381242 cd19467 FABP4 fatty acid binding protein 4. FABP4 (also known as A-FABP, adipocyte fatty acid binding protein, aP2) is highly expressed in macrophages and in adipocytes where it regulates fatty acid storage and lipolysis and is an important mediator of inflammation. It binds long chain fatty acids, retinoic acid and eicosanoids. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 130
36589 381243 cd19468 FABP5 fatty acid binding protein 5. FABP5 (also known as epidermal FABP, E-FABP, cutaneous fatty-acid-binding protein, C-FABP, psoriasis-associated fatty-acid-binding protein, KFABP, PA-FABP) binds a wide array of ligands. It is an intracellular carrier for long-chain fatty acids and related active lipids, and also selectively delivers specific fatty acids from the cytosol to the nucleus. Its ligands include vitamin A metabolite all-trans-retinoic acid, endocannabinoid and numerous synthetic drugs and probes. It may be involved in keratinocyte differentiation. Mouse FABP5 is found only in the monomeric form; however, human FABP5 can exist as a monomer as well as a domain-swapped dimer. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36590 381244 cd19469 FABP8 fatty acid binding protein 8. FABP8 (also known as peripheral myelin protein 2, PMP2, myelin fatty acid binding protein, M-FABP, myelin P2 protein, MP2) is a fatty acid-binding structural component of the myelin sheath in the peripheral nervous system and may play a role in lipid transport and homeostasis in myelin. It may bind cholesterol which is present in myelin at high concentrations. In addition to binding monomeric ligands, P2 is able to bind membrane surfaces, and to stack lipid bilayers together. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 129
36591 381245 cd19470 FABP7 Fatty acid binding protein 7. FABP7 (also known as brain FABP, B-FABP, BLBP, brain lipid binding protein) is highly expressed in glial cells through development of the nervous system. In the developing brain, FABP7 is required for the establishment of the radial glial fiber system, which is involved in the migration of immature neurons. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 130
36592 381246 cd19471 FABP9 fatty acid binding protein 9 and similar proteins. FABP9 (also known as testis-FABP, T-FABP, PERF15) is a major protein found in the inner acrosomal membrane and outer face of the nuclear envelope of mammalian sperm. Its expression is increased in prostate cancer. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 130
36593 380996 cd19473 SET_SUV39H_DIM5-like SET domain (including pre-SET domain) found in Neurospora crassa (DIM-5) and similar proteins. This subfamily contains Neurospora crassa DIM-5 (also termed H3-K9-HMTase dim-5, or HKMT) which functions as histone-lysine N-methyltransferase that specifically trimethylates histone H3 to form H3K9me3. 274
36594 410883 cd19475 FlaH flagellar accessory protein FlaH. Flagellar accessory protein FlaH is part of the motor of the archaellum membrane-anchored archaeal motility structure, together with FlaX and FlaI. FlaH forms a hexameric ring, and binds ATP which is essential for its interaction with FlaI and for archaellum assembly. 220
36595 410884 cd19476 RecA-like_ion-translocating_ATPases RecA-like domain of ion-translocating ATPases. RecA-like NTPases. This family includes the NTP-binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 270
36596 410885 cd19477 type_II_IV_secretion_ATPases type II/type IV hexameric secretion ATPases. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 168
36597 410886 cd19478 Csm2 Shu complex subunit Csm2. Csm2, together with Shu1, Shu2, and Psy3, forms the Shu complex which is thought to play a role in maintaining genome stability by linking error-free post-replication repair to homologous recombination. 206
36598 410887 cd19479 Elp456 Elongator subcomplex subunits Elp4, 5 and 6. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases. 175
36599 410888 cd19480 Psy3 Shu complex subunit Psy3. Psy3, together with Shu1, Shu2, and Csm2, forms the Shu complex which is thought to play a role in maintaining genome stability by linking error-free post-replication repair (PRR) to homologous recombination (HR). 218
36600 410889 cd19481 RecA-like_protease proteases similar to RecA. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 158
36601 410890 cd19482 RecA-like_Thep1 RecA-like domain of the nucleoside-triphosphatase THEP1 family. This family represents the THEP1 family ATPase domain. It includes nucleoside-triphosphatase THEP1 from Aquifex aeolicus (aaTHEP1), a nucleoside-phosphatase with activity towards ATP, GTP, CTP, TTP and UTP, which may hydrolyze nucleoside diphosphates with lower efficiency. The catalytic function of aaTHEP1 remains unclear; it may be a DNA/RNA modifying enzyme. Human THEP1 (hsTHEP1) may have a general function in many human tissues, as it is widely expressed in most examined tissues (such as in brain, heart, lymph node, skin, pancreas); it is especially highly expressed in embryonic and various tumor tissues. This family belongs to the RecA-like NTPase superfamily which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 164
36602 410891 cd19483 RecA-like_Gp4D_helicase RecA-like domain of Escherichia coli bacteriophage T7 Gp4D helicase. This family includes the RecA-like domain of the Gp4D fragment of the Gene4 helicase-primase (Gp4) from bacteriophage T7. Gp4D (residues 241-566) is the minimal fragment of Gp4 that forms hexameric rings; it contains the helicase domain and the linker connecting the helicase and primase domains. Helicases are ring-shaped oligomeric enzymes that unwind DNA at the replication fork; they couple NTP hydrolysis to the unwinding of nucleic acid duplexes into their component strands. This family belongs to the RecA-like NTPase superfamily which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 231
36603 410892 cd19484 KaiC_C C-terminal domain of Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 218
36604 410893 cd19485 KaiC-N N-terminal domain of Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 226
36605 410894 cd19486 KaiC_arch KaiC family protein; uncharacterized subfamily similar to Pyrococcus horikoshii PH0284. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 230
36606 410895 cd19487 KaiC-like_C C-terminal domain of KaiC family protein; uncharacterized subfamily. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 219
36607 410896 cd19488 KaiC-like_N N-terminal domain of KaiC family protein; uncharacterized subfamily. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation. 225
36608 410897 cd19489 Rad51D RAD51D recombinase. RAD51D recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51D, together with the other RAD51 paralogs, RAD51B, RAD51C, XRCC3, and XRCC2, helps recruit RAD51 to the break site. 209
36609 410898 cd19490 XRCC2 XRCC2 recombinase. XRCC2 (X-ray repair complementing defective repair in Chinese hamster cells 2) recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. XRCC2, together with the other RAD51 paralogs, RAD51B, RAD51C, RAD51D, and XRCC3, helps recruit RAD51 to the break site. 226
36610 410899 cd19491 XRCC3 XRCC3 recombinase. XRCC3 (X-ray repair complementing defective repair in Chinese hamster cells 3) recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. XRCC3, together with the other RAD51 paralogs, RAD51B, RAD51C, RAD51D, and XRCC2, helps recruit RAD51 to the break site. 250
36611 410900 cd19492 Rad51C RAD51C recombinase. RAD51C recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51C, together with the other RAD51 paralogs, RAD51B, RAD51D, XRCC3, and XRCC2, helps recruit RAD51 to the break site. Additionally, RAD51C acts as a mediator in the early steps of DNA damage signaling. 172
36612 410901 cd19493 Rad51B RAD51B recombinase. RAD51B recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51B, together with the other RAD51 paralogs, RAD51C, RAD51D, XRCC3, and XRCC2, helps recruit RAD51 to the break site. 222
36613 410902 cd19494 Elp4 Elongator subcomplex subunit Elp4. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases. 259
36614 410903 cd19495 Elp6 Elongator subcomplex subunit Elp6. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases. 228
36615 410904 cd19496 Elp5 Elongator subcomplex subunit Elp5. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases. 143
36616 410905 cd19497 RecA-like_ClpX ATP-dependent Clp protease ATP-binding subunit ClpX. ClpX is a component of the ATP-dependent protease ClpXP. In ClpXP, ClpX ATPase serves to specifically recognize, unfold, and translocate protein substrates into the chamber of ClpP protease for degradation. This RecA-like_ClpX domain subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 251
36617 410906 cd19498 RecA-like_HslU ATP-dependent protease ATPase subunit HslU. HslU is a component of the ATP-dependent protease HslVU. In HslVU, HslU ATPase serves to unfold and translocate protein substrate, and the HslV protease degrades the unfolded proteins. This RecA-like_HslU subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 183
36618 410907 cd19499 RecA-like_ClpB_Hsp104-like Chaperone protein ClpB/Hsp104 subfamily. Bacterial Caseinolytic peptidase B (ClpB) and eukaryotic Heat shock protein 104 (Hsp104) are ATP-dependent molecular chaperones and essential proteins of the heat-shock response. ClpB/Hsp104 ATPases, in concert with the DnaK/Hsp70 chaperone system, disaggregate and reactivate aggregated proteins. This RecA-like_ClpB_Hsp104_like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 178
36619 410908 cd19500 RecA-like_Lon lon protease homolog 2 peroxisomal. Lon protease (also known as Lon peptidase) is an evolutionarily conserved ATP-dependent serine protease, present in bacteria and eukaryotic mitochondria and peroxisomes, which mediates the selective degradation of mutant and abnormal proteins as well as certain short-lived regulatory proteins. Lon protease is both an ATP-dependent peptidase and a protein-activated ATPase. This RecA-like Lon domain subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 182
36620 410909 cd19501 RecA-like_FtsH ATP-dependent zinc metalloprotease FtsH. FtsH ATPase is a processive, ATP-dependent zinc metallopeptidase for both cytoplasmic and membrane proteins. It is anchored to the cytoplasmic membrane such that the amino- and carboxy-termini are exposed to the cytoplasm. It presents a membrane-bound hexameric structure that is able to unfold and degrade protein substrates. It is comprised of an N-terminal transmembrane region and the larger C-terminal cytoplasmic region, which consists of an ATPase domain and a protease domain. This RecA-Like FTsH subfamily represents the ATPase domain, and belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 171
36621 410910 cd19502 RecA-like_PAN_like proteasome activating nucleotidase PAN and related proteasome subunits. This subfamily contains ATPase subunits of the eukaryotic 26S proteasome, and of the archaeal proteasome which carry out ATP-dependent degradation of substrates of the ubiquitin-proteasome pathway. The eukaryotic 26S proteasome consists of a proteolytic 20S core particle (CP), and a 19S regulatory particle (RP) which provides the ATP-dependence and the specificity for ubiquitinated proteins. In archaea, the RP is a homohexameric complex of proteasome-activating nucleotidase (PAN). This subfamily also includes various eukaryotic 26S subunits, including proteasome 26S subunit, ATPase 2 (PSMC2, also known as S7 and MSS1), which is a member of the 19S RP and has a chaperone-like activity; and proteasome 20S subunit alpha 6 (PSMA6, also known as IOTA, p27K, and PROS27), which is a member of the 20S CP. This RecA-like_PAN subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 171
36622 410911 cd19503 RecA-like_CDC48_NLV2_r1-like first of two ATPase domains of CDC48 and NVL2, and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes, by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the first of the two ATPase domains. This subfamily also includes the first of the two ATPase domains of NVL (nuclear VCP-like protein) 2, an isoform of NVL mainly present in the nucleolus, which is involved in ribosome biogenesis, in telomerase assembly and the regulation of telomerase activity, and in pre-rRNA processing. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 165
36623 410912 cd19504 RecA-like_NSF-SEC18_r1-like first of two ATPase domains of NSF and SEC18, and similar ATPase domains. N-ethylmaleimide-sensitive factor (NSF) and Saccharomyces cerevisiae Vesicular-fusion protein Sec18, key factors for eukaryotic trafficking, are ATPases and SNARE disassembly chaperones. NSF/Sec18 activate or prime SNAREs, the terminal catalysts of membrane fusion. Sec18/NSF associates with SNARE complexes through binding Sec17/alpha-SNAP. Sec18 has an N-terminal cap domain and two nucleotide-binding domains (D1 and D2) which form the two rings of the hexameric complex. The hydrolysis of ATP by D1 generates most of the energy necessary to disassemble inactive SNARE bundles, while the D2 ring binds ATP to stabilize the homohexamer. This subfamily includes the first (D1) ATPase domain of NSF/Sec18, and belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 177
36624 410913 cd19505 RecA-like_Ycf2 ATPase domain of plant YCF2. Ycf2 is a chloroplast ATPase which has an essential function; however, its function remains unclear. The gene encoding YCF2 is the largest known plastid gene in angiosperms and has been used to predict phylogenetic relationships. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 161
36625 410914 cd19506 RecA-like_IQCA1 ATPase domain of IQ and AAA domain-containing protein 1 (IQCA1). IQCA1 (also known as dynein regulatory complex subunit 11, DRC11 and IQCA) is an ATPase subunit of the nexin-dynein regulatory complex (N-DRC). The 9 + 2 axoneme of most motile cilia and flagella consists of nine outer doublet microtubules arranged in a ring surrounding a central pair of two singlet microtubules. The N-DRC complex maintains alignment between outer doublet microtubules and limits microtubule sliding in motile axonemes. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 160
36626 410915 cd19507 RecA-like_Ycf46-like ATPase domain of Ycf46 and similar ATPase domains. Ycf46 may play a role in the regulation of photosynthesis in cyanobacteria, especially in CO2 uptake and utilization. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 161
36627 410916 cd19508 RecA-like_Pch2-like ATPase domain of Pachytene checkpoint 2 (Pch2) and similar ATPase domains. Pch2 (also known as Thyroid hormone receptor interactor 13 (TRIP13) and 16E1BP) is a key regulator of specific chromosomal events, like the control of G2/prophase processes such as DNA break formation and recombination, checkpoint signaling, and chromosome synapsis. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 199
36628 410917 cd19509 RecA-like_VPS4-like ATPase domain of VPS4, ATAD1, K, KTNA1, Spastin, FIGL-1 and similar ATPase domains. This subfamily includes the ATPase domains of vacuolar protein sorting-associated protein 4 (VPS4), ATPase family AAA domain-containing protein 1 (ATAD1, also known as Thorase), Katanin p60 ATPase-containing subunit A1 (KTNA1), Spastin, and Fidgetin-Like 1 (FIGL-1). This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 163
36629 410918 cd19510 RecA-like_BCS1 Mitochondrial chaperone BCS1. Mitochondrial chaperone BCS1 is necessary for the assembly of mitochondrial respiratory chain complex III and plays an important role in the maintenance of mitochondrial tubular networks, respiratory chain assembly and formation of the LETM1 complex. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 153
36630 410919 cd19511 RecA-like_CDC48_r2-like second of two ATPase domains of CDC48/p97, PEX1 and -6, VAT and NVL, and similar ATPase domains. This subfamily includes the second of two ATPase domains of the molecular chaperone CDC48 in yeast and p97 or VCP in metazoans, Peroxisomal biogenesis factor 1 (PEX1) and -6 (PEX6), Valosin-containing protein-like ATPase (VAT), and nuclear VCP-like protein (NVL). This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 159
36631 410920 cd19512 RecA-like_ATAD3-like ATPase domains of ATPase AAA-domain protein 3A (ATAD3A), -3B, and -3C, and similar ATPase domains. ATPase AAA-domain protein 3 (ATAD3) is a ubiquitously expressed mitochondrial protein involved in mitochondrial dynamics, DNA-nucleoid structural organization, cholesterol transport and steroidogenesis. The ATAD3 gene family in human comprises three paralog genes: ATAD3A, ATAD3B and ATAD3C. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 150
36632 410921 cd19513 Rad51 RAD51 recombinase. RAD51 recombinase plays an essential role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51 is recruited to the break site with the help of its paralogs, RAD51D, RAD51B, RAD51C, XRCC3, and XRCC2, where it forms long helical polymers which wrap around the ssDNA tail at the break which leads to pairing and strand invasion. 235
36633 410922 cd19514 DMC1 homologous-pairing protein DMC1. DMC1 has a central role in homologous recombination in meiosis. It assembles at the sites of programmed DNA double-strand breaks and carries out a search for allelic DNA sequences located on homologous chromatids. It forms octameric rings. 236
36634 410923 cd19515 archRadA archaeal recombinase Rad51/RadA. This group includes the archaeal protein RadA which is a homolog of Rad51. RAD51 recombinase plays an essential role in DNA repair by homologous recombination (HR). 233
36635 410924 cd19516 DotB_TraJ dot/icm secretion system protein DotB-like. Defect in organelle trafficking (Dot)B is part of the type IVb secretion (T4bS) system, also known as the dot/icm system, and is the main energy supplier of the secretion system. It is an ATPase, similar to the VirB11 component of the T4aS systems. This family also includes Escherichia coli IncI plasmid-encoded conjugative transfer ATPase TraJ encoded on the tra (transfer) operon. 179
36636 410925 cd19517 RecA-like_Yta7-like ATPase domain of Saccharomyces cerevisiae Yta7 and similar ATPase domains. Saccharomyces cerevisiae Yta7 is a chromatin-associated AAA-ATPase involved in regulation of chromatin dynamics. Its human ortholog ANCCA/ATAD2 transcriptionally activates pathways of malignancy in a broad range of cancers. The RecA-like_Yta7 subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 170
36637 410926 cd19518 RecA-like_NVL_r1-like first of two ATPase domains of NVL (nuclear VCP-like protein) and similar ATPase domains. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the first of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 169
36638 410927 cd19519 RecA-like_CDC48_r1-like first of two ATPase domains of CDC48 and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes, by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the first of the two ATPase domains. CDC48's roles include the fragmentation of Golgi stacks during mitosis and their reassembly after mitosis, and the formation of the nuclear envelope and of the transitional endoplasmic reticulum (tER). This RecA-like_cdc48_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 166
36639 410928 cd19520 RecA-like_ATAD1 ATPase domain of ATPase family AAA domain-containing protein 1 and similar ATPase domains. ATPase family AAA domain-containing protein 1 (ATAD1, also known as Thorase) is an ATPase that plays a critical role in regulating the surface expression of alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, thereby regulating synaptic plasticity, learning and memory. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 166
36640 410929 cd19521 RecA-like_VPS4 ATPase domain of vacuolar protein sorting-associated protein 4. Vacuolar protein sorting-associated protein 4 (Vps4) is believed to be involved in intracellular protein transport out of a prevacuolar endosomal compartment. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 170
36641 410930 cd19522 RecA-like_KTNA1 Katanin p60 ATPase-containing subunit A1. Katanin p60 ATPase-containing subunit A1 (KTNA1) is the catalytic subunit of the Katanin complex which severs microtubules in an ATP-dependent manner, and is implicated in multiple aspects of microtubule dynamics. In addition to the p60 catalytic ATPase subunit, Katanin contains an accessory subunit (p80 or p80-like). The microtubule-severing activity of the ATPase is essential for female meiotic spindle assembly, and male gamete production; and the katanin complex severing microtubules is under tight regulation during the transition from the meiotic to mitotic stage to allow proper embryogenesis. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 170
36642 410931 cd19523 RecA-like_fidgetin ATPase domain of fidgetin. Fidgetin (FIGN) is an ATP-dependent microtubule-severing protein. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 163
36643 410932 cd19524 RecA-like_spastin ATPase domain of spastin. Spastin is an ATP-dependent microtubule-severing protein involved in microtubule dynamics; it specifically recognizes and cuts microtubules that are polyglutamylated. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 164
36644 410933 cd19525 RecA-like_Figl-1 ATPase domain of Fidgetin-Like 1 (FIGL-1). FIGL-1 may participate in DNA repair in the nucleus; it may be involved in DNA double-strand break repair via homologous recombination. Caenorhabditis elegans FIGL-1 is a nuclear protein that controls mitotic progression in the germ line, and mouse FIGL-1 may be involved in the control of male meiosis. Human FIGL-1 has been shown to be a centrosome protein involved in ciliogenesis, perhaps as a microtubule-severing protein. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 186
36645 410934 cd19526 RecA-like_PEX1_r2 second of two ATPase domains of Peroxisomal biogenesis factor 1 (PEX1). PEX1 (also known as Peroxin-1)/PEX6 is a protein unfoldase; PEX1 and PEX6 form a heterohexameric Type-2 AAA-ATPase complex and are essential for peroxisome biogenesis as they are required for the import of folded proteins into the peroxisomal matrix. PEX1 is required for stability of PEX5. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 158
36646 410935 cd19527 RecA-like_PEX6_r2 second of two ATPase domains of Peroxisomal biogenesis factor 6 (PEX6). PEX6 (also known as Peroxin-6)/PEX1 is a protein unfoldase; PEX6 and PEX1 form a heterohexameric Type-2 AAA-ATPase complex and are essential for peroxisome biogenesis as they are required for the import of folded proteins into the peroxisomal matrix. This subfamily represents the second ATPase domain of PEX6. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 160
36647 410936 cd19528 RecA-like_CDC48_r2-like second of two ATPase domains of CDC48 and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the second of the two ATPase domains. CDC48's roles include the fragmentation of Golgi stacks during mitosis and their reassembly after mitosis, and the formation of the nuclear envelope and of the transitional endoplasmic reticulum (tER). This RecA-like_CDC48_r2-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 161
36648 410937 cd19529 RecA-like_VCP_r2 second of two ATPase domains of Valosin-containing protein-like ATPase (VAT) and similar ATPase domains. The Valosin-containing protein-like ATPase of Thermoplasma acidophilum (VAT) is an archaeal homolog of the ubiquitous Cdc48/p97. It is a protein unfoldase that functions in concert with the 20S proteasome by unfolding proteasome substrates and passing them on for degradation. VAT forms a homohexamer; each monomer contains two tandem ATPase domains, referred to as D1 and D2, and an N-terminal domain. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 159
36649 410938 cd19530 RecA-like_NVL_r2-like second of two ATPase domains of NVL (nuclear VCP-like protein) and similar ATPase domains. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the second of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r2-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 161
36650 380454 cd19531 LCL_NRPS-like LCL-type Condensation (C) domain of non-ribosomal peptide synthetases(NRPSs) and similar domains including the C-domain of SgcC5, a free-standing NRPS with both ester- and amide- bond forming activity. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 
Streptomyces globisporus SgcC5 is a free-standing NRPS condensation enzyme (rather than a modular NRPS), which catalyzes the condensation between the SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and (R)-1-phenyl-1,2-ethanediol, forming an ester bond, during the synthesis of the chromoprotein enediyne antitumor antibiotic C-1027. It has some acceptor substrate promiscuity as it has been shown to also catalyze the formation of an amide bond between SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and a mimic of the enediyne core acceptor substrate having an amine at its C-2 position. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains. 427
36651 380455 cd19532 C_PKS-NRPS Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Most members of this subfamily have the typical C-domain HHxxxD motif, a few such as Monascus pilosus lovastatin nonaketide synthase MokA have a non-canonical HRxxxD motif in the C-domain and are unable to catalyze amide-bond formation. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 421
36652 380456 cd19533 starter-C_NRPS Starter Condensation domains, found in the first module of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. While standard C-domains catalyze peptide bond formation between two amino acids, an initial, ('starter') C-domain may instead acylate an amino acid with a fatty acid. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 419
36653 380457 cd19534 E_NRPS Epimerization domain of nonribosomal peptide synthetases (NRPSs); belongs to the Condensation-domain family. Epimerization (E) domains of nonribosomal peptide synthetases (NRPS) flip the chirality of the end amino acid of a peptide being manufactured by the NRPS. E-domains are homologous to the Condensation (C) domains. NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Specialized tailoring NRPS domains such as E-domains greatly increase the range of possible peptide products created by the NRPS machinery. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the E-domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 428
36654 380458 cd19535 Cyc_NRPS Cyc (heterocyclization) domain of nonribosomal peptide synthetases (NRPSs); belongs to the Condensation-domain family. Cyc (heterocyclization) domains catalyze two separate reactions in the creation of heterocyclized peptide products in nonribosomal peptide synthesis: amide bond formation followed by intramolecular cyclodehydration between a Cys, Ser, or Thr side chain and a carbonyl carbon on the peptide backbone to form a thiazoline, oxazoline, or methyloxazoline ring. Cyc-domains are homologous to standard NRPS Condensation (C) domains. C-domains typically have a conserved HHxxxD motif at the active site; Cyc-domains have an alternative, conserved DxxxxD active site motif, mutation of the aspartate residues in this motif can abolish or diminish condensation activity. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and Cyc-domains. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 423
36655 380459 cd19536 DCL_NRPS-like DCL-type Condensation domains of nonribosomal peptide synthetases (NRPSs), such as terminal fungal CT domains and Dual Epimerization/Condensation (E/C) domains. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type [D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L))], which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 419
36656 380460 cd19537 C_NRPS-like Condensation family domain with an atypical active site motif. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily typically have a non-canonical conserved SHXXXDX(14)Y motif. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 395
36657 380461 cd19538 LCL_NRPS LCL-type Condensation domain of non-ribosomal peptide synthetases (NRPSs) and similar domains. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains. 432
36658 380462 cd19539 SgcC5_NRPS-like SgcC5 is a non-ribosomal peptide synthetase (NRPS) condensation enzyme with ester- and amide- bond forming activity and similar C-domains of modular NRPSs. SgcC5 is a free-standing NRPS condensation enzyme (rather than a modular NRPS), which catalyzes the condensation between the SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and (R)-1-phenyl-1,2-ethanediol, forming an ester bond, during the synthesis of the chromoprotein enediyne antitumor antibiotic C-1027. It has some acceptor substrate promiscuity as it has been shown to also catalyze the formation of an amide bond between SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and a mimic of the enediyne core acceptor substrate having an amine at its C-2 position. This subfamily also includes similar C-domains of modular NRPSs such as Penicillium chrysogenum N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase PCBAB. Condensation (C) domains of NRPSs normally catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 427
36659 380463 cd19540 LCL_NRPS-like LCL-type Condensation domain of nonribosomal peptide synthetases (NRPSs) and similar domains. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains. 433
36660 380464 cd19542 CT_NRPS-like Terminal Condensation (CT)-like domains of nonribosomal peptide synthetases (NRPSs). Unlike bacterial NRPS, which typically have specialized terminal thioesterase (TE) domains to cyclize peptide products, many fungal NRPSs employ a terminal condensation-like (CT) domain to produce macrocyclic peptidyl products (e.g. cyclosporine and echinocandin). Domains in this subfamily (which includes both terminal and non-terminal domains) typically have a non-canonical conserved [SN]HxxxDx(14)Y motif at their active site compared to the standard Condensation (C) domain active site motif (HHxxxD). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 401
36661 380465 cd19543 DCL_NRPS DCL-type Condensation domain of nonribosomal peptide synthetases (NRPSs), which catalyzes the condensation between a D-aminoacyl/peptidyl-PCP donor and a L-aminoacyl-PCP acceptor. The DCL-type Condensation (C) domain catalyzes the condensation between a D-aminoacyl/peptidyl-PCP donor and a L-aminoacyl-PCP acceptor. This domain is D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L)); this is in contrast with the standard LCL domains which catalyze peptide bond formation between two L-amino acids, and the restriction of ribosomes to use only L-amino acids. C domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains in addition to the LCL- and DCL-types such as starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 
C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. 423
36662 380466 cd19544 E-C_NRPS Dual Epimerization/Condensation (E/C) domains of nonribosomal peptide synthetases (NRPSs). Dual function Epimerization/Condensation (E/C) domains have both an epimerization and a DCL condensation activity. Dual E/C domains first epimerize the substrate amino acid to produce a D-configuration, then catalyze the condensation between the D-aminoacyl/peptidyl-PCP donor and a L-aminoacyl-PCP acceptor. They are D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L)); this is in contrast with the standard LCL domains which catalyze peptide bond formation between two L-amino acids, and the restriction of ribosomes to use only L-amino acids. These Dual E/C domains contain an extended His-motif (HHx(N)GD) near the N-terminus of the domain in addition to the standard Condensation (C) domain active site motif (HHxxxD). C domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains, these include the DCL-type, LCL-type, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C domains, and the X-domain. 413
36663 380467 cd19545 FUM14_C_NRPS-like Condensation domains of nonribosomal peptide synthetases (NRPSs) similar to the ester-bond forming Fusarium verticillioides FUM14 protein. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) typically catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. However, some C-domains have ester-bond forming activity. This subfamily includes Fusarium verticillioides FUM14 (also known as NRPS8), a bi-domain protein with an ester-bond forming NRPS C-domain, which catalyzes linkages between an aminoacyl/peptidyl-PCP donor and a hydroxyl-containing acceptor. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. FUM14 has an altered active site motif DHTHCD instead of the typical HHxxxD motif seen in other subfamily members. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 395
36664 380468 cd19546 X-Domain_NRPS X-domain is a catalytically inactive Condensation-like domain shown to recruit oxygenases to the non-ribosomal peptide synthetase (NRPS). The X-domain is a catalytically inactive member of the Condensation (C) domain family of non-ribosomal peptide synthetase (NRPS). It has been shown to recruit oxygenases to the NRPS to perform side-chain crosslinking in the production of glycopeptide antibiotics. C-domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as this X-domain, the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, and dual E/C (epimerization and condensation) domains. 
C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity; members of this X-domain subfamily lack the second H of this motif. 440
36665 380469 cd19547 beta-lac_NRPS Condensation domain of nonribosomal peptide synthetases (NRPSs) similar to Nocardia uniformis NocB which exhibits an unusual cyclization to form beta-lactam rings in pro-nocardicin G synthesis. Nocardia uniformis NRPS NocB acts centrally in the biosynthesis of the nocardicin monocyclic beta-lactam antibiotics. Along with another NRPS NocA, it mediates an unusual cyclization to form beta-lactam rings in the synthesis of the beta-lactam-containing pentapeptide pro-nocardicin G. This small subfamily is related to DCL-type Condensation (C) domains, which catalyze condensation between a D-aminoacyl/peptidyl-PCP donor and a L-aminoacyl-PCP acceptor. NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; domains belonging to this subfamily have an HHHxxxD motif at the active site. 422
36666 381016 cd19548 serpinA_A1AT-like serpin family A member, alpha-1-antitrypsin and similar serpin proteins in birds and reptiles. The alpha-1-antitrypsin family has a variety of different members of sauropsida belonging to the clade A of the serpin superfamily. This branch includes members from zebra finch, green anole, king cobra, gekko, crocodile, and central bearded dragon. Alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, and serum trypsin inhibitor) is a protease inhibitor. Clade A includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 370
36667 381017 cd19549 serpinA_A1AT-like serpin family A member, alpha-1-antitrypsin and similar proteins. This group contains proteins similar to alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, and serum trypsin inhibitor), a protease inhibitor that belongs to the serpin superfamily. It is encoded in humans by the SERPINA1 gene. When the blood contains inadequate amounts of A1AT or functionally defective A1AT (such as in alpha-1 antitrypsin deficiency), neutrophil elastase is excessively free to break down elastin, degrading the elasticity of the lungs, which results in respiratory complications, such as chronic obstructive pulmonary disease. Normally, A1AT leaves its site of origin, the liver, and joins the systemic circulation; defective A1AT can fail to do so, building up in the liver, which results in cirrhosis. This group belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 367
36668 381018 cd19550 serpinA2_PIL serpin family A member 2, protease inhibitor 1-like. Protease inhibitor 1-like (also called serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 2, ARGS, protease inhibitor 1 (alpha-1-antitrypsin)-like)/PIL, and alpha-1-antitrypsin-related protein/ATR) belongs to the serpin superfamily and is encoded by the SERPINA2 gene in humans. SERPINA2 was once thought to be a pseudogene, but recent evidence shows that it produces an active transcript. It is very similar in structure and function to SERPINA1. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 363
36669 381019 cd19551 serpinA3_A1AC serpin family A member 3, alpha 1-antichymotrypsin. Alpha 1-antichymotrypsin (a1AC/A1AC/a1ACT/AACT) is an alpha globulin glycoprotein that is a member of the serpin superfamily. In humans, it is encoded by the SERPINA3 gene. It inhibits the activity of proteases, such as cathepsin G that is found in neutrophils, and chymases found in mast cells, by cleaving them into a different shape or conformation. This activity protects some tissues, such as the lower respiratory tract, from damage caused by proteolytic enzymes. Deficiency of this protein has been associated with liver disease. Mutations have been identified in patients with Parkinson disease and chronic obstructive pulmonary disease. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 382
36670 381020 cd19552 serpinA4_KST serpin family A member 4, kallistatin. Kallistatin (KST, also called proteinase inhibitor 4/PI4, or kallikrein inhibitor/KAL) is a protein that in humans is encoded by the SERPINA4 gene. Kallistatin inhibits human amidolytic and kininogenase activities of tissue kallikrein. Heparin blocks kallistatin's complex formation with tissue kallikrein and abolishes its inhibitory effect on tissue kallikrein's activity. Kallistatin was found to be expressed in human liver, stomach, pancreas, kidney, aorta, testes, prostate, artery, atrium, ventricle, lung, renal proximal tubular cell, and a colonic carcinoma cell line T84. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 383
36671 381021 cd19553 serpinA5_PCI serpin family A member 5, protein C inhibitor. Protein C inhibitor (PCI/PROCI, also called PAI3, plasminogen activator inhibitor-3/PLANH3, plasma serine protease inhibitor) has many biological functions. It acts as a pro-coagulant in blood and, in the seminal vesicles, is required for spermatogenesis. It is a member of the clade A serpin family that includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 364
36672 381022 cd19554 serpinA6_CBG serpin family A member 6, corticosteroid-binding globulin. Corticosteroid-binding globulin (CBG, also known as transcortin) is encoded by the SERPINA6 gene in humans which encodes an alpha-globulin with corticosteroid-binding properties. It is produced in the liver. CBG binds several steroid hormones at high rates including cortisol, cortisone, deoxycorticosterone (DOC), corticosterone, aldosterone, progesterone, and 17a-hydroxyprogesterone. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 373
36673 381023 cd19555 serpinA7_TBG serpin family A member 7, thyroxine-binding globulin. Thyroxine-binding globulin (TBG, also called T4-binding globulin) is a globulin that binds thyroid hormones in circulation. It is one of three transport proteins (along with transthyretin and serum albumin) responsible for carrying the thyroid hormones thyroxine (T4) and triiodothyronine (T3) in the bloodstream. TBG is synthesized primarily in the liver and, unlike many other members of this class of proteins, is a serpin with no inhibitory function. There are two forms of inherited thyroxine-binding globulin deficiency: the complete form (TBG-CD), which results in a total loss of thyroxine-binding globulin, and the partial form (TBG-PD), which reduces the amount of this protein or alters its structure. Neither of these conditions causes any problems with thyroid function, but they can be mistaken for more serious thyroid disorders, such as hypothyroidism. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 379
36674 381024 cd19556 serpinA9_centerin serpin family A member 9, centerin. Centerin, also known as germinal center B-cell-expressed transcript 1/GCET1, is a serpin whose expression is restricted to germinal center B-cells and lymphoid malignancies with germinal center B-cell maturation. Expression of centerin, together with bcl-6 and GCET2, constitutes a germinal center B-cell signature, which is associated with a good prognosis in diffuse large B-cell lymphomas. Centerin is thought to function in vivo in the germinal centre as an efficient inhibitor of a trypsin-like protease. It also inhibits the trypsin-like serine proteases trypsin, thrombin and plasmin and is able to bind heparin and DNA. The centerin gene maps to the A clade serpin cluster on chromosome 14q32.1, which also contains a1-antitrypsin and a1-antichymotrypsin together with seven other serpins. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 388
36675 381025 cd19557 serpinA11 serpin family A member 11. Serpin A11, in rats also called liver regeneration-related protein LRRG023, is a serpin encoded by the gene SERPINA11. It maps on chromosome 14, at 14q32.13 and is strongly expressed in the human liver. The function of this protein is unknown. It belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 373
36676 381026 cd19558 serpinA12_vaspin serpin family A member 12, visceral adipose tissue-derived serpin. Vaspin, also called visceral adipose tissue-derived serpin or serpinA12, was identified as an adipokine with insulin-sensitizing effects and has been shown to significantly reduce blood glucose concentrations in various mouse models. As such, vaspin may represent a novel treatment tool for diabetes intervention strategies. Human kallikrein 7 (hK7), which cleaves human insulin within A and B chain, was the first protease target of vaspin inhibited by classical serpin mechanism with high specificity in vitro. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 372
36677 381027 cd19559 serpinA14_UTMP_UABP-2 serpin family A member 14, uterine milk protein and uteroferrin-associated basic protein 2. The uteroferrin(Uf)-associated basic proteins-2(UABP-2/UABP/UfAP) are a group of three (Mr = 42K, 48K, and 50K) antigenically related, basic glycoproteins secreted by the porcine uterus under the influence of progesterone (P4), which exist as heterodimers (Mr = 80,000) with the iron-binding acid phosphatase, Uf. This group also contains UTMP (uterine milk protein), encoded by SERPINA14. UTMP binds noncovalently to the iron-containing glycoprotein uteroferrin, which displays phosphatase activity and is thought to be involved with iron transport to the fetus. Synthesis of these serpins is induced by progesterone in the uterus. UTMP is also an activin-binding protein and has been implicated in regulation of uterine immune function. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 386
36678 381028 cd19560 serpinB1_LEI serpin family B member 1 (serpin B1), leukocyte elastase inhibitor (LEI). Leukocyte elastase inhibitor (LEI, also known as proteinase inhibitor 2/PI2, monocyte neutrophil elastase inhibitor/MNEI, EI, or ELANH2) is a member of the clade B serpins or ov-serpins (ovalbumin related serpins) that in humans is encoded by the SERPINB1 gene. Human SERPINB1 is a potent intracellular inhibitor for granzyme H (GzmH) which is constitutively expressed in NK cells and induces target cell death. GzmH cleaves SERPINB1 at Phe343 in the RCL to mediate suicide inhibition. Equine leukocyte elastase inhibitor (HLEI) in contrast to other serpins contains no carbohydrate and has a blocked amino terminus. HLEI is a thymosin beta4-binding protein suggesting a physiological role for cytoplasmic elastase inhibitors in the thymosin B4-regulated rearrangement of the cytoskeleton of leukocytes. HLEI has been proposed to be involved with the control of intracellular protein turnover or the control of elastinolytic activity during inflammation. Ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). The family is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 
Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 379
36679 381029 cd19562 serpinB2_PAI-2 serpin family B member 2, plasminogen activator inhibitor 2. Plasminogen activator inhibitor-2 (PAI-2/PLANH2, also called placental PAI, monocyte arg-serpin, or urokinase inhibitor) is a serine protease inhibitor that belongs to the ovalbumin family of serpins (ov-serpins). It is an effective inhibitor of urinary plasminogen activator (urokinase or uPA) and is involved in cell differentiation, tissue growth and regeneration. Ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). The family is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 414
36680 381030 cd19563 serpinB3_B4_SCCA1_2 serpin family B members 3 and 4, squamous cell carcinoma antigens 1 and 2. Squamous cell carcinoma antigen 1 (SCCA1, also called HsT1196 or protein T4-A) and squamous cell carcinoma antigen 2 (SCCA2, also called PI11 or leupin), which are encoded by the SERPINB3 and SERPINB4 genes, respectively, are members of the serpin family of serine protease inhibitors. SCCA1 is a so-called cross-class serpin, inhibiting cysteine proteinases such as cathepsin S, K, L, and papain. SCCA2 inhibits chymotrypsin-like serine proteases including chymase, cathepsin G, and Der p1. Elevated levels of SCCA1 and SCCA2 have been detected in chronic inflammatory conditions involving the skin, especially atopic dermatitis (AD) and psoriasis, as well as in respiratory inflammatory diseases such as asthma, chronic obstructive pulmonary disease (COPD), and tuberculosis. They are both normally co-expressed in squamous epithelial cells of tongue, esophagus, tonsils, epidermal hair follicles, lung and uterus, and become highly up-regulated in squamous carcinomas of these organs. Diseases associated with SERPINB3 include anal cancer and cervical squamous cell carcinoma, whereas those associated with SERPINB4 include squamous cell carcinoma and chromosome 18Q deletion syndrome. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). The family is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. 
Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 390
36681 381031 cd19565 serpinB6_CAP serpin family B member 6, cytoplasmic antiproteinase. Cytoplasmic antiproteinase (CAP, also called proteinase inhibitor 6/PI6 or placental thrombin inhibitor/PTI) is thought to be involved in the regulation of serine proteinases present in the brain or extravasated from the blood. It may play an important role in the inner ear in the protection against leakage of lysosomal content during stress; loss of this protection results in cell death and sensorineural hearing loss. It is an inhibitor of cathepsin G, kallikrein-8 and thrombin. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). The family is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 378
36682 381032 cd19566 serpinB7_megsin serpin family B member 7, megsin. Megsin is named as such due to its primary expression in the mesangium, a structure associated with the capillaries in the glomerulus of the kidney. Megsin is thought to play a role in the regulation of a wide variety of processes in mesangial cells, such as matrix metabolism, cell proliferation, and apoptosis. Identification of the exact biological functions and target proteases of megsin will lead to the development of novel therapeutic approaches to glomerular diseases. Expression of this gene is upregulated in IgA nephropathy and mutations have been found to cause palmoplantar keratoderma, Nagashima type. Megsin belongs to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). The family is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 380
36683 381033 cd19567 serpinB8_CAP-2 serpin family B member 8, cytoplasmic antiproteinase 2. Cytoplasmic antiproteinase 2 (CAP-2 or peptidase inhibitor 8/PI-8) is a member of the ovalbumin family of serpins (ov-serpins). Serpin B8 is produced by platelets and can bind to and inhibit the function of furin, a serine protease involved in platelet functions. In addition, this protein has been found to enhance the mechanical stability of cell-cell adhesion in the skin, and defects in this gene have been associated with an autosomal-recessive form of exfoliative ichthyosis. Diseases associated with SERPINB8 include Peeling Skin Syndrome 5 and Exfoliative Ichthyosis. Among its related pathways are Response to elevated platelet cytosolic Ca2+ and CFTR-dependent regulation of ion channels in Airway Epithelium (norm and CF). The ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 374
36684 381034 cd19568 serpinB9_CAP-3 serpin family B member 9, cytoplasmic antiproteinase 3. Cytoplasmic antiproteinase 3 (CAP-3; peptidase inhibitor 9/PI-9, Spi6, or testicular tissue protein Li 180) is an intracellular inhibitor of granzyme B (grB) that protects cytotoxic lymphocytes from grB-mediated death. It is also thought to be expressed in accessory immune cells, including dendritic cells (DCs), although there is some debate about this. Overexpression of serpin B9 may prevent cytotoxic T-lymphocytes from eliminating certain tumor cells. A pseudogene of this gene is found on chromosome 6. Diseases associated with serpin B9 include chronic obstructive pulmonary disease (COPD) and oral squamous cell carcinoma (OSCC). The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 376
36685 381035 cd19569 serpinB10_bomapin serpin family B member 10, bomapin. Bomapin (also called proteinase inhibitor 10/PI10) is a hematopoietic- and myeloid leukaemia-specific protease inhibitor which is thought to augment proliferation or apoptosis of leukemia cells, depending on growth factor availability. Bomapin is expressed only in bone marrow, leukocytes of patients with myeloid leukaemia that correspond to myeloid progenitors, and promyelocytic leukaemia cell lines (HL60, THP1, and AML-193), but it is not present in terminally differentiated leukocytes. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 397
36686 381036 cd19570 serpinB11_epipin serpin family B member 11, epipin. Epipin/SERPINB11 has no serine protease inhibitory activity, probably due to mutations in the scaffold, impairing conformational changes, and may have evolved a non-inhibitory function. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 392
36687 381037 cd19571 serpinB12_yukopin serpin family B member 12, yukopin. Yukopin, encoded by the SERPINB12 gene, is a member of the serpin superfamily of serine protease inhibitors. It inhibits trypsin and plasmin, but not thrombin, coagulation factor Xa, or urokinase-type plasminogen activator. An important paralog of this gene is SERPINB4. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 420
36688 381038 cd19572 serpinB13_headpin serpin family B member 13, headpin. Headpin (also known as hurpin or proteinase inhibitor 13/PI13) maps to chromosome 18q21.3 and is expressed in normal squamous epithelium of the oral mucosa, skin, and cervix. Inhibitory serpins are known to play an important role in tumor invasion, metastasis, tumor suppression and apoptosis. Headpin belongs to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 391
36689 381039 cd19573 serpinE2_GDN serpin family E member 2, glia derived nexin (GDN). Serpin glia-derived nexin (GDN; also called peptidase inhibitor 7/PI-7 or protease nexin 1/PN-1) is a specific and extremely efficient inhibitor of thrombin. Unlike other thrombin inhibitors, it is not synthesized in the liver and does not circulate in the blood. It is instead expressed by multiple cell types and is located on the surface of these cells, bound to glycosaminoglycans. GDN plays a role in thrombosis and atherosclerosis and is a clade E serpin. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 375
36690 381040 cd19574 serpinE3 serpin family E member 3. The function of serpin E3 is not known. It is a member of clade E, which also includes nexin and plasminogen activator inhibitor type 1, of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 384
36691 381041 cd19575 serpinH2 serpin family H member 2. The function of Danio rerio serpin H2 is not known. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 382
36692 381042 cd19576 serpinI2_pancpin serpin family I member 2, pancpin. Pancpin (also called proteinase inhibitor 14/PI14 or myoepithelium-derived serine protease inhibitor/MEPI) is an inhibitory member of the serpin superfamily. It is downregulated in pancreatic and breast cancer, and is associated with acinar cell apoptosis and pancreatic insufficiency when absent in mice. Pancpin was found to inhibit pancreatic chymotrypsin and elastase. It is thought that pancpin protects pancreatic cells from the consequences of premature activation of their respective zymogens. This subgroup belongs to clade I of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 371
36693 381043 cd19577 serpinJ_IRS-2-like serpin family J, Ixodes ricinus serpin-2 (IRS-2). The serpin family J clade contains serpins from the Chelicerates. This model includes serpins from the Japanese horseshoe crab, mites, ticks, and spiders. The Limulus intracellular coagulation inhibitor, designated LICI, was isolated from hemocytes of the Japanese horseshoe crab. It blocks the amidolytic activities of Limulus lipopolysaccharide-sensitive serine protease, factor C and also inhibits human alpha-thrombin, rat salivary kallikrein, bovine plasmin, and trypsin but not Limulus clotting enzyme, Limulus factor B, bovine factor Xa, human factor XIa, human tissue plasminogen activator, human urokinase, chymotrypsin, elastase, and papain. Glycosaminoglycans such as heparin and heparan sulfate had no effect on the inhibitory activity. The castor bean tick, Ixodes ricinus serpin-2 (IRS-2) whose structure has been solved, unlike that of the LICI, is found in the saliva of the tick and primarily targets 2 proinflammatory serine proteases: cathepsin G and mast cell chymase, and in higher molar excess, thrombin. It also blocks cathepsin G- and thrombin-induced platelet aggregation. Thus it has a dual role and can interfere with both inflammation and wound healing during tick feeding. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 372
36694 381044 cd19578 serpinK_insect_SRPN2-like serpin family K, insect Serpin-2 and similar proteins. Serpin-2 (SRPN2) is a negative regulator of the melanization response in the malaria vector Anopheles gambiae. SRPN2 irreversibly inhibits clip domain serine proteinase 9 (CLIPB9), which functions in a serine proteinase cascade ending in the activation of prophenoloxidase and melanization. Silencing of SRPN2 results in spontaneous melanization and decreased life span of the mosquito and is a promising target for vector control. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 376
36695 381045 cd19579 serpin1K-like Manduca sexta Serpin 1K and similar proteins. Serpin 1K is a chymotrypsin inhibitor and is 1 of 12 serpins found in the hemolymph of the hornworm moth Manduca sexta. Serpins may be involved in the immune response in insect hemolymph. All of these serpins are encoded by the same gene, and the message for each is produced by alternative splicing of the final exon. This exon encodes the RCL and two strands of sheet B. Serpin 1K has a canonical structure at the reactive center, as is observed in a1-antitrypsin, whereas hinge residues (P17-P13) adopt the position and conformation observed in ovalbumin. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 368
36696 381046 cd19580 serpin_silkworm16_18_22 silk gland serpins 16, 18, and 22 from Bombyx mori. Serpins 16, 18, and 22 of the silkworm Bombyx mori are found in the silk gland, a highly specialized organ that functions to synthesize and store silk proteins. These three serpins are mainly distributed in the middle silk gland and contain a signal peptide for secretion. They also share high sequence homology (~87%), implying that they might carry out a similar and specific function in the middle silk gland lumen. They have a canonical serpin fold, but contain a unique reactive center loop, which is shorter than that of typical serpins. It is thought that active proteases in silk glands are restricted by serpins until the wandering stage. Studies show that serpins 16 and 18 act as inhibitors of cysteine proteases, with serpin 18 acting specifically on fibroinase. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 365
36697 381047 cd19581 serpinL_nematode serpin family L, serpin family proteins from nematodes. The role of nematode serpins remains largely elusive. The only nematode serpin for which experimental evidence indicates an evasive function is Brugia malayi SPN-2 which specifically inhibits two human neutrophil-derived serine proteinases, cathepsin G and elastase. Less is known of Brugia malayi SPN-1, which is present at all stages of the parasite life cycle and could exist to inhibit a cognate proteinase endogenous to the parasite. Schistosoma serpins are hypothesized to play a role in both the physiological control of elastase within the schistosomes, and protection of the parasite from activated neutrophils during inflammation. Caenorhabditis elegans serpins are thought to regulate endogenous serine proteinases as well as inhibit proteinases produced by pathogenic microorganisms. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 357
36698 381048 cd19582 serpinM_ShSPI serpin family M, Schistosoma haematobium serpin. ShSPI is a serpin from the trematode Schistosoma haematobium. The protein is exposed on the surface of invading cercaria as well as of adult worms, suggesting its involvement in the parasite-host interaction. It has several distinctive features, mostly concerning the helical subdomain of the protein. It is proposed that these peculiarities are related to the unique biological properties of a small serpin subfamily which is conserved among pathogenic schistosomes. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 388
36699 381049 cd19583 serpinN_SPI-1_SPI-2 serpin family N, viral serpin-1 and serpin-2. This group of viral serpins is from the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) and corresponds to clade N, which contains viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins. The other Orthopoxvirus serpin clade is clade O, which contains the viral serpin-3 (SPI-3-like) serpins. SPI-2, also called cytokine response modifier A (crmA), acts to inhibit inflammation and apoptosis. SPI-1, a serpin that is approximately 45% identical to SPI-2, has also been implicated in the inhibition of apoptosis, since certain cells infected with RPV SPI-1 mutants undergo apoptotic cell death. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 347
36700 381050 cd19584 serpinO_SPI-3_virus serpin family O, viral serpin-3. This group of viral serpins is from the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) and corresponds to clade O, which contains the viral serpin-3 (SPI-3-like) serpins. The other Orthopoxvirus serpin clade is clade N, which contains viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins. SPI-3 is an N-glycosylated bifunctional protein that acts as both a proteinase inhibitor and a suppressor of infected cell-cell fusion. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 350
36701 381051 cd19585 serpin_poxvirus serpin-like proteins found in poxviruses. These are viral serpins from poxviridae that are not in the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) that contains clade N serpins (viral serpin-1/SPI-1-like and viral serpin-2/SPI-2-like) and clade O serpins (viral serpin-3/SPI-3-like). The members here include fowlpox virus, canarypox virus, deerpox virus, tanapox virus, and cotia virus and belong to other poxviridae branches including Leporipoxvirus, Yatapoxvirus, and Avipoxvirus. These viruses have a variety of hosts including humans, birds, and mice. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 345
36702 381052 cd19586 serpin_mimivirus serpin-like proteins found in mimiviruses. These viral serpins are from Mimiviridae (Tupanvirus, Powai, Bandra, Moumouvirus, and Megavirus) and may represent a new clade of viral serpins. Mimiviridae are thought to have a common evolutionary origin with Poxviridae whose viral serpins are classified into clades N and O. N is composed of viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins and clade O is made up of viral serpin-3 (SPI-3-like) serpins. Mimiviruses have the only known viral serpins outside of the poxvirus family. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 355
36703 381053 cd19587 serpinA16_HongrES1-like serpin family A member 16, HongrES1 and similar proteins. HongrES1 is an epididymis-specific secretory protein and is encoded by the SERPINA16 gene. It is one of several potential decapacitation factors of rodents, including a 40-kDa glycoprotein, phosphatidylethanolamine-binding protein 1 (PEBP1), a cysteine-rich secretory protein 1, an acrosome-stabilizing factor, SVA, SVS2, and SPINKL. In humans, some potential decapacitation factors that have been reported are glycodelin-S, semenogelin I, a 130-kDa glycoprotein, and some mannosyl glycopeptides. Decapacitation factors are removed from the sperm head surface during the capacitation process and are able to reverse sperm capacitation. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 373
36704 381054 cd19588 serpin_miropin-like serpin miropin and similar proteins. Miropin, the serpin from Tannerella forsythia, is thought to contribute to the virulence of periodontal pathogens by inhibiting neutrophil serine proteases. Miropin broadly inhibits serine endopeptidases (SEPs) including trypsin, neutrophil elastase, pancreatic elastase, subtilisin, and cathepsin G and cysteine endopeptidases (CEPs) including papain, calpain-like peptidase Tpr, and gingipain K through various reactive-site bonds. This is achieved by offering several target bonds of the RCL for cleavage within a bait region, instead of a single RSB as found in canonical serpins. In addition, promiscuous inhibition is facilitated by the capacity to insert strands deviating from the canonical length into the central sheet A, while keeping the prey peptidase bound and inactivated. The structural adaptation of miropin to provide a relaxed inhibitory specificity, which allows for formation of inhibitory complexes using different sites, is unique among serpins. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 365
36705 381055 cd19589 serpin_tengpin-like serpin tengpin and similar proteins. Tengpin is an unusual prokaryotic serpin from the extremophile Thermoanaerobacter tengcongensis. In addition to the serpin domain, tengpin contains an N-terminal region that functions to trap the serpin domain in the native metastable state and prevent the spontaneous transition to the latent conformation. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 367
36706 381056 cd19590 serpin_thermopin-like serpin thermopin and similar proteins. Thermopin, the serpin from Thermobifida fusca, functions as an irreversible proteinase inhibitor with resistance to polymerization at high temperatures. The crystal structure of the cleaved thermopin was found to adopt the canonical serpin fold, supporting its inclusion as a classical inhibitory member of the serpin superfamily. A detailed structural comparison revealed unique features, including charge-stabilizing interactions, a deleted element of secondary structure (the G helix), and a C-terminal "tail" that interacts with the top of the A beta sheet and plays an important role in the folding/unfolding of the molecule. These unique features provide structural and biophysical evidence as to how this unusual serpin member has adapted to remain functional in an extreme environment. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 366
36707 381057 cd19591 serpin_like serpin family proteins. This group includes a variety of serpins from the three domains of life: eukaryotes, bacteria, and archaea. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 364
36708 381058 cd19593 serpin_bacteria_crustaceans serpin family proteins from bacteria and crustaceans. This group includes a variety of serpin family proteins from various bacteria and crustaceans including sea louse and salmon louse. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 370
36709 381059 cd19594 serpin_crustaceans_chelicerates_insects serpin family proteins from crustaceans, chelicerates, and insects. This group includes a variety of serpins from crustaceans (sea louse, Chinese mitten crab, signal crayfish, red king crab, Asian tiger shrimp), chelicerates (Atlantic horseshoe crab, common house spider), and insects (Asian tiger mosquito, caddisfly, pea aphid, bed bug, fruit fly, Australian sheep blowfly, tobacco hornworm, alfalfa leafcutting bee). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 374
36710 381060 cd19596 serpin_fungal cellulosomal serpin precursor. A single fungal serpin has been characterized to date: celpin from Piromyces spp. strain E2. Piromyces is a genus of anaerobic fungi found in the gut of ruminants and is important for digesting plant material. Celpin is predicted to be inhibitory and contains two N-terminal dockerin domains in addition to its serpin domain. Dockerins are commonly found in proteins that localise to the fungal cellulosome, a large extracellular multiprotein complex that breaks down cellulose.[21] It is therefore suggested that celpin may protect the cellulosome against plant proteases. Certain bacterial serpins similarly localize to the cellulosome.[186] SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 361
36711 381061 cd19597 serpin28D-like_insects insect serpins similar to Drosophila melanogaster Serpin-28D. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin-28D is required for pupal viability and plays an essential role in regulating melanization. Insect serpins from mosquitoes, Mediterranean fruit fly, fruit fly, and blowfly are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 395
36712 381062 cd19598 serpin77Ba-like_insects insect serpins similar to Drosophila melanogaster Serpin 77Ba. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin 77Ba plays an essential role in regulating the tracheal melanization immune response to bacterial and fungal infection. Insect serpins from pine beetle, diamondback moth, red flour beetle, mosquito, silkworm, and fruit fly are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 376
36713 381063 cd19599 serpin18-like_insects insect serpins similar to Anopheles gambiae Serpin 18. Serpins in insects function within development, wound healing and immunity. A. gambiae serpin 18 is categorized as non-inhibitory based on the sequence of its reactive-center loop. It is expressed throughout all life stages in multiple tissues and the hemolymph, and is predicted to be secreted based on the presence of a signal peptide. Insect serpins from mosquitoes are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 354
36714 381064 cd19600 serpin11-like_insects insect serpins similar to Bombyx mori Serpin-11. Serpins in insects function within development, wound healing and immunity. The specific function of Bombyx mori serpin-11 (SPN19) is unknown. Insect serpins from sawfly, mealworm, rice borer, moth, silkworm, and bollworm are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 366
36715 381065 cd19601 serpin42Da-like serpins similar to Drosophila melanogaster Serpin 42Da. This subfamily is composed mainly of insect serpins, including Drosophila melanogaster serpin 42Da. Serpins in insects function within development, wound healing and immunity. Serpin 42Da, previously serpin 4, is a serine protease inhibitor that is capable of remarkable functional diversity through the alternative splicing of four different reactive center loop exons. Insect serpins from stink bug, alfalfa leafcutting bee, red flour beetle, house fly, and brown planthopper are also included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 361
36716 381066 cd19602 serpin_mollusks serpin family proteins from mollusks. This group includes a variety of serpins from mollusks (freshwater snail, sea slug, and disk abalone). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 374
36717 381067 cd19603 serpin_platyhelminthes serpin family proteins from platyhelminthes. This group includes a variety of serpins from platyhelminthes (lung fluke, tapeworm, flatworm). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 380
36718 381068 cd19604 serpin_protozoa serpin family proteins from protozoa. This group includes a variety of serpin clades from various protozoa including Neospora caninum that causes neosporosis, Toxoplasma gondii that causes toxoplasmosis, and Hammondia hammondi. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 439
36719 381069 cd19605 serpin_protozoa viral serpin. CrmA is a viral serpin that inhibits both cysteine and serine proteinases involved in the regulation of host inflammatory and apoptosis processes. It differs from other members of the serpin superfamily by having a shorter reactive center loop as well as possessing an additional highly charged antiparallel beta-strand of beta-sheet A, whose sequence and length are unique. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 413
36720 381624 cd19606 GH113-like Glycoside hydrolase family 113 beta-mannosidase and similar proteins. Family 113 glycoside hydrolases cleave (1->4)-beta-glycosidic linkages, such as endo-1,4-beta-mannanase. This family also includes TIM-barrel domains found in gene transfer agent proteins. 303
36721 381625 cd19607 GTA_TIM-barrel-like Putative glycoside hydrolase TIM-barrel domain in gene transfer agent and similar proteins. This domain is found in the gene transfer agent protein, such as the Rhodobacter capsulatus putative gene transfer agent protein encoded by orfg15. In the purple nonsulfur bacterium Rhodobacter capsulatus, DNA transmission is mediated via an unusual system, a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. 438
36722 381626 cd19608 GH113_mannanase-like Glycoside hydrolase family 113 beta-1,4-mannanase and similar proteins. Mannan endo-1,4-beta mannosidase (E.C 3.2.1.78) randomly cleaves (1->4)-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans and is also called beta-1,4-mannanase, endo-1,4-beta-mannanase, endo-beta-1,4-mannase, beta-mannanase B, beta-1, 4-mannan 4-mannanohydrolase, endo-beta-mannanase, beta-D-mannanase, 1,4-beta-D-mannan mannanohydrolase, and 4-beta-D-mannan mannanohydrolase. (1->4)-beta-linked mannans are polysaccharides with a linear polymer backbone of (1->4)-beta-linked mannose units (in plants and fungi) or alternating mannose and glucose/galactose units (glucomannan in plants and fungi, and galactomannan and galactoglucomannan in plants), such as in the hemicellulose fraction of hard- and softwoods. Complete degradation of mannan requires a series of enzymes, including beta-1,4-mannanase. According to the CAZy database beta-1,4-mannanases are grouped into various glycoside hydrolase (GH) families; GH family 113 beta-1,4-mannanases include mostly bacterial and archaeal sequences. 310
36723 410991 cd19609 NTD_TDP-43 N-terminal domain of transactive response DNA-binding protein 43. Transactive response DNA-binding protein of 43 kDa (TDP-43) is a nuclear DNA/RNA-binding protein involved in gene transcription and mRNA processing, transport, and translational regulation. It is vital to pre-mRNA and microRNA processing and regulates stress granule activity through the differential regulation of G3BP and TIA-1. It also forms aggregates implicated in amyotrophic lateral sclerosis. The N-terminal domain of TDP-43 is required for its physiological functions and pathological aggregation. 74
36724 381622 cd19610 mannanase_GH134 glycosyl hydrolase family 134 inverting endo-beta-1,4-mannanase. Glycosyl hydrolase family 134 beta-mannanase (E.C. 3.2.1.78) differs from other mannanases in that it has a hen egg white lysozyme fold and cleaves beta-1,4-mannans with inversion of stereochemistry. Beta-mannosidases are enzymes involved in seed germination and the degradation of the hemicellulose fraction of soft- and hardwoods. 162
36725 381623 cd19611 Ctf13_LRR_LRR-insertion leucine-rich-repeat (LRR) domain and LRR insertion domain of centromere DNA-binding protein complex CBF3 subunit C (Ctf13). Ctf13 is an F-box protein of the leucine-rich-repeat superfamily; it is a component of CEN binding factor 3 (CBF3), a complex that recognizes point centromeres found in budding yeast, associating specifically with the third centromere DNA element (CDEIII) DNA. CBF3 is composed of two homodimers of Cep3 and Ndc10, and a Ctf13-Skp1 heterodimer. The Skp1-Ctf13 heterodimer interacts with Cep3, Ndc10 and CDEIII at a completely conserved G, centrally positioned between the TGC/CCG sites. The eight leucine-rich repeat (LRR) motifs of Ctf13 (LRR 1-8) form a solenoid structure. At the N-terminus of the Ctf13 LRR is an expanded F-box, and at the C-terminal end, an alpha-beta domain formed by insertions within the latter LRRs of Ctf13 (LRR insertion domain). This domain model includes the LRR domain and the LRR insertion domain. 290
36726 381247 cd19614 FABP_Der_p_13-like mite group 13 allergens similar to Dermatophagoides farinae Der p 13, and related proteins. The minor house dust mite allergen Der p 13 is a fatty acid-binding protein and an activator of a TLR2-mediated innate immune response. This group also contains other mite group 13 allergens, including Tyrophagus putrescentiae Tyr p 13 and Blomia tropicalis mite blo t 13. blo t 13 binds the natural fluorescent fatty acid cis-parinaric acid and oleic acid by competition, but not retinol, retinoic acid, cholesterol, or dansylated or anthroxylated fatty acids. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36727 381248 cd19615 lipocalin_NP7 nitrophorin 7. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NP7 displays peculiar properties, such as an abnormally high isoelectric point, the ability to bind negatively charged membranes, and a strong pH sensitivity of NO affinity. In contrast to NP1-4, which show high affinities for histamine (Hm), NP7 does not appear to sequester Hm. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 174
36728 381249 cd19616 FABP11 fatty acid binding protein 11 similar to zebrafish FABP11. This group includes zebrafish FABP11a and FABP11b, Senegalese sole FABP11, and similar proteins. The two copies of the fabp11 gene in the zebrafish genome may have resulted from a fish-specific whole genome duplication event. Fabp11a transcripts have been detected in the liver, brain, heart, testis, muscle, ovary and skin of adult zebrafish while fabp11b transcripts have been found in the brain, heart, ovary and eye in adult tissues. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 129
36729 381250 cd19617 FABP12 fatty acid-binding protein 12. FABP12 is expressed in rodent retina and testis, as well as in human retinoblastoma cell lines. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36730 410998 cd19668 TmYidC_peri periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Thermotoga maritima and similar domains. This subfamily is composed of Thermotoga maritima YidC (TmYidC) and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. The periplasmic domain of TmYidC shows no amino acid sequence identity with that of the prototypical Escherichia coli YidC (EcYidC), yet they adopt a similar fold. However, the periplasmic domain of TmYidC displays shorter beta strands and some differences in connectivity, compared to EcYidC. 196
36731 381525 cd19682 bHLHzip_MGA_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MAX gene-associated protein (MGA) family. The MGA family includes MGA, Schizosaccharomyces pombe ESC1 (spESC1) and similar proteins. MGA, also termed MAX dimerization protein 5 (MAD5), is a dual specificity T-box/ bHLHzip transcription factor that regulates the expression of both Max-network and T-box family target genes. It contains a Myc-like bHLHZip motif and requires heterodimerization with Max for binding to the preferred Myc-Max-binding site CACGTG. In addition to the bHLHZip domain, MGA harbors a second DNA-binding domain, the T-box or T-domain. It thus binds the preferred Brachyury-binding sequence and represses transcription of reporter genes containing promoter-proximal Brachyury-binding sites. spESC1 is a bHLHzip protein with homology to human MyoD and Myf-5 myogenic differentiation inducers. It is involved in the sexual differentiation process. 65
36732 381526 cd19683 bHLH_SOHLH_like basic helix-loop-helix (bHLH) domain found in the spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein (SOHLH) family and similar proteins. The SOHLH family includes two bHLH transcription factors, SOHLH1 and SOHLH2. They are specifically expressed in spermatogonia and oocytes and are essential for early spermatogonial and oocyte differentiation. The family also includes transcription factor-like 5 protein (TCFL5) and similar proteins. TCFL5, also termed Cha transcription factor, or HPV-16 E2-binding protein 1 (E2BP-1), is a bHLH transcription factor that plays a crucial role in spermatogenesis. It regulates cell proliferation or differentiation of cells through binding to a specific DNA sequence like other bHLH molecules. 58
36733 381527 cd19684 bHLH_dnHLH_ID basic helix-loop-helix (bHLH) domain found in the DNA-binding protein inhibitor (ID) family. The ID family includes a dominant negative group of helix-loop-helix (dnHLH) proteins, ID1-4, that are negative regulators of bHLH transcription factors. They contain the HLH-dimerization domain but lack the basic domain necessary for DNA-binding. ID proteins inhibit binding to DNA and transcriptional transactivation by heterodimerization with bHLH proteins. They also interact with many non-bHLH proteins in complex networks. ID proteins have been implicated in regulating gene expression as well as cell-cycle progression. Whereas ID1, ID2 and ID3 are generally considered tumor promoters, ID4 has emerged as a tumor suppressor. 47
36734 381528 cd19685 bHLH-O_HERP_HES basic helix-loop-helix-orange (bHLH-O) domain found in HERP/HES-like family. The HERP/HES-like family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split (HES) proteins. The HERP (HES-related repressor protein) subfamily proteins contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and YXXW sequence motif at its C-terminal region. Hairy and enhancer of split (HES)-related repressor protein (HERP) proteins (HEY1, HEY2 and HEYL) act as downstream effectors of Notch signaling. They are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis. Hairy and enhancer of split-related protein HELT is a transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. Differentially expressed in chondrocytes proteins, DEC1 and DEC2, are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers. Drosophila melanogaster protein clockwork orange (Cwo) is also included in this subfamily. It is involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression. The HES subfamily proteins contain a basic helix-loop-helix (bHLH) domain with an invariant proline residue in its basic region, an orange domain in the central region and a conserved tetrapeptide motif, WRPW, at its C-terminal region. They form heterodimers or homodimers via their HLH domain and bind DNA to repress gene transcription that play an essential role in development of both compartment and boundary cells of the central nervous system. 52
36735 381529 cd19686 bHLH_TS_FERD3L_like basic helix-loop-helix (bHLH) domain found in Fer3-like protein (FERD3L), pancreas transcription factor 1 subunit alpha (PTF1A) and similar proteins. The family corresponds to a group of bHLH transcription factors, including FERD3L and PTF1A. FERD3L, also termed basic helix-loop-helix protein N-twist, or Class A basic helix-loop-helix protein 31 (bHLHa31), or nephew of atonal 3 (NATO3), or Neuronal twist (NTWIST), is expressed in the developing central nervous system (CNS). It regulates floor plate (FP) cell development. FP is a critical organizing center located at the ventral-most midline of the neural tube. FERD3L binds to the E-box and functions as an inhibitor of transcription. PTF1A, also termed Class A basic helix-loop-helix protein 29 (bHLHa29), or pancreas-specific transcription factor 1a, or bHLH transcription factor p48, or p48 DNA-binding subunit of transcription factor PTF1 (PTF1-p48), is implicated in the cell fate determination in various organs. It binds to the E-box consensus sequence 5'-CANNTG-3' and plays a role in early and late pancreas development and differentiation. 56
36736 381530 cd19687 bHLHzip_Mlx basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-like protein X (Mlx) and similar proteins. Mlx, also termed Class D basic helix-loop-helix protein 13 (bHLHd13), or Max-like bHLHZip protein, or protein BigMax, or transcription factor-like protein 4, is a Max-like bHLHZip transcription regulator that interacts with the Max network of transcription factors. It forms sequence-specific DNA-binding protein complexes with some members of the Mad family (Mad1 and Mad4) and the Mondo family, but not the Myc family, and binds E-box DNA to control transcription. 76
36737 381531 cd19688 bHLHzip_MLXIP basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein (MLXIP) and similar proteins. MLXIP, also termed Class E basic helix-loop-helix protein 36 (bHLHe36), or transcriptional activator MondoA, is a bHLHZip transcriptional activator that binds DNA as a heterodimer with Mlx. It binds to the canonical E box sequence 5'-CACGTG-3' and plays a role in transcriptional activation of glycolytic target genes. MLXIP is most highly expressed in skeletal muscle and functions as an indirect glucose sensor, by sensing glucose 6-phosphate and shuttling between the nucleus and the cytoplasm. 72
36738 381532 cd19689 bHLHzip_MLXIPL basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein-like (MLXIPL) and similar proteins. MLXIPL, also termed carbohydrate-responsive element-binding protein (ChREBP), or Class D basic helix-loop-helix protein 14 (bHLHd14), or MLX interactor, or WS basic-helix-loop-helix leucine zipper protein (WS-bHLH), or Williams-Beuren syndrome chromosomal region 14 protein (WBSCR14), is a bHLHZip transcriptional factor integral to the regulation of glycolysis and lipogenesis in the liver. It forms heterodimers with the bHLHZip protein Mlx to bind the DNA sequence 5'-CACGTG-3'. 76
36739 381533 cd19690 bHLHzip_spESC1_like basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Schizosaccharomyces pombe ESC1 (spESC1) and similar proteins. spESC1 is a bHLHzip protein with homology to human MyoD and Myf-5 myogenic differentiation inducers. It is involved in the sexual differentiation process. 65
36740 381534 cd19691 bHLH_dnHLH_ID1 basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID1 and similar proteins. ID1, also termed Class B basic helix-loop-helix protein 24 (bHLHb24), or inhibitor of DNA binding 1, or inhibitor of differentiation 1, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. ID1 interferes with centrosomal function. It has been implicated in the regulation of cell proliferation and differentiation in myogenesis, neurogenesis, and/or hematopoiesis. 52
36741 381535 cd19692 bHLH_dnHLH_ID2 basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID2 and similar proteins. ID2, also termed Class B basic helix-loop-helix protein 26 (bHLHb26), or inhibitor of DNA binding 2, or inhibitor of differentiation 2, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. It has been implicated in the regulation of cell proliferation and differentiation in myogenesis, neurogenesis, and/or hematopoiesis. 66
36742 381536 cd19693 bHLH_dnHLH_ID3 basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID3 and similar proteins. ID3, also termed Class B basic helix-loop-helix protein 25 (bHLHb25), or helix-loop-helix protein HEIR-1, or ID-like protein inhibitor HLH 1R21, or inhibitor of DNA binding 3, or inhibitor of differentiation 3, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. It negatively regulates muscle differentiation by inhibiting the DNA-binding activities of the myogenic regulatory factors. 61
36743 381537 cd19694 bHLH_dnHLH_ID4 basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID4 and similar proteins. ID4, also termed Class B basic helix-loop-helix protein 27 (bHLHb27), or inhibitor of DNA binding 4, or inhibitor of differentiation 4, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. It plays a role in adipose cell differentiation. 60
36744 381538 cd19695 bHLH_dnHLH_EMC_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein extra-macrochaetae and similar proteins. Extra-macrochaetae is a negative regulator of sensory organ development in Drosophila. It belongs to dominant negative group of helix-loop-helix (dnHLH) proteins, which lack a basic DNA-binding domain but can form heterodimers with other HLH proteins, thereby inhibiting DNA binding. 52
36745 381539 cd19696 bHLH-PAS_AhR_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in the aryl hydrocarbon receptor (AhR) family. The AhR family includes AhR, AhR repressor (AhRR) and Drosophila melanogaster protein spineless. AhR, also termed Ah receptor, or Dioxin receptor (DR), or Class E basic helix-loop-helix protein 76 (bHLHe76), is the only member of bHLH-PAS transcription regulators that binds and is activated by small chemical ligands. It is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Nuclear Receptor Translocator). AhRR, also termed Class E basic helix-loop-helix protein 77 (bHLHe77), is a member of bHLH-PAS transcription factors that acts as a negative regulator of AhR, playing key roles in development and environmental sensing. AhRR functions by competing with AhR for its partner ARNT. AhRR-ARNT complexes are transcriptionally inactive. Spineless is a bHLH-PAS transcription factor that plays an important role in fly morphogenesis. It is both necessary and sufficient for the formation of the ommatidial mosaic. 59
36746 381540 cd19697 bHLH-PAS_NPAS4_PASD10 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 4 (NPAS4) and similar proteins. NPAS4, also termed neuronal Per-Arnt-Sim homology (PAS) factor 4, or neuronal PAS4, or Class E basic helix-loop-helix protein 79 (bHLHe79), or HLH-PAS transcription factor NXF, or PAS domain-containing protein 10 (PASD10), is a bHLH-PAS neuronal activity-dependent transcription factor which heterodimerizes with ARNT2 to regulate genes involved in inhibitory synapse formation. 57
36747 381541 cd19698 bHLH_AtMEE8_like basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein maternal effect embryo arrest 8 (AtMEE8) and similar proteins. AtMEE8, also termed AtbHLH108, or EN 132, is a bHLH transcription factor required during early embryo development, for the endosperm formation. 71
36748 381542 cd19699 bHLH_TS_dMYOD_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster myogenic-determination protein (dMYOD) and similar proteins. dMYOD, also termed protein nautilus, or Myd, may play an important role in the early development of muscle in Drosophila. 56
36749 381543 cd19700 bHLH_TS_TWIST2 basic helix-loop-helix (bHLH) domain found in twist-related protein 2 (TWIST2) and similar proteins. TWIST2, also termed Class A basic helix-loop-helix protein 39 (bHLHa39), or Dermis-expressed protein 1, or Dermo-1, is a bHLH transcription factor that regulates the development of mesenchymal tissues and plays a critical role in embryogenesis. It binds to the E-box consensus sequence 5'-CANNTG-3' as a heterodimer and inhibits transcriptional activation by MYOD1, MYOG, MEF2A and MEF2C. 82
36750 381544 cd19701 bHLH_TS_HEN1 basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein 1 (HEN-1) and similar proteins. HEN-1, also termed nescient helix-loop-helix 1 (Nhlh1), or Class A basic helix-loop-helix protein 35 (bHLHa35), or nescient helix loop helix 1 (NSCL-1), is a neuron-specific bHLH transcription factor that may serve as DNA-binding protein and may be involved in the control of cell-type determination, possibly within the developing nervous system. 72
36751 381545 cd19702 bHLH_TS_HEN2 basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein 2 (HEN-2) and similar proteins. HEN-2, also termed nescient helix-loop-helix 2 (Nhlh2), or Class A basic helix-loop-helix protein 34 (bHLHa34), or nescient helix loop helix 2 (NSCL-2), is a neuron-specific bHLH transcription factor that may serve as DNA-binding protein and may be involved in the control of cell-type determination, possibly within the developing nervous system. 73
36752 381546 cd19703 bHLH_TS_musculin basic helix-loop-helix (bHLH) domain found in musculin and similar proteins. Musculin, also termed activated B-cell factor 1 (ABF-1), or Class A basic helix-loop-helix protein 22 (bHLHa22), is a bHLH transcription factor expressed in activated B lymphocytes. It acts as a transcription repressor capable of inhibiting the transactivation capability of TCF3/E47. Musculin may play a role in regulating antigen-dependent B-cell differentiation. The mouse homolog, musculin, is suggested to be a repressor of myogenesis that is expressed in developing muscle and in the spleen. Musculin heterodimerizes with products of the E2A gene. 66
36753 381547 cd19704 bHLH_TS_TCF21_capsulin basic helix-loop-helix (bHLH) domain found in transcription factor 21 (TCF-21) and similar proteins. TCF-21, also termed capsulin, or Class A basic helix-loop-helix protein 23 (bHLHa23), or epicardin, or podocyte-expressed 1 (Pod-1), is a bHLH transcription factor expressed specifically in mesodermally-derived cells that surround the epithelium of the developing gastrointestinal, genitourinary and respiratory systems during mouse embryogenesis. It may play a role in the specification or differentiation of one or more subsets of epicardial cell types. 64
36754 381548 cd19705 bHLH_TS_LYL1 basic helix-loop-helix (bHLH) domain found in protein lyl-1 and similar proteins. Lyl-1, also termed Class A basic helix-loop-helix protein 18 (bHLHa18), or lymphoblastic leukemia-derived sequence 1, is a proto-oncogenic bHLH transcription factor involved in T-cell acute lymphoblastic leukemia. It plays an important role in hematopoietic stem cell function and is required for the late stages of postnatal angiogenesis to limit the formation of new blood vessels, notably by regulating the activity of the small GTPase Rap1. LYL-1 deficiency induces a stress erythropoiesis. 65
36755 381549 cd19706 bHLH_TS_TAL1 basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein 1 (TAL-1) and similar proteins. TAL-1, also termed Class A basic helix-loop-helix protein 17 (bHLHa17), or stem cell protein (SCL), or T-cell leukemia/lymphoma protein 5, is a hematopoietic-specific bHLH transcription factor that functions in embryonic and adult hematopoiesis in vertebrates. It is also required for embryonic vascular remodeling. It acts as a regulator of erythroid differentiation and binds to regulatory regions of a large cohort of erythroid genes as part of a complex with GATA-1, LMO2 and Ldb1. TAL-1 has been implicated in T-cell acute lymphoblastic leukemia. In common with other tissue-specific bHLH proteins, Tal heterodimerizes with ubiquitously-expressed members of the E2A family and forms a DNA-binding complex with an E-box (CANNTG) to regulate transcription at its recognition site. 65
36756 381550 cd19707 bHLH_TS_TAL2 basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein 2 (TAL-2) and similar proteins. TAL-2, also termed Class A basic helix-loop-helix protein 19 (bHLHa19), is a bHLH transcription factor essential for the normal brain development. It has been implicated in T-cell acute lymphoblastic leukemia. 61
36757 381551 cd19708 bHLH_TS_dHLH3B_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster helix loop helix protein 3B (dHLH3B) and similar proteins. Drosophila HLH3B is an uncharacterized bHLH transcription factor that belongs to the T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family. 60
36758 381552 cd19709 bHLH_TS_TCF23_OUT basic helix-loop-helix (bHLH) domain found in transcription factor 23 (TCF-23) and similar proteins. TCF-23, also termed Class A basic helix-loop-helix protein 24 (bHLHa24), is a bHLH transcription factor that is essential for progesterone-dependent decidualization. The mouse homolog is also called ovary, uterus and testis protein (OUT), which is expressed predominantly in the reproductive organs such as the uterus, ovary and testis. It shows an Id-like inhibitory activity and functions as a negative regulator of bHLH factors through the formation of a functionally inactive heterodimeric complex. OUT inhibits the formation of TCF3 and MYOD1 homodimers and heterodimers, but lacks DNA binding activity. OUT is involved in the regulation or modulation of smooth muscle contraction of the uterus during pregnancy and particularly around the time of delivery. It also plays a role in the inhibition of myogenesis. Unlike typical bHLH factors, OUT proteins do not bind E-box (CANNTG) or N-box DNA sequences and inhibit DNA binding of homo- and heterodimers consisting of E12 and MyoD in gel mobility shift assays. 56
36759 381553 cd19710 bHLH_TS_TCF24 basic helix-loop-helix (bHLH) domain found in transcription factor 24 (TCF-24) and similar proteins. TCF-24 is an uncharacterized bHLH transcription factor that shows high sequence similarity with TCF-23. 56
36760 381554 cd19711 bHLH_TS_MIST1 basic helix-loop-helix (bHLH) domain found in muscle, intestine and stomach expression 1 (MIST-1) and similar proteins. MIST-1, also termed Class A basic helix-loop-helix protein 15 (bHLHa15), or Class B basic helix-loop-helix protein 8 (bHLHb8), is a bHLH transcription factor expressed in pancreatic acinar cells and other serous exocrine cells. It is essential for cytoskeletal organization and secretory activity. It also functions as a potent endoplasmic reticulum (ER) stress-inducible transcriptional regulator. MIST-1 is capable of binding to E-box (CANNTG) motifs as a homodimer or a heterodimer with E-proteins (E12 and E47) to regulate transcription. 62
36761 381555 cd19712 bHLH_TS_dimmed_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein dimmed and similar proteins. Dimmed, also termed DIMM, is a bHLH transcription factor that regulates neurosecretory (NS) cell function and neuroendocrine cell fate in Drosophila. 60
36762 381556 cd19713 bHLH_TS_ATOH1 basic helix-loop-helix (bHLH) domain found in protein atonal homolog 1 (ATOH1) and similar proteins. ATOH1, also termed Class A basic helix-loop-helix protein 14 (bHLHa14), or helix-loop-helix protein hATH-1 (hATH1), or Math1, or Cath1, is a proneural bHLH transcription factor that is essential for inner ear hair cell differentiation. It dimerizes with E47 and activates E-box (CANNTG) dependent transcription. ATOH1 is a mammalian homolog of the Drosophila melanogaster gene atonal and mouse atonal homolog 1 (Math1). 64
36763 381557 cd19714 bHLH_TS_ATOH7 basic helix-loop-helix (bHLH) domain found in protein atonal homolog 7 (ATOH7) and similar proteins. ATOH7, also termed Class A basic helix-loop-helix protein 13 (bHLHa13), or helix-loop-helix protein hATH-5 (hATH5), or Math5, is a bHLH transcription factor involved in the differentiation of retinal ganglion cells. 69
36764 381558 cd19715 bHLH_TS_amos_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein Amos and similar proteins. Amos, also termed absent MD neurons and olfactory sensilla protein, or reduced olfactory organs protein, or rough eye protein, is a bHLH transcription factor that promotes multiple dendritic neuron formation in the Drosophila peripheral nervous system. 64
36765 381559 cd19716 bHLH_TS_NGN1_NeuroD3 basic helix-loop-helix (bHLH) domain found in neurogenin-1 (NGN-1) and similar proteins. NGN-1, also termed Class A basic helix-loop-helix protein 6 (bHLHa6), or neurogenic basic-helix-loop-helix protein, or neurogenic differentiation factor 3 (NeuroD3), is a neural-specific bHLH transcription factor involved in the initiation of neuronal differentiation. It activates transcription by binding to the E box (5'-CANNTG-3'). 77
36766 381560 cd19717 bHLH_TS_NGN2_ATOH4 basic helix-loop-helix (bHLH) domain found in neurogenin-2 (NGN-2) and similar proteins. NGN-2, also termed Class A basic helix-loop-helix protein 8 (bHLHa8), or protein atonal homolog 4 (ATOH4), is a neural-specific bHLH transcription factor required for sensory neurogenesis. It activates transcription by binding to the E box (5'-CANNTG-3'). 69
36767 381561 cd19718 bHLH_TS_NGN3_ATOH5 basic helix-loop-helix (bHLH) domain found in neurogenin-3 (NGN-3) and similar proteins. NGN-3, also termed Class A basic helix-loop-helix protein 7 (bHLHa7), or protein atonal homolog 5 (ATOH5), is a neural-specific bHLH transcription factor expressed in the developing central nervous system and the embryonic pancreas. It is involved in neurogenesis and plays an important role in spermatogenesis. 68
36768 381562 cd19719 bHLH_TS_NeuroD1 basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 1 (NeuroD1) and similar proteins. NeuroD1, also termed Class A basic helix-loop-helix protein 3 (bHLHa3), is a neuronal bHLH transcription factor involved in the development and maintenance of the endocrine pancreas and neuronal elements. It acts as an essential regulator of glutamatergic neuronal differentiation. Loss of NeuroD1 causes ataxia, cerebellar hypoplasia, sensorineural deafness, and severe retinal dystrophy in mice. 86
36769 381563 cd19720 bHLH_TS_NeuroD2 basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 2 (NeuroD2) and similar proteins. NeuroD2, also termed Class A basic helix-loop-helix protein 1 (bHLHa1), or NeuroD-related factor (NDRF), is a neuronal calcium-dependent bHLH transcription factor that induces neuronal differentiation and promotes neuronal survival. It plays a central role in thalamocortical synaptic maturation. NeuroD2 mediates calcium-dependent transcription activation by binding to E box-containing promoter. 93
36770 381564 cd19721 bHLH_TS_NeuroD4_ATOH3 basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 4 (NeuroD4) and similar proteins. NeuroD4, also termed Class A basic helix-loop-helix protein 4 (bHLHa4), or protein atonal homolog 3 (ATH-3), or Atoh3, or Math-3, is a bHLH transcriptional activator that mediates neuronal differentiation. 87
36771 381565 cd19722 bHLH_TS_NeuroD6_ATOH2 basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 6 (NeuroD6) and similar proteins. NeuroD6, also termed Class A basic helix-loop-helix protein 2 (bHLHa2), or protein atonal homolog 2 (ATH-2), or Atoh2, or Math2, or Nex1, is a neurogenic bHLH transcription factor involved in neuronal development, differentiation, and survival in Alzheimer's disease (AD) brains of both cohorts. It plays an integrative role in coordinating increase in mitochondrial mass with cytoskeletal remodeling, suggesting that it may act as a co-regulator of neuronal differentiation and energy metabolism. 70
36772 381566 cd19723 bHLH_TS_ASCL1_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster achaete-scute complex (AS-C) proteins, achaete-scute-like proteins, ASCL1-2, and similar proteins. This subfamily includes Drosophila melanogaster AS-C proteins and two members of the ASCL family of transcription factors, ASCL-1 and ASCL-2. Drosophila melanogaster AS-C proteins include lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete scute homolog 1 (Mash1), is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete scute homolog 2 (Mash2), is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves. 56
36773 381567 cd19724 bHLH_TS_ASCL3_like basic helix-loop-helix (bHLH) domain found in achaete-scute-like proteins, ASCL3-5, and similar proteins. This subfamily includes three members of the ASCL family of transcription factors, ASCL-3, ASCL-4 and ASCL-5. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is specifically localized in the duct cells of the salivary glands. It may act as a transcriptional repressor that inhibits myogenesis. ASCL-4, also termed Class A basic helix-loop-helix protein 44 (bHLHa44), or achaete-scute homolog 4 (ASH-4), or Hash4, may be involved in skin development. ASCL-5, also termed Class A basic helix-loop-helix protein 47 (bHLHa47), or achaete-scute homolog 5 (ASH-5), is an uncharacterized bHLH transcription factor that is closely related to ASCL-3 and ASCL-4. 56
36774 381568 cd19725 bHLH_TS_OLIG2_like basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factors, Oligo2, Oligo3 and similar proteins. The family includes two bHLH transcription factors, Oligo2 and Oligo3. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons. 63
36775 381569 cd19726 bHLH-PAS_cycle_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein cycle and similar proteins. Protein cycle, also termed brain and muscle ARNT-like 1 (BMAL1), or MOP3, is a putative bHLH-PAS transcription factor involved in the generation of biological rhythms in Drosophila. It activates cycling transcription of Period (PER) and Timeless (TIM) by binding to the E-box (5'-CACGTG-3') present in their promoters. 62
36776 381570 cd19727 bHLH-PAS_HIF1a_PASD8 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 1-alpha (HIF1a) and similar proteins. HIF1a, also termed HIF-1-alpha, or HIF1-alpha, or ARNT-interacting protein, or Basic-helix-loop-helix-PAS protein MOP1, or Class E basic helix-loop-helix protein 78 (bHLHe78), or Member of PAS protein 1, or PAS domain-containing protein 8 (PASD8), functions as a master transcriptional regulator of the adaptive response to hypoxia. 71
36777 381571 cd19728 bHLH-PAS_HIF2a_PASD2 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 2-alpha (HIF2a) and similar proteins. HIF2a, also termed HIF-2-alpha, or HIF2-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP2, or Class E basic helix-loop-helix protein 73 (bHLHe73), or Member of PAS protein 2, or PAS domain-containing protein 2 (PASD2), or HIF-1-alpha-like factor (HLF), is a bHLH-PAS transcription factor involved in the induction of oxygen regulated genes. 66
36778 381572 cd19729 bHLH-PAS_HIF3a_PASD7 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 3-alpha (HIF3a) and similar proteins. HIF3a, also termed HIF-3-alpha, or HIF3-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP7, or Class E basic helix-loop-helix protein 17 (bHLHe17), or Member of PAS protein 7, or PAS domain-containing protein 7 (PASD7), or HIF3-alpha-1, or inhibitory PAS domain protein (IPAS), is a bHLH-PAS transcriptional regulator in adaptive response to low oxygen tension. It plays a role in the regulation of hypoxia-inducible gene expression. 63
36779 381573 cd19730 bHLH-PAS_spineless_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein spineless and similar proteins. Spineless is a bHLH-PAS transcription factor that plays an important role in fly morphogenesis. It is both necessary and sufficient for the formation of the ommatidial mosaic. 64
36780 381574 cd19731 bHLH-PAS_NPAS1_PASD5 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 1 (NPAS1) and similar proteins. NPAS1, also termed neuronal PAS1, or Basic-helix-loop-helix-PAS protein MOP5, or Class E basic helix-loop-helix protein 11 (bHLHe11), or member of PAS protein 5, or PAS domain-containing protein 5 (PASD5), is a bHLH-PAS transcriptional repressor expressed in the central nervous system and involved in neuronal differentiation. It is active during late embryogenesis and postnatal development. 74
36781 381575 cd19732 bHLH-PAS_NPAS3_PASD6 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 3 (NPAS3) and similar proteins. NPAS3, also termed neuronal PAS3, or Basic-helix-loop-helix-PAS protein MOP6, or Class E basic helix-loop-helix protein 12 (bHLHe12), or member of PAS protein 6, or PAS domain-containing protein 6 (PASD6), is a bHLH-PAS brain-enriched transcription factor that is involved in central nervous system development and neurogenesis. It is a replicated genetic risk factor for psychiatric disorders. Human chromosomal rearrangements that affect NPAS3 normal expression are associated with schizophrenia and mental retardation. 78
36782 381576 cd19733 bHLH-PAS_trachealess_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein trachealess and similar proteins. Protein trachealess is a bHLH-PAS transcription factor that acts as an inducer of tracheal cell fates in Drosophila. It is necessary for the development of the salivary gland duct and the posterior spiracles. 79
36783 381577 cd19734 bHLH-PAS_CLOCK basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Circadian locomotor output cycles protein kaput (CLOCK) and similar proteins. CLOCK, also termed Class E basic helix-loop-helix protein 8 (bHLHe8), is a bHLH-PAS transcriptional activator which forms a core component of the circadian clock. It forms heterodimers with another bHLH-PAS protein, Brain-Muscle-Arnt-Like (also known as BMAL or ARNT3 or mop3), which regulates circadian rhythm. BMAL1-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes. 61
36784 381578 cd19735 bHLH-PAS_dCLOCK basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster Circadian locomotor output cycles protein kaput (dCLOCK) and similar proteins. dCLOCK, also termed dPAS1, is a bHLH-PAS Circadian regulator that acts as a transcription factor and generates a rhythmic output with a period of about 24 hours. 80
36785 381579 cd19736 bHLH-PAS_PASD1 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in circadian clock protein PASD1. PASD1, also termed PAS domain-containing protein 1, is evolutionarily related to Circadian locomotor output cycles protein kaput (CLOCK) and functions as a suppressor of the biological clock that drives the daily circadian rhythms of cells throughout the body. Mammalian PASD1 does not harbor the bHLH-PAS domain and is not included in this family. 70
36786 381580 cd19737 bHLH-PAS_NPAS2_PASD4 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 2 (NPAS2) and similar proteins. NPAS2, also termed neuronal PAS2, or basic-helix-loop-helix-PAS protein MOP4, or Class E basic helix-loop-helix protein 9 (bHLHe9), or member of PAS protein 4, or PAS domain-containing protein 4 (PASD4), is a bHLH-PAS transcriptional activator which forms a core component of the circadian clock. 77
36787 381581 cd19738 bHLH-PAS_SIM1 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded homolog 1 (SIM1) and similar proteins. SIM1, also termed Class E basic helix-loop-helix protein 14 (bHLHe14), is a bHLH-PAS transcription factor that may have pleiotropic effects during embryogenesis and in the adult. 71
36788 381582 cd19739 bHLH-PAS_SIM2 basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded homolog 2 (SIM2) and similar proteins. SIM2, also termed Class E basic helix-loop-helix protein 15 (bHLHe15), is a bHLH-PAS transcription factor that may be a master gene of central nervous system (CNS) development in cooperation with ARNT. It may have pleiotropic effects in the tissues expressed during development. 74
36789 381583 cd19740 bHLH-PAS_dSIM_like basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein single-minded (SIM) and similar proteins. SIM is a nuclear bHLH-PAS transcription factor that functions as a master developmental regulator controlling midline development of the ventral nerve cord in Drosophila. 62
36790 381584 cd19741 bHLH-O_ESMB_like basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster enhancer of split mbeta protein (ESMB) and similar proteins. ESMB, also termed E(spl)mbeta, or HLH-mbeta, or split locus enhancer protein mA, is a bHLH-O transcriptional repressor of genes that require a bHLH protein for their transcription. It is involved in the neural-epidermal lineage decision during early neurogenesis. The family also includes Enhancer of split m7 protein (also known as E(spl)m7), which acts as a transcriptional repressor that participates in the control of cell fate choice by uncommitted neuroectodermal cells in the embryo. 69
36791 381585 cd19742 bHLH_TS_ASCL1_Mash1 basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 1 (ASCL-1) and similar proteins. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete scute homolog 1 (Mash1), is a neural-specific bHLH transcription factor that is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system. 71
36792 381586 cd19743 bHLH_TS_ASCL2_Mash2 basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 2 (ASCL-2) and similar proteins. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete scute homolog 2 (Mash2), is a bHLH transcription factor that is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves. 64
36793 381587 cd19744 bHLH_TS_dAS-C_like basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster achaete-scute complex (AS-C) proteins and similar proteins. Drosophila melanogaster AS-C proteins include lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation. 67
36794 381588 cd19745 bHLH_TS_ASCL3 basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 3 (ASCL-3) and similar proteins. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is a bHLH transcription factor specifically localized in the duct cells of the salivary glands. It may act as transcriptional repressor that inhibits myogenesis. 59
36795 381589 cd19746 bHLH_TS_ASCL4 basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 4 (ASCL-4) and similar proteins. ASCL-4, also termed Class A basic helix-loop-helix protein 44 (bHLHa44), or achaete-scute homolog 4 (ASH-4), or Hash4, is a bHLH transcriptional regulator that may be involved in skin development. 64
36796 381590 cd19747 bHLH_TS_ASCL5 basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 5 (ASCL-5) and similar proteins. ASCL-5, also termed Class A basic helix-loop-helix protein 47 (bHLHa47), or achaete-scute homolog 5 (ASH-5), is an uncharacterized bHLH transcription factor that belongs to AS-C (achaete, scute, lethal of scute, and asense) family. 61
36797 381591 cd19748 bHLH-O_HEY1 basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif protein 1 (HEY1) and similar proteins. HEY1, also termed cardiovascular helix-loop-helix factor 2 (CHF-2), or Class B basic helix-loop-helix protein 31 (bHLHb31), or HES-related repressor protein 1, or hairy and enhancer of split-related protein 1 (HESR-1), or hairy-related transcription factor 1 (HRT-1), is a bHLH-O transcriptional repressor that acts as an essential downstream effector of the Notch signaling pathway and may play a fundamental role in vascular development. HEY1 also participates in several cancer-related pathways. It acts as a positive regulator of the tumor suppressor p53. 71
36798 381592 cd19749 bHLH-O_DEC1 basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein 1 (DEC1) and similar proteins. DEC1, also termed Class E basic helix-loop-helix protein 40 (bHLHe40), or Class B basic helix-loop-helix protein 2 (bHLHb2), or enhancer-of-split and hairy-related protein 2 (SHARP-2), or stimulated by retinoic acid gene 13 protein (STRA13), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. 90
36799 381593 cd19750 bHLH-O_DEC2 basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein 2 (DEC2) and similar proteins. DEC2, also termed Class E basic helix-loop-helix protein 41 (bHLHe41), or Class B basic helix-loop-helix protein 3 (bHLHb3), or enhancer-of-split and hairy-related protein 1 (SHARP-1), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. 92
36800 410992 cd19751 5TM_YidC_Oxa1_Alb3 Five transmembrane core domain of YidC/Oxa1/Alb3 protein family of insertases. The YidC/Oxa1/Alb3 protein family of insertases facilitate the insertion, folding and assembly of proteins of the inner membranes of bacteria and mitochondria, and the thylakoid membrane of plastids. Members include bacterial YidC, mitochondrial Cox18 and Oxa1, and chloroplastic Alb3 and Alb4. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Oxa1 and Cox18/Oxa2 mediate the insertion of both mitochondrion-encoded precursors and nuclear-encoded proteins from the matrix into the mitochondrial inner membrane. Alb3 and Alb3-like proteins, including Alb4, are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function. 189
36801 381391 cd19752 AKR_unchar uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. 291
36802 381293 cd19753 Mb-like_oxidoreductase Globin domain of uncharacterized oxidoreductases containing a FAD/NADH binding domain. This subfamily is composed of uncharacterized proteins containing an N-terminal myoglobin-like (M family globin) domain and a C-terminal oxygenase reductase FAD/NADH binding domain belonging to the ferredoxin reductase (FNR) family and is usually part of multi-component bacterial oxygenases which oxidize hydrocarbons using oxygen as the oxidant. The domain architecture of this subfamily is similar to flavohemoglobins, which function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses. 121
36803 381294 cd19754 FHb_fungal-globin Globin domain of fungal flavohemoglobin. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses. FlavoHb expression affects Aspergillus nidulans sexual development and mycotoxin production, and Dictyostelium discoideum development. 141
36804 381295 cd19755 TrHb2_AtGlb3-like_O nonsymbiotic haemoglobin Ahb3 (GLB3) and similar truncated hemoglobins, group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). This subfamily includes the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins. 119
36805 380814 cd19756 Bbox2 B-box-type 2 zinc finger (Bbox2). The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction. Based on different consensus sequence and the spacing of the 7-8 zinc-binding residues, B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2). The family corresponds to type 2 B-box (Bbox2). 39
36806 380815 cd19757 Bbox1 B-box-type 1 zinc finger (Bbox1). The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain, in functionally unrelated proteins, most likely mediating protein-protein interactions. Based on different consensus sequences and the spacing of the 7-8 zinc-binding residues, the B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2). This family corresponds to the type 1 B-box (Bbox1). 44
36807 380816 cd19758 Bbox2_MID B-box-type 2 zinc finger found in midline (MID) family. The MID family includes MID1 and MID2. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 40
36808 380817 cd19759 Bbox2_TRIM2-like B-box-type 2 zinc finger found in tripartite motif-containing protein TRIM2, TRIM3, and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It also plays an important role in the central nervous system (CNS). In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. Both TRIM2 and TRIM3 belong to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 42
36809 380818 cd19760 Bbox2_TRIM4-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM4, TRIM17, TRIM41, TRIM52 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM4, TRIM17, TRIM41 and TRIM52. TRIM4, also known as RING finger protein 87 (RNF87), is a cytoplasmic E3 ubiquitin-protein ligase that recently evolved and is present only in higher mammals. It transiently interacts with mitochondria, induces mitochondrial aggregation and sensitizes the cells to hydrogen peroxide (H2O2) induced death. Its interaction with peroxiredoxin 1 (PRX1) is critical for the regulation of H2O2 induced cell death. Moreover, TRIM4 functions as a positive regulator of RIG-I-mediated type I interferon induction. It regulates the K63-linked ubiquitination of RIG-1 and assembly of antiviral signaling complex at mitochondria. TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (Terf), is a crucial E3 ubiquitin ligase that is necessary and sufficient for neuronal apoptosis and contributes to Mcl-1 ubiquitination in cerebellar granule neurons (CGNs). It interacts in a SUMO-dependent manner with nuclear factor of activated T cell NFATc3 transcription factor, and thus inhibits the activity of NFATc3 by preventing its nuclear localization. In contrast, it binds to and inhibits NFATc4 transcription factor in a SUMO-independent manner. Moreover, TRIM17 stimulates degradation of kinetochore protein ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for mitotic spindle checkpoint, and negatively regulates cell proliferation. TRIM41, also known as RING finger-interacting protein with C kinase (RINCK), is an E3 ubiquitin-protein ligase that promotes the ubiquitination of protein kinase C (PKC) isozymes in cells. It specifically recognizes the C1 domain of PKC isozymes. 
It controls the amplitude of PKC signaling by controlling the amount of PKC in the cell. TRIM52, also known as RING finger protein 102 (RNF102), is encoded by a novel, noncanonical antiviral TRIM52 gene in primate genomes with unique specificity determined by the rapidly evolving RING domain. TRIM4, TRIM17 and TRIM41 belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM52 lacks the putative viral recognition SPRY/B30.2 domain, and thus has been classified to the C-V subclass of TRIM family that contains only RBCC domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36810 380819 cd19761 Bbox2_TRIM5-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM5, TRIM6, TRIM22, TRIM34, TRIM38 and similar proteins. The family includes TRIM5, TRIM6, TRIM22, TRIM34, and TRIM38, all of which belong to the C-IV subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM5, also termed RING finger protein 88 (RNF88), is a capsid-specific restriction factor that prevents infection from non-host-adapted retroviruses in a species-specific manner by binding to and destabilizing the retroviral capsid lattice before reverse transcription is completed. Its retroviral restriction activity correlates with the ability to activate TAK1-dependent innate immune signaling. TRIM5 also acts as a pattern recognition receptor that activates innate immune signaling in response to the retroviral capsid lattice. Moreover, TRIM5 plays a role in regulating autophagy through activation of autophagy regulator BECN1 by causing its dissociation from its inhibitors BCL2 and TAB2. It also plays a role in autophagy by acting as a selective autophagy receptor which recognizes and targets HIV-1 capsid protein p24 for autophagic destruction. TRIM6, also termed RING finger protein 89 (RNF89), is an E3-ubiquitin ligase that cooperates with the E2-ubiquitin conjugase UbE2K to catalyze the synthesis of unanchored K48-linked polyubiquitin chains, and further stimulates the interferon-I kappa B kinase epsilon (IKKepsilon) kinase-mediated antiviral response. 
It also regulates the transcriptional activity of Myc during the maintenance of embryonic stem (ES) cell pluripotency, and may act as a novel regulator for Myc-mediated transcription in ES cells. TRIM22, also termed 50 kDa-stimulated trans-acting factor (Staf-50), or RING finger protein 94 (RNF94), is an E3 ubiquitin-protein ligase that plays an integral role in the host innate immune response to viruses. It has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 acts as a suppressor of basal HIV-1 long terminal repeat (LTR)-driven transcription by preventing transcription factor specificity protein 1 (Sp1) binding to the HIV-1 promoter. It also controls FoxO4 activity and cell survival by directing Toll-like receptor 3 (TLR3)-stimulated cells toward type I interferon (IFN) type I gene induction or apoptosis. Moreover, TRIM22 can activate the noncanonical nuclear factor-kappaB (NF-kappaB) pathway by activating I kappa B kinase alpha (IKKalpha). It also regulates nucleotide binding oligomerization domain containing 2 (NOD2)-dependent activation of interferon-beta signaling and nuclear factor-kappaB. TRIM34, also termed interferon-responsive finger protein 1, or RING finger protein 21 (RNF21), may function as an antiviral protein that contributes to the defense against retroviral infections. TRIM38, also known as RING finger protein 15 (RNF15) or zinc finger protein RoRet, is an E3 ubiquitin-protein ligase that promotes K63- and K48-linked ubiquitination of cellular proteins and also catalyzes self-ubiquitination. It negatively regulates tumor necrosis factor alpha (TNF-alpha) and interleukin-1beta-triggered nuclear factor-kappaB (NF-kappaB) activation by mediating lysosomal-dependent degradation of transforming growth factor beta (TGFbeta)-activated kinase 1 (TAK1)-binding protein (TAB)2/3, two critical components of the TAK1 kinase complex. 
It also inhibits TLR3/4-mediated activation of NF-kappaB and interferon regulatory factor 3 (IRF3) by mediating ubiquitin-proteasomal degradation of TNF receptor-associated factor 6 (Traf6) and NAK-associated protein 1 (Nap1), respectively. Moreover, TRIM38 negatively regulates TLR3-mediated interferon beta (IFN-beta) signaling by targeting ubiquitin-proteasomal degradation of TIR domain-containing adaptor inducing IFN-beta (TRIF). It functions as a valid target for autoantibodies in primary Sjogren's Syndrome. 40
36811 380820 cd19762 Bbox2_TRIM7-like B-box-type 2 zinc finger found in tripartite motif-containing proteins TRIM7, TRIM27 and similar proteins. The family includes TRIM7 and TRIM27, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM7, also known as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90), is an E3 ubiquitin-protein ligase that mediates c-Jun/AP-1 activation by Ras signalling. Its phosphorylation and activation by MSK1 in response to direct activation by the Ras-Raf-MEK-ERK pathway can stimulate TRIM7 E3 ubiquitin ligase activity in mediating Lys63-linked ubiquitination of the AP-1 coactivator RACO-1, leading to RACO-1 protein stabilization. Moreover, TRIM7 binds and activates glycogenin, the self-glucosylating initiator of glycogen biosynthesis. TRIM27, also termed RING finger protein 76 (RNF76), or RET finger protein (RFP), or zinc finger protein RFP, is a nuclear E3 ubiquitin-protein ligase that is highly expressed in testis and in various tumor cell lines. Expression of TRIM27 is associated with prognosis of colon and endometrial cancers. TRIM27 was first identified as a fusion partner of the RET receptor tyrosine kinase. It functions as a transcriptional repressor and associates with several proteins involved in transcriptional activity, such as enhancer of polycomb 1 (Epc1), a member of the Polycomb group proteins, and Mi-2beta, a main component of the nucleosome remodeling and deacetylase (NuRD) complex, and the cell cycle regulator retinoblastoma protein (RB1). 
It also interacts with HDAC1, leading to downregulation of thioredoxin binding protein 2 (TBP-2), which inhibits the function of thioredoxin. Moreover, TRIM27 mediates Pax7-induced ubiquitination of MyoD in skeletal muscle atrophy. It also inhibits muscle differentiation by modulating serum response factor (SRF) and Epc1. Furthermore, TRIM27 promotes non-canonical polyubiquitination of PTEN, a lipid phosphatase that catalyzes PtdIns(3,4,5)P3 (PIP3) to PtdIns(4,5)P2 (PIP2). It is an IKKepsilon-interacting protein that regulates IkappaB kinase (IKK) function and negatively regulates signaling involved in the antiviral response and inflammation. In addition, TRIM27 forms a protein complex with MBD4 or MBD2 or MBD3, and thus plays an important role in the enhancement of transcriptional repression through MBD proteins in tumorigenesis, spermatogenesis, and embryogenesis. It is also a component of an estrogen receptor 1 (ESR1) regulatory complex, and is involved in estrogen receptor-mediated transcription in MCF-7 cells. Meanwhile, TRIM27 interacts with the hinge region of chromosome 3 protein (SMC3), a component of the multimeric cohesin complex that holds sister chromatids together and prevents their premature separation during mitosis. 44
36812 380821 cd19763 Bbox2_TRIM8_C-V B-box-type 2 zinc finger found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF-kappa B) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 41
36813 380822 cd19764 Bbox2_TRIM9-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM9, TRIM67 and similar proteins. This family includes a group of tripartite motif-containing proteins including TRIM9 and TRIM67, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. It plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. 39
36814 380823 cd19765 Bbox2_TRIM10-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM10, TRIM15, TRIM26, TRIM31 and similar proteins. This family includes TRIM10, TRIM15, TRIM26 and TRIM31. TRIM10, also known as B30-RING finger protein (RFB30), RING finger protein 9 (RNF9), or hematopoietic RING finger 1 (HERF1), is a novel hematopoiesis-specific RING finger protein required for terminal differentiation of erythroid cells. TRIM15, also termed RING finger protein 93 (RNF93), or zinc finger protein 178 (ZNF178), or zinc finger protein B7 (ZNFB7), is a focal adhesion protein that regulates focal adhesion disassembly. It localizes to focal contacts in a myosin-II-independent manner by an interaction between its coiled-coil domain and the LD2 motif of paxillin. TRIM15 can also associate with coronin 1B, cortactin, filamin binding LIM protein1, and vasodilator-stimulated phosphoprotein, which are involved in actin cytoskeleton dynamics. As an additional component of the integrin adhesome, it regulates focal adhesion turnover and cell migration. TRIM26, also known as acid finger protein (AFP), RING finger protein 95 (RNF95), or zinc finger protein 173 (ZNF173), is an E3 ubiquitin-protein ligase that negatively regulates interferon-beta production and antiviral response through polyubiquitination and degradation of nuclear transcription factor IRF3. It functions as an important regulator for RNA virus-triggered innate immune response by bridging TBK1 to NEMO (NF-kappaB essential modulator, also known as IKKgamma) and mediating TBK1 activation. It also acts as a novel tumor suppressor of hepatocellular carcinoma by regulating cancer cell proliferation, colony forming ability, migration, and invasion. TRIM31 is an E3 ubiquitin-protein ligase that primarily localizes to the cytoplasm, but is also associated with the mitochondria. 
It can negatively regulate cell proliferation and may be a potential biomarker of gastric cancer as it is overexpressed from the early stage of gastric carcinogenesis. TRIM31 is downregulated in non-small cell lung cancer and serves as a potential tumor suppressor. It interacts with p52 (Shc) and inhibits Src-induced anchorage-independent growth. TRIM10, TRIM15 and TRIM26 belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM31 belongs to the C-V subclass of TRIM family of proteins. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36815 380824 cd19766 Bbox2_TRIM11_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 11 (TRIM11) and similar proteins. TRIM11, also known as protein BIA1, or RING finger protein 92 (RNF92), is an E3 ubiquitin-protein ligase involved in the development of the central nervous system. It is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 acts as a potential therapeutic target for congenital central hypoventilation syndrome (CCHS) through mediating the degradation of CCHS-associated polyalanine-expanded Phox2b. TRIM11 modulates the function of neurogenic transcription factor Pax6 through the ubiquitin-proteasome system, and thus plays an essential role in Pax6-dependent neurogenesis. It also binds to and destabilizes a key component of the activator-mediated cofactor complex (ARC105), humanin, a neuroprotective peptide against Alzheimer's disease-relevant insults, and further regulates ARC105 function in transforming growth factor beta (TGFbeta) signaling. Moreover, TRIM11 negatively regulates retinoic acid-inducible gene-I (RIG-I)-mediated interferon-beta (IFNbeta) production and antiviral activity by targeting TANK-binding kinase-1 (TBK1). It may contribute to the endogenous restriction of retroviruses in cells. It enhances N-tropic murine leukemia virus (N-MLV) entry by interfering with Ref1 restriction. It also suppresses the early steps of human immunodeficiency virus HIV-1 transduction, resulting in decreased reverse transcripts. TRIM11 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36816 380825 cd19767 Bbox2_TRIM13_C-XI B-box-type 2 zinc finger found in tripartite motif-containing protein 13 (TRIM13) and similar proteins. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane-anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). It also targets the known ER proteolytic substrate CD3-delta, but not the N-end rule substrate Ub-R-YFP (yellow fluorescent protein) for its degradation. Moreover, TRIM13 regulates ubiquitination and degradation of NEMO to suppress tumor necrosis factor (TNF) induced nuclear factor-kappaB (NF-kappaB) activation. It is also involved in NF-kappaB p65 activation and nuclear factor of activated T-cells (NFAT)-dependent activation of c-Rel upon T-cell receptor engagement. Furthermore, TRIM13 negatively regulates melanoma differentiation-associated gene 5 (MDA5)-mediated type I interferon production. It also regulates caspase-8 ubiquitination, translocation to autophagosomes, and activation during ER stress induced cell death. Meanwhile, TRIM13 enhances ionizing radiation-induced apoptosis by increasing p53 stability and decreasing AKT kinase activity through MDM2 and AKT degradation. TRIM13 belongs to the C-XI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. In addition, TRIM13 contains a C-terminal transmembrane domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 42
36817 380826 cd19768 Bbox2_TRIM14 B-box-type 2 zinc finger found in tripartite motif-containing protein 14 (TRIM14) and similar proteins. TRIM14 is a mitochondrial adaptor that facilitates innate immune signaling. It also plays a critical role in tumor development. TRIM14 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a Bbox2 zinc finger as well as a C-terminal SPRY/B30.2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36818 380827 cd19769 Bbox2_TRIM16-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM16, TRIM29, TRIM47 and similar proteins. This family includes a group of tripartite motif-containing proteins, such as TRIM16, TRIM29 and TRIM47. TRIM16, also termed estrogen-responsive B box protein (EBBP), is a regulator that may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor through affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. TRIM16 and TRIM29 belong to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. TRIM47 belongs to the C-IV subclass of TRIM family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 46
36819 380828 cd19770 Bbox2_TRIM19_C-V B-box-type 2 zinc finger found in tripartite motif-containing protein 19, also called promyelocytic leukemia protein (PML), and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cell processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 50
36820 380829 cd19771 Bbox2_TRIM20 B-box-type 2 zinc finger found in tripartite motif-containing protein TRIM20 and similar proteins. TRIM20, also termed Pyrin, or Marenostrin (MEFV), is involved in the regulation of innate immunity and the inflammatory response in response to IFNG/IFN-gamma. TRIM20 belongs to unclassified TRIM family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a pyrin domain, a Bbox2 zinc finger, and a C-terminal SPRY/B30.2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36821 380830 cd19772 Bbox2_TRIM21_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 21 (TRIM21) and similar proteins. TRIM21, also known as 52 kDa Ro protein, 52 kDa ribonucleoprotein autoantigen Ro/SS-A, Ro(SS-A), RING finger protein 81 (RNF81), or Sjoegren's syndrome type A antigen (SS-A), is a ubiquitously expressed E3 ubiquitin-protein ligase and a high affinity antibody receptor uniquely expressed in the cytosol of mammalian cells. As a cytosolic Fc receptor, TRIM21 binds the Fc of virus-associated antibodies and targets the complex in the cytosol for proteasomal degradation in a process known as antibody-dependent intracellular neutralization (ADIN), and provides an intracellular immune response to protect host defense against pathogen infection. It shows remarkably broad isotype specificity as it does not only bind IgG, but also IgM and IgA. Moreover, TRIM21 promotes the cytosolic DNA sensor cGAS and the cytosolic RNA sensor RIG-I sensing of viral genomes during infection by antibody-opsonized virus. It stimulates inflammatory signaling and activates innate transcription factors, such as nuclear factor-kappaB (NF-kappaB). TRIM21 also plays an essential role in p62-regulated redox homeostasis, suggesting a viable target for treating pathological conditions resulting from oxidative damage. Furthermore, TRIM21 may have implications for various autoimmune diseases associated with uncontrolled antiviral signaling through the regulation of Nmi-IFI35 complex-mediated inhibition of innate antiviral response. TRIM21 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 40
36822 380831 cd19773 Bbox2_TRIM23_C-IX_rpt1 first B-box-type 2 zinc finger found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, two Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition. 50
36823 380832 cd19774 Bbox2_TRIM23_C-IX_rpt2 second B-box-type 2 zinc finger found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, two Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition. 50
36824 380833 cd19775 Bbox2_TIF1_C-VI B-box-type 2 zinc finger found in transcription intermediary factor 1 (TIF1) family. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1-alpha (TRIM24), TIF1-beta (TRIM28), and TIF1-gamma (TRIM33), which belong to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha and TIF1beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1(HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very little functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs). 43
36825 380834 cd19776 Bbox2_TRIM25_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 25 (TRIM25) and similar proteins. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. It binds to mono-ubiquitinated PCNA and promotes the ISG15 modification (ISGylation) of PCNA, suggesting a crucial role in termination of error-prone translesion DNA synthesis. TRIM25 also enhances p53 and Mdm2 abundance by inhibiting their ubiquitination and degradation in 26S proteasomes. It suppresses p53's transcriptional activity and dampens the response to DNA damage. Upon deubiquitylation by ubiquitin-specific peptidase 15 (USP15), it mediates K63-linked polyubiquitination of RIG-I that is crucial for downstream antiviral interferon signaling. TRIM25 is required for melanoma differentiation-associated gene 5 (MDA5) and mitochondrial antiviral signaling (MAVS, also known as IPS-1, VISA, Cardiff) mediated activation of nuclear factor-kappaB (NF- kappa B) and interferon production. It is an RNA binding protein acting as RNA-specific activator for Lin28a/TuT4-mediated uridylation. TRIM25 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 38
36826 380835 cd19777 Bbox2_TRIM35_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 35 (TRIM35) and similar proteins. TRIM35, also known as hemopoietic lineage switch protein 5 (HLS5), is a putative hepatocellular carcinoma (HCC) suppressor that inhibits phosphorylation of pyruvate kinase isoform M2 (PKM2), which is involved in aerobic glycolysis of cancer cells and further suppresses the Warburg effect and tumorigenicity in HCC. It also negatively regulates Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production by suppressing the stability of interferon regulatory factor 7 (IRF7). Moreover, TRIM35 regulates erythroid differentiation by modulating globin transcription factor 1 (GATA-1) activity. TRIM35 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36827 380836 cd19778 Bbox2_TRIM36_C-I B-box-type 2 zinc finger found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequently dorsal axis formation. It is also potentially associated with chromosome segregation through interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 45
36828 380837 cd19779 Bbox2_TRIM37_C-VIII B-box-type 2 zinc finger found in tripartite motif-containing protein 37 (TRIM37) and similar proteins. TRIM37, also known as Mulibrey nanism protein, is a peroxisomal E3 ubiquitin-protein ligase that is involved in the tumorigenesis of several cancer types, including pancreatic ductal adenocarcinoma (PDAC), hepatocellular carcinoma (HCC), breast cancer, and sporadic fibrothecoma. It mono-ubiquitinates histone H2A, a chromatin modification associated with transcriptional repression. Moreover, TRIM37 possesses anti-HIV-1 activity, and interferes with viral DNA synthesis. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction, and hepatomegaly. TRIM37 belongs to the C-VIII subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a MATH (meprin and TRAF-C homology) domain positioned C-terminal to the RBCC domain. Its MATH domain has been shown to interact with the TRAF (TNF-Receptor-Associated Factor) domain of six known TRAFs in vitro. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 40
36829 380838 cd19780 Bbox2_TRIM39-like B-box-type 2 zinc finger found in tripartite motif-containing proteins TRIM39, TRIM58 and similar proteins. The family includes TRIM39 and TRIM58, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM39, also termed RING finger protein 23 (RNF23), or testis-abundant finger protein, is an E3 ubiquitin-protein ligase that plays a role in controlling DNA damage-induced apoptosis through inhibition of the anaphase promoting complex (APC/C), a multiprotein ubiquitin ligase that controls multiple cell cycle regulators, including cyclins, geminin, and others. TRIM39 also functions as a regulator of several key processes in the proliferative cycle. It directly regulates p53 stability and modulates cell cycle progression and DNA damage responses via stabilization of p21. TRIM39 also negatively regulates the nuclear factor kappaB (NFkappaB)-mediated signaling pathway through stabilization of Cactin, an inhibitor of NFkappaB- and Toll-like receptor (TLR)-mediated transcription, which is induced by inflammatory stimulants such as tumor necrosis factor alpha (TNFalpha). TRIM39 is a MOAP-1-binding protein that can promote apoptosis signaling through stabilization of MOAP-1 via the inhibition of its poly-ubiquitination process. TRIM58, also known as protein BIA2, is an erythroid E3 ubiquitin-protein ligase induced during late erythropoiesis. It binds and ubiquitinates the intermediate chain of the microtubule motor dynein (DYNC1LI1/DYNC1LI2), stimulating the degradation of the dynein holoprotein complex. It may participate in the erythroblast enucleation process through regulation of nuclear polarization. 44
36830 380839 cd19781 Bbox2_TRIM40_C-V B-box-type 2 zinc finger found in tripartite motif-containing protein 40 (TRIM40) and similar proteins. TRIM40, also termed probable E3 NEDD8-protein ligase, or RING finger protein 35, may function as an E3 ubiquitin-protein ligase of the NEDD8 conjugation pathway. It promotes neddylation of IKBKG/NEMO, stabilizing NFKBIA, and inhibiting NF-kappaB nuclear translocation and activity. TRIM40 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36831 380840 cd19782 Bbox2_TRIM42_C-III B-box-type 2 zinc finger found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear. 40
36832 380841 cd19783 Bbox2_TRIM43-like B-box-type 2 zinc finger found in tripartite motif-containing proteins TRIM43, TRIM48, TRIM49, TRIM51, TRIM64, TRIM77 and similar proteins. The family includes a group of closely related uncharacterized tripartite motif-containing proteins, TRIM43, TRIM43B, TRIM48/RNF101, TRIM49/RNF18, TRIM49B, TRIM49C/TRIM49L2, TRIM49D/TRIM49L, TRIM51/SPRYD5, TRIM64, TRIM64B, TRIM64C, and TRIM77, whose biological functions remain unclear. TRIM49, also known as testis-specific RING-finger protein, has moderate similarity with SS-A/Ro52 antigen, suggesting it may be one of target proteins of autoantibodies in the sera of patients with these autoimmune disorders. All family members (except for TRIM51) belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM51 belongs to unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 53
36833 380842 cd19784 Bbox2_TRIM44 B-box-type 2 zinc finger found in tripartite motif-containing protein 44 (TRIM44) and similar proteins. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. TRIM44 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a Bbox2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36834 380843 cd19785 Bbox2_TRIM45_C-X B-box-type 2 zinc finger found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1 and downregulates mitogen-activated protein kinase (MAPK) signal transduction by inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription and suppresses cell proliferation. TRIM45 belongs to the C-X subclass of TRIM (tripartite motif) family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36835 380844 cd19786 Bbox2_TRIM46_C-I B-box-type 2 zinc finger found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 46
36836 380845 cd19787 Bbox2_TRIM50-like B-box-type 2 zinc finger found in tripartite motif-containing protein TRIM50, TRIM73, TRIM74 and similar proteins. TRIM50 is a stomach-specific E3 ubiquitin-protein ligase, encoded by the Williams-Beuren syndrome (WBS) TRIM50 gene, which regulates vesicular trafficking for acid secretion in gastric parietal cells. It colocalizes, interacts with, and increases the level of p62/SQSTM1, a multifunctional adaptor protein implicated in various cellular processes including the autophagy clearance of polyubiquitinated protein aggregates. It also promotes the formation and clearance of aggresome-associated polyubiquitinated proteins through the interaction with the histone deacetylase 6 (HDAC6), a tubulin specific deacetylase that regulates microtubule-dependent aggresome formation. TRIM50 can be acetylated by PCAF and p300. TRIM50 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. The family also includes two paralogs of TRIM50, tripartite motif-containing protein 73 (TRIM73), also known as tripartite motif-containing protein 50B (TRIM50B), and tripartite motif-containing protein 74 (TRIM74), also known as tripartite motif-containing protein 50C (TRIM50C), both of which are WBS-related genes encoding proteins and may also act as E3 ligases. In contrast with TRIM50, TRIM73 and TRIM74 belong to the C-V subclass of TRIM family of proteins that are defined by the N-terminal RBCC domains only. 39
36837 380846 cd19788 Bbox2_MuRF B-box-type 2 zinc finger found in muscle-specific RING finger protein (MuRF) family. This family corresponds to a group of striated muscle-specific tripartite motif (TRIM) proteins, including TRIM63/MuRF-1, TRIM55/MuRF-2, and TRIM54/MuRF-3, which function as E3 ubiquitin ligases in ubiquitin-mediated muscle protein turnover. They are tightly developmentally regulated in skeletal muscle and associate with different cytoskeleton components, such as microtubules, Z-disks and M-bands, as well as with metabolic enzymes and nuclear proteins. They also cooperate with diverse proteins implicated in selective protein degradation by the proteasome and autophagosome, and target proteins of metabolic regulation, sarcomere assembly and transcriptional regulation. Moreover, MURFs display variable fibre-type preferences. TRIM63/MuRF-1 is predominantly fast (type II) fibre-associated in skeletal muscle. TRIM55/MuRF-2 is predominantly slow-fibre associated. TRIM54/MuRF-3 is ubiquitously present. They play an active role in microtubule-mediated sarcomere assembly. MuRFs belong to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain positioned C-terminal to the RBCC domain. They also harbor a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36838 380847 cd19789 Bbox2_TRIM56_C-V B-box-type 2 zinc finger found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 47
36839 380848 cd19790 Bbox2_TRIM59_C-XI B-box-type 2 zinc finger found in tripartite motif-containing protein 59 (TRIM59) and similar proteins. TRIM59, also known as TRIM57, or RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis. It is upregulated in gastric cancer and promotes gastric carcinogenesis by interacting with and targeting the P53 tumor suppressor for its ubiquitination and degradation. It also acts as a novel accessory molecule involved in cytotoxicity of BCG-activated macrophages (BAM). Moreover, TRIM59 may serve as a multifunctional regulator for innate immune signaling pathways. It interacts with ECSIT and negatively regulates nuclear factor-kappaB (NF- kappa B) and interferon regulatory factor (IRF)-3/7-mediated signal pathways. TRIM59 belongs to the C-XI subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. In addition, TRIM59 contains a C-terminal transmembrane domain. 40
36840 380849 cd19791 Bbox2_TRIM60-like B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM60, TRIM61, TRIM75 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM60, TRIM61 and TRIM75. TRIM60, also known as RING finger protein 129 (RNF129) or RING finger protein 33 (RNF33), is a cytoplasmic protein expressed in the testis. It may play an important role in the spermatogenesis process, the development of the preimplantation embryo, and in testicular functions. TRIM60 interacts with the cytoplasmic kinesin motor proteins KIF3A and KIF3B suggesting possible contribution to cargo movement along the microtubule in the expressed sites. It is also involved in spermatogenesis in Sertoli cells under the regulation of nuclear factor-kappaB (NF-kappaB). TRIM61 is closely related to TRIM60, but its biological function remains unclear. TRIM75 could be the product of a pseudogene. Its biological function remains unclear. TRIM60 and TRIM75 belong to the C-IV subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM61 belongs to the C-V subclass of TRIM family that contains RBCC domains only. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36841 380850 cd19792 Bbox2_TRIM62_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 62 (TRIM62) and similar proteins. TRIM62, also known as Ductal Epithelium Associated Ring Chromosome 1 (DEAR1), is a cytoplasmic E3 ubiquitin-protein ligase that was identified as a dominant regulator of acinar morphogenesis in the mammary gland. It is implicated in the inflammatory response of immune cells by regulating the Toll-like receptor 4 (TLR4) signaling pathway, leading to increased activity of the activator protein 1 (AP-1) transcription factor in primary macrophages. It is also involved in muscular protein homeostasis, especially during inflammation-induced atrophy, and may play a role in the pathogenesis of ICU-acquired weakness (ICUAW) by activating and maintaining inflammation in myocytes. Moreover, TRIM62 facilitates K27-linked poly-ubiquitination of CARD9 and also regulates CARD9-mediated anti-fungal immunity and intestinal inflammation. Furthermore, TRIM62 is involved in the regulation of apical-basal polarity and acinar morphogenesis. It also functions as a chromosome 1p35 tumor suppressor and negatively regulates transforming growth factor beta (TGFbeta)-driven epithelial-mesenchymal transition (EMT) through binding to and promoting the ubiquitination of SMAD3, a major effector of TGFbeta-mediated EMT. TRIM62 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 38
36842 380851 cd19793 Bbox2_TRIM65-like B-box-type 2 zinc finger found in tripartite motif-containing protein 65 (TRIM65), B box and SPRY domain-containing protein (BSPRY) and similar proteins. The family includes TRIM65 and BSPRY. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. BSPRY is a regulatory protein for maintaining calcium homeostasis. It may regulate epithelial calcium transport by inhibiting TRPV5 activity. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36843 380852 cd19794 Bbox2_TRIM66-like B-box-type 2 zinc finger found in tripartite motif-containing protein 66 (TRIM66) and similar proteins. TRIM66, also termed transcriptional intermediary factor 1 delta (TIF1delta), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family expressed by elongating spermatids. Like other TIF1 proteins, TRIM66 displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TRIM66 plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TRIM66 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36844 380853 cd19795 Bbox2_TRIM68_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 68 (TRIM68) and similar proteins. TRIM68, also known as RING finger protein 137 (RNF137) or SSA protein SS-56 (SS-56), is an E3 ubiquitin-protein ligase that negatively regulates Toll-like receptor (TLR)- and RIG-I-like receptor (RLR)-driven type I interferon production by degrading TRK fused gene (TFG), a novel driver of IFN-beta downstream of anti-viral detection systems. It also functions as a cofactor for androgen receptor-mediated transcription by regulating ligand-dependent transcription of androgen receptor in prostate cancer cells. Moreover, TRIM68 is a cellular target of autoantibody responses in Sjogren's syndrome (SS), as well as systemic lupus erythematosus (SLE). It is also an auto-antigen for T cells in SS and SLE. TRIM68 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36845 380854 cd19796 Bbox2_TRIM71_C-VII B-box-type 2 zinc finger found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming, and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as a target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7), and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs through mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 48
36846 380855 cd19797 Bbox2_TRIM72_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 72 (TRIM72) and similar proteins. TRIM72, also known as Mitsugumin-53 (MG53), is a muscle-specific protein that plays a central role in cell membrane repair by nucleating the assembly of the repair machinery at muscle injury sites. It is required in repair of alveolar epithelial cells under plasma membrane stress failure. It interacts with dysferlin to regulate sarcolemmal repair. Upregulation of TRIM72 leads to obesity, systemic insulin resistance, dyslipidemia, and hyperglycemia, and induces diabetic cardiomyopathy through transcriptional activation of the peroxisome proliferator-activated receptor alpha (PPAR-alpha) signaling pathway. Compensation for the absence of AKT signaling by ERK signaling during TRIM72 overexpression leads to pathological hypertrophy. Moreover, TRIM72 functions as a novel negative feedback regulator of myogenesis via targeting insulin receptor substrate-1 (IRS-1). It is transcriptionally activated by the synergism of myogenin (MyoD) and myocyte enhancer factor 2 (MEF2). TRIM72 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 42
36847 380856 cd19798 Bbox2_BRAT-like B-box-type 2 zinc finger found in Drosophila melanogaster brain tumor protein (BRAT) and similar proteins. BRAT is a NHL-domain family protein that functions as a translational repressor to inhibit cell proliferation. This family also contains Caenorhabditis elegans B-box type zinc finger protein ncl-1, a C. elegans Brat homolog which functions as a translational repressor that inhibits protein synthesis. BRAT contains Bbox1 and Bbox2 zinc fingers and NHL repeats. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 44
36848 380857 cd19799 Bbox2_MYCBP2 B-box-type 2 zinc finger found in Myc-binding protein 2 (MYCBP2) and similar proteins. MYCBP2, also termed protein associated with Myc (Pam), is an atypical E3 ubiquitin-protein ligase which specifically mediates ubiquitination of threonine and serine residues on target proteins, instead of ubiquitinating lysine residues. MYCBP2 harbors a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 50
36849 380858 cd19800 Bbox2_xNF7-like B-box-type 2 zinc finger found in Xenopus laevis nuclear factor 7 (xNF7) and similar proteins. xNF7 is a maternally expressed novel zinc finger nuclear phosphoprotein. It acts as a transcription factor that determines dorsal-ventral body axis. xNF7 harbors a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 39
36850 380859 cd19801 Bbox1_MID B-box-type 1 zinc finger found in the midline (MID) family. The MID family includes MID1 and MID2. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with alpha4, a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 49
36851 380860 cd19802 Bbox1_TRIM8-like B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM8, TRIM16, TRIM25, TRIM29, TRIM44, TRIM47 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM8, TRIM16, TRIM25, TRIM29, TRIM44 and TRIM47. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM16, also termed estrogen-responsive B box protein (EBBP), may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor by affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and is particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. 
TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. The TRIM (tripartite motif) family of proteins are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 46
36852 380861 cd19803 Bbox1_TRIM9-like_C-I B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM9, TRIM67 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM9 and TRIM67, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. It plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. 47
36853 380862 cd19804 Bbox1_TRIM19_C-V B-box-type 1 zinc finger found in promyelocytic leukemia protein (PML) and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cellular processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites, and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 47
36854 380863 cd19805 Bbox1_TIF1 B-box-type 1 zinc finger found in transcription intermediary factor 1 (TIF1) family. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1-alpha (TRIM24), TIF1-beta (TRIM28), and TIF1-gamma (TRIM33), which belongs to the C-VI subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1-alpha and TIF1-beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1-gamma is structurally closely related to TIF1-alpha and TIF1-beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs). 44
36855 380864 cd19806 Bbox1_TRIM32_C-VII B-box-type 1 zinc finger found in tripartite motif-containing protein 32 (TRIM32) and similar proteins. TRIM32, also known as 72 kDa Tat-interacting protein, or zinc finger protein HT2A, or BBS11, is an E3 ubiquitin-protein ligase that promotes degradation of several targets, including actin, PIASgamma, Abl interactor 2, dysbindin, X-linked inhibitor of apoptosis (XIAP), p73 transcription factor, thin filaments and Z-bands during fasting. It plays important roles in neuronal differentiation of neural progenitor cells, as well as in controlling cell fate in skeletal muscle progenitor cells. It reduces PI3K-Akt-FoxO signaling in muscle atrophy by promoting plakoglobin-PI3K dissociation. It also functions as a pluripotency-reprogramming roadblock that facilitates cellular transition towards differentiation via modulating the levels of Oct4 and cMyc. Moreover, TRIM32 is an intrinsic influenza A virus (IAV) restriction factor which senses and targets the polymerase basic protein 1 (PB1) for ubiquitination and protein degradation. It also plays a significant role in mediating the biological activity of the HIV-1 Tat protein in vivo, binding specifically to the activation domain of HIV-1 Tat; it can also interact with the HIV-2 and EIAV Tat proteins. Furthermore, TRIM32 regulates myoblast proliferation by controlling turnover of NDRG2 (N-myc downstream-regulated gene). It negatively regulates tumor suppressor p53 to promote tumorigenesis. It also facilitates degradation of MYCN on spindle poles and induces asymmetric cell division in human neuroblastoma cells. In addition, TRIM32 plays important roles in regulation of hyperactivities and positively regulates the development of anxiety and depression disorders induced by chronic stress. It also plays a role in regeneration by affecting satellite cell cycle progression via modulation of the SUMO ligase PIASy (PIAS4). 
Defects in TRIM32 lead to limb-girdle muscular dystrophy type 2H (LGMD2H), sarcotubular myopathies (STM) and Bardet-Biedl syndrome. TRIM32 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The NHL domain mediates the interaction with Argonaute proteins and consequently allows TRIM32 to modulate the activity of certain miRNAs. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 41
36856 380865 cd19807 Bbox1_TRIM36-like B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM36, TRIM46 and similar proteins. The family includes tripartite motif-containing proteins, TRIM36 and TRIM46, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM36, the human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequent dorsal axis formation. It is also potentially associated with chromosome segregation by interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. 
TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. 52
36857 380866 cd19808 Bbox1_TRIM42_C-III B-box-type 1 zinc finger found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear. 47
36858 380867 cd19809 Bbox1_TRIM45_C-X B-box-type 1 zinc finger found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1, and downregulates mitogen-activated protein kinase (MAPK) signal transduction by inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription, and suppresses cell proliferation. TRIM45 belongs to the C-X subclass of the TRIM (tripartite motif) family of proteins that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 46
36859 380868 cd19810 Bbox1_TRIM56_C-V B-box-type 1 zinc finger found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 49
36860 380869 cd19811 Bbox1_TRIM66 B-box-type 1 zinc finger found in tripartite motif-containing protein 66 (TRIM66) and similar proteins. TRIM66, also termed transcriptional intermediary factor 1 delta (TIF1delta), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family, and is expressed by elongating spermatids. Like other TIF1 proteins, TRIM66 displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TRIM66 plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TRIM66 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 37
36861 380870 cd19812 Bbox1_TRIM71_C-VII B-box-type 1 zinc finger found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming, and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as a target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7) and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs by mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of the TRIM (tripartite motif) family of proteins that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 44
36862 380871 cd19813 Bbox1_BRAT-like B-box-type 1 zinc finger found in Drosophila melanogaster brain tumor protein (BRAT) and similar proteins. BRAT is a NHL-domain family protein that functions as a translational repressor to inhibit cell proliferation. The family also contains Caenorhabditis elegans B-box type zinc finger protein ncl-1, a C. elegans Brat homolog which functions as a translational repressor that inhibits protein synthesis. BRAT contains Bbox1 and Bbox2 zinc fingers and NHL repeats. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 44
36863 380872 cd19814 Bbox1_RNF207-like B-box-type 1 zinc finger found in RING finger protein 207 (RNF207) and similar proteins. RNF207 is a cardiac-specific E3 ubiquitin-protein ligase that plays an important role in the regulation of cardiac repolarization. It regulates action potential duration, likely via effects on human ether-a-go-go-related gene (HERG) trafficking and localization, in a heat shock protein-dependent manner. RNF207 contains a RING finger, a B-box motif and Bbox C-terminal (BBC) domain, as well as a C-terminal non-homologous region (CNHR). The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 49
36864 380873 cd19815 Bbox1_HOIP B-box-type 1 zinc finger found in HOIL-1-interacting protein (HOIP) and similar proteins. HOIP, also termed RING finger protein 31 (RNF31), or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. It also interacts with the atypical mammalian orphan receptor DAX-1, triggers DAX-1 ubiquitination and stabilization, and participates in repressing steroidogenic gene expression. HOIP contains a B-box motif that shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 43
36865 380874 cd19816 Bbox1_CYLD B-box-type 1 zinc finger found in tumor suppressor cylindromatosis (CYLD) and similar proteins. CYLD, also termed ubiquitin carboxyl-terminal hydrolase CYLD, or deubiquitinating enzyme CYLD, or ubiquitin thioesterase CYLD, or ubiquitin-specific-processing protease CYLD, is a microtubule-associated deubiquitinase that specifically cleaves Lys-63-linked polyubiquitin chains. It plays a pivotal role in a wide range of cellular activities, including innate immunity, cell division, and ciliogenesis. CYLD antagonizes NF-kappaB and JNK signaling by disassembly of Lys63-linked ubiquitin chains synthesized in response to cytokine stimulation. Structural characterization reveals a small zinc-binding B-box inserted within the ubiquitin specific protease (USP) domain of CYLD. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs) and is responsible for its intermolecular interaction and cytoplasmic localization. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 56
36866 380875 cd19817 Bbox1_ANCHR-like B-box-type 1 zinc finger found in Abscission/NoCut checkpoint regulator (ANCHR) and similar proteins. ANCHR, also termed MLL partner containing FYVE domain, or zinc finger FYVE domain-containing protein 19, is a key regulator of the abscission step in cytokinesis: part of the cytokinesis checkpoint, a process required to delay abscission to prevent both premature resolution of intercellular chromosome bridges and accumulation of DNA damage. The family also includes zinc finger B-box domain-containing protein 1 (ZBBX), a B-box motif containing protein with unclear biological function. The B-box motif of this family shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 45
36867 380876 cd19818 Bbox1_ZBBX B-box-type 1 zinc finger found in zinc finger B-box domain-containing protein 1 (ZBBX) and similar proteins. The family corresponds to a group of uncharacterized zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 43
36868 380877 cd19819 Bbox1_ZFYVE1_rpt1 first B-box-type 1 zinc finger found in zinc finger FYVE domain-containing protein 1 (ZFYVE1) and similar proteins. ZFYVE1 also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYVE1 to Golgi, endoplasmic reticulum (ER) and vesicular is governed in part by its FYVE domains but unaffected by Wortmannin, a PI3-kinase inhibitor. ZFYVE1 harbors two B-box motifs, both of which show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 48
36869 380878 cd19820 Bbox1_ZFYVE1_rpt2 second B-box-type 1 zinc finger found in zinc finger FYVE domain-containing protein 1 (ZFYVE1) and similar proteins. ZFYVE1 also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYVE1 to Golgi, endoplasmic reticulum (ER) and vesicular is governed in part by its FYVE domains but unaffected by Wortmannin, a PI3-kinase inhibitor. ZFYVE1 harbors two B-box motifs, both of which show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 59
36870 380879 cd19821 Bbox1_BBX-like B-box-type 1 zinc finger found in B-box (BBX) family of plant transcription factors and similar proteins. The BBX family includes a group of zinc finger transcription factors that contain one or two B-box motifs, and sometimes also feature a CCT (CONSTANS, CO-like, and TOC1) domain. They play important roles in plant growth and development, including seedling photomorphogenesis, photoperiodic regulation of flowering, shade avoidance, and responses to biotic and abiotic stresses. Their B-box motifs show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs) and are involved in mediating transcriptional regulation and protein-protein interaction in plant signaling. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif; this family contains a modified motif, C3XC2H2, where X can be D, E, C or H. 44
36871 380880 cd19822 Bbox2_MID1_C-I B-box-type 2 zinc finger found in midline-1 (MID1) and similar proteins. MID1, also termed midin, or midline 1 RING finger protein, or putative transcription factor XPRF, or RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. It heterodimerizes in vitro with its paralog MID2. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 47
36872 380881 cd19823 Bbox2_MID2_C-I B-box-type 2 zinc finger found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1, which associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. It heterodimerizes in vitro with its paralog MID1. Loss-of-function mutations in MID2 lead to human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 40
36873 380882 cd19824 Bbox2_TRIM2_C-VII B-box-type 2 zinc finger found in tripartite motif-containing protein 2 (TRIM2) and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM2 also plays a role in mediating the p42/p44 MAPK-dependent ubiquitination of the cell death-promoting protein Bcl-2-interacting mediator of cell death (Bim) in rapid ischemic tolerance. TRIM2 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 42
36874 380883 cd19825 Bbox2_TRIM3_C-VII B-box-type 2 zinc finger found in tripartite motif-containing protein 3 (TRIM3). TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It functions as a tumor suppressor that regulates asymmetric cell division in neuroblastoma. It binds to the cdk inhibitor p21(WAF1/CIP1) and regulates its availability that promotes cyclin D1-cdk4 nuclear accumulation. Moreover, TRIM3 plays an important role in the central nervous system (CNS). It corresponds to gene BERP (brain-expressed RING finger protein), a unique p53-regulated gene that modulates seizure susceptibility and GABAAR cell surface expression. Furthermore, TRIM3 mediates activity-dependent turnover of postsynaptic density (PSD) scaffold proteins GKAP/SAPAP1 and is a negative regulator of dendrite spine morphology. In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. It also regulates the motility of the kinesin superfamily protein KIF21B. TRIM3 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 47
36875 380884 cd19826 Bbox2_TRIM9_C-I B-box-type 2 zinc finger found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9 (the human ortholog of rat Spring), also termed RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif (DSGXXS) depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytosis soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 49
36876 380885 cd19827 Bbox2_TRIM67_C-I B-box-type 2 zinc finger found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 45
36877 380886 cd19828 Bbox2_TIF1a_C-VI B-box-type 2 zinc finger found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-alpha interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development. 57
36878 380887 cd19829 Bbox2_TIF1b_C-VI B-box-type 2 zinc finger found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD) and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-beta acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity. 44
36879 380888 cd19830 Bbox2_TIF1g_C-VI B-box-type 2 zinc finger found in transcription intermediary factor 1 gamma (TIF1-gamma). TIF1-gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-gamma is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling. It inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1-gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis. 53
36880 380889 cd19831 Bbox2_MuRF1_C-II B-box-type 2 zinc finger found in muscle-specific RING finger protein 1 (MuRF-1) and similar proteins. MuRF-1, also known as tripartite motif-containing protein 63 (TRIM63), RING finger protein 28 (RNF28), iris RING finger protein, or striated muscle RING zinc finger, is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is predominantly fast (type II) fibre-associated in skeletal muscle and can bind to many myofibrillar proteins, including titin, nebulin, the nebulin-related protein NRAP, troponin-I (TnI), troponin-T (TnT), myosin light chain 2 (MLC-2), myotilin, and T-cap. The early and robust upregulation of MuRF-1 is triggered by disuse, denervation, starvation, sepsis, or steroid administration resulting in skeletal muscle atrophy. It also plays a role in maintaining titin M-line integrity. It associates with the periphery of the M-line lattice and may be involved in the regulation of the titin kinase domain. It also participates in muscle stress response pathways and gene expression. MuRF-1 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36881 380890 cd19832 Bbox2_MuRF2_C-II B-box-type 2 zinc finger found in muscle-specific RING finger protein 2 (MuRF-2) and similar proteins. MuRF-2, also known as tripartite motif-containing protein 55 (TRIM55) or RING finger protein 29 (RNF29), is a muscle-specific E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover and also a ligand of the transactivation domain of the serum response transcription factor (SRF). It is predominantly slow-fibre associated and highly expressed in embryonic skeletal muscle. MuRF-2 associates transiently with microtubules, myosin, and titin during sarcomere assembly. It has been implicated in microtubule, intermediate filament, and sarcomeric M-line maintenance in striated muscle development, as well as in signalling from the sarcomere to the nucleus. It plays an important role in the earliest stages of skeletal muscle differentiation and myofibrillogenesis. It is developmentally downregulated and is assembled at the M-line region of the sarcomere and with microtubules. MuRF-2 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 45
36882 380891 cd19833 Bbox2_MuRF3_C-II B-box-type 2 zinc finger found in muscle-specific RING finger protein 3 (MuRF-3) and similar proteins. MuRF-3, also known as tripartite motif-containing protein 54 (TRIM54), or RING finger protein 30 (RNF30), is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is ubiquitously detected in all fibre types, and is developmentally upregulated, associates with microtubules, the sarcomeric M-line and Z-line, and is required for microtubule stability and myogenesis. It associates with glutamylated microtubules during skeletal muscle development, and is required for skeletal myoblast differentiation and development of cellular microtubular networks. MuRF-3 controls the degradation of four-and-a-half LIM domain (FHL2) and gamma-filamin and is required for maintenance of ventricular integrity after myocardial infarction (MI). MuRF-3 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36883 380892 cd19834 Bbox2_BSPRY B-box-type 2 zinc finger found in B box and SPRY domain-containing protein (BSPRY) and similar proteins. BSPRY is a regulatory protein for maintaining calcium homeostasis. It may regulate epithelial calcium transport by inhibiting TRPV5 activity. BSPRY is composed of a B-box, an alpha-helical coiled coil and a SPRY domain. The B-box motif shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 43
36884 380893 cd19835 Bbox2_TRIM65_C-IV B-box-type 2 zinc finger found in tripartite motif-containing protein 65 (TRIM65) and similar proteins. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. TRIM65 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 42
36885 380894 cd19836 Bbox1_MID1_C-I B-box-type 1 zinc finger found in midline-1 (MID1) and similar proteins. MID1, also termed midin, or midline 1 RING finger protein, or putative transcription factor XPRF, or RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. MID1 heterodimerizes in vitro with its paralog MID2. 50
36886 380895 cd19837 Bbox1_MID2_C-I B-box-type 1 zinc finger found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1, which associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with alpha4, a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. Loss-of-function mutations in MID2 lead to human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. MID2 heterodimerizes in vitro with its paralog MID1. 53
36887 380896 cd19838 Bbox1_TRIM8_C-V B-box-type 1 zinc finger found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF-kappa B) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 48
36888 380897 cd19839 Bbox1_TRIM16 B-box-type 1 zinc finger found in tripartite motif-containing protein 16 (TRIM16) and similar proteins. TRIM16, also termed estrogen-responsive B box protein (EBBP), is a regulator that may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor by affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM16 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 46
36889 380898 cd19840 Bbox1_TRIM29 B-box-type 1 zinc finger found in tripartite motif-containing protein 29 (TRIM29) and similar proteins. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM29 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 47
36890 380899 cd19841 Bbox1_TRIM44 B-box-type 1 zinc finger found in tripartite motif-containing protein 44 (TRIM44) and similar proteins. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. TRIM44 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif; this family contains a modified motif, C5H3. 46
36891 380900 cd19842 Bbox1_TRIM25-like_C-IV B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM25, TRIM47 and similar proteins. The family includes tripartite motif-containing proteins, TRIM25 and TRIM47, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and is particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. 49
36892 380901 cd19843 Bbox1_TRIM9_C-I B-box-type 1 zinc finger found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9 (the human ortholog of rat Spring), also termed RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytosis soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 47
36893 380902 cd19844 Bbox1_TRIM67_C-I B-box-type 1 zinc finger found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 49
36894 380903 cd19845 Bbox1_TIF1a_C-VI B-box-type 1 zinc finger found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-alpha interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoic X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development. 45
36895 380904 cd19846 Bbox1_TIF1b_C-VI B-box-type 1 zinc finger found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-beta/KAP-1 acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity. 52
36896 380905 cd19847 Bbox1_TIF1g_C-VI B-box-type 1 zinc finger found in transcriptional intermediary factor 1 gamma (TIF1-gamma). TIF1-gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-gamma is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling. It inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1-gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1-gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis. 54
36897 380906 cd19848 Bbox1_TRIM36_C-I B-box-type 1 zinc finger found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, the human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequent dorsal axis formation. It is also potentially associated with chromosome segregation by interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 55
36898 380907 cd19849 Bbox1_TRIM46_C-I B-box-type 1 zinc finger found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 52
36899 381251 cd19851 lipocalin_CHL chloroplastic lipocalin (CHL) similar to Arabidopsis CHL. Chloroplastic lipocalin (CHL) prevents peroxidation of thylakoid membrane lipids and is protective against oxidative stress, especially that mediated by singlet oxygen in response to excess light and other stress (e.g. heat shocks). CHL is required for seed longevity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, members of which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 141
36900 381252 cd19852 FABP_pancrustacea fatty acid-binding protein similar to Locusta migratoria FABP (Lm-FABP). This subfamily includes fatty acid-binding proteins found mainly in insects, such as the migratory locust (Locusta migratoria) FABP (Lm-FABP) and the desert locust (Schistocerca gregaria) FABP (Sg-FABP), whose flight muscle tissues contain unusually high levels of FABP, similar to those of migratory birds. Both Sg- and Lm-FABP are closely related to the mammalian i-LBP subfamily IV, especially to the heart and adipocyte FABP forms. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development. 128
36901 380683 cd19854 DSRM_DHX9_rpt1 first double-stranded RNA binding motif of DEAH box protein 9 (DHX9) and similar proteins. DHX9 (EC 3.6.4.13; also known as ATP-dependent RNA helicase A, DExH-box helicase 9 (DDX9), Leukophysin (LKP), nuclear DNA helicase II (NDH II), NDH2, or RNA helicase A) is a multifunctional ATP-dependent nucleic acid helicase that unwinds DNA and RNA in a 3' to 5' direction and plays important roles in many processes, such as DNA replication, transcriptional activation, post-transcriptional RNA regulation, mRNA translation, and RNA-mediated gene silencing. It contains two double-stranded RNA binding motifs (DSRMs) at the N-terminal region. This model corresponds to the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36902 380684 cd19855 DSRM_DHX9_rpt2 second double-stranded RNA binding motif of DEAH box protein 9 (DHX9) and similar proteins. DHX9 (EC 3.6.4.13; also known as ATP-dependent RNA helicase A, DExH-box helicase 9 (DDX9), Leukophysin (LKP), nuclear DNA helicase II (NDH II), NDH2, or RNA helicase A) is a multifunctional ATP-dependent nucleic acid helicase that unwinds DNA and RNA in a 3' to 5' direction and plays important roles in many processes, such as DNA replication, transcriptional activation, post-transcriptional RNA regulation, mRNA translation and RNA-mediated gene silencing. It contains two double-stranded RNA binding motifs (DSRMs) at the N-terminal region. This model corresponds to the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 75
36903 380685 cd19856 DSRM_Kanadaptin double-stranded RNA binding motif of Kanadaptin and similar proteins. Kanadaptin (also known as human lung cancer oncogene 3 protein (HLC-3), kidney anion exchanger adapter protein, or solute carrier family 4 anion exchanger member 1 adapter protein (SLC4A1AP)) is a nuclear protein widely expressed in mammalian tissues. It was originally isolated as a kidney Cl-/HCO3- anion exchanger 1 (kAE1)-binding protein. It is a highly mobile nucleocytoplasmic shuttling and multilocalizing protein. Its role in mammalian cells remains unclear. The double-stranded RNA binding motif (DSRM) is not sequence specific, but highly specific for dsRNAs of various origin and structure. 86
36904 380686 cd19857 DSRM_STAU_rpt1 first double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 64
36905 380687 cd19858 DSRM_STAU_rpt2 second double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36906 380688 cd19859 DSRM_STAU_rpt3 third double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 65
36907 380689 cd19860 DSRM_STAU_rpt4 fourth double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36908 380690 cd19861 DSRM_STAU_rpt5 fifth double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36909 380691 cd19862 DSRM_PRKRA-like_rpt1 first double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. This family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)), participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. This family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 70
36910 380692 cd19863 DSRM_PRKRA-like_rpt2 second double-stranded RNA binding motif of PRKRA, TARBP2 and similar proteins. The family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. The family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36911 380693 cd19864 DSRM_PRKRA-like_rpt3 third double-stranded RNA binding motif of PRKRA, TARBP2 and similar proteins. The family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. The family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36912 380694 cd19865 DSRM_STRBP_RED-like_rpt1 first double-stranded RNA binding motif of STRBP, ILF3, RED1, RED2 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3), as well as two RNA-editing deaminases, RED1 and RED2. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. RED1 (EC 3.5.4.37; also called double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. RED2 (also called double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA, but also to ssRNA. RED2 lacks editing activity for currently known substrate RNAs. Members of this group contain two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 63
36913 380695 cd19866 DSRM_STRBP_RED-like_rpt2 second double-stranded RNA binding motif of STRBP, ILF3, RED1, RED2 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3), as well as two RNA-editing deaminases, RED1 and RED2. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. RED1 (EC 3.5.4.37; also called double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. RED2 (also called double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 lacks editing activity for currently known substrate RNAs. Members of this group contain two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 63
36914 380696 cd19867 DSRM_DGCR8_rpt1 first double-stranded RNA binding motif of DiGeorge syndrome critical region 8 (DGCR8) and similar proteins. DGCR8 is a component of the microprocessor complex that acts as an RNA- and heme-binding protein that is involved in the initial step of microRNA (miRNA) biogenesis. Within the microprocessor complex, DGCR8 functions as a molecular anchor necessary for the recognition of pri-miRNA at dsRNA-ssRNA junction and directs DROSHA to cleave 11bp away from the junction to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. DGCR8 contains two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 74
36915 380697 cd19868 DSRM_DGCR8_rpt2 second double-stranded RNA binding motif of DiGeorge syndrome critical region 8 (DGCR8) and similar proteins. DGCR8 is a component of the microprocessor complex that acts as an RNA- and heme-binding protein that is involved in the initial step of microRNA (miRNA) biogenesis. Within the microprocessor complex, DGCR8 functions as a molecular anchor necessary for the recognition of pri-miRNA at dsRNA-ssRNA junction and directs DROSHA to cleave 11bp away from the junction to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. DGCR8 contains two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36916 380698 cd19869 DSRM_DCL_plant double-stranded RNA binding motif of plant Dicer-like proteins. The family includes plant Dicer-like (DCL) proteins and other ribonuclease (RNase) III-like (RTL) proteins. DCLs are endoribonucleases involved in RNA-mediated post-transcriptional gene silencing (PTGS). They function in the microRNA (miRNA) biogenesis pathway by cleaving primary miRNAs (pri-miRNAs) and precursor miRNAs (pre-miRNAs). Family members contain a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 70
36917 380699 cd19870 DSRM_SON-like double-stranded RNA binding motif of protein SON and similar proteins. Protein SON (also known as Bax antagonist selected in saccharomyces 1 (BASS1), negative regulatory element-binding protein (NRE-binding protein), or protein DBP-5, or SON3) is an RNA-binding protein which acts as an mRNA splicing cofactor by promoting efficient splicing of transcripts that possess weak splice sites. It specifically promotes splicing of many cell-cycle and DNA-repair transcripts that possess weak splice sites, such as TUBG1, KATNB1, TUBGCP2, AURKB, PCNT, AKT1, RAD23A, and FANCG. Members of this group contain a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 75
36918 380700 cd19871 DSRM_DUS2L double-stranded RNA binding motif of tRNA-dihydrouridine(20) synthase [NAD(P)+]-like (DUS2L) and similar proteins. DUS2L (also known as dihydrouridine synthase 2 (DUS2), up-regulated in lung cancer protein 8 (URLC8), or tRNA-dihydrouridine synthase 2-like) catalyzes the synthesis of dihydrouridine, a modified base found in the D-loop of most tRNAs. It negatively regulates the activation of EIF2AK2/PKR. DUS2L contains an N-terminal FMN-binding domain and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36919 380701 cd19872 DSRM_A1CF-like double-stranded RNA binding motif of APOBEC1 complementation factor (A1CF), RNA-binding protein 46 (RBM46) and similar proteins. The family includes two dsRNA-binding motif-containing proteins, A1CF and RBM46. A1CF (also known as APOBEC1-stimulating protein) is an essential component of the apolipoprotein B mRNA editing enzyme complex which is responsible for the posttranscriptional editing of a CAA codon for Gln to a UAA codon for stop in APOB mRNA. A1CF binds to APOB mRNA and is probably responsible for docking the catalytic subunit, APOBEC1, to the mRNA to allow it to deaminate its target cytosine. RBM46 (also called cancer/testis antigen 68 (CT68), or RNA-binding motif protein 46) plays a novel role in the regulation of embryonic stem cell (ESC) differentiation by regulating the degradation of beta-catenin mRNA. It also regulates trophectoderm specification by stabilizing Cdx2 mRNA in early mouse embryos. Members of this family contain three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure. 75
36920 380702 cd19873 DSRM_MRPL3_like double-stranded RNA binding motif of Saccharomyces cerevisiae mitochondrial 54S ribosomal protein L3 (MRPL3) and similar proteins. MRPL3 (also called mitochondrial large ribosomal subunit protein mL44) is a component of the mitochondrial ribosome (mitoribosome), a dedicated translation machinery responsible for the synthesis of mitochondrial genome-encoded proteins, including at least some of the essential transmembrane subunits of the mitochondrial respiratory chain. MRPL3 contains a RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 84
36921 380703 cd19874 DSRM_MRPL44 double-stranded RNA binding motif of mitochondrial 39S ribosomal protein L44 (MRPL44) and similar proteins. MRPL44 (also known as L44mt, MRP-L44, or mitochondrial large ribosomal subunit protein mL44) is a component of the 39S subunit of mitochondrial ribosome. It may play a role in the assembly/stability of nascent mitochondrial polypeptides exiting the ribosome. MRPL44 contains a RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 84
36922 380704 cd19875 DSRM_EIF2AK2-like double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. The family includes EIF2AK2 and adenosine deaminase domain-containing proteins, ADAD1 and ADAD2. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. ADAD1 (also called testis nuclear RNA-binding protein (TENR)) and ADAD2 (also called testis nuclear RNA-binding protein-like (TENRL)) are phylogenetically related to a family of adenosine deaminases involved in RNA editing. ADAD1 plays an essential function in spermatid morphogenesis. It may be involved in testis-specific nuclear post-transcriptional processes such as heterogeneous nuclear RNA (hnRNA) packaging, alternative splicing, or nuclear/cytoplasmic transport of mRNAs. ADAD2 is a double-stranded RNA binding protein with unclear biological function. Members of this group contain varying numbers of double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36923 380705 cd19876 DSRM_RNT1p-like double-stranded RNA binding motif of Saccharomyces cerevisiae ribonuclease 3 (RNT1p) and similar proteins. RNT1p (EC 3.1.26.3; also known as ribonuclease III (RNase III)) is a dsRNA-specific nuclease that cleaves eukaryotic pre-ribosomal RNA at the U3 snoRNP-dependent A0 site in the 5'-external transcribed spacer (ETS) and in the 3'-ETS. RNT1p contains a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36924 380706 cd19877 DSRM_RNAse_III_meta_like double-stranded RNA binding motif of metazoan ribonuclease III (RNase III) and similar proteins. RNase III (EC 3.1.26.3; also known as Drosha, or ribonuclease 3) is a double-stranded RNA (dsRNA)-specific endoribonuclease that is involved in the initial step of microRNA (miRNA) biogenesis. It is a component of the microprocessor complex that is required to process primary miRNA transcripts (pri-miRNAs) to release precursor miRNA (pre-miRNA) in the nucleus. Within the microprocessor complex, RNase III cleaves the 3' and 5' strands of a stem-loop in pri-miRNAs (processing center 11 bp from the dsRNA-ssRNA junction) to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. It is also involved in pre-rRNA processing. Metazoan RNase III is a larger protein than bacterial RNase III. It contains two RNase III domains in the C-terminal half of the protein followed by a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 75
36925 380707 cd19878 DSRM_AtDRB-like double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36926 380708 cd19879 DSRM_STAU1_rpt1 first double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 66
36927 380709 cd19880 DSRM_STAU2_rpt1 first double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36928 380710 cd19881 DSRM_STAU1_rpt2 second double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 79
36929 380711 cd19882 DSRM_STAU2_rpt2 second double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 82
36930 380712 cd19883 DSRM_STAU1_rpt3 third double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36931 380713 cd19884 DSRM_STAU2_rpt3 third double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36932 380714 cd19885 DSRM_STAU1_rpt4 fourth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 86
36933 380715 cd19886 DSRM_STAU2_rpt4 fourth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 86
36934 380716 cd19887 DSRM_STAU1_rpt5 fifth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 70
36935 380717 cd19888 DSRM_STAU2_rpt5 fifth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36936 380718 cd19889 DSRM_PRKRA_rpt1 first double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
36937 380719 cd19890 DSRM_TARBP2_rpt1 first double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)), participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36938 380720 cd19891 DSRM_PRKRA_rpt2 second double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 67
36939 380721 cd19892 DSRM_PRKRA_rpt3 third double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36940 380722 cd19893 DSRM_TARBP2_rpt3 third double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36941 380723 cd19894 DSRM_STRBP-like_rpt1 first double-stranded RNA binding motif of STRBP, ILF3 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3). STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. Members of this STRBP/ILF3 group contain an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 63
36942 380724 cd19895 DSRM_RED1_rpt1 first double-stranded RNA binding motif of RNA-editing deaminase 1 (RED1) and similar proteins. RED1 (EC 3.5.4.37; also known as double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. It contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the first DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36943 380725 cd19896 DSRM_RED2_rpt1 first double-stranded RNA binding motif of RNA-editing deaminase 2 (RED2) and similar proteins. RED2 (also known as double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the first DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. RED2 lacks editing activity for currently known substrate RNAs, and may have an inactive editase domain. 74
36944 380726 cd19897 DSRM_STRBP-like_rpt2 second double-stranded RNA binding motif of STRBP, ILF3 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3). STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. Members of this STRBP/ILF3 group contain an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 64
36945 380727 cd19898 DSRM_RED1_rpt2 second double-stranded RNA binding motif of RNA-editing deaminase 1 (RED1) and similar proteins. RED1 (EC 3.5.4.37; also known as double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. It contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the second DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 70
36946 380728 cd19899 DSRM_RED2_rpt2 second double-stranded RNA binding motif of RNA-editing deaminase 2 (RED2) and similar proteins. RED2 (also known as double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the second DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. RED2 lacks editing activity for currently known substrate RNAs, and may have an inactive editase domain. 74
36947 380729 cd19900 DSRM_A1CF double-stranded RNA binding motif of APOBEC1 complementation factor (A1CF) and similar proteins. A1CF (also known as APOBEC1-stimulating protein) is an essential component of the apolipoprotein B mRNA editing enzyme complex which is responsible for the posttranscriptional editing of a CAA codon for Gln to a UAA codon for stop in APOB mRNA. A1CF binds to APOB mRNA and is probably responsible for docking the catalytic subunit, APOBEC1, to the mRNA to allow it to deaminate its target cytosine. It contains three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure. 81
36948 380730 cd19901 DSRM_RBM46 double-stranded RNA binding motif of RNA-binding protein 46 (RBM46) and similar proteins. RBM46 (also known as cancer/testis antigen 68 (CT68), or RNA-binding motif protein 46) plays a novel role in the regulation of embryonic stem cell (ESC) differentiation by regulating the degradation of beta-catenin mRNA. It also regulates trophectoderm specification by stabilizing Cdx2 mRNA in early mouse embryos. RBM46 contains three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure. 78
36949 380731 cd19902 DSRM_DRADA double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. DRADA family members contain at least one double-stranded RNA binding motif (DSRM); vertebrate proteins contain three. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
36950 380732 cd19903 DSRM_EIF2AK2_rpt1 first double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
36951 380733 cd19904 DSRM_EIF2AK2_rpt2 second double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36952 380734 cd19905 DSRM_ADAD1 double-stranded RNA binding motif of adenosine deaminase domain-containing protein 1 (ADAD1) and similar proteins. ADAD1 (also known as testis nuclear RNA-binding protein (TENR)) is phylogenetically related to a family of adenosine deaminases involved in RNA editing. It plays an essential function in spermatid morphogenesis. It may be involved in testis-specific nuclear post-transcriptional processes such as heterogeneous nuclear RNA (hnRNA) packaging, alternative splicing, or nuclear/cytoplasmic transport of mRNAs. ADAD1 contains a double-stranded RNA binding motif (DSRM) and a C-terminal adenosine-deaminase (editase) domain. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36953 380735 cd19906 DSRM_ADAD2 double-stranded RNA binding motif of adenosine deaminase domain-containing protein 2 (ADAD2) and similar proteins. ADAD2 (also known as testis nuclear RNA-binding protein-like (TENRL)) is phylogenetically related to a family of adenosine deaminases involved in RNA editing. It is a double-stranded RNA binding protein with unclear biological function. ADAD2 contains a double-stranded RNA binding motif (DSRM) and a C-terminal adenosine-deaminase (editase) domain. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 74
36954 380736 cd19907 DSRM_AtDRB-like_rpt1 first double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36955 380737 cd19908 DSRM_AtDRB-like_rpt2 second double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 69
36956 380738 cd19909 DSRM_STRBP_rpt1 first double-stranded RNA binding motif of spermatid perinuclear RNA-binding protein (STRBP) and similar proteins. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. STRBP contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 84
36957 380739 cd19910 DSRM_ILF3_rpt1 first double-stranded RNA binding motif of interleukin enhancer-binding factor 3 (ILF3) and similar proteins. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. ILF3 contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 73
36958 380740 cd19911 DSRM_STRBP_rpt2 second double-stranded RNA binding motif of spermatid perinuclear RNA-binding protein (STRBP) and similar proteins. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. STRBP contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 64
36959 380741 cd19912 DSRM_ILF3_rpt2 second double-stranded RNA binding motif of interleukin enhancer-binding factor 3 (ILF3) and similar proteins. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. ILF3 contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 72
36960 380742 cd19913 DSRM_DRADA_rpt1 first double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA). DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
36961 380743 cd19914 DSRM_DRADA_rpt2 second double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
36962 380744 cd19915 DSRM_DRADA_rpt3 third double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
36963 381179 cd19916 OphMA_like tetrapyrrole methylase family protein similar to Omphalotus olearius omphalotin methyltransferase (OphMA) and Dendrothele bispora dbOphMA. OphMA is the precursor protein of the fungal cyclic peptide Omphalotin A. Omphalotin A is a potent nematicide, having 9 out of 12 of its residues methylated at the backbone amide. Omphalotin A derives from the C-terminus of OphMA (also known as OphA). OphMA catalyzes the automethylation of its own C-terminus using S-adenosyl methionine (SAM); this C-terminus is subsequently released and macrocyclized by the protease OphP to give Omphalotin A. 237
36964 381180 cd19917 RsmI_like tetrapyrrole methylase family protein similar to ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress. 217
36965 381181 cd19918 RsmI_like uncharacterized subfamily of the tetrapyrrole methylase family similar to Ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress. 217
36966 381146 cd19919 REC_NtrC phosphoacceptor receiver (REC) domain of DNA-binding transcriptional regulator NtrC. DNA-binding transcriptional regulator NtrC is also called nitrogen regulation protein NR(I) or nitrogen regulator I (NRI). It contains an N-terminal receiver (REC) domain, followed by a sigma-54 interaction domain, and a C-terminal helix-turn-helix DNA-binding domain. It is part of the two-component regulatory system NtrB/NtrC, which controls expression of the nitrogen-regulated (ntr) genes in response to nitrogen limitation. DNA-binding response regulator NtrC is phosphorylated by NtrB; phosphorylation of the N-terminal REC domain activates the central sigma-54 interaction domain and leads to the transcriptional activation from promoters that require sigma(54)-containing RNA polymerase. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
36967 381147 cd19920 REC_PA4781-like phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase PA4781 and similar domains. Pseudomonas aeruginosa cyclic di-GMP phosphodiesterase PA4781 contains an N-terminal REC domain and a C-terminal catalytic HD-GYP domain, characteristics of RpfG family response regulators. PA4781 is involved in cyclic di-3',5'-GMP (c-di-GMP) hydrolysis/degradation in a two-step reaction via the linear intermediate pGpG to produce GMP. Its unphosphorylated REC domain prevents accessibility of c-di-GMP to the active site. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 103
36968 381148 cd19921 REC_1_GGDEF first phosphoacceptor receiver (REC) domain of uncharacterized GGDEF domain proteins. This family is composed of uncharacterized PleD-like response regulators that contain two N-terminal REC domains and a C-terminal diguanylate cyclase output domain with the characteristic GGDEF motif at the active site. Unlike PleD which contains a REC-like adaptor domain, the second REC domain of these uncharacterized GGDEF domain proteins contains characteristic metal-binding and active site residues. PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes the first REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 115
36969 381149 cd19922 REC_RitR-like receiver (REC) domain of orphan response regulator RitR and similar domains. Streptococcus pneumoniae RitR (Repressor of iron transport Regulator, formerly RR489) is an orphan two-component signal transduction response regulator that is required for lung pathogenicity. It acts to repress iron uptake via binding the pneumococcal iron uptake (Piu) transporter promoter. Members of this subfamily contain REC and DNA-binding output domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. However, members of this family do not contain the phosphorylatable aspartic acid residue and are phosphorylation-independent. 110
36970 381150 cd19923 REC_CheY_CheY3 phosphoacceptor receiver (REC) domain of chemotaxis response regulator CheY3 and similar CheY family proteins. CheY family chemotaxis response regulators (RRs) comprise about 17% of bacterial RRs and almost half of all RRs in archaea. This subfamily contains Vibrio cholerae CheY3, Escherichia coli CheY, and similar CheY family RRs. CheY proteins control bacterial motility and participate in signaling phosphorelays and in protein-protein interactions. CheY RRs contain only the REC domain with no output/effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 119
36971 381151 cd19924 REC_CheV-like phosphoacceptor receiver (REC) domain of chemotaxis protein CheV and similar proteins. This subfamily includes the REC domains of Bacillus subtilis chemotaxis protein CheV, Myxococcus xanthus gliding motility regulatory protein FrzE, and similar proteins. CheV is a hybrid protein with an N-terminal CheW-like domain and a C-terminal CheY-like REC domain. The CheV pathway is one of three systems employed by B. subtilis for sensory adaptation that contribute to chemotaxis. It is involved in the transmission of sensory signals from chemoreceptors to flagellar motors. Together with CheW, it is involved in the coupling of methyl-accepting chemoreceptors to the central two-component histidine kinase CheA. FrzE is a hybrid sensor histidine kinase/response regulator that is part of the Frz pathway that controls cell reversal frequency to support directional motility during swarming and fruiting body formation. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 111
36972 381152 cd19925 REC_citrate_TCS phosphoacceptor receiver (REC) domain of citrate family two-component system response regulators. This family includes Lactobacillus paracasei MaeR, Escherichia coli DcuR and DpiA, Klebsiella pneumoniae CitB, as well as Bacillus DctR, MalR, and CitT. These are all response regulators of two-component systems (TCSs) from the citrate family, and are involved in the transcriptional regulation of genes associated with L-malate catabolism (MaeRK), citrate-specific fermentation (DpiAB, CitAB), plasmid inheritance (DpiAB), anaerobic fumarate respiratory system (DcuRS), and malate transport/utilization (MalKR). REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
36973 381153 cd19926 REC_PilR phosphoacceptor receiver (REC) domain of type 4 fimbriae expression regulatory protein PilR and similar proteins. Pseudomonas aeruginosa PilR is the response regulator of the PilS/PilR two-component regulatory system (PilSR TCS) that acts in conjunction with sigma-54 to regulate the expression of type 4 pilus (T4P) major subunit PilA. In addition, the PilSR TCS regulates flagellum-dependent swimming motility and pilus-dependent twitching motility. PilR contains an N-terminal REC domain, a central sigma-54 interaction domain, and a C-terminal Fis-type helix-turn-helix DNA-binding domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 100
36974 381154 cd19927 REC_Ycf29 phosphoacceptor receiver (REC) domain of probable transcriptional regulator Ycf29. Ycf29 is a probable response regulator of a two-component system (TCS), typically consisting of a sensor and a response regulator, that functions in adaptation to changing environments. Processes regulated by TCSs in bacteria include sporulation, pathogenicity, virulence, chemotaxis, and membrane transport. Ycf29 contains an N-terminal REC domain and a LuxR-type helix-turn-helix DNA-binding output domain. REC domains function as phosphorylation-mediated switches within RRs, but some also transfer phosphoryl groups in multistep phosphorelays. 102
36975 381155 cd19928 REC_RcNtrC-like phosphoacceptor receiver (REC) domain of Rhodobacter capsulatus nitrogen regulatory protein C (NtrC) and similar NtrC family response regulators. NtrC family proteins are transcriptional regulators that have REC, AAA+ ATPase/sigma-54 interaction, and DNA-binding output domains. This subfamily of NtrC proteins includes NtrC, also called nitrogen regulator I (NRI), from Rhodobacter capsulatus, Azospirillum brasilense, and Azorhizobium caulinodans. NtrC is part of the NtrB/NtrC two-component system that controls the expression of the nitrogen-regulated (ntr) genes in response to nitrogen limitation. The N-terminal REC domain of NtrC proteins regulates the activity of the protein and its phosphorylation controls the AAA+ domain oligomerization, while the central AAA+ domain participates in nucleotide binding, hydrolysis, oligomerization, and sigma54 interaction. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 100
36976 381156 cd19929 psREC_Atg32 pseudo receiver domain of autophagy receptor Atg32. Autophagy receptor Atg32 is a single-pass outer mitochondrial membrane protein that is required for the selective autophagy of mitochondria, called mitophagy, in yeast. It mediates ubiquitin-independent mitophagy in response to nitrogen deprivation. It recruits the autophagy machinery to mitochondria, facilitating mitochondrial capture in phagophores, the precursors to autophagosomes. Whereas mammals have at least 7 different autophagy receptors, yeast only has one. Little is known about the structure of Atg32; it contains a binding region for the selective autophagy scaffolding protein Atg11 and an Atg8-interacting motif (AIM). Limited proteolysis has identified a structured domain, a pseudo receiver (psREC) domain, within the cytosolic region of Atg32 that is essential for the induction of mitophagy. psREC domains lack the metal-binding, phosphorylatable asp, and active site residues of canonical REC domains and are thought to function in protein-protein interactions. 139
36977 381157 cd19930 REC_DesR-like phosphoacceptor receiver (REC) domain of DesR and similar proteins. This group is composed of Bacillus subtilis DesR, Streptococcus pneumoniae response regulator spr1814, and similar proteins, all containing an N-terminal REC domain and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. DesR is a response regulator that, together with its cognate sensor kinase DesK, comprises a two-component regulatory system that controls membrane fluidity. Phosphorylation of the REC domain of DesR is allosterically coupled to two distinct exposed surfaces of the protein, controlling noncanonical dimerization/tetramerization, cooperative activation, and DesK binding. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
36978 381158 cd19931 REC_NarL phosphoacceptor receiver (REC) domain of Nitrate/Nitrite response regulator L (NarL). Nitrate/nitrite response regulator protein NarL contains an N-terminal REC domain and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. Escherichia coli NarL activates the expression of the nitrate reductase (narGHJI) and formate dehydrogenase-N (fdnGHI) operons, and represses the transcription of the fumarate reductase (frdABCD) operon in response to a nitrate/nitrite induction signal. Phosphorylation of the NarL REC domain releases the C-terminal HTH output domain that subsequently binds specific DNA promoter sites to repress or activate gene expression. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
36979 381159 cd19932 REC_PdtaR-like phosphoacceptor receiver (REC) domain of PdtaR and similar proteins. This subfamily includes Mycobacterium tuberculosis PdtaR, also called Rv1626, and similar proteins containing a REC domain and an ANTAR (AmiR and NasR transcription antitermination regulators) RNA-binding output domain. PdtaR is a response regulator that acts at the level of transcriptional antitermination and is a member of the PdtaR/PdtaS two-component regulatory system. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 118
36980 381160 cd19933 REC_ETR-like phosphoacceptor receiver (REC) domain of plant ethylene receptors ETR1, ETR2, and EIN4, and similar proteins. Plant ethylene receptors contain N-terminal transmembrane domains that contain an ethylene binding site and also serve in localization of the receptor to the endoplasmic reticulum or the Golgi apparatus and a C-terminal histidine kinase (HK)-like domain. There are five ethylene receptors (ETR1, ERS1, ETR2, ERS2, and EIN4) in Arabidopsis thaliana. ETR1, ETR2, and EIN4 also contain REC domains C-terminal to the HK domain. ETR1 and ERS1 belong to subfamily 1, and have functional HK domains while ETR2, ERS2, and EIN4 belong to subfamily 2, and lack the necessary residues for HK activity and may function as serine/threonine kinases. The plant hormone ethylene plays an important role in plant growth and development. It regulates seed germination, seedling growth, leaf and petal abscission, fruit ripening, organ senescence, and pathogen responses. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
36981 381161 cd19934 REC_OmpR_EcPhoP-like phosphoacceptor receiver (REC) domain of EcPhoP-like OmpR family response regulators. Escherichia coli PhoP (EcPhoP) is part of the PhoQ/PhoP two-component system (TCS) that regulates virulence genes and plays an essential role in the response of the bacteria to the environment of their mammalian hosts, sensing several stimuli such as extracellular magnesium limitation, low pH, the presence of cationic antimicrobial peptides, and osmotic upshift. This subfamily also includes Brucella suis FeuP, part of the FeuPQ TCS that is involved in the regulation of iron uptake, and Microchaete diplosiphon RcaC, which is required for chromatic adaptation. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 117
36982 381162 cd19935 REC_OmpR_CusR-like phosphoacceptor receiver (REC) domain of CusR-like OmpR family response regulators. Escherichia coli CusR is part of the CusS/CusR two-component system (TCS) that is involved in response to copper and silver. Other members of this subfamily include Escherichia coli PcoR, Pseudomonas syringae CopR, and Streptomyces coelicolor CutR, which are all transcriptional regulatory proteins and components of TCSs that regulate genes involved in copper resistance and/or metabolism. Another member of the subfamily is Escherichia coli HprR (hydrogen peroxide response regulator), previously called YdeW, which is part of the HprSR (or YedVW) TCS involved in stress response to hydrogen peroxide, as well as Cupriavidus metallidurans CzcR, which is part of the CzcS/CzcR TCS involved in the control of cobalt, zinc, and cadmium homeostasis. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 100
36983 381163 cd19936 REC_OmpR_ChvI-like phosphoacceptor receiver (REC) domain of ChvI-like OmpR family response regulators. Sinorhizobium meliloti ChvI is part of the ExoS/ChvI two-component regulatory system (TCS) that is required for nitrogen-fixing symbiosis and exopolysaccharide synthesis. ExoS/ChvI also play important roles in regulating biofilm formation, motility, nutrient utilization, and the viability of free-living bacteria. ChvI belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 99
36984 381164 cd19937 REC_OmpR_BsPhoP-like phosphoacceptor receiver (REC) domain of BsPhoP-like OmpR family response regulators. Bacillus subtilis PhoP (BsPhoP) is part of the PhoPR two-component system that participates in a signal transduction network that controls adaptation of the bacteria to phosphate deficiency by regulating (activating or repressing) genes of the Pho regulon upon phosphorylation by PhoR. When activated, PhoPR directs expression of phosphate scavenging enzymes, lowers synthesis of the phosphate-rich wall teichoic acid (WTA) and initiates synthesis of teichuronic acid, a non-phosphate containing replacement anionic polymer. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
36985 381165 cd19938 REC_OmpR_BaeR-like phosphoacceptor receiver (REC) domain of BaeR-like OmpR family response regulators. BaeR is part of the BaeSR two-component system that is involved in regulating genes that confer multidrug and metal resistance. In Salmonella, BaeSR induces AcrD and MdtABC drug efflux systems, increasing multidrug and metal resistance. In Escherichia coli, BaeR stimulates multidrug resistance via mdtABC (multidrug transporter ABC, formerly known as yegMNO) genes, which encode a resistance-nodulation-cell division (RND) drug efflux system. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 114
36986 381166 cd19939 REC_OmpR_BfmR-like phosphoacceptor receiver (REC) domain of BfmR-like OmpR family response regulators. Acinetobacter baumannii BfmR is part of the BfmR/S two-component system that functions as the master regulator of biofilm initiation. BfmR confers resistance to complement-mediated bactericidal activity, independent of capsular polysaccharide, and also increases resistance to the clinically important antimicrobials meropenem and colistin, making it a potential antimicrobial target. Its inhibition would have the dual benefit of significantly decreasing in vivo survival and increasing sensitivity to selected antimicrobials. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. 116
36987 410849 cd19940 XPF_nuclease-like nuclease domain of XPF/MUS81 family proteins. The XPF/MUS81 family belongs to the 3'-flap endonucleases that act upon 3'-flap structures and are involved in DNA repair pathways that are necessary for the removal of UV-light-induced DNA lesions and cross-links between DNA strands. Family members exist either as heterodimers or as homodimers in their functionally competent states which consist of a catalytic and a noncatalytic subunit. The catalytic subunits have a DX(n)RKX(3)D motif. This motif is required for metal-dependent endonuclease activity but not for DNA junction binding. The equivalent regions of the noncatalytic subunits (ERCC1, EME1, and FAAP24) have diverged. The noncatalytic subunits have roles such as binding ssDNA or an ability to target the endonuclease to defined DNA structures or sites of DNA damage. 126
36988 410995 cd19941 TIL trypsin inhibitor-like cysteine rich domain. TIL (trypsin inhibitor-like) cysteine rich domains are found in smapins (small serine proteinase inhibitor), or Ascaris trypsin inhibitor (ATI)-like proteins, whose members include anticoagulant proteins, elastase inhibitors, trypsin inhibitors, thrombin inhibitors, and chymotrypsin inhibitors. The TIL domain is also found in some large modular glycoproteins, including the von Willebrand factor (VWF), mucin-6, mucin-19, and SCO-spondin, among others. The TIL domain is characterized by the presence of five disulfide bonds (two of which are located on either side of the reactive site) in a single small protein domain of 61-62 residues. The cysteine residues that form the disulfide bonds are linked in the pattern: cysteines 1-7, 2-6, 3-5, 4-10 and 8-9. TILs can occur as a single domain or in multiple tandem arrangements. The disulfide bonds account for the unusual resistance to proteolysis and heat denaturation of these proteins. Smapins possess an unusual fold and, with the exception of the reactive site, show no similarity to other serine protease inhibitors. The serine protease inhibitors comprise a large family of molecules involved in inflammatory responses, blood clotting, and complement activation. 55
36989 381075 cd19942 Fer2_BFD-like [2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins. The BFD-like [2Fe-2S]-binding domain comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. The Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. BFD-like [2Fe-2S]-binding domains are found in proteins such as bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, Cu+ chaperone CopZ, anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, nitrogen fixation protein NifU, prokaryotic assimilatory nitrate reductase catalytic subunit NasA, and archaeal proline dehydrogenase PDH1. This superfamily also includes uncharacterized proteins having an N-terminal BFD-like [2Fe-2S]-binding domain and a C-terminal domain belonging to the Ni,Fe-hydrogenase I small subunit family. 49
36990 381076 cd19943 NirB_Fer2_BFD-like_1 first bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the large subunit of the NADH-dependent nitrite reductase. The NADH-dependent nitrite reductase (NirBD) complex comprises a large and a small subunit, and is also known as nitrite reductase (reduced nicotinamide adenine dinucleotide), NADH-nitrite oxidoreductase, and assimilatory nitrite reductase. NirBD uses NADH as electron donor, and FAD, iron-sulfur cluster, and siroheme cofactors, all embedded in the large subunit NirB to catalyze the 6-electron reduction of nitrite to ammonium. NirBD plays a role in regulating nitric oxide homeostasis in Streptomyces coelicolor. In addition to NirB, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins including bacterioferritin-associated ferredoxin (BFD) and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 53
36991 381077 cd19944 NirB_Fer2_BFD-like_2 second bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the large subunit of the NADH-dependent nitrite reductase. The NADH-dependent nitrite reductase (NirBD) complex comprises a large (NirB) and a small (NirD) subunit, and is also known as nitrite reductase (reduced nicotinamide adenine dinucleotide), NADH-nitrite oxidoreductase, and assimilatory nitrite reductase. NirBD uses NADH as electron donor, and FAD, iron-sulfur cluster, and siroheme cofactors, all embedded in the large subunit NirB to catalyze the 6-electron reduction of nitrite to ammonium. Some of the second [2Fe-2S]-binding domains have one of the Cys residues replaced by a His residue; they may interact with non-Rieske NirD subunits. NirBD plays a role in regulating nitric oxide homeostasis in Streptomyces coelicolor. In addition to NirB, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins including bacterioferritin-associated ferredoxin (BFD) and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 52
36992 381078 cd19945 Fer2_BFD bacterioferritin-associated ferredoxin (BFD) [2Fe-2S]-binding domain. This family includes Escherichia coli and Pseudomonas aeruginosa bacterioferritin-associated ferredoxin BFD which binds an [2Fe-2S] cluster and appears to interact with bacterioferritin (E. coli BFR/YheA and P. aeruginosa BfrB), a dynamic regulator of intracellular iron levels. It has been suggested that BFD and bacterioferritin form an electron transfer complex which may participate in the iron storage or iron immobilization functions of bacterioferritin. For Pseudomonas aeruginosa, it has been shown that mobilization of Fe3+ stored in BfrB requires interaction with BFD, which transfers electrons to reduce Fe3+ in the internal cavity of BfrB for subsequent release of Fe2+. The stability of BFD may be aided by an anion-binding site found within this domain. In addition to BFD, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins such as the large subunit of NADH-dependent nitrite reductase and the Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 54
36993 381079 cd19946 GlpA-like_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, and similar proteins. This subgroup includes the BFD-like [2Fe-2S]-binding domains of subunits of various component dehydrogenase/oxidases, including anaerobic glycerol 3-phosphate dehydrogenase subunit A of GlpABC, hydrogen cyanide synthase subunit HcnB of HcnABC, octopine oxidase subunit A of OoxAB, and nopaline oxidase subunit A of NoxAB. GlpABC catalyzes the conversion of glycerol 3-phosphate to dihydroxyacetone, and participates in the glycerol degradation by glycerol kinase pathway in step 1 of the sub-pathway that synthesizes glycerone phosphate from sn-glycerol 3-phosphate (anaerobic route). HcnABC oxidizes glycine producing hydrogen cyanide and CO2. In Agrobacterium spp., the first enzymic step in the catabolic utilization of octopine and nopaline is the oxidative cleavage into L-arginine and pyruvate or 2-ketoglutarate, respectively; nopaline oxidase (NoxAB) accepts nopaline and octopine while octopine oxidase (OoxAB) has high activity with octopine but barely detectable activity with nopaline, both subunits possibly contributing to the substrate specificity. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 55
36994 381080 cd19947 NifU_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of nitrogen fixation protein NifU and similar proteins. This family includes the BFD-like [2Fe-2S]-binding domain of Azotobacter vinelandii and Klebsiella pneumoniae nitrogen fixation protein NifU. NifU binds one Fe cation per subunit and one [2Fe-2S] cluster per subunit, and is involved in the formation or repair of [Fe-S] clusters present in iron-sulfur proteins. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 55
36995 381081 cd19948 NasA-like_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain at the C-terminus of prokaryotic assimilatory nitrate reductase catalytic subunit NasA and similar proteins. The BFD-like [2Fe-2S]-binding domain described in this family is found at the C-terminus of prokaryotic assimilatory nitrate reductase catalytic subunit (NasA) such as Rhodobacter capsulatus E1F1 NasA. Nitrate reductase catalyzes the reduction of nitrate to nitrite, the first step of nitrate assimilation. R. capsulatus E1F1 nitrate reductase is composed of this NasA subunit and a small diaphorase subunit with FAD. Note that this [2Fe-2S]-binding domain is not always present; for example, it is absent from the characterized haloarchaeal Haloferax mediterranei NasA; both, however, have an [4Fe-4S] binding domain at their N-terminus. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 53
36996 381082 cd19949 PDH1_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the alpha-subunit of archaeal proline dehydrogenase PDH1. This domain family describes the alpha-subunit of archaeal dye-linked L-proline dehydrogenase PDH1. Dye-linked PDH catalyzes the oxidation of L-proline to 1-pyrroline-5-carboxylate in the presence of artificial electron acceptors. It includes the alpha subunit of Pyrococcus horikoshii PDH1 which has been shown to exist as an (alphabeta)4 heterooctamer and to contain three cofactors: FAD, FMN, and ATP; the alpha subunit contains ATP but exhibits no PDH activity, the beta subunit, which is the catalytic component, contains FAD and exhibits PDH activity, and FMN is located between the alpha and beta subunits. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 56
36997 381083 cd19950 Fer2_BFD-like [2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins; uncharacterized subgroup. The BFD-like [2Fe-2S]-binding domain comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. The Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. BFD-like [2Fe-2S]-binding domains are found in proteins such as bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, Cu+ chaperone CopZ, anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, nitrogen fixation protein NifU, prokaryotic assimilatory nitrate reductase catalytic subunit NasA, and archaeal proline dehydrogenase PDH1. 51
36998 381084 cd19951 HyaA_family_Fer2_BFD-like bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of uncharacterized proteins having a C-terminal Ni,Fe-hydrogenase I small subunit (HyaA) family domain. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 54
36999 410996 cd19953 PDS5 Sister chromatid cohesion protein PDS5. Pds5 plays a crucial role in sister chromatid cohesion. Together with WapI and Scc3, it is involved in the release of the cohesin complex from chromosomes during S phase. The core of the cohesin complex consists of a coiled-coil heterodimer of Smc1 and Smc3, together with Scc1 (also called kleisin). Pds5 interacts with Scc1 via a conserved patch on the surface of its heat repeats. Pds5 also promotes the acetylation of Smc3 that protects cohesin from releasing activity in G2 phase. 630
37000 381070 cd19954 serpin42Dd-like_insects insect serpins similar to Drosophila melanogaster Serpin 42Dd. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin 42Dd, also called serpin 1 (Spn1), regulates Toll-mediated immune responses, functioning as a repressor of Toll activation upon fungal infection. Insect serpins from house flies, fruit flies, and stable flies are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 366
37001 381071 cd19955 serpin48-like_insects insect serpins similar to Tenebrio molitor serpin 48. Serpins in insects function within development, wound healing and immunity. Tenebrio molitor serpin 48 (SPN48) is highly specific for Spatzle-processing enzyme, an essential component in insect innate immunity. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 361
37002 381072 cd19956 serpinB serpin B family, ov-serpins. The clade B of the serpin superfamily corresponds to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). Family members are also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 376
37003 381073 cd19957 serpinA serpin family A. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans. 363
37004 410997 cd19958 pyocin_knob knob domain of R1 and R2 pyocins and similar domains. The knob domain is present as a tandemly repeated structural domain in R-type pyocins, which are high-molecular weight bacteriocins produced by some strains of Pseudomonas aeruginosa to specifically kill other strains of the same species. R-type pyocins are structurally similar to simple contractile tails, such as those of phage P2 and Mu, and they punch a hole in the bacterial envelope to efficiently kill target cells. The second knob domain may contain regions responsible for determining the killing spectrum. Knob-like domains occur in host-recognition and binding proteins of, not only pyocins, but also phages, such as in phage K1F endosialidase (not represented by this model), where it may interact with sialic acid, the cell surface molecule that is recognized during infection. 80
37005 410999 cd19960 YidC_peri periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Gram-negative bacteria and similar domains. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contains a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. Other members of the YidC/Oxa1/Alb3 family include YidC1/YidC2 from gram-positive bacteria as well as eukaryotic members such as mitochondrial Oxa1/Oxa2 (or Cox18) and chloroplastic Alb3/Alb4; they are not part of this hierarchy as they do not possess the periplasmic domain. 233
37006 411000 cd19961 EcYidC-like_peri periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Escherichia coli and similar domains. This subfamily is composed of Escherichia coli YidC and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. 255
37007 380617 cd19962 PBP1_ABC_RfuA-like periplasmic riboflavin-binding component (RfuA) of ABC transporter (RfuABCD) from Treponema pallidum and its close homologs in other bacteria. This group includes the basic membrane lipoprotein (BMP) family ABC transporter substrate-binding protein RfuA from Treponema pallidum and its close homologs in other bacteria. RfuA is the riboflavin-binding component of ABC transporter (RfuABCD) in spirochetes. The members of this group are highly similar to that of the periplasmic binding domain of basic membrane lipoprotein (BMP), PnrA. The PnrA lipoprotein, also known as Tp0319 or TmpC, represents a novel family of bacterial purine nucleoside receptor encoded within an ATP-binding cassette (ABC) transport system (pnrABCDE). It shows a striking structural similarity to another basic membrane lipoprotein Med which regulates the competence transcription factor gene, comK, in Bacillus subtilis. PnrA-like proteins are likely to have similar nucleoside-binding functions and a similar type 1 periplasmic sugar-binding protein-like fold. 305
37008 380618 cd19963 PBP1_BMP-like periplasmic binding component of a basic membrane lipoprotein (BMP) from Brucella abortus and its close homologs in other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold. 279
37009 380619 cd19964 PBP1_BMP-like periplasmic binding component of a basic membrane lipoprotein (BMP) from Aeropyrum pernix K1 and its close homologs in other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Aeropyrum pernix K1 and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold. 263
37010 380620 cd19965 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein CUT2 family and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 272
37011 380621 cd19966 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein CUT2 family and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 278
37012 380622 cd19967 PBP1_TmRBP-like D-ribose ABC transporter substrate-binding protein such as Thermoanaerobacter tengcongensis ribose binding protein (ttRBP). Periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis ribose binding protein (ttRBP) and its mesophilic homologs. Members of this group belong to the type 1 periplasmic binding protein superfamily, whose members are involved in chemotaxis, ATP-binding cassette transport, and intercellular communication in the central nervous system. The thermophilic and mesophilic ribose-binding proteins are structurally very similar, but differ substantially in thermal stability. 272
37013 380623 cd19968 PBP1_ABC_IbpA-like periplasmic sugar-binding protein IbpA of an ABC transporter and similar proteins. The periplasmic binding protein (PBP) IbpA mediates the uptake of myo-inositol by an ABC transporter that consists of the PBP IbpA, the transmembrane permease IatP, and the ABC IatA. IbpA shares homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. 271
37014 380624 cd19969 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 278
37015 380625 cd19970 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 275
37016 380626 cd19971 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 267
37017 380627 cd19972 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 269
37018 380628 cd19973 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this family function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea. The sugar binding domain is also homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. Moreover, this periplasmic binding domain, also known as Venus flytrap domain, undergoes transition from an open to a closed conformational state upon the binding of ligands such as lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. This family also includes the periplasmic binding domain of autoinducer-2 (AI-2) receptors such as LsrB and LuxP which are highly homologous to periplasmic pentose/hexose sugar-binding proteins. 285
37019 380629 cd19974 PBP1_LacI-like ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 270
37020 380630 cd19975 PBP1_CcpA-like ligand-binding domain of putative DNA transcription regulators highly similar to that of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation. This group includes the ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators. 269
37021 380631 cd19976 PBP1_DegA_Like ligand-binding domain of putative DNA transcription regulators highly similar to that of the transcription regulator DegA. This group includes the ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the transcription regulator DegA, which is involved in the control of degradation of Bacillus subtilis amidophosphoribosyltransferase (purF). This group belongs to the LacI-GalR family repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding. 268
37022 380632 cd19977 PBP1_EndR-like periplasmic ligand-binding domain of putative repressor of the endoglucanase operon and its close homologs. This group includes the ligand-binding domain of putative repressor of the endoglucanase operon from Paenibacillus polymyxa and its close homologs from other bacteria. This group belongs to the LacI-GalR family repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding. 264
37023 380633 cd19978 PBP1_ABC_ligand_binding-like periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally, however. 341
37024 380634 cd19979 PBP1_ABC_ligand_binding-like amino acid amide ABC transporter substrate binding protein HAAT family. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally. 350
37025 380635 cd19980 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 334
37026 380636 cd19981 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 297
37027 380637 cd19982 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally. 302
37028 380638 cd19983 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 303
37029 380639 cd19984 PBP1_ABC_ligand_binding-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 296
37030 380640 cd19985 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 321
37031 380641 cd19986 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 297
37032 380642 cd19987 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 353
37033 380643 cd19988 PBP1_ABC_HAAT-like type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATP-Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATP-Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally. 302
37034 380644 cd19989 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 299
37035 380645 cd19990 PBP1_GABAb_receptor_plant periplasmic ligand-binding domain of Arabidopsis thaliana glutamate receptors and its close homologs in other plants. This group includes the ligand-binding domain of Arabidopsis thaliana glutamate receptors, which have sequence similarity with animal ionotropic glutamate receptor and its close homologs in other plants. The ligand-binding domain of GABAb receptors are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). GABA is the major inhibitory neurotransmitter in the mammalian CNS and, like glutamate and other transmitters, acts via both ligand gated ion channels (GABAa receptors) and G-protein coupled receptors (GABAb receptor or GABAbR). GABAa receptors are members of the ionotropic receptor superfamily which includes alpha-adrenergic and glycine receptors. The GABAb receptor is a member of a receptor superfamily which includes the mGlu receptors. The GABAb receptor is coupled to G alpha-i proteins, and activation causes a decrease in calcium, an increase in potassium membrane conductance, and inhibition of cAMP formation. The response is thus inhibitory and leads to hyperpolarization and decreased neurotransmitter release, for example. 373
37036 380646 cd19991 PBP1_ABC_xylose_binding D-xylose binding periplasmic protein. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 284
37037 380647 cd19992 PBP1_ABC_xylose_binding-like periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 284
37038 380648 cd19993 PBP1_ABC_xylose_binding-like periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 287
37039 380649 cd19994 PBP1_ChvE periplasmic sugar binding protein ChvE that interacts with a bacterial two-component signaling system. Periplasmic aldose-monosaccharides binding protein ChvE that belongs to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 304
37040 380650 cd19995 PBP1_ABC_xylose_binding-like periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. 294
37041 380651 cd19996 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail. 302
37042 380652 cd19997 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail. 305
37043 380653 cd19998 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail. 302
37044 380654 cd19999 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail. 313
37045 380655 cd20000 PBP1_ABC_rhamnose rhamnose ABC transporter substrate-binding protein. Rhamnose ABC transporter substrate-binding protein similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling. 298
37046 380656 cd20001 PBP1_LsrB_Quorum_Sensing-like ligand-binding protein LsrB-like of ABC transporter periplasmic binding protein. Ligand-binding protein LsrB-like of a transport system, similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling. 296
37047 380657 cd20002 PBP1_LsrB_Quorum_Sensing-like ligand-binding protein LsrB-like of ABC transporter periplasmic binding protein. Ligand-binding protein LsrB-like of a transport system, similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling. 295
37048 380658 cd20003 PBP1_LsrB_Quorum_Sensing ligand-binding protein LsrB of ABC transporter periplasmic binding protein. Periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling. 298
37049 380659 cd20004 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 273
37050 380660 cd20005 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 274
37051 380661 cd20006 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 274
37052 380662 cd20007 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 271
37053 380663 cd20008 PBP1_ABC_sugar_binding-like monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis. 277
37054 380664 cd20009 PBP1_RafR-like Ligand-binding domain of DNA transcription repressor specific for raffinose (RafR) and similar proteins. Ligand-binding domain of DNA transcription repressor specific for raffinose (RafR) which is a member of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type I periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 266
37055 380665 cd20010 PBP1_AglR-like Ligand-binding domain of DNA transcription repressor specific for alpha-glucosides (AglR) and similar proteins. Ligand-binding domain of DNA transcription repressor specific for alpha-glucosides (AglR) which is a member of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type I periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor. 269
37056 380666 cd20013 PBP1_RPA0985_benzoate-like type 1 periplasmic binding-protein component of an ABC system (RPA0985), involved in the active transport of lignin-derived benzoate derivative compounds, and its close homologs. This group includes RPA0985 from Rhodopseudomonas palustris and its close homologs in other bacteria. RPA0985 is the periplasmic binding-protein component of an ABC system that is involved in the active transport of lignin-derived benzoate derivative compounds. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP). 356
37057 380667 cd20014 PBP1_RPA0668_benzoate-like type 1 periplasmic binding-protein component of an ABC system (RPA0668), involved in the active transport of lignin-derived benzoate derivative compounds, and its close homologs. This group includes RPA0668 from Rhodopseudomonas palustris and its close homologs in other bacteria. RPA0668 is the periplasmic binding-protein component of an ABC system that is involved in the active transport of lignin-derived benzoate derivative compounds. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP). 346
37058 410789 cd20015 FH_FOXA Forkhead (FH) domain found in the Forkhead box protein A (FOXA) subfamily. The FOXA subfamily includes three winged helix transcription factors, FOXA1 (also called hepatocyte nuclear factor 3-alpha or transcription factor 3A), FOXA2 (also called hepatocyte nuclear factor 3-beta or transcription factor 3B), and FOXA3 (also called hepatocyte nuclear factor 3-gamma or transcription factor 3G). FOXA1 is essential for epithelial lineage differentiation and has been found to be upregulated in numerous cancers. FOXA2 controls cell differentiation. It is a key transcriptional regulator that maintains airway mucus homeostasis and may also have an important role in bone metabolism. FOXA3 acts as an essential transcriptional regulator engaged in adipogenesis and energy metabolism. This subfamily also includes Xenopus tropicalis FOXA4, Drosophila melanogaster protein fork head (dFKH), and similar proteins. FOXA4 is only present in amphibians, where it is required for the correct regionalization and maintenance of the central nervous system. dFKH promotes terminal as opposed to segmental development. In the absence of dFKH, this developmental switch does not occur. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 97
37059 410790 cd20016 FH_FOXB Forkhead (FH) domain found in the Forkhead box protein B (FOXB) subfamily. The FOXB subfamily includes two winged helix transcription factors, FOXB1 (also called transcription factor FKH-5) and FOXB2 (also called transcription factor FKH-4). FOXB1 controls development of mammary glands and regions of the central nervous system (CNS) that regulate the milk-ejection reflex. It is essential for access of mammillothalamic axons to the thalamus. FOXB2 may act as a tumor suppressor; it has been found to inhibit the malignant characteristics of the pancreatic cancer cell line Panc-1 in vitro. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 79
37060 410791 cd20017 FH_FOXC Forkhead (FH) domain found in the Forkhead box protein C (FOXC) subfamily. The FOXC subfamily includes two winged helix transcription factors, FOXC1 (also called Forkhead-related protein FKHL7, or Forkhead-related transcription factor 3) and FOXC2 (also called Forkhead-related protein FKHL14, or Mesenchyme fork head protein 1). FOXC1 is a DNA-binding transcriptional factor that plays a role in a broad range of cellular and developmental processes such as the development of the eyes, bones, cardiovascular system, kidneys, and skin. FOXC2 acts as a transcriptional activator that might be involved in the formation of special mesenchymal tissues. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 77
37061 410792 cd20018 FH_FOXD Forkhead (FH) domain found in the Forkhead box protein D (FOXD) subfamily. The FOXD subfamily includes four winged helix transcription factors, FOXD1-4. FOXD1, also called Forkhead-related protein FKHL8 or Forkhead-related transcription factor 4 (FREAC-4), is involved in transcriptional activation of Placental Growth Factor (PGF) and the complement component (C3) genes. It plays an important role in early embryonic development and organogenesis, and functions as an oncogene in several cancers. FOXD2, also called Forkhead-related protein FKHL17 or Forkhead-related transcription factor 9 (FREAC-9), is a probable transcription factor involved in embryogenesis and somatogenesis. FOXD3, also called HNF3/FH transcription factor genesis, acts as a transcriptional repressor that binds to the consensus sequence 5'-A[AT]T[AG]TTTGTTT-3'. It also acts as a transcriptional activator. It promotes development of neural crest cells from neural tube progenitors and restricts neural progenitor cells to the neural crest lineage while suppressing interneuron differentiation. FOXD3 is required for maintenance of pluripotent cells in the pre-implantation and peri-implantation stages of embryogenesis. FOXD4, also called Forkhead-related protein FKHL9 or Forkhead-related transcription factor 5 (FREAC-5), is essential for establishing neural cell fate and for neuronal differentiation. The family also includes Forkhead box protein D4-like proteins, FOXD4L1-6. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 78
37062 410793 cd20019 FH_FOXE Forkhead (FH) domain found in the Forkhead box protein E (FOXE) subfamily. The FOXE subfamily includes two winged helix transcription factors, FOXE1 (also known as FOXE2) and FOXE3. FOXE1, also called Forkhead-related protein FKHL15, HFKH4, HNF-3/fork head-like protein 5 (HFKL5), or thyroid transcription factor 2 (TTF-2), is a transcription factor that binds consensus sites on a variety of gene promoters and activates their transcription. FOXE3, also called Forkhead-related protein FKHL12 or Forkhead-related transcription factor 8 (FREAC-8), is a transcription factor that controls lens epithelial cell growth through regulation of proliferation, apoptosis, and the cell cycle. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 94
37063 410794 cd20020 FH_FOXF Forkhead (FH) domain found in the Forkhead box protein F (FOXF) subfamily. The FOXF subfamily includes two winged helix transcription factors, FOXF1 and FOXF2, both of which are probable transcription activators for a number of lung-specific genes. FOXF1 mutations in sporadic and familial cases of alveolar capillary dysplasia with misaligned pulmonary veins (ACD/MPV) suggest its involvement in ACD/MPV and lung organogenesis. FOXF2 is involved in programming organogenesis and regulating epithelial-to-mesenchymal transition (EMT) and cell proliferation. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 78
37064 410795 cd20021 FH_FOXG Forkhead (FH) domain found in the Forkhead box protein G (FOXG) subfamily. The FOXG subfamily includes a winged helix transcription factor FOXG1, which is also called brain factor 1 (BF-1), brain factor 2 (BF-2), Forkhead box protein G1A, Forkhead box protein G1B, Forkhead box protein G1C, Forkhead-related protein FKHL1, Forkhead-related protein FKHL2, or Forkhead-related protein FKHL3. FOXG1 acts as a transcription repression factor which plays an important role in the establishment of the regional subdivision of the developing brain and in the development of the telencephalon. It is repetitively used in the sequential events of telencephalic development to control multi-steps of brain circuit formation ranging from cell cycle control to neuronal differentiation in a clade- and species-specific manner. Individuals with mutations in FOXG1 harbor "FOXG1-related encephalopathy", characterized by two clinical phenotypes/syndromes including microcephaly, developmental delay, severe cognitive disabilities, early-onset dyskinesia and hyperkinetic movements, stereotypies, epilepsy, and cerebral malformation for those with deletions or intragenic mutations of FOXG1. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 79
37065 410796 cd20022 FH_FOXH Forkhead (FH) domain found in the Forkhead box protein H (FOXH) subfamily. The FOXH subfamily includes a winged helix transcription factor, FOXH1, which is also called Forkhead activin signal transducer 1 (Fast-1), or Forkhead activin signal transducer 2 (Fast-2). FOXH1 acts as a transcriptional activator that recognizes and binds to the DNA sequence 5'-TGT[GT][GT]ATT-3'. It is required for induction of the goosecoid (GSC) promoter by TGF-beta or activin signaling. FOXH1 forms a transcriptionally active complex containing FOXH1/SMAD2/SMAD4 on a site on the GSC promoter called TARE (TGF-beta/activin response element). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 79
37066 410797 cd20023 FH_FOXJ1 Forkhead (FH) domain found in Forkhead box protein J1 (FOXJ1) and similar proteins. FOXJ1, also called Forkhead-related protein FKHL13 or hepatocyte nuclear factor 3 Forkhead homolog 4 (HFH-4), acts as a transcription factor specifically required for the formation of motile cilia. It acts by activating transcription of genes that mediate assembly of motile cilia, such as CFAP157. FOXJ1 binds the DNA consensus sequences 5'-HWDTGTTTGTTTA-3' or 5'-KTTTGTTGTTKTW-3' (where H is not G, W is A or T, D is not C, and K is G or T). It activates the transcription of a variety of ciliary proteins in the developing brain and lung. The FH domain is a winged helix DNA-binding domain. 79
37067 410798 cd20024 FH_FOXJ2-like Forkhead (FH) domain found in Forkhead box proteins, FOXJ2, FOXJ3 and similar proteins. The FOXJ2-like subfamily includes two winged helix transcription factors, FOXJ2 and FOXJ3. FOXJ2, also called Forkhead homologous X (FHX), plays an important role in tumorigenesis, progression, and metastasis of certain cancers. It acts as a transcriptional activator that can bind to two different types of DNA binding sites. FOXJ3 is a transcription factor which regulates sperm function. It transcriptionally activates Mef2c and regulates adult skeletal muscle fiber type identity. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 77
37068 410799 cd20025 FH_FOXI Forkhead (FH) domain found in the Forkhead box protein I (FOXI) subfamily. The FOXI subfamily includes three human winged helix transcription factors, FOXI1-3, Xenopus laevis FoxI1c, and similar proteins. FOXI1 acts as a transcriptional activator required for the development of normal hearing, sense of balance, and kidney function. FOXI2 may act as a transcriptional activator that plays a possible role in controlling cellular identity. Loss of function mutations in the FOXI2 gene may contribute to ectodermal dysplasia. FOXI3 plays a critical role in the development of the inner ear and jaw. It is a regulator of ectodermal development. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 77
37069 410800 cd20026 FH_FOXK Forkhead (FH) domain found in the Forkhead box protein K (FOXK) subfamily. The FOXK subfamily includes two winged helix transcription factors, FOXK1 and FOXK2. FOXK1, also called myocyte nuclear factor (MNF), acts as a transcriptional regulator that binds to the upstream enhancer region (CCAC box) of myoglobin genes. It positively regulates Wnt/beta-catenin signaling by translocating dishevelled (DVL) proteins into the nucleus. It also reduces virus replication, probably by binding the interferon stimulated response element (ISRE) to promote antiviral gene expression. In addition, FOXK1 plays important roles in multiple human cancers. FOXK2, also called cellular transcription factor ILF-1 or interleukin enhancer-binding factor 1, is a transcriptional regulator that recognizes the core sequence 5'-TAAACA-3'. It binds to NFAT-like motifs (purine-rich) in the IL2 promoter. It also binds to the HIV-1 long terminal repeat. FOXK2 may be involved in both positive and negative regulation of important viral and cellular promoter elements. In addition, FOXK2 plays a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 78
37070 410801 cd20027 FH_FOXL1 Forkhead (FH) domain found in Forkhead box protein L1 (FOXL1) and similar proteins. FOXL1, also called Forkhead-related protein FKHL11 or Forkhead-related transcription factor 7 (FREAC-7), acts as a transcription factor required for proper proliferation and differentiation in the gastrointestinal epithelium. It may play a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 98
37071 410802 cd20028 FH_FOXL2 Forkhead (FH) domain found in Forkhead box protein L2 (FOXL2) and similar proteins. FOXL2 is a transcriptional regulator that is essential for ovary differentiation and maintenance, and repression of the genetic program for somatic testis determination. Mutations in the FOXL2 gene cause blepharophimosis-ptosis-epicanthus inversus syndrome (BPES) types I and II, a rare genetic disorder. In BPES type I, a complex eyelid malformation is associated with premature ovarian failure (POF), whereas in BPES type II, the eyelid defect occurs as an isolated entity. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 89
37072 410803 cd20029 FH_FOXM Forkhead (FH) domain found in the Forkhead box protein M (FOXM) subfamily. The FOXM subfamily includes a winged helix transcription factor, FOXM1, which is also called Forkhead-related protein FKHL16, Hepatocyte nuclear factor 3 Forkhead homolog 11 (HFH-11), HNF-3/fork-head homolog 11, M-phase phosphoprotein 2, MPM-2 reactive phosphoprotein 2, transcription factor Trident, or Winged-helix factor from INS-1 cells. FOXM1 acts as a transcriptional factor regulating the expression of cell cycle genes essential for DNA replication and mitosis. It plays a role in the control of cell proliferation. It is also involved in DNA break repair, participating in the DNA damage checkpoint response. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 77
37073 410804 cd20030 FH_FOXN1-like Forkhead (FH) domain found in Forkhead box protein N1 (FOXN1) and similar proteins. The FOXN1-like group includes two FOXN subfamily winged helix transcription factors, FOXN1 and FOXN4. FOXN1, also called winged helix transcription factor nude, acts as a transcriptional regulator which regulates the development, differentiation, and function of thymic epithelial cells (TECs), both in the prenatal and postnatal thymus. FOXN4 acts as a transcription factor essential for neural and some non-neural tissue development, such as retina and lung, respectively. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 81
37074 410805 cd20031 FH_FOXN2-like Forkhead (FH) domain found in Forkhead box protein N2 (FOXN2) and similar proteins. The FOXN2-like group includes two FOXN subfamily winged helix transcription factors, FOXN2 and FOXN3. FOXN2, also called human T-cell leukemia virus enhancer factor (HTLF), is a potential tumor suppressor that can facilitate replication fork reversal. It acts as a transcription factor that binds to the purine-rich region in human T-cell leukemia virus long terminal repeat (HTLV-I LTR). It may be a potential therapeutic and radiosensitization target for lung cancer. FOXN3, also called checkpoint suppressor 1, acts as a transcriptional repressor that may be involved in DNA damage-inducible cell cycle arrests (checkpoints). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 76
37075 410806 cd20032 FH_FOXO Forkhead (FH) domain found in the Forkhead box protein O (FOXO) subfamily. The FOXO subfamily includes several winged helix transcription factors: FOXO1, FOXO3, FOXO4 and FOXO6. FOXO transcription factors are involved in the regulation of longevity via insulin and insulin-like growth factor signaling. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members. FOXO1, also called Forkhead box protein O1A or Forkhead in rhabdomyosarcoma (FKHR), is a transcription factor that is the main target of insulin signaling and regulates metabolic homeostasis in response to oxidative stress. FOXO3, also called AF6q21 protein or Forkhead in rhabdomyosarcoma-like 1 (FKHRL1), is a transcriptional activator which triggers apoptosis in the absence of survival factors, including neuronal cell death upon oxidative stress. It recognizes and binds to the DNA sequence 5'-[AG]TAAA[TC]A-3'. FOXO4, also called Fork head domain transcription factor AFX1, is a transcription factor involved in the regulation of the insulin signaling pathway. It binds to insulin-response elements (IREs) and can activate transcription of IGFBP1. FOXO6 acts as a transcriptional activator that may play an important role in tumor invasion, metastasis, and prognosis. The FH domain is a winged helix DNA-binding domain. 80
37076 410807 cd20033 FH_FOXP Forkhead (FH) domain found in the Forkhead box protein P (FOXP) subfamily. The FOXP subfamily includes four winged helix transcription factors, FOXP1-4. They are involved in the development of the central nervous system. Mutations in FOXP1, also called Mac-1-regulated Forkhead (MFH), lead to developmental delay, intellectual disability, autism spectrum disorder, speech delay, and dysmorphic features. FOXP2, also called CAG repeat protein 44 or Trinucleotide repeat-containing gene 10 protein, is a transcriptional repressor that may play a role in the specification and differentiation of lung epithelium. It may also play a role in developing neural, gastrointestinal and cardiovascular tissues. FOXP3, also called Scurfin, is a transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg). FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 73
37077 410808 cd20034 FH_FOXQ1-like Forkhead (FH) domain found in Forkhead box protein Q1 (FOXQ1) and similar proteins. FOXQ1, also called HNF-3/Forkhead-like protein 1 (HFH-1), or hepatocyte nuclear factor 3 Forkhead homolog 1, plays a role in hair follicle differentiation. It has also been reported to promote epithelial differentiation, inhibit smooth muscle differentiation, activate T cells and autoimmunity, and control mucin gene expression and granule content in stomach surface mucous cells. FOXQ1 is significantly associated with the pathogenesis of various tumor types including gastric, breast, colorectal, pancreatic, bladder and ovarian cancer, and glioma. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 79
37078 410809 cd20035 FH_FOXQ2-like Forkhead (FH) domain found in Forkhead box protein Q2 (FOXQ2) and similar proteins. FOXQ2 is the neurogenic ectoderm specification transcription factor that controls aboral development in cnidarians and anterior identity in bilaterians. It plays essential roles in epidermal development and central brain patterning. The foxQ2 gene is absent in placental mammals. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 95
37079 410810 cd20036 FH_FOXR Forkhead (FH) domain found in the Forkhead box protein R (FOXR) subfamily. The FOXR subfamily includes two winged helix transcription factors, FOXR1-2. FOXR1, also called Forkhead box protein N5 (FOXN5), is required for proper cell division and survival possibly via the p21 and mTOR pathways. FOXR2, also called Forkhead box protein N6 (FOXN6), is an important player in a wide range of cellular processes such as proliferation, migration, differentiation, and apoptosis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 87
37080 410811 cd20037 FH_FOXS1 Forkhead (FH) domain found in Forkhead box protein S1 (FOXS1). FOXS1, also called Forkhead-like 18 protein or Forkhead-related transcription factor 10 (FREAC-10), is a transcriptional repressor that suppresses transcription from the FASLG, FOXO3, and FOXO4 promoters. It may have a role in the organization of the testicular vasculature. It has also been implicated in energy turnover, motor function, and body weight. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 100
37081 410812 cd20038 FH_FOXA1 Forkhead (FH) domain found in Forkhead box protein A1 (FOXA1) and similar proteins. FOXA1, also called hepatocyte nuclear factor 3-alpha (HNF-3-alpha or HNF-3A) or transcription factor 3A (TCF-3A), acts as a transcription factor that is essential for epithelial lineage differentiation. It has been found to be upregulated in numerous cancers. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 112
37082 410813 cd20039 FH_FOXA2 Forkhead (FH) domain found in Forkhead box protein A2 (FOXA2) and similar proteins. FOXA2, also called hepatocyte nuclear factor 3-beta (HNF-3-beta or HNF-3B) or transcription factor 3B (TCF-3B), acts as a core transcription factor that controls cell differentiation. It is a key transcriptional regulator that maintains airway mucus homeostasis. It may also have an important role in bone metabolism. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 104
37083 410814 cd20040 FH_FOXA3 Forkhead (FH) domain found in Forkhead box protein A3 (FOXA3) and similar proteins. FOXA3, also called hepatocyte nuclear factor 3-gamma (HNF-3-gamma or HNF-3G), fork head-related protein FKH H3, or transcription factor 3G (TCF-3G), acts as an essential transcriptional regulator engaged in adipogenesis and energy metabolism. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 102
37084 410815 cd20041 FH_dFKH Forkhead (FH) domain found in Drosophila melanogaster protein fork head (dFKH) and similar proteins. dFKH promotes terminal as opposed to segmental development. In the absence of dFKH, this developmental switch does not occur. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 98
37085 410816 cd20042 FH_FOXB1 Forkhead (FH) domain found in Forkhead box protein B1 (FOXB1) and similar proteins. FOXB1, also called transcription factor FKH-5, is a winged helix transcription factor that controls development of mammary glands and regions of the central nervous system (CNS) that regulate the milk-ejection reflex. It is essential for access of mammillothalamic axons to the thalamus. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 92
37086 410817 cd20043 FH_FOXB2 Forkhead (FH) domain found in Forkhead box protein B2 (FOXB2) and similar proteins. FOXB2, also called transcription factor FKH-4, may act as a transcription factor. It may also act as a tumor suppressor; it has been found to inhibit the malignant characteristics of the pancreatic cancer cell line Panc-1 in vitro. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 110
37087 410818 cd20044 FH_FOXC1 Forkhead (FH) domain found in Forkhead box protein C1 (FOXC1) and similar proteins. FOXC1, also called Forkhead-related protein FKHL7, or Forkhead-related transcription factor 3 (FREAC-3), is a DNA-binding transcriptional factor that plays a role in a broad range of cellular and developmental processes such as the development of the eyes, bones, cardiovascular system, kidneys, and skin. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 93
37088 410819 cd20045 FH_FOXC2 Forkhead (FH) domain found in Forkhead box protein C2 (FOXC2) and similar proteins. FOXC2, also called Forkhead-related protein FKHL14, Mesenchyme fork head protein 1 (MFH-1 protein), or transcription factor FKH-14, acts as a transcriptional activator that might be involved in the formation of special mesenchymal tissues. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 90
37089 410820 cd20046 FH_FOXD1_D2-like Forkhead (FH) domain found in Forkhead box proteins FOXD1, FOXD2 and similar proteins. FOXD1, also called Forkhead-related protein FKHL8, or Forkhead-related transcription factor 4 (FREAC-4), is involved in transcriptional activation of Placental Growth Factor (PGF) and the complement component (C3) genes. It plays an important role in early embryonic development and organogenesis, and functions as an oncogene in several cancers. FOXD2, also called Forkhead-related protein FKHL17, or Forkhead-related transcription factor 9 (FREAC-9), is a probable transcription factor involved in embryogenesis and somatogenesis. It has been found that long noncoding RNA FOXD2 adjacent opposite strand RNA1 (lncRNA FOXD2-AS1) expression is upregulated in various human malignancies, including gastric, lung, bladder, colorectal, nasopharyngeal, esophageal, hepatocellular, thyroid and skin cancer. It is a promising candidate as a new biomarker and therapeutic target for cancer diagnosis/prognostication due to high tissue specificity and elevated efficiency. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 99
37090 410821 cd20047 FH_FOXD3 Forkhead (FH) domain found in Forkhead box protein D3 (FOXD3) and similar proteins. FOXD3, also called HNF3/FH transcription factor genesis, acts as a transcriptional repressor that binds to the consensus sequence 5'-A[AT]T[AG]TTTGTTT-3'. It also acts as a transcriptional activator. It promotes development of neural crest cells from neural tube progenitors and restricts neural progenitor cells to the neural crest lineage while suppressing interneuron differentiation. FOXD3 is required for maintenance of pluripotent cells in the pre-implantation and peri-implantation stages of embryogenesis. The FH domain is a winged helix DNA-binding domain. 97
37091 410822 cd20048 FH_FOXD4-like Forkhead (FH) domain found in Forkhead box protein D4 (FOXD4) and similar proteins. FOXD4, also called Forkhead-related protein FKHL9, Forkhead-related transcription factor 5 (FREAC-5), or myeloid factor-alpha, is essential for establishing neural cell fate and for neuronal differentiation. The family also includes Forkhead box protein D4-like proteins, FOXD4L1-6. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 96
37092 410823 cd20049 FH_FOXF1 Forkhead (FH) domain found in Forkhead box protein F1 (FOXF1) and similar proteins. FOXF1, also called Forkhead-related activator 1 (FREAC-1), Forkhead-related protein FKHL5, or Forkhead-related transcription factor 1, is a probable transcription activator for a number of lung-specific genes. FOXF1 mutations in sporadic and familial cases of alveolar capillary dysplasia with misaligned pulmonary veins (ACD/MPV) suggest its involvement in ACD/MPV and lung organogenesis. The role of FOXF1 in cancer is conflicting; its loss in some cancers suggests a tumor suppressive function, but its abundance in others is associated with protumorigenic and metastatic traits. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 99
37093 410824 cd20050 FH_FOXF2 Forkhead (FH) domain found in Forkhead box protein F2 (FOXF2) and similar proteins. FOXF2, also called Forkhead-related activator 2 (FREAC-2), Forkhead-related protein FKHL6, or Forkhead-related transcription factor 2, is a probable transcription activator for a number of lung-specific genes. It is involved in programming organogenesis and regulating epithelial-to-mesenchymal transition (EMT) and cell proliferation. FOXF2 dysregulation is critical for tumorigenesis of various tissue types. Its expression correlates with good prognosis in patients with early non-invasive stages of breast cancer, but with poor prognosis in advanced breast cancer. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 93
37094 410825 cd20051 FH_FOXJ2 Forkhead (FH) domain found in Forkhead box protein J2 (FOXJ2) and similar proteins. FOXJ2, also called Fork head homologous X (FHX), plays an important role in tumorigenesis, progression, and metastasis of certain cancers. It acts as a transcriptional activator that can bind to two different types of DNA binding sites. It is specifically expressed in meiotic spermatocytes in adult mouse testes and appears to promote meiotic progression during spermatogenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 82
37095 410826 cd20052 FH_FOXJ3 Forkhead (FH) domain found in Forkhead box protein J3 (FOXJ3) and similar proteins. FOXJ3 is a transcription factor which regulates sperm function. It transcriptionally activates Mef2c and regulates adult skeletal muscle fiber type identity. Polymorphisms in the FOXJ3 gene may be associated with the development of rheumatoid arthritis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 79
37096 410827 cd20053 FH_FOXI1 Forkhead (FH) domain found in Forkhead box protein I1 (FOXI1) and similar proteins. FOXI1 is also called Forkhead-related protein FKHL10, Forkhead-related transcription factor 6 (FREAC-6), Hepatocyte nuclear factor 3 Forkhead homolog 3 (HFH-3), or HNF-3/fork-head homolog 3. It is a master regulator of vacuolar H-ATPase proton pump subunits in the inner ear, kidney, and epididymis, and is required for the development of normal hearing, sense of balance, and kidney function. Its epididymal expression is required for male fertility. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 100
37097 410828 cd20054 FH_FOXK1 Forkhead (FH) domain found in Forkhead box protein K1 (FOXK1) and similar proteins. FOXK1, also called myocyte nuclear factor (MNF), acts as a transcriptional regulator that binds to the upstream enhancer region (CCAC box) of myoglobin genes. It positively regulates Wnt/beta-catenin signaling by translocating dishevelled (DVL) proteins into the nucleus. It also reduces virus replication, probably by binding the interferon stimulated response element (ISRE) to promote antiviral gene expression. In addition, FOXK1 plays important roles in multiple human cancers. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 101
37098 410829 cd20055 FH_FOXK2 Forkhead (FH) domain found in Forkhead box protein K2 (FOXK2) and similar proteins. FOXK2, also called cellular transcription factor ILF-1 or interleukin enhancer-binding factor 1, is a transcriptional regulator that recognizes the core sequence 5'-TAAACA-3'. It binds to NFAT-like motifs (purine-rich) in the IL2 promoter. It also binds to the HIV-1 long terminal repeat. FOXK2 may be involved in both positive and negative regulation of important viral and cellular promoter elements. In addition, FOXK2 plays a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 98
37099 410830 cd20056 FH_FOXN1 Forkhead (FH) domain found in Forkhead box protein N1 (FOXN1). FOXN1, also called winged helix transcription factor nude, acts as a transcriptional regulator which regulates the development, differentiation, and function of thymic epithelial cells (TECs), both in the prenatal and postnatal thymus. It is also an important factor in controlling the skin wound healing process, as it actively participates in re-epithelialization and is responsible for scar formation. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 97
37100 410831 cd20057 FH_FOXN4 Forkhead (FH) domain found in Forkhead box protein N4 (FOXN4). FOXN4 acts as a transcription factor essential for neural and some non-neural tissue development, such as retina and lung, respectively. During development of the central nervous system, FOXN4 is required in specifying the amacrine and horizontal cell fates from multipotent retinal progenitors, while suppressing alternative photoreceptor cell fates. In non-neural tissues, it plays an essential role in specifying the atrioventricular canal and is indirectly required for patterning the distal airway during lung development. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 106
37101 410832 cd20058 FH_FOXN2 Forkhead (FH) domain found in Forkhead box protein N2 (FOXN2). FOXN2, also called human T-cell leukemia virus enhancer factor (HTLF), is a potential tumor suppressor that can facilitate replication fork reversal. It acts as a transcription factor that binds to the purine-rich region in human T-cell leukemia virus long terminal repeat (HTLV-I LTR). It may be a potential therapeutic and radiosensitization target for lung cancer. FOXN2 has also been found to be down-regulated in breast cancer, where it regulates migration, invasion, and epithelial-mesenchymal transition. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 82
37102 410833 cd20059 FH_FOXN3 Forkhead (FH) domain found in Forkhead box protein N3 (FOXN3). FOXN3, also called checkpoint suppressor 1 (CHES1), acts as a transcriptional repressor that may be involved in DNA damage-inducible cell cycle arrests (checkpoints). It displays transcriptional inhibitory activity, and is involved in cell cycle regulation and tumorigenesis. Alterations in FOXN3 are found in a variety of cancers including melanoma, osteosarcoma, and hepatocellular carcinoma. FOXN3 also regulates hepatic glucose utilization/metabolism by regulating gluconeogenic substrate selection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 90
37103 410834 cd20060 FH_FOXO1 Forkhead (FH) domain found in Forkhead box protein O1 (FOXO1). FOXO1, also called Forkhead box protein O1A or Forkhead in rhabdomyosarcoma (FKHR), is a transcription factor that is the main target of insulin signaling and regulates metabolic homeostasis in response to oxidative stress. It binds to the insulin response element (IRE) with the consensus sequence 5'-TT[G/A]TTTTG-3' and the related Daf-16 family binding element (DBE) with the consensus sequence 5'-TT[G/A]TTTAC-3'. The FH domain is a winged helix DNA-binding domain. 99
37104 410835 cd20061 FH_FOXO3 Forkhead (FH) domain found in Forkhead box protein O3 (FOXO3). FOXO3, also called AF6q21 protein or Forkhead in rhabdomyosarcoma-like 1 (FKHRL1), is a transcriptional activator which triggers apoptosis in the absence of survival factors, including neuronal cell death upon oxidative stress. It recognizes and binds to the DNA sequence 5'-[AG]TAAA[TC]A-3'. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members. 83
37105 410836 cd20062 FH_FOXO4 Forkhead (FH) domain found in Forkhead box protein O4 (FOXO4) and similar proteins. FOXO4, also called Fork head domain transcription factor AFX1, is a transcription factor involved in the regulation of the insulin signaling pathway. It binds to insulin-response elements (IREs) and can activate transcription of IGFBP1. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members. 86
37106 410837 cd20063 FH_FOXO6 Forkhead (FH) domain found in Forkhead box protein O6 (FOXO6) and similar proteins. FOXO6 acts as a transcriptional activator that may play an important role in tumor invasion, metastasis and prognosis. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members. 88
37107 410838 cd20064 FH_FOXP1 Forkhead (FH) domain found in Forkhead box protein P1 (FOXP1). FOXP1, also called Mac-1-regulated Forkhead (MFH), is a transcription factor that is widely expressed and has a broad range of functions. It has been shown to have a role in cardiac, lung, and lymphocyte development. Deregulation of FOXP1 is an important contributor to the pathogenesis of diffuse large B-cell lymphoma (DLBCL), suggesting it may function as an oncogene. Loss of FOXP1 expression in breast cancer is associated with a worse outcome, suggesting that FOXP1 may function as a tumor suppressor in some tissues. Haploinsufficiency of the FOXP1 gene leads to a neurodevelopmental disorder termed FOXP1 syndrome, characterized by developmental delay, intellectual disability, autism spectrum disorder, speech delay, and dysmorphic features. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 87
37108 410839 cd20065 FH_FOXP2 Forkhead (FH) domain found in Forkhead box protein P2 (FOXP2). FOXP2, also called CAG repeat protein 44, or Trinucleotide repeat-containing gene 10 protein, is a transcriptional repressor that may play a role in the specification and differentiation of lung epithelium. It may also play a role in developing neural, gastrointestinal, and cardiovascular tissues. An arginine-to-histidine missense mutation (R553H) in the FOXP2 Forkhead domain has been linked to a severe speech and language disorder. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 82
37109 410840 cd20066 FH_FOXP3 Forkhead (FH) domain found in Forkhead box protein P3 (FOXP3) and similar proteins. FOXP3, also called Scurfin, is a transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg), which are required for maintaining self-tolerance. It may also have intrinsic regulatory function in conventional T (Tconv) cells. A deletion of the Forkhead domain arising from a frame-shift mutation in mouse FOXP3 is linked to the autoimmune disorder scurfy, and a similar congenital disease in humans is known as IPEX (immune dysregulation, polyendocrinopathy, enteropathy, X-linked syndrome). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 81
37110 410841 cd20067 FH_FOXP4 Forkhead (FH) domain found in Forkhead box protein P4 (FOXP4) and similar proteins. FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is not required for T cell development, but is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 87
37111 410993 cd20069 5TM_Oxa1-like Five transmembrane core domain of mitochondrial inner membrane protein Oxa1 and similar proteins. This group is composed mostly of the mitochondrial members of the YidC/Oxa1/Alb3 protein family of insertases, including mitochondrial inner membrane proteins Oxa1, Oxa1-like (Oxa1L), cytochrome c oxidase assembly protein 18 (Cox18, also called Oxa2), and Arabidopsis thaliana mitochondrial ALBINO3-like protein 3 (ALB33). It also includes Arabidopsis thaliana chloroplastic ALBINO3-like protein 2 (ALB32). Members of this group mediate the insertion of both mitochondrion-encoded precursors and nuclear-encoded proteins from the matrix into the mitochondrial inner membrane. Oxa1 and Cox18/Oxa2 are essential for the activity and assembly of cytochrome c oxidase, playing central roles in the translocation and export of the N-terminal and C-terminal parts, respectively, of the COX2 protein into the mitochondrial intermembrane space. ALB32 may be involved in the insertion of integral membrane proteins into the chloroplast thylakoid membranes. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function. 201
37112 410994 cd20070 5TM_YidC_Alb3 Five transmembrane core domain of membrane protein insertase YidC, Alb3, and similar proteins. This group is composed of the bacterial and chloroplastic members of the YidC/Oxa1/Alb3 protein family of insertases, including bacterial YidC, and chloroplastic ALBINO3 (Alb3) and Alb3-like proteins such as ALBINO3-like protein 1 (also called Alb4). Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC from Gram-negative bacteria contains an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. Alb3 and Alb3-like proteins are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. Alb3 acts independently and may also function cooperatively with the thylakoid cpSecYE translocase to insert proteins co-translationally into the thylakoid membrane, similar to bacterial YidC that can function with the SecYEG translocase. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function. 181
37113 380997 cd20071 SET_SMYD SET domain (including SET domain and post-SET domain) found in SET and MYND domain-containing protein, and similar proteins. The family includes SET and MYND domain-containing proteins, SMYD1-SMYD5. SMYD1 (EC 2.1.1.43; also termed BOP) is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and seems able to perform mono-, di-, and trimethylation. SMYD2 (also termed HSKM-B, or lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. SMYD4 functions as a potential tumor suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha. SMYD5 (also termed protein NN8-4AG, or retinoic acid-induced protein 15) functions as histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions. 122
37114 380998 cd20072 SET_SET1 SET domain (including post-SET domain) found in catalytic component of the Saccharomyces cerevisiae COMPASS complex and similar proteins. The family contains mostly fungal SET domains, including SET1 found in the catalytic component of the Saccharomyces cerevisiae COMPASS (complex of proteins associated with Set1). SET1 is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex. The activity of this catalytic domain is established through forming a complex with a set of core proteins; it is extensively contacted by Cps60 (Bre2), Cps50 (Swd1), and Cps30 (Swd3). 148
37115 380999 cd20073 SET_SUV39H_Clr4-like SET domain (including pre-SET and post-SET domains) found in Schizosaccharomyces pombe H3K9 methyltransferase Clr4, and similar proteins. This subfamily contains fission yeast Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (also known as Suv39h), the sole homolog of the mammalian SUV39H1 and SUV39H2 enzymes, that has a critical role in preventing aberrant heterochromatin formation. It is known to di- and tri-methylate Lys-9 of histone H3, a central heterochromatic histone modification, with its specificity profile most similar to that of the human SUV39H2 homolog. 259
37116 410850 cd20074 XPF_nuclease_Mus81 XPF-like nuclease domain of Mus81. Mus81 is a crossover junction endonuclease that interacts with Eme1 and Eme2 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. Mus81 may be required in mitosis for the processing of stalled or collapsed replication forks. Mus81 consists of the active nuclease domain with the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity and two helix-hairpin-helix (HhH2) domains. 150
37117 410851 cd20075 XPF_nuclease_XPF_arch nuclease domain of XPF found in archaea. XPF, also called DNA excision repair protein ERCC-4, or DNA repair protein complementing XP-F cells, or Xeroderma pigmentosum group F-complementing protein, is a 3'-flap repair endonuclease that cleaves 5' of ds/ssDNA interfaces in 3' flap structures, although it also cuts bubble, Y-DNA structures and mobile and immobile Holliday junctions. XPF cuts preferentially after pyrimidines and may continue to progressively cleave substrate upstream of the initial cleavage, at least in vitro. It may be involved in nucleotide excision repair. The nuclease domains of the catalytic subunits XPF have the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity but not for DNA junction binding. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links. 127
37118 410852 cd20076 XPF_nuclease_FAAP24 XPF-like nuclease domain of Fanconi anemia associated protein 24 kDa (FAAP24). FAAP24, also called Fanconi anemia core complex-associated protein 24, plays a role in DNA repair through recruitment of the Fanconi anemia (FA) core complex to damaged DNA. It regulates FANCD2 monoubiquitination upon DNA damage and induces chromosomal instability as well as hypersensitivity to DNA cross-linking agents, when repressed. FAAP24 may possess a high affinity toward single-stranded DNA (ssDNA). The nuclease domain of FAAP24 lacks the catalytic motif. The FANCM/FAAP24 complex is related to XPF/MUS81 endonucleases but lacks endonucleolytic activity. It binds branched DNA structures containing ssDNA regions, such as splayed-arm and 3'-flap DNA structures, and anchors the FA core complex to chromatin in repairing DNA interstrand crosslinks. 123
37119 410853 cd20077 XPF_nuclease_FANCM XPF-like nuclease domain of Fanconi anemia group M protein (FANCM). FANCM (EC 3.6.4.13), also called Fanconi anemia-associated polypeptide of 250 kDa (FAAP250), or protein Hef ortholog, or ATP-dependent RNA helicase FANCM, is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. In complex with CENPS and CENPX, it binds double-stranded DNA (dsDNA), fork-structured DNA (fsDNA) and Holliday junction substrates. In complex with FAAP24, it efficiently binds to single-strand DNA (ssDNA), splayed-arm DNA, and 3'-flap substrates. In vitro, on its own, FANCM strongly binds ssDNA oligomers and weakly fsDNA, but does not bind to dsDNA. 139
37120 410854 cd20078 XPF_nuclease_XPF_euk nuclease domain of XPF found in eukaryotes. XPF, also called DNA excision repair protein ERCC-4, or DNA repair protein complementing XP-F cells, or Xeroderma pigmentosum group F-complementing protein, is a DNA repair endonuclease that is a catalytic component of a structure-specific DNA repair endonuclease responsible for the 5-prime incision during DNA repair. It is involved in homologous recombination that assists in removing interstrand cross-links. The nuclease domains of the catalytic subunits XPF have the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity but not for DNA junction binding. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links. 136
37121 410855 cd20079 XPF_nuclease_ERCC1 XPF-like nuclease domain of DNA excision repair protein ERCC1. ERCC1 is a non-catalytic component of a structure-specific DNA repair endonuclease responsible for the 5'-incision during DNA repair. In conjunction with SLX4, ERCC1 is responsible for the first step in the repair of interstrand cross-links (ICL), as well as for homology-directed repair (HDR) of DNA double-strand breaks. ERCC1 participates in the processing of anaphase bridge-generating DNA structures, which consist in incompletely processed DNA lesions arising during S or G2 phase, and can result in cytokinesis failure. ERCC1 also plays a critical role in targeting the XPF-ERCC1 complex to DNA. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links. The critical motif, DX(n)ERKX(3)D, for endonuclease activity is absent in the nuclease domain of ERCC1. 115
37122 410856 cd20080 XPF_nuclease_EME-like XPF-like nuclease domain of the family of Essential Meiotic Endonucleases (EMEs) and similar proteins. The family of EMEs includes EME1 and EME2. EME1, also called MMS4 homolog (hMMS4), interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Its typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. EME1 may be required in mitosis for the processing of stalled or collapsed replication forks. EME2 interacts with MUS81 to form a DNA structure-specific endonuclease which cleaves substrates such as 3'-flap structures. MUS81-EME2 is responsible for fork cleavage and restart in human cells. The MUS81-EME2 protein, whose actions are restricted to S phase, is also responsible for telomere maintenance in telomerase-negative ALT (Alternative Lengthening of Telomeres) cells. The nuclease domain of EMEs is a nuclease-like domain which is involved in targeting the MUS81-EME heterodimer complex to DNA. The family also includes budding yeast Mms4 (also known as Eme1 in other organisms), a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The nuclease domain of Mms4 lacks the catalytic motif. 164
37123 410857 cd20081 XPF_nuclease_EME1 XPF-like nuclease domain of crossover junction endonuclease EME1. EME1, also called MMS4 homolog (hMMS4), interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Its typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. EME1 may be required in mitosis for the processing of stalled or collapsed replication forks. The nuclease domain of EME1 is a nuclease-like domain which is involved in targeting the MUS81-EME1 heterodimer complex to DNA. 179
37124 410858 cd20082 XPF_nuclease_EME2 XPF-like nuclease domain of crossover junction endonuclease EME2. EME2 interacts with MUS81 to form a DNA structure-specific endonuclease which cleaves substrates such as 3'-flap structures. MUS81-EME2 is responsible for fork cleavage and restart in human cells. The MUS81-EME2 protein, whose actions are restricted to S phase, is also responsible for telomere maintenance in telomerase-negative ALT (Alternative Lengthening of Telomeres) cells. The nuclease domain of EME2 is a nuclease-like domain which is involved in targeting the MUS81-EME2 heterodimer complex to DNA. 195
37125 410859 cd20083 XPF_nuclease_EME XPF-like nuclease domain of crossover junction endonucleases, EME1, EME2 and similar proteins. The Mus81-EME1 complex is a structure-selective endonuclease with a critical role in the resolution of recombination intermediates during DNA repair after interstrand cross-links, replication fork collapse, or double-strand breaks. ERCC4 domain of Eme1 is a nuclease-like domain which is involved in targeting the MUS81-EME1 heterodimer complex to DNA. 179
37126 410860 cd20085 XPF_nuclease_Mms4 XPF-like nuclease domain of Saccharomyces cerevisiae crossover junction endonuclease Mms4 and similar proteins. Budding yeast Mms4, also known as Eme1 in other organisms, is a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Typical substrates include 3'-flap structures, D-loops, replication forks with regressed leading strands and nicked Holliday junctions. The nuclease domain of Mms4 lacks the catalytic motif. 220
37127 380911 cd20167 Peptidase_M90-like M90 peptidase is a zinc-metallopeptidase. The M90 peptidase family includes the MtfA (Mlc Titration Factor A) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in a HEXXH motif. This family also includes uncharacterized proteins similar to MtfA peptidase. 208
37128 380912 cd20169 Peptidase_M90_mtfA Mlc titration factor A (MtfA) is a zinc metallopeptidase (M90 peptidase). This subfamily includes the Mlc Titration Factor A (MtfA; also known as YeeI or DgsA anti-repressor MtfA) which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). It can cleave synthetic substrates of both carboxypeptidases and aminopeptidases, with strongest activity towards the latter. Its biologically relevant substrate has yet to be identified. Although it interacts with the transcription repressor Mlc, it does not cleave it. However, Mlc seems to activate the peptidase activity of MtfA. MtfA is related to the catalytic domain of the anthrax lethal factor which is a zinc-dependent metalloprotease, targeting mitogen-activated protein kinase kinases (MAPKKs), and resulting in apoptosis, as well as the Mop (modulation of pathogenesis) protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site. 208
37129 380913 cd20170 Peptidase_M90-like uncharacterized M90 peptidase family-like proteins. This subfamily contains uncharacterized M90 peptidase-like domains, similar to the Mlc Titration Factor A (MtfA) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in an HEXXH motif. MtfA is related to the catalytic domain of the anthrax lethal factor and the Mop protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site. 210
37130 380332 cd20171 M34_peptidase Peptidase family M34 includes the C-terminal catalytic domain of anthrax lethal factor (ATLF), the protective antigen-binding domains of ATLF and edema factor, and Pro-Pro endopeptidase. Peptidase family M34 (also known as the anthrax lethal factor family) includes the C-terminal catalytic domain of anthrax lethal factor (ATLF, EC 3.4.24.83), and the N-terminal protective antigen-binding domains (PABDs) of ATLF and edema factor (EF). ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). At its N-terminus, ATLF has a PABD domain which lacks the hallmark metalloprotease motif HEXXH, and, at its C-terminus, the related catalytic domain has the HEXXH motif where the two His residues bind a single zinc atom, and the Glu has a catalytic role. EF acts as a Ca2+- and calmodulin-dependent adenylyl cyclase that can cause edema when associated with PA. EF is comprised of the PABD and an adenylyl cyclase domain. This family also includes Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1) which is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion. 156
37131 380910 cd20174 GH18_LinChi78-like_UFR an unknown function domain of Listeria innocua LinChi78 GH18 chitinase that is essential for its catalytic activity; found in similar chitinase-like proteins. This domain is referred to as an unknown-function region (UFR) and shown to be necessary for the hydrolytic activity of LinChi78 glycosyl hydrolase family 18 (GH18) chitinase (a product of the lin0153 gene) from the nonpathogenic bacterium Listeria innocua. The catalytic domain (CatD) of GH18 chitinases folds into a TIM barrel and has a conserved DXXDXDXE motif, in which the Glu residue functions as a catalytic residue; these chitinases contain additional domains such as a chitin-binding domain (ChBD) and/or a fibronectin type III-like (FnIII) domain. LinChi78 consists of a CatD, a FnIII, and a ChBD domain, and has this UFR region located between the CatD and the FnIII domain. Its catalytic site is composed of a typical CatD and a portion of this UFR, in particular the key Gln and Ile residues which are indispensable for LinChi78 to exhibit full catalytic activity. This UFR domain is also found in proteins where it is located between a CatD domain and DUF5011 and ChBD(s) domains. LinChi78 exhibits chitinase activity towards artificial and natural substrates, including colloidal chitin and chitin oligosaccharides of various lengths, and hydrolyzes these in a processive manner. Members of this family include some uncharacterized chitinase-like proteins from pathogenic bacteria such as Listeria monocytogenes and Clostridium botulinum. 136
37132 412038 cd20175 ThyX FAD-dependent thymidylate synthase (ThyX), mechanistically and structurally unrelated to thymidylate synthase (ThyA). This family contains FAD-dependent thymidylate synthase (also known as ThyX, Thy1, FDTS or thymidylate synthase complementing protein), found in many microbial genomes including several human pathogens, but absent in humans. This protein is mechanistically and structurally unrelated to thymidylate synthase (TS or ThyA) found in mammals. ThyA and ThyX both produce de novo thymidylate or deoxythymidine 5'-monophosphate (dTMP), an essential DNA precursor. The classic ThyA catalyzes the reductive methylation of deoxyuridine 5'-monophosphate (dUMP) to form dTMP, with methylenetetrahydrofolate (CH2H4folate) serving as a one-carbon donor and as the source of reductive power. On the other hand, ThyX contains FAD, tightly bound by a novel fold, that mediates hydride transfer from NADPH during catalysis. Consequently, CH2H4folate serves only as a carbon donor and tetrahydrofolate (and not dihydrofolate as in the case of ThyA) is produced. The differences between ThyX and ThyA are used for mechanism-based drugs to selectively inhibit FDTS and not have much effect on human and other eukaryotic TS. ThyX has been pursued for the development of new antibacterial agents against Mycobacterium tuberculosis, the causative agent of the widespread infectious disease tuberculosis (TB). It is also an attractive target for designing specific antibiotic drugs against many diseases such as ulcers, periodontal disease, and Lyme disease, as well as biological warfare agents such as anthrax, botulism, and typhus. 186
37133 380333 cd20183 M34_PPEP Pro-Pro endopeptidase (PPEP) and similar proteins; belongs to peptidase family M34. This subfamily includes the enzyme Pro-Pro endopeptidase (PPEP-1, EC 3.4.24.89, also known as Zmp1 (Clostridium difficile-type)), an extracellular metalloprotease showing a unique specificity for hydrolyzing a Pro-Pro bond. It belongs to peptidase family M34 and has the hallmark metalloprotease motif HEXXH, where the two His residues bind a single zinc atom, and the Glu has a catalytic role. PPEP-1 cleaves two C. difficile cell surface proteins (CD2831 and CD3246) involved in adhesion, one of which is encoded by the gene adjacent to the ppep-1 gene. There are multiple PPEP-1 cleavage sites located just above the site of attachment to the peptidoglycan layer. PPEP-1 may play a role in switching from an adhesive to a motile phenotype. Also included in this subfamily is Paenibacillus alvei PPEP-2, a secreted Pro-Pro endopeptidase. The cleavage motif of PPEP-2, PLP PVP, is distinct from that of PPEP-1 (VNP PVP). PPEP-2 cleavage sites in a cell-surface protein, with putative extracellular matrix-binding domains, and encoded by the adjacent gene, suggests a similar role of PPEP-2 in controlling bacterial adhesion. 184
37134 380334 cd20184 M34_peptidase_like uncharacterized subfamily of peptidase family M34. Peptidase family M34 (also known as the anthrax lethal factor family) includes the C-terminal catalytic domain of anthrax lethal factor (ATLF, EC 3.4.24.83), and the N-terminal protective antigen-binding domains (PABDs) of ATLF and edema factor (EF). ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). At its N-terminus, ATLF has a PABD domain which lacks the hallmark metalloprotease motif HEXXH, and, at its C-terminus, the related catalytic domain has the HEXXH motif where the two His residues bind a single zinc atom, and the Glu has a catalytic role. EF acts as a Ca2+- and calmodulin-dependent adenylyl cyclase that can cause edema when associated with PA; it is comprised of the PABD and an adenylyl cyclase domain. Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1) is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion. This uncharacterized subfamily includes proteins which have an N-terminal SLH domain, and proteins which may have an N-terminal IG-like domain; these proteins have the hallmark metalloprotease motif HEXXH. 131
37135 380335 cd20185 M34_PABD N-terminal protective antigen-binding domain (PABD) of Anthrax Toxin Lethal Factor (ATLF) and Edema Factor (EF), and similar domains; belongs to peptidase family M34. This subfamily includes the functional N-terminal protective antigen-binding domain (PABD) of the anthrax edema factor (EF), as well as the likely inactive N-terminal PABD of anthrax toxin lethal factor (ATLF), both secreted by Bacillus anthracis. ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF-PABD resembles the C-terminal catalytic domain of ATLF (EC 3.4.24.83) but lacks the hallmark metalloprotease motif HEXXH. This subfamily belongs to the peptidase family M34 (also known as the anthrax lethal factor family), which also includes the C-terminal catalytic domain of ATLF, and Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1), which is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion. 219
37136 410313 cd20187 T-box_TBX1_10-like DNA-binding domain of T-box transcription factor 1 and 10, and related T-box proteins. This subfamily includes TBX1 and TBX10. TBX1 is a T-box transcription factor which plays an important role in heart development and has been implicated in DiGeorge or 22q11.2 deletion syndrome. This syndrome is associated with various types of cardiac outflow tract (OFT) and vascular defects. Wnt5a is regulated by TBX1 in the second heart field (SHF). TBX1 is required to maintain the integrity of extracellular matrix-cell interactions in the SHF and this interaction is critical for cardiac (OFT) development. TBX10 is a putative T-box transcription factor. Diseases associated with TBX10 include Isolated Cleft Lip and Cleft Lip/cleft lip with or without cleft palate. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 189
37137 410314 cd20188 T-box_TBX2_3-like DNA-binding domain of T-box transcription factor 2 and 3, and related T-box proteins. This subfamily includes the T-box transcription factors TBX2 and TBX3 and similar proteins. TBX2 is an oncogenic transcription factor implicated in developmental processes, including coordinating cell fate, patterning and morphogenesis of a wide range of tissues and organs. It is overexpressed in several cancers, including melanoma and breast, and plays a key role during cardiac development. TBX2 is a negative regulator of promyelocytic leukemia protein (PML) function in cellular senescence, and it interacts with HP1 to recruit a repression complex to EGR1-responsive promoters to drive the proliferation of breast cancer cells. TBX3 has also been implicated in oncogenesis in breast cancer and melanoma. The tbx3 gene is downregulated by PML. TBX3 directly represses TBX2 under the control of the PRC2 complex in skeletal muscle and rhabdomyosarcoma. Also included in this family is the Drosophila melanogaster optomotor-blind protein (Omb, also known as lethal(1)optomotor-blind, or L(1)omb, or protein bifid) which controls many developmental processes such as wing, eye, and abdominal tergites and optic lobes, and induces epithelial cell migration and extrusion in vivo. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 185
37138 410315 cd20189 T-box_TBX4_5-like DNA-binding domain of T-box transcription factor 4 and 5, and related T-box proteins. This subfamily includes the T-box transcription factors TBX4 and TBX5 which play important roles in vertebrate limb and heart development, and in lung and trachea development. TBX4 is needed for normal skeletal and muscular hindlimb development and is involved in super-enhancer-driven transcriptional programs underlying features specific to lung fibroblasts. TBX5 plays a role in regulating cardiac conduction system function, and in coordinating forelimb muscle pattern. Mutations in human TBX5 and TBX4 are associated with Holt-Oram syndrome and Small Patella syndrome, respectively. Both syndromes are characterized by limb defects in addition to other abnormalities. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 185
37139 410316 cd20190 T-box_TBX6_VegT-like DNA-binding domain of T-box transcription factor 6, VegT and related T-box proteins. This subfamily includes the transcriptional regulators TBX6 and VegT. TBX6 plays an essential role in the fate determination of axial stem to become either neural or mesodermal. It also plays an essential role in the regulation of left/right patterning in mouse embryos through effects on nodal cilia and perinodal signaling. VegT (also known as Antipodean, Brat and Xombi) is required in early Xenopus embryos for the formation of both the mesoderm and endoderm germ layers. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 183
37140 410317 cd20191 T-box_TBX15_18_22-like DNA-binding domain of T-box transcription factor 15, 18 and 22, and related T-box proteins. This subfamily includes the transcriptional regulators TBX15, TBX18 and TBX22 which are involved in various developmental processes. TBX15 (also known as TBX14) plays an important role in the development of the skeleton of the limb, vertebral column and head, possibly through its control of the number of mesenchymal precursor cells and chondrocytes; it also plays a role in the differentiation of brown and brite adipocytes. TBX18 is involved in the developmental processes of a variety of tissues and organs, including the ureter, vertebral column, epicardium and coronary vessels; it is important for the development of the head portion of the sino atrial node (SAN). Mutations in the T-box transcription factor gene TBX22 are found in X-linked Cleft Palate with or without Ankyloglossia syndrome (CPX syndrome), and associated with cleft lip and palate, and tooth agenesis. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 194
37141 410318 cd20192 T-box_TBXT_TBX19-like DNA-binding domain of T-box transcription factor T, T-box transcription factor 19 and related T-box proteins. Tbx19 (also known as Tpit) is a T-box factor restricted to two pituitary (pro-opiomelanocortin) POMC-expressing lineages, the corticotrophs and melanotrophs; it controls terminal differentiation of these lineages. TBX19 activates POMC gene transcription with the cooperation of another transcription factor Pitx1. TBXT, also known as Brachyury protein, or protein T, is a transcription factor needed for posterior mesoderm formation and differentiation as well as for the notochord development during embryogenesis. It binds to a 24 base-pair (bp) palindromic site (called the T site) and activates gene transcription when bound to such a site. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. TBXT is the founding member of the T-box family, members of which share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 180
37142 410319 cd20193 T-box_TBX20-like DNA-binding domain of T-box transcription factor 20 and related T-box proteins. TBX20 is a T-box transcriptional factor which functions in embryonic development and its deficiency is associated with congenital heart disease. It acts both as a transcriptional activator and a repressor required for cardiac development, and has key roles in maintaining the functional and structural phenotypes in the adult heart. The TBX20-cardiac transcription factor CASZ1 protein complex is protective against dilated cardiomyopathy and is essential for maintaining cardiac homeostasis. TBX20 has also been shown to regulate angiogenesis through the PROK2-PROKR1 (prokineticin receptor 1) pathway and is involved in both, pathological and developmental, angiogenesis. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 190
37143 410320 cd20194 T-box_TBR1_2_21-like DNA-binding domain of T-box brain protein 1 and 2, T-box transcription factor 21 and related T-box proteins. TBX21 (also known as T-cell-specific T-box transcription factor T-bet or transcription factor TBLYM) is a lineage-defining transcription factor which directs T helper type 1 (Th1) cell differentiation. This subfamily includes TBR1 (also known as T-brain-1, or TES-56), which is a neuron-specific transcription factor involved in forebrain development, and TBR2 (also known as Eomesodermin, Eomes, or T-brain-2), which is associated with neurogenesis, cardiogenesis and tumor immune response. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 185
37144 410321 cd20195 T-box_MGA-like DNA-binding domain of MAX gene-associated protein and related T-box proteins. MGA (also known as MGAP, MAX dimerization protein, MAD5, MXD5) is a dual-specificity transcription factor that regulates the expression of both MAX-network and T-box family target genes. MGA functions as a repressor or an activator; it binds to the 5'-AATTTCACACCTAGGTGTGAAATT-3' core sequence. Its function is activated by heterodimerization with MAX. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 186
37145 410322 cd20196 T-box_TBX6 DNA-binding domain of T-box transcription factor 6, and related T-box proteins. TBX6 is a T-box transcription factor which plays an essential role in the fate determination of axial stem to become either neural or mesodermal. It also plays an essential role in the regulation of left/right patterning in mouse embryos, through effects on nodal cilia and perinodal signaling. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 182
37146 410323 cd20197 T-box_VegT-like DNA-binding domain of Xenopus VegT and related T-box proteins. VegT (also known as Antipodean, Brat and Xombi) is a T-box transcription factor required in early Xenopus embryos for the formation of both the mesoderm and endoderm germ layers. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 183
37147 410324 cd20198 T-box_TBX15-like DNA-binding domain of T-box transcription factor 15 and related T-box proteins. TBX15 (also known as TBX14) plays an important role in the development of the skeleton of the limb, vertebral column and head, possibly through its control of the number of mesenchymal precursor cells and chondrocytes. TBX15 also plays a role in the differentiation of brown and brite adipocytes. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 198
37148 410325 cd20199 T-box_TBX18_like DNA-binding domain of T-box transcription factor 18 and related T-box proteins. TBX18 acts as a transcription repressor involved in the developmental processes of a variety of tissues and organs, including the ureter, vertebral column, epicardium and coronary vessels. TBX18 is important for the development of the head portion of the sinoatrial node (SAN); SAN is the pacemaker region of the heart that initiates each heartbeat. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 195
37149 410326 cd20200 T-box_TBX22-like DNA-binding domain of T-box transcription factor 22 and related T-box proteins. TBX22 is a transcriptional regulator involved in developmental processes. Mutations in the T-Box transcription factor gene TBX22 are found in X-linked Cleft Palate with or without Ankyloglossia syndrome (CPX syndrome). TBX22 mutation is also associated with cleft lip and palate, and tooth agenesis. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 194
37150 410327 cd20201 T-box_TBX19-like DNA-binding domain of T-box transcription factor 19 and related T-box proteins. Tbx19 (also known as Tpit) is a T-box factor restricted to two pituitary (pro-opiomelanocortin) POMC-expressing lineages, the corticotrophs and melanotrophs; it controls terminal differentiation of these lineages. TBX19 activates POMC gene transcription with the cooperation of another transcription factor Pitx1. Mutations of the human TPIT gene cause early onset pituitary adrenocorticotrophic hormone (ACTH) deficiency. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 183
37151 410328 cd20202 T-box_TBXT DNA-binding domain of T-box transcription factor T and related T-box proteins. TBXT, also known as Brachyury protein, or protein T, is a transcription factor needed for posterior mesoderm formation and differentiation as well as for the notochord development during embryogenesis. It binds to a 24 base-pair (bp) palindromic site (called the T site) and activates gene transcription when bound to such a site. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. TBXT is the founding member of the T-box family, members of which share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 179
37152 410329 cd20203 T-box_TBX21 DNA-binding domain of T-box transcription factor 21 and related T-box proteins. TBX21 (also known as T-cell-specific T-box transcription factor T-bet or transcription factor TBLYM) is a lineage-defining transcription factor which directs T helper type 1 (Th1) cell differentiation. It initiates Th1 lineage development from naive T helper precursor cells both by initiating the Th1 genetic programs and by inhibiting the opposing Th2 and Th17 lineage-commitment programs. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 191
37153 410330 cd20204 T-box_TBR1 DNA-binding domain of T-box brain protein 1 and related T-box proteins. TBR1 (also known as T-brain-1 or TES-56) is a neuron-specific transcription factor of the T-box family and involved in forebrain development. It has been recognized as a high-confidence risk gene for autism spectrum disorders (ASD); it regulates the expression of ASD-related genes that are critical for cortical development. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 191
37154 410331 cd20205 T-box_TBR2 DNA-binding domain of T-box brain protein 2 and related T-box proteins. TBR2 (also known as Eomesodermin, Eomes, or T-brain-2) is a member of the T-box family of transcription factors and is associated with neurogenesis, cardiogenesis and tumor immune response. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 191
37155 411001 cd20206 YbbR YbbR domain. YbbR domains occur as tandem repeats or in architectures together with other domains. Putative roles in cell growth, cell division, and/or virulence have been suggested for this domain. 79
37156 380908 cd20207 Bbox2_GefO-like B-box-type 2 zinc finger found in Ras guanine nucleotide exchange factor O (GefO) and similar proteins. Ras guanine-nucleotide exchange factors (RasGEFs) activate Ras by catalyzing the replacement of GDP with GTP, and thus lie near the top of many signaling pathways. They are important for signaling in development and chemotaxis in many organisms. Ras guanine nucleotide exchange factor O (GefO), also known as RasGEF domain-containing protein O, is faintly expressed during development of Dictyostelium discoideum. It contains a C3HC4-type RING finger, a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs), a REM (Ras exchanger motif) domain, and a RasGEF domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. 40
37157 380909 cd20208 Bbox1_DUF2009 B-box-type 1 zinc finger found in DUF2009 domain-containing proteins and similar proteins. This group is composed of uncharacterized proteins containing a zinc finger B-box domain and a DUF2009 domain, and similar zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 43
37158 380784 cd20214 PFM_CEL-III-like pore-forming module of hemolytic lectin Cucumaria echinata CEL-III and similar aerolysin-type beta-barrel pore-forming proteins. Cucumaria echinata CEL-III is a Ca(2+)-dependent and galactose-specific lectin, which is cytotoxic to some cultured cell lines, has strong hemolytic activity toward human and rabbit erythrocytes, and anti-malarial activity. Hemolysis results from ion-permeable pores formed from CEL-III oligomers in the target cell membrane. Members of this group include CEL-III isoforms: CEL-III-L1, CEL-III-L2, CEL-III-S1, CEL-III-S2, and CEL-III-LS1. Many proteins belonging to this group have two N-terminal ricin-type carbohydrate-binding domains which adopt beta-trefoil folds. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). The CEL-III oligomer in the membrane may be composed of six monomers. 124
37159 380785 cd20215 PFM_LSL-like pore-forming module of Laetiporus sulphureus LSL lectin and similar aerolysin-type beta-barrel pore-forming proteins. LSL is a lectin, produced by the parasitic mushroom Laetiporus sulphureus, which exhibits hemolytic and hemagglutinating activities. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 164
37160 380786 cd20216 PFM_HFR-2-like pore-forming module of wheat HFR-2 toxin, FEM32, and similar aerolysin-type beta-barrel pore-forming proteins. HFR-2 is a wheat cytolytic toxin which may normally function in defense against certain insects or pathogens. The Hfr-2 gene is upregulated in virulent Hessian fly larval feeding. The HFR-2 protein may insert in plant cell membranes at the feeding sites and by forming pores provide water, ions and other small nutritive molecules to the developing larvae. This group also contains FEM32, a flower-specific lectin-like protein from the dioecious plant Rumex acetosa, which alters flower development and induces male sterility in transgenic tobacco. It has been suggested that the FEM32 gene activates some form of programmed cell death (PCD), a process that could be mediated by the action of its lectin domains for binding to specific glycoproteins on the cell membrane and facilitated by the formation of pore structures in the membranes and the subsequent leakage of the cytosolic content through its pore-forming aerolysin domain. Most proteins belonging to this group have N-terminal agglutinin (also known as amaranthin) lectin domains; most have two agglutinin domains, in combination with one aerolysin domain. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 152
37161 380787 cd20217 PFM_agglutinin-like pore-forming module (PFM) of uncharacterized proteins which have agglutinin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have an N-terminal agglutinin (also known as amaranthin) lectin domain; some have fascin-like domains which adopt a beta-trefoil topology. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 150
37162 380788 cd20218 PFM_aerolysin pore-forming module of aerolysin and similar aerolysin-type beta-barrel pore-forming proteins. Aerolysin is a cytolytic bacterial toxin that forms pores in the host membrane, leading to destruction of the membrane permeability barrier and host cell death. Another member of this family is alpha-toxin from Clostridium septicum, the main virulence factor of this bacterium, known for causing non-traumatic gas gangrene. Many proteins belonging to this group have an N-terminal APT domain; an APT domain is the N-terminal domain of aerolysin and pertussis toxin and has a type-C lectin-like fold. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 144
37163 380789 cd20219 PFM_physalysin-1-like pore-forming module of Physella acuta physalysin1 and similar aerolysin-type beta-barrel pore-forming proteins. From a comparative immunological study of the snail Physella acuta, physalysin1 was identified as one of three physalysins in the snail. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 125
37164 380790 cd20220 PFM_natterin-3-like pore-forming module of Thalassophryne nattereri fish venom natterins 1-4, and similar aerolysin-type beta-barrel pore-forming proteins. This group includes 4 of the 5 Thalassophryne nattereri fish venom natterins: natterin-1, -2, -3, and -4. Natterins have kininogenase activity, kallikrein activity, and are allodynic and edema inducing. They also cleave type I and type IV collagen, resulting in necrosis of the affected cells. In contrast to their edema-inducing activity, natterins also have anti-inflammatory effects through inhibition of interactions between leukocytes and the endothelium, and reduction in neutrophil accumulation. Many proteins belonging to this group have an N-terminal DUF3421 domain. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 152
37165 380791 cd20221 PFM_Dln1-like pore-forming module of Danio rerio Dln1, and similar aerolysin-type beta-barrel pore-forming proteins. Since Danio rerio Dln1 has a specific affinity towards high-mannose glycans, which are common on the surface of viruses and fungi, it has been suggested that it may play a defense role. Members of this group also include lamprey immune protein (LIP), a defense molecule derived from lamprey supraneural body tissue which has efficient cytocidal actions against tumor cells. Many proteins belonging to this group have an N-terminal Jacalin-like lectin domain. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 168
37166 380792 cd20222 PFM_parasporin-2-like pore-forming module of parasporin-2, hydralysin and similar aerolysin-type beta-barrel pore-forming proteins. Bacillus thuringiensis strain A1547 parasporin-2 (PS2, also named Cry46Aa1) is an anti-cancer protein which causes specific cell damage via PS2 oligomerization in the cell membrane. Glycosylphosphatidylinositol (GPI)-anchored proteins may act as co-receptors in the cytocidal action of PS2. This family also includes hydralysin (Hln-1) and Hln-2 produced by the green hydra Chlorohydra viridissima. Hydralysin is a paralysis-inducing protein not found in the stinging cells (nematocytes), with a cell type-selective cytolytic activity; it binds erythrocyte membranes and forms discrete pores. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 147
37167 380793 cd20223 PFM_epsilon-toxin-like pore-forming module of Clostridium perfringens epsilon-toxin and similar aerolysin-type beta-barrel pore-forming proteins. Clostridium perfringens epsilon-toxin is responsible for fatal enterotoxemia in ungulates. It forms a heptamer in the lipid rafts of Madin-Darby Canine Kidney (MDCK) cells, leading to cell death; its oligomer formation is induced by activation of neutral sphingomyelinase. This group also includes an insecticidal crystal protein Cry14-4 (encoded on plasmid pBMBt1 of Bacillus thuringiensis serovar darmstadiensis). Also included is pXO2-60 (a protein from the pathogenic pXO2 plasmid of Bacillus anthracis) which harbors a unique ubiquitin-like fold domain at the C-terminus of the aerolysin-like domain, and is involved in virulence. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 144
37168 380794 cd20224 PFM_alpha-toxin-like pore-forming module of Clostridium septicum alpha-toxin and similar aerolysin-type beta-barrel pore-forming proteins. Clostridium septicum alpha-toxin is the main virulence factor of this bacterium, known for causing non-traumatic gas gangrene. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 121
37169 380795 cd20225 PFM_lysenin-like pore-forming module of lysenin and similar aerolysin-type beta-barrel pore-forming proteins. Lysenin (also known as Efl1) is a sphingomyelin-binding defense protein found in the coelomic fluid of the annelid earthworm Eisenia fetida. This group also contains lysenin-related proteins LRP-1, LRP-2, and LRP-3 from Eisenia sp. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 150
37170 380796 cd20226 PFM_Cry51Aa-like pore-forming module of Bacillus thuringiensis insecticidal Cry51A toxin, Bacillus thuringiensis cytotoxic parasporin-5 and similar aerolysin-type beta-barrel pore-forming proteins. Bacillus thuringiensis parasporin-5 has strong cytocidal activity against several types of cancer cells and may or may not have insecticidal activity. Cry51A toxin is toxic to coleopteran (beetle) larvae. Other members of this family include Bacillus thuringiensis Cry15Aa which is toxic to lepidopteran (butterfly and moth) larvae. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 172
37171 380797 cd20227 PFM_CPE-like CPE (Clostridium perfringens enterotoxin), HA-70 type C, and similar aerolysin-type beta-barrel pore-forming proteins. This domain is also known as the clenterotox domain (Clostridium enterotoxin). Clostridium perfringens enterotoxin is the major virulence determinant for C. perfringens type-A food poisoning. After binding to its receptors, which include particular human claudins, the toxin forms pores in the cell membrane. This family also includes HA-70 type C, a component of the haemagglutinin complex of Clostridium botulinum type C toxin. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 162
37172 380798 cd20228 PFM_TDP-like pore-forming module of Flammulina velutipes transepithelial electrical resistance (TEER)-decreasing protein, and similar aerolysin-type beta-barrel pore-forming proteins. Flammulina velutipes TEER-decreasing protein (also known as flammutoxin, FTX), is a pore-forming hemolytic protein known to cause a rapid decrease in TEER and a parallel increase in paracellular permeability in the human intestinal epithelial Caco-2 cell monolayer. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 118
37173 380799 cd20229 PFM_tachylectin-like pore-forming module (PFM) of uncharacterized proteins having tachylectin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have tachylectin domain(s), N-terminal to this PFM; some also have an immunoglobulin (Ig) domain. Tachylectins are lectins which bind N-acetylglucosamine and N-acetylgalactosamine. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 148
37174 380800 cd20230 PFM_EP37-like pore-forming module of Cynops pyrrhogaster EP37, and similar aerolysin-type beta-barrel pore-forming proteins. Cynops pyrrhogaster (Japanese newt) EP37 is an epidermis-specific protein which has a non-lens beta/gamma-crystallin domain in tandem and N-terminal to this pore-forming module. C. pyrrhogaster has several EP37-like proteins present in skin, gastric epithelium and fundic glands of an adult newt and in the swimming larva. This group also includes the alpha subunit of Bombina maxima betagamma-CAT (a non-lens betagamma-crystallin (alpha-subunit) and trefoil factor (beta subunit) complex) identified from skin secretions. Betagamma-CAT shows potent hemolytic activity on mammalian erythrocytes. Many proteins belonging to this group have N-terminal crystallin (beta/gamma crystallin) domain(s). Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 146
37175 380801 cd20231 PFM_jacalin-like pore-forming module of uncharacterized proteins which have an N-terminal jacalin-like lectin domain, and similar aerolysin-type beta-barrel pore-forming proteins. Jacalin-like lectins are sugar-binding protein domains. Proteins having these lectin domains may bind mono- or oligosaccharides with high specificity. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 150
37176 380802 cd20232 PFM_crystallin-like pore-forming module (PFM) of uncharacterized proteins which have N-terminal crystallin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have N-terminal crystallin (beta/gamma crystallins) domain(s). Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 151
37177 380803 cd20233 PFM_unzipped-like pore-forming module (PFM) of proteins having a DUF3421 domain including Drosophila unzipped, honey bee anarchy 1, and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have N-terminal DUF3421. Drosophila melanogaster unzipped protein is required for normal axon patterning during neurogenesis, and honey bee anarchy 1 may play a role in worker sterility in a social insect. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 134
37178 380804 cd20234 PFM_fascin-like pore-forming module (PFM) of uncharacterized proteins which have N-terminal fascin-like domain, and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have an N-terminal fascin-like domain which adopts a beta-trefoil topology. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 139
37179 380805 cd20235 PFM_spherulin-2a-like pore-forming module of Physarum polycephalum spherulin-2a, Plodia interpunctella follicular epithelium yolk protein subunit YP4, and similar aerolysin-type beta-barrel pore-forming proteins. Spherulin 2a is a coat glycoprotein produced during encystment from the slime mold, Physarum polycephalum. YP4 is one of two subunits of the follicular epithelium yolk protein from Plodia interpunctella and other pyralid moths; it is produced in the follicle cells during vitellogenesis, and after secretion it is taken up into the oocyte and stored in the yolk spheres for utilization during embryogenesis. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 150
37180 380806 cd20236 PFM_SP17-like pore-forming module-like domain of Phlebotomus argentipes 29 kDa salivary protein SP17, and similar aerolysin-type beta-barrel pore-forming proteins. Members include two putative secreted proteins from the salivary glands of Phlebotomus argentipes: 29 kDa salivary protein SP17 and 30 kDa salivary protein SP15. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 117
37181 380807 cd20237 PFM_LIN24-like pore-forming module of Caenorhabditis elegans LIN-24 and similar aerolysin-type beta-barrel pore-forming proteins. The process of cytotoxic cell death occurs in Caenorhabditis elegans containing mutations in either of lin-24 and lin-33. The cytotoxicity caused by mutation of either gene requires the function of the other. Genes required for the engulfment of apoptotic corpses function in the cytotoxic cell deaths induced by mutations in lin-24 and lin-33. It has been proposed that Caenorhabditis elegans LIN-24 may function to interact with bacterial toxins having similarity with it, and inactivate these, thereby allowing C. elegans to consume or survive exposure to bacteria that produce such toxins. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 120
37182 380808 cd20238 PFM_ABFB-like pore-forming module (PFM) of uncharacterized proteins which have an N-terminal ABFB (alpha-L-arabinofuranosidase B) domain, and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have a PFM C-terminal to an ABFB domain. Alpha-L-arabinofuranosidase (Araf-ase, EC 3.2.1.55) belongs to the glycosyl hydrolase family GH54, and in Aspergillus niger exhibits both Araf-ase (EC 3.2.1.55) and alpha-D-galactofuranosidase (Galf-ase) activities, with Galf-ase activity being lower than Araf-ase activity. Some members have a Ricin-type carbohydrate-binding domain which adopts a beta-trefoil fold. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 146
37183 380809 cd20239 PFM_aerolysin-like pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 145
37184 380810 cd20240 PFM_aerolysin-like pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 145
37185 380811 cd20241 PFM_aerolysin-like pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 139
37186 380812 cd20242 PFM_aerolysin-like pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 144
37187 380781 cd20243 NoBody Negative regulator of P-body association. Negative regulator of P-body association, also called P-body dissociating protein or protein NoBody (non-annotated P-body dissociating polypeptide), is a microprotein that interacts with mRNA decapping proteins, which remove the 5' cap from mRNAs to promote 5'-to-3' decay. It localizes to P-bodies, which are mRNA-decay-associated RNA-protein granules. NoBody promotes dispersal of P-body components and is likely to play a role in the mRNA decapping process. Decapping proteins participate in mRNA turnover and nonsense-mediated decay (NMD). 68
37188 380780 cd20244 Toddler Apelin receptor early endogenous ligand, also called Toddler. Apelin receptor early endogenous ligand, also called protein Elabela or protein Toddler, is an endogenous peptidic ligand of the G protein-coupled receptor APJ, also called apelin receptor (APLNR). The apelin/APJ axis contributes to maintaining homeostasis in normal and pathological hearts, and is also involved in heart development, including endoderm differentiation, heart morphogenesis and coronary vascular formation. Elabela/Toddler plays crucial roles in heart development and disease conditions, presumably at time points or at areas of the heart different from apelin. It is a potential therapeutic peptide, showing beneficial effects on body fluid homeostasis, cardiovascular health, and renal insufficiency, as well as potential benefits for metabolism and diabetes. 53
37189 380778 cd20245 humanin humanin and similar peptides. Humanin (HN) is a peptide encoded in the mitochondrial genome by the 16S ribosomal RNA gene MT-RNR2. HN has neuroprotective, anti-inflammatory, anti-apoptotic, anti-aging, and anti-fibrillogenic properties through interaction with a series of targets including BAX, IGFBP-3 (insulin-like growth factor binding protein 3), and a trimeric CNTFR/WSX-1/gp130 complex. Endogenous HN is both an intracellular and a secreted protein, and has been detected in brain, retinal pigment epithelium, blood vessels, pancreatic beta cells, tumors, testes and other tissues, and is also present in serum and in human seminal plasma. Single amino acid substitutions of HN can lead to significant changes in its potency and biological functions. There are multiple nuclearly-encoded HN isoforms which may be functional genes regulated in a tissue- and factor-specific manner. HN has potential as a therapeutic target for neurodegenerative and cardiovascular diseases, diabetes, male infertility, and cancer; it may have value as a biomarker for these diseases. Humanin analogs and peptide mimetics have been developed which show promising results in preclinical models of degenerative diseases. 24
37190 380775 cd20246 CASIMO1 Cancer Associated Small Integral Membrane Open reading frame 1 (CASIMO1). The Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein), controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This family contains two variants expressed on different chromosomes in humans, small integral membrane protein 5 (SMIM5) and small integral membrane protein 22 (SMIM22). 68
37191 380779 cd20247 DWORF DWarf Open Reading Frame (DWORF). DWarf Open Reading Frame (DWORF) is a small protein that plays a key role in heart muscle contraction. DWORF, Sarcolipin (SLN), and myoregulin (MRLN) are transmembrane regulators of the sarcoplasmic reticulum calcium transporting ATPase (SERCA). DWORF enhances SERCA activity by displacing phospholamban (PLN), a potent SERCA inhibitor. This makes DWORF an attractive candidate for a heart failure therapeutic. DWORF is also present in slow-twitch skeletal muscle fibers. 35
37192 380749 cd20248 phospholamban_like phospholamban, sarcolipin, and sarcolamban family of bioactive peptides. Vertebrate phospholamban (PLN), sarcolipin (SLN), and invertebrate sarcolamban (SCL) constitute a family of bioactive peptides. They are involved in the regulation of Ca2+ traffic, and their alteration can result in irregular muscle contractions. Invertebrate SCL (SCLA and SCLB) are encoded within a single putative noncoding transcript, pncr003:2L; vertebrate PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. SLN in skeletal muscle has also been shown to amplify calcineurin signaling, and is a potential therapeutic target for the management of muscular dystrophy, as it is upregulated in the skeletal muscle of the classical mouse model for Duchenne muscular dystrophy. 27
37193 380772 cd20249 Hemotin Hemotin. Hemotin is a transmembrane alpha-helical microprotein localized to early endosomes in hemocytes (Drosophila macrophages), where it regulates endosomal maturation during phagocytosis by repressing the cooperation of 14-3-3zeta with specific phosphatidylinositol enzymes. Hemocytes are professional phagocytes tasked with removing dying cells and microorganisms invading the body. Drosophila hemotin mutants accumulate undigested phagocytic material inside enlarged endolysosomes, resulting in reduced ability to fight bacteria and severely reduced life span. 86
37194 380750 cd20250 Phospholamban Phospholamban bioactive peptide and similar proteins. Vertebrate phospholamban (PLN) belongs to a family of bioactive peptides which includes vertebrate sarcolipin (SLN), and the invertebrate sarcolamban (SCLA, and SCLB). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. PLN and SLN differ in their interaction with SERCA; PLN is an affinity modulator of SERCA. It is thought to form a pentamer in the membrane. 52
37195 380755 cd20251 Complex1_LYR_SF LYR (leucine-tyrosine-arginine) motif found in Complex1_LYR-like superfamily. The Complex1_LYR-like superfamily consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. The human genome has at least ten LYR proteins that were predominantly identified as mitochondrial proteins. Some family members were also found in the cytosol or nucleus. LYR motif-containing protein 4 (LYRM4) represents the only LYR protein that is directly involved in the first steps of Fe-S cluster generation. Other LYR proteins have been identified as accessory subunits or assembly factors of mitochondrial OXPHOS (oxidative phosphorylation) complexes I, II, III and V, and they play specific roles in acetate metabolism. 57
37196 380751 cd20253 Sarcolipin Sarcolipin bioactive peptide and similar proteins. Vertebrate sarcolipin (SLN) belongs to a family of bioactive peptides which includes phospholamban (PLN), and invertebrate sarcolamban (SCL). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. PLN and SLN differ in their interaction with SERCA. SLN in skeletal muscle has also been shown to amplify calcineurin signaling, and is a potential therapeutic target for the management of muscular dystrophy, as it is upregulated in the skeletal muscle of the classical mouse model for Duchenne muscular dystrophy. 30
37197 380776 cd20254 CASIMO1_SMIM5 small integral membrane protein 5 (SMIM5) of CASIMO1. This family contains the small integral membrane protein 5 (SMIM5) variant of the Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein). CASIMO1 controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This variant is expressed on chromosome 16 in humans, mostly in the stomach, kidney, thyroid and esophagus. 71
37198 380777 cd20255 CASIMO1_SMIM22 small integral membrane protein 22 (SMIM22) of CASIMO1. This family contains the small integral membrane protein 22 (SMIM22) variant of the Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein). CASIMO1 controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This variant is expressed on chromosome 17 in humans, mostly in the colon and stomach. 80
37199 380773 cd20256 Stannin_family Stannin family includes vertebrate Stannin and insect Hemotin. The Stannin family includes vertebrate Stannin and insect Hemotin, which are functional homologs required at the cellular level for endosomal maturation, and at the molecular level, to bind and antagonize 14-3-3zeta. Stannin is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. Analysis of the Hemotin sequence using a transmembrane topology prediction program revealed a very similar potential transmembrane alpha-helical domain arrangement as Stannin. 83
37200 380774 cd20257 Stannin Stannin. Stannin (SNN) is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. It binds and antagonizes 14-3-3zeta and is required for endosomal maturation. It has also been identified as the specific marker for neuronal cell apoptosis induced by trimethyltin (TMT) intoxication. TMT is one of the most toxic organotin compounds (or alkyltins), and is known to selectively inflict injury to specific regions of the brain. 84
37201 380771 cd20258 Tal_Pri Tarsal-less (Tal), also known as polished rice (Pri), and related peptides. The tal/pri gene produces a single polycistronic transcript that encodes 4 related peptides: tal-1A, tal-2A, and tal-3A, which are each 11 amino acids long, and tal-AA, which is 32 amino acids long; the shorter ones each contain one conserved LDPTGXY motif, while tal-AA contains two. The Tal/Pri peptides function redundantly in several developmental processes. They are required for embryonic and imaginal development in Drosophila. They control epidermal differentiation in Drosophila by triggering the amino-terminal truncation of the transcription factor Shavenbaby (Svb), converting Svb from a repressor to an activator. In addition, Tal/Pri peptides are required for denticle formation and may play a role in the developmental timing of trichome differentiation. They are essential for the development of taenidial folds in the trachea, and in the early stages of leg development, for the intercalation of the tarsal segments during the mid-third instar stage and later for tarsal joint formation. Furthermore, Tal/Pri peptides are required for correct wing and leg formation through their regulation of several genes including those in the Notch signaling pathway. The Tribolium orthologue mille-pattes (mlpt) is essential for embryo segmentation; it also encodes a polycistronic mRNA that codes for four peptides: Mlpt peptides 1-3 range in size from 11 to 15 amino acids and each contain one conserved LDPTGXY motif; Mlpt peptide 4 is 23 amino acids, is not represented here, and does not contain this motif. 32
37202 380770 cd20259 pgc polar granule component. Polar granule component (pgc) is implicated in primordial germ cell specification in Drosophila, which requires transcriptional quiescence and three genes, pgc, nanos (nos) and germ cell-less (gcl), that act to down-regulate Pol II transcription. The microprotein pgc inhibits positive transcription elongation factor b (P-TEFb), which phosphorylates the C-terminal domain of the largest Pol II subunit. 66
37203 380769 cd20260 Myoregulin Myoregulin. Myoregulin (MLN) is encoded by a skeletal muscle-specific RNA Linc-RAM, which is annotated as a putative long noncoding RNA (lncRNA). It is a single-pass transmembrane alpha-helix that interacts directly with sarcoplasmic reticulum (SR) calcium transporting ATPase (SERCA) and impedes Ca(2+) uptake into the SR. SERCA is the membrane pump that controls muscle relaxation by regulating Ca(2+) uptake into the SR. MLN is the dominant regulator of SERCA1 activity in adult skeletal muscle and is a promising drug target for improving muscle performance. 45
37204 380756 cd20261 Complex1_LYR_LYRM1 LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 1 (LYRM1) and similar proteins. LYR motif-containing protein 1 (LYRM1) may promote cell proliferation and inhibition of apoptosis of preadipocytes. Overexpression of the human LYRM1 causes mitochondrial dysfunction and induces insulin resistance in 3T3-L1 adipocytes. LYRM1 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 70
37205 380757 cd20262 Complex1_LYR_LYRM2 LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 2 (LYRM2) and similar proteins. LYRM2 is an uncharacterized LYR motif-containing protein that belongs to the Complex1_LYR-like superfamily which consists of proteins of diverse functions that are exclusively found in eukaryotes; these proteins contain the conserved tripeptide 'LYR' close to the N-terminus. 63
37206 380758 cd20263 Complex1_LYR_NDUFB9_LYRM3 LYR (leucine-tyrosine-arginine) motif found in NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 9 (NDUFB9) and similar proteins. NDUFB9, also called LYR motif-containing protein 3 (LYRM3), or Complex I-B22 (CI-B22), or NADH-ubiquinone oxidoreductase B22 subunit (UQOR22), is an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (complex I), and is believed to be not involved in catalysis. In general, accessory subunits are integral for assembly and function of Complex I, which functions in the transfer of electrons from NADH to the respiratory chain. NDUFB9 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 77
37207 380759 cd20264 Complex1_LYR_LYRM4 LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 4 (LYRM4) and similar proteins. LYRM4, also called ISD11, is a eukaryote-specific component of the mitochondrial biogenesis of Fe-S clusters which are essential cofactors in multiple processes, including oxidative phosphorylation. It is required for nuclear and mitochondrial iron-sulfur protein biosynthesis by forming a complex with, and stabilizing, the sulfur donor NFS1. LYRM4 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 69
37208 380760 cd20265 Complex1_LYR_ETFRF1_LYRM5 LYR (leucine-tyrosine-arginine) motif found in electron transfer flavoprotein regulatory factor 1 (ETFRF1) and similar proteins. ETFRF1, also called LYR motif-containing protein 5 (LYRM5), or Ghiso (growth-factor inducible soluble) factor, acts as a regulator of the electron transfer flavoprotein by promoting the removal of flavin from the ETF holoenzyme (composed of ETFA and ETFB). It belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 74
37209 380761 cd20266 Complex1_LYR_NDUFA6_LYRM6 LYR (leucine-tyrosine-arginine) motif found in NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 (NDUFA6) and similar proteins. NDUFA6, also called LYR motif-containing protein 6 (LYRM6), or Complex I-B14 (CI-B14), or NADH-ubiquinone oxidoreductase B14 subunit, is an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I), and is believed to be not involved in catalysis. In general, accessory subunits are integral for assembly and function of Complex I, which functions in the transfer of electrons from NADH to the respiratory chain. NDUFA6 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 75
37210 380762 cd20267 Complex1_LYR_LYRM7 LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 7 (LYRM7) and similar proteins. LYRM7 is an assembly factor required for Rieske iron sulfur (Fe-S) protein UQCRFS1 incorporation into the cytochrome b-c1 (CIII) complex. It functions as a chaperone, binding to this subunit within the mitochondrial matrix and stabilizing it prior to its translocation and insertion into the late CIII dimeric intermediate within the mitochondrial inner membrane. LYRM7 mutations cause a multifocal cavitating leukoencephalopathy with a distinct and recognizable magnetic resonance imaging (MRI) pattern. LYRM7 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 72
37211 380763 cd20268 Complex1_LYR_SDHAF1_LYRM8 LYR (leucine-tyrosine-arginine) motif found in mitochondrial succinate dehydrogenase assembly factor 1 (SDHAF1) and similar proteins. SDHAF1, also called SDH assembly factor 1, or LYR motif-containing protein 8 (LYRM8), is a LYR complex-II specific assembly factor that plays an essential role in the assembly of succinate dehydrogenase (SDH), an enzyme complex (also referred to as respiratory complex II) that is a component of both the tricarboxylic acid (TCA) cycle and the mitochondrial electron transport chain, and which couples the oxidation of succinate to fumarate with the reduction of ubiquinone (coenzyme Q) to ubiquinol. It promotes maturation of the iron-sulfur protein subunit SDHB of the SDH catalytic dimer, protecting it from the deleterious effects of oxidants. SDHAF1 may act together with SDHAF3. It is mutated in SDH-defective infantile leukoencephalopathy. SDHAF1 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 64
37212 380764 cd20269 Complex1_LYR_LYRM9 LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 9 (LYRM9) and similar proteins. LYRM9 is an uncharacterized LYR motif-containing protein that belongs to the Complex1_LYR-like superfamily which consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 60
37213 380765 cd20270 Complex1_LYR_SDHAF3_LYRM10 LYR (leucine-tyrosine-arginine) motif found in mitochondrial succinate dehydrogenase assembly factor 3 (SDHAF3) and similar proteins. SDHAF3, also called SDH assembly factor 3, or LYR motif-containing protein 10 (LYRM10), plays an essential role in the assembly of succinate dehydrogenase (SDH), an enzyme complex (also referred to as respiratory complex II) that is a component of both the tricarboxylic acid (TCA) cycle and the mitochondrial electron transport chain, and which couples the oxidation of succinate to fumarate with the reduction of ubiquinone (coenzyme Q) to ubiquinol. It promotes maturation of the iron-sulfur protein subunit SDHB of the SDH catalytic dimer, protecting it from the deleterious effects of oxidants. SDHAF3 may act together with SDHAF1. Its mutations may be associated with idiopathic SDH-associated diseases. SDHAF3 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 56
37214 380766 cd20271 Complex1_LYR_FMC1 LYR (leucine-tyrosine-arginine) motif found in formation of mitochondrial complex V assembly factor 1 (FMC1) and similar proteins. FMC1, also known as formation of mitochondrial complexes protein 1, is an ATP synthase assembly factor that plays a role in the assembly/stability of the mitochondrial membrane ATP synthase (F(1)F(0) ATP synthase or Complex V). It belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 95
37215 380767 cd20272 Complex1_LYR_MIEF1-MP LYR (leucine-tyrosine-arginine) motif found in mitochondrial elongation factor 1 microprotein (MIEF1-MP) and similar proteins. MIEF1-MP, also called alternative mitochondrial elongation factor 1 (MIEF1) protein (AltMIEF1), or MIEF1 upstream open reading frame protein, is involved in the regulation of mitochondrial fission mediated by dynamin-1-like protein (DNM1L). It positively regulates mitochondrial translation. MIEF1-MP belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 58
37216 380768 cd20273 Complex1_LYR_unchar LYR (leucine-tyrosine-arginine) motif found in uncharacterized LYR motif-containing protein. This group contains uncharacterized LYR motif-containing proteins belonging to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 61
37217 380752 cd20274 Sarcolamban Sarcolamban A and B bioactive peptides and similar proteins. Invertebrate sarcolamban (SCLA and SCLB) belong to a family of bioactive peptides which includes vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of the SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. 27
37218 380753 cd20275 Sarcolamban_B Sarcolamban B bioactive peptide and similar proteins. Invertebrate sarcolamban B (SCLB) belongs to a family of bioactive peptides which includes invertebrate sarcolamban A (SCLA), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of the SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. 28
37219 380754 cd20276 Sarcolamban_A Sarcolamban A bioactive peptide. Invertebrate sarcolamban A (SCLA) belongs to a family of bioactive peptides which includes invertebrate sarcolamban B (SCLB), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of the SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. 27
37220 410555 cd20277 FXYD phenylalanine-X-tyrosine-aspartate (FXYD) family. FXYDs are small single-transmembrane proteins that act as novel regulators of Na+/K+-ATPase (NKA). The transmembrane domain and the conserved Phe-X-Tyr-Asp motif of FXYD play a role in the binding of FXYD to the alpha- and beta-subunits of NKA. PFXYD (proline-phenylalanine-X-tyrosine-aspartate) at the beginning of the signature sequence is invariant in all known examples in mammals and identical except for the proline in other vertebrates; X is usually Y (tyrosine), but can also be E, T, or H (glutamate, threonine, or histidine). The FXYD protein family contains at least twelve members that have the extracellular FXYD motif, transmembrane domain, and intracellular domain. Members share a 35-amino acid signature sequence domain, beginning with PFXYD and containing 7 invariant and 6 highly conserved amino acids. In mammals, members of the FXYD family include FXYD1 (phospholemman, PLM), FXYD2 (the gamma-subunit of NKA), FXYD3 (mammary tumor marker Mat-8), FXYD4 (corticosteroid hormone-induced factor, CHIF), FXYD5 (dysadherin), FXYD6 (phosphohippolin), and FXYD7. In elasmobranchs, FXYD10 (phospholemman-like protein from shark, PLMS) was first identified in the rectal glands of Squalus acanthias. In addition, studies on sharks reported that the functions of FXYD10 via its C-terminal cysteine residue interactions were associated with negative regulation of shark NKA activity. Teleostean FXYD proteins (FXYD2, 5-9, 11, and 12) have been reported in certain teleosts such as the Tetraodon nigroviridis, Salmo salar, Danio rerio, and Oryzias dancena. Recent studies have demonstrated that several teleost FXYD isoforms are expressed in the gills and kidneys of the fish, and their expression levels are altered in response to salinity changes, suggesting that these FXYDs may regulate electrolyte homeostasis and body fluid of the fish. 30
37221 380748 cd20278 Minion Microprotein INducer of fusION (Minion). Microprotein INducer of fusION (Minion), also called protein myomixer or protein myomerger, is encoded by the MYMX gene. Along with Myomaker, it allows cells to fuse and form multinucleated fibers that are capable of contracting. A lack of Minion disables skeletal muscles, including the diaphragm, resulting in perinatal death in mice. This insight into the Minion-Myomaker system may one day be exploited for targeted drug delivery involving fusing cells in cancer or other contexts. The production of Minion peaks three to four days after injury, similar to the expression profile of Myomaker. 57
37222 411710 cd20280 NotI-like Restriction endonuclease NotI and similar proteins. Restriction enzyme NotI is a type IIP restriction enzyme (the simplest being separate homodimeric endonucleases and methyltransferases that each recognize the same palindromic DNA target sequence) that recognizes sites of 8 bp or longer in invasive DNA. NotI is commonly used for the introduction of radiolabeled landmarks in the restriction landmark genomic scanning (RLGS) method, which has become a common technique for the study of aberrant DNA methylation patterns in tumor- and tissue-specific cell lines. 357
37223 380417 cd20281 cupin_QDO_C quercetinase, C-terminal cupin domain. This family contains the C-terminal domain of quercetinase (also known as quercetin 2,3-dioxygenase, 2,3QD, QDO and YxaG; EC 1.13.11.24), a mononuclear copper-dependent dioxygenase that catalyzes the cleavage of the flavonol quercetin (5,7,3',4'-tetrahydroxyflavonol) heterocyclic ring to produce 2-protocatechuoyl-phloroglucinol carboxylic acid and carbon monoxide. This family includes Aspergillus japonicus quercetin 2,3-dioxygenase (QDO), a homodimer that shows oxygenase activity with Cu2+. The dioxygen binds to the metal ion of the Cu-QDO-quercetin complex, yielding a Cu2+-superoxo quercetin radical intermediate, which forms a Cu2+-alkylperoxo complex that evolves into an endoperoxide intermediate that decomposes to the product. Quercetinase is a bicupin with two tandem cupin beta-barrel domains, only the C-terminal domain is included in this alignment. The pirins, which also belong to the cupin domain family, have been shown to catalyze a reaction involving quercetin and may have a function similar to that of quercetinase. 114
37224 380418 cd20282 cupin_DddQ dimethylsulfoniopropionate lyase DddQ, cupin domain. Dimethylsulfoniopropionate (DMSP) is produced worldwide in large amounts, mainly by marine phytoplankton and macroalgae. DMSP lyase catalyzes the cleavage of DMSP to generate the volatile dimethyl sulfide (DMS) and plays a major role in the biogeochemical cycling of sulfur. When released into the atmosphere from the oceans, DMS is oxidized, forming cloud condensation nuclei that may influence weather and climate. DMSP lyase belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 103
37225 380419 cd20283 cupin_DddY dimethylsulfoniopropionate lyase DddY, cupin domain. This family includes dimethylsulfoniopropionate (DMSP) lyase DddY, the only known periplasmic DMSP lyase that is present in certain proteobacteria. DddY cleaves dimethylsulfoniopropionate (DMSP), the organic osmolyte and antioxidant produced in marine environments, and yields acrylate and the climate-active gas dimethyl sulfide (DMS). The catabolism of DMSP by microbial organisms provides a major source of carbon and sulfur in the marine environment. Studies show that DddY binds a zinc ion as cofactor, and uses a key tyrosine as a general base to attack DMSP. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 294
37226 380420 cd20285 cupin_7S_11S_C 7S and 11S seed storage globulin, C-terminal cupin domain. This family contains the C-terminal cupin domains of 7S and 11S seed storage proteins. The 7S globulins include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2, and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The 11S globulins include many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, pecan allergen Car i 4, hazelnut nut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 109
37227 380421 cd20287 cupin_pirin-like_N pirin-like, N-terminal cupin domain. This family contains the N-terminal cupin domain of pirin and pirin-like proteins, including Escherichia coli YhhW and YhaK. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. Proteins in this family have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 81
37228 380422 cd20288 cupin_pirin-like_C pirin-like, C-terminal cupin domain. This family contains the C-terminal cupin domain of pirin and pirin-like proteins, including Escherichia coli YhhW and YhaK. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. Proteins in this family have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 70
37229 380423 cd20289 cupin_ADO 2-aminoethanethiol dioxygenase, cupin domain. This family contains 2-aminoethanethiol dioxygenase (also known as cysteamine dioxygenase, persulfurase or ADO; EC 1.13.11.19), which catalyzes the addition of two oxygen atoms to free cysteamine (2-aminoethanethiol) to form hypotaurine that subsequently oxidizes to taurine. These enzymes are found in prokaryotes as well as eukaryotes and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 103
37230 380424 cd20290 cupin_Mj0764-like uncharacterized Methanocaldococcus jannaschii Mj0764 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to MJ0764, a Methanocaldococcus jannaschii protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 100
37231 380425 cd20291 cupin_CucA-like soluble periplasm cuproprotein CucA and related proteins, cupin domain. This family includes bacterial proteins homologous to a soluble periplasm protein, CucA, found in the periplasm of the cyanobacterium Synechocystis where it shows some Cu2+-dependent quercetin dioxygenase activity. Studies show that a copper-trafficking pathway enables Cu2+ occupancy of CucA to accumulate in the periplasm, and this involves two copper transporters (CtaA and PacS) and a metallochaperone (Atx1). CucA belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 201
37232 380426 cd20292 cupin_QdtA-like sugar 3,4-ketoisomerase QdtA and related proteins, cupin domain. This family includes cupin domains of several bacterial proteins homologous to sugar 3,4-ketoisomerases. Thermoanaerobacterium thermosaccharolyticum QdtA catalyzes a key step in the biosynthesis of these sugars, the conversion of thymidine diphosphate (dTDP)-4-keto-6-deoxyglucose to dTDP-3-keto-6-deoxyglucose. In Aneurinibacillus thermoaerophilus, TDP-4-oxo-6-deoxy-alpha-D-glucose-3,4-oxoisomerase (also known as FdtA) is involved in the biosynthesis of dTDP-Fucp3NAc (3-acetamido-3,6-dideoxy-alpha-d-galactose), which is part of the repeating units of the glycan chain in the S-layer. Shewanella denitrificans bifunctional ketoisomerase/N-acetyltransferase (also known as FdtD) is involved in the third and fifth steps in the production of 3-acetamido-3,6-dideoxy-alpha-d-galactose or Fuc3NAc; the C-terminal cupin domain harbors the active site responsible for the isomerization reaction. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 117
37233 380427 cd20293 cupin_HutD_N histidine utilization protein HutD and related proteins, N-terminal cupin domain. This model represents the N-terminal domain of a bicupin protein HutD, involved in histidine utilization (Hut) in Pseudomonas species. Although a metal binding site is not found in Pseudomonas fluorescens (PfluHutD), a binding pocket for ligands is located in the middle of the N-terminal cupin domain near the metal binding sites; N-formyl-l-glutamate (FG, a Hut pathway intermediate) has been identified as a potential ligand in vivo. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 92
37234 380428 cd20294 cupin_KduI_N 5-keto-4-deoxyuronate isomerase (KduI) and related proteins, N-terminal cupin domain. 5-keto-4-deoxyuronate isomerase (KduI; EC 5.3.1.17), also called 5-dehydro-4-deoxy-D-glucuronate isomerase or 4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase, catalyzes the interconversion of 5-keto-4-deoxyuronate and 2,5-diketo-3-deoxygluconate in the breakdown of pectin. KduI is a bicupin; this model describes the N-terminal cupin domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 100
37235 380429 cd20295 cupin_Pac13-like monomeric dehydratase Pac13 and related proteins, cupin domain. This family includes a small monomeric dehydratase Pac13 that mediates the formation of the 3'-deoxynucleotide of pacidamycins, which are uridyl peptide antibiotics (UPAs). Pac13 is involved in the formation of the unique 3'-deoxyuridine moiety found in these UPAs; it catalyzes the dehydration of uridine-5'-aldehyde. The similarity of the 3'-deoxy pacidamycin moiety to synthetic anti-retrovirals offers a potential opportunity for the utilization of Pac13 in the biocatalytic generation of antiviral compounds. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 101
37236 380430 cd20296 cupin_PpnP-like pyrimidine/purine nucleoside phosphorylase and related proteins, cupin domain. This family includes cupin domain proteins that are homologous to pyrimidine/purine nucleoside phosphorylase PpnP. Purine and pyrimidine nucleoside phosphorylases are key enzymes of the nucleoside salvage pathway; they catalyze the reversible phosphorolytic cleavage of the glycosidic bond of purine and pyrimidine nucleosides. Nucleoside phosphorylases are of medical interest since phosphorylases can be used in activating prodrugs; high-molecular mass purine nucleoside phosphorylases may be used in gene therapy of some solid tumors and their inhibitors could be selective immunosuppressive, anticancer, and antiparasitic agents. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 90
37237 380431 cd20297 cupin_HQDO_small hydroquinol 1,2-dioxygenase (HQDO) small subunit, cupin domain. This model describes the small (or alpha) subunit of hydroquinone 1,2-dioxygenase (HQDO), which adopts a cupin domain fold. HQDO is a heterotetramer of two alpha and two beta subunits of 19kDa and 38kDa, respectively, and is a Fe(II) ring cleaving dioxygenase that is a key enzyme in the hydroquinone pathway of para-nitrophenol degradation, where it catalyzes the ring cleavage of hydroquinone to gamma-hydroxymuconic semialdehyde. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 160
37238 380432 cd20298 cupin_UAH ureidoglycolate amidohydrolase (UAH) and related proteins, cupin domain. This family includes the cupin-fold protein, ureidoglycolate hydrolase (AllA; EC 3.5.3.19), which is involved in the breakdown of allantoin under aerobic conditions. Allantoin is the key intermediate of nitrogen fixation in bacteria, and its degradation occurs in several steps. AllA is involved in the third step of this pathway which consists of hydrolysis of (S)-ureidoglycolate to yield glyoxylate, ammonia, and CO2. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 92
37239 380433 cd20299 cupin_YP766765-like Rhizobium leguminosarum YP_766765.1 and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to Rhizobium leguminosarum YP_766765.1, a protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 90
37240 380434 cd20300 cupin_npun_f5605-like_N Nostoc punctiforme npun_f5605 and related proteins, N-terminal cupin domain. This family includes proteins homologous to Nostoc punctiforme putative dioxygenase npun_f5605, a protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 99
37241 380435 cd20301 cupin_ChrR anti-ECFsigma factor, ChrR, cupin domain. This family contains bacterial anti-sigma factor ChrR from the photosynthetic bacterium Rhodobacter sphaeroides (Rsp) and similar proteins. ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the Rsp extra cytoplasmic function (ECF) sigma factor E (sigmaE), an essential factor to mount a transcriptional response to singlet oxygen and for viability when carotenoids are limiting. ChrR comprises two structural and functional modules; the N-terminal anti-sigma domain (ASD) binds a Zn(2+) ion, contacts sigma(E), and is sufficient to inhibit sigma(E)-dependent transcription. The ChrR C-terminal domain adopts a cupin fold, can coordinate an additional Zn(2+), and is required for the transcriptional response to singlet oxygen, a potent oxidant that damages cellular biomolecules and can kill cells. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 161
37242 380436 cd20302 cupin_DAD 2,4'-Dihydroxyacetophenone dioxygenase (DAD), cupin domain. 2,4'-Dihydroxyacetophenone dioxygenase (DAD) catalyzes the oxidation of 2,4'-dihydroxyacetophenone to 4-hydroxybenzoate and formate as part of the 4-hydroxyacetophenone catabolic pathway. This enzyme is a homo-tetramer containing one iron per molecule of enzyme. This enzyme is an unusual dioxygenase in that it cleaves a C-C bond in a substituent of the aromatic ring rather than within the ring itself. As a bacterial dioxygenase, DAD plays an important environmental role in the aerobic catabolism of aromatic compounds; expression of this enzyme in appropriately engineered microorganisms has the potential to use these aromatic pollutants as a carbon source and thus remove them from the environment. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 123
37243 380437 cd20303 cupin_ChrR_1 Marinobacter hydrocarbonoclasticus anti-ECFsigma factor ChrR, and similar proteins; 2 heterologous tandem repeats of cupin domain. This family contains bacterial anti-sigma factor such as ChrR from Marinobacter hydrocarbonoclasticus. Anti-sigma factor ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the ECF sigma factor E (sigmaE), an essential factor to mount a transcriptional response to a singlet oxygen and for viability when carotenoids are limiting. This protein family likely contains two distinct homologous functional domains belonging to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 102
37244 380438 cd20304 cupin_OxDC_N Oxalate decarboxylase (OxDC), N-terminal cupin domain. This model represents the N-terminal cupin domain of oxalate decarboxylase (OxDC; EC 4.1.1.2), a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 155
37245 380439 cd20305 cupin_OxDC_C Oxalate decarboxylase (OxDC), C-terminal cupin domain. This model represents the C-terminal cupin domain of oxalate decarboxylase (OxDC; EC 4.1.1.2), a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 153
37246 380440 cd20306 cupin_OxDC-like Oxalate decarboxylase (OxDC)-like cupin domain. This subfamily contains bacterial and eukaryotic cupin domains of proteins homologous to oxalate decarboxylase (OxDC; EC 4.1.1.2) such as MSMEG_2254, a putative OxDC from Mycobacterium smegmatis. OxDC is a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. 151
37247 380441 cd20307 cupin_BacB_N Bacillus subtilis bacilysin and related proteins, N-terminal cupin domain. This model represents the N-terminal domain of bacilysin (BacB, also known as AerE in Microcystis aeruginosa), a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. Bacilysin is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF Aeruginosin biosynthesis gene cluster in Microcystis aeruginosa. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 100
37248 380442 cd20308 cupin_YdaE D-lyxose isomerase YdaE, cupin domain. This family includes D-lyxose isomerase (D-LI; EC 5.3.1.15) homologous to YdaE from the sigma B regulon of Bacillus subtilis, a protein with an active site that is highly similar to the E. coli O157 z5688 D-lyxose isomerase. YdaE may have a synergistic role with ydaD, an NAD(P)-dependent alcohol dehydrogenase, in the adaptation to environment stresses; YdaD may be active against the ketose sugar produced by YdaE and function in providing resistance to oxidative stress through the production of reducing equivalents in the form of NAD(P)H. YdaE forms a cupin-type beta-barrel, with two alpha helices at the N-terminus. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 160
37249 380443 cd20309 cupin_EcSI Escherichia coli sugar isomerase (EcSI), cupin domain. This family includes a sugar isomerase homologous to pathogenic Escherichia coli O157 z5688 D-lyxose isomerase (EcSI or Z5688) which has an active site highly similar to YdaE from the sigma B regulon of Bacillus subtilis. Extensive substrate screening has revealed that EcSI is capable of acting on D-lyxose and D-mannose. Studies show that overexpression of EcSI enables cell growth on D-lyxose as the sole carbon source. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 199
37250 380444 cd20310 cupin_L-RbI L-ribose isomerase, cupin domain. L-ribose isomerase (RbI) catalyzes the reversible isomerization between L-ribose and L-ribulose, which are rare sugars and non-abundant in nature. RbI from Acinetobacter sp. DL-28 has been shown to have D-lyxose isomerase activity of about 47% compared to L-ribose. Cellulomonas parahominis MB426 RbI has a broad substrate specificity and can also catalyze the isomerization between D-lyxose and D-xylulose, D-talose and D-tagatose, L-allose and L-psicose, L-gulose and L-sorbose, and D-mannose and D-fructose. RbI adopts a cupin-type structure and belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 235
37251 380445 cd20311 cupin_Yhhw_C Escherichia coli YhhW and YhaK and related proteins, pirin-like bicupin, C-terminal cupin domain. This family includes the C-terminal domain of YhhW and YhaK, Escherichia coli pirin-like proteins with unknown function. YhhW is structurally similar not only to human pirin but also to quercetin 2,3-dioxygenase (quercetinase). Although the function of YhhW is not completely understood, YhhW and its human ortholog have quercetinase activity and are likely to play an important role in transcription and apoptosis. This C-terminal cupin-like domain has diverged considerably and has closer alignment with C-terminal pirin, while the N-terminal cupin domain of YhhW has a metal coordination site and is thought to have catalytic activity. YhaK is found in low abundance in the cytosol of E. coli and is strongly up-regulated by nitroso-glutathione (GSNO). There are major structural differences at the N-terminus of YhaK compared with YhhW; YhaK lacks the canonical cupin metal-binding residues of pirins and may be involved in chloride binding and/or sensing of oxidative stress in enterobacteria. YhaK showed no quercetinase or peroxidase activity; however, reduced YhaK was very sensitive to reactive oxygen species (ROS). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold. 70
37252 380745 cd20313 DSRM_DND1 double-stranded RNA binding motif of dead end protein homolog 1 (DND1) and similar proteins. DND1 (also known as dead end protein, or RNA-binding motif single-stranded-interacting protein 4 (RBMS4)) is an RNA-binding protein that is required for the survival of primordial germ cells (PGCs) and suppresses the formation of germ-cell tumors. DND1 binds a UU(A/U) trinucleotide motif predominantly in the 3' untranslated regions of mRNA, and destabilizes target mRNAs. It also counteracts the function of several microRNAs (miRNAs), which are inhibitors of gene expression, by binding mRNAs and prohibiting miRNAs from associating with their target sites. DND1 contains two RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure. 80
37253 380746 cd20314 DSRM_EIF2AK2 double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 68
37254 380747 cd20315 DSRM_mL44_subfamily double-stranded RNA binding motif of mL44 subfamily proteins. The mitochondrion-specific ribosomal protein mL44 subfamily is composed of mitochondrial 54S ribosomal protein L3 (MRPL3) and mitochondrial 39S ribosomal protein L44 (MRPL44). MRPL3 (also known as mitochondrial large ribosomal subunit protein mL44) is a component of the mitochondrial ribosome (mitoribosome), a dedicated translation machinery responsible for the synthesis of mitochondrial genome-encoded proteins, including at least some of the essential transmembrane subunits of the mitochondrial respiratory chain. MRPL44 (also called L44mt, MRP-L44, or mitochondrial large ribosomal subunit protein mL44) is a component of the 39S subunit of mitochondrial ribosome. It may play a role in the assembly/stability of nascent mitochondrial polypeptides exiting the ribosome. Members of this family contain a RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. 71
37255 410556 cd20317 FXYD1 FXYD domain-containing ion transport regulator 1. FXYD domain-containing ion transport regulator 1 (FXYD1), also known as phospholemman (PLM), or sodium/potassium-transporting ATPase subunit FXYD1, associates with and regulates the activity of the sodium/potassium-transporting ATPase (NKA) which transports Na+ out of the cell and K+ into the cell. It is a plasma membrane substrate for several kinases, including protein kinase A, protein kinase C, NIMA kinase, and myotonic dystrophy kinase. It is thought to form an ion channel or regulate ion channel activity. Transcript variants with different 5' UTR sequences have been described in the literature. 64
37256 410557 cd20318 FXYD2 FXYD domain-containing ion transport regulator 2. FXYD domain-containing ion transport regulator 2 (FXYD2), also known as sodium/potassium-transporting ATPase subunit gamma, or Na(+)/K(+) ATPase subunit gamma, or sodium pump gamma chain, is the regulatory subunit of the sodium/potassium-transporting ATPase (Na,K-ATPase). Na+,K+-ATPase is a heteromeric complex consisting of a large alpha-subunit, which is responsible for ATP hydrolysis, ion transport, and CTS binding, and a beta-subunit, acting as a chaperone. Although the Na,K-ATPase does not depend on the gamma subunit to be functional, it is thought that the gamma subunit modulates the enzyme's activity by inducing ion channel activity. Mutations in this gene have been associated with renal hypomagnesaemia. 43
37257 410558 cd20322 FXYD4 FXYD domain-containing ion transport regulator 4. FXYD domain-containing ion transport regulator 4 (FXYD4), also known as CHIF (channel-inducing factor or corticosteroid hormone-induced factor), evokes K+ conductance in oocytes and is localized in the distal parts of the nephron and in the colon. CHIF, a putative K channel regulator, is regulated by aldosterone in the colon and by K+ intake in the kidney. 48
37258 410559 cd20323 FXYD_FXYD5 FXYD domain of FXYD domain-containing ion transport regulator 5. FXYD domain-containing ion transport regulator 5 (FXYD5) is also called dysadherin in humans or related to ion channel (RIC) in mice. Two transcript variants have been found for this gene, and they are both predicted to encode the same protein. Dysadherin is the gamma subunit of the human Na,K-ATPase and is the only member that has a large extracellular sequence of 140 amino acids. Dysadherin has been observed to be over-expressed on the surface of cells that have down-regulated levels of surface E-cadherin. CCL2 (bone homing cytokine) is a protein that is highly affected by silencing dysadherin expression. Dysadherin interferes with cell adhesion via beta1 subunit interactions and is a target for an extracellular antibody drug conjugate where the antibody to dysadherin is attached to a cardiac glycoside. FXYD5 expression in mouse is mainly in the kidney, intestine, spleen, and lung. Confocal immunofluorescence microscopy of mouse kidney detected FXYD5 on basolateral membranes of connecting tubules, collecting tubules, intercalated cells of collecting duct, and on apical membranes in long thin limb of Henle loop. 48
37259 410560 cd20324 FXYD6 FXYD domain-containing ion transport regulator 6. FXYD domain-containing ion transport regulator 6 (FXYD6) encodes the protein phosphohippolin and is located at 11q23.3. It can be found in all human tissues except blood. FXYD6 in humans is primarily in the brain, with highest levels of expression found in the prefrontal cortex, amygdala, hypothalamus, and occipital lobe. FXYD6 is up-regulated in hepatocellular carcinoma (HCC) and it enhances the migration and proliferation of HCC cells. Therapy targeting FXYD6 could potentially benefit the clinical treatment of HCC patients. FXYD6 is also associated with mental diseases. Mutations in the FXYD6 gene, or in sequences close to this gene, can predispose to schizophrenia which is known to be strongly heritable. FXYD6 was also found to be significantly downregulated in a Tg2576 mouse model of Alzheimer's disease (AD) brain and hippocampus. FXYD6 is a novel regulator of Na,K-ATPase expressed in the inner ear. 66
37260 410561 cd20325 FXYD7 FXYD domain-containing ion transport regulator 7. FXYD domain-containing ion transport regulator 7 (FXYD7) has a potential splice variant with an additional 3 residues. In rats, expression of FXYD7 was restricted to the brain, with highest levels in the cerebrum, followed by brainstem, and hippocampus, and relatively weak expression in the hypothalamus. Immunofluorescence microscopy demonstrated colocalization with synaptophysin and modest colocalization with glial fibrillary acidic protein, indicating predominant expression in neurons and lower expression in astroglial cells. The FXYD7 gene maps to chromosome 19. 51
37261 410562 cd20327 FXYD8 FXYD domain-containing ion transport regulator 8. FXYD domain-containing ion transport regulator 8 (FXYD8), also known as FXYD domain containing ion transport regulator 6 pseudogene 3 (FXYD6P3), is a member of the FXYD protein family that is involved in the modulation of NKA activity in the kidneys. The human FXYD8 gene is located on the X chromosome. However, the gene is located on chromosome 9 in the mouse and chromosome 8 in the rat. 93
37262 410563 cd20328 FXYD3-like FXYD domain-containing ion transport regulator 3 and similar proteins. This subfamily includes FXYD domain-containing ion transport regulator 3 (FXYD3), FXYD9, and FXYD10, also called PLMS/Phospholemman-like protein. FXYD3, also known as mammary tumor 8 kDa protein (MAT-8), or chloride conductance inducer protein Mat-8, or phospholemman-like (PLML), may function as a chloride channel or as a chloride channel regulator. It associates with and regulates the activity of the sodium/potassium-transporting ATPase (NKA) which transports Na+ out of the cell and K+ into the cell. Two transcript variants encode two different isoforms of the protein; in addition, transcripts utilizing alternative polyA signals have been described in the literature. Members here include mammalians and reptiles. FXYD9 is present in teleosts including: Danio rerio, Atlantic salmon, and Japanese Medaka fish. In general, the FXYD9 isoform has the highest degree of conservation among the examined teleost species, indicating that it may be involved in physiological processes that are not evolving within this group of vertebrates. FXYD10, present in shark, associates with and modifies the activity of Na,K-ATPase in vitro through interactions mediated by its transmembrane and cytoplasmic C-terminal domains. It is important in the phosphorylation and potassium deocclusion reactions, which are known to be controlled by A domain movements. It is thought that FXYD10 interacts with the A domain of the shark Na,K-ATPase alpha-subunit. 51
37263 410564 cd20329 FXYD11 FXYD domain-containing ion transport regulator 11. FXYD domain-containing ion transport regulator 11 (FXYD11) is a putative regulatory subunit of the Na(+)/K(+)-ATPase (NKA) pump. FXYD11 is expressed predominantly in the gills of euryhaline teleosts, such as the spotted scat, Scatophagus argus. It regulates NKA activity through protein-protein interactions. The regulation of NKA and FXYD11 is of critical importance for osmotic homeostasis. The expression and activity of NKA, as well as FXYD11 mRNA expression in gills have been shown to respond to different environmental salinity by dual-labeling immunohistochemistry and quantitative PCR (RT-qPCR) methods, indicating that there is an interaction between NKA and FXYD. 64
37264 410565 cd20330 FXYD12 FXYD domain-containing ion transport regulator 12. The FXYD domain-containing ion transport regulator 12 (FXYD12) mRNA is mainly distributed in kidneys and intestines of fish. In co-immunoprecipitation experiments, FXYD12 was shown to associate with the Na(+)/K(+)-ATPase (NKA) alpha-subunit in the intestines of two closely related medakas, Oryzias dancena and O. latipes. These results suggest that FXYD12 may play a role in modulating NKA activity in the intestines following salinity changes in the maintenance of internal homeostasis. 53
37265 380677 cd20331 JBP-like oxygenase domain of uncharacterized bacterial and phage proteins similar to kinetoplastid J-binding protein (JBP) 1 and JBP2. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 270
37266 380678 cd20332 JBP J-binding protein. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 250
37267 380474 cd20334 Cas13b Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes; class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage. 759
37268 380669 cd20374 Pot1C Protection Of Telomeres Protein 1 (POT1) C-terminal region. POT1 is part of shelterin, a hexameric nucleoprotein complex (comprising TRF1, TRF2, TIN2, RAP1, POT1 and TPP1 in humans) that protects telomeres, the physical ends of chromosomes. Shelterin protects against these ends being recognized as double-stranded DNA breaks, as well as against degradation of the telomeric overhang by endonucleases. It also helps control access of telomerase to the telomeric overhang, thereby affecting telomere length. This C-terminal region has an OB-fold domain and a Holliday junction resolvase (HJR) domain which make dimer contacts with TPP1. 286
37269 380668 cd20378 PBP1_SBP-like periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids. 357
37270 410450 cd20379 Tudor_dTUD-like Tudor domain found in Drosophila melanogaster maternal protein Tudor (dTUD) and similar proteins. dTUD is required during oogenesis for the formation of primordial germ cells and for normal abdominal segmentation. It contains 11 Tudor domains. The family also includes mitochondrial A-kinase anchor protein 1 (AKAP1) and Tudor domain-containing proteins (TDRDs). AKAP1, also called A-kinase anchor protein 149 kDa (AKAP 149), or dual specificity A-kinase-anchoring protein 1 (D-AKAP-1), or protein kinase A-anchoring protein 1 (PRKA1), or Spermatid A-kinase anchor protein 84 (S-AKAP84), is found in mitochondria and in the endoplasmic reticulum-nuclear envelope where it anchors protein kinases, phosphatases, and a phosphodiesterase. It regulates multiple cellular processes governing mitochondrial homeostasis and cell viability. AKAP1 binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane. TDRDs have diverse biological functions and may contain one or more copies of the Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37271 410451 cd20380 Tudor_TDRD13-like Tudor domain found in Tudor domain-containing protein 13 (TDRD13) and similar proteins. The TDRD13 family includes TDRD13 and OTU domain-containing protein 4 (OTUD4). TDRD13, also called asparagine-linked glycosylation 13 (ALG13), glycosyltransferase 28 domain-containing protein 1 (GLT28D1), or UDP-N-acetylglucosamine transferase subunit ALG13, is a putative bifunctional UDP-N-acetylglucosamine transferase and deubiquitinase (EC 2.4.1.141/EC 3.4.19.12). It is a potential member of the Alg7p/Alg13p/Alg14p complex catalyzing the first two initial reactions in the N-glycosylation process. OTUD4, also called HIV-1-induced protein HIN-1, is a phospho-activated K63 deubiquitinase that hydrolyzes the isopeptide bond between the ubiquitin C-terminus and the lysine epsilon-amino group of the target protein. It may negatively regulate inflammatory and pathogen recognition signaling in innate immune response. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 54
37272 410452 cd20381 Tudor_LBR Tudor domain found in Lamin-B receptor (LBR) and similar proteins. LBR, also called integral nuclear envelope inner membrane (INM) protein or LMN2R, is a nuclear envelope protein that anchors the lamina and the heterochromatin to the inner nuclear membrane, in cellular senescence induced by excess thymidine. It is also important for cholesterol biosynthesis. LBR can interact with chromodomain proteins and DNA. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 51
37273 410453 cd20382 Tudor_SETDB1_rpt1 first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 82
37274 410454 cd20383 Tudor_53BP1 Tudor domain found in tumor suppressor TP53-binding protein 1 (53BP1) and similar proteins. 53BP1, also called p53-binding protein 1 (p53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics, and class-switch recombination (CSR) during antibody genesis. It plays a key role in the repair of DSBs in response to DNA damage by promoting non-homologous end joining (NHEJ)-mediated repair of DSBs and specifically counteracting the function of the homologous recombination (HR) repair protein BRCA1. It is recruited to DSB sites by recognizing and binding histone H2A monoubiquitinated at 'Lys-15' (H2AK15Ub) and histone H4 dimethylated at 'Lys-20' (H4K20me2), two histone marks that are present at DSB sites. 53BP1 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 52
37275 410455 cd20384 Tudor_ZGPAT Tudor domain found in zinc finger CCCH-type with G patch domain-containing protein (ZGPAT) and similar proteins. ZGPAT, also called ZIP, G patch domain-containing protein 6 (GPATC6), GPATCH6, zinc finger CCCH domain-containing protein 9 (ZC3HDC9), ZC3H9, or zinc finger and G patch domain-containing protein, is a transcription repressor that specifically binds the 5'-GGAG[GA]A[GA]A-3' consensus sequence. It represses transcription by recruiting the chromatin multiprotein complex NuRD to target promoters. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37276 410456 cd20385 Tudor_PCL Tudor domain found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins. The PCL family includes PHD finger protein 1 (PHF1) and its homologs, metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are accessory components of the Polycomb repressive complex 2 (PRC2) core complex. Members contain an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD domain-containing proteins, the first PHD domains of PCL proteins do not display histone H3K4 binding affinity and they do not affect the binding of the Tudor domain to histones. 54
37277 410457 cd20386 Tudor_PHF20-like Tudor domain found in PHD finger protein 20 (PHF20), PHF20-like protein 1 (PHF20L1), and similar proteins. PHF20, also called Glioma-expressed antigen 2, hepatocellular carcinoma-associated antigen 58, novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds to Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53-mediated signaling. PHF20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Both PHF20 and PHF20L1 contain an N-terminal malignant brain tumor (MBT) domain, a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37278 410458 cd20387 Tudor_UHRF_rpt1 first Tudor domain found in the UHRF (ubiquitin-like PHD and RING finger domain-containing protein) family. The UHRF family includes UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. 
Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. The model corresponds to the first Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3. 73
37279 410459 cd20388 Tudor_UHRF_rpt2 second Tudor domain found in the UHRF (ubiquitin-like PHD and RING finger domain-containing protein) family. The UHRF family includes UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. 
Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. The model corresponds to the second Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3. 72
37280 410460 cd20389 Tudor_ARID4_rpt1 first Tudor domain found in AT-rich interactive domain-containing protein ARID4 family. The family contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (RB)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and RB transcriptional activity, and play important roles in the AR and RB pathways to control male fertility. They also act as the leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through their interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 53
37281 410461 cd20390 Tudor_ARID4_rpt2 second Tudor domain found in AT-rich interactive domain-containing protein ARID4 family. The family contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (RB)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and RB transcriptional activity, and play important roles in the AR and RB pathways to control male fertility. They also act as the leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through their interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 53
37282 410462 cd20391 Tudor_JMJD2_rpt1 first Tudor domain found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also called lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. JMJD2D is not included in this model, since it lacks both the PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth. 53
37283 410463 cd20392 Tudor_JMJD2_rpt2 second Tudor domain found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also called lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. JMJD2D is not included in this model, since it lacks both the PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth. 56
37284 410464 cd20393 Tudor_SGF29_rpt1 first Tudor domain found in SAGA-associated factor 29 (SGF29) and similar proteins. SGF29, also called coiled-coil domain-containing protein 101, or SAGA complex-associated factor 29, is a chromatin reader component of some histone acetyltransferase (HAT) SAGA-type complexes, like the TFTC-HAT, ATAC or STAGA complexes. It specifically recognizes and binds methylated 'Lys-4' of histone H3 (H3K4me), with a preference for the trimethylated form (H3K4me3). SGF29 contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 67
37285 410465 cd20394 Tudor_SGF29_rpt2 second Tudor domain found in SAGA-associated factor 29 (SGF29) and similar proteins. SGF29, also called coiled-coil domain-containing protein 101, or SAGA complex-associated factor 29, is a chromatin reader component of some histone acetyltransferase (HAT) SAGA-type complexes, like the TFTC-HAT, ATAC or STAGA complexes. It specifically recognizes and binds methylated 'Lys-4' of histone H3 (H3K4me), with a preference for the trimethylated form (H3K4me3). SGF29 contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 60
37286 410466 cd20395 Tudor_SpCrb2-like_rpt1 first Tudor domain found in Schizosaccharomyces pombe Cut5-repeat binding protein 2 (Crb2) and similar proteins. Crb2, also called RAD9 protein homolog, or checkpoint mediator protein crb2, is a DNA repair protein essential for cell cycle arrest at the G1 and G2 stages following DNA damage by X- and UV-irradiation, or inactivation of DNA ligase. Crb2 contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37287 410467 cd20396 Tudor_SpCrb2-like_rpt2 second Tudor domain found in Schizosaccharomyces pombe Cut5-repeat binding protein 2 (Crb2) and similar proteins. Crb2, also called RAD9 protein homolog, or checkpoint mediator protein crb2, is a DNA repair protein essential for cell cycle arrest at the G1 and G2 stages following DNA damage by X- and UV-irradiation, or inactivation of DNA ligase. Crb2 contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 73
37288 410468 cd20397 Tudor_BAHCC1-like Tudor domain found in the BAH and coiled-coil domain-containing protein 1 (BAHCC1) family. The family of BAHCC1 includes BAHCC1 and trinucleotide repeat-containing gene 18 protein (TNRC18). BAHCC1 may function as a transcriptional regulator. The biological function of TNRC18 remains unclear. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 67
37289 410469 cd20398 Tudor_SMN Tudor domain found in survival motor neuron protein (SMN) and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalytic role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. Mutations in human SMN lead to motor neuron degeneration and spinal muscular atrophy. SMN contains a central, highly conserved Tudor domain that is required for U snRNP assembly and Sm protein binding and has been shown to bind arginine-glycine-rich motifs in a methylarginine-dependent manner. 56
37290 410470 cd20399 Tudor_SPF30 Tudor domain found in survival of motor neuron-related-splicing factor 30 (SPF30) and similar proteins. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. Overexpression of SPF30 causes apoptosis. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37291 410471 cd20400 Tudor_ERCC6L2 Tudor domain found in DNA excision repair protein ERCC-6-like 2 (ERCC6L2) and similar proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. ERCC6L2 gene mutations have been associated with bone marrow failure that includes developmental delay and microcephaly. It contains an N-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 59
37292 410472 cd20401 Tudor_AtPTM-like Tudor domain found in Arabidopsis thaliana DDT domain-containing protein PTM (AtPTM), Dirigent protein 17 (AtDIR17), and similar proteins. This family includes AtPTM and AtDIR17. AtPTM, also called DDT domain-containing protein 1, or PHD type transcription factor with transmembrane domains, is a membrane-bound transcription factor required for plastid-to-nucleus retrograde signaling. AtDIR17 imparts stereoselectivity on the phenoxy radical-coupling reaction, yielding optically active lignans from two molecules of coniferyl alcohol in the biosynthesis of lignans, flavonolignans, and alkaloids, and thus plays a central role in plant secondary metabolism. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37293 410473 cd20402 Tudor_Agenet_FMRP-like_rpt1 first Tudor-like Agenet domain found in the fragile X mental retardation protein (FMRP) family. The FMRP family includes synaptic functional regulator FMR1, fragile X mental retardation syndrome-related protein 1 (FXR1), and 2 (FXR2). FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FXR1 and FXR2 are RNA-binding proteins that shuttle between the nucleus and cytoplasm and associate with polyribosomes, predominantly with the 60S ribosomal subunit. Members of this family contain two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37294 410474 cd20403 Tudor_Agenet_FMRP-like_rpt2 second Tudor-like Agenet domain found in the fragile X mental retardation protein (FMRP) family. The FMRP family includes synaptic functional regulator FMR1, fragile X mental retardation syndrome-related protein 1 (FXR1) and 2 (FXR2). FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FXR1 and FXR2 are RNA-binding proteins that shuttle between the nucleus and cytoplasm and associate with polyribosomes, predominantly with the 60S ribosomal subunit. Members of this family contain two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37295 410475 cd20404 Tudor_Agenet_AtEML-like Tudor-like Agenet domain found in Arabidopsis thaliana proteins EMSY-LIKE 1-4 (AtEML1-4) and similar proteins. This family includes Arabidopsis thaliana proteins EMSY-LIKE 1-4 (AtEML1-4), histone-lysine N-methyltransferase trithorax-like proteins ATX1-2 (AtATX1-2), histone-lysine N-methyltransferase ASHH3, DNA mismatch repair protein MSH6, and similar proteins. EMSY-like proteins contain an EMSY N-terminal domain, a central Tudor-like Agenet domain, and a C-terminal coiled-coil motif. AtEML1, AtEML2, and likely AtEML4, contribute to RPP7-mediated immunity. Besides this, AtEML1 and AtEML2 participate in a second EDM2-dependent function and affect floral transition. ATX-like proteins are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1, also called protein SET domain group 27, or trithorax-homolog protein 1 (TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2, also called protein SET domain group 30, or trithorax-homolog protein 2 (TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). Both ATX1 and ATX2 are multi-domain proteins that consist of an N-terminal Tudor-like Agenet domain, a PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical plant homeodomain (PHD) domain, a non-canonical extended PHD (ePHD) domain, and a C-terminal SET domain. ASHH3, also called protein SET DOMAIN GROUP 7, functions as a histone-lysine N-methyltransferase (EC 2.1.1.43). It contains a SET domain and a Tudor-like Agenet domain. 
AtMSH6, also called MutS protein homolog 6, is a component of the post-replicative DNA mismatch repair system (MMR). It forms a heterodimer with MutS alpha (MSH2-MSH6 heterodimer) which binds to DNA mismatches thereby initiating DNA repair. AtMSH6 contains a Tudor-like Agenet domain and a MutS domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 51
37296 410476 cd20405 Tudor_Agenet_AtDUF_rpt1_3 first and third Tudor-like Agenet domains found in a family of Arabidopsis thaliana DUF724 domain-containing proteins (AtDUFs). The family includes a group of AtDUFs (AtDUF1-3 and AtDUF6-8) that may be involved in the polar growth of plant cells via transportation of RNAs. Members of this family have four Tudor-like Agenet domains, except for AtDUF8, which contains only two copies of the Tudor-like Agenet domain. AtDUF4 and AtDUF5 are not included here due to the lack of a Tudor-like Agenet domain. The model corresponds to the first and third Tudor-like Agenet domains in AtDUF1-3 and AtDUF6-7, as well as the first Tudor-like Agenet domain in AtDUF8. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 65
37297 410477 cd20406 Tudor_Agenet_AtDUF_rpt2_4 second and fourth Tudor-like Agenet domains found in the family of Arabidopsis thaliana DUF724 domain-containing proteins (AtDUFs). The family includes a group of AtDUFs (AtDUF1-3 and AtDUF6-8) that may be involved in the polar growth of plant cells via transportation of RNAs. Members of this family have four Tudor-like Agenet domains, except for AtDUF8, which contains only two copies of the Tudor-like Agenet domain. AtDUF4 and AtDUF5 are not included here due to the lack of a Tudor-like Agenet domain. The model corresponds to the second and fourth Tudor-like Agenet domains in AtDUF1-3 and AtDUF6-7, as well as the first Tudor-like Agenet domain in AtDUF8. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 47
37298 410478 cd20407 Tudor_AKAP1 Tudor domain found in mitochondrial A-kinase anchor protein 1 (AKAP1) and similar proteins. AKAP1, also called A-kinase anchor protein 149 kDa (AKAP 149), dual specificity A-kinase-anchoring protein 1 (D-AKAP-1), protein kinase A-anchoring protein 1 (PRKA1), or Spermatid A-kinase anchor protein 84 (S-AKAP84), is found in mitochondria and in the endoplasmic reticulum-nuclear envelope, where it anchors protein kinases, phosphatases, and a phosphodiesterase. It regulates multiple cellular processes governing mitochondrial homeostasis and cell viability. AKAP1 binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane. It contains a C-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 76
37299 410479 cd20408 Tudor_TDRD1_rpt1 first Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 130
37300 410480 cd20409 Tudor_TDRD1_rpt2 second Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 82
37301 410481 cd20410 Tudor_TDRD1_rpt3 third Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 59
37302 410482 cd20411 Tudor_TDRD1_rpt4 fourth Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 116
37303 410483 cd20412 Tudor_TDRD2 Tudor domain found in Tudor domain-containing protein 2 (TDRD2) and similar proteins. TDRD2, also called Tudor and KH domain-containing protein (TDRKH), participates in the primary piwi-interacting RNA (piRNA) biogenesis pathway and is required during spermatogenesis to repress transposable elements and prevent their mobilization, which is essential for germline integrity. The family also includes the TDRD2 homolog found in Drosophila melanogaster (dTDRKH), which is also called partner of PIWIs protein, or PAPI, and is involved in Zucchini-mediated piRNA 3'-end maturation. TDRD2 contains two KH domains and one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 95
37304 410484 cd20413 Tudor_TDRD3 Tudor domain found in Tudor domain-containing protein 3 (TDRD3) and similar proteins. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. In the nucleus, it acts as a coactivator; it recognizes and binds asymmetric dimethylation on the core histone tails associated with transcriptional activation (H3R17me2a and H4R3me2a) and recruits proteins at these arginine-methylated loci. In the cytoplasm, it may play a role in the assembly and/or disassembly of mRNA stress granules and in the regulation of translation of target mRNAs by binding Arg/Gly-rich motifs (GAR) in dimethylarginine-containing proteins. TDRD3 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 53
37305 410485 cd20414 Tudor_TDRD4_rpt1 first Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 77
37306 410486 cd20415 Tudor_TDRD4_rpt2 second Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 96
37307 410487 cd20416 Tudor_TDRD4_rpt3 third Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 82
37308 410488 cd20417 Tudor_TDRD4_rpt4 fourth Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 68
37309 410489 cd20418 Tudor_TDRD4_rpt5 fifth Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 105
37310 410490 cd20419 Tudor_TDRD5 Tudor domain found in Tudor domain-containing protein 5 (TDRD5) and similar proteins. TDRD5 is an RNA-binding protein directly associated with piRNA precursors. It is required for retrotransposon silencing, chromatoid body assembly, and spermiogenesis. TDRD5 participates in the repression of transposable elements and prevents their mobilization, which is essential for germline integrity. TDRD5 contains three LOTUS domains and one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 118
37311 410491 cd20420 Tudor_TDRD6_rpt1 first Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 132
37312 410492 cd20421 Tudor_TDRD6_rpt2 second Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 130
37313 410493 cd20422 Tudor_TDRD6_rpt3 third Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 135
37314 410494 cd20423 Tudor_TDRD6_rpt4 fourth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 80
37315 410495 cd20424 Tudor_TDRD6_rpt5 fifth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 126
37316 410496 cd20425 Tudor_TDRD6_rpt6 sixth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the sixth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 115
37317 410497 cd20426 Tudor_TDRD6_rpt7 seventh Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the seventh one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 140
37318 410498 cd20427 Tudor_TDRD7_rpt1 first Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes: probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, by regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 98
37319 410499 cd20428 Tudor_TDRD7_rpt2 second Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes: probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, by regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 140
37320 410500 cd20429 Tudor_TDRD7_rpt3 third Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes: probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, by regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 91
37321 410501 cd20430 Tudor_TDRD8 Tudor domain found in Tudor domain-containing protein 8 (TDRD8) and similar proteins. TDRD8, also called serine/threonine-protein kinase (EC 2.7.11.1) 31 (STK31), serine/threonine-protein kinase NYD-SPK, or Sugen kinase 396 (SgK396), is a germ cell-specific factor expressed in embryonic gonocytes of both sexes, and in postnatal spermatocytes and round spermatids in males. It acts as a cell-cycle regulated protein that contributes to the tumorigenicity of epithelial cancer cells. TDRD8 contains a Tudor domain and a serine/threonine kinase domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 75
37322 410502 cd20431 Tudor_TDRD9 Tudor domain found in Tudor domain-containing protein 9 (TDRD9) and similar proteins. TDRD9 is an ATP-dependent DEAD-like RNA helicase required during spermatogenesis. It is involved in the biosynthesis of PIWI-interacting RNAs (piRNAs). A recessive deleterious mutation in TDRD9 causes non-obstructive azoospermia in infertile men. TDRD9 contains an N-terminal HrpA-like RNA helicase module and a Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 101
37323 410503 cd20432 Tudor_TDRD10 Tudor domain found in Tudor domain-containing protein 10 (TDRD10) and similar proteins. TDRD10 is widely expressed and localized both to the nucleus and cytoplasm, and may play general roles like regulation of RNA metabolism. It contains a Tudor domain and an RNA recognition motif (RRM). The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 139
37324 410504 cd20433 Tudor_TDRD11 Tudor domain found in Tudor domain-containing protein 11 (TDRD11) and similar proteins. TDRD11, also called Staphylococcal nuclease domain-containing protein 1 (SND1), 100 kDa coactivator, EBNA2 coactivator p100, or p100 co-activator, is a multifunctional protein that is reportedly associated with different types of RNA molecules, including mRNA, miRNA, pre-miRNA, and dsRNA. It has been implicated in a number of biological processes in eukaryotic cells, including the cell cycle, DNA damage repair, proliferation, and apoptosis. TDRD11 is overexpressed in multiple cancers and functions as an oncogene. It contains multiple Staphylococcal nuclease (SN) domains and a C-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 84
37325 410505 cd20434 Tudor_TDRD12_rpt1 first Tudor domain found in Tudor domain-containing protein 12 (TDRD12) and similar proteins. TDRD12, also called ES cell-associated transcript 8 protein (ECAT8), is a putative ATP-dependent DEAD-like RNA helicase that is essential for germ cell development and maintenance. It acts as a unique piRNA biogenesis factor essential for secondary PIWI interacting RNA (piRNA) biogenesis. TDRD12 contains two Tudor domains, one at the N-terminus and the other at the C-terminal end. The model corresponds to the first/N-terminal one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 164
37326 410506 cd20435 Tudor_TDRD12_rpt2 second Tudor domain found in Tudor domain-containing protein 12 (TDRD12) and similar proteins. TDRD12, also called ES cell-associated transcript 8 protein (ECAT8), is a putative ATP-dependent DEAD-like RNA helicase that is essential for germ cell development and maintenance. It acts as a unique piRNA biogenesis factor essential for secondary PIWI interacting RNA (piRNA) biogenesis. TDRD12 contains two Tudor domains, one at the N-terminus and the other at the C-terminal end. The model corresponds to the second/C-terminal one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 134
37327 410507 cd20436 Tudor_TDRD15_rpt1 first Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 147
37328 410508 cd20437 Tudor_TDRD15_rpt2 second Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 120
37329 410509 cd20438 Tudor_TDRD15_rpt3 third Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 141
37330 410510 cd20439 Tudor_TDRD15_rpt4 fourth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 125
37331 410511 cd20440 Tudor_TDRD15_rpt5 fifth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 127
37332 410512 cd20441 Tudor_TDRD15_rpt6 sixth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the sixth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 108
37333 410513 cd20442 Tudor_TDRD15_rpt7 seventh Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the seventh one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 160
37334 410514 cd20443 Tudor_AtTudor1-like Tudor domain found in Arabidopsis thaliana ribonuclease Tudor 1 (AtTudor1), ribonuclease Tudor 2 (AtTudor2), and similar proteins. The family includes AtTudor1 (also called Tudor-SN protein 1) and AtTudor2 (also called Tudor-SN protein 2 or 100 kDa coactivator-like protein). They are cytoprotective ribonucleases (RNases) required for resistance to abiotic stresses, acting as positive regulators of mRNA decapping during stress. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 117
37335 410515 cd20444 Tudor_vreteno-like_rpt1 first Tudor domain found in Drosophila melanogaster protein vreteno and similar proteins. Vreteno is a gonad-specific protein essential for germline development; it represses transposable elements and prevents their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process in both germline and somatic gonadal tissues by mediating the repression of transposable elements during meiosis. Vreteno contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37336 410516 cd20445 Tudor_vreteno-like_rpt2 second Tudor domain found in Drosophila melanogaster protein vreteno and similar proteins. Vreteno is a gonad-specific protein essential for germline development; it represses transposable elements and prevents their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process in both germline and somatic gonadal tissues by mediating the repression of transposable elements during meiosis. Vreteno contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 56
37337 410517 cd20446 Tudor_SpSPF30-like Tudor domain found in Schizosaccharomyces pombe splicing factor spf30 (SpSPF30) and similar proteins. SpSPF30, also called survival of motor neuron-related-splicing factor 30, is necessary for spliceosome assembly. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 56
37338 410518 cd20447 Tudor_TDRD13 Tudor domain found in Tudor domain-containing protein 13 (TDRD13). TDRD13, also called asparagine-linked glycosylation 13 (ALG13), glycosyltransferase 28 domain-containing protein 1 (GLT28D1), or UDP-N-acetylglucosamine transferase subunit ALG13, is a putative bifunctional UDP-N-acetylglucosamine transferase and deubiquitinase (EC 2.4.1.141/EC 3.4.19.12). It is a potential member of the Alg7p/Alg13p/Alg14p complex catalyzing the first two initial reactions in the N-glycosylation process. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 80
37339 410519 cd20448 Tudor_OTUD4 Tudor domain found in OTU domain-containing protein 4 (OTUD4). OTUD4, also called HIV-1-induced protein HIN-1, is a phospho-activated K63 deubiquitinase that hydrolyzes the isopeptide bond between the ubiquitin C-terminus and the lysine epsilon-amino group of the target protein. It may negatively regulate inflammatory and pathogen recognition signaling in innate immune response. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 64
37340 410520 cd20449 Tudor_PHF1 Tudor domain found in PHD finger protein 1 (PHF1) and similar proteins. PHF1, also called Polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis, through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of lysine trimethylation at lysine 36 of histone H3 and lysine 27 of histone variant H3t. Moreover, PHF1 is required for efficient H3-K27 trimethylation (H3K27me3) and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. PHF1 consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3. 54
37341 410521 cd20450 Tudor_MTF2 Tudor domain found in metal-response element-binding transcription factor 2 (MTF2) and similar proteins. MTF2, also called metal regulatory transcription factor 2, metal-response element DNA-binding protein M96, or Polycomb-like protein 2 (PCL2), complexes with the Polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2, like other PCL family proteins, consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. 54
37342 410522 cd20451 Tudor_PHF19 Tudor domain found in PHD finger protein 19 (PHF19) and similar proteins. PHF19, also called Polycomb-like protein 3 (PCL3), is a component of the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. It binds trimethylated histone H3 Lys36 (H3K36me3) through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states. 57
37343 410523 cd20452 Tudor_dPCL-like Tudor domain found in Drosophila melanogaster Polycomb protein PCL (dPCL) and similar proteins. dPCL, also called Polycomblike protein, is a Polycomb group (PcG) protein that is specifically required during the first 6 hours of embryogenesis to establish the repressed state. dPCL is a component of the Esc/E(z) complex, which methylates 'Lys-9' and 'Lys-27' residues of histone H3, leading to transcriptional repression of the affected target gene. Like other PCL family proteins, it consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. 55
37344 410524 cd20453 Tudor_PHF20 Tudor domain found in PHD finger protein 20 (PHF20) and similar proteins. PHF20, also called Glioma-expressed antigen 2, hepatocellular carcinoma-associated antigen 58, novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds to Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53-mediated signaling. PHF20 contains an N-terminal malignant brain tumor (MBT) domain, a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 53
37345 410525 cd20454 Tudor_PHF20L1 Tudor domain found in PHD finger protein 20-like protein 1 (PHF20L1) and similar proteins. PHF20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. In addition to the MBT domain, PHF20L1 also contains a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 59
37346 410526 cd20455 Tudor_UHRF1_rpt1 first Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1) and similar proteins. UHRF1, also called inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING-finger domain. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. The model corresponds to the first Tudor domain. 79
37347 410527 cd20456 Tudor_UHRF2_rpt1 first Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2) and similar proteins. UHRF2, also called Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. UHRF2 also functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger domain. The model corresponds to the first Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3. 91
37348 410528 cd20457 Tudor_UHRF1_rpt2 second Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1) and similar proteins. UHRF1, also called inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also a N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING- associated (SRA) domain, and a C-terminal RING-finger domain. 
It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. The model corresponds to the second Tudor domain. 72
37349 410529 cd20458 Tudor_UHRF2_rpt2 second Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2) and similar proteins. UHRF2, also called Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. UHRF2 also functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger domain. The model corresponds to the second Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3. 73
37350 410530 cd20459 Tudor_ARID4A_rpt1 first Tudor domain found in AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1 or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B (also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC-dependent and -independent repression activities. It also acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in the Runx2-osterix transcriptional cascade. ARID4A contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only the chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 58
37351 410531 cd20460 Tudor_ARID4B_rpt1 first Tudor domain found in AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1 or RBBP1L1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and ARID4A (also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating the cell cycle. ARID4B contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 61
37352 410532 cd20461 Tudor_ARID4A_rpt2 second Tudor domain found in AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1 or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B (also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC-dependent and -independent repression activities. It also acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in the Runx2-osterix transcriptional cascade. ARID4A contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only the chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 60
37353 410533 cd20462 Tudor_ARID4B_rpt2 second Tudor domain found in AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1 or RBBP1L1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and ARID4A (also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating the cell cycle. ARID4B contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 57
37354 410534 cd20463 Tudor_JMJD2A_rpt1 first Tudor domain found in Jumonji domain-containing protein 2A (JMJD2A) and similar proteins. JMJD2A, also called lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor corepressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER), and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37355 410535 cd20464 Tudor_JMJD2B_rpt1 first Tudor domain found in Jumonji domain-containing protein 2B (JMJD2B) and similar proteins. JMJD2B, also called lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the tri-methyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 54
37356 410536 cd20465 Tudor_JMJD2C_rpt1 first Tudor domain found in Jumonji domain-containing protein 2C (JMJD2C) and similar proteins. JMJD2C, also called lysine-specific demethylase 4C (KDM4C), gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 54
37357 410537 cd20466 Tudor_JMJD2A_rpt2 second Tudor domain found in Jumonji domain-containing protein 2A (JMJD2A) and similar proteins. JMJD2A, also called lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor corepressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER), and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 56
37358 410538 cd20467 Tudor_JMJD2B_rpt2 second Tudor domain found in Jumonji domain-containing protein 2B (JMJD2B) and similar proteins. JMJD2B, also called lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the tri-methyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 56
37359 410539 cd20468 Tudor_JMJD2C_rpt2 second Tudor domain found in Jumonji domain-containing protein 2C (JMJD2C) and similar proteins. JMJD2C, also called lysine-specific demethylase 4C (KDM4C), gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 60
37360 410540 cd20469 Tudor_TNRC18 Tudor domain found in trinucleotide repeat-containing gene 18 protein (TNRC18) and similar proteins. TNRC18, also called long CAG trinucleotide repeat-containing gene 79 protein (CAGL79), is a protein that in humans is encoded by the TNRC18 gene. Its biological function remains unclear. TNRC18 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 67
37361 410541 cd20470 Tudor_BAHCC1 Tudor domain found in BAH and coiled-coil domain-containing protein 1 (BAHCC1) and similar proteins. BAHCC1, also called Bromo adjacent homology domain-containing protein 2 (BAHD2), or BAH domain-containing protein 2, may function as a transcriptional regulator. BAHCC1 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 70
37362 410542 cd20471 Tudor_Agenet_FMR1_rpt1 first Tudor-like Agenet domain found in synaptic functional regulator FMR1 and similar proteins. FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FMR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37363 410543 cd20472 Tudor_Agenet_FXR1_rpt1 first Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA binding protein that interacts with the functionally similar proteins FMR1 and FXR2. It shuttles between the nucleus and cytoplasm and associates with polyribosomes, predominantly with the 60S ribosomal subunit. FXR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37364 410544 cd20473 Tudor_Agenet_FXR2_rpt1 first Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2 is an RNA-binding protein that associates with polyribosomes, predominantly with 60S large ribosomal subunits. It may have a role in the development of fragile X mental retardation syndrome. FXR2 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 55
37365 410545 cd20474 Tudor_Agenet_FMR1_rpt2 second Tudor-like Agenet domain found in synaptic functional regulator FMR1 and similar proteins. FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FMR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 63
37366 410546 cd20475 Tudor_Agenet_FXR1_rpt2 second Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA binding protein that interacts with the functionally similar proteins FMR1 and FXR2. It shuttles between the nucleus and cytoplasm and associates with polyribosomes, predominantly with the 60S ribosomal subunit. FXR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 66
37367 410547 cd20476 Tudor_Agenet_FXR2_rpt2 second Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2 is an RNA-binding protein that associates with polyribosomes, predominantly with 60S large ribosomal subunits. It may have a role in the development of fragile X mental retardation syndrome. FXR2 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 68
37368 380475 cd20477 Cas13b_Pb-like Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b from Prevotella buccae and similar Cas13b proteins. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage. 995
37369 380476 cd20478 Cas13b_Bz-like Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b from Bergeyella zoohelcum and similar Cas13b proteins. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage. 1118
37370 380470 cd20480 ArgR-Cyc_NRPS-like Cyc (heterocyclization)-like domain of Vibrio anguillarum AngR and similar proteins; belongs to the Condensation-domain family. Vibrio anguillarum AngR plays a role in regulating the expression of iron transport genes as well as in the production of the siderophore anguibactin. Cyc-domains are a type of Condensation (C) domain. Cyc-domains catalyze two separate reactions in the creation of heterocyclized peptide products in nonribosomal peptide synthesis: amide bond formation followed by intramolecular cyclodehydration between a Cys, Ser, or Thr side chain and a carbonyl carbon on the peptide backbone to form a thiazoline, oxazoline, or methyloxazoline ring. C-domains typically have a conserved HHxxxD motif at the active site; Cyc-domains have an alternative, conserved DxxxxD active site motif; mutation of the aspartate residues in this motif can abolish or diminish condensation activity. Members of this subfamily have an SxxxD motif at the active site. C-domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to Cyc-domains there are various other subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, and starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. 
NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 406
37371 380473 cd20481 phage_tailspike_middle N-terminal and middle domains of tailspike protein in Acinetobacter bacteriophages. This model describes the middle beta-helical domain of Acinetobacter bacteriophage tailspike proteins, as well as a separate N-terminal domain that does not appear to be part of the beta-helical substructure. The N-terminal domain may be involved in virion binding, and the molecules form a homo-trimeric arrangement. A C-terminal domain that may be involved in receptor binding is omitted from the model. 419
37372 380471 cd20483 C_PKS-NRPS Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Most members of this subfamily have the typical C-domain HHXXXD motif. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 430
37373 380472 cd20484 C_PKS-NRPS_PksJ-like Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs), similar to Bacillus subtilis PksJ. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily have the typical C-domain HHxxxD motif. PksJ is involved in some intermediate steps for the synthesis of the antibiotic polyketide bacillaene which is important in secondary metabolism. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 430
37374 380450 cd20485 USP25_USP28_C-like carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25) and 28 (USP28), and similar domains. This family contains the C-terminal domain of two deubiquitinases (DUBs), ubiquitin-specific proteases USP25 and USP28, which share high similarity but vary in their cellular functions. USP25 is a regulator of the innate immune system and may play a role in tumorigenesis, while USP28 is known for its tumor-promoting role. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This alignment model represents the C-terminal region that has been implicated in substrate binding for both USP25 and USP28 and harbors the splicing site for isoform-specific sequences. 273
37375 380451 cd20486 USP25_C carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25). This subfamily contains the C-terminal domain of ubiquitin-specific protease USP25, a deubiquitinase (DUB), which shares high similarity with USP28 but varies in cellular function; USP25 is a regulator of the innate immune system and may play a role in tumorigenesis, while USP28 is known for its tumor-promoting role. USP25 regulates inflammatory TRAF signaling and USP28 stabilizes c-MYC and other nuclear proteins. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP25 and harbors the splicing site for isoform-specific sequences. Structure studies show that the C-terminally extended USP25 is exclusively tetrameric. 281
37376 380452 cd20487 USP28_C carboxyl-terminal domain of ubiquitin-specific protease 28 (USP28). This family contains the C-terminal domain of ubiquitin-specific protease USP28, a deubiquitinase (DUB), which shares high similarity with USP25 but varies in cellular function; USP28 is known for its tumor-promoting role while USP25 is a regulator of the innate immune system and may play a role in tumorigenesis. USP28 stabilizes c-MYC and other nuclear proteins, and USP25 regulates inflammatory TRAF signaling. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP28 and harbors the splicing site for isoform-specific sequences. Structure studies suggest that the C-terminal domain forms an independent entity. 280
37377 410774 cd20488 peptidase_C58-like C58 peptidase domain and similar domains. This family contains C58 peptidases and similar proteins. C58 family peptidases are endopeptidases that also act as transamidases, attaching a lipid moiety to the newly exposed N-terminus of the substrate. These include the Pseudomonas avirulence (Avr) protein AvrPphB and the homologous protein from Yersinia known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. Also included is a cysteine-protease-like domain in Photorhabdus asymbiotica toxin PaTox that enhances cytotoxic effects of the toxin, and therefore is essential for full PaTox activity. The C58 cysteine protease domain is also found in Vibrio vulnificus biotype 3 multifunctional autoprocessing RTX toxin. It usually contains the characteristic Cys, His, Asp residues in the active site. Some members may lack this catalytic triad. 143
37378 380446 cd20489 cupin_HppE-like_C hydroxypropylphosphonic acid epoxidase (HppE) and similar proteins, C-terminal cupin domain. This family includes HppE (hydroxypropylphosphonic acid epoxidase or HPP epoxidase or 2-hydroxypropylphosphonic acid epoxidase; EC 1.11.1.23), a non-heme mononuclear iron-dependent enzyme that catalyzes a unique epoxidation reaction as part of the biosynthetic pathway of the clinically important oxirane antibiotic fosfomycin. HppE uses a facial triad with two histidine ligands and one aspartic acid or glutamic acid, His2(Glu/Asp), to catalyze a variety of different reactions, including DNA repair and antibiotic biosynthesis. The C-terminal catalytic domain of HppE has a cupin fold that binds a divalent cation, whereas the N-terminal domain carries a helix-turn-helix (HTH) motif with putative DNA-binding helices. HppE converts (S)-2-hydroxypropyl-1-phosphonate (S-HPP) to the antibiotic fosfomycin [(1R,2S)-epoxypropylphosphonate] in an unusual 1,3-dehydrogenation of a secondary alcohol to an epoxide; it uses H2O2 as a co-substrate to abstract hydrogen (Ho) from C1 of S-HPP to initiate epoxide ring closure, using an iron(IV)-oxo complex as the Ho abstractor. HppE belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization and its structure serves as a model for numerous proteins of unknown function, predicted to be transcription factors, containing an HTH motif at the N-terminus and a cupin domain at the C-terminus. 97
37379 380447 cd20490 cupin_HutD_C histidine utilization protein HutD and related proteins, C-terminal cupin domain. This model represents the C-terminal domain of a bicupin protein HutD, involved in histidine utilization (Hut) in Pseudomonas species. Although a metal binding site is not found in Pseudomonas fluorescens (PfluHutD), a binding pocket for ligands is located in the middle of the N-terminal cupin domain near the metal binding sites; N-formyl-l-glutamate (FG, a Hut pathway intermediate) has been identified as a potential ligand in vivo. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 77
37380 380448 cd20491 cupin_KduI_C Escherichia coli 5-keto-4-deoxyuronate isomerase (KduI) and related proteins, C-terminal cupin domain. 5-keto-4-deoxyuronate isomerase (KduI; EC 5.3.1.17), also called 5-dehydro-4-deoxy-D-glucuronate isomerase or 4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase, catalyzes the interconversion of 5-keto-4-deoxyuronate and 2,5-diketo-3-deoxygluconate in the breakdown of pectin. KduI is a bicupin; this model describes the C-terminal cupin domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 108
37381 380449 cd20492 cupin_HQDO_large_C hydroquinol 1,2-dioxygenase (HQDO) large subunit, C-terminal cupin domain. This model describes the C-terminal cupin domain of the large (or beta) subunit of hydroquinone 1,2-dioxygenase (HQDO), a heterotetramer of two alpha and two beta subunits of 19kDa and 38kDa, respectively. HQDO is a Fe(II) ring cleaving dioxygenase that is a key enzyme in the hydroquinone pathway of para-nitrophenol degradation, where it catalyzes the ring cleavage of hydroquinone to gamma-hydroxymuconic semialdehyde. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. 100
37382 380336 cd20493 M34_ATLF_C-like C-terminal catalytically active domain of anthrax toxin lethal factor and similar domains; belongs to peptidase family M34. This subfamily includes the C-terminal catalytic domain of anthrax toxin lethal factor (ATLF; EC 3.4.24.83). ATLF and edema factor are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is secreted by Bacillus anthracis to promote disease virulence through disruption of host signaling pathways. ATLF belongs to peptidase family M34 and has the hallmark metalloprotease HEXXH motif, where the two His residues bind a single zinc atom, and the Glu has a catalytic role. ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). MKKs are cleaved by ATLF near their N-termini, removing the docking sequence for the downstream cognate mitogen-activated protein kinase. Preferred amino acids around the cleavage site can be denoted BBBBxHxH, in which B denotes Arg or Lys, H denotes a hydrophobic amino acid, and x is any amino acid. At its N-terminus, ATLF has a related PABD domain which lacks the hallmark metalloprotease motif HEXXH. This subfamily also includes Bacillus thuringiensis Vip2Ac-like_2 which belongs to the Vip family of proteins that are secreted during the vegetative growth phase. 208
37383 410775 cd20494 C58_RtxA peptidase C58-like domain of cytotoxin RtxA and similar proteins. This subfamily includes the C58 peptidase-like domain of Vibrio vulnificus biotype 3 multifunctional autoprocessing RTX (MARTX) toxin, the primary virulence factor of V. vulnificus. MARTX has been shown to be an essential virulence factor contributing to highly inflammatory skin wounds with severe damage affecting every tissue layer. This toxin is a large single-polypeptide composed of repeat sequences that form a pore in eukaryotic cell plasma membranes for the translocation of centrally located effector domains. This C58 family cysteine protease domain usually contains the invariant C/H/D residues that form an active site triad; however, cysteine is not fully conserved in this group. 229
37384 410776 cd20495 C58_PaToxP-like peptidase C58 domain of Photorhabdus asymbiotica toxin PaTox and LifA/Efa1-related large cytotoxin, and similar proteins. This subfamily includes the cysteine protease domain of Photorhabdus asymbiotica toxin PaTox, a large virulence-associated multifunctional protein toxin. This domain is similar to AvrPphB protease found in Pseudomonas syringae, a C58 protease. Mutation studies show that this domain enhances cytotoxic effects of the toxin, and therefore is essential for full PaTox activity. Also included in this family is the enteropathogenic Escherichia coli (EPEC) factor for adherence/lymphocyte activation inhibitor (efa1/lifA) gene which is strongly associated with diarrhea. Efa1/LifA proteins are important for A/E lesion formation efficiency in EPEC strains lacking multiple effectors. This domain contains the invariant C/H/D residues conserved in the C58/YopT family. 179
37385 410777 cd20496 C58 peptidase C58 domain. The C58 family peptidases are endopeptidases that also act as transamidases, attaching a lipid moiety to the newly exposed N-terminus of the substrate. These include the Pseudomonas avirulence (Avr) protein AvrPphB and the homologous protein from Yersinia known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. The proteolytic activity of AvrPphB is essential for autoproteolytic cleavage of an AvrPphB precursor as well as for eliciting the hypersensitive response in plants. Yersinia pestis YopT cleaves the post-translationally modified Rho GTPases near their carboxyl termini, releasing them from the membrane. This leads to the disruption of actin cytoskeleton in host cells. Also included in this family is the Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. All of these proteolytic activities are dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain. 149
37386 410778 cd20497 C58_YopT-like peptidase C58 domain of YopT-like proteins, including Pseudomonas avirulence AvrPphB. This subfamily includes the C58 peptidase domain of the Pseudomonas avirulence (Avr) protein AvrPphB which is homologous to Yersinia effector known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. The proteolytic activity of AvrPphB is essential for autoproteolytic cleavage of an AvrPphB precursor as well as for eliciting the hypersensitive response in plants. Also included is the Ralstonia solanacearum type III effector protein RipT, a YopT-like cysteine protease. All of these proteolytic activities are dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain. 185
37387 410779 cd20498 C58_YopT peptidase C58 domain of the YopT subfamily, including Yersinia pestis YopT and related proteins. This subfamily includes the plague organism Yersinia pestis cysteine protease YopT, an outer membrane protein. Y. pestis can disarm the host immune response by interfering with cell-signaling pathways; YopT cleaves post-translationally modified Rho GTPases near their carboxyl termini, releasing them from the membrane. This leads to the disruption of the actin cytoskeleton in host cells. YopT's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain. 211
37388 410780 cd20499 HopN1-like peptidase C58 domain of Pseudomonas syringae type III effector HopN1 and related proteins. This family includes the C58 peptidase domain of Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. HopN1's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain. 216
37389 410973 cd20500 Peptidase_C80 peptidase C80 family. The peptidase C80 family includes self-cleaving proteins that are precursors of bacterial toxins such as the Vibrio cholerae RTX self-cleaving toxin, as well as the major virulence factors of Clostridium difficile multidomain toxins, TcdA and TcdB. These toxins contain a cysteine protease domain (CPD) that autoproteolytically releases a cytotoxic effector domain upon binding intracellular inositol hexakisphosphate. This family also contains filamentous hemagglutinin family cysteine protease C80 domains, that are located at the C-terminus. All domains in this family contain the characteristic Cys/His residues in the active site. Site-directed mutagenesis has identified functional residues Asp/His/Cys in Clostridium toxin B and His/Cys in cholera RTX toxin. 150
37390 410974 cd20501 C80_RtxA-like peptidase C80 cysteine binding domain of RTX toxin RtxA and related proteins. This peptidase C80 family includes the autoproteolytic cysteine protease domain (CPD) of Vibrio cholerae multifunctional autoprocessing repeats-in-toxin (MARTX) toxin that causes disassembly of the actin cytoskeleton and enhances V. cholerae colonization of the small intestine, possibly by facilitating evasion of phagocytic cells. The central region of this toxin is composed of several domains, including the actin cross-linking domain (ACD) that introduces lysine-glutamate cross-links between actin protomers, the Rho-inactivating domain (RID) that disables small Rho GTPases, and an autoprocessing cysteine protease domain (CPD). Within the cell, the CPD is activated by the binding of inositol hexakisphosphate to release individual effector domains of the toxin into the cytosol. The CPD contains the characteristic Cys/His residues in the active site. 194
37391 410975 cd20502 C80_toxinA_B-like Peptidase C80 cysteine binding domain of Clostridium difficile toxins A and B, and related proteins. This peptidase C80 family includes the major virulence factors of Clostridium difficile multidomain toxins TcdA and TcdB. These large homologous toxins contain several distinct domains including a cysteine protease domain (CPD) that autoproteolytically releases a cytotoxic effector domain upon binding of intracellular inositol hexakisphosphate. C. difficile is a major cause of intestinal tissue damage and inflammation, and TcdA is generally more inflammatory whereas TcdB is more cytotoxic; studies show that the CPD is an internal regulator of the proinflammatory activity. Site-directed mutagenesis has identified functional residues Asp/His/Cys in Clostridium toxin B. 209
37392 410976 cd20503 C80_adhesin-like peptidase C80 domains found in filamentous hemagglutinin or adhesin, and other similar proteins. This peptidase C80 family includes the cysteine-binding domain (CPD) of several large, repetitive bacterial exoproteins involved in heme utilization or adhesion and many typically having CPD repeats as well as regions rich in repeats. Many members of this family have been designated adhesins or filamentous haemagglutinins. The CPD contains the characteristic Asp/Cys/His residues found in Clostridium toxin B active site. 156
37393 410208 cd20504 CYCLIN_CCNA_rpt1 first cyclin box found in cyclin-A (CCNA) family. The CCNA family includes two A-type cyclins, CCNA1 and CCNA2. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. Members in this family contain two cyclin boxes. The model corresponds to the first one. The cyclin box is a protein binding domain. 128
37394 410209 cd20505 CYCLIN_CCNA_rpt2 second cyclin box found in cyclin-A (CCNA) family. The CCNA family includes two A-type cyclins, CCNA1 and CCNA2. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 110
37395 410210 cd20506 CYCLIN_AtCycA-like_rpt2 second cyclin box found in Arabidopsis thaliana A-type cyclins (CycAs) and similar proteins. Plant A-type cyclins (CycAs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycAs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 111
37396 410211 cd20507 CYCLIN_CCNB1-like_rpt1 first cyclin box found in cyclin-B1 (CCNB1)-like family. The CCNB1-like family includes two B-type cyclins, CCNB1 and CCNB2, both of which are essential for the control of the cell cycle at the G2/M (mitosis) transition. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 130
37397 410212 cd20508 CYCLIN_CCNB3_rpt1 first cyclin box found in G2/mitotic-specific cyclin-B3 (CCNB3) and similar proteins. CCNB3 is a mitotic B-type cyclin that promotes the metaphase-anaphase transition. It controls anaphase onset independent of spindle assembly checkpoint in meiotic oocytes. CCNB3 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 142
37398 410213 cd20509 CYCLIN_CCNB1-like_rpt2 second cyclin box found in cyclin-B1 (CCNB1)-like family. The CCNB1-like family includes two B-type cyclins, CCNB1 and CCNB2, both of which are essential for the control of the cell cycle at the G2/M (mitosis) transition. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 111
37399 410214 cd20510 CYCLIN_CCNB3_rpt2 second cyclin box found in G2/mitotic-specific cyclin-B3 (CCNB3) and similar proteins. CCNB3 is a mitotic B-type cyclin that promotes the metaphase-anaphase transition. It controls anaphase onset independent of spindle assembly checkpoint in meiotic oocytes. CCNB3 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 115
37400 410215 cd20511 CYCLIN_AtCycB-like_rpt2 second cyclin box found in Arabidopsis thaliana B-type cyclins (CycBs) and similar proteins. Plant B-type cyclins (CycBs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycBs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 117
37401 410216 cd20512 CYCLIN_CLBs_yeast_rpt2 second cyclin box found in yeast B-type cyclins. The family includes Saccharomyces cerevisiae G2/mitotic-specific cyclins 1-4 (ScCLB1-4), S-phase entry cyclins 5-6 (ScCLB5-6), and Schizosaccharomyces pombe G2/mitotic-specific cyclins, cig1, cig2 and cdc13. ScCLB1-4 are essential for the control of the cell cycle at the G2/M (mitosis) transition. They interact with the CDC2 protein kinase to form maturation promoting factor (MPF). ScCLB5-6 interact with CDC28 and are involved in DNA replication in Saccharomyces cerevisiae. ScCLB5 is required for efficient progression through S phase and possibly for the normal progression through meiosis. ScCLB6 is involved in G1/S and/or S phase progression. Cig1 is required for efficient passage of the G1/S transition. Cig2 and cdc13 are essential for the control of the cell cycle at the G2/M and G1/S (mitosis) transition. They interact with the cdc2 protein kinase to form MPF. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 116
37402 410217 cd20513 CYCLIN_CCNC_rpt1 first cyclin box found in cyclin-C (CCNC) and similar proteins. CCNC, also termed CycC, or SRB11, is a component of the Mediator complex, a coactivator involved in regulated gene transcription of nearly all RNA polymerase II-dependent genes. It mediates stress-induced mitochondrial fission and apoptosis. CCNC contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 101
37403 410218 cd20514 CYCLIN_CCNC_rpt2 second cyclin box found in cyclin-C (CCNC) and similar proteins. CCNC, also termed CycC, or SRB11, is a component of the Mediator complex, a coactivator involved in regulated gene transcription of nearly all RNA polymerase II-dependent genes. It mediates stress-induced mitochondrial fission and apoptosis. CCNC contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 92
37404 410219 cd20515 CYCLIN_CCND_rpt1 first cyclin box found in cyclin-D (CCND) family. The CCND family includes three mitogen-induced D-type cyclins, CCND1, CCND2 and CCND3, which function as regulatory subunits of the cyclin-dependent kinases CDK4 and CDK6, that drive progression through the G1 phase of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 150
37405 410220 cd20516 CYCLIN_CCND_rpt2 second cyclin box found in cyclin-D (CCND) family. The CCND family includes three mitogen-induced D-type cyclins, CCND1, CCND2 and CCND3, which function as regulatory subunits of the cyclin-dependent kinases CDK4 and CDK6, that drive progression through the G1 phase of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 98
37406 410221 cd20517 CYCLIN_vCyC_rpt1 first cyclin box found in viral cyclin (v-cyclin). v-Cyclin modulates host cell cycle progression and apoptotic signaling pathways. It forms an active kinase complex with cellular CDK6, a cellular cyclin-dependent kinase known to interact with cellular type D cyclins. v-Cyclin belongs to Cyclin D subfamily. It contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 99
37407 410222 cd20518 CYCLIN_vCyC_rpt2 second cyclin box found in viral cyclin (v-cyclin). v-Cyclin modulates host cell cycle progression and apoptotic signaling pathways. It forms an active kinase complex with cellular CDK6, a cellular cyclin-dependent kinase known to interact with cellular type D cyclins. v-Cyclin belongs to Cyclin D subfamily. It contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 100
37408 410223 cd20519 CYCLIN_CCNE_rpt1 first cyclin box found in G1/S-specific cyclin-E (CCNE) family. The CCNE family includes two E-type cyclins, CCNE1 and CCNE2. CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 131
37409 410224 cd20520 CYCLIN_CCNE_rpt2 second cyclin box found in G1/S-specific cyclin-E (CCNE) family. The CCNE family includes two E-type cyclins, CCNE1 and CCNE2. CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 105
37410 410225 cd20521 CYCLIN_CCNF_rpt1 first cyclin box found in G2/mitotic-specific cyclin-F (CCNF) and similar proteins. CCNF, also termed F-box only protein 1 (FBXO1), is a substrate recognition component of a SCF (SKP1-CUL1-F-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of CP110 during G2 phase, thereby acting as an inhibitor of centrosome reduplication. It is the largest among all cyclins and oscillates in the cell cycle like other cyclins. CCNF contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 95
37411 410226 cd20522 CYCLIN_CCNF_rpt2 second cyclin box found in G2/mitotic-specific cyclin-F (CCNF) and similar proteins. CCNF, also termed F-box only protein 1 (FBXO1), is a substrate recognition component of a SCF (SKP1-CUL1-F-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of CP110 during G2 phase, thereby acting as an inhibitor of centrosome reduplication. It is the largest among all cyclins and oscillates in the cell cycle like other cyclins. CCNF contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 112
37412 410227 cd20523 CYCLIN_CCNG cyclin box found in the cyclin-G (CCNG) family. The CCNG family includes two cyclins, CCNG1 and CCNG2. CCNG1 is the only cyclin that has either positive or negative effects on cell growth. It is associated with G2/M phase arrest in response to DNA damage. It is also involved in the development of human carcinoma. CCNG2 may play a role in growth regulation and in negative regulation of cell cycle progression. It has been identified as a tumor suppressor in several cancers. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 94
37413 410228 cd20524 CYCLIN_CCNH_rpt1 first cyclin box found in cyclin-H (CCNH) and similar proteins. CCNH, also called MO15-associated protein, p34, or p37, is normally associated with the cyclin-dependent kinase cdk7, the catalytic subunit of the CDK-activating kinase (CAK) enzymatic complex. CCNH contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 150
37414 410229 cd20525 CYCLIN_CCNH_rpt2 second cyclin box found in cyclin-H (CCNH) and similar proteins. CCNH, also called MO15-associated protein, p34, or p37, is normally associated with the cyclin-dependent kinase cdk7, the catalytic subunit of the CDK-activating kinase (CAK) enzymatic complex. CCNH contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 126
37415 410230 cd20526 CYCLIN_CCNI-like cyclin box found in cyclin-I (CCNI) and similar proteins. CCNI is an atypical cyclin because it is most abundant in post-mitotic cells. It is involved in various biological processes, such as cell survival, angiogenesis, cell differentiation, and cell cycle progression. CCNI contains a typical cyclin box near the N-terminus and a PEST sequence near the C-terminus. The cyclin box is a protein binding domain. 99
37416 410231 cd20528 CYCLIN_CCNJ-like_rpt1 first cyclin box found in cyclin-J (CCNJ) family. The CCNJ family includes two cyclins, CCNJ and CCNJ-like. CCNJ may regulate the cell cycle or transcription. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 103
37417 410232 cd20529 CYCLIN_CCNJ-like_rpt2 second cyclin box found in cyclin-J (CCNJ) family. The CCNJ family includes two cyclins, CCNJ and CCNJ-like. CCNJ may regulate the cell cycle or transcription. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 101
37418 410233 cd20530 CYCLIN_CCNK_rpt1 first cyclin box found in cyclin-K (CCNK) and similar proteins. CCNK is a novel RNA polymerase II-associated C-type cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. It is a regulatory subunit of cyclin-dependent kinases that mediates the activation of these target kinases. It plays a role in transcriptional regulation by controlling the phosphorylation of the C-terminal domain (CTD) of the large subunit of RNA polymerase II (POLR2A). CCNK contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 115
37419 410234 cd20531 CYCLIN_CCNK_rpt2 second cyclin box found in cyclin-K (CCNK) and similar proteins. CCNK is a novel RNA polymerase II-associated C-type cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. It is a regulatory subunit of cyclin-dependent kinases that mediates the activation of these target kinases. It plays a role in transcriptional regulation by controlling the phosphorylation of the C-terminal domain (CTD) of the large subunit of RNA polymerase II (POLR2A). CCNK contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 101
37420 410235 cd20532 CYCLIN_CCNL_rpt1 first cyclin box found in cyclin-L (CCNL) family. The CCNL family includes two cyclins, CCNL1 and CCNL2. CCNL1 is involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL2 is a novel RNA polymerase II-associated cyclin involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 127
37421 410236 cd20533 CYCLIN_CCNL_rpt2 second cyclin box found in cyclin-L (CCNL) family. The CCNL family includes two cyclins, CCNL1 and CCNL2. CCNL1 is involved in regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL2 is a novel RNA polymerase II-associated cyclin involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 92
37422 410237 cd20534 CYCLIN_CCNM_CCNQ_rpt1 first cyclin box found in cyclin-M (CCNM) family. The CCNM family proteins, also called ancient conserved domain proteins (ACDPs), are evolutionarily conserved Mg2+ transporters. CCNM, also called cyclin-Q (CCNQ), or CDK10-activating cyclin, or cyclin-related protein FAM58A, associates with CDK10 to promote its kinase activity. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 110
37423 410238 cd20535 CYCLIN_CCNM_CCNQ_rpt2 second cyclin box found in cyclin-M (CCNM) family. The CCNM family proteins, also called ancient conserved domain proteins (ACDPs), are evolutionarily conserved Mg2+ transporters. CCNM, also called cyclin-Q (CCNQ), or CDK10-activating cyclin, or cyclin-related protein FAM58A, associates with CDK10 to promote its kinase activity. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 104
37424 410239 cd20536 CYCLIN_CCNO_rpt1 first cyclin box found in cyclin-O (CCNO) and similar proteins. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 93
37425 410240 cd20537 CYCLIN_CCNO-like_rpt2 second cyclin box found in cyclin-O (CCNO) and similar proteins. This subfamily is composed of CCNO and similar proteins including Schizosaccharomyces pombe meiosis-specific cyclin rem1, Drosophila melanogaster G2/mitotic-specific cyclin-A (CCNA), and Candida albicans G1/S-specific cyclin CCN1, among others. Rem1 is required for pre-meiotic DNA synthesis and S phase progression. CCNA is essential for the control of the cell cycle at the G2/M (mitosis) transition. CCN1 is essential for the control of the cell cycle at the G1/S (start) transition and for maintenance of filamentous growth. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It is involved in the activation of cyclin-dependent kinase 2. Members of this subfamily contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 91
37426 410241 cd20538 CYCLIN_CCNT_rpt1 first cyclin box found in cyclin-T (CCNT) family. The CCNT family includes two C-type cyclins, cyclin-T1 (CCNT1) and cyclin-T2 (CCNT2), both of which are regulatory subunits of the cyclin-dependent kinase pair (CDK9/cyclin-T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 137
37427 410242 cd20539 CYCLIN_CCNT_rpt2 second cyclin box found in cyclin-T (CCNT) family. The CCNT family includes two C-type cyclins, cyclin-T1 (CCNT1) and cyclin-T2 (CCNT2), both of which are regulatory subunits of the cyclin-dependent kinase pair (CDK9/cyclin-T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 92
37428 410243 cd20540 CYCLIN_CCNY_like cyclin box found in cyclin-Y (CCNY) family. The CCNY family includes two cyclins, CCNY and CCNY-like protein 1 (CCNYL1). They can enhance Wnt/beta-catenin signaling in mitosis. CCNY, also called Cyc-Y, cyclin box protein 1 (CBCP1), cyclin fold protein 1 (CFP1), or cyclin-X (CCNX), is a key cell cycle regulator that acts as a growth factor sensor to integrate extracellular signals with the cell cycle machinery. It is a positive regulatory subunit of the cyclin-dependent kinases CDK14/PFTK1 and CDK16. It acts as a cell-cycle regulator of Wnt signaling pathway during G2/M phase by recruiting CDK14/PFTK1 to the plasma membrane and promoting phosphorylation of LRP6, leading to the activation of the Wnt signaling pathway. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 97
37429 410244 cd20541 CYCLIN_CNTD1 cyclin box found in Cyclin N-terminal domain-containing protein 1 (CNTD1) and similar proteins. CNTD1 is a cyclin-related protein critical for meiotic crossover maturation and deselection of excess precrossover sites. CNTD1 contains one cyclin box. The cyclin box is a protein binding domain. 127
37430 410245 cd20542 CYCLIN_CNTD2 cyclin box found in Cyclin N-terminal domain-containing protein 2 (CNTD2) and similar proteins. CNTD2 is an atypical cyclin upregulated in human cancer tissues. It promotes cell proliferation and migration, as well as increases tumor growth in vivo. It can function as a prognostic factor and drug target. CNTD2 contains one cyclin box. The cyclin box is a protein binding domain. 96
37431 410246 cd20543 CYCLIN_AtCycD-like_rpt1 first cyclin box found in plant cyclin-delta family. This subfamily is composed of plant delta family cyclins, including a group of G1/S-specific D-type cyclins from Arabidopsis thaliana which may activate the cell cycle in the root apical meristem (RAM) and promote embryonic root (radicle) protrusion. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 99
37432 410247 cd20544 CYCLIN_AtCycD-like_rpt2 second cyclin box found in plant cyclin-delta family. This subfamily is composed of plant delta family cyclins, including a group of G1/S-specific D-type cyclins from Arabidopsis thaliana which may activate the cell cycle in the root apical meristem (RAM) and promote embryonic root (radicle) protrusion. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 90
37433 410248 cd20545 CYCLIN_SpCG1C-like_rpt1 first cyclin box found in Schizosaccharomyces pombe cyclin C homolog 1 (pch1) and similar proteins. Cyclin pch1 is essential for progression through the whole cell cycle. It is a homolog of cyclin T; it forms a heterodimer with its partner kinase, cyclin-dependent kinase 9 (CDK9), that can phosphorylate both the pol II C-terminal domain (CTD) and the CTD of transcription elongation factor Spt5. Yeast Cdk9/Pch1, with mRNA capping enzyme Pct1, may also form an elongation checkpoint for mRNA quality control. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 116
37434 410249 cd20546 CYCLIN_SpCG1C_ScCTK2-like_rpt2 second cyclin box found in Schizosaccharomyces pombe cyclin C homolog 1 (pch1), Saccharomyces cerevisiae CTD kinase subunit 2 (ScCTK2), and similar proteins. Cyclin pch1 is essential for progression through the whole cell cycle. It is a homolog of cyclin T; it forms a heterodimer with its partner kinase, cyclin-dependent kinase 9 (CDK9), that can phosphorylate both the pol II C-terminal domain (CTD) and the CTD of transcription elongation factor Spt5. Yeast Cdk9/Pch1, with mRNA capping enzyme Pct1, may also form an elongation checkpoint for mRNA quality control. CTK2, also called CTD kinase subunit beta, CTDK-I subunit beta, or CTD kinase 38 kDa subunit, is the cyclin subunit of the CTDK-I complex, which hyperphosphorylates the C-terminal heptapeptide repeat domain (CTD) of the largest RNA polymerase II subunit. This group also includes yeast RNA polymerase II holoenzyme cyclin-like subunit, a component of the SRB8-11 complex, a regulatory module of the Mediator complex which is involved in regulation of basal and activated RNA polymerase II-dependent transcription. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 97
37435 410250 cd20547 CYCLIN_ScCTK2-like_rpt1 first cyclin box found in Saccharomyces cerevisiae CTD kinase subunit 2 (ScCTK2) and similar proteins. CTK2, also called CTD kinase subunit beta, CTDK-I subunit beta, or CTD kinase 38 kDa subunit, is the cyclin subunit of the CTDK-I complex, which hyperphosphorylates the C-terminal heptapeptide repeat domain (CTD) of the largest RNA polymerase II subunit. CTK2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 110
37436 410251 cd20548 CYCLIN_RB-like cyclin box found in retinoblastoma-associated protein (RB) family. The RB family includes retinoblastoma-associated protein (RB), and two retinoblastoma-like proteins, RBL1 and RBL2. RB, also called p105-Rb, pRb, or pp110, is a key regulator of entry into cell division, and also acts as a tumor suppressor. It promotes G0-G1 transition when phosphorylated by CDK3/cyclin-C. It also acts as a transcription repressor of E2F1 target genes. RB is directly involved in heterochromatin formation by maintaining overall chromatin structure. It recruits and targets histone methyltransferases SUV39H1, KMT5B and KMT5C, leading to epigenetic transcriptional repression. RBL1 and RBL2 are also key regulators of entry into cell division. RBL1 and RBL2 recruit and target histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. They control histone H4 'Lys-20' trimethylation and probably act as transcription repressors by recruiting chromatin-modifying enzymes to promoters. They may also act as tumor suppressors. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 122
37437 410252 cd20549 CYCLIN_TFIIB_archaea_like_rpt1 first cyclin box found in archaeal transcription initiation factor IIB (TFIIB) and similar proteins. Archaeal TFIIB stabilizes TATA-binding protein (TBP) binding to an archaeal box-A promoter. It is also responsible for recruiting RNA polymerase II to the pre-initiation complex (DNA-TBP-TFIIB). TFIIB contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB. 99
37438 410253 cd20550 CYCLIN_TFIIB_archaea_like_rpt2 second cyclin box found in archaeal transcription initiation factor IIB (TFIIB) and similar proteins. Archaeal TFIIB stabilizes TATA-binding protein (TBP) binding to an archaeal box-A promoter. It is also responsible for recruiting RNA polymerase II to the pre-initiation complex (DNA-TBP-TFIIB). TFIIB contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB. 87
37439 410254 cd20551 CYCLIN_TFIIB_rpt1 first cyclin box found in transcription initiation factor IIB (TFIIB) and similar proteins. TFIIB, also called B-related factor 2 (BRF-2) or S300-II, is a general transcription factor that plays a role in transcription initiation by RNA polymerase II (Pol II). It is involved in the pre-initiation complex (PIC) formation and Pol II recruitment at promoter DNA. TFIIB contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB. 88
37440 410255 cd20552 CYCLIN_TFIIB_rpt2 second cyclin box found in transcription initiation factor IIB (TFIIB) and similar proteins. TFIIB, also called B-related factor 2 (BRF-2) or S300-II, is a general transcription factor that plays a role in transcription initiation by RNA polymerase II (Pol II). It is involved in the pre-initiation complex (PIC) formation and Pol II recruitment at promoter DNA. TFIIB contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB. 97
37441 410256 cd20553 CYCLIN_TFIIIB90_rpt1 first cyclin box found in transcription factor IIIB 90 kDa subunit (TFIIIB90) and similar proteins. TFIIIB90, also called B-related factor 1 (BRF-1), or TATA box-binding protein-associated factor, RNA polymerase III, subunit 2 (TAF3B2), is a general activator of RNA polymerase which utilizes different TFIIIB complexes at structurally distinct promoters. TFIIIB90 contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIIB90. 91
37442 410257 cd20554 CYCLIN_TFIIIB90_rpt2 second cyclin box found in transcription factor IIIB 90 kDa subunit (TFIIIB90) and similar proteins. TFIIIB90, also called B-related factor 1 (BRF-1), or TATA box-binding protein-associated factor, RNA polymerase III, subunit 2 (TAF3B2), is a general activator of RNA polymerase which utilizes different TFIIIB complexes at structurally distinct promoters. TFIIIB90 contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIIB90. 92
37443 410258 cd20555 CYCLIN_BRF2 cyclin box found in B-related factor 2 (BRF-2) and similar proteins. BRF-2, also called transcription factor IIIB 50 kDa subunit (TFIIIB50), or BRFU, is a general activator of RNA polymerase (Pol) III transcription and is required for Pol III transcription of genes with promoter elements upstream of the initiation sites. It recruits Pol III to type III gene-external promoters, including the U6 spliceosomal RNA and selenocysteine tRNA genes. BRF-2 contains one cyclin box. The cyclin box fold is generally a protein binding domain, but binds DNA in BRF-2. 97
37444 410259 cd20556 CYCLIN_CABLES cyclin box found in CDK5 and ABL1 enzyme substrate (CABLES) family. The CABLES family includes CABLES1 and CABLES2. CABLES1, also called interactor with CDK3 1 (Ik3-1), is a cyclin-dependent kinase binding protein that enhances cyclin-dependent kinase tyrosine phosphorylation by non-receptor tyrosine kinases, such as that of CDK5 by activated ABL1, which leads to increased CDK5 activity and is critical for neuronal development, and that of CDK2 by WEE1, which leads to decreased CDK2 activity and growth inhibition. CABLES2, also called interactor with CDK3 2 (Ik3-2), acts as a proapoptotic factor involved in both p53-mediated and p53-independent apoptotic pathways. Both CABLES1 and CABLES2 contain one cyclin box. The cyclin box is a protein binding domain. 119
37445 410260 cd20557 CYCLIN_ScPCL1-like cyclin box found in Saccharomyces cerevisiae G1/S-specific cyclin PCL1, PCL2 and similar proteins. The family includes a group of cyclin-like proteins that interact with the Pho85 cyclin-dependent kinase, such as Saccharomyces cerevisiae G1/S-specific cyclin PCL1, PCL2, PCL9 and their vertebrate counterparts, cyclin Pas1/PHO80 domain-containing protein 1 (CNPPD1). PCL1 (also called PHO85 cyclin-1, or cyclin HCS26) and PCL2 (also called PHO85 cyclin-2, or cyclin HCS26 homolog) are G1/S-specific cyclin partners of the cyclin-dependent kinase (CDK) PHO85. They are essential for the control of the cell cycle at the G1/S (start) transition. The PCL1-PHO85 cyclin-CDK holoenzyme is involved in phosphorylation of the CDK inhibitor (CKI) SIC1, which is required for its ubiquitination and degradation, releasing repression of B-type cyclins and promoting exit from mitosis. Together with cyclin PCL2, it positively controls degradation of sphingoid long chain base kinase LCB4. PCL1-PHO85 also phosphorylates HMS1, NCP1 and NPA3, which may all have a role in mitotic exit. PCL2-PHO85 also phosphorylates RVS167, linking cyclin-CDK activity with organization of the actin cytoskeleton. PCL9 is an M/G1-specific cyclin partner of the cyclin-dependent kinase (CDK) PHO85. It may have a role in bud site selection in the G1 phase. The family also includes cyclin Pas1/PHO80 domain-containing protein 1 (CNPPD1) and similar proteins. Their biological functions remain unclear. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 94
37446 410261 cd20558 CYCLIN_ScPCL7-like cyclin box found in Saccharomyces cerevisiae PHO85 cyclin-7 (ScPCL7) and similar proteins. ScPCL7, also called PHO85-associated protein 1, is a cyclin partner of the cyclin-dependent kinase (CDK) PHO85. Together with cyclin PCL6, ScPCL7 controls glycogen phosphorylase and glycogen synthase activities in response to nutrient availability. This family also includes Schizosaccharomyces pombe PHO85 cyclin-like protein Psl1 (SpPsl1) and Arabidopsis thaliana PHO80-like proteins, P-type cyclins (CYCPs). SpPsl1 is the cyclin partner of the CDK pef1 (PHO85 homolog). CYCPs may be involved in cell division, cell differentiation, and the nutritional status of the cell in Arabidopsis thaliana. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 101
37447 410262 cd20559 CYCLIN_ScCLN_like cyclin box found in Saccharomyces cerevisiae G1/S-specific cyclins (ScCLNs) and similar proteins. ScCLNs, including ScCLN1-3, are essential for the control of the cell cycle at the G1/S (start) transition in Saccharomyces. ScCLN1 and ScCLN2 interact with the CDC28 protein kinase to form maturation promoting factor (MPF). ScCLN3 may be an upstream activator of the G1 cyclins that directly initiate G1/S transition. This family also includes Schizosaccharomyces pombe cyclin puc1, which contributes to negative regulation of the timing of sexual development in fission yeast, and functions at the transition between cycling and non-cycling cells. It interacts with protein kinase A. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 95
37448 410263 cd20560 CYCLIN_CCNA1_rpt1 first cyclin box found in cyclin-A1 (CCNA1). CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 163
37449 410264 cd20561 CYCLIN_CCNA2_rpt1 first cyclin box found in cyclin-A2 (CCNA2) and similar proteins. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. It is significantly over-expressed in various cancer types, and can be used as a prognostic biomarker for estrogen receptor positive (ER+) breast cancer and tamoxifen resistance. CCNA2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 131
37450 410265 cd20562 CYCLIN_AtCycA_like_rpt1 first cyclin box found in Arabidopsis thaliana A-type cyclins (CycAs) and similar proteins. Plant A-type cyclins (CycAs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycAs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 136
37451 410266 cd20563 CYCLIN_CCNA1_rpt2 second cyclin box found in cyclin-A1 (CCNA1) and similar proteins. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 123
37452 410267 cd20564 CYCLIN_CCNA2_rpt2 second cyclin box found in cyclin-A2 (CCNA2) and similar proteins. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. It is significantly overexpressed in various cancer types, and can be used as a prognostic biomarker for estrogen receptor positive (ER+) breast cancer and tamoxifen resistance. CCNA2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 110
37453 410268 cd20565 CYCLIN_CCNB1_rpt1 first cyclin box found in G2/mitotic-specific cyclin-B1 (CCNB1). CCNB1 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for embryo development. Over-expression of human CCNB1 has been found in numerous cancers and has been associated with tumor aggressiveness. CCNB1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 129
37454 410269 cd20566 CYCLIN_CCNB2_rpt1 first cyclin box found in G2/mitotic-specific cyclin-B2 (CCNB2) and similar proteins. CCNB2 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for progression through meiosis. CCNB2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 133
37455 410270 cd20567 CYCLIN_AtCycB-like_rpt1 first cyclin box found in Arabidopsis thaliana B-type cyclins (CycBs) and similar proteins. Plant B-type cyclins (CycBs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycBs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 147
37456 410271 cd20568 CYCLIN_CLBs_yeast_rpt1 first cyclin box found in yeast B-type cyclins. This subfamily includes Saccharomyces cerevisiae G2/mitotic-specific cyclins 1-4 (ScCLB1-4), S-phase entry cyclins 5-6 (ScCLB5-6), and Schizosaccharomyces pombe G2/mitotic-specific cyclins, cig1, cig2 and cdc13. ScCLB1-4 are essential for the control of the cell cycle at the G2/M (mitosis) transition. They interact with the CDC2 protein kinase to form maturation promoting factor (MPF). ScCLB5-6 interact with CDC28 and are involved in DNA replication in Saccharomyces cerevisiae. ScCLB5 is required for efficient progression through S phase and possibly for the normal progression through meiosis. ScCLB6 is involved in G1/S and/or S phase progression. Cig1 is required for efficient passage of the G1/S transition. Cig2 and cdc13 are essential for the control of the cell cycle at the G2/M and G1/S (mitosis) transitions. They interact with the cdc2 protein kinase to form MPF. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 134
37457 410272 cd20569 CYCLIN_CCNB1_rpt2 second cyclin box found in G2/mitotic-specific cyclin-B1 (CCNB1) and similar proteins. CCNB1 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for embryo development. Over-expression of human CCNB1 has been found in numerous cancers and has been associated with tumor aggressiveness. CCNB1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 121
37458 410273 cd20570 CYCLIN_CCNB2_rpt2 second cyclin box found in G2/mitotic-specific cyclin-B2 (CCNB2) and similar proteins. CCNB2 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for progression through meiosis. CCNB2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 119
37459 410274 cd20571 CYCLIN_AtCycC_rpt1 first cyclin box found in Arabidopsis thaliana C-type cyclins (CycCs) and similar proteins. Plant CycCs are the cognate cyclin partners of cyclin-dependent kinase CDK8. They may be involved in cell cycle control. CycCs contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 107
37460 410275 cd20572 CYCLIN_AtCycC_rpt2 second cyclin box found in Arabidopsis thaliana C-type cyclins (CycCs) and similar proteins. CycCs are the cognate cyclin partners of cyclin-dependent kinase CDK8. They may be involved in cell cycle control. CycCs contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 102
37461 410276 cd20573 CYCLIN_CCND1_rpt1 first cyclin box found in G1/S-specific cyclin-D1 (CCND1). CCND1, also called B-cell lymphoma 1 protein (BCL-1), or BCL-1 oncogene, or PRAD1 oncogene, is a regulatory component of the cyclin D1-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1. The complex also regulates the cell-cycle during G(1)/S transition. It is an important cell cycle regulatory protein involved in carcinogenesis of various human cancers. CCND1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 149
37462 410277 cd20574 CYCLIN_CCND2_rpt1 first cyclin box found in G1/S-specific cyclin-D2 (CCND2). CCND2 is a regulatory component of the cyclin D2-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is a critical mediator of exercise-induced cardiac hypertrophy. It also acts as a regulator of cell cycle proteins affecting SAMHD1-mediated HIV-1 restriction in non-proliferating macrophages. CCND2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 150
37463 410278 cd20575 CYCLIN_CCND3_rpt1 first cyclin box found in G1/S-specific cyclin-D3 (CCND3) and similar proteins. CCND3 is a regulatory component of the cyclin D3-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. In skeletal muscle, CCND3 plays a unique function in controlling the proliferation/differentiation balance of myogenic progenitor cells. CCND3 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 150
37464 410279 cd20576 CYCLIN_CCND1_rpt2 second cyclin box found in G1/S-specific cyclin-D1 (CCND1). CCND1, also called B-cell lymphoma 1 protein (BCL-1), or BCL-1 oncogene, or PRAD1 oncogene, is a regulatory component of the cyclin D1-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is an important cell cycle regulatory protein involved in carcinogenesis of various human cancers. CCND1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 110
37465 410280 cd20577 CYCLIN_CCND2_rpt2 second cyclin box found in G1/S-specific cyclin-D2 (CCND2). CCND2 is a regulatory component of the cyclin D2-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is a critical mediator of exercise-induced cardiac hypertrophy. It also acts as a regulator of cell cycle proteins affecting SAMHD1-mediated HIV-1 restriction in non-proliferating macrophages. CCND2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 105
37466 410281 cd20578 CYCLIN_CCND3_rpt2 second cyclin box found in G1/S-specific cyclin-D3 (CCND3). CCND3 is a regulatory component of the cyclin D3-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. In skeletal muscle, CCND3 plays a unique function in controlling the proliferation/differentiation balance of myogenic progenitor cells. CCND3 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 105
37467 410282 cd20579 CYCLIN_CCNE1_rpt1 first cyclin box found in G1/S-specific cyclin-E1 (CCNE1). CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 137
37468 410283 cd20580 CYCLIN_CCNE2_rpt1 first cyclin box found in G1/S-specific cyclin-E2 (CCNE2). CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. CCNE2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 137
37469 410284 cd20581 CYCLIN_CCNE1_rpt2 second cyclin box found in G1/S-specific cyclin-E1 (CCNE1). CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 114
37470 410285 cd20582 CYCLIN_CCNE2_rpt2 second cyclin box found in G1/S-specific cyclin-E2 (CCNE2). CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. CCNE2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 99
37471 410286 cd20583 CYCLIN_CCNG1 cyclin box found in cyclin-G1 (CCNG1) and similar proteins. CCNG1 is the only cyclin that has either positive or negative effects on cell growth. It is associated with G2/M phase arrest in response to DNA damage. It is also involved in the development of human carcinoma. CCNG1 contains one cyclin box. The cyclin box is a protein binding domain. 98
37472 410287 cd20584 CYCLIN_CCNG2 cyclin box found in cyclin-G2 (CCNG2) and similar proteins. CCNG2 may play a role in growth regulation and in negative regulation of cell cycle progression. It has been identified as a tumor suppressor in several cancers. CCNG2 contains one cyclin box. The cyclin box is a protein binding domain. 96
37473 410288 cd20585 CYCLIN_AcCycH_rpt1 first cyclin box found in Arabidopsis thaliana H-type cyclin (CycH) and similar proteins. CycH associates with and activates the cyclin-dependent kinases, CDK-2 and CDK-3. CycH contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 170
37474 410289 cd20586 CYCLIN_AcCycH_rpt2 second cyclin box found in Arabidopsis thaliana H-type cyclin (CycH) and similar proteins. CycH associates with and activates the cyclin-dependent kinases, CDK-2 and CDK-3. CycH contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 132
37475 410290 cd20587 CYCLIN_AcCycT_rpt1 first cyclin box found in Arabidopsis thaliana T-type cyclins (CycTs) and similar proteins. CycTs associate with their partner cyclin-dependent kinases (CDKs) to trigger their kinase activity. CycTs show high sequence similarity with metazoan cyclin-K (CCNK). CycTs contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 124
37476 410291 cd20588 CYCLIN_AcCycT_rpt2 second cyclin box found in Arabidopsis thaliana T-type cyclins (CycTs) and similar proteins. CycTs associate with their partner cyclin-dependent kinases (CDKs) to trigger their kinase activity. CycTs show high sequence similarity with metazoan cyclin-K (CCNK). CycTs contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 93
37477 410292 cd20589 CYCLIN_CCNL1_rpt1 first cyclin box found in cyclin-L1 (CCNL1) and similar proteins. CCNL1 is an L-type cyclin involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 124
37478 410293 cd20590 CYCLIN_CCNL2_rpt1 first cyclin box found in cyclin-L2 (CCNL2) and similar proteins. CCNL2 is a novel RNA polymerase II-associated cyclin that is involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. CCNL2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 156
37479 410294 cd20591 CYCLIN_AcCycL_rpt1 first cyclin box found in Arabidopsis thaliana L-type cyclins (CycL) and similar proteins. Cyclin-L1-1 (CycL1), also called arginine-rich cyclin 1 (AtRCY1), or protein MODIFIER OF SNC1 12, is the cognate cyclin for cyclin-dependent kinase G1 (CDKG1). It is involved in regulation of DNA methylation and transcriptional silencing. It is required for synapsis and male meiosis, and for the proper splicing of specific resistance (R) genes. L-type cyclins contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 128
37480 410295 cd20592 CYCLIN_CCNL1_rpt2 second cyclin box found in cyclin-L1 (CCNL1) and similar proteins. CCNL1 is an L-type cyclin involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 123
37481 410296 cd20593 CYCLIN_CCNL2_rpt2 second cyclin box found in cyclin-L2 (CCNL2) and similar proteins. CCNL2 is a novel RNA polymerase II-associated cyclin that is involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. CCNL2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 123
37482 410297 cd20594 CYCLIN_AcCycL_rpt2 second cyclin box found in Arabidopsis thaliana L-type cyclins (CycL) and similar proteins. Cyclin-L1-1 (CycL1), also called arginine-rich cyclin 1 (AtRCY1), or protein MODIFIER OF SNC1 12, is the cognate cyclin for cyclin-dependent kinase G1 (CDKG1). It is involved in regulation of DNA methylation and transcriptional silencing. It is required for synapsis and male meiosis, and for the proper splicing of specific resistance (R) genes. L-type cyclins contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 101
37483 410298 cd20595 CYCLIN_CCNT1_rpt1 first cyclin box found in cyclin-T1 (CCNT1). CCNT1, also termed CycT1, is a host factor essential for HIV-1 replication in CD4 T cells and macrophages. It is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin-T1) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). CCNT1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 138
37484 410299 cd20596 CYCLIN_CCNT2_rpt1 first cyclin box found in cyclin-T2 (CCNT2). CCNT2, also termed CycT2, is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (carboxy-terminal domain) of the large subunit of RNA polymerase II (RNAP II). CCNT2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain. 139
37485 410300 cd20597 CYCLIN_CCNT1_rpt2 second cyclin box found in cyclin-T1 (CCNT1). CCNT1, also termed CycT1, is a host factor essential for HIV-1 replication in CD4 T cells and macrophages. It is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin-T1) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). CCNT1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 109
37486 410301 cd20598 CYCLIN_CCNT2_rpt2 second cyclin box found in cyclin-T2 (CCNT2). CCNT2, also termed CycT2, is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (carboxy-terminal domain) of the large subunit of RNA polymerase II (RNAP II). CCNT2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 114
37487 410302 cd20599 CYCLIN_RB cyclin box found in retinoblastoma-associated protein (RB) and similar proteins. RB, also called p105-Rb, pRb, or pp110, is a key regulator of entry into cell division and also acts as a tumor suppressor. It promotes G0-G1 transition when phosphorylated by CDK3/cyclin-C. It also acts as a transcription repressor of E2F1 target genes. RB is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. It recruits and targets histone methyltransferases SUV39H1, KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. RB contains one cyclin box. The cyclin box is a protein binding domain. 126
37488 410303 cd20600 CYCLIN_RBL cyclin box found in retinoblastoma-like protein (RBL) subfamily. The RBL subfamily includes two retinoblastoma-like proteins, RBL1 and RBL2. They are key regulators of entry into cell division and are directly involved in heterochromatin formation by maintaining overall chromatin structure. RBL1 and RBL2 recruit and target histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. They control histone H4 'Lys-20' trimethylation and probably act as transcription repressors by recruiting chromatin-modifying enzymes to promoters. They may also act as tumor suppressors. Members of this family contain one cyclin box. The cyclin box is a protein binding domain. 112
37489 410304 cd20601 CYCLIN_AtRBR_like cyclin box found in Arabidopsis thaliana retinoblastoma-related protein 1 (AtRBR1) and similar proteins. AtRBR1 is a key regulator of entry into cell division. It acts as a transcription repressor of E2F target genes, whose activity is required for progress from the G1 to the S phase of the cell cycle. AtRBR1 plays a central role in the mechanism controlling meristem cell differentiation, cell fate establishment and cell fate maintenance during organogenesis and gametogenesis. AtRBR1 contains one cyclin box. The cyclin box is a protein binding domain. 129
37490 410305 cd20602 CYCLIN_CABLES1 cyclin box found in CDK5 and ABL1 enzyme substrate 1 (CABLES1). CABLES1, also called interactor with CDK3 1 (Ik3-1), is a cyclin-dependent kinase binding protein that enhances cyclin-dependent kinase tyrosine phosphorylation by non-receptor tyrosine kinases, such as that of CDK5 by activated ABL1, which leads to increased CDK5 activity and is critical for neuronal development, and that of CDK2 by WEE1, which leads to decreased CDK2 activity and growth inhibition. CABLES1 contains one cyclin box. The cyclin box is a protein binding domain. 132
37491 410306 cd20603 CYCLIN_CABLES2 cyclin box found in CDK5 and ABL1 enzyme substrate 2 (CABLES2). CABLES2, also called interactor with CDK3 2 (Ik3-2), acts as a proapoptotic factor involved in both p53-mediated and p53-independent apoptotic pathways. CABLES2 contains one cyclin box. The cyclin box is a protein binding domain. 121
37492 410307 cd20604 CYCLIN_AtCycU-like cyclin box found in Arabidopsis thaliana U-type cyclins (CycUs) and similar proteins. CycUs interact with cyclin-dependent kinase A-1 (CDKA-1) to trigger its kinase activity. CycUs contain one cyclin box. The cyclin box is a protein binding domain. 126
37493 410308 cd20605 CYCLIN_RBL1 cyclin box found in retinoblastoma-like protein 1 (RBL1) and similar proteins. RBL1, also called 107 kDa retinoblastoma-associated protein (p107), retinoblastoma-related protein 1 (RBR-1), or pRb1, is a key regulator of entry into cell division. It is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. RBL1 recruits and targets histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. RBL1 probably acts as a transcription repressor by recruiting chromatin-modifying enzymes to promoters. It may also act as a tumor suppressor. RBL1 contains one cyclin box. The cyclin box is a protein binding domain. 130
37494 410309 cd20606 CYCLIN_RBL2 cyclin box found in retinoblastoma-like protein 2 (RBL2) and similar proteins. RBL2, also called 130 kDa retinoblastoma-associated protein (p130), retinoblastoma-related protein 2 (RBR-2), or pRb2, is a key regulator of entry into cell division. It is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. RBL2 recruits and targets histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. It probably acts as a transcription repressor by recruiting chromatin-modifying enzymes to promoters. It may also act as a tumor suppressor. RBL2 contains one cyclin box. The cyclin box is a protein binding domain. 189
37495 380328 cd20607 FbiB_C-like nitroreductase family domain similar to the C-terminal domain of F420:gamma-glutamyl ligase FbiB. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Mycobacterium tuberculosis FbiB is a two-domain protein that produces F420 with predominantly 5 to 7 L-glutamate residues in the poly-gamma-glutamate tail; its C-terminal domain is homologous to FMN-dependent nitroreductases. 155
37496 380329 cd20608 nitroreductase nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase. 145
37497 380330 cd20609 nitroreductase nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. 145
37498 380331 cd20610 nitroreductase nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase. 167
37499 410705 cd20612 CYP_LDS-like_C C-terminal cytochrome P450 domain of linoleate diol synthase and similar cytochrome P450s. This family contains Gaeumannomyces graminis linoleate diol synthase (LDS) and similar proteins including Ssp1 from the phytopathogenic basidiomycete Ustilago maydis. LDS, also called linoleate (8R)-dioxygenase, catalyzes the dioxygenation of linoleic acid to (8R)-hydroperoxylinoleate and the isomerization of the resulting hydroperoxide to (7S,8S)-dihydroxylinoleate. Ssp1 is expressed in mature teliospores, which are produced by U. maydis only after infection of its host plant, maize. Ssp1 is localized on lipid bodies in germinating teliospores, suggesting a role in the mobilization of storage lipids. LDS and Ssp1 contain an N-terminal dioxygenase domain related to animal heme peroxidases, and a C-terminal cytochrome P450 domain. The LDS-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 370
37500 410706 cd20613 CYP46A1-like cytochrome P450 family 46, subfamily A, polypeptide 1, also called cholesterol 24-hydroxylase, and similar cytochrome P450s. CYP46A1 is also called cholesterol 24-hydroxylase (EC 1.14.14.25), CH24H, cholesterol 24-monooxygenase, or cholesterol 24S-hydroxylase. It catalyzes the conversion of cholesterol into 24S-hydroxycholesterol and, to a lesser extent, 25-hydroxycholesterol. CYP46A1 is associated with high-order brain functions; increased expression improves cognition while a reduction leads to a poor cognitive performance. It also plays a role in the pathogenesis or progression of neurodegenerative disorders. CYP46A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 429
37501 410707 cd20614 CYPBJ-4-like cytochrome P450 BJ-4 homolog and similar cytochrome P450s. This group is composed of mostly uncharacterized proteins including Sinorhizobium fredii CYPBJ-4 homolog. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 406
37502 410708 cd20615 CYP_GliC-like cytochrome P450 monooxygenases similar to gliotoxin biosynthesis protein C. This subfamily is composed of cytochrome P450 monooxygenases that are part of gene clusters involved in the biosynthesis of various compounds such as mycotoxins and alkaloids, including Aspergillus fumigatus gliotoxin biosynthesis protein (GliC), Penicillium rubens roquefortine/meleagrin synthesis protein R (RoqR), Aspergillus oryzae aspirochlorine biosynthesis protein C (AclC), Aspergillus terreus bimodular acetylaranotin synthesis protein ataTC, Kluyveromyces lactis pulcherrimin biosynthesis cluster protein 2 (PUL2), and Aspergillus nidulans aspyridones biosynthesis protein B (ApdB). The GliC-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 409
37503 410709 cd20616 CYP19A1 cytochrome P450 family 19, subfamily A, polypeptide 1. CYP19A1, also called aromatase or estrogen synthetase (EC 1.14.14.14), catalyzes the formation of aromatic C18 estrogens from C19 androgens. The CYP19A1 subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 414
37504 410710 cd20617 CYP1_2-like cytochrome P450 families 1 and 2, and similar cytochrome P450s. This model includes cytochrome P450 families 1 (CYP1) and 2 (CYP2), CYP17A1, and CYP21 in vertebrates, as well as insect and crustacean CYPs similar to CYP15A1 and CYP306A1. CYP1 and CYP2 enzymes are involved in the metabolism of endogenous and exogenous compounds such as hormones, xenobiotics, and drugs. CYP17A1 catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products, while CYP21 catalyzes the 21-hydroxylation of steroids such as progesterone and 17-alpha-hydroxyprogesterone (17-alpha-OH-progesterone) to form 11-deoxycorticosterone and 11-deoxycortisol, respectively. Members of this group belong to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 419
37505 410711 cd20618 CYP71_clan Plant cytochrome P450s, clan CYP71. The number of cytochrome P450s (P450s, CYPs) in plants is considerably larger than in other taxa. In individual plant genomes, CYPs form the third largest family of plant genes; the two largest gene families code for F-box proteins and receptor-like kinases. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. However, there is a phenomenon called family creep, where a sequence (below 40% identity) is absorbed into a large family; this is seen in the plant CYP71 and CYP89 families. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). The CYP71 clan has expanded dramatically and represents 50% of all plant CYPs; it includes several families including CYP71, CYP73, CYP76, CYP81, CYP82, CYP89, and CYP93, among others. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 429
37506 410712 cd20619 CYP_XplA cytochrome P450 XplA. XplA is a cytochrome P450 that was found to mediate the microbial metabolism of the military explosive, hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX). XplA has an unusual structural organization comprising a heme domain that is fused to its flavodoxin redox partner. XplA, along with its partner reductase XplB, is plasmid encoded and the xplA gene has now been found in divergent genera across the globe with near sequence identity. It has only been detected at explosive-contaminated sites, suggesting rapid dissemination of this novel catabolic activity, possibly within a 50-year period since the introduction of RDX into the environment. XplA belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 358
37507 410713 cd20620 CYP132-like cytochrome P450 family 132 and similar cytochrome P450s. This subfamily is composed of Mycobacterium tuberculosis cytochrome P450 132 (CYP132) and similar proteins. The function of CYP132 is as yet unknown. CYP132 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 406
37508 410714 cd20621 CYP5011A1-like cytochrome P450 monooxygenase CYP5011A1 and similar cytochrome P450s. This subfamily is composed of CYPs from unicellular ciliates similar to Tetrahymena thermophila CYP5011A1, whose function is still unknown. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 427
37509 410715 cd20622 CYP_TRI13-like fungal cytochrome P450s similar to TRI13. This subfamily is composed of cytochrome P450 monooxygenase TRI13, also called core trichothecene cluster (CTC) protein 13, and similar proteins. The tri13 gene is located in the trichothecene biosynthesis gene cluster in Fusarium species, which produce a great diversity of agriculturally important trichothecene toxins that differ from each other in their pattern of oxygenation and esterification. Trichothecenes comprise a large family of chemically related bicyclic sesquiterpene compounds acting as mycotoxins, including the T-2 toxin; TRI13 is required for the addition of the C-4 oxygen of T-2 toxin. The TRI13-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 494
37510 410716 cd20623 CYP_unk unknown subfamily of actinobacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 367
37511 410717 cd20624 CYP_unk unknown subfamily of actinobacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 376
37512 410718 cd20625 CYP164-like cytochrome P450 family 164 and similar cytochrome P450s. This group is composed mostly of bacterial cytochrome P450s from multiple families, including Mycobacterium smegmatis CYP164A2, Streptomyces sp. CYP245A1, Bacillus subtilis CYP107H1, Micromonospora echinospora P450 oxidase Calo2, and putative P450s such as Xylella fastidiosa CYP133 and Mycobacterium tuberculosis CYP140. CYP107H1, also called cytochrome P450(BioI), catalyzes the C-C bond cleavage of fatty acid linked to acyl carrier protein (ACP) to generate pimelic acid for biotin biosynthesis. CYP245A1, also called cytochrome P450 StaP, catalyzes the intramolecular C-C bond formation and oxidative decarboxylation of chromopyrrolic acid (CPA) to form the indolocarbazole core, a key step in staurosporine biosynthesis. CalO2 is involved in calicheamicin biosynthesis. The CYP164-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 369
37513 410719 cd20626 CYP_Pc22g25500-like cytochrome P450 Pc22g25500 and similar cytochrome P450s. Penicillium rubens Pc22g25500 is a putative cytochrome P450 of unknown function. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. 381
37514 410720 cd20627 CYP20A1 cytochrome P450 family 20, subfamily A, polypeptide 1. Cytochrome P450, family 20, subfamily A, polypeptide 1 (cytochrome P450 20A1 or CYP20A1) is expressed in human hippocampus and substantia nigra. In zebrafish, maternal transcript of CYP20A1 occurs in eggs, suggesting involvement in brain and early development. CYP20A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 394
37515 410721 cd20628 CYP4 cytochrome P450 family 4. Cytochrome P450 family 4 (CYP4) proteins catalyze the omega-hydroxylation of the terminal carbon of fatty acids, including essential signaling molecules such as eicosanoids, prostaglandins and leukotrienes, and they are important for chemical defense. There are seven vertebrate family 4 subfamilies: CYP4A, CYP4B, CYP4F, CYP4T, CYP4V, CYP4X, and CYP4Z; three (CYP4X, CYP4A, CYP4Z) are specific to mammals. CYP4 enzymes metabolize fatty acids of various lengths, levels of saturation, and branching. Specific subfamilies show preferences for the length of fatty acids; CYP4B, CYP4A and CYP4V, and CYP4F preferentially metabolize short (C7-C10), medium (C10-C16), and long to very long (C18-C26) fatty acid chains, respectively. CYP4 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
37516 410722 cd20629 P450_pinF1-like cytochrome P450-pinF1 and similar cytochrome P450s. This subfamily is composed of bacterial CYPs similar to Agrobacterium tumefaciens plant-inducible cytochrome P450-pinF1, which is not essential for virulence but may be involved in the detoxification of plant protective agents at the site of wounding. The P450-pinF1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 353
37517 410723 cd20630 P450_epoK-like cytochrome P450epoK and similar cytochrome P450s. Sorangium cellulosum cytochrome P450epoK is a heme-containing monooxygenase which participates in epothilone biosynthesis where it catalyzes the epoxidation of epothilones C and D into epothilones A and B, respectively. This subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 373
37518 410724 cd20631 CYP7A1 cytochrome P450 family 7, subfamily A, polypeptide 1. Cytochrome P450 7A1 (CYP7A1) is also called cholesterol 7-alpha-monooxygenase (EC 1.14.14.23) or cholesterol 7-alpha-hydroxylase. It catalyzes the hydroxylation at position 7 of cholesterol, a rate-limiting step in the classic (or neutral) pathway of cholesterol catabolism and bile acid biosynthesis. It is important for cholesterol homeostasis. CYP7A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 451
37519 410725 cd20632 CYP7B1 cytochrome P450 family 7, subfamily B, polypeptide 1. Cytochrome P450 7B1 (CYP7B1) is also called 25-hydroxycholesterol 7-alpha-hydroxylase (EC 1.14.14.29) or oxysterol 7-alpha-hydroxylase. It catalyzes the 7alpha-hydroxylation of both steroids and oxysterols, and is thus implicated in the metabolism of neurosteroids and bile acid synthesis, respectively. It participates in the alternative (or acidic) pathway of cholesterol catabolism and bile acid biosynthesis. It also mediates the formation of 7-alpha,25-dihydroxycholesterol (7-alpha,25-OHC) from 25-hydroxycholesterol; 7-alpha,25-OHC acts as a ligand for the G protein-coupled receptor GPR183/EBI2, a chemotactic receptor in lymphoid cells. CYP7B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 438
37520 410726 cd20633 Cyp8B1 cytochrome P450 family 8, subfamily B, polypeptide 1. Cytochrome P450 8B1 (CYP8B1) is also called 7-alpha-hydroxycholest-4-en-3-one 12-alpha-hydroxylase (EC 1.14.18.8) or sterol 12-alpha-hydroxylase. It is involved in the classic (or neutral) pathway of cholesterol catabolism and bile acid synthesis, and is responsible for sterol 12alpha-hydroxylation, which directs the synthesis to cholic acid (CA). It converts 7-alpha-hydroxy-4-cholesten-3-one into 7-alpha,12-alpha-dihydroxy-4-cholesten-3-one, but also displays broad substrate specificity including other 7-alpha-hydroxylated C27 steroids. CYP8B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 449
37521 410727 cd20634 PGIS_CYP8A1 prostacyclin Synthase, also called cytochrome P450 family 8, subfamily A, polypeptide 1. Prostacyclin synthase, also called prostaglandin I2 synthase (PGIS) or cytochrome P450 8a1 (CYP8A1), catalyzes the isomerization of prostaglandin H2 to prostacyclin (or prostaglandin I2), a potent mediator of vasodilation and anti-platelet aggregation. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 442
37522 410728 cd20635 CYP39A1 cytochrome P450 family 39, subfamily A, polypeptide 1. Cytochrome P450 39A1 (CYP39A1) is also called 24-hydroxycholesterol 7-alpha-hydroxylase (EC 1.14.14.26) or oxysterol 7-alpha-hydroxylase. It is involved in the metabolism of bile acids and has a preference for 24-hydroxycholesterol, converting it into the 7-alpha-hydroxylated product. It may play a role in the alternative bile acid synthesis pathway in the liver. CYP39A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 410
37523 410729 cd20636 CYP26C1 cytochrome P450 family 26, subfamily C, polypeptide 1. Cytochrome P450 26C1 (CYP26C1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It effectively metabolizes all-trans retinoic acid (atRA), 9-cis-retinoic acid (9-cis-RA), 13-cis-retinoic acid, and 4-oxo-atRA with the highest intrinsic clearance toward 9-cis-RA. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. Loss of function mutations in the CYP26C1 gene cause type IV focal facial dermal dysplasia (FFDD), a rare syndrome characterized by facial lesions resembling aplasia cutis. CYP26C1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 431
37524 410730 cd20637 CYP26B1 cytochrome P450 family 26, subfamily B, polypeptide 1. Cytochrome P450 26B1 (CYP26B1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It is an all-trans-retinoic acid (atRA) hydroxylase that catalyzes the formation of similar metabolites as CYP26A1. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. In rats, CYP26B1 regulates sex-specific timing of meiotic initiation, independent of RA signaling. CYP26B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 430
37525 410731 cd20638 CYP26A1 cytochrome P450 family 26, subfamily A, polypeptide 1. Cytochrome P450 26A1 (CYP26A1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It is the main all-trans-retinoic acid (atRA) hydroxylase that catalyzes the formation of several hydroxylated forms of RA, including 4-OH-RA, 4-oxo-RA and 18-OH-RA. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. CYP26A1 has been shown to upregulate fascin and promote the malignant behavior of breast carcinoma cells. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
37526 410732 cd20639 CYP734 cytochrome P450 family 734. Cytochrome P450 family 734 (CYP734) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. CYP734As function as multisubstrate and multifunctional enzymes in brassinosteroid (BRs) catabolism and regulation of BRs homeostasis. Arabidopsis thaliana CYP734A1/BAS1 (formerly CYP72B1) inactivates bioactive brassinosteroids such as castasterone (CS) and brassinolide (BL) by C-26 hydroxylation. Rice CYP734As can catalyze C-22 hydroxylation as well as second and third oxidations to produce aldehyde and carboxylate groups at C-26. CYP734 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 428
37527 410733 cd20640 CYP714 cytochrome P450 family 714. Cytochrome P450 family 714 (CYP714) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. CYP714 enzymes are involved in the biosynthesis of gibberellins (GAs) and the mechanism to control their bioactive endogenous levels. They contribute to the production of diverse GA compounds through various oxidations of C and D rings in both monocots and eudicots. CYP714B1 and CYP714B2 encode the enzyme GA 13-oxidase, which is required for GA1 biosynthesis, while CYP714D1 encodes GA 16a,17-epoxidase, which inactivates the non-13-hydroxy GAs in rice. Arabidopsis CYP714A1 is an inactivation enzyme that catalyzes the conversion of GA12 to 16-carboxylated GA12 (16-carboxy-16beta,17-dihydro GA12). CYP714 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
37528 410734 cd20641 CYP709 cytochrome P450 family 709. Cytochrome P450 family 709 (CYP709) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. Arabidopsis thaliana CYP709B3 is involved in abscisic acid (ABA) and salt stress response. CYP709 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 431
37529 410735 cd20642 CYP72 cytochrome P450 family 72. Cytochrome P450 family 72 (CYP72) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. Characterized members, among others, include: Catharanthus roseus cytochrome P450 72A1 (CYP72A1), also called secologanin synthase (EC 1.3.3.9), that catalyzes the conversion of loganin into secologanin, the precursor of monoterpenoid indole alkaloids and ipecac alkaloids; Medicago truncatula CYP72A67 that catalyzes a key oxidative step in hemolytic sapogenin biosynthesis; and Arabidopsis thaliana CYP72C1, an atypical CYP that acts on brassinolide precursors and functions as a brassinosteroid-inactivating enzyme. This family also includes Panax ginseng CYP716A47 that catalyzes the formation of protopanaxadiol from dammarenediol-II during ginsenoside biosynthesis. CYP72 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 431
37530 410736 cd20643 CYP11A1 cytochrome P450 family 11, subfamily A, polypeptide 1, also called cholesterol side-chain cleavage enzyme. Cytochrome P450 11A1 (CYP11A1, EC 1.14.15.6) is also called cholesterol side-chain cleavage enzyme, cholesterol desmolase, or cytochrome P450(scc). It catalyzes the side-chain cleavage reaction of cholesterol to form pregnenolone, the precursor of all steroid hormones. Missense or nonsense mutations of the CYP11A1 gene cause mild to severe early-onset adrenal failure depending on the severity of the enzyme dysfunction/deficiency. CYP11A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37531 410737 cd20644 CYP11B cytochrome P450 family 11, subfamily B subfamily. Cytochrome P450 11B (CYP11B) enzymes catalyze the final steps in the production of glucocorticoids and mineralocorticoids that takes place in the adrenal gland. There are two human CYP11B isoforms: CYP11B1 (11-beta-hydroxylase or P45011beta), which catalyzes the final step of cortisol synthesis by a one-step reaction from 11-deoxycortisol; and CYP11B2 (aldosterone synthase or P450aldo), which catalyzes three steps in the synthesis of aldosterone. The CYP11B subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 428
37532 410738 cd20645 CYP24A1 cytochrome P450 family 24, subfamily A, polypeptide 1, also called vitamin D(3) 24-hydroxylase. Cytochrome P450 24A1 (CYP24A1, EC 1.14.15.16) is also called 1,25-dihydroxyvitamin D(3) 24-hydroxylase (24-OHase), vitamin D(3) 24-hydroxylase, or cytochrome P450-CC24. It catalyzes the NADPH-dependent 24-hydroxylation of calcidiol (25-hydroxyvitamin D(3)) and calcitriol (1-alpha,25-dihydroxyvitamin D(3) or 1,25(OH)2D3). CYP24A1 regulates vitamin D activity through its hydroxylation of calcitriol, the physiologically active vitamin D hormone, which controls gene-expression and signal-transduction processes associated with calcium homeostasis, cellular growth, and the maintenance of heart, muscle, immune, and skin function. CYP24A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 419
37533 410739 cd20646 CYP27A1 cytochrome P450 family 27, subfamily A, polypeptide 1, also called vitamin D(3) 25-hydroxylase. Cytochrome P450 27A1 (CYP27A1, EC 1.14.15.15) is also called CYP27, cholestanetriol 26-monooxygenase, sterol 26-hydroxylase, 5-beta-cholestane-3-alpha,7-alpha,12-alpha-triol 27-hydroxylase, cytochrome P-450C27/25, sterol 27-hydroxylase, or vitamin D(3) 25-hydroxylase. It catalyzes the first step in the oxidation of the side chain of sterol intermediates, the 27-hydroxylation of 5-beta-cholestane-3-alpha,7-alpha,12-alpha-triol, and the first three sterol side chain oxidations in bile acid biosynthesis via the neutral (classic) pathway. It also hydroxylates vitamin D3 at the 25-position, as well as cholesterol at positions 24 and 25. CYP27A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 430
37534 410740 cd20647 CYP27C1 cytochrome P450 family 27, subfamily C, polypeptide 1, also called all-trans retinol 3,4-desaturase. Cytochrome P450 27C1 (CYP27C1) is also called all-trans retinol 3,4-desaturase. It catalyzes the conversion of all-trans retinol (also called vitamin A1, the precursor of 11-cis retinal) to 3,4-didehydroretinol (also called vitamin A2, the precursor of 11-cis 3,4-didehydroretinal). CYP27C1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 433
37535 410741 cd20648 CYP27B1 cytochrome P450 family 27, subfamily B, polypeptide 1, also called calcidiol 1-monooxygenase. Cytochrome P450 27B1 (CYP27B1) is also called calcidiol 1-monooxygenase (EC 1.14.15.18), 25-hydroxyvitamin D(3) 1-alpha-hydroxylase (VD3 1A hydroxylase), 25-hydroxyvitamin D-1 alpha hydroxylase, 25-OHD-1 alpha-hydroxylase, 25-hydroxycholecalciferol 1-hydroxylase, or 25-hydroxycholecalciferol 1-monooxygenase. It catalyzes the conversion of 25-hydroxyvitamin D3 (25(OH)D3) to 1-alpha,25-dihydroxyvitamin D3 (1,25(OH)2D3 or calcitriol), and of 24,25-dihydroxyvitamin D3 (24,25(OH)(2)D3) to 1-alpha,24,25-trihydroxyvitamin D3 (1alpha,24,25(OH)(3)D3). It is also active with 25-hydroxy-24-oxo-vitamin D3, and has an important role in normal bone growth, calcium metabolism, and tissue differentiation. CYP27B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 430
37536 410742 cd20649 CYP5A1 cytochrome P450 family 5, subfamily A, polypeptide 1, also called thromboxane-A synthase. Cytochrome P450 5A1 (CYP5A1), also called thromboxane-A synthase (EC 5.3.99.5) or thromboxane synthetase, converts prostaglandin H2 into thromboxane A2, a biologically active metabolite of arachidonic acid that has been implicated in stroke, asthma, and various cardiovascular diseases, due to its acute and chronic effects in promoting platelet aggregation, vasoconstriction, bronchoconstriction, and proliferation. CYP5A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 457
37537 410743 cd20650 CYP3A cytochrome P450 family 3, subfamily A. The cytochrome P450 3A (CYP3A) subfamily, the most abundant CYP subfamily in the liver, consists of drug-metabolizing enzymes. In humans, there are at least four isoforms: CYP3A4, 3A5, 3A7, and 3A3. CYP3A enzymes are embedded in the endoplasmic reticulum, where they can catalyze a wide variety of biochemical reactions including hydroxylation, N-demethylation, O-dealkylation, S-oxidation, deamination, or epoxidation of substrates. They oxidize a variety of structurally unrelated compounds including steroids, fatty acids, and xenobiotics. The CYP3A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
37538 410744 cd20651 CYP15A1-like cytochrome P450 family 15, subfamily A, polypeptide 1, and similar cytochrome P450s. This subfamily is composed of insect and crustacean cytochrome P450s including Diploptera punctata cytochrome P450 15A1 (CYP15A1), Panulirus argus CYP2L1, and CYP303A1, CYP304A1, and CYP305A1 from Drosophila melanogaster. CYP15A1, also called methyl farnesoate epoxidase, catalyzes the conversion of methyl farnesoate to juvenile hormone III acid during juvenile hormone biosynthesis. CYP303A1, CYP304A1, and CYP305A1 may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. The CYP15A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 423
37539 410745 cd20652 CYP306A1-like cytochrome P450 306A1 and similar cytochrome P450s. This subfamily is composed of insect and crustacean cytochrome P450s including insect cytochrome P450 306A1 (CYP306A1 or Cyp306a1) and CYP18A1. CYP306A1 functions as a carbon 25-hydroxylase and has an essential role in ecdysteroid biosynthesis during insect development. CYP18A1 is a 26-hydroxylase and plays a key role in steroid hormone inactivation. The CYP306A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
37540 410746 cd20653 CYP81 cytochrome P450 family 81. The only characterized member of the cytochrome P450 family 81 (CYP81 or Cyp81) is CYP81E1, also called isoflavone 2'-hydroxylase, that catalyzes the hydroxylation of isoflavones, daidzein, and formononetin, to yield 2'-hydroxyisoflavones, 2'-hydroxydaidzein, and 2'-hydroxyformononetin, respectively. It is involved in the biosynthesis of isoflavonoid-derived antimicrobial compounds of legumes. CYP81 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 420
37541 410747 cd20654 CYP82 cytochrome P450 family 82. Cytochrome P450 family 82 (CYP82 or Cyp82) genes specifically reside in dicots and are usually induced by distinct environmental stresses. Characterized members include: Glycine max CYP82A3 that is induced by infection, salinity and drought stresses, and is involved in the jasmonic acid and ethylene signaling pathway, enhancing plant resistance; Arabidopsis thaliana CYP82G1 that catalyzes the breakdown of the C(20)-precursor (E,E)-geranyllinalool to the insect-induced C(16)-homoterpene (E,E)-4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT); and Papaver somniferum CYP82N4, also called methyltetrahydroprotoberberine 14-monooxygenase, and CYP82Y1, also called N-methylcanadine 1-hydroxylase. CYP82N4 catalyzes the conversion of N-methylated protoberberine alkaloids N-methylstylopine and N-methylcanadine into protopine and allocryptopine, respectively, in the biosynthesis of isoquinoline alkaloid sanguinarine. CYP82Y1 catalyzes the 1-hydroxylation of N-methylcanadine to 1-hydroxy-N-methylcanadine, the first committed step in the formation of noscapine. CYP82 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 447
37542 410748 cd20655 CYP93 cytochrome P450 family 93. The cytochrome P450 family 93 (CYP93) is specifically found in flowering plants and could be classified into ten subfamilies, CYP93A-K. CYP93A appears to be the ancestor that was derived in flowering plants, and the remaining subfamilies show lineage-specific distribution: CYP93B and CYP93C are present in dicots; CYP93F is distributed only in Poaceae; CYP93G and CYP93J are monocot-specific; CYP93E is unique to legumes; CYP93H and CYP93K are only found in Aquilegia coerulea; and CYP93D is Brassicaceae-specific. Members of this family include: Glycyrrhiza echinata CYP93B1, also called licodione synthase (EC 1.14.14.140), that catalyzes the formation of licodione and 2-hydroxynaringenin from (2S)-liquiritigenin and (2S)-naringenin, respectively; and Glycine max CYP93A1, also called 3,9-dihydroxypterocarpan 6A-monooxygenase (EC 1.14.14.93), that is involved in the biosynthesis of the phytoalexin glyceollin. CYP93 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 433
37543 410749 cd20656 CYP98 cytochrome P450 family 98. Cytochrome P450 family 98 (CYP98) monooxygenases catalyze the meta-hydroxylation step in the phenylpropanoid biosynthetic pathway. CYP98A3, also called p-coumaroylshikimate/quinate 3'-hydroxylase, catalyzes 3'-hydroxylation of p-coumaric esters of shikimic/quinic acids to form lignin monomers. CYP98A8, also called p-coumarate 3-hydroxylase, acts redundantly with CYP98A9 as tricoumaroylspermidine meta-hydroxylase. CYP98 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
37544 410750 cd20657 CYP75 cytochrome P450 family 75. The cytochrome P450 family 75 (CYP75) plays important roles in the biosynthesis of the colored class of flavonoids, anthocyanins, which confer a diverse range of colors to flowers from orange to red to violet and blue. The number of hydroxyl groups on the B-ring of anthocyanidins, the chromophores and precursors of anthocyanins, impacts the anthocyanin color - the more the bluer. The hydroxylation pattern is determined by CYP75 proteins: flavonoid 3'-hydroxylase (F3'H, EC 1.14.14.82) and flavonoid 3',5'-hydroxylase (F3'5'H, EC 1.14.14.81), which belong to CYP75B and CYP75A subfamilies, respectively. Both enzymes have broad substrate specificity and catalyze the hydroxylation of flavanones, dihydroflavonols, flavonols and flavones. F3'H catalyzes the 3'-hydroxylation of the flavonoid B-ring to the 3',4'-hydroxylated state. F3'5'H catalysis leads to trihydroxylated delphinidin-based anthocyanins that tend to have violet/blue colours. CYP75 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 438
37545 410751 cd20658 CYP79 cytochrome P450 family 79. Cytochrome P450 family 79 (CYP79) enzymes catalyze the first committed step in the biosynthesis of the core structure of glucosinolates, the conversion of amino acids to the corresponding aldoximes. Glucosinolates are amino acid-derived natural plant products that function in the defense against herbivores and microorganisms. Arabidopsis thaliana contains seven family members: CYP79B2 and CYP79B3, which metabolize tryptophan; CYP79F1 and CYP79F2, which metabolize chain-elongated methionine derivatives with respectively 1-6 or 5-6 additional methylene groups in the side chain; CYP79A2 that metabolizes phenylalanine; and CYP79C1 and CYP79C2, with unknown function. CYP79 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 444
37546 410752 cd20659 CYP4B_4F-like cytochrome P450 family 4, subfamilies B and F, and similar cytochrome P450s. This group is composed of family 4 cytochrome P450s from vertebrate subfamilies A (CYP4A), B (CYP4B), F (CYP4F), T (CYP4T), X (CYP4X), and Z (CYP4Z). Also included are similar proteins from lancelets, tunicates, hemichordates, echinoderms, mollusks, annelid worms, sponges, and choanoflagellates, among others. The CYP4A, CYP4X, and CYP4Z subfamilies are specific to mammals, CYP4T is present in fish, while CYP4B and CYP4F are conserved among vertebrates. CYP4Bs specialize in omega-hydroxylation of short chain fatty acids and also participate in the metabolism of exogenous compounds that are protoxic including valproic acid (C8), 3-methylindole (C9), 4-ipomeanol, 3-methoxy-4-aminoazobenzene, and several aromatic amines. CYP4F enzymes are known for omega-hydroxylation of very long fatty acids (VLFA; C18-C26), leukotrienes, prostaglandins, and vitamins with long alkyl side chains. The CYP4B_4F-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 423
37547 410753 cd20660 CYP4V-like cytochrome P450 family 4, subfamily V, and similar cytochrome P450s. This group is composed of vertebrate cytochrome P450 family 4, subfamily V (CYP4V) enzymes and similar proteins, including invertebrate subfamily C (CYP4C). Insect CYP4C enzymes may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. CYP4V2, the most characterized member of the CYP4V subfamily, is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 429
37548 410754 cd20661 CYP2R1 cytochrome P450 2R1. CYP2R1, also called vitamin D 25-hydroxylase (EC 1.14.14.24), is a microsomal enzyme that is required for the activation of vitamin D; it catalyzes the initial step converting vitamin D into 25-hydroxyvitamin D (25(OH)D), the major circulating metabolite of vitamin D. The 1alpha-hydroxylation of 25(OH)D by CYP27B1 generates the fully active vitamin D metabolite, 1,25-dihydroxyvitamin D (1,25(OH)2D). Mutations in the CYP2R1 gene are associated with an atypical form of vitamin D-deficiency rickets, which has been classified as vitamin D dependent rickets type 1B. CYP2R1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 436
37549 410755 cd20662 CYP2J cytochrome P450 family 2, subfamily J. Members of CYP2J are expressed in multiple tissues in mice and humans. They function as catalysts of arachidonic acid metabolism and are active in the metabolism of fatty acids to generate bioactive compounds. Human CYP2J2, also called arachidonic acid epoxygenase or albendazole monooxygenase (hydroxylating), is a membrane-bound cytochrome P450 primarily expressed in the heart and plays a significant role in cardiovascular diseases. The CYP2J subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 421
37550 410756 cd20663 CYP2D cytochrome P450 family 2, subfamily D. Members of CYP2D are present in mammals, birds, reptiles, and amphibians. The hominin CYP2D subfamily consists of a functional CYP2D6 and two paralogs, CYP2D7 and CYP2D8, that are often not functional in some species. Human CYP2D6 has a high affinity for alkaloids and can detoxify them. It is also responsible for metabolizing about 25% of commonly used drugs, such as antidepressants, beta-blockers, and antiarrhythmics. The CYP2D subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 428
37551 410757 cd20664 CYP2K cytochrome P450 family 2, subfamily K. Members of CYP2K are present in fish, birds, and amphibians. CYP2K6 from zebrafish has been shown to catalyze the conversion of aflatoxin B1 (AFB1) to its cytotoxic derivative AFB1 exo-8,9-epoxide, while its ortholog in rainbow trout CYP2K1 is also capable of oxidizing lauric acid. In birds, CYP2K is one of the largest CYP2 subfamilies. The CYP2K subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 424
37552 410758 cd20665 CYP2C-like cytochrome P450 family 2, subfamily C, and similar cytochrome P450s. This CYP2C-like group includes CYP2C, and similar CYPs including mammalian CYP2E1, also called 4-nitrophenol 2-hydroxylase, as well as chicken CYP2H1 and CYP2H2. The CYP2C subfamily is composed of four human members (CYP2C8, CYP2C9, CYP2C18, CYP2C19) that metabolize approximately 20% of clinically used drugs, and all four exhibit genetic polymorphisms that result in toxicity or altered efficacy of some drugs in affected individuals. CYP2E1 participates in the metabolism of endogenous substrates, including acetone and fatty acids, and exogenous compounds such as anesthetics, ethanol, nicotine, acetaminophen, aspartame, and chlorzoxazone, among others. The CYP2C-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37553 410759 cd20666 CYP2U1 cytochrome P450 family 2, subfamily U, polypeptide 1. CYP2U1 is a thymus- and brain-specific cytochrome P450 that catalyzes omega- and (omega-1)-hydroxylation of fatty acids such as arachidonic acid, docosahexaenoic acid, and other long chain fatty acids. Mutations in CYP2U1 are associated with hereditary spastic paraplegia (HSP), a neurological disorder, and pigmentary degenerative maculopathy associated with progressive spastic paraplegia. CYP2U1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 426
37554 410760 cd20667 CYP2AB1-like cytochrome P450, family 2, subfamily AB, polypeptide 1 and similar cytochrome P450s. The function of CYP2AB1 is unknown. CYP2AB1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 423
37555 410761 cd20668 CYP2A cytochrome P450 family 2, subfamily A. Cytochrome P450 family 2, subfamily A (CYP2A) includes CYP2A1, 2A2, and 2A3 in rats; CYP2A4, 2A5, 2A12, 2A20p, 2A21p, 2A22, and 2A23p in mice; CYP2A6, 2A7, 2A13, 2A18P in humans; CYP2A8, 2A9, 2A14, 2A15, 2A16, and 2A17 in hamsters; CYP2A10 and 2A11 in rabbits; and CYP2A19 in pigs. CYP2A enzymes metabolize numerous xenobiotic compounds, including coumarin, aflatoxin B1, nicotine, cotinine, 1,3-butadiene, and acetaminophen, among others, as well as endogenous compounds, including testosterone, progesterone, and other steroid hormones. Human CYP2A6 is responsible for the systemic clearance of nicotine, while CYP2A13 activates the nicotine-derived procarcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) into DNA-altering compounds that cause lung cancer. The CYP2A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37556 410762 cd20669 Cyp2F cytochrome P450 family 2, subfamily F. Cytochrome P450 family 2, subfamily F (CYP2F) members are selectively expressed in lung tissues. They are responsible for the bioactivation of several pneumotoxic and carcinogenic chemicals such as benzene, styrene, naphthalene, and 1,1-dichloroethylene. CYP2F1 and CYP2F3 selectively catalyze the 3-methyl dehydrogenation of 3-methylindole, forming toxic reactive intermediates that can form adducts with proteins and DNA. The CYP2F subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37557 410763 cd20670 CYP2G cytochrome P450 family 2, subfamily G. CYP2G1 is uniquely expressed in the olfactory mucosa of rats and rabbits and may have important functions for the olfactory chemosensory system. It is involved in the metabolism of sex steroids and xenobiotic compounds. In cynomolgus monkeys, CYP2G2 is a functional drug-metabolizing enzyme in nasal mucosa. In humans, two different CYP2G genes, CYP2GP1 and CYP2GP2, are pseudogenes because of loss-of-function deletions/mutations. The CYP2G subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37558 410764 cd20671 CYP2W1 cytochrome P450 family 2, subfamily W, polypeptide 1. Cytochrome P450 2W1 (CYP2W1) is expressed during development of the gastrointestinal tract, is silenced after birth in the intestine and colon by epigenetic modifications, but is activated following demethylation in colorectal cancer (CRC). Its expression levels in CRC correlate with the degree of malignancy, are higher in metastases and are predictive of survival. Thus, it is an attractive tumor-specific diagnostic and therapeutic target. CYP2W1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 422
37559 410765 cd20672 CYP2B cytochrome P450 family 2, subfamily B. The human cytochrome P450 family 2, subfamily B (CYP2B) consists of only one functional member CYP2B6, which shows broad substrate specificity and plays a key role in the metabolism of many clinical drugs, environmental toxins, and endogenous compounds. Rodents have multiple functional CYP2B proteins; mouse subfamily members include CYP2B9, 2B10, 2B13, 2B19, and 2B23. CYP2B enzymes are highly inducible by chemicals that interact with the constitutive androstane receptor (CAR) and/or pregnane X receptor (PXR), such as rifampicin and phenobarbital. The CYP2B subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 425
37560 410766 cd20673 CYP17A1 cytochrome P450 family 17, subfamily A, polypeptide 1. Cytochrome P450 17A1 (CYP17A1 or Cyp17a1), also called cytochrome P450c17, steroid 17-alpha-hydroxylase (EC 1.14.14.19)/17,20 lyase (EC 1.14.14.32), or 17-alpha-hydroxyprogesterone aldolase, catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products and subsequently to dehydroepiandrosterone (DHEA) and androstenedione. It is a dual enzyme that catalyzes both the 17-alpha-hydroxylation and the 17,20-lyase reactions. Severe mutations on the enzyme cause combined 17-hydroxylase/17,20-lyase deficiency (17OHD); patients with 17OHD synthesize 11-deoxycorticosterone (DOC) which causes hypertension and hypokalemia. Loss of 17,20-lyase activity precludes sex steroid synthesis and leads to sexual infantilism. Included in this group is a second 17A P450 from teleost fish, CYP17A2, that is more efficient in pregnenolone 17-alpha-hydroxylation than CYP17A1, but does not catalyze the lyase reaction. CYP17A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 432
37561 410767 cd20674 CYP21 cytochrome P450 21, also called steroid 21-hydroxylase. Cytochrome P450 21 (CYP21 or Cyp21), also called steroid 21-hydroxylase (EC 1.14.14.16) or cytochrome P-450c21 or CYP21A2 (in humans), catalyzes the 21-hydroxylation of steroids such as progesterone and 17-alpha-hydroxyprogesterone (17-alpha-OH-progesterone) to form 11-deoxycorticosterone and 11-deoxycortisol, respectively. It is required for the adrenal synthesis of mineralocorticoids and glucocorticoids. Deficiency of this CYP is involved in ~95% of cases of human congenital adrenal hyperplasia, a disorder of adrenal steroidogenesis. There are two CYP21 genes in the human genome, CYP21A1 (a pseudogene) and CYP21A2 (the functional gene). Deficiencies in steroid 21-hydroxylase activity lead to a type of congenital adrenal hyperplasia, which has three clinical forms: a severe form with concurrent defects in both cortisol and aldosterone biosynthesis; a form with adequate aldosterone biosynthesis; and a mild, non-classic form that can be asymptomatic or associated with signs of postpubertal androgen excess without cortisol deficiency. CYP21A2 is also the major autoantigen in autoimmune Addison disease. Cyp21 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 424
37562 410768 cd20675 CYP1B1-like cytochrome P450 family 1, subfamily B, polypeptide 1 and similar cytochrome P450s. Cytochrome P450 1B1 (CYP1B1) is expressed in liver and extrahepatic tissues where it carries out the metabolism of numerous xenobiotics, including metabolic activation of polycyclic aromatic hydrocarbons. It is also important in regulating endogenous metabolic pathways, including the metabolism of steroid hormones, fatty acids, melatonin, and vitamins. CYP1B1 is overexpressed in a wide variety of tumors and is associated with angiogenesis. It is also associated with adipogenesis, obesity, hypertension, and atherosclerosis. It is therefore a target for the treatment of metabolic diseases and cancer. Also included in this subfamily are CYP1C proteins from fish, birds and amphibians. The CYP1B1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 434
37563 410769 cd20676 CYP1A cytochrome P450 family 1, subfamily A. Cytochrome P450 family 1, subfamily A (CYP1A) consists of two human members, CYP1A1 and CYP1A2, which overlap in their activities. CYP1A2 is the highly expressed cytochrome enzyme in the human liver, while CYP1A1 is mostly found in extrahepatic tissues. Known common substrates include aromatic compounds such as polycyclic aromatic hydrocarbons, arachidonic acid and eicosapentaenoic acid, as well as melatonin and 6-hydroxylate melatonin. In addition, CYP1A1 activates procarcinogens into carcinogens via epoxides, and metabolizes heterocyclic aromatic amines of industrial origin. CYP1A2 metabolizes numerous natural products that result in toxic products, such as the transformation of methyleugenol to 1'-hydroxymethyleugenol, estragole to reactive metabolites, and oxidation of nephrotoxins. It also plays an important role in the metabolism of several clinical drugs including analgesics, antipyretics, antipsychotics, antidepressants, anti-inflammatory, and cardiovascular drugs. The CYP1A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 437
37564 410770 cd20677 CYP1D1 cytochrome P450 family 1, subfamily D, polypeptide 1. The cytochrome P450 1D1 (CYP1D1) gene is pseudogenized in humans because of five nonsense mutations in the putative coding region. However, in other organisms including cynomolgus monkey, CYP1D1 is a functional drug-metabolizing enzyme that is highly expressed in the liver. CYP1D1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 435
37565 410771 cd20678 CYP4B-like cytochrome P450 family 4, subfamily B and similar cytochrome P450s, including subfamilies A, T, X, and Z. This group is composed of family 4 cytochrome P450s from subfamilies A (CYP4A), B (CYP4B), T (CYP4T), X (CYP4X), and Z (CYP4Z). The CYP4A, CYP4X, and CYP4Z subfamilies are specific to mammals, CYP4T is present in fish, while CYP4B is conserved among vertebrates. CYP4As are known for catalyzing arachidonic acid to 20-HETE (20-hydroxy-5Z,8Z,11Z,14Z-eicosatetraenoic acid), and some can also metabolize lauric and palmitic acid. CYP4Bs specialize in omega-hydroxylation of short chain fatty acids and also participate in the metabolism of exogenous compounds that are protoxic including valproic acid (C8), 3-methylindole (C9), 4-ipomeanol, 3-methoxy-4-aminoazobenzene, and several aromatic amines. CYP4X1 is expressed at high levels in the mammalian brain and may play a role in regulating fat metabolism. CYP4Z1 is a fatty acid hydroxylase that is unique among human CYPs in that it is predominantly expressed in the mammary gland. Monophyly was not found with the CYP4T and CYP4B subfamilies, and further consideration should be given to their nomenclature. The CYP4B-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 436
37566 410772 cd20679 CYP4F cytochrome P450 family 4, subfamily F. Cytochrome P450 family 4, subfamily F (CYP4F) enzymes are known for omega-hydroxylation of very long fatty acids (VLFA; C18-C26), leukotrienes, prostaglandins, and vitamins with long alkyl side chains. The CYP4F subfamily shows diverse specificities among its members: CYP4F2 and CYP4F3 metabolize pro- and anti-inflammatory leukotrienes; CYP4F8 and CYP4F12 metabolize prostaglandins, endoperoxides and arachidonic acid; CYP4F11 and CYP4F12 metabolize VLFA and are unique in the CYP4F subfamily since they also hydroxylate xenobiotics such as benzphetamine, ethylmorphine, erythromycin, and ebastine. CYP4F belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 442
37567 410773 cd20680 CYP4V cytochrome P450 family 4, subfamily V. Cytochrome P450 family 4, subfamily V, polypeptide 2 (CYP4V2) is the most characterized member of the CYP4V subfamily. It is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 440
37568 410332 cd20681 T-box_Drosocross-like DNA-binding domain of Drosophila Dorsocross and related T-box proteins. Drosophila Dorsocross (Doc) includes three Dorsocross paralogs, Doc1-3. These are key cardiogenic T-box transcription factors during specification and differentiation of heart cells. Drosophila Doc also functions in caudal visceral mesoderm development, and modulates Notch signaling in the developing Drosophila eye by regulating the expression of Delta in the eye imaginal discs. Doc also functions in the morphogenesis of epithelial tissues: in Drosophila, which possesses a single extraembryonic (EE) membrane, it is essential for EE epithelia tissue maintenance while in Tribolium castaneum, which has 2 EE membranes, Doc plays a major role in EE morphogenetic events throughout development without affecting EE tissue specificity or maintenance. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. 186
37569 410333 cd20682 T-box-like T-box DNA-binding domain; uncharacterized subfamily. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors. 191
37570 410334 cd20683 T-box_Fungi_incertae_sedis T-box DNA-binding domain; uncharacterized subfamily of fungi classified as Fungi incertae sedis. Fungi incertae sedis refers to a fungal taxonomic group where its broader relationships are unknown or undefined. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors. 214
37571 411002 cd20684 CdiA-CT_Yk_RNaseA-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Yersinia kristensenii, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector protein from Yersinia kristensenii and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; Yersinia kristensenii CdiA-CT has potent RNase activity in vivo and in vitro. Although CdiA-CT has structural homology with angiogenin and other RNase A paralogs, it does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. It binds its cognate immunity protein CdiI which neutralizes toxicity by blocking access to RNA substrates. Y. kristensenii CdiA-CT is the first non-vertebrate protein found to possess the RNase A superfamily fold. Homologs of this toxin are associated with secretion systems in many Gram-negative and Gram-positive bacteria, suggesting that RNase A-like toxins are commonly deployed in inter-bacterial competition. 112
37572 411003 cd20685 CdiA-CT_Ecl_RNase-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Enterobacter cloacae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Enterobacter cloacae CdiA and similar domains. This CdiA-CT toxin has structural homology to the C-terminal nuclease domain of colicin E3, which cleaves 16S ribosomal RNA to disrupt protein synthesis, and has been shown to use the same nuclease activity to inhibit bacterial growth. The CdiA-CT toxin is specifically neutralized by cognate immunity protein CdiI to protect the toxin-producing cell from autoinhibition. Despite carrying equivalent toxin domains, the corresponding immunity proteins for CdiA-CT and colicin E3 are unrelated in sequence, structure, and toxin-binding site, thus showing diversity among 16S rRNase toxins. 75
37573 411004 cd20686 CdiA-CT_Ec-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli STEC_O31, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli STEC_O31 CdiA and similar domains. The function of this CdiA-CT is as yet unknown, but its C-terminal end is similar to EndoU domain-containing protein which may act as a nuclease toxin that cleaves RNAs in competitor cells. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 135
37574 412039 cd20687 CdiI_Ykris-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Yersinia kristensenii, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Yersinia kristensenii (which is an RNase), and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. The CdiI immunity protein binds the CdiA toxin via its C-terminal domain to prevent auto-inhibition. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Y. kristensenii CdiI binds directly over the putative active site of the CdiA-CT toxin and likely neutralizes toxicity by blocking access to RNA substrates. 90
37575 412040 cd20688 CdiI_Ecoli_Nm-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Neisseria meningitidis MC58 immunity protein CdiI has structural homology to the Whirly family of RNA-binding proteins, but lacks the characteristic nucleic acid-binding motif of the family. It has been predicted to neutralize toxin activity by preventing access to RNA substrates. 100
37576 412042 cd20689 CDI_toxin_Bp_tRNase-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model includes the C-terminal (CT) toxin domains of Burkholderia pseudomallei E479 and 1026b, both appearing to be RNAses acting on tRNA. 99
37577 412045 cd20690 CdiI_BpE479-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei E479, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Burkholderia pseudomallei E479 (which is a tRNase). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Although related B. pseudomallei E479 CdiA-CT has structural homology to B. pseudomallei 1026B CdiA-CT (both tRNases), their cognate CdiI immunity proteins share no significant sequence or structure homology. This CdiI binds its cognate toxin CdiA-CT domain with high affinity. 100
37578 412046 cd20691 CdiI_EC536-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 536, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli 536 (which is a predicted RNase), and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This E. coli CdiI's cognate toxin CdiA-CT domain is activated only when it is bound to the biosynthetic enzyme O-acetylserine sulfhydrylase-A (CysK), one of two isoenzymes (along with CysM) that catalyze the final reaction in cysteine synthesis. CdiA's predicted nuclease active site is occluded by immunity protein in the CysK/CdiA-CT/CdiI structure, suggesting that CdiI blocks the binding of tRNA substrates to the toxin. 120
37579 411005 cd20692 CdiA-CT_Ec-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli A0 34/86, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli A0 34/86 CdiA. Activity of this E. coli CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 99
37580 412047 cd20693 CdiI_EcoliA0-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli A0 34/86, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli A0 34/86, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This E. coli CdiI binds its cognate toxin CdiA-CT domain with high affinity. 124
37581 412048 cd20694 CdiI_Ct-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Cupriavidus taiwanensis CdiI immunity protein and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Cupriavidus taiwanensis, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This C. taiwanensis CdiI is alpha-helical and binds its cognate toxin CdiA-CT domain with high affinity. 96
37582 411006 cd20695 CdiA-CT_5T87E_Ct C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Cupriavidus taiwanensis, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Cupriavidus taiwanensis CdiA. The exact biochemical function of this CdiA-CT cannot be predicted easily and may include RNase or DNase activity. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 61
37583 412049 cd20696 CdiI_Ecoli3006-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 3006, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli 3006, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. The E. coli CdiI binds its cognate CdiA-CT with high affinity via one end of its beta-sandwich structure. 150
37584 410969 cd20697 CdiA-CT_Ec_Kp-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli and Klebsiella pneumoniae CdiA, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal domain of CdiA (CdiA-CT) from Escherichia coli, Klebsiella pneumoniae and other bacteria. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 94
37585 412050 cd20698 CdiI_Kp-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Klebsiella pneumoniae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Klebsiella pneumoniae, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. The K. pneumoniae CdiI binds its cognate CdiA-CT via one end of its beta-sandwich structure. 110
37586 412051 cd20699 CdiI_ECL-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Enterobacter cloacae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Enterobacter cloacae, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Although E. cloacae CdiA-CT has structural homology to the C-terminal nuclease domain of colicin E3, which cleaves 16S ribosomal RNA to disrupt protein synthesis, and has been shown to use the same nuclease activity to inhibit bacterial growth, the corresponding CdiI immunity proteins are unrelated in sequence, structure and toxin-binding sites. Structural homology searches reveal that E. cloacae CdiI is most similar to the Whirly family of single-stranded DNA-binding proteins. 141
37587 411007 cd20700 CdiA-CT_Ec_tRNase C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Escherichia coli 536, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli 536 CdiA and similar domains. This CdiA-CT (EC536) region is composed of two domains that have distinct functions during CDI. This domain is the extreme C-terminal domain, an RNase toxin that possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and [DN]HxxE motifs. The N-terminal domain facilitates translocation of the tethered nuclease into the cytosol of target bacteria. Although this CdiA-CT rapidly cleaves tRNA in vivo, the purified toxin has no detectable nuclease activity in vitro. Experiments show that it is activated when bound to the biosynthetic enzyme O-acetylserine sulfhydrylase-A (CysK), which is one of two isoenzymes (along with CysM) that catalyze the final reaction in cysteine synthesis. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 115
37588 410945 cd20701 MIX Marker for type sIX effectors domain. This family contains the MIX (Marker for type sIX effectors) domain, a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. MIX domains have been classified into five clans (called MIX I-V) by Dar et al. based on sequence similarity. These domains have been further classified as either antibacterial or anti-eukaryotic, based on the presence or absence of adjacent putative immunity genes, respectively. In Vibrionaceae, antibacterial MIX-effectors carrying domains with pore-forming, phospholipase, nuclease, peptidoglycan hydrolase, and protease activities have been identified. Additionally, novel virulence MIX-effectors that employ a combination of antibacterial and anti-eukaryotic MIX-effectors have been found, suggesting that certain bacteria adapted their antibacterial T6SS to mediate interactions with eukaryotic hosts or predators. A subset of polymorphic MIX-effectors, a widespread class of effectors secreted by T6SSs, are horizontally shared between marine bacteria and used to diversify their T6SS effector repertoires, thus enhancing their environmental fitness. 128
37589 410637 cd20702 PoNe Polymorphic Nuclease effector (PoNe) domain is a deoxyribonuclease. This family contains the DNase toxin domain called PoNe (Polymorphic Nuclease effector), which belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. The PoNe domain co-occurs with a variety of N-terminal domains such as filamentous hemagglutinin, nuclease, HINT, DUFs, PAAR, RHS repeat, or LXG domains. Some members of this family also co-occur with the FIX (Found in type sIX effector) domain of unknown function, as identified by Jana et al., who have also identified this PoNe domain. 77
37590 410939 cd20703 FIX-like Found in type sIX effector (FIX) domain of unknown function. This family contains the Found in type sIX effector (FIX) domain and similar proteins. FIX is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. FIX is present in various established T6SS-secreted effectors that have an N-terminal VgrG or PAAR or PAAR-like (i.e., DUF4280) domain, suggesting that FIX may play a role in delivery of T6SS effectors, and serve as a new marker for T6SS-delivered proteins to enable the identification of novel T6SS substrates. 75
37591 412052 cd20704 Orc3 Origin recognition complex subunit 3. Origin recognition complex subunit 3 (Orc3) is a subunit of the heterohexameric origin recognition complex (ORC) that is essential for coordinating replication onset. ORC binds to the origin of replication, binds CDC6, and recruits the hexameric MCM2-7 ring to the DNA, which leads to the assembly of the pre-replicative complex (pre-RC). Five of the 6 ORC subunits (Orc 1-5) retain AAA+ (ATPases associated with a variety of cellular activities) folds, but Orc3, as well as Orc2, lost their ATP-binding signatures. 387
37592 410946 cd20705 MIX_I Marker for type sIX effectors domain, clan I. This subfamily contains the MIX (Marker for type sIX effectors) clan I (MIX I) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins predicted to mediate antibacterial toxicity. These C-terminal toxin domains of Vibrionaceae MIX I effectors include pore-forming, nuclease and nucleotide deaminase activities. Members of the MIX I clan are similar, in both sequence and synteny, to the Vibrio parahaemolyticus MIX-effector VP1388, but their activity is unknown. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. 115
37593 410947 cd20706 MIX_II Marker for type sIX effectors domain, clan II. This subfamily contains the MIX (Marker for type sIX effectors) II clan (MIX II) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted activity of the C-terminal toxin domains of Vibrionaceae MIX II effectors is mainly pore-forming. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. Also, some of these MIX II effectors also contain N-terminal domains such as the T6SS-secreted tail component PAAR. 149
37594 410948 cd20707 MIX_III Marker for type sIX effectors domain, clan III. This subfamily contains the MIX (Marker for type sIX effectors) III clan (MIX III) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. No MIX III clan members have been detected in Vibrionaceae. Predicted activity of the C-terminal toxin domains of MIX III effectors is mainly pore-forming. Studies have shown that many members of the MIX III clan neighbor transposable elements. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. 137
37595 410949 cd20708 MIX_IV Marker for type sIX effectors domain, clan IV. This subfamily contains the MIX (Marker for type sIX effectors) IV clan (MIX IV) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted activity of the C-terminal toxin domains of Vibrionaceae MIX IV effectors is mainly pore-forming. Members of MIX IV are similar, in both sequence and synteny, to the Vibrio parahaemolyticus MIX-effector VP1388, but their activity is unknown. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. 133
37596 410950 cd20709 MIX_V Marker for type sIX effectors domain, clan V. This subfamily contains the MIX (Marker for type sIX effectors) V clan (MIX V) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted antibacterial activities of the C-terminal toxin domains of Vibrionaceae MIX V effectors include peptidase, peptidoglycan hydrolase, nuclease and pore-forming. Also included in this clan is VPR01S_11_01570, encoded by V. proteolyticus, that carries a CNF1 (cytotoxic necrotizing factor 1) toxin domain and modulates the actin cytoskeleton of eukaryotic phagocytic cells. Some members contain DUF2235, which is predicted as a phospholipase domain. Members of the MIX V clan are shared between marine bacteria via horizontal gene transfer, thereby enhancing their bacterial competitive fitness. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. 123
37597 411008 cd20710 NOT1_connector Connector domain of NOT1. This NOT1 connector domain is one of several catalytically inactive subunits of the multisubunit CCR4-NOT complex assembly that plays a central role in post-translational gene regulation in eukaryotes. CCR4-NOT contains the catalytic center formed by two deadenylase subunits CCR4 and CAF1, and the conserved core complex which contains a minimum of four catalytically inactive subunits, NOT1, NOT2, NOT3 and CAF40/NOT9. NOT1 is the largest subunit which functions as a central scaffold for complex assembly in human orthologs. The Chaetomium thermophilum NOT1 connector domain consists of five alpha-helical hairpin repeats of the HEAT type that structurally resemble MIF4G domains, and hence is also called the MIF4G-C domain. However, NOT1 MIF4G-C does not interact with DEAD-box helicases such as DDX6 like MIF4G does. Structural conservation of this domain suggests an important role but its function is as yet unknown. 202
37598 411009 cd20712 LNYV_P-protein-C_like C-terminal domain of lettuce necrotic yellows virus phosphoprotein and related domains. This family includes the C-terminal domain of the P protein of plant viruses belonging to the Rhabdoviridae family such as Lettuce necrotic yellows virus (LNYV). LNYV P protein acts as a weak local RNA silencing suppressor in plants to counteract RNA silencing antiviral defense. It suppresses both RNA induced silencing complex (RISC)-mediated cleavage and RNA silencing amplification. The C-terminal domain of LNYV P protein is essential for both local RNA silencing suppression and interaction with Argonaute (AGO) 1, AGO2, and AGO4 (key components of the RISC complexes), and with SGS3 and RDR6 (which function in the amplification step of RNA silencing). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, including acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 69
37599 411010 cd20714 NSP3_rotavirus rotavirus non-structural protein 3 (NSP3). Rotaviruses co-opt the eukaryotic translation machinery during their life cycle. Most eukaryotic mRNAs are characterized by a 5' cap structure and a 3' poly(A) tail. Eukaryotic translation initiation is facilitated by interactions between the 3' poly(A) tail and the 5' end of the message mediated by poly(A) binding protein (PABP) and eukaryotic translation initiation factor 4G (eIF4G). Rotavirus NSP3 is a functional analog of PABP that enables rotaviruses to direct eukaryotic translation machinery to viral mRNAs. It binds to the 3' consensus sequence of viral mRNA and participates in mRNA circularization by interacting with eIF4G. NSP3 closes the viral mRNA loop and facilitates translation of its own mRNAs while blocking recruitment of PABP to the eukaryotic translation initiation machinery. 127
37600 410956 cd20716 cyt_P460_fam Cytochrome P460 family. The cytochrome P460 family is composed mostly of monoheme, ~17 kDa, c-cytochromes typified by the cytochromes P460 of Nitrosomonas europaea and Methylococcus capsulatus (Bath), and the cytochrome c'-beta of M. capsulatus. Members of this family can be characterized by a predominantly beta-sheet structure as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. They are involved in the oxidation/reduction or ligation of N-oxides for detoxification or energy generation. Phylogenetic studies suggest that cytochrome P460 (cytL) genes evolved from ancestral cytochrome c'-beta genes (cytS) by acquisition of features including the lysine-heme cross-link. The protein-bound c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. 124
37601 410310 cd20721 CYCLIN_SDS-like_rpt2 second cyclin box found in Arabidopsis thaliana cyclin-SDS and similar proteins. Cyclin-SDS, also called protein SOLO DANCERS, is a meiosis-specific cyclin that is required for normal homolog synapsis and recombination in early to mid-prophase 1. It contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 104
37602 410311 cd20722 CYCLIN_CCNO_rpt2 second cyclin box found in cyclin-O (CCNO). CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 97
37603 410970 cd20723 CdiA-CT_Ec-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli CdiA, and similar proteins. This family includes the C-terminal (CT) domain of Escherichia coli CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. The exact biochemical function of this E. coli CdiA is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its all helical cognate CdiI with high affinity. 160
37604 410971 cd20724 CdiA-CT_Kp342-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Klebsiella pneumoniae 342 CdiA, and similar proteins. This family includes the C-terminal (CT) domain of Klebsiella pneumoniae CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI immunity protein (all beta-sheet structure) with high affinity. 107
37605 410972 cd20725 CdiA-CT_Kp-like CdiA C-terminal domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) similar to Klebsiella pneumoniae CdiA-CT. This family includes the C-terminal (CT) domain of bacterial CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. Many of the domains in this family are associated with RHS repeats N-terminal to the domain. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 99
37606 412043 cd20726 CDI_toxin_BpE479_tRNase-like C-terminal (CT) toxin domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Burkholderia pseudomallei E479 CdiA. This CdiA-CT domain is a tRNAse that contains a core alpha/beta-fold that is characteristic of PD(D/E)XK superfamily nucleases. It is structurally similar to another CDI toxin domain from B. pseudomallei 1026b which is unrelated in sequence but has a similar nuclease domain, and shares similar fold and active-site architecture. The PD(D/E)XK superfamily includes most restriction endonucleases and other enzymes involved in DNA recombination and repair. 121
37607 412044 cd20727 CDI_toxin_Bp_tRNase-like C-terminal (CT) toxin domain of a contact-dependent growth inhibition (CDI) system (CdiA-CT) similar to that of Burkholderia pseudomallei, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) toxin domains that are similar to Burkholderia pseudomallei E479 and 1026b CdiA toxins, both of which are tRNAses. 105
37608 410638 cd20729 PoNe_LXG-like Polymorphic Nuclease effector (PoNe) co-occurring with N-terminal LXG domains. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains some members that contain LXG domains in the N-terminal region. This group of polymorphic toxin proteins in bacteria are predicted to be associated with type VII secretion pathways to mediate export of bacterial toxins. 139
37609 410639 cd20730 PoNe_FilH-like Polymorphic Nuclease effector (PoNe) co-occurring with filamentous hemagglutinin N-terminal domain repeats. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal hemagglutinin repeats and/or a hemagglutination activity domain. 133
37610 410640 cd20731 PoNe_FilH_TF-like Polymorphic Nuclease effector (PoNe) co-occurring with N-terminal domains such as filamentous hemagglutinin repeats or TANFOR. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains typically co-occurring with N-terminal domains such as hemagglutinin repeats and/or hemagglutination activity domains, or a TANFOR domain, which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains. 129
37611 410641 cd20732 PoNe_FilH_DUF637_VENN-like Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as filamentous hemagglutinin repeats, DUF637, or pre-toxin domain with VENN motif. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with a PoNe domain typically co-occurring with N-terminal domains such as filamentous hemagglutinin repeats, hemagglutination activity domains, DUF637 - predicted to be a hemagglutinin domain, or pre-toxin domains with VENN motifs, which are found in many bacterial polymorphic toxins and are located before the C-terminal toxin modules. 121
37612 410642 cd20733 PoNe_PAAR-like Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal PAAR domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains that typically co-occur with N-terminal domains such as proline-alanine-alanine-arginine (PAAR) repeat domains that form a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). 125
37613 410643 cd20734 PoNe_RHS-like Polymorphic Nuclease effector (PoNe) domain co-occurring with RHS repeat-associated core domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains typically co-occurring with N-terminal domains such as RHS repeat-associated core domains, which may contain FG-GAP, RHS or YD repeats, and are found in secreted bacterial insecticidal toxins. 90
37614 410644 cd20735 PoNe_RHS-like Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as RHS repeats. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with uncharacterized N-terminal RHS repeat domains. 111
37615 410645 cd20736 PoNe_Nuclease Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal nuclease domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with nuclease N-terminal domains such as endonucleases involved in methyl-directed DNA mismatch repair in gram negative bacteria. 80
37616 410646 cd20737 PoNe_HINT Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as the HINT domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis. 91
37617 410647 cd20738 PoNe_DUF4280 Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal DUF4280 domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with an N-terminal domain of unknown function (DUF4280), which has a single completely conserved residue C that may be functionally important. 127
37618 410648 cd20739 PoNe_DUF637 Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal DUF637 domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as DUF637 predicted to be a hemagglutinin domain. 124
37619 410649 cd20740 PoNe_LXG_HINT-like Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal LXG or pre-toxin HINT domains. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains that co-occur with N-terminal domains such as HINT or LXG domains. The pre-toxin HINT domain, a member of the HINT superfamily of proteases, is usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis. The LXG domains are present in the N-terminal region of a group of polymorphic toxin proteins in bacteria and are predicted to use a Type VII secretion pathway to mediate export of bacterial toxins. 96
37620 410650 cd20741 PoNe_HINT_TF-like Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as HINT or TANFOR. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as a TANFOR domain which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains, or a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis. 77
37621 410940 cd20742 FIX_vWA-like Found in type sIX effector (FIX) domain of unknown function co-occurring with von Willebrand factor type A (vWA) domain or MSCRAMM family adhesin SdrC domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with domains such as the von Willebrand factor type A (vWA) domain, which has a wide variety of important cellular functions, and the MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules) family adhesin SdrC domain, that contains a variable-length C-terminal region of Ser-Asp (SD) repeats. 80
37622 410941 cd20743 FIX_RhsA-like Found in type sIX effector (FIX) domain of unknown function co-occurring with RhsA domains with RHS repeats. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with a C-terminal RhsA-like domain, which contains extended repeat regions and RHS repeats. Some in this family have additional C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module. 92
37623 410942 cd20744 FIX_AHH_RhsA-like Found in type sIX effector (FIX) domain of unknown function co-occurring with C-terminal AHH domain and some RhsA domains with RHS repeats. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module. Some in this family have additional C-terminal domains such as RhsA protein which contains extended repeat regions and RHS repeats. 76
37624 410943 cd20745 FIX_RhsA_AHH_HNH-like Found in type sIX effector (FIX) domain of unknown function co-occurring with RhsA, AHH or HNH domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with a C-terminal RhsA-like domain which contains extended repeat regions and RHS repeats. Some in this subfamily have additional C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module, and HNH endonuclease domain, which usually contains a conserved HNH motif in the sequence. Some members also contain additional N-terminal VgrG or PAAR or PAAR-like (i.e., DUF4280) domain. 69
37625 410944 cd20746 FIX_Ntox15_NUC_DUF4112_RhsA-like Found in type sIX effector (FIX) domain of unknown function co-occurring with Ntox15, endonuclease, or RHS repeat domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that generally co-occurs with the C-terminal Ntox15 (Novel toxin 15), a predicted RNase toxin that possesses a conserved HxxD motif, as well as with domains such as DNA/RNA non-specific endonuclease, RhsA domain regions with extended RHS repeats, or DUF4112. Some members also contain an N-terminal PAAR-like (i.e., DUF4280) domain. 84
37626 410957 cd20750 cyt_c_I Uncharacterized subfamily of the cytochrome P460 family. This subfamily is composed mainly of hypothetical proteins, including Sphingopyxis alaskensis class I cytochrome C. Members of this subfamily belong to the cytochrome P460 family that is composed mostly of monoheme, ~17 kDa, c-cytochromes typified by the cytochromes P460 of Nitrosomonas europaea and Methylococcus capsulatus (Bath), and the cytochrome c'-beta of M. capsulatus. Cytochrome P460 family members can be characterized by a predominantly beta-sheet structure as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. They are involved in the oxidation/reduction or ligation of N-oxides for detoxification or energy generation. The protein-bound c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. 146
37627 410958 cd20751 cyt_P460_Ne-like cytochrome P460 from Nitrosomonas europaea and similar proteins. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. The heme P460 in N. europaea cyt P460 contains a third proteinaceous cross-link, similar to that found in hydroxylamine oxidoreductase (HAO), but in this case, the cross-link is to a conserved lysine residue, K70. The biological function of cyt P460 is yet to be determined, but it binds hydroxylamine, hydrazine, hydrogen peroxide, and cyanide in the ferric form, and CO in the ferrous form; it also possesses a weak hydroxylamine oxidation/cyt c reduction activity. It belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. 153
37628 410959 cd20752 cyt_c'_beta Cytochrome c'-beta from Methylococcus capsulatus (Bath) and similar proteins. Cytochromes (cyt) c' are defined by a pentacoordinate heme Fe with a CXXCH c-heme-binding motif located close to the C-terminus. Most cyt c' have four alpha-helix bundle structures, and are referred to as cyt c'-alpha. M. capsulatus (Bath) cytochrome c'-beta, encoded by the cytS gene, is a homodimeric heme protein with a higher molecular weight of about 16 kDa per monomer, compared to cyt c'-alpha (~12 kDa), and it adopts a beta-sheet structure. It is involved in nitric oxide scavenging and protection against nitrosative stress. Cyt c'-beta belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. 136
37629 410960 cd20753 cyt_P460_Mc-like cytochrome P460 from Methylococcus capsulatus (Bath) and similar proteins. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. M. capsulatus (Bath) cytochrome P460, encoded by the cytL gene, catalyzes the oxidation of hydroxylamine (NH2OH) to form nitrous oxide (N2O) under anaerobic conditions. Similar to Nitrosomonas europaea cyt P460, it is defined by an unusual porphyrin (heme)-lysine cross link. This subfamily belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. 136
37630 394914 cd20754 capping_2-OMTase_viral viral Cap-0 specific (nucleoside-2'-O-)-methyltransferase. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Some dsDNA and dsRNA viruses, like the bluetongue virus (BTV), a member of the Reoviridae family, and Vaccinia virus, a member of the Poxviridae family, as well as some ss(+)RNA viruses, like Flaviviridae and Nidovirales, also cap their mRNAs and encode their own 2'OMTase. In BTV, all four reactions are catalyzed by a single protein, VP4. In Vaccinia, the activity is located in the processing factor of the poly(A) polymerase, VP39. 179
37631 394915 cd20756 capping_2-OMTase_Poxviridae Cap-0 specific (nucleoside-2'-O-)-methyltransferase of poxviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Poxviridae, a family of dsDNA viruses, cap their mRNAs. The 2'OMTase activity is located in the processing factor of the poly(A) polymerase, VP39. 270
37632 394916 cd20757 capping_2-OMTase_Rotavirus Cap-0 specific (nucleoside-2'-O-)-methyltransferase of rotavirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Rotavirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the multifunctional capping enzyme, VP3. 197
37633 394917 cd20758 capping_2-OMTase_Orbivirus Cap-0 specific (nucleoside-2'-O-)-methyltransferase of orbivirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Orbivirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the multifunctional capping enzyme, VP4. 211
37634 394918 cd20759 capping_2-OMTase_Phytoreovirus Cap-0 specific (nucleoside-2'-O-)-methyltransferase of phytoreovirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Phytoreovirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the mRNA capping enzyme P5. 199
37635 394919 cd20760 capping_2-OMTase_Mimiviridae Cap-0 specific (nucleoside-2'-O-)-methyltransferase of mimiviridae and pithoviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Mimiviridae and pithoviridae are part of the nucleocytoplasmic large dsDNA virus clade (NCLDV). The 2'OMTase activity is located in the polyA polymerase regulatory subunit. 233
37636 394920 cd20761 capping_2-OMTase_Flaviviridae Cap-0 specific (nucleoside-2'-O-)-methyltransferase of flaviviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Flaviviridae, a family of ss(+)RNA viruses, cap their mRNAs. The 2'OMTase activity is located in the nonstructural protein 5 (NS5). 222
37637 394921 cd20762 capping_2-OMTase_Nidovirales Cap-0 specific (nucleoside-2'-O-)-methyltransferase of nidovirales. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Nidovirales, a family of ss(+)RNA viruses, cap their mRNAs. For one member, Coronavirus, the 2'OMTase activity is located in the nonstructural protein 16 (NSP16). For others, the 2'OMTase activity may be located in replicase polyprotein 1ab. 176
37638 411011 cd20786 tapirin_C C-terminal domain of cellulose binding protein tapirin. This family contains the C-terminal domain of tapirin, a unique cellulose binding protein that is present only in the extremely thermophilic bacterial species Caldicellulosiruptor that grow on carbohydrates from lignocellulose at elevated temperatures. Tapirins appear to be specifically attached to cellulose, having similar binding affinities to cellulose as family 3 carbohydrate binding modules (CBM3). Structures of the C-terminal region indicate that aromatic and hydrophobic residues are responsible for cellulose binding, while a flexible peptide loop may protect and control access to this region. The basis for the genomic localization of the tapirins is unknown; however, these proteins are located next to type IV pili in the Caldicellulosiruptor genomes and therefore may be exposed on the cell membrane beside or as part of pili proteins. Caldicellulosiruptor hydrothermalis, which has less capability to deconstruct lignocellulose itself, may use tapirin as one of the mechanisms for its survival in extreme environments by anchoring itself to biomass that is hydrolyzed by enzymes from other species. Understanding mechanisms by which these microorganisms attach to and degrade lignocellulose may be important in finding effective approaches for conversion of plant biomass into fuels and chemicals. 343
37639 412053 cd20788 TBC1D23_C-like C-terminal domain of TBC1 domain family member 23, and similar proteins. This family contains the C-terminal domain of Tre2-Bub2-Cdc16 (TBC) family 23 (TBC1D23), which adopts a Pleckstrin homology (PH) domain fold. It selectively binds to phosphoinositides, in particular, PtdIns(4)P, through one surface while it binds FAM21 via the opposite surface. TBC1D23, which is highly conserved in many eukaryotes but missing in plants and fungi, also possesses an N-terminal domain which is a catalytically inactive TBC domain. TBC1D23 encodes a protein functioning in endosome-to-Golgi trafficking in cells; it is a specificity determinant that links the vesicle to the target membrane. Homozygous mutations of TBC1D23 have been found in patients diagnosed with pontocerebellar hypoplasia (PCH), a group of neurological disorders that affect the brain development, particularly, the pons and cerebellum. Mutation of key residues of TBC1D23 (or FAM21) selectively disrupts the endosomal vesicular trafficking toward the Trans-Golgi Network. This C-terminal domain is missing in some PCH patients. 115
37640 411012 cd20789 Cas13d Class 2 type VI-D CRISPR-associated RNA-guided ribonuclease Cas13d. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13d enzymes are 20-30% smaller than other Cas13 subtypes, which enables flexible packaging into size-constrained therapeutic viral vectors such as adeno-associated virus. 875
37641 411013 cd20790 Cas13a Class 2 type VI-A CRISPR-associated RNA-guided ribonuclease Cas13a. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Within the Cas13a (also called C2c2) subfamily, the active site is functionally diverse in terms of both nucleotide cleavage preference and turnover efficiency. There are two distinct types of Cas13a enzymes, based on their cleavage preference: adenosine (A) cleaving or uridine (U) cleaving. 1188
37642 410342 cd20792 C1_cPKC_nPKC_rpt1 first protein kinase C conserved region 1 (C1 domain) found in classical (or conventional) protein kinase C (cPKC), novel protein kinase C (nPKC), and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. nPKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs (aPKCs) only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. This family includes classical PKCs (cPKCs) and novel PKCs (nPKCs). There are four cPKC isoforms (named alpha, betaI, betaII, and gamma) and four nPKC isoforms (delta, epsilon, eta, and theta). Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37643 410343 cd20793 C1_cPKC_nPKC_rpt2 second protein kinase C conserved region 1 (C1 domain) found in classical (or conventional) protein kinase C (cPKC), novel protein kinase C (nPKC), and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. nPKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs (aPKCs) only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. This family includes classical PKCs (cPKCs) and novel PKCs (nPKCs). There are four cPKC isoforms (named alpha, betaI, betaII, and gamma) and four nPKC isoforms (delta, epsilon, eta, and theta). Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 50
37644 410344 cd20794 C1_aPKC protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. Members of this family contain one C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37645 410345 cd20795 C1_PKD_rpt1 first protein kinase C conserved region 1 (C1 domain) found in the protein kinase D (PKD) family. PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs contain N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37646 410346 cd20796 C1_PKD_rpt2 second protein kinase C conserved region 1 (C1 domain) found in the family of protein kinase D (PKD). PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs contain N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37647 410347 cd20797 C1_CeDKF1-like_rpt1 first protein kinase C conserved region 1 (C1 domain) found in Caenorhabditis elegans serine/threonine-protein kinase DKF-1 and similar proteins. DKF-1 converts transient diacylglycerol (DAG) signals into prolonged physiological effects, independently of PKC. It plays a role in the regulation of growth and neuromuscular control of movement. It is involved in immune response to Staphylococcus aureus bacterium by activating transcription factor hlh-30 downstream of phospholipase plc-1. Members of this group contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37648 410348 cd20798 C1_CeDKF1-like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in Caenorhabditis elegans serine/threonine-protein kinase DKF-1 and similar proteins. DKF-1 converts transient diacylglycerol (DAG) signals into prolonged physiological effects, independently of PKC. It plays a role in the regulation of growth and neuromuscular control of movement. It is involved in immune response to Staphylococcus aureus bacterium by activating transcription factor hlh-30 downstream of phospholipase plc-1. Members of this group contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37649 410349 cd20799 C1_DGK_typeI_rpt1 first protein kinase C conserved region 1 (C1 domain) found in type I diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type I DAG kinases (DGKs) contain EF-hand structures that bind Ca(2+) and recoverin homology domains, in addition to C1 and catalytic domains that are present in all DGKs. Type I DGKs, regulated by calcium binding, include three DGK isozymes (alpha, beta and gamma). DAG kinase alpha, also called 80 kDa DAG kinase, or diglyceride kinase alpha (DGK-alpha), is active upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. DAG kinase beta, also called 90 kDa DAG kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. DGK-alpha contains atypical C1 domains, while DGK-beta and DGK-gamma contain typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37650 410350 cd20800 C1_DGK_typeII_rpt1 first protein kinase C conserved region 1 (C1 domain) found in type II diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type II DAG kinases (DGKs) contain pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. Three DGK isozymes (delta, eta and kappa) are classified as type II. DAG kinase delta, also called 130 kDa DAG kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. The DAG kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase kappa is also called diglyceride kinase kappa (DGK-kappa) or 142 kDa DAG kinase. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 60
37651 410351 cd20801 C1_DGKepsilon_typeIII_rpt1 first protein kinase C conserved region 1 (C1 domain) found in type III diacylglycerol kinase, DAG kinase epsilon, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase epsilon, also called diglyceride kinase epsilon (DGK-epsilon), is the only isoform classified as type III; it possesses a hydrophobic domain in addition to C1 and catalytic domains that are present in all DGKs, and shows selectivity for acyl chains. It is highly selective for arachidonate-containing species of DAG. It may terminate signals transmitted through arachidonoyl-DAG or may contribute to the synthesis of phospholipids with defined fatty acid composition. DAG kinase epsilon contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37652 410352 cd20802 C1_DGK_typeIV_rpt1 first protein kinase C conserved region 1 (C1 domain) found in type IV diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type IV DAG kinases (DGKs) contain myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. Two DGK isozymes (zeta and iota) are classified as type IV. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37653 410353 cd20803 C1_DGKtheta_typeV_rpt1 first protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37654 410354 cd20804 C1_DGKtheta_typeV_rpt2 second protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37655 410355 cd20805 C1_DGK_rpt2 second protein kinase C conserved region 1 (C1 domain) found in the diacylglycerol kinase family. The diacylglycerol kinase (DGK, EC 2.7.1.107) family of enzymes plays critical roles in lipid signaling pathways by converting diacylglycerol to phosphatidic acid, thereby downregulating signaling by the former and upregulating signaling by the latter second messenger. Ten DGK family isozymes have been identified to date, which possess different interaction motifs imparting distinct temporal and spatial control of DGK activity to each isozyme. They have been classified into five types (I-V), according to domain architecture and some common features. All DGK isozymes, except for DGKtheta, contain two copies of the C1 domain. This model corresponds to the second one. DGKtheta harbors three C1 domains. Its third C1 domain is included here. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37656 410356 cd20806 C1_CHN protein kinase C conserved region 1 (C1 domain) found in the chimaerin family. Chimaerins are a family of phorbolester- and diacylglycerol-responsive GTPase activating proteins (GAPs) specific for the Rho-like GTPase Rac. Alpha1-chimerin (formerly known as N-chimerin) and alpha2-chimerin are alternatively spliced products of a single gene, as are beta1- and beta2-chimerin. Alpha1- and beta1-chimerin have a relatively short N-terminal region that does not encode any recognizable domains, whereas alpha2- and beta2-chimerin both include a functional SH2 domain that can bind to phosphotyrosine motifs within receptors. All the isoforms contain a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37657 410357 cd20807 C1_Munc13 protein kinase C conserved region 1 (C1 domain) found in the Munc13 family. The Munc13 gene family encodes a family of neuron-specific, synaptic molecules that bind to syntaxin, an essential mediator of neurotransmitter release. Munc13-1 is a component of presynaptic active zones in which it acts as an essential synaptic vesicle priming protein. Munc13-2 is essential for normal release probability at hippocampal mossy fiber synapses. Munc13-3 is almost exclusively expressed in the cerebellum. It acts as a tumor suppressor and plays a critical role in the formation of release sites with calcium channel nanodomains. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37658 410358 cd20808 C1_RASGRP protein kinase C conserved region 1 (C1 domain) found in the RAS guanyl-releasing protein (RASGRP) family. The RASGRP family includes RASGRP1-4. They function as cation-, usually calcium-, and diacylglycerol (DAG)-regulated nucleotide exchange factor activating Ras through the exchange of bound GDP for GTP. RASGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII) or Ras guanyl-releasing protein, activates the Erk/MAP kinase cascade and regulates T-cell/B-cell development, homeostasis and differentiation by coupling T-lymphocyte/B-lymphocyte antigen receptors to Ras. RASGRP1 also regulates NK cell cytotoxicity and ITAM-dependent cytokine production by activation of Ras-mediated ERK and JNK pathways. RASGRP2, also called calcium and DAG-regulated guanine nucleotide exchange factor I (CalDAG-GEFI), Cdc25-like protein (CDC25L), or F25B3.3 kinase-like protein, specifically activates Rap and may also activate other GTPases such as RRAS, RRAS2, NRAS, KRAS but not HRAS. RASGRP2 is involved in aggregation of platelets and adhesion of T-lymphocytes and neutrophils probably through inside-out integrin activation, as well as in the muscarinic acetylcholine receptor M1/CHRM1 signaling pathway. RASGRP3, also called calcium and DAG-regulated guanine nucleotide exchange factor III (CalDAG-GEFIII), or guanine nucleotide exchange factor for Rap1, is a guanine nucleotide-exchange factor activating H-Ras, R-Ras and Ras-associated protein-1/2. It functions as an important mediator of signaling downstream from receptor coupled phosphoinositide turnover in B and T cells. RASGRP4 may function in mast cell differentiation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37659 410359 cd20809 C1_MRCK protein kinase C conserved region 1 (C1 domain) found in the Myotonic dystrophy kinase-related Cdc42-binding kinase (MRCK) family. MRCK is thought to be a coincidence detector of signaling by the small GTPase Cdc42 and phosphoinositides. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. MRCK has been shown to promote cytoskeletal reorganization, which affects many biological processes. Three isoforms of MRCK are known, named alpha, beta and gamma. MRCKgamma is expressed in heart and skeletal muscles, unlike MRCKalpha and MRCKbeta, which are expressed ubiquitously. MRCK consists of a serine/threonine kinase domain, a cysteine rich (C1) region, a PH domain and a p21 binding motif. This model corresponds to C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37660 410360 cd20810 C1_VAV protein kinase C conserved region 1 (C1 domain) found in VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and as scaffold proteins, and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37661 410361 cd20811 C1_Raf protein kinase C conserved region 1 (C1 domain) found in the Raf (Rapidly Accelerated Fibrosarcoma) kinase family. Raf kinases are serine/threonine kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. They act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Aberrant expression or activation of components in this pathway are associated with tumor initiation, progression, and metastasis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. Vertebrates have three Raf isoforms (A-, B-, and C-Raf) with different expression profiles, modes of regulation, and abilities to function in the ERK cascade, depending on cellular context and stimuli. They have essential and non-overlapping roles during embryo- and organogenesis. Knockout of each isoform results in a lethal phenotype or abnormality in most mouse strains. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 49
37662 410362 cd20812 C1_KSR protein kinase C conserved region 1 (C1 domain) found in the kinase suppressor of Ras (KSR) family. KSR is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases, but there is some debate in this designation as a few groups have reported detecting kinase catalytic activity for KSRs, specifically KSR1. Vertebrates contain two KSR proteins, KSR1 and KSR2. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 48
37663 410363 cd20813 C1_ROCK protein kinase C conserved region 1 (C1 domain) found in the Rho-associated coiled-coil containing protein kinase (ROCK) family. ROCK is a serine/threonine protein kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. It is also referred to as Rho-associated kinase or simply as Rho kinase. It contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain. It is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Studies in knockout mice result in different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. This model corresponds to C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 65
37664 410364 cd20814 CRIK protein kinase C conserved region 1 (C1 domain) found in citron Rho-interacting kinase (CRIK) and similar proteins. CRIK, also called serine/threonine-protein kinase 21, is an effector of the small GTPase Rho. It plays an important function during cytokinesis and affects its contractile process. CRIK-deficient mice show severe ataxia and epilepsy as a result of abnormal cytokinesis and massive apoptosis in neuronal precursors. A Down syndrome critical region protein TTC3 interacts with CRIK and inhibits CRIK-dependent neuronal differentiation and neurite extension. CRIK contains a catalytic domain, a central coiled-coil domain, and a C-terminal region containing a Rho-binding domain (RBD), a zinc finger (C1 domain), and a pleckstrin homology (PH) domain, in addition to other motifs. This model corresponds to C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37665 410365 cd20815 C1_p190RhoGEF-like protein kinase C conserved region 1 (C1 domain) found in the 190 kDa guanine nucleotide exchange factor (p190RhoGEF)-like family. The p190RhoGEF-like protein family includes p190RhoGEF, Rho guanine nucleotide exchange factor 2 (ARHGEF2), A-kinase anchor protein 13 (AKAP-13) and similar proteins. p190RhoGEF is a brain-enriched, RhoA-specific guanine nucleotide exchange factor that regulates signaling pathways downstream of integrins and growth factor receptors. It is involved in axonal branching, synapse formation and dendritic morphogenesis, as well as in focal adhesion formation, cell motility and B-lymphocytes activation. ARHGEF2 acts as a guanine nucleotide exchange factor (GEF) that activates Rho-GTPases by promoting the exchange of GDP for GTP. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. AKAP-13 is a scaffold protein that plays an important role in assembling signaling complexes downstream of several types of G protein-coupled receptors. It activates RhoA in response to signaling via G protein-coupled receptors via its function as Rho guanine nucleotide exchange factor. It may also activate other Rho family members. AKAP-13 plays a role in cell growth, cell development and actin fiber formation. Members of this family share a common domain architecture containing C1, RhoGEF or Dbl-homologous (DH), and Pleckstrin Homology (PH) domains. Some members may contain additional domains such as the DUF5401 domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37666 410366 cd20816 C1_GMIP-like protein kinase C conserved region 1 (C1 domain) found in the GEM-interacting protein (GMIP)-like family. The GMIP-like family includes GMIP, Rho GTPase-activating protein 29 (ARHGAP29) and Rho GTPase-activating protein 45 (ARHGAP45). GMIP is a RhoA-specific GTPase-activating protein that acts as a key factor in saltatory neuronal migration. It associates with the Rab27a effector JFC1 and modulates vesicular transport and exocytosis. ARHGAP29, also called PTPL1-associated RhoGAP protein 1 (PARG1) or Rho-type GTPase-activating protein 29, is a GTPase activator for the Rho-type GTPases by converting them to an inactive GDP-bound state. It has strong activity toward RHOA, and weaker activity toward RAC1 and CDC42. ARHGAP29 may act as a specific effector of RAP2A to regulate Rho. In concert with RASIP1, ARHGAP29 suppresses RhoA signaling and dampens ROCK and MYH9 activities in endothelial cells and plays an essential role in blood vessel tubulogenesis. ARHGAP45, also called minor histocompatibility antigen HA-1 (mHag HA-1), is a Rac-GAP (GTPase-Activating Protein) in endothelial cells. It acts as a novel regulator of endothelial integrity. ARHGAP45 contains a GTPase activator for the Rho-type GTPases (RhoGAP) domain that would be able to negatively regulate the actin cytoskeleton as well as cell spreading. However, it also contains an N-terminal BAR domain, which can exert an autoinhibitory effect on this RhoGAP activity. Members of this family contain a zinc-binding C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 51
37667 410367 cd20817 C1_Stac protein kinase C conserved region 1 (C1 domain) found in the SH3 and cysteine-rich domain-containing protein (Stac) family. Stac proteins are putative adaptor proteins that are important for neuronal function. There are three mammalian members (Stac1, Stac2 and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. Stac proteins contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 51
37668 410368 cd20818 C1_Myosin-IX protein kinase C conserved region 1 (C1 domain) found in the unconventional myosin-IX family. Myosins IX (Myo9) are a class of unique motor proteins with a common structure of an N-terminal extension preceding a myosin head homologous to the Ras-association (RA) domain, a head (motor) domain, a neck with IQ motifs that bind light chains, and a C-terminal tail containing cysteine-rich zinc binding (C1) and Rho-GTPase activating protein (RhoGAP) domains. There are two genes for myosins IX in humans, IXa and IXb, that are different in their expression and localization. IXa is expressed abundantly in brain and testis, and IXb is expressed abundantly in tissues of the immune system. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37669 410369 cd20819 C1_DEF8 protein kinase C conserved region 1 (C1 domain) found in differentially expressed in FDCP 8 (DEF-8) and similar proteins. DEF-8 positively regulates lysosome peripheral distribution and ruffled border formation in osteoclasts. It is involved in bone resorption. DEF-8 contains a protein kinase C conserved region 1 (C1) domain followed by a putative zinc-RING and/or ribbon. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37670 410370 cd20820 C1_RASSF1-like protein kinase C conserved region 1 (C1 domain) found in the Ras association domain-containing protein 1 (RASSF1)-like family. The RASSF1-like family includes RASSF1 and RASSF5. RASSF1 and RASSF5 are members of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. RASSF1A and 1C are the most extensively studied RASSF1; both are localized to microtubules and involved in the regulation of growth and migration. RASSF1 is a potential tumor suppressor that is required for death receptor-dependent apoptosis. RASSF5, also called new ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion. RASSF5 is a potential tumor suppressor that seems to be involved in lymphocyte adhesion by linking RAP1A activation upon T-cell receptor or chemokine stimulation to integrin activation. RASSF1 and RASSF5 contain a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37671 410371 cd20821 C1_MgcRacGAP protein kinase C conserved region 1 (C1 domain) found in male germ cell RacGap (MgcRacGAP) and similar proteins. MgcRacGAP, also called Rac GTPase-activating protein 1 (RACGAP1) or protein CYK4, plays an important dual role in cytokinesis: i) it is part of the centralspindlin complex, together with the mitotic kinesin MKLP1, which is critical for the structure of the central spindle by promoting microtubule bundling; and ii) after phosphorylation by aurora B, MgcRacGAP becomes an effective regulator of RhoA and plays an important role in the assembly of the contractile ring and the initiation of cytokinesis. MgcRacGAP-like proteins contain an N-terminal C1 domain, and a C-terminal RhoGAP domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37672 410372 cd20822 C1_ScPKC1-like_rpt1 first protein kinase C conserved region 1 (C1 domain) found in Saccharomyces cerevisiae protein kinase C-like 1 (ScPKC1) and similar proteins. ScPKC1 is required for cell growth and for the G2 to M transition of the cell division cycle. It mediates a protein kinase cascade, activating BCK1 which itself activates MKK1/MKK2. The family also includes Schizosaccharomyces pombe PKC1 and PKC2, which are involved in the control of cell shape and act as targets of the inhibitor staurosporine. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37673 410373 cd20823 C1_ScPKC1-like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in Saccharomyces cerevisiae protein kinase C-like 1 (ScPKC1) and similar proteins. ScPKC1 is required for cell growth and for the G2 to M transition of the cell division cycle. It mediates a protein kinase cascade, activating BCK1 which itself activates MKK1/MKK2. The family also includes Schizosaccharomyces pombe PKC1 and PKC2, which are involved in the control of cell shape and act as targets of the inhibitor staurosporine. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37674 410374 cd20824 C1_SpBZZ1-like protein kinase C conserved region 1 (C1 domain) found in Schizosaccharomyces pombe protein BZZ1 and similar proteins. BZZ1 is a syndapin-like F-BAR protein that plays a role in endocytosis and trafficking to the vacuole. It functions with type I myosins to restore polarity of the actin cytoskeleton after NaCl stress. BZZ1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. Schizosaccharomyces pombe BZZ1 also harbors a C1 domain, but Saccharomyces cerevisiae BZZ1 doesn't have any. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37675 410375 cd20825 C1_PDZD8 protein kinase C conserved region 1 (C1 domain) found in PDZ domain-containing protein 8 (PDZD8) and similar proteins. PDZD8, also called Sarcoma antigen NY-SAR-84/NY-SAR-104, is a molecular tethering protein that connects endoplasmic reticulum (ER) and mitochondrial membranes. PDZD8-dependent ER-mitochondria membrane tethering is essential for ER-mitochondria Ca2+ transfer. In neurons, it is involved in the regulation of dendritic Ca2+ dynamics by regulating mitochondrial Ca2+ uptake. PDZD8 also plays an indirect role in the regulation of cell morphology and cytoskeletal organization. It contains a PDZ domain and a C1 domain. This model describes the C1 domain, a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37676 410376 cd20826 C1_TNS2-like protein kinase C conserved region 1 (C1 domain) found in tensin-2 like (TNS2-like) proteins. The TNS2-like group includes TNS2, and variants of TNS1 and TNS3. Tensin-2 (TNS2), also called C1 domain-containing phosphatase and tensin (C1-TEN), or tensin-like C1 domain-containing phosphatase (TENC1), is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It regulates cell motility and proliferation. It may have phosphatase activity. TNS2 reduces AKT1 phosphorylation, lowers AKT1 kinase activity and interferes with AKT1 signaling. Tensin-1 (TNS1) plays a role in fibrillar adhesion formation. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton. Tensin-3 (TNS3), also called tensin-like SH2 domain-containing protein 1 (TENS1), or tumor endothelial marker 6 (TEM6), may play a role in actin remodeling. It is involved in the dissociation of the integrin-tensin-actin complex. Typical TNS1 and TNS3 do not contain C1 domains, but some isoforms/variants do. Members of this family contain an N-terminal region with a zinc finger (C1 domain), a protein tyrosine phosphatase (PTP)-like domain and a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37677 410377 cd20827 C1_Sbf-like protein kinase C conserved region 1 (C1 domain) found in the myotubularin-related protein Sbf and similar proteins. This group includes Drosophila melanogaster SET domain binding factor (Sbf), the single homolog of human MTMR5/MTMR13, and similar proteins, that show high sequence similarity to vertebrate myotubularin-related proteins (MTMRs) which may function as guanine nucleotide exchange factors (GEFs). Sbf is a pseudophosphatase that coordinates both phosphatidylinositol 3-phosphate (PI(3)P) turnover and Rab21 GTPase activation in an endosomal pathway that controls macrophage remodeling. It also functions as a GEF that promotes Rab21 GTPase activation associated with PI(3)P endosomes. Vertebrate MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Members of this family contain these domains and have an additional C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37678 410378 cd20828 C1_MTMR-like protein kinase C conserved region 1 (C1 domain) found in uncharacterized proteins similar to myotubularin-related proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate myotubularin-related proteins (MTMRs), such as MTMR5 and MTMR13. MTMRs may function as guanine nucleotide exchange factors (GEFs). Vertebrate MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Members of this family contain these domains and have an additional C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37679 410379 cd20829 C1_PIK3R-like_rpt1 first protein kinase C conserved region 1 (C1 domain) found in uncharacterized phosphatidylinositol 3-kinase regulatory subunit-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate phosphatidylinositol 3-kinase regulatory subunits (PIK3Rs), which bind to activated (phosphorylated) protein-tyrosine kinases through its SH2 domain and regulate their kinase activity. Unlike typical PIK3Rs, members of this family have two C1 domains. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37680 410380 cd20830 C1_PIK3R-like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in uncharacterized phosphatidylinositol 3-kinase regulatory subunit-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate phosphatidylinositol 3-kinase regulatory subunits (PIK3Rs), which bind to activated (phosphorylated) protein-tyrosine kinases through its SH2 domain and regulate their kinase activity. Unlike typical PIK3Rs, members of this family have two C1 domains. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37681 410381 cd20831 C1_dGM13116p-like protein kinase C conserved region 1 (C1 domain) found in Drosophila melanogaster GM13116p and similar proteins. This group contains uncharacterized proteins including Drosophila melanogaster GM13116p and Caenorhabditis elegans hypothetical protein R11G1.4, both of which contain C2 (a calcium-binding domain) and C1 domains. This model describes the C1 domain, a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 58
37682 410382 cd20832 C1_ARHGEF-like protein kinase C conserved region 1 (C1 domain) found in uncharacterized Rho guanine nucleotide exchange factor (ARHGEF)-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate Rho guanine nucleotide exchange factors ARHGEF11 and ARHGEF12, which may play a role in the regulation of RhoA GTPase by guanine nucleotide-binding alpha-12 (GNA12) and alpha-13 (GNA13). Unlike typical ARHGEF11 and ARHGEF12, members of this family contain a C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37683 410383 cd20833 C1_cPKC_rpt1 first protein kinase C conserved region 1 (C1 domain) found in the classical (or conventional) protein kinase C (cPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 58
37684 410384 cd20834 C1_nPKC_theta-like_rpt1 first protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) theta, delta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37685 410385 cd20835 C1_nPKC_epsilon-like_rpt1 first protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) epsilon, eta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 64
37686 410386 cd20836 C1_cPKC_rpt2 second protein kinase C conserved region 1 (C1 domain) found in the classical (or conventional) protein kinase C (cPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37687 410387 cd20837 C1_nPKC_theta-like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) theta, delta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 50
37688 410388 cd20838 C1_nPKC_epsilon-like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) epsilon, eta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37689 410389 cd20839 C1_PKD1_rpt1 first protein kinase C conserved region 1 (C1 domain) found in protein kinase D (PKD) and similar proteins. PKD is also called PKD1, PRKD1, protein kinase C mu type (nPKC-mu), PRKCM, serine/threonine-protein kinase D1, or nPKC-D1. It is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of MAPK8/JNK1 and Ras signaling, Golgi membrane integrity and trafficking, cell survival through NF-kappa-B activation, cell migration, cell differentiation by mediating HDAC7 nuclear export, cell proliferation via MAPK1/3 (ERK1/2) signaling, and plays a role in cardiac hypertrophy, VEGFA-induced angiogenesis, genotoxic-induced apoptosis and flagellin-stimulated inflammatory response. PKD contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 72
37690 410390 cd20840 C1_PKD2_rpt1 first protein kinase C conserved region 1 (C1 domain) found in protein kinase D2 (PKD2) and similar proteins. PKD2, also called PRKD2, HSPC187, or serine/threonine-protein kinase D2 (nPKC-D2), is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of cell proliferation via MAPK1/3 (ERK1/2) signaling, oxidative stress-induced NF-kappa-B activation, inhibition of HDAC7 transcriptional repression, signaling downstream of T-cell antigen receptor (TCR) and cytokine production, and plays a role in Golgi membrane trafficking, angiogenesis, secretory granule release and cell adhesion. PKD2 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 73
37691 410391 cd20841 C1_PKD3_rpt1 first protein kinase C conserved region 1 (C1 domain) found in protein kinase D3 (PKD3) and similar proteins. PKD3 is also called PRKD3, PRKCN, serine/threonine-protein kinase D3 (nPKC-D3), protein kinase C nu type (nPKC-nu), or protein kinase EPK2. It converts transient diacylglycerol (DAG) signals into prolonged physiological effects, downstream of PKC. It is involved in the regulation of the cell cycle by modulating microtubule nucleation and dynamics. PKD3 acts as a key mediator in several cancer development signaling pathways. PKD3 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 75
37692 410392 cd20842 C1_PKD1_rpt2 second protein kinase C conserved region 1 (C1 domain) found in protein kinase D (PKD) and similar proteins. PKD is also called PKD1, PRKD1, protein kinase C mu type (nPKC-mu), PRKCM, serine/threonine-protein kinase D1, or nPKC-D1. It is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of MAPK8/JNK1 and Ras signaling, Golgi membrane integrity and trafficking, cell survival through NF-kappa-B activation, cell migration, cell differentiation by mediating HDAC7 nuclear export, cell proliferation via MAPK1/3 (ERK1/2) signaling, and plays a role in cardiac hypertrophy, VEGFA-induced angiogenesis, genotoxic-induced apoptosis and flagellin-stimulated inflammatory response. PKD contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 94
37693 410393 cd20843 C1_PKD2_rpt2 second protein kinase C conserved region 1 (C1 domain) found in protein kinase D2 (PKD2) and similar proteins. PKD2, also called PRKD2, HSPC187, or serine/threonine-protein kinase D2 (nPKC-D2), is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of cell proliferation via MAPK1/3 (ERK1/2) signaling, oxidative stress-induced NF-kappa-B activation, inhibition of HDAC7 transcriptional repression, signaling downstream of T-cell antigen receptor (TCR) and cytokine production, and plays a role in Golgi membrane trafficking, angiogenesis, secretory granule release and cell adhesion. PKD2 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 79
37694 410394 cd20844 C1_PKD3_rpt2 second protein kinase C conserved region 1 (C1 domain) found in protein kinase D3 (PKD3) and similar proteins. PKD3 is also called PRKD3, PRKCN, serine/threonine-protein kinase D3 (nPKC-D3), protein kinase C nu type (nPKC-nu), or protein kinase EPK2. It converts transient diacylglycerol (DAG) signals into prolonged physiological effects, downstream of PKC. It is involved in the regulation of the cell cycle by modulating microtubule nucleation and dynamics. PKD3 acts as a key mediator in several cancer development signaling pathways. PKD3 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 69
37695 410395 cd20845 C1_DGKbeta_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase beta (DAG kinase beta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase beta, also called 90 kDa diacylglycerol kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase beta contains two copies of the C1 domain. This model corresponds to the first one. DGK-beta contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 66
37696 410396 cd20846 C1_DGKgamma_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase gamma (DAG kinase gamma) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DGK-gamma contains two copies of the C1 domain. This model corresponds to the first one. DGK-gamma contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 73
37697 410397 cd20847 C1_DGKdelta_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase delta (DAG kinase delta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase delta, also called 130 kDa diacylglycerol kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. DAG kinase delta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 85
37698 410398 cd20848 C1_DGKeta_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase eta (DAG kinase eta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. The diacylglycerol kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase eta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 86
37699 410399 cd20849 C1_DGKzeta_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase zeta (DAG kinase zeta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase zeta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 74
37700 410400 cd20850 C1_DGKiota_rpt1 first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase iota (DAG kinase iota) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase iota contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 73
37701 410401 cd20851 C1_DGK_typeI_like_rpt2 second protein kinase C conserved region 1 (C1 domain) found in type I diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type I DAG kinases (DGKs) contain EF-hand structures that bind Ca(2+) and recoverin homology domains, in addition to C1 and catalytic domains that are present in all DGKs. Type I DGKs, regulated by calcium binding, include three DGK isozymes (alpha, beta and gamma). DAG kinase alpha, also called 80 kDa DAG kinase, or diglyceride kinase alpha (DGK-alpha), is active upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. DAG kinase beta, also called 90 kDa DAG kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. DGK-alpha contains atypical C1 domains, while DGK-beta and DGK-gamma contain typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37702 410402 cd20852 C1_DGK_typeII_rpt2 second protein kinase C conserved region 1 (C1 domain) found in type II diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type II DAG kinases (DGKs) contain pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. Three DGK isozymes (delta, eta and kappa) are classified as type II. DAG kinase delta, also called 130 kDa DAG kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. The DAG kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase kappa is also called diglyceride kinase kappa (DGK-kappa) or 142 kDa DAG kinase. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37703 410403 cd20853 C1_DGKepsilon_typeIII_rpt2 second protein kinase C conserved region 1 (C1 domain) found in type III diacylglycerol kinase, DAG kinase epsilon, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase epsilon, also called diglyceride kinase epsilon (DGK-epsilon), is the only isoform classified as type III; it possesses a hydrophobic domain in addition to C1 and catalytic domains that are present in all DGKs, and shows selectivity for acyl chains. It is highly selective for arachidonate-containing species of DAG. It may terminate signals transmitted through arachidonoyl-DAG or may contribute to the synthesis of phospholipids with defined fatty acid composition. DAG kinase epsilon contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 63
37704 410404 cd20854 C1_DGKtheta_typeV_rpt3 third protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the third one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 63
37705 410405 cd20855 C1_DGK_typeIV_rpt2 second protein kinase C conserved region 1 (C1 domain) found in type IV diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type IV DAG kinases (DGKs) contain myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. Two DGK isozymes (zeta and iota) are classified as type IV. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37706 410406 cd20856 C1_alphaCHN protein kinase C conserved region 1 (C1 domain) found in alpha-chimaerin and similar proteins. Alpha-chimaerin, also called A-chimaerin, N-chimaerin (CHN), alpha-chimerin, N-chimerin (NC), or Rho GTPase-activating protein 2 (ARHGAP2), is a GTPase-activating protein (GAP) for p21-rac and a phorbol ester receptor. It is involved in the assembly of neuronal locomotor circuits as a direct effector of EPHA4 in axon guidance. Alpha-chimaerin contains a functional SH2 domain that can bind to phosphotyrosine motifs within receptors, a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37707 410407 cd20857 C1_betaCHN protein kinase C conserved region 1 (C1 domain) found in beta-chimaerin and similar proteins. Beta-chimaerin, also called beta-chimerin (BCH) or Rho GTPase-activating protein 3 (ARHGAP3), is a GTPase-activating protein (GAP) for p21-rac. Insufficient expression of beta-2 chimaerin is expected to lead to higher Rac activity and could therefore play a role in the progression from low-grade to high-grade tumors. Beta-chimaerin contains a functional SH2 domain that can bind to phosphotyrosine motifs within receptors, a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37708 410408 cd20858 C1_Munc13-1 protein kinase C conserved region 1 (C1 domain) found in Munc13-1 and similar proteins. Munc13-1, also called protein unc-13 homolog A (Unc13A), is a diacylglycerol (DAG) receptor that plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway. It is involved in neurotransmitter release by acting in synaptic vesicle priming prior to vesicle fusion and participates in the activity-dependent refilling of readily releasable vesicle pool (RRP). Loss of MUNC13-1 function causes microcephaly, cortical hyperexcitability, and fatal myasthenia. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 60
37709 410409 cd20859 C1_Munc13-2-like protein kinase C conserved region 1 (C1 domain) found in Munc13-2, Munc13-3 and similar proteins. Munc13-2, also called protein unc-13 homolog B (Unc13B), plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway. It is involved in neurotransmitter release by acting in synaptic vesicle priming prior to vesicle fusion and participates in the activity-dependent refilling of readily releasable vesicle pool (RRP). Munc13-2 is essential for normal release probability at hippocampal mossy fiber synapses. Munc13-3 is almost exclusively expressed in the cerebellum. It acts as a tumor suppressor and plays a critical role in the formation of release sites with calcium channel nanodomains. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 82
37710 410410 cd20860 C1_RASGRP1 protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 1 (RASGRP1) and similar proteins. RASGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII) or Ras guanyl-releasing protein, functions as a calcium- and diacylglycerol (DAG)-regulated nucleotide exchange factor specifically activating Ras through the exchange of bound GDP for GTP. It activates the Erk/MAP kinase cascade and regulates T-cell/B-cell development, homeostasis and differentiation by coupling T-lymphocyte/B-lymphocyte antigen receptors to Ras. RASGRP1 also regulates NK cell cytotoxicity and ITAM-dependent cytokine production by activation of Ras-mediated ERK and JNK pathways. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37711 410411 cd20861 C1_RASGRP2 protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 2 (RASGRP2) and similar proteins. RASGRP2, also called calcium and DAG-regulated guanine nucleotide exchange factor I (CalDAG-GEFI), Cdc25-like protein (CDC25L), or F25B3.3 kinase-like protein, functions as a calcium- and DAG-regulated nucleotide exchange factor specifically activating Rap through the exchange of bound GDP for GTP. It may also activate other GTPases such as RRAS, RRAS2, NRAS, KRAS but not HRAS. RASGRP2 is also involved in aggregation of platelets and adhesion of T-lymphocytes and neutrophils probably through inside-out integrin activation, as well as in the muscarinic acetylcholine receptor M1/CHRM1 signaling pathway. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37712 410412 cd20862 C1_RASGRP3 protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 3 (RASGRP3) and similar proteins. RASGRP3, also called calcium and DAG-regulated guanine nucleotide exchange factor III (CalDAG-GEFIII), or guanine nucleotide exchange factor for Rap1, is a guanine nucleotide-exchange factor activating H-Ras, R-Ras and Ras-associated protein-1/2. It functions as an important mediator of signaling downstream from receptor coupled phosphoinositide turnover in B and T cells. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37713 410413 cd20863 C1_RASGRP4 protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 4 (RASGRP4) and similar proteins. RASGRP4 functions as a cation- and diacylglycerol (DAG)-regulated nucleotide exchange factor activating Ras through the exchange of bound GDP for GTP. It may function in mast cell differentiation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37714 410414 cd20864 C1_MRCKalpha protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase alpha (MRCK alpha) and similar proteins. MRCK alpha, also called Cdc42-binding protein kinase alpha, DMPK-like alpha, or myotonic dystrophy protein kinase-like alpha, is a serine/threonine-protein kinase expressed ubiquitously in many tissues. It plays a role in the regulation of peripheral actin reorganization and neurite outgrowth. It may also play a role in the transferrin iron uptake pathway. MRCK alpha is an important downstream effector of Cdc42 and plays a role in the regulation of cytoskeleton reorganization and cell migration. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 60
37715 410415 cd20865 C1_MRCKbeta protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase beta (MRCK beta) and similar proteins. MRCK beta, also called Cdc42-binding protein kinase beta (Cdc42BP-beta), DMPK-like beta, or myotonic dystrophy protein kinase-like beta, is a serine/threonine-protein kinase expressed ubiquitously in many tissues. MRCK beta is an important downstream effector of Cdc42 and plays a role in the regulation of cytoskeleton reorganization and cell migration. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37716 410416 cd20866 C1_MRCKgamma protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase gamma (MRCK gamma) and similar proteins. MRCK gamma (MRCKG), also called Cdc42-binding protein kinase gamma, DMPK-like gamma, myotonic dystrophy protein kinase-like gamma, or myotonic dystrophy protein kinase-like alpha, is a serine/threonine-protein kinase expressed in heart and skeletal muscles. It may act as a downstream effector of Cdc42 in cytoskeletal reorganization and contributes to the actomyosin contractility required for cell invasion, through the regulation of MYPT1 and thus MLC2 phosphorylation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37717 410417 cd20867 C1_VAV1 protein kinase C conserved region 1 (C1 domain) found in VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37718 410418 cd20868 C1_VAV2 protein kinase C conserved region 1 (C1 domain) found in VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 58
37719 410419 cd20869 C1_VAV3 protein kinase C conserved region 1 (C1 domain) found in VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. Its function has been implicated in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37720 410420 cd20870 C1_A_C-Raf protein kinase C conserved region 1 (C1 domain) found in A- and C-Raf (Rapidly Accelerated Fibrosarcoma) kinases, and similar proteins. This group includes A-Raf and C-Raf, both of which are serine/threonine-protein kinases. A-Raf, also called proto-oncogene A-Raf or proto-oncogene A-Raf-1, cooperates with C-Raf in regulating ERK transient phosphorylation that is associated with cyclin D expression and cell cycle progression. Mice deficient in A-Raf are born alive but show neurological and intestinal defects. A-Raf demonstrates low kinase activity to MEK, compared with B- and C-Raf, and may also have alternative functions other than in the ERK signaling cascade. It regulates the M2 type pyruvate kinase, a key glycolytic enzyme. It also plays a role in endocytic membrane trafficking. C-Raf, also known as proto-oncogene Raf-1 or c-Raf-1, is ubiquitously expressed and was the first Raf identified. It was characterized as the acquired oncogene from an acutely transforming murine sarcoma virus (3611-MSV) and the transforming agent from the avian retrovirus MH2. C-Raf-deficient mice embryos die around mid-gestation with increased apoptosis of embryonic tissues, especially in the fetal liver. One of the main functions of C-Raf is restricting caspase activation to promote survival in response to specific stimuli such as Fas stimulation, macrophage apoptosis, and erythroid differentiation. Both A- and C-Raf are mitogen-activated protein kinase kinase kinases (MAP3K, MKKK, MAPKKK), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 52
37721 410421 cd20871 C1_B-Raf protein kinase C conserved region 1 (C1 domain) found in B-Raf (Rapidly Accelerated Fibrosarcoma) kinase and similar proteins. Serine/threonine-protein kinase B-Raf, also called proto-oncogene B-Raf, p94, or v-Raf murine sarcoma viral oncogene homolog B1, activates ERK with the strongest magnitude, compared with other Raf kinases. Mice embryos deficient in B-Raf die around midgestation due to vascular hemorrhage caused by apoptotic endothelial cells. Mutations in B-Raf have been implicated in initiating tumorigenesis and tumor progression, and are found in malignant cutaneous melanoma, papillary thyroid cancer, as well as in ovarian and colorectal carcinomas. Most oncogenic B-Raf mutations are located at the activation loop of the kinase and surrounding regions; the V600E mutation accounts for around 90% of oncogenic mutations. The V600E mutant constitutively activates MEK, resulting in sustained activation of ERK. B-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 60
37722 410422 cd20872 C1_KSR1 protein kinase C conserved region 1 (C1 domain) found in kinase suppressor of Ras 1 (KSR1) and similar proteins. KSR1 functions as a transducer of TNFalpha-stimulated C-Raf activation of ERK1/2 and NF-kB. Detected activity of KSR1 is cell type specific and context dependent. It is inactive in normal colon epithelial cells and becomes activated at the onset of inflammatory bowel disease (IBD). Similarly, KSR1 activity is undetectable prior to stimulation by EGF or ceramide in COS-7 or YAMC cells, respectively. KSR proteins are widely regarded as pseudokinases; however, this matter is up for debate as catalytic activity has been detected for KSR1 in some systems. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 47
37723 410423 cd20873 C1_KSR2 protein kinase C conserved region 1 (C1 domain) found in kinase suppressor of Ras 2 (KSR2) and similar proteins. KSR2 interacts with the protein phosphatase calcineurin and functions in calcium-mediated ERK signaling. It also functions in energy metabolism by regulating AMP kinase and AMPK-dependent processes such as glucose uptake and fatty acid oxidation. KSR proteins act as scaffold proteins that function downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37724 410424 cd20874 C1_ROCK1 protein kinase C conserved region 1 (C1 domain) found in Rho-associated coiled-coil containing protein kinase 1 (ROCK1) and similar proteins. ROCK1 is a serine/threonine kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK1, also called Rho-associated protein kinase 1, renal carcinoma antigen NY-REN-35, Rho-associated, coiled-coil-containing protein kinase I (ROCK-I), p160 ROCK-1, or p160ROCK, is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK proteins contain an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 69
37725 410425 cd20875 C1_ROCK2 protein kinase C conserved region 1 (C1 domain) found in Rho-associated coiled-coil containing protein kinase 2 (ROCK2) and similar proteins. ROCK2 is a serine/threonine kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK2, also called Rho-associated protein kinase 2, Rho kinase 2, Rho-associated, coiled-coil-containing protein kinase II (ROCK-II), or p164 ROCK-2, was the first identified target of activated RhoA, and was found to play a role in stress fiber and focal adhesion formation. It is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK proteins contain an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 71
37726 410426 cd20876 C1_p190RhoGEF protein kinase C conserved region 1 (C1 domain) found in 190 kDa guanine nucleotide exchange factor (p190RhoGEF) and similar proteins. p190RhoGEF, also called Rho guanine nucleotide exchange factor (RGNEF), Rho guanine nucleotide exchange factor 28 (ARHGEF28), or RIP2, is a brain-enriched, RhoA-specific guanine nucleotide exchange factor that regulates signaling pathways downstream of integrins and growth factor receptors. It is involved in axonal branching, synapse formation and dendritic morphogenesis, as well as in focal adhesion formation, cell motility and B-lymphocytes activation. In addition to the Dbl homology (DH)-PH domain, p190RhoGEF contains an N-terminal C1 (Protein kinase C conserved region 1) domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37727 410427 cd20877 C1_ARHGEF2 protein kinase C conserved region 1 (C1 domain) found in Rho guanine nucleotide exchange factor 2 (ARHGEF2) and similar proteins. ARHGEF2, also called guanine nucleotide exchange factor H1 (GEF-H1), microtubule-regulated Rho-GEF, or proliferating cell nucleolar antigen p40, acts as guanine nucleotide exchange factor (GEF) that activates Rho-GTPases by promoting the exchange of GDP for GTP. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. ARHGEF2 may be involved in epithelial barrier permeability, cell motility and polarization, dendritic spine morphology, antigen presentation, leukemic cell differentiation, cell cycle regulation, innate immune response, and cancer. It contains a C1 domain followed by Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37728 410428 cd20878 C1_AKAP13 protein kinase C conserved region 1 (C1 domain) found in A-kinase anchor protein 13 (AKAP-13) and similar proteins. AKAP-13, also called AKAP-Lbc, breast cancer nuclear receptor-binding auxiliary protein (Brx-1), guanine nucleotide exchange factor Lbc, human thyroid-anchoring protein 31, lymphoid blast crisis oncogene (LBC oncogene), non-oncogenic Rho GTPase-specific GTP exchange factor, protein kinase A-anchoring protein 13 (PRKA13), or p47, is a scaffold protein that plays an important role in assembling signaling complexes downstream of several types of G protein-coupled receptors (GPCRs). It activates RhoA in response to GPCR signaling via its function as a Rho guanine nucleotide exchange factor. It may also activate other Rho family members. AKAP-13 plays a role in cell growth, cell development and actin fiber formation. Its Rho-GEF activity is regulated by protein kinase A (PKA), through binding and phosphorylation. Alternative splicing of this gene in humans has at least 3 transcript variants encoding different isoforms (i.e. proto-/onco-Lymphoid blast crisis, Lbc and breast cancer nuclear receptor-binding auxiliary protein, and Brx) that contain a C1 domain followed by a dbl oncogene homology (DH) domain and a PH domain which are required for full transforming activity. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 60
37729 410429 cd20879 C1_ARHGEF18-like protein kinase C conserved region 1 (C1 domain) found in uncharacterized Rho guanine nucleotide exchange factor 18 (ARHGEF18)-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate ARHGEF18, which is also called 114 kDa Rho-specific guanine nucleotide exchange factor (p114-Rho-GEF), p114RhoGEF, or septin-associated RhoGEF (SA-RhoGEF). ARHGEF18 acts as guanine nucleotide exchange factor (GEF) for RhoA GTPases. Its activation induces formation of actin stress fibers. ARHGEF18 also acts as a GEF for RAC1, inducing production of reactive oxygen species (ROS). Members of this family contain C1, RhoGEF or Dbl-homologous (DH), and Pleckstrin Homology (PH) domains, as well as a DUF5401 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37730 410430 cd20880 C1_Stac1 protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein (Stac1) and similar proteins. Stac1, also called Src homology 3 and cysteine-rich domain-containing protein, promotes expression of the ion channel CACNA1H at the cell membrane, and thereby contributes to the regulation of channel activity. It plays a minor and redundant role in promoting the expression of calcium channel CACNA1S at the cell membrane, and thereby contributes to increased channel activity. It slows down the inactivation rate of the calcium channel CACNA1C. Stac1 contains a cysteine-rich C1 domain and two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37731 410431 cd20881 C1_Stac2 protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein 2 (Stac2) and similar proteins. Stac2, also called 24b2/Stac2, or Src homology 3 and cysteine-rich domain-containing protein 2, plays a redundant role in promoting the expression of calcium channel CACNA1S at the cell membrane, and thereby contributes to increased channel activity. It slows down the inactivation rate of the calcium channel CACNA1C. Stac2 contains a cysteine-rich C1 domain and one SH3 domain at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37732 410432 cd20882 C1_Stac3 protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein 3 (Stac3) and similar proteins. Stac3 is an essential component of the skeletal muscle excitation-contraction coupling (ECC) machinery. It is required for normal excitation-contraction coupling in skeletal muscle and for normal muscle contraction in response to membrane depolarization. It plays an essential role for normal Ca2+ release from the sarcoplasmic reticulum, which ultimately leads to muscle contraction. Stac3 contains a cysteine-rich C1 domain and two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37733 410433 cd20883 C1_Myosin-IXa protein kinase C conserved region 1 (C1 domain) found in unconventional myosin-IXa and similar proteins. Myosin-IXa, also called unconventional myosin-9a (Myo9a), is a single-headed, actin-dependent motor protein of the unconventional myosin IX class. It is expressed in several tissues and is enriched in the brain and testes. Myosin-IXa contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating domain (RhoGAP). Myosin-IXa binds the alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid receptor (AMPAR) GluA2 subunit, and plays a key role in controlling the molecular structure and function of hippocampal synapses. Moreover, Myosin-IXa functions in epithelial cell morphology and differentiation, such that its knockout mice develop hydrocephalus and kidney dysfunction. Myosin-IXa regulates collective epithelial cell migration by targeting RhoGAP activity to cell-cell junctions. Myosin-IXa negatively regulates Rho GTPase signaling, and functions as a regulator of kidney tubule function. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 58
37734 410434 cd20884 C1_Myosin-IXb protein kinase C conserved region 1 (C1 domain) found in unconventional myosin-IXb and similar proteins. Myosin-IXb, also called unconventional myosin-9b (Myo9b), is an actin-dependent motor protein of the unconventional myosin IX class. It is expressed abundantly in tissues of the immune system, like lymph nodes, thymus, and spleen, and in several immune cells including dendritic cells, macrophages and CD4+ T cells. Myosin-IXb contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating (RhoGAP) domain. Myosin-IXb acts as a motorized signaling molecule that links Rho signaling to the dynamic actin cytoskeleton. It regulates leukocyte migration by controlling RhoA signaling. Myosin-IXb is also involved in the development of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus, and type 1 diabetes. Moreover, Myosin-IXb is a ROBO-interacting protein that suppresses RhoA activity in lung cancer cells. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 58
37735 410435 cd20885 C1_RASSF1 protein kinase C conserved region 1 (C1 domain) found in Ras association domain-containing protein 1 (RASSF1) and similar proteins. RASSF1 is a member of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. RASSF1A and 1C are the most extensively studied RASSF1 with both localized to microtubules and involved in regulation of growth and migration. RASSF1 is a potential tumor suppressor that is required for death receptor-dependent apoptosis. It contains a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 54
37736 410436 cd20886 C1_RASSF5 protein kinase C conserved region 1 (C1 domain) found in Ras association domain-containing protein 5 (RASSF5) and similar proteins. RASSF5, also called new ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is a member of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. It is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion. RASSF5 is a potential tumor suppressor that seems to be involved in lymphocyte adhesion by linking RAP1A activation upon T-cell receptor or chemokine stimulation to integrin activation. It contains a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 50
37737 410437 cd20887 C1_TNS2 protein kinase C conserved region 1 (C1 domain) found in tensin-2 and similar proteins. Tensin-2 (TNS2), also called C1 domain-containing phosphatase and tensin (C1-TEN), or tensin-like C1 domain-containing phosphatase (TENC1), is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It regulates cell motility and proliferation. It may have phosphatase activity. TNS2 reduces AKT1 phosphorylation, lowers AKT1 kinase activity, and interferes with AKT1 signaling. It contains an N-terminal region with a zinc finger (C1 domain), a protein tyrosine phosphatase (PTP)-like domain and a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 53
37738 410438 cd20888 C1_TNS1_v protein kinase C conserved region 1 (C1 domain) found in tensin-1 (TNS1) variant and similar proteins. Tensin-1 (TNS1) plays a role in fibrillar adhesion formation. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton. This model corresponds to the C1 domain found in TNS1 variant. Typical TNS1 does not contain C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 57
37739 410439 cd20889 C1_TNS3_v protein kinase C conserved region 1 (C1 domain) found in tensin-3 (TNS3) variant and similar proteins. Tensin-3 (TNS3), also called tensin-like SH2 domain-containing protein 1 (TENS1), or tumor endothelial marker 6 (TEM6), may play a role in actin remodeling. It is involved in the dissociation of the integrin-tensin-actin complex. This model corresponds to the C1 domain found in TNS3 variant. Typical TNS3 does not contain C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 56
37740 410440 cd20890 C1_DGKalpha_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase alpha (DAG kinase alpha) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase alpha, also called 80 kDa diacylglycerol kinase, or diglyceride kinase alpha (DGK-alpha), converts the second messenger diacylglycerol into phosphatidate upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase alpha contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37741 410441 cd20891 C1_DGKbeta_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase beta (DAG kinase beta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase beta, also called 90 kDa diacylglycerol kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase beta contains two copies of the C1 domain. This model corresponds to the second one. DGK-beta contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 59
37742 410442 cd20892 C1_DGKgamma_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase gamma (DAG kinase gamma) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DGK-gamma contains two copies of the C1 domain. This model corresponds to the second one. DGK-gamma contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37743 410443 cd20893 C1_DGKdelta_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase delta (DAG kinase delta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase delta, also called 130 kDa diacylglycerol kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. DAG kinase delta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 61
37744 410444 cd20894 C1_DGKeta_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase eta (DAG kinase eta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. The diacylglycerol kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase eta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 62
37745 410445 cd20895 C1_DGKzeta_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase zeta (DAG kinase zeta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase zeta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 75
37746 410446 cd20896 C1_DGKiota_rpt2 second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase iota (DAG kinase iota) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase iota contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 75
37747 411014 cd20897 Smlt3025-like S. maltophilia immunity protein with unknown function (Smlt3025) and similar proteins. This family includes Smlt3025, an immunity protein of the type IV secretion system (T4SS) found in the multi-drug-resistant opportunistic pathogen Stenotrophomonas maltophilia. Experiments show that Smlt3025 counteracts Smlt3024, an effector protein transferred by the T4SS into target cells. The crystal structure of Smlt3025 reveals a topology similar to the iron-regulated protein FrpD from Neisseria meningitidis which has been shown to interact with the RTX protein FrpC, while its counterpart Smlt3024 is homologous to the N-terminal domain of large Ca2+-binding RTX proteins. 234
37748 412054 cd20900 HopBF1 type III secretion system (T3SS) effector HopBF1 from Ewingella americana, and similar proteins. This family includes HopBF1 family of bacterial type III secretion system (T3SS) effectors identified as eukaryotic-specific HSP90 protein kinases. HopBF1 adopts a minimal and atypical protein kinase fold such that it is recognized by HSP90 as a host client. Utilizing this "betrayal-like" mechanism to achieve specificity, HopBF1 phosphorylates and inactivates eukaryotic HSP90 by inhibiting the chaperone's ATPase activity. This prevents activation of immune receptors that trigger hypersensitive response in plants, thereby inducing severe disease symptoms in plants infected by certain plant pathogens. 169
37749 411015 cd20901 CC_AF10 coiled coil domain of ALL1-Fused gene from chromosome 10 protein (AF10) and similar proteins. This family includes AF10 (ALL1-Fused gene from chromosome 10 protein) which is one of mixed-lineage leukemia 1 (MLL1)-fusion partners that function in acute myeloid leukemia (AML). Aberration of the mixed-lineage leukemia (MLL) gene is implicated in acute leukemia; chromosomal translocations of MLL1 generate oncogenic chimeric proteins, containing the non-catalytic N-terminal portion of MLL1 fused with many partners such as AF10. The MLL-AF10 fusion oncoprotein recruits DOT1L (disruptor of telomeric-silencing 1-like) to the homeobox A. The aberrant recruitment of DOT1L, a histone methyltransferase that methylates H3 lysine residues (H3K79), by MLL fusions and the resulting H3K79 methylation are thought to affect gene expression by altering chromatin accessibility. AF10 and DOT1L interact through their coiled coil domains. 64
37750 411016 cd20902 CC_DOT1L coiled coil domain of disruptor of telomeric-silencing 1-like (DOT1L) and similar proteins. This family contains DOT1L (disruptor of telomeric-silencing 1-like), a non-SET domain histone lysine methyltransferase (HKMT) that catalyzes monomethylation, dimethylation, and trimethylation of nucleosomal H3K79. DOT1L is recruited to the homeobox A by AF10 (ALL1-Fused gene from chromosome 10 protein), one of the mixed-lineage leukemia 1 (MLL1)-fusion partners that function in acute myeloid leukemia (AML). Aberration of the MLL gene is implicated in acute leukemia; chromosomal translocations of MLL1 generate oncogenic chimeric proteins, containing the non-catalytic N-terminal portion of MLL1 fused with many partners such as AF10. The aberrant recruitment of DOT1L by MLL fusions and the resulting H3K79 methylation are thought to affect gene expression by altering chromatin accessibility. AF10 and DOT1L interact through their coiled coil domains. 65
37751 411017 cd20903 HCV_p7 Hepatitis C virus p7 protein. Hepatitis C virus (HCV) p7 protein is a viroporin essential for virus production. The p7 monomer is comprised of 2 trans-membrane helices connected by a cytosolic loop, and oligomerizes to form cation-specific ion channels. These ion channels dissipate pH gradients in secretory vesicles potentially protecting acid-labile intracellular virions during egress (the rupturing of the infected cell and release of viral contents). p7 protein has at least two different functions in culture, one via the formation of these ion channels, the other through its specific interaction with the non-structural viral protein NS2. Several compounds targeting p7 have been investigated as anti-HCV drugs. 58
37752 411018 cd20905 EHMT_ZBD Zinc-binding domain of euchromatic histone lysine methyltransferases EHMT1 and EHMT2. EHMT1 (also known as GLP) and EHMT2 (also known as NG36 and G9a) are histone methyltransferases that methylate the K9 position of histone H3, marking genomic regions for transcriptional repression. They may play a role in the G0/G1 cell cycle transition and are associated with promoting various types of cancer. Mutations in EHMT1 are associated with the genetic disorder Kleefstra syndrome. A functional role for the zinc-binding domain has not been established. 133
37753 411019 cd20907 CBM86 carbohydrate binding module family 86. This family describes what is most likely a xylan-binding module such as found in the Xyn10A protein of Roseburia intestinalis L1-82, which is involved in the extracellular capture and breakdown of xylan. 127
37754 411020 cd20908 SUF4-like N-terminal domain of Oryza sativa transcription factor SUPPRESSOR OF FRI 4 (OsSUF4), Arabidopsis thaliana SUF4 (AtSUF4), and similar proteins. Oryza sativa SUPPRESSOR OF FRI 4 (OsSUF4) is a C2H2-type zinc finger transcription factor which interacts with the major H3K36 methyltransferase SDG725 to promote H3K36me3 (tri-methylation at H3K36) establishment. The transcription factor OsSUF4 recognizes a specific 7-bp DNA element (5'-CGGAAAT-3'), which is contained in the promoter regions of many genes throughout the rice genome. Through interaction with OsSUF4, SDG725 is recruited to the promoters of key florigen genes, RICE FLOWERING LOCUS T1 (RFT1) and Heading date 3a (Hd3a), for H3K36 deposition to promote gene activation and rice plant flowering. OsSUF4 target genes include a number of genes involved in many biological processes. Flowering plant Arabidopsis SUF4 binds to a 15bp DNA element (5'-CCAAATTTTAAGTTT-3') within the promoter of the floral repressor gene FLOWERING LOCUS C (FLC) and recruits the FRI-C transcription activator complex to the FLC promoter. Although the DNA-binding element and target genes of AtSUF4 are different from those of OsSUF4, AtSUF4 is known to interact with the Arabidopsis H3K36 methyltransferase SDG8 (also known as ASHH2/EFS/SET8), and the methylation deposition mechanism mediated by the SUF4 transcription factor and H3K36 methyltransferase may be conserved in Arabidopsis and rice. Proteins in this family have two conserved C2H2-type zinc finger motifs at the N-terminus (included in this model), and a large proline-rich domain at the C-terminus; for OsSUF4, it has been shown that the N-terminal zinc-finger domain is responsible for DNA binding, and that the C-terminal domain interacts with SDG725. 82
37755 411021 cd20910 NCBD_CREBBP-p300_like Nuclear Coactivator Binding Domain (NCBD) of CREB (cyclic AMP response element binding protein) binding protein (CREBBP, also known as CBP) and its paralog p300. CREBBP (also called CBP) and its paralog p300, generally referred to as CREBBP/p300, are universal transcriptional coactivators that interact with many important transcription factors and comodulators to activate transcription. The NCBD domain [nuclear coactivator binding domain, also known as IRF-3 binding domain (IBiD) or SRC1 interaction domain (SID)] of CREBBP/p300 behaves as an intrinsically disordered domain in isolation, but folds into helical structures with different topologies upon binding to different ligands such as nuclear receptor coactivator p160, CREBBP interaction domain (CID) from nuclear receptor coactivator 1 (NCOA1 or Src1), NCOA2 (Tif2), and NCOA3 (ACTR), or interferon regulatory factor 3 (IRF-3). In Drosophila, there is only one CREB-binding protein ortholog and it is called nejire, dCBP, CBP/p300, or CBP. 43
37756 411022 cd20912 AIR_RAP80-like ABRAXAS Interacting Region (AIR) of Receptor-Associated Protein 80 (RAP80), and related domains. RAP80 and ABRAXAS are integral subunits of the BRCA1-A complex which also contains MERIT40 (Mediator of Rap80 Interactions and Targeting 40 kD, also known as BABAM1), BRE (also known as BABAM2, BRCC45 and BRCC4), and BRCC36 (also known as BRCC3). BRCA1-A functions in DNA double-strand break (DSB) repair. RAP80 interacts with the ABRAXAS, MERIT40, and BRE subunits. It is the interaction with ABRAXAS that drives specific incorporation of RAP80 into BRCA1-A. RAP80 contains one SUMO-interacting motif (SIM), two ubiquitin-interacting motifs (UIMs), this AIR, and two zinc finger motifs (ZnF). The SIM and UIM domains recruit BRCA1-A to sites of DNA damage. The AIR is integral in the interaction of RAP80 with the ABRAXAS, MERIT40, and BRE subunits. 59
37757 411023 cd20913 DCAF15-CTD C-terminal domain of DDB1- and CUL4-associated factor 15. This model represents the C-terminal domain of DCAF15 (DDB1- and CUL4-associated factor 15), the cullin RING ligase substrate receptor/adaptor that forms a complex with CUL4A or CUL4B, as part of the Rbx-Cul4-DDA1-DDB1-DCAF15 E3 ubiquitin ligase that is responsible for the proteasome degradation of certain proteins. Aryl sulfonamide anticancer agents such as indisulam, tasisulam, E7820, and chloroquinoxaline have been shown to recruit the essential mRNA-splicing factor RBM39 to DCAF15. These agents appear to promote binding of DCAF15 to the RNA-recognition motif (RRM) of RBM39, which suggests that derivatives of the aryl-sulfonamides may be used to target other RRM-containing proteins. Cell proliferation is inhibited by these aryl sulfonamides by causing degradation of RBM39, which leads to aberrant processing of pre-mRNA in hundreds of genes, primarily reflected by intron retention and exon skipping, thus collectively referred to as splicing inhibitor sulfonamides, or SPLAMs. 224
37758 411024 cd20917 DCAF15-NTD N-terminal domain of DDB1- and CUL4-associated factor 15. This model represents the N-terminal domain of DCAF15 (DDB1- and CUL4-associated factor 15), the cullin RING ligase substrate receptor/adaptor that forms a complex with CUL4A or CUL4B, as part of the Rbx-Cul4-DDA1-DDB1-DCAF15 E3 ubiquitin ligase that is responsible for the proteasome degradation of certain proteins. Aryl sulfonamide anticancer agents such as indisulam, tasisulam, E7820, and chloroquinoxaline have been shown to recruit the essential mRNA-splicing factor RBM39 to DCAF15. These agents appear to promote binding of DCAF15 to the RNA-recognition motif (RRM) of RBM39, which suggests that derivatives of the aryl-sulfonamides may be used to target other RRM-containing proteins. Cell proliferation is inhibited by these aryl sulfonamides by causing degradation of RBM39, which leads to aberrant processing of pre-mRNA in hundreds of genes, primarily reflected by intron retention and exon skipping, thus collectively referred to as splicing inhibitor sulfonamides, or SPLAMs. 225
37759 410842 cd20918 polyA_pol_NCLDV RNA polyadenylate polymerase of nucleocytoplasmic large DNA viruses. This model represents the poly(A) polymerases (PAPs) from nucleocytoplasmic large DNA viruses (NCLDV), a group of giant eukaryotic double-stranded DNA viruses that make up the phylum Nucleocytoviricota. They are referred to as nucleocytoplasmic because they are often able to replicate in both the host's cell nucleus and cytoplasm. PAPs catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. This group includes PAPs from the Poxviridae and Mimiviridae family of viruses. In Vaccinia virus, from the Poxviridae family, polyadenylation is crucial for virion maturation and is carried out by a heterodimer, formed by the catalytic subunit VP55 and the processivity factor (VP39), which is required for the formation of long poly(A) tails. PAPs from Acanthamoeba polyphaga mimivirus and Megavirus chiliensis, which belong to the Mimiviridae family, are homodimeric and intrinsically self-processive, generating >700 nucleotides long poly(A) tails. Homodimerization is required for PAP activity; monomers are able to bind RNA but are enzymatically inactive. Thus, while other PAPs form heterodimers with processivity factors, the Mimiviridae PAPs become processive upon homodimerization. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 335
37760 410843 cd20919 polyA_pol_Pox RNA polyadenylate polymerase catalytic subunit from the Poxviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. In Vaccinia virus, from the Poxviridae family of viruses, polyadenylation is crucial for virion maturation and is carried out by a heterodimer, formed by the catalytic subunit VP55 and the processivity factor (VP39). In the absence of VP39, oligo(A) tails are added to permissive primers by VP55 in a rapid processive burst, which ceases abruptly after tails have reached 30-35 nucleotides in length. With VP39, tails with lengths in the hundreds of nucleotides are processively synthesized with no abrupt termination of elongation. In contrast to mammalian cells, polyadenylation is not dependent on a multiprotein mRNA 3' end processing complex. VP55 translocates with respect to its single-stranded nucleic acid substrate during poly(A) tail addition. The catalytic subunit of Poxviridae PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 461
37761 410844 cd20920 polyA_pol_Mimi RNA polyadenylate polymerase from the Mimiviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. PAPs from Acanthamoeba polyphaga mimivirus and Megavirus chiliensis, which belong to the Mimiviridae family, are homodimeric and intrinsically self-processive, generating >700 nucleotides long poly(A) tails. Homodimerization is required for PAP activity; monomers are able to bind RNA but are enzymatically inactive. Thus, while other PAPs form heterodimers with processivity factors, the Mimiviridae PAPs become processive upon homodimerization. mRNA polyadenylation in Mimiviridae occurs at hairpin-forming palindromic sequences terminating viral transcripts. The catalytic subunit of Mimiviridae PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 449
37762 410845 cd20921 polyA_pol_Pycodna RNA polyadenylate polymerase from the Phycodnaviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 357
37763 410846 cd20922 polyA_pol_Marseille RNA polyadenylate polymerase from the Marseilleviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 316
37764 410847 cd20923 polyA_pol_Fausto RNA polyadenylate polymerase from Faustovirus. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 351
37765 410848 cd20924 polyA_pol_Asfar RNA polyadenylate polymerase from the Asfarviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 336
37766 409519 cd20925 IgV_CD28 Immunoglobulin Variable (IgV) domain of Cluster of Differentiation (CD) 28. The members here are composed of the immunoglobulin variable region (IgV) of Cluster of Differentiation (CD) 28. CD28 is one of the proteins expressed on T cells that provide co-stimulatory signals required for T cell activation and survival. CD28 is the receptor for CD80 (B7.1) and CD86 (B7.2) proteins. CD28 consists of a paired V-set of immunoglobulin (Ig) superfamily domains attached to single-transmembrane domains and cytoplasmic domains that contain the MYPPY motif, which is involved in binding B7.1 or B7.2. CD28 is very similar to CTLA-4 (cytotoxic T-lymphocyte-associated protein 4, also known as CD152 (cluster of differentiation 152)), which is involved in the regulation of T cell response, acting as an inhibitor of intracellular signaling. CTLA-4 also binds the B7 molecules (B7.1 and B7.2) with a higher affinity than does CD28. The B7/CTLA-4 interaction generates inhibitory signals down-regulating the response, and may prevent T cell activation by weak TCR signals. CD28 and CTLA-4 then elicit opposing signals in the regulation of T cell responsiveness and homeostasis. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The N-terminal Ig-like domain of CD28 is a member of the V-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C'-C" in the other. However, each CD28-B7 family member is slightly different; some have an IgV domain which lacks an A' or C" strand. 117
37767 409520 cd20926 IgV_NKp30 Immunoglobulin variable (IgV) domain of Natural Killer cell activating receptor NKp30 and similar domains. The members here are composed of the immunoglobulin variable region (IgV) of Natural Killer cell activating receptor NKp30 (also known as Natural Cytotoxicity triggering Receptor 3 (NCR3)) and similar domains. NKp30 recognizes the N-terminal IgV domain of B7-H6. In humans, the activating natural cytotoxicity receptor NKp30 plays a major role in NK cell-mediated tumor cell lysis. NKp30 recognizes the cell-surface protein B7-H6, which is expressed on tumor, but not healthy, cells. 112
37768 409521 cd20927 IgI_Titin_M1-like Immunoglobulin-like M1 domain from Titin; a member of the I-set of IgSF domains. The members here are composed of the Immunoglobulin-like M1 I-set domain from Titin and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin-M1 domain lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 90
37769 409522 cd20928 IgI_BTLA Extracellular Immunoglobulin (Ig) domain of the B and T lymphocyte attenuator (BTLA); member of the I-set Ig superfamily domains. The members here are composed of the extracellular immunoglobulin (Ig) domain of the B and T lymphocyte attenuator (BTLA; also known as CD270). BTLA is a type I transmembrane glycoprotein that is structurally similar to the CD28 family of T cell co-stimulatory or coinhibitory molecules. BTLA is a coinhibitory molecule expressed on T cells, B cells, macrophages, dendritic and natural killer (NK) cells. Unlike CD28 family members, BTLA interacts with the tumor necrosis factor receptor superfamily member HVEM (herpes virus entry mediator) rather than with B7 family ligands. In addition, BTLA does not form a homodimer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. In contrast to CD28 family members, the structure of the BTLA extracellular Ig domain lacks a C" strand and thus is better described as a member of the I-set of Ig domains. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). 101
37770 409523 cd20929 IgV_CD22_d1 First immunoglobulin domain of Cluster of Differentiation (CD) 22; member of the V-set of IgSF domains. The members here are composed of the first immunoglobulin domain in Cluster of Differentiation (CD) 22 (also known as Siglec-2). CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the V-set of IgSF domains. 114
37771 409524 cd20930 Ig3_Nectin-5_like Third immunoglobulin domain of Nectin-like Protein-5, and similar domains. The members here are composed of the third immunoglobulin domain of Nectin-like Protein-5 (also known as Cluster of Differentiation 155 (CD155)). Nectin-like Protein-5 mediates NK (Natural Killer) cell adhesion and triggers NK cell effector functions. CD155 binds two different NK cell receptors: CD96 and CD226. These interactions accumulate at the cell-cell contact site, leading to the formation of a mature immunological synapse between NK cell and target cell. This may trigger adhesion and secretion of lytic granules and IFN-gamma and activate cytotoxicity of activated NK cells. CD155 may also promote NK cell-target cell modular exchange, and PVR transfer to the NK cell. This transfer is more important in some tumor cells expressing a lot of PVR, and may trigger fratricide NK cell activation, providing tumors with a mechanism of immunoevasion. Moreover, CD155 plays a role in mediating tumor cell invasion and migration. 86
37772 409525 cd20931 Ig3_IL1RAP Third immunoglobulin domain of interleukin-1 receptor accessory protein (IL1RAP). The members here are composed of the third immunoglobulin (Ig) domain of interleukin-1 receptor accessory protein (IL1RAP). The interleukin 1 receptor accessory protein (IL-1RAP), also known as IL-1R3, is a coreceptor of type 1 interleukin 1 receptor (IL-1R1) and is required for transmission of IL-1 signaling. The activated IL-1 receptor complex, which consists of IL-1R1 and IL-1RAP, induces multiple cellular responses including NF-kappa-B activation, IL-2 secretion, and IL-2 promoter activation. Signaling involves the recruitment of adapter molecules such as TOLLIP, MYD88, and IRAK1 or IRAK2 via the respective Toll/IL-1 receptor (TIR) domains of the receptor/coreceptor subunits. Moreover, IL1RAP is known to be the accessory co-receptor that activates signal transduction upon IL-36 binding to IL-36R. IL-36 cytokines, which are a subfamily of the IL-1 superfamily, bind to the IL-36 receptor (IL-36R) and use IL1RAP as a co-receptor. 107
37773 409526 cd20932 Ig3_IL1R_like Third immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the third immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R) and similar proteins. Members of this family are characterized by extracellular immunoglobulin-like domains and intracellular Toll/Interleukin-1R (TIR) domain. Three naturally occurring ligands for the IL-1 receptor (IL1R) are known: the agonists IL-1alpha and IL-1beta and the IL-1-receptor antagonist IL1RA. IL-1Rs are involved in immune host defense and hematopoiesis. After binding to interleukin-1, IL1R associates with the coreceptor IL1RAP (interleukin 1 receptor accessory protein, also known as IL-1R3) to form the high affinity interleukin-1 receptor complex, which induces multiple cellular responses including NF-kappa-B activation, IL-2 secretion, and IL-2 promoter activation. Signaling involves the recruitment of adapter molecules such as TOLLIP, MYD88, and IRAK1 or IRAK2 via the respective TIR domains of the receptor/coreceptor subunits. IL1R binds ligands with comparable affinity to its antagonist IL1RA, and binding of IL1RA to IL1R, prevents association of the latter with IL1RAP to form a signaling complex. 104
37774 409527 cd20933 Ig_ch-CD3_epsilon_like Immunoglobulin (Ig)-like domain of chicken Cluster of Differentiation (CD) 3 epsilon chain and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of chicken Cluster of Differentiation (CD) 3 epsilon chain and similar proteins. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex. The chicken CD3epsilon Ig domain has low sequence identity with human (22%) and mouse (24%) CD3epsilon, but overall is structurally very similar over the entire domain. 63
37775 409528 cd20934 IgV_B7-H3 Immunoglobulin Variable (IgV) domain of B7-H3, a member of the B7 family of immune checkpoint molecules. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H3 (also known as CD276), a member of the B7 family of immune checkpoint molecules. B7-H3 is an important immune checkpoint member of the B7 family and shares homology with other B7 ligands such as programmed death ligand 1 (PD-L1). The B7 family molecules interact with CD28 on T-cells to provide co-stimulatory signals that regulate T-cell activation and T-helper cell differentiation. Although B7-H3 has been shown to have both co-stimulatory and co-inhibitory effects on T-cell responses, the most current studies describe B7-H3 as a T cell inhibitor that promotes tumor aggressiveness and proliferation. Moreover, B7-H3 is highly overexpressed on a wide range of human solid cancers and promotes tumor growth, metastasis, and drug resistance. Thus, B7-H3 expression in tumors often correlates with both negative prognosis and poor clinical outcome in cancer patients. B7-H3 protein contains a predicted signal peptide, V- and C-like Ig domains (IgV and IgC), a transmembrane region, and an intracellular tail. 115
37776 409529 cd20935 IgV_B7-H2 Immunoglobulin Variable (IgV) domain of B7-H2 (B7 homolog 2). The members here are composed of the immunoglobulin variable (IgV) domain of B7-H2 (B7 homolog 2 also known as ICOSL (inducible T cell costimulator ligand) or CD275). B7-H2 is a ligand for the T-cell-specific cell surface receptor ICOS and acts as a costimulatory signal for T-cell proliferation and cytokine secretion. The interaction of ICOS with ICOSL (B7-H2) regulates T cell activation and expansion, is involved in T cell dependent B cell activation, and T-helper cell differentiation. It is a member of the B7 family of immune regulatory proteins and shares homology with other B7 ligands, such as B7-1, B7-2, B7-H1 (PD-L1), PD-L2, and B7-H3. The extracellular domains of B7 proteins contain two Ig-like domains and all members have short cytoplasmic domains. These ligands are typically expressed on antigen presenting cells (such as macrophages, B cells and dendritic cells) and have the ability to regulate T-cell proliferation and function. Tumor cells are also capable of expressing the B7 family members in order to evade immune surveillance. 113
37777 409530 cd20936 IgI_3_CSF-1R Third immunoglobulin domain of the hematopoietic colony-stimulating factor 1 receptor (CSF-1R), and similar domains; member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin domain of the hematopoietic colony-stimulating factor 1 receptor (CSF-1R) and similar proteins. CSF-1R, a class III receptor tyrosine kinase (RTKIII), is critical to the survival, proliferation, and differentiation of mononuclear phagocytic cells such as monocytes, tissue macrophages, muscularis macrophages, microglia, osteoclasts, Paneth cells, and myeloid dendritic cells. Human colony-stimulating factor 1 receptor (hCSF-1R) is unique among the hematopoietic receptors because it is activated by two distinct cytokines, CSF-1 and interleukin-34 (IL-34). The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the I-set of IgSF domains. 93
37778 409531 cd20937 IgC2_CD22_d3 Third immunoglobulin domain in Cluster of Differentiation (CD) 22; member of the Constant 2 (C2)-set of IgSF domains. The members here are composed of the third immunoglobulin domain in Cluster of Differentiation (CD) 22 (also known as Siglec-2). CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand. 88
37779 409532 cd20938 IgC1_CD22_d2 Second immunoglobulin domain of Cluster of Differentiation (CD) 22; member of the Constant 1 (C1)-set of IgSF domains. The members here are composed of the second immunoglobulin domain of clusters of differentiation (CD) 22 (also known as Siglec-2). CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C1-set of IgSF domains. 98
37780 409533 cd20939 IgC2_D1_IL-6RA Immunoglobulin-like domain D1 of interleukin-6 receptor alpha-chain (IL-6RA, also known as CD126); member of the C2-set of IgSF domains. The members here are composed of the immunoglobulin-like domain D1 of interleukin-6 receptor alpha-chain (IL-6RA, also known as CD126). The IL-6RA ectodomain is highly modular, consisting of three domains (D1, D2, and D3). Interleukin-6 (IL-6) is a multifunctional cytokine that regulates the immune response, hemopoiesis, the acute phase response and inflammation. It is generated in an infectious lesion and sends out a warning signal to the entire body. IL-6 binds first to its cognate alpha-chain receptor (IL-6R), and the IL-6/IL-6R complex in turn induces homodimerization of gp130. As a result, a high-affinity functional receptor complex of IL-6, IL-6R and gp130 is formed, and subsequently the complex triggers a downstream signal cascade. Aberrant production of IL-6 and its receptor (IL-6R) is implicated in the pathogenesis of multiple myeloma, autoimmune diseases and prostate cancer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains. Unlike other Ig domain sets, the C2-set lacks the D strand. 77
37781 409534 cd20940 Ig0_BSG1 Immunoglobulin-like Ig0 domain of basigin-1 (BSG1) and similar proteins. The members here are composed of the immunoglobulin (Ig) domain of the collagenase stimulatory factor, basigin-1 (BSG1; also known as Cluster of Differentiation 147 (CD147) and Extracellular Matrix Metalloproteinase Inducer (EMMPRIN)) and similar proteins. CD147 is a transmembrane glycoprotein that belongs to the immunoglobulin superfamily. It is expressed in nearly all cells including platelets and fibroblasts and is involved in inflammatory diseases, and cancer progression. CD147 is highly expressed in several cancers and used as a prognostic marker. The two primary isoforms of CD147 that are related to cancer progression have been identified: CD147 Ig1-Ig2 (also called Basigin-2) that is ubiquitously expressed in most tissues and CD147 Ig0-Ig1-Ig2 (also called Basigin-1) that is retinal specific and implicated in retinoblastoma. Studies showed that CD147 Ig0 domain is a potent stimulator of interleukin-6 and suggest that the CD147 Ig0 dimer is the functional unit required for activity. 116
37782 409535 cd20942 IgI_MAdCAM-1 Immunoglobulin-like domain of Mucosal addressin cell-adhesion molecule (MAdCAM-1); member of the I-set of IgSF domains. The members here include the immunoglobulin-like domain of Mucosal addressin cell-adhesion molecule (MAdCAM-1). MAdCAM-1 is an endothelial cell adhesion molecule that interacts preferentially with the leukocyte beta7 integrin LPAM-1 (alpha4beta7), L-selectin, and VLA-4 (alpha4beta1) on myeloid cells to direct leukocytes into mucosal and inflamed tissues. MAdCAM-1 is expressed primarily on HEV of Peyer's patches and on venules in small intestinal lamina propria, on the marginal sinus of the spleen, and on HEV of embryonic lymph nodes. It is a member of the immunoglobulin superfamily (IgSF), which is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of MAdCAM-1 is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous, but lacks a C" strand. 88
37783 409536 cd20943 IgI_VCAM-1 First immunoglobulin-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1), and similar domains; member of the I-set of IgSF domains. The members here include the first immunoglobulin-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1; also known as Cluster of Differentiation 106 (CD106)) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. The interaction of VCAM-1 binding to the beta1 integrin very late antigen (VLA-4) expressed by lymphocytes and monocytes mediates the adhesion of leucocytes to blood vessel walls, and regulates migration across the endothelium. During metastasis, some circulating cancer cells extravasate to a secondary site by a similar process. VCAM-1 may be involved in organ targeted tumor metastasis and may also act as host receptors for viruses and parasites. VCAM-1 contains seven Ig domains. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of VCAM-1 is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous, but lacks a C" strand. 89
37784 409537 cd20944 IgI_N_ICAM1-2-3 N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (CD102) and ICAM-3 (CD50); members of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain found in the N-terminus of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (CD102), and ICAM-3 (CD50). ICAM-1, ICAM-2, and ICAM-3 mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues. 81
37785 409538 cd20946 IgV_1_JAM1-like First Ig-like domain of Junctional adhesion molecule-1 (JAM1) and similar domains; a member of the V-set of IgSF domains. The members here are composed of the first Ig-like domain of Junctional Adhesion Molecule-1 (JAM1) and similar domains. JAM1 is an immunoglobulin superfamily (IgSF) protein with two Ig-like domains in its extracellular region; it plays a role in the formation of endothelial and epithelial tight junction and acts as a receptor for mammalian reovirus sigma-1. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of JAM1 is a member of the V-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C'-C" in the other. 102
37786 409539 cd20947 IgV_PDl1 Immunoglobulin Variable (IgV) domain of Programmed death ligand 1 (PD-L1). The members here are composed of the immunoglobulin variable (IgV) domain of Programmed death ligand 1 (PD-L1; also known as Cluster of Differentiation 274 (CD274)). PD-L1 is a cell-surface ligand that competes with PD-L2 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. Like PD-L2, PD-L1 interacts with PD-1 and suppresses T cell proliferation and cytokine production. The PD-1 receptor is expressed on the surface of activated T cells, while PD-L1 is expressed on cancer cells. When PD-1 and PD-L1 bind together, they form a molecular shield protecting tumor cells from being destroyed by the immune system. Thus, inhibiting the binding of PD-L1 to PD-1 with an antibody leads to killing of tumor cells by T cells. PD-1 inhibitors (such as Pembrolizumab, Nivolumab, and Cemiplimab) and PD-L1 inhibitors (such as Atezolizumab, Avelumab, and Durvalumab) are an emerging class of immunotherapy that stimulate lymphocytes against tumor cells. 110
37787 409540 cd20948 IgC2_CEACAM5-like Fifth immunoglobulin (Ig)-like domain of the carcinoembryonic antigen (CEA) related cell adhesion molecule 5 (CEACAM5) and similar domains; member of the C2-set IgSF domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of the carcinoembryonic antigen (CEA) related cell adhesion molecule 5 (CEACAM5) and similar domains. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. Carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5), also known as CD66e (Cluster of Differentiation 66e), is a cell surface glycoprotein that plays a role in cell adhesion, intracellular signaling and tumor progression. Diseases associated with CEACAM5 include lung cancer and rectum cancer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand. 76
37788 409541 cd20949 IgI_Twitchin_like C-terminal immunoglobulin-like domain of the myosin-associated giant protein kinase Twitchin, and similar domains; member of the I-set IgSF domains. The members here are composed of the C-terminal immunoglobulin-like domain of the myosin-associated giant protein kinase Twitchin and similar proteins, including Caenorhabditis elegans and Aplysia californica Twitchin, Drosophila melanogaster Projectin, and similar proteins. These are very large muscle proteins containing multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains and a single kinase domain near the C-terminus. In humans these proteins are called Titin. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig-like domain of the Twitchin is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins (titin, telokin, and twitchin), the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D. 89
37789 409542 cd20950 IgI_2_JAM1 Second Ig-like domain of Junctional adhesion molecule-1 (JAM1); a member of the I-set of IgSF domains. The members here are composed of the second Ig-like domain of Junctional adhesion molecule-1 (JAM1). JAM1 is an immunoglobulin superfamily (IgSF) protein with two Ig-like domains in its extracellular region; it plays a role in the formation of endothelial and epithelial tight junction and acts as a receptor for mammalian reovirus sigma-1. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The second Ig-like domain of JAM1 is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous but lacks a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors. 97
37790 409543 cd20951 IgI_titin_I1-like Immunoglobulin domain I1 of the titin I-band and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin domain I1 of the titin I-band and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig I1 domain of the titin I-band is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 94
37791 409544 cd20952 IgI_5_Robo Fifth Ig-like domain of Roundabout (Robo) homolog 1/2, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fifth Ig-like domain of Roundabout (Robo) homolog 1/2 and similar domains. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, -2, and -3), and three mammalian Slit homologs (Slit-1, -2, -3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, -2, and -3 are expressed by commissural neurons in the vertebrate spinal cord and Slits 1, -2, -3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which, instead of being a positive regulator of slit responsiveness, antagonizes slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. The fifth Ig-like domain of Robo 1 and 2 is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors. 87
37792 409545 cd20953 IgI_2_Dscam Second immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. DSCAM is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 95
37793 409546 cd20954 IgI_7_Dscam Seventh immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the seventh immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 96
37794 409547 cd20955 IgI_1_Dscam First immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 99
37795 409548 cd20956 IgI_4_Dscam Fourth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 96
37796 409549 cd20957 IgC2_3_Dscam Third immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the Constant 2 (C2)-set of IgSF domains. The members here are composed of the third immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C, and C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand. 88
37797 409550 cd20958 IgI_5_Dscam Fifth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fifth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 89
37798 409551 cd20959 IgI_6_Dscam Sixth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the sixth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 94
37799 409552 cd20960 IgV_CAR_like Immunoglobulin Variable (V) domain of the Coxsackievirus and Adenovirus Receptor (CAR), and similar proteins. The members here are composed of the Variable (V) domain of the Coxsackievirus and Adenovirus Receptor (CAR), and similar proteins. CAR, which is encoded by human CXADR gene, is a cell adhesion molecule of the Immunoglobulin (Ig) superfamily. The CAR acts as a type I membrane receptor for group B1-B6 coxsackie viruses and subgroup C adenoviruses. For instance, adenovirus interacts with the coxsackievirus and adenovirus receptor to enter epithelial airway cells. The CAR is also shown to be involved in physiological processes such as neuronal and heart development, epithelial tight junction integrity, and tumor suppression. The CAR is a component of the epithelial apical junction complex that may function as a homophilic cell adhesion molecule and is essential for tight junction integrity. The CAR is also involved in transepithelial migration of leukocytes through adhesive interactions with JAML, a transmembrane protein of the plasma membrane of leukocytes. The interaction between both receptors also mediates the activation of gamma-delta T-cells, a subpopulation of T-cells residing in epithelia and involved in tissue homeostasis and repair. The CAR is composed of one V-set and one C2-set Ig module, a single transmembrane helix, and an intracellular domain. This group belongs to the V-set of IgSF domains, having A, B, E and D strands in one beta-sheet and A', G, F, C, C' and C" in the other. 114
37800 409553 cd20961 Ig1_Tyro3_like First immunoglobulin (Ig)-like domain of Tyro3 receptor tyrosine kinase (RTK), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Tyro3 receptor tyrosine kinase (RTK). Tyro3 together with Axl and Mer form the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin type III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3 and Mer are widely expressed in adult tissues, though they show higher expression in the brain, in the lymphatic and vascular systems, and in the testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation. 87
37801 409554 cd20962 IgI_C1_MyBP-C_like Immunoglobulin Domain C1 of human cardiac Myosin Binding Protein C and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin domain C1 of human cardiac Myosin Binding Protein C (MyBP-C). MyBP-C is a thick filament protein involved in the regulation of muscle contraction. Mutations in cardiac MyBP-C gene are the second most frequent cause of hypertrophic cardiomyopathy. MyBP-C binds to myosin with two binding sites, one at its C-terminus and another at its N-terminus. The N-terminal binding site, consisting of immunoglobulin (Ig) domains C1 and C2 connected by a flexible linker, interacts with the S2 segment of myosin in a phosphorylation-regulated manner. The C1 and C2 Ig domains can bind to and activate or inhibit the thin filament. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The C1 domain of the MyBP-C is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors. 101
37802 409555 cd20963 IgV_VCBP Immunoglobulin Variable region-containing chitin-binding proteins; an immunoglobulin V-set domain. The members here are composed of the immunoglobulin variable (IgV) region-containing chitin-binding proteins (VCBPs). VCBPs are secreted, immune-type molecules that have been identified in both amphioxus and sea squirt (Ciona intestinalis). VCBPs, which consist of a leader peptide, two tandem N-terminal immunoglobulin V-type domains and a single C-terminal chitin-binding domain, belong to a multigene family encoding secreted proteins. The VCBPs were identified first in the cephalochordate Branchiostoma floridae and show structural similarities with V-type domains of immunoglobulins and T cell receptors, suggesting that VCBPs represent a unique gut-associated form of innate immune proteins. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the V-set of IgSF domains, having A, B, E and D strands in one beta-sheet and A', G, F, C, C' and C" in the other. 123
37803 409556 cd20964 IgI_Tie2 Immunoglobulin domain of Tie2 tyrosine kinase; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain of Tie2 tyrosine kinase. The Tie receptor tyrosine kinases and their angiopoietin (Ang) ligands play central roles in developmental and tumor-induced angiogenesis. Tie2 contains three immunoglobulin (Ig) domains, which fold together with the three epidermal growth factor domains into a compact, arrowhead-shaped structure. Ang2-Tie2 recognition is similar to antibody-protein antigen recognition, including the location of the ligand-binding site within the Ig fold. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of Tie2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors. 92
37804 409557 cd20965 IgI_2_hemolin-like Second immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structures of this group show that the second Ig domain lacks this strand and thus belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). 101
37805 409558 cd20966 IgI_1_Axl_like First immunoglobulin (Ig)-like domain of Axl receptor tyrosine kinase (RTK), and similar domains; member of the I-set Ig domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Axl receptor tyrosine kinase (RTK). Axl together with Tyro3 and Mer form the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin type III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3 and Mer are widely expressed in adult tissues, though they show higher expression in the brain, in the lymphatic and vascular systems, and in the testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation. Ig superfamily domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig-like domain of the Axl is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. 101
37806 409559 cd20967 IgI_C2_MyBP-C-like Domain C2 of human cardiac Myosin Binding Protein C and similar domains; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) Domain C2 of human cardiac Myosin Binding Protein C (MyBP-C) and similar domains. MyBP-C is a thick filament protein involved in the regulation of muscle contraction. Mutations in cardiac MyBP-C gene are the second most frequent cause of hypertrophic cardiomyopathy. MyBP-C binds to myosin with two binding sites, one at its C-terminus and another at its N-terminus. The N-terminal binding site, consisting of immunoglobulin (Ig) domains C1 and C2 connected by a flexible linker, interacts with the S2 segment of myosin in a phosphorylation-regulated manner. The C1 and C2 Ig domains can bind to and activate or inhibit the thin filament. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structures of the Ig domains of MyBP-C lack this strand and thus belong to the I-set of Ig superfamily domains. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors. 82
37807 409560 cd20968 IgI_2_MuSK agrin-responsive second immunoglobulin-like domains (Ig2) of the Muscle-specific kinase (MuSK) ectodomain; a member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin-like (Ig) domains of the Muscle-specific kinase (MuSK) ectodomain. MuSK is a receptor tyrosine kinase specifically expressed in skeletal muscle, where it plays a central role in the formation and maintenance of the neuromuscular junction (NMJ). MuSK is activated by agrin, a neuron-derived heparan sulfate proteoglycan. The activation of MUSK in myotubes regulates the formation of NMJs through the regulation of different processes including the specific expression of genes in subsynaptic nuclei, the reorganization of the actin cytoskeleton and the clustering of the acetylcholine receptors (AChR) in the postsynaptic membrane. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the MuSK lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 88
37808 409561 cd20969 IgI_Lingo-1 Immunoglobulin I-set domain of the Leucine-rich repeat and immunoglobin-like domain-containing protein 1 (Lingo-1). The members here are composed of the immunoglobulin I-set (IgI) domain of the Leucine-rich repeat and immunoglobin-like domain-containing protein 1 (Lingo-1). Human Lingo-1 is a central nervous system-specific transmembrane glycoprotein also known as LERN-1, which functions as a negative regulator of neuronal survival, axonal regeneration, and oligodendrocyte differentiation and myelination. Lingo-1 is a key component of the Nogo receptor signaling complex (RTN4R/NGFR) in RhoA activation responsible for some inhibition of axonal regeneration by myelin-associated factors. The ligand-binding ectodomain of human Lingo-1 contains a bimodular, kinked structure composed of leucine-rich repeat (LRR) and immunoglobulin (Ig)-like modules. Diseases associated with Lingo-1 include mental retardation, autosomal recessive 64 and essential tremor. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the Lingo-1 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 92
37809 409562 cd20970 IgI_1_MuSK agrin-responsive first immunoglobulin-like domains (Ig1) of the MuSK ectodomain; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin-like domains (Ig1) of the Muscle-specific kinase (MuSK). MuSK is a receptor tyrosine kinase specifically expressed in skeletal muscle, where it plays a central role in the formation and maintenance of the neuromuscular junction (NMJ). MuSK is activated by agrin, a neuron-derived heparan sulfate proteoglycan. The activation of MUSK in myotubes regulates the formation of NMJs through the regulation of different processes including the specific expression of genes in subsynaptic nuclei, the reorganization of the actin cytoskeleton and the clustering of the acetylcholine receptors (AChR) in the postsynaptic membrane. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the MuSK lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 92
37810 409563 cd20971 IgI_1_Titin-A168_like First immunoglobulin-like domains A168 within the A-band segment of human cardiac titin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin-like domain A168 within the A-band segment of human cardiac titin. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structures of the titin-A168169 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 93
37811 409564 cd20972 IgI_2_Titin_Z1z2-like Second Ig-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin Z1z2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 91
37812 409565 cd20973 IgI_telokin-like immunoglobulin-like domain of telokin and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain in telokin, the C-terminal domain of myosin light chain kinase which is identical to telokin, and similar proteins. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the telokin Ig domain lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 88
37813 409566 cd20974 IgI_1_Titin_Z1z2-like First Ig-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin Z1z2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 93
37814 409567 cd20975 IgI_APEG-1_like Immunoglobulin-like domain of human Aortic Preferentially Expressed Protein-1 (APEG-1) and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin I-set (IgI) domain of the Human Aortic Preferentially Expressed Protein-1 (APEG-1) and similar proteins. APEG-1 is a novel specific smooth muscle differentiation marker predicted to play a role in the growth and differentiation of arterial smooth muscle cells (SMCs). The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the human APEG-1 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 91
37815 409568 cd20976 IgI_4_MYLK-like Fourth Ig-like domain from smooth muscle myosin light chain kinase and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain from smooth muscle myosin light chain kinase (MYLK) and similar domains. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of this group shows that the fourth Ig-like domain from myosin light chain kinase lacks this strand and thus belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 90
37816 409569 cd20977 IgI_3_hemolin-like Third immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The third Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). 93
37817 409570 cd20978 IgI_4_hemolin-like Fourth immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The fourth Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis. 88
37818 409571 cd20979 IgI_1_hemolin-like First immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). 91
37819 409572 cd20980 IgV_VISTA Immunoglobulin variable (IgV) domain of V-domain immunoglobulin suppressor of T cell activation (VISTA). The members here are composed of the immunoglobulin variable (IgV) domain of V-domain immunoglobulin suppressor of T cell activation (VISTA; also known as B7-H5, PD-1H, Gi24, Dies1, SISP1 and DD1alpha). VISTA is an immune checkpoint protein involved in the regulation of T cell activity and inhibits the T cell response against cancer. VISTA is a type I transmembrane protein with a single IgV domain with sequence homology to the IgV domains of the members of B7 family. VISTA is the only B7 family member that lacks an IgC domain. VISTA is primarily expressed in white blood cells and its transcription is partially controlled by p53. Similar to PD-1/PD-L1 and CTLA-4, a blockade of VISTA promotes tumor clearance by the immune system. Unlike the B7 family members, VISTA contains 10 beta-strands, instead of the nine that typically comprise an IgV fold. Moreover, human VISTA contains a 21-residue extended loop between strands C and C', which does not align with any B7 family structure. 147
37820 409573 cd20981 IgV_B7-H6 Immunoglobulin variable (IgV) domain of B7-H6. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H6 (also known as NCR3LG1). B7-H6 contains one IgV domain and one IgC domain (IgV-IgC) and belongs to the B7-family, which consists of structurally related cell-surface protein ligands which bind to receptors on lymphocytes that regulate immune responses. B7-H6 is a ligand of NKp30, which is a member of CD28 family and an activating receptor of natural killer (NK) cells. NKp30 is expressed in most NK cells and is involved in tumor cell killing and in interactions with antigen-presenting cells (APCs) such as dendritic cells. Studies showed that NK cells eliminate B7-H6-expressing tumor cells either directly via cytotoxicity or indirectly by cytokine secretion. For instance, chimeric NKp30-expressing T cells responded to B7-H6(+) tumor cells and those T cells produced IFN-gamma and killed B7-H6-expressing tumor cells in vivo. B7-H6 mRNA is not found in normal cells, while high expression of B7-H6 is found in certain types of tumor cells, such as lymphoma, leukemia, ovarian cancer, brain tumors, breast cancers, and various sarcomas. Since B7-H6 can bind NKp30 to exert anti-tumor effects by NK cells, which are able to recognize the difference between cancer cells and normal cells, B7-H6 may serve as a promising target for cancer immunotherapy. 114
37821 409574 cd20982 IgV_TIM-3_like Immunoglobulin Variable (IgV) domain of T cell Immunoglobulin Domain and Mucin Domain 3 (Tim-3), and similar domains. The members here are composed of the immunoglobulin variable (IgV) domain of T cell immunoglobulin domain and mucin domain 3 (Tim-3; also known as Hepatitis A virus cellular receptor 2 (HAVcr-2) and Cluster of Differentiation 366 (CD366)) and similar proteins. TIM-3 is a checkpoint inhibitor in immune responses to tumors, and is also involved in chronic viral infections. Thus, Tim-3 has emerged as one of the most promising immune checkpoint targets for cancer immunotherapy. Tim-3 is highly expressed on Th1 lymphocytes and CD11b(+) macrophages and is upregulated on activated T and myeloid cells. TIM-3 regulates macrophage activation and inhibits Th1-mediated immune responses to promote immunological tolerance. There are three TIM family members in humans (TIM-1, TIM-3, and TIM-4) and eight members in mice (TIM-1 to TIM-8). The IgV domain of human TIM-3 has been shown to bind ligands such as carcinoembryonic antigen cell adhesion molecule 1 (CEACAM1), high mobility group protein B1 (HMGB1) and galectin-9 (GAL9). The binding of GAL9 to TIM-3 can negatively regulate Th1 immune response, enhance immune tolerance and inhibit anti-tumor immunity. Dysregulation of the TIM-3/GAL9 pathway is implicated in numerous chronic autoimmune diseases, such as multiple sclerosis and systemic lupus erythematosus. 107
37822 409575 cd20983 IgV_PD-L2 Immunoglobulin Variable (IgV) domain of Programmed death ligand 2 (PD-L2). The members here are composed of the immunoglobulin variable (IgV) domain of Programmed death ligand 2 (PD-L2; also known as B7-DC or CD273). Receptor-binding domain of PD-L2 is a cell-surface ligand that competes with PD-L1 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the CD28/B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. PD-L2 has a higher affinity for PD-1 but is expressed at lower levels. PD-L2 interaction with PD-1 suppresses T cell proliferation, cytokine production and cytotoxic activity. PD-L2 is expressed on tumor cells, antigen-presenting cells or APCs (such as macrophages, B cells and dendritic cells), and a variety of other immune and nonimmune cells. Tumor expression of PD-L2 may contribute to tumor evasion of immune destruction by inactivating T cells. Thus, PD-L2 is a negative predictor for prognosis among solid cancer patients. 100
37823 409576 cd20984 IgV_B7-H4 Immunoglobulin Variable (IgV) domain of B7-H4. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H4 (also known as B7-S1, B7x, or Vtcn1). B7-H4 is one of the B7 family of immune-regulatory ligands that act as negative regulators of T cell function; it contains one IgV domain and one IgC domain. The B7-family consists of structurally related cell-surface protein ligands, which bind to receptors on lymphocytes that regulate immune responses. The binding of B7-H4 to unidentified receptors results in the inhibition of TCR-mediated T cell proliferation, cell-cycle progression and IL-2 production. As a co-inhibitory molecule, B7-H4 is widely expressed in tumor tissues and its expression is significantly associated with poor prognosis in human cancers such as glioma, pancreatic cancer, oral squamous cell carcinoma, renal cell carcinoma, and lung cancer. 110
37824 409577 cd20985 IgV_CD200R-like Immunoglobulin Variable domain of cell surface glycoprotein CD200 receptor and similar proteins. The members here are composed of the immunoglobulin variable (IgV) domain of cell surface glycoprotein CD200 receptor and similar proteins. CD200 (also known as OX2) is a widely distributed membrane glycoprotein that regulates myeloid cell activity through its interaction with an inhibitory receptor (CD200R). CD200-CD200R interactions are involved in the control of myeloid cellular function. In the mouse, several CD200R-related genes have been identified, including CD200RL (for receptor like), CD200R1, and CD200R2. While CD200 gives good binding to CD200R, it does not bind CD200RLa, CD200RLb, CD200RLc, or CD200RLe. For instance, CD200RLa has a 50-fold lower binding affinity to CD200, although CD200RLa shares a high amino acid sequence identity with CD200R in the V-like domain. Furthermore, the CD200-CD200R regulatory interactions provide an attractive target for immunomodulation, because its manipulation can provoke either immune tolerance or autoimmune diseases. 107
37825 409578 cd20986 IgC1_PD-L2 Immunoglobulin Constant 1 (IgC1) domain of Programmed death ligand 2 (PD-L2). The members here are composed of the immunoglobulin Constant 1 (IgC1) domain of Programmed death ligand 2 (PD-L2; also known as B7-DC or CD273). PD-L2 is a cell-surface ligand that competes with PD-L1 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the CD28/B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. PD-L2 has a higher affinity for PD-1 but is expressed at lower levels. PD-L2 interaction with PD-1 suppresses T cell proliferation, cytokine production and cytotoxic activity. PD-L2 is expressed on tumor cells, antigen-presenting cells or APCs (such as macrophages, B cells and dendritic cells), and a variety of other immune and nonimmune cells. Tumor expression of PD-L2 may contribute to tumor evasion of immune destruction by inactivating T cells. Thus, PD-L2 is a negative predictor for prognosis among solid cancer patients. 82
37826 409579 cd20987 IgC2_CD33_d2_like Second immunoglobulin domain of Cluster of Differentiation (CD) 33 and related Siglecs; member of the C2-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 33 (also known as sialic-acid binding immunoglobulin type-lectin 3 (Siglec-3)) and related Siglecs. CD33, a Siglec family member, is a well-known immunotherapeutic target in acute myeloid leukemia (AML). It is an inhibitory sialoadhesin expressed in human leukocytes of the myeloid lineage and some lymphoid subsets, including natural killer (NK) cells. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. CD33 (Siglec-3) is the smallest Siglec member. It preferentially binds to alpha2-6- and alpha2-3-sialylated glycans and strongly binds to sialylated ligands on leukemic cell lines. Ig Superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group includes CD33-related Siglecs which belong to the C2-set of IgSF domains. Unlike the C1-set, the C2-set structures do not have a D strand. 94
37827 409580 cd20988 IgV_TCR_gammadelta Gammadelta T-cell antigen receptor, variable (V) domain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains. Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing, and MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 114
37828 409581 cd20989 IgV_1_Nectin-2_NecL-5_like_CD112_CD155 First immunoglobulin variable (IgV) domain of nectin-2, nectin-like protein 5, and similar domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-2 (also known as poliovirus receptor related protein 2 or Cluster of Differentiation 112 (CD112)), nectin-like protein 5 (CD155), and similar proteins. Nectins and Nectin-like molecules are a family of Ca(2+)-independent immunoglobulin-like transmembrane glycoproteins belonging to the class of adhesion receptors, consisting of nine members (nectins 1 through 4 and nectin-like proteins 1 through 5). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. CD155 is the fifth member in the nectin-like molecule family, and functions as the receptor of poliovirus; therefore, CD155 is also referred to as Necl-5, or PVR. In contrast to all other family members, CD155 lacks self-adhesion capacity, yet it shares with nectins the feature to interact with other nectins. For instance, CD155 heterophilically trans-interacts with nectin-3, thereby contributing significantly to the establishment of adherens junctions between epithelial cells. This group belongs to the Constant 1 (C1)-set of IgSF domains, which has one beta-sheet that is formed by strands A-B-E-D and the other strands by G-F-C-C'. 112
37829 409582 cd20990 IgI_2_Palladin_C Second C-terminal immunoglobulin (Ig)-like domain of palladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of palladin. Palladin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to this family contain multiple Ig-like domains and function as scaffolds, modulating actin cytoskeleton. Palladin binds to alpha-actinin, ezrin, vasodilator-stimulated phosphoprotein (VASP), SPIN90 (also known as DIP or mDia interacting protein), and Src. Palladin also binds F-actin directly, via its Ig3 domain. Palladin is expressed as several alternatively spliced isoforms, having various combinations of Ig-like domains, in a cell-type-specific manner. It has been suggested that palladin's different Ig-like domains may be specialized for distinct functions. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand. 91
37830 409583 cd20991 Ig1_IL1R_like First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. IL-1 receptor antagonist (IL-1RA), a naturally occurring cytokine, is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. 91
37831 409584 cd20992 Ig1_IL1R_like First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. 108
37832 409585 cd20993 Ig2_IL-1RAP_like Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three IG-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains ILIR-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2. 93
37833 409586 cd20994 Ig2_IL1R_like Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three IG-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains ILIR-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2. 94
37834 409587 cd20995 IgI_N_ICAM-2 N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-2 (Cluster of Differentiation 102 or CD102); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-2 (Cluster of Differentiation 102 or CD102). The intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 and ICAM-3 (Cluster of Differentiation 50 or CD50) mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues. 83
37835 409588 cd20996 IgI_N_ICAM-1 N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54). The intercellular adhesion molecules ICAM-1, ICAM-2 (Cluster of Differentiation 102 or CD102) and ICAM-3 (Cluster of Differentiation 50 or CD50) mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues. 82
37836 409589 cd20997 IgI_N_ICAM-3 N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-3 (Cluster of Differentiation 50 or CD50); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-3 (Cluster of Differentiation 50 or CD50). The intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (Cluster of Differentiation 102 or CD102) and ICAM-3 mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues. 85
37837 409590 cd20998 IgC1_MHC_II_beta_I-E Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) I-E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) I-E. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). I-A and I-E molecules have the same basic features insofar as peptide loading and presentation, although each interacts with distinctly different sets of peptides. They also differ in that there is a relatively high incidence of deletion of the I-E gene in both inbred strains of mice as well as wild mice and the lack of the reverse situation i.e. the deletion of I-A genes. A detailed structural understanding of the similarities and differences between I-A and the paralogous I-E could help illuminate the respective roles these molecules play in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of its beta chain which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. 
These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 99
37838 409591 cd21000 IgC1_MHC_II_beta_HLA-DR Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DR; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DR. HLA-DR is an MHC class II cell surface receptor encoded by the human leukocyte antigen complex on chromosome 6 region 6p21.31. HLA-DR is also involved in several autoimmune conditions, disease susceptibility, and disease resistance including seronegative-rheumatoid arthritis, penicillamine-induced myasthenia, schizophrenia, Goodpasture syndrome, systemic lupus erythematosus, Alzheimer's disease, tuberculoid leprosy, and Hashimoto's thyroiditis. HLA-DR molecules are upregulated in response to signaling. HLA-DR is an alphabeta heterodimer cell surface receptor, each subunit of which contains two extracellular domains, a membrane-spanning domain, and a cytoplasmic tail. Both alpha and beta chains are anchored in the membrane. The DR beta chain is encoded by 4 loci; however, no more than 3 functional loci are present in a single individual, and no more than two on a single chromosome. Sometimes an individual may only possess 2 copies of the same locus, DRB1*. The HLA-DRB1 locus is ubiquitous and encodes a very large number of functionally variable gene products (HLA-DR1 to HLA-DR17). The HLA-DRB3 locus encodes the HLA-DR52 specificity, is moderately variable and is variably associated with certain HLA-DRB1 types. The HLA-DRB4 locus encodes the HLA-DR53 specificity, has some variation, and is associated with certain HLA-DRB1 types. The HLA-DRB5 locus encodes the HLA-DR51 specificity, which is typically invariable, and is linked to the HLA-DR2 types. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). 
MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 96
37839 409592 cd21001 IgC1_MHC_II_beta_HLA-DQ_I-A Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DQ and I-A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of human histocompatibility antigen (HLA) DQ and mouse I-A. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). I-A and I-E have the same basic features insofar as peptide loading and presentation; they differ in that each interacts with distinctly different sets of peptides, and in the incidence of deletion of their genes. A structural understanding of the similarities and differences between I-A and I-E may help with understanding their roles in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of its beta chain which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. Human HLA-DR, -DQ, and -DP are about 70% similar to each other. HLA-DQ (DQ) is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DQA1 and HLA-DQB1, that are adjacent to each other on chromosome band 6p21.3. A person often produces two alpha-chain and two beta chain variants and thus 4 isoforms of DQ. HLA-DQ is involved in the autoimmune diseases celiac disease and diabetes mellitus type 1. DQ is one of several antigens involved in rejection of organ transplants. DQ2 is encoded by the HLA-DQB1*02 allele group. DQ6 is encoded by the HLA-DQB1*06 allele group. 
DQ2 beta-chains combine with alpha-chains, encoded by genetically linked HLA-DQA1 alleles, to form the cis-haplotype isoforms. These isoforms, nicknamed DQ2.2 and DQ2.5, are also encoded by the DQA1*0201 and DQA1*0501 genes, respectively. DQ6 beta-chains combine with alpha-chains, encoded by genetically linked HLA-DQA1 alleles, to form the cis-haplotype isoforms. For DQ6, however, cis-isoform pairing only occurs with DQ1 alpha-chains. There are many haplotypes of DQ6. Susceptibility to Leptospirosis infection was found associated with undifferentiated DQ6. DQ8 is determined by the antibody recognition of beta8 and this generally detects the gene product of DQB1*0302. DQ8 is commonly linked to autoimmune disease in the human population. DQ8 is the second most predominant isoform linked to celiac disease and the DQ most linked to Type 1 diabetes. DQ8 increases the risk for rheumatoid arthritis and is linked to the primary risk locus for RA, HLA-DR4. DR4 also plays an important role in Type 1 diabetes. DQ8 is a split antigen of the DQ3 broad antigen. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. They are expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice, and induced in nonprofessional APCs, such as keratinocytes; they are expressed on the surface of activated human T cells and on T cells from other species. MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes; these peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC, and bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. 
Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 97
37840 409593 cd21002 IgC1_MHC_II_beta_HLA-DM Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DM; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DM. Human HLA-DM plays a critical role in antigen presentation to CD4 T cells by catalyzing the exchange of peptides bound to MHC class II molecules. Type 1 diabetes is correlated with DM activation and it is also implicated in viral infections such as herpes simplex virus, celiac disease, multiple sclerosis, other autoimmune diseases, and leukemia. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 97
37841 409594 cd21003 IgC1_MHC_II_beta_HLA-DP Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DP; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DP. HLA class II histocompatibility antigen, DP(W2) beta chain is a protein that in humans is encoded by the HLA-DPB1 gene. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other. HLA-DP is an alphabeta heterodimer cell-surface receptor. Each DP subunit (alpha-subunit, beta-subunit) is composed of an alpha-helical N-terminal domain, an IgG-like beta sheet, a membrane spanning domain, and a cytoplasmic domain. The alpha-helical domain forms the sides of the peptide binding groove. The beta sheet regions form the base of the binding groove and the bulk of the molecule as well as the inter-subunit (non-covalent) binding region. Individuals carrying the MHCII allele, HLA-DP2, are at risk for chronic beryllium disease (CBD), a debilitating inflammatory lung condition caused by the reaction of CD4 T cells to inhaled beryllium. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. 
These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 96
37842 409595 cd21004 IgC1_MHC_II_alpha_HLA_DO HLA class II histocompatibility antigen DO alpha; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the nonclassical MHC class II (MHCII) protein, HLA-DO, which binds HLA-DM and influences the repertoire of peptides presented by MHCII proteins. In complex with HLA-DM, HLA-DO adopts a classical MHCII structure, with alterations near the alpha subunit's 3(10) helix. HLA-DO binds to HLA-DM at the same sites implicated in MHCII interaction, and kinetic analysis showed that HLA-DO acts as a competitive inhibitor by acting as a substrate mimic. Though more remains to be elucidated about the function of HLA-DO, its unique distribution in the mammalian body, namely the exclusive expression of HLA-DO in B cells, thymic medullary epithelial cells, and dendritic cells, indicates that it may be of physiological importance and has inspired further research. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37843 409596 cd21005 IgC1_MHC_II_alpha_I-EK Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) I-E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) I-E. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 95
37844 409597 cd21006 IgC1_MHC_II_alpha_I-A Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) I-A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) I-A. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). I-A and I-E molecules have the same basic features insofar as peptide loading and presentation, although each interacts with distinctly different sets of peptides. They also differ in that there is a relatively high incidence of deletion of the I-E alpha gene in both inbred strains of mice as well as wild mice and the lack of the reverse situation i.e. the deletion of I-A genes. A detailed structural understanding of the similarities and differences between I-A and the paralogous I-E could help illuminate the respective roles these molecules play in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of its beta chain which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. 
The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 95
37845 409598 cd21007 IgC1_MHC_II_alpha_HLA-DR Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DR; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DR. MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other. HLA-DR is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DRA1 and HLA-DRB1, that are adjacent to each other on chromosome band 6p21.31. Susceptibility to multiple sclerosis and to rheumatoid arthritis is associated with the human histocompatibility leukocyte antigens HLA-DR2 and HLA-DR4, respectively. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. 
The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 95
37846 409599 cd21008 IgC1_MHC_II_alpha_HLA-DQ Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DQ and related proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DQ. MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other. HLA-DQ (DQ) is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DQA1 and HLA-DQB1, that are adjacent to each other on chromosome band 6p21.3. A person often produces two alpha-chain and two beta chain variants and thus 4 isoforms of DQ. Two autoimmune diseases in which HLA-DQ is involved are celiac disease and diabetes mellitus type 1. DQ is one of several antigens involved in rejection of organ transplants. DQ8 is a split antigen of the DQ3 broad antigen. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. 
MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 95
37847 409600 cd21009 IgC1_MHC_II_alpha_HLA-DM Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DM; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DM. Human HLA-DM, also known as H2-M in mice, plays a critical role in antigen presentation to CD4 T cells by catalyzing the exchange of peptides bound to MHC class II molecules. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 94
37848 409601 cd21010 IgC1_MHC-like_ZAG Immunoglobulin domain of Zn-alpha2-glycoprotein (ZAG); member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin domain of Zn-alpha2-glycoprotein (ZAG). ZAG is a soluble protein that is present in serum and other body fluids. ZAG stimulates lipid degradation in adipocytes and causes the extensive fat losses associated with some advanced cancers. The 2.8 angstrom crystal structure of ZAG resembles a class I major histocompatibility complex (MHC) heavy chain, but ZAG does not bind the class I light chain beta-2-microglobulin. The ZAG structure includes a large groove analogous to class I MHC peptide binding grooves. Instead of a peptide, the ZAG groove contains a nonpeptidic compound that may be implicated in lipid catabolism under normal or pathological conditions. IgC_MHC_I_alpha3; Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 93
37849 409602 cd21011 IgC1_MHC-like_FcRn immunoglobulin domain of neonatal Fc receptor, major histocompatibility complex (MHC)-like; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin domain of neonatal Fc receptor (FcRn). FcRn performs two distinct functions: the transport of maternal immunoglobulin G (IgG) to pre- or neonatal mammals which provides passive immunity and protection of IgG from normal serum protein catabolism. FcRn is related to class I MHC proteins, but lacks a functional peptide binding groove. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 93
37850 409603 cd21012 IgC1_MHC_H-2_TLA H-2 class I histocompatibility complex TLA (thymus leukemia antigen); member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the major histocompatibility complex (MHC) H-2 class I histocompatibility complex TLA (thymus leukemia antigen). The murine MHC class I histocompatibility TLA (Thymus leukemia antigen), which is encoded in the T region by T3 and T18 genes, is expressed mainly by intestinal epithelial cells and thymocytes. The murine TLAs are class I, beta-2-microglobulin-associated glycoproteins. The TLA function is not defined by antigen presentation, but rather by its relatively high affinity binding to CD8-alpha-alpha compared with CD8-alpha-beta. The existence of a human homolog for murine TLA remains unresolved. This group is a member of the C1-set Ig domains, which have one beta sheet that is formed by strands A, B, E, and D and the other strands by G, F, C, and C'. 95
37851 409604 cd21013 IgC1_MHC_Ib_Qa-1 Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1 and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1 and similar proteins. Qa-1 presents hydrophobic peptides including Qdm derived from the leader sequence of classical MHC I molecules for immune surveillance by NK cells. Qa-1 bound peptides derived from the TCR Vbeta8.2 of activated T cells also activates CD8+ regulatory T cells to control autoimmunity and maintain self-tolerance. Four allotypes of Qa-1 (Qa-1a-d) are expressed that are highly conserved in sequence but have several variations that could affect peptide binding to Qa-1 or TCR recognition. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 97
37852 409605 cd21014 IgC1_MHC_Ib_Qa-2 Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-2; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of QA-2. Qa-2 is a nonclassical MHC Ib antigen, which has been implicated in both innate and adaptive immune responses, as well as embryonic development. Qa-2 has an unusual peptide binding specificity in that it requires two dominant C-terminal anchor residues and is capable of associating with a substantially more diverse array of peptide sequences than other nonclassical MHC. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 94
37853 409606 cd21015 IgC1_MHC_Ia_RT1-Aa Class Ia major histocompatibility complex (MHC) immunoglobulin domain of RT1-Aa; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of RT1-Aa. While most mammalian species transport these peptides into the ER via a single allele of TAP, rats have evolved different TAPs, TAP-A and TAP-B, RT1-Aa and RT1-A1c, which are associated with TAP-A and TAP-B. The rat MHC class Ia molecule RT1-Aa has the unusual capacity to bind long peptides ending in arginine, such as MTF-E, a thirteen-residue, maternally transmitted minor histocompatibility antigen. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37854 409607 cd21016 IgC1_MHC_Ib_T10_T22_like Class Ib major histocompatibility complex (MHC) immunoglobulin domain of T10, T22, and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of the murine H-2T-encoded T10, T22, and similar proteins. T10 and T22 are highly related nonclassical major histocompatibility complex (MHC) class Ib proteins that bind to certain gammadelta T cell receptors (TCRs) in the absence of other components. Classical MHC class I (class Ia) molecules participate in immune responses by presenting peptide antigens to cytolytic alpha beta T cells. Many nonclassical MHC class I (class Ib) molecules have distinct antigen-binding capabilities, suggesting that they have evolved for specific tasks that are distinct from those of MHC class Ia. Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions. 97
37855 409608 cd21017 IgC1_MHC_Ia_MIC-A_MIC-B Class Ia major histocompatibility complex (MHC) immunoglobulin domain of MIC-A and MIC-B; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of MIC-A and MIC-B. MIC-A and MIC-B are homologs that serve as stress-inducible antigens on epithelial and epithelially derived cells. Both serve as ligands for the widely expressed activating immunoreceptor NKG2D, a C-type lectin-like activating immunoreceptor. MIC-B is very similar in structure to MIC-A and likely interacts with NKG2D in an analogous manner. The interdomain flexibility observed in the MIC-A structures, a feature unique to MIC proteins among MHC class I proteins and homologs, is also displayed by MIC-B, with an interdomain relationship intermediate between the two examples of MIC-A structures. Mapping sequence variations onto the structures of MIC-A and MIC-B reveals patterns completely distinct from those displayed by classical MHC class I proteins, with a number of substitutions falling on positions likely to affect interactions with NKG2D, but with other positions lying distant from the NKG2D binding sites or buried within the core of the proteins. Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding and the IgC domain is involved in oligomerization and molecular interactions. 95
37856 409609 cd21018 IgC1_MHC_Ia_H2Db_H2Ld Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) H2Db and H2Ld; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) H2Db and H2Ld. H-2Ld complexed with peptide QL9 (or p2Ca) and complexed with influenza virus peptide NP366-374 (ASNENMETM), respectively, are high-affinity alloantigens for the 2C T cell receptor (TCR). The a1-a2 super domains of H-2Ld, H-2Db, and H-2Kb closely superimpose. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37857 409610 cd21019 IgC1_MHC_Ia_H-2Kb Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H-2Kb; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H-2Kb. H-2Kb is an alloantigen for the 2C T cell receptor (TCR). H-2Kb forms a complex with beta-2-microglobulin, and a peptide, including VSV-8 (RGYVYNGL), SEV-9 (FAPGNYPAL), and OVA-8 (SIINFEKL). Comparison of the OVA-8, VSV-8, and SEV-9 complexes with H-2Kb indicates that four side chains (Lys-66, Glu-152, Arg-155, and Trp-167) adopt peptide-specific conformations. H-2Kb paralogs include H-2Db, H-2Kbml and H-2KbI1s. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 94
37858 409611 cd21020 IgC1_MHC_Ia_H-2Dd Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H2-Dd; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H2-Dd. Mouse MHC is composed of 11 subclasses. It includes the classical MHC class I (MHC-Ia) that comprises H-2D, H-2K and H-2L subclasses, the non-classical MHC class I (MHCIb) that comprises H-2Q, H-2M and H-2T subclasses, the classical MHC class II (MHC-IIa) that includes H-2A(I-A) and H-2E(I-E) subclasses, and the non-classical MHC class II (MHC-IIb) comprises H-2M and H-2O. H-2K, H-2D, and H-2L are 80 to 90% homologous at the amino acid level yet appear to be involved in different recognition reactions and are differentially expressed on lymphoid cells. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37859 409612 cd21021 IgC1_MHC_Ib_HLA-H Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen H; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen H (HLA-H). HLA-H (also known as hereditary hemochromatosis protein; HFE) is a major histocompatibility complex (MHC) class I-like protein that is mutated in Hereditary Hemochromatosis. HFE is a protein of 343 amino acids that includes a signal peptide, an extracellular transferrin receptor-binding region (a1 and a2), an immunoglobulin-like domain (a3), a transmembrane region, and a short cytoplasmic tail. HFE binds beta-2-microglobulin to form a heterodimer expressed at the cell surface. It binds transferrin receptor (TFRC) in its extracellular alpha1-alpha2 domain. HFE plays an important part in the regulation of hepcidin expression in response to iron overload and the liver is important in the pathophysiology of HFE-associated hemochromatosis. Nine HFE splicing variants have been reported with transcripts lacking exon 2 or exon 3, or exons 2-3, 2-4, or 2-5. Diverse mutations involving HFE introns and exons discovered in persons with hemochromatosis or their family members cause or probably cause high iron phenotypes. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 94
37860 409613 cd21022 IgC1_MHC_Ia_HLA-G Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) G; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) G. HLA-G histocompatibility antigen (also known as human leukocyte antigen G; HLA-G) is a protein that in humans is encoded by the HLA-G gene. HLA-G belongs to the HLA nonclassical class I heavy chain paralogs. This class I molecule is a heterodimer consisting of a heavy chain and light chain, beta-2-microglobulin. The heavy chain is anchored in the membrane. HLA-G may play a role in immune tolerance in pregnancy, being expressed in the placenta by extravillous trophoblast cells (EVT), while the classical MHC class I genes (HLA-A and HLA-B) are not. Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I and class II. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure. 94
37861 409614 cd21023 IgC1_MHC_Ia_HLA-F Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) F; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen alpha chain F (HLA-F). HLA-F, encoded by the HLA-F gene in humans, belongs to the non-classical HLA class I heavy chain paralogs. This class I molecule mainly exists as a heterodimer associated with the invariant light chain beta-2-microglobulin. HLA-F molecules can interact with both activating and inhibitory receptors on immune cells, such as NK cells, and can present a diverse panel of peptides. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 98
37862 409615 cd21024 IgC1_MHC_Ib_HLA-E Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) E. HLA-E is the first human class Ib major histocompatibility complex molecule to be crystallized. Like other MHC class I molecules, HLA-E is a heterodimer consisting of an alpha heavy chain and the light chain beta-2-microglobulin. HLA-E is highly conserved and almost nonpolymorphic, and has recently been shown to be the first specialized ligand for natural killer cell receptors. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37863 409616 cd21025 IgC1_MHC_Ib_HLA-Cw3-4 Class Ib major histocompatibility complex (MHC) immunoglobulin domain of HLA-Cw3 and HLA-Cw4; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of HLA-Cw3 and HLA-Cw4. HLA-C belongs to the MHC class I heavy chain receptors. The C receptor is a heterodimer consisting of a HLA-C mature gene product and beta-2-microglobulin. The mature C chain is anchored in the membrane. MHC Class I molecules, like HLA-C, are expressed in nearly all cells, and present small peptides to the immune system which surveys for non-self peptides. HLA-C is a locus on chromosome 6, which encodes for a large number of HLA-C alleles that are Class-I MHC receptors. Class Ib histocompatibility leukocyte antigens (HLA)-Cw3 and (HLA)-Cw4 are ligands for the natural killer (NK) cell inhibitory receptors KIR2DL2 and KIR2DL1, respectively. HLA-Cw3 and related alleles (HLA-Cw1, -Cw7, and -Cw8) contain Ser77 and Asn80 and interact with KIR that are reactive with the GL183 antibody. HLA-Cw4 and related alleles (HLA-Cw2, -Cw5, and -Cw6) have Asn77 and Lys80 and are recognized by KIR reactive with the EB6 or HP-3E4 antibody. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions. 96
37864 409617 cd21026 IgC1_MHC_Ia_HLA-B Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) B and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) B and similar proteins. The classical class I molecules (HLA-A, -B, and -C) are responsible for the presentation of endogenous antigen to CD8+ T cells. The receptor is a heterodimer, and is composed of a heavy alpha chain and smaller beta chain. The alpha chain is encoded by a variant HLA-B gene, and the beta chain (beta-2-microglobulin) is an invariant beta-2-microglobulin molecule. The beta-2-microglobulin protein is coded for by a separate region of the human genome. Human leukocyte antigen (HLA) B*3501 (B35) is a common human allele involved in mediating protective immunity against HIV. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 97
37865 409618 cd21027 IgC1_MHC_Ia_HLA-A Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) A. The classical class I molecules (HLA-A, -B, and -C) are responsible for the presentation of endogenous antigen to CD8+ T cells. The receptor is a heterodimer, and is composed of a heavy alpha chain and smaller beta chain. The alpha chain is encoded by a variant HLA-A gene, and the beta chain (beta-2-microglobulin) is an invariant beta-2-microglobulin molecule. The beta-2-microglobulin protein is coded for by a separate region of the human genome. HLA-A2 is associated with spontaneous abortions, HIV, and Hodgkin lymphoma. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 95
37866 409619 cd21028 IgC1_MHC_I_M144 Class I major histocompatibility complex (MHC) homolog m144; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) homolog m144 class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. 101
37867 409620 cd21029 IgC1_CD1 Immunoglobulin domain of Cluster of Differentiation (CD) 1; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin domain of Cluster of Differentiation (CD) 1. Members of the CD1 family of transmembrane glycoproteins are structurally related to the major histocompatibility complex (MHC) proteins and form heterodimers with beta-2-microglobulin. They mediate the presentation of primarily lipid and glycolipid antigens of self or microbial origin to T cells. The human genome contains five CD1 family genes (CD1a, CD1b, CD1c, CD1d, and CD1e) organized in a cluster on chromosome 1. The CD1 family members are thought to differ in their cellular localization and specificity for particular lipid ligands. CD1a localizes to the plasma membrane and to recycling vesicles of the early endocytic system. Alternative splicing results in multiple transcript variants. Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells. C1-set Ig domains have one beta sheet that is formed by strands A, B, E, and D and the other strands by G, F, C, and C'. 93
37868 411025 cd21030 V35-RBD_P-protein-C_like C-terminal RNA-binding domain (RBD) of Ebola virus VP35 phosphoprotein and related proteins. This family includes the C-terminal RNA-binding domain (RBD) of the P protein of viruses belonging to the Filoviridae family, such as Ebola virus or Marburg virus. VP35-RBD contains two subdomains: an alpha-helical subdomain and a beta-sheet subdomain. Virus infection typically activates host innate immunity, including the interferon (IFN) signaling pathway; VP35-RBD binds double-stranded RNA (dsRNA), inhibiting IFN-alpha/beta signaling. The family Filoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 125
37869 411026 cd21031 MEV_P-protein-C_like C-terminal domain of Measles virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Paramyxoviridae family, such as measles virus and mumps virus. The family Paramyxoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. Paramyxoviruses have a polycistronic phosphoprotein (P) gene which encodes for proteins in addition to P protein; for example the measles virus P gene encodes for P protein and virulence factor V (MV-V). This domain family includes the unshared C-terminal domain of P protein not present in MV-V. 46
37870 411027 cd21032 RABV_P-protein-C_like C-terminal domain of Rabies virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Rhabdoviridae family, such as Rabies virus (RABV). RABV P protein is known to counteract the functions of various cellular factors involved in antiviral responses, including STAT1, and interferon-induced promyelocytic leukaemia (PML) protein; the C-terminal domain of the RABV P protein includes STAT1 and PML binding sites. The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 105
37871 411028 cd21033 VSV_P-protein-C_like C-terminal domain of Vesicular stomatitis Indiana virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Rhabdoviridae family, such as Vesicular stomatitis Indiana virus (VSV). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 71
37872 411029 cd21036 WH_MUS81 winged helix domain found in crossover junction endonuclease MUS81 and similar proteins. MUS81 is a crossover junction endonuclease that interacts with EME1 (essential meiotic structure-specific endonuclease 1) and EME2, to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The MUS81-EME1 endonuclease maintains genomic integrity in metazoans by cleaving branched DNA structures that can form during mitosis and fission yeast meiosis, and during processing of damaged replication forks. This model corresponds to the winged helix (WH) domain of MUS81, which is responsible for DNA binding. It comprises four helices and two beta strands. 94
37873 411030 cd21037 MLKL_NTD N-terminal domain of mixed lineage kinase domain-like protein (MLKL) and similar proteins. MLKL is a pseudokinase that does not have protein kinase activity and plays a key role in tumor necrosis factor (TNF)-induced necroptosis, a programmed cell death process. The model corresponds to the MLKL N-terminal region that reveals a four-helix bundle with an additional helix at the top which is likely key for MLKL function. The N-terminal domain binds directly to phospholipids and induces membrane permeabilization. 138
37874 410951 cd21039 NURR NURR (N-terminal unit for RNA recognition) domain. The NURR domain is a self-folding globular RNA-binding domain with an all alpha-helix architecture and a highly conserved negatively charged surface area. It also contains a large hydrophobic cavity and a positively charged surface area as potential epitopes for inter-molecular interactions. The NURR domain has been found in Drosophila melanogaster Syncrip and vertebrate heterogeneous nuclear ribonucleoproteins hnRNPR and hnRNPQ. 77
37875 411031 cd21044 Rab11BD_RAB3IP_like Rab11 binding domain of Rab-3A-interacting protein (RAB3IP), Rab-3A-interacting-like protein 1 (RAB3IL1) and similar proteins. The family includes RAB3IP and RAB3IL1, as well as Rab guanine nucleotide exchange factor SEC2 from yeast. RAB3IP, also called Rabin-3, or SSX2-interacting protein, or Rabin8, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. It mediates the release of GDP from RAB8A and RAB8B but not from RAB3A or RAB5. It modulates actin organization and promotes polarized transport of RAB8A-specific vesicles to the cell surface. RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. SEC2 is a guanine nucleotide exchange factor for SEC4, catalyzing the dissociation of GDP from SEC4 and also potently promoting binding of GTP. Activation of SEC4 by SEC2 is needed for the directed transport of vesicles to sites of exocytosis. SEC2 binds the Rab GTPase YPT32 but does not have exchange activity on YPT32. The model corresponds to the Rab11a/Rab11b-binding region of family members which lies within the carboxy-terminus, a region distinct from their GEF domain and Rab3a-binding region. 178
37876 410965 cd21050 ELD_TRPML extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipins (TRPMLs). TRPML family proteins contain a linker between the first two transmembrane helices (S1 and S2), which is called TRPML I-II linker. It forms a tight tetramer that is crucial for full-length TRPMLs assembly and localization. In lysosomes and endosomes, this linker faces the lumen (it is therefore also referred to as the 'luminal linker'); on the plasma membrane, it faces the extracellular solution. TRPML I-II linker has been named as extracytosolic/lumenal domain (ELD). 167
37877 411034 cd21055 WH_NTD_SMARCB1_like N-terminal winged helix DNA-binding domain found in SMARCB1, PHF10 and similar proteins. The family includes SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 (SMARCB1) and PHD finger protein 10 (PHF10), both of which have an N-terminal winged helix DNA-binding domain that is structurally related to the SKI/SNO/DAC domain found in a number of metazoan chromatin-associated proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is involved in transcription activity regulation by chromatin remodeling. It is a component of the neural progenitors-specific chromatin remodeling complex (npBAF complex) and plays a role in the proliferation of neural progenitors. 80
37878 411038 cd21058 toxin_MLD_like membrane localization domain (MLD) of Vibrio MARTX, Pasteurella PMT, clostridial glycosylating cytotoxins, toxin effectors BteA (Bordetella T3SS effector A) and related proteins. This family includes membrane localization domains (MLDs) for toxin effectors such as the Rho-inactivation domain of Vibrio MARTX, Pasteurella mitogenic toxin (PMT), where it has been termed PMT C1 domain, and clostridial glycosylating cytotoxins including Clostridium difficile toxins A (TcdA) and B (TcdB), Clostridium novyi alpha-toxin (TcnA), and Clostridium sordellii lethal toxin (TcsL). It also includes the MLD located in the N-terminal minimal membrane-binding fragment of BteA, a type III secretion system (T3SS) effector protein from Bordetella pertussis, the causative agent of whooping cough. 78
37879 411040 cd21059 LciA-like lactococcin A immunity protein (LciA) and similar proteins. This family includes pore-forming bacteriocin class IId lactococcin A immunity protein (LciA) and similar proteins. The subclass IId is a linear, one-peptide bacteriocin that shares no sequence similarity to the other class II pediocin-like bacteriocins (class IIa), two-peptide bacteriocins (class IIb) or cyclic bacteriocins (class IIc). However, they all induce membrane leakage and cell death by specifically binding the mannose phosphotransferase system (man-PTS) on their target cells. LciA shares the same 4-helical bundle structure as the pediocin-like immunity proteins but has a shorter C-terminal helix and a different surface potential. Also, it has a flexible C-terminal tail that is important for the functionality of the immunity protein. 69
37880 410634 cd21061 7tm_viral_rhodopsin viral rhodopsins and similar proteins, members of the seven-transmembrane GPCR superfamily. This subfamily is composed of viral homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. Viral proteorhodopsins are monophyletic and split into two distinct groups, I and II, represented by Phaeocystis globosa virus 12T VirRDTS and Organic Lake phycodnavirus OLPVRII, respectively. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). 
Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 210
37881 410952 cd21064 NURR_hnRNPQ-like NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoproteins hnRNPQ, hnRNPR and similar proteins. The family includes hnRNPQ and hnRNPR. hnRNPQ, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NSAP1), or Synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP), is a highly conserved RNA-binding protein that mediates the exosomal partition of a set of miRNAs. It acts as a component of the hepatocyte exosomal miRNA sorting machinery. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression. 78
37882 410953 cd21065 NURR_Syncrip-like NURR (N-terminal unit for RNA recognition) domain found in Drosophila melanogaster Syncrip and similar proteins. Syncrip is a conserved RNA-binding protein important in neuronal and muscular development in Drosophila. It is essential for the morphology and growth of the neuromuscular junction and regulates cytoplasmic vesicle-based messenger RNA (mRNA) transport. The model corresponds to the NURR domain of Syncrip, which is an RNA-binding domain with a highly conserved RNA-binding surface. 83
37883 410954 cd21066 NURR_hnRNPQ NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoprotein Q (hnRNPQ) and similar proteins. hnRNPQ, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NSAP1), or Synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP), is a highly conserved RNA-binding protein that mediates the exosomal partition of a set of miRNAs. It acts as a component of the hepatocyte exosomal miRNA sorting machinery. The model corresponds to NURR domain of hnRNPQ, which has structural similarity to bacterial protein Barstar and binds to Apobec1. 85
37884 410955 cd21067 NURR_hnRNPR NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoprotein R (hnRNPR) and similar proteins. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression. The model corresponds to NURR domain of hnRNPR. 84
37885 411032 cd21068 Rab11BD_RAB3IP Rab11 binding domain of Rab-3A-interacting protein (RAB3IP). RAB3IP, also called Rabin-3, or SSX2-interacting protein, or Rabin8, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. It mediates the release of GDP from RAB8A and RAB8B but not from RAB3A or RAB5. It modulates actin organization and promotes polarized transport of RAB8A-specific vesicles to the cell surface. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IP, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region. 193
37886 411033 cd21069 Rab11BD_RAB3IL1 Rab11 binding domain of Rab-3A-interacting-like protein 1 (RAB3IL1). RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IL1, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region. 163
37887 410966 cd21070 ELD_TRPML1 extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 1 (TRPML1). TRPML1, also called mucolipin-1 (ML1), or MG-2, or Mucolipidin, may play a major role in Ca(2+) release from late endosome and lysosome vesicles to the cytoplasm, which is important for many lysosome-dependent cellular events, including the fusion and trafficking of these organelles, exocytosis and autophagy. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML1. It forms a tight tetramer that is crucial for full-length TRPML1 assembly and localization. 171
37888 410967 cd21071 ELD_TRPML2 extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 2 (TRPML2). TRPML2, also called mucolipin-2 (ML2), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It may activate ARF6 and be involved in the trafficking of GPI-anchored cargo proteins to the cell surface via the ARF6-regulated recycling pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML2. It forms a tight tetramer that is crucial for full-length TRPML2 assembly and localization. 167
37889 410968 cd21072 ELD_TRPML3 extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 3 (TRPML3). TRPML3, also called mucolipin-3 (ML3), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It mediates release of Ca(2+) from endosomes to the cytoplasm, contributes to endosomal acidification and is involved in the regulation of membrane trafficking and fusion in the endosomal pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML3. It forms a tight tetramer that is crucial for full-length TRPML3 assembly and localization. 169
37890 411039 cd21073 toxin_BteA-MLD_like membrane localization domain (MLD) of BteA (Bordetella T3SS effector A) cytotoxin, the N-terminal domain of Photox toxin and related proteins. This family includes the MLD located in the N-terminal minimal membrane-binding segment of BteA (residues 1-131, BteA131), which has also been referred to as the lipid raft targeting (LRT) domain/motif. BteA is a type III secretion system (T3SS) effector protein from Bordetella pertussis, a bacterial respiratory pathogen and the causative agent of whooping cough. The BteA131 segment is multifunctional: in addition to targeting phosphatidylinositol (PI)-rich microdomains in the host membrane, it binds its cognate chaperone BtcA. The MLD adopts a four-helix bundle structure, with a positively charged surface that targets phosphatidylinositol 4,5-bisphosphate (PIP2) in the host membrane via critical arginine and lysine residues. A flexible region preceding the BteA helical bundle contains the characteristic beta-motif required for binding BtcA. This domain has significant sequence similarity to the N-terminal domain of effectors and the endo-domain of RTX-type toxins from Photorhabdus luminescens. This family includes the N-terminal domain of Photorhabdus laumondii Photox toxin; little is known about the N-terminus of Photox, but its C-terminus is an actin-targeting ADP-ribosyltransferase. 87
37891 410781 cd21074 DHD_Ski_Sno_Dac Dachshund-homology domain found in the Ski/Sno/Dac family of transcriptional regulators. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. Members of this family include the Ski protein, Ski-like protein (Sno), and Dachshund proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. Ski-like protein, also known as SKIL or Sno, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. Dachshund proteins are essential components of a regulatory network controlling cell fate determination. They have been implicated in eye, limb, brain, and muscle development. 88
37892 410961 cd21075 DBD_XPA-like DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14 and similar proteins. The family includes DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14, zinc transporter 9 (ZNT9) and similar proteins. XPA, also known as xeroderma pigmentosum group A-complementing protein (XPAC), is involved in DNA excision repair. It initiates repair by binding to damaged sites with various affinities, depending on the photoproduct and the transcriptional state of the region. Rad14 is involved in nucleotide excision repair. It binds specifically to damaged DNA and is required for the incision step. Rad14 is a component of the nucleotide excision repair factor 1 (NEF1) complex consisting of Rad1, Rad10 and Rad14. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator. The model corresponds to the DNA-binding domain found in XPA and Rad14. It consists of a conserved N-terminal zinc-binding subdomain and a C-terminal alpha/beta fold subdomain. ZNT9 contains only the C-terminal alpha/beta fold subdomain and lacks the N-terminal zinc-binding subdomain. 67
37893 410962 cd21076 DBD_XPA DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA) and similar proteins. XPA, also known as xeroderma pigmentosum group A-complementing protein (XPAC), is involved in DNA excision repair. It initiates repair by binding to damaged sites with various affinities, depending on the photoproduct and the transcriptional state of the region. 107
37894 410963 cd21077 DBD_Rad14 DNA-binding domain found in yeast DNA repair protein Rad14 and similar proteins. Rad14 is involved in nucleotide excision repair. It binds specifically to damaged DNA and is required for the incision step. Rad14 is a component of the nucleotide excision repair factor 1 (NEF1) complex consisting of Rad1, Rad10 and Rad14. 105
37895 410964 cd21078 NTD_ZNT9 N-terminal domain found in zinc transporter 9 (ZNT9) and similar proteins. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator. 89
37896 410782 cd21079 DHD_Ski_Sno Dachshund-homology domain found in Ski, Ski-like protein (Sno), and similar proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. Ski-like protein, also known as SKIL or Sno, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 91
37897 410783 cd21080 DHD_Skor Dachshund-homology domain found in SKI family transcriptional corepressors, Skor1, Skor2 and similar proteins. Skor1, also known as functional Smad-suppressing element on chromosome 15 (Fussel-15), LBX1 corepressor 1, or ladybird homeobox corepressor 1, acts as a transcriptional corepressor of LBX1 and inhibits BMP signaling. Skor2, also known as functional Smad-suppressing element on chromosome 18 (Fussel-18), LBX1 corepressor 1-like protein, or ladybird homeobox corepressor 1-like protein, exhibits transcriptional repressor activity. It acts as a transforming growth factor-beta (TGF-beta) antagonist in the nervous system. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 91
37898 410784 cd21081 DHD_Dac Dachshund-homology domain found in the retinal determination protein Dachshund and similar proteins. Dachshund proteins act as transcription factors involved in the regulation of organogenesis. They may be a regulator of SIX1, SIX6 and probably SIX5. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. It has been postulated that Dachshund proteins may bind to chromatin DNA via their DHD domains. 95
37899 410785 cd21082 DHD_SKIDA1 Dachshund-homology domain found in SKI/DACH domain-containing protein 1 (SKIDA1) and similar proteins. SKIDA1 is also known as protein DLN-1. Its biological function remains unclear. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 91
37900 410786 cd21083 DHD_Ski Dachshund-homology domain found in Ski and similar proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 102
37901 410787 cd21084 DHD_Sno Dachshund-homology domain found in Ski-like protein (Sno) and similar proteins. Ski-like protein, also known as SKIL, Ski-related oncogene (Sno), or Ski-related protein, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 100
37902 411035 cd21085 WH_NTD_PHF10 N-terminal winged helix DNA-binding domain found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is involved in transcription activity regulation by chromatin remodeling. It is a component of the neural progenitors-specific chromatin remodeling complex (npBAF complex) and plays a role in the proliferation of neural progenitors. The model corresponds to the N-terminal winged helix DNA-binding domain of PHF10, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins. 89
37903 411036 cd21086 WH_NTD_SMARCB1 N-terminal winged helix DNA-binding domain found in SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 (SMARCB1) and similar proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. The model corresponds to the N-terminal winged helix DNA binding domain of SMARCB1, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins. 88
37904 410635 cd21087 7tm_viral_rhod_II_OLPVRII-like viral group II rhodopsins such as OLPVRII and similar proteins, members of the seven-transmembrane GPCR superfamily. The viral group II rhodopsins include Organic Lake Phycodnavirus rhodopsin II (OLPVRII), a pentameric light-gated channel that is functionally analogous to well-studied pentameric ligand-gated ion channels playing crucial roles in many cellular processes. It is most likely specific for chloride. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. 
Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 210
37905 410636 cd21088 7tm_viral_rhod_I_VirRDTS-like viral group I rhodopsins such as VirRDTS and similar proteins, members of the seven-transmembrane GPCR superfamily. The viral group I rhodopsins include Phaeocystis globosa virus 12T divergent type-1 DTS-motif rhodopsin (VirRDTS), a green light-absorbing proton pump that has a structure similar to that of bacteriorhodopsin (BR) and transfers light energy in a manner that substantially changes medium pH when expressed in a cell. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. 
Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 211
37906 411041 cd21089 Trm112-like eukaryotic tRNA methyltransferase 112, a partner protein of both rRNA/tRNA and protein methyltransferases, and similar proteins. This family contains eukaryotic tRNA methyltransferase 112 (Trm112)-like proteins such as human multifunctional methyltransferase subunit Trm112 protein, which acts as an activator of both rRNA/tRNA and protein methyltransferases. Trm112 acts as an obligate activating platform for at least four methyltransferases (MTase) involved in the modification of 18S rRNA (Bud23), tRNA (Trm9 and Trm11) and translation termination factor eRF1 (Mtq2) in eukaryotes. Hence, Trm112 is at a nexus between ribosome synthesis and function. Trm112 is a partner protein of N6amt1 (N6-adenine-specific DNA methyltransferase 1), which is suggested to be the N6-adenine DNA methyltransferase (MTase) in human cells. Trm112 binds to a hydrophobic surface of N6amt1, stabilizing its structure but not directly contributing to substrate binding and catalysis. In Yarrowia lipolytica, it forms a complex with Trm9 methyltransferase, which is involved in the 5-methoxycarbonylmethyluridine (mcm(5)U) modification of the tRNA anticodon wobble position and hence promotes translational fidelity. In Saccharomyces cerevisiae, Trm112 (also called Ynr046w or tRNA methyltransferase 112) is a zinc binding protein that is plurifunctional and a component of the eRF1 methyltransferase, putatively containing a zinc finger signature motif. 117
37907 411042 cd21090 C11orf65 chromosome 11 open reading frame 65 and homologs. Chromosome 11 open reading frame 65 (C11orf65) is an uncharacterized protein that may be associated with potential sensitivity to metformin in type 2 diabetes (diabetes mellitus) patients without cancer. 260
37908 411043 cd21091 Fuzzy protein fuzzy and homologs. Protein fuzzy (or FUZ) is a planar cell polarity (PCP) effector that controls multiple cellular processes during development. PCP signalling is an evolutionarily conserved pathway by which directional information regarding polarized cell movement is provided to cells. The PCP signalling axis involves PCP core and PCP effector genes, which are activated consecutively to govern orientated cell migration and the establishment of cytoskeletal structures. Dishevelled (Dvl in mammals or Dsh in Drosophila) and Fuz (or fuzzy in Drosophila) are two representative PCP core and effector genes, respectively. PCP regulates mammalian nervous system development; Fuz-null mutant mice display neural tube defects due to failure of directional cell motility and cell fusion. 401
37909 411044 cd21092 TPT_S35C2 solute carrier family 35 member C2, member of the triose-phosphate transporter family. Solute carrier family 35 member C2 (S35C2 or Slc35c2), also called ovarian cancer-overexpressed gene 1 protein (OVCOV1), is a member of the triose-phosphate transporter (TPT) family, which is part of the drug/metabolite transporter (DMT) superfamily. It may function either as a GDP-fucose transporter that competes with Slc35c1 (S35C1) for GDP-fucose, or a factor that otherwise enhances the fucosylation of Notch and is required for optimal Notch signaling in mammalian cells. 248
37910 410606 cd21093 KLF8_12_N N-terminal domain of Kruppel-like factor (KLF) 8, KLF12, and similar proteins. Kruppel-like transcription factors (also known as Krueppel-like transcription factors, KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. This model represents the related N-terminal activation/repression domains of KLF8 and KLF12. 172
37911 410447 cd21094 C1_aPKC_iota protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) iota type. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. Members of this family contain C1 domain found in aPKC isoform iota. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37912 410448 cd21095 C1_aPKC_zeta protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) zeta type. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. Members of this family contain C1 domain found in aPKC isoform zeta. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 55
37913 411045 cd21101 MAF1-ALBA4_C C-terminal domain of mitochondrial association factor 1 (MAF1), Alba4, and related proteins. Mitochondria play a role in the regulation of the innate immune response. Host mitochondria are recruited to the membranes that surround certain intracellular bacteria and parasites during infection, a phenomenon termed host mitochondrial association (HMA). In Toxoplasma gondii, HMA is driven by a gene family that encodes mitochondrial association factor 1 (MAF1) proteins. MAF1 is the parasite protein needed to recruit host mitochondria to the Toxoplasma-containing vacuole during infection. The T. gondii MAF1 locus harbors multiple distinct paralogs that differ in their ability to mediate HMA; these fall into two broad groups designated MAF1a and MAF1b based on residue percent identity. MAF1b paralogs, but not MAF1a paralogs, have been shown to be responsible for the HMA phenotype. This family also includes Plasmodium yoelii ALBA4, which has been shown to modulate its stage-specific interactions and the fates of specific mRNAs during the parasite's growth and transmission, acting to regulate the development of the parasite's transmission stages. 264
37914 411696 cd21102 Arl6IP1_RETR3-like ADP-ribosylation factor-like protein 6-interacting protein 1, Reticulophagy regulator 3, and similar proteins. This family contains ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1) and the N-terminal reticulon-homology domain (RHD) of Reticulophagy regulators 1-3. Arl6IP1 is an endoplasmic reticulum (ER) protein that has an important role in cell conduction and material transport. Arl6IP1, a tetraspan membrane protein, is an anti-apoptotic protein specific to multicellular organisms, and is a potential player in shaping the ER tubules in mammalian cells. In Drosophila, knockdown of the Arl6IP1 gene leads to progressive motor deficit. An Arl6IP1 variant has also been associated with hereditary spastic paraplegia (HSP), motor and sensory polyneuropathy, and acromutilation. Reticulophagy regulator 1 (RETREG1/FAM134B) is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. RETREG2/FAM134A and RETREG3/FAM134C have been shown to interact with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. Arl6IP1 shows some sequence similarity to the RHD of reticulophagy regulators, which may function in inducing membrane curvature. 178
37915 411046 cd21104 SNU13 U4/U6.U5 small nuclear ribonucleoprotein SNU13. U4/U6.U5 small nuclear ribonucleoprotein SNU13, also known as NHP2-like protein 1 or U4/U6.U5 tri-snRNP 15.5 kDa protein, is a component of the spliceosome B complex, involved in pre-mRNA splicing. It binds to the 5'-stem-loop of U4 snRNA. 122
37916 409189 cd21105 PGAP4-like Post-GPI attachment to proteins factor 4 and similar proteins. This family includes post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246). PGAP4 has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases, it contains three transmembrane domains. This family also includes uncharacterized fungal proteins with similarity to PGAP4. 364
37917 411047 cd21106 TM6SF1-like transmembrane 6 superfamily member 1, member 2, and similar proteins. This family includes transmembrane 6 superfamily members 1 (TM6SF1) and 2 (TM6SF2), and similar proteins. TM6SF1 is a widely expressed lysosomal transmembrane protein that may be suitable as a lysosomal marker. Polymorphism of its paralog, TM6SF2, has been associated with the risk for hepatocellular carcinoma, and a variant of the gene has been found to impact the processing of lipids in the liver and the small intestine, causing non-alcoholic fatty liver disease (NAFLD). 356
37918 411048 cd21107 RsiG anti-sigma factor RsiG (AmfC). RsiG is an anti-sigma factor that binds and sequesters the sporulation-specific sigma factor WhiG in a fashion dependent on 3',5'-cyclic diguanylic acid (c-di-GMP). RsiG can bind the cyclic dinucleotide in the absence of sigma factor WhiG, and does so in a specific manner via a unique signature conserved in all Streptomyces. This gene was originally named amfC (aerial mycelium formation protein) but no specific role was established, and was later renamed rsiG (regulator of sigma WhiG). 145
37919 410609 cd21109 SPASM Iron-sulfur cluster-binding SPASM domain. This iron-sulfur cluster-binding domain is named SPASM after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. SPASM occurs as an additional C-terminal domain in many peptide-modifying enzymes of the radical S-adenosylmethionine (SAM) superfamily. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. 65
37920 411049 cd21111 IFTase inulin fructotransferase. Inulin fructotransferase (IFTase; EC 4.2.2.17 and EC 4.2.2.18), a member of the glycoside hydrolase family 91, catalyzes depolymerization of beta-2,1-fructans inulin by successively removing the terminal difructosaccharide units as cyclic anhydrides via intramolecular fructosyl transfer. As a result, IFTase produces DFA-I (alpha-D-fructofuranose-beta-D-fructofuranose 2',1:2,1'-dianhydride) and DFA-III (alpha-D-fructofuranose-beta-D-fructofuranose 2',1:2,3'-dianhydride). 395
37921 411050 cd21112 alphaLP-like alpha-lytic protease (alpha-LP), a bacterial serine protease of the chymotrypsin family, and similar proteins. This family represents the catalytic domain of alpha-lytic protease (alpha-LP) and its closely-related homologs. Alpha-lytic protease (EC 3.4.21.12; also called alpha-lytic endopeptidase), originally isolated from the myxobacterium Lysobacter enzymogenes, belongs to the MEROPS peptidase family S1, subfamily S1E (streptogrisin A subfamily). It is synthesized as a pro-enzyme, thus having two domains; the N-terminal pro-domain acts as a foldase, required transiently for the correct folding of the protease domain, and also acts as a potent inhibitor of the mature enzyme, while the C-terminal domain catalyzes the cleavage of peptide bonds. Members of the alpha-lytic protease subfamily include Nocardiopsis alba protease (NAPase), a secreted chymotrypsin from the alkaliphile Cellulomonas bogoriensis, streptogrisins (SPG-A, SPG-B, SPG-C, and SPG-D), and Thermobifida fusca protease A (TFPA). These serine proteases have characteristic kinetic stability, exhibited by their extremely slow unfolding kinetics. The active site, characteristic of serine proteases, contains the catalytic triad consisting of serine acting as a nucleophile, aspartate as an electrophile, and histidine as a base, all required for activity. This model represents the C-terminal catalytic domain of alpha-lytic proteases. 188
37922 409232 cd21114 NAC NAC domain. This family contains the NAC domain, named after the nascent polypeptide-associated complex (NAC) whose subunits contain NAC domains. In eukaryotes, the NAC complex, which plays an important role in co-translational targeting of nascent polypeptides to endoplasmic reticulum (ER), consists of 2 subunits: NAC alpha and a shortened splice variant of the basal transcription factor 3 (BTF3; also called BTF3b or NAC beta). The full length BTF3a protein excites transcription. 43
37923 411051 cd21115 legumain_C C-terminal prodomain of legumain. This family contains the C-terminal propeptide of legumain, a lysosomal endopeptidase with a specificity for hydrolysis of asparaginyl bonds. Legumain (also called vacuolar processing enzyme or VPE in plants, and asparaginyl endopeptidase or AEP in animals) is synthesized as a precursor with both N- and C-terminal propeptides. Prolegumain is directed to the lysosome or plant vacuole, where activation occurs at least partially by autolysis. The N-terminal catalytic domain is a cysteine protease from the C13 family. The C-terminal prodomain can be organized into an activation peptide (AP), spanning a helical region, and a C-terminal death domain-like fold, denoted as legumain stabilization and activity modulation (LSAM) domain. The C-terminal prodomain binds over the active site and inhibits the catalytic domain. During activation, the C-terminal prodomain is autocatalytically cleaved. This process is induced by pH changes. Human legumain has been shown to process the tetanus toxin generating the fragments found in class II antigen presentation. Legumain from plant seeds is thought to be responsible for the post-translational processing of seed proteins prior to storage. Legumain is highly expressed in some cancers such as colorectal cancer (CRC) and uveal melanoma (UM); it is associated with poor outcome in CRC and upregulation of legumain is associated with malignant behavior of UM. Thus, legumain may be used as a negative prognostic factor as well as a therapeutic target. 119
37924 411052 cd21117 Twitch_MoaA Iron-sulfur cluster-binding Twitch domain of GTP 3',8-cyclase. The iron-sulfur cluster-binding Twitch domain is found at the C-terminus of GTP 3',8-cyclase (EC 4.1.99.22), which is also called molybdenum cofactor biosynthesis protein A (MoaA) in bacteria and archaea, molybdenum cofactor biosynthesis protein 1 (MOCS1) in most eukaryotes, and molybdenum cofactor biosynthesis enzyme CNX2 in plants. GTP 3',8-cyclase is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the first step in molybdopterin biosynthesis, the cyclization of guanosine triphosphate to (8S)-3',8-cyclo-7,8-dihydroguanosine 5'-triphosphate, which is then converted to molybdopterin in subsequent steps. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. GTP 3',8-cyclase contains an additional iron-sulfur cluster at the C-terminal Twitch domain that is involved in substrate binding. The Twitch domain may be related to another iron-sulfur cluster-binding domain found at the C-terminus of some radical SAM enzymes, the SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSMEs, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. 70
37925 411053 cd21118 dermokine dermokine. Dermokine, also known as epidermis-specific secreted protein SK30/SK89, is a skin-specific glycoprotein that may play a regulatory role in the crosstalk between barrier dysfunction and inflammation, and therefore play a role in inflammatory diseases such as psoriasis. Dermokine is one of the most highly expressed proteins in differentiating keratinocytes, found mainly in the spinous and granular layers of the epidermis, but also in the epithelia of the small intestine, macrophages of the lung, and endothelial cells of the lung. Mouse dermokine has been reported to be encoded by 22 exons, and its expression leads to alpha, beta, and gamma transcripts. 495
37926 410610 cd21119 SPASM_PqqE Iron-sulfur cluster-binding SPASM domain of coenzyme PQQ synthesis protein E. Coenzyme PQQ synthesis protein E (PqqE), also called pyrroloquinoline quinone (PQQ) biosynthesis protein E or PqqA peptide cyclase (EC 1.21.98.4), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of a C-C bond between C-4 of glutamate and C-3 of tyrosine residues of the PqqA protein, which is the first enzymatic step in the biosynthesis of the bacterial enzyme cofactor PQQ. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. PqqE contains two auxiliary Fe-S clusters in its SPASM domain: one nearest the RS site (AuxI) is in the form of a 2Fe-2S cluster ligated by four cysteines; and a more remote cluster (AuxII) in the form of a 4Fe-4S center that is ligated by three cysteine residues and one aspartate residue. 114
37927 410611 cd21120 SPASM_anSME Iron-sulfur cluster-binding SPASM domain of anaerobic sulfatase maturating enzyme. Anaerobic sulfatase maturating enzyme (anSME) is a radical S-adenosylmethionine (SAM) enzyme that catalyzes, under anaerobic conditions, the co- or post-translational modification of arylsulfatases to form a catalytically essential formylglycine (FGly) residue to perform their hydrolysis function, removing sulfate groups from a wide array of substrates. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster; anSME contains two auxiliary 4Fe-4S clusters in its SPASM domain. 107
37928 410612 cd21121 SPASM_Cmo-like Iron-sulfur cluster-binding SPASM domain of tungsten-containing aldehyde ferredoxin oxidoreductase cofactor-modifying protein and similar proteins. This group is composed of Pyrococcus furiosus tungsten-containing aldehyde ferredoxin oxidoreductase (AOR; EC 1.2.7.5) cofactor-modifying protein, encoded by the cmo gene, and similar proteins. AOR cofactor-modifying protein is involved in the biosynthesis of a molybdopterin-based tungsten cofactor. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN. 80
37929 410613 cd21122 SPASM_rSAM Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN. 71
37930 410614 cd21123 SPASM_MftC-like Iron-sulfur cluster-binding SPASM domain of mycofactocin radical SAM maturase MftC and similar proteins. This group is composed of Mycobacterium tuberculosis putative mycofactocin radical SAM maturase MftC and similar proteins. MftC is a radical S-adenosylmethionine (SAM) enzyme that may function to modify mycofactocin, a conserved polypeptide that might serve as an electron carrier. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster that is similar to the second auxiliary 4Fe-4S cluster (AuxII) of Clostridium perfringens anaerobic sulfatase-maturating enzyme (anSME). 91
37931 410615 cd21124 SPASM_CteB-like Iron-sulfur cluster-binding SPASM domain of sactionine bond-forming enzyme CteB and similar proteins. Clostridium thermocellum sactionine bond-forming enzyme CteB is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of the requisite thioether bridge between a cysteine and the alpha-carbon of an opposing amino acid that is required in sactipeptide biosynthesis. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. CteB contains two auxiliary 4Fe-4S clusters in its SPASM domain; the auxiliary cluster nearest the RS site, called AuxI, exhibits an open coordination site in the absence of peptide substrate, which is coordinated by a peptidyl-cysteine residue in the bound state. 96
37932 410616 cd21125 SPASM_AlbA-like Iron-sulfur cluster-binding SPASM domain of antilisterial bacteriocin subtilosin biosynthesis protein AlbA and similar proteins. Bacillus subtilis antilisterial bacteriocin subtilosin biosynthesis protein AlbA is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of three thioether bonds in the post-translational modification of a linear peptide into the cyclic peptide subtilosin A. The thioether bonds formed are between the sulfur of three cysteine residues and the alpha-carbons of two phenylalanines and one threonine to produce a rigid cyclic peptide. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. AlbA appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN. 97
37933 410617 cd21126 SPASM_rSAM Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN. 70
37934 410618 cd21127 SPASM_rSAM Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN. 83
37935 410619 cd21128 SPASM_rSAM Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group may contain one auxiliary Fe-S cluster with an open coordination site, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN, but missing one conserved cysteine in the binding site. 65
37936 410620 cd21129 SPASM_BtrN Iron-sulfur cluster-binding SPASM domain of butirosin biosynthesis protein N. Butirosin biosynthesis protein N (BtrN), also called S-adenosyl-L-methionine-dependent 2-deoxy-scyllo-inosamine dehydrogenase (EC 1.1.99.38), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the two-electron oxidation of 2-deoxy-scyllo-inosamine (DOIA) to amino-dideoxy-scyllo-inosose (amino-DOI) in the biosynthetic pathway of the aminoglycoside antibiotic butirosin. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. BtrN contains one auxiliary 4Fe-4S cluster. 87
37937 412055 cd21131 TbPSSA-2-like ectodomain of Trypanosoma, including T. brucei Procyclic-Specific Surface Antigen-2 (TbPSSA-2), T. congolense Insect Stage Antigen (TcISA), and similar proteins. This family includes the ectodomains of Trypanosoma brucei Procyclic-Specific Surface Antigen-2 (TbPSSA-2) and homolog T. congolense Insect Stage Antigen (TcISA). Trypanosomal parasites transmit disease through the arthropod vector Glossina spp (the tsetse fly). Studies have shown that TbPSSA-2 plays an important role in parasite survival in the tsetse; TbPSSA-2 knock-out reduced the efficiency of trypanosome migration from the tsetse midgut to the salivary glands. The TbPSSA-2 and TcISA ectodomains adopt a novel architecture, having two lobes connected by a loop, exhibiting conformational flexibility. The inter-lobe hinge region displaying rotational flexibility suggests a potential mechanism for coordinating a binding partner. 208
37938 410977 cd21132 EVE-like EVE and YTH domains belong to the PUA superfamily. The EVE domain was formerly known as DUF55 and is thought to be involved in RNA binding. The YTH (YT521-B homology) domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. Both domains are part of the larger PUA superfamily. 138
37939 410978 cd21133 EVE EVE domains are putative RNA-binding domains that belong to the PUA superfamily. The EVE domain, formerly known as DUF55, has been revealed via structural similarity to be part of the PUA superfamily. It is most similar in three-dimensional fold to the YTH (YT521-B homology) domain, and is thought to be involved in RNA-binding. 148
37940 410979 cd21134 YTH YTH (YT521-B homology) domains are RNA-binding domains that belong to the PUA superfamily. Individual members of the YTH family have been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. In general, eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or in other forms of silencing. The YTH domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. It belongs to the larger PUA superfamily. 133
37941 412056 cd21137 AA13_LPMO-like AA13 lytic polysaccharide monooxygenase, and similar proteins. This family contains starch-degrading (also called starch-active) lytic polysaccharide monooxygenase (LPMO), a representative of the new CAZy AA13 family and classified as an auxiliary activity enzyme. This enzyme acts on alpha-linked glycosidic bonds and displays a binding surface that is quite different from those of LPMOs acting on beta-linked glycosidic bonds, indicating that the AA13 family proteins interact with their substrate in a distinct fashion. The active site contains an amino-terminal histidine-ligated mononuclear copper. This enzyme generates aldonic acid-terminated malto-oligosaccharides from retrograded starch and significantly boosts the conversion of this recalcitrant substrate to maltose by beta-amylase. 233
37942 411054 cd21138 McdB-like Maintenance of carboxysome distribution (Mcd) protein B and similar proteins. This family contains maintenance of carboxysome distribution (Mcd) protein B (McdB), also called maintenance of carboxysome positioning B protein (McsB). It is found in cyanobacteria, where carboxysome maintenance is mediated by a DNA partition-like ParA-ParB system called the McdA-McdB. In order to actively position carboxysomes, McdB binds directly to McdA, a putative Walker-box ParA-like protein. McdB harbors a unique helical fold and it enables McdA dimer formation. 132
37943 410981 cd21140 Cas6_I-like Class 1 type I CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type I CRISPR-Cas systems and similar proteins. 243
37944 410982 cd21141 Cas6_III-like Class 1 type III CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type III CRISPR-Cas systems and similar proteins. 251
37945 411055 cd21142 Cas7fv type I-F variant CRISPR-associated backbone protein Cas7 (Cas7fv). Cas7fv is one of the CRISPR-associated (Cas) proteins of type I-F variant CRISPR-Cas system. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas modules are adaptive immune systems found in archaea and bacteria against foreign nucleic acids such as phages and plasmids via RNA-guided endonucleases. CRISPR-Cas systems are classified based on Cas protein content and arrangement in CRISPR-Cas loci into two main classes (1 and 2) and at least six types (I, II, III, IV, V, VI). Class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Type I CRISPR-Cas systems are most widespread in nature and the Cas protein composition of the employed Cascade interference complexes differs between seven subtypes (A-F, U). Type I-F variant (I-Fv) is a subtype that relies on a minimal set of five Cas proteins and has structural differences with I-F and I-E Cascades. Double strand DNA recruitment and recognition in the type I-Fv Cascade is facilitated from the major groove side by Cas5fv instead of the large subunit Cas8 and the finger domain of Cas7fv. 315
37946 411056 cd21143 Cas5fv type I-F variant CRISPR-associated protein Cas5 (Cas5fv). Cas5fv is one of the CRISPR-associated (Cas) proteins of type I-F variant CRISPR-Cas system. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas modules are adaptive immune systems found in archaea and bacteria against foreign nucleic acids such as phages and plasmids via RNA-guided endonucleases. CRISPR-Cas systems are classified based on Cas protein content and arrangement in CRISPR-Cas loci into two main classes (1 and 2) and at least six types (I, II, III, IV, V, VI). Class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Type I CRISPR-Cas systems are most widespread in nature and the Cas protein composition of the employed Cascade interference complexes differs between seven subtypes (A-F, U). Type I-F variant (I-Fv) is a subtype that relies on a minimal set of five Cas proteins and has structural differences with I-F and I-E Cascades. Double strand DNA recruitment and recognition in the type I-Fv Cascade is facilitated from the major groove side by Cas5fv instead of the large subunit Cas8 and the finger domain of Cas7fv. 335
37947 394908 cd21144 NendoU_XendoU-like Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, Xenopus laevis endoribonuclease XendoU, and related proteins. Nidovirus endoribonucleases (NendoUs) and eukaryotic Xenopus laevis-like endoribonucleases (XendoUs) are uridylate-specific endoribonucleases which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. XendoU is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. XendoU also requires Mn2+. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) forms a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV is a dimer. 112
37948 409283 cd21146 Nip7_N_euk N-terminal domain of eukaryotic 60S ribosome subunit biogenesis protein Nip7 and similar proteins. This N-terminal domain of various proteins co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. This model contains eukaryotic Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae. Recently, it was demonstrated that human Nip7 is essential in the accurate processing of pre-rRNA. Also included is KD93, a human homolog of Nip7. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. 87
37949 409284 cd21147 RsmF_methylt_CTD1 RsmF rRNA methyltransferase first C-terminal domain. This model represents the first of two distinct C-terminal domains of the 16S rRNA methyltransferase RsmF and related RsmB/RsmF family ribosomal methyltransferases. It is necessary for stabilizing the N-terminal catalytic core (a SAM-dependent methyltransferase) and is related to the N-terminal domain of Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae (human Nip7 is essential in the accurate processing of pre-rRNA). The second distinct C-terminal domain belongs to the PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain superfamily. 75
37950 409290 cd21148 PUA_Cbf5 PUA RNA-binding domain of the archaeal pseudouridine synthase component Cbf5. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the archaeal and eukaryotic subfamily of pseudouridine synthases, including Cbf5 (dyskerin in humans) and similar proteins, are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. In Pyrococcus furiosus H/ACA ribonucleoprotein (RNP) assembly with a single-hairpin H/ACA RNA, the lower stem and the ACA motif of the guide RNA are anchored at the PUA domain of Cbf5. In addition, the N-terminal extension of Cbf5, which is a hot spot for dyskeratosis congenita (a rare genetic form of bone marrow failure) mutation, forms an extra structural layer on the PUA domain. 75
37951 409291 cd21149 PUA_archaeosine_TGT PUA RNA-binding domain of archaeosine tRNA-guanine transglycosylase. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this archaeosine tRNA-guanine transglycosylase (TGT) family are responsible for the exchange of a guanine residue in archaeal tRNAs with a preQ0 base (7-cyano-7-deazaguanine), which constitutes the initial step in archaeosine biosynthesis. Archaeosine is a modified RNA base specific to archaea (7-formamidino-7-deazaguanosine), found at position 15 in tRNAs. It has been shown that the PUA domain of archaeosine TGT is not required for its specificity for position 15. 75
37952 409292 cd21150 PUA_NSun6-like PUA RNA-binding domain of the SAM-dependent methyltransferase NSun6 and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this subfamily contain PUA domains that co-occur with SAM-dependent methyltransferase domains and may play roles as cytosine-C(5)-methyltransferases specific for tRNAs or rRNAs. Nsun6 binding to its tRNA substrates requires the presence of a 3'-CCA sequence, which is precisely recognized primarily through interactions with residues from the PUA domain, where the molecular surface of the PUA domain snugly fits onto each nucleotide residue of the CCA end. Human RNA:m5C methyltransferase NSun6 (hNSun6) plays a major role in bone metastasis and could be a valuable therapeutic target for bone metastasis and therapy-resistant tumors. 92
37953 409293 cd21151 PUA_Nip7-like PUA RNA binding domain of ribosome assembly factor Nip7 and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. This eukaryotic and archaeal subfamily contains the conserved protein Nip7 and similar proteins, which are involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of 60S ribosomal subunit. Nip7 orthologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. Structural analyses of the RNA-interacting surfaces of Saccharomyces cerevisiae and Pyrococcus abyssi Nip7 orthologs indicate that, in the archaeal PUA domain, C-terminal positively charged residues (arginines and lysines) are involved in RNA interaction while equivalent positions in eukaryotic orthologs are occupied by mostly hydrophobic residues. Both proteins can bind specifically to polyuridine, and RNA interaction requires specific residues of the PUA domain as determined by site-directed mutagenesis. 78
37954 409294 cd21152 PUA_TruB_bacterial PUA RNA-binding domain of bacterial pseudouridine synthase TruB and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this bacterial subfamily of pseudouridine synthases, including TruB and similar proteins, are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the psi GC loop of elongator tRNAs. 60
37955 409295 cd21153 PUA_RlmI PUA RNA-binding domain of the SAM-dependent methyltransferase RlmI and related proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this subfamily contain PUA domains that co-occur N-terminal to SAM-dependent methyltransferase domains and include Escherichia coli RlmI (rRNA large subunit methyltransferase gene I, also called YccW) and Thermus thermophilus methyltransferase RlmO, which are 5-methylcytosine methyltransferases (m5C MTases) that play a role in modifying 23S rRNA. This subfamily also includes Pyrococcus horikoshii PH1915 that may play a role as a 5-methyluridine MTase, and/or perform similar roles. 70
37956 409296 cd21154 PUA_MJ1432-like PUA RNA-binding domain of MJ1432, TA1423, PH0734, and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this mostly archaeal family have not been characterized functionally; they may bind to RNA. This family includes Pyrococcus horikoshii PH0734 where the N-terminal domain may modulate the binding target of the C-terminal PUA domain using its characteristic electropositive surface. 84
37957 409297 cd21155 PUA_MCTS-1-like PUA RNA-binding domain of malignant T cell-amplified sequence 1 and related proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this eukaryotic family, labelled MCT-1 (malignant T cell-amplified sequence 1) or MCTS-1 (multiple copies T-cell lymphoma-1), contain a single PUA domain. They may play roles in the regulation of the cell cycle; human MCT-1 has been characterized for its oncogenic potential. MCT-1/MCTS1 expression is a new poor-prognosis marker in patients with aggressive breast cancers, and thus the MCT-1 pathway is a novel and promising therapeutic target for triple-negative breast cancer (TNBC). 97
37958 409298 cd21156 PUA_eIF2d-like PUA RNA-binding domain of eukaryotic translation initiation factor 2D and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Most members of this eukaryotic translation initiation factor 2D (eIF2d)-like family of eukaryotic proteins also contain a domain homologous to the translation initiation factor eIF1/SUI1, and a short uncharacterized N-terminal domain. eIF2D may function as a cytosolic GTP-independent initiation factor which delivers Met-tRNA (and non-initiating tRNAs) to the 40S ribosomal subunit. The family member from Drosophila melanogaster has been named ligatin, and this alias has been adopted for other family members as well, which are not homologous to the vertebrate ligatin (LGTN) that is a trafficking receptor for phosphoglycoproteins. 82
37959 409299 cd21157 PUA_G5K PUA domain of gamma-glutamyl kinase, found in archaea, bacteria, and eukarya. Gamma glutamyl kinase (G5K) is an enzyme essential for the biosynthesis of L-proline; it catalyzes the transfer of a phosphate group to glutamate. The resulting glutamate 5-phosphate cyclizes spontaneously to form 5-oxoproline. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain functions as an RNA binding domain in many other proteins; however, its role in G5K is not understood. It might play a role in modulating the enzymatic properties of bacterial G5Ks. 104
37960 394909 cd21158 NendoU_nv Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. This family also includes torovirus NendoUs. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) form a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and monomer in solution. Nsp11 from the arterivirus PRRSV is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. 134
37961 394910 cd21159 XendoU Xenopus laevis endoribonuclease XendoU, and related proteins. Xenopus laevis XendoU is a uridylate-specific endoribonuclease, which releases a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. XendoU is a monomer and requires Mn2+. It is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. XendoU is distantly related to the Nidovirus uridylate-specific endoribonucleases (NendoUs) which include Nonstructural protein 15 (Nsp15) from coronaviruses, Nsp11 from arteriviruses, and torovirus endoribonuclease. 264
37962 394911 cd21160 NendoU_av_Nsp11-like Nidoviral uridylate-specific endoribonuclease (NendoU) domain of arterivirus PRRSV Nonstructural protein 11 (Nsp11), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. This Nsp11 exists as a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. 120
37963 394912 cd21161 NendoU_cv_Nsp15-like Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural Protein 15 (Nsp15) and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) form a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and a monomer in solution. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. 151
37964 394913 cd21162 NendoU_tv_PToV-like Nidoviral uridylate-specific endoribonuclease (NendoU) domain of Porcine torovirus (PToV) endoribonuclease and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. The Porcine torovirus (PToV) strain PToV-NPL/2013 NendoU domain is located at the N-terminus of the ORF1ab replicase polyprotein, between regions annotated as Nonstructural proteins 11 (Nsp11) and 13 (Nsp13). This subfamily belongs to a family which includes Nsp15 from coronaviruses and Nsp11 from arteriviruses, which may participate in the viral replication process and in the evasion of the host immune system. These vary in their requirement for Mn2+. Coronavirus Nsp15 generally form functional hexamers, with the exception of Porcine DeltaCoronavirus (PDCoV) Nsp15 which exists as a dimer and a monomer in solution. Arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11 is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. 133
37965 394902 cd21163 M_cv_Nsp15-NTD_av_Nsp11-like middle (M) domain of coronavirus Nonstructural protein 15 (Nsp15) and the N-terminal domain (NTD) of arterivirus Nsp11 and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Arterivirus Nsp11 has an N-terminal domain (NTD) and a C-terminal catalytic (NendoU) domain. The NTD of Nsp11 superimposes onto the M-domain of coronavirus Nsp15. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of other coronavirus members; it has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV functions as a dimer. 127
37966 394903 cd21165 M_cv-Nsp15-like middle domain of coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronavirus members; it has been shown to exist as a dimer and a monomer in solution. 128
37967 394904 cd21166 NTD_av_Nsp11-like N-terminal domain (NTD) of arterivirus Nonstructural protein 11 (Nsp11), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Arterivirus Nsp11 has an N-terminal domain (NTD) and a C-terminal NendoU catalytic domain. The NTD of Nsp11 superimposes onto the M-domain of coronavirus Nsp15. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV functions as a dimer. PRRSV Nsp11 has been shown to induce STAT2 degradation to inhibit interferon signaling; mutagenesis revealed that the amino acid residue K59 located at the NTD of Nsp11 is indispensable for inducing STAT2 reduction. 100
37968 394905 cd21167 M_alpha_beta_cv_Nsp15-like middle domain of alpha- and beta-coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. This middle domain harbors residues involved in hexamer formation and in trimer stability. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. 127
37969 394906 cd21168 M_gcv_Nsp15-like middle domain of gammacoronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. This middle domain harbors residues involved in hexamer formation and in trimer stability. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. 123
37970 394907 cd21169 M_dcv_Nsp15-like middle domain of delta coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. 118
37971 394899 cd21170 NTD_cv_Nsp15-like N-terminal domain of coronavirus Nonstructural protein 15 (Nsp15) and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This NTD structure (approximately 60 residues) present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from the Nsp15 of these alpha- and beta-coronavirus; it has been shown to exist as dimers and monomers in solution. 60
37972 394900 cd21171 NTD_alpha_beta_cv_Nsp15-like N-terminal domain of alpha- and beta-coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This small NTD structure, present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Residues in this N-terminal domain are important for hexamer (dimer of trimers) formation. 61
37973 394901 cd21172 NTD_dcv_Nsp15-like N-terminal domain of deltacoronavirus Nonstructural protein 15 (Nsp15), and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This small NTD structure, present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from the Nsp15 of alpha- and beta-coronavirus, such as Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and Murine Hepatitis Virus (MHV) which form functional hexamers; PDCoV Nsp15 has been shown to exist as a dimer and a monomer in solution. 60
37974 411057 cd21173 NucC-like cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease and similar proteins. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease. 231
37975 410621 cd21174 LPMO_auxiliary lytic polysaccharide monooxygenase auxiliary activity protein. Many proteins in this superfamily are copper-dependent lytic polysaccharide monooxygenases (LPMOs) and include lytic polysaccharide monooxygenase auxiliary activity families 9 (AA9) and 10 (AA10). The substrate-binding surface of this family is a flat beta-sandwich fold. 136
37976 410622 cd21175 LPMO_AA9 lytic polysaccharide monooxygenase (LPMO) auxiliary activity family 9 (AA9). AA9 proteins are copper-dependent lytic polysaccharide monooxygenases (LPMOs) involved in the cleavage of cellulose chains with oxidation of carbons C1 and/or C4 and C6. Activities include lytic cellulose monooxygenase (C1-hydroxylating) (EC 1.14.99.54) and lytic cellulose monooxygenase (C4-dehydrogenating) (EC 1.14.99.56). The family used to be called GH61 because weak endoglucanase activity had been demonstrated in some family members. 216
37977 410623 cd21176 LPMO_auxiliary-like fungal lytic polysaccharide monooxygenase (LPMO) auxiliary activity family protein. Proteins in this fungal family of copper-binding proteins may not function as lytic polysaccharide monooxygenases (LPMOs) or in specific binding of chitin and/or cellulose. A family member found in the ectomycorrhizal fungus Laccaria bicolor has been found to be located at the interface between tree rootlet cells and fungal hyphae. It does not perform oxidative cleavage of polysaccharides. Members of this family are related to LPMOs but have diverged to biological functions other than polysaccharide degradation. 121
37978 410624 cd21177 LPMO_AA10 lytic polysaccharide monooxygenase (LPMO) auxiliary activity family 10 (AA10). AA10 proteins are copper-dependent lytic polysaccharide monooxygenases (LPMOs), which may act on chitin or cellulose. The family used to be called CBM33. Activities in this family include lytic cellulose monooxygenase (C1-hydroxylating) (EC 1.14.99.54), lytic cellulose monooxygenase (C4-dehydrogenating) (EC 1.14.99.56), lytic chitin monooxygenase (EC 1.14.99.53), and lytic xylan monooxygenase/xylan oxidase (glycosidic bond-cleaving) (EC 1.14.99.-). Also included are viral chitin-binding glycoproteins such as fusolin and spheroidin-like proteins. 180
37979 410625 cd21178 Fusolin-like fusolin and similar proteins. Fusolin is a protein found in spindles of insect poxviruses that resembles the lytic polysaccharide monooxygenases of chitinovorous bacteria and may function to disrupt the chitin-rich peritrophic matrix that protects insects against oral infections. Thus, it is a component of the virus occlusion bodies (which are large proteinaceous polyhedra) that protect the virus from the outside environment for extended periods until they are ingested by insect larvae. 227
37980 411059 cd21179 LIC_1098-like putative DNA adenine methyltransferase similar to Leptospira interrogans LIC_1098. This uncharacterized family is structurally similar to DNA adenine methyltransferases such as FokI, EcoRV, or DpnIIA. 280
37981 409666 cd21180 GH2_GIPC GIPC-homology 2 (GH2) domain found in the GIPC family. The GIPC family includes PDZ domain-containing proteins, GIPC1 (also called GAIP C-terminus-interacting protein, RGS-GAIP-interacting protein, RGS19-interacting protein 1, RGS19IP1, synectin, tax interaction protein 2, or TIP-2), GIPC2, and GIPC3, which may act as scaffold proteins linking heterotrimeric G-proteins to seven-transmembrane-type WNT receptor or to receptor tyrosine kinases. They might play key roles in carcinogenesis and embryogenesis through modulation of growth factor signaling and cell adhesion. GIPCs are proteins with a GIPC homology 1 (GH1) domain, a central PDZ domain and a GH2 domain. This model corresponds to the GH2 domain, which mediates the interaction with myosin VI and is involved in homodimerization in the autoinhibited state. 65
37982 410548 cd21181 Tudor_SETDB1_rpt2 second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 54
37983 410549 cd21182 Tudor_SMN_SPF30-like Tudor domain found in survival motor neuron protein (SMN), motor neuron-related-splicing factor 30 (SPF30), and similar proteins. This group contains SMN, SPF30, Tudor domain-containing protein 3 (TDRD3), DNA excision repair protein ERCC-6-like 2 (ERCC6L2), and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalyst role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. Members of this group contain a single Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 50
37984 409032 cd21183 CH_FLN-like_rpt1 first calponin homology (CH) domain found in the filamin family. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. This family also includes Drosophila melanogaster protein jitterbug (Jbug), which is an actin-meshwork organizing protein containing three copies of the CH domain. Other members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 108
37985 409033 cd21184 CH_FLN-like_rpt2 second calponin homology (CH) domain found in the filamin family. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. This family also includes Drosophila melanogaster protein jitterbug (Jbug), which is an actin-meshwork organizing protein containing three copies of the CH domain. Other members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 103
37986 409034 cd21185 CH_jitterbug-like_rpt3 third calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 98
37987 409035 cd21186 CH_DMD-like_rpt1 first calponin homology (CH) domain found in the dystrophin family. The dystrophin family includes dystrophin and its paralog, utrophin. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin is also involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed in both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 107
37988 409036 cd21187 CH_DMD-like_rpt2 second calponin homology (CH) domain found in the dystrophin family. The dystrophin family includes dystrophin and its paralog, utrophin. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin is also involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed in both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 104
37989 409037 cd21188 CH_PLEC-like_rpt1 first calponin homology (CH) domain found in the plectin/dystonin/MACF1 family. This family includes plectin, dystonin and microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1). Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It could also bind muscle proteins such as actin to membrane complexes in muscle. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 105
37990 409038 cd21189 CH_PLEC-like_rpt2 second calponin homology (CH) domain found in the plectin/dystonin/MACF1 family. This family includes plectin, dystonin and microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1). Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It could also bind muscle proteins such as actin to membrane complexes in muscle. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 105
37991 409039 cd21190 CH_SYNE-like_rpt1 first calponin homology (CH) domain found in the synaptic nuclear envelope protein family. The synaptic nuclear envelope (SYNE) family includes SYNE-1, -2 and calmin. SYNE-1 (also called nesprin-1, enaptin, KASH domain-containing protein 1, KASH1, myocyte nuclear envelope protein 1, MYNE-1, or nuclear envelope spectrin repeat protein 1) and SYNE-2 (also called nesprin-2, KASH domain-containing protein 2, KASH2, nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE) may act redundantly. They are multi-isomeric modular proteins which form a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. They also act as components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 113
37992 409040 cd21191 CH_CLMN_rpt1 first calponin homology (CH) domain found in calmin and similar proteins. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Calmin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 114
37993 409041 cd21192 CH_SYNE-like_rpt2 second calponin homology (CH) domain found in the synaptic nuclear envelope protein (SYNE) family. The SYNE family includes SYNE-1, -2 and calmin. SYNE-1 (also called nesprin-1, enaptin, KASH domain-containing protein 1, KASH1, myocyte nuclear envelope protein 1, MYNE-1, or nuclear envelope spectrin repeat protein 1) and SYNE-2 (also called nesprin-2, KASH domain-containing protein 2, KASH2, nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE) may act redundantly. They are multi-isomeric modular proteins which form a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. They also act as components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 107
37994 409042 cd21193 CH_beta_spectrin_rpt1 first calponin homology (CH) domain found in the beta spectrin family. The beta spectrin family includes beta-I, -II, -III, -IV and -V spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Beta-III spectrin is also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5). Beta-V spectrin, also called spectrin beta chain, non-erythrocytic 5 (SPTBN5), is a mammalian ortholog of Drosophila beta H spectrin. Beta-III and Beta-V spectrins may play crucial roles as longer actin-membrane cross-linkers or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 116
37995 409043 cd21194 CH_beta_spectrin_rpt2 second calponin homology (CH) domain found in the beta spectrin family. The beta spectrin family includes beta-I, -II, -III, -IV and -V spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Beta-III spectrin is also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5). Beta-V spectrin, also called spectrin beta chain, non-erythrocytic 5 (SPTBN5), is a mammalian ortholog of Drosophila beta H spectrin. Beta-III and Beta-V spectrins may play crucial roles as longer actin-membrane cross-linkers or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 105
37996 409044 cd21195 CH_MICAL2_3-like calponin homology (CH) domain found in molecule interacting with CasL protein 2 (MICAL-2), MICAL-3, and similar proteins. Molecule interacting with CasL protein (MICAL) is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. In addition, MICAL functions to interact with Rab13 and Rab8 to coordinate the assembly of tight junctions and adherens junctions in epithelial cells. Thus, MICAL is also called junctional Rab13-binding protein (JRAB). Members of this family, which includes MICAL-2, MICAL-3, and similar proteins, contain one CH domain. CH domains are actin filament (F-actin) binding motifs. 110
37997 409045 cd21196 CH_MICAL1 calponin homology (CH) domain found in molecule interacting with CasL protein 1. MICAL-1, also called NEDD9-interacting protein with calponin homology and LIM domains, acts as a [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-1 acts as a cytoskeletal regulator that connects NEDD9 to intermediate filaments. It also acts as a negative regulator of apoptosis via its interaction with STK38 and STK38L. MICAL-1 is a Rab effector protein that plays a role in vesicle trafficking. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 106
37998 409046 cd21197 CH_MICALL calponin homology (CH) domain found in the MICAL-like protein family. The MICAL-L family includes MICAL-L1 and MICAL-L2. MICAL-L1, also called molecule interacting with Rab13 (MIRab13), is a probable lipid-binding protein with higher affinity for phosphatidic acid, a lipid enriched in recycling endosome membranes. It is a tubular endosomal membrane hub that connects Rab35 and Arf6 with Rab8a. It may be involved in a late step of receptor-mediated endocytosis regulating endocytosed-EGF receptor trafficking. Alternatively, it may regulate slow endocytic recycling of endocytosed proteins back to the plasma membrane. MICAL-L1 may indirectly play a role in neurite outgrowth. MICAL-L2, also called junctional Rab13-binding protein (JRAB), or molecule interacting with CasL-like 2, acts as an effector of small Rab GTPases which is involved in junctional complexes assembly through the regulation of cell adhesion molecule transport to the plasma membrane, and actin cytoskeleton reorganization. It regulates the endocytic recycling of occludins, claudins, and E-cadherin to the plasma membrane and may thereby regulate the establishment of tight junctions and adherens junctions. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 105
37999 409047 cd21198 CH_EHBP calponin homology (CH) domain found in the EH domain-binding protein (EHBP) family. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP1 is associated with aggressive prostate cancer and insulin-stimulated trafficking and cell migration. EHBP1L1 may also act as Rab effector protein and play a role in vesicle trafficking. It coordinates Rab8 and Bin1 to regulate apical-directed transport in polarized epithelial cells. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38000 409048 cd21199 CH_CYTS calponin homology (CH) domain found in the cytospin family. The cytospin family includes cytospin-A and cytospin-B. Cytospin-A, also called renal carcinoma antigen NY-REN-22, sperm antigen with calponin homology and coiled-coil domains 1-like, or SPECC1-like (SPECC1L) protein, is involved in cytokinesis and spindle organization. It may play a role in actin cytoskeleton organization and microtubule stabilization and hence, is required for proper cell adhesion and migration. Cytospin-B, also called nuclear structure protein 5 (NSP5), sperm antigen HCMOGT-1, or sperm antigen with calponin homology and coiled-coil domains 1 (SPECC1), is a novel fusion partner to PDGFRB in juvenile myelomonocytic leukemia with translocation t(5;17)(q33;p11.2). Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 112
38001 409049 cd21200 CH_SMTN-like calponin homology (CH) domain found in the smoothelin family. The smoothelin family includes smoothelin and smoothelin-like proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. SMTNL1, also called calponin homology-associated smooth muscle protein (CHASM), plays a role in the regulation of contractile properties of both striated and smooth muscles. It can bind to calmodulin and tropomyosin. When it is unphosphorylated, SMTNL1 may inhibit myosin dephosphorylation. SMTNL2 is highly expressed in skeletal muscle and could be associated with differentiating myocytes. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38002 409050 cd21201 CH_VAV calponin homology (CH) domain found in VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and as scaffold proteins, and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV proteins. 117
38003 409051 cd21202 CH_PIX calponin homology (CH) domain found in the Pak Interactive eXchange factor family. Pak Interactive eXchange factor (PIX) proteins are Rho guanine nucleotide exchange factors (GEFs), which activate small GTPases by exchanging bound GDP for free GTP. They act as GEFs for both Cdc42 and Rac1, and have been implicated in cell motility, adhesion, neurite outgrowth, and cell polarity. Vertebrates contain two proteins from the PIX family, alpha-PIX and beta-PIX. Alpha-PIX, also called Rho guanine nucleotide exchange factor 6 (ARHGEF6), is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Beta-PIX, also called Rho guanine nucleotide exchange factor 7 (ARHGEF7), plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. Both alpha-PIX and beta-PIX contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 114
38004 409052 cd21203 CH_AtKIN14-like calponin homology (CH) domain found in Arabidopsis thaliana Kinesin-like KIN-14 protein family. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. This family includes a group of kinesin-like proteins belonging to KIN-14 protein family. They all contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 112
38005 409053 cd21204 CH_GAS2-like calponin homology (CH) domain found in the growth arrest-specific protein 2 family. The growth arrest-specific protein 2 (GAS-2) family includes GAS-2, and GAS-2 like proteins, GAS2L1-3. GAS-2 may play a role in apoptosis by acting as a cell death substrate for caspases. GAS2L1 (also called GAS2-related protein on chromosome 22 or growth arrest-specific protein 2-like 1) and GAS2L2 (also called GAS2-related protein on chromosome 17 or growth arrest-specific protein 2-like 2) may be involved in the cross-linking of microtubules and microfilaments. GAS2L3, also called GAS2-like protein 3, is a cytoskeletal linker protein that may promote and stabilize the formation of the actin and microtubule network. Members of this family contain a single copy of the CH domain at the N-terminal region. CH domains are actin filament (F-actin) binding motifs. 131
38006 409054 cd21205 CH_LRCH calponin homology (CH) domain found in the leucine-rich repeat and calponin homology domain-containing protein family. The leucine-rich repeat and calponin homology domain-containing protein (LRCH) family includes LRCH1-4. LRCH1, also called calponin homology domain-containing protein 1, or neuronal protein 81 (NP81), acts as a negative regulator of GTPase Cdc42 by sequestering Cdc42-guanine exchange factor DOCK8. LRCH2 may play a role in the organization of the cytoskeleton. LRCH3 is part of the DISP complex and may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. LRCH4, also called leucine-rich repeat neuronal protein 4, or leucine-rich neuronal protein, acts as a novel Toll-like receptor (TLR) accessory protein that regulates the innate immune response. Members of this family contain a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs. 107
38007 409055 cd21206 CH_IQGAP calponin homology (CH) domain found in the IQ motif containing GTPase activating protein family. Members of the IQ motif containing GTPase activating protein (IQGAP) family are associated with the Ras GTP-binding protein and act as essential regulators of cytoskeletal function. There are three known IQGAP family members: IQGAP1, IQGAP2, and IQGAP3. They are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity. It lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3 regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton. 118
38008 409056 cd21207 CH_dMP20-like calponin homology (CH) domain found in Drosophila melanogaster muscle-specific protein 20 (dMP20) and similar domains. This subfamily contains Drosophila melanogaster muscle-specific protein 20 (dMP20), Echinococcus granulosus myophilin, Dictyostelium discoideum Rac guanine nucleotide exchange factor B (also called Trix), and similar proteins. dMP20 is present only in the synchronous muscles of D. melanogaster. It may be involved in the system linking the nerve impulse with the contraction or the relaxation process. Trix is involved in the regulation of the late steps of the endocytic pathway. dMP20 contains a single copy of the CH domain, while Trix (triple CH-domain array exchange factor) contains three, two type 3 CH domains which are included in this model, and one type 1 CH domain that is not included in this subfamily, but is part of the superfamily. CH domains are actin filament (F-actin) binding motifs. 107
38009 409057 cd21208 CH_LMO7-like calponin homology (CH) domain found in LIM domain only protein 7 and similar proteins. This family includes LIM domain only protein 7 (LMO-7) and LIM and calponin homology domains-containing protein 1 (LIMCH1), and similar proteins. LMO-7, also called F-box only protein 20, or LOMP, is a transcription regulator for expression of many Emery-Dreifuss muscular dystrophy (EDMD)-relevant genes. It binds to alpha-actinin and AF6/afadin at adherens junctions for epithelial cell-cell adhesion. LIMCH1 acts as an actin stress fiber-associated protein that activates the non-muscle myosin IIa complex by promoting the phosphorylation of its regulatory subunit MRLC/MYL9. It positively regulates actin stress fiber assembly and stabilizes focal adhesions, and therefore negatively regulates cell spreading and cell migration. Members of this family contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 119
38010 409058 cd21209 CH_TAGLN-like calponin homology (CH) domain found in the transgelin family. The transgelin (TAGLN) family includes transgelin, transgelin-2 and transgelin-3. Transgelin, also called 22 kDa actin-binding protein, protein WS3-10, or smooth muscle protein 22-alpha (SM22-alpha), acts as an actin cross-linking/gelling protein that may be involved in calcium interactions and in regulating contractile properties of the cell. Transgelin-2, also called epididymis tissue protein Li 7e, or SM22-alpha homolog, acts as an actin-binding protein that induces actin gelation and regulates actin cytoskeleton. It may participate in the development and progression of multiple cancers. Transgelin-3, also called neuronal protein 22 (NP22), or neuronal protein NP25, may have a role in alcohol-related adaptations and may mediate regulatory signal transduction pathways in neurons. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38011 409059 cd21210 CH_SCP1-like calponin homology (CH) domain found in Saccharomyces cerevisiae transgelin (SCP1) and similar proteins. The family includes transgelins from Saccharomyces cerevisiae and Schizosaccharomyces pombe, which are also called SCP1 and STG1, respectively. Transgelin, also called calponin homolog 1, has actin-binding and actin-bundling activity. It stabilizes actin filaments against disassembly. Transgelin contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 101
38012 409060 cd21211 CH_CNN calponin homology (CH) domain found in the calponin family. Calponin is an actin filament-associated regulatory protein expressed in smooth muscle and many types of non-muscle cells. There are three calponin isoforms, calponin-1, -2, -3. All of them are actin-binding proteins with functions in inhibiting actin-activated myosin ATPase and stabilizing the actin cytoskeleton. Calponin-1 is specifically expressed in smooth muscle cells and plays a role in fine-tuning smooth muscle contractility. Calponin-2 is expressed in both smooth muscle and non-muscle cells and regulates multiple actin cytoskeleton-based functions. Calponin-3 is expressed in the brain and participates in actin cytoskeleton-based activities in embryonic development and myogenesis. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 108
38013 409061 cd21212 CH_NAV2-like calponin homology (CH) domain found in neuron navigator (NAV) 2, NAV3, and similar proteins. This family includes neuron navigator 2 (NAV2) and NAV3, both of which contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. NAV2, also called helicase APC down-regulated 1 (HELAD1), pore membrane and/or filament-interacting-like protein 2 (POMFIL2), retinoic acid inducible in neuroblastoma 1 (RAINB1), Steerin-2 (STEERIN2), or Unc-53 homolog 2 (unc53H2), possesses 3' to 5' helicase activity and exonuclease activity. It is involved in neuronal development, specifically in the development of different sensory organs. NAV3, also called pore membrane and/or filament-interacting-like protein 1 (POMFIL1), Steerin-3 (STEERIN3), or Unc-53 homolog 3 (unc53H3), may regulate IL2 production by T-cells. It may be involved in neuron regeneration. 105
38014 409062 cd21213 CH_DIXDC1 calponin homology (CH) domain found in Dixin and similar proteins. Dixin, also called coiled-coil protein DIX1, coiled-coil-DIX1, or DIX domain-containing protein 1, is a positive effector of the Wnt signaling pathway. It activates WNT3A signaling via DVL2 and regulates JNK activation by AXIN1 and DVL2. Members of this family contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 107
38015 409063 cd21214 CH_ACTN_rpt1 first calponin homology (CH) domain found in the alpha-actinin family. The alpha-actinin (ACTN) family includes alpha-actinin-1, -2, -3, and -4. They are F-actin cross-linking proteins which are thought to anchor actin to a variety of intracellular structures. ACTN1 mutations cause congenital macrothrombocytopenia. ACTN2 mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. ACTN3 is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. ACTN4 is associated with cell motility and cancer invasion. It is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38016 409064 cd21215 CH_SpAIN1-like_rpt1 first calponin homology (CH) domain found in Schizosaccharomyces pombe alpha-actinin-like protein 1 and similar proteins. Schizosaccharomyces pombe alpha-actinin-like protein 1 (SpAIN1) binds to actin and is involved in actin-ring formation and organization. It plays a role in cytokinesis and is involved in septation. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38017 409065 cd21216 CH_ACTN_rpt2 second calponin homology (CH) domain found in the alpha-actinin family. The alpha-actinin (ACTN) family includes alpha-actinin-1, -2, -3, and -4. They are F-actin cross-linking proteins which are thought to anchor actin to a variety of intracellular structures. ACTN1 mutations cause congenital macrothrombocytopenia. ACTN2 mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. ACTN3 is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. ACTN4 is associated with cell motility and cancer invasion. It is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38018 409066 cd21217 CH_PLS_FIM_rpt1 first calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5; they cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 114
38019 409067 cd21218 CH_PLS_FIM_rpt2 second calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5; they cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 114
38020 409068 cd21219 CH_PLS_FIM_rpt3 third calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 113
38021 409069 cd21220 CH_PLS_FIM_rpt4 fourth calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38022 409070 cd21221 CH_PARV_rpt1 first calponin homology (CH) domain found in the parvin family. The parvin family includes alpha-parvin, beta-parvin, and gamma-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38023 409071 cd21222 CH_PARV_rpt2 second calponin homology (CH) domain found in the parvin family. The parvin family includes alpha-parvin, beta-parvin, and gamma-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 121
38024 409072 cd21223 CH_ASPM_rpt1 first calponin homology (CH) domain found in abnormal spindle-like microcephaly-associated protein (ASPM) and similar proteins. ASPM, also called abnormal spindle protein homolog, or Asp homolog, is involved in mitotic spindle regulation and coordination of mitotic processes. It may also have a preferential role in regulating neurogenesis. Members of this family contain two copies of the CH domain in the middle region. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 113
38025 409073 cd21224 CH_ASPM_rpt2 second calponin homology (CH) domain found in abnormal spindle-like microcephaly-associated protein (ASPM) and similar proteins. ASPM, also called abnormal spindle protein homolog, or Asp homolog, is involved in mitotic spindle regulation and coordination of mitotic processes. It may also have a preferential role in regulating neurogenesis. Members of this family contain two copies of the CH domain in the middle region. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 138
38026 409074 cd21225 CH_CTX_rpt1 first calponin homology (CH) domain found in cortexillin. Cortexillins are actin-bundling proteins that play a critical role in regulating cell morphology and actin cytoskeleton reorganization. They play a major role in cytokinesis and contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 111
38027 409075 cd21226 CH_CTX_rpt2 second calponin homology (CH) domain found in cortexillin. Cortexillins are actin-bundling proteins that play a critical role in regulating cell morphology and actin cytoskeleton reorganization. They play a major role in cytokinesis and contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 103
38028 409076 cd21227 CH_jitterbug-like_rpt1 first calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38029 409077 cd21228 CH_FLN_rpt1 first calponin homology (CH) domain found in filamins. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. Members of this family contain two copies of the CH domain. The model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 108
38030 409078 cd21229 CH_jitterbug-like_rpt2 second calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38031 409079 cd21230 CH_FLN_rpt2 second calponin homology (CH) domain found in filamins. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. Members of this family contain two copies of the CH domain. The model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 103
38032 409080 cd21231 CH_DMD_rpt1 first calponin homology (CH) domain found in dystrophin and similar proteins. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion, as well as with acute Trypanosoma cruzi infection. The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. This model corresponds to the first CH domain. 111
38033 409081 cd21232 CH_UTRN_rpt1 first calponin homology (CH) domain found in utrophin and similar proteins. Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs), and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs. This model corresponds to the first CH domain. 107
38034 409082 cd21233 CH_DMD_rpt2 second calponin homology (CH) domain found in dystrophin and similar proteins. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion, as well as with acute Trypanosoma cruzi infection. The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. The model corresponds to the second CH domain. 111
38035 409083 cd21234 CH_UTRN_rpt2 second calponin homology (CH) domain found in utrophin and similar proteins. Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs), and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs. This model corresponds to the second CH domain. 104
38036 409084 cd21235 CH_PLEC_rpt1 first calponin homology (CH) domain found in plectin and similar proteins. Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It can also bind muscle proteins such as actin to membrane complexes in muscle. Plectin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38037 409085 cd21236 CH_DYST_rpt1 first calponin homology (CH) domain found in dystonin and similar proteins. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. Dystonin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 128
38038 409086 cd21237 CH_MACF1_rpt1 first calponin homology (CH) domain found in microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1) and similar proteins. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. MACF1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 118
38039 409087 cd21238 CH_PLEC_rpt2 second calponin homology (CH) domain found in plectin and similar proteins. Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments and anchors intermediate filaments to desmosomes or hemidesmosomes. It can also bind muscle proteins such as actin to membrane complexes in muscle. Plectin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38040 409088 cd21239 CH_DYST_rpt2 second calponin homology (CH) domain found in dystonin and similar proteins. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. Dystonin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 104
38041 409089 cd21240 CH_MACF1_rpt2 second calponin homology (CH) domain found in microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1) and similar proteins. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. MACF1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38042 409090 cd21241 CH_SYNE1_rpt1 first calponin homology (CH) domain found in synaptic nuclear envelope protein 1 and similar proteins. Synaptic nuclear envelope protein 1 (SYNE-1), also called nesprin-1, enaptin, KASH domain-containing protein 1 (KASH1), myocyte nuclear envelope protein 1 (MYNE-1), or nuclear envelope spectrin repeat protein 1, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-1 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 113
38043 409091 cd21242 CH_SYNE2_rpt1 first calponin homology (CH) domain found in synaptic nuclear envelope protein 2. Synaptic nuclear envelope protein 2 (SYNE-2), also called nesprin-2, KASH domain-containing protein 2 (KASH2), nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-2 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-2 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 111
38044 409092 cd21243 CH_SYNE1_rpt2 second calponin homology (CH) domain found in synaptic nuclear envelope protein 1 (SYNE-1) and similar proteins. SYNE-1, also called nesprin-1, enaptin, KASH domain-containing protein 1 (KASH1), myocyte nuclear envelope protein 1 (MYNE-1), or nuclear envelope spectrin repeat protein 1, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-1 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38045 409093 cd21244 CH_SYNE2_rpt2 second calponin homology (CH) domain found in synaptic nuclear envelope protein 2 (SYNE-2) and similar proteins. SYNE-2, also called nesprin-2, KASH domain-containing protein 2 (KASH2), nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-2 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-2 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38046 409094 cd21245 CH_CLMN_rpt2 second calponin homology (CH) domain found in calmin and similar proteins. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Calmin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38047 409095 cd21246 CH_SPTB-like_rpt1 first calponin homology (CH) domain found in the beta-I spectrin-like subfamily. The beta-I spectrin-like family includes beta-I, -II, -III and -IV spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-III spectrin, also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5), may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Members of this subfamily contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 117
38048 409096 cd21247 CH_SPTBN5_rpt1 first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 5 (SPTBN5) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN5, also called beta-V spectrin, is a mammalian ortholog of Drosophila beta H spectrin that may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. SPTBN5 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38049 409097 cd21248 CH_SPTB_like_rpt2 second calponin homology (CH) domain found in the beta-I spectrin-like subfamily. The beta-I spectrin-like family includes beta-I, -II, -III and -IV spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-III spectrin, also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5), may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Members of this subfamily contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38050 409098 cd21249 CH_SPTBN5_rpt2 second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 5 (SPTBN5) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN5, also called beta-V spectrin, is a mammalian ortholog of Drosophila beta H spectrin that may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. SPTBN5 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38051 409099 cd21250 CH_MICAL2 calponin homology (CH) domain found in molecule interacting with CasL protein 2. MICAL-2 is a nuclear [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-2 acts as a key regulator of the serum response factor (SRF) signaling pathway elicited by nerve growth factor and serum. It mediates oxidation and subsequent depolymerization of nuclear actin, leading to the increased MKL1/MRTF-A presence in the nucleus, promoting SRF:MKL1/MRTF-A-dependent gene transcription. MICAL-2 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 110
38052 409100 cd21251 CH_MICAL3 calponin homology (CH) domain found in molecule interacting with CasL protein 3. MICAL-3 is a [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-3 seems to act as a Rab effector protein and plays a role in vesicle trafficking. It is involved in exocytic vesicle tethering and fusion. MICAL-3 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 111
38053 409101 cd21252 CH_MICALL1 calponin homology (CH) domain found in MICAL-like protein 1. MICAL-like protein 1 (MICAL-L1), also called molecule interacting with Rab13 (MIRab13), is a probable lipid-binding protein with higher affinity for phosphatidic acid, a lipid enriched in recycling endosome membranes. It is a tubular endosomal membrane hub that connects Rab35 and Arf6 with Rab8a. It may be involved in a late step of receptor-mediated endocytosis regulating endocytosed-EGF receptor trafficking. Alternatively, it may regulate slow endocytic recycling of endocytosed proteins back to the plasma membrane. MICAL-L1 may indirectly play a role in neurite outgrowth. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38054 409102 cd21253 CH_MICALL2 calponin homology (CH) domain found in MICAL-like protein 2 and similar proteins. MICAL-like protein 2 (MICAL-L2), also called junctional Rab13-binding protein (JRAB), or molecule interacting with CasL-like 2, acts as an effector of small Rab GTPases which is involved in junctional complexes assembly through the regulation of cell adhesion molecule transport to the plasma membrane, and actin cytoskeleton reorganization. It regulates the endocytic recycling of occludins, claudins, and E-cadherin to the plasma membrane and may thereby regulate the establishment of tight junctions and adherens junctions. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38055 409103 cd21254 CH_EHBP1 calponin homology (CH) domain found in EH domain-binding protein 1 and similar proteins. EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP1 is associated with aggressive prostate cancer and insulin-stimulated trafficking and cell migration. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38056 409104 cd21255 CH_EHBP1L1 calponin homology (CH) domain found in EH domain-binding protein 1-like protein 1 and similar proteins. EHBP1L1 may act as Rab effector protein and play a role in vesicle trafficking. It coordinates Rab8 and Bin1 to regulate apical-directed transport in polarized epithelial cells. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38057 409105 cd21256 CH_CYTSA calponin homology (CH) domain found in cytospin-A. Cytospin-A, also called renal carcinoma antigen NY-REN-22, or sperm antigen with calponin homology and coiled-coil domains 1-like, or SPECC1-like protein (SPECC1L), is involved in cytokinesis and spindle organization. It may play a role in actin cytoskeleton organization and microtubule stabilization and hence, is required for proper cell adhesion and migration. Cytospin-A contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38058 409106 cd21257 CH_CYTSB calponin homology (CH) domain found in cytospin-B. Cytospin-B, also called nuclear structure protein 5 (NSP5), or sperm antigen HCMOGT-1, or sperm antigen with calponin homology and coiled-coil domains 1 (SPECC1), is a novel fusion Cytospin-B that contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 112
38059 409107 cd21258 CH_SMTNA calponin homology (CH) domain found in smoothelin-A and similar proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. This model corresponds to the single CH domain of smoothelin-A. CH domains are actin filament (F-actin) binding motifs. 111
38060 409108 cd21259 CH_SMTNB calponin homology (CH) domain found in smoothelin-B and similar proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. The human SMTN gene encodes smoothelin-A and smoothelin-B. This model corresponds to the single CH domain of smoothelin-B. CH domains are actin filament (F-actin) binding motifs. 112
38061 409109 cd21260 CH_SMTNL1 calponin homology (CH) domain found in smoothelin-like protein 1. Smoothelin-like protein 1 (SMTNL1), also called calponin homology-associated smooth muscle protein (CHASM), plays a role in the regulation of contractile properties of both striated and smooth muscles. It can bind to calmodulin and tropomyosin. When it is unphosphorylated, SMTNL1 may inhibit myosin dephosphorylation. SMTNL1 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 116
38062 409110 cd21261 CH_SMTNL2 calponin homology (CH) domain found in smoothelin-like protein 2. Smoothelin-like protein 2 (SMTNL2) is highly expressed in skeletal muscle and could be associated with differentiating myocytes. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38063 409111 cd21262 CH_VAV1 calponin homology (CH) domain found in VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV1 protein. 120
38064 409112 cd21263 CH_VAV2 calponin homology (CH) domain found in VAV2 protein and similar proteins. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV2 protein. 119
38065 409113 cd21264 CH_VAV3 calponin homology (CH) domain found in VAV3 protein and similar proteins. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. Its function has been implicated in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV3 protein. 117
38066 409114 cd21265 CH_alphaPIX calponin homology (CH) domain found in alpha-Pak Interactive eXchange factor. Alpha-Pak Interactive eXchange factor (alpha-PIX), also called PAK-interacting exchange factor alpha, Rho guanine nucleotide exchange factor 6 (ARHGEF6), Rac/Cdc42 guanine nucleotide exchange factor 6, or Cool (Cloned out of Library)-2, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Alpha-PIX contains a single copy of the CH domain at its N-terminus. CH domains are actin filament (F-actin) binding motifs. 117
38067 409115 cd21266 CH_betaPIX calponin homology (CH) domain found in beta-Pak Interactive eXchange factor. Beta-Pak Interactive eXchange factor (beta-PIX), also called PAK-interacting exchange factor beta, Rho guanine nucleotide exchange factor 7 (ARHGEF7), p85, or Cool (Cloned out of Library)-1, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. Beta-PIX contains a single copy of the CH domain at its N-terminus. CH domains are actin filament (F-actin) binding motifs. 112
38068 409116 cd21267 CH_GAS2 calponin homology (CH) domain found in growth arrest-specific protein 2. Growth arrest-specific protein 2 (GAS-2) may play a role in apoptosis by acting as a cell death substrate for caspases. It contains a single copy of the CH domain at the N-terminal region. CH domains are actin filament (F-actin) binding motifs. 136
38069 409117 cd21268 CH_GAS2L1_2 calponin homology (CH) domain found in GAS2-like protein 1 (GAS2L1), GAS2L2, and similar proteins. This subfamily includes GAS2L1 (also called GAS2-related protein on chromosome 22 or growth arrest-specific protein 2-like 1) and GAS2L2 (also called GAS2-related protein on chromosome 17 or growth arrest-specific protein 2-like 2). They may be involved in the cross-linking of microtubules and microfilaments. Members of this subfamily contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 142
38070 409118 cd21269 CH_GAS2L3 calponin homology (CH) domain found in growth arrest-specific protein 2-like 3. Growth arrest-specific protein 2-like 3 (GAS2L3), also called GAS2-like protein 3, is a cytoskeletal linker protein that may promote and stabilize the formation of the actin and microtubule network. It contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 130
38071 409119 cd21270 CH_LRCH1 calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 1. Leucine-rich repeat and calponin homology domain-containing protein 1 (LRCH1), also called calponin homology domain-containing protein 1, or neuronal protein 81 (NP81), acts as a negative regulator of GTPase CDC42 by sequestering CDC42-guanine exchange factor DOCK8. LRCH1 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs. 112
38072 409120 cd21271 CH_LRCH2 calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 2. Leucine-rich repeat and calponin homology domain-containing protein 2 (LRCH2) may play a role in the organization of the cytoskeleton. It contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs. 111
38073 409121 cd21272 CH_LRCH3 calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 3. Leucine-rich repeat and calponin homology domain-containing protein 3 (LRCH3) is part of the DISP (DOCK7-Induced Septin disPlacement) complex. It may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. LRCH3 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs. 109
38074 409122 cd21273 CH_LRCH4 calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 4. Leucine-rich repeat and calponin homology domain-containing protein 4 (LRCH4), also called leucine-rich repeat neuronal protein 4, or leucine-rich neuronal protein, acts as a novel Toll-like receptor (TLR) accessory protein that regulates the innate immune response. LRCH4 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs. 109
38075 409123 cd21274 CH_IQGAP1 calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP1. IQ motif containing GTPase activating protein 1 (IQGAP1), also called p195, is a homodimeric protein that is widely expressed among vertebrate cell types from early embryogenesis. It plays a crucial role in regulating the dynamics and assembly of the actin cytoskeleton. It belongs to the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity. It lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP1 contains a single copy of the CH domain at the N-terminus. 154
38076 409124 cd21275 CH_IQGAP2 calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP2. IQ motif containing GTPase activating protein 2 (IQGAP2) is a member of the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP2 binds to activated Cdc42 and Rac1 but does not seem to stimulate their GTPase activity. It associates with calmodulin. IQGAP2 contains a single copy of the CH domain at the N-terminus. 156
38077 409125 cd21276 CH_IQGAP3 calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP3. IQ motif containing GTPase activating protein 3 (IQGAP3) associates with Ras GTP-binding proteins. It regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton. It belongs to the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP3 contains a single copy of the CH domain at the N-terminus. 152
38078 409126 cd21277 CH_LMO7 calponin homology (CH) domain found in LIM domain only protein 7. LIM domain only protein 7 (LMO-7), also called F-box only protein 20, or LOMP, is a transcription regulator for expression of many Emery-Dreifuss muscular dystrophy (EDMD)-relevant genes. It binds to alpha-actinin and AF6/afadin at adherens junctions for epithelial cell-cell adhesion. It contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 116
38079 409127 cd21278 CH_LIMCH1 calponin homology (CH) domain found in LIM and calponin homology domains-containing protein 1. LIM and calponin homology domains-containing protein 1 (LIMCH1) acts as an actin stress fiber-associated protein that activates the non-muscle myosin IIa complex by promoting the phosphorylation of its regulatory subunit MRLC/MYL9. It positively regulates actin stress fiber assembly and stabilizes focal adhesions, and therefore negatively regulates cell spreading and cell migration. LIMCH1 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 118
38080 409128 cd21279 CH_TAGLN calponin homology (CH) domain found in transgelin. Transgelin, also called 22 kDa actin-binding protein, protein WS3-10, or smooth muscle protein 22-alpha (SM22-alpha), acts as an actin cross-linking/gelling protein that may be involved in calcium interactions and in regulating contractile properties of the cell. It may also contribute to replicative senescence. Transgelin contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 121
38081 409129 cd21280 CH_TAGLN2 calponin homology (CH) domain found in transgelin-2. Transgelin-2, also called epididymis tissue protein Li 7e, or SM22-alpha homolog, acts as an actin-binding protein that induces actin gelation and regulates the actin cytoskeleton. It may participate in the development and progression of multiple cancers. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 137
38082 409130 cd21281 CH_TAGLN3 calponin homology (CH) domain found in transgelin-3. Transgelin-3, also called neuronal protein 22 (NP22), or neuronal protein NP25, may have a role in alcohol-related adaptations and may mediate regulatory signal transduction pathways in neurons. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38083 409131 cd21282 CH_CNN1 calponin homology (CH) domain found in calponin-1 and similar proteins. Calponin-1 (CNN1), also called basic calponin, or smooth muscle calponin H1, is a thin filament-associated protein that is implicated in the regulation and modulation of smooth muscle contraction. It is capable of binding to actin, calmodulin, troponin C, and tropomyosin. Calponin-1 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 108
38084 409132 cd21283 CH_CNN2 calponin homology (CH) domain found in calponin-2. Calponin-2 (CNN2), also called neutral calponin, or smooth muscle calponin H2, is an actin cytoskeleton-associated regulatory protein that inhibits the activity of myosin-ATPase and cytoskeleton dynamics. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38085 409133 cd21284 CH_CNN3 calponin homology (CH) domain found in calponin-3. Calponin-3 (CNN3), also called acidic isoform calponin, is an F-actin-binding protein that is expressed in the brain and has been shown to control dendritic spine morphology, density, and plasticity by regulating actin cytoskeletal reorganization and dynamics. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs. 111
38086 409134 cd21285 CH_NAV2 calponin homology (CH) domain found in neuron navigator 2. Neuron navigator 2 (NAV2), also called helicase APC down-regulated 1 (HELAD1), pore membrane and/or filament-interacting-like protein 2 (POMFIL2), retinoic acid inducible in neuroblastoma 1 (RAINB1), Steerin-2 (STEERIN2), or Unc-53 homolog 2 (unc53H2), possesses 3' to 5' helicase activity and exonuclease activity. It is involved in neuronal development, specifically in the development of different sensory organs. NAV2 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 121
38087 409135 cd21286 CH_NAV3 calponin homology (CH) domain found in neuron navigator 3. Neuron navigator 3 (NAV3), also called pore membrane and/or filament-interacting-like protein 1 (POMFIL1), Steerin-3 (STEERIN3), or Unc-53 homolog 3 (unc53H3), may regulate IL2 production by T-cells. It may be involved in neuron regeneration. NAV3 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. 105
38088 409136 cd21287 CH_ACTN1_rpt2 second calponin homology (CH) domain found in alpha-actinin-1. Alpha-actinin-1 (ACTN1), also called alpha-actinin cytoskeletal isoform, or non-muscle alpha-actinin-1, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN1 is a bundling protein. Its mutations cause congenital macrothrombocytopenia. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 124
38089 409137 cd21288 CH_ACTN2_rpt2 second calponin homology (CH) domain found in alpha-actinin-2. Alpha-actinin-2 (ACTN2), also called alpha-actinin skeletal muscle isoform 2, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN2 is a bundling protein. Its mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 124
38090 409138 cd21289 CH_ACTN3_rpt2 second calponin homology (CH) domain found in alpha-actinin-3. Alpha-actinin-3 (ACTN3), also called alpha-actinin skeletal muscle isoform 3, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN3 is a bundling protein. It is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 124
38091 409139 cd21290 CH_ACTN4_rpt2 second calponin homology (CH) domain found in alpha-actinin-4. Alpha-actinin-4 (ACTN4), also called non-muscle alpha-actinin 4, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. It is associated with cell motility and cancer invasion. ACTN4 is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38092 409140 cd21291 CH_SpAIN1-like_rpt2 second calponin homology (CH) domain found in Schizosaccharomyces pombe alpha-actinin-like protein 1 and similar proteins. Schizosaccharomyces pombe alpha-actinin-like protein 1 (SpAIN1) binds to actin and is involved in actin-ring formation and organization. It plays a role in cytokinesis and is involved in septation. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38093 409141 cd21292 CH_PLS_rpt1 first calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 145
38094 409142 cd21293 CH_AtFIM_like_rpt1 first calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, and are probably involved in the cell cycle, cell division, cell elongation, and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 116
38095 409143 cd21294 CH_FIMB_rpt1 first calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38096 409144 cd21295 CH_PLS_rpt2 second calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 113
38097 409145 cd21296 CH_AtFIM_like_rpt2 second calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, and are probably involved in the cell cycle, cell division, cell elongation, and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38098 409146 cd21297 CH_FIMB_rpt2 second calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38099 409147 cd21298 CH_PLS_rpt3 third calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 117
38100 409148 cd21299 CH_AtFIM_like_rpt3 third calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, and are probably involved in the cell cycle, cell division, cell elongation, and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 114
38101 409149 cd21300 CH_FIMB_rpt3 third calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38102 409150 cd21301 CH_PLS_rpt4 fourth calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38103 409151 cd21302 CH_AtFIM_like_rpt4 fourth calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 109
38104 409152 cd21303 CH_FIMB_rpt4 fourth calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 108
38105 409153 cd21304 CH_PARVA_B_rpt1 first calponin homology (CH) domain found in the alpha/beta parvin subfamily. The alpha/beta parvin subfamily includes alpha-parvin and beta-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Members of this subfamily contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 107
38106 409154 cd21305 CH_PARVG_rpt1 first calponin homology (CH) domain found in gamma-parvin. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. It contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38107 409155 cd21306 CH_PARVA_B_rpt2 second calponin homology (CH) domain found in the alpha/beta parvin subfamily. The alpha/beta parvin subfamily includes alpha-parvin and beta-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Members of this subfamily contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 121
38108 409156 cd21307 CH_PARVG_rpt2 second calponin homology (CH) domain found in gamma-parvin. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 122
38109 409157 cd21308 CH_FLNA_rpt1 first calponin homology (CH) domain found in filamin-A (FLN-A) and similar proteins. Filamin-A (FLN-A) is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also anchors various transmembrane proteins to the actin cytoskeleton and serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-A contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 129
38110 409158 cd21309 CH_FLNB_rpt1 first calponin homology (CH) domain found in filamin-B (FLN-B) and similar proteins. Filamin-B (FLN-B) is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton. It may promote orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It anchors various transmembrane proteins to the actin cytoskeleton. FLN-B contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 131
38111 409159 cd21310 CH_FLNC_rpt1 first calponin homology (CH) domain found in filamin-C (FLN-C) and similar proteins. Filamin-C (FLN-C), also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. FLN-C contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38112 409160 cd21311 CH_dFLNA-like_rpt1 first calponin homology (CH) domain found in Drosophila melanogaster filamin-A (dFLNA) and similar proteins. Drosophila melanogaster filamin-A (dFLNA or dFLN-A), also called actin-binding protein 280 (ABP-280) or filamin-1, is involved in germline ring canal formation. It may tether actin microfilaments within the ovarian ring canal to the cell membrane and contributes to actin microfilament organization. dFLNA contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 124
38113 409161 cd21312 CH_FLNA_rpt2 second calponin homology (CH) domain found in filamin-A (FLN-A) and similar proteins. Filamin-A (FLN-A) is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also anchors various transmembrane proteins to the actin cytoskeleton and serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-A contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 114
38114 409162 cd21313 CH_FLNB_rpt2 second calponin homology (CH) domain found in filamin-B (FLN-B) and similar proteins. Filamin-B (FLN-B) is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton. It may promote orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It anchors various transmembrane proteins to the actin cytoskeleton. FLN-B contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 110
38115 409163 cd21314 CH_FLNC_rpt2 second calponin homology (CH) domain found in filamin-C (FLN-C) and similar proteins. Filamin-C (FLN-C), also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. FLN-C contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38116 409164 cd21315 CH_dFLNA-like_rpt2 second calponin homology (CH) domain found in Drosophila melanogaster filamin-A (dFLNA) and similar proteins. Drosophila melanogaster filamin-A (dFLNA or dFLN-A), also called actin-binding protein 280 (ABP-280) or filamin-1, is involved in germline ring canal formation. It may tether actin microfilaments within the ovarian ring canal to the cell membrane and contributes to actin microfilament organization. dFLNA contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 118
38117 409165 cd21316 CH_SPTBN1_rpt1 first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 1 (SPTBN1) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN1, also called beta-II spectrin, fodrin beta chain, or spectrin, non-erythroid beta chain 1, is also a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. SPTBN1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 154
38118 409166 cd21317 CH_SPTBN2_rpt1 first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 2 (SPTBN2) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN2, also called beta-III spectrin, or spinocerebellar ataxia 5 protein (SCA5), probably plays an important role in the neuronal membrane skeleton. Mutations in SPTBN2 are associated with spinocerebellar ataxia type 5. SPTBN2 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 132
38119 409167 cd21318 CH_SPTBN4_rpt1 first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 4 (SPTBN4) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN4, also called beta-IV spectrin, or spectrin, non-erythroid beta chain 3 (SPTBN3), is a novel spectrin isolated as an interactor of the receptor tyrosine phosphatase-like protein ICA512. Its mutation associates with congenital myopathy, neuropathy, and central deafness. SPTBN4 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 139
38120 409168 cd21319 CH_SPTB_rpt2 second calponin homology (CH) domain found in spectrin beta chain, erythrocytic (SPTB) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTB, also called beta-I spectrin, may be involved in anaemia pathogenesis. SPTB contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 112
38121 409169 cd21320 CH_SPTBN1_rpt2 second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 1 (SPTBN1) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN1, also called beta-II spectrin, fodrin beta chain, or spectrin, non-erythroid beta chain 1, is also a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. SPTBN1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 108
38122 409170 cd21321 CH_SPTBN2_rpt2 second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 2 (SPTBN2) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN2, also called beta-III spectrin, or spinocerebellar ataxia 5 protein (SCA5), probably plays an important role in the neuronal membrane skeleton. Mutations in SPTBN2 are associated with spinocerebellar ataxia type 5. SPTBN2 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 119
38123 409171 cd21322 CH_SPTBN4_rpt2 second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 4 (SPTBN4) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN4, also called beta-IV spectrin, or spectrin, non-erythroid beta chain 3 (SPTBN3), is a novel spectrin isolated as an interactor of the receptor tyrosine phosphatase-like protein ICA512. Its mutation associates with congenital myopathy, neuropathy, and central deafness. SPTBN4 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 130
38124 409172 cd21323 CH_PLS1_rpt1 first calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 145
38125 409173 cd21324 CH_PLS2_rpt1 first calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 145
38126 409174 cd21325 CH_PLS3_rpt1 first calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 148
38127 409175 cd21326 CH_PLS1_rpt2 second calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 121
38128 409176 cd21327 CH_PLS2_rpt2 second calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38129 409177 cd21328 CH_PLS3_rpt2 second calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 122
38130 409178 cd21329 CH_PLS1_rpt3 third calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 118
38131 409179 cd21330 CH_PLS2_rpt3 third calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 125
38132 409180 cd21331 CH_PLS3_rpt3 third calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs. 134
38133 409181 cd21332 CH_PLS1_rpt4 fourth calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38134 409182 cd21333 CH_PLS2_rpt4 fourth calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38135 409183 cd21334 CH_PLS3_rpt4 fourth calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs. 112
38136 409184 cd21335 CH_PARVA_rpt1 first calponin homology (CH) domain found in alpha-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. It is also involved in the reorganization of the actin cytoskeleton, the formation of lamellipodia and ciliogenesis, as well as in the establishment of cell polarity, cell adhesion, cell spreading, and directed cell migration. Alpha-parvin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 115
38137 409185 cd21336 CH_PARVB_rpt1 first calponin homology (CH) domain found in beta-parvin. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. It is involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia and also plays a role in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Beta-parvin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs. 106
38138 409186 cd21337 CH_PARVA_rpt2 second calponin homology (CH) domain found in alpha-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. It is also involved in the reorganization of the actin cytoskeleton, the formation of lamellipodia and ciliogenesis, as well as in the establishment of cell polarity, cell adhesion, cell spreading, and directed cell migration. Alpha-parvin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 129
38139 409187 cd21338 CH_PARVB_rpt2 second calponin homology (CH) domain found in beta-parvin. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. It is involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia and also plays a role in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Beta-parvin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs. 130
38140 410336 cd21339 PPP2R3 serine/threonine protein phosphatase 2A regulatory subunit B". Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This family includes PP2A regulatory B'' subunits alpha, beta and gamma, encoded by PPP2R3A, PPP2R3B and PPP2R3C, respectively. It also includes subunit delta encoded by PPP2R3D in mouse. These B-family regulatory subunits play various roles including regulation of cytoskeletal assembly, neuronal differentiation, mitogen-activated protein kinase signaling, and apoptosis. Subunits alpha and beta contain a two-domain elongated structure with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity. 259
38141 411060 cd21340 PPP1R42 protein phosphatase 1 regulatory subunit 42. Protein phosphatase 1 regulatory subunit 42 (PPP1R42), also known as leucine-rich repeat-containing protein 67 (lrrc67) or testis leucine-rich repeat (TLRR) protein, plays a role in centrosome separation. PPP1R42 has been shown to interact with the well-conserved signaling protein phosphatase-1 (PP1), thereby increasing PP1's activity, which counters centrosome separation. Inhibition of PPP1R42 expression increases the number of centrosomes per cell while its depletion reduces the activity of PP1 leading to activation of NEK2, the kinase responsible for phosphorylation of centrosomal linker proteins promoting centrosome separation. 220
38142 411061 cd21341 TTC8_N N-terminal domain of tetratricopeptide repeat domain 8. Tetratricopeptide repeat domain 8 (TTC8), also known as BBS8, has been directly linked to Bardet-Biedl syndrome, an autosomal recessive ciliopathy characterized by retinal degeneration, renal failure, obesity, diabetes, male infertility, polydactyly and cognitive impairment. Mutations in BBS8 cause early vision loss. In addition to C-terminal tetratricopeptide repeats, TTC8 also contains an N-terminal domain of unknown function. 139
38143 409247 cd21342 Syt1_2_N N-terminal domain of synaptotagmin-1 and -2. The synaptotagmins are integral membrane proteins of synaptic vesicles thought to serve as Ca(2+) sensors in the process of vesicular trafficking and exocytosis. Calcium binding to synaptotagmin-1 participates in triggering neurotransmitter release at the synapse. In general, synaptotagmins contain 2 calcium binding C2 domains. Synaptotagmin-1 and -2 have an additional N-terminal domain that has been shown to bind to Botulinum neurotoxin B. 93
38144 394805 cd21343 ZBD_UPF1_nv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands, and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and equine arteritis virus (EAV) Nsp10, as well as eukaryotic UPF1 helicase. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. The CH/ZBD of UPF1 interacts with UPF2, a factor also involved in NMD. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). UPF1, SARS-Nsp13 and EAV Nsp10 are multidomain proteins; their other domains include a 1B regulatory domain and a SF1 helicase core. The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 72
38145 394813 cd21344 1B_UPF1_nv_SF1_Hel-like 1B domain of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), Equine arteritis virus (EAV) Nsp10, and eukaryotic UPF1 RNA helicase. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. UPF1, EAV Nsp10 and SARS-Nsp13 are multidomain proteins with an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B domain and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 86
38146 410596 cd21369 cwf21 cwf21 domain. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent its binding to Prp8. The domain is composed of two alpha helices. Proteins containing the cwf21 domain include complexed with CEF1 protein 21 (CWC21) from budding yeast, complexed with cdc5 protein 21 (CWF21) from fission yeast, as well as their orthologs, serine/arginine repetitive matrix proteins (SRRM2 and SRRM3) from vertebrates. This domain family also includes U2-associated protein SR140 from Eumetazoa, protein RRC1, and similar proteins from plants. 48
38147 410597 cd21370 cwf21_SR140 cwf21 domain found in U2-associated protein SR140 and similar proteins. SR140, also called U2 snRNP-associated SURP motif-containing protein, U2SURP, or 140 kDa Ser/Arg-rich domain protein, is a putative splicing factor mainly found in higher eukaryotes. Although it was initially identified as a 17S U2 snRNP-associated protein, the molecular and physiological function of SR140 remains unclear. This model represents the cwf21 domain of SR140 and similar proteins. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 50
38148 410598 cd21371 cwf21_RRC1-like cwf21 domain found in Arabidopsis thaliana protein RRC1 and similar proteins. RRC1, also called reduced red-light responses in cry1cry2 background 1, is a SR-like splicing factor required for phytochrome B (phyB) signal transduction and involved in phyB-dependent alternative splicing. This subfamily also includes protein RRC1-like, which may also function as a SR-like splicing factor. SR family splicing factors are characterized by the presence of a domain rich in arginine and serine dipeptides, called the RS domain. This model represents the cwf21 domain of RRC1 and similar proteins. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 50
38149 410599 cd21372 cwf21_CWC21-like cwf21 domain found in fungal complexed with CEF1 protein 21 (CWC21) and similar proteins. This subfamily includes complexed with CEF1 protein 21 (CWC21) from budding yeast, complexed with cdc5 protein 21 (CWF21) from fission yeast, as well as their orthologs, serine/arginine repetitive matrix proteins (SRRM2 and SRRM3) from vertebrates. Both CWC21 and CWF21 are pre-mRNA-splicing factors that may function at or prior to the first catalytic step of splicing at the catalytic center of the spliceosome, together with ISY1. SRRM2 is required for pre-mRNA splicing as a component of the spliceosome. SRRM3 may play a role in regulating breast cancer cell invasiveness. It may be involved in RYBP-mediated breast cancer progression. Members of this family contain a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 49
38150 410600 cd21373 cwf21_SRRM2-like cwf21 domain found in serine/arginine repetitive matrix proteins, SRRM2, SRRM3 and similar proteins. This subfamily includes SRRM2 and SRRM3, both of which contain a cwf21 domain at the N-terminus. SRRM2, also called 300 kDa nuclear matrix antigen, serine/arginine-rich splicing factor-related nuclear matrix protein of 300 kDa, SR-related nuclear matrix protein of 300 kDa, Ser/Arg-related nuclear matrix protein of 300 kDa, splicing coactivator subunit SRm300, or Tax-responsive enhancer element-binding protein 803 (TaxREB803), is required for pre-mRNA splicing as component of the spliceosome. SRRM3 may play a role in regulating breast cancer cell invasiveness. It may be involved in RYBP-mediated breast cancer progression. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 50
38151 410601 cd21375 cwf21_SRRM2 cwf21 domain found in serine/arginine repetitive matrix protein 2. Serine/arginine repetitive matrix protein 2 (SRRM2) is also called 300 kDa nuclear matrix antigen, serine/arginine-rich splicing factor-related nuclear matrix protein of 300 kDa, SR-related nuclear matrix protein of 300 kDa, Ser/Arg-related nuclear matrix protein of 300 kDa, splicing coactivator subunit SRm300, or Tax-responsive enhancer element-binding protein 803 (TaxREB803). It is required for pre-mRNA splicing as component of the spliceosome. It contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 64
38152 410602 cd21376 cwf21_SRRM3 cwf21 domain found in serine/arginine repetitive matrix protein 3 and similar proteins. Serine/arginine repetitive matrix protein 3 (SRRM3) may play a role in regulating breast cancer cell invasiveness. It may also be involved in RYBP-mediated breast cancer progression. SRRM3 contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 68
38153 411062 cd21378 eIF3E eukaryotic translation initiation factor 3 subunit E. Eukaryotic translation initiation factor 3 subunit E (eIF3E, also called INT6) is a subunit of eIF3, the largest initiation factor. eIF3 is involved in many steps of initiation, including ribosomal recruitment, attachment to mRNA, and scanning. The mammalian eIF3 complex has 13 subunits. Six subunits, including subunit E, contain PCI domains (N-terminal helical repeats and a winged helix domain or WHD) that mediate PCI polymerization. Mammalian eIF3e subunit interacts with eIF3C, eIF3D, eIF3L, and eIF3A subunits, as well as eIF4G and HERC2. It exhibits tumor suppressive or oncogenic functions depending on its expression level and/or tumor type; for example, decreased expression may cause breast cancer or non-small cell lung carcinoma while overexpression is correlated with colon cancer and glioblastoma. Decreased expression of eIF3E may also enable epithelial-mesenchymal transition (EMT), which is involved in adenomyosis by promoting cell invasion, and fibrogenesis by activating the TGF-beta1 signaling pathway. 416
38154 412057 cd21382 RING0_parkin RING finger-like zinc-binding domain 0 of parkin. Parkin, also called Parkinson juvenile disease protein 2, is a RBR (RING1-BRcat-Rcat)-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins. It is involved in regulating mitochondrial quality control. Its activation is a key regulatory event in the pathway to the clearance of depolarized or damaged mitochondria. Parkin contains an N-terminal ubiquitin-like domain, an acid linker, a RING finger-like domain 0 (RING0), and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. This model represents RING0 of parkin. 84
38155 410588 cd21383 GAT_GGA_Tom1-like canonical GAT domain found in eukaryotic ADP-ribosylation factor (Arf)-binding proteins (GGAs), metazoan myb protein 1 (Tom1)-like proteins, and similar proteins. This model represents the canonical GAT (GGA and Tom1) domain found in GGAs from eukaryotes, Tom1-like proteins from metazoa, and LAS seventeen-binding protein 5 (Lsb5p)-like proteins from fungi. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS-domain followed by a GAT domain. Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through their GAT domain. They associate neither with Arf GTPases through their GAT domain nor with acidic cluster-dileucine sequences through their VHS domain. In addition, Tom1 family proteins recruit clathrin onto endosomes through their C-terminal region. The C-terminal clathrin-binding regions of Tom1 and Tom1L2 are similar to each other, but distinguishable from that of Tom1L1. 80
38156 410589 cd21384 GAT_STAM_Vps27-like non-canonical GAT domain found in metazoan signal transducing adapter molecules (STAMs), fungal vacuolar protein sorting-associated protein 27 (Vps27), and similar proteins. This family includes several components of the ESCRT-0 complex, including STAMs, hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs), as well as vacuolar protein sorting-associated protein 27 (Vps27) and class E vacuolar protein-sorting machinery protein Hse1 from fungi. The ESCRT-0 complex binds ubiquitin and acts as a sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members in this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. By contrast, a canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. Hrs together with STAM forms a Hrs/STAM core complex. Vps27, together with Hse1, forms a Vps27/Hse1 core complex. Those complexes consist of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The intertwined GAT heterodimer acts as a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 79
38157 410590 cd21385 GAT_Vps27 non-canonical GAT domain found in fungal vacuolar protein sorting-associated protein 27 (Vps27) and similar proteins. Vps27, also called Golgi retention defective protein 11 (GRD11), is a component of the ESCRT-0 complex which is the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB), and recruits ESCRT-I to the MVB outer membrane. It controls exit from the prevacuolar compartment (PVC) in both the forward direction to the vacuole and the return to the Golgi. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. Vps27, together with another GAT domain-containing protein Hse1, forms a Vps27/Hse1 core complex that consists of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Vps27/Hse1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 84
38158 410591 cd21386 GAT_Hse1 non-canonical GAT domain found in fungal class E vacuolar protein-sorting machinery protein Hse1 and similar proteins. Hse1 is a component of the ESCRT-0 complex which is the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB), and recruits ESCRT-I to the MVB outer membrane. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. Hse1, together with another GAT domain-containing protein Vps27, forms a Vps27/Hse1 core complex that consists of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Vps27/Hse1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 84
38159 410592 cd21387 GAT_Hrs non-canonical GAT domain found in metazoan hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs) and similar proteins. Hrs, also called protein pp110, is a tyrosine kinase substrate in growth factor-stimulated cells. It is involved in intracellular signal transduction mediated by cytokines and growth factors. Hrs is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. Hrs, together with another GAT domain-containing protein STAM, forms a Hrs/STAM core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 96
38160 410593 cd21388 GAT_STAM non-canonical GAT domain found in metazoan signal transducing adapter molecules (STAMs) and similar proteins. STAMs are Hrs-binding proteins involved in intracellular signal transduction mediated by cytokines and growth factors. They are components of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. STAM, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 77
38161 410594 cd21389 GAT_STAM1 non-canonical GAT domain found in signal transducing adapter molecule 1 (STAM-1) and similar proteins. STAM-1 is involved in intracellular signal transduction mediated by cytokines and growth factors. It may also play a role in T-cell development. STAM-1 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this subfamily contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. STAM-1, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM1 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 77
38162 410595 cd21390 GAT_STAM2 non-canonical GAT domain found in signal transducing adapter molecule 2 (STAM-2) and similar proteins. STAM-2 is a Hrs-binding protein involved in intracellular signal transduction mediated by cytokines and growth factors. STAM-2 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. STAM-2, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM2 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM2 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 91
38163 409621 cd21392 IgC2_CD160 Immunoglobulin Constant-2 domain of Cluster of Differentiation 160 (CD160). CD160 is expressed at the cell surface as a tightly disulfide-linked multimer and is tightly associated with peripheral blood NK cells and CD8 T lymphocytes with cytolytic effector activity. It is structurally similar to the B and T lymphocyte attenuator (BTLA), which appears to act as a negative regulator of T cell activation and growth. CD160 is a ligand for HVEM (herpes virus entry mediator), and is considered a proposed immune checkpoint inhibitor with anti-cancer activity along with anti-PD-1 antibodies. CD160 has also been proposed as a potential target in cases of human pathological ocular and tumor neoangiogenesis that do not respond or become resistant to existing antiangiogenic drugs. 119
38164 411063 cd21393 sm_acid_XPC-like small acidic domain of Xeroderma pigmentosum group C complementing protein and similar proteins. This model represents the small acidic domain of mammalian Xeroderma pigmentosum group C complementing protein (XPC), yeast Rad4, and similar proteins. XPC/Rad4 recruits transcription/repair factor IIH (TFIIH) to the nucleotide excision repair (NER) complex through interactions with its p62/Tfb1 and XPB/Ssl2 TFIIH subunits. Global genome repair (GGR), one of two NER initiation pathways in mammals, starts with DNA lesion detection by XPC. XPC is a structure specific DNA-binding factor that recognizes distortion of the damaged DNA double helix and recruits the TFIIH complex onto the lesion to open up the damaged DNA. The small acidic domain of XPC/Rad4 interacts with the pleckstrin homology (PH) domain of the p62/Tfb1 subunit of TFIIH. 42
38165 412027 cd21396 GINS_B beta-strand (B) domain of GINS complex proteins: Sld5, Psf1, Psf2, Psf3, Gins51 and Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both the initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. This complex is found in eukaryotes and archaea, but not in bacteria. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, while in archaea, it consists of two different proteins named Gins51 and Gins23. The archaeal GINS complex can be either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. All GINS subunits are homologous and consist of two domains, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1/Gins51 are permuted with respect to Psf2/Psf3/Gins23. The overall tetrameric assemblies of GINS are similar, but the relative locations of the C-terminal small domains are different with respect to the alpha-helical domain, resulting in different subunit contacts. However, the basic function of GINS in DNA replication is conserved across eukaryotes and archaea. This model represents the beta-strand domain (B-domain) of GINS complex proteins. 49
38166 411064 cd21397 cc_ERCC-6_N coiled-coil domain located near the N-terminus of human Excision Repair Cross Complementing 6 (ERCC-6) and related proteins. This model represents a coiled-coil domain located near the N-terminus of ERCC-6 and related proteins. ERCC-6 (also known as Cockayne syndrome group B, CSB) is a DNA-binding protein important in eukaryotic transcription-coupled repair (TCR). TCR is a well-conserved sub-pathway of nucleotide excision repair (NER) that preferentially removes DNA lesions from the template strand blocking translocation of RNA polymerase II (Pol II). In a model for TCR, the processing Pol II encounters the lesion on the transcribed DNA strand and stalls; it is then displaced by the TCR-initiation complex which includes ERCC-6, ERCC-8, UVSSA and USP7; TCR-specific factors then access the lesion for the DNA damage incision process. The N-terminal region, the ATPase domain and the C-terminal region of ERCC-6 all directly contribute to DNA association and catalytic activity. The ATPase domain functions in concert with either the N- or C-terminal region to mediate UV-induced chromatin association. The N-terminal region prevents ERCC-6 from stably associating with chromatin under normal growth conditions, and the C-terminal region of ERCC-6 promotes stable chromatin association in the presence of lesion-stalled transcription. In addition to this coiled-coil domain, the N-terminal region of ERCC-6 includes two lysine residues subject to SUMOylation, a nucleolar localization signal NoLS1, and a nuclear localization signal NLS1. ERCC-6 also includes a SWI/SNF-like ATPase domain, a nucleotide-binding domain and a ubiquitin-binding domain. This coiled-coil domain binds magnesium. This domain family does not include Saccharomyces cerevisiae RAD26, and Schizosaccharomyces pombe Rhp26. 77
38167 394806 cd21399 ZBD_nv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of nidovirus helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This nidovirus family includes Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and equine arteritis virus (EAV) Nsp10 helicase, and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. SARS-Nsp13 and EAV Nsp10 are multidomain proteins; their other domains include a 1B regulatory domain and a SF1 helicase core. 71
38168 394807 cd21400 ZBD_UPF1-like Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. UPF1 belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. The N-terminal CH/ZBD of UPF1 interacts with UPF2, a factor also involved in NMD. UPF1 has an N-terminal CH/ZBD, a 1B domain, and a SF1 helicase core. 120
38169 394808 cd21401 ZBD_cv_Nsp13-like Cys/His rich zinc-binding domain (CH/ZBD) of coronavirus SARS NSP13 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This coronavirus family includes Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. SARS-Nsp13 has an N-terminal CH/ZBD, a stalk domain, a 1B regulatory domain, and SF1 helicase core. 95
38170 394809 cd21402 ZBD_mv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of mesnidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This mesnidovirus group includes the Bontag Baru virus (BBaV) replication helicase encoded on ORF1b and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this group belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 111
38171 394810 cd21403 ZBD_tv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of tornidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This tornidovirus group includes White bream virus (WBV) SF1 helicase encoded on ORF1b and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 95
38172 394811 cd21404 ZBD_rv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of ronidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This ronidovirus family includes Gill-associated virus (GAV) replication helicase encoded on ORF1 and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 105
38173 394812 cd21405 ZBD_av_Nsp10-like Cys/His rich zinc-binding domain (CH/ZBD) of arterivirus EAV Nsp10 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this arnidovirus group belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 62
38174 394814 cd21406 1B_nv_SF1_Hel-like 1B domain of nidovirus helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this nidoviral family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and Equine arteritis virus (EAV) Nsp10. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). They belong to a larger SF1 helicase family which also includes eukaryotic UPF1-like helicases. UPF1, EAV Nsp10 and SARS-Nsp13 are multidomain proteins with an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B domain and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 79
38175 394815 cd21407 1B_UPF1-like 1B domain of eukaryotic UPF1 RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. UPF1 belongs to helicase superfamily 1 (SF1). It participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. UPF1 is a multidomain protein; it includes an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a regulatory 1B domain, and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 90
38176 394816 cd21408 1B_Sen1p-like 1B domain of Saccharomyces cerevisiae Sen1p RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Sen1p belongs to a UPF1-like family of helicase superfamily 1 (SF1). UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons; Sen1p plays a role in the termination of non-coding transcription. UPF1 is a multidomain protein; it includes an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B regulatory domain, and a SF1 helicase core. Sen1p has a similar domain organization and helicase mechanism to UPF1. However, it has distinct structural features including a more elaborate topology of the 1B barrel domain, and a distinct function from UPF1, an ATPase-dependent ability of promoting transcription termination in vitro. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 106
38177 394817 cd21409 1B_cv_Nsp13-like 1B domain of coronavirus SARS NSP13 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include coronavirus helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13). SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 79
38178 394818 cd21410 1B_av_Nsp10-like 1B domain of arterivirus EAV Nsp10 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. EAV Nsp10 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 49
38179 411058 cd21411 NucC cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease. 225
38180 394819 cd21413 unc_tv_SF1_Hel-like uncharacterized domain which connects the Cys/His rich zinc-binding (ZBD) and linker to the first helicase domain of tornidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Helicases in this family belong to helicase superfamily 1 (SF1) and include tornidovirus helicases such as Breda virus serotype 1 (BoTV-1) SF1 helicase encoded on ORF1b. They are related to the SF1 family nidoviral replication helicases which include Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (ZBD) and a SF1 helicase core. The location of the uncharacterized domain represented in this tornidovirus group resembles that of the 1B domain in SARS-Nsp13 helicase; it connects the zinc-binding domain (ZBD) and linker to the first helicase domain. 79
38181 394820 cd21414 unc_rv_SF1_Hel-like uncharacterized domain which connects the Cys/His rich zinc-binding domain (ZBD) and linker to the first helicase domain of ronidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include ronidovirus helicases such as Gill-associated virus (GAV) replication helicase encoded on ORF1b. They are related to the SF1 family nidoviral replication helicases which include Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The location and orientation of the uncharacterized domain represented in this ronidovirus group resembles that of the 1B domain in SARS-Nsp13 helicase; it connects the Cys/His zinc-binding domain (ZBD) and linker to the first helicase domain. 56
38182 411065 cd21416 HDC_protein histidine decarboxylase (HDC) gene cluster protein. This model includes an uncharacterized histidine decarboxylase (HDC) gene cluster protein. Lactobacillus parabuchneri is one of the major causes of elevated histamine levels in cheese. Histamine positive strain FAM21731 from L. parabuchneri has been shown to contain a histidine decarboxylase gene cluster present on a genomic island, which seems to have undergone transfer within the same species as well as between other lactobacilli. 380
38183 411066 cd21417 AvrRxo1 AvrRxo1, a type III effector with a polynucleotide kinase domain. This family contains AvrRxo1-ORF1 (also called AvrRxo1) which has been shown to be a type III-secreted virulence factor in Xanthomonas oryzae (Xoc) that causes bacterial leaf streak (BLS) disease in rice plants. AvrRxo1-ORF1 delivery in rice plant cells is recognized by disease resistance protein Rxo1, which triggers resistance to BLS disease. In the Xoc genome, AvrRxo1-ORF1 is adjacent to AvrRxo1-ORF2 (also called AvrRxo1-required chaperone, or Arc1) which appears to act as a molecular chaperone. AvrRxo1 has a T4 polynucleotide kinase (T4pnk) domain, while Arc1 has a kinase-binding domain with a structure that is atypical of effector-binding chaperones. AvrRxo1 and Arc1 comprise a toxin-antitoxin system similar to members of the zeta-epsilon family, with AvrRxo1 acting as the toxin. 330
38184 411067 cd21418 Arc1 Arc1, AvrRxo1-required chaperone. This family contains AvrRxo1-ORF2 (also called AvrRxo1-required chaperone or Arc1) which appears to act as a molecular chaperone for AvrRxo1-ORF1 (also called AvrRxo1), a type III-secreted virulence factor in Xanthomonas oryzae (Xoc), a bacterium that causes leaf streak (BLS) disease in rice plants. AvrRxo1-ORF1 delivery in rice plant cells is recognized by disease resistance protein Rxo1, which triggers resistance to BLS disease. In the Xoc genome, the Arc1 gene is found adjacent to AvrRxo1; Arc1 functions to suppress the bacteriostatic activity of AvrRxo1-ORF1 in bacterial cells. Arc1 has a kinase-binding domain with a structure that is atypical of effector-binding chaperones, while AvrRxo1 has a T4 polynucleotide kinase (T4pnk) domain. AvrRxo1 and Arc1 comprise a toxin-antitoxin system similar to members of the zeta-epsilon family, with Arc1 acting as the antitoxin. 97
38185 412058 cd21422 GatF mitochondrial glutamyl-tRNA(Gln) amidotransferase subunit F. Glutamyl-tRNA(Gln) amidotransferase subunit F (GatF), also called Glu-AdT subunit F, is the connector subunit of yeast mitochondrial tRNA-dependent amidotransferase (AdT) that is also composed of the GatA and GatB subunits. GatA and GatB are well conserved among bacteria and eukaryota, but the GatF subunit is a fungi-specific ortholog of the GatC subunit found in all other known heterotrimeric AdTs. AdT allows the formation of correctly charged Gln-tRNA(Gln) through the transamidation of misacylated Glu-tRNA(Gln) in the mitochondria. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). This model corresponds to the GatF subunit, which can be divided into two halves, the C-terminal GatC-like portion and the N-terminal appended domain (NTD). 128
38186 410603 cd21435 SUN_cc1 coiled-coil domain 1 of SUN domain-containing proteins. SUN (Sad1 and UNC-84) proteins (SUN1 and SUN2) are components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN proteins contain two coiled-coil domains (CC1 and CC2), which act as intrinsic dynamic regulators controlling the activity of the SUN domain. The model corresponds to CC1 that functions as an activation segment to release CC2-mediated inhibition of the SUN domain. 55
38187 410604 cd21438 SUN2_cc1 coiled-coil domain 1 of SUN domain-containing protein 2 and similar proteins. SUN domain-containing protein 2 (SUN2), also called protein unc-84 homolog B, Rab5-interacting protein (Rab5IP), or Sad1/unc-84 protein-like 2, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN2 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that functions as an activation segment to release CC2-mediated inhibition of the SUN domain. 55
38188 410605 cd21439 SUN1_cc1 coiled-coil domain 1 of SUN domain-containing protein 1 and similar proteins. SUN domain-containing protein 1 (SUN1), also called protein unc-84 homolog A, or Sad1/unc-84 protein-like 1, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN1 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that may function as an activation segment to release CC2-mediated inhibition of the SUN domain. 55
38189 410607 cd21440 KLF8_N N-terminal domain of Kruppel-like factor 8. Kruppel-like factor 8 (also known as Krueppel-like transcription factor 8, KLF8) is a CACCC-box binding protein that associates with C-terminal Binding Protein (CtBP) and represses transcription. It plays an essential role in the regulation of the cell cycle, apoptosis, and differentiation. It has been identified as a key component of the transcription factor network that controls terminal differentiation during adipogenesis. It also plays an important role in the formation of several human tumors, including the promotion of tumorigenesis, invasion, and metastasis of colorectal cancer cells, and the progression of pancreatic cancer. KLF8 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. KLF8 contains an N-terminal repression domain that is related to that of KLF12. 169
38190 410608 cd21441 KLF12_N N-terminal domain of Kruppel-like factor 12. Kruppel-like factor 12 (also known as Krueppel-like transcription factor 12, KLF12) regulates, by transcriptionally repressing Nur77 expression, endometrial decidualization, which is a prerequisite for successful implantation and the establishment of pregnancy. It is involved in the maturation processes of kidney collecting ducts after birth, and is able to increase the promoter activity of the UT-A1 urea transporter promoter by binding to the CACCC motif. KLF12 has also been found to promote colorectal cancer growth and is also involved in the invasion and apoptosis of basal-like breast carcinoma. KLF12 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. KLF12 contains an N-terminal domain that is related to the N-terminal repression domain of KLF8. 197
38191 410568 cd21442 SNARE_NTD_STX6-like N-terminal domain of syntaxin-6 and similar proteins. The family includes soluble NSF attachment protein receptor (SNARE) proteins, syntaxin-6 (STX6) and syntaxin-10 (STX10), and their homologs found in fungi and plants, such as Tlg1p, AtSYP61, and similar proteins. STX6 is involved in intracellular vesicle trafficking. STX10, also called Syn10, is involved in vesicular transport from the late endosomes to the trans-Golgi network. Tlg1p, also called syntaxin TLG1, is a SNARE protein (of Qc type) involved in membrane fusion, probably in retrograde traffic of cytosolic double-membrane vesicles derived from both early and possibly late endosomes/PVC (prevacuolar compartment) back to the trans-Golgi network (TGN or late Golgi). It has been reported to function both as a (target membrane) t-SNARE and as a (vesicle) v-SNARE. AtSYP61, also called osmotic stress-sensitive mutant 1 (OSM1), is a vesicle trafficking syntaxin protein that functions in the secretory pathway. It is involved in osmotic stress tolerance and in abscisic acid (ABA) regulation of stomatal responses in Arabidopsis. 103
38192 410569 cd21443 SNARE_NTD_STX6_STX10 N-terminal domain of syntaxin-6, syntaxin-10, and similar proteins. This subfamily includes two soluble NSF attachment protein receptor (SNARE) proteins, syntaxin-6 (STX6) and syntaxin-10 (STX10). STX6 is involved in intracellular vesicle trafficking. STX10, also called Syn10, is involved in vesicular transport from the late endosomes to the trans-Golgi network. This model corresponds to the N-terminal domain of STX6 and STX10, which is a regulatory domain named Habc. 103
38193 410570 cd21444 SNARE_NTD_Tlg1p-like N-terminal domain of t-SNARE affecting a late Golgi compartment protein 1 and similar proteins. t-SNARE affecting a late Golgi compartment protein 1 (Tlg1p), also called syntaxin TLG1, is a soluble NSF attachment protein receptor (SNARE) protein (of Qc type) involved in membrane fusion, probably in retrograde traffic of cytosolic double-membrane vesicles derived from both early and possibly late endosomes/PVC (prevacuolar compartment) back to the trans-Golgi network (TGN or late Golgi). It has been reported to function both as a (target membrane) t-SNARE and as a (vesicle) v-SNARE. The model corresponds to the N-terminal domain of Tlg1p, which consists of a three-helix bundle. 93
38194 410571 cd21445 SNARE_NTD_AtSYP61-like N-terminal domain of Arabidopsis thaliana syntaxin-61 and similar proteins. Arabidopsis thaliana syntaxin-61 (AtSYP61), also called osmotic stress-sensitive mutant 1 (OSM1), is a vesicle trafficking syntaxin protein that functions in the secretory pathway. It is involved in osmotic stress tolerance and in abscisic acid (ABA) regulation of stomatal responses in Arabidopsis. This model corresponds to the N-terminal domain of AtSYP61, which shows high sequence similarity with the N-terminal domain of yeast Tlg1p, a soluble NSF attachment protein receptor (SNARE) protein (of Qc type) involved in membrane fusion. 99
38195 410572 cd21446 SNARE_NTD_STX10 N-terminal domain of syntaxin-10. Syntaxin-10 (STX10), also called Syn10, is part of a soluble NSF attachment protein receptor (SNARE) complex involved in vesicular transport from the late endosomes to the trans-Golgi network, such as the transport of mannose 6-phosphate receptors from endosomes to the Golgi after delivering lysosomal enzymes to the endocytic pathway. This model corresponds to the N-terminal domain of STX10, which is a regulatory domain named Habc. 103
38196 410573 cd21447 SNARE_NTD_STX6 N-terminal domain of syntaxin-6. Syntaxin-6 (STX6) is a component of a soluble NSF attachment protein receptor (SNARE) complex involved in intracellular vesicle trafficking and in the fusion of retrograde transport carriers with the trans-Golgi network (TGN). This model corresponds to N-terminal domain of STX6, which is a regulatory domain named Habc. 103
38197 411996 cd21448 DLC-like_DYNLT1-like dynein light chain (DLC)-like domain found in the dynein light chain Tctex-type 1 (DYNLT1) subfamily and similar proteins. The dynein light chain Tctex-type 1 (DYNLT1) subfamily includes two isoforms, DYNLT1 and DYNLT3, which contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The dynein complex contains either DYNLT1 or DYNLT3, but not both. The family also includes Schizosaccharomyces pombe dynein light chain Tctex-type (SpDlc1), Saccharomyces cerevisiae topoisomerase I damage affected protein 2 (TDA2) and similar proteins. SpDlc1 belongs to the 14-kDa Tctex-1 dynein light chain family. It acts as a non-catalytic accessory component of a dynein complex. It is required for regular oscillatory nuclear movement and efficient recombination during meiotic prophase in fission yeast. TDA2 is a novel protein of the endocytic machinery necessary for normal internalization of native cargo in yeast. It works independently of the dynein motor complex and microtubules. 98
38198 411997 cd21449 DLC-like_SF dynein light chain (DLC)-like domain superfamily. The superfamily corresponds to a class of proteins containing a dynein light chain (DLC)-like domain with anti-parallel beta-sheet packed against an alpha-helical hairpin. DLC-like domain-containing proteins include cytoplasmic dynein light chain DYNLL1 and DYNLL2, axonemal dynein light chain 4 (DNAL4), tegumental-allergen-like proteins (TALs), dynein light chain Tctex-type DYNLT1 and DYNLT3, as well as Tctex1 domain-containing proteins (TCTEX1D). Both DYNLL1 and DYNLL2 are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. DNAL4 is a force generating protein of respiratory cilia. TALs may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes. DYNLT1 and DYNLT3 contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex, which contains either DYNLT1 or DYNLT3, but not both. The TCTEX1D family includes TCTEX1D1-4. TCTEX1D1 is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). TCTEX1D2 is required for proper retrograde ciliary transport. TCTEX1D3 may be an accessory component of axonemal dynein and cytoplasmic dynein 1. TCTEX1D4 is a novel protein phosphatase 1 (PPP1) interactor. 96
38199 411998 cd21450 DLC-like_DYNLL1-like dynein light chain (DLC)-like domain found in cytoplasmic dynein light chain 1 (DYNLL1), axonemal dynein light chain 4 (DNAL4), tegumental-allergen-like proteins (TALs) and similar proteins. The family includes cytoplasmic dynein light chain 1 (DYNLL1), DYNLL2, axonemal dynein light chain 4 (DNAL4), and tegumental-allergen-like proteins (TALs). DYNLL1, also called protein inhibitor of neuronal nitric oxide synthase (PIN), or 8 kDa dynein light chain (DLC8), or dynein light chain LC8-type 1 (DLC1), is one of several non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. It may play a role in changing or maintaining the spatial distribution of cytoskeletal structures. DYNLL2, also called cytoplasmic dynein light chain 2, or 8 kDa dynein light chain b (DLC8b), or dynein light chain LC8-type 2 (DLC2), is one of several non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. DNAL4 is a force generating protein of respiratory cilia. It produces force towards the minus ends of microtubules. TALs, also called tegument antigens, are characterized by two N-terminal EF-hand motifs and a C-terminal region resembling a dynein light chain (DLC)-like domain. They were mainly found in parasitic platyhelminth species. TALs are strongly associated with the tegument, a syncytial structure that forms the outer layer of the organism. They may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes. 68
38200 411999 cd21451 DLC-like_TCTEX1D dynein light chain (DLC)-like domain found in the Tctex1 domain-containing protein (TCTEX1D) family. The Tctex1 domain-containing protein (TCTEX1D) family includes TCTEX1D1-4. TCTEX1D1 is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). It can interact with ZMYND10 that stabilizes intermediate chain proteins in the cytoplasmic pre-assembly of dynein arms. TCTEX1D2 is required for proper retrograde ciliary transport. It associates with short-rib polydactyly syndrome proteins, such as Wdr34, Wdr60, and other dynein complex 1 and 2 subunits, and is required for ciliogenesis. TCTEX1D2 is a negative regulator of GLUT4 translocation and glucose uptake. TCTEX1D3, also called T-complex testis-specific protein 3, or T-complex-associated testis-expressed protein 3 (Tcte-3), may be an accessory component of axonemal dynein and cytoplasmic dynein 1. TCTEX1D4, also called protein N22.1, or Tctex-2-beta, is a novel protein phosphatase 1 (PPP1) interactor. It also interacts with ENG/endoglin, TGFBR2, and TGFBR3. The distribution of TCTEX1D4 in testis suggests its involvement in distinct functions, such as TGFbeta signaling at the blood-testis barrier and acrosome cap formation. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1Ds. 101
38201 412000 cd21452 DLC-like_DYNLL1_DYNLL2 dynein light chain (DLC)-like domain found in the cytoplasmic dynein light chain 1 (DYNLL1) family. The cytoplasmic dynein light chain 1 (DYNLL1) family includes DYNLL1 and DYNLL2. DYNLL1, also called protein inhibitor of neuronal nitric oxide synthase (PIN), or 8 kDa dynein light chain (DLC8), or dynein light chain LC8-type 1 (DLC1), acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. It may play a role in changing or maintaining the spatial distribution of cytoskeletal structures. Both DYNLL1 and DYNLL2 (also called 8 kDa dynein light chain b, or DLC8b, or dynein light chain LC8-type 2, or DLC2) are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The model corresponds to the dynein light chain (DLC)-like domain of DYNLL1 and DYNLL2. 84
38202 412001 cd21453 DLC-like_DNAL4 dynein light chain (DLC)-like domain found in axonemal dynein light chain 4 (DNAL4) and similar proteins. Axonemal dynein light chain 4 (DNAL4) is a force generating protein of respiratory cilia. It produces force towards the minus ends of microtubules. The model corresponds to the dynein light chain (DLC)-like domain of DNAL4. 83
38203 412002 cd21454 DLC-like_TAL dynein light chain (DLC)-like domain found in the family of tegumental-allergen-like proteins (TALs). Tegumental-allergen-like proteins (TALs), also called tegument antigens, are characterized by two N-terminal EF-hand motifs and a C-terminal region resembling a dynein light chain (DLC)-like domain. They were mainly found in parasitic platyhelminth species. TALs are strongly associated with the tegument, a syncytial structure that forms the outer layer of the organism. They may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes. The model corresponds to the dynein light chain (DLC)-like domain of TAL. 87
38204 412003 cd21455 DLC-like_DYNLT1_DYNLT3 dynein light chain (DLC)-like domain found in the dynein light chain Tctex-type 1 (DYNLT1) family. The dynein light chain Tctex-type 1 (DYNLT1) family includes two isoforms, DYNLT1 and DYNLT3, which contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The dynein complex contains either DYNLT1 or DYNLT3, but not both. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT1 and DYNLT3. 97
38205 412004 cd21456 DLC-like_SpDlc1-like dynein light chain (DLC)-like domain found in Schizosaccharomyces pombe dynein light chain Tctex-type (SpDlc1) and similar proteins. Schizosaccharomyces pombe dynein light chain 1 (SpDlc1) belongs to the 14-kDa Tctex-1 dynein light chain family. It acts as a non-catalytic accessory component of a dynein complex. It is required for regular oscillatory nuclear movement and efficient recombination during meiotic prophase in fission yeast. The model corresponds to the dynein light chain (DLC)-like domain found in SpDlc1 and similar proteins. 110
38206 412005 cd21457 DLC-like_TDA2 dynein light chain (DLC)-like domain found in topoisomerase I damage affected protein 2 (TDA2) and similar proteins. Topoisomerase I damage affected protein 2 (TDA2) is a novel protein of the endocytic machinery necessary for normal internalization of native cargo in yeast. It works independently of the dynein motor complex and microtubules. The model corresponds to the dynein light chain (DLC)-like domain of TDA2. 108
38207 412006 cd21458 DLC-like_TCTEX1D1 dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 1 (TCTEX1D1) and similar proteins. Tctex1 domain-containing protein 1 (TCTEX1D1) is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). It can interact with ZMYND10, which stabilizes intermediate chain proteins in the cytoplasmic pre-assembly of dynein arms. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D1. 104
38208 412007 cd21459 DLC-like_TCTEX1D2 dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 2 (TCTEX1D2) and similar proteins. Tctex1 domain-containing protein 2 (TCTEX1D2) is required for proper retrograde ciliary transport. It associates with short-rib polydactyly syndrome proteins, such as Wdr34, Wdr60, and other dynein complex 1 and 2 subunits, and is required for ciliogenesis. TCTEX1D2 is a negative regulator of GLUT4 translocation and glucose uptake. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D2. 104
38209 412008 cd21460 DLC-like_TCTEX1D3 dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 3 (TCTEX1D3) and similar proteins. Tctex1 domain-containing protein 3 (TCTEX1D3), also called T-complex testis-specific protein 3, or T-complex-associated testis-expressed protein 3 (Tcte-3), may be an accessory component of axonemal dynein and cytoplasmic dynein 1. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D3. 112
38210 412009 cd21461 DLC-like_TCTEX1D4 dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 4 (TCTEX1D4) and similar proteins. Tctex1 domain-containing protein 4 (TCTEX1D4), also called protein N22.1, or Tctex-2-beta, is a novel protein phosphatase 1 (PPP1) interactor. It also interacts with ENG/endoglin, TGFBR2 and TGFBR3. The distribution of TCTEX1D4 in testis suggests its involvement in distinct functions, such as TGFbeta signaling at the blood-testis barrier and acrosome cap formation. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D4. 114
38211 412010 cd21462 DLC-like_DYNLT1 dynein light chain (DLC)-like domain found in dynein light chain Tctex-type 1 (DYNLT1) and similar proteins. Dynein light chain Tctex-type 1 (DYNLT1), also called TCTEL1, or TCTEX1, or protein CW-1, or T-complex testis-specific protein 1 homolog, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that is thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It plays a role in neuronal morphogenesis. It is involved in intracellular targeting of D-type retrovirus gag polyproteins to the cytoplasmic assembly site. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT1. 102
38212 412011 cd21463 DLC-like_DYNLT3 dynein light chain (DLC)-like domain found in dynein light chain Tctex-type 3 (DYNLT3) and similar proteins. Dynein light chain Tctex-type 3 (DYNLT3), also called rp3, or protein 91/23, or T-complex-associated testis-expressed 1-like, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that is thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It has a potential role in chromosome congression in human mitosis and is required for chromosome alignment during mouse oocyte meiotic maturation. The DYNLT3 light chain directly links cytoplasmic dynein to a spindle checkpoint protein, Bub3. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT3. 97
38213 410550 cd21464 7tm_GPR137 GPR137 family belonging to the seven-transmembrane G protein-coupled receptor superfamily. The GPR137 family includes GPR137A, GPR137B, and GPR137C, which are all orphan G protein-coupled receptors (GPCRs). GPR137A, also called GPR137 or transmembrane 7 superfamily member 1-like 1 protein (TM7SF1L1), is expressed in the central nervous system (CNS), endocrine gland, thymus, and lung. It is associated with different cancers including gastric cancer, pancreatic cancer, colon cancer, and malignant glioma. GPR137B, also called transmembrane 7 superfamily member 1 (TM7SF1), is a lysosome integral membrane protein that is strongly expressed in the heart, liver, kidney, and brain. It is associated with M2 macrophage polarization, and has been shown to perform a regulatory function in controlling dynamic Rag and mTORC1 localization and activity, as well as lysosome morphology. GPR137C, also called transmembrane 7 superfamily member 1-like 2 protein (TM7SF1L2), may be a key player in the prognosis of small cell lung cancer. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 310
38214 410574 cd21465 LPS_wlbK-like Bordetella wlbK gene product domains involved in bacterial polysaccharide synthesis, and similar domains. This model includes gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussishu. Studies show that mutations in wlbJK do not affect LPS biosynthesis, but the function of wlbJK remains unclear. 154
38215 394821 cd21466 Ubl2_cv_PLpro_N_Nsp3-like second ubiquitin-like (Ubl) domain located N-terminal to the coronavirus SARS-CoV papain-like protease (PLpro) domain in the non-structural protein 3 (Nsp3) and related proteins. Severe acute respiratory syndrome coronavirus (SARS-CoV) non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the SARS-CoV genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pps), pp1a and pp1ab. Papain-like protease (PLpro) is one of two SARS-CoV proteases which process these polyproteins; it cleaves pp1a at three sites, releasing Nsp1, Nsp2, and Nsp3. Nsp3 is a large multi-functional multi-domain protein which is an essential component of the replication/transcription complex (RTC). This ubiquitin-like (Ubl) domain (sometimes referred to as Ubl2, the second Ubl domain of Nsp3) is located N-terminal to the PLpro domain of Nsp3. In addition to being a protease, SARS-CoV PLpro is a deubiquitinating enzyme (DUB), and may be involved in subverting cellular ubiquitination machinery to facilitate viral replication. A number of cellular DUBs have a Ubl domain, where it may serve a regulatory function. The exact functional role of this Ubl domain is unclear. 54
38216 394822 cd21467 Ubl1_cv_Nsp3_N-like first ubiquitin-like (Ubl) domain located at the N-terminus of coronavirus SARS-CoV non-structural protein 3 (Nsp3) and related proteins. This ubiquitin-like (Ubl) domain (Ubl1) is found at the N-terminus of coronavirus Nsp3, a large multi-functional multi-domain protein which is an essential component of the replication/transcription complex (RTC). The functions of Ubl1 in CoVs are related to single-stranded RNA (ssRNA) binding and to interacting with the nucleocapsid (N) protein. SARS-CoV Ubl1 has been shown to bind ssRNA having AUA patterns, and since the 5'-UTR of the SARS-CoV genome has a number of AUA repeats, it may bind there. In mouse hepatitis virus (MHV), this Ubl1 domain binds the cognate N protein. Adjacent to Ubl1 is a Glu-rich acidic region (also referred to as hypervariable region, HVR); Ubl1 together with HVR has been called Nsp3a. Currently, the function of HVR in CoVs is unknown. This model corresponds to one of two Ubl domains in Nsp3; the other is located N-terminal to the papain-like protease (PLpro) and is not represented by this model. 89
38217 410575 cd21468 LPS_wlbK_N-like N-terminal domain of Bordetella wlbK gene product involved in bacterial polysaccharide synthesis, and similar domains. This model includes the N-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussishu. Studies show that mutations in wlbJK do not affect LPS biosynthesis, but the function of wlbJK remains unclear. 154
38218 410576 cd21469 LPS_wlbK_C-like C-terminal domain of Bordetella wlbK gene product involved in bacterial polysaccharide synthesis, and similar domains. This model includes the C-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussishu. Studies show that mutations in wlbJK do not affect LPS biosynthesis, but the function of wlbJK remains unclear. 154
38219 394823 cd21470 CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of coronavirus spike (S) proteins. This family contains the receptor-binding domain (RBD) of the S1 subunit of coronavirus (CoV) spike (S) proteins from three highly pathogenic human coronaviruses (CoVs), including Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as a 2019 novel coronavirus (2019-nCoV), as well as S proteins from related coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. MHV uses mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a) as the receptor, and the receptors for SARS-CoV and MERS-CoV are human angiotensin-converting enzyme 2 (ACE2) and human dipeptidyl peptidase 4 (DPP4), respectively. 
Recent studies found that the RBD of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 171
38220 409193 cd21471 CrtC-like carotenoid 1,2-hydratase and similar proteins. Carotenoid 1,2-hydratase (EC 4.2.1.131; CrtC; also known as acyclic carotenoid 1,2-hydratase, 1-hydroxyneurosporene hydratase, hydroxylycopene hydratase, hydroxyneurosporene synthase, lycopene hydratase, or neurosporene hydratase) is an enzyme with the systematic name lycopene hydro-lyase (1-hydroxy-1,2-dihydrolycopene-forming). It is involved in the biosynthesis of carotenoids such as lycopenes. It catalyzes the hydration of neurosporene to the corresponding hydroxylated carotenoids 1-HO-neurosporene and that of lycopene to 1-HO-lycopene. Studies suggest that CrtC may be bound to the membrane through an anchor so that a close distance to the substrate, which is synthesized in the cell membranes, is facilitated. 276
38221 411068 cd21472 NocO-like cyanobacterial NocO and similar proteins. This family includes many uncharacterized proteins similar to cyanobacterial NocO and NocN, which are involved in the synthesis of natural oxadiazines such as nocuolin A (NoA, exhibits anti-proliferative activity against human cancer cell lines). Members are also similar to cyanobacterial ColD and ColE, putative acyl halogenases involved in columbamide biosynthesis. 356
38222 394836 cd21473 cv_Nsp4_TM coronavirus non-structural protein 4 (Nsp4) transmembrane domain. Nsp4 may be involved in coronavirus-induced membrane remodeling. In order to assemble the replication-transcription complex (RTC), coronavirus induces the rearrangement of host endoplasmic reticulum (ER) membrane into double membrane vesicles (DMVs), zippered ER, or ER spherules. DMV formation has been observed in SARS-CoV cells overexpressing the three transmembrane-containing non-structural proteins of viral replicase polyprotein 1ab: Nsp3, Nsp4 and Nsp6. Together, Nsp3, Nsp4, and Nsp6 have the ability to induce the formation of DMVs that are similar to those seen in SARS-CoV-infected cells. 376
38223 410551 cd21474 7tm_GPR137A Integral membrane protein GPR137A, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137A, also called GPR137 or transmembrane 7 superfamily member 1-like 1 protein (TM7SF1L1), is an orphan G protein-coupled receptor (GPCR) expressed in the central nervous system (CNS), endocrine gland, thymus, and lung. It is associated with different cancers including gastric cancer, pancreatic cancer, colon cancer, and malignant glioma. It is highly expressed in ovarian cancer and plays a pro-oncogenic role in the disease, promoting cell proliferation and metastasis through regulation of the PI3K/AKT pathway. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 310
38224 410552 cd21475 7tm_GPR137C Integral membrane protein GPR137C, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137C, also called transmembrane 7 superfamily member 1-like 2 protein (TM7SF1L2), is an orphan G protein-coupled receptor (GPCR) of unknown function. Bioinformatics analysis identified it as a likely key player in the prognosis of small cell lung cancer. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 319
38225 410553 cd21476 7tm_GPR137B Integral membrane protein GPR137B, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137B, also called transmembrane 7 superfamily member 1 (TM7SF1), is a lysosome integral membrane protein that is strongly expressed in the heart, liver, kidney, and brain. It is an orphan G protein-coupled receptor (GPCR) associated with M2 macrophage polarization, and has been shown to perform a regulatory function in controlling dynamic Rag and mTORC1 localization and activity, as well as lysosome morphology. It also plays a role in bone remodeling in mouse and zebrafish, functioning as a negative regulator of osteoclast activity essential for normal resorption and patterning of the skeleton. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. 320
38226 394824 cd21477 SARS-CoV-like_Spike_S1_RBD receptor-binding domain of the S1 subunit of severe acute respiratory syndrome-related coronavirus Spike (S) protein and similar proteins. This subfamily contains the receptor-binding domain of the S1 subunit of coronavirus (CoV) spike (S) proteins from highly pathogenic human virus, severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), SARS coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV), and other SARS-like coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2 and SARS-CoV use the C-domain to bind their receptors. Recent studies found that the receptor-binding domain (RBD) of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. 
This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV, SARS-CoV-2, and most CoVs. 205
38227 394825 cd21478 HKU1-like_CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1 and related coronaviruses. This family contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, human coronavirus OC43 (HCoV-OC43), mouse hepatitis virus (MHV), porcine hemagglutinating encephalomyelitis virus (HEV), and other related coronaviruses. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease. HCoV-OC43 is of zoonotic origin and is endemic in the human population, causing mild respiratory tract infections and possible severe complications or fatalities in young children, the elderly, and immunocompromised individuals. MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract. Porcine HEV is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. These viruses are related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). 
Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBDs of MHV and HCoV-OC43 are located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 223
38228 394826 cd21479 MERS-like_CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from Middle East respiratory syndrome coronavirus. This family contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from the human coronavirus that causes Middle East Respiratory Syndrome (MERS-CoV) and related coronaviruses from animals. MERS-CoV causes severe pulmonary disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including MERS-CoV, use the C-domain to bind their receptors. MERS-CoV uses human dipeptidyl peptidase 4 (DPP4), also called CD26, as its receptor. It binds DPP4 through the RBD of its S1 subunit and then fuses viral and host membranes through its S2 subunit. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV. 216
38229 394827 cd21480 SARS-CoV-2_Spike_S1_RBD receptor-binding domain of the S1 subunit of severe acute respiratory syndrome coronavirus 2 Spike (S) protein. This group contains the receptor-binding domain of the S1 subunit of the spike (S) protein from the highly pathogenic human virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most other CoVs, including SARS-CoV-2, use the C-domain to bind their receptors. Recent studies found that the receptor-binding domain (RBD) of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV-2 and most CoVs. 223
38230 394828 cd21481 SARS-CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of severe acute respiratory syndrome-related coronavirus Spike (S) protein. This group contains the receptor-binding domain of the S1 subunit of the spike (S) protein from severe acute respiratory syndrome-related coronavirus (SARS-CoV) and similar coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV, use the C-domain to bind their receptors. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV and most CoVs. 222
38231 394829 cd21482 HKU1_N5-like_CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1, isolate N5 and isolate N2. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, isolates N5 and N2. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs, and most likely, for HKU1. 304
38232 394830 cd21483 HKU1_N1_CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1, isolate N1. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, isolate N1. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs, and most likely, for HKU1. 306
38233 394831 cd21484 MHV-like_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from mouse hepatitis virus and other rodent coronaviruses. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from mouse hepatitis virus (MHV) and other rodent coronaviruses. MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). MHV uses mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a) as the receptor; the RBD of MHV is located at the NTD. Most CoVs, such as SARS-CoV and MERS-CoV, use the C-domain to bind their receptors. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 264
38234 394832 cd21485 HCoV-OC43-like_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus OC43 and related proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from several betacoronaviruses including human coronavirus OC43 (HCoV-OC43) and bovine respiratory coronavirus (BCoV), among others. HCoV-OC43 is of zoonotic origin and is endemic in the human population, causing mild respiratory tract infections and possible severe complications or fatalities in young children, the elderly, and immunocompromised individuals. These viruses are related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. It has been reported that HCoV-OC43 uses 9-O-acetyl-sialic acid (9-O-Ac-Sia) as a receptor, which is terminally linked to oligosaccharides decorating glycoproteins and gangliosides at the host cell surface. HCoV-OC43 appears to bind 9-O-Ac-Sia at the NTD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 312
38235 394833 cd21486 human_MERS-CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from human Middle East respiratory syndrome coronavirus. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from the human coronavirus that causes Middle East Respiratory Syndrome (MERS-CoV). MERS-CoV causes severe pulmonary disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including MERS-CoV, use the C-domain to bind their receptors. MERS-CoV uses human dipeptidyl peptidase 4 (DPP4), also called CD26, as its receptor. It binds DPP4 through the RBD of its S1 subunit and then fuses viral and host membranes through its S2 subunit. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV. 219
38236 394834 cd21487 bat_HKU4-like_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from Tylonycteris bat coronavirus HKU4 and similar proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from Tylonycteris bat coronavirus HKU4 and other Middle East Respiratory Syndrome (MERS)-related coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including human MERS-CoV, which is phylogenetically closely related to bat CoV HKU4, use the C-domain to bind their receptors. HKU4 is able to bind the MERS-CoV receptor, human dipeptidyl peptidase 4 (DPP4), also called CD26. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV, and most likely, bat CoV HKU4. 219
38237 411069 cd21494 TMEM168 Transmembrane (TMEM) protein family 168. This family includes transmembrane protein 168 (TMEM168) and similar proteins. TMEM168 is a multi-pass membrane protein predicted to contain nine transmembrane helices. Its expression has been implicated in several diseases. It is upregulated in glioblastoma multiforme (GBM) and small interfering RNA (siRNA)-TMEM168 has been shown to prevent viability of human GBM cell lines, induce cell cycle arrest (G0/G1 phase), and promote apoptosis through the suppression of the Wnt/beta-catenin pathway. A TMEM168 point mutation has also been associated with arrhythmogenesis in familial Brugada syndrome. 681
38238 412059 cd21500 NleA-like bacterial effector protein NleA, and similar proteins. This family includes non-locus of enterocyte effacement (non-LEE) encoded effector A (NleA), a bacterial effector protein injected by enteropathogenic and enterohemorrhagic Escherichia coli (EPEC and EHEC), both related strains capable of inducing severe gastrointestinal disease. These pathogens modulate cellular functions via the deployment of effector proteins in a type three secretion system (T3SS)-dependent manner. In response, the host Nod-like receptor pyrin domain containing (NLRP) inflammasome activates caspase-1 and releases IL-1beta. NleA plays a role in controlling the host immune response through targeting of Nod-like receptor 3 (NLRP3); it has been identified as the effector that can subdue IL-1beta secretion by inhibiting caspase-1 activation, thus inhibiting NLRP inflammasome activation. NleA interacts with NLRP3 via regions containing the PYD and LRR domains. NleA has also been shown to associate with non-ubiquitinated and ubiquitinated NLRP3 and to interrupt de-ubiquitination of NLRP3, which is a required process for inflammasome activation. 425
38239 411070 cd21501 GtgE Salmonella enterica effector protein GtgE. The Salmonella enterica GtgE effector protein contributes to the virulence of this pathogen by modulating trafficking of the Salmonella-containing vacuole. GtgE, which exclusively targets inactive Rab GTPases, has been identified as a cysteine protease with the typical Cys-His-Asp catalytic triad. It functions by cleaving the Rab-family GTPases Rab29, Rab32 and Rab38, thereby preventing the delivery of antimicrobial factors to the bacteria-containing vacuole. It has been shown to solely process the inactive GDP-bound GTPase Rab32. However, weak binding of GtgE to the peptide encompassing the Rab29 cleavage site suggests that the function of GtgE may be dependent on other factors, such as a protein partner or interactions with the Salmonella-containing vacuole (SCV) membrane. 174
38240 411071 cd21502 vWA_BABAM1 Von-Willebrand factor A (vWA) domain found in BRISC and BRCA1-A complex member 1. BRISC and BRCA1 A complex member 1 (BABAM1) is also known as Mediator of RAP80 interactions and targeting subunit of 40 kDa (MERIT40), New component of the BRCA1-A complex (NBA1), HSPC142, or C19orf620. It is a core component of the BRCA1-A and BRISC complexes that function in DNA double-strand break repair and immune signaling, and contain the lysine-63 linkage-specific BRCC36 subunit that is functionalized by scaffold subunits Abraxas and ABRO1, respectively. BABAM1 interacts with Rap80, BRCC36, BRCC45, and Abraxas to form the BRCA1-A complex, a lysine-63-Ub specific deubiquitinating enzyme (DUB) which specifically recognizes lysine-63-linked ubiquitinated histones H2A and H2AX at DNA lesion sites, thereby targeting the BRCA1-BARD1 heterodimer to sites of DNA damage at double-strand breaks (DSBs). BRISC is a DUB complex containing three other subunits, BRCC36, ABRO1 and BRCC45. It specifically hydrolyzes lysine-63 polyubiquitin chains, and is involved in multiple biological processes, including IFN-mediated antiviral immune regulation and inflammatory reaction. BABAM1 likely serves as a scaffold protein by integrating other components to form a functional complex. Furthermore, BABAM1 has been shown to play a critical role in BRISC-mediated regulation of Tankyrase1 (TNKS1) function during spindle assembly; it directly binds to the ankyrin repeat cluster V (ARC-V) domain of TNKS1 via its RXXPEG motif. BABAM1 contains a Von-Willebrand factor A (vWA) domain that is distantly related to classical vWA domains. 216
38241 409631 cd21503 ABC-2_lan_permease lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit. This family contains lantibiotic ABC transporter permease subunits which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to type-A lantibiotics. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Lactococcus lactis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG subunits. This family includes the Lactococcus lactis NisG permease subunit that transports nisin to the surface and expels it from the membrane. This family also includes the lantibiotic ABC transporter permease subunits EpiE, MutE, MutG, and SlvE. Self-protection of the epidermin-producing strain Staphylococcus epidermidis Tu3298 against the pore-forming lantibiotic epidermin is mediated by an ABC transporter composed of the EpiF, EpiE, and EpiG proteins. In the mutacin I-producing strain Streptococcus mutans CH43, self-immunity against mutacin I is mediated by proteins MutF, MutE, and MutG, while in salivaricin D-producing strain Streptococcus salivarius 5M6c, mediation is via ABC transporter proteins SlvF, SlvE, and SlvG. 221
38242 410337 cd21504 PPP2R3A_B-like serine/threonine protein phosphatase 2A regulatory subunit B" alpha and beta subunits, and similar proteins. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. These B-family regulatory subunits play various roles including regulation of cytoskeletal assembly, neuronal differentiation, mitogen-activated protein kinase signaling, and apoptosis. This subfamily includes protein phosphatase 2A regulatory subunit B'' subunits alpha and beta, encoded by PPP2R3A and PPP2R3B. It also includes subunit delta encoded by PPP2R3D in mouse. They contain two-domain elongated structures with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity. 274
38243 410338 cd21505 PPP2R3C serine/threonine protein phosphatase 2A regulatory subunit B" subunit gamma. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This subfamily includes protein phosphatase subunit G5PR (also known as serine/threonine-protein phosphatase 2A regulatory subunit B'' subunit gamma, G4-1, G5pr, GDRM, SPGF36, or C14orf10) that is encoded by the PPP2R3C gene. It is involved in the control of the dynamic organization of the cortical cytoskeleton and plays an important role in the organization of interphase microtubule arrays in part through the regulation of nucleation geometry. G5PR is involved in the ontogeny of multiple organs and is especially critical for testis development and spermatogenesis. PPP2R3C gene variants cause syndromic 46,XY gonadal dysgenesis and impaired spermatogenesis in humans, and the gene is thus emerging as a potential therapeutic target for male infertility. 382
38244 410339 cd21506 PPP2R3A serine/threonine protein phosphatase 2A regulatory subunit B" subunit alpha. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR130 (also known as protein phosphatase 2A regulatory subunit B'' subunit alpha, PR72, or PPP2R3) that is encoded by the PPP2R3A gene. PR130 and PR72 subunits are derived from the same gene through differential splicing; they harbor specific N-terminal domains of different lengths that are encoded by alternatively spliced exons and have identical C-termini. The common C-terminus contains a two-domain elongated structure with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity. The PR130 subunit has been shown to interact with the LIM domain of lipoma-preferred partner (LPP) through a conserved Zn2+-finger-like motif in the N-terminus of PR130. 284
38245 410340 cd21507 PPP2R3B serine/threonine protein phosphatase 2A regulatory subunit B" subunit beta. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR70 (also known as protein phosphatase 2 regulatory subunit B'' subunit beta, PR48, NYREN8, PPP2R3L, or PPP2R3LY) that is encoded by the PPP2R3B gene. This substrate-recognizing subunit of PP2A has a two-domain elongated structure with two calcium EF-hands, each displaying different affinities to Ca2+. PPP2R3B/PR70 is a gonosomal melanoma tumor suppressor gene; PR70 decreased melanoma growth by negatively interfering with DNA replication and cell cycle progression through its role in stabilizing the cell division cycle 6 (CDC6)-chromatin licensing and DNA replication factor 1 (CDT1) interaction, which delays the firing of origins of DNA replication. 355
38246 394835 cd21508 HEV_Spike_S1_RBD receptor-binding domain of the S1 subunit of the Spike (S) protein from porcine hemagglutinating encephalomyelitis virus. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from porcine hemagglutinating encephalomyelitis virus (HEV), which is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. Porcine HEV is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. The protein receptor for porcine HEV has not yet been identified. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 298
38247 411072 cd21510 agarase_cat alpha-beta barrel catalytic domain of agarase, such as GH86-like endo-acting agarases identified in non-marine organisms. Typically, agarases (E.C. 3.2.1.81) are found in ocean-dwelling bacteria since agarose is a principal component of red algae cell wall polysaccharides. Agarose is a linear polymer of alternating D-galactose and 3,6-anhydro-L-galactopyranose. Endo-acting agarases, such as glycoside hydrolase 16 (GH16) and GH86, hydrolyze internal beta-1,4 linkages. A GH86-like endo-acting agarase of this protein family has been identified in the human intestinal bacterium Bacteroides uniformis. This acquired metabolic pathway, as demonstrated by the prevalence of agar-specific genetic clusters called polysaccharide utilization loci (PULs), varies considerably between human populations, being much more prevalent in a Japanese sample than in North American, European, or Chinese samples. Agarase activity was also identified in the non-marine bacterium Cellvibrio sp. 321
38248 394864 cd21511 cv-alpha_beta_Nsp2-like alpha- and betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and HCoV-229E Nsp2, and related proteins. Coronavirus Nsps are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This alpha- and betacoronavirus family includes alphacoronavirus human coronavirus 229E (HCoV-229E) Nsp2, betacoronavirus Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2 and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. This family may be distantly related to the gammacoronavirus Avian infectious bronchitis virus (IBV) Nsp2; IBV Nsp2 is a weak protein kinase R (PKR) antagonist, which may suggest that it plays a role in interfering with intracellular immunity. 399
38249 394837 cd21512 cv_gamma-delta_Nsp2_IBV-like gamma- and deltacoronavirus non-structural protein 2 (Nsp2), similar to IBV Nsp2 and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these cleaved subunits. The functions of Nsp2 remain unclear. This gamma- and deltacoronavirus family includes Avian infectious bronchitis virus (IBV) Nsp2 which has been shown to be a weak protein kinase R (PKR) antagonist, which may suggest that it plays a role in interfering with intracellular immunity. This family may be distantly related to a family of alpha- and betacoronavirus Nsp2, which includes severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 365
38250 394838 cd21513 SUD_C_DPUP_CoV_Nsp3 C-terminal SARS-Unique Domain (SUD) of betacoronavirus non-structural protein 3 (Nsp3). This family contains the SUD-C of Nsp3 from Severe Acute Respiratory Syndrome (SARS) coronavirus (CoV), Middle East respiratory syndrome-related (MERS) CoV, and Rousettus bat CoV HKU9, as well as the DPUP (domain preceding Ubl2 and PLP2) of murine hepatitis virus (MHV) Nsp3. Though structurally similar, there is little sequence similarity between these four domain subfamilies: SARS SUD-C, MERS SUD-C, HKU9 SUD-C, and MHV DPUP. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). The SUD-C domain adopts a frataxin-like fold and has structural similarity to DNA-binding domains of DNA-modifying enzymes. It binds to single-stranded RNA and recognizes purine bases more strongly than pyrimidine bases. SUD-C also regulates the RNA binding behavior of the SUD-M macrodomain. SUD-C is not as specific to SARS CoV Nsp3 as originally thought, and is conserved in the Nsp3s of all four lineages (A-D) of betacoronavirus. 71
38251 394865 cd21514 cv_alpha_Nsp2_HCoV-229E-like alphacoronavirus non-structural protein 2 (Nsp2), similar to HCoV-229E Nsp2 and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes alphacoronavirus human coronavirus 229E (HCoV-229E) Nsp2 and belongs to a family which includes betacoronavirus Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 503
38252 394866 cd21515 cv_beta_Nsp2_SARS_MHV-like betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and MHV Nsp2 (p65), and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This family includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2 and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2 rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 584
38253 394867 cd21516 cv_beta_Nsp2_SARS-like betacoronavirus non-structural protein 2 (Nsp2) similar to SARS-CoV Nsp2, and related proteins from betacoronaviruses in the B lineage. Non-structural proteins (Nsps) from Severe acute respiratory syndrome coronavirus (SARS-CoV) and betacoronaviruses in the sarbecovirus subgenera (B lineage) are encoded in ORF1a and ORF1b. Post infection, the SARS-CoV genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. The functions of Nsp2 remain unknown. Deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. Rather than playing a role in viral replication, SARS-CoV Nsp2 may be involved in altering the host cell environment; it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. 637
38254 394868 cd21517 cv_beta_Nsp2_MERS-like betacoronavirus non-structural protein 2 (Nsp2) similar to MERS-CoV Nsp2, and related proteins from betacoronaviruses in the C lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and betacoronaviruses in the merbecovirus subgenera (C lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 660
38255 394869 cd21518 cv_beta_Nsp2_HKU9-like betacoronavirus non-structural protein 2 (Nsp2) similar to bat coronavirus HKU9 Nsp2, and related proteins from betacoronaviruses in the D lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Rousettus bat coronavirus HKU9 and betacoronaviruses in the nobecovirus subgenera (D lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 597
38256 394870 cd21519 cv_beta_Nsp2_MHV-like betacoronavirus non-structural protein 2 (Nsp2) similar to MHV Nsp2/p65 and related proteins from betacoronaviruses in the A lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Murine hepatitis virus (MHV) and betacoronaviruses in the embecovirus subgenera (A lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2. The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers, and it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2, also known as p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 586
38257 394839 cd21523 SUD_C_MERS-CoV_Nsp3 C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This subfamily contains the SUD-C of Middle East respiratory syndrome-related (MERS) coronavirus (CoV) Nsp3 and other Nsp3s from betacoronaviruses in the merbecovirus subgenera (C lineage), including several bat-CoVs such as Tylonycteris bat CoV HKU4, Pipistrellus bat CoV HKU5, and Hypsugo bat CoV HKU25. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in MERS and related bat coronaviruses. Similar to SARS SUD-C, Tylonycteris bat-CoV HKU4 SUD-C (HKU4 C), a member of the MERS SUD-C group, also adopts a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether MERS SUD-C or HKU4 C functions in the same way. It has been suggested that HKU4 C engages in protein-protein interactions with HKU4 SUD-M. 76
38258 394840 cd21524 DPUP_MHV_Nsp3 DPUP (domain preceding Ubl2 and PLP2) of non-structural protein 3 (Nsp3) from murine hepatitis virus and related betacoronaviruses in the A lineage. This subfamily contains the DPUP (domain preceding Ubl2 and PLP2) of murine hepatitis virus (MHV) non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the embecovirus subgenera (A lineage), including human CoV OC43, rabbit CoV HKU14 and porcine hemagglutinating encephalomyelitis virus (HEV), among others. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. MHV Nsp3 contains a DPUP that is located N-terminal to the ubiquitin-like domain 2 (Ubl2) and papain-like protease 2 (PLP2) catalytic domain. It is structurally similar to the Severe Acute Respiratory Syndrome (SARS) CoV unique domain C (SUD-C), adopting a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. SUD-C is also located N-terminal to Ubl2 and PLP2 in SARS Nsp3, similar to the DPUP of MHV Nsp3; however, unlike DPUP, it is preceded by SUD-N and SUD-M macrodomains that are absent in MHV Nsp3. Though structurally similar, there is little sequence similarity between DPUP and SUD-C. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether DPUP functions in the same way. 75
38259 394841 cd21525 SUD_C_SARS-CoV_Nsp3 C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Severe Acute Respiratory Syndrome coronavirus and related betacoronaviruses in the B lineage. This subfamily contains the SUD-C of Severe Acute Respiratory Syndrome (SARS) coronavirus (CoV) non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the sarbecovirus subgenera (B lineage), such as SARS-CoV-2 and related bat CoVs. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of the Severe Acute Respiratory Syndrome (SARS) coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). The SUD-C domain adopts a frataxin-like fold and has structural similarity to DNA-binding domains of DNA-modifying enzymes. It binds to single-stranded RNA and recognizes purine bases more strongly than pyrimidine bases. SUD-C also regulates the RNA binding behavior of the SUD-M macrodomain. 67
38260 394843 cd21526 CoV_Nsp6 coronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 287
38261 394949 cd21527 CoV_Spike_S1_NTD N-terminal domain of the S1 subunit of coronavirus Spike (S) proteins. This family contains the N-terminal domain (NTD) of the S1 subunit of coronavirus (CoV) Spike (S) proteins from all four (A-D) lineages of betacoronaviruses, including three highly pathogenic human CoVs (HCoV) such as Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV, use the C-domain to bind their receptors. However, mouse hepatitis virus (MHV), from the A lineage, uses the NTD to bind its receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a). Bovine CoV and HCoV-OC43, also from the A lineage, recognize a sugar moiety, 5-N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2), on cell-surface glycoproteins or glycolipids; this binding is also through the S1 NTD. The S1 NTD has also been the target for neutralizing antibodies, including human antibody CDC2-A2, and murine antibodies G2 and 5F9, which target MERS-CoV NTD. In addition, the S1 NTD contributes to the Spike trimer interface. 268
38262 394955 cd21528 CoV_Nsp14 nonstructural protein 14 of coronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 518
38263 394849 cd21529 CoV_M coronavirus Membrane (or Matrix) protein. This family contains the Membrane (M) protein of coronaviruses (CoVs) including three highly pathogenic human CoVs such as Middle East respiratory syndrome (MERS)-related CoV, severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 198
38264 394890 cd21530 CoV_RdRp coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This family contains the RNA-dependent RNA polymerase of alpha-, beta-, gamma-, delta-coronaviruses, including three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 928
38265 394858 cd21531 CoV_E Coronavirus Envelope (E) small membrane protein. This family contains the Envelope (E) small membrane protein of betacoronaviruses, including the E proteins from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS) CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 58
38266 394859 cd21532 HKU1-CoV-like_E human coronavirus HKU1 Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of human coronavirus HKU1 and related coronaviruses (CoVs) from rodents. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 74
38267 394860 cd21533 MERS-CoV-like_E Middle East respiratory syndrome-related coronavirus Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of Middle East respiratory syndrome (MERS) coronavirus (CoV), as well as E proteins from related coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 80
38268 394861 cd21534 SARS-CoV-like_E Severe acute respiratory syndrome coronavirus Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome (SARS) coronavirus (CoV) and SARS-CoV-2, also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus, as well as E proteins from related CoVs. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 62
38269 394862 cd21536 SARS-CoV-2_E Severe acute respiratory syndrome coronavirus 2 Envelope small membrane protein. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 75
38270 394842 cd21537 SUD_C_HKU9_CoV_Nsp3 C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This subfamily contains the SUD-C of Rousettus bat coronavirus (CoV) HKU9 non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the nobecovirus subgenera (D lineage). Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in Rousettus bat CoV HKU9 and related bat CoVs. Similar to SARS SUD-C, Rousettus bat CoV HKU9 SUD-C (HKU9 C), also adopts a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether HKU9 C functions in the same way. 73
38271 394863 cd21554 CoV_N-NTD N-terminal domain of nucleocapsid (N) protein of coronavirus. The coronavirus nucleocapsid (N) protein is a major structural and multifunctional protein. It plays an important role in the virus replication cycle, by forming a complex with the viral RNA through its N-terminal domain (N-NTD), which makes this domain an important drug target. It also interacts with the viral membrane protein during virion assembly and plays a critical role in enhancing the efficiency of virus transcription and assembly. 125
38272 411073 cd21555 OmcS-like C-type cytochrome OmcS and similar proteins. This family includes C-type outer membrane cytochrome S (OmcS) which plays an important role in extracellular electron transfer. OmcS can transfer electrons to insoluble Fe(3+) oxides as well as other extracellular electron acceptors, including Mn(4+) oxide and humic substances. Recent studies show that Geobacter sulfurreducens hexaheme cytochrome OmcS proteins can assemble into filaments, known as microbial nanowires, similar to type IV pili composed of PilA protein. The coordination of a histidine in one subunit with the iron in the heme of an adjacent subunit is an important stabilizing element. The capacity of these bacteria to transport electrons to remote electron acceptors via these protein nanowires is of interest because of the environmental and practical significance of these microbes in soil. 382
38273 394881 cd21556 Macro_cv_SUD-N-M_Nsp3-like SUD-N and SUD-M macrodomains of the SARS-Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This family includes two macrodomains referred to as the SUD-N (N-terminal subdomain) and SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and highly related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains. SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), while SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is not included in this family. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these. 109
38274 394882 cd21557 Macro_X_Nsp3-like X-domain of viral non-structural protein 3 and related macrodomains. The X-domain, a macrodomain, is found in riboviral non-structural protein 3 (Nsp3), including the Nsp3 of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and other coronaviruses (alpha-, beta-, gamma-, and deltacoronavirus), among others. The SARS-CoV X-domain may function as a module binding poly(ADP-ribose), a metabolic product of NAD+ synthesized by poly(ADP-ribose) polymerase (PARP). The X-domain of Avian infectious bronchitis virus (IBV) strain Beaudette coronavirus does not bind ADP-ribose; the triple glycine sequence found in the X-domains of SARS-CoV and human coronavirus 229E (HCoV229E), which are involved in ADP-ribose binding, is not conserved in the IBV X-domain. SARS-CoV has two other macrodomains referred to as the SUD-N (N-terminal subdomain) and SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD), which also do not bind ADP-ribose; these bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SARS-CoV SUD-N and SUD-M are not included in this group. 127
38275 394844 cd21558 alphaCoV-Nsp6 alphacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, has the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 293
38276 394845 cd21559 gammaCoV-Nsp6 gammacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, has the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 307
38277 394846 cd21560 betaCoV-Nsp6 betacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, has the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 290
38278 394847 cd21561 deltaCoV-Nsp6 deltacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, has the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 296
38279 394883 cd21562 Macro_cv_SUD-N_Nsp3-like SUD-N macrodomain of the SARS Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This subfamily includes the macrodomain referred to as SUD-N (N-terminal subdomain) of the SARS-unique domain (SUD) which binds G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in the non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and highly related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains: the SUD-M domain (not represented in this subfamily) is a related macrodomain which also binds G-quadruplexes. SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), while SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is also not represented in this subfamily. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these. 126
38280 394884 cd21563 Macro_cv_SUD-M_Nsp3-like SUD-M macrodomain of the SARS Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This subfamily includes the macrodomain referred to as SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD) which binds G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains: The SUD-N domain (not represented in this subfamily) is a related macrodomain which also binds G-quadruplexes. While SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-M, despite its name, is not specific to SARS. SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is also not represented in this subfamily. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these. 120
38281 394850 cd21564 alphaCoV_M alphacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of alphacoronaviruses including human coronaviruses (HCoVs), HCoV-229E and HCoV-NL63. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 218
38282 394851 cd21565 betaCoV_M betacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of betacoronaviruses including the M proteins from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 208
38283 394852 cd21566 gammaCoV_M gammacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of gammacoronavirus including avian infectious bronchitis virus (IBV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 212
38284 394853 cd21567 MERS-like-CoV_M Membrane (or Matrix) protein from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This group contains the Membrane (M) protein of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 216
38285 394854 cd21568 HCoV-like_M Membrane (or Matrix) protein from human coronavirus and related betacoronaviruses in the A lineage. This group contains the Membrane (M) protein of human coronaviruses (HCoVs), HCoV-OC43 and HCoV-HKU1, and similar proteins from betacoronaviruses in the embecovirus subgenera (A lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 220
38286 394855 cd21569 SARS-like-CoV_M Membrane (or Matrix) protein from Severe acute respiratory syndrome (SARS) coronavirus, SARS-CoV-2, and related betacoronaviruses in the B lineage. This group contains the Membrane (M) protein of Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus), and related proteins from betacoronaviruses in the sarbecovirus subgenera (B lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 218
38287 394856 cd21570 batCoV_HKU9-like_M Membrane (or Matrix) protein from bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This group contains the Membrane (M) protein of Rousettus bat coronavirus HKU9, and similar proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 217
38288 409236 cd21571 KLF13_N N-terminal domain of Kruppel-like factor 13. Kruppel-like factor 13 (KLF13; also known as Krueppel-like factor 13, RANTES factor of late activated T lymphocytes 1/RFLAT-1, or Fetal Kruppel-like factor-2/FKLF-2), is a protein that in humans is encoded by the KLF13 gene. It was originally cloned from fetal globin-expressing tissues, though it has also been cloned from bone marrow, striated muscles, and a subset of T cells where it is highly expressed. KLF13 plays a role in heart development and morphogenesis and is thought to play a role in obesity. It regulates the expression of the chemokine RANTES in T lymphocytes and has been shown to interact with CREB-binding protein, heat shock protein 47, and PCAF. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF13 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF13. 136
38289 409241 cd21572 KLF10_N N-terminal domain of Kruppel-like factor 10. Kruppel-like factor 10 (KLF10; also known as Krueppel-like factor 10; early growth response(EGR)-alpha/EGRA; TGFbeta inducible early gene-1/TIEG1) is a protein that in humans is encoded by the KLF10 gene. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. It may also play a role in adipocyte differentiation and adipose tissue function. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10. 245
38290 409237 cd21573 KLF16_N N-terminal domain of Kruppel-like factor 16. Kruppel-like factor 16 (KLF16; also known as Krueppel-like factor 16, Basic transcription element binding protein 4/BTEB4, or Novel Sp1-like zinc finger transcription factor 2/NSLP2) is a protein that in humans is encoded by the KLF16 gene. KLF16 functions as a transcription activator. It is thought to modulate dopaminergic transmission in the brain and also regulates the expression of several genes essential for metabolic and endocrine processes in sex steroid-sensitive uterine cells. KLF16 selectively binds three distinct KLF-binding sites (GC, CA, and BTE boxes). KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF16 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF16. 125
38291 410567 cd21574 KLF17_N N-terminal domain of Kruppel-like factor 17. Kruppel-like factor 17 (KLF17), or Krueppel-like factor 17, is a protein that, in humans, is encoded by the KLF17 gene and acts as a tumor suppressor. It negatively regulates epithelial-mesenchymal transition and metastasis in breast cancer. KLF17 is thought to be the human ortholog of the mouse gene, zinc finger protein 393 (Zfp393), although it has diverged significantly. KLF17 can regulate gene transcription from CACCC-box elements. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF17. 286
38292 410566 cd21575 KLF18_N N-terminal domain of Kruppel-like factor 18. Kruppel-like factor 18 (KLF18), or Krueppel-like factor 18, is a product of a chromosomal neighbor of the KLF17 gene and is likely a product of its duplication. Phylogenetic analyses revealed that mammalian predicted KLF18 proteins and KLF17 proteins experienced elevated rates of evolution and are grouped with KLF1/KLF2/KLF4 and non-mammalian KLF17. KLF18 has been found in the human testis, though it was previously hypothesized to be a pseudogene in extant placental mammals. Mouse KLF18 expression data indicates that it may function in early embryonic development. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF18. Some KLF18 isoforms have duplicated N-terminal domains. 276
38293 409238 cd21576 KLF14_N N-terminal domain of Kruppel-like factor 14. Kruppel-like factor 14 (KLF14; also known as Krueppel-like factor 14 or basic transcription element-binding protein 5/BTEB5) is a protein that in humans is encoded by the KLF14 gene. KLF14 regulates the transcription of various genes, including TGFbetaRII (the type II receptor for TGFbeta). KLF14 is expressed in many tissues, lacks introns, and is subject to parent-specific expression. It also appears to be a master regulator of gene expression in adipose tissue. KLF14 is associated with coronary artery disease, hypercholesterolemia, and type 2 diabetes. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF14 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF14. 195
38294 410554 cd21577 KLF3_N N-terminal domain of Kruppel-like factor 3. Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3. 214
38295 409239 cd21578 KLF9_N N-terminal domain of Kruppel-like factor 9. Kruppel-like factor 9 (KLF9; also known as Krueppel-like factor 9, or Basic Transcription Element Binding Protein 1/BTEB Protein 1) is a protein that in humans is encoded by the KLF9 gene. KLF9 is critical for the inhibition of growth and development of tumors. It is involved in cell differentiation of B cells, keratinocytes, and neurons. It is also a key transcriptional regulator for uterine endometrial cell proliferation, adhesion, and differentiation; these are processes essential for pregnancy success and are subverted during tumorigenesis. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF9 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF9. 142
38296 410335 cd21579 KLF5_N N-terminal domain of Kruppel-like factor 5. Kruppel-like factor 5 (KLF5; also known as Krueppel-like factor 5; intestinal enriched Kruppel-like factor/IKLF; basic transcription element binding protein 2/BTEB2) is a protein that in humans is encoded by the KLF5 gene. KLF5 is involved in numerous functions in eukaryotic cells, such as proliferation, migration, and differentiation. The loss of KLF5 expression is associated with tumors of the breast, cervix, endometrium, ovary, and prostate. KLF5 mediates the expression of several genes essential for proper cardiac structure and function, and plays a role in familial dilated cardiomyopathy. It functions as a transcriptional activator. KLF5 exhibits both transcriptional activation activity as well as trans-activating function. It belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF5. 273
38297 410206 cd21580 KLF15_N N-terminal domain of Kruppel-like factor 15. Kruppel-like factor 15 (KLF15; also known as Krueppel-like factor 15 or kidney-enriched Kruppel-like factor/KKLF) is a protein that in humans is encoded by the KLF15 gene. KLF15 plays a role in gluconeogenesis and adipogenesis, and may be a potential therapeutic target to reduce hepatitis B virus gene expression and viral replication, heart failure and aortic aneurysm formation, and endometrial cancer, breast cancer, and other diseases related to estrogen. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF15. 213
38298 409227 cd21581 KLF1_N N-terminal domain of Kruppel-like Factor 1. Kruppel-like Factor 1 (KLF1, also known as Krueppel-like factor 1 or Erythroid Kruppel-like Factor/EKLF) was the first Kruppel-like factor discovered. It was found to be vitally important for embryonic erythropoiesis in promoting the switch from fetal hemoglobin (Hemoglobin F) to adult hemoglobin (Hemoglobin A) gene expression by binding to highly conserved CACCC domains. EKLF ablation in mouse embryos produces a lethal anemic phenotype, causing death by embryonic day 14, and natural mutations lead to beta+ thalassemia in humans. However, expression of embryonic hemoglobin and fetal hemoglobin genes is normal in EKLF-deficient mice, suggesting other factors may be involved. KLF1 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF1, which is related to the N-terminal domains of KLF2 and KLF4. 278
38299 409228 cd21582 KLF4_N N-terminal domain of Kruppel-like factor 4. Kruppel-like factor 4 (KLF4; also known as Krueppel-like factor 4 or gut-enriched Kruppel-like factor/GKLF) is a protein that, in humans, is encoded by the KLF4 gene. Evidence also suggests that KLF4 is a tumor suppressor in certain cancers, including colorectal cancer, gastric cancer, esophageal squamous cell carcinoma, intestinal cancer, prostate cancer, bladder cancer and lung cancer. It may act as a tumor promoter where increased KLF4 expression has been reported, such as in oral squamous cell carcinoma and in primary breast ductal carcinoma. KLF4 is one of four key factors that are essential for inducing pluripotent stem cells. KLF4 is highly expressed in non-dividing cells and its overexpression induces cell cycle arrest. KLF proteins KLF1, KLF2, KLF4, KLF5, KLF6, and KLF7 are transcriptional activators. KLF4 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF4, which is related to the N-terminal domains of KLF1 and KLF2. 335
38300 409229 cd21583 KLF2_N N-terminal domain of Kruppel-like factor 2. Kruppel-like Factor 2 (KLF2, also known as Krueppel-like factor 2 or lung Kruppel-like Factor/LKLF) is a protein that, in humans, is encoded by the KLF2 gene on chromosome 19. It has been implicated in a variety of biochemical processes in the human body, including lung development, embryonic erythropoiesis, epithelial integrity, T-cell viability, and adipogenesis. KLF proteins KLF1, KLF2, KLF4, KLF5, KLF6, and KLF7 are transcriptional activators. KLF2 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF2, which is related to the N-terminal domains of KLF1 and KLF4. 299
38301 409242 cd21584 KLF11_N N-terminal domain of Kruppel-like factor 11. Kruppel-like factor 11 (KLF11; also known as Krueppel-like factor 11; Fetal Kruppel-like factor-1/FKLF-1; maturity-onset diabetes of the young 7/MODY7; TGFbeta Inducible Early Growth Response 2/TIEG2) is a protein that in humans is encoded by the KLF11 gene. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF11 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF11. 217
38302 409244 cd21585 KLF7_N N-terminal domain of Kruppel-like factor 7. Kruppel-like factor 7 (KLF7; also known as Krueppel-like factor 7, or ubiquitous Kruppel-like factor/UKLF) is a protein which, in humans, is encoded by the KLF7 gene. KLF7 is involved in regulation of the development and function of the nervous system and adipose tissue, type 2 diabetes, blood diseases, as well as pluripotent cell maintenance. It functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF7. 160
38303 409245 cd21586 KLF6_N N-terminal domain of Kruppel-like factor 6. Kruppel-like factor 6 (KLF6; also known as Krueppel-like factor 6, BCD1, CBA1, COPEB, CPBP, GBF, PAC1, ST12, or ZF9) is a protein that, in humans, is encoded by the KLF6 gene. KLF6 contributes to cell proliferation, differentiation, cell death, and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARalpha). KLF6-expression contributes to hepatic insulin resistance and the progression of non-alcoholic fatty liver disease (NAFLD) to non-alcoholic steatohepatitis (NASH) and NASH-cirrhosis. KLF6 also affects peroxisome proliferator activated receptor gamma (PPARgamma)-signaling in NAFLD. KLF6 has also been identified as a tumor suppressor gene that is inactivated or downregulated in different cancers, including prostate, colon, and hepatocellular carcinomas. KLF6 transactivates genes controlling cell proliferation, including p21, E-cadherin, and pituitary tumor-transforming gene 1 (PTTG1). KLF6 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF6. 198
38304 394891 cd21587 gammaCoV_RdRp gammacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of gammacoronaviruses, including the RdRp of avian infectious bronchitis virus (IBV) and similar proteins. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 931
38305 394892 cd21588 alphaCoV_RdRp alphacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of alphacoronaviruses, including human coronaviruses (HCoVs), HCoV-NL63, and HCoV-229E. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 924
38306 394893 cd21589 betaCoV_RdRp betacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of betacoronaviruses, including the RdRps from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 929
38307 394894 cd21590 deltaCoV_RdRp deltacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which has been shown to inhibit human endemic and zoonotic deltacoronaviruses with a highly divergent RdRp. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 928
38308 394895 cd21591 SARS-CoV-like_RdRp Severe acute respiratory syndrome coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the B lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus), and similar proteins from betacoronaviruses in the sarbecovirus subgenus (B lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 928
38309 394896 cd21592 MERS-CoV-like_RdRp Middle East respiratory syndrome-related coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the C lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenus (C lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which has been shown to potently inhibit MERS RdRp. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 931
38310 394897 cd21593 HCoV_HKU1-like_RdRp human coronavirus HKU1 RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the A lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of human coronavirus HKU1, murine hepatitis virus, and similar proteins from betacoronaviruses in the embecovirus subgenus (A lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 925
38311 394857 cd21594 deltaCoV_M deltacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of deltacoronaviruses including porcine deltacoronavirus and Bulbul coronavirus HKU11. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 217
38312 394954 cd21595 CoV_N-CTD C-terminal domain of nucleocapsid (N) protein of coronavirus. The coronavirus nucleocapsid (N) protein is a major structural and multifunctional protein. It plays an important role in the virus replication cycle, by forming a complex with the viral RNA. It also interacts with the viral membrane protein during virion assembly and plays a critical role in enhancing the efficiency of virus transcription and assembly. The C-terminal domain of the N protein (N-CTD) is involved in dimerization, and is thus also called the dimerization domain. 95
38313 394898 cd21596 batCoV-HKU9-like_RdRp Bat coronavirus HKU9 RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the D lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of bat coronavirus HKU9 and similar proteins from betacoronaviruses in the nobecovirus subgenus (D lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 929
38314 394948 cd21597 SARS-CoV-2_Orf10 Severe acute respiratory syndrome coronavirus 2 Orf10. This model represents the Orf10 protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV). SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). Orf10 appears to have no homologous proteins in SARS-CoV and other coronaviruses. It has been suggested that the genome sequence currently annotated as orf10 may not have a protein coding function in SARS-CoV-2, and instead may act, itself or as a precursor of other RNAs, in the regulation of gene expression, replication, or modulating cellular antiviral pathways (DOI:10.1101/2020.03.05.976167). 36
38315 394937 cd21598 ORF7b_SARS_bat-CoV-like Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and similar proteins from related betacoronaviruses in the B lineage. This family contains the structural accessory protein ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus lineage B, including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province, as well as SARS-related virus from Rhinolophus bats in Europe and Kenya. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), spanning amino acid residues 9-29, that is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization. 40
38316 410178 cd21599 RRM1_GNPTAB RNA recognition motif 1 (RRM1) found in N-acetylglucosamine-1-phosphotransferase subunits alpha/beta (GNPTAB) and similar proteins. GNPTAB, also termed GlcNAc-1-phosphotransferase subunits alpha/beta, or stealth protein GNPTAB, or UDP-N-acetylglucosamine-1-phosphotransferase subunits alpha/beta, catalyzes the formation of mannose 6-phosphate (M6P) markers on high mannose type oligosaccharides in the Golgi apparatus. M6P residues are required to bind to the M6P receptors (MPR), which mediate the vesicular transport of lysosomal enzymes to the endosomal/prelysosomal compartment. The model corresponds to the RNA recognition motif 1 (RRM1) of GNPTAB. Its functional significance remains to be investigated. 90
38317 410179 cd21600 RRM2_GNPTAB RNA recognition motif 2 (RRM2) found in N-acetylglucosamine-1-phosphotransferase subunits alpha/beta (GNPTAB) and similar proteins. GNPTAB, also termed GlcNAc-1-phosphotransferase subunits alpha/beta, or stealth protein GNPTAB, or UDP-N-acetylglucosamine-1-phosphotransferase subunits alpha/beta, catalyzes the formation of mannose 6-phosphate (M6P) markers on high mannose type oligosaccharides in the Golgi apparatus. M6P residues are required to bind to the M6P receptors (MPR), which mediate the vesicular transport of lysosomal enzymes to the endosomal/prelysosomal compartment. The model corresponds to the RNA recognition motif 2 (RRM2) of GNPTAB. Its functional significance remains to be investigated. 77
38318 410180 cd21601 RRM1_PES4_MIP6 RNA recognition motif 1 (RRM1) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export, binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 80
38319 410181 cd21602 RRM2_PES4_MIP6 RNA recognition motif 2 (RRM2) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export, binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 76
38320 410182 cd21603 RRM3_PES4_MIP6 RNA recognition motif 3 (RRM3) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export, binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif. 73
38321 410183 cd21604 RRM4_PES4_MIP6 RNA recognition motif 4 (RRM4) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export, binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the fourth RRM motif. 79
38322 410184 cd21605 RRM1_HRB1_GBP2 RNA recognition motif 1 (RRM1) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 77
38323 410185 cd21606 RRM2_HRB1_GBP2 RNA recognition motif 2 (RRM2) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 75
38324 410186 cd21607 RRM3_HRB1_GBP2 RNA recognition motif 3 (RRM3) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif. 79
38325 410187 cd21608 RRM2_NsCP33_like RNA recognition motif 2 (RRM2) found in Nicotiana sylvestris chloroplastic 33 kDa ribonucleoprotein (NsCP33) and similar proteins. The family includes NsCP33, Arabidopsis thaliana chloroplastic 31 kDa ribonucleoprotein (CP31A) and mitochondrial glycine-rich RNA-binding protein 2 (AtGR-RBP2). NsCP33 may be involved in splicing and/or processing of chloroplast RNAs. AtCP31A, also called RNA-binding protein 1/2/3 (AtRBP33), or RNA-binding protein CP31A, or RNA-binding protein RNP-T, or RNA-binding protein cp31, is required for specific RNA editing events in chloroplasts and stabilizes specific chloroplast mRNAs, as well as for normal chloroplast development under cold stress conditions by stabilizing transcripts of numerous mRNAs under these conditions. CP31A may modulate telomere replication through RNA binding domains. AtGR-RBP2, also called AtRBG2, or glycine-rich protein 2 (AtGRP2), or mitochondrial RNA-binding protein 1a (At-mRBP1a), plays a role in RNA transcription or processing during stress. It binds RNA and DNA sequences with a preference for single-stranded nucleic acids. AtGR-RBP2 displays strong affinity for poly(U) sequences. It confers cold and freezing tolerance, probably by exhibiting an RNA chaperone activity during the cold and freezing adaptation process. Some members in this family contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 76
38326 410188 cd21609 RRM1_PSRP2_like RNA recognition motif 1 (RRM1) found in chloroplastic plastid-specific 30S ribosomal protein 2 (PSRP-2) and similar proteins. PSRP-2, also called chloroplastic 30S ribosomal protein 2, or chloroplastic small ribosomal subunit protein cS22, is a component of the chloroplast ribosome (chloro-ribosome), a dedicated translation machinery responsible for the synthesis of chloroplast genome-encoded proteins, including proteins of the transcription and translation machinery and components of the photosynthetic apparatus. It binds single-stranded DNA (ssDNA) and RNA in vitro. It exhibits RNA chaperone activity and negatively regulates resistance responses to abiotic stresses during seed germination (e.g. salt, dehydration, and low temperature) and seedling growth (e.g. salt). The family also includes Nicotiana sylvestris chloroplastic 33 kDa ribonucleoprotein (NsCP33) and Arabidopsis thaliana chloroplastic 31 kDa ribonucleoprotein (AtCP31A). NsCP33 may be involved in splicing and/or processing of chloroplast RNAs. AtCP31A, also called RNA-binding protein 1/2/3 (AtRBP33), or RNA-binding protein CP31A, or RNA-binding protein RNP-T, or RNA-binding protein cp31, is required for specific RNA editing events in chloroplasts and stabilizes specific chloroplast mRNAs, as well as for normal chloroplast development under cold stress conditions by stabilizing transcripts of numerous mRNAs under these conditions. CP31A may modulate telomere replication through RNA binding domains. Members in this family contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 80
38327 410189 cd21610 RRM2_PSRP2 RNA recognition motif 2 (RRM2) found in chloroplastic plastid-specific 30S ribosomal protein 2 (PSRP-2) and similar proteins. PSRP-2, also called chloroplastic 30S ribosomal protein 2, or chloroplastic small ribosomal subunit protein cS22, is a component of the chloroplast ribosome (chloro-ribosome), a dedicated translation machinery responsible for the synthesis of chloroplast genome-encoded proteins, including proteins of the transcription and translation machinery and components of the photosynthetic apparatus. It binds single-stranded DNA (ssDNA) and RNA in vitro. It exhibits RNA chaperone activity and negatively regulates resistance responses to abiotic stresses during seed germination (e.g. salt, dehydration, and low temperature) and seedling growth (e.g. salt). PSRP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 79
38328 410190 cd21611 RRM_SpPof8_like RNA recognition motif (RRM) found in Schizosaccharomyces pombe protein Pof8 and similar proteins. Pof8 is a La-related protein and a constitutive component of telomerase in fission yeast. It regulates telomerase assembly and poly(A)+ TERRA expression in fission yeast. Members in this family contain an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 82
38329 410191 cd21612 RRM_AtRDRP1_like RNA recognition motif (RRM) found in Arabidopsis thaliana RNA-dependent RNA polymerase 1 (AtRDRP1) and similar proteins. AtRDRP1, also called RNA-directed RNA polymerase 1, is an RNA-dependent direct polymerase involved in antiviral silencing. It is required for the biogenesis of viral secondary siRNAs, a process that follows the production of primary siRNAs derived from viral RNA replication. Members in this family contain an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 67
38330 410192 cd21613 RRM1_KSRP RNA recognition motif 1 (RRM1) found in Kinetoplastid-Specific Ribosomal Protein (KSRP) and similar proteins. KSRP is an essential protein located at the solvent face of the 40S subunit, where it binds and stabilizes kinetoplastid-specific domains of rRNA, suggesting its role in ribosome integrity. It also interacts with the kinetoplastid-specific C-terminal region of protein eS6. KSRP contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 71
38331 410193 cd21614 RRM2_KSRP RNA recognition motif 2 (RRM2) found in Kinetoplastid-Specific Ribosomal Protein (KSRP) and similar proteins. KSRP is an essential protein located at the solvent face of the 40S subunit, where it binds and stabilizes kinetoplastid-specific domains of rRNA, suggesting its role in ribosome integrity. It also interacts with the kinetoplastid-specific C-terminal region of protein eS6. KSRP contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 97
38332 410194 cd21615 RRM_SNP1_like RNA recognition motif (RRM) found in Saccharomyces cerevisiae U1 small nuclear ribonucleoprotein SNP1 and similar proteins. SNP1, also called U1 snRNP protein SNP1, or U1 small nuclear ribonucleoprotein 70 kDa homolog, or U1 70K, or U1 snRNP 70 kDa homolog, interacts with mRNA and is involved in nuclear mRNA splicing. It is a component of the spliceosome, where it is associated with snRNP U1 by binding stem loop I of U1 snRNA. Members in this family contain an N-terminal U1snRNP70 domain and an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 118
38333 410195 cd21616 RRM_ScJSN1_like RNA recognition motif (RRM) found in Saccharomyces cerevisiae protein JSN1 and similar proteins. JSN1, also called Pumilio homology domain family member 1 (PUF1), is a member of the PUF family of proteins. It facilitates association of Arp2/3 complex to yeast mitochondria. It may play a role in mitosis, perhaps by affecting the stability of microtubules. Members in this family contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 118
38334 410196 cd21617 RRM_TDRD10 RNA recognition motif (RRM) found in Tudor domain-containing protein 10 (TDRD10) and similar proteins. TDRD10 is widely expressed and localized both to the nucleus and cytoplasm and may play general roles like regulation of RNA metabolism. It contains a Tudor domain and a RNA recognition motif (RRM). 69
38335 410197 cd21618 RRM_AtNSRA_like RNA recognition motif (RRM) found in Arabidopsis thaliana nuclear speckle RNA-binding protein A (AtNSRA) and similar proteins. AtNSRA is an alternative splicing (AS) regulator that binds to specific mRNAs and modulates auxin effects on the transcriptome. It can be displaced from its targets upon binding to AS competitor long non-coding RNA (ASCO-RNA). Members in this family contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 87
38336 410198 cd21619 RRM1_Crp79 RNA recognition motif 1 (RRM1) found in Schizosaccharomyces pombe mRNA export factor Crp79 and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 78
38337 410199 cd21620 RRM1_Mug28 RNA recognition motif 1 (RRM1) found in Schizosaccharomyces pombe meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif. 84
38338 410200 cd21621 RRM2_Crp79_Mug28 RNA recognition motif 2 (RRM2) found in Schizosaccharomyces pombe mRNA export factor Crp79, meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif. 74
38339 410201 cd21622 RRM3_Crp79_Mug28 RNA recognition motif 3 (RRM3) found in Schizosaccharomyces pombe mRNA export factor Crp79, meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif. 92
38340 394938 cd21623 ORF7b_SARS-CoV-2 Structural accessory protein ORF7b of Severe Acute Respiratory Syndrome coronavirus 2 and similar proteins. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province and showed high sequence identity to SARS-CoV-2. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), spanning amino acid residues 9-29, that is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization. 43
38341 394950 cd21624 SARS-CoV-like_Spike_S1_NTD N-terminal domain of the S1 subunit of the Spike (S) protein from Severe acute respiratory syndrome coronavirus and related betacoronaviruses in the B lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the sarbecovirus subgenera (B lineage), including the highly pathogenic human coronavirus (CoV), Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as a 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2 and SARS-CoV, use the C-domain to bind their receptors. The S1 NTD contributes to the Spike trimer interface. 280
38342 394951 cd21625 MHV-like_Spike_S1_NTD N-terminal domain of the S1 subunit of the Spike (S) protein from murine hepatitis virus and related betacoronaviruses in the A lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the embecovirus subgenera (A lineage), including murine hepatitis virus (MHV), human coronavirus (HCoV) HKU1 and OC43, and bovine CoV (BCoV). MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract, while HCoV-HKU1 causes mild yet prevalent respiratory disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While most CoVs, including SARS-CoV and MERS-CoV use the C-domain to bind their receptors, several CoVs in the A lineage use the NTD to bind their receptors. MHV binds its protein receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a), through its S1 NTD. BCoV and HCoV-OC43 recognize a sugar moiety, 5-N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2), on cell-surface glycoproteins or glycolipids; this binding is also through the S1 NTD. In addition, the S1 NTD contributes to the Spike trimer interface. 284
38343 394952 cd21626 MERS-CoV-like_Spike_S1_NTD N-terminal domain of the S1 subunit of the Spike (S) protein from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the merbecovirus subgenera (C lineage), including the highly pathogenic human coronavirus (CoV), Middle East respiratory syndrome (MERS)-related CoV, and related bat CoVs. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including MERS-CoV, use the C-domain to bind their receptors. Although MERS-CoV uses the C-domain to bind its receptor, neutralizing antibodies targeting MERS-CoV S1-NTD have been reported, including human antibody CDC2-A2, murine antibodies G2 and 5F9, and macaque antibodies FIB-H1 and JC57-13. G2 has been shown to strongly disrupt the attachment of MERS-CoV S to its receptor, dipeptidyl peptidase-4 (DPP4). In addition, the S1 NTD contributes to the Spike trimer interface. 328
38344 394953 cd21627 batCoV-HKU9-like_Spike_S1_NTD N-terminal domain of the S1 subunit of the Spike (S) protein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the nobecovirus subgenera (D lineage), including Rousettus bat coronavirus HKU9 and related bat CoVs. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV use the C-domain to bind their receptors. However, CoV such as mouse hepatitis virus (MHV) uses the NTD to bind its receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a). The S1 NTD contributes to the Spike trimer interface. 289
38345 394930 cd21628 deltaCoV_NS7_NS7a deltacoronavirus accessory protein NS7 and NS7a. This family includes the accessory protein NS7 found in deltacoronaviruses from the Buldecovirus subgenus, such as porcine coronavirus HKU15, and several avian coronaviruses found in sparrow, pigeon, quail and falcon, among others. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7 and NS7a. NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. PDCoV HKU15, an emerging swine enteric coronavirus that causes diarrhea in neonatal piglets, has also been found in the respiratory tract of pigs and may be able to cause respiratory infections, thus possibly spreading through the respiratory route. NS7-specific mAbs that recognized cells transfected with an NS7 expression construct or infected with PDCoV also recognized NS7a, which is encoded by a separate subgenome mRNA with a non-canonical transcription regulatory sequence. The NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7-expressing and PDCoV-infected cells also show a substantial down-regulation of alpha-actinin-4. 195
38346 394929 cd21629 NS6_deltaCoV deltacoronavirus accessory protein NS6. This family includes the accessory protein NS6 from deltacoronaviruses such as porcine coronavirus HKU15, and several avian coronaviruses found in sparrow, pigeon, quail and falcon, among others. There are five essential genes in coronaviruses (CoVs) that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7, and NS7a. During PDCoV infection, NS6 antagonizes RIG-I-like receptor (RLR)-mediated IFN-beta production to evade host innate immune defense; it interacts with RIG-I and MDA5 to impede their association with double-stranded RNA. This is an important finding towards novel therapeutic targets and may lead to the development of more effective vaccines against PDCoV infection. 91
38347 409020 cd21631 RHH_CopG_NikR-like ribbon-helix-helix domains of transcription repressor CopG, nickel responsive transcription factor NikR, and similar proteins. This family includes the ribbon-helix-helix (RHH) domains of transcriptional repressor CopG, nickel-responsive transcription factor NikR, several antitoxins such as Shewanella oneidensis CopA(SO), Burkholderia pseudomallei HicB, and Caulobacter crescentus ParD, and similar proteins. CopG, a homodimeric RHH protein of around 45 residues, constitutes one of the smallest natural transcriptional repressors characterized and is the prototype of a series of repressor proteins encoded by plasmids that exhibit a similar genetic structure at their leading strand initiation and control regions. It is involved in the control of plasmid copy number. NikR, which consists of the N-terminal DNA-binding RHH domain and the C-terminal metal-binding domain (MBD) with four nickel ions, regulates several genes; in Helicobacter pylori, NikR regulates the urease enzyme under extreme acidic conditions, and is involved in the intracellular physiology of nickel. Protein HicB is part of the HicAB toxin-antitoxin (TA) system, where the toxins are RNases, found in many bacteria. In Burkholderia pseudomallei, the HicAB system may play a role in disease by regulating the frequency of persister cells, while in Yersinia pestis HicB acts as an autoregulatory protein that inhibits HicA, which acts as an mRNase. In Escherichia coli, an excess of HicA has been shown to de-repress a HicB-DNA complex and restore transcription of HicB. The CopG family RHH domain, represented by this model, forms a homodimer and binds DNA. 42
38348 394939 cd21635 ORF7b_SARS-CoV-like Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and related proteins. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome coronaviruses (SARS-CoVs) and related betacoronaviruses identified in Chinese horseshoe bats, including bat SARS-like-CoV WIV1 and HKU3. ORF7b/NS7b proteins from betacoronaviruses in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD) spanning amino acid residues 9-29, which is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization. 44
38349 394931 cd21637 NS7_PDCoV Porcine deltacoronavirus (PDCoV) accessory protein NS7. This group includes the accessory protein NS7 found in Porcine coronavirus HKU15. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7 and NS7a. NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. PDCoV HKU15, an emerging swine enteric coronavirus that causes diarrhea in neonatal piglets, has also been found in the respiratory tract of pigs and may be able to cause respiratory infections, thus possibly spreading through the respiratory route. NS7-specific mAbs that recognized cells transfected with an NS7 expression construct or infected with PDCoV also recognized NS7a, which is encoded by a separate subgenome mRNA with a non-canonical transcription regulatory sequence. The NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7-expressing and PDCoV-infected cells also show a substantial down-regulation of alpha-actinin-4. 198
38350 394932 cd21638 NS7a_deltaCoV_HKU16-like accessory protein NS7a found in deltacoronavirus, including avian coronavirus HKU16 and related coronaviruses. This group includes the accessory protein NS7a from White-eye coronavirus HKU16, Falcon coronavirus UAE-HKU27, Houbara coronavirus UAE-HKU28 and Pigeon coronavirus UAE-HKU29, within the Buldecovirus subgenus of deltacoronaviruses (deltaCoVs). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized. 197
38351 394933 cd21639 NS7a_deltaCoV_HKU30-like accessory protein NS7a found in deltacoronavirus, including avian coronavirus HKU30 and related coronaviruses. This group includes the accessory protein NS7a from Quail deltacoronavirus (QdCoV) UAE-HKU30 and sparrow deltacoronavirus (SpCoV-HKU17) within the Buldecovirus subgenus of deltacoronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized. Phylogenetic analysis revealed that QdCoV UAE-HKU30 belongs to the same CoV species as porcine deltacoronavirus (PdCoV) HKU15 and sparrow deltacoronavirus (SpdCoV) HKU17 within Buldecovirus subgenus, suggesting transmission between avian and swine hosts. 198
38352 394944 cd21640 ORF8-Ig_SARS-CoV-2-like SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This family includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and related Sarbecovirus ORF8 proteins including those classified as type II, such as bat coronavirus Rf1 ORF8, and those classified as type III, such as Bat SARS coronavirus HKU3-1 ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). 120
38353 394945 cd21641 ORF8-Ig_SARS-CoV-2-like SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and related Sarbecovirus ORF8 proteins. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). It belongs to a family which includes Sarbecovirus ORF8 proteins classified as type II, such as bat coronavirus Rf1 ORF8, and those classified as type III, such as Bat SARS coronavirus HKU3-1 ORF8. 121
38354 394946 cd21642 ORF8-Ig_Bat_SARS_CoV_Rf1_type-II-like ORF8 immunoglobulin (Ig) domain protein of bat coronavirus Rf1, a type II ORF8, and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of bat coronavirus Rf1 (Bat SARS CoV Rf1) and Bat CoV 273/2005, which have been classified previously as type II ORF8 proteins. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as Bat SARS coronavirus HKU3-1 ORF8 which has been classified previously as a type III ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). In most SARS-CoVs, ORF8 is split into overlapping ORF8a and ORF8b proteins; the N- and C-termini of SARS-CoV-2 ORF8 are similar to SARS-CoV ORF8a and ORF8b, respectively. 119
38355 394947 cd21643 ORF8-Ig_bat_SARS-CoV_HKU3-1_type-III-like ORF8 immunoglobulin (Ig) domain protein of bat SARS coronavirus HKU3-1 ORF8, a type III ORF8, and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of Bat SARS coronavirus HKU3-1 and Bat SARS-like coronavirus Rs3367, which have been classified previously as type III ORF8's. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as bat coronavirus Rf1 (Bat SARS CoV Rf1) ORF8 which has been classified previously as a type II ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). 120
38356 394940 cd21644 batCoV-HKU9_NS7b NS7b protein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This model represents the NS7b protein of Rousettus bat coronavirus (CoV) HKU9 and related proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. The NS7b protein of lineage D betacoronavirus is an accessory protein whose function is unknown. It is not related to NS7b proteins from other betacoronavirus lineages. 178
38357 394928 cd21645 MERS-CoV-like_ORF5 Non-structural protein ORF5 from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This model represents the non-structural protein ORF5 from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage). ORF5 is also called non-structural protein 3d (NS3d) or accessory protein 3d in some bat merbecoviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV is a highly pathogenic respiratory virus with pathogenic mechanisms that may be driven by innate immune pathways. MERS-CoV ORF5 acts as an interferon antagonist and may play a role in circumventing the innate immunity of host cells. It is also implicated in the modulation of NF-kappaB-mediated inflammation. ORF5/NS3d from merbecovirus (betacoronavirus, lineage C) may not be related to ORF5 proteins from other lineages. 223
38358 394885 cd21646 CoV_Nsp5_Mpro coronavirus non-structural protein 5, also called Main protease (Mpro). This family contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 292
38359 394924 cd21647 ORF4b_NS3c-betaCoV accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of betacoronaviruses in the C lineage. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Middle East respiratory syndrome (MERS)-related CoV and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage), including Tylonycteris bat coronavirus HKU4 and Pipistrellus bat coronavirus HKU5. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. The MERS-CoV ORF4b (also known as MERS-CoV 4b) has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis. 227
38360 394922 cd21648 SARS-CoV-like_ORF3a accessory protein ORF3a of severe acute respiratory syndrome-associated coronavirus and similar proteins from related betacoronavirus. This model represents the accessory protein ORF3a of Severe acute respiratory syndrome-associated coronavirus (SARS-CoV), SARS-COV-2 (also called 2019 novel coronavirus or 2019-nCoV), and related betacoronaviruses in the Sarbecovirus subgenus (B lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. SARS-CoV mRNA 3 encodes the distinct proteins ORF3a and ORF3b, which are translated in different reading frames. Accessory protein ORF3a, also called protein 3a and protein X1, is the largest ORF protein in SARS-CoV. It is also called accessory protein 3 or protein 3 in some bat coronaviruses. SARS-CoV ORF3a promotes membrane rearrangement and cell death; it induces vesicle formation and is necessary for SARS-CoV-induced Golgi fragmentation. It has also been found to activate NF-kappaB and the NLRP3 inflammasome by promoting TNF receptor-associated factor 3 (TRAF3)-dependent ubiquitination of p105 and ASC (apoptosis-associated speck-like protein containing a caspase recruitment domain). The cytoplasmic domain of SARS-CoV ORF3a, composed of amino acids at the C-terminal region, has sequence similarity to a calcium pump present in Plasmodium falciparum and has been shown to bind calcium in vitro. 269
38361 394923 cd21649 SARS-CoV_ORF3b accessory protein ORF3b of severe acute respiratory syndrome-associated coronavirus. This model represents the accessory protein ORF3b of Severe acute respiratory syndrome-associated coronavirus (SARS-CoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. SARS-CoV mRNA 3 encodes the distinct proteins ORF3a and ORF3b, which are translated in different reading frames. SARS-CoV accessory protein ORF3b antagonizes interferon (IFN) function by modulating the activity of IFN regulatory factor 3 (IRF3). The IFN system functions as the first line of defense against viral infection in mammalian cells. Viral infection triggers a series of cellular events that lead to the production of IFN and several downstream antiviral genes, helping to establish an antiviral state. Viruses encode IFN antagonists to counteract the antiviral effects of IFN. SARS-CoV ORF3b, ORF6, and N proteins function as IFN antagonists. ORF3b inhibits both IFN synthesis and signaling. It localizes to the nucleus in transfected cells. 151
38362 412060 cd21650 CrtA-like spheroidene monooxygenase and similar proteins. Spheroidene monooxygenase (such as Rhodobacter sphaeroides monooxygenase CrtA) catalyzes the asymmetrical introduction of one keto group at the C-2 position of spheroidene and two keto groups at the C-2 and C-2' positions of spirilloxanthin in carotenoid pathways. Spectroscopic analysis suggests CrtA may have a 5-coordinated heme at its active site and that it may be a novel oxygenase and not a P450 enzyme. 225
38363 394925 cd21651 ORF4b_MERS-CoV-like accessory protein ORF4b, also known as non-structural protein 3c (NS3c) in Middle East respiratory syndrome (MERS)-related CoV. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Middle East respiratory syndrome (MERS)-related CoV, as well as some bat coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. The MERS-CoV ORF4b (also known as MERS-CoV 4b) has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis. 239
38364 394926 cd21652 ORF4b_HKU4-CoV accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of Tylonycteris bat coronavirus HKU4 and similar proteins. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Tylonycteris bat coronavirus HKU4 and related bat coronaviruses including Tylonycteris pachypus bat coronavirus HKU4-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis. 256
38365 394927 cd21653 ORF4b_HKU5-CoV accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of Pipistrellus bat coronavirus HKU5 and similar proteins. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Pipistrellus bat coronavirus HKU5 and related bat coronaviruses including Pipistrellus abramus bat coronavirus HKU5-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis. 249
38366 394941 cd21654 embe-merbe_CoV_ORF8b_protein-I-like MERS-CoV ORF8b, BECV protein I, and related Embecovirus and Merbecovirus proteins. This family includes the ORF8b accessory protein from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and related merbecoviruses (C lineage), and protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage). The gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a), and the gene encoding protein I is included in the N gene as an alternative ORF. ORF8b and protein I appear to have no homologous proteins in Sarbecovirus (lineage B), which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV ORF8b and BECV-F15 protein I are not essential for viral replication. 104
38367 394956 cd21657 deltaCoV_Nsp14 nonstructural protein 14 of deltacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 508
38368 394957 cd21658 gammaCoV_Nsp14 nonstructural protein 14 of gammacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 518
38369 394958 cd21659 betaCoV_Nsp14 nonstructural protein 14 of betacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 519
38370 394959 cd21660 alphaCoV_Nsp14 nonstructural protein 14 of alphacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 510
38371 394942 cd21661 merbe_CoV_ORF8b-like MERS-CoV ORF8b protein and related Merbecovirus proteins. This subfamily includes the ORF8b accessory protein from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and related merbecoviruses (C lineage). The gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a). ORF8b appears to have no homologous proteins in Sarbecovirus (lineage B), which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV ORF8b is not essential for viral replication. It is related to protein I (also known as accessory protein N2) of bovine enteritic coronavirus-F15 strain (BECV-F15) and other related Embecoviruses; the gene encoding protein I is included in the N gene as an alternative ORF. 104
38372 394943 cd21662 embe-CoV_Protein-I_like BECV protein I and related Embecovirus proteins. This subfamily includes protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage) including murine hepatitis virus. The gene encoding protein I is included in the N gene as an alternative ORF. Protein I appears to have no homologous proteins in Sarbecovirus lineage B, which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. BECV-F15 protein I is not essential for viral replication. It is related to the ORF8b accessory protein of Middle East respiratory syndrome-related coronavirus (MERS-CoV) and other related merbecoviruses (C lineage); the gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a). 115
38373 394934 cd21663 ORF7a_SARS-CoV-like Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from related betacoronaviruses in the subgenera Sarbecovirus (B lineage). This family contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus subgenera Sarbecovirus (lineage B), including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province, as well as SARS-related virus from Rhinolophus bats in Europe and Kenya. ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (lineage B) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-amino acid signal peptide sequence at its N terminus, an 81-amino acid luminal domain, a 21-amino acid transmembrane domain, and a short C-terminal tail. Co-expression of SARS-CoV ORF7a with S, M, N and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions. 83
38374 394886 cd21665 alphaCoV_Nsp5_Mpro alphacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in alphacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 296
38375 394887 cd21666 betaCoV_Nsp5_Mpro betacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in betacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 297
38376 394888 cd21667 gammaCoV_Nsp5_Mpro gammacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in gammacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 306
38377 394889 cd21668 deltaCoV_Nsp5_Mpro deltacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 302
38378 394935 cd21684 ORF7a_SARS-CoV-2-like Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) structural accessory protein ORF7a and a bat coronavirus (BatCoV RaTG13) from related betacoronaviruses in the subgenera Sarbecovirus (B lineage). This group contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoV) from betacoronavirus subgenera Sarbecovirus (lineage B), including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province. ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (B lineage) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-aa signal peptide sequence at its N terminus, an 81-aa luminal domain, a 21-aa transmembrane domain, and a short C-terminal tail. Coexpression of SARS-CoV ORF7a with S, M, N, and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions. 121
38379 394936 cd21685 ORF7a_SARS-CoV-like Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from betacoronaviruses in the subgenera Sarbecovirus (B lineage). This group contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus subgenera Sarbecovirus (lineage B). ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (B lineage) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-aa signal peptide sequence at its N terminus, an 81-aa luminal domain, a 21-aa transmembrane domain, and a short C-terminal tail. Coexpression of SARS-CoV ORF7a with S, M, N, and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions. 83
38380 409657 cd21686 TM_Y_CoV_Nsp3_C C-terminus of coronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from alpha-, beta-, gamma-, and deltacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 476
38381 409334 cd21687 TGEV-like_alphaCoV_Nsp1 non-structural protein 1 from transmissible gastroenteritis virus and similar alphacoronaviruses. This model represents the non-structural protein 1 (Nsp1) from transmissible gastroenteritis virus (TGEV) and similar alphacoronaviruses from the tegacovirus and minacovirus subgenera. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the TGEV and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 104
38382 409647 cd21688 CoV_PLPro Coronavirus (CoV) papain-like protease (PLPro). This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of alpha-, beta-, gamma-, and deltacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses related enzymatic activities that promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both ubiquitin (Ub) and the Ub-like interferon-stimulated gene product 15 (ISG15) are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 299
38383 410205 cd21689 stalk_CoV_Nsp13-like stalk domain of coronavirus Nsp13 helicase and related proteins. This model represents the stalk domain of coronavirus non-structural protein 13 (Nsp13) helicase, found in the Nsp13s of alpha-, beta-, gamma-, and deltacoronaviruses, including Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome coronavirus (MERS-CoV). Helicases are classified based on the arrangement of conserved motifs into six superfamilies; coronavirus helicases in this family belong to superfamily 1 (SF1). Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It consists of an N-terminal ZBD (Cys/His rich zinc-binding domain), a stalk domain, a 1B regulatory domain, and SF1 helicase core. The stalk domain lies between the ZBD domain and the 1B domain; a short loop connects the ZBD to the stalk domain. The stalk domain is comprised of three tightly-interacting alpha-helices connected to the 1B domain, transferring the effect from the ZBD domain onto the helicase core domains. The ZBD and stalk domains are critical for the helicase activity of SARS-CoV Nsp13. 48
38384 409667 cd21690 GH2_like GIPC homology 2 (GH2) domain-like family. The GIPC (GAIP C-terminus-interacting protein) family of proteins mediate endocytosis by tethering cargo proteins to the motor myosin VI. This model represents the C-terminal GIPC homology 2 or GH2 domain (plus the linker to the PDZ domain located N-terminally of GH2), which mediates the interaction with myosin VI and is involved in homodimerization in the autoinhibited state. The family also includes DEAH box protein 8 (DHX8) and similar proteins. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. DHX8 contains a GH2-like domain at the N-terminus, which shows high sequence similarity with the GH2 domain found in GIPC proteins. 62
38385 409668 cd21691 GH2-like_DHX8 GIPC-homology 2 (GH2)-like domain found in DEAH box protein 8 (DHX8) and similar proteins. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. This model corresponds to the GH2-like domain that shows high sequence similarity with the GH2 domain found in GIPC (GAIP C-terminus-interacting protein) family of proteins, which mediate endocytosis by tethering cargo proteins to the motor myosin VI. 68
38386 412028 cd21692 GINS_B_Sld5 beta-strand (B) domain of GINS complex protein Sld5. Sld5 is a component of the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex, within which Sld5 interacts with Psf1 via its N-terminal A-domain, and with Psf2 through a combination of the A and B domains. In Drosophila, Sld5 is required for normal cell cycle progression and the maintenance of genomic integrity. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2 and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. Each subunit of the complex consists of two domains called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Sld5. 55
38387 412029 cd21693 GINS_B_Psf3 beta-strand (B) domain of GINS complex protein Psf3. Psf3 (partner of Sld5 3) is one of the proteins known to comprise the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex, which is a macromolecular protein complex associated with DNA replication. Psf3 is dysregulated in cancer cells, and its overexpression may be related to tumor progression in some cancers including colon, breast, and lung cancers; its expression can be used as a prognostic indicator in some cancers. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2, and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf3. 64
38388 412030 cd21694 GINS_B_Psf2 beta-strand (B) domain of GINS complex protein Psf2. Psf2 (partner of Sld5 2) is a component of GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex and has been found to play important roles in normal eye development in Xenopus laevis and in ICL (interstrand crosslinks) repair. ICLs are toxic lesions that covalently attach opposite strands of DNA. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) and is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2, and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf2. 62
38389 412031 cd21695 GINS_B_archaea_Gins51 beta-strand (B) domain of archaeal GINS complex protein Gins51. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins51, as well as eukaryotic Sld5 and Psf1, have the alpha-helical (A) domain at the N-terminus and the beta-strand domain (B) at the C-terminus; this arrangement is called AB-type. Archaeal GINS contacts GAN by using the Gins51 B-domain as a hook, for the formation of the CMG helicase. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of Gins51. 52
38390 412032 cd21696 GINS_B_Psf1 beta-strand (B) domain of GINS complex protein Psf1. Psf1 (partner of Sld5 1) is a component of the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex, and is mainly expressed in highly proliferative tissues, such as blastocysts, adult bone marrow, and testis, in which the stem cell system is active. Psf1 has been reported to be a prognostic biomarker in breast cancer, prostate cancer, hepatocellular carcinoma, and non-small cell lung cancer (NSCLC) patients treated with surgery following preoperative chemotherapy or chemoradiotherapy. Loss of Psf1 causes embryonic lethality. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) and is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2 and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf1. 49
38391 412033 cd21697 GINS_B_archaea_Gins23 beta-strand (B) domain of archaeal GINS complex protein Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins23, as well as eukaryotic Psf2 and Psf3, have the alpha-helical (A) domain at the C-terminus and the beta-strand domain (B) at the N-terminus; this arrangement is called BA-type. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of archaeal Gins23. 42
38392 411955 cd21698 CoV_Spike_S1-S2_S2 S1/S2 cleavage region and the S2 fusion subunit of coronavirus spike (S) proteins. This model represents the S1/S2 cleavage region and the S2 subunit of the spike (S) glycoprotein from coronavirus (CoVs), including three highly pathogenic human CoVs, Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect S1 and S2. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP), and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. Notably, SARS-CoV-2 has a functional polybasic (furin) cleavage site through the insertion of PRRAR*SV (* indicates the cleavage site) at the S1/S2 interface, which is absent in SARS-CoV and other SARS-related CoVs. The S1/S2 cleavage region and the S2 fusion subunit play an essential role in viral entry by initiating fusion of the viral and cellular membranes. 523
38393 411982 cd21699 JMTM_APP_like juxtamembrane and transmembrane (JMTM) domain found in the amyloid-beta precursor protein (APP) family. The amyloid-beta precursor protein (APP) family includes amyloid-like proteins APLP-1 and APLP-2. APP (also called ABPP, APPI, Alzheimer disease (AD) amyloid protein, amyloid precursor protein, amyloid-beta A4 protein, cerebral vascular amyloid peptide (CVAP), PreA4, or protease nexin-II (PN-II)) functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Amyloid-beta peptides are lipophilic metal chelators with metal-reducing activity; they bind transition metals such as copper, zinc and iron. APLP-1, also called APLP, may play a role in postsynaptic function. It couples to JIP signal transduction through C-terminal binding. APLP-1 may interact with cellular G-protein signaling pathways. It can regulate neurite outgrowth through binding to components of the extracellular matrix such as heparin and collagen I. APLP-2 (also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP)) may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3' (CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APP, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. More than half of all familial APP mutations of Alzheimer's disease are seen in its JMTM domain region. 41
38394 411983 cd21700 JMTM_Notch_APP juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. The substrates of gamma-secretase include amyloid precursor protein (APP) and the Notch receptor. APP, also called APPI, or Alzheimer disease amyloid protein (ABPP), or amyloid precursor protein, or amyloid-beta A4 protein, or cerebral vascular amyloid peptide (CVAP), or PreA4, or protease nexin-II (PN-II), functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Notch proteins are a family of type-1 transmembrane proteins that form a core component of the Notch signaling pathway. They operate in a variety of different tissues and play a role in a variety of developmental processes by controlling cell fate decisions. Successive cleavage of the APP carboxyl-terminal fragment generates amyloid-beta (Abeta) peptides of varying lengths. Accumulation of Abeta peptides such as Abeta42 and Abeta43 leads to formation of amyloid plaques in the brain, a hallmark of Alzheimer's disease. Notch cleavage is involved in cell-fate determination during development and neurogenesis. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. It comprises a transmembrane helix (TM) with adjacent juxtamembrane (JM) regions. The JMTM domain is likely to be recognized by gamma-secretase in a similar fashion to both Notch and APP family proteins. 41
38395 411984 cd21701 JMTM_Notch juxtamembrane and transmembrane (JMTM) domain found in Notch protein family. Neurogenic locus notch homolog (Notch) proteins are a family of type-1 transmembrane proteins that form a core component of the Notch signaling pathway. They operate in a variety of different tissues and play a role in a variety of developmental processes by controlling cell fate decisions. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch proteins, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 85
38396 411985 cd21702 JMTM_Notch1 juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 1 (Notch1) and similar proteins. Neurogenic locus notch homolog protein 1 (Notch1), also called translocation-associated notch protein TAN-1, functions as a receptor for membrane-bound ligands Jagged-1 (JAG1), Jagged-2 (JAG2) and Delta-1 (DLL1) to regulate cell-fate determination. It affects the implementation of differentiation, proliferation and apoptotic programs. It is also involved in angiogenesis, and also negatively regulates endothelial cell proliferation and migration and angiogenic sprouting. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch1, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 80
38397 411986 cd21703 JMTM_Notch2 juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 2 (Notch2) and similar proteins. Neurogenic locus notch homolog protein 2 (Notch2) functions as a receptor for membrane-bound ligands Jagged-1 (JAG1), Jagged-2 (JAG2) and Delta-1 (DLL1) to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. Notch2 is involved in bone remodeling and homeostasis. In collaboration with RELA/p65, it enhances NFATc1 promoter activity and positively regulates RANKL-induced osteoclast differentiation. Notch2 positively regulates self-renewal of liver cancer cells. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch2, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 82
38398 411987 cd21704 JMTM_Notch3 juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 3 (Notch3) and similar proteins. Neurogenic locus notch homolog protein 3 (Notch3) functions as a receptor for membrane-bound ligands Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch3, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 90
38399 411988 cd21705 JMTM_Notch4 juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 4 (Notch4) and similar proteins. Neurogenic locus notch homolog protein 4 (Notch4) functions as a receptor for membrane-bound ligands Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. It affects the implementation of differentiation, proliferation and apoptotic programs. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch4, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 92
38400 411989 cd21706 JMTM_dNotch juxtamembrane and transmembrane (JMTM) domain found in Drosophila melanogaster neurogenic locus Notch protein (dNotch) and similar proteins. Drosophila melanogaster neurogenic locus Notch protein (dNotch) is an essential signaling protein which has a major role in many developmental processes. It functions as a receptor for membrane-bound ligands Delta and Serrate to regulate cell-fate determination. It regulates oogenesis, the differentiation of the ectoderm and the development of the central and peripheral nervous system, eye, wing disk, muscles and segmental appendages such as antennae and legs, through lateral inhibition or induction. It also regulates neuroblast self-renewal, identity and proliferation through the regulation of bHLH-O proteins; in larval brains, it is involved in the maintenance of type II neuroblast self-renewal and identity by suppressing erm expression together with pnt. It might also regulate dpn expression through the activation of the transcriptional regulator Su(H). This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of dNotch, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand. 90
38401 411990 cd21707 JMTM_APP juxtamembrane and transmembrane (JMTM) domain found in amyloid-beta precursor protein (APP) and similar proteins. Amyloid-beta precursor protein (APP), also called APPI, ABPP, Alzheimer disease amyloid protein, amyloid precursor protein, amyloid-beta A4 protein, cerebral vascular amyloid peptide (CVAP), PreA4, or protease nexin-II (PN-II), functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Amyloid-beta peptides are lipophilic metal chelators with metal-reducing activity; they bind transition metals such as copper, zinc and iron. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APP, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. More than half of all familial APP mutations of Alzheimer's disease are seen in its JMTM domain region. 40
38402 411991 cd21708 JMTM_APLP1 juxtamembrane and transmembrane (JMTM) domain found in amyloid-like protein 1 (APLP-1) and similar proteins. Amyloid-like protein 1 (APLP-1), also called APLP, may play a role in postsynaptic function. It couples to JIP signal transduction through C-terminal binding. APLP-1 may interact with cellular G-protein signaling pathways. It can regulate neurite outgrowth through binding to components of the extracellular matrix such as heparin and collagen I. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APLP-1, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. 85
38403 411992 cd21709 JMTM_APLP2 juxtamembrane and transmembrane (JMTM) domain found in amyloid-like protein 2 (APLP-2) and similar proteins. Amyloid-like protein 2 (APLP-2), also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP), may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3' (CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APLP-2, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. 81
38404 409658 cd21710 TM_Y_gammaCoV_Nsp3_C C-terminus of gammacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from gammacoronavirus, including Infectious bronchitis virus. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 525
38405 409659 cd21711 TM_Y_deltaCoV_Nsp3_C C-terminus of deltacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from deltacoronavirus, including Magpie-robin coronavirus HKU18 and Bulbul coronavirus HKU11, among others. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 490
38406 409660 cd21712 TM_Y_alphaCoV_Nsp3_C C-terminus of alphacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus 229E, among others. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 501
38407 409661 cd21713 TM_Y_betaCoV_Nsp3_C C-terminus of betacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 545
38408 409662 cd21714 TM_Y_MHV-like_Nsp3_C C-terminus of non-structural protein 3, including transmembrane and Y domains, from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In MHV and the related Severe acute respiratory syndrome-related coronavirus (SARS-CoV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 555
38409 409663 cd21715 TM_Y_HKU9-like_Nsp3_C C-terminus of non-structural protein 3, including transmembrane and Y domains, from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 526
38410 409664 cd21716 TM_Y_MERS-CoV-like_Nsp3_C C-terminus of non-structural protein 3, including transmembrane and Y domains, from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 566
38411 409665 cd21717 TM_Y_SARS-CoV-like_Nsp3_C C-terminus of non-structural protein 3, including transmembrane and Y domains, from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and the related murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 531
38412 409652 cd21718 CoV_Nsp13-helicase helicase domain of coronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alpha-, beta-, gamma-, and deltacoronavirus, including pathogenic human viruses such as Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 341
38413 409653 cd21720 gammaCoV_Nsp13-helicase helicase domain of gammacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from gammacoronavirus, including Avian infectious bronchitis virus. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Coronavirus (CoV) Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 343
38414 409654 cd21721 deltaCoV_Nsp13-helicase helicase domain of deltacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from deltacoronavirus, including Bulbul coronavirus (CoV) HKU11 and Common moorhen CoV HKU21. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 342
38415 409655 cd21722 betaCoV_Nsp13-helicase helicase domain of betacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from betacoronavirus, including pathogenic human viruses such as Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 340
38416 409656 cd21723 alphaCoV_Nsp13-helicase helicase domain of alphacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus (CoV) NL63. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 340
38417 409626 cd21727 betaCoV_Nsp3_betaSM betacoronavirus-specific marker of betacoronavirus non-structural protein 3. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus, including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 125
38418 409648 cd21731 alphaCoV_PLPro alphacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of alphacoronavirus, including Swine acute diarrhea syndrome coronavirus (SADS-CoV) which causes severe diarrhea in piglets, and Human coronavirus 229E which infects humans and bats and causes the common cold. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in SADS-CoV and many others has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 289
38419 409649 cd21732 betaCoV_PLPro betacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of betacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. In SARS-CoV and murine hepatitis virus (MHV), the C-terminal non-structural protein 3 region spanning transmembrane regions TM1 and TM2 with 3Ecto domain in between, are important for the PL2pro domain to process Nsp3-Nsp4 cleavage. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain of many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 
Interactions of SARS-CoV and MERS-CoV with antiviral interferon (IFN) responses of human cells are remarkably different; high-dose IFN treatment (type I and type III) showed that MERS-CoV was substantially more IFN-sensitive than SARS-CoV. This may be due to differences in the architecture of the oxyanion hole and of the S3 as well as the S5 specificity sites, despite the overall structures of SARS-CoV and MERS-CoV PLPro being similar. 304
38420 409650 cd21733 gammaCoV_PLPro gammacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of gammacoronavirus, including Avian coronavirus, Canada goose coronavirus, and Beluga whale coronavirus SW1. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in several CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 304
38421 409651 cd21734 deltaCoV_PLPro deltacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in the non-structural protein 3 (Nsp3) region of deltacoronavirus, including Porcine deltacoronavirus, Bulbul coronavirus HKU11, and Common moorhen coronavirus HKU21. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 313
38422 412023 cd21743 CTD_KDM2A_2B-like C-terminal domain found in lysine-specific demethylase KDM2A, KDM2B, and similar proteins. This family includes lysine-specific demethylases KDM2A and KDM2B, as well as Drosophila melanogaster JmjC domain-containing histone demethylation protein 1 (Jhd1). KDM2A is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. KDM2B is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. Jhd1, also called lysine (K)-specific demethylase 2 (KDM2), or [Histone-H3]-lysine-36 demethylase 1, is a histone demethylase (EC 1.14.11.27) that specifically demethylates 'Lys-36' of histone H3, thereby playing a central role in the histone code. Members in this family belong to the JmjC domain-containing histone demethylase family. They consist of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature. 67
38423 409643 cd21744 RBD_KIF20A-like RAB6 binding domain (RBD) found in kinesin-like proteins KIF20A, KIF20B, and similar proteins. This family includes kinesin-like proteins KIF20A and KIF20B. KIF20A (also called GG10_2, mitotic kinesin-like protein 2 (MKlp2), Rab6-interacting kinesin-like protein, or rabkinesin-6) is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1 (polo-like kinase 1), it is involved in recruitment of PLK1 to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus-end-directed motility. KIF20B (also called cancer/testis antigen 90 (CT90), kinesin family member 20B, kinesin-related motor interacting with PIN1, or M-phase phosphoprotein 1 (MPP1)) is a plus-end-directed motor enzyme that is required for completion of cytokinesis. It is required for proper midbody organization and abscission in polarized cortical stem cells. KIF20B plays a role in the regulation of neuronal polarization by mediating the transport of specific cargoes. It participates in the mobilization of SHTN1 and in the accumulation of PIP3 in the growth cone of primary hippocampal neurons in a tubulin and actin-dependent manner. In the developing telencephalon, KIF20B cooperates with SHTN1 to promote both the transition from the multipolar to the bipolar stage and the radial migration of cortical neurons from the ventricular zone toward the superficial layer of the neocortex. This model corresponds to a conserved domain in the KIF20A subfamily, that shows RAB6 binding ability and has been called the RAB6 binding domain (RBD). KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge. 56
38424 409646 cd21759 CBD_MYO6-like calmodulin binding domain found in unconventional myosin-VI and similar proteins. Myosins, which are actin-based motor molecules with ATPase activity, include unconventional myosins that serve in intracellular movements. Myosin-VI, also called unconventional myosin-6 (MYO6), is a reverse-direction motor protein that moves towards the minus-end of actin filaments. It is required for the structural integrity of the Golgi apparatus via the p53-dependent pro-survival pathway. Myosin-VI appears to be involved in a very early step of clathrin-mediated endocytosis in polarized epithelial cells. It modulates RNA polymerase II-dependent transcription. As part of the DISP (DOCK7-Induced Septin disPlacement) complex, Myosin-VI may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. Myosin-VI is encoded by gene MYO6, the human homolog of the gene responsible for deafness in Snell's waltzer mice. It is mutated in autosomal dominant non-syndromic hearing loss. This family also includes Drosophila melanogaster unconventional myosin VI Jaguar (Jar; also called myosin heavy chain 95F (Mhc95F), or 95F MHC), which is a motor protein necessary for the morphogenesis of epithelial tissues during Drosophila development. Jar is required for basal protein targeting and correct spindle orientation in mitotic neuroblasts. It contributes to synaptic transmission and development at the Drosophila neuromuscular junction. Together with CLIP-190 (CAP-Gly domain-containing/cytoplasmic linker protein 190), Jar may coordinate the interaction between the actin and microtubule cytoskeleton. Jar may link endocytic vesicles to microtubules and possibly be involved in transport in the early embryo and in the dynamic process of dorsal closure; its function is believed to change during the life cycle. 
This model corresponds to the calmodulin (CaM) binding domain (CBD), which consists of three subdomains: a unique insert (Insert 2 or Ins2), an IQ motif, and a proximal tail domain (PTD, also known as lever arm extension or LAE). 149
38425 409196 cd21762 WH2 Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif), and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) as well as thymosin-beta (Tbeta; also called beta-thymosin or betaT) domains that are small, widespread intrinsically disordered actin-binding peptides displaying significant sequence variability and different regulations of actin self-assembly in motile and morphogenetic processes. These WH2/betaT peptides are identified by a central consensus actin-binding motif LKKT/V flanked by variable N-terminal and C-terminal extensions; the betaT shares a more extended and conserved C-terminal half than WH2. These single or repeated domains are found in actin-binding proteins (ABPs) such as the hematopoietic-specific protein WASP, its ubiquitously expressed ortholog neural-WASP (N-WASP), WASP-interacting protein (WAS/WASL-interacting protein family members 1 and 2), and WASP-family verprolin homologous protein (WAVE/SCAR) isoforms: WAVE1, WAVE2, and WAVE3. Also included are the WH2 domains found in inverted formin FH2 domain-containing protein (INF2), Cordon bleu (Cobl) protein, vasodilator-stimulated phosphoprotein (VASP) homology protein and actobindin (found in amoebae). These ABPs are commonly multidomain proteins that contain signaling domains and structurally conserved actin-binding motifs, the most important being the WH2 domain motif through which they bind actin in order to direct the location, rate, and timing for actin assembly in the cell into different structures, such as filopodia, lamellipodia, stress fibers, and focal adhesions. The WH2 domain motif is one of the most abundant actin-binding motifs in Wiskott-Aldrich syndrome proteins (WASPs) where they activate Arp2/3-dependent actin nucleation and branching in response to signals mediated by Rho-family GTPases. 
The thymosin beta (Tbeta) domains in metazoans act in cells as major actin-sequestering peptides; their complex with monomeric ATP-actin (G-ATP-actin) cannot polymerize at either filament (F-actin) end. 22
38426 409640 cd21764 CEN_USH1G_ANKS4B central domain found in usher syndrome type-1G protein, ankyrin repeat and SAM domain-containing protein 4B, and similar proteins. The family includes usher syndrome type-1G protein (USH1G), ankyrin repeat and SAM domain-containing protein 4B (ANKS4B), and similar proteins. USH1G, also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is a part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. ANKS4B, also called Harmonin-interacting ankyrin repeat-containing protein (Harp), is highly expressed in intestine and is essential for intermicrovillar adhesion. As part of the intermicrovillar adhesion complex (IMAC), ANKS4B plays a role in epithelial brush border differentiation, controlling microvilli organization and length. It may be involved in cellular response to endoplasmic reticulum stress. Both USH1G and ANKS4B contain four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN), which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for USH1G binding to the MYO7A MyTH4-FERM tandem, as well as for ANKS4B binding to the N-terminal MyTH4-FERM-SH3 supramodule of MYO7B. 41
38427 409636 cd21769 DEFL defensin-like domain family. This family includes a group of defensin-like proteins, including Arabidopsis thaliana protein LURE 1.2 (AtLURE1.2) and protein LURE 1.6 (AtLURE1.6), Mesobuthus martensii neurotoxin BmBKTx1, Arabidopsis thaliana defensin-like protein 32 (AtDEF32), as well as bactericidal proteins such as defensins, sapecins, tenecins, phormicins, and lucifensins. They are characterized by a defensin-like (DEFL) domain, which adopts a structure characterized by a cysteine-stabilized alpha/beta scaffold. AtLURE1.2 (also called cysteine-rich peptide 810_1.2 or defensin-like protein 213) and AtLURE1.6 (also called cysteine-rich peptide 810_1.6 or defensin-like protein 215) are pollen tube attractants guiding pollen tubes to the ovular micropyle. BmBKTx1, also called potassium channel toxin alpha-KTx 19.1 or BmK37, is a selective inhibitor of high conductance calcium-activated potassium channels KCa1.1/KCNMA1. Bactericidal proteins are host defense peptides produced in response to injury and are mostly active against Gram-positive bacteria. 29
38428 412024 cd21783 CTD_Jhd1-like C-terminal domain found in Drosophila melanogaster JmjC domain-containing histone demethylation protein 1 and similar proteins. JmjC domain-containing histone demethylation protein 1 (Jhd1), also called lysine (K)-specific demethylase 2 (KDM2), or [Histone-H3]-lysine-36 demethylase 1, is a histone demethylase (EC 1.14.11.27) that specifically demethylates 'Lys-36' of histone H3, thereby playing a central role in the histone code. Jhd1 consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in Jhd1 between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature. 67
38429 412025 cd21784 CTD_KDM2A C-terminal domain found in Lysine-specific demethylase 2A. Lysine-specific demethylase 2A (KDM2A) is also called CXXC-type zinc finger protein 8, F-box and leucine-rich repeat protein 11 (FBXL11), F-box protein FBL7, F-box protein Lilina, F-box/LRR-repeat protein 11, JmjC domain-containing histone demethylation protein 1A (Jhdm1a), or [Histone-H3]-lysine-36 demethylase 1A. It is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. It acts as a key negative regulator of gluconeogenic gene expression and plays a critical role in the invasiveness, proliferation, and anchorage-independent growth of non-small cell lung cancer (NSCLC) cells, as well as in the osteo/dentinogenic differentiation of Mesenchymal stem cells (MSCs). KDM2A regulates rRNA transcription in response to starvation and functions as a negative regulator of NF-kappaB. It is a heterochromatin-associated and HP1-interacting protein that promotes Heterochromatin Protein 1 (HP1) localization to chromatin. It is specifically recruited to CpG islands to define a unique chromatin architecture, which requires direct and specific interaction with linker DNA. It also functions as a H3K4 demethylase that regulates cell proliferation through p15 (INK4B) and p27 (Kip1) in stem cells from apical papilla (SCAPs). KDM2A belongs to the JmjC domain-containing histone demethylase family. KDM2A consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2A between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature. 68
38430 412026 cd21785 CTD_KDM2B C-terminal domain found in Lysine-specific demethylase 2B. Lysine-specific demethylase 2B (KDM2B) is also called Ndy1, CXXC-type zinc finger protein 2, F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), F-box protein FBL10, JmjC domain-containing histone demethylation protein 1B (Jhdm1b), Jumonji domain-containing EMSY-interactor methyltransferase motif protein (protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B. It is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). KDM2B also acts as an oncogene that plays a critical role in leukemia development and maintenance. It consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2B between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature. 67
38431 409644 cd21786 RBD_KIF20B RAB6 binding domain (RBD) found in kinesin-like protein KIF20B, and similar proteins. KIF20B (also called cancer/testis antigen 90 (CT90), kinesin family member 20B, kinesin-related motor interacting with PIN1, or M-phase phosphoprotein 1 (MPP1)) is a plus-end-directed motor enzyme that is required for completion of cytokinesis. It is required for proper midbody organization and abscission in polarized cortical stem cells. KIF20B plays a role in the regulation of neuronal polarization by mediating the transport of specific cargos. It participates in the mobilization of SHTN1 (shootin 1) and in the accumulation of PIP3 in the growth cone of primary hippocampal neurons in a tubulin and actin-dependent manner. In the developing telencephalon, KIF20B cooperates with SHTN1 to promote both the transition from the multipolar to the bipolar stage and the radial migration of cortical neurons from the ventricular zone toward the superficial layer of the neocortex. KIF20B acts as an oncogene for promoting bladder cancer cell proliferation, apoptosis inhibition, and carcinogenic progression. This model corresponds to a conserved region in KIF20B that shows some sequence similarity to the RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge. 56
38432 409645 cd21787 RBD_KIF20A RAB6 binding domain (RBD) found in kinesin-like protein KIF20A, and similar proteins. KIF20A, also called GG10_2, or mitotic kinesin-like protein 2 (MKlp2), or Rab6-interacting kinesin-like protein, or rabkinesin-6, is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1 (polo-like kinase 1), it is involved in recruitment of PLK1 to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus end-directed motility. This model corresponds to the RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge. 56
38433 409347 cd21795 betaCoV_Nsp3_NAB nucleic acid binding domain of betacoronavirus non-structural protein 3. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus including highly pathogenic human coronaviruses (CoVs) such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but may not be conserved in the Nsp3 NAB from betacoronaviruses in other lineages. 110
38434 409335 cd21796 SARS-CoV-like_Nsp1_N N-terminal domain of non-structural protein 1 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the N-terminal domain of non-structural protein 1 (Nsp1) from betacoronaviruses in the sarbecovirus subgenus (B lineage), including highly pathogenic coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 115
38435 409197 cd21799 WH2_Wa_Cobl first Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wa) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wa, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 (or W) domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the first WH2 domain. 33
38436 409198 cd21800 WH2_Wb_Cobl second Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wb) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wb, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 or W domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the second WH2 domain. 44
38437 409199 cd21801 WH2_Wc_Cobl third Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wc) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wc, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 (or W) domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the third WH2 domain. 26
38438 409641 cd21802 CEN_ANKS4B central domain found in ankyrin repeat and SAM domain-containing protein 4B. Ankyrin repeat and SAM domain-containing protein 4B (ANKS4B), also called Harmonin-interacting ankyrin repeat-containing protein (Harp), is highly expressed in intestine and is essential for intermicrovillar adhesion. As part of the intermicrovillar adhesion complex (IMAC), ANKS4B plays a role in epithelial brush border differentiation, controlling microvilli organization and length. It may be involved in cellular response to endoplasmic reticulum stress. ANKS4B consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of ANKS4B, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the N-terminal MyTH4-FERM-SH3 supramodule of MYO7B, with a mechanism highly analogous to the interaction between USH1G and MYO7A. 46
38439 409642 cd21803 CEN_USH1G central domain found in usher syndrome type-1G protein. Usher syndrome type-1G protein (USH1G), also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. USH1G consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of USH1G, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the MYO7A MyTH4-FERM tandem. 57
38440 409637 cd21804 DEFL_AtLURE1-like defensin-like domain found in Arabidopsis thaliana proteins LURE 1.2, LURE 1.6, and similar proteins. This subfamily includes Arabidopsis thaliana (At) LURE1.2 (also called cysteine-rich peptide 810_1.2, CRP810_1.2, or defensin-like protein 213) and AtLURE1.6 (also called cysteine-rich peptide 810_1.6, CRP810_1.6, or defensin-like protein 215). They are pollen tube attractants guiding pollen tubes to the ovular micropyle. AtLURE1.2 attracts specifically pollen tubes from A. thaliana, but not those from A. lyrata. It triggers endocytosis of MDIS1 in the pollen tube tip. This model corresponds to the defensin-like (DEFL) domains of AtLURE1.2 and AtLURE1.6, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold. 38
38441 409638 cd21805 DEFL_BmBKTx1-like defensin-like domain found in Mesobuthus martensii neurotoxin BmBKTx1 and similar proteins. BmBKTx1, also called potassium channel toxin alpha-KTx 19.1, or BmK37, is a selective inhibitor of high conductance calcium-activated potassium channels KCa1.1/KCNMA1. It belongs to a family of short-chain alpha-KTx toxins of the potassium channel (also called alpha-KTx19) and may be insect specific. This subfamily also includes Arabidopsis thaliana defensin-like protein 32 (AtDEF32). Its biological function remains unclear. This model corresponds to the defensin-like (DEFL) domain of BmBKTx1 and AtDEF32, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold. 39
38442 409639 cd21806 DEFL_defensin-like defensin-like domain found in bilateria defensins, sapecins, tenecins, phormicins, and lucifensins. This subfamily includes a group of bactericidal proteins, such as defensins, sapecins, tenecins, phormicins, and lucifensins from bilateria. They are host defense peptides produced in response to injury and mostly active against Gram-positive bacteria. This model corresponds to the defensin-like (DEFL) domain, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold. 38
38443 409632 cd21807 ABC-2_lan_permease_MutE_EpiE-like lantibiotic immunity ABC transporter MutE/EpiE family permease (also called ABC-2 transporter MutE/EpiE family permease) subunit. This subfamily includes lantibiotic ABC transporter permease subunits EpiE, MutE, SlvE and NisE, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, specifically to the lantibiotics mutacin, epidermin, nisin and salivaricin, respectively. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Staphylococcus epidermidis Tu3298, the lantibiotic epidermin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming epidermin is mediated by the ABC transporter immunity proteins composed of EpiF, EpiE and EpiG; the EpiE permease subunit transports epidermin to the surface and expels it from the membrane. This subfamily also includes the lantibiotic ABC transporter permease subunits MutE, SlvE, and NisE. Self-protection of the mutacin-producing strain Streptococcus mutans CH43 against the pore-forming lantibiotic mutacin is mediated by an ABC transporter composed of MutF, MutE and MutG. In salivaricin D-producing strain Streptococcus salivarius 5M6c, self-immunity against the intrinsically trypsin-resistant salivaricin is mediated via ABC transporter proteins SlvF, SlvE and SlvG, while in Lactococcus lactis, self-immunity against nisin is mediated by the ABC transporter NisFEG. 
The MutE, NisE and SlvE permease subunits transport mutacin, nisin and salivaricin, respectively, to the surface and expel them from the membrane. 234
38444 409633 cd21808 ABC-2_lan_permease_MutG lantibiotic immunity ABC transporter MutG family permease (also called ABC-2 transporter MutG family permease) subunit. This subfamily includes lantibiotic ABC transporter permease subunit MutG which is a highly hydrophobic, integral membrane protein, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, specifically to lantibiotic mutacin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Streptococcus mutans CH43, the lantibiotic mutacin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming mutacin is mediated by the ABC transporter composed of MutF, MutE, and MutG. This subfamily includes the MutG permease subunit that transports mutacin to the surface and expels it from the membrane. 237
38445 409634 cd21809 ABC-2_lan_permease-like lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit and similar proteins. This subfamily contains lantibiotic ABC transporter permease subunits which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to type-A lantibiotics. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Lactococcus lactis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG; the NisG permease subunit transports nisin to the surface and expels it from the membrane. This family includes mostly uncharacterized transport permease subunits that transport lantibiotics to the surface and expel them from the membrane. 235
38446 409635 cd21810 ABC-2_lan_permease_NisG-like lantibiotic immunity ABC transporter NisG family permease (also called ABC-2 transporter NisG family permease) subunit, and similar proteins. This subfamily contains lantibiotic ABC transporter permease subunits NisG and NsuG, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to the lantibiotic nisin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. In Lactococcus lactis and Streptococcus uberis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG. In Streptococcus uberis, similar proteins provide self-protection against the pore-forming lantibiotic nisin U. This subfamily contains the NisG and NsuG permease subunits that transport nisin to the surface and expel it from the membrane. 211
38447 409251 cd21811 CoV_Nsp7 coronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 83
38448 409627 cd21812 MHV-like_Nsp3_betaSM betacoronavirus-specific marker of non-structural protein 3 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 125
38449 409628 cd21813 HKU9-like_Nsp3_betaSM betacoronavirus-specific marker of non-structural protein 3 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 135
38450 409629 cd21814 SARS-CoV-like_Nsp3_betaSM betacoronavirus-specific marker of non-structural protein 3 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 116
38451 409630 cd21815 MERS-CoV-like_Nsp3_betaSM betacoronavirus-specific marker of non-structural protein 3 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 124
38452 409256 cd21816 CoV_Nsp8 Coronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. 
The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 194
38453 409622 cd21817 IgC1_CH1_IgEG CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy epsilon and gamma chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of epsilon and gamma chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 94
38454 409623 cd21818 IgC1_CH1_IgA CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy alpha chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of alpha chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 94
38455 409624 cd21819 IgC1_CH1_IgM CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy mu chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of mu chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors. 95
38456 409625 cd21820 IgC1_MHC_1b_Qa-1b Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1b; member of the C1-set of Ig superfamily (IgSF) domains. The non-classical mouse MHC class I (MHC-I) molecule Qa-1b is a non-polymorphic MHC molecule with an important function in innate immunity. It binds and presents signal peptides of classical MHC-I molecules at the cell surface and, as such, acts as an indirect sensor for the normal expression of MHC-I molecules. This signal peptide dominantly accommodated in the groove of Qa-1b is called Qdm, for Qa-1 determinant modifier, and its amino acid sequence AMAPRTLLL is highly conserved among mammalian species. The Qdm/Qa-1b complex serves as a ligand for the germ-line encoded heterodimeric CD94/NKG2A receptors expressed on natural killer (NK) cells and activated CD8+ T cells and transduces inhibitory signals to these lymphocytes. Thus, upon binding, Qa-1b signals NK cells not to engage in cell lysis. The molecular basis of Qa-1b function is unclear. 98
38457 409352 cd21821 MavE Dot/Icm type IV secretion system effector MavE. The Icm/Dot protein translocation apparatus is a type IVb secretion system, highly related to bacterial conjugative DNA transfer systems, and is important in establishing a replication vacuole. A complex of Icm/Dot proteins spans the bacterial envelope, allowing the transfer of proteins from the bacterial cytoplasm across membranes located in the target host eukaryotic cell. Icm/Dot-translocated substrates (IDTS) control construction of the replication compartment and have been shown to directly regulate membrane traffic associated with the movement of vesicles along steps in the early secretory system. Although the function of Legionella MavE is unknown, it has been shown to be an Icm/Dot-translocated substrate and is assumed to play a role in this type IV secretion system. 132
38458 409348 cd21822 SARS-CoV-like_Nsp3_NAB nucleic acid binding domain of non-structural protein 3 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage) and hibecovirus subgenus, including highly pathogenic human coronaviruses (CoVs) such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the B lineage. 107
38459 409349 cd21823 MERS-CoV-like_Nsp3_NAB nucleic acid binding domain of non-structural protein 3 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), and appears to be partially conserved in the Nsp3 NAB from betacoronaviruses in the C lineage. 123
38460 409350 cd21824 MHV-like_Nsp3_NAB nucleic acid binding domain of non-structural protein 3 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the A lineage. 119
38461 409351 cd21825 HKU9-like_Nsp3_NAB nucleic acid binding domain of non-structural protein 3 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the D lineage. 117
38462 409252 cd21826 alphaCoV_Nsp7 alphacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of alphacoronaviruses that include Feline infectious peritonitis virus (FCoV), Human coronavirus NL63 (HCoV-NL63), and Porcine transmissible gastroenteritis coronavirus (TGEV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
FCoV Nsp7 forms a 2:1 heterotrimer with Nsp8; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 83
38463 409253 cd21827 betaCoV_Nsp7 betacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of betacoronaviruses including the highly pathogenic Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 83
38464 409254 cd21828 gammaCoV_Nsp7 gammacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of gammacoronaviruses that include Avian infectious bronchitis virus (IBV) and Canada goose coronavirus (CGCoV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 83
38465 409255 cd21829 deltaCoV_Nsp7 deltacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 96
38466 409257 cd21830 alphaCoV_Nsp8 alphacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of alphacoronaviruses that include Feline infectious peritonitis virus (FCoV), Human coronavirus NL63 (HCoV-NL63), and Porcine epidemic diarrhea coronavirus (PEDV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. 
The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. FCoV Nsp8 forms a 1:2 heterotrimer with Nsp7; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 195
38467 409258 cd21831 betaCoV_Nsp8 betacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) of the highly pathogenic betacoronaviruses that include Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. 
The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 196
38468 409259 cd21832 gammaCoV_Nsp8 gammacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of gammacoronaviruses that include Avian infectious bronchitis virus (IBV) and Canada goose coronavirus (CGCoV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. 
SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 210
38469 409260 cd21833 deltaCoV_Nsp8 deltacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. 
SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 189
38470 411711 cd21834 Hhal-like Restriction endonuclease HhaI and similar endonucleases. HhaI is a type II restriction endonuclease that recognizes the symmetric sequence 5'-GCG|C-3' (| denotes the cleavage site) and produces fragments with 2-base, 3'-overhangs. Its domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and HindIII. 261
38471 409346 cd21835 SagF streptolysin S-associated protein SagF. Streptolysin S-associated protein SagF is encoded by the sagF gene, which has been identified as a hemolytic activity-related gene in SEZ (Streptococcus equi ssp. zooepidemicus). The sagF gene is located in the same operon as the sagD gene in the SEZ genome, implying that it plays an important role in SEZ hemolytic activity and is an indispensable gene in the sag operon for streptolysin S (SLS) biosynthesis. 225
38472 409345 cd21836 adhesin_CP Neisseria gonorrhoeae Adhesin Complex Protein and similar proteins. This model contains adhesin complex protein found in Neisseria gonorrhoeae (Ng-ACP), the causative organism of the sexually transmitted disease gonorrhoea, and similar proteins. Studies have shown that Ng-ACP is conserved and expressed by over 50 gonococcal strains and that recombinant proteins induce antibodies in mice that killed the bacteria in vitro. Thus, recombinant Ng-ACP (rNg-ACP) is a potential vaccine candidate that induces antibodies that are bactericidal and prevent the gonococcus from inhibiting the lytic activity of an innate defense molecule. This protein is structurally similar to N. meningitidis adhesin complex protein as well as members of the MliC/PliC protein family of membrane-bound or periplasmic inhibitors of human C-type lysozyme (HL), suggesting that Ng-ACP is probably located in the periplasm or phospholipid layer of the outer membrane. 93
38473 409344 cd21837 AvrRps4-like Pseudomonas syringae coiled-coil effector AvrRps4 C-terminal region, and regions in similar proteins. This model includes the C-terminal region of AvrRps4, a type III-secreted (T3S) effector protein originally identified in Pseudomonas syringae pv. pisi, a causal agent of bacterial blight in pea. AvrRps4 triggers RPS4 (resistance to P. syringae 4)-dependent immunity in resistant accessions of Arabidopsis. AvrRps4 is a bipartite effector, processed upon entry in planta by cleavage between two glycine residues, generating two protein parts, AvrRps4N and AvrRps4C. Mutation studies have shown that an electronegative surface patch in AvrRps4(C) is required for recognition by RPS4; mutations in this region have been shown to uncouple triggering of the hypersensitive response from disease resistance. The N-terminal part of AvrRps4 was previously assumed to only function in effector secretion into the host cell; however, in Arabidopsis, which uses a pair of resistance proteins, RRS1 and RPS4, both AvrRps4 parts are required for triggering resistance in Arabidopsis, and in fact, AvrRps4N on its own has some functions of an effector, implying that the fusion of the two AvrRps4 parts may have arisen to counteract plant defenses. 86
38474 412061 cd21864 GTSE1_CTD C-terminal domain of G2 and S phase-expressed protein 1. G2 and S phase-expressed protein 1 (GTSE-1), also called protein B99 homolog, is a cell cycle-regulated protein mainly localized in the cytoplasm and apparently associated with microtubules. It may be involved in p53-induced cell cycle arrest in G2/M phase by interfering with microtubule rearrangements that are required to enter mitosis. Overexpression of GTSE-1 delays G2/M phase progression. GTSE-1 is a clathrin adaptor protein; it is recruited to the spindle by clathrin, which stabilizes microtubules by inhibiting the microtubule depolymerase MCAK. This model corresponds to a conserved domain at the C-terminus of GTSE-1, which is required for clathrin binding and is only conserved in vertebrates. 56
38475 409286 cd21868 CC1_SLMAP-like first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein and similar proteins. The family includes Sarcolemmal membrane-associated protein (SLMAP), its paralog TRAF3-interacting JNK-activating modulator (T3JAM), and similar proteins. SLMAP, also called Sarcolemmal membrane-associated protein, is a cardiac tail-anchored membrane protein that may play a role during myoblast fusion. T3JAM, also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. SLMAP contains an N-terminal FHA domain, followed by four coiled-coil (CC) domains and a transmembrane domain. The model corresponds to the first CC (CC1) domain that is responsible for the binding of suppressor of IKBKE 1 (SIKE1). 38
38476 409343 cd21871 VscT2 type III secretion system apparatus protein VscT2 in Vibrio species. This model contains Vibrio type III secretion system (T3SS) apparatus protein VscT2. Vibrios, which include over 100 species, are ubiquitous in marine and estuarine environments, and many species such as Vibrio cholerae, V. parahaemolyticus and V. mimicus, are pathogens for humans. VscT2 co-occurs with vscS2, vscN2, vscC2 and vscR2 which are all essential for T3SS secretion. 130
38477 409325 cd21872 CoV_Nsp10 coronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-Mtase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation, and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. 
The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16, and their complex can interact with DII4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity. 131
38478 409342 cd21873 Ugr_9a-1-like includes Urticina grebelnyi Ugr 9a-1, Anemonia viridis Avd13a/b, Antheopsis maculata Amc1a peptide actitoxins. This model includes novel peptides isolated from sea anemone venom, including Urticina grebelnyi Ugr 9a-1 (also called pi-anemonetoxin (pi-AnmTX) Ugr 9a-1 or Pi-actitoxin-Ugr1a or Ugr 9-1), Antheopsis maculata Amc1a (also called delta-actitoxin-Amc1a, Delta-AITX-Amc1a, AnmTX Ama 9a-1 or peptide toxins Am-1) and Anemonia viridis Avd13b (also called U-actitoxin-Avd13b, AnmTX Avi 9a-1, or peptide toxin AV-2). These peptides belong to structural group 9a. Ugr 9a-1 has an uncommon beta-hairpin structure, stabilized by two S-S bridges. Its precursor protein appears to be processed in the following sequence: release of the signal peptide and of the propeptide, production of six identical 34-residue peptides by cleavage between Arg and Glu, release of four N-terminal and three C-terminal residues from each peptide and hydroxylation of each Pro in position 6 of the resulting 27-residue peptides. Ugr1a has been shown to produce a reversible inhibition effect on both the transient and the sustained current of human acid-sensing ion channel 3 (ASIC3) channels expressed in Xenopus laevis oocytes; it completely blocks the transient component and partially (48%) inhibits the amplitude of the sustained component. In mice, it significantly reversed inflammatory and acid-induced pain. 29
38479 409336 cd21874 alpha_betaCoV_Nsp1 non-structural protein 1 from alpha- and betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from alpha- and betacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Gamma- and deltaCoVs do not have Nsp1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 103
38480 409337 cd21875 PEDV-like_alphaCoV_Nsp1 non-structural protein 1 from porcine epidemic diarrhea virus and similar alphacoronaviruses. This model represents the non-structural protein 1 (Nsp1) from porcine epidemic diarrhea virus (PEDV) and similar alphacoronaviruses from several subgenera including pedacovirus, setracovirus, duvinacovirus, decacovirus, colacovirus, myotacovirus, minunacovirus, and rhinacovirus. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 108
38481 409338 cd21876 betaCoV_Nsp1 non-structural protein 1 from betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from betacoronaviruses, including highly pathogenic coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 114
38482 409339 cd21877 HKU9-like_Nsp1 non-structural protein 1 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 165
38483 409340 cd21878 MERS-CoV-like_Nsp1 non-structural protein 1 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 170
38484 409341 cd21879 MHV-like_Nsp1 non-structural protein 1 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV), bovine coronavirus (BCoV) and Human coronavirus HKU1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and MHV genomes cause drastic reduction or elimination of infectious virus; BCoV Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 236
38485 409329 cd21881 CoV_Nsp9 coronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from coronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for CoV replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG at the C-terminus; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 111
38486 411975 cd21882 TRPV Transient Receptor Potential channel, Vanilloid subfamily (TRPV). The vanilloid TRP subfamily (TRPV), named after the vanilloid receptor 1 (TRPV1), consists of six members: four thermo-sensing channels (TRPV1, TRPV2, TRPV3, and TRPV4) and two Ca2+ selective channels (TRPV5 and TRPV6). The calcium-selective channels TRPV5 and TRPV6 can be heterotetramers and are important for general Ca2+ homeostasis. All four channels within the TRPV1-4 group show temperature-invoked currents when expressed in heterologous cell systems, ranging from activation at ~25C for TRPV4 to ~52C for TRPV2. The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. The TRP family consists of membrane proteins that function as ion channels that communicate between the cell and its environment, by a vast array of physical or chemical stimuli, including radiation (in the form of temperature, infrared, or light) and pressure (osmotic or mechanical). TRP channels are formed by a tetrameric complex of channel subunits. Based on sequence identity, the mammalian TRP channel family is classified into six subfamilies, with significant sequence similarity within the transmembrane domains, but very low similarity in their N- and C-terminal cytoplasmic regions. The six subfamilies are named based on their first member: TRPC (canonical), TRPV (vanilloid), TRPM (melastatin), TRPA (ankyrin), TRPML (mucolipin), and TRPP (polycystic). 600
38487 409330 cd21897 alphaCoV_Nsp9 alphacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) of alphacoronaviruses, including Porcine epidemic diarrhea virus (PEDV), Porcine transmissible gastroenteritis coronavirus (TGEV), and Human coronavirus 229E. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 108
38488 409331 cd21898 betaCoV_Nsp9 betacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from betacoronaviruses including highly pathogenic Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 111
38489 409332 cd21899 gammaCoV_Nsp9 gammacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from gammacoronaviruses such as Avian infectious bronchitis virus (IBV). CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 113
38490 409333 cd21900 deltaCoV_Nsp9 deltacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from deltacoronaviruses such as Porcine delta coronavirus (PDCoV), also called Porcine coronavirus HKU15. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 109
38491 409326 cd21901 alpha_betaCoV_Nsp10 alphacoronavirus and betacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of alpha- and betacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), Middle East respiratory syndrome-related (MERS) CoV, and alphacoronaviruses such as Human coronavirus 229E. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and, therefore, for the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation, and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. 
The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity. 130
38492 409327 cd21902 gammaCoV_Nsp10 gammacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of gammacoronaviruses, including Infectious bronchitis virus (IBV) and Bottlenose dolphin coronavirus HKU22 (BdCoV HKU22). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and, therefore, for the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. 
Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity. 134
38493 409328 cd21903 deltaCoV_Nsp10 deltacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of deltacoronaviruses, including Thrush coronavirus HKU12-600 and Wigeon coronavirus HKU20. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and, therefore, for the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. 
Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity. 128
38494 409324 cd21904 TtfA-like Mycobacterial trehalose monomycolate transport factor A and similar proteins. TtfA (trehalose monomycolate transport factor A) plays a role in the transport of trehalose monomycolate across the inner membrane, potentially by forming a complex with the atypical lipid transporter MmpL3. Trehalose monomycolate is a component of the mycobacterial envelope. The core domain of TtfA shows strong structural similarity to class I type III secretion system (T3SS) chaperones, and TtfA may play other roles besides assisting in mycolate transport, given its phylogenetic distribution. 171
38495 409300 cd21905 PUA_TruB_thermotogae PUA RNA-binding domain of the thermotogae tRNA pseudouridine synthase B. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the thermotogae subfamily of pseudouridine synthases TruB are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the T-psi-C loop of elongator tRNAs. 78
38496 409287 cd21911 CC1_SLMAP first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein. Sarcolemmal membrane-associated protein (SLMAP) is a cardiac tail-anchored membrane protein that may play a role during myoblast fusion. SLMAP contains an N-terminal FHA domain followed by four coiled-coil (CC) domains and a transmembrane domain. The model corresponds to the first CC (CC1) domain that is responsible for the binding of suppressor of IKBKE 1 (SIKE1). 63
38497 409288 cd21912 CC1_T3JAM first coiled-coil (CC1) domain found in TRAF3-interacting JNK-activating modulator. TRAF3-interacting JNK-activating modulator (T3JAM), also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. The model corresponds to a conserved region that shows high sequence similarity with the first CC (CC1) domain of Sarcolemmal membrane-associated protein (SLMAP), which is responsible for the binding of suppressor of IKBKE 1 (SIKE1). 45
38498 409285 cd21913 Nip7_N_arch N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7. The N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7 co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. Nip7 is involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of the 60S ribosomal subunit. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. 85
38499 409275 cd21927 ZIP_TSC22D-like leucine zipper found in the TSC22 domain leucine zipper transcription factors, c-Myc-binding protein, and similar proteins. The family includes TGF-beta-stimulated clone-22 domain (TSC22D) leucine zipper transcription factors, TSC22D1-4, as well as c-Myc-binding protein (MycBP). TSC22D proteins have diverse physiological functions, including cell growth, development, homeostasis, and immune regulation. MycBP, also called associate of Myc 1 (AMY-1), is a novel c-Myc binding protein that may control the transcriptional activity of Myc. It stimulates the activation of E box-dependent transcription by Myc. Members of this family contain a conserved leucine zipper (ZIP) domain. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. In the bZIP family of transcription factors, the leucine zipper acts as a dimerization domain and the upstream basic region as a DNA-binding domain. However, DNA-binding capability of TSC22D family proteins is not obvious, due to the lack of the basic region found in the original bZIP DNA-binding domains. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 51
38500 409272 cd21928 LGNbd_FRMPD1_D4-like LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD4, and similar proteins. The family includes FRMPD1, FRMPD4, and similar proteins. FRMPD1, also called FERM domain-containing protein 2 (FRMD2), stabilizes membrane-bound GPSM1, and thereby promotes its interaction with GNAI1. It also acts as a regulatory binding partner of Activator of G-protein Signaling 3 (AGS3). FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain-containing protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. This model corresponds to a conserved region in FRMPD1 and FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD1 and FRMPD4. 37
38501 412018 cd21930 IPD_PPP1R12 inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12 (PPP1R12) family. The PPP1R12 family includes PPP1R12A/MYPT1, PPP1R12B/MYPT2, and PPP1R12C. PPP1R12A/MYPT1, also called myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit, is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). It acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, PPP1R12A/MYPT1 is involved in dephosphorylation of PLK1. It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. PPP1R12B/MYPT2, also called myosin phosphatase target subunit 2, is the targeting subunit of smooth-muscle myosin phosphatase that regulates myosin phosphatase activity and augments Ca(2+) sensitivity of the contractile apparatus. PPP1R12C, also called protein phosphatase 1 myosin-binding subunit of 85 kDa (MBS85), protein phosphatase 1 myosin-binding subunit p85, or LENG3, regulates myosin phosphatase activity. All family members contain an inhibitory phosphorylation domain. 47
38502 409267 cd21931 TD_EMAP-like trimerization domain of the echinoderm microtubule-associated protein-like family. The echinoderm microtubule-associated protein (EMAP)-like (EML) family includes EMAP-1, EMAP-2, EMAP-3, and EMAP-4. EMAP-1, also called EMAL1, EMAPL or EMAPL1, modulates the assembly and organization of the microtubule cytoskeleton, and probably plays a role in regulating the orientation of the mitotic spindle and the orientation of the plane of cell division. It is required for normal proliferation of neuronal progenitor cells in the developing brain and for normal brain development. EMAP-2, also called EML2 or EMAPL2, is a tubulin binding protein that inhibits microtubule nucleation and growth, resulting in shorter microtubules. EMAP-3, also called EML3, is a nuclear microtubule-binding protein required for the correct alignment of chromosomes in metaphase. EMAP-4, also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to a conserved trimerization domain located at the N-terminus of EML family members. 44
38503 409264 cd21932 MIU2_RNF168-like second motif interacting with ubiquitin domain found in RING finger protein 168 and similar domains. The domain family includes motif interacting with ubiquitin (MIU) domains of the RING finger proteins RNF168 and RNF169. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of the DSB repair pathway by competing with repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for interaction with K63 linked poly-ubiquitin. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU domain. This model corresponds to the second MIU (MIU2) domain of RNF168 and the C-terminal MIU domain of RNF169, which is responsible for bridging histone and ubiquitin surfaces. 42
38504 409261 cd21933 TBK1_IKKE-like_C C-terminal domain of non-canonical Inhibitor of kappa B kinases, IKK-E and TBK1, and similar proteins. Inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E or IKK-epsilon) and TANK-binding kinase 1 (TBK1) are non-canonical members of IKK family. They have been characterized as activators of nuclear factor-kappaB (NF-kappaB), but they are not essential for NF-kappaB activation. They play critical roles in antiviral response via phosphorylation and activation of transcription factors IRF3, IRF7, STAT1, and STAT3. They are also involved in the survival, tumorigenesis, and development of various cancers. Both IKK-epsilon and TBK1 contain an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, a coiled-coil domain 1 (CCD1), and a C-terminal elongated alpha-helical domain. The model corresponds to the C-terminal elongated alpha-helical domain. It is responsible for the binding of adaptor proteins, optineurin (OPTN) and NAP1, to TBK1. 43
38505 409276 cd21936 ZIP_TSC22D leucine zipper domain found in the TSC22 domain family of leucine zipper transcription factors. The TGF-beta-stimulated clone-22 domain (TSC22D) family includes TSC22D1-4 and similar proteins. They have diverse physiological functions, including cell growth, development, homeostasis, and immune regulation. All family members contain a conserved leucine zipper (ZIP) domain located at the C-terminus. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. In the bZIP family of transcription factors, the leucine zipper acts as a dimerization domain and the upstream basic region as a DNA-binding domain. However, DNA-binding capability of TSC22D family proteins is not obvious, due to the lack of the basic region found in the original bZIP DNA-binding domains. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 49
38506 409277 cd21937 ZIP_MycBP-like leucine zipper domain found in c-Myc-binding protein and similar proteins. MycBP, also called associate of Myc 1 (AMY-1), is a novel c-Myc binding protein that may control the transcriptional activity of Myc. It stimulates the activation of E box-dependent transcription by Myc. This model corresponds to the conserved region that shows high sequence similarity with the leucine zipper (ZIP) domain located at the C-terminus of TGF-beta-stimulated clone-22 domain (TSC22D) family transcription factors. The first helix of ZIP is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 53
38507 409278 cd21938 ZIP_TSC22D1 leucine zipper domain found in TSC22 domain family protein 1. TSC22 domain family protein 1 (TSC22D1) is also called cerebral protein 2, regulatory protein TSC-22, TGFB-stimulated clone 22, or transforming growth factor beta-1-induced transcript 4 protein (TGFB1I4). It is a transcriptional repressor that was reported to be present in both the cytoplasmic and the nuclear fraction. It is activated by transforming growth factor-beta1 and other growth factors of osteoblastic cells. TSC22D1 acts on the C-type natriuretic peptide (CNP) promoter. It enhances c-Myc-mediated activation of the telomerase reverse transcriptase (TERT) promoter. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D1. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 79
38508 409279 cd21939 ZIP_TSC22D2 leucine zipper domain found in TSC22 domain family protein 2. TSC22 domain family protein 2 (TSC22D2), also called transforming growth factor beta-stimulated clone 22 domain family member 2, or TSC22-related-inducible leucine zipper protein 4 (TILZ4), may participate in the regulation of cell growth. It interacts with pyruvate kinase isoform M2 (PKM2) and WD repeat domain 77 (WDR77). The model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D2. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 63
38509 409280 cd21940 ZIP_TSC22D3 leucine zipper domain found in TSC22 domain family protein 3. TSC22 domain family protein 3 (TSC22D3) is also called DSIP-immunoreactive peptide, protein DIP, delta sleep-inducing peptide immunoreactor, glucocorticoid-induced leucine zipper protein (GILZ), TSC-22-like protein, or TSC-22-related protein (TSC-22R). It protects T-cells from IL2 deprivation-induced apoptosis through the inhibition of FOXO3A transcriptional activity that leads to the down-regulation of the pro-apoptotic factor BCL2L11. In macrophages, it plays a role in the anti-inflammatory and immunosuppressive effects of glucocorticoids and IL10. In T-cells, it inhibits anti-CD3-induced NFKB1 nuclear translocation. TSC22D3 contains a leucine zipper motif, a Pro/Glu rich domain, and three potential phosphorylation sites. This model corresponds to the leucine zipper (ZIP) domain. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 81
38510 409281 cd21941 ZIP_TSC22D4 leucine zipper domain found in TSC22 domain family protein 4. TSC22 domain family protein 4 (TSC22D4), also called TSC22-related-inducible leucine zipper protein 2 (TILZ2), or Tsc-22-like protein THG-1, is a transcriptional repressor that acts as a molecular determinant of insulin signalling and glucose handling. It also functions in hepatic lipid handling by regulating hepatic very-low-density-lipoprotein (VLDL) release and lipogenic gene expression. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D4. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 74
38511 409273 cd21942 LGNbd_FRMPD1 LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing protein 1. FERM and PDZ domain-containing protein 1 (FRMPD1), also called FERM domain-containing protein 2 (FRMD2), stabilizes membrane-bound GPSM1, and thereby promotes its interaction with GNAI1. It also acts as a regulatory binding partner of Activator of G-protein Signaling 3 (AGS3). This model corresponds to a conserved region in FRMPD1 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD1. 38
38512 409274 cd21943 LGNbd_FRMPD4 LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing protein 4. FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. FRMPD4 contains WW, PDZ and FERM domains in the N-terminal region. This model corresponds to a conserved region in the C-terminal region of FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD4. 49
38513 412019 cd21944 IPD_MYPT1 inhibitory phosphorylation domain of myosin phosphatase targeting subunit 1 (MYPT1). MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit, is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of MYPT1. 57
38514 412020 cd21945 IPD_PPP1R12C inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12C (PPP1R12C). PPP1R12C, also called protein phosphatase 1 myosin-binding subunit of 85 kDa (MBS85), protein phosphatase 1 myosin-binding subunit p85, or LENG3, regulates myosin phosphatase activity. This model corresponds to a conserved region of PPP1R12C, which shows high sequence similarity to the inhibitory phosphorylation domain of MYPT1. 54
38515 412021 cd21946 IPD_MYPT2 inhibitory phosphorylation domain of myosin phosphatase targeting subunit 2 (MYPT2). MYPT2, also called protein phosphatase 1 regulatory subunit 12B (PPP1R12B), or myosin phosphatase target subunit 2, is the targeting subunit of smooth-muscle myosin phosphatase that regulates myosin phosphatase activity and augments Ca(2+) sensitivity of the contractile apparatus. This model corresponds to the inhibitory phosphorylation domain of MYPT2. 53
38516 409268 cd21947 TD_EMAP1 trimerization domain of echinoderm microtubule-associated protein-like 1. Echinoderm microtubule-associated protein-like 1 (EMAP-1), also called EMAL1, EMAPL, or EMAPL1, modulates the assembly and organization of the microtubule cytoskeleton, and probably plays a role in regulating the orientation of the mitotic spindle and the orientation of the plane of cell division. It is required for normal proliferation of neuronal progenitor cells in the developing brain and for normal brain development. This model corresponds to a conserved region located at the N-terminus of EMAP-1, which shows high sequence similarity with the N-terminal trimerization domain of EMAP-4 and EMAP-2. 58
38517 409269 cd21948 TD_EMAP2 trimerization domain of echinoderm microtubule-associated protein-like 2. Echinoderm microtubule-associated protein-like 2 (EMAP-2), also called EML2 or EMAPL2, is a tubulin binding protein that inhibits microtubule nucleation and growth, resulting in shorter microtubules. This model corresponds to the N-terminal trimerization domain of EMAP-2. 48
38518 409270 cd21949 TD_EMAP3 trimerization domain of echinoderm microtubule-associated protein-like 3. Echinoderm microtubule-associated protein-like 3 (EMAP-3), also called EML3, is a nuclear microtubule-binding protein required for the correct alignment of chromosomes in metaphase. It may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to a conserved region located at the N-terminus of EMAP-3, which shows high sequence similarity with the N-terminal trimerization domain of EMAP-2 and EMAP-4. 48
38519 409271 cd21950 TD_EMAP4 trimerization domain of echinoderm microtubule-associated protein-like 4. Echinoderm microtubule-associated protein-like 4 (EMAP-4), also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to the N-terminal trimerization domain of EMAP-4. 59
38520 409265 cd21951 MIU_RNF169_C C-terminal motif interacting with ubiquitin domain found in RING finger protein 169. RING finger protein 169 (RNF169) is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of the DSB repair pathway by competing with repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU (motif interacting with ubiquitin) domain. This model corresponds to the MIU domain of RNF169, which shows high sequence similarity with the second MIU (MIU2) domain of RNF168, and is responsible for bridging histone and ubiquitin surfaces. 54
38521 409266 cd21952 MIU2_RNF168 second motif interacting with ubiquitin domain found in RING finger protein 168. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin. This model corresponds to the second MIU (MIU2) domain of RNF168. The first MIU belongs to a different domain family and is not included here. 51
38522 409262 cd21953 IKKE_C C-terminal domain of inhibitor of nuclear factor kappa-B kinase subunit epsilon. Inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E) (EC 2.7.11.10) is also called I-kappa-B kinase epsilon, IKK-epsilon, IkBKE, inducible I kappa-B kinase, or IKK-I. It is an interferon regulatory factor-activating kinase that is a non-canonical member of the IKK family. It is involved in cellular innate immunity by inducing type I interferons. It is induced by the activation of nuclear factor-kappaB (NF-kappaB). IKK-E has also been implicated in antiviral immune response in higher vertebrates. It acts as a crucial pro-survival factor in human T cell leukemia virus type 1 (HTLV-1)-transformed T lymphocytes. Moreover, IKK-E plays an essential role in tumor initiation and progression. It inhibits protein kinase C (PKC) to promote Fascin-dependent actin bundling. IKK-E contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, a coiled-coil domain 1 (CCD1), and a C-terminal elongated helical domain. This model corresponds to the C-terminal elongated helical domain of IKK-E that shows high sequence similarity with the C-terminal domain of TBK1, which is responsible for binding to its adaptor proteins, optineurin (OPTN) and NAP1. 48
38523 409263 cd21954 TBK1_C C-terminal domain of TANK-binding kinase 1. TANK-binding kinase 1 (TBK1), also called T2K and NF-kB-activating kinase, is a serine/threonine-protein kinase that is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. TBK1 may also play roles in cell transformation and oncogenesis. In addition, it regulates optineurin (OPTN), an important autophagy receptor involved in several selective autophagy processes. TBK1 contains N-terminal serine/threonine protein kinase, ubiquitin-like (Ubl), coiled-coil domain 1 (CCD1), and C-terminal alpha-helical domains. This model corresponds to a small conserved elongated alpha-helical domain at the C-terminus of TBK1, which is responsible for the binding of its adaptor proteins such as OPTN and NAP1. 47
38524 409250 cd21955 SARS-CoV_ORF9b accessory protein 9b of severe acute respiratory syndrome-associated coronavirus and similar proteins. This model represents the accessory protein 9b (ORF9b) from Severe acute respiratory syndrome-associated coronavirus (SARS-CoV) and some related betacoronaviruses such as bat coronavirus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF9b is a product of an alternative open reading frame within the N gene from SARS coronavirus. It is a lipid-binding protein that has been shown to associate with intracellular vesicles in mammalian cells, consistent with a role in the assembly of the virion. ORF9b localizes to mitochondria and causes mitochondrial elongation by triggering ubiquitination and proteasomal degradation of dynamin-like protein 1, a host protein involved in mitochondrial fission. It also targets the mitochondrial-associated adaptor molecule MAVS signalosome to trigger the degradation of MAVS, TRAF3, and TRAF6, which severely limits host cell interferon responses. There are slight differences in the genome organization of SARS-CoV-2 in different studies; not all SARS-CoV-2 isolates are reported as having ORF9b. 89
38525 409248 cd21963 Syt1_N N-terminal domain of synaptotagmin-1 (Syt1) and similar proteins. Syt1, also called synaptotagmin I (SytI), or p65, is a calcium sensor that participates in triggering neurotransmitter release at the synapse. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. Syt1 binds acidic phospholipids with a specificity that requires the presence of both an acidic head group and a diacyl backbone. A Ca(2+)-dependent interaction between synaptotagmin and putative receptors for activated protein kinase C has also been reported. It can bind to at least three additional proteins in a Ca(2+)-independent manner; these are neurexins, syntaxin and AP2. Syt1 also plays a role in dendrite formation by melanocytes. The model corresponds to the N-terminal domain of Syt1, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B). 108
38526 409249 cd21964 Syt2_N N-terminal domain of synaptotagmin-2 (Syt2) and similar proteins. Syt2, also called synaptotagmin II (SytII), exhibits calcium-dependent phospholipid and inositol polyphosphate binding properties. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. It plays a role in dendrite formation by melanocytes. The model corresponds to the N-terminal domain of Syt2, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B). 111
38527 412012 cd21965 Zn-C2H2_CALCOCO1_TAX1BP1_like autophagy receptor zinc finger-C2H2 domain found in calcium-binding and coiled-coil domain-containing proteins, TAX1BP1 and similar proteins. The family includes calcium-binding and coiled-coil domain-containing proteins (CALCOCO1 and CALCOCO2), TAX1BP1 and similar proteins. CALCOCO1, also called calphoglin, or coiled-coil coactivator protein, or Sarcoma antigen NY-SAR-3, functions as a coactivator for aryl hydrocarbon and nuclear receptors (NR). CALCOCO2, also called antigen nuclear dot 52 kDa protein, or nuclear domain 10 protein NDP52, or nuclear domain 10 protein 52, or nuclear dot protein 52, is a ubiquitin-binding autophagy receptor involved in the selective autophagic degradation of invading pathogens. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. The family also includes Drosophila melanogaster Spindle-F (Spn-F) that is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. This model corresponds to the C2H2-type zinc binding domain found in family members. It is a typical C2H2-type zinc finger which specifically recognizes mono-ubiquitin or poly-ubiquitin chain. The overall ubiquitin-binding mode utilizes the C-terminal alpha-helix to interact with the solvent-exposed surface of the central beta-sheet of ubiquitin, similar to that observed in the RABGEF1/Rabex-5 or POLN/Pol-eta zinc finger. 24
38528 412013 cd21967 Zn-C2H2_CALCOCO1 C2H2-type zinc binding domain found in calcium-binding and coiled-coil domain-containing protein 1 (CALCOCO1) and similar proteins. CALCOCO1, also called calphoglin, or coiled-coil coactivator protein, or Sarcoma antigen NY-SAR-3, functions as a coactivator for aryl hydrocarbon and nuclear receptors (NR). It is recruited to promoters through its contact with the N-terminal basic helix-loop-helix-Per-Arnt-Sim (PAS) domain of transcription factors or coactivators, such as NCOA2. During ER-activation CALCOCO1 acts synergistically in combination with other NCOA2-binding proteins, such as EP300, CREBBP and CARM1. It is involved in the transcriptional activation of target genes in the Wnt/CTNNB1 pathway. It functions as a secondary coactivator in LEF1-mediated transcriptional activation via its interaction with CTNNB1. In association with CCAR1, CALCOCO1 enhances GATA1- and MED1-mediated transcriptional activation from the gamma-globin promoter during erythroid differentiation of K562 erythroleukemia cells. CALCOCO1 contains a C2H2-type zinc binding domain. 29
38529 412014 cd21968 Zn-C2H2_CALCOCO2 C2H2-type zinc binding domain found in calcium-binding and coiled-coil domain-containing protein 2 (CALCOCO2) and similar proteins. CALCOCO2, also called antigen nuclear dot 52 kDa protein, or nuclear domain 10 protein NDP52, or nuclear domain 10 protein 52, or nuclear dot protein 52, is a xenophagy-specific receptor required for autophagy-mediated intracellular bacteria degradation. It acts as an effector protein of galectin-sensed membrane damage that restricts the proliferation of infecting pathogens such as Salmonella typhimurium upon entry into the cytosol by targeting LGALS8-associated bacteria for autophagy. It may play a role in ruffle formation and actin cytoskeleton organization and seems to negatively regulate constitutive secretion. CALCOCO2 contains a C2H2-type zinc binding domain. 27
38530 412015 cd21969 Zn-C2H2_TAX1BP1_rpt1 first C2H2-type zinc binding domain found in tax1-binding protein 1 (TAX1BP1) and similar proteins. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. It inhibits TNF-induced apoptosis by mediating the TNFAIP3 anti-apoptotic activity. It may also play a role in the pro-inflammatory cytokine IL-1 signaling cascade. TAX1BP1 is degraded by caspase-3-like family proteins upon TNF-induced apoptosis. TAX1BP1 contains two C2H2-type zinc binding domains; this model corresponds to the first one. 24
38531 412016 cd21970 Zn-C2H2_TAX1BP1_rpt2 second C2H2-type zinc binding domain found in tax1-binding protein 1 (TAX1BP1) and similar proteins. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. It inhibits TNF-induced apoptosis by mediating the TNFAIP3 anti-apoptotic activity. It may also play a role in the pro-inflammatory cytokine IL-1 signaling cascade. TAX1BP1 is degraded by caspase-3-like family proteins upon TNF-induced apoptosis. TAX1BP1 contains two C2H2-type zinc binding domains; this model corresponds to the second one. 27
38532 412017 cd21971 Zn-C2H2_spn-F C2H2-type zinc binding domain found in Drosophila melanogaster Spindle-F (Spn-F) and similar proteins. Spn-F is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. It acts downstream of IKK-related kinase Ik2 in the same pathway for dendrite pruning. Spn-F is a coiled-coil protein containing a C2H2-type zinc binding domain. 30
38533 409230 cd21972 KLF1_2_4_N N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins. 194
38534 409246 cd21973 KLF6_7_N-like N-terminal domain of Kruppel-like factor (KLF) 6, KLF7, and similar proteins. This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 6, KLF7, and similar proteins, including KLF Luna, a Drosophila KLF6/KLF7. KLF6 contributes to cell proliferation, differentiation, cell death and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARa). KLF7 is involved in regulation of the development and function of the nervous system and adipose tissue, type 2 diabetes, blood diseases, as well as pluripotent cell maintenance. KLF Luna is maternally required for synchronized nuclear and centrosome cycles in the preblastoderm embryo. KLF6 and KLF7 are transcriptional activators. They belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF6, KLF7, and similar proteins. 138
38535 409243 cd21974 KLF10_11_N N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins. This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins. 229
38536 409240 cd21975 KLF9_13_N-like Kruppel-like factor (KLF) 9, KLF13, KLF14, KLF16, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF9, KLF13, KLF14, KLF16, and similar proteins. 163
38537 409235 cd21976 SARS-CoV_ORF9c accessory protein ORF9c (also referred to as ORF14) from Severe acute respiratory syndrome-associated coronavirus and related coronaviruses. This model represents the accessory protein 9c (ORF9c, also referred to as ORF14/protein 14) from Sarbecoviruses including Severe acute respiratory syndrome-associated coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Bat SARS-like coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF9c/protein 14 is a product of an alternative open reading frame (ORF) within the N gene from SARS coronavirus. A study of the SARS-CoV2-human protein-protein interaction network (including cloning, tagging and expressing SARS-CoV2 proteins in human cells followed by affinity-purification mass spectrometry) uncovered ORF9c/protein 14 interactions, including those with Sigma receptors (implicated in lipid remodeling and ER stress response), mitochondrial electron transport (ECSIT, ACAD9, NDUFAF1, NDUFB9), GPI-anchor biosynthesis (GPAA1, GIPS), and with innate immune signaling proteins (NLRX1, F2RL1, NDFIP2). A preliminary study, using a computational and knowledge-based approach to investigate the interplay between host and SARS-CoV2 in various signaling pathways, supports that SARS-CoV2 ORF9c protein may perturb host antiviral inflammatory cytokine and interferon production pathways (DOI:10.1101/2020.05.06.050260). There are slight differences in the genome organization of SARS-CoV2 in different studies; not all SARS-CoV2 isolates are reported as having ORF9c/protein 14. 70
38538 409233 cd22054 NAC_NACA nascent polypeptide-associated complex (NAC), alpha subunit. The nascent polypeptide-associated complex (NAC) is a complex, conserved from archaea to human, that plays an important role in cotranslational targeting of nascent polypeptides to the endoplasmic reticulum (ER). In eukaryotes, under physiological conditions, the complex is a stable heterodimer of the NAC alpha subunit and the NAC beta subunit, also known as basal transcription factor 3b (BTF3b). An imbalance of the relative concentrations has been observed in diseases, like Alzheimer's, AIDS, and ulcerative colitis. NAC alpha consists of a NAC domain, also present in BTF3, and a unique C-terminal ubiquitin-associated (UBA) domain. 48
38539 409234 cd22055 NAC_BTF3 basal transcription factor BTF3. Basal transcription factor 3 (BTF3) plays an important role in the transcriptional regulation linked to growth and development in eukaryotes. In mammals, the BTF3 gene encodes two alternative splicing isoforms, BTF3a and BTF3b. The full-length BTF3a protein activates transcription. The shortened BTF3b, which lacks the first 44 amino-terminal residues, is a component of the nascent polypeptide-associated complex (NAC), involved in regulating protein localization during translation. BTF3 is involved in oncogenesis; overexpression of BTF3 has been shown to be associated with a variety of malignancies such as cancer of the colon, pancreas, stomach, prostate and breast. It is upregulated in hypopharyngeal squamous cell carcinoma (HSCC) tumors correlating with lymph node metastasis and tumor promotion, thus indicating that BTF3 is a potential therapeutic target and prognostic biomarker for HSCC. BTF3 has also been implicated in the pathogenesis of osteosarcoma (OS), a malignant cancer that affects rapidly proliferating bones, and has a poor prognosis. 117
38540 409231 cd22056 KLF1_2_4_N-like N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of Kruppel-like factor (KLF)1, KLF2, and KLF4. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4. 339
38541 409200 cd22057 WH2_WAVE Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family members 1 (WASP1 or WAVE1), 2 (WASP2 or WAVE2) and 3 (WASP3 or WAVE3). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in three Wiskott-Aldrich syndrome protein (WASP) family verprolin homologous protein (SCAR/WAVE) isoforms: WAVE1, WAVE2, and WAVE3. Members of this family activate actin related protein (Arp)2/3-dependent actin nucleation and branching in response to signals mediated by Rho-family GTPases. The domain structure of these proteins varies, reflecting different modes of regulation; however, they all share a common C-terminal WH2 region which constitutes the smallest fragment necessary for Arp2/3 activation. These proteins interact with actin via their WH2 domain. 28
38542 409201 cd22058 WH2_N_WASP first and second of two tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeats found in Neural Wiskott-Aldrich syndrome protein (N-WASP). This family contains both tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) repeats found in the Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP); N-WASP contains two tandem WH2 domains. N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm. 23
38543 409202 cd22059 WH2_BetaT Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in beta-Thymosin, and similar proteins. This family contains beta-thymosin (betaT; also called thymosin beta or Tbeta) domain which is similar to the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2). Proteins in the beta-thymosin family are small peptides that act as actin monomer (G-actin) sequestering factors. They bind to G-actin into a 1:1 complex, rendering G-actin resistant to polymerization into filaments (F-actin). Thymosin beta 4 (Tbeta4 or TB4) and beta10 (Tbeta10) are minor variants of betaT that bind skeletal muscle actin and inhibit actin polymerization. Thymosin beta4 can also bind to polymerized F-actin. The roles of beta-thymosins also appear to extend beyond G-actin sequestration. Thymosin beta4 has also been linked to a number of additional biological events, including angiogenesis, wound healing, inflammation, and intracellular signaling through kinase activation. Research on thymosin beta10 in breast cancer cells has suggested a relationship with actin cytoskeletal remodeling and cell motility. In addition, thymosins beta4, beta10, and beta15 are highly expressed in several tumor cells, and these have been associated with a higher metastatic potential, possibly due to their function in cell proliferation. 34
38544 409203 cd22060 WH2_MTSS1 Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Metastasis suppressor protein 1 (MTSS-1). This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in metastasis suppressor protein 1 (MTSS1, also known as missing in metastasis or MIM). MTSS1 may be related to cancer progression or tumor metastasis in a variety of organ sites, most likely through an interaction with the actin cytoskeleton. It interacts with actin via its WH2 domain. MTSS1 is a novel potential metastasis suppressor gene in several types of human cancers; its expression is down-regulated in ovarian cancer, colorectal cancer, oesophageal cancer, prostate cancer and breast cancer, whereas it has also been observed to be up-regulated in hepatocellular carcinoma and breast cancer. 31
38545 409204 cd22061 WH2_INF2 Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Inverted formin-2 (INF2). This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in inverted formin-2 (INF2, also known as HBEBP2-binding protein C). INF2 is a formin protein with the unique ability to accelerate both actin polymerization and depolymerization, the latter requiring severing of the filament. It interacts with actin at its formin homology 2 (FH2) domain, while the WH2 domain acts as the diaphanous autoregulatory domain (DAD) and binds to actin monomers. INF2 plays a role in mitochondrial fission and dorsal stress fiber formation. It accelerates actin nucleation and elongation by interacting with the fast-growing ends (barbed ends) of actin filaments, but also accelerates disassembly of actin through encircling and severing filaments. Mutations in INF2 lead to the kidney disease focal segmental glomerulosclerosis (FSGS) and the neurological disorder Charcot-Marie-Tooth disease (CMTD). 30
38546 409205 cd22062 WH2_DdVASP-like Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Dictyostelium discoideum Vasodilator-stimulated phosphoprotein (VASP) and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in Dictyostelium discoideum vasodilator-stimulated phosphoprotein (VASP) and similar proteins. VASP belongs to the Ena/VASP protein family whose members act as actin polymerases that drive the processive elongation of filament barbed ends in membrane protrusions or at the surface of bacterial pathogens. These actin-associated proteins are involved in a range of processes dependent on cytoskeleton remodeling and cell polarity such as lamellipodial and filopodial dynamics in migrating cells. VASP plays a crucial role in filopodia formation, cell-substratum adhesion, and proper chemotaxis. It nucleates and bundles actin filaments via oligomers that use their WH2 domains to effect both the tethering of actin filaments and their processive elongation in sites of active actin assembly. 31
38547 409206 cd22063 WH2_Actobindin Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Actobindin and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in actobindin, an actin-binding protein from amoeba. Actobindin is able to bind two actin monomers at high concentrations of G-actin. It inhibits actin polymerization by sequestering G-actin and stabilizing actin dimers, thus making it a more potent inhibitor of the early phase of actin polymerization than of F-actin elongation. 29
38548 409207 cd22064 WH2_WAS_WASL Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein (WIP). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP). Human WIP protein is proline rich and has high sequence similarity to yeast protein verprolin (included in this model). WIP forms complexes with WASP/N-WASP and modulates their function in vivo. It is involved in the regulation of endocytosis and participates in several cellular processes, some of which are relevant in cancer and may be dependent on different oncogenic stimuli. WIP interacts directly with mammalian actin-binding protein-1 (mABP1) via the SH3 domain during platelet-derived growth factor (PDGF)-mediated dorsal ruffle formation. WIP family includes members 1 (WAS/WASL-interacting protein family member 1 or WIPF1), 2 (WIPF2) and 3 (WIPF3). Aberrant expression of WIPF1 contributes to the invasion and metastasis of several malignancies such as breast cancer, glioma and colorectal cancer; it has been identified as an oncoprotein in human pancreatic ductal adenocarcinoma (PDAC) and is associated with poor survival. WIPF2 may be an important regulator of the actin cytoskeleton. WIPF2 binds to N-WASP, regulating actin dynamics close to the plasma membrane; N-WASP in turn controls the second phase insulin secretion through the regulation of the Arp2/3 complex. WIPF3, along with LIPA (lysosomal acid lipase A), are expressed in macrophages and are involved in pathological abdominal aortic aneurysm (AAA), a serious condition of the aorta. In yeast, verprolin is involved in cytoskeletal organization and cellular growth. It may exert its effects on the cytoskeleton directly, or indirectly via proline-binding proteins, such as profilin, or via proteins possessing SH3 domains. 29
38549 409208 cd22065 WH2_Spire_1-2_r1 first tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homologs 1 and 2. This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family proteins Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 and Spire-2. This model contains WH2 domain 1 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. 32
38550 409209 cd22066 WH2_Spire second, third, and fourth, tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeats of protein Spire. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) repeats 2-4 in human Spire (also called Spir), Drosophila Spire, and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. This WH2-containing actin nucleator was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. Several spire gene family members have been identified, including paralogs Spire-1 (Spir1) and Spire-2 (Spir2) in higher eukaryotes. Spire acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. Spire-1 and Spire-2 encode a modified Fab1/YOTB/Vac1/EEA1 (FYVE)-type zinc finger membrane-binding domain at their C-termini that promiscuously interacts with negatively charged lipids and the interaction of these proteins with additional factors may provide the specificity for its targeting to the correct subpopulation of vesicles. 22
38551 409210 cd22067 WH2_DmSpire_r1-like first tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the first of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the first tandem WH2 domain of Spire (also called Spir-A or WH2-A). 27
38552 409211 cd22068 WH2_DmSpire_r3-like third tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the third of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the third tandem WH2 domain of Spire (also called Spir-C or WH2-C), which plays a unique role whereby two critical residues have been identified for activity for binding to actin with positive cooperativity. 26
38553 409212 cd22069 WH2_DmSpire_r4 fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the fourth of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the fourth tandem WH2 domain of Spire (also called Spir-D or WH2-D). 29
38554 409213 cd22070 WH2_Pan1-like Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) domain found in Actin cytoskeleton-regulatory complex protein Pan1, and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in actin cytoskeleton-regulatory complex protein Pan1, and similar proteins. Pan1 actin cytoskeleton-regulatory complex is a multi-domain scaffold that is required for the internalization of endosomes during actin-coupled endocytosis. It links the site of endocytosis to the cell membrane-associated actin cytoskeleton. Pan1 mediates uptake of external molecules and vacuolar degradation of plasma membrane proteins, may play a role in the proper organization of the cell membrane-associated actin cytoskeleton, and promotes its destabilization. 22
38555 409214 cd22071 WH2_WAVE-1 Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 1 (WASP1 or WAVE1 or WASF1 or SCAR1). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 1 (also called WASP-family verprolin homologous protein 1 or SCAR1 or WAVE1). WAVE1 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It regulates lamellipodia formation via a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase binding to CYFIP protein, allowing the release of WAVE1 from the complex. WAVE1 then binds and activates the Arp2/3 complex via its C-terminal domain. It interacts with actin via the WH2 domain. WAVE1 has been shown to be necessary for efficient transcriptional reprogramming in Xenopus oocytes and for normal development. 75
38556 409215 cd22072 WH2_WAVE-2 Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 2 (WASP2 or WAVE2 or WASF2 or SCAR2). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 2 (also called WASP-family verprolin homologous protein 2 or WASF2 or SCAR2 or WAVE2). WAVE2 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It participates in multiple processes related to actin dynamics, such as lamellipodia and filopodium formation, cell migration and protrusion, and embryogenesis. It regulates lamellipodia formation via a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase, kinases and phosphatidylinositols, and binds and activates the Arp2/3 complex via WAVE2 C-terminal domain. It interacts with actin via the WH2 domain. WAVE2 can also be phosphorylated by MAPK and forms a complex with PKA that regulates membrane protrusion. In mouse oocyte, WAVE2 regulates meiotic spindle stability, peripheral positioning and polar body emission, probably via an actin-mediated pathway. 30
38557 409216 cd22073 WH2_WAVE-3 Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 3 (WASP-3 or WAVE3). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 3 (also called WASP-family verprolin homologous protein 3 or WASF3 or SCAR3 or WAVE3). WAVE3 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It plays a role in the regulation of cell morphology and cytoskeletal organization and is required in the control of cell shape. It forms a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase, kinases and phosphatidylinositols, and binds and activates the Arp2/3 complex via WAVE3 C-terminal domain. It interacts with actin via the WH2 domain. This actin polymerization process is also involved in cancer cell invasion and metastasis. WASF3 has been shown to have a central role in cancer cell invasion and metastasis; elevated WAVE3 expression promotes metastasis in breast cancer and inactivation of WAVE3 in highly metastatic breast cancer cells has been shown to suppress invasion and metastasis. WAVE3 may also be pivotal in ovarian cancer cell motility, invasion and oncogenesis. In gastric cancer patients, WAVE3 expression correlates with poor outcome. In pancreatic cancer tissues, expression is prominently higher than in normal tissues and may be associated with lymphatic metastasis and poorly differentiated tumors; findings suggest that WAVE3 influences cell proliferation, migration and invasion via the AKT pathway. 66
38558 409217 cd22074 WH2_N-WASP_r1 first tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat found in human Neural Wiskott-Aldrich syndrome protein (N-WASP) and related domains. This subfamily includes the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in human Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP) and related domains. N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm. 27
38559 409218 cd22075 WH2_hN-WASP_r2_like second tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat found in human Neural Wiskott-Aldrich syndrome protein (N-WASP) and related domains. This subfamily includes the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in human Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP). N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm. This subfamily includes both tandem WH2 domains of mouse N-WASP. 25
38560 409219 cd22076 WH2_WAS_WASL-1 Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein family member 1. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP) member 1 (WIPF1). WIPF1 is a ubiquitously expressed proline-rich multidomain protein and is a binding partner and chaperone of WASP. It stabilizes actin filaments and regulates actin organization and polymerization which are associated with cell migration and invasion. Mutations in the WIPF1 binding site of WASP or in WIPF1 itself cause Wiskott-Aldrich syndrome (WAS), a rare X-linked recessive disease characterized by eczema, thrombocytopenia, immune deficiency, and bloody diarrhea. Aberrant expression of WIPF1 contributes to the invasion and metastasis of several malignancies such as breast cancer, glioma and colorectal cancer; it has been identified as an oncoprotein in human pancreatic ductal adenocarcinoma (PDAC) and is associated with poor survival. 32
38561 409220 cd22077 WH2_WAS_WASL-2_3 Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein (WIP) family members 2 and 3. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP) members 2 (WIPF2; also known as WIRE/WICH) and 3 (WIPF3). WIPF2 may be an important regulator of the actin cytoskeleton. It binds to N-WASP, regulating actin dynamics close to the plasma membrane; N-WASP in turn controls the second phase insulin secretion through the regulation of the Arp2/3 complex. Pathogenic properties of Shigella flexneri, a causative agent of intestinal infections worldwide, rely on its ability to invade the human colon where it spreads from cell to cell; WIPF2 has been shown to promote this via its contribution to the efficiency of actin-based motility in the cytosol and the resolution of the membrane protrusions into vacuoles. WIPF3, along with LIPA (lysosomal acid lipase A), is expressed in macrophages and is involved in pathological abdominal aortic aneurysm (AAA), a serious condition of the aorta. 30
38562 409221 cd22078 WH2_Spire1_r2-like second tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 1 (Spir1), and related proteins. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 2 of human Spire-1 protein. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, and in adult tissues, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. 
Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. This family also contains the second of four tandem repeats of WH2 in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. 32
38563 409222 cd22079 WH2_Spire2_r2 second tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 2. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-2 (also called Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 2 of human Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. 30
38564 409223 cd22080 WH2_Spire1_r4 fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 1. This family contains the fourth tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 4 of Spire-1 protein. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, and in adult tissues, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. 
Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. 24
38565 409224 cd22081 WH2_Spire2_r4 fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 2. This family contains the fourth tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-2 (also called Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 4 of Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. 22
38566 409195 cd22184 Af2093-like Archaeoglobus fulgidus Af2093 and similar proteins. This family represents the uncharacterized protein Af2093, which has no known function. The three-dimensional fold of this protein family resembles that of PDDEXK nucleases, but it lacks the typical catalytic site. 245
38567 409225 cd22185 WH2_hVASP-like Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) of human Vasodilator-stimulated phosphoprotein and related proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) of Ena/VASP family members including Protein enabled homolog (also known as Mena, mammalian enabled), VASP (vasodilator-stimulated phosphoprotein) and EVL (Ena-VASP-like or Enabled VASP or Ena/VASP). These are actin-associated proteins involved in a range of processes dependent on cytoskeleton remodeling and cell polarity such as axon guidance and lamellipodial and filopodial dynamics in migrating cells, platelet activation and cell migration. Ena/VASP proteins processively elongate F-actin barbed ends, promoting dissociation of barbed end assembly antagonists (uncapping). WH2 domains are small, widespread intrinsically disordered actin-binding peptides displaying significant sequence variability and different regulations of actin self-assembly in motile and morphogenetic processes. WH2 domains are identified by a central consensus actin-binding motif LKKT/V flanked by variable N-terminal and C-terminal extensions. 27
38568 409226 cd22186 WH2_Spire1-2_r3 third tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homologs 1 and 2. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 3 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. 
Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. 23
38569 409194 cd22187 asqI-like protein asqI and similar proteins. This family includes Aspergillus nidulans tyrosinase family protein asqI (aspoquinolone biosynthesis protein I) that is part of the gene cluster that mediates the biosynthesis of the aspoquinolone mycotoxins. 322
38570 409192 cd22188 arch-AMO_C-like subunit C of ammonia monooxygenase (AMO) from ammonia-oxidizing archaea, and related proteins. This model contains the subunit C of ammonia monooxygenase (AMO, EC 1.14.99.39) from ammonia-oxidizing archaea including Nitrososphaera viennensis gen. nov., sp. nov (also called Nitrososphaera viennensis EN76) that contains six variants (AmoC1-AmoC6) encoded by different genes. AMO catalyzes the conversion of ammonia to hydroxylamine. Nitrososphaera viennensis EN76 AMO is composed of four subunits: AmoA, AmoB, AmoX, and one of six variants of AmoC. The AMO subunit C belongs to a family which also includes subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO, EC 1.14.18.3) from methanotrophic bacteria, and AMO from ammonia-oxidizing bacteria, which are not included in this model. Compared to its bacterial counterpart, archaeal AMO C subunit is significantly shorter at the N-terminal end. 181
38571 409190 cd22189 PGAP4-like_fungal uncharacterized fungal proteins similar to Post-GPI attachment to proteins factor 4. This subfamily contains uncharacterized fungal proteins with similarity to animal post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246). PGAP4 has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases, it contains three transmembrane domains. Proteins from this subfamily contain the putative catalytic site of PGAP4 and may have similar activities. 375
38572 409191 cd22190 PGAP4 Post-GPI attachment to proteins factor 4. Post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246), has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases (GTs), it contains three transmembrane domains. Structural modeling suggests that PGAP4 adopts a GT-A fold split by an insertion of tandem transmembrane domains. 379
38573 409004 cd22191 DPBB_RlpA_EXP_N-like double-psi beta-barrel fold of RlpA, N-terminal domain of expansins, and similar domains. The double-psi beta-barrel (DPBB) fold is found in a divergent group of proteins, including endolytic peptidoglycan transglycosylase RlpA (rare lipoprotein A), EG45-like domain containing proteins, kiwellins, Streptomyces papain inhibitor (SPI), and the N-terminal domain of plant and bacterial expansins. RlpA may work in tandem with amidases to degrade peptidoglycan (PG) in the division septum and lateral wall to facilitate daughter cell separation. An EG45-like domain containing protein from Arabidopsis thaliana, called plant natriuretic peptide A (AtPNP-A), functions in cell volume regulation. Kiwellin proteins comprise a widespread family of plant-defense proteins that target pathogenic bacterial/fungal effectors that down-regulate plant defense responses. SPI is a stress protein produced under hyperthermal stress conditions that serves as a glutamine and lysine donor substrate for microbial transglutaminase (MTG, EC 2.3.2.13) from Streptomycetes. Some expansin family proteins display cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. 94
38574 411976 cd22192 TRPV5-6 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), types 5 and 6. TRPV5 and TRPV6 (TRPV5/6) are two homologous members within the vanilloid subfamily of the transient receptor potential (TRP) family. TRPV5 and TRPV6 show only 30-40% homology with other members of the TRP family and have unique properties that differentiate them from other TRP channels. They mediate calcium uptake in epithelia and their expression is dramatically increased in numerous types of cancer. The structure of TRPV5/6 shows the typical topology features of all TRP family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6, which is predicted to form the Ca2+ pore, and large intracellular N- and C-terminal domains. The N-terminal domain of TRPV5/6 contains three ankyrin repeats. This structural element is present in several proteins and plays a role in protein-protein interactions. The N- and C-terminal tails of TRPV5/6 each contain an internal PDZ motif which can function as part of a molecular scaffold via interaction with PDZ-domain containing proteins. A major difference between the properties of TRPV5 and TRPV6 is in their tissue distribution: TRPV5 is predominantly expressed in the distal convoluted tubules (DCT) and connecting tubules (CNT) of the kidney, with limited expression in extrarenal tissues. In contrast, TRPV6 has a broader expression pattern such as expression in the intestine, kidney, placenta, epididymis, exocrine tissues, and a few other tissues. 609
38575 411977 cd22193 TRPV1-4 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), types 1-4. TRPV1-4 are thermo-sensing channels that function directly in temperature-sensing and nociception; they share substantial structural and functional properties. Transient Receptor Potential (TRP) ion channels activated by temperature (thermo TRPs) are important molecular players in acute, inflammatory, and chronic pain states. So far, 11 TRP channels in mammalian cells have been identified as thermosensitive TRP (thermo-TRP) channels. TRPV1-4 channels are activated by different heat temperatures, for example, TRPV1 and TRPV2 are activated by high temperatures (>43C and >55C, respectively). TRPV1-4 belong to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all TRP ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 607
38576 411978 cd22194 TRPV3 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 3. TRPV3 is a temperature-sensitive Transient Receptor Potential (TRP) ion channel that is activated by warm temperatures, synthetic small-molecule chemicals, and natural compounds from plants. TRPV3 function is regulated by physiological factors such as extracellular divalent cations and acidic pH, intracellular adenosine triphosphate, membrane voltage, and arachidonic acid. It is expressed in both neuronal and non-neuronal tissues including epidermal keratinocytes, epithelial cells in the gut, endothelial cells in blood vessels, and neurons in dorsal root ganglia and CNS. TRPV3 null mice have abnormal hair morphogenesis and compromised skin barrier function. It may play roles in inflammatory skin disorders, such as itch and pain sensation. TRPV3 is also expressed by many neuronal and non-neuronal tissues, showing that TRPV3 might play roles in other unknown cellular and physiological functions. TRPV3 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all TRP ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 680
38577 411979 cd22195 TRPV4 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 4. TRPV4 is expressed broadly in neuronal and non-neuronal cells. It is activated by various stimuli, including hypo-osmolarity, warm temperature, and chemical ligands. TRPV4 acts in physiological functions such as osmoregulation and thermoregulation. It also has a role in mechanosensation in the vascular endothelium and urinary tract, and in cell barrier formation in vascular and epidermal tissues. Knockout mice studies suggested the functional importance of TRPV4 in the central nervous system, nociception, and bone formation. TRPV4 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 733
38578 411980 cd22196 TRPV1 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 1. Vanilloid receptor 1 (TRPV1), a capsaicin (vanilloid) receptor, is the founding member of the vanilloid TRP subfamily (TRPV). In humans, it is expressed in the brain, kidney, pancreas, testis, uterus, spleen, stomach, small intestine, lung and liver. TRPV1 has been implicated in thermo-sensation (heat), autonomic thermoregulation, nociception, food intake regulation, and multiple functions in the gastrointestinal (GI) tract. The receptor has also been implicated in growth cone guidance, long-term depression, endocannabinoid signaling and osmosensing in the central nervous system. TRPV1 is upregulated in several human pathological conditions including vulvodynia, GI inflammation, Crohn's disease and ulcerative colitis. TRPV1 knock-out mice exhibit impaired sensation to thermal-mechanical acute pain. The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 649
38579 411981 cd22197 TRPV2 Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 2. TRPV2 is closely related to TRPV1, sharing high sequence identity (>50%), but TRPV2 shows a higher temperature threshold and sensitivity for activation than TRPV1. TRPV2 can be stimulated by ligands or lipids, and is involved in osmosensation and mechanosensation. TRPV2 is expressed in both neuronal and non-neuronal tissues, and it has been implicated in diverse physiological and pathophysiological processes, including cardiac-structure maintenance, innate immunity, and cancer. TRPV2 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 640
38580 409188 cd22198 CH_MICAL_EHBP-like calponin homology (CH) domain found in the MICAL and EHBP families. This group is composed of the molecule interacting with CasL protein (MICAL) and EH domain-binding protein (EHBP) families. MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP proteins contain a single CH domain. CH domains are actin filament (F-actin) binding motifs. 105
38581 412062 cd22200 NRDE2_MID MTR4-interacting domain (MID) found in nuclear exosome regulator NRDE2 and similar proteins. NRDE2 is a protein of the nuclear speckles that regulates RNA degradation and export from the nucleus through its interaction with MTREX, an essential factor directing various RNAs to exosomal degradation. NRDE2 negatively regulates exosome functions by inhibiting the RNA helicase MTR4 recruitment and exosome interaction. This model corresponds to the N-terminal MTR4-interacting domain (MID) of NRDE2. 99
38582 412063 cd22201 cubilin_NTD N-terminal domain of cubilin and similar proteins. Cubilin (CUBN, also called 460 kDa receptor, intestinal intrinsic factor receptor, intrinsic factor-cobalamin receptor, or intrinsic factor-vitamin B12 receptor) is an endocytic receptor which plays a role in lipoprotein, vitamin and iron metabolism by facilitating their uptake. It acts together with the 45-kDa transmembrane protein amnionless (AMN) to mediate endocytosis of the cobalamin (vitamin B12) binding intrinsic factor (CBLIF)-cobalamin complex. This model corresponds to the N-terminal domain of cubilin, which is responsible for the interaction with AMN. The cubilin interface with AMN is formed by the N-terminal strands of three cubilin chains. 129
38583 409026 cd22204 H1_KCTD12-like H1 domain found in potassium channel tetramerization domain-containing proteins. The H1 domain is found in potassium channel tetramerization domain-containing proteins such as KCTD8, KCTD12 (also called predominantly fetal expressed T1 domain/Pfetin), KCTD12b, and KCTD16. They serve as auxiliary gamma-aminobutyric acid type B (GABA-B) receptor subunits that constitute receptor subtypes with distinct functional properties. KCTD12 and -12b generate desensitizing receptor responses while KCTD8 and -16 generate largely non-desensitizing receptor responses. They control GABA-B signaling and regulate the rise time and duration of G protein-coupled inwardly rectifying potassium (GIRK) currents, as well as enhance receptor expression levels. KCTD12 regulates agonist potency and kinetics of GABA-B receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. KCTD16 interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. Members of this family consist of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain. 118
38584 412064 cd22207 pseudoGTPaseD_p190RhoGAP pseudoGTPase domain found in the family of p190RhoGAP. This family includes two p190RhoGAP proteins, A and B, which are Rho family GTPase-activating proteins (GAPs) that act as key regulators of Rho GTPase signaling and are essential for actin cytoskeletal structure and contractility. Rho family is one of five Ras superfamily subgroups (Ras, Rho, Rab, Ran and Arf). Each contains five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP proteins. 166
38585 412067 cd22209 EMC10 ER membrane protein complex subunit 10 and similar proteins. Endoplasmic reticulum (ER) membrane protein complex subunit 10 (EMC10), also called hematopoietic signal peptide-containing membrane domain-containing protein 1 (HSM1) or INM02, is a bone marrow-derived angiogenic growth factor promoting angiogenesis and tissue repair in the heart after myocardial infarction. It stimulates cardiac endothelial cell migration and outgrowth via the activation of p38 MAPK, PAK, and MAPK2 signaling pathways. Yeast EMC10 is a non-essential component of the ER membrane protein complex (EMC), which may be involved in ER-associated degradation (ERAD) and proper assembly of multi-pass transmembrane proteins. 141
38586 408999 cd22210 HD_XRCC4-like_N N-terminal head domain found in the XRCC4 superfamily of proteins. The XRCC4 superfamily includes five families: XRCC4, XLF, PAXX, SAS6 and CCDC61. XRCC4 (X-ray repair cross-complementing protein 4), XLF (XRCC4-like factor) and PAXX (paralog of XRCC4 and XLF) play crucial roles in the non-homologous end-joining (NHEJ) DNA repair pathway. SAS6 (spindle assembly abnormal protein 6) and CCDC61 (coiled-coil domain-containing protein 61) have a centrosomal/centriolar function. Members of this superfamily have an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal low-complexity region. They form homodimers through two homodimerization domains: an N-terminal globular head domain and a parallel coiled-coil domain. In addition, some members such as XRCC4 and XLF form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XRCC4 superfamily proteins. 115
38587 411792 cd22211 HkD_SF Hook domain-containing proteins superfamily. The Hook domain superfamily includes Hook adaptor proteins, Hook-related proteins and nuclear mitotic apparatus protein (NuMA). They share an N-terminal conserved globular Hook domain, which folds as a variant of the helical calponin homology (CH) domain with an extended alpha-helix. The Hook domain is responsible for binding microtubules. The Hook family includes microtubule-binding proteins, Hook1-3. Hook1 is required for spermatid differentiation. Hook2 contributes to the establishment and maintenance of centrosome function. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking, and is involved in Golgi and endosome transport. Hook proteins are components of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. The Hook-related protein (HkRP) family includes Daple, Girdin and Gipie. Daple, also called Dvl-associating protein with a high frequency of leucine residues, or coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins).
Gipie, also called GRP78-interacting protein induced by ER stress, or coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. NuMA, also called nuclear mitotic apparatus protein 1, or nuclear matrix protein-22 (NMP-22), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of the spindle poles, and the alignment and segregation of chromosomes during mitotic cell division. 145
38588 412068 cd22212 NDFIP-like NEDD4 family-interacting protein. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP1/2 have been shown to be involved in neural development by regulating the expression of the Robo1 receptor. 171
38589 409027 cd22216 H1_KCTD12b H1 domain found in potassium channel tetramerization domain-containing protein 12b. Potassium channel tetramerization domain-containing protein 12b (KCTD12b) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. KCTD12b consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain. 118
38590 409028 cd22217 H1_KCTD12 H1 domain found in potassium channel tetramerization domain-containing protein 12. Potassium channel tetramerization domain-containing protein 12 (KCTD12), also called predominantly fetal expressed T1 domain (Pfetin), is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. It regulates agonist potency and kinetics of GABA-B receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. It also regulates colorectal cancer cell stemness through the ERK pathway. KCTD12 consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain. 119
38591 409029 cd22218 H1_KCTD8 H1 domain found in potassium channel tetramerization domain-containing protein 8. Potassium channel tetramerization domain-containing protein 8 (KCTD8), a BTB/POZ domain-containing protein, is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors that determine the pharmacology and kinetics of the receptor response. It generates largely non-desensitizing receptor responses. KCTD8 consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD8, which may not be involved in desensitization. 122
38592 409030 cd22219 H1_KCTD16 H1 domain found in potassium channel tetramerization domain-containing protein 16. Potassium channel tetramerization domain-containing protein 16 (KCTD16) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. KCTD16 generates largely non-desensitizing receptor responses. It consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD16, which may not be involved in desensitization. 121
38593 412065 cd22220 pseudoGTPaseD_p190RhoGAP-B pseudoGTPase domain found in p190RhoGAP-B and similar proteins. p190RhoGAP protein B (p190RhoGAP-B), also called ARHGAP5, or p190-B, or Rho-type GTPase-activating protein 5 (RHOGAP5), is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-B. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs. 171
38594 412066 cd22221 pseudoGTPaseD_p190RhoGAP-A pseudoGTPase domain found in p190RhoGAP-A and similar proteins. p190RhoGAP protein A (p190RhoGAP-A), also called Rho GTPase-activating protein 35 (RHOGAP35), glucocorticoid receptor DNA-binding factor 1, or glucocorticoid receptor repression factor 1 (GRF-1), or Rho GAP p190A, or p190-A, is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. It binds several acidic phospholipids, which inhibits the Rho GAP activity to promote the Rac GAP activity. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-A. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs. 172
38595 411793 cd22222 HkD_Hook Hook domain found in Hook family of microtubule-binding proteins. The Hook family includes Hook1-3. Hook1 is a microtubule-binding protein required for spermatid differentiation. Hook2, also a microtubule-binding protein, contributes to the establishment and maintenance of centrosome function. It may function in the positioning or formation of aggresomes, which are pericentriolar accumulations of misfolded proteins, proteasomes and chaperones. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking. It is involved in Golgi and endosome transport. It acts as a scaffold for the opposite-polarity microtubule-based motors cytoplasmic dynein-1 and the kinesin KIF1C. It may participate in the turnover of the endocytosed scavenger receptor. Hook proteins are components of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. Hook adaptor proteins share an N-terminal conserved globular Hook domain, which folds as a variant of the helical calponin homology (CH) domain, and contacts the helix alpha1 of dynein light intermediate chain 1 (LIC1) in a hydrophobic groove. 147
38596 411794 cd22223 HkD_HkRP Hook domain found in the Hook-related protein (HkRP) family. The HkRP family includes Daple, Girdin and Gipie. Daple, also called Dvl-associating protein with a high frequency of leucine residues, or coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins). It acts as a non-receptor guanine nucleotide exchange factor which binds to and activates guanine nucleotide-binding protein G(i) alpha subunits. It also acts as a guanine nucleotide dissociation inhibitor for guanine nucleotide-binding protein G(s) subunit alpha GNAS. In addition, Girdin plays an essential role in cell migration. Gipie, also called GRP78-interacting protein induced by ER stress, or coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing. All family members contain a conserved globular Hook domain which folds as a variant of the helical calponin homology (CH) domain. 149
38597 411795 cd22224 HkD_NuMA Hook domain found in nuclear mitotic apparatus protein (NuMA) and similar proteins. NuMA, also called nuclear mitotic apparatus protein 1, or nuclear matrix protein-22 (NMP-22), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of the spindle poles, and the alignment and segregation of chromosomes during mitotic cell division. The model corresponds to the N-terminal conserved globular Hook domain of NuMA, which folds as a variant of the helical calponin homology (CH) domain. It directly binds dynein light intermediate chains LIC1 and LIC2 through a conserved hydrophobic patch shared among other Hook adaptors. 148
38598 411796 cd22225 HkD_Hook1 Hook domain found in protein Hook 1 (Hook1) and similar proteins. Hook1 is a microtubule-binding protein required for spermatid differentiation. It is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. 150
38599 411797 cd22226 HkD_Hook3 Hook domain found in protein Hook 3 (Hook3) and similar proteins. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking. It is involved in Golgi and endosome transport. It acts as a scaffold for the opposite-polarity microtubule-based motors cytoplasmic dynein-1 and the kinesin KIF1C. It may participate in the turnover of the endocytosed scavenger receptor. Hook3 is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. 153
38600 411798 cd22227 HkD_Hook2 Hook domain found in protein Hook 2 (Hook2) and similar proteins. Hook2 is a microtubule-binding protein that contributes to the establishment and maintenance of centrosome function. It may function in the positioning or formation of aggresomes, which are pericentriolar accumulations of misfolded proteins, proteasomes and chaperones. Hook2 is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. 150
38601 411799 cd22228 HkD_Daple Hook domain found in Daple (Dvl-associating protein with a high frequency of leucine residues) and similar proteins. Protein Daple, also called coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling. 153
38602 411800 cd22229 HkD_Girdin Hook domain found in Girdin and similar proteins. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins). It acts as a non-receptor guanine nucleotide exchange factor which binds to and activates guanine nucleotide-binding protein G(i) alpha subunits. It also acts as a guanine nucleotide dissociation inhibitor for guanine nucleotide-binding protein G(s) subunit alpha GNAS. In addition, Girdin plays an essential role in cell migration. 156
38603 411801 cd22230 HkD_Gipie Hook domain found in Gipie (GRP78-interacting protein induced by ER stress) and similar proteins. Gipie, also called coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing. 170
38604 409021 cd22231 RHH_NikR_HicB-like ribbon-helix-helix domains of nickel responsive transcription factor NikR, antitoxins HicB, ParD, and MazE, and similar proteins. This family includes the N-terminal domain of NikR, C-terminal domains of antitoxins HicB and ParD, as well as antitoxin MazE, and similar proteins, all of which belong to the ribbon-helix-helix (RHH) family of transcription factors. NikR is a nickel-responsive transcription factor that consists of an N-terminal DNA-binding RHH domain and a C-terminal metal-binding domain (MBD) with four nickel ions. In Helicobacter pylori, which colonizes the gastric epithelium of humans leading to gastric ulcers and gastric cancers, NikR (HpNikR) regulates multiple genes. It regulates urease, which protects H. pylori from acidic shock at low pH, by converting urea to ammonia and bicarbonate. It also plays a complex role in the intracellular physiology of nickel; occupation of nickel-binding sites results in NikR binding to its operator in the nickel permease nikABCDE promoter. Thus, there is weaker repression of NikABCDE transcription at low intracellular free nickel concentrations while strong repression prevails at higher concentrations, which would be potentially toxic. Antitoxin HicB is part of the HicAB toxin-antitoxin (TA) system, where the toxins are RNases, found in many bacteria. In the pathogen Burkholderia pseudomallei, the HicAB system plays a role in regulating the frequency of persister cells and may therefore play a role in disease. Structural studies of Yersinia pestis HicB show that it acts as an autoregulatory protein and HicA acts as an mRNase. In Escherichia coli, an excess of HicA has been shown to de-repress a HicB-DNA complex and restore transcription of HicB. Similarly, Caulobacter crescentus ParD antitoxin neutralizes the effect of cognate ParE toxin. 
In Bacillus subtilis, antitoxin MazE binds to and inactivates toxin MazF, an mRNA interferase that, during stress conditions, cleaves mRNAs in a sequence-specific manner, resulting in cellular growth arrest. 44
38605 409022 cd22232 RHH_CopG_Cop6-like ribbon-helix-helix family transcriptional repressor protein CopG, uncharacterized Cop6, and similar proteins. This family includes the ribbon-helix-helix (RHH) family transcriptional repressor CopG, which is involved in the control of plasmid copy number, as well as uncharacterized proteins such as Cop6, which is found in a small plasmid that has been identified in methicillin-resistant Staphylococcus aureus (MRSA). CopG, a homodimeric protein of around 45 residues, constitutes one of the smallest natural transcriptional repressors characterized and is the prototype of a series of repressor proteins encoded by plasmids that exhibit a similar genetic structure at their leading strand initiation and control regions. It binds to and represses the single Pcr promoter that directs the synthesis of a bicistronic mRNA for CopG and the RepB initiator of replication, thereby regulating its own synthesis and that of RepB. 45
38606 409023 cd22233 RHH_CopAso-like ribbon-helix-helix domain of Shewanella oneidensis type II antitoxin CopA(SO), and similar proteins. This family includes the N-terminal ribbon-helix-helix (RHH) domain of Shewanella oneidensis CopA(SO), a newly identified type II antitoxin, as well as the N-terminal RHH domain of Escherichia coli PutA flavoprotein, among other similar proteins, many of which are as yet uncharacterized. CopA(SO) is a typical RHH antitoxin that includes an ordered N-terminal domain (CopA(SO)-N) and a disordered C-terminal domain (CopA(SO)-C). Biophysical investigation indicates allosteric effects of CopA(SO)-N on CopA(SO)-C; DNA binding of CopA(SO)-N appears to induce CopA(SO)-C to fold and self-associate the C-terminal domain. The multifunctional E. coli proline utilization A (PutA) flavoprotein functions as a membrane-associated proline catabolic enzyme as well as a transcriptional repressor of the proline utilization genes putA and putP. The N-terminal domain of PutA is a transcriptional regulator with an RHH fold; structure studies show that it forms a homodimer to bind one DNA duplex. This family also includes orphan antitoxin ParD2, an antitoxin component of a non-functional type II toxin-antitoxin (TA) system; it does not neutralize the effect of any of the RelE or ParE toxins. 44
38607 409024 cd22234 RHH_MobB-like ribbon-helix-helix domain of mobilization protein MobB and similar proteins. This subfamily includes Pseudomonas syringae mobilization protein MobB, and mostly archaeal uncharacterized CopG family proteins. These proteins have a typical ribbon-helix-helix (RHH), similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number. 44
38608 409025 cd22235 RHH_CopG_archaea ribbon-helix-helix domain of CopG family transcriptional regulators found in archaea. This subfamily includes the N-terminal ribbon-helix-helix (RHH) domain of putative transcriptional repressor CopG from archaea, and similar proteins. These uncharacterized proteins have a typical RHH, similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number. 43
38609 412071 cd22238 AcrIF3 Anti-CRISPR type I subtype F3. AcrIF3 (also known as AcrF3) is an anti-CRISPR (Acr) protein that forms a homodimer and interacts directly with helicase-nuclease protein Cas3 and blocks its recruitment to the type I-F CRISPR-Cas surveillance complex (Csy). The type I-F Csy is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. Without Cas3 recruitment by the Csy-dsDNA complex, the CRISPR/Cas system is unable to efficiently destroy the invading DNA, resulting in escape from the immune response. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). 
Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 127
38610 412072 cd22239 NPHP4 Nephrocystin-4. Nephrocystin-4 (NPHP4), also known as nephroretinin, is a component of the nephronophthisis (NPHP) module which is part of the transition zone (TZ) of the cilia. NPHP4 forms complexes with alpha-tubulin, NPHP1 and RPGRIP1. The interaction with NPHP1 is crucial for cell-cell and cell-matrix adhesion signaling. Mutations in NPHP4 have been shown to cause nephronophthisis (NPHP), an autosomal recessive cause of kidney failure and earlier stages of chronic kidney disease among adults. 904
38611 412034 cd22240 akirin akirin. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. In vertebrates, there are two orthologs, Akirin1 and Akirin2. 147
38612 412073 cd22241 AcrIF8 Anti-CRISPR type I subtype F8 (AcrIF8). AcrIF8 (also known as AcrF8) is an anti-CRISPR (Acr) protein that is positioned on the type I-F Csy spiral backbone surrounded by Cas5f, Cas7.4-7.6f, and Cas8f, and forms interactions with crRNA (CRISPR-RNA) to prevent the target DNA from binding to the Csy complex. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 77
38613 412035 cd22243 akirin-1 akirin-1. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-1 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis and meiosis. 188
38614 412036 cd22244 akirin-2 akirin-2. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-2 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis, and brain- and limb-development. Akirin-2 is partly cytosolic. It has been shown to interact with nuclear importins and therefore may play a role in proper transport between nucleus and cytoplasm. 184
38615 412074 cd22246 PI4KB_NTD N-terminal domain of phosphatidylinositol 4-kinase beta. Phosphatidylinositol 4-kinase beta (PI4K-beta, PI4Kbeta or PI4KB), also called PtdIns 4-kinase beta, NPIK, PI4K92, or PI4KIII, catalyzes the phosphorylation of phosphatidylinositol (PI) to form phosphatidylinositol 4-phosphate (PI4P), in the first committed step in the production of the second messenger inositol-1,4,5-trisphosphate (IP3). It may regulate Golgi disintegration/reorganization during mitosis, possibly via its phosphorylation. PI4K-beta is critical for the maintenance of the Golgi and trans Golgi network (TGN) PI4P pools. It is recruited to membranes via its interaction with Golgi adaptor protein acyl-coenzyme A binding domain containing protein 3 (ACBD3). The ACBD3:PI4K-beta complex formation is essential for proper function of the Golgi. PI4K-beta also plays an essential role in Aichi virus RNA replication. It is recruited by ACBD3 at viral replication sites. This model corresponds to the N-terminal domain of PI4K-beta, which is responsible for interacting with ACBD3 by forming a complex with the Q domain. 65
38616 410202 cd22248 Rcc_KIF21 regulatory coiled-coil domain found in the kinesin-like KIF21 family. The KIF21 family includes KIF21A and KIF21B. KIF21A (also called kinesin-like protein KIF2, or renal carcinoma antigen NY-REN-62) is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. KIF21B is a plus-end directed microtubule-dependent motor protein which displays processive activity. It is involved in regulation of microtubule dynamics, synapse function, and neuronal morphology, including dendritic tree branching and spine formation. KIF21B plays a role in learning and memory. It is involved in the delivery of gamma-aminobutyric acid (GABA(A)) receptors to the cell surface. This model corresponds to the regulatory coiled-coil domain of KIF21A/KIF21B, which folds into an intramolecular antiparallel coiled-coil monomer in solution but crystallizes into a dimeric domain-swapped antiparallel coiled-coil. 81
38617 409016 cd22249 UDM1_RNF168_RNF169-like UDM1 (ubiquitin-dependent DSB recruitment module 1) found in RING finger proteins RNF168, RNF169 and similar proteins. This model represents the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) found in RING finger proteins, RNF168 and RNF169. RNF168 is an E3 ubiquitin-protein ligase that promotes non-canonical K27 ubiquitination to signal DNA damage. It functions, together with RNF8, as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. The UDM1 domain comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). 
Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets. 66
38618 409019 cd22250 ROCK_SBD Shroom-binding domain found in Rho-associated coiled-coil containing protein kinase. Rho-associated coiled-coil containing protein kinase (ROCK) is a serine/threonine kinase (STK) that catalyzes the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. It is also referred to as Rho-associated protein kinase or simply as Rho kinase. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Rho-associated protein kinase 1 (ROCK1), also called renal carcinoma antigen NY-REN-35, Rho-associated, coiled-coil-containing protein kinase 1, ROCK-I, p160 ROCK-1, or p160ROCK, is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. Rho-associated protein kinase 2 (ROCK2), also called Rho kinase 2, Rho-associated, coiled-coil-containing protein kinase 2, ROCK-II, or p164 ROCK-2, is more prominent in brain and skeletal muscle. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK subfamily proteins contain an N-terminal extension, a catalytic kinase domain, a coiled-coil (CC) region encompassing a Rho-binding domain (RBD), and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain.
It is activated via proteolytic cleavage, binding of lipids to the PH domain, or binding of GTP-bound RhoA to the CC region. More recently, the Shroom family of proteins has been identified as an additional regulator of ROCK. This model corresponds to the Shroom-binding domain (SBD) of ROCK, which forms a parallel coiled coil with the Shroom domain 2 (SD2) of Shroom. 75
38619 412075 cd22252 PARP2_NTR NTR (N-terminal region) domain of poly [ADP-ribose] polymerase 2 (PARP-2) and similar proteins. PARP-2 is also called ADP-ribosyltransferase diphtheria toxin-like 2 (ARTD2), DNA ADP-ribosyltransferase PARP2, NAD(+) ADP-ribosyltransferase 2 (ADPRT-2), poly[ADP-ribose] synthase 2 (pADPRT-2), or protein poly-ADP-ribosyltransferase PARP2. It is a poly-ADP-ribosyltransferase that mediates poly-ADP-ribosylation of proteins and plays a key role in DNA repair. It mainly mediates glutamate and aspartate ADP-ribosylation of target proteins. PARP-2 can also ADP-ribosylate DNA; it preferentially acts on 5'-terminal phosphates at DNA strand break termini in nicked duplex. This model corresponds to the NTR (N-terminal region) domain of PARP-2, which contains a nucleolar localization sequence (NoLS) and a putative nuclear localization signal (NLS). The NTR domain has a helical SAF-A/B, Acinus, and PIAS (SAP) domain fold and may participate in protein-protein interactions. 59
38620 412076 cd22255 PPP1R3A_PBD PP1C binding domain found in protein phosphatase 1 regulatory subunit 3A (PPP1R3A) and similar proteins. PPP1R3A, also called protein phosphatase 1 glycogen-associated regulatory subunit (PP1G), protein phosphatase type-1 glycogen targeting subunit, or RG1, acts as a glycogen-targeting subunit for PP1 that is essential for cell division, and participates in the regulation of glycogen metabolism, muscle contractility, and protein synthesis. PPP1R3A plays an important role in glycogen synthesis but is not essential for insulin activation of glycogen synthase. It interacts with the PPP1CC catalytic subunit of PP1 and associates with glycogen. This model corresponds to the protein phosphatase 1 catalytic subunit (PP1C) binding domain of PPP1R3A, which contains a RVxF PP1C-binding motif that mediates interactions with PP1C. 82
38621 412077 cd22256 PrimPol_RBD C-terminal RPA-binding domain (RBD) of DNA-directed primase/polymerase protein and similar proteins. DNA-directed primase/polymerase protein (PrimPol), also called coiled-coil domain-containing protein 111, is a DNA primase-polymerase required for the maintenance of genome integrity. It facilitates mitochondrial and nuclear replication fork progression by initiating de novo DNA synthesis using dNTPs and acting as an error-prone DNA polymerase able to bypass certain DNA lesions. PrimPol is regulated by single-stranded DNA binding proteins. This model corresponds to the C-terminal RPA-binding domain (RBD) of PrimPol, which interacts directly with the RPA70N domain of RPA70. 81
38622 412078 cd22257 AcrIF6-like Anti-CRISPR type I subtype F6 and related uncharacterized proteins. AcrIF6 (also known as AcrF6) is an anti-CRISPR (Acr) protein that blocks invader DNA access by binding in the junction region between Cas7.6f and Cas8f subunits of the type I-F CRISPR-Cas surveillance complex (Csy) to compete for foreign DNA binding. The type I-F Csy is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. AcrIF6 can function as an inhibitor of both the type I-E and I-F CRISPR-Cas systems. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). 
Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 82
38623 412079 cd22259 AcrIF10 Anti-CRISPR type I subtype F10. AcrIF10 (also known as AcrF10) is an anti-CRISPR (Acr) protein which acts as a "DNA mimic protein" (DMP) that binds in the junction region between Cas 7.6f and Cas8f subunits of the type I-F CRISPR-Cas surveillance complex (Csy) to inhibit foreign DNA binding to the CRISPR-Cas adaptive immune system. The key feature of DMPs is their DNA-like shape and charge distribution, and they affect the activity of DNA-binding proteins by occupying their DNA-binding domains. The type I-F Csy is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. Without Cas3 recruitment by the Csy-dsDNA complex, the CRISPR/Cas system is unable to efficiently destroy the invading DNA, resulting in the escape from the immune response. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 94
38624 410203 cd22262 Rcc_KIF21B regulatory coiled-coil domain found in kinesin-like protein KIF21B. KIF21B is a plus-end directed microtubule-dependent motor protein which displays processive activity. It is involved in regulation of microtubule dynamics, synapse function, and neuronal morphology, including dendritic tree branching and spine formation. KIF21B plays a role in learning and memory. It is involved in the delivery of gamma-aminobutyric acid (GABA(A)) receptors to the cell surface. This model corresponds to a conserved region of KIF21B, which shows high sequence similarity to the regulatory coiled-coil domain of KIF21A. 82
38625 410204 cd22263 Rcc_KIF21A regulatory coiled-coil domain found in kinesin-like protein KIF21A. KIF21A, also called kinesin-like protein KIF2 or renal carcinoma antigen NY-REN-62, is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. This model corresponds to the regulatory coiled-coil domain of KIF21A, which folds into an intramolecular antiparallel coiled-coil monomer in solution, but crystallizes into a dimeric domain-swapped antiparallel coiled-coil. 82
38626 409017 cd22264 UDM1_RNF169 UDM1 (ubiquitin-dependent DSB recruitment module 1) domain found in RING finger protein 169. RING finger protein 169 (RNF169) is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. It recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF169, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of the related RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets. 70
38627 409018 cd22265 UDM1_RNF168 UDM1 (ubiquitin-dependent DSB recruitment module 1) domain found in RING finger protein 168. RING finger protein 168 (RNF168) is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. Together with RNF8, RNF168 functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF168, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets. 73
38628 412080 cd22266 AcrIE1 Anti-CRISPR type I subtype E1. AcrIE1 (also known as AcrE1) is an anti-CRISPR (Acr) protein which binds as a homodimer to and inactivates the CRISPR-associated helicase/nuclease Cas3 protein. It has been shown that the C-terminal region of AcrIE1 is important for its inhibitory activity. AcrIE1 can convert the endogenous type I-E CRISPR system into a programmable transcriptional repressor. The type I-E Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 98
38629 409005 cd22268 DPBB_RlpA-like double-psi beta-barrel fold of endolytic peptidoglycan transglycosylase RlpA and similar proteins. Endolytic peptidoglycan transglycosylase RlpA (rare lipoprotein A, RlpA) is a lytic transglycosylase with a strong preference for naked glycan strands that lack stem peptides. It adopts a double-psi beta-barrel (DPBB) fold and is one of four SPOR-domain containing proteins in Escherichia coli (along with FtsN, DedD, and DamX) that bind peptidoglycan (PG) and are targeted to the septum during division. It directly interacts with the divisome protein FtsK in vitro, and deletion of the rlpA gene partially bypasses the requirement for functional FtsK, a large, multi-spanning membrane protein that facilitates double-stranded DNA translocation during division and sporulation in E. coli and Bacillus subtilis, respectively. In Pseudomonas aeruginosa, RlpA contributes to rod shape maintenance and daughter cell separation. The separation of daughter cells requires extensive PG remodeling. It has been suggested that amidases and RlpA work in tandem to degrade PG in the division septum and lateral wall to facilitate daughter cell separation. 93
38630 409006 cd22269 DPBB_EG45-like double-psi beta-barrel fold of EG45-like domain-containing proteins. This family contains plant EG45-like domain-containing proteins which show sequence similarity to expansins, and similar proteins. Citrus jambhiri EG45-like domain-containing protein was identified as a protein associated with citrus blight (CB), and is also called blight-associated protein p12 (CjBAp12) or plant natriuretic peptide (PNP). CjBAp12 does not display the cell wall loosening activity of expansins. Arabidopsis thaliana EG45-like domain-containing protein 2, also called plant natriuretic peptide A (AtPNP-A), is a systemically mobile natriuretic peptide immunoanalog, recognized by antibodies against vertebrate atrial natriuretic peptides (ANPs), that functions in cell volume regulation. Thus, it has an important and systemic role in plant growth and homeostasis. Due to their similarity to the N-terminal domain of expansin and to endolytic peptidoglycan transglycosylase RlpA, EG45-like domain-containing proteins may adopt a double-psi beta-barrel fold. 106
38631 409007 cd22270 DPBB_kiwellin-like double-psi beta-barrel fold of the kiwellin family. Kiwellin (KWL) proteins comprise a widespread family of plant-defense proteins that target pathogenic bacterial/fungal effectors that down-regulate plant defense responses. They are part of a spatiotemporally coordinated, plant-wide defense response comprising KWL proteins with overlapping activities. Zea mays KWL1 specifically inhibits the enzymatic activity of the secreted chorismate mutase Cmu1, a virulence-promoting effector of the smut fungus Ustilago maydis. KWL proteins adopt a double-psi beta-barrel (DPBB) fold, which provides a versatile scaffold that can specifically counteract pathogen effectors such as Cmu1. 128
38632 409008 cd22271 DPBB_EXP_N-like N-terminal double-psi beta-barrel fold domain of the expansin family and similar domains. The plant expansin family consists of four subfamilies, alpha-expansin (EXPA), beta-expansin (EXPB), expansin-like A (EXLA), and expansin-like B (EXLB). EXPA and EXPB display cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. EXPA proteins function more efficiently on dicotyledonous cell walls, whereas EXPB proteins exhibit specificity for the cell walls of monocotyledons. Expansins also affect environmental stress responses. Expansin family proteins contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This family also includes GH45 endoglucanases from mollusks. This model represents the N-terminal domain of expansins and similar proteins, which adopts a double-psi beta-barrel (DPBB) fold. 117
38633 409009 cd22272 DPBB_EXLX1-like N-terminal double-psi beta-barrel fold domain of bacterial expansins similar to Bacillus subtilis EXLX1. This subfamily is composed of bacterial expansins including Bacillus subtilis EXLX1, also called expansin-YoaJ. Similar to plant expansins, EXLX1 contains an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. It strongly binds to crystalline cellulose via D2, and weakly binds soluble cellooligosaccharides. Bacterial expansins, which are present in some plant pathogens, have the ability to loosen plant cell walls, but with weaker activity compared to plant expansins. They may have a role in plant-bacterial interactions. This model represents the N-terminal domain of EXLX1 and similar bacterial expansins, which adopts a double-psi beta-barrel (DPBB) fold. 101
38634 409010 cd22273 DPBB_SPI-like double-psi beta-barrel fold of Streptomyces papain inhibitor and similar proteins. Streptomyces papain inhibitor (SPI) adopts a rigid, thermo-resistant double-psi-beta-barrel (DPBB) fold that is stabilized by two cysteine bridges. SPI serves as a glutamine and lysine donor substrate for microbial transglutaminase (MTG, EC 2.3.2.13) from Streptomycetes, that is used to covalently and specifically link functional amines to glutamine donor sites of therapeutic proteins. SPI is a stress protein produced under hyperthermal stress conditions, and is able to inhibit the cysteine proteases, papain and bromelain, as well as the bovine serine protease trypsin. 101
38635 409011 cd22274 DPBB_EXPA_N N-terminal double-psi beta-barrel fold domain of the alpha-expansin subfamily. Alpha-expansins (EXPA, expansin-A) have cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. They also affect environmental stress responses. Arabidopsis thaliana EXPA1 is a cell wall modifying enzyme that controls the divisions marking lateral root initiation. Nicotiana tabacum EXPA4 positively regulates abiotic stress tolerance, and negatively regulates pathogen resistance. Wheat TaEXPA2 is involved in conferring cadmium tolerance. Alpha-expansins belong to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of alpha-expansins, which adopts a double-psi beta-barrel (DPBB) fold. 129
38636 409012 cd22275 DPBB_EXPB_N N-terminal double-psi beta-barrel fold domain of the beta-expansin subfamily. Beta-expansins (EXPB, expansin-B) have cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. They also affect environmental stress responses. The EXPB subfamily is known in the allergen literature as group-1 grass pollen allergens. EXPB of Bermuda, Johnson, and Para grass pollens is a major cross-reactive allergen for allergic rhinitis patients in subtropical climates. EXPB1 induces extension and stress relaxation of grass cell walls. Wheat TaEXPB7-B is a beta-expansin gene involved in low-temperature stress and abscisic acid responses. Beta-expansins belong to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of beta-expansins, which adopts a double-psi beta-barrel (DPBB) fold. 122
38637 409013 cd22276 DPBB_EXLA_N N-terminal double-psi beta-barrel fold domain of the expansin-like A subfamily. Expansin-like A (EXLA) belongs to the plant expansin family that also includes alpha-expansin (EXPA), beta-expansin (EXPB), and expansin-like B (EXLB). Unlike EXPA and EXPB, EXLA proteins have not been shown to display cell wall loosening activity. EXLA2 is one of the three EXLA members in Arabidopsis. It lacks expansin activity, but contains a presumed cellulose-interacting domain. EXLA2 may function as a positive regulator of cell elongation in the dark-grown hypocotyl of Arabidopsis, possibly by interference with cellulose metabolism, deposition, or its organization. EXLA belongs to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of EXLA proteins, which adopts a double-psi beta-barrel (DPBB) fold. 129
38638 409014 cd22277 DPBB_EXLB_N N-terminal double-psi beta-barrel fold domain of the expansin-like B subfamily. Expansin-like B (EXLB) belongs to the plant expansin family that also includes alpha-expansin (EXPA), beta-expansin (EXPB), and expansin-like A (EXLA). Unlike EXPA and EXPB, EXLB proteins have not been shown to display cell wall loosening activity. Solanum tuberosum StEXLB6 showed differential expression under the treatments of abscisic acid (ABA), indoleacetic acid (IAA), and gibberellic acid 3 (GA3), as well as under drought and heat stresses, indicating that it is likely involved in potato stress resistance. Soybean GmEXLB1 improves phosphorus acquisition by regulating root elongation and architecture in Arabidopsis. EXLB belongs to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of EXLB proteins, which adopts a double-psi beta-barrel (DPBB) fold. 117
38639 409015 cd22278 DPBB_GH45_endoglucanase double-psi beta-barrel fold of glycoside hydrolase family 45 endoglucanase EG27II and similar proteins. This group is made up of endoglucanases from mollusks similar to Ampullaria crossean endoglucanase EG27II, a glycoside hydrolase family 45 (GH45) subfamily B protein. Endoglucanases (EC 3.2.1.4) catalyze the endohydrolysis of (1-4)-beta-D-glucosidic linkages in cellulose, lichenin, and cereal beta-D-glucans. Animal cellulases, such as endoglucanase EG27II, have great potential for industrial applications such as bioethanol production. GH45 endoglucanases from mollusks adopt a double-psi beta-barrel (DPBB) fold. 149
38640 412081 cd22279 AcrIF1 Anti-CRISPR type I subtype F1 (AcrIF1). AcrIF1 (also known as AcrF1) is an anti-CRISPR (Acr) protein that targets type I-F Csy and blocks CRISPR-RNA (crRNA) and invader DNA hybridization. It has been shown that multiple copies of AcrIF1 bind to the CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats-CRISPR associated protein) complex with different modes when working individually or cooperating with AcrIF2, which might exclude target DNA binding through different mechanisms. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps: the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). 
Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 77
38641 412082 cd22280 AcrIF2 Anti-CRISPR type I subtype F2. AcrIF2 (also known as AcrF2) is an anti-CRISPR (Acr) protein which functions as a double-stranded "DNA mimic protein" (DMP) that binds to the type I-F CRISPR-Cas surveillance complex (Csy) and excludes target DNA binding. The key feature of DMPs is their DNA-like shape and charge distribution, and they affect the activity of DNA-binding proteins by occupying their DNA-binding domains. Acidic residues on the surface of AcrIF2 mimic the negative charge distribution on the helical backbone of a DNA duplex. The type I-F Csy complex is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 86
38642 409000 cd22283 HD_XRCC4_N N-terminal head domain found in X-ray repair cross-complementing protein 4 and similar proteins. X-ray repair cross-complementing protein 4 (XRCC4) is a DNA repair protein involved in DNA non-homologous end-joining (NHEJ), which is required for double-strand break repair and V(D)J recombination. The DNA ligase IV (LIG4)-XRCC4 complex is responsible for the ligation step of NHEJ, and XRCC4 enhances the joining activity of LIG4. Binding of the LIG4-XRCC4 complex to DNA ends is dependent on the assembly of the DNA-dependent protein kinase complex DNA-PK to these DNA ends. XRCC4 monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. In addition, XRCC4 and XLF form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XRCC4, which is structurally related to other XRCC4-superfamily members, PAXX, XLF, SAS6, and CCDC61. 117
38643 409001 cd22284 HD_CCDC61_N N-terminal head domain found in coiled-coil domain-containing protein 61 and similar proteins. Coiled-coil domain-containing protein 61 (CCDC61), also known as variable flagellar number 3 (VFL3), is a centrosomal protein required for spindle assembly and precise chromosome alignments in mitosis. It is the human ortholog of proteins required for anchoring distinct sets of cytoskeletal fibers to centrioles in unicellular eukaryotes. CCDC61 monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. These CCDC61 homodimers assemble into linear filaments. This model corresponds to the N-terminal head domain of CCDC61, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and PAXX. 135
38644 409002 cd22285 HD_XLF_N N-terminal head domain found in XRCC4-like factor and similar proteins. XRCC4-like factor (XLF), also known as non-homologous end-joining factor 1 (NHEJ1) or protein cernunnos, is involved in DNA nonhomologous end joining (NHEJ), which is required for double-strand break (DSB) repair and V(D)J recombination. It interacts with the XRCC4-DNA ligase IV complex to promote NHEJ. It may act in concert with XRCC6/XRCC5 (Ku) to stimulate XRCC4-mediated joining of blunt ends and several types of mismatched ends that are non-complementary or partially complementary. XLF binds DNA in a length-dependent manner. Similar to XRCC4, XLF monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. In addition, XLF and XRCC4 form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XLF, which is structurally related to other XRCC4-superfamily members, XRCC4, PAXX, SAS6, and CCDC61. 109
38645 409003 cd22286 HD_PAXX_N N-terminal head domain found in paralog of XRCC4 and XLF, and similar proteins. Paralog of XRCC4 and XLF (PAXX), also called XRCC4-like small protein, is a paralog of X-ray repair cross-complementing protein 4 (XRCC4) and XRCC4-like factor (XLF). It is involved in non-homologous end joining (NHEJ), a major pathway to repair double-strand breaks (DSBs) in DNA. It may act as a scaffold required to stabilize the DSB-repair protein Ku heterodimer, composed of XRCC5/Ku80 and XRCC6/Ku70, at double-strand break sites in cells. It functions with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents. Similar to XRCC4 and XLF, PAXX monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. This model corresponds to the N-terminal head domain of PAXX, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and CCDC61. 102
38646 412083 cd22287 REV3L_RBD REV7 binding domain found in protein reversionless 3-like (REV3L) and similar proteins. REV3L, also called REV3-like, or REV3, or DNA polymerase zeta catalytic subunit (POLZ), is the catalytic subunit of the DNA polymerase zeta complex, an error-prone polymerase specialized in translesion DNA synthesis (TLS). REV3L lacks an intrinsic 3'-5' exonuclease activity and thus has no proofreading function. The model corresponds to a conserved region that is responsible for the binding of REV7. 23
38647 412084 cd22288 CWC27_CTD C-terminal domain of spliceosome-associated protein CWC27 and similar proteins. CWC27, also called antigen NY-CO-10, or probable inactive peptidyl-prolyl cis-trans isomerase CWC27, or PPIase CWC27, or serologically defined colon cancer antigen 10, is part of the spliceosome and plays a role in pre-mRNA splicing. It is a probable inactive PPIase with no peptidyl-prolyl cis-trans isomerase activity. This model corresponds to the C-terminal domain of CWC27, which interacts with CWC22 MIF4G domain. 56
38648 412085 cd22289 RecQL4_SLD2_NTD N-terminal homeodomain-like domain of metazoan RecQ protein-like 4 (RecQL4), fungal DNA replication regulator SLD2 and similar proteins. RecQL4, also called ATP-dependent DNA helicase Q4, or DNA helicase, RecQ-like type 4 (RecQ4), or RTS, is a DNA-dependent ATPase that may modulate chromosome segregation. This family also includes fungal DNA replication regulator SLD2, also known as DNA replication and checkpoint protein 1 (DRC1), which functions with DPB11 to control DNA replication and the S-phase checkpoint. It is also required for the proper activation of RAD53 in response to DNA damage and replication blocks. This model corresponds to the N-terminal domain of RecQL4 and SLD2, which is a homeodomain-like DNA interaction motif. 49
38649 412086 cd22290 cc_RasGRP1_C C-terminal coiled-coil domain of RAS guanyl-releasing protein 1 (RasGRP1) and similar proteins. RasGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII), or Ras guanyl-releasing protein, acts as a calcium- and diacylglycerol (DAG)-regulated nucleotide exchange factor, specifically activating Ras through the exchange of bound GDP for GTP. This model corresponds to the C-terminal coiled-coil domain of RasGRP1, which mediates oligomerization. 55
38650 412087 cd22291 cc_THAP11_C C-terminal coiled-coil domain of THAP domain-containing protein 11. THAP domain-containing protein 11 (THAP11) is a cell cycle and cell growth regulator differentially expressed in cancer cells. It acts as a transcriptional repressor that plays a central role for embryogenesis and the pluripotency of embryonic stem (ES) cells. This model corresponds to the C-terminal coiled-coil domain of THAP11, which is involved in protein dimerization. 61
38651 412088 cd22292 cc_Cep135_MBD coiled-coil microtubule binding domain of centrosomal protein of 135 kDa (Cep135) and similar proteins. Cep135, also called centrosomal protein 4, is involved in early centriole assembly, duplication, biogenesis, and formation. It is required for the recruitment of CEP295 to the proximal end of new-born centrioles at the centriolar microtubule wall during early S phase in a PLK4-dependent manner. This model corresponds to a conserved coiled-coil domain of Cep135, which is critical for microtubule binding. 62
38652 412089 cd22293 RBD_SHLD3_N N-terminal REV7-binding domain of Shieldin complex subunit 3 (SHLD3) and similar proteins. SHLD3, also called REV7-interacting novel NHEJ regulator 1, or Shield complex subunit 3, is a component of the shieldin complex, which plays an important role in the repair of DNA double-stranded breaks (DSBs). During G1 and S phase of the cell cycle, the complex functions downstream of TP53BP1 to promote non-homologous end joining (NHEJ) and suppress DNA end resection. SHLD3 mediates various NHEJ-dependent processes including immunoglobulin class-switch recombination, and fusion of unprotected telomeres. The model corresponds to the N-terminal REV7-binding domain of SHLD3, which contains a REV7-binding FXPWFP motif. 61
38653 412090 cd22294 MYO6_MIU_linker MIU-linker domain found in unconventional myosin-VI. Myosins are actin-based motor molecules with ATPase activity. Unconventional myosins function in intracellular movements. Myosin-VI, also called unconventional myosin-6 (MYO6), is a reverse-direction motor protein that moves towards the minus-end of actin filaments. It is required for the structural integrity of the Golgi apparatus via the p53-dependent pro-survival pathway. It appears to be involved in a very early step of clathrin-mediated endocytosis in polarized epithelial cells. It modulates RNA polymerase II-dependent transcription. As part of the DISP complex, Myosin-VI may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. Myosin-VI is encoded by the MYO6 gene, the human homologue of the gene responsible for deafness in Snell's waltzer mice. It is mutated in autosomal dominant nonsyndromic hearing loss. This model corresponds to a conserved region of myosin-VI, which consists of three helices: MIU (Motif Interacting with Ubiquitin), a common linker helix (linker-alpha1) and an isoform-specific helix (linker-alpha2). 69
38654 411969 cd22295 cc_LAMB_C C-terminal coiled-coil domain found in the laminin subunit beta (LAMB) family. The LAMB family contains four members, LAMB1-4. They are components of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB, which may be involved in the integrin binding activity. 70
38655 412091 cd22296 CBD_TRPV5_C C-terminal CaM binding domain found in transient receptor potential cation channel subfamily V member 5 (TRPV5) and similar proteins. TRPV5, also called calcium transport protein 2 (CaT2), epithelial calcium channel 1 (ECaC1), or Osm-9-like TRP channel 3 (OTRPC3), is a constitutively active calcium selective cation channel that might be involved in Ca(2+) reabsorption in kidney and intestine. The channel is activated by low internal calcium levels, and the current exhibits an inward rectification. The model corresponds to the C-terminal calmodulin (CaM) binding domain of TRPV5, which contains several CaM binding sites in the N- and C-terminal tails. The binding of CaM to the C-terminal binding site is essential for the fast Ca2+-dependent inactivation of the channel. 73
38656 412092 cd22297 PSMD4_RAZUL RAZUL (Rpn10 AZUL-binding) domain of 26S proteasome non-ATPase regulatory subunit 4 (PSMD4) and similar proteins. PSMD4 is also called 26S proteasome regulatory subunit RPN10, 26S proteasome regulatory subunit S5A, antisecretory factor 1, AF, ASF, or multiubiquitin chain-binding protein (MCB1). It acts as a ubiquitin receptor subunit through ubiquitin-interacting motifs and selects ubiquitin-conjugates for destruction. It displays a preferred selectivity for longer polyubiquitin chains. PSMD4 is a component of the 26S proteasome, a multiprotein complex involved in the ATP-dependent degradation of ubiquitinated proteins. The proteasome participates in numerous cellular processes, including cell cycle progression, apoptosis, or DNA damage repair. The model corresponds to the C-terminal Rpn10 AZUL-binding domain (RAZUL) of PSMD4, which is responsible for binding the AZUL domain of E6AP/UBE3A. AZUL stands for amino-terminal zinc-binding domain of ubiquitin E3a ligase. 48
38657 412093 cd22298 NuMA_LGNBD LGN binding domain (LGNBD) of nuclear mitotic apparatus protein (NuMA) and similar proteins. NuMA, also called nuclear matrix protein-22 (NMP-22), nuclear mitotic apparatus protein 1 (NUMA1), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of spindle poles and the alignment and segregation of chromosomes during mitotic cell division. It is involved in the establishment of mitotic spindle orientation during metaphase, and elongation during anaphase in a dynein-dynactin-dependent manner. NuMA, in complex with LGN, forms NuMA:LGN hetero-hexamers that promote spindle orientation. The model corresponds to the LGN binding domain (LGNBD) of NuMA. LGN (named for leu-gly-asn repeats) is also known as G protein signaling modulator 2. 56
38658 411970 cd22299 cc_LAMB2_C C-terminal coiled-coil domain found in laminin subunit beta-2 (LAMB2). LAMB2 is also called laminin B1s chain, laminin-11 subunit beta, laminin-14 subunit beta, laminin-15 subunit beta, laminin-3 subunit beta, laminin-4 subunit beta, laminin-7 subunit beta, laminin-9 subunit beta, S-laminin subunit beta, or S-LAM beta (LAMS). It is an important component of the interphotoreceptor matrix and plays a role in rod morphogenesis. It may also have an important function in the sarcolemmal basement membrane. Mutations of the LAMB2 gene mainly cause Pierson syndrome (microcoria-congenital nephrosis syndrome). LAMB2 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB2, which may be involved in the integrin binding activity. 72
38659 411971 cd22300 cc_LAMB1_C C-terminal coiled-coil domain found in laminin subunit beta-1 (LAMB1). LAMB1 is also called laminin B1 chain, laminin-1 subunit beta, laminin-10 subunit beta, laminin-12 subunit beta, laminin-2 subunit beta, laminin-6 subunit beta, or laminin-8 subunit beta. It is a glycoprotein that is involved in the pathogenesis of neurodevelopmental disorders. It also plays a crucial role in both lung morphogenesis and physiological function. Mutations in LAMB1 are associated with Cobblestone brain malformation (COB) with variable muscular or ocular abnormalities. LAMB1 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB1, which is involved in the integrin binding activity. 73
38660 411972 cd22301 cc_LAMB4_C C-terminal coiled-coil domain found in laminin subunit beta-4 (LAMB4). LAMB4, also called laminin beta-1-related protein, is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. Mutations or loss of LAMB4 may be features of gastric and colorectal cancers. Reduced LAMB4 levels may contribute to colonic dysmotility associated with diverticulitis. This model corresponds to the C-terminal coiled-coil domain of LAMB4, which may be involved in the integrin binding activity. 70
38661 411973 cd22302 cc_DmLAMB1-like_C C-terminal coiled-coil domain found in Drosophila melanogaster laminin subunit beta-1 (DmLAMB1) and similar proteins. DmLAMB1, also called LanB1, is a glycoprotein required for nidogen (Ndg) localization to the basement membrane. It is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of DmLAMB1, which may be involved in the integrin binding activity. 70
38662 411974 cd22303 cc_LAMB3_C C-terminal coiled-coil domain found in laminin subunit beta-3 (LAMB3). LAMB3 is also called epiligrin subunit beta, kalinin B1 chain, kalinin subunit beta, laminin B1k chain, laminin-5 subunit beta, or nicein subunit beta. It is a major component of the basement membrane in most adult tissues. Mutations in LAMB3 are associated with Herlitz junctional epidermolysis bullosa (H-JEB), a severe autosomal recessive disorder characterized by blister formation within the dermal-epidermal basement membrane. LAMB3 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB3, which may be involved in the integrin binding activity. 71
38663 408997 cd22304 VpdB_C C-terminal fragment of effector protein VpdB. This model represents the C-terminal fragment of the effector protein VpdB that binds the Legionella pneumophila Dot/Icm type IVB coupling protein (T4CP) complex which includes IcmS, IcmW, and LvgA. These L. pneumophila proteins are known to selectively assist the export of a subclass of effectors. The effector protein VpdB, like other L. pneumophila effectors VpdA, VpdC and VpdD, is a homolog of phospholipase A (PLA) patatin-like enzymes. However, VpdB does not appear to be involved in phospholipid metabolism. The structure reveals interactions between LvgA and a linear motif in the C-terminus of VpdB. This binding interface of LvgA also interacts with the C-terminal region of three additional L. pneumophila effectors, SidH, SetA, and PieA. 126
38664 412069 cd22305 NDFIP1 NEDD4 family-interacting protein 1. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP1 has been shown to play a role in a variety of processes, including inflammation, immune signaling, and nuclear trafficking. 206
38665 412070 cd22306 NDFIP2 NEDD4 family-interacting protein 2. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP2 may play a role in protein trafficking. 229
38666 412094 cd22307 Adgb_C_mid-like C-terminal middle region of Androglobins (Adgbs) and related proteins; including permuted globin domain and IQ motif. Androglobin (Adgb, also known as Calpain-7-like protein, CAPN7L) is a large multidomain protein consisting of an N-terminal peptidase C2 family calpain-like domain, an IQ calmodulin-binding motif, and an internal, circularly permuted globin domain. The canonical secondary structure of hemoglobins is a 3-over-3 alpha-helical sandwich structure, where the eight alpha-helical segments are conventionally labeled, A-H, according to their sequential order; Adgbs differ from this in having helices C-H followed by A-B. Adgbs and other phylogenetically ancient globins, such as neuroglobins and globin X, form hexacoordinated heme iron complexes. Globins contain various highly conserved residues of the heme pocket: including a Phe in the interhelical position CD1 (Phe CD1, first position in the loop between the helices C and D) that is packed against the heme, a His at the 7th position of the E-helix (His E7) that binds the heme iron distally, and a His at the 8th position of the F-helix (His F8) that binds the heme iron proximally. Unlike other hexacoordinated globins, Adgbs have an E7 Gln; their hexacoordination scheme is [Gln]-Fe-[His]. In mammals, Adgb is mainly expressed in the testes and may play an important role in spermatogenesis. Arthropod Adgbs have degenerate globin domains (DOI:10.3389/fgene.2020.00858). This model spans the permuted globin domain, the IQ motif, and a conserved region of about 200 amino acid residues located C-terminal to the globin domain; it does not include the N-terminal protease domain or the large uncharacterized C-terminal domain of approximately 500 residues. 416
38667 411712 cd22308 Af1548-like Archaeoglobus fulgidus Af1548 and similar putative endonucleases. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 165
38668 411713 cd22309 AgeI-like restriction endonuclease AgeI and similar endonucleases. Type IIP restriction endonuclease AgeI recognizes a palindromic sequence 5'-A|CCGGT-3' and cuts it ('|' denotes the cleavage site) producing staggered DNA ends. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 205
38669 411714 cd22310 BcnI-like Restriction endonuclease BcnI and similar endonucleases. Restriction endonuclease BcnI cleaves duplex DNA containing the sequence 5'-CC|SGG-3' (S stands for C or G, | designates a cleavage position) to generate staggered products with single nucleotide 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 229
38670 411715 cd22311 BglI-like Restriction endonuclease BglI and similar endonucleases. BglI is a type II restriction endonuclease that recognizes the interrupted DNA sequence GCCNNNNNGGC and cleaves between the fourth and fifth unspecified base pair to produce overhanging ends; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 284
38671 411716 cd22312 BglII-like Restriction endonuclease BglII and similar endonucleases. Restriction endonuclease BglII cleaves duplex DNA containing the sequence 5'-A|GATCT-3' (| designates the cleavage position) to generate staggered products with four nucleotide 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 178
38672 411717 cd22313 BsaWI-like endonuclease BsaWI and similar endonucleases. The type II restriction endonuclease BsaWI recognizes a degenerated sequence 5'-W|CCGGW-3', where W stands for A or T and '|' denotes the cleavage site. It belongs to a family of restriction endonucleases that recognize a conserved CCGG tetranucleotide in their target and form homodimers or homotetramers, requiring binding of one, two or three DNA targets for optimal catalytic activity. They are part of a yet larger superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many other restriction endonucleases, such as EcoRI, BamHI, and FokI. 276
38673 411718 cd22314 Bse634I-like Restriction endonuclease Bse634I and similar endonucleases. Bacillus stearothermophilus restriction endonuclease Bse634I recognizes the nucleotide sequence R|CCGGY (R = A or G, Y = T or C, with | designating the cleavage site) and is an isoschizomer of Citrobacter freundii restriction endonuclease Cfr10I; it is active as a homotetramer and belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 281
38674 411719 cd22315 BsoBI-like Type II restriction endonuclease BsoBI and similar proteins. BsoBI is a thermophilic PDDEXK-family restriction endonuclease exhibiting both base-specific and degenerate recognition within the sequence C-Y-C-G-R-G (R = A or G, Y = T or C). A conserved histidine has been proposed to act as a general base in the catalysis. BsoBI belongs to a wider superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 288
38675 411720 cd22316 BspD6I-like nicking endonuclease Nt.BspD6I and similar endonucleases. Heterodimeric type II restriction endonuclease nicking endonuclease BspD6I recognizes a pseudosymmetric DNA sequence (5'-GAGTC) and cuts both strands outside the recognition motif 4 nucleotides downstream. It forms the large subunit in a heterodimeric arrangement. This catalytic domain/subunit belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 345
38676 411721 cd22317 BstYI-like type II restriction endonuclease BstYI and similar proteins. BstYI is a thermophilic PDDExK-family restriction endonuclease with specificities that overlap those of BamHI and BglII; it cleaves the degenerate hexanucleotide R-G-A-T-C-Y (R = A or G, Y = T or C) and is part of a larger superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 188
38677 411722 cd22318 DNA2_N-like Nuclease domain of the nuclease/helicase DNA2 and related nucleases. The eukaryotic nuclease/helicase DNA2 processes double-strand breaks in DNA that have single-stranded ends/overhangs, as well as Okazaki fragments and stalled replication forks; it is therefore crucial for maintaining the integrity of the genome. The nuclease domain modeled here belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 234
38678 411723 cd22319 DpnI-like type II restriction endonuclease DpnI and similar proteins. This catalytic PD-(D/E)XK domain co-occurs with a C-terminal winged-helix DNA binding domain that is not included in the model. Both domains of R.DpnI bind DNA and are separately specific for the Gm6ATC sequences in Dam-methylated DNA. DpnI or Dam-replacing protein (DRP) is a restriction endonuclease flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. Type II restriction endonuclease DpnI belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 174
38679 411724 cd22320 Ecl18kI-like Restriction endonuclease Ecl18kI and similar endonucleases. Restriction endonuclease Ecl18kI recognizes the sequence |CCNGG and cleaves it before the outer C (| designates the cleavage site) to generate 5 nt 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 262
38680 411725 cd22321 EcoO109I-like Restriction endonuclease EcoO109I and related endonucleases. EcoO109I is a type II restriction endonuclease that recognizes ds DNA with a seven-base pair motif of both degenerate and discontinuous sequence, RG|GNCCY (R = A or G, Y = T or C, with | designating the cleavage site), and generates 5'-overhangs; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 232
38681 411726 cd22322 EcoRII-like Restriction endonuclease EcoRII and similar endonucleases. Restriction endonuclease EcoRII recognizes the sequence 5'-CCWGG-3' (W stands for A or T); it requires binding of a second target site as an allosteric effector in order to be active. EcoRII belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 211
38682 411727 cd22323 EcoRV-like Restriction endonuclease EcoRV and similar endonucleases. Type II restriction endonuclease EcoRV recognizes the site 5'-GAT|ATC-3' (| denotes the cleavage site) and functions as a homodimer; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 221
38683 411728 cd22324 Endonuclease_I Endonuclease I and similar nucleases. Junction-resolving T7 endonuclease I is a nuclease that is selective for the structure of the four-way DNA junction; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 114
38684 411729 cd22325 ERCC1_C-like Central domain of ERCC1. ERCC1 is a subunit of the DNA structure-specific endonuclease XPF-ERCC1, which incises a damaged DNA strand on the 5' side of a lesion during nucleotide excision repair. It also plays roles in DNA interstrand crosslink repair and homologous recombination. The ERCC1 central domain modeled here interacts tightly with XPF and may be involved in binding to single-stranded DNA. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 128
38685 411730 cd22326 FAN1-like repair nuclease FAN1. This model characterizes a set of nucleases that resemble Holliday-junction resolving enzymes. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 652
38686 411731 cd22327 FokI_nuclease-like Nuclease domain of restriction endonuclease FokI and similar endonucleases. The type II restriction endonuclease FokI recognizes an asymmetric nucleotide sequence 5'-GGATG(N)9/13 and cleaves both DNA strands outside the recognition motif; its nuclease domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and HindIII. 161
38687 411732 cd22328 Hef-like Hef-like homing endonuclease and similar nucleases. Hef-like homing endonucleases include I-Bth0305I, which is encoded within a group I intron in the recA gene of a Bacillus thuringiensis bacteriophage and cleaves a DNA target in the uninterrupted recA gene at a position immediately adjacent to the intron insertion site. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 99
38688 411733 cd22329 HincII-like Restriction endonuclease HincII and similar endonucleases. Type II restriction endonuclease HincII cleaves double-stranded DNA 5'-GTY|RAC-3' (| denotes the cleavage site, Y stands for C or T, R stands for A or G) creating blunt ends. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 252
38689 411734 cd22330 HindIII-like Restriction endonuclease HindIII and similar endonucleases. The type II restriction endonuclease HindIII cleaves DNA at the palindromic sequence A|AGCTT (| denotes the cleavage site). It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 289
38690 411735 cd22331 HinP1I-like Restriction endonuclease HinP1I and similar endonucleases. HinP1I is a type II restriction endonuclease that recognizes and cleaves a palindromic tetranucleotide sequence (G|CGC) in double-stranded DNA, producing 2 nt 5' overhanging ends; it belongs to the PDDEXK superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 243
38691 411736 cd22332 HsdR_N N-terminal domain of HsdR motor subunit of type I restriction-modification enzyme EcoR124I and similar systems. The N-terminal endonuclease-like domain of HsdR motor subunit of type I restriction-modification enzyme EcoR124I belongs to a wider superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 226
38692 411737 cd22333 LlaBIII_nuclease-like nuclease domain of type ISB restriction-modification enzyme LlaBIII and similar nuclease domains. This N-terminal nuclease domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 149
38693 411738 cd22334 MspI-like Restriction endonuclease MspI and similar endonucleases. The type II restriction endonuclease MspI recognizes and cleaves the palindromic tetranucleotide sequence 5'-C|CGG (| denotes the cleavage site) leaving 2 base 5' overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 262
38694 411739 cd22335 MspjI-like Modification-dependent restriction endonuclease MspjI and similar endonucleases. MspJI recognizes 5-methylcytosine or 5-hydroxymethylcytosine as part of the motif CNN(G/A) and cleaves both strands at fixed distances (N(12)/N(16)) away from the modified cytosine at the 3'-side. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 185
38695 411740 cd22336 MunI-like restriction endonuclease MunI and similar proteins. MunI (EC 3.1.21.4) is a type II restriction enzyme that catalyzes the hydrolysis of DNA, recognizing the palindromic hexanucleotide sequence CAATTG (with the cleavage site after C-1), and is very similar to EcoRI. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 200
38696 411741 cd22337 MvaI-like Restriction Endonuclease MvaI and similar endonucleases. Restriction endonuclease MvaI recognizes the sequence CC|WGG (W stands for A or T, '|' designates the cleavage site) and generates products with single nucleotide 5'-overhangs; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 239
38697 411742 cd22338 NaeI-like Restriction endonuclease NaeI and similar endonucleases. The type II restriction endonuclease NaeI recognizes and cleaves the DNA motif GCC|GGC (| denotes the cleavage site) and forms a covalent bond with the cleaved substrate. The enzyme binds two DNA recognition sites and only cleaves one DNA sequence. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 256
38698 411743 cd22339 NciI-like Restriction endonuclease NciI and similar endonucleases. NciI is a type II restriction endonuclease that recognizes and cleaves the sequence CC|SGG (S stands for C or G, | denotes the cleavage site). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 170
38699 411744 cd22340 NgoMIV-like Restriction endonuclease NgoMIV and similar endonucleases. Type II restriction endonuclease NgoMIV recognizes and cleaves the palindromic sequence 5'-G|CCGGC-3' (| denotes the cleavage site) to produce 4 bp 5' staggered ends. It is active as a homotetramer and belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 238
38700 411745 cd22341 NucS-like Mismatch restriction endonuclease NucS and similar nucleases. Archaeal mismatch restriction endonuclease NucS and its ortholog EndoMS specifically cleave dsDNA containing mismatched bases. They belong to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 237
38701 411746 cd22342 Pa4535-like putative restriction endonuclease similar to Pseudomonas aeruginosa Pa4535. These proteins belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 195
38702 411747 cd22343 PDDEXK_lambda_exonuclease-like Uncharacterized nucleases similar to lambda phage exonuclease. This model characterizes a diverse set of nucleases such as alkaline exonuclease from Laribacter hongkongensis, lambda phage exonuclease, or a Cas4-like protein from the Mimivirus virophage resistance element system. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 182
38703 411748 cd22344 PDDEXK_nuclease uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 220
38704 411749 cd22345 PDDEXK_nuclease uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 201
38705 411750 cd22346 PDDEXK_nuclease uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 221
38706 411751 cd22347 PDDEXK_nuclease uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 203
38707 411752 cd22348 PDDEXK_nuclease uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 218
38708 411753 cd22349 PDDEXK_RNA_polymerase-like Endonuclease domain of segmented negative-strand RNA virus (sNSV) polymerases. The N-terminal endonuclease domain of sNSV polymerases is essential for viral cap-dependent transcription; it has endonuclease activity and belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 182
38709 411754 cd22350 PspGI-like Restriction endonuclease PspGI and similar nucleases. PspGI is an isoschizomer of EcoRII, it recognizes and cleaves the DNA sequence 5'-|CCWGG-3' (| denotes the cleavage site, W stands for A or T). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 239
38710 411755 cd22351 PvuII-like Restriction endonuclease PvuII and similar nucleases. The type II restriction endonuclease PvuII recognizes and cleaves the DNA sequence 5'-CAG|CTG-3' leaving blunt ends (| denotes the cleavage site). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 151
38711 411756 cd22352 RecB_C-like C-terminal nuclease domain of exodeoxyribonuclease V subunit RecB and similar proteins. Exodeoxyribonuclease V subunit beta (RecB) is a helicase/nuclease that prepares dsDNA breaks (DSB) for recombinational DNA repair; it binds to DSBs and unwinds DNA via a rapid and highly processive ATP-dependent bidirectional helicase. The C-terminal PDDEXK nuclease domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 215
38712 411757 cd22353 RecC_C-like C-terminal nuclease-like domain of exodeoxyribonuclease V subunit RecC and similar proteins. Exodeoxyribonuclease V subunit gamma (RecC) is part of the RecBCD complex that processes DNA ends resulting from a double-strand break. Its C-terminal domain contacts the two separate strands of the DNA substrate and may be responsible for stabilizing RecD interactions with the complex. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 283
38713 411758 cd22354 RecU-like Holliday junction resolvase RecU (recombination protein U) and similar nucleases. Holliday junction (HJ) resolving enzyme RecU is involved in DNA repair and recombination. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 157
38714 411759 cd22355 Sau3AI_C C-terminal allosteric effector domain of the restriction endonuclease Sau3AI. Sau3AI is a type II restriction enzyme that recognizes the 5'-|GATC-3' sequence in double-strand DNA (| denotes the cleavage site). The C-terminal domain modeled here does not have catalytic activity; it functions as an allosteric effector domain that assists in DNA binding and cleavage. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 214
38715 411760 cd22356 Sau3AI_N-like N-terminal catalytic domain of type II restriction enzyme Sau3AI and similar endonucleases. Sau3AI is a type II restriction enzyme that recognizes the 5'-|GATC-3' sequence in double-strand DNA (| denotes the cleavage site). The N-terminal domain modeled here conveys the catalytic activity, it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 188
38716 411761 cd22357 SfsA-like Sugar fermentation stimulation protein A and similar nucleases. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 213
38717 411762 cd22358 SfsA-like_archaeal Sugar fermentation stimulation protein A and similar nucleases. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 221
38718 411763 cd22359 SfsA-like_bacterial Sugar fermentation stimulation protein A and similar proteins. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. The N-terminus of SfsA resembles a DNA-binding OB-fold domain. 218
38719 411764 cd22360 SgrAI-like Restriction endonuclease SgrAI and similar nucleases. The type II restriction endonuclease SgrAI binds and cleaves the target sequence CR|CCGGYG (| denotes the cleavage site, R stands for a purine and Y stands for a pyrimidine). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 272
38720 411765 cd22361 ThaI-like type II restriction endonuclease subunit R of ThaI and similar endonucleases. The PD-(D/E)XK type II restriction endonuclease ThaI cuts the target sequence CG/CG with blunt ends. It belongs to a superfamily of PDDEXK nucleases that includes diverse members such as very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 200
38721 411766 cd22362 TnsA_endonuclease-like Transposon Tn7 transposition protein TnsA. TnsA is part of the Tn7 transposon mobile genetic element working together with TnsB, TnsC, and TnsD to facilitate insertion of the transposon. TnsA catalyzes cleavage at the transposon 5' ends, and TnsC is the activator of the composite TnsAB transposase. TnsA belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 234
38722 411767 cd22363 tRNA-intron_lyase_C catalytic C-terminal domain of the tRNA-intron lyase. This C-terminal catalytic domain of tRNA intron endonucleases cleaves pre tRNA at the 5' and 3' splice sites to release the intron (EC:3.1.27.9). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 91
38723 411768 cd22364 VC1899-like putative nuclease domain found in Vibrio cholerae VC1899 and similar proteins. A putative nuclease domain found in Vibrio cholerae VC1899 and similar proteins belongs to a superfamily of PDDEXK nucleases that includes diverse members such as very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 377
38724 411769 cd22365 VRR-NUC-like Virus-type replication repair nuclease. This model characterizes a set of nucleases that resemble Holliday-junction resolving enzymes. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 92
38725 411770 cd22366 XisH-like Endonuclease XisH and similar nucleases. XisH functions as an endonuclease in the control of expression of nitrogen fixation genes of certain Anabaena and Nostoc species of cyanobacteria. Together with XisI, it controls the cell-type specificity of the excision of the fdxN element by the recombinase XisF. XisH belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 133
38726 411771 cd22367 XPF_ERCC4_MUS81-like XPF family DNA repair endonuclease. XPF (Xeroderma Pigmentosum group F) DNA repair gene homologs are members of the XPF/Rad1/Mus81-dependent nuclease family which specifically cleave branched structures generated during DNA repair, replication, and recombination, and they are essential for maintaining genome stability. They belong to a wider superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 123
38727 411772 cd22368 YaeQ-like Nucleases similar to Escherichia coli YaeQ. This model characterizes a diverse set of poorly characterized nucleases such as Escherichia coli YaeQ. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 174
38728 411956 cd22369 alphaCoV_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) protein from alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses including human coronaviruses (HCoVs), HCoV-NL63, and HCoV-229E, and porcine coronaviruses, transmissible gastroenteritis virus (TGEV) and porcine epidemic diarrhea virus (PEDV), among others. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP), and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1 the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 666
38729 411957 cd22370 betaCoV_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses. This family contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses, including three highly pathogenic human coronaviruses (CoVs), Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as a 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 667
38730 411958 cd22371 alphaCoV-HKU2-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the CoV spike (S) glycoprotein from Rhinolophus bat coronavirus HKU2 and related alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Wencheng shrew coronavirus (WESV), Lucheng Rn rat coronavirus (LRNV), and two bat viruses (Rhinolophus bat coronavirus HKU2 and BtRf-AlphaCoV/YN2012). Members of this group form a distinct cluster that is separated from the other alphacoronaviruses. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 686
38731 411959 cd22372 gammaCoV_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from avian infectious bronchitis coronavirus (IBV) and related gammacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from gammacoronaviruses, including avian infectious bronchitis virus, and Beluga whale coronavirus SW1 (whale-CoV SW1). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 661
38732 411960 cd22373 delta-PDCoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine coronavirus HKU15, avian coronaviruses, and related deltacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine coronavirus PDCoV, and several avian coronaviruses such as quail deltacoronavirus (QdCoV) UAE-HKU30, white-eye coronavirus HKU16, common moorhen coronavirus HKU21, thrush CoV HKU12, and munia CoV HKU13, all from the Buldecovirus subgenus of deltacoronaviruses. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 648
38733 411961 cd22374 delta-PiCoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Pigeon coronavirus UAE-HKU29, and related avian deltacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Pigeon coronavirus UAE-HKU29, and related avian deltacoronaviruses including Falcon coronavirus UAE-HKU27, Magpie-robin coronavirus HKU18, Sparrow coronavirus HKU17, and Night heron coronavirus HKU19. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 739
38734 411962 cd22375 HCoV-NL63-229E-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoproteins from HCoV-NL63, HCoV-229E, and related alphacoronavirus. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses, including human coronaviruses (HCoVs), HCoV-NL63 and HCoV-229E. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 677
38735 411963 cd22376 PDEV-like_Spike_SD1-2_S1-2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Porcine epidemic diarrhea virus and related alphacoronavirus. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses, including porcine epidemic diarrhea virus (PEDV), Scotophilus bat coronavirus, and swine enteric coronavirus, among others. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 673
38736 411964 cd22377 TGEV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from transmissible gastroenteritis virus and related alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine transmissible gastroenteritis virus (TGEV), canine coronavirus (CCoV), and feline coronavirus (FCoV). They display greater than 96% sequence identity and have been grouped in the same species, alphacoronavirus 1, within the Alphacoronavirus genus. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 751
38737 411965 cd22378 SARS-CoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from SARS-CoV-2 (COVID-19) and related betacoronaviruses in the B lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the sarbecovirus subgenus (B lineage), including highly pathogenic human CoVs such as Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS-CoV-2 (also known as a 2019 novel coronavirus or 2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. Notably, SARS-CoV-2 has a functional polybasic (furin) cleavage site through the insertion of PRRAR*SV (* indicates the cleavage site) at the S1/S2 interface, which is absent in SARS-CoV and other SARS-related coronaviruses. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 662
38738 411966 cd22379 MERS-CoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Middle East respiratory syndrome coronavirus and related betacoronaviruses in the C lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome coronavirus (MERS-CoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 682
38739 411967 cd22380 HKU1-CoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from human HKU1 and OC43 coronaviruses and related betacoronaviruses in the A lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the embecovirus subgenus (A lineage), including highly pathogenic human coronaviruses (CoVs), HKU1 and OC43 CoVs, as well as murine hepatitis virus (MHV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of MHV is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 663
38740 411968 cd22381 bat-HKU9-CoV-like_Spike_SD1-2_S1-S2_S2 SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 731
38741 411810 cd22382 KH-I_SF1 type I K homology (KH) RNA-binding domain found in splicing factor 1 (SF1) and similar proteins. SF1, also called branch point-binding protein, or BBP, or transcription factor ZFM1, or zinc finger gene in MEN1 locus, or zinc finger protein 162, is necessary for the ATP-dependent first step of spliceosome assembly. Binds to the intron branch point sequence (BPS) 5'-UACUAAC-3' of the pre-mRNA. It may act as transcription repressor. 93
38742 411811 cd22383 KH-I_Hqk_like type I K homology (KH) RNA-binding domain found in protein quaking (Hqk) family. The Hqk family includes Hqk and protein held out wings (how) found in Drosophila. Hqk, also called HqkI, is an RNA-binding protein that plays a central role in myelinization. It binds to the 5'-NACUAAY-N(1,20)-UAAY-3' RNA core sequence and regulates target mRNA stability. It acts by regulating pre-mRNA splicing, mRNA export and protein translation. Hqk is a regulator of oligodendrocyte differentiation and maturation in the brain that may play a role in myelin and oligodendrocyte dysfunction in schizophrenia. How, also called KH domain protein KH93F, or protein muscle-specific, or protein Struthio, or protein wings held out (who), or Quaking-related 93F (qkr93F), is an RNA-binding protein involved in the control of muscular and cardiac activity. It is required for integrin-mediated cell-adhesion in wing blade. It plays essential roles during embryogenesis, in late stages of somatic muscle development, for myotube migration and during metamorphosis for muscle reorganization. 101
38743 411812 cd22384 KH-I_KHDRBS type I K homology (KH) RNA-binding domain found in the KH domain-containing, RNA-binding, signal transduction-associated protein (KHDRBS) family. The KHDRBS family includes three members, KHDRBS1-3. KHDRBS1, also called GAP-associated tyrosine phosphoprotein p62, or Src-associated in mitosis 68 kDa protein, or Sam68, or p21 Ras GTPase-activating protein-associated p62, or p68, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS1 acts as a putative regulator of mRNA stability and/or translation rates and mediates mRNA nuclear export. It is recruited and tyrosine phosphorylated by several receptor systems, for example the T-cell, leptin and insulin receptors. KHDRBS2, also called Sam68-like mammalian protein 1, or SLM-1, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds both poly(A) and poly(U) homopolymers. KHDRBS2 may function as an adapter protein for Src kinases during mitosis. KHDRBS3, also called RNA-binding protein T-Star, or Sam68-like mammalian protein 2, or SLM-2, or Sam68-like phosphotyrosine protein, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds optimally to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS3 may play a role as a negative regulator of cell growth. 102
38744 411813 cd22385 KH-I_KHDC4_rpt1 first type I K homology (KH) RNA-binding domain found in KH homology domain-containing protein 4 (KHDC4) and similar proteins. KHDC4, also called Brings lots of money 7 (Blom7), or pre-mRNA splicing factor protein KHDC4, is an RNA-binding protein involved in pre-mRNA splicing. It interacts with the PRP19C/Prp19 complex/NTC/Nineteen complex which is part of the spliceosome. KHDC4 binds preferentially RNA with A/C rich sequences and poly-C stretches. KHDC4 contains two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif. 84
38745 411814 cd22386 KH-I_KHDC4_rpt2 second type I K homology (KH) RNA-binding domain found in KH homology domain-containing protein 4 (KHDC4) and similar proteins. KHDC4, also called Brings lots of money 7 (Blom7), or pre-mRNA splicing factor protein KHDC4, is an RNA-binding protein involved in pre-mRNA splicing. It interacts with the PRP19C/Prp19 complex/NTC/Nineteen complex which is part of the spliceosome. KHDC4 binds preferentially RNA with A/C rich sequences and poly-C stretches. KHDC4 contains two type I K homology (KH) RNA-binding domains. The model corresponds to the second one. 102
38746 411815 cd22387 KH-I_DDX46_like type I K homology (KH) RNA-binding domain found in the family of DEAD box protein 46 (DDX46). The DDX46 family includes DEAD box protein 46 (DDX46), fungal pre-mRNA-processing ATP-dependent RNA helicase PRP5, Arabidopsis thaliana DEAD-box ATP-dependent RNA helicase RH42 and similar proteins. DDX46, also called PRP5 homolog, is an ATP-dependent RNA helicase that plays an essential role in splicing, either prior to, or during splicing A complex formation. It inhibits antiviral innate responses by entrapping selected antiviral transcripts in the nucleus. It is also involved in the development of several tumors. PRP5 is an ATP-dependent RNA helicase involved in spliceosome assembly and in nuclear splicing. It catalyzes an ATP-dependent conformational change of U2 snRNP. PRP5 interacts with the U2 snRNP and HSH155. RH42, also called DEAD-box RNA helicase RCF1, or REGULATOR OF CBF GENE EXPRESSION 1, is a helicase required for pre-mRNA splicing, cold-responsive gene regulation and cold tolerance. Members in this family contain a divergent KH domain that lacks the RNA-binding GXXG motif. 82
38747 411816 cd22388 KH-I_N4BP1_like_rpt2 second type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1). The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the second one. 63
38748 411817 cd22389 KH-I_Dim2p_like_rpt1 first type I K homology (KH) RNA-binding domain found in Pyrococcus horikoshii Dim2p and similar proteins. The family includes a group of conserved KH domain-containing proteins mainly from archaea, such as Dim2p homologues from Pyrococcus horikoshii and Aeropyrum pernix. Dim2p acts as a preribosomal RNA processing factor that has been identified as an essential protein for the maturation of 40S ribosomal subunit in Saccharomyces cerevisiae. It is required for the cleavage at processing site A2 to generate the pre-20S rRNA and for the dimethylation of the 18S rRNA by 18S rRNA dimethyltransferase, Dim1p. Dim2p contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one. 70
38749 411818 cd22390 KH-I_Dim2p_like_rpt2 second type I K homology (KH) RNA-binding domain found in Pyrococcus horikoshii Dim2p and similar proteins. The family includes a group of conserved KH domain-containing proteins mainly from archaea, such as Dim2p homologues from Pyrococcus horikoshii and Aeropyrum pernix. Dim2p acts as a preribosomal RNA processing factor that has been identified as an essential protein for the maturation of 40S ribosomal subunit in Saccharomyces cerevisiae. It is required for the cleavage at processing site A2 to generate the pre-20S rRNA and for the dimethylation of the 18S rRNA by 18S rRNA dimethyltransferase, Dim1p. Dim2p contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one. 96
38750 411819 cd22391 KH-I_PNO1_rpt1 first type I K homology (KH) RNA-binding domain found in partner of NOB1 (PNO1) and similar proteins. PNO1 is an RNA-binding protein that acts as a ribosome assembly factor and plays an important role in ribosome biogenesis. It positively regulates dimethylation of two adjacent adenosines in the loop of a conserved hairpin near the 3'-end of 18S rRNA. PNO1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif. 80
38751 411820 cd22392 KH-I_PNO1_rpt2 second type I K homology (KH) RNA-binding domain found in partner of NOB1 (PNO1) and similar proteins. PNO1 is an RNA-binding protein that acts as a ribosome assembly factor and plays an important role in ribosome biogenesis. It positively regulates dimethylation of two adjacent adenosines in the loop of a conserved hairpin near the 3'-end of 18S rRNA. PNO1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one. 96
38752 411821 cd22393 KH-I_KRR1_rpt1 first type I K homology (KH) RNA-binding domain found in KRR1 small subunit processome component and similar proteins. KRR1, also called HIV-1 Rev-binding protein 2, or KRR-R motif-containing protein 1, or Rev-interacting protein 1, or Rip-1, or ribosomal RNA assembly protein KRR1, is a ribosomal assembly factor required for 40S ribosome biogenesis. It is involved in nucleolar processing of pre-18S ribosomal RNA and ribosome assembly. KRR1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif and is involved in binding another assembly factor, Kri1. 83
38753 411822 cd22394 KH-I_KRR1_rpt2 second type I K homology (KH) RNA-binding domain found in KRR1 small subunit processome component and similar proteins. KRR1, also called HIV-1 Rev-binding protein 2, or KRR-R motif-containing protein 1, or Rev-interacting protein 1, or Rip-1, or ribosomal RNA assembly protein KRR1, is a nucleolar protein required for 40S ribosome biogenesis. It is involved in nucleolar processing of pre-18S ribosomal RNA and ribosome assembly. KRR1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one. 93
38754 411823 cd22395 KH-I_AKAP1 type I K homology (KH) RNA-binding domain found in mitochondrial A-kinase anchor protein 1 (AKAP1) and similar proteins. AKAP1, also called A-kinase anchor protein 149 kDa, or AKAP 149, or dual specificity A-kinase-anchoring protein 1, or D-AKAP-1, or protein kinase A-anchoring protein 1 (PRKA1), or spermatid A-kinase anchor protein 84, or S-AKAP84, is a novel developmentally regulated A kinase anchor protein of male germ cells. It binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane. 68
38755 411824 cd22396 KH-I_FUBP_rpt1 first type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the first one. 68
38756 411825 cd22397 KH-I_FUBP_rpt2 second type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the second one. 69
38757 411826 cd22398 KH-I_FUBP_rpt3 third type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the third one. 67
38758 411827 cd22399 KH-I_FUBP_rpt4 fourth type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one. 67
38759 411828 cd22400 KH-I_IGF2BP_rpt1 first type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one. 68
38760 411829 cd22401 KH-I_IGF2BP_rpt2 second type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one. 72
38761 411830 cd22402 KH-I_IGF2BP_rpt3 third type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one. 66
38762 411831 cd22403 KH-I_IGF2BP_rpt4 fourth type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one. 66
38763 411832 cd22404 KH-I_MASK type I K homology (KH) RNA-binding domain found in Mask family proteins. The Mask family includes Drosophila melanogaster ankyrin repeat and KH domain-containing protein Mask, and its mammalian homologues Mask1/ANKHD1 and Mask2/ANKRD17. Mask, also called multiple ankyrin repeat single KH domain-containing protein, is a large ankyrin repeat and KH domain-containing protein involved in Drosophila receptor tyrosine kinase signaling. It acts as a mediator of receptor tyrosine kinase (RTK) signaling and may act either downstream of MAPK or transduce signaling through a parallel branch of the RTK pathway. Mask is required for the development and organization of indirect flight muscle sarcomeres by regulating the formation of M line and H zone and the correct assembly of thick and thin filaments in the sarcomere. Mask1/ANKHD1, also called HIV-1 Vpr-binding ankyrin repeat protein, or multiple ankyrin repeats single KH domain, or Hmask, is highly expressed in various cancer tissues and is involved in cancer progression, including proliferation and invasion. Mask2/ANKRD17, also called ankyrin repeat protein 17, or gene trap ankyrin repeat protein (GTAR), or serologically defined breast cancer antigen NY-BR-16, is a ubiquitously expressed ankyrin factor essential for the vascular integrity during embryogenesis. It may be directly involved in the DNA replication process and play pivotal roles in cell cycle and DNA regulation. It is also involved in innate immune defense against bacteria and viruses. 71
38764 411833 cd22405 KH-I_Vigilin_rpt1 first type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the first one. 69
38765 411834 cd22406 KH-I_Vigilin_rpt2 second type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the second one. 75
38766 411835 cd22407 KH-I_Vigilin_rpt3 third type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the third one. 62
38767 411836 cd22408 KH-I_Vigilin_rpt4 fourth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fourth one. 62
38768 411837 cd22409 KH-I_Vigilin_rpt5 fifth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fifth one. 70
38769 411838 cd22410 KH-I_Vigilin_rpt7 seventh type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the seventh one. 67
38770 411839 cd22411 KH-I_Vigilin_rpt8 eighth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the eighth one. 62
38771 411840 cd22412 KH-I_Vigilin_rpt9 ninth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the ninth one. 70
38772 411841 cd22413 KH-I_Vigilin_rpt10 tenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the tenth one. 66
38773 411842 cd22414 KH-I_Vigilin_rpt11 eleventh type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the eleventh one. 66
38774 411843 cd22415 KH-I_Vigilin_rpt12 twelfth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the twelfth one. 92
38775 411844 cd22416 KH-I_Vigilin_rpt13 thirteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the thirteenth one. 78
38776 411845 cd22417 KH-I_Vigilin_rpt14 fourteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fourteenth one. 72
38777 411846 cd22418 KH-I_Vigilin_rpt15 fifteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fifteenth one. 69
38778 411847 cd22419 KH-I_ASCC1 type I K homology (KH) RNA-binding domain found in activating signal cointegrator 1 complex subunit 1 (ASCC1) and similar proteins. ASCC1, also called ASC-1 complex subunit p50, or Trip4 complex subunit p50, plays a role in DNA damage repair as component of the ASCC complex. It is part of the ASC-1 complex that enhances NF-kappa-B, SRF and AP1 transactivation. In cells responding to gastrin-activated paracrine signals, it is involved in the induction of SERPINB2 expression by gastrin. ASCC1 may also play a role in the development of neuromuscular junction. 66
38779 411848 cd22420 KH-I_BICC1_rpt1 first type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 81
38780 411849 cd22421 KH-I_BICC1_rpt2 second type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 70
38781 411850 cd22422 KH-I_BICC1_rpt3 third type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 67
38782 411851 cd22423 KH-I_MEX3_rpt1 first type I K homology (KH) RNA-binding domain found in the family of MEX-3 RNA-binding proteins. The MEX-3 protein family includes four members, MEX3A/RKHD4, MEX3B/RKHD3/RNF195, MEX3C/RKHD2/RNF194, and MEX3D/RKHD1/RNF193/TINO. They are homologs of Caenorhabditis elegans MEX-3 protein, a translational regulator that specifies the posterior blastomere identity in the early embryo and contributes to the maintenance of the germline totipotency. Mex-3 proteins are RNA-binding phosphoproteins involved in post-transcriptional regulatory mechanisms. They are characterized by containing two K-homology (KH) RNA-binding domains and a C-terminal RING finger. They bind RNA through their KH domains and shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The model corresponds to the first KH domain. 73
38783 411852 cd22424 KH-I_MEX3_rpt2 second type I K homology (KH) RNA-binding domain found in the family of MEX-3 RNA-binding proteins. The MEX-3 protein family includes four members, MEX3A/RKHD4, MEX3B/RKHD3/RNF195, MEX3C/RKHD2/RNF194, and MEX3D/RKHD1/RNF193/TINO. They are homologs of Caenorhabditis elegans MEX-3 protein, a translational regulator that specifies the posterior blastomere identity in the early embryo and contributes to the maintenance of the germline totipotency. Mex-3 proteins are RNA-binding phosphoproteins involved in post-transcriptional regulatory mechanisms. They are characterized by containing two K-homology (KH) RNA-binding domains and a C-terminal RING finger. They bind RNA through their KH domains and shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The model corresponds to the second KH domain. 72
38784 411853 cd22425 KH_I_FMR1_FXR_rpt1 first type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 77
38785 411854 cd22426 KH_I_FMR1_FXR_rpt2 second type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 63
38786 411855 cd22427 KH_I_FMR1_FXR_rpt3 third type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 79
38787 411856 cd22428 KH-I_TDRKH_rpt1 first type I K homology (KH) RNA-binding domain found in tudor and KH domain-containing protein (TDRKH) and similar proteins. TDRKH, also called tudor domain-containing protein 2 (TDRD2), is a mitochondria-anchored RNA-binding protein that is required for spermatogenesis and involved in piRNA biogenesis. It specifically recruits MIWI, but not MILI, to engage the piRNA pathway. TDRKH contains two K-homology (KH) RNA-binding domains and one tudor domain, which are involved in binding to RNA or single-strand DNA. The model corresponds to the first one. 74
38788 411857 cd22429 KH-I_TDRKH_rpt2 second type I K homology (KH) RNA-binding domain found in tudor and KH domain-containing protein (TDRKH) and similar proteins. TDRKH, also called tudor domain-containing protein 2 (TDRD2), is a mitochondria-anchored RNA-binding protein that is required for spermatogenesis and involved in piRNA biogenesis. It specifically recruits MIWI, but not MILI, to engage the piRNA pathway. TDRKH contains two K-homology (KH) RNA-binding domains and one tudor domain, which are involved in binding to RNA or single-strand DNA. The model corresponds to the second one. 82
38789 411858 cd22430 KH-I_DDX43_DDX53 type I K homology (KH) RNA-binding domain found in DEAD box protein 43 (DDX43), DEAD box protein 53 (DDX53) and similar proteins. DDX43 (also called cancer/testis antigen 13, or DEAD box protein HAGE, or helical antigen) displays tumor-specific expression. Diseases associated with DDX43 include rheumatoid lung disease. DDX53 (also called cancer-associated gene protein, or cancer/testis antigen 26, or DEAD box protein CAGE) shows high expression level in various tumors and is involved in anti-cancer drug resistance. Both DDX43 and DDX53 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. 66
38790 411859 cd22431 KH-I_RNaseY type I K homology (KH) RNA-binding domain found in ribonuclease Y (RNase Y) and similar proteins. RNase Y is an endoribonuclease that initiates mRNA decay. It initiates the decay of all SAM-dependent riboswitches, such as yitJ riboswitch. RNase Y is involved in processing of the gapA operon mRNA and it cleaves between cggR and gapA. It is also the decay-initiating endonuclease for rpsO mRNA. It plays a role in degradation of type I toxin-antitoxin system bsrG/SR4 RNAs and also a minor role in degradation of type I toxin-antitoxin system bsrE/SR5 RNAs. 79
38791 411860 cd22432 KH-I_HNRNPK_rpt1 first type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 64
38792 411861 cd22433 KH-I_HNRNPK_rpt2 second type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 70
38793 411862 cd22434 KH-I_HNRNPK_rpt3 third type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 74
38794 411863 cd22435 KH-I_NOVA_rpt1 first type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis. Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 73
38795 411864 cd22436 KH-I_NOVA_rpt2 second type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis. Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 70
38796 411865 cd22437 KH-I_BTR1_rpt2 second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 69
38797 411866 cd22438 KH-I_PCBP_rpt1 first type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 67
38798 411867 cd22439 KH-I_PCBP_rpt3 third type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 68
38799 411868 cd22440 KH-I_KHDC1_like type I K homology (KH) RNA-binding domain found in KHDC1-like family. The KHDC1-like family corresponds to a group of structurally related proteins characterized by an atypical RNA-binding KH domain. They are unique to eutherian mammals and specifically expressed in oocytes and/or embryonic stem cells. Family members include KH homology domain-containing protein 1 (KHDC1), KHDC1-like protein (KHDC1L), KHDC3-like protein (KHDC3L, also called ES cell-associated transcript 1 protein or ECAT1), developmental pluripotency-associated 5 protein (DPPA5, also called embryonal stem cell-specific gene 1 protein or ESG-1), Oocyte-expressed protein (OOEP, also called KH homology domain-containing protein 2 or KHDC2, or Oocyte- and embryo-specific protein 19 or OEP19). KHDC3L is essential for human oocyte maturation and pre-implantation development of the resulting embryos. DPPA5 is involved in the maintenance of embryonic stem (ES) cell pluripotency. OOEP plays an essential role for zygotes to progress beyond the first embryonic cell divisions. 68
38800 411869 cd22441 KH-I_CeGLD3_rpt1 first type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the first one. 71
38801 411870 cd22442 KH-I_CeGLD3_rpt2 second type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the second one. 73
38802 411871 cd22443 KH-I_CeGLD3_rpt3 third type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the third one. 74
38803 411872 cd22444 KH-I_CeGLD3_rpt4 fourth type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the fourth one. 77
38804 411873 cd22445 KH-I_Rrp4_Rrp40 type I K homology (KH) RNA-binding domain found in exosome complex components Rrp4, Rrp40 and similar proteins. The family includes two ribosomal RNA-processing proteins, Rrp4 and Rrp40. They are non-catalytic components of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Eukaryotic Rrp4 and Rrp40 contain a divergent KH domain that lacks the RNA-binding GXXG motif. 78
38805 411874 cd22446 KH-I_ScSCP160_rpt1 first type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the first one. 86
38806 411875 cd22447 KH-I_ScSCP160_rpt2 second type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the second one. 80
38807 411876 cd22448 KH-I_ScSCP160_rpt3 third type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the third one. 81
38808 411877 cd22449 KH-I_ScSCP160_rpt4 fourth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the fourth one. 70
38809 411878 cd22450 KH-I_ScSCP160_rpt5 fifth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the fifth one. 80
38810 411879 cd22451 KH-I_ScSCP160_rpt6 sixth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the sixth one. 69
38811 411880 cd22452 KH-I_ScSCP160_rpt7 seventh type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the seventh one. 65
38812 411881 cd22453 KH-I_MUG60_like type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe meiotically up-regulated gene 60 protein (MUG60) and similar proteins. MUG60 is a KH domain-containing protein that has a role in meiosis. The family also contains Saccharomyces cerevisiae KH domain-containing protein YLL032C. 72
38813 411882 cd22454 KH-I_Mextli_like type I K homology (KH) RNA-binding domain found in Drosophila melanogaster eukaryotic translation initiation factor 4E-binding protein Mextli and similar proteins. Mextli is a novel eukaryotic translation initiation factor 4E-binding protein that promotes translation in Drosophila melanogaster. 71
38814 411883 cd22455 KH-I_Rnc1_rpt1 first type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 70
38815 411884 cd22456 KH-I_Rnc1_rpt2 second type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 69
38816 411885 cd22457 KH-I_Rnc1_rpt3 third type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 64
38817 411886 cd22458 KH-I_MER1_like type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae meiotic recombination 1 protein (MER1) and similar proteins. MER1 is required for chromosome pairing and genetic recombination. It may function to bring the axial elements of the synaptonemal complex corresponding to homologous chromosomes together by initiating recombination. MER1 might be responsible for regulating the MER2 gene and/or gene product. 65
38818 411887 cd22459 KH-I_PEPPER_rpt1_like first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER, flowering locus K homology domain protein (FLK), RNA-binding KH domain-containing protein RCF3 and KH domain-containing protein HEN4. PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, regulates positively flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 and HEN4 contain five KH RNA-binding domains. The model corresponds to the KH1 domain of PEPPER and FLK, as well as KH1 and KH3 domains of RCF3 and HEN4. 69
38819 411888 cd22460 KH-I_PEPPER_rpt2_like second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER, flowering locus K homology domain protein (FLK), RNA-binding KH domain-containing protein RCF3 and KH domain-containing protein HEN4. PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, regulates positively flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 and HEN4 contain five KH RNA-binding domains. The model corresponds to the KH2 domain of PEPPER and FLK, as well as KH2 and KH4 domains of RCF3 and HEN4. 73
38820 411889 cd22461 KH-I_PEPPER_like_rpt3 third type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER and flowering locus K homology domain protein (FLK). PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, regulates positively flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. The model corresponds to the KH3 domain of PEPPER and FLK. 69
38821 411890 cd22462 KH-I_HEN4_like_rpt5 fifth type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana KH domain-containing protein HEN4 and similar proteins. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. HEN4 contains five K-homology (KH) RNA-binding domains. The model corresponds to the KH5 domain of HEN4. 66
38822 411891 cd22463 KH-I_RCF3_like_rpt5 fifth type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein RCF3 and similar protein. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 contains five K-homology (KH) RNA-binding domains. The model corresponds to the KH5 domain of RCF3. 71
38823 411892 cd22464 KH-I_AtC3H36_like type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana zinc finger CCCH domain-containing proteins AtC3H36, AtC3H52 and similar proteins. The family corresponds to a group of plant CCCH family zinc finger proteins, such as AtC3H36 and AtC3H52, which contain one K homology (KH) RNA-binding domain. They may play important roles in RNA processing as RNA-binding proteins in animals. They may also have an effective role in stress tolerance. 66
38824 411893 cd22465 KH-I_Hqk type I K homology (KH) RNA-binding domain found in protein quaking (Hqk) and similar proteins. Hqk, also called HqkI, is an RNA-binding protein that plays a central role in myelinization. It binds to the 5'-NACUAAY-N(1,20)-UAAY-3' RNA core sequence and regulates target mRNA stability. It acts by regulating pre-mRNA splicing, mRNA export and protein translation. Hqk is a regulator of oligodendrocyte differentiation and maturation in the brain that may play a role in myelin and oligodendrocyte dysfunction in schizophrenia. 103
38825 411894 cd22466 KH-I_HOW type I K homology (KH) RNA-binding domain found in Drosophila protein held out wings (how) and similar proteins. How, also called KH domain protein KH93F, or protein muscle-specific, or protein Struthio, or protein wings held out (who), or Quaking-related 93F (qkr93F), is an RNA-binding protein involved in the control of muscular and cardiac activity. It is required for integrin-mediated cell-adhesion in wing blade. It plays essential roles during embryogenesis, in late stages of somatic muscle development, for myotube migration and during metamorphosis for muscle reorganization. 105
38826 411895 cd22467 KH-I_SPIN1_like type I K homology (KH) RNA-binding domain found in Oryza sativa SPL11-interacting protein 1 (SPIN1) and similar proteins. SPIN1 is a K homology domain protein negatively regulated and ubiquitinated by the E3 ubiquitin ligase SPL11. It is involved in flowering time control in rice. SPIN1 binds DNA and RNA in vitro. 101
38827 411896 cd22468 KH-I_KHDRBS1 type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 1 (KHDRBS1) and similar proteins. KHDRBS1, also called GAP-associated tyrosine phosphoprotein p62, or Src-associated in mitosis 68 kDa protein, or Sam68, or p21 Ras GTPase-activating protein-associated p62, or p68, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS1 acts as a putative regulator of mRNA stability and/or translation rates and mediates mRNA nuclear export. It is recruited and tyrosine phosphorylated by several receptor systems, for example the T-cell, leptin and insulin receptors. 106
38828 411897 cd22469 KH-I_KHDRBS2 type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 2 (KHDRBS2) and similar proteins. KHDRBS2, also called Sam68-like mammalian protein 1, or SLM-1, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds both poly(A) and poly(U) homopolymers. KHDRBS2 may function as an adapter protein for Src kinases during mitosis. 118
38829 411898 cd22470 KH-I_KHDRBS3 type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 3 (KHDRBS3) and similar proteins. KHDRBS3, also called RNA-binding protein T-Star, or Sam68-like mammalian protein 2, or SLM-2, or Sam68-like phosphotyrosine protein, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds optimally to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS3 may play a role as a negative regulator of cell growth. 113
38830 411899 cd22471 KH-I_RIK_like_rpt1 first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein RIK and similar proteins. RIK, also called rough sheath 2-interacting KH domain protein, or RS2-interacting KH domain protein, is an RNA-binding protein that acts together with RS2/AS1 in the recruitment of HIRA. RIK contains two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif. 91
38831 411900 cd22472 KH-I_RIK_like_rpt2 second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein RIK and similar proteins. RIK, also called rough sheath 2-interacting KH domain protein, or RS2-interacting KH domain protein, is an RNA-binding protein that acts together with RS2/AS1 in the recruitment of HIRA. RIK contains two type I K homology (KH) RNA-binding domains. The model corresponds to the second one. 96
38832 411901 cd22473 KH-I_DDX46 type I K homology (KH) RNA-binding domain found in DEAD box protein 46 (DDX46) and similar proteins. DDX46, also called PRP5 homolog, is an ATP-dependent RNA helicase that plays an essential role in splicing, either prior to, or during splicing A complex formation. It inhibits antiviral innate responses by entrapping selected antiviral transcripts in the nucleus. It is also involved in the development of several tumors. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 103
38833 411902 cd22474 KH-I_PRP5_like type I K homology (KH) RNA-binding domain found in fungal pre-mRNA-processing ATP-dependent RNA helicase PRP5 and similar proteins. PRP5 is an ATP-dependent RNA helicase involved in spliceosome assembly and in nuclear splicing. It catalyzes an ATP-dependent conformational change of U2 snRNP. PRP5 interacts with the U2 snRNP and HSH155. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 89
38834 411903 cd22475 KH-I_AtRH42_like type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana DEAD-box ATP-dependent RNA helicase RH42 and similar proteins. RH42, also called DEAD-box RNA helicase RCF1, or REGULATOR OF CBF GENE EXPRESSION 1, is a helicase required for pre-mRNA splicing, cold-responsive gene regulation and cold tolerance. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 102
38835 411904 cd22476 KH-I_N4BP1 type I K homology (KH) RNA-binding domain found in NEDD4-binding protein 1 (N4BP1) and similar proteins. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing ubiquitination of substrates. 68
38836 411905 cd22477 KH-I_NYNRIN_like type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN). The NYNRIN subfamily includes NYNRIN and KH and NYN domain-containing protein (KHNYN). NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. 66
38837 411906 cd22478 KH-I_FUBP1_rpt1 first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one. 75
38838 411907 cd22479 KH-I_FUBP2_rpt1 first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one. 71
38839 411908 cd22480 KH-I_FUBP3_rpt1 first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one. 71
38840 411909 cd22481 KH-I_FUBP1_rpt2 second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one. 71
38841 411910 cd22482 KH-I_FUBP2_rpt2 second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one. 73
38842 411911 cd22483 KH-I_FUBP3_rpt2 second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one. 83
38843 411912 cd22484 KH-I_FUBP1_rpt3 third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one. 68
38844 411913 cd22485 KH-I_FUBP2_rpt3 third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one. 68
38845 411914 cd22486 KH-I_FUBP3_rpt3 third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one. 70
38846 411915 cd22487 KH-I_FUBP1_rpt4 fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one. 72
38847 411916 cd22488 KH-I_FUBP2_rpt4 fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one. 69
38848 411917 cd22489 KH-I_FUBP3_rpt4 fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one. 69
38849 411918 cd22490 KH-I_IGF2BP1_rpt1 first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one. 76
38850 411919 cd22491 KH-I_IGF2BP2_rpt1 first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one. 74
38851 411920 cd22492 KH-I_IGF2BP3_rpt1 first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one. 76
38852 411921 cd22493 KH-I_IGF2BP1_rpt2 second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one. 97
38853 411922 cd22494 KH-I_IGF2BP2_rpt2 second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one. 77
38854 411923 cd22495 KH-I_IGF2BP3_rpt2 second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one. 77
38855 411924 cd22496 KH-I_IGF2BP1_rpt3 third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one. 76
38856 411925 cd22497 KH-I_IGF2BP2_rpt3 third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one. 77
38857 411926 cd22498 KH-I_IGF2BP3_rpt3 third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one. 78
38858 411927 cd22499 KH-I_IGF2BP1_rpt4 fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one. 76
38859 411928 cd22500 KH-I_IGF2BP2_rpt4 fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one. 78
38860 411929 cd22501 KH-I_IGF2BP3_rpt4 fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one. 66
38861 411930 cd22502 KH-I_ANKRD17 type I K homology (KH) RNA-binding domain found in ankyrin repeat domain-containing protein 17 (ANKRD17) and similar proteins. ANKRD17, also called ankyrin repeat protein 17, or gene trap ankyrin repeat protein (GTAR), or serologically defined breast cancer antigen NY-BR-16, is a ubiquitously expressed ankyrin factor essential for the vascular integrity during embryogenesis. It may be directly involved in the DNA replication process and play pivotal roles in cell cycle and DNA regulation. It is also involved in innate immune defense against bacteria and viruses. 71
38862 411931 cd22503 KH-I_ANKHD1 type I K homology (KH) RNA-binding domain found in ankyrin repeat and KH domain-containing protein 1 (ANKHD1) and similar proteins. ANKHD1, also called HIV-1 Vpr-binding ankyrin repeat protein, or multiple ankyrin repeats single KH domain, or Hmask, is highly expressed in various cancer tissues and is involved in cancer progression, including proliferation and invasion. It acts as a scaffolding protein that may be associated with the abnormal phenotype of leukemia cells. It may have a role in MM cell proliferation and cell cycle progression by regulating expression of p21. It also regulates cell cycle progression and proliferation in multiple myeloma cells. ANKHD1 is a component of Hippo signaling pathway. It functions as a positive regulator of YAP1 and promotes cell growth and cell cycle progression through Cyclin A upregulation in prostate cancer cells. 83
38863 411932 cd22504 KH_I_FXR1_rpt1 first type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 77
38864 411933 cd22505 KH_I_FXR2_rpt1 first type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 77
38865 411934 cd22506 KH_I_FMR1_rpt1 first type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 77
38866 411935 cd22507 KH_I_FXR1_rpt2 second type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 63
38867 411936 cd22508 KH_I_FXR2_rpt2 second type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 63
38868 411937 cd22509 KH_I_FMR1_rpt2 second type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 63
38869 411938 cd22510 KH_I_FXR1_rpt3 third type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 78
38870 411939 cd22511 KH_I_FXR2_rpt3 third type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 78
38871 411940 cd22512 KH_I_FMR1_rpt3 third type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 78
38872 411941 cd22513 KH-I_BTR1_rpt1 first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 73
38873 411942 cd22514 KH-I_BTR1_rpt3 third type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 71
38874 411943 cd22515 KH-I_PCBP1_2_rpt1 first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 70
38875 411944 cd22516 KH-I_PCBP3_rpt1 first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 77
38876 411945 cd22517 KH-I_PCBP4_rpt1 first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. 70
38877 411946 cd22518 KH-I_PCBP1_2_rpt2 second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 78
38878 411947 cd22519 KH-I_PCBP3_rpt2 second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 79
38879 411948 cd22520 KH-I_PCBP4_rpt2 second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one. 72
38880 411949 cd22521 KH-I_PCBP1_2_rpt3 third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 76
38881 411950 cd22522 KH-I_PCBP3_rpt3 third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 75
38882 411951 cd22523 KH-I_PCBP4_rpt3 third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one. 68
38883 411952 cd22524 KH-I_Rrp4_prokar type I K homology (KH) RNA-binding domain found in exosome complex component Rrp4 mainly from archaea. The subfamily corresponds to ribosomal RNA-processing protein 4 (Rrp4) mainly from archaea. It is a non-catalytic component of the exosome, which is a phosphorolytic 3'-5' exoribonuclease complex involved in RNA degradation and processing. Rrp4 increases the RNA binding and the efficiency of RNA degradation and confers strong poly(A) specificity to the exosome. 82
38884 411953 cd22525 KH-I_Rrp4_eukar type I K homology (KH) RNA-binding domain found in exosome complex component Rrp4 from eukaryote. The subfamily corresponds to ribosomal RNA-processing protein 4 (Rrp4) mainly from eukaryote. Rrp4, also called exosome component 2 (EXOSC2), or ribosomal RNA-processing protein 4, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations in EXOSC2 gene are associated with a novel syndrome characterized by retinitis pigmentosa, progressive hearing loss, premature aging, short stature, mild intellectual disability and distinctive gestalt. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 123
38885 411954 cd22526 KH-I_Rrp40 type I K homology (KH) RNA-binding domain found in exosome complex component Rrp40 and similar proteins. Rrp40, also called exosome component 3 (EXOSC3), or ribosomal RNA-processing protein 40, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations of EXOSC3 gene are associated with neurological diseases. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 78
38886 412022 cd22527 IPD_PPP1R12A-like inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12A-like, and similar proteins. Protein phosphatase 1 regulatory subunit 12A-like (PPP1R12A-like) is a homolog of MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit. MYPT1 is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of PPP1R12A-like protein. 50
38887 412095 cd22528 av_Nsp3_ER-remodelling intracellular membrane remodeller motif of arterivirus non-structural protein 3 (Nsp3). This domain is present in subunit Nsp3 of RNA-arteriviruses, such as porcine arterivirus PRRSV and equine arterivirus EAV. Nsp3 proteins are localized to the ER and appear to be essential for formation of double-membrane vesicles that originate from the ER during the life-cycle of the virus. Arterivirus Nsp3 is a predicted tetra-spanning transmembrane protein containing four transmembrane helices, with the N- and C-termini of the protein residing in the cytoplasm. It contains a cluster of four highly conserved cysteine residues that are predicted to reside in the first luminal domain of the protein. These conserved cysteines play a key role in the formation of double-membrane vesicles (DMVs); mutagenesis of each completely blocked DMV formation. 57
38888 411786 cd22529 KH-II_NusA_rpt2 second type II K-homology (KH) RNA-binding domain found in transcription termination/antitermination protein NusA and similar proteins. NusA, also called N utilization substance protein A or transcription termination/antitermination L factor, is an essential multifunctional transcription elongation factor that participates in both transcription termination and antitermination. NusA anti-termination function plays an important role in the expression of ribosomal rrn operons. During transcription of many other genes, NusA-induced RNA polymerase pausing provides a mechanism for synchronizing transcription and translation. In prokaryotes, the N-terminal RNA polymerase-binding domain (NTD) is connected through a flexible hinge helix to three globular domains, the S1 and two K-homology, KH1 and KH2. The K-homology (KH) domains of NusA belong to the type II KH RNA-binding domain superfamily. This model corresponds to the second KH domain of NusA and similar proteins. 61
38889 411787 cd22530 KH-II_NusA_arch_rpt1 first type II K-homology (KH) RNA-binding domain found in archaeal probable transcription termination protein NusA and similar proteins. NusA, also called N utilization substance protein A, is an essential multifunctional transcription elongation factor that is universally conserved among prokaryotes and archaea. It participates in both transcription termination and antitermination. NusA homologs consisting of only the two type II K-homology (KH) domains are widely conserved in archaea. Although their function remains unclear, it has been found that Aeropyrum pernix NusA strongly binds to a certain CU-rich sequence near a termination signal. Archaeal NusA may have retained some functions of bacterial NusA, including ssRNA-binding ability. This model corresponds to the first KH domain of NusA found mainly in archaea. 69
38890 411788 cd22531 KH-II_NusA_arch_rpt2 second type II K-homology (KH) RNA-binding domain found in archaeal probable transcription termination protein NusA and similar proteins. NusA, also called N utilization substance protein A, is an essential multifunctional transcription elongation factor that is universally conserved among prokaryotes and archaea. It participates in both transcription termination and antitermination. NusA homologs consisting of only the two type II K-homology (KH) domains are widely conserved in archaea. Although their function remains unclear, it has been found that Aeropyrum pernix NusA strongly binds to a certain CU-rich sequence near a termination signal. Archaeal NusA may have retained some functions of bacterial NusA, including ssRNA-binding ability. This model corresponds to the second KH domain of NusA mainly found in archaea. 67
38891 411789 cd22532 KH-II_CPSF_arch_rpt1 first type II K-homology (KH) RNA-binding domain found in archaeal cleavage and polyadenylation specificity factor (CPSF) and similar proteins. The archaeal CPSFs are predicted to be metal-dependent RNases belonging to the beta-CASP family, a subgroup of enzymes within the metallo-beta-lactamase fold. Within the CPSF family, all archaeal genomes contain one member with two N-terminal type II K-homology (KH) domains and one without. This family includes the CPSF homologs from archaea possessing N-terminal KH domains. This model corresponds to the first KH domain, which is a non-canonical type II KH domain that does not contain the signature motif GXXG (where X represents any amino acid). 62
38892 411790 cd22533 KH-II_YlqC-like type II K-homology (KH) RNA-binding domain found in Bacillus subtilis UPF0109 protein YlqC and similar proteins. The family includes a group of uncharacterized proteins which show sequence similarity to Bacillus subtilis UPF0109 protein YlqC. They are mainly found in bacteria and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 75
38893 411791 cd22534 KH-II_Era type II K-homology (KH) RNA-binding domain found in GTPase Era and similar proteins. GTPase Era, also called ERA or GTP-binding protein Era, is an essential GTPase that binds both GDP and GTP, with nucleotide exchange occurring in the order of seconds whereas hydrolysis occurs in the order of minutes. It plays a role in numerous processes, including cell cycle regulation, energy metabolism, as a chaperone for 16S rRNA processing, and 30S ribosomal subunit biogenesis. Its presence in the 30S subunit may prevent translation initiation. GTPase Era may also be critical for maintaining cell growth and cell division rates. Members of this family contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid). 87
38894 411773 cd22536 SP4_N N-terminal domain of transcription factor Specificity Protein (SP) 4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4. 623
38895 411774 cd22537 SP3_N N-terminal domain of transcription factor Specificity Protein (SP) 3. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP1 and SP3 can interact with and recruit a large number of proteins including the transcription initiation complex, histone modifying enzymes, and chromatin remodeling complexes, which strongly suggest that SP1 and SP3 are important transcription factors in remodeling chromatin and the regulation of gene expression. SP3 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP3. 574
38896 411690 cd22538 SP8_N N-terminal domain of transcription factor Specificity Protein (SP) 8. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP8 is crucial for limb outgrowth and neuropore closure. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP8 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP8. 303
38897 411775 cd22539 SP1_N N-terminal domain of transcription factor Specificity Protein (SP) 1. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP1 has been shown to interact with a variety of proteins including myogenin, SMAD3, SUMO1, SF1, TAL1, and UBC. Some 12,000 SP1 binding sites are found in the human genome. SP1 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLF bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1. 433
38898 411776 cd22540 SP2_N N-terminal domain of transcription factor Specificity Protein (SP) 2. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2. 511
38899 412096 cd22541 SP5_N N-terminal domain of transcription factor Specificity Protein (SP) 5. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. All of them contain clade SP5, which plays a potential role in human cancers and was found in several human tumors including hepatocellular carcinoma, gastric cancer, and colon cancer. Leukemia inhibitor factor/Stat3 and Wnt/beta-catenin signaling pathways converge on SP5 to promote mouse embryonic stem cell self-renewal. SP5 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP5. 143
38900 411691 cd22542 SP7_N N-terminal domain of transcription factor Specificity Protein (SP) 7. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5 in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7. 297
38901 411692 cd22543 SP6-9_N N-terminal domains of transcription factor Specificity Proteins (SP) 6-9, and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the related N-terminal domains of SP6-SP9, and similar proteins. 162
38902 411693 cd22544 SP6_N N-terminal domain of transcription factor Specificity Protein (SP) 6. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6. 245
38903 411777 cd22545 SP1-4_N N-terminal domain of transcription factor Specificity Proteins (SP) 1-4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1-4. 82
38904 412097 cd22546 AcrIE2 Anti-CRISPR type I subtype E2. AcrIE2 (also known as AcrE2) is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. AcrIE2 was discovered via a guilt-by-association (GBA) approach, which was based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. AcrIE2 was then confirmed functionally to be a type I-E Acr. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E CRISPR-Cas system Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). 
Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 84
38905 411694 cd22547 SP6-9-like_N N-terminal domain of invertebrate transcription factor Specificity Proteins (SP) similar to SP6, SP8 and SP9. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming AER, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression, which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. 
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of invertebrate SPs similar to SP6, SP8, and SP9. 219
38906 411695 cd22549 SP9_N N-terminal domain of transcription factor Specificity Protein (SP) 9 and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression, which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP9 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. 
This model represents the N-terminal domain of SP9. 299
38907 412098 cd22551 AcrIE3 Anti-CRISPR type I subtype E3 (AcrIE3). AcrIE3 (also known as AcrE3) is an anti-CRISPR (Acr) protein that was discovered via a guilt-by-association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. Functional assays confirmed AcrIE3 as a Type I-E Acr protein. AcrIE3 associates with the Cascade complex to hinder DNA binding, via an unknown mechanism. The type I-E CRISPR-Cas system Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). 
Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 68
38908 412099 cd22552 AcrIE4 Anti-CRISPR type I subtype E4. AcrIE4, also known as AcrE4, anti-CRISPR protein 31 or ACR3112-31, is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. AcrIE4 was discovered via a guilt-by-association (GBA) approach, which was based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. AcrIE4 was then confirmed functionally to be a type I-E Acr. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). 
Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 52
38909 411778 cd22553 SP1-4_arthropods_N N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One of these belongs to the SP1-4 clade and is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods. 384
38910 412100 cd22554 Slr4-like S (surface)-layer proteins similar to Pseudoalteromonas tunicata Slr4. Pseudoalteromonas tunicata D2 Slr4 (also known as EAR28894 protein) is an S-layer protein and the dominant protein within P. tunicata pellicle biofilm components. S-layers are self-assembling, paracrystalline proteinaceous lattices that form an interface between the cell and its extracellular environment; purified P. tunicata Slr4 protein is able to form square (p4 symmetry) paracrystalline lattices. Slr4 may protect cells and biofilm matrix components against stressors such as attack by viruses, bacteria or eukaryotes. The Slr4 family is widely distributed in gammaproteobacteria, including species of Pseudoalteromonas and Vibrio, and is found exclusively in marine metagenomes. It may play an important role in marine microbial physiology and ecology. 400
38911 412101 cd22555 Lpg2603_kinase Legionella pneumophila Dot/Icm type IV secretion system effector Lpg2603 kinase domain. This model contains the kinase domain of the type IV secretion system (T4SS) effector Lpg2603, an atypical kinase from the bacterial pathogen Legionella pneumophila. Lpg2603 is a remote member of the protein kinase superfamily, sharing structural similarity but showing notable differences in primary amino acid sequence. Studies show that Lpg2603 is an active protein kinase that requires the eukaryote-specific host signaling molecule inositol hexakisphosphate (IP6) for activity; IP6 binding rearranges the active site to allow for ATP binding and catalysis. The C-terminal domain of Lpg2603 is a PI4P-binding domain. 291
38912 412102 cd22556 AcrIE5 Anti-CRISPR type I subtype E5. AcrIE5 (also known as AcrE5) is an anti-CRISPR (Acr) protein that was discovered via a guilt-by-association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E CRISPR-Cas system Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 65
38913 412103 cd22557 AcrIF4 Anti-CRISPR type I subtype F4. AcrIF4 is a phage anti-CRISPR (Acr) protein that has been shown to associate with the type I-F Cascade surveillance complex (type I-F Csy complex) to inhibit DNA binding. AcrIF4 binds the Csy complex with affinities that are orders of magnitude weaker than Acr proteins like AcrIF1 and AcrIF2. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, which inhibit a wide range of CRISPR-Cas systems using various inhibition mechanisms. Weak and strong Acr-phages often cooperate to overcome CRISPR resistance, with a first phage blocking the host CRISPR-Cas immune system to allow a second Acr-phage to successfully replicate which leads to epidemiological tipping points. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 95
38914 411697 cd22558 RETR_RHD N-terminal reticulon-homology domain of Reticulophagy regulators and similar proteins. This subfamily includes Reticulophagy regulators 1-3. Reticulophagy regulator 1 (RETREG1/FAM134B) is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. RETREG2/FAM134A and RETREG3/FAM134C have been shown to interact with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. Members of this subfamily contain an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an ER protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature. 192
38915 411698 cd22559 Arl6IP1 ADP-ribosylation factor-like protein 6-interacting protein 1. ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), also called apoptotic regulator in the membrane of the endoplasmic reticulum (ARMER), is an endoplasmic reticulum (ER) protein that has an important role in cell conduction and material transport. Arl6IP1, a tetraspan membrane protein, is an anti-apoptotic protein specific to multicellular organisms, and is a potential player in shaping the ER tubules in mammalian cells. In neurons, Arl6IP1 has been associated with the regulation of glutamate, a major excitatory neurotransmitter in excitatory synapses. In Drosophila, knockdown of the Arl6IP1 gene leads to progressive motor deficit. An Arl6IP1 variant has also been associated with hereditary spastic paraplegia (HSP), motor and sensory polyneuropathy, and acromutilation. Arl6IP1 shows some sequence similarity to the reticulon-homology domain (RHD) of reticulophagy regulators, which may function in inducing membrane curvature. 167
38916 411699 cd22560 RETR1_RHD N-terminal reticulon-homology domain of Reticulophagy regulator 1. Reticulophagy regulator 1 (RETR1 or RETREG1), also called reticulophagy receptor 1 or FAM134B (family with sequence similarity 134, member B), is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. Loss of function of FAM134B is associated with diseases and cancer, including hereditary sensory and autonomic neuropathy type IIB (HSAN IIB), colorectal adenocarcinoma, oesophageal squamous cell carcinoma, and other progressive neuronal degenerative diseases. FAM134B is also implicated in the suppression of viral replication during Ebola, Dengue, Zika, and West Nile viral infections. RETREG1/FAM134B contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an ER protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature. 198
38917 411700 cd22561 RETR2_RHD N-terminal reticulon-homology domain of Reticulophagy regulator 2. Reticulophagy regulator 2 (RETR2 or RETREG2), also called FAM134A (family with sequence similarity 134, member A), C2orf17, or MAG2, interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG2/FAM134A contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature. 199
38918 411701 cd22562 RETR3_RHD N-terminal reticulon-homology domain of Reticulophagy regulator 3. Reticulophagy regulator 3 (RETR3 or RETREG3), also called FAM134C (family with sequence similarity 134, member C), mediates NRF1-enhanced neurite outgrowth. It interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG3/FAM134C contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature. 192
38919 412104 cd22563 CclA carnocyclin A. Carnocyclin A (CclA) is a potent ribosomally synthesized antimicrobial peptide, originally found in Carnobacterium maltaromaticum UAL307, that displays a broad spectrum of activity against numerous Gram-positive organisms. An amide bond links the N and C termini of this circular bacteriocin, giving it stability and structural integrity. CclA interacts with lipid bilayers in a voltage-dependent manner and forms anion selective pores that preferentially bind halide anions. The ABC transporter CclEFGH facilitates the production of CclA. 51
38920 412105 cd22564 AcrIF5 Anti-CRISPR type I subtype F5. AcrIF5, also known as AcrF5, is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-F CRISPR-Cas system of Pseudomonas aeruginosa. AcrIF5 is a weak anti-CRISPR and its gene always co-occurs with the AcrIF3 gene; however, the AcrIF3 gene often occurs in the absence of the AcrIF5 gene. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, which inhibit a wide range of CRISPR-Cas systems via various inhibition mechanisms. Weak and strong Acr-phages often cooperate to overcome CRISPR resistance, with a first phage blocking the host CRISPR-Cas immune system to allow a second Acr-phage to successfully replicate which leads to epidemiological tipping points. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 79
38921 412106 cd22565 AcrIF7 Anti-CRISPR type I subtype F7. AcrIF7 (also known as AcrF7) is an anti-CRISPR (Acr) protein that was discovered via the guilt-by association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. Functional assays show that AcrIF7 in Pseudomonas aeruginosa prophages strongly inhibits the type I-F CRISPR-Cas system. It has been classified as a broad-range type I-F Acr, as it is able to block the type I-F system of both P. aeruginosa and Pectobacterium atrosepticum. AcrIF7 targets the Cas8f subunit of the Csy complex and may compete for the same binding interface with AcrIF2. Extensive mutagenic analyses revealed that AcrIF7 associated with the highly conserved dsDNA binding site of Cas8f, primarily via electrostatic interactions. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. 
CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered. 69
38922 412107 cd22566 MrpH-like Mannose-resistant Proteus-like fimbriae (MR/P) tip adhesin (MrpH) and similar proteins. This model contains mannose-resistant Proteus-like fimbriae (MR/P) tip adhesin (MrpH) found in Proteus mirabilis, a Gram-negative uropathogen and a major causative agent in catheter-associated urinary tract infections. MrpH is required for MR/P-dependent adherence to surfaces. While MR/P belongs to a well-known class of adhesive fimbriae encoded by the chaperone-usher pathway, MrpH has a markedly different structure compared with other tip-located adhesins in this family. It is a novel class of metal-binding adhesin that requires zinc to mediate biofilm formation. 131
38923 412108 cl00011 PLAT N/A. This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes. 0
38924 412109 cl00012 alpha_CA N/A. Carbonic anhydrase alpha, isozyme IX. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are strictly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane protein CA IX. CA IX is functionally implicated in tumor growth and survival. CA IX is mainly present in solid tumors and its expression in normal tissues is limited to the mucosa of the alimentary tract. CA IX is a transmembrane protein with two extracellular domains: carbonic anhydrase and a proteoglycan-like segment mediating cell-cell adhesion. There is evidence for an involvement of the MAPK pathway in the regulation of CA9 expression. 0
38925 412110 cl00013 Lyase_I_like Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions. This domain is found at the C-terminus of argininosuccinate lyase. 0
38926 412111 cl00014 SORL N/A. This domain is found in the sulfur oxidation protein SoxY. It is closely related to the Desulfoferrodoxin family pfam01880. Dissimilatory oxidation of thiosulfate is carried out by the ubiquitous sulfur-oxidizing (Sox) multi-enzyme system. In this system, SoxY plays a key role, functioning as the sulfur substrate-binding protein that offers its sulfur substrate, which is covalently bound to a conserved C-terminal cysteine, to another oxidizing Sox enzyme. The structure of this domain shows an Ig-like fold. 0
38927 412112 cl00015 nt_trans nucleotidyl transferase superfamily. This domain is found as the C-terminal portion of some HIGH_NTase1 proteins. The exact function is not known. 0
38928 412113 cl00016 Cyt_c_Oxidase_Vb N/A. cytochrome c oxidase subunit Vb 0
38929 412114 cl00017 Cyt_c_Oxidase_VIa N/A. cytochrome c oxidase subunit VI protein 0
38930 350864 cl00018 DSRD N/A. Most members of this family are small (approximately 36 amino acids) proteins that form homodimeric complexes. Each subunit contains a high-spin iron atom tetrahedrally bound to four cysteinyl sulphur atoms. This family has a similar fold to the rubredoxin metal binding domain. It is also found as the N-terminal domain of desulfoferrodoxin (see pfam01880). 0
38931 412115 cl00019 Macro_SF macrodomain superfamily. This domain is an ADP-ribose binding module. It is found in a number of yeast proteins. 0
38932 412116 cl00020 GAT_1 Type 1 glutamine amidotransferase (GATase1)-like domain. This family captures members that are not found in pfam00310, pfam07685 and pfam13230. 0
38933 412117 cl00021 PTS_IIB_man N/A. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIB components of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
38934 412118 cl00022 YbaK_like N/A. This domain is found either on its own or in association with the tRNA synthetase class II core domain (pfam00587). It is involved in the tRNA editing of mis-charged tRNAs including Cys-tRNA(Pro), Cys-tRNA(Cys), Ala-tRNA(Pro). The structure of this domain shows a novel fold. 0
38935 412119 cl00025 PTS_IIA_man N/A. phosphotransferase mannose-specific family component IIA; Provisional 0
38936 412120 cl00030 CH_SF calponin homology (CH) domain superfamily. This group is composed of the molecule interacting with CasL protein (MICAL) and EH domain-binding protein (EHBP) families. MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP proteins contain a single CH domain. CH domains are actin filament (F-actin) binding motifs. 0
38937 412121 cl00031 ALBUMIN N/A. Albumin domain, contains five or six internal disulphide bonds; albuminoid superfamily includes alpha-fetoprotein which binds various cations, fatty acids and bilirubin; vitamin D-binding protein which binds to vitamin D, its metabolites, and fatty acids; alpha-albumin which binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs; and afamin of which little is known; these belong to a multigene family with highly conserved intron/exon organization and encoded protein structures; evolutionary comparisons strongly support vitamin D-binding protein as the original gene in this group with subsequent local duplications generating the remaining genes in the cluster 0
38938 412122 cl00032 ANATO N/A. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. 0
38939 412123 cl00033 AP2 N/A. This 60 amino acid residue domain can bind to DNA and is found in transcription factor proteins. 0
38940 412124 cl00034 Bbox_SF B-box-type zinc finger superfamily. This group is composed of uncharacterized proteins containing a zinc finger B-box domain and a DUF2009 domain, and similar zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. 0
38941 412125 cl00035 BIR N/A. BIR stands for 'Baculovirus Inhibitor of apoptosis protein Repeat'. It is found repeated in inhibitor of apoptosis proteins (IAPs), and in fact it is also known as IAP repeat. These domains characteristically have a number of invariant residues, including 3 conserved cysteines and one conserved histidine that coordinate a zinc ion. They are usually made up of 4-5 alpha helices and a three-stranded beta-sheet. BIR is also found in other proteins known as BIR-domain-containing proteins (BIRPs), such as Survivin. 0
38942 412126 cl00038 BRCT C-terminal domain of the breast cancer suppressor protein (BRCA1) and related domains. This is the fifth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A. 0
38943 412127 cl00040 C1 protein kinase C conserved region 1 (C1 domain) superfamily. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. Members of this family contain C1 domain found in aPKC isoform zeta. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites. 0
38944 412128 cl00042 CASc N/A. Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed. 0
38945 412129 cl00046 ChtBD3 Chitin/cellulose binding domains of chitinase and related enzymes. This short domain is found in many different glycosyl hydrolase enzymes and is presumed to have a carbohydrate binding function. The domain has six aromatic groups that may be important for binding. 0
38946 412130 cl00047 CAP_ED N/A. Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases. 0
38947 412131 cl00049 CUB N/A. This is a family of hypothetical C. elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. However, this domain is related to the CUB domain. 0
38948 412132 cl00051 CysPc N/A. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit). 0
38949 412133 cl00054 DSRM_SF double-stranded RNA binding motif (DSRM) superfamily. A C-terminal domain in human dead end protein 1 (DND1_HUMAN) homologous to double strand RNA binding domains (PF00035, PF00333) 0
38950 412134 cl00055 MH1 N-terminal Mad Homology 1 (MH1) domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localization signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx. 0
38951 412135 cl00056 MH2 C-terminal Mad Homology 2 (MH2) domain. This is the MH2 (MAD homology 2) domain found at the carboxy terminus of MAD related proteins such as Smads. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins and provides specificity and selectivity to Smad function and also is critical for mediating interactions in Smad oligomers. Unlike MH1, MH2 does not bind DNA. The well-studied MH2 domain of Smad4 is composed of five alpha helices and three loops enclosing a beta sandwich. Smads are involved in the propagation of TGF-beta signals by direct association with the TGF-beta receptor kinase which phosphorylates the last two Ser of a conserved 'SSXS' motif located at the C-terminus of MH2. 0
38952 412136 cl00057 vWFA N/A. This is an uncharacterized domain found in eukaryotes and viruses. 0
38953 412137 cl00060 FGF N/A. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. They are found in a wide range of organisms, from nematodes to humans. Most share an internal core region of high similarity, conserved residues in which are involved in binding with their receptors. On binding, they cause dimerization of their tyrosine kinase receptors leading to intracellular signalling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. Most have N-terminal signal peptides and are secreted. A few lack signal sequences but are secreted anyway; still others also lack the signal peptide but are found on the cell surface and within the extracellular matrix. A third group remain intracellular. They have central roles in development, regulating cell proliferation, migration and differentiation. On the other hand, they are important in tissue repair following injury in adult organisms. 0
38954 412138 cl00061 FH_FOX Forkhead (FH) domain found in Forkhead box (FOX) family of transcription factors and similar proteins. FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is not required for T cell development, but is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. 0
38955 412139 cl00062 FHA N/A. Yop-YscD-cpl is the cytoplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus. 0
38956 412140 cl00063 FN1 N/A. One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding. 0
38957 412141 cl00064 ZnMc N/A. This is a family of uncharacterized proteins that carry the highly characteristic met-zincin motif HExxHxxGxxH, the extended zinc-binding domain of metallopeptidases. 0
38958 350886 cl00066 FU N/A. Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. 0
38959 412142 cl00068 GAL4 N/A. Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. It is present only in fungi. 0
38960 412143 cl00069 GGL N/A. G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins. It is also found fused to an inactive Galpha in the Dictyostelium protein gbqA. G-gamma likely shares a common origin with the helical N-terminal unit of G-beta. All organisms that possess a G-beta possess a G-gamma. 0
38961 412144 cl00071 GLECT N/A. This family contains galactoside binding lectins. The family also includes enzymes such as human eosinophil lysophospholipase (EC:3.1.1.5). 0
38962 412145 cl00072 GYF N/A. The GYF domain is named because of the presence of Gly-Tyr-Phe residues. The GYF domain is a proline-binding domain in CD2-binding protein. 0
38963 412146 cl00073 H15 N/A. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell types. 0
38964 412147 cl00075 HATPase Histidine kinase-like ATPase domain. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90. 0
38965 412148 cl00081 bHLH_SF basic Helix Loop Helix (bHLH) domain superfamily. DEC2, also termed Class E basic helix-loop-helix protein 41 (bHLHe41), or Class B basic helix-loop-helix protein 3 (bHLHb3), or enhancer-of-split and hairy-related protein 1 (SHARP-1), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. 0
38966 412149 cl00082 HMG-box N/A. This short 71 residue domain is an HMG-box domain. HMG-box domains mediate re-modelling of chromatin-structure. Mammalian HMG-box proteins are of two types: those that are non-sequence-specific DNA-binding proteins with two HMG-box domains and a long highly acidic C-tail; and a diverse group of sequence-specific transcription factor-proteins with either a single HMG-box or up to six copies, and no acidic C-tail. 0
38967 412150 cl00083 HNHc N/A. WHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif WHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. WHH is the shortest version of HNH nuclease families. Like AHH and LHH, the WHH nuclease contains 4 conserved histidines of which the first one is predicted to bind a metal-ion and other three ones are involved in activation of water molecule for hydrolysis. 0
38968 412151 cl00084 homeodomain N/A. This is a homeobox transcription factor KN domain conserved from fungi to human and plants. They were first identified as TALE homeobox genes in eukaryotes, (including KNOX and MEIS genes). They have been recently classified. 0
38969 412152 cl00085 FReD N/A. Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous. 0
38970 412153 cl00086 HPT N/A. The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072). 0
38971 412154 cl00087 HR1 Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases. The HR1 repeat was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN). The first two of these repeats were later shown to bind the small G protein rho known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal HR1 repeat complexed with RhoA has been determined by X-ray crystallography. It forms an antiparallel coiled-coil fold termed an ACC finger. 0
38972 412155 cl00089 NUC N/A. A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. An histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion. 0
38973 412156 cl00092 IFab N/A. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related except gamma-interferon. 0
38974 412157 cl00094 IL1 N/A. This family includes interleukin-1 and interleukin-18. 0
38975 412158 cl00096 IRF N/A. This family of transcription factors are important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. Three of the five conserved tryptophan residues bind to DNA. 0
38976 412159 cl00097 KAZAL_FS N/A. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides. 0
38977 412160 cl00098 KH-I K homology (KH) RNA-binding domain, type I. Rrp40, also called exosome component 3 (EXOSC3), or ribosomal RNA-processing protein 40, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations of EXOSC3 gene are associated with neurological diseases. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif. 0
38978 412161 cl00100 KR N/A. Kringle domains have been found in plasminogen, hepatocyte growth factors, prothrombin, and apolipoprotein A. Structure is disulfide-rich, nearly all-beta. 0
38979 412162 cl00101 KU N/A. Indicative of a protease inhibitor, usually a serine protease inhibitor. Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Certain family members are similar to the tick anticoagulant peptide (TAP). This is a highly selective inhibitor of factor Xa in the blood coagulation pathways. TAP molecules are highly dipolar, and are arranged to form a twisted two-stranded antiparallel beta-sheet followed by an alpha helix. 0
38980 412163 cl00103 Trefoil N/A. Proposed role in renewal and pathology of mucous epithelia. 0
38981 412164 cl00104 LDLa N/A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia. 0
38982 412165 cl00105 LMWP Low molecular weight phosphatase family. Arsenate reductase plays an important role in the reduction of intracellular arsenate to arsenite, an important step in arsenic detoxification. The reduction involves three different thiolate nucleophiles. In arsenate reductases of the LMWP family, reduction can be coupled with thioredoxin (Trx)/thioredoxin reductase (TrxR) or glutathione (GSH)/glutaredoxin (Grx). 0
38983 412166 cl00109 MADS N/A. SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from the MEF-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 0
38984 412167 cl00110 MBD N/A. MBDa is a second MBD domain of Methyl-CpG-binding domain proteins, a region implicated in binding the RbAp46/48 (retinoblastoma protein-associated protein) homolog p55, which is one of the components of the MBD2-NuRD complex. The MBD2-NuRD complex is a nucleosome remodelling and deacetylation complex. 0
38985 412168 cl00111 PAH N/A. Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions. 0
38986 412169 cl00112 PAN_APPLE N/A. The PAN domain contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. The domain is found in diverse proteins; in some it mediates protein-protein interactions, in others protein-carbohydrate interactions. 0
38987 412170 cl00113 CRIB N/A. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). 0
38988 412171 cl00116 PDGF N/A. Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF. 0
38989 412172 cl00117 PDZ N/A. This domain is the PDZ domain of tricorn protease. 0
38990 412173 cl00120 PP2Cc N/A. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase. 0
38991 412174 cl00123 PROF N/A. Binds actin monomers, membrane polyphosphoinositides and poly-L-proline. 0
38992 412175 cl00125 RHOD N/A. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases. 0
38993 412176 cl00128 RNase_A N/A. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices. 0
38994 412177 cl00130 PseudoU_synth Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. The structure of TruD reveals an overall V-shaped molecule which contains an RNA-binding cleft. 0
38995 412178 cl00133 CAP CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. This is a large family of cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) that are found in a wide range of organisms, including prokaryotes and non-vertebrate eukaryotes. The nine subfamilies of the mammalian CAP 'super'family include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumor suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP 'super'family results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how the cysteine-rich venom protein helothermine blocks the Ca++ transporting ryanodine receptors. 0
38996 412179 cl00134 Chemokine N/A. Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds. 0
38997 412180 cl00136 Sec7 N/A. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family. 0
38998 412181 cl00140 SNc N/A. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis. 0
38999 412182 cl00144 Tar_Tsr_LBD ligand binding domain of Tar- and Tsr-related chemoreceptors. This family is a four helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterized in E coli chemoreceptors. 0
39000 412183 cl00145 T-box DNA-binding domain of the T-box transcription factor family. Fungi incertae sedis refers to a fungal taxonomic group where its broader relationships are unknown or undefined. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors. 0
39001 412184 cl00146 TFIIS_I N/A. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. Notably, the 'small' and 'large' Mediator complexes differ in their subunit composition: the Med26 subunit preferentially associates with the small, active complex, whereas cdk8, cyclin C, Med12 and Med13 associate with the large Mediator complex. This family includes the C terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be a Pol II transcription elongation factor and interacts with Spt6 and Spt5. 0
39002 412185 cl00147 TNF N/A. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor. 0
39003 412186 cl00150 TY N/A. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteines 1 pairs with 2, 3 with 4 and 5 with 6. 0
39004 412187 cl00154 UBCc N/A. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are similar to the eukaryotic E2 proteins but lack the conserved asparagine. Members of this family are usually fused to an E1 domain at the C-terminus. The protein is usually in the gene neighborhood of a gene encoding a member of the pol-beta nucleotidyltransferase superfamily. Many of the operons in this family are in ICE-like mobile elements and plasmids. 0
39005 412188 cl00156 WAP N/A. WAP belongs to the group of Elafin or elastase-specific inhibitors. 0
39006 412189 cl00157 WW N/A. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro. 0
39007 412190 cl00159 fer2 N/A. The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This cluster appears within sarcosine oxidase proteins. 0
39008 412191 cl00160 LbetaH N/A. This family of proteins includes the characterized NeuD sialic acid O-acetyltransferase enzymes from E. coli and Streptococcus agalactiae (group B strep). These two are quite closely related to one another, so extension of this annotation to other members of the family is unsupported without additional independent evidence. The neuD gene is often observed in close proximity to the neuABC genes for the biosynthesis of CMP-N-acetylneuraminic acid (CMP-sialic acid), and NeuD sequences from these organisms were used to construct the seed for this model. Nevertheless, there are numerous instances of sequences identified by this model which are observed in a different genomic context (although almost universally in exopolysaccharide biosynthesis-related loci), as well as in genomes for which the biosynthesis of sialic acid (SA) is undemonstrated. Even in the cases where the association with SA biosynthesis is strong, it is unclear in the literature whether the biological substrate is SA itself, CMP-SA, or a polymer containing SA. Similarly, it is unclear to what extent the enzyme has a preference for acetylation at the 7, 8 or 9 positions. In the absence of evidence of association with SA, members of this family may be involved with the acetylation of differing sugar substrates, or possibly the delivery of alternative acyl groups. The closest related sequences to this family (and those used to root the phylogenetic tree constructed to create this model) are believed to be succinyltransferases involved in lysine biosynthesis. These proteins contain repeats of the bacterial transferase hexapeptide (pfam00132), although often these do not register above the trusted cutoff. 0
39009 412192 cl00162 PTS_IIA_glc N/A. These are part of the PTS Glucose-Glucoside (Glc) SuperFamily. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). The IIA, IIB and IIC domains of all of the permeases listed below are demonstrably homologous. These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. The three-dimensional structures of the IIA and IIB domains of the E. coli glucose permease have been elucidated. IIAglc has a complex b-sandwich structure while IIBglc is a split ab-sandwich with a topology unrelated to the split ab-sandwich structure of HPr. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39010 412193 cl00163 PTS_IIA_fru N/A. 4.A.2 The PTS Fructose-Mannitol (Fru) Family Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The fructose permeases of this family phosphorylate fructose on the 1-position. Those of family 4.6 phosphorylate fructose on the 6-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This model is specific for the IIA domain of the fructose PTS transporters. Also similar to the Enzyme IIA Fru subunits of the PTS, but included in TIGR01419 rather than this model, is enzyme IIA Ntr (nitrogen), also called PtsN, found in E. coli and other organisms, which may play a solely regulatory role. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39011 412194 cl00165 Calpain_III N/A. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. 0
39012 412195 cl00166 PTS_IIA_lac N/A. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family are one of four structurally and functionally distinct group IIA PTS system enzymes. This family of proteins normally function as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation. 0
39013 412196 cl00169 Mog1 N/A. Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran. The human homolog of Mog1 is thought to be alternatively spliced. 0
39014 412197 cl00170 eu-GS N/A. This model represents the eukaryotic glutathione synthetase, which shows little resemblance to the analogous enzyme of Gram-negative bacteria (TIGR01380). In the Kinetoplastida, trypanothione replaces glutathione, but can be made from glutathione; a sequence from Leishmania is not included in the seed, is highly divergent, and therefore scores between the trusted and noise cutoffs. 0
39015 412198 cl00173 VIP2 N/A. Members of this family, which are predominantly found in anthrax toxin lethal factor, adopt a structure consisting of a core of antiparallel beta sheets and alpha helices. They form a long deep groove within the protein that anchors the 16-residue N-terminal tail of MAPKK-2 before cleavage. It has been noted that this domain resembles the ADP-ribosylating toxin from Bacillus cereus, but the active site has been modified to augment substrate recognition. 0
39016 412199 cl00175 alpha-crystallin-Hsps_p23-like alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins. The CS and CHORD (pfam04968) are fused into a single polypeptide chain in metazoans but are found in separate proteins in plants; this is thought to be indicative of an interaction between CS and CHORD. It has been suggested that the CS domain is a binding module for HSP90, implying that CS domain-containing proteins are involved in recruiting heat shock proteins to multiprotein assemblies. Two CS domains are found at the N-terminus of Ubiquitin carboxyl-terminal hydrolase 19 (USP19), these domains may play a role in the interaction of USP19 with cellular inhibitor of apoptosis 2. 0
39017 412200 cl00178 Ecotin Protease Inhibitor Ecotin; homodimeric protease inhibitor. Ecotin is a broad range serine protease inhibitor, which forms homodimers. The C-terminal region contains the dimerization motif. Interestingly, the binding sites show a fluidity of protein contacts derived from ecotin's innate flexibility in fitting itself to proteases. 0
39018 412201 cl00179 AlgLyase N/A. This is the N-terminal domain of heparinase II/III proteins. It is a toroid-like domain. 0
39019 412202 cl00180 RabGEF N/A. Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport; 0
39020 412203 cl00182 Mth938-like N/A. This is a large family of uncharacterized proteins found in all domains of life. The structure shows a novel fold with three beta sheets. A dimeric form is found in the crystal structure. It was suggested that the cleft in between the two monomers might bind nucleic acid. 0
39021 412204 cl00184 CAS_like N/A. This family consists of various bacterial proteins pertaining to the non-haem Fe(II)-dependent oxygenase family. Exact function is unknown, but a putative role includes involvement in the control of utilisation of gamma-aminobutyric acid. 0
39022 381844 cl00185 PL_Passenger_AT N/A. Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 2, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of the nonprotease autotransporters, Ag43, AIDA-1 and IcsA, as well as, the less characterized ShdA, MisL, and BapA autotransporters. 0
39023 412205 cl00186 nidG2 N/A. Nidogen, an invariant component of basement membranes, is a multifunctional protein that interacts with most other major basement membrane proteins. The G2 fragment or (G2F domain) contains binding sites for collagen IV and perlecan. The structure is composed of an 11-stranded beta-barrel with a central helix. This domain is structurally related to that of green fluorescent protein pfam01353. A large surface patch on the beta-barrel is conserved in all metazoan nidogens. 0
39024 412206 cl00188 BPI N/A. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar. 0
39025 412207 cl00189 YlxR N/A. Ylxr homologs; group of conserved hypothetical bacterial proteins of unknown function; structure revealed putative RNA binding cleft; proteins are encoded by an operon that includes other proteins involved in transcription and/or translation 0
39026 412208 cl00192 ribokinase_pfkB_like N/A. This enzyme EC:2.7.4.7 is part of the Thiamine pyrophosphate (TPP) synthesis pathway, TPP is an essential cofactor for many enzymes. 0
39027 412209 cl00193 cytochrome_b_C N/A. cytochrome b6-f complex subunit IV; Provisional 0
39028 412210 cl00194 EF1B N/A. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains. 0
39029 412211 cl00195 SIR2 N/A. This family of proteins are related to the sirtuins. 0
39030 412212 cl00196 plant_peroxidase_like Heme-dependent peroxidases similar to plant peroxidases. As catalase, this enzyme catalyzes the dismutation of two molecules of hydrogen peroxide to dioxygen and two molecules of water. As a peroxidase, it uses hydrogen peroxide to oxidize donor compounds and produce water. KatG from E. coli is a homotetramer with two non-covalently associated iron protoheme IX groups per tetramer, but the ortholog from Synechococcus sp. is a homodimer with one protoheme. Important sites (numbered according to E. coli KatG) include heme ligands His-106 and His-267 and active site Trp-318. Note that the translation PID:g296476 from accession X71420 from Rhodobacter capsulatus B10 contains extensive frameshift differences from the rest of the orthologous family. [Cellular processes, Detoxification] 0
39031 412213 cl00197 cyclophilin N/A. The peptidyl-prolyl cis-trans isomerases, also known as cyclophilins, share this domain of about 109 amino acids. Cyclophilins have been found in all organisms studied so far and catalyze peptidyl-prolyl isomerisation during which the peptide bond preceding proline (the peptidyl-prolyl bond) is stabilized in the cis conformation. Mammalian cyclophilin A (CypA) is a major cellular target for the immunosuppressive drug cyclosporin A (CsA). Other roles for cyclophilins may include chaperone and cell signalling function. 0
39032 412214 cl00198 Phosphoglycerate_kinase N/A. phosphoglycerate kinase; Provisional 0
39033 412215 cl00199 SO_family_Moco N/A. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity. 0
39034 412216 cl00200 MIP N/A. MIP (Major Intrinsic Protein) family proteins exhibit essentially two distinct types of channel properties: (1) specific water transport by the aquaporins, and (2) small neutral solutes transport, such as glycerol by the glycerol facilitators. 0
39035 412217 cl00202 rubredoxin_like N/A. Rubredoxin; nonheme iron binding domains containing a [Fe(SCys)4] center. Rubredoxins are small nonheme iron proteins. The iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc. They are believed to be involved in electron transfer. 0
39036 412218 cl00203 Ribosomal_L30_like N/A. This family includes prokaryotic L30 and eukaryotic L7. 0
39037 412219 cl00204 PFK N/A. Members of this family that are characterized, save one, are phosphofructokinases dependent on pyrophosphate (EC 2.7.1.90) rather than ATP (EC 2.7.1.11). The exception is one of three phosphofructokinases from Streptomyces coelicolor. Family members are both bacterial and archaeal. [Energy metabolism, Glycolysis/gluconeogenesis] 0
39038 412220 cl00205 HMG-CoA_reductase Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). The HMG-CoA reductases catalyze the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in the synthesis of isoprenoids like cholesterol. Probably because of the critical role of this enzyme in cholesterol homeostasis, mammalian HMG-CoA reductase is heavily regulated at the transcriptional, translational, and post-translational levels. 0
39039 412221 cl00206 PTS-HPr_like N/A. The HPr family are bacterial proteins (or domains of proteins) which function in phosphoryl transfer system (PTS) systems. They include energy-coupling components which catalyze sugar uptake via a group translocation mechanism. The functions of most of these proteins are not known, but they presumably function in PTS-related regulatory capacities. All seed members are stand-alone HPr proteins, although the model also recognizes HPr domains of PTS fusion proteins. This family includes the related NPr protein. [Signal transduction, PTS] 0
39040 412222 cl00207 HMA N/A. This model describes an apparently copper-specific subfamily of the metal-binding domain HMA (pfam00403). Closely related sequences outside this model include mercury resistance proteins and repeated domains of eukaryotic copper transport proteins. Members of this family are strictly prokaryotic. The model identifies both small proteins consisting of just this domain and N-terminal regions of cation (probably copper) transporting ATPases. [Transport and binding proteins, Cations and iron carrying compounds] 0
39041 412223 cl00208 RNase_T2 N/A. Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the prokaryotic RNase T2 family members. 0
39042 412224 cl00210 Isoprenoid_Biosyn_C1 Isoprenoid Biosynthesis enzymes, Class 1. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiridiene synthase, 5-epi- aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase. 0
39043 412225 cl00211 Heme_Cu_Oxidase_III_like N/A. Heme-copper oxidase subunit III subfamily. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. 0
39044 412226 cl00212 microbial_RNases N/A. This enzyme hydrolyzes RNA and oligoribonucleotides. 0
39045 412227 cl00213 DNA_BRE_C DNA breaking-rejoining enzymes, C-terminal catalytic domain. catalyzes cleavage and ligation of DNA. 0
39046 412228 cl00214 Aldolase_II N/A. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. 0
39047 412229 cl00215 Aconitase_swivel N/A. This family represents the N-terminal region of several bacterial Aconitate hydratase 2 proteins and is found in conjunction with pfam00330. 0
39048 412230 cl00216 L-asparaginase_like Bacterial L-asparaginases and related enzymes. This is the N-terminal domain of this enzyme. 0
39049 412231 cl00217 pyrophosphatase N/A. inorganic pyrophosphatase; Provisional 0
39050 412232 cl00219 Pterin_binding N/A. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyzes a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin. 0
39051 381872 cl00220 cysteine_hydrolases N/A. This family are hydrolase enzymes. 0
39052 412233 cl00221 ACBP N/A. acyl CoA binding protein; Provisional 0
39053 412234 cl00222 Lyz-like lysozyme-like domains. This family is related to the SLT domain pfam01464. 0
39054 412235 cl00223 NusB_Sun N/A. Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central alpha-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined. 0
39055 412236 cl00224 PLPDE_IV N/A. The D-amino acid transferases (D-AAT) are required by bacteria to catalyze the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building block for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity. 0
39056 412237 cl00226 nuc_hydro N/A. A family of proteins in Rhodopirellula baltica that are predicted to be secreted. Also, a member has been identified in Caulobacter crescentus. These proteins may be related to pfam01156. 0
39057 412238 cl00227 PEBP PhosphatidylEthanolamine-Binding Protein (PEBP) domain. putative kinase inhibitor protein; Provisional 0
39058 412239 cl00228 HIT_like N/A. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the C-terminal region. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The C-terminal domain contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function. 0
39059 412240 cl00229 eIF1_SUI1_like Eukaryotic initiation factor 1 and related proteins. This protein family shows weak but suggestive similarity to translation initiation factor SUI1 and its prokaryotic homologs. 0
39060 412241 cl00230 Cis_IPPS Cis (Z)-Isoprenyl Diphosphate Synthases. Previously known as uncharacterized protein family UPF0015, a single member of this family has been identified as an undecaprenyl diphosphate synthase. 0
39061 412242 cl00231 SAICAR_synt 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Also known as Phosphoribosylaminoimidazole-succinocarboxamide synthase. 0
39062 412243 cl00232 Ribosomal_L19e N/A. Ribosomal protein L19e, archaeal. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 0
39063 412244 cl00233 HPPK N/A. This model describes the folate biosynthesis enzyme 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase. Alternate names include 6-hydroxymethyl-7,8-dihydropterin diphosphokinase and 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase (HPPK). The extreme C-terminal region, of typically eight to thirty residues, is not included in the model. This enzyme may be found as a fusion protein with other enzymes of folate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 0
39064 412245 cl00234 Pep_deformylase N/A. Peptide deformylase (EC 3.5.1.88), also called polypeptide deformylase, is a metalloenzyme that uses water to release formate from the N-terminal formyl-L-methionine of bacterial and chloroplast peptides. This enzyme should not be confused with formylmethionine deformylase (EC 3.5.1.31) which is active on free N-formyl methionine and has been reported from rat intestine. [Protein fate, Protein modification and repair] 0
39065 412246 cl00235 4Oxalocrotonate_Tautomerase N/A. This family includes the enzyme 4-oxalocrotonate tautomerase, which catalyzes the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate. 0
39066 412247 cl00236 Hsp33 N/A. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localized protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function. 0
39067 412248 cl00237 Peptidase_C15 N/A. PgaPase_1 is a family of functionally diverse Caenorhabditis proteins. The family is homologous to the cysteine-peptidases, but lack of a strictly conserved Glu-Cys-His catalytic triad or pGlu binding site implies that it has other functions that could have resulted in a change in reaction-specificity or even of catalytic activity. 0
39068 412249 cl00238 Frataxin N/A. This family contains proteins that have a domain related to the globular C-terminus of Frataxin, the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport. 0
39069 412250 cl00239 GXGXG N/A. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment. 0
39070 412251 cl00240 RRF N/A. The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are "recycled" and ready for another round of protein synthesis. 0
39071 412252 cl00241 IF6 N/A. This family includes eukaryotic translation initiation factor 6 as well as presumed archaebacterial homologs. 0
39072 412253 cl00242 MoaC N/A. Members of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known. 0
39073 412254 cl00245 MGS-like N/A. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site. 0
39074 412255 cl00246 MTHFR N/A. This family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from bacteria and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The structure for this domain is known to be a TIM barrel. 0
39075 412256 cl00247 MCR_gamma N/A. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (pfam02241), and 2 gamma (this family) subunits with two identical nickel porphinoid active sites. 0
39076 412257 cl00248 OMPLA N/A. Phospholipase A1 is a bacterial outer membrane bound acyl hydrolase with a broad substrate specificity EC:3.1.1.32. It has been proposed that Ser164 is the active site for Escherichia coli phospholipase A1. 0
39077 412258 cl00249 MCH N/A. Methenyl tetrahydromethanopterin cyclohydrolase EC:3.5.4.27 is involved in methanogenesis in bacteria and archaea, producing methane from carbon monoxide or carbon dioxide. 0
39078 412259 cl00250 RaiA N/A. This Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light-repressed protein (lrtA). 0
39079 412260 cl00251 Translocase_SecB N/A. This family consists of preprotein translocase subunit SecB. SecB is required for the normal export of envelope proteins out of the cell cytoplasm. 0
39080 412261 cl00252 NifX_NifB N/A. This family contains several NIF (B, Y and X) proteins which are iron-molybdenum cofactors (FeMo-co) in the dinitrogenase enzyme which catalyzes the reduction of dinitrogen to ammonium. Dinitrogenase is a hetero-tetrameric (alpha(2)beta(2)) enzyme which contains the iron-molybdenum cofactor (FeMo-co) at its active site. 0
39081 412262 cl00253 Dtyr_deacylase N/A. This family comprises several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several d-amino acids can be explained by an in vivo production of d-aminoacyl-tRNA molecules. Escherichia coli and yeast cells express an enzyme, d-Tyr-tRNA(Tyr) deacylase, capable of recycling such d-aminoacyl-tRNA molecules into free tRNA and d-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of d-amino acids increases. Orthologues of the deacylase are found in many cells. 0
39082 412263 cl00254 NOS_oxygenase N/A. Nitric oxide synthase (NOS) eukaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain, which binds to the substrate L-Arg, zinc, and to the cofactors heme and 5,6,7,8-(6R)-tetrahydrobiopterin (BH4). Eukaryotic NOS's also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADH, FAD and FMN. 0
39083 412264 cl00256 CheW_like N/A. CheW proteins are part of the chemotaxis signaling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA binds to CheW, suggesting that these domains can interact with each other. 0
39084 412265 cl00257 HU_IHF DNA sequence specific (IHF) and non-specific (HU) domains. This model describes a set of proteins related to but longer than DNA-binding protein HU. Its distinctive domain architecture compared to HU and related histone-like DNA-binding proteins justifies the designation as superfamily. Members include, so far, one from Bacteroides fragilis, a gut bacterium, and ten from Porphyromonas gingivalis, an oral anaerobe. [DNA metabolism, Chromosome-associated proteins] 0
39085 412266 cl00258 RIBOc N/A. Members of this family are involved in rDNA transcription and rRNA processing. They probably also cleave a stem-loop structure at the 3' end of U2 snRNA to ensure formation of the correct U2 3' end; they are involved in polyadenylation-independent transcription termination. Some members may be mitochondrial ribosomal protein subunit L15, others may be 60S ribosomal protein L3. 0
39086 412267 cl00259 Sm_like Sm and related proteins. This SM domain is found in Ataxin-2. 0
39087 412268 cl00261 PLPDE_III Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. This domain has a TIM barrel fold. 0
39088 412269 cl00262 TroA-like N/A. This family includes bacterial periplasmic binding proteins, several of which are involved in iron transport. 0
39089 412270 cl00263 TFold N/A. The QueF monomer is made up of two ferredoxin-like domains aligned together with their beta-sheets that have additional embellishments. This subunit is composed of a three-stranded beta-sheet and two alpha-helices. QueF reduces a nitrile bond to a primary amine. The two monomer units together create suitable substrate-binding pockets. 0
39090 412271 cl00264 Ferritin_like Ferritin-like superfamily of diiron-containing four-helix-bundle proteins. This domain has a ferritin-like fold. 0
39091 412272 cl00266 HGTP_anticodon N/A. This is an HGTP_anticodon binding domain, found largely on Gcn2 proteins which bind tRNA to down regulate translation in certain stress situations. 0
39092 412273 cl00268 class_II_aaRS-like_core N/A. This is a family of class II aminoacyl-tRNA synthetase-like and ATP phosphoribosyltransferase regulatory subunits. 0
39093 412274 cl00269 cytidine_deaminase-like N/A. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bdellovibrio Bd3614. They are typified by a distinct N-terminal globular domain. The Bdellovibrio version occurs in a predicted operon with a 23S rRNA G2445-modifying methylase suggesting that it might be involved in RNA editing. 0
39094 412275 cl00271 PI3Ka N/A. PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. 0
39095 412276 cl00274 ML N/A. This domain is distantly similar to pfam02221 and conserves its pattern of conserved cysteines. This suggests that this domain may be involved in lipid binding. 0
39096 412277 cl00275 Heme_Cu_Oxidase_I N/A. Cytochrome C oxidase subunit I. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, may play a role in assembly or oxygen delivery to the active site. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme (heme a3) and a copper ion (CuB). It also contains a low-spin heme (heme a), believed to participate in the transfer of electrons to the binuclear center. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are transferred from cytochrome c (the electron donor) to heme a via the CuA binuclear site in subunit II, and directly from heme a to the binuclear center. 0
39097 412278 cl00276 Maf_Ham1 N/A. Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea. 0
39098 412279 cl00278 CCC1_like CCC1-related family of proteins. This family includes the vacuolar Fe2+/Mn2+ uptake transporter, Ccc1 and the vacuolar iron transporter VIT1. 0
39099 412280 cl00279 APP_MetAP N/A. This family contains metallopeptidases. It also contains non-peptidase homologs such as the N terminal domain of Spt16 which is a histone H3-H4 binding module. 0
39100 412281 cl00281 metallo-dependent_hydrolases N/A. These proteins are amidohydrolases that are related to pfam01979. 0
39101 412282 cl00282 cbb3_Oxidase_CcoQ N/A. This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon. 0
39102 412283 cl00283 ADP_ribosyl N/A. Members of this family, which are found in prokaryotic exotoxin A, catalyze the transfer of ADP ribose from nicotinamide adenine dinucleotide (NAD) to elongation factor-2 in eukaryotic cells, with subsequent inhibition of protein synthesis. 0
39103 412284 cl00285 Aconitase Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Family of hypothetical proteins. 0
39104 260328 cl00288 EPT_RTPC-like N/A. EPSP synthase domain. 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EC 2.5.1.19) catalyses the reaction between shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP) to form 5-enolpyruvylshikimate-3-phosphate (EPSP), an intermediate in the shikimate pathway leading to aromatic amino acid biosynthesis. The reaction is phosphoenolpyruvate + 3-phosphoshikimate = phosphate + 5-O-(1-carboxyvinyl)-3-phosphoshikimate. It is found in bacteria and plants but not animals. The enzyme is the target of the widely used herbicide glyphosate, which has been shown to occupy the active site. In bacteria and plants, it is a single domain protein, while in fungi, the domain is found as part of a multidomain protein with functions that are all part of the shikimate pathway. 0
39105 412285 cl00289 FIG N/A. This family represents the N-terminus of this protein family. 0
39106 412286 cl00292 AANH_like N/A. NAD synthase (EC:6.3.5.1) is involved in the de novo synthesis of NAD and is induced by stress factors such as heat shock and glucose limitation. 0
39107 412287 cl00293 B12-binding_like N/A. This domain tends to occur to the N-terminus of the pfam04055 domain in hypothetical bacterial proteins. 0
39108 412288 cl00295 ZZ N/A. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. Four to six cysteine residues in its sequence are responsible for coordinating zinc ions, to reinforce the structure. 0
39109 412289 cl00296 Peptidase_C39_like N/A. BtrH_N is the N-terminus of the acyl carrier protein:aminoglycoside acyltransferase BtrH. Alternatively it can be referred to as butirosin biosynthesis protein H. BtrH transfers the unique (S)-4-amino-2-hydroxybutyrate (AHBA) side chain, which protects the antibiotic butirosin from several common resistance mechanisms. Butirosin, an aminoglycoside antibiotic produced by Bacillus circulans, exhibits improved antibiotic properties over its parent molecule and retains bactericidal activity toward many aminoglycoside-resistant strains. Butirosin is unique in carrying the AHBA side-chain. BtrH transfers the AHBA from the acyl carrier protein BtrI to the parent aminoglycoside ribostamycin as a gamma-glutamylated dipeptide. 0
39110 412290 cl00297 R3H N/A. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA. 0
39111 412291 cl00299 MIT N/A. The MIT domain forms an asymmetric three-helix bundle and binds ESCRT-III (endosomal sorting complexes required for transport) substrates. 0
39112 412292 cl00301 PAZ N/A. This domain is named PAZ after the proteins Piwi, Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerization. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway. 0
39113 412293 cl00303 PNP_UDP_1 Phosphorylase superfamily. This family consists of several purine nucleoside permease from both bacteria and fungi. 0
39114 412294 cl00304 TP_methylase S-AdoMet-dependent tetrapyrrole methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Diphthine synthase. 0
39115 412295 cl00305 Sua5_yciO_yrdC Telomere recombination. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU a Rhizobium nodulation protein involved in the synthesis of nodulation factors has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains, this is the smaller, C-terminal, domain. 0
39116 412296 cl00307 Thiamine_BP Thiamine-binding protein. This protein has been crystallized in both Methanobacterium thermoautotrophicum and yeast, but its function remains unknown. Both crystal structures showed sulfate ions bound at the interface of two dimers to form a tetramer. [Unknown function, General] 0
39117 412297 cl00309 PRTases_typeI Phosphoribosyl transferase (PRT)-type I domain. This PRTase family, and C-terminal TRSP domain, are related to OPRTases, and are predicted to use Orotate as substrate. These genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 0
39118 412298 cl00310 AIRC AIR carboxylase. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurK, an N5-CAIR mutase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 0
39119 412299 cl00311 UbiD 3-octaprenyl-4-hydroxybenzoate carboxy-lyase. Members of this protein family are putative decarboxylases involved in a late stage of the alternative pathway for menaquinone, via futalosine, as in Streptomyces coelicolor and Helicobacter pylori. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 0
39120 412300 cl00312 Ribosomal_S12_like N/A. This protein is known as S12 in bacteria and archaea and S23 in eukaryotes. 0
39121 412301 cl00313 uS7 Ribosomal protein S7. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes. 0
39122 412302 cl00314 Ribosomal_S10 Ribosomal protein S10p/S20e. This model describes the archaeal ribosomal protein uS10 and its equivalents (previously called S20) in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39123 412303 cl00315 RPS2 N/A. This model describes the ribosomal protein of the cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea. TIGR01011 describes the related protein of organelles and bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39124 412304 cl00317 Lumazine_synthase-like lumazine synthase and riboflavin synthase; involved in the riboflavin (vitamin B2) biosynthetic pathway. This family includes the beta chain of 6,7-dimethyl-8-ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases. It has been established that lumazine synthase catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases are active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence. 0
39125 412305 cl00318 YjeF_N YjeF-related protein N-terminus. The protein region corresponding to this model shows no clear homology to any protein of known function. This model is built on yeast protein YNL200C and the N-terminal regions of E. coli yjeF and its orthologs in various species. The C-terminal region of yjeF and its orthologs shows similarity to hydroxyethylthiazole kinase (thiM) and other enzymes involved in thiamine biosynthesis. Yeast YKL151C and B. subtilis yxkO match the yjeF C-terminal domain but lack this region. [Unknown function, General] 0
39126 412306 cl00319 Gn_AT_II N/A. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes such as asparagine synthetase and glutamine-fructose-6-phosphate transaminase. 0
39127 412307 cl00320 tRNA_bindingDomain N/A. This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA to form a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation. 0
39128 412308 cl00322 Ribosomal_L1 N/A. This family includes prokaryotic L1 and eukaryotic L10. 0
39129 412309 cl00323 Chorismate_synthase Chorismate synthase, the enzyme catalyzing the final step of the shikimate pathway. Homotetramer (noted in E.coli) suggests reason for good conservation. [Amino acid biosynthesis, Aromatic amino acid family] 0
39130 351027 cl00324 RplC Ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. This model describes exclusively the archaeal class of ribosomal protein L3. A separate model (TIGR03625) describes the bacterial/organelle form, and both belong to pfam00297. Eukaryotic proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39131 412310 cl00325 Ribosomal_L4 Ribosomal protein L4/L1 family. Members of this protein family are ribosomal protein L4. This model recognizes bacterial and most organellar forms, but excludes homologs from the eukaryotic cytoplasm and from archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39132 412311 cl00326 Ribosomal_L23 Ribosomal protein L23. This model describes the archaeal ribosomal protein L23P and rigorously excludes the bacterial counterpart L23. In order to capture every known instance of archaeal L23P, the trusted cutoff is set lower than a few of the highest scoring eukaryotic cytosolic ribosomal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39133 412312 cl00327 Ribosomal_L22 N/A. This family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes. 0
39134 412313 cl00328 Ribosomal_L14 Ribosomal protein L14p/L23e. Part of the 50S ribosomal subunit. Forms a cluster with proteins L3 and L24e, part of which may contact the 16S rRNA in 2 intersubunit bridges. 0
39135 412314 cl00330 Ribosomal_S8 Ribosomal protein S8. 30S ribosomal protein S8; Validated 0
39136 412315 cl00331 Ribosomal_S13 Ribosomal protein S13/S18. This model describes bacterial ribosomal protein S13, to the exclusion of the homologous archaeal S13P and eukaryotic ribosomal protein S18. This model identifies some (but not all) instances of chloroplast and mitochondrial S13, which is of bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39137 320911 cl00332 Ribosomal_S11 Ribosomal protein S11. This model describes the bacterial 30S ribosomal protein S11. Cutoffs are set such that the model excludes archaeal and eukaryotic ribosomal proteins, but many chloroplast and mitochondrial equivalents of S11 are detected. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39138 412316 cl00333 Ribosomal_L13 N/A. 60S ribosomal protein L13a; Provisional 0
39139 412317 cl00334 Ribosomal_S9 Ribosomal protein S9/S16. ribosomal protein S9 0
39140 351036 cl00335 NDPk N/A. Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species. It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5). 0
39141 412318 cl00336 DHBP_synthase 3,4-dihydroxy-2-butanone 4-phosphate synthase. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD] 0
39142 412319 cl00337 PT_UbiA UbiA family of prenyltransferases (PTases). A fairly deep split separates this polyprenyltransferase subfamily from the set of mitochondrial and proteobacterial 4-hydroxybenzoate polyprenyltransferases, described in TIGR01474. Protoheme IX farnesyltransferase (heme O synthase) (TIGR01473) is more distantly related. Because no species appears to have both this protein and a member of TIGR01474, it is likely that this model represents 4-hydroxybenzoate polyprenyltransferase, a critical enzyme of ubiquinone biosynthesis, in the Archaea, Gram-positive bacteria, Aquifex aeolicus, the Chlamydias, etc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone] 0
39143 412320 cl00338 ALAD_PBGS N/A. Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. The eukaryotic PBGSs represented by this model, which contain a cysteine-rich zinc binding motif (DXCXCX(Y/F)X3G(H/Q)CG), require zinc for their activity, they do not contain an additional allosteric metal binding site and do not bind magnesium. 0
39144 412321 cl00339 SugarP_isomerase N/A. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA. 0
39145 412322 cl00340 ILVD_EDD Dehydratase family. This protein, dihydroxy-acid dehydratase, catalyzes the fourth step in valine and isoleucine biosynthesis. It contains a catalytically essential [4Fe-4S] cluster. This model generates scores of up to 150 bits vs. 6-phosphogluconate dehydratase, a homologous enzyme. [Amino acid biosynthesis, Pyruvate family] 0
39146 412323 cl00341 IGPD Imidazoleglycerol-phosphate dehydratase. imidazoleglycerol-phosphate dehydratase; Provisional 0
39147 294246 cl00342 Trp-synth-beta_II N/A. Members of this family include SbnA, a protein of the staphyloferrin B biosynthesis operon of Staphylococcus aureus. SbnA and SbnB together appear to synthesize 2,3-diaminopropionate, a precursor of certain siderophores and other secondary metabolites. SbnA is a pyridoxal phosphate-dependent enzyme. [Cellular processes, Biosynthesis of natural products] 0
39148 412324 cl00344 PRA-CH Phosphoribosyl-AMP cyclohydrolase. phosphoribosyl-AMP cyclohydrolase; Reviewed 0
39149 351044 cl00348 GCD2 Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. This model, eIF-2B_rel, describes half of a superfamily, where the other half consists of eukaryotic translation initiation factor 2B (eIF-2B) subunits alpha, beta, and delta. It is unclear whether the eIF-2B_rel set is monophyletic, or whether they are all more closely related to each other than to any eIF-2B subunit because the eIF-2B clade is highly derived. Members of this branch of the family are all uncharacterized with respect to function and are found in the Archaea, Bacteria, and Eukarya, although a number are described as putative translation initiation factor components. Proteins found by eIF-2B_rel include at least three clades, including a set of uncharacterized eukaryotic proteins, a set found in some but not all Archaea, and a set universal so far among the Archaea and closely related to several uncharacterized bacterial proteins. [Unknown function, General] 0
39150 412325 cl00349 S15_NS1_EPRS_RNA-bind N/A. 40S ribosomal protein S15; Provisional 0
39151 412326 cl00350 Ribosomal_S19 Ribosomal protein S19. This model represents eukaryotic ribosomal protein uS19 (previously S15) and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39152 412327 cl00351 Ribosomal_S17 Ribosomal protein S17. This model describes the bacterial ribosomal small subunit protein S17, while excluding cytosolic eukaryotic homologs and archaeal homologs. The model finds many, but not all, chloroplast and mitochondrial counterparts to bacterial S17. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39153 412328 cl00352 PTH N/A. Chloroplast RNA splicing 2 (CRS2) is a nuclear-encoded protein required for the splicing of group II introns in the chloroplast. CRS2 forms stable complexes with two CRS2-associated factors, CAF1 and CAF2, which are required for the splicing of distinct subsets of CRS2-dependent introns. CRS2 is closely related to bacterial peptidyl-tRNA hydrolases (PTH). 0
39154 412329 cl00353 Ribosomal_L16_L10e N/A. This model describes bacterial and organellar ribosomal protein L16. The homologous protein of the eukaryotic cytosol is designated L10. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39155 412330 cl00354 KOW KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese). Ribosomal_L26 is a family of the 50S and the 60S ribosomal proteins from eukaryotes - L26 - and archaea - L25. 0
39156 412331 cl00355 Ribosomal_S14 Ribosomal protein S14p/S29e. 30S ribosomal protein S14P; Reviewed 0
39157 412332 cl00356 Ribosomal_L17 Ribosomal protein L17. Eubacterial and mitochondrial. The mitochondrial form, from yeast, contains an additional 110 amino acids C-terminal to the region found by this model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39158 412333 cl00359 Ribosomal_L27 Ribosomal L27 protein. Eubacterial, chloroplast, and mitochondrial. Mitochondrial members have an additional C-terminal domain. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39159 412334 cl00360 5-FTHF_cyc-lig 5-formyltetrahydrofolate cyclo-ligase family. This enzyme, 5,10-methenyltetrahydrofolate synthetase, is also called 5-formyltetrahydrofolate cycloligase. Function of bacterial proteins in this family was inferred originally from the known activity of eukaryotic homologs. Recently, activity was shown explicitly for the member from Mycoplasma pneumoniae. Members of this family from alpha- and gamma-proteobacteria, designated ygfA, are often found in an operon with 6S structural RNA, and show a similar pattern of high expression during stationary phase. The function may be to deplete folate to slow 1-carbon biosynthetic metabolism. [Central intermediary metabolism, One-carbon metabolism] 0
39160 412335 cl00361 Transcrip_reg Transcriptional regulator. This model describes a minimally characterized protein family, restricted to bacteria excepting for some eukaryotic sequences that have possible transit peptides. YebC from E. coli is crystallized, and PA0964 from Pseudomonas aeruginosa has been shown to be a sequence-specific DNA-binding regulatory protein. In silico analysis suggests a role in Holliday junction resolution. [Regulatory functions, DNA interactions] 0
39161 412336 cl00365 F1-ATPase_gamma mitochondrial ATP synthase gamma subunit. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 gamma subunit of this apparent second ATP synthase. 0
39162 412337 cl00366 PMSR Peptide methionine sulfoxide reductase. methionine sulfoxide reductase A; Provisional 0
39163 412338 cl00367 Ribosomal_L28 Ribosomal L28 family. This model describes bacterial and chloroplast forms of the 50S ribosomal protein L28, a polypeptide about 60 amino acids in length. Mitochondrial homologs differ substantially in architecture (e.g. SP|P36525 from Saccharomyces cerevisiae, which is 258 amino acids long) and are not included. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39164 412339 cl00368 Ribosomal_S16 Ribosomal protein S16. This model describes ribosomal S16 of bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39165 412340 cl00370 Ribosomal_L34 Ribosomal protein L34. 50S ribosomal protein L34; Reviewed 0
39166 412341 cl00373 Ribosomal_S18 Ribosomal protein S18. This ribosomal small subunit protein is found in all eubacteria so far, as well as in chloroplasts. YER050C from Saccharomyces cerevisiae and a related protein from Caenorhabditis elegans appear to be homologous and may represent mitochondrial forms. The trusted cutoff is set high enough that these two candidate S18 proteins are not categorized automatically. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39167 412342 cl00376 Ribosomal_L10_P0 N/A. 60S acidic ribosomal protein P0; Provisional 0
39168 412343 cl00377 Ribosomal_L31 Ribosomal protein L31. This family consists exclusively of bacterial (and organellar) 50S ribosomal protein L31. In some species, such as Bacillus subtilis, this protein exists in two forms (RpmE and YtiA), one of which (RpmE) contains a pair of motifs, CXC and CXXC, for binding zinc. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39169 412344 cl00379 Ribosomal_L18_L5e N/A. This family includes the large subunit ribosomal proteins from bacteria, archaea, the mitochondria and the chloroplast. It does not include the 60S L18 or L5 proteins from Metazoa. 0
39170 412345 cl00380 Ribosomal_L36 Ribosomal protein L36. 50S ribosomal protein L36; Validated 0
39171 412346 cl00381 PNPOx/FlaRed_like Pyridoxine 5'-phosphate (PNP) oxidase-like and flavin reductase-like proteins. Pyridoxamine 5'-phosphate oxidase is a FMN flavoprotein that catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP). This entry contains several pyridoxamine 5'-phosphate oxidases, and related proteins. 0
39172 412347 cl00382 Ribosomal_L21p Ribosomal prokaryotic L21 protein. 50S ribosomal protein L21; Validated 0
39173 412348 cl00383 Ribosomal_L33 Ribosomal protein L33. This model describes bacterial ribosomal protein L33 and its chloroplast and mitochondrial equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39174 412349 cl00384 Ribosomal_S20p Ribosomal protein S20. ribosomal protein S20 0
39175 412350 cl00386 BolA BolA-like protein. transcriptional regulator BolA; Provisional 0
39176 412351 cl00388 Thioredoxin_like Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. 0
39177 412352 cl00389 SIS N/A. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. 0
39178 412353 cl00391 beta_CA N/A. This family includes carbonic anhydrases as well as a family of non-functional homologs related to YbcF. 0
39179 412354 cl00392 Ribosomal_L35p Ribosomal protein L35. This ribosomal protein is found in bacteria and organelles only. It is not closely related to any eukaryotic or archaeal ribosomal protein. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
39180 412355 cl00393 Ribosomal_L20 Ribosomal protein L20. ribosomal protein L20 0
39181 412356 cl00394 HupF_HypC HupF/HypC family. This protein is suggested to act as a chaperone for a hydrogenase large subunit, holding the precursor form before metallocenter nickel incorporation. [SS 12/31/03] A more recently proposed additional function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. Added metallochaperone and protein mod GO terms. [Protein fate, Protein folding and stabilization, Protein fate, Protein modification and repair] 0
39182 412357 cl00395 FMT_core Formyltransferase, catalytic core domain. This family includes the following members. Glycinamide ribonucleotide transformylase catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyltetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function. 0
39183 412358 cl00399 MoaE N/A. This family contains the MoaE protein that is involved in biosynthesis of molybdopterin. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins. 0
39184 412359 cl00400 Fe-S_biosyn Iron-sulphur cluster biosynthesis. Proteins in this subfamily appear to be associated with the process of FeS-cluster assembly. The HesB proteins are associated with the nif gene cluster and the Rhizobium gene IscN has been shown to be required for nitrogen fixation. Nitrogenase includes multiple FeS clusters and many genes for their assembly. The E. coli SufA protein is associated with SufS, a NifS homolog and SufD which are involved in the FeS cluster assembly of the FhnF protein. The Azotobacter protein IscA (homologs of which are also found in E. coli) is associated with IscS, another NifS homolog and IscU, a nifU homolog as well as other factors consistent with a role in FeS cluster chemistry. A homolog from Geobacter contains a selenocysteine in place of an otherwise invariant cysteine, further suggesting a role in redox chemistry. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
39185 412360 cl00402 UPF0054 Uncharacterized protein family UPF0054. This metalloprotein family is represented by a single member sequence only in nearly every bacterium. Crystallography demonstrated metal-binding activity, possibly to nickel. It is predicted to be a metallohydrolase, and more recently it was shown that mutants have a ribosomal RNA processing defect. [Protein synthesis, Other] 0
39186 412361 cl00406 Ribosomal_L19 Ribosomal protein L19. 50S ribosomal protein L19; Provisional 0
39187 412362 cl00407 tRNA_m1G_MT tRNA (Guanine-1)-methyltransferase. tRNA (guanine-N(1)-)-methyltransferase; Reviewed 0
39188 412363 cl00410 G3P_acyltransf Glycerol-3-phosphate acyltransferase. This model represents the full length of acylphosphate:glycerol 3-phosphate acyltransferase, an integral membrane protein about 200 amino acids in length, called PlsY in Streptococcus pneumoniae, YneS in Bacillus subtilis, and YgiH in E. coli. It is found in a single copy in a large number of bacteria, including the Mycoplasmas but not Mycobacteria or spirochetes, for example. Its partner is PlsX (see TIGR00182), and the pair can replace PlsB for synthesizing 1-acylglycerol-3-phosphate. [Fatty acid and phospholipid metabolism, Biosynthesis] 0
39189 412364 cl00412 P-II Nitrogen regulatory protein P-II. This family of proteins with unknown function appears to be restricted to Proteobacteria. 0
39190 412365 cl00413 ATP-synt_A ATP synthase A chain. Bacterial forms should be designated ATP synthase, F0 subunit A; eukaryotic (chloroplast and mitochondrial) forms should be designated ATP synthase, F0 subunit 6. The F1/F0 ATP synthase is a multisubunit, membrane associated enzyme found in bacteria, mitochondria and chloroplasts. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. Individual subunits in each of these clusters are named differently in prokaryotes and in organelles e.g., mitochondria and chloroplast. The bacterial equivalent of subunit 6 is named subunit 'A'. It has been shown that protons are conducted through this subunit. Typically, deprotonation and reprotonation of the acidic amino acid side-chains are implicated in the process. [Energy metabolism, ATP-proton motive force interconversion] 0
39191 412366 cl00414 bS6 Bacterial ribosomal protein S6. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18. 0
39192 412367 cl00415 CobS Cobalamin-5-phosphate synthase. cobalamin synthase; Reviewed 0
39193 412368 cl00416 CS_ACL-C_CCL N/A. This is the long, C-terminal part of the enzyme. 0
39194 412369 cl00420 NadA Quinolinate synthetase A protein. This protein, termed NadA, plays a role in the synthesis of pyridine, a precursor to NAD. The quinolinate synthetase complex consists of A protein (this protein) and B protein. B protein converts L-aspartate to iminoaspartate, an unstable reaction product which in the absence of A protein is spontaneously hydrolyzed to form oxaloacetate. The A protein, NadA, converts iminoaspartate to quinolinate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 0
39195 412370 cl00424 UPF0014 Uncharacterized protein family (UPF0014). [Hypothetical proteins, Conserved] 0
39196 412371 cl00425 CofD_YvcK Family of CofD-like proteins and proteins related to YvcK. Members of this family are distantly related to CofD, the enzyme LPPG:FO 2-phospho-L-lactate transferase, involved in coenzyme F420 biosynthesis. This family appears to belong to a biosynthesis cassette of unknown function. 0
39197 412372 cl00426 YbjQ_1 Putative heavy-metal-binding. hypothetical protein; Provisional 0
39198 412373 cl00427 TM_PBP2 N/A. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices. 0
39199 412374 cl00429 SNARE_assoc SNARE associated Golgi protein. This is a family of SNARE associated Golgi proteins. The yeast member of this family localizes with the t-SNARE Tlg2. 0
39200 412375 cl00431 Pmp3 Proteolipid membrane potential modulator. Pmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants. 0
39201 412376 cl00436 SirA_YedF_YeeD N/A. Members of this family of hypothetical bacterial proteins have no known function. 0
39202 412377 cl00437 Zip ZIP Zinc transporter. The Zinc (Zn2+)-Iron (Fe2+) Permease (ZIP) Family (TC 2.A.5). Members of the ZIP family consist of proteins with eight putative transmembrane spanners. They are derived from animals, plants and yeast. They comprise a diverse family, with several paralogues in any one organism (e.g., at least five in Caenorhabditis elegans, at least five in Arabidopsis thaliana and two in Saccharomyces cerevisiae). The two S. cerevisiae proteins, Zrt1 and Zrt2, both probably transport Zn2+ with high specificity, but Zrt1 transports Zn2+ with ten-fold higher affinity than Zrt2. Some members of the ZIP family have been shown to transport Zn2+ while others transport Fe2+, and at least one transports a range of metal ions. The energy source for transport has not been characterized, but these systems probably function as secondary carriers. [Transport and binding proteins, Cations and iron carrying compounds] 0
39203 412378 cl00438 FMN_red NADPH-dependent FMN reductase. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN. 0
39204 412379 cl00439 UPF0047 Uncharacterized protein family UPF0047. Members of this protein family have been studied extensively by crystallography. Members from several different species have been shown to have sufficient thiamin phosphate synthase activity (EC 2.5.1.3) to complement thiE mutants. However, it is presumed that this is a secondary activity, and the primary function of this enzyme remains unknown. [Unknown function, Enzymes of unknown specificity] 0
39205 412380 cl00445 Iso_dh Isocitrate/isopropylmalate dehydrogenase. Tartrate dehydrogenase catalyzes the oxidation of both meso- and (+)-tartrate as well as a D-malate. These enzymes are closely related to the 3-isopropylmalate and isohomocitrate dehydrogenases found in TIGR00169 and TIGR02088, respectively. [Energy metabolism, Other] 0
39206 412381 cl00447 Nudix_Hydrolase N/A. This domain family consists of uncharacterized proteins around 175 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown. This family is related to the NUDIX hydrolases. 0
39207 412382 cl00448 SurE Survival protein SurE. This protein family originally was named SurE because of its role in stationary phase survival in Escherichia coli. In E. coli, surE is next to pcm, an L-isoaspartyl protein repair methyltransferase that is also required for stationary phase survival. Recent work shows that viewing SurE as an acid phosphatase (3.1.3.2) is not accurate. Rather, SurE in E. coli, Thermotoga maritima, and Pyrobaculum aerophilum acts strictly on nucleoside 5'- and 3'-monophosphates. E. coli SurE is Recommended cutoffs are 15 for homology, 40 for probable orthology, and 200 for orthology with full-length homology. [Cellular processes, Adaptations to atypical conditions] 0
39208 412383 cl00451 MoCF_BD N/A. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation. 0
39209 412384 cl00452 AAK N/A. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2. 0
39210 412385 cl00453 CDP-OH_P_transf CDP-alcohol phosphatidyltransferase. Alternate names: phosphatidylglycerophosphate synthase; glycerophosphate phosphatidyltransferase; PGP synthase. A number of related enzymes are quite similar in both sequence and catalytic activity, including Saccharomyces cerevisiae YDL142c, now known to be a cardiolipin synthase. There may be problems with incorrect transitive annotation of near homologs as authentic CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis] 0
39211 412386 cl00454 TM_PBP1_branched-chain-AA_like N/A. This is a large family mainly comprising high-affinity branched-chain amino acid transporter proteins such as E. coli LivH and LivM, both of which form the LIV-I transport system. Also found within this family are proteins from the galactose transport system permease and a ribose transport system. 0
39212 382020 cl00456 SLC5-6-like_sbd Solute carrier families 5 and 6-like; solute binding domain. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases. 0
39213 412387 cl00457 Ribonuclease_P Ribonuclease P. ribonuclease P; Reviewed 0
39214 412388 cl00458 Peptidase_A8 Signal peptidase (SPase) II. Alternate name: lipoprotein signal peptidase [Protein fate, Protein and peptide secretion and trafficking] 0
39215 412389 cl00459 MIT_CorA-like metal ion transporter CorA-like divalent cation transporter superfamily. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell. 0
39216 412390 cl00460 CMD Carboxymuconolactone decarboxylase family. PA26 is a p53-inducible protein. Its function is unknown. It has similarity to pfam04636 in its N-terminus. 0
39217 412391 cl00463 CbiQ Cobalt transport protein. This model represents the CbiQ component of the cobalt-specific ECF-type. CbiQ is now recognized as the T component of energy-coupling factor (ECF)-type transporters. The S component confers specificity (CbiM-N for cobalt systems), while CbiO is the ABC-family ATPase. In general, proteins found by this model reside next to the other putative subunits of the complex, identified as CbiN, CbiO, or CbiM. Note that the designation of cobalt transporter has been spread excessively among ECF system transporters with many other specificities. [Transport and binding proteins, Cations and iron carrying compounds] 0
39218 412392 cl00464 URO-D_CIMS_like N/A. The N-terminal domain and C-terminal domains of cobalamin-independent synthases together define a catalytic cleft in the enzyme. The N-terminal domain is thought to bind the substrate, in particular, the negatively charged polyglutamate chain. The N-terminal domain is also thought to stabilize a loop from the C-terminal domain. 0
39219 294317 cl00465 AI-2E_transport AI-2E family transporter. Three lines of evidence show this protein to be involved in sporulation. First, it is under control of a sporulation-specific sigma factor, sigma-E. Second, mutation leads to a sporulation defect. Third, it is found in exactly those genomes whose bacteria are capable of sporulation, except for being absent in Clostridium acetobutylicum ATCC824. This protein has extensive hydrophobic regions and is likely an integral membrane protein. [Cellular processes, Sporulation and germination] 0
39220 412393 cl00466 ATP-synt_C ATP synthase subunit C. F0F1 ATP synthase subunit C; Provisional 0
39221 412394 cl00467 Ntn_hydrolase N/A. This family includes several hydrolases which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides. These include choloylglycine hydrolase (conjugated bile acid hydrolase, CBAH) EC:3.5.1.24, penicillin acylase EC:3.5.1.11 and acid ceramidase EC:3.5.1.23. This domain forms the alpha-subunit for members from vertebrate species, see family NAAA-beta, pfam15508. 0
39222 412395 cl00469 NADHdh NADH dehydrogenase. NADH dehydrogenase subunit 1; Provisional 0
39223 412396 cl00470 AKR_SF Aldo-keto reductase (AKR) superfamily. This family includes a number of K+ ion channel beta chain regulatory domains - these are reported to have oxidoreductase activity. 0
39224 412397 cl00473 BI-1-like BAX inhibitor (BI)-1/YccA-like protein family. The Bax-inhibitor-1 region of the receptor molecules is conserved from bacteria to humans. 0
39225 412398 cl00474 PAP2_like N/A. This family is closely related to the C-terminal region of PAP2. 0
39226 320993 cl00475 FTR1 Iron permease FTR1 family. A characterized member from yeast acts as oxidase-coupled high affinity iron transporter. Note that the apparent member from E. coli K12-MG1655 has a frameshift by homology with member sequences from other species. [Unknown function, General] 0
39227 412399 cl00477 H2MP N/A. The family consists of hydrogenase maturation proteases. In E. coli HypI the hydrogenase maturation protease is involved in processing of HypE the large subunit of hydrogenases 3, by cleavage of its C-terminal. 0
39228 412400 cl00478 LGT Prolipoprotein diacylglyceryl transferase. The conversion of lipoprotein precursors into lipoproteins consists of three steps. First, the enzyme described by this model transfers a diacylglyceryl moiety from phosphatidylglycerol to the side chain of a Cys that will become the new N-terminus. Second, the signal peptide is removed by signal peptidase II. Finally, the free amino group of the new N-terminal Cys is acylated by apolipoprotein N-acyltransferase. [Protein fate, Protein modification and repair] 0
39229 412401 cl00480 RraA-like Aldolase/RraA. hypothetical protein; Validated 0
39230 412402 cl00481 SecE SecE/Sec61-gamma subunits of protein translocation complex. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC. [Protein fate, Protein and peptide secretion and trafficking] 0
39231 412403 cl00482 SmpB Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosome from damaged messenger RNAs. This model describes the SsrA-binding protein, also called tmRNA binding protein, small protein B, and SmpB. The small, stable RNA SsrA (also called tmRNA or 10Sa RNA) recognizes stalled ribosomes such as occur during translation from message that lacks a stop codon. It becomes charged with Ala like a tRNA, then acts as mRNA to resume translation started with the defective mRNA. The short C-terminal peptide tag added by the SsrA system marks the abortively translated protein for degradation. SmpB binds SsrA after its aminoacylation but before the coupling of the Ala to the nascent polypeptide chain and is an essential part of the SsrA peptide tagging system. SmpB has been associated with the survival of bacterial pathogens in conditions of stress. It is universal in the first 100 sequenced bacterial genomes. [Protein synthesis, Other] 0
39232 412404 cl00483 UDG-like uracil-DNA glycosylases (UDG) and related enzymes. This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Listeria species. The function of this family is unknown. 0
39233 412405 cl00485 LacAB_rpiB Ribose/Galactose Isomerase. This family is a member of the RpiB/LacA/LacB subfamily (TIGR00689) but lies outside the RpiB equivalog (TIGR01120) which is also a member of that subfamily. Ribose 5-phosphate isomerase is an essential enzyme of the pentose phosphate pathway; a pathway that appears to be present in the actinobacteria. The only candidates for ribose 5-phosphate isomerase in the Actinobacteria are members of this family. 0
39234 412406 cl00489 60KD_IMP 60Kd inner membrane protein. This model describes full-length from some species, and the C-terminal region only from other species, of the YidC/Oxa1 family of proteins. This domain appears to be universal among bacteria (although absent from Archaea). The well-characterized YidC protein from Escherichia coli and its close homologs contain a large N-terminal periplasmic domain in addition to the region modeled here. [Protein fate, Protein and peptide secretion and trafficking] 0
39235 412407 cl00490 EEP Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily. This domain represents the endonuclease region of retrotransposons from a range of bacteria, archaea and eukaryotes. These are enzymes largely from class EC:2.7.7.49. 0
39236 412408 cl00492 Oxidored_q2 NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. [Transport and binding proteins, Cations and iron carrying compounds] 0
39237 412409 cl00493 trimeric_dUTPase Trimeric dUTP diphosphatases. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate. 0
39238 412410 cl00494 YbaB_DNA_bd YbaB/EbfC DNA-binding family. The function of this protein is unknown, but it has been expressed and crystallized. Its gene nearly always occurs next to recR and/or dnaX. It is restricted to Bacteria and the plant Arabidopsis. The plant form contains an additional N-terminal region that may serve as a transit peptide and shows a close relationship to the cyanobacterial member, suggesting that it is a chloroplast protein. Members of this family are found in a single copy per bacterial genome, but are broadly distributed. A member is present even in the minimal gene complement of Mycoplasma genitalium. [Unknown function, General] 0
39239 412411 cl00495 Glu-tRNAGln Glu-tRNAGln amidotransferase C subunit. This model represents a small family related to GatC, the third subunit of an enzyme for completing the charging of tRNA(Gln) by amidating the Glu-tRNA(Gln). The few known archaea that contain a member of this family appear to produce Asn-tRNA(Asn) by an analogous amidotransferase reaction. This protein is proposed to substitute for GatC in the charging of both tRNAs. 0
39240 321006 cl00497 CxxCxxCC Putative zinc- or iron-chelating domain. This family of proteins contains 8 conserved cysteines. It has in the past been annotated as being one of the complex of proteins of the flagellar Fli complex. However this was due to a mis-annotation of the original Salmonella LT2 Genbank entry of 'fliB'. With all its conserved cysteines it is possibly a domain that chelates iron or zinc ions. 0
39241 412412 cl00500 ACPS 4'-phosphopantetheinyl transferase superfamily. This model describes a domain active in transferring the phosphopantetheine prosthetic group to its attachment site on enzymes and carrier proteins. Many members of this family are small proteins that act on the acyl carrier protein involved in fatty acid biosynthesis. Some members are domains of larger proteins involved in specialized pathways for the synthesis of unusual molecules including polyketides, atypical fatty acids, and antibiotics. [Protein fate, Protein modification and repair] 0
39242 321008 cl00504 Cytochrom_C_asm Cytochrome C assembly protein. Members of this protein family represent one of two essential proteins of system II for c-type cytochrome biogenesis. Additional proteins tend to be part of the system but can be replaced by chemical reductants such as dithiothreitol. This protein is designated CcsB in Bordetella pertussis and some other bacteria, resC in Bacillus (where there is additional N-terminal sequence), and CcsA in chloroplast. We use the CcsB designation here. Member sequences show regions of strong sequence conservation and variable-length, poorly conserved regions in between; sparsely filled columns were removed from the seed alignment prior to model construction. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair] 0
39243 412413 cl00505 DHQase_II N/A. 3-dehydroquinate dehydratase; Reviewed 0
39244 412414 cl00506 Haemolytic Haemolytic domain. This model describes a family, YidD, of small, non-essential proteins now suggested to improve YidC-dependent inner membrane protein insertion. A related protein is found in the temperate phage HP1 of Haemophilus influenzae. Annotation of some members of this family as hemolysins appears to represent propagation from an unpublished GenBank submission, L36462, attributed to Aeromonas hydrophila but a close match to E. coli. [Hypothetical proteins, Conserved] 0
39245 412415 cl00508 YGGT YGGT family. This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown. 0
39246 412416 cl00509 hot_dog N/A. This is the dehydratase domain of polyketide synthases. Structural analysis shows these DH domains are double hotdogs in which the active site contains a histidine from the N-terminal hotdog and an aspartate from the C-terminal hotdog. Studies have uncovered that a substrate tunnel formed between the DH domains may be essential for loading substrates and unloading products. 0
39247 412417 cl00510 MlaE Permease MlaE. This model describes a subfamily of ABC transporter permease subunits. One member of this family has been associated with the toluene tolerance phenotype of Pseudomonas putida, another with L-glutamate transport, another with maintenance of lipid asymmetry. Many bacterial species have one or two members. The Mycobacteria have large paralogous families included in the DUF140 family but excluded from this subfamily based on extreme divergence at the amino end and on phylogenetic and UPGMA trees on the more conserved regions. [Hypothetical proteins, Conserved] 0
39248 294347 cl00511 FTSW_RODA_SPOVE Cell cycle protein. This family consists of FtsW, an integral membrane protein with ten transmembrane segments. In general, it is one of two paralogs involved in peptidoglycan biosynthesis, the other being RodA, and is essential for cell division. All members of the seed alignment for this model are encoded in operons for the biosynthesis of UDP-N-acetylmuramoyl-pentapeptide, a precursor of murein (peptidoglycan). The FtsW designation is not used in endospore-forming bacteria (e.g. Bacillus subtilis), where the member of this family is designated SpoVE and three or more RodA/FtsW/SpoVE family paralogs are present. SpoVE acts in spore cortex formation and is dispensable for growth. Biological roles for FtsW in cell division include recruitment of penicillin-binding protein 3 to the division site. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division] 0
39249 412418 cl00512 LpxC UDP-3-O-acyl N-acetylglycosamine deacetylase. UDP-3-O-(R-3-hydroxymyristoyl)-GlcNAc deacetylase from E. coli, LpxC, was previously designated EnvA. This enzyme is involved in lipid-A precursor biosynthesis. It is essential for cell viability. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
39250 412419 cl00514 Nitro_FMN_reductase nitroreductase family protein. The nitroreductase family comprises a group of FMN- or FAD-dependent and NAD(P)H-dependent enzymes able to metabolize nitrosubstituted compounds. 0
39251 412420 cl00518 Asp_Glu_race Asp/Glu/Hydantoin racemase. This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown. 0
39252 412421 cl00519 RsfS Ribosomal silencing factor during starvation. This model describes a widely distributed family of bacterial proteins related to iojap from plants. It includes RsfS(YbeB) from E. coli. The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. The conserved function of this protein is to silence ribosomes by binding the ribosomal large subunit and impairing joining with the small subunit in response to nutrient stress. Note that RsfS (starvation) is an author-endorsed change from the published symbol RsfA, which conflicted with previously published gene symbols. [Protein synthesis, Translation factors] 0
39253 412422 cl00521 TatC Sec-independent protein translocase protein (TatC). This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR01912) represents the archaeal clade of this family. TatC is often found in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409). [Protein fate, Protein and peptide secretion and trafficking] 0
39254 412423 cl00522 GTP_cyclohydro2 N/A. GTP cyclohydrolase II catalyzes the first committed step in the biosynthesis of riboflavin. 0
39255 412424 cl00523 Queuosine_synth Queuosine biosynthesis protein. This model describes the enzyme for S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA). QueA synthesizes Queuosine which is usually in the first position of the anticodon of tRNAs specific for asparagine, aspartate, histidine, and tyrosine. [Protein synthesis, tRNA and rRNA base modification] 0
39256 412425 cl00526 DAGK_IM_like Integral membrane diacylglycerol kinase and similar enzymes. This bacterial family of homo-trimeric integral membrane enzyme domains catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. They sit N-terminally to phosphatase domains that are members of the type 2 phosphatidic acid phosphatase superfamily, and the function of members of this domain architecture was determined to be undecaprenyl pyrophosphate phosphatases. The bi-functional enzymes might generate undecaprenyl phosphate via two mechanisms - the phosphorylation of undecaprenol or the cleavage of the terminal phosphate group of undecaprenyl pyrophosphate. 0
39257 412426 cl00528 IscU_like Iron-sulfur cluster scaffold-like proteins. This domain is found in NifU in combination with pfam01106. This domain is found in isolation in several bacterial species. The nif genes are responsible for nitrogen fixation. However this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. These proteins appear to be scaffold proteins for iron-sulfur clusters. 0
39258 412427 cl00529 Ribosomal_S21 Ribosomal protein S21. 30S ribosomal protein S21; Reviewed 0
39259 412428 cl00530 UreD UreD urease accessory protein. UreD is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex and is required for urease nickel metallocenter assembly. See also UreF pfam01730, UreG pfam01495. 0
39260 412429 cl00532 Urease_gamma N/A. Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia. 0
39261 412430 cl00533 Urease_beta N/A. This subunit is known as alpha in Heliobacter. 0
39262 412431 cl00535 Oxidored_q4 NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. NADH dehydrogenase subunit A; Validated 0
39263 412432 cl00537 ExbD Biopolymer transport protein ExbD/TolR. The model describes the inner membrane protein TolR, part of the TolR/TolQ complex that transduces energy from the proton-motive force, through TolA, to an outer membrane complex made up of TolB and Pal (peptidoglycan-associated lipoprotein). The complex is required to maintain outer membrane integrity, and defects may cause a defect in the import of some organic compounds in addition to the resulting morphologic. While several gene pairs homologous to tolR and tolQ may be found in a single genome, the scope of this model is set to favor finding only bona fide TolR, supported by operon structure as well as by score. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 0
39264 412433 cl00538 MinE Septum formation topological specificity factor MinE. cell division topological specificity factor MinE; Provisional 0
39265 412434 cl00540 Asp_decarbox Aspartate alpha-decarboxylase or L-aspartate 1-decarboxylase, a pyruvoyl group-dependent decarboxylase in beta-alanine production. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalyzed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesized initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase. 0
39266 412435 cl00541 PNPsynthase N/A. Members of this family belong to the PdxJ family that catalyzes the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate. 0
39267 412436 cl00542 RBFA Ribosome-binding factor A. ribosome-binding factor A; Provisional 0
39268 412437 cl00546 POR Pyruvate ferredoxin/flavodoxin oxidoreductase. This model represents the beta subunit of indolepyruvate ferredoxin oxidoreductase, an alpha(2)/beta(2) tetramer, as found in Pyrococcus furiosus and Methanobacterium thermoautotrophicum. Cofactors for the tetramer include TPP, 4Fe4S, and 3Fe-4S. It shows considerable sequence similarity to subunits of several other ketoacid oxidoreductases. 0
39269 294369 cl00547 Branch_AA_trans Branched-chain amino acid transport protein. The Branched Chain Amino Acid:Cation Symporter (LIVCS) Family (TC 2.A.26) Characterized members of this family transport all three of the branched chain aliphatic amino acids (leucine (L), isoleucine (I) and valine (V)). They function by a Na+ or H+ symport mechanism and display 12 putative transmembrane helical spanners. [Transport and binding proteins, Amino acids, peptides and amines] 0
39270 412438 cl00548 Na_Ala_symp Sodium:alanine symporter family. The Alanine or Glycine: Cation Symporter (AGCS) Family (TC 2.A.25) Members of the AGCS family transport alanine and/or glycine in symport with Na+ and or H+. 0
39271 412439 cl00549 ABC_membrane ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. 0
39272 412440 cl00551 Acylphosphatase Acylphosphatase. acylphosphatase; Provisional 0
39273 412441 cl00552 UPF0146 Uncharacterized protein family (UPF0146). hypothetical protein; Provisional 0
39274 412442 cl00553 DNase-RNase Bifunctional nuclease. This family is a bifunctional nuclease, with both DNase and RNase activity. It forms a wedge-shaped dimer, with each monomer being triangular in shape. A large groove at the thick end of the wedge contains a possible active site. 0
39275 412443 cl00554 Inos-1-P_synth Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction. 0
39276 412444 cl00555 SAF Domains similar to fish antifreeze type III protein. ChapFlgA is a family similar to the SAF family, and includes chaperones for flagellar basal-body proteins and pilus-assembly proteins, FlgA, RcpB and CpaB. ChapFlgA is necessary for the formation of the P-ring of the flagellum, FlgI, which sits in the peptidoglycan layer of the outer membrane of the bacterium. FlgA plays an auxiliary role in P-ring assembly. 0
39277 412445 cl00558 Abi CAAX protease self-immunity. The CAAX prenyl protease, in eukaryotes, catalyzes three covalent modifications, including cleavage and acylation, at the C-terminus of certain proteins in a process connected to protein sorting. This family describes a bacterial protein family homologous to one domain of the CAAX-processing enzyme. Members of this protein family are found in genomes that carry a predicted protein sorting system, PEP-CTERM/exosortase, usually in the vicinity of the EpsH homolog that is the hallmark of the system. The function of this protein is unknown, but it may relate to protein modification. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
39278 412446 cl00559 PgpA Phosphatidylglycerophosphatase A; a bacterial membrane-associated enzyme involved in lipid metabolism. This family represents a family of bacterial phosphatidylglycerophosphatases (EC:3.1.3.27), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli. 0
39279 412447 cl00561 CobD_Cbib CobD/Cbib protein. AmpE is a family of bacterial regulatory proteins. AmpE in conjunction with AmpD sense the effect of beta-lactam on peptidoglycan synthesis and relay this signal to AmpR. AmpR regulates the production of beta-lactamase. 0
39280 412448 cl00562 Cyt_bd_oxida_I Cytochrome bd terminal oxidase subunit I. cytochrome bd-II oxidase subunit 1; Provisional 0
39281 294381 cl00565 LysE LysE type translocator. [Transport and binding proteins, Amino acids, peptides and amines] 0
39282 412449 cl00567 Colicin_V Colicin V production protein. colicin V production protein; Provisional 0
39283 412450 cl00568 MotA_ExbB MotA/TolQ/ExbB proton channel family. The MotA protein, along with its partner MotB, comprise the stator complex of the bacterial flagellar motor. MotAB span the cytoplasmic membrane and undergo conformational changes powered by the translocation of protons. These conformational changes in turn are communicated to the rotor assembly, producing torque. This model represents one family of MotA proteins which are often not identified by the "transporter, MotA/TolQ/ExbB proton channel family" model, pfam01618. 0
39284 412451 cl00569 BCCT BCCT, betaine/carnitine/choline family transporter. putative transporter; Provisional 0
39285 412452 cl00570 AzlC AzlC protein. Overexpression of this gene results in resistance to a leucine analog, 4-azaleucine. The protein has 5 potential transmembrane motifs. It has been inferred, but not experimentally demonstrated, to be part of a branched-chain amino acid transport system. Commonly found in association with azlD. [Transport and binding proteins, Amino acids, peptides and amines] 0
39286 412453 cl00572 SpoIIM Stage II sporulation protein M. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This predicted integral membrane protein is designated stage II sporulation protein M. [Cellular processes, Sporulation and germination] 0
39287 412454 cl00573 SDF Sodium:dicarboxylate symporter family. C4-dicarboxylate transporter DctA; Reviewed 0
39288 412455 cl00574 Asp23 Asp23 family, cell envelope-related function. The alkaline shock protein Asp23 was identified as an alkaline shock protein that was expressed in a sigmaB-dependent manner in Staphylococcus aureus. Following an alkaline shock Asp23 accumulates in the soluble protein fraction of the S. aureus cell. Asp23 is one of the most abundant proteins in the cytosolic protein fraction of stationary S. aureus cells, with a copy-number of >25000 per cell. A second Asp23-family protein, AmaP, which is encoded within the asp23-operon, is required to localize Asp23 to the cell membrane. The overall function for the family is thus a cell envelope-related one in Gram-positive bacteria. 0
39289 412456 cl00581 LytR_cpsA_psr Cell envelope-related transcriptional attenuator domain. This model describes a domain of unknown function that is found in the predicted extracellular domain of a number of putative membrane-bound proteins. One of these proteins is psr, described as a penicillin binding protein 5 (PDP-5) synthesis repressor. Another is Bacillus subtilis LytR, described as a transcriptional attenuator of itself and the LytABC operon, where LytC is N-acetylmuramoyl-L-alanine amidase. A third is CpsA, a putative regulatory protein involved in exocellular polysaccharide biosynthesis. Besides the region of strong similarity represented by this model, these proteins share the property of having a short putative N-terminal cytoplasmic domain and transmembrane domain forming a signal-anchor. [Regulatory functions, Other] 0
39290 412457 cl00583 PhaG_MnhG_YufB Na+/H+ antiporter subunit. putative monovalent cation/H+ antiporter subunit G; Reviewed 0
39291 412458 cl00584 CutA1 CutA1 divalent ion tolerance protein. Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognized structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila). 0
39292 412459 cl00585 RNA_binding RNA binding. hypothetical protein; Provisional 0
39293 412460 cl00588 CarD_CdnL_TRCF CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase. 0
39294 412461 cl00591 FlaG FlaG protein. flagellar protein FlaG; Provisional 0
39295 412462 cl00593 FliP FliP family. type III secretion system protein YscR; Provisional 0
39296 412463 cl00596 LrgB LrgB-like family. Members of this small but broadly distributed (Gram-positive, Gram-negative, and Archaeal) family appear to have multiple transmembrane segments. The function is unknown. A homolog, LrgB of Staphylococcus aureus, in the same small superfamily but in an outgroup to this subfamily, is regulated by LytSR and is suggested to act as a murein hydrolase. Of the three paralogous proteins in B. subtilis, one is a full length member of this family, one lacks the C-terminal 60 residues and has an additional 128 N-terminal residues but branches within the family in a phylogenetic tree, and one is closely related to LrgB and part of the outgroup. [Hypothetical proteins, Conserved] 0
39297 412464 cl00597 Rnf-Nqr Rnf-Nqr subunit, membrane protein. electron transport complex RsxE subunit; Provisional 0
39298 351169 cl00598 SMC_ScpA Segregation and condensation protein ScpA. segregation and condensation protein A; Reviewed 0
39299 412465 cl00599 Extradiol_Dioxygenase_3B_like Subunit B of Class III Extradiol ring-cleavage dioxygenases. This family contains members from all branches of life. The molecular function of this protein is unknown, but Memo (mediator of ErbB2-driven cell motility) a human protein is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton. 0
39300 412466 cl00600 Ribosomal_L7Ae Ribosomal protein L7Ae/L30e/S12e/Gadd45 family. This RNA binding Pelota domain is at the C-terminus of a PRTase family. These PRTase+Pelota genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 0
39301 412467 cl00603 DmpA_OAT N/A. Members of the ArgJ family catalyze the first EC:2.3.1.1 and fifth steps EC:2.3.1.35 in arginine biosynthesis. 0
39302 412468 cl00604 STAS Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. 0
39303 412469 cl00605 RNase_P_Rpp14 Rpp14/Pop5 family. ribonuclease P protein component 2; Provisional 0
39304 412470 cl00606 Archease Archease protein family (MTH1598/TM1083). This archease family of proteins, has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism. 0
39305 412471 cl00607 PUA PUA domain. This uncharacterized domain is found in a number of enzymes and uncharacterized proteins, often at the C-terminus. It is found in some but not all members of a family of related tRNA-guanine transglycosylases (tgt), which exchange a guanine base for some modified base without breaking the phosphodiester backbone of the tRNA. It is also found in rRNA pseudouridine synthase, another enzyme of RNA base modification not otherwise homologous to tgt. It is found, again at the C-terminus, in two putative glutamate 5-kinases. It is also found in a family of small, uncharacterized archaeal proteins consisting mostly of this domain. 0
39306 412472 cl00608 LrgA LrgA family. hypothetical protein; Provisional 0
39307 412473 cl00610 Ribosomal_S17e Ribosomal S17. 40S ribosomal protein S17; Provisional 0
39308 412474 cl00611 Methyltrans_RNA RNA methyltransferase. This family is likely to be an S-adenosyl-L-methionine (SAM)-dependent RNA methyltransferase. It is responsible for N1-methylation of pseudouridine 54 in archaeal tRNAs. 0
39309 412475 cl00612 SMC_ScpB Segregation and condensation complex subunit ScpB. segregation and condensation protein B; Reviewed 0
39310 412476 cl00613 ATP-synt_D ATP synthase subunit D. V-type ATP synthase subunit D; Provisional 0
39311 412477 cl00614 ADP_ribosyl_GH ADP-ribosylglycohydrolase. Members of this family are the enzyme ADP-ribosyl-[dinitrogen reductase] hydrolase (EC 3.2.2.24), better known as Dinitrogenase Reductase Activating Glycohydrolase, DRAG. This enzyme reverses a regulatory inactivation of dinitrogen reductase caused by the action of NAD(+)--dinitrogen-reductase ADP-D-ribosyltransferase (EC 2.4.2.37) (DRAT). This enzyme is restricted to nitrogen-fixing bacteria and belongs to the larger family of ADP-ribosylglycohydrolases described by pfam03747. [Central intermediary metabolism, Nitrogen fixation] 0
39312 294412 cl00615 Membrane-FADS-like N/A. Beta-carotene hydroxylase (CrtR), the carotenoid zeaxanthin biosynthetic enzyme catalyzes the addition of hydroxyl groups to the beta-ionone rings of beta-carotene to form zeaxanthin and is found in bacteria and red algae. Carotenoids are important natural pigments; zeaxanthin and lutein are the only dietary carotenoids that accumulate in the macular region of the retina and lens. It is proposed that these carotenoids protect ocular tissues against photooxidative damage. CrtR does not show overall amino acid sequence similarity to the beta-carotene hydroxylases similar to CrtZ, an astaxanthin biosynthetic beta-carotene hydroxylase. However, CrtR does show sequence similarity to the green alga, Haematococcus pluvialis, beta-carotene ketolase (CrtW), which converts beta-carotene to canthaxanthin. Sequences of the CrtR_beta-carotene-hydroxylase domain family, as well as, the CrtW_beta-carotene-ketolase domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 0
39313 412478 cl00616 DUF177 Uncharacterized ACR, COG1399. This family is nearly universally conserved in bacteria and plants except the Chlorophyceae algae. Thus far, mutational analyses in bacteria have not established a function. In contrast, mutants have embryo lethal phenotypes in maize and Arabidopsis. In maize, the mutant embryos arrest at an early transition stage. It has been suggested that family members specifically affect 23S rRNA accumulation in plastids as well as bacteria. 0
39314 412479 cl00617 SRP19 SRP19 protein. signal recognition particle protein Srp19; Provisional 0
39315 412480 cl00618 Creatininase Creatinine amidohydrolase. Members of this family are creatininase (EC 3.5.2.10), an amidohydrolase that interconverts creatinine + H(2)O with creatine. It should not be confused with creatinase (EC 3.5.3.3), which hydrolyzes creatine to sarcosine plus urea. [Central intermediary metabolism, Nitrogen metabolism] 0
39316 412481 cl00620 DUF763 Protein of unknown function (DUF763). This family consists of several uncharacterized bacterial and archaeal proteins of unknown function. 0
39317 412482 cl00622 Csm2_III-A CRISPR/Cas system-associated protein Csm2. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-associated) proteins. This entry represents Csm2 Type III-A, a family of Cas proteins also known as TM1810/Csm2. 0
39318 412483 cl00625 BioW 6-carboxyhexanoate--CoA ligase. 6-carboxyhexanoate--CoA ligase; Provisional 0
39319 412484 cl00627 DUF192 Uncharacterized ACR, COG1430. hypothetical protein; Provisional 0
39320 412485 cl00628 Piwi-like N/A. This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg+2 dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA. 0
39321 412486 cl00630 YdcF-like N/A. This large family of proteins contains several highly conserved charged amino acids, suggesting this may be an enzymatic domain (Bateman A pers. obs). The family includes SanA, which is involved in Vancomycin resistance. This protein may be involved in murein synthesis. 0
39322 412487 cl00632 ATP-synt_F ATP synthase (F/14-kDa) subunit. V-type ATP synthase subunit F; Provisional 0
39323 412488 cl00635 Ntn_Asparaginase_2_like L-Asparaginase type 2-like enzymes of the NTN-hydrolase superfamily. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. 0
39324 412489 cl00638 RNA_pol_Rpb4 RNA polymerase Rpb4. This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP), an essential subunit of RNA polymerase III. C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II. 0
39325 412490 cl00640 DHQS 3-dehydroquinate synthase II (EC 1.4.1.24). 3-Dehydroquinate synthase II was isolated from the archaeon Methanocaldococcus jannaschii and plays a key role in an alternative pathway for the biosynthesis of 3-dehydroquinate (DHQ), an intermediate of the canonical pathway for the biosynthesis of aromatic amino acids. The enzyme catalyzes a two-step reaction - an oxidative deamination, followed by cyclization. The enzyme converts 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonate to 3-dehydroquinate. 0
39326 412491 cl00641 Cas4_I-A_I-B_I-C_I-D_II-B CRISPR/Cas system-associated protein Cas4. Members of this family belong to the PD-(D/E)XK nuclease superfamily 0
39327 412492 cl00642 GCHY-1 Type I GTP cyclohydrolase folE2. GTP cyclohydrolase; Provisional 0
39328 412493 cl00644 F420_ligase F420-0:Gamma-glutamyl ligase. This protein family is related to CofE, a gamma-glutamyl ligase of coenzyme F420 biosynthesis. However, it occurs in a different gamma-glutamyl ligase context, polyglutamylated tetrahydrofolate biosynthesis-like regions in two widely separated lineages that both occur as intracellular bacteria - Chlamydia and Wolbachia. 0
39329 412494 cl00647 SfsA Sugar fermentation stimulation protein. probable regulatory factor involved in maltose metabolism contains a putative DNA binding domain. Isolated as a gene which enabled E.coli strain MK2001 to use maltose. [Energy metabolism, Sugars, Regulatory functions, Other] 0
39330 412495 cl00649 DsbB Disulfide bond formation protein DsbB. disulfide bond formation protein B; Provisional 0
39331 412496 cl00650 Cu-oxidase_4 Multi-copper polyphenol oxidoreductase laccase. PSI-BLAST converges on members of this family of uncharacterized bacterial proteins and shows no significant similarity to any characterized protein. No completed genome to date has two members. Members of the family have been crystallized but the function is unknown. [Unknown function, General] 0
39332 412497 cl00652 DUF501 Protein of unknown function (DUF501). Family of uncharacterized bacterial proteins. 0
39333 412498 cl00653 Endonuclease_V Endonuclease_V, a DNA repair enzyme that initiates repair of nitrosative deaminated purine bases. This domain is found in the C subunits of the bacterial and archaeal UvrABC system which catalyzes nucleotide excision repair in a multi-step process. UvrC catalyzes the first incision on the fourth or fifth phosphodiester bond 3' and on the eighth phosphodiester bond 5' from the damage that is to be excised. The domain described here is found to the N-terminus of a helix hairpin helix (pfam00633) motif and also co-occurs with the pfam01541 catalytic domain which is found at the N-terminus of the same proteins. 0
39334 412499 cl00654 FliS flagellar export chaperone FliS. FliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT, however this protein has no known function. 0
39335 412500 cl00656 Cas1_I-II-III CRISPR/Cas system-associated protein Cas1. Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR. 0
39336 412501 cl00659 FdhD-NarQ FdhD/NarQ family. FdhD in E. coli and NarQ in B. subtilis are required for the activity of formate dehydrogenase. The gene name in B. subtilis reflects the requirement of the neighboring gene narA for nitrate assimilation, for which NarQ is not required. In some species, the gene is associated not with a known formate dehydrogenase but with a related putative molybdopterin-binding oxidoreductase. A reasonable hypothesis is that this protein helps prepare a required cofactor for assembly into the holoenzyme. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 0
39337 412502 cl00660 vATP-synt_AC39 ATP synthase (C/AC39) subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The C subunit is part of the hydrophilic A1 "stalk" complex (AhaABCDEFG), which is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex. 0
39338 412503 cl00661 DUF504 Protein of unknown function (DUF504). hypothetical protein; Provisional 0
39339 412504 cl00662 RNA_bind_2 Predicted RNA-binding protein. Members of this family of bacterial proteins are thought to have RNA-binding properties, however, their exact function has not, as yet, been defined. 0
39340 412505 cl00663 CRS1_YhbY CRS1 / YhbY (CRM) domain. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. 0
39341 412506 cl00666 CinA Competence-damaged protein. CinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species. Several bacterial species have a protein consisting largely of the C-terminal domain of CinA but lacking the N-terminal domain, including nicotinamide mononucleotide (NMN) deamidase (3.5.1.42) proteins PncC in Shewanella oneidensis and ygaD in E. coli. [DNA metabolism, DNA replication, recombination, and repair] 0
39342 412507 cl00667 DUF309 Domain of unknown function (DUF309). This domain is found in eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding. 0
39343 412508 cl00668 Hydantoinase_A Hydantoinase/oxoprolinase. This protein family was identified, by the method of partial phylogenetic profiling, as related to the use of tetrahydromethanopterin (H4MPT) as a C-1 carrier. Characteristic markers of the H4MPT-linked C1 transfer pathway include formylmethanofuran dehydrogenase subunits, methenyltetrahydromethanopterin cyclohydrolase, etc. Tetrahydromethanopterin, a tetrahydrofolate analog, occurs in methanogenic archaea, bacterial methanotrophs, planctomycetes, and a few other lineages. [Central intermediary metabolism, One-carbon metabolism] 0
39344 412509 cl00669 DUF503 Protein of unknown function (DUF503). Family of hypothetical bacterial proteins. 0
39345 412510 cl00670 CsrA Global regulator protein family. Modulates the expression of genes in the glycogen biosynthesis and gluconeogenesis pathways by accelerating the 5'-to-3' degradation of these transcripts through selective RNA binding. The N-terminal end of the sequence (AA 11-45) contains the KH motif which is characteristic of a set of RNA-binding proteins. [Energy metabolism, Glycolysis/gluconeogenesis, Regulatory functions, RNA interactions] 0
39346 412511 cl00671 Ribosomal_L40e Ribosomal L40e family. 50S ribosomal protein L40e; Provisional 0
39347 412512 cl00672 DrsE DsrE/DsrF-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. The family also includes YrkE proteins. 0
39348 412513 cl00674 LUD_dom LUD domain. This entry represents a domain found in lactate utilization proteins B (LutB) and C (LutC), as well as several uncharacterized proteins. LutB and LutC are encoded by the conserved LutABC operon in bacteria. They are involved in lactate utilization and are implicated in the oxidative conversion of L-lactate into pyruvate. 0
39349 412514 cl00676 DUF4040 Domain of unknown function (DUF4040). Possible subunit of Na+/H+ antiporter,. Predicted integral membrane protein, usually four transmembrane regions in this domain. Often found in bacterial NADH dehydrogenase subunit. 0
39350 412515 cl00681 FliL Flagellar basal body-associated protein FliL. flagellar basal body-associated protein FliL; Reviewed 0
39351 412516 cl00682 Alba Alba. The nuclear RNase P of Saccharomyces cerevisiae is made up of at least nine protein subunits; Pop1, Pop3, Pop4, Pop5, Pop6, Pop7, Pop8, Rpr2 and Rpp1. Many of these subunits seem to be present also in the RNase MRP, with the exception of Rpr2 (Rpp21) which is unique to RNase P. Human nuclear RNase P and MRP appear to contain at least 10 protein subunits, Rpp14, Rpp20, Rpp21, Rpp25, Rpp29, Rpp30, Rpp38, Rpp40, hPop1 and hPop5, although there is recent evidence that not all of these subunits are shared between P and MRP. Archaeal RNase P has at least four protein subunits homologous to eukaryotic RNase P/MRP proteins. In the yeast RNase P, Pop6 and Pop7 (the Rpp20 homolog) interact with each other and they are both interaction partners of Pop4; in the human MRP Rpp25 and Rpp20 interact with each other and Rpp25 binds to Rpp29 (Pop4). 0
39352 412517 cl00683 FlbD Flagellar protein (FlbD). This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown. 0
39353 412518 cl00685 Grp1_Fun34_YaaH GPR1/FUN34/yaaH family. Proteins of this family are acetate transporters, which usually have 6 transmembrane regions. The homologue in E. coli is YaaH. 0
39354 412519 cl00686 NfeD NfeD-like C-terminal, partner-binding. NfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). There appear to be three major groups: an ancestral group with only an N-terminal serine protease domain and this C-terminal beta sheet-rich domain which is structurally very similar to the OB-fold domain, associated with its neighboring slipin cluster; a second major group with an additional middle, membrane-spanning domain, associated in some species with eoslipin and in others with yqfA; a final 'artificial' group which unites truncated forms lacking the protease region and associated with their ancestral gene partner, either yqfA or eoslipin. This NfeD C-terminal domain appears to be the major one for relating to the associated protein. NfeD homologs are clearly reliant on their conserved gene neighbor which is assumed to be necessary for function, either through direct physical interaction or by functioning in the same pathway, possibly involved with lipid rafts. 0
39355 412520 cl00687 AdoMet_dc S-adenosylmethionine decarboxylase. Members of this protein family are the single chain precursor of the S-adenosylmethionine decarboxylase as found in Escherichia coli. This form shows a substantially different architecture from the form shared by the Archaea, Bacillus, and many other species (TIGR03330). It shows little or no similarity to the form found in eukaryotes (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis] 0
39356 412521 cl00688 UPF0086 Domain of unknown function UPF0086. ribonuclease P protein component 1; Validated 0
39357 412522 cl00689 TYW3 Methyltransferase TYW3. hypothetical protein; Provisional 0
39358 412523 cl00693 CM_2 Chorismate mutase type II. This model represents the plant and yeast (plastidic) chorismate mutase. These CM's are distinct from other forms by the presence of an extended regulatory domain. [Amino acid biosynthesis, Aromatic amino acid family] 0
39359 412524 cl00698 CGI-121 Kinase binding protein CGI-121. CGI-121 has been shown to bind to the p53-related protein kinase (PRPK). PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumor suppressor protein p53. CGI-121 is part of a conserved protein complex, KEOPS. The KEOPS complex is involved in telomere uncapping and telomere elongation. Interestingly this family also includes archaeal homologs, formerly in the DUF509 family. A structure for these proteins has been solved by structural genomics. 0
39360 412525 cl00700 Peptidase_S66 LD-Carboxypeptidase, a serine protease, includes microcin C7 self immunity protein. Muramoyl-tetrapeptide carboxypeptidase hydrolyzes a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66. 0
39361 294462 cl00701 Lactate_perm L-lactate permease. L-lactate permease; Provisional 0
39362 412526 cl00706 Ribosomal_L44 Ribosomal protein L44. 60S ribosomal protein L36a; Provisional 0
39363 412527 cl00711 Glyco_hydro_77 4-alpha-glucanotransferase. 4-alpha-glucanotransferase; Provisional 0
39364 412528 cl00712 RNA_pol_N RNA polymerases N / 8 kDa subunit. DNA-directed RNA polymerase subunit N; Provisional 0
39365 412529 cl00713 Auto_anti-p27 Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). hypothetical protein; Validated 0
39366 412530 cl00716 tRNA_deacylase D-aminoacyl-tRNA deacylase. hypothetical protein; Provisional 0
39367 412531 cl00718 TOPRIM N/A. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation. 0
39368 412532 cl00720 DUF296 Domain of unknown function found in archaea, bacteria, and plants. This putative domain is found in proteins that contain AT-hook motifs pfam02178, which strongly suggests a DNA-binding function for the proteins as a whole. There are three highly conserved histidine residues, eg at 117, 119 and 133 in Reut_B5223, which should be a structurally conserved metal-binding unit, based on structural comparison with known metal-binding structures. The proteins should work as trimers. 0
39369 294470 cl00721 DDE_Tnp_IS1 IS1 transposase. Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases. 0
39370 412533 cl00723 YajQ_like Proteins similar to Escherichia coli YajQ. Family of uncharacterized proteins. 0
39371 260590 cl00724 DUF2226 Uncharacterized protein conserved in archaea (DUF2226). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39372 412534 cl00727 DUF188 Uncharacterized BCR, YaiI/YqxD family COG1671. 0
39373 412535 cl00728 EVE EVE domain. hypothetical protein; Provisional 0
39374 412536 cl00731 DUF179 Uncharacterized ACR, COG1678. 0
39375 412537 cl00732 Arch_flagellin Archaebacterial flagellin. flagellin; Validated 0
39376 412538 cl00733 DUF523 Protein of unknown function (DUF523). Family of uncharacterized bacterial proteins. 0
39377 412539 cl00734 Bac_export_1 Bacterial export proteins, family 1. flagellar biosynthesis protein FliR; Reviewed 0
39378 412540 cl00735 AzlD Branched-chain amino acid transport protein (AzlD). This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD is known to be involved in conferring resistance to 4-azaleucine although its exact role is uncertain. 0
39379 382173 cl00738 MBOAT MBOAT, membrane-bound O-acyltransferase family. Members of this protein family are DltB, part of a four-gene operon for D-alanyl-lipoteichoic acid biosynthesis that is present in the vast majority of low-GC Gram-positive organisms. This protein may be involved in transport of D-alanine across the plasma membrane. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 0
39380 412541 cl00739 UPF0147 Uncharacterized protein family (UPF0147). hypothetical protein; Provisional 0
39381 412542 cl00740 FliW FliW protein. flagellar assembly protein FliW; Provisional 0
39382 412543 cl00742 LemA LemA family. The members of this family are related to the LemA protein. LemA contains an amino terminal predicted transmembrane helix. It has been predicted that the small amino terminus is extracellular. The exact molecular function of this protein is uncertain. 0
39383 412544 cl00746 RDD RDD family. This family of proteins contains three highly conserved amino acids: one arginine and two aspartates, hence the name of RDD family. This region contains two predicted transmembrane regions. The arginine occurs at the N-terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.). 0
39384 412545 cl00748 Ribosomal_L32_L32e N/A. This family includes ribosomal protein L32 from eukaryotes and archaebacteria. 0
39385 412546 cl00749 UPF0066 Escherichia coli YaeB and related proteins. This protein has been characterized by crystallography in complex with S-Adenosylmethionine, making it a probable S-adenosylmethionine-dependent methyltransferase. Analysis in EcoGene links this protein to the enzyme characterization mapped to the tsaA gene in Escherichia coli. [Unknown function, Enzymes of unknown specificity] 0
39386 412547 cl00750 Exonuc_VII_S Exonuclease VII small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides. [DNA metabolism, Degradation of DNA] 0
39387 412548 cl00751 DUF155 Uncharacterized ACR, YagE family COG1723. 0
39388 412549 cl00752 HicA_toxin HicA toxin of bacterial toxin-antitoxin. HicA_toxin is a bacterial family of toxins that act as mRNA interferases. The antitoxin that neutralizes this is family HicB, pfam15919. 0
39389 412550 cl00753 DUF327 Protein of unknown function (DUF327). The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown. 0
39390 412551 cl00755 zf-dskA_traR Prokaryotic dksA/traR C4-type zinc finger. Members of this predicted regulatory protein are found only in endospore-forming members of the Firmicutes group of bacteria, and in nearly every such species; Clostridium perfringens seems to be an exception. The member from Bacillus subtilis, the model system for the study of the sporulation program, has been designated both yteA and yzwB. Some (but not all) members of this family show a strong sequence match to Pfam family pfam01258 the C4-type zinc finger protein, DksA/TraR family, but only one of the four key Cys residues is conserved. All members of this protein family share an additional C-terminal domain. Smaller proteins from the proteobacteria with just the N-terminal domain, including DksA and DksA2 are RNA polymerase-binding regulatory proteins even if the Zn-binding site is not conserved. [Unknown function, General] 0
39391 412552 cl00756 Vut_1 Putative vitamin uptake transporter. All known members of this family are proteins of 210-250 amino acids in length. Conserved regions of hydrophobicity suggest that all members of the family are integral membrane proteins. [Hypothetical proteins, Conserved] 0
39392 242072 cl00757 UPF0060 Uncharacterized BCR, YnfA/UPF0060 family. 0
39393 412553 cl00759 UPF0058 Uncharacterized protein family UPF0058. This archaebacterial protein has no known function. 0
39394 412554 cl00762 VAPB_antitox Putative antitoxin. hypothetical protein; Provisional 0
39395 412555 cl00764 EMG1 EMG1/NEP1 methyltransferase. Members of this family are essential for 40S ribosomal biogenesis. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases. 0
39396 412556 cl00767 OsmC OsmC-like protein. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein OsmC, or osmotically induced protein C. The member from Thermus thermophilus was shown to have hydroperoxide peroxidase activity. In many species, this protein is induced by stress and helps resist oxidative stress. [Cellular processes, Detoxification] 0
39397 412557 cl00768 CitG ATP:dephospho-CoA triphosphoribosyl transferase. This protein acts in cofactor biosynthesis, preparing the coenzyme A derivative that becomes attached to the malonate decarboxylase acyl carrier protein (or delta subunit). The closely related protein CitG of citrate lyase produces the same molecule, but the two families are nonetheless readily separated. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
39398 382191 cl00769 DUF531 Protein of unknown function (DUF531). Family of hypothetical archaeal proteins. 0
39399 412558 cl00770 PSP1 PSP1 C-terminal conserved region. This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast. 0
39400 412559 cl00772 TctA Tripartite tricarboxylate transporter TctA family. This family, formerly known as DUF112, is a family of bacterial and archaeal tripartite tricarboxylate transporters of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. TctA is part of the tripartite TctABC system which, as characterized in S. typhimurium, is a secondary carrier that depends for activity on the extracytoplasmic tricarboxylate-binding receptor TctC as well as two integral membrane proteins, TctA and TctB. Complete three-component systems are found only in bacteria. TctA is a large transmembrane protein with up to 12 predicted membrane spanning regions in bacteria and up to 11 such in archaea, with the N-terminal within the cytoplasm. TctA is thought to be a permease, and in most other bacteria functions without TctB and TctC molecules. 0
39401 412560 cl00774 Fae Formaldehyde-activating enzyme (Fae). This family consists of formaldehyde-activating enzyme, or the corresponding domain of longer, bifunctional proteins. It links formaldehyde to the C1 carrier tetrahydromethanopterin (H4MPT), an analog of tetrahydrofolate, and is common among species with H4MPT. The ribulose monophosphate (RuMP) pathway, which removes the toxic metabolite formaldehyde by assimilation, runs in the opposite direction in some species to produce ribulose 5-phosphate for nucleotide biosynthesis, leaving formaldehyde as an additional metabolite. In these species, formaldehyde activating enzyme may occur as a fusion protein with D-arabino 3-hexulose 6-phosphate formaldehyde lyase from the RuMP pathway. 0
39402 412561 cl00775 SepF Cell division protein SepF. SepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology. This family also includes archaeal related proteins of unknown function. 0
39403 412562 cl00777 DUF72 Protein of unknown function DUF72. hypothetical protein; Provisional 0
39404 412563 cl00779 NQR2_RnfD_RnfE NQR2, RnfD, RnfE family. Na(+)-translocating NADH-quinone reductase subunit B; Provisional 0
39405 412564 cl00780 Kinase-PPPase Kinase/pyrophosphorylase. This family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity. 0
39406 412565 cl00781 DUF389 Domain of unknown function (DUF389). This conserved hypothetical protein is found so far only in three archaeal genomes and in Streptomyces coelicolor. It shares a hydrophobic uncharacterized domain (see TIGR00271) of about 180 residues with several eubacterial proteins, including the much longer protein sll1151 of Synechocystis PCC6803. [Hypothetical proteins, Conserved] 0
39407 412566 cl00782 ComA (2R)-phospho-3-sulfolactate synthase (ComA). This model finds the ComA (Coenzyme M biosynthesis A) protein, phosphosulfolactate synthase, in methanogenic archaea. The ComABC pathway is one of at least two pathways to the intermediate sulfopyruvate. Coenzyme M occurs rarely and sporadically outside of the archaea, as for epoxide metabolism in Xanthobacter autotrophicus Py2, but candidate phosphosulfolactate synthases from that and other species fall below the cutoff and outside the scope of this model. This model deliberately is narrower in scope than pfam02679. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 0
39408 412567 cl00784 DUF554 Protein of unknown function (DUF554). Family of uncharacterized prokaryotic proteins. Multiple predicted transmembrane regions suggest that the region is membrane associated. 0
39409 412568 cl00787 Ribosomal_L25_TL5_CTC Ribosomal L25/TL5/CTC N-terminal 5S rRNA binding domain. Ribosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress. 0
39410 294511 cl00788 MttA_Hcf106 mttA/Hcf106 family. This model distinguishes TatA/E from the related TatB, but does not distinguish TatA from TatE. The Tat (twin-arginine translocation) system is a Sec-independent exporter for folded proteins, often with a redox cofactor already bound, across the bacterial inner membrane. Functionally equivalent systems are found in the chloroplast and in some archaeal species. The signal peptide recognized by the Tat system is modeled by TIGR01409. [Protein fate, Protein and peptide secretion and trafficking] 0
39411 412569 cl00789 PurS Phosphoribosylformylglycinamidine (FGAM) synthase. phosphoribosylformylglycinamidine synthase subunit PurS; Reviewed 0
39412 412570 cl00793 DUF92 Integral membrane protein DUF92. [Hypothetical proteins, Conserved] 0
39413 412571 cl00795 Fumerase_C Fumarase C-terminus. L(+)-tartrate dehydratase subunit beta; Validated 0
39414 412572 cl00796 Adenosine_kin Adenosine specific kinase. The structure of a member of this family from the hyperthermophilic archaeon Pyrobaculum aerophilum contains a modified histidine residue which is interpreted as stable phosphorylation. In vitro binding studies confirmed that adenosine and AMP but not ADP or ATP bind to the protein. 0
39415 412573 cl00797 DUF356 Protein of unknown function (DUF356). Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However it contains a conserved motif IHPPAH that may be involved in its function. 0
39416 412574 cl00798 DUF357 Protein of unknown function (DUF357). Members of this family are short (less than 100 amino acid) proteins found in archaebacteria. The function of these proteins is unknown. 0
39417 321175 cl00799 UPF0128 Uncharacterized protein family (UPF0128). hypothetical protein; Provisional 0
39418 412575 cl00800 DUF116 Protein of unknown function DUF116. This archaebacterial protein has no known function. The protein contains seven conserved cysteines and may also be an integral membrane protein. 0
39419 412576 cl00802 LuxS S-Ribosylhomocysteinase (LuxS). This family consists of the LuxS protein involved in autoinducer AI2 synthesis and its hypothetical relatives. S-ribosylhomocysteinase (LuxS) catalyzes the cleavage of the thioether bond in S-ribosylhomocysteine (SRH) to produce homocysteine and 4,5-dihydroxy-2,3-pentanedione (DPD), the precursor of type II bacterial quorum sensing molecule. 0
39420 412577 cl00803 Cas7_I CRISPR/Cas system-associated RAMP superfamily protein Cas7. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. This family used to be known as DUF73. DevR appears to be a negative auto-regulator within the system. 0
39421 412578 cl00805 UPF0179 Uncharacterized protein family (UPF0179). hypothetical protein; Provisional 0
39422 412579 cl00806 YajC Preprotein translocase subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas. [Protein fate, Protein and peptide secretion and trafficking] 0
39423 412580 cl00807 MNHE Na+/H+ ion antiporter subunit. putative monovalent cation/H+ antiporter subunit E; Reviewed 0
39424 412581 cl00808 CbiZ Adenosylcobinamide amidohydrolase. This prokaryotic protein family includes CbiZ which converts adenosylcobinamide (AdoCbi) to adenosylcobyric acid (AdoCby), an intermediate of the de novo coenzyme B12 biosynthetic route. 0
39425 412582 cl00809 RbsD_FucU RbsD / FucU transport protein family. L-fucose mutarotase; Provisional 0
39426 412583 cl00810 CheD CheD chemotactic sensory transduction. chemoreceptor glutamine deamidase CheD; Provisional 0
39427 412584 cl00811 DUF167 Uncharacterized ACR, YggU family COG1872. hypothetical protein; Validated 0
39428 412585 cl00814 Cyclase Putative cyclase. One of several pathways of tryptophan degradation is as follows: tryptophan 2,3-dioxygenase (1.13.11.11) uses O2 to convert Trp to L-formylkynurenine. Arylformamidase (3.5.1.9) hydrolyzes the product to L-kynurenine and formate. Kynureninase (3.7.1.3) hydrolyzes L-kynurenine to anthranilate plus alanine. Members of the seed alignment for this model are bacterial predicted metal-dependent hydrolases. All are supported as arylformamidase (3.5.1.9) by an operon structure in which kynureninase and/or tryptophan 2,3-dioxygenase genes are adjacent. The members from Bacillus cereus, Pseudomonas aeruginosa and Ralstonia metallidurans were characterized. An example from Pseudomonas fluorescens is given the gene symbol qbsH instead of kynB because of its role in quinolobactin biosynthesis, which begins with tryptophan. All members of this family should be arylformamidase (3.5.1.9). [Energy metabolism, Amino acids and amines] 0
39429 412586 cl00816 OAD_beta Na+-transporting oxaloacetate decarboxylase beta subunit. Malonate decarboxylase can be a soluble enzyme, or a sodium ion-translocating with additional membrane-bound components. Members of this protein family are integral membrane proteins required to couple decarboxylation to sodium ion export. This family belongs to a broader family, TIGR01109 of sodium ion-translocating decarboxylase beta subunits. [Transport and binding proteins, Cations and iron carrying compounds] 0
39430 412587 cl00817 MM_CoA_mutase N/A. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA. 0
39431 412588 cl00818 DUF555 Protein of unknown function (DUF555). hypothetical protein; Provisional 0
39432 412589 cl00820 DUF211 Uncharacterized ArCR, COG1888. 0
39433 412590 cl00821 Ribosomal_S3Ae Ribosomal S3Ae family. 30S ribosomal protein S3Ae; Validated 0
39434 412591 cl00822 4HFCP_synth 4-HFC-P synthase. (5-formylfuran-3-yl)methyl phosphate synthase, also known as 4-HFC-P synthase, is involved in the production of methanofuran. This family has a classical TIM-barrel structure whose biological unit is a homohexamer. 0
39435 412592 cl00824 HEPN HEPN domain. 0
39436 412593 cl00826 DS Deoxyhypusine synthase. Deoxyhypusine synthase is responsible for the first step in creating hypusine. Hypusine is a modified amino acid found in eukaryotes and in archaea in their respective forms of initiation factor 5A. Its presence is confirmed in archaeal genera Pyrococcus, Sulfolobus, Halobacterium, and Haloferax, but in an older report was not detected in Methanococcus voltae (J Biol Chem 1987 Dec 5;262(34):16585-9). This family of apparent orthologs has an unusual UPGMA difference tree, in which the members from the archaea M. jannaschii and P. horikoshii cluster with the known eukaryotic deoxyhypusine synthases. Separated by a fairly deep branch, although still strongly related, is a small cluster of proteins from Methanobacterium thermoautotrophicum and Archaeoglobus fulgidus, the latter of which has two. [Protein fate, Protein modification and repair] 0
39437 412594 cl00828 CbiD CbiD. This protein has been shown by cloning into E. coli to be required for cobalamin biosynthesis. role_id [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 0
39438 412595 cl00829 UxaC Glucuronate isomerase. glucuronate isomerase; Reviewed 0
39439 294541 cl00830 DUF401 Protein of unknown function (DUF401). This protein is predicted to have 10 transmembrane regions. Members of this family are found so far in the Archaea (Archaeoglobus fulgidus and Pyrococcus horikoshii) and in a bacterial thermophile, Thermotoga maritima. In Pyrococcus, the gene is located between nadA and nadB, two components of an enzyme involved in de novo synthesis of NAD. By PSI-BLAST, this family shows similarity (but not necessarily homology) to gluconate permease and other transport proteins. [Hypothetical proteins, Conserved] 0
39440 412596 cl00831 FlpD Methyl-viologen-reducing hydrogenase, delta subunit. This family consists of methyl-viologen-reducing hydrogenase, delta subunit / heterodisulphide reductase. No specific functions have been assigned to this subunit. The aligned region corresponds to almost the entire delta chain sequence and contains 4 conserved cysteine residues. However, in two Archaeoglobus sequences this region corresponds to only the C-terminus of these proteins. 0
39441 412597 cl00832 DUF359 Protein of unknown function (DUF359). hypothetical protein; Provisional 0
39442 412598 cl00838 FeoA FeoA domain. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment. 0
39443 412599 cl00840 MTD methylene-5,6,7,8-tetrahydromethanopterin dehydrogenase. This enzyme family is involved in formation of methane from carbon dioxide EC:1.5.99.9. The enzyme requires coenzyme F420. 0
39444 412600 cl00841 Gly_kinase Glycerate kinase family. The only characterized member of this family so far is the glycerate kinase GlxK (EC 2.7.1.31) of E. coli. This enzyme acts after glyoxylate carboligase and 2-hydroxy-3-oxopropionate reductase (tartronate semialdehyde reductase) in the conversion of glyoxylate to 3-phosphoglycerate (the D-glycerate pathway) as a part of allantoin degradation. [Energy metabolism, Other] 0
39445 412601 cl00842 CbiN Cobalt transport protein component CbiN. This model describes the cobalt transporter in bacteria and its equivalents in archaea. It principally functions in the ion uptake mechanism. It is a multisubunit transporter with two integral membrane proteins and two closely associated cytoplasmic subunits. This transporter belongs to the ABC transporter superfamily (ABC stands for ATP Binding Cassette). This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions from the external milieu and the other group which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both the uptake and efflux processes. [Transport and binding proteins, Cations and iron carrying compounds] 0
39446 412602 cl00845 DUF473 Protein of unknown function (DUF473). Family of uncharacterized Archaeal proteins. 0
39447 412603 cl00846 CsoR-like_DUF156 Transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this domain superfamily was previously known as DUF156. This is a family of metal-sensitive repressors, involved in resistance to metal ions. Members of this family bind copper, nickel or cobalt ions via conserved cysteine and histidine residues. In the absence of metal ions, these proteins bind to promoter regions and repress transcription. When bound to metal ions they are unable to bind DNA, leading to transcriptional derepression. 0
39448 412604 cl00847 PAC2 PAC2 family. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in but are universal among the Archaea. This ortholog set includes MJ1210 from Methanococcus jannaschii and AF0525 from Archaeoglobus fulgidus while excluding MJ0106 and AF1251. [Hypothetical proteins, Conserved] 0
39449 412605 cl00848 Y1_Tnp Transposase IS200 like. Most IS200/IS605 family insertion sequences encode both this transposase, TnpA, about 130 amino acids long, and a larger accessory protein, TnpB, that may act as a methyltransferase. 0
39450 412606 cl00849 PvlArgDC Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). pyruvoyl-dependent arginine decarboxylase; Provisional 0
39451 412607 cl00850 Phage_holin_4_2 Mycobacterial 4 TMS phage holin, superfamily IV. These proteins are predicted transmembrane proteins with probably four transmembrane spans. The 1.E.40 is represented by the mycobacterial 4 phage holin, but it also contains many cyanobacterial, proteobacterial and firmicute proteins. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as in those of the bacteriophage of these organisms. The primary function of holins appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded the enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases. 0
39452 412608 cl00851 Fumerase Fumarate hydratase (Fumerase). A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of archaeal proteins in this subfamily has not been established. 0
39453 412609 cl00857 DUF63 Membrane protein of unknown function DUF63. Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins. 0
39454 412610 cl00858 BacA Bacitracin resistance protein BacA. undecaprenyl pyrophosphate phosphatase; Reviewed 0
39455 412611 cl00860 MscL Large-conductance mechanosensitive channel, MscL. Protein encodes a channel which opens in response to a membrane stretch force. Probably serves as an osmotic gauge. Carboxy terminus tends to be more divergent across species with a high degree of sequence conservation found at the N-terminus. [Cellular processes, Adaptations to atypical conditions] 0
39456 412612 cl00861 RNaseH_like Ribonuclease H-like. RNaseH_like is a family of uncharacterized eubacterial proteins that are distant homologs of Ribonuclease H-like. The family maintains all the core secondary structure elements of the RNase H-like fold and shares several conserved, presumably active site residues with RNase HI. This finding suggests that it functions as a nuclease. 0
39457 412613 cl00862 FBPase_3 Fructose-1,6-bisphosphatase. This is a family of bacterial and archaeal fructose-1,6-bisphosphatases (FBPases). FBPase catalyzes the hydrolysis of D-fructose-1,6-bisphosphate (FBP) to D-fructose-6-phosphate (F6P) and orthophosphate and is an essential regulatory enzyme in the glyconeogenic pathway. 0
39458 412614 cl00864 PspC PspC domain. This family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length. 0
39459 412615 cl00865 CT_A_B Carboxyltransferase domain, subdomain A and B. This domain represents subunit 2 of allophanate hydrolase (AHS2). 0
39460 412616 cl00866 NTPase_I-T Protein of unknown function DUF84. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 0
39461 412617 cl00867 Bac_export_3 Bacterial export proteins, family 3. flagellar biosynthesis protein FliQ; Reviewed 0
39462 412618 cl00868 YdjM LexA-binding, inner membrane-associated putative hydrolase. YdjM is a family of putative LexA-binding proteins. Members are predicted to be membrane-bound metal-dependent hydrolases that may be acting as phospholipases. It is a member of the SOS network, that rescues cells from UV and other DNA-damage. Expression of YdjM is regulated by LexA. 0
39463 412619 cl00871 ThiP_synth Thiamine-phosphate synthase. This family is thiamine-phosphate synthase, and it belongs to the SCOP phosphomethylpyrimidine kinase C-terminal domain-like family. Vitamin B1 (thiamine pyrophosphate) is involved in several microbial metabolic functions. Thiamine biosynthesis is accomplished by joining two intermediate molecules that are synthesized separately, HMP-PP and HET-P. In the archaeon Natrialba magadii, ThiE and ThiN, are known to join HMP-PP ( hydroxymethylpyrimidine pyrophosphate) and HET-P (hydroxyethylthiazole phosphate) to generate thiamine phosphate. Whereas ThiE in Natrialba magadii is a mono-functional protein, ThiN exists as a C-terminal domain in a ThiDN fusion protein - examples of all three forms, from various prokaryotes, are found in this family. 0
39464 412620 cl00872 DUF190 Uncharacterized ACR, COG1993. 0
39465 412621 cl00873 PdxA Pyridoxal phosphate biosynthetic protein PdxA. This model represents PdxA, an NAD+-dependent 4-hydroxythreonine 4-phosphate dehydrogenase (EC 1.1.1.262) active in pyridoxal phosphate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 0
39466 412622 cl00874 DNA_RNApol_7kD DNA directed RNA polymerase, 7 kDa subunit. DNA-directed RNA polymerase subunit P; Provisional 0
39467 412623 cl00876 Ribosomal_S27 Ribosomal protein S27a. 30S ribosomal protein S27ae; Validated 0
39468 412624 cl00877 MazE_antitoxin Antidote-toxin recognition MazE, bacterial antitoxin. PrlF_antitoxin is a family of bacterial antitoxins that neutralizes the toxin YhaV. PrlF is labile and forms a homodimer that then binds to the YhaV toxin thereby neutralising its ribonuclease activity. Alone, it can also act as a transcription factor. The YhaV/PrlF complex binds the prlF-yhaV operon, probably regulating its expression negatively. Over-expression of PrlF leads to increased doubling time. 0
39469 412625 cl00878 Ribosomal_S24e Ribosomal protein S24e. 40S ribosomal protein S24; Provisional 0
39470 294572 cl00880 Ribosomal_S8e_like Eukaryotic/archaeal ribosomal protein S8e and similar proteins. 40S ribosomal protein S8-like; Provisional 0
39471 412626 cl00881 SQR_QFR_TM N/A. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-B. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 13kD hydrophobic subunit D. 0
39472 412627 cl00883 RNA_pol_Rpb5_C RNA polymerase Rpb5, C-terminal domain. DNA-directed RNA polymerase subunit H; Reviewed 0
39473 412628 cl00884 AIM24 Mitochondrial biogenesis AIM24. [Hypothetical proteins, Conserved] 0
39474 412629 cl00886 Robl_LC7 Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. 0
39475 412630 cl00887 Rpr2 RNAse P Rpr2/Rpp21/SNM1 subunit domain. ribonuclease P protein component 4; Validated 0
39476 412631 cl00890 DUF366 Domain of unknown function (DUF366). Archaeal domain of unknown function. 0
39477 412632 cl00891 Cu-Zn_Superoxide_Dismutase N/A. superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold. 0
39478 412633 cl00892 DUF131 Protein of unknown function DUF131. The member of this family from Pyrococcus horikoshii scores only 13.91 bits, largely because it is at least 15 residues shorter than other members of this family of small proteins and is penalized for not matching to the N-terminal section of the model. Cutoff scores are set so this hit is between noise and trusted cutoffs. [Hypothetical proteins, Conserved] 0
39479 412634 cl00893 DUF368 Domain of unknown function (DUF368). Predicted transmembrane domain of unknown function. Family members have between 6 and 9 predicted transmembrane segments. 0
39480 412635 cl00894 DUF169 Uncharacterized ArCR, COG2043. 0
39481 412636 cl00895 2-ph_phosp 2-phosphosulpholactate phosphatase. 2-phosphosulfolactate phosphatase catalyzes the sulfonation of phosphoenolpyruvate to form 2-phospho-3-sulfolactate, the second step in coenzyme M biosynthesis. Coenzyme M is the terminal methyl carrier in methanogenesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis] 0
39482 294584 cl00897 Ribosomal_S27e Ribosomal protein S27. 40S ribosomal protein S27; Provisional 0
39483 412637 cl00898 DUF370 Domain of unknown function (DUF370). hypothetical protein; Provisional 0
39484 412638 cl00900 Ldh_2 Malate/L-lactate dehydrogenase. This enzyme converts ureidoglycolate to oxalureate in the non-urea-forming catabolism of allantoin (GenProp0687). The pathway has been characterized in E. coli and is observed in the genomes of Enterococcus faecalis and Bacillus licheniformis. 0
39485 412639 cl00903 KdpA Potassium-transporting ATPase A subunit. Kdp is a high affinity ATP-driven K+ transport system in Escherichia coli. It is composed of three membrane-bound subunits, KdpA, KdpB and KdpC and one small peptide, KdpF. KdpA is the K+-transporting subunit of this complex. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [medline:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [medline:9858692]. [Transport and binding proteins, Cations and iron carrying compounds] 0
39486 412640 cl00907 Glutaminase Glutaminase. This family describes the enzyme glutaminase, from a larger family that includes serine-dependent beta-lactamases and penicillin-binding proteins. Many bacteria have two isozymes. This model is based on selected known glutaminases and their homologs within prokaryotes, with the exclusion of highly-derived (long branch) and architecturally varied homologs, so as to achieve conservative assignments. A sharp drop in scores occurs below 250, and cutoffs are set accordingly. The enzyme converts glutamine to glutamate, with the release of ammonia. Members tend to be described as glutaminase A (glsA), where B (glsB) is unknown and may not be homologous (as in Rhizobium etli). Some species have two isozymes that may both be designated A (GlsA1 and GlsA2). [Energy metabolism, Amino acids and amines] 0
39487 412641 cl00909 Ribosomal_L24e_L24 N/A. MYM-type zinc fingers were identified in MYM family proteins. Human protein ZMYM3 is involved in a chromosomal translocation and may be responsible for X-linked retardation in XQ13.1. ZMYM2 is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1; in atypical myeloproliferative disorders it is rearranged. Members of the family generally are involved in development. This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses, NCLDV. This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons. 0
39488 412642 cl00911 AMMECR1 AMMECR1. Members of this protein family belong to the same domain family as AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Members of the present family occur as part of a three gene system with a homolog of the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility), and an uncharacterized radical SAM enzyme. 0
39489 412643 cl00912 MmgE_PrpD MmgE/PrpD family. Members of this family are bacterial proteins known or predicted to act as 2-methylcitrate dehydratase, an enzyme involved in the methylcitrate cycle of propionate catabolism. A related clade of archaeal proteins that may or may not be functionally equivalent is reserved for a future model and is excluded from this family. The PrpD enzyme of E. coli is responsible for the minor aconitase activity (AcnC) not accounted for by AcnA and AcnB. 0
39490 412644 cl00913 CbiC Precorrin-8X methylmutase. precorrin-8X methylmutase; Reviewed 0
39491 412645 cl00914 DUF61 Protein of unknown function DUF61. hypothetical protein; Provisional 0
39492 412646 cl00915 SpoVG SpoVG. Stage V sporulation protein G. Essential for sporulation and specific to stage V sporulation in Bacillus megaterium and subtilis. In B. subtilis, expression decreases after 30-60 minutes of cold shock. 0
39493 412647 cl00916 DUF371 Domain of unknown function (DUF371). Archaeal domain of unknown function. 0
39494 412648 cl00920 Cob_adeno_trans Cobalamin adenosyltransferase. This model represents an ATP:cob(I)alamin adenosyltransferase family corresponding to the N-terminal half of Salmonella PduO, a 1,2-propanediol utilization protein that probably is bifunctional. PduO represents one of at least three families of ATP:corrinoid adenosyltransferase: others are CobA (which partially complements PduO) and EutT. It was not clear originally whether ATP:cob(I)alamin adenosyltransferase activity resides in the N-terminal region of PduO, modeled here, but this has now become clear from the characterization of MeaD from Methylobacterium extorquens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 0
39495 412649 cl00921 Ribosomal_L31e Eukaryotic/archaeal ribosomal protein L31. 50S ribosomal protein L31e; Reviewed 0
39496 412650 cl00922 CbiJ Precorrin-6x reductase CbiJ/CobK. This enzyme catalyzes a step in cobalamin biosynthesis. It has been identified experimentally in Pseudomonas denitrificans and has been shown to be part of cobalamin biosynthetic operons in several other species. This enzyme was found to be a monomer by gel filtration. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin] 0
39497 412651 cl00927 Form_Nir_trans Formate/nitrite transporter. FocA (formate channel A) forms a pentameric formate-selective channel through the plasma membrane. The focA gene is largely restricted to Proteobacteria and occurs adjacent to genes for pyruvate formate lyase (PFL) and the PFL activase, a radical SAM protein. FocA is homologous to a nitrite transport protein, NirC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
39498 412652 cl00928 dsDNA_bind Double-stranded DNA-binding domain. This domain is believed to bind double-stranded DNA of 20 bases length. 0
39499 412653 cl00929 PIG-L GlcNAc-PI de-N-acetylase. Members of this protein family are BshB1 (YpjG), an enzyme of bacillithiol biosynthesis; either BshB1 or BshB2 (YojG) must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 0
39500 412654 cl00931 Ribosomal_S6e Ribosomal protein S6e. 30S ribosomal protein S6e; Validated 0
39501 412655 cl00932 Ribosomal_L37e Ribosomal protein L37e. 60S ribosomal protein L37; Provisional 0
39502 412656 cl00933 ClpS ATP-dependent Clp protease adaptor protein ClpS. ATP-dependent Clp protease adaptor protein ClpS; Reviewed 0
39503 412657 cl00934 CDH CDP-diacylglycerol pyrophosphatase. CDP-diacylglycerol pyrophosphatase; Provisional 0
39504 412658 cl00935 Brix Brix domain. ribosomal biogenesis protein; Validated 0
39505 412659 cl00936 RecX RecX family. recombination regulator RecX; Provisional 0
39506 412660 cl00937 Ribosomal_L21e Ribosomal protein L21e. 50S ribosomal protein L21e; Reviewed 0
39507 412661 cl00938 Rieske N/A. The rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. In hyperthermophilic archaea there is a SKTPCX(2-3)C motif at the C-terminus. The cysteines in this motif form a disulphide bridge, which stabilizes the protein. 0
39508 412662 cl00941 FeS_assembly_P Iron-sulfur cluster assembly protein. The function is unknown for this protein family, but members are found almost always in operons for the SUF system of iron-sulfur cluster biosynthesis. The SUF system is present elsewhere on the chromosome for those few species where SUF genes are not adjacent. This family shares this property of association with the SUF system with a related family, TIGR02945. TIGR02945 consists largely of a DUF59 domain (see pfam01883), while this protein is about double the length, with a unique N-terminal domain and DUF59 C-terminal domain. A location immediately downstream of the cysteine desulfurase gene sufS in many contexts suggests the gene symbol sufT. Note that some other homologs of this family and of TIGR02945, but no actual members of this family, are found in operons associated with phenylacetic acid (or other ring-hydroxylating) degradation pathways. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
39509 412663 cl00942 PCD_DCoH N/A. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerization cofactor of hepatocyte nuclear factor 1-alpha). 0
39510 412664 cl00943 DUF378 Domain of unknown function (DUF378). Predicted transmembrane domain of unknown function. The majority of the family have two predicted transmembrane regions. 0
39511 412665 cl00944 KdpC K+-transporting ATPase, c chain. potassium-transporting ATPase subunit C; Provisional 0
39512 412666 cl00945 Ribosomal_L18A Ribosomal proteins 50S-L18Ae/60S-L20/60S-L18A. 50S ribosomal protein LX; Validated 0
39513 412667 cl00946 zf-like Cysteine-rich small domain. Probable metal-binding domain. 0
39514 412668 cl00947 L-fuc_L-ara-isomerases N/A. L-Arabinose isomerase (AI) catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion into D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute. 0
39515 412669 cl00949 Acetyltransf_2 N-acetyltransferase. N-hydroxyarylamine O-acetyltransferase; Provisional 0
39516 412670 cl00951 SufE Fe-S metabolism associated domain. Members of this protein family are CsdE, formerly called YgdK. This protein, found as a paralog to SufE in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, works together and physically interacts with CsdA (a paralog of SufS). CsdA has cysteine desulfurase activity that is enhanced by this protein (CsdE), in which Cys-61 (numbered as in E. coli) is a sulfur acceptor site. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
39517 412671 cl00952 Ribosomal_L39 Ribosomal L39 protein. 50S ribosomal protein L39e; Validated 0
39518 412672 cl00954 GCS2 Glutamate-cysteine ligase family 2(GCS2). Family of bacterial glutamate-cysteine ligases (EC:6.3.2.2) that carry out the first step of the glutathione biosynthesis pathway. 0
39519 412673 cl00955 Ribosomal_L34e Ribosomal protein L34e. 60S ribosomal protein L34; Provisional 0
39520 412674 cl00957 Translin-like Translin and translin-associated factor-X (TRAX). Members of this family include Translin, which interacts with DNA and forms a ring around the DNA. This family also includes human TSNAX, which was found to interact with translin in a yeast two-hybrid screen. 0
39521 412675 cl00958 Nitrate_red_del Nitrate reductase delta subunit. Type II members of the DMSO reductase family are heterotrimeric proteins with bis(molybdopterin guanine dinucleotide)Mo, iron-sulfur, and heme b prosthetic groups bound by the alpha, beta, and gamma subunits respectively. Members of this protein family are not part of the mature protein, although they are the product of a fourth clustered gene. Proteins in this family are interpreted as a chaperone, analogous to NarJ of nitrate reductases. 0
39522 412676 cl00959 Nitrate_red_gam Nitrate reductase gamma subunit. Involved in anaerobic respiration, the gene product catalyzes the reaction (reduced acceptor + NO3- = Acceptor + nitrite). Another possible role_id for this gene product is in nitrogen fixation (Role_id:160). [Energy metabolism, Anaerobic] 0
39523 412677 cl00960 Fic Fic/DOC family. The characterized member of this family is the death-on-curing (DOC) protein of phage P1. It is part of a two protein operon with prevents-host-death (phd) that forms an addiction module. DOC lacks homology to analogous addiction module post-segregational killing proteins involved in plasmid maintenance. These modules work as a combination of a long lived poison (e.g. this protein) and a more abundant but shorter lived antidote. Members of this family have a well-conserved central motif HxFx[ND][AG]NKR. A similar region, with K replaced by G, is found in the huntingtin interacting protein (HYPE) family. [Unknown function, General] 0
39524 412678 cl00969 Ribosomal_S19e Ribosomal protein S19e. 30S ribosomal protein S19e; Provisional 0
39525 412679 cl00970 DUF996 Protein of unknown function (DUF996). Family of uncharacterized bacterial and archaeal proteins. 0
39526 412680 cl00973 AbiEii Nucleotidyl transferase AbiEii toxin, Type IV TA system. This family was recently identified as belonging to the nucleotidyltransferase superfamily. AbiEii is the cognate toxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 0
39527 412681 cl00977 Nop10p Nucleolar RNA-binding protein, Nop10p family. Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs. 0
39528 412682 cl00978 Transgly_assoc Transglycosylase associated protein. hypothetical protein; Provisional 0
39529 412683 cl00979 DUF402 Protein of unknown function (DUF402). hypothetical protein; Provisional 0
39530 412684 cl00983 Indigoidine_A Indigoidine synthase A like protein. Indigoidine is a blue pigment synthesized by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. The recommended name for this protein is now pseudouridine-5'-phosphate glycosidase. 0
39531 382321 cl00984 TM2 TM2 domain. TM2 domain-containing protein 0
39532 412685 cl00987 GrpB GrpB protein. This family has been suggested to belong to the nucleotidyltransferase superfamily. It occurs at the C-terminus of dephospho-CoA kinase (CoaE) in a number of cases, where it plays a role in the proper folding of the enzyme. 0
39533 412686 cl00989 DUF420 Protein of unknown function (DUF420). Predicted membrane protein with four transmembrane helices. 0
39534 412687 cl00990 DUF421 Protein of unknown function (DUF421). YDFR family 0
39535 412688 cl00991 Caroten_synth Carotenoid biosynthesis protein. The representative member of this family is CruF, a C50 carotenoid 2',3'-hydratase involved in the synthesis of the C50 carotenoid bacterioruberin in the halophilic archaeon Haloarcula japonica. 0
39536 412689 cl00993 Zn-ribbon_8 Zinc ribbon domain. This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 0
39537 412690 cl00994 CcmE CcmE. cytochrome c-type biogenesis protein CcmE; Reviewed 0
39538 412691 cl00995 PemK_toxin PemK-like, MazF-like toxin of type II toxin-antitoxin system. PemK is a growth inhibitor in E. coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. This family represents the toxin molecule of a typical bacterial toxin-antitoxin system pairing. The family includes a number of different toxins, such as MazF, Kid, PemK, ChpA, ChpB and ChpAK. 0
39539 412692 cl00997 Glyco_hydro_114 Glycoside-hydrolase family GH114. Original assignment of this protein family as cysteinyl-tRNA synthetase is controversial, supported by but challenged by and by subsequent discovery of the actual mechanism for synthesizing Cys-tRNA in species where a direct Cys--tRNA ligase was not found. Lingering legacy annotations of members of this family probably should be removed. Evidence against the role includes a signal peptide. This family has been renamed "extracellular protein" to facilitate correction. Members of this family occur in Deinococcus radiodurans (bacterial) and Methanococcus jannaschii (archaeal). A number of homologous but more distantly related proteins are annotated as alpha-1,4 polygalactosaminidases. The function remains unknown. [Unknown function, General] 0
39540 412693 cl00998 NTP_transf_9 Domain of unknown function (DUF427). This domain contains a beta-tent fold. 0
39541 412694 cl00999 YCII YCII-related domain. YciI-like protein; Reviewed 0
39542 412695 cl01001 YceI YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold. 0
39543 412696 cl01002 DUF808 Protein of unknown function (DUF808). hypothetical protein; Provisional 0
39544 412697 cl01005 SpoVS Stage V sporulation protein S (SpoVS). In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly. 0
39545 412698 cl01007 DAP_dppA Peptidase M55, D-aminopeptidase dipeptide-binding protein family. Bacillus subtilis DppA is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a 'self-compartmentalising protease', a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. The DppA enzyme is composed of identical 30 kDa subunits organized in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterized by the SXDXEG key sequence. The only known substrates are D-ala-D-ala and D-ala-gly-gly. 0
39546 412699 cl01008 DUF423 Protein of unknown function (DUF423). hypothetical protein; Provisional 0
39547 412700 cl01011 HupE_UreJ_2 HupE / UreJ protein. This family of proteins are hydrogenase / urease accessory proteins. The alignment contains many conserved histidines that are likely to be involved in nickel binding. The members usually have five membrane-spanning regions. 0
39548 412701 cl01012 Big_5 Bacterial Ig-like domain. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm. 0
39549 412702 cl01015 FUN14 FUN14 family. This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices. 0
39550 412703 cl01017 DUF2227 Uncharacterized metal-binding protein (DUF2227). Members of this family of hypothetical bacterial proteins possess metal binding properties; however, their exact function has not, as yet, been determined. 0
39551 412704 cl01020 ASCH N/A. The search results from NCBI sequence alignment indicate a conserved domain belonging to the ASCH superfamily. Dali searching results show that the protein is structurally similar to the PUA domain, suggesting it may be involved in RNA recognition. It has been reported that the deletion of PUA genes results in impaired growth (RluD) and competitive disadvantage (TruB) in Escherichia coli. Suggestions have been put forward that, apart from their usual catalytic role, certain PUS enzymes (e.g. TruB) may also act as chaperones for RNA folding. The interface interaction indicates that the biomolecule of protein NP_809782.1 should be a dimer. 0
39552 382342 cl01021 DUF424 Protein of unknown function (DUF424). This is a family of uncharacterized proteins. 0
39553 412705 cl01024 Sm_multidrug_ex Putative small multi-drug export protein. This family contains a small number of putative small multi-drug export proteins. 0
39554 412706 cl01027 DUF432 Protein of unknown function (DUF432). Archaeal protein of unknown function. 0
39555 412707 cl01030 DUF433 Protein of unknown function (DUF433). 0
39556 412708 cl01031 DUF86 Protein of unknown function DUF86. The function of members of this family is unknown. 0
39557 412709 cl01033 Ribosomal_L35Ae Ribosomal protein L35Ae. 60S ribosomal protein L35a; Provisional 0
39558 412710 cl01034 DUF2304 Uncharacterized conserved protein (DUF2304). Members of this family of hypothetical archaeal proteins have no known function. 0
39559 412711 cl01041 DUF441 Protein of unknown function (DUF441). Predicted to be an integral membrane protein. 0
39560 412712 cl01047 DUF386 Domain of unknown function (DUF386). This family consists of conserved hypothetical proteins, about 150 amino acids in length. Members with limited information include YhcH, a possible sugar isomerase of sialic acid catabolism, and YjgK. [Unknown function, General] 0
39561 412713 cl01048 Barstar_like N/A. Barstar_SaI14_like contains sequences that are similar to SaI14, an RNAase inhibitor, which are members of the Barstar family. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. The sequences in this subfamily are mostly uncharacterized, but believed to have a similar function and role. 0
39562 412714 cl01049 Zn_peptidase_2 Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142), of the characteristic HEXXH motif. 0
39563 412715 cl01051 Antibiotic_NAT Aminoglycoside 3-N-acetyltransferase. This family consists of bacterial aminoglycoside 3-N-acetyltransferases (EC:2.3.1.81), which catalyze the reaction: Acetyl-CoA + a 2-deoxystreptamine antibiotic <=> CoA + N3'-acetyl-2-deoxystreptamine antibiotic. The enzyme can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity; this inactivates and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others. 0
39564 412716 cl01052 FlgM Anti-sigma-28 factor, FlgM. FlgM interacts with and inhibits the alternative sigma factor sigma(28) FliA. The C-terminus of FlgM contains the sigma(28)-binding domain. 0
39565 412717 cl01053 SGNH_hydrolase N/A. This domain is mainly found in uncharacterized proteins around 290 residues in length and is mainly found in various Bacteroides species. It has a curved central beta sheet flanked by helices. Distant homolog analysis showed it has a similarity with the GDSL-like Lipase/Acylhydrolase family. The function of this domain is still unknown. 0
39566 412718 cl01054 HAMP Histidine kinase, Adenylyl cyclase, Methyl-accepting protein, and Phosphatase (HAMP) domain. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established. 0
39567 412719 cl01059 Adenine_glyco Methyladenine glycosylase. All proteins in this family are alkylation DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
39568 242278 cl01062 DUF452 Protein of unknown function (DUF452). 0
39569 412720 cl01063 DUF454 Protein of unknown function (DUF454). Predicted membrane protein. 0
39570 412721 cl01066 Trm112p Trm112p-like protein. The function of this family is uncertain. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C-terminus contains the strongest conservation. Trm112p is required for tRNA methylation in S. cerevisiae and is found in complexes with 2 tRNA methylases (TRM9 and TRM11) also with putative methyltransferase YDR140W. The zinc-finger protein Ynr046w is plurifunctional and a component of the eRF1 methyltransferase in yeast. The crystal structure of Ynr046w has been determined to 1.7 A resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three alpha-helices. 0
39571 412722 cl01067 Dyp_perox Dyp-type peroxidase family. A defined member of this superfamily is Dyp, a dye-decolorizing peroxidase that lacks a typical heme-binding region. A distinct, uncharacterized branch (TIGR01412) of this superfamily has a typical twin-arginine dependent signal sequence characteristic of exported proteins with bound redox cofactors. 0
39572 412723 cl01069 DUF456 Protein of unknown function (DUF456). This family is a putative membrane protein that contains glycine zipper motifs. 0
39573 412724 cl01070 DUF465 Protein of unknown function (DUF465). hypothetical protein; Provisional 0
39574 412725 cl01071 PCuAC Copper chaperone PCu(A)C. PCu(A)C is a periplasmic copper chaperone. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase. 0
39575 412726 cl01073 MlaA MlaA lipoprotein. MlaA is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. MlaA is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor. 0
39576 412727 cl01074 MlaC MlaC protein. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se. 0
39577 412728 cl01075 Cons_hypoth698 Conserved hypothetical protein 698. Members of this family are found so far only in one archaeal species, Archaeoglobus fulgidus, and in two related bacterial species, Haemophilus influenzae and Escherichia coli. It has 9 GES predicted transmembrane regions at conserved locations in all members. These proteins have a molecular weight of approximately 35 to 38 kDa. [Hypothetical proteins, Conserved] 0
39578 412729 cl01076 Peptidase_M78 IrrE N-terminal-like domain. This entry includes the catalytic domain of the protein ImmA, which is a metallopeptidase containing an HEXXH zinc-binding motif from peptidase family M78. ImmA is encoded on a conjugative transposon. Conjugating bacteria are able to transfer conjugative transposons that can, for example, confer resistance to antibiotics. The transposon is integrated into the chromosome, but during conjugation excises itself and then moves to the recipient bacterium and re-integrates into its chromosome. Typically a conjugative transposon encodes only the proteins required for this activity and the proteins that regulate it. During exponential growth, the ICEBs1 transposon of Bacillus subtilis is inactivated by the immunity repressor protein ImmR, which is encoded by the transposon and represses the genes for excision and transfer. Cleavage of ImmR relaxes repression and allows transfer of the transposon. ImmA has been shown to be essential for the cleavage of ImmR. This domain is also found in metalloprotease IrrE, a central regulator of DNA damage repair in Deinococcaceae, and in the HTH-type transcriptional regulators RamB and PrpC. 0
39579 412730 cl01077 SIMPL Protein of unknown function (DUF541). oxidative stress defense protein; Provisional 0
39580 412731 cl01078 UPF0114 Uncharacterized protein family, UPF0114. hypothetical protein; Provisional 0
39581 412732 cl01080 Prp-like ribosomal-processing cysteine protease Prp and similar proteins. This is a family of cysteine proteases that cleave the N-terminal extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease. 0
39582 412733 cl01081 FMN_bind FMN-binding domain. This model represents the NqrC subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds] 0
39583 412734 cl01082 Sel_put Selenoprotein, putative. This family is named KCU-star because nearly all member proteins end with tripeptide lysine-cysteine-selenocysteine, followed immediately by a stop codon (represented by an asterisk, or star). Members occur primarily in species of Helicobacter (although not Helicobacter pylori, in which selenocysteine incorporation capability has been lost) and Campylobacter. This small family belongs to the larger YbdD/YjiX (DUF466) family described by Pfam model PF04328. 0
39584 412735 cl01085 UPF0175 Uncharacterized protein family (UPF0175). This family contains small proteins of unknown function. 0
39585 294686 cl01087 MreD rod shape-determining protein MreD. Members of this protein family are the MreD protein of bacterial cell shape determination. Most rod-shaped bacteria depend on MreB and RodA to achieve either a rod shape or some other non-spherical morphology such as coil or stalk formation. MreD is encoded in an operon with MreB, and often with RodA and PBP-2 as well. It is highly hydrophobic (therefore somewhat low-complexity) and highly divergent, and therefore sometimes tricky to discover by homology, but this model finds most examples. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 0
39586 412736 cl01090 SlyX SlyX. hypothetical protein; Provisional 0
39587 412737 cl01093 Fer2_BFD BFD-like [2Fe-2S] binding domain. bacterioferritin-associated ferredoxin; Provisional 0
39588 412738 cl01097 DUF489 Protein of unknown function (DUF489). Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis. 0
39589 412739 cl01098 IspA Intracellular septation protein A. intracellular septation protein A; Reviewed 0
39590 412740 cl01101 DsrC DsrC like protein. Members of this protein family may be described as TusE, a partner to TusBCD in a sulfur relay system for 2-thiouridine biosynthesis, a tRNA base modification process. Other members are DsrC, a functionally similar protein in species where the sulfur relay system exists primarily for sulfur metabolism rather than tRNA base modification. Some members of this family are known explicitly as the gamma subunit of sulfite reductases. 0
39591 412741 cl01102 DUF493 Protein of unknown function (DUF493). hypothetical protein; Provisional 0
39592 412742 cl01103 DUF494 Protein of unknown function (DUF494). hypothetical protein; Validated 0
39593 412743 cl01104 Iron_traffic Bacterial Fe(2+) trafficking. oxidative damage protection protein; Provisional 0
39594 412744 cl01106 DNA_pol3_chi DNA polymerase III chi subunit, HolC. DNA polymerase III subunit chi; Validated 0
39595 412745 cl01107 DUF502 Protein of unknown function (DUF502). Predicted to be an integral membrane protein. 0
39596 412746 cl01108 BrnT_toxin Ribonuclease toxin, BrnT, of type II toxin-antitoxin system. BrnT is a ribonuclease toxin of a type II toxin-antitoxin system that exhibits a RelE-like fold. The antitoxin that neutralizes this toxin is pfam14384. BrnT is found in bacteria, archaea, bacteriophage, and plasmids. BrnT-BrnA forms a 2:2 tetrameric complex and autoregulates its own expression, which is induced by a number of different environmental stresses. Expression of BrnT alone results in cessation of bacterial growth which can be rescued after subsequent expression of BrnA. 0
39597 412747 cl01109 SYLF The SYLF domain (also called DUF500), a novel lipid-binding module. Ysc84 is a family of Las17-binding proteins found in metazoa. Together, Las17 and Ysc84 are essential for proper polymerization of actin; Ysc84 is able to bind to and stabilize the actin dimer presented by Las17 and thereby promote polymerization. An active actin cytoskeleton is necessary for adequate endocytosis. (pfam00018), or a FYVE zinc finger (pfam01363). 0
39598 412748 cl01110 Sdh5 Flavinator of succinate dehydrogenase. This family includes the highly conserved mitochondrial and bacterial proteins Sdh5/SDHAF2/SdhE. Both yeast and human Sdh5/SDHAF2 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans. Bacterial homologs of Sdh5, termed SdhE, are functionally conserved, being required for the flavinylation of SdhA and succinate dehydrogenase activity. Like Sdh5, SdhE interacts with SdhA. Furthermore, SdhE was characterized as a FAD co-factor chaperone that directly binds FAD to facilitate the flavinylation of SdhA. Phylogenetic analysis demonstrates that SdhE/Sdh5 proteins evolved only once in an ancestral alpha-proteobacteria prior to the evolution of the mitochondria and now remain in subsequent descendants including eukaryotic mitochondria and the alpha, beta and gamma proteobacteria. This family was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true. The E. coli protein, YgfY, also acts as the antitoxin to the membrane-bound toxin family Cpta, pfam13166, whose E. coli member YgfX is expressed from the same operon as YgfY. 0
39599 412749 cl01112 DUF507 Protein of unknown function (DUF507). Bacterial protein of unknown function. 0
39600 412750 cl01115 BMFP Membrane fusogenic activity. BMFP consists of two structural domains, a coiled-coil C-terminal domain via which the protein self-associates as a trimer, and an N-terminal domain disordered at neutral pH but adopting an amphipathic alpha-helical structure in the presence of phospholipid vesicles, high ionic strength, acidic pH or SDS. BMFP interacts with phospholipid vesicles through the predicted amphipathic alpha-helix induced in the N-terminal half of the protein and promotes aggregation and fusion of vesicles in vitro. 0
39601 412751 cl01118 ThrE Putative threonine/serine exporter. ThrE_2 is a family of membrane proteins involved in the export of threonine and serine. L-threonine, L-serine are both substrates for the exporter. The exporter exhibits nine-ten predicted transmembrane-spanning helices with long charged C and N termini and an amphipathic helix present within the N-terminus. L-Threonine can be made by the amino acid-producing bacterium Corynebacterium glutamicum, but the potential for amino acid formation can be considerably improved by reducing its intracellular degradation into glycine and increasing its export by this exporter. Members of the family are found in Bacteria, Archaea, and the fungal kingdoms, and the family can exist either as a single long polypeptide chain or as two short polypeptides. All family members show an extended hydrophilic N-terminal domain with weak sequence similarity to portions of hydrolases (proteases, peptidases, and glycosidases); this suggests that since this region is cytoplasmic to the membrane it may be generating the transport substrate, so may imply that threonine may not be the primary substrate and the ThrE has a subsidiary function. 0
39602 412752 cl01119 DUF525 ApaG domain. CO2+/MG2+ efflux protein ApaG; Reviewed 0
39603 412753 cl01120 SspB Stringent starvation protein B. ClpXP protease specificity-enhancing factor; Provisional 0
39604 412754 cl01122 RdgC Putative exonuclease, RdgC. recombination associated protein; Reviewed 0
39605 412755 cl01123 Fe-S_assembly Iron-sulphur cluster assembly. hypothetical protein; Provisional 0
39606 412756 cl01125 LptE Lipopolysaccharide-assembly. LPS-assembly lipoprotein RlpB; Provisional 0
39607 412757 cl01126 EI24 Etoposide-induced protein 2.4 (EI24). putative sulfate transport protein CysZ; Validated 0
39608 412758 cl01128 DUF535 Protein of unknown function (DUF535). Family member Shigella flexneri VirK is a virulence protein required for the expression, or correct membrane localization, of IcsA (VirG) on the bacterial cell surface. This family also includes Pasteurella haemolytica lapB, which is thought to be membrane-associated. 0
39609 412759 cl01129 NqrM (Na+)-NQR maturation NqrM. The NqrM gene is often found adjacent to the nqr operons that encode (Na+)-NQR subunits. It is involved in the maturation of (Na+) translocating NADH:quinone oxidoreductase in proteobacteria. The four conserved Cys residues found in NqrM are required for (Na+)- NQR maturation and may serve as ligands for a metal ion or metal cluster used to build up the (Na+)-NQR molecule. 0
39610 412760 cl01131 HlyC RTX toxin acyltransferase family. Members of this family are enzymes (EC:2.3.1.-) involved in fatty acylation of the protoxins (HlyA) at lysine residues, thereby converting them to the active toxin. Acyl-acyl carrier protein (ACP) is the essential acyl donor. This family shows a number of conserved residues that are possible candidates for participation in acyl transfer. Site-directed mutagenesis of the single conserved histidine residue in Escherichia coli HlyC resulted in complete inactivation of the enzyme. 0
39611 412761 cl01132 FA_hydroxylase Fatty acid hydroxylase superfamily. beta-carotene hydroxylase 0
39612 412762 cl01133 Na_H_Exchanger Sodium/hydrogen exchanger family. This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyze the exchange of H+ for Na+ in a manner that is highly dependent on the pH. 0
39613 412763 cl01135 ABC_trans_aux ABC-type transport auxiliary lipoprotein component. ABC_trans_aux is a family of bacterial proteins that act as auxiliaries to the ABC-transporter in the gamma-hexachlorocyclohexane uptake permease system in Sphingobium japonicum. Gamma-hexachlorocyclohexane, or Lindane, can be used as the sole source of carbon in S. japonicum in aerobic conditions. Lindane is an insecticide. 0
39614 412764 cl01136 DUF393 Protein of unknown function, DUF393. Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown. 0
39615 412765 cl01137 YfbU YfbU domain. This presumed domain is about 160 residues long. It is found in archaebacteria and eubacteria. In Corynebacterium glutamicum Ycg4L it is associated with a helix-turn-helix domain. This suggests that this may be a ligand binding domain. 0
39616 412766 cl01139 Cofac_haem_bdg Haem-binding uptake, Tiki superfamily, ChaN. This is a family of putative bacterial lipoproteins necessary for the uptake of haem-iron. The structure of UniProtKB:Q0PBW2, Structure 2g5g, comprises a large parallel beta-sheet with flanking alpha-helices and a smaller domain consisting of alpha-helices. Two cofacial haem groups (~3.5 Angstrom apart with an inter-iron distance of 4.4 Angstrom) bind in a pocket formed by a dimer of two ChaN monomers. 0
39617 412767 cl01143 H2O2_YaaD Peroxide stress protein YaaA. YaaA is a key element of the stress response to H2O2. It acts by reducing intracellular iron levels after peroxide stress, thereby attenuating the Fenton reaction and the DNA damage that this would cause. The molecular mechanism of action is not known. 0
39618 412768 cl01144 YacG DNA gyrase inhibitor YacG. zinc-binding protein; Provisional 0
39619 412769 cl01146 ZapA Cell division protein ZapA. cell division protein ZapA; Provisional 0
39620 412770 cl01147 YjgA-like uncharacterized proteins similar to Escherichia coli YjgA. This family of bacterial proteins has no known function. 0
39621 412771 cl01148 FxsA FxsA cytoplasmic membrane protein. phage T7 F exclusion suppressor FxsA; Reviewed 0
39622 412772 cl01153 NapB Nitrate reductase cytochrome c-type subunit (NapB). The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase. 0
39623 294724 cl01162 DUF417 Protein of unknown function, DUF417. This family of uncharacterized proteins appears to be restricted to proteobacteria. 0
39624 412773 cl01163 NapD NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase. 0
39625 412774 cl01164 Slp Outer membrane lipoprotein Slp family. Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli (which also contains a close paralog), Haemophilus influenzae, Pasteurella multocida and Vibrio cholerae. The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from Escherichia coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation. [Cell envelope, Other] 0
39626 412775 cl01166 DUF416 Protein of unknown function (DUF416). This is a bacterial protein family of unknown function. Proteins in this family adopt an alpha helical structure. Genome context analysis has suggested a high probability of a functional association with histidine kinases, which implicates proteins in this family to play a role in signalling (information from TOPSAN 2Q9R). 0
39627 412776 cl01171 RelB RelB antitoxin. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. The resulting model appears to describe a narrower set of proteins than pfam04221, although many in the scope of this model are not obviously paired with toxin proteins. Several toxin/antitoxin pairs may occur in a single species. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 0
39628 412777 cl01172 YihI Der GTPase activator (YihI). YihI activates the GTPase activity of Der, a 50S ribosomal subunit stability factor. The stimulation is specific to Der as YihI does not stimulate the GTPase activity of Era or ObgE. The interaction of YihI with Der requires only the C-terminal 78 amino acids of YihI. A yihI deletion mutant is viable and shows a shorter lag period, but the same post-lag growth rate as a wild-type strain. yihI is expressed during the lag period. Overexpression of yihI inhibits cell growth and biogenesis of the 50S ribosomal subunit. YihI is an unusual, highly hydrophilic protein with an uneven distribution of charged residues, resulting in an N-terminal region with high pI and a C-terminal region with low pI. 0
39629 412778 cl01173 UPF0149 Uncharacterized protein family (UPF0149). This family resembles pfam03695 (version pfam03695.3), uncharacterised protein family UPF0149, but is broader in scope and includes additional proteins. It includes E. coli proteins YgfB and YecA. The function of this family of proteins is unknown. The crystal structure is known for the member from Haemophilus influenzae (Ygfb, HI0817). [Unknown function, General] 0
39630 412779 cl01175 DUF1414 Protein of unknown function (DUF1414). hypothetical protein; Provisional 0
39631 412780 cl01178 RseC_MucC Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions). In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, which is itself involved in thiamine biosynthesis. 0
39632 412781 cl01179 CcmH Cytochrome C biogenesis protein. [Energy metabolism, Electron transport] 0
39633 412782 cl01180 UPF0270 Uncharacterized protein family (UPF0270). hypothetical protein; Provisional 0
39634 382420 cl01181 DctQ Tripartite ATP-independent periplasmic transporters, DctQ component. 2,3-diketo-L-gulonate TRAP transporter small permease protein YiaM; Provisional 0
39635 412783 cl01183 DUF412 Protein of unknown function, DUF412. hypothetical protein; Provisional 0
39636 412784 cl01184 SirB Invasion gene expression up-regulator, SirB. SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown. 0
39637 412785 cl01187 DUF446 tRNA pseudouridine synthase C. This family is suggested to be the catalytic domain of tRNA pseudouridine synthase C by association. The structure has been solved for one member, as Structure 2HGK, which by inference is designated in this way. 0
39638 412786 cl01190 EpmC Elongation factor P hydroxylase. This family catalyzes the final step in the elongation factor P modification pathway. It hydroxylates Lys-34 of elongation factor P. Members of this family have a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. 0
39639 412787 cl01193 DUF463 YcjX-like family, DUF463. These proteins possess a P-loop motif. 0
39640 412788 cl01203 ACP_PD Acyl carrier protein phosphodiesterase. YajB, now renamed acpH, encodes an ACP hydrolase that converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine prosthetic group from ACP. 0
39641 412789 cl01204 COX4_pro Prokaryotic Cytochrome C oxidase subunit IV. This family (QoxD) encodes subunit IV of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport] 0
39642 412790 cl01209 DUF480 Protein of unknown function, DUF480. hypothetical protein; Provisional 0
39643 412791 cl01213 DUF481 Protein of unknown function, DUF481. This family includes several proteins of uncharacterized function. 0
39644 412792 cl01215 DUF1315 Protein of unknown function (DUF1315). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown. 0
39645 412793 cl01217 YebG YebG protein. DNA damage-inducible protein YebG; Provisional 0
39646 412794 cl01219 CheZ Chemotaxis phosphatase, CheZ. chemotaxis regulator CheZ; Provisional 0
39647 412795 cl01221 DTW DTW domain. This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. 0
39648 412796 cl01222 T2SSM Type II secretion system (T2SS), protein M. This family of membrane proteins consists of Type II secretion system protein M sequences from several Gram-negative (diderm) bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the T2SM (EpsM) protein interacts with the T2SL (EpsL) protein, and also forms homodimers. 0
39649 412797 cl01223 DUF1249 Protein of unknown function (DUF1249). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 0
39650 412798 cl01224 DUF805 Protein of unknown function (DUF805). This family consists of several bacterial proteins of unknown function. 0
39651 412799 cl01225 SCP2 SCP-2 sterol transfer family. This domain is found at the C-terminus of alkyl sulfatases. Together with the N-terminal catalytic domain, this domain forms a hydrophobic chute and may recruit hydrophobic substrates. 0
39652 412800 cl01226 T6SS_HCP Type VI secretion system effector, Hcp. This family includes Hcp1 (hemolysin coregulated protein 1), an exported, homohexameric ring-forming virulence protein from Pseudomonas aeruginosa. Hcp1 lacks a conventional signal sequence and is instead exported by means of the type VI secretion system, encoded by a pathogenicity cluster of a class previously designated IAHP (IcmF-associated homologous protein). Homologs of Hcp1, in this protein family, are found in various bacteria of which most but not all are known pathogens. Pathogens may have multiple members of this family, with three to ten in Erwinia carotovora, Yersinia pestis, uropathogenic Escherichia coli, and the insect pathogen Photorhabdus luminescens. [Cellular processes, Pathogenesis] 0
39653 382439 cl01230 Chor_lyase Chorismate lyase. This is a family of uncharacterized proteins. 0
39654 412801 cl01231 DUF485 Protein of unknown function, DUF485. This family includes several putative integral membrane proteins. 0
39655 412802 cl01234 PilO Pilus assembly protein, PilO. The T2SMb family is conserved in Proteobacteria and Actinobacteria, and differs from the T2SM proteins in Vibrio spp. (pfam04612). 0
39656 412803 cl01236 DMT_6 Putative member of DMT superfamily (DUF486). This family contains several proteins of uncharacterized function. The family is represented in the Transport classification database as 2.A.7.34, though the exact nature of what is transported is not known. 0
39657 412804 cl01237 DUF469 Protein with unknown function (DUF469). hypothetical protein; Provisional 0
39658 412805 cl01240 CtaG_Cox11 Cytochrome c oxidase assembly protein CtaG/Cox11. cytochrome C oxidase assembly protein; Provisional 0
39659 412806 cl01244 arom_aa_hydroxylase N/A. This family includes phenylalanine-4-hydroxylase, the phenylketonuria disease protein. 0
39660 412807 cl01245 META META domain. Small domain family found in proteins of unknown function. Some are secreted and implicated in motility in bacteria. Also occurs in Leishmania spp. as an essential gene. Over-expression in L. amazonensis increases virulence. A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond. 0
39661 382447 cl01246 DUF488 Protein of unknown function, DUF488. This family includes several proteins of uncharacterized function. 0
39662 412808 cl01247 FliO Flagellar biosynthesis protein, FliO. This short protein found in flagellar biosynthesis operons contains a highly hydrophobic N-terminal sequence followed generally by two basic amino acids. This region is reminiscent of but distinct from the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ" but phylogenetic tree building supports a single FliO family. 0
39663 412809 cl01248 EutH Ethanolamine utilisation protein, EutH. ethanolamine utilization protein EutH; Provisional 0
39664 412810 cl01249 Haem_degrading Haem-degrading. hypothetical protein; Provisional 0
39665 412811 cl01250 Ureidogly_lyase Ureidoglycolate lyase. Ureidoglycolate lyase (EC:4.3.2.3) is one of the enzymes that acts upon ureidoglycolate, an intermediate of purine catabolism, releasing urea. The enzyme has in the past been wrongly assigned to EC:3.5.3.19, enzymes which release ammonia from ureidoglycolate. 0
39666 412812 cl01251 OHCU_decarbox OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. This model is a separate (but related) clade from that represented by TIGR3164. This model places a second homolog in streptomyces species (which is not in the vicinity of other urate catabolism associated genes) below the trusted cutoff. 0
39667 412813 cl01252 UPF0167 Uncharacterized protein family (UPF0167). The proteins in this family are about 200 amino acids long and each contain 3 CXXC motifs. 0
39668 412814 cl01253 FixS Cytochrome oxidase maturation protein cbb3-type. CcoS from Rhodobacter capsulatus has been shown essential for incorporation of redox-active prosthetic groups (heme, Cu) into cytochrome cbb(3) oxidase. FixS of Bradyrhizobium japonicum appears to have the same function. Members of this family are found so far in organisms with a cbb3-type cytochrome oxidase, including Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Caulobacter crescentus, Bradyrhizobium japonicum, and Rhodobacter capsulatus. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair] 0
39669 412815 cl01255 DAGK_cat Diacylglycerol kinase catalytic domain. Members of this family include ATP-NAD kinases EC:2.7.1.23, which catalyzes the phosphorylation of NAD to NADP utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. Also includes NADH kinases EC:2.7.1.86. 0
39670 412816 cl01256 NMN_transporter Nicotinamide mononucleotide transporter. The PnuC protein of E. coli is a membrane protein responsible for nicotinamide mononucleotide transport, subject to regulation by interaction with the NadR (also called NadI) protein (see TIGR01526). This model defines a region corresponding to most of the length of PnuC, found primarily in pathogens. The extreme N- and C-terminal regions are poorly conserved and not included in the alignment and model. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 0
39671 412817 cl01257 DUF2061 Predicted membrane protein (DUF2061). This domain, found in various prokaryotic proteins, has no known function. 0
39672 412818 cl01258 NnrS NnrS protein. This family consists of several bacterial NnrS like proteins. NnrS is a putative heme-Cu protein (NnrS) and a member of the short-chain dehydrogenase family. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase. 0
39673 412819 cl01260 PilZ PilZ domain. This domain is related to Type IV pilus assembly protein PilZ (pfam07238). It is found in at least 12 copies in Myxococcus xanthus DK 1622. 0
39674 412820 cl01261 DUF2062 Uncharacterized protein conserved in bacteria (DUF2062). Members of this family are uncharacterized proteins, usually encoded by a gene adjacent to a member of family TIGR03545, which is also uncharacterized. 0
39675 412821 cl01262 DUF2063 Putative DNA-binding domain. This family represents the N-terminal part of a Neisseria protein, UniProtKB:Q5F5I0, Structure 3dee. It runs from residues 31-117 as a helical bundle with 4 main helices. From genomic context and the fold of the C-terminal part, it is suggested that this protein is involved in transcriptional regulation. 0
39676 412822 cl01264 PsiE Phosphate-starvation-inducible E. phosphate-starvation-inducible protein PsiE; Provisional 0
39677 412823 cl01267 Peptidase_M90 Glucose-regulated metallo-peptidase M90. DgsA anti-repressor MtfA; Provisional 0
39678 412824 cl01275 DUF2065 Uncharacterized protein conserved in bacteria (DUF2065). This domain, found in various prokaryotic proteins, has no known function. 0
39679 412825 cl01279 MbtH MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters. 0
39680 412826 cl01280 Chlor_dismutase Chlorite dismutase. putative heme peroxidase; Provisional 0
39681 412827 cl01281 rhaM L-rhamnose mutarotase. Members of this protein family are rhamnose mutarotase from Escherichia coli, previously designated YiiL as an uncharacterized protein, and close homologs also associated with rhamnose dissimilation operons in other bacterial genomes. Mutarotase is a term for an epimerase that changes optical activity. This enzyme was shown experimentally to interconvert alpha and beta stereoisomers of the pyranose form of L-rhamnose. The crystal structure of this small (104 amino acid) protein shows a locally asymmetric dimer with active site residues of His, Tyr, and Trp. [Energy metabolism, Sugars] 0
39682 412828 cl01282 TRAM TRAM domain. This small domain has no known function. However it may perform a nucleic acid binding role (Bateman A. unpublished observation). 0
39683 412829 cl01284 DUF1722 Protein of unknown function (DUF1722). hypothetical protein; Provisional 0
39684 412830 cl01285 Gar1 Gar1/Naf1 RNA binding region. H/ACA RNA-protein complex component Gar1; Reviewed 0
39685 412831 cl01287 AE_Prim_S_like N/A. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities. 0
39686 382472 cl01288 DUF2067 Uncharacterized protein conserved in archaea (DUF2067). This domain, found in various archaeal proteins, has no known function. 0
39687 412832 cl01294 Baseplate_J Baseplate J-like protein. This family consists of a large, conserved hypothetical protein in phage tail-like regions of at least six bacterial genomes: Gloeobacter violaceus PCC 7421, Geobacter sulfurreducens PCA, Streptomyces coelicolor A3(2), Streptomyces avermitilis MA-4680, Mesorhizobium loti, and Myxococcus xanthus. The C-terminal region is identified by the broader model pfam04865 as related to baseplate protein J from phage P2, but that relationship is not observed directly. [Mobile and extrachromosomal element functions, Prophage functions] 0
39688 382474 cl01298 Glyco_transf_25 N/A. Members of this family belong to Glycosyltransferase family 25 This is a family of glycosyltransferases involved in lipopolysaccharide (LPS) biosynthesis. These enzymes catalyze the transfer of various sugars onto the growing LPS chain during its biosynthesis. 0
39689 412833 cl01299 DUF2069 Predicted membrane protein (DUF2069). This domain, found in various prokaryotes, has no known function. 0
39690 412834 cl01301 DUF1415 Protein of unknown function (DUF1415). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 0
39691 412835 cl01304 DUF1289 Protein of unknown function (DUF1289). This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N-terminus. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 0
39692 412836 cl01308 CHASE4 CHASE4 domain. CHASE4. This is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in Archaea and in predicted diguanylate cyclases/phosphodiesterases in Bacteria. Environmental factors that are recognized by CHASE4 domains are not known at this time. 0
39693 412837 cl01311 DUF1294 Protein of unknown function (DUF1294). This family includes a number of hypothetical bacterial and archaeal proteins of unknown function. 0
39694 412838 cl01312 Sbt_1 Na+-dependent bicarbonate transporter superfamily. Family of bacterial proteins that are likely to be part of the Na(+)-dependent bicarbonate transporter (sbt) family. Members carry 10TMS in a 5+5 duplicated structure. The loop between helices 5 and 6 in Synechocystis PCC6803 is likely to be the location for regulatory mechanisms governing the activation of the transporter. 0
39695 412839 cl01314 RecU Recombination protein U. Holliday junction-specific endonuclease; Reviewed 0
39696 412840 cl01315 TANGO2 Transport and Golgi organisation 2. In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse Tango2 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif. This gene has been characterized in Drosophila melanogaster and named as transport and Golgi organisation 2, hence the name Tango2. 0
39697 412841 cl01317 Cmr5_III-B CRISPR/Cas system-associated protein Cmr5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species. 0
39698 412842 cl01318 DUF1232 Protein of unknown function (DUF1232). This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function. 0
39699 412843 cl01319 HARE-HTH HB1, ASXL, restriction endonuclease HTH domain. Members of this family are the RNA polymerase delta subunit, as found in the Firmicutes and the Mollicutes. All members of the seed alignment have an extended C-terminal low-complexity region, consisting largely of Asp and Glu, that is not included in the model. Proteins giving borderline scores should be checked to confirm a similar acidic C-terminal domain. [Transcription, DNA-dependent RNA polymerase] 0
39700 412844 cl01321 SURF1 N/A. SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder. 0
39701 412845 cl01327 DUF1684 Protein of unknown function (DUF1684). The sequences featured in this family are found in hypothetical archaeal and bacterial proteins of unknown function. The region in question is approximately 200 amino acids long. 0
39702 412846 cl01328 Dodecin Dodecin. Dodecin is a flavin-binding protein, found in several bacteria and a few archaea, and represents a stand-alone version of the SHS2 domain. It most closely resembles the SHS2 domains of FtsA and Rpb7p, and represents a single domain small-molecule binding form. 0
39703 412847 cl01329 DUF2071 Uncharacterized conserved protein (COG2071). This conserved protein (similar to YgjF), found in various prokaryotes, has no known function. 0
39704 412848 cl01330 IMP_cyclohyd IMP cyclohydrolase-like protein. This model represents IMP cyclohydrolase, the final step in the biosynthesis of inosine monophosphate (IMP) in archaea. In bacteria this step is catalyzed by a bifunctional enzyme (purH). 0
39705 412849 cl01332 DUF2073 Uncharacterized protein conserved in archaea (DUF2073). This archaeal protein has no known function. 0
39706 412850 cl01339 DUF1805 Domain of unknown function (DUF1805). This domain is found in bacteria and archaea and has an N terminal tetramerisation region that is composed of beta sheets. 0
39707 412851 cl01342 Peptidase_A22B Signal peptide peptidase. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in-vivo. This family also contains SPE proteins from C.elegans. 0
39708 412852 cl01346 PaaA_PaaC Phenylacetic acid catabolic protein. Members of this protein family are BoxB, the B subunit of benzoyl-CoA oxygenase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. [Energy metabolism, Other] 0
39709 412853 cl01349 YqcI_YcgG YqcI/YcgG family. This family of proteins is functionally uncharacterized. The family includes YqcI and YcgG from B. subtilis. The alignment contains a conserved FPC motif at the N-terminus and CPF at the C-terminus. 0
39710 412854 cl01350 FTCD_C Formiminotransferase-cyclodeaminase. Members of this family are thought to be Formiminotransferase-cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family. 0
39711 412855 cl01351 Glyco_hydro_8 Glycosyl hydrolases family 8. 0
39712 412856 cl01356 DUF1508 Domain of unknown function (DUF1508). This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical. 0
39713 412857 cl01359 OpcA_G6PD_assem Glucose-6-phosphate dehydrogenase subunit. Members of this family are found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. The exact function of the domain is, as yet, unknown. 0
39714 412858 cl01360 Pilin_N Archaeal Type IV pilin, N-terminal. This entry represents the N-terminal domain of archaeal pilins, which play important roles in surface adhesion and twitching motility. This domain contains a conserved N-terminal hydrophobic motif. 0
39715 412859 cl01365 ZinT ZinT (YodA) periplasmic lipocalin-like zinc-recruitment. zinc/cadmium-binding protein; Provisional 0
39716 412860 cl01368 GyrI-like GyrI-like small molecule binding domain. This family contains Cass2 from Vibrio cholerae, an integron-associated protein that has been shown to bind cationic drug compounds with submicromolar affinity. Cass2 has been proposed to be representative of a larger family of independent effector-binding proteins associated with lateral gene transfer within Vibrio and other closely-related species. 0
39717 412861 cl01369 CHASE CHASE domain. Predicted to be a ligand binding domain. 0
39718 412862 cl01370 DotU Type VI secretion system protein DotU. At least two families of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family (TIGR03348). The other is the family described by this model. Members include DotU from the Legionella pneumophila type IV secretion system. Many of the members of this protein family from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39719 412863 cl01371 PaaB Phenylacetic acid degradation B. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other] 0
39720 412864 cl01377 Iron_transport Fe2+ transport protein. This is a bacterial family of periplasmic proteins that are thought to function in high-affinity Fe2+ transport. 0
39721 412865 cl01378 LicD LicD family. The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily. 0
39722 412866 cl01379 TSPO_MBR Translocator protein (TSPO)/peripheral-type benzodiazepine receptor (MBR) family. Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown to not only retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues. 0
39723 412867 cl01380 DUF1440 Protein of unknown function (DUF1440). This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins. 0
39724 412868 cl01381 zinc_ribbon_13 Nucleic-acid-binding protein containing Zn-ribbon domain (DUF2082). This domain, found in various hypothetical prokaryotic proteins, as well as some Zn-ribbon nucleic-acid-binding proteins has no known function. 0
39725 412869 cl01382 PAD Phenolic Acid Decarboxylase. This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. The Phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD. 0
39726 412870 cl01385 DUF1244 Protein of unknown function (DUF1244). This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown. 0
39727 412871 cl01386 2HCT 2-hydroxycarboxylate transporter family. These proteins are members of the Citrate:Cation Symporter (CCS) Family (TC 2.A.24). These proteins have 12 GES predicted transmembrane regions. Most members of the CCS family catalyze citrate uptake with either Na+ or H+ as the cotransported cation. However, one member is specific for L-malate and probably functions by a proton symport mechanism. [Unclassified, Role category not yet assigned] 0
39728 412872 cl01387 DUF3299 Protein of unknown function (DUF3299). This is a family of bacterial proteins of unknown function. 0
39729 412873 cl01389 Phage_sheath_1 Phage tail sheath protein subtilisin-like domain. major tail sheath protein; Provisional 0
39730 412874 cl01390 Phage_tube Phage tail tube protein FII. major tail tube protein; Provisional 0
39731 412875 cl01391 Phage_P2_GpU Phage P2 GpU. This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein, which is thought to be involved in tail assembly. 0
39732 412876 cl01393 DUF952 Protein of unknown function (DUF952). This family consists of several hypothetical bacterial and plant proteins of unknown function. 0
39733 412877 cl01397 DUF1349 Protein of unknown function (DUF1349). This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown. 0
39734 412878 cl01402 T6SS_VipA Type VI secretion system, VipA, VC_A0107 or Hcp2. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39735 412879 cl01403 GPW_gp25 Gene 25-like lysozyme. Some members of this family of proteins are annotated as phage-related (xkdS); however, there is currently no known function. 0
39736 412880 cl01404 T6SS_TssG Type VI secretion, TssG. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39737 412881 cl01405 T6SS-SciN Type VI secretion lipoprotein, VasD, EvfM, TssJ, VC_A0113. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39738 412882 cl01406 T6SS_VasE Bacterial Type VI secretion, VC_A0110, EvfL, ImpJ, VasE. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39739 412883 cl01407 Rdx Rdx family. This model represents a domain found in both bacteria and animals, including animal proteins SelT, SelW, and SelH, all of which are selenoproteins. In a CXXC motif near the N-terminus of the domain, selenocysteine may replace the second Cys. Proteins with this domain may include an insert of about 70 amino acids. This model is broader than the current SelW model pfam05169 in Pfam. 0
39740 412884 cl01408 AAL_decarboxy Alpha-acetolactate decarboxylase. Pyruvate can be fermented to 2,3-butanediol. It is first converted to alpha-acetolactate by alpha-acetolactate synthase, then decarboxylated to acetoin by this enzyme. Acetoin can be reduced in some species to 2,3-butanediol by acetoin reductase. [Energy metabolism, Fermentation] 0
39741 412885 cl01409 DUF2219 Uncharacterized protein conserved in bacteria (DUF2219). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39742 412886 cl01410 DUF2387 Probable metal-binding protein (DUF2387). Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various Proteobacteria. 0
39743 412887 cl01411 QSregVF_b Putative quorum-sensing-regulated virulence factor. QSregVF_b is a family of short Pseudomonas proteins that are potential virulence factors. The structure of UniProtKB:Q9HY15 a secreted protein has been solved and deposited as Structure 3npd, from pfam13652. It is predicted that these two adjacent proteins form a single transcriptional unit based on the prediction that together they interact with their adjacent protein PotD, which is the putrescine-binding periplasmic protein in the polyamine uptake system comprising PotABCD. These two adjacent proteins are predicted to be quorum-sensing-regulated virulence factors. 0
39744 412888 cl01412 Alpha-L-AF_C Alpha-L-arabinofuranosidase C-terminal domain. This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. 0
39745 412889 cl01414 DUF971 Protein of unknown function (DUF971). This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown. 0
39746 412890 cl01416 Fimbrial Fimbrial protein. FimA is a family of Gram-negative fimbrial component A proteins that form part of the pili. There are usually up to 1000 copies of this subunit in one pilus that form a helically wound rod onto which the tip fibrillum (FimF, FimG, FimH) is attached. Pilus subunits are translocated from the cytoplasm to the periplasm via the general secretory pathway SecYEG. 0
39747 412891 cl01417 Nuc-transf Predicted nucleotidyltransferase. hypothetical protein; Provisional 0
39748 412892 cl01419 DUF1284 Protein of unknown function (DUF1284). This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins. 0
39749 412893 cl01421 DUF1211 Protein of unknown function (DUF1211). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins. 0
39750 412894 cl01424 DUF2218 Uncharacterized protein conserved in bacteria (DUF2218). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39751 412895 cl01425 Glycolipid_bind Putative glycolipid-binding. This family has a novel fold known as a spiral beta-roll, consisting of a 15-stranded beta sheet wrapped around a single alpha helix. It forms dimers. It has some structural similarity to the E. coli lipoprotein localization factors LolA and LolB. Its structure suggests that it may have a role in glycolipid binding. Its genomic context supports a role in glycolipid metabolism. 0
39752 412896 cl01427 DUF2214 Predicted membrane protein (DUF2214). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39753 412897 cl01430 AntA AntA/AntB antirepressor. In E. coli the two proteins AntA and AntB have 62% amino acid identities near their N termini. AntA appears to be encoded by a truncated and divergent copy of AntB. The two proteins are homologous to putative antirepressors found in numerous bacteriophages, such as the hypothetical antirepressor protein encoded by the gene LO142 of the bacteriophage 933W. 0
39754 412898 cl01432 DUF779 Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function. 0
39755 412899 cl01435 NTP_transf_6 Nucleotidyltransferase. This family consists of several hypothetical bacterial proteins of unknown function. This family was recently identified as belonging to the nucleotidyltransferase superfamily. 0
39756 412900 cl01438 zf-AN1 AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. 0
39757 412901 cl01439 3D_domain 3D domain, named for 3 conserved aspartate residues, is found in mltA-like lytic transglycosylases and numerous other contexts. This short presumed domain contains three conserved aspartate residues, hence the name 3D. It has been shown to be part of the catalytic double psi beta barrel domain of MltA. 0
39758 412902 cl01440 TOBE_2 TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain. 0
39759 412903 cl01441 DUF5655 Domain of unknown function (DUF5655). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 122 and 304 amino acids in length. 0
39760 412904 cl01445 DUF4065 Protein of unknown function (DUF4065). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 155 and 202 amino acids in length. 0
39761 412905 cl01449 DUF2240 Uncharacterized protein conserved in archaea (DUF2240). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39762 412906 cl01453 DUF1275 Protein of unknown function (DUF1275). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although most members have 6 TM regions, and may be putative permeases. 0
39763 412907 cl01454 PhnG Phosphonate metabolism protein PhnG. PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam06754.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context. 0
39764 412908 cl01455 PhnH Bacterial phosphonate metabolism protein (PhnH). PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam05845.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context. 0
39765 412909 cl01456 PhnI Bacterial phosphonate metabolism protein (PhnI). This family consists of several Proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organized in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex. 0
39766 412910 cl01457 PhnJ Phosphonate metabolism protein PhnJ. This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown. 0
39767 412911 cl01458 OAD_gamma Oxaloacetate decarboxylase, gamma chain. This model finds the subfamily of distantly related, low complexity, hydrophobic small subunits of several related sodium ion-pumping decarboxylases. These include oxaloacetate decarboxylase gamma subunit and methylmalonyl-CoA decarboxylase delta subunit. Most sequences scoring between the noise and trusted cutoffs are eukaryotic sodium channel proteins. 0
39768 412912 cl01461 DUF2239 Uncharacterized protein conserved in bacteria (DUF2239). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39769 412913 cl01462 ANT Phage antirepressor protein KilAC domain. This domain was called the KilAC domain by Iyer and colleagues. 0
39770 412914 cl01464 DUF2238 Predicted membrane protein (DUF2238). hypothetical protein; Provisional 0
39771 412915 cl01465 Cas7_I-C CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR-associated protein Cas7 is one of the components of the type I-B cascade-like antiviral defense complex. In Haloferax volcanii, Cas5, Cas6 and Cas7 form a small complex that aids the stability of CRISPR-derived RNA. 0
39772 412916 cl01467 DUF2237 Uncharacterized protein conserved in bacteria (DUF2237). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39773 412917 cl01472 DUF2236 Uncharacterized protein conserved in bacteria (DUF2236). This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity. 0
39774 412918 cl01474 DUF1989 Domain of unknown function (DUF1989). A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes. 0
39775 412919 cl01480 DUF2235 Uncharacterized alpha/beta hydrolase domain (DUF2235). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39776 412920 cl01481 DDE_Tnp_IS1595 ISXO2-like transposase domain. Most transposases of this family of transposases, IS1595, have an additional short N-terminal domain with a pair of CxxC motifs. 0
39777 412921 cl01482 CpxP_like CpxP component of the bacterial Cpx-two-component system and related proteins. This is a metal-binding protein which is involved in resistance to heavy-metal ions. The protein forms a four-helix hooked hairpin, consisting of two long alpha helices each flanked by a shorter alpha helix. It binds a metal ion in a type-2 like centre. It contains two copies of an LTXXQ motif. 0
39778 412922 cl01483 Com_YlbF Control of competence regulator ComK, YlbF/YmcA. YlbF is a family of short Gram-positive and archaeal proteins that includes both YlbF and YmcA which may interact synergistically. The family is necessary for correct biofilm formation, as null mutants of ymcA and ylbF fail to form pellicles at air-liquid interfaces and grow on solid media as smooth, undifferentiated colonies. During development, YmcA, YlbF and YaaT, family PSPI, pfam04468, interact directly with one another forming a stable ternary complex, in vitro. All three proteins are required for competence, sporulation and the formation of biofilms. The YmcA-YlbF-YaaT complex affects the phosphotransfer between Spo0F and Spo0B, thus accelerating the production of Spo0A~P. The three processes of biofilm formation, mature spore formation and competence all require the active, phosphorylated form of Spo0A, Spo0A~P. 0
39779 412923 cl01487 DUF1007 Protein of unknown function (DUF1007). Family of conserved bacterial proteins with unknown function. 0
39780 412924 cl01491 NYN_YacP YacP-like NYN domain. This family consists of bacterial proteins related to YacP. This family is uncharacterized functionally, but it has been suggested that these proteins are nucleases due to them containing a NYN domain. NYN (for N4BP1, YacP-like Nuclease) domains were discovered by Anantharaman and Aravind. Based on gene neighborhoods it was suggested that the bacterial YacP proteins interact with the Ribonuclease III and TrmH methylase in a processome complex that catalyzes the maturation of rRNA and tRNA. 0
39781 412925 cl01492 DUF1980 Domain of unknown function (DUF1980). Members of this family occur in gene pairs with members of pfam03773. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region. 0
39782 412926 cl01498 CitX Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-citrate lyase; Reviewed 0
39783 412927 cl01500 VirB8 VirB8 protein. conjugal transfer protein TrbF; Provisional 0
39784 412928 cl01501 VirB3 Type IV secretory pathway, VirB3-like protein. type IV secretion system protein VirB3; Provisional 0
39785 382568 cl01503 TrbL TrbL/VirB6 plasmid conjugal transfer protein. conjugal transfer protein TrbL; Provisional 0
39786 412929 cl01505 YhhN YhhN family. The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many are annotated as possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues. A human member of this family, formerly known as TMEM86B, is a lysoplasmalogenase that catalyzes the hydrolysis of the vinyl ether bond of lysoplasmalogen. Putative conserved active site residues have been proposed for the YhhN family. 0
39787 412930 cl01506 EII-Sor PTS system sorbose-specific iic component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man (PTS splinter group) family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the sorbose-specific IIC subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39788 412931 cl01507 EIID-AGA PTS system mannose/fructose/sorbose family IID component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IID subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39789 412932 cl01508 KduI KduI/IolB family. Members of this protein family, 5-deoxy-glucuronate isomerase (iolB), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. [Energy metabolism, Sugars] 0
39790 412933 cl01509 ChuX_HutX Haem utilisation ChuX/HutX. The Yersinia enterocolitica O:8 periplasmic binding-protein-dependent transport system consisted of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family). The structure for HemS has been solved and consists of a tandem repeat of this domain. 0
39791 412934 cl01511 AstB Succinylarginine dihydrolase. Members of this family are succinylarginine dihydrolase (EC 3.5.3.23), the second of five enzymes in the arginine succinyltransferase (AST) pathway. [Energy metabolism, Amino acids and amines] 0
39792 412935 cl01513 Terminase_2 Terminase small subunit. Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site. 0
39793 412936 cl01515 EII-GUT PTS system enzyme II sorbitol-specific factor. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific transporters, but these occur both in Gram-negative and Gram-positive bacteria. The system in E. coli consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39794 412937 cl01516 PTSIIA_gutA PTS system glucitol/sorbitol-specific IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family consists only of glucitol-specific transporters, and these occur both in Gram-negative and Gram-positive bacteria. The system in E. coli consists of a IIA protein, and a IIBC protein. This family is specific for the IIA component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
39795 412938 cl01519 DUF1287 Domain of unknown function (DUF1287). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. This family is related to pfam00877. 0
39796 412939 cl01520 DUF817 Protein of unknown function (DUF817). This family consists of several bacterial proteins of unknown function. 0
39797 321544 cl01521 Peptidase_S78 Caudovirus prohead serine protease. This model describes the prohead protease of HK97 and related phage. It is generally encoded next to the gene for the capsid protein that it processes, and in some cases may be fused to it. This family does not show similarity to the prohead protease of phage T4 (see pfam03420). [Mobile and extrachromosomal element functions, Prophage functions, Protein fate, Other] 0
39798 412940 cl01522 FGase N-formylglutamate amidohydrolase. In some species, histidine is converted via urocanate and then formimino-L-glutamate to glutamate in four steps, where the fourth step is conversion of N-formimino-L-glutamate to L-glutamate and formamide. In others, that pathway from formimino-L-glutamate may differ, with the next enzyme being formiminoglutamate hydrolase (HutF) yielding N-formyl-L-glutamate. This model represents the enzyme N-formylglutamate deformylase, also called N-formylglutamate amidohydrolase, which then produces glutamate. [Energy metabolism, Amino acids and amines] 0
39799 412941 cl01525 Terminase_4 Phage terminase, small subunit. This model describes a distinct family of phage (and integrated prophage) putative terminase small subunit. Members tend to be adjacent to the phage terminase large subunit gene. [Mobile and extrachromosomal element functions, Prophage functions] 0
39800 412942 cl01526 DUF934 Bacterial protein of unknown function (DUF934). This family consists of several bacterial proteins of unknown function. One of the members of this family BMEI1764 is thought to be an oxidoreductase. 0
39801 412943 cl01528 DUF937 Bacterial protein of unknown function (DUF937). hypothetical protein; Provisional 0
39802 412944 cl01529 GH99_GH71_like Glycoside hydrolase families 71, 99, and related domains. This domain, around 350 residues, is mainly found in some uncharacterized proteins from bacteroides to human. Some proteins in this family, annotated as endo-alpha-mannosidases cleave mannoside linkages internally within an N-linked glycan chain, short circuiting the classical N-glycan biosynthetic pathway. This domain reveals a (beta-alpha)(8) barrel fold in which the catalytic centre is present in a long substrate-binding groove, consistent with cleavage within the N-glycan chain, providing a foundation upon which to develop new enzyme inhibitors targeting the hijacking of N-glycan synthesis in viral disease and cancer. 0
39803 412945 cl01530 LprI Lysozyme inhibitor LprI. This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. Family members include lipoprotein LprI from Mycobacterium, which binds to and inhibits macrophage lysozyme, which may aid bacterial survival. 0
39804 412946 cl01531 DUF1376 Protein of unknown function (DUF1376). This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 0
39805 412947 cl01532 HutD HutD. HutD from Pseudomonas fluorescens SBW25 is a component of the histidine uptake and utilisation operon. HutD is operonic with the well characterized repressor protein HutC. Genetic analysis using transcriptional fusions (lacZ) and deletion mutants shows that hutD is necessary to maintain fitness in environments replete with histidine. Evidence outlined by Zhang & Rainey (2007) suggests that HutD functions as a governor that sets an upper bound on the level of hut operon transcription. The mechanistic basis is unknown, but in silico molecular docking studies based on the crystal structure of PA5104 (HutD from Pseudomonas aeruginosa) show that urocanate (the first breakdown product of histidine) docks with the active site of HutD. 0
39806 412948 cl01533 DUF1304 Protein of unknown function (DUF1304). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 0
39807 412949 cl01534 NDUFA12 NADH ubiquinone oxidoreductase subunit NDUFA12. NADH:ubiquinone oxidoreductase 18 kDa subunit; Provisional 0
39808 412950 cl01535 TPM_phosphatase TPM domain. This family was first named TPM domain after its founding proteins: TLP18.3, Psb32 and MOLO-1. In Arabidopsis, this domain is called the thylakoid acid phosphatase (TAP) domain and has a Rossmann-like fold. In plants, the family resides in the thylakoid lumen attached to the outer membrane of the chloroplast/plastid. It is active in the photosystem II. 0
39809 412951 cl01538 Peptidase_M74 Penicillin-insensitive murein endopeptidase. penicillin-insensitive murein endopeptidase; Reviewed 0
39810 412952 cl01539 LapA_dom Lipopolysaccharide assembly protein A domain. This family includes a domain found in lipopolysaccharide assembly protein A (LapA). LapA functions along with LapB in the assembly of lipopolysaccharide (LPS). Domains in this family are also found in some uncharacterized bacterial proteins. 0
39811 412953 cl01542 DUF2313 Uncharacterized protein conserved in bacteria (DUF2313). Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins. 0
39812 412954 cl01544 Bestrophin Bestrophin, RFP-TM, chloride channel. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localized to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members. 0
39813 412955 cl01545 DUF1853 Domain of unknown function (DUF1853). This family of proteins is functionally uncharacterized. 0
39814 412956 cl01546 Cytochrom_B562 Cytochrome b562. cytochrome b562; Provisional 0
39815 412957 cl01547 DUF1318 Protein of unknown function (DUF1318). This family consists of several bacterial proteins of around 100 residues in length and is often known as YdbL. The function of this family is unknown. 0
39816 412958 cl01548 YccV-like Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. 0
39817 412959 cl01551 DUF2170 Uncharacterized protein conserved in bacteria (DUF2170). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39818 412960 cl01553 GFA Glutathione-dependent formaldehyde-activating enzyme. glutathione-dependent formaldehyde-activating enzyme; Provisional 0
39819 412961 cl01557 DUF1697 Protein of unknown function (DUF1697). This family contains many hypothetical bacterial proteins. 0
39820 412962 cl01558 DUF2171 Uncharacterized protein conserved in bacteria (DUF2171). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39821 412963 cl01561 DUF924 Bacterial protein of unknown function (DUF924). This family consists of several hypothetical bacterial proteins of unknown function. Structurally, this family resembles TPR-like repeats. 0
39822 412964 cl01562 DOPA_dioxygen Dopa 4,5-dioxygenase family. This family of proteins is related to a DOPA 4,5-dioxygenase that is involved in synthesis of betalain. DOPA-dioxygenase is the key enzyme involved in betalain biosynthesis. It converts 3,4-dihydroxyphenylalanine to betalamic acid, a yellow chromophore. 0
39823 412965 cl01565 zf-TFIIB Transcription factor zinc-finger. 0
39824 412966 cl01566 YjhX_toxin Putative toxin of bacterial toxin-antitoxin pair. hypothetical protein; Provisional 0
39825 412967 cl01567 DUF1993 Domain of unknown function (DUF1993). This family of proteins is functionally uncharacterized. 0
39826 412968 cl01570 DUF2085 Predicted membrane protein (DUF2085). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39827 412969 cl01573 DUF969 Protein of unknown function (DUF969). Family of uncharacterized bacterial membrane proteins. 0
39828 412970 cl01575 DUF599 Protein of unknown function, DUF599. This family includes several uncharacterized proteins. 0
39829 412971 cl01577 MMP_TTHA0227_like Minimal MMP-like domain found in Thermus thermophilus TTHA0227, Acidothermus cellulolyticus ACEL2062 and similar proteins. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family is a minimal version of the metalloprotease fold (Structure 3E11). 0
39830 412972 cl01581 WGR WGR domain. This domain is found in a variety of polyA polymerases as well as the E. coli molybdate metabolism regulator and other proteins of unknown function. I have called this domain WGR after the most conserved central motif of the domain. The domain is found in isolation in proteins such as Rhizobium radiobacter Ych and is between 70 and 80 residues in length. I propose that this may be a nucleic acid binding domain. 0
39831 412973 cl01583 TrbC TrbC/VIRB2 family. conjugal transfer protein TrbC; Provisional 0
39832 412974 cl01585 Flp_Fap Flp/Fap pilin component. 0
39833 412975 cl01587 DUF1290 Protein of unknown function (DUF1290). This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown. 0
39834 412976 cl01589 DUF2087 Uncharacterized protein conserved in bacteria (DUF2087). This domain, found in various hypothetical prokaryotic proteins and transcriptional activators, has no known function. Structural modelling suggests this domain may bind nucleic acids. 0
39835 412977 cl01590 DUF2382 Domain of unknown function (DUF2382). This model describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction center complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known. 0
39836 412978 cl01595 DUF1385 Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent. 0
39837 412979 cl01596 Spore_YtfJ Sporulation protein YtfJ (Spore_YtfJ). Members of this protein family, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if and only if the species is capable of endospore formation. YtfJ was confirmed in spores of Bacillus subtilis; it appears to be expressed in the forespore under control of SigF. [Cellular processes, Sporulation and germination] 0
39838 412980 cl01598 DUF1343 Protein of unknown function (DUF1343). This family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown. 0
39839 412981 cl01600 DUF1963 Domain of unknown function (DUF1963). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described. 0
39840 412982 cl01604 MliC Membrane-bound lysozyme-inhibitor of c-type lysozyme. lysozyme inhibitor; Provisional 0
39841 412983 cl01608 DUF1292 Protein of unknown function (DUF1292). hypothetical protein; Provisional 0
39842 412984 cl01610 Cytochrom_C_2 Cytochrome C'. 0
39843 412985 cl01611 DUF2094 Uncharacterized protein conserved in bacteria (DUF2094). Members of this protein family are found exclusively, although not universally, in bacterial species that possess a type VI secretion system. Genes are found in type VI secretion-associated gene clusters. The specific function is unknown. This model represents the rather well-conserved amino-terminal domain of a protein family in which carboxy-terminal regions, when present, show little conservation. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39844 412986 cl01614 DUF997 Protein of unknown function (DUF997). hypothetical protein; Provisional 0
39845 412987 cl01617 ExoD Exopolysaccharide synthesis, ExoD. Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in the exoD result in altered exopolysaccharide production and defects in nodule invasion. 0
39846 412988 cl01626 Rod-binding Rod binding protein. peptidoglycan hydrolase; Reviewed 0
39847 412989 cl01627 LAB_N Lipid A Biosynthesis N-terminal domain. This family is found at the N-terminus of a group of Chlamydial Lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function. 0
39848 412990 cl01628 DUF1919 Domain of unknown function (DUF1919). This domain has no known function. It is found in various hypothetical and putative bacterial proteins. 0
39849 412991 cl01629 TPP_enzymes N/A. This family contains 1-deoxyxylulose-5-phosphate synthase (DXP synthase), an enzyme which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate, to yield 1-deoxy-D-xylulose-5-phosphate, a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). 0
39850 382632 cl01632 DUF2095 Uncharacterized protein conserved in archaea (DUF2095). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39851 412992 cl01633 DUF5611 Domain of unknown function (DUF5611). This is a domain of unknown function. Studies of the TA0095 gene product indicate that this 96-residue hypothetical protein from Thermoplasma acidophilum is a member of the COG4004 orthologous group of unknown function found in Archaea bacteria. The structure displays an alpha/beta two-layer sandwich architecture formed by three alpha-helices and five beta-strands. Furthermore, structural homologs indicate that the TA0095 structure belongs to the TBP-like fold. 0
39852 412993 cl01636 DUF749 Domain of unknown function (DUF749). Archaeal domain of unknown function. The structure of this domain has been solved as part of a structural genomics project and comprises segregated helical and anti-parallel beta sheet regions. 0
39853 412994 cl01637 DUF2096 Uncharacterized protein conserved in archaea (DUF2096). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39854 412995 cl01638 DUF1786 Putative pyruvate formate-lyase activating enzyme (DUF1786). This family is annotated as pyruvate formate-lyase activating enzyme (EC:1.97.1.4) in UniProt. It is not clear where this annotation comes from. 0
39855 412996 cl01639 DUF2097 Uncharacterized protein conserved in archaea (DUF2097). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39856 412997 cl01640 DUF2098 Uncharacterized protein conserved in archaea (DUF2098). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39857 412998 cl01641 UPF0254 Uncharacterized protein family (UPF0254). hypothetical protein; Provisional 0
39858 412999 cl01642 DUF1188 Protein of unknown function (DUF1188). This family consists of several hypothetical archaeal proteins of around 260 residues in length which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown. 0
39859 413000 cl01645 DUF2099 Uncharacterized protein conserved in archaea (DUF2099). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. 0
39860 261026 cl01648 DUF2101 Predicted membrane protein (DUF2101). This domain, found in various archaeal and bacterial proteins, has no known function. 0
39861 413001 cl01650 DUF2102 Uncharacterized protein conserved in archaea (DUF2102). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 0
39862 413002 cl01651 DUF2103 Predicted metal-binding protein (DUF2103). This domain, found in various putative metal binding prokaryotic proteins, has no known function. 0
39863 413003 cl01653 DUF1894 Domain of unknown function (DUF1894). Members of this family have an important role in methanogenesis. They assume an alpha-beta globular structure consisting of six beta-strands and three alpha-helices forming the secondary structural topological arrangement of alpha1-beta1-alpha2-beta2-beta3-beta4-beta5-beta6-alpha3. 0
39864 413004 cl01655 DUF2104 Predicted membrane protein (DUF2104). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39865 413005 cl01656 DUF2105 Predicted membrane protein (DUF2105). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39866 413006 cl01659 DUF2108 Predicted membrane protein (DUF2108). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39867 413007 cl01662 NiFe_hyd_3_EhaA NiFe-hydrogenase-type-3 Eha complex subunit A. Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. Energy-converting [NiFe] hydrogenases function as ion pumps, catalyzing the reduction of ferredoxin with H2 driven by the proton-motive force or the sodium-ion-motive force. Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhaS, EhaT). Eha and Ehb catalyze the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases. Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure. This domain family can be found on the small membrane proteins that are predicted to be the EhaA trans-membrane subunits of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes. 0
39868 413008 cl01665 DUF1512 Protein of unknown function (DUF1512). This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown. 0
39869 413009 cl01666 AGOG N-glycosylase/DNA lyase. N-glycosylase/DNA lyase; Provisional 0
39870 413010 cl01667 DUF2111 Uncharacterized protein conserved in archaea (DUF2111). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39871 413011 cl01669 DUF2112 Uncharacterized protein conserved in archaea (DUF2112). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 0
39872 413012 cl01670 DUF2113 Uncharacterized protein conserved in archaea (DUF2113). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 0
39873 413013 cl01673 MCR_D Methyl-coenzyme M reductase operon protein D. Members of this protein family are protein D, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. Proteins in this family are expressed at much lower levels than the methyl-coenzyme M reductase itself and have been shown to form at least transient associations. The precise function is unknown. [Energy metabolism, Methanogenesis] 0
39874 413014 cl01674 MCR_C Methyl-coenzyme M reductase operon protein C. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. 0
39875 413015 cl01675 MtrE Tetrahydromethanopterin S-methyltransferase, subunit E. tetrahydromethanopterin S-methyltransferase subunit E; Provisional 0
39876 413016 cl01676 MtrD Tetrahydromethanopterin S-methyltransferase, subunit D. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit D in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses the methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis] 0
39877 413017 cl01677 MtrC Tetrahydromethanopterin S-methyltransferase, subunit C. tetrahydromethanopterin S-methyltransferase subunit C; Provisional 0
39878 413018 cl01678 MtrB Tetrahydromethanopterin S-methyltransferase subunit B. Members of this protein family are the MtrB protein of the tetrahydromethanopterin S-methyltransferase complex. This system is universal in archaeal methanogens. [Energy metabolism, Methanogenesis] 0
39879 413019 cl01680 DUF2114 Uncharacterized protein conserved in archaea (DUF2114). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 0
39880 413020 cl01681 DUF2115 Uncharacterized protein conserved in archaea (DUF2115). hypothetical protein; Provisional 0
39881 413021 cl01683 DUF2116 Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids. 0
39882 382661 cl01684 DUF2118 Uncharacterized protein conserved in archaea (DUF2118). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39883 413022 cl01685 DUF2119 Uncharacterized protein conserved in archaea (DUF2119). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39884 413023 cl01686 Nit_Regul_Hom Uncharacterized protein, homolog of nitrogen regulatory protein PII. This domain, found in various hypothetical archaeal proteins, has no known function. It is distantly similar to the nitrogen regulatory protein PII. 0
39885 413024 cl01687 DUF2120 Uncharacterized protein conserved in archaea (DUF2120). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39886 413025 cl01688 NADHdeh_related NADH dehydrogenase I, subunit N related protein. This family comprises a set of NADH dehydrogenase I, subunit N related proteins found in archaea. Their exact function has not, as yet, been determined. 0
39887 413026 cl01691 DUF1890 Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins. 0
39888 413027 cl01695 DUF2124 Uncharacterized protein conserved in archaea (DUF2124). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39889 413028 cl01709 PBP2_NikA_DppA_OppA_like The substrate-binding domain of an ABC-type nickel/oligopeptide-like import system contains the type 2 periplasmic binding fold. The borders of this family are based on the PDBSum definitions of the domain edges for Salmonella typhimurium oppA. 0
39890 413029 cl01713 Gamma_PGA_hydro Poly-gamma-glutamate hydrolase. This family consists of a number of bacterial and phage proteins that function as gamma-PGA hydrolase enzymes. Structurally the protein in this family adopted an open alpha/beta mixed core structure with a seven-stranded parallel/anti-parallel beta-sheet. This structure shows similarity to mammalian carboxypeptidase A and related enzymes. 0
39891 382669 cl01720 Phage_Nu1 Phage DNA packaging protein Nu1. Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA. 0
39892 413030 cl01722 DUF896 Bacterial protein of unknown function (DUF896). hypothetical protein; Provisional 0
39893 382671 cl01728 DUF2232 Predicted membrane protein (DUF2232). This family of bacterial proteins are multi-pass membrane proteins with up to 10 (2 x 4/5) transmembrane regions. The exact function of this potential pore molecule is not known, but in many instances it is associated with ABC-transporter-like domains, implying that it is part of a secretion system that uses energy. 0
39894 413031 cl01729 VKOR Vitamin K epoxide reductase (VKOR) family. Vitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues. In some plant and bacterial homologs the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases. 0
39895 413032 cl01730 DUF2231 Predicted membrane protein (DUF2231). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39896 413033 cl01731 DICT Sensory domain in DIguanylate Cyclases and Two-component system. DICT is a sensory domain found associated with GGDEF, EAL, HD-GYP, STAS, and two component systems (histidine-kinase type). It assumes an alpha+beta fold with a 4-stranded beta-sheet and might have a role in light response (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter) 0
39897 413034 cl01732 CHASE2 CHASE2 domain. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time. 0
39898 413035 cl01733 DUF2345 Uncharacterized protein conserved in bacteria (DUF2345). Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins. 0
39899 413036 cl01736 PelG Putative exopolysaccharide Exporter (EPS-E). PelG is a family of putative exopolysaccharide transporters like PelG. Most members carry twelve transmembrane regions. The family also contains fusion proteins with glycosyl transferase group 1, which are putative flippase transporters. 0
39900 382678 cl01737 McrBC McrBC 5-methylcytosine restriction system component. 5-methylcytosine-specific restriction enzyme subunit McrC; Provisional 0
39901 413037 cl01738 DUF898 Bacterial protein of unknown function (DUF898). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative membrane proteins. 0
39902 413038 cl01741 DUF1634 Protein of unknown function (DUF1634). This family contains many hypothetical bacterial and archaeal proteins. A few members of this family are annotated as being putative transmembrane proteins, and the region in question in fact contains many hydrophobic residues. 0
39903 413039 cl01742 DGC DGC domain. This domain appears to be a zinc binding domain from the conservation of four potential chelating cysteines. The domain is named after a conserved central motif. The function of this domain is unknown. 0
39904 413040 cl01743 GYD GYD domain. This protein is found in a range of bacteria. It is usually less than 100 amino acids in length. The function of the protein is unknown. It may belong to the dimeric alpha/beta barrel superfamily. 0
39905 413041 cl01744 Chrome_Resist Chromate resistance exported protein. Members of this family of bacterial proteins are involved in the reduction of chromate accumulation and are essential for chromate resistance. 0
39906 413042 cl01747 SMI1_KNR4 SMI1 / KNR4 family (SUKH-1). Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins. 0
39907 413043 cl01749 UPF0160 Uncharacterized protein family (UPF0160). This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. 0
39908 413044 cl01751 ASRT Anabaena sensory rhodopsin transducer. The family of bacterial Anabaena sensory rhodopsin transducers are likely to bind sugars or related metabolites. The entire protein is comprised of a single globular domain with an eight-stranded beta-sandwich fold. There are a few characteristics which define this beta-sandwich fold as being distinct from other so-named folds, and these are: 1) a well conserved tryptophan, usually following a polar residue, present at the start of the first strand; this tryptophan appears to be central to a hydrophobic interaction required to hold the two beta-sheets of the sandwich together, and 2) a nearly absolutely conserved asparagine located at the end of the second beta-strand, that hydrogen bonds with the backbone carbonyls of the residues 2 and 4 positions downstream from it, thereby stabilizing the characteristic tight turn between strands 2 and 3 of the structure. 0
39909 413045 cl01752 DUF2264 Uncharacterized protein conserved in bacteria (DUF2264). Members of this family of hypothetical bacterial proteins have no known function. 0
39910 413046 cl01753 DUF1345 Protein of unknown function (DUF1345). This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown. 0
39911 413047 cl01754 LtrA Bacterial low temperature requirement A protein (LtrA). This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes. 0
39912 413048 cl01755 DUF1802 Domain of unknown function (DUF1802). The function of this family is unknown. This region is found associated with pfam04471, suggesting that members could be part of a restriction modification system. 0
39913 413049 cl01757 DUF2262 Uncharacterized protein conserved in bacteria (DUF2262). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39914 413050 cl01759 YiaAB yiaA/B two helix domain. This domain consists of two transmembrane helices and a conserved linking section. 0
39915 413051 cl01762 EutC Ethanolamine ammonia-lyase light chain (EutC). This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) EC:4.3.1.7 sequences. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. 0
39916 413052 cl01763 DUF2247 Uncharacterized protein conserved in bacteria (DUF2247). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39917 413053 cl01767 SoxD Sarcosine oxidase, delta subunit family. This model describes the delta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) form. [Energy metabolism, Amino acids and amines] 0
39918 413054 cl01768 Phenol_MetA_deg Putative MetA-pathway of phenol degradation. 0
39919 413055 cl01769 NosL NosL. NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallo-centre assembly. 0
39920 413056 cl01770 DUF2251 Uncharacterized protein conserved in bacteria (DUF2251). Members of this family of hypothetical bacterial proteins have no known function. 0
39921 413057 cl01771 DUF1427 Protein of unknown function (DUF1427). This model describes an uncharacterized small, hydrophobic protein of about 50 amino acids, found between the xapB and xapR genes of the E. coli xanthosine utilization system, and homologous regions in other small proteins, such as the N-terminal region of DUF1427 (pfam07235). We name this domain XapX, as it comprises the full length of the protein encoded between the genes for the well-studied XapB and XapR proteins. [Unknown function, General] 0
39922 413058 cl01775 RHH_4 Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding. 0
39923 413059 cl01781 DUF4212 Domain of unknown function (DUF4212). Members of this family are highly hydrophobic bacterial proteins of about 90 amino acids in length. Members usually are found immediately upstream (sometimes fused to) a member of the solute:sodium symporter family, and therefore are a putative sodium:solute symporter small subunit. Members tend to be found in aquatic species, especially those from marine or other high salt environments. [Transport and binding proteins, Unknown substrate] 0
39924 413060 cl01783 DUF2243 Predicted membrane protein (DUF2243). This domain, found in various hypothetical bacterial proteins, has no known function. 0
39925 413061 cl01784 DUF1361 Protein of unknown function (DUF1361). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins. 0
39926 413062 cl01785 DUF2127 Predicted membrane protein (DUF2127). This domain, found in various hypothetical prokaryotic and archaeal proteins, has no known function. 0
39927 413063 cl01786 DUF1062 Protein of unknown function (DUF1062). This family consists of several hypothetical bacterial proteins of unknown function. 0
39928 413064 cl01787 DUF1643 Protein of unknown function (DUF1643). The members of this family are all sequences found within hypothetical proteins expressed by various bacterial species. The region concerned is approximately 150 residues long. 0
39929 413065 cl01788 DUF2255 Uncharacterized protein conserved in bacteria (DUF2255). Members of this family of hypothetical bacterial proteins have no known function. 0
39930 413066 cl01790 DUF1445 Protein of unknown function (DUF1445). This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function. 0
39931 413067 cl01792 DUF2256 Uncharacterized protein conserved in bacteria (DUF2256). Members of this family of hypothetical bacterial proteins have no known function. 0
39932 413068 cl01794 2OG-Fe_Oxy_2 2OG-Fe dioxygenase. This family contains 2-oxoglutarate (2OG) and Fe-dependent dioxygenases. It includes L-isoleucine dioxygenase (IDO). 0
39933 413069 cl01797 DUF2258 Uncharacterized protein conserved in archaea (DUF2258). Members of this family of hypothetical archaeal proteins have no known function. Structural modelling suggests this domain may bind nucleic acids. 0
39934 413070 cl01798 DUF1405 Protein of unknown function (DUF1405). This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown. 0
39935 413071 cl01799 Ribosomal_L13e Ribosomal protein L13e. 60S ribosomal protein L13; Provisional 0
39936 413072 cl01800 DUF1122 Protein of unknown function (DUF1122). This family consists of several hypothetical archaeal and bacterial proteins of unknown function. 0
39937 413073 cl01805 DUF2316 Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function. 0
39938 413074 cl01807 DUF1517 Protein of unknown function (DUF1517). This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown. 0
39939 413075 cl01811 DUF2325 Uncharacterized protein conserved in bacteria (DUF2325). Members of this family of hypothetical bacterial proteins have no known function. 0
39940 413076 cl01813 DUF799 Putative bacterial lipoprotein (DUF799). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins. 0
39941 413077 cl01815 DUF1018 Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function. 0
39942 413078 cl01817 Tail_P2_I Phage tail protein (Tail_P2_I). This model describes a region of sequence similarity shared by a number of uncharacterized proteins in bacterial genomes, including Geobacter sulfurreducens PCA, Mesorhizobium loti, Streptomyces coelicolor A3(2), Gloeobacter violaceus PCC 7421, and Myxococcus xanthus. In all cases, the genomic region resembles a phage tail region, based on tentative identifications of neighboring genes. A region of this domain resembles a region of TIGR01634, another phage tail protein model. [Mobile and extrachromosomal element functions, Prophage functions] 0
39943 413079 cl01818 DUF1320 Protein of unknown function (DUF1320). This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown. 0
39944 413080 cl01820 DUF2322 Uncharacterized protein conserved in bacteria (DUF2322). Members of this family of hypothetical bacterial proteins have no known function. 0
39945 413081 cl01821 zf-CHCC Zinc-finger domain. This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C. 0
39946 413082 cl01823 DUF2331 Uncharacterized protein conserved in bacteria (DUF2331). This model describes a conserved protein that typically is encoded next to the gene efp for translation elongation factor P. 0
39947 413083 cl01825 Phage_Mu_Gam Bacteriophage Mu Gam like protein. This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. 0
39948 413084 cl01826 Mu-like_gpT Mu-like prophage major head subunit gpT. Members of this family of proteins comprise various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT. 0
39949 382727 cl01827 DUF2330 Uncharacterized protein conserved in bacteria (DUF2330). Members of this family of hypothetical bacterial proteins have no known function. 0
39950 413085 cl01829 UT Urea transporter. Members of this protein family are bacterial urea transporters, found not only in species that contain urease, but adjacent to the urease operon. It was characterized in Yersinia pseudotuberculosis. Members are homologous to eukaryotic members of solute carrier family 14, a family that includes urea transporters, and to bacterial proteins in species with no detectable urea degradation system. [Transport and binding proteins, Other] 0
39951 413086 cl01831 DUF1003 Protein of unknown function (DUF1003). This family consists of several hypothetical bacterial proteins of unknown function. 0
39952 413087 cl01834 PSK_trans_fac Rv0623-like transcription factor. This entry represents the Rv0623-like family of transcription factors associated with the PSK operon. 0
39953 413088 cl01837 DUF2332 Uncharacterized protein conserved in bacteria (DUF2332). Members of this family of hypothetical bacterial proteins have no known function. 0
39954 413089 cl01841 DUF1499 Protein of unknown function (DUF1499). This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown. 0
39955 413090 cl01842 Asparaginase_II L-asparaginase II. This family consists of several bacterial L-asparaginase II proteins. L-asparaginase (EC:3.5.1.1) catalyzes the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source. 0
39956 413091 cl01843 RuBisCO_small_like N/A. Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. 0
39957 413092 cl01844 CreD Inner membrane protein CreD. This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is shown to be in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA. 0
39958 413093 cl01845 DUF1778 Protein of unknown function (DUF1778). This is a family of uncharacterized proteins. The structure of one of the hypothetical proteins in this family has been solved and it forms a helix structure which may form interactions with DNA. 0
39959 382737 cl01848 NapE Periplasmic nitrate reductase protein NapE. NapE, homologous to TorE (TIGR02972), is a membrane protein of unknown function that is part of the periplasmic nitrate reductase system; it may be part of the enzyme complex. The periplasmic nitrate reductase allows for nitrate respiration in anaerobic conditions. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 0
39960 413094 cl01850 CtsR Firmicute transcriptional repressor of class III stress genes (CtsR). This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon. 0
39961 413095 cl01852 VEG Biofilm formation stimulator VEG. VEG is a family that is highly conserved among Gram-positive bacteria. It stimulates biofilm formation through inducing transcription of the tapA-sipW-tasA operon. The products of this operon are responsible for production of the amyloid fibre (TasA) component of the biofilm. Veg or a Veg-induced protein acts as an antirepressor of SinR - part of the major overall biofilm transcriptional control system - to regulate and stimulate biofilm formation. Veg is transcribed at high levels during both exponential growth and sporulation. 0
39962 242748 cl01853 YabA Regulator of replication initiation timing [Replication, recombination and repair]. 0
39963 413096 cl01857 DUF965 Bacterial protein of unknown function (DUF965). IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135 0
39964 413097 cl01860 DUF436 Protein of unknown function (DUF436). hypothetical protein; Provisional 0
39965 413098 cl01862 DUF1461 Protein of unknown function (DUF1461). This model represents a family of highly hydrophobic, uncharacterized predicted integral membrane proteins found almost entirely in low-GC Gram-positive bacteria, although a member is also found in the early-branching bacterium Aquifex aeolicus. 0
39966 413099 cl01864 DUF951 Bacterial protein of unknown function (DUF951). This family consists of several short hypothetical bacterial proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids. 0
39967 413100 cl01867 DUF4176 Domain of unknown function (DUF4176). 0
39968 413101 cl01868 YukC WXG100 protein secretion system (Wss), protein YukC. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This protein is designated YukC in Bacillus subtilis and EssB in Staphylococcus aureus. [Protein fate, Protein and peptide secretion and trafficking] 0
39969 413102 cl01870 DUF1934 Domain of unknown function (DUF1934). Members of this family are found in a set of hypothetical bacterial proteins. Their precise function has not, as yet, been defined. 0
39970 413103 cl01873 AgrB Accessory gene regulator B. The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. AgrB is involved in the proteolytic processing of AgrD and may act both as a proteolytic enzyme and as a transporter facilitating the export of the processed AgrD peptide. 0
39971 413104 cl01877 17kDa_Anti_2 17 kDa outer membrane surface antigen. This is a bacterial domain of 17 kDa common-antigen proteins. 0
39972 382749 cl01879 DUF962 Protein of unknown function (DUF962). This family consists of several eukaryotic and prokaryotic proteins of unknown function. The yeast protein YGL010W has been found to be non-essential for cell growth. 0
39973 413105 cl01880 SulA Cell division inhibitor SulA. All proteins in this family for which the functions are known are cell division inhibitors. In E. coli, SulA is one of the SOS regulated genes. [DNA metabolism, DNA replication, recombination, and repair] 0
39974 413106 cl01885 RusA Endodeoxyribonuclease RusA. endodeoxyribonuclease RUS; Reviewed 0
39975 413107 cl01886 Omptin Omptin family. The omptin family is a family of serine proteases. 0
39976 413108 cl01887 ChaB ChaB. This family of proteins contains a conserved 60 residue region. This protein is known as ChaB in E. coli and is found next to ChaA which is a cation transporter protein. ChaB may regulate ChaA function in some way. 0
39977 413109 cl01888 DUF883 Bacterial protein of unknown function (DUF883). hypothetical protein; Provisional 0
39978 413110 cl01889 EutN_CcmL Ethanolamine utilisation protein and carboxysome structural protein domain family. The crystal structure of EutN contains a central five-stranded beta-barrel, with an alpha-helix at the open end of this barrel (Structure 2HD3). The structure also contains three additional beta-strands, which help the formation of a tight hexamer, with a hole in the center. This suggests that EutN forms a pore, with an opening of 26 Angstrom in diameter on one face and 14 Angstrom on the other face. EutN is involved in the cobalamin-dependent degradation of ethanolamine. 0
39979 413111 cl01890 GutM Glucitol operon activator protein (GutM). This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes. 0
39980 413112 cl01891 AceK Isocitrate dehydrogenase kinase/phosphatase (AceK). bifunctional isocitrate dehydrogenase kinase/phosphatase protein; Validated 0
39981 413113 cl01892 ZapD Cell division protein. Cell division protein ZapD enhances FtsZ-ring assembly. It directly interacts with FtsZ and promotes bundling of FtsZ protofilaments, with a reduction in FtsZ GTPase activity. 0
39982 413114 cl01894 VF530 DNA-binding protein VF530. VF530 contains a unique four-helix motif that shows some similarity to the C-terminal double-stranded DNA (dsDNA) binding domain of RecA, as well as other nucleic acid binding domains. 0
39983 413115 cl01907 YscJ_FliF Secretory protein of YscJ/FliF family. All members of this protein family are predicted lipoproteins with a conserved Cys near the N-terminus for cleavage and modification, and are part of known or predicted type III secretion systems. Members are found in both plant and animal pathogens, including the obligately intracellular chlamydial species and (non-pathogenic) root nodule bacteria. The most closely related proteins outside this family are examples of the flagellar M-ring protein FliF. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
39984 295085 cl01908 Phage_tail_L Phage minor tail protein L. This model detects members of the family of phage lambda minor tail protein L. This model was built as a fragment model to allow detection of fragmentary sequences, as might be found in cryptic prophage regions. [Mobile and extrachromosomal element functions, Prophage functions] 0
39985 413116 cl01912 HigB_toxin HigB_toxin, RelE-like toxic component of a toxin-antitoxin system. HigB_toxin is a family of RelE-like prokaryotic proteins that function as mRNA interferases. HigB cleaves translated mRNA only, and cleavage depended on translation of the target RNAs. HigB belongs to the RelE super-family of RNases. The toxin-antitoxin gene-pair is induced by environmental stress factors. 0
39986 413117 cl01913 YaeQ YaeQ protein. This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialized transcription elongation protein. YaeQ is known to compensate for loss of RfaH function. 0
39987 382764 cl01916 DUF2138 Uncharacterized protein conserved in bacteria (DUF2138). hypothetical protein; Provisional 0
39988 413118 cl01917 DUF956 Domain of unknown function (DUF956). Family of bacterial sequences with undetermined function. 0
39989 413119 cl01919 ADC Acetoacetate decarboxylase (ADC). Members of this family are MppR, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppR belongs to the acetoacetate decarboxylase-like superfamily. MppR catalyzes an aldol condensation and a dehydration, not a decarboxylation. 0
39990 413120 cl01925 DUF2139 Uncharacterized protein conserved in archaea (DUF2139). This domain, found in various hypothetical archaeal proteins, has no known function. 0
39991 413121 cl01930 DUF2141 Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
39992 413122 cl01935 DUF2391 Putative integral membrane protein (DUF2391). Members of this family are found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. This family, as defined, includes some members of COG4711 but is narrower and strictly bacterial. Members appear to span the membrane seven times. [Cell envelope, Other] 0
39993 382770 cl01936 Rad52_Rad22 Rad52/22 family double-strand break repair protein. All proteins in this family for which functions are known are involved in recombination and recombination repair. Their exact biochemical activity is not yet known. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
39994 413123 cl01938 DUF2167 Protein of unknown function (DUF2167). This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function. 0
39995 413124 cl01940 Phage_min_tail Phage minor tail protein. This family consists of a series of phage minor tail proteins and related sequences from several bacterial species. 0
39996 413125 cl01943 ABC_cobalt ABC-type cobalt transport system, permease component. Members of this family of prokaryotic proteins include various hypothetical proteins as well as ABC-type cobalt transport systems. 0
39997 295099 cl01945 Lambda_tail_I Bacteriophage lambda tail assembly protein I. This family consists of tail assembly proteins from lambdoid and T1 phages and related prophages, e.g. the tail assembly protein I (TAPI). Members of this family contain a core ubiquitin fold domain. The exact function of TAPI is not clear but it is not incorporated into the mature tail. Gene neighborhoods reveal that TAPI co-occurs with genes encoding the host-specificity protein TapJ, and TapK, which contains a JAB metallopeptidase fused to an NlpC/P60 peptidase. It is proposed that the TAPI protein is processed by the peptidase domains of TapK. 0
39998 382774 cl01947 MT-A70 MT-A70. MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs. 0
39999 413126 cl01949 DUF1653 Protein of unknown function (DUF1653). This is a family of hypothetical bacterial proteins of unknown function. 0
40000 413127 cl01950 DUF1850 Domain of unknown function (DUF1850). This family of proteins is functionally uncharacterized. Some members of this family appear to be misannotated as RocC, an amino acid transporter from B. subtilis. 0
40001 413128 cl01951 DUF2147 Uncharacterized protein conserved in bacteria (DUF2147). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40002 413129 cl01953 ArdA Antirestriction protein (ArdA). This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification. 0
40003 413130 cl01958 Endonuc_Holl Endonuclease related to archaeal Holliday junction resolvase. This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases. 0
40004 413131 cl01959 DUF1616 Protein of unknown function (DUF1616). This is a family of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long. 0
40005 413132 cl01960 DUF2149 Uncharacterized conserved protein (DUF2149). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40006 413133 cl01962 MTH865 MTH865-like family. This domain has an EF-hand like fold. 0
40007 413134 cl01963 DUF2150 Uncharacterized protein conserved in archaea (DUF2150). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40008 382784 cl01966 DUF2153 Uncharacterized protein conserved in archaea (DUF2153). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40009 413135 cl01969 DUF1990 Domain of unknown function (DUF1990). This family of proteins are functionally uncharacterized. 0
40010 413136 cl01970 DUF2155 Uncharacterized protein conserved in bacteria (DUF2155). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40011 413137 cl01971 VanZ VanZ like family. This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases. 0
40012 413138 cl01972 DUF948 Bacterial protein of unknown function (DUF948). This family consists of bacterial sequences several of which are thought to be general stress proteins. 0
40013 413139 cl01973 Hpre_diP_synt_I Heptaprenyl diphosphate synthase component I. This family contains component I of bacterial heptaprenyl diphosphate synthase (EC:2.5.1.30) (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7. 0
40014 413140 cl01974 GPDPase_memb Membrane domain of glycerophosphoryl diester phosphodiesterase. Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase. 0
40015 413141 cl01977 FeThRed_B Ferredoxin thioredoxin reductase catalytic beta chain. ferredoxin thioreductase subunit beta; Validated 0
40016 413142 cl01978 DUF1269 Protein of unknown function (DUF1269). This family consists of several bacterial and archaeal proteins of around 200 residues in length. The function of this family is unknown. The family carries a repeated glycine-zipper sequence motif, GxxxGxxxG, where the x following the G is frequently found to be an alanine. As glycine-zippers occur in membrane proteins, this family is likely to be found spanning a membrane. 0
40017 413143 cl01981 DUF1307 Protein of unknown function (DUF1307). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown. 0
40018 413144 cl01982 BMC Bacterial Micro-Compartment (BMC) domain. Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures. 0
40019 413145 cl01983 DUF986 Protein of unknown function (DUF986). hypothetical protein; Provisional 0
40020 413146 cl01985 MepB MepB protein. MepB is a functionally uncharacterized protein in the mepRAB gene cluster of Staphylococcus aureus. 0
40021 413147 cl01986 DUF1048 Protein of unknown function (DUF1048). This family consists of several hypothetical bacterial proteins of unknown function. 0
40022 413148 cl01988 Abi_2 Abi-like protein. This family, found in various bacterial species, contains sequences that are similar to the Abi group of proteins, which are involved in bacteriophage resistance mediated by abortive infection in Lactococcus species. The proteins are thought to have helix-turn-helix motifs, found in many DNA-binding proteins, allowing them to perform their function. 0
40023 413149 cl01989 Phage_holin_4_1 Bacteriophage holin family. This model describes one of the many mutually dissimilar families of holins, phage proteins that act together with lytic enzymes in bacterial lysis. This family includes, besides phage holins, the protein TcdE/UtxA involved in toxin secretion in Clostridium difficile and related species. [Protein fate, Protein and peptide secretion and trafficking, Mobile and extrachromosomal element functions, Prophage functions] 0
40024 413150 cl01990 DUF2162 Predicted transporter (DUF2162). Members of this family of bacterial proteins are thought to be membrane transporters, but their exact function has not, as yet, been elucidated. 0
40025 413151 cl01991 DUF1622 Protein of unknown function (DUF1622). This is a family of 14 highly conserved sequences, from hypothetical proteins expressed by both bacterial and archaeal species. 0
40026 413152 cl01992 MIase Muconolactone delta-isomerase. Members of this protein family are muconolactone delta-isomerase (EC 5.3.3.4), the CatC protein of the ortho cleavage pathway for metabolizing aromatic compounds by way of catechol. [Energy metabolism, Other] 0
40027 413153 cl01993 Ribosomal_S26e Ribosomal protein S26e. ribosomal protein S26; Provisional 0
40028 382803 cl01994 DUF2173 Uncharacterized conserved protein (DUF2173). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40029 413154 cl02005 WXG100 Proteins of 100 residues with WXG. T7SS_ESX-EspC is a family of exported virulence proteins from largely Actinobacteria and a few Firmicutes, Gram-positive bacteria. It is exported in conjunction with EspA as an interacting pair. 0
40030 413155 cl02008 2-oxoacid_dh 2-oxoacid dehydrogenases acyltransferase (catalytic domain). Chloramphenicol acetyltransferase (CAT) catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT, evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat containing-transferases family. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen. 0
40031 413156 cl02009 DUF1453 Protein of unknown function (DUF1453). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales. 0
40032 382807 cl02010 DUF2175 Uncharacterized protein conserved in archaea (DUF2175). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40033 382808 cl02011 DUF1444 Protein of unknown function (DUF1444). hypothetical protein; Provisional 0
40034 413157 cl02014 DUF2177 Predicted membrane protein (DUF2177). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40035 413158 cl02015 YycI YycH protein. This domain is exclusively found in YycI proteins in the low GC content Gram positive species. These two domains share the same structural fold with domains two and three of YycH pfam07435. Both, YycH and YycI are always found in pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies. 0
40036 413159 cl02016 DUF2178 Predicted membrane protein (DUF2178). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40037 413160 cl02017 DUF2180 Uncharacterized protein conserved in archaea (DUF2180). This domain, found in various hypothetical archaeal proteins, has no known function. A few of the family members contain a zinc finger domain. 0
40038 413161 cl02019 DUF2185 Protein of unknown function (DUF2185). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40039 413162 cl02022 MecA Negative regulator of genetic competence (MecA). This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription. 0
40040 413163 cl02025 Glm_e N/A. This family consists of several methylaspartate mutase E chain proteins (EC:5.4.99.1). Glutamate mutase catalyzes the first step in the fermentation of glutamate by Clostridium tetanomorphum. This is an unusual isomerisation in which L-glutamate is converted to threo-beta-methyl L-aspartate. 0
40041 413164 cl02034 DUF2193 Uncharacterized protein conserved in archaea (DUF2193). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40042 413165 cl02037 DUF1847 Protein of unknown function (DUF1847). This family of proteins is functionally uncharacterized. They contain 4 N-terminal cysteines that may form a zinc binding domain. 0
40043 413166 cl02038 Elf1 Transcription elongation factor Elf1 like. putative transcription elongation factor Elf1; Provisional 0
40044 413167 cl02039 YbgT_YccB Membrane bound YbgT-like protein. This model describes a very small (as short as 33 amino acids) protein of unknown function, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It begins with an aromatic motif MWYFXW and appears to contain a membrane-spanning helix. This protein appears to be restricted to the Proteobacteria and exist in a single copy only. We suggest it may be a membrane subunit of the terminal oxidase. The family is named after the E. coli member YbgT (SP|P56100). This model excludes the apparently related protein YccB (SP|P24244). [Energy metabolism, Electron transport] 0
40045 413168 cl02041 Cyt-b5 Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. 0
40046 413169 cl02042 DUF2195 Uncharacterized protein conserved in bacteria (DUF2195). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40047 413170 cl02043 LOR LURP-one-related. Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. 0
40048 413171 cl02044 DUF2196 Uncharacterized conserved protein (DUF2196). A pair of adjacent genes, ablAB (acetyl-beta-lysine biosynthesis) encodes lysine 2,3-aminomutase and beta-lysine acetyltransferase in methanogenic archaea. Homologous pairs, possibly with identical function, occur in a wide range of species, including Bacillus subtilis. This model describes a conserved hypothetical protein, small in size, with a phylogenetic distribution moderately well correlated to that of the acetyltransferase family. This protein family is also described as DUF2196 and COG4895. The function is unknown. [Hypothetical proteins, Conserved] 0
40049 413172 cl02047 DUF2200 Uncharacterized protein conserved in bacteria (DUF2200). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40050 413173 cl02048 DUF2199 Uncharacterized protein conserved in bacteria (DUF2199). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40051 413174 cl02049 PhageMetallopep Putative phage metallopeptidase. This entry represents a probable metallopeptidase found in a variety of phage and bacterial proteomes. 0
40052 413175 cl02050 Ribosomal_S25 S25 ribosomal protein. 30S ribosomal protein S25e; Provisional 0
40053 413176 cl02055 Dehydratase_SU Dehydratase small subunit. This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 0
40054 413177 cl02056 DUF2203 Uncharacterized conserved protein (DUF2203). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40055 413178 cl02059 Halogen_Hydrol 5-bromo-4-chloroindolyl phosphate hydrolysis protein. Members of this family of prokaryotic proteins mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds. 0
40056 413179 cl02063 DUF2209 Uncharacterized protein conserved in archaea (DUF2209). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40057 413180 cl02066 GDYXXLXY GDYXXLXY protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 171 and 199 amino acids in length. It contains a conserved GDYXXLXY motif. 0
40058 413181 cl02071 DUF1109 Protein of unknown function (DUF1109). This family consists of several hypothetical bacterial proteins of unknown function. 0
40059 413182 cl02073 DUF3422 Protein of unknown function (DUF3422). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 426 to 444 amino acids in length. 0
40060 413183 cl02074 DUF2000 Protein of unknown function (DUF2000). This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold. 0
40061 413184 cl02079 YtxH YtxH-like protein. This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 143 amino acids in length. The N-terminal region is the most conserved. Proteins in this family are functionally uncharacterized. 0
40062 382833 cl02087 DUF1834 Domain of unknown function (DUF1834). This family of proteins are functionally uncharacterized. One member is the Gp37 protein from the FluMu prophage. 0
40063 295164 cl02088 Phage_tail_X Phage Tail Protein X. This domain is found in a family of phage tail proteins. Visual analysis suggests that it is related to pfam01476 (personal obs: C Yeats). The functional annotation of family members further confirms this hypothesis. 0
40064 413185 cl02089 Phage_tail_S Phage virion morphogenesis family. This model describes protein S of phage P2, suggested experimentally to act in tail completion and stable head joining, and related proteins from a number of phage. [Mobile and extrachromosomal element functions, Prophage functions] 0
40065 413186 cl02091 Glyco_transf_15 Glycolipid 2-alpha-mannosyltransferase. This is a family of alpha-1,2 mannosyl-transferases involved in N-linked and O-linked glycosylation of proteins. Some of the enzymes in this family have been shown to be involved in O- and N-linked glycan modifications in the Golgi. 0
40066 413187 cl02092 Clat_adaptor_s Clathrin adaptor complex small chain. 0
40067 413188 cl02093 Coq4 Coenzyme Q (ubiquinone) biosynthesis protein Coq4. Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial-targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN). 0
40068 413189 cl02095 CDC50 LEM3 (ligand-effect modulator 3) family / CDC50 family. Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w, and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39. 0
40069 413190 cl02096 Gti1_Pac2 Gti1/Pac2 family. In S. pombe the gti1 protein promotes the onset of gluconate uptake upon glucose starvation. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade. 0
40070 413191 cl02098 14-3-3 14-3-3 domain. This 14-3-3 domain family includes proteins in Caenorhabditis elegans, the silkworm (Bombyx mori) as well as barley (Hordeum vulgare). In C. elegans, 14-3-3 proteins are SIR-2.1 binding partners which induce transcriptional activation of DAF-16 during stress and are required for the life-span extension conferred by extra copies of sir-2.1. In B. mori, the 14-3-3 proteins are expressed widely in larval and adult tissues, including the brain, fat body, Malpighian tube, silk gland, midgut, testis, ovary, antenna, and pheromone gland, and interact with the N-terminal fragment of Hsp60, suggesting that 14-3-3 (a molecular adaptor) and Hsp60 (a molecular chaperone) work together to achieve a wide range of cellular functions in B. mori. In barley aleurone cells, 14-3-3 proteins and members of the ABF transcription factor family have a regulatory function in the gibberellic acid (GA) pathway since the balance of GA and abscisic acid (ABA) is a determining factor during transition of embryogenesis and seed germination. 14-3-3 is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 0
40071 413192 cl02099 CK_II_beta Casein kinase II regulatory subunit. Casein kinase II subunit beta; Provisional 0
40072 413193 cl02102 S10_plectin Plectin/S10 domain. 40S ribosomal protein S10; Provisional 0
40073 413194 cl02103 Maf1 Maf1 regulator. Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB. 0
40074 413195 cl02104 Ribosomal_L36e Ribosomal protein L36e. 60S ribosomal protein L36; Provisional 0
40075 413196 cl02106 IF4E Eukaryotic initiation factor 4E. translation initiation factor E4; Provisional 0
40076 413197 cl02107 Evr1_Alr Erv1 / Alr family. Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter. 0
40077 413198 cl02109 GWT1 GWT1. Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast. 0
40078 413199 cl02110 Pho88 Phosphate transport (Pho88). Members of this family of proteins are involved in regulating inorganic phosphate transport, as well as telomere length regulation and maintenance. 0
40079 413200 cl02111 PCI PCI domain. Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function. 0
40080 413201 cl02113 Vac_ImportDeg Vacuolar import and degradation protein. Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteosome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole. 0
40081 413202 cl02117 ORMDL ORMDL family. Evidence suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1). 0
40082 413203 cl02120 HAT_KAT11 Histone acetylation protein. Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin. KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast. This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11. 0
40083 413204 cl02121 Med31 SOH1. The family consists of Saccharomyces cerevisiae SOH1 homologs. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes. 0
40084 413205 cl02122 TFIIF_beta Transcription initiation factor IIF, beta subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter. 0
40085 413206 cl02125 Med6 MED6 mediator sub complex component. Component of RNA polymerase II holoenzyme and mediator sub complex. 0
40086 413207 cl02127 RNA_pol_Rpc34 RNA polymerase Rpc34 subunit. Subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment. 0
40087 413208 cl02129 ParBc ParB-like nuclease domain. This domain is probably distantly related to pfam02195, suggesting that these uncharacterized proteins have a nuclease function. 0
40088 413209 cl02130 Got1 Got1/Sft2-like family. Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments. 0
40089 413210 cl02137 PRA1 PRA1 family protein. This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. 0
40090 413211 cl02138 G10 G10 protein. 0
40091 413212 cl02144 TLD TLD. This domain is predicted to be an enzyme and is often found associated with pfam01476. Its structure consists of a beta-sandwich surrounded by two helices and two one-turn helices. 0
40092 382862 cl02148 APC10-like APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination. This model represents the APC10/DOC1-like domain present in the uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) of Mus musculus. Members of this family contain EF-hand, APC10, CUB, and zinc finger ZZ-type domains. ZZEF1-like APC10 domains are homologous to the APC10 subunit/DOC1 domains present in E3 ubiquitin ligases, which mediate substrate ubiquitination (or ubiquitylation), and are components of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. 0
40093 413213 cl02150 TAF10 The TATA Binding Protein (TBP) Associated Factor 10. The TATA Binding Protein (TBP) Associated Factor 10 (TAF 10) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of the seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. Several hypotheses are proposed for TAF functions, such as serving as activator-binding sites, being involved in core-promoter recognition, or to perform an essential catalytic activity. Each TAF - with the help of a specific activator - is required only for the expression of a subset of genes, and TAFs are not universally involved in transcription such as the GTFs. TAF10 regulates genes that are important for cell cycle progression and cell morphology. A lack of TAF10 leads to cell cycle arrest and cell death by apoptosis in mouse. In both yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF10 is part of other transcription regulatory multiprotein complexes (e.g., SAGA, TBP-free TAF-containing complex [TFTC], STAGA, and PCAF/GCN5). Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. The minimal HFD contains three alpha-helices linked by two loops. The HFD is found in core histones, TAFs and many other transcription factors. Five HF-containing TAF pairs have been described in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10. 0
40094 413214 cl02153 TFIIE_beta_winged_helix TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity. General transcription factor TFIIE consists of two subunits, TFIIE alpha pfam02002 and TFIIE beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The structure of the DNA binding core region has been solved and has a winged helix fold. 0
40095 413215 cl02154 YL1_C YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins. 0
40096 413216 cl02155 ER_lumen_recept ER lumen protein retaining receptor. 0
40097 413217 cl02156 PTPLA Protein tyrosine phosphatase-like protein, PTPLA. 3-hydroxyacyl-CoA dehydratase subunit of elongase 0
40098 413218 cl02160 Rcd1 Cell differentiation family, Rcd1-like. Two of the members in this family have been characterized as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation. 0
40099 413219 cl02161 Ssu72 Ssu72-like protein. The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs. 0
40100 413220 cl02162 Fip1 Fip1 motif. This short motif is about 40 amino acids in length. It is found in the Fip1 protein, a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Th1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor. 0
40101 413221 cl02164 Utp11 Utp11 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome. 0
40102 413222 cl02165 CBFB_NFYA CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. 0
40103 413223 cl02166 RRS1 Ribosome biogenesis regulatory protein (RRS1). This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae. 0
40104 413224 cl02170 Sec62 Translocation protein Sec62. Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions. [Transport and binding proteins, Amino acids, peptides and amines] 0
40105 413225 cl02172 Per1 Per1-like family. PER1 is required for GPI-phospholipase A2 activity and is involved in lipid remodelling of GPI-anchored proteins. PER1 is part of the CREST superfamily. 0
40106 242920 cl02174 TAF13 The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This family includes the Spt3 yeast transcription factors and the 18kD subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold 0
40107 413226 cl02175 Rer1 Rer1 family. RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C-terminus of yeast Rer1p interacts with a coatomer complex. 0
40108 413227 cl02176 TAF11 TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C-terminal of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. The conserved region contains four alpha helices and three loops arranged as in histone H3. 0
40109 413228 cl02183 Zincin_2 Zincin-like metallopeptidase. A phylogenetic tree of the DUF2342 family (TIGR03624) consists of two major branches. One of these branches, modeled here, is found almost entirely in coenzyme F420 biosynthesizing species of the Actinobacterial, Chloroflexi and Archaeal lineages. The few organisms having genes within this family and lacking F420 biosynthesis may either have an undiscovered F420 transporter, or may represent F420-to-FMN revertants. This family includes a Chloroflexus aurantiacus protein whose crystal structure has been determined (PDB:3CMN_A). This has been annotated as a putative hydrolase, but the support for that assertion is untraceable. There is no cofactor present in the structure. 0
40110 413229 cl02185 DUF1093 Protein of unknown function (DUF1093). This model represents a family of small (about 115 amino acids) uncharacterized proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members. [Hypothetical proteins, Conserved] 0
40111 413230 cl02186 Plus-3 Plus-3 domain. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1. 0
40112 413231 cl02188 CcdA Post-segregation antitoxin CcdA. This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein. 0
40113 413232 cl02193 VirB5_like VirB5 protein family. Based on Bacteroides thetaiotaomicron gene BT_4772, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host vs. when in culture. 0
40114 413233 cl02197 MmcB-like DNA repair protein MmcB-like. This family includes Caulobacter MmcB (CCNA_03580), which is involved in DNA repair. It has been proposed to be an endonuclease that creates the substrate for translesion synthesis. 0
40115 413234 cl02206 DUF1312 N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431 are part of DUF1312. This domain is found in some NusG proteins where it forms domain II. However most NusG proteins are missing this domain. In other cases this domain is found in isolation. The function of this domain is unknown. 0
40116 413235 cl02207 IalB Invasion associated locus B (IalB) protein. This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is upregulated in response to environmental cues signaling vector-to-host transmission. Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aid B. bacilliformis survival under stress-inducing environmental conditions. The role of this protein in other bacterial species is unknown. 0
40117 413236 cl02210 DUF2335 Predicted membrane protein (DUF2335). Members of this family of hypothetical bacterial proteins have no known function. 0
40118 413237 cl02211 DUF983 Protein of unknown function (DUF983). hypothetical protein; Provisional 0
40119 413238 cl02212 DUF2169 Uncharacterized protein conserved in bacteria (DUF2169). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40120 413239 cl02216 Terminase_6C Terminase RNaseH-like domain. This model represents the C-terminal region of a set of phage proteins typically about 400-500 amino acids in length, although some members are considerably shorter. An article on Methanobacterium phage Psi-M2 calls the member from that phage, ORF9, a putative large terminase subunit, and ORF8 a candidate terminase small subunit. Most proteins in this family have an apparent P-loop nucleotide-binding sequence toward the N-terminus. [Mobile and extrachromosomal element functions, Prophage functions] 0
40121 413240 cl02219 Bap31 B-cell receptor-associated protein 31-like. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31. 0
40122 413241 cl02228 ATP12 ATP12 chaperone protein. Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesized as precursors with amino-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly. 0
40123 382891 cl02232 DUF2306 Predicted membrane protein (DUF2306). Members of this family of hypothetical bacterial proteins have no known function. 0
40124 413242 cl02233 NTP_transf_8 Nucleotidyltransferase. This is a family of bacterial proteins that have a nucleotidyltransferase fold. The fold-prediction is backed up by conservation of three highly characteristic sequence motifs found in all other nucleotidyl transferases: i) pDhDhhh(h/p), where p is a polar residue and h is a hydrophobic residue; ii) upstream of the first, a GG/S; iii) a conserved D/E in a hydrophobic surround. In a proposed classification of nucleotidyltransferases, this is a group XVIII NTP-transferase. Many of these sequences were classified in the COG database as COG5397. The exact function is not known. 0
40125 382893 cl02234 DUF2286 Uncharacterized protein conserved in archaea (DUF2286). Members of this family of hypothetical archaeal proteins have no known function. 0
40126 413243 cl02235 DUF1134 Protein of unknown function (DUF1134). This family consists of several hypothetical bacterial proteins of unknown function. 0
40127 413244 cl02241 DUF2301 Uncharacterized integral membrane protein (DUF2301). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40128 413245 cl02246 DUF2285 Uncharacterized conserved protein (DUF2285). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40129 413246 cl02247 Rop-like Rop-like. This family contains several uncharacterized bacterial proteins. These proteins are found in nitrogen fixation operons so are likely to play some role in this process. They consist of two alpha helices which are joined by a four residue linker. The helices form an antiparallel bundle and cross towards their termini. They are likely to form a rod-like dimer. They have structural similarity to the regulatory protein Rop, pfam01815. 0
40130 413247 cl02248 DUF2284 Predicted metal-binding protein (DUF2284). Members of this family of metal-binding hypothetical bacterial proteins have no known function. 0
40131 413248 cl02250 DUF2298 Uncharacterized membrane protein (DUF2298). Members of this highly hydrophobic probable integral membrane family belong to two classes. In one, a single copy of the region covered by this model represents essentially the full length of a strongly hydrophobic protein of about 700 to 900 residues (variable because of long inserts in some). The domain architecture of the other class consists of an additional N-terminal region, two copies of the region represented by this model, and three to four repeats of TPR, or tetratricopeptide repeat. The unusual species range includes several Archaea, several Chloroflexi, and Clostridium phytofermentans. An unusual motif YYYxG is present, and we suggest the name Chlor_Arch_YYY protein. The function is unknown. 0
40132 413249 cl02251 DUF2283 Protein of unknown function (DUF2283). Members of this family of hypothetical bacterial proteins have no known function. 0
40133 413250 cl02253 SCPU Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins, as well as a family of secreted pili proteins involved in motility and biofilm formation. 0
40134 413251 cl02259 YibE_F YibE/F-like protein. The sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE and YibF. Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues. 0
40135 413252 cl02261 DUF2299 Uncharacterized conserved protein (DUF2299). Members of this family of hypothetical bacterial proteins have no known function. 0
40136 413253 cl02262 Tm-1-like ATP-binding domain found in plant Tm-1-like (Tm-1L) and similar proteins. hypothetical protein; Provisional 0
40137 413254 cl02266 CbtA Probable cobalt transporter subunit (CbtA). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five trans-membrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site. 0
40138 413255 cl02268 DUF2460 Conserved hypothetical protein 2217 (DUF2460). This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes. [Mobile and extrachromosomal element functions, Prophage functions] 0
40139 413256 cl02273 HlyU Transcriptional activator HlyU. This domain, found in various hypothetical prokaryotic proteins, has no known function. One of the sequences in this family corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members. 0
40140 413257 cl02275 RcnB Nickel/cobalt transporter regulator. RcnB is a family of Proteobacteria proteins. RcnB is required for maintaining metal ion homeostasis, in conjunction with the efflux pump RcnA, family NicO, pfam03824. 0
40141 413258 cl02278 DUF2164 Uncharacterized conserved protein (DUF2164). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40142 413259 cl02284 DUF1059 Protein of unknown function (DUF1059). This family consists of several short hypothetical archaeal proteins of unknown function. 0
40143 413260 cl02289 DUF2190 Uncharacterized conserved protein (DUF2190). This domain, found in various hypothetical prokaryotic proteins, as well as in some putative RecA/RadA recombinases, has no known function. 0
40144 413261 cl02290 DUF2165 Predicted small integral membrane protein (DUF2165). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40145 413262 cl02291 DUF2189 Predicted integral membrane protein (DUF2189). Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established. 0
40146 413263 cl02292 Crr6 Chlororespiratory reduction 6. The protein Slr1097 and its functionally equivalent cyanobacterial homologs are required for proper maturation of NdhI, a subunit of NADPH dehydrogenase complexes, so that NDH-1 complexes can assemble properly. The related protein in the model plant species Arabidopsis thaliana is known as CRR6 (chlororespiratory reduction 6). 0
40147 413264 cl02293 DUF2158 Uncharacterized small protein (DUF2158). Members of this family of prokaryotic proteins have no known function. 0
40148 413265 cl02294 DUF2160 Predicted small integral membrane protein (DUF2160). The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet. 0
40149 413266 cl02296 DUF1036 Protein of unknown function (DUF1036). This family consists of several hypothetical bacterial proteins of unknown function. 0
40150 413267 cl02298 DUF2161 Putative PD-(D/E)XK phosphodiesterase (DUF2161). This family of proteins is functionally uncharacterized. This family of proteins is found in prokaryotes. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily. 0
40151 413268 cl02302 DUF2244 Integral membrane protein (DUF2244). This domain, found in various bacterial hypothetical and putative membrane proteins, has no known function. 0
40152 413269 cl02303 DUF736 Protein of unknown function (DUF736). This family consists of several uncharacterized bacterial proteins of unknown function. 0
40153 413270 cl02309 DUF2259 Predicted secreted protein (DUF2259). Members of this family of hypothetical bacterial proteins have no known function. 0
40154 413271 cl02310 Glyco_hydro_81 Glycosyl hydrolase family 81. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein ENGL1, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release. 0
40155 413272 cl02314 DUF2267 Uncharacterized conserved protein (DUF2267). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40156 242981 cl02317 DUF819 Protein of unknown function (DUF819). This family contains proteins of unknown function from archaeal, bacterial and plant species. 0
40157 413273 cl02318 DUF1694 Protein of unknown function (DUF1694). This family contains many hypothetical proteins. 0
40158 413274 cl02319 DUF1428 Protein of unknown function (DUF1428). This family consists of several hypothetical bacterial and one archaeal sequence of around 120 residues in length. The function of this family is unknown. The structure of this family shows it to be part of the Dimeric-alpha-beta-barrel superfamily. Many members are annotated as being RNA signal recognition particle 4.5S RNA, but this could not be verified. 0
40159 413275 cl02324 DUF721 Protein of unknown function (DUF721). hypothetical protein; Provisional 0
40160 413276 cl02325 Inhibitor_I42 Chagasin family peptidase inhibitor I42. Chagasin is a cysteine peptidase inhibitor which forms a beta barrel structure. 0
40161 413277 cl02331 Intg_mem_TP0381 Integral membrane protein (intg_mem_TP0381). This model represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc. 0
40162 413278 cl02333 Bac_rhodopsin Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). 0
40163 413279 cl02335 DUF2269 Predicted integral membrane protein (DUF2269). Members of this family of bacterial hypothetical integral membrane proteins have no known function. 0
40164 413280 cl02337 DUF2270 Predicted integral membrane protein (DUF2270). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40165 413281 cl02338 DUF2303 Uncharacterized conserved protein (DUF2303). Members of this family of hypothetical bacterial proteins have no known function. 0
40166 413282 cl02341 Sec66 Preprotein translocase subunit Sec66. Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures. 0
40167 413283 cl02344 Phage_holin_1 Bacteriophage holin. Phage proteins for bacterial lysis typically include a membrane-disrupting protein, or holin, and one or more cell wall degrading enzymes that reach the cell wall because of holin action. Holins are found in a large number of mutually non-homologous families. [Mobile and extrachromosomal element functions, Prophage functions] 0
40168 413284 cl02346 Tmemb_14 Transmembrane proteins 14C. This family of short membrane proteins is as yet uncharacterized. 0
40169 413285 cl02349 DUF2277 Uncharacterized conserved protein (DUF2277). Members of this family of hypothetical bacterial proteins have no known function. 0
40170 413286 cl02351 NifT NifT/FixU protein. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein. 0
40171 413287 cl02353 DUF2280 Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function. 0
40172 382939 cl02355 DUF2281 Protein of unknown function (DUF2281). Members of this family of hypothetical bacterial proteins have no known function. 0
40173 413288 cl02356 CGGC CGGC domain. The domain has many conserved cysteines and histidines suggestive of a zinc binding function. 0
40174 413289 cl02360 Mor Mor transcription activator family. Mor (Middle operon regulator) is a sequence specific DNA binding protein. It mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. The N terminal region of Mor is the dimerization region, and the C terminal contains a helix-turn-helix motif which binds DNA. 0
40175 413290 cl02363 CusF_Ec Copper binding periplasmic protein CusF. periplasmic copper-binding protein; Provisional 0
40176 413291 cl02366 DUF2282 Predicted integral membrane protein (DUF2282). Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function. 0
40177 413292 cl02369 DUF624 Protein of unknown function, DUF624. This family includes several uncharacterized bacterial proteins. 0
40178 413293 cl02370 DUF1810 Protein of unknown function (DUF1810). This is a family of uncharacterized proteins. The structure of one of the members in this family has been solved and it adopts a mainly alpha helical structure. 0
40179 413294 cl02371 DUF2292 Uncharacterized small protein (DUF2292). OscA (organosulfur compound A) is a small protein, about 60 amino acids in length, in the DUF2292 family. As characterized in Pseudomonas corrugata, OscA is required during sulfur starvation for obtaining it from organosulfur compounds. The pathway is required to remediate oxidative stress from chromate, so oscA was discovered by the loss of high resistance to chromate in Pseudomonas corrugata 28 when the gene is insertionally inactivated. The oscA gene tends to be found near sulfate transporter genes. 0
40180 413295 cl02373 DUF2293 Uncharacterized conserved protein (DUF2293). This domain, found in various hypothetical bacterial proteins, has no known function. 0
40181 413296 cl02374 DUF2461 Conserved hypothetical protein (DUF2461). Members of this family are widely (though sparsely) distributed bacterial proteins about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown. In several fungi, this model identifies a conserved region of a longer protein. Therefore, it may be incorrect to speculate that all members share a common function. 0
40182 413297 cl02375 DUF1326 Protein of unknown function (DUF1326). This family consists of several hypothetical bacterial proteins which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N-terminus. The function of this family is unknown. 0
40183 413298 cl02376 DUF2390 Protein of unknown function (DUF2390). Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various Proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model. [Hypothetical proteins, Conserved] 0
40184 413299 cl02380 DUF2310 Zn-ribbon-containing, possibly nucleic-acid-binding protein (DUF2310). Members of this family of proteobacterial zinc ribbon proteins are thought to bind to nucleic acids, however their exact function has not as yet been defined. 0
40185 413300 cl02381 Tim17 Tim17/Tim22/Tim23/Pmp24 family. mitochondrial import inner membrane translocase subunit tim17; Provisional 0
40186 413301 cl02384 NOT2_3_5 NOT2 / NOT3 / NOT5 family. NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5. 0
40187 413302 cl02390 DUF2294 Uncharacterized conserved protein (DUF2294). Members of this family of hypothetical bacterial proteins have no known function. 0
40188 413303 cl02395 DUF2291 Predicted periplasmic lipoprotein (DUF2291). Members of this family of hypothetical bacterial proteins have no known function. 0
40189 382955 cl02396 DUF2290 Uncharacterized conserved protein (DUF2290). Members of this family of hypothetical bacterial proteins have no known function. 0
40190 413304 cl02398 Host_attach Protein required for attachment to host cells. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 0
40191 413305 cl02399 DUF2288 Protein of unknown function (DUF2288). Members of this family of hypothetical bacterial proteins have no known function. 0
40192 413306 cl02406 DUF2274 Protein of unknown function (DUF2274). Members of this family of hypothetical bacterial proteins have no known function. 0
40193 413307 cl02411 RES RES domain. This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues in length. 0
40194 413308 cl02412 Rep_1 Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase. 0
40195 413309 cl02415 DUF922 Bacterial protein of unknown function (DUF922). This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. 0
40196 413310 cl02417 Myelin_PLP Myelin proteolipid protein (PLP or lipophilin). 0
40197 413311 cl02418 Hormone_5 Neurohypophysial hormones, C-terminal Domain. Vasopressin/oxytocin gene family. 0
40198 413312 cl02419 Notch LNR domain. The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains. 0
40199 413313 cl02422 HRM Hormone receptor domain. This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 0
40200 413314 cl02423 LRRNT Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. 0
40201 413315 cl02425 Osteopontin Osteopontin. Osteopontin is an acidic phosphorylated glycoprotein of about 40 kDa which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone. 0
40202 413316 cl02426 DIX DIX domain. Domain of unknown function. 0
40203 413317 cl02428 Ependymin Ependymin. Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds. 0
40204 413318 cl02432 CLECT C-type lectin (CTL)/C-type lectin-like (CTLD) domain. This family includes both long and short form C-type 0
40205 413319 cl02434 CNH CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations. 0
40206 413320 cl02436 COLFI Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. 0
40207 413321 cl02440 DAGK_acc Diacylglycerol kinase accessory domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known. 0
40208 413322 cl02442 DEP N/A. The DEP domain is responsible for mediating intracellular protein targeting and regulation of protein stability in the cell. The DEP domain is present in a number of signaling molecules, including Regulator of G protein Signaling (RGS) proteins, and has been implicated in membrane targeting. New findings in yeast, however, demonstrate a major role for a DEP domain in mediating the interaction of an RGS protein to the C-terminal tail of a GPCR, thus placing RGS in close proximity with its substrate G protein alpha subunit. 0
40209 351761 cl02446 MATH N/A. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans. 0
40210 413323 cl02447 CRD_FZ CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines. 0
40211 413324 cl02448 Hormone_6 Glycoprotein hormone. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. 0
40212 413325 cl02449 Gla Vitamin K-dependent carboxylation/gamma-carboxyglutamic (GLA) domain. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration. 0
40213 413326 cl02451 Hydrophobin Fungal hydrophobin. 0
40214 321941 cl02453 IlGF_like N/A. Superfamily includes insulins; relaxins; insulin-like growth factor; and bombyxin. All are secreted regulatory hormones. Disulfide rich, all-alpha fold. Alignment includes B chain, linker (which is processed out of the final product), and A chain. 0
40215 413327 cl02465 BTK BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region. 0
40216 413328 cl02467 C4 C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. 0
40217 413329 cl02471 HX N/A. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metallopeptidases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs). 0
40218 413330 cl02472 IGFBP Insulin-like growth factor binding protein. High affinity binding partners of insulin-like growth factors. 0
40219 413331 cl02473 IL6 Interleukin-6/G-CSF/MGF family. GCSF is a family of higher eukaryotic granulocyte colony-stimulating factor proteins. Granulocyte colony-stimulating factors are cytokines that are involved in haematopoiesis. They control the production, differentiation and function of white blood cell granulocytes. GCSF binds to the extracellular Ig-like and CRH domain of its receptor GCSFR, thereby triggering the receptor to homodimerize. Homodimerization results in activation of Janus tyrosine kinase-signal transducers and other activators of transcription (JAK-STAT)-type signalling cascades. 0
40220 413332 cl02475 LIM LIM is a small protein-protein interaction domain, containing two zinc fingers. This family represents two copies of the LIM structural domain. 0
40221 413333 cl02480 MyTH4 MyTH4 domain. Domain present twice in myosin-VIIa, and also present in 3 other myosins. 0
40222 413334 cl02481 NGF Nerve growth factor family. NGF is important for the development and maintenance of the sympathetic and sensory nervous systems. 0
40223 413335 cl02483 PI3K_p85B PI3-kinase family, p85-binding domain. Region of p110 PI3K that binds the p85 subunit. 0
40224 413336 cl02484 PI3K_rbd PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation). 0
40225 413337 cl02485 RasGEF N/A. Guanine nucleotide exchange factor for Ras-like small GTPases. 0
40226 413338 cl02488 SPEC N/A. Spectrin repeat-domains are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin domain-repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. Although the domain occurs in multiple repeats along sequences, the domains are actually stable on their own - i.e. they act, biophysically, like domains rather than repeats that only function when aggregated. 0
40227 413339 cl02491 VHP Villin headpiece domain. 0
40228 413340 cl02494 SapA Saposin A-type domain. Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin amoebapores and granulysin. Putative phospholipid membrane binding domains. 0
40229 382990 cl02495 RabGAP-TBC Rab-GTPase-TBC domain. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases. 0
40230 413341 cl02501 IL10 Interleukin 10. Interleukin-22 is distantly related to interleukin (IL)-10, and is produced by activated T cells. IL-22 is a ligand for CRF2-4, a member of the class II cytokine receptor family. 0
40231 413342 cl02505 PTN_MK_N PTN/MK heparin-binding protein family, N-terminal domain. Heparin-binding domain family. 0
40232 413343 cl02506 SAA Serum amyloid A protein. Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins. 0
40233 413344 cl02507 SEA SEA domain. Proposed function of regulating or binding carbohydrate sidechains. 0
40234 413345 cl02508 Somatomedin_B Somatomedin B domain. Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity. 0
40235 413346 cl02509 SRCR_2 Scavenger receptor cysteine-rich domain. Members of this family form an extracellular domain of the serine protease hepsin. They are formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate. 0
40236 413347 cl02510 TGF_beta Transforming growth factor beta like domain. Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types. 0
40237 413348 cl02511 GH64-TLP-SF glycoside hydrolase family 64 (beta-1,3-glucanases which produce specific pentasaccharide oligomers) and thaumatin-like proteins. Family 64 glycoside hydrolases have beta-1,3-glucanase activity. 0
40238 413349 cl02512 NTR_like N/A. Sequence similarity between netrin UNC-6 and C345C complement protein family members, and hence the existence of the UNC-6 module, was first reported in. Subsequently, many additional members of the family were identified on the basis of sequence similarity between the C-terminal domains of netrins, complement proteins C3, C4, C5, secreted frizzled-related proteins, and type I pro-collagen C-proteinase enhancer proteins (PCOLCEs), which are homologous with the N-terminal domains of tissue inhibitors of metalloproteinases (TIMPs). The TIMPs are classified as a separate family in Pfam (pfam00965). This expanded domain family has been named as the NTR module. 0
40239 413350 cl02516 VWD von Willebrand factor type D domain. Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation. 0
40240 413351 cl02517 ZU5 ZU5 domain. Domain of unknown function. 0
40241 413352 cl02518 BTB BTB/POZ domain. In voltage-gated K+ channels this domain is responsible for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. In KCTD1 this domain functions as a transcriptional repressor. It also mediates homomultimerisation of KCTD1 and interaction of KCTD1 with the transcription factor AP-2-alpha. 0
40242 413353 cl02520 REM N/A. A subset of guanine nucleotide exchange factor for Ras-like small GTPases appear to possess this motif/domain N-terminal to the RasGef (Cdc25-like) domain. 0
40243 413354 cl02521 CBM_1 Fungal cellulose binding domain. Small four-cysteine cellulose-binding domain of fungi 0
40244 413355 cl02522 Calx-beta Calx-beta domain. Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins) 0
40245 413356 cl02524 GAS2 Growth-Arrest-Specific Protein 2 Domain. GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain 0
40246 413357 cl02526 Peptidase_S41 C-terminal processing peptidase family S41. tail specific protease 0
40247 413358 cl02528 Crystall Beta/Gamma crystallin. Beta/gamma crystallins 0
40248 413359 cl02529 Ank Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities. 0
40249 413360 cl02533 SOCS N/A. The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases. 0
40250 413361 cl02535 F-box-like F-box-like. This domain is approximately 50 amino acids long, and is usually found in the N-terminal half of a variety of proteins. Two motifs that are commonly found associated with the F-box domain are the leucine rich repeats (LRRs; pfam00560 and pfam07723) and the WD repeat (pfam00400). The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression. 0
40251 413362 cl02536 SAND SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerization. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox carteri. This region of RegA is known as the VARL domain. 0
40252 413363 cl02539 BAG BAG domain. BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains 0
40253 413364 cl02541 CIDE_N N/A. This domain is found in CAD nuclease and ICAD, the inhibitor of CAD nuclease. The two proteins interact through this domain. 0
40254 413365 cl02542 DnaJ N/A. DnaJ domains (J-domains) are associated with hsp70 heat-shock system and it is thought that this domain mediates the interaction. DnaJ-domain is therefore part of a chaperone (protein folding) system. The T-antigens, although not in Prosite are confirmed as DnaJ containing domains from literature. 0
40255 413366 cl02544 VHS_ENTH_ANTH VHS, ENTH and ANTH domain superfamily. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2. 0
40256 413367 cl02546 Granulin Granulin. 0
40257 413368 cl02548 Laminin_B Laminin B (Domain IV). 0
40258 413369 cl02549 OLF Olfactomedin-like domain. 0
40259 351799 cl02553 Peptidase_C19 N/A. A subfamily of peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 0
40260 413370 cl02554 PWWP N/A. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The domain binds to Histone-4 methylated at lysine-20, H4K20me, suggesting that it is a methyl-lysine recognition motif. Removal of two conserved aromatic residues in a hydrophobic cavity created by this domain within the full-length protein, Pdp1, abolishes the interaction of the protein with H4K20me3. In fission yeast, Set9 is the sole enzyme that catalyzes all three states of H4K20me, and Set9-mediated H4K20me is required for efficient recruitment of checkpoint protein Crb2 to sites of DNA damage. The methylation of H4K20 is involved in a diverse array of cellular processes, such as organising higher-order chromatin, maintaining genome stability, and regulating cell-cycle progression. 0
40261 413371 cl02556 Bromodomain N/A. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 0
40262 413372 cl02557 DM DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerize and bind palindromic DNA. 0
40263 413373 cl02558 GED Dynamin GTPase effector domain. 0
40264 413374 cl02559 GPS GPCR proteolysis site, GPS, motif. Present in latrophilin/CL-1, sea urchin REJ and polycystin. 0
40265 413375 cl02562 PWI PWI domain. 0
40266 413376 cl02563 PX_domain The Phox Homology domain, a phosphoinositide binding module. PX domains bind to phosphoinositides. 0
40267 413377 cl02564 PXA PXA domain. unpubl. observations 0
40268 413378 cl02565 RGS Regulator of G protein signaling (RGS) domain superfamily. Members of this family adopt a structure consisting of twelve helices that fold into a compact domain that contains the overall structural scaffold observed in other RGS proteins and three additional helical elements that pack closely to it. Helices 1-9 comprise the RGS (pfam00615) fold, in which helices 4-7 form a classic antiparallel bundle adjacent to the other helices. Like other RGS structures, helices 7 and 8 span the length of the folded domain and form essentially one continuous helix with a kink in the middle. Helices 10-12 form an apparently stable C-terminal extension of the structural domain, and although other RGS proteins lack this structure, these elements are intimately associated with the rest of the structural framework by hydrophobic interactions. Members of the family bind to active G-alpha proteins, promoting GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways. 0
40269 413379 cl02566 SET SET domain. Putative methyl transferase, based on outlier plant homologues 0
40270 413380 cl02568 WSC WSC domain. Domain present in WSC proteins, polycystin and fungal exoglucanase 0
40271 413381 cl02569 RasGAP Ras GTPase Activating Domain. This family features the C-terminal regions of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in path finding and patterning of both neurons and developing blood vessels. The cytoplasmic region, which has been called a SEX domain in some members of this family, is involved in downstream signalling pathways, by interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins. This domain acts as a RasGAP domain. 0
40272 413382 cl02570 RhoGAP N/A. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. 0
40273 413383 cl02571 RhoGEF N/A. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases Also called Dbl-homologous (DH) domain. It appears that pfam00169 domains invariably occur C-terminal to RhoGEF/DH domains. 0
40274 413384 cl02573 Tudor_SF Tudor domain superfamily. This group contains SMN, SPF30, Tudor domain-containing protein 3 (TDRD3), DNA excision repair protein ERCC-6-like 2 (ERCC6L2), and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalyst role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. Members of this group contain a single Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. 0
40275 413385 cl02574 Annexin Annexin. This family of annexins also includes giardin that has been shown to function as an annexin. 0
40276 413386 cl02575 Bcl-2_like N/A. (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation 0
40277 413387 cl02578 HRDC HRDC domain. RecQ helicases unwind DNA in an ATP-dependent manner. Sgs1 has a HRDC (helicase and RNaseD C-terminal) domain which modulates the helicase function via auxiliary contacts to DNA. 0
40278 413388 cl02581 KRAB_A-box KRAB (Kruppel-associated box) domain -A box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation. 0
40279 413389 cl02594 DD_R_PKA Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIbeta plays an important role in adipocytes and neuronal tissues. Mice deficient with RIIbeta have small fat cells, and are resistant to obesity, diet-induced diabetes, and alcohol-induced motor defects. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. 0
40280 413390 cl02596 NR_DBD_like DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other. 0
40281 413391 cl02598 Copper-fist Copper fist DNA binding domain. The domain is named for its resemblance to a fist. It can be found in some fungal transcription factors. These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding. 0
40282 413392 cl02599 Ets Ets-domain. variation of the helix-turn-helix motif 0
40283 413393 cl02600 HTH_MerR-SF Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily. This domain is a DNA-binding helix-turn-helix domain. 0
40284 413394 cl02601 PSI Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman). 0
40285 383044 cl02602 STE STE like transcription factor. 0
40286 413395 cl02603 TEA TEA/ATTS domain family. 0
40287 413396 cl02605 SCAN SCAN oligomerization domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerization. 0
40288 413397 cl02608 BAH N/A. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction. 0
40289 413398 cl02609 Zn-ribbon C-terminal zinc ribbon domain of RNA polymerase intrinsic transcript cleavage subunit. TFIIS is a zinc-containing transcription factor. It has been shown in vitro to have distinct biochemical activities, including binding to RNA polymerases, stimulation of transcript elongation, and activation of a nascent RNA cleavage activity in the RNA polymerase II (Pol II) elongation complex. TFIIS consists of three domains. Domain II and III are sufficient for all known TFIIS activities. Domain III is a zinc ribbon that is separated from domain II by a long linker and is indispensable for TFIIS function. The TFIIS homologs, subunits A12.2, B9, and C11, of Pol I, II, and III respectively, are required for RNA cleavage by the polymerases. In a single organism, there are tissue-specific TFIIS related proteins. 0
40290 413399 cl02610 FF FF domain. RhoGAP-FF1 is the FF domain of the Rho GTPase activating proteins (GAPs). These are the key proteins that make the switch between the active guanosine-triphosphate-bound form of Rho guanosine triphosphatases (GTPases) and the inactive guanosine-diphosphate-bound form. Rho guanosine triphosphatases (GTPases) are a family of proteins with key roles in the regulation of actin cytoskeleton dynamics. The RhoGAP-FF1 region contains the FF domain that has been implicated in binding to the transcription factor TFII-I; and phosphorylation of Tyr308 within the first FF domain inhibits this interaction. The RhoGAPFF1 domain constitutes the first solved structure of an FF domain that lacks the first of the two highly conserved Phe residues, but the substitution of Phe by Tyr does not affect the domain fold. 0
40291 413400 cl02611 G-patch G-patch domain. Yeast Spp2, a G-patch protein and spliceosome component, interacts with the ATP-dependent DExH-box splicing factor Prp2. As this interaction involves the G-patch sequence in Spp2 and is required for the recruitment of Prp2 to the spliceosome before the first catalytic step of splicing, it is proposed that Spp2 might be an accessory factor that confers spliceosome specificity on Prp2. 0
40292 413401 cl02612 Link_Domain N/A. Link_domain_KIAA0527_like; this domain is found in the human protein KIAA0527. Sequence-wise, it is highly similar to the link domain. The link domain is a hyaluronan-binding (HA) domain. KIAA0527 contains a single link module. The KIAA0527 gene was originally cloned from human brain tissue. 0
40293 413402 cl02614 SPRY SPRY domain. SPRY Domain is named from SPla and the RYanodine Receptor. Domain of unknown function. Distant homologs are domains in butyrophilin/marenostrin/pyrin homologs. 0
40294 413403 cl02616 MACPF MAC/Perforin domain. Membrane attack complex/ Perforin (MACPF) Superfamily; Provisional 0
40295 413404 cl02617 Sorb Sorbin homologous domain. First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins. 0
40296 413405 cl02619 Smr Smr domain. This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity. The full-length human NEDD4-binding protein 2 also has the polynucleotide kinase activity. 0
40297 413406 cl02620 SAD_SRA SAD/SRA domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533. 0
40298 413407 cl02621 TGF_beta_GS Transforming growth factor beta type I GS-motif. An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity. 0
40299 413408 cl02622 Pre-SET Pre-SET motif. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished. 0
40300 413409 cl02623 WIF WIF domain. Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity. 0
40301 413410 cl02626 DNA_pol_A Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase delta, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide; a 5'-3' polymerase and 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. 0
40302 413411 cl02628 XPG_N XPG N-terminal domain. domain in nucleases 0
40303 383060 cl02629 CBM_14 Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 0
40304 413412 cl02632 PRP4 pre-mRNA processing factor 4 (PRP4) like. This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing. 0
40305 413413 cl02633 ARID ARID/BRIGHT DNA binding domain. Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. 0
40306 413414 cl02637 TFIIS_M Transcription factor S-II (TFIIS), central domain. Transcription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and pfam01096 are required for transcription activity. 0
40307 413415 cl02638 Hairy_orange Hairy Orange. This domain confers specificity among members of the Hairy/E(SPL) family. 0
40308 413416 cl02640 SAP SAP domain. The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins. 0
40309 413417 cl02642 PABP Poly-adenylate binding protein, unique domain. Involved in homodimerisation (either directly or indirectly) 0
40310 413418 cl02643 PSP PSP. Proline rich domain found in numerous spliceosome associated proteins. 0
40311 413419 cl02648 NIDO Nidogen-like. This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. 0
40312 413420 cl02649 LEM LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins. The LEM domain is 50 residues long and is composed of two parallel alpha helices. This domain is found in inner nuclear membrane proteins. It is called the LEM domain after LAP2, Emerin, and Man1. 0
40313 413421 cl02650 FYRN F/Y-rich N-terminus. is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. 0
40314 413422 cl02651 FYRC F/Y rich C-terminus. is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. 0
40315 413423 cl02652 MIF4G MIF4G domain. Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press) 0
40316 413424 cl02653 MA3 MA3 domain. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains Ponting (TIBS) "Novel eIF4G domain homologues" in press 0
40317 413425 cl02656 zf-RanBP Zn-finger in Ran binding protein and others. Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP. 0
40318 413426 cl02658 TAFH NHR1 homology to TAF. Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1). 0
40319 295419 cl02659 z-alpha Adenosine deaminase z-alpha domain. Helix-turn-helix-containing domain. Also known as Zab. 0
40320 413427 cl02660 zf-TAZ TAZ zinc finger. The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumor suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC. 0
40321 413428 cl02661 A_deamin Adenosine-deaminase (editase) domain. Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defense against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc. 0
40322 413429 cl02662 SEP SEP domain. The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain. 0
40323 413430 cl02663 Fasciclin Fasciclin domain. This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria. 0
40324 413431 cl02666 KU N/A. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain. This domain is found in both the Ku70 and Ku80 proteins that form a DNA binding heterodimer. 0
40325 413432 cl02672 L27 L27 domain. The L27 domain is a protein interaction module that exists in a large family of scaffold proteins, functioning as an organisation centre of large protein assemblies required for the establishment and maintenance of cell polarity. L27 domains form specific heterotetrameric complexes, in which each domain contains three alpha-helices. 0
40326 413433 cl02674 DDT DDT domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors) and is approximately 60 residues in length. Along with the WHIM motifs, it comprises an entirely alpha helical module found in diverse eukaryotic chromatin proteins. Based on the structure of Ioc3, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. In particular, the DDT domain, in combination with the WHIM1 and WHIM2 motifs form the SLIDE domain binding pocket. 0
40327 295427 cl02675 DZF DZF domain. The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. 0
40328 413434 cl02676 HSA HSA. This domain is predicted to bind DNA and is often found associated with helicases. 0
40329 413435 cl02677 POX Associated with HOX. The function of this domain is unknown. It is often found in plant proteins associated with pfam00046. 0
40330 413436 cl02684 zf-DBF DBF zinc finger. This domain is predicted to bind metal ions and is often found associated with pfam00533 and pfam02178. It was first identified in the Drosophila chiffon gene product, and is associated with initiation of DNA replication. 0
40331 413437 cl02686 PRY SPRY-associated domain. SPRY and PRY domains occur on PYRIN proteins. Their function is not known. 0
40332 413438 cl02687 RWD RWD domain. This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices. 0
40333 413439 cl02688 BRK BRK domain. The function of this domain is unknown. It is often found associated with helicases and transcription factors. 0
40334 413440 cl02689 RUN RUN domain. This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases. 0
40335 413441 cl02694 LCCL LCCL domain. Rxt3 has been shown in yeast to be required for histone deacetylation. 0
40336 413442 cl02699 VIT Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. 0
40337 413443 cl02701 Kelch_1 Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415. 0
40338 413444 cl02703 zf-BED BED zinc finger. DNA-binding domain in chromatin-boundary-element-binding proteins and transposases 0
40339 413445 cl02704 EphR_LBD Ligand Binding Domain of Ephrin Receptors. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand. 0
40340 413446 cl02706 Malt_amylase_C Maltogenic Amylase, C-terminal domain. This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 110 amino acids in length. This domain is found associated with pfam00128, pfam02922. 0
40341 413447 cl02708 Big_2 Bacterial Ig-like domain (group 2). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial and phage surface proteins such as intimins. 0
40342 413448 cl02712 PGRP N/A. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding. 0
40343 413449 cl02713 MurNAc-LAA N/A. This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation. Engulfment in Bacillus subtilis is mediated by two complementary systems: the first includes the proteins SpoIID, SpoIIM and SpoIIP (DMP) which carry out the engulfment, and the second includes the SpoIIQ-SpoIIIAGH (Q-AH) zipper, that recruits other proteins to the septum in a second-phase of the engulfment. The course of events follows as the incorporation firstly of SpoIIB into the septum during division to serve directly or indirectly as a landmark for localising SpoIIM and then SpoIIP and SpoIID to the septum. SpoIIP and SpoIID interact together to form part of the DMP complex. SpoIIP itself has been identified as an autolysin with peptidoglycan hydrolase activity. 0
40344 413450 cl02715 Surp Surp module. domain present in regulators which are responsible for pre-mRNA splicing processes 0
40345 413451 cl02716 RNA_pol_Rpb8 RNA polymerase Rpb8. RNA_pol_RpbG is a family of archaeal and fungal subunit G of DNA-directed RNA polymerase. 0
40346 295446 cl02717 RNA_POL_M_15KD RNA polymerases M/15 Kd subunit. 0
40347 413452 cl02720 PB1 N/A. Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate. 0
40348 413453 cl02729 WWE WWE domain. The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. 0
40349 413454 cl02731 CLIP Regulatory CLIP domain of proteinases. Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme. 0
40350 413455 cl02735 DM13 Electron transfer DM13. The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme. 0
40351 413456 cl02739 THAP THAP domain. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes. 0
40352 413457 cl02748 zf-CDGSH Iron-binding zinc finger CDGSH type. The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm. 0
40353 413458 cl02754 zf-LITAF-like LITAF-like zinc ribbon domain. Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumor necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure. 0
40354 413459 cl02755 LAM LA motif RNA-binding domain. This presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins. The function of this region is uncertain. 0
40355 413460 cl02758 AMOP AMOP domain. This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. 0
40356 413461 cl02759 TRAM_LAG1_CLN8 TLC domain. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains. 0
40357 413462 cl02760 NEAT NEAr Transport domain, a component of cell surface proteins. NEAT domains are heme and/or hemoprotein-binding modules highly conserved in secondary structure. They have roles in hemoprotein binding, heme extraction and heme transfer 0
40358 413463 cl02763 ChW Clostridial hydrophobic W. A novel extracellular macromolecular system has been proposed based on the proteins containing ChW repeats. ChW stands for Clostridial hydrophobic with conserved W (tryptophan). This repeat was originally described in Clostridium acetobutylicum but is also found in other Gram-positive bacteria including Enterococcus faecalis, Streptococcus agalactiae and Streptomyces coelicolor. 0
40359 383112 cl02765 zf-WRNIP1_ubi Werner helicase-interacting protein 1 ubiquitin-binding domain. Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids. 0
40360 413464 cl02766 NGN N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily. Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit. 0
40361 413465 cl02768 PASTA N/A. This domain is found at the C termini of several Penicillin-binding proteins and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 0
40362 413466 cl02770 CFEM CFEM domain. This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis. The structure of the CFEM domain containing protein 'Surface antigen protein 2' from Candida albicans has been solved. 0
40363 413467 cl02772 BSD BSD domain. This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. 0
40364 275778 cl02773 HTTM Horizontally Transferred TransMembrane Domain. Members of this protein family resemble SdpB (Sporulation Delaying Protein B), an integral membrane protein associated with production of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus. 0
40365 413468 cl02774 Topoisomer_IB_N N/A. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. This family may be more than one structural domain. 0
40366 413469 cl02775 Oxidoreductase_nitrogenase N/A. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This metal cluster-binding family is related to nitrogenase structural protein NifD and accessory protein NifE, among others. [Energy metabolism, Methanogenesis] 0
40367 413470 cl02776 GST_C_family C-terminal, alpha helical domain of the Glutathione S-transferase family. Leishmania major and Trypanosoma cruzi glutathione-S-transferase (GST) has undergone gene duplication, diversification, and gene fusion leading to a four-domain enzyme which contains two repeats of a GST N-terminal domain followed by a GST C-terminal domain. 0
40368 351886 cl02777 chaperonin_like N/A. This family consists of GroEL, the larger subunit of the GroEL/GroES cytosolic chaperonin. It is found in bacteria, organelles derived from bacteria, and occasionally in the Archaea. The bacterial GroEL/GroES group I chaperonin is replaced by a group II chaperonin, usually called the thermosome in the Archaeota and CCT (chaperone-containing TCP) in the Eukaryota. GroEL, thermosome subunits, and CCT subunits all fall under the scope of pfam00118. [Protein fate, Protein folding and stabilization] 0
40369 413471 cl02779 TRFH N/A. Telomere repeat binding factor (TRF) family proteins are important for the regulation of telomere stability. The two related human TRF proteins hTRF1 and hTRF2 form homodimers and bind directly to telomeric TTAGGG repeats via the myb DNA binding domain pfam00249 at the carboxy terminus. TRF1 is implicated in telomere length regulation and TRF2 in telomere protection. Other telomere complex associated proteins are recruited through their interaction with either TRF1 or TRF2. The fission yeast protein Taz1p (telomere-associated in Schizosaccharomyces pombe) has similarity to both hTRF1 and hTRF2 and may perform the dual functions of TRF1 and TRF2 at fission yeast telomeres. This domain is composed of multiple alpha helices arranged in a solenoid conformation similar to TPR repeats. The fungal members have now also been found to carry two double strand telomeric repeat binding factors. 0
40370 413472 cl02780 MIT_C N/A. MIT_C is the C-terminal domain of MIT-containing proteins, pfam04212. It contains an unanticipated phospholipase d fold (PLD fold) that binds avidly to phosphoinositide-containing membranes. It is conserved in eukaryotes, though not fungi and plants, and some bacteria. 0
40371 351888 cl02781 tetraspanin_LEL N/A. Tetraspanin, extracellular domain or large extracellular loop (LEL), oculospanin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains sequences similar to oculospanin, which is found to be expressed in retinal pigment epithelium, iris, ciliary body, and retinal ganglion cells. 0
40372 413473 cl02782 ERp29c N/A. ERp29 is a ubiquitously expressed endoplasmic reticulum protein found in mammals. ERp29 is comprised of two domains. This domain, the C-terminal domain, has an all helical fold. ERp29 is thought to form part of the thyroglobulin folding complex. 0
40373 413474 cl02783 TopoII_MutL_Trans N/A. Members of this family adopt a structure consisting of a four-stranded beta-sheet backed by three alpha-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme. 0
40374 413475 cl02784 Chelatase_Class_II N/A. The function of CbiX is uncertain, however it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation. 0
40375 413476 cl02785 Elongation_Factor_C N/A. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopt a ferredoxin-like fold. 0
40376 413477 cl02786 Translation_factor_III Domain III of Elongation factor (EF) Tu (EF-TU) and related proteins. Members of this family, which are found in the initiation factors eIF2 and EF-Tu, adopt a structure consisting of a beta barrel with Greek key topology. They are required for formation of the ternary complex with GTP and initiator tRNA. 0
40377 413478 cl02787 Translation_Factor_II_like Domain II of Elongation factor Tu (EF-Tu)-like proteins. Elongation factor Tu consists of several structural domains, and this is usually the fourth. 0
40378 413479 cl02788 Ser_Recombinase N/A. The N-terminal domain of the resolvase family (this family) contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase - see pfam02796. 0
40379 413480 cl02789 EFG_like_IV N/A. This domain is found in elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ribosomal protein S5 domain 2-like fold. 0
40380 413481 cl02792 Cyt_c_Oxidase_IV N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit IV. The Dictyostelium member of this family is called COX VI. The yeast protein MTC3 appears to be the yeast COX IV subunit. 0
40381 413482 cl02793 Cyt_c_Oxidase_Va N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit Va. 0
40382 413483 cl02794 Cyt_c_Oxidase_VIb N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the potentially heme-binding subunit IVb of the oxidase. 0
40383 413484 cl02795 Cyt_c_Oxidase_VIc N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIc. 0
40384 413485 cl02796 Cyt_c_Oxidase_VIIa N/A. Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This family also contains both heart and liver isoforms of cytochrome c oxidase subunit VIIa. 0
40385 413486 cl02797 Cyt_c_Oxidase_VIIc N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII. 0
40386 413488 cl02806 Laminin_N Laminin N-terminal (Domain VI). N-terminal domain of laminins and laminin-related protein such as Unc-6/ netrins. 0
40387 413489 cl02808 RT_like N/A. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses. 0
40388 413493 cl02823 phosphagen_kinases Phosphagen (guanidino) kinases. The substrate binding site is located in the cleft between N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. 0
40389 413500 cl02844 Arrestin_C Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; Cone photoreceptors C-arrestin (arrestin-X), which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction. 0
40390 413509 cl02872 DHQ_Fe-ADH Dehydroquinate synthase-like (DHQ-like) and iron-containing alcohol dehydrogenases (Fe-ADH). The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide. 3-dehydroquinate (DHQ) synthase catalyzes the formation of dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino heptulosonic 7 phosphate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. 0
40391 413511 cl02879 Chloroa_b-bind Chlorophyll A-B binding protein. photosystem II light-harvesting-Chl-binding protein Lhcb6 (CP24); Provisional 0
40392 413513 cl02885 Ebola_HIV-1-like_HR1-HR2 heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and human immunodeficiency virus type 1 (HIV-1). This family includes envelope protein from a variety of retroviruses. It includes the GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) which mediate membrane fusion during viral entry. The family also includes bovine immunodeficiency virus, feline immunodeficiency virus and Equine infectious anaemia (EIAV). The family also includes the Gp36 protein from mouse mammary tumor virus (MMTV) and human endogenous retroviruses (HERVs). 0
40393 295537 cl02891 E7 E7 protein, Early protein. E7 protein; Provisional 0
40394 295552 cl02915 Voltage_gated_ClC N/A. ClC-6-like chloride channel proteins. This CD includes ClC-6, ClC-7 and ClC-B, C, D in plants. Proteins in this family are ubiquitous in eukaryotes and their functions are unclear. They are expressed in intracellular organelle membranes. This family belongs to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. Members of the ClC chloride ion channel superfamily perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, and transepithelial transport in animals. 0
40395 413523 cl02916 POLO_box Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides. 0
40396 413528 cl02928 TGFb_propeptide TGF-beta propeptide. The DNRLRE domain, with a length of about 160 amino acids, appears typically in large, repetitive surface proteins of bacteria and archaea, sometimes repeated several times. It occurs, notably, three times in the C-terminal region of the enzyme disaggregatase from the archaeal species Methanosarcina mazei, each time with the motif DNRLRE, for which the domain is named. Archaeal proteins within this family are described particularly well by the currently more narrowly defined Pfam model, PF06848. Note that the catalytic region of disaggregatase, in the N-terminal portion of the protein, is modeled by a different HMM, PF08480. 0
40397 413529 cl02929 Cation_ATPase_C Cation transporting ATPase, C-terminus. PhoLip_ATPase_C is found at the C-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes. 0
40398 413530 cl02930 Cation_ATPase_N Cation transporter/ATPase, N-terminus. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases. 0
40399 413532 cl02948 GH20_hexosaminidase N/A. This family consists of several uncharacterized proteins found in various Bacteroides and Chloroflexus species. The function of this family is unknown. 0
40400 413533 cl02954 Gas_vesicle Gas vesicle protein. 0
40401 413536 cl02959 Glyco_hydro_9 Glycosyl hydrolase family 9. endoglucanase 0
40402 413539 cl02977 Ribosomal_L15e Ribosomal L15. 50S ribosomal protein L15e; Validated 0
40403 413546 cl02990 ASC Amiloride-sensitive sodium channel. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06). The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the vertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds] 0
40404 413547 cl02993 P2X_receptor ATP P2X receptor. ATP-gated Cation Channel (ACC) Family (TC 1.A.7). Members of the ACC family (also called P2X receptors) respond to ATP, a functional neurotransmitter released by exocytosis from many types of neurons. These channels, which function at neuron-neuron and neuron-smooth muscle junctions, may play roles in the control of blood pressure and pain sensation. They may also function in lymphocyte and platelet physiology. They are found only in animals. ACC channels are probably hetero- or homomultimers and transport small monovalent cations (Me+). Some also transport Ca2+; a few also transport small metabolites. [Transport and binding proteins, Cations and iron carrying compounds] 0
40405 413550 cl03000 Innexin Innexin. viral inexin-like protein; Provisional 0
40406 413552 cl03008 ATP-synt_8 ATP synthase protein 8. ATP synthase F0 subunit 8; Provisional 0
40407 413553 cl03012 Ammonium_transp Ammonium Transporter Family. Members of this protein family are a well-conserved subclass of putative ammonium transporters, belonging to the much broader set of ammonium/methylammonium transporters described by TIGR00836. Species with this transporter tend to be marine bacteria. Partial phylogenetic profiling (PPP) picks a member of this protein family as the single best-scoring protein vs. a reference profile for the marine environment Genome Property for a large number of different query genomes. This finding by PPP suggests that this transporter family represents an important adaptation to the marine environment. 0
40408 413555 cl03019 VSV_P-protein-C_like C-terminal domain of Vesicular stomatitis Indiana virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of plant viruses belonging to the Rhabdoviridae animal family such as Vesicular stomatitis Indiana virus (VSV). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 0
40409 413558 cl03026 CBM_3 Cellulose binding domain. 0
40410 413563 cl03042 MHC_II_beta Class II histocompatibility antigen, beta domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection). 0
40411 413568 cl03055 DNA_gyraseB_C DNA gyrase B subunit, carboxyl terminus. TOPRIM_C is found as the C-terminal extension of the TOPRIM domain, pfam01751 in metazoa. 0
40412 413569 cl03056 CPSase_sm_chain Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. The small chain has a GATase domain in the carboxyl terminus. 0
40413 413570 cl03058 MHC_II_alpha Class II histocompatibility antigen, alpha domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection). 0
40414 351936 cl03065 Flavi_M Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the proteins entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus. 0
40415 413575 cl03075 GrpE nucleotide exchange factor GrpE. heat shock protein GrpE; Provisional 0
40416 413580 cl03088 MobM_relaxase relaxase domain of MobM and similar proteins. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation). 0
40417 413585 cl03093 Defensin_2 Arthropod defensin. The actinodefensin family is named (here) as an Actinomyces-specific branch of the (otherwise eukaryotic) arthropod defensin family described by Pfam model PF01097. 0
40418 413588 cl03104 CKS Cyclin-dependent kinase regulatory subunit. cyclin-dependent kinases regulatory subunit; Provisional 0
40419 413590 cl03107 ETX_MTX2 Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2. This family represents the pore forming lobe of aerolysin. 0
40420 413592 cl03113 Peptidase_U32 Peptidase family U32. putative protease; Provisional 0
40421 413593 cl03114 RNase_PH RNase PH-like 3'-5' exoribonucleases. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria. 0
40422 413596 cl03119 FpgNei_N N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the N-terminal domain, which contains eight beta-strands, forming a beta-sandwich with two alpha-helices parallel to its edges. 0
40423 413597 cl03120 ELO GNS1/SUR4 family. fatty acid elongase; Provisional 0
40424 413601 cl03129 T2SSN Type II secretion system (T2SS), protein N. Members of this family are the N (or GspN) protein of type II secretion systems (T2SS) as found in Leptospira, Geobacter, Myxococcus, and several other genera. Sequence similarity to GspN as found in, say, Gammaproteobacteria (see pfam01203) is extremely remote. [Protein fate, Protein and peptide secretion and trafficking] 0
40425 413603 cl03131 Dynein_light Dynein light chain type 1. dynein light chain; Provisional 0
40426 413607 cl03141 Ribosomal_S7e Ribosomal protein S7e. 40S ribosomal protein S7; Provisional 0
40427 413613 cl03152 TbpB_B_D C-lobe and N-lobe beta barrels of Tf-binding protein B. HpuA is a family of Neisseria spp proteins from the hpuAB operon, which are putative porphyrin transporters. 0
40428 413614 cl03164 Col_Im_like inhibitory immunity (Im) protein of colicin (Col) deoxyribonuclease (DNase) and pyocins. This family contains inhibitory immunity (Im) proteins that bind to colicin endonucleases (DNases) or pyocins with very high affinity and specificity; this is critical for the neutralization of endogenous DNase catalytic activity and for protection against exogenous DNase bacteriocins. The DNase colicin family (ColE2, ColE7, ColE8 and ColE9) in E. coli, and pyocin family (S1, S2, S3 and AP41) in P. aeruginosa, are potent bacteriocins where the immunity proteins (Ims) protect the colicin/pyocin producing (i.e. colicinogenic) bacteria by binding and inactivating colicin nucleases. The binding affinities between cognate and non-cognate nucleases by Im proteins can vary up to 10 orders of magnitude. 0
40429 351959 cl03170 CheB_like methylesterase CheB domain family. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors, fused with a CheR domain as well as other domains. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheB and cheR are typically found in the same operon. However, CheB and CheR are fused in multi-domain proteins in this subgroup. The CheR protein/domain includes an all-alpha N-terminal domain and an S-adenosylmethionine-dependent methyltransferase C-terminal domain. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR. 0
40430 413619 cl03179 PARP_regulatory Poly A polymerase regulatory subunit. poly(A) polymerase small subunit; Provisional 0
40431 413620 cl03181 Peptidase_C25_N Peptidase C25 family N-terminal domain, found in Arg-gingipain (Rgp), Lys-gingipain (Kgp) and related proteins. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidases family C25 are a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemaglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network. 0
40432 413625 cl03191 CpcD CpcD/allophycocyanin linker domain. 0
40433 413628 cl03205 Jacalin_like Jacalin-like lectin domain. This beta-prism fold lectin is the C-terminal domain of the Vibrio cholerae cytolytic pore-forming toxin hemolysin. It binds to N-glycans with a heptasaccharide GlcNAc4Man3 core (NGA2). 0
40434 413635 cl03224 Porin3 Eukaryotic porin family that forms channels in the mitochondrial outer membrane. MDM10 is a family of eukaryotic proteins that forms a subunit of the SAM complex for biogenesis of beta-barrel proteins, though not porins, into the outer mitochondrial membrane. 0
40435 413636 cl03225 GRIP GRIP domain. The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1. 0
40436 413637 cl03230 DAHP_synth_2 Class-II DAHP synthetase family. phospho-2-dehydro-3-deoxyheptonate aldolase 0
40437 413643 cl03253 SAM_decarbox Adenosylmethionine decarboxylase. This enzyme is a key regulatory enzyme of the polyamine synthetic pathway. This protein is a pyruvoyl-dependent enzyme. The proenzyme is cleaved at a Ser residue that becomes a pyruvoyl group active site. [Central intermediary metabolism, Polyamine biosynthesis] 0
40438 383302 cl03283 Allergen_V_VI Group V, VI major allergens from grass, including Phlp 5, Phlp 6, Pha a 5 and Lol p 5. This family contains grass pollen proteins of group V. Phleum pratense pollen allergen Phl p 5b has been shown to possess ribonuclease activity. 0
40439 413657 cl03302 Glyco_hydro_12 Glycosyl hydrolase family 12. hypothetical protein; Provisional 0
40440 413658 cl03304 Plasmid_parti Putative plasmid partition protein. This family consists of conserved hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete, some of which are putative plasmid partition proteins. 0
40441 413667 cl03348 Ribosomal_L22e Ribosomal L22e protein family. 60S ribosomal protein L22; Provisional 0
40442 413668 cl03350 Ribosomal_L28e Ribosomal L28e protein family. 60S ribosomal protein L28; Provisional 0
40443 413669 cl03352 Ribosomal_L38e Ribosomal L38e protein family. 60S ribosomal protein L38; Provisional 0
40444 413671 cl03356 DcrB DcrB. This family consists of the 23 kDa subunit of oxygen evolving system of photosystem II or PsbP from various plants (where it is encoded by the nuclear genome) and Cyanobacteria. The 23 KDa PsbP protein is required for PSII to be fully operational in vivo, it increases the affinity of the water oxidation site for Cl- and provides the conditions required for high affinity binding of Ca2+. 0
40445 413673 cl03371 Peptidase_G1_like Peptidases of the G1 family and homologs that might lack peptidase activity. This family of proteins is found in bacteria. Proteins in this family are typically between 236 and 351 amino acids in length. The member from Bacillus subtilis, UniProtKB:O05411, is named YrpD. 0
40446 413677 cl03379 Myo5-like_CBD Cargo binding domain of myosin 5 and similar proteins. The DIL domain has no known function. 0
40447 413678 cl03381 pVHL von Hippel-Landau (pVHL) tumor suppressor protein. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA. 0
40448 413680 cl03398 DUF111 Protein of unknown function DUF111. Members of this family are found in the Archaea and in several different bacterial lineages. The function is unknown and the genomic context is not well conserved. [Hypothetical proteins, Conserved] 0
40449 413681 cl03400 DUF137 Protein of unknown function DUF137. This family of archaeal proteins has no known function. 0
40450 413688 cl03420 Gallidermin Gallidermin. Mutacins are lantibiotics in the epidermin/gallidermin/nisin family, found in the biofilm-forming dental caries pathogen Streptococcus mutans. Named members of the family include mutacin I and mutacin 1140. This HMM separates the mutacins (MutA) from paralog MutA' encoded nearby, which lacks mutacin activity. 0
40451 413692 cl03428 MAS20 MAS20 protein import receptor. [Transport and binding proteins, Amino acids, peptides and amines] 0
40452 413697 cl03445 V35-RBD_P-protein-C_like C-terminal RNA-binding domain (RBD) domain of Ebola virus VP35 phosphoprotein and related proteins. This family includes the C-terminal RNA-binding domain (RBD) of the P protein of viruses belonging to the Filoviridae family, such as Ebola virus or Marburg virus. VP35-RBD contains two subdomains: an alpha-helical subdomain and a beta-sheet subdomain. Virus infection typically activates host innate immunity, including the interferon (IFN) signaling pathway; VP35-RBD binds double-stranded RNA (dsRNA) inhibiting IFN-alpha/beta signaling. The family Filoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. 0
40453 413700 cl03449 M35_like Peptidase M35 family. This is the catalytic region of aspzincins, a group of lysine-specific metallo-endopeptidases in the MEROPS:M35 family. They exhibit the following active-site architecture. The active site is composed of two helices and a loop region and includes the HExxH and GTxDxxYG motifs. In UniProt:P81054, His117, His121 and Asp130 coordinate to the catalytic zinc ligands. An electrostatically negative region composed of Asp154 and Glu157 attracts a positively charged Lys side chain of a substrate in a specific manner. 0
40454 351997 cl03493 Alpha_TIF Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology. 0
40455 413718 cl03503 Fe_hyd_SSU Iron hydrogenase small subunit. Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while Iron-only FeFe (Fe only) hydrogenases in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold. 0
40456 413722 cl03508 TFIIA_gamma_N Gamma subunit of transcription initiation factor IIA, N-terminal helical domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The N-terminal domain of the gamma subunit is a 4 helix bundle. 0
40457 413727 cl03519 UreI_AmiS_like UreI/AmiS family, proton-gated urea channel and putative amide transporters. This family includes UreI and proton gated urea channel as well as putative amide transporters. 0
40458 295899 cl03540 HDC Histidine carboxylase PI chain. This enzyme converts histidine to histamine in a single step by catalyzing the release of CO2. This type is synthesized as an inactive single chain precursor, then cleaved into two chains. The Ser at the new N-terminus at the cleavage site is converted to a pyruvoyl group essential for activity. This type of histidine decarboxylase is known so far only in some Gram-positive bacteria, where it may play a role in amino acid catabolism. There is also a pyridoxal phosphate type histidine decarboxylase, as found in human, where histamine is a biologically active amine. [Energy metabolism, Amino acids and amines] 0
40459 413743 cl03554 Decorin_bind Decorin binding protein. This family consists of decorin binding proteins from Borrelia. The decorin binding protein of Borrelia burgdorferi the lyme disease spirochetes adheres to the proteoglycan decorin found on collagen fibers. 0
40460 413747 cl03563 MraZ protein domain of unknown function (UPF0040) includes MraZ. This small 70 amino acid domain is found duplicated in a family of bacterial proteins. These proteins may be DNA-binding transcription factors (Pers. comm. A Andreeva & A Murzin). It is likely, due to the similarity of fold, that this family acts as a bacterial antitoxin like the MazE antitoxin family. 0
40461 413748 cl03567 Ycf4 Ycf4. photosystem I assembly protein Ycf4; Provisional 0
40462 186578 cl03578 MerT MerT mercuric transport protein. putative mercuric transport protein; Provisional 0
40463 413752 cl03585 PSI_PsaE Photosystem I reaction centre subunit IV / PsaE. photosystem I reaction center subunit IV; Provisional 0
40464 413755 cl03589 Chalcone_3 Chalcone isomerase-like. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, 4',5,7-trihydroxyflavanone, a key step in the biosynthesis of flavonoids. 0
40465 413762 cl03620 DUF5011 Domain of unknown function (DUF5011). This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion. 0
40466 383422 cl03627 PSI_PsaF Photosystem I reaction centre subunit III. photosystem I reaction center subunit III; Provisional 0
40467 413768 cl03639 PsaD PsaD. photosystem I reaction center subunit II; Provisional 0
40468 413769 cl03646 SRAP SOS response associated peptidase (SRAP). hypothetical protein; Provisional 0
40469 413770 cl03649 HemD N/A. This family consists of uroporphyrinogen-III synthase HemD EC:4.2.1.75 also known as Hydroxymethylbilane hydrolyase (cyclizing) from eukaryotes, bacteria and archaea. This enzyme catalyzes the reaction: Hydroxymethylbilane <=> uroporphyrinogen-III + H(2)O. Some members of this family are multi-functional proteins possessing other enzyme activities related to porphyrin biosynthesis, such as HemD with pfam00590, however the aligned region corresponds with the uroporphyrinogen-III synthase EC:4.2.1.75 activity only. Uroporphyrinogen-III synthase is the fourth enzyme in the heme pathway. Mutant forms of the Uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria in humans a recessive inborn error of metabolism also known as Gunther disease. 0
40470 413771 cl03651 PsaL Photosystem I reaction centre subunit XI. photosystem I reaction center protein subunit XI; Provisional 0
40471 413772 cl03656 PS_Dcarbxylase Phosphatidylserine decarboxylase. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha chains). It is an integral membrane protein. A closely related family, possibly also active as phosphatidylserine decarboxylase, falls under model TIGR00164. [Fatty acid and phospholipid metabolism, Biosynthesis] 0
40472 413785 cl03715 Mago_nashi Mago nashi proteins, integral members of the exon junction complex. This family was originally identified in Drosophila and called mago nashi, it is a strict maternal effect, grandchildless-like, gene. The human homolog has been shown to interact with an RNA binding protein. An RNAi knockout of the C. elegans homolog causes masculinization of the germ line (Mog phenotype) hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination. Mago nashi has been found to be part of the exon-exon junction complex that binds 20 nucleotides upstream of exon-exon junctions. 0
40473 413789 cl03728 Alpha_kinase Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. 0
40474 413793 cl03741 Glyco_hydro_20b Glycosyl hydrolase family 20, domain 2. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the N-terminal region of alpha-glucuronidase. The N-terminal domain forms a two-layer sandwich, each layer being formed by a beta sheet of five strands. A further two helices form part of the interface with the central, catalytic, module (pfam07488). 0
40475 413798 cl03749 STAT_int STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain. 0
40476 413801 cl03758 SRP54_N SRP54-type protein, helical bundle domain. This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. 0
40477 413802 cl03759 Alpha_adaptinC2 Adaptin C-terminal domain. Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology. The adaptor appendage contains an additional N-terminal strand. 0
40478 413803 cl03763 CaMBD Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. 0
40479 413808 cl03779 Enterotoxin_a Heat-labile enterotoxin alpha chain. 0
40480 413817 cl03803 BAF Barrier to autointegration factor. Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to laminins filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA. 0
40481 413821 cl03812 Me-amine-dh_L Methylamine dehydrogenase, L chain. This family consists of the light chain of methylamine dehydrogenase, a periplasmic enzyme. This subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from Trp-114 and Trp-165 of the precursor, numbered according to the sequence from Paracoccus denitrificans. The enzyme forms a complex with the type I blue copper protein amicyanin and cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines] 0
40482 413824 cl03816 FokI_N N-terminal DNA recognition domain of restriction endonuclease FokI and similar proteins. Restriction endonuclease FokI (EC 3.1.21.4), also called R.FokI, or endonuclease FokI, is a type IIS restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. FokI recognizes the double-stranded sequence 5'-GGATG-3'/3'-CATCC-5' and cleaves 14 bases after G-1 and 13 bases before C-1, respectively. It contains an N-terminal DNA recognition domain and a C-terminal endonuclease domain. This model describes the DNA recognition domain. The family also includes endonuclease StsI, a type IIS restriction endonuclease found in Streptococcus sanguinis 54. It recognizes the same sequence as FokI but cleaves at different positions. 0
40483 413828 cl03831 HlyIII Haemolysin-III related. This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens and Drosophila. In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role. 0
40484 413829 cl03835 RABV_P-protein-C_like C-terminal domain of Rabies virus phosphoprotein and related proteins. This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit, which is thought to be a component of the active polymerase, and may be involved in template binding. 0
40485 413833 cl03849 PSS Phosphatidyl serine synthase. CDP-diacylglycerol-serine O-phosphatidyltransferase 0
40486 413836 cl03855 CemA CemA family. proton extrusion protein PcxA; Provisional 0
40487 383496 cl03860 ComC COMC family. Members of this family are BlpC, a peptide pheromone that stimulates production of BLP (bacteriocin-like peptides) family class II bacteriocins. BlpC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish BlpC from other pheromone precursors, such as ComC. 0
40488 413838 cl03870 NPL Nucleoplasmin-like domain. Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays. 0
40489 413845 cl03888 PTPA N/A. Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphorylase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumor suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1. 0
40490 413846 cl03892 WRKY WRKY DNA -binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding. 0
40491 413851 cl03904 CAT_RBD CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template. 0
40492 413852 cl03905 EXS EXS family. We have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles. 0
40493 413853 cl03906 GAT_SF GAT domain found in eukaryotic GGAs, metazoan Tom1-like proteins, metazoan STAMs, fungal Vps27, and similar proteins. STAM-2 is a Hrs-binding protein involved in intracellular signal transduction mediated by cytokines and growth factors. STAM-2 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. STAM-2, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM2 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM2 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting. 0
40494 413854 cl03910 AnfG_VnfG Vanadium/alternative nitrogenase delta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfG, represents the delta subunit of the V-containing (vanadium) alternative nitrogenase. It is homologous to AnfG, the delta subunit of the Fe-only nitrogenase. [Central intermediary metabolism, Nitrogen fixation] 0
40495 413859 cl03918 CHB_HEX Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 0
40496 413860 cl03922 V-ATPase_G Vacuolar (H+)-ATPase G subunit. This model describes the vacuolar ATP synthase G subunit in eukaryotes and includes members from diverse groups e.g., fungi, plants, parasites etc. V-ATPases are multi-subunit enzymes composed of two functional domains: A transmembrane Vo domain and a peripheral catalytic domain V1. The G subunit is one of the subunits of the catalytic domain. V-ATPases are responsible for the acidification of endosomes and lysosomes, which are part of the central vacuolar system. [Energy metabolism, ATP-proton motive force interconversion] 0
40497 413861 cl03923 BURP BURP domain. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown. 0
40498 413866 cl03934 MerC MerC mercury resistance protein. 0
40499 413867 cl03935 NifW Nitrogen fixation protein NifW. Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of these proteins in the functions of nitrogenase are unclear. Using yeast two-hybrid screening it has been shown that NifW can interact with itself as well as NifZ. 0
40500 413868 cl03936 MEV_P-protein-C_like C-terminal domain of Measles virus phosphoprotein and related proteins. Paramyxoviridae P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L pfam00946. Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In mumps, simian virus type 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini. This C-terminal region of the P phosphoprotein is likely to be the nucleocapsid-binding domain, and is found to be intrinsically disordered and thus liable to induced folding. 0
40501 413874 cl03951 CDC37_N Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain, which binds predominantly to protein kinases and is found N terminal to the Hsp (heat shock protein) 90-binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function. 0
40502 296151 cl03953 ESAG1 ESAG protein. expression site-associated gene (ESAG); Provisional 0
40503 383536 cl03956 PSI_PsaH Photosystem I reaction centre subunit VI. photosystem I reaction centre subunit VI; Provisional 0
40504 413879 cl03973 DUF269 Protein of unknown function, DUF269. Members of this protein family, called DUF269 by pfam03270, are strictly limited to nitrogen-fixing species, although not universal among them. The gene typically is found next to the nifX gene (see TIGRFAMs model TIGR02663). [Central intermediary metabolism, Nitrogen fixation] 0
40505 296175 cl03994 Trp_dioxygenase Tryptophan 2,3-dioxygenase. Members of this family are tryptophan 2,3-dioxygenase, as confirmed by several experimental characterizations, and by conserved operon structure for many of the other members. This enzyme represents the first of a two-step degradation to L-kynurenine, and a three-step pathway (via kynurenine) to anthranilate plus alanine. [Energy metabolism, Amino acids and amines] 0
40506 413887 cl04000 Cornichon Cornichon protein. predicted protein; Provisional 0
40507 383563 cl04057 DUF1256 Protein of unknown function (DUF1256). This model describes a tetrameric protease that makes the rate-limiting first cut in the small, acid-soluble spore proteins (SASP) of Bacillus subtilis and related species. The enzyme lacks clear homology to other known proteases. It processes its own amino end before becoming active to cleave SASPs. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination] 0
40508 383566 cl04069 EspA EspA-like secreted protein. EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir are exported by a type III secretion system. These proteins are essential for attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which forms a direct link between the bacterium and the host cell. 0
40509 413906 cl04084 dDENN dDENN domain. The dDENN domain is part of the tripartite DENN domain. It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 0
40510 413907 cl04085 uDENN uDENN domain. The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 0
40511 413909 cl04088 TRCF TRCF domain. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript. 0
40512 413914 cl04104 Arg_tRNA_synt_N Arginyl tRNA synthetase N terminal domain. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition. 0
40513 413916 cl04109 Methyltransf_7 SAM dependent carboxyl methyltransferase. indole-3-acetate carboxyl methyltransferase 0
40514 413918 cl04114 Channel_Tsx Nucleoside-specific channel-forming protein, Tsx. This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 235 amino acids in length. 0
40515 413923 cl04129 Invas_SpaK Invasion protein B family. type III secretion system chaperone SpaK; Provisional 0
40516 413931 cl04142 VRP3 Salmonella virulence-associated 28kDa protein. 0
40517 413933 cl04145 Peptidase_C58 Yersinia/Haemophilus virulence surface antigen. The model represents a cysteine protease domain found in proteins of bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs, and additional members may exist that are detected by this model. YopT, a virulence effector protein of Yersinia pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors. [Cellular processes, Pathogenesis] 0
40518 413940 cl04176 TDT The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This family of transporters has ten alpha helical transmembrane segments. The structure of a bacterial homolog of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homolog, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterized as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, this family is found in the stomatal guard cells functioning as an anion-transporting pore. Many homologs are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins. 0
40519 413956 cl04214 UPF0180 Uncharacterized protein family (UPF0180). hypothetical protein; Provisional 0
40520 413958 cl04219 LPG_synthase_TM Lysylphosphatidylglycerol synthase TM region. This family of hydrophobic proteins is observed in two distinct contexts. It is primarily found in the presence of genes for the biosynthesis and elaboration of hopene where we assign the gene symbol HpnL. In a subset of the genomes containing HpnL a second, often plasmid-encoded, homolog is observed in a context implying the biosynthesis of 2-aminoethylphosphonate head-group containing lipids. 0
40521 413960 cl04227 CBM41_pullulanase Family 41 Carbohydrate-Binding Module from pullulanase-like enzymes. Domain is found in pullulanase (carbohydrate de-branching) proteins. It is found at either the N or the C terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function. 0
40522 413978 cl04270 Glyco_transf_WecG_TagA N/A. putative UDP-N-acetyl-D-mannosaminuronic acid transferase; Provisional 0
40523 413979 cl04271 IBN_N Importin-beta N-terminal domain. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport. 0
40524 413980 cl04273 MadL Malonate transporter MadL subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
40525 413981 cl04274 MadM Malonate/sodium symporter MadM subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
40526 413982 cl04275 Mtc Tricarboxylate carrier. The MTC family consists of a limited number of homologues, all from eukaryotes. A single member of the family has been functionally characterized, the tricarboxylate carrier from rat liver mitochondria. The rat liver mitochondrial tricarboxylate carrier has been reported to transport citrate, cis-aconitate, threo-D-isocitrate, D- and L-tartrate, malate, succinate and phosphoenolpyruvate. It presumably functions by a proton symport mechanism. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
40527 413983 cl04276 Mtp Golgi 4-transmembrane spanning transporter. The proteins of the MET family have 4 TMS regions and are located in late endosomal or lysosomal membranes. Substrates of the mouse MTP transporter include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores and steroid hormones. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other compounds. Drug sensitivity by mouse MET was regulated by compounds that inhibit lysosomal function, interface with intracellular cholesterol transport, or modulate the multidrug resistance phenotype of mammalian cells. Thus, MET family members may compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and steroid hormone responses. [Transport and binding proteins, Unknown substrate] 0
40528 413987 cl04283 Rad10 Binding domain of DNA repair protein Ercc1 (rad10/Swi10). All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
40529 413988 cl04285 RecT RecT family. This model represents the phage recombination protein Bet from a number of phage, including phage lambda. All members of this family are found in phage genomes or in putative prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions] 0
40530 413989 cl04289 Tfb2 Transcription factor Tfb2. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
40531 413991 cl04295 CG-1 CG-1 domain. The domains contain a predicted bipartite NLS and are named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs. 0
40532 413992 cl04297 ANTAR ANTAR domain. The majority of the domain consists of a coiled-coil. 0
40533 413993 cl04298 SpoVAC_SpoVAEB SpoVAC/SpoVAEB sporulation membrane protein. This model describes stage V sporulation protein AE, a paralog of stage V sporulation protein AC. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAE have a stage V sporulation defect. [Cellular processes, Sporulation and germination] 0
40534 413996 cl04309 RNAP_Rpb7_N_like N/A. Rpb7 bind to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during RNA polymerase II elongation. This family includes the homologs from RNA polymerase I and III. In RNA polymerase I, Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. The N-terminus of Rpb7p/Rpc25p/MJ0397 has a SHS2 domain that is involved in protein-protein interaction. 0
40535 414000 cl04326 Psb28 Psb28 protein. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW. [Energy metabolism, Photosynthesis] 0
40536 414008 cl04352 Borrelia_REV Borrelia burgdorferi REV protein. This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete). The function of REV is unknown, although it is known that the gene is induced during the ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli. 0
40537 414018 cl04375 PMEI_like pectin methylesterase inhibitor and related proteins. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical. 0
40538 414028 cl04394 BRICHOS BRICHOS domain. Its exact function is unknown; roles that have been proposed for the domain, which is about 100 amino acids long, include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999 provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region. 0
40539 414036 cl04407 Dopey_N Dopey, N-terminal. DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologs are found in mammals. S. cerevisiae DOP1 is essential for viability and affects cellular morphogenesis. 0
40540 414057 cl04451 EIIC-GAT PTS system sugar-specific permease component. PTS system ascorbate-specific transporter subunit IIC; Reviewed 0
40541 414059 cl04460 DUF434 Protein of unknown function (DUF434). 0
40542 414060 cl04466 P-mevalo_kinase Phosphomevalonate kinase. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found. One is this type, found in animals. The other is the ERG8 type, found in plants and fungi (TIGR01219) and in Gram-positive bacteria (TIGR01220). [Central intermediary metabolism, Other] 0
40543 414061 cl04467 DUF443 Protein of unknown function (DUF443). Members of this family of proteins, with average length of 210, have no invariant residues but five predicted transmembrane segments. Strangely, most members occur in groups of consecutive paralogous genes. A striking example is a set of eleven encoded consecutively, head-to-tail, in Staphylococcus aureus strain COL. 0
40544 414062 cl04468 Tic22 Tic22-like family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic22 protein. [Transport and binding proteins, Amino acids, peptides and amines] 0
40545 414067 cl04498 LytTR LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. 0
40546 414081 cl04524 Fimbrial_CS1 CS1 type fimbrial major subunit. Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers, and CooA-CooD complexes exists, but the functional significance is unknown. A member of this family has also been identified in Salmonella typhi and Salmonella enterica. 0
40547 414087 cl04545 Phage_rep_O Bacteriophage replication protein O. This model represents the N-terminal region of the phage lambda replication protein O and homologous regions of other phage proteins. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 0
40548 383772 cl04571 MARVEL Membrane-associating domain. This family of plant proteins contains a domain that may have a catalytic activity. It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices. 0
40549 414110 cl04601 SPC22 Signal peptidase subunit. signal peptidase; Provisional 0
40550 414119 cl04635 F1-ATPase_epsilon eukaryotic mitochondrial ATP synthase epsilon subunit. This family constitutes the mitochondrial ATP synthase epsilon subunit. This is not to be confused with the bacterial epsilon subunit, which is homologous to the mitochondrial delta subunit (pfam00401 and pfam02823) The epsilon subunit is located in the extrinsic membrane section F1, which is the catalytic site of ATP synthesis. The epsilon subunit was not well ordered in the crystal structure of bovine F1, but it is known to be located in the stalk region of F1. E subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1. 0
40551 414121 cl04640 DUF600 Protein of unknown function, DUF600. This model represents a tandem array of 10 proteins in Staphylococcus aureus and the C-terminal region of one protein each in Bacillus subtilis and Bacillus halodurans. 0
40552 414123 cl04653 TAF7 TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein. 0
40553 414126 cl04660 PGAP4-like Post-GPI attachment to proteins factor 4 and similar proteins. Post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246), has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases (GTs), it contains three transmembrane domains. Structural modeling suggests that PGAP4 adopts a GT-A fold split by an insertion of tandem transmembrane domains. 0
40554 383809 cl04661 Polysacc_synt_4 Polysaccharide biosynthesis. This model represents an uncharacterized domain found in both Arabidopsis thaliana (at least 10 copies) and Oryza sativa. Most member proteins have only a short stretch of sequence N-terminal to this domain, but one has a long N-terminal extension that includes a protein kinase domain (pfam00069). 0
40555 414146 cl04704 PDDEXK_6 PDDEXK-like family of unknown function. This model represents a domain found toward the C-terminus of a number of uncharacterized plant proteins. The domain is strongly conserved (greater than 30 % sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence. 0
40556 383830 cl04705 GRDA Glycine reductase complex selenoprotein A. putative glycine/sarcosine/betaine reductase complex protein A; Provisional 0
40557 322775 cl04707 PsbR Photosystem II 10 kDa polypeptide PsbR. photosystem II subunit R; Provisional 0
40558 414149 cl04722 PLAC8 PLAC8 family. This model describes an uncharacterized domain of about 100 residues. It is common in plants but found also in Homo sapiens, Dictyostelium, and Leishmania; at least 12 distinct members are found in Arabidopsis. Most members of this family contain more than 10 per cent Cys, but no Cys residue is invariant across the family. 0
40559 414152 cl04729 DUF617 Protein of unknown function, DUF617. This model represents a region of about 170 amino acids found at the C-terminus of a family of plant proteins. These proteins typically have additional highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterized protein. At least 12 distinct members are found in Arabidopsis thaliana. 0
40560 414155 cl04737 ZF-HD_dimer ZF-HD protein dimerization region. This model describes a 54-residue domain found in the N-terminal region of plant proteins, the vast majority of which contain a ZF-HD class homeobox domain toward the C-terminus. The region between the two domains typically is rich in low complexity sequence. The companion ZF-HD homeobox domain is described in model TIGR01565. 0
40561 414182 cl04793 PSRP-3_Ycf65 Plastid and cyanobacterial ribosomal protein (PSRP-3 / Ycf65). putative ribosomal protein 3; Validated 0
40562 414184 cl04796 Ovate Transcriptional repressor, ovate. This model describes an uncharacterized domain of about 70 residues found exclusively in plants, generally toward the C-terminus of proteins of 200 to 350 amino acids in length. At least 14 such proteins are found in Arabidopsis thaliana. Other regions of these proteins tend to consist largely of low-complexity sequence. 0
40563 414191 cl04813 EIN3 Ethylene insensitive 3. ETHYLENE-INSENSITIVE3-like3 protein; Provisional 0
40564 414195 cl04829 pMMO-AMO_C subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO) from methanotrophic bacteria, and of ammonia monooxygenase (AMO) from ammonia-oxidizing bacteria, and related proteins. This model contains the subunit C of ammonia monooxygenase (AMO, EC 1.14.99.39) from ammonia-oxidizing archaea including Nitrososphaera viennensis gen. nov., sp. nov (also called Nitrososphaera viennensis EN76) that contains six variants (AmoC1-AmoC6) encoded by different genes. AMO catalyzes the conversion of ammonia to hydroxylamine. Nitrososphaera viennensis EN76 AMO is composed of four subunits: AmoA, AmoB, AmoX, and one of six variants of AmoC. The AMO subunit C belongs to a family which also includes subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO, EC 1.14.18.3) from methanotrophic bacteria, and AMO from ammonia-oxidizing bacteria, which are not included in this model. Compared to its bacterial counterpart, archaeal AMO C subunit is significantly shorter at the N-terminal end. 0
40565 414197 cl04831 MbeD_MobD MbeD/MobD like. MbeD, as found in the ColE1 plasmid, was originally described as a plasmid mobilization protein. Later, it was shown that MbeD additionally was responsible for a plasmid entry exclusion phenotype that had previously been ascribed to products of the exc1 and exc2 genes. 0
40566 414207 cl04850 Wzy_C O-Antigen ligase. This family of proteins is suggested to transport inorganic carbon (HCO3-), based on the phenotype of a mutant of IctB in Synechococcus sp. strain PCC 7942. Bicarbonate uptake is used by many photosynthetic organisms including cyanobacteria. These organisms are able to concentrate CO2/HCO3- against a greater than ten-fold concentration gradient. Cyanobacteria may have several such carriers operating with different efficiencies. Note that homology to various O-antigen ligases, with possible implications for mutant cell envelope structure, might allow alternatives to the interpretation of IctB as a bicarbonate transport protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
40567 414210 cl04855 BLUF Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light. 0
40568 414213 cl04868 Phage_holin_2_1 Bacteriophage P21 holin S. Phage_holin_2_4 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. 0
40569 414231 cl04902 Agouti Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation. 0
40570 383919 cl04907 L51_S25_CI-B8 Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. Proteins containing this domain are located in the mitochondrion and include ribosomal protein L51, and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. 0
40571 414244 cl04947 Phage_cap_P2 Phage major capsid protein, P2 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage P2 and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions] 0
40572 414247 cl04955 LanC_like Cyclases involved in the biosynthesis of lantibiotics, and similar proteins. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 380 amino acids in length. The family is found in association with pfam05147. This domain may be involved in synthesis of a lantibiotic compound. 0
40573 414249 cl04961 DSS1_Sem1 proteasome complex subunit DSS1/Sem1. This family contains the breast cancer tumor suppressor BRCA2-interacting protein DSS1 and its homolog SEM1, both of which are short acidic proteins. DSS1 has been shown to be a conserved component of the Rae1 mediated mRNA export pathway in Schizosaccharomyces pombe. 0
40574 414262 cl04993 Spiralin Spiralin. Spiralin is the major lipoprotein in multiple species of Spiroplasma, a relative of the Mycoplasmas. 0
40575 414266 cl05000 CHASE3 CHASE3 domain. CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE3 domains are not known at this time. 0
40576 414269 cl05005 TAF4 TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This region of similarity is found in Transcription initiation factor TFIID component TAF4. 0
40577 414271 cl05012 FlhD Flagellar transcriptional activator (FlhD). This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 0
40578 414275 cl05017 UPF0203 Uncharacterized protein family (UPF0203). Uncharacterized protein At4g33100; Provisional 0
40579 414280 cl05036 FlhC Flagellar transcriptional activator (FlhC). This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC combines with FlhD to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 0
40580 414285 cl05060 TraE TraE protein. TraE is a component of type IV secretion systems involved in conjugative transfer of plasmid DNA. The function of the TraE protein is unknown. 0
40581 414292 cl05087 Complex1_LYR Complex 1 protein (LYR family). This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria. 0
40582 414296 cl05094 Phage_attach Phage Head-Tail Attachment. Members of this short family are putative ATP-binding sugar transporter-like protein. 0
40583 414308 cl05125 FliT Flagellar protein FliT. This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium. 0
40584 414309 cl05126 PqqD Coenzyme PQQ synthesis protein D (PqqD). Members of this protein show distant homology to PqqD, and belong to a three-gene cassette that included the HPr kinase related protein family of TIGR04352. The role of the cassette, and of this protein, are unknown. 0
40585 414310 cl05127 DUF4779 Domain of unknown function (DUF4779). This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum. 0
40586 414315 cl05142 GUN4 porphyrin-binding protein domain GUN4. In Arabidopsis, GUN4 is required for the functioning of the plastid mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signaling by regulating magnesium-protoporphyrin IX synthesis or trafficking. 0
40587 414328 cl05182 PsaN Photosystem I reaction centre subunit N (PSAN or PSI-N). photosystem I reaction center subunit N; Provisional 0
40588 414353 cl05250 Peptidase_U57 YabG peptidase U57. Members of this family are the protein YabG, demonstrated for Bacillus subtilis to be an endopeptidase able to release N-terminal peptides from a number of sporulation proteins, including CotT, CotF, and SpoIVA. It appears to be expressed under control of sigma-K. [Cellular processes, Sporulation and germination] 0
40589 414361 cl05275 Prolamin_like Prolamin-like. putative protein; Provisional 0
40590 414366 cl05282 Borrelia_P13 Borrelia membrane protein P13. This family consists of P13 proteins from Borrelia species. P13 is a 13kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism. 0
40591 414371 cl05296 DUF802 Domain of unknown function (DUF802). Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense. 0
40592 414374 cl05301 MLKL_NTD N-terminal domain of mixed lineage kinase domain-like protein (MLKL) and similar proteins. This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defense responses. The Arabidopsis thaliana locus Resistance To Powdery Mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localized, salicylic acid-dependent defenses similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance. 0
40593 414376 cl05307 DUF814 Domain of unknown function (DUF814). NFACT-R RNA binding family found in bacteria fused to the ThiI domain as a variant of the canonical tRNA 4-thiouridylation pathway. 0
40594 296991 cl05376 AfaD Enterobacteria AfaD invasin protein. fimbrial adhesin protein SefD; Provisional 0
40595 414398 cl05390 Tcp11 T-complex protein 11. This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants. 0
40596 414410 cl05417 PLA2_like N/A. This family consists of several phospholipase A2 like proteins mostly from insects. 0
40597 414415 cl05426 YopD YopD protein. One SctB and four SctE subunits, located at the tip of the type III secretion system (T3SS) injectosome, combine to form the translocon (translocator pore) in the membrane of targeted cells. Species-specific names for this highly variable component of T3SS include YopD, EspB, IpaC, SipC, etc. 0
40598 414417 cl05433 ARPC4 ARP2/3 complex 20 kDa subunit (ARPC4). ARP2/3 complex subunit; Provisional 0
40599 414418 cl05434 TraX TraX protein. conjugal transfer protein TrbP; Provisional 0
40600 414419 cl05436 Haemagg_act haemagglutination activity domain. This model represents a conserved domain found near the N-terminus of a number of large, repetitive bacterial proteins, including many proteins of over 2500 amino acids. Members generally have a signal sequence, then an intervening region, then the region described by this model. Following this region, proteins typically have regions rich in repeats but may show no homology between the repeats of one member and the repeats of another. A number of the members of this family have been designated adhesins, filamentous haemagglutinins, heme/hemopexin-binding protein, etc. 0
40601 414420 cl05442 Dam DNA N-6-adenine-methyltransferase (Dam). This model is a fragment-mode model for a phage-borne DNA N-6-adenine-methyltransferase. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification] 0
40602 414424 cl05457 pyocin_knob knob domain of R1 and R2 pyocins and similar domains. This family consists of several uncharacterized proteins from the Siphoviruses as well as one bacterial sequence. Some of the members of this family are described as putative minor structural proteins. 0
40603 414425 cl05460 Excalibur Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 0
40604 414436 cl05484 VipB Type VI secretion protein, EvpB/VC_A0108, tail sheath. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
40605 414440 cl05512 DUF903 Bacterial protein of unknown function (DUF903). Members of this family are exclusively lipoproteins of small size, including YgdI and YgdR from E. coli K-12. 0
40606 414442 cl05524 Cir_Bir_Yir Plasmodium variant antigen protein Cir/Yir/Bir. This model represents a large paralogous family of variant antigens from several Plasmodium species (P. yoelii, P. berghei and P. chabaudi). The seed was generated from a list of ORFs in P. yoelii containing a paralogous domain as defined by an algorithm implemented at TIFR. The list was aligned and reduced to six sequences approximating the most divergent clades present in the data set. The model only hits genes previously characterized as yir, bir, or cir genes above the trusted cutoff. In between trusted and noise is one gene from P. vivax (vir25) which has been characterized as a distant relative of the yir/bir/cir family. The vir family appears to be present in 600-1000 copies per haploid genome and is preferentially located in the sub-telomeric regions of the chromosomes. The genomic data for yoelii is consistent with this observation. It is not believed that there are any orthologs of this family in P. falciparum. 0
40607 414443 cl05528 AlkA_N AlkA N-terminal domain. The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity (EC:3.2.2.23) and DNA lyase activity (EC:4.2.99.18). The region featured in this family is the N-terminal domain, which is organized into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket. 0
40608 414450 cl05556 Apyrase Apyrase. apyrase Superfamily; Provisional 0
40609 414461 cl05580 TraH Conjugative relaxosome accessory transposon protein. conjugal transfer pilus assembly protein TraH; Provisional 0
40610 414474 cl05618 tify tify domain. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response. 0
40611 297128 cl05636 Phage_tail_T Minor tail protein T. This model represents a translation of the T gene in phage lambda and related phage. A translational frameshift from the upstream gene G into the frame of T produces a minor protein gpG-T, essential in tail assembly but not found in the mature virion. [Mobile and extrachromosomal element functions, Prophage functions] 0
40612 414486 cl05663 fn3_6 Fibronectin type-III domain. Fn3_5 is an fn3-like domain which is frequently found as the first of three on streptococcal C5a peptidase (SCP), a highly specific protease and adhesin/invasin. The family is found in conjunction with pfam00082, pfam02225 and pfam00746. 0
40613 414489 cl05674 PET PET (Prickle Espinas Testin) domain is involved in protein-protein interactions. This domain is suggested to be involved in protein-protein interactions. The family is found in conjunction with pfam00412. 0
40614 414494 cl05686 P_gingi_FimA Major fimbrial subunit protein (FimA). A family of uncharacterized proteins around 300 residues in length and found in various Bacteroides species. The function of this family is unknown. 0
40615 384206 cl05704 Allene_ox_cyc Allene oxide cyclase. allene oxide cyclase 0
40616 414518 cl05741 AGTRAP Angiotensin II, type I receptor-associated protein (AGTRAP). This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear. 0
40617 414520 cl05743 RAP Receptor-associated protein (RAP). The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. The N-terminal domain is predominately alpha helical. Two different studies have provided conflicting domain boundaries. 0
40618 414525 cl05752 HdeA HdeA/HdeB family. HdeA (hns-dependent expression protein A) is a single domain alpha-helical protein localized in the periplasmic space. HdeA is involved in acid resistance essential for infectivity of enteric bacterial pathogens. Functional studies demonstrate that HdeA is activated by a dimer-to-monomer transition at acidic pH, leading to suppression of aggregation by acid-denatured proteins. The gene encoding HdeA was initially identified as part of an operon regulated by the nucleoid protein H-NS. This family also contains HdeB. 0
40619 414526 cl05753 TraD Conjugal transfer protein TraD. This family contains bacterial TraD conjugal transfer proteins. Mutations in the TraD gene result in loss of transfer. 0
40620 414533 cl05762 SATase_N Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria. 0
40621 186667 cl05775 SEF14_adhesin SEF14-like adhesin. fimbrial protein SefA; Provisional 0
40622 414547 cl05797 SMC_hinge SMC proteins Flexible Hinge Domain. This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. 0
40623 414556 cl05813 Disulph_isomer Disulphide isomerase. This protein family is one of several observed in species that express bacillithiol, an analog of glutathione and mycothiol. Rather than being involved in bacillithiol biosynthesis, members are likely to act in bacillithiol-dependent processes. A suggested term is bacilliredoxin (a glutaredoxin-like thiol-dependent oxidoreductase), and a suggested role of YphP is de-bacillithiolation - removing bacillithiol that became linked to protein thiols under oxidative stress. An older description of YphP as a disulphide isomerase therefore may be wrong. 0
40624 414561 cl05827 IpaD Invasion plasmid antigen IpaD. These proteins are found within type III secretion operons and have been shown to be secreted by that system. 0
40625 414582 cl05878 TraK TraK protein. This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 276 and 307 amino acids in length. 0
40626 414583 cl05880 CRA Circumsporozoite-related antigen (CRA). circumsporozoite-related antigen; Provisional 0
40627 414601 cl05946 PspB Phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response. [Cellular processes, Adaptations to atypical conditions] 0
40628 414606 cl05964 zf-C4_ClpX ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known. 0
40629 414610 cl05973 FAM20_C_like C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins. Fam20C represents the C-terminus of eukaryotic secreted Golgi casein kinase proteins. Fam20C is the Golgi casein kinase that phosphorylates secretory pathway proteins within Ser-x-Glu/pSer motifs. Mutations in Fam20C cause Raine syndrome, an autosomal recessive osteosclerotic bone dysplasia. 0
40630 414643 cl06067 TraU TraU protein. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 0
40631 414649 cl06082 ACP Malonate decarboxylase delta subunit (MdcD). citrate lyase subunit gamma; Provisional 0
40632 414651 cl06088 DUF1257 Protein of unknown function (DUF1257). Ycf35; Provisional 0
40633 297413 cl06106 Phage_TAC_2 Bacteriophage lambda tail assembly chaperone, TAC, protein G. This model describes a family of bacteriophage proteins including G of phage lambda. This protein has been described as undergoing a translational frameshift at a Gly-Lys dipeptide near the C-terminus of protein G from phage lambda, with about 4 % efficiency, to produce tail assembly protein G-T. The Lys of the Gly-Lys pair is the conserved second-to-last residue of seed alignment for this family. [Mobile and extrachromosomal element functions, Prophage functions] 0
40634 414667 cl06123 DHR2_DOCK Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins. This family represents a conserved region within a number of eukaryotic dedicator of cytokinesis proteins. These are potential guanine nucleotide exchange factors, which activate some small GTPases by exchanging bound GDP for free GTP. This region interacts with RAC1 and ELMO1. 0
40635 414670 cl06143 RELM resistin-like molecule (RELM) hormone family. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes. 0
40636 414682 cl06181 PagP Antimicrobial peptide resistance and lipid A acylation protein PagP. This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response. 0
40637 414684 cl06188 Orc3 Origin recognition complex subunit 3. This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication. 0
40638 414692 cl06211 KDGP_aldolase KDGP aldolase. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 0
40639 414703 cl06243 PsbW Photosystem II reaction centre W protein (PsbW). photosystem II reaction centre W protein (PsbW); Provisional 0
40640 414716 cl06278 TraL TraL protein. This protein is part of the type IV secretion system for conjugative plasmid transfer. The function of the TraL protein is unknown. [Cellular processes, Conjugation] 0
40641 414734 cl06336 Commd N/A. The leucine-rich, 70-85 amino acid long COMM domain is predicted to form a beta-sheet and an extreme C-terminal alpha-helix. The COMM domain containing proteins are about 200 residues in length and passed the C-terminal COMM domain. 0
40642 414736 cl06338 CDI_inhibitor_EC869_like Inhibitor of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CdiI immunity proteins function as part of the bacterial contact-dependent growth inhibition (CDI) system. CDI is mediated by the CdiB-CdiA two-partner secretion system. Each CdiA protein exhibits a distinct growth inhibition activity, which resides in the polymorphic C-terminal region (CdiA-CT). Cells with the CDI system also express a CdiI immunity protein that blocks the activity of cognate CdiA-CT, thereby protecting the cell from autoinhibition. In many CDI systems the cdiBAI genes are followed by orphan cdiA-CT/cdiI modules, suggesting that these modules are exchanged between the CDI systems of different bacteria. 0
40643 414738 cl06345 DUF1439 Protein of unknown function (DUF1439). lipoprotein; Provisional 0
40644 414742 cl06353 BCHF 2-vinyl bacteriochlorophyllide hydratase (BCHF). This model represents the enzyme responsible for the first step in the modification of the ring A vinyl group of chlorophyllide a which (in part) distinguishes chlorophyll from bacteriochlorophyll. This enzyme is apparently absent from cyanobacteria (which do not use bacteriochlorophyll). [Energy metabolism, Photosynthesis] 0
40645 414755 cl06401 Amastin Amastin surface glycoprotein. amastin surface glycoprotein; Provisional 0
40646 414757 cl06405 Syd Syd, a SecY-interacting protein. This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association. Operon analysis showed that Syd protein may function as immunity protein in bacterial toxin systems. 0
40647 297589 cl06408 UP_III_II Uroplakin IIIb, IIIa and II. This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension. 0
40648 414773 cl06460 CblD CblD like pilus biogenesis initiator. This family consists of several minor pilin proteins including CblD from Burkholderia cepacia which is known to be the initiator of pilus biogenesis. The family also contains a variety of Enterobacterial minor pilin proteins. 0
40649 414774 cl06461 YycH_N_like N-terminal domain of YycH and structurally similar proteins conserved in Firmicutes. This family consists of several uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 0
40650 414779 cl06472 HycH Formate hydrogenlyase maturation protein HycH. formate hydrogenlyase maturation protein HycH; Provisional 0
40651 414780 cl06473 CHRD CHRD domain. CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologs. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG. 0
40652 141938 cl06484 Agglutinin Agglutinin domain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins. The protein is comprised of a homodimer, with each homodimer consisting of two beta-trefoil domains. 0
40653 414794 cl06505 Rho_N Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain. 0
40654 414795 cl06508 MANEC MANEC domain. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase. 0
40655 414799 cl06515 DUF1525 Protein of unknown function (DUF1525). Members of this protein belong to extended genomic regions that appear to be spread by conjugative transfer. [Mobile and extrachromosomal element functions, Plasmid functions] 0
40656 414804 cl06527 HWE_HK HWE histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536. It is usually found adjacent to a C-terminal ATPase domain (pfam02518). This domain is found in a wide range of Bacteria and also several Archaea. 0
40657 414824 cl06567 BatA Aerotolerance regulator N-terminal. This model represents a prokaryotic N-terminal region of about 80 amino acids. The predicted membrane topology by TMHMM puts the N-terminus outside and spans the membrane twice, with a cytosolic region of about 25 amino acids between the two transmembrane regions. Member proteins tend to be between 600 and 1000 amino acids in length. [Hypothetical proteins, Domain] 0
40658 414828 cl06591 DUF1573 Protein of unknown function (DUF1573). This HMM describes a repeat domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, such as type IX secretion (T9SS) or PEP-CTERM. 0
40659 414837 cl06641 Fea1 Low iron-inducible periplasmic protein. In Chlamydomonas reinhardtii, the gene encoding Fe-assimilating protein 1 is induced by iron deficiency. In green algae, this protein is periplasmic. The two paralogues FEA1 and FEA2 are the major proteins secreted by iron-deficient Chlamydomonas reinhardtii, and both are up-regulated in response to iron deficiency. FEA1 but not FEA2 is up-regulated by high CO2 concentration. Both FEA1 and FEA2 are secreted into the periplasmic space and genetic evidence confirms that their association with the cell is required for growth in low iron. 0
40660 414853 cl06673 Extradiol_Dioxygenase_3A_like Subunit A of Class III extradiol dioxygenases. This is a family of aromatic ring opening dioxygenases which catalyze the ring-opening reaction of protocatechuate and related compounds. 0
40661 384654 cl06725 TraC TraC-like protein. conjugal transfer protein TraC; Provisional 0
40662 414890 cl06756 Nif11 Nif11 domain. This model describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus. [Cellular processes, Biosynthesis of natural products] 0
40663 414893 cl06766 YabP YabP family. Members of this protein family are the YabP protein of the bacterial sporulation program, as found in Bacillus subtilis, Clostridium tetani, and other spore-forming members of the Firmicutes. In Bacillus subtilis, a yabP single mutant appears to sporulate and germinate normally, but yabP is in an operon with yabQ (essential for formation of the spore cortex), is near-universal among endospore-forming bacteria, and is found nowhere else. It is likely, therefore, that YabP does have a function in sporulation or germination, one that is either unappreciated or partially redundant with that of another protein. [Cellular processes, Sporulation and germination] 0
40664 414904 cl06793 PRKCSH Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. The beta-subunit confers substrate specificity for di- and monoglucosylated glycans on the glucose-trimming activity of the alpha-subunit. 0
40665 352591 cl06838 C1_4 TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C. 0
40666 414922 cl06842 X8 X8 domain. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges. 0
40667 414924 cl06844 SRR1 SRR1. Protein SENSITIVITY TO RED LIGHT REDUCED 1; Provisional 0
40668 414930 cl06858 DUF1704 Domain of unknown function (DUF1704). Members of this family include a possible metal-binding motif HEXXXH and, nearby, a perfectly conserved motif QEGLA. All members belong to the Proteobacteria, including Agrobacterium tumefaciens and several species of Vibrio and Pseudomonas, and are found in only one copy per chromosome (Vibrio vulnificus, with two chromosomes, has two). The function is unknown. 0
40669 414932 cl06868 FNR_like N/A. This model describes an NADPH-dependent sulfite reductase flavoprotein subunit. Most members of this family are found in Cys biosynthesis gene clusters. The closest homologs below the trusted cutoff are designated as subunits nitrate reductase. 0
40670 414933 cl06870 SpoU_sub_bind RNA 2'-O ribose methyltransferase substrate binding. This region is found in some members of the SpoU-type rRNA methylase family (pfam00588). 0
40671 414945 cl06893 UME UME (NUC010) domain. Characteristic domain in UVSP PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules. 0
40672 414955 cl06904 eNOPS_SF NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family. This domain is found at the C-terminus of NONA and PSP1 proteins adjacent to 1 or 2 pfam00076 domains. 0
40673 414963 cl06920 dimerization2 dimerization domain. This domain is found at the N-terminus of a variety of plant O-methyltransferases. It has been shown to mediate dimerization of these proteins. 0
40674 414968 cl06939 Antimicrobial17 Alpha/beta enterocin family. This family consists of the alpha and beta enterocins and lactococcin G peptides. These peptides have some antimicrobial properties; they inhibit the growth of Enterococcus spp. and a few other gram-positive bacteria. These peptides act as pore-forming toxins that create cell membrane channels through a barrel-stave mechanism and thus produce an ionic imbalance in the cell. This family of antimicrobial peptides belongs to the class II group of bacteriocins. 0
40675 414973 cl06949 SspH Small acid-soluble spore protein H family. This model is derived from pfam08141 but has been expanded to include in the seed corresponding proteins from three species of Clostridium. Members of this family should occur only in endospore-forming bacteria, typically with two members per genome, but may be absent from the genomes of some endospore-forming bacteria. SspH (previously designated YfjU) was shown to be expressed specifically in spores of Bacillus subtilis. [Cellular processes, Sporulation and germination] 0
40676 414974 cl06950 AARP2CN AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU. 0
40677 414976 cl06954 BP28CT BP28CT (NUC211) domain. This C-terminal domain is found in BAP28-like nucleolar proteins. 0
40678 414979 cl06957 BING4CT BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins. 0
40679 414982 cl06960 GUCT RNA-binding GUCT domain found in the RNA helicase II/Gu protein family. This is the C terminal domain found in the RNA helicase II / Gu protein family. 0
40680 415001 cl06998 LEM_like LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins. Short protein of 49 amino acid isolated from bovine spleen cells. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control. 0
40681 384799 cl07019 SHR3_chaperone ER membrane protein SH3. This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). Shr3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER. 0
40682 415011 cl07020 CW_7 CW_7 repeat. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far. 0
40683 415016 cl07029 SPT2 SPT2 chromatin protein. This entry includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation. 0
40684 384806 cl07034 dCache_2 Cache domain. Members include the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions. 0
40685 415021 cl07053 Sin3_corepress Sin3 family co-repressor. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. 0
40686 415023 cl07055 Bac_DnaA_C N/A. Could be involved in DNA-binding. 0
40687 415024 cl07060 NPCBM NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.) 0
40688 415026 cl07066 Mad3_BUB1_I Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. 0
40689 415031 cl07072 COG4 COG4 transport protein. This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport. 0
40690 415039 cl07097 oligo_HPY Oligopeptide/dipeptide transporter, C-terminal region. This model represents a domain found in the C-terminal regions of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid. [Transport and binding proteins, Amino acids, peptides and amines] 0
40691 415068 cl07159 Rib Rib/alpha-like repeat. Sequences in this family are tandem repeats of about 79 amino acids, present in up to 14 copies in a protein and highly identical, even at the DNA level, within each protein. Sequences with these repeats are found in the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis, and related proteins of Lactobacillus. The repeat lacks Cys residues. Most members of this protein family also have the cell wall anchor motif LPXTG shared by many staphylococcal and streptococcal surface antigens. 0
40692 384882 cl07218 Ad_cyc_g-alpha Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins. 0
40693 415102 cl07247 CDC37_C Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (Heat shocked protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37. 0
40694 415103 cl07248 CDC37_M Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (Heat shocked protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 whose function is unclear. 0
40695 415127 cl07283 Tudor_3 DNA repair protein Crb2 Tudor domain. In Saccharomyces cerevisiae, Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via Rad53 pfam00498 domains. This region is structurally composed of a pair of TUDOR domains. 0
40696 415133 cl07291 RNase_H2-C Ribonuclease H2-C is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. This entry represents the non-catalytic subunit of RNase H2, which in S. cerevisiae is Ylr154p/Rnh203p. Whereas bacterial and archaeal RNases H2 are active as single polypeptides, the Saccharomyces cerevisiae homolog, Rnh2Ap, when expressed in Escherichia coli, fails to produce an active RNase H2. For RNase H2 activity three proteins are required [Rnh2Ap (Rnh201p), Ydr279p (Rnh202p) and Ylr154p (Rnh203p)]. Deletion of any one of the proteins or mutations in the catalytic site in Rnh2A leads to loss of RNase H2 activity. RNase H2 is an endonuclease that specifically degrades the RNA of RNA:DNA hybrids. It participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers during DNA replication. 0
40697 415160 cl07336 MutL_C MutL C terminal dimerization domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation. 0
40698 415174 cl07362 PriCT_1 Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases. 0
40699 415175 cl07364 Nfu_N Scaffold protein Nfu/NifU N terminal. This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters. 0
40700 415183 cl07381 BCS1_N BCS1 N terminal. This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein. 0
40701 415185 cl07383 C8 C8 domain. Not all of the conserved cysteines have been included in the alignment model. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. 0
40702 415187 cl07391 Cadherin_pro Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions. 0
40703 415188 cl07392 GT-D Glycosyltransferase GT-D fold. Members of this protein family are putative glycosyltransferases. Some members are found close to genes for the accessory secretory (SecA2) system, and are suggested by Partial Phylogenetic Profiling to correlate with SecA2 systems. Glycosylation, therefore, may occur in the cytosol prior to secretion. 0
40704 384993 cl07396 CRM1_C CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat. 0
40705 415190 cl07397 SoxZ Sulphur oxidation protein SoxZ. SoxZ forms a heterodimer with SoxY, the subunit that forms a covalent bond with a sulfur moiety during thiosulfate oxidation to sulfate. Note that virtually all proteins that have a SoxY domain fused to a SoxZ domain are functionally distinct and not involved in thiosulfate oxidation. 0
40706 415194 cl07405 DP_DD Dimerization domain of DP. DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer and negatively regulates the G1-S transition. 0
40707 415195 cl07406 c-SKI_SMAD_bind c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4. 0
40708 415200 cl07418 HIRAN HIRAN domain. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks. 0
40709 415202 cl07420 ydhR Putative mono-oxygenase ydhR. putative monooxygenase; Provisional 0
40710 415207 cl07428 Ivy Inhibitor of vertebrate lysozyme (Ivy). C-lysozyme inhibitor; Provisional 0
40711 415209 cl07433 Serine_rich_CAS Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. This is a serine rich domain that is found in the docking protein p130(cas) (Crk-associated substrate). This domain folds into a four helix bundle which is associated with protein-protein interactions. 0
40712 415215 cl07443 Cdt1_m The middle winged helix fold of replication licensing factor Cdt1 binds geminin to inhibit binding of the MCM complex to origins of replication and DNA. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin. 0
40713 385023 cl07446 SymE_toxin Toxin SymE, type I toxin-antitoxin system. endoribonuclease SymE; Provisional 0
40714 415228 cl07460 FRG FRG domain. This presumed domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterized. 0
40715 415229 cl07462 XisI-like XisI is FdxN element excision controlling factor protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required. 0
40716 415230 cl07463 DndE DNA sulphur modification protein DndE. This model describes the DndE protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is a putative carboxylase homologous to NCAIR synthetases. [DNA metabolism, Restriction/modification] 0
40717 415232 cl07469 QLQ QLQ. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions. 0
40718 415241 cl07481 DUF1845 Domain of unknown function (DUF1845). Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens. 0
40719 415251 cl07494 F_actin_bind F-actin binding. FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution. 0
40720 415258 cl07510 CsoSCA Carboxysome Shell Carbonic Anhydrase. This model describes a carboxysome shell protein that proves to be a novel class, designated epsilon, of carbonic anhydrase. It tends to be encoded near genes for RuBisCo and for other carboxysome shell proteins. [Central intermediary metabolism, One-carbon metabolism] 0
40721 415270 cl07523 LciA-like lactococcin A immunity protein (LciA) and similar proteins. Gram-positive lactobacilli produce bacteriocins to kill closely-related competitor species. To protect themselves from the bacteriocidal activity of this molecule they co-express an immunity protein (for discussion of this operon see Bacteriocin_IIc pfam10439). The immunity protein structure is a soluble, cytoplasmic, antiparallel four alpha-helical globular bundle with a fifth, more flexible and more divergent C-terminal helical hair-pin. The C-terminal hair-pin recognizes the C-terminus of the producer bacteriocin and this interaction is sufficient to dis-orient the bacteriocin within the membrane and close up the permeabilising pore that on its own the bacteriocin creates. These immunity proteins interact in the same way with other bacteriocins, family Bacteriocin_II, pfam01721. Since many enterococci can produce more than one bacteriocin it seems likely that the whole operon can be carried on transferable plasmids. 0
40722 415275 cl07531 DUF1874 Domain of unknown function (DUF1874). DNA binding protein 0
40723 415286 cl07557 QH-AmDH_gamma Quinohemoprotein amine dehydrogenase, gamma subunit. Members of this family contain a cross-linked, proteinous quinone cofactor, cysteine tryptophylquinone, which is required for catalysis of the oxidative deamination of a wide range of aliphatic and aromatic amines. The domain assumes a globular secondary structure, with two short alpha-helices having many turns and bends. 0
40724 415293 cl07574 YopH_N YopH, N-terminal. The type III secretion system (T3SS) protein called LcrQ in Yersinia pseudotuberculosis and YscM1 in Yersinia enterocolitica is a post-transcriptional regulator of T3SS effector gene expression. Successful chaperone-dependent export by the T3SS allows the translation of T3SS effector proteins to proceed. 0
40725 298396 cl07585 T3SS_needle_reg YopR, type III needle-polymerization regulator. Members of this family are type III secretion system effectors, named differently in different species and designated YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yops protein is unusual in that it is released extracellularly rather than injected directly into the target cell as are most Yops. [Cellular processes, Pathogenesis] 0
40726 415300 cl07590 NCBD_CREBBP-p300_like Nuclear Coactivator Binding Domain (NCBD) of CREB (cyclic AMP response element binding protein) binding protein (CREBBP, also known as CBP) and its paralog p300. The Creb binding domain assumes a structure comprising three alpha-helices which pack in a bundle, exposing a hydrophobic groove between alpha-1 and alpha-3 within which complementary domains found in the protein 'activator for thyroid hormone and retinoid receptors' (ACTR) can dock. Docking of these domains is required for the recruitment of RNA polymerase II and the basal transcription machinery. 0
40727 415310 cl07609 Sod_Ni Nickel-containing superoxide dismutase. This superoxide dismutase uses nickel, rather than iron, manganese, copper, or zinc. Its gene is always accompanied by a gene for a required protease. 0
40728 415315 cl07618 B2-adapt-app_C Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15). 0
40729 352774 cl07621 EFh_DMD_DYTN_DTN EF-hand-like motif found in the dystrophin/dystrobrevin/dystrotelin family. Beta-dystrobrevin, also termed dystrobrevin beta (DTN-B), is a dystrophin-related protein that is restricted to non-muscle tissues and is abundantly expressed in brain, lung, kidney, and liver. It may be involved in regulating chromatin dynamics, possibly playing a role in neuronal differentiation, through the interactions with the high mobility group HMG20 proteins iBRAF/HMG20a and BRAF35/HMG20b. It also binds to and represses the promoter of synapsin I, a neuronal differentiation gene. Moreover, beta-dystrobrevin functions as a kinesin-binding receptor involved in brain development via the association with the extracellular matrix components pancortins. Furthermore, beta-dystrobrevin binds directly to dystrophin and is a cytoplasmic component of the dystrophin-associated glycoprotein complex, a multimeric protein complex that links the extracellular matrix to the cortical actin cytoskeleton and acts as a scaffold for signaling proteins such as protein kinase A. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. Beta-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, beta-dystrobrevin contains two syntrophin binding sites (SBSs). 0
40730 415341 cl07672 GH15_N Glycoside hydrolase family 15, N-terminal domain. Members of this family, which are uniquely found in bacterial and archaeal glucoamylases and glucodextranases, adopt a structure consisting of 17 antiparallel beta-strands. These beta-strands are divided into two beta-sheets, and one of the beta-sheets is wrapped by an extended polypeptide, which appears to stabilize the domain. Members of this family are mainly concerned with catalytic activity, hydrolysing alpha-1,6-glucosidic linkages of dextran to release beta-D-glucose from the non-reducing end via an inverting reaction mechanism. 0
40731 385167 cl07688 FimH_man-bind Mannose binding domain of FimH and related proteins. Members of this family adopt a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. They are predominantly found in bacterial mannose-specific adhesins, since they are capable of binding to D-mannose. 0
40732 415353 cl07696 PepX_N X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction. 0
40733 415374 cl07747 Aha1_N Activator of Hsp90 ATPase, N-terminal. This domain is predominantly found in the protein 'Activator of Hsp90 ATPase', it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity. 0
40734 415389 cl07779 DUF1967 Domain of unknown function (DUF1967). CgtA (see model TIGR02729) is a broadly conserved member of the obg family of GTPases associated with ribosome maturation. This model represents a unique C-terminal domain found in some but not all sequences of CgtA. This region is preceded, and may be followed, by a region of low-complexity sequence. 0
40735 385237 cl07828 zf-H2C2 His(2)-Cys(2) zinc finger. This is a family of probably DNA-binding zinc-fingers found on Gag-Pol polyproteins from mouse retroviruses. Added to clan to resolve overlaps with zf-H2C2, but neither are true members. 0
40736 324147 cl07831 growth_hormone_like Somatotropin/prolactin hormone family. Prolactin is primarily responsible for stimulating milk production and breast development in mammals. Aside from roles in reproduction, various functions have been attributed to prolactin, more than for other pituitary gland hormones combined. These are roles in growth and development, metamorphosis, metabolism of lipids, carbohydrates, and steroids, brain biochemistry and even immunoregulation, among others. Most of these roles are poorly understood, but it has become clear that many prolactin-like hormones are actually produced in the placenta and not the pituitary. 0
40737 415414 cl07834 C6 C6 domain. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge. 0
40738 385243 cl07847 RGP Reversibly glycosylated polypeptide. reversibly glycosylated polypeptide; Provisional 0
40739 415418 cl07849 Acetyltransf_14 YopJ Serine/Threonine acetyltransferase. The Yersinia effector YopJ inhibits the innate immune response by blocking MAP kinase and NFkappaB signaling pathways. YopJ is a serine/threonine acetyltransferase which regulates signalling pathways by blocking phosphorylation. Specifically, YopJ has been shown to block phosphorylation of active site residues. It has also been shown that YopJ acetyltransferase is activated by eukaryotic host cell inositol hexakisphosphate. This family was previously incorrectly annotated in Pfam as being a peptidase family. 0
40740 415428 cl07863 Phasin Poly(hydroxyalcanoate) granule associated protein (phasin). This model describes a domain found in some proteins associated with polyhydroxyalkanoate (PHA) granules in a subset of species that have PHA inclusion granules. Included are two tandem proteins of Pseudomonas oleovorans, PhaI and PhaF, and their homologs in related species. PhaF proteins have a low-complexity C-terminal region with repeats similar to AAAKP. [Fatty acid and phospholipid metabolism, Biosynthesis] 0
40741 415433 cl07870 EppA_BapA Exported protein precursor (EppA/BapA). This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the Borrelia burgdorferi infectious cycle. 0
40742 415435 cl07874 zf-AD Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA. 0
40743 415437 cl07879 DnaG_DnaB_bind DNA primase DnaG DnaB-binding. DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB. 0
40744 415439 cl07883 CAMSAP_CKK Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543, The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin. 0
40745 415443 cl07889 Pro-peptidase_S53 Activation domain of S53 peptidases. Members of this family are found in various subtilase propeptides, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptide. 0
40746 415444 cl07890 AAI_LTSS N/A. This domain has a four-helix bundle structure. It contains four disulfide bonds, of which three function to keep the C- and N-terminal parts of the molecule in place. 0
40747 415445 cl07905 DUF3237 Protein of unknown function (DUF3237). hypothetical protein; Provisional 0
40748 415446 cl07906 DUF2585 Protein of unknown function (DUF2585). hypothetical protein; Provisional 0
40749 415447 cl07918 Virul_fac_BrkB Virulence factor BrkB. Initial identification of members of this protein family was based on characterization of the yihY gene product as ribonuclease BN in Escherichia coli. This identification has been withdrawn, as the group now finds the homolog in E. coli of RNase Z is the true ribonuclease BN rather than a strict functional equivalent of RNase Z. Members of this subfamily include the largely uncharacterized BrkB (Bordetella resist killing by serum B) from Bordetella pertussis. Some members have an additional C-terminal domain. Paralogs from E. coli (yhjD) and Mycobacterium tuberculosis (Rv3335c) are part of a smaller, related subfamily that form their own cluster. [Unknown function, General] 0
40750 415448 cl07929 Glyco_transf_56 4-alpha-L-fucosyltransferase glycosyl transferase group 56. This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (EC 2.4.1.-) (approximately 360 residues long). This catalyzes the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc. 0
40751 415449 cl07930 Fe_bilin_red Ferredoxin-dependent bilin reductase. phycocyanobilin:ferredoxin oxidoreductase; Validated 0
40752 415450 cl07940 SSPI Small, acid-soluble spore protein I. small acid-soluble spore protein SspI; Provisional 0
40753 385277 cl07943 SspO Small acid-soluble spore protein O family. This model represents a minor (low-abundance) spore protein, designated SspO. It is found in a very limited subset of the already small group of endospore-forming bacteria, but these species include Oceanobacillus iheyensis, Geobacillus kaustophilus, Bacillus subtilis, B. halodurans, and B. cereus. This protein was previously called CotK. [Cellular processes, Sporulation and germination] 0
40754 415451 cl07944 HutP Histidine Utilizing Protein, the hut operon positive regulatory protein. The HutP protein family regulates the expression of Bacillus 'hut' structural genes by an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of alpha/beta proteins, arranged in the order: alpha-alpha-beta-alpha-alpha-beta-beta-beta in the primary structure, and the four antiparallel beta-strands form a beta-sheet in the order beta1-beta2-beta3-beta4, with two alpha-helices each on the front (alpha1 and alpha2) and at the back (alpha3 and alpha4) of the beta-sheet. 0
40755 186720 cl07951 PRK03830 N/A. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. Although previously designated tlp (thioredoxin-like protein), the B. subtilis protein was shown to be a minor small acid-soluble spore protein SASP, unique to spores. The motif E[VIL]XDE near the C-terminus probably represents a germination protease cleavage site. [Cellular processes, Sporulation and germination] 0
40756 415452 cl07980 FHIPEP FHIPEP family. Members of this family are closely homologous to the flagellar biosynthesis protein FlhA (TIGR01398) and should all participate in type III secretion systems. Examples include InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc. Type III secretion systems resemble flagellar biogenesis systems, and may share the property of translocating special classes of peptides through the membrane. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
40757 415453 cl07986 PDGLE PDGLE domain. cobalt transport protein CbiN; Validated 0
40758 415454 cl08022 Spore_III_AB Stage III sporulation protein AB (spore_III_AB). stage III sporulation protein SpoAB; Provisional 0
40759 415455 cl08031 ThiC_Rad_SAM Radical SAM ThiC family. Members of this protein family closely resemble ThiC, an enzyme that performs a complex rearrangement during thiamin biosynthesis, but instead occur as one of two adjacent additional paralogs to bona fide ThiC, in a conserved gene neighborhood with a pair of B12 binding domain/radical SAM domain proteins. Members of the ThiC family are non-canonical radical SAM enzymes, using a C-terminal Cys-rich motif to ligand a 4Fe-4S cluster that cleaves S-adenosylmethionine (SAM), but that sequence region does not belong to pfam04055. 0
40760 415456 cl08044 Cse1_I-E CRISPR/Cas system-associated protein Cse1. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry, represented by CT1972 from Chlorobaculum tepidum, is found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse1. 0
40761 415457 cl08047 Lar_restr_allev Restriction alleviation protein Lar. Restriction alleviation proteins provide a countermeasure to host cell restriction enzyme defense against foreign DNA such as phage or plasmids. This family consists of homologs to the phage antirestriction protein Lar, and most members belong to phage genomes or prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification] 0
40762 415458 cl08066 YafO_toxin Toxin YafO, type II toxin-antitoxin system. YafO is a toxin which inhibits protein synthesis. It acts as a ribosome-dependent mRNA interferase. It forms part of a type II toxin-antitoxin system, where the YafN protein acts as an antitoxin. This domain forms complexes with yafN antitoxins containing pfam02604. 0
40763 415459 cl08082 CsgF Type VIII secretion system (T8SS), CsgF protein. The extracellular nucleation-precipitation (ENP) pathway or Type VIII secretion system (T8SS) in Gram-negative (diderm) bacteria is responsible for the secretion and assembly of prepilins for fimbriae biogenesis, the prototypical curli. Besides the T2SS that can be involved in the assembly of prototypical Type 4 pilus, the T4SS that can be involved in the biogenesis of the prototypical pilus T, the T3SS involved in the assembly of the injectisome and the T7SS involved in the formation of the prototypical Type 1 pilus, the T8SS differs in that fibre-growth occurs extracellularly. The curli, also called thin aggregative fimbriae (Tafi), are the only fimbriae dependent on the T8SS. Tafi were first identified in Salmonella spp and the controlling operon termed agf; however subsequent isolation of the homologous operon in E coli led to its being called csg. In the absence of extracellular polysaccharides Tafi appear curled, although when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery. 0
40764 298624 cl08090 DUF2541 Protein of unknown function (DUF2541). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. All proteins are annotated as YaaI precursor however currently no function is known. 0
40765 415460 cl08095 HycA_repressor Transcriptional repressor of hyc and hyp operons. This family is conserved in Proteobacteria. It is likely to be the transcriptional repressor molecule for the hyc and hyp operons, which express, amongst others, the protein HycA. This protein may be harnessed for the reduction of technetium oxide, an unwelcome product of radio-nucleotide bioaccumulation. HycA produces formate hydrogenlyase, one of the key proteins necessary for metal compound reduction. 0
40766 415461 cl08096 DUF1992 Domain of unknown function (DUF1992). hypothetical protein; Provisional 0
40767 415462 cl08107 DUF5329 Family of unknown function (DUF5329). hypothetical protein; Provisional 0
40768 415463 cl08110 DUF2756 Protein of unknown function (DUF2756). Some members in this family of proteins are annotated yhhA however currently no function is known. The family appears to be restricted to Enterobacteriaceae. 0
40769 415464 cl08115 CsgE Curli assembly protein CsgE. Curli are a class of highly aggregated surface fibers that are part of a complex extracellular matrix. They promote biofilm formation in addition to other activities. CsgE is a non-structural protein involved in curli biogenesis. CsgE forms an outer membrane complex with the curli assembly proteins CsgG and CsgF. 0
40770 415465 cl08119 YgbA_NO Nitrous oxide-stimulated promoter. The function of ygaB is not known but it is a promoter that is stimulated by the presence of nitrous oxide. It is regulated by the gene-product of the bacterial nsrR gene. 0
40771 415466 cl08122 NiFe-hyd_HybE [NiFe]-hydrogenase assembly, chaperone, HybE. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain. 0
40772 415467 cl08125 DUF3811 YjbD family (DUF3811). hypothetical protein; Provisional 0
40773 415468 cl08136 SecD-TM1 SecD export protein N-terminal TM region. This domain appears to be the first transmembrane region of the SecD export protein. SecD is directly involved in protein secretion and important for the release of proteins that have been translocated across the cytoplasmic membrane. 0
40774 415469 cl08141 Lipoprotein_20 YfhG lipoprotein. This family includes the YfhG protein from E. coli. Members of this family have an N-terminal lipoprotein attachment site. The members of this family are functionally uncharacterized. 0
40775 415470 cl08147 RcsF RcsF lipoprotein. The RcsF lipoprotein is a component of the Rcs signaling system. It activates the Rcs system by transmitting signals from the cell surface to the histidine kinase RcsC. 0
40776 415471 cl08171 HtrL_YibB Bacterial protein of unknown function (HtrL_YibB). hypothetical protein; Provisional 0
40777 415472 cl08177 DUF2625 Protein of unknown function DUF2625. hypothetical protein; Provisional 0
40778 415473 cl08186 DUF3251 Protein of unknown function (DUF3251). hypothetical protein; Provisional 0
40779 415474 cl08187 Cas2_I-E CRISPR/Cas system-associated protein Cas2. This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799. Cas proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. 0
40780 415475 cl08197 DUF2501 Protein of unknown function (DUF2501). hypothetical protein; Provisional 0
40781 415476 cl08212 Tipalpha TNF-alpha-Inducing protein of Helicobacter. tumor necrosis factor alpha-inducing protein; Reviewed 0
40782 415477 cl08220 Photo_RC D1, D2 subunits of photosystem II (PSII); M, L subunits of bacterial photosynthetic reaction center. This model describes the photosynthetic reaction center M subunit in non-oxygenic photosynthetic bacteria. Reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of proton electrochemical gradient and production of reducing equivalents in form of NADH. Ultimately the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 0
40783 415478 cl08223 PSII Photosystem II protein. photosystem II 47 kDa protein 0
40784 415479 cl08224 PsaA_PsaB Photosystem I psaA/psaB protein. photosystem I P700 chlorophyll a apoprotein A1; Provisional 0
40785 415480 cl08232 RuBisCO_large Ribulose bisphosphate carboxylase large chain. The C-terminal domain of RuBisCO large chain is the catalytic domain adopting a TIM barrel fold. 0
40786 415481 cl08246 MHC_I Class I Histocompatibility antigen, domains alpha 1 and 2. Members of this family are known as retinoic-acid-inducible proteins. They are ligands for the activating immunoreceptor NKG2D, which is widely expressed on natural killer cells, T cells, and macrophages. 0
40787 415483 cl08255 Na_K-ATPase Sodium / potassium ATPase beta chain. This model describes the Na+/K+ ATPase beta subunit in eukaryotes. Na+/K+ ATPase(also called Sodium-Potassium pump) is intimately associated with the plasma membrane. It couples the energy released by the hydrolysis of ATP to extrude 3 Na+ ions, with the concomitant uptake of 2K+ ions, against their ionic gradients. [Transport and binding proteins, Cations and iron carrying compounds] 0
40788 415486 cl08263 TBP_TLF N/A. archaeal TATA box binding protein (TBP): TBPs are transcription factors present in archaea and eukaryotes, that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. 0
40789 415487 cl08267 ISOPREN_C2_like N/A. Proteins similar to alpha2-macroglobulin (alpha (2)-M). This group also contains the pregnancy zone protein (PZP). Alpha(2)-M and PZP are broadly specific proteinase inhibitors. Alpha (2)-M is a major carrier protein in serum. The structural thioester of alpha (2)-M, is involved in the immobilization and entrapment of proteases. PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZ bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production contributing to fetal survival. It has been suggested that thioester bond cleavage promotes the binding of PZ and alpha (2)-M to the CD91 receptor clearing them from circulation. 0
40790 415488 cl08270 Peptidase_S10 Serine carboxypeptidase. serine carboxypeptidase (CBP1); Provisional 0
40791 415490 cl08275 RHD-n N-terminal sub-domain of the Rel homology domain (RHD). Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal DNA-binding domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam16179) that functions as a dimerization domain. 0
40792 415491 cl08282 Acyl_transf_1 Acyl transferase domain. SAT is the N-terminal starter unit:ACP transacylase of the aflatoxin biosynthesis pathway. SAT selects the hexanoyl starter unit from a pair of specialized fungal fatty acid synthase subunits (HexA/HexB) and transfers it onto the polyketide synthase A acyl-carrier protein to prime polyketide chain elongation. The family is found in association with pfam02801, pfam00109, pfam00550, pfam00975, pfam00698. 0
40793 415497 cl08291 TCTP Translationally controlled tumor protein. translationally controlled tumor-like protein; Provisional 0
40794 415499 cl08298 NAP Nucleosome assembly protein (NAP). (NAP-L) nucleosome assembly protein -L; Provisional 0
40795 415500 cl08299 LAGLIDADG_3 LAGLIDADG-like domain. This domain is found within the sporulation regulator WhiA. It is a LAGLIDADG superfamily like domain. 0
40796 415501 cl08302 EFh N/A. S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A1_like group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutation in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data supports the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions. 0
40797 415503 cl08306 Peptidase_C12 Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1. This ubiquitin C-terminal hydrolase (UCH) family includes UCH37 (also known as UCH-L5) and BRCA1-associated protein-1 (BAP1). They contain a UCH catalytic domain as well as an additional C-terminal extension which plays a role in protein-protein interactions. UCH37 is responsible for ubiquitin (Ub) isopeptidase activity in the 19S proteasome regulatory complex; it disassembles Lys48-linked poly-ubiquitin from the distal end of the chain. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus and can be activated through transient association of hINO80 with hRpn13 that is bound to the 19S regulatory particle or the proteasome. UCH37 possibly plays a role in oncogenesis; it competes with Smad ubiquitination regulatory factor 2 (Smurf2, ubiquitin ligase) in binding concurrently to Smad7 in order to deubiquitinate the activated type I transforming growth factor beta (TGF-beta) receptor, thus rescuing it from proteasomal degradation. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. In addition to the UCH catalytic domain, BAP1 contains a UCH37-like domain (ULD), binding domains for BRCA1 and BARD1, which form a tumor suppressor heterodimeric complex, and a binding domain for HCFC1, which interacts with histone-modifying complexes during cell division. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. BAP1 exhibits tumor suppressor activity in cancer cells, and gene mutations have been reported in a small number of breast and lung cancer samples. In metastasis of uveal melanoma, the most common primary cancer of the eye, inactivating somatic mutations have been identified in the gene encoding BAP1 on chromosome 3p21.1. These mutations include several that cause premature protein termination as well as affect its UCH domain, thus implicating loss of BAP1 and suggesting that the BAP1 pathway may be a valuable therapeutic target. 0
40798 415505 cl08315 CAP_GLY CAP-Gly domain. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. 0
40799 415507 cl08320 Pollen_allerg_1 Pollen allergen. pollen allergen group 3; Provisional 0
40800 415511 cl08346 Rib_hydrolase ADP-ribosyl cyclase, also known as cyclic ADP-ribose hydrolase or CD38. ADP-ribosyl cyclase EC:3.2.2.5 (also known as cyclic ADP-ribose hydrolase or CD38) synthesizes cyclic-ADP ribose, a second messenger for glucose-induced insulin secretion. 0
40801 415514 cl08354 AFOR_N Aldehyde ferredoxin oxidoreductase, N-terminal domain. Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea; carboxylic acid reductase found in clostridia; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum. GAPOR may be involved in glycolysis, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases. 0
40802 415515 cl08356 TFIIA_gamma_C Gamma subunit of transcription initiation factor IIA, C-terminal domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The C-terminal domain of the gamma subunit is a 12 stranded beta-barrel. 0
40803 415519 cl08380 CDC48_2 Cell division protein 48 (CDC48), domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a substrate 185-residue recognition domain. 0
40804 415523 cl08398 mltA_B_like Domain B insert of mltA_like lytic transglycosylases. This beta barrel domain is found inserted in the MltA a murein degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. 0
40805 415528 cl08409 Gln-synt_N Glutamine synthetase, beta-Grasp domain. 0
40806 415530 cl08418 TAF5_NTD2 TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID). This region is an all-alpha domain associated with the WD40 helical bundle of the TAF5 subunit of transcription factor TFIID. The domain has distant structural similarity to RNA polymerase II CTD interacting factors. It contains several conserved clefts that are likely to be critical for TFIID complex assembly. The TAF5 subunit is present twice in the TFIID complex and is critical for the function and assembly of the complex, and the NTD2 and N-terminal domain is crucial for homodimerization. 0
40807 415534 cl08424 OBF_DNA_ligase_family The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. This domain has an OB-like fold, but does not appear to be related to pfam03120. It is found at the C-terminus of the ATP dependent DNA ligase domain pfam01068. 0
40808 415536 cl08426 AMPKBI 5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family. 0
40809 415540 cl08444 CesT Tir chaperone protein (CesT) family. Members of this family include YscB of Yersinia and functionally equivalent (but differently named) proteins from type III secretion systems of other pathogens that affect animal cells. YscB acts, along with SycN (TIGR02503), as a chaperone for YopN, a key part of a complex that regulates type III secretion so it responds to contact with the eukaryotic target cell. 0
40810 415542 cl08447 DUF1214 Protein of unknown function (DUF1214). This family represents the C-terminal region of several hypothetical proteins of unknown function. Family members are mostly bacterial, but a few are also found in eukaryotes and archaea. 0
40811 415548 cl08459 PA14 PA14 domain. The GLEYA domain is related to lectin-like binding domains found in the S. cerevisiae Flo proteins and the C. glabrata Epa proteins. It is a carbohydrate-binding domain that is found in fungal adhesins (also referred to as agglutinins or flocculins). Adhesins with a GLEYA domain possess a typical N-terminal signal peptide and a domain of conserved sequence repeats, but lack glycosylphosphatidylinositol (GPI) anchor attachment signals. They contain a conserved motif G(M/L)(E/A/N/Q)YA, hence the name GLEYA. Based on sequence homology, it is suggested that the GLEYA domain would predominantly contain beta sheets. The GLEYA domain is also found in S. pombe putative cell agglutination protein fta5, thought to be a kinetochore protein (Sim4 complex subunit), however no direct evidence for kinetochore association has been found. Furthermore, a global protein localization study in S. pombe identified it as a secreted protein localized to the Golgi complex. 0
40812 415550 cl08468 Leukocidin Leukocidin/Hemolysin toxin family. This family of cytolytic pore-forming proteins includes alpha toxin and leukocidin F and S subunits from Staphylococcus aureus, hemolysin II of Bacillus cereus, and related toxins. [Cellular processes, Toxin production and resistance] 0
40813 415552 cl08475 PIG-X PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. 0
40814 415556 cl08488 ANAPC2 Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein. 0
40815 415560 cl08497 Cas6_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas6e. This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. 0
40816 415561 cl08500 YtxC YtxC-like family. This uncharacterized protein is part of a panel of proteins conserved in all known endospore-forming Firmicutes (low-GC Gram-positive bacteria), including Carboxydothermus hydrogenoformans, and nowhere else. [Cellular processes, Sporulation and germination] 0
40817 415571 cl08520 Cdc6_C Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localization factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity. 0
40818 415579 cl08531 ProRS-C_1 Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif. 0
40819 415584 cl09098 Sortase Sortase domain. The founder member of this family is S.aureus sortase, a transpeptidase that attaches surface proteins by the threonine of an LPXTG motif to the cell wall. 0
40820 415585 cl09109 NTF2_like N/A. This family contains a large number of proteins that share the SnoaL fold. 0
40821 415586 cl09111 Prefoldin N/A. This family comprises of several prefoldin subunits. The biogenesis of the cytoskeletal proteins actin and tubulin involves interaction of nascent chains of each of the two proteins with the oligomeric protein prefoldin (PFD) and their subsequent transfer to the cytosolic chaperonin CCT (chaperonin containing TCP-1). Electron microscopy shows that eukaryotic PFD, which has a similar structure to its archaeal counterpart, interacts with unfolded actin along the tips of its projecting arms. In its PFD-bound state, actin seems to acquire a conformation similar to that adopted when it is bound to CCT. 0
40822 415587 cl09113 cpn10 N/A. This family contains GroES and Gp31-like chaperonins. Gp31 is a functional co-chaperonin that is required for the folding and assembly of Gp23, a major capsid protein, during phage morphogenesis. 0
40823 415588 cl09114 CRCB CrcB-like protein, Camphor Resistance (CrcB). camphor resistance protein CrcB; Provisional 0
40824 415589 cl09115 Ribosomal_L32p Ribosomal L32p protein family. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
40825 415590 cl09123 SecG Preprotein translocase SecG subunit. This family of proteins forms a complex with SecY and SecE. SecA then recruits the SecYEG complex to form an active protein translocation channel. [Protein fate, Protein and peptide secretion and trafficking] 0
40826 415591 cl09125 ResB ResB-like family. c-type cytochrome biogenesis protein; Validated 0
40827 415592 cl09134 NurA NurA domain. This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius. 0
40828 415593 cl09139 FliE Flagellar hook-basal body complex protein FliE. fliE is a component of the flagellar hook-basal body complex located possibly at (MS-ring)-rod junction. [Cellular processes, Chemotaxis and motility] 0
40829 415594 cl09141 ACT ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain is a structural motif of 70-90 amino acids that functions in the control of metabolism, solute transport and signal transduction. They are thus found in a variety of different proteins in a variety of different arrangements. In mammalian phenylalanine hydroxylase the domain forms no contacts but promotes an allosteric effect despite the apparent lack of ligand binding. 0
40830 415595 cl09153 PhdYeFM_antitox Antitoxin Phd_YefM, type II toxin-antitoxin system. This model recognizes a region of about 55 amino acids toward the N-terminal end of bacterial proteins of about 85 amino acids in length. The best-characterized member is prevent-host-death (phd) of bacteriophage P1, the antidote partner of death-on-curing (doc) (TIGR01550) in an addiction module. Addiction modules prevent plasmid curing by killing the host cell as the longer-lived killing protein persists while the gene for the shorter-lived antidote is lost. Note, however, that relatively few members of this family appear to be plasmid or phage-encoded. Also, there is little overlap, except for phage P1 itself, of species with this family and with the doc family. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other] 0
40831 415596 cl09154 MrpF_PhaF Multiple resistance and pH regulation protein F (MrpF / PhaF). putative monovalent cation/H+ antiporter subunit F; Reviewed 0
40832 415597 cl09159 Imelysin-like imelysin also called Peptidase M75. The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14. 0
40833 415598 cl09170 ATP-synt_I ATP synthase I chain. F0F1 ATP synthase subunit I; Validated 0
40834 415599 cl09173 Caa3_CtaG Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG). Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as in Bacillus subtilis. 0
40835 415600 cl09176 FlgN FlgN protein. This family includes the FlgN protein and export chaperone involved in flagellar synthesis. 0
40836 415601 cl09182 DUF1009 Protein of unknown function (DUF1009). Family of uncharacterized bacterial proteins. 0
40837 415602 cl09190 MAPEG MAPEG family. This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity. 0
40838 415603 cl09194 Sec61_beta Sec61beta family. This family consists of homologs of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. It consists of a single putative transmembrane helix, preceded by a short stretch containing various charged residues; this arrangement may help determine orientation in the cell membrane. 0
40839 415604 cl09208 Tim44 Tim44-like domain. Mba1 is an inner membrane protein that is part of the mitochondrial protein export machinery. It binds to the large subunit of mitochondrial ribosomes and cooperates with the C-terminal ribosome-binding domain of Oxa1, which is a central component of the insertion machinery of the inner membrane. In the absence of both Mba1 and the C-terminus of Oxa1, mitochondrial translation products fail to be properly inserted into the inner membrane and serve as substrates of the matrix chaperone Hsp70. It is proposed that Mba1 functions as a ribosome receptor that cooperates with Oxa1 in the positioning of the ribosome exit site to the insertion machinery of the inner membrane. 0
40840 415605 cl09210 ROF Modulator of Rho-dependent transcription termination (ROF). Rho-binding antiterminator; Provisional 0
40841 415606 cl09211 Tagatose_6_P_K Tagatose 6 phosphate kinase. Aldolases specific for D-tagatose-bisphosphate occur in distinct pathways in Escherichia coli and other bacteria, one for the degradation of galactitol (formerly dulcitol) and one for degradation of N-acetyl-galactosamine and D-galactosamine. This family represents a protein of both systems that behaves as a non-catalytic subunit of D-tagatose-bisphosphate aldolase, required both for full activity and for good stability of the aldolase. Note that members of this protein family appear in public databases annotated as putative tagatose 6-phosphate kinases, possibly in error. [Energy metabolism, Sugars] 0
40842 385434 cl09219 DUF2208 Predicted membrane protein (DUF2208). This domain, found in various hypothetical archaeal proteins, has no known function. 0
40843 415607 cl09232 YqaJ YqaJ-like viral recombinase domain. This family includes various alkaline exonucleases from members of the herpesviridae. Alkaline exonuclease appears to have an important role in the replication of herpes simplex virus. 0
40844 415608 cl09238 CY N/A. SQAPI, aspartic acid inhibitor first isolated from squash, inhibits a wide range of aspartic proteinases. This particular family of PAAPIs (proteinaceous aspartic acid inhibitors) seems to have evolved quite recently from an ancestral cystatin. Structurally it consists of a four-stranded anti-parallel beta-sheet gripping an alpha-helix in much the same manner that a hand grips a tennis racket. The unstructured N-terminus and the loop connecting beta-strands 1 and 2 are important for pepsin inhibition, but the loop connecting strands 3 and 4 is not. 0
40845 415612 cl09299 TSA Type specific antigen. This protein is the immunodominant major cell surface protein of Orientia tsutsugamushi, known as "56-kDa type-specific antigen" or TSA56. It should not be confused with unrelated proteins TSA47 (a serine protease) or TSA22. An ortholog is found in Orientia chuto, and included in the seed alignment. 0
40846 415618 cl09326 MATE_like Multidrug and toxic compound extrusion family and similar proteins. Deletion of the mviN virulence gene in Salmonella enterica serovar Typhimurium greatly reduces virulence in a mouse model of typhoid-like disease. Open reading frames encoding homologs of MviN have since been identified in a variety of bacteria, including pathogens and non-pathogens and plant-symbionts. In the nitrogen-fixing symbiont Rhizobium tropici, mviN is required for motility. The MviM protein is predicted to be membrane-associated. 0
40847 415635 cl09429 VirE2 VirE2. This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C-terminus, with VirD4. Agrobacterium tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily. 0
40848 415642 cl09462 Coagulase Staphylococcus aureus coagulase. The von Willebrand factor binding protein Vwb, like its paralog staphylocoagulase, is a coagulase and a virulence factor. It induces clotting, not by being an enzyme, but by activating prothrombin to generate fibrin. 0
40849 415644 cl09506 catalase_like Catalase-like heme-binding proteins and protein domains. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria. 0
40850 415645 cl09511 FERM_B-lobe FERM domain B-lobe. This domain is the central structural domain of the FERM domain. 0
40851 415658 cl09607 Gly_reductase Glycine/sarcosine/betaine reductase component B subunits. Members of this family are PrdD, encoded in the proline reductase gene cluster. Members are closely homologous to PrdA, which cleaves during maturation to create two of the subunits of the proline reductase complex, one of which has a Cys-derived pyruvoyl active site. 0
40852 415659 cl09608 Cas7_I-E CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum. 0
40853 415664 cl09615 E1_UFD Ubiquitin fold domain. This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised. 0
40854 415673 cl09633 NIL NIL domain. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family. 0
40855 415677 cl09641 T3SS_needle_F Type III secretion needle MxiH, YscF, SsaG, EprI, PscF, EscF. type III secretion system needle protein SsaG; Provisional 0
40856 415680 cl09645 Ftsk_gamma Ftsk gamma domain. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding. 0
40857 415681 cl09647 SARS-CoV_ORF9b accessory protein 9b of severe acute respiratory syndrome-associated coronavirus and similar proteins. This is a family of proteins found in SARS coronavirus. The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules. This cavity is likely to be involved in membrane attachment. 0
40858 415684 cl09653 Btz CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide. 0
40859 415700 cl09680 eIF3E eukaryotic translation initiation factor 3 subunit E. This is the N terminal domain of subunit 6 translation initiation factor eIF3. 0
40860 385541 cl09697 Saf-Nte_pilin Saf-pilin pilus formation protein. Saf-pilin pilus formation protein SafA; Provisional 0
40861 415716 cl09710 Type_III_YscX Type III secretion system YscX (type_III_YscX). Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
40862 415717 cl09714 Flg_new Listeria-Bacteroides repeat domain (List_Bact_rpt). This model describes a conserved core region, about 43 residues in length, of at least two families of tandem repeats. These include 78-residue repeats from 2 to 15 in number, in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few bacteria. [Unknown function, General] 0
40863 385547 cl09716 OrgA_MxiK Bacterial type III secretion apparatus protein (OrgA_MxiK). This gene is found in type III secretion operons and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the secretion needle complex. 0
40864 415718 cl09719 Cse2_I-E CRISPR/Cas system-associated protein Cse2. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family of proteins, represented by CT1973 from Chlorobaculum tepidum, is encoded by genes found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse2. 0
40865 415719 cl09723 CbtB Probable cobalt transporter subunit (CbtB). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional trans-membrane segments. 0
40866 415720 cl09726 DUF2389 Tryptophan-rich protein (DUF2389). Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo. 0
40867 415726 cl09741 Hypoth_Ymh Protein of unknown function (Hypoth_ymh). This family consists of a relatively rare (~ 8 occurrences per 200 genomes) prokaryotic protein family. Genes for members appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. The function is unknown. [Hypothetical proteins, Domain] 0
40868 415728 cl09743 RNA_lig_T4_1 RNA ligase. RNA ligase A; Provisional 0
40869 415733 cl09752 Phg_2220_C Conserved phage C-terminus (Phg_2220_C). This model represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown. [Mobile and extrachromosomal element functions, Prophage functions] 0
40870 415734 cl09754 ATPase_gene1 Putative F0F1-ATPase subunit Ca2+/Mg2+ transporter. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. [Energy metabolism, ATP-proton motive force interconversion] 0
40871 415735 cl09771 Spore_III_AE Stage III sporulation protein AE (spore_III_AE). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AE. [Cellular processes, Sporulation and germination] 0
40872 415736 cl09775 Spore_II_R Stage II sporulation protein R (spore_II_R). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage II sporulation protein R. [Cellular processes, Sporulation and germination] 0
40873 415739 cl09782 Cas6-I-III CRISPR/Cas system-associated RAMP superfamily protein Cas6. The Cas6 Crispr family of proteins averaging 140 residues are characterized by having a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus. The CRISPR-Cas system is possibly a mechanism of defense against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes. 0
40874 415740 cl09783 Spore_YunB Sporulation protein YunB (Spo_YunB). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. Mutation of this sigma E-regulated gene, designated yunB, has been shown to cause a sporulation defect. [Cellular processes, Sporulation and germination] 0
40875 415749 cl09801 Spore_YabQ Spore cortex protein YabQ (Spore_YabQ). YabQ, a protein predicted to span the membrane several times, is found in exactly those genomes whose species perform sporulation in the style of Bacillus subtilis, Clostridium tetani, and others of the Firmicutes. Mutation of this sigma(E)-dependent gene blocks development of the spore cortex. The length of the C-terminal region, including some hydrophobic regions, is rather variable between members. [Cellular processes, Sporulation and germination] 0
40876 415750 cl09807 Lin0512_fam Conserved hypothetical protein (Lin0512_fam). This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N-terminus. [Hypothetical proteins, Conserved] 0
40877 415751 cl09810 DUF2031 Protein of unknown function (DUF2031). This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. 0
40878 415756 cl09819 DUF2459 Protein of unknown function (DUF2459). This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species. [Hypothetical proteins, Conserved] 0
40879 385588 cl09820 PhaP_Bmeg Polyhydroxyalkanoic acid inclusion protein (PhaP_Bmeg). This model describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage (see McCool,G.J. and Cannon,M.C, 1999). 0
40880 415757 cl09821 Fib_succ_major Fibrobacter succinogenes major domain (Fib_succ_major). This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron. [Cell envelope, Other] 0
40881 415758 cl09823 Trep_Strep Hypothetical bacterial integral membrane protein (Trep_Strep). This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6. [Transport and binding proteins, Unknown substrate] 0
40882 415760 cl09826 Alph_Pro_TM Putative transmembrane protein (Alph_Pro_TM). This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome. 0
40883 324500 cl09827 Csb2_I-U CRISPR/Cas system-associated protein Csb2. This entry represents a rare CRISPR-associated protein. So far, members are found in Geobacter sulfurreducens and in two unpublished genomes: Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins. 0
40884 415761 cl09829 Csy1_I-F CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1. 0
40885 415762 cl09832 Csy3_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3. 0
40886 415763 cl09834 Csb1_I-U CRISPR/Cas system-associated protein Csb1. This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria). 0
40887 415764 cl09835 Cas6_I-F CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4. 0
40888 415765 cl09837 Csx3_III-U CRISPR/Cas system-associated protein Csx3. This entry is encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3). 0
40889 299071 cl09838 LcrR Type III secretion system regulator (LcrR). This protein is found in type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR). [Protein fate, Protein and peptide secretion and trafficking] 0
40890 385597 cl09839 Csx1_III-U CRISPR/Cas system-associated protein Csx1. Members of this minor CRISPR-associated (Cas) protein family are encoded in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum. 0
40891 415777 cl09859 YopX YopX protein. This model represents an uncharacterized, well-conserved family of proteins found in bacteriophage and prophage regions of Gram-positive bacteria. [Mobile and extrachromosomal element functions, Prophage functions, Hypothetical proteins, Conserved] 0
40892 415781 cl09864 CHZ Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones. 0
40893 415782 cl09865 PHA_gran_rgn Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn). All members of this family are encoded by genes near polyhydroxyalkanoic acid (PHA) biosynthesis and utilization genes, including proteins found at the surface of PHA granules. Examples so far are found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria. 0
40894 415783 cl09868 DUF2396 Protein of unknown function (DUF2396). Members of this family of conserved hypothetical proteins are found, so far, only in the Cyanobacteria. Members are about 170 amino acids long and share a motif CxxCx(14)CxxH near the amino end. [Hypothetical proteins, Conserved] 0
40895 415784 cl09869 Nitr_red_assoc Conserved nitrate reductase-associated protein (Nitr_red_assoc). Most members of this protein family are found in the Cyanobacteria, and these mostly near nitrate reductase genes and molybdopterin biosynthesis genes. We note that molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. This protein is sometimes annotated as nitrate reductase-associated protein. Its function is unknown. 0
40896 385616 cl09872 Cas8a2_I-A CRISPR/Cas system-associated protein Csa8a2. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes archaeal proteins encoded in cas gene regions. 0
40897 415785 cl09873 Csm6_III-A CRISPR/Cas system-associated protein Csm6. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 0
40898 415786 cl09875 DUF2398 Protein of unknown function (DUF2398). Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved] 0
40899 415788 cl09881 Spore_GerQ Spore coat protein (Spore_GerQ). Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase. [Cellular processes, Sporulation and germination] 0
40900 415789 cl09883 TrbC_Ftype Type-F conjugative transfer system pilin assembly protein. conjugal transfer pilus assembly protein TrbC; Provisional 0
40901 415790 cl09884 DUF2400 Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation. [Hypothetical proteins, Conserved] 0
40902 415791 cl09889 Phage_rep_org_N N-terminal phage replisome organizer (Phage_rep_org_N). This model represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region covered by this model is N-terminal to the low-complexity region. [Mobile and extrachromosomal element functions, Prophage functions] 0
40903 415792 cl09890 Phage_holin_6_1 Bacteriophage holin of superfamily 6 (Holin_LLH). This model represents a putative phage holin from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is small (about 100 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. [Mobile and extrachromosomal element functions, Prophage functions] 0
40904 415793 cl09891 Lactococcin_972 Bacteriocin (Lactococcin_972). This model represents bacteriocins related to lactococcin 972. Members tend to be found in association with a seven transmembrane putative immunity protein. [Cellular processes, Toxin production and resistance] 0
40905 299108 cl09901 Gcw_chp Bacterial protein of unknown function (Gcw_chp). This model represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis, Ralstonia solanacearum, and Colwellia psychrerythraea, usually as part of a paralogous family. The function is unknown. 0
40906 415797 cl09903 Porph_ging Protein of unknown function (Porph_ging). This protein family was first noted as a paralogous set in Porphyromonas gingivalis, but it is more widely distributed among the Bacteroidetes. The protein family is now renamed GLPGLI after its best-conserved motif. 0
40907 415798 cl09906 Csa5_I-A CRISPR/Cas system-associated protein Csa5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus. 0
40908 299112 cl09907 Cas8a2_I-A CRISPR/Cas system-associated protein Csa8a2. CRISPR loci appear to be mobile elements with a wide host range. This entry represents a protein that tends to be found near CRISPR repeats. The species range for this protein, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 0
40909 299113 cl09912 Trep_dent_lipo Treponema clustered lipoprotein (Trep_dent_lipo). This model represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown. 0
40910 415799 cl09913 Csn2_like CRISPR/Cas system-associated protein Csn2. Cas_St_Csn2 is a family of Csn2 CRISPR-associated (Cas) proteins found in Firmicutes, largely Streptococcus and Enterococcus. CRISPR-associated (Cas) proteins are the main executioners of the process whereby prokaryotes acquire immunity against foreign genetic material. Cas allow short segments of this DNA, called spacer, to become incorporated into chromosomal loci as clustered regularly interspaced short palindromic repeats or CRISPRs; the resulting encoded RNAs are then processed into small fragments that guide the silencing of the invading genetic elements. Thus Cas are involved in the acquisition of new spacers. This family of St_Csn2 is longer than the canonical Csn2, pfam09711 through the addition of a large C-terminal domain. The central domain present in both families appears to be a channel that selectively interacts with dsDNA. 0
40911 385628 cl09914 PHA_synth_III_E Poly(R)-hydroxyalkanoic acid synthase subunit (PHA_synth_III_E). This model represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species. This model must be designated subfamily to indicate the heterogeneity of PHAs. [Cellular processes, Adaptations to atypical conditions, Fatty acid and phospholipid metabolism, Biosynthesis] 0
40912 415800 cl09915 A_thal_3526 Plant protein 1589 of unknown function (A_thal_3526). This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa. 0
40913 415801 cl09916 Plasmod_dom_1 Plasmodium protein of unknown function (Plasmod_dom_1). hypothetical protein; Provisional 0
40914 415802 cl09917 ETRAMP Malarial early transcribed membrane protein (ETRAMP). This model describes a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region, and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both P. falciparum and P. yoelii chromosomes. 0
40915 415803 cl09918 CPW_WPC Plasmodium falciparum domain of unknown function (CPW_WPC). The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown. 0
40916 415804 cl09920 C_GCAxxG_C_C Putative redox-active protein (C_GCAxxG_C_C). This model represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters. 0
40917 415805 cl09921 Unstab_antitox Putative addiction module component. Members of this family are bacterial proteins, typically are about 75 amino acids long, always found as part of a pair (at least) of two small genes. The other in the pair always belongs to a subfamily of the larger family pfam05016 (although not necessarily scoring above the designated cutoff), which contains plasmid stabilization proteins. It is likely that this protein and its pfam05016 member partner comprise some form of addiction module, although these gene pairs usually are found on the bacterial main chromosome. [Mobile and extrachromosomal element functions, Other] 0
40918 415806 cl09927 S1_like N/A. This domain is found at the N-terminus of RsgA domains. It has an OB fold. 0
40919 415807 cl09928 Molybdopterin-Binding N/A. This model describes a subset of formate dehydrogenase alpha chains found mainly in proteobacteria but also in Aquifex. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits of 32 and 20 kDa. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The electrons are utilized mainly in the nitrate respiration by nitrate reductase. In E. coli and Salmonella, there are two forms of the formate dehydrogenase, one induced by nitrate which is strictly anaerobic (fdn), and one induced during the transition from aerobic to anaerobic growth (fdo). This subunit is one of only three proteins in E. coli which contain selenocysteine. This model is well-defined, with a large, unpopulated trusted/noise gap. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport] 0
40920 415808 cl09929 MopB_CT N/A. This domain is found in various molybdopterin-containing oxidoreductases and tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD); where the domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor. This domain corresponds to the C-terminal domain IV in dimethyl sulfoxide (DMSO) reductase which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules. 0
40921 415809 cl09930 RPA_2b-aaRSs_OBF_like Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold. Replication protein A contains two OB domains in its DNA binding region. This is the second of the OB domains. 0
40922 415810 cl09932 Acyl-CoA_dh_N Acyl-CoA dehydrogenase, N-terminal domain. Acyl-coenzyme A oxidase consists of three domains. An N-terminal alpha-helical domain, a beta sheet domain (pfam02770) and a C-terminal catalytic domain (pfam01756). This entry represents the N-terminal alpha-helical domain. 0
40923 415811 cl09933 ACAD Acyl-CoA dehydrogenase. C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle. 0
40924 415812 cl09936 PP-binding Phosphopantetheine attachment site. acyl carrier protein; Provisional 0
40925 415813 cl09938 cond_enzymes N/A. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.180, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria. 0
40926 415814 cl09940 S4 N/A. This domain is found at the C-terminus of fungal tyrosyl-tRNA synthetases. It binds to group I introns. 0
40927 415815 cl09943 Ribosomal_L29_HIP N/A. This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure. 0
40928 415816 cl09951 FN2 N/A. One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function. 0
40929 415817 cl09954 DUF202 Domain of unknown function (DUF202). This family consists of hypothetical proteins some of which are putative membrane proteins. No functional information or experimental verification of function is known. This domain is around 100 amino acids long. 0
40930 415818 cl09957 zf-UBP Zn-finger in ubiquitin-hydrolases and other protein. 0
40931 415819 cl09961 DUF1027 Protein of unknown function (DUF1027). This family consists of several hypothetical bacterial proteins of unknown function. 0
40932 415820 cl09962 DUF771 Domain of unknown function (DUF771). Family of uncharacterized ORFs found in Bacteriophage and Lactococcus lactis. 0
40933 415822 cl10011 Periplasmic_Binding_Protein_type1 Type 1 periplasmic binding fold superfamily. This family includes a diverse range of periplasmic binding proteins. 0
40934 415823 cl10012 DnaQ_like_exo DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily. This is a highly divergent 3' exoribonuclease family. The proteins constitute a typical RNase fold, where the active site residues form a magnesium catalytic centre. The protein of the solved structure readily cleaves 3' overhangs in a time-dependent manner. It is similar to DEDD-type RNases and is an unusual ATP-binding protein that binds ATP and dATP. It forms a dimer in solution and both protomers in the asymmetric unit bind a magnesium ion through Asp-6 in UniProtKB:P9WJ73. 0
40935 415824 cl10013 Glycosyltransferase_GTB-type glycosyltransferase family 1 and related proteins with GTB topology. Asp1, along with SecY2, SecA2, and other proteins forms part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. Asp1 is predicted to be cytosolic. 0
40936 415825 cl10014 PTS_IIB N/A. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family are one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar structure to mammalian tyrosine phosphatases. This family also contains the fructose specific IIB subunit. 0
40937 415826 cl10015 YjgF_YER057c_UK114_family N/A. YjgF_Endoribonuc is a putative endoribonuclease. The structure is of beta-alpha-beta-alpha-beta(2) domains common both to bacterial chorismate mutase and to members of the YjgF family. These proteins form trimers with a three-fold symmetry with three closely-packed beta-sheets. The YjgF family is a large, widely distributed family of proteins of unknown biochemical function that are highly conserved among eubacteria, archaea and eukaryotes. 0
40938 415827 cl10017 Tubulin_FtsZ_Cetz-like Tubulin protein family of FtsZ and CetZ-like. Many of the residues conserved in Tubulin, pfam00091, are also highly conserved in this family. 0
40939 415828 cl10019 PurM-like N/A. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The function of the C-terminal domain of AIR synthase is unclear, but the cleft formed between N and C domains is postulated as a sulphate binding site. 0
40940 415829 cl10020 S2P-M50 N/A. This is a family of bacterial and plant peptidases in the same family as MEROPS:M50B. 0
40941 415830 cl10022 ABM Antibiotic biosynthesis monooxygenase. The function of this family is unknown, but it is upregulated in response to salt stress in Populus balsamifera. It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. Arthrobacter nicotinovorans ORF106 is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an a/b barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence. 0
40942 353046 cl10023 POLBc N/A. DNA polymerase subunit B; Provisional 0
40943 415831 cl10029 Histidinol_dh N/A. histidinol dehydrogenase; Reviewed 0
40944 415832 cl10030 MECDP_synthase N/A. The ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis). 0
40945 415833 cl10031 DUF1190 Protein of unknown function (DUF1190). hypothetical protein 0
40946 415834 cl10037 AroH N/A. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine. 0
40947 415835 cl10043 hemP Hemin uptake protein hemP. hypothetical protein; Provisional 0
40948 415836 cl10045 tRNA_int_endo tRNA intron endonuclease, catalytic C-terminal domain. tRNA-splicing endonuclease subunit beta; Reviewed 0
40949 415837 cl10048 TonB_C Gram-negative bacterial TonB protein C-terminal. This family contains TonB members that are not captured by pfam03544. 0
40950 415838 cl10072 Phage_Mu_F Phage Mu protein F like protein. Family of related phage minor capsid proteins. 0
40951 415839 cl10080 RPE65 Retinal pigment epithelial membrane protein. 9-cis-epoxycarotenoid dioxygenase 0
40952 415840 cl10125 DUF3461 Protein of unknown function (DUF3461). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length. This protein has two conserved sequence motifs: KFK and HLE. 0
40953 415841 cl10143 DUF5431 Family of unknown function (DUF5431). modulator of post-segregation killing protein; Provisional 0
40954 415842 cl10149 TraW_N Sex factor F TraW protein N terminal. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm. 0
40955 415843 cl10177 DUF5455 Family of unknown function (DUF5455). minor coat protein 0
40956 353058 cl10198 DUF5466 Family of unknown function (DUF5466). hypothetical protein 0
40957 353059 cl10201 O_Spanin_T7 outer-membrane spanin sub-unit. phage lambda Rz1-like protein 0
40958 415844 cl10205 Tube Tail tubular protein. tail tubular protein A 0
40959 385669 cl10212 DUF5476 Family of unknown function (DUF5476). hypothetical protein 0
40960 353062 cl10214 TA_inhibitor Inhibitor of toxin/antitoxin system (Gp4.5). hypothetical protein 0
40961 353063 cl10215 DUF5471 Family of unknown function (DUF5471). hypothetical protein 0
40962 353064 cl10223 DUF5480 Family of unknown function (DUF5480). hypothetical protein 0
40963 353065 cl10228 p6 Histone-like Protein p6. dsDNA binding protein 0
40964 415845 cl10256 YecR YecR-like lipoprotein. hypothetical protein 0
40965 353066 cl10264 DUF5493 Family of unknown function (DUF5493). hypothetical protein 0
40966 353067 cl10269 DUF5517 Family of unknown function (DUF5517). hypothetical protein 0
40967 353068 cl10273 DUF5489 Family of unknown function (DUF5489). hypothetical protein 0
40968 385670 cl10291 DUF2523 Protein of unknown function (DUF2523). putative minor coat protein 0
40969 353069 cl10305 Gp17 Superinfection exclusion protein, bacteriophage P22. hypothetical protein 0
40970 353070 cl10308 Phi29_Phage_SSB Phage Single-stranded DNA-binding protein. hypothetical protein 0
40971 385671 cl10335 Phage_TAC_12 Phage tail assembly chaperone protein, TAC. hypothetical protein 0
40972 415846 cl10351 Phage_gp49_66 Phage protein (N4 Gp49/phage Sf6 gene 66) family. hypothetical protein 0
40973 415847 cl10447 GH18_chitinase-like N/A. This DUF is likely to be a form of glycosyl hydrolase from CAZy family 18, possibly chitinase 18. This would have the EC number of EC:3.2.1.14. 0
40974 415848 cl10448 GH25_muramidase N/A. This domain is found in a set of uncharacterized hypothetical bacterial proteins. 0
40975 415849 cl10459 Peptidases_S8_S53 Peptidase domain in the S8 and S53 families. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567. 0
40976 415851 cl10465 Peptidase_S24_S26 N/A. The C-terminal domain of the CI repressor functions in oligomer formation. 0
40977 415853 cl10468 TerC Integral membrane protein TerC family. Predicted to be an integral membrane protein with multiple membrane spans. 0
40978 415854 cl10470 Rick_17kDa_Anti Glycine zipper 2TM domain. hypothetical protein; Provisional 0
40979 415855 cl10471 LU N/A. UPAR_LY6_2 is a family of higher eukaryotic proteins expressed in neurons. It modulates nicotinic acetylcholine receptors by selectively increasing Ca2+-influx through this ion channel. The family carries an LU protein domain - about 80 amino acids long characterized by a conserved pattern of 10 cysteine residues. The family is a positive feedback regulator of Wnt/beta-catenin signalling, eg for patterning of the mesoderm and neuroectoderm in zebrafish gastrulation, where Lypd6 is GPI-anchored to the plasma-membrane and interacts with the Wnt receptor Frizzled8 and the co-receptor Lrp6. 0
40980 415856 cl10479 DUF413 Protein of unknown function, DUF. hypothetical protein; Provisional 0
40981 415857 cl10480 DUF2157 Predicted membrane protein (DUF2157). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
40982 415858 cl10492 DUF596 Protein of unknown function, DUF596. This family contains several uncharacterized proteins. 0
40983 415859 cl10501 DUF1223 Protein of unknown function (DUF1223). This family consists of several hypothetical proteins of around 250 residues in length which are found in both plants and bacteria. The function of this family is unknown. Structurally it lies in the Thioredoxin-like superfamily. 0
40984 415860 cl10502 lipocalin_FABP lipocalin/cytosolic fatty acid-binding protein family. This domain forms a beta barrel structure but the function is unknown. The GO annotation for this protein indicates that the protein has a function in nematode larval development and has a positive regulation on growth rate. 0
40985 415861 cl10503 DUF1737 Domain of unknown function (DUF1737). This domain of unknown function is found at the N-terminus of bacterial and viral hypothetical proteins. 0
40986 299184 cl10504 DUF975 Protein of unknown function (DUF975). Family of uncharacterized bacterial proteins. 0
40987 415862 cl10507 Disintegrin Disintegrin. Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs. 0
40988 415863 cl10509 PAW PNGase C-terminal domain, mannose-binding module PAW. present in several copies in proteins with unknown function in C. elegans 0
40989 415864 cl10511 Beach N/A. The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown. 0
40990 415869 cl10557 Dak1 Dak1 domain. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the DhaK subunit of the latter type of dihydroxyacetone kinase, but it specifically excludes the DhaK paralog DhaK2 (TIGR02362) found in the same operon as DhaK and DhaK in the Firmicutes. 0
40991 415871 cl10571 GT_MraY-like N/A. phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional 0
40992 385701 cl10591 Bro-N BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins. 0
40993 415878 cl10615 sm_acid_XPC-like small acidic domain of Xeroderma pigmentosum group C complementing protein and similar proteins. This model represents the small acidic domain of mammalian Xeroderma pigmentosum group C complementing protein (XPC), yeast Rad4, and similar proteins. XPC/Rad4 recruits transcription/repair factor IIH (TFIIH) to the nucleotide excision repair (NER) complex through interactions with its p62/Tfb1 and XPB/Ssl2 TFIIH subunits. Global genome repair (GGR), one of two NER initiation pathways in mammals, starts with DNA lesion detection by XPC. XPC is a structure specific DNA-binding factor that recognizes distortion of the damaged DNA double helix and recruits the TFIIH complex onto the lesion to open up the damaged DNA. The small acidic domain of XPC/Rad4 interacts with the pleckstrin homology (PH) domain of the p62/Tfb1 subunit of TFIIH. 0
40994 415890 cl10701 FIST FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids. 0
40995 415894 cl10713 Phage_pRha Phage regulatory protein Rha (Phage_pRha). Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that pRha is a phage regulatory protein. [Mobile and extrachromosomal element functions, Prophage functions] 0
40996 415896 cl10717 CactinC_cactus Cactus-binding C-terminus of cactin protein. SF3A2 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1), with Prp21 wrapping around Prp11. 0
40997 415902 cl10727 E3_UFM1_ligase E3 UFM1-protein ligase 1. E3 UFM1-protein ligase 1 homolog; Provisional 0
40998 415925 cl10767 AD Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. 0
40999 415965 cl10870 NDFIP-like NEDD4 family-interacting protein. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP2 may play a role in protein trafficking. 0
41000 415977 cl10889 Cir_N N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex. 0
41001 415994 cl10918 Cg6151-P Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined. 0
41002 416027 cl10970 AP_MHD_Cterm C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD). The muniscins are a family of endocytic adaptors that is conserved from yeast to humans. This C-terminal domain is structurally similar to mu homology domains, and is the region of the muniscin proteins involved in the interactions with the endocytic adaptor-scaffold proteins Ede1-eps15. This interaction influences muniscin localization. The muniscins provide a combined adaptor-membrane-tubulation activity that is important for regulating endocytosis. 0
41003 416063 cl11037 EKR Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known. 0
41004 416081 cl11062 BHD_1 Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA. 0
41005 416082 cl11063 BHD_2 Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA. 0
41006 416083 cl11065 TAF8 TATA Binding Protein (TBP) Associated Factor 8. This is the C-terminal, Delta, part of the TAF8 protein. The N-terminal is generally the histone fold domain, Bromo_TP (pfam07524). TAF8 is one of the key subunits of the transcription factor for pol II, TFIID. TAF8 is one of the several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery. 0
41007 416092 cl11081 dermokine dermokine. This region has been called the argonaute hook. It has been shown to bind to the Piwi domain pfam02171 of Argonaute proteins. 0
41008 416130 cl11158 BEN BEN domain. hypothetical protein; Provisional 0
41009 416136 cl11171 Dev_Cell_Death Development and cell death domain. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone. 0
41010 416146 cl11186 Cullin_Nedd8 Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue. 0
41011 416154 cl11198 zinc_ribbon_2 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR. pfam12773. 0
41012 416181 cl11253 Germane Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 pfam01520 Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold. 0
41013 299614 cl11266 EssA WXG100 protein secretion system (Wss), protein EssA. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions. [Protein fate, Protein and peptide secretion and trafficking] 0
41014 416196 cl11278 DUF2492 Protein of unknown function (DUF2492). This model describes a family of small cytosolic proteins, about 80 amino acids in length, in which the eight invariant residues include three His residues and two Cys residues. Two pairs of these invariant residues occur in motifs HxH (where x is A or G) and CxH, both of which suggest metal-binding activity. This protein family was identified by searching with a phylogenetic profile based on an anaerobic sulfatase-maturase enzyme, which contains multiple 4Fe-4S clusters. The linkages by phylogenetic profiling and by iron-sulfur cluster-related motifs together suggest this protein may be an accessory protein to certain maturases in sulfatase/maturase systems. 0
41015 416236 cl11367 EspA_EspE EspA/EspE family. This family of mycobacterial proteins are uncharacterized. 0
41016 416244 cl11377 NADH-u_ox-rdase NADH-ubiquinone oxidoreductase complex I, 21 kDa subunit. complex I subunit 0
41017 416253 cl11393 Peptidase_M14_like M14 family of metallocarboxypeptidases and related proteins. This is the peptidase domain of a D,L-carboxypeptidase. The active site residues are Arg86, Glu222 and the metal ligands, in the peptidase domain, are Gln46, Glu49 and His128 in UniProtKB:O25708. The protein binds many zinc ions and a calcium ion and there are other metal binding sites. The catalytic activity is the release of m-Dpm from the peptide muramyl-Ala-gamma-D-Glu-m-Dpm; this is probably the precursor of the cell wall cross-linking peptide. 0
41018 416254 cl11394 Glyco_tranf_GTA_type Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis. 0
41019 416255 cl11395 Pkinase_C Protein kinase C terminal domain. 0
41020 416256 cl11396 Patatin_and_cPLA2 Patatins and Phospholipases. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates. 0
41021 416257 cl11397 NR_LBD The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators. This all helical domain is involved in binding the hormone in these receptors. 0
41022 416258 cl11399 HP Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The smaller branch 2 contains predominantly eukaryotic proteins. The catalytic functions in members include phytase, glucose-1-phosphatase and multiple inositol polyphosphate phosphatase. The in vivo roles of the mammalian acid phosphatases in branch 2 are not fully understood, although activity against lysophosphatidic acid and tyrosine-phosphorylated proteins has been demonstrated. 0
41023 416259 cl11403 pepsin_retropepsin_like Cellular and retroviral pepsin-like aspartate proteases. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival. 0
41024 416260 cl11404 Biotinyl_lipoyl_domains N/A. HlyD_D4 is the long alpha-hairpin domain in the centre of CusB or HlyD proteins. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons. HlyD_D4 is thought to interact with the alpha-helical tunnels of the corresponding outer-membrane channels, ie the periplasmic domain of CusC. 0
41025 416261 cl11409 RNAP_RPB11_RPB3 RPB11 and RPB3 subunits of RNA polymerase. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp. Many of the members of this family carry only the N-terminal region of Rpb11. 0
41026 416262 cl11410 TPP_enzyme_PYR Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779. 0
41027 416263 cl11421 FAA_hydrolase Fumarylacetoacetate (FAA) hydrolase family. This bacterial family of proteins has no known function. 0
41028 416264 cl11423 VirB9_CagX_TrbG VirB9/CagX/TrbG, a component of the type IV secretion system. This family includes type IV secretion system CagX conjugation protein. Other members of this family are involved in conjugal transfer to plant cells of T-DNA. 0
41029 416265 cl11424 nitrilase Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotinidase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. Nitrilase-related proteins generally have a conserved E-K-C catalytic triad, and are multimeric alpha-beta-beta-alpha sandwich proteins. 0
41030 416266 cl11425 PSI_PSAK Photosystem I psaG / psaK. Members of this protein family are the PsaK of the photosystem I reaction center. Photosystems I and II occur together in the same sets of organisms. Photosystem I uses light energy to transfer electrons from plastocyanin to ferredoxin, while photosystem II uses light energy to split water and releases molecular oxygen. [Energy metabolism, Photosynthesis] 0
41031 416267 cl11433 DivIC Septum formation initiator. In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein pfam04977, however this interaction may be indirect. 0
41032 353252 cl11434 AlkD_like A new structural DNA glycosylase. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the identified five structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix). DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double strand DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerases, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity. The known structures for members of this family, AlkC and AlkD from Bacillus cereus, are distant homologues and are composed of six variant HEAT (Huntington/Elongation/A subunit/Target of rapamycin) repeats. HEAT motifs are ~45-amino acid sequences that form antiparallel alpha-helices, which are packed by a conserved hydrophobic interface and are tandemly repeated to form superhelical alpha-structures. AlkD and AlkC are specific for removal of 3-methyladenine (3mA) and 7-methylguanine (7mG) from the DNA by base excision repair. Homologues of AlkC and AlkD were also identified in other organisms. 0
41033 416268 cl11435 DMB-PRT_CobT Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT), also called CobT. This family of proteins represent the nicotinate-nucleotide- dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin. 0
41034 416269 cl11436 DNA_III_psi DNA polymerase III psi subunit. This small subunit of the DNA polymerase III holoenzyme in E. coli and related species appears to have a narrow taxonomic distribution. It is not found so far outside the gamma subdivision proteobacteria. [DNA metabolism, DNA replication, recombination, and repair] 0
41035 416270 cl11437 DUF2057 Uncharacterized protein conserved in bacteria (DUF2057). hypothetical protein; Provisional 0
41036 416271 cl11440 AstA Arginine N-succinyltransferase beta subunit. In some bacteria, including Pseudomonas aeruginosa, the astB gene (arginine N-succinyltransferase) is replaced by tandem paralogs that form a heterodimer. This heterodimer from P. aeruginosa is characterized as arginine and ornithine N-2 succinyltransferase (AOST). Members of this protein family represent the less widespread paralog, designated AruI, or arginine/ornithine succinyltransferase, alpha subunit. 0
41037 416272 cl11442 Cas2_I_II_III CRISPR/Cas system-associated protein Cas2. Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages. 0
41038 416273 cl11443 Cas6 Class 1 CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type III CRISPR-Cas systems and similar proteins. 0
41039 416274 cl11449 DUF406 Protein of unknown function (DUF406). These small proteins are approximately 100 amino acids in length and appear to be found only in gamma proteobacteria. The function of this protein family is unknown. [Hypothetical proteins, Conserved] 0
41040 416275 cl11450 MtlR Mannitol repressor. putative DNA-binding transcriptional regulator; Provisional 0
41041 416276 cl11451 Cyd_oper_YbgE Cyd operon protein YbgE (Cyd_oper_YbgE). hypothetical protein; Provisional 0
41042 416277 cl11452 H_PPase Inorganic H+ pyrophosphatase. This model describes proton pyrophosphatases from eukaryotes (predominantly plants), archaea and bacteria. It is an integral membrane protein and is suggested to have about 15 membrane spanning domains. Proton translocating inorganic pyrophosphatase, like H(+)-ATPase, acidifies the vacuoles and is pivotal to the vacuolar secondary active transport systems in plants. [Transport and binding proteins, Cations and iron carrying compounds] 0
41043 416278 cl11454 FlaF Flagellar protein FlaF. flagellar biosynthesis regulatory protein FlaF; Reviewed 0
41044 416279 cl11455 FlbT Flagellar protein FlbT. flagellar biosynthesis repressor FlbT; Reviewed 0
41045 416280 cl11456 DUF1375 Protein of unknown function (DUF1375). hypothetical protein; Provisional 0
41046 416281 cl11457 Secretoglobin N/A. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated. 0
41047 386124 cl11461 Phage_H_T_join Phage head-tail joining protein. This family describes a small protein of about 100 amino acids found in bacteriophage and in bacterial prophage regions. Examples include gp9 of phage HK022 and gp16 of phage SPP1. This minor structural protein is suggested to be a head-tail adaptor protein (although the source of this annotation was not traced during construction of this model). [Mobile and extrachromosomal element functions, Prophage functions] 0
41048 416282 cl11463 Phage_TTP_12 Lambda phage tail tube protein, TTP. characterized members are major tail tube proteins from various phages, including lactococcal temperate bacteriophage TP901-1. 0
41049 416283 cl11466 STI N/A. Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors. 0
41050 416284 cl11468 KicB MukF winged-helix domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid. 0
41051 416285 cl11470 SeqA SeqA protein C-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor. 0
41052 416286 cl11471 MukE bacterial condensin complex subunit MukE. Bacterial protein involved in chromosome partitioning, MukE 0
41053 416287 cl11472 DUF440 Protein of unknown function, DUF440. dsDNA-mimic protein; Reviewed 0
41054 416288 cl11473 DUF1043 Protein of unknown function (DUF1043). This family consists of several hypothetical bacterial proteins of unknown function. 0
41055 416289 cl11474 UPF0231 Uncharacterized protein family (UPF0231). hypothetical protein; Provisional 0
41056 416290 cl11475 CcmD Heme exporter protein D (CcmD). The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins. 0
41057 416291 cl11478 Rsd_AlgQ Regulator of RNA polymerase sigma(70) subunit, Rsd/AlgQ. This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homolog, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in Pseudomonas aeruginosa isolates from cystic fibrosis patients. 0
41058 416292 cl11479 SMP_2 Bacterial virulence factor haemolysin. Members of this family of bacterial proteins are membrane proteins that effect the expression of haemolysin under anaerobic conditions. 0
41059 416293 cl11481 DUF1145 Protein of unknown function (DUF1145). This family consists of several hypothetical bacterial proteins of unknown function. 0
41060 416294 cl11483 PriC Primosomal replication protein priC. primosomal replication protein N''; Provisional 0
41061 416295 cl11485 YozE_SAM_like YozE SAM-like fold. hypothetical protein; Provisional 0
41062 416296 cl11488 DUF1450 Protein of unknown function (DUF1450). hypothetical protein; Provisional 0
41063 416297 cl11491 Phasin_2 Phasin protein. Members of this protein family are encoded in polyhydroxyalkanoic acid storage system regions in Vibrio, Photobacterium profundum SS9, Acinetobacter sp., Aeromonas hydrophila, and several species of Vibrio. Members appear distantly related to the phasin family proteins modeled by TIGR01841 and TIGR01985. 0
41064 416298 cl11492 DUF1447 Protein of unknown function (DUF1447). hypothetical protein; Provisional 0
41065 416299 cl11493 PQQ_DH_like PQQ-dependent dehydrogenases and related proteins. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Members of this family have several predicted transmembrane helices in the N-terminal region, and include the quinoprotein glucose dehydrogenase (EC 1.1.5.2) of Escherichia coli and the quinate/shikimate dehydrogenase of Acinetobacter sp. ADP1 (EC 1.1.99.25). Sequences closely related except for the absence of the N-terminal hydrophobic region, scoring in the gray zone between the trusted and noise cutoffs, include PQQ-dependent glycerol (EC 1.1.99.22) and other polyol (sugar alcohol) dehydrogenases. 0
41066 386143 cl11495 IncFII_repA IncFII RepA protein family. replication protein; Provisional 0
41067 299749 cl11500 Phage_Treg Lactococcus bacteriophage putative transcription regulator. putative transcription regulator; Provisional 0
41068 416300 cl11501 HHA Haemolysin expression modulating protein. This family consists of haemolysin expression modulating protein (HHA) homologs. YmoA and Hha are highly similar bacterial proteins downregulating gene expression in Yersinia enterocolitica and Escherichia coli, respectively. 0
41069 416301 cl11502 Ter DNA replication terminus site-binding protein (Ter protein). DNA replication terminus site-binding protein; Provisional 0
41070 299752 cl11503 TraA TraA. conjugal transfer pilin subunit TraA; Provisional 0
41071 416302 cl11505 Sif Sif protein. This family consists of several SifA and SifB and SseJ proteins which seem to be specific to the Salmonella species. SifA, SifB and SseJ have been demonstrated to localize to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function. 0
41072 416303 cl11506 CrgA Cell division protein CrgA. putative septation inhibitor protein; Reviewed 0
41073 416304 cl11507 DUF1471 Protein of unknown function (DUF1471). hypothetical protein; Provisional 0
41074 299755 cl11508 NUMOD1 NUMOD1 domain. Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished). 0
41075 299756 cl11513 Chlamy_scaf Chlamydia-phage Chp2 scaffold (Chlamy_scaf). minor capsid protein 0
41076 416305 cl11515 TrbI_Ftype Type-F conjugative transfer system protein (TrbI_Ftype). This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm. 0
41077 386150 cl11516 TraQ Type-F conjugative transfer system pilin chaperone (TraQ). conjugal transfer pilin chaperone TraQ; Provisional 0
41078 416306 cl11518 IL4 Interleukin 4. Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells. 0
41079 416307 cl11519 DENN DENN (AEX-3) domain. The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 0
41080 416308 cl11522 Tom22 Mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom22 proteins. [Transport and binding proteins, Amino acids, peptides and amines] 0
41081 175307 cl11526 Phage_connector Phage Connector (GP10). putative upper collar protein 0
41082 386154 cl11530 DUF104 Protein of unknown function DUF104. This family includes short archaebacterial proteins of unknown function. Archaeoglobus fulgidus has twelve copies of this protein, with several being clustered together in the genome. 0
41083 416309 cl11531 ZapB Cell division protein ZapB. septal ring assembly protein ZapB; Provisional 0
41084 416310 cl11533 DUF1013 Protein of unknown function (DUF1013). Family of uncharacterized proteins found in Proteobacteria. 0
41085 416311 cl11538 ssDNA-exonuc_C Single-strand DNA-specific exonuclease, C terminal domain. Members of this set of prokaryotic domains are found in a set of single-strand DNA-specific exonucleases, including RecJ. Their exact function has not, as yet, been determined. 0
41086 299766 cl11540 Mu-like_Com Mu-like prophage protein Com. Members of this family of proteins comprise the translational regulator of mom. 0
41087 299767 cl11541 CoiA Competence protein CoiA-like family. Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterized protein. 0
41088 416312 cl11542 EcsB Bacterial ABC transporter protein EcsB. This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters. 0
41089 416313 cl11545 DUF1820 Domain of unknown function (DUF1820). This family includes small functionally uncharacterized proteins around 100 amino acids in length. 0
41090 386160 cl11547 HTH_43 Winged helix-turn helix. This family, found in various hypothetical prokaryotic proteins, is a probable winged helix DNA-binding domain. 0
41091 416314 cl11548 DUF2140 Uncharacterized protein conserved in bacteria (DUF2140). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
41092 416315 cl11550 DUF1797 Protein of unknown function (DUF1797). This is a domain of unknown function. It forms a central anti-parallel beta sheet with flanking alpha helical regions. 0
41093 416316 cl11551 DUF1149 Protein of unknown function (DUF1149). This family consists of several hypothetical bacterial proteins of unknown function. 0
41094 386164 cl11552 DUF1462 Protein of unknown function (DUF1462). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown. 0
41095 416317 cl11555 DUF1129 Protein of unknown function (DUF1129). This family consists of several hypothetical bacterial proteins of unknown function. 0
41096 416318 cl11560 ComK ComK protein. This family consists of several bacterial ComK proteins. The ComK protein of Bacillus subtilis positively regulates the transcription of several late competence genes as well as comK itself. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level. 0
41097 416319 cl11562 DUF1465 Protein of unknown function (DUF1465). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 0
41098 416320 cl11564 GcrA GcrA cell cycle regulator. GcrA is a master cell cycle regulator that, together with CtrA (see pfam00072 and pfam00486), is involved in controlling cell cycle progression and asymmetric polar morphogenesis. During this process, there are temporal and spatial variations in the concentrations of GcrA and CtrA. The variation in concentration produces time and space dependent transcriptional regulation of modular functions that implement cell-cycle processes. More specifically, GcrA acts as an activator of components of the replisome and the segregation machinery. 0
41099 416321 cl11568 DUF1491 Protein of unknown function (DUF1491). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 0
41100 416322 cl11569 DUF1467 Protein of unknown function (DUF1467). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown. 0
41101 416323 cl11570 DUF1489 Protein of unknown function (DUF1489). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 0
41102 416324 cl11571 DUF1476 Domain of unknown function (DUF1476). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown. 0
41103 386171 cl11574 DUF2279 Predicted periplasmic lipoprotein (DUF2279). This domain, found in various hypothetical bacterial proteins, has no known function. 0
41104 416325 cl11576 DUF1398 Protein of unknown function (DUF1398). This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown. 0
41105 416326 cl11577 DUF1150 Protein of unknown function (DUF1150). This family consists of several hypothetical bacterial proteins of unknown function. 0
41106 416327 cl11580 DUF2002 Protein of unknown function (DUF2002). hypothetical protein; Provisional 0
41107 416328 cl11584 DUF2591 Protein of unknown function (DUF2591). hypothetical protein 0
41108 299787 cl11585 Phage_X Phage X family. gene X product; Reviewed 0
41109 416329 cl11586 snake_toxin N/A. This family predominantly includes venomous neurotoxins and cytotoxins from snakes, but also structurally similar (non-snake) toxin-like proteins (TOLIPs) such as Lymphocyte antigen 6D and Ly6/PLAUR domain-containing protein. Snake toxins are short proteins with a compact, disulphide-rich structure. TOLIPs have similar structural features (abundance of spaced cysteine residues, a high frequency of charge residues, a signal peptide for secretion and a compact structure) but, are not associated with a venom gland or poisonous function. They are endogenous animal proteins that are not restricted to poisonous animals. 0
41110 416330 cl11589 Knot1 N/A. Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins. 0
41111 416331 cl11592 zf-CCCH Zinc finger C-x8-C-x5-C-x3-H type (and similar). 0
41112 416332 cl11594 SSI Subtilisin inhibitor-like. 0
41113 416333 cl11600 PBP_GOBP PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP). 0
41114 416334 cl11602 IL7 Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine; although it was originally described as a T-cell growth factor, its function in T-cell response remains unclear. 0
41115 416335 cl11603 Basic Myogenic Basic domain. This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure. 0
41116 386183 cl11607 7TM_GPCR_Srab Serpentine type 7TM GPCR receptor class ab chemoreceptor. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srb is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 0
41117 299794 cl11610 Phage_G Major spike protein (G protein). major spike protein 0
41118 416336 cl11612 DUF243 Domain of unknown function (DUF243). This family of uncharacterized proteins is only found in fly proteins. It is found associated with YLP motifs pfam02757 in some proteins. 0
41119 416337 cl11614 Peptidase_S77 Prohead core protein serine protease. prohead core scaffolding protein and protease 0
41120 416338 cl11619 SPK Domain of unknown function (DUF545). Family of uncharacterized C. elegans proteins. The region represented by this family is found to be repeated up to four times in some proteins. 0
41121 386187 cl11622 Phage_endo_I Phage endonuclease I. endonuclease I 0
41122 416339 cl11625 Podovirus_Gp16 Podovirus DNA encapsidation protein (Gp16). DNA encapsidation protein 0
41123 416340 cl11627 HlyE Haemolysin E (HlyE). This family consists of several enterobacterial haemolysin (HlyE) proteins. Hemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterized pore-forming E. coli hemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid borne ehxA gene of E. coli 0157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a hemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. coli. Anaerobic expression is controlled by the transcription factor, FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host. 0
41124 416341 cl11629 KdgM Oligogalacturonate-specific porin protein (KdgM). This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacterium Erwinia chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric. 0
41125 416342 cl11630 DinI DinI-like family. DNA damage-inducible protein I; Provisional 0
41126 271727 cl11632 DUF1035 Protein of unknown function (DUF1035). structural protein V1; Reviewed 0
41127 416343 cl11636 SecM Secretion monitor precursor protein (SecM). This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling. 0
41128 416344 cl11637 Mth_Ecto N/A. This family represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. This family is found in conjunction with pfam00002. 0
41129 416345 cl11643 WzyE WzyE protein, O-antigen assembly polymerase. This family consists of several WzyE proteins which appear to be specific to Enterobacteria. Members of this family are described as putative ECA polymerases; this has been found to be incorrect. The function of this family is unknown. The family is a transmembrane family with up to 11 TM regions, and is necessary for the assembly of O-antigen lipopolysaccharide. 0
41130 299806 cl11645 DUF1293 Protein of unknown function (DUF1293). hypothetical protein 0
41131 416346 cl11647 MalM Maltose operon periplasmic protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown. 0
41132 386195 cl11648 DUF1418 Protein of unknown function (DUF1418). hypothetical protein; Provisional 0
41133 416347 cl11650 DUF1431 Protein of unknown function (DUF1431). This family contains a number of Drosophila melanogaster proteins of unknown function. These contain several conserved cysteine residues. 0
41134 264457 cl11652 TraP TraP protein. conjugal transfer protein TraP; Provisional 0
41135 416348 cl11653 Crl Transcriptional regulator Crl. This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibers that are found on the surface of certain bacteria. 0
41136 416349 cl11654 DUF1516 Protein of unknown function (DUF1516). hypothetical protein; Provisional 0
41137 159607 cl11655 PRE_C2HC Associated with zinc fingers. The function of this domain is unknown and it is often found associated with pfam00096. 0
41138 416350 cl11656 FCD FCD domain. This family contains sequences that are similar to the fatty acid metabolism regulator protein (FadR). This functions as a dimer, with each monomer being composed of an N-terminal DNA-binding domain and a regulatory C-terminal domain. A linker comprising two short alpha helices joins the two domains. In the C-terminal domain, an antiparallel array of six alpha helices forms a barrel-like structure, while a seventh alpha helix forms a 'lid' at the end closest to the N-terminal domain. This structure was found to be similar to that of the C-terminal domain of the Tet repressor. Long-chain acyl-CoA thioesters interact directly and reversibly with the C-terminal domain, and this interaction affects the structure and therefore the DNA binding properties of the N-terminal domain. 0
41139 416351 cl11657 DM4_12 DM4/DM12 family. This family contains sequences derived from hypothetical proteins expressed by two insect species, D. melanogaster and A. gambiae. The region in question is approximately 115 amino acid residues long and contains four highly conserved cysteine residues. 0
41140 416352 cl11660 PAN_3 PAN-like domain. 0
41141 416353 cl11665 7TM_GPCR_Srh Serpentine type 7TM GPCR chemoreceptor Srh. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sri is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 0
41142 416354 cl11672 DUF2509 Protein of unknown function (DUF2509). This family is conserved in Proteobacteria. The function is not known but many of the members are annotated as protein YgdB. 0
41143 416355 cl11677 FliX Class II flagellar assembly regulator. flagellar assembly regulator FliX; Reviewed 0
41144 416356 cl11681 Anti-adapt_IraP Sigma-S stabilisation anti-adaptor protein. This family is conserved in Enterobacteriaceae. It is one of a series of proteins, expressed by these bacteria in response to stress, that help to regulate Sigma-S, the stationary phase sigma factor of Escherichia coli and Salmonella. IraP is essential for Sigma-S stabilisation in some but not all starvation conditions. 0
41145 416357 cl11685 CdhC CO dehydrogenase/acetyl-CoA synthase complex beta subunit. acetyl-CoA decarbonylase/synthase complex subunit beta; Reviewed 0
41146 416358 cl11698 Lipoprotein_22 Uncharacterized lipoprotein family. hypothetical protein; Provisional 0
41147 275955 cl11748 LysW Lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is found also in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway. [Amino acid biosynthesis, Aspartate family] 0
41148 416362 cl11777 zinc_ribbon_4 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773. 0
41149 416363 cl11797 Flagellar_put Putative flagellar. Members of this family are found in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus. The specific molecular function is unknown. [Cellular processes, Chemotaxis and motility] 0
41150 416364 cl11819 LigD_N DNA polymerase Ligase (LigD). Most sequences in this family are the 3'-phosphoesterase domain of a multidomain, multifunctional DNA ligase, LigD, involved, along with bacterial Ku protein, in non-homologous end joining, the less common of two general mechanisms of repairing double-stranded breaks in DNA sequences. LigD is variable in architecture, as it lacks this domain in Bacillus subtilis, is permuted in Mycobacterium tuberculosis, and occasionally is encoded by tandem ORFs rather than as a multifunctional protein. In a few species (Dehalococcoides ethenogenes and the archaeal genus Methanosarcina), sequences corresponding to the ligase and polymerase domains of LigD are not found, and the role of this protein is unclear. [DNA metabolism, DNA replication, recombination, and repair] 0
41151 416365 cl11827 DUF3485 Protein of unknown function (DUF3485). In Methylobacillus sp strain 12S, EpsI is encoded immediately downstream of the multiple-membrane-spanning putative transporter EpsH, and is predicted to be a periplasmic protein involved in, but not required for, expression of the exopolysaccharide methanolan. In a number of other species, protein homologous to EpsI is encoded either next to EpsH or, more often, combined in a fused gene. We have proposed renaming EpsH, or the EpsHI fusion protein, to exosortase, based on its phylogenetic association with the PEP-CTERM proposed protein targeting signal. [Transport and binding proteins, Unknown substrate] 0
41152 416366 cl11840 DUF3289 Protein of unknown function (DUF3289). Members of this protein family have been found in several species of gammaproteobacteria, including Yersinia pestis and Y. pseudotuberculosis, Xylella fastidiosa, and Escherichia coli UTI89. As many as five members can be found in a single genome. The function is unknown. [Hypothetical proteins, Conserved] 0
41153 416367 cl11841 PSII_Pbs27 Photosystem II Pbs27. Members of this family are the Psb27 protein of the cyanobacterial photosynthetic supracomplex, photosystem II. Although most protein components of both cyanobacterial and chloroplast versions of photosystem II are closely related and described together by single models, this family is strictly bacterial. Some uncharacterized proteins with highly divergent sequences, from Arabidopsis, score between trusted and noise cutoffs for this model but are not at this time assigned as functionally equivalent photosystem II proteins. [Energy metabolism, Photosynthesis] 0
41154 416368 cl11843 DUF3623 Protein of unknown function (DUF3623). This uncharacterized protein family was identified, by the method of partial phylogenetic profiling, as having a matching phylogenetic distribution to that of the photosynthetic reaction center of the alpha-proteobacterial type. It is nearly always encoded near other photosynthesis-related genes, including puhA. [Energy metabolism, Photosynthesis] 0
41155 416369 cl11853 Couple_hipA HipA N-terminal domain. Although Pfam models pfam07805 and pfam07804 currently are called HipA-like N-terminal domain and HipA-like C-terminal domain, respectively, those models hit the central and C-terminal regions of E. coli HipA but not the N-terminal region. This model hits the N-terminal region of HipA and its homologs, and also identifies proteins that lack match regions for pfam07804 and pfam07805. 0
41156 275987 cl11864 Csf2_U CRISPR/Cas system-associated RAMP superfamily protein Csf2. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf2 (CRISPR/cas Subtype as in A. ferrooxidans protein 2), as it lies second closest to the repeats. 0
41157 187964 cl11865 Csf3_U CRISPR/Cas system-associated RAMP superfamily protein Csf3. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf3 (CRISPR/cas Subtype as in A. ferrooxidans protein 3), as it lies third closest to the repeats. 0
41158 416370 cl11869 Csc2_I-D CRISPR/Cas system-associated protein Csc2. The Csc2 Crispr family of proteins forms a core RNA recognition motif-like domain, flanked by three peripheral insertion domains: a lid domain, a Zinc-binding domain and a helical domain. The CRISPR-Cas system is possibly a mechanism of defence against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes. 0
41159 416371 cl11871 AtpR N-ATPase, AtpR subunit. Members of this protein family are uncharacterized, highly hydrophobic proteins encoded in the middle of apparent F1/F0 ATPase operons. We note, however, that this protein is both broadly and sparsely distributed. It is found in only about two percent of microbial genomes sequenced, with the first ten examples found coming from the Euryarchaeota, Chlorobia, Betaproteobacteria, Deltaproteobacteria, and Planctomycetes. In most of these species, the surrounding operon appears to represent a second F1/F0 ATPase system, and the member proteins belong to subfamilies with the same phylogenetic distribution as the current protein family. 0
41160 416372 cl11879 VasI Type VI secretion system VasI, EvfG, VC_A0118. Members of this protein family, including VC_A0118 from Vibrio cholerae El Tor N16961, are restricted to a subset of bacteria with the type VI secretion system, and are encoded among the type VI-associated pathogenicity islands. However, many species with type VI secretion lack a member of this family. This lack suggests that members of this family may be targets rather than components of the type VI secretion system. 0
41161 416373 cl11880 T6SS_VasJ Type VI secretion, EvfE, EvfF, ImpA, BimE, VC_A0119, VasJ. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41162 416374 cl11881 CBP_BcsG Cellulose biosynthesis protein BcsG. This protein was identified by the partial phylogenetic profiling algorithm as part of the system for cellulose biosynthesis in bacteria, and in fact is found in cellulose biosynthesis gene regions. The protein was designated YhjU in Salmonella enteritidis, where disruption of its gene disrupts cellulose biosynthesis and biofilm formation. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
41163 416375 cl11883 VPLPA-CTERM VPLPA-CTERM protein sorting domain. PepA was described in Zoogloea resiniphila as a PEP-CTERM protein regulated by the PrsK/PrsR two-component system. Knocking out that system blocks flocculation, after which expression of recombinant PepA can restore flocculation. 0
41164 416376 cl11887 DUF3738 Protein of unknown function (DUF3738). Bacterial reference strains encoding members of this protein family are all isolated from soil. These include 39 members from Solibacter usitatus Ellin6076, 27 from Acidobacterium sp. MP5ACTX8 (both Acidobacteria), and four from Pedosphaera parvula Ellin514 (Verrucomicrobia). The family is well-diversified, with few pairs showing greater than 50 % pairwise identity. A few members are fused to Peptidase_M56 domains (see pfam05569), to Sigma70_r2 domains (see pfam04542), or have a duplication of this domain. 0
41165 416377 cl11888 crt_membr carotene biosynthesis associated membrane protein. Proteins of this family are involved in the initiation of core alpha-(1,6) mannan biosynthesis of lipomannan (LM-A) and multi-mannosylated polymer (LM-B), extending triacylatedphosphatidyl-myo-inositol dimannoside (Ac1PIM2) and mannosylated glycolipid, 1,2-di-O-C16/C18:1-(alpha-D-mannopyranosyl)-(1->4)-(alpha-D-glucopyranosyluronic acid)-(1->3)-glycerol (Man1GlcAGroAc2), respectively. 0
41166 416378 cl11889 Lycopene_cyc Lycopene cyclase. This domain is often repeated twice within the same polypeptide, as is observed in Archaea, Thermus, Sphingobacteria and Fungi. In the fungal sequences, this tandem domain pair is observed as the N-terminal half of a bifunctional protein, where it has been characterized as a lycopene beta-cyclase and the C-terminal half is a phytoene synthetase. In Myxococcus and Actinobacterial genomes this domain appears as a single polypeptide, tandemly repeated and usually in a genomic context consistent with a role in carotenoid biosynthesis. It is unclear whether any of the sequences in this family truly encode lycopene epsilon cyclases. However, a number are annotated as such. The domain is generally hydrophobic with a number of predicted membrane spanning segments and contains a distinctive motif (hPhEEhhhhhh). In certain sequences one of either the proline or glutamates may vary, but one of the tandem pair always appears to match this canonical sequence exactly. 0
41167 187967 cl11892 Cas8c_I-C CRISPR/Cas system-associated protein Cas8c. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the N-terminal region or upstream gene; the C-terminal region is described by TIGR03486. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA. 0
41168 187968 cl11893 Cas8c'_I-D CRISPR/Cas system-associated protein Cas8c'. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the C-terminal region or downstream gene; the N-terminal region is modeled by TIGR03485. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA. 0
41169 187969 cl11894 Csp2_I-U CRISPR/Cas system-associated protein Cas8c. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype. 0
41170 187970 cl11895 Cas5_I CRISPR/Cas system-associated RAMP superfamily protein Cas5. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pgingi (Porphyromonas gingivalis) subtype, but shows some sequence similarity to genes of the Cas5 type (see TIGR02593). 0
41171 416379 cl11905 GldH_lipo GldH lipoprotein. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility. [Cellular processes, Chemotaxis and motility] 0
41172 416380 cl11917 DUF4312 Domain of unknown function (DUF4312). Members of this family of small (about 100 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 0
41173 416381 cl11918 DUF4310 Domain of unknown function (DUF4310). Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. 0
41174 416382 cl11919 DUF4311 Domain of unknown function (DUF4311). Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Unknown function, General] 0
41175 416383 cl11923 DUF2805 Protein of unknown function (DUF2805). This model describes an uncharacterized bacterial protein family. Members average about 90 amino acids in length with several well-conserved uncommon amino acids (Trp, Met). The majority of species are marine bacteria. Few species have more than one copy, but Vibrio cholerae El Tor N16961 has three identical copies. [Hypothetical proteins, Conserved] 0
41176 416384 cl11925 VioE Violacein biosynthetic enzyme VioE. This enzyme catalyzes the third step in violacein biosynthesis from a pair of Trp residues, as in Chromobacterium violaceum, but the first step that distinguishes that pathway from staurosporine (an indolocarbazole antibiotic) biosynthesis. [Cellular processes, Toxin production and resistance] 0
41177 416385 cl11927 DUF5801 Domain of unknown function (DUF5801). This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion. [Cellular processes, Pathogenesis] 0
41178 353324 cl11943 Activator-TraM Transcriptional activator TraM. conjugal transfer protein TraM; Provisional 0
41179 416386 cl11960 Ig Immunoglobulin domain. The non-classical mouse MHC class I (MHC-I) molecule Qa-1b is a non-polymorphic MHC molecule with an important function in innate immunity. It binds and presents signal peptides of classical MHC-I molecules at the cell surface and, as such, acts as an indirect sensor for the normal expression of MHC-I molecules. This signal peptide dominantly accommodated in the groove of Qa-1b is called Qdm, for Qa-1 determinant modifier, and its amino acid sequence AMAPRTLLL is highly conserved among mammalian species. The Qdm/Qa-1b complex serves as a ligand for the germ-line encoded heterodimeric CD94/NKG2A receptors expressed on natural killer (NK) cells and activated CD8+ T cells and transduces inhibitory signals to these lymphocytes. Thus, upon binding, Qa-1b signals NK cells not to engage in cell lysis. The molecular basis of Qa-1b function is unclear. 0
41180 416387 cl11961 ALDH-SF NAD(P)+-dependent aldehyde dehydrogenase superfamily. This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalyzed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components. 0
41181 416388 cl11964 CYTH-like_Pase CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases. This presumed domain is found in the yeast vacuolar transport chaperone proteins VTC2, VTC3 and VTC4. This domain is also found in a variety of bacterial proteins. 0
41182 416389 cl11965 terB_like tellurium resistance terB-like protein. This family contains the TerB tellurite resistance proteins from a number of bacteria. 0
41183 416390 cl11966 NT_Pol-beta-like Nucleotidyltransferase (NT) domain of DNA polymerase beta and similar proteins. This family is likely to be an uncharacterized group of nucleotidyltransferases. 0
41184 416391 cl11967 Nucleotidyl_cyc_III Class III nucleotidyl cyclases. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. It has been shown to be homologous to the adenylyl cyclase catalytic domain and has diguanylate cyclase activity. This observation correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. In the WspR protein of Pseudomonas aeruginosa, the GGDEF domain acts as a diguanylate cyclase, Structure 3bre, when the whole molecule appears to form a tetramer consisting of two symmetrically-related dimers representing a biological unit. The active site is the GGD/EF motif, buried in the structure, and the cyclic dimeric guanosine monophosphate (c-di-GMP) binds to the inhibitory-motif RxxD on the surface. The enzyme thus catalyzes the cyclisation of two guanosine triphosphate (GTP) molecules to one c-di-GMP molecule. 0
41185 416392 cl11968 harmonin_N_like N-terminal protein-binding module of harmonin and similar domains, also known as HHD (harmonin homology domain). CCM2_HHD is a folded-helical region of a family of vertebral proteins, mutations in which cause cerebral cavernous malformations (CCMs). These malformations are congenital vascular anomalies of the central nervous system that can result in haemorrhagic stroke, seizures, recurrent headaches, and focal neurologic deficits. This domain is structurally homologous to the N-terminal domain of harmonin, so it is named the CCM2 harmonin-homology domain or CCM2_HHD. This protein is often called Malcavernin. 0
41186 416393 cl11970 PriL Archaeal/eukaryotic core primase: Large subunit, PriL. DNA primase is the polymerase that synthesizes small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast S. cerevisiae). The large subunit of DNA primase forms interactions with the small subunit and the structure implies that it is not directly involved in catalysis, but plays roles in correctly positioning the primase/DNA complex, and in the transfer of RNA to DNA polymerase. 0
41187 416394 cl11971 PPK2 Polyphosphate kinase 2 (PPK2). Members of this protein family belong to the polyphosphate kinase 2 (PPK2) family, which is not related in sequence to PPK1. While PPK1 tends to act in the biosynthesis of polyphosphate, or poly(P), members of the PPK2 family tend to use the terminal phosphate of poly(P) to regenerate ATP or GTP from the corresponding nucleoside diphosphate, or ADP from AMP as is the case with polyphosphate:AMP phosphotransferase (PAP). Members of this protein family most likely transfer the terminal phosphate between poly(P) and some nucleotide, but it is not clear which. [Central intermediary metabolism, Phosphorus compounds] 0
41188 416395 cl11976 SNF Sodium:neurotransmitter symporter family. These are twelve xTM-containing region transporters. 0
41189 416396 cl11978 DUF2333 Uncharacterized protein conserved in bacteria (DUF2333). Members of this family of hypothetical bacterial proteins have no known function. 0
41190 416397 cl11979 Lon_2 Putative ATP-dependent Lon protease. putative ATP-dependent protease 0
41191 416398 cl11982 RHS_repeat RHS Repeat. This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin. 0
41192 416399 cl12004 Cas8c_I-C CRISPR/Cas system-associated protein Cas8c. CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins that tend to be found near CRISPR repeats. The species range, so far, is exclusively bacterial and mesophilic, although CRISPR loci are particularly common among the archaea and thermophilic bacteria. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 0
41193 416400 cl12007 Cby_like Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins. This family includes the eukaryotic chibby proteins. These proteins inhibit the wingless/Wnt pathway by binding to beta-catenin and inhibiting beta-catenin-mediated transcriptional activation. Chibby is Japanese for small, and is named after the RNAi phenotype seen in Drosophila. 0
41194 299861 cl12008 FANCE_c-term Fanconi anemia complementation group E protein, C-terminal domain. Fanconi Anaemia (FA) is a cancer predisposition disorder. In response to DNA damage, the FA core complex monoubiquitinates the downstream FANCD2 protein. The protein FANCE has an important role in DNA repair as it is the FANCD2-binding protein in the FA core complex so it represents the link between the FA core complex and FANCD2. The sequence shown is the C terminal domain of the protein which consists predominantly of helices and does not contain any beta-strand. The fold of the polypeptide is a continuous right-handed solenoidal pattern from the N terminal to the C terminal end. 0
41195 416401 cl12009 HmuY_like Bacterial proteins similar to Porphyromonas gingivalis HmuY and the C-terminal domain of PARMER_03218. HmuY is a novel heme-binding protein that recruits heme from host carriers and delivers it to its cognate outer-membrane transporter, the TonB-dependent receptor HmuR. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 278 amino acids in length. 0
41196 416402 cl12013 BAR The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature. BAR_12 is the BAR coiled-coil domain at the N-terminus of APPL or adaptor protein containing PH domain, PTB domain, and leucine zipper motif proteins in higher eukaryotes. This BAR domain contains four helices whereas the other classical BAR domains contain only three helices. The first three helices form an antiparallel coiled-coil, while the fourth helix, is unique to APPL1. BAR domains take part in many varied biological processes such as fission of synaptic vesicles, endocytosis, regulation of the actin cytoskeleton, transcriptional repression, cell-cell fusion, apoptosis, secretory vesicle fusion, and tissue differentiation. 0
41197 416404 cl12015 Adenylation_DNA_ligase_like Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases. PNKP_ligase is a classical ligase nucleotidyltransferase module of bacteria. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps. 0
41198 416406 cl12018 Peptidase_M48 Peptidase family M48. heat shock protein HtpX; Provisional 0
41199 416407 cl12020 Anticodon_Ia_like Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains. This DALR domain is found in cysteinyl-tRNA-synthetases. 0
41200 416408 cl12022 Ribosomal_L27A Ribosomal proteins 50S-L15, 50S-L18e, 60S-L27A. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
41201 416410 cl12033 Spt4 Transcription elongation factor Spt4. This family consists of several eukaryotic transcription elongation Spt4 proteins as well as archaebacterial RpoE2. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. RpoE2 is one of 13 subunits in the archaeal RNA polymerase. These proteins contain a C4-type zinc finger, and the structure has been solved in. The structure reveals that Spt4-Spt5 binding is governed by an acid-dipole interaction between Spt5 and Spt4, and the complex binds to and travels along the elongating RNA polymerase. The Spt4-Spt5 complex is likely to be an ancient, core component of the transcription elongation machinery. 0
41202 416412 cl12045 Ubiq_cyt_C_chap Ubiquinol-cytochrome C chaperone. 0
41203 416413 cl12046 DUF429 Protein of unknown function (DUF429). 0
41204 416414 cl12049 gp6_gp15_like Head-Tail Connector Proteins gp6 and gp15, and similar proteins. Some members in this family of proteins with unknown function are annotated as YqbG; however, this cannot be confirmed. Currently the proteins have no known function. 0
41205 353348 cl12054 Terminase_3 Phage terminase large subunit. This model detects members of a highly divergent family of the large subunit of phage terminase. All members are encoded by phage genomes or within prophage regions of bacterial genomes. This is a distinct family from pfam03354. [Mobile and extrachromosomal element functions, Prophage functions] 0
41206 416417 cl12057 YdfA_immunity SigmaW regulon antibacterial. hypothetical protein; Provisional 0
41207 386259 cl12064 DUF697 Domain of unknown function (DUF697). hypothetical protein; Provisional 0
41208 416421 cl12072 HypD Hydrogenase formation hypA family. HypD is involved in the hyp operon which is needed for the activity of the three hydrogenase isoenzymes in Escherichia coli. HypD is one of the genes needed for formation of these enzymes. This protein has been found in gram-negative and gram-positive bacteria and Archaea. [Protein fate, Protein modification and repair] 0
41209 416422 cl12074 YjgP_YjgQ Predicted permease YjgP/YjgQ family. Members of this family are LptG, one of two homologous, tandem-encoded permease genes of an export ATP transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptF (TIGR04407). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other] 0
41210 416423 cl12076 THUMP THUMP domain, predicted to bind RNA. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterized RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets. 0
41211 416424 cl12077 Methyltrn_RNA_3 Putative RNA methyltransferase. This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces are opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase. 0
41212 416425 cl12078 p450 Cytochrome P450. Members of this subfamily are cytochrome P450 enzymes that occur next to tRNA-dependent cyclodipeptide synthases. This group does NOT include CYP121 (Rv2275) from Mycobacterium tuberculosis, adjacent to the cyclodityrosine synthetase Rv2276. 0
41213 416426 cl12079 DUF373 Domain of unknown function (DUF373). Archaeal domain of unknown function. Predicted to be an integral membrane protein with six transmembrane regions. 0
41214 416427 cl12080 Pcc1 Transcription factor Pcc1. KEOPS complex Pcc1-like subunit; Provisional 0
41215 416428 cl12096 Iron_permease Low affinity iron permease. 0
41216 416429 cl12097 DUF1772 Domain of unknown function (DUF1772). This domain is of unknown function. 0
41217 416430 cl12098 DUF927 Domain of unknown function (DUF927). Family of bacterial proteins of unknown function. The C-terminal half of this family contains a P-loop motif. The N-terminal domain appears to have a unique fold, which contains three Helices and two strands. Structural analyses show that helicases containing this domain form a hexameric ring with a positively charged central pore threading a single DNA strand through suggestive of a replicative function for this helicase. 0
41218 416431 cl12101 CrtC CrtC N-terminal lipocalin domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of family member NE1406 (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication. 0
41219 416432 cl12104 Dehydratase_LU N/A. This family contains the large subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 0
41220 416433 cl12113 HSF_DNA-bind HSF-type DNA-binding. 0
41221 416434 cl12114 HMG14_17 HMG14 and HMG17. 0
41222 416435 cl12115 HTH_Tnp_Tc5 Tc5 transposase DNA-binding domain. 0
41223 416436 cl12116 DUSP DUSP domain. The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. 0
41224 416437 cl12117 JHBP Haemolymph juvenile hormone binding protein (JHBP). The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions. 0
41225 416438 cl12118 LEA_2 Late embryogenesis abundant protein. uncharacterized protein; Provisional 0
41226 416439 cl12124 HK97-gp10_like Bacteriophage HK97-gp10, putative tail-component. This model represents an uncharacterized, highly divergent bacteriophage family. The family includes gp10 from HK022 and HK97. It appears related to TIGR01635, a phage morphogenesis family believed to be involved in tail completion. [Mobile and extrachromosomal element functions, Prophage functions] 0
41227 416440 cl12127 Csy2_I-F CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2. 0
41228 416441 cl12129 DUF932 Domain of unknown function (DUF932). Members of this uncharacterized protein family are found in various Mycobacterium phage genomes, in Streptomyces coelicolor plasmid SCP1, and in bacterial genomes near various markers that suggest lateral gene transfer. The function is unknown. [Mobile and extrachromosomal element functions, Other] 0
41229 416442 cl12130 Bacteriocin_IId Bacteriocin class IId cyclical uberolysin-like. Circular bacteriocins are antibiotic proteins made by ribosomal translation of a precursor molecule, followed by cleavage and circularization. Members of this subclass of the circular bacteriocins include circularin A from Clostridium beijerinckii, bacteriocin AS-48 from Enterococcus faecalis, uberolysin from Streptococcus uberis, and carnocyclin A from Carnobacterium maltaromaticum. The mature circularized peptides average about 70 amino acids in size. [Cellular processes, Toxin production and resistance] 0
41230 416443 cl12133 CbiG_C Cobalamin synthesis G C-terminus. cobalamin biosynthesis protein CbiG; Provisional 0
41231 416444 cl12138 ThylakoidFormat Thylakoid formation protein. Thf1-like protein; Reviewed 0
41232 386285 cl12141 Tweety_N N-terminal domain of the protein encoded by the Drosophila tweety gene and related proteins, a family of chloride ion channels. The tweety (tty) gene has not been characterized at the protein level. However, it is thought to form a membrane protein with five potential membrane-spanning regions. A number of potential functions have been suggested in. 0
41233 353372 cl12219 PRK15003 cytochrome d ubiquinol oxidase subunit II. part of a two component cytochrome D terminal complex. Terminal reaction in the aerobic respiratory chain. [Energy metabolism, Electron transport] 0
41234 416450 cl12235 Phage_int_SAM_1 Phage integrase, N-terminal SAM-like domain. FliZ is involved in the regulation of flagellar assembly and possibly also the down-regulation of the motile phenotype. FliZ interacts with the flagellar translational activator FlhCD complex. 0
41235 353374 cl12236 VirB7 Outer membrane lipoprotein virB7. type IV secretion system lipoprotein VirB7; Provisional 0
41236 416452 cl12246 OrfB_IS605 Probable transposase. This family includes IS891, IS1136 and IS1341. DUF1225, pfam06774, has now been merged into this family. 0
41237 416455 cl12263 CytoC_RC Cytochrome C subunit of the bacterial photosynthetic reaction center. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction centre (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor. 0
41238 416457 cl12283 IPK Inositol polyphosphate kinase. inositol polyphosphate multikinase 0
41239 416473 cl12345 DUF1425 Putative periplasmic lipoprotein. This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown. 0
41240 416479 cl12363 PrgH Type III secretion system protein PrgH-EprH (PrgH). In Salmonella, this gene is part of a four-gene operon PrgHIJK and in general is found in type III secretion operons. PrgH has been shown to be required for secretion, as well as being a structural component of the needle complex. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41241 416485 cl12377 Brr6_like_C_C Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport. 0
41242 416532 cl12494 TM6SF1-like transmembrane 6 superfamily member 1, member 2, and similar proteins. This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins. 0
41243 386428 cl12560 DUF2806 Protein of unknown function (DUF2806). Members of this protein family are conserved hypothetical proteins with a limited species distribution within the Gammaproteobacteria. It is common in the genera Vibrio and Shewanella, and in this resembles the C-terminal domain and putative protein sorting motif TIGR03501. This model, by design, does not extend to all homologs, but rather represents a particular clade. 0
41244 416600 cl12603 YqgB Virulence promoting factor. YqgB encodes adaptive factors that act in synergy with vqfZ, enabling the bacteria to cope with the physical environment in vivo, facilitating colonisation of the host. 0
41245 416617 cl12633 DUF2859 Protein of unknown function (DUF2859). This model describes a protein family exemplified by PFL_4695 of Pseudomonas fluorescens Pf-5. Full-length proteins in this family show some architectural variety, but this model represents a conserved domain. Most or all member proteins belong to laterally transferred chromosomal islands called integrative conjugative elements, or ICE. 0
41246 416638 cl12689 DUF2909 Protein of unknown function (DUF2909). Members of the seed alignment for this family are small (average length 68 residues), strictly bacterial, and extremely hydrophobic. Pfam model PF04588 (HIG_1_N) includes both eukaryotic proteins, including a protein from the fish Gillichthys mirabilis, and the members of this family. Similarity between those eukaryotic proteins and the members of this model may represent convergent evolution related to the similar composition of their transmembrane alpha-helical regions, rather than a common origin or common function. 0
41247 416688 cl12752 EccE Putative type VII ESX secretion system translocon, EccE. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccE1, EccE2, etc. This model represents a conserved core region, and many members have 200 or more additional C-terminal residues. [Protein fate, Protein and peptide secretion and trafficking] 0
41248 416692 cl12757 DUF2993 Protein of unknown function (DUF2993). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 0
41249 416695 cl12761 Chs5_N N-terminal dimerization domain of Chs5 and similar proteins. This domain is found at the N-terminus of fungal chitin biosynthesis protein CHS5. It functions as a dimerization domain. 0
41250 416746 cl12832 DUF3090 Protein of unknown function (DUF3090). The conserved hypothetical protein described here occurs as part of the trio of uncharacterized proteins common in the Actinobacteria. 0
41251 416797 cl12902 DUF3168 Protein of unknown function (DUF3168). This family of proteins has no known function but is likely to be a component of bacteriophage. 0
41252 416815 cl12928 IcmL inner membrane protein IcmL/DotI. IcmL contains two amphipathic beta-sheet regions, required for the pore-forming ability which may be related to the transfer of this protein into a host cell membrane. The icmL gene shows significant similarity to plasmid genes involved in conjugation however IcmL is thought to be required for macrophage killing. It is unknown whether conjugation plays a role in macrophage killing. This is a family of DotI/IcmL proteins of type IVb secretion systems, that reside in the inner-membrane. It carries a single transmembrane helix in the N-terminal conserved region, has an extra-periplasmic domain, and is conserved in all T4BSSs including I-type conjugation systems (TraM). DotI/IcmL (and DotJ) may form an inner membrane complex that associates with the core complex. 0
41253 416837 cl12968 DUF2895 Protein of unknown function (DUF2895). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 0
41254 416842 cl12973 ArsP_2 Putative, 10TM heavy-metal exporter. Most proteins of this family have 8 transmembrane domains with two 4 transmembrane halves separated by a hydrophilic loop of variable sizes. It has been reported that some proteins of this family are involved in arsenate/arsenite resistance. 0
41255 325694 cl13040 SeleniumBinding Selenium binding protein. This model describes a homopentameric selenium-binding protein with a suggested role in selenium transport and delivery to selenophosphate synthase, the SelD protein. This protein family is closely related to pfam01906, but is shorter because of several deleted regions. It is restricted to the archaeal genus Methanococcus. 0
41256 416885 cl13041 CopK Copper resistance protein K. CopK is a periplasmic dimeric protein which is strongly up-regulated in the presence of copper, leading to a high periplasmic accumulation. CopK has two different binding sites for Cu(I), each with a different affinity for the metal. Binding of the first Cu(I) ion induces a conformational change of CopK which involves dissociation of the dimeric apo-protein. Binding of a second Cu(I) further increases the plasticity of the protein. CopK has features that are common with functionally related proteins such as a structure consisting of an all-beta fold and a methionine-rich Cu(I) binding site. 0
41257 416900 cl13066 Omp28 Outer membrane protein Omp28. The Omp28 family of lipoproteins is named for a founding member described in Porphyromonas gingivalis, where it has been shown across many strains to be an expressed surface antigen. All members of the family are predicted lipoproteins. 0
41258 416918 cl13107 TSPcc Coiled coil region of thrombospondin. This family of proteins represents the five-stranded coiled-coil domain of cartilage oligomeric matrix protein (COMP). This region has a binding site between two internal rings formed by Leu37 and Thr40. 0
41259 416923 cl13117 DUF4352 Domain of unknown function (DUF4352). Members of this family are putative lipoproteins that fall into the Antigen MPT63/MPB63 (immunoprotective extracellular protein) superfamily. 0
41260 416931 cl13131 rap1_RCT C-terminal domain of RAP1 recruits proteins to telomeres. This family of proteins represents the C-terminal domain of the protein Rap-1, which plays a distinct role in silencing at the silent mating-type loci and telomeres. The Rap-1 C-terminus adopts an all-helical fold. Rap1 carries out its function by recruiting the Sir3 and Sir4 proteins to chromatin via its C terminal domain. Rap1 is otherwise known as TRF2-interacting protein, as it is one of the six subunit components of the Shelterin complex. Shelterin protects telomere ends from attack by DNA-repair mechanisms. Model doesn't capture Sch. pombe as it cuts this sequence into two. 0
41261 416935 cl13137 LcnG-beta Lactococcin G-beta. This HMM was built to improve on Pfam model PF11632, which in version PF11632.8 had a two-member seed. It includes 12 residues of leader peptide and GlyGly cleavage motif (see TIGR01847), and has a shorter but more broadly conserved core peptide region. Characterized member proteins include lactococcin G and enterocin 1071B. 0
41262 416942 cl13152 RLR_C_like C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors, Cereblon (CRBN), and similar protein domains. This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependent dimerization. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity. 0
41263 416963 cl13193 CrtA-like spheroidene monooxygenase and similar proteins. This bacterial family of proteins has no known function. 0
41264 416974 cl13209 CPSF73-100_C Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term. The exact function of this domain is not known. 0
41265 416993 cl13241 Bacteriocin_IIi Aureocin-like type II bacteriocin. Members of this family include leaderless, unmodified class IId bacteriocins such as lacticin Q, BacSp222, and the founding member aureocin A53. 0
41266 416997 cl13247 Candida_ALS_N Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins. 0
41267 417110 cl13432 DUF3487 Protein of unknown function (DUF3487). Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 0
41268 417120 cl13446 Telomerase_RBD Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA. 0
41269 417131 cl13463 FAT-like_CAS_C C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 192 to 218 amino acids in length. This domain is found associated with pfam00018, pfam08824. This domain has a conserved QPP sequence motif. 0
41270 417171 cl13524 DUF3573 Protein of unknown function (DUF3573). LbtU, from Legionella pneumophila, a novel TonB-independent siderophore uptake outer membrane protein from a species that lacks TonB, is the founding member of a class of porins that may be involved generally in siderophore-mediated iron acquisition. 0
41271 417288 cl13718 TryThrA_C Tryptophan-Threonine-rich plasmodium antigen C terminal. tryptophan/threonine-rich antigen superfamily; Provisional 0
41272 417301 cl13749 eIF3G eIF3G domain found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. This domain family is found in eukaryotes, and is approximately 130 amino acids in length. The family is found in association with pfam00076. This family is subunit G of the eukaryotic translation initiation factor 3. Subunit G is required for eIF3 integrity. 0
41273 417311 cl13764 ASH Abnormal spindle-like microcephaly-assoc'd, ASPM-SPD-2-Hydin. TMEM131_like is a family of bacterial, plant and other metazoa transmembrane proteins. Many of the members are multi-pass transmembrane proteins. 0
41274 417412 cl13934 Inhibitor_I10 Serine endopeptidase inhibitors. Members of the microviridin/marinostatin are ribosomally translated peptides whose post-translational processing converts them into tricyclic depsipeptides that serve as serine proteinase inhibitors. A single precursor usually has one core peptide region near the C-terminus, with a nearly invariant TxKYPSD motif, but may instead have two or three repeats of the core region. 0
41275 417445 cl13983 DUF3774 Wound-induced protein. hypothetical protein; Provisional 0
41276 276033 cl13994 DUF326 Cysteine-rich 4 helical bundle widely conserved in bacteria. Members of this family average about 150 amino acids in length, beginning with a twin-arginine translocation signal sequence, then a His-rich spacer region, followed by a ~105-residue region in which thirteen positions are nearly invariant Cys residues. CDD (Conserved Domain Database) assigns members of this family to clan cl13994, the DUF326 superfamily, based on homology to PA2107 from Pseudomonas aeruginosa. PA2107 is a cysteine-rich four helical bundle protein, with solved structure PDB:3KAW. 0
41277 417454 cl13995 MPP_superfamily metallophosphatase superfamily, metallophosphatase domain. Members of this family are part of the Calcineurin-like phosphoesterase superfamily. 0
41278 417455 cl13996 MPN Mpr1p, Pad1p N-terminal (MPN) domains. These are metalloenzymes that function as the ubiquitin isopeptidase/deubiquitinase in the ubiquitin-based signaling and protein turnover pathways in eukaryotes. Prokaryotic JAB domains are predicted to have a similar role in their cognates of the ubiquitin modification pathway. The domain is widely found in bacteria, archaea and phages where they are present in several gene contexts in addition to those that correspond to the prokaryotic cognates of the eukaryotic Ub pathway. Other contexts in which JAB domains are present include gene neighbor associations with ubiquitin fold domains in cysteine and siderophore biosynthesis, and phage tail morphogenesis, where they are shown or predicted to process the associated ubiquitin. A distinct family, the RadC-like JAB domains are widespread in bacteria and are predicted to function as nucleases. In halophilic archaea the JAB domain shows strong gene-neighborhood associations with a nucleotidyltransferase suggesting a role in nucleotide metabolism. 0
41279 417456 cl13999 rhv_like N/A. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur. 0
41280 417457 cl14009 DUF5837 Family of unknown function (DUF5837). This model represents a conserved N-terminal region shared by microcyclamide and patellamide bacteriocin precursors. These bacteriocin precursors are associated with heterocyclization. Related precursors are found in family TIGR04446. 0
41281 417458 cl14014 Sec-ASP3 Accessory Sec secretory system ASP3. This protein is designated Asp3 because, along with SecY2, SecA2, and other proteins it is part of the accessory Sec system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41282 417459 cl14015 Asp2 Accessory Sec system GspB-transporter. This protein is designated Asp2 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41283 417460 cl14019 Prok-E2_D Prokaryotic E2 family D. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein B. 0
41284 417461 cl14020 Prok_Ub Prokaryotic Ubiquitin. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein C. 0
41285 417462 cl14023 DUF4400 Domain of unknown function (DUF4400). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 0
41286 417463 cl14026 BCD Beta-carotene 15,15'-dioxygenase. This integral membrane protein family includes Brp (bacterio-opsin related protein) and Blh (Brp-like protein). Bacteriorhodopsin is a light-driven proton pump with a covalently bound retinal cofactor that appears to be derived from beta-carotene. Blh has been shown to cleave beta-carotene to produce two all-trans retinal molecules. Mammalian enzymes with similar enzymatic function are not multiple membrane spanning proteins and are not homologous. 0
41287 417464 cl14057 BPL_LplA_LipB biotin-lipoate ligase family. This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor. 0
41288 417465 cl14058 lectin_L-type legume lectins. Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first recognized members of a family of animal lectins similar (19-24%) to the leguminous plant lectins. The alignment for this family aligns residues lying towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. However, while Fiedler and Simons identified these proteins as a new family of animal lectins, our alignment also includes yeast sequences. ERGIC-53 is a 53kD protein, localized to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin. Its dysfunction has been associated with combined factors V and VIII deficiency OMIM:227300 OMIM:601567, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway. 0
41289 417466 cl14106 RIFIN Rifin. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits. 0
41290 246618 cl14192 Phage_T7_Capsid Phage T7 capsid assembly protein. capsid assembly protein 0
41291 353808 cl14340 B277 Family of unknown function. hypothetical protein 0
41292 417467 cl14348 T4-gp15_tss T4-like virus Myoviridae tail sheath stabilizer. tail sheath stabilizer and completion protein; Provisional 0
41293 417468 cl14362 DUF5856 Family of unknown function (DUF5856). hypothetical protein; Provisional 0
41294 353809 cl14364 Gp67 Gene product 67. prohead core protein; Provisional 0
41295 353810 cl14502 E7R Viral Protein E7. putative myristoylated protein; Provisional 0
41296 353811 cl14561 An_peroxidase_like Animal heme peroxidases and related proteins. Peroxidasin is a secreted heme peroxidase which is involved in hydrogen peroxide metabolism and peroxidative reactions in the cardiovascular system. The domain co-occurs with extracellular matrix domains and may play a role in the formation of the extracellular matrix. 0
41297 417469 cl14571 Tocopherol_cycl Tocopherol cyclase. tocopherol cyclase 0
41298 417470 cl14578 GrlR T3SS negative regulator,GrlR. negative regulator GrlR; Provisional 0
41299 417471 cl14603 C2 C2 domain. The Dock180/Dock1 and Zizimin proteins are atypical GTP/GDP exchange factors for the small GTPases Rac and Cdc42 and are implicated in cell-migration and phagocytosis. Across all Dock180 proteins, two regions are conserved: C-terminus termed CZH2 or DHR2 (or the Dedicator of cytokinesis) whereas CZH1/DHR1 contain a new family of the C2 domain. 0
41300 417472 cl14605 DUF619-like DUF619 domain of various N-acetylglutamate Kinases and N-acetylglutamate Synthases. This is the C-terminal NAT or N-acetyltransferase domain of bifunctional N-acetylglutamate synthase/kinases. It catalyzes the first two steps in arginine biosynthesis. This domain contains the putative NAGS - N-acetylglutamate synthase - active site. It is found at the C-terminus of Neurospora crassa acetylglutamate synthase - amino-acid acetyltransferase, EC: 2.3.1.1. It is also found C-terminal to the amino acid kinase region (pfam00696) in some fungal acetylglutamate kinase enzymes. It stabilizes the yeast NAGK, N-acetyl-L-glutamate kinase, slows catalysis and modulates feed-back inhibition by arginine. This domain is found to be the N-acetyltransferase (NAT) domain, and it has a typical GCN5-related NAT fold and a site that catalyzes NAG synthesis which is located >25 Angstrom away from the L-arginine binding site in the N-terminal domain pfam00696. 0
41301 417473 cl14606 Reeler_cohesin_like Domains similar to the eukaryotic reeler domain and bacterial cohesins. Domain found in bacteria with undetermined function. Its structure has been determined and is an immunoglobulin-like fold. 0
41302 387361 cl14607 OPT OPT oligopeptide transporter protein. This protein represents a small family of integral membrane proteins from Gram-negative bacteria, a Gram-positive bacteria, and an archaeal species. Members of this family contain 15 to 18 GES predicted transmembrane regions, and this family has extensive homology to a family of yeast tetrapeptide transporters, including isp4 (Schizosaccharomyces pombe) and Opt1 (Candida albicans). EspB, an apparent equivalog from Myxococcus xanthus, shares an operon with a two component system regulatory protein, and is required for the normal timing of sporulation after the aggregation of cells. This is consistent with a role in transporting oligopeptides as signals across the membrane. [Transport and binding proteins, Amino acids, peptides and amines] 0
41303 417474 cl14608 P53 P53 DNA-binding domain. Members of this family of DNA-binding domains are found in the transcription factor CEP-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology. 0
41304 417475 cl14615 PI-PLCc_GDPD_SF Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily. PI-PLC-C1 is a family of calcium 2+-dependent phosphatidylinositol-specific phospholipase C1 enzymes from bacteria and fungi. The enzyme classification number is EC:3.1.4.11. This enzyme is involved in part of the myo-inositol phosphate metabolic pathway. 0
41305 417477 cl14631 Cdt1_c The C-terminal fold of replication licensing factor Cdt1 is essential for Cdt1 activity and directly interacts with MCM2-7 helicase. This is the C-terminal domain of DNA replication factor Cdt1. This domain binds the MCM complex. 0
41306 417478 cl14632 VOC vicinal oxygen chelate (VOC) family. This domain is one of two barrel-shaped regions that together form the active enzyme, 4-hydroxyphenylpyruvic acid dioxygenase, EC:1.13.11.27. As can be deduced from the disposition of the various Glyoxalase families, _2, _3 and _4 in Pfam, pfam00903, pfam12681, pfam13468, pfam13669, these two regions are similar enough to be indicative of a gene-duplication event. At the individual sequence level slight differences in conformation have given rise to slightly different functions. In the case of UniProt:P80064, 4-hydroxyphenylpyruvic acid dioxygenase catalyzes the formation of homogentisate from 4-hydroxyphenylpyruvate, and the pyruvate part of the HPPD substrate (4-hydroxyphenylpyruvate), derived from L-tyrosine, and the O2 molecule occupy the three free coordination sites of the catalytic iron atom in the C-terminal domain. In plants and photosynthetic bacteria, the tyrosine degradation pathway is crucial because homogentisate, a tyrosine degradation product, is a precursor for the biosynthesis of photosynthetic pigments, such as quinones or tocopherols. 0
41307 417479 cl14633 DD Death Domain Superfamily of protein-protein interaction domains. In the probable ATP-dependent RNA helicase DDX58 this CARD domain is found near the N-terminus and interacts with the C-terminal domain. 0
41308 417480 cl14643 SRPBCC START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily. This domain is found on aromatic hydroxylating enzymes such as 2-oxo-1,2-dihydroquinoline 8-monooxygenase from Pseudomonas putida and carbazole 1,9a-dioxygenase from Janthinobacterium. These enzymes are homotrimers and are distantly related to the typical oxygenase. This domain is found C terminal to the Rieske domain which binds an iron-sulphur cluster. 0
41309 417481 cl14647 GH43_62_32_68_117_130 Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. The glycosyl hydrolase family 43 contains members that are arabinanases. Arabinanases hydrolyze the alpha-1,5-linked L-arabinofuranoside backbone of plant cell wall arabinans. The structure of arabinanase Arb43A from Cellvibrio japonicus reveals a five-bladed beta-propeller fold. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 0
41310 417482 cl14648 Aldose_epim aldose 1-epimerase superfamily. Members of this protein family act as galactose mutarotase (D-galactose 1-epimerase) and participate in the Leloir pathway for galactose/glucose interconversion. All members of the seed alignment for this model are found in gene clusters with other enzymes of the Leloir pathway. This enzyme family belongs to the aldose 1-epimerase family, described by pfam01263. However, the enzyme described as aldose 1-epimerase itself (EC 5.1.3.3) is called broadly specific for D-glucose, L-arabinose, D-xylose, D-galactose, maltose and lactose. The restricted genome context for genes in this family suggests members should act primarily on D-galactose. 0
41311 417483 cl14649 BRO1_Alix_like Protein-interacting Bro1-like domain of mammalian Alix and related domains. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1. 0
41312 417484 cl14651 RNA_pol_Rpb6 RNA polymerase Rpb6. DNA-directed RNA polymerase subunit omega; Reviewed 0
41313 417485 cl14653 KdgT 2-keto-3-deoxygluconate permease. This family includes the characterized 2-Keto-3-Deoxygluconate transporters from Bacillus subtilis and Erwinia chrysanthemi. There are homologs of this protein found in both gram-positive and gram-negative bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
41314 353824 cl14654 V_Alix_like Protein-interacting V-domain of mammalian Alix and related domains. This domain family is comprised of uncharacterized plant proteins. It belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), (His-Domain) type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_Rim20_Bro1_like superfamily also have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members of the V_Alix_like superfamily also have a proline-rich region (PRR). 0
41315 187418 cl14664 PRK15266 N/A. subtilase cytotoxin subunit B-like protein; Provisional 0
41316 417486 cl14670 CP12 CP12 domain. CP12 gene family protein; Provisional 0
41317 417487 cl14673 DUF1266 Protein of unknown function (DUF1266). hypothetical protein; Provisional 0
41318 417488 cl14674 DctA-YdbH Dicarboxylate transport. In certain bacterial families this protein is expressed from the ydbH gene, and there is a suggestion that this is a form of DctA or dicarboxylate transport protein. Dicarboxylate transport proteins are found in aerobic bacteria which grow on succinate or other C4-dicarboxylates. 0
41319 417489 cl14675 PorP_SprF Type IX secretion system membrane protein PorP/SprF. This model describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that have type IX secretion systems (T9SS) and exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility. 0
41320 387379 cl14676 PgaD PgaD-like protein. Members of this protein family are PgaD, essential to the production of poly-beta-1,6-N-acetyl-D-glucosamine (PGA). This cytoplasmic membrane protein appears to be an auxiliary subunit to the PGA synthase, PgaC (TIGR03937). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
41321 387380 cl14695 C166 Family of unknown function. hypothetical protein; Provisional 0
41322 301338 cl14701 SopD Salmonella outer protein D. SopD is a type III virulence effector protein whose structure consists of 38% alpha-helix and 26% beta-strand. 0
41323 353831 cl14716 UL16 Viral unique long protein 16. tegument protein UL16; Provisional 0
41324 417491 cl14728 DUF4922 Domain of unknown function (DUF4922). GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional 0
41325 387383 cl14744 small_mem_YnhF YnhF family membrane protein; Validated. Members of this protein family are small membrane proteins, about 29 amino acids in length. YnhF from E. coli was shown to have an intact fMet residue at the N-terminus and to be chloroform-soluble. The previously generated narrow cluster PRK14756 includes some members of this family. 0
41326 417492 cl14758 T3SS_basalb_I Type III secretion basal body protein I, YscI, HrpB, PscI. T3SS_basalb_I represents a family of Gram-negative type III secretion basal body proteins I. It is the inner rod protein of the secreted needle. YscI is suggested to form a rod that allows substrate passage across the inner membrane of the needle protein YscF through it. 0
41327 301342 cl14772 PDU_like Putative propanediol utilisation. Members of this family are PduM, a protein essential for forming functional microcompartments in which a trimeric B12-dependent enzyme acts as a dehydratase for 1,2-propanediol (Salmonella enterica) or glycerol (Lactobacillus reuteri). 0
41328 417493 cl14778 DnaJ-X X-domain of DnaJ-containing. RESA-like protein; Provisional 0
41329 417494 cl14782 RNase_H_like Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8. This domain is found in plants and appears to be part of a retrotransposon. 0
41330 417495 cl14783 DOMON_like Domon-like ligand-binding domains. CBM9_2 is a family of putative endoxylanase-like proteins that belong to the Carbohydrate-binding family 9. 0
41331 417496 cl14785 FMT_C_like Carboxy-terminal domain of Formyltransferase and similar domains. Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. 0
41332 276063 cl14805 Csx14_I-U CRISPR/Cas system-associated protein Csx14. This model describes a CRISPR-associated (cas) protein unique to the Dpsyc subtype (named for Desulfotalea psychrophila), a variant type I-C subtype, although not universal to that subtype. Members of this family occur in CRISPR loci of Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, Rhodospirillum centenum SW, Planctomyces limnophilus DSM 3776, and Methylosinus trichosporium OB3b. 0
41333 417497 cl14807 ACE1-Sec16-like Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure. 0
41334 417498 cl14813 GluZincin Gluzincin Peptidase family (thermolysin-like proteinases, TLPs) which includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins). Glycyl aminopeptidase is an unusual peptidase in that it has a preference for substrates with an N-terminal glycine or alanine. These proteins are found in Bacteria and in Archaea. 0
41335 417499 cl14817 DUF1858 Domain of unknown function (DUF1858). Members of this protein family resemble the domain of unknown function DUF1858 described by pfam08984, but all members contain an apparent redox-active disulfide. In at least one member protein, a cysteine in the CXXC motif is substituted by a selenocysteine. Most member proteins consist of this domain only, but a few members are fused to or adjacent to members of the hybrid-cluster (prismane) family or the nitrite/sulfite reductase family. [Energy metabolism, Electron transport] 0
41336 417500 cl14828 Lant_dehydr_C Lantibiotic biosynthesis dehydratase C-term. This domain occurs within longer proteins that contain lantibiotic dehydratase domains (see pfam04737 and pfam04738), and as single-domain proteins in bacteriocin biosynthesis genomic contexts. Three named genes in this family, SioK in Streptomyces sioyaensis, TsrD in Streptomyces laurentii, and NosD in Streptomyces actuosus, all occur in regions associated with thiopeptide biosynthesis. [Cellular processes, Toxin production and resistance] 0
41337 301355 cl14830 Mersacidin Two-component Enterococcus faecalis cytolysin (EFC). This model recognizes a number of type 2 lantibiotic-type bacteriocins, related to but distinct from the family that includes lichenicidin and mersacidin. Sequence similarity among members consists largely of a 20-residue block of conserved sequence that covers most of the leader peptide region, absent from the mature lantibiotic. This is followed by a region with characteristic composition for lantibiotic precursor regions, rich in Ser and Thr and including a near-invariant Cys near or at the C-terminus, involved in cyclization. Members of this family typically are shorter than 70 amino acids. [Cellular processes, Toxin production and resistance] 0
41338 417501 cl14834 TSCPD TSCPD domain. This model describes a family of conserved hypothetical proteins of small size, typically ~85 residues, with four invariant Cys residues. This small protein is distantly homologous to a C-terminal domain found in proteins identified by N-terminal homology as ribonucleotide reductases. The rare and sporadic distribution of this protein family falls mostly within the subset of bacterial genomes containing the uncharacterized radical SAM protein modeled by TIGR03904. [Unknown function, General] 0
41339 417502 cl14836 DUF4130 Domain of unknown function (DUF4130). This model represents a conserved hypothetical protein that almost invariably pairs with an uncharacterized radical SAM protein. The pair occurs in about twenty percent of completed prokaryotic genomes. About forty percent of the members of this family occur as fusion proteins, where the C-terminal domain belongs to the uracil-DNA glycosylase family, a DNA repair family (because uracil in DNA is deamidated cytosine). The linkage by gene clustering and correlated species distribution to a radical SAM protein, and by gene fusion to a DNA repair protein family, suggests a role in DNA modification and/or repair. 0
41340 417503 cl14844 DUF3817 Domain of unknown function (DUF3817). This model describes a strictly bacterial integral membrane domain of about 85 residues in length. It occurs in proteins that on rare occasions are fused to transporter domains such as the major facilitator superfamily domain. Of three invariant residues, two occur as a His-Gly dipeptide in the middle of three predicted transmembrane helices. [Unknown function, General] 0
41341 417504 cl14852 WYL WYL domain. Members of this protein family belong to CRISPR-associated (Cas) gene clusters. The majority of members are Cyanobacterial. 0
41342 417505 cl14855 Caps_synth_CapC Capsule biosynthesis CapC. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsC, covering both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsC binds tightly to PgsB, which has been shown to have poly-gamma-glutamate activity. [Cell envelope, Other] 0
41343 417506 cl14857 SdpA Sporulation delaying protein SdpA. Members of this protein family resemble SdpA (Sporulation Delaying Protein A), a protein associated with production and export of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus, Stigmatella aurantiaca DW4/3-1, Streptomyces sp. ACTE, etc. 0
41344 276089 cl14861 AZL_007950_fam AZL_007950 family protein. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor of the Mb-OB3b type. The model excludes several Pseudomonas proteins whose function is unknown, which likewise are in model TIGR04061, but which diverge toward the C-terminus. 0
41345 417507 cl14869 SPASM Iron-sulfur cluster-binding domain. This domain contains regions binding additional 4Fe4S clusters found in various radical SAM proteins C-terminal to the domain described by model pfam04055. Radical SAM enzymes with this domain tend to be involved in protein modification, including anaerobic sulfatase maturation proteins, a quinohemoprotein amine dehydrogenase biogenesis protein, the Pep1357-cyclizing radical SAM enzyme, and various bacteriocin biosynthesis proteins. The motif CxxCxxxxxCxxxC is nearly invariant for members of this family, although PqqE has a variant form. We name this domain SPASM for Subtilosin, PQQ, Anaerobic Sulfatase, and Mycofactocin. 0
41346 276097 cl14874 Luminal_IRE1_like The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is a kinase receptor that also contains an endoribonuclease domain in the cytoplasmic side. It plays roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its luminal domain and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). IRE1alpha is expressed in all cells and tissues while IRE1beta is found only in intestinal epithelial cells. 0
41347 417508 cl14876 Zinc_peptidase_like Zinc peptidases M18, M20, M28, and M42. This domain consists of 4 beta strands and two alpha helices which make up the dimerization surface of members of the M20 family of peptidases. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases. 0
41348 417509 cl14879 LabA_like_C C-terminal domain of LabA_like proteins. A predicted RNA-binding domain found in insect Oskar and vertebrate TDRD5/TDRD7 proteins that nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. The domain adopts the winged helix-turn-helix fold and binds RNA with a potential specificity for dsRNA. In eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid-interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized RNase domain belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains. 0
41349 417510 cl14880 CBM6-CBM35-CBM36_like Carbohydrate Binding Module 6 (CBM6) and CBM35_like superfamily. CBM_26 is a family of bacterial carbohydrate-binding modules frequently found at the C-terminus of enzymes. The combination is not unusual as the CBMs function to bring the relevant polysaccharide into close proximity to the active site. 0
41350 417512 cl14897 HcyBio Homocysteine biosynthesis enzyme, sulfur-incorporation. This presumed domain is about 360 residues long. The function of this domain is unknown. It is found in some proteins that have two C-terminal CBS pfam00571 domains. There are also proteins that contain two inserted Fe4S domains near the C-terminal end of the domain. The Methanothermobacter thermautotrophicus gene MTH_855 product has been misannotated as an inosine monophosphate dehydrogenase based on the similarity to the CBS domains. Based on genetic analyses in the methanogen Methanosarcina acetivorans, this family is a key component of the metabolic network for sulfide assimilation and trafficking in methanogens. It is essential to a novel, O-acetylhomoserine sulfhydrylase-independent pathway for homocysteine biosynthesis, and may catalyze sulfur incorporation into the side chain of an as yet unidentified amino acid precursor. The DUF39-CBS and DUF39-ferredoxin architectures repeatedly occur together in the genomes of methanogenic Archaea, suggesting they may be of diverged function. This is consistent with a phylogenetic reconstruction of the DUF39 family, which clearly distinguishes the CBS-associated and ferredoxin-associated DUF39s. 0
41351 417513 cl14898 DUF1175 Protein of unknown function (DUF1175). This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown. 0
41352 417514 cl14901 DDE_Tnp_Tn3 Tn3 transposase DDE domain. This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from E. coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases. 0
41353 417515 cl14905 DUF1091 Protein of unknown function (DUF1091). This is a family of uncharacterized proteins. Based on its distant similarity to pfam02221 and conserved pattern of cysteine residues it is possible that these domains are also lipid binding. 0
41354 417516 cl14906 AKAP_110 A-kinase anchor protein 110 kDa (AKAP 110). This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction. 0
41355 301371 cl14909 GspL_C GspL periplasmic domain. GspL-like protein; Provisional 0
41356 417540 cl15003 TcpC_C C-terminal domain of conjugative transposon protein TcpC. This family of proteins are annotated as conjugative transposon protein TcpC. The transfer clostridial plasmid (tcp) locus is part of some conjugative antibiotic resistance and virulence plasmids. TcpC was one of five genes whose products had low-level sequence identity to Tn916 proteins, having similarity to ORF13 homologs from Tn916, Tn5397, and CW459tet. This family of proteins is found in bacteria. Proteins in this family are typically between 302 and 351 amino acids in length. 0
41357 417587 cl15079 PDS5 Sister chromatid cohesion protein PDS5. This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble. 0
41358 417599 cl15102 CLU-central An uncharacterized central domain of CLU mitochondrial proteins. Translation initiation factor eIF3 is a multi-subunit protein complex required for initiation of protein biosynthesis in eukaryotic cells. The complex promotes ribosome dissociation, the binding of the initiator methionyl-tRNA to the 40 S ribosomal subunit, and mRNA recruitment to the ribosome. The protein product from TIF31 genes in yeast is p135 which associates with the eIF3 but does not seem to be necessary for protein translation initiation. 0
41359 417633 cl15166 RRP7_like RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A (Rrp7A), and similar proteins. RRP7 is an essential protein in yeast that is involved in pre-rRNA processing and ribosome assembly. It is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. 0
41360 417677 cl15232 BACON Bacteroidetes-Associated Carbohydrate-binding (putative) Often N-terminal (BACON) domain. This family represents a distinct class of BACON domains found in crAss-like phages, the most common viral family in the human gut, in which they are found in tail fiber genes. This suggests they may play a role in phage-host interactions. 0
41361 417680 cl15236 PliI_like Periplasmic lysozyme inhibitor, I-type (PliI) and similar proteins. Aeromonas hydrophila PliI is a dimeric periplasmic protein that enables bacteria to resist permeabilization of the outer membrane by the bactericidal action of lysozyme. PliI may be a direct inhibitor of lysozyme that inserts a conserved loop into the active site of type I (invertebrate) lysozymes. 0
41362 417681 cl15237 Deltex_C Domain found at the C-terminus of deltex-like. This is the C-terminal domain found in members of the Deltex family of proteins which comprises five members (DTX1, 2, 3, 4, and 3L). This conserved C-terminal region of about 150 residues of the Deltex family is preceded by a RING E3 ligase domain in four of the members. Crystal structure of the Deltex C-terminal (DTC) domain reveals a fold composed of a central beta-sheet lined with two long parallel alpha-helices. 0
41363 417682 cl15239 PLDc_SF Catalytic domain of phospholipase D superfamily proteins. TrmB is an alpha-glucoside sensing transcriptional regulator. The protein is the transcriptional repressor for the gene cluster encoding trehalose/maltose ABC transporter in T.litoralis and P.furiosus. TrmB has lost its DNA binding domain but retained its sugar recognition site. A nonreducing glucosyl residue is shared by all substrates bound to TrmB which suggests that it is a common recognition motif. 0
41364 197448 cl15240 Reelin_subrepeat_like Tandem repeat subunit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 0
41365 417683 cl15242 BfiI_C_EcoRII_N_B3 DNA binding domains of BfiI, EcoRII and plant B3 proteins. The N-terminal effector-binding domain of the Restriction Endonuclease EcoRII has a DNA recognition fold, allowing for binding to 5'-CCWGG sequences. It assumes a structure composed of an eight-stranded beta-sheet with the strands in the order of b2, b5, b4, b3, b7, b6, b1 and b8. They are mostly antiparallel to each other except that b3 is parallel to b7. Alternatively, it may also be viewed as consisting of two mini beta-sheets of four antiparallel beta-strands, sheet I from beta-strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed beta-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not. 0
41366 417684 cl15243 HemeO-like heme oxygenase. The CADD, Chlamydia protein associating with death domains, crystal structure reveals a dimer of seven-helical bundles. Each bundle contains a di-iron centre adjacent to an internal cavity that forms an active site similar to that of methane mono-oxygenase hydrolase. 0
41367 417685 cl15254 UBAN polyubiquitin binding domain of NEMO and related proteins. CC2-LZ is a leucine-zipper domain associated with the CC2 coiled-coil region of NF-kappa-B essential modulator, NEMO. It plays a regulatory role, along with the very C-terminal zinc-finger; it contains a ubiquitin-binding domain (UBD) and represents one region that contributes to NEMO oligomerization. NEMO itself is an integral part of the IkappaB kinase complex and serves as a molecular switch via which the NF-kappaB signalling pathway is regulated. 0
41368 417686 cl15255 SH2 Src homology 2 (SH2) domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the C-terminal SH2 domain. 0
41369 417687 cl15257 GIY-YIG_SF GIY-YIG nuclease domain superfamily. This domain was identified by Iyer and colleagues. 0
41370 417688 cl15262 PUB PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins. The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as a AAA ATPase binding domain. This domain is also found on other proteins linked to the ubiquitin-proteasome system. 0
41371 417689 cl15265 YjbR YjbR. YjbR has a CyaY-like fold. 0
41372 387591 cl15268 V4R V4R domain. This model represents the component of bacteriochlorophyll synthetase responsible for reduction of the B-ring pendant ethylene (4-vinyl) group. It appears that this step must precede the reduction of ring D, at least by the "dark" protochlorophyllide reductase enzymes BchN, BchB and BchL. This family appears to be present in photosynthetic bacteria except for the cyanobacterial clade. Cyanobacteria must use a non-orthologous gene to carry out this required step for the biosynthesis of both bacteriochlorophyll and chlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll] 0
41373 417690 cl15270 FinO_conjug_rep N/A. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. This family includes several bacterial fertility inhibition (FINO) proteins. The conjugative transfer of F-like plasmids is repressed by FinO, an RNA binding protein. FinO interacts with the F-plasmid encoded traJ mRNA and its antisense RNA, FinP, stabilizing FinP against endonucleolytic degradation and facilitating sense-antisense RNA recognition. ProQ operates as an RNA-chaperone, binding RNA and bringing about both RNA strand-exchange and RNA duplexing. This suggests that in fact it does not regulate ProP transcription but rather regulates ProP translation through activity as an RNA-binding protein. 0
41374 353932 cl15276 Phage_GPA Bacteriophage replication gene A protein (GPA). DNA replication initiation protein gpA 0
41375 417691 cl15278 TSP_1 Thrombospondin type 1 domain. Type 1 repeats in thrombospondin-1 bind and activate TGF-beta. 0
41376 387594 cl15288 DUF2378 Protein of unknown function (DUF2378). This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622. Members are about 200 amino acids in length. No other homologs are known; the function is unknown. 0
41377 387595 cl15289 DUF2380 Predicted lipoprotein of unknown function (DUF2380). This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown. 0
41378 417692 cl15307 TPKR_C2 Tyrosine-protein kinase receptor C2 Ig-like domain. In the tyrosine-protein kinase receptor NTRK1 this domain interacts with beta-nerve growth factor NGF. 0
41379 417694 cl15347 CBM20 N/A. Novamyl (also known as acarviose transferase, ATase, maltogenic alpha-amylase, glucan 1,4-alpha-maltohydrolase, and AcbD), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Novamyl has a five-domain structure similar to that of cyclodextrin glucanotransferase (CGTase). Novamyl has a substrate-binding surface with an open groove which can accommodate both cyclodextrins and linear substrates. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch. 0
41380 417695 cl15354 CBS_pair_SF Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains superfamily. CBS domains are small intracellular modules that pair together to form a stable globular domain. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP. 0
41381 417696 cl15368 RNase_Ire1_like RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L. This domain is an endoribonuclease. Specifically it cleaves an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated. 0
41382 417697 cl15371 NIF3 NIF3 (NGG1p interacting factor 3). The characterization of this family of uncharacterized proteins as orthologous is tentative. Members are found in all three domains of life. Several members (from Bacillus subtilis, Listeria monocytogenes, and Mycobacterium tuberculosis - all classified as Firmicutes within the Eubacteria) share a long insert relative to other members. [Unknown function, General] 0
41383 417698 cl15373 PATR Passenger-associated-transport-repeat. This model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797, TIGR01414). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. pfam05594 is based on a longer, much more poorly conserved multiple sequence alignment and hits some of the same proteins as this model with some overlap between the hit regions of the two models. It describes these repeats as likely to have a beta-helical structure. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41384 417699 cl15383 IDH Monomeric isocitrate dehydrogenase. The monomeric type of isocitrate dehydrogenase has been found so far in a small number of species, including Azotobacter vinelandii, Corynebacterium glutamicum, Rhodomicrobium vannielii, and Neisseria meningitidis. It is NADP-specific. [Energy metabolism, TCA cycle] 0
41385 417700 cl15384 DUF5131 Protein of unknown function (DUF5131). Members of this family are the upstream member (A) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown. 0
41386 417701 cl15385 MTTB Trimethylamine methyltransferase (MTTB). This model represents a distinct subfamily of pfam06253. All members here are trimethylamine:corrinoid methyltransferases that contain a critical pyrrolysine residue incorporated during translation via a special tRNA for a TAG (amber) codon. Known members so far are from the genus Methanosarcina. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with dimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates trimethylamine, leaving dimethylamine, and methylates the prosthetic group of its small cognate corrinoid protein, MttC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence. 0
41387 417702 cl15397 DUF89 Protein of unknown function DUF89. This family has no known function. 0
41388 301612 cl15401 12TM_1 Membrane protein of 12 TMs. This family carries twelve transmembrane regions. It does not have any characteristic nucleotide-binding-domains of the GxSGSGKST type, so it may not be an ATP-binding cassette transporter. However, it may well be a transporter of some description. ABC transporters always have two nucleotide binding domains; this has two unusual conserved sequence-motifs: 'KDhKxhhR' and 'LxxLP'. 0
41389 417703 cl15406 DUF2088 Domain of unknown function (DUF2088). LarA from Lactobacillus plantarum is a nickel-dependent lactate racemase and the founding member of a family of isomerases that depend on a nicotinic acid-derived nickel pincer cofactor. While it is not yet clear which homologs of LarA act preferentially on lactate, this model identifies one clade of architecturally similar proteins from among a broader set of LarA homologs. Note that the crystal structure 4NAR, on deposit at PDB but not associated with any publication, represents a protein from Thermotoga maritima that falls outside the scope of this family and that is annotated in PDB as a putative uronate isomerase. 0
41390 417704 cl15407 DUF1614 Protein of unknown function (DUF1614). This is a family of sequences coming from hypothetical proteins found in both bacterial and archaeal species. 0
41391 417705 cl15411 SpecificRecomb Site-specific recombinase. Members of this family of bacterial proteins are found in various putative site-specific recombinase transmembrane proteins. 0
41392 417706 cl15413 AAA_assoc_C C-terminal AAA-associated domain. This had been thought to be an ATPase domain of ABC-transporter proteins. However, only one member has any trans-membrane regions. It is associated with an upstream ATP-binding cassette family, pfam00005. 0
41393 417707 cl15414 V-ATPase_C Subunit C of vacuolar H+-ATPase (V-ATPase). This family contains subunit C of vacuolar H+-ATPase (V-ATPase), a protein that plays a crucial role in the vacuolar system of eukaryotic cells. The main function of V-ATPase is to generate a proton-motive force at the expense of ATP and to cause limited acidification in the internal space (lumen) of several organelles of the vacuolar system. V-ATPases are multi-subunit protein complexes made up of two distinct structures: a peripheral catalytic sector (V1) and a hydrophobic membrane sector (V0) responsible for driving protons; subunit C is one of five polypeptides composing V1. The key function of the C subunit is intimately involved in the reversible dissociation of the V1 and V0 structures. It has also been identified as a mediator of the acidic microenvironment of tumors which it controls by proton extrusion to the extracellular medium. The acidic environment causes tissue damage, activates destructive enzymes in the extracellular matrix, and acquires metastatic cell phenotypes. 0
41394 417708 cl15415 Sec1 Sec1 family. 0
41395 417709 cl15422 DUF3419 Protein of unknown function (DUF3419). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 and 802 amino acids in length. 0
41396 417710 cl15424 PEP_hydrolase Phosphoenolpyruvate hydrolase-like. This domain has a TIM barrel fold related to IGPS and to phosphoenolpyruvate mutase/aldolase/carboxylase. 0
41397 417711 cl15430 Nucleoside_tran Nucleoside transporter. This is a family of proteins from the CLN3 gene. A mis-sense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Batten disease is characterized by the accumulation of autofluorescent material in the lysosomes of most cells. Members of this family are transmembrane proteins functional in pre-vacuolar compartments. The protein in Sch.pombe is found to be localized to the vacuolar membrane, and a lack of functional protein clearly affects the size and pH of the vacuole. Thus the protein is necessary for vacuolar homeostasis. It is important for localization of late endosomal/lysosomal compartments, and it interacts with motor components driving both plus and minus end microtubular trafficking: tubulin, dynactin, dynein and kinesin-2. 0
41398 417712 cl15435 DUF1045 Protein of unknown function (DUF1045). This family of proteins is observed in the vicinity of other characterized genes involved in the catabolism of phosphonates via the C-P lyase system (GenProp0232); its function is unknown. These proteins are members of the somewhat broader pfam06299 model "Protein of unknown function (DUF1045)" which contains proteins found in a different genomic context as well. 0
41399 417713 cl15439 BTG BTG family. The tob/btg1 is a family of proteins that inhibit cell proliferation. 0
41400 387618 cl15442 DUF2381 Protein of unknown function (DUF2381). This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown. 0
41401 417714 cl15454 HrpJ HrpJ-like domain. This protein is found in type III secretion operons and, in Yersinia, is localized to the cell surface and is involved in the Low-Calcium Response (LCR), possibly by sensing the calcium concentration. In Salmonella, the gene is known as InvE and is believed to perform an essential role in the secretion process and interacts with the proteins SipBCD and SicA. Negative regulation of type III secretion in Y pestis is mediated in part by a multiprotein complex that has been proposed to act as a physical impediment to type III secretion by blocking the entrance to the secretion apparatus prior to contact with mammalian cells. This complex is composed of YopN, its heterodimeric secretion chaperone SycN-YscB, and TyeA. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
41402 417715 cl15456 ADAM_CR ADAM cysteine-rich. ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. 0
41403 417716 cl15462 T6SS_TssF Type VI secretion system, TssF. This protein family is associated with type VI secretion in a number of pathogenic bacteria. Mutation is associated with impaired virulence, such as impaired infection of plants by Rhizobium leguminosarum. 0
41404 417717 cl15463 Pup_ligase Pup-ligase protein. This protein family is paralogous to (and distinct from) the PafA (proteasome accessory factor) first described in Mycobacterium tuberculosis (see TIGR03686). Members of both this family and TIGR03686 itself tend to cluster with each other, with the ubiquitin analog Pup (TIGR03687) associated with targeting to the proteasome, and with proteasome subunits themselves. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 0
41405 276111 cl15465 CsaX_III-U CRISPR/Cas system-associated protein CsaX. This family comprises a minor CRISPR-associated protein family. It occurs only in the context of the (strictly archaeal) Apern subtype of CRISPR/Cas system, and is further restricted to the Sulfolobales, including Metallosphaera sedula DSM 5348 and multiple species of the genus Sulfolobus. 0
41406 417718 cl15473 NA37 37-kD nucleoid-associated bacterial protein. nucleoid-associated protein NdpA; Validated 0
41407 417719 cl15483 Dymeclin Dyggve-Melchior-Clausen syndrome protein. Hid1 (high-temperature-induced dauer-formation protein 1) represents proteins of approximately 800 residues long and is conserved from fungi to humans. It contains up to seven potential transmembrane domains separated by regions of low complexity. Functionally it might be involved in vesicle secretion or be an inter-cellular signalling protein or be a novel insulin receptor. 0
41408 417750 cl15674 IPT N/A. The Rel homology domain (RHD) is composed of two structural domains, an N-terminal DNA_binding domain (pfam00554) and a C-terminal dimerization domain. This is the dimerization domain. 0
41409 417751 cl15675 RGL4_N N-terminal catalytic domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. Members of this family are found in both fungi, bacteria and wood-eating arthropods. The domain is found at the N-terminus of rhamnogalacturonase B, a member of the polysaccharide lyase family 4. The domain adopts a structure consisting of a beta super-sandwich, with eighteen strands in two beta-sheets. The three domains of the whole protein rhamnogalacturonan lyase (RGL4), are involved in the degradation of rhamnogalacturonan-I, RG-I, an important pectic plant cell-wall polysaccharide. The active-site residues are a lysine at position 169 in UniProtKB:Q00019 and a histidine at 229, Lys169 is likely to be a proton abstractor, His229 a proton donor in the mechanism. The substrate is a disaccharide, and RGL4, in contrast to other rhamnogalacturonan hydrolases, cleaves the alpha-1,4 linkages of RG-I between Rha and GalUA through a beta-elimination resulting in a double bond in the nonreducing GalUA residue, and is thus classified as a polysaccharide lyase (PL). 0
41410 417753 cl15685 Wzt_C-like C-Terminal domain of O-antigenic polysaccharide transporter protein Wzt and related proteins. This domain is found at the C-terminus of the Wzt protein. The crystal structure of C-Wzt(O9a) reveals a beta sandwich with an immunoglobulin-like topology that contains the O-antigenic polysaccharide binding pocket. This domain is often associated with the ABC-transporter domain. 0
41411 417754 cl15687 RGL4_C C-terminal domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. CBM-like is domain III of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain possesses a jelly roll beta-sandwich fold structurally homologous to carbohydrate binding modules (CBMs), and it carries two sulfate ions and a hexa-coordinated calcium ion. 0
41412 417755 cl15688 anti-TRAP anti-TRAP (AT) protein specific to Bacilli. In Bacillus subtilis and related bacteria, AT binds to the TRAP protein, (tryptophan-activated trp RNA-binding attenuation protein), effectively disrupting interaction of TRAP with mRNAs. Upon binding of tryptophan, TRAP (which forms a complex of 11 identical subunits) interacts with a specific location in the leader RNA and blocks translation of the tryptophan biosynthetic operon. AT, in turn, recognizes the tryptophan-activated TRAP complex and prevents RNA binding. AT is expressed in response to high levels of uncharged tryptophan tRNA. AT contains a zinc-binding motif that closely resembles the zinc-binding motifs in the zinc-finger region of DnaJ/Hsp40. AT has been shown to form homo-dodecameric assemblies, and can actually do that in two different relative orientations, resulting in two different dodecamers. Recent data suggest that the trimeric form of AT may be the biologically relevant active complex. 0
41413 417756 cl15692 CE4_SF Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily. This domain, found in various hypothetical bacterial proteins, has no known function. 0
41414 417757 cl15693 Sema The Sema domain, a protein interacting module, of semaphorins and plexins. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor and plexin-A3. 0
41415 417758 cl15694 Exosortase_EpsH Transmembrane exosortase (Exosortase_EpsH). This model represents the most conserved region of the multitransmembrane protein family of exosortases and archaeosortases. The region includes nearly invariant motifs at the ends of three predicted transmembrane helices on the extracytoplasmic face: a Cys (often Cys-Xaa-Gly), Asn-Xaa-Xaa-Arg, and His. This model is much broader than the bacterial exosortase model (TIGR02602), and has an intended scope similar to (or broader than) pfam09721. 0
41416 417759 cl15697 ADF_gelsolin Actin depolymerization factor/cofilin- and gelsolin-like domains. Severs actin filaments and binds to actin monomers. 0
41417 417760 cl15705 DUF563 Protein of unknown function (DUF563). Family of uncharacterized proteins. 0
41418 417762 cl15731 PGF-CTERM PGF-CTERM motif. This model describes a strictly archaeal putative protein-sorting motif, PGF-CTERM. It is the (predicted) recognition sequence for an exosortase homolog, archaeosortase (TIGR04125). In some archaea, up to fifty proteins have this domain as their C-terminal region, usually preceded by a Thr-rich region likely to be heavily glycosylated. The removal of this sorting signal may be associated with a C-terminal prenyl group modification in the halobacterial major cell surface glycoprotein, an S-layer protein. 0
41419 417763 cl15733 YyzF YyzF-like protein. Members of this protein family occur exclusively in the Firmicutes, in at least 50 different species. Members average about 55 residues in length, and four of the five invariant or nearly invariant residues occur in motifs CxxH and CxxC. The function is unknown. 0
41420 417764 cl15739 Bacteroid_pep Ribosomally synthesized peptide in Bacteroidetes. This model describes a rare family of small putative polypeptides, including three encoded in tandem in Sphingobacterium spiritivorum ATCC 33300, in the vicinity of a TIGR04085 protein. This pairing is conserved in Chryseobacterium gleum ATCC 35910, Kordia algicida OT-1, and other species. TIGR04085 describes a C-terminal additional 4Fe4S-binding domain in PqqE and other radical SAM enzymes that seems to be a marker for peptide modification, and the family modeled here is a candidate modified peptide precursor. 0
41421 417765 cl15749 IPTL-CTERM IPTL-CTERM motif. This model describes a variant form of the PEP-CTERM C-terminal protein-sorting domain, with a consensus motif IPTL replacing the more typical VPEP. A majority of these sequences have a WG (Trp-Gly) motif at positions 7-8 of the domain. Species with multiple (up to 15) copies of this domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3. 0
41422 417766 cl15753 CollagenBindB Repeat unit of collagen-binding protein domain B. GramPos_pilinD3 is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base. 0
41423 417767 cl15755 SAM_superfamily SAM (Sterile alpha motif ). The fungal Ste50p SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerization and heterodimerization (and in some cases oligomerization) of the protein. 0
41424 417769 cl15774 Hemerythrin-like Hemerythrin family. Iteration of the HHE family found it to be related to Hemerythrin. It also demonstrated that what has been described as a single domain in fact consists of two cation binding domains. Members of this family occur all across nature and are involved in a variety of processes. For instance, in Nereis diversicolor hemerythrin binds Cadmium so as to protect the organism from toxicity. However Hemerythrin is classically described as Oxygen-binding through two attached Fe2+ ions. And the bacterial NorA is a regulator of response to NO, which suggests yet another set-up for its metal ligands. In Staphylococcus aureus the iron-sulfur cluster repair protein ScdA has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger. 0
41425 417770 cl15781 K_trans K+ potassium transporter. potassium transporter; Provisional 0
41426 417771 cl15787 SEC14 N/A. This family includes divergent members of the CRAL-TRIO domain family. This family includes ECM25 that contains a divergent CRAL-TRIO domain identified by Gallego and colleagues. 0
41427 417772 cl15796 Phage_GPD Phage late control gene D protein (GPD). tail protein; Provisional 0
41428 387679 cl15806 antisig_RsrA mycothiol system anti-sigma-R factor. This group of anti-sigma factors is associated in an apparent operon with a family of sigma-70 family sigma factors (TIGR02947). They appear, by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria. [Transcription, Transcription factors] 0
41429 417774 cl15816 CheC CheC-like family. CheX is very closely related to the CheC chemotaxis phosphatase, but it dimerizes in a different way, via a continuous beta sheet between the subunits. CheC and CheX both dephosphorylate CheY, although CheC requires binding of CheD to achieve the activity of CheX. The ability of bacteria to modulate their swimming behaviour in the presence of external chemicals (nutrients and repellents) is one of the most rudimentary behavioural responses known, but the individual components are very sensitively tuned. 0
41430 326649 cl15819 MqsA antitoxin MqsA for MqsR toxin. The YokU-like protein family includes the B. subtilis YokU protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved CXXC sequence motifs. This is likely to be a family of bacterial antitoxins, as the sequence bears remote homology to the RelE fold family. 0
41431 417776 cl15824 LPP20 LPP20 lipoprotein. This family contains the LPP20 lipoprotein, which is a non-essential class of lipoprotein. 0
41432 417777 cl15825 YscW Type III secretion system lipoprotein chaperone (YscW). This family of proteins is found within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC (TIGR02516). YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC. 0
41433 417778 cl15827 BKACE beta-keto acid cleavage enzyme. BKACE, beta-keto acid cleavage enzyme, plays a role in lysine degradation. In certain instances it catalyzes the conversion of 3-keto-5-aminohexanoate and acetyl-CoA into acetoacetate and 3-aminobutyryl-CoA. The family is found to have at least 14 slightly different potential new enzymatic activities, all of which can therefore be designated as beta-keto acid cleavage enzymes. 0
41434 387686 cl15828 DUF308 Short repeat of unknown function (DUF308). Family of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic). 0
41435 417779 cl15830 DsbC Disulphide bond corrector protein DsbC. DsbC rearranges incorrect disulphide bonds during oxidative protein folding. It is activated by the N-terminal domain of DsbD, a transmembrane electron transporter. DsbD binds to a DsbC dimer and selectively activates it using electrons from the cytoplasm. 0
41436 417780 cl15834 YbjN Putative bacterial sensory transduction regulator. YbjN is a putative sensory transduction regulator protein found in Proteobacteria. As it is a multi-copy suppressor of the coenzyme A-associated temperature sensitivity in temperature-sensitive mutant strains of Escherichia coli the suggestion is that it both helps CoA-A1 and possibly works as a general stabilizer for some other unstable proteins. This family was expanded to subsume other related families: DUF1790, DUF1821 and DUF2596. 0
41437 417781 cl15838 Phage_GPO Phage capsid scaffolding protein (GPO) serine peptidase. capsid-scaffolding protein; Provisional 0
41438 417782 cl15839 ShK ShK domain-like. ShK toxin domain 0
41439 417783 cl15840 JmjN jmjN domain. To date, this domain always co-occurs with the JmjC domain (although the reverse is not true). 0
41440 417784 cl15841 SelR SelR domain. This model describes a domain found in PilB, a protein important for pilin expression, N-terminal to a domain coextensive with the known peptide methionine sulfoxide reductase (MsrA), a protein repair enzyme, of E. coli. Among the early completed genomes, this module is found if and only if MsrA is also found, whether N-terminal to MsrA (as for Helicobacter pylori), C-terminal (as for Treponema pallidum), or in a separate polypeptide. Although the function of this region is not clear, an auxiliary function to MsrA is suggested. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions] 0
41441 326659 cl15846 Phage_F Capsid protein (F protein). major capsid protein 0
41442 417785 cl15848 ESSS ESSS subunit of NADH:ubiquinone oxidoreductase (complex I). complex I subunit 0
41443 417786 cl15851 BcsB Bacterial cellulose synthase subunit. This family includes bacterial proteins involved in cellulose synthesis. Cellulose synthesis has been identified in several bacteria. In Agrobacterium tumefaciens, for instance, cellulose has a pathogenic role: it allows the bacteria to bind tightly to their host plant cells. While several enzymatic steps are involved in cellulose synthesis, potentially the only step unique to this pathway is that catalyzed by cellulose synthase. This enzyme is a multi subunit complex. This family encodes a subunit that is thought to bind the positive effector cyclic di-GMP. This subunit is found in several different bacterial cellulose synthase enzymes. The first recognized sequence for this subunit is BcsB. In the AcsII cellulose synthase, this subunit and the subunit corresponding to BcsA are found in the same protein. Indeed, this alignment only includes the C-terminal half of the AcsAII synthase, which corresponds to BcsB. 0
41444 387695 cl15855 Flg_bbr_C Flagellar basal body rod FlgEFG protein C-terminal. Members of this protein are FlgF, one of several homologous flagellar basal-body rod proteins in bacteria. [Cellular processes, Chemotaxis and motility] 0
41445 417789 cl15893 MgtC MgtC family. This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 0
41446 387700 cl15935 TIC20 Chloroplast import apparatus Tic20-like. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic20 protein. [Transport and binding proteins, Amino acids, peptides and amines] 0
41447 276137 cl15945 PRK09822 N/A. Members of this family are WaaZ, or Kdo-III transferase. This enzyme, present in some strains of E. coli and its allies but not others, performs a non-stoichiometric addition of a third 3-deoxy-D-manno-oct-2-ulosonic acid (KDO-III) onto some fraction of KDO-II in the lipopolysaccharide (LPS) inner core. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
41448 387731 cl16047 Fur_reg_FbpB Fur-regulated basic protein B. This model describes FbpB (Fur-regulated basic protein B), one of three paralogous small proteins recognized by Pfam model PF13040 in Bacillus subtilis. 0
41449 417904 cl16231 DUF4089 Protein of unknown function (DUF4089). HpxX is a small protein of unknown function, about 60 residues in length, encoded in the set of four genes, hpxWXYZ, that belong to the oxalurate metabolism portion of a complete pathway for hypoxanthine (hpx) utilization, as in Klebsiella pneumoniae. 0
41450 417917 cl16254 PDDEXK_3 PD-(D/E)XK nuclease superfamily. Members of this protein family average about 130 residues in length and include an almost perfectly conserved motif GxxExxY. Members occur in a wide range of prokaryotes, including Proteobacteria, Verrucomicrobia, Cyanobacteria, Bacteroidetes, Archaea, etc. 0
41451 417918 cl16257 Alginate_exp Alginate export. Proteins of this HMM family are primarily identified in sulfate-reducing Desulfovibrio, but this HMM may also hit proteins from other Gram-negative bacteria. Porins of this family form transmembrane pores for the passive transport of small molecules across the outer membranes of Gram-negative bacteria. 0
41452 417923 cl16268 LytR_C LytR cell envelope-related transcriptional attenuator. Cei (cell envelope integrity), as described for the founding member Rv2700 from Mycobacterium tuberculosis, is a transmembrane protein with an extracellular LytR_C domain. It lacks any DNA-binding domain and is not a transcriptional regulator. It shares homology to C-terminal regions present in some members of the LytR-CpsA-Psr family, a family in which some characterized members transfer teichoic acids from carriers to mature peptidoglycan. 0
41453 417930 cl16279 SH3_8 SH3-like domain. The GW domain of Listeria belongs to the clan of SH3-like domains. A similar but broader model (PF13457) occurs in Pfam. The GW domain occurs as repeats on surface proteins of the cell-invading pathogenic bacterium Listeria monocytogenes, and is involved in binding to glycosaminoglycans. Members of this family include the GW-type internalin InlB and several paralogs. 0
41454 417943 cl16298 GH113-like Glycoside hydrolase family 113 beta-mannosidase and similar proteins. This domain is found in the gene transfer agent protein. An unusual system of genetic exchange exists in the purple nonsulfur bacterium Rhodobacter capsulatus. DNA transmission is mediated by a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. The genes involved in this process appear to be found widely in bacteria. According to the SUPERFAMILY database this domain has a TIM barrel fold. 0
41455 417973 cl16352 zf-3CxxC Zinc-binding domain. This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. 0
41456 417986 cl16365 TraF_2 F plasmid transfer operon, TraF, protein. This is a family of unknown function mainly found in bacteria. 0
41457 418019 cl16409 GH31_N N-terminal domain of glycosyl hydrolase family 31 (GH31). This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily. 0
41458 418021 cl16414 DUF4185 Domain of unknown function (DUF4185). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 440 amino acids in length. 0
41459 418048 cl16452 Peptidase_S74_CIMCD Peptidase S74 family, C-terminal intramolecular chaperone domain of Escherichia coli phage K1F endosialidase and related proteins. This is the very C-terminal, chaperone, domain of the bacteriophage protein endosialidase. It releases itself, via the serine-lysine dyad at the N-terminus, from the remainder of the end-tail-spike. Cleavage occurs after the threonine which is the final residue of the End-tail-spike family, pfam12219. The endosialidase protein forms homotrimeric molecules in bacteriophages. The catalytic dyad allows this portion of the molecule to be cleaved from the more N-terminal region such that the latter can fold and presumably bind to DNA. 0
41460 418112 cl16538 DUF4231 Protein of unknown function (DUF4231). The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. The SLATT domain defined here (170 residues long) is similar to the DUF4231 domain (105 residues long) described in Pfam model PF14015. 0
41461 418122 cl16549 DUF4242 Protein of unknown function (DUF4242). Members of the SCO4226 family belong to the larger family of DUF4242 domain-containing proteins, described by Pfam model PF14026. SCO4226 itself was shown to dimerize and bind four nickel atoms per homodimer. 0
41462 418271 cl16759 RAMA Restriction Enzyme Adenine Methylase Associated. This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important. 0
41463 418281 cl16774 AlgX_N_like N-terminal catalytic domain of putative alginate O-acetyltransferase and similar proteins. ALGX is a family found in bacteria. The domain demonstrates catalytic activity similar to that of the SGNH hydrolase-like domain, with the typical Ser-His-Asp triad found in this enzyme. Alginate is an exopolysaccharide that contributes to biofilm formation. ALGX is secreted into the biofilm and is responsible for the acetylation of biofilm polymers that help protect them from host destruction. 0
41464 418317 cl16818 PrcB_C PrcB C-terminal. This domain is found at the C-terminus of Treponema denticola PrcB. PrcB interacts with the PrtP protease (dentilisin) and is required for the stability of the protease complex. 0
41465 418358 cl16881 CdiA-CT_Ec-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli STEC_O31, and similar proteins. This is a bacterial virion of EndoU nuclease. It is found at C-terminal region of polymorphic toxin proteins. 0
41466 418372 cl16901 DUF4425 Uncharacterized protein conserved in Bacteroidetes. A small family of bacterial proteins, found in several Bacteroides species. Structure determination (NMR and X-ray) shows an immunoglobulin beta-barrel fold. Multiple homologs have been found in human gut metagenomics data sets. Structural experimentation shows it to share features with two well-established protein architectures in the SCOP database, i.e., C2 (calcium/lipid-binding domain) of the Pfam PF00168 and PLAT/LH2 (lipase/lipoxygenase domain) of the Pfam PF01477. The C2 and PLAT/LH2 domains bind Ca2+ in their functions of targeting proteins to cell-membranes; this domain is also shown to bind Ca2+ as well as to be a novel fold. 0
41467 418375 cl16905 alpha_DG_C C-terminal domain of alpha dystroglycan. This is the second N-terminal domain found in alpha-Dystroglycan (DG). The murine skeletal muscle N-terminal alpha-DG region, contains two autonomous domains; the first identified as an Ig-like and the second resembling ribosomal RNA-binding proteins. This domain is similar to the small subunit ribosomal protein S6 of Thermus thermophilus (S6 domain). It is suggested that the S6 domain may be of functional relevance for LARGE (like-acetylglucosaminyltransferase) recognition along the alpha-DG maturation pathway. 0
41468 418376 cl16912 MDR Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family. Members of this family are putative quinone oxidoreductases that belong to the broader superfamily (modeled by Pfam pfam00107) of zinc-dependent alcohol (of medium chain length) dehydrogenases and quinone oxidoreductases. The alignment shows no motif of conserved Cys residues as are found in zinc-binding members of the superfamily, and members are likely to be quinone oxidoreductases instead. A member of this family in Homo sapiens, PIG3, is induced by p53 but is otherwise uncharacterized. [Unknown function, Enzymes of unknown specificity] 0
41469 418377 cl16914 O-FucT_like GDP-fucose protein O-fucosyltransferase and related proteins. The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal. 0
41470 418378 cl16915 ZnPC_S1P1 Zinc dependent phospholipase C/S1-P1 nuclease. This domain of unknown function contains several highly conserved histidines. 0
41471 418379 cl16916 ChtBD1 Hevein or type 1 chitin binding domain. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements. 0
41472 418380 cl16919 CRAL_TRIO_N CRAL/TRIO, N-terminal domain. This all-alpha domain is found to the N-terminus of pfam00650. 0
41473 418381 cl16921 eIF2D_N_like N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins. Members of this family are found in a set of hypothetical Archaeal proteins. Their exact function has not, as yet, been defined. 0
41474 418382 cl16934 Axin_TNKS_binding Tankyrase binding N-terminal segment of axin. This is the N-terminal domain tankyrase binding domain of Axin-1. 0
41475 418383 cl16936 SATB1_N N-terminal domain of SATB1 and similar proteins. ULD is an N-terminal oligomerization domain of SATB or special AT-rich sequence-binding proteins. SATBs are global chromatin organizers and regulators of gene expression that are essential for T-cell development, breast cancer tumor growth and metastasis. SATBs assemble into a tetramer via the ULD domain, and the tetramerisation of SATBs is essential for recognising specific DNA sequences (such as multiple AT-rich DNA fragments). Thus, SATBs may regulate gene expression directly by binding to various promoters and upstream regions and thereby influencing promoter activity. 0
41476 418384 cl16937 Ndc10 Ndc10 component of the yeast centromere-binding factor 3. NDC10_II is the second of five domains on the Kluyveromyces lactis Ndc10 protein. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilizes a DNA loop at the centromere. 0
41477 388367 cl16938 ThermoDBP Thermoproteales single-stranded DNA-binding (SSB) domain. This domain is found in the N-terminal of ThermoDBP, a single stranded DNA binding protein found in Thermoproteus tenax. ThermoDBP binds specifically to ssDNA with low sequence specificity. This domain is responsible for ssDNA binding. Conserved motif 'LIYWIRSDR' is located at the C-terminal end of the domain and is thought to participate in ssDNA binding. 0
41478 418385 cl16939 RTT106_N histone chaperone RTT106, regulator of Ty1 transposition protein 106; N-terminal homodimerization domain. This is the N-terminal domain of Rtt106 in Saccharomyces cerevisiae. Rtt106 is a histone chaperone that contributes to the deposition of newly synthesized acetylated Histone 3 Lysine 56 (H3K56ac) carrying H3-H4 complex on replicating DNA. The N-terminal domain of Rtt106 homodimerizes and interacts with H3-H4 independently of acetylation. 0
41479 418386 cl16941 NTP-PPase Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily. This family of short proteins are distantly related to the MazG enzyme. This suggests that these proteins are enzymes that catalyze a related reaction. 0
41480 418387 cl16946 Actino_peptide Ribosomally synthesized peptide in actinomycetes. A ribosomally synthesized peptide related to microviridin and marinostatin, usually in the gene neighborhood of one or more RimK-like ATP-grasp. The gene-context suggests that it is further modified by the ATP-grasp. The peptide is predicted to function in a defensive or developmental role, or as an antibiotic. 0
41481 418388 cl16948 FctA Spy0128-like isopeptide containing domain. This model describes a domain that occurs once in the major pilin of Streptococcus pyogenes, Spy0128, but in higher copy numbers in other streptococcal proteins. The domain occurs nine times in a surface-anchored protein of Bifidobacterium longum. All members of this family have LPXTG-type sortase target sequences. The S. pyogenes major pilin has been shown to undergo isopeptide bond cross-linking, mediated by sortases, that are critical to maintaining pilus structural integrity. One such Lys-to-Asn isopeptide bond is to a near-invariant Asn near the C-terminal end of this domain (column 81 of the seed alignment). A Glu in the S. pyogenes major pilin (column 25 of the seed alignment), invariant as Glu or Gln, is described as catalytic for isopeptide bond formation. 0
41482 276145 cl16949 MAST_ArtA_sort MAST domain. Members of this protein family are exclusive to archaea, probably all of which have S-layer surface protein arrays. All member proteins have an N-terminal signal sequence. The majority of known members belong to codirectional tandem arrays in the genus Methanosarcina (nine in M. barkeri str. Fusaro). Nearly all members have an additional 50 residues, (trimmed from the seed alignment for this model), consisting of low-complexity sequence rich in E,N,Q,T,S, and P, followed by a variant (PAF) form of the PGF-CTERM putative archaeal surface glycoprotein sorting signal. The coined name, sarcinarray family protein, evokes the predicted archaeal surface layer localization, the taxonomic bias of known members, and the tandem organization of most members. 0
41483 418389 cl16968 CFSR Collagen-flanked surface repeat. This model describes a repeat sequence that occurs primarily in LPXTG-anchored Streptococcus surface proteins, although it does occur elsewhere. It can comprise a major fraction of the length of repeat proteins that exceed 2000 in length. 0
41484 418390 cl16979 ser_adhes_Nterm serine-rich repeat adhesion glycoprotein AST domain. Lacb_SerRich_Nt describes a Lactobacillus-restricted N-terminal non-repetitive sequence region shared by proteins with extensive serine-rich repeat regions, all likely to function as adhesins. This region contains a variant form of the KxYKxGKxW motif (see TIGR03715) followed by a region related to serine-rich glycoprotein adhesins of the Streptococci. 0
41485 418391 cl16982 Antigen_C Cell surface antigen C-terminus. This domain has a conserved Lys (position 3 in seed alignment) and Asn at 177 that form an intramolecular isopeptide bond. The Asp (or Glu) at position 59 0
41486 418392 cl17006 VbhA_like VbhA antitoxin and related proteins. VbhT is a bacterial Fic protein of the mammalian pathogen B. schoenbuchensis. It is composed of an N-terminal FIC domain and a C-terminal BID domain. FIC domains are known to catalyse adenylylation (also called AMPylation). This entry represents VbhA, an antitoxin that binds FIC domain (filamentation induced by cyclic AMP) of VbhT and inhibits its activity. It inhibits the adenylylation activity of VbhT by positioning close to the putative ATP-binding site, hence competing with ATP binding. 0
41487 418393 cl17007 COE_DBD Colier/Olf/Early B-cell factor (EBF) DNA Binding Domain. COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain. 0
41488 388374 cl17010 TTHB210-like Hypothetical protein TTHB210, a sigma(E)-regulated gene product found in Thermus thermophilus, and similar proteins. This domain is found in TTHB210 protein present in Thermus thermophilus. TTHB210 is a Sigma-E factor regulated gene product that forms a homodecamer. This domain is chain G and can be classified with chains A, C, E and I based on its folds. 0
41489 418394 cl17011 Arginase_HDAC Arginase-like and histone-like hydrolases. Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyze the removal of the acetyl group. Histone deacetylases are related to other proteins. 0
41490 418395 cl17012 GINS_A Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3. The eukaryotic GINS complex is essential for the initiation and elongation phases of DNA replication. It consists of four paralogous protein subunits (Sld5, Psf1, Psf2 and Psf3), all of which are included in this family. The GINS complex is conserved from yeast to humans, and has been shown in human to bind directly to DNA primase. 0
41491 418396 cl17013 W2 C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon. This domain of unknown function is found at the C-terminus of several translation initiation factors. 0
41492 418397 cl17014 eIF-5_eIF-2B Domain found in IF2B/IF5. translation initiation factor IF-2 subunit beta; Validated 0
41493 354298 cl17015 HRI1_like Tandem repeat domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This C-terminal repeat is missing several strands and forms an incomplete barrel. 0
41494 418398 cl17018 FANC Fanconi anemia ID complex proteins FANCI and FANCD2. The Fanconi anaemia protein FancD2 is a nuclease necessary for the repair of DNA interstrand-crosslinks. 0
41495 327373 cl17028 hemoglobin_linker_C Globular domain of extracellular hemoglobin linker. This domain is found in linker subunits of the erythrocruorin respiratory complex in annelid worms. 0
41496 418400 cl17033 SOAR STIM1 Orai1-activating region. SOAR is the Orai1-activating region of STIM1, where STIM1 are calcium sensors in the endoplasmic reticulum. As the store of calcium is depleted the calcium sensor in the ER activates Orai1, a Ca2+-release-activated Ca2+ (CRAC) channel, in the plasma membrane. The SOAR region, which runs from residues 340-443 on UniProtKB:Q13586, forms a dimer, and is essential for oligomerization of the whole of STIM1. 0
41497 418401 cl17036 SH3 Src Homology 3 domain superfamily. This domain is the 70 C-terminal residues of ADAP - Adhesion and de-granulation promoting adapter protein. It shows homology to SH3 domains; however, conserved residues of the fold are absent. It thus represents an altered SH3 domain fold. An N-terminal, amphipathic, helix makes extensive contacts to residues of the regular SH3 domain fold thereby creating a composite surface with unusual surface properties. The domain can no longer bind conventional proline-rich peptides. There are key phosphorylation sites within the two hSH3 domains and it would appear that binding at these sites does not materially affect the folding of these regions although the equilibrium towards the unfolded state may be slightly altered. The binding partners of the hSH3 domains are still unknown. 0
41498 418402 cl17037 NBD_sugar-kinase_HSP70_actin Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). FtsA has a SHS2 domain PF02491 inserted in to the RnaseH fold PF02491. 0
41499 354301 cl17041 helicase_insert_domain helicase_insert_domain. The endoribonuclease Dicer plays a central role in RNA interference by breaking down RNA molecules into fragments of about 22 nucleotides (miRNAs and siRNAs). Loading of RNA onto Dicer and the enzymatic cleavage are supported by dsRNA-binding proteins, including trans-activation response (TAR) RNA-binding protein (TRBP) or protein activator of PKR (PACT). Together with Argonaute, this constitutes the RNA-induced silencing complex (RISC) which functions to load the small RNA fragments onto Argonaute. The Partner-binding domain of Dicer is responsible for interactions with the dsRNA-binding proteins. This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases. 0
41500 418403 cl17042 Polysacc_deac_2 Divergent polysaccharide deacetylase. This family is divergently related to pfam01522 (personal obs:Yeats C). 0
41501 418404 cl17044 DD_cGKI Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I. PKcGMP_CC is the N-terminal coiled-coil, dimerization, domain of cGMP-protein kinases. 0
41502 277498 cl17045 TM_EGFR-like Transmembrane domain of the Epidermal Growth Factor Receptor family of Protein Tyrosine Kinases. ErbB3 (HER3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. ErbB receptors are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. ErbB3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The ErbB2-ErbB3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB receptors have been associated with increased breast cancer risk. ErbB3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells. 0
41503 418407 cl17065 Cthe_2751_like Uncharacterized protein domain similar to Clostridium thermocellum 2751. Cthe_2751 has been found to form homodimers. Based on structural similarity to other families, a role in processing nucleic acids was suggested, though interactions with DNA could not be demonstrated. 0
41504 418408 cl17067 GH94N_like N-terminal domain of glycoside hydrolase family 94 and related domains. This is a family of bacterial proteins of unknown function. 0
41505 418409 cl17068 AFD_class_I Adenylate forming domain, Class I superfamily. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices. 0
41506 418410 cl17070 AMPKA_C_like C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains. This domain is found at the C-terminus of several fungal kinases. 0
41507 418411 cl17077 Caudo_TAP Caudovirales tail fibre assembly protein, lambda gpK. Phage_tail_APC is a family of general phage tail assembly chaperone proteins from double-stranded DNA viruses with no RNA stage, many of which are unclassified. 0
41508 418413 cl17090 Yos9_DD C-terminal dimerization domain (DD) of Saccharomyces cerevisiae Yos9 and related proteins. This is the dimerization domain (DD) found in Yos9 proteins in yeast. Structural analysis revealed that this domain contributes to self association of Yos9. The overall fold of the domain can be classified as an alpha-beta-roll architecture, comprising two alpha-helices and seven beta-strands. 0
41509 418414 cl17091 Rev1_C C-terminal domain of the Y-family polymerase Rev1. This is the C-terminal domain of DNA repair protein REV1. It interacts with REV7, POLN, POLK and POLI. 0
41510 418415 cl17092 STING_C C-terminal domain of STING. Transmembrane protein 173, also known as stimulator of interferon genes protein (STING), is a transmembrane adaptor protein which is involved in innate immune signalling processes. It induces expression of type I interferons (IFN-alpha and IFN-beta) via the NF-kappa-B and IRF3 pathways in response to non-self cytosolic RNA and dsDNA. 0
41511 418416 cl17095 Bacova_04320_like Uncharacterized proteins similar to Bacteroides ovatus 4320. A large family of (predicted) secreted proteins with unknown functions from human gut and oral cavity. Typically forms a N-terminal domain with FMN binding domain at the C-terminus. Experimentally determined 3D structure of this domain shows a variant of a TATA box binding - like fold, but no detectable sequence similarity to other proteins with this fold 0
41512 418417 cl17096 gal11_coact gall11 coactivator domain. This is activator-binding domain (ABD1) found in Gal11/med15 proteins. Structural analysis indicates that it binds to the central activator domain (cAD) of Gcn4. Mutations in Gal11-ABD1 W196 residue abolish the binding to Gcn4 cAD. 0
41513 418418 cl17100 DIP1984-like DIP1984 family protein and similar proteins. Members of this family, including the Corynebacterium diphtheriae protein DIP1984, which has a solved crystal structure, are uncharacterized with respect to function. Some members of this family previously have been annotated, incorrectly, as septolysin. This model was constructed to overrule and correct such errors. Note that septolysin O, and other members of the family of cholesterol-dependent cytolysins such as listeriolysin O (WP_003722731.1), are unrelated. 0
41514 302613 cl17109 HopAB_KID Kinase-interacting domains of the HopAB family of Type III Effector proteins. AvrPtoB_bdg is a binding region on a family of bacterial plant pathogenic proteins. Type III effector proteins are injected into plants by bacteria when they are under attack, eg Pseudomonas syringae when attacking tomato. AvrPtoB is one such effector that suppresses the plants' PAMP-triggered innate immunity. PAMPs are pathogen/microbe-associated molecular patterns that are detected as non-self by a host. AvrPtoB suppresses this response by binding to BAK1, a kinase that acts with several pattern recognition receptors to activate defense signalling. AvrPtoB_bdg is the region of AvrPtoB that binds to BAK1 thereby preventing its kinase activity after the perception of flagellin. 0
41515 418419 cl17110 Erythro_esteras Erythromycin esterase. This family includes erythromycin esterase enzymes that confer resistance to the erythromycin antibiotic. 0
41516 418420 cl17112 AnfO_nitrog Iron only nitrogenase protein AnfO (AnfO_nitrog). Members of this protein family, called Anf1 in Rhodobacter capsulatus and AnfO in Azotobacter vinelandii, are found only in species with the Fe-only nitrogenase and are encoded immediately downstream of the structural genes in the above named species. 0
41517 271795 cl17113 DUF2833 Protein of unknown function (DUF2833). internal virion protein A 0
41518 418423 cl17157 Alt_A1 Alternaria alternata allergen Alt a 1. AltA1 is a family of fungal allergens. It shows a unique beta-barrel comprising 11 beta-strands. There is structural evidence for the location of IgE antibody-binding epitopes. The crystal structure will allow efforts to promote immunotherapy for patients allergic to Alternaria species. 0
41519 418424 cl17160 BPSL1549 Burkholderia Lethal Factor 1. This family includes members such as BLF1 (Burkholderia lethal factor 1) also known as BPSL1549. BLF1 is a potent toxin from Burkholderia pseudomallei causing melioidosis. BLF1 interacts with the human translation factor eIF4A causing deamidation of Gln339 to Glu, thereby reducing endogenous host cell protein synthesis and triggering increased stress granule formation, which is associated with translational blocks. Structural analysis of BLF1 revealed an alpha/beta fold comprising a sandwich of two mixed beta-sheets surrounded by loops and alpha-helices, where the beta-sheet core of the catalytic pocket is structurally similar to that of the deamidase domain of CNF1 pfam05785. 0
41520 388404 cl17163 CarS Antirepressor CarS. This is an SH3 domain found in antirepressor proteins such as CarS from Myxococcus xanthus. CarS antirepressor recognizes and neutralizes its cognate repressors to turn on a photo-inducible promoter. CarS physically interacts with the MerR-type winged-helix DNA-binding domain of these repressors leading to activation of carB operon. Structural studies of CarS from M. xanthus reveal a beta-barrel fold akin to that in SH3 domains. However, it diverges from the typical SH3 domain fold in the lengths and conformations of the connecting loops. Functional analysis reveals that the SH3 domain-like fold in the antirepressor CarS mimics operator DNA in sequestering the repressor DNA recognition helix to activate transcription. 0
41521 418425 cl17165 SKA2 Spindle and kinetochore-associated protein 2. Spindle and kinetochore-associated protein 2 (SKA2) interacts with the N-termini of SKA1 and SKA3 and forms the Ska complex. This is a microtubule binding complex required for chromosome segregation. 0
41522 418426 cl17166 MMACHC-like Methylmalonic aciduria and homocystinuria type C protein and similar proteins. MMACHC, also called CblC, is involved in the intracellular processing of vitamin B12 by catalyzing two reactions: the reductive decyanation of cyanocobalamin in the presence of a flavoprotein oxidoreductase and the dealkylation of alkylcobalamins through the nucleophilic displacement of the alkyl group by glutathione. Mutations in MMACHC cause combined methylmalonic acidemia/aciduria and homocystinuria (CblC type), the most common inherited disorder of cobalamin metabolism. The structure of MMACHC reveals it to be the most divergent member of the NADPH-dependent flavin reductase family that can use FMN or FAD to catalyze reductive decyanation; it is also the first enzyme with glutathione transferase (GST) activity that is unrelated to the GST superfamily in structure and sequence. 0
41523 418427 cl17169 RRM_SF RNA recognition motif (RRM) superfamily. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the three RRM motif. 0
41524 418428 cl17171 PH-like Pleckstrin homology-like domain. SIN1_PH is a pleckstrin-homology domain found at the C-terminus of SIN1. It is conserved from yeast to humans. PH-domains are involved in intracellular signalling or as constituents of the cytoskeleton. SIN1 (SAPK-interacting protein 1) plays an essential role in signal transduction, and the PH domain is involved in lipid and membrane binding. 0
41525 418429 cl17172 ADH_N Alcohol dehydrogenase GroES-like domain. N-terminal region of oxidoreductase and prostaglandin reductase and alcohol dehydrogenase. 0
41526 418430 cl17173 AdoMet_MTases N/A. This family appears to have methyltransferase activity. 0
41527 418431 cl17182 NAT_SF N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate. This family of GCN5-related N-acetyl-transferases bind both CoA and acetyl-CoA. They are characterized by highly conserved glycine, a cysteine residue in the acetyl-CoA binding site near the acetyl group, their small size compared with other GNATs and a lack of an obvious substrate-binding site. It is proposed that they transfer an acetyl group from acetyl-CoA to one or more unidentified aliphatic amines via an acetyl (cysteine) enzyme intermediate. The substrate might be another macromolecule. 0
41528 418432 cl17185 LPLAT Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis. This family contains proteins with N-acetyltransferase functions. 0
41529 418433 cl17190 NK N/A. This family includes enzymes related to cytidylate kinase. 0
41530 418434 cl17194 Oxidored_q6 NADH ubiquinone oxidoreductase, 20 Kd subunit. This model describes the B chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoB. The quinone is plastoquinone in Synechocystis (where the chain is designated K) and in chloroplast, where NADH may be replaced by NADPH. In the methanogenic archaeal genus Methanosarcina, NADH is replaced by F420H2. [Energy metabolism, Electron transport] 0
41531 418435 cl17210 OSCP ATP synthase delta (OSCP) subunit. F0F1 ATP synthase subunit delta; Provisional 0
41532 418436 cl17212 PTA_PTB Phosphate acetyl/butaryl transferase. The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes. The exact function of the plsX protein in fatty acid synthesis is unknown. 0
41533 418437 cl17225 DAHP_synth_1 DAHP synthetase I family. NeuB is the prokaryotic N-acetylneuraminic acid (Neu5Ac) synthase. It catalyzes the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesize the 9-phosphate form, Neu5Ac-9-P, and utilize ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family. This family also contains SpsE spore coat polysaccharide biosynthesis proteins. 0
41534 418438 cl17238 RING_Ubox The superfamily of RING finger (Really Interesting New Gene) domain and U-box domain. This is a family of primate-specific Ret finger protein-like (RFPL) zinc-fingers of the C3HC4 type. Ret finger protein-like proteins are primate-specific target genes of Pax6, a key transcription factor for pancreas, eye and neocortex development. This domain is likely to be DNA-binding. This zinc-finger domain together with the RDM domain, pfam11002, forms a large zinc-finger structure of the RING/U-Box superfamily. RING-containing proteins are known to exert an E3 ubiquitin protein ligase activity with the zinc-finger structure being mandatory for binding to the E2 ubiquitin-conjugating enzyme. 0
41535 418439 cl17255 CPSase_L_D2 Carbamoyl-phosphate synthase L chain, ATP binding domain. A member of the ATP-grasp fold predicted to be involved in the modification/biosynthesis of spore-wall and capsular proteins. 0
41536 418440 cl17279 DHFR N/A. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C-terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186. 0
41537 418441 cl17319 PIN_5 PINc domain ribonuclease. hypothetical protein; Provisional 0
41538 418442 cl17340 Glyco_hydro_100 Alkaline and neutral invertase. beta-fructofuranosidase 0
41539 418443 cl17346 Trehalase Trehalase. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the C-terminal catalytic domain. 0
41540 418444 cl17362 Transglut_core Transglutaminase-like superfamily. This peptidase has the catalytic triad C-H-D at the C-terminal end, a triad similar to that in thiol proteases and animal transglutaminases. It catalyzes the in vitro lysis of M. marburgensis cells under reducing conditions and exhibits characteristics of metal-activated peptidases. 0
41541 302641 cl17365 TrkH Cation transport protein. This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH, a member of the family from E. coli, is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the Trk system in E. coli. 0
41542 418445 cl17398 YtfJ_HI0045 Bacterial protein of unknown function (YtfJ_HI0045). This model represents sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ. 0
41543 418446 cl17448 CP_ATPgrasp_1 A circularly permuted ATPgrasp. Circularly permuted ATP-grasp prototyped by Roseiflexus RoseRS_2616 that is associated in gene neighborhoods with a GCS2-like COOH-NH2 ligase, alpha/beta hydrolase fold peptidase, GAT-II -like amidohydrolase, and M20 peptidase. Members of this family are predicted to be involved in the biosynthesis of small peptides. 0
41544 418447 cl17486 Sipho_tail Phage tail protein. This model represents the best-conserved region of about 125 amino acids, toward the N-terminus, of a family of proteins from temperate phage of a number of Gram-positive bacteria. These phage proteins range in length from 230 to 525 amino acids. [Mobile and extrachromosomal element functions, Prophage functions] 0
41545 418448 cl17505 CamS_repeat Repeat domain of CamS sex pheromone cAM373 precursor and related proteins. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. 0
41546 248060 cl17506 LDT_IgD_like IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the repeat adjacent to the catalytic domain. 0
41547 248061 cl17507 LbR-like Left-handed beta-roll, including virulence factors and various other proteins. This group contains the collagen-binding domain virulence factor YadA, an adhesion protein of several Yersinia species, and related cell surface proteins, including Moraxella catarrhalis UspA-like proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left handed beta roll made up of a series of short repeated elements. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric left-handed parallel beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. 0
41548 418449 cl17515 FeS Putative Fe-S cluster. This family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster. 0
41549 418450 cl17537 gp32 gp32 DNA binding protein like. single-stranded DNA binding protein; Provisional 0
41550 418451 cl17559 Amido_AtzD_TrzD Amidohydrolase ring-opening protein (Amido_AtzD_TrzD). Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC 3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC 3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring-opening of barbituric acid to ureidomalonic acid (see Soong, et al., ). 0
41551 418452 cl17562 Spore_III_AF Stage III sporulation protein AF (Spore_III_AF). This family represents the stage III sporulation protein AF of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of this protein is poorly conserved, so only the N-terminal region, which includes two predicted transmembrane domains, is included in the seed alignment. [Cellular processes, Sporulation and germination] 0
41552 418453 cl17592 TfoX_N TfoX N-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found as an isolated domain in some proteins suggesting this is an autonomous domain. 0
41553 302653 cl17685 Phytase Phytase. Phytase is a secreted enzyme which hydrolyzes phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to have a six-bladed propeller folding architecture. 0
41554 418454 cl17687 5_nucleotid 5' nucleotidase family. This model includes a 5'-nucleotidase specific for purines (IMP and GMP). These enzymes are members of the Haloacid Dehalogenase (HAD) superfamily. HAD members are recognized by three short motifs {hhhhDxDx(T/V)}, {hhhh(T/S)}, and either {hhhh(D/E)(D/E)x(3-4)(G/N)} or {hhhh(G/N)(D/E)x(3-4)(D/E)} (where "h" stands for a hydrophobic residue). Crystal structures of many HAD enzymes have verified PSI-PRED predictions of secondary structural elements which show each of the "hhhh" sequences of the motifs as part of beta sheets. This subfamily of enzymes is part of "Subfamily I" of the HAD superfamily by virtue of a "cap" domain in between motifs 1 and 2. This subfamily's cap domain has a different predicted secondary structure than all other known HAD enzymes and thus has been designated "subfamily IG". This domain appears to consist of a mixed alpha/beta fold. A Pfam model (pfam05761) detects an identical range of sequences above the trusted cutoff, but does not model the N-terminal motif 1 region. A TIGRFAMs model (TIGR01993) represents a (putative) family of _pyrimidine_ 5'-nucleotidases which are also subfamily I HAD's, which should not be confused with the current model. 0
41555 388436 cl17690 DUF2204 Nucleotidyl transferase of unknown function (DUF2204). This domain, found in various hypothetical archaeal proteins, has no known function. However, this family was identified as belonging to the nucleotidyltransferase superfamily. 0
41556 418455 cl17703 Dehydratase_MU Dehydratase medium subunit. This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 0
41557 418456 cl17705 MBT mbt repeat. Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation. 0
41558 418457 cl17713 NnrU NnrU protein. This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor. 0
41559 418458 cl17715 Coat_F Coat F domain. The Coat F proteins contribute to the Bacillales spore coat. This domain occurs multiple times in the genomes in which it is found. 0
41560 418459 cl17718 DUF2384 Protein of unknown function (DUF2384). Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. This family was proposed by Makarova, et al. (2009) to be the antitoxin component of a new class of type 2 toxin-antitoxin system, or addiction module. [Cellular processes, Other] 0
41561 418460 cl17720 Aminopep Putative aminopeptidase. This family of bacterial proteins has a conserved HEXXH motif, suggesting that members are putative peptidases of zincin fold. 0
41562 327433 cl17735 VWC von Willebrand factor type C domain. This cysteine rich domain occurs alongside the TIL pfam01826 domain and is likely to be distantly related to it. 0
41563 418461 cl17774 SAM_adeno_trans S-adenosyl-l-methionine hydroxide adenosyltransferase. Members of this family are fluorinase (adenosyl-fluoride synthase, EC 2.5.1.63), an enzyme involved in the first committed step in the biosynthesis of at least two different organofluorine compounds. Few organofluorine natural products are known. Related enzymes include chlorinases (EC 2.5.1.94) that lack fluorinase activity, although a fluorinase may show chlorinase activity. [Cellular processes, Biosynthesis of natural products] 0
41564 418462 cl17781 Chromate_transp Chromate transporter. Members of this family probably act as chromate transporters. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP. 0
41565 302666 cl17795 ArsP_1 Predicted permease. This family of integral membrane proteins are predicted to be permeases of unknown specificity. 0
41566 388445 cl17805 DUF483 Protein of unknown function (DUF483). Family of uncharacterized prokaryotic proteins. 0
41567 418463 cl17812 Phage_base_V Type VI secretion system, phage-baseplate injector. This family consists of Bacteriophage Mu Gp45 related proteins from both phages and bacteria. The function of this family is unknown although it has been suggested that family members may be involved in baseplate assembly. 0
41568 354343 cl17816 OprB Carbohydrate-selective porin, OprB family. 0
41569 418464 cl17823 MASE1 MASE1. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins. This entry also includes members of the 8 transmembrane UhpB type (8TMR-UT) domain family. 0
41570 418465 cl17829 DUF917 Protein of unknown function (DUF917). This family consists of hypothetical bacterial and archaeal proteins of unknown function. 0
41571 418466 cl17838 DUF1365 Protein of unknown function (DUF1365). This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown. 0
41572 418468 cl17850 Trp_oprn_chp Tryptophan-associated transmembrane protein (Trp_oprn_chp). Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis. 0
41573 418469 cl17851 DUF2100 Uncharacterized protein conserved in archaea (DUF2100). This domain, found in various hypothetical archaeal proteins, has no known function. 0
41574 418470 cl17852 DUF2121 Uncharacterized protein conserved in archaea (DUF2121). This domain, found in various hypothetical archaeal proteins, has no known function. 0
41575 418471 cl17857 DUF2278 Uncharacterized conserved protein (DUF2278). Members of this family of hypothetical bacterial proteins have no known function. 0
41576 418472 cl17862 CBP_GIL GGDEF I-site like or GIL domain. This protein, called BcsE (bacterial cellulose synthase E) or YhjS, is required for cellulose biosynthesis in Salmonella enteritidis. Its role in this process across multiple bacterial species is implied by the partial phylogenetic profiling algorithm. Members are found in the vicinity of other cellulose biosynthesis genes. The model does not include a much less well-conserved N-terminal region about 150 amino acids in length for most members. Solano, et al. suggest this protein acts as a protease. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
41577 418473 cl17874 DDE_5 DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 0
41578 248453 cl17899 HyfE Hydrogenase-4 membrane subunit HyfE [Energy production and conversion]. hydrogenase 4 membrane subunit; Provisional 0
41579 418474 cl17916 BF2867_like Tandemly repeated domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. This family of proteins is found in bacteria. Proteins in this family are typically between 348 and 360 amino acids in length. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins. 0
41580 418475 cl17974 STI1 STI1 domain. This entry corresponds to the STI1 domain that is found in two copies in the Sti1 protein. 0
41581 302697 cl18310 NHL NHL repeat unit of beta-propeller proteins. This domain occurs in tandem repeats, as many as 13, in proteins from Bdellovibrio bacteriovorus, Azotobacter vinelandii, Geobacter sulfurreducens, Pirellula sp. 1, Myxococcus xanthus, and others, many of which are Deltaproteobacteria. The periodicity of the repeat ranges from about 57 to 61 amino acids, and a core region of about 54 is represented by this model and seed alignment. 0
41582 418507 cl18921 Bvu_2165_C_like The C-terminal domain of uncharacterized bacterial proteins. A C-terminal domain in a large family of (predicted) secreted proteins with unknown functions from human gut Bacteroides. 0
41583 418508 cl18929 TIN2_N N-terminal domain of TRF-interacting nuclear factor 2; shelterin complex protein of telomeres. This is the N-terminus of TERF1-interacting nuclear factor 2. It is required for the formation of the shelterin complex. The shelterin complex is involved in the protection and maintenance of telomeres. 0
41584 418509 cl18942 MqsR Motility quorum-sensing regulator (MqsR). MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins. 0
41585 418510 cl18945 AAT_I N/A. These proteins catalyze the reversible transfer of an amino group from the amino acid substrate to an acceptor alpha-keto acid. They require pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze this reaction. Trans-amination reactions are of central importance in amino acid metabolism and in links to carbohydrate and fat metabolism. This class of aminotransferases acts as dimers in a head-to-tail configuration. 0
41586 418511 cl18951 Amidase Amidase. Members of this protein family are aminohydrolases related to, but distinct from, glutamyl-tRNA(Gln) amidotransferase subunit A. The best characterized member is the biuret hydrolase of Pseudomonas sp. ADP, which hydrolyzes ammonia from the three-nitrogen compound biuret to yield allophanate. Allophanate is also an intermediate in urea degradation by the urea carboxylase/allophanate hydrolase pathway, an alternative to urease. [Unknown function, Enzymes of unknown specificity] 0
41587 418512 cl18957 TerD_like Uncharacterized proteins involved in stress response, similar to tellurium resistance terD. The TerD domain is found in TerD family proteins that include the paralogous TerD, TerA, TerE, TerF and TerZ proteins It is found in a stress response operon with TerB and TerC. TerD has a maximum of two calcium-binding sites depending on the conservation of aspartates. It has various fusions to nuclease domains, RNA binding domains, ubiquitin related domains, and metal binding domains. The ter gene products lie at the centre of membrane-linked metal recognition complexes with regulatory ramifications encompassing phosphorylation-dependent signal transduction, RNA-dependent regulation, biosynthesis of nucleoside-like metabolites and DNA processing linked to novel pathways. 0
41588 418513 cl18961 MltG_like proteins similar to Escherichia coli YceG/mltG may function as endolytic murein transglycosylases. This family of proteins is found in bacteria. Proteins in this family are typically between 332 and 389 amino acids in length. This family was previously incorrectly annotated and named as aminodeoxychorismate lyase. The structure of YceG was solved by X-ray crystallography. 0
41589 418514 cl18962 Radical_SAM N/A. Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. 0
41590 276221 cl18967 Csx17_I-U CRISPR/Cas system-associated protein Csx17. Members of this protein family are found exclusively in CRISPR-associated (cas) type I system gene clusters of the Dpsyc subtype. Markers for that type include a variant form of cas3 (model TIGR02621) and the GSU0054-like protein family (model TIGR02165). This family occurs in less than half of known Dpsyc clusters. 0
41591 418515 cl18968 RNase_H2-B Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. RNases H are enzymes that specifically hydrolyze RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p), this family represents the homologs of Ydr279p. It is not known whether non yeast proteins in this family fulfil the same function. 0
41592 276222 cl19000 Cas10_III CRISPR/Cas system-associated protein Cas10. Members of this uncommon, sporadically distributed protein family are large (>900 amino acids) and strictly associated, so far, with CRISPR-associated (Cas) gene clusters. Nearby Cas genes always include members of the RAMP superfamily and the six-gene CRISPR-associated RAMP module. Species in which it is found, so far, include three archaea (Methanosarcina mazei, M. barkeri and Methanobacterium thermoautotrophicum) and two bacteria (Thermodesulfovibrio yellowstonii DSM 11347 and Sulfurihydrogenibium azorense). 0
41593 276223 cl19002 Csf1_U CRISPR/Cas system-associated protein Csf1. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf1 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies closest to the repeats. 0
41594 276224 cl19005 Csc1_I-D CRISPR/Cas system-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc1 for CRISPR/Cas Subtype Cyano protein 1, as it is often the first gene upstream of the core cas genes, cas3-cas4-cas1-cas2. 0
41595 276225 cl19006 Cas10d_I-D CRISPR/Cas system-associated protein Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc3 for CRISPR/Cas Subtype Cyano protein 3, as it is often the third gene upstream of the core cas genes, cas3-cas4-cas1-cas2. 0
41596 418516 cl19028 Csm6_III-A CRISPR/Cas system-associated protein Csm6. This entry represents a conserved region of about 150 amino acids found in at least five archaeal and three bacterial species. These species all contain CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the protein is encoded in the vicinity of a CRISPR/Cas locus. 0
41597 418517 cl19029 Csx16_III-U CRISPR/Cas system-associated protein Csx16. This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas (CRISPR-associated) genes. 0
41598 418518 cl19051 ST7 Suppression of tumorigenicity 7. The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumor suppressor. The molecular function of this protein is uncertain. 0
41599 418519 cl19054 SDH_N_domain Saccharopine dehydrogenase N-terminal domain. Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to Saccharop_dh (pfam03435) in eukaryotes. 0
41600 418520 cl19078 REC phosphoacceptor receiver (REC) domain of response regulators (RRs) and pseudo response regulators (PRRs). TadZ_N is the N-terminal region of the Flp pilus assembly protein TadZ, which carries an AAA, ATPase domain immediately downstream, AAA_31, pfam13614. The domain is an example of a signal-transduction-response receiver. It is localized to the cytoplasmic side of the inner bacterial cell-membrane, contacting also with both tadA and RcpC. 0
41601 418521 cl19096 Flavin_utilizing_monoxygenases N/A. Members of this family are F420-binding enzymes with a proven functional N-terminal twin-arginine translocation (TAT) signal. Members are homologous to the cytosolic F420-dependent glucose-6-phosphate dehydrogenase but do not share the same function. 0
41602 418522 cl19097 TS_Pyrimidine_HMase N/A. This is a family of proteins that are flavin-dependent thymidylate synthases. 0
41603 418523 cl19102 Fer4_9 4Fe-4S dicluster domain. Domain II of the enzyme dihydropyrimidine dehydrogenase binds FAD. Dihydropyrimidine dehydrogenase catalyzes the first and rate-limiting step of pyrimidine degradation by converting pyrimidines to the corresponding 5,6-dihydro compounds. This domain carries two Fe4-S4 clusters. 0
41604 418524 cl19105 Sina N/A. The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina thus targets TTK88 for degradation, thereby promoting the R7 pathway. Murine and human homologs of Sina have also been identified. The human homolog Siah-1 also binds E2 enzymes (UbcH5) and through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remaining C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologs, whose similarity was noted previously, this family also includes putative homologs from Caenorhabditis elegans and Arabidopsis thaliana. 0
41605 418525 cl19107 SPFH_like core domain of the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This domain is found in the Major Vault Protein and has been called the shoulder domain. This family includes two bacterial proteins, suggesting that some bacteria may possess vault particles. 0
41606 418526 cl19111 Sir4p-SID_like The SID domain of Saccharomyces cerevisiae silent information regulator 4, a Sir2p interaction domain; and related domains. This is the Sir2 interaction domain (SID domain) of silent information regulator 4 (Sir4). 0
41607 418527 cl19112 CBM29_CBM65 family 29 and family 65 carbohydrate binding modules. This domain is found in the non-catalytic carbohydrate binding module 65B (CMB65B) present in Eubacterium cellulosolvens. CBMs are present in plant cell wall degrading enzymes and are responsible for targeting, which enhances catalysis. CBM65s display higher affinity for oligosaccharides, such as cellohexaose, and particularly polysaccharides than cellotetraose, which fully occupies the core component of the substrate binding cleft. The concave surface presented by beta-sheet 2 comprises the beta-glucan binding site in CBM65s. C6 of all the backbone glucose moieties makes extensive hydrophobic interactions with the surface tryptophans of CBM65s. Three out of the four surface Trp are highly conserved. The conserved metal ion site typical of CBMs is absent in this CBM65 family. 0
41608 418528 cl19114 RNAP_largest_subunit_N Largest subunit of RNA polymerase (RNAP), N-terminal domain. RNA polymerases catalyze the DNA-dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking. 0
41609 418529 cl19115 Cupredoxin Cupredoxin superfamily. This family represents the N-terminal non-catalytic domain of protein-arginine deiminase. This domain has a cupredoxin-like fold. 0
41610 418530 cl19120 SMBP_like Small metal-binding protein conserved in proteobacteria. This histidine-rich protein binds metal ions. 0
41611 418531 cl19121 ABBA-PTs ABBA-type aromatic prenyltransferases (PTases). This family of proteins represents tryptophan dimethylallyltransferase (EC:2.5.1.34), which catalyzes the first step of ergot alkaloid biosynthesis. Ergot alkaloids, which are produced by endophyte fungi, can enhance plant host fitness, but can also cause toxicosis in livestock. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 390 and 465 amino acids in length. 0
41612 418532 cl19122 PANDER_like Domains similar to the Pancreatic-derived factor. ILEI is a family of proteins found in vertebrates. It is heavily involved in the epithelial-to-mesenchymal transition (EMT) during embryonic development, cancer progression, metastasis, and chronic inflammation/fibrosis. ILEI is upregulated exclusively at the level of translation, and abnormal ILEI expression, i.e., cytoplasmic over-expression instead of vesicular localization, is associated with EMT in human cancerous tissue. In order to induce and maintain the EMT of hepatocytes in a TGF-beta-independent fashion, ILEI needs the cooperation of oncogenic Ras. 0
41613 418533 cl19123 lytB_ispH 4-hydroxy-3-methylbut-2-enyl diphosphate reductase. The mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway. 0
41614 418534 cl19148 Molybdop_Fe4S4 Molybdopterin oxidoreductase Fe4S4 domain. The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain. 0
41615 418535 cl19167 Bac_export_2 FlhB HrpN YscU SpaS Family. type III secretion system protein HrcU; Validated 0
41616 418537 cl19182 FlgH Flagellar L-ring protein. flagellar basal body L-ring protein; Reviewed 0
41617 418538 cl19186 Amidinotransf Amidinotransferase. Peptidyl-arginine deiminase (PAD) enzymes catalyze the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (pfam03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologs). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyze the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have a FMN cofactor. 0
41618 418539 cl19188 PL-6 Polysaccharide Lyase Family 6. This family includes chondroitinases. These enzymes cleave the glycosaminoglycan dermatan sulfate. 0
41619 418540 cl19190 Flavoprotein Flavoprotein. phosphopantothenoylcysteine decarboxylase; Validated 0
41620 418541 cl19192 LolA_fold-like family containing periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
41621 418542 cl19194 Phage_portal Phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage capsid and the tail proteins. 0
41622 418543 cl19197 Complex1_30kDa Respiratory-chain NADH dehydrogenase, 30 Kd subunit. This model describes the C subunit of the NADH dehydrogenase complex I in bacteria, as well as many instances of the corresponding mitochondrial subunit (NADH dehydrogenase subunit 9) and of the F420H2 dehydrogenase in Methanosarcina. Complex I contains subunits designated A-N. This C subunit often occurs as a fusion protein with the D subunit. This model excludes the NAD(P)H and plastoquinone-dependent form of chloroplasts and [Energy metabolism, Electron transport] 0
41623 418544 cl19201 HypA Hydrogenase/urease nickel incorporation, metallochaperone, hypA. Contains a CXXC-~12X-CXXC motif and, genetically, appears to be a regulatory protein. In H. pylori, a hypA mutant abolished hydrogenase activity and decreased urease activity. Nickel supplementation of the medium restored urease activity and partial hydrogenase activity. HypA is probably involved in inserting Ni into enzymes. [Protein fate, Protein modification and repair] 0
41624 418545 cl19212 PTH2_family N/A. Peptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation. 0
41625 418546 cl19215 CoA_transf_3 CoA-transferase family III. Members of this protein family belong by homology to the family of CoA transferases. However, the characterized member from Chloroflexus aurantiacus appears to perform an intramolecular transfer, making it an isomerase. The enzyme converts mesaconyl-C1-CoA to mesaconyl-C4-CoA as part of the bicyclic 3-hydroxyproprionate pathway for carbon fixation. 0
41626 418547 cl19217 SBF Sodium Bile acid symporter family. These family members are 7TM putative membrane transporter proteins. The family is similar to the SBF family of bile-acid symporters, pfam01758. 0
41627 418548 cl19219 Bactofilin Polymer-forming cytoskeletal. Members of this family include FapA (flagellar assembly protein A), found in Vibrio vulnificus. The synthesis of flagella allows bacteria to respond to chemotaxis by facilitating motility. Studies examining the role of FapA show that the loss or delocalization of FapA results in a complete failure of the flagellar biosynthesis and motility in response to glucose mediated chemotaxis. The polar localization of FapA is required for flagellar synthesis, and dephosphorylated EIIAGlc (Glucose-permease IIA component) inhibited the polar localization of FapA through direct interaction. 0
41628 418549 cl19223 G_glu_transpept Gamma-glutamyltranspeptidase. gamma-glutamyltranspeptidase; Reviewed 0
41629 418550 cl19224 TGT Queuine tRNA-ribosyltransferase. queuine tRNA-ribosyltransferase; Provisional 0
41630 418551 cl19237 DUF45 Protein of unknown function DUF45. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc metallopeptidases. This family includes the bacterial SprT protein. 0
41631 418552 cl19248 CHAT CHAT domain. These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins. 0
41632 418553 cl19251 zf-ZPR1 ZPR1 zinc-finger domain. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction. 0
41633 418554 cl19252 MreC rod shape-determining protein MreC. rod shape-determining protein MreC; Provisional 0
41634 418555 cl19253 YcaO YcaO cyclodehydratase, ATP- and Mg2+-binding. Members of this protein family include enzymes related to SagD, previously referred to as a scaffold or docking protein involved in the biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Newer evidence describes an enzymatic activity, an ATP-dependent cyclodehydration reaction, previously ascribed to the SagC component. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins. 0
41635 418556 cl19280 FlgI Flagellar P-ring protein. flagellar basal body P-ring protein; Provisional 0
41636 418557 cl19284 Ribosomal_L37ae Ribosomal L37ae protein family. This model finds eukaryotic ribosomal protein eL43 (previously L37a) and its archaeal orthologs. The nomenclature is tricky because eukaryotes have proteins called both L37 and L37a. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
41637 418558 cl19285 SmpA_OmlA SmpA / OmlA family. Structure 3D4E shares structural similarity with beta-lactamase inhibitory proteins (BLIP), which already include 1XXM, 1S0W, 1JTG, 2G2U, 2G2W, 2B5R, and 3due. All of these structures are involved in beta-lactamase inhibitor complexes. (REF http://www.topsan.org/Proteins/JCSG/3d4e) 0
41638 302813 cl19288 RhaT L-rhamnose-proton symport protein (RhaT). Members of this family fall into the drug/metabolite transporter (dmt) superfamily. They carry 10xTM domains arranged as 5+5. Although these two sets may originally have arisen by gene duplication, the divergence is now such that the two halves are no longer recognizably homologous. 0
41639 418559 cl19294 ApbE ApbE family. thiamine biosynthesis lipoprotein ApbE; Provisional 0
41640 418560 cl19297 Dor1 Dor1-like family. This conserved Sec5 family of eukaryotic proteins does not represent the Sec5-Ral binding site. 0
41641 418561 cl19308 MdoG Periplasmic glucan biosynthesis protein, MdoG. glucan biosynthesis protein G; Provisional 0
41642 418562 cl19310 CT_C_D Carboxyltransferase domain, subdomain C and D. This domain represents subunit 1 of allophanate hydrolase (AHS1). 0
41643 418563 cl19311 Urocanase Urocanase Rossmann-like domain. urocanate hydratase; Provisional 0
41644 418564 cl19312 FdhE Protein involved in formate dehydrogenase formation. formate dehydrogenase accessory protein FdhE; Provisional 0
41645 418565 cl19356 PmbA_TldD Putative modulator of DNA gyrase. peptidase PmbA; Provisional 0
41646 418566 cl19360 DegV Uncharacterized protein, DegV family COG1307. This is the kinase domain of the dihydroxyacetone kinase family. 0
41647 418567 cl19362 Coprogen_oxidas Coproporphyrinogen III oxidase. coproporphyrinogen-III oxidase 0
41648 418568 cl19374 Diphthamide_syn Putative diphthamide synthesis protein. Members of this family are the archaeal protein Dph2, members of the universal archaeal protein family designated arCOG04112. The chemical function of this protein is analogous to the radical SAM family (pfam04055), although the sequence is not homologous. The chemistry involves [4Fe-4S]-aided formation of a 3-amino-3-carboxypropyl radical rather than the canonical 5'-deoxyadenosyl radical of the radical SAM family. 0
41649 418569 cl19388 COX15-CtaA Cytochrome oxidase assembly protein. cytochrome c oxidase assembly protein; Provisional 0
41650 418570 cl19398 Rep_3 Initiator Replication protein. Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination. 0
41651 418571 cl19401 FUSC_2 Fusaric acid resistance protein-like. This family consists of bacterial proteins with three transmembrane regions that are purported to be aromatic acid exporters. 0
41652 327549 cl19409 Cad Cadmium resistance transporter. These proteins are members of the Cadmium Resistance (CadD) Family (TC 2.A.77). To date, this family of proteins has only been found in Gram-positive bacteria. The CadD family includes several closely related Staphylococcal proteins reported to function in cadmium resistance. Members are predicted to span the membrane five times; the mechanism of resistance is believed to be export but has also been suggested to be binding and sequestration in the membrane. Closely related but outside the scope of this model is another staphylococcal protein that has been reported to possibly function in quaternary ammonium ion export. Still more distant are other members of the broader LysE family (see Vrljic. et al, ). [Transport and binding proteins, Amino acids, peptides and amines] 0
41653 327550 cl19414 Glt_symporter Sodium/glutamate symporter. [Transport and binding proteins, Amino acids, peptides and amines] 0
41654 418573 cl19416 GRDB Glycine/sarcosine/betaine reductase selenoprotein B (GRDB). Members of this family form the PrdB subunit, usually a selenoprotein, in the D-proline reductase complex. The usual pathway is conversion of L-proline to D-proline by a racemase, then use of D-proline as an electron acceptor coupled to ATP generation under anaerobic conditions. 0
41655 418574 cl19417 FYDLN_acid Protein of unknown function (FYDLN_acid). Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown. 0
41656 418575 cl19418 PrpF PrpF protein. The 2-methylcitrate cycle is one of at least five degradation pathways for propionate via propionyl-CoA. Degradation of propionate toward pyruvate consumes oxaloacetate and releases succinate. Oxidation of succinate back into oxaloacetate by the TCA cycle makes the 2-methylcitrate pathway a cycle. This family consists of PrpF, an incompletely characterized protein that appears to be an essential accessory protein for the Fe/S-dependent 2-methylisocitrate dehydratase AcnD (TIGR02333). This protein is related to but distinct from FldA (part of pfam04303), a putative fluorene degradation protein of Sphingomonas sp. LB126. [Energy metabolism, Fermentation] 0
41657 418576 cl19419 DUF2263 Uncharacterized protein conserved in bacteria (DUF2263). Members of this uncharacterized protein family are found in Streptomyces, Nostoc sp. PCC 7120, Clostridium acetobutylicum, Lactobacillus johnsonii NCC 533, Deinococcus radiodurans, and Pirellula sp., a broad but sparse phylogenetic distribution that at least suggests lateral gene transfer. 0
41658 418577 cl19420 Spore_IV_A Stage IV sporulation protein A (spore_IV_A). A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. [Cellular processes, Sporulation and germination] 0
41659 302859 cl19421 RHSP Retrotransposon hot spot protein. This model describes full-length and part-length members of the RHS (retrotransposon hot spot) family in Trypanosoma brucei and Trypanosoma cruzi. Members of this family are frequently interrupted by non-LTR retrotransposons inserted at exactly the same relative position. 0
41660 267777 cl19424 GCH_III GTP cyclohydrolase III. GTP cyclohydrolase (GCH) III from Methanocaldococcus jannaschi catalyzes the conversion of GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy). The reaction requires two bound magnesium ions for the catalysis and is activated by monovalent cations such as potassium and ammonium. The enzyme is a tetramer of identical subunits; each monomer is composed of an N- and a C-terminal domain that adopt nearly superimposible structures, suggesting that the protein has arisen by gene duplication. The family is found in archaea and bacteria. 0
41661 388564 cl19428 TrbH Conjugal transfer protein TrbH. conjugal transfer protein TrbH; Provisional 0
41662 418578 cl19470 PTS_2-RNA RNA 2'-phosphotransferase, Tpt1 / KptA family. RNA 2'-phosphotransferase; Reviewed 0
41663 418579 cl19471 DUF945 Bacterial protein of unknown function (DUF945). hypothetical protein; Provisional 0
41664 418580 cl19472 Lipoprotein_16 Uncharacterized lipoprotein. hypothetical protein; Provisional 0
41665 418581 cl19473 DUF1846 Domain of unknown function (DUF1846). hypothetical protein; Provisional 0
41666 418582 cl19474 TraV Type IV conjugative transfer system lipoprotein (TraV). The TraV protein is a component of conjugative type IV secretion systems. TraV is an outer membrane lipoprotein and is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half. 0
41667 418583 cl19475 TraN Type-IV conjugative transfer system mating pair stabilisation. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilization (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein. 0
41668 418584 cl19477 Sulf_transp Sulphur transport. For 79 of the first 80 reference genomes in which a member of this protein family, YedE, is found, a selenium utilization system is found, spread over a broad taxonomic range (Firmicutes, spirochetes, delta-proteobacteria, Fusobacteria, Bacteroides, etc.). This family is less widespread than YedF, also involved in selenium metabolism. 0
41669 354426 cl19481 LON Found in ATP-dependent protease La (LON). N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs. 0
41670 388572 cl19482 Peptidase_M8 Leishmanolysin. Glycoprotein GP63 (leishmanolysin); Provisional 0
41671 418585 cl19485 AMA-1 Apical membrane antigen 1. apical membrane antigen 1; Provisional 0
41672 418586 cl19499 UPF0061 Uncharacterized ACR, YdiU/UPF0061 family. 0
41673 418587 cl19501 Mut7-C Mut7-C RNAse domain. RNAse domain of the PIN fold with an inserted Zinc Ribbon at the C-terminus. 0
41674 354429 cl19503 TadB Flp pilus assembly protein TadB [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 0
41675 418588 cl19504 SpoVR SpoVR like protein. SpoVR family protein; Provisional 0
41676 418589 cl19505 CreA CreA protein. This family consists of several bacterial CreA proteins, the function of which is unknown. 0
41677 418590 cl19506 RraB Regulator of ribonuclease activity B. This family of proteins regulates mRNA abundance by binding to RNaseE and inhibiting its endonucleolytic activity. A subset of these proteins is predicted to function as immunity proteins. 0
41678 418591 cl19507 Lipoprotein_18 NlpB/DapX lipoprotein. This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential. 0
41679 418592 cl19509 Fibrillarin_2 Fibrillarin-like archaeal protein. Members of this protein family are HmdC, whose gene regularly occurs in the context of genes for HmdA (5,10-methenyltetrahydromethanopterin hydrogenase) and the radical SAM protein HmdB involved in biosynthesis of the HmdA cofactor. Bioinformatics suggests this protein, a homolog of eukaryotic fibrillarin, may be involved in biosynthesis of the guanylyl pyridinol cofactor in HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis] 0
41680 418593 cl19510 EutB Ethanolamine ammonia lyase large subunit (EutB). This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins (EC:4.3.1.7). Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC). 0
41681 418594 cl19511 BshC Bacillithiol biosynthesis BshC. Members of this protein family are BshC, an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 0
41682 418595 cl19519 FKBP_C FKBP-type peptidyl-prolyl cis-trans isomerase. This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. Distant homology prediction algorithms consistently suggest a homology between this family and FKBP-type peptidyl-prolyl cis-trans isomerases (PF00254), but this relation is as yet not confirmed. The function of this family is unknown. 0
41683 418596 cl19522 PK_C Pyruvate kinase, alpha/beta domain. As well as being found in pyruvate kinase this family is found as an isolated domain in some bacterial proteins. 0
41684 327577 cl19527 SWIM SWIM zinc finger. This domain is found in bacterial, archaeal and eukaryotic proteins. It is predicted to be organized into two N-terminal beta-strands and a C-terminal alpha helix, thus possibly adopting a fold similar to that of the C2H2 zinc finger (pfam00096). SWIM is thought to be a versatile domain that can interact with DNA or proteins in different contexts. 0
41685 418597 cl19531 Phage_prot_Gp6 Phage portal protein, SPP1 Gp6-like. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions] 0
41686 418599 cl19541 Head-tail_con Bacteriophage head to tail connecting protein. hypothetical protein 0
41687 418600 cl19543 Metallothio Metallothionein. This is a family of eukaryotic metallothioneins. 0
41688 302911 cl19548 DUF515 Protein of unknown function (DUF515). Family of hypothetical Archaeal proteins. 0
41689 418601 cl19549 DUF505 Protein of unknown function (DUF505). Family of uncharacterized prokaryotic proteins. 0
41690 418602 cl19551 Monooxygenase_B Monooxygenase subunit B protein. Ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus can each grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria. 0
41691 418603 cl19557 DUF1016 Protein of unknown function (DUF1016). Family of uncharacterized proteins found in viruses, archaea and bacteria. 0
41692 418604 cl19561 DNA_circ_N DNA circularisation protein N-terminus. This family represents the N-terminus (approximately 100 residues) of a number of phage DNA circularisation proteins. 0
41693 418605 cl19562 PAS_5 PAS domain. This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. This region is distantly similar to other PAS domains. 0
41694 418606 cl19566 RE_Alw26IDE Type II restriction endonuclease (RE_Alw26IDE). Members of this family are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of three members are GGTCTC, CGTCTC, and the shared subsequence GTCTC. [DNA metabolism, Restriction/modification] 0
41695 418607 cl19567 DSL Delta serrate ligand. 0
41696 418608 cl19568 wnt wnt family. Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. 0
41697 388595 cl19569 VPS9 Vacuolar sorting protein 9 (VPS9) domain. Domain present in yeast vacuolar sorting protein 9 and other proteins. 0
41698 418609 cl19573 Pyr_excise Pyrimidine dimer DNA glycosylase. Members of this protein family are found in a small number of taxonomically well separated species, yet are strongly conserved, suggesting lateral gene transfer. Members are found in Treponema denticola, Clostridium acetobutylicum, and several of the Firmicutes. The function of this protein is unknown. [Hypothetical proteins, Conserved] 0
41699 418610 cl19574 Salt_tol_Pase Glucosylglycerol-phosphate phosphatase (Salt_tol_Pase). Proteins in this family are glucosylglycerol-phosphate phosphatase, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol. 0
41700 418611 cl19575 HrpB4 Bacterial type III secretion protein (HrpB4). Genes of this family are always found in type III secretion operons in a limited number of species including Burkholderia, Xanthomonas and Ralstonia. 0
41701 327591 cl19576 HrpB1_HrpK Bacterial type III secretion protein (HrpB1_HrpK). This gene is found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia. 0
41702 418612 cl19579 Peptidase_U4 Sporulation factor SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria. [Cellular processes, Sporulation and germination] 0
41703 354443 cl19580 pip_yhgE_Nterm YhgE/Pip N-terminal domain. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This model represents the conserved N-terminal domain. 0
41704 267934 cl19581 COG4008 Predicted metal-binding transcription factor, methanogenesis marker domain 9 [Transcription]. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins. 0
41705 418613 cl19585 Caud_tail_N Caudoviral major tail protein N-terminus. tail protein 0
41706 418614 cl19592 Zn_ribbon_recom Recombinase zinc beta ribbon domain. This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F. This family also includes zinc fingers in recombinase proteins. 0
41707 418615 cl19596 Peptidase_M29 Thermophilic metalloprotease (M29). 0
41708 302938 cl19597 SPAN Surface presentation of antigens protein. Surface presentation of antigens protein (SPAN), also known as invasion protein invJ, is a Salmonella secretory pathway protein involved in presentation of determinants required for mammalian host cell invasion. 0
41709 302951 cl19613 IpgD Enterobacterial virulence protein IpgD. This family consists of several enterobacterial IpgD like virulence factor proteins. In the Gram-negative pathogen Shigella flexneri, the virulence factor IpgD is translocated directly into eukaryotic cells and acts as a potent inositol 4-phosphatase that specifically dephosphorylates phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P(2)] into phosphatidylinositol 5-monophosphate [PtdIns(5)P] that then accumulates. Transformation of PtdIns(4,5)P(2) into PtdIns(5)P by IpgD is responsible for dramatic morphological changes of the host cell, leading to a decrease in membrane tether force associated with membrane blebbing and actin filament remodelling. 0
41710 418616 cl19614 Phage_term_smal Phage small terminase subunit. terminase endonuclease subunit; Provisional 0
41711 418617 cl19619 FBPase_2 Firmicute fructose-1,6-bisphosphatase. This family consists of several bacterial fructose-1,6-bisphosphatase proteins (EC:3.1.3.11) which seem to be specific to phylum Firmicutes. Fructose-1,6-bisphosphatase (FBPase) is a well known enzyme involved in gluconeogenesis. This family does not seem to be structurally related to pfam00316. 0
41712 418618 cl19620 Acetone_carb_G Acetone carboxylase gamma subunit. Acetone carboxylase is the key enzyme of bacterial acetone metabolism, catalyzing the condensation of acetone and CO(2) to form acetoacetate. 0
41713 418619 cl19622 Plasmid_RAQPRD Plasmid protein of unknown function (Plasmid_RAQPRD). This model represents a small family of proteins about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions] 0
41714 418620 cl19623 ArsR ArsR transcriptional regulator. Members of this family of archaeal proteins are conserved transcriptional regulators belonging to the ArsR family. 0
41715 418621 cl19625 DUF2314 Uncharacterized protein conserved in bacteria (DUF2314). This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined. 0
41716 418622 cl19626 DUF2321 Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function. 0
41717 418623 cl19627 GlnD_UR_UTase GlnD PII-uridylyltransferase. This domain is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains. 0
41718 418624 cl19633 DUF2799 Protein of unknown function (DUF2799). lipoprotein; Provisional 0
41719 418625 cl19646 DUF4138 Domain of unknown function (DUF4138). Members of this family are the TraN protein encoded by transfer region genes of conjugative transposons of Bacteroides. The family is related to conjugative transfer proteins VirB9 and TrbG of Agrobacterium Ti plasmids. [Cellular processes, DNA transformation] 0
41720 388610 cl19720 DUF4370 Domain of unknown function (DUF4370). Uncharacterized protein At1g47420 0
41721 418626 cl19721 CAAD CAAD domains of cyanobacterial aminoacyl-tRNA synthetase. photosystem I P subunit (PSI-P) 0
41722 418627 cl19726 DUF2884 Protein of unknown function (DUF2884). hypothetical protein; Provisional 0
41723 418628 cl19727 DUF1451 Zinc-ribbon containing domain. This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein. 0
41724 418629 cl19728 TraT Enterobacterial TraT complement resistance protein. The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between an Escherichia coli and its surroundings. 0
41725 354453 cl19729 COG2888 Predicted RNA-binding protein involved in translation, contains Zn-ribbon domain, DUF1610 family [General function prediction only]. putative Zn-ribbon RNA-binding protein; Provisional 0
41726 388615 cl19730 Wzz Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases. 0
41727 418630 cl19731 SipA Salmonella invasion protein A. Salmonella invasion protein A is an actin-binding protein that contributes to host cytoskeletal rearrangements by stimulating actin polymerization and counteracting F-actin destabilizing proteins. Members of this family possess an all-helical fold consisting of eight alpha-helices arranged so that six long, amphipathic helices form a compact fold that surrounds a final, predominantly hydrophobic helix in the middle of the molecule. 0
41728 418631 cl19736 NlpE NlpE N-terminal domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes. 0
41729 418632 cl19737 DUF979 Protein of unknown function (DUF979). This family consists of several putative bacterial membrane proteins. The function of this family is unclear. 0
41730 418633 cl19739 DUF1131 Protein of unknown function (DUF1131). RpoE-regulated lipoprotein; Provisional 0
41731 418634 cl19740 DUF1272 Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown. 0
41732 418636 cl19744 zf-UBR Putative zinc finger in N-recognin (UBR box). Domain is involved in recognition of N-end rule substrates in yeast Ubr1p 0
41733 418637 cl19745 Ins145_P3_rec Inositol 1,4,5-trisphosphate/ryanodine receptor. This domain corresponds to the ligand binding region on inositol 1,4,5-trisphosphate receptor, and the N terminal region of the ryanodine receptor. Both receptors are involved in Ca2+ release. They can couple to the activation of neurotransmitter-gated receptors and voltage-gated Ca2+ channels on the plasma membrane, thus allowing the endoplasmic reticulum to discriminate between different types of neuronal activity. 0
41734 418638 cl19746 GDNF GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity. 0
41735 418639 cl19747 BetaGal_dom2 Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In penicillin spp this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members. 0
41736 388627 cl19751 bPH_4 Bacterial PH domain. This family of proteins with unknown function appear to be related to bacterial PH domains. This family was formerly known as DUF2679. 0
41737 418642 cl19752 DUF2145 Uncharacterized protein conserved in bacteria (DUF2145). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
41738 418643 cl19753 DUF2154 Cell wall-active antibiotics response 4TMS YvqF. 0
41739 276272 cl19755 heavy_Cys_CGP heavy-Cys/CGP-CTERM domain protein. In this domain of about 50 residues, eight of twelve invariant residues are Cys. Proteins with this domain tend to have N-terminal signal sequences, suggesting an extracytoplasmic location for this domain. 0
41740 418644 cl19756 I_LWEQ I/LWEQ domain. Thought to possess an F-actin binding function. 0
41741 418645 cl19758 FH2 Formin Homology 2 Domain. FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain. 0
41742 418646 cl19760 IBR IBR domain, a half RING-finger domain. The domain occurs between pairs of RING fingers. 0
41743 418647 cl19763 BOP1NT BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins. 0
41744 418648 cl19764 COG6 Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation. 0
41745 418649 cl19765 mTERF mTERF. MOC1-like protein; Provisional 0
41746 418650 cl19816 Hydantoinase_B Hydantoinase B/oxoprolinase. This family includes N-methylhydantoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase EC:3.5.2.9 which catalyzes the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to pfam01968. 0
41747 418651 cl19817 UreF UreF. This family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyzes urea into ammonia and carbamic acid. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein. 0
41748 418652 cl19818 DUF106 Integral membrane protein DUF106. This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. 0
41749 388639 cl19819 DUF499 Protein of unknown function (DUF499). Family of uncharacterized hypothetical prokaryotic proteins. 0
41750 418653 cl19820 DUF530 Protein of unknown function (DUF530). Family of hypothetical archaeal proteins. 0
41751 388641 cl19821 DUF166 Domain of unknown function. This family catalyzes the synthesis of thymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP). The physiological co-substrate has not yet been identified. Previous designation of this family as being thymidylate synthase from one paper, PMID:10436953, has been shown to be erroneous. The proteins are uncharacterized. 0
41752 418654 cl19822 DUF362 Domain of unknown function (DUF362). Domain that is sometimes present in iron-sulphur proteins. 0
41753 418655 cl19823 GtrA GtrA-like protein. Members of this family are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family are a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyzes the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b). This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose. 0
41754 418656 cl19824 Alpha-E A predicted alpha-helical domain with a conserved ER motif. An uncharacterized alpha helical domain containing a highly conserved ER motif and typically found as a tandem duplication. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system comprising a transglutaminase, a peptidase of the NTN-hydrolase superfamily, active and inactive circularly permuted ATP-grasp domains and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase domain. 0
41755 418657 cl19825 Zn_peptidase Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases (Prosite:PDOC00129). 0
41756 418658 cl19826 FmdA_AmdA Acetamidase/Formamidase family. This family includes amidohydrolases of formamide EC:3.5.1.49 and acetamide. Methylophilus methylotrophus FmdA forms a homotrimer suggesting all the members of this family also do. 0
41757 418659 cl19828 DUF2309 Uncharacterized protein conserved in bacteria (DUF2309). Members of this family of hypothetical bacterial proteins have no known function. 0
41758 418660 cl19829 DUF333 Domain of unknown function (DUF333). This small domain of about 70 residues is found in a number of bacterial proteins. It is found at the N-terminus of the AF_1947 protein. The proteins containing this domain are uncharacterized. 0
41759 418661 cl19830 PilN Fimbrial assembly protein (PilN). 0
41760 418662 cl19831 PilP Pilus assembly protein, PilP. The PilP family are periplasmic proteins involved in the biogenesis of type IV pili. 0
41761 418663 cl19832 DIT1_PvcA Pyoverdine/dityrosine biosynthesis protein. DIT1 is involved in synthesising dityrosine. Dityrosine is a sporulation-specific component of the yeast ascospore wall that is essential for the resistance of the spores to adverse environmental conditions. Pyoverdine biosynthesis protein PvcA is involved in the biosynthesis of pyoverdine, a cyclized isocyano derivative of tyrosine. It has a modified Rossmann fold. 0
41762 418664 cl19833 HTH_42 Winged helix DNA-binding domain. This family contains two copies of a winged helix domain. 0
41763 418665 cl19834 DUF2066 Uncharacterized protein conserved in bacteria (DUF2066). This domain, found in various prokaryotic proteins, has no known function. 0
41764 418666 cl19836 DUF2072 Zn-ribbon containing protein. This archaeal protein has no known function. 0
41765 418667 cl19837 DUF790 Protein of unknown function (DUF790). This family consists of several hypothetical archaeal proteins of unknown function. 0
41766 418668 cl19838 DUF4190 Domain of unknown function (DUF4190). Family of uncharacterized proteins found in bacteria and archaea. 0
41767 418669 cl19839 TfuA TfuA-like protein. This family consists of a group of sequences that are similar to a region of TfuA protein. This protein is involved in the production of trifolitoxin (TFX), a gene-encoded, post-translationally modified peptide antibiotic. The role of TfuA in TFX synthesis is unknown, and it may be involved in other cellular processes. 0
41768 418670 cl19841 Glyco_hydro_125 Metal-independent alpha-mannosidase (GH125). This family, which contains bacterial and fungal glycoside hydrolases, is also known as GH125. They function as metal-independent alpha-mannosidases, with specificity for alpha-1,6-linked non-reducing terminal mannose residues. Structurally this family is part of the 6 hairpin glycosidase superfamily. 0
41769 418671 cl19842 DUF2213 Uncharacterized protein conserved in bacteria (DUF2213). Members of this family of bacterial proteins comprise various hypothetical and phage-related proteins. The exact function of these proteins has not, as yet, been determined. 0
41770 418672 cl19843 DUF871 Bacterial protein of unknown function (DUF871). This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown. 0
41771 418673 cl19844 Metal_hydrol Predicted metal-dependent hydrolase. Members of this family of proteins comprise various bacterial transition metal-dependent hydrolases. 0
41772 418674 cl19845 NAGPA Phosphodiester glycosidase. This is a family conserved from bacteria to humans. The structure of a member from Bacteroides has been crystallized and modelled onto the luminal region of the human member of the family, the transmembrane glycoprotein N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase. There is some conservation of potentially functional residues, implying that in the bacterial members this family acts in some way as a phosphodiester glycosidase. The human protein is also present, so the eukaryotic members are likely to be catalyzing the second step in the formation of the mannose 6-phosphate targeting signal on lysosomal enzyme oligosaccharides. 0
41773 418675 cl19846 DGOK 2-keto-3-deoxy-galactonokinase. 2-keto-3-deoxy-galactonokinase EC:2.7.1.58 catalyzes the second step in D-galactonate degradation. 0
41774 418676 cl19847 DUF1285 Protein of unknown function (DUF1285). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. The structures revealed a conserved core with domain duplication and a superficial similarity of the C-terminal domain to pleckstrin homology-like folds. The conservation of the domain-interface indicates a potential binding site that is likely to involve a nucleotide-based ligand, with genome-context and gene-fusion analyses additionally supporting a role for this family in signal transduction, possibly during oxidative stress. 0
41775 418677 cl19849 DUF881 Bacterial protein of unknown function (DUF881). This family consists of a series of hypothetical bacterial proteins. One of the family members YlxW from Bacillus subtilis is thought to be involved in cell division and sporulation. 0
41776 418678 cl19850 Virulence_RhuM Virulence protein RhuM family. There are currently no experimental data for members of this group or their homologs. However, these proteins are implicated in virulence/pathogenicity because RhuM is encoded in the SPI-3 pathogenicity island in Salmonella typhimurium. 0
41777 418679 cl19851 DUF1152 Protein of unknown function (DUF1152). This family consists of several hypothetical archaeal proteins of unknown function. 0
41778 418680 cl19852 DUF2110 Uncharacterized protein conserved in archaea (DUF2110). This domain, found in various hypothetical archaeal proteins, has no known function. 0
41779 418681 cl19853 DUF2117 Uncharacterized protein conserved in archaea (DUF2117). This domain, found in various hypothetical archaeal proteins, has no known function. 0
41780 418682 cl19854 DUF1002 Protein of unknown function (DUF1002). This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria. 0
41781 418683 cl19855 DUF1501 Protein of unknown function (DUF1501). This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long. 0
41782 418684 cl19857 DUF2126 Putative amidoligase enzyme (DUF2126). Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown, but they are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes). 0
41783 418685 cl19858 DUF1015 Protein of unknown function (DUF1015). Family of proteins with unknown function found in archaea and bacteria. 0
41784 418686 cl19860 DUF2252 Uncharacterized protein conserved in bacteria (DUF2252). This domain, found in various hypothetical bacterial proteins, has no known function. 0
41785 418687 cl19861 zf-CHY CHY zinc finger. This family of domains are likely to bind to zinc ions. They contain many conserved cysteine and histidine residues. We have named this domain after the N-terminal motif CXHY. This domain can be found in isolation in some proteins, but is also often associated with pfam00097. One of the proteins in this family is a mitochondrial intermembrane space protein called Hot13. This protein is involved in the assembly of small TIM complexes. 0
41786 418688 cl19863 DUF935 Protein of unknown function (DUF935). This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein. 0
41787 418689 cl19864 Mu-like_Pro Mu-like prophage I protein. Members of this family of proteins comprise various viral Mu-like prophage I proteins. 0
41788 418690 cl19866 SrfB Virulence factor SrfB. This family includes homologs of SrfB. SsrAB is a two-component regulatory system encoded within the Salmonella pathogenicity island SPI-2. Among the products of genes activated by SsrAB within epithelial and macrophage cells is Salmonella typhimurium srfB. Homologs are found in several other proteobacteria. 0
41789 418691 cl19867 Virul_Fac Putative bacterial virulence factor. Members of this family of prokaryotic proteins include various putative virulence factor effector proteins. Their exact function is, as yet, unknown. 0
41790 418692 cl19868 DUF1054 Protein of unknown function (DUF1054). This family consists of several hypothetical bacterial proteins of unknown function. 0
41791 418693 cl19870 DUF2135 Uncharacterized protein conserved in bacteria (DUF2135). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
41792 303050 cl19871 DUF1646 Protein of unknown function (DUF1646). Some of the members of this family are hypothetical bacterial and archaeal proteins, but others are annotated as being cation transporters expressed by the archaebacterium Methanosarcina mazei. 0
41793 418694 cl19872 DUF885 Bacterial protein of unknown function (DUF885). This family consists of several hypothetical bacterial proteins several of which are putative membrane proteins. 0
41794 418695 cl19873 DUF2179 Uncharacterized protein conserved in bacteria (DUF2179). hypothetical protein; Provisional 0
41795 418696 cl19874 DUF2183 Uncharacterized conserved protein (DUF2183). This domain, found in various hypothetical bacterial proteins, has no known function. 0
41796 388683 cl19875 AbiEi_2 Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_2 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 0
41797 388684 cl19876 DUF2192 Uncharacterized protein conserved in archaea (DUF2192). This domain, found in various hypothetical archaeal proteins, has no known function. 0
41798 418697 cl19878 DUF2207 Predicted membrane protein (DUF2207). The majority of the proteins with a domain as described by this model have an extreme C-terminal sequence that consists of extremely low-complexity sequence, rich in Ser or in Gly interspersed with Cys. That C-terminal region resembles ribosomal natural product precursors, although there is no evidence that C-terminal regions of these proteins undergo any modification or have any such function. 0
41799 418698 cl19879 PocR Sensory domain found in PocR. PocR, a ligand binding domain, has a novel variant of the PAS-like Fold. Evidence suggests that it binds small hydrocarbon derivatives such as 1,3-propanediol. In (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter) 0
41800 418699 cl19880 ROS_MUCR ROS/MUCR transcriptional regulator protein. This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti. This gene encodes a 15.5-kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid and is necessary for succinoglycan production. Sinorhizobium meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of alfalfa nodules. MucR from Sinorhizobium meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan. 0
41801 418700 cl19881 YEATS YEATS family. We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity 0
41802 418701 cl19882 TB2_DP1_HVA22 TB2/DP1, HVA22 family. This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein. 0
41803 418702 cl19883 ERO1 Endoplasmic Reticulum Oxidoreductin 1 (ERO1). Members of this family are required for the formation of disulphide bonds in the ER. 0
41804 418703 cl19885 Sec6 Exocyst complex component Sec6. Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localize to regions of active exocytosis-at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis-in an F-actin-dependent and exocytosis-independent manner. 0
41805 418704 cl19886 Cnd2 Condensin complex subunit 2. This family consists of several Barren protein homologs from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologs in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localizes to chromatin throughout mitosis. Colocalization and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase. This family forms one of the three non-structural maintenance of chromosomes (SMC) subunits of the mitotic condensation complex along with Cnd1 and Cnd3. 0
41806 418705 cl19887 TFCD_C Tubulin folding cofactor D C terminal. This domain family is found in eukaryotes, and is typically between 182 and 199 amino acids in length. The family is found in association with pfam02985. There is a single completely conserved residue R that may be functionally important. Tubulin folding cofactor D does not co-polymerize with microtubules either in vivo or in vitro, but instead modulates microtubule dynamics by sequestering beta-tubulin from GTP-bound alphabeta-heterodimers in microtubules. 0
41807 418706 cl19888 2H-phosphodiest Domain of unknown function (DUF1868). This group of 2H-phosphodiesterases comprises a single family typified by the protein mlr3352 from M.loti. Members are also present in various alpha-proteobacteria, Synechocystis, Streptococcus and Chilo iridescent virus. The presence of a member of this predominantly bacterial group in a large eukaryotic DNA virus represents a potential case of horizontal transfer from a bacterial source into a virus. Several proteins of bacterial origin have been noticed in the insect viruses (L.M.Iyer, E.V.Koonin and L.Aravind, unpublished observations) and these appear to have been acquired from endo-symbiotic or parasitic bacteria that share the same host cells with the viruses. Presence of 2H proteins in the proteomes of large DNA viruses (e.g. T4 57B protein and the Fowl-pox virus FPV025) may point to some role for these proteins in regulating the viral tRNA metabolism. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. 0
41808 418707 cl19890 DHHC DHHC palmitoyltransferase. This entry refers to the DHHC domain, found in DHHC proteins which are palmitoyltransferases. Palmitoylation or, more specifically S-acylation, plays important roles in the regulation of protein localization, stability, and activity. It is a post-translational protein modification that involves the attachment of palmitic acid to Cys residues through a thioester linkage. Protein acyltransferases (PATs), also known as palmitoyltransferases, catalyze this reaction by transferring the palmitoyl group from palmitoyl-CoA to the thiol group of Cys residues. They are characterized by the presence of a 50-residue-long domain called the DHHC domain, which in most but not all cases is also cysteine-rich and gets its name from a highly conserved DHHC signature tetrapeptide (Asp-His-His-Cys). The Cys residue within the DHHC domain forms a stable acyl intermediate and transfers the acyl chain to the Cys residues of a target protein. Some proteins containing a DHHC domain include Drosophila DNZ1 protein, Mouse Abl-philin 2 (Aph2) protein, Mammalian ZDHHC9, Yeast ankyrin repeat-containing protein AKR1, Yeast Erf2 protein, and Arabidopsis thaliana tip growth defective 1. 0
41809 418708 cl19894 Pilus_CpaD Pilus biogenesis CpaD protein (pilus_cpaD). This family consists of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function is not known. [Cell envelope, Surface structures] 0
41810 418709 cl19895 DUF2182 Predicted metal-binding integral membrane protein (DUF2182). This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function. 0
41811 418710 cl19897 DUF2428 Putative death-receptor fusion protein (DUF2428). This is a family of proteins conserved from plants to humans. The function is not known. Several members have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these could be confirmed. 0
41812 418711 cl19898 Noc2 Noc2p family. At least one member, Noc2p from yeast, is required for a late step in 60S subunit export from the nucleus. It has also been shown to co-precipitate with Nug1p, a nuclear GTPase also required for ribosome nucleus export. This family was formerly known as UPF0120. 0
41813 418712 cl19900 HECT_2 HECT-like Ubiquitin-conjugating enzyme (E2)-binding. HECT_2 is a family of UbcH10-binding proteins. 0
41814 418713 cl19902 protein_MS5 Protein MS5. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL. 0
41815 418714 cl19905 TrwC TrwC relaxase. This domain is in the N-terminal (relaxase) region of TrwC, a relaxase-helicase that acts in plasmid R388 conjugation. The relaxase domain has DNA cleavage and strand transfer activities. Plasmid transfer protein TraI is also a member of this domain family. Members of this family on bacterial chromosomes typically are found near other genes typical of conjugative plasmids and appear to mark integrated plasmids. [Mobile and extrachromosomal element functions, Plasmid functions] 0
41816 418715 cl19906 Spore_GerAC Spore germination B3/ GerAC like, C-terminal. Members of this protein family are restricted to endospore-forming members of the Firmicutes lineage of bacteria, including the genera Bacillus, Clostridium, Thermoanaerobacter, Carboxydothermus, etc. Members are nearly all predicted lipoproteins and belong to probable transport operons, some of which have been characterized as crucial to germination in response to alanine. Members typically have been given gene symbols gerKC, gerAC, gerYC, etc. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Sporulation and germination] 0
41817 418716 cl19907 ImpA_N ImpA, N-terminal, type VI secretion system. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). 0
41818 388705 cl19908 alpha-hel2 Alpha-helical domain 2. A novel genetic system characterized by seven (usually) major proteins, including a ParB homolog and a ThiF homolog, is commonly found on plasmids or in bacterial chromosomal regions near phage, plasmid, or transposon markers. It is most common among the beta Proteobacteria. We designate the system PRTRC, or ParB-Related,ThiF-Related Cassette. This protein family is designated protein F. It is the most divergent of the families. 0
41819 418717 cl19911 CBM_4_9 Carbohydrate binding domain. This family represents a duplicated conserved region found in a number of uncharacterized plant proteins, potentially in the stem. There is a conserved CGP sequence motif. 0
41820 418718 cl19912 DNA_pol3_delta DNA polymerase III, delta subunit. hypothetical protein; Provisional 0
41821 418719 cl19913 Peptidase_U49 Peptidase U49. phage exclusion protein Lit; Provisional 0
41822 418720 cl19916 ISG65-75 Invariant surface glycoprotein. 65 kDa invariant surface glycoprotein; Provisional 0
41823 418721 cl19922 FAD_binding_4 FAD binding domain. This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidizes the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan. 0
41824 418724 cl19929 A2M_N MG2 domain. This family includes a region of the alpha-2-macroglobulin family. 0
41825 418725 cl19932 OTU OTU-like cysteine protease. This family of proteins conserved from plants to humans is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryote being a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumor domain) in which there is an active cysteine protease triad (ii) a nuclear localization signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif. 0
41826 418726 cl19935 Gp_dh_C Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase. 0
41827 418729 cl19950 Sec3_C Exocyst complex component Sec3. Vps52 complexes with Vps53 and Vps54 to form a multi- subunit complex involved in regulating membrane trafficking events. 0
41828 418730 cl19952 Gly_transf_sug Glycosyltransferase sugar-binding region containing DXD motif. This domain represents the N-terminal glycosyltransferase from a set of toxins found in some bacteria. This domain in TcdB glycosylates the host RhoA protein. 0
41829 418731 cl19976 7tm_7 7tm Chemosensory receptor. In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose. 0
41830 418746 cl20010 Consortin_C Consortin C-terminus. This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 161 amino acids in length. 0
41831 418788 cl20168 Kazal_3 Kazal-type serine protease inhibitor domain. Kazal domain found in factor I-like modules (FIMs) region on the carboxyl-terminal of complement component C7 proteins. Complement component C7 is a subunit of the membrane attack complex (MAC), a fundamental machinery in the mammalian innate immunity. KAZAL domains are common in serine protease inhibitors. 0
41832 418789 cl20183 DUF4810 Domain of unknown function (DUF4810). This family of proteins is found in bacteria. Proteins in this family are typically between 117 and 134 amino acids in length. There is a conserved PES sequence motif. It is a putative lipoprotein. 0
41833 418790 cl20192 DUF4915 Domain of unknown function (DUF4915). This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown. [Hypothetical proteins, Conserved] 0
41834 418792 cl20210 LTD Lamin Tail Domain. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 433 amino acids in length. There is a conserved NNS sequence motif. 0
41835 418796 cl20221 NSP3_rotavirus rotavirus non-structural protein 3 (NSP3). This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains; a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerization and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein. 0
41836 418798 cl20224 zf-MYND MYND finger. zf-C6H2 is an unusual zinc-finger similar to zf-MYND, pfam01753. This zinc-finger is found at the N-terminus of Pfam families Exo_endo_phos pfam03372 and Peptidase_M24 pfam00557. The domain is missing in prokaryotic methionine aminopeptidases, and is a unique type of zinc-finger domain. It consists of a C2-C2 zinc-finger motif similar to the RING finger family followed by a C2H2 motif similar to zinc-fingers involved in RNA-binding. In yeast the domain chelates zinc in a 2:1 ratio. The domain is found in yeast, plants and mammals. The domain is necessary for the association of the methionine aminopeptidase with the ribosome and the normal processing of the peptidase. 0
41837 418800 cl20226 TIL trypsin inhibitor-like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. 0
41838 418844 cl20325 UPF0236 Uncharacterized protein family (UPF0236). 0
41839 418849 cl20331 SseB SseB protein N-terminal domain. Members of this family occur almost exclusively in the genus Streptomyces, in the context of type VII secretion systems (T7SS). Several paralogs may accompany a single T7SS. A few members of this family are large proteins with additional domains that add or remove, ADP-ribosylations, suggesting that all family members may have effector activity as well, and that the longer members of the family are multifunctional effector proteins. 0
41840 418856 cl20343 Glyco_hydro_38C Glycosyl hydrolases family 38 C-terminal domain. This family consists of Glycosyl hydrolase family 38 proteins around 700 residues in length and is mainly found in various Clostridium and Rhizobium species. The function of this family is unknown. 0
41841 418913 cl20439 RNase_Zc3h12a Zc3h12a-like Ribonuclease NYN domain. PRORPs (protein-only RNase P) are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. Arabidopsis thaliana contains PRORP enzymes (PRORP1, PRORP2 and PRORP3) where PRORP1 localizes to mitochondria as well as chloroplasts, while PRORP2 and PRORP3 are found in the nucleus. In humans and most other metazoans, mt-RNase P is composed of three protein subunits (mitochondrial RNase P proteins 1-3; MRPP1-3), homologs to the Arabidopsis thaliana PRORP1-3. This domain corresponds to the metallonuclease domain of PRORPs. PRORP1 has 22% sequence identity to the human homolog MRPP3. PRORP1 crystal structure shows a V-shaped tripartite structure with a C-terminal metallonuclease domain of the NYN (N4BL1, YacP-like nuclease) family, with a typical and functional two-metal-ion catalytic site that has conserved aspartate residues. 0
41842 418935 cl20473 ANAPC5 Anaphase-promoting complex subunit 5. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This model represents the Tetratricopeptide repeat (TPR)-like motif region of Apc5. 0
41843 418984 cl20541 EFG_III-like Domain III of Elongation factor G (EF-G) and related proteins. This domain is found in Elongation Factor G. It shares a similar structure with domain V (pfam00679). 0
41844 418994 cl20555 TetR_C_29 Tetracyclin repressor-like, C-terminal domain. This family comprises proteins that belong to the TetR family of transcriptional regulators. This family features the C-terminal region of these sequences, which does not include the N-terminal helix-turn-helix. 0
41845 419068 cl20644 DUF4976 Domain of unknown function (DUF4976). This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 0
41846 419202 cl20817 GBP_C Guanylate-binding protein, C-terminal domain. IFT20 is subunit 20 of the intraflagellar transport complex B. The intraflagellar transport complex assembles and maintains eukaryotic cilia and flagella. IFT20 is localized to the Golgi complex and is anchored there by the Golgi polypeptide, GMAP210, whereas all other subunits except IFT172 localize to cilia and the peri-basal body or centrosomal region at the base of cilia. IFT20 accompanies Golgi-derived vesicles to the point of exocytosis near the basal bodies where the other IFT polypeptides are present, and where the intact IFT particle is assembled in association with the inner surface of the cell membrane. Passage of the IFT complex then follows, through the flagellar pore recognition site at the transition region, into the ciliary compartment. There also appears to be a role of intraflagellar transport (IFT) polypeptides in the formation of the immune synapse in non ciliated cells. The flagellum, in addition to being a sensory and motile organelle, is also a secretory organelle. A number of IFT components are expressed in haematopoietic cells, which have no cilia, indicating an unexpected role of IFT proteins in immune synapse-assembly and intracellular membrane trafficking in T lymphocytes; this suggests that the immune synapse could represent the functional homolog of the primary cilium in these cells. 0
41847 419207 cl20823 DCAF15-NTD N-terminal domain of DDB1- and CUL4-associated factor 15. DCAFs, Ddb1- and Cul4-associated factors, are substrate receptors for the Cul4-Ddb1 Ubiquitin Ligase. There are 18 different factors, the majority of which are WD40-repeat-proteins. 0
41848 389307 cl20914 humanin humanin and similar peptides. This family of proteins is found exclusively in humans. Humanin is a short anti-apoptotic peptide that interacts with Bax. 0
41849 419594 cl21329 nt01cx_1156_like Uncharacterized proteins conserved in Clostridia. This family of uncharacterized proteins from Clostridia and Bacilli classes has an unusual structure of three beta propeller repeats that do not form a barrel, as in well known 6-, 7- etc beta propeller barrels, but instead are stacked in a three-layer beta-sheet sandwich. The function of all the proteins from this family is unknown. 0
41850 419595 cl21330 CdiA-CT_Ecl_RNase-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Enterobacter cloacae, and similar proteins. Bacterial genomes and plasmids encode a variety of peptide and protein toxins that mediate inter-bacterial competition. Bacteriocins are diffusible proteins that parasitize cell-envelope proteins to enter and kill bacteria. Contact-dependent growth inhibition (CDI) is one mechanism of inter-bacterial competition. Novel Toxin 21 (alternatively 16S rRNA endonuclease CdiA) belongs to a family of prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. This RNase toxin found in bacterial polymorphic toxin systems, is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, with two conserved lysine residues and [DS]xDxxxH, RxG[ST] and RxxD motifs. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 4, type 5 or type 7 secretion systems. This is also referred to as the E. cloacae CdiAC. The CdiAC proteins carry a variety of sequence-diverse C-terminal domains, which represent a collection of distinct toxins. Many CdiA-CT toxins have nuclease activities. In accord with the structural homology, CdiA-CT cleaves 16S rRNA at the same site as colicin E3 and this nuclease activity is responsible for growth inhibition. 0
41851 419639 cl21406 CdiA-CT_Ec_tRNase C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Escherichia coli 563, and similar proteins. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and[DN]HxxE motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5 or type 7 secretion system. 0
41852 419665 cl21453 PKc_like Protein Kinases, catalytic domain. The KIND domain (kinase non-catalytic C-lobe domain) evolved from a catalytic protein kinase fold and functions as an interaction domain. In SPIRE1 (protein spire homolog 1) this domain interacts with FMN2 (formin-2). 0
41853 419666 cl21454 NADB_Rossmann Rossmann-fold NAD(P)(+)-binding proteins. This entry is the rossmann domain found in the Xanthine dehydrogenase accessory protein. 0
41854 419667 cl21456 Periplasmic_Binding_Protein_Type_2 Type 2 periplasmic binding fold superfamily. This domain is often found in association with the helix-turn-helix domain HTH_41 (pfam14502). It includes YhfZ proteins from Escherichia coli and Shigella flexneri. 0
41855 419668 cl21457 TIM TIM-like beta/alpha barrel domains. This domain includes the enzyme Phosphoenolpyruvate phosphomutase (EC:5.4.2.9). This protein has been characterized as catalyzing the formation of a carbon-phosphorus bond by converting phosphoenolpyruvate (PEP) to phosphonopyruvate (P-Pyr). This enzyme has a TIM barrel fold. 0
41856 419669 cl21459 HTH Helix-turn-helix domains. This winged helix-turn-helix domain contains an extended C-terminal alpha helix which is responsible for dimerization of this domain. 0
41857 419670 cl21460 HAD_like Haloacid Dehalogenase-like Hydrolases. This family is part of the HAD superfamily. 0
41858 419671 cl21461 Globin-like Globin-like protein superfamily. This family includes protoglobin from Methanosarcina acetivorans C2A. It is also found near the N-terminus of the Haem-based aerotactic transducer HemAT in Bacillus subtilis. It is part of the haemoglobin superfamily. Protoglobin has specific loops and an amino-terminal extension which leads to the burying of the haem within the matrix of the protein. Protoglobin-specific apolar tunnels allow the access of O2, CO and NO to the haem distal site. In HemAT it acts as an oxygen sensor domain. 0
41859 419672 cl21462 bZIP Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain. This domain is found at the C-terminus of ABC transporters. It has a coiled coil structure with an atypical 3(10)-helix in the alpha-hairpin region. It is involved in DNA_binding. 0
41860 419673 cl21463 UBA_like_SF UBA domain-like superfamily. EDD, the ER ubiquitin ligase from the HECT ligases, contains an N-terminal ubiquitin-associated domain which binds ubiquitin. Ubiquitin is recognized by helices alpha-1 and -3 in the UBA domain. EDD is involved in DNA damage repair pathways and binds to mono-ubiquitinated proteins. 0
41861 419674 cl21467 Cytochrom_C Cytochrome c. This domain is a heme binding cytochrome known as cytochrome c550, or cytochrome c549, or PsbV. 0
41862 419675 cl21469 HDc N/A. HD domains are metal dependent phosphohydrolases. 0
41863 419676 cl21470 Peptidase_M14NE-CP-C_like Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain. This is the N-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown. 0
41864 419677 cl21471 RAMP_I_III CRISPR/Cas system-associated RAMP superfamily protein. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family, found in at least ten different archaeal and bacterial species, is represented by TM1793 from Thermotoga maritima. 0
41865 419678 cl21473 ArsB_NhaD_permease N/A. CitMHS is a family of putative citrate transporters, belonging to the Na+/H+ antiporter NhaD-like permease superfamily. 0
41866 419679 cl21474 ABC2_membrane ABC-2 type transporter. This is the N-terminal region of 7tm proteins. The function is not known. 0
41867 419680 cl21478 ATP-synt_B ATP synthase B/B' CF(0). This family corresponds to subunit 8 (YMF19) of the F0 complex of plant and algae mitochondrial F-ATPases (EC:3.6.1.34). 0
41868 419681 cl21479 Cas5_I CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum. 0
41869 419682 cl21481 malate_synt N/A. This family of TIM-Barrel fold C-C bond lyase is related to citrate-lyase. These genes are found in the biosynthetic operon, with other enzymatic domains, associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 0
41870 419683 cl21482 RuvC_like Crossover junction endodeoxyribonuclease RuvC and similar proteins. This is the YqgF-like domain of the bacterial Tex protein, which is involved in transcriptional processes. 0
41871 419684 cl21484 Oxidored_q3 NADH-ubiquinone/plastoquinone oxidoreductase chain 6. NADH dehydrogenase subunit 6; Provisional 0
41872 419685 cl21486 Ketoacyl-synt_C Beta-ketoacyl synthase, C-terminal domain. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.41, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria. 0
41873 419686 cl21487 OM_channels N/A. This family includes proteins annotated as TonB dependent receptors. But it is also likely to contain other membrane beta barrel proteins of other functions. 0
41874 419687 cl21488 ECF_trnsprt ECF transporter, substrate-specific component. Members of this protein family have been assigned as thiamine transporters by a phylogenetic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation. 0
41875 419688 cl21491 Transpeptidase Penicillin binding protein transpeptidase domain. This family is closely related to Beta-lactamase, pfam00144, the serine beta-lactamase-like superfamily, which contains the distantly related pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 0
41876 419689 cl21492 PTS_EIIC Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate. 0
41877 419690 cl21493 Complex1_49kDa Respiratory-chain NADH dehydrogenase, 49 Kd subunit. This model represents that clade of F420-dependent hydrogenases (FRH) beta subunits found exclusively and universally in methanogenic archaea. This protein is a member of the Nickel-dependent hydrogenase superfamily represented by Pfam model, pfam00374. 0
41878 419691 cl21494 Abhydrolase alpha/beta hydrolases. This family consists of several chlorophyllase and chlorophyllase-2 (EC:3.1.1.14) enzymes. Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of an ester bond to yield chlorophyllide and phytol. The family includes both plant and Amphioxus members. 0
41879 419692 cl21495 Acyl_transf_3 Acyltransferase family. This domain is found in various hypothetical and OpgC prokaryotic proteins. It is likely to act as an acyltransferase enzyme. 0
41880 419693 cl21496 2OG-FeII_Oxy 2OG-Fe(II) oxygenase superfamily. This family has structural similarity to the 2OG-Fe(II) oxygenase superfamily. 0
41881 419694 cl21497 PAAR_like proline-alanine-alanine-arginine (PAAR) repeat superfamily. This motif is found usually in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins. 0
41882 419695 cl21498 SANT N/A. This domain, approximately 90 residues, is mainly found in DNA methyltransferase 1-associated protein 1 (DAMP1) that plays an important role in development and maintenance of genome integrity in various mammalian species. It mainly consists of tandem repeats of three alpha-helices that are arranged in a helix-turn-helix motif and shows a structural similarity with SANT domain and Myb DNA-binding domain, indicating it contains a putative DNA binding site. 0
41883 354841 cl21499 SPX Domain found in Syg1, Pho81, XPR1, and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The yeast protein Gde1/Ypl110c is similar to both, NUC-2 and Pho81, in sharing their multi-domain architecture, which includes the SPX N-terminal domain followed by several ankyrin repeats and a C-terminal glycerophosphodiester phosphodiesterase domain (GDPD). Gde1 hydrolyzes intracellular glycerophosphocholine into glycerolphosphate and choline, and plays a role in the utilization of glycerophosphocholine as a source for phosphate. 0
41884 419696 cl21502 CTP_transf_1 Cytidylyltransferase family. CDP-archaeol synthase functions in the archaeal lipid biosynthetic pathway. It catalyzes the transfer of the nucleotide to its specific archaeal lipid substrate, leading to the formation of a CDP-activated precursor (CDP-archaeol) to which polar head groups are attached. Bacterial members of this family are uncharacterized. 0
41885 419697 cl21503 ParE_toxin ParE toxin of type II toxin-antitoxin system, parDE. YafQ is a family of bacterial toxin ribonucleases of type II toxin-antitoxin systems. The E.coli gene is expressed from the dinB operon. The cognate antitoxin for the E. coli protein is DinJ, in family RelB_antitoxin, pfam02604. 0
41886 419698 cl21504 EGF_CA N/A. This short domain on coagulation enzyme factor Xa is found to be the target for a potent inhibitor of coagulation, TAK-442. 0
41887 419699 cl21506 DinB_2 DinB superfamily. This domain is found in MSMEG_5817 gene product from M. smegmatis. It has been shown to be vital for mycobacterial survival within host macrophages. Crystal structure revealed a Rossmann-like fold alpha/beta two-layer sandwich forming a highly hydrophobic interface cavity and with high structural homology to the SCP family. Hence, it has been suggested that this domain may be involved in the interaction of apolar ligands through its hydrophobic cavity. Alanine-scanning mutagenesis of the hydrophobic cavity of MSMEG_5817 protein demonstrated that the conserved Val82 residue plays an important role in ligand binding. 0
41888 419700 cl21508 Ribosomal_P1_P2_L12p N/A. This family includes archaebacterial L12, eukaryotic P0, P1 and P2. 0
41889 389780 cl21509 ApoLp-III_like Apolipophorin-III and similar insect proteins. This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage. 0
41890 419701 cl21511 PEMT Phospholipid methyltransferase. This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long. 0
41891 419702 cl21513 NrfD Polysulphide reductase, NrfD. The terminal electron transfer enzyme Me2SO reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC). 0
41892 419703 cl21514 TauE Sulfite exporter TauE/SafE. High affinity nickel transporters involved in the incorporation of nickel into H2-uptake hydrogenase and urease enzymes. Essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Alcaligenes eutrophus is thought to be an integral membrane protein with seven transmembrane helices. The family also includes a cobalt transporter. 0
41893 419704 cl21515 GAF GAF domain. SpoVT_C is the C-terminal part of the stage V sporulation protein T, a transcription factor involved in endospore formation in Gram-positive bacteria such as Bacillus subtilis. Sporulation is induced by conditions of environmental stress to protect the genome. SpoVT behaves as a tetramer that shows an overall significant distortion mediated by electrostatic interactions. Two monomers dimerize via the highly charged N-terminal AbrB-like domains, family pfam04014, to form swapped-hairpin beta-barrels. These asymmetric dimers then form tetramers through the formation of mixed helix bundles between their C-terminal domains. The C-termini themselves fold as GAF (cGMP-specific and cGMP-stimulated phosphodiesterases, Anabaena adenylate cyclases, and Escherichia coli FhlA) domains. 0
41894 419705 cl21516 Csx1_III-U CRISPR/Cas system-associated protein Csx1. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown. 0
41895 419706 cl21519 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes a conserved region of about 65 amino acids from an otherwise highly divergent protein found in a minority of CRISPR-associated protein regions. This region features two motifs of CXXC. 0
41896 419707 cl21521 PEPcase Phosphoenolpyruvate carboxylase. This family of phosphoenolpyruvate carboxylases is based on sequences not picked up by the model for PEPcase, PF00311. Most of the family members are from Archaea. 0
41897 419708 cl21522 FN3 N/A. This fibronectin type III domain is found in fungal chitin biosynthesis protein CHS5 where, together with the neighboring BRCT domain (pfam00533), it binds to the Arf1 GTPase. 0
41898 276343 cl21524 PRK13923 N/A. In a subset of endospore-forming members of the Firmicutes, members of this protein family are found, several to a genome. Two very strongly conserved sequence regions are separated by a highly variable linker region. Much of the linker region was excised from the seed alignment for this model. A characterized member is the prespore-specific transcription RsfA from Bacillus subtilis, previously called YwfN, which is controlled by sigma factor F and seems to fine-tune expression of some genes in the sigma-F regulon. A paralog in Bacillus subtilis is designated YlbO. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination] 0
41899 419709 cl21525 LysM Lysin Motif is a small domain involved in binding peptidoglycan. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known. 0
41900 419710 cl21526 TolB_N TolB amino-terminal domain. This is a family of Gram-negative bacterial outer membrane lipoproteins. LpoB is required for the function of the major peptidoglycan synthase enzyme PBP1B. It interacts with PBP1B protein via the UvrB-like non-catalytic domain on that protein. LpoB has a 54-aa-long flexible N-terminal stretch followed by a globular domain with similarity to the N-terminal domain of the prevalent periplasmic protein TolB. The long, flexible N-terminal region of LpoB enables it to span the periplasm and reach its docking site in PBP1B. Peptidoglycan is the essential polymer within the sacculus that surrounds the cytoplasmic membrane of bacteria. 0
41901 419711 cl21527 DoxX DoxX. This family of uncharacterized proteins are related to DoxX pfam07681. 0
41902 419712 cl21528 Lipocalin Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). 0
41903 277547 cl21530 Dockerin_like Dockerin repeat domains and domains resembling dockerin repeats. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I dockerins, which are responsible for anchoring a variety of enzymatic domains to the complex. 0
41904 419713 cl21531 Sialidase sialidases/neuraminidases. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases. 0
41905 419714 cl21532 NADAR Escherichia coli swarming motility protein YbiA and related proteins. This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 0
41906 419715 cl21533 Cas8a1_I-A CRISPR/Cas system-associated protein Cas8a1. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This minor cas protein is found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial. 0
41907 419716 cl21534 NLPC_P60 NlpC/P60 family. Amidase_YiiX is a family of permuted papain-like amidases. It has amidase specificity for the amide bond between a lipid and an amino acid (or peptide). From the structure, a tetramer, each monomer is made up of a layered alpha-beta fold with a central, 6-stranded, antiparallel beta-sheet that is protected by helices on either side. The catalytic Cys154 in UniProtKB:Q74NK7, Structure 3kw0, is located on the N-terminus of helix alphaF. The two additional helices located above Cys154 contribute to the formation of the active site, where the lysine ligand is bound. 0
41908 419717 cl21536 Rhomboid Rhomboid family. The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.). 0
41909 419718 cl21538 TRAPPC_bet3-like Bet3-like domains of TRAPP. TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localize TRAPP to the Golgi. 0
41910 419719 cl21539 DnaJ_zf Zinc finger domain of DnaJ and HSP40. The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found. 0
41911 419720 cl21540 CopD Copper resistance protein D. It appears this conserved hypothetical integral membrane protein is found only in gram negative bacteria. Completed genomes that include a member of this family include Rickettsia prowazekii, Synechocystis sp. PCC6803, and Helicobacter pylori. These proteins have 3 (Helicobacter pylori) to 5 (Synechocystis sp. PCC 6803) GES predicted transmembrane regions. Most members have 4 GES predicted transmembrane regions. [Hypothetical proteins, Conserved] 0
41912 419721 cl21541 OstA OstA-like protein. This is a family of OstA-like proteins that are related to pfam03968. 0
41913 419722 cl21542 EthD EthD domain. MmlI is a short, approx 115 residue, protein of two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species and tetrameric in others. The known structure shows two copies of the protein form a dimeric alpha beta barrel. 0
41914 419723 cl21543 MMPL MMPL family. Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus. 0
41915 419724 cl21544 FlgD_ig FlgD Ig-like domain. The function of this C-terminal domain is not known; there are several conserved tryptophan and asparagine residues. 0
41916 419725 cl21545 GHB_like Glycoprotein hormone beta chain homologues. This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterized by a generalized hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients. 0
41917 419726 cl21549 rve Integrase core domain. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 0
41918 419727 cl21551 Sulfotransfer_3 Sulfotransferase family. Members of this family are essential for the biosynthesis of sulpholipid-1 in prokaryotes. They adopt a structure that belongs to the sulphotransferase superfamily, consisting of a single domain with a core four-stranded parallel beta-sheet flanked by alpha-helices. 0
41919 419728 cl21552 TPK Thiamine pyrophosphokinase. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggest that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 0
41920 419730 cl21556 DUF2184 Uncharacterized protein conserved in bacteria (DUF2184). The Linocin_M18 is found in eubacteria and archaea. These proteins, referred to as encapsulins, form nanocompartments within the bacterium which contain ferritin-like proteins or peroxidases, enzymes involved in oxidative-stress response. These enzymes are targeted to the interior of encapsulins via unique C-terminal extensions. 0
41921 419731 cl21557 Yip1 Yip1 domain. YIF1 (Yip1 interacting factor) is an integral membrane protein that is required for membrane fusion of ER derived vesicles. It also plays a role in the biogenesis of ER derived COPII transport vesicles. 0
41922 419732 cl21559 HGD-D 2-hydroxyglutaryl-CoA dehydratase, D-component. Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined. 0
41923 419733 cl21560 Ion_trans_2 Ion channel. This family includes the two membrane helix type ion channels found in bacteria. 0
41924 419734 cl21562 DDE_Tnp_4 DDE superfamily endonuclease. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 0
41925 419736 cl21565 LIF_OSM LIF / OSM family. OSM, Oncostatin M 0
41926 354875 cl21566 Sedlin_N Sedlin, N-terminal conserved region. Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses. 0
41927 419737 cl21567 Cyclophil_like Cyclophilin-like. This is a family of bacterial and archaeal proteins, the structure for one of whose members has been characterized. Structure 3kop probably adopts a new hexameric form compared to previous structures. The putative active site is near the domain interface. 3kop is most closely related, structurally, to Structure 1zx8, where the potential active site is located near residues E51 and Y53 (conserved in 1zx8). Beyond the two residues above, the other residues are not conserved. Also the shape of the active site differs from that of 1zx8. Structure 1zx8 belongs to family DUF369, pfam04126, which is part of the cyclophilin-like clan. 0
41928 419738 cl21568 SurA_N_3 SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment. 0
41929 419739 cl21569 Ribosomal_S30 Ribosomal protein S30. 40S ribosomal protein S30; Provisional 0
41930 419740 cl21570 Csx1_III-U CRISPR/Cas system-associated protein Csx1. Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia). 0
41931 419741 cl21572 GatB_Yqey GatB domain. The function of this domain found in the YqeY protein is uncertain. 0
41932 419742 cl21573 B3_4 B3/4 domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. 0
41933 419744 cl21578 PAS N/A. The MEKHLA domain shares similarity with the PAS domain and is found in the 3' end of plant HD-ZIP III homeobox genes, and bacterial proteins. 0
41934 419745 cl21579 CBM_2 Cellulose binding domain. This domain is found at the C terminal of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose. 0
41935 419746 cl21581 FixH FixH. This family consists of several Rhizobium FixH like proteins. It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG. 0
41936 419747 cl21583 DUF4142 Domain of unknown function (DUF4142). Domain found in small family of bacterial secreted proteins with no known function. Also found in Paramecium bursaria chlorella virus 1. This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important. This domain belongs to the ferritin superfamily. It contains two sequence similar repeats each of which is composed of two alpha helices. 0
41937 419748 cl21584 Tryp_SPc N/A. This family represents the catalytic domain of alpha-lytic protease (alpha-LP) and its closely-related homologs. Alpha-lytic protease (EC 3.4.21.12; also called alpha-lytic endopeptidase), originally isolated from the myxobacterium Lysobacter enzymogenes, belongs to the MEROPS peptidase family S1, subfamily S1E (streptogrisin A subfamily). It is synthesized as a pro-enzyme, thus having two domains; the N-terminal pro-domain acts as a foldase, required transiently for the correct folding of the protease domain, and also acts as a potent inhibitor of the mature enzyme, while the C-terminal domain catalyzes the cleavage of peptide bonds. Members of the alpha-lytic protease subfamily include Nocardiopsis alba protease (NAPase), a secreted chymotrypsin from the alkaliphile Cellulomonas bogoriensis, streptogrisins (SPG-A, SPG-B, SPG-C, and SPG-D), and Thermobifida fusca protease A (TFPA). These serine proteases have characteristic kinetic stability, exhibited by their extremely slow unfolding kinetics. The active site, characteristic of serine proteases, contains the catalytic triad consisting of serine acting as a nucleophile, aspartate as an electrophile, and histidine as a base, all required for activity. This model represents the C-terminal catalytic domain of alpha-lytic proteases. 0
41938 419749 cl21588 Snf7 Snf7. SNF-7-like protein; Provisional 0
41939 389828 cl21589 Relaxase Relaxase/Mobilisation nuclease domain. Relaxases/mobilisation proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalyzing trans-esterification. 0
41940 419750 cl21590 PMT_2 Dolichyl-phosphate-mannose-protein mannosyltransferase. This family is conserved in bacteria. The function is not known. 0
41941 419751 cl21591 PRCH N/A. The PRC-barrel is an all beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea are composed entirely of a single stand-alone copy of the domain. 0
41942 419752 cl21592 DUF998 Protein of unknown function (DUF998). Family of conserved archaeal proteins. 0
41943 419753 cl21594 Gate Nucleoside recognition. Members of this protein family are found exclusively in Firmicutes (low-GC Gram-positive bacterial) and are known from studies in Bacillus subtilis to be part of the sigma-E regulon. Mutation leads to a sporulation defect, confirming that members of this protein family, YlbJ, are sporulation proteins. This protein appears to be universal among endospore-forming bacteria, but is encoded by a pair of ORFs distant from each other in Symbiobacterium thermophilum IAM14863. [Cellular processes, Sporulation and germination] 0
41944 419754 cl21598 PMP22_Claudin PMP-22/EMP/MP20/Claudin family. Members of this family are claudins, that form tight junctions between cells. 0
41945 419755 cl21600 DUF302_like Domains similar to DUF302 and the N-terminal domains found in some bacterial RNAses. RnlA_toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB. 0
41946 419756 cl21601 zf-CHC2 CHC2 zinc finger. This region represents the zinc binding domain. It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities. 0
41947 419757 cl21602 DUF1073 Protein of unknown function (DUF1073). This model describes an uncharacterized family of proteins found in prophage regions of a number of bacterial genomes, including Haemophilus influenzae, Xylella fastidiosa, Salmonella typhi, and Enterococcus faecalis. Distantly related proteins can be found in the prophage-bearing plasmids of Borrelia burgdorferi. [Mobile and extrachromosomal element functions, Prophage functions] 0
41948 419758 cl21606 GH3 GH3 auxin-responsive promoter. indole-3-acetic acid-amido synthetase 0
41949 419759 cl21608 Galactosyl_T Galactosyltransferase. This family includes a conserved region found in several uncharacterized plant proteins. 0
41950 419760 cl21610 PQ-loop PQ loop repeat. This family includes proteins such as drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologs are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter. 0
41951 419761 cl21612 PolyA_pol Poly A polymerase head domain. hypothetical protein 0
41952 419762 cl21614 YkuD_like L,D-transpeptidases/carboxypeptidases similar to Bacillus YkuD. This family is related to pfam03734. 0
41953 419763 cl21616 DUF4870 Domain of unknown function (DUF4870). 0
41954 389842 cl21617 Terminase_GpA Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities. 0
41955 354892 cl21618 Peptidase_M11 Gametolysin peptidase M11. This model describes a metalloproteinase domain, with a characteristic HExxH motif. Examples of this domain are found in proteins in the family of immune inhibitor A, which cleaves antibacterial peptides, and in other, only distantly related proteases. This model is built to be broader and more inclusive than pfam05547. 0
41956 419765 cl21622 PepSY Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). The name is derived from Peptidase & Bacillus subtilis YPEB. 0
41957 419766 cl21623 ALO D-arabinono-1,4-lactone oxidase. The substrate-binding domain found in Cholesterol oxidase is composed of an eight-stranded mixed beta-pleated sheet and six alpha-helices. This domain is positioned over the isoalloxazine ring system of the FAD cofactor bound by FAD_binding_4 (PF:PF01565) and forms the roof of the active site cavity, allowing for catalysis of oxidation and isomerisation of cholesterol to cholest-4-en-3-one. 0
41958 304472 cl21625 Capsule_synth Capsule polysaccharide biosynthesis protein. This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB. 0
41959 419767 cl21627 DRTGG DRTGG domain. This family represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phospho-relay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains. 0
41960 419768 cl21628 POTRA Surface antigen variable number repeat. FtsQ/DivIB bacterial division proteins (pfam03799) contain an N-terminal POTRA domain (for polypeptide-transport-associated domain). This is found in different types of proteins, usually associated with a transmembrane beta-barrel. FtsQ/DivIB may have chaperone-like roles, which has also been postulated for the POTRA domain in other contexts. 0
41961 419769 cl21633 TruB-C_2 Pseudouridine synthase II TruB, C-terminal. The C terminal domain of tRNA Pseudouridine synthase II adopts a PUA (pfam01472) fold, with a four-stranded mixed beta-sheet flanked by one alpha-helix on each side. It allows for binding of the enzyme to RNA, as well as stabilisation of the RNA molecule. 0
41962 419770 cl21636 AsmA_2 AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli. 0
41963 272058 cl21638 LodA_like L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. The dimerization interface observed in the available 3D structure does not seem to be conserved. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms. 0
41964 419771 cl21639 GH_101_like Endo-a-N-acetylgalactosaminidase and related glycosyl hydrolases. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by the S. pneumoniae protein Endo-alpha-N-acetylgalactosaminidase, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor. 0
41965 419772 cl21640 PUFD_like PCGF Ub-like fold discriminator and related domains. PUFD is the minimal domain at the C-terminus of BCORL (BCL6 corepressor) that is needed for binding and giving specificity to some of the PCGF proteins, polycomb-group RING finger homologs. PUFD binds to the RAWUL (RING finger- and WD40-associated ubiquitin-like) domain of the particular PCGF PCGF1, pfam16207. Polycomb group proteins form repressive complexes (PRC) that mediate epigenetic modifications of histones. In humans there are many different PCGF homologs whose functions all vary, but the direct binding partner of PCGF1 is BCOR. BCOR has emerged as an important player in development and health. 0
41966 419773 cl21642 Pentapeptide Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid. 0
41967 419774 cl21648 Coa1 Cytochrome oxidase complex assembly protein 1. TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane. 0
41968 419775 cl21649 GFO_IDH_MocA_C Oxidoreductase family, C-terminal alpha/beta domain. This is the C terminal of a family of putative oxidoreductases. 0
41969 419776 cl21652 Peptidase_C11 Clostripain family. Clostripain is a cysteine protease characterized from Clostridium histolyticum, and also known from Clostridium perfringens. It is a heterodimer processed from a single precursor polypeptide, specific for Arg-|-Xaa peptide bonds. The older term alpha-clostripain refers to the most active, most reduced form, rather than to the product of one of several different genes. Clostripain belongs to the peptidase family C11, or clostripain family (see pfam03415). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Pathogenesis] 0
41970 419777 cl21655 AMO Ammonia monooxygenase. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit A of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria. 0
41971 276368 cl21656 Silic_transp Silicon transporter. Marine diatoms such as Cylindrotheca fusiformis encode at least six silicon transport protein homologues which exhibit similar size and topology. One characterized member of the family (Sit1) functions in the energy-dependent uptake of either Silicic acid [Si(OH)4] or Silicate [Si(OH)3O-] by a Na+ symport mechanism. The system is found in marine diatoms which make their "glass houses" out of silicon. [Transport and binding proteins, Other] 0
41972 328841 cl21657 Phage_TTP_1 Phage tail tube protein. This model describes a set of proteins that share low levels of sequence similarity but similar lengths and similar patterns of charged, hydrophobic, and Gly/Pro residues. All members (except one attributed to mouse embryo cDNA) belong to phage of Gram-positive bacteria. Several are identified as phage major tail proteins. Some members of this family have additional C-terminal regions of about 100 residues not included in this model. [Mobile and extrachromosomal element functions, Prophage functions] 0
41973 419778 cl21658 NinB NinB protein. hypothetical protein; Provisional 0
41974 419779 cl21662 GH7_CBH_EG Glycosyl hydrolase family 7. Glycosyl hydrolase family 7 contains eukaryotic endoglucanases (EGs) and cellobiohydrolases (CBHs) that hydrolyze glycosidic bonds using a double-displacement mechanism. This leads to a net retention of the conformation at the anomeric carbon. Both enzymes work synergistically in the degradation of cellulose, which is the main component of plant cell wall, and is composed of beta-1,4 linked glycosyl units. EG cleaves the beta-1,4 linkages of cellulose and CBH cleaves off cellobiose disaccharide units from the reducing end of the chain. In general, the O-glycosyl hydrolases are a widespread group of enzymes that hydrolyze the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycoside hydrolase family 7. 0
41975 272086 cl21666 PHA02004 N/A. major capsid protein 0
41976 419780 cl21672 DIOX_N non-haem dioxygenase in morphine synthesis N-terminal. flavanone-3-hydroxylase; Provisional 0
41977 419781 cl21673 DUF3828 Protein of unknown function (DUF3828). putative lipoprotein; Provisional 0
41978 419782 cl21675 OprD outer membrane porin, OprD family. This family consists of Campylobacter major outer membrane proteins. The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments. 0
41979 389861 cl21676 VRP1 Salmonella virulence plasmid 28.1kDa A protein. virulence protein SpvA; Provisional 0
41980 419783 cl21677 NHase_beta Nitrile hydratase beta subunit. Members of this protein family are the beta subunit of nitrile hydratase. The alpha subunit is represented by model TIGR01323. While nitrile hydratase is given the specific EC number 4.2.1.84, nitriles are a class of compounds, and one genome may carry more than one nitrile hydratase. The enzyme occurs in both non-heme iron and non-corrin cobalt forms. [Energy metabolism, Amino acids and amines] 0
41981 419784 cl21678 Lon_C Lon protease (S16) C-terminal proteolytic domain. The Lon serine proteases must hydrolyze ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops. 0
41982 419785 cl21680 DUF3276 Protein of unknown function (DUF3276). This bacterial family of proteins has no known function. 0
41983 389865 cl21681 STN Secretin and TonB N-terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates. 0
41984 272102 cl21682 CBM_10 Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. 0
41985 354906 cl21683 TFIIA_alpha_beta_like Precursor of TFIIA alpha and beta subunits and similar proteins. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta), its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage). 0
41986 419786 cl21684 DUF4397 Domain of unknown function (DUF4397). AlgF is essential for the addition of O-acetyl groups to alginate, an extracellular polysaccharide. The presence of O-acetyl groups plays an important role in the ability of the polymer to act as a virulence factor. 0
41987 419787 cl21686 RecO_N Recombination protein O N terminal. This entry contains members that are not captured by pfam11967. 0
41988 389868 cl21687 Orc6_mid Middle domain of the origin recognition complex subunit 6. This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. Despite differences in their structure and sequences among eukaryotic replicators, ORC is a conserved feature of replication initiation in all eukaryotes. ORC-related genes have been identified in organisms ranging from S. pombe to plants to humans. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 0
41989 419788 cl21688 DUF1743 Domain of unknown function (DUF1743). The first twenty-nine completed genomes with a member of this protein family include twenty-eight archaeal methanogens and one other related archaeon, Ferroglobus placidus DSM 10642. The exact function is unknown, but the protein likely belongs to a system usually tightly linked to methanogenesis. 0
41990 419789 cl21693 CDC48_N Cell division protein 48 (CDC48), N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain. 0
41991 419790 cl21695 Tn7_Tnp_TnsA_N TnsA endonuclease N terminal. head completion protein; Provisional 0
41992 419791 cl21700 Glyco_hydro_26 Glycosyl hydrolase family 26. 0
41993 419792 cl21701 PC4 Transcriptional Coactivator p15 (PC4). p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 0
41994 304504 cl21702 DUF1700 Protein of unknown function (DUF1700). This family contains many hypothetical bacterial proteins and putative membrane proteins. 0
41995 419793 cl21703 Peptidase_A24 Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea. 0
41996 419794 cl21704 zf-CSL CSL zinc finger. This is a zinc binding motif which contains four cysteine residues which chelate zinc. This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine. 0
41997 419795 cl21705 Arv1 Arv1-like family. Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis. 0
41998 419796 cl21707 Phage_holin_3_6 Putative Actinobacterial Holin-X, holin superfamily III. Phage_holin_3_6 is a family of small hydrophobic proteins with two or three transmembrane domains of the Hol-X family. Holin proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. 0
41999 419797 cl21709 COQ9 COQ9. This uncharacterized protein is found in a number of Alphaproteobacteria and, with N-terminal regions long enough to be transit peptides, in eukaryotes. This phylogeny suggests mitochondrial derivation. In several Alphaproteobacteria, the gene for this protein is encoded divergently from rpsU, the gene for ribosomal protein S21. S21 is unusual in being encoded outside the usual long ribosomal protein operons, but rather in contexts that suggest regulation of the initiation of protein translation. [Unknown function, General] 0
42000 389879 cl21710 SASP_gamma Small, acid-soluble spore protein, gamma-type. This model represents a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. Most members of the family have two copies of the spore protease (GPR) cleavage motif, typically EFASE in this family, separating three low-complexity repeats. [Cellular processes, Sporulation and germination] 0
42001 419798 cl21712 T2SSC Type II secretion system protein C. Members of this protein family are found in type IV pilus biogenesis loci and include proteins designated PilP. [Cell envelope, Surface structures] 0
42002 389881 cl21715 TrbM TrbM. conjugal transfer protein TrbM; Provisional 0
42003 419799 cl21716 PapD_N Pili and flagellar-assembly chaperone, PapD N-terminal domain. C2 domain-like beta-sandwich fold. This domain is the N-terminal part of the PapD chaperone protein for pilus and flagellar assembly. 0
42004 419800 cl21721 IDO Indoleamine 2,3-dioxygenase. This domain has no known function. It is found in various hypothetical and conserved domain proteins. 0
42005 419801 cl21722 DnaJ_C C-terminal substrate binding domain of DnaJ and HSP40. This family consists of the C terminal region of the DnaJ protein. It is always found associated with pfam00226 and pfam00684. DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains CTDI and CTDII, both incorporated in this family are necessary for maintaining the J-domains in their specific relative positions. Structural analysis of Structure 1nlt shows that PF00684 is nested within this DnaJ C-terminal region. 0
42006 419802 cl21724 GAG_Lyase N/A. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 0
42007 419803 cl21727 VATPase_H N/A. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the N terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation. 0
42008 419804 cl21728 CIA30 Complex I intermediate-associated protein 30 (CIA30). This protein is associated with mitochondrial Complex I intermediate-associated protein 30 (CIA30) in human and mouse. The family is also present in Schizosaccharomyces pombe which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. This means it is possible that this family of protein may not be directly involved in oxidative phosphorylation. 0
42009 304525 cl21731 DUF677 Protein of unknown function (DUF677). This family consists of several plant proteins and includes BYPASS1, which is required for normal root and shoot development. This protein prevents constitutive production of a root mobile carotenoid-derived signaling compound that is capable of arresting shoot and leaf development. 0
42010 419805 cl21735 Lung_7-TM_R Lung seven transmembrane receptor. This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers. 0
42011 419806 cl21736 TAF6C C-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6). TAF6_C is the C-terminal domain of the TAF6 subunit of the general transcription factor TFIID. The crystal structure reveals the presence of five conserved HEAT repeats. This region is necessary for the complexing together of the subunits TAF5, TAF6 and TAF9. 0
42012 419808 cl21738 Alginate_lyase2 Alginate lyase. This family includes heparin lyase I, EC:4.2.2.7. Heparin lyase I depolymerizes heparin by cleaving the glycosidic linkage next to an iduronic acid moiety. The structure of heparin lyase I consists of a beta-jelly roll domain with a long, deep substrate-binding groove and an unusual thumb domain containing many basic residues extending from the main body of the enzyme. This family also includes glucuronan lyase, EC:4.2.2.14. The structure of glucuronan lyase is a beta-jelly roll. 0
42013 419809 cl21742 Cas10_III CRISPR/Cas system-associated protein Cas10. This domain family is found in bacteria and archaea, and is typically between 101 and 138 amino acids in length. The proteins in this family are frequently annotated as CRISPR-associated proteins however there is little accompanying literature to confirm this. 0
42014 419810 cl21745 DUF4188 Domain of unknown function (DUF4188). This family includes aldoxime dehydratase, EC:4.99.1.5. This is a haem-containing enzyme, which catalyzes the dehydration of aldoximes to their corresponding nitrile. It also includes phenylacetaldoxime dehydratase, EC:4.99.1.7. This haem-containing enzyme catalyzes the dehydration of Z-phenylacetaldoxime to phenylacetonitrile. The enzyme forms an elliptic beta barrel, composed of eight beta-strands, flanked by alpha-helices. 0
42015 419811 cl21747 SusD starch binding outer membrane protein SusD. SusD is a secreted polysaccharide-binding protein with an N-terminal lipid moiety that allows it to associate with the outer membrane. SusD probably mediates xyloglucan-binding prior to xyloglucan transport in the periplasm for degradation. This domain is found N-terminal to pfam07980. 0
42016 419813 cl21750 Fis1 Mitochondrial Fission Protein Fis1, cytosolic domain. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the C-terminal tetratricopeptide repeat 0
42017 419819 cl22409 BslA_like Bacterial immunoglobulin-like hydrophobin BslA and similar proteins. This family includes members such as BslA (previously called YuaB). Secreted BslA from Bacillus subtilis has been shown to form surface layers around the biofilm, self-assembling at interfaces of B. subtilis biofilms and forming an elastic film. Structural analysis revealed that BslA consists of an Ig-type fold with the addition of an unusual, extremely hydrophobic cap region. The hydrophobic cap exhibits physicochemical properties similar to the hydrophobic surface found in fungal hydrophobins; thus, BslA is defined as a member of a class of bacterially produced hydrophobins. 0
42018 419820 cl22411 E2F_DD Dimerization domain of E2F transcription factors. This is the coiled coil (CC) - marked box (MB) domain of E2F transcription factors. This domain forms a heterodimer with the corresponding domain of the DP transcription factor, the heterodimer binds the C-terminus of retinoblastoma protein. 0
42019 419821 cl22413 ADAM17_MPD Membrane-proximal domain of a disintegrin and metalloprotease 17 (ADAM17). ADAM17_MPD is the membrane-proximal domain of a family of disintegrin and metalloproteinase domain-containing protein 17 found in metazoan species. ADAM17 is a major sheddase that is responsible for the regulation of a wide range of biological processes, such as cellular differentiation, regeneration, and cancer progression. This MPD region acts as the sheddase switch. PDI, or protein-disulfide isomerase, interacts with ADAM17 to down-regulate its enzymatic activity. The interaction is directly with the MPD, the region of dimerization and substrate recognition, where it catalyzes an isomerisation of disulfide bridges within the thioredoxin motif CXXC. This isomerisation results in a major structural change between an active, open state and an inactive, closed state of the MPD. This change is thought to act as a molecular switch, allowing a global reorientation of the extracellular domains in ADAM17 and regulating its shedding activity. 0
42020 419822 cl22414 Lmo2686_like Uncharacterized hexameric protein conserved in Bacilli. This is a domain of unknown function mostly found in firmicutes. 0
42021 419823 cl22415 ESP Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP is a family of largely rodent exocrine gland-secreting peptides that are produced by the male extraorbital lacrimal gland to be secreted into the tear fluid. Other mice including females detect these peptides through receptors in the vomeronasal organ, and the receptors report information on mouse-strain, sex and species. The peptides are short, all carrying an N-terminal signal-peptide to indicate they are for secretion which accounts for much of the common conservation. 0
42022 419824 cl22417 bt3222_like Uncharacterized proteins similar to Bacteroides thetaiotaomicron bt3222. A small family of uncharacterized proteins around 310 residues in length and found in various Bacteroides species. The function of this family is unknown. 0
42023 419825 cl22418 CttA_X X module of the carbohydrate-binding protein CttA and similar proteins. This is the N-terminal domain of cellulose-binding protein CttA present in Ruminococcus flavefaciens. CttA mediates attachment of the bacterial substrate via two carbohydrate-binding modules. The domain is known as the X-module and lacks a true hydrophobic core. Unlike the X-modules in other types of CohE-XDoc complexes it does not contribute to the binding surface. This X-module appears to serve as an extended spacer, which separates the cellulose-binding modules at the N terminus of CttA and the bacterial cell wall. The domain does not share structural similarity with other known X-modules from cellulolytic bacteria but does show similarity to G5-1 module of StrH from S. pneumoniae. 0
42024 419826 cl22419 LepB Legionella Rab1-specific GAP LepB. This is a subdomain of a Rab GTPase-activating protein (GAP) effector from Legionella pneumophila. This GAP modulates Rab enzymes that act as molecular switches in regulating vesicular transport in eukaryotic cells. This N-terminal subdomain belongs to the GAP domain of the protein. The catalytic arginine finger (Arg444) is located within this sub-domain and it is the only arginine residue required for GAP activity. 0
42025 419827 cl22420 Hip_N N-terminal dimerization domain of the Hsp70-interacting protein (Hip) and similar proteins. This is the N-terminal domain, known as HipN, found in Hsp70-interacting protein (Hip) present in Rattus norvegicus. Hip cooperates with the chaperone Hsp70 in protein folding and prevention of aggregation and may delay substrate release by slowing ADP dissociation from Hsp70. HipN is responsible for N-terminal homo-dimerization which is necessary so that the Hip dimer can interact with Hsp70 molecules. 0
42026 277561 cl22421 ZIP_Gal4p-like Leucine zipper Dimerization domain of Gal4p-like transcription factors. Sip4p binds to carbon source-responsive element (CSRE) motifs and activates transcription of target genes under conditions of glucose deprivation. Its function is modulated through phosphorylation by SNF1 protein kinase, a protein essential for expression of glucose-repressed genes in response to glucose deprivation. Sip4p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. 0
42027 419828 cl22422 SRP68-RBD RNA-binding domain of signal recognition particle subunit 68. SRP68 is a family that is part of the SRP or signal recognition particle complex. This complex, consisting of six proteins and a 7SL-RNA is necessary for guiding the emerging proteins designed for the membrane towards the translocation pore. SRP68 forms a stable heterodimer with SRP72, a protein with a TPR repeat. Specific RNA-binding of SRP68 is mediated by the N-terminal domain of approximately 200 residues of this family. 0
42028 419829 cl22423 NBR1_like Functionally uncharacterized domain in neighbor of Brca1 Gene 1 and related proteins. Domain present between positions 365-485 in the human next to BRCA1 gene 1 protein Q14596 (NBR1_HUMAN). Distant homology and fold prediction analysis suggests this domain has an immunoglobulin-like fold and is distantly homologous to domains involved in cell adhesion such as CARDB (PF07705). A JCSG construct was crystallized, confirming the domain boundaries. 0
42029 304554 cl22428 E1_enzyme_family N/A. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the Escherichia coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the more widely distributed clade of ThiF proteins such as that found in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine] 0
42030 419830 cl22429 HHH_5 Helix-hairpin-helix domain. The HHH domain is a short DNA-binding domain. 0
42031 419831 cl22433 H3TH_StructSpec-5'-nucleases H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination. Exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR)), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of EXO1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively. 0
42032 419832 cl22434 Hint N/A. This short domain is a conserved region of intein-containing proteins from lower eukaryotes. 0
42033 419833 cl22435 TPP_enzyme_M Thiamine pyrophosphate enzyme, central domain. TPP_enzyme_M_2 is the middle domain of thiamine pyrophosphate in sequences not captured by pfam00205. This enzyme is necessary for the first step of the biosynthesis of menaquinone, or vitamin K2, an important cofactor in electron transport in bacteria. 0
42034 419834 cl22448 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. 0
42035 419835 cl22450 RtcB RNA-splicing ligase RtcB, repairs tRNA damage [Translation, ribosomal structure and biogenesis]. Members of this family are related to RtcB. RtcB is a protein of known structure but unknown function that often is encoded near RNA cyclase and therefore is suggested to be a tRNA or mRNA processing enzyme. This family of RtcB-like proteins is encoded upstream of, and apparently is translationally coupled to, the putative peptide chain release factor RF-H (TIGR03072), product of the prfH gene. Note that a large deletion at the junction between this gene and the prfH gene in Escherichia coli K-12 marks both as probable pseudogenes. [Protein synthesis, Other] 0
42036 419836 cl22451 ASF1_hist_chap ASF1 like histone chaperone. This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers. 0
42037 419837 cl22454 Arm Armadillo/beta-catenin-like repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). These EZ repeats are found in subunits of cyanobacterial phycocyanin lyase and other proteins and probably carry out a scaffolding role. 0
42038 419838 cl22470 AXH Ataxin-1 and HBP1 module (AXH). unknown function 0
42039 419839 cl22471 MtrF Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF). tetrahydromethanopterin S-methyltransferase subunit F; Provisional 0
42040 419840 cl22482 Sec63 Sec63 Brl domain. This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases. 0
42041 419842 cl22495 Gp23 Major capsid protein Gp23. capsid vertex protein; Provisional 0
42042 419843 cl22503 DUF2385 Protein of unknown function (DUF2385). Members of this uncharacterized protein family are found in a number of alphaProteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown. 0
42043 419844 cl22520 UPF0181 Uncharacterized protein family (UPF0181). This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH. 0
42044 304575 cl22532 Carbam_trans_N Carbamoyltransferase N-terminus. This family describes a protein family, YeaZ, now associated with the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family may occur as fusions with ygjD (previously gcp) or the ribosomal protein N-acetyltransferase rimI, and is frequently encoded next to rimI. [Protein synthesis, tRNA and rRNA base modification] 0
42045 419845 cl22542 P22_CoatProtein P22 coat protein - gene protein 5. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 369 and 424 amino acids in length. There is a single completely conserved residue G that may be functionally important. 0
42046 419847 cl22548 DUF3792 Protein of unknown function (DUF3792). Members of this family of strongly hydrophobic putative transmembrane protein average about 125 amino acids in length and occur mostly, but not exclusively, in the Firmicutes. Members are quite diverse in sequence. The function is unknown. 0
42047 419848 cl22555 CRF Corticotropin-releasing factor family. 0
42048 304582 cl22557 SPAM Salmonella surface presentation of antigen gene type M protein. type III secretion system protein SpaM; Provisional 0
42049 304583 cl22571 VirC2 VirC2 protein. This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear. 0
42050 419849 cl22626 YSIRK_signal YSIRK type signal peptide. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus. 0
42051 419850 cl22628 YcgL YcgL domain. This family of proteins formerly called DUF709 includes the E. coli gene ycgL. Homologs of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure. 0
42052 419851 cl22629 Cyt_c_Oxidase_VIIb N/A. Cytochrome C oxidase chain VIIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIb subunit is found only in eukaryotes and its specific function remains unclear. A rare polymorphism of the CcO VIIb gene may be associated with the high risk of nasopharyngeal carcinoma in a Cantonese family. 0
42053 419852 cl22636 DNA_pol3_theta DNA polymerase III, theta subunit. This family of proteins with unknown function appears to be restricted to Proteobacteria. 0
42054 276635 cl22681 wcaD N/A. This membrane protein is believed to function as the colanic acid repeating unit polymerase (in an analogous fashion to wzy proteins in O-antigen polymerization). 0
42055 276638 cl22684 wcaM N/A. This protein of uncharacterized function is the final gene in the conserved colanic acid biosynthesis cluster observed in Enterobacteriaceae. 0
42056 419854 cl22701 Phage_lysis Bacteriophage Rz lysis protein. phage lambda Rz-like lysis protein 0
42057 419856 cl22733 DUF1800 Protein of unknown function (DUF1800). This is a family of large bacterial proteins of unknown function. 0
42058 419858 cl22759 T2SS_PulS_OutS Type II secretion system pilotin lipoprotein (PulS_OutS). This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae, the OutS protein of Erwinia chrysanthemi and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. [Transport and binding proteins, Amino acids, peptides and amines] 0
42059 419860 cl22765 SP_1775_like Uncharacterized protein conserved in Streptococci. This family of Firmicute sequences has members that are annotated as ribose-phosphate pyrophosphokinase; however there is no evidence for this attribution. Member proteins are all shorter than 100 residues in length. 0
42060 276720 cl22766 mycoplas_M_dom IgG-blocking virulence domain. Members of this family, including MG_281 of Mycoplasma genitalium, bind conserved regions of the IgG light chain sequences, blocking IgG's normal function of antigen-specific binding. It is therefore an important virulence protein. Members of this family are found also in Mycoplasma pneumoniae, M. penetrans, M. gallisepticum, and M. iowae. Model TIGR04524 describes a region within this protein that is shared by many additional Mycoplasma and Ureaplasma proteins. [Cellular processes, Pathogenesis] 0
42061 276722 cl22768 TIGR04562 TIGR04562 family protein. Members of this family are bacterial proteins, roughly 400 amino acids in length. Most members belong to the Deltaproteobacteria. All members of the Myxococcales, an order within the Deltaproteobacteria, have a member. The arrangement of conserved residues into invariant motifs suggests enzymatic activity. The function is unknown. 0
42062 419861 cl22808 DUF5840 Family of unknown function (DUF5840). Members of this protein family occur primarily in Cyanobacteria. They average about 50 residues in length and are the ribosomally translated precursors of peptide natural products whose modifications include cleavage, cyclization, and prenylation. Sequences are well-conserved in the N-terminal region. They are nearly invariant over the last eight residues, but hypervariable just before that stretch. A related family, often in a similar genome context, is TIGR03678. 0
42063 419862 cl22817 DUF4842 Domain of unknown function (DUF4842). This domain is abundant in the Leptospira, in Bacteroides, and in Vibrio (three widely separated lineages). Most members have plausible lipoprotein signal peptides, including lipoprotein LruC from Leptospira interrogans and BACOVA_00967, from Bacteroides ovatus, with a solved crystal structure. Note that the C-terminal region of pfam13448 (length 83) matches the N-terminal region of some members of this domain (length 243). 0
42064 419863 cl22825 HEPN_AbiV AbiV. This family includes AbiV (abortive infection system V) from Lactococcus lactis, a phage resistance protein that causes certain phage infections to fail to lead to successful phage replication. Abortive infection mechanisms differ greatly. AbiV interacts directly with the protein SaV in phage p2 and blocks translation of phage proteins. 0
42065 419864 cl22834 CDPS Cyclodipeptide synthase. Members of this family take two aminoacylated tRNA molecules and produce a cyclic dipeptide with two peptide bonds. This enzyme therefore produces a type of nonribosomal peptide, but by a mechanism entirely different from the typical non-ribosomal peptide synthase (NRPS) that relies on adenylation to activate amino acids. Three characterized members of this family are the cyclodityrosine synthase of Mycobacterium tuberculosis (an essential gene), a cyclo(L-Phe-L-Leu) synthase from Streptomyces noursei involved in natural product biosynthesis, and cyclodileucine synthase YvmC from Bacillus licheniformis. Many cyclodipeptide synthases are found next to a cytochrome P450 that further modifies the product. 0
42066 419865 cl22837 Peptidase_Mx1 Putative zinc-binding metallo-peptidase. Members of this family are lipoproteins with the typical zinc metallohydrolase HExxH motif and additional similarities to a better-documented zinc peptidase family, pfam06167. The seed alignment begins immediately after the lipoprotein motif Cys residue. Up to five members of this protein family occur per genome, in the context of certain gene pairs related to RagA and RagB, or to SusC and SusD. Those gene pairs, like the present family, are restricted to the Bacteroidetes, may number up to 100 pairs per genome, and are linked to TonB-dependent uptake of biopolymer-derived nutrients such as glycans. A possible function for this lipoprotein is to hydrolyse larger molecules to prepare substrates for import and utilization. [Unknown function, Enzymes of unknown specificity] 0
42067 419866 cl22850 RGL11 Rhamnogalacturonan lyase of the polysaccharide lyase family 11. This is the beta-sheet domain found in rhamnogalacturonan (RG) lyases, which are responsible for an initial cleavage of the RG type I (RG-I) region of plant cell wall pectin. Polysaccharide lyase family 11 carrying this domain, such as YesW (EC:4.2.2.23) and YesX (EC:4.2.2.24), cleave glycoside bonds between rhamnose and galacturonic acid residues in RG-I through a beta-elimination reaction. Other family members carrying this domain are hemagglutinin A, lysine gingipain (Kgp) and Chitinase C (EC:3.2.1.14). 0
42068 419867 cl22851 PHD_SF PHD finger superfamily. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 0
42069 419868 cl22853 Motor_domain Myosin and Kinesin motor domain. Myosin motor domain of cardiac muscle, beta myosin heavy chain 7b (also called KIAA1512, dJ756N5.1, MYH14, MHC14). MYH7B is a slow-twitch myosin. Mutations in this gene result in one form of autosomal dominant hearing impairment. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 0
42070 419869 cl22854 HTH_XRE N/A. YdaS_antitoxin is a family of putative bacterial antitoxins, neutralising the toxin YdaT, family pfam06254. 0
42071 419870 cl22855 TNFRSF Tumor necrosis factor receptor superfamily (TNFRSF). This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 184 amino acids in length. This is the stn_TNFRSF12A_TNFR domain from the tumor necrosis factor receptor. The function of this domain is unknown. 0
42072 419871 cl22856 SNARE SNARE motif. This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport. 0
42073 419872 cl22860 PEPCK_HprK N/A. This family represents the C terminal kinase domain of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. 0
42074 419873 cl22861 LamG N/A. This domain belongs to the Concanavalin A-like lectin/glucanases superfamily. 0
42075 354965 cl22863 Str_synth Strictosidine synthase. This family consists of arylesterases (Also known as serum paraoxonase) EC:3.1.1.2. These enzymes hydrolyze organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation. 0
42076 419874 cl22867 Sigma70_r4 N/A. Region 4 of sigma-70 like sigma-factors are involved in binding to the -35 promoter element via a helix-turn-helix motif. 0
42077 419876 cl22877 Autotransporter Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type IV pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different peptidase is used and in some cases no cleavage occurs. 0
42078 419877 cl22881 DNA_processg_A DNA recombination-mediator protein A. This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 0
42079 419878 cl22882 S-methyl_trans Homocysteine S-methyltransferase. homocysteine S-methyltransferase 0
42080 419879 cl22885 TAF9 TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. 0
42081 419880 cl22886 GGCT_like N/A. GGACT, gamma-glutamylamine cyclotransferase, is a ubiquitous enzyme found in bacteria, plants, and metazoans from Dictyostelium through to humans. It converts gamma-glutamylamines to free amines and 5-oxoproline. 0
42082 328943 cl22894 Mem_trans Membrane transport protein. [Transport and binding proteins, Other] 0
42083 419882 cl22895 HTH_8 Bacterial regulatory protein, Fis family. 0
42084 419883 cl22897 TPR_1 Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by pfam00515. 0
42085 419884 cl22899 UTRA UTRA domain. It has a similar fold to HutC/FarR-like bacterial transcription factors of the GntR family. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules. 0
42086 419885 cl22901 RHH_1 Ribbon-helix-helix protein, copG family. ParD is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is ParE in, pfam05016. The family contains several related antitoxins from Cyanobacteria, Proteobacteria and Actinobacteria. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all type II antitoxins. 0
42087 419886 cl22902 MdcG Phosphoribosyl-dephospho-CoA transferase MdcG. Malonate decarboxylase, like citrate lyase, has a unique acyl carrier protein subunit with a prosthetic group derived from, and distinct from, coenzyme A. Members of this protein family are the phosphoribosyl-dephospho-CoA transferase specific to the malonate decarboxylase system. This enzyme can also be designated holo-ACP synthase (2.7.7.61). The corresponding component of the citrate lyase system, CitX, shows little or no sequence similarity to this family. [Energy metabolism, Other] 0
42088 419887 cl22903 Arrestin_N Arrestin (or S-antigen), N-terminal domain. Vacuolar protein sorting-associated protein (Vps) 26 is one of around 50 proteins involved in protein trafficking. In particular, Vps26 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps29 and Vps35. This family also contains Down syndrome critical region 3/A. 0
42089 419888 cl22904 CARDB CARDB. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW3 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the NPCBM family, pfam08305, a novel putative carbohydrate binding module found at the N-terminus of glycosyl hydrolases. 0
42090 389966 cl22907 zf-U1 U1 zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins. 0
42091 419889 cl22912 CsbD CsbD-like. hypothetical protein; Provisional 0
42092 419890 cl22913 DUF1255 Protein of unknown function (DUF1255). This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 0
42093 419891 cl22917 PNTB NAD(P) transhydrogenase beta subunit. This family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the protein N- or C terminal in eukaryotes. The domain is often found in conjunction with pfam01262. Pyridine nucleotide transhydrogenase catalyzes the reduction of NADP+ to NADPH. A complete loss of activity occurs upon mutation of Gly314 in E. coli. 0
42094 419892 cl22918 AnmK Anhydro-N-acetylmuramic acid kinase. anhydro-N-acetylmuramic acid kinase; Reviewed 0
42095 419893 cl22919 Tfb4 Transcription factor Tfb4. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
42096 419895 cl22923 DUF2268 Predicted Zn-dependent protease (DUF2268). This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function. 0
42097 419896 cl22924 7TM_GPCR_Str Serpentine type 7TM GPCR chemoreceptor Str. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srd is part of the larger Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 0
42098 419897 cl22925 SLBB SLBB domain. This family consists of the C-terminal domain of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration. 0
42099 389977 cl22931 TAF12 TATA Binding Protein (TBP) Associated Factor 12. The TATA Binding Protein (TBP) Associated Factor 12 (TAF12; also known as TAF2J or TAFII20) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and in the assembly of the pre-initiation complex (PIC). The TFIID complex is composed of the TBP and at least 13 TAFs which specifically interact with a variety of core promoter DNA sequences. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A unified and systematic nomenclature has been adopted for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs function such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF12 interacts with TAF4 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes. It is important for RAS-induced transformation properties of human colorectal cancer cells; its levels are increased in the cells harboring the RAS mutation. Also, TAF12 interacts with activating transcription factor 7 (ATF7) and contributes to the hypersensitivity of osteoclast (OCL) precursors to 1,25-dihydroxyvitamin D3 (1,25-(OH)2D3; also known as calcitriol) in Paget's disease (PD), a disorder of the bone remodeling process, in which the body absorbs old bone and forms abnormal new bone. 0
42100 419899 cl22933 Spo0M SpoOM protein. This family consists of several bacterial SpoOM proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH. 0
42101 389979 cl22934 DUF1699 Protein of unknown function (DUF1699). This family contains many archaeal proteins which have very conserved sequences. 0
42102 419900 cl22935 GP11 GP11 baseplate wedge protein. baseplate wedge subunit and tail pin; Provisional 0
42103 419901 cl22936 DUF2089 Protein of unknown function (DUF2089). This domain, found in various hypothetical prokaryotic proteins, has no known function. This domain is a zinc-ribbon. 0
42104 304651 cl22942 TOMM_pelo NHLP leader peptide domain. This model recognizes a number of type 2 lantibiotic-type bacteriocins, including mersacidin and lichenicidin. Members often are found as gene pairs encoding two-chain bacteriocins. Maturation is accomplished, at least in part, by a LanM-type enzyme (TIGR03897). This model describes only the leader peptide region. [Cellular processes, Toxin production and resistance] 0
42105 389982 cl22943 DUF3693 Phage related protein. hypothetical protein 0
42106 304653 cl22944 COG5510 Predicted small secreted protein [Function unknown]. 0
42107 419903 cl22948 FeoC FeoC like transcriptional regulator. This family contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor. 0
42108 419905 cl22951 Vps51 Vps51/Vps67. The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7. 0
42109 419906 cl22952 Pou Pou domain - N-terminal to homeobox domain. 0
42110 419907 cl22953 RCC_reductase Red chlorophyll catabolite reductase (RCC reductase). red chlorophyll catabolite reductase 0
42111 419908 cl22958 Agenet Agenet domain. Domain in plant sequences with possible chromatin-associated functions. 0
42112 419909 cl22959 VRR_NUC VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI. 0
42113 419910 cl22960 T4_gp9_10 Bacteriophage T4 gp9/10-like protein. baseplate wedge tail fiber connector; Provisional 0
42114 419911 cl22961 RNA_Me_trans Predicted SAM-dependent RNA methyltransferase. This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases. 0
42115 277679 cl22964 COG3905 Predicted transcriptional regulator [Transcription]. 0
42116 419912 cl22966 DUF1330 Domain of unknown function (DUF1330). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 0
42117 419913 cl22970 Sulfotransfer_2 Sulfotransferase family. This family consists of several mammalian galactose-3-O-sulfotransferase proteins. Gal-3-O-sulfotransferase is thought to play a critical role in 3'-sulfation of N-acetyllactosamine in both O- and N-glycans. 0
42118 419914 cl22974 HpaP Type III secretion protein (HpaP). This family of genes is always found in type III secretion operons, although its function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence. 0
42119 419917 cl22978 rve_3 Integrase core domain. 0
42120 277695 cl22980 Csx12 CRISPR/Cas system-associated protein Cas9. Members of this family of CRISPR-associated (cas) protein are found, so far, in CRISPR/cas loci in Wolinella succinogenes DSM 1740, Legionella pneumophila str. Paris, and Francisella tularensis, where the last probably is an example of a degenerate CRISPR locus, having neither repeats nor a functional Cas1. The characteristic repeat length is 37 base pairs and period is about 72. One region of this large protein shows sequence similarity to pfam01844, HNH endonuclease. 0
42121 419933 cl23554 DUF968 Protein of unknown function (DUF968). REF is a family of P1-like phage RecA-dependent nucleases. It does not appear to act as a positive RecA regulator. It is a new kind of enzyme, a RecA-dependent nuclease. 0
42122 355006 cl23634 EFh_HEF EF-hand, calcium binding motif, found in the hexa-EF hand proteins family. CBN, the product of the cbn gene, is a Drosophila homolog to vertebrate neuronal six EF-hand calcium binding proteins. It is expressed through most of ontogenesis with a selective distribution in the nervous system and in a few small adult thoracic muscles. Its precise biological role remains unclear. CBN contains six EF-hand motifs, but some of them may not bind calcium ions due to the lack of key residues. 0
42123 419953 cl23654 DUF4322 Domain of unknown function (DUF4322). This family contains transposases from the insertion element ISH3, and related transposases from other mobile elements with similar transposases. This model reproduces the classification from ISFinder except for ISC1439B-like transposases, since those are extremely different. 0
42124 419954 cl23655 DUF4343 Domain of unknown function (DUF4343). Family of ATP-grasp enzymes belonging to the R2K clade, wherein one of the absolutely-conserved lysine residues has migrated to the RAGYNA domain which is a part of the core ATP-grasp module. This family is predicted to catalyze peptide ligation reactions on protein substrates in biological conflict contexts, probably between bacteriophages and their hosts. 0
42125 419960 cl23716 metallo-hydrolase-like_MBL-fold mainly hydrolytic enzymes and related proteins which carry out various biological functions; MBL-fold metallohydrolase domain. This family is part of the metallo-beta-lactamase superfamily. 0
42126 419961 cl23717 crotonase-like N/A. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitine racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This family differs from pfam00378 in the structure of its C-terminus. 0
42127 419962 cl23718 ALP_like alkaline phosphatases and sulfatases. This family is a member of the Alkaline phosphatase clan. 0
42128 419963 cl23719 NAD_binding_1 Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, that also bind FAD/NAD, have essentially no similarity. 0
42129 419964 cl23720 RILP-like Rab interacting lysosomal protein-like 1 and 2 (Rilpl1 and Rilpl2). CEP290 and similar centrosomal proteins carry a number of coiled-coil regions, and this is the fifth along the length of the protein. It is thought that the proteins are involved in cilia biosynthesis. 0
42130 419965 cl23721 AP2Ec N/A. This family consists of several bacterial L-rhamnose isomerase proteins (EC:5.3.1.14). 0
42131 419966 cl23723 Cytochrome_b_N N/A. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length. There are two conserved histidines that may be functionally important. This family is N-terminally truncated compared to other members of the clan. 0
42132 419967 cl23724 PHP Polymerase and Histidinol Phosphatase domain. This protein is part of the RNase P complex that is involved in tRNA maturation. 0
42133 419968 cl23725 Glyco_hydro Glycosyl hydrolases. A family of putative cellulases. 0
42134 419971 cl23728 ATP_bind_2 P-loop ATPase protein family. This family contains an ATP-binding site and could be an ATPase (personal obs:C Yeats). 0
42135 419972 cl23729 SdiA-regulated SdiA-regulated. This family represents a conserved region approximately within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the pfam01436 repeat. 0
42136 419973 cl23730 F5_F8_type_C F5/8 type C domain. This family, around 200 residues long, is located at the C-terminus of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown. 0
42137 419975 cl23733 FliJ Flagellar FliJ protein. 0
42138 419976 cl23735 H4 N/A. This family includes archaebacterial histones and histone like transcription factors from eukaryotes. 0
42139 419978 cl23739 HCP_like N/A. This family includes both hybrid-cluster proteins and the beta chain of carbon monoxide dehydrogenase. The hybrid-cluster proteins contain two Fe/S centers - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested. The prismane protein from Escherichia coli was shown to contain hydroxylamine reductase activity (NH2OH + 2e + 2 H+ -> NH3 + H2O). This activity is rather low. Hydroxylamine reductase activity was also found in CO-dehydrogenase in which the active site Ni was replaced by Fe. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre. 0
42140 419981 cl23744 Peptidase_C1 N/A. This family is closely related to the Peptidase_C1 family pfam00112, containing several prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases. 0
42141 419983 cl23746 Xan_ur_permease Permease family. MFS_MOT1 is a family of molybdate transporters. Molybdenum is an essential element that is taken up into the cell as the oxyanion molybdate. Molybdenum is used in the form of molybdopterin-cofactor, which participates in the active site of enzymes involved in key reactions of carbon, nitrogen, and sulfur metabolism. 0
42142 419984 cl23747 UPF0182 Uncharacterized protein family (UPF0182). hypothetical protein; Provisional 0
42143 419985 cl23748 DUF3585 Protein of unknown function (DUF3585). This family consists of several eukaryotic proteins. Suppressor of IKBKE 1 (SIKE) is a physiological suppressor of IKK-epsilon and TBK1, which are two IKK-related kinases involved in virus- and TLR3-triggered activation of interferon regulatory factor 3 (IRF-3). Other members of this family are circulating cathodic antigen (CCA), found in Schistosoma mansoni (Blood fluke), and FGFR1 oncogene partner 2, which may be involved in wound healing pathway. 0
42144 419986 cl23749 TIR_2 TIR domain. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 98 and 145 amino acids in length. 0
42145 419987 cl23750 vATP-synt_E ATP synthase (E/31 kDa) subunit. V-type ATP synthase subunit E; Provisional 0
42146 329056 cl23751 Plasmodium_Vir Plasmodium vivax Vir protein. variable surface protein Vir32; Provisional 0
42147 419988 cl23752 Cytochrom_C3 Heme-binding domain of the class III cytochrome C family and related proteins. This family includes cytochromes c7 and c7-type. In cytochromes c7 all three haems are bis-His co-ordinated. In c7-type the last haem is His-Met co-ordinated. 0
42148 419990 cl23754 EamA EamA-like transporter family. This region is found in proteins related to Plasmodium falciparum chloroquine resistance transporter (CRT). 0
42149 304914 cl23757 OCRE OCRE domain. RBM5 is also called protein G15, H37, putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9. It is a known modulator of apoptosis. It acts as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both of them specifically bind poly(G) RNA. RBM5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 0
42150 419993 cl23759 GT1 GT1, myb-like, SANT family. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 0
42151 329062 cl23762 TK Thymidine kinase. thymidine kinase; Provisional 0
42152 419994 cl23766 fungal_TF_MHR fungal transcription factor regulatory middle homology region. Cep3 is one of the major components of the CBF3. It dimerizes and in so doing forms a large central channel that is large enough to accommodate duplex B-form DNA. The dimerization region is followed by a linker to the zinc-finger domain at the C-terminus. The CBF3 complex is an essential core component of the budding yeast kinetochore and is required for the centromeric localization of all other kinetochore proteins. Cep3 is the only component with DNA-binding properties. 0
42153 419995 cl23768 ENDO3c N/A. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family. 0
42154 419996 cl23770 FliH Flagellar assembly protein FliH. This family consists of several nodulation protein NolV sequences from different Rhizobium species. The function of this family is unclear. 0
42155 419997 cl23771 Big_1 Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold. 0
42156 304931 cl23774 TAF TATA box binding protein associated factor (TAF). TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold. 0
42157 355042 cl23776 EFP_modif_epmB EF-P beta-lysylation protein EpmB. Members of this family are arginine 2,3-aminomutase, a radical SAM enzyme more closely related to lysine 2,3-aminomutase than to glutamate 2,3-aminomutase. The enzyme makes L-beta-arginine, sometimes in the context of antibiotic biosynthesis (blasticidin S, mildiomycin, etc). Activity is proven in Streptomyces griseochromogenes, which makes blasticidin S. 0
42158 304934 cl23777 PRK01005 N/A. V-type ATP synthase subunit E; Provisional 0
42159 420000 cl23778 LpxK Tetraacyldisaccharide-1-P 4'-kinase. Also called lipid-A 4'-kinase. This essential gene encodes an enzyme in the pathway of lipid A biosynthesis in Gram-negative organisms. A single copy of this protein is found in Gram-negative bacteria. PSI-BLAST converges on this set of apparent orthologs without identifying any other homologs. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
42160 420001 cl23779 MethyltransfD12 D12 class N6 adenine-specific DNA methyltransferase. All proteins in this family for which functions are known are DNA-adenine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The DNA adenine methylase (dam) of E. coli and related species is instrumental in distinguishing the newly synthesized strand during DNA replication for methylation-directed mismatch repair. This family includes several phage methylases and a number of different restriction enzyme chromosomal site-specific modification systems. [DNA metabolism, DNA replication, recombination, and repair] 0
42161 420002 cl23780 PepSY_TM PepSY-associated TM region. This is a family of bacterial proteins with three PepSY-like TM regions. 0
42162 420003 cl23781 Fascin N/A. This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture. Dictyostelium hisactophilin, another actin-binding protein, is a submembranous pH sensor that signals slight changes of the H+ concentration to actin by inducing actin polymerization and binding to microfilaments only at pH values below seven. Members of this family are histidine rich, typically contain the repeated motif of HHXH. 0
42163 420004 cl23783 PsbQ Oxygen evolving enhancer protein 3 (PsbQ). This protein through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacteria photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacteria protein PsbQ by homology. [Energy metabolism, Photosynthesis] 0
42164 420005 cl23784 RICIN N/A. This family of serine protease inhibitors has a beta-trefoil fold and inhibits trypsin and chymotrypsin. 0
42165 304945 cl23788 Met_repressor_MetJ N/A. Met Repressor, MetJ. MetJ is a bacterial regulatory protein that uses S-adenosylmethionine (SAM) as a corepressor to regulate the production of Methionine. MetJ binds arrays of two to five adjacent copies of an eight base-pair 'metbox' sequence. MetJ forms sufficiently strong interactions with the sugar-phosphate backbone to accommodate sequence variation in natural operators. However, it is very sensitive to particular base changes in the operator. MetJ exists as a homodimer. 0
42166 355048 cl23789 PLN02481 N/A. spermidine hydroxycinnamoyl transferase; Provisional 0
42167 420009 cl23790 Auxin_inducible Auxin responsive protein. uncharacterized protein; Provisional 0
42168 420010 cl23791 UPF0154 Uncharacterized protein family (UPF0154). hypothetical protein; Provisional 0
42169 420011 cl23792 DUF2129 Uncharacterized protein conserved in bacteria (DUF2129). hypothetical protein; Provisional 0
42170 420012 cl23793 PsbH Photosystem II 10 kDa phosphoprotein. photosystem II reaction center protein H; Provisional 0
42171 420013 cl23795 Mntp Putative manganese efflux pump. This protein family was identified, at the time of the publication of the Carboxydothermus hydrogenoformans genome, as having a phylogenetic profile that exactly matches the subset of the Firmicutes capable of forming endospores. The species include Bacillus anthracis, Clostridium tetani, Thermoanaerobacter tengcongensis, Geobacillus kaustophilus, etc. This protein, previously named YtaF, is therefore a putative sporulation protein. [Cellular processes, Sporulation and germination] 0
42172 390093 cl23796 DUF1120 Protein of unknown function (DUF1120). hypothetical protein; Provisional 0
42173 420014 cl23797 LPP Lipoprotein leucine-zipper. This leucine-zipper is found in the enterobacterial outer membrane lipoprotein LPP. It is likely that this domain oligomerizes and is involved in protein-protein interactions. As such it is a bundle of alpha-helical coiled-coils, which are known to play key roles in mediating specific protein-protein interactions in molecular recognition and the assembly of multi-protein complexes. 0
42174 420015 cl23798 CBM53 Starch/carbohydrate-binding module (family 53). CBM26 is a carbohydrate-binding module that binds starch. 0
42175 420016 cl23799 MgtE_N MgtE intracellular N domain. This is the N-terminal domain of the flagellar rotor protein FliG. 0
42176 420017 cl23800 Creatinase_N Creatinase/Prolidase N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families. 0
42177 420019 cl23802 Peptidase_C48 Ulp1 protease family, C-terminal catalytic domain. Protease specific for SMALL UBIQUITIN-RELATED MODIFIER (SUMO); Provisional 0
42178 420020 cl23804 CAF1 CAF1 family ribonuclease. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localizes to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. 0
42179 420021 cl23805 ABC_transp_aux ABC-type uncharacterized transport system. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldG is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldG abolish the gliding phenotype. GldG, along with GldA and GldF are believed to compose an ABC transporter and are observed as an operon. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. 0
42180 420022 cl23808 TetR_C_11 Bacterial transcriptional repressor C-terminal. This family comprises the C-terminal domain of transcriptional regulators of the TetR family. It includes the AefR transcriptional regulator from P. syringae. It is found in association with pfam00440. 0
42181 420024 cl23811 Glucos_trans_II Glucosyl transferase GtrII. O-antigen conversion protein C 0
42182 420028 cl23815 Glyco_hydro_30 Glycosyl hydrolase family 30 TIM-barrel domain. 0
42183 304973 cl23816 CSF2 N/A. GM-CSF stimulates the development of and the cytotoxic activity of white blood cells. 0
42184 420029 cl23817 DUF1146 Protein of unknown function (DUF1146). Members of this protein family are small, typically about 80 residues in length, and are highly hydrophobic. The gene is found so far only in a subset of the Firmicutes in association with genes of the ATP synthase F1 complex or NADH-quinone oxidoreductase. This family includes ywzB from Bacillus subtilis; pfam06612 describes the same family as Protein of unknown function DUF1146. 0
42185 304975 cl23818 COG4020 Uncharacterized protein [Function unknown]. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis] 0
42186 420030 cl23820 PSI_PsaJ Photosystem I reaction centre subunit IX / PsaJ. photosystem I reaction center subunit IX; Provisional 0
42187 420031 cl23821 GSP_synth Glutathionylspermidine synthase preATP-grasp. glutathionylspermidine synthase domain-containing protein 0
42188 420032 cl23822 TCP TCP family transcription factor. Protein TCP2; Provisional 0
42189 420033 cl23823 RALF Rapid ALkalinization Factor (RALF). rapid alkalinization factor 23-like protein; Provisional 0
42190 420034 cl23824 PetG Cytochrome B6-F complex subunit 5. cytochrome b6-f complex subunit PetG; Reviewed 0
42191 420035 cl23825 UPF0223 Uncharacterized protein family (UPF0223). hypothetical protein; Provisional 0
42192 420036 cl23826 Phageshock_PspG Phage shock protein G (Phageshock_PspG). This protein previously was designated yjbO in E. coli. It is found only in genomes that have the phage shock operon (psp), but only rarely is encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins, and heat shock. [Cellular processes, Adaptations to atypical conditions] 0
42193 420037 cl23827 Tra_M TraM mediates signalling between transferosome and relaxosome. The TraM protein is an essential part of the DNA transfer machinery of the conjugative resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential transfer protein TraM has at least two functions. First, a functional TraM protein was found to be required for normal levels of transfer gene expression. Second, experimental evidence was obtained that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo. Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal nucleoprotein complex and the membrane-bound DNA transfer apparatus. 0
42194 420038 cl23828 RMF Ribosome modulation factor. ribosome modulation factor; Provisional 0
42195 304986 cl23829 PRK15383 type III secretion system effector arginine glycosyltransferase. 0
42196 355063 cl23830 H2B Histone H2B. histone H2B; Provisional 0
42197 304988 cl23831 Csm4_III-A CRISPR/Cas system-associated RAMP superfamily protein Csm4. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm4, for CRISPR/cas Subtype Mtube, protein 4. 0
42198 304989 cl23832 photo_TT_lyase spore photoproduct lyase. This uncharacterized radical SAM domain protein occurs rarely and sporadically in species that include select Alphaproteobacteria and Actinobacteria, and in Deinococcus deserti VCD115. It is a distant but full-length homolog to the Bacillus subtilis spore photoproduct lyase (spl), which monomerizes thymine dimers created as DNA damage by uv radiation. 0
42199 355064 cl23833 NrfA Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit [Inorganic ion transport and metabolism]. Members of this protein family are cytochrome c552, a component of cytochrome c nitrite reductase, which is known more formally as nitrite reductase (cytochrome; ammonia-forming) (EC 1.7.2.2). Nitrate can be reduced by several enzymes. EC 1.7.2.2 reduces nitrite all the way to ammonia, rather than to ammonium hydroxide (nitrite reductase (NAD(P)H), EC 1.7.1.4) or nitric oxide (nitrite reductase (NO-forming), EC 1.7.2.1). Some examples of EC 1.7.2.2 occur in a seven gene system that enables formate-dependent nitrite reduction, but is also found in simpler contexts. Members of this protein family, however, belong to the formate-dependent system. [Energy metabolism, Electron transport] 0
42200 329112 cl23835 Glyco_transf_90 Glycosyl transferase family 90. This family of glycosyl transferases are specifically (mannosyl) glucuronoxylomannan/galactoxylomannan -beta 1,2-xylosyltransferases, EC:2.4.2.-. 0
42201 420039 cl23836 DUF262 Protein of unknown function DUF262. 0
42202 420040 cl23837 HicB_lk_antitox HicB_like antitoxin of bacterial toxin-antitoxin system. This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches. 0
42203 420041 cl23838 PrsW-protease Protease prsW family. PrsW, an intramembrane protease, cleaves the anti-sigma factor RsiW, which regulates the activity of the ECF-type sigma factor SigW. 0
42204 420042 cl23839 DUF496 Protein of unknown function (DUF496). 0
42205 420043 cl23840 AGE N/A. This family contains a number of eukaryotic and bacterial N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase) enzymes (EC:5.3.1.8) approximately 500 residues long. This converts N-acyl-D-glucosamine to N-acyl-D-mannosamine. 0
42206 420044 cl23841 DUF411 Protein of unknown function, DUF. The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance. 0
42207 420045 cl23842 MatP MatP N-terminal domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the N-terminal domain of MatP. 0
42208 420046 cl23843 DUF839 Bacterial protein of unknown function (DUF839). This family consists of several bacterial proteins of unknown function that contain predicted beta-propeller repeats. 0
42209 305001 cl23844 Ble Predicted trehalose synthase [Carbohydrate transport and metabolism]. Three pathways for the biosynthesis of trehalose, an osmoprotectant that in some species is also a precursor of certain cell wall glycolipids. Trehalose synthase, TreS, can interconvert maltose and trehalose, but while the equilibrium may favor trehalose, physiological concentrations of trehalose may be much greater than that of maltose and TreS may act largely in its degradation. This model describes a domain found only as a C-terminal fusion to TreS proteins. The most closely related proteins outside this family, Pep2 of Streptomyces coelicolor and Mak1 of Actinoplanes missouriensis, have known maltokinase activity. We suggest this domain acts as a maltokinase and helps drive conversion of trehalose to maltose. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 0
42210 420047 cl23845 DUF2312 Uncharacterized protein conserved in bacteria (DUF2312). hypothetical protein; Provisional 0
42211 420048 cl23846 PhosphMutase 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate]. 0
42212 420049 cl23847 UPF0262 Uncharacterized protein family (UPF0262). hypothetical protein; Provisional 0
42213 420050 cl23848 DUF1801 Domain of unknown function (DUF1801). This large family of bacterial proteins is uncharacterized. They contain a presumed domain about 110 amino acids in length. 0
42214 420051 cl23849 TrbI Bacterial conjugation TrbI-like protein. Proteins in this entry are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. 0
42215 420052 cl23850 Plug TonB-dependent Receptor Plug Domain. This model describes a 31-residue signature region of the SusC/RagA family of outer membrane proteins from the Bacteroidetes. While many TonB-dependent outer membrane receptors are associated with siderophore import, this family seems to include generalized nutrient receptors that may convey fairly large oligomers of protein or carbohydrate. This family occurs in high copy numbers in the most abundant species of the human gut microbiome. 0
42216 420053 cl23851 MHB Haemophore, haem-binding. Members of this family, including Rv0203 from Mycobacterium tuberculosis, are secreted heme-binding proteins used in heme acquisition. Such proteins are called hemophores. Members have a cleavable N-terminal signal peptide, and a mature region just over 100 amino acids long with a pair of invariant Cys residues. An unrelated hemophore, HasA, occurs in Gram-negative pathogens such as Yersinia pestis. [Transport and binding proteins, Other] 0
42217 420054 cl23853 Condensation Condensation domain. This family contains a number of alcohol acetyltransferase (EC:2.3.1.84) enzymes approximately 500 residues long found in both bacteria and metazoa. These catalyze the esterification of isoamyl alcohol by acetyl coenzyme A. 0
42218 420055 cl23855 FGE-sulfatase Sulfatase-modifying factor enzyme 1. This model represents a signature C-terminal region of a distinct clade in the EgtB subfamily, other members of which participate in ergothioneine biosynthesis 0
42219 305014 cl23857 DUF1565 Protein of unknown function (DUF1565). This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the beta-strand repeats that stack in a right-handed parallel beta-helix in the periplasmic C-5 mannuronan epimerase, AlgA, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel beta-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences. 0
42220 420056 cl23858 PSCyt1 Planctomycete cytochrome C. This domain contains a potential haem-binding motif, CXXCH. This family is found in association with pfam00034 and pfam03150. 0
42221 420057 cl23859 Peptidase_M10_C Peptidase M10 serralysin C terminal. This family consists of a number of bacteria specific domains which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with pfam00353 and is often found in multiple copies. 0
42222 420060 cl23862 PmrD Polymyxin resistance protein PmrD. anti-adapter protein IraM; Provisional 0
42223 420061 cl23863 BrnA_antitoxin BrnA antitoxin of type II toxin-antitoxin system. CopG antitoxin is a member of a type II toxin-antitoxin system family found in bacteria and archaea. Most antitoxins encoded by the relBE and parDE loci belong to the MetJ/Arc/CopG family of dimeric proteins which bind DNA through N-terminal ribbon-helix-helix (RHH) motifs. The toxin for CopG proteins falls into the family BrnT_toxin, pfam04365. 0
42224 420068 cl23870 PsbX Photosystem II reaction centre X protein (PsbX). photosystem II protein X; Reviewed 0
42225 305028 cl23871 DUF2560 Protein of unknown function (DUF2560). hypothetical protein 0
42226 420069 cl23872 A_amylase_inhib Alpha amylase inhibitor. Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases. 0
42227 305030 cl23873 Spider_toxin Spider neurotoxins including agatoxin, purotoxin and ctenitoxin. This family of spider neurotoxins are thought to be calcium ion channel inhibitors. 0
42228 305031 cl23874 DUF1187 Protein of unknown function (DUF1187). hypothetical protein; Provisional 0
42229 420070 cl23875 MvaI_BcnI MvaI/BcnI restriction endonuclease family. This family includes the LlaMI (recognizes and cleaves CC^NGG) restriction endonuclease. 0
42230 420071 cl23876 ToxGAP N/A. GTPase-activating protein (GAP) domain found in bacterial cytotoxins, ExoS, SptP, and YopE. Part of protein secretion system; stimulates Rac1-dependent cytoskeletal changes that promote bacterial internalization. 0
42231 390151 cl23877 Lysin-Sp18 N/A. Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg. 0
42232 420072 cl23878 C1q C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor. 0
42233 390153 cl23879 IL2 Interleukin 2. Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response. 0
42234 420073 cl23880 HALZ Homeobox associated leucine zipper. 0
42235 420074 cl23881 GIT_SHD Spa2 homology domain (SHD) of GIT. Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX. 0
42236 420075 cl23882 Holin_SPP1 SPP1 phage holin. This model represents one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others. [Mobile and extrachromosomal element functions, Prophage functions] 0
42237 305040 cl23883 DUF722 Protein of unknown function (DUF722). This model represents a family of phage proteins, including RinA, a transcriptional activator in staphylococcal phage phi 11. This family shows similarity to ArpU, a phage-related putative autolysin regulator, and to some sporulation-specific sigma factors. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions] 0
42238 420076 cl23884 PRESAN Plasmodium RESA N-terminal. This model represents a conserved sequence region of about 60 amino acids found in over 40 predicted proteins of Plasmodium falciparum. It is not found elsewhere, including closely related species such as Plasmodium yoelii. No member of this family is characterized. 0
42239 420077 cl23885 NTase_sub_bind Nucleotidyltransferase substrate binding protein like. The member of this family from Haemophilus influenzae, HI0074, has been shown by crystal structure to resemble nucleotidyltransferase substrate binding proteins. It forms a complex with HI0073, encoded by the adjacent gene and containing a nucleotidyltransferase nucleotide binding domain (pfam01909). 0
42240 420078 cl23886 SWM_repeat Putative flagellar system-associated repeat. This domain appears in 29 copies in a large (>10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803. 0
42241 420079 cl23887 DUF4349 Domain of unknown function (DUF4349). This model describes a protein, PhaR, localized to polyhydroxyalkanoic acid (PHA) inclusion granules in Bacillus cereus and related species. PhaR is required for PHA biosynthesis along with PhaC and may be a regulatory subunit. 0
42242 390161 cl23888 Gmx_para_CXXCG Protein of unknown function (Gmx_para_CXXCG). This family consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment. 0
42243 390162 cl23889 Dimeth_Pyl Dimethylamine methyltransferase (Dimeth_PyL). This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of M. acetivorans, M. barkeri, and M. mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence. 0
42244 329155 cl23890 Bac_small_YrzI Probable sporulation protein (Bac_small_yrzI). Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, but arrays of six in tandem in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small, acid-soluble spore proteins (SASP). A role in sporulation is possible. 0
42245 420080 cl23891 Type_III_YscG Bacterial type III secretion system chaperone protein (type_III_yscG). YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins) in Yersinia. This family consists of YscG of Yersinia, and functionally equivalent type III secretion machinery protein in other species: AscG in Aeromonas, LscG in Photorhabdus luminescens, etc. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
42246 420081 cl23892 TyeA TyeA. Members of this family include both small proteins, about 90 amino acids, in which this model covers the whole, and longer proteins of about 360 residues which match in the C-terminal region. The longer proteins (HrpJ) have N-terminal regions that match pfam07201. Members of this family belong to bacterial type III secretion systems, and include TyeA from the well-studied Yersinia systems. TyeA appears involved in calcium-responsive regulation of the delivery of type III effectors. 0
42247 390164 cl23893 HrpB2 Bacterial type III secretion protein (HrpB2). This family of genes is found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia. 0
42248 305052 cl23895 LcrG LcrG protein. This protein is found in type III secretion operons, along with LcrR, H and V. Also known as PcrG in Pseudomonas, the protein is believed to make a 1:1 complex with PcrV (LcrV). Mutants of LcrG cause premature secretion of effector proteins into the medium. 0
42249 390165 cl23896 PGPGW Putative transmembrane protein (PGPGW). Members of this family are Actinobacterial putative proteins of about 150 amino acids in length with three apparent transmembrane helices and an unusual motif with consensus sequence PGPGW. [Hypothetical proteins, Conserved] 0
42250 305054 cl23897 Phenyl_P_gamma Phenylphosphate carboxylase gamma subunit (Phenyl_P_gamma). Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs. 0
42251 420082 cl23898 SpoIIIAC Stage III sporulation protein AC/AD protein family. Members of this family are the uncharacterized protein SpoIIIAD, part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. Note that the start sites of members of this family as annotated tend to be variable; quite a few members have apparent homologous protein-coding regions continuing upstream of the first available start codon. The length of the alignment and the scoring cutoff thresholds for the model have been set to try to detect all valid members of the family, even if annotation of the start site begins too far downstream. [Cellular processes, Sporulation and germination] 0
42252 329159 cl23899 Spore_YpjB Sporulation protein YpjB (SpoYpjB). Members of this protein family, YpjB, are restricted to a subset of endospore-forming bacteria, including Bacillus species but not Clostridium or some others. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon, where sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect. This protein family is not, however, a part of the endospore formation minimal gene set. [Cellular processes, Sporulation and germination] 0
42253 305057 cl23900 DUF2375 Protein of unknown function (DUF2375). Two members of this family are found in Colwellia psychrerythraea 34H and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by TIGR02595. [Hypothetical proteins, Conserved] 0
42254 390167 cl23901 DUF3938 Protein of unknown function (DUF3938). hypothetical protein; Provisional 0
42255 329161 cl23902 DUF2689 Protein of unknown function (DUF2689). conjugal transfer protein TrbD; Provisional 0
42256 329162 cl23903 TrbE Conjugal transfer protein TrbE. conjugal transfer protein TrbE; Provisional 0
42257 420083 cl23904 DNA_Packaging_2 DNA packaging protein. DNA packaging protein, small subunit 0
42258 390169 cl23905 DUF2824 Protein of unknown function (DUF2824). head assembly protein 0
42259 420084 cl23906 UvsW ATP-dependant DNA helicase UvsW. hypothetical protein; Provisional 0
42260 305065 cl23908 RBP-H Head domain of virus receptor-binding proteins (RBP). Caudo_bapla_RBP is a family of proteins expressed from ORF18 of the Lactococcus P2-like phage. This is one of three protein species, shoulders, neck, and head, that form the phage tail base-plate. In the overall structure this head domain exists as six trimers, and is necessary for specific recognition of the receptors at the host cell surface. Siphoviridae are the P2-like Caudovirales of Lactococcus. This family now includes DUF1914. Family Baseplate, pfam16774, is the ORF15 or shoulder component of the base-plate complex. 0
42261 420085 cl23910 Replic_Relax Replication-relaxation. putative internal core protein; Provisional 0
42262 420086 cl23911 MSP Manganese-stabilizing protein / photosystem II polypeptide. photosystem II oxygen-evolving enhancer protein 1; Provisional 0
42263 420087 cl23912 GlgS Glycogen synthesis protein. Members of this family are involved in glycogen synthesis in Enterobacteria. The structure of the polypeptide chain comprises a bundle of two parallel amphipathic helices, alpha-1 and alpha-3, and a short hydrophobic helix alpha-2 sandwiched between them. 0
42264 420088 cl23913 DUF2614 Zinc-ribbon containing domain. hypothetical protein; Provisional 0
42265 420089 cl23914 UPF0257 Uncharacterized protein family (UPF0257). 0
42266 329168 cl23915 SspK Small acid-soluble spore protein K family. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspK. [Cellular processes, Sporulation and germination] 0
42267 420090 cl23916 UPF0253 Uncharacterized protein family (UPF0253). hypothetical protein; Provisional 0
42268 420091 cl23918 SspP Small acid-soluble spore protein P family. This family consists of the small acid-soluble spore proteins (SASP) P type (sspP). sspP is expressed only in the forespore compartment of the sporulating cell. sspP is also expressed under sigma-G control from the same promoter as sspO. Mutations deleting sspP cause no discernible effect on sporulation, spore properties or spore germination. 0
42269 329170 cl23919 Ribosomal_S22 30S ribosomal protein subunit S22 family. This family consists of the 30S ribosomal proteins subunit S22 polypeptides. This polypeptide is 47 amino acids in length and has a molecular weight of about 5 kDa. The S22 subunit is a component of the stationary-phase-specific ribosomal protein and is assembled in the ribosomal particles in the stationary phase. This subunit along with other stationary-phase-specific ribosomal proteins result in compositional changes of ribosomes during the stationary phase. The significance of this change is not clear as yet. 0
42270 329171 cl23920 Tafi-CsgC Thin aggregative fimbriae synthesis protein. curli assembly protein CsgC; Provisional 0
42271 420092 cl23921 YccJ YccJ-like protein. hypothetical protein; Provisional 0
42272 420093 cl23922 DUF2559 Protein of unknown function (DUF2559). hypothetical protein; Provisional 0
42273 420094 cl23924 DUF2767 Protein of unknown function (DUF2767). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 0
42274 420095 cl23925 YedD YedD-like protein. lipoprotein; Provisional 0
42275 390178 cl23926 DUF2594 Protein of unknown function (DUF2594). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 0
42276 420096 cl23927 DUF2583 Protein of unknown function (DUF2583). Some members in this family of proteins are annotated as YchH however currently no function is known. 0
42277 420097 cl23928 MsyB MsyB protein. secY/secA suppressor protein; Provisional 0
42278 420098 cl23929 YejG YejG-like protein. hypothetical protein; Provisional 0
42279 420099 cl23930 BssS BssS protein family. The BssS protein family is a group of proteins that are involved in regulation of biofilm formation. Proteins in this family are approximately 80 amino acids in length. 0
42280 420100 cl23931 Peptidase_S48 Peptidase family S48. heterocyst differentiation control protein; Reviewed 0
42281 420101 cl23932 DUF1283 Protein of unknown function (DUF1283). This family consists of several hypothetical proteins of around 115 residues in length which seem to be specific to Enterobacteria. The function of the family is unknown. 0
42282 420102 cl23933 UPF0370 Uncharacterized protein family (UPF0370). hypothetical protein; Provisional 0
42283 420103 cl23934 YebF YebF-like protein. hypothetical protein; Provisional 0
42284 390185 cl23935 PsiA PsiA protein. plasmid SOS inhibition protein A; Provisional 0
42285 305093 cl23936 PTZ00202 N/A. tuzin-like protein; Provisional 0
42286 305094 cl23937 exosort_Gpos exosortase family protein XrtG. Members of this protein family, ArtF, belong to the archaeosortase/exosortase family, in which many members associate with specific protein C-terminal putative protein sorting domains (exosortase A with PEP-CTERM, archaeosortase A with PGF-CTERM, etc.). This subgroup is observed in Thermococcus gammatolerans EJ3 and Thermococcus sp. AM4, but the gene neighborhood is not conserved. The cognate sequence to ArtF is unknown, but should not be ICGP-CTERM (model TIGR04288), found also in many Pyrococcus species that lack any archaeosortase family member. 0
42287 420104 cl23938 GA GA module. The protein G-related albumin-binding (GA) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin. 0
42288 420105 cl23940 PYST-C1 Plasmodium yoelii subtelomeric region (PYST-C1). This model represents the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604). 0
42289 420106 cl23941 ChpXY CO2 hydration protein (ChpXY). This small family of proteins includes paralogs ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations. [Energy metabolism, Photosynthesis] 0
42290 305099 cl23942 Paramecium_SA Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. The domain contains 8 cysteine residues. 0
42291 420107 cl23944 CaM_binding Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defence responses and stolonisation or tuberization. 0
42292 420108 cl23945 FragX_IP Cytoplasmic Fragile-X interacting family. Protein PIR; Provisional 0
42293 305103 cl23946 PHA00148 N/A. putative lower collar protein 0
42294 390190 cl23947 DUF3653 Phage protein. putative transcription regulator 0
42295 420109 cl23948 Peptidase_S80 Bacteriophage T4-like capsid assembly protein (Gp20). portal vertex protein; Provisional 0
42296 420110 cl23949 Late_protein_L1 L1 (late) protein. major capsid L1 protein; Provisional 0
42297 305108 cl23951 Pox_vIL-18BP Orthopoxvirus interleukin 18 binding protein. IL-18 binding protein; Provisional 0
42298 420111 cl23953 DUF212 Divergent PAP2 family. This family is related to the pfam01569 family (personal obs: C Yeats). 0
42299 420112 cl23954 DHNA Dihydroneopterin aldolase. 0
42300 305112 cl23955 COG2122 Uncharacterized protein, UPF0280 family, ApbE superfamily [Function unknown]. hypothetical protein; Provisional 0
42301 420113 cl23956 YitT_membrane Uncharacterized 5xTM membrane BCR, YitT family COG1284. This is probably a bacterial ABC transporter permease (personal obs:Yeats C). 0
42302 420114 cl23957 Lys_export Lysine exporter LysO. Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. This family includes lysine exporter LysO (YbjE) from E. coli. 0
42303 420115 cl23958 DUF1040 Protein of unknown function (DUF1040). This family consists of several bacterial YihD proteins of unknown function. 0
42304 390198 cl23959 DUF1495 Winged helix DNA-binding domain (DUF1495). This family consists of several hypothetical archaeal proteins of around 110 residues in length. The structure of this domain possesses a winged helix DNA-binding domain suggesting these proteins are bacterial transcription factors. 0
42305 420116 cl23960 DUF4097 Putative adhesin. This bacterial family of proteins shows structural similarity to other pectin lyase families. Although structures from this family align with acetyl-transferases, there is no conservation of catalytic residues found. It is likely that the function is one of cell-adhesion. In Structure 3jx8, it is interesting to note that the sequence contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that this family belongs to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN). 0
42306 420117 cl23961 DUF2106 Predicted membrane protein (DUF2106). This domain, found in various hypothetical archaeal proteins, has no known function. 0
42307 420118 cl23962 DUF1959 Domain of unknown function (DUF1959). This domain is found in a set of uncharacterized Archaeal hypothetical proteins. Its function has not, as yet, been described. 0
42308 420119 cl23964 DUF2324 Putative membrane peptidase family (DUF2324). This domain, found in various hypothetical bacterial proteins, has no known function. This family appears to be related to the prenyl protease 2 family pfam02517, suggesting this family may be peptidases. 0
42309 420120 cl23965 DUF910 Bacterial protein of unknown function (DUF910). This family consists of several short bacterial proteins of unknown function. 0
42310 420121 cl23966 Phage_TAC_7 Phage tail assembly chaperone proteins, E, or 41 or 14. This is a family of various Myoviridae bacteriophage tail assembly chaperone, or TAC, proteins. 0
42311 420122 cl23967 DUF1033 Protein of unknown function (DUF1033). This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown. 0
42312 420123 cl23968 DUF1128 Protein of unknown function (DUF1128). This family consists of several short, hypothetical bacterial proteins of unknown function. 0
42313 420124 cl23969 DUF2187 Uncharacterized protein conserved in bacteria (DUF2187). This domain, found in various hypothetical bacterial proteins, has no known function. 0
42314 390205 cl23970 DUF2188 Uncharacterized protein conserved in bacteria (DUF2188). This domain, found in various hypothetical bacterial proteins, has no known function. 0
42315 329205 cl23971 DUF2197 Uncharacterized protein conserved in bacteria (DUF2197). This domain, found in various hypothetical bacterial proteins, has no known function. 0
42316 420125 cl23972 DUF2198 Uncharacterized protein conserved in bacteria (DUF2198). This domain, found in various hypothetical bacterial proteins, has no known function. 0
42317 420126 cl23973 Glycoamylase Putative glucoamylase. The structure of UniProt:Q5LIB7 has an alpha/alpha toroid fold and is similar structurally to a number of glucoamylases. Most of these structural homologs are glucoamylases, involved in breaking down complex sugars (e.g. starch). The biologically relevant state is likely to be monomeric. The putative active site is located at the centre of the toroid with a well defined large cavity. 0
42318 420127 cl23974 zinc_ribbon_10 Predicted integral membrane zinc-ribbon metal-binding protein. This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, is probably a zinc-ribbon. 0
42319 420128 cl23975 DUF1127 Domain of unknown function (DUF1127). This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region whereas in others it represents the whole sequence. 0
42320 420129 cl23976 DUF1189 Protein of unknown function (DUF1189). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown. 0
42321 305135 cl23978 UPF0259 Uncharacterized protein family (UPF0259). hypothetical protein; Provisional 0
42322 355112 cl23979 UspB Universal stress protein B (UspB). universal stress protein UspB; Provisional 0
42323 420130 cl23980 MarB MarB protein. The MarB protein is found in the multiple antibiotic resistance (mar) locus in Escherichia coli. The MarB protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved GSDKSD sequence motif. 0
42324 305138 cl23981 PyrBI_leader PyrBI operon leader peptide. This family consists of the pyrBI operon leader peptides. The expression of the pyrBI operon, which encodes the subunits of the pyrimidine biosynthetic enzyme aspartate transcarbamylase, is regulated primarily through a UTP-sensitive transcriptional attenuation control mechanism. In this mechanism, the concentration of UTP determines the extent of coupling between transcription and translation within the pyrBI leader region, hence determining the level of rho-independent transcriptional termination at an attenuator preceding the pyrB gene. 0
42325 420131 cl23982 DUF3561 Protein of unknown function (DUF3561). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length. 0
42326 420132 cl23983 DUF1422 Protein of unknown function (DUF1422). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 0
42327 420133 cl23984 PsiB Plasmid SOS inhibition protein (PsiB). This family consists of several plasmid SOS inhibition protein (PsiB) sequences. 0
42328 420134 cl23985 Endostatin-like N/A. NC10 stands for Non-helical region 10 and is taken from COL15A1. A mutation in this region in COL18A1 is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumor suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor. Endostatin also binds a zinc ion near the N-terminus; this is likely to be of structural rather than functional importance. 0
42329 420135 cl23986 DAXX_helical_bundle Helical bundle domain of the death-domain associated protein (DAXX). The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis. Daxx forms a complex with Axin. Remodelling of the family to a short domain based on the Structure 2kzs structure gives a more representative family. DAXX is a scaffold protein shown to play diverse roles in transcription and cell cycle regulation. This N-terminal domain folds into a left-handed four-helix bundle (H1, H2, H4, H5) that binds to the N-terminal residues of the tumor-suppressor Rassf1C. 0
42330 329218 cl23987 FBA_1 F-box associated. This model describes a large family of plant domains, with several hundred members in Arabidopsis thaliana. Most examples are found C-terminal to an F-box (pfam00646), a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain. 0
42331 390217 cl23989 Gram_pos_anchor LPXTG cell wall anchor motif. This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other] 0
42332 420136 cl23990 S-layer S-layer protein. This model represents a sequence region found tandemly duplicated in two proven archaeal S-layer glycoproteins, MA0829 from Methanosarcina acetivorans C2A and MM1976 from Methanosarcina mazei Go1, as well as in several paralogs of those S-layer proteins from both species. Members of the family show regions of local similarity to another known family of archaeal S-layer proteins described by model TIGR01564. Some members of this family, including the proven S-layer proteins, have the archaeosortase A target motif, PGF-CTERM (TIGR04126), at the protein C-terminus. [Cell envelope, Surface structures] 0
42333 420137 cl23991 Phage_holin_3_1 Phage holin family (Lysis protein S). This model represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. [Mobile and extrachromosomal element functions, Prophage functions] 0
42334 420138 cl23992 Wx5_PLAF3D7 Protein of unknown function (Wx5_PLAF3D7). This model represents a family of at least four proteins in Plasmodium falciparum. An interesting feature is five perfectly conserved Trp residues. 0
42335 420139 cl23994 Prophage_tail Prophage endopeptidase tail. This model represents the conserved N-terminal region, typically from about residue 25 to about residue 350, of a family of uncharacterized phage proteins 500 to 1700 residues in length. [Mobile and extrachromosomal element functions, Prophage functions] 0
42336 420140 cl23995 Phage_XkdX Phage uncharacterized protein (Phage_XkdX). This model represents a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes. [Mobile and extrachromosomal element functions, Prophage functions] 0
42337 420141 cl23996 DUF576 Csa1 family. Members of this family are predicted lipoproteins (mostly), found in Staphylococcus aureus in several different tandem clusters in pathogenicity islands. Members are also found, clustered, in Staphylococcus epidermidis. 0
42338 420142 cl23997 Phage_TAC_6 Phage tail assembly chaperone protein, TAC. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. [Mobile and extrachromosomal element functions, Prophage functions] 0
42339 420143 cl23998 Phage_T4_gp19 T4-like virus tail tube protein gp19. This family consists of uncharacterized proteins. All members so far represent bacterial genes found in apparent phage or otherwise laterally transferred regions of the chromosome. Tentatively identified neighboring proteins tend to be phage tail region proteins. In some species, including Photorhabdus luminescens TTO1, several members of this family may be encoded near each other. 0
42340 420144 cl23999 DUF2388 Protein of unknown function (DUF2388). This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1, and also in Pseudomonas putida KT2440, four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown. 0
42341 420145 cl24000 Pec_lyase Pectic acid lyase. Members of this family are isozymes of pectate lyase (EC 4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase. [Energy metabolism, Biosynthesis and degradation of polysaccharides] 0
42342 420146 cl24001 Flg_hook Flagellar hook-length control protein FliK. Members of this family include YscP of the Yersinia type III secretion system and equivalent proteins in other animal pathogen bacterial type III secretion systems. The model describes the conserved C-terminal region. N-terminal regions are poorly conserved and variable in length with some low-complexity sequence. 0
42343 305159 cl24002 Dot_icm_IcmQ Dot/Icm secretion system protein (dot_icm_IcmQ). Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation. 0
42344 420147 cl24004 CBP_BcsF Cellulose biosynthesis protein BcsF. Members of this protein family are found invariably together with genes of bacterial cellulose biosynthesis, and are presumed to be involved in the process. Members average about 63 amino acids in length and are not uncharacterized. The gene has been designated both YhjT and BcsF (bacterial cellulose synthesis F). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
42345 420148 cl24005 DUF2570 Protein of unknown function (DUF2570). Members of this protein family are phage lysis regulatory protein, including the well-studied protein LysB (lysis protein B) of Enterobacteria phage P2. For members of this family, genes are found in phage or in prophage regions of bacterial genomes, typically near a phage lysozyme or phage holin. 0
42346 420149 cl24006 YidC_periplas YidC periplasmic domain. Essentially all bacteria have a member of the YidC family, whose C-terminal domain is modeled by TIGR03592. The two copies found in endospore-forming bacteria such as Bacillus subtilis appear redundant during vegetative growth, although the member designated spoIIIJ (stage III sporulation protein J) has a distinct role in spore formation. YidC, its mitochondrial homolog Oxa1, and its chloroplast homolog direct insertion into the bacterial/organellar inner (or only) membrane. This model describes an N-terminal sequence region, including a large periplasmic domain lacking in YidC members from Gram-positive species. The multifunctional YidC protein acts both with and independently of the Sec system. [Protein fate, Protein and peptide secretion and trafficking] 0
42347 420150 cl24007 DUF2976 Protein of unknown function (DUF2976). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 0
42348 420151 cl24008 CtnDOT_TraJ homologs of TraJ from Bacteroides conjugative transposon. Members of this protein family are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. This family is related to conjugation system proteins in the Proteobacteria, including TrbL of Agrobacterium Ti plasmids and VirB6. [Cellular processes, DNA transformation] 0
42349 390233 cl24009 Parvo_NS1 Parvovirus non-structural protein NS1. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508. 0
42350 420152 cl24013 B_lectin N/A. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria. 0
42351 420153 cl24015 DDE_Tnp_ISL3 Transposase. This domain was identified by Babu and colleagues. 0
42352 305174 cl24017 PCNA N/A. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA. 0
42353 420154 cl24018 DUF4143 Domain of unknown function (DUF4143). This domain is almost always found C-terminal to an ATPase core family. 0
42354 420155 cl24019 CSN8_PSD8_EIF3K CSN8/PSMD8/EIF3K family. This family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localization of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. 0
42355 420156 cl24020 Pyocin_S S-type Pyocin. The C-terminal region of colicin-like bacteriocins is either a pore-forming or an endonuclease-like domain. Cloacin and Pyocins have similar structures and activities to the colicins from E. coli and the klebicins from Klebsiella spp. Colicins E5 and D cleave the anticodon loops of distinct tRNAs of Escherichia coli both in vivo and in vitro. The full-length molecule has an N-terminal translocation domain and a middle, double alpha-helical region which is receptor-binding. 0
42356 420157 cl24021 NPR3 Nitrogen Permease regulator of amino acid transport activity 3. This family of regulators are involved in post-translational control of nitrogen permease. 0
42357 420159 cl24023 Cyclase_polyket Polyketide synthesis cyclase. Members of this family have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site. Many of these proteins contain two copies of this presumed domain. 0
42358 420163 cl24030 SUR7 SUR7/PalI family. During the mating process of yeast cells, two Ca2+ influx pathways become activated. The resulting elevation of cytosolic free Ca2+ activates downstream signaling factors that promote long term survival of unmated cells. Fig1 is a regulator of the low affinity Ca2+ influx system (LACS), and is also required for efficient membrane fusion during yeast mating. 0
42359 420164 cl24032 QCR10 Ubiquinol-cytochrome-c reductase complex subunit (QCR10). The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is an essential component of the mitochondrial cellular respiratory chain. This family represents the 6.4kD protein, which may be closely linked to the iron-sulphur protein in the complex and function as an iron-sulphur protein binding factor. 0
42360 420165 cl24033 zf-NADH-PPase NADH pyrophosphatase zinc ribbon domain. This domain occurs at the N-terminus of several Nudix (Nucleoside Diphosphate linked to X) hydrolases. 0
42361 420167 cl24037 MRP-S25 Mitochondrial ribosomal protein S25. MRP-S23 is one of the proteins that makes up the 55S ribosome in eukaryotes from nematodes to humans. It does not appear to carry any common motifs, either RNA binding or ribosomal protein motifs. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble coordinately with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system. MRP-S23 is significantly up-regulated in uterine cancer cells. 0
42362 420168 cl24038 SNAP Soluble N-ethylmaleimide-sensitive factor (NSF) Attachment Protein family. Neuromuscular junction formation relies upon the clustering of acetylcholine receptors and other proteins in the muscle membrane. Rapsyn is a peripheral membrane protein that is selectively concentrated at the neuromuscular junction and is essential for the formation of synaptic acetylcholine receptor aggregates. Acetylcholine receptors fail to aggregate beneath nerve terminals in mice where rapsyn has been knocked out. The N-terminal six amino acids of rapsyn are its myristoylation site, and myristoylation is necessary for the targeting of the protein to the membrane. 0
42363 420169 cl24040 AbiEi_4 Transcriptional regulator, AbiEi antitoxin. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 0
42364 420173 cl24047 DUF4173 Domain of unknown function (DUF4173). Members of this family are annotated as putative inner membrane proteins. 0
42365 420177 cl24051 BBS2_Mid Ciliary BBSome complex subunit 2, middle region. Members of this family are annotated as being integrin-alpha FG-GAP repeat-containing protein 2. 0
42366 420179 cl24054 CTC1 CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. This family largely represents sequences from plant species. 0
42367 420186 cl24062 Phage_holin_3_3 LydA holin phage, holin superfamily III. Phage_holin_6_2 is a family of holins classified as 1.E.20 in the TC database. The hol gene (PRF9) product (117 aas) of Pseudomonas aeruginosa PAO1 exhibits a hydrophobicity profile similar to holins of P2 and phiCTX phages with two peaks of hydrophobicity that might correspond to either one or two TMSs. Hol functions in conjunction with the lytic enzyme, Lys, a glycosyl hydrolase that breaks-up the murein in the bacterial cell-wall, causing lysis of the cell and hence entry of phage particles. Several members are annotated as pyocin R2_PP when encoded on the chromosome. 0
42368 305223 cl24066 SA1633_like Uncharacterized protein family conserved in Staphylococci. This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown. 0
42369 390276 cl24077 Zn_ribbon_17 Zinc-ribbon, C4HC2 type. This family is found at the C-terminus of WD40 repeat structures in eukaryotes. 0
42370 420196 cl24079 Helo_like_N Fungal N-terminal domain of STAND proteins. This is a family of fungal N-terminal domains that appear at the N-terminus of P-loop NTPases, NACHT-NTPases and Ankyrin or WD repeat proteins. The exact function is not known. 0
42371 420197 cl24084 DUF5669 Family of unknown function (DUF5669). Members of this family are found, so far, only in the Gammaproteobacteria. The function is unknown. The location on the chromosome usually is not far from housekeeping genes rather than in what is clearly, say, a prophage region. Some members have been annotated in public databases as DNA-binding protein inhibitor Id-2-related protein, putative transcriptional regulator, or hypothetical DNA binding protein. [Hypothetical proteins, Conserved] 0
42372 420257 cl24259 Transglut_core3 Transglutaminase-like superfamily. This family includes uncharacterized proteins that are related to the transglutaminase like domain pfam01841. 0
42373 420385 cl24410 CASIMO1 Cancer Associated Small Integral Membrane Open reading frame 1 (CASIMO1). This family of proteins is found in eukaryotes. Proteins in this family are typically between 68 and 91 amino acids in length. Members are single-pass membrane proteins. 0
42374 420631 cl24758 DUF5128 6-bladed beta-propeller. This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown. 0
42375 420752 cl24939 T2SSB Type II secretion system protein B. GspB (general secretory pathway B) occurs in type II secretion systems (T2SS) and is viewed as an accessory protein, a factor involved in the assembly process rather than integral to the completed T2SS apparatus. 0
42376 420758 cl24946 FlgT_N Flagellar assembly protein T, N-terminal domain. Members of this family are lipoprotein LipL46, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 0
42377 420886 cl25130 DotD DotD protein. Members of this family are the lipoprotein DotD from type IVB secretion systems, which are also called Dot/Icm secretion systems. DotD is related to conjugal transfer protein TraH as that term is used in IncI1 plasmid transfer regions. 0
42378 420941 cl25222 MaAIMP_sms Putative methionine and alanine importer, small subunit. MetS, as described in the Gram-positive bacterium Corynebacterium glutamicum, is the small subunit of MetPS, an NSS (Neurotransmitter:Sodium Symporter) transporter involved in methionine and alanine import. While MetS itself is small, only 60 amino acids, homologs in gamma proteobacteria such as Vibrio sp., similarly found next to an NSS transporter large subunit, may be barely half that length and consist almost entirely of a predicted hydrophobic region that would localize to within the plasma membrane. 0
42379 391092 cl25225 Porin_7 Putative general bacterial porin. Members of this family are outer membrane beta-barrel proteins that facilitate passive transport from the extracellular milieu into the periplasm. Known members are limited to the genus Acinetobacter, and the name, Omp33-36, reflects variability of this protein across the lineage. Note that this HMM previously was named CarO in error. Both this protein and CarO affect carbapenem transport across the outer membrane and thus carbapenem susceptibility or resistance. 0
42380 420976 cl25298 QVR Sleepless protein. This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown, with some of the sequences annotated as being Palmitoyltransferase. This highly conserved domain is located at the amino terminus, next to a DHHC domain (named for its signature tetrapeptide Asp-His-His-Cys). 0
42381 330171 cl25349 EFh_SPARC_EC EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC)-like proteins. SMOC-2, also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2 (SMAP-2), is a ubiquitously expressed matricellular protein that enhances the response to angiogenic growth factors, mediate cell adhesion, keratinocyte migration, and metastasis. It is also associated with vitiligo and craniofacial and dental defects. Moreover, SMOC-2 acts as an Arf1 GTPase-activating protein (GAP) that interacts with clathrin heavy chain (CHC) and clathrin assembly protein CALM and functions in the retrograde, early endosome/trans-Golgi network (TGN) pathway in a clathrin- and AP-1-dependent manner. It also contributes to mitogenesis via activation of integrin-linked kinase (ILK). SMOC-2 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-1, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs. 0
42382 355382 cl25352 EFh_PEF The penta-EF hand (PEF) family. CAPN2, also termed millimolar-calpain (m-calpain), or calpain-2 catalytic subunit, or calcium-activated neutral proteinase 2 (CANP 2), or calpain large polypeptide L2, or calpain-2 large subunit, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms an m-calpain heterodimer. CAPN2 acts as the key protease responsible for N-methyl-d-aspartic acid (NMDA)-induced cytoplasmic polyadenylation element-binding protein 3 (CPEB3) degradation in neurons. It cleaves several components of the focal adhesion complex, such as FAK and talin, triggering disassembly of the complex at the rear of the cell. The stimulation of CAPN2 activity is required for Golgi antiapoptotic proteins (GAAPs) to promote cleavage of FA kinase (FAK), cell spreading, and enhanced migration. Calpain 2 is also involved in the onset of glial differentiation. It regulates proliferation, survival, migration, and tumorigenesis of breast cancer cells through a PP2A-Akt-FoxO-p27(Kip1) signaling cascade. Its expression is associated with response to platinum based chemotherapy, progression-free and overall survival in ovarian cancer. Moreover, CAPN2 may play a role in fundamental mitotic functions, such as the maintenance of sister chromatid cohesion. The activation of CAPN2 plays an essential role in hippocampal synaptic plasticity and in learning and memory. In the eye, CAPN2, together with a lens-specific variant of CAPN3, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation. Sometimes, CAPN2 compensates for loss of CAPN1, and both calpain isoforms are involved in AngII-induced aortic aneurysm formation. The main phosphorylation sites in m-calpain are Ser50 and Ser369/Thr370. 0
42383 330175 cl25354 EFh_CREC EF-hand, calcium binding motif, found in CREC-EF hand family. RCN-3, also termed EF-hand calcium-binding protein RLP49, is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal His-Asp-Glu-Leu (HDEL) tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4. 0
42384 330177 cl25356 EFh_parvalbumin_like EF-hand, calcium binding motif, found in parvalbumin-like EF-hand family. Beta-parvalbumin, also termed Oncomodulin-1 (OM), is a small calcium-binding protein that is expressed in hepatomas, as well as in the blastocyst and the cytotrophoblasts of the placenta. It is also found to be expressed in the cochlear outer hair cells of the organ of Corti and frequently expressed in neoplasms. Mammalian beta-parvalbumin is secreted by activated macrophages and neutrophils. It may function as a tissue-specific Ca2+-dependent regulatory protein, and may also serve as a specialized cytosolic Ca2+ buffer. Beta-parvalbumin acts as a potent growth-promoting signal between the innate immune system and neurons in vivo. It has high and specific affinity for its receptor on retinal ganglion cells (RGC) and functions as the principal mediator of optic nerve regeneration. It exerts its effects in a cyclic adenosine monophosphate (cAMP)-dependent manner and can further elevate intracellular cAMP levels. Moreover, beta-parvalbumin is associated with efferent function and outer hair cell electromotility, and can identify different hair cell types in the mammalian inner ear. Beta-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. The EF site displays a high-affinity for Ca2+/Mg2+, and the CD site is a low-affinity Ca2+-specific site. In addition, beta-parvalbumin is distinguished from other parvalbumins by its unusually low isoelectric point (pI = 3.1) and sequence eccentricities (e.g., Y57-L58-D59 instead of F57-I58-E59). 0
42385 355383 cl25360 DMSOR_beta-like Beta subunit of the DMSO Reductase (DMSOR) family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well. Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI). The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. 0
42386 355384 cl25362 YitT_C_like C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, characteristic of various bacterial proteins, has no known function. It has been given the designation DUF2179 and is similar to the C-terminus of the Bacillus subtilis membrane protein. 0
42387 355385 cl25364 beta_Kdo_transferase beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. 0
42388 355386 cl25366 ChuX-like heme utilization protein ChuX and similar proteins. This family contains the C-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both, heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX. 0
42389 355389 cl25402 iSH2_PI3K_IA_R Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunits. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p85beta, also called PIK3R2, contains N-terminal SH3 and GAP domains. It is expressed ubiquitously but at lower levels than p85alpha. Its expression is increased in breast and colon cancer, correlates with tumor progression, and enhanced invasion. During viral infection, the viral nonstructural (NS1) protein binds p85beta specifically, which leads to PI3K activation and the promotion of viral replication. Mice deficient with PIK3R2 develop normally and exhibit moderate metabolic and immunological defects. 0
42390 421006 cl25407 ClassIIa_HDAC_Gln-rich-N Glutamine-rich N-terminal helical domain of various Class IIa histone deacetylases (HDAC4, HDAC5 and HDAC9). This domain is found in eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam00850. The domain forms an alpha helix which complexes to form a tetramer. The glutamine rich domains have many intra- and inter-helical interactions which are thought to be involved in reversible assembly and disassembly of proteins. The domain is part of histone deacetylase 4 (HDAC4) which removes acetyl groups from histones. This restores their positive charge to allow stronger DNA binding thus restricting transcriptional activity. 0
42391 355393 cl25421 Lnt Apolipoprotein N-acyltransferase [Cell wall/membrane/envelope biogenesis]. apolipoprotein N-acyltransferase; Reviewed 0
42392 421007 cl25432 RseA_N N-terminal domain of RseA. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress. 0
42393 421008 cl25434 MukF_C bacterial condensin complex subunit MukF, C-terminal domain. This presumed domain is found at the C-terminus of the MukF protein. 0
42394 355396 cl25446 PRK15055 anaerobic sulfite reductase subunit AsrA. Members of this protein family include the A subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substrates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism] 0
42395 330301 cl25480 FA58C N/A. Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. 0
42396 421067 cl25556 tRNA-synt_2_TM Transmembrane region of lysyl-tRNA synthetase. tRNA-synt_2_TM is a family from the N-terminal region of tRNA-synthase-2, with 6xTMs. The presence of this region indicates that the protein is anchored in the membrane. The family is found in Actinobacteria. 0
42397 421069 cl25561 TadE TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues. 0
42398 421071 cl25563 Mal_decarbox_Al Malonate decarboxylase, alpha subunit, transporter. This model describes malonate decarboxylase alpha subunit, from both the water-soluble form as found in Klebsiella pneumoniae and the form coupled to sodium ion pumping in Malonomonas rubra. Malonate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases. Essentially, it couples the energy derived from decarboxylation of a carboxylic acid substrate to move Na+ ions across the bilayer. Functional malonate decarboxylase is a multi-subunit protein. The alpha subunit enzymatically performs the transfer of malonate (substrate) to an acyl carrier protein subunit for subsequent decarboxylation, hence the name: acetyl-S-acyl carrier protein:malonate carrier protein-SH transferase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other] 0
42399 391257 cl25564 OFeT_1 Ferrous iron uptake permease, iron-lead transporter. OFeT_1 is a family of conserved archaeal membrane proteins that are putative oxidase-dependent Fe2+ transporters. 0
42400 391260 cl25584 RNA_pol_inhib RNA polymerase inhibitor. inhibitor of host bacterial RNA polymerase 0
42401 421073 cl25585 Stomagen Stomagen. stomagen; Provisional 0
42402 421079 cl25598 TRF2_RBM RAP1 binding motif of telomere repeat binding factor. This domain, found in telomeric repeat-binding factor 2, binds to the C-terminus of repressor activator protein 1 (RAP1) (telomeric repeat-binding factor 2-interacting protein 1). 0
42403 391269 cl25608 MSL2_CXC DNA-binding cysteine-rich domain of male-specific lethal 2 and related proteins. MSL2-CXC is an autonomously folded domain that binds three zinc ions. It lies on the E3 ubiquitin-protein ligase MSL2 in eukaryotes. The CXC domain critically contributes to the DNA-binding activity of MSL2. It carries 9 invariant cysteines within about a 50 residue region. 0
42404 421085 cl25616 Apocytochr_F_N Apocytochrome F, N-terminal. cytochrome f 0
42405 421098 cl25646 zn-ribbon_14 Zinc-ribbon. [Hypothetical proteins, Conserved] 0
42406 355433 cl25655 Patched Patched family. The transmembrane protein Patched is a receptor for the morphogen Sonic Hedgehog. This protein associates with the smoothened protein to transduce hedgehog signals. 0
42407 421105 cl25664 YaiA YaiA protein. hypothetical protein; Provisional 0
42408 421106 cl25669 DUF1735 Domain of unknown function (DUF1735). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown. 0
42409 421107 cl25673 CcmF_C Cytochrome c-type biogenesis protein CcmF C-terminal. Members of this protein family closely resemble the CcmF protein of the CcmABCDEFGH system, or system I, for c-type cytochrome biogenesis (GenProp0678). Members are found, as a rule, next to closely related paralogs of CcmG and CcmH and always located near other genes associated with the cytochrome c nitrite reductase enzyme complex. As a rule, members are found in species that also encode bona fide members of the CcmF, CcmG, and CcmH families. 0
42410 421111 cl25682 DUF4912 Domain of unknown function (DUF4912). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 0
42411 421112 cl25683 DUF4910 Domain of unknown function (DUF4910). This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet. Rebuilding from Structure 3kt9 shows this is an inserted (nested) domain within the aminopeptidase. The function of this small domain is not known. 0
42412 330507 cl25686 COG5135 Uncharacterized protein [Function unknown]. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. This subfamily described here is widespread in Cyanobacteria and plants, and is named for alr4036 from Nostoc sp. PCC 7120. The family consists mostly of proteins from species that lack the capability to synthesize F420, so it is probable that all members bind FMN rather than F420. [Unknown function, Enzymes of unknown specificity] 0
42413 355439 cl25705 COG4745 Predicted membrane-bound mannosyltransferase [General function prediction only]. Members of this protein family, uncommon and rather sporadically distributed, are found almost always in the same genomes as members of family TIGR03662, and frequently as a nearby gene. Members show some N-terminal sequence similarity with pfam02366, dolichyl-phosphate-mannose-protein mannosyltransferase. The few invariant residues in this family, found toward the N-terminus, include a dipeptide DE, a tripeptide HGP, and two different Arg residues. Up to three members may be found in a genome. The function is unknown. 0
42414 421146 cl25771 ComGF Putative Competence protein ComGF. ComGF is a family of putative bacterial competence proteins. 0
42415 421190 cl25857 RmuC RmuC family. DNA recombination protein RmuC; Provisional 0
42416 421217 cl25912 Bacuni_01323_like Uncharacterized protein conserved in Bacteroidetes. Large family of predicted secreted proteins mostly from CFG group, but also from Burkholderia, Pseudomonas and Streptomyces. Function of these proteins is not known. A 3D structure of a representative of this family from Bacteroides uniformis was solved by JCSG and deposited to PDB as 4ghb. There is some overlap with RHS-repeat (PF05593) family despite lack of obvious repeats in the structure. 0
42417 330753 cl25932 Peptidase_S21 Assemblin (Peptidase family S21). hypothetical protein; Provisional 0
42418 355468 cl25954 PRK06718 NAD(P)-binding protein. precorrin-2 dehydrogenase; Validated 0
42419 355469 cl25961 POLXc DNA polymerase X family. includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases 0
42420 421244 cl25973 RNA_pol DNA-dependent RNA polymerase. T3/T7-like RNA polymerase 0
42421 421249 cl25995 TM_EphA1 Transmembrane domain of Ephrin Receptor A1 Protein Tyrosine Kinase. Epha2_TM represents the left-handed dimer transmembrane domain of EphA2 receptor. This domain oligomerizes and is important for the active signalling process. 0
42422 421254 cl26030 Topo_C_assoc C-terminal topoisomerase domain. DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomeras 0
42423 421256 cl26041 Gly_rich_SFCGS Glycine-rich SFCGS. Members of this family of small (about 120 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved] 0
42424 421257 cl26042 Arylsulfotrans Arylsulfotransferase (ASST). This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate. 0
42425 421260 cl26057 YdfZ YdfZ protein. This small protein has a very limited distribution, being found so far only among some gamma-Proteobacteria. The member from Escherichia coli was shown to bind selenium in the absence of a working SelD-dependent selenium incorporation system. Note that while the E. coli member contains a single Cys residue, a likely selenium binding site, some other members of this protein family contain two Cys residues or none. [Unknown function, General] 0
42426 421261 cl26058 MgrB MgrB protein. The MgrB protein is a short lipoprotein. The mgrB gene has a Mg2+ responsive promoter. Deletion of mgrB results in a potent increase in PhoP-regulated transcription. The PhoQ/PhoP signaling system responds to low magnesium and the presence of certain cationic antimicrobial peptides. Over-expression of mgrB decreased transcription at both high and low concentrations of magnesium. Localization and bacterial two-hybrid studies suggest that MgrB resides in the inner-membrane and interacts directly with PhoQ. This domain family is found in bacteria, and is approximately 40 amino acids in length. There are two conserved sequence motifs: CDQ and GIC. 0
42427 421262 cl26078 Clathrin Region in Clathrin and VPS. Each region is about 140 amino acids long. The regions are composed of multiple alpha helical repeats. They occur in the arm region of the Clathrin heavy chain. 0
42428 421264 cl26089 DDE_Tnp_IS240 DDE domain. This DDE domain is found in a wide variety of transposases including those found in IS240, IS26, IS6100 and IS26. 0
42429 421266 cl26118 LPAM_2 Prokaryotic lipoprotein-attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached. 0
42430 421268 cl26132 DUF3029 Protein of unknown function (DUF3029). Members of this family are homologs to enzymes known to undergo activation by a radical SAM protein to create an active site glycyl radical. This family appears to be activated by the YjjW radical SAM protein, usually encoded by an adjacent gene. [Unknown function, Enzymes of unknown specificity] 0
42431 421271 cl26153 Glyco_trans_1_3 Glycosyl transferase family 1. This model represents nearly the full length of MJ1255 from Methanococcus jannaschii and of an unpublished protein from Vibrio cholerae, as well as the C-terminal half of a protein from Methanobacterium thermoautotrophicum. A small region (~50 amino acids) within the domain appears related to a family of sugar transferases. [Hypothetical proteins, Conserved] 0
42432 421272 cl26156 Acetyltransf_8 Acetyltransferase (GNAT) domain. AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle. 0
42433 421273 cl26163 FUSC Fusaric acid resistance protein family. This family includes a conserved region found in two proteins associated with fusaric acid resistance, FusC from Burkholderia cepacia and fdt-2 from Klebsiella oxytoca. These proteins are likely to be membrane transporter proteins. 0
42434 391485 cl26168 Dehalogenase Reductive dehalogenase subunit. This model represents a family of corrin and 8-iron Fe-S cluster-containing reductive dehalogenases found primarily in halorespiring microorganisms such as Dehalococcoides ethenogenes, which contains as many as 17 enzymes of this type with varying substrate ranges. One example of a characterized species is the tetrachloroethene reductive dehalogenase (1.97.1.8), which also acts on trichloroethene, converting it to dichloroethene. 0
42435 421277 cl26199 Transglut_core2 Transglutaminase-like superfamily. 0
42436 421279 cl26235 RecQL4_SLD2_NTD N-terminal homeodomain-like domain of metazoan RecQ protein-like 4 (RecQL4), fungal DNA replication regulator SLD2 and similar proteins. Genome duplication is precisely regulated by cyclin-dependent kinases CDKs, which bring about the onset of S phase by activating replication origins and then prevent re-licensing of origins until mitosis is completed. The optimum sequence motif for CDK phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found to have at least 11 potential phosphorylation sites. Drc1 is required for DNA synthesis and S-M replication checkpoint control. Drc1 associates with Cdc2 and is phosphorylated at the onset of S phase when Cdc2 is activated. Thus Cdc2 promotes DNA replication by phosphorylating Drc1 and regulating its association with Cut5. Sld2 and Sld3 represent the minimal set of S-CDK substrates required for DNA replication. 0
42437 421281 cl26244 VSG_B Trypanosomal VSG domain. variant surface glycoprotein; Provisional 0
42438 421283 cl26253 SCIFF Six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein, family TIGR03974, that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase. 0
42439 421288 cl26307 FUSC-like FUSC-like inner membrane protein yccS. This model represents two clades of putative transmembrane proteins including the E. coli YccS and YhfK proteins. The YccS hypothetical equivalog (TIGR01666) is found in beta and gamma proteobacteria, while the smaller YhfK group is only found in E. coli, Salmonella and Yersinia. TMHMM on the 19 hits to this model shows a consensus of 11 transmembrane helices separated into two clusters, an N-terminal cluster of 6 and a central cluster of 5. This would indicate two non-membrane domains one on each side of the membrane 0
42440 421293 cl26348 T2SSppdC Type II secretion prepilin peptidase dependent protein C. 0
42441 391509 cl26362 Peptidase_M73 Camelysin metallo-endopeptidase. This model describes a protein N-terminal domain found regularly in proteins encoded near a variant form of signal peptidase I such as the SipW protein of Bacillus subtilis. Many though not all members are homologs of camelysin (a casein-cleaving metalloprotease) and TasA (CotN), a metalloprotease that is secreted, along with extracellular polysaccharide (EPS), to be the major protein constituent of the Bacillus subtilis biofilm matrix. Sequencing from several known TasA/CotN proteins shows the cleavage location to be near the center of the alignment and typical of type I signal peptidases, with small residues at -3 and -1. This domain, therefore, appears to be a special subclass of signal peptide. 0
42442 421296 cl26390 Beta-TrCP_D D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with F-box domain, WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation. 0
42443 421299 cl26405 MgsA_C MgsA AAA+ ATPase C terminal. recombination factor protein RarA; Provisional 0
42444 421304 cl26420 IAT_beta Inverse autotransporter, beta-domain. This is a family of beta-barrel porin-like outer membrane proteins from enteropathogenic Gram-negative bacteria. Intimins and invasins are virulence factors produced by pathogenic Gram-negative bacteria. They carry C-terminal extracellular passenger domains that are involved in adhesion to host cells and N-terminal beta domains that are embedded in the outer membrane. This family represents the beta-barrel porin-like domain in the outer membrane that can be found in intimins, invasins and some inverse autotransporters. 0
42445 421305 cl26423 DUF3421 Protein of unknown function (DUF3421). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 119 and 296 amino acids in length. 0
42446 421307 cl26442 Cytochrome_cB Cytochrome c bacterial. Members of this protein family are multiheme cytochrome c proteins of Methanosarcina acetivorans C2A and several other archaeal methanogens. All members have N-terminal signal peptides and are presumed to act in electron transfer reactions associated with methanogenesis. Putative heme-binding motifs include five (or six) CXXCH motifs, a CXXXCH motif, and a CXXXXCH motif. These proteins show multiple regions of local homology, in the same order, with multiheme cytochrome c proteins such as octaheme tetrathionate reductase from Shewanella. 0
42447 421308 cl26443 CbiG_mid Cobalamin biosynthesis central region. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. 0
42448 421310 cl26453 DUF3262 Protein of unknown function (DUF3262). Members of this family of small, hydrophobic proteins are found occasionally on plasmids such as the Pseudomonas putida TOL (toluene catabolic) plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions] 0
42449 331275 cl26454 VirionAssem_T7 Bacteriophage T7 virion assembly protein. tail assembly protein 0
42450 421311 cl26457 T2SSJ Type II secretion system (T2SS), protein J. The T2SJ proteins are pseudopilins, which are targeted to the membrane in E. coli. T2SJ forms a complex with T2SI (pfam02501) and T2SK (pfam03934) which is part of the Type II secretion apparatus involved in the translocation of proteins across the outer membrane in E. coli. The T2SK-I-J complex has quasihelical characteristics. 0
42451 421313 cl26461 DUF3236 Protein of unknown function (DUF3236). This family of proteins with unknown function appears to be restricted to Methanobacteria. 0
42452 331288 cl26467 MerE MerE protein. putative mercury resistance protein; Provisional 0
42453 421316 cl26468 Scaffolding_pro Phi29 scaffolding protein. scaffolding protein 0
42454 331295 cl26474 DUF2675 Protein of unknown function (DUF2675). host protein H-NS-interacting protein 0
42455 421319 cl26475 Phage_gp53 Base plate wedge protein 53. baseplate wedge subunit; Provisional 0
42456 421320 cl26479 EutA Ethanolamine utilisation protein EutA. This family consists of several bacterial EutA ethanolamine utilisation proteins. The EutA protein is thought to protect the lyase (EutBC) from inhibition by CNB12. 0
42457 421321 cl26481 M11L Apoptosis regulator M11L like. Hypothetical protein; Provisional 0
42458 391537 cl26482 T4_tail_cap Tail-tube assembly protein. baseplate subunit; Provisional 0
42459 421322 cl26486 YfdX YfdX protein. hypothetical protein; Provisional 0
42460 331308 cl26487 DUF2745 Protein of unknown function (DUF2745). host dGTPase inhibitor 0
42461 331309 cl26488 DUF2718 Protein of unknown function (DUF2718). Hypothetical protein; Provisional 0
42462 331311 cl26490 YliH Biofilm formation protein (YliH/bssR). YliH is induced in biofilms and is involved in repression of motility in the biofilms. YliH is also known as bssR (regulator of biofilm through signal secretion). 0
42463 421323 cl26491 DSRB Dextransucrase DSRB. DSRB is a novel dextransucrase which produces a dextran different from the typical dextran, as it contains (1-6) and (1-2) linkages, when this strain is grown in the presence of sucrose. 0
42464 331313 cl26492 CedA Cell division activator CedA. CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet. 0
42465 421324 cl26494 DUF2498 Protein of unknown function (DUF2498). hypothetical protein; Provisional 0
42466 421325 cl26496 DUF2496 Protein of unknown function (DUF2496). hypothetical protein; Provisional 0
42467 331319 cl26498 BDM Putative biofilm-dependent modulation protein. biofilm-dependent modulation protein; Provisional 0
42468 421326 cl26499 DUF4198 Domain of unknown function (DUF4198). This family was previously misannotated in Pfam as NikM. 0
42469 421327 cl26521 AKAP7_NLS AKAP7 2'5' RNA ligase-like domain. unknown protein; Provisional 0
42470 421338 cl26560 TIR-like Predicted nucleotide-binding protein containing TIR-like domain. Members of this family of bacterial nucleotide-binding proteins contain a TIR-like domain. Their exact function has not, as yet, been defined. 0
42471 421339 cl26561 DUF2344 Uncharacterized protein conserved in bacteria (DUF2344). This model describes an uncharacterized protein encoded adjacent to, or as a fusion protein with, an uncharacterized radical SAM protein. 0
42472 421340 cl26564 DUF2338 Uncharacterized protein conserved in bacteria (DUF2338). Members of this family of hypothetical bacterial proteins have no known function. 0
42473 421341 cl26565 DUF2336 Uncharacterized protein conserved in bacteria (DUF2336). Members of this family of hypothetical bacterial proteins have no known function. 0
42474 421342 cl26566 HPTransfase Histidine phosphotransferase C-terminal domain. HPTransfase is a family of essential histidine phosphotransferases. It controls the activity of the master bacterial cell-cycle regulator CtrA through phosphorylation. It behaves as a homodimer by adopting the domain architecture of the intracellular part of class I histidine kinases. Each subunit consists of two distinct domains: an N-terminal helical hairpin domain and a C-terminal [alpha]/[beta] domain. The two N-terminal domains are adjacent within the dimer, forming a four-helix bundle. The C-terminal domain adopts an atypical Bergerat ATP-binding fold. 0
42475 421343 cl26569 DUF2318 Predicted membrane protein (DUF2318). Members of this family of hypothetical bacterial proteins have no known function. 0
42476 421344 cl26571 DUF2273 Small integral membrane protein (DUF2273). Members of this family of hypothetical bacterial proteins have no known function. 0
42477 421345 cl26572 Methyltransf_33 Histidine-specific methyltransferase, SAM-dependent. This model represents an uncharacterized domain of about 300 amino acids with homology to S-adenosylmethionine-dependent methyltransferases. Proteins with this domain are exclusively fungal. A few, such as EasF from Neotyphodium lolii, are associated with the biosynthesis of ergot alkaloids, a class of fungal secondary metabolites. EasF may, in fact, be the AdoMet:dimethylallyltryptophan N-methyltransferase, the enzyme that follows tryptophan dimethylallyltransferase (DMATS) in ergot alkaloid biosynthesis. Several other members of this family, including mug158 (meiotically up-regulated gene 158 protein) from Schizosaccharomyces pombe, contain an additional uncharacterized domain DUF323 (pfam03781). 0
42478 421346 cl26573 DUF2254 Predicted membrane protein (DUF2254). Members of this family of bacterial proteins comprise various hypothetical and putative membrane proteins. Their exact function has not, as yet, been defined. 0
42479 421347 cl26577 DUF2225 Uncharacterized protein conserved in bacteria (DUF2225). This domain, found in various hypothetical bacterial proteins, has no known function. 0
42480 421348 cl26580 DUF2206 Predicted membrane protein (DUF2206). This domain, found in various hypothetical archaeal proteins, has no known function. 0
42481 421349 cl26582 VapB_antitoxin Bacterial antitoxin of type II TA system, VapB. VapB is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is VapC, pfam05016. The family contains several related antitoxins from Cyanobacteria and Actinobacterial families. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the Toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all antitoxins. 0
42482 421350 cl26583 DUF2163 Uncharacterized conserved protein (DUF2163). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
42483 421351 cl26584 DUF2148 Uncharacterized protein containing a ferredoxin domain (DUF2148). This domain, found in various hypothetical bacterial proteins containing a ferredoxin domain, has no known function. 0
42484 421352 cl26586 DUF2125 Uncharacterized protein conserved in bacteria (DUF2125). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
42485 421353 cl26587 DUF2109 Predicted membrane protein (DUF2109). This domain, found in various hypothetical archaeal proteins, has no known function. 0
42486 421354 cl26588 DUF2107 Predicted membrane protein (DUF2107). This domain, found in various hypothetical archaeal proteins, has no known function. 0
42487 421355 cl26589 DUF2093 Uncharacterized protein conserved in bacteria (DUF2093). This domain, found in various hypothetical prokaryotic proteins, has no known function. 0
42488 421356 cl26591 DUF2080 Putative transposon-encoded protein (DUF2080). Members of this family appear restricted to the archaea. They tend to be encoded upstream of predicted transposase genes within insertion sequences such as ISNagr11, ISHca1, ISH36, etc. The widespread distribution suggests this protein may be more than a mere passenger gene and may participate in some transposase-associated function. See PF09853, COG3466, and arCOG03884 for alternative (currently narrow) treatments of this family. 0
42489 421357 cl26592 SHOCT Short C-terminal domain. 0
42490 421358 cl26593 DUF2076 Uncharacterized protein conserved in bacteria (DUF2076). This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins. 0
42491 421359 cl26595 DUF2070 Predicted membrane protein (DUF2070). This is a family of Archaeal 7-TM proteins. There are 6 closely assembled TM-regions at the N-terminus followed by a long intracellular, from residues 220-590, highly conserved region, of unknown function, terminating with one more TM-region. The short 25 residue section between TMs 5 and 6 might lie on the outer surface of the membrane and be acting as a receptor (from TMHMM). 0
42492 421360 cl26596 DUF2059 Uncharacterized protein conserved in bacteria (DUF2059). This domain, found in various prokaryotic proteins, has no known function. 0
42493 421361 cl26597 DUF2058 Uncharacterized protein conserved in bacteria (DUF2058). This domain, found in various prokaryotic proteins, has no known function. 0
42494 421362 cl26599 Beta_propel Beta propeller domain. Members of this family comprise secreted bacterial proteins containing C-terminal beta-propeller domain distantly related to WD-40 repeats. Jpred secondary-structure prediction shows family to be a series of 4 short beta-strands, characteristic of beta-propeller families. 0
42495 421366 cl26616 DUF2397 Protein of unknown function (DUF2397). Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved] 0
42496 421367 cl26619 Myco_arth_vir_N Mycoplasma virulence signal region (Myco_arth_vir_N). This model represents the N-terminal region, including a probable signal sequence or signal anchor which in most instances has four consecutive Lys residues before the hydrophobic stretch, of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. 0
42497 421368 cl26629 Phageshock_PspD Phage shock protein PspD (Phageshock_PspD). Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown. [Cellular processes, Adaptations to atypical conditions] 0
42498 421369 cl26630 Spore_YhcN_YlaJ Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ). YhcN and YlaJ are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic 40-residue C-terminal domain that is not included in the seed alignment for this model. A portion of the low-complexity region between the lipoprotein signal sequence and the main conserved region of the protein family was also excised from the seed alignment. [Cellular processes, Sporulation and germination] 0
42499 391585 cl26631 DUF2379 Protein of unknown function (DUF2379). This family consists of at least eight paralogs in Myxococcus xanthus and six in Stigmatella aurantiaca DW4/3-1, both members of Myxococcales order within the Deltaproteobacteria. The function is unknown. Some member proteins consist of two copies of the domain. This domain is hereby named DUSAM, DUplication in Stigmatella And Myxococcus. 0
42500 331453 cl26632 Ehrlichia_rpt Ehrlichia tandem repeat (Ehrlichia_rpt). This model represents 77 residues of an 80 amino acid (240 nucleotide) tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen. 0
42501 421374 cl26647 Lact-deh-memb D-lactate dehydrogenase, membrane binding. D-lactate dehydrogenase; Provisional 0
42502 421375 cl26651 AcetDehyd-dimer Prokaryotic acetaldehyde dehydrogenase, dimerization. acetaldehyde dehydrogenase; Validated 0
42503 421376 cl26652 FOLN Follistatin/Osteonectin-like EGF domain. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence 0
42504 421378 cl26682 GshA Glutamate-cysteine ligase. This family consists of a rare family of glutamate--cysteine ligases, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs] 0
42505 421379 cl26684 IDEAL IDEAL domain. It is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members. 0
42506 421381 cl26695 Ca_chan_IQ Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms, calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF). 0
42507 421382 cl26696 CotH CotH kinase protein. Members of this family include the spore coat protein H (cotH). This protein is an atypical protein kinase that phosphorylates CotB and CotG. 0
42508 421384 cl26712 U3_assoc_6 U3 small nucleolar RNA-associated protein 6. This is a family of U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA. 0
42509 331548 cl26727 aMBF1 Archaeal ribosome-binding protein aMBF1, putative translation factor, contains Zn-ribbon and HTH domains [Translation, ribosomal structure and biogenesis]. [Hypothetical proteins, Conserved] 0
42510 421385 cl26728 STAG STAG domain. STAG domain proteins are subunits of cohesin complex - a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein, for example, mice and humans have a meiosis specific variant called STAG3, although budding yeast does not have a meiosis specific version. 0
42511 391601 cl26729 LisH LisH. Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. 0
42512 421386 cl26730 Amelin Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. 0
42513 421388 cl26737 SpoIID Stage II sporulation protein. Stage II sporulation protein D (SpoIID) is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIID, along with SpoIIM and SpoIIP, is one of three major proteins involved in engulfment of the forespore by the mother cell. [Cellular processes, Sporulation and germination] 0
42514 421389 cl26744 TPT Triose-phosphate Transporter family. The 6-8 TMS Triose-phosphate Transporter (TPT) Family (TC 2.A.7.9). Functionally characterized members of the TPT family are derived from the inner envelope membranes of chloroplasts and nongreen plastids of plants. However, homologues are also present in yeast. Saccharomyces cerevisiae has three functionally uncharacterized TPT paralogues encoded within its genome. Under normal physiological conditions, chloroplast TPTs mediate a strict antiport of substrates, frequently exchanging an organic three carbon compound phosphate ester for inorganic phosphate (Pi). Normally, a triose-phosphate, 3-phosphoglycerate, or another phosphorylated C3 compound made in the chloroplast during photosynthesis, exits the organelle into the cytoplasm of the plant cell in exchange for Pi. However, experiments with reconstituted translocator in artificial membranes indicate that transport can also occur by a channel-like uniport mechanism with up to 10-fold higher transport rates. Channel opening may be induced by a membrane potential of large magnitude and/or by high substrate concentrations. Nongreen plastid and chloroplast carriers, such as those from maize endosperm and root membranes, mediate transport of C3 compounds phosphorylated at carbon atom 2, particularly phosphoenolpyruvate, in exchange for Pi. These are the phosphoenolpyruvate:Pi antiporters (PPT). Glucose-6-P has also been shown to be a substrate of some plastid translocators (GPT). The three types of proteins (TPT, PPT and GPT) are divergent in sequence as well as substrate specificity, but their substrate specificities overlap. [Hypothetical proteins, Conserved] 0
42515 421394 cl26765 FBD FBD. This region is found in F-box (pfam00646) and other domain containing plant proteins; it is repeated in two family members. Its precise function is unknown, but it is thought to be associated with nuclear processes. In fact, several family members are annotated as being similar to transcription factors. 0
42516 421396 cl26778 TED Thioester domain. This model describes a domain of about 40 residues with an invariant TQ dipeptide in an almost invariant TQxA[VI]W motif. This domain occurs in surface-expressed proteins of Gram-positive bacteria, many of which are anchored by LPXTG-containing sortase target domains. Numerous members of this family have domains pfam05738 (Cna protein B-type domain) and pfam08341 (fibronectin-binding protein signal sequence). 0
42517 421397 cl26779 YicC_N YicC-like family, N-terminal region. The apparent ortholog from Aquifex aeolicus as reported is split into two consecutive reading frames. [Hypothetical proteins, Conserved] 0
42518 391612 cl26797 SpoV Stage V sporulation protein family. Members of this family are SpoVM (stage V sporulation protein M). 0
42519 391614 cl26808 PqqA PqqA family. This model describes a very small protein, coenzyme PQQ biosynthesis protein A, which is smaller than 25 amino acids in many species. It is proposed to serve as a peptide precursor of coenzyme pyrrolo-quinoline-quinone (PQQ), with Glu and Tyr of a conserved motif Glu-Xxx-Xxx-Xxx-Tyr becoming part of the product. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
42520 421399 cl26817 GSDH Glucose / Sorbosone dehydrogenase. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis. 0
42521 355577 cl26832 PLN03020 N/A. putative Low-temperature-induced protein; Provisional 0
42522 421400 cl26841 PHB_acc_N PHB/PHA accumulation regulator DNA-binding domain. Poly-B-hydroxyalkanoates are lipidlike carbon/energy storage polymers found in granular inclusions. PhaR is a regulatory protein found in general near other proteins associated with polyhydroxyalkanoate (PHA) granule biosynthesis and utilization. It is found to be a DNA-binding homotetramer that is also capable of binding short chain hydroxyalkanoic acids and PHA granules. PhaR may regulate the expression of itself, of the phasins that coat granules, and of enzymes that direct carbon flux into polymers stored in granules. The C-terminal region is poorly conserved in this family and is not part of this model.//GO terms added 12/6/04 [SS] [Fatty acid and phospholipid metabolism, Biosynthesis, Regulatory functions, DNA interactions] 0
42523 421401 cl26842 DUF1656 Protein of unknown function (DUF1656). efflux system membrane protein; Provisional 0
42524 421402 cl26844 DUF1641 Protein of unknown function (DUF1641). Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long. 0
42525 421403 cl26845 FTCD_N Formiminotransferase domain, N-terminal subdomain. This model represents the tetrahydrofolate (THF) dependent glutamate formiminotransferase involved in the histidine utilization pathway. This enzyme interconverts L-glutamate and N-formimino-L-glutamate. The enzyme is bifunctional as it also catalyzes the cyclodeaminase reaction on N-formimino-THF, converting it to 5,10-methenyl-THF and releasing ammonia - part of the process of regenerating THF. This model covers enzymes from metazoa as well as gram-positive bacteria and archaea. In humans, deficiency of this enzyme results in a disease phenotype. The crystal structure of the enzyme has been studied in the context of the catalytic mechanism. [Energy metabolism, Amino acids and amines] 0
42526 421404 cl26877 Band_3_cyto Band 3 cytoplasmic domain. The Anion Exchanger (AE) Family (TC 2.A.31). Characterized protein members of the AE family are found only in animals. They preferentially catalyze anion exchange (antiport) reactions, typically acting as HCO3-:Cl- antiporters, but also transporting a range of other inorganic and organic anions. Additionally, renal Na+:HCO3- cotransporters have been found to be members of the AE family. They catalyze the reabsorption of HCO3- in the renal proximal tubule. [Transport and binding proteins, Anions] 0
42527 421405 cl26884 TraI_2 Putative helicase. Members of this protein family are the TraI putative relaxases required for transfer by a subclass of integrating conjugative elements (ICE) as found in Pseudomonas fluorescens Pf-5, and understood from study of two related ICE, SXT and R391. This model represents the N-terminal domain. Note that no homology is detected to the similarly named TraI relaxase of the F plasmid. 0
42528 421406 cl26890 Collar Phage Tail Collar Domain. This region is occasionally found in conjunction with pfam03335. Most of the family appear to be phage tail proteins; however some appear to be involved in other processes. For instance a member from Rhizobium leguminosarum may be involved in plant-microbe interactions. A related protein MrpB is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologs but the intervening sequence is not. Much of the function of phage T4 appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the 'collar'. The domain may bind SO4, however the residues accredited with this vary between the PDB file and the Swiss-Prot entry. The long unconserved region may be due to domain swapping in and out of a loop or reflective of rapid evolution. 0
42529 421407 cl26892 PsaM Photosystem I protein M (PsaM). Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. The seed alignment for this model includes sequences from pfam07465 and additional sequences, as from Prochlorococcus. [Energy metabolism, Photosynthesis] 0
42530 421408 cl26894 BofA SigmaK-factor processing regulatory protein BofA. Members of this protein family are found only in endospore-forming bacteria, such as Bacillus subtilis and Clostridium tetani. Among such bacteria, it appears only Symbiobacterium thermophilum lacks a member of this family. The protein, designated BofA, is an integral membrane protein that regulates the proteolytic activation of the RNA polymerase sigma factor K. [Cellular processes, Sporulation and germination] 0
42531 421409 cl26898 DUF1507 Protein of unknown function (DUF1507). hypothetical protein; Provisional 0
42532 355590 cl26909 COG4652 Uncharacterized protein [Function unknown]. This model represents a family of integral membrane proteins, most of which are about 650 residues in size and predicted to span the membrane seven times. Nearly half of the members of this family are found in association with a member of the lactococcin 972 family of bacteriocins (TIGR01653). Others may be associated with uncharacterized proteins that may also act as bacteriocins. Although this protein is suggested to be an immunity protein, and the bacteriocin is suggested to be exported by a Sec-dependent process, the role of this protein is unclear. [Cellular processes, Toxin production and resistance] 0
42533 421410 cl26914 Neuralized Neuralized. This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. 0
42534 331736 cl26915 Sugar_transport Sugar transport protein. This is a family of bacterial sugar transporters approximately 300 residues long. Members include glucose uptake proteins, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes. These members are transmembrane proteins which are usually 5+5 duplications. This model recognizes a set of five TMs, 0
42535 421411 cl26917 SKA1_N Spindle and kinetochore-associated protein 1, N-terminal domain. Spindle and kinetochore-associated protein 1 (SKA1) is a component of the SKA1 complex (consists of Ska1, Ska2, and Ska3/Rama1), a microtubule-binding subcomplex of the outer kinetochore that is essential for proper chromosome segregation. 0
42536 421412 cl26923 DUF1328 Protein of unknown function (DUF1328). hypothetical protein; Provisional 0
42537 421413 cl26935 zf-LSD1 LSD1 zinc finger. This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC 0
42538 421414 cl26937 YqfD Putative stage IV sporulation protein YqfD. YqfD is part of the sigma-E regulon in the sporulation program of endospore-forming Gram-positive bacteria. Mutation results in a sporulation defect in Bacillus subtilis. Members are found in all currently known endospore-forming bacteria, including the genera Bacillus, Symbiobacterium, Carboxydothermus, Clostridium, and Thermoanaerobacter. [Cellular processes, Sporulation and germination] 0
42539 421415 cl26941 DUF1246 Protein of unknown function (DUF1246). This family represents the N-terminus of a number of hypothetical archaeal proteins of unknown function. This family is structurally related to the PreATP-grasp domain. 0
42540 391639 cl26952 BC10 Bladder cancer-related protein BC10. hypothetical protein 0
42541 421416 cl26955 DUF1192 Protein of unknown function (DUF1192). This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown. 0
42542 391642 cl26958 DUF1156 Protein of unknown function (DUF1156). This family represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids. 0
42543 421418 cl26962 DUF1126 DUF1126 PH-like domain. The structure of this domain shows that it has a PH-like fold. 0
42544 421419 cl26964 ABC_trans_CmpB Putative ABC-transporter type IV. CmpB is a family of membrane proteins that are likely to be part of a two-component type IV ABC-transporter system. Families can transport multiple drugs including ethidium and fluoroquinolones. UniProtKB:Q83XH0 is a member of TCDB family 3.A.1.121.4. 0
42545 421420 cl26967 Orthopox_A49R Orthopoxvirus A49R protein. hypothetical protein; Provisional 0
42546 421421 cl26969 Vitellogenin_N Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains. 0
42547 391647 cl26972 API3 N/A. Pepsin inhibitor-3 consisting of two domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the active site flap region of pepsin. The two domains are tandem repeats of sequence, and has therefore been termed repeated domain. 0
42548 421422 cl26975 pbsY N/A. photosystem II protein Y; Provisional 0
42549 421423 cl26977 Mito_fiss_Elm1 Mitochondrial fission ELM1. In plants, this family is involved in mitochondrial fission. It binds to dynamin-related proteins and plays a role in their relocation from the cytosol to mitochondrial fission sites. Its function in bacteria is unknown. 0
42550 421424 cl26979 Usg Usg-like family. Family of bacterial proteins, referred to as Usg. Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system. 0
42551 331802 cl26981 Terminase_1 Phage Terminase. The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function. 0
42552 421425 cl26984 PTAC Phosphate propanoyltransferase. This family includes phosphotransacylases (PTACs) required for the degradation of 1,2-propanediol (1,2-PD). 0
42553 391652 cl26985 TLP-20 N/A. This family consists of several Nucleopolyhedrovirus telokin-like protein-20 (TLP20) sequences. The function of this family is unknown but TLP20 is known to share some antigenic similarities with the smooth muscle protein telokin, although the amino acid sequence shows no homology to telokin. 0
42554 421430 cl26996 TelA Toxic anion resistance protein (TelA). This family consists of several prokaryotic TelA like proteins. TelA and KlA are associated with tellurite resistance and plasmid fertility inhibition. 0
42555 331826 cl27005 VirD1 T-DNA border endonuclease VirD1. This family consists of several T-DNA border endonuclease VirD1 proteins which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among others, on the specialized bacterial virulence proteins VirD1 and VirD2 that excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process. 0
42556 421431 cl27008 HECTc N/A. The name HECT comes from Homologous to the E6-AP Carboxyl Terminus. 0
42557 391659 cl27011 Pup Pup-like protein. Members of this protein family are Pup, a small protein whose ligation to target proteins steers them toward degradation. This protein family occurs in a number of bacteria, especially Actinobacteria such as Mycobacterium tuberculosis, that possess an archaeal-type proteasome. All members of this protein family known during model construction end with the C-terminal motif [FY][VI]QKGG[QE]. Ligation is thought to occur between the C-terminal COOH of Pup and an epsilon-amino group of a Lys on the target protein. The N-terminal half of this protein is poorly conserved and not represented in the seed alignment. [Protein fate, Degradation of proteins, peptides, and glycopeptides] 0
42558 421432 cl27012 HIGH_NTase1 HIGH Nucleotidyl Transferase. This family consists of HIGH Nucleotidyl Transferases 0
42559 421433 cl27023 TraY TraY domain. This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid. These proteins have a ribbon-helix-helix fold and are likely to be DNA-binding proteins. 0
42560 421434 cl27025 Herpes_UL69 Herpesvirus transcriptional regulator family. multifunctional expression regulator; Provisional 0
42561 391663 cl27031 Phi-29_GP3 Phi-29 DNA terminal protein GP3. terminal protein 0
42562 421436 cl27037 FlaC_arch Flagella accessory protein C (FlaC). Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells). 0
42563 421437 cl27040 PilS PilS N terminal. This family consists of several bundlin proteins from E. coli. Bundlin is a type IV pilin protein that is the only known structural component of enteropathogenic Escherichia coli bundle-forming pili (BFP). BFP play a role in virulence, antigenicity, autoaggregation, and localized adherence to epithelial cells. These proteins contain an N-terminal methylation motif. 0
42564 421440 cl27046 CopB Copper resistance protein B precursor (CopB). This family consists of several bacterial copper resistance proteins. Copper is essential and serves as cofactor for more than 30 enzymes yet a surplus of copper is toxic and leads to radical formation and oxidation of biomolecules. Therefore, copper homeostasis is a key requisite for every organism. CopB serves to extrude copper when it approaches toxic levels. 0
42565 421441 cl27047 CHAD CHAD domain. It has conserved histidines that may chelate metals. 0
42566 421443 cl27068 AbrB Transition state regulatory protein AbrB. The model describes a hydrophobic sequence region that is duplicated to form the AbrB protein of Escherichia coli (not to be confused with a Bacillus subtilis protein with the same gene symbol). In some species, notably the Cyanobacteria and Thermus thermophilus, proteins consist of a single copy rather than two copies. The member from Pseudomonas putida, PP_1415, was suggested to be an ammonia monooxygenase characteristic of heterotrophic nitrifiers, based on an experimental indication of such activity in the organism and a glimmer of local sequence similarity between parts of P. putida protein and an instance of the AmoA protein from Nitrosomonas europaea (; we do not believe the sequence similarity to be meaningful. The member from E. coli (b0715, ybgN) appears to be the largely uncharacterized AbrB (aidB regulator) protein of E. coli cited in Volkert, et al. (PMID 8002588), although we did not manage to trace the origin of association of the article to the sequence. 0
42567 421444 cl27073 T7SS_ESX1_EccB Type VII secretion system ESX-1, transport TM domain B. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccB1, EccB2, etc. This model does not identify functionally related proteins in the Firmicutes such as Staphylococcus aureus and Bacillus anthracis. [Protein fate, Protein and peptide secretion and trafficking] 0
42568 331896 cl27075 Holin_BlyA holin, BlyA family. This family represents a BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage. This protein was previously proposed to be an hemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C-terminus. [Mobile and extrachromosomal element functions, Prophage functions] 0
42569 421445 cl27076 Glu_cyclase_2 Glutamine cyclotransferase. This family of enzymes EC:2.3.2.5 catalyze the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes. 0
42570 421447 cl27082 Phage_capsid Phage capsid family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage HK97, phi-105, P27, and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions] 0
42571 421448 cl27083 Menin Scaffolding protein menin encoded by the MEN1 gene. MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumor suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23. 0
42572 421450 cl27099 DltD DltD protein. Members of this protein family are DltD, part of the DltABCD system widely distributed in the Firmicutes for D-alanylation of lipoteichoic acids. The most common form of LTA, as in Staphylococcus aureus, has a backbone of polyglycerolphosphate. 0
42573 421451 cl27100 Glu_synthase Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster. 0
42574 421452 cl27103 SseC Secretion system effector C (SseC) like family. SseC is a secreted protein that forms a complex together with SecB and SecD on the surface of Salmonella. All these proteins are secreted by the type III secretion system. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SecB, SseC and SecD are inserted into the target cell membrane, where they form a small pore or translocon. In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD, which are thought to be directly involved in pore formation and in the type III secretion system translocon. 0
42575 421453 cl27104 Mak16 Mak16 protein C-terminal region. Protein MAK16 homolog; Provisional 0
42576 421455 cl27109 Herpes_UL49_2 Herpesvirus UL49 tegument protein. tegument protein VP22; Provisional 0
42577 331932 cl27111 Herpes_ORF11 Herpesvirus dUTPase protein. hypothetical protein; Provisional 0
42578 421457 cl27115 IKI3 IKI3 family. Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. This region contains WD40 like repeats. 0
42579 421458 cl27123 Arch_fla_DE Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD. 0
42580 421460 cl27132 Herpes_UL17 Herpesvirus UL17 protein. UL17 tegument protein; Provisional 0
42581 421461 cl27156 FrhB_FdhB_C Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C-terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037). 0
42582 421462 cl27158 FrhB_FdhB_N Coenzyme F420 hydrogenase/dehydrogenase, beta subunit N-term. coenzyme F420-reducing hydrogenase subunit beta; Validated 0
42583 421463 cl27166 ATE_N Arginine-tRNA-protein transferase, N-terminus. arginyl-tRNA-protein transferase; Provisional 0
42584 421464 cl27167 HemX HemX, putative uroporphyrinogen-III C-methyltransferase. putative uroporphyrinogen III C-methyltransferase; Provisional 0
42585 421465 cl27170 DUF460 Protein of unknown function (DUF460). Archaeal protein of unknown function. 0
42586 421466 cl27175 CheF-arch Chemotaxis signal transduction system protein F from archaea. This is a family of proteins that are archaea-specific components of the bacterial-like chemotaxis signal transduction system of archaea. In H. salinarum, the CheF proteins interact with the chemotaxis proteins CheY, CheD and CheC2 as well as the flagella-accessory proteins FlaCE and FlaD, and are essential for any tactic response. CheF probably functions at the interface between the bacterial-like chemotaxis signal transduction system and the archaeal flagellar apparatus. 0
42587 391698 cl27177 Phospholamban Phospholamban. This model represents the short (52 residue) transmembrane phosphoprotein phospholamban. Phospholamban, in its unphosphorylated form, inhibits SERCA2, the cardiac sarcoplasmic reticulum Ca-ATPase. 0
42588 421467 cl27188 Mre11_DNA_bind Mre11 DNA-binding presumed domain. All proteins in this family for which functions are known are subunits of a nuclease complex made up of multiple proteins including MRE11 and RAD50 homologs. The functions of this nuclease complex include recombinational repair and non-homologous end joining. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The proteins in this family are distantly related to proteins in the SbcCD complex of bacteria. [DNA metabolism, DNA replication, recombination, and repair] 0
42589 421470 cl27198 ORC2 Origin recognition complex subunit 2. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 0
42590 421471 cl27200 Ribo_biogen_C Ribosome biogenesis protein, C-terminal. This family represents the C-terminal domain of some putative ribosome biogenesis proteins in archaea. It has also been identified in the eukaryotic protein Tsr3, which is involved in ribosomal RNA biogenesis. 0
42591 421472 cl27202 Mpp10 Mpp10 protein. This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites. 0
42592 421474 cl27208 PRCH Photosynthetic reaction centre, H-chain N-terminal region. This model describes the photosynthetic reaction center H subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately, the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is an organic acid rather than water. Much of our current functional understanding of photosynthesis comes from the structural determination and spectroscopic studies on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis] 0
42593 421475 cl27223 CBF CBF/Mak21 family. 0
42594 421476 cl27226 DUF331 Domain of unknown function. Members of this family are uncharacterized proteins from a number of bacterial species. The proteins range in size from 50-70 residues. 0
42595 421477 cl27236 VMO-I N/A. VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold. 0
42596 421478 cl27240 PetN PetN. cytochrome b6/f complex subunit VIII 0
42597 421479 cl27241 YccF Inner membrane component domain. Domain occurs as one or more copies in bacterial and eukaryotic proteins. These are membrane proteins of four TM regions, two appearing in each of the two copies when both are present. Many of the latter members also carry the sodium/calcium exchanger protein family pfam01699, which have multipass membrane regions. 0
42598 421480 cl27251 EIIBC-GUT_N Sorbitol phosphotransferase enzyme II N-terminus. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIBC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS] 0
42599 355631 cl27253 Alc Allantoicase [Nucleotide transport and metabolism]. Members of this family are the enzyme allantoicase (EC 3.5.3.4), also called allantoate amidinohydrolase. This enzyme hydrolyzes allantoate to (S)-ureidoglycolate and urea; it can also degrade (R)-ureidoglycolate to glyoxylate and urea. Allantoinase (EC 3.5.2.5) hydrolyzes (S)-allantoin (a xanthine metabolite, via urate) to allantoate. Allantoate can then be degraded either by this enzyme, allantoicase, or by allantoate deiminase (EC 3.5.3.9). Members of the seed alignment for this model were taken from BRENDA. Proteins in this family contain two copies of the allantoicase repeat (pfam03561). A different but similarly named enzyme, allantoate amidohydrolase (EC 3.5.3.9), simultaneously breaks down the urea to ammonia and carbon dioxide. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other] 0
42600 355632 cl27261 NrdR Transcriptional regulator NrdR, contains Zn-ribbon and ATP-cone domains [Transcription]. Members of this almost entirely bacterial family contain an ATP cone domain (pfam03477). There is never more than one member per genome. Common gene symbols given include nrdR, ybaD, ribX and ytcG. The member from Streptomyces coelicolor is found upstream in the operon of the class II oxygen-independent ribonucleotide reductase gene nrdJ and was shown to repress nrdJ expression. Many members of this family are found near genes for riboflavin biosynthesis in Gram-negative bacteria, suggesting a role in that pathway. However, a phylogenetic profiling study associates members of this family with the presence of a palindromic signal with consensus acaCwAtATaTwGtgt, termed the NrdR-box, an upstream element for most operons for ribonucleotide reductase of all three classes in bacterial genomes. [Regulatory functions, DNA interactions] 0
42601 421481 cl27262 MOSC MOSC domain. 6-N-hydroxylaminopurine resistance protein; Provisional 0
42602 421482 cl27268 UPF0126 UPF0126 domain. hypothetical protein; Provisional 0
42603 391716 cl27276 RNA_replicase_B RNA replicase, beta-chain. RNA replicase, beta subunit 0
42604 391717 cl27281 PapB Adhesin biosynthesis transcription regulatory protein. fimbriae biosynthesis regulatory protein; Provisional 0
42605 421483 cl27283 SDH_alpha Serine dehydratase alpha chain. This enzyme is also called serine deaminase and L-serine dehydratase 1. L-serine ammonia-lyase converts serine into pyruvate in the gluconeogenesis pathway from serine. This enzyme is comprised of a single chain in Escherichia coli, Mycobacterium tuberculosis, and several other species, but has separate alpha and beta chains in Bacillus subtilis and related species. The beta and alpha chains are homologous to the N-terminal and C-terminal regions, respectively, but are rather deeply branched in a UPGMA tree. This enzyme requires iron and dithiothreitol for activation in vitro, and is a predicted 4Fe-4S protein. Escherichia coli and Pseudomonas aeruginosa have two copies of this protein. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis] 0
42606 421484 cl27287 Glf UDP-galactopyranose mutase [Cell wall/membrane/envelope biogenesis]. This enzyme is involved in the conversion of UDP-GALP into UDP-GALF through a 2-keto intermediate. It contains FAD as a cofactor. The gene is known as glf, ceoA, and rfbD. It is known experimentally in E. coli, Mycobacterium tuberculosis, and Klebsiella pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides] 0
42607 421485 cl27291 LUC7 LUC7 N_terminus. This family contains the N terminal region of several LUC7 protein homologs and only contains eukaryotic proteins. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The family also contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP). 0
42608 421486 cl27293 UFD1 Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterized by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation. 0
42609 391729 cl27371 GYR GYR motif. The GYR motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues may suggest it could be a substrate for tyrosine kinases. 0
42610 355644 cl27375 PyrI Aspartate carbamoyltransferase, regulatory subunit [Nucleotide transport and metabolism]. aspartate carbamoyltransferase regulatory subunit; Reviewed 0
42611 421492 cl27388 Herpes_UL31 Herpesvirus UL31-like protein. nuclear egress lamina protein UL31; Provisional 0
42612 391731 cl27389 Pap_E4 E4 protein. E4 protein; Provisional 0
42613 421493 cl27397 WhiA_N WhiA N-terminal LAGLIDADG-like domain. This family describes a DNA-binding protein widely conserved in Gram-positive bacteria, and occasionally occurring elsewhere, such as in Thermotoga. It is associated with cell division, and in sporulating organisms with sporulation. [Cellular processes, Cell division] 0
42614 421494 cl27405 RecO_C Recombination protein O C terminal. All proteins in this family for which functions are known are DNA binding proteins that are involved in the initiation of recombination or recombinational repair. [DNA metabolism, DNA replication, recombination, and repair] 0
42615 421495 cl27409 PsbK Photosystem II 4 kDa reaction centre component. Photosystem II reaction center protein K; Provisional 0
42616 332231 cl27410 PsbI Photosystem II reaction centre I protein (PSII 4.8 kDa protein). photosystem II protein I 0
42617 421496 cl27417 Rep_trans Replication initiation factor. DNA replication initiation protein 0
42618 421497 cl27418 Branch Core-2/I-Branching enzyme. acetylglucosaminyltransferase family protein; Provisional 0
42619 421498 cl27421 DisA_N DisA bacterial checkpoint controller nucleotide-binding. These proteins have no detectable global or local homology to any protein of known function. Members are restricted to the bacteria and found broadly in lineages other than the Proteobacteria. [Hypothetical proteins, Conserved] 0
42620 421499 cl27429 PsbL PsbL protein. photosystem II protein L 0
42621 421500 cl27440 Cyt_c_Oxidase_VIII N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIII. 0
42622 421501 cl27443 Orthopox_35kD 35kD major secreted virus protein. chemokine binding protein; Provisional 0
42623 421502 cl27447 WSN Domain of unknown function. 0
42624 421503 cl27448 GoLoco GoLoco motif. GEF specific for Galpha_i proteins 0
42625 421504 cl27450 BH4 Bcl-2 homology region 4. 0
42626 421508 cl27463 TrpBP Tryptophan RNA-binding attenuator protein. 0
42627 391748 cl27467 NMU Neuromedin U. Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians. 0
42628 421509 cl27474 AdoMet_Synthase S-adenosylmethionine synthetase (AdoMet synthetase). This family consists of several archaebacterial S-adenosylmethionine synthetase (AdoMet synthetase or MAT) (EC 2.5.1.6). S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. The biological roles of AdoMet include acting as the primary methyl group donor, as a precursor to the polyamines, and as a progenitor of a 5'-deoxyadenosyl radical. S-Adenosylmethionine synthetase catalyzes the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion formed. MATs from various organisms contain ~400-amino acid polypeptide chains. 0
42629 421510 cl27476 Ribosomal_L14e Ribosomal protein L14. 60S ribosomal protein L14; Provisional 0
42630 421511 cl27487 HOK_GEF Hok/gef family. small toxic polypeptide; Provisional 0
42631 332309 cl27488 Bax Uncharacterized FlgJ-related protein [General function prediction only]. 0
42632 421512 cl27498 PsbJ PsbJ. photosystem II reaction center protein J; Provisional 0
42633 421513 cl27501 Ribosomal_L29e Ribosomal L29e protein family. 60S ribosomal protein L29; Provisional 0
42634 421514 cl27502 Folate_carrier Reduced folate carrier. The Reduced Folate Carrier (RFC) Family (TC 2.A.48). Members of the RFC family mediate the uptake of folate, reduced folate, derivatives of reduced folate and the drug methotrexate. Proteins of the RFC family are so far restricted to animals. RFC proteins possess 12 putative transmembrane a-helical spanners (TMSs) and evidence for a 12 TMS topology has been published for the human RFC. The RFC transporters appear to transport reduced folate by an energy-dependent, pH-dependent, Na+-independent mechanism. Folate:H+ symport, folate:OH- antiport and folate:anion antiport mechanisms have been proposed, but the energetic mechanism is not well defined. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids] 0
42635 421515 cl27506 zf-A20 A20-like zinc finger. A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation. 0
42636 421516 cl27507 Ycf9 YCF9. PsbZ is a core protein of photosystem II in thylakoid-containing Cyanobacteria and plant chloroplasts. The original Chlamydomonas gene symbol, ycf9, is a synonym. PsbZ controls the interaction of the reaction center core with the light-harvesting antenna. [Energy metabolism, Photosynthesis] 0
42637 332333 cl27512 Adeno_PIX Adenovirus hexon-associated protein (IX). capsid protein IX,hexon associated protein IX; Provisional 0
42638 421517 cl27532 B56 Protein phosphatase 2A regulatory B subunit (B56 family). serine/threonine protein phosphatase 2A; Provisional 0
42639 421518 cl27544 Herpes_glycop_D Herpesvirus glycoprotein D/GG/GX domain. envelope glycoprotein D; Provisional 0
42640 421520 cl27556 Col_cuticle_N Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins. 0
42641 421521 cl27557 P_proprotein Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. 0
42642 421523 cl27575 PHO4 Phosphate transporter family. This family includes PHO-4 from Neurospora crassa which is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor. 0
42643 421524 cl27585 Adenylate_cycl Adenylate cyclase, class-I. 0
42644 421525 cl27586 Thymosin Thymosin beta-4 family. 0
42645 421526 cl27588 Parathyroid Parathyroid hormone family. 0
42646 421527 cl27628 Gastrin Gastrin/cholecystokinin family. This family gathers small proteins of about 100 to 130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein, which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction. 0
42647 421528 cl27631 Nebulin Nebulin repeat. Tandem arrays of these repeats are known to bind actin. 0
42648 421529 cl27632 Transposase_mut Transposase, Mutator family. Members of this family belong to the branch of the IS256-like family of transposases that includes the founding member. It excludes the IS1249 group. 0
42649 332455 cl27634 STNV N/A. STNV domain; satellite tobacco necrosis virus (STNV) are small plant viruses which are completely dependent on the presence of a specific helper virus, TNV, for their replication; 60 identical subunits, this domain is one of them; form an icosahedral shell around a single RNA molecule. Half of the RNA codes for the coat protein with the other half being non-coding. The STNV domain has a "Swiss roll" Greek key topology with its two 4-stranded antiparallel beta sheets 0
42650 332467 cl27646 Polyhedrin Polyhedrin. polyhedrin; Provisional 0
42651 332469 cl27648 Polyoma_coat Polyomavirus coat protein. Major capsid protein VP1; Provisional 0
42652 421530 cl27652 Plectin Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen. 0
42653 421531 cl27657 Motile_Sperm MSP (Major sperm protein) domain. Major sperm proteins are involved in sperm motility. These proteins oligomerize to form filaments. This family contains many other proteins. 0
42654 421532 cl27659 Filamin Filamin/ABP280 repeat. These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29). 0
42655 421533 cl27660 MAM N/A. An extracellular domain found in many receptors. The MAM domain along with the associated Ig domain in type IIB receptor protein tyrosine phosphatases forms a structural unit (termed MIg) with a seamless interdomain interface. It plays a major role in homodimerization of the phosphatase ectoprotein and in cell adhesion. MAM is a beta-sandwich consisting of two five-stranded antiparallel beta-sheets rotated away from each other by approx 25 degrees, and plays a similar role in meprin metalloproteinases. 0
42656 332494 cl27673 E6 Early Protein (E6). E6 protein; Provisional 0
42657 355670 cl27691 AcrR DNA-binding transcriptional regulator, AcrR family [Transcription]. transcriptional regulator BetI; Validated 0
42658 421536 cl27706 Prion Prion/Doppel alpha-helical domain. The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy. 0
42659 421537 cl27713 Defensin_1 Mammalian defensin. Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels. 0
42660 391777 cl27714 Endothelin Endothelin family. endothelin precursor; Provisional 0
42661 355673 cl27727 BBI N/A. Bowman-Birk type proteinase inhibitor (BBI); family of plant serine protease inhibitors that block trypsin or chymotrypsin. They are either single-headed (one reactive site, one inactive site, present mainly in monocotyledonous seeds) or double-headed (two reactive sites, present mainly in dicotyledonous seeds). 0
42662 421538 cl27728 Calc_CGRP_IAPP Calcitonin / CGRP / IAPP family. This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones. 0
42663 421539 cl27729 ANP Atrial natriuretic peptide. Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general. 0
42664 421541 cl27746 Hormone_2 Peptide hormone. This family contains glucagon, GIP, secretin and VIP. 0
42665 421542 cl27758 Zona_pellucida Zona pellucida-like domain. ZP proteins are responsible for sperm adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan). 0
42666 332591 cl27770 Tail_VII Inovirus G7P protein. minor coat protein 0
42667 332592 cl27771 RPS31 Ribosomal protein S31e. Members of this protein are the lineage-specific bacterial ribosomal small subunit protein bTHX (previously THX), originally shown to exist in the genus Thermus. The protein is conserved for the first 26 amino acids, past which some members continue with additional sequence, often repetitive or low-complexity. This model also finds eukaryotic organelle forms, which have additional N-terminal transit peptides. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
42668 421544 cl27778 Phage_clamp_A Bacteriophage clamp loader A subunit. clamp loader small subunit; Provisional 0
42669 421545 cl27779 DMP12 Putative DNA mimic protein DMP12. This is a family of DNA-mimic proteins expressed by Neisseria species. In its monomeric form DMP12 interacts with the Neisseria dimeric form of the bacterial histone-like protein HU. HU proteins promote the assembly of higher-order DNA-protein structures. The interaction between DMP12 and HU protein may be instrumental in controlling the stability of the nucleoid in Neisseria as DMP12 prevents Neisseria HU protein from being digested by trypsin. 0
42670 421546 cl27780 UL141 Herpes-like virus membrane glycoprotein UL141. UL14 tegument protein; Provisional 0
42671 332602 cl27781 EAGR_box Enriched in aromatic and glycine Residues box. The EAGR box (Enriched in Aromatic and Glycine Residues) is found in three different proteins of the Mycoplasma genitalium terminal organelle, which acts in both cytadherence and gliding motility. The presence of this domain in a genome predicts the Mycoplasma-type terminal organelle structure, gliding motility, and cytadherence. The EAGR box may occur from one to nine times in a protein. 0
42672 355678 cl27782 Yop-YscD_ppl Inner membrane component of T3SS, periplasmic domain. Yop-YscD-ppl is the periplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus. 0
42673 421547 cl27783 Cas9_REC REC lobe of CRISPR-associated endonuclease Cas9. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas locus. The species range so far for this protein is animal pathogens and commensals only. 0
42674 332607 cl27786 FPRL1_inhibitor Formyl peptide receptor-like 1 inhibitory protein. formyl peptide receptor-like 1 inhibitory protein; Reviewed 0
42675 332610 cl27789 YmcE_antitoxin Putative antitoxin of bacterial toxin-antitoxin system. YmcE_antitoxin is the putative antitoxin for the supposed bacterial toxin GnsA, UniProtKB:P0AC92, family pfam08178. 0
42676 421548 cl27796 Tox-PLDMTX Dermonecrotoxin of the Papain-like fold. A papain fold toxin domain found in bacterial polymorphic toxin systems. 0
42677 421549 cl27802 MerR_2 MerR HTH family regulatory protein. 0
42678 332632 cl27811 G-7-MTase mRNA (guanine-7-)methyltransferase (G-7-MTase). This model represents a common C-terminal region shared by paramyxovirus-like RNA-dependent RNA polymerases (see pfam00946). Polymerase proteins described by these two models are often called L protein (large polymerase protein). Capping of mRNA requires RNA triphosphatase and guanylyl transferase activities, demonstrated for the rinderpest virus L protein and at least partially localized to the region of this model. 0
42679 332633 cl27812 EF-hand_4 Cytoskeletal-regulatory complex EF hand. Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences. 0
42680 332642 cl27821 T4_baseplate T4 bacteriophage base plate protein. baseplate hub assembly protein; Provisional 0
42681 332643 cl27822 Protein_K Bacteriophage protein K. protein K 0
42682 332644 cl27823 Sulf_coat_C Sulfolobus virus coat protein C terminal. coat protein 0
42683 332645 cl27824 VirE1 Single-strand DNA-binding protein. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved IELE sequence motif. VirE1 is an acidic chaperone protein which binds to VirE2, a ssDNA binding protein. These proteins are virulence factors of the plant pathogens Agrobacteria. VirE1 competes for the ssDNA binding site of VirE2. 0
42684 332646 cl27825 VirArc_Nuclease Viral/Archaeal nuclease. hypothetical protein 0
42685 332649 cl27828 T4_neck-protein Virus neck protein. neck protein; Provisional 0
42686 421551 cl27829 NTPase_P4 ATPase P4 of dsRNA bacteriophage phi-12. packaging NTPase P4 0
42687 332651 cl27830 DUF3130 Protein of unknown function (DUF3130). Members of this protein family are similar in length and sequence (although remotely) to the WXG100 family of type VII secretion system (T7SS) targets, described by family TIGR03930. Phylogenetic profiling shows that members of this family are similarly restricted to species with T7SS, marking this family as a related set of T7SS effectors. Members include SACOL2603 from Staphylococcus aureus subsp. aureus COL. Oddly, members of family pfam10824 (DUF2580), which appears also to be related, seem not to be tied to T7SS. 0
42688 332652 cl27831 Phage_DsbA Transcriptional regulator DsbA. double-stranded DNA binding protein; Provisional 0
42689 332653 cl27832 DUF2830 Protein of unknown function (DUF2830). lysis protein 0
42690 391790 cl27833 Phage_glycop_gL Viral glycoprotein L. hypothetical protein; Provisional 0
42691 332655 cl27834 UL11 Membrane-associated tegument protein. tegument protein UL11; Provisional 0
42692 332656 cl27835 DNA_Packaging Terminase DNA packaging enzyme. small terminase protein; Provisional 0
42693 421552 cl27836 DUF2810 Protein of unknown function (DUF2810). This is a bacterial family of uncharacterized proteins. 0
42694 332658 cl27837 DUF2685 Protein of unknown function (DUF2685). hypothetical protein; Provisional 0
42695 355682 cl27838 DUF2701 Protein of unknown function (DUF2701). putative transmembrane protein; Provisional 0
42696 332660 cl27839 DUF2649 Protein of unknown function (DUF2649). hypothetical protein 0
42697 332661 cl27840 DUF2654 Protein of unknown function (DUF2654). hypothetical protein; Provisional 0
42698 332662 cl27841 DUF2733 Protein of unknown function (DUF2733). Alkaline exonuclease; Provisional 0
42699 391791 cl27842 YbaJ Biofilm formation regulator YbaJ. YbaJ regulates biofilm formation. It also has an important role in the regulation of motility in the biofilm. YbaJ functions in increasing conjugation, aggregation and decreasing the motility, resulting in an increase of biofilm 0
42700 332664 cl27843 Phage_holin_2_2 Phage holin T7 family, holin superfamily II. type II holin 0
42701 332665 cl27844 RepB-RCR_reg Replication regulatory protein RepB. This is a family of proteins which regulate replication of rolling circle replication (RCR) plasmids that have a double-strand replication origin (dso). Regulation of replication of RCR plasmids occurs mainly at initiation of leading strand synthesis at the dso, such that Rep protein concentration controls plasmid replication. 0
42702 355684 cl27850 Glyco_hydro_65m Glycosyl hydrolase family 65 central catalytic domain. maltose phosphorylase; Provisional 0
42703 332674 cl27853 GSu_C4xC__C2xCH Geobacter CxxxxCH...CXXCH motif (GSu_C4xC__C2xCH). This domain occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches ProSite motif PS00190, the cytochrome c family heme-binding site signature, suggesting 0
42704 332675 cl27854 IpaC_SipC Salmonella-Shigella invasin protein C (IpaC_SipC). This model represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri (SP:P18012) and SipC from Salmonella typhimurium (GB:AAA75170.1). Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins. [Cellular processes, Pathogenesis] 0
42705 332676 cl27855 Spore_SspJ Small spore protein J (Spore_SspJ). New small, acid-soluble proteins unique to spores of Bacillus subtilis [Cellular processes, Sporulation and germination] 0
42706 391792 cl27856 DUF2374 Protein of unknown function (Duf2374). This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC7966. [Hypothetical proteins, Conserved] 0
42707 421553 cl27860 Pfg27 Pfg27. gamete antigen 27/25-like protein; Provisional 0
42708 421554 cl27861 Phage-Gp8 Bacteriophage T4, Gp8. baseplate wedge subunit; Provisional 0
42709 421555 cl27870 T3SS_needle_E Type III secretion system, cytoplasmic E component of needle. Members of this family are found exclusively in type III secretion apparatus gene clusters in bacteria. Those bacteria with a protein from this family tend to target animal cells, as does Yersinia pestis. This protein is small (about 70 amino acids) and not well characterized. [Cellular processes, Pathogenesis] 0
42710 332699 cl27878 Flu_M1_C Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein. 0
42711 391795 cl27882 Phage_1_1 Bacteriophage 1.1 Protein. hypothetical protein 0
42712 332704 cl27883 SspN Small acid-soluble spore protein N family. acid-soluble spore protein N; Provisional 0
42713 332705 cl27884 TetM_leader Tetracycline resistance determinant leader peptide. tetracycline resistance determinant leader peptide; Provisional 0
42714 332706 cl27885 Leu_leader Leucine operon leader peptide. leu operon leader peptide; Provisional 0
42715 391796 cl27886 Tna_leader Tryptophanase operon leader peptide. tryptophanase leader peptide; Provisional 0
42716 391797 cl27888 PaaX_C PaaX-like protein C-terminal domain. This transcriptional regulator is always found in association with operons believed to be involved in the degradation of phenylacetic acid. The gene product has been shown to bind to the promoter sites and repress their transcription. [Regulatory functions, DNA interactions] 0
42717 332711 cl27890 Chaperone_III Type III secretion chaperone domain. Type III secretion chaperones are involved in delivering virulence effector proteins from bacterial pathogens directly into eukaryotic cells. The chaperones may prevent aggregation and degradation of their substrates, may target the effector to the secretion apparatus, and may ensure a secretion-component unfolded conformation of their specific substrate. One member of this family, SigE, forms homodimers in crystal. The monomers have a novel fold with an alpha-beta(3)-alpha-beta(2)-alpha topology. 0
42718 332717 cl27896 Herpes_UL37_2 Betaherpesvirus immediate-early glycoprotein UL37. UL37 tegument protein; Provisional 0
42719 332720 cl27899 Orthopox_B11R Orthopoxvirus B11R protein. hypothetical protein; Provisional 0
42720 332721 cl27900 DUF1314 Protein of unknown function (DUF1314). circ protein; Provisional 0
42721 421556 cl27901 GlpM GlpM protein. This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa. 0
42722 332723 cl27902 DUF1235 Protein of unknown function (DUF1235). hypothetical protein; Provisional 0
42723 332726 cl27905 DUF1231 Protein of unknown function (DUF1231). hypothetical protein; Provisional 0
42724 332734 cl27913 DUF1181 Protein of unknown function (DUF1181). hypothetical protein; Provisional 0
42725 332736 cl27915 Orthopox_F6 Orthopoxvirus F6 protein. hypothetical protein; Provisional 0
42726 391798 cl27921 DUF1039 Protein of unknown function (DUF1039). type III secretion system protein SsaH; Provisional 0
42727 332743 cl27922 DUF1029 Protein of unknown function (DUF1029). ORF091 IMV membrane protein; Provisional 0
42728 332744 cl27923 Orthopox_A5L Orthopoxvirus A5L protein-like. virion core protein; Provisional 0
42729 332748 cl27927 Chordopox_E11 Chordopoxvirus E11 protein. putative virion core protein; Provisional 0
42730 332749 cl27928 Chordopox_G3 Chordopoxvirus G3 protein. hypothetical protein; Provisional 0
42731 421558 cl27929 Pox_A30L_A26L Orthopoxvirus A26L/A30L protein. A-type inclusion protein; Provisional 0
42732 332751 cl27930 Chordopox_A30L Chordopoxvirus A30L protein. ORF107 virion morphogenesis; Provisional 0
42733 421559 cl27931 RING_CBP-p300 atypical RING domain found in CREB-binding protein and p300 histone acetyltransferases. This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. This short domain is found to the C-terminus of bromodomains. The 40 residue domain contains four conserved cysteines suggesting that it may be stabilized by a zinc ion. In CREB this domain is to the N-terminus of another zinc binding PHD domain. 0
42734 332753 cl27932 Herpes_U5 Herpesvirus U5-like family. hypothetical protein; Provisional 0
42735 332754 cl27933 Chordopox_A35R Chordopoxvirus A35R protein. hypothetical protein; Provisional 0
42736 421560 cl27937 PSII_Ycf12 Photosystem II complex subunit Ycf12. Ycf12; Provisional 0
42737 332759 cl27938 Chordopox_A33R Chordopoxvirus A33R protein. EEV glycoprotein; Provisional 0
42738 332760 cl27939 Chordopox_A13L Chordopoxvirus A13L protein. IMV membrane protein; Provisional 0
42739 391800 cl27940 AgrD Staphylococcal AgrD protein. Members of this family of short peptides are precursors to thiolactone (unless Cys is replaced by Ser) cyclic autoinducer peptides, used in quorum-sensing systems in Gram-positive bacteria. The best characterized is the AgrD precursor, processed by the AgrB protein. Nearby proteins regularly encountered include a histidine kinase and a response regulator. This model is related to pfam05931 but is newer and currently broader in scope. 0
42740 332762 cl27941 Orthopox_F8 Orthopoxvirus F8 protein. Hypothetical protein; Provisional 0
42741 332763 cl27942 Chordopox_RPO7 Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7). DNA-dependent RNA polymerase subunit; Provisional 0
42742 332764 cl27943 DUF848 Gammaherpesvirus protein of unknown function (DUF848). hypothetical protein; Provisional 0
42743 332765 cl27944 Orthopox_F7 Orthopoxvirus F7 protein. hypothetical protein; Provisional 0
42744 332766 cl27945 Herpes_BLRF2 Herpesvirus BLRF2 protein. hypothetical protein; Provisional 0
42745 332768 cl27947 Chordopox_G2 Chordopoxvirus protein G2. transcriptional elongation factor; Provisional 0
42746 332769 cl27948 Herpes_heli_pri Herpesvirus helicase-primase complex component. helicase-primase primase subunit; Provisional 0
42747 355689 cl27949 Pox_A14 Poxvirus virion envelope protein A14. ORF090 IMV phosphorylated membrane protein; Provisional 0
42748 421561 cl27954 SpvD Salmonella plasmid virulence protein SpvD. This family consists of several SpvD plasmid virulence proteins from different Salmonella species. The structure of the protein from Salmonella typhimurium has been solved and shows a papain-like fold, with a predicted catalytic triad of Cys73, His162 and Asp182. The protein has been shown to have deubiquitinating-like activity, releasing aminoluciferin (AML) from Ub-AML. 0
42749 332777 cl27956 Pox_G7 Poxvirus G7-like. putative virion core protein; Provisional 0
42750 421563 cl27957 Phi-29_GP4 Phi-29-like late genes activator (early protein GP4). transcriptional regulator 0
42751 332779 cl27958 Pox_ser-thr_kin Poxvirus serine/threonine protein kinase. Ser/Thr kinase; Provisional 0
42752 332782 cl27961 Pox_A21 Poxvirus A21 Protein. hypothetical protein; Provisional 0
42753 332785 cl27964 Pox_A3L Poxvirus A3L Protein. virus redox protein; Provisional 0
42754 332788 cl27967 DUF705 Protein of unknown function (DUF705). This model represents a family of viral proteins of unknown function. These proteins are members, however, of the IIIC (TIGR01681) subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. All characterized members of the III subfamilies (IIIA, TIGR01662; IIIB, pfam03767) are phosphatases, including MDP-1, a member of subfamily IIIC (TIGR01681). No member of this subfamily is characterized with respect to particular function. All of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC. These proteins also include an N-terminal domain (ca. 125 aas) that is unique to this clade. 0
42755 332791 cl27970 Baculo_p47 Baculovirus P47 protein. viral transcription regulator p47; Provisional 0
42756 332792 cl27971 LEF-9 Late expression factor 9 (LEF-9). late expression factor 9; Provisional 0
42757 332793 cl27972 DUF678 Protein of unknown function (DUF678). hypothetical protein; Provisional 0
42758 332794 cl27973 Herpes_UL43 Herpesvirus UL43 protein. UL43 envelope protein; Provisional 0
42759 332795 cl27974 Pox_A11 Poxvirus A11 Protein. hypothetical protein; Provisional 0
42760 391803 cl27976 PIF3 Per os infectivity factor 3. per os infectivity factor 3; Provisional 0
42761 332800 cl27979 LEF-8 Late expression factor 8 (LEF-8). DNA-directed RNA polymerase subunit beta-like protein; Provisional 0
42762 332801 cl27980 DUF655 Protein of unknown function (DUF655). This family includes several uncharacterized archaeal proteins. This protein appears to contain two HHH motifs. 0
42763 332802 cl27981 Pox_M2 Poxvirus M2 protein. hypothetical protein; Provisional 0
42764 332804 cl27983 Pox_L5 Poxvirus L5 protein family. ORF051 putative membrane protein; Provisional 0
42765 332806 cl27985 Pox_E10 E10-like protein conserved region. sulfhydryl oxidase; Provisional 0
42766 391804 cl27986 Herpes_BBRF1 BRRF1-like protein. hypothetical protein; Provisional 0
42767 332808 cl27987 Pox_H7 Late protein H7. hypothetical protein; Provisional 0
42768 332809 cl27988 Pox_F17 DNA-binding 11 kDa phosphoprotein. ORF017 DNA-binding phosphoprotein; Provisional 0
42769 332810 cl27989 InvH InvH outer membrane lipoprotein. This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localization to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion. 0
42770 391805 cl27990 Agro_virD5 Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein. 0
42771 355692 cl27992 Pox_I5 Poxvirus protein I5. putative IMV membrane protein; Provisional 0
42772 332814 cl27993 Pox_F16 Poxvirus F16 protein. hypothetical protein; Provisional 0
42773 332816 cl27995 Microvir_H Microvirus H protein (pilot protein). minor spike protein 0
42774 332817 cl27996 Herpes_BTRF1 Herpesvirus BTRF1 protein conserved region. hypothetical protein; Provisional 0
42775 332818 cl27997 Pox_I3 Poxvirus I3 ssDNA-binding protein. DNA-binding phosphoprotein; Provisional 0
42776 332819 cl27998 Pox_E6 Pox virus E6 protein. Hypothetical protein; Provisional 0
42777 355693 cl27999 Herpes_pp85 Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25). DNA packaging tegument protein UL25; Provisional 0
42778 332821 cl28000 Pox_G5 Poxvirus G5 protein. Hypothetical protein; Provisional 0
42779 332822 cl28001 Pox_F15 Poxvirus protein F15. hypothetical protein; Provisional 0
42780 332823 cl28002 Herpes_UL55 Herpesvirus UL55 protein. nuclear protein UL55; Provisional 0
42781 332824 cl28003 Herpes_U44 Herpes virus U44 protein. tegument protein; Provisional 0
42782 332825 cl28004 Microvir_lysis Microvirus lysis protein (E), C-terminus. cell lysis protein 0
42783 332826 cl28005 Pox_VP8_L4R Poxvirus nucleic acid binding protein VP8/L4R. DNA-binding virion core protein; Provisional 0
42784 355694 cl28006 PHA02695 N/A. hypothetical protein; Provisional 0
42785 332830 cl28009 Poxvirus_B22R Poxvirus B22R protein. hypothetical protein; Provisional 0
42786 421565 cl28016 Hema_HEFG Hemagglutinin domain of haemagglutinin-esterase-fusion glycoprotein. 0
42787 332838 cl28017 Herpes_UL37_1 Herpesvirus UL37 tegument protein. UL37 tegument protein; Provisional 0
42788 332842 cl28021 Phage_mat-A Phage maturation protein. maturation protein 0
42789 332856 cl28035 Herpes_UL33 Herpesvirus UL33-like protein. DNA packaging protein UL33; Provisional 0
42790 355695 cl28037 PHA03163 N/A. hypothetical protein; Provisional 0
42791 332861 cl28040 Peptidase_C37 Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins. 0
42792 332864 cl28043 Peptidase_M44 Metallopeptidase from vaccinia pox. putative metalloprotease; Provisional 0
42793 332865 cl28044 Pox_E8 Poxvirus E8 protein. putative membrane protein; Provisional 0
42794 355696 cl28045 Herpes_UL46 Herpesvirus UL46 protein. tegument protein VP11/12; Provisional 0
42795 332867 cl28046 Pox_LP_H2 Viral late protein H2. putative viral membrane protein; Provisional 0
42796 332868 cl28047 Pox_L3_FP4 Poxvirus L3/FP4 protein. hypothetical protein; Provisional 0
42797 332869 cl28048 Pox_F12L Poxvirus F12L protein. EEV maturation protein; Provisional 0
42798 421566 cl28050 Herpes_VP19C Herpesvirus capsid shell protein VP19C. Capsid triplex subunit 1; Provisional 0
42799 355697 cl28051 PHA03144 N/A. helicase-primase primase subunit; Provisional 0
42800 332874 cl28053 Pox_I1 Poxvirus protein I1. putative DNA-binding virion core protein; Provisional 0
42801 332875 cl28054 Pox_Ag35 Pox virus Ag35 surface protein. late transcription factor VLTF-4; Provisional 0
42802 332877 cl28056 Herpes_UL21 Herpesvirus UL21. tegument protein UL21; Provisional 0
42803 332878 cl28057 Pox_P35 Poxvirus P35 protein. ORF059 IMV protein VP55; Provisional 0
42804 391808 cl28058 DNA_pol_B_2 DNA polymerase type B, organellar and viral. DNA polymerase; Provisional 0
42805 332886 cl28065 Herpes_UL79 UL79 family. hypothetical protein; Provisional 0
42806 355700 cl28066 Herpes_UL16 Herpesvirus UL16/UL94 family. tegument protein UL16; Provisional 0
42807 355701 cl28067 Herpes_UL87 Herpesvirus UL87 family. hypothetical protein; Provisional 0
42808 332889 cl28068 Pox_G9-A16 Pox virus entry-fusion-complex G9/A16. poxvirus myristoylprotein; Provisional 0
42809 332891 cl28070 gpD Bacteriophage scaffolding protein D. external scaffolding protein 0
42810 332896 cl28075 Flavi_E_C Immunoglobulin-like domain III (C-terminal domain) of Flavivirus envelope glycoprotein E. The C-terminal domain (domain III) of Flavivirus glycoprotein E appears to be involved in low-affinity interactions with negatively charged glycoaminoglycans on the host cell surface. Domain III may also play a role in interactions with alpha-v-beta-3 integrins in West Nile virus, Japanese encephalitis virus, and Dengue virus. The interface between domain I and domain III appears to be destabilized by the low-pH environment of the endosome, and domain III may play a vital role in the conformational changes of envelope glycoprotein E that follow the clathrin-mediated endocytosis of viral particles and are a prerequisite to membrane fusion. 0
42811 332900 cl28079 Herpes_Helicase Helicase. helicase-primase subunit BBLF4; Provisional 0
42812 421567 cl28085 US2 US2 family. virion protein US2; Provisional 0
42813 421568 cl28086 PsbN Photosystem II reaction centre N protein (psbN). photosystem II protein N 0
42814 421569 cl28088 L1R_F9L Lipid membrane protein of large eukaryotic DNA viruses. S-S bond formation pathway protein; Provisional 0
42815 355704 cl28089 TrkG Trk-type K+ transport system, membrane component [Inorganic ion transport and metabolism]. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. [Transport and binding proteins, Cations and iron carrying compounds] 0
42816 332926 cl28105 Vac_Fusion Chordopoxvirus multifunctional envelope protein A27. ORF104 fusion protein; Provisional 0
42817 332927 cl28106 Phage_B Scaffold protein B. internal scaffolding protein 0
42818 332933 cl28112 PhoU_div Protein of unknown function DUF47. An apparent homolog with a suggested function is Pit accessory protein from Sinorhizobium meliloti, which may be involved in phosphate (Pi) transport. [Hypothetical proteins, Conserved] 0
42819 332935 cl28114 MatK_N MatK/TrnK amino terminal region. maturase K 0
42820 421570 cl28115 Levi_coat Levivirus coat protein. coat protein 0
42821 421571 cl28116 Translat_reg Bacteriophage translational regulator. translation repressor protein; Provisional 0
42822 332939 cl28118 Cytomega_gL Cytomegalovirus glycoprotein L. envelope glycoprotein L; Provisional 0
42823 332940 cl28119 Polyoma_agno Polyomavirus agnoprotein. agnoprotein; Provisional 0
42824 332942 cl28121 Herpes_UL7 Herpesvirus UL7 like. UL7 tegument protein; Provisional 0
42825 332943 cl28122 Herpes_env Herpesvirus putative major envelope glycoprotein. DNA packaging protein UL32; Provisional 0
42826 332947 cl28126 Fibritin_C Fibritin C-terminal region. fibritin; Provisional 0
42827 332949 cl28128 Herpes_glycop Herpesvirus glycoprotein M. envelope glycoprotein M; Provisional 0
42828 421572 cl28129 Herpes_UL25 Herpesvirus UL25 family. DNA packaging tegument protein UL25; Provisional 0
42829 421573 cl28134 PsbT Photosystem II reaction centre T protein. photosystem II protein T 0
42830 391813 cl28136 MMTV_SAg Mouse mammary tumor virus superantigen. hypothetical protein; Provisional 0
42831 355705 cl28141 psaI N/A. photosystem I subunit VIII; Validated 0
42832 421574 cl28142 Viral_DNA_bp ssDNA binding protein. single-stranded DNA binding protein; Provisional 0
42833 332973 cl28153 Late_protein_L2 Late Protein L2. major capsid L1 protein; Provisional 0
42834 355706 cl28158 psbF N/A. photosystem II protein VI 0
42835 355708 cl28191 MSEP-CTERM MSEP-CTERM protein. Members of this subfamily average about 850 amino acids in length, ending with a variant form of PEP-CTERM sorting signal. Members have a VIT (vault protein inter-alpha-trypsin inhibitor heavy chain) domain (pfam08487). Other bacterial subfamilies of VIT domain proteins have members with either GlyGly-CTERM or LPXTG C-terminal sorting signals. Members of this subfamily occur only in context next to a protein sorting/processing enzyme, exosortase N (XrtN). These subsystems occur both among the Bacteroidetes and in the spirochete genus Leptospira. 0
42836 333016 cl28196 PchG Oxidoreductase (NAD-binding), involved in siderophore biosynthesis [Inorganic ion transport and metabolism]. This reductase is found associated with gene clusters for the biosynthesis of various non-ribosomal peptide derived natural products in which cysteine is cyclized to a thiazoline ring containing an imide double bond. Examples include yersiniabactin (irp3/YbtU, GP|21959262) and pyochelin (PchG, GP|4325022). 0
42837 333077 cl28257 SpsG Spore coat polysaccharide biosynthesis protein SpsG, predicted glycosyltransferase [Cell wall/membrane/envelope biogenesis]. This protein is found in association with enzymes involved in the biosynthesis of pseudaminic acid, a component of polysaccharide in certain Pseudomonas strains as well as a modification of flagellin in Campylobacter and Helicobacter. The role of this protein is unclear, although it may participate in N-acetylation in conjunction with, or in the absence of PseH (TIGR03585) as it often scores above the trusted cutoff to pfam00583 representing a family of acetyltransferases. 0
42838 421575 cl28269 CitF Citrate lyase, alpha subunit (CitF). This is a model of the alpha subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The alpha subunit catalyzes the reaction Acetyl-CoA + citrate = acetate + (3S)-citryl-CoA. The seed contains an experimentally characterized member from Lactococcus lactis subsp. lactis. The model covers both Gram positive and Gram negative bacteria. It is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation] 0
42839 333203 cl28383 Pus10 tRNA U54 and U55 pseudouridine synthase Pus10 [Translation, ribosomal structure and biogenesis]. Members of this family show twilight-zone similarity to several predicted RNA pseudouridine synthases. All trusted members of this family are archaeal. Several eukaryotic homologs lack N-terminal homology including two CXXC motifs. [Hypothetical proteins, Conserved] 0
42840 421576 cl28438 Phage_coatGP8 Phage major coat protein, Gp8. Class I phage major coat protein Gp8 or B. The coat protein is largely alpha-helix with a slight curve. 0
42841 333259 cl28439 Pox_I6 Poxvirus I6-like family. Hypothetical protein; Provisional 0
42842 391818 cl28444 Polyoma_coat2 Polyomavirus coat protein. VP3; Provisional 0
42843 421577 cl28445 PlantTI N/A. Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges. 0
42844 333282 cl28462 pnk bifunctional NADP phosphatase/NAD kinase. NAD kinase 0
42845 333312 cl28492 PheB ACT domain-containing protein [General function prediction only]. 0
42846 333347 cl28527 HYS2 Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B [Replication, recombination and repair]. 0
42847 333351 cl28531 ENDO3c Thermostable 8-oxoguanine DNA glycosylase [Replication, recombination and repair, Defense mechanisms]. N-glycosylase/DNA lyase; Provisional 0
42848 333356 cl28536 COG5412 Phage-related protein [Mobilome: prophages, transposons]. membrane protein P6 0
42849 355712 cl28539 PRK14982 N/A. This enzyme, found in cyanobacteria, reduces a long-chain (mainly C16 or C18) fatty acyl ACP ester to its corresponding fatty aldehyde, releasing the acyl carrier protein (ACP). NADPH or NADH is the reductant for this reaction. This enzyme may be distantly related to the short-chain dehydrogenase or reductase (SDR) family (pfam00106). The purpose of this reaction is in the first step of alkane biosynthesis (GenProp0942). [Central intermediary metabolism, Other] 0
42850 333370 cl28550 PilV Tfp pilus assembly protein PilV [Cell motility, Extracellular structures]. Pilus systems categorized as type IV pilins differ greatly from one another, with some showing greater similarity to type II or type III secretion systems than to each other. Members of this protein family represent the PilV protein of type IV pilus systems as found in Pseudomonas aeruginosa PAO1, Pseudomonas syringae DC3000, Neisseria meningitidis MC58, Xylella fastidiosa 9a5c, etc. [Cell envelope, Surface structures, Protein fate, Protein modification and repair] 0
42851 333390 cl28570 CoxE Uncharacterized conserved protein, contains von Willebrand factor type A (vWA) domain [Function unknown]. 0
42852 421578 cl28577 MVD1 Mevalonate pyrophosphate decarboxylase [Lipid transport and metabolism]. diphosphomevalonate decarboxylase 0
42853 355713 cl28581 PRK15430 EamA family transporter RarD. This uncharacterized protein is predicted to have many membrane-spanning domains. [Transport and binding proteins, Unknown substrate] 0
42854 333416 cl28596 NhaC Na+/H+ antiporter NhaC [Energy production and conversion]. A single member of the NhaC family, a protein from Bacillus firmus, has been functionally characterized. It is involved in pH homeostasis and sodium extrusion. Members of the NhaC family are found in both Gram-negative bacteria and Gram-positive bacteria. Intriguingly, archaeal homolog ArcD (just outside boundaries of family) has been identified as an arginine/ornithine antiporter. [Transport and binding proteins, Cations and iron carrying compounds] 0
42855 333419 cl28599 COG1318 Predicted transcriptional regulator [Transcription]. This model describes a common domain shared by two different families of proteins, each of which occurs regularly next to its corresponding partner family, a probable regulatory with homology to KaiC. By implication, this protein family likely is also involved in sensory transduction and/or regulation. 0
42856 421579 cl28607 PHA02840 N/A. hypothetical protein; Provisional 0
42857 333429 cl28609 PHA03178 N/A. UL43 envelope protein; Provisional 0
42858 391819 cl28610 Herpes_HEPA Herpesvirus DNA helicase/primase complex associated protein. hypothetical protein; Provisional 0
42859 333432 cl28612 PHA03128 N/A. dUTPase; Provisional 0
42860 333435 cl28615 PHA02681 N/A. putative IMV membrane protein; Provisional 0
42861 333436 cl28616 PHA02670 N/A. GM-CSF/IL-2 inhibition factor; Provisional 0
42862 333439 cl28619 PHA03415 N/A. virion protein; Provisional 0
42863 333447 cl28627 PLN00046 N/A. Members of this family are the PsaO protein of photosystem I. This protein is found in chloroplasts but not in Cyanobacteria. 0
42864 421580 cl28628 Reoviridae_Vp9 Reoviridae VP9. This model, broader than related pfam08978, describes proteins VP9 in Coltivirus, and proteins with various designations in the seadornavirus group: VP9 in Banna virus, VP10 in Liao ning virus, and VP11 in Kadipiro virus. 0
42865 333450 cl28630 PRK15358 type III secretion systems effector SseF. pathogenicity island 2 effector protein SseG; Provisional 0
42866 333451 cl28631 PRK15355 N/A. This model represents the conserved C-terminal domain of a protein conserved across species in the bacterial type III secretion apparatus. This protein is designated YscI (Yop proteins translocation protein I) in Yersinia and HrpB (hypersensitivity response and pathogenicity protein B) in plant pathogens such as Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
42867 333457 cl28637 PHA03175 N/A. US22 family homolog; Provisional 0
42868 333459 cl28639 Orthopox_F14 Orthopoxvirus F14 protein. hypothetical protein; Provisional 0
42869 333460 cl28640 PHA02693 N/A. Hypothetical protein; Provisional 0
42870 333461 cl28641 19 N/A. baseplate subunit; Provisional 0
42871 391821 cl28642 NAD4L NADH dehydrogenase subunit 4L (NAD4L). NADH dehydrogenase subunit 4L; Provisional 0
42872 333464 cl28644 PRK09781 N/A. hypothetical protein; Provisional 0
42873 391822 cl28645 CHIPS Chemotaxis-inhibiting protein CHIPS. chemotaxis-inhibiting protein CHIPS; Reviewed 0
42874 333468 cl28648 Pox_TAP Viral Trans-Activator Protein. late transcription factor VLTF-1; Provisional 0
42875 333469 cl28649 PHA03043 N/A. putative virulence factor; Provisional 0
42876 333472 cl28652 PHA02837 N/A. Toll/IL-receptor-like protein; Provisional 0
42877 333473 cl28653 PHA02836 N/A. hypothetical protein; Provisional 0
42878 333476 cl28656 PHA02725 N/A. hypothetical protein; Provisional 0
42879 355716 cl28658 DUF1406 Protein of unknown function (DUF1406). hypothetical protein; Provisional 0
42880 333479 cl28659 End_beta_propel Catalytic beta propeller domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 80 amino acids in length. This domain is the beta barrel domain of bacteriophage endosialidase which represents one of the two sialic acid binding sites of the enzyme. The domain is nested in the beta propeller domain of the endosialidase enzyme. The endosialidase protein complexes to form homotrimeric molecules. 0
42881 333481 cl28661 DUF2717 Protein of unknown function (DUF2717). hypothetical protein 0
42882 333482 cl28662 RepA1_leader Tap RepA1 leader peptide. This protein is a translated leader peptide that acts in the regulation of the expression of the plasmid replication protein RepA in incF2 group plasmids. [Mobile and extrachromosomal element functions, Plasmid functions] 0
42883 333484 cl28664 Amb_V_allergen Amb V Allergen. Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulfhydryl groups also play a major role in the T cell recognition of cross-reactivity T cell epitopes within these related allergens. 0
42884 333485 cl28665 PapG_CBD N/A. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan. 0
42885 421582 cl28728 TIGR02687 TIGR02687 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 880 amino acids in length. This protein is repeatedly found upstream of another uncharacterized protein of about 470 amino acids in length, modeled by TIGR02688. 0
42886 421593 cl28752 Penicillinase_R Penicillinase repressor. The penicillinase repressor negatively regulates expression of the penicillinase gene. The N-terminal region of this protein is involved in operator recognition, while the C-terminal is responsible for dimerization of the protein. 0
42887 421665 cl28849 CrtC-like carotenoid 1,2-hydratase and similar proteins. This family includes Aspergillus nidulans tyrosinase family protein asqI (aspoquinolone biosynthesis protein I) that is part of the gene cluster that mediates the biosynthesis of the aspoquinolone mycotoxins. 0
42888 421686 cl28876 PCSK9_C-CRD proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain (CRD). This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD, shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three, three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds. 0
42889 333703 cl28883 PA N/A. PA_M28_1_3: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 3. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup. 0
42890 421687 cl28889 CA_like Cadherin repeat-like domain. This domain is found in a range of enzymes that act on branched substrates - isoamylase, pullulanase and branching enzyme. This family also contains the beta subunit of 5' AMP activated kinase. 0
42891 333710 cl28890 FYVE_like_SF FYVE domain like superfamily. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 0
42892 421688 cl28891 ParB_N_Srx ParB N-terminal domain and sulfiredoxin protein-related families. This is family of bacterial proteins likely to be necessary for binding to DNA and recognising the modification sites. Members are found in bacteria, archaea and on viral plasmids, and are typically between 354 and 474 amino acids in length. There is a conserved DGQHR sequence motif. 0
42893 333712 cl28892 CNF1_CheD_YfiH-like cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and YfiH (DUF152) are distant homologs. This subfamily contains Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) and dermonecrotic toxin (DNT) from Bordetella species, as well as Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549), and similar proteins. CNF1 causes alteration of the host cell actin cytoskeleton and promotes bacterial invasion of blood-brain barrier endothelial cells. E. coli CNF1 constitutively activates host small G proteins such as RhoA and Cdc42 by deamidating a glutamine residue essential for GTP hydrolysis. DNT stimulates the assembly of actin stress fibers and focal adhesions by deamidation/polyamination of a specific glutamine of the small GTPase Rho. CNF1 and DNT are A-B toxins composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain; their homology is restricted to the catalytic domains at the C termini of the toxins, suggesting that they share a similar molecular mechanism. BLF1, a toxin that inhibits helicase activity of translation factor eIF4A, is similar to the catalytic domain of Escherichia coli CNF1 (CNF1-C); although CNF1-C and BLF1 show little sequence identity, the active sites have the conserved LSGC (Leu, Ser, Gly, Cys) motif. 0
42894 333713 cl28893 CpcS_T S- and T-type phycobiliprotein (PBP) lyases. This family contains the S-type phycobiliprotein (PBP) lyase (denoted CpcS/CpcU or CpeS/CpeU). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The S-type lyase is distantly related to CpcT and similarly adopts a beta-barrel structure with a modified lipocalin fold. Many members of the CpcS/CpcU family ligate phycocyanobilin (PCB) to a specific cysteine residue in the beta-subunits of phycocyanin (CpcB) or phycoerythrocyanin (PecB) and to a related cysteine residue in the alpha and beta subunits of allophycocyanin (AP); they are typically given the designation of "CpcS" or "CpcU". Other members which attach phycoerythrobilin (PEB) to the beta-subunit of phycoerythrin (PE) are given the designation "CpeS" or "CpeU". In Guillardia theta, a Cryptophyte, which has adopted phycoerythrobilin (PEB) biosynthesis from cyanobacteria, phycobiliprotein lyase has been shown to provide structural requirements for the transfer of this chromophore to the specific cysteine residue of the apophycobiliprotein. 0
42895 333715 cl28895 EFh_PI-PLC EF-hand motif found in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) isozymes. PRIP-2, also termed phospholipase C-L2, or phospholipase C-epsilon-2 (PLC-epsilon-2), or inactive phospholipase C-like protein 2 (PLC-L2), is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that exhibits a relatively ubiquitous expression. It functions as a novel negative regulator of B-cell receptor (BCR) signaling and immune responses. PRIP-2 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-2 does not have PLC enzymatic activity. 0
42896 333716 cl28896 EFh_MICU EF-hand, calcium binding motif, found in mitochondrial calcium uptake proteins MICU1, MICU2, MICU3, and similar proteins. MICU3, also termed EF-hand domain-containing family member A2 (EFHA2), is a paralog of MICU1 and notably found in the central nervous system (CNS) and skeletal muscle. At present, the precise molecular function of MICU3 remains unclear. It likely has a role in mitochondrial calcium handling. MICU3 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices. 0
42897 421689 cl28897 7tm_GPCRs seven-transmembrane G protein-coupled receptor superfamily. The viral group I rhodopsins includes Phaeocystis globosa virus 12T divergent type-1 DTS-motif rhodopsin (VirRDTS), a green light-absorbing proton pump that has a structure similar to that of bacteriorhodopsin (BR) and transfers light energy in a manner that substantially changes medium pH when expressed in a cell. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototropic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors. 0
42898 333718 cl28898 Peptidase_M48_M56 Peptidases M48 (Ste24 endopeptidase or htpX homolog) and M56 (in MecR1 and BlaR1), integral membrane metallopeptidases. This family contains peptidase family M48 subfamily A-like CaaX prenyl protease 1, most of which are uncharacterized. Some of these contain tetratricopeptide (TPR) repeats at the C-terminus. Proteins in this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be possibly associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) in the C-terminal. These proteins putatively remove the C-terminal three residues of farnesylated proteins proteolytically. 0
42899 421690 cl28899 DEAD-like_helicase_N N-terminal helicase domain of the DEAD-box helicase superfamily. This domain is found at the C-terminus of DEAD-box helicases. 0
42900 355777 cl28901 MCM MCM helicase family. archaeal MCM proteins form a homohexameric ring homologous to the eukaryotic Mcm2-7 helicase and also function as the replicative helicase at the replication fork 0
42901 355778 cl28902 ARID ARID/BRIGHT DNA binding domain family. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex, which is a signal-sensing modulator of histone methylation and gene transcription. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Its polymorphism has been associated with risk for pediatric acute lymphoblastic leukemia (ALL). ARID5B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which can bind both the major and minor grooves of its target sequences. 0
42902 421692 cl28903 BACK BACK (BTB and C-terminal Kelch) domain. This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation). 0
42903 421693 cl28904 PTP_DSP_cys cys-based protein tyrosine phosphatase and dual-specificity phosphatase superfamily. This family is closely related to the pfam00102 and pfam00782 families. 0
42904 421694 cl28905 PIN_SF PIN (PilT N terminus) domain: Superfamily. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 153 and 170 amino acids in length. There is a single completely conserved residue D that may be functionally important. 0
42905 355783 cl28907 ArfGap GTPase-activating protein (GAP) for the ADP ribosylation factors (ARFs). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG2 is a member of the HIV-1 Rev binding protein (HRB) family and contains one Arf-GAP zinc finger domain, several Phe-Gly (FG) motifs, and four Asn-Pro-Phe (NPF) motifs. AGFG2 interacts with Eps15 homology (EH) domains and plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication. 0
42906 421695 cl28910 MFS Major Facilitator Superfamily. This is a family of transport proteins. Members of this family include a protein responsible for the secretion of the ferric chelator, enterobactin, and a protein involved in antibiotic resistance. 0
42907 355788 cl28912 LGIC_ECD extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels (also known as ligand-gated ion channel (LGIC)). This family contains extracellular domain (ECD) of the rho subunit 3 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR3 gene which maps to a different chromosome to that of GABRR1 and GABRR2. While close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event, GABRR3 may have arisen by duplication of a GABRR1/GABRR2 progenitor. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, some individuals contain a variant that is predicted to inactivate this gene product. 0
42908 421697 cl28914 CD_CSD CHROMO (CHRomatin Organization Modifier) domains and chromo shadow domains. This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 - residues 61-63 - that is located at the side opposite the knot in the tudor domain-chromodomain; stabilisation of Ht2 is essential for RNA binding. 0
42909 355792 cl28916 HipA-like serine/threonine-protein kinases similar to HipA and CtkA. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Shewanella oneidensis HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Shewanella oneidensis HipAB:DNA promoter complex, HipB forms a dimer that binds the duplex operator DNA, with each HipB monomer interacting with separate HipA monomers. The HipAB component of the complex is composed of two HipA and two HipB subunits. 0
42910 355793 cl28917 Alpha_kinase Alpha kinase family. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions. 0
42911 355795 cl28919 ING Inhibitor of growth (ING) domain family. The ING family includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail. 0
42912 355796 cl28920 STAT_DBD DNA-binding domain of Signal Transducer and Activator of Transcription (STAT). This family consists of the DNA-binding domain (DBD) of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). The DNA binding domain has an Ig-like fold. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 play important roles in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that Meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distance metastasis of brain tumors to extracranial sites. 0
42913 355797 cl28921 STAT_CCD Coiled-coil domain of Signal Transducer and Activator of Transcription (STAT), also called alpha domain. This family consists of the coiled-coil (alpha) domain of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). Similar to STAT3 and STAT5, the coiled-coil domain (CCD) of STAT6 is required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 play important roles in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that Meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distance metastasis of brain tumors to extracranial sites. 0
42914 421700 cl28922 Ubiquitin_like_fold Beta-grasp ubiquitin-like fold. This domain is the binding/interacting region of several protein kinases, such as the Schizosaccharomyces pombe Byr2. Byr2 is a Ser/Thr-specific protein kinase acting as mediator of signals for sexual differentiation in S. pombe by initiating a MAPK module, which is a highly conserved element in eukaryotes. Byr2 is activated by interacting with Ras, which then translocates the molecule to the plasma membrane. Ras proteins are key elements in intracellular signaling and are involved in a variety of vital processes such as DNA transcription, growth control, and differentiation. They function like molecular switches cycling between GTP-bound 'on' and GDP-bound 'off' states. 0
42915 421701 cl28923 PIPKc Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyzes the formation of phosphatidylinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway. 0
42916 421702 cl28926 23S_rRNA_IVP 23S rRNA-intervening sequence protein. This family describes a protein of unknown function whose structure is a bundle of four long alpha helices. Some of the first members of this family were found encoded in the (atypically large) intervening sequence (IVS) of Leptospira 23S RNA, a region often present in the rRNA gene and removed during rRNA processing without re-ligation. However, this location is not conserved, and naming this protein as a 23S RNA protein is both confusing and inaccurate. 0
42917 355803 cl28927 Avd_IVP_like proteins similar to the diversity-generating retroelement protein bAvd. A family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes. The distantly related bAvd protein, which also forms a homopentamer of four-helix bundles, has been suggested to interact with nucleic acids and a reverse transcriptase. 0
42918 355804 cl28928 VirB8_like virulence protein VirB8. This family includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain is similar to the type IV secretion system (T4ASS) component VirB8 and possibly has a similar fold to the nuclear transport factor-2 (NTF-2)-like superfamily. 0
42919 421703 cl28929 VirB10_like VirB10 and similar proteins form part of core complex in Type IV secretion system (T4SS). This family contains DotG/IcmE (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. The Dot/Icm system is a T4SS found in the pathogens Legionella and Coxiella and the conjugative apparatus of IncI plasmids; T4SS is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. Similar to T4SS VirB/D components, the Legionella Dot/Icm secretion apparatus contains a critical five-protein sub-assembly that forms the membrane-spanning 'core-complex' (CC), around which all other components assemble. This transmembrane connection is mediated by protein dimer pairs consisting of two inner membrane proteins, DotF and DotG, each independently associating with DotH/DotC/DotD in the outer membrane. 0
42920 421705 cl28933 Riboflavin_synthase_like Riboflavin synthase and similar proteins. This domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine. 0
42921 421711 cl28984 E_set Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus. AMPK1_CBM is a family found in close association with AMPKBI pfam04739. The surface of AMPK1_CBM reveals a carbohydrate-binding pocket. 0
42922 421712 cl28996 PolY Y-family of DNA polymerases. These proteins are involved in UV protection. 0
42923 391963 cl29010 DNA_alkylation DNA alkylation repair enzyme. Proteins in this family are predicted to be DNA alkylation repair enzymes. The structure of a hypothetical protein in this family shows it to adopt a supercoiled alpha helical structure. 0
42924 355888 cl29012 RNAP_largest_subunit_C Largest subunit of RNA polymerase (RNAP), C-terminal domain. Archaeal RNA polymerase (RNAP), like bacterial RNAP, is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. The relative positioning of the RNAP core is highly conserved between archaeal RNAP and the three classes of eukaryotic RNAPs. In archaea, the largest subunit is split into two polypeptides, A' and A'', which are encoded by separate genes in an operon. Sequence alignments reveal that the archaeal A'' subunit corresponds to the C-terminal one-third of the RNAPII largest subunit (Rpb1). In subunit A'', several loops in the jaw domain are shorter. The RNAPII Rpb1 interacts with the second-largest subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. 0
42925 355925 cl29049 DUF5409 Family of unknown function (DUF5409). hypothetical protein; Provisional 0
42926 421719 cl29051 Bac_rhamnosid6H Bacterial alpha-L-rhamnosidase 6 hairpin glycosidase domain. This family includes human glycogen branching enzyme AGL. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homolog GDB1 that mutations in this region disrupt the enzymes Amylo-alpha-1,6-glucosidase (EC:3.2.1.33). 0
42927 355929 cl29053 ArenaCapSnatch Arenavirus cap snatching domain. This model describes a shared signature region from an RNA endonuclease region associated with cap-snatching for mRNA production by RNA viruses. This domain usually is part of a multifunctional protein, the L protein responsible for RNA-dependent RNA polymerase activity. Cap-snatching is a viral alternative to synthesizing a eukaryotic-like mRNA cap itself. 0
42928 421721 cl29069 MASE4 Membrane-associated sensor, integral membrane domain. MASE3 (Membrane-Associated SEnsor) is an integral membrane sensor domain of unknown specificity found in histidine kinases, diguanylate cyclases and protein phosphatases in various bacteria and archaea. 0
42929 421723 cl29075 ComP_DUS Type IV minor pilin ComP, DNA uptake sequence receptor. ComP-DUS is the DNA-uptake sequence receptor of pathogenic Proteobacteria. ComP is one of three minor (low abundance) type IV pilins in pathogenic Neisseria species (with PilV and PilX). These modulate Tfp-mediated properties without affecting Tfp biogenesis. ComP plays a prominent role in competence at the level of DNA uptake. ComP is exposed on the surface of Neisseria filaments, and it is this that recognizes homotypic DNA through genus-specific DNA uptake sequence (DUS) motifs. 0
42930 421724 cl29080 Glyco_hydro_32C Glycosyl hydrolases family 32 C terminal. This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glycosyl hydrolases, but the function of this protein is unknown. 0
42931 421725 cl29083 Spaetzle Spaetzle. This family of proteins are nerve growth factor-like ligands required in the pathway that establishes the dorsal-ventral pattern of the embryo. They form a cystine knot structure. 0
42932 421727 cl29093 SynN N/A. This domain includes syntaxin-like domains including from the Vam3p protein. 0
42933 421729 cl29100 MutL MutL protein. This small family includes, so far, an uncharacterized protein from E. coli O157:H7 and GlmL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family (pfam01968). 0
42934 421730 cl29105 FLgD_tudor FlgD Tudor-like domain. flagellar basal body rod modification protein; Reviewed 0
42935 421731 cl29107 Ligase_CoA CoA-ligase. This domain contains the catalytic domain from Succinyl-CoA ligase alpha subunit and other related enzymes. A conserved histidine is involved in phosphoryl transfer. 0
42936 421732 cl29110 HSDR_N Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA. 0
42937 421733 cl29114 BamD BamD lipoprotein, a component of the beta-barrel assembly machinery. BamD, also called YfiO, is part of the beta-barrel assembly machinery (BAM), which is essential for the folding and insertion of outer membrane proteins (OMPs) in the OM of Gram-negative bacteria. Transmembrane OMPs carry out important functions including nutrient and waste management, cell adhesion, and structural roles. The BAM complex is composed of the beta-barrel OMP BamA (also called Omp85/YaeT) and four lipoproteins BamBCDE. BamD is the only BAM lipoprotein required for viability. Both BamA and BamD are broadly distributed in Gram-negative bacteria, and may constitute the core of the BAM complex. BamD contains five Tetratricopeptide repeats (TPRs). The three TPRs at the N-terminus may participate in interaction with substrates, while the two TPRs in the C-terminus may be involved in binding with other BAM components. 0
42938 391992 cl29116 Cytochrome_C554 Cytochrome c554 and c-prime. This domain carries up to seven CxxCH repeated sequence motifs, characteristic of multi-haem cytochromes. 0
42939 421736 cl29122 Rotamase_2 PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline. 0
42940 421737 cl29123 Phage_int_SAM_5 Phage integrase SAM-like domain. This domain is found in a variety of phage integrase proteins. 0
42941 421739 cl29132 GspH Type II transport protein GspH. GspH is involved in bacterial type II export systems. Like all pilins, GspH has an N-terminal alpha helix. This helix is followed by nine beta strands forming two beta sheets, one of five antiparallel strands and one of four antiparallel strands. GspH is a minor pseudopilin; it is expressed much less than other pseudopilins in the type II secretion pilus (major pilins). The function and localization of minor pseudopilins are still to be fully unraveled. It has been suggested that some minor pseudopilins may assemble either into the base or the tip of pili, or both. They function as initiators or regulators of pilus biogenesis and dynamics, and/or as adaptors between various pseudopilin components and other members of the T2SS. 0
42942 421740 cl29134 Nup188 Nucleoporin subcomplex protein binding to Pom34. This is a family of eukaryotic nucleoporins of several different sizes. All of them are long and form the scaffold of the nuclear pore complex. Nup192 in particular modulates the permeability of the central channel of the NPC central or nuclear pore complex. 0
42943 421741 cl29139 Beta-Casp Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. 0
42944 421742 cl29140 DUF2815 Protein of unknown function (DUF2815). single-stranded DNA-binding protein 0
42945 356017 cl29141 PLN02918 N/A. This model is similar to Pyridox_oxidase from Pfam but is designed to find only true pyridoxamine-phosphate oxidase and to ignore the related protein PhzG involved in phenazine biosynthesis. This protein from E. coli was characterized as a homodimer with two FMN per dimer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine] 0
42946 421744 cl29146 NADH_4Fe-4S NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 0
42947 421745 cl29147 NADH-G_4Fe-4S_3 NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 0
42948 421746 cl29148 Proteasome_A_N Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins. 0
42949 421747 cl29154 ClpB_D2-small C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2. 0
42950 421748 cl29165 BHD_3 Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA. 0
42951 421749 cl29167 PhageMin_Tail Phage-related minor tail protein. This model represents a reasonably well conserved core region of a family of phage tail proteins. The member from phage TP901-1 was characterized as a tail length tape measure protein in that a shortened form of the protein leads to phage with proportionately shorter tails. [Mobile and extrachromosomal element functions, Prophage functions] 0
42952 421750 cl29174 MethyTransf_Reg Predicted methyltransferase regulatory domain. Members of this family of domains are found in various prokaryotic methyltransferases, where they regulate the activity of the methyltransferase domain. 0
42953 421751 cl29183 RQC RQC domain. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain. 0
42954 421754 cl29203 Alpha-mann_mid Alpha mannosidase middle domain. Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase. 0
42955 421756 cl29215 LeuA_dimer LeuA allosteric (dimerization) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich. 0
42956 356100 cl29224 PulG Type II secretory pathway, pseudopilin PulG [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
42957 421759 cl29226 Cadherin Cadherin domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. 0
42958 421761 cl29229 Sel1 Sel1 repeat. These represent a subfamily of TPR (tetratricopeptide repeat) sequences. 0
42959 421762 cl29235 HisG_C HisG, C-terminal domain. This domain corresponds to the C-terminal third of the HisG protein. It is absent in many lineages. 0
42960 421763 cl29236 tRNA_SAD Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain. 0
42961 421764 cl29237 PBP5_C Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides. 0
42962 421765 cl29239 PYNP_C Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/ beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer. 0
42963 421766 cl29240 Alpha-amyl_C2 Alpha-amylase C-terminal beta-sheet domain. This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet. 0
42964 392030 cl29241 CobW_C Cobalamin synthesis protein cobW C-terminal domain. CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression. 0
42965 421767 cl29255 DnaB_2 Replication initiation and membrane attachment. This model represents the conserved domain of DnaD, part of Bacillus subtilis replication restart primosome, and of a number of phage-associated proteins. Members, both chromosomal or phage-associated, are found in the Bacillus/Clostridium group of Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions] 0
42966 421768 cl29256 Curlin_rpt Curlin associated repeat. This family consists of several bacterial repeats of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibers are thin aggregative surface fibers, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibers are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibers is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB. 0
42967 421769 cl29260 BATS Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerisation. 0
42968 421770 cl29262 Hyd_WA Propeller. Probable beta-propeller. 0
42969 421771 cl29265 TnpB_IS66 IS66 Orf2 like protein. The IS66 family insertion sequence element encodes a DDE transposase TnpC, and two accessory proteins, TnpA and TnpB. It has been assumed that the TnpA, TnpB, and TnpC proteins are produced independently in appropriate amounts and form a complex, which acts as a transposase to promote the transposition of an IS66 family element. 0
42970 421774 cl29279 MutS_III MutS domain III. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam00488. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds in part with globular domain IV, which is involved in DNA binding, in Thermus aquaticus MutS. 0
42971 421778 cl29298 DapB_C Dihydrodipicolinate reductase, C-terminus. dihydrodipicolinate reductase; Provisional 0
42972 421780 cl29306 Bac_GDH Bacterial NAD-glutamate dehydrogenase. glutamate dehydrogenase 2; Provisional 0
42973 421781 cl29307 BON BON domain. This domain is found in a family of osmotic shock protection proteins. It is also found in some Secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. 0
42974 392048 cl29316 Transposase_31 Putative transposase, YhgA-like. This family of putative transposases includes the YhgA sequence from Escherichia coli and several prokaryotic homologs. 0
42975 421784 cl29317 GcpE GcpE protein. In a variety of organisms, including plants and several eubacteria, isoprenoids are synthesized by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. GcpE gene of Escherichia coli is involved in this pathway. 0
42976 421785 cl29319 HA2 Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding. 0
42977 392051 cl29320 KilA-N KilA-N domain. N1R/p28-like protein; Provisional 0
42978 421786 cl29321 ZipA N/A. This family represents the ZipA C-terminal domain. ZipA is involved in septum formation in bacterial cell division. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring. The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ. 0
42979 421787 cl29323 TPK_B1_binding Thiamin pyrophosphokinase, vitamin B1 binding domain. Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggest that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 0
42980 421788 cl29327 DFP DNA / pantothenate metabolism flavoprotein. phosphopantothenate--cysteine ligase; Validated 0
42981 421791 cl29338 Ribosomal_L11 N/A. The N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain. 0
42982 421792 cl29344 PhnA PhnA domain. 0
42983 421793 cl29347 FtsQ Cell division protein FtsQ. cell division protein FtsQ; Provisional 0
42984 421794 cl29357 BK_channel_a Calcium-activated BK potassium channel alpha subunit. This family represents a short region in the middle of largely plant proteins that belong to the TCDB:1.A.1.23.2 family of the voltage-gated ion channel superfamily, eg UniProtKB:Q5H8A6, Q5H8A5 and Q4VY51. 0
42985 421795 cl29360 B5 tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits. 0
42986 421796 cl29361 CorC_HlyC Transporter associated domain. This small domain is found in a family of proteins with the DUF21 domain and two CBS domains with this domain found at the C-terminus of the proteins, the domain is also found at the C terminus of some Na+/H+ antiporters. This domain is also found in CorC that is involved in Magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates. 0
42987 421797 cl29362 CO_deh_flav_C CO dehydrogenase flavoprotein C-terminal domain. 0
42988 421798 cl29364 DPBB_1 Lytic transglycolase. Putative EG45-like domain containing protein 1; Provisional 0
42989 421799 cl29368 LPMO_10 Lytic polysaccharide mono-oxygenase, cellulose-degrading. spherodin-like protein; Provisional 0
42990 421800 cl29370 FAR_C C-terminal domain of fatty acyl CoA reductases. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included. 0
42991 421801 cl29371 Transposase_21 Transposase family tnp2. This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to C. elegans. Note that this family contains a number of conserved cysteine and histidine residues. 0
42992 421804 cl29392 UreE N/A. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. 0
42993 421805 cl29395 CPSase_L_D3 Carbamoyl-phosphate synthetase large chain, oligomerization domain. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. 0
42994 421806 cl29396 Biotin_carb_C Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain. 0
42995 421809 cl29414 RPEL RPEL repeat. The RPEL repeat is named after four conserved amino acids it contains. The RPEL motif binds to actin. 0
42996 421811 cl29417 Ald_Xan_dh_C2 Molybdopterin-binding domain of aldehyde dehydrogenase. xanthine dehydrogenase subunit XdhA; Provisional 0
42997 421812 cl29418 Dak2 DAK2 domain. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the subunit homologous to the E. coli YcgS subunit. 0
42998 421813 cl29420 ERCC4 ERCC4 domain. This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases. 0
42999 421814 cl29422 B12-binding_2 B12 binding domain. Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin. Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases. 0
43000 421816 cl29427 PhzC-PhzF Phenazine biosynthesis-like protein. CntK (cobalt and nickel transport system protein K) is a histidine racemase that performs the first step in the biosynthesis of staphylopine, a metallophore involved in the import of multiple divalent cations. It was first characterized in Staphylococcus aureus. 0
43001 421817 cl29429 Cyanase_C N/A. Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain. 0
43002 421818 cl29433 Bac_transf Bacterial sugar transferase. This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways. 0
43003 421825 cl29459 Glucosaminidase Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase. Eubacterial enzymes distantly related to eukaryotic lysozymes. 0
43004 421827 cl29462 AICARFT_IMPCHas AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT), this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. 0
43005 421829 cl29468 RimM RimM N-terminal domain. 16S rRNA-processing protein RimM; Provisional 0
43006 421830 cl29469 MgtE Divalent cation transporter. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.) 0
43007 421832 cl29474 Flavokinase Riboflavin kinase. Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme. 0
43008 421834 cl29482 UPF0051 Uncharacterized protein family (UPF0051). This protein, SufD, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. SufB and SufD are homologous. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
43009 421835 cl29486 IlvC Acetohydroxy acid isomeroreductase, catalytic domain. ketol-acid reductoisomerase; Provisional 0
43010 421837 cl29490 Porphobil_deam Porphobilinogen deaminase, dipyromethane cofactor binding domain. porphobilinogen deaminase; Provisional 0
43011 421841 cl29503 GreA_GreB Transcription elongation factor, GreA/GreB, C-term. This domain has an FKBP-like fold. 0
43012 421844 cl29515 NifU NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 0
43013 421845 cl29525 FliMN_C Type III flagellar switch regulator (C-ring) FliN C-term. flagellar motor switch protein; Validated 0
43014 421846 cl29526 SecA_PP_bind SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain. 0
43015 421847 cl29527 ATase N/A. This domain is a 3 helical bundle. 0
43016 421849 cl29529 Transgly Transglycosylase. This family is one of the transglycosylases involved in the late stages of peptidoglycan biosynthesis. Members tend to be small, about 240 amino acids in length, and consist almost entirely of a domain described by pfam00912 for transglycosylases. Species with this protein will have several other transglycosylases as well. All species with this protein are Proteobacteria that produce murein (peptidoglycan). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 0
43017 421850 cl29530 EF_TS Elongation factor TS. elongation factor Ts 0
43018 421851 cl29532 Peptidase_M17 N/A. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. 0
43019 421852 cl29533 XPG_I XPG I-region. domain in nucleases 0
43020 356411 cl29535 HNS Domain in histone-like proteins of HNS family. 0
43021 421854 cl29546 Pumilio Pumilio-family RNA binding domain. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. The Pfam model does not necessarily recognize all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure. 0
43022 421855 cl29549 beta_clamp N/A. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 0
43023 421859 cl29555 IQ IQ calmodulin-binding motif. Short calmodulin-binding motif containing conserved Ile and Gln residues. 0
43024 421860 cl29556 Glycos_transf_3 Glycosyl transferase family, a/b domain. anthranilate phosphoribosyltransferase; Provisional 0
43025 356436 cl29560 SNc Staphylococcal nuclease homologues. 0
43026 421861 cl29561 EAL N/A. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site. 0
43027 421862 cl29562 Ribosomal_L7_L12 N/A. Ribosomal protein L7/L12. Ribosomal protein L7/L12 refers to the large ribosomal subunit proteins L7 and L12, which are identical except that L7 is acetylated at the N terminus. It is a component of the L7/L12 stalk, which is located at the surface of the ribosome. The stalk base consists of a portion of the 23S rRNA and ribosomal proteins L11 and L10. An extended C-terminal helix of L10 provides the binding site for L7/L12. L7/L12 consists of two domains joined by a flexible hinge, with the helical N-terminal domain (NTD) forming pairs of homodimers that bind to the extended helix of L10. It is the only multimeric ribosomal component, with either four or six copies per ribosome that occur as two or three dimers bound to the L10 helix. L7/L12 is the only ribosomal protein that does not interact directly with rRNA, but instead has indirect interactions through L10. The globular C-terminal domains of L7/L12 are highly mobile. They are exposed to the cytoplasm and contain binding sites for other molecules. Initiation factors, elongation factors, and release factors are known to interact with the L7/L12 stalk during their GTP-dependent cycles. The binding site for the factors EF-Tu and EF-G comprises L7/L12, L10, L11, the L11-binding region of 23S rRNA, and the sarcin-ricin loop of 23S rRNA. Removal of L7/L12 has minimal effect on factor binding and it has been proposed that L7/L12 induces the catalytically active conformation of EF-Tu and EF-G, thereby stimulating the GTPase activity of both factors. In eukaryotes, the proteins that perform the equivalent function to L7/L12 are called P1 and P2, which do not share sequence similarity with L7/L12. However, a bacterial L7/L12 homolog is found in some eukaryotes, in mitochondria and chloroplasts. In archaea, the protein equivalent to L7/L12 is called aL12 or L12p, but it is closer in sequence to P1 and P2 than to L7/L12. 0
43028 421864 cl29579 HisKA N/A. dimerization and phospho-acceptor domain of histidine kinases. 0
43029 421865 cl29584 RF-1 RF-1 domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis. 0
43030 421866 cl29593 WD40 N/A. Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. 0
43031 421867 cl29594 PI-PLC-Y Phosphatidylinositol-specific phospholipase C, Y domain. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs. 0
43032 421868 cl29595 PTS_IIB_glc N/A. PTS_IIB, PTS system, glucose/sucrose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation 0
43033 421870 cl29608 ZnF_GATA N/A. This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However there are several proteins which only contain a single copy of the domain. 0
43034 421877 cl29653 Pilin Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. The Prosite pattern detects those better. 0
43035 421878 cl29654 CCP Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in seventh CCP domain causes deficiency of the b subunit of factor XIII. 0
43036 421879 cl29663 exosort_XrtP exosortase P. Members of the exosortase S family occur in the high GC Gram-positive order Micrococcales (a branch of the Actinobacteria), in genera such as Arthrobacter, Microbacterium, Curtobacterium, and Paenarthrobacter. 0
43037 421880 cl29666 FusC_FusB Fusidic acid resistance protein (FusC/FusB). FBP_C is a family from the C terminal end of fibronectin-binding proteins. It forms an extended four-cysteine zinc-finger with a unique structural fold. Fibronectin-binding proteins bind to elongation factor G - EF-G, which is mediated by the zinc-finger binding to the C-terminus of EF-G. FBPs release ribosomes by competing with them for EF-G. 0
43038 421881 cl29674 Retrotrans_gag Retrotransposon gag protein. This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various mammalia species. LDOC1, a member of this family and a novel MZF-1-interacting protein, inhibits NF-kappaB activation and relates with cancer and some other diseases. But the specific function of this family is still unknown. 0
43039 421882 cl29684 Citrate_bind ATP citrate lyase citrate-binding. ATP citrate (pro-S)-lyase 0
43040 421883 cl29685 LisH_2 LisH. Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerization. 0
43041 421885 cl29690 PSII_BNR Photosynthesis system II assembly factor YCF48. YCF48 is one of several assembly factors of the photosynthesis system II. The photosynthesis system II occurs in Cyanobacteria that are Gram-negative bacteria performing oxygenic photosynthesis. One of the three membranes surrounding these bacteria is the inner thylakoid membrane (TM) system that is localized within the cell and houses the large pigment-protein complexes of the photosynthetic electron transfer chain, i.e. Photosystem (PS) II, PSI, the cytochrome b6f complex, and the ATP synthase. YCF48 is necessary for efficient assembly and repair of the PSII. YCF48 is found predominantly in the thylakoid membrane. It is a BNR repeat protein. 0
43042 421886 cl29693 Defensin_beta_2 Beta defensin. Big defensins are antimicrobial peptides. They consist of a hydrophobic N-terminal half, which is active against Gram-positive bacteria, and a cationic C-terminal half, which is active against Gram-negative bacteria. The C-terminal half adopts a beta-defensin-like structure. 0
43043 392159 cl29696 Peptidase_S30 Potyvirus P1 protease. This family is the P1 protein of the Potyviridae polyproteins that is a serine peptidase at the N-terminus. The catalytic triad in the genome polyprotein of ssRNA positive-strand Brome streak mosaic rymovirus, is His-311, Asp-322 and Ser-355. 0
43044 421888 cl29705 CHB_HEX_C_1 Chitobiase/beta-hexosaminidase C-terminal domain. 0
43045 421891 cl29710 PPR PPR repeat. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. The exact function is not known. 0
43046 421893 cl29718 Rhomboid_N Cytoplasmic N-terminal domain of rhomboid serine protease. This is the N-terminal domain of rhomboid protease. 0
43047 421894 cl29726 Pectate_lyase22 Oligogalacturonate lyase. Members of this protein family are the TolB periplasmic protein of Gram-negative bacteria. TolB is part of the Tol-Pal (peptidoglycan-associated lipoprotein) multiprotein complex, comprising five envelope proteins, TolQ, TolR, TolA, TolB and Pal, which form two complexes. The TolQ, TolR and TolA inner-membrane proteins interact via their transmembrane domains. The {beta}-propeller domain of the periplasmic protein TolB is responsible for its interaction with Pal. TolB also interacts with the outer-membrane peptidoglycan-associated proteins Lpp and OmpA. TolA undergoes a conformational change in response to changes in the proton-motive force, and interacts with Pal in an energy-dependent manner. The C-terminal periplasmic domain of TolA also interacts with the N-terminal domain of TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggest that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi , Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. [Transport and binding proteins, Other, Cellular processes, Pathogenesis] 0
43048 421897 cl29742 Tiny_TM_bacill Protein of unknown function (Tiny_TM_bacill). This model represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV. [Hypothetical proteins, Conserved] 0
43049 421898 cl29743 AbiEi_1 AbiEi antitoxin C-terminal domain. AbiEi_1 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 0
43050 421899 cl29745 Phospholip_A2_3 Prokaryotic phospholipase A2. This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids. 0
43051 392178 cl29746 D5_N D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases. 0
43052 421901 cl29748 ApbA_C Ketopantoate reductase PanE/ApbA C terminal. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway. 0
43053 421903 cl29757 TBCC Tubulin binding cofactor C. Members of this family are involved in the folding pathway of tubulins and form a beta helix structure. 0
43054 421904 cl29758 Gryzun Gryzun, putative trafficking through Golgi. Members of this family are involved in Golgi trafficking. 0
43055 421906 cl29762 PP2C_C Protein serine/threonine phosphatase 2C, C-terminal domain. Protein phosphatase 2c; Provisional 0
43056 421908 cl29764 Cathelicidins Cathelicidin. This family represents a conserved region approximately 60 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates. This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover. 0
43057 421909 cl29765 s48_45 Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation. 0
43058 421910 cl29766 NDUF_B4 NADH-ubiquinone oxidoreductase B15 subunit (NDUFB4). complex I subunit 0
43059 392191 cl29769 Poxvirus dsDNA Poxvirus. putative alpha-amanitin-sensitive protein; Provisional 0
43060 356648 cl29772 PRK10015 N/A. putative oxidoreductase FixC; Provisional 0
43061 392192 cl29774 Herpes_UL73 UL73 viral envelope glycoprotein. UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope. 0
43062 392193 cl29777 NPV_P10 Nucleopolyhedrovirus P10 protein. fibrous body protein; Provisional 0
43063 421911 cl29782 AlaDh_PNT_N Alanine dehydrogenase/PNT, N-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. 0
43064 421915 cl29790 Herpes_U34 Herpesvirus virion protein U34. nuclear egress membrane protein UL34; Provisional 0
43065 421916 cl29795 MtrA Tetrahydromethanopterin S-methyltransferase, subunit A. methyltransferase; Provisional 0
43066 421917 cl29798 RNA_pol_Rpb5_N RNA polymerase Rpb5, N-terminal domain. DNA-directed RNA polymerase II subunit family protein; Provisional 0
43067 421920 cl29801 Competence Competence protein. The related model ComEC_Rec2 (TIGR00361) describes a set of proteins of ~ 700-800 residues, one each from a number of different species, of which most can become competent for natural transformation with exogenous DNA. The best-studied examples are ComEC from Bacillus subtilis and Rec-2 from Haemophilus influenzae, where the protein appears to form part of the DNA import structure. This model represents a region found in full-length ComEC/Rec2 and shorter homologs of unknown function from a large number of additional bacterial species, most of which are not known to become competent for transformation (an exception is Helicobacter pylori). [Unknown function, General] 0
43068 421923 cl29820 FDX-ACB Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2). 0
43069 421925 cl29826 F1-ATPase_delta mitochondrial ATP synthase delta subunit. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. The subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213). 0
43070 421928 cl29842 COLIPASE N/A. SCOP reports duplication of common fold with Colipase N-terminal domain. 0
43071 421931 cl29847 CobN_like CobN subunit of cobaltochelatase, bchH and chlH subunits of magnesium chelatases, and similar proteins. This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of hydrogenobyrinic acid a,c-diamide to cobyrinic acid. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis. 0
43072 421932 cl29848 MlaD MlaD protein. Members of this protein family are the MlaD (maintenance of Lipid Asymmetry D) protein of an ABC transport system that seems to remove phospholipid from the outer leaflet of the Gram-negative bacterial outer membrane (OM), leaving only lipopolysaccharide in the outer leaflet. The Mla locus has long been associated with toluene tolerance, consistent with the proposed role in retrograde transport of phospholipid and therefore with maintaining the integrity of the OM as a protective barrier. 0
43073 392220 cl29860 Papo_T_antigen T-antigen specific domain. Small T antigen; Reviewed 0
43074 421935 cl29862 CTP-dep_RFKase Domain of unknown function DUF120. riboflavin kinase; Provisional 0
43075 421936 cl29864 tRNA-synt_1f tRNA synthetases class I (K). lysyl-tRNA synthetase; Reviewed 0
43076 421937 cl29867 Peptidase_S7 Peptidase S7, Flavivirus NS3 serine protease. This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. It appears to be related to the superfamily of trypsin peptidases and so may have a peptidase function. 0
43077 421939 cl29874 LIGANc N/A. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme-adenylate intermediate is an important feature of the proposed catalytic mechanism. 0
43078 356751 cl29875 Herpes_UL24 Herpes virus proteins UL24 and UL76. nuclear protein UL24; Provisional 0
43079 421941 cl29885 ArfGap Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. 0
43080 421942 cl29886 DUF11 Domain of unknown function DUF11. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins. 0
43081 421943 cl29887 Apocytochr_F_C Apocytochrome F, C-terminal. apocytochrome f; Reviewed 0
43082 421944 cl29893 Pectinesterase Pectinesterase. pectinesterase family protein 0
43083 421945 cl29894 AsnC_trans_reg Lrp/AsnC ligand binding domain. AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli 0
43084 421946 cl29895 Glyco_hydro_3 Glycosyl hydrolase family 3 N terminal domain. beta-hexosaminidase; Provisional 0
43085 421949 cl29917 RNA_pol_B_RPB2 N/A. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 2 (DRII). 0
43086 421951 cl29920 Chorismate_bind chorismate binding enzyme. Members of this family, aminodeoxychorismate synthase, component I (PabB), were designated para-aminobenzoate synthase component I until it was recognized that PabC, a lyase, completes the pathway of PABA synthesis. This family is closely related to anthranilate synthase component I (trpE), and both act on chorismate. The clade of PabB enzymes represented by this model includes sequences from Gram-positive and alpha and gamma Proteobacteria as well as Chlorobium, Nostoc, Fusobacterium and Arabidopsis. A closely related clade of fungal PabB enzymes is identified by TIGR01823, while another bacterial clade of potential PabB enzymes is more closely related to TrpE (TIGR01824). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid] 0
43087 421954 cl29940 RbcS Ribulose-1,5-bisphosphate carboxylase small subunit. ribulose-bisphosphate carboxylase small chain 0
43088 421955 cl29941 Viral_coat Viral coat protein (S domain). The capsid or coat protein of this family is expressed in Nodaviridae, that are ssRNA positive-strand viruses, with no DNA stage. These viruses are the causative agents of viral nervous necrosis in marine fish. 0
43089 356823 cl29947 VirDNA-topo-I_N Viral DNA topoisomerase I, N-terminal. Members of this family are predominantly found in viral DNA topoisomerase, and assume a beta(2)-alpha-beta-alpha-beta(2) fold, with a left-handed crossover between strands beta2 and beta3. 0
43090 421957 cl29957 PepX_C X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry, that are not characterised as peptidases, show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases. 0
43091 356835 cl29959 SBP_bac_10 Protein of unknown function (DUF1559). This model describes a region of ~16 residues found typically about 30 residues away from the C-terminus of large numbers of proteins in the Planctomycetes, Lentisphaerae, and Verrucomicrobia, on proteins with a prepilin-type N-terminal cleavage/methylation domain (see TIGR02532). The motif H-X(9)-D-G is nearly invariant. Single genomes may encode over 200 such proteins. 0
43092 421958 cl29960 SopE_GEF SopE GEF domain. type III secretion protein BopE; Provisional 0
43093 421959 cl29970 Phage_antitermQ Phage antitermination protein Q. This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown. 0
43094 356847 cl29971 Chordopox_L2 Chordopoxvirus L2 protein. hypothetical protein; Provisional 0
43095 356850 cl29974 NinE NINE Protein. prophage protein NinE; Provisional 0
43096 392246 cl29975 Herpes_UL1 Herpesvirus glycoprotein L. envelope glycoprotein L; Provisional 0
43097 356852 cl29976 minC septum site-determining protein MinC. septum formation inhibitor; Reviewed 0
43098 356854 cl29978 Pox_F11 Poxvirus F11 protein. hypothetical protein; Provisional 0
43099 421960 cl29988 CutC CutC family. copper homeostasis protein CutC; Provisional 0
43100 356870 cl29994 Herpes_ICP4_C Herpesvirus ICP4-like protein C-terminal region. transcriptional regulator ICP4; Provisional 0
43101 356910 cl30034 DNA_pack_N Probable DNA packing protein, N-terminus. DNA packaging terminase subunit 1; Provisional 0
43102 356911 cl30035 DNA_pack_C Probable DNA packing protein, C-terminus. DNA packaging terminase subunit 1; Provisional 0
43103 356913 cl30037 Herpes_gE Alphaherpesvirus glycoprotein E. envelope glycoprotein E; Provisional 0
43104 356923 cl30047 Marek_A Marek's disease glycoprotein A. envelope glycoprotein C; Provisional 0
43105 421965 cl30049 Herpes_V23 Herpesvirus VP23 like capsid protein. Capsid triplex subunit 2; Provisional 0
43106 356926 cl30050 PHA03259 N/A. Capsid triplex subunit 2; Provisional 0
43107 356927 cl30051 Herpes_gI Alphaherpesvirus glycoprotein I. envelope glycoprotein I; Provisional 0
43108 421970 cl30079 OmpA_C-like Peptidoglycan binding domains similar to the C-terminal domain of outer-membrane protein OmpA. The Pfam entry also includes MotB and related proteins which are not included in the Prosite family. 0
43109 421973 cl30086 Ribosomal_L6 Ribosomal protein L6. Members of this protein family are the archaeal form of ribosomal protein uL6 (previously L9 in yeast and human). The top-scoring proteins not selected by this model are eukaryotic cytosolic uL6. Bacterial ribosomal protein L6 scores lower and is described by a distinct model. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
43110 392263 cl30117 AA_permease Amino acid permease. 0
43111 421976 cl30226 VpdB_C C-terminal fragment of effector protein VpdB. Members of this family include the enzyme myo-inosose-2 dehydratase, product of the gene iolE, as found in inositol utilization cassettes in many species. [Energy metabolism, Sugars] 0
43112 357125 cl30249 Fe_III_red_FhuF siderophore-iron reductase FhuF. Members of this protein family are 2Fe-2S cluster binding proteins, found regularly in the context of siderophore transporters. Members are distantly related to FhuF from E. coli, a ferric iron reductase linked to removal of iron from hydroxamate-type siderophores. [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds] 0
43113 357126 cl30250 cyclo_dehyd_2 bacteriocin biosynthesis cyclodehydratase domain. Members of this protein family are found in a three-gene operon in Bacillus anthracis and related Bacillus species, where the other two genes are clearly identified with maturation of a putative thiazole-containing bacteriocin precursor. While there is no detectable pairwise sequence similarity between members of this family and the proposed cyclodehydratases such as SagC of Streptococcus pyogenes (see family TIGR03603), both families show similarity through PSI-BLAST to ThiF, a protein involved in biosynthesis of the thiazole moiety for thiamine biosynthesis. This family, therefore, may contribute to cyclodehydratase function in heterocycle-containing bacteriocin biosyntheses. In Bacillus licheniformis ATCC 14580, the bacteriocin precursor gene is adjacent to the gene for this protein. [Cellular processes, Toxin production and resistance] 0
43114 357136 cl30260 PRK10992 iron-sulfur cluster repair protein YtfE. Members of this protein family, designated variously as YtfE, NorA, DrnN, and NipC, are di-iron proteins involved in the repair of iron-sulfur clusters. Previously assigned names reflect pleiotropic effects of damage from NO or other oxidative stress when this protein is mutated. The suggested name now is RIC, for Repair of Iron Centers. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other] 0
43115 357139 cl30263 PRK14127 cell division regulator GpsB. This model describes a domain found in Bacillus subtilis cell division initiation protein DivIVA, and homologs, toward the N-terminus. It is also found as a repeated domain in certain other proteins, including family TIGR03543. 0
43116 357161 cl30285 Csf4_U CRISPR/Cas system-associated DinG family helicase Csf4. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf4 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies farthest (fourth closest) from the repeats in the A. ferrooxidans genome. 0
43117 357162 cl30286 PLN00090 N/A. Members of this protein family are the photosystem II reaction center M protein, product of the psbM gene, in Cyanobacteria and their derived organelles in plants. This model resembles pfam05151 but has cutoffs set to avoid false-positive matches to similar (not necessarily homologous) sequences in species that are not photosynthetic. [Energy metabolism, Photosynthesis] 0
43118 421978 cl30289 CccA Cytochrome c, mono- and diheme variants [Energy production and conversion]. Cytochrome 579, as described originally in Leptospirillum from acid mine drainage, is an abundant red cytochrome that acts as an electron transfer protein involved in Fe(II) oxidation. 0
43119 357168 cl30292 PRK15331 type III secretion system translocator chaperone SicA. Genes in this family are found in type III secretion operons. LcrH, from Yersinia is believed to have a regulatory function in the low-calcium response of the secretion system. The same protein is also known as SycD (SYC = Specific Yop Chaperone) for its chaperone role. In Pseudomonas, where the homolog is known as PcrH, the chaperone role has been demonstrated and the regulatory role appears to be absent. SycD/LcrH contains three central tetratricopeptide-like repeats that are predicted to fold into an all-alpha-helical array. 0
43120 357184 cl30308 LolE ABC-type transport system, involved in lipoprotein release, permease component [Cell wall/membrane/envelope biogenesis]. This model describes the LolC protein, and its paralog LolE found in some species. These proteins are homologous to permease proteins of ABC transporters. In some species, two paralogs occur, designated LolC and LolE. In others, a single form is found and tends to be designated LolC. [Protein fate, Protein and peptide secretion and trafficking] 0
43121 357216 cl30340 tolC N/A. Members of this model are outer membrane proteins from the TolC subfamily within the RND (Resistance-Nodulation-cell Division) efflux systems. These proteins, unlike the NodT subfamily, appear not to be lipoproteins. All are believed to participate in type I protein secretion, an ABC transporter system for protein secretion without cleavage of a signal sequence, although they may, like TolC, also participate in the efflux of smaller molecules. This family includes the well-documented examples TolC (E. coli), PrtF (Erwinia), and AprF (Pseudomonas aeruginosa). [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Porins] 0
43122 357259 cl30383 flgB N/A. flagellar basal body rod protein FlgB; Reviewed 0
43123 392266 cl30459 Cyt_C5_DNA_methylase N/A. All proteins in this family for which functions are known are DNA-cytosine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
43124 357347 cl30471 SORL Desulfoferrodoxin, superoxide reductase-like (SORL) domain [Energy production and conversion]. The short N-terminal domain contains four conserved Cys for binding of a ferric iron atom, and is homologous to the small protein desulforedoxin; this domain may also be responsible for dimerization. The remainder of the molecule binds a ferrous iron atom and is similar to neelaredoxin, a monomeric blue non-heme iron protein. The homolog from Treponema pallidum scores between the trusted cutoff for orthology and the noise cutoff. Although essentially a full length homolog, it lacks three of the four Cys residues in the N-terminal domain; the domain may have lost ferric binding ability but may have some conserved structural role such as dimerization, or some new function. This protein is described in some articles as rubredoxin oxidoreductase (rbo), and its gene shares an operon with the rubredoxin gene in Desulfovibrio vulgaris Hildenborough. [Energy metabolism, Electron transport] 0
43125 357355 cl30479 PRK15103 membrane integrity-associated transporter subunit PqiA. This family consists of uncharacterized predicted integral membrane proteins found, so far, only in the Proteobacteria. Of two members in E. coli, one is induced by paraquat and is designated PqiA, paraquat-inducible protein A. [Unknown function, General] 0
43126 357372 cl30496 GIDA Glucose inhibited division protein A. GidA, the longer of two forms of GidA-related proteins, appears to be present in all complete eubacterial genomes so far, as well as Saccharomyces cerevisiae. A subset of these organisms have a closely related protein. GidA is absent in the Archaea. It appears to act with MnmE, in an alpha2/beta2 heterotetramer, in the 5-carboxymethylaminomethyl modification of uridine 34 in certain tRNAs. The shorter, related protein, previously called gid or gidA(S), is now called TrmFO (see model TIGR00137). [Protein synthesis, tRNA and rRNA base modification] 0
43127 357415 cl30539 PLN02852 N/A. adrenodoxin reductase; Provisional 0
43128 357420 cl30544 TOP1Ac N/A. Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerases III, reverse gyrase alpha subunit 0
43129 421979 cl30545 PKD N/A. This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold. 0
43130 357422 cl30546 H2A N/A. histone H2A; Provisional 0
43131 357435 cl30559 PriA Primosomal protein N' (replication factor Y) - superfamily II helicase [Replication, recombination and repair]. 0
43132 357450 cl30574 PRK13596 NADH-quinone oxidoreductase subunit NuoF. NADH dehydrogenase [ubiquinone] flavoprotein 1; Provisional 0
43133 421980 cl30589 ThiD2 ThiD2 family. This domain functions as a ThiD protein and is called the ThiD2 family. The domain is associated with the ThiE domain in some proteins. 0
43134 357482 cl30606 KdpD K+-sensing histidine kinase KdpD [Signal transduction mechanisms]. sensor protein KdpD; Provisional 0
43135 357489 cl30613 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]. beta-D-glucuronidase; Provisional 0
43136 357493 cl30617 flgD flagellar hook assembly protein FlgD. flagellar basal body rod modification protein; Provisional 0
43137 421981 cl30663 DUF5710 Domain of unknown function (DUF5710). DNA polymerase III subunit epsilon; Validated 0
43138 421982 cl30664 FAD_binding_2 FAD binding domain. L-aspartate oxidase is the B protein, NadB, of the quinolinate synthetase complex. Quinolinate synthetase makes a precursor of the pyridine nucleotide portion of NAD. This model identifies proteins that cluster as L-aspartate oxidase (a flavoprotein difficult to separate from the set of closely related flavoprotein subunits of succinate dehydrogenase and fumarate reductase) by both UPGMA and neighbor-joining trees. The most distant protein accepted as an L-aspartate oxidase (NadB), that from Pyrococcus horikoshii, not only clusters with other NadB but is just one gene away from NadA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides] 0
43139 357541 cl30665 NuoL NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit [Energy production and conversion, Inorganic ion transport and metabolism]. NADH dehydrogenase subunit 5; Validated 0
43140 357544 cl30668 PRK14692 flagellar hook-associated protein FlgL. flagellar hook-associated protein FlgL; Validated 0
43141 357559 cl30683 PRK08241 RNA polymerase subunit sigma-70. RNA polymerase sigma factor SigJ; Provisional 0
43142 357561 cl30685 PRK07502 prephenate/arogenate dehydrogenase family protein. 0
43143 421983 cl30686 PLN02487 N/A. phytoene desaturase 0
43144 357568 cl30692 UbiH 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme transport and metabolism, Energy production and conversion]. hypothetical protein; Provisional 0
43145 357572 cl30696 TFIIE Transcription initiation factor IIE. 0
43146 357587 cl30711 PRK04233 N/A. hypothetical protein; Provisional 0
43147 392272 cl30717 InfB Translation initiation factor IF-2, a GTPase [Translation, ribosomal structure and biogenesis]. This model describes archaeal and eukaryotic orthologs of bacterial IF-2. Like IF-2, it helps convey the initiator tRNA to the ribosome, although the initiator is N-formyl-Met in bacteria and Met here. This protein is not closely related to the subunits of eIF-2 of eukaryotes, which is also involved in the initiation of translation. The aIF-2 of Methanococcus jannaschii contains a large intein interrupting a region of very strongly conserved sequence very near the amino end; the alignment generated by this model does not correctly align the sequences from Methanococcus jannaschii and Pyrococcus horikoshii in this region. [Protein synthesis, Translation factors] 0
43148 421984 cl30729 tRNA_bind_4 tRNA-binding domain. seryl-tRNA synthetase; Provisional 0
43149 357607 cl30731 PRK00694 N/A. 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Provisional 0
43150 357616 cl30740 COG2810 Predicted type IV restriction endonuclease [Defense mechanisms]. 0
43151 392274 cl30759 PelA Stalled ribosome rescue protein Dom34, pelota family [Translation, ribosomal structure and biogenesis]. Directs the termination of nascent peptide synthesis (translation) in response to the termination codons UAA, UAG and UGA. This model identifies both archaeal (aRF1) and eukaryotic (eRF1) forms of the protein. Also known as translation termination factor 1. [Protein synthesis, Translation factors] 0
43152 357654 cl30778 MltE Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) [Cell wall/membrane/envelope biogenesis]. lytic murein transglycosylase; Provisional 0
43153 357662 cl30786 FusA Translation elongation factor EF-G, a GTPase [Translation, ribosomal structure and biogenesis]. elongation factor 2; Provisional 0
43154 392275 cl30787 PhrB Deoxyribodipyrimidine photolyase [Replication, recombination and repair]. This model describes a narrow clade of cyanobacterial deoxyribodipyrimidine photo-lyase. This group, in contrast to several closely related proteins, uses a chromophore that, in other lineages is modified further to become coenzyme F420. This chromophore is called 8-HDF in most articles on the DNA photolyase and FO in most literature on coenzyme F420. [DNA metabolism, DNA replication, recombination, and repair] 0
43155 357667 cl30791 FieF Divalent metal cation (Fe/Co/Zn/Cd) transporter [Inorganic ion transport and metabolism]. 0
43156 357684 cl30808 PHA00368 N/A. virion protein; Provisional 0
43157 357685 cl30809 PLN03107 N/A. eukaryotic initiation factor 5a; Provisional 0
43158 357701 cl30825 PLN02893 N/A. cellulose synthase-like protein 0
43159 357716 cl30840 PRK09940 N/A. transcriptional regulator SirC; Provisional 0
43160 421985 cl30861 PRK09915 MdtP family multidrug efflux transporter outer membrane subunit. multidrug resistance outer membrane protein MdtQ; Provisional 0
43161 357750 cl30874 PHA02699 N/A. Hypothetical protein; Provisional 0
43162 357752 cl30876 PHA03055 N/A. ORF033 IMV membrane protein; Provisional 0
43163 357759 cl30883 PHA02984 N/A. hypothetical protein; Provisional 0
43164 357760 cl30884 PHA02861 N/A. hypothetical protein; Provisional 0
43165 357761 cl30885 PHA02818 N/A. hypothetical protein; Provisional 0
43166 392276 cl30941 CheC_CheX_FliY CheC/CheX/FliY (CXY) family phosphatases. This family contains class III CheC proteins, present chiefly in the archaeal class Halobacteria. Sequence analysis shows that class III CheC proteins are structurally and functionally similar to class I CheCs, and not to CheX, despite the fact that both class III CheCs and CheX lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site. Mutation analysis shows that the second active site is more important for function than the first one, suggesting that class III proteins arose by loss of the unnecessary first active site through mutational shift. All chemotactic archaea have a CheC homologue. 0
43167 421990 cl31489 Mtd_N Major tropism determinant N-terminal domain. major tropism determinant 0
43168 421991 cl32029 DnaT DnaT DNA-binding domain. This domain is found in E. coli primosomal protein 1 (Pp1); the PP1 domain (residues 84-153) can bind to different types of ssDNA, which is fundamental for its physiological substrate binding. Functional analysis indicates that both the N- and C-termini are essential to having the cooperative effect in binding ssDNA. The ssDNA bound complex displays a spiral filament assembly that is adopted by many proteins that are involved in DNA replication, such as DnaA, RecA and PriB. This domain is similar to pfam08585 except that it contains an extra loop at the N-terminus (84-99). Structural analysis indicates that this extra loop might be essential for the stabilisation of the three-helix bundle. 0
43169 421994 cl36576 RX-CC_like Coiled-coil domain of the potato virus X resistance protein and similar proteins. This entry represents the N-terminal domain found in many plant resistance proteins. This domain has been predicted to be a coiled-coil, however the structure shows that it adopts a four helical bundle fold. 0
43170 421995 cl36727 NaPi_cotrn_rel Na/Pi-cotransporter. Proteins of this family belong to the Phosphate:Na+ Symporter (PNaS) superfamily. 0
43171 422182 cl37813 DUF2791 P-loop Domain of unknown function (DUF2791). BrxD is an ATP-binding protein found in types 2 and 6 of BREX (bacteriophage exclusion) phage resistance systems. 0
43172 422465 cl38185 BTP Chlorhexidine efflux transporter. PACE (proteobacterial antimicrobial compound efflux) transporters are single component proton-coupled efflux pumps that help confer resistance to a number of biocides and antibiotics. The family has also been named PCE (proteobacterial chlorhexidine efflux). Members of this subfamily of the PACE transporters, distinct from the AceI-like branch, include several whose expression is increased by exposure to chlorhexidine and/or help confer increased resistance to it. 0
43173 422524 cl38252 YbbR YbbR domain. The members of this family are all hypothetical bacterial proteins of unknown function, and are similar to the YbbR protein expressed by Bacillus subtilis. One member is annotated as an uncharacterized secreted protein, whereas another member is described as a hypothetical protein in the 5' region of the def gene of Thermus thermophilus, which encodes a deformylase, but no further information was found in either case. This region is found repeated up to four times in many members of this family. 0
43174 422701 cl38447 DUF4277 Domain of unknown function (DUF4277). Members of this protein family are DDE type transposases encoded by the IS1634 family elements, which were first identified and characterized in Mycoplasma mycoides. 0
43175 422835 cl38594 DUF5357 Family of unknown function (DUF5357). Proteins of this family are components of cyanobacterial septal junctions (microplasmodesmata) in heterocyst-forming cyanobacteria. 0
43176 422937 cl38795 CnrY anti-sigma factor CnrY. This family is found in alpha and beta proteobacteria. Family members include anti-sigma factor CnrY from Cupriavidus metallidurans. Sigma factors are multi-domain sub-units of bacterial RNA polymerase (RNAP) that play critical roles in transcription initiation, including the recognition and opening of promoters as well as the initial steps in RNA synthesis. They also control a wide variety of adaptive responses such as morphological development and the management of stress. A recurring theme in sigma factor control is their sequestration by anti-sigma factors that occlude their RNAP-binding determinants. CnrH, controls cobalt and nickel resistance in Cupriavidus metallidurans. CnrH is regulated by a complex of two transmembrane proteins: the periplasmic sensor CnrX and the anti-sigma CnrY. At rest, CnrH is sequestered by CnrY whose 45-residue-long cytosolic domain is one of the shortest anti-sigma domains. Upon Ni(II) or Co(II) ions detection by CnrX in the periplasm, CnrH is released between CnrH and the cytosolic domain of CnrY (CnrYc). The CnrH/CnrYC complex displays an unexpected structural similarity to the anti-sigma NepR in complex with its antagonist PhyR, whereas NepR shares no sequence similarity with CnrY. Crystal structure of CnrH/CnrY shows that CnrYC residues 3-19 are folded as a well-defined alpha-helix. The peptide further extends along the hydrophobic groove of sigma 2 with no canonical structure except for a short helical turn spanning residues 24-28. CnrY has a hydrophobic knob made of V4, W7 and L8 side chains protruding into sigma 4 hydrophobic pocket and contributing to the interface. In vivo investigation of CnrY function pinpoints part of the hydrophobic knob as a hotspot in CnrH inhibitory binding. 0
43177 422948 cl38891 SARS-CoV_ORF9c accessory protein ORF9c (also referred to as ORF14) from Severe acute respiratory syndrome-associated coronavirus and related coronaviruses. This is a family of unknown function found in SARS coronavirus. 0
43178 422950 cl38901 T3SC_I-like class I type III secretion system (T3SS) chaperones and similar proteins. TtfA (trehalose monomycolate transport factor A) plays a role in the transport of trehalose monomycolate across the inner membrane, potentially by forming a complex with the atypical lipid transporter MmpL3. Trehalose monomycolate is a component of the mycobacterial envelope. The core domain of TtfA shows strong structural similarity to class I type III secretion system (T3SS) chaperones, and TtfA may play other roles besides assisting in mycolate transport, given its phylogenetic distribution. 0
43179 365778 cl38902 YEATS YEATS domain family, chromatin reader proteins. YEATS domain containing proteins, which include Transcription initiation factor TFIID subunits 14 and 14b of Arabidopsis, shown to be part of the TFIID general transcriptional regulator complex in a two-hybrid screen. DNA regulation by chromatin thru histone post-translational modification and other mechanism involves complexes with write, eraser and reader functions. YEATS domains act as readers of the chromatin state, and stimulate transcriptional activity, thru preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein. 0
43180 365779 cl38903 RMtype1_S_TRD-CR_like Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR) and similar domains. The recognition sequences of Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. Also included in this subfamily is the C-terminal TRD-CR-like sequence-recognition domain of Microcystis aeruginosa putative type I N6-adenine DNA methyltransferase M subunit (M.Mae7806ORF3969P). The recognition sequence of M.Mae7806ORF3969P is undetermined. 0
43181 365780 cl38904 AldB-like proteins similar to alpha-acetolactate dehydrogenase. alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5) converts acetolactate ((2S)-2-hydroxy-2-methyl-3-oxobutanoate) into acetoin ((3R)-3-hydroxybutan-2-one) and CO(2). Acetoin may be secreted by the cells, perhaps in order to control the internal pH. AldB may function as a regulator in valine and leucine biosynthesis and in catalyzing the second step of the 2,3-butanediol pathway. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxxH motif (x could be any residue) that coordinates a zinc ion. 0
43182 365781 cl38905 longin-like Longin-like domains. Trafficking protein particle complex subunit 4 (TRAPPC4), also known as synbindin or TRS23, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. 0
43183 365782 cl38906 ATP-synt_Fo_Vo_Ao_c ATP synthase, membrane-bound Fo/Vo/Ao complexes, subunit c. This family includes subunit c of F-ATP synthase (also called ATP synthase F(o) sector subunit c, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpE. 0
43184 422951 cl38907 SWIB-MDM2 SWIB/MDM2 domain family. This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumor suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. 0
43185 365784 cl38908 BTB_POZ BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain superfamily. ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. It is enriched in skeletal muscles, especially at the neuromuscular junction. A ZBTB42 mutation has been identified to define a novel lethal congenital contracture syndrome (LCCS6), a lethal autosomal recessive form of arthrogryposis multiplex congenita (AMC). ZBTB42 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. 0
43186 365785 cl38909 ATP-synt_F1_V1_A1_AB_FliI_N ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, N-terminal domain. The alpha (A) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. 0
43187 365786 cl38910 ATP-synt_F1_V1_A1_AB_FliI_C ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, C-terminal domain. The C-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The flagellum-specific ATPase FliI is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of FoF1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway. 0
43188 365787 cl38911 LGIC_TM transmembrane domain of Cys-loop neurotransmitter-gated ion channels. This family contains transmembrane (TM) domain of zinc-activated ligand-gated ion channel (ZAC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown. 0
43189 422952 cl38912 SPOUT_MTase SPOUT superfamily of SAM-dependent RNA methyltransferases. This family has a Rossmanoid fold, with a deep trefoil knot in its C-terminal region. It has structural similarity to RNA methyltransferases, and is likely to function as an S-adenosyl-L-methionine (SAM)-dependent RNA 2'-O methyltransferase. 0
43190 365789 cl38913 ABC_6TM_exporters Six-transmembrane helical domain of the ATP-binding cassette transporters. ATP-binding cassette sub-family B member 9 is also known as transporter associated with antigen processing, TAP-like protein, TAPL, and ABCB9. It is a half transporter that forms a homodimeric lysosomal peptide transport complex. It belongs to the ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit. 0
43191 365790 cl38914 NP-I nucleoside phosphorylase-I family. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs), as well as futalosine nucleosidase and adenosylhopane nucleosidase. Bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family. 0
43192 365791 cl38915 DEAD-like_helicase_C C-terminal helicase domain of the DEAD-like helicases. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. RecG helps process Holliday junction intermediates to mature products by catalyzing branch migration. It is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC. 0
43193 365792 cl38916 HK_sensor Sensor domains of Histidine Kinase receptors. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution. 0
43194 422953 cl38917 Tiki_TraB-like diverse proteins related to the Tiki and TraB protease domains. pAD1 is a haemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus. This family also contains the Tiki proteins that regulate Wnt signalling. 0
43195 422954 cl38918 Peptidase_M15 Metalloproteases including zinc D-Ala-D-Ala carboxypeptidase, L-Ala-D-Glu peptidase, L,D-carboxypeptidase, bacteriophage endolysins, and related proteins. This family resembles VanY, pfam02557, which is part of the peptidase M15 family. 0
43196 365795 cl38919 AfaD_SafA-like AfaD-like family of invasins. This subfamily is composed of Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Psa and Myf specifically recognize beta1-3- or beta1-4-linked galactose in glycosphingolipids, but while Psa also binds phosphatidylcholine, Myf does not. Psa has acquired a tyrosine-rich surface that enables it to bind to phosphatidylcholine and mediate adhesion of Y. pestis/pseudotuberculosis to alveolar cells. Myf has specialized as a carbohydrate-binding adhesin, facilitating the attachment of Y. enterocolitica to intestinal cells. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE). 0
43197 365796 cl38920 T3SS_Flik_C_like C-terminal domain of type III secretion proteins FliK, HrpP, YscP, and similar domains. The flagellar hook-length control protein FliK is a soluble cytoplasmic protein that is secreted during flagellar formation. It controls hook elongation by two successive events: by determining hook length and by stopping the supply of hook protein. It contains an N-terminal domain that determines hook length and a C-terminal domain that is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate. 0
43198 365797 cl38921 HLD_clamp helical lid domain of clamp loader-like AAA+ proteins. Replication factor C (RFC) is a five-protein clamp loader complex that forms a stable ATP-dependent complex with the sliding clamp, PCNA, which binds specifically to primed DNA. RFC subunits belong to the clamp loader clade of the AAA+ superfamily. 0
43199 422955 cl38923 PIN_Mut7-C-like PIN domain at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related domains. This is a domain of unknown function found in potential toxin-antitoxin system component. 0
43200 393294 cl38924 Wnt Wnt domain found in the WNT signaling gene family, also called Wingless-type mouse mammary tumor virus (MMTV) integration site family. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection. 0
43201 393295 cl38925 TenA_PqqC-like TenA-like proteins including TenA_C and TenA_E proteins, as well as pyrroloquinoline quinone (PQQ) synthesis protein C. This family contains proteins with similarity to TenA, and includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C or PQQC proteins. PQQ is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC catalyzes the last step of PQQ biogenesis which involves a ring closure and an eight-electron oxidation of the substrate [3a-(2-amino-2-carboxyethyl)-4,5-dioxo-4,5,6,7,8,9-hexahydroquinoline-7,9-dicarboxylic acid (AHQQ)]. The exact molecular function of members of this family is unclear. Also belonging to this family is Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), a redox protein toxin unique to Chlamydia species, which modulates host cell apoptosis; its redox activity and death domain binding ability may be required for this biological activity. CADD may have a role in folate metabolism. 0
43202 422956 cl38926 serpin SERine Proteinase INhibitors (serpin) family. Structure is a multi-domain fold containing a bundle of helices and a beta sandwich. 0
43203 393297 cl38927 Peptidase_M90-like M90 peptidase is a zinc-metallopeptidase. This subfamily contains uncharacterized M90 peptidase-like domains, similar to the Mlc Titration Factor A (MtfA) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in an HEXXH motif. MtfA is related to the catalytic domain of the anthrax lethal factor and the Mop protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site. 0
43204 422958 cl38930 AmyAc_family Alpha amylase catalytic domain family. This family, of around 100 residues, is located at the C-terminus of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 0
43205 422963 cl38936 P-loop_NTPase P-loop containing Nucleoside Triphosphate Hydrolases. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the first of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 0
43206 422965 cl38938 RNR_PFL Ribonucleotide reductase and Pyruvate formate lyase. The proteins in this family are functionally uncharacterized. The proteins are around 450 amino acids long. It is likely that this family represents a group of glycerol-3-phosphate dehydrogenases. 0
43207 422966 cl38939 phosphohexomutase N/A. The MMP1680 protein from Methanococcus maripaludis has been characterized as the archaeal protein responsible for the second step of UDP-GlcNAc biosynthesis. This GlmM protein catalyzes the conversion of glucosamine-6-phosphate to glucosamine-1-phosphate. The first-characterized bacterial GlmM protein is modeled by TIGR01455. These two families are members of the larger phosphoglucomutase/phosphomannomutase family (characterized by three domains: pfam02878, pfam02879 and pfam02880), but are not nearest neighbors to each other. This model also includes a number of sequences from non-archaea in the Bacteroides, Chlorobi, Chloroflexi, Planctomycetes and Spirochaetes lineages. Evidence supporting their inclusion in this equivalog as having the same activity comes from genomic context and phylogenetic profiling. A large number of these organisms are known to produce exo-polysaccharide and yet only appeared to contain the GlmS enzyme of the GlmSMU pathway for UDP-GlcNAc biosynthesis (GenProp0750). In some organisms including Leptospira, this archaeal GlmM is found adjacent to the GlmS as well as a putative GlmU non-orthologous homolog. Phylogenetic profiling of the GlmS-only pattern using PPP identifies members of this archaeal GlmM family as the highest-scoring result. [Central intermediary metabolism, Amino sugars] 0
43208 393315 cl38945 BrxE_fam BrxE family protein. Members of this family are BrxE, a protein of unknown function that is found in type 6 BREX systems of phage defense. 0
43209 422967 cl38947 Spa1_C Lantibiotic immunity protein Spa1 C-terminal domain. This HMM describes a domain that occurs twice in the nisin lantibiotic self-immunity lipoprotein NisI, and once in the subtilin lantibiotic self-immunity lipoprotein SpaI, and once or twice in numerous other known or putative lantibiotic resistance lipoproteins. 0
43210 393319 cl38949 T6SS_TagK_dom TagK family protein C-terminal domain. Members of this family have full-length homology to SciF, a type VI secretion system (T6SS) protein from Salmonella typhimurium island SPI-6. Homologs occur in some but not all T6SS loci, and the broader family is now called TagK. 0
43211 393320 cl38950 PriX Primase X. In most archaea, the eukaryotic-type DNA primase has catalytic subunit PriS and a regulatory subunit PriL. The proteins in this family are PriX, an essential second noncatalytic subunit found in a subset of the archaea. 0
43212 422969 cl38951 BB_PF Beta barrel Pore-forming domain. Members of this family are secreted in a water-soluble pro-toxin form, but undergo cleavage and oligomerization to form beta-barrel pore. The founding member of the family is monalysin from Pseudomonas entomophila. This family is built narrowly, and therefore excludes a set of pore-forming proteins (not necessarily toxins) from a eukaryote, Dictyostelium. Analogous (but perhaps not homologous) beta-type pore-forming toxins include aerolysin and leukocidin. 0
43213 422972 cl38962 retention_LapA retention module-containing protein. Members of this family are lipoprotein LipL45, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region. 0
43214 422973 cl38966 LD_cluster2 SLOG cluster2. Family in the SLOG superfamily, observed to associate with a predicted effector protein containing one enzymatically active and inactive copy of the TIR domain. 0
43215 422978 cl38972 TetR_C_30 Tetracyclin repressor-like, C-terminal domain. Members of this family are found in various prokaryotic transcriptional regulator proteins. Their exact function has not, as yet, been identified. 0
43216 422979 cl38973 TetR_C_24 Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain present in putative TetR transcriptional regulators. 0
43217 422980 cl38975 TetR_C_15 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR Transcriptional Repressor present in sco1712 proteins from Streptomyces coelicolor which act as a regulator of antibiotic production. 0
43218 422988 cl38983 Ig_mannosidase Ig-fold domain. This domain can be found in 2 glycoside hydrolase subfamily of beta-glucosaminidases (EC:3.2.1.165) such as CsxA, from Amycolatopsis orientalis that has exo-beta-D-glucosaminidase (exo-chitosanase) activity. It has an immunoglobulin-like topology. 0
43219 422989 cl38984 sCache_3_3 Single cache domain 3. Cache_3 is the periplasmic sensor domains of sensor histidine kinase of E. coli DcuS. This domain forms one of the components of the two-component signalling system that allows bacteria to adapt to changing environments. The ability of bacteria to monitor and adapt to their environment is crucial to their survival, and two-component signal transduction systems mediate most of these adaptive responses. One component is a histidine kinase sensor - this domain - most commonly part of a homodimeric transmembrane sensor protein, and the second component is a cytoplasmic response regulator. The two components interact in tandem through a phospho-transfer cascade. 0
43220 422991 cl38987 2_5_RNA_ligase2 2'-5' RNA ligase superfamily. Members of this family are bacterial and archaeal RNA ligases that are able to ligate tRNA half molecules containing 2',3'-cyclic phosphate and 5' hydroxyl termini to products containing the 2',5' phosphodiester linkage. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. The structure of a related protein is known. They belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacteria and virus proteins. 0
43221 422992 cl38988 WHG WHG domain. This presumed domain is around 80 amino acids in length. It is found to the C-terminus of a DNA-binding helix-turn-helix domain. This domain may be involved in binding to an as yet unknown ligand that allows a transcriptional regulation response to that molecule. The domain is named WHG after three conserved residues near the C-terminus of the domain. 0
43222 422993 cl38990 EndIII_4Fe-2S Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micrococcus UV endonuclease. 0
43223 422995 cl38996 T2SSK Type II secretion system (T2SS), protein K. Members of this family are involved in the Type II protein secretion system. The T2SK family includes proteins such as ExeK, PulK, OutX and XcpX. 0
43224 422996 cl38998 MCR_alpha_N Methyl-coenzyme M reductase alpha subunit, N-terminal domain. Members of this protein family are the alpha subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis] 0
43225 422999 cl39004 HrcA HrcA protein C terminal domain. HrcA represses the class I heat shock operons groE and dnaK; overproduction prevents induction of these operons by heat shock while deletion allows constitutive expression even at low temperatures. In Bacillus subtilis, hrcA is the first gene of the dnaK operon and so is itself a heat shock gene. [Regulatory functions, DNA interactions] 0
43226 423004 cl39012 baeRF_family10 Bacterial archaeo-eukaryotic release factor family 10. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family contains a well-conserved 'FP' motif in the catalytic loop. 0
43227 423005 cl39013 PBECR3 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease3. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages. The predicted active site contains conserved arginine and threonine residues. 0
43228 423006 cl39014 CxC2 CxC2 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 0
43229 423007 cl39015 LRR_RI N/A. Leucine-rich repeats are composed of a beta-alpha unit. This repeat unit is found as capping unit (N- or C- terminal of the repeat region) of Ribonuclease Inhibitors. 0
43230 423013 cl39021 aGPT-Pplase2 Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 2. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in bacterial mobile operons. 0
43231 423014 cl39022 SNAD1 Secreted Novel AID/APOBEC-like Deaminase 1. A family of secreted AID/APOBEC like deaminases found in ray-finned fishes. 0
43232 423015 cl39023 HEPN_RiboL-PSP RiboL-PSP-HEPN. HEPN-like nuclease. MAE_28990 In operon with a ParB nuclease and DNA methylase genes. MAE_18760-like HEPN found fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold. 0
43233 423016 cl39024 BclA_C BclA C-terminal domain. This model often occurs at the C-terminus, and companion model N_to_GlyXaaXaa (NF033172) at the N-terminus, of proteins that in between consist largely of variable numbers of Gly-Xaa-Xaa repeats, reminiscent of collagen repeats. Member proteins observed have been found so far only in Gram-positive bacteria. This domain contains a motif IPxTG near its C-terminus, suggesting it is processed by some form of sortase. 0
43234 423019 cl39027 RuvC_1 RuvC nuclease domain. This is a nuclease (NUC) domain found in Cpf1, an RNA-guided endonuclease of a type V CRISPR-Cas system. Structural and functional analysis indicate that this domain is involved in DNA cleavage. 0
43235 423021 cl39030 zf_CCCH_4 Zinc finger domain. This short zinc binding domain has the pattern of three cysteines and one histidine to coordinate the zinc ion. This domain is found in a wide variety of proteins such as E3 ligases. 0
43236 423022 cl39031 mCpol minimal CRISPR polymerase domain. The mCpol domain (minimal CRISPR polymerase) is named for its homology relationship to the catalytic domain of the CRISPR polymerases (often called Cmr2 or Cas10). It is predicted to generate cyclic nucleotides, potentially sensed by CARF domains which in turn activate various effector domains, including HEPN RNases; CARF sensors and effectors are found in conserved genome contexts. It is part of a broader class of conflict systems reliant on the production of second messenger nucleotides or nucleotide derivatives. The putative function of the mCpol domain implies that CRISPR polymerases of the type III CRISPR/Cas systems have a nucleotide synthetase functional role. 0
43237 423023 cl39032 LSDAT_euk SLOG in TRPM. Family in the SLOG superfamily, fused to or operonically associating with SLATT domain in diverse prokaryotes. Predicted to function as ligand sensor in conjunction with the SLATT transmembrane domain. 0
43238 423024 cl39033 ISP1_C ISP1 C-terminal. This is the C-terminal domain of ISP3 protein, which plays a role in asexual daughter cell formation, for example in T.gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an interstrand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The loop between beta 5 and beta 6 is extended and variable. The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid and play a role in mediating membrane localization through IP binding. However, the Phospholipid Binding Properties of PH domains is not conserved in the ISP3. Unlike PH domains, ISP3 is cysteine rich. The cysteine-rich nature of the ISP3s and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. There are no disulfide bonds in ISP3 unlike in ISP1. It is worth noting that ISP1 and ISP3 share low sequence identity but contain the same secondary core elements. 0
43239 423026 cl39035 Agglutinin_C Agglutinin C-terminal. This is the C-terminal domain of the beta chain found in Polyporus squamosus lectin protein (PSL). PSL binds specifically to glycans terminating with the sequence: Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is not involved in the binding to the Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is characterized by a central five-stranded beta-sheet that is flanked by three alpha-helices and topped by a short strand. It shows high fold similarity to its closest relative, the Gal.alpha1-3Gal-binding agglutinin from the mushroom Marasmius oreades agglutinin (MOA). 0
43240 423032 cl39045 MukF_N bacterial condensin complex subunit MukF, N-terminal domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid. 0
43241 423033 cl39046 UAE_UbL Ubiquitin/SUMO-activating enzyme ubiquitin-like domain. E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains. 0
43242 423034 cl39049 DUF2300 Predicted secreted protein (DUF2300). This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function. 0
43243 423035 cl39051 ATG27 Autophagy-related protein 27. This family includes both Cation-dependent and cation independent mannose-6-phosphate receptors. 0
43244 423037 cl39057 MS_channel Mechanosensitive ion channel. Two members of this protein family of M. jannaschii have been functionally characterized. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore members of this family are likely to also be MS channel proteins. 0
43245 423038 cl39058 RNB RNB domain. This family consists of an exoribonuclease, ribonuclease R, also called VacB. It is one of the eight exoribonucleases reported in E. coli and is broadly distributed throughout the bacteria. In E. coli, double mutants of this protein and polynucleotide phosphorylase are not viable. Scoring between trusted and noise cutoffs to the model are shorter, divergent forms from the Chlamydiae, and divergent forms from the Campylobacterales (including Helicobacter pylori) and Leptospira interrogans. [Transcription, Degradation of RNA] 0
43246 393429 cl39059 Proton_antipo_M Proton-conducting membrane transporter. This model describes the 14th (based on E. coli) structural gene, N, of bacterial and chloroplast energy-transducing NADH (or NADPH) dehydrogenases. This model does not describe any subunit of the mitochondrial complex I (for which the subunit composition is very different), nor NADH dehydrogenases that are not coupled to ion transport. The Enzyme Commission designation 1.6.5.3, for NADH dehydrogenase (ubiquinone), is applied broadly, perhaps unfortunately, even if the quinone is menaquinone (Thermus, Mycobacterium) or plastoquinone (chloroplast). For chloroplast members, the name NADH-plastoquinone oxidoreductase is used for the complex and this protein is designated as subunit 2 or B. This model also includes a subunit of a related complex in the archaeal methanogen, Methanosarcina mazei, in which F420H2 replaces NADH and 2-hydroxyphenazine replaces the quinone. [Energy metabolism, Electron transport] 0
43247 423039 cl39061 OTCace Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. Members of this family are putrescine carbamoyltransferase (EC 2.1.3.6). There is some overlapping specificity with ornithine carbamoyltransferase (EC 2.1.3.3). The gene regularly is found next to agmatine deiminase and a carbamate kinase, suggesting a conserved catabolic agmatine deiminase pathway. [Energy metabolism, Amino acids and amines] 0
43248 393432 cl39062 DUF5494 Family of unknown function (DUF5494). hypothetical protein 0
43249 393433 cl39063 DUF5461 Family of unknown function (DUF5461). hypothetical protein 0
43250 393434 cl39064 GerPE Spore germination protein GerPE. Members of this family are required for formation of functionally normal spores. They may be involved in the establishment of spore coat structure or permeability. 0
43251 423040 cl39066 GCV_T Aminomethyltransferase folate-binding domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. 0
43252 393437 cl39067 Phage_T7_tail Phage T7 tail fibre protein. hypothetical protein; Provisional 0
43253 423042 cl39070 Ldl_recept_b Low-density lipoprotein receptor repeat class B. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 0
43254 393443 cl39073 GapA Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]. This model describes the type II glyceraldehyde-3-phosphate dehydrogenases which are limited to archaea. These enzymes catalyze the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. In archaea, either NAD or NADP may be utilized as the cofactor. The class I GAPDH's from bacteria and eukaryotes are covered by TIGR01534. All of the members of the seed are characterized. See, for instance. This model is very solid, there are no species falling between trusted and noise at this time. The closest relatives scoring in the noise are the class I GAPDH's. 0
43255 393445 cl39075 IlvB Acetolactate synthase large subunit or other thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. Two groups of proteins form acetolactate from two molecules of pyruvate. The type of acetolactate synthase described in this model also catalyzes the formation of acetohydroxybutyrate from pyruvate and 2-oxobutyrate, an early step in the branched chain amino acid biosynthesis; it is therefore also termed acetohydroxyacid synthase. In bacteria, this catalytic chain is associated with a smaller regulatory chain in an alpha2/beta2 heterotetramer. Acetolactate synthase is a thiamine pyrophosphate enzyme. In this type, FAD and Mg++ are also found. Several isozymes of this enzyme are found in E. coli K12, one of which contains a frameshift in the large subunit gene and is not expressed. [Amino acid biosynthesis, Pyruvate family] 0
43256 423043 cl39076 Pyruvate_Kinase N/A. This domain is actually a small beta-barrel domain nested within a larger TIM barrel. The active site is found in a cleft between the two domains. 0
43257 393447 cl39077 PulD Type II secretory pathway component GspD/PulD (secretin) [Intracellular trafficking, secretion, and vesicular transport]. A number of proteins homologous to the type IV pilus secretin PilQ (TIGR02515) are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). The clade described by this model contains the outer membrane pore proteins of bacterial type III secretion systems, typified by YscC for animal pathogens and HrcC for plant pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis] 0
43258 393448 cl39078 SecA Preprotein translocase subunit SecA (ATPase, RNA helicase) [Intracellular trafficking, secretion, and vesicular transport]. Members of this family are SecA2, part of a Sec-like preprotein translocase called accessory Sec. This SecA2 family is characteristic of Listeria species. 0
43259 393449 cl39079 OadA1 Pyruvate/oxaloacetate carboxyltransferase [Energy production and conversion]. This model describes the bacterial oxaloacetate decarboxylase alpha subunit and its equivalents in archaea. The oxaloacetate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases that are present in bacteria and archaea. It is a multi-subunit enzyme consisting of a peripheral alpha-subunit and integral membrane subunits beta and gamma. The energy released by the decarboxylation reaction of oxaloacetate is coupled to Na+ ion pumping across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other] 0
43260 393450 cl39080 DnaX DNA polymerase III, gamma/tau subunits [Replication, recombination and repair]. This model represents the well-conserved first ~ 365 amino acids of the translation of the dnaX gene. The full-length product of the dnaX gene in the model bacterium E. coli is the DNA polymerase III tau subunit. A translational frameshift leads to early termination and a truncated protein subunit gamma, about 1/3 shorter than tau and present in roughly equal amounts. This frameshift mechanism is not necessarily universal for species with DNA polymerase III but appears conserved in the extreme thermophile Thermus thermophilus. [DNA metabolism, DNA replication, recombination, and repair] 0
43261 393451 cl39081 MotB Flagellar motor protein MotB [Cell motility]. flagellar motor protein MotB; Reviewed 0
43262 393452 cl39082 PycA Pyruvate carboxylase [Energy production and conversion]. Members of this family are ATP-dependent urea carboxylase, including characterized members from Oleomonas sagaranensis (alpha class Proteobacterium) and yeasts such as Saccharomyces cerevisiae. The allophanate hydrolase domain of the yeast enzyme is not included in this model and is represented by an adjacent gene in Oleomonas sagaranensis. The fusion of urea carboxylase and allophanate hydrolase is designated urea amidolyase. The enzyme from Oleomonas sagaranensis was shown to be highly active on acetamide and formamide as well as urea. [Central intermediary metabolism, Nitrogen metabolism] 0
43263 393456 cl39086 FadR DNA-binding transcriptional regulator, FadR family [Transcription]. transcriptional regulator NanR; Provisional 0
43264 393457 cl39087 MurB UDP-N-acetylenolpyruvoylglucosamine reductase [Cell wall/membrane/envelope biogenesis]. This model describes MurB, UDP-N-acetylenolpyruvoylglucosamine reductase, which is also called UDP-N-acetylmuramate dehydrogenase. It is part of the pathway for the biosynthesis of the UDP-N-acetylmuramoyl-pentapeptide that is a precursor of bacterial peptidoglycan. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan] 0
43265 393460 cl39090 ComEA DNA uptake protein ComE and related DNA-binding proteins [Replication, recombination and repair]. This model describes the ComEA protein in bacteria. The com E locus is obligatory for bacterial cell competence - the process of internalizing exogenously added DNA. Lesions in the loci have been variously described as causing the appearance of competence-related phenotypes and impairment of competence, suggesting their intimate functional role in bacterial transformation. [Cellular processes, DNA transformation] 0
43266 393461 cl39091 SdhA Succinate dehydrogenase/fumarate reductase, flavoprotein subunit [Energy production and conversion]. This model represents the succinate dehydrogenase flavoprotein subunit as found in Gram-negative bacteria, mitochondria, and some Archaea. Mitochondrial forms interact with ubiquinone and are designated EC 1.3.5.1, but can be degraded to 1.3.99.1. Some isozymes in E. coli and other species run primarily in the opposite direction and are designated fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle] 0
43267 393462 cl39092 SucA 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes [Energy production and conversion]. 2-oxoglutarate dehydrogenase E1 component; Reviewed 0
43268 423044 cl39093 Pyr_redox_2 Pyridine nucleotide-disulphide oxidoreductase. Members of this protein family include N-terminal sequence regions of (probable) bifunctional proteins whose C-terminal sequences are SelD, or selenide,water dikinase, the selenium donor protein necessary for selenium incorporation into protein (as selenocysteine), tRNA (as 2-selenouridine), or both. However, some members of this family occur in species that do not show selenium incorporation, and the function of this protein family is unknown. 0
43269 423045 cl39094 Ank_2 Ankyrin repeats (3 copies). ankyrin repeat protein; Provisional 0
43270 393467 cl39097 MCP_signal Methyl-accepting chemotaxis protein (MCP), signaling domain. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs. 0
43271 393468 cl39098 TrpE Anthranilate/para-aminobenzoate synthases component I [Amino acid transport and metabolism, Coenzyme transport and metabolism]. Members of this protein family are salicylate synthases, bifunctional enzymes that make salicylate, in two steps, from chorismate. Members are homologous to anthranilate synthase component I from Trp biosynthesis. Members typically are found in gene regions associated with siderophore or other secondary metabolite biosynthesis. 0
43272 423046 cl39166 CshA_repeat Surface adhesin CshA repetitive domain. Many proteins with this repeat are LPXTG-anchored surface proteins of Firmicutes species, but the repeat occurs more broadly. Members include CshA from Streptococcus gordonii. 0
43273 423165 cl39327 Bep_C_terminal BID domain of Bartonella effector protein (Bep). The BID domain (Bartonella intracellular delivery domain) is recognized by the type IV secretion system (T4SS) virB (not trw) of Bartonella and related taxa (e.g. Ochrobactrum), and is found in T4SS effector proteins such as BepA, BepB, BepC, etc. Multiple copies of the domain may be found in a single protein. 0
43274 423228 cl39406 RING0_parkin RING finger-like zinc-binding domain 0 of parkin. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by this domain RING0 and three additional zinc finger domains characteristic of the RBR family. RING0 binds two coordinated zinc atoms at each extremity of the domain with a hairpin. Deletion of RING0 massively derepressed parkin activity supporting the role of RING0 in autoinhibition, point mutations in RING0 (Phe146 to Ala) or RING2 (Phe463 to Ala) both increased parkin activity. The REP (repressor element of parkin) and RING0 domains play a preeminent role in repressing parkin ligase activity through their interactions with RING1 and RING2, respectively. 0
43275 423331 cl39524 SLATT_fungal SMODS and SLOG-associating 2TM effector domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict. 0
43276 423341 cl39535 SLATT_5 SMODS and SLOG-associating 2TM effector domain family 5. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential. 0
43277 423350 cl39545 SLATT_1 SMODS and SLOG-associating 2TM effector domain 1. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain. 0
43278 423351 cl39546 SLATT_2 SMODS and SLOG-associating 2TM effector domain 2. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations. 0
43279 423385 cl39583 CdiI_ECL-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Enterobacter cloacae, and similar proteins. This is the N-terminal domain of Contact-dependent growth inhibition immunity (CdiI) proteins present in Enterobacter cloacae. CdiI proteins neutralize CdiA-CT toxins to protect toxin-producing cells from auto-inhibition. Structural homology searches reveal that Enterobacter cloacae's CdiI is most similar to the Whirly family of single-stranded DNA-binding protein. 0
43280 423424 cl39636 T3SS_ExsE Type III secretion system ExsE. ExsE, through protein-protein interaction, serves in a regulatory cascade that modulates the role of ExsA, a transcriptional activator of Pseudomonas aeruginosa's type III secretion system (T3SS) regulon. ExsE itself is a substrate for translocation (i.e. removal) by the T3SS system, providing feedback that modulates expression of secretion system genes. Homologs found in multiple species of Aeromonas and Photorhabdus may be functionally equivalent. Note that VP1702 from Vibrio parahaemolyticus, given the same gene symbol and ascribed an equivalent function, appears unrelated in sequence. 0
43281 423545 cl39768 CdiA-CT_Yk_RNaseA-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Yersinia kristensenii, and similar proteins. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. Structure analysis of CdiA-CT shows that it adopts the same fold (with two beta-sheets forming an overall kidney shape) as angiogenin and other RNase A paralogs, but the toxin does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. Furthermore, structural comparison analysis identified human angiogenin, Rana pipiens protein P-30 (onconase) and mouse pancreatic ribonuclease (RNase 1) as the closest structural homologs of CdiA-CT. 0
43282 423675 cl39912 CdiI_Ykris-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Yersinia kristensenii, and similar proteins. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. CDI+ bacteria also produce CdiI immunity proteins, which specifically neutralize cognate CdiA-CT toxins to prevent self-inhibition. Structure analysis of CdiI immunity protein from Yersinia kristensenii shows that it is composed of eight alpha-helices packed together to form a nearly spherical structure with weak structural homology to a putative TetR family transcriptional repressor. The CdiI protein fits into the curved cavity of the CdiA-CTYkris toxin domain where it most likely neutralizes toxin activity by blocking access to RNA substrates. This domain is mostly found in gammaproteobacteria. 0
43283 423697 cl39935 CdiI_EC536-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 536, and similar proteins. Contact-dependent growth inhibition (CDI) is a widespread mechanism of bacterial competition. CDI+ bacteria deliver the toxic C-terminal region of contact-dependent inhibition A proteins (CdiA-CT) into neighboring target bacteria and produce CDI immunity proteins (CdiI) which bind CdiA-CT domains and neutralize their toxic activity to protect against self-inhibition. CdiI immunity proteins are also variable and only neutralize their cognate CdiA-CT toxins. Structure analysis of CdiI from Escherichia coli 536 (EC536) shows that it is composed of a single domain and that it blocks the interaction with substrate, strongly suggesting that the immunity protein occludes the nuclease active site. 0
43284 423705 cl39943 CdiI_Ecoli_Nm-like inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CdiI proteins, including the founding member from Escherichia coli strain STEC_O31, serve as immunity proteins for the toxic tRNA-cleaving ribonuclease toxin CdiA. The system confers contact-dependent inhibition (cdi) between different strains of bacteria. 0
43285 423742 cl39980 Pallilysin Pallilysin beta barrel domain. In contrast to pallilysin itself (a bifunctional adhesin and protease), members of the pallilysin-related adhesin family average twice the length, lack the HEXXH motif essential to pallilysin's metalloprotease activity, and are likely to function in virulence only as an adhesin. Typical members of this family include TDE0840 from Treponema denticola and BB0038 from Borrelia burgdorferi, which share less than 20% pairwise amino acid sequence identity. 0
43286 423769 cl40010 NTD_TDP-43 N-terminal domain of transactive response DNA-binding protein 43. This domain can be found at the N-terminal region of transactive response DNA-binding protein 43 kDa (TDP-43), an RNA transporting and processing protein whose aberrant aggregates are implicated in neurodegenerative diseases. TDP-43 N-terminal domain has been shown to play an important role in the aggregation of TDP-43 monomers and its loss of function affects the RNA metabolic levels. Secondary structure of the N-terminal domain consists of six beta-strands and it resembles axin 1. 0
43287 423820 cl40066 Heliorhodopsin Heliorhodopsin. This HMM represents heliorhodopsins, a group of phylogenetically distinct microbial rhodopsins, which play an important role in absorbing and transferring light energy for numerous biological processes in bacteria. Heliorhodopsin was initially identified and characterized in a Gram-positive actinobacterium based on functional metagenomics and photochemical approaches. Heliorhodopsins have seven transmembrane domains and exhibit biological functions similar to those of other microbial rhodopsins. However, heliorhodopsins form a distinct cluster based on phylogenetic analyses. Most microbial rhodopsins are hit by the Pfam HMM PF01036, which does not hit heliorhodopsins. 0
43288 423877 cl40125 IFTase inulin fructotransferase. This region contains a right-handed parallel beta helix repeat unit found in Inulin fructotransferase. This Pfam entry includes sequences not found by pfam13229. 0
43289 423902 cl40150 RnlA-toxin_C RNase LS, bacterial toxin C terminal. RnaseLS-like HEPN. 0
43290 424006 cl40259 Rimk_N RimK PreATP-grasp domain. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa. 0
43291 424030 cl40283 SAVED SMODS-associated and fused to various effectors sensor domain. The SAVED domain is predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers. 0
43292 424038 cl40291 SLATT_6 SMODS and SLOG-associating 2TM effector domain 6. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems. 0
43293 424041 cl40294 SLATT_4 SMODS and SLOG-associating 2TM effector domain family 4. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems. 0
43294 424056 cl40356 DUF5841 Family of unknown function (DUF5841). Members of this family have leader sequences like bacteriocins (see TIGR01847), but characterized examples function as signaling peptides that induce production of a nearby encoded bacteriocin, rather than as bacteriocins themselves. The founding member of this family is enterocin induction factor EntF. 0
43295 394792 cl40422 M34_peptidase Peptidase family M34 includes the C-terminal catalytic domain of anthrax lethal factor (ATLF), the protective antigen-binding domains of ATLF and edema factor, and Pro-Pro endopeptidase. This subfamily includes the C-terminal catalytic domain of anthrax toxin lethal factor (ATLF; EC 3.4.24.83). ATLF and edema factor are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is secreted by Bacillus anthracis to promote disease virulence through disruption of host signaling pathways. ATLF belongs to peptidase family M34 and has the hallmark metalloprotease HEXXH motif, where the two His residues bind a single zinc atom, and the Glu has a catalytic role. ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). MKKs are cleaved by ATLF near their N-termini, removing the docking sequence for the downstream cognate mitogen-activated protein kinase. Preferred amino acids around the cleavage site can be denoted BBBBxHxH, in which B denotes Arg or Lys, H denotes a hydrophobic amino acid, and x is any amino acid. At its N-terminus, ATLF has a related PABD domain which lacks the hallmark metalloprotease motif HEXXH. This subfamily also includes Bacillus thuringiensis Vip2Ac-like_2 which belongs to the Vip family of proteins that are secreted during the vegetative growth phase. 0
43296 424065 cl40423 cupin_RmlC-like RmlC-like cupin superfamily. Breaks down dimethylsulfoniopropionate (DMSP) into acrylate and dimethyl sulfide. 0
43297 394794 cl40424 USP25_USP28_C-like carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25) and 28 (USP28), and similar domains. This family contains the C-terminal domain of ubiquitin-specific protease USP28, a deubiquitinase (DUB), which shares high similarity with USP25 but varies in cellular function; USP28 is known for its tumor-promoting role while USP25 is a regulator of the innate immune system and may play a role in tumorigenesis. USP28 stabilizes c-MYC and other nuclear proteins, and USP25 regulates inflammatory TRAF signaling. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP28 and harbors the splicing site for isoform-specific sequences. Structure studies suggest that the C-terminal domain forms an independent entity. 0
43298 394795 cl40425 C_NRPS-like Condensation domain of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily have the typical C-domain HHxxxD motif. PksJ is involved in some intermediate steps for the synthesis of the antibiotic polyketide bacillaene which is important in secondary metabolism. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. 0
43299 394796 cl40426 Cas13b Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage. 0
43300 394797 cl40427 Tet_JBP oxygenase domain of ten-eleven translocation (TET) enzymes, J-binding proteins (JBPs), and similar proteins. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity. 0
43301 394798 cl40428 phospholamban_like phospholamban, sarcolipin, and sarcolamban family of bioactive peptides. Invertebrate sarcolamban A (SCLA) belongs to a family of bioactive peptides which includes invertebrate sarcolamban B (SCLB), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. 0
43302 394799 cl40429 Complex1_LYR_SF LYR (leucine-tyrosine-arginine) motif found in Complex1_LYR-like superfamily. This group contains uncharacterized LYR motif-containing proteins belonging to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. 0
43303 394800 cl40430 Stannin_family Stannin family includes vertebrate Stannin and insect Hemotin. Stannin (SNN) is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. It binds and antagonizes 14-3-3zeta and is required for endosomal maturation. It has also been identified as the specific marker for neuronal cell apoptosis induced by trimethyltin (TMT) intoxication. TMT is one of the most toxic organotin compounds (or alkyltins), and is known to selectively inflict injury to specific regions of the brain. 0
43304 394801 cl40431 PFM_aerolysin_family pore-forming module of aerolysin-type beta-barrel pore-forming proteins. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). 0
43305 394802 cl40432 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily. This subfamily contains fission yeast Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (also known as Suv39h), the sole homolog of the mammalian SUV39H1 and SUV39H2 enzymes, that has a critical role in preventing aberrant heterochromatin formation. It is known to di- and tri-methylate Lys-9 of histone H3, a central heterochromatic histone modification, with its specificity profile most similar to that of the human SUV39H2 homolog. 0
43306 394803 cl40433 Fer2_BFD-like [2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. 0
43307 394804 cl40434 TGF_beta_SF transforming growth factor beta (TGF-beta) like domain found in TGF-beta superfamily. The family includes INHBC and INHBE. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress. 0
43308 424066 cl40435 CTD_KDM2A_2B-like C-terminal domain found in lysine-specific demethylase KDM2A, KDM2B, and similar proteins. Lysine-specific demethylase 2B (KDM2B) is also called Ndy1, CXXC-type zinc finger protein 2, F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), F-box protein FBL10, JmjC domain-containing histone demethylation protein 1B (Jhdm1b), Jumonji domain-containing EMSY-interactor methyltransferase motif protein (protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B. It is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). KDM2B also acts as an oncogene that plays a critical role in leukemia development and maintenance. It consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2B between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature. 0
43309 424067 cl40436 IPD_PPP1R12 inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12 (PPP1R12) family. Protein phosphatase 1 regulatory subunit 12A-like (PPP1R12A-like) is a homolog of MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit. MYPT1 is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of PPP1R12A-like protein. 0
43310 424068 cl40437 TRPV Transient Receptor Potential channel, Vanilloid subfamily (TRPV). TRPV2 is closely related to TRPV1, sharing high sequence identity (>50%), but TRPV2 shows a higher temperature threshold and sensitivity for activation than TRPV1. TRPV2 can be stimulated by ligands or lipids, and is involved in osmosensation and mechanosensation. TRPV2 is expressed in both neuronal and non-neuronal tissues, and it has been implicated in diverse physiological and pathophysiological processes, including cardiac-structure maintenance, innate immunity, and cancer. TRPV2 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. 0
43311 424069 cl40438 cc_LAMB_C C-terminal coiled-coil domain found in the laminin subunit beta (LAMB) family. LAMB3 is also called epiligrin subunit beta, kalinin B1 chain, kalinin subunit beta, laminin B1k chain, laminin-5 subunit beta, or nicein subunit beta. It is a major component of the basement membrane in most adult tissues. Mutations in LAMB3 are associated with Herlitz junctional epidermolysis bullosa (H-JEB), a severe autosomal recessive disorder characterized by blister formation within the dermal-epidermal basement membrane. LAMB3 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB3, which may be involved in the integrin binding activity. 0
43312 424070 cl40439 CoV_Spike_S1-S2_S2 S1/S2 cleavage region and the S2 fusion subunit of coronavirus spike (S) proteins. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes. 0
43313 424071 cl40440 PDDEXK_nuclease-like PDDEXK family nucleases. This model characterizes a diverse set of poorly characterized nucleases such as Escherichia coli YaeQ. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 0
43314 424072 cl40441 5TM_YidC_Oxa1_Alb3 Five transmembrane core domain of YidC/Oxa1/Alb3 protein family of insertases. This group is composed of the bacterial and chloroplastic members of the YidC/Oxa1/Alb3 protein family of insertases, including bacterial YidC, and chloroplastic ALBINO3 (Alb3) and Alb3-like proteins such as ALBINO3-like protein 1 (also called Alb4). Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC from Gram-negative bacteria contains an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. Alb3 and Alb3-like proteins are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. Alb3 acts independently and may also function cooperatively with the thylakoid cpSecYE translocase to insert proteins co-translationally into the thylakoid membrane, similar to bacterial YidC that can function with the SecYEG translocase. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function. 0
43315 424073 cl40442 Peptidase_C80 peptidase C80 family. This peptidase C80 family includes the cysteine-binding domain (CPD) of several large, repetitive bacterial exoproteins involved in heme utilization or adhesion and many typically having CPD repeats as well as regions rich in repeats. Many members of this family have been designated adhesins or filamentous haemagglutinins. The CPD contains the characteristic Asp/Cys/His residues found in Clostridium toxin B active site. 0
43316 424075 cl40444 DBD_XPA-like DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14 and similar proteins. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator. 0
43317 424078 cl40447 cyt_P460_fam Cytochrome P460 family. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. M. capsulatus (Bath) cytochrome P460, encoded by the cytL gene, catalyzes the oxidation of hydroxylamine (NH2OH) to form nitrous oxide (N2O) under anaerobic conditions. Similar to Nitrosomonas europaea cyt P460, it is defined by an unusual porphyrin (heme)-lysine cross link. This subfamily belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. 0
43318 424079 cl40448 NURR NURR (N-terminal unit for RNA recognition) domain. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression. The model corresponds to NURR domain of hnRNPR. 0
43319 424080 cl40449 DHD_Ski_Sno_Dac Dachshund-homology domain found in the Ski/Sno/Dac family of transcriptional regulators. Ski-like protein, also known as SKIL, Ski-related oncogene (Sno), or Ski-related protein, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. 0
43320 424081 cl40450 KLF8_12_N N-terminal domain of Kruppel-like factor (KLF) 8, KLF12, and similar proteins. Kruppel-like factor 12 (also known as Krueppel-like transcription factor 12, KLF12) regulates, by transcriptionally repressing Nur77 expression, endometrial decidualization, which is a prerequisite for successful implantation and the establishment of pregnancy. It is involved in the maturation processes of kidney collecting ducts after birth, and is able to increase the promoter activity of the UT-A1 urea transporter promoter by binding to the CACCC motif. KLF12 has also been found to promote colorectal cancer growth and is also involved in the invasion and apoptosis of basal-like breast carcinoma. KLF12 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differs between members. KLF12 contains an N-terminal domain that is related to the N-terminal repression domain of KLF8. 0
43321 424082 cl40451 cwf21 cwf21 domain. Serine/arginine repetitive matrix protein 3 (SRRM3) may play a role in regulating breast cancer cell invasiveness. It may also be involved in RYBP-mediated breast cancer progression. SRRM3 contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. 0
43322 424083 cl40452 SNARE_NTD_STX6-like N-terminal domain of syntaxin-6 and similar proteins. Syntaxin-6 (STX6) is a component of a soluble NSF attachment protein receptor (SNARE) complex involved in intracellular vesicle trafficking and in the fusion of retrograde transport carriers with the trans-Golgi network (TGN). This model corresponds to N-terminal domain of STX6, which is a regulatory domain named Habc. 0
43323 424084 cl40453 FXYD phenylalanine-X-tyrosine-aspartate (FXYD) family. The FXYD domain-containing ion transport regulator 12 (FXYD12) mRNA is mainly distributed in kidneys and intestines of fish. In co-immunoprecipitation experiments, FXYD12 was shown to associate with the Na(+)/K(+)-ATPase (NKA) alpha-subunit in the intestines of two closely related medakas, Oryzias dancena and O. latipes. These results suggest that FXYD12 may play a role in modulating NKA activity in the intestines following salinity changes in the maintenance of internal homeostasis. 0
43324 424085 cl40454 CYCLIN_SF Cyclin box fold superfamily. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain. 0
43325 424087 cl40456 TM_Y_CoV_Nsp3_C C-terminus of coronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV-2 (also called 2019 novel CoV or 2019-nCoV). This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and the related murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain. 0
43326 424088 cl40457 CoV_PLPro Coronavirus (CoV) papain-like protease (PLPro). This model represents the papain-like protease (PLPro) found in the non-structural protein 3 (Nsp3) region of deltacoronavirus, including Porcine deltacoronavirus, Bulbul coronavirus HKU11, and Common moorhen coronavirus HKU21. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15) are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. 0
43327 424089 cl40458 betaCoV_Nsp3_NAB nucleic acid binding domain of betacoronavirus non-structural protein 3. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the D lineage. 0
43328 424090 cl40459 CoV_Nsp9 coronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from deltacoronaviruses such as the Porcine delta coronavirus (PDCoV) Porcine coronavirus HKU15. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoVs replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication. 0
43329 424091 cl40460 CoV_Nsp10 coronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of deltacoronaviruses, including Thrush coronavirus HKU12-600 and Wigeon coronavirus HKU20. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-Mtase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. 
Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DII4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity. 0
43330 424092 cl40461 ZIP_TSC22D-like leucine zipper found in the TSC22 domain leucine zipper transcription factors, c-Myc-binding protein, and similar proteins. TSC22 domain family protein 4 (TSC22D4), also called TSC22-related-inducible leucine zipper protein 2 (TILZ2), or Tsc-22-like protein THG-1, is a transcriptional repressor that acts as a molecular determinant of insulin signalling and glucose handling. It also functions in hepatic lipid handling by regulating hepatic very-low-density-lipoprotein (VLDL) release and lipogenic gene expression. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D4. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription. 0
43331 424093 cl40462 CoV_Nsp8 Coronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. 
SARS-CoV Nsp8 forms a 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length. 0
43332 424094 cl40463 NAC NAC domain. Basal transcription factor 3 (BTF3) plays an important role in the transcriptional regulation linked to growth and development in eukaryotes. In mammals, the BTF3 gene encodes two alternative splicing isoforms, BTF3a and BTF3b. The full length BTF3a protein excites transcription. The shortened BTF3b, which lacks the first 44 amino-terminal extension, is a component of the nascent polypeptide-associated complex (NAC), involved in regulating protein localization during translation. BTF3 is involved in oncogenesis; overexpression of BTF3 has been shown to be associated with a variety of malignancies such as cancer of the colon, pancreas, stomach, prostate and breast. It is upregulated in hypopharyngeal squamous cell carcinoma (HSCC) tumors correlating with lymph node metastasis and tumor promotion, thus indicating that BTF3 is a potential therapeutic target and prognostic biomarker for HSCC. BTF3 has also been implicated in the pathogenesis of osteosarcoma (OS), a malignant cancer that affects rapidly proliferating bones, and has a poor prognosis. 0
43333 424095 cl40464 CoV_Nsp14 nonstructural protein 14 of coronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs. 0
43334 424096 cl40465 CoV_Spike_S1_NTD N-terminal domain of the S1 subunit of coronavirus Spike (S) proteins. The N-terminal domain of the coronavirus spike glycoprotein functions as a receptor binding domain. It binds carcinoembryonic antigen-related cell adhesion molecule 1. 0
43335 424097 cl40466 ORF8-Ig_SARS-CoV-2-like SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of Bat SARS coronavirus HKU3-1 and Bat SARS-like coronavirus Rs3367, which have been classified previously as type III ORF8's. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as bat coronavirus Rf1 (Bat SARS CoV Rf1) ORF8 which has been classified previously as a type II ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). 0
43336 424098 cl40467 embe-merbe_CoV_ORF8b_protein-I-like MERS-CoV ORF8b, BECV protein I, and related Embecovirus and Merbecovirus proteins. This subfamily includes protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage) including murine hepatitis virus. The gene encoding protein I is included in the N gene as an alternative ORF. Protein I appears to have no homologous proteins in Sarbecovirus lineage B, which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. BECV-F15 protein I is not essential for viral replication. It is related to the ORF8b accessory protein of Middle East respiratory syndrome-related coronavirus (MERS-CoV) and other related merbecoviruses (C lineage); the gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a). 0
43337 424099 cl40468 ORF7a_SARS-CoV-like Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from related betacoronaviruses in the subgenus Sarbecovirus (B lineage). The structure of the coronavirus X4 protein (also known as 7a and U122) shows similarities to the immunoglobulin-like fold and suggests a binding activity to integrin I domains. In SARS-CoV-infected cells, the X4 protein is expressed and retained intra-cellularly within the Golgi network. X4 has been implicated to function during the replication cycle of SARS-CoV. 0
43338 424100 cl40469 NTD_cv_Nsp15-like N-terminal domain of coronavirus Nonstructural protein 15 (Nsp15) and related proteins. This is the N-terminal domain of the coronavirus nonstructural protein 15 (NSP15), which is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP15 is a nidoviral RNA uridylate-specific endoribonuclease (NendoU) carrying a C-terminal catalytic domain belonging to the EndoU family. The SARS-CoV-2 NendoU monomers assemble into a double-ring hexamer, generated by a dimer of trimers. The hexamer is stabilized by the interactions of the N-terminal oligomerization domain. 0
43339 424101 cl40470 CoV_RdRp coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of bat coronavirus HKU9 and similar proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides. 0
43340 424102 cl40471 CoV_Nsp5_Mpro coronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs. 0
43341 424104 cl40473 cv-alpha_beta_Nsp2-like alpha- and betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and HCoV-229E Nsp2, and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Murine hepatitis virus (MHV) and betacoronaviruses in the embecovirus subgenera (A lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2. The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers, and it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2, also known as p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. 0
43342 424105 cl40474 CoV_E Coronavirus Envelope (E) small membrane protein. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both, viral replication and pathogenesis during CoV infection. 0
43343 424106 cl40475 CoV_M coronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of deltacoronaviruses including porcine deltacoronavirus and Bulbul coronavirus HKU11. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles. 0
43344 424108 cl40477 CoV_Nsp6 coronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation. 0
43345 424109 cl40478 CoV_Spike_S1_RBD receptor-binding domain of the S1 subunit of coronavirus spike (S) proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from porcine hemagglutinating encephalomyelitis virus (HEV), which is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. Porcine HEV is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. The protein receptor for porcine HEV has not yet been identified. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs. 0
43346 424111 cl40480 enolase_like N/A. Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 4. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 0
43347 424112 cl40481 YlqF_related_GTPase Circularly permuted YlqF-related GTPases. This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases. 0
43348 424114 cl40483 ICL_KPHMT N/A. Ketopantoate hydroxymethyltransferase (KPHMT) is the first enzyme in the pantothenate biosynthesis pathway. Ketopantoate hydroxymethyltransferase (KPHMT) catalyzes the first committed step in the biosynthesis of pantothenate (vitamin B5), which is a precursor to coenzyme A and is required for penicillin biosynthesis. 0
43349 424116 cl40485 AcrIF10 Anti-CRISPR type I subtype F10. Members of the AcrF10 family of anti-CRISPR proteins have been found in phage from various Vibrio, Shewanella, and their relatives. AcrF10 is considered a DNA mimic protein. 0
43350 424117 cl40486 HopBF1 type III secretion system (T3SS) effector HopBF1 from Ewingella americana, and similar proteins. HopBF1, found in plant pathogens such as Pseudomonas syringae and in the human pathogen Ewingella americana, is a type III secretion system effector that acts as a protein kinase. It phosphorylates the eukaryotic chaperone HSP90 on a serine residue, inhibiting its ATPase activity. The inhibition interferes with the proper folding of client proteins of HSP90 that are important to resistance to bacterial infection. 0
43351 424126 cl40495 CueP_fam CueP family metal-binding protein. This narrowly built model for CueP includes periplasmic proteins from Salmonella enterica, in which it contributes to an increased tolerance to copper, and from various other Gram-negative bacteria. It does not include CueP lipoproteins from species such as Corynebacterium diphtheriae. 0
43352 424133 cl40502 phiSA1p31 phiSA1p31 domain. This uncharacterized protein family occurs in Streptomyces and related species. Some members have insertions of long stretches of low-complexity sequences. 0
43353 424149 cl40518 heterocyst_HetZ heterocyst differentiation protein HetZ. Members of this family are cyanobacterial proteins distantly related to heterocyst differentiation protein HetZ, which also has a much more closely related set of paralogs in heterocyst-forming species. 0
43354 424152 cl40521 PEPxxWA-CTERM PEPxxWA-CTERM sorting domain. Members of this family are PEP-CTERM proteins, that is, surface proteins of Gram-negative organisms that carry a short C-terminal region used to help target proteins to their proper cellular location, hold them in position for post-translational modifications that might need to occur (such as glycosylation), and which is eventually removed by exosortase as the protein is ligated to something else. In this family the most conspicuous feature other than the PEP-CTERM sorting signal (with variants that include PEP, PAP, PTP, and SEP) is a pair of Cys residues about 6 amino acids apart from each other. The second Cys occurs in the middle of a run of amino acids that are all either small (Gly, Ser, Ala) or else Asn. The local context suggests the Cys occurs at a turn at the end of a structural feature such as an alpha-helix or beta-strand, rather than in the middle of one. The word "cistern" was assigned to suggest the proposed Cys-turn feature. 0
43355 424154 cl40523 EboA_domain EboA domain-containing protein. This HMM describes a narrow, cyanobacterial-only clade of members of the EboA (eustigmatophyte/bacterial operon A) family. Members of this family appear required for transport of certain secondary metabolite precursors to the periplasm, including (but not limited to) precursors of scytonemin. More than half the members of this clade belong to scytonemin producers. 0
43356 424159 cl40528 porH_1 PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins. 0
43357 424160 cl40529 opr_proin_2 Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates. 0
43358 424162 cl40531 DotA_TraY conjugal transfer/type IV secretion protein DotA/TraY. This HMM distinguishes DotA of type IVB secretion systems from TraY as the term is used in the conjugal transfer systems of IncI1 family plasmids. 0
43359 424173 cl40542 NprX_fam NprX family peptide pheromone. NprX, also called NprRB, belongs to the NprR-NprX quorum-sensing system in Bacillus. The mature form of the peptide pheromone is the SKPDIVG heptapeptide. 0
43360 424178 cl40547 gliding_CglD adventurous gliding motility lipoprotein CglD. CglC (cell contact-dependent gliding (or conditional gliding) motility protein C, also called adventurous gliding motility protein AgmO, is found in delta-proteobacterial species that exhibit a taxonomically restricted form of gliding motility. 0
43361 424188 cl40557 PorV_fam PorV/PorQ family protein. PorV, as characterized in oral pathogen Porphyromonas gingivalis, is a component of the type IX secretion system (T9SS) needed to process a subset of T9SS substrates. PorV is a paralog of PorQ. 0
43362 424196 cl40565 staphy_B_SbnF staphyloferrin B biosynthesis protein SbnF. SbnC, related to siderophore biosynthesis protein IucA and IucC, is encoded in Staphylococcus aureus in the sbnABCDEFGHI locus responsible for the biosynthesis of staphyloferrin B, a carboxylate-type siderophore. SbnC is found in many species of Staphylococcus. 0
43363 424201 cl40570 Ca_tandemer Ca2+-stabilized adhesin repeat. This variant form of the Ig-like domain occurs as a repeat in a number of large adhesins, including a 1.5-MDa ice-binding adhesin, the Marinomonas primoryensis antifreeze protein. 0
43364 424204 cl40573 DUF5670 Family of unknown function (DUF5670). Members of this family are very small (about 45 amino acids) and highly hydrophobic, suggesting a presence in the membrane, and have a broad phylogenetic distribution. The member protein lmo0937, from the pathogen Listeria monocytogenes, is described as up-regulated when the bacterium is in the mouse spleen, suggesting a role in stress response. 0
43365 424205 cl40574 DUF5309 Family of unknown function (DUF5309). A founding member of this family, AKO59007.1, was identified as the major head protein in Brucella phage 02_19 during a comparison of Brucella phage genomes. The N-terminal half appears to be the better conserved region with fewer insertions and deletions. 0
43366 424212 cl40581 Fuzzy protein fuzzy and homologs. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the second Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia. 0
43367 424216 cl40585 dRRM_Rrp7p deviant RNA recognition motif (dRRM) in yeast ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. This domain corresponds to the N-terminal RNA-binding domain found in the Rrp7 protein. It has an RRM-like fold with a circular permutation. 0
43368 424219 cl40588 cv_Nsp4_TM coronavirus non-structural protein 4 (Nsp4) transmembrane domain. This is the N-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex to modified endoplasmic reticulum membranes. This N-terminal region represents the membrane spanning region, covering four transmembrane regions. 0
43369 424225 cl40594 AAA_10 AAA-like domain. This entry represents the P-loop domain found in the TraG conjugation protein. 0
43370 424235 cl40604 baeRF_family3 Bacterial archaeo-eukaryotic release factor family 3. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 0
43371 424236 cl40605 Hsm3_like Hsm3 is a yeast Proteasome chaperone of the 19S regulatory particle and related proteins. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to the HEAT repeats. This entry includes the first 5 repeats at the N-terminal. 0
43372 424239 cl40608 MDD_C Mevalonate 5-diphosphate decarboxylase C-terminal domain. This enzyme catalyzes the last step in the synthesis of isopentenyl diphosphate (IPP) in the mevalonate pathway. Alternate names: mevalonate diphosphate decarboxylase; pyrophosphomevalonate decarboxylase [Central intermediary metabolism, Other] 0
43373 424240 cl40609 SAM_DrpA DNA processing protein A sterile alpha motif domain. This is the N-terminal domain found in DNA processing protein A (DprA) present in Streptococcus pneumoniae. DprA has recently been discovered to be a transformation-dedicated RecA loader. Transformation is believed to play a major role in genetic plasticity. This domain is known as the sterile alpha motif (SAM) domain. DprAs are able to form a type of dimer through SAM-SAM interactions, also known as N/N interactions. 0
43374 424241 cl40610 FtsX_ECD FtsX extracellular domain. FtsX is an integral membrane protein encoded in the same operon as signal recognition particle docking protein FtsY and FtsE. It belongs to a family of predicted permeases and may play a role in the insertion of proteins required for potassium transport, cell division, and other activities. FtsE is a hydrophilic nucleotide-binding protein that associates with the inner membrane by means of association with FtsX. [Cellular processes, Cell division, Protein fate, Protein and peptide secretion and trafficking] 0
43375 424243 cl40612 YlmH_RBD Putative RNA-binding domain in YlmH. This domain adopts an RRM like fold and is found in the B. subtilis YlmH cell division protein. 0
43376 424244 cl40613 HTH_ParB HTH domain found in ParB protein. This family may include an HTH domain. 0
43377 424245 cl40614 Usher Outer membrane usher protein. This is the presumed beta barrel domain from the usher-like TcfC family of proteins. 0
43378 424246 cl40615 KIX_2 KIX domain. CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. 0
43379 424248 cl40617 zf-TRAF TRAF-type zinc finger. 0
43380 424250 cl40619 TetR_C_8 Transcriptional regulator C-terminal region. The seed alignment for this family was built from a set of closely related uncharacterized proteins associated with operons for the type of bacterial dihydroxyacetone kinase that transfers PEP-derived phosphate from a phosphoprotein, as in phosphotransferase system transport, rather than from ATP. Members have a TetR transcriptional regulator domain (pfam00440) at the N-terminus and sequence homology throughout. 0
43381 424251 cl40620 DUF4175 Domain of unknown function (DUF4175). Members of this family are long (~850 residue) bacterial proteins from the alpha Proteobacteria. Each has 2-3 predicted transmembrane helices near the N-terminus and a long C-terminal region that includes stretches of Gln/Gly-rich low complexity sequence, predicted by TMHMM to be outside the membrane. In Bradyrhizobium japonicum, two tandem reading frames are together homologous to the single members found in other species; the cutoff scores are set low enough that the longer scores above the trusted cutoff and the shorter above the noise cutoff for this model. 0
43382 424252 cl40621 HTH_33 Winged helix-turn helix. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803 three of which are characterized as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily. 0
43383 424253 cl40622 BatD Oxygen tolerance. This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion. 0
43384 424254 cl40623 tRNA_int_end_N2 tRNA-splicing endonuclease subunit sen54 N-term. tRNA-splicing endonuclease subunit alpha; Reviewed 0
43385 424255 cl40624 zf-C2H2_jaz Zinc-finger double-stranded RNA-binding. This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding. 0
43386 424257 cl40626 Helicase_IV_N DNA helicase IV / RNA helicase N terminal. DNA helicase IV; Provisional 0
43387 424258 cl40627 HpaB_N 4-hydroxyphenylacetate 3-hydroxylase N terminal. This gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Shigella, Photorhabdus and Pasteurella. The family represented by this model is narrowly limited to gammaproteobacteria to exclude other aromatic hydroxylases involved in various secondary metabolic pathways. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241). 0
43388 424259 cl40628 SARS-CoV-like_ORF3a accessory protein ORF3a of severe acute respiratory syndrome-associated coronavirus and similar proteins from related betacoronavirus. APA3_viroporin is a pro-apoptosis-inducing protein. It localizes to the endoplasmic reticulum (ER)-Golgi compartment. The Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) causes apoptosis of infected cells, and this is one of the culprits. Multi-pass membrane protein that forms a homotetrameric potassium-sensitive ion channel called a viroporin whose activity causes ER-stress to the host cell. 0
43389 424260 cl40629 DisA-linker DisA bacterial checkpoint controller linker region. DNA integrity scanning protein DisA; Provisional 0
43390 424261 cl40630 RskA Anti-sigma-K factor rskA. This domain, formerly known as DUF2337, is the anti-sigma-K factor, RskA. In Mycobacterium tuberculosis the protein positively regulates expression of the antigenic proteins MPB70 and MPB83. 0
43391 424262 cl40631 SPA Stabilization of polarity axis. Members of this family of hypothetical proteins have no known function. 0
43392 424263 cl40632 Avl9 Transport protein Avl9. This domain occurs at the N-terminus of Afi1, an Arf3p-interacting protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3, the highly conserved small GTPases (ADP-ribosylation factors). Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homolog of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalization. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localization and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth. 0
43393 424264 cl40633 Potass_KdpF F subunit of K+-transporting ATPase (Potass_KdpF). This model describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (TIGR00680). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation. [Transport and binding proteins, Cations and iron carrying compounds] 0
43394 424265 cl40634 Docking Erythronolide synthase docking. Polyketide synthase (PKS) catalyzes the biosynthesis of polyketides, which are structurally and functionally diverse natural products in microorganisms and plants. Type I modular PKSs are the large, multifunctional enzymes responsible for the production of a diverse family of structurally rich and often biologically active natural products. The efficiency of acyl transfer at the interfaces of the individual PKS proteins is thought to be governed by helical regions, termed docking domains (dd), located at the C-terminus of the upstream and N-terminus of the downstream polypeptide chains. This entry represents the N-terminal coiled-coil domain found in PikAIV (module 6) proteins from the Pik PKS system in bacteria. This N-terminal PKS docking domain (KS-side docking domain, KSdd) exhibits a coiled-coil motif and the dimer presents a small hydrophobic patch, sometimes flanked by charged residues, as a narrow binding groove where the ACPdd terminal helix can bind. 0
43395 424267 cl40636 FAR1 FAR1 DNA-binding domain. AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This family includes the paralogous pair of transcription factors AFT1 and AFT2. 0
43396 424268 cl40637 HDOD HDOD domain. 0
43397 424269 cl40638 TetR_C_3 YcdC-like protein, C-terminal region. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the TetR family of transcriptional regulators defined by the N-terminal model pfam00440 and the C-terminal model pfam08362 (YcdC-like protein, C-terminal region). 0
43398 424271 cl40640 A-2_8-polyST Alpha-2,8-polysialyltransferase (POLYST). This family features glycosyltransferases belonging to glycosyltransferase family 52, which have alpha-2,3- sialyltransferase (EC:4.2.99.4) and alpha-glucosyltransferase (EC 2.4.1.-) activity. For example, beta-galactoside alpha-2,3- sialyltransferase expressed by Neisseria meningitidis is a member of this family and is involved in a step of lipooligosaccharide biosynthesis requiring sialic acid transfer; these lipooligosaccharides are thought to be important in the process of pathogenesis. This family includes several bacterial lipooligosaccharide sialyltransferases similar to the Haemophilus ducreyi LST protein. Haemophilus ducreyi is the cause of the sexually transmitted disease chancroid and produces a lipooligosaccharide (LOS) containing a terminal sialyl N-acetyllactosamine trisaccharide. 0
43399 424272 cl40641 TraG_N TraG-like protein, N-terminal region. conjugal transfer mating pair stabilization protein TraG; Provisional 0
43400 424281 cl40650 MAT1 CDK-activating kinase assembly factor MAT1. MAT1 is an assembly/targeting factor for cyclin-dependent kinase-activating kinase (CAK), which interacts with the transcription factor TFIIH. The domain found to the N-terminal side of this domain is a C3HC4 RING finger. 0
43401 424282 cl40651 FhuF Ferric iron reductase protein FhuF, involved in iron transport [Inorganic ion transport and metabolism]. ferric iron reductase involved in ferric hydroximate transport; Provisional 0
43402 424284 cl40653 FbpA Fibronectin-binding protein A N-terminus (FbpA). This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host. 0
43403 424285 cl40654 Phage_Coat_B Phage Coat protein B. CoatB is a single filamentous bacteriophage alpha helix of approximately 44 residues. It is likely to assemble into a complex of 35 monomers in a Catherine-wheel like formation. It is the major coat protein of the virion. 0
43404 424286 cl40655 Legionella_OMP Legionella pneumophila major outer membrane protein precursor. This is a family of putative beta barrel porin-7 BBP7 proteins identified initially in Rhodopirellula baltica. 0
43405 424287 cl40656 NMD3 NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. Export of the 60S ribosomal subunit is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway. 0
43406 424288 cl40657 DUF572 Family of unknown function (DUF572). Family of eukaryotic proteins with undetermined function. 0
43407 424289 cl40658 DUF512 Protein of unknown function (DUF512). Members of this protein family are predicted radical SAM enzymes of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis. 0
43408 424291 cl40660 G3P_antiterm Glycerol-3-phosphate responsive antiterminator. Intracellular glycerol is usually converted to glycerol-3-phosphate in an ATP-requiring phosphorylation reaction catalyzed by glycerol kinase (GlpK). Glycerol-3-phosphate activates the antiterminator GlpP. 0
43409 424293 cl40662 DUF445 Protein of unknown function (DUF445). Predicted to be a membrane protein. 0
43410 424295 cl40664 Exonuc_V_gamma Exodeoxyribonuclease V, gamma subunit. This model describes the gamma subunit of exodeoxyribonuclease V. Species containing this protein should also have the alpha (TIGR01447) and beta (TIGR00609) subunits. Candidates from Borrelia and from the Chlamydias differ dramatically and score between trusted and noise cutoffs. [DNA metabolism, DNA replication, recombination, and repair] 0
43411 424296 cl40665 Class_IIIsignal Class III signal peptide. This family of archaeal proteins contains an amino terminal motif QXSXEXXXL that has been suggested to be part of a class III signal sequence, with the Q being the +1 residue of the signal peptidase cleavage site. Two members of this family are cleaved by a type IV pilin-like signal peptidase. 0
43412 424297 cl40666 Autophagy_act_C Autophagocytosis associated protein, active-site domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. It carries a highly characteristic conserved FLKF sequence motif. 0
43413 424298 cl40667 RplB Ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. This model distinguishes bacterial and organellar ribosomal protein L2 from its counterparts in the archaea and in the eukaryotic cytosol. Plant mitochondrial examples tend to have long, variable inserts. [Protein synthesis, Ribosomal proteins: synthesis and modification] 0
43414 424301 cl40670 PCRF PCRF domain. This is a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae the protein is MRP-S24 whereas in humans it is MRP-S28. The human mitochondrial ribosome has 29 distinct proteins in the small subunit and these have homologs in, for example, Drosophila melanogaster, Caenorhabditis elegans, and in the genomes of several fungi. 0
43415 424304 cl40673 Fe_dep_repr_C Iron dependent repressor, metal binding and dimerization domain. This family includes the Diphtheria toxin repressor. 0
43416 424306 cl40675 Herpes_glycop_H Herpesvirus glycoprotein H main domain. envelope glycoprotein H; Provisional 0
43417 424307 cl40676 Epimerase_2 UDP-N-acetylglucosamine 2-epimerase. This family consists of UDP-N-acetylglucosamine 2-epimerases (EC:5.1.3.14); this enzyme catalyzes the production of UDP-ManNAc from UDP-GlcNAc. Note that some of the enzymes in this family are bifunctional; in these instances Pfam matches only the N-terminal half of the protein, suggesting that the additional C-terminal part (when compared to mono-functional members of this family) is responsible for the UDP-N-acetylmannosamine kinase activity of these enzymes. This hypothesis is further supported by the assumption that the C-terminal part of rat Gne is the kinase domain. 0
43418 424310 cl40679 MCR_beta Methyl-coenzyme M reductase beta subunit, C-terminal domain. Members of this protein family are the beta subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis] 0
43419 424313 cl40682 DUF4271 Domain of unknown function (DUF4271). This family includes O-antigen polysaccharide polymerases. These enzymes link O-units via a glycosidic linkage to form a long O-antigen. These enzymes vary in specificity and sequence. 0
43420 424314 cl40683 Pro_dh Proline dehydrogenase. proline dehydrogenase 0
43421 424315 cl40684 PPTA Protein prenyltransferase alpha subunit repeat. Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognize a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognizes a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family. 0
43422 424316 cl40685 DusA tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 0
43423 424318 cl40687 Carboxyl_trans Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction; transcarboxylation from biotin to an acceptor molecule. There are two recognized types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilize acyl-CoA as the acceptor molecule. 0
43424 424319 cl40688 MvaT_DBD DNA-binding domain of the bacterial xenogeneic silencer MvaT. MvaT is a xenogeneic silencer conserved in Pseudomonas which assists in distinguishing foreign from self DNA. It prefers binding to flexible DNA segments with multiple TpA steps, and forms nucleoprotein filaments through cooperative polymerization. 0
43425 424320 cl40689 TOP4c N/A. DNA topoisomerase II medium subunit; Provisional 0
43426 424325 cl40694 BglB Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase [Carbohydrate transport and metabolism]. 6-phospho-beta-glucosidase; Reviewed 0
43427 424326 cl40695 Ribosomal_S4 Ribosomal protein S4/S9 N-terminal domain. 30S ribosomal protein S4; Validated 0
43428 424327 cl40696 SARS-CoV_ORF3b accessory protein ORF3b of severe acute respiratory syndrome-associated coronavirus. This family of proteins is found in viruses. Proteins in this family are typically between 32 and 154 amino acids in length. This family contains the SARS coronavirus 3b protein which is predominantly localized in the nucleolus, and induces G0/G1 arrest and apoptosis in transfected cells. 0
43429 424328 cl40697 cv_gamma-delta_Nsp2_IBV-like gamma- and deltacoronavirus non-structural protein 2 (Nsp2), similar to IBV Nsp2 and related proteins. This is the N-terminal domain found in Replicase polyprotein 1a (also known as non-structural protein 2a-Nsp2a). Family members are found in Gammacoronaviruses. 0
43430 424330 cl40699 AAA_13 AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. This family includes the PrrC protein that is thought to be the active component of the anticodon nuclease. 0
43431 424334 cl40703 Trypan_PARP Procyclic acidic repetitive protein (PARP). The SPATA3 family of proteins is expressed significantly in testis and faintly in epididymis in the ten tissues of testis, ovary, spleen, kidney, lung, heart, brain, epididymis, liver and skeletal muscle in mouse. Members are not expressed in the eight other tissues. This suggests that SPATA3 plays potential roles in spermatogenesis cell apoptosis or spermatogenesis. 0
43432 424335 cl40704 Ycf1 Ycf1. Ycf1; Provisional 0
43433 424336 cl40705 RAP-1 Rhoptry-associated protein 1 (RAP-1). rhoptry-associated protein; Provisional 0
43434 424338 cl40707 EpmB L-lysine 2,3-aminomutase (EF-P beta-lysylation pathway) [Amino acid transport and metabolism]. This model represents essentially the whole of E. coli YjeK and of some of its apparent orthologs. YodO in Bacillus subtilis, a family member which is a longer protein by an additional 100 residues, is characterized as a lysine 2,3-aminomutase with iron, sulphide and pyridoxal 5'-phosphate groups. The homolog MJ0634 from M. jannaschii is preceded by nearly 200 C-terminal residues. This family shows similarity to molybdenum cofactor biosynthesis protein MoaA and related proteins. Note that the E. coli homolog was expressed in E. coli and purified and found not to display lysine 2,3-aminomutase activity. Active site residues are found in the 100 residue extension in B. subtilis. Name changed to KamA family protein. [Cellular processes, Adaptations to atypical conditions] 0
43435 424339 cl40708 MviM Predicted dehydrogenase [General function prediction only]. All members of the seed alignment for this model are known or predicted inositol 2-dehydrogenase sequences co-clustered with other enzymes for catabolism of myo-inositol or closely related compounds. Inositol 2-dehydrogenase catalyzes the first step in inositol catabolism. Members of this family may vary somewhat in their ranges of acceptable substrates and some may act on analogs to myo-inositol rather than myo-inositol per se. [Energy metabolism, Sugars] 0
43436 424340 cl40709 cyano_w_EgtBD hercynine metabolism protein. Members of this protein family resemble TIGR04375 and, more distantly, to phage shock protein A (PspA). Members are restricted to the Cyanobacteria. 0
43437 424341 cl40710 PflX Uncharacterized Fe-S protein PflX, radical SAM superfamily [General function prediction only]. Members of this protein family are uncharacterized radical SAM enzymes that occur in a prokaryotic three-gene system along with homologs of mammalian proteins Memo (Mediator of ErbB2-driven cell MOtility) and AMMERCR1 (Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis). Among radical SAM enzymes that have been experimentally characterized, the most closely related in sequence include activases of pyruvate formate-lyase and of benzylsuccinate synthase. 0
43438 424349 cl40718 mnmC bifunctional tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD/FAD-dependent 5-carboxymethylaminomethyl-2-thiouridine(34) oxidoreductase MnmC. In Escherichia coli, the protein previously designated YfcK is now identified as the bifunctional enzyme MnmC. It acts, following the action of the heterotetramer of GidA and MnmE, in the modification of U-34 of certain tRNA to 5-methylaminomethyl-2-thiouridine (mnm5s2U). In other bacteria, the corresponding proteins are usually but not always found as a single polypeptide chain, and occasionally as the product of tandem genes. This model represents the C-terminal region of the multifunctional protein. [Protein synthesis, tRNA and rRNA base modification] 0
43439 424350 cl40719 PRK06078 N/A. In general, members of this protein family are designated pyrimidine-nucleoside phosphorylase, enzyme family EC 2.4.2.2, as in Bacillus subtilis, and more narrowly as the enzyme family EC 2.4.2.4, thymidine phosphorylase (alternate name: pyrimidine phosphorylase), as in Escherichia coli. The set of proteins encompassed by this model is designated subfamily rather than equivalog for this reason; the protein name from this model should be used when TIGR02643 does not score above trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Other] 0
43440 424351 cl40720 PRK15033 tricarballylate utilization 4Fe-4S protein TcuB. This model identifies proteins of two distinct names which may or may not have two distinct functions. CitB has been identified in salmonella and E. coli as the signal transduction component of a two-component system for citrate in which CitA acts as a citrate transporter. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the C-terminal domain of the R. capsulatus CobZ, which, in most other species exists as a separate gene adjacent to CobZ. 0
43441 424352 cl40721 CysI sulfite reductase (NADPH) hemoprotein, beta-component. Distantly related to the iron-sulfur hemoprotein of sulfite reductase (NADPH) found in Proteobacteria and Eubacteria, sulfite reductase (ferredoxin) is a cyanobacterial and plant monomeric enzyme that also catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism] 0
43442 424353 cl40722 COG2605 Predicted kinase related to galactokinase and mevalonate kinase [General function prediction only]. This model represents the shikimate kinase (SK) gene found in archaea which is only distantly related to homoserine kinase (thrB) and not at all to the bacterial SK enzyme. The SK from M. jannaschii has been overexpressed in E. coli and characterized. SK catalyzes the fifth step of the biosynthesis of chorismate from D-erythrose-4-phosphate and phosphoenolpyruvate. [Amino acid biosynthesis, Aromatic amino acid family] 0
43443 424354 cl40723 ArgC N-acetyl-gamma-glutamylphosphate reductase [Amino acid transport and metabolism]. This model represents the more common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and the gap architecture in a multiple sequence alignment. Bacterial members of this family tend to be found within Arg biosynthesis operons. [Amino acid biosynthesis, Glutamate family] 0
43444 424355 cl40724 PpsA Phosphoenolpyruvate synthase/pyruvate phosphate dikinase [Carbohydrate transport and metabolism]. Also called pyruvate,water dikinase and PEP synthase. The member from Methanococcus jannaschii contains a large intein. This enzyme generates phosphoenolpyruvate (PEP) from pyruvate, hydrolyzing ATP to AMP and releasing inorganic phosphate in the process. The enzyme shows extensive homology to other enzymes that use PEP as substrate or product. This enzyme may provide PEP for gluconeogenesis, for PTS-type carbohydrate transport systems, or for other processes. [Energy metabolism, Glycolysis/gluconeogenesis] 0
43445 424356 cl40725 ERG8 Phosphomevalonate kinase [Lipid transport and metabolism]. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents plant and fungal forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other] 0
43446 424357 cl40726 COG2936 Predicted acyl esterase [General function prediction only]. This model represents a protein subfamily that includes the cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). This family shows extensive, low-level similarity to a family of xaa-pro dipeptidyl-peptidases, and local similarity by PSI-BLAST to many other hydrolases. [Unknown function, Enzymes of unknown specificity] 0
43447 424358 cl40727 CagE_TrbE_VirB CagE, TrbE, VirB family, component of type IV transporter system. Type IV secretion systems are found in Gram-negative pathogens. They export proteins, DNA, or complexes in different systems and are related to plasmid conjugation systems. This model represents related ATPases that include VirB4 in Agrobacterium tumefaciens (DNA export) CagE in Helicobacter pylori (protein export) and plasmid TraB (conjugation). 0
43448 424359 cl40728 cax calcium/proton exchanger (cax). The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19). Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is generated from the calcium ion/proton exchangers of the CaCA family. [Transport and binding proteins, Cations and iron carrying compounds] 0
43449 424361 cl40730 GlcD FAD/FMN-containing dehydrogenase [Energy production and conversion]. This protein, the glycolate oxidase GlcD subunit, is similar in sequence to that of several D-lactate dehydrogenases, including that of E. coli. The glycolate oxidase has been found to have some D-lactate dehydrogenase activity. [Energy metabolism, Other] 0
43450 424362 cl40731 PLN02677 N/A. mevalonate kinase; Provisional 0
43451 424363 cl40732 AA_permease_2 Amino acid permease. inner membrane transporter YjeM; Provisional 0
43452 424364 cl40733 EutJ Ethanolamine utilization protein EutJ, possible chaperonin [Amino acid transport and metabolism]. ethanolamine utilization protein EutJ; Provisional 0
43453 424366 cl40735 RecB ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) [Replication, recombination and repair]. The RecBCD holoenzyme is a multifunctional nuclease with potent ATP-dependent exodeoxyribonuclease activity. Ejection of RecD, as occurs at chi recombinational hotspots, cripples exonuclease activity in favor of recombinagenic helicase activity. All proteins in this family for which functions are known are DNA-DNA helicases that are used as part of an exonuclease-helicase complex (made up of RecBCD homologs) that function to generate substrates for the initiation of recombination and recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair] 0
43454 424367 cl40736 FeoB Fe2+ transport system protein B [Inorganic ion transport and metabolism]. FeoB (773 amino acids in E. coli), a cytoplasmic membrane protein required for iron(II) uptake, is encoded in an operon with FeoA (75 amino acids), which is also required, and is regulated by Fur. There appear to be two copies in Archaeoglobus fulgidus and Clostridium acetobutylicum. [Transport and binding proteins, Cations and iron carrying compounds] 0
43455 424368 cl40737 PurT Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) [Nucleotide transport and metabolism]. This enzyme is an alternative to PurN (TIGR00639) [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis] 0
43456 424369 cl40738 fliF flagellar basal body M-ring protein FliF. flagellar MS-ring protein; Reviewed 0
43457 424370 cl40739 TOP2c TopoisomeraseII. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. Proteins scoring above the noise cutoff for this model and below the trusted cutoff for topoisomerase IV models probably should be designated GyrB. [DNA metabolism, DNA replication, recombination, and repair] 0
43458 424371 cl40740 DnaG DNA primase (bacterial type) [Replication, recombination and repair]. DNA primase; Provisional 0
43459 424372 cl40741 DAO FAD dependent oxidoreductase. Members of this protein family are the A subunit, product of the glpA gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic] 0
43460 424373 cl40742 PhaC Poly(3-hydroxyalkanoate) synthetase [Lipid transport and metabolism]. This model represents the class II subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs, typically with six to fourteen carbons in the hydroxyacyl backbone into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of cell. [Fatty acid and phospholipid metabolism, Biosynthesis] 0
43461 424374 cl40743 HemY Uncharacterized conserved protein HemY, contains two TPR repeats [Function unknown]. Members of this protein family are uncharacterized tetratricopeptide repeat (TPR) proteins invariably found in heme biosynthesis gene clusters. The absence of any invariant residues other than Ala argues against this protein serving as an enzyme per se. The gene symbol hemY assigned in E. coli is unfortunate in that an unrelated protein, protoporphyrinogen oxidase (HemG in E. coli) is designated HemY in Bacillus subtilis. [Unknown function, General] 0
43462 424376 cl40745 SufI Multicopper oxidase with three cupredoxin domains (includes cell division protein FtsP and spore coat protein CotA) [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism, Cell wall/membrane/envelope biogenes. This family consists of copper-type nitrite reductase. It reduces nitrite to nitric oxide, the first step in denitrification. [Central intermediary metabolism, Nitrogen metabolism] 0
43463 424377 cl40746 SepRS O-phosphoseryl-tRNA(Cys) synthetase [Translation, ribosomal structure and biogenesis]. O-phosphoseryl-tRNA synthetase; Reviewed 0
43464 424378 cl40747 MraZ MraZ, DNA-binding transcriptional regulator and inhibitor of RsmH methyltransferase activity [Translation, ribosomal structure and biogenesis]. Members of this family contain two tandem copies of a domain described by pfam02381. This protein often is found with other genes of the dcw (division cell wall) gene cluster, including mraW, ftsI, murE, murF, ftsW, murG, etc. Recent work shows MraW in E. coli binds an upstream region with three tandem GTGGG repeats separated by 5bp spacers. We find similar sites in other species. [Cellular processes, Cell division, Regulatory functions, DNA interactions] 0
43465 424379 cl40748 MauG Cytochrome c peroxidase [Posttranslational modification, protein turnover, chaperones]. This model describes a subfamily of di-heme proteins related to the di-heme cytochrome c peroxidase and to MauG (methylamine utilization G), an enzyme that performs a tryptophan tryptophylquinone modification to the methylamine dehydrogenase light chain. 0
43466 424380 cl40749 COG1712 Predicted dinucleotide-utilizing enzyme [General function prediction only]. putative L-aspartate dehydrogenase; Provisional 0
43467 424381 cl40750 PcnB tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 0
43468 424382 cl40751 LysR DNA-binding transcriptional regulator, LysR family [Transcription]. This group of sequences represents a number of related clades with numerous examples of members adjacent to operons for the degradation of 2-aminoethylphosphonate (AEP) in Pseudomonas, Ralstonia, Bordetella and Burkholderia species. These are transcriptional regulators of the LysR family which contain a helix-turn-helix (HTH) domain (pfam00126) and a periplasmic substrate-binding protein-like domain (pfam03466). [Regulatory functions, DNA interactions] 0
43469 424383 cl40752 GltD NADPH-dependent glutamate synthase beta chain or related oxidoreductase [Amino acid transport and metabolism, General function prediction only]. dihydropyrimidine dehydrogenase subunit A; Provisional 0
43470 424384 cl40753 PRK10668 N/A. putative transcriptional regulator; Provisional 0
43471 424385 cl40754 Efp Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. function: involved in peptide bond synthesis. stimulate efficient translation and peptide-bond synthesis on native or reconstituted 70S ribosomes in vitro. probably functions indirectly by altering the affinity of the ribosome for aminoacyl-tRNA, thus increasing their reactivity as acceptors for peptidyl transferase (by similarity). The trusted cutoff of this model is set high enough to exclude members of TIGR02178, an EFP-like protein of certain Gammaproteobacteria. [Protein synthesis, Translation factors] 0
43472 425345 cl41714 ZBD_UPF1_nv_SF1_Hel-like Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this arnidovirus group belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. 0
43473 425346 cl41715 1B_UPF1_nv_SF1_Hel-like 1B domain of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. EAV Nsp10 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids. 0
43474 425347 cl41716 SUD_C_DPUP_CoV_Nsp3 C-terminal SARS-Unique Domain (SUD) of betacoronavirus non-structural protein 3 (Nsp3). This subfamily contains the SUD-C of Rousettus bat coronavirus (CoV) HKU9 non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the nobecovirus subgenera (D lineage). Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in Rousettus bat CoV HKU9 and related bat CoVs. Similar to SARS SUD-C, Rousettus bat CoV HKU9 SUD-C (HKU9 C), also adopts a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether HKU9 C functions in the same way. 0
43475 425348 cl41717 M_cv_Nsp15-NTD_av_Nsp11-like middle (M) domain of coronavirus Nonstructural protein 15 (Nsp15) and the N-terminal domain (NTD) of arterivirus Nsp11 and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. 0
43476 425349 cl41718 NendoU_XendoU-like Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, Xenopus laevis endoribonuclease XendoU, and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. The Porcine torovirus (PToV) strain PToV-NPL/2013 NendoU domain is located at the N-terminus of the ORF1ab replicase polyprotein, between regions annotated as Nonstructural proteins 11 (Nsp11) and 13 (Nsp13). This subfamily belongs to a family which includes Nsp15 from coronaviruses and Nsp11 from arteriviruses, which may participate in the viral replication process and in the evasion of the host immune system. These vary in their requirement for Mn2+. Coronavirus Nsp15 generally form functional hexamers, with the exception of Porcine DeltaCoronavirus (PDCoV) Nsp15 which exists as a dimer and a monomer in solution. Arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11 is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. 0
43477 425350 cl41719 capping_2-OMTase_viral viral Cap-0 specific (nucleoside-2'-O-)-methyltransferase. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Nidovirales, a family of ss(+)RNA viruses, cap their mRNAs. For one member, Coronavirus, the 2'OMTase activity is located in the nonstructural protein 16 (NSP16). For others, the 2'OMTase activity may be located in replicase polyprotein 1ab. 0
43478 425351 cl41720 ORF4b_NS3c-betaCoV accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of betacoronaviruses in the C lineage. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Pipistrellus bat coronavirus HKU5 and related bat coronaviruses including Pipistrellus abramus bat coronavirus HKU5-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis. 0
43479 425352 cl41721 deltaCoV_NS7_NS7a deltacoronavirus accessory protein NS7 and NS7a. This group includes the accessory protein NS7a from Quail deltacoronavirus (QdCoV) UAE-HKU30 and sparrow deltacoronavirus (SpCoV-HKU17) within the Buldecovirus subgenus of deltacoronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized. Phylogenetic analysis revealed that QdCoV UAE-HKU30 belongs to the same CoV species as porcine deltacoronavirus (PdCoV) HKU15 and sparrow deltacoronavirus (SpdCoV) HKU17 within Buldecovirus subgenus, suggesting transmission between avian and swine hosts. 0
43480 425353 cl41722 ORF7b_SARS_bat-CoV-like Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and similar proteins from related betacoronaviruses in the B lineage. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome coronaviruses (SARS-CoVs) and related betacoronaviruses identified in Chinese horseshoe bats, including bat SARS-like-CoV WIV1 and HKU3. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, the ORF7b transmembrane helical domain (TMD), spanning amino acid residues 9-29, is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization. 0
43481 425354 cl41723 HD_XRCC4-like_N N-terminal head domain found in the XRCC4 superfamily of proteins. Paralog of XRCC4 and XLF (PAXX), also called XRCC4-like small protein, is a paralog of X-ray repair cross-complementing protein 4 (XRCC4) and XRCC4-like factor (XLF). It is involved in non-homologous end joining (NHEJ), a major pathway to repair double-strand breaks (DSBs) in DNA. It may act as a scaffold required to stabilize the DSB-repair protein Ku heterodimer, composed of XRCC5/Ku80 and XRCC6/Ku70, at double-strand break sites in cells. It functions with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents. Similar to XRCC4 and XLF, PAXX monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. This model corresponds to the N-terminal head domain of PAXX, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and CCDC61. 0
43482 425355 cl41724 DPBB_RlpA_EXP_N-like double-psi beta-barrel fold of RlpA, N-terminal domain of expansins, and similar domains. This group is made up of endoglucanases from mollusks similar to Ampullaria crossean endoglucanase EG27II, a glycoside hydrolase family 45 (GH45) subfamily B protein. Endoglucanases (EC 3.2.1.4) catalyze the endohydrolysis of (1-4)-beta-D-glucosidic linkages in cellulose, lichenin, and cereal beta-D-glucans. Animal cellulases, such as endoglucanase EG27II, have great potential for industrial applications such as bioethanol production. GH45 endoglucanases from mollusks adopt a double-psi beta-barrel (DPBB) fold. 0
43483 425356 cl41725 UDM1_RNF168_RNF169-like UDM1 (ubiquitin-dependent DSB recruitment module 1) found in RING finger proteins RNF168, RNF169 and similar proteins. RING finger protein 168 (RNF168) is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. Together with RNF8, RNF168 functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF168, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets. 0
43484 425357 cl41726 RHH_CopG_NikR-like ribbon-helix-helix domains of transcription repressor CopG, nickel responsive transcription factor NikR, and similar proteins. This subfamily includes the N-terminal ribbon-helix-helix (RHH) domain of putative transcriptional repressor CopG from archaea, and similar proteins. These uncharacterized proteins have a typical RHH, similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number. 0
43485 425358 cl41727 H1_KCTD12-like H1 domain found in potassium channel tetramerization domain-containing proteins. Potassium channel tetramerization domain-containing protein 16 (KCTD16) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-) receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. KCTD16 generates largely non-desensitizing receptor responses. It consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD16, which may not be involved in desensitization. 0
43486 425359 cl41728 WH2 Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif), and similar proteins. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 3 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen.
Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. 0
43487 425360 cl41729 KLF1_2_4_N N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4. 0
43488 425361 cl41730 KLF9_13_N-like Kruppel-like factor (KLF) 9, KLF13, KLF14, KLF16, and similar proteins. Kruppel-like factor 9 (KLF9; also known as Krueppel-like factor 9, or Basic Transcription Element Binding Protein 1/BTEB Protein 1) is a protein that in humans is encoded by the KLF9 gene. KLF9 is critical for the inhibition of growth and development of tumors. It is involved in cell differentiation of B cells, keratinocytes, and neurons. It is also a key transcriptional regulator for uterine endometrial cell proliferation, adhesion, and differentiation; these are processes essential for pregnancy success and are subverted during tumorigenesis. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF9 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF9. 0
43489 425362 cl41731 KLF10_11_N N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins. Kruppel-like factor 11 (KLF11; also known as Krueppel-like factor 11; Fetal Kruppel-like factor-1/FKLF-1; maturity-onset diabetes of the young 7/MODY7; TGFbeta Inducible Early Growth Response 2/TIEG2) is a protein that in humans is encoded by the KLF11 gene. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF11 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF11. 0
43490 425363 cl41732 KLF6_7_N-like N-terminal domain of Kruppel-like factor (KLF) 6, KLF7, and similar proteins. Kruppel-like factor 6 (KLF6; also known as Krueppel-like factor 6, BCD1, CBA1, COPEB, CPBP, GBF, PAC1, ST12, or ZF9) is a protein that, in humans, is encoded by the KLF6 gene. KLF6 contributes to cell proliferation, differentiation, cell death, and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARa). KLF6 expression contributes to hepatic insulin resistance and the progression of non-alcoholic fatty liver disease (NAFLD) to non-alcoholic steatohepatitis (NASH) and NASH-cirrhosis. KLF6 also affects peroxisome proliferator activated receptor gamma (PPARgamma)-signaling in NAFLD. KLF6 has also been identified as a tumor suppressor gene that is inactivated or downregulated in different cancers, including prostate, colon, and hepatocellular carcinomas. KLF6 transactivates genes controlling cell proliferation, including p21, E-cadherin, and pituitary tumor-transforming gene 1 (PTTG1). KLF6 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context.
KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF6. 0
43491 425364 cl41733 Syt1_2_N N-terminal domain of synaptotagmin-1 and -2. Syt2, also called synaptotagmin II (SytII), exhibits calcium-dependent phospholipid and inositol polyphosphate binding properties. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. It plays a role in dendrite formation by melanocytes. The model corresponds to N-terminal domain of Syt2, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B). 0
43492 425365 cl41734 CoV_Nsp7 coronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. 
SARS-CoV Nsp7 forms a 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length. 0
43493 425366 cl41735 TBK1_IKKE-like_C C-terminal domain of non-canonical Inhibitor of kappa B kinases, IKK-E and TBK1, and similar proteins. TANK-binding kinase 1 (TBK1), also called T2K and NF-kB-activating kinase, is a serine/threonine-protein kinase that is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. TBK1 may also play roles in cell transformation and oncogenesis. In addition, it regulates optineurin (OPTN), an important autophagy receptor involved in several selective autophagy processes. TBK1 contains N-terminal serine/threonine protein kinase, ubiquitin-like (Ubl), coiled-coil domain 1 (CCD1), and C-terminal alpha-helical domains. This model corresponds to a small conserved elongated alpha-helical domain at the C-terminus of TBK1, which is responsible for the binding of its adaptor proteins such as OPTN and NAP1. 0
43494 425367 cl41736 MIU2_RNF168-like second motif interacting with ubiquitin domain found in RING finger protein 168 and similar domains. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin. This model corresponds to the second MIU (MIU2) domain of RNF168. The first MIU belongs to a different domain family and is not included here. 0
43495 425368 cl41737 TD_EMAP-like trimerization domain of the echinoderm microtubule-associated protein-like family. Echinoderm microtubule-associated protein-like 4 (EMAP-4), also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to the N-terminal trimerization domain of EMAP-4. 0
43496 425369 cl41738 LGNbd_FRMPD1_D4-like LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD4, and similar proteins. FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. FRMPD4 contains WW, PDZ and FERM domains in the N-terminal region. This model corresponds to a conserved region in the C-terminal region of FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD4. 0
43497 425370 cl41739 Nip7_N-like N-terminal domain of Nip7 and similar proteins. The N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7 co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. Nip7 is involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of the 60S ribosomal subunit. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. 0
43498 425371 cl41740 CC1_SLMAP-like first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein and similar proteins. TRAF3-interacting JNK-activating modulator (T3JAM), also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. The model corresponds to a conserved region that shows high sequence similarity with the first CC (CC1) domain of Sarcolemmal membrane-associated protein (SLMAP), which is responsible for the binding of suppressor of IKBKE 1 (SIKE1). 0
43499 425372 cl41741 PUA PUA RNA binding domain. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the thermotogae subfamily of pseudouridine synthases TruB are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-translational modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the psi GC loop of elongator tRNAs. 0
43500 425373 cl41742 alpha_betaCoV_Nsp1 non-structural protein 1 from alpha- and betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV), bovine coronavirus (BCoV) and Human coronavirus HKU1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and MHV genomes cause drastic reduction or elimination of infectious virus; BCoV Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome. 0
43501 425374 cl41743 betaCoV_Nsp3_betaSM betacoronavirus-specific marker of betacoronavirus non-structural protein 3. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16. 0
43502 425375 cl41744 ABC-2_lan_permease lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit. This subfamily contains lantibiotic ABC transporter permease subunits NisG and NsuG, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to the lantibiotic nisin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. In Lactococcus lactis and Streptococcus uberis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG. In Streptococcus uberis, similar proteins provide self-protection against the pore-forming lantibiotic nisin U. This subfamily contains the NisG and NsuG permease subunits that transport nisin to the surface and expel it from the membrane. 0
43503 425376 cl41745 DEFL defensin-like domain family. This subfamily includes a group of bactericidal proteins, such as defensins, sapecins, tenecins, phormicins, and lucifensins from bilateria. They are host defense peptides produced in response to injury and mostly active against Gram-positive bacteria. This model corresponds to the defensin-like (DEFL) domain, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold. 0
43504 425377 cl41746 CEN_USH1G_ANKS4B central domain found in usher syndrome type-1G protein, ankyrin repeat and SAM domain-containing protein 4B, and similar proteins. Usher syndrome type-1G protein (USH1G), also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. USH1G consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of USH1G, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the MYO7A MyTH4-FERM tandem. 0
43505 425378 cl41747 RBD_KIF20A-like RAB6 binding domain (RBD) found in kinesin-like proteins KIF20A, KIF20B, and similar proteins. KIF20A, also called GG10_2, or mitotic kinesin-like protein 2 (MKlp2), or Rab6-interacting kinesin-like protein, or rabkinesin-6, is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1, it is involved in recruitment of PLK1 (polo-like kinase 1) to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus end-directed motility. This model corresponds to RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge. 0
43506 425379 cl41748 CoV_Nsp13-helicase helicase domain of coronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus (CoV) NL63. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core. 0
43507 425380 cl41749 GH2_like GIPC homology 2 (GH2) domain-like family. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. This model corresponds to the GH2-like domain that shows high sequence similarity with the GH2 domain found in GIPC (GAIP C-terminus-interacting protein) family of proteins, which mediate endocytosis by tethering cargo proteins to the motor myosin VI. 0
43508 425381 cl41750 Rcc_KIF21 regulatory coiled-coil domain found in the kinesin-like KIF21 family. KIF21A, also called kinesin-like protein KIF2 or renal carcinoma antigen NY-REN-62, is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. This model corresponds to the regulatory coiled-coil domain of KIF21A, which folds into an intramolecular antiparallel coiled-coil monomer in solution, but crystallizes into a dimeric domain-swapped antiparallel coiled-coil. 0
43509 425382 cl41751 PPP2R3 serine/threonine protein phosphatase 2A regulatory subunit B". Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR70 (also known as protein phosphatase 2 regulatory subunit B'' subunit beta, PR48, NYREN8, PPP2R3L, or PPP2R3LY) that is encoded by the PPP2R3B gene. This substrate-recognizing subunit of PP2A has a two-domain elongated structure with two calcium EF-hands, each displaying different affinities to Ca2+. PPP2R3B/PR70 is a gonosomal melanoma tumor suppressor gene; PR70 decreased melanoma growth by negatively interfering with DNA replication and cell cycle progression through its role in stabilizing the cell division cycle 6 (CDC6)-chromatin licensing and DNA replication factor 1 (CDT1) interaction, which delays the firing of origins of DNA replication. 0
43510 425383 cl41752 LPS_wlbK-like Bordetella wlbK gene product domains involved in bacterial polysaccharide synthesis, and similar domains. This model includes the C-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes that is involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussis. Studies show that mutations in wlbJK do not affect LPS biosynthesis but their function remains unclear. 0
43511 425384 cl41753 SUN_cc1 coiled-coil domain 1 of SUN domain-containing proteins. SUN domain-containing protein 1 (SUN1), also called protein unc-84 homolog A, or Sad1/unc-84 protein-like 1, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN1 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that may function as an activation segment to release CC2-mediated inhibition of the SUN domain. 0
43512 425385 cl41754 SPASM Iron-sulfur cluster-binding SPASM domain. Butirosin biosynthesis protein N (BtrN), also called S-adenosyl-L-methionine-dependent 2-deoxy-scyllo-inosamine dehydrogenase (EC 1.1.99.38), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the two-electron oxidation of 2-deoxy-scyllo-inosamine (DOIA) to amino-dideoxy-scyllo-inosose (amino-DOI) in the biosynthetic pathway of the aminoglycoside antibiotic butirosin. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. BtrN contains one auxiliary 4Fe-4S cluster. 0
43513 425386 cl41755 LPMO_auxiliary lytic polysaccharide monooxygenase auxiliary activity protein. Fusolin is a protein found in spindles of insect poxviruses that resembles the lytic polysaccharide monooxygenases of chitinovorous bacteria and may function to disrupt the chitin-rich peritrophic matrix that protects insects against oral infections. Thus, it is a component of the virus occlusion bodies (which are large proteinaceous polyhedra) that protect the virus from the outside environment for extended periods until they are ingested by insect larvae. 0
43514 425387 cl41756 PoNe Polymorphic Nuclease effector (PoNe) domain is a deoxyribonuclease. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as a TANFOR domain which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains, or a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis. 0
43515 425388 cl41757 cytochrome_P450 cytochrome P450 (CYP) superfamily. Cytochrome P450 family 4, subfamily V, polypeptide 2 (CYP4V2) is the most characterized member of the CYP4V subfamily. It is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. 0
43516 425389 cl41758 peptidase_C58-like C58 peptidase domain and similar domains. This family includes the C58 peptidase domain of Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. HopN1's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain. 0
43517 425390 cl41759 polyA_pol_NCLDV RNA polyadenylate polymerase of nucleocytoplasmic large DNA viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs. 0
43518 425391 cl41760 XPF_nuclease-like nuclease domain of XPF/MUS81 family proteins. Budding yeast Mms4, also known as Eme1 in other organisms, is a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Typical substrates include 3'-flap structures, D-loops, replication forks with regressed leading strands and nicked Holliday junctions. The nuclease domain of Mms4 lacks the catalytic motif. 0
43519 425392 cl41761 FIX-like Found in type sIX effector (FIX) domain of unknown function. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that generally co-occurs with the C-terminal Ntox15 (Novel toxin 15), a predicted RNase toxin that possesses a conserved HxxD motif, as well as with domains such as DNA/RNA non-specific endonuclease, RhsA domain regions with extended RHS repeats, or DUF4112. Some members also contain an N-terminal PAAR-like (i.e., DUF4280) domain. 0
43520 425393 cl41762 MIX Marker for type sIX effectors domain. This subfamily contains the MIX (Marker for type sIX effectors) V clan (MIX V) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted antibacterial activities of the C-terminal toxin domains of Vibrionaceae MIX V effectors include peptidase, peptidoglycan hydrolase, nuclease and pore-forming. Also included in this clan is VPR01S_11_01570, encoded by V. proteolyticus, that carries a CNF1 (cytotoxic necrotizing factor 1) toxin domain and modulates the actin cytoskeleton of eukaryotic phagocytic cells. Some members contain DUF2235, which is predicted as a phospholipase domain. Members of the MIX V clan are shared between marine bacteria via horizontal gene transfer, thereby enhancing their bacterial competitive fitness. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. 0
43521 425394 cl41763 ELD_TRPML extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipins (TRPMLs). TRPML3, also called mucolipin-3 (ML3), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It mediates release of Ca(2+) from endosomes to the cytoplasm, contributes to endosomal acidification and is involved in the regulation of membrane trafficking and fusion in the endosomal pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML3. It forms a tight tetramer that is crucial for full-length TRPML3 assembly and localization. 0
43522 425395 cl41764 CdiA-CT_Ec_Kp-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli and Klebsiella pneumoniae CdiA, and similar proteins. This family includes the C-terminal (CT) domain of bacterial CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. Many of the domains in this family are associated with RHS repeats N-terminal to the domain. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity. 0
43523 425396 cl41765 EVE-like EVE and YTH domains belong to the PUA superfamily. Individual members of the YTH family have been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. In general, eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or in other forms of silencing. The YTH domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. It belongs to the larger PUA superfamily. 0
43524 425397 cl41766 YidC_peri periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Gram-negative bacteria and similar domains. This subfamily is composed of Escherichia coli YidC and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. 0
43525 425398 cl41767 Rab11BD_RAB3IP_like Rab11 binding domain of Rab-3A-interacting protein (RAB3IP), Rab-3A-interacting-like protein 1 (RAB3IL1) and similar proteins. RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IL1, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region. 0
43526 425399 cl41768 WH_NTD_SMARCB1_like N-terminal winged helix DNA-binding domain found in SMARCB1, PHF10 and similar proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. The model corresponds to the N-terminal winged helix DNA binding domain of SMARCB1, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins. 0
43527 425400 cl41769 toxin_MLD_like membrane localization domain (MLD) of Vibrio MARTX, Pasteurella PMT, clostridial glycosylating cytotoxins, toxin effectors BteA (Bordetella T3SS effector A) and related proteins. This family includes the MLD located in the N-terminal minimal membrane-binding segment of BteA (residues 1-131, BteA131), which has also been referred to as the lipid raft targeting (LRT) domain/motif. BteA is a type III secretion system (T3SS) effector protein from Bordetella pertussis, a bacterial respiratory pathogen and the causative agent of whooping cough. The BteA131 segment is multifunctional: in addition to targeting phosphatidylinositol (PI)-rich microdomains in the host membrane, it binds its cognate chaperone BtcA. The MLD adopts a four-helix bundle structure, with a positively charged surface that targets phosphatidylinositol 4,5-bisphosphate (PIP2) in the host membrane via critical arginine and lysine residues. A flexible region preceding the BteA helical bundle contains the characteristic beta-motif required for binding BtcA. This domain has significant sequence similarity to the N-terminal domain of effectors and the endo-domain of RTX-type toxins from Photorhabdus luminescens. This family includes the N-terminal domain of Photorhabdus laumondii Photox toxin; little is known about the N-terminus of Photox, but its C-terminus is an actin-targeting ADP-ribosyltransferase. 0
43528 425401 cl41770 NucC-like cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease and similar proteins. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease. 0
43529 425402 cl41771 SP6-9_N N-terminal domains of transcription factor Specificity Proteins (SP) 6-9, and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP9 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP9. 0
43530 425403 cl41772 Arl6IP1_RETR3-like ADP-ribosylation factor-like protein 6-interacting protein 1, Reticulophagy regulator 3, and similar proteins. Reticulophagy regulator 3 (RETR3 or RETREG3), also called FAM134C (family with sequence similarity 134, member C), mediates NRF1-enhanced neurite outgrowth. It interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG3/FAM134C contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature. 0
43531 425404 cl41773 SP1-4_N N-terminal domain of transcription factor Specificity Proteins (SP) 1-4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods. 0
43532 425405 cl41774 HkD_SF Hook domain-containing proteins superfamily. Gipie, also called coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing. 0
43533 425406 cl41775 JMTM_Notch_APP juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. Amyloid-like protein 2 (APLP-2), also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP), may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3' (CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to juxtamembrane and transmembrane (JMTM) domain of APLP-2, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. 0
43534 425407 cl41776 DLC-like_SF dynein light chain (DLC)-like domain superfamily. Dynein light chain Tctex-type 3 (DYNLT3), also called rp3, or protein 91/23, or T-complex-associated testis-expressed 1-like, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that is thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It has a potential role in chromosome congression in human mitosis and is required for chromosome alignment during mouse oocyte meiotic maturation. The DYNLT3 light chain directly links cytoplasmic dynein to a spindle checkpoint protein, Bub3. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT3. 0
43535 425408 cl41777 Zn-C2H2_CALCOCO1_TAX1BP1_like autophagy receptor zinc finger-C2H2 domain found in calcium-binding and coiled-coil domain-containing proteins, TAX1BP1 and similar proteins. Spn-F is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. It acts downstream of IKK-related kinase Ik2 in the same pathway for dendrite pruning. Spn-F is a coiled-coil protein containing a C2H2-type zinc binding domain. 0
43536 425409 cl41778 GINS_B beta-strand (B) domain of GINS complex proteins: Sld5, Psf1, Psf2, Psf3, Gins51 and Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins23, as well as eukaryotic Psf2 and Psf3, have the alpha-helical (A) domain at the C-terminus and the beta-strand domain (B) at the N-terminus; this arrangement is called BA-type. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of archaeal Gins23. 0
43537 425410 cl41779 akirin akirin. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-2 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis, and brain- and limb-development. Akirin-2 is partly cytosolic. It has been shown to interact with nuclear importins and therefore may play a role in proper transport between nucleus and cytoplasm. 0
43538 425411 cl41780 CDI_toxin_Bp_tRNase-like C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) toxin domains that are similar to Burkholderia pseudomallei E479 and 1026b CdiA toxins, both of which are tRNases. 0
43539 425412 cl41781 pseudoGTPaseD_p190RhoGAP pseudoGTPase domain found in the family of p190RhoGAP. p190RhoGAP protein A (p190RhoGAP-A), also called Rho GTPase-activating protein 35 (RHOGAP35), glucocorticoid receptor DNA-binding factor 1, or glucocorticoid receptor repression factor 1 (GRF-1), or Rho GAP p190A, or p190-A, is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. It binds several acidic phospholipids which inhibits the Rho GAP activity to promote the Rac GAP activity. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-A. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases would consist of a GTPase fold lacking one or more of these G motifs. 0
43540 394960 pfam00001 7tm_1 7 transmembrane receptor (rhodopsin family). This family contains, amongst other G-protein-coupled receptors (GPCRs), members of the opsin family, which have been considered to be typical members of the rhodopsin superfamily. They share several motifs, mainly the seven transmembrane helices, GPCRs of the rhodopsin superfamily. All opsins bind a chromophore, such as 11-cis-retinal. The function of most opsins other than the photoisomerases is split into two steps: light absorption and G-protein activation. Photoisomerases, on the other hand, are not coupled to G-proteins - they are thought to generate and supply the chromophore that is used by visual opsins. 256
43541 394961 pfam00002 7tm_2 7 transmembrane receptor (Secretin family). This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognized. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling. 245
43542 394962 pfam00003 7tm_3 7 transmembrane sweet-taste receptor of 3 GPCR. This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness. 236
43543 394963 pfam00004 AAA ATPase family associated with various cellular activities (AAA). AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes. 130
43544 394964 pfam00005 ABC_tran ABC transporter. ABC transporters form a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide as in CFTR, or belong in different polypeptide chains. 150
43545 394965 pfam00006 ATP-synt_ab ATP synthase alpha/beta family, nucleotide-binding domain. This entry includes the ATP synthase alpha and beta subunits, the ATP synthase associated with flagella and the termination factor Rho. 212
43546 394966 pfam00007 Cys_knot Cystine-knot domain. The family comprises glycoprotein hormones and the C-terminal domain of various extracellular proteins. It is believed to be involved in disulfide-linked dimerization. 105
43547 394967 pfam00008 EGF EGF-like domain. There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminus regions of the Ca2+ binding EGF domains (this is the main reason of discrepancy between swiss-prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains. 31
43548 394968 pfam00009 GTP_EFTU Elongation factor Tu GTP binding domain. This domain contains a P-loop motif, also found in several other families such as pfam00071, pfam00025 and pfam00063. Elongation factor Tu consists of three structural domains, this plus two C-terminal beta barrel domains. 187
43549 394969 pfam00010 HLH Helix-loop-helix DNA-binding domain. 53
43550 365807 pfam00011 HSP20 Hsp20/alpha crystallin family. Not only do small heat-shock-proteins occur in eukaryotes and prokaryotes but they have also now been shown to occur in cyanobacterial phages as well as their bacterial hosts. 100
43551 394970 pfam00012 HSP70 Hsp70 protein. Hsp70 chaperones help to fold many proteins. Hsp70 assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP dependent. Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate binding region. 598
43552 394971 pfam00013 KH_1 KH domain. KH motifs bind RNA in vitro. Autoantibodies to Nova, a KH domain protein, cause paraneoplastic opsoclonus ataxia. 65
43553 394972 pfam00014 Kunitz_BPTI Kunitz/Bovine pancreatic trypsin inhibitor domain. Indicative of a protease inhibitor, usually a serine protease inhibitor. Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Certain family members are similar to the tick anticoagulant peptide (TAP). This is a highly selective inhibitor of factor Xa in the blood coagulation pathways. TAP molecules are highly dipolar, and are arranged to form a twisted two-stranded antiparallel beta-sheet followed by an alpha helix. 53
43554 333767 pfam00015 MCPsignal Methyl-accepting chemotaxis protein (MCP) signalling domain. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs. 172
43555 394973 pfam00016 RuBisCO_large Ribulose bisphosphate carboxylase large chain, catalytic domain. The C-terminal domain of RuBisCO large chain is the catalytic domain adopting a TIM barrel fold. 292
43556 394974 pfam00017 SH2 SH2 domain. 77
43557 394975 pfam00018 SH3_1 SH3 domain. SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organisation. First described in the Src cytoplasmic tyrosine kinase. The structure is a partly opened beta barrel. 47
43558 394976 pfam00019 TGF_beta Transforming growth factor beta like domain. 100
43559 394977 pfam00020 TNFR_c6 TNFR/NGFR cysteine-rich region. 38
43560 394978 pfam00021 UPAR_LY6 u-PAR/Ly-6 domain. This extracellular disulphide bond rich domain is related to pfam00087. 77
43561 394979 pfam00022 Actin Actin. 407
43562 394980 pfam00023 Ank Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities. Repeats 13-24 are especially active, with known sites of interaction for the Na/K ATPase, Cl/HCO(3) anion exchanger, voltage-gated sodium channel, clathrin heavy chain and L1 family cell adhesion molecules. The ANK repeats are found to form a contiguous spiral stack such that ion transporters like the anion exchanger associate in a large central cavity formed by the ANK repeat spiral, while clathrin and cell adhesion molecules associate with specific regions outside this cavity. 33
43563 394981 pfam00024 PAN_1 PAN domain. The PAN domain contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. The domain is found in diverse proteins; in some it mediates protein-protein interactions, in others it mediates protein-carbohydrate interactions. 77
43564 394982 pfam00025 Arf ADP-ribosylation factor family. Pfam combines a number of different Prosite families together. 174
43565 394983 pfam00026 Asp Eukaryotic aspartyl protease. Aspartyl (acid) proteases include pepsins, cathepsins, and renins. Two-domain structure, probably arising from ancestral duplication. This family does not include the retroviral nor retrotransposon proteases (pfam00077), which are much smaller and appear to be homologous to a single domain of the eukaryotic asp proteases. 313
43566 394984 pfam00027 cNMP_binding Cyclic nucleotide-binding domain. 89
43567 394985 pfam00028 Cadherin Cadherin domain. 92
43568 394986 pfam00029 Connexin Connexin. Connexin proteins form gap-junctions between cells. They carry four transmembrane regions, hence why this family now includes Connexin_CCC, which represented the second pair of TMs. 222
43569 394987 pfam00030 Crystall Beta/Gamma crystallin. The alignment comprises two Greek key motifs since the similarity between them is very low. 82
43570 394988 pfam00031 Cystatin Cystatin domain. Very diverse family. Attempts to define separate sub-families failed. Typically, either the N-terminal or C-terminal end is very divergent. But splitting into two domains would make very short families. pfam00666 is related to this family but members have not been included. 92
43571 394989 pfam00032 Cytochrom_B_C Cytochrome b(C-terminal)/b6/petD. 101
43572 306530 pfam00033 Cytochrome_B Cytochrome b/b6/petB. 189
43573 394990 pfam00034 Cytochrom_C Cytochrome c. The Pfam entry does not include all Prosite members. The cytochrome 556 and cytochrome c' families are not included. All these are now in a new clan together. The C-terminus of DUF989, pfam06181, has now been merged into this family. 89
43574 394991 pfam00035 dsrm Double-stranded RNA binding motif. Sequences gathered for seed by HMM_iterative_training. Putative motif shared by proteins that bind to dsRNA. At least some DSRM proteins seem to bind to specific RNA targets. Exemplified by Staufen, which is involved in localization of at least five different mRNAs in the early Drosophila embryo. Also by interferon-induced protein kinase in humans, which is part of the cellular response to dsRNA. 66
43575 394992 pfam00036 EF-hand_1 EF hand. The EF-hands can be divided into two classes: signalling proteins and buffering/transport proteins. The first group is the largest and includes the most well-known members of the family such as calmodulin, troponin C and S100B. These proteins typically undergo a calcium-dependent conformational change which opens a target binding site. The latter group is represented by calbindin D9k and do not undergo calcium dependent conformational changes. 28
43576 394993 pfam00037 Fer4 4Fe-4S binding domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 24
43577 365827 pfam00038 Filament Intermediate filament protein. 313
43578 394994 pfam00039 fn1 Fibronectin type I domain. 40
43579 394995 pfam00040 fn2 Fibronectin type II domain. 42
43580 394996 pfam00041 fn3 Fibronectin type III domain. 85
43581 394997 pfam00042 Globin Globin. 109
43582 394998 pfam00043 GST_C Glutathione S-transferase, C-terminal domain. GST conjugates reduced glutathione to a variety of targets including S-crystallin from squid, the eukaryotic elongation factor 1-gamma, the HSP26 family of stress-related proteins and auxin-regulated proteins in plants. Stringent starvation proteins in E. coli are also included in the alignment but are not known to have GST activity. The glutathione molecule binds in a cleft between N and C-terminal domains. The catalytically important residues are proposed to reside in the N-terminal domain. In plants, GSTs are encoded by a large gene family (48 GST genes in Arabidopsis) and can be divided into the phi, tau, theta, zeta, and lambda classes. 93
43583 394999 pfam00044 Gp_dh_N Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold. 101
43584 395000 pfam00045 Hemopexin Hemopexin. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metallopeptidases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs). 44
43585 395001 pfam00046 Homeobox Homeobox domain. 55
43586 395002 pfam00047 ig Immunoglobulin domain. Members of the immunoglobulin superfamily are found in hundreds of proteins of different functions. Examples include antibodies, the giant muscle kinase titin and receptor tyrosine kinases. Immunoglobulin-like domains may be involved in protein-protein and protein-ligand interactions. 86
43587 395003 pfam00048 IL8 Small cytokines (intercrine/chemokine), interleukin-8 like. Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds. 60
43588 306545 pfam00049 Insulin Insulin/IGF/Relaxin family. Superfamily includes insulins; relaxins; insulin-like growth factor; and bombyxin. All are secreted regulatory hormones. Disulfide rich, all-alpha fold. Alignment includes B chain, linker (which is processed out of the final product), and A chain. 77
43589 395004 pfam00050 Kazal_1 Kazal-type serine protease inhibitor domain. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides. Alignment also includes a single domain from transporters in the OATP/PGT family. 49
43590 395005 pfam00051 Kringle Kringle domain. Kringle domains have been found in plasminogen, hepatocyte growth factors, prothrombin, and apolipoprotein A. Structure is disulfide-rich, nearly all-beta. 79
43591 395006 pfam00052 Laminin_B Laminin B (Domain IV). 136
43592 395007 pfam00053 Laminin_EGF Laminin EGF domain. This family is like pfam00008 but has 8 conserved cysteines instead of six. 49
43593 395008 pfam00054 Laminin_G_1 Laminin G domain. 131
43594 395009 pfam00055 Laminin_N Laminin N-terminal (Domain VI). 231
43595 395010 pfam00056 Ldh_1_N lactate/malate dehydrogenase, NAD binding domain. L-lactate dehydrogenases are metabolic enzymes which catalyze the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyze the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. N-terminus (this family) is a Rossmann NAD-binding fold. C-terminus is an unusual alpha+beta fold. 141
43596 395011 pfam00057 Ldl_recept_a Low-density lipoprotein receptor domain class A. 37
43597 395012 pfam00058 Ldl_recept_b Low-density lipoprotein receptor repeat class B. This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. 41
43598 395013 pfam00059 Lectin_C Lectin C-type domain. This family includes both long and short form C-type 104
43599 395014 pfam00060 Lig_chan Ligand-gated ion channel. This family includes the four transmembrane regions of the ionotropic glutamate receptors and NMDA receptors. 266
43600 395015 pfam00061 Lipocalin Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel. 143
43601 395016 pfam00062 Lys C-type lysozyme/alpha-lactalbumin family. Alpha-lactalbumin is the regulatory subunit of lactose synthase, changing the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose. C-type lysozymes are secreted bacteriolytic enzymes that cleave the peptidoglycan of bacterial cell walls. Structure is a multi-domain, mixed alpha and beta fold, containing four conserved disulfide bonds. 123
43602 395017 pfam00063 Myosin_head Myosin head (motor domain). 674
43603 395018 pfam00064 Neur Neuraminidase. Neuraminidases cleave sialic acid residues from glycoproteins. Belong to the sialidase family - but this alignment does not generalize to the other sialidases. Structure is a 6-sheet beta propeller. 334
43604 395019 pfam00066 Notch LNR domain. The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. 30
43605 395020 pfam00067 p450 Cytochrome P450. Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures. 461
43606 395021 pfam00068 Phospholip_A2_1 Phospholipase A2. Phospholipase A2 releases fatty acids from the second carbon group of glycerol. Perhaps the best known members are secreted snake venoms, but also found in secreted pancreatic and membrane-associated forms. Structure is all-alpha, with two core disulfide-linked helices and a calcium-binding loop. This alignment represents the major family of PLA2s. A second minor family, defined by the honeybee venom PLA2 Structure 1POC and related sequences from Gila monsters (Heloderma), is not recognized. This minor family conserves the core helix pair but is substantially different elsewhere. The PROSITE pattern PA2_HIS, specific to the first core helix, recognizes both families. 107
43607 395022 pfam00069 Pkinase Protein kinase domain. 217
43608 395023 pfam00070 Pyr_redox Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. 80
43609 395024 pfam00071 Ras Ras family. Includes sub-families Ras, Rab, Rac, Ral, Ran, Rap, Ypt1 and more. Shares P-loop motif with GTP_EFTU, arf and myosin_head. See pfam00009, pfam00025, pfam00063. As regards Rab GTPases, these are important regulators of vesicle formation, motility and fusion. They share a fold in common with all Ras GTPases: this is a six-stranded beta-sheet surrounded by five alpha-helices. 162
43610 395025 pfam00072 Response_reg Response regulator receiver domain. This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain. 111
43611 395026 pfam00073 Rhv picornavirus capsid protein. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur. 170
43612 395027 pfam00074 RnaseA Pancreatic ribonuclease. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices. 121
43613 395028 pfam00075 RNase_H RNase H. RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle, and often found as a domain associated with reverse transcriptases. Structure is a mixed alpha+beta fold with three a/b/a layers. 141
43614 395029 pfam00076 RRM_1 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. The C-terminal beta strand (4th strand) and final helix are hard to align and have been omitted in the SEED alignment. The LA proteins have an N terminal rrm which is included in the seed. There is a second region towards the C-terminus that has some features characteristic of a rrm but does not appear to have the important structural core of a rrm. The LA proteins are one of the main autoantigens in Systemic lupus erythematosus (SLE), an autoimmune disease. 70
43615 395030 pfam00077 RVP Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026). 99
43616 395031 pfam00078 RVT_1 Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. 189
43617 395032 pfam00079 Serpin Serpin (serine protease inhibitor). Structure is a multi-domain fold containing a bundle of helices and a beta sandwich. 367
43618 395033 pfam00080 Sod_Cu Copper/zinc superoxide dismutase (SODC). Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold. 137
43619 395034 pfam00081 Sod_Fe_N Iron/manganese superoxide dismutases, alpha-hairpin domain. Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. N-terminal domain is a long alpha antiparallel hairpin. A small fragment of YTRE_LEPBI matches well - sequencing error? 82
43620 395035 pfam00082 Peptidase_S8 Subtilase family. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567. 287
43621 395036 pfam00083 Sugar_tr Sugar (and other) transporter. 452
43622 395037 pfam00084 Sushi Sushi repeat (SCR repeat). 56
43623 395038 pfam00085 Thioredoxin Thioredoxin. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. Some members with only the active site are not separated from the noise. 103
43624 395039 pfam00086 Thyroglobulin_1 Thyroglobulin type-1 repeat. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteine 1 pairs with 2, 3 with 4 and 5 with 6. 66
43625 395040 pfam00087 Toxin_TOLIP Snake toxin and toxin-like protein. This family predominantly includes venomous neurotoxins and cytotoxins from snakes, but also structurally similar (non-snake) toxin-like proteins (TOLIPs) such as Lymphocyte antigen 6D and Ly6/PLAUR domain-containing protein. Snake toxins are short proteins with a compact, disulphide-rich structure. TOLIPs have similar structural features (abundance of spaced cysteine residues, a high frequency of charged residues, a signal peptide for secretion and a compact structure) but are not associated with a venom gland or poisonous function. They are endogenous animal proteins that are not restricted to poisonous animals. 70
43626 395041 pfam00088 Trefoil Trefoil (P-type) domain. 43
43627 395042 pfam00089 Trypsin Trypsin. 219
43628 395043 pfam00090 TSP_1 Thrombospondin type 1 domain. 49
43629 395044 pfam00091 Tubulin Tubulin/FtsZ family, GTPase domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules. 190
43630 395045 pfam00092 VWA von Willebrand factor type A domain. 174
43631 278520 pfam00093 VWC von Willebrand factor type C domain. The high cutoff was used to prevent overlap with pfam00094. 57
43632 395046 pfam00094 VWD von Willebrand factor type D domain. Luciferin monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. 155
43633 395047 pfam00095 WAP WAP-type (Whey Acidic Protein) 'four-disulfide core'. WAP belongs to the group of Elafin or elastase-specific inhibitors. 42
43634 395048 pfam00096 zf-C2H2 Zinc finger, C2H2 type. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter. 23
43635 395049 pfam00097 zf-C3HC4 Zinc finger, C3HC4 type (RING finger). The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C where X is any amino acid. Many proteins containing a RING finger play a key role in the ubiquitination pathway. 40
43636 395050 pfam00098 zf-CCHC Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger. 18
43637 395051 pfam00100 Zona_pellucida Zona pellucida-like domain. 247
43638 395052 pfam00101 RuBisCO_small Ribulose bisphosphate carboxylase, small chain. 97
43639 395053 pfam00102 Y_phosphatase Protein-tyrosine phosphatase. 234
43640 306585 pfam00103 Hormone_1 Somatotropin hormone family. 214
43641 395054 pfam00104 Hormone_recep Ligand-binding domain of nuclear hormone receptor. This all helical domain is involved in binding the hormone in these receptors. 208
43642 395055 pfam00105 zf-C4 Zinc finger, C4 type (two domains). In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other. 68
43643 395056 pfam00106 adh_short short chain dehydrogenase. This family contains a wide variety of dehydrogenases. 195
43644 395057 pfam00107 ADH_zinc_N Zinc-binding dehydrogenase. 129
43645 395058 pfam00108 Thiolase_N Thiolase, N-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase. 260
43646 395059 pfam00109 ketoacyl-synt Beta-ketoacyl synthase, N-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. The N-terminal domain contains most of the structures involved in dimer formation and also the active site cysteine. 249
43647 395060 pfam00110 wnt wnt family. Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. 304
43648 395061 pfam00111 Fer2 2Fe-2S iron-sulfur cluster binding domain. 77
43649 395062 pfam00112 Peptidase_C1 Papain family cysteine protease. 212
43650 395063 pfam00113 Enolase_C Enolase, C-terminal TIM barrel domain. 296
43651 395064 pfam00114 Pilin Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. The Prosite pattern detects those better. 108
43652 395065 pfam00115 COX1 Cytochrome C and Quinol oxidase polypeptide I. 434
43653 395066 pfam00116 COX2 Cytochrome C oxidase subunit II, periplasmic domain. 120
43654 395067 pfam00117 GATase Glutamine amidotransferase class-I. 188
43655 395068 pfam00118 Cpn60_TCP1 TCP-1/cpn60 chaperonin family. This family includes members from the HSP60 chaperone family and the TCP-1 (T-complex protein) family. 489
43656 395069 pfam00119 ATP-synt_A ATP synthase A chain. 209
43657 395070 pfam00120 Gln-synt_C Glutamine synthetase, catalytic domain. 343
43658 395071 pfam00121 TIM Triosephosphate isomerase. 243
43659 395072 pfam00122 E1-E2_ATPase E1-E2 ATPase. 181
43660 395073 pfam00123 Hormone_2 Peptide hormone. This family contains glucagon, GIP, secretin and VIP. 28
43661 395074 pfam00124 Photo_RC Photosynthetic reaction centre protein. 258
43662 333859 pfam00125 Histone Core histone H2A/H2B/H3/H4. 127
43663 395075 pfam00126 HTH_1 Bacterial regulatory helix-turn-helix protein, lysR family. 60
43664 395076 pfam00127 Copper-bind Copper binding proteins, plastocyanin/azurin family. 99
43665 395077 pfam00128 Alpha-amylase Alpha amylase, catalytic domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain. 334
43666 395078 pfam00129 MHC_I Class I Histocompatibility antigen, domains alpha 1 and 2. 179
43667 395079 pfam00130 C1_1 Phorbol esters/diacylglycerol binding domain (C1 domain). This domain is also known as the Protein kinase C conserved region 1 (C1) domain. 53
43668 395080 pfam00131 Metallothio Metallothionein. 65
43669 395081 pfam00132 Hexapep Bacterial transferase hexapeptide (six repeats). 30
43670 395082 pfam00133 tRNA-synt_1 tRNA synthetases class I (I, L, M and V). Other tRNA synthetase sub-families are too dissimilar to be included. 603
43671 395083 pfam00134 Cyclin_N Cyclin, N-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). Cyclin-0 (CCNO) is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds with the N-terminal domain. 127
43672 395084 pfam00135 COesterase Carboxylesterase family. 513
43673 395085 pfam00136 DNA_pol_B DNA polymerase family B. This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation, DNA-binding and dNTP binding activities. 439
43674 395086 pfam00137 ATP-synt_C ATP synthase subunit C. 60
43675 395087 pfam00139 Lectin_legB Legume lectin domain. 244
43676 395088 pfam00140 Sigma70_r1_2 Sigma-70 factor, region 1.2. 33
43677 395089 pfam00141 peroxidase Peroxidase. 187
43678 395090 pfam00142 Fer4_NifH 4Fe-4S iron sulfur cluster binding proteins, NifH/frxC family. 271
43679 395091 pfam00143 Interferon Interferon alpha/beta domain. 160
43680 395092 pfam00144 Beta-lactamase Beta-lactamase. This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 327
43681 395093 pfam00145 DNA_methylase C-5 cytosine-specific DNA methylase. 324
43682 395094 pfam00146 NADHdh NADH dehydrogenase. 302
43683 395095 pfam00147 Fibrinogen_C Fibrinogen beta and gamma chains, C-terminal globular domain. 221
43684 395096 pfam00148 Oxidored_nitro Nitrogenase component 1 type Oxidoreductase. 398
43685 395097 pfam00149 Metallophos Calcineurin-like phosphoesterase. This family includes a diverse range of phosphoesterases, including protein phosphoserine phosphatases, nucleotidases, sphingomyelin phosphodiesterases and 2'-3' cAMP phosphodiesterases as well as nucleases such as bacterial SbcD or yeast MRE11. The most conserved regions in this superfamily centre around the metal chelating residues. 191
43686 395098 pfam00150 Cellulase Cellulase (glycosyl hydrolase family 5). 272
43687 395099 pfam00151 Lipase Lipase. 336
43688 395100 pfam00152 tRNA-synt_2 tRNA synthetases class II (D, K and N). 315
43689 395101 pfam00153 Mito_carr Mitochondrial carrier protein. 96
43690 395102 pfam00154 RecA recA bacterial DNA recombination protein. RecA is a DNA-dependent ATPase and functions in DNA repair systems. RecA protein catalyzes an ATP-dependent DNA strand-exchange reaction that is the central step in the repair of dsDNA breaks by homologous recombination. 262
43691 395103 pfam00155 Aminotran_1_2 Aminotransferase class I and II. 351
43692 395104 pfam00156 Pribosyltran Phosphoribosyl transferase domain. This family includes a range of diverse phosphoribosyl transferase enzymes. This family includes: Adenine phosphoribosyl-transferase EC:2.4.2.7, Hypoxanthine-guanine-xanthine phosphoribosyl-transferase, Hypoxanthine phosphoribosyl-transferase EC:2.4.2.8. Ribose-phosphate pyrophosphokinase i EC:2.7.6.1. Amidophosphoribosyltransferase EC:2.4.2.14. Orotate phosphoribosyl-transferase EC:2.4.2.10, Uracil phosphoribosyl-transferase EC:2.4.2.9, Xanthine-guanine phosphoribosyl-transferase EC:2.4.2.22. In Arabidopsis, at the very N-terminus of this domain is the P-Loop NTPase domain. 150
43693 395105 pfam00157 Pou Pou domain - N-terminal to homeobox domain. 69
43694 395106 pfam00158 Sigma54_activat Sigma-54 interaction domain. 168
43695 395107 pfam00159 Hormone_3 Pancreatic hormone peptide. 35
43696 395108 pfam00160 Pro_isomerase Cyclophilin type peptidyl-prolyl cis-trans isomerase/CLD. The peptidyl-prolyl cis-trans isomerases, also known as cyclophilins, share this domain of about 109 amino acids. Cyclophilins have been found in all organisms studied so far and catalyze peptidyl-prolyl isomerisation during which the peptide bond preceding proline (the peptidyl-prolyl bond) is stabilized in the cis conformation. Mammalian cyclophilin A (CypA) is a major cellular target for the immunosuppressive drug cyclosporin A (CsA). Other roles for cyclophilins may include chaperone and cell signalling function. 150
43697 395109 pfam00161 RIP Ribosome inactivating protein. 198
43698 395110 pfam00162 PGK Phosphoglycerate kinase. 370
43699 395111 pfam00163 Ribosomal_S4 Ribosomal protein S4/S9 N-terminal domain. This family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA. This domain is composed of four helices in the known structure. However the domain is discontinuous in sequence and the alignment for this family contains only the first three helices. 87
43700 395112 pfam00164 Ribosom_S12_S23 Ribosomal protein S12/S23. This protein is known as S12 in bacteria and archaea and S23 in eukaryotes. 114
43701 395113 pfam00165 HTH_AraC Bacterial regulatory helix-turn-helix proteins, AraC family. In the absence of arabinose, the N-terminal arm of AraC binds to the DNA binding domain (pfam00165) and helps to hold the two DNA binding domains in a relative orientation that favours DNA looping. In the presence of arabinose, the arms bind over the arabinose on the dimerization domain, thus freeing the DNA-binding domains. The freed DNA-binding domains are then able to assume a conformation suitable for binding to the adjacent DNA sites that are utilized when AraC activates transcription, and hence AraC ceases looping the DNA when arabinose is added. 42
43702 395114 pfam00166 Cpn10 Chaperonin 10 Kd subunit. This family contains GroES and Gp31-like chaperonins. Gp31 is a functional co-chaperonin that is required for the folding and assembly of Gp23, a major capsid protein, during phage morphogenesis. 92
43703 395115 pfam00167 FGF Fibroblast growth factor. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. They are found in a wide range of organisms, from nematodes to humans. Most share an internal core region of high similarity, conserved residues in which are involved in binding with their receptors. On binding, they cause dimerization of their tyrosine kinase receptors leading to intracellular signalling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. Most have N-terminal signal peptides and are secreted. A few lack signal sequences but are secreted anyway; still others also lack the signal peptide but are found on the cell surface and within the extracellular matrix. A third group remain intracellular. They have central roles in development, regulating cell proliferation, migration and differentiation. On the other hand, they are important in tissue repair following injury in adult organisms. 124
43704 395116 pfam00168 C2 C2 domain. 104
43705 395117 pfam00169 PH PH domain. PH stands for pleckstrin homology. 105
43706 395118 pfam00170 bZIP_1 bZIP transcription factor. The Pfam entry includes the basic region and the leucine zipper region. 60
43707 395119 pfam00171 Aldedh Aldehyde dehydrogenase family. This family of dehydrogenases acts on aldehyde substrates. Members use NADP as a cofactor. The family includes the following members: The prototypical members are the aldehyde dehydrogenases EC:1.2.1.3. Succinate-semialdehyde dehydrogenase EC:1.2.1.16. Lactaldehyde dehydrogenase EC:1.2.1.22. Benzaldehyde dehydrogenase EC:1.2.1.28. Methylmalonate-semialdehyde dehydrogenase EC:1.2.1.27. Glyceraldehyde-3-phosphate dehydrogenase EC:1.2.1.9. Delta-1-pyrroline-5-carboxylate dehydrogenase EC:1.5.1.12. Acetaldehyde dehydrogenase EC:1.2.1.10. Glutamate-5-semialdehyde dehydrogenase EC:1.2.1.41. This family also includes omega crystallin, an eye lens protein from squid and octopus that has little aldehyde dehydrogenase activity. 458
43708 395120 pfam00172 Zn_clus Fungal Zn(2)-Cys(6) binuclear cluster domain. 39
43709 395121 pfam00173 Cyt-b5 Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. 74
43710 395122 pfam00174 Oxidored_molyb Oxidoreductase molybdopterin binding domain. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity. 168
43711 395123 pfam00175 NAD_binding_1 Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, that also bind FAD/NAD, have essentially no similarity. 109
43712 395124 pfam00176 SNF2_N SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1). 305
43713 395125 pfam00177 Ribosomal_S7 Ribosomal protein S7p/S5e. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes. 132
43714 395126 pfam00178 Ets Ets-domain. 80
43715 395127 pfam00179 UQ_con Ubiquitin-conjugating enzyme. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologs. TSG101 is one of several UBC homologs that lacks this active site cysteine. 139
43716 395128 pfam00180 Iso_dh Isocitrate/isopropylmalate dehydrogenase. 349
43717 395129 pfam00181 Ribosomal_L2 Ribosomal Proteins L2, RNA binding domain. 77
43718 395130 pfam00182 Glyco_hydro_19 Chitinase class I. 232
43719 395131 pfam00183 HSP90 Hsp90 protein. 515
43720 395132 pfam00184 Hormone_5 Neurohypophysial hormones, C-terminal Domain. N-terminal Domain is in hormone4 79
43721 395133 pfam00185 OTCace Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. 156
43722 395134 pfam00186 DHFR_1 Dihydrofolate reductase. 159
43723 395135 pfam00187 Chitin_bind_1 Chitin recognition protein. 36
43724 395136 pfam00188 CAP Cysteine-rich secretory protein family. This is a large family of cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) that are found in a wide range of organisms, including prokaryotes and non-vertebrate eukaryotes. The nine subfamilies of the mammalian CAP 'super'family include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumor suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP 'super'family results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how the cysteine-rich venom protein helothermine blocks the Ca++ transporting ryanodine receptors. 117
43725 395137 pfam00189 Ribosomal_S3_C Ribosomal protein S3, C-terminal domain. This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer. 83
43726 395138 pfam00190 Cupin_1 Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. 151
43727 395139 pfam00191 Annexin Annexin. This family of annexins also includes giardin that has been shown to function as an annexin. 66
43728 395140 pfam00193 Xlink Extracellular link domain. 93
43729 395141 pfam00194 Carb_anhydrase Eukaryotic-type carbonic anhydrase. 232
43730 395142 pfam00195 Chal_sti_synt_N Chalcone and stilbene synthases, N-terminal domain. The C-terminal domain of Chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in this N-terminal domain. 225
43731 395143 pfam00196 GerE Bacterial regulatory proteins, luxR family. 57
43732 395144 pfam00197 Kunitz_legume Trypsin and protease inhibitor. 174
43733 395145 pfam00198 2-oxoacid_dh 2-oxoacid dehydrogenases acyltransferase (catalytic domain). These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. 212
43734 395146 pfam00199 Catalase Catalase. 383
43735 395147 pfam00200 Disintegrin Disintegrin. 75
43736 278624 pfam00201 UDPGT UDP-glucuronosyl and UDP-glucosyl transferase. 499
43737 395148 pfam00202 Aminotran_3 Aminotransferase class-III. 397
43738 395149 pfam00203 Ribosomal_S19 Ribosomal protein S19. 80
43739 395150 pfam00204 DNA_gyraseB DNA gyrase B. This family represents the second domain of DNA gyrase B which has a ribosomal S5 domain 2-like fold. This family is structurally related to PF01119. 173
43740 395151 pfam00205 TPP_enzyme_M Thiamine pyrophosphate enzyme, central domain. The central domain of TPP enzymes contains a 2-fold Rossmann fold. 137
43741 395152 pfam00206 Lyase_1 Lyase. 312
43742 395153 pfam00207 A2M Alpha-2-macroglobulin family. This family includes the C-terminal region of the alpha-2-macroglobulin family. 91
43743 395154 pfam00208 ELFV_dehydrog Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. 240
43744 395155 pfam00209 SNF Sodium:neurotransmitter symporter family. These are twelve xTM-containing region transporters. 517
43745 395156 pfam00210 Ferritin Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins. 141
43746 306677 pfam00211 Guanylate_cyc Adenylate and Guanylate cyclase catalytic domain. 183
43747 395157 pfam00212 ANP Atrial natriuretic peptide. 31
43748 395158 pfam00213 OSCP ATP synthase delta (OSCP) subunit. The ATP D subunit from E. coli is the same as the OSCP subunit which is this family. The ATP D subunits from metazoa are found in family pfam00401. 171
43749 395159 pfam00214 Calc_CGRP_IAPP Calcitonin / CGRP / IAPP family. 124
43750 395160 pfam00215 OMPdecase Orotidine 5'-phosphate decarboxylase / HUMPS family. This family includes Orotidine 5'-phosphate decarboxylase enzymes EC:4.1.1.23 that are involved in the final step of pyrimidine biosynthesis. The family also includes enzymes such as hexulose-6-phosphate synthase. This family appears to be distantly related to pfam00834. 215
43751 395161 pfam00216 Bac_DNA_binding Bacterial DNA-binding protein. 88
43752 395162 pfam00217 ATP-gua_Ptrans ATP:guanido phosphotransferase, C-terminal catalytic domain. The substrate binding site is located in the cleft between N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. 212
43753 395163 pfam00218 IGPS Indole-3-glycerol phosphate synthase. 252
43754 395164 pfam00219 IGFBP Insulin-like growth factor binding protein. 53
43755 395165 pfam00220 Hormone_4 Neurohypophysial hormones, N-terminal Domain. C-terminal is in hormone5 9
43756 395166 pfam00221 Lyase_aromatic Aromatic amino acid lyase. This family includes proteins with phenylalanine ammonia-lyase, EC:4.3.1.24, histidine ammonia-lyase, EC:4.3.1.3, and tyrosine aminomutase, EC:5.4.3.6, activities. 465
43757 395167 pfam00223 PsaA_PsaB Photosystem I psaA/psaB protein. 712
43758 395168 pfam00224 PK Pyruvate kinase, barrel domain. This domain is actually a small beta-barrel domain nested within a larger TIM barrel. The active site is found in a cleft between the two domains. 348
43759 395169 pfam00225 Kinesin Kinesin motor domain. 326
43760 395170 pfam00226 DnaJ DnaJ domain. DnaJ domains (J-domains) are associated with hsp70 heat-shock system and it is thought that this domain mediates the interaction. DnaJ-domain is therefore part of a chaperone (protein folding) system. The T-antigens, although not in Prosite are confirmed as DnaJ containing domains from literature. 63
43761 395171 pfam00227 Proteasome Proteasome subunit. The proteasome is a multisubunit structure that degrades proteins. Protein degradation is an essential component of regulation because proteins can become misfolded, damaged, or unnecessary. Proteasomes and their homologs vary greatly in complexity: from HslV (heat shock locus v), which is encoded by 1 gene in bacteria, to the eukaryotic 20S proteasome, which is encoded by more than 14 genes. Recently, evidence of two novel groups of bacterial proteasomes was proposed. The first is Anbu, which is sparsely distributed among cyanobacteria and proteobacteria. The second is called beta-proteobacteria proteasome homolog (BPH). 188
43762 395172 pfam00228 Bowman-Birk_leg Bowman-Birk serine protease inhibitor family. 24
43763 395173 pfam00229 TNF TNF (Tumor Necrosis Factor) family. 127
43764 395174 pfam00230 MIP Major intrinsic protein. MIP (Major Intrinsic Protein) family proteins exhibit essentially two distinct types of channel properties: (1) specific water transport by the aquaporins, and (2) small neutral solutes transport, such as glycerol by the glycerol facilitators. 223
43765 395175 pfam00231 ATP-synt ATP synthase. 285
43766 395176 pfam00232 Glyco_hydro_1 Glycosyl hydrolase family 1. 453
43767 395177 pfam00233 PDEase_I 3'5'-cyclic nucleotide phosphodiesterase. 236
43768 395178 pfam00234 Tryp_alpha_amyl Protease inhibitor/seed storage/LTP family. This family is composed of trypsin-alpha amylase inhibitors, seed storage proteins and lipid transfer proteins from plants. 75
43769 395179 pfam00235 Profilin Profilin. 124
43770 395180 pfam00236 Hormone_6 Glycoprotein hormone. 89
43771 395181 pfam00237 Ribosomal_L22 Ribosomal protein L22p/L17e. This family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes. 104
43772 395182 pfam00238 Ribosomal_L14 Ribosomal protein L14p/L23e. 119
43773 395183 pfam00239 Resolvase Resolvase, N terminal domain. The N-terminal domain of the resolvase family (this family) contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase - see pfam02796. 144
43774 395184 pfam00240 ubiquitin Ubiquitin family. This family contains a number of ubiquitin-like proteins: SUMO (smt3 homolog), Nedd8, Elongin B, Rub1, and Parkin. A number of them are thought to carry a distinctive five-residue motif termed the proteasome-interacting motif (PIM), which may have a biologically significant role in protein delivery to proteasomes and recruitment of proteasomes to transcription sites. 72
43775 395185 pfam00241 Cofilin_ADF Cofilin/tropomyosin-type actin-binding protein. Severs actin filaments and binds to actin monomers. 124
43776 365972 pfam00242 DNA_pol_viral_N DNA polymerase (viral) N-terminal domain. 401
43777 395186 pfam00243 NGF Nerve growth factor family. 111
43778 395187 pfam00244 14-3-3 14-3-3 protein. 221
43779 395188 pfam00245 Alk_phosphatase Alkaline phosphatase. 418
43780 395189 pfam00246 Peptidase_M14 Zinc carboxypeptidase. 287
43781 395190 pfam00248 Aldo_ket_red Aldo/keto reductase family. This family includes a number of K+ ion channel beta chain regulatory domains - these are reported to have oxidoreductase activity. 290
43782 395191 pfam00249 Myb_DNA-binding Myb-like DNA-binding domain. This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family. 46
43783 395192 pfam00250 Forkhead Forkhead domain. 86
43784 395193 pfam00251 Glyco_hydro_32N Glycosyl hydrolases family 32 N-terminal domain. This domain corresponds to the N-terminal domain of glycosyl hydrolase family 32 which forms a five bladed beta propeller structure. 308
43785 395194 pfam00252 Ribosomal_L16 Ribosomal protein L16p/L10e. 132
43786 395195 pfam00253 Ribosomal_S14 Ribosomal protein S14p/S29e. This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. 54
43787 395196 pfam00254 FKBP_C FKBP-type peptidyl-prolyl cis-trans isomerase. 94
43788 395197 pfam00255 GSHPx Glutathione peroxidase. 108
43789 395198 pfam00257 Dehydrin Dehydrin. 144
43790 395199 pfam00258 Flavodoxin_1 Flavodoxin. 141
43791 365984 pfam00260 Protamine_P1 Protamine P1. 49
43792 395200 pfam00261 Tropomyosin Tropomyosin. Tropomyosin is an alpha-helical protein that forms a coiled-coil structure of 2 parallel helices containing 2 sets of 7 alternating actin binding sites. The protein is best known for its role in regulating the interaction between actin and myosin in muscle contraction, but is also involved in the organisation and dynamics of the cytoskeleton in non-muscle cells. There are multiple cell-specific isoforms, expressed by alternative promoters and alternative RNA processing of at least four genes. Muscle isoforms of tropomyosin are characterized by having 284 amino acid residues and a highly conserved N-terminal region, whereas non-muscle forms are generally smaller and are heterogeneous in their N-terminal region. 235
43793 395201 pfam00262 Calreticulin Calreticulin family. 369
43794 395202 pfam00263 Secretin Bacterial type II and III secretion system protein. 161
43795 395203 pfam00264 Tyrosinase Common central domain of tyrosinase. This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. 209
43796 306721 pfam00265 TK Thymidine kinase. 176
43797 395204 pfam00266 Aminotran_5 Aminotransferase class-V. This domain is found in amino transferases, and other enzymes including cysteine desulphurase EC:4.4.1.-. 368
43798 395205 pfam00267 Porin_1 Gram-negative porin. 335
43799 395206 pfam00268 Ribonuc_red_sm Ribonucleotide reductase, small chain. 276
43800 395207 pfam00269 SASP Small, acid-soluble spore proteins, alpha/beta type. 58
43801 395208 pfam00270 DEAD DEAD/DEAH box helicase. Members of this family include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression. 165
43802 395209 pfam00271 Helicase_C Helicase conserved C-terminal domain. The Prosite family is restricted to DEAD/H helicases, whereas this domain family is found in a wide variety of helicases and helicase related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase. 109
43803 395210 pfam00272 Cecropin Cecropin family. 30
43804 395211 pfam00273 Serum_albumin Serum albumin family. 176
43805 395212 pfam00274 Glycolytic Fructose-bisphosphate aldolase class-I. 349
43806 395213 pfam00275 EPSP_synthase EPSP synthase (3-phosphoshikimate 1-carboxyvinyltransferase). 415
43807 395214 pfam00276 Ribosomal_L23 Ribosomal protein L23. 85
43808 395215 pfam00277 SAA Serum amyloid A protein. 101
43809 395216 pfam00278 Orn_DAP_Arg_deC Pyridoxal-dependent decarboxylase, C-terminal sheet domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. 346
43810 395217 pfam00280 potato_inhibit Potato inhibitor I family. 64
43811 395218 pfam00281 Ribosomal_L5 Ribosomal protein L5. 56
43812 395219 pfam00282 Pyridoxal_deC Pyridoxal-dependent decarboxylase conserved domain. 373
43813 395220 pfam00283 Cytochrom_B559 Cytochrome b559, alpha (gene psbE) and beta (gene psbF) subunits. 29
43814 395221 pfam00284 Cytochrom_B559a Lumenal portion of Cytochrome b559, alpha (gene psbE) subunit. This family is the lumenal portion of cytochrome b559 alpha chain; matches to this family should be accompanied by a match to the pfam00283 family also. The Prosite pattern matches the transmembrane region of the cytochrome b559 alpha and beta subunits. 38
43815 395222 pfam00285 Citrate_synt Citrate synthase, C-terminal domain. This is the long, C-terminal part of the enzyme. 358
43816 395223 pfam00286 Flexi_CP Viral coat protein. Family includes coat proteins from Potexviruses and carlaviruses. 138
43817 395224 pfam00287 Na_K-ATPase Sodium / potassium ATPase beta chain. 272
43818 395225 pfam00288 GHMP_kinases_N GHMP kinases N terminal domain. This family includes homoserine kinases, galactokinases and mevalonate kinases. 64
43819 395226 pfam00289 Biotin_carb_N Biotin carboxylase, N-terminal domain. This domain is structurally related to the PreATP-grasp domain. The family contains the N-terminus of biotin carboxylase enzymes, and propionyl-CoA carboxylase A chain. 109
43820 395227 pfam00290 Trp_syntA Tryptophan synthase alpha chain. 258
43821 395228 pfam00291 PALP Pyridoxal-phosphate dependent enzyme. Members of this family are all pyridoxal-phosphate dependent enzymes. This family includes: serine dehydratase EC:4.2.1.13 P20132, threonine dehydratase EC:4.2.1.16, tryptophan synthase beta chain EC:4.2.1.20, threonine synthase EC:4.2.99.2, cysteine synthase EC:4.2.99.8 P11096, cystathionine beta-synthase EC:4.2.1.22, 1-aminocyclopropane-1-carboxylate deaminase EC:4.1.99.4. 295
43822 366005 pfam00292 PAX 'Paired box' domain. 125
43823 395229 pfam00293 NUDIX NUDIX domain. 132
43824 395230 pfam00294 PfkB pfkB family carbohydrate kinase. This family includes a variety of carbohydrate and pyrimidine kinases. 289
43825 395231 pfam00295 Glyco_hydro_28 Glycosyl hydrolases family 28. Glycosyl hydrolase family 28 includes polygalacturonase EC:3.2.1.15 as well as rhamnogalacturonase A (RGase A), EC:3.2.1.-. These enzymes are important in cell wall metabolism. 321
43826 395232 pfam00296 Bac_luciferase Luciferase-like monooxygenase. 314
43827 395233 pfam00297 Ribosomal_L3 Ribosomal protein L3. 369
43828 395234 pfam00298 Ribosomal_L11 Ribosomal protein L11, RNA binding domain. 69
43829 395235 pfam00299 Squash Squash family serine protease inhibitor. 29
43830 395236 pfam00300 His_Phos_1 Histidine phosphatase superfamily (branch 1). The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The larger branch 1 contains a wide variety of catalytic functions, the best known being fructose 2,6-bisphosphatase (found in a bifunctional protein with 2-phosphofructokinase) and cofactor-dependent phosphoglycerate mutase. The latter is an unusual example of a mutase activity in the superfamily: the vast majority of members appear to be phosphatases. The bacterial regulatory protein phosphatase SixA is also in branch 1 and has a minimal, and possibly ancestral-like structure, lacking the large domain insertions that contribute to binding of small molecules in branch 1 members. 194
43831 395237 pfam00301 Rubredoxin Rubredoxin. 47
43832 395238 pfam00302 CAT Chloramphenicol acetyltransferase. 203
43833 395239 pfam00303 Thymidylat_synt Thymidylate synthase. This is a family of proteins that are flavin-dependent thymidylate synthases. 262
43834 395240 pfam00304 Gamma-thionin Gamma-thionin family. 44
43835 395241 pfam00305 Lipoxygenase Lipoxygenase. 672
43836 395242 pfam00306 ATP-synt_ab_C ATP synthase alpha/beta chain, C terminal domain. 126
43837 395243 pfam00307 CH Calponin homology (CH) domain. The CH domain is found in both cytoskeletal proteins and signal transduction proteins. The CH domain is involved in actin binding in some members of the family. However in calponins there is evidence that the CH domain is not involved in its actin binding activity. Most member proteins have from two to four copies of the CH domain, however some proteins such as calponin have only a single copy. 109
43838 278724 pfam00308 Bac_DnaA Bacterial dnaA protein. 219
43839 395244 pfam00309 Sigma54_AID Sigma-54 factor, Activator interacting domain (AID). The sigma-54 holoenzyme is an enhancer dependent form of the RNA polymerase. The AID is necessary for activator interaction. In addition, the AID also inhibits transcription initiation in the sigma-54 holoenzyme prior to interaction with the activator. 39
43840 395245 pfam00310 GATase_2 Glutamine amidotransferases class-II. 420
43841 395246 pfam00311 PEPcase Phosphoenolpyruvate carboxylase. 794
43842 395247 pfam00312 Ribosomal_S15 Ribosomal protein S15. 76
43843 278729 pfam00313 CSD 'Cold-shock' DNA-binding domain. 66
43844 395248 pfam00314 Thaumatin Thaumatin family. 211
43845 395249 pfam00316 FBPase Fructose-1-6-bisphosphatase, N-terminal domain. This family represents the N-terminus of this protein family. 189
43846 395250 pfam00317 Ribonuc_red_lgN Ribonucleotide reductase, all-alpha domain. 79
43847 395251 pfam00318 Ribosomal_S2 Ribosomal protein S2. 216
43848 395252 pfam00319 SRF-TF SRF-type transcription factor (DNA-binding and dimerization domain). 47
43849 395253 pfam00320 GATA GATA zinc finger. This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However there are several proteins which only contain a single copy of the domain. 36
43850 395254 pfam00321 Thionin Plant thionin. 46
43851 366026 pfam00322 Endothelin Endothelin family. 29
43852 395255 pfam00323 Defensin_1 Mammalian defensin. 29
43853 366028 pfam00324 AA_permease Amino acid permease. 467
43854 395256 pfam00325 Crp Bacterial regulatory proteins, crp family. 32
43855 395257 pfam00326 Peptidase_S9 Prolyl oligopeptidase family. 213
43856 395258 pfam00327 Ribosomal_L30 Ribosomal protein L30p/L7e. This family includes prokaryotic L30 and eukaryotic L7. 51
43857 395259 pfam00328 His_Phos_2 Histidine phosphatase superfamily (branch 2). The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The smaller branch 2 contains predominantly eukaryotic proteins. The catalytic functions in members include phytase, glucose-1-phosphatase and multiple inositol polyphosphate phosphatase. The in vivo roles of the mammalian acid phosphatases in branch 2 are not fully understood, although activity against lysophosphatidic acid and tyrosine-phosphorylated proteins has been demonstrated. 356
43858 395260 pfam00329 Complex1_30kDa Respiratory-chain NADH dehydrogenase, 30 Kd subunit. 120
43859 395261 pfam00330 Aconitase Aconitase family (aconitate hydratase). 458
43860 395262 pfam00331 Glyco_hydro_10 Glycosyl hydrolase family 10. 310
43861 366033 pfam00332 Glyco_hydro_17 Glycosyl hydrolases family 17. 309
43862 395263 pfam00333 Ribosomal_S5 Ribosomal protein S5, N-terminal domain. 65
43863 395264 pfam00334 NDK Nucleoside diphosphate kinase. 135
43864 395265 pfam00335 Tetraspannin Tetraspanin family. 221
43865 366036 pfam00336 DNA_pol_viral_C DNA polymerase (viral) C-terminal domain. 241
43866 395266 pfam00337 Gal-bind_lectin Galactoside-binding lectin. This family contains galactoside binding lectins. The family also includes enzymes such as human eosinophil lysophospholipase (EC:3.1.1.5). 131
43867 395267 pfam00338 Ribosomal_S10 Ribosomal protein S10p/S20e. This family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes. 98
43868 395268 pfam00339 Arrestin_N Arrestin (or S-antigen), N-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain. 146
43869 395269 pfam00340 IL1 Interleukin-1 / 18. This family includes interleukin-1 and interleukin-18. 119
43870 395270 pfam00341 PDGF PDGF/VEGF domain. 82
43871 395271 pfam00342 PGI Phosphoglucose isomerase. Phosphoglucose isomerase catalyzes the interconversion of glucose-6-phosphate and fructose-6-phosphate. 487
43872 395272 pfam00343 Phosphorylase Carbohydrate phosphorylase. The members of this family catalyze the formation of glucose 1-phosphate from one of the following polyglucoses; glycogen, starch, glucan or maltodextrin. 661
43873 395273 pfam00344 SecY SecY translocase. 313
43874 395274 pfam00345 PapD_N Pili and flagellar-assembly chaperone, PapD N-terminal domain. C2 domain-like beta-sandwich fold. This domain is the n-terminal part of the PapD chaperone protein for pilus and flagellar assembly. 121
43875 395275 pfam00346 Complex1_49kDa Respiratory-chain NADH dehydrogenase, 49 Kd subunit. 270
43876 395276 pfam00347 Ribosomal_L6 Ribosomal protein L6. 78
43877 395277 pfam00348 polyprenyl_synt Polyprenyl synthetase. 252
43878 395278 pfam00349 Hexokinase_1 Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam03727. Some members of the family have two copies of each of these domains. 194
43879 395279 pfam00350 Dynamin_N Dynamin family. 168
43880 395280 pfam00351 Biopterin_H Biopterin-dependent aromatic amino acid hydroxylase. This family includes phenylalanine-4-hydroxylase, the phenylketonuria disease protein. 331
43881 395281 pfam00352 TBP Transcription factor TFIID (or TATA-binding protein, TBP). 83
43882 395282 pfam00353 HemolysinCabind Haemolysin-type calcium-binding repeat (2 copies). 36
43883 278768 pfam00354 Pentaxin Pentaxin family. Pentaxins are also known as pentraxins. 194
43884 395283 pfam00355 Rieske Rieske [2Fe-2S] domain. The rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. In hyperthermophilic archaea there is a SKTPCX(2-3)C motif at the C-terminus. The cysteines in this motif form a disulphide bridge, which stabilizes the protein. 88
43885 306791 pfam00356 LacI Bacterial regulatory proteins, lacI family. 46
43886 395284 pfam00357 Integrin_alpha Integrin alpha cytoplasmic region. This family contains the short intracellular region of integrin alpha chains. 15
43887 395285 pfam00358 PTS_EIIA_1 Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 1. 122
43888 395286 pfam00359 PTS_EIIA_2 Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 2. 139
43889 395287 pfam00360 PHY Phytochrome region. Phytochromes are red/far-red photochromic biliprotein photoreceptors which regulate plant development. They are widely represented in both photosynthetic and non-photosynthetic bacteria and are known in a variety of fungi. Although sequence similarities are low, this domain is structurally related to pfam01590, which is generally located immediately N-terminal to this domain. Compared with pfam01590, this domain carries an additional tongue-like hairpin loop between the fifth beta-sheet and the sixth alpha-helix which functions to seal the chromophore pocket and stabilize the photoactivated far-red-absorbing state (Pfr). The tongue carries a conserved PRxSF motif, from which an arginine finger points into the chromophore pocket close to ring D forming a salt bridge with a conserved aspartate residue. 182
43890 366050 pfam00361 Proton_antipo_M Proton-conducting membrane transporter. This is a family of membrane transporters that includes some 7 of potentially 14-16 TM regions. In many instances the family forms part of complex I that catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane, and in this context is a combination predominantly of subunits 2, 4, 5, 14, L, M and N. In many bacterial species these proteins are probable stand-alone transporters not coupled with oxidoreduction. The family in total represents homologs across the phyla. 291
43891 395288 pfam00362 Integrin_beta Integrin beta chain VWA domain. Integrins have been found in animals and their homologs have also been found in cyanobacteria, probably due to horizontal gene transfer. This domain corresponds to the integrin beta VWA domain. 248
43892 395289 pfam00363 Casein Casein. 87
43893 395290 pfam00364 Biotin_lipoyl Biotin-requiring enzyme. This family covers two Prosite entries, the conserved lysine residue binds biotin in one group and lipoic acid in the other. Note that the HMM does not currently recognize the Glycine cleavage system H proteins. 73
43894 395291 pfam00365 PFK Phosphofructokinase. 271
43895 395292 pfam00366 Ribosomal_S17 Ribosomal protein S17. 68
43896 395293 pfam00367 PTS_EIIB Phosphotransferase system, EIIB. 34
43897 395294 pfam00368 HMG-CoA_red Hydroxymethylglutaryl-coenzyme A reductase. The HMG-CoA reductases catalyze the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in the synthesis of isoprenoids like cholesterol. Probably because of the critical role of this enzyme in cholesterol homeostasis, mammalian HMG-CoA reductase is heavily regulated at the transcriptional, translational, and post-translational levels. 368
43898 395295 pfam00370 FGGY_N FGGY family of carbohydrate kinases, N-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain. 245
43899 395296 pfam00372 Hemocyanin_M Hemocyanin, copper containing domain. This family includes arthropod hemocyanins and insect larval storage proteins. 268
43900 395297 pfam00373 FERM_M FERM central domain. This domain is the central structural domain of the FERM domain. 116
43901 395298 pfam00374 NiFeSe_Hases Nickel-dependent hydrogenase. 495
43902 395299 pfam00375 SDF Sodium:dicarboxylate symporter family. 388
43903 395300 pfam00376 MerR MerR family regulatory protein. 38
43904 395301 pfam00377 Prion Prion/Doppel alpha-helical domain. The prion protein is thought to be the infectious agent that causes transmissible spongiform encephalopathies, such as scrapie and BSE. It is thought that the prion protein can exist in two different forms: one is the normal cellular protein, and the other is the infectious form which can change the normal prion protein into the infectious form. It has been found that the prion alpha-helical domain is also found in the Doppel protein. 116
43905 395302 pfam00378 ECH_1 Enoyl-CoA hydratase/isomerase. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitine racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. 251
43906 395303 pfam00379 Chitin_bind_4 Insect cuticle protein. Many insect cuticular proteins include a 35-36 amino acid motif known as the R&R consensus. The extensive conservation of this region led to the suggestion that it functions to bind chitin. Provocatively, it has no sequence similarity to the well-known cysteine-containing chitin-binding domain found in chitinases and some peritrophic membrane proteins. Chitin binding has been shown experimentally for this region. Thus arthropods have two distinct classes of chitin binding proteins, those with the chitin-binding domain found in lectins, chitinases and peritrophic membranes (cysCBD) and those with the cuticular protein chitin-binding domain (non-cysCBD). 52
43907 395304 pfam00380 Ribosomal_S9 Ribosomal protein S9/S16. This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes. 121
43908 395305 pfam00381 PTS-HPr PTS HPr component phosphorylation site. 79
43909 395306 pfam00382 TFIIB Transcription factor TFIIB repeat. 71
43910 395307 pfam00383 dCMP_cyt_deam_1 Cytidine and deoxycytidylate deaminase zinc-binding region. 100
43911 395308 pfam00384 Molybdopterin Molybdopterin oxidoreductase. 359
43912 395309 pfam00385 Chromo Chromo (CHRomatin Organisation MOdifier) domain. 52
43913 395310 pfam00386 C1q C1q domain. C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 126
43914 395311 pfam00387 PI-PLC-Y Phosphatidylinositol-specific phospholipase C, Y domain. This associates with pfam00388 to form a single structural unit. 114
43915 395312 pfam00388 PI-PLC-X Phosphatidylinositol-specific phospholipase C, X domain. This associates with pfam00387 to form a single structural unit. 142
43916 395313 pfam00389 2-Hacid_dh D-isomer specific 2-hydroxyacid dehydrogenase, catalytic domain. This family represents the largest portion of the catalytic domain of 2-hydroxyacid dehydrogenases as the NAD binding domain is inserted within the structural domain. 312
43917 395314 pfam00390 malic Malic enzyme, N-terminal domain. 182
43918 395315 pfam00391 PEP-utilizers PEP-utilising enzyme, mobile domain. This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it. 73
43919 306822 pfam00392 GntR Bacterial regulatory proteins, gntR family. This family of regulatory proteins consists of the N-terminal HTH region of GntR-like bacterial transcription factors. At the C-terminus there is usually an effector-binding/oligomerization domain. The GntR-like proteins include the following sub-families: MocR, YtrR, FadR, AraR, HutC and PlmA, DevA, DasR. Many of these proteins have been shown experimentally to be autoregulatory, enabling the prediction of operator sites and the discovery of cis/trans relationships. The DasR regulator has been shown to be a global regulator of primary metabolism and development in Streptomyces coelicolor. 64
43920 395316 pfam00393 6PGD 6-phosphogluconate dehydrogenase, C-terminal domain. This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each. 290
43921 395317 pfam00394 Cu-oxidase Multicopper oxidase. Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. 146
43922 395318 pfam00395 SLH S-layer homology domain. 42
43923 395319 pfam00396 Granulin Granulin. 42
43924 395320 pfam00397 WW WW domain. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro. 30
43925 395321 pfam00398 RrnaAD Ribosomal RNA adenine dimethylase. 263
43926 395322 pfam00399 PIR Yeast PIR protein repeat. 17
43927 395323 pfam00400 WD40 WD domain, G-beta repeat. 39
43928 395324 pfam00401 ATP-synt_DE ATP synthase, Delta/Epsilon chain, long alpha-helix domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213). 45
43929 395325 pfam00402 Calponin Calponin family repeat. 25
43930 395326 pfam00403 HMA Heavy-metal-associated domain. 58
43931 395327 pfam00404 Dockerin_1 Dockerin type I repeat. The dockerin repeat is the binding partner of the cohesin domain pfam00963. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. The dockerin repeats, each bearing homology to the EF-hand calcium-binding loop bind calcium. 56
43932 395328 pfam00405 Transferrin Transferrin. 328
43933 395329 pfam00406 ADK Adenylate kinase. 184
43934 395330 pfam00407 Bet_v_1 Pathogenesis-related protein Bet v I family. This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kd that are wide-spread among dicotyledonous plants. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10: - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1, an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones. - (S)-Norcoclaurine synthases are enzymes catalyzing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. - Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens. 149
43935 395331 pfam00408 PGM_PMM_IV Phosphoglucomutase/phosphomannomutase, C-terminal domain. 71
43936 395332 pfam00410 Ribosomal_S8 Ribosomal protein S8. 127
43937 306838 pfam00411 Ribosomal_S11 Ribosomal protein S11. 109
43938 395333 pfam00412 LIM LIM domain. This family represents two copies of the LIM structural domain. 57
43939 395334 pfam00413 Peptidase_M10 Matrixin. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. 159
43940 366084 pfam00414 MAP1B_neuraxin Neuraxin and MAP1B repeat. 17
43941 395335 pfam00415 RCC1 Regulator of chromosome condensation (RCC1) repeat. 50
43942 395336 pfam00416 Ribosomal_S13 Ribosomal protein S13/S18. This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes. 108
43943 395337 pfam00418 Tubulin-binding Tau and MAP protein, tubulin-binding repeat. This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilize microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif. 31
43944 395338 pfam00419 Fimbrial Fimbrial protein. 149
43945 395339 pfam00420 Oxidored_q2 NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. 95
43946 395340 pfam00421 PSII Photosystem II protein. 496
43947 278832 pfam00423 HN Haemagglutinin-neuraminidase. 539
43948 366091 pfam00424 REV REV protein (anti-repression trans-activator protein). 91
43949 395341 pfam00425 Chorismate_bind Chorismate binding enzyme. This family includes the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase. 255
43950 395342 pfam00426 VP4_haemagglut Outer Capsid protein VP4 (Hemagglutinin) Concanavalin-like domain. This entry represents the N-terminal concanavalin-like domain from the VP4 protein of rotavirus C. 193
43951 395343 pfam00427 PBS_linker_poly Phycobilisome Linker polypeptide. 125
43952 395344 pfam00428 Ribosomal_60s 60s Acidic ribosomal protein. This family includes archaebacterial L12, eukaryotic P0, P1 and P2. 87
43953 306850 pfam00429 TLV_coat ENV polyprotein (coat polyprotein). 560
43954 366095 pfam00430 ATP-synt_B ATP synthase B/B' CF(0). Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006. 131
43955 395345 pfam00431 CUB CUB domain. 110
43956 395346 pfam00432 Prenyltrans Prenyltransferase and squalene oxidase repeat. 44
43957 395347 pfam00433 Pkinase_C Protein kinase C terminal domain. 42
43958 278842 pfam00434 VP7 Glycoprotein VP7. 336
43959 395348 pfam00435 Spectrin Spectrin repeat. Spectrin repeat-domains are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin domain-repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. Although the domain occurs in multiple repeats along sequences, the domains are actually stable on their own - i.e. they act, biophysically, like domains rather than repeats that only function when aggregated. 105
43960 395349 pfam00436 SSB Single-strand binding protein family. This family includes single stranded binding proteins and also the primosomal replication protein N (PriB). PriB forms a complex with PriA, PriC and ssDNA. 103
43961 395350 pfam00437 T2SSE Type II/IV secretion system protein. This family contains both type II and type IV pathway secretion proteins from bacteria. VirB11 ATPase is a subunit of the Agrobacterium tumefaciens transfer DNA (T-DNA) transfer system, a type IV secretion pathway required for delivery of T-DNA and effector proteins to plant cells during infection. 271
43962 395351 pfam00438 S-AdoMet_synt_N S-adenosylmethionine synthetase, N-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 98
43963 395352 pfam00439 Bromodomain Bromodomain. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 84
43964 395353 pfam00440 TetR_N Bacterial regulatory proteins, tetR family. 47
43965 395354 pfam00441 Acyl-CoA_dh_1 Acyl-CoA dehydrogenase, C-terminal domain. C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle. 149
43966 395355 pfam00443 UCH Ubiquitin carboxyl-terminal hydrolase. 306
43967 395356 pfam00444 Ribosomal_L36 Ribosomal protein L36. 38
43968 395357 pfam00445 Ribonuclease_T2 Ribonuclease T2 family. 183
43969 278853 pfam00446 GnRH Gonadotropin-releasing hormone. 10
43970 395358 pfam00447 HSF_DNA-bind HSF-type DNA-binding. 99
43971 395359 pfam00448 SRP54 SRP54-type protein, GTPase domain. This family includes relatives of the G-domain of the SRP54 family of proteins. 193
43972 395360 pfam00449 Urease_alpha Urease alpha-subunit, N-terminal domain. The N-terminal domain is a composite domain and plays a major trimer stabilizing role by contacting the catalytic domain of the symmetry related alpha-subunit. 120
43973 395361 pfam00450 Peptidase_S10 Serine carboxypeptidase. 415
43974 278858 pfam00451 Toxin_2 Scorpion short toxin, BmKK2. Members of this family, which are found in various scorpion toxins, confer potassium channel blocking activity. 32
43975 395362 pfam00452 Bcl-2 Apoptosis regulator proteins, Bcl-2 family. 100
43976 395363 pfam00453 Ribosomal_L20 Ribosomal protein L20. 104
43977 395364 pfam00454 PI3_PI4_kinase Phosphatidylinositol 3- and 4-kinase. Some members of this family probably do not have lipid kinase activity and are protein kinases. 241
43978 395365 pfam00455 DeoRC DeoR C terminal sensor domain. The sensor domains of the DeoR are catalytically inactive versions of the ISOCOT fold, but retain the substrate binding site. DeorC senses diverse sugar derivatives such as deoxyribose nucleoside (DeoR), tagatose phosphate (LacR), galactosamine (AgaR), myo-inositol (Bacillus IolR) and L-ascorbate (UlaR). 160
43979 395366 pfam00456 Transketolase_N Transketolase, thiamine diphosphate binding domain. This family includes transketolase enzymes EC:2.2.1.1. and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit EC:1.2.4.4. Both these enzymes utilize thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis. 334
43980 395367 pfam00457 Glyco_hydro_11 Glycosyl hydrolases family 11. 175
43981 395368 pfam00458 WHEP-TRS WHEP-TRS domain. 49
43982 395369 pfam00459 Inositol_P Inositol monophosphatase family. 270
43983 395370 pfam00460 Flg_bb_rod Flagella basal body rod protein. 30
43984 395371 pfam00462 Glutaredoxin Glutaredoxin. 60
43985 278869 pfam00463 ICL Isocitrate lyase family. 526
43986 395372 pfam00464 SHMT Serine hydroxymethyltransferase. 399
43987 395373 pfam00465 Fe-ADH Iron-containing alcohol dehydrogenase. 362
43988 395374 pfam00466 Ribosomal_L10 Ribosomal protein L10. 99
43989 395375 pfam00467 KOW KOW motif. This family has been extended to coincide with ref. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG. 32
43990 395376 pfam00468 Ribosomal_L34 Ribosomal protein L34. 42
43991 366118 pfam00469 F-protein Negative factor, (F-Protein) or Nef. Nef protein accelerates virulent progression of AIDS by its interaction with cellular proteins involved in signal transduction and host cell activation. Nef has been shown to bind specifically to a subset of the Src kinase family. 220
43992 395377 pfam00471 Ribosomal_L33 Ribosomal protein L33. 46
43993 395378 pfam00472 RF-1 RF-1 domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis. 111
43994 395379 pfam00473 CRF Corticotropin-releasing factor family. 38
43995 109527 pfam00474 SSF Sodium:solute symporter family. 406
43996 395380 pfam00475 IGPD Imidazoleglycerol-phosphate dehydratase. 140
43997 395381 pfam00476 DNA_pol_A DNA polymerase family A. 372
43998 306882 pfam00477 LEA_5 Small hydrophilic plant seed protein. 109
43999 395382 pfam00478 IMPDH IMP dehydrogenase / GMP reductase domain. This family is involved in biosynthesis of guanosine nucleotide. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases 2 CBS domains pfam00571 are inserted in the TIM barrel. This family is a member of the common phosphate binding site TIM barrel family. 418
44000 395383 pfam00479 G6PD_N Glucose-6-phosphate dehydrogenase, NAD binding domain. 178
44001 395384 pfam00480 ROK ROK family. 292
44002 395385 pfam00481 PP2C Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase. 252
44003 395386 pfam00482 T2SSF Type II secretion system (T2SS), protein F. The original family covered both the regions found by the current model. The splitting of the family has allowed the related FlaJ_arch (archaeal FlaJ family) to be merged with it. Proteins with this domain form a platform for the machinery of the Type II secretion system, as well as the Type 4 pili and the archaeal flagella. This domain seems to show some similarity to PF00664 but this may just be due to similarities in the TM helices (personal obs: C Yeats). 115
44004 395387 pfam00483 NTP_transferase Nucleotidyl transferase. This family includes a wide range of enzymes which transfer nucleotides onto phosphosugars. 243
44005 395388 pfam00484 Pro_CA Carbonic anhydrase. This family includes carbonic anhydrases as well as a family of non-functional homologs related to YbcF. 156
44006 395389 pfam00485 PRK Phosphoribulokinase / Uridine kinase family. This family matches three types of P-loop containing kinases: phosphoribulokinases, uridine kinases and bacterial pantothenate kinases (CoaA). Arabidopsis and other organisms have a dual uridine kinase/uracil phosphoribosyltransferase protein where the N-terminal region consists of a UK domain and the C-terminal region of a UPRT domain. 197
44007 395390 pfam00486 Trans_reg_C Transcriptional regulatory protein, C terminal. 75
44008 395391 pfam00487 FA_desaturase Fatty acid desaturase. 253
44009 395392 pfam00488 MutS_V MutS domain V. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam05190. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain V of Thermus aquaticus MutS, which contains a Walker A motif, and is structurally similar to the ATPase domain of ABC transporters. 188
44010 395393 pfam00489 IL6 Interleukin-6/G-CSF/MGF family. 184
44011 395394 pfam00490 ALAD Delta-aminolevulinic acid dehydratase. 315
44012 395395 pfam00491 Arginase Arginase family. 271
44013 395396 pfam00493 MCM MCM2/3/5 family. 224
44014 395397 pfam00494 SQS_PSY Squalene/phytoene synthase. 260
44015 395398 pfam00496 SBP_bac_5 Bacterial extracellular solute-binding proteins, family 5 Middle. The borders of this family are based on the PDBSum definitions of the domain edges for Salmonella typhimurium oppA. 369
44016 395399 pfam00497 SBP_bac_3 Bacterial extracellular solute-binding proteins, family 3. 225
44017 395400 pfam00498 FHA FHA domain. The FHA (Forkhead-associated) domain is a phosphopeptide binding motif. 66
44018 395401 pfam00499 Oxidored_q3 NADH-ubiquinone/plastoquinone oxidoreductase chain 6. 143
44019 395402 pfam00500 Late_protein_L1 L1 (late) protein. 456
44020 395403 pfam00501 AMP-binding AMP-binding enzyme. 411
44021 395404 pfam00502 Phycobilisome Phycobilisome protein. 155
44022 395405 pfam00503 G-alpha G-protein alpha subunit. G proteins couple receptors of extracellular signals to intracellular signaling pathways. The G protein alpha subunit binds guanyl nucleotide and is a weak GTPase. A set of residues that are unique to G-alpha as compared to its ancestor the Arf-like family form a ring of residues centered on the nucleotide binding site. A Ggamma is found fused to an inactive Galpha in the Dictyostelium protein gbqA. 316
44023 395406 pfam00504 Chloroa_b-bind Chlorophyll A-B binding protein. 157
44024 395407 pfam00505 HMG_box HMG (high mobility group) box. 68
44025 278907 pfam00506 Flu_NP Influenza virus nucleoprotein. 520
44026 395408 pfam00507 Oxidored_q4 NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. 94
44027 278909 pfam00508 PPV_E2_N E2 (early) protein, N terminal. 197
44028 395409 pfam00509 Hemagglutinin Haemagglutinin. Haemagglutinin from influenza virus causes membrane fusion of the viral membrane with the host membrane. Fusion occurs after the host cell internalizes the virus by endocytosis. The drop of pH causes release of a hydrophobic fusion peptide and a large conformational change leading to membrane fusion. 504
44029 395410 pfam00510 COX3 Cytochrome c oxidase subunit III. 258
44030 395411 pfam00511 PPV_E2_C E2 (early) protein, C terminal. 80
44031 395412 pfam00512 HisKA His Kinase A (phospho-acceptor) domain. Dimerization and phospho-acceptor domain of histidine kinases. 64
44032 278914 pfam00513 Late_protein_L2 Late Protein L2. 514
44033 395413 pfam00514 Arm Armadillo/beta-catenin-like repeat. Approx. 40 amino acid repeat. Tandem repeats form super-helix of helices that is proposed to mediate interaction of beta-catenin with its ligands. CAUTION: This family does not contain all known armadillo repeats. 41
44034 395414 pfam00515 TPR_1 Tetratricopeptide repeat. 34
44035 278917 pfam00516 GP120 Envelope glycoprotein GP120. The entry of HIV requires interaction of viral GP120 with CD4 and a chemokine receptor on the cell surface. 525
44036 395415 pfam00517 GP41 Retroviral envelope protein. This family includes envelope protein from a variety of retroviruses. It includes the GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) which mediate membrane fusion during viral entry. The family also includes bovine immunodeficiency virus, feline immunodeficiency virus and Equine infectious anaemia (EIAV). The family also includes the Gp36 protein from mouse mammary tumor virus (MMTV) and human endogenous retroviruses (HERVs). 197
44037 306907 pfam00518 E6 Early Protein (E6). 108
44038 278920 pfam00519 PPV_E1_C Papillomavirus helicase. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508. 432
44039 395416 pfam00520 Ion_trans Ion transport protein. This family contains sodium, potassium and calcium ion channels. This family is 6 transmembrane helices in which the last two helices flank a loop which determines ion selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times, whereas in others (e.g. K channels) the protein forms as a tetramer in the membrane. 238
44040 395417 pfam00521 DNA_topoisoIV DNA gyrase/topoisomerase IV, subunit A. 430
44041 278923 pfam00522 VPR VPR/VPX protein. 83
44042 395418 pfam00523 Fusion_gly Fusion glycoprotein F0. 473
44043 278925 pfam00524 PPV_E1_N E1 Protein, N terminal domain. 121
44044 395419 pfam00525 Crystallin Alpha crystallin A chain, N terminal. 53
44045 306912 pfam00526 Dicty_CTDC Dictyostelium (slime mold) repeat. 23
44046 278928 pfam00527 E7 E7 protein, Early protein. 92
44047 334128 pfam00528 BPD_transp_1 Binding-protein-dependent transport system inner membrane component. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices. 183
44048 395420 pfam00529 HlyD HlyD membrane-fusion protein of T1SS. HlyD is a component of the prototypical alpha-haemolysin (HlyA) bacterial type I secretion system, along with the other components HlyB and TolC. HlyD and HlyB are inner-membrane proteins and specific components of the transport apparatus of alpha-haemolysin. HlyD is anchored in the cytoplasmic membrane by a single transmembrane domain and has a large periplasmic domain within the carboxy-terminal 100 amino acids. HlyB and HlyD form a stable complex that binds the recombinant protein bearing a C-terminal HlyA signal sequence and ATP in the cytoplasm. HlyD, HlyB and TolC combine to form the three-component ABC transporter complex that forms a trans-membrane channel or pore through which HlyA can be transferred directly to the extracellular medium. Cutinase has been shown to be transported effectively through this pore. 322
44049 395421 pfam00530 SRCR Scavenger receptor cysteine-rich domain. These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 98
44050 395422 pfam00531 Death Death domain. 83
44051 395423 pfam00532 Peripla_BP_1 Periplasmic binding proteins and sugar binding domain of LacI family. This family includes the periplasmic binding proteins, and the LacI family transcriptional regulators. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The LacI family of proteins consist of transcriptional regulators related to the lac repressor. In this case, generally the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain (pfam00356). 281
44052 395424 pfam00533 BRCT BRCA1 C-terminus (BRCT) domain. The BRCT domain is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. The BRCT domain of XRCC1 forms a homodimer in the crystal structure. This suggests that pairs of BRCT domains associate as homo- or heterodimers. BRCT domains are often found as tandem-repeat pairs. Structures of the BRCA1 BRCT domains revealed a basis for a widely utilized head-to-tail BRCT-BRCT oligomerization mode. This conserved tandem BRCT architecture facilitates formation of the canonical BRCT phospho-peptide interaction cleft at a groove between the BRCT domains. Disease associated missense and nonsense mutations in the BRCA1 BRCT domains disrupt peptide binding by directly occluding this peptide binding groove, or by disrupting key conserved BRCT core folding determinants. 78
44053 395425 pfam00534 Glycos_transf_1 Glycosyl transferases group 1. Mutations in this domain of PIGA lead to disease (Paroxysmal Nocturnal haemoglobinuria). Members of this family transfer activated sugars to a variety of substrates, including glycogen, Fructose-6-phosphate and lipopolysaccharides. Members of this family transfer UDP, ADP, GDP or CMP linked sugars. The eukaryotic glycogen synthases may be distant members of this family. 158
44054 395426 pfam00535 Glycos_transf_2 Glycosyl transferase family 2. Diverse family, transferring sugar from UDP-glucose, UDP-N-acetyl-galactosamine, GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol phosphate and teichoic acids. 164
44055 395427 pfam00536 SAM_1 SAM domain (Sterile alpha motif). It has been suggested that SAM is an evolutionarily conserved protein binding domain that is involved in the regulation of numerous developmental processes in diverse eukaryotes. The SAM domain can potentially function as a protein interaction module through its ability to homo- and heterooligomerize with other SAM domains. 64
44056 395428 pfam00537 Toxin_3 Scorpion toxin-like domain. This family contains both neurotoxins and plant defensins. The mustard trypsin inhibitor, MTI-2, is a plant defensin. It is a potent inhibitor of trypsin with no activity towards chymotrypsin. MTI-2 is toxic for Lepidopteran insects, but has low activity against aphids. Brazzein is a plant defensin-like protein. It is a pH-stable, heat-stable and intensely sweet protein. The scorpion toxin (a neurotoxin) binds to sodium channels and inhibits the activation mechanisms of the channels, thereby blocking neuronal transmission. 55
44057 395429 pfam00538 Linker_histone linker histone H1 and H5 family. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell types. 73
44058 306921 pfam00539 Tat Transactivating regulatory protein (Tat). The retroviral Tat protein binds to the Tar RNA. This activates transcriptional initiation and elongation from the LTR promoter. Binding is mediated by an arginine rich region. 64
44059 249943 pfam00540 Gag_p17 gag gene protein p17 (matrix protein). The matrix protein forms an icosahedral shell associated with the inner membrane of the mature immunodeficiency virus. 140
44060 306922 pfam00541 Adeno_knob Adenoviral fibre protein (knob domain). Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, termed the carboxy-terminal knob domain. 178
44061 395430 pfam00542 Ribosomal_L12 Ribosomal protein L7/L12 C-terminal domain. 67
44062 395431 pfam00543 P-II Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase. 102
44063 366158 pfam00544 Pec_lyase_C Pectate lyase. This enzyme forms a right handed beta helix structure. Pectate lyase is an enzyme involved in the maceration and soft rotting of plant tissue. 211
44064 395432 pfam00545 Ribonuclease ribonuclease. This enzyme hydrolyzes RNA and oligoribonucleotides. 81
44065 395433 pfam00547 Urease_gamma Urease, gamma subunit. Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia. 99
44066 278947 pfam00548 Peptidase_C3 3C cysteine protease (picornain 3C). Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. 174
44067 395434 pfam00549 Ligase_CoA CoA-ligase. This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilize ATP; others use GTP. 128
44068 395435 pfam00550 PP-binding Phosphopantetheine attachment site. A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of the anguibactin system regulator AngR has the attachment serine replaced by an alanine. 62
44069 395436 pfam00551 Formyl_trans_N Formyl transferase. This family includes the following members. Glycinamide ribonucleotide transformylase catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyl-tetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function. 181
44070 395437 pfam00552 IN_DBD_C Integrase DNA binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain. The central domain is the catalytic domain pfam00665. This domain is the carboxyl terminal domain that is a non-specific DNA binding domain. 45
44071 395438 pfam00553 CBM_2 Cellulose binding domain. Two tryptophan residues are involved in cellulose binding. Cellulose binding domain found in bacteria. 101
44072 395439 pfam00554 RHD_DNA_bind Rel homology DNA-binding domain. Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal DNA-binding domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam16179) that functions as a dimerization domain. 169
44073 395440 pfam00555 Endotoxin_M delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 204
44074 395441 pfam00556 LHC Antenna complex alpha/beta subunit. 39
44075 395442 pfam00557 Peptidase_M24 Metallopeptidase family M24. This family contains metallopeptidases. It also contains non-peptidase homologs such as the N terminal domain of Spt16 which is a histone H3-H4 binding module. 206
44076 109608 pfam00558 Vpu Vpu protein. The Vpu protein contains an N-terminal transmembrane spanning region and a C-terminal cytoplasmic region. The HIV-1 Vpu protein stimulates virus production by enhancing the release of viral particles from infected cells. The VPU protein binds specifically to CD4. 81
44077 278957 pfam00559 Vif Retroviral Vif (Viral infectivity) protein. Human immunodeficiency virus type 1 (HIV-1) Vif is required for productive infection of T lymphocytes and macrophages. Virions produced in the absence of Vif have abnormal core morphology and those produced in primary T cells carry immature core proteins and low levels of mature capsid. 200
44078 395443 pfam00560 LRR_1 Leucine Rich Repeat. CAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. Leucine Rich Repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains. 23
44079 395444 pfam00561 Abhydrolase_1 alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes. 245
44080 395445 pfam00562 RNA_pol_Rpb2_6 RNA polymerase Rpb2, domain 6. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 2 (DRII). 369
44081 395446 pfam00563 EAL EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site. 236
44082 395447 pfam00564 PB1 PB1 domain. 84
44083 395448 pfam00565 SNase Staphylococcal nuclease homolog. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis. 106
44084 366170 pfam00566 RabGAP-TBC Rab-GTPase-TBC domain. Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases. 180
44085 395449 pfam00567 TUDOR Tudor domain. 117
44086 395450 pfam00568 WH1 WH1 domain. WASp Homology domain 1 (WH1) domain. WASP is the protein that is defective in Wiskott-Aldrich syndrome (WAS). The majority of point mutations occur within the amino-terminal WH1 domain. The metabotropic glutamate receptors mGluR1alpha and mGluR5 bind a protein called homer, which is a WH1 domain homolog. A subset of WH1 domains has been termed an "EVH1" domain and appears to bind a polyproline motif. 111
44087 395451 pfam00569 ZZ Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. Four to six cysteine residues in its sequence are responsible for coordinating zinc ions, to reinforce the structure. 45
44088 395452 pfam00570 HRDC HRDC domain. The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains. 68
44089 395453 pfam00571 CBS CBS domain. CBS domains are small intracellular modules that pair together to form a stable globular domain. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP. 57
44090 395454 pfam00572 Ribosomal_L13 Ribosomal protein L13. 119
44091 395455 pfam00573 Ribosomal_L4 Ribosomal protein L4/L1 family. This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA. 190
44092 395456 pfam00574 CLP_protease Clp protease. The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his-136 and asp-185 form the catalytic triad. Some members have lost active site residues and are therefore inactive, some contain one or two large insertions. 175
44093 395457 pfam00575 S1 S1 RNA binding domain. The S1 domain occurs in a wide range of RNA associated proteins. It is structurally similar to cold shock protein which binds nucleic acids. The S1 domain has an OB-fold structure. 74
44094 395458 pfam00576 Transthyretin HIUase/Transthyretin family. This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyze the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence. 110
44095 395459 pfam00577 Usher Outer membrane usher protein. In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two-component assembly and transport system which is composed of a periplasmic chaperone and an outer membrane protein which has been termed a molecular 'usher'. The usher protein is rather large (from 86 to 100 Kd) and seems to be mainly composed of membrane-spanning beta-sheets, a structure reminiscent of porins. Although the degree of sequence similarity of these proteins is not very high they share a number of characteristics. One of these is the presence of two pairs of cysteines, the first one located in the N-terminal part and the second at the C-terminal extremity that are probably involved in disulphide bonds. The best conserved region is located in the central part of these proteins. 551
44096 395460 pfam00578 AhpC-TSA AhpC/TSA family. This family contains proteins related to alkyl hydroperoxide reductase (AhpC) and thiol specific antioxidant (TSA). 124
44097 395461 pfam00579 tRNA-synt_1b tRNA synthetases class I (W and Y). 292
44098 395462 pfam00580 UvrD-helicase UvrD/REP helicase N-terminal domain. The Rep family helicases are composed of four structural domains. The Rep family function as dimers. REP helicases catalyze ATP dependent unwinding of double stranded DNA to single stranded DNA. Some members have large insertions near to the carboxy-terminus relative to other members of the family. 267
44099 395463 pfam00581 Rhodanese Rhodanese-like domain. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases. 92
44100 395464 pfam00582 Usp Universal stress protein family. The universal stress protein UspA is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. UspA enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae UspA reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, though UspA lacks ATP-binding activity. 140
44101 395465 pfam00583 Acetyltransf_1 Acetyltransferase (GNAT) family. This family contains proteins with N-acetyltransferase functions such as Elp3-related proteins. 116
44102 395466 pfam00584 SecE SecE/Sec61-gamma subunits of protein translocation complex. SecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha translocates protein from the cytoplasm to the ER. Archaea have a similar complex. 55
44103 395467 pfam00585 Thr_dehydrat_C C-terminal regulatory domain of Threonine dehydratase. Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain. 91
44104 395468 pfam00586 AIRS AIR synthase related protein, N-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain. 104
44105 395469 pfam00587 tRNA-synt_2b tRNA synthetase class II core domain (G, H, P, S and T). tRNA-synt_2b is a family of largely threonyl-tRNA members. 181
44106 395470 pfam00588 SpoU_methylase SpoU rRNA Methylase family. This family of proteins probably use S-AdoMet. 141
44107 395471 pfam00589 Phage_integrase Phage integrase family. Members of this family cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment. The catalytic site residues in CRE recombinase are Arg-173, His-289, Arg-292 and Tyr-324. 169
44108 395472 pfam00590 TP_methylase Tetrapyrrole (Corrin/Porphyrin) Methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Diphthine synthase. 209
44109 395473 pfam00591 Glycos_transf_3 Glycosyl transferase family, a/b domain. This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate. 251
44110 395474 pfam00593 TonB_dep_Rec TonB dependent receptor. This model now only covers the conserved part of the barrel structure. 475
44111 395475 pfam00594 Gla Vitamin K-dependent carboxylation/gamma-carboxyglutamic (GLA) domain. This domain is responsible for the high-affinity binding of calcium ions. This domain contains post-translational modifications of many glutamate residues by Vitamin K-dependent carboxylation to form gamma-carboxyglutamate (Gla). 41
44112 395476 pfam00595 PDZ PDZ domain (Also known as DHR or GLGF). PDZ domains are found in diverse signaling proteins. 81
44113 395477 pfam00596 Aldolase_II Class II Aldolase and Adducin N-terminal domain. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. 175
44114 395478 pfam00598 Flu_M1 Influenza Matrix protein (M1). This protein forms a continuous shell on the inner side of the lipid bilayer, but its function is unclear. 156
44115 278994 pfam00599 Flu_M2 Influenza Matrix protein (M2). This protein spans the viral membrane with an extracellular amino-terminus and a cytoplasmic carboxy-terminus. 97
44116 366187 pfam00600 Flu_NS1 Influenza non-structural protein (NS1). NS1 is a homodimeric RNA-binding protein that is required for viral replication. NS1 binds polyA tails of mRNA keeping them in the nucleus. NS1 inhibits pre-mRNA splicing by tightly binding to a specific stem-bulge of U6 snRNA. 217
44117 278996 pfam00601 Flu_NS2 Influenza non-structural protein (NS2). NS2 may play a role in promoting normal replication of the genomic RNAs by preventing the replication of short-length RNA species. 108
44118 395479 pfam00602 Flu_PB1 Influenza RNA-dependent RNA polymerase subunit PB1. Two GTP binding sites exist in this protein. 732
44119 395480 pfam00603 Flu_PA Influenza RNA-dependent RNA polymerase subunit PA. 694
44120 395481 pfam00604 Flu_PB2 Influenza RNA-dependent RNA polymerase subunit PB2. PB2 can bind 5' end cap structure of RNA. 754
44121 395482 pfam00605 IRF Interferon regulatory factor transcription factor. This family of transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. Three of the five conserved tryptophan residues bind to DNA. 106
44122 334169 pfam00606 Glycoprotein_B Herpesvirus Glycoprotein B ectodomain. This domain corresponds to the ectodomain of glycoprotein B according to ECOD. 222
44123 395483 pfam00607 Gag_p24 gag gene protein p24 (core nucleocapsid protein). p24 forms the inner protein layer of the nucleocapsid. ELISA testing for p24 is the most commonly used method to demonstrate virus replication both in vivo and in vitro. 195
44124 306963 pfam00608 Adeno_shaft Adenoviral fibre protein (repeat/shaft region). There is no separation between signal and noise. Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, rather than the 'shaft' region represented by this family. The alignment of this family contains two copies of a fifteen residue repeat found in the 'shaft' region of adenoviral fibre proteins. 34
44125 395484 pfam00609 DAGK_acc Diacylglycerol kinase accessory domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. 158
44126 395485 pfam00610 DEP Domain found in Dishevelled, Egl-10, and Pleckstrin (DEP). The DEP domain is responsible for mediating intracellular protein targeting and regulation of protein stability in the cell. The DEP domain is present in a number of signaling molecules, including Regulator of G protein Signaling (RGS) proteins, and has been implicated in membrane targeting. New findings in yeast, however, demonstrate a major role for a DEP domain in mediating the interaction of an RGS protein to the C-terminal tail of a GPCR, thus placing RGS in close proximity with its substrate G protein alpha subunit. 71
44127 395486 pfam00611 FCH Fes/CIP4, and EFC/F-BAR homology domain. Alignment extended from. Highly alpha-helical. The cytosolic endocytic adaptor proteins in fungi carry this domain at the N-terminus; several of these have been referred to as muniscin proteins. These N-terminal BAR, N-BAR, and EFC/F-BAR domains are found in proteins that regulate membrane trafficking events by inducing membrane tubulation. The domain dimerizes into a curved structure that binds to liposomes and either senses or induces the curvature of the membrane bilayer to cause biophysical changes to the shape of the bilayer; it also thereby recruits other trafficking factors, such as the GTPase dynamin. Most EFC/F-BAR domain-family members localize to actin-rich structures. 77
44128 395487 pfam00612 IQ IQ calmodulin-binding motif. Calmodulin-binding motif. 21
44129 395488 pfam00613 PI3Ka Phosphoinositide 3-kinase family, accessory domain (PIK domain). PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. 185
44130 395489 pfam00614 PLDc Phospholipase D Active site motif. Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homolog of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologs but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. 28
44131 395490 pfam00615 RGS Regulator of G protein signaling domain. RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. 117
44132 395491 pfam00616 RasGAP GTPase-activator protein for Ras-like GTPase. All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. 206
44133 395492 pfam00617 RasGEF RasGEF domain. Guanine nucleotide exchange factor for Ras-like small GTPases. 179
44134 395493 pfam00618 RasGEF_N RasGEF N-terminal motif. A subset of guanine nucleotide exchange factor for Ras-like small GTPases appear to possess this motif/domain N-terminal to the RasGef (Cdc25-like) domain. 104
44135 395494 pfam00619 CARD Caspase recruitment domain. Motif contained in proteins involved in apoptotic signaling. Predicted to possess a DEATH (pfam00531) domain-like fold. 85
44136 395495 pfam00620 RhoGAP RhoGAP domain. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. 152
44137 395496 pfam00621 RhoGEF RhoGEF domain. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that pfam00169 domains invariably occur C-terminal to RhoGEF/DH domains. 177
44138 395497 pfam00622 SPRY SPRY domain. SPRY Domain is named from SPla and the RYanodine Receptor. Domain of unknown function. Distant homologs are domains in butyrophilin/marenostrin/pyrin homologs. 121
44139 395498 pfam00623 RNA_pol_Rpb1_2 RNA polymerase Rpb1, domain 2. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion. 166
44140 395499 pfam00624 Flocculin Flocculin repeat. This short repeat is rich in serine and threonine residues. 39
44141 395500 pfam00625 Guanylate_kin Guanylate kinase. 182
44142 395501 pfam00626 Gelsolin Gelsolin repeat. 76
44143 395502 pfam00627 UBA UBA/TS-N domain. This small domain is composed of three alpha helices. This family includes the previously defined UBA and TS-N domains. The UBA-domain (ubiquitin associated domain) is a novel sequence motif found in several proteins having connections to ubiquitin and the ubiquitination pathway. The structure of the UBA domain consists of a compact three helix bundle. This domain is found at the N-terminus of EF-TS hence the name TS-N. The structure of EF-TS is known and this domain is implicated in its interaction with EF-TU. The domain has been found in non EF-TS proteins such as alpha-NAC and MJ0280. 37
44144 395503 pfam00628 PHD PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 51
44145 395504 pfam00629 MAM MAM domain, meprin/A5/mu. An extracellular domain found in many receptors. The MAM domain along with the associated Ig domain in type IIB receptor protein tyrosine phosphatases forms a structural unit (termed MIg) with a seamless interdomain interface. It plays a major role in homodimerization of the phosphatase ectoprotein and in cell adhesion. MAM is a beta-sandwich consisting of two five-stranded antiparallel beta-sheets rotated away from each other by approx 25 degrees, and plays a similar role in meprin metalloproteinases. 160
44146 395505 pfam00630 Filamin Filamin/ABP280 repeat. 89
44147 395506 pfam00631 G-gamma GGL domain. G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins. It is also found fused to an inactive Galpha in the Dictyostelium protein gbqA. G-gamma likely shares a common origin with the helical N-terminal unit of G-beta. All organisms that possess a G-beta possess a G-gamma. 67
44148 395507 pfam00632 HECT HECT-domain (ubiquitin-transferase). The name HECT comes from Homologous to the E6-AP Carboxyl Terminus. 300
44149 395508 pfam00633 HHH Helix-hairpin-helix motif. The helix-hairpin-helix DNA-binding motif is found to be duplicated in the central domain of RuvA. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain. 30
44150 395509 pfam00634 BRCA2 BRCA2 repeat. The alignment covers only the most conserved region of the repeat. 31
44151 395510 pfam00635 Motile_Sperm MSP (Major sperm protein) domain. Major sperm proteins are involved in sperm motility. These proteins oligomerize to form filaments. This family contains many other proteins. 109
44152 395511 pfam00636 Ribonuclease_3 Ribonuclease III domain. 101
44153 395512 pfam00637 Clathrin Region in Clathrin and VPS. Each region is about 140 amino acids long. The regions are composed of multiple alpha helical repeats. They occur in the arm region of the Clathrin heavy chain. 137
44154 395513 pfam00638 Ran_BP1 RanBP1 domain. 122
44155 395514 pfam00639 Rotamase PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline. 96
44156 395515 pfam00640 PID Phosphotyrosine interaction domain (PTB/PID). 133
44157 395516 pfam00641 zf-RanBP Zn-finger in Ran binding protein and others. 30
44158 395517 pfam00642 zf-CCCH Zinc finger C-x8-C-x5-C-x3-H type (and similar). 27
44159 395518 pfam00643 zf-B_box B-box zinc finger. 42
44160 395519 pfam00644 PARP Poly(ADP-ribose) polymerase catalytic domain. Poly(ADP-ribose) polymerase catalyzes the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 195
44161 395520 pfam00645 zf-PARP Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region. Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. 87
44162 395521 pfam00646 F-box F-box domain. This domain is approximately 50 amino acids long, and is usually found in the N-terminal half of a variety of proteins. Two motifs that are commonly found associated with the F-box domain are the leucine rich repeats (LRRs; pfam00560 and pfam07723) and the WD repeat (pfam00400). The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression. 45
44163 395522 pfam00647 EF1G Elongation factor 1 gamma, conserved domain. 105
44164 395523 pfam00648 Peptidase_C2 Calpain family cysteine protease. 296
44165 395524 pfam00649 Copper-fist Copper fist DNA binding domain. 38
44166 395525 pfam00650 CRAL_TRIO CRAL/TRIO domain. 152
44167 395526 pfam00651 BTB BTB/POZ domain. The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerization and in some instances heteromeric dimerization. The structure of the dimerized PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN. 107
44168 395527 pfam00652 Ricin_B_lectin Ricin-type beta-trefoil lectin domain. 126
44169 395528 pfam00653 BIR Inhibitor of Apoptosis domain. BIR stands for 'Baculovirus Inhibitor of apoptosis protein Repeat'. It is found repeated in inhibitor of apoptosis proteins (IAPs), and in fact it is also known as IAP repeat. These domains characteristically have a number of invariant residues, including 3 conserved cysteines and one conserved histidine that coordinate a zinc ion. They are usually made up of 4-5 alpha helices and a three-stranded beta-sheet. BIR is also found in other proteins known as BIR-domain-containing proteins (BIRPs), such as Survivin. 67
44170 395529 pfam00654 Voltage_CLC Voltage gated chloride channel. This family of ion channels contains 10 or 12 transmembrane helices. Each protein forms a single pore. It has been shown that some members of this family form homodimers. In terms of primary structure, they are unrelated to known cation channels or other types of anion channels. Three ClC subfamilies are found in animals. ClC-1 is involved in setting and restoring the resting membrane potential of skeletal muscle, while other channels play important parts in solute concentration mechanisms in the kidney. These proteins contain two pfam00571 domains. 344
44171 395530 pfam00656 Peptidase_C14 Caspase domain. 232
44172 395531 pfam00657 Lipase_GDSL GDSL-like Lipase/Acylhydrolase. 224
44173 395532 pfam00658 PABP Poly-adenylate binding protein, unique domain. The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family. 65
44174 395533 pfam00659 POLO_box POLO box duplicated region. 66
44175 395534 pfam00660 SRP1_TIP1 Seripauperin and TIP1 family. 98
44176 395535 pfam00661 Matrix Viral matrix protein. Found in Morbillivirus and paramyxovirus, pneumovirus. 340
44177 395536 pfam00662 Proton_antipo_N NADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus. This sub-family represents an amino terminal extension of pfam00361. Only NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. This sub-family is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. 58
44178 395537 pfam00664 ABC_membrane ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 274
44179 395538 pfam00665 rve Integrase core domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site. 99
44180 395539 pfam00666 Cathelicidins Cathelicidin. A novel protein family, showing a conserved proregion and a variable carboxyl-terminal antimicrobial domain. This region shows similarity to cystatins. 101
44181 395540 pfam00667 FAD_binding_1 FAD binding domain. This domain is found in sulfite reductase, NADPH cytochrome P450 reductase, Nitric oxide synthase and methionine synthase reductase. 219
44182 395541 pfam00668 Condensation Condensation domain. This domain is found in many multi-domain enzymes which synthesize peptide antibiotics. This domain catalyzes a condensation reaction to form peptide bonds in non-ribosomal peptide biosynthesis. It is usually found to the carboxy side of a phosphopantetheine binding domain (pfam00550). It has been shown that mutations in the HHXXXDG motif abolish activity suggesting this is part of the active site. 454
44183 395542 pfam00669 Flagellin_N Bacterial flagellin N-terminal helical region. Flagellins polymerize to form bacterial flagella. This family includes flagellins and hook associated protein 3. Structurally this family forms an extended helix that interacts with pfam00700. 139
44184 395543 pfam00670 AdoHcyase_NAD S-adenosyl-L-homocysteine hydrolase, NAD binding domain. 162
44185 395544 pfam00672 HAMP HAMP domain. 53
44186 395545 pfam00673 Ribosomal_L5_C ribosomal L5P family C-terminus. This region is found associated with pfam00281. 94
44187 395546 pfam00674 DUP DUP family. This family consists of several yeast proteins of unknown functions. Swiss-prot annotates these as belonging to the DUP family. Several members of this family contain an internal duplication of this region. 96
44188 395547 pfam00675 Peptidase_M16 Insulinase (Peptidase family M16). 149
44189 395548 pfam00676 E1_dh Dehydrogenase E1 component. This family uses thiamine pyrophosphate as a cofactor. This family includes pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and 2-oxoisovalerate dehydrogenase. 300
44190 395549 pfam00677 Lum_binding Lumazine binding domain. This domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine. 83
44191 395550 pfam00679 EFG_C Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold. 88
44192 395551 pfam00680 RdRP_1 RNA dependent RNA polymerase. 453
44193 395552 pfam00681 Plectin Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen. 40
44194 395553 pfam00682 HMGL-like HMGL-like. This family contains a diverse set of enzymes. These include various aldolases and a region of pyruvate carboxylase. 264
44195 395554 pfam00683 TB TB domain. This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. 42
44196 395555 pfam00684 DnaJ_CXXCXGXG DnaJ central domain. The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found. 61
44197 395556 pfam00685 Sulfotransfer_1 Sulfotransferase domain. 253
44198 395557 pfam00686 CBM_20 Starch binding domain. 95
44199 395558 pfam00687 Ribosomal_L1 Ribosomal protein L1p/L10e family. This family includes prokaryotic L1 and eukaryotic L10. 197
44200 395559 pfam00688 TGFb_propeptide TGF-beta propeptide. This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. 229
44201 376368 pfam00689 Cation_ATPase_C Cation transporting ATPase, C-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices. 175
44202 395560 pfam00690 Cation_ATPase_N Cation transporter/ATPase, N-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. 69
44203 395561 pfam00691 OmpA OmpA family. The Pfam entry also includes MotB and related proteins which are not included in the Prosite family. 94
44204 395562 pfam00692 dUTPase dUTPase. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate. 129
44205 395563 pfam00693 Herpes_TK Thymidine kinase from herpesvirus. 280
44206 395564 pfam00694 Aconitase_C Aconitase C-terminal domain. Members of this family usually also match to pfam00330. This domain undergoes conformational change in the enzyme mechanism. 131
44207 366252 pfam00695 vMSA Major surface antigen from hepadnavirus. 394
44208 395565 pfam00696 AA_kinase Amino acid kinase family. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2. 232
44209 395566 pfam00697 PRAI N-(5'phosphoribosyl)anthranilate (PRA) isomerase. 193
44210 395567 pfam00698 Acyl_transf_1 Acyl transferase domain. 319
44211 395568 pfam00699 Urease_beta Urease beta subunit. This subunit is known as alpha in Helicobacter. 98
44212 395569 pfam00700 Flagellin_C Bacterial flagellin C-terminal helical region. Flagellins polymerize to form bacterial flagella. There is some similarity between this family and pfam00669, particularly the motif NRFXSXIXXL. It has been suggested that these two regions associate and this is shown to be correct as structurally this family forms an extended helix that interacts with pfam00700. 86
44213 395570 pfam00701 DHDPS Dihydrodipicolinate synthetase family. This family has a TIM barrel structure. 289
44214 395571 pfam00702 Hydrolase haloacid dehalogenase-like hydrolase. This family is structurally different from the alpha/beta hydrolase family (pfam00561). This family includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure of the family consists of two domains. One is an inserted four helix bundle, which is the least well conserved region of the alignment, between residues 16 and 96 of Pseudomonas sp. (S)-2-haloacid dehalogenase 1. The rest of the fold is composed of the core alpha/beta domain. Those members with the characteristic DxD triad at the N-terminus are probably phosphatidylglycerolphosphate (PGP) phosphatases involved in cardiolipin biosynthesis in the mitochondria. 191
44215 395572 pfam00703 Glyco_hydro_2 Glycosyl hydrolases family 2. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities. 106
44216 395573 pfam00704 Glyco_hydro_18 Glycosyl hydrolases family 18. 307
44217 395574 pfam00705 PCNA_N Proliferating cell nuclear antigen, N-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA. 125
44218 109750 pfam00706 Toxin_4 Anemone neurotoxin. 43
44219 395575 pfam00707 IF3_C Translation initiation factor IF-3, C-terminal domain. 86
44220 395576 pfam00708 Acylphosphatase Acylphosphatase. 85
44221 395577 pfam00709 Adenylsucc_synt Adenylosuccinate synthetase. 418
44222 395578 pfam00710 Asparaginase Asparaginase, N-terminal. This is the N-terminal domain of this enzyme. 188
44223 395579 pfam00711 Defensin_beta Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation. 36
44224 395580 pfam00712 DNA_pol3_beta DNA polymerase III beta subunit, N-terminal domain. A dimer of the beta subunit of DNA polymerase beta forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 121
44225 307040 pfam00713 Hirudin Hirudin. 64
44226 395581 pfam00714 IFN-gamma Interferon gamma. 138
44227 366262 pfam00715 IL2 Interleukin 2. 144
44228 279105 pfam00716 Peptidase_S21 Assemblin (Peptidase family S21). 336
44229 395582 pfam00717 Peptidase_S24 Peptidase S24-like. 116
44230 307044 pfam00718 Polyoma_coat Polyomavirus coat protein. 293
44231 395583 pfam00719 Pyrophosphatase Inorganic pyrophosphatase. 154
44232 395584 pfam00720 SSI Subtilisin inhibitor-like. 92
44233 307047 pfam00721 TMV_coat Virus coat protein (TMV like). This family contains coat proteins from tobamoviruses, hordeiviruses, Tobraviruses, Furoviruses and Potyviruses. 163
44234 395585 pfam00722 Glyco_hydro_16 Glycosyl hydrolases family 16. 168
44235 395586 pfam00723 Glyco_hydro_15 Glycosyl hydrolases family 15. In higher organisms this family is represented by phosphorylase kinase subunits. 417
44236 395587 pfam00724 Oxidored_FMN NADH:flavin oxidoreductase / NADH oxidase family. 341
44237 395588 pfam00725 3HCDH 3-hydroxyacyl-CoA dehydrogenase, C-terminal domain. This family also includes lambda crystallin. Some proteins include two copies of this domain. 97
44238 334228 pfam00726 IL10 Interleukin 10. 170
44239 395589 pfam00727 IL4 Interleukin 4. 116
44240 395590 pfam00728 Glyco_hydro_20 Glycosyl hydrolase family 20, catalytic domain. This domain has a TIM barrel fold. 345
44241 395591 pfam00729 Viral_coat Viral coat protein (S domain). 204
44242 395592 pfam00730 HhH-GPD HhH-GPD superfamily base excision DNA repair protein. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family. 142
44243 395593 pfam00731 AIRC AIR carboxylase. Members of this family catalyze the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyzes the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. 147
44244 366272 pfam00732 GMC_oxred_N GMC oxidoreductase. This family of proteins bind FAD as a cofactor. 218
44245 395594 pfam00733 Asn_synthase Asparagine synthase. This family is always found associated with pfam00310. Members of this family catalyze the conversion of aspartate to asparagine. 279
44246 395595 pfam00734 CBM_1 Fungal cellulose binding domain. 29
44247 395596 pfam00735 Septin Septin. Members of this family include CDC3, CDC10, CDC11 and CDC12/Septin. Members of this family bind GTP. As regards the septins, these are polypeptides of 30-65kDa with three characteristic GTPase motifs (G-1, G-3 and G-4) that are similar to those of the Ras family. The G-4 motif is strictly conserved with a unique septin consensus of AKAD. Most septins are thought to have at least one coiled-coil region, which in some cases is necessary for intermolecular interactions that allow septins to polymerize to form rod-shaped complexes. In turn, these are arranged into tandem arrays to form filaments. They are multifunctional proteins, with roles in cytokinesis, sporulation, germ cell development, exocytosis and apoptosis. 272
44248 395597 pfam00736 EF1_GNE EF-1 guanine nucleotide exchange domain. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains. 83
44249 395598 pfam00737 PsbH Photosystem II 10 kDa phosphoprotein. This protein is phosphorylated in a light dependent reaction. 52
44250 307060 pfam00738 Polyhedrin Polyhedrin. These proteins are found in occlusion bodies in various viruses. The polyhedrin protein protects the virus. 232
44251 109783 pfam00739 X Trans-activation protein X. This protein is found in hepadnaviruses where it is indispensable for replication. 142
44252 395599 pfam00740 Parvo_coat Parvovirus coat protein VP2. This protein, together with VP1 forms a capsomer. Both of these proteins are formed from the same transcript using alternative splicing. As a result, VP1 and VP2 differ only in the N-terminal region of VP1. VP2 is involved in packaging the viral DNA. 518
44253 395600 pfam00741 Gas_vesicle Gas vesicle protein. 39
44254 395601 pfam00742 Homoserine_dh Homoserine dehydrogenase. 178
44255 395602 pfam00743 FMO-like Flavin-binding monooxygenase-like. This family includes FMO proteins, cyclohexanone mono-oxygenase and a number of different mono-oxygenases. 531
44256 395603 pfam00745 GlutR_dimer Glutamyl-tRNAGlu reductase, dimerization domain. 94
44257 366278 pfam00746 Gram_pos_anchor LPXTG cell wall anchor motif. 43
44258 395604 pfam00747 Viral_DNA_bp ssDNA binding protein. This protein is found in herpesviruses and is needed for replication. 1120
44259 395605 pfam00748 Calpain_inhib Calpain inhibitor. This region is found multiple times in calpain inhibitor proteins. 130
44260 395606 pfam00749 tRNA-synt_1c tRNA synthetases class I (E and Q), catalytic domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln). 314
44261 395607 pfam00750 tRNA-synt_1d tRNA synthetases class I (R). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only arginyl tRNA synthetase. 348
44262 395608 pfam00751 DM DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerize and bind palindromic DNA. 47
44263 395609 pfam00752 XPG_N XPG N-terminal domain. 100
44264 395610 pfam00753 Lactamase_B Metallo-beta-lactamase superfamily. 196
44265 395611 pfam00754 F5_F8_type_C F5/8 type C domain. This domain is also known as the discoidin (DS) domain family. 127
44266 395612 pfam00755 Carn_acyltransf Choline/Carnitine o-acyltransferase. 578
44267 395613 pfam00756 Esterase Putative esterase. This family contains Esterase D. However it is not clear if all members of the family have the same function. This family is related to the pfam00135 family. 246
44268 395614 pfam00757 Furin-like Furin-like cysteine rich region. 143
44269 395615 pfam00758 EPO_TPO Erythropoietin/thrombopoietin. 160
44270 395616 pfam00759 Glyco_hydro_9 Glycosyl hydrolase family 9. 374
44271 395617 pfam00760 Cucumo_coat Cucumovirus coat protein. 175
44272 366290 pfam00761 Polyoma_coat2 Polyomavirus coat protein. 322
44273 395618 pfam00762 Ferrochelatase Ferrochelatase. 315
44274 395619 pfam00763 THF_DHG_CYH Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. 117
44275 279148 pfam00764 Arginosuc_synth Arginosuccinate synthase. This family contains a PP-loop motif. 386
44276 366292 pfam00765 Autoind_synth Autoinducer synthase. 182
44277 395620 pfam00766 ETF_alpha Electron transfer flavoprotein FAD-binding domain. This domain is found at the C-terminus of electron transfer flavoprotein alpha chain and binds to FAD. The fold consists of a five-stranded parallel beta sheet as the core of the domain, flanked by alternating helices. A small part of this domain is donated by the beta chain. 83
44278 279151 pfam00767 Poty_coat Potyvirus coat protein. 243
44279 395621 pfam00768 Peptidase_S11 D-alanyl-D-alanine carboxypeptidase. 236
44280 395622 pfam00769 ERM Ezrin/radixin/moesin family. This family of proteins contain a band 4.1 domain (pfam00373), at their amino terminus. This family represents the rest of these proteins. 244
44281 279154 pfam00770 Peptidase_C5 Adenovirus endoprotease. This family of adenovirus thiol endoproteases specifically cleave Gly-Ala peptides in viral precursor peptides. 179
44282 395623 pfam00771 FHIPEP FHIPEP family. 657
44283 395624 pfam00772 DnaB DnaB-like helicase N terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This N-terminal domain is required both for interaction with other proteins in the primosome and for DnaB helicase activity. 103
44284 395625 pfam00773 RNB RNB domain. This domain is the catalytic domain of ribonuclease II. 317
44285 395626 pfam00775 Dioxygenase_C Dioxygenase. 182
44286 395627 pfam00777 Glyco_transf_29 Glycosyltransferase family 29 (sialyltransferase). Members of this family belong to glycosyltransferase family 29. 267
44287 395628 pfam00778 DIX DIX domain. The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerization. It is involved in the homo-oligomerization of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain. 79
44288 395629 pfam00779 BTK BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains. The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region. 29
44289 395630 pfam00780 CNH CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations. 260
44290 395631 pfam00781 DAGK_cat Diacylglycerol kinase catalytic domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologs. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family. 125
44291 395632 pfam00782 DSPc Dual specificity phosphatase, catalytic domain. Ser/Thr and Tyr protein phosphatases. The enzyme's tertiary fold is highly similar to that of tyrosine-specific phosphatases, except for a "recognition" region. 127
44292 395633 pfam00784 MyTH4 MyTH4 domain. Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins. 104
44293 395634 pfam00786 PBD P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). 59
44294 395635 pfam00787 PX PX domain. PX domains bind to phosphoinositides. 84
44295 395636 pfam00788 RA Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Recent evidence (not yet in MEDLINE) shows that some RA domains do NOT bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. 93
44296 395637 pfam00789 UBX UBX domain. This domain is present in ubiquitin-regulatory proteins and is a general Cdc48-interacting module. 80
44297 395638 pfam00790 VHS VHS domain. Domain present in VPS-27, Hrs and STAM. 136
44298 395639 pfam00791 ZU5 ZU5 domain. Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function. 98
44299 395640 pfam00792 PI3K_C2 Phosphoinositide 3-kinase C2. Phosphoinositide 3-kinase region postulated to contain a C2 domain. Outlier of pfam00168 family. 136
44300 395641 pfam00793 DAHP_synth_1 DAHP synthetase I family. Members of this family catalyze the first step in aromatic amino acid biosynthesis from chorismate. E-coli has three related synthetases, which are inhibited by different aromatic amino acids. This family also includes KDSA which has very similar catalytic activity but is involved in the first step of liposaccharide biosynthesis. The enzyme is also part of the shikimate pathway, EC:2.5.1.54. 271
44301 395642 pfam00794 PI3K_rbd PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation). 106
44302 395643 pfam00795 CN_hydrolase Carbon-nitrogen hydrolase. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotinidase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. Nitrilase-related proteins generally have a conserved E-K-C catalytic triad, and are multimeric alpha-beta-beta-alpha sandwich proteins. 257
44303 279175 pfam00796 PSI_8 Photosystem I reaction centre subunit VIII. 24
44304 395644 pfam00797 Acetyltransf_2 N-acetyltransferase. Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30kDa. It facilitates the transfer of an acetyl group from Acetyl Coenzyme A on to a wide range of arylamine, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (M. tuberculosis, M. smegmatis etc) to man. It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat Tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid. 240
44305 279177 pfam00798 Arena_glycoprot Arenavirus glycoprotein. 483
44306 366313 pfam00799 Gemini_AL1 Geminivirus Rep catalytic domain. The AL1 proteins encodes the replication initiator protein (Rep) of geminiviruses, which is a replicon-specific initiator enzyme and is an essential component of the replisome. For geminivirus Rep protein, this N-terminal region is crucial for origin recognition and DNA cleavage and nucleotidyl transfer. 113
44307 395645 pfam00800 PDT Prephenate dehydratase. This protein is involved in Phenylalanine biosynthesis. This protein catalyzes the decarboxylation of prephenate to phenylpyruvate. 181
44308 395646 pfam00801 PKD PKD domain. This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold. 70
44309 144411 pfam00802 Glycoprotein_G Pneumovirus attachment glycoprotein G. This family includes attachment proteins from respiratory syncytial virus. Glycoprotein G has not been shown to have any neuraminidase or hemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues. 263
44310 279181 pfam00803 3A 3A/RNA2 movement protein family. This family includes movement proteins from various viruses. The 3A protein is found in bromoviruses and Cucumoviruses. The genome of these viruses contains 3 RNA segments. The third segment (RNA 3) contains two proteins, the coat protein and the 3A protein. The function of the 3A protein is uncertain but has been shown to be involved in cell-to-cell movement of the virus. The family also includes movement proteins from Dianthoviruses. 225
44311 395647 pfam00804 Syntaxin Syntaxin. Syntaxins are the prototype family of SNARE proteins. They usually consist of three main regions - a C-terminal transmembrane region, a central SNARE domain which is characteristic of and conserved in all syntaxins (pfam05739), and an N-terminal domain that is featured in this entry. This domain varies between syntaxin isoforms; in syntaxin 1A it is found as three alpha-helices with a left-handed twist. It may fold back on the SNARE domain to allow the molecule to adopt a 'closed' configuration that prevents formation of the core fusion complex - it thus has an auto-inhibitory role. The function of syntaxins is determined by their localization. They are involved in neuronal exocytosis, ER-Golgi transport and Golgi-endosome transport, for example. They also interact with other proteins as well as those involved in SNARE complexes. These include vesicle coat proteins, Rab GTPases, and tethering factors. 200
44312 395648 pfam00805 Pentapeptide Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid. 40
44313 395649 pfam00806 PUF Pumilio-family RNA binding repeat. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. The Pfam model does not necessarily recognize all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure. 35
44314 366317 pfam00807 Apidaecin Apidaecin. These antibacterial peptides are found in bees. These heat-stable, non-helical peptides are active against a wide range of plant-associated bacteria and some human pathogens. The Pfam alignment includes the propeptide and apidaecin sequence. 30
44315 395650 pfam00808 CBFD_NFYB_HMF Histone-like transcription factor (CBF/NF-Y) and archaeal histone. This family includes archaebacterial histones and histone like transcription factors from eukaryotes. 65
44316 395651 pfam00809 Pterin_bind Pterin binding enzyme. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyzes a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin. 243
44317 395652 pfam00810 ER_lumen_recept ER lumen protein retaining receptor. 143
44318 395653 pfam00811 Ependymin Ependymin. 124
44319 395654 pfam00812 Ephrin Ephrin. 137
44320 395655 pfam00813 FliP FliP family. 191
44321 395656 pfam00814 Peptidase_M22 Glycoprotease family. The Peptidase M22 proteins are part of the HSP70-actin superfamily. The region represented here is an insert into the fold and is not found in the rest of the family (beyond the Peptidase M22 family). Included in this family are the Rhizobial NodU proteins and the HypF regulator. This region also contains the histidine dyad believed to coordinate the metal ion and hence provide catalytic activity. Interestingly the histidines are not well conserved, and there is a lack of experimental evidence to support peptidase activity as a general property of this family. There also appear to be instances of this domain outside of the HSP70-actin superfamily. 272
44322 395657 pfam00815 Histidinol_dh Histidinol dehydrogenase. 410
44323 395658 pfam00816 Histone_HNS H-NS histone family. 91
44324 395659 pfam00817 IMS impB/mucB/samB family. These proteins are involved in UV protection. 148
44325 307113 pfam00818 Ice_nucleation Ice nucleation protein repeat. 15
44326 109859 pfam00819 Myotoxins Myotoxin, crotamine. Crotamine is a family of cationic peptides expressed by the venom gland of, for example, Crotalus durissus terrificus. It acts as a cell-penetrating peptide (CPP) and as a potent voltage-gated potassium channel (Kv) inhibitor. 42
44327 395660 pfam00820 Lipoprotein_1 Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain. 259
44328 395661 pfam00821 PEPCK_C Phosphoenolpyruvate carboxykinase C-terminal P-loop domain. Catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate. 358
44329 395662 pfam00822 PMP22_Claudin PMP-22/EMP/MP20/Claudin family. 162
44330 395663 pfam00823 PPE PPE family. This family is named after a PPE motif near to the amino terminus of the domain. The PPE family of proteins all contain an amino-terminal region of about 180 amino acids. The carboxyl termini of this family are variable, and on the basis of this region the proteins fall into at least three groups. The MPTR subgroup has tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group are only related in the amino terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis. 158
44331 395664 pfam00825 Ribonuclease_P Ribonuclease P. 107
44332 395665 pfam00827 Ribosomal_L15e Ribosomal L15. 191
44333 395666 pfam00828 Ribosomal_L27A Ribosomal proteins 50S-L15, 50S-L18e, 60S-L27A. This family includes higher eukaryotic ribosomal 60S L27A, archaeal 50S L18e, prokaryotic 50S L15, fungal mitochondrial L10, plant L27A, mitochondrial L15 and chloroplast L18-3 proteins. 127
44334 395667 pfam00829 Ribosomal_L21p Ribosomal prokaryotic L21 protein. 100
44335 395668 pfam00830 Ribosomal_L28 Ribosomal L28 family. The ribosomal 28 family includes L28 proteins from bacteria and chloroplasts. The L24 protein from yeast also contains a region of similarity to prokaryotic L28 proteins. L24 from yeast is also found in the large ribosomal subunit. 58
44336 395669 pfam00831 Ribosomal_L29 Ribosomal L29 protein. 56
44337 395670 pfam00832 Ribosomal_L39 Ribosomal L39 protein. 42
44338 395671 pfam00833 Ribosomal_S17e Ribosomal S17. 122
44339 395672 pfam00834 Ribul_P_3_epim Ribulose-phosphate 3 epimerase family. This enzyme catalyzes the conversion of D-ribulose 5-phosphate into D-xylulose 5-phosphate. 198
44340 395673 pfam00835 SNAP-25 SNAP-25 family. SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment. 55
44341 395674 pfam00836 Stathmin Stathmin family. The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerization of tubulin heterodimers. 136
44342 279210 pfam00837 T4_deiodinase Iodothyronine deiodinase. Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3). 237
44343 395675 pfam00838 TCTP Translationally controlled tumor protein. 165
44344 395676 pfam00839 Cys_rich_FGFR Cysteine rich repeat. This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). 58
44345 395677 pfam00840 Glyco_hydro_7 Glycosyl hydrolase family 7. 434
44346 395678 pfam00841 Protamine_P2 Sperm histone P2. This protein also known as protamine P2 can substitute for histones in the chromatin of sperm. The alignment contains both the sequence of the mature P2 protein and its propeptide. 89
44347 395679 pfam00842 Ala_racemase_C Alanine racemase, C-terminal domain. 126
44348 334282 pfam00843 Arena_nucleocap Arenavirus nucleocapsid N-terminal domain. This N-terminal domain folds into a novel structure with a deep cavity for binding the m7GpppN cap structure that is required for viral RNA transcription. 334
44349 307130 pfam00844 Gemini_coat Geminivirus coat protein/nuclear export factor BR1 family. It has been shown that the 104 N-terminal amino acids of the maize streak virus coat protein bind DNA non-specifically. This family also includes various geminivirus movement proteins that are nuclear export factors or shuttles. One member BR1 facilitates the export of both ds and ss DNA from the nucleus. 244
44350 307131 pfam00845 Gemini_BL1 Geminivirus BL1 movement protein. Geminiviruses encode two movement proteins that are essential for systemic infection of their host but dispensable for replication and encapsidation. 276
44351 279218 pfam00846 Hanta_nucleocap Hantavirus nucleocapsid protein. 429
44352 395680 pfam00847 AP2 AP2 domain. This 60 amino acid residue domain can bind to DNA and is found in transcription factor proteins. 52
44353 395681 pfam00848 Ring_hydroxyl_A Ring hydroxylating alpha subunit (catalytic domain). This family is the catalytic domain of aromatic-ring-hydroxylating dioxygenase systems. The active site contains a non-heme ferrous ion coordinated by three ligands. 210
44354 376401 pfam00849 PseudoU_synth_2 RNA pseudouridylate synthase. Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes RluD, a pseudouridylate synthase that converts specific uracils to pseudouridine in 23S rRNA. RluA from E. coli converts bases in both rRNA and tRNA. 151
44355 395682 pfam00850 Hist_deacetyl Histone deacetylase domain. Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyze the removal of the acetyl group. Histone deacetylases are related to other proteins. 297
44356 279223 pfam00851 Peptidase_C6 Helper component proteinase. This protein is found in genome polyproteins of potyviruses. 440
44357 395683 pfam00852 Glyco_transf_10 Glycosyltransferase family 10 (fucosyltransferase) C-term. This is the C-terminal domain of a family of fucosyltransferases. This enzyme transfers fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. The C-terminal domain is the likely binding-region for ADP (manuscript in publication). 173
44358 395684 pfam00853 Runt Runt domain. 127
44359 395685 pfam00854 PTR2 POT family. The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. 392
44360 395686 pfam00855 PWWP PWWP domain. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The domain binds to Histone-4 methylated at lysine-20, H4K20me, suggesting that it is a methyl-lysine recognition motif. Removal of two conserved aromatic residues in a hydrophobic cavity created by this domain within the full-length protein, Pdp1, abolishes the interaction of the protein with H4K20me3. In fission yeast, Set9 is the sole enzyme that catalyzes all three states of H4K20me, and Set9-mediated H4K20me is required for efficient recruitment of checkpoint protein Crb2 to sites of DNA damage. The methylation of H4K20 is involved in a diverse array of cellular processes, such as organising higher-order chromatin, maintaining genome stability, and regulating cell-cycle progression. 95
44361 395687 pfam00856 SET SET domain. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure. 115
44362 376404 pfam00857 Isochorismatase Isochorismatase family. This family are hydrolase enzymes. 173
44363 395688 pfam00858 ASC Amiloride-sensitive sodium channel. 405
44364 395689 pfam00859 CTF_NFI CTF/NF-I family transcription modulation region. 292
44365 395690 pfam00860 Xan_ur_permease Permease family. This family includes permeases for diverse substrates such as xanthine, uracil, and vitamin C. However many members of this family are functionally uncharacterized and may transport other substrates. Members of this family have ten predicted transmembrane helices. 389
44366 395691 pfam00861 Ribosomal_L18p Ribosomal L18 of archaea, bacteria, mitoch. and chloroplast. This family includes the large subunit ribosomal proteins from bacteria, archaea, the mitochondria and the chloroplast. It does not include the 60S L18 or L5 proteins from Metazoa. 116
44367 395692 pfam00862 Sucrose_synth Sucrose synthase. Sucrose synthases catalyze the synthesis of sucrose from UDP-glucose and fructose. This family includes the bulk of the sucrose synthase protein. However the carboxyl terminal region of the sucrose synthases belongs to the glycosyl transferase family pfam00534. 540
44368 279235 pfam00863 Peptidase_C4 Peptidase family C4. This peptidase is present in the nuclear inclusion protein of potyviruses. 243
44369 395693 pfam00864 P2X_receptor ATP P2X receptor. 363
44370 395694 pfam00865 Osteopontin Osteopontin. 291
44371 395695 pfam00866 Ring_hydroxyl_B Ring hydroxylating beta subunit. This subunit has a similar structure to NTF-2 and scytalone dehydratase. 144
44372 395696 pfam00867 XPG_I XPG I-region. 90
44373 395697 pfam00868 Transglut_N Transglutaminase family. 117
44374 395698 pfam00869 Flavi_glycoprot Flavivirus glycoprotein, central and dimerization domains. 300
44375 395699 pfam00870 P53 P53 DNA-binding domain. This family contains one anomalous member, viz: Zea mays (Q6JAD8). This sequence is identical to human P53 and would appear to be a human contaminant within the Zea mays sampling effort. 191
44376 395700 pfam00871 Acetate_kinase Acetokinase family. This family includes acetate kinase, butyrate kinase and 2-methylpropanoate kinase. 387
44377 307151 pfam00872 Transposase_mut Transposase, Mutator family. 380
44378 395701 pfam00873 ACR_tran AcrB/AcrD/AcrF family. Members of this family are integral membrane proteins. Some are involved in drug resistance. AcrB cooperates with a membrane fusion protein, AcrA, and an outer membrane channel TolC. The structure shows the AcrB forms a homotrimer. 1021
44379 395702 pfam00874 PRD PRD domain. The PRD domain (for PTS Regulation Domain), is the phosphorylatable regulatory domain found in bacterial transcriptional antiterminator such as BglG, SacY and LicT, as well as in activators such as MtlR and LevR. The PRD is phosphorylated on one or two conserved histidine residues. PRD-containing proteins are involved in the regulation of catabolic operons in Gram+ and Gram- bacteria and are often characterized by a short N-terminal effector domain that binds to either RNA (CAT-RBD for antiterminators pfam03123) or DNA (for activators), and a duplicated PRD module which is phosphorylated by the sugar phosphotransferase system (PTS) in response to the availability of carbon source. The phosphorylations modify the conformation and stability of the dimeric proteins and thereby the RNA- or DNA-binding activity of the effector domain. The structure of the LicT PRD domains has been solved in both the active (Structure 1h99) and inactive state (Structure 1tlv), revealing massive structural rearrangements upon activation. 90
44380 395703 pfam00875 DNA_photolyase DNA photolyase. This domain binds a light harvesting cofactor. 164
44381 395704 pfam00876 Innexin Innexin. This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins. 330
44382 395705 pfam00877 NLPC_P60 NlpC/P60 family. The function of this domain is unknown. It is found in several lipoproteins. 105
44383 395706 pfam00878 CIMR Cation-independent mannose-6-phosphate receptor repeat. The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. 145
44384 395707 pfam00879 Defensin_propep Defensin propeptide. 51
44385 395708 pfam00880 Nebulin Nebulin repeat. 28
44386 395709 pfam00881 Nitroreductase Nitroreductase family. The nitroreductase family comprises a group of FMN- or FAD-dependent and NAD(P)H-dependent enzymes able to metabolize nitrosubstituted compounds. 168
44387 395710 pfam00882 Zn_dep_PLPC Zinc dependent phospholipase C. 174
44388 395711 pfam00883 Peptidase_M17 Cytosol aminopeptidase family, catalytic domain. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. 303
44389 395712 pfam00884 Sulfatase Sulfatase. 298
44390 395713 pfam00885 DMRL_synthase 6,7-dimethyl-8-ribityllumazine synthase. This family includes the beta chain of 6,7-dimethyl-8-ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases. It has been established that lumazine synthase catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases are active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence. 134
44391 395714 pfam00886 Ribosomal_S16 Ribosomal protein S16. 61
44392 395715 pfam00887 ACBP Acyl CoA binding protein. 81
44393 395716 pfam00888 Cullin Cullin family. 610
44394 395717 pfam00889 EF_TS Elongation factor TS. 204
44395 395718 pfam00890 FAD_binding_2 FAD binding domain. This family includes members that bind FAD. This family includes the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase. 398
44396 395719 pfam00891 Methyltransf_2 O-methyltransferase. This family includes a range of O-methyltransferases. These enzymes utilize S-adenosyl methionine. 208
44397 307170 pfam00892 EamA EamA-like transporter family. This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Members of this family usually carry 5+5 transmembrane domains, and this domain attempts to model five of these. 136
44398 279265 pfam00893 Multi_Drug_Res Small Multidrug Resistance protein. This family is the Small Multidrug Resistance (SMR) family. Several members have been shown to export a range of toxins, including ethidium bromide and quaternary ammonium compounds, through coupling with proton influx. 93
44399 279266 pfam00894 Luteo_coat Luteovirus coat protein. 138
44400 395720 pfam00895 ATP-synt_8 ATP synthase protein 8. 54
44401 279268 pfam00897 Orbi_VP7 Orbivirus inner capsid protein VP7. In BTV, 260 trimers of VP7 are found in the core. The major proteins of the core are VP7 and VP3. VP7 forms an outer layer around VP3. 348
44402 307171 pfam00898 Orbi_VP2 Orbivirus outer capsid protein VP2. VP2 acts as an anchor for VP1 and VP3. VP2 contains a non-specific DNA and RNA binding domain in the N-terminus. 946
44403 395721 pfam00899 ThiF ThiF family. This domain is found in ubiquitin activating E1 family and members of the bacterial ThiF/MoeB/HesA family. It is repeated in Ubiquitin-activating enzyme E1. 243
44404 395722 pfam00900 Ribosomal_S4e Ribosomal family S4e. 75
44405 279272 pfam00901 Orbi_VP5 Orbivirus outer capsid protein VP5. Cryoelectron microscopy indicates that VP5 is a trimer, implying that there are 360 copies of VP5 per virion. 507
44406 395723 pfam00902 TatC Sec-independent protein translocase protein (TatC). The bacterial Tat system has a remarkable ability to transport folded proteins, even enzyme complexes, across the cytoplasmic membrane. It is structurally and mechanistically similar to the Delta pH-driven thylakoidal protein import pathway. A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N- and C-termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N-terminus and the first cytoplasmic loop region of the Escherichia coli TatC protein are essential for protein export. At least two TatC molecules co-exist within each Tat translocon. 210
44407 395724 pfam00903 Glyoxalase Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. 121
44408 395725 pfam00904 Involucrin Involucrin repeat. 9
44409 395726 pfam00905 Transpeptidase Penicillin binding protein transpeptidase domain. The active site serine is conserved in all members of this family. 292
44410 279277 pfam00906 Hepatitis_core Hepatitis core antigen. The core antigen of hepatitis viruses possesses a carboxyl terminus rich in arginine. On this basis it was predicted that the core antigen would bind DNA. There is some experimental evidence to support this. 273
44411 395727 pfam00907 T-box T-box. The T-box encodes a 180 amino acid domain that binds to DNA. Genes encoding T-box proteins are found in a wide range of animals, but not in other kingdoms such as plants. Family members are all thought to bind to the DNA consensus sequence TCACACCT. They are found exclusively in the nucleus, and perform DNA-binding and transcriptional activation/repression roles. They are generally required for development of the specific tissues they are expressed in, and mutations in T-box genes are implicated in human conditions such as DiGeorge syndrome and X-linked cleft palate, which feature malformations. 182
44412 395728 pfam00908 dTDP_sugar_isom dTDP-4-dehydrorhamnose 3,5-epimerase. Members of this family catalyze the isomerisation of dTDP-4-dehydro-6-deoxy-D-glucose with dTDP-4-dehydro-6-deoxy-L-mannose. The EC number of this enzyme is 5.1.3.13. 164
44413 395729 pfam00909 Ammonium_transp Ammonium Transporter Family. 399
44414 395730 pfam00910 RNA_helicase RNA helicase. This family includes RNA helicases thought to be involved in duplex unwinding during viral RNA replication. Members of this family are found in a variety of single stranded RNA viruses. 101
44415 395731 pfam00912 Transgly Transglycosylase. The penicillin-binding proteins are bifunctional proteins consisting of transglycosylase and transpeptidase in the N- and C-terminus respectively. The transglycosylase domain catalyzes the polymerization of murein glycan chains. 177
44416 395732 pfam00913 Trypan_glycop Trypanosome variant surface glycoprotein (A-type). The trypanosome parasite expresses these proteins to evade the immune response. This family includes a variety of surface proteins such as Trypanosoma brucei VSGs such as expression site associated gene (ESAG) 6 and 7. 367
44417 366366 pfam00915 Calici_coat Calicivirus coat protein. 290
44418 395733 pfam00916 Sulfate_transp Sulfate permease family. This family of integral membrane proteins are known as the Sulfate Permease (SulP) family. SulP is a large family found in all domains of life. Although sulfate is a commonly transported ion there are many other activities in this family. See the TCDB description for a comprehensive summary. 379
44419 334312 pfam00917 MATH MATH domain. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans. 113
44420 395734 pfam00918 Gastrin Gastrin/cholecystokinin family. 126
44421 395735 pfam00919 UPF0004 Uncharacterized protein family UPF0004. This family is the N terminal half of the Prosite family. The C-terminal half has been shown to be related to MiaB proteins. This domain is nearly always found in conjunction with pfam04055 and pfam01938 although its function is uncertain. 98
44422 395736 pfam00920 ILVD_EDD Dehydratase family. 518
44423 376417 pfam00921 Lipoprotein_2 Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain. 299
44424 366369 pfam00922 Phosphoprotein Vesiculovirus phosphoprotein. 204
44425 395737 pfam00923 TAL_FSA Transaldolase/Fructose-6-phosphate aldolase. Transaldolase (TAL) is an enzyme of the pentose phosphate pathway (PPP) found almost ubiquitously in the three domains of life (Archaea, Bacteria, and Eukarya). TAL shares a high degree of structural similarity and sequence identity with fructose-6-phosphate aldolase (FSA). They both belong to the class I aldolase family. Their protein structures have been revealed. 226
44426 395738 pfam00924 MS_channel Mechanosensitive ion channel. Two members of this protein family of M. jannaschii have been functionally characterized. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore this family are likely to also be MS channel proteins. 201
44427 395739 pfam00925 GTP_cyclohydro2 GTP cyclohydrolase II. GTP cyclohydrolase II catalyzes the first committed step in the biosynthesis of riboflavin. 123
44428 395740 pfam00926 DHBP_synthase 3,4-dihydroxy-2-butanone 4-phosphate synthase. 3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesized from ribulose 5-phosphate and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as a bifunctional enzyme with pfam00925. 191
44429 395741 pfam00927 Transglut_C Transglutaminase family, C-terminal ig like domain. 106
44430 395742 pfam00928 Adap_comp_sub Adaptor complexes medium subunit family. This family also contains members which are coatomer subunits. 259
44431 395743 pfam00929 RNase_T Exonuclease. This family includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III. 164
44432 395744 pfam00930 DPPIV_N Dipeptidyl peptidase IV (DPP IV) N-terminal region. This family is an alignment of the region to the N-terminal side of the active site. The Prosite motif does not correspond to this Pfam entry. 352
44433 395745 pfam00931 NB-ARC NB-ARC domain. 245
44434 395746 pfam00932 LTD Lamin Tail Domain. The lamin-tail domain (LTD), which has an immunoglobulin (Ig) fold, is found in Nuclear Lamins, Chlo1887 from Chloroflexus, and several bacterial proteins where it occurs with membrane associated hydrolases of the metallo-beta-lactamase, synaptojanin, and calcineurin-like phosphoesterase superfamilies. 106
44435 395747 pfam00933 Glyco_hydro_3 Glycosyl hydrolase family 3 N terminal domain. 316
44436 395748 pfam00934 PE PE family. This family is named after a PE motif near to the amino terminus of the domain. The PE family of proteins all contain an amino-terminal region of about 110 amino acids. The carboxyl termini of this family are variable and fall into several classes. The largest class of PE proteins is the highly repetitive PGRS class which have a high glycine content. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis. 91
44437 395749 pfam00935 Ribosomal_L44 Ribosomal protein L44. 76
44438 395750 pfam00936 BMC BMC domain. Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures. 74
44439 395751 pfam00937 Corona_nucleoca Coronavirus nucleocapsid protein. 346
44440 279306 pfam00938 Lipoprotein_3 Lipoprotein. This family of lipoproteins is Mycoplasma specific. 85
44441 279307 pfam00939 Na_sulph_symp Sodium:sulfate symporter transmembrane region. There are also some members in this family that do not match the Prosite motif, and belong to the subfamily SODIT1. 472
44442 395752 pfam00940 RNA_pol DNA-dependent RNA polymerase. This is a family of single chain RNA polymerases. 411
44443 395753 pfam00941 FAD_binding_5 FAD binding domain in molybdopterin dehydrogenase. 170
44444 395754 pfam00942 CBM_3 Cellulose binding domain. 82
44445 279311 pfam00943 Alpha_E2_glycop Alphavirus E2 glycoprotein. E2 forms a heterodimer with E1. The virus spikes are made up of 80 trimers of these heterodimers (sindbis virus). 403
44446 366379 pfam00944 Peptidase_S3 Alphavirus core protein. Also known as coat protein C and capsid protein C. This makes the literature very confusing. Alphaviruses consist of a nucleoprotein core, a lipid membrane which envelopes the core, and glycoprotein spikes protruding from the lipid membrane. 156
44447 395755 pfam00945 Rhabdo_ncap Rhabdovirus nucleocapsid protein. The Nucleocapsid (N) Protein is said to have a "tight" structure. The carboxyl end of the N-terminal domain possesses an RNA binding domain. Sequence alignments show 2 regions of reasonable conservation, approx. 64-103 and 201-329. A whole functional protein is required for encapsidation to take place. 409
44448 395756 pfam00946 Mononeg_RNA_pol Mononegavirales RNA dependent RNA polymerase. Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor. 1042
44449 395757 pfam00947 Pico_P2A Picornavirus core protein 2A. This protein is a protease, involved in cleavage of the polyprotein. 127
44450 279316 pfam00948 Flavi_NS1 Flavivirus non-structural Protein NS1. The NS1 protein is well conserved amongst the flaviviruses. It contains 12 cysteines, and undergoes glycosylation in a similar manner to other NS proteins. Mutational analysis has strongly implied a role for NS1 in the early stages of RNA replication. 360
44451 395758 pfam00949 Peptidase_S7 Peptidase S7, Flavivirus NS3 serine protease. The viral genome is a positive strand RNA that encodes a single polyprotein precursor. Processing of the polyprotein precursor into mature proteins is carried out by the host signal peptidase and by NS3 serine protease, which requires NS2B (pfam01002) as a cofactor. 129
44452 334323 pfam00950 ABC-3 ABC 3 transport family. 258
44453 109986 pfam00951 Arteri_Gl Arterivirus GL envelope glycoprotein. Arteriviruses encode 4 envelope proteins, Gl, Gs, M and N. The Gl envelope protein is encoded in ORF5 and is 30-45 kDa in size. Gl is heterogeneously glycosylated with N-acetyllactosamine in a cell-type-specific manner. The Gl glycoprotein expresses the neutralisation determinants. 179
44454 395759 pfam00952 Bunya_nucleocap Bunyavirus nucleocapsid (N) protein. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins. 229
44455 395760 pfam00953 Glycos_transf_4 Glycosyl transferase family 4. 160
44456 395761 pfam00954 S_locus_glycop S-locus glycoprotein domain. In Brassicaceae, self-incompatible plants have a self/non-self recognition system. This is sporophytically controlled by multiple alleles at a single locus (S). S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. This region is inferred to be a domain due to it having other domains adjacent to it. 111
44457 395762 pfam00955 HCO3_cotransp HCO3- transporter family. This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. 502
44458 395763 pfam00956 NAP Nucleosome assembly protein (NAP). NAP proteins are involved in moving histones into the nucleus, nucleosome assembly and chromatin fluidity. They affect the transcription of many genes. 258
44459 395764 pfam00957 Synaptobrevin Synaptobrevin. 89
44460 395765 pfam00958 GMP_synt_C GMP synthase C terminal domain. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. This family is the C-terminal domain specific to the GMP synthases EC:6.3.5.2. In prokaryotes this domain mediates dimerization. Eukaryotic GMP synthases are monomers. This domain in eukaryotes includes several large insertions that may form globular domains. 92
44461 395766 pfam00959 Phage_lysozyme Phage lysozyme. This family includes lambda phage lysozyme and E. coli endolysin. 107
44462 366388 pfam00960 Neocarzinostat Neocarzinostatin family. 110
44463 395767 pfam00961 LAGLIDADG_1 LAGLIDADG endonuclease. 101
44464 395768 pfam00962 A_deaminase Adenosine/AMP deaminase. 327
44465 395769 pfam00963 Cohesin Cohesin domain. Cohesin domains interact with a complementary domain, termed the dockerin domain. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. 139
44466 395770 pfam00964 Elicitin Elicitin. Elicitins form a novel class of plant necrotic proteins which are secreted by Phytophthora and Pythium fungi, parasites of many economically important crops. These proteins induce leaf necrosis in infected plants and elicit an incompatible hypersensitive-like reaction, leading to the development of a systemic acquired resistance against a range of fungal and bacterial plant pathogens. 88
44467 395771 pfam00965 TIMP Tissue inhibitor of metalloproteinase. Members of this family are common in extracellular regions of vertebrate species 183
44468 395772 pfam00967 Barwin Barwin family. 116
44469 395773 pfam00969 MHC_II_beta Class II histocompatibility antigen, beta domain. 75
44470 395774 pfam00970 FAD_binding_6 Oxidoreductase FAD-binding domain. 99
44471 250265 pfam00971 EIAV_GP90 EIAV coat protein, gp90. Equine infectious anaemia (EIAV). EIAV belongs to the family Retroviridae. EIAV gp90 is hypervariable in the carboxyl-end region and more stable in the amino-end region. This variability is a pathogenicity factor that allows the evasion of the host's immune response. 385
44472 366396 pfam00972 Flavi_NS5 Flavivirus RNA-directed RNA polymerase. Flaviviruses produce a polyprotein from the ssRNA genome. This protein is also known as NS5. This RNA-directed RNA polymerase possesses a number of short regions and motifs homologous to other RNA-directed RNA polymerases. 644
44473 395775 pfam00973 Paramyxo_ncap Paramyxovirus nucleocapsid protein. The nucleocapsid protein is referred to as NP. NP is the major structural component of the nucleocapsid. The protein is approx. 58 kDa. 2600 NP molecules go to tightly encapsidate the RNA. NP interacts with several other viral encoded proteins, all of which are involved in controlling replication. {NP-NP, NP-P, NP-(PL), and NP-V}. 524
44474 279337 pfam00974 Rhabdo_glycop Rhabdovirus spike glycoprotein. Frequently abbreviated to G protein. The glycoprotein spike is made up of a trimer of G proteins. Channel formed by glycoprotein spike is thought to function in a similar manner to Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating. 502
44475 395776 pfam00975 Thioesterase Thioesterase domain. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa. 223
44476 395777 pfam00976 ACTH_domain Corticotropin ACTH domain. 19
44477 395778 pfam00977 His_biosynth Histidine biosynthesis protein. Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in this family. Histidine is formed by several complex and distinct biochemical reactions catalyzed by eight enzymes. The enzymes in this Pfam entry are called His6 and His7 in eukaryotes and HisA and HisF in prokaryotes. The structure of HisA is known to be a TIM barrel fold. In some archaeal HisA proteins the TIM barrel is composed of two tandem repeats of a half barrel. This family belong to the common phosphate binding site TIM barrel family. 230
44478 395779 pfam00978 RdRP_2 RNA dependent RNA polymerase. This family may represent an RNA dependent RNA polymerase. The family also contains the following proteins: 2A protein from bromoviruses putative RNA dependent RNA polymerase from tobamoviruses Non structural polyprotein from togaviruses 440
44479 144537 pfam00979 Reovirus_cap Reovirus outer capsid protein, Sigma 3. Sigma 3 is the major outer capsid protein of reovirus. Sigma 3 is encoded by genome segment 4. Sigma 3 binds to double stranded RNA and associates with polypeptide u1 and its cleavage product u1C to form the outer shell of the virion. The Sigma 3 protein possesses a zinc-finger motif and an RNA-binding domain in the N and C termini respectively. This protein is also thought to play a role in pathogenesis. 367
44480 395780 pfam00980 Rota_Capsid_VP6 Rotavirus major capsid protein VP6. Rotaviruses consist of three concentric protein shells. The intermediate (middle) protein layer consists of 260 trimers of VP6. VP6 is the most abundant protein in the virion. VP6 is also involved in virion assembly, and possesses the ability to interact with VP2, VP4 and VP7. 396
44481 144538 pfam00981 Rota_NS53 Rotavirus RNA-binding Protein 53 (NS53). This protein is also known as NSP1. NS53 is encoded by gene 5. It is made in low levels in the infected cells and is a component of early replication. The protein is known to accumulate on the cytoskeleton of the infected cell. NS53 is an RNA binding protein that contains a characteristic cysteine rich region. 488
44482 395781 pfam00982 Glyco_transf_20 Glycosyltransferase family 20. Members of this family belong to glycosyl transferase family 20. OtsA (Trehalose-6-phosphate synthase) is homologous to regions in the subunits of yeast trehalose-6-phosphate synthase/phosphate complex. 470
44483 395782 pfam00983 Tymo_coat Tymovirus coat protein. 179
44484 395783 pfam00984 UDPG_MGDP_dh UDP-glucose/GDP-mannose dehydrogenase family, central domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 94
44485 144541 pfam00985 MSA_2 Merozoite Surface Antigen 2 (MSA-2) family. 171
44486 395784 pfam00986 DNA_gyraseB_C DNA gyrase B subunit, carboxyl terminus. The amino terminus of eukaryotic and prokaryotic DNA topoisomerase II are similar, but they have a different carboxyl terminus. The amino-terminal portion of the DNA gyrase B protein is thought to catalyze the ATP-dependent super-coiling of DNA. See pfam00204. The carboxyl-terminal end supports the complexation with the DNA gyrase A protein and the ATP-independent relaxation. This family also contains Topoisomerase IV. This is a bacterial enzyme that is closely related to DNA gyrase. 63
44487 395785 pfam00988 CPSase_sm_chain Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesize carbamoyl phosphate. See pfam00289. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. 128
44488 395786 pfam00989 PAS PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. 113
44489 395787 pfam00990 GGDEF Diguanylate cyclase, GGDEF domain. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. It has been shown to be homologous to the adenylyl cyclase catalytic domain and has diguanylate cyclase activity. This observation correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. In the WspR protein of Pseudomonas aeruginosa, the GGDEF domain acts as a diguanylate cyclase, Structure 3bre, when the whole molecule appears to form a tetramer consisting of two symmetrically-related dimers representing a biological unit. The active site is the GGD/EF motif, buried in the structure, and the cyclic dimeric guanosine monophosphate (c-di-GMP) bind to the inhibitory-motif RxxD on the surface. The enzyme thus catalyzes the cyclisation of two guanosine triphosphate (GTP) molecules to one c-di-GMP molecule. 160
44490 395788 pfam00992 Troponin Troponin. Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). This Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin. 132
44491 395789 pfam00993 MHC_II_alpha Class II histocompatibility antigen, alpha domain. 81
44492 395790 pfam00994 MoCF_biosynth Probable molybdopterin binding domain. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation. 143
44493 395791 pfam00995 Sec1 Sec1 family. 510
44494 395792 pfam00996 GDI GDP dissociation inhibitor. 436
44495 395793 pfam00997 Casein_kappa Kappa casein. Kappa-casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para kappa-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens. 160
44496 395794 pfam00998 RdRP_3 Viral RNA dependent RNA polymerase. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses. 486
44497 395795 pfam00999 Na_H_Exchanger Sodium/hydrogen exchanger family. Na/H antiporters are key transporters in maintaining the pH of actively metabolising cells. The molecular mechanisms of antiport are unclear. These antiporters contain 10-12 transmembrane regions (M) at the amino-terminus and a large cytoplasmic region at the carboxyl terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. 377
44498 395796 pfam01000 RNA_pol_A_bac RNA polymerase Rpb3/RpoA insert domain. Members of this family include: alpha subunit from eubacteria alpha subunits from chloroplasts Rpb3 subunits from eukaryotes RpoD subunits from archaeal 117
44499 110032 pfam01001 HCV_NS4b Hepatitis C virus non-structural protein NS4b. No precise function has been assigned to NS4b. However, it is known that NS4b interacts with NS4a and NS3 to form a large replicase complex to direct the viral RNA replication. 192
44500 279357 pfam01002 Flavi_NS2B Flavivirus non-structural protein NS2B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex. 127
44501 366413 pfam01003 Flavi_capsid Flavivirus capsid protein C. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. Multiple copies of the C protein form the nucleocapsid, which contains the ssRNA molecule. 117
44502 307237 pfam01004 Flavi_M Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the proteins entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus. 74
44503 279359 pfam01005 Flavi_NS2A Flavivirus non-structural protein NS2A. NS2A is a hydrophobic protein about 25 kDa in size. NS2A is cleaved from NS1 by a membrane bound host protease. NS2A has been found to associate with the dsRNA within the vesicle packages. It has also been found that NS2A associates with the known replicase components and so NS2A has been postulated to be part of this replicase complex. 215
44504 366414 pfam01006 HCV_NS4a Hepatitis C virus non-structural protein NS4a. NS4a forms an integral part of the NS3 serine protease, as it is required in a number of cases as a cofactor of cleavage. It has also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex. 55
44505 395797 pfam01007 IRK Inward rectifier potassium channel. 141
44506 395798 pfam01008 IF-2B Initiation factor 2 subunit family. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, initiation factor 2B subunits 1 and 2 from archaebacteria and some proteins of unknown function from prokaryotes. Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. Members of this family have also been characterized as 5-methylthioribose- 1-phosphate isomerases, an enzyme of the methionine salvage pathway. The crystal structure of Ypr118w, a non-essential, low-copy number gene product from Saccharomyces cerevisiae, reveals a dimeric protein with two domains and a putative active site cleft. 281
44507 366417 pfam01010 Proton_antipo_C NADH-dehyrogenase subunit F, TMs, (complex I) C-terminus. This sub-family represents a carboxyl terminal extension of pfam00361. It includes subunit 5 from chloroplasts, and bacterial subunit L. This sub-family is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. This family is largely a few TM regions of the F subunit of NADH-Ubiquinone oxidoreductase from plants. The TMs form part of the anti-porter subunit. 244
44508 395799 pfam01011 PQQ PQQ enzyme repeat. The family represent a single repeat of a beta propeller. This propeller has been found in several enzymes which utilize pyrrolo-quinoline quinone as a prosthetic group. 36
44509 395800 pfam01012 ETF Electron transfer flavoprotein domain. This family includes the homologous domain shared between the alpha and beta subunits of the electron transfer flavoprotein. 178
44510 395801 pfam01014 Uricase Uricase. 128
44511 395802 pfam01015 Ribosomal_S3Ae Ribosomal S3Ae family. 191
44512 395803 pfam01016 Ribosomal_L27 Ribosomal L27 protein. 77
44513 395804 pfam01017 STAT_alpha STAT protein, all-alpha domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017. 171
44514 395805 pfam01018 GTP1_OBG GTP1/OBG. The N-terminal domain of the GTPase OBG has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich these regions form six left-handed collagen-like helices packed and H-bonded together. 155
44515 395806 pfam01019 G_glu_transpept Gamma-glutamyltranspeptidase. 498
44516 395807 pfam01020 Ribosomal_L40e Ribosomal L40e family. Bovine L40 has been identified as a secondary RNA binding protein. L40 is fused to a ubiquitin protein. 48
44517 395808 pfam01021 TYA TYA transposon protein. Ty are yeast transposons. A 5.7kb transcript codes for p3 a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA a is cleaved to form 46kd protein which can form mature virion like particles. 98
44518 395809 pfam01022 HTH_5 Bacterial regulatory protein, arsR family. Members of this family contain a DNA binding 'helix-turn-helix' motif. This family includes other proteins which are not included in the Prosite definition. 47
44519 395810 pfam01023 S_100 S-100/ICaBP type calcium binding domain. The S-100 domain is a subfamily of the EF-hand calcium binding proteins. 43
44520 395811 pfam01024 Colicin Colicin pore forming domain. 183
44521 395812 pfam01025 GrpE GrpE. 164
44522 395813 pfam01026 TatD_DNase TatD related DNase. This family of proteins are related to a large superfamily of metalloenzymes. TatD, a member of this family has been shown experimentally to be a DNase enzyme. 253
44523 395814 pfam01027 Bax1-I Inhibitor of apoptosis-promoting Bax1. Programmed cell-death involves a set of Bcl-2 family proteins, some of which inhibit apoptosis (Bcl-2 and Bcl-XL) and some of which promote it (Bax and Bak). Human Bax inhibitor, BI-1, is an evolutionarily conserved integral membrane protein containing multiple membrane-spanning segments predominantly localized to intracellular membranes. It has 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are comprised of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest a function. As plant BI-1 appears to localize predominantly to the ER, we hypothesized that plant BI-1 could also regulate cell death triggered by ER stress. BI-1 appears to exert its effect through an interaction with calmodulin. The budding yeast member of this family has been found unexpectedly to encode a BH3 domain-containing protein (Ybh3p) that regulates the mitochondrial pathway of apoptosis in a phylogenetically conserved manner. Examination of the crystal structure of a bacterial member of this family shows that these proteins mediate a calcium leak across the membrane that is pH-dependent. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and pore-open conformation, at pHs of 8 and 6 respectively, and the pore can be opened by intracrystalline transition; together these findings suggest that pH controls the conformational transition. 206
44524 395815 pfam01028 Topoisom_I Eukaryotic DNA topoisomerase I, catalytic core. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and are vital for the processes of replication, transcription, and recombination. 198
44525 395816 pfam01029 NusB NusB family. The NusB protein is involved in the regulation of rRNA biosynthesis by transcriptional antitermination. 132
44526 395817 pfam01030 Recep_L_domain Receptor L domain. The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. 113
44527 395818 pfam01031 Dynamin_M Dynamin central region. This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169. 244
44528 395819 pfam01032 FecCD FecCD transport family. This is a sub-family of bacterial binding protein-dependent transport systems family. This Pfam entry contains the inner components of this multicomponent transport system. 311
44529 395820 pfam01033 Somatomedin_B Somatomedin B domain. 40
44530 395821 pfam01034 Syndecan Syndecan domain. Syndecans are transmembrane heparin sulfate proteoglycans which are implicated in the binding of extracellular matrix components and growth factors. 61
44531 395822 pfam01035 DNA_binding_1 6-O-methylguanine DNA methyltransferase, DNA binding domain. This domain is a 3 helical bundle. 81
44532 395823 pfam01036 Bac_rhodopsin Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light- dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). This family also includes distantly related proteins that do not contain the retinal binding lysine and so cannot function as opsins. 223
44533 395824 pfam01037 AsnC_trans_reg Lrp/AsnC ligand binding domain. The l-leucine-responsive regulatory protein (Lrp/AsnC) family is a family of similar bacterial transcription regulatory proteins. The family is named after two E. coli proteins involved in regulating amino acid metabolism. This entry corresponds to the usually C-terminal regulatory ligand binding domain. Structurally this domain has a dimeric alpha/beta barrel fold. 73
44534 395825 pfam01039 Carboxyl_trans Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction; transcarboxylation from biotin to an acceptor molecule. There are two recognized types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilize acyl-CoA as the acceptor molecule. 491
44535 395826 pfam01040 UbiA UbiA prenyltransferase family. 247
44536 395827 pfam01041 DegT_DnrJ_EryC1 DegT/DnrJ/EryC1/StrS aminotransferase family. The members of this family are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. The family includes StsA, StsC and StsS. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase EC:2.6.1.50, which catalyzes the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin. 360
44537 395828 pfam01042 Ribonuc_L-PSP Endoribonuclease L-PSP. Endoribonuclease active on single-stranded mRNA. Inhibits protein synthesis by cleavage of mRNA. Previously thought to inhibit protein synthesis initiation. This protein may also be involved in the regulation of purine biosynthesis. YjgF (renamed RidA) family members are enamine/imine deaminases. They hydrolyze reactive intermediates released by PLP-dependent enzymes, including threonine dehydratase. YjgF also prevents inhibition of transaminase B (IlvE) in Salmonella. 117
44538 395829 pfam01043 SecA_PP_bind SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain. 107
44539 395830 pfam01044 Vinculin Vinculin family. 791
44540 395831 pfam01047 MarR MarR family. The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif. 59
44541 395832 pfam01048 PNP_UDP_1 Phosphorylase superfamily. Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) 231
44542 395833 pfam01049 Cadherin_C Cadherin cytoplasmic region. Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. 148
44543 395834 pfam01050 MannoseP_isomer Mannose-6-phosphate isomerase. All of the members of this Pfam entry belong to family 2 of the mannose-6-phosphate isomerases. The type II phosphomannose isomerases are bifunctional enzymes. This Pfam entry covers the isomerase domain. The guanosine diphospho-D-mannose pyrophosphorylase domain is in another Pfam entry, see pfam00483. 151
44544 395835 pfam01051 Rep_3 Initiator Replication protein. This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E.coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene. 218
44545 395836 pfam01052 FliMN_C Type III flagellar switch regulator (C-ring) FliN C-term. This family includes the C-terminal region of flagellar motor switch proteins FliN and FliM. It is associated with family FliM, pfam02154 and family FliN_N pfam16973. 66
44546 395837 pfam01053 Cys_Met_Meta_PP Cys/Met metabolism PLP-dependent enzyme. This family includes enzymes involved in cysteine and methionine metabolism. The following are members: Cystathionine gamma-lyase, Cystathionine gamma-synthase, Cystathionine beta-lyase, Methionine gamma-lyase, OAH/OAS sulfhydrylase, O-succinylhomoserine sulfhydrylase. All of these members participate in slightly different reactions. All these enzymes use PLP (pyridoxal-5'-phosphate) as a cofactor. 376
44547 366439 pfam01054 MMTV_SAg Mouse mammary tumor virus superantigen. The mouse mammary tumor virus (MMTV) is a milk-transmitted type B retrovirus. The superantigen (SAg) is encoded by the long terminal repeat. The SAgs are also called PR73. 184
44548 395838 pfam01055 Glyco_hydro_31 Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises of enzymes that are, or similar to, alpha- galactosidases. 442
44549 395839 pfam01056 Myc_N Myc amino-terminal region. The myc family belongs to the basic helix-loop-helix leucine zipper class of transcription factors, see pfam00010. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication. Mutations in the C-terminal 20 residues of this domain cause unique changes in the induction of apoptosis, transformation, and G2 arrest. 346
44550 366441 pfam01057 Parvo_NS1 Parvovirus non-structural protein NS1. This family also contains the NS2 protein. Parvoviruses encode two non-structural proteins, NS1 and NS2. The mRNA for NS2 contains the coding sequence for the first 87 amino acids of NS1, then by an alternative splicing mechanism mRNA from a different reading frame, encoding the last 78 amino acids, makes up the full length of the NS2 mRNA. NS1, is the major non-structural protein. It is essential for DNA replication. It is an 83-kDa nuclear phosphoprotein. It has DNA helicase and ATPase activity. 271
44551 395840 pfam01058 Oxidored_q6 NADH ubiquinone oxidoreductase, 20 Kd subunit. 124
44552 366442 pfam01059 Oxidored_q5_N NADH-ubiquinone oxidoreductase chain 4, amino terminus. 110
44553 395841 pfam01060 TTR-52 Transthyretin-like family. TTR-52 was called family 2 in, and has weak similarity to transthyretin (formerly called pre-albumin) which transports thyroid hormones. The specific function of this protein is as a bridging molecule in apoptosis cross-linking dying cells to phagocytes. TTR-52 bridges by cross-linking surface-exposed phosphatidylserine (PtdSer) on apoptotic cells to the CED-1 receptor, a transmembrane receptor, on phagocytes. TTR-52 has an open beta-barrel-like structure. 79
44554 395842 pfam01061 ABC2_membrane ABC-2 type transporter. 204
44555 395843 pfam01062 Bestrophin Bestrophin, RFP-TM, chloride channel. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localized to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members. 286
44556 395844 pfam01063 Aminotran_4 Amino-transferase class IV. The D-amino acid transferases (D-AAT) are required by bacteria to catalyze the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building block for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-ATTs have strong similarity. 221
44557 395845 pfam01064 Activin_recp Activin types I and II receptor domain. This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box. 77
44558 395846 pfam01065 Adeno_hexon Hexon, adenovirus major coat protein, N-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length. 586
44559 395847 pfam01066 CDP-OH_P_transf CDP-alcohol phosphatidyltransferase. All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond. 65
44560 395848 pfam01067 Calpain_III Calpain large subunit, domain III. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. 135
44561 395849 pfam01068 DNA_ligase_A_M ATP dependent DNA ligase domain. This domain belongs to a more diverse superfamily, including pfam01331 and pfam01653. 203
44562 395850 pfam01070 FMN_dh FMN-dependent dehydrogenase. 350
44563 395851 pfam01071 GARS_A Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02786). 194
44564 366449 pfam01073 3Beta_HSD 3-beta hydroxysteroid dehydrogenase/isomerase family. The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase (3 beta-HSD) catalyzes the oxidation and isomerisation of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene steroid precursors into the corresponding 4-ene-ketosteroids necessary for the formation of all classes of steroid hormones. 279
44565 395852 pfam01074 Glyco_hydro_38 Glycosyl hydrolases family 38 N-terminal domain. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. 271
44566 395853 pfam01075 Glyco_transf_9 Glycosyltransferase family 9 (heptosyltransferase). Members of this family belong to glycosyltransferase family 9. Lipopolysaccharide is a major component of the outer leaflet of the outer membrane in Gram-negative bacteria. It is composed of three domains; lipid A, Core oligosaccharide and the O-antigen. All of these enzymes transfer heptose to the lipopolysaccharide core. 247
44567 395854 pfam01076 Mob_Pre Plasmid recombination enzyme. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation). 195
44568 395855 pfam01077 NIR_SIR Nitrite and sulphite reductase 4Fe-4S domain. Sulphite and nitrite reductases are vital in the biosynthetic assimilation of sulphur and nitrogen, respectively. They are also both important for the dissimilation of oxidized anions for energy transduction. 153
44569 307292 pfam01078 Mg_chelatase Magnesium chelatase, subunit ChlI. Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa. 207
44570 395856 pfam01079 Hint Hint module. This is an alignment of the Hint module in the Hedgehog proteins. It does not include any Inteins which also possess the Hint module. 211
44571 395857 pfam01080 Presenilin Presenilin. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in-vivo. This family also contains SPE proteins from C.elegans. 387
44572 395858 pfam01081 Aldolase KDPG and KHG aldolase. This family includes the following members: 4-hydroxy-2-oxoglutarate aldolase (KHG-aldolase) Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase) 196
44573 395859 pfam01082 Cu2_monooxygen Copper type II ascorbate-dependent monooxygenase, N-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold. 130
44574 395860 pfam01083 Cutinase Cutinase. 173
44575 395861 pfam01084 Ribosomal_S18 Ribosomal protein S18. 52
44576 395862 pfam01085 HH_signal Hedgehog amino-terminal signalling domain. For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation. 146
44577 395863 pfam01086 Clathrin_lg_ch Clathrin light chain. 168
44578 395864 pfam01087 GalP_UDP_transf Galactose-1-phosphate uridyl transferase, N-terminal domain. SCOP reports fold duplication with C-terminal domain. Both involved in Zn and Fe binding. 182
44579 395865 pfam01088 Peptidase_C12 Ubiquitin carboxyl-terminal hydrolase, family 1. 205
44580 395866 pfam01090 Ribosomal_S19e Ribosomal protein S19e. 137
44581 395867 pfam01091 PTN_MK_C PTN/MK heparin-binding protein family, C-terminal domain. 61
44582 395868 pfam01092 Ribosomal_S6e Ribosomal protein S6e. 124
44583 395869 pfam01093 Clusterin Clusterin. 417
44584 395870 pfam01094 ANF_receptor Receptor family ligand binding region. This family includes extracellular ligand binding domains of a wide range of receptors. This family also includes the bacterial amino acid binding proteins of known structure. 349
44585 395871 pfam01095 Pectinesterase Pectinesterase. 298
44586 395872 pfam01096 TFIIS_C Transcription factor S-II (TFIIS). 39
44587 395873 pfam01097 Defensin_2 Arthropod defensin. 34
44588 279444 pfam01098 FTSW_RODA_SPOVE Cell cycle protein. This entry includes the following members; FtsW, RodA, SpoVE 359
44589 395874 pfam01099 Uteroglobin Uteroglobin family. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated. 90
44590 395875 pfam01101 HMG14_17 HMG14 and HMG17. 90
44591 395876 pfam01102 Glycophorin_A Glycophorin A. 113
44592 395877 pfam01103 Bac_surface_Ag Surface antigen. This entry includes the following surface antigens; D15 antigen from H.influenzae, OMA87 from P.multocida, OMP85 from N.meningitidis and N.gonorrhoeae. The family also includes a number of eukaryotic proteins that are members of the UPF0140 family. There also appears to be a relationship to pfam03865 (personal obs: C Yeats). In eukaryotes, it appears that these proteins are not surface antigens; S. cerevisiae YNL026W (SAM50) is an essential component of the Sorting and Assembly Machinery (SAM) of the mitochondrial outer membrane. The protein was localized to the mitochondria. 323
44593 279449 pfam01104 Bunya_NS-S Bunyavirus non-structural protein NS-s. The NS-s protein is encoded by the S RNA. This segment also encodes for the N protein. These two proteins are encoded by overlapping reading frames. 91
44594 395878 pfam01105 EMP24_GP25L emp24/gp25L/p24 family/GOLD. Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. 181
44595 395879 pfam01106 NifU NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 67
44596 279452 pfam01107 MP Viral movement protein (MP). This family includes a variety of movement proteins (MPs). The MP is necessary for the initial cell-to-cell movement during the early stages of a viral infection. This movement is active, and it is known that the MP interacts with the plasmodesmata and possesses the ability to bind to RNA to achieve its role. This family also includes virus movement proteins from the caulimovirus family. It has been suggested in cauliflower mosaic virus that these proteins mediate viral movement by modifying plasmodesmata and forming tubules in the channel that can accommodate the virus particles. The family contains a conserved DXR motif that is probably functionally important. 191
44597 395880 pfam01108 Tissue_fac Tissue factor. This family is found in metazoa, and is very similar to the fibronectin type III domain. The family is found in cytokine receptors, interleukin and interferon receptors and coagulation factor III proteins. It occurs multiple times, as does fn3, family pfam00041. 107
44598 144630 pfam01109 GM_CSF Granulocyte-macrophage colony-stimulating factor. 122
44599 395881 pfam01110 CNTF Ciliary neurotrophic factor. 187
44600 395882 pfam01111 CKS Cyclin-dependent kinase regulatory subunit. 66
44601 395883 pfam01112 Asparaginase_2 Asparaginase. 304
44602 395884 pfam01113 DapB_N Dihydrodipicolinate reductase, N-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The N-terminal domain of DapB binds the dinucleotide NADPH. 121
44603 279458 pfam01114 Colipase Colipase, N-terminal domain. SCOP reports duplication of common fold with Colipase C-terminal domain. 40
44604 395885 pfam01115 F_actin_cap_B F-actin capping protein, beta subunit. 230
44605 395886 pfam01116 F_bP_aldolase Fructose-bisphosphate aldolase class-II. 277
44606 366474 pfam01117 Aerolysin Aerolysin toxin. This family represents the pore forming lobe of aerolysin. 359
44607 395887 pfam01118 Semialdhyde_dh Semialdehyde dehydrogenase, NAD binding domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase 119
44608 395888 pfam01119 DNA_mis_repair DNA mismatch repair protein, C-terminal domain. This family represents the C-terminal domain of the mutL/hexB/PMS1 family. This domain has a ribosomal S5 domain 2-like fold. 117
44609 395889 pfam01120 Alpha_L_fucos Alpha-L-fucosidase. 333
44610 395890 pfam01121 CoaE Dephospho-CoA kinase. This family catalyzes the phosphorylation of the 3'-hydroxyl group of dephosphocoenzyme A to form Coenzyme A EC:2.7.1.24. This enzyme uses ATP in its reaction. 179
44611 395891 pfam01122 Cobalamin_bind Eukaryotic cobalamin-binding protein. 300
44612 395892 pfam01123 Stap_Strp_toxin Staphylococcal/Streptococcal toxin, OB-fold domain. 79
44613 395893 pfam01124 MAPEG MAPEG family. This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity. 127
44614 395894 pfam01125 G10 G10 protein. 146
44615 395895 pfam01126 Heme_oxygenase Heme oxygenase. 204
44616 395896 pfam01127 Sdh_cyt Succinate dehydrogenase/Fumarate reductase transmembrane subunit. This family includes a transmembrane protein from both the Succinate dehydrogenase and Fumarate reductase complexes. 122
44617 395897 pfam01128 IspD 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase. Members of this family are enzymes which catalyze the formation of 4-diphosphocytidyl-2-C-methyl-D-erythritol from cytidine triphosphate and 2-C-methyl-D-erythritol 4-phosphate (MEP). 219
44618 279473 pfam01129 ART NAD:arginine ADP-ribosyltransferase. 222
44619 395898 pfam01130 CD36 CD36 family. The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. 453
44620 395899 pfam01131 Topoisom_bac DNA topoisomerase. This subfamily of topoisomerase is divided on the basis that these enzymes preferentially relax negatively supercoiled DNA, form a 5' phosphotyrosine linkage in the enzyme-DNA covalent intermediate and have high affinity for single-stranded DNA. 409
44621 395900 pfam01132 EFP Elongation factor P (EF-P) OB domain. 54
44622 395901 pfam01133 ER Enhancer of rudimentary. Enhancer of rudimentary is a protein of unknown function that is highly conserved in plants and animals. This protein is found to be an enhancer of the rudimentary gene. 98
44623 250388 pfam01134 GIDA Glucose inhibited division protein A. 391
44624 395902 pfam01135 PCMT Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT). 205
44625 395903 pfam01136 Peptidase_U32 Peptidase family U32. 233
44626 395904 pfam01137 RTC RNA 3'-terminal phosphate cyclase. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyze the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids. 324
44627 395905 pfam01138 RNase_PH 3' exoribonuclease family, domain 1. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria. 129
44628 395906 pfam01139 RtcB tRNA-splicing ligase RtcB. This family of RNA ligases (EC:6.5.1.3) join 2',3'-cyclic phosphate and 5'-OH ends. They catalyze the splicing of tRNA and may also participate in tRNA repair and recovery from stress-induced RNA damage. 415
44629 395907 pfam01140 Gag_MA Matrix protein (MA), p15. The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity. 126
44630 279483 pfam01141 Gag_p12 Gag polyprotein, inner coat protein p12. The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus. 85
44631 395908 pfam01142 TruD tRNA pseudouridine synthase D (TruD). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. The structure of TruD reveals an overall V-shaped molecule which contains an RNA-binding cleft. 415
44632 395909 pfam01144 CoA_trans Coenzyme A transferase. 216
44633 395910 pfam01145 Band_7 SPFH domain / Band 7 family. This family has been called SPFH, Band 7 or PHB domain. Recent phylogenetic analysis has shown this domain to be a slipin or Stomatin-like integral membrane domain conserved from protozoa to mammals. 176
44634 395911 pfam01146 Caveolin Caveolin. All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localized and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localization. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumor suppression. 131
44635 395912 pfam01147 Crust_neurohorm Crustacean CHH/MIH/GIH neurohormone family. 67
44636 395913 pfam01148 CTP_transf_1 Cytidylyltransferase family. The members of this family are integral membrane protein cytidylyltransferases. The family includes phosphatidate cytidylyltransferase EC:2.7.7.41 as well as Sec59 from yeast. Sec59 is a dolichol kinase EC:2.7.1.108. 264
44637 395914 pfam01149 Fapy_DNA_glyco Formamidopyrimidine-DNA glycosylase N-terminal domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the N-terminal domain, which contains eight beta-strands, forming a beta-sandwich with two alpha-helices parallel to its edges. 118
44638 395915 pfam01150 GDA1_CD39 GDA1/CD39 (nucleoside phosphatase) family. 423
44639 395916 pfam01151 ELO GNS1/SUR4 family. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1. 244
44640 395917 pfam01152 Bac_globin Bacterial-like globin. This family of heme binding proteins are found mainly in bacteria. However they can also be found in some protozoa and plants as well. 121
44641 395918 pfam01153 Glypican Glypican. 554
44642 307348 pfam01154 HMG_CoA_synt_N Hydroxymethylglutaryl-coenzyme A synthase N terminal. 173
44643 395919 pfam01155 HypA Hydrogenase/urease nickel incorporation, metallochaperone, hypA. HypA is a metallochaperone that binds nickel to bring it safely to its target. The targets for HypA are the nickel-containing enzymes [Ni,Fe]-hydrogenase and urease. The nickel coordinates with four nitrogens within the protein. The four conserved cysteines towards the C-terminus bind one zinc moiety, probably to stabilize the protein fold. 111
44644 395920 pfam01156 IU_nuc_hydro Inosine-uridine preferring nucleoside hydrolase. 255
44645 395921 pfam01157 Ribosomal_L21e Ribosomal protein L21e. 100
44646 395922 pfam01158 Ribosomal_L36e Ribosomal protein L36e. 96
44647 395923 pfam01159 Ribosomal_L6e Ribosomal protein L6e. 109
44648 395924 pfam01160 Opiods_neuropep Vertebrate endogenous opioids neuropeptide. 47
44649 395925 pfam01161 PBP Phosphatidylethanolamine-binding protein. 140
44650 395926 pfam01163 RIO1 RIO1 family. This is a family of atypical serine kinases which are found in archaea, bacteria and eukaryotes. Activity of Rio1 is vital in Saccharomyces cerevisiae for the processing of ribosomal RNA, as well as for proper cell cycle progression and chromosome maintenance. The structure of RIO1 has been determined. 184
44651 395927 pfam01165 Ribosomal_S21 Ribosomal protein S21. 51
44652 395928 pfam01166 TSC22 TSC-22/dip/bun family. 57
44653 395929 pfam01167 Tub Tub family. 250
44654 395930 pfam01168 Ala_racemase_N Alanine racemase, N-terminal domain. 220
44655 395931 pfam01169 UPF0016 Uncharacterized protein family UPF0016. This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. It has been suggested that these proteins are calcium transporters. 75
44656 395932 pfam01170 UPF0020 Putative RNA methylase family UPF0020. This domain is probably a methylase. It is associated with the THUMP domain that also occurs with RNA modification domains. 184
44657 395933 pfam01171 ATP_bind_3 PP-loop family. This family of proteins belongs to the PP-loop superfamily. 178
44658 395934 pfam01172 SBDS Shwachman-Bodian-Diamond syndrome (SBDS) protein. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterized by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism. In yeast these proteins have been shown to be critical for the release and recycling of the nucleolar shuttling factor Tif6 from pre-60S ribosomes, a key step in 60S maturation and translational activation of ribosomes. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition. 82
44659 334414 pfam01174 SNO SNO glutamine amidotransferase family. This family and its amidotransferase domain was first described in. It is predicted that members of this family are involved in the pyridoxine biosynthetic pathway, based on the proximity and co-regulation of the corresponding genes and physical interaction between the members of pfam01174 and pfam01680. 188
44660 395935 pfam01175 Urocanase Urocanase Rossmann-like domain. 209
44661 395936 pfam01176 eIF-1a Translation initiation factor 1A / IF-1. This family includes both the eukaryotic translation factor eIF-1A and the bacterial translation initiation factor IF-1. 62
44662 395937 pfam01177 Asp_Glu_race Asp/Glu/Hydantoin racemase. This family contains aspartate racemase, maleate isomerases EC:5.2.1.1, glutamate racemase, hydantoin racemase and arylmalonate decarboxylase EC:4.1.1.76. 210
44663 395938 pfam01179 Cu_amine_oxid Copper amine oxidase, enzyme domain. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme. 405
44664 395939 pfam01180 DHO_dh Dihydroorotate dehydrogenase. 291
44665 395940 pfam01182 Glucosamine_iso Glucosamine-6-phosphate isomerases/6-phosphogluconolactonase. 221
44666 395941 pfam01183 Glyco_hydro_25 Glycosyl hydrolases family 25. 180
44667 395942 pfam01184 Grp1_Fun34_YaaH GPR1/FUN34/yaaH family. The Ady2 protein is required for acetate in Saccharomyces cerevisiae, and is probably an acetate transporter. A homolog in Yarrowia lipolytica (GPR1) has a role in acetic acid sensitivity. 207
44668 395943 pfam01185 Hydrophobin Fungal hydrophobin. 79
44669 395944 pfam01186 Lysyl_oxidase Lysyl oxidase. 200
44670 395945 pfam01187 MIF Macrophage migration inhibitory factor (MIF). 114
44671 395946 pfam01189 Methyltr_RsmB-F 16S rRNA methyltransferase RsmB/F. This is the catalytic core of this SAM-dependent 16S ribosomal methyltransferase RsmB/F enzyme. There is a catalytic cysteine residue at 180 in UniProtKB:Q5SII2, with another highly conserved cysteine at residue 230. It methylates the C(5) position of cytosine 2870 (m5C2870) in 25S rRNA. 199
44672 395947 pfam01190 Pollen_Ole_e_I Pollen proteins Ole e I like. 94
44673 395948 pfam01191 RNA_pol_Rpb5_C RNA polymerase Rpb5, C-terminal domain. The assembly domain of Rpb5. The archaeal equivalent to this domain is subunit H. Subunit H lacks the N-terminal domain. 72
44674 395949 pfam01192 RNA_pol_Rpb6 RNA polymerase Rpb6. Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly. 53
44675 395950 pfam01193 RNA_pol_L RNA polymerase Rpb3/Rpb11 dimerization domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp. 191
44676 395951 pfam01194 RNA_pol_N RNA polymerases N / 8 kDa subunit. 59
44677 395952 pfam01195 Pept_tRNA_hydro Peptidyl-tRNA hydrolase. 177
44678 395953 pfam01196 Ribosomal_L17 Ribosomal protein L17. 97
44679 395954 pfam01197 Ribosomal_L31 Ribosomal protein L31. 65
44680 395955 pfam01198 Ribosomal_L31e Ribosomal protein L31e. 82
44681 395956 pfam01199 Ribosomal_L34e Ribosomal protein L34e. 94
44682 395957 pfam01200 Ribosomal_S28e Ribosomal protein S28e. 64
44683 395958 pfam01201 Ribosomal_S8e Ribosomal protein S8e. 126
44684 395959 pfam01202 SKI Shikimate kinase. 158
44685 395960 pfam01203 T2SSN Type II secretion system (T2SS), protein N. Members of the T2SN family are involved in the Type II protein secretion system. The precise function of these proteins is unknown. 207
44686 395961 pfam01204 Trehalase Trehalase. Trehalase (EC:3.2.1.28) is known to recycle trehalose to glucose. Trehalose is a physiological hallmark of heat-shock response in yeast and protects proteins and membranes against a variety of stresses. This family is found in conjunction with pfam07492 in fungi. 509
44687 395962 pfam01205 UPF0029 Uncharacterized protein family UPF0029. 103
44688 395963 pfam01206 TusA Sulfurtransferase TusA. This family includes the TusA sulfurtransferases. 65
44689 395964 pfam01207 Dus Dihydrouridine synthase (Dus). Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria. 310
44690 395965 pfam01208 URO-D Uroporphyrinogen decarboxylase (URO-D). 344
44691 395966 pfam01209 Ubie_methyltran ubiE/COQ5 methyltransferase family. 228
44692 395967 pfam01210 NAD_Gly3P_dh_N NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus. NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. 158
44693 395968 pfam01212 Beta_elim_lyase Beta-eliminating lyase. 283
44694 395969 pfam01213 CAP_N Adenylate cyclase associated (CAP) N terminal. 70
44695 395970 pfam01214 CK_II_beta Casein kinase II regulatory subunit. 182
44696 395971 pfam01215 COX5B Cytochrome c oxidase subunit Vb. 125
44697 395972 pfam01216 Calsequestrin Calsequestrin. 350
44698 395973 pfam01217 Clat_adaptor_s Clathrin adaptor complex small chain. 142
44699 395974 pfam01218 Coprogen_oxidas Coproporphyrinogen III oxidase. 292
44700 395975 pfam01219 DAGK_prokar Prokaryotic diacylglycerol kinase. 102
44701 395976 pfam01220 DHquinase_II Dehydroquinase class II. 138
44702 395977 pfam01221 Dynein_light Dynein light chain type 1. 83
44703 250456 pfam01222 ERG4_ERG24 Ergosterol biosynthesis ERG4/ERG24 family. 429
44704 395978 pfam01223 Endonuclease_NS DNA/RNA non-specific endonuclease. 219
44705 395979 pfam01225 Mur_ligase Mur ligase family, catalytic domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl, and FolC. MurC, MurD, MurE and MurF catalyze consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these N-acetylmuramic acid attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, Meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesized by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate. 84
44706 395980 pfam01226 Form_Nir_trans Formate/nitrite transporter. 244
44707 395981 pfam01227 GTP_cyclohydroI GTP cyclohydrolase I. This family includes GTP cyclohydrolase enzymes and a family of related bacterial proteins. 176
44708 395982 pfam01228 Gly_radical Glycine radical. 106
44709 395983 pfam01229 Glyco_hydro_39 Glycosyl hydrolases family 39. 490
44710 395984 pfam01230 HIT HIT domain. 98
44711 395985 pfam01231 IDO Indoleamine 2,3-dioxygenase. 410
44712 395986 pfam01232 Mannitol_dh Mannitol dehydrogenase Rossmann domain. 151
44713 395987 pfam01233 NMT Myristoyl-CoA:protein N-myristoyltransferase, N-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. 158
44714 395988 pfam01234 NNMT_PNMT_TEMT NNMT/PNMT/TEMT family. 261
44715 395989 pfam01235 Na_Ala_symp Sodium:alanine symporter family. 385
44716 395990 pfam01237 Oxysterol_BP Oxysterol-binding protein. 362
44717 395991 pfam01238 PMI_typeI Phosphomannose isomerase type I. This is a family of Phosphomannose isomerase type I enzymes (EC 5.3.1.8). 373
44718 395992 pfam01239 PPTA Protein prenyltransferase alpha subunit repeat. Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognize a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognizes a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family. 32
44719 395993 pfam01241 PSI_PSAK Photosystem I psaG / psaK. 69
44720 395994 pfam01242 PTPS 6-pyruvoyl tetrahydropterin synthase. 6-Pyruvoyl tetrahydrobiopterin synthase catalyzes the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin, the second of three enzymatic steps in the synthesis of tetrahydrobiopterin from GTP. The functional enzyme is a hexamer of identical subunits. 121
44721 395995 pfam01243 Putative_PNPOx Pyridoxamine 5'-phosphate oxidase. Family of domains with putative PNPOx function. Family members were predicted to encode pyridoxamine 5'-phosphate oxidase, based on sequence similarity. However, there is no experimental data to validate the predicted activity and purified proteins, such as yeast YLR456W and its paralogs, do not possess this activity, nor do they bind to flavin mononucleotide (FMN). To date, the only time functional oxidase activity has been experimentally demonstrated is when the sequences contain both pfam01243 and pfam10590. Moreover, some of the family members that contain both domains have been shown to be involved in phenazine biosynthesis. While some molecular function has been experimentally validated for the proteins containing both domains, the role performed by each domain on its own is unknown. 88
44722 395996 pfam01244 Peptidase_M19 Membrane dipeptidase (Peptidase family M19). 317
44723 395997 pfam01245 Ribosomal_L19 Ribosomal protein L19. 108
44724 395998 pfam01246 Ribosomal_L24e Ribosomal protein L24e. 63
44725 395999 pfam01247 Ribosomal_L35Ae Ribosomal protein L35Ae. 94
44726 396000 pfam01248 Ribosomal_L7Ae Ribosomal protein L7Ae/L30e/S12e/Gadd45 family. This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118. 95
44727 396001 pfam01249 Ribosomal_S21e Ribosomal protein S21e. 79
44728 396002 pfam01250 Ribosomal_S6 Ribosomal protein S6. 88
44729 396003 pfam01251 Ribosomal_S7e Ribosomal protein S7e. 184
44730 396004 pfam01252 Peptidase_A8 Signal peptidase (SPase) II. 140
44731 396005 pfam01253 SUI1 Translation initiation factor SUI1. 77
44732 366540 pfam01254 TP2 Nuclear transition protein 2. 133
44733 396006 pfam01255 Prenyltransf Putative undecaprenyl diphosphate synthase. Previously known as uncharacterized protein family UPF0015, a single member of this family has been identified as an undecaprenyl diphosphate synthase. 220
44734 396007 pfam01256 Carb_kinase Carbohydrate kinase. This family is related to pfam02110 and pfam00294 implying that it also is a carbohydrate kinase. (personal obs Yeats C). 242
44735 396008 pfam01257 2Fe-2S_thioredx Thioredoxin-like [2Fe-2S] ferredoxin. 145
44736 396009 pfam01258 zf-dskA_traR Prokaryotic dksA/traR C4-type zinc finger. 36
44737 396010 pfam01259 SAICAR_synt SAICAR synthetase. Also known as Phosphoribosylaminoimidazole-succinocarboxamide synthase. 225
44738 396011 pfam01261 AP_endonuc_2 Xylose isomerase-like TIM barrel. This TIM alpha/beta barrel structure is found in xylose isomerase and in endonuclease IV (EC:3.1.21.2). This domain is also found in the N termini of bacterial myo-inositol catabolism proteins. These are involved in the myo-inositol catabolism pathway, and are required for growth on myo-inositol in Rhizobium leguminosarum bv. viciae. 248
44739 396012 pfam01262 AlaDh_PNT_C Alanine dehydrogenase/PNT, C-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases. 214
44740 396013 pfam01263 Aldose_epim Aldose 1-epimerase. 300
44741 396014 pfam01264 Chorismate_synt Chorismate synthase. 344
44742 396015 pfam01265 Cyto_heme_lyase Cytochrome c/c1 heme lyase. 291
44743 396016 pfam01266 DAO FAD dependent oxidoreductase. This family includes various FAD dependent oxidoreductases: Glycerol-3-phosphate dehydrogenase EC:1.1.99.5, Sarcosine oxidase beta subunit EC:1.5.3.1, D-alanine oxidase EC:1.4.99.1, D-aspartate oxidase EC:1.4.3.1. 339
44744 396017 pfam01267 F-actin_cap_A F-actin capping protein alpha subunit. 265
44745 396018 pfam01268 FTHFS Formate--tetrahydrofolate ligase. 555
44746 396019 pfam01269 Fibrillarin Fibrillarin. 227
44747 396020 pfam01270 Glyco_hydro_8 Glycosyl hydrolases family 8. 321
44748 279595 pfam01271 Granin Granin (chromogranin or secretogranin). 584
44749 396021 pfam01272 GreA_GreB Transcription elongation factor, GreA/GreB, C-term. This domain has an FKBP-like fold. 77
44750 396022 pfam01273 LBP_BPI_CETP LBP / BPI / CETP family, N-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar. 164
44751 396023 pfam01274 Malate_synthase Malate synthase. 523
44752 396024 pfam01275 Myelin_PLP Myelin proteolipid protein (PLP or lipophilin). 233
44753 396025 pfam01276 OKR_DC_1 Orn/Lys/Arg decarboxylase, major domain. 417
44754 396026 pfam01277 Oleosin Oleosin. 113
44755 396027 pfam01278 Omptin Omptin family. The omptin family is a family of serine proteases. 282
44756 396028 pfam01279 Parathyroid Parathyroid hormone family. 106
44757 396029 pfam01280 Ribosomal_L19e Ribosomal protein L19e. 143
44758 396030 pfam01281 Ribosomal_L9_N Ribosomal protein L9, N-terminal domain. 46
44759 396031 pfam01282 Ribosomal_S24e Ribosomal protein S24e. 78
44760 396032 pfam01283 Ribosomal_S26e Ribosomal protein S26e. 105
44761 366555 pfam01284 MARVEL Membrane-associating domain. MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis. 136
44762 396033 pfam01285 TEA TEA/ATTS domain family. 68
44763 396034 pfam01286 XPA_N XPA protein N-terminal. 32
44764 396035 pfam01287 eIF-5a Eukaryotic elongation factor 5A hypusine, DNA-binding OB fold. eIF5A, previously thought to be an initiation factor, has been shown to be required for peptide chain elongation in yeast. 69
44765 396036 pfam01288 HPPK 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK). 128
44766 396037 pfam01289 Thiol_cytolysin Thiol-activated cytolysin. 354
44767 396038 pfam01290 Thymosin Thymosin beta-4 family. 39
44768 396039 pfam01291 LIF_OSM LIF / OSM family. 162
44769 396040 pfam01292 Ni_hydr_CYTB Prokaryotic cytochrome b561. This family includes cytochrome b561 and related proteins, in addition to the nickel-dependent hydrogenases b-type cytochrome subunit. Cytochrome b561 is a secretory vesicle-specific electron transport protein. It is an integral membrane protein, that binds two heme groups non-covalently. This is a prokaryotic family. Members of the 'eukaryotic cytochrome b561' family can be found in pfam03188. 180
44770 396041 pfam01293 PEPCK_ATP Phosphoenolpyruvate carboxykinase. 465
44771 396042 pfam01294 Ribosomal_L13e Ribosomal protein L13e. 180
44772 396043 pfam01295 Adenylate_cycl Adenylate cyclase, class-I. 601
44773 366564 pfam01296 Galanin Galanin. 29
44774 396044 pfam01297 ZnuA Zinc-uptake complex component A periplasmic. ZnuA includes periplasmic solute binding proteins such as TroA that interacts with an ATP-binding cassette transport system in Treponema pallidum. ZnuA is part of the bacterial zinc-uptake complex ZnuABC, whose components are the following families, ZinT, pfam09223, pfam00950, pfam00005, all of which are regulated by the transcription-regulator family FUR, pfam01475. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA (TroA), a high-affinity zinc-uptake protein. In Gram-negative bacteria the ZnuABC transporter system ensures an adequate import of zinc in Zn2+-poor environments, such as those encountered by pathogens within the infected host. 268
44775 396045 pfam01298 TbpB_B_D C-lobe and N-lobe beta barrels of Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. This domain family is found in C and N lobe eight stranded beta barrel region of TbpB proteins. The eight-stranded barrel domains in N and C lobe draw comparisons to eight-stranded beta barrel outer-membrane protein W (OmpW). However, the barrel domains of TbpB have the hydrophobic residues line the inner surface of the beta barrels to create a stable hydrophobic core. 125
44776 396046 pfam01299 Lamp Lysosome-associated membrane glycoprotein (Lamp). 148
44777 396047 pfam01300 Sua5_yciO_yrdC Telomere recombination. This domain has been shown to bind preferentially to dsRNA. The domain is found in SUA5 as well as HypF and YrdC. It has also been shown to be required for telomere recombination in yeast. 178
44778 396048 pfam01301 Glyco_hydro_35 Glycosyl hydrolases family 35. 316
44779 396049 pfam01302 CAP_GLY CAP-Gly domain. Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. 65
44780 366569 pfam01303 Egg_lysin Egg lysin (Sperm-lysin). Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg. 121
44781 279626 pfam01304 Gas_vesicle_C Gas vesicles protein GVPc repeated domain. 33
44782 396050 pfam01306 LacY_symp LacY proton/sugar symporter. This family is closely related to the sugar transporter family. 413
44783 279627 pfam01307 Plant_vir_prot Plant viral movement protein. This family includes several known plant viral movement proteins from a number of different ssRNA plant virus families including potexviruses, hordeiviruses and carlaviruses. 100
44784 279628 pfam01308 Chlam_OMP Chlamydia major outer membrane protein. The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I to IV). This protein is believed to be an integral part of pathogenesis, possibly adhesion. Along with the lipopolysaccharide, the major outer membrane protein (MOMP) makes up the surface of the elementary body cell. The MOMP is the protein used to determine the different serotypes. 397
44785 279629 pfam01309 EAV_GS Equine arteritis virus small envelope glycoprotein. Equine arteritis virus small envelope glycoprotein (Gs) is a class I transmembrane protein which adopts a number of different conformations. 196
44786 366571 pfam01310 Adeno_PVIII Adenovirus hexon associated protein, protein VIII. See pfam01065. This family represents Hexon. 216
44787 396051 pfam01311 Bac_export_1 Bacterial export proteins, family 1. This family includes the following members: FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR. All of these members export proteins that do not possess signal peptides through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways. 229
44788 396052 pfam01312 Bac_export_2 FlhB HrpN YscU SpaS Family. This family includes the following members: FlhB, HrpN, YscU, SpaS, HrcU, SsaU and YopU. All of these proteins export peptides using the type III secretion system. The peptides exported are quite diverse. 338
44789 396053 pfam01313 Bac_export_3 Bacterial export proteins, family 3. This family includes the following members: FliQ, MopD, HrcS, Hrp, YopS and SpaQ. All of these members export proteins that do not possess signal peptides through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways. 72
44790 396054 pfam01314 AFOR_C Aldehyde ferredoxin oxidoreductase, domains 2 & 3. Aldehyde ferredoxin oxidoreductase (AOR) catalyzes the reversible oxidation of aldehydes to their corresponding carboxylic acids with their accompanying reduction of the redox protein ferredoxin. This family is composed of two structural domains that bind the tungsten cofactor via DXXGL(C/D) motifs. In addition to maintaining specific binding interactions with the cofactor, another role for domains 2 and 3 may be to regulate substrate access to AOR. 388
44791 396055 pfam01315 Ald_Xan_dh_C Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. 107
44792 396056 pfam01316 Arg_repressor Arginine repressor, DNA binding domain. 69
44793 279637 pfam01318 Bromo_coat Bromovirus coat protein. 187
44794 396057 pfam01320 Colicin_Pyocin Colicin immunity protein / pyocin immunity protein. 82
44795 396058 pfam01321 Creatinase_N Creatinase/Prolidase N-terminal domain. This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. 127
44796 396059 pfam01322 Cytochrom_C_2 Cytochrome C'. 117
44797 396060 pfam01323 DSBA DSBA-like thioredoxin domain. This family contains a diverse set of proteins with a thioredoxin-like structure (pfam00085). This family also includes 2-hydroxychromene-2-carboxylate (HCCA) isomerase enzymes, which catalyze one step in prokaryotic polyaromatic hydrocarbon (PAH) catabolic pathways. This family also contains members with functions other than HCCA isomerisation, such as Kappa family GSTs, whose similarity to HCCA isomerases was not previously recognized. Some members have been annotated as dioxygenases, dehydrogenases, or putative glycerol-3-phosphate transfer proteins, but are most likely HCCA isomerase enzymes. 192
44798 396061 pfam01324 Diphtheria_R Diphtheria toxin, R domain. C-terminal receptor binding (R) domain - binds to cell surface receptor, permitting the toxin to enter the cell by receptor mediated endocytosis. 167
44799 396062 pfam01325 Fe_dep_repress Iron dependent repressor, N-terminal DNA binding domain. This family includes the Diphtheria toxin repressor. DNA binding is through a helix-turn-helix motif. 57
44800 396063 pfam01326 PPDK_N Pyruvate phosphate dikinase, PEP/pyruvate binding domain. This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP). 328
44801 396064 pfam01327 Pep_deformylase Polypeptide deformylase. 153
44802 396065 pfam01328 Peroxidase_2 Peroxidase, family 2. The peroxidases in this family do not have similarity to other peroxidases. 186
44803 396066 pfam01329 Pterin_4a Pterin 4 alpha carbinolamine dehydratase. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerization cofactor of hepatocyte nuclear factor 1-alpha). 88
44804 396067 pfam01330 RuvA_N RuvA N terminal domain. The N terminal domain of RuvA has an OB-fold structure. This domain forms the RuvA tetramer contacts. 61
44805 396068 pfam01331 mRNA_cap_enzyme mRNA capping enzyme, catalytic domain. This family represents the ATP binding catalytic domain of the mRNA capping enzyme. 194
44806 396069 pfam01333 Apocytochr_F_C Apocytochrome F, C-terminal. This is a sub-family of cytochrome C. See pfam00034. 115
44807 396070 pfam01335 DED Death effector domain. 82
44808 396071 pfam01336 tRNA_anti-codon OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (see pfam00152). Aminoacyl-tRNA synthetases catalyze the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a hetero-trimeric complex, that contains a subunit in this family. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain. 75
44809 396072 pfam01337 Barstar Barstar (barnase inhibitor). 82
44810 396073 pfam01338 Bac_thur_toxin Bacillus thuringiensis toxin. 230
44811 396074 pfam01339 CheB_methylest CheB methylesterase. 177
44812 279656 pfam01340 MetJ Met Apo-repressor, MetJ. 97
44813 396075 pfam01341 Glyco_hydro_6 Glycosyl hydrolases family 6. 293
44814 396076 pfam01342 SAND SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerization. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox carteri. This region of RegA is known as the VARL domain. 75
44815 396077 pfam01343 Peptidase_S49 Peptidase family S49. 154
44816 396078 pfam01344 Kelch_1 Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that the ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415. 46
44817 396079 pfam01345 DUF11 Domain of unknown function DUF11. A domain of unknown function found in multiple copies in several archaebacterial proteins. Conserved N-terminal lysine and C-terminal asparagine with central asp/glu suggests that many of these domains may contain an isopeptide bond. 114
44818 396080 pfam01346 FKBP_N Domain amino terminal to FKBP-type peptidyl-prolyl isomerase. This family is only found at the amino terminus of pfam00254. This domain is of unknown function. 97
44819 396081 pfam01347 Vitellogenin_N Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains. 582
44820 279664 pfam01348 Intron_maturas2 Type II intron maturase. Group II introns use intron-encoded reverse transcriptase, maturase and DNA endonuclease activities for site-specific insertion into DNA. Although this type of intron is self splicing in vitro they require a maturase protein for splicing in vivo. It has been shown that a specific region of the aI2 intron is needed for the maturase function. This region was found to be conserved in group II introns and called domain X. 140
44821 279665 pfam01349 Flavi_NS4B Flavivirus non-structural protein NS4B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4B protein is small and poorly conserved among the Flaviviruses. NS4B contains multiple hydrophobic potential membrane spanning regions. NS4B may form membrane components of the viral replication complex and could be involved in membrane localization of NS3 and pfam00972. 248
44822 279666 pfam01350 Flavi_NS4A Flavivirus non-structural protein NS4A. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4A protein is small and poorly conserved among the Flaviviruses. NS4A contains multiple hydrophobic potential membrane spanning regions. NS4A has only been found in cells infected by Kunjin virus. 144
44823 396082 pfam01351 RNase_HII Ribonuclease HII. 199
44824 396083 pfam01352 KRAB KRAB box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation. 42
44825 396084 pfam01353 GFP Green fluorescent protein. 212
44826 396085 pfam01355 HIPIP High potential iron-sulfur protein. 66
44827 396086 pfam01356 A_amylase_inhib Alpha amylase inhibitor. 68
44828 396087 pfam01357 Pollen_allerg_1 Pollen allergen. This family contains allergens lol PI, PII and PIII from Lolium perenne. 75
44829 396088 pfam01358 PARP_regulatory Poly A polymerase regulatory subunit. 292
44830 396089 pfam01359 Transposase_1 Transposase (partial DDE domain). This family includes the mariner transposase. 80
44831 396090 pfam01361 Tautomerase Tautomerase enzyme. This family includes the enzyme 4-oxalocrotonate tautomerase, which catalyzes the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate. 60
44832 396091 pfam01363 FYVE FYVE zinc finger. The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn++ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. We have included members which do not conserve these histidine residues but are clearly related. 68
44833 396092 pfam01364 Peptidase_C25 Peptidase family C25. 343
44834 396093 pfam01365 RYDR_ITPR RIH domain. The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3. 201
44835 279677 pfam01366 PRTP Herpesvirus processing and transport protein. The members of this family are associated with capsid intermediates during packaging of the virus. 653
44836 396094 pfam01367 5_3_exonuc 5'-3' exonuclease, C-terminal SAM fold. 93
44837 396095 pfam01368 DHH DHH family. It is predicted that this family of proteins all perform a phosphoesterase function. It includes the single stranded DNA exonuclease RecJ. 100
44838 396096 pfam01369 Sec7 Sec7 domain. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family. 183
44839 396097 pfam01370 Epimerase NAD dependent epimerase/dehydratase family. This family of proteins utilize NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions. 238
44840 396098 pfam01371 Trp_repressor Trp repressor protein. This protein binds to tryptophan and represses transcription of the Trp operon. 86
44841 279683 pfam01372 Melittin Melittin. 26
44842 366599 pfam01373 Glyco_hydro_14 Glycosyl hydrolase family 14. This family are beta amylases. 402
44843 396099 pfam01374 Glyco_hydro_46 Glycosyl hydrolase family 46. This family are chitosanase enzymes. 210
44844 366600 pfam01375 Enterotoxin_a Heat-labile enterotoxin alpha chain. 258
44845 396100 pfam01376 Enterotoxin_b Heat-labile enterotoxin beta chain. 102
44846 396101 pfam01378 IgG_binding_B B domain. This domain is found as a tandem repeat in Streptococcal cell surface proteins, such as the IgG binding protein G. 68
44847 396102 pfam01379 Porphobil_deam Porphobilinogen deaminase, dipyromethane cofactor binding domain. 203
44848 396103 pfam01380 SIS SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Presumably the SIS domains bind to the end-product of the pathway. 131
44849 396104 pfam01381 HTH_3 Helix-turn-helix. This large family of DNA binding helix-turn helix proteins includes Cro and CI. Within Neisseria gonorrhoeae NGO_0477, the full protein fold incorporates a helix-turn-helix motif, but the function of this member is unlikely to be that of a DNA-binding regulator, the function of most other members, so is not necessarily characteristic of the whole family. 55
44850 396105 pfam01382 Avidin Avidin family. 114
44851 396106 pfam01383 CpcD CpcD/allophycocyanin linker domain. 55
44852 396107 pfam01384 PHO4 Phosphate transporter family. This family includes PHO-4 from Neurospora crassa which is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor. 316
44853 396108 pfam01385 OrfB_IS605 Probable transposase. This family includes IS891, IS1136 and IS1341. DUF1225, pfam06774, has now been merged into this family. 120
44854 396109 pfam01386 Ribosomal_L25p Ribosomal L25p family. Ribosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress. 87
44855 396110 pfam01387 Synuclein Synuclein. There are three types of synucleins in humans, these are called alpha, beta and gamma. Alpha synuclein has been found mutated in families with autosomal dominant Parkinson's disease. A peptide of alpha synuclein has also been found in amyloid plaques in Alzheimer's patients. 133
44856 396111 pfam01388 ARID ARID/BRIGHT DNA binding domain. This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain. 87
44857 396112 pfam01389 OmpA_membrane OmpA-like transmembrane domain. The structure of OmpA transmembrane domain shows that it consists of an eight stranded beta barrel. This family includes some other distantly related outer membrane proteins with low scores. 177
44858 396113 pfam01390 SEA SEA domain. Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain. 106
44859 396114 pfam01391 Collagen Collagen triple helix repeat (20 copies). Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins. 57
44860 396115 pfam01392 Fz Fz domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines. 107
44861 396116 pfam01393 Chromo_shadow Chromo shadow domain. This domain is distantly related to pfam00385. This domain is always found in association with a chromo domain. 53
44862 396117 pfam01394 Clathrin_propel Clathrin propeller repeat. Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller. 37
44863 396118 pfam01395 PBP_GOBP PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP). 110
44864 307520 pfam01396 zf-C4_Topoisom Topoisomerase DNA binding C4 zinc finger. 39
44865 396119 pfam01397 Terpene_synth Terpene synthase, N-terminal domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiradiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase. 190
44866 396120 pfam01398 JAB JAB1/Mov34/MPN/PAD-1 ubiquitin protease. Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain, JABP1 domain or JAMM domain. These are metalloenzymes that function as the ubiquitin isopeptidase/deubiquitinase in the ubiquitin-based signalling and protein turnover pathways in eukaryotes. Versions of the domain in prokaryotic cognates of the ubiquitin-modification pathway are shown to have a similar role, and the archaeal protein from Haloferax volcanii is found to cleave ubiquitin-like small archaeal modifier proteins (SAMP1/2) from protein conjugates. 117
44867 396121 pfam01399 PCI PCI domain. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15). 105
44868 396122 pfam01400 Astacin Astacin (Peptidase family M12A). The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family contain two conserved disulphide bridges, these are joined 1-4 and 2-3. Members of this family have an amino terminal propeptide which is cleaved to give the active protease domain. All other linked domains are found to the carboxyl terminus of this domain. This family includes: Astacin, a digestive enzyme from Crayfish. Meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain. Proteins involved in morphogenesis and Tolloid from Drosophila. 192
44869 396123 pfam01401 Peptidase_M2 Angiotensin-converting enzyme. Members of this family are dipeptidyl carboxydipeptidases (cleave carboxyl dipeptides) and most notably convert angiotensin I to angiotensin II. Many members of this family contain a tandem duplication of the 600 amino acid peptidase domain, both of these are catalytically active. Most members are secreted membrane bound ectoenzymes. 581
44870 396124 pfam01402 RHH_1 Ribbon-helix-helix protein, copG family. The structure of this protein repressor, which is the shortest reported to date and the first isolated from a plasmid, has a homodimeric ribbon-helix-helix arrangement. The helix-turn-helix-like structure is involved in dimerization and not DNA binding as might have been expected. 39
44871 396125 pfam01403 Sema Sema domain. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor and plexin-A3. 406
44872 396126 pfam01404 Ephrin_lbd Ephrin receptor ligand binding domain. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand. 178
44873 396127 pfam01405 PsbT Photosystem II reaction centre T protein. The exact function of this protein is unknown. It probably consists of a single transmembrane spanning helix. The Chlamydomonas reinhardtii psbT protein appears to be (i) a novel photosystem II subunit and (ii) required for maintaining optimal photosystem II activity under adverse growth conditions. 29
44874 396128 pfam01406 tRNA-synt_1e tRNA synthetases class I (C) catalytic domain. This family includes only cysteinyl tRNA synthetases. 301
44875 279715 pfam01407 Gemini_AL3 Geminivirus AL3 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL3 protein comprises approximately 0.05% of the cellular proteins and is present in the soluble and organelle fractions. AL3 may form oligomers. Immunoprecipitation of AL3 in a baculovirus expression system extracts expressing both AL1 pfam00799 and AL3 showed that the two proteins also complex with each other. The AL3 protein is involved in viral replication. 119
44876 396129 pfam01408 GFO_IDH_MocA Oxidoreductase family, NAD-binding Rossmann fold. This family of enzymes utilize NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot. 120
44877 396130 pfam01409 tRNA-synt_2d tRNA synthetases class II core domain (F). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only phenylalanyl-tRNA synthetases. This is the core catalytic domain. 245
44878 396131 pfam01410 COLFI Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. 233
44879 279719 pfam01411 tRNA-synt_2c tRNA synthetases class II (A). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only alanyl-tRNA synthetases. 548
44880 396132 pfam01412 ArfGap Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. 117
44881 396133 pfam01413 C4 C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. 109
44882 396134 pfam01414 DSL Delta serrate ligand. 63
44883 396135 pfam01415 IL7 Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multi-functional cytokine that, although originally described as a T-cell growth factor, its function in T-cell response remains unclear. 152
44884 396136 pfam01416 PseudoU_synth_1 tRNA pseudouridine synthase. Involved in the formation of pseudouridine at the anticodon stem and loop of transfer-RNAs. Pseudouridine is an isomer of uridine (5-(beta-D-ribofuranosyl) uracil), and is the most abundant modified nucleoside found in all cellular RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved aspartic acid, likely involved in catalysis. 108
44885 396137 pfam01417 ENTH ENTH domain. The ENTH (Epsin N-terminal homology) domain is found in proteins involved in endocytosis and cytoskeletal machinery. The function of the ENTH domain is unknown. 124
44886 334531 pfam01418 HTH_6 Helix-turn-helix domain, rpiR family. This domain contains a helix-turn-helix motif. The best characterized member of this family is RpiR, a regulator of the expression of rpiB gene. 77
44887 396138 pfam01419 Jacalin Jacalin-like lectin domain. Proteins containing this domain are lectins. It is found in 1 to 6 copies in these proteins. The domain is also found in the animal prostatic spermine-binding protein. 134
44888 396139 pfam01420 Methylase_S Type I restriction modification DNA specificity domain. This domain is also known as the target recognition domain (TRD). Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity subunit (this family), two modification (M) subunits and two restriction (R) subunits. 167
44889 396140 pfam01421 Reprolysin Reprolysin (M12B) family zinc metalloprotease. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family are also known as adamalysins. Most members of this family are snake venom endopeptidases, but there are also some mammalian proteins and fertilin. Fertilin and closely related proteins appear to not have some active site residues and may not be active enzymes. 199
44890 396141 pfam01422 zf-NF-X1 NF-X1 type zinc finger. This domain is presumed to be a zinc binding domain. The following pattern describes the zinc finger. C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C Where X can be any amino acid, and numbers in brackets indicate the number of residues. Two positions can be either his or cys. The zinc fingers in NFX1 bind to DNA. 19
44891 396142 pfam01423 LSM LSM domain. The LSM domain contains Sm proteins as well as other related LSM (Like Sm) proteins. The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing contain seven Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which assemble around the Sm site present in four of the major spliceosomal small nuclear RNAs. The U6 snRNP binds to the LSM (Like Sm) proteins. Sm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Sm proteins. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. This family also includes the bacterial Hfq (host factor Q) proteins. Hfq are also RNA-binding proteins, that form hexameric rings. 66
44892 396143 pfam01424 R3H R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA. 60
44893 396144 pfam01425 Amidase Amidase. 442
44894 396145 pfam01426 BAH BAH domain. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction. 120
44895 279735 pfam01427 Peptidase_M15 D-ala-D-ala dipeptidase. 199
44896 396146 pfam01428 zf-AN1 AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues. 37
44897 396147 pfam01429 MBD Methyl-CpG binding domain. The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase. 76
44898 396148 pfam01430 HSP33 Hsp33 protein. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localized protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function. 277
44899 279739 pfam01431 Peptidase_M13 Peptidase family M13. Mammalian enzymes are typically type-II membrane anchored enzymes which are known, or believed to activate or inactivate oligopeptide (pro)-hormones such as opioid peptides. The family also contains a bacterial member believed to be involved with milk protein cleavage. 205
44900 396149 pfam01432 Peptidase_M3 Peptidase family M3. This is the Thimet oligopeptidase family, large family of mammalian and bacterial oligopeptidases that cleave medium sized peptides. The group also contains mitochondrial intermediate peptidase which is encoded by nuclear DNA but functions within the mitochondria to remove the leader sequence. 450
44901 396150 pfam01433 Peptidase_M1 Peptidase family M1 domain. Members of this family are aminopeptidases. The members differ widely in specificity, hydrolysing acidic, basic or neutral N-terminal residues. This family includes leukotriene-A4 hydrolase, this enzyme also has an aminopeptidase activity. 220
44902 396151 pfam01434 Peptidase_M41 Peptidase family M41. 190
44903 396152 pfam01435 Peptidase_M48 Peptidase family M48. Peptidase_M48 is the largely extracellular catalytic region of CAAX prenyl protease homologs such as Human FACE-1 protease. These are metallopeptidases, with the characteristic HExxH motif giving the two histidine-zinc-ligands and an adjacent glutamate on the next helix being the third. The whole molecule folds to form a deep groove/cleft into which the substrate can fit. 198
44904 396153 pfam01436 NHL NHL repeat. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in Bos taurus PAM; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. The E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. 28
44905 396154 pfam01437 PSI Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman). 52
44906 396155 pfam01439 Metallothio_2 Metallothionein. Members of this family are metallothioneins. These proteins are cysteine rich proteins that bind to heavy metals. Members of this family appear to be closest to Class II metallothioneins, see pfam00131. 80
44907 279747 pfam01440 Gemini_AL2 Geminivirus AL2 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL2 gene product transactivates expression of TGMV coat protein gene, and BR1 movement protein. 126
44908 396156 pfam01441 Lipoprotein_6 Lipoprotein. Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens. 170
44909 396157 pfam01442 Apolipoprotein Apolipoprotein A1/A4/E domain. These proteins contain several 22 residue repeats which form a pair of alpha helices. This family includes: Apolipoprotein A-I. Apolipoprotein A-IV. Apolipoprotein E. 170
44910 366646 pfam01443 Viral_helicase1 Viral (Superfamily 1) RNA helicase. Helicase activity for this family has been demonstrated and NTPase activity. This helicase has multiple roles at different stages of viral RNA replication, as dissected by mutational analysis. 227
44911 279751 pfam01445 SH Viral small hydrophobic protein. The SH (small hydrophobic) protein is a membrane protein of uncertain function. 57
44912 396158 pfam01446 Rep_1 Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase. 248
44913 396159 pfam01447 Peptidase_M4 Thermolysin metallopeptidase, catalytic domain. 147
44914 396160 pfam01448 ELM2 ELM2 domain. The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N-terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in ARID1. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain. 53
44915 396161 pfam01450 IlvC Acetohydroxy acid isomeroreductase, catalytic domain. Acetohydroxy acid isomeroreductase catalyzes the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine. 138
44916 396162 pfam01451 LMWPc Low molecular weight phosphotyrosine protein phosphatase. 142
44917 279757 pfam01452 Rota_NSP4 Rotavirus non structural protein. This protein has been called NSP4, NSP5, NS28, and NCVP5. The final steps in the assembly of rotavirus occur in the lumen of the endoplasmic reticulum (ER). Targeting of the immature inner capsid particle (ICP) to this compartment is mediated by the cytoplasmic tail of NSP4, located in the ER membrane. 173
44918 396163 pfam01453 B_lectin D-mannose binding lectin. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria. 105
44919 396164 pfam01454 MAGE MAGE family. The MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumors but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. This family also contains the yeast protein, Nse3. The Nse3 protein is part of the Smc5-6 complex. Nse3 has been demonstrated to be important for meiosis. 202
44920 396165 pfam01455 HupF_HypC HupF/HypC family. 65
44921 250634 pfam01456 Mucin Mucin-like glycoprotein. This family of trypanosomal proteins resemble vertebrate mucins. The protein consists of three regions. The N and C termini are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells. This family contains an N-terminal signal peptide. 143
44922 366652 pfam01457 Peptidase_M8 Leishmanolysin. 529
44923 396166 pfam01458 UPF0051 Uncharacterized protein family (UPF0051). 218
44924 396167 pfam01459 Porin_3 Eukaryotic porin. 270
44925 396168 pfam01462 LRRNT Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. 28
44926 279765 pfam01463 LRRCT Leucine rich repeat C-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the C-terminus of tandem leucine rich repeats. 26
44927 396169 pfam01464 SLT Transglycosylase SLT domain. This family is distantly related to pfam00062. Members are found in phages, type II, type III and type IV secretion systems. 114
44928 396170 pfam01465 GRIP GRIP domain. The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1. 43
44929 396171 pfam01466 Skp1 Skp1 family, dimerization domain. 48
44930 396172 pfam01467 CTP_transf_like Cytidylyltransferase-like. This family includes: Cholinephosphate cytidylyltransferase; glycerol-3-phosphate cytidylyltransferase. It also includes putative adenylyltransferases, and FAD synthases. 134
44931 396173 pfam01468 GA GA module. The GA (protein G-related Albumin-binding) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from peptostreptococcal albumin-binding protein shows a strong affinity for albumin. 55
44932 279771 pfam01469 Pentapeptide_2 Pentapeptide repeats (8 copies). These repeats are found in many mycobacterial proteins. These repeats are most common in the pfam00823 family of proteins, where they are found in the MPTR subfamily of PPE proteins. The function of these repeats is unknown. The repeat can be approximately described as XNXGX, where X can be any amino acid. These repeats are similar to pfam00805, however it is not clear if these two families are structurally related. 39
44933 396174 pfam01470 Peptidase_C15 Pyroglutamyl peptidase. 201
44934 396175 pfam01471 PG_binding_1 Putative peptidoglycan binding domain. This domain is composed of three alpha helices. This domain is found at the N or C-terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. 57
44935 396176 pfam01472 PUA PUA domain. The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain. 74
44936 366661 pfam01473 CW_binding_1 Putative cell wall binding repeat. These repeats are characterized by conserved aromatic residues and glycines, and are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Streptococcus phage Cp-1 lysozyme might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci and shown to be involved in glucan binding as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known. 19
44937 396177 pfam01474 DAHP_synth_2 Class-II DAHP synthetase family. Members of this family are aldolase enzymes that catalyze the first step of the shikimate pathway. 437
44938 396178 pfam01475 FUR Ferric uptake regulator family. This family includes metal ion uptake regulator proteins that bind to the operator DNA and control transcription of metal ion-responsive genes. This family is also known as the FUR family. 120
44939 396179 pfam01476 LysM LysM domain. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known. 43
44940 396180 pfam01477 PLAT PLAT/LH2 domain. This domain is found in a variety of membrane or lipid associated proteins. It is called the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. The known structure of pancreatic lipase shows this domain binds to procolipase pfam01114, which mediates membrane association. So it appears possible that this domain mediates membrane attachment via other protein binding partners. The structure of this domain is known for many members of the family and is composed of a beta sandwich. 115
44941 396181 pfam01478 Peptidase_A24 Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea. 101
44942 396182 pfam01479 S4 S4 domain. The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterized, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA. 48
44943 396183 pfam01480 PWI PWI domain. 70
44944 396184 pfam01481 Arteri_nucleo Arterivirus nucleocapsid protein. 116
44945 396185 pfam01483 P_proprotein Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. 86
44946 396186 pfam01484 Col_cuticle_N Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens, see pfam01391. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins. 48
44947 396187 pfam01485 IBR IBR domain, a half RING-finger domain. The IBR (In Between Ring fingers) domain is often found to occur between pairs of ring fingers (pfam00097). This domain has also been called the C6HC domain and DRIL (for double RING finger linked) domain. Proteins that contain two Ring fingers and an IBR domain (these proteins are also termed RBR family proteins) are thought to exist in all eukaryotic organisms. RBR family members play roles in protein quality control and can indirectly regulate transcription. Evidence suggests that RBR proteins are often parts of cullin-containing ubiquitin ligase complexes. The ubiquitin ligase Parkin is an RBR family protein whose mutations are involved in forms of familial Parkinson's disease. 59
44948 396188 pfam01486 K-box K-box region. The K-box region is commonly found associated with SRF-type transcription factors see pfam00319. The K-box is a possible coiled-coil structure. Possible role in multimer formation. 91
44949 396189 pfam01487 DHquinase_I Type I 3-dehydroquinase. Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase.) catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate giving dehydroshikimate. Dehydroquinase functions in the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Type II 3-dehydroquinase catalyzes the trans-dehydration of 3-dehydroshikimate see pfam01220. 227
44950 396190 pfam01488 Shikimate_DH Shikimate / quinate 5-dehydrogenase. This family contains both shikimate and quinate dehydrogenases. Shikimate 5-dehydrogenase catalyzes the conversion of shikimate to 5-dehydroshikimate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Quinate 5-dehydrogenase catalyzes the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway where quinic acid is exploited as a source of carbon in prokaryotes and microbial eukaryotes. Both the shikimate and quinate pathways share two common pathway metabolites 3-dehydroquinate and dehydroshikimate. 136
44951 279788 pfam01490 Aa_trans Transmembrane amino acid transporter protein. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases. 410
44952 396191 pfam01491 Frataxin_Cyay Frataxin-like domain. This family contains proteins that have a domain related to the globular C-terminus of Frataxin the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport. 104
44953 279790 pfam01492 Gemini_C4 Geminivirus C4 protein. This family consists of the N terminal region of geminivirus C4 or AC4 proteins. In Tomato yellow leaf curl geminivirus (TYLCV) the C4 protein is necessary for efficient spreading of the virus in tomato plants. 84
44954 396192 pfam01493 GXGXG GXGXG motif. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment. 190
44955 396193 pfam01494 FAD_binding_3 FAD binding domain. This domain is involved in FAD binding in a number of enzymes. 348
44956 396194 pfam01496 V_ATPase_I V-type ATPase 116kDa subunit family. This family consists of the 116kDa V-type ATPase (vacuolar (H+)-ATPases) subunits, as well as V-type ATP synthase subunit i. The V-type ATPases family are proton pumps that acidify intracellular compartments in eukaryotic cells, for example yeast central vacuoles, clathrin-coated and synaptic vesicles. They have important roles in membrane trafficking processes. The 116kDa subunit (subunit a) in the V-type ATPase is part of the V0 functional domain responsible for proton transport. The a subunit is a transmembrane glycoprotein with multiple putative transmembrane helices; it has a hydrophilic amino terminal and a hydrophobic carboxy terminal. It has roles in proton transport and assembly of the V-type ATPase complex. This subunit is encoded by two homologous genes in yeast, VPH1 and STV1. 756
44957 396195 pfam01497 Peripla_BP_2 Periplasmic binding protein. This family includes bacterial periplasmic binding proteins. Several of which are involved in iron transport. 234
44958 366677 pfam01498 HTH_Tnp_Tc3_2 Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of C.elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in. Tc3 is a member of the Tc1/mariner family of transposable elements. 72
44959 396196 pfam01499 Herpes_UL25 Herpesvirus UL25 family. The herpesvirus UL25 gene product is a virion component involved in virus penetration and capsid assembly. The product of the UL25 gene is required for packaging but not cleavage of replicated viral DNA. This family includes a number of herpesvirus proteins: EHV-1 36, EBV BVRF1, HCMV UL77, ILTV ORF2, and VZV gene 34. 541
44960 366678 pfam01500 Keratin_B2 Keratin, high sulfur B2 protein. High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibers in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1. 161
44961 279798 pfam01501 Glyco_transf_8 Glycosyl transferase family 8. This family includes enzymes that transfer sugar residues to donor molecules. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. This family includes Lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, and glycogenin glucosyltransferase. 252
44962 396197 pfam01502 PRA-CH Phosphoribosyl-AMP cyclohydrolase. This enzyme catalyzes the third step in the histidine biosynthetic pathway. It requires Zn ions for activity. 74
44963 396198 pfam01503 PRA-PH Phosphoribosyl-ATP pyrophosphohydrolase. This enzyme catalyzes the second step in the histidine biosynthetic pathway. 83
44964 396199 pfam01504 PIP5K Phosphatidylinositol-4-phosphate 5-Kinase. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyzes the formation of phosphoinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway. 284
44965 396200 pfam01505 Vault Major Vault Protein repeat. The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. 41
44966 366682 pfam01506 HCV_NS5a Hepatitis C virus non-structural 5a protein membrane anchor. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. The N-terminal region of the NS5a protein has been used in the construction of the alignment for this family. The C-terminal region has not been included because it is too heterogeneous. 23
44967 396201 pfam01507 PAPS_reduct Phosphoadenosine phosphosulfate reductase family. This domain is found in phosphoadenosine phosphosulfate (PAPS) reductase enzymes or PAPS sulfotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium which has ATP sulfurylase activity (sulfate adenylate transferase). 173
44968 279805 pfam01508 Paramecium_SA Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. The domain contains 8 cysteine residues. 62
44969 396202 pfam01509 TruB_N TruB family pseudouridylate synthase (N terminal domain). Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes TruB, a pseudouridylate synthase that specifically converts uracil 55 to pseudouridine in most tRNAs. This family also includes Cbf5p that modifies rRNA. 148
44970 396203 pfam01510 Amidase_2 N-acetylmuramoyl-L-alanine amidase. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding. 121
44971 396204 pfam01512 Complex1_51K Respiratory-chain NADH dehydrogenase 51 Kd subunit. 150
44972 396205 pfam01513 NAD_kinase ATP-NAD kinase. Members of this family include ATP-NAD kinases EC:2.7.1.23, which catalyzes the phosphorylation of NAD to NADP utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. Also includes NADH kinases EC:2.7.1.86. 285
44973 396206 pfam01514 YscJ_FliF Secretory protein of YscJ/FliF family. This family includes proteins that are related to the YscJ lipoprotein, and the amino terminus of FliF, the flagellar M-ring protein. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins. 179
44974 396207 pfam01515 PTA_PTB Phosphate acetyl/butyryl transferase. This family contains both phosphate acetyltransferase and phosphate butyryltransferase. These enzymes catalyze the transfer of an acetyl or butyryl group to orthophosphate. 318
44975 279812 pfam01516 Orbi_VP6 Orbivirus helicase VP6. The VP6 protein, a minor protein in the core of the virion, is probably the viral helicase. 324
44976 279813 pfam01517 HDV_ag Hepatitis delta virus delta antigen. The hepatitis delta virus (HDV) encodes a single protein, the hepatitis delta antigen (HDAg). The central region of this protein has been shown to bind RNA. Several interactions are also mediated by a coiled-coil region at the N-terminus of the protein. 195
44977 250679 pfam01518 PolyG_pol Sigma NS protein. This viral protein has a poly(C)-dependent poly(G) polymerase activity. 366
44978 396208 pfam01519 DUF16 Protein of unknown function DUF16. The function of this protein is unknown. It appears to only occur in Mycoplasma pneumoniae. The crystal structure revealed that this domain is composed of two separated homotrimeric coiled-coils. 95
44979 396209 pfam01520 Amidase_3 N-acetylmuramoyl-L-alanine amidase. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls. 172
44980 396210 pfam01521 Fe-S_biosyn Iron-sulphur cluster biosynthesis. This family is involved in iron-sulphur cluster biosynthesis. Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins. 111
44981 396211 pfam01522 Polysacc_deac_1 Polysaccharide deacetylase. This domain is found in polysaccharide deacetylase. This family of polysaccharide deacetylases includes NodB (nodulation protein B from Rhizobium) which is a chitooligosaccharide deacetylase. It also includes chitin deacetylase from yeast, and endoxylanases which hydrolyze glucosidic bonds in xylan. 124
44982 396212 pfam01523 PmbA_TldD Putative modulator of DNA gyrase. tldD and pmbA were found to suppress mutations in letD and inhibitor of DNA gyrase. Therefore it has been hypothesized that the TldD and PmbA proteins modulate the activity of DNA gyrase. It has also been suggested that PmbA may be involved in secretion. 156
44983 366689 pfam01524 Gemini_V2 Geminivirus V2 protein. Disruption of the V2 gene in Tomato yellow leaf curl virus (TYLCV) stopped its ability to systemically infect tomato plants, suggesting that the V2 gene product is required for successful infection of the host. 78
44984 144935 pfam01525 Rota_NS26 Rotavirus NS26. Gene 11 product is a non-structural phosphoprotein designated as NS26. 212
44985 396213 pfam01526 DDE_Tnp_Tn3 Tn3 transposase DDE domain. This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from E. coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases. 389
44986 396214 pfam01527 HTH_Tnp_1 Transposase. Transposase proteins are necessary for efficient DNA transposition. This family consists of various E. coli insertion elements and other bacterial transposases some of which are members of the IS3 family. 75
44987 279822 pfam01528 Herpes_glycop Herpesvirus glycoprotein M. The herpesvirus glycoprotein M (gM) is an integral membrane protein predicted to contain 8 transmembrane segments. Glycoprotein M is not essential for viral replication. 373
44988 396215 pfam01529 DHHC DHHC palmitoyltransferase. This entry refers to the DHHC domain, found in DHHC proteins which are palmitoyltransferases. Palmitoylation or, more specifically S-acylation, plays important roles in the regulation of protein localization, stability, and activity. It is a post-translational protein modification that involves the attachment of palmitic acid to Cys residues through a thioester linkage. Protein acyltransferases (PATs), also known as palmitoyltransferases, catalyze this reaction by transferring the palmitoyl group from palmitoyl-CoA to the thiol group of Cys residues. They are characterized by the presence of a 50-residue-long domain called the DHHC domain, which in most but not all cases is also cysteine-rich and gets its name from a highly conserved DHHC signature tetrapeptide (Asp-His-His-Cys). The Cys residue within the DHHC domain forms a stable acyl intermediate and transfers the acyl chain to the Cys residues of a target protein. Some proteins containing a DHHC domain include Drosophila DNZ1 protein, Mouse Abl-philin 2 (Aph2) protein, Mammalian ZDHHC9, Yeast ankyrin repeat-containing protein AKR1, Yeast Erf2 protein, and Arabidopsis thaliana tip growth defective 1. 132
44989 396216 pfam01530 zf-C2HC Zinc finger, C2HC type. This is a DNA binding zinc finger domain. 29
44990 250689 pfam01531 Glyco_transf_11 Glycosyl transferase family 11. This family contains several fucosyl transferase enzymes. 298
44991 396217 pfam01532 Glyco_hydro_47 Glycosyl hydrolase family 47. Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2). 453
44992 396218 pfam01533 Tospo_nucleocap Tospovirus nucleocapsid protein. The tospovirus genome consists of three linear ssRNA segments, denoted L, M and S complexed with the nucleocapsid protein. The S RNA encodes the nucleocapsid protein and another non-structural protein. 246
44993 396219 pfam01534 Frizzled Frizzled/Smoothened family membrane region. This family contains the membrane spanning region of frizzled and smoothened receptors. This membrane region is predicted to contain seven transmembrane alpha helices. Proteins related to Drosophila frizzled are receptors for Wnt (mediating the beta-catenin signalling pathway), but also the planar cell polarity (PCP) pathway and the Wnt/calcium pathway. The predominantly alpha-helical Cys-rich ligand-binding region (CRD) of Frizzled is both necessary and sufficient for Wnt binding. The smoothened receptor mediates hedgehog signalling. 321
44994 366695 pfam01535 PPR PPR repeat. This repeat has no known function. It is about 35 amino acids long and found in up to 18 copies in some proteins. This family appears to be greatly expanded in plants. This repeat occurs in PET309 that may be involved in RNA stabilisation. This domain occurs in crp1 that is involved in RNA processing. This repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein. The repeat has been called PPR. 31
44995 396220 pfam01536 SAM_decarbox Adenosylmethionine decarboxylase. This is a family of S-adenosylmethionine decarboxylase (SAMDC) proenzymes. In the biosynthesis of polyamines SAMDC produces decarboxylated S-adenosylmethionine, which serves as the aminopropyl moiety necessary for spermidine and spermine biosynthesis from putrescine. The Pfam alignment contains both the alpha and beta chains that are cleaved to form the active enzyme. 332
44996 396221 pfam01537 Herpes_glycop_D Herpesvirus glycoprotein D/GG/GX domain. This domain is found in several herpesvirus glycoproteins. This family includes glycoprotein-D (gD or gIV) which is common to herpes simplex virus types 1 and 2, as well as equine herpes, bovine herpes and Marek's disease virus. Glycoprotein-D has been found on the viral envelope and the plasma membrane of infected cells, and gD immunisation can produce an immune response to bovine herpes virus (BHV-1). This response is stronger than that of the other major glycoproteins gB (gI) and gC (gIII) in BHV-1. Glycoprotein G (gG) is one of the seven external glycoproteins of HSV1 and HSV2. This family also contains the glycoprotein GX (gX), initially identified in Pseudorabies virus. 118
44997 366698 pfam01538 HCV_NS2 Hepatitis C virus non-structural protein NS2. The viral genome is translated into a single polyprotein of about 3000 amino acids. Generation of the mature non-structural proteins relies on the activity of viral proteases. Cleavage at the NS2/NS3 junction is accomplished by a metal-dependent autoprotease encoded within NS2 and the N-terminus of NS3. 195
44998 110536 pfam01539 HCV_env Hepatitis C virus envelope glycoprotein E1. 190
44999 110537 pfam01540 Lipoprotein_7 Adhesin lipoprotein. This family consists of the p50 and variable adherence-associated antigen (Vaa) adhesins from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital diseases, pneumonia, and septic arthritis. An adhesin is a cell surface molecule that mediates adhesion to other cells or to the surrounding surface or substrate. The Vaa antigen is a 50-kDa surface lipoprotein that has four tandem repetitive DNA sequences encoding a periodic peptide structure, and is highly immunogenic in the human host. p50 is also a 50-kDa lipoprotein, having three repeats A,B and C, that may be a tetramer of 191-kDa in its native environment. 353
45000 396222 pfam01541 GIY-YIG GIY-YIG catalytic domain. This domain called GIY-YIG is found in the amino terminal region of excinuclease abc subunit c (uvrC), bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage. The structure of I-TevI a GIY-YIG endonuclease, reveals a novel alpha/beta-fold with a central three-stranded antiparallel beta-sheet flanked by three helices. The most conserved and putative catalytic residues are located on a shallow, concave surface and include a metal coordination site. 78
45001 279832 pfam01542 HCV_core Hepatitis C virus core protein. The viral core protein forms the internal viral coat that encapsidates the genomic RNA and is enveloped in a host cell-derived lipid membrane. The core protein has been shown, by yeast two-hybrid assay to interact with cellular DEAD box helicases. The N-terminus of the core protein is involved in transcriptional repression. 75
45002 144947 pfam01543 HCV_capsid Hepatitis C virus capsid protein. 121
45003 396223 pfam01544 CorA CorA-like Mg2+ transporter protein. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell. 292
45004 396224 pfam01545 Cation_efflux Cation efflux family. Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells. 189
45005 396225 pfam01546 Peptidase_M20 Peptidase family M20/M25/M40. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases. 315
45006 396226 pfam01547 SBP_bac_1 Bacterial extracellular solute-binding protein. This family also includes the bacterial extracellular solute-binding protein family POTD/POTF. 294
45007 396227 pfam01548 DEDD_Tnp_IS110 Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes an amino-terminal region of the pilin gene inverting protein (PIVML) and of members of the IS111A/IS1328/IS1533 family of transposases. The C-terminus is represented by family pfam02371. 155
45008 396228 pfam01549 ShK ShK domain-like. This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. 37
45009 396229 pfam01551 Peptidase_M23 Peptidase family M23. Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown. 96
45010 279840 pfam01552 Pico_P2B Picornavirus 2B protein. Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability. 101
45011 366704 pfam01553 Acyltransferase Acyltransferase. This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This family also includes tafazzin, the Barth syndrome gene. 131
45012 334587 pfam01554 MatE MatE. The MatE domain 161
45013 396230 pfam01555 N6_N4_Mtase DNA methylase. Members of this family are DNA methylases. The family contains both N-4 cytosine-specific DNA methylases and N-6 Adenine-specific DNA methylases. 221
45014 396231 pfam01556 DnaJ_C DnaJ C terminal domain. This family consists of the C terminal region of the DnaJ protein. It is always found associated with pfam00226 and pfam00684. DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains CTDI and CTDII, both incorporated in this family are necessary for maintaining the J-domains in their specific relative positions. Structural analysis of Structure 1nlt shows that PF00684 is nested within this DnaJ C-terminal region. 130
45015 396232 pfam01557 FAA_hydrolase Fumarylacetoacetate (FAA) hydrolase family. This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyzes fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans; this is an inherited disorder mainly affecting the liver, leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in the fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerizes this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase encoded by mhpD in E. coli is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase. 211
45016 396233 pfam01558 POR Pyruvate ferredoxin/flavodoxin oxidoreductase. This family includes a region of the large protein pyruvate-flavodoxin oxidoreductase and the whole pyruvate ferredoxin oxidoreductase gamma subunit protein. It is not known whether the gamma subunit has a catalytic or regulatory role. Pyruvate oxidoreductase (POR) catalyzes the final step in the fermentation of carbohydrates in anaerobic microorganisms. This involves the oxidative decarboxylation of pyruvate with the participation of thiamine followed by the transfer of an acetyl moiety to coenzyme A for the synthesis of acetyl-CoA. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited. 173
45017 366705 pfam01559 Zein Zein seed storage protein. Zeins are seed storage proteins. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats. 244
45018 110557 pfam01560 HCV_NS1 Hepatitis C virus non-structural protein E2/NS1. The hypervariable region of the E2/NS1 region of hepatitis C virus varies greatly between viral isolates. E2 is thought to encode a structurally unconstrained envelope protein. 344
45019 396234 pfam01561 Hanta_G2 Hantavirus glycoprotein G2. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA. 457
45020 396235 pfam01562 Pep_M12B_propep Reprolysin family propeptide. This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C-terminus of the alignment but is not well aligned. 129
45021 396236 pfam01563 Alpha_E3_glycop Alphavirus E3 glycoprotein. This protein is found in some alphaviruses as a virion associated spike protein. 59
45022 396237 pfam01564 Spermine_synth Spermine/spermidine synthase domain. Spermine and spermidine are polyamines. This family includes spermidine synthase that catalyzes the fifth (last) step in the biosynthesis of spermidine from arginine, and spermine synthase. 183
45023 396238 pfam01565 FAD_binding_4 FAD binding domain. This family consists of various enzymes that use FAD as a co-factor; most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes, vanillyl-alcohol oxidase (VAO), has a solved structure; the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure; however, the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidizes the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB, a UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan. 139
45024 396239 pfam01566 Nramp Natural resistance-associated macrophage protein. The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. This family of membrane proteins are divalent cation transporters. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. By controlling divalent cation concentrations Nramp1 may regulate the interphagosomal replication of bacteria. Mutations in Nramp1 may genetically predispose an individual to susceptibility to diseases including leprosy and tuberculosis; conversely, this might provide protection from rheumatoid arthritis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others; it is expressed at high levels in the intestine and is a major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations. 357
45025 279853 pfam01567 Hanta_G1 Hantavirus glycoprotein G1. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA. 523
45026 396240 pfam01568 Molydop_binding Molybdopterin dinucleotide binding domain. This domain is found in various molybdopterin-containing oxidoreductases and tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD); where the domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor. This domain corresponds to the C-terminal domain IV in dimethyl sulfoxide (DMSO) reductase which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules. 110
45027 396241 pfam01569 PAP2 PAP2 superfamily. This family includes the enzyme type 2 phosphatidic acid phosphatase (PAP2), Glucose-6-phosphatase EC:3.1.3.9, Phosphatidylglycerophosphatase B EC:3.1.3.27 and bacterial acid phosphatase EC:3.1.3.2. The family also includes a variety of haloperoxidases that function by oxidising halides in the presence of hydrogen peroxide to form the corresponding hypohalous acids. 123
45028 366710 pfam01570 Flavi_propep Flavivirus polyprotein propeptide. The flaviviruses are small enveloped animal viruses containing a single positive strand genomic RNA. The genome encodes one large ORF, a polyprotein which undergoes proteolytic processing into mature viral peptide chains. This family consists of a propeptide region of approximately 90 amino acid length. 78
45029 396242 pfam01571 GCV_T Aminomethyltransferase folate-binding domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. 255
45030 279858 pfam01573 Bromo_MP Bromovirus movement protein. 283
45031 396243 pfam01575 MaoC_dehydratas MaoC like domain. The maoC gene is part of an operon with maoA which is involved in the synthesis of monoamine oxidase. The MaoC protein is found to share similarity with a wide variety of enzymes; estradiol 17 beta-dehydrogenase 4, peroxisomal hydratase-dehydrogenase-epimerase, fatty acid synthase beta subunit. Several bacterial proteins that are composed solely of this domain have (R)-specific enoyl-CoA hydratase activity. This domain is also present in the NodN nodulation protein N. 123
45032 396244 pfam01576 Myosin_tail_1 Myosin tail. The myosin molecule is a multi-subunit complex made up of two heavy chains and four light chains; it is a fundamental contractile protein found in all eukaryote cell types. This family consists of the coiled-coil myosin heavy chain tail region. The coiled-coil is composed of the tail from two molecules of myosin. These can then assemble into the macromolecular thick filament. The coiled-coil region provides the structural backbone of the thick filament. 1081
45033 250716 pfam01577 Peptidase_S30 Potyvirus P1 protease. The Potyviridae are a family of positive strand RNA viruses with a genome encoding a polyprotein. Members include zucchini yellow mosaic virus and turnip mosaic virus, which cause considerable losses of crops worldwide. This family consists of a C-terminus region from various plant potyvirus P1 proteins (found at the N-terminus of the polyprotein). The C-terminus of P1 is a serine-type protease responsible for autocatalytic cleavage between P1 and the helper component protease pfam00851. The entire P1 protein may be involved in virus-host interactions. 245
45034 307628 pfam01578 Cytochrom_C_asm Cytochrome C assembly protein. This family consists of various proteins involved in cytochrome c assembly from mitochondria and bacteria; CycK from Rhizobium, CcmC from E. coli and Paracoccus denitrificans and orf240 from wheat mitochondria. The members of this family are probably integral membrane proteins with six predicted transmembrane helices. It has been proposed that members of this family comprise a membrane component of an ABC (ATP binding cassette) transporter complex. It is also proposed that this transporter is necessary for transport of some component needed for cytochrome c assembly. One member CycK contains a putative heme-binding motif, orf240 also contains a putative heme-binding motif and is a proposed ABC transporter with c-type heme as its proposed substrate. However it seems unlikely that all members of this family transport heme nor c-type apocytochromes because CcmC in the putative CcmABC transporter transports neither. CcmF forms a working module with CcmH and CcmI, CcmFHI, and itself is unlikely to bind haem directly. 211
45035 396245 pfam01579 DUF19 Domain of unknown function (DUF19). This presumed domain has no known function. It is found in one or two copies in several Caenorhabditis elegans proteins. It is roughly 130 amino acids long. The domain contains 12 conserved cysteines which suggests that the domain is an extracellular domain and that these cysteines form six intradomain disulphide bridges. The GO annotation for this protein indicates that it has a function in nematode larval development and has a positive regulation of growth rate. 155
45036 279863 pfam01580 FtsK_SpoIIIE FtsK/SpoIIIE family. FtsK has extensive sequence similarity to a wide variety of proteins from prokaryotes and plasmids, termed the FtsK/SpoIIIE family. This domain contains a putative ATP binding P-loop motif. It is found in the FtsK cell division protein from E. coli and the stage III sporulation protein E SpoIIIE which has roles in regulation of prespore specific gene expression in B. subtilis. A mutation in FtsK causes a temperature sensitive block in cell division and it is involved in peptidoglycan synthesis or modification. The SpoIIIE protein is implicated in intercellular chromosomal DNA transfer. 219
45037 110576 pfam01581 FARP FMRFamide related peptide family. The neuroactive peptide Phe-Met-Arg-Phe-NH2 (FMRF-amide) has a variety of effects on both mammalian and invertebrate tissues. 11
45038 396246 pfam01582 TIR TIR domain. The Toll/interleukin-1 receptor (TIR) homology domain is an intracellular signalling domain found in MyD88, interleukin 1 receptor and the Toll receptor. It contains three highly-conserved regions, and mediates protein-protein interactions between the Toll-like receptors (TLRs) and signal-transduction components. TIR-like motifs are also found in plant proteins thought to be involved in resistance to disease. When activated, TIR domains recruit cytoplasmic adaptor proteins MyD88 and TOLLIP (Toll interacting protein). In turn, these associate with various kinases to set off signalling cascades. 165
45039 396247 pfam01583 APS_kinase Adenylylsulphate kinase. Enzyme that catalyzes the phosphorylation of adenylylsulphate to 3'-phosphoadenylylsulfate. This domain contains an ATP binding P-loop motif. 154
45040 396248 pfam01584 CheW CheW-like domain. CheW proteins are part of the chemotaxis signaling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA binds to CheW, suggesting that these domains can interact with each other. 137
45041 396249 pfam01585 G-patch G-patch domain. This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines. 45
45042 396250 pfam01586 Basic Myogenic Basic domain. This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure. 81
45043 396251 pfam01588 tRNA_bind Putative tRNA binding domain. This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA and forms a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation. 96
45044 279870 pfam01589 Alpha_E1_glycop Alphavirus E1 glycoprotein. E1 forms a heterodimer with E2 pfam00943. The virus spikes are made up of 80 trimers of these heterodimers (sindbis virus). 504
45045 396252 pfam01590 GAF GAF domain. This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54. 133
45046 396253 pfam01591 6PF2K 6-phosphofructo-2-kinase. This enzyme occurs as a bifunctional enzyme with fructose-2,6-bisphosphatase. The bifunctional enzyme catalyzes both the synthesis and degradation of fructose-2,6-bisphosphate, a potent regulator of glycolysis. This enzyme contains a P-loop motif. 223
45047 396254 pfam01592 NifU_N NifU-like N terminal domain. This domain is found in NifU in combination with pfam01106. This domain is also found in isolation in several bacterial species. The nif genes are responsible for nitrogen fixation. However this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. These proteins appear to be scaffold proteins for iron-sulfur clusters. 124
45048 396255 pfam01593 Amino_oxidase Flavin containing amine oxidoreductase. This family consists of various amine oxidases, including maize polyamine oxidase (PAO) and various flavin containing monoamine oxidases (MAO). The aligned region includes the flavin binding site of these enzymes. The family also contains phytoene dehydrogenases and related enzymes. In vertebrates MAO plays an important role regulating the intracellular levels of amines via their oxidation; these include various neurotransmitters, neurotoxins and trace amines. In lower eukaryotes such as aspergillus and in bacteria the main role of amine oxidases is to provide a source of ammonium. PAOs in plants, bacteria and protozoa oxidise spermidine and spermine to an aminobutyral, diaminopropane and hydrogen peroxide and are involved in the catabolism of polyamines. Other members of this family include tryptophan 2-monooxygenase, putrescine oxidase, corticosteroid binding proteins and antibacterial glycoproteins. 446
45049 279875 pfam01594 AI-2E_transport AI-2E family transporter. This family includes four different proteins from E. coli alone. One of them, YdgG or TqsA, has been shown to mediate transport of the quorum-sensing signal autoinducer 2 (AI-2). It is not clear if TqsA enhances secretion of AI-2 or inhibits AI-2 uptake. By altering the intracellular concentration of AI-2, TqsA affects gene expression in biofilms and biofilm formation. TqsA belongs to the AI-2 exporter (AI-2E) superfamily. 327
45050 396256 pfam01595 DUF21 Domain of unknown function DUF21. This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins, however this is due to a similarity to Brachyspira hyodysenteriae hemolysin C that does not contain this domain. This domain is found in the N-terminus of the proteins adjacent to two intracellular CBS domains pfam00571. 182
45051 396257 pfam01596 Methyltransf_3 O-methyltransferase. Members of this family are O-methyltransferases. The family includes catechol o-methyltransferase, caffeoyl-CoA O-methyltransferase and a family of bacterial O-methyltransferases that may be involved in antibiotic production. 203
45052 396258 pfam01597 GCV_H Glycine cleavage H-protein. This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein. 122
45053 396259 pfam01599 Ribosomal_S27 Ribosomal protein S27a. This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a which is synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin necessary for degrading proteins and ribosomes, a source of proteins. 43
45054 396260 pfam01600 Corona_S1 Coronavirus S1 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 and S2 pfam01601. 407
45055 396261 pfam01601 Corona_S2 Coronavirus S2 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 pfam01600 and S2. 485
45056 396262 pfam01602 Adaptin_N Adaptin N terminal region. This family consists of the N terminal region of various alpha, beta and gamma subunits of the AP-1, AP-2 and AP-3 adaptor protein complexes. The adaptor protein (AP) complexes are involved in the formation of clathrin-coated pits and vesicles. The N-terminal region of the various adaptor proteins (APs) is constant by comparison to the C-terminal which is variable within members of the AP-2 family; and it has been proposed that this constant region interacts with another uniform component of the coated vesicles. 523
45057 396263 pfam01603 B56 Protein phosphatase 2A regulatory B subunit (B56 family). Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family. 404
45058 250739 pfam01606 Arteri_env Arterivirus envelope protein. This family consists of viral envelope proteins from the arterivirus genus; this includes porcine reproductive and respiratory virus (PRRSV) envelope protein GP3 and lactate dehydrogenase elevating virus (LDV) structural glycoprotein. Arteriviruses consist of positive ssRNA and do not have a DNA stage. 211
45059 366726 pfam01607 CBM_14 Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 53
45060 396264 pfam01608 I_LWEQ I/LWEQ domain. I/LWEQ domains bind to actin. It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) Metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p and (4) yeast Sla2p. The domain has four conserved blocks, the name of the domain is derived from the initial conserved amino acid of each of the four blocks. 140
45061 376573 pfam01609 DDE_Tnp_1 Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. This family contains transposases for IS4, IS421, IS5377, IS427, IS402, IS1355, IS5, which was originally isolated in bacteriophage lambda. 196
45062 396265 pfam01610 DDE_Tnp_ISL3 Transposase. Transposase proteins are necessary for efficient DNA transposition. Contains transposases for IS204, IS1001, IS1096 and IS1165. 238
45063 279888 pfam01611 Filo_glycop Filovirus glycoprotein. This family includes an extracellular region from the envelope glycoprotein of Ebola and Marburg viruses. This region is also produced as a separate transcript that gives rise to a non-structural, secreted glycoprotein, which is produced in large amounts and has an unknown function. Processing of this protein may be involved in viral pathogenicity. 395
45064 396266 pfam01612 DNA_pol_A_exo1 3'-5' exonuclease. This domain is responsible for the 3'-5' exonuclease proofreading activity of E. coli DNA polymerase I (polI) and other enzymes; it catalyzes the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI; it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). Werner syndrome is a human genetic disorder causing premature aging; the WRN protein has helicase activity in the 3'-5' direction. The FFA-1 protein is required for formation of replication foci and also has helicase activity; it is a homolog of the WRN protein. RNase D is a 3'-5' exonuclease involved in tRNA processing. Also found in this family is the autoantigen PM/Scl thought to be involved in polymyositis-scleroderma overlap syndrome. 173
45065 396267 pfam01613 Flavin_Reduct Flavin reductase like domain. This is a flavin reductase family consisting of enzymes known to be flavin reductases as well as various oxidoreductase and monooxygenase components. VlmR is a flavin reductase that functions in a two-component enzyme system to provide isobutylamine N-hydroxylase with reduced flavin and may be involved in the synthesis of valanimycin. SnaC is a flavin reductase that provides reduced flavin for the oxidation of pristinamycin IIB to pristinamycin IIA as catalyzed by SnaA, SnaB heterodimer. This flavin reductase region characterized by enzymes of the family is present in the C-terminus of potential FMN proteins from Synechocystis sp. suggesting it is a flavin reductase domain. 145
45066 396268 pfam01614 IclR Bacterial transcriptional regulator. This family of bacterial transcriptional regulators includes the glycerol operon regulatory protein and acetate operon repressor both of which are members of the iclR family. These proteins have a Helix-Turn-Helix motif at the N-terminus. However this family covers the C-terminal region that may bind to the regulatory substrate (unpublished observation, Bateman A.). 129
45067 279892 pfam01616 Orbi_NS3 Orbivirus NS3. The function of this Orbivirus non structural protein is uncertain. However it may play a role in release of the virus from infected cells. 193
45068 366729 pfam01617 Surface_Ag_2 Surface antigen. This family includes a number of bacterial surface antigens expressed on the surface of pathogens. 247
45069 396269 pfam01618 MotA_ExbB MotA/TolQ/ExbB proton channel family. This family groups together integral membrane proteins that appear to be involved in translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flagellar motor that uses a proton gradient to generate rotational motion in the flagellum. ExbB is part of the TonB-dependent transduction complex. The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane. 126
45070 396270 pfam01619 Pro_dh Proline dehydrogenase. 300
45071 366730 pfam01620 Pollen_allerg_2 Ribonuclease (pollen allergen). This family contains grass pollen proteins of group V. Phleum pratense pollen allergen Phl p 5b has been shown to possess ribonuclease activity. 155
45072 279897 pfam01621 Fusion_gly_K Cell fusion glycoprotein K. This protein is probably an integral membrane bound glycoprotein that is involved in viral fusion with the host cell. 339
45073 250753 pfam01623 Carla_C4 Carlavirus putative nucleic acid binding protein. This family of carlavirus nucleic acid binding proteins includes a motif for a potential C-4 type zinc finger; this has four highly conserved cysteine residues and is a conserved feature of the carlaviruses 3' terminal ORF. These proteins may function as viral transcriptional regulators. The carlavirus family includes garlic latent virus and potato virus S and M; these viruses are positive strand, ssRNA with no DNA stage. 91
45074 396271 pfam01624 MutS_I MutS domain I. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in. 113
45075 396272 pfam01625 PMSR Peptide methionine sulfoxide reductase. This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. 153
45076 396273 pfam01627 Hpt Hpt domain. The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072). 84
45077 396274 pfam01628 HrcA HrcA protein C terminal domain. HrcA is found to negatively regulate the transcription of heat shock genes. HrcA contains an amino terminal helix-turn-helix domain, however this family corresponds to the carboxy terminal domain. 221
45078 396275 pfam01629 DUF22 Domain of unknown function DUF22. This domain is found in 1 to 3 copies in archaebacterial proteins. The function of the domain is unknown. This family appears to be expanded in Archaeoglobus fulgidus. 106
45079 396276 pfam01630 Glyco_hydro_56 Hyaluronidase. 327
45080 396277 pfam01632 Ribosomal_L35p Ribosomal protein L35. 60
45081 396278 pfam01633 Choline_kinase Choline/ethanolamine kinase. Choline kinase catalyzes the committed step in the synthesis of phosphatidylcholine by the CDP-choline pathway. This alignment covers the protein kinase portion of the protein. The divergence of this family makes it very difficult to create a model that specifically predicts choline/ethanolamine kinases only. However if pfam01633 is also present then it is definitely a member of this family. 211
45082 396279 pfam01634 HisG ATP phosphoribosyltransferase. 157
45083 396280 pfam01635 Corona_M Coronavirus M matrix/glycoprotein. This family consists of various coronavirus matrix proteins which are transmembrane glycoproteins. The M protein or E1 glycoprotein is implicated in virus assembly. The E1 viral membrane protein is required for formation of the viral envelope and is transported via the Golgi complex. 208
45084 396281 pfam01636 APH Phosphotransferase enzyme family. This family consists of bacterial antibiotic resistance proteins, which confer resistance to various aminoglycosides they include: aminoglycoside 3'-phosphotransferase or kanamycin kinase / neomycin-kanamycin phosphotransferase and streptomycin 3''-kinase or streptomycin 3''-phosphotransferase. The aminoglycoside phosphotransferases inactivate aminoglycoside antibiotics via phosphorylation. This family also includes homoserine kinase. This family is related to fructosamine kinase pfam03881. 239
45085 376582 pfam01637 ATPase_2 ATPase domain predominantly from Archaea. This family contains a conserved P-loop motif that is involved in binding ATP. There are eukaryote members as well as archaeal members in this family. 222
45086 396282 pfam01638 HxlR HxlR-like helix-turn-helix. HxlR, a member of this family, is a DNA-binding protein that acts as a positive regulator of the formaldehyde-inducible hxlAB operon in Bacillus subtilis. 90
45087 279910 pfam01639 v110 Viral family 110. This family of viral proteins is known as the 110 family. The function of members of this family is unknown. The family contains a central cysteine rich region with eight conserved cysteines. Some members of the family contain two copies of the cysteine rich region. 102
45088 396283 pfam01640 Peptidase_C10 Peptidase C10 family. This family represents just the active peptide part of these proteins. Residues 1-120 are not part of the model as they form the pro-peptide, which before cleavage blocks the active site from the substrate. The catalytic residues of histidine and cysteine are brought close together at the active site by the folding of the active peptide. 187
45089 396284 pfam01641 SelR SelR domain. Methionine sulfoxide reduction is an important process, by which cells regulate biological processes and cope with oxidative stress. MsrA, a protein involved in the reduction of methionine sulfoxides in proteins, has been known for four decades and has been extensively characterized with respect to structure and function. However, recent studies revealed that MsrA is only specific for methionine-S-sulfoxides. Because oxidized methionines occur in a mixture of R and S isomers in vivo, it was unclear how stereo-specific MsrA could be responsible for the reduction of all protein methionine sulfoxides. It appears that a second methionine sulfoxide reductase, SelR, evolved that is specific for methionine-R-sulfoxides, an activity that is different but complementary to that of MsrA. Thus, these proteins, working together, could reduce both stereoisomers of methionine sulfoxide. This domain is found both in SelR proteins and fused with the peptide methionine sulfoxide reductase enzymatic domain pfam01625. The domain has two conserved cysteines and histidines. The domain binds both selenium and zinc. The final cysteine is found to be replaced by the rare amino acid selenocysteine in some members of the family. This family has methionine-R-sulfoxide reductase activity. 120
45090 396285 pfam01642 MM_CoA_mutase Methylmalonyl-CoA mutase. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA. 510
45091 366738 pfam01643 Acyl-ACP_TE Acyl-ACP thioesterase. This family consists of various acyl-acyl carrier protein (ACP) thioesterases (TE); these terminate fatty acyl group extension via hydrolysing an acyl group on a fatty acid. 248
45092 396286 pfam01644 Chitin_synth_1 Chitin synthase. This region is found commonly in chitin synthases classes I, II and III. Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesized on the cytoplasmic surface of the cell membrane by membrane bound chitin synthases. 163
45093 396287 pfam01645 Glu_synthase Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster. 367
45094 307668 pfam01646 Herpes_UL24 Herpes virus proteins UL24 and UL76. This family consists of various herpes virus proteins; the gene 20 product, U49 protein, UL24 and UL76 proteins and BXRF1. The UL24 gene (product of the 24th ORF) is not essential for virus replication, and mutants with lesions in UL24 show a reduced ability to replicate in tissue culture and have reduced thymidine kinase activity, as the UL24 gene overlaps with thymidine kinase. The family of proteins is involved in viral production, latency, and reactivation. Protein UL76 presents as globular aggresomes in the nuclei of transiently transfected cells. Bioinformatic analyses predict that UL76 has a propensity for aggregation and targets cellular proteins implicated in protein folding and ubiquitin-proteasome systems. UL76 interacts with the VWA domain of S5a, the 26S proteasome non-ATPase regulatory subunit 4 (or PSMD4, or Rpn10), forming a complex in the late phase of infection. 176
45095 396288 pfam01648 ACPS 4'-phosphopantetheinyl transferase superfamily. Members of this family transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of pfam00550. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: The ACPS type and the Sfp type. The structure of the Sfp type is known, which shows the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion. 111
45096 396289 pfam01649 Ribosomal_S20p Ribosomal protein S20. Bacterial ribosomal protein S20 interacts with 16S rRNA. 76
45097 396290 pfam01650 Peptidase_C13 Peptidase C13 family. Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed. 257
45098 396291 pfam01652 IF4E Eukaryotic initiation factor 4E. 158
45099 396292 pfam01653 DNA_ligase_aden NAD-dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme- adenylate intermediate is an important feature of the proposed catalytic mechanism. 318
45100 396293 pfam01654 Cyt_bd_oxida_I Cytochrome bd terminal oxidase subunit I. This family are the alternative oxidases found in many bacteria which oxidize ubiquinol and reduce oxygen as part of the electron transport chain. This family is the subunit I of the oxidase. E. coli has two copies of the oxidase, bo and bd', both of which are represented here. In some nitrogen fixing bacteria, e.g. Klebsiella pneumoniae, this oxidase is responsible for removing oxygen in microaerobic conditions, making the oxidase required for nitrogen fixation. This subunit binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition His19 is a ligand for the haem b found in subunit II. 426
45101 396294 pfam01655 Ribosomal_L32e Ribosomal protein L32. This family includes ribosomal protein L32 from eukaryotes and archaebacteria. 108
45102 396295 pfam01656 CbiA CobQ/CobB/MinD/ParA nucleotide binding domain. This family consists of various cobyrinic acid a,c-diamide synthases. These include CbiA and CbiP from S.typhimurium, and CobQ from R. capsulatus. These amidases catalyze amidations to various side chains of hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis of cobalamin (vitamin B12) from uroporphyrinogen III. Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria. The family also contains dethiobiotin synthetases as well as the plasmid partitioning proteins of the MinD/ParA family. 224
45103 396296 pfam01657 Stress-antifung Salt stress response/antifungal. This domain is often found in association with the kinase domains pfam00069 or pfam07714. In many proteins it is duplicated. It contains six conserved cysteines which are involved in disulphide bridges. It has a role in salt stress response and has antifungal activity. 95
45104 396297 pfam01658 Inos-1-P_synth Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction. 104
45105 279928 pfam01659 Luteo_Vpg Luteovirus putative VPg genome linked protein. This family consists of several putative genome linked proteins. The genomic RNA of luteoviruses is linked to virally encoded genome proteins (VPg). Open reading frame 4 is thought to encode the VPg in Soybean dwarf luteovirus. Luteoviruses have isometric capsids that contain a positive strand ssRNA genome; they have no DNA stage during their replication. 105
45106 396298 pfam01660 Vmethyltransf Viral methyltransferase. This RNA methyltransferase domain is found in a wide range of ssRNA viruses, including Hordei-, Tobra-, Tobamo-, Bromo-, Clostero- and Caliciviruses. This methyltransferase is involved in mRNA capping. Capping of mRNA enhances its stability. This usually occurs in the nucleus. Therefore, many viruses that replicate in the cytoplasm encode their own. This is a specific guanine-7-methyltransferase domain involved in viral mRNA cap0 synthesis. Specificity for guanine 7 position is shown by NMR in and in vivo role in cap synthesis. Based on secondary structure prediction, the basic fold is believed to be similar to the common AdoMet-dependent methyltransferase fold. A curious feature of this methyltransferase domain is that it together with flanking sequences seems to have guanylyltransferase activity coupled to the methyltransferase activity. The domain is found throughout the so-called Alphavirus superfamily, (including alphaviruses and several other groups). It forms the defining, unique feature of this superfamily. 308
45107 396299 pfam01661 Macro Macro domain. This domain is an ADP-ribose binding module. It is found in a number of otherwise unrelated proteins. It is found at the C-terminus of the macro-H2A histone protein. This domain is found in the non-structural proteins of several types of ssRNA viruses such as NSP3 from alphaviruses. This domain is also found on its own in a family of proteins from bacteria, archaebacteria and eukaryotes. 118
45108 396300 pfam01663 Phosphodiest Type I phosphodiesterase / nucleotide pyrophosphatase. This family consists of phosphodiesterases, including human plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase i / nucleotide pyrophosphatase (nppase). These enzymes catalyze the cleavage of phosphodiester and phosphosulfate bonds in NAD, deoxynucleotides and nucleotide sugars. Also in this family is ATX, an autotaxin, tumor cell motility-stimulating protein which exhibits type I phosphodiesterase activity. The alignment encompasses the active site. Also present within this family is the 60-kDa Ca2+-ATPase from F. odoratum. 343
45109 366748 pfam01664 Reo_sigma1 Reovirus viral attachment protein sigma 1. This family consists of the reovirus sigma 1 hemagglutinin, cell attachment protein. This glycoprotein is a minor capsid protein and also determines the serotype-specific humoral immune response. Sigma 1 consists of a fibrous tail and a globular head. The head has important roles in the cell attachment function of sigma 1 and is a determinant of the type-specific humoral immune response. Reovirus is part of the orthoreovirus group of reoviruses, with a dsRNA genome. Also present in this family is bacteriophage SF6 Lysozyme. 216
45110 279933 pfam01665 Rota_NSP3 Rotavirus non-structural protein NSP3. This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains; a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerization and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein. 311
45111 366749 pfam01666 DX DX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges. 76
45112 279935 pfam01667 Ribosomal_S27e Ribosomal protein S27. 55
45113 396301 pfam01668 SmpB SmpB protein. 143
45114 396302 pfam01669 Myelin_MBP Myelin basic protein. 156
45115 396303 pfam01670 Glyco_hydro_12 Glycosyl hydrolase family 12. 207
45116 279939 pfam01671 ASFV_360 African swine fever virus multigene family 360 protein. The multigene family 360 proteins are found within the African swine fever virus (ASF) genome, which consists of dsDNA and has similar structural features to the poxviruses. The biological function of this family is not known, although African swine fever virus Protein MGF 360-9L is a major structural protein. 215
45117 376591 pfam01672 Plasmid_parti Putative plasmid partition protein. This family consists of conserved hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete, some of which are putative plasmid partition proteins. 85
45118 279941 pfam01673 Herpes_env Herpesvirus putative major envelope glycoprotein. This family consists of probable major envelope glycoproteins from members of the herpesviridae including herpes simplex virus, human cytomegalovirus and varicella-zoster virus. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication. 526
45119 396304 pfam01674 Lipase_2 Lipase (class 2). This family consists of hypothetical C. elegans proteins and lipases. Lipases or triacylglycerol acylhydrolases hydrolyze ester bonds in triacylglycerol giving diacylglycerol, monoacylglycerol, glycerol and free fatty acids. Lipase EstA is an extracellular lipase from B. subtilis 168. 218
45120 396305 pfam01676 Metalloenzyme Metalloenzyme superfamily. This family includes phosphopentomutase and 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This family is also related to pfam00245. The alignment contains the most conserved residues that are probably involved in metal binding and catalysis. 410
45121 279943 pfam01677 Herpes_UL7 Herpesvirus UL7 like. This family consists of various functionally undefined proteins from the herpesviridae and UL7 from bovine herpes virus. UL7 is not essential for virus replication in cell culture, and is found localized in the cytoplasm of infected cells accumulated around the nucleus but could not be detected in purified virions. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication. 213
45122 396306 pfam01678 DAP_epimerase Diaminopimelate epimerase. Diaminopimelate epimerase contains two domains of the same alpha/beta fold, both contained in this family. 119
45123 396307 pfam01679 Pmp3 Proteolipid membrane potential modulator. Pmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants. 49
45124 396308 pfam01680 SOR_SNZ SOR/SNZ family. Members of this family are enzymes involved in a new pathway of pyridoxine/pyridoxal 5-phosphate biosynthesis. This family was formerly known as UPF0019. 206
45125 396309 pfam01681 C6 C6 domain. This domain of unknown function is found in a hypothetical C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge. 90
45126 396310 pfam01682 DB DB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657. 97
45127 396311 pfam01683 EB EB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges. This domain is found associated with kunitz domains pfam00014. 52
45128 366757 pfam01684 ET ET module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By inspection of the conservation of cysteines it looks like cysteines 1,2,3,4,9 and 10 are always present and that sometimes the pair 5 and 8 or the pair 6 and 7 are missing. This suggests that cysteines 5/8 and 6/7 make disulphide bridges. 78
45129 396312 pfam01686 Adeno_Penton_B Adenovirus penton base protein. This family consists of various adenovirus penton base proteins, from both the Mastadenoviradae having mammalian hosts and the Aviadenoviradae having avian hosts. The penton base is a major structural protein forming part of the penton which consists of a base and a fibre; the pentons hold a morphologically prominent position at the vertex capsomer in the adenovirus particle. In mammalian adenovirus there is only one tail on each base whereas in avian adenovirus there are two. 450
45130 396313 pfam01687 Flavokinase Riboflavin kinase. This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. This domain is thought to have kinase activity. 122
45131 279953 pfam01688 Herpes_gI Alphaherpesvirus glycoprotein I. This family consists of glycoprotein I from various members of the alphaherpesvirinae; these include herpesvirus, varicella-zoster virus and pseudorabies virus. Glycoprotein I (gI) is important during natural infection; mutants lacking gI produce smaller lesions at the site of infection and show reduced neuronal spread. gI forms a heterodimeric complex with gE; this complex displays Fc receptor activity (binds to the Fc region of immunoglobulin). Glycoproteins are also important in the production of virus-neutralising antibodies and cell mediated immunity. The alphaherpesvirinae have a dsDNA genome and have no RNA stage during viral replication. 155
45132 279954 pfam01690 PLRV_ORF5 Potato leaf roll virus readthrough protein. This family consists mainly of the potato leaf roll virus readthrough protein. This is generated via a readthrough of open reading frame 3 a coat protein allowing transcription of open reading frame 5 to give an extended coat protein with a large c-terminal addition or read through domain. The readthrough protein is thought to play a role in the circulative aphid transmission of potato leaf roll virus. Also in the family is open reading frame 6 from beet western yellows virus and potato leaf roll virus both luteovirus and an unknown protein from cucurbit aphid-borne yellows virus a closterovirus. 524
45133 279955 pfam01691 Adeno_E1B_19K Adenovirus E1B 19K protein / small t-antigen. This family consists of adenovirus E1B 19K protein or small t-antigen. The E1B 19K protein inhibits E1A induced apoptosis and hence prolongs the viability of the host cell. It can also inhibit apoptosis mediated by tumor necrosis factor alpha and Fas antigen. E1B 19K blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein. The E1B region of adenovirus encodes two proteins E1B 19K the small t-antigen as found in this family and E1B 55K the large t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis. 135
45134 279956 pfam01692 Paramyxo_C Paramyxovirus non-structural protein C. This family consists of the C proteins (C', C, Y1, Y2) found in Paramyxovirinae; human parainfluenza, and sendai virus. The C proteins affect viral RNA synthesis, having both a positive and negative effect during the course of infection. Paramyxovirus have a negative strand ssRNA genome of 15.3kb from which six mRNAs are transcribed; five of these are monocistronic. The P/C mRNA is polycistronic and has two overlapping open reading frames P and C; C encodes the nested C proteins C', C, Y1 and Y2. 204
45135 396314 pfam01693 Cauli_VI Caulimovirus viroplasmin. This family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner. 44
45136 396315 pfam01694 Rhomboid Rhomboid family. This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite. 145
45137 396316 pfam01695 IstB_IS21 IstB-like ATP binding protein. This protein contains an ATP/GTP binding P-loop motif. It is found associated with IS21 family insertion sequences. The function of this protein is unknown, but it may perform a transposase function. 238
45138 366761 pfam01696 Adeno_E1B_55K Adenovirus E1B 55K protein / large t-antigen. This family consists of adenovirus E1B 55K protein or large t-antigen. E1B 55K binds p53 the tumor suppressor protein converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The E1B region of adenovirus encodes two proteins E1B 55K the large t-antigen as found in this family and E1B 19K pfam01691 the small t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis. This family shows distant similarities to the pectate lyase superfamily. 387
45139 396317 pfam01697 Glyco_transf_92 Glycosyltransferase family 92. Members of this family act as galactosyltransferases, belonging to glycosyltransferase family 92. The aligned region contains several conserved cysteine residues and several charged residues that may be catalytic residues. This is supported by the inclusion of this family in the GT-A glycosyl transferase superfamily. 250
45140 396318 pfam01698 LFY_SAM Floricaula / Leafy protein SAM domain. This family consists of various plant development proteins which are homologs of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY proteins have been shown to bind semi-palindromic 19-bp DNA elements through their highly conserved C-terminal DBD. In addition to its well-characterized DBD, LFY possesses a second conserved domain at its amino terminus (LFY-N). This entry represents the SAM domain found in the N-terminal region of LFY proteins in plants. Crystallographic structure determination of LFY-N shows that LFY-N is a Sterile Alpha Motif (SAM) domain that mediates LFY oligomerization. It allows LFY to bind to regions lacking high-affinity LFYbs (LFY-binding sites) and confers on LFY the ability to access closed chromatin regions. Experiments carried out in plants revealed that altering the capacity of LFY to oligomerize compromised its floral function and drastically reduced its genome-wide DNA binding. SAM oligomerization has been suggested to have a profound effect on a TF binding landscape by promoting cooperative binding of LFY to DNA, as was proposed for other oligomeric TFs, and it gives LFY access to closed chromatin regions that are notably refractory to TF binding. It has also been suggested that the biochemical properties of the SAM domain are evolutionarily conserved in all plant species. 80
45141 396319 pfam01699 Na_Ca_ex Sodium/calcium exchanger protein. This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3. 149
45142 279964 pfam01700 Orbi_VP3 Orbivirus VP3 (T2) protein. The orbivirus VP3 protein is part of the virus core and makes a 'subcore' shell made up of 120 copies of the 100K protein. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation. Also found in the family is structural core protein VP2 from broadhaven virus which is similar to VP3 in bluetongue virus. Orbivirus are part of the larger reoviridae which have a dsRNA genome of 10-12 linear segments; orbivirus found in this family include bluetongue virus and epizootic hemorrhagic disease virus. 888
45143 396320 pfam01701 PSI_PsaJ Photosystem I reaction centre subunit IX / PsaJ. This family consists of the photosystem I reaction centre subunit IX or PsaJ from various organisms including Synechocystis sp. (strain pcc 6803), Pinus thunbergii (green pine) and Zea mays (maize). PsaJ is a small 4.4kDa, chloroplastal encoded, hydrophobic subunit of the photosystem I reaction complex its function is not yet fully understood. PsaJ can be cross-linked to PsaF and has a single predicted transmembrane domain it has a proposed role in maintaining PsaF in the correct orientation to allow for fast electron transfer from soluble donor proteins to P700+. 37
45144 396321 pfam01702 TGT Queuine tRNA-ribosyltransferase. This is a family of queuine tRNA-ribosyltransferases EC:2.4.2.29, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyzes the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA; giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues. 358
45145 396322 pfam01704 UDPGP UTP--glucose-1-phosphate uridylyltransferase. This family consists of UTP--glucose-1-phosphate uridylyltransferases, EC:2.7.7.9. Also known as UDP-glucose pyrophosphorylase (UDPGP) and Glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyzes the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type. In Dictyostelium (slime mold) mutants in this enzyme abort the development cycle. Also within the family is UDP-N-acetylglucosamine or AGX1 and two hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete. 412
45146 396323 pfam01705 CX CX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges. 59
45147 396324 pfam01706 FliG_C FliG C-terminal domain. FliG is a component of the flagellar rotor, present in about 25 copies per flagellum. This domain functions specifically in motor rotation. 108
45148 279970 pfam01707 Peptidase_C9 Peptidase family C9. 202
45149 366768 pfam01708 Gemini_mov Geminivirus putative movement protein. This family consists of putative movement proteins from Maize streak and wheat dwarf virus. 92
45150 396325 pfam01709 Transcrip_reg Transcriptional regulator. This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region. 235
45151 279973 pfam01710 HTH_Tnp_IS630 Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803 three of which are characterized as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily. 119
45152 396326 pfam01712 dNK Deoxynucleoside kinase. This family consists of various deoxynucleoside kinases cytidine EC:2.7.1.74, guanosine EC:2.7.1.113, adenosine EC:2.7.1.76 and thymidine kinase EC:2.7.1.21 (which also phosphorylates deoxyuridine and deoxycytosine.) These enzymes catalyze the production of deoxynucleotide 5'-monophosphate from a deoxynucleoside. Using ATP and yielding ADP in the process. 201
45153 396327 pfam01713 Smr Smr domain. This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity. The full-length human NEDD4-binding protein 2 also has the polynucleotide kinase activity. 78
45154 396328 pfam01715 IPPT IPP transferase. This is a family of IPP transferases EC:2.5.1.8 also known as tRNA delta(2)-isopentenylpyrophosphate transferase. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37). 244
45155 396329 pfam01716 MSP Manganese-stabilizing protein / photosystem II polypeptide. This family consists of the 33 KDa photosystem II polypeptide from the oxygen evolving complex (OEC) of plants and cyanobacteria. The protein is also known as the manganese-stabilizing protein as it is associated with the manganese complex of the OEC and may provide the ligands for the complex. 242
45156 366771 pfam01717 Meth_synt_2 Cobalamin-independent synthase, Catalytic domain. This is a family of vitamin-B12 independent methionine synthases or 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferases, EC:2.1.1.14 from bacteria and plants. Plants are the only higher eukaryotes that have the required enzymes for methionine synthesis. This enzyme catalyzes the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to homocysteine. The aligned region makes up the carboxy region of the approximately 750 amino acid protein except in some hypothetical archaeal proteins present in the family, where this region corresponds to the entire length. This domain contains the catalytic residues of the enzyme. 323
45157 396330 pfam01718 Orbi_NS1 Orbivirus non-structural protein NS1, or hydrophobic tubular protein. This family consists of orbivirus non-structural protein NS1, or hydrophobic tubular protein. NS1 has no specific function in virus replication, it is however thought to play a role in transport of mature virus particles from virus inclusion bodies to the cell membrane. Orbivirus are part of the larger reoviridae which have a dsRNA genome of at least 10 segments encoding at least 10 viral proteins; orbivirus found in this family include bluetongue virus, and African horsesickness virus. 548
45158 396331 pfam01719 Rep_2 Plasmid replication protein. This family consists of various bacterial plasmid replication (Rep) proteins. These proteins are essential for replication of plasmids; the Rep proteins are topoisomerases that nick the positive strand at the plus origin of replication and also at the single-strand conversion sequence. 181
45159 366773 pfam01721 Bacteriocin_II Class II bacteriocin. The bacteriocins are small peptides that inhibit the growth of various bacteria. Bacteriocins of lactic acid bacteria may inhibit their target cells by permeabilising the cell membrane. 33
45160 396332 pfam01722 BolA BolA-like protein. This family consists of the morphoprotein BolA from E. coli and its various homologs. In E. coli, overexpression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. 75
45161 279983 pfam01723 Chorion_1 Chorion protein. This family consists of the chorion superfamily proteins classes A, B, CA, CB and high-cysteine HCB from silk, gypsy and polyphemus moths. The chorion proteins make up the moths egg shell a complex extracellular structure. 169
45162 396333 pfam01724 DUF29 Domain of unknown function DUF29. This family consists of various hypothetical proteins from cyanobacteria, none of which are functionally described. The aligned region is approximately 120-140 amino acids long corresponding to almost the entire length of the proteins in the family. Structure 3fcn is a small protein that has a novel all-alpha fold. The N-terminal helical hairpin is likely to function as a dimerization module. This protein is a member of PFam family PF01724. The function of this protein is unknown. One protein sequence contains a fusion of this protein and a DnaB domain, suggesting a possible role in DNA helicase activity (hypothetical). Dali hits have low Z and high rmsd, suggesting probably only topological similarities (not functional relevance) (details derived from TOPSAN). The family has several highly conserved sequence motifs, including YD/ExD, DxxNVxEEIE, and CPY/F/W, as well as conserved tryptophans. 138
45163 396334 pfam01725 Ham1p_like Ham1 family. This family consists of the HAM1 protein and hypothetical archaeal bacterial and C. elegans proteins. HAM1 controls 6-N-hydroxylaminopurine (HAP) sensitivity and mutagenesis in S. cerevisiae. The HAM1 protein protects the cell from HAP, either on the level of deoxynucleoside triphosphate or the DNA level by a yet unidentified set of reactions. 184
45164 396335 pfam01726 LexA_DNA_bind LexA DNA binding domain. This is the DNA binding domain of the LexA SOS regulon repressor which prevents expression of DNA repair proteins. The aligned region contains a variant form of the helix-turn-helix DNA binding motif. This domain is found associated with pfam00717 the auto-proteolytic domain of LexA EC:3.4.21.88. 63
45165 396336 pfam01728 FtsJ FtsJ-like methyltransferase. This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N-terminus of flaviviral NS5 protein. It has been hypothesized that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping. 179
45166 396337 pfam01729 QRPTase_C Quinolinate phosphoribosyl transferase, C-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyzes the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The C-terminal domain has a 7 beta-stranded TIM barrel-like fold. 169
45167 396338 pfam01730 UreF UreF. This family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyzes urea into ammonia and carbamic acid. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein. 145
45168 334656 pfam01731 Arylesterase Arylesterase. This family consists of arylesterases (Also known as serum paraoxonase) EC:3.1.1.2. These enzymes hydrolyze organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation. 86
45169 396339 pfam01732 DUF31 Putative peptidase (DUF31). This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. It appears to be related to the superfamily of trypsin peptidases and so may have a peptidase function. 351
45170 396340 pfam01733 Nucleoside_tran Nucleoside transporter. This is a family of nucleoside transporters. In mammalian cells nucleoside transporters transport nucleoside across the plasma membrane and are essential for nucleotide synthesis via the salvage pathways for cells that lack their own de novo synthesis pathways. Also in this family is mouse and human nucleolar protein HNP36, a protein of unknown function; although it has been hypothesized to be a plasma membrane nucleoside transporter. 286
45171 396341 pfam01734 Patatin Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates. 190
45172 366778 pfam01735 PLA2_B Lysophospholipase catalytic domain. This family consists of Lysophospholipase / phospholipase B EC:3.1.1.5 and cytosolic phospholipase A2 EC:3.1.4 which also has a C2 domain pfam00168. Phospholipase B enzymes catalyze the release of fatty acids from lysophospholipids and are capable in vitro of hydrolysing all phospholipids extractable from yeast cells. Cytosolic phospholipase A2 associates with natural membranes in response to physiological increases in Ca2+ and selectively hydrolyzes arachidonyl phospholipids; the aligned region corresponds to the carboxy-terminal Ca2+-independent catalytic domain of the protein as discussed in. 490
45173 279993 pfam01736 Polyoma_agno Polyomavirus agnoprotein. This family consists of the DNA binding protein or agnoprotein from various polyomaviruses. This protein is highly basic and can bind single stranded and double stranded DNA. Mutations in the agnoprotein produce smaller viral plaques, hence its function is not essential for growth in tissue culture cells but something has slowed in the normal replication cycle. There is also evidence suggesting that the agnogene and agnoprotein act as regulators of structural protein synthesis. 62
45174 396342 pfam01737 Ycf9 YCF9. This family consists of the hypothetical protein product of the YCF9 gene from chloroplasts and cyanobacteria. These proteins have no known function. 56
45175 396343 pfam01738 DLH Dienelactone hydrolase family. 213
45176 396344 pfam01739 CheR CheR methyltransferase, SAM binding domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase - the C-terminal domain (this one) binds SAM. 190
45177 396345 pfam01740 STAS STAS domain. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. 106
45178 396346 pfam01741 MscL Large-conductance mechanosensitive channel, MscL. 130
45179 396347 pfam01742 Peptidase_M27 Clostridial neurotoxin zinc protease. These toxins are zinc proteases that block neurotransmitter release by proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25. 420
45180 396348 pfam01743 PolyA_pol Poly A polymerase head domain. This family includes nucleic acid independent RNA polymerases, such as Poly(A) polymerase, which adds the poly (A) tail to mRNA EC:2.7.7.19. This family also includes the tRNA nucleotidyltransferase that adds the CCA to the 3' of the tRNA EC:2.7.7.25. This family is part of the nucleotidyltransferase superfamily. 126
45181 396349 pfam01744 GLTT GLTT repeat (6 copies). This short repeat of unknown function is found in multiple copies in several C. elegans proteins. The repeat is five residues long and consists of XGLTT where X can be any amino acid. 28
45182 366786 pfam01745 IPT Isopentenyl transferase. Isopentenyl transferase / dimethylallyl transferase synthesizes isopentenyladenosine 5'-monophosphate, a cytokinin that induces shoot formation on host plants infected with the Ti plasmid. 232
45183 396350 pfam01746 tRNA_m1G_MT tRNA (Guanine-1)-methyltransferase. This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E.coli K12 this enzyme catalyzes the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA. 182
45184 396351 pfam01747 ATP-sulfurylase ATP-sulfurylase. This domain is the catalytic domain of ATP-sulfurylase or sulfate adenylyltransferase EC:2.7.7.4 some of which are part of a bifunctional polypeptide chain associated with adenosyl phosphosulphate (APS) kinase pfam01583. Both enzymes are required for PAPS (phosphoadenosine-phosphosulfate) synthesis from inorganic sulphate. ATP sulfurylase catalyzes the synthesis of adenosine-phosphosulfate APS from ATP and inorganic sulphate. 213
45185 396352 pfam01749 IBB Importin beta binding domain. This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain. 79
45186 396353 pfam01750 HycI Hydrogenase maturation protease. The family consists of hydrogenase maturation proteases. In E. coli HypI the hydrogenase maturation protease is involved in processing of HypE the large subunit of hydrogenases 3, by cleavage of its C-terminal. 130
45187 396354 pfam01751 Toprim Toprim domain. This is a conserved region from DNA primase. This corresponds to the Toprim domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR proteins. Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved in Mg2+ binding and mutations to the conserved glutamate (IV) completely abolish DnaG type primase activity. DNA primase EC:2.7.7.6 is a nucleotidyltransferase; it synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases. Type II DNA topoisomerases catalyze the relaxation of DNA supercoiling by causing transient double strand breaks. 93
45188 396355 pfam01752 Peptidase_M9 Collagenase. This family of enzymes break down collagens. 285
45189 396356 pfam01753 zf-MYND MYND finger. 39
45190 396357 pfam01754 zf-A20 A20-like zinc finger. The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. 23
45191 366794 pfam01755 Glyco_transf_25 Glycosyltransferase family 25 (LPS biosynthesis protein). Members of this family belong to Glycosyltransferase family 25. This is a family of glycosyltransferases involved in lipopolysaccharide (LPS) biosynthesis. These enzymes catalyze the transfer of various sugars onto the growing LPS chain during its biosynthesis. 200
45192 396358 pfam01756 ACOX Acyl-CoA oxidase. This is a family of Acyl-CoA oxidases EC:1.3.3.6. Acyl-CoA oxidase converts acyl-CoA into trans-2-enoyl-CoA. 179
45193 376607 pfam01757 Acyl_transf_3 Acyltransferase family. This family includes a range of acyltransferase enzymes. This domain is found in many as yet uncharacterized C. elegans proteins and it is approximately 300 amino acids long. 330
45194 366796 pfam01758 SBF Sodium Bile acid symporter family. This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae - this is a putative transmembrane protein involved in resistance to arsenic compounds. 191
45195 396359 pfam01759 NTR UNC-6/NTR/C345C module. Sequence similarity between netrin UNC-6 and C345C complement protein family members, and hence the existence of the UNC-6 module, was first reported in. Subsequently, many additional members of the family were identified on the basis of sequence similarity between the C-terminal domains of netrins, complement proteins C3, C4, C5, secreted frizzled-related proteins, and type I pro-collagen C-proteinase enhancer proteins (PCOLCEs), which are homologous with the N-terminal domains of tissue inhibitors of metalloproteinases (TIMPs). The TIMPs are classified as a separate family in Pfam (pfam00965). This expanded domain family has been named as the NTR module. 106
45196 396360 pfam01761 DHQ_synthase 3-dehydroquinate synthase. The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide. 3-dehydroquinate (DHQ) synthase catalyzes the formation of dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino heptulosonic 7 phosphate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. 258
45197 250845 pfam01762 Galactosyl_T Galactosyltransferase. This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2. 196
45198 396361 pfam01763 Herpes_UL6 Herpesvirus UL6 like. This family consists of various proteins from the herpesviridae that are similar to herpes simplex virus type I UL6 virion protein. UL6 is essential for cleavage and packaging of the viral genome. 556
45199 396362 pfam01764 Lipase_3 Lipase (class 3). 139
45200 396363 pfam01765 RRF Ribosome recycling factor. The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are "recycled" and ready for another round of protein synthesis. 163
45201 280020 pfam01766 Birna_VP2 Birnavirus VP2 protein. VP2 is the major structural protein of birnaviruses. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 440
45202 280021 pfam01767 Birna_VP3 Birnavirus VP3 protein. VP3 is a minor structural component of the virus. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 227
45203 280022 pfam01768 Birna_VP4 Birnavirus VP4 protein. VP4 is a viral protease. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 259
45204 396364 pfam01769 MgtE Divalent cation transporter. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.) 122
45205 396365 pfam01770 Folate_carrier Reduced folate carrier. The reduced folate carrier (a transmembrane glycoprotein) transports reduced folate into mammalian cells via the carrier mediated mechanism (as opposed to the receptor mediated mechanism) it also transports cytotoxic folate analogues used in chemotherapy, such as methotrexate (MTX). Mammalian cells have an absolute requirement for exogenous folates which are needed for growth, and biosynthesis of macromolecules. 412
45206 396366 pfam01771 Herpes_alk_exo Herpesvirus alkaline exonuclease. This family includes various alkaline exonucleases from members of the herpesviridae. Alkaline exonuclease appears to have an important role in the replication of herpes simplex virus. 460
45207 396367 pfam01773 Nucleos_tra2_N Na+ dependent nucleoside transporter N-terminus. This family consists of nucleoside transport proteins. Rat Slc28a2 is a purine-specific Na+-nucleoside cotransporter localized to the bile canalicular membrane. Rat Slc28a1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N-terminus of this family. 73
45208 396368 pfam01774 UreD UreD urease accessory protein. UreD is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex and is required for urease nickel metallocenter assembly. See also UreF pfam01730, UreG pfam01495. 164
45209 396369 pfam01775 Ribosomal_L18A Ribosomal proteins 50S-L18Ae/60S-L20/60S-L18A. This family includes: archaeal 50S ribosomal protein L18Ae, often referred to as L20e or LX; fungal 60S ribosomal protein L20; and higher eukaryote 60S ribosomal protein L18A. 60
45210 396370 pfam01776 Ribosomal_L22e Ribosomal L22e protein family. 99
45211 396371 pfam01777 Ribosomal_L27e Ribosomal L27e protein family. The N-terminal region of the eukaryotic ribosomal L27 has the KOW motif. C-terminal region is represented by this family. 85
45212 396372 pfam01778 Ribosomal_L28e Ribosomal L28e protein family. 114
45213 396373 pfam01779 Ribosomal_L29e Ribosomal L29e protein family. 39
45214 396374 pfam01780 Ribosomal_L37ae Ribosomal L37ae protein family. This ribosomal protein is found in archaebacteria and eukaryotes. It contains four conserved cysteine residues that may bind to zinc. 85
45215 396375 pfam01781 Ribosomal_L38e Ribosomal L38e protein family. 67
45216 396376 pfam01782 RimM RimM N-terminal domain. The RimM protein is essential for efficient processing of 16S rRNA. The RimM protein was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the 70S ribosomes. This N-terminal domain is found associated with a PRC-barrel domain. 84
45217 396377 pfam01783 Ribosomal_L32p Ribosomal L32p protein family. 56
45218 396378 pfam01784 NIF3 NIF3 (NGG1p interacting factor 3). This family contains several NIF3 (NGG1p interacting factor 3) protein homologs. NIF3 interacts with the yeast transcriptional coactivator NGG1p which is part of the ADA complex, the exact function of this interaction is unknown. 240
45219 250863 pfam01785 Closter_coat Closterovirus coat protein. This family consists of coat proteins from closteroviruses, members of the closteroviridae. The viral coat protein encapsulates and protects the viral genome. Both the large cp1 and smaller cp2 coat protein originate from the same primary transcript. Members of the closteroviridae include Sugar beet yellow virus and Grapevine leafroll-associated virus; closteroviruses have a positive strand ssRNA genome with no DNA stage during replication. 188
45220 396379 pfam01786 AOX Alternative oxidase. The alternative oxidase is used as a second terminal oxidase in the mitochondria; electrons are transferred directly from reduced ubiquinol to oxygen forming water. This is not coupled to ATP synthesis and is not inhibited by cyanide; this pathway is a single step process. In rice the transcript levels of the alternative oxidase are increased by low temperature. 218
45221 396380 pfam01787 Ilar_coat Ilarvirus coat protein. This family consists of various coat proteins from the ilarviruses, part of the Bromoviridae; members include apple mosaic virus and prune dwarf virus. The ilarvirus coat protein is required to initiate replication of the viral genome in host plants. Members of the Bromoviridae have a positive strand ssRNA genome with no DNA stage in their replication. 204
45222 396381 pfam01788 PsbJ PsbJ. This family consists of the photosystem II reaction centre protein PsbJ from plants and Cyanobacteria. In Synechocystis sp. PCC 6803 PsbJ regulates the number of photosystem II centers in thylakoid membranes, it is a predicted 4kDa protein with one membrane spanning domain. 38
45223 396382 pfam01789 PsbP PsbP. This family consists of the 23 kDa subunit of oxygen evolving system of photosystem II or PsbP from various plants (where it is encoded by the nuclear genome) and Cyanobacteria. The 23 KDa PsbP protein is required for PSII to be fully operational in vivo, it increases the affinity of the water oxidation site for Cl- and provides the conditions required for high affinity binding of Ca2+. 155
45224 396383 pfam01790 LGT Prolipoprotein diacylglyceryl transferase. 238
45225 396384 pfam01791 DeoC DeoC/LacD family aldolase. This family includes diverse aldolase enzymes. This family includes the enzyme deoxyribose-phosphate aldolase EC:4.1.2.4, which is involved in nucleotide metabolism. The family also includes a group of related bacterial proteins of unknown function. The family also includes tagatose 1,6-diphosphate aldolase (EC:4.1.2.40), which is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. 235
45226 396385 pfam01793 Glyco_transf_15 Glycolipid 2-alpha-mannosyltransferase. This is a family of alpha-1,2 mannosyl-transferases involved in N-linked and O-linked glycosylation of proteins. Some of the enzymes in this family have been shown to be involved in O- and N-linked glycan modifications in the Golgi. 313
45227 396386 pfam01794 Ferric_reduct Ferric reductase like transmembrane component. This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity; mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease. 117
45228 396387 pfam01795 Methyltransf_5 MraW methylase family. Members of this family are probably SAM dependent methyltransferases based on Escherichia coli RsmH. This family appears to be related to pfam01596. 309
45229 396388 pfam01796 OB_aCoA_assoc DUF35 OB-fold domain, acyl-CoA-associated. The structure of a DUF35 representative reveals two long N-terminal helices followed by a rubredoxin-like zinc ribbon domain and a C-terminal OB fold domain represented in this entry. OB-folds are frequently found to bind nucleic acids suggesting this domain might bind to DNA or RNA (Topsan http://www.topsan.org/). Genomic context shows it to be adjacent to acyl-CoA transferase (http://www.microbesonline.org/). 65
45230 396389 pfam01797 Y1_Tnp Transposase IS200 like. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS200 from E. coli. 119
45231 396390 pfam01798 Nop snoRNA binding domain, fibrillarin. This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. Fibrillarin, or Nop, is the catalytic subunit responsible for the methyl transfer reaction of the site-specific 2'-O-methylation of ribosomal and spliceosomal RNA. 229
45232 396391 pfam01799 Fer2_2 [2Fe-2S] binding domain. 73
45233 280049 pfam01801 Cytomega_gL Cytomegalovirus glycoprotein L. Glycoprotein L from cytomegalovirus serves as a chaperone for the correct folding and surface expression of glycoprotein H (gH). Glycoprotein L is a member of the heterotrimeric gCIII complex of glycoprotein which also includes gH and gO and has an essential role in viral fusion. 211
45234 396392 pfam01802 Herpes_V23 Herpesvirus VP23 like capsid protein. This family consists of various capsid proteins from members of the herpesviridae. The capsid protein VP23 in herpes simplex virus forms a triplex together with VP19C; these fit between and link together adjacent capsomers as formed by VP5 and VP26. VP3 along with the scaffolding proteins helps to form normal capsids by defining the curvature of the shell and size of the particle. 294
45235 396393 pfam01803 LIM_bind LIM-domain binding protein. The LIM-domain binding protein, binds to the LIM domain pfam00412 of LIM homeodomain proteins which are transcriptional regulators of development. Nuclear LIM interactor (NLI) / LIM domain-binding protein 1 (LDB1) is located in the nuclei of neuronal cells during development, it is co-expressed with Isl1 in early motor neuron differentiation and has a suggested role in the Isl1 dependent development of motor neurons. It is suggested that these proteins act synergistically to enhance transcriptional efficiency by acting as co-factors for LIM homeodomain and Otx class transcription factors both of which have essential roles in development. The Drosophila protein Chip is required for segmentation and activity of a remote wing margin enhancer. Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions. 242
45236 396394 pfam01804 Penicil_amidase Penicillin amidase. Penicillin amidase or penicillin acylase EC:3.5.1.11 catalyzes the hydrolysis of benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA), a key intermediate in the synthesis of penicillins. Also in the family are cephalosporin acylase and aculeacin A acylase, which are involved in the synthesis of related peptide antibiotics. 626
45237 396395 pfam01805 Surp Surp module. This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. 52
45238 280054 pfam01806 Paramyxo_P Paramyxovirinae P phosphoprotein C-terminal region. The subfamily Paramyxovirinae of the family Paramyxoviridae now contains as main genera the rubulaviruses, avulaviruses, respiroviruses, henipaviruses and morbilliviruses. Protein P is the best characterized, structurally, of the replicative complex of N, P and L proteins and consists of two functionally distinct moieties, an N-terminal PNT, and a C-terminal PCT. The P protein is an essential part of the viral RNA polymerase complex formed from the P and L proteins. P protein plays a crucial role in the enzyme by positioning L onto the N/RNA template through an interaction with the C-terminal domain of N. Without P, L is not functional. The C-terminal part of P (PCT) is only functional as an oligomer and forms with L the polymerase complex. PNT is poorly conserved and unstructured in solution while PCT contains the oligomerization domain (PMD) that folds as a homotetrameric coiled coil (40) containing the L binding region and a C-terminal partially folded domain, PX (residues 474 to 568), identified as the nucleocapsid binding site. Interestingly, PX is also expressed as an independent polypeptide in infected cells. PX has a C-subdomain (residues 516 to 568) that consists of three {alpha}-helices arranged in an antiparallel triple-helical bundle linked to an unfolded flexible N-subdomain (residues 474 to 515). 248
45239 280055 pfam01807 zf-CHC2 CHC2 zinc finger. This domain is principally involved in DNA binding in DNA primases. 95
45240 396396 pfam01808 AICARFT_IMPCHas AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalyzing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalyzed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT), this enzyme catalyzes the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. This is catalyzed by a pair of C-terminal deaminase fold domains in the protein, where the active site is formed by the dimeric interface of two monomeric units. The last step is catalyzed by the N-terminal IMP (Inosine monophosphate) cyclohydrolase domain EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. 308
45241 396397 pfam01809 Haemolytic Haemolytic domain. This domain has haemolytic activity. It is found in short (73-103 amino acid) proteins and contains three conserved cysteine residues. 67
45242 280058 pfam01810 LysE LysE type translocator. This family consists of various hypothetical proteins and an l-lysine exporter LysE from Corynebacterium glutamicum which is proposed to be the first of a novel family of translocators. LysE exports l-lysine from the cell into the surrounding medium and is predicted to span the membrane six times. The physiological function of the exporter is to excrete excess l-lysine as a result of natural flux imbalances or peptide hydrolysis, and also after artificial deregulation of l-lysine biosynthesis as used by the biotechnology industry for the production of l-lysine. 193
45243 396398 pfam01812 5-FTHF_cyc-lig 5-formyltetrahydrofolate cyclo-ligase family. 5-formyltetrahydrofolate cyclo-ligase or methenyl-THF synthetase EC:6.3.3.2 catalyzes the interchange of 5-formyltetrahydrofolate (5-FTHF) to 5-10-methenyltetrahydrofolate, this requires ATP and Mg2+. 5-FTHF is used in chemotherapy where it is clinically known as Leucovorin. 186
45244 396399 pfam01813 ATP-synt_D ATP synthase subunit D. This is a family of subunit D from various ATP synthases including V-type H+ transporting and Na+ dependent. Subunit D is suggested to be an integral part of the catalytic sector of the V-ATPase. 194
45245 396400 pfam01814 Hemerythrin Hemerythrin HHE cation binding domain. Iteration of the HHE family found it to be related to Hemerythrin. It also demonstrated that what has been described as a single domain in fact consists of two cation binding domains. Members of this family occur all across nature and are involved in a variety of processes. For instance, in Nereis diversicolor hemerythrin binds Cadmium so as to protect the organism from toxicity. However Hemerythrin is classically described as Oxygen-binding through two attached Fe2+ ions. And the bacterial NorA is a regulator of response to NO, which suggests yet another set-up for its metal ligands. In Staphylococcus aureus the iron-sulfur cluster repair protein ScdA has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger. 128
45246 307776 pfam01815 Rop Rop protein. 57
45247 250888 pfam01816 LRV Leucine rich repeat variant. The function of this repeat is unknown. It has an unusual structure of two helices. One is an alpha helix, the other is the much rarer 3-10 helix. 26
45248 396401 pfam01817 CM_2 Chorismate mutase type II. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine. 79
45249 396402 pfam01818 Translat_reg Bacteriophage translational regulator. The translational regulator protein regA is encoded by the T4 bacteriophage and binds to a region of messenger RNA (mRNA) that includes the initiator codon. RegA is unusual in that it represses the translation of about 35 early T4 mRNAs but does not affect nearly 200 other mRNAs. 122
45250 396403 pfam01819 Levi_coat Levivirus coat protein. The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. The MS2 bacteriophage coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation, by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication. 132
45251 396404 pfam01820 Dala_Dala_lig_N D-ala D-ala ligase N-terminus. This family represents the N-terminal region of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4 which is thought to be involved in substrate binding. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF). This domain is structurally related to the PreATP-grasp domain. 118
45252 396405 pfam01821 ANATO Anaphylotoxin-like domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. 36
45253 396406 pfam01822 WSC WSC domain. This domain may be involved in carbohydrate binding. 80
45254 396407 pfam01823 MACPF MAC/Perforin domain. The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerization of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerizes into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold. 210
45255 280070 pfam01824 MatK_N MatK/TrnK amino terminal region. The function of this region is unknown. 331
45256 396408 pfam01825 GPS GPCR proteolysis site, GPS, motif. The GPS motif is found in GPCRs, and is the site for auto-proteolysis, so is thus named, GPS. The GPS motif is a conserved sequence of ~40 amino acids containing canonical cysteine and tryptophan residues, and is the most highly conserved part of the domain. In most, if not all, cell-adhesion GPCRs these undergo autoproteolysis in the GPS between a conserved aliphatic residue (usually a leucine) and a threonine, serine, or cysteine residue. In higher eukaryotes this motif is found embedded in the C-terminal beta-stranded part of a GAIN domain - GPCR-Autoproteolysis INducing (GAIN). The GAIN-GPS domain adopts a fold in which the GPS motif, at the C-terminus, forms five beta-strands that are tightly integrated into the overall GAIN domain. The GPS motif, evolutionarily conserved from tetrahymena to mammals, is the only extracellular domain shared by all human cell-adhesion GPCRs and PKD proteins, and is the locus of multiple human disease mutations. The GAIN-GPS domain is both necessary and sufficient functionally for autoproteolysis, suggesting an autoproteolytic mechanism whereby the overall GAIN domain fine-tunes the chemical environment in the GPS to catalyze peptide bond hydrolysis. In the cell-adhesion GPCRs and PKD proteins, the GPS motif is always located at the end of their long N-terminal extracellular regions, immediately before the first transmembrane helix of the respective protein. 46
45257 396409 pfam01826 TIL Trypsin Inhibitor like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. 55
45258 396410 pfam01827 FTH FTH domain. This presumed domain is likely to be a protein-protein interaction module. It is found in many proteins from C. elegans. The domain is found associated with the F-box pfam00646. This domain is named FTH after FOG-2 homology domain. 141
45259 396411 pfam01828 Peptidase_A4 Peptidase A4 family. 206
45260 396412 pfam01829 Peptidase_A6 Peptidase A6 family. 314
45261 280076 pfam01830 Peptidase_C7 Peptidase C7 family. 243
45262 280077 pfam01831 Peptidase_C16 Peptidase C16 family. 249
45263 396413 pfam01832 Glucosaminidase Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase. This family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase EC:3.2.1.96, as well as the flagellar protein J, which has been shown to hydrolyze peptidoglycan. 91
45264 396414 pfam01833 TIG IPT/TIG domain. This family consists of a domain that has an immunoglobulin like fold. These domains are found in cell surface receptors such as Met and Ron as well as in intracellular transcription factors where it is involved in DNA binding. CAUTION: This family does not currently recognize a significant number of members. 85
45265 396415 pfam01834 XRCC1_N XRCC1 N terminal domain. 148
45266 396416 pfam01835 A2M_N MG2 domain. This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. 96
45267 396417 pfam01837 HcyBio Homocysteine biosynthesis enzyme, sulfur-incorporation. This presumed domain is about 360 residues long. The function of this domain is unknown. It is found in some proteins that have two C-terminal CBS pfam00571 domains. There are also proteins that contain two inserted Fe4S domains near the C-terminal end of the domain. The Methanothermobacter thermautotrophicus gene MTH_855 product has been misannotated as an inosine monophosphate dehydrogenase based on the similarity to the CBS domains. Based on genetic analyses in the methanogen Methanosarcina acetivorans, this family is a key component of the metabolic network for sulfide assimilation and trafficking in methanogens. It is essential to a novel, O-acetylhomoserine sulfhydrylase-independent pathway for homocysteine biosynthesis, and may catalyze sulfur incorporation into the side chain of an as yet unidentified amino acid precursor. The DUF39-CBS and DUF39-ferredoxin architectures repeatedly occur together in the genomes of methanogenic Archaea, suggesting they may be of diverged function. This is consistent with a phylogenetic reconstruction of the DUF39 family, which clearly distinguishes the CBS-associated and ferredoxin-associated DUF39s. 350
45268 396418 pfam01839 FG-GAP FG-GAP repeat. This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N-terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats. 36
45269 396419 pfam01840 TCL1_MTCP1 TCL1/MTCP1 family. Two related oncogenes, TCL-1 and MTCP-1, are overexpressed in T cell prolymphocytic leukaemias as a result of chromosomal rearrangements that involve the translocation of one T cell receptor gene to either chromosome 14q32 or Xq28. This family contains two repeated motifs that form a single globular domain. 118
45270 376628 pfam01841 Transglut_core Transglutaminase-like superfamily. This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease. 108
45271 396420 pfam01842 ACT ACT domain. This family of domains generally have a regulatory role. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The ACT domain is found in: D-3-phosphoglycerate dehydrogenase EC:1.1.1.95, which is inhibited by serine. Aspartokinase EC:2.7.2.4, which is regulated by lysine. Acetolactate synthase small regulatory subunit, which is inhibited by valine. Phenylalanine-4-hydroxylase EC:1.14.16.1, which is regulated by phenylalanine. Prephenate dehydrogenase EC:4.2.1.51. formyltetrahydrofolate deformylase EC:3.5.1.10, which is activated by methionine and inhibited by glycine. GTP pyrophosphokinase EC:2.7.6.5. 66
45272 396421 pfam01843 DIL DIL domain. The DIL domain has no known function. 103
45273 396422 pfam01844 HNH HNH endonuclease. His-Asn-His (HNH) proteins are a very common family of small nucleic acid-binding proteins that are generally associated with endonuclease activity. 47
45274 396423 pfam01845 CcdB CcdB protein. 99
45275 396424 pfam01846 FF FF domain. This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions. 50
45276 396425 pfam01847 VHL VHL beta domain. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA. 82
45277 396426 pfam01848 HOK_GEF Hok/gef family. 42
45278 396427 pfam01849 NAC NAC domain. 54
45279 396428 pfam01850 PIN PIN domain. 121
45280 396429 pfam01851 PC_rep Proteasome/cyclosome repeat. 35
45281 396430 pfam01852 START START domain. 205
45282 396431 pfam01853 MOZ_SAS MOZ/SAS family. This region of these proteins has been suggested to be homologous to acetyltransferases. 179
45283 396432 pfam01855 POR_N Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-bdg. This family includes the N terminal structural domain of the pyruvate ferredoxin oxidoreductase. This domain binds thiamine diphosphate, and along with domains II and IV, is involved in inter subunit contacts. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited. 230
45284 280099 pfam01856 HP_OMP Helicobacter outer membrane protein. This family seems confined to Helicobacter. It is predicted to be an outer membrane protein based on its pattern of alternating hydrophobic amino acids similar to porins. 154
45285 396433 pfam01857 RB_B Retinoblastoma-associated protein B domain. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold. 131
45286 396434 pfam01858 RB_A Retinoblastoma-associated protein A domain. This domain has the cyclin fold as predicted. 195
45287 396435 pfam01861 DUF43 Protein of unknown function DUF43. This family includes archaebacterial proteins of unknown function. All the members are 350-400 amino acids long. 243
45288 396436 pfam01862 PvlArgDC Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). Methanococcus jannaschii contains homologs of most genes required for spermidine polyamine biosynthesis. Yet genomes from neither this organism nor any other euryarchaeon have orthologues of the pyridoxal 5'-phosphate-dependent ornithine or arginine decarboxylase genes, required to produce putrescine. Instead, these organisms have a new class of arginine decarboxylase (PvlArgDC) formed by the self-cleavage of a proenzyme into a 5-kDa subunit and a 12-kDa subunit that contains a reactive pyruvoyl group. Although this extremely thermostable enzyme has no significant sequence similarity to previously characterized proteins, conserved active site residues are similar to those of the pyruvoyl-dependent histidine decarboxylase enzyme, and its subunits form a similar (alpha-beta)(3) complex. Homologs of PvlArgDC are found in several bacterial genomes, including those of Chlamydia spp., which have no agmatine ureohydrolase enzyme to convert agmatine (decarboxylated arginine) into putrescine. In these intracellular pathogens, PvlArgDC may function analogously to pyruvoyl-dependent histidine decarboxylase; the cells are proposed to import arginine and export agmatine, increasing the pH and affecting the host cell's metabolism. Phylogenetic analysis of PvlArgDC proteins suggests that this gene has been recruited from the euryarchaeal polyamine biosynthetic pathway to function as a degradative enzyme in bacteria. 162
45289 396437 pfam01863 DUF45 Protein of unknown function DUF45. This protein has no known function. Members are found in some archaebacteria, as well as Helicobacter pylori. The proteins are 190-240 amino acids long, with the C-terminus being the most conserved region, containing three conserved histidines. This motif is similar to that found in Zinc proteases, suggesting that this family may also be proteases. 207
45290 280105 pfam01864 CarS-like CDP-archaeol synthase. CDP-archaeol synthase functions in the archaeal lipid biosynthetic pathway. It catalyzes the transfer of the nucleotide to its specific archaeal lipid substrate, leading to the formation of a CDP-activated precursor (CDP-archaeol) to which polar head groups are attached. Bacterial members of this family are uncharacterized. 175
45291 280106 pfam01865 PhoU_div Protein of unknown function DUF47. This family includes prokaryotic proteins of unknown function, as well as a protein annotated as the pit accessory protein from Sinorhizobium meliloti. However, the function of this protein is also unknown (Pit stands for Phosphate transport). It is probably distantly related to pfam01895 (personal obs:Yeats C). 214
45292 396438 pfam01866 Diphthamide_syn Putative diphthamide synthesis protein. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Human DPH1 is a candidate tumor suppressor gene. DPH2 from yeast, which confers resistance to diphtheria toxin has been found to be involved in diphthamide synthesis. Diphtheria toxin inhibits eukaryotic protein synthesis by ADP-ribosylating diphthamide, a post-translationally modified histidine residue present in EF2. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2). 302
45293 396439 pfam01867 Cas_Cas1 CRISPR associated protein Cas1. Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR. 283
45294 396440 pfam01868 UPF0086 Domain of unknown function UPF0086. This family consists of several archaeal and eukaryotic proteins. The archaeal proteins are found to be expressed within ribosomal operons and several of the sequences are described as ribonuclease P protein subunit p29 proteins. 83
45295 396441 pfam01869 BcrAD_BadFG BadF/BadG/BcrA/BcrD ATPase family. This family includes the BadF and BadG proteins that are two subunits of Benzoyl-CoA reductase, that may be involved in ATP hydrolysis. The family also includes an activase subunit from the enzyme 2-hydroxyglutaryl-CoA dehydratase. Aquifex aeolicus aq_278 contains two copies of this region suggesting that the family may structurally dimerize. This family appears to be related to pfam00370. 271
45296 396442 pfam01870 Hjc Archaeal Holliday junction resolvase (hjc). This family of archaebacterial proteins are Holliday junction resolvases (hjc gene). The Holliday junction is an essential intermediate of homologous recombination. This protein is the archaeal equivalent of RuvC but is not sequence similar. 91
45297 396443 pfam01871 AMMECR1 AMMECR1. This family consists of several AMMECR1 as well as several uncharacterized proteins. The contiguous gene deletion syndrome AMME is characterized by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1. This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery. 167
45298 396444 pfam01872 RibD_C RibD C-terminal domain. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C-terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186. 196
45299 396445 pfam01873 eIF-5_eIF-2B Domain found in IF2B/IF5. This family includes the N-terminus of eIF-5, and the C-terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homolog. The region contains a putative zinc binding C4 finger. 115
45300 396446 pfam01874 CitG ATP:dephospho-CoA triphosphoribosyl transferase. The citG gene is found in a gene cluster with citrate lyase subunits. The function of the CitG protein was elucidated as ATP:dephospho-CoA triphosphoribosyl transferase. 258
45301 280116 pfam01875 Memo Memo-like protein. This family contains members from all branches of life. The molecular function of this protein is unknown, but Memo (mediator of ErbB2-driven cell motility) a human protein is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton. 271
45302 396447 pfam01876 RNase_P_p30 RNase P subunit p30. This protein is part of the RNase P complex that is involved in tRNA maturation. 214
45303 396448 pfam01877 RNA_binding RNA binding. PH1010 is composed of five alpha-helices (1-5) and eight beta-strands (1-8) with the following topology: beta-1, alpha-1, beta-2, beta-3, alpha-2, alpha-3, beta-4, beta-5, alpha-4, beta-6, alpha-5, beta-7, beta-8. The first six beta-strands (1-6) form a slightly twisted antiparallel beta-sheet and face five alpha-helices on one side. The last two beta-strands form an antiparallel beta-sheet in the C-terminus. PH1010 forms a characteristic homodimer structure in the crystal. Dimerization of the molecule is crucial for function. The structure resembles that of some ribosomal proteins such as the 50S ribosomal protein L5. Although the structure resembles that of the RRM-type RNA-binding domain of the ribosomal L5 protein, the residues involved in RNA-binding in the L5 protein are not conserved in this family. Despite this, these proteins bind to double-stranded RNA in a non-sequence specific manner. 113
45304 396449 pfam01878 EVE EVE domain. This domain was formerly known as DUF55. Crystal structures have shown that this domain is part of the PUA superfamily. This domain has been named EVE and is thought to be RNA-binding. 146
45305 396450 pfam01880 Desulfoferrodox Desulfoferrodoxin. Desulfoferrodoxins contain two types of iron: an Fe-S4 site very similar to that found in desulforedoxin from Desulfovibrio gigas and an octahedral coordinated high-spin ferrous site most probably with nitrogen/oxygen-containing ligands. Due to this rather unusual combination of active centers, this novel protein is named desulfoferrodoxin. 97
45306 396451 pfam01881 Cas_Cas6 CRISPR associated protein Cas6. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. 108
45307 396452 pfam01882 DUF58 Protein of unknown function DUF58. This family of prokaryotic proteins have no known function. Caldicellulosiruptor saccharolyticus PepX, a protein of unknown function in the family, has been misannotated as alpha-dextrin 6-glucanohydrolase. 86
45308 396453 pfam01883 FeS_assembly_P Iron-sulfur cluster assembly protein. This family has an alpha/beta topology, with 13 conserved hydrophobic residues at its core and a putative active site containing a highly conserved cysteine. Members of this family are involved in a range of physiological functions. The family includes PaaJ (PhaH) from Pseudomonas putida. PaaJ forms a complex with PaaG (PhaF), PaaI (PhaG) and PaaK (PhaI), which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic acid. It also includes PaaD from Escherichia coli, a member of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation. Furthermore, several members of this family are shown to be involved in iron-sulfur (FeS) cluster assembly. Iron-sulfur (FeS) clusters are inorganic co-factors that are able to transfer electrons and act as catalysts. They are involved in diverse cellular processes including cellular respiration, DNA replication and repair, antibiotic resistance, and dinitrogen fixation. The biogenesis of such clusters from elemental iron and sulfur is an enzymatic process that requires a set of specialized proteins. Proteins containing this domain include the chloroplast protein HCF101 (high chlorophyll fluorescence 101), which has been described as an essential and specific factor for assembly of [4Fe-4S]-cluster-containing protein complexes such as the membrane complex Photosystem I (PSI) and the heterodimeric FTR (ferredoxin-thioredoxin reductase) complex and is involved in the assembly of [4Fe-4S] clusters and their transfer to apoproteins. The mature HCF101 protein contains an N-terminal DUF59 domain as well as eight cysteine residues along the sequence. All cysteine residues are conserved among higher plants, but of the two cysteine residues located in the DUF59 domain only Cys128 is highly conserved and is present in the highly conserved P-loop domain of the plant HCF101 (CKGGVGKS). SufT protein from Staphylococcus aureus is composed of DUF59 solely and is shown to be involved in the maturation of FeS proteins. Given all this data, it is hypothesized that DUF59 might play a role in FeS cluster assembly. 72
45309 396454 pfam01884 PcrB PcrB family. This family contains proteins that are related to PcrB. The function of these proteins is unknown. 226
45310 396455 pfam01885 PTS_2-RNA RNA 2'-phosphotransferase, Tpt1 / KptA family. Tpt1 catalyzes the last step of tRNA splicing in yeast. It transfers the splice junction 2'-phosphate from ligated tRNA to NAD, to produce ADP-ribose 1"-2"-cyclic phosphate. This is presumed to be followed by a transesterification step to release the RNA. The first step of this reaction is similar to that catalyzed by some bacterial toxins. E. coli KptA and mouse Tpt1 are likely to use the same reaction mechanism. 172
45311 396456 pfam01886 DUF61 Protein of unknown function DUF61. Protein found in Archaebacteria. These proteins have no known function. 121
45312 396457 pfam01887 SAM_adeno_trans S-adenosyl-l-methionine hydroxide adenosyltransferase. This is a family of proteins, previously known as DUF62, found in archaebacteria and bacteria. The structure of proteins in this family is similar to that of a bacterial fluorinating enzyme. S-adenosyl-l-methionine hydroxide adenosyltransferases utilize a rigorously conserved amino acid side chain triad (Asp-Arg-His) which may have a role in activating water to hydroxide ion. 217
45313 396458 pfam01888 CbiD CbiD. CbiD is essential for cobalamin biosynthesis in both S. typhimurium and B. megaterium; however, no functional role has been ascribed to the protein. The CbiD protein has a putative S-AdoMet binding site. It is possible that CbiD might have the same role as CobF in undertaking the C-1 methylation and deacylation reactions required during the ring contraction process. 258
45314 396459 pfam01889 DUF63 Membrane protein of unknown function DUF63. Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins. 270
45315 396460 pfam01890 CbiG_C Cobalamin synthesis G C-terminus. Members of this family are involved in cobalamin synthesis. The protein encoded by Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. Within the cobalamin synthesis pathway CbiG catalyzes both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B. This family is the C-terminal region, and the mid- and N-terminal parts are conserved independently in other families. 120
45316 396461 pfam01891 CbiM Cobalt uptake substrate-specific transmembrane region. This family of proteins forms part of the cobalt-transport complex in prokaryotes, CbiMNQO. CbiMNQO and NikMNQO are the most widespread groups of microbial transporters for cobalt and nickel ions and are unusual uptake systems as they consist of eg two transmembrane components (CbiM and CbiQ), a small membrane-bound component (CbiN) and an ATP-binding protein (CbiO) but no extracytoplasmic solute-binding protein. Similar components constitute the nickel transporters with some variability in the small membrane-bound component, either NikN or NikL, which are not similar to CbiN at the sequence level. CbiM is the substrate-specific component of the complex and is a seven-transmembrane protein. The CbiMNQO and NikMNQO systems form part of the coenzyme B12 biosynthesis pathway. The NikM protein is pfam10670. 202
45317 396462 pfam01893 UPF0058 Uncharacterized protein family UPF0058. This archaebacterial protein has no known function. 86
45318 396463 pfam01894 UPF0047 Uncharacterized protein family UPF0047. This family has no known function. The alignment contains a conserved aspartate and histidine that may be functionally important. 116
45319 396464 pfam01895 PhoU PhoU domain. This family contains phosphate regulatory proteins including PhoU. PhoU proteins are known to play a role in the regulation of phosphate uptake. The PhoU domain is composed of a three helix bundle. The PhoU protein contains two copies of this domain. The domain binds to an iron cluster via its conserved E/DXXXD motif. 87
45320 396465 pfam01896 DNA_primase_S DNA primase small subunit. DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large and small subunits. This family also includes baculovirus late expression factor 1 or LEF-1 proteins. Baculovirus LEF-1 is a DNA primase enzyme. The family also contains many bacterial DNA primases. 158
45321 396466 pfam01899 MNHE Na+/H+ ion antiporter subunit. Subunit of a Na+/H+ Prokaryotic antiporter complex. 150
45322 396467 pfam01900 RNase_P_Rpp14 Rpp14/Pop5 family. tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29, Rpp30, Rpp38, and Rpp40. This protein is known as Pop5 in eukaryotes. 102
45323 396468 pfam01901 O_anti_polymase Putative O-antigen polymerase. Archaebacterial proteins of unknown function. Members of this family may be transmembrane proteins. These are potentially O-antigen assembly enzymes, with up to 11 transmembrane regions. 337
45324 280139 pfam01902 Diphthami_syn_2 Diphthamide synthase. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2). In some members of this family this domain is associated with pfam01042. The enzyme classification is EC:6.3.1.14. 219
45325 396469 pfam01903 CbiX CbiX. The function of CbiX is uncertain, however it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation. 106
45326 396470 pfam01904 DUF72 Protein of unknown function DUF72. The function of this family is unknown. 219
45327 396471 pfam01905 DevR CRISPR-associated negative auto-regulator DevR/Csa2. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. This family used to be known as DUF73. DevR appears to be negative auto-regulator within the system. 268
45328 396472 pfam01906 YbjQ_1 Putative heavy-metal-binding. From comparative structural analysis, this family is likely to be a heavy-metal binding domain. The domain oligomerizes as a pentamer. The domain is about 100 amino acids long and is found in prokaryotes. 99
45329 396473 pfam01907 Ribosomal_L37e Ribosomal protein L37e. This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc. 53
45330 396474 pfam01909 NTP_transf_2 Nucleotidyltransferase domain. Members of this family belong to a large family of nucleotidyltransferases. This family includes kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase inactivates antibiotics by catalyzing the addition of a nucleotidyl group onto the drug. 91
45331 396475 pfam01910 Thiamine_BP Thiamine-binding protein. The crystal structure of two of these members shows that this domain has a ferredoxin-like fold and is likely to exist as at least homodimers. Sulphate ions are located at the dimer interfaces, which are thought to confer additional stability. Although the function of this domain remains to be identified, its structure suggests a role in protein-protein interactions possibly regulated by the binding of small-molecule ligands. Solution of the structure of the hyperthermophilic anaerobic Thermotoga maritima sequence, UniProtKB:Q9WYV6, shows that this has a beta-alpha-beta-beta-alpha-beta ferredoxin-like fold and assembles as a homotetramer. It was possible to identify a pocket in each monomer that bound an unidentified ligand. It was also found that it bound charged thiamine though not hydroxymethyl pyrimidine. It is proposed that it is transporting charged thiamine around the cytoplasm. Under oxidative conditions this bacterium is under stress, and the transcriptional unit within which this protein is expressed is up-regulated in these conditions, suggesting that the chelation of cytoplasmic thiamine is part of the response mechanism to such oxidative stress, which is mediated by this family. 92
45332 396476 pfam01912 eIF-6 eIF-6 family. This family includes eukaryotic translation initiation factor 6 as well as presumed archaebacterial homologs. 196
45333 396477 pfam01913 FTR Formylmethanofuran-tetrahydromethanopterin formyltransferase. This enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. N-terminal distal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with C-terminal proximal lobe. 144
45334 280149 pfam01914 MarC MarC family integral membrane protein. Integral membrane protein family that includes the protein MarC. MarC was thought to be a multiple antibiotic resistance protein. Nevertheless, a study has shown that MarC is not involved in multiple antibiotic resistance. The function of this family is unclear. 203
45335 396478 pfam01915 Glyco_hydro_3_C Glycosyl hydrolase family 3 C-terminal domain. This domain is involved in catalysis and may be involved in binding beta-glucan. This domain is found associated with pfam00933. 216
45336 396479 pfam01916 DS Deoxyhypusine synthase. Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the post-translational formation of hypusine is catalyzed by the enzyme deoxyhypusine synthase (DS) EC:1.1.1.249. The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation. 284
45337 396480 pfam01917 Arch_flagellin Archaebacterial flagellin. Members of this family are the proteins that form the flagella in archaebacteria. 160
45338 396481 pfam01918 Alba Alba. Alba is a novel chromosomal protein that coats archaeal DNA without compacting it. 66
45339 396482 pfam01920 Prefoldin_2 Prefoldin subunit. This family includes prefoldin subunits that are not detected by pfam02996. 102
45340 396483 pfam01921 tRNA-synt_1f tRNA synthetases class I (K). This family includes only lysyl tRNA synthetases from prokaryotes. 357
45341 396484 pfam01922 SRP19 SRP19 protein. The signal recognition particle (SRP) binds to the signal peptide of proteins as they are being translated. The binding of the SRP halts translation and the complex is then transported to the endoplasmic reticulum's cytoplasmic surface. The SRP then aids translocation of the protein through the ER membrane. The SRP is a ribonucleoprotein that is composed of a small RNA and several proteins. One of these proteins is the SRP19 protein (Sec65 in yeast). 94
45342 396485 pfam01923 Cob_adeno_trans Cobalamin adenosyltransferase. This family contains the gene products of PduO and EutT, which are both cobalamin adenosyltransferases. PduO is a protein with ATP:cob(I)alamin adenosyltransferase activity. The main role of this protein is the conversion of inactive cobalamins to AdoCbl for 1,2-propanediol degradation. The EutT enzyme appears to be an adenosyl transferase, converting CNB12 to AdoB12. 163
45343 396486 pfam01924 HypD Hydrogenase formation hypA family. HypD is involved in hydrogenase formation. It contains many possible metal binding residues, which may bind to nickel. Transposon Tn5 insertions into hypD resulted in R. leguminosarum mutants that lacked any hydrogenase activity in symbiosis with peas. 352
45344 396487 pfam01925 TauE Sulfite exporter TauE/SafE. This is a family of integral membrane proteins where the alignment appears to contain two duplicated modules of three transmembrane helices. The proteins are involved in the transport of anions across the cytoplasmic membrane during taurine metabolism as an exporter of sulfoacetate. This family used to be known as DUF81. 235
45345 396488 pfam01926 MMR_HSR1 50S ribosome-binding GTPase. The full-length GTPase protein is required for the complete activity of the protein of interacting with the 50S ribosome and binding of both adenine and guanine nucleotides, with a preference for guanine nucleotide. 113
45346 396489 pfam01927 Mut7-C Mut7-C RNAse domain. RNAse domain of the PIN fold with an inserted Zinc Ribbon at the C-terminus. 145
45347 396490 pfam01928 CYTH CYTH domain. These sequences are functionally identified as members of the adenylate cyclase family, which catalyzes the conversion of ATP to 3',5'-cyclic AMP and pyrophosphate. Six distinct non-homologous classes of AC have been identified. The structure of three classes of adenylyl cyclases have been solved. 172
45348 396491 pfam01929 Ribosomal_L14e Ribosomal protein L14. This family includes the eukaryotic ribosomal protein L14. 75
45349 396492 pfam01930 Cas_Cas4 Domain of unknown function DUF83. This domain has no known function. The domain contains three conserved cysteines at its C-terminus. 162
45350 396493 pfam01931 NTPase_I-T Protein of unknown function DUF84. NTPase_I-T is a family of NTPases with supreme activity against ITP and XTP. Active site analysis and structure comparison of YjjX strongly suggested that it is an NTP binding protein with nucleoside triphosphatase activity. YjjX exhibits a mixed alpha-beta fold. 163
45351 396494 pfam01933 UPF0052 Uncharacterized protein family UPF0052. 249
45352 396495 pfam01934 DUF86 Protein of unknown function DUF86. The function of members of this family is unknown. 120
45353 376671 pfam01935 DUF87 Domain of unknown function DUF87. The function of this prokaryotic domain is unknown. It contains several conserved aspartates and histidines that could be metal ligands. 220
45354 376672 pfam01936 NYN NYN domain. These domains are found in the eukaryotic proteins typified by the Nedd4-binding protein 1 and the bacterial YacP-like proteins (Nedd4-BP1, YacP nucleases; NYN domains). The NYN domain shares a common protein fold with two other previously characterized groups of nucleases, namely the PIN (PilT N-terminal) and FLAP/5' --> 3' exonuclease superfamilies. These proteins share a common set of 4 acidic conserved residues that are predicted to constitute their active site. Based on the conservation of the acidic residues and structural elements Aravind and colleagues suggest that PIN and NYN domains are likely to bind only a single metal ion, unlike the FLAP/5' --> 3' exonuclease superfamily, which binds two metal ions. Based on conserved gene neighborhoods Aravind and colleagues infer that the bacterial members are likely to be components of the processome/degradosome that process tRNAs or ribosomal RNAs. 137
45355 396496 pfam01937 DUF89 Protein of unknown function DUF89. This family has no known function. 303
45356 396497 pfam01938 TRAM TRAM domain. This small domain has no known function. However it may perform a nucleic acid binding role (Bateman A. unpublished observation). 59
45357 280172 pfam01939 NucS Endonuclease NucS. Endonuclease NucS cleaves both 3' and 5' ssDNA extremities of branched DNA structures and it binds to ssDNA. 229
45358 396498 pfam01940 DUF92 Integral membrane protein DUF92. Members of this family have several predicted transmembrane helices. The function of these prokaryotic proteins is unknown. 238
45359 396499 pfam01941 AdoMet_Synthase S-adenosylmethionine synthetase (AdoMet synthetase). This family consists of several archaebacterial S-adenosylmethionine synthetases (AdoMet synthetase or MAT) (EC 2.5.1.6). S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. The biological roles of AdoMet include acting as the primary methyl group donor, as a precursor to the polyamines, and as a progenitor of a 5'-deoxyadenosyl radical. S-Adenosylmethionine synthetase catalyzes the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion formed. MATs from various organisms contain ~400-amino acid polypeptide chains. 394
45360 280175 pfam01943 Polysacc_synt Polysaccharide biosynthesis protein. Members of this family are integral membrane proteins. Many members of the family are implicated in production of polysaccharide. The family includes RfbX part of the O antigen biosynthesis operon. The family includes SpoVB from Bacillus subtilis, which is involved in spore cortex biosynthesis. 273
45361 396500 pfam01944 SpoIIM Stage II sporulation protein M. SpoIIM is one of four stage II sporulation proteins that are necessary for the forespore inside the mother-cell to be properly internalized through the breakdown of peptidoglycans trapped between the membranes of the septum separating the forespore and the mother-cell. The four proteins working in sequence are SpoIIB, pfam05036, SpoIIM, SpoIIP, pfam07454, and finally SpoIID, pfam08486. D, M and P are in a complex with each other and the complex assembles in a hierarchical manner such that M, which serves as a membrane anchor, recruits P to the septum and P, in turn, recruits D to the septum. 172
45362 396501 pfam01946 Thi4 Thi4 family. This family includes a putative thiamine biosynthetic enzyme. 230
45363 366866 pfam01947 DUF98 Protein of unknown function (DUF98). This is a family of uncharacterized proteins. 149
45364 396502 pfam01948 PyrI Aspartate carbamoyltransferase regulatory chain, allosteric domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The N-terminal domain has ferredoxin-like fold, and provides the regulatory chain dimerization interface. 92
45365 396503 pfam01949 DUF99 Protein of unknown function DUF99. The function of this archaebacterial protein family is unknown. 173
45366 396504 pfam01950 FBPase_3 Fructose-1,6-bisphosphatase. This is a family of bacterial and archaeal fructose-1,6-bisphosphatases (FBPases). FBPase catalyzes the hydrolysis of D-fructose-1,6-bisphosphate (FBP) to D-fructose-6-phosphate (F6P) and orthophosphate and is an essential regulatory enzyme in the gluconeogenic pathway. 357
45367 396505 pfam01951 Archease Archease protein family (MTH1598/TM1083). This archease family of proteins, has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism. 136
45368 376681 pfam01954 DUF104 Protein of unknown function DUF104. This family includes short archaebacterial proteins of unknown function. Archaeoglobus fulgidus has twelve copies of this protein, with several being clustered together in the genome. 56
45369 396506 pfam01955 CbiZ Adenosylcobinamide amidohydrolase. This prokaryotic protein family includes CbiZ which converts adenosylcobinamide (AdoCbi) to adenosylcobyric acid (AdoCby), an intermediate of the de novo coenzyme B12 biosynthetic route. 193
45370 396507 pfam01956 DUF106 Integral membrane protein DUF106. This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. 169
45371 396508 pfam01957 NfeD NfeD-like C-terminal, partner-binding. NfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). There appear to be three major groups: an ancestral group with only an N-terminal serine protease domain and this C-terminal beta sheet-rich domain which is structurally very similar to the OB-fold domain, associated with its neighboring slipin cluster; a second major group with an additional middle, membrane-spanning domain, associated in some species with eoslipin and in others with yqfA; a final 'artificial' group which unites truncated forms lacking the protease region and associated with their ancestral gene partner, either yqfA or eoslipin. This NfeD C-terminal domain appears to be the major one for relating to the associated protein. NfeD homologs are clearly reliant on their conserved gene neighbor which is assumed to be necessary for function, either through direct physical interaction or by functioning in the same pathway, possibly involved with lipid rafts. 90
45372 396509 pfam01958 DUF108 Domain of unknown function DUF108. This family has no known function. It is found to compose the complete protein in archaebacteria and a single domain in a large C. elegans protein. 89
45373 396510 pfam01959 DHQS 3-dehydroquinate synthase II (EC 1.4.1.24). 3-Dehydroquinate synthase II was isolated from the archaeon Methanocaldococcus jannaschii and plays a key role in an alternative pathway for the biosynthesis of 3-dehydroquinate (DHQ), an intermediate of the canonical pathway for the biosynthesis of aromatic amino acids. The enzyme catalyzes a two-step reaction - an oxidative deamination, followed by cyclization. The enzyme converts 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonate to 3-dehydroquinate. 347
45374 396511 pfam01960 ArgJ ArgJ family. Members of the ArgJ family catalyze the first EC:2.3.1.1 and fifth steps EC:2.3.1.35 in arginine biosynthesis. 373
45375 396512 pfam01963 TraB TraB family. pAD1 is a haemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus. This family also contains the Tiki proteins that regulate Wnt signalling. 260
45376 396513 pfam01964 ThiC_Rad_SAM Radical SAM ThiC family. ThiC is found within the thiamine biosynthesis operon. ThiC is involved in pyrimidine biosynthesis. ThiC participates in the formation of 4-Amino-5-hydroxymethyl-2-methylpyrimidine from AIR, an intermediate in the de novo pyrimidine biosynthesis. ThiC is a member of the radical SAM superfamily. 418
45377 396514 pfam01965 DJ-1_PfpI DJ-1/PfpI family. The family includes the protease PfpI. This domain is also found in transcriptional regulators. 165
45378 396515 pfam01966 HD HD domain. HD domains are metal dependent phosphohydrolases. 110
45379 396516 pfam01967 MoaC MoaC family. Members of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known. 136
45380 396517 pfam01968 Hydantoinase_A Hydantoinase/oxoprolinase. This family includes the enzymes hydantoinase and oxoprolinase EC:3.5.2.9. Both reactions involve the hydrolysis of 5-membered rings via hydrolysis of their internal imide bonds. 288
45381 396518 pfam01969 DUF111 Protein of unknown function DUF111. This prokaryotic family has no known function. 380
45382 396519 pfam01970 TctA Tripartite tricarboxylate transporter TctA family. This family, formerly known as DUF112, is a family of bacterial and archaeal tripartite tricarboxylate transporters of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. TctA is part of the tripartite TctABC system which, as characterized in S. typhimurium, is a secondary carrier that depends for activity on the extracytoplasmic tricarboxylate-binding receptor TctC as well as two integral membrane proteins, TctA and TctB. Complete three-component systems are found only in bacteria. TctA is a large transmembrane protein with up to 12 predicted membrane spanning regions in bacteria and up to 11 such in archaea, with the N-terminal within the cytoplasm. TctA is thought to be a permease, and in most other bacteria functions without TctB and TctC molecules. 415
45383 110924 pfam01972 SDH_sah Serine dehydrogenase proteinase. This family of archaebacterial proteins, formerly known as DUF114, has been found to be a serine dehydrogenase proteinase distantly related to ClpP proteinases that belong to the serine proteinase superfamily. The family has a catalytic triad of Ser, Asp, His residues, which shows an altered residue ordering compared with the ClpP proteinases but similar to that of the carboxypeptidase clan. 286
45384 396520 pfam01973 MAF_flag10 Protein of unknown function DUF115. This family of archaebacterial proteins has no known function. 171
45385 396521 pfam01974 tRNA_int_endo tRNA intron endonuclease, catalytic C-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9. 85
45386 396522 pfam01975 SurE Survival protein SurE. E. coli cells with the surE gene disrupted are found to survive poorly in stationary phase. It is suggested that SurE may be involved in stress response. Yeast also contains a member of the family. Yarrowia lipolytica PHO2 can complement a mutation in acid phosphatase, suggesting that members of this family could be phosphatases. 187
45387 396523 pfam01976 DUF116 Protein of unknown function DUF116. This archaebacterial protein has no known function. The protein contains seven conserved cysteines and may also be an integral membrane protein. 152
45388 396524 pfam01977 UbiD 3-octaprenyl-4-hydroxybenzoate carboxy-lyase. This family has been characterized as 3-octaprenyl-4-hydroxybenzoate carboxy-lyase enzymes. This enzyme catalyzes the third reaction in ubiquinone biosynthesis. For optimal activity the carboxy-lyase was shown to require Mn2+. 400
45389 396525 pfam01978 TrmB Sugar-specific transcriptional regulator TrmB. One member of this family, TrmB, has been shown to be a sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter in Thermococcus litoralis. 67
45390 396526 pfam01979 Amidohydro_1 Amidohydrolase family. This family of enzymes is a large metal dependent hydrolase superfamily. The family includes Adenine deaminase EC:3.5.4.2 that hydrolyzes adenine to form hypoxanthine and ammonia. The adenine deaminase reaction is important for adenine utilisation as a purine and also as a nitrogen source. This family also includes dihydroorotase and N-acetylglucosamine-6-phosphate deacetylases, EC:3.5.1.25. These enzymes catalyze the reaction N-acetyl-D-glucosamine 6-phosphate + H2O <=> D-glucosamine 6-phosphate + acetate. This family includes the catalytic domain of urease alpha subunit. Dihydroorotases (EC:3.5.2.3) are also included. 335
45391 396527 pfam01980 UPF0066 Uncharacterized protein family UPF0066. 116
45392 396528 pfam01981 PTH2 Peptidyl-tRNA hydrolase PTH2. Peptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation. 115
45393 396529 pfam01982 CTP-dep_RFKase Domain of unknown function DUF120. This domain is a CTP-dependent riboflavin kinase (RFK), found in archaea, that catalyzes the phosphorylation of riboflavin to form flavin mononucleotide in riboflavin biosynthesis EC:2.7.1.26. Its structure resembles a RIFT barrel, structurally similar to but topologically distinct from bacterial and eukaryotic examples. The N-terminal is a winged helix-turn-helix DNA-binding domain, and the C-terminal half is most similar in sequence to a group of cradle-loop barrels. Archaeoglobus fulgidus RibK has this domain attached to pfam00325. 121
45394 251014 pfam01983 CofC Guanylyl transferase CofC like. Coenzyme F420 is a hydride carrier cofactor that functions during methanogenesis. This family of proteins represents CofC, a nucleotidyl transferase that is involved in coenzyme F420 biosynthesis. CofC has been shown to catalyze the formation of lactyl-2-diphospho-5'-guanosine from 2-phospho-L-lactate and GTP. 217
45395 396530 pfam01984 dsDNA_bind Double-stranded DNA-binding domain. This domain is believed to bind double-stranded DNA of 20 bases length. 107
45396 396531 pfam01985 CRS1_YhbY CRS1 / YhbY (CRM) domain. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localizes to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. 81
45397 396532 pfam01986 DUF123 Domain of unknown function DUF123. This archaebacterial domain has no known function. It is attached to an endonuclease domain in Methanocaldococcus jannaschii endonuclease III (nth). The domain contains several conserved cysteines and histidines. This suggests that the domain may be a zinc binding nucleic acid interaction domain (Bateman A unpubl.). 96
45398 396533 pfam01987 AIM24 Mitochondrial biogenesis AIM24. In eukaryotes, this domain is involved in mitochondrial biogenesis. Its function in prokaryotes is unknown. 206
45399 396534 pfam01988 VIT1 VIT family. This family includes the vacuolar Fe2+/Mn2+ uptake transporter, Ccc1 and the vacuolar iron transporter VIT1. 212
45400 396535 pfam01989 DUF126 Protein of unknown function DUF126. This archaebacterial protein family has no known function. 75
45401 396536 pfam01990 ATP-synt_F ATP synthase (F/14-kDa) subunit. This family includes 14-kDa subunit from vATPases, which is in the peripheral catalytic part of the complex. The family also includes archaebacterial ATP synthase subunit F. 91
45402 396537 pfam01991 vATP-synt_E ATP synthase (E/31 kDa) subunit. This family includes the vacuolar ATP synthase E subunit, as well as the archaebacterial ATP synthase E subunit. 199
45403 396538 pfam01992 vATP-synt_AC39 ATP synthase (C/AC39) subunit. This family includes the AC39 subunit from vacuolar ATP synthase, and the C subunit from archaebacterial ATP synthase. The family also includes subunit C from the Sodium transporting ATP synthase from Enterococcus hirae. 333
45404 396539 pfam01993 MTD methylene-5,6,7,8-tetrahydromethanopterin dehydrogenase. This enzyme family is involved in formation of methane from carbon dioxide EC:1.5.99.9. The enzyme requires coenzyme F420. 274
45405 396540 pfam01994 Trm56 tRNA ribose 2'-O-methyltransferase, aTrm56. This family is an aTrm56 that catalyzes the 2'-O-methylation of the cytidine residue in archaeal tRNA, using S-adenosyl-L-methionine. Biochemical assays showed that aTrm56 forms a dimer and prefers the L-shaped tRNA to the lambda form as its substrate. aTrm56 consists of the SPOUT domain, which contains the characteristic deep trefoil knot for AdoMet binding, and a unique C-terminal beta-hairpin. 119
45406 396541 pfam01995 DUF128 Domain of unknown function DUF128. This archaebacterial protein family has no known function. The domain is found duplicated in Methanothermobacter thermautotrophicus MTH_1569. Many of these are attached to an N-terminal winged helix domain suggesting these are transcriptional regulators and that this domain has a ligand binding function. 238
45407 396542 pfam01996 F420_ligase F420-0:Gamma-glutamyl ligase. F420-0:Gamma-glutamyl ligase (EC:6.3.2.-) is an enzyme involved in F420 biosynthesis pathway. It catalyzes the GTP-dependent successive addition of multiple gamma-linked L-glutamates to the L-lactyl phosphodiester of 7,8-didemethyl-8-hydroxy-5-deazariboflavin (F420-0). This reaction produces polyglutamated F420 derivatives. GTP + F420-0 + n L-glutamate -> GDP + phosphate + F420-n 216
45408 396543 pfam01997 Translin Translin family. Members of this family include Translin, which interacts with DNA and forms a ring around the DNA. This family also includes human TSNAX, which was found to interact with translin with yeast two-hybrid screen. 196
45409 396544 pfam01998 DUF131 Protein of unknown function DUF131. This archaebacterial protein family has no known function. The proteins are predicted to contain two transmembrane helices. 62
45410 280223 pfam02001 DUF134 Protein of unknown function DUF134. This family of archaeal proteins has no known function. 98
45411 280224 pfam02002 TFIIE_alpha TFIIE alpha subunit. The general transcription factor TFIIE has an essential role in eukaryotic transcription initiation together with RNA polymerase II and other general factors. Human TFIIE consists of two subunits TFIIE-alpha and TFIIE-beta and joins the pre-initiation complex after RNA polymerase II and TFIIF. This family consists of the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria that are presumed to be TFIIE-alpha subunits also Archaeoglobus fulgidus tfe. 105
45412 396545 pfam02005 TRM N2,N2-dimethylguanosine tRNA methyltransferase. This enzyme EC:2.1.1.32 uses S-AdoMet to methylate tRNA. The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaebacteria. 375
45413 396546 pfam02006 DUF137 Protein of unknown function DUF137. This family of archaeal proteins has no known function. 176
45414 280227 pfam02007 MtrH Tetrahydromethanopterin S-methyltransferase MtrH subunit. The enzyme tetrahydromethanopterin S-methyltransferase EC:2.1.1.86 is composed of eight subunits. The enzyme is a membrane-associated enzyme complex which catalyzes an energy-conserving, sodium-ion-translocating step in methanogenesis from hydrogen and carbon dioxide. 299
45415 366873 pfam02008 zf-CXXC CXXC zinc finger domain. This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two repeats, and shows a peculiar internal duplication in which the second unit is inserted into the first one. Each of these units is characterized by four conserved cysteines, displaying a CXXCXXCX(n)C motif that chelates a Zn2+ ion. The DNA binding interface has been identified by NMR. In eukaryotes, the CXXC domain is found in stramenopiles, plants and metazoans. Plants possess a mono-CXXC domain that is present in distinct chromatin proteins. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases. 48
45416 396547 pfam02009 RIFIN Rifin. Plasmodium falciparum is the causative agent of deadly malaria disease. It encodes repetitive interspersed families of polypeptides (RIFINs), which are expressed on the surface of infected erythrocytes. All RIFIN sequences contain the PEXEL motif (a pentameric sequence RxLxE/Q/D, known as the Plasmodium export element) required for correct export and surface expression or host-targeting (HT) signal which plays a central role in the export of proteins into the host cell. It has been reported that PEXEL is preferably located 15-20 amino acids downstream of an N-terminal hydrophobic signal sequence. The RIFIN protein family can be divided into A and B types based on the presence or absence of a 25 amino acid motif located approximately 66 amino acids downstream of the PEXEL motif, with A- and B-types serving different roles in distinct parasite stages. The specific type B RIFIN variant (PF13_0006) is expressed on the surface of free merozoites, internally in developing gametocytes and on the surface of gametes at the point of emerging from activated, mature stage V gametocytes. Type A RIFINs, by contrast, are expressed on the infected erythrocyte surface, potentially contributing to the antigenic variation capacity of the parasite. 326
45417 366875 pfam02010 REJ REJ domain. The REJ (Receptor for Egg Jelly) domain is found in PKD1 and the sperm receptor for egg jelly. The function of this domain is unknown. The domain is 600 amino acids long so is probably composed of multiple structural domains. There are six completely conserved cysteine residues that may form disulphide bridges. This region contains tandem PKD-like domains. 448
45418 396548 pfam02011 Glyco_hydro_48 Glycosyl hydrolase family 48. Members of this family are endoglucanase EC:3.2.1.4 and exoglucanase EC:3.2.1.91 enzymes that cleave cellulose or related substrate. 620
45419 396549 pfam02012 BNR BNR/Asp-box repeat. Members of this family contain multiple BNR (bacterial neuraminidase repeat) repeats or Asp-boxes. The repeats are short, however the repeats are never found closer than 40 residues together suggesting that the repeat is structurally longer. These repeats are found in many glycosyl hydrolases as well as other extracellular proteins of unknown function. 12
45420 251036 pfam02013 CBM_10 Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. 36
45421 396550 pfam02014 Reeler Reeler domain. 129
45422 280233 pfam02015 Glyco_hydro_45 Glycosyl hydrolase family 45. 214
45423 396551 pfam02016 Peptidase_S66 LD-carboxypeptidase. Muramoyl-tetrapeptide carboxypeptidase hydrolyzes a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66. 119
45424 396552 pfam02017 CIDE-N CIDE-N domain. This domain is found in CAD nuclease and ICAD, the inhibitor of CAD nuclease. The two proteins interact through this domain. 75
45425 396553 pfam02018 CBM_4_9 Carbohydrate binding domain. This family includes diverse carbohydrate binding domains. 134
45426 396554 pfam02019 WIF WIF domain. The WIF domain is found in the RYK tyrosine kinase receptors and WIF the Wnt-inhibitory-factor. The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain is Wnt binding in WIF, and it has been suggested that RYK may also bind to Wnt. The WIF domain is a member of the immunoglobulin superfamily, and it comprises nine beta-strands and two alpha-helices, with two of the beta-strands (6 and 9) interrupted by four and six residues of irregular secondary structure, respectively. Considering that the activity of Wnts depends on the presence of a palmitoylated cysteine residue in their amino-terminal polypeptide segment, Wnt proteins are lipid-modified and can act as stem cell growth factors, it is likely that the WIF domain recognizes and binds to Wnts that have been activated by palmitoylation and that the recognition of palmitoylated Wnts by WIF-1 is effected by its WIF domain rather than by its EGF domains. A strong binding affinity for palmitoylated cysteine residues would further explain the remarkably high affinity of human WIF-1 not only for mammalian Wnts, but also for Wnts from Xenopus and Drosophila. 126
45427 396555 pfam02020 W2 eIF4-gamma/eIF5/eIF2-epsilon. This domain of unknown function is found at the C-terminus of several translation initiation factors. 78
45428 396556 pfam02021 UPF0102 Uncharacterized protein family UPF0102. The function of this family is unknown. 92
45429 396557 pfam02022 Integrase_Zn Integrase Zinc binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. This domain is the amino-terminal zinc binding domain. The central domain is the catalytic domain pfam00665. The carboxyl terminal domain is a DNA binding domain pfam00552. 37
45430 396558 pfam02023 SCAN SCAN domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerization. 87
45431 396559 pfam02024 Leptin Leptin. 142
45432 396560 pfam02025 IL5 Interleukin 5. 112
45433 396561 pfam02026 RyR RyR domain. This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown. 91
45434 366884 pfam02027 RolB_RolC RolB/RolC glucosidase family. This family of proteins includes RolB and RolC. RolC releases cytokinins from glucoside conjugates, whereas RolB hydrolyzes indole glucosides. 184
45435 396562 pfam02028 BCCT BCCT, betaine/carnitine/choline family transporter. 484
45436 396563 pfam02029 Caldesmon Caldesmon. 474
45437 307931 pfam02030 Lipoprotein_8 Hypothetical lipoprotein (MG045 family). This family includes hypothetical lipoproteins; the amino terminal part of this protein is related to pfam01547, a family of solute binding proteins. This suggests this family also has a solute binding function. 493
45438 280248 pfam02031 Peptidase_M7 Streptomyces extracellular neutral proteinase (M7) family. 133
45439 396564 pfam02033 RBFA Ribosome-binding factor A. 104
45440 396565 pfam02035 Coagulin Coagulin. 172
45441 396566 pfam02036 SCP2 SCP-2 sterol transfer family. This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C-terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145. 99
45442 396567 pfam02037 SAP SAP domain. The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins. 35
45443 396568 pfam02038 ATP1G1_PLM_MAT8 ATP1G1/PLM/MAT8 family. 46
45444 307937 pfam02040 ArsB Arsenical pump membrane protein. 423
45445 396569 pfam02041 Auxin_BP Auxin binding protein. 164
45446 396570 pfam02042 RWP-RK RWP-RK domain. This domain is named RWP-RK after a conserved motif at the C-terminus of the presumed domain. The domain is found in algal minus dominance proteins as well as plant proteins involved in nitrogen-controlled development. 49
45447 396571 pfam02043 Bac_chlorC Bacteriochlorophyll C binding protein. 80
45448 280257 pfam02044 Bombesin Bombesin-like peptide. 14
45449 396572 pfam02045 CBFB_NFYA CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. 56
45450 396573 pfam02046 COX6A Cytochrome c oxidase subunit VIa. 112
45451 307942 pfam02048 Enterotoxin_ST Heat-stable enterotoxin ST. This family consists of the heat stable enterotoxin ST from Escherichia coli. ST is a small peptide of 18 or 19 amino acid residues produced by enterotoxigenic E. coli and is one of the causes of acute diarrhoea in infants and travellers in developing countries. ST triggers a biological response by binding to a membrane-associated guanylyl cyclase C which is located on intestinal epithelial cell membranes. 54
45452 396574 pfam02049 FliE Flagellar hook-basal body complex protein FliE. 89
45453 396575 pfam02050 FliJ Flagellar FliJ protein. 123
45454 110996 pfam02052 Gallidermin Gallidermin. 52
45455 366894 pfam02053 Gene66 Gene 66 (IR5) protein. 209
45456 307945 pfam02055 Glyco_hydro_30 Glycosyl hydrolase family 30 TIM-barrel domain. 348
45457 396576 pfam02056 Glyco_hydro_4 Family 4 glycosyl hydrolase. 183
45458 396577 pfam02057 Glyco_hydro_59 Glycosyl hydrolase family 59. 293
45459 396578 pfam02058 Guanylin Guanylin precursor. 86
45460 396579 pfam02059 IL3 Interleukin-3. 110
45461 396580 pfam02060 ISK_Channel Slow voltage-gated potassium channel. 122
45462 280268 pfam02061 Lambda_CIII Lambda Phage CIII. The CIII protein from bacteriophage lambda is an inhibitor of the FtsH peptidase. 42
45463 366900 pfam02063 MARCKS MARCKS family. 281
45464 396581 pfam02064 MAS20 MAS20 protein import receptor. 132
45465 307952 pfam02065 Melibiase Melibiase. Glycoside hydrolase families GH27, GH31 and GH36 form the glycoside hydrolase clan GH-D. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K. This family includes enzymes from GH36A-B and GH36D-K and from GH27. 347
45466 280272 pfam02066 Metallothio_11 Metallothionein family 11. 54
45467 280273 pfam02067 Metallothio_5 Metallothionein family 5. 41
45468 396582 pfam02068 Metallothio_PEC Plant PEC family metallothionein. 75
45469 396583 pfam02069 Metallothio_Pro Prokaryotic metallothionein. 51
45470 366904 pfam02070 NMU Neuromedin U. 25
45471 366905 pfam02071 NSF Aromatic-di-Alanine (AdAR) repeat. This repeat is found in NSF attachment proteins. Its structure is similar to that found in TPR repeats pfam00515. 12
45472 396584 pfam02072 Orexin Prepro-orexin. 129
45473 396585 pfam02073 Peptidase_M29 Thermophilic metalloprotease (M29). 405
45474 396586 pfam02074 Peptidase_M32 Carboxypeptidase Taq (M32) metallopeptidase. 489
45475 366907 pfam02075 RuvC Crossover junction endodeoxyribonuclease RuvC. This entry includes endodeoxyribonucleases found in bacteria, such as RuvC. RuvC is a small protein of about 20 kD. It requires and binds a magnesium ion. The structure of E. coli RuvC is a 3-layer alpha-beta sandwich containing a 5-stranded beta-sheet sandwiched between 5 alpha-helices. The Escherichia coli RuvC gene is involved in DNA repair and in the late step of RecE and RecF pathway recombination. RuvC protein (EC:3.1.22.4) cleaves cruciform junctions, which are formed by the extrusion of inverted repeat sequences from a super-coiled plasmid and which are structurally analogous to Holliday junctions, by introducing nicks into strands with the same polarity. The nicks leave a 5'terminal phosphate and a 3'terminal hydroxyl group which are ligated by E. coli or Bacteriophage T4 DNA ligases. Analysis of the cleavage sites suggests that DNA topology rather than a particular sequence determines the cleavage site. RuvC protein also cleaves Holliday junctions that are formed between gapped circular and linear duplex DNA by the function of RecA protein. The active form of RuvC protein is a dimer. This is mechanistically suited for an endonuclease involved in swapping DNA strands at the crossover junctions. It is inferred that RuvC protein is an endonuclease that resolves Holliday structures in vivo. 148
45476 396587 pfam02076 STE3 Pheromone A receptor. 292
45477 111019 pfam02077 SURF4 SURF4 family. 267
45478 396588 pfam02078 Synapsin Synapsin, N-terminal domain. This family is structurally related to the PreATP-grasp domain. 100
45479 366910 pfam02079 TP1 Nuclear transition protein 1. 51
45480 396589 pfam02080 TrkA_C TrkA-C domain. This domain is often found next to the pfam02254 domain. The exact function of this domain is unknown. It has been suggested that it may bind an unidentified ligand. The domain is predicted to adopt an all beta structure. 70
45481 396590 pfam02081 TrpBP Tryptophan RNA-binding attenuator protein. 68
45482 396591 pfam02082 Rrf2 Transcriptional regulator. This family is related to pfam01022 and other transcription regulation families (personal obs: Yeats C). 131
45483 280288 pfam02083 Urotensin_II Urotensin II. 12
45484 251078 pfam02084 Bindin Bindin. 239
45485 396592 pfam02085 Cytochrom_CIII Class III cytochrome C family. 103
45486 396593 pfam02086 MethyltransfD12 D12 class N6 adenine-specific DNA methyltransferase. 254
45487 111029 pfam02087 Nitrophorin Nitrophorin. 178
45488 145317 pfam02088 Ornatin Ornatin. 41
45489 396594 pfam02089 Palm_thioest Palmitoyl protein thioesterase. 251
45490 280292 pfam02090 SPAM Salmonella surface presentation of antigen gene type M protein. 140
45491 396595 pfam02091 tRNA-synt_2e Glycyl-tRNA synthetase alpha subunit. 276
45492 396596 pfam02092 tRNA_synt_2f Glycyl-tRNA synthetase beta subunit. 536
45493 396597 pfam02093 Gag_p30 Gag P30 core shell protein. According to Swiss-Prot annotation this protein is the viral core shell protein. P30 is essential for viral assembly. 208
45494 396598 pfam02095 Extensin_1 Extensin-like protein repeat. 10
45495 396599 pfam02096 60KD_IMP 60Kd inner membrane protein. 188
45496 280298 pfam02097 Filo_VP35 Filoviridae VP35. 342
45497 396600 pfam02098 His_binding Tick histamine binding protein. 150
45498 396601 pfam02099 Josephin Josephin. 154
45499 396602 pfam02100 ODC_AZ Ornithine decarboxylase antizyme. This family consists of ornithine decarboxylase antizyme proteins. The polyamine biosynthetic enzyme ornithine decarboxylase (ODC) is degraded by the 26 S proteasome via a ubiquitin-independent pathway. Its degradation is greatly accelerated by association with the polyamine-induced regulatory protein antizyme 1 (AZ1). 112
45500 396603 pfam02101 Ocular_alb Ocular albinism type 1 protein. 402
45501 251087 pfam02102 Peptidase_M35 Deuterolysin metalloprotease (M35) family. 352
45502 396604 pfam02104 SURF1 SURF1 family. 192
45503 396605 pfam02106 Fanconi_C Fanconi anaemia group C protein. 547
45504 396606 pfam02107 FlgH Flagellar L-ring protein. 175
45505 396607 pfam02108 FliH Flagellar assembly protein FliH. 124
45506 396608 pfam02109 DAD DAD family. Members of this family are thought to be integral membrane proteins. Some members of this family have been shown to cause apoptosis if mutated; these proteins are known as DAD for defender against death. The family also includes the epsilon subunit of the oligosaccharyltransferase that is involved in N-linked glycosylation. 108
45507 396609 pfam02110 HK Hydroxyethylthiazole kinase family. 247
45508 396610 pfam02112 PDEase_II cAMP phosphodiesterases class-II. 339
45509 396611 pfam02113 Peptidase_S13 D-Ala-D-Ala carboxypeptidase 3 (S13) family. 444
45510 251094 pfam02114 Phosducin Phosducin. 265
45511 396612 pfam02115 Rho_GDI RHO protein GDP dissociation inhibitor. 194
45512 396613 pfam02116 STE2 Fungal pheromone mating factor STE2 GPCR. 276
45513 111054 pfam02117 7TM_GPCR_Sra Serpentine type 7TM GPCR chemoreceptor Sra. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sra is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 328
45514 396614 pfam02118 Srg Srg family chemoreceptor. 270
45515 396615 pfam02119 FlgI Flagellar P-ring protein. 342
45516 396616 pfam02120 Flg_hook Flagellar hook-length control protein FliK. This is the C terminal domain of FliK. FliK controls the length of the flagellar hook by directly measuring the hook length as a molecular ruler. This family also includes YscP of the Yersinia type III secretion system, and equivalent proteins in other pathogenic bacterial type III secretion systems. 83
45517 396617 pfam02121 IP_trans Phosphatidylinositol transfer protein. Along with the structurally unrelated Sec14p family (found in pfam00650), this family can bind/exchange one molecule of phosphatidylinositol (PI) or phosphatidylcholine (PC) and thus aids their transfer between different membrane compartments. There are three sub-families - all share an N-terminal PITP-like domain, whose sequence is highly conserved. It is described as consisting of three regions. The N-terminal region is thought to bind the lipid and contains two helices and an eight-stranded, mostly antiparallel beta-sheet. An intervening loop region, which is thought to play a role in protein-protein interactions, separates this from the C-terminal region, which exhibits the greatest sequence variation and may be involved in membrane binding. PITP alpha has a 16-fold greater affinity for PI than PC. Together with PITP beta, it is expressed ubiquitously in all tissues. 245
45518 111059 pfam02122 Peptidase_S39 Peptidase S39. This family contains polyprotein processing endopeptidases from RNA viruses. 203
45519 280316 pfam02123 RdRP_4 Viral RNA-directed RNA-polymerase. This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus. 465
45520 280317 pfam02124 Marek_A Marek's disease glycoprotein A. 210
45521 396618 pfam02126 PTE Phosphotriesterase family. 298
45522 396619 pfam02127 Peptidase_M18 Aminopeptidase I zinc metalloprotease (M18). 430
45523 396620 pfam02128 Peptidase_M36 Fungalysin metallopeptidase (M36). 366
45524 396621 pfam02129 Peptidase_S15 X-Pro dipeptidyl-peptidase (S15 family). 264
45525 396622 pfam02130 UPF0054 Uncharacterized protein family UPF0054. 125
45526 396623 pfam02132 RecR RecR protein. 40
45527 251109 pfam02133 Transp_cyt_pur Permease for cytosine/purines, uracil, thiamine, allantoin. 439
45528 396624 pfam02135 zf-TAZ TAZ zinc finger. The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumor suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC. 72
45529 396625 pfam02136 NTF2 Nuclear transport factor 2 (NTF2) domain. This family includes the NTF2-like Delta-5-3-ketosteroid isomerase proteins. 116
45530 396626 pfam02137 A_deamin Adenosine-deaminase (editase) domain. Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defense against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc. 327
45531 396627 pfam02138 Beach Beige/BEACH domain. 277
45532 396628 pfam02140 Gal_Lectin Galactose binding lectin domain. 80
45533 396629 pfam02141 DENN DENN (AEX-3) domain. DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. 185
45534 396630 pfam02142 MGS MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site. 93
45535 396631 pfam02144 Rad1 Repair protein Rad1/Rec1/Rad17. 257
45536 396632 pfam02145 Rap_GAP Rap/ran-GAP. 181
45537 396633 pfam02146 SIR2 Sir2 family. This region is characteristic of Silent information regulator 2 (Sir2) proteins, or sirtuins. These are protein deacetylases that depend on nicotinamide adenine dinucleotide (NAD). They are found in many subcellular locations, including the nucleus, cytoplasm and mitochondria. Eukaryotic forms play an important role in the regulation of transcriptional repression. Moreover, they are involved in microtubule organisation and DNA damage repair processes. 179
45538 396634 pfam02148 zf-UBP Zn-finger in ubiquitin-hydrolases and other protein. 63
45539 396635 pfam02149 KA1 Kinase associated domain 1. 44
45540 280336 pfam02150 RNA_POL_M_15KD RNA polymerases M/15 Kd subunit. 36
45541 308001 pfam02151 UVR UvrB/uvrC motif. 36
45542 396636 pfam02152 FolB Dihydroneopterin aldolase. This enzyme EC:4.1.2.25 catalyzes the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. 113
45543 396637 pfam02153 PDH Prephenate dehydrogenase. Members of this family are prephenate dehydrogenases EC:1.3.1.12 involved in tyrosine biosynthesis. 257
45544 111086 pfam02154 FliM Flagellar motor switch protein FliM. 192
45545 396638 pfam02155 GCR Glucocorticoid receptor. 371
45546 396639 pfam02156 Glyco_hydro_26 Glycosyl hydrolase family 26. 311
45547 396640 pfam02157 Man-6-P_recep Mannose-6-phosphate receptor. This family includes both Cation-dependent and cation independent mannose-6-phosphate receptors. 254
45548 396641 pfam02158 Neuregulin Neuregulin family. 360
45549 396642 pfam02159 Oest_recep Oestrogen receptor. 138
45550 280345 pfam02160 Peptidase_A3 Cauliflower mosaic virus peptidase (A3). 208
45551 396643 pfam02161 Prog_receptor Progesterone receptor. 564
45552 145362 pfam02162 XYPPX XYPPX repeat (two copies). This repeat is found in a wide variety of proteins and generally consists of the motif XYPPX where X can be any amino acid. The family includes annexin VII and the carboxy tail of certain rhodopsins. This family also includes plaque matrix proteins, however this motif is embedded in a ten residue repeat in Mytilus edulis adhesive plaque matrix protein FP1. The molecular function of this repeat is unknown. It is also not clear whether all the members of this family share a common evolutionary ancestor due to its short length and biased amino acid composition. 15
45553 308008 pfam02163 Peptidase_M50 Peptidase family M50. 275
45554 396644 pfam02165 WT1 Wilm's tumor protein. 290
45555 396645 pfam02166 Androgen_recep Androgen receptor. 484
45556 396646 pfam02167 Cytochrom_C1 Cytochrome C1 family. 219
45557 396647 pfam02169 LPP20 LPP20 lipoprotein. This family contains the LPP20 lipoprotein, which is a non-essential class of lipoprotein. 96
45558 396648 pfam02170 PAZ PAZ domain. This domain is named PAZ after the proteins Piwi, Argonaute and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerization. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway. 116
45559 396649 pfam02171 Piwi Piwi domain. This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg+2 dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA. 296
45560 366953 pfam02172 KIX KIX domain. CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. 81
45561 396650 pfam02173 pKID pKID domain. CBP and P300 bind to the pKID (phosphorylated kinase-inducible-domain) domain of CREB. 41
45562 396651 pfam02174 IRS PTB domain (IRS-1 type). 97
45563 111105 pfam02175 7TM_GPCR_Srb Serpentine type 7TM GPCR chemoreceptor Srb. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srb is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 236
45564 280357 pfam02176 zf-TRAF TRAF-type zinc finger. 60
45565 396652 pfam02177 APP_N Amyloid A4 N-terminal heparin-binding. This N-terminal domain of APP, amyloid precursor protein, is the heparin-binding domain of the protein. This region is also responsible for stimulation of neurite outgrowth. The structure reveals both a highly charged basic surface that may interact with glycosaminoglycans in the brain and an abutting hydrophobic surface that is proposed to play an important functional role such as in dimerization or ligand-binding. Structural similarities with cysteine-rich growth factors, taken together with its known growth-promoting properties, suggest the APP N-terminal domain could function as a growth factor in vivo. 100
45566 308018 pfam02178 AT_hook AT hook motif. AT hooks are DNA binding motifs with a preference for A/T rich regions. 13
45567 396653 pfam02179 BAG BAG domain. Domain present in Hsp70 regulators. 77
45568 396654 pfam02180 BH4 Bcl-2 homology region 4. 25
45569 396655 pfam02181 FH2 Formin Homology 2 Domain. 372
45570 396656 pfam02182 SAD_SRA SAD/SRA domain. The domain goes by several names including SAD, SRA and YDG. It adopts a beta barrel, modified PUA-like, fold that is widely present in eukaryotic chromatin proteins and in bacteria. Versions of this domain are known to bind hemi-methylated CpG dinucleotides and also other 5mC containing dinucleotides. The domain binds DNA by flipping out the methylated cytosine base from the DNA double helix. The conserved tyrosine and aspartate residues and a glycine rich patch are critical for recognition of the flipped out base. Mammalian UHRF1 that contains this domain plays an important role in maintenance of methylation at CpG dinucleotides by recruiting DNMT1 to hemimethylated sites associated with replication forks. The SAD/SRA domain has been combined with other domains involved in the ubiquitin pathway on multiple occasions and such proteins link recognition of DNA methylation to chromatin-protein ubiquitination. The domain is also found in species that lack DNA methylation, such as certain apicomplexans, suggestive of other DNA-binding modes or functions. A highly derived and distinct version of the domain is also found in fungi where it is fused to AlkB-type 2OGFeDO domains. In bacteria, the domain is usually fused or associated with restriction endonucleases, many of which target methylated or hemi-methylated DNA. 143
45571 396657 pfam02183 HALZ Homeobox associated leucine zipper. 43
45572 111114 pfam02184 HAT HAT (Half-A-TPR) repeat. The HAT (Half A TPR) repeat is found in several RNA processing proteins. 32
45573 396658 pfam02185 HR1 Hr1 repeat. The HR1 repeat was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN). The first two of these repeats were later shown to bind the small G protein rho known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal HR1 repeat complexed with RhoA has been determined by X-ray crystallography. It forms an antiparallel coiled-coil fold termed an ACC finger. 57
45574 396659 pfam02186 TFIIE_beta TFIIE beta subunit core domain. General transcription factor TFIIE consists of two subunits, TFIIE alpha pfam02002 and TFIIE beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The structure of the DNA binding core region has been solved and has a winged helix fold. 66
45575 396660 pfam02187 GAS2 Growth-Arrest-Specific Protein 2 Domain. The GAR2 domain is common in plakin family members and Gas2 family members. The GAR domain comprises around 57 amino acids and has been shown to bind to microtubules. 69
45576 396661 pfam02188 GoLoco GoLoco motif. 20
45577 396662 pfam02189 ITAM Immunoreceptor tyrosine-based activation motif. 20
45578 396663 pfam02190 LON_substr_bdg ATP-dependent protease La (LON) substrate-binding domain. This domain has been shown to be part of the PUA superfamily. This domain represents a general protein and polypeptide interaction domain for the ATP-dependent serine peptidase, LON, Peptidase_S16, pfam05362. ATP-dependent Lon proteases are conserved in all living organisms and catalyze rapid turnover of short-lived regulatory proteins and many damaged or denatured proteins. 195
45579 396664 pfam02191 OLF Olfactomedin-like domain. 243
45580 396665 pfam02192 PI3K_p85B PI3-kinase family, p85-binding domain. 76
45581 396666 pfam02194 PXA PXA domain. This domain is associated with PX domains pfam00787. 183
45582 396667 pfam02195 ParBc ParB-like nuclease domain. 90
45583 396668 pfam02196 RBD Raf-like Ras-binding domain. 65
45584 396669 pfam02197 RIIa Regulatory subunit of type II PKA R-subunit. 37
45585 396670 pfam02198 SAM_PNT Sterile alpha motif (SAM)/Pointed domain. 82
45586 396671 pfam02199 SapA Saposin A-type domain. 33
45587 366975 pfam02200 STE STE like transcription factor. 109
45588 396672 pfam02201 SWIB SWIB/MDM2 domain. This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumor suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. 73
45589 396673 pfam02202 Tachykinin Tachykinin family. 11
45590 396674 pfam02203 TarH Tar ligand binding domain homolog. 152
45591 366977 pfam02204 VPS9 Vacuolar sorting protein 9 (VPS9) domain. This domain acts as a GDP-GTP exchange factor (GEF). It activates Rab GTPases by stimulating the release of GDP and allowing GTP to bind. 104
45592 396675 pfam02205 WH2 WH2 motif. The WH2 motif (for Wiskott Aldrich syndrome homology region 2) has been shown in WASP and Scar1 (mammalian homolog) to be the region that interacts with actin. 28
45593 396676 pfam02206 WSN Domain of unknown function. 66
45594 396677 pfam02207 zf-UBR Putative zinc finger in N-recognin (UBR box). This region is found in E3 ubiquitin ligases that recognize N-recognins. 68
45595 396678 pfam02208 Sorb Sorbin homologous domain. 45
45596 396679 pfam02209 VHP Villin headpiece domain. 35
45597 396680 pfam02210 Laminin_G_2 Laminin G domain. This family includes the Thrombospondin N-terminal-like domain, a Laminin G subfamily. 126
45598 396681 pfam02211 NHase_beta Nitrile hydratase beta subunit. Nitrile hydratases EC:4.2.1.84 are unusual metalloenzymes that catalyze the hydration of nitriles to their corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and they contain one iron atom per alpha beta unit. 220
45599 396682 pfam02212 GED Dynamin GTPase effector domain. 89
45600 396683 pfam02213 GYF GYF domain. The GYF domain is named because of the presence of Gly-Tyr-Phe residues. The GYF domain is a proline-binding domain in CD2-binding protein. 45
45601 396684 pfam02214 BTB_2 BTB/POZ domain. In voltage-gated K+ channels this domain is responsible for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. In KCTD1 this domain functions as a transcriptional repressor. It also mediates homomultimerisation of KCTD1 and interaction of KCTD1 with the transcription factor AP-2-alpha. 93
45602 396685 pfam02216 B B domain. This family contains the B domain of Staphylococcal protein A, which specifically binds to the Fc portion of immunoglobulin G. 51
45603 280395 pfam02217 T_Ag_DNA_bind Origin of replication binding protein. This domain of large T antigen binds to the SV40 origin of DNA replication. 94
45604 396686 pfam02218 HS1_rep Repeat in HS1/Cortactin. The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. 36
45605 396687 pfam02219 MTHFR Methylenetetrahydrofolate reductase. This family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from bacteria and methylenetetrahydrofolate reductase EC:1.5.1.20 from eukaryotes. The structure for this domain is known to be a TIM barrel. 287
45606 396688 pfam02221 E1_DerP2_DerF2 ML domain. ML domain - MD-2-related lipid recognition domain. This family consists of proteins from plants, animals and fungi, including dust mite allergen Der P 2. It has been implicated in lipid recognition, particularly in the recognition of pathogen related products. A mutation in Npc2 causes a rare form of Niemann-Pick type C2 disease. This domain has a similar topology to immunoglobulin domains. 130
45607 396689 pfam02222 ATP-grasp ATP-grasp domain. This family does not contain all known ATP-grasp domain members. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. 169
45608 396690 pfam02223 Thymidylate_kin Thymidylate kinase. 184
45609 280401 pfam02224 Cytidylate_kin Cytidylate kinase. Cytidylate kinase EC:2.7.4.14 catalyzes the phosphorylation of cytidine 5'-monophosphate (dCMP) to cytidine 5'-diphosphate (dCDP) in the presence of ATP or GTP. 211
45610 396691 pfam02225 PA PA domain. The PA (Protease associated) domain is found as an insert domain in diverse proteases. The PA domain is also found in a plant vacuolar sorting receptor and members of the RZF family. It has been suggested that this domain forms a lid-like structure that covers the active site in active proteases, and is involved in protein recognition in vacuolar sorting receptors. 89
45611 308053 pfam02226 Pico_P1A Picornavirus coat protein (VP4). VP1, VP2, VP3 and VP4 form the basic unit that forms the icosahedral coat of picornaviruses. Five symmetry-related N termini of coat protein VP4 form a ten-stranded, antiparallel beta barrel around the base of the icosahedral fivefold axis. 68
45612 280404 pfam02228 Gag_p19 Major core protein p19. p19 is a component of the inner protein layer of the viral nucleocapsid. 92
45613 396692 pfam02229 PC4 Transcriptional Coactivator p15 (PC4). p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 48
45614 396693 pfam02230 Abhydrolase_2 Phospholipase/Carboxylesterase. This family consists of both phospholipases and carboxylesterases with broad substrate specificity, and is structurally related to alpha/beta hydrolases pfam00561. 217
45615 280407 pfam02232 Alpha_TIF Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF, a virion protein (VP16), is involved in transcriptional activation of viral immediate early (IE) promoters (alpha genes). Specificity of tegument protein VP16 for IE genes is conferred by the 400 residue N-terminal, the 80 residue C-terminal is responsible for transcriptional activation. 343
45616 396694 pfam02233 PNTB NAD(P) transhydrogenase beta subunit. This family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the protein N- or C terminal in eukaryotes. The domain is often found in conjunction with pfam01262. Pyridine nucleotide transhydrogenase catalyzes the reduction of NAD+ to NADPH. A complete loss of activity occurs upon mutation of Gly314 in E. coli. 452
45617 396695 pfam02234 CDI Cyclin-dependent kinase inhibitor. Cell cycle progression is negatively controlled by cyclin-dependent kinases inhibitors (CDIs). CDIs are involved in cell cycle arrest at the G1 phase. 46
45618 280410 pfam02236 Viral_DNA_bi Viral DNA-binding protein, all alpha domain. This family represents a domain of the viral DNA-binding protein, a multifunctional protein involved in DNA replication and transcription control. 79
45619 396696 pfam02237 BPL_C Biotin protein ligase C terminal domain. The function of this structural domain is unknown. It is found to the C-terminus of the biotin protein ligase catalytic domain pfam01317. 47
45620 396697 pfam02238 COX7a Cytochrome c oxidase subunit VII. Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This family also contains both heart and liver isoforms of cytochrome c oxidase subunit VIIa. 53
45621 366994 pfam02239 Cytochrom_D1 Cytochrome D1 heme domain. Cytochrome cd1 (nitrite reductase) catalyzes the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 heme binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 heme iron of the active site in the oxidized state. 368
45622 396698 pfam02240 MCR_gamma Methyl-coenzyme M reductase gamma subunit. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (pfam02241), and 2 gamma (this family) subunits with two identical nickel porphinoid active sites. 246
45623 396699 pfam02241 MCR_beta Methyl-coenzyme M reductase beta subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain of MCR beta has an all-alpha fold with buried central helix. 249
45624 396700 pfam02244 Propep_M14 Carboxypeptidase activation peptide. Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme. 68
45625 396701 pfam02245 Pur_DNA_glyco Methylpurine-DNA glycosylase (MPG). Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. 181
45626 308063 pfam02246 B1 Protein L b1 domain. Protein L is a bacterial protein with immunoglobulin (Ig) light chain-binding properties. It contains a number of homologous b1 repeats towards the N-terminus. These repeats have been found to be responsible for the interaction of protein L with Ig light chains. 62
45627 396702 pfam02247 Como_LCP Large coat protein. This family contains the large coat protein (LCP) of the comoviridae viral family. 369
45628 396703 pfam02248 Como_SCP Small coat protein. This family contains the small coat protein (SCP) of the comoviridae viral family. 183
45629 396704 pfam02249 MCR_alpha Methyl-coenzyme M reductase alpha subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain is comprised of an all-alpha multi-helical bundle. 127
45630 396705 pfam02250 Orthopox_35kD 35kD major secreted virus protein. This family of orthopoxvirus secreted proteins (also known as T1 and A41) interact with members of both the CC and CXC superfamilies of chemokines. It has been suggested that these secreted proteins modulate leukocyte influx into virus-infected tissues. 224
45631 396706 pfam02251 PA28_alpha Proteasome activator pa28 alpha subunit. PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the alpha subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. 61
45632 396707 pfam02252 PA28_beta Proteasome activator pa28 beta subunit. PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. 143
45633 396708 pfam02253 PLA1 Phospholipase A1. Phospholipase A1 is a bacterial outer membrane bound acyl hydrolase with a broad substrate specificity EC:3.1.1.32. It has been proposed that Ser164 is the active site for Escherichia coli phospholipase A1. 251
45634 396709 pfam02254 TrkA_N TrkA-N domain. This domain is found in a wide variety of proteins. These proteins include potassium channels, phosphoesterases, and various other transporters. This domain binds to NAD. 115
45635 396710 pfam02255 PTS_IIA PTS system, Lactose/Cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four structurally and functionally distinct group IIA PTS system enzymes. This family of proteins normally functions as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation. 94
45636 396711 pfam02256 Fe_hyd_SSU Iron hydrogenase small subunit. This family represents the small subunit of the Fe-only hydrogenases EC:1.18.99.1. The subunit is comprised of alternating random coil and alpha helical structures that encompasses the large subunit in a novel protein fold. 56
45637 396712 pfam02257 RFX_DNA_binding RFX DNA-binding domain. RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. It recognizes X-boxes (DNA of the sequence 5'-GTNRCC(0-3N)RGYAAC-3', where N is any nucleotide, R is a purine and Y is a pyrimidine) using a highly conserved 76-residue DNA-binding domain (DBD). 77
45638 396713 pfam02258 SLT_beta Shiga-like toxin beta subunit. This family represents the B subunit of shiga-like toxin (SLT or verotoxin) produced by some strains of E.coli associated with hemorrhagic colitis and hemolytic uremic syndrome. SLT's are composed of one enzymatic A subunit and five cell binding B subunits. 69
45639 396714 pfam02259 FAT FAT domain. The FAT domain is named after FRAP, ATM and TRRAP. 342
45640 396715 pfam02260 FATC FATC domain. The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability. 32
45641 396716 pfam02261 Asp_decarbox Aspartate decarboxylase. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalyzed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesized initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase. 107
45642 396717 pfam02262 Cbl_N CBL proto-oncogene N-terminal domain 1. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. Cbl_N is comprised of 3 structural domains of which this is the first - a four helix bundle. 119
45643 308078 pfam02263 GBP Guanylate-binding protein, N-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP. 260
45644 396718 pfam02264 LamB LamB porin. Maltoporin (LamB protein) forms a trimeric structure which facilitates the diffusion of maltodextrins across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an antiparallel beta-barrel. 385
45645 396719 pfam02265 S1-P1_nuclease S1/P1 Nuclease. This family contains both S1 and P1 nucleases (EC:3.1.30.1) which cleave RNA and single stranded DNA with no base specificity. 252
45646 396720 pfam02267 Rib_hydrolayse ADP-ribosyl cyclase. ADP-ribosyl cyclase EC:3.2.2.5 (also known as cyclic ADP-ribose hydrolase or CD38) synthesizes cyclic-ADP ribose, a second messenger for glucose-induced insulin secretion. 229
45647 396721 pfam02268 TFIIA_gamma_N Transcription initiation factor IIA, gamma subunit, helical domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The N-terminal domain of the gamma subunit is a 4 helix bundle. 46
45648 190265 pfam02269 TFIID-18kDa Transcription initiation factor IID, 18kD subunit. This family includes the Spt3 yeast transcription factors and the 18kD subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold. 93
45649 396722 pfam02270 TFIIF_beta Transcription initiation factor IIF, beta subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter. 65
45650 396723 pfam02271 UCR_14kD Ubiquinol-cytochrome C reductase complex 14kD subunit. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 14kD (or VI) subunit of the complex which is not directly involved in electron transfer, but has a role in assembly of the complex. 100
45651 396724 pfam02272 DHHA1 DHHA1 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA1 for DHH associated domain. This domain is diagnostic of DHH subfamily 1 members. This domain is also found in alanyl tRNA synthetase, suggesting that this domain may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif. 139
45652 111194 pfam02273 Acyl_transf_2 Acyl transferase. This bacterial family of Acyl transferases (or myristoyl-acp-specific thioesterases) catalyze the first step in the bioluminescent fatty acid reductase system. 294
45653 396725 pfam02274 Amidinotransf Amidinotransferase. This family contains glycine (EC:2.1.4.1) and inosamine (EC:2.1.4.2) amidinotransferases, enzymes involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, EC:3.5.3.6. These enzymes catalyze the reaction: arginine + H2O <=> citrulline + NH3. Also found in this family is the Streptococcus anti tumor glycoprotein. 284
45654 396726 pfam02275 CBAH Linear amide C-N hydrolases, choloylglycine hydrolase family. This family includes several hydrolases which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides. These include choloylglycine hydrolase (conjugated bile acid hydrolase, CBAH) EC:3.5.1.24, penicillin acylase EC:3.5.1.11 and acid ceramidase EC:3.5.1.23. This domain forms the alpha-subunit for members from vertebral species, see family NAAA-beta, pfam15508. 316
45655 396727 pfam02276 CytoC_RC Photosynthetic reaction centre cytochrome C subunit. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction centre (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor. 309
45656 396728 pfam02277 DBI_PRT Phosphoribosyltransferase. This family of proteins represents the nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin. 333
45657 396729 pfam02278 Lyase_8 Polysaccharide lyase family 8, super-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 252
45658 280446 pfam02281 Dimer_Tnp_Tn5 Transposase Tn5 dimerization domain. Transposons are mobile DNA sequences capable of replication and insertion into the chromosome. Typically transposons code for the transposase enzyme, which catalyzes insertion, found between terminal inverted repeats. Tn5 has a unique method of self-regulation in which a truncated version of the transposase enzyme acts as an inhibitor. The catalytic domain of the Tn5 transposon is found in pfam01609. This domain mediates dimerization in the known structure. 106
45659 396730 pfam02282 Herpes_UL42 DNA polymerase processivity factor (UL42). The DNA polymerase processivity factor (UL42) of herpes simplex virus forms a heterodimer with UL30 to create the viral DNA polymerase complex. UL42 functions to increase the processivity of polymerization and makes little contribution to the catalytic activity of the polymerase. 142
45660 396731 pfam02283 CobU Cobinamide kinase / cobinamide phosphate guanyltransferase. This family is composed of a group of bifunctional cobalamin biosynthesis enzymes which display cobinamide kinase and cobinamide phosphate guanyltransferase activity. The crystal structure of the enzyme reveals the molecule to be a trimer with a propeller-like shape. 167
45661 396732 pfam02284 COX5A Cytochrome c oxidase subunit Va. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit Va. 99
45662 396733 pfam02285 COX8 Cytochrome oxidase c subunit VIII. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIII. 41
45663 396734 pfam02286 Dehydratase_LU Dehydratase large subunit. This family contains the large subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 552
45664 396735 pfam02287 Dehydratase_SU Dehydratase small subunit. This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 131
45665 396736 pfam02288 Dehydratase_MU Dehydratase medium subunit. This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 108
45666 396737 pfam02289 MCH Cyclohydrolase (MCH). Methenyl tetrahydromethanopterin cyclohydrolase EC:3.5.4.27 is involved in methanogenesis in bacteria and archaea, producing methane from carbon monoxide or carbon dioxide. 308
45667 396738 pfam02290 SRP14 Signal recognition particle 14kD protein. The signal recognition particle (SRP) is a multimeric protein involved in targeting secretory proteins to the rough endoplasmic reticulum membrane. SRP14 and SRP9 form a complex essential for SRP RNA binding. 106
45668 396739 pfam02291 TFIID-31kDa Transcription initiation factor IID, 31kD subunit. This family represents the N-terminus of the 31kD subunit (42kD in drosophila) of transcription initiation factor IID (TAFII31). TAFII31 binds to p53, and is an essential requirement for p53 mediated transcription activation. 122
45669 396740 pfam02293 AmiS_UreI AmiS/UreI family transporter. This family includes UreI and proton gated urea channel as well as putative amide transporters. 165
45670 280458 pfam02294 7kD_DNA_binding 7kD DNA-binding domain. This family contains members of the hyper-thermophilic archaebacterium 7kD DNA-binding/endoribonuclease P2 family. There are five 7kD DNA-binding proteins, 7a-7e, found as monomers in the cell. Protein 7e shows the tightest DNA-binding ability. 58
45671 280459 pfam02295 z-alpha Adenosine deaminase z-alpha domain. This family consists of the N-terminus and thus the z-alpha domain of double-stranded RNA-specific adenosine deaminase (ADAR), an RNA-editing enzyme. The z-alpha domain is a Z-DNA binding domain, and binding of this region to B-DNA has been shown to be disfavoured by steric hindrance. 67
45672 367023 pfam02296 Alpha_adaptin_C Alpha adaptin AP2, C-terminal domain. Alpha adaptin is a hetero tetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. 113
45673 396741 pfam02297 COX6B Cytochrome oxidase c subunit VIb. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the potentially heme-binding subunit VIb of the oxidase. 65
45674 280462 pfam02298 Cu_bind_like Plastocyanin-like domain. This family represents a domain found in flowering plants related to the copper binding protein plastocyanin. Some members of this family may not bind copper due to the lack of key residues. 84
45675 396742 pfam02300 Fumarate_red_C Fumarate reductase subunit C. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 15kD hydrophobic subunit C. 127
45676 396743 pfam02301 HORMA HORMA domain. The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognize chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity. 209
45677 396744 pfam02302 PTS_IIB PTS system, Lactose/Cellobiose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family are one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar structure to mammalian tyrosine phosphatases. This family also contains the fructose specific IIB subunit. 92
45678 396745 pfam02303 Phage_DNA_bind Helix-destabilizing protein. This family contains the bacteriophage helix-destabilizing protein, or single-stranded DNA binding protein, required for DNA synthesis. 83
45679 280467 pfam02304 Phage_B Scaffold protein B. This is a family of proteins from single-stranded DNA bacteriophages. Scaffold proteins B and D are required for procapsid formation. Sixty copies of the internal scaffold protein B are found in the procapsid. 117
45680 308107 pfam02305 Phage_F Capsid protein (F protein). This is a family of proteins from single-stranded DNA bacteriophages. Protein F is the major capsid component, sixty copies of which are found in the virion. 510
45681 280468 pfam02306 Phage_G Major spike protein (G protein). This is a family of proteins from single-stranded DNA bacteriophages. Five G proteins, each a tight beta barrel, form each of the twelve surface spikes. 175
45682 396746 pfam02308 MgtC MgtC family. The MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologs is not known. 120
45683 396747 pfam02309 AUX_IAA AUX/IAA family. Transcription of the AUX/IAA family of genes is rapidly induced by the plant hormone auxin. Some members of this family are longer and contain an N terminal DNA binding domain. The function of this region is uncertain. 188
45684 396748 pfam02310 B12-binding B12 binding domain. This domain binds to B12 (adenosylcobamide), it is found in several enzymes, such as glutamate mutase, methionine synthase, and methylmalonyl-CoA mutase. It contains a conserved DxHxxGx(41)SxVx(26)GG motif, which is important for B12 binding. 121
45685 396749 pfam02311 AraC_binding AraC-like ligand binding domain. This family represents the arabinose-binding and dimerization domain of the bacterial gene regulatory protein AraC. The domain is found in conjunction with the helix-turn-helix (HTH) DNA-binding motif pfam00165. This domain is distantly related to the Cupin domain pfam00190. 134
45686 396750 pfam02312 CBF_beta Core binding factor beta subunit. Core binding factor (CBF) is a heterodimeric transcription factor essential for genetic regulation of hematopoiesis and osteogenesis. The beta subunit enhances DNA-binding ability of the alpha subunit in vitro, and has been shown to have a structure related to the OB fold. 166
45687 396751 pfam02313 Fumarate_red_D Fumarate reductase subunit D. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 13kD hydrophobic subunit D. 114
45688 396752 pfam02315 MDH Methanol dehydrogenase beta subunit. Methanol dehydrogenase (MDH) is a bacterial periplasmic quinoprotein that oxidizes methanol to formaldehyde. MDH is a tetramer of two alpha and two beta subunits. This family contains the small beta subunit. 88
45689 367032 pfam02316 HTH_Tnp_Mu_1 Mu DNA-binding domain. This family consists of MuA-transposase and repressor protein CI. These proteins contain homologous DNA-binding domains at their N-termini which compete for the same DNA site within the Mu bacteriophage genome. 134
45690 396753 pfam02317 Octopine_DH NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain. This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids. 149
45691 396754 pfam02318 FYVE_2 FYVE-type zinc finger. This FYVE-type zinc finger is found at the N-terminus of effector proteins including rabphilin-3A and regulating synaptic membrane exocytosis protein 2. 118
45692 396755 pfam02319 E2F_TDP E2F/DP family winged-helix DNA-binding domain. This family contains the transcription factor E2F and its dimerization partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins. 64
45693 396756 pfam02320 UCR_hinge Ubiquinol-cytochrome C reductase hinge protein. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 'hinge' protein of the complex which is thought to mediate formation of the cytochrome c1 and cytochrome c complex. 64
45694 396757 pfam02321 OEP Outer membrane efflux protein. The OEP family (Outer membrane efflux protein) form trimeric channels that allow export of a variety of substrates in Gram negative bacteria. Each member of this family is composed of two repeats. The trimeric channel is composed of a 12 stranded all beta sheet barrel that spans the outer membrane, and a long all helical barrel that spans the periplasm. 181
45695 396758 pfam02322 Cyt_bd_oxida_II Cytochrome bd terminal oxidase subunit II. This family consists of cytochrome bd type terminal oxidases that catalyze quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. One member of the family, Klebsiella pneumoniae CydB, is implicated in having an important role in micro-aerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae. The family forms an integral functional unit with subunit I, family Bac_Ubq_Cox, pfam01654. 300
45696 145463 pfam02323 ELH Egg-laying hormone precursor. This family consists of egg-laying hormone (ELH) precursor and atrial gland peptides from little and California sea hare. The family also includes ovulation prohormone precursor from great pond snail. This family thus represents a conserved gastropoda ovulation and egg production prohormone. Note that many of the proteins present are further cleaved to give individual peptides. Neuropeptidergic bag cells of the marine mollusk Aplysia californica synthesize an egg-laying hormone (ELH) precursor protein which is cleaved to generate several bioactive peptides including ELH, bag cell peptides (BCP) and acidic peptide (AP). 255
45697 334895 pfam02324 Glyco_hydro_70 Glycosyl hydrolase family 70. Members of this family belong to glycosyl hydrolase family 70. Glucosyltransferases or sucrose 6-glycosyl transferases (GTF-S) catalyze the transfer of D-glucopyranosyl units from sucrose onto acceptor molecules, EC:2.4.1.5. This family roughly corresponds to the N-terminal catalytic domain of the enzyme. Members of this family also contain the putative cell wall binding domain pfam01473, which corresponds with the C-terminal glucan-binding domain. 804
45698 396759 pfam02325 YGGT YGGT family. This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown. 71
45699 396760 pfam02326 YMF19 Plant ATP synthase F0. This family corresponds to subunit 8 (YMF19) of the F0 complex of plant and algae mitochondrial F-ATPases (EC:3.6.1.34). 84
45700 396761 pfam02327 BChl_A Bacteriochlorophyll A protein. Bacteriochlorophyll A protein is involved in the energy transfer system of green photosynthetic bacteria. The protein forms a homotrimer, with each monomer unit containing seven molecules of bacteriochlorophyll A. 354
45701 280487 pfam02329 HDC Histidine carboxylase PI chain. Histidine carboxylase catalyzes the formation of histamine from histidine. Cleavage of the proenzyme PI chain yields two subunits, alpha and beta, which arrange as a hexamer (alpha beta)6. 293
45702 396762 pfam02330 MAM33 Mitochondrial glycoprotein. This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q. It is thought to be involved in mitochondrial oxidative phosphorylation and in nucleus-mitochondrion interactions. 202
45703 280489 pfam02331 P35 Apoptosis preventing protein. This viral protein functions to block the host apoptotic response caused by infection by the virus. The apoptosis preventing protein (or early 35kD protein, P35) acts by blocking caspase protease activity. 295
45704 396763 pfam02332 Phenol_Hydrox Methane/Phenol/Toluene Hydroxylase. Bacterial phenol hydroxylase is a multicomponent enzyme that catabolises phenol and some of its methylated derivatives. This Pfam family contains both the P1 and P3 polypeptides of phenol hydroxylase and the alpha and beta chain of methane hydroxylase protein A. 226
45705 280491 pfam02333 Phytase Phytase. Phytase is a secreted enzyme which hydrolyzes phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to have a six-bladed propeller folding architecture. 375
45706 396764 pfam02334 RTP Replication terminator protein. The bacterial replication terminator protein (RTP) plays a role in the termination of DNA replication by impeding replication fork movement. Two RTP dimers bind to the two inverted repeat regions at the termination site. 113
45707 396765 pfam02335 Cytochrom_C552 Cytochrome c552. Cytochrome c552 (cytochrome c nitrite reductase) is a crucial enzyme in the nitrogen cycle catalyzing the reduction of nitrite to ammonia. The crystal structure of cytochrome c552 reveals it to be a dimer, with 10 close-packed type c haem groups. 435
45708 280494 pfam02336 Denso_VP4 Capsid protein VP4. Four different translation initiation sites of the densovirus capsid protein mRNA give rise to four viral proteins, VP1 to VP4. This family represents VP4. 431
45709 396766 pfam02337 Gag_p10 Retroviral GAG p10 protein. This family consists of various retroviral GAG (core) polyproteins and encompasses the p10 region producing the p10 protein upon proteolytic cleavage of GAG by retroviral protease. The p10 or matrix protein (MA) is associated with the virus envelope glycoproteins in most mammalian retroviruses and may be involved in virus particle assembly, transport and budding. Some of the GAG polyproteins have alternate cleavage sites leading to the production of alternative and longer cleavage products (e.g. p19) the alignment of this family only covers the approximately N-terminal (GAG) 100 amino acid region of homology to p10. 83
45710 396767 pfam02338 OTU OTU-like cysteine protease. This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian tumor (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases. 128
45711 111251 pfam02340 PRRSV_Env PRRSV putative envelope protein. This family consists of a conserved probable envelope protein or ORF2 in porcine reproductive and respiratory syndrome virus (PRRSV). Also in the family is a minor structural protein from lactate dehydrogenase-elevating virus. 234
45712 396768 pfam02341 RcbX RbcX protein. The RBCX protein has been identified as having a possible chaperone-like function. The rbcX gene is juxtaposed to and cotranscribed with rbcL and rbcS encoding RuBisCO in Anabaena sp. CA. RbcX has been shown to possess a chaperone-like function assisting correct folding of RuBisCO in E. coli expression studies and is needed for RuBisCO to reach its maximal activity. 100
45713 396769 pfam02342 TerD TerD domain. The TerD domain is found in TerD family proteins that include the paralogous TerD, TerA, TerE, TerF and TerZ proteins. It is found in a stress response operon with TerB and TerC. TerD has a maximum of two calcium-binding sites depending on the conservation of aspartates. It has various fusions to nuclease domains, RNA binding domains, ubiquitin related domains, and metal binding domains. The ter gene products lie at the centre of membrane-linked metal recognition complexes with regulatory ramifications encompassing phosphorylation-dependent signal transduction, RNA-dependent regulation, biosynthesis of nucleoside-like metabolites and DNA processing linked to novel pathways. 187
45714 396770 pfam02343 TRA-1_regulated TRA-1 regulated protein R03H10.4. This family of proteins represents the protein product of the gene R03H10.4 which is located near a sequence that matches the TRA-1 binding consensus. TRA-1 is a transcription factor which controls sexual differentiation in C.elegans. R03H10.4 shows male-enriched reporter gene expression and acts as a direct target of TRA-1 regulation. 128
45715 396771 pfam02344 Myc-LZ Myc leucine zipper domain. This family consists of the leucine zipper dimerization domain found in both cellular c-Myc proto-oncogenes and viral v-Myc oncogenes. Dimerization via the leucine zipper motif with other basic helix-loop-helix-leucine zipper (b/HLH/lz) proteins such as Max is required for efficient DNA binding. The Myc-Max dimer is a transactivating complex activating expression of growth related genes promoting cell proliferation. The dimerization is facilitated via interdigitating leucine residues every 7th position of the alpha helix. Like charge repulsion of adjacent residues in this region perturbs the formation of homodimers with heterodimers being promoted by opposing charge attractions. 27
45716 280501 pfam02346 Vac_Fusion Chordopoxvirus multifunctional envelope protein A27. This is a family of viral fusion proteins from the chordopoxviruses. The A27L gene product, a 14-kDa Vaccinia Virus protein, has been demonstrated to function as a viral fusion protein mediating cell fusion at endosomal (low) pH. More recently it has been shown that A27 forms disulfide-linked protein complexes with A26 protein providing an anchor for A26 protein packaging into mature virions. A27 regulates virion-membrane fusion rather than inducing it and is critical for the successful egress of mature virus particles. 56
45717 396772 pfam02347 GDC-P Glycine cleavage system P-protein. This family consists of Glycine cleavage system P-proteins EC:1.4.4.2 from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex EC:2.1.2.10 (GDC) also annotated as glycine cleavage system or glycine synthase. GDC consists of four proteins P, H, L and T. The reaction catalyzed by this protein is:- Glycine + lipoylprotein <=> S-aminomethyldihydrolipoylprotein + CO2 428
45718 396773 pfam02348 CTP_transf_3 Cytidylyltransferase. This family consists of two main Cytidylyltransferase activities: 1) 3-deoxy-manno-octulosonate cytidylyltransferase, EC:2.7.7.38 catalyzing the reaction:- CTP + 3-deoxy-D-manno-octulosonate <=> diphosphate + CMP-3-deoxy-D-manno-octulosonate, 2) acylneuraminate cytidylyltransferase EC:2.7.7.43, catalyzing the reaction:- CTP + N-acylneuraminate <=> diphosphate + CMP-N-acylneuraminate. NeuAc cytidylyltransferase of Mannheimia haemolytica has been characterized describing kinetics and regulation by substrate charge, energetic charge and amino-sugar demand. 217
45719 396774 pfam02349 MSG Major surface glycoprotein. This is a novel repeat in Pneumocystis carinii Major surface glycoprotein (MSG) some members of the alignment have up to nine repeats of this family, the repeats containing several conserved cysteines. The MSG of P. carinii is an important protein in host-pathogen interactions. Surface glycoprotein A from Pneumocystis carinii is a main target for the host immune system, this protein is implicated in the attachment of Pneumocystis carinii to the host alveolar epithelial cells, alveolar macrophages, host surfactant and possibly accounts in part for the hypoxia seen in Pneumocystis carinii pneumonia (PCP). 76
45720 396775 pfam02350 Epimerase_2 UDP-N-acetylglucosamine 2-epimerase. This family consists of UDP-N-acetylglucosamine 2-epimerases EC:5.1.3.14. This enzyme catalyzes the production of UDP-ManNAc from UDP-GlcNAc. Note that some of the enzymes in this family are bifunctional. In these instances Pfam matches only the N-terminal half of the protein, suggesting that the additional C-terminal part (when compared to mono-functional members of this family) is responsible for the UDP-N-acetylmannosamine kinase activity of these enzymes. This hypothesis is further supported by the assumption that the C-terminal part of rat Gne is the kinase domain. 336
45721 396776 pfam02351 GDNF GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity. 88
45722 367048 pfam02352 Decorin_bind Decorin binding protein. This family consists of decorin binding proteins from Borrelia. The decorin binding protein of Borrelia burgdorferi the lyme disease spirochetes adheres to the proteoglycan decorin found on collagen fibers. 141
45723 396777 pfam02353 CMAS Mycolic acid cyclopropane synthetase. This family consists of Cyclopropane-fatty-acyl-phospholipid synthase or CFA synthase EC:2.1.1.79. This enzyme catalyzes the reaction: S-adenosyl-L-methionine + phospholipid olefinic fatty acid <=> S-adenosyl-L-homocysteine + phospholipid cyclopropane fatty acid. 272
45724 396778 pfam02354 Latrophilin Latrophilin Cytoplasmic C-terminal region. This family consists of the cytoplasmic C-terminal region in latrophilin. Latrophilin is a synaptic Ca2+ independent alpha-latrotoxin (LTX) receptor and is a novel member of the secretin family of G-protein coupled receptors that are involved in secretion. Latrophilin mRNA is present only in neuronal tissue. Latrophilin interacts with G-alpha O. 378
45725 280510 pfam02355 SecD_SecF Protein export membrane protein. This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC. SecD and SecF are required to maintain a proton motive force. 189
45726 396779 pfam02357 NusG Transcription termination factor nusG. 98
45727 396780 pfam02358 Trehalose_PPase Trehalose-phosphatase. This family consists of trehalose-phosphatases EC:3.1.3.12. These enzymes catalyze the de-phosphorylation of trehalose-6-phosphate to trehalose and orthophosphate. The aligned region is present in trehalose-phosphatases and comprises the entire length of the protein. It is also found in the C-terminus of trehalose-6-phosphate synthase EC:2.4.1.15 adjacent to the trehalose-6-phosphate synthase domain - pfam00982. It would appear that the two equivalent genes in the E. coli otsBA operon, otsA the trehalose-6-phosphate synthase and otsB trehalose-phosphatase (this family), have undergone gene fusion in most eukaryotes. Trehalose is a common disaccharide of bacteria, fungi and invertebrates that appears to play a major role in desiccation tolerance. 232
45728 396781 pfam02359 CDC48_N Cell division protein 48 (CDC48), N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain. 85
45729 396782 pfam02361 CbiQ Cobalt transport protein. This family consists of various cobalt transport proteins, most of which are found in Cobalamin (Vitamin B12) biosynthesis operons. In Salmonella the cbiN, cbiQ (product CbiQ in this family) and cbiO are likely to form an active cobalt transport system. 215
45730 396783 pfam02362 B3 B3 DNA binding domain. This is a family of plant transcription factors with various roles in development. The aligned region corresponds to the B3 DNA binding domain, which is found in VP1/ABI3 transcription factors. Some proteins also have a second AP2 DNA binding domain pfam00847 such as RAV1. 101
45731 308140 pfam02363 C_tripleX Cysteine rich repeat. This Cysteine repeat C-X3-C-X3-C is repeated in sequences of this family, 34 times in an uncharacterized C. elegans protein. The function of these repeats is unknown as is the function of the proteins in which they occur. Most of the sequences in this family are from C. elegans. 17
45732 396784 pfam02364 Glucan_synthase 1,3-beta-glucan synthase component. This family consists of various 1,3-beta-glucan synthase components including Gls1, Gls2 and Gls3 from yeast. 1,3-beta-glucan synthase EC:2.4.1.34 also known as callose synthase catalyzes the formation of a beta-1,3-glucan polymer that is a major component of the fungal cell wall. The reaction catalyzed is:- UDP-glucose + {(1,3)-beta-D-glucosyl}(N) <=> UDP + {(1,3)-beta-D-glucosyl}(N+1). 819
45733 396785 pfam02365 NAM No apical meristem (NAM) protein. This is a family of no apical meristem (NAM) proteins, which are plant development proteins. Mutations in NAM result in the failure to develop a shoot apical meristem in petunia embryos. NAM is indicated as having a role in determining positions of meristems and primordia. One member of this family NAP (NAC-like, activated by AP3/PI) is encoded by the target genes of the AP3/PI transcriptional activators and functions in the transition between growth by cell division and cell expansion in stamens and petals. 123
45734 396786 pfam02366 PMT Dolichyl-phosphate-mannose-protein mannosyltransferase. This is a family of Dolichyl-phosphate-mannose-protein mannosyltransferase proteins EC:2.4.1.109. These proteins are responsible for O-linked glycosylation of proteins, they catalyze the reaction:- Dolichyl phosphate D-mannose + protein <=> dolichyl phosphate + O-D-mannosyl-protein. Also in this family is Drosophila rotated abdomen protein which is a putative mannosyltransferase. This family appears to be distantly related to pfam02516 (A Bateman pers. obs.). This family also contains sequences from ArnTs (4-amino-4-deoxy-L-arabinose lipid A transferase). They catalyze the addition of 4-amino-4-deoxy-l-arabinose (l-Ara4N) to the lipid A moiety of the lipopolysaccharide. This is a critical modification enabling bacteria (e.g. Escherichia coli and Salmonella typhimurium) to resist killing by antimicrobial peptides such as polymyxins. Members such as undecaprenyl phosphate-alpha-4-amino-4-deoxy-L-arabinose arabinosyl transferase are predicted to have 12 trans-membrane regions. The N-terminal portion of these proteins is hypothesized to have a conserved glycosylation activity which is shared between distantly related oligosaccharyltransferases ArnT and PglB families. 245
45735 396787 pfam02367 TsaE Threonylcarbamoyl adenosine biosynthesis protein TsaE. This family of proteins is involved in the synthesis of threonylcarbamoyl adenosine (t(6)A). 125
45736 396788 pfam02368 Big_2 Bacterial Ig-like domain (group 2). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial and phage surface proteins such as intimins. 77
45737 396789 pfam02369 Big_1 Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial surface proteins such as intimins and invasins involved in pathogenicity. 64
45738 111279 pfam02370 M M protein repeat. This short repeat is found in multiple copies in bacterial M proteins. The M proteins bind to IgA and are closely associated with virulence. The M protein has been postulated to be a major group A Streptococcal (GAS) virulence factor because of its contribution to the bacterial resistance to opsonophagocytosis. 21
45739 396790 pfam02371 Transposase_20 Transposase IS116/IS110/IS902 family. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS116, IS110 and IS902. This region is often found with pfam01548. The exact function of this region is uncertain. This family contains a HHH motif suggesting a DNA-binding function. 86
45740 308146 pfam02372 IL15 Interleukin 15. Interleukin-15 (IL-15) is a cytokine that possesses a variety of biological functions, including stimulation and maintenance of cellular immune responses. Structurally these proteins are short-chain 4-helical cytokines. 129
45741 396791 pfam02373 JmjC JmjC domain, hydroxylase. The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation. 114
45742 396792 pfam02374 ArsA_ATPase Anion-transporting ATPase. This Pfam family represents a conserved domain, which is sometimes repeated, in an anion-transporting ATPase. The ATPase is involved in the removal of arsenate, antimonite, and arsenite from the cell. 302
45743 396793 pfam02375 JmjN jmjN domain. 34
45744 396794 pfam02376 CUT CUT domain. The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein. 78
45745 396795 pfam02377 Dishevelled Dishevelled specific domain. This domain is specific to the signalling protein dishevelled. The domain is found adjacent to the PDZ domain pfam00595, often in conjunction with DEP (pfam00610) and DIX (pfam00778). Much of it is disordered and yet conserved. 162
45746 367061 pfam02378 PTS_EIIC Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate. 315
45747 367062 pfam02380 Papo_T_antigen T-antigen specific domain. This domain represents a conserved region in papovavirus small and middle T-antigens. It is found as the N-terminal domain in the small T-antigen, and is centrally located in the middle T-antigen. 93
45748 396796 pfam02381 MraZ MraZ protein, putative antitoxin-like. This small 70 amino acid domain is found duplicated in a family of bacterial proteins. These proteins may be DNA-binding transcription factors (Pers. comm. A Andreeva & A Murzin). It is likely, due to the similarity of fold, that this family acts as a bacterial antitoxin like the MazE antitoxin family. 72
45749 396797 pfam02382 RTX RTX N-terminal domain. The RTX family of bacterial toxins are a group of cytolysins and cytotoxins. This Pfam family represents the N-terminal domain which is found in association with a glycine-rich repeat domain and hemolysinCabind pfam00353. 312
45750 396798 pfam02383 Syja_N SacI homology domain. This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. 296
45751 396799 pfam02384 N6_Mtase N-6 DNA Methylase. Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The R-M system is a complex containing three polypeptides: M (this family), S (pfam01420), and R. This family consists of N-6 adenine-specific DNA methylase EC:2.1.1.72 from Type I and Type IC restriction systems. These methylases have the same sequence specificity as their corresponding restriction enzymes. 312
45752 280534 pfam02386 TrkH Cation transport protein. This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH a member of the family from E. coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the TrK system in E. coli. 491
45753 367066 pfam02387 IncFII_repA IncFII RepA protein family. This protein is plasmid encoded and found to be essential for plasmid replication. 275
45754 367067 pfam02388 FemAB FemAB family. The femAB operon codes for two nearly identical approximately 50-kDa proteins involved in the formation of the Staphylococcal pentaglycine interpeptide bridge in peptidoglycan. These proteins are also considered as a factor influencing the level of methicillin resistance. 406
45755 280537 pfam02389 Cornifin Cornifin (SPRR) family. SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides that are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino acids with the general consensus XKXPEPXX where X is any amino acid. In order to avoid bacterial contamination due to the high polar-nature of the HMM the threshold has been set very high. 135
45756 367068 pfam02390 Methyltransf_4 Putative methyltransferase. This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. 173
45757 396800 pfam02391 MoaE MoaE protein. This family contains the MoaE protein that is involved in biosynthesis of molybdopterin. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins. 113
45758 396801 pfam02392 Ycf4 Ycf4. This family consists of hypothetical Ycf4 proteins from various chloroplast genomes. It has been suggested that Ycf4 is involved in the assembly and/or stability of the photosystem I complex in chloroplasts. 176
45759 396802 pfam02393 US22 US22 like. US22 proteins have been found across many animal DNA viruses and some vertebrates. The namesake of this family, US22, is an early nuclear protein that is secreted from cells. The US22 family may have a role in virus replication and pathogenesis. Domain analysis showed that US22 proteins usually contain two copies of conserved modules which are homologous to several other families like SMI1 and SYD (commonly called SUKH superfamily). Bacterial operon analysis revealed that all bacterial SUKH members function as immunity proteins against various toxins. Thus US22 family is predicted to counter diverse anti-viral responses by interacting with specific host proteins. 124
45760 396803 pfam02394 IL1_propep Interleukin-1 propeptide. The Interleukin-1 cytokines are translated as precursor proteins. The N terminal approx. 115 amino acids form a propeptide that is cleaved off to release the active interleukin-1. 102
45761 396804 pfam02395 Peptidase_S6 Immunoglobulin A1 protease. This family consists of immunoglobulin A1 protease proteins. The immunoglobulin A1 protease cleaves immunoglobulin IgA and is found in pathogenic bacteria such as Neisseria gonorrhoeae. Not all of the members of this family are IgA proteases, EspP from E. coli O157:H7 cleaves human coagulation factor V and hbp is a hemoglobin protease from E. coli EB1. 784
45762 396805 pfam02397 Bac_transf Bacterial sugar transferase. This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways. 181
45763 280545 pfam02398 Corona_7 Coronavirus protein 7. This is a family of proteins from coronavirus which may function in viral assembly. 101
45764 280546 pfam02399 Herpes_ori_bp Origin of replication binding protein. This Pfam family represents the herpesvirus origin of replication binding protein, probably involved in DNA replication. 820
45765 396806 pfam02401 LYTB LytB protein. The mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway. 267
45766 367072 pfam02402 Lysis_col Lysis protein. These small bacterial proteins are required for colicin release and partial cell lysis. This family contains lysis proteins for several different forms of colicin. B. subtilis LytA has been included in this family, the similarity is not highly significant, however it is also a short protein, that is involved in secretion of other proteins (Bateman A pers. obs.). This family includes a signal peptide motif and a lipid attachment site. 49
45767 396807 pfam02403 Seryl_tRNA_N Seryl-tRNA synthetase N-terminal domain. This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. 107
45768 396808 pfam02404 SCF Stem cell factor. Stem cell factor (SCF) is a homodimer involved in hematopoiesis. SCF binds to and activates the SCF receptor (SCFR), a receptor tyrosine kinase. The crystal structure of human SCF has been resolved and a potential receptor-binding site identified. 275
45769 396809 pfam02405 MlaE Permease MlaE. MlaE is a permease which in E. coli is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. In NMB1965 it is involved in L-glutamate import into the cell. In Arabidopsis thaliana TGD1 it is involved in lipid transfer within the cell. 212
45770 396810 pfam02406 MmoB_DmpM MmoB/DmpM family. This family consists of monooxygenase components such as MmoB methane monooxygenase (EC:1.14.13.25) regulatory protein B. When MmoB is present at low concentration it converts methane monooxygenase from an oxidase to a hydroxylase and stabilizes intermediates required for the activation of dioxygen. Also found in this family is DmpM or Phenol hydroxylase (EC:1.14.13.7) protein component P2, this protein lacks redox co-factors and is required for optimal turnover of Phenol hydroxylase. 85
45771 280553 pfam02407 Viral_Rep Putative viral replication protein. This is a family of viral ORFs from various plant and animal ssDNA circoviruses. Published evidence to support the annotated function "viral replication associated protein" has not been found. 82
45772 280554 pfam02408 CUB_2 CUB-like domain. This is a family of hypothetical C. elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. However, this domain is related to the CUB domain. 120
45773 396811 pfam02410 RsfS Ribosomal silencing factor during starvation. This family is expressed by almost all bacterial and eukaryotic genomes but not by archaea. Its function is to down-regulate protein synthesis under conditions of nutrient shortage, and it does this by binding to protein L14 of the large ribosomal subunit, thus acting as a ribosomal silencing factor (RsfS) by blocking the joining of the ribosomal subunits. This family is structurally homologous to nucleotidyltransferases. 97
45774 111318 pfam02411 MerT MerT mercuric transport protein. MerT is a mercuric transport integral membrane protein and is responsible for transport of the Hg2+ ion from periplasmic MerP (also part of the transport system) to mercuric reductase (MerE). 116
45775 367074 pfam02412 TSP_3 Thrombospondin type 3 repeat. The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. 36
45776 396812 pfam02413 Caudo_TAP Caudovirales tail fibre assembly protein, lambda gpK. This family contains bacterial and phage tail fibre assembly proteins. E.coli contains several members of this family although the function of these proteins is uncertain. Using the lambda phage members as examples, there are both gptfa and gpK tail proteins here. GpK forms part of the TTC or tail-tip complex that is located at the distal end of the tail. TTCs form the platform on which the tail-tube proteins self-assemble and are also the attachment point for fibers or receptor-binding proteins that mediate phage-adsorption to the surface of the host cell. TTC assembly starts with gpJ, which is also known as the central tail fibre and is involved in host-cell adsorption. It is the C-terminus of gpJ that interacts with the lamB receptor on host cells. A number of intermediates including gpK then interact with gpJ during tail morphogenesis. 130
45777 396813 pfam02414 Borrelia_orfA Borrelia ORF-A. This protein is encoded by an open reading frame in plasmid borne DNA repeats of Borrelia species. This protein is known as ORF-A. The function of this putative protein is unknown. 285
45778 308169 pfam02415 Chlam_PMP Chlamydia polymorphic membrane protein (Chlamydia_PMP) repeat. This family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. Common for the Pmps are the tetrapeptide GGA(I/V/L) motif repeated several times in the N-terminal part. The C-terminal half is characterized by conserved tryptophans and a carboxy-terminal phenylalanine. A signal peptide leader sequence is predicted in 20 C. pneumoniae Pmps, which indicates an outer membrane localization. Pmp10 and Pmp11 contain a signal peptidase II cleavage site suggesting lipid modification. The C. pneumoniae pmp genes represent 17.5% of the chlamydia-specific coding capacity and they are all transcribed during chlamydial growth but the function of Pmps remains unknown. This family shows some similarity to pfam05594 and hence is likely to also form a beta-helical structure (personal obs:C Yeats). 19
45779 280560 pfam02416 MttA_Hcf106 mttA/Hcf106 family. Members of this protein family are involved in a sec independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts. Members of this family in E.coli are involved in export of redox proteins with a "twin arginine" leader motif. 53
45780 396814 pfam02417 Chromate_transp Chromate transporter. Members of this family probably act as chromate transporters. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP. 164
45781 396815 pfam02419 PsbL PsbL protein. This family consists of the photosystem II reaction centre protein PsbL from plants and Cyanobacteria. The function of this small protein is unknown. Interestingly the mRNA for this protein requires a post-transcriptional modification of an ACG triplet to form an AUG initiator codon. 37
45782 111326 pfam02420 AFP Insect antifreeze protein repeat. This family of extracellular proteins is involved in stopping the formation of ice crystals at low temperatures. The proteins are composed of a 12 residue repeat that forms a structural repeat. The structure of the repeats is a beta helix. Each repeat contains two cys residues that form a disulphide bridge. 12
45783 396816 pfam02421 FeoB_N Ferrous iron transport protein B. Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N-terminus contains a P-loop motif suggesting that iron transport may be ATP dependent. 156
45784 396817 pfam02422 Keratin Keratin. This family represents avian keratin proteins, found in feathers, scale and claw. 89
45785 396818 pfam02423 OCD_Mu_crystall Ornithine cyclodeaminase/mu-crystallin family. This family contains the bacterial Ornithine cyclodeaminase enzyme EC:4.3.1.12, which catalyzes the deamination of ornithine to proline. This family also contains mu-Crystallin the major component of the eye lens in several Australian marsupials, mRNA for this protein has also been found in human retina. 317
45786 396819 pfam02424 ApbE ApbE family. This prokaryotic family of lipoproteins are related to ApbE from Salmonella typhimurium. ApbE is involved in thiamine synthesis. It acts as an FAD:protein FMN-transferase, catalyzing the attachment of an FMN residue to a threonine residue of a protein via a phosphoester bond in such bacterial flavoproteins. 226
45787 280567 pfam02425 GBP_PSP Paralytic/GBP/PSP peptide. This family includes insect peptides that are short (23 amino acids) and contain 1 disulphide bridge. The family includes growth-blocking peptide (GBP) of Pseudaletia separata and the paralytic peptides from Manduca sexta, Heliothis virescens, and Spodoptera exigua as well as plasmatocyte-spreading peptide (PSP1). These peptides function to halt metamorphosis from larvae to pupae. 23
45788 396820 pfam02426 MIase Muconolactone delta-isomerase. This small enzyme forms a homodecameric complex that catalyzes the third step in the catabolism of catechol to succinate and acetyl-CoA in the beta-ketoadipate pathway EC:5.3.3.4. The protein has a ferredoxin-like fold according to SCOP. 87
45789 396821 pfam02427 PSI_PsaE Photosystem I reaction centre subunit IV / PsaE. PsaE is a 69 amino acid polypeptide from photosystem I present on the stromal side of the thylakoid membrane. The structure is comprised of a well-defined five-stranded beta-sheet similar to SH3 domains. 59
45790 396822 pfam02428 Prot_inhib_II Potato type II proteinase inhibitor family. Members of this family are proteinase inhibitors that contain eight cysteines that form four disulphide bridges. The structure of the proteinase-inhibitor complex is known. 51
45791 396823 pfam02429 PCP Peridinin-chlorophyll A binding protein. Peridinin-chlorophyll-protein, a water-soluble light-harvesting complex that has a blue-green absorbing carotenoid as its main pigment, is present in most photosynthetic dinoflagellates. These proteins are composed of two similar repeated domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin, and two chlorophyll a molecules. 145
45792 396824 pfam02430 AMA-1 Apical membrane antigen 1. Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites. 432
45793 396825 pfam02431 Chalcone Chalcone-flavanone isomerase. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, 4',5,7-trihydroxyflavanone, a key step in the biosynthesis of flavonoids. 203
45794 367084 pfam02432 Fimbrial_K88 Fimbrial, major and minor subunit. Fimbriae (also known as pili) are polar filaments found on the bacterial surface, allowing colonisation of the host. This family consists of the minor and major fimbrial subunits. 155
45795 396826 pfam02433 FixO Cytochrome C oxidase, mono-heme subunit/FixO. The bacterial oxidase complex, fixNOPQ or cytochrome cbb3, is thought to be required for respiration in endosymbiosis. FixO is a membrane bound mono-heme constituent of the fixNOPQ complex. 217
45796 367085 pfam02434 Fringe Fringe-like. The Drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localized to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The Xenopus homolog, lunatic fringe, has been implicated in a variety of functions. 248
45797 396827 pfam02435 Glyco_hydro_68 Levansucrase/Invertase. This Pfam family consists of the glycosyl hydrolase 68 family, including several bacterial levansucrase enzymes, and invertase from Zymomonas. 411
45798 396828 pfam02436 PYC_OADA Conserved carboxylase domain. This domain represents a conserved region in pyruvate carboxylase (PYC), oxaloacetate decarboxylase alpha chain (OADA), and transcarboxylase 5s subunit. The domain is found adjacent to the HMGL-like domain (pfam00682) and often close to the biotin_lipoyl domain (pfam00364) of biotin requiring enzymes. 199
45799 396829 pfam02437 Ski_Sno SKI/SNO/DAC family. This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homolog, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development. 100
45800 308188 pfam02438 Adeno_100 Late 100kD protein. The late 100kD protein is a non-structural viral protein involved in the transport of hexon from the cytoplasm to the nucleus. 591
45801 111345 pfam02439 Adeno_E3_CR2 Adenovirus E3 region protein CR2. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR2 (conserved region 2) is found in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR2 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 50 amino acid region is unknown. 38
45802 367088 pfam02440 Adeno_E3_CR1 Adenovirus E3 region protein CR1. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR1 (conserved region 1) is found three times in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain (A. Bateman pers. observation). 95
45803 396830 pfam02441 Flavoprotein Flavoprotein. This family contains diverse flavoprotein enzymes. This family includes epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. dipicolinate synthase catalyzes the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenyl-acrylic acid decarboxylase (EC:4.1.1.-). 179
45804 396831 pfam02442 L1R_F9L Lipid membrane protein of large eukaryotic DNA viruses. The four families of large eukaryotic DNA viruses, Poxviridae, Asfarviridae, Iridoviridae, and Phycodnaviridae, referred to collectively as nucleocytoplasmic large DNA viruses or NCLDV, have all been shown to have a lipid membrane, in spite of the major differences in virion structure. The paralogous genes L1R and F9L encode membrane proteins that have a conserved domain architecture, with a single, C-terminal transmembrane helix, and an N-terminal, multiple-disulfide-bonded domain. The conservation of the myristoylated, disulfide-bonded protein L1R/F9L in most of the NCLDV correlates with the conservation of the thiol-disulfide oxidoreductase E10R which, in vaccinia virus, is required for the formation of disulfide bonds in L1R and F9L. 183
45805 308191 pfam02443 Circo_capsid Circovirus capsid protein. Circoviruses are small circular single stranded viruses. This family is the capsid protein from viruses such as porcine circovirus and beak and feather disease virus. These proteins are about 220 amino acids long. 200
45806 280583 pfam02444 HEV_ORF1 Hepatitis E virus ORF-2 (Putative capsid protein). The Hepatitis E virus (HEV) genome is a single-stranded, positive-sense RNA molecule of approximately 7.5 kb. Three open reading frames (ORF) were identified within the HEV genome: ORF1 encodes non-structural proteins, ORF2 encodes the putative structural protein(s), and ORF3 encodes a protein of unknown function. ORF2 contains a consensus signal peptide sequence at its amino terminus and a capsid-like region with a high content of basic amino acids similar to that seen with other virus capsid proteins. 114
45807 396832 pfam02445 NadA Quinolinate synthetase A protein. Quinolinate synthetase catalyzes the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid. This synthesis requires two enzymes, a FAD-containing "B protein" and an "A protein". 287
45808 396833 pfam02446 Glyco_hydro_77 4-alpha-glucanotransferase. These enzymes EC:2.4.1.25 transfer a segment of a (1,4)-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or (1,4)-alpha-D-glucan. 460
45809 308194 pfam02447 GntP_permease GntP family permease. This is a family of integral membrane permeases that are involved in gluconate uptake. E. coli contains several members of this family including GntU, a low affinity transporter, and GntT, a high affinity transporter. 440
45810 367089 pfam02448 L71 L71 family. This family of insect proteins are each about 100 amino acids long and have 6 conserved cysteine residues. They all have a predicted signal peptide and are probably excreted. The function of the proteins is unknown. 70
45811 396834 pfam02449 Glyco_hydro_42 Beta-galactosidase. This group of beta-galactosidase enzymes belongs to the glycosyl hydrolase 42 family. The enzyme catalyzes the hydrolysis of terminal, non-reducing beta-D-galactose residues. 376
45812 396835 pfam02450 LCAT Lecithin:cholesterol acyltransferase. Lecithin:cholesterol acyltransferase (LCAT) is involved in extracellular metabolism of plasma lipoproteins, including cholesterol. 383
45813 308198 pfam02451 Nodulin Nodulin. Nodulin is a plant protein of unknown function. It is induced during nodulation in legume roots after rhizobium infection. 188
45814 396836 pfam02452 PemK_toxin PemK-like, MazF-like toxin of type II toxin-antitoxin system. PemK is a growth inhibitor in E. coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. This family represents the toxin molecule of a typical bacterial toxin-antitoxin system pairing. The family includes a number of different toxins, such as MazF, Kid, PemK, ChpA, ChpB and ChpAK. 108
45815 396837 pfam02453 Reticulon Reticulon. Reticulon, also known as neuroendocrine-specific protein (NSP), is a protein of unknown function which associates with the endoplasmic reticulum. This family represents the C-terminal domain of the three reticulon isoforms and their homologs. 157
45816 111360 pfam02454 Sigma_1s Sigma 1s protein. The reoviral gene S1 encodes for haemagglutinin (sigma 1 protein), an outer capsid protein and a major factor in determining virus-host cell interactions. Sigma 1s is one of two translation products of the S1 gene. 116
45817 308201 pfam02455 Hex_IIIa Hexon-associated protein (IIIa). The major capsid protein of the adenovirus strain is also known as a hexon. This is a family of hexon-associated proteins (protein IIIa). 539
45818 280594 pfam02456 Adeno_IVa2 Adenovirus IVa2 protein. IVa2 protein can interact with the adenoviral packaging signal, and this interaction involves DNA sequences that have previously been demonstrated to be required for packaging. During the course of lytic infection, the adenovirus major late promoter (MLP) is induced to high levels after replication of viral DNA has started. IVa2 is a transcriptional activator of the major late promoter. 370
45819 396838 pfam02457 DisA_N DisA bacterial checkpoint controller nucleotide-binding. The DisA protein is a bacterial checkpoint protein that dimerizes into an octameric complex. The protein consists of three distinct domains. This domain is the first and is a globular, nucleotide-binding region; the next 146-289 residues constitute the DisA-linker family, pfam10635, that consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha7-9), which thus forms a spine like-linker between domains 1 and 3. The C-terminal residues, of domain 3, are represented by family HHH, pfam00633, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis. 114
45820 280596 pfam02458 Transferase Transferase family. This family includes a number of transferase enzymes. These include anthranilate N-hydroxycinnamoyl/benzoyltransferase that catalyzes the first committed reaction of phytoalexin biosynthesis. Deacetylvindoline 4-O-acetyltransferase EC:2.3.1.107, which catalyzes the last step in vindoline biosynthesis, is also a member of this family. The motif HXXXD is probably part of the active site. The family also includes trichothecene 3-O-acetyltransferase. 434
45821 280597 pfam02459 Adeno_terminal Adenoviral DNA terminal protein. This protein is covalently attached to the termini of replicating DNA in vivo. 543
45822 308203 pfam02460 Patched Patched family. The transmembrane protein Patched is a receptor for the morphogen Sonic Hedgehog. This protein associates with the smoothened protein to transduce hedgehog signals. 793
45823 396839 pfam02461 AMO Ammonia monooxygenase. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons. 238
45824 308205 pfam02462 Opacity Opacity family porin protein. Pathogenic Neisseria spp. possess a repertoire of phase-variable Opacity proteins that mediate various pathogen--host cell interactions. These proteins are integral membrane proteins related to other porins. 126
45825 308206 pfam02463 SMC_N RecF/RecN/SMC N terminal domain. This domain is found at the N-terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination. 1162
45826 396840 pfam02464 CinA Competence-damaged protein. CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation. This Pfam family consists of putative competence-damaged proteins from the cin operon. Some members of this family have nicotinamide mononucleotide (NMN) deamidase activity. 155
45827 396841 pfam02465 FliD_N Flagellar hook-associated protein 2 N-terminus. The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the N-terminal region of this family of proteins. 97
45828 396842 pfam02466 Tim17 Tim17/Tim22/Tim23/Pmp24 family. The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in. 111
45829 396843 pfam02467 Whib Transcription factor WhiB. WhiB is a putative transcription factor in Actinobacteria, required for differentiation and sporulation. 65
45830 396844 pfam02468 PsbN Photosystem II reaction centre N protein (psbN). This is a family of small proteins encoded on the chloroplast genome. psbN is involved in photosystem II during photosynthesis, but its exact role is unknown. 43
45831 396845 pfam02469 Fasciclin Fasciclin domain. This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria. 123
45832 396846 pfam02470 MlaD MlaD protein. This family of proteins contains MlaD, which is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. The family also contains the mce (mammalian cell entry) proteins from Mycobacterium tuberculosis. The archetype (Rv0169), was isolated as being necessary for colonisation of, and survival within, the macrophage. This family contains proteins of unknown function from other bacteria. 81
45833 396847 pfam02471 OspE Borrelia outer surface protein E. This is a family of outer surface proteins (Osp) from the Borrelia spirochete. The family includes OspE, and OspEF-related proteins (Erp). These proteins are coded for on different circular plasmids in the Borrelia genome. 107
45834 396848 pfam02472 ExbD Biopolymer transport protein ExbD/TolR. This group of proteins are membrane bound transport proteins essential for ferric ion uptake in bacteria. The Pfam family consists of ExbD, and TolR which are involved in TonB-dependent transport of various receptor bound substrates including colicins. 128
45835 396849 pfam02474 NodA Nodulation protein A (NodA). Rhizobia nodulation (nod) genes control the biosynthesis of Nod factors required for infection and nodulation of their legume hosts. Nodulation protein A (NodA) is a N-acetyltransferase involved in production of Nod factors that stimulate mitosis in various plant protoplasts. 195
45836 396850 pfam02475 Met_10 Met-10+ like-protein. The methionine-10 mutant allele of N. crassa codes for a protein of unknown function. However, homologous proteins have been found in yeast, suggesting this protein may be involved in methionine biosynthesis, transport and/or utilisation. 198
45837 396851 pfam02476 US2 US2 family. This is a family of unique short (US) region proteins from the herpesvirus strain. The US2 family have no known function. 124
45838 396852 pfam02477 Nairo_nucleo Nucleocapsid N protein. The nucleoprotein of the ssRNA negative-strand Nairovirus is an internal part of the virus particle. 443
45839 280615 pfam02478 Pneumo_phosprot Pneumovirus phosphoprotein. This family represents the phosphoprotein of Paramyxoviridae, a putative RNA polymerase alpha subunit that may function in template binding. 286
45840 280616 pfam02479 Herpes_IE68 Herpesvirus immediate early protein. This regulatory protein is expressed from an immediate early gene in the cell cycle of herpesvirus. The protein is known by various names including IE-68, US1, ICP22 and IR4. 132
45841 280617 pfam02480 Herpes_gE Alphaherpesvirus glycoprotein E. Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI) (pfam01688), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation. 432
45842 396853 pfam02481 DNA_processg_A DNA recombination-mediator protein A. The SMF family, of DNA processing chain A, dprA, are a group of bacterial proteins. In H. pylori, dprA is required for natural chromosomal and plasmid transformation. It has now been shown that DprA is found to bind cooperatively to single-stranded DNA (ssDNA) and to interact with RecA. In the process, DprA-RecA-ssDNA filaments are produced and these filaments catalyze the homology-dependent formation of joint molecules. While the E.coli SSB protein limits access of RecA to ssDNA, DprA alleviates this barrier. It is proposed that DprA is a new member of the recombination-mediator protein family, dedicated to natural bacterial transformation. 210
45843 396854 pfam02482 Ribosomal_S30AE Sigma 54 modulation protein / S30EA ribosomal protein. This Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light- repressed protein (lrtA). 92
45844 280620 pfam02484 Rhabdo_NV Rhabdovirus Non-virion protein. Infectious hematopoietic necrosis virus (IHNV) is a member of the family Rhabdoviridae. The non-virion protein (NV) is coded for by one of the six genes of the IHNV genome, but is absent in vesiculovirus -like rhabdovirus. 111
45845 396855 pfam02485 Branch Core-2/I-Branching enzyme. This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans. This is a family of glycosyl-transferases that are Type II membrane proteins that are found in the endoplasmic reticulum (ER) and Golgi apparatus. 250
45846 396856 pfam02486 Rep_trans Replication initiation factor. Plasmid replication is initiated by the replication initiation factor (REP). This family represents a probable topoisomerase that makes a sequence-specific single-stranded nick in the plasmid DNA at the origin of replication. Human proteins also belong to this family, including myelin transcription factor 2 and cerebrin-50. 201
45847 396857 pfam02487 CLN3 CLN3 protein. This is a family of proteins from the CLN3 gene. A mis-sense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Batten disease is characterized by the accumulation of autofluorescent material in the lysosomes of most cells. Members of this family are transmembrane proteins functional in pre-vacuolar compartments. The protein in Sch.pombe is found to be localized to the vacuolar membrane, and a lack of functional protein clearly affects the size and pH of the vacuole. Thus the protein is necessary for vacuolar homeostasis. It is important for localization of late endosomal/lysosomal compartments, and it interacts with motor components driving both plus and minus end microtubular trafficking: tubulin, dynactin, dynein and kinesin-2. 384
45848 280624 pfam02488 EMA Merozoite Antigen. This family represents the immunodominant surface antigen of Theileria parasites including equi merozoite antigen-1 (EMA-1) and equi merozoite antigen-2 (EMA-2). The protein shows variation at a putative glycosylation site, a potential mechanism for host immune response evasion. 250
45849 396858 pfam02489 Herpes_glycop_H Herpesvirus glycoprotein H main domain. Herpesvirus glycoprotein H (gH) is a virion associated envelope glycoprotein. Complex formation between gH and gL has been demonstrated in both virions and infected cells. 500
45850 396859 pfam02491 SHS2_FTSA SHS2 domain inserted in FTSA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. The SHS2 domain is inserted in to the RNAseH fold of FtsA, and is involved in protein-protein interaction. 73
45851 396860 pfam02492 cobW CobW/HypB/UreG, nucleotide-binding domain. This domain is found in HypB, a hydrogenase expression / formation protein, and UreG a urease accessory protein. Both these proteins contain a P-loop nucleotide binding motif. HypB has GTPase activity and is a guanine nucleotide binding protein. It is not known whether UreG binds GTP or some other nucleotide. Both enzymes are involved in nickel binding. HypB can store nickel and is required for nickel dependent hydrogenase expression. UreG is required for functional incorporation of the urease nickel metallocenter. GTP hydrolysis may be required by these proteins for nickel incorporation into other nickel proteins. This family of domains also contains P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression, and the cobW gene product, which may be involved in cobalamin biosynthesis in Pseudomonas denitrificans. 179
45852 308220 pfam02493 MORN MORN repeat. The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (see Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesized to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. 23
45853 367105 pfam02494 HYR HYR domain. This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion. 81
45854 396861 pfam02495 7kD_coat 7kD viral coat protein. This family consists of a 7kD coat protein from carlavirus and potexvirus. 59
45855 396862 pfam02496 ABA_WDS ABA/WDS induced protein. This is a family of plant proteins induced by water deficit stress (WDS), or abscisic acid (ABA) stress and ripening. 78
45856 111400 pfam02497 Arteri_GP4 Arterivirus glycoprotein. This is a family of structural glycoproteins from arterivirus that corresponds to open reading frame 4 (ORF4) of the virus. 178
45857 376797 pfam02498 Bro-N BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins. 96
45858 280633 pfam02499 DNA_pack_C Probable DNA packing protein, C-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is found at the C-terminus of the protein. 348
45859 280634 pfam02500 DNA_pack_N Probable DNA packing protein, N-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is normally found at the N-terminus of the protein. 277
45860 396863 pfam02501 T2SSI Type II secretion system (T2SS), protein I. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for the transport of proteins across the outer membrane first exported to the periplasm by the Sec or Tat translocon in Gram-negative (diderm) bacteria. As members of the T2SJ family, members of the T2SI family are pseudopilins containing prepilin signal sequences. 80
45861 396864 pfam02502 LacAB_rpiB Ribose/Galactose Isomerase. This family of proteins contains the sugar isomerase enzymes ribose 5-phosphate isomerase B (rpiB), galactose isomerase subunit A (LacA) and galactose isomerase subunit B (LacB). 134
45862 396865 pfam02503 PP_kinase Polyphosphate kinase middle domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. 199
45863 396866 pfam02504 FA_synthesis Fatty acid synthesis protein. The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes. The exact function of the plsX protein in fatty acid synthesis is unknown. 324
45864 396867 pfam02505 MCR_D Methyl-coenzyme M reductase operon protein D. Methyl coenzyme M reductase (MCR) catalyzes the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown. 142
45865 367108 pfam02507 PSI_PsaF Photosystem I reaction centre subunit III. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. Subunit III (or PSI-F) is one of at least 14 different subunits that compose the PSI complex. 159
45866 396868 pfam02508 Rnf-Nqr Rnf-Nqr subunit, membrane protein. This is a family of integral membrane proteins including Rhodobacter-specific nitrogen fixation (rnf) proteins RnfA and RnfE and Na+-translocating NADH:ubiquinone oxidoreductase (Na+-NQR) subunits NqrD and NqrE. 181
45867 396869 pfam02509 Rota_NS35 Rotavirus non-structural protein 35. Rotavirus non-structural protein 35 (NS35) is a basic protein which possesses RNA-binding activity and is essential for genome replication. 317
45868 280643 pfam02510 SPAN Surface presentation of antigens protein. Surface presentation of antigens protein (SPAN), also known as invasion protein invJ, is a Salmonella secretory pathway protein involved in presentation of determinants required for mammalian host cell invasion. 336
45869 396870 pfam02511 Thy1 Thymidylate synthase complementing protein. Thymidylate synthase complementing protein (Thy1) complements the thymidine growth requirement of the organisms in which it is found, but shows no homology to thymidylate synthase. The bacterial members of this family at least are flavin-dependent thymidylate synthases. 185
45870 280645 pfam02512 UK Virulence determinant. The UK protein is an African swine fever virus (ASFV) protein that is highly conserved amongst strains, and is an important viral virulence determinant for domestic pigs. 96
45871 396871 pfam02513 Spin-Ssty Spin/Ssty Family. Spindlin (Spin) is a novel maternal transcript present in the unfertilized egg and early embryo. The Y-linked spermiogenesis -specific transcript (Ssty) is also expressed during gametogenesis and forms part of this Pfam family. Members of this family contain three copies of this 50 residue repeat. The repeat is predicted to contain four beta strands. 49
45872 376805 pfam02514 CobN-Mg_chel CobN/Magnesium Chelatase. This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of hydrogenobyrinic acid a,c-diamide to cobyrinic acid. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis. 1051
45873 396872 pfam02515 CoA_transf_3 CoA-transferase family III. CoA-transferases are found in organisms from all lines of descent. Most of these enzymes belong to two well-known enzyme families, but recent work on unusual biochemical pathways of anaerobic bacteria has revealed the existence of a third family of CoA-transferases. The members of this enzyme family differ in sequence and reaction mechanism from CoA-transferases of the other families. Currently known enzymes of the new family are a formyl-CoA: oxalate CoA-transferase, a succinyl-CoA: (R)-benzylsuccinate CoA-transferase, an (E)-cinnamoyl-CoA: (R)-phenyllactate CoA-transferase, and a butyrobetainyl-CoA: (R)-carnitine CoA-transferase. In addition, a large number of proteins of unknown or differently annotated function from Bacteria, Archaea and Eukarya apparently belong to this enzyme family. Properties and reaction mechanisms of the CoA-transferases of family III are described and compared to those of the previously known CoA-transferases. 367
45874 396873 pfam02516 STT3 Oligosaccharyl transferase STT3 subunit. This family consists of the oligosaccharyl transferase STT3 subunit and related proteins. The STT3 subunit is part of the oligosaccharyl transferase (OTase) complex of proteins and is required for its activity. In eukaryotes, OTase transfers a lipid-linked core-oligosaccharide to selected asparagine residues in the ER. In the archaea STT3 occurs alone, rather than in an OTase complex, and is required for N-glycosylation of asparagines. 478
45875 396874 pfam02517 Abi CAAX protease self-immunity. Members of this family are probably proteases (after an isoprenyl group is attached to the Cys residue in the C-terminal CAAX motif of a protein to attach it to the membrane, the AAX tripeptide being removed by one of the CAAX prenyl proteases). The family contains the CAAX prenyl protease. The proteins contain a highly conserved Glu-Glu motif at the amino end of the alignment. The alignment also contains two histidine residues that may be involved in zinc binding. While they are involved in membrane anchoring of proteins in eukaryotes, little is known about their function in prokaryotes. In some known bacteriocin loci, Abi genes have been found downstream of bacteriocin structural genes where they are probably involved in self-immunity. Investigation of the bacteriocin-like loci in the Gram positive bacteria locus from Lactobacillus sakei 23K confirmed that the bacteriocin-like genes (sak23Kalphabeta) exhibited antimicrobial activity when expressed in a heterologous host and that the associated Abi gene (sak23Ki) conferred immunity against the cognate bacteriocin. Interestingly, the immunity genes from three similar systems conferred a high degree of cross-immunity against each other's bacteriocins, suggesting the recognition of a common receptor. Site-directed mutagenesis demonstrated that the conserved motifs constituting the putative proteolytic active site of the Abi proteins are essential for the immunity function of Sak23Ki - thus a new concept in self-immunity. 92
45876 396875 pfam02518 HATPase_c Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90. 111
45877 396876 pfam02519 Auxin_inducible Auxin responsive protein. This family consists of the protein products of the ARG7 auxin responsive genes family none of which have any identified functional role. 92
45878 396877 pfam02520 DUF148 Domain of unknown function DUF148. This domain has no known function nor do any of the proteins that possess it. In one member of this family the aligned region is repeated twice. 107
45879 334957 pfam02521 HP_OMP_2 Putative outer membrane protein. This family consists of putative outer membrane proteins from Helicobacter pylori (campylobacter pylori). 442
45880 396878 pfam02522 Antibiotic_NAT Aminoglycoside 3-N-acetyltransferase. This family consists of bacterial aminoglycoside 3-N-acetyltransferases EC:2.3.1.81, these catalyze the reaction: Acetyl-CoA + a 2-deoxystreptamine antibiotic <=> CoA + N3'-acetyl-2-deoxystreptamine antibiotic. The enzyme can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity, this inactivates and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others. 230
45881 280656 pfam02524 KID KID repeat. This family contains the KID repeat as found in Borrelia spirochete RepA / Rep+ proteins. The function of these proteins is unknown. RepA and related Borrelia proteins have been suggested to play an important genus-wide role in the biology of the Borrelia. 11
45882 396879 pfam02525 Flavodoxin_2 Flavodoxin-like fold. This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. 190
45883 280658 pfam02526 GBP_repeat Glycophorin-binding protein. This family contains glycophorin binding proteins from P. falciparum the malarial parasite. Glycophorin is a cell surface protein of erythrocytes. The Glycophorin binding protein contains a tandem 38 residue repeat. In Plasmodium falciparum GBP the repeat occurs 11 times. 38
45884 396880 pfam02527 GidB rRNA small subunit methyltransferase G. This is a family of bacterial glucose inhibited division proteins; these are probably involved in the regulation of cell division. GidB has been shown to be a methyltransferase G specific to the rRNA small subunit. Previously identified as a glucose-inhibited division protein B that appears to be present and in a single copy in all complete eubacterial genomes so far sequenced. GidB specifically methylates the N7 position of a guanosine in 16S rRNA. 184
45885 396881 pfam02529 PetG Cytochrome B6-F complex subunit 5. This family consists of cytochrome B6-F complex subunit 5 (PetG). The cytochrome bf complex found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase. PetG or subunit 5 is associated with the bf complex and the absence of PetG affects either the assembly or stability of the cytochrome bf complex in Chlamydomonas reinhardtii. 36
45886 308243 pfam02530 Porin_2 Porin subfamily. This family consists of porins from the alpha subdivision of Proteobacteria the members of this family are related to pfam00267. The porins form large aqueous channels in the cell membrane allowing the selective entry of hydrophilic compounds this so called 'molecular sieve' is found in the cell walls of gram negative bacteria. 355
45887 396882 pfam02531 PsaD PsaD. This family consists of PsaD from plants and cyanobacteria. PsaD is an extrinsic polypeptide of photosystem I (PSI) and is required for native assembly of PSI reaction clusters and is implicated in the electrostatic binding of ferredoxin within the reaction centre. PsaD forms a dimer in solution which is bound by PsaE however PsaD is monomeric in its native complexed PSI environment. 133
45888 308245 pfam02532 PsbI Photosystem II reaction centre I protein (PSII 4.8 kDa protein). This family consists of various Photosystem II (PSII) reaction centre I proteins or PSII 4.8 kDa proteins, PsbI, from the chloroplast genome of many plants and Cyanobacteria. PsbI is a small, integral membrane component of PSII the role of which is not clear. Synechocystis mutants lacking PsbI have 20-30% loss of PSII activity however the PSII complex is not destabilized. 36
45889 396883 pfam02533 PsbK Photosystem II 4 kDa reaction centre component. This family consists of various photosystem II 4 kDa reaction centre components (PsbK) from plant and Cyanobacteria. The photosystem II reaction centre is responsible for catalyzing the core photosynthesis reaction the light-induced splitting of water and the consequential release of dioxygen. In C. reinhardtii the psbK product is required for the stable assembly and/or stability of the photosystem II complex. 41
45890 367119 pfam02534 T4SS-DNA_transf Type IV secretory system Conjugative DNA transfer. These proteins contain a P-loop and walker-B site for nucleotide binding. TraG is essential for DNA transfer in bacterial conjugation. These proteins are thought to mediate interactions between the DNA-processing (Dtr) and the mating pair formation (Mpf) systems. The C-terminus of this domain interacts with the relaxosome component TraM via the latter's tetramerisation domain. TraD is a hexameric ring ATPase that forms the cytoplasmic face of the conjugative pore. The family contains a number of different DNA transfer proteins. 468
45891 396884 pfam02535 Zip ZIP Zinc transporter. The ZIP family consists of zinc transport proteins and many putative metal transporters. The main contribution to this family is from the Arabidopsis thaliana ZIP protein family these proteins are responsible for zinc uptake in the plant. Also found within this family are C. elegans proteins of unknown function which are annotated as being similar to human growth arrest inducible gene product, although this protein is not found within this family. 325
45892 396885 pfam02536 mTERF mTERF. This family contains one sequence of known function Human mitochondrial transcription termination factor (mTERF) the rest of the family consists of hypothetical proteins none of which have any functional information. mTERF is a multizipper protein possessing three putative leucine zippers one of which is bipartite. The protein binds DNA as a monomer. The leucine zippers are not implicated in a dimerization role as in other leucine zippers. 313
45893 396886 pfam02537 CRCB CrcB-like protein, Camphor Resistance (CrcB). CRCB is a family of bacterial integral membrane proteins with four TMs. Overexpression in E. coli also leads to camphor resistance. 109
45894 396887 pfam02538 Hydantoinase_B Hydantoinase B/oxoprolinase. This family includes N-methylhydantoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase EC:3.5.2.9 which catalyzes the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to pfam01968. 505
45895 396888 pfam02540 NAD_synthase NAD synthase. NAD synthase (EC:6.3.5.1) is involved in the de novo synthesis of NAD and is induced by stress factors such as heat shock and glucose limitation. 241
45896 396889 pfam02541 Ppx-GppA Ppx/GppA phosphatase family. This family consists of the N-terminal region of exopolyphosphatase (Ppx) EC:3.6.1.11 and guanosine pentaphosphate phospho-hydrolase (GppA) EC:3.6.1.40. 285
45897 396890 pfam02542 YgbB YgbB family. The ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis). 155
45898 280673 pfam02543 Carbam_trans_N Carbamoyltransferase N-terminus. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU a Rhizobium nodulation protein involved in the synthesis of nodulation factors has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains, this is the larger, N-terminal, domain. 336
45899 251363 pfam02544 Steroid_dh 3-oxo-5-alpha-steroid 4-dehydrogenase. This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5 Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone, this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development. 150
45900 396891 pfam02545 Maf Maf-like protein. Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea. 183
45901 396892 pfam02547 Queuosine_synth Queuosine biosynthesis protein. Queuosine (Q) biosynthesis protein, or S-adenosylmethionine:tRNA -ribosyltransferase-isomerase, is required for the synthesis of the queuosine precursor (oQ). It catalyzes the transfer and isomerisation of the ribose moiety from AdoMet to the 7-aminomethyl group of 7-deazaguanine (preQ1-tRNA) to form epoxyqueuosine (oQ-tRNA). Q is a hypermodified nucleoside usually found at the first position of the anticodon of asparagine, aspartate, histidine, and tyrosine tRNAs. In Streptococcus gordonii, QueA has been shown to play a role in the regulation of arginine deiminase genes. 336
45902 396893 pfam02548 Pantoate_transf Ketopantoate hydroxymethyltransferase. Ketopantoate hydroxymethyltransferase (EC:2.1.2.11) is the first enzyme in the pantothenate biosynthesis pathway. 259
45903 251367 pfam02550 AcetylCoA_hydro Acetyl-CoA hydrolase/transferase N-terminal domain. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA. 198
45904 396894 pfam02551 Acyl_CoA_thio Acyl-CoA thioesterase. This family represents the thioesterase II domain. Two copies of this domain are found in a number of acyl-CoA thioesterases. 132
45905 251369 pfam02552 CO_dh CO dehydrogenase beta subunit/acetyl-CoA synthase epsilon subunit. This family consists of Carbon monoxide dehydrogenase I/II beta subunit EC:1.2.99.2 and acetyl-CoA synthase epsilon subunit. Carbon monoxide beta subunit catalyzes the reaction: CO + H2O + acceptor <=> CO2 + reduced acceptor. 168
45906 396895 pfam02553 CbiN Cobalt transport protein component CbiN. CbiN is part of the active cobalt transport system involved in uptake of cobalt in to the cell involved with cobalamin biosynthesis (vitamin B12). It has been suggested that CbiN may function as the periplasmic binding protein component of the active cobalt transport system. 67
45907 396896 pfam02554 CstA Carbon starvation protein CstA. This family consists of Carbon starvation protein CstA a predicted membrane protein. It has been suggested that CstA is involved in peptide utilisation. 372
45908 396897 pfam02556 SecB Preprotein translocase subunit SecB. This family consists of preprotein translocase subunit SecB. SecB is required for the normal export of envelope proteins out of the cell cytoplasm. 137
45909 396898 pfam02557 VanY D-alanyl-D-alanine carboxypeptidase. 131
45910 396899 pfam02558 ApbA Ketopantoate reductase PanE/ApbA. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme is required for the synthesis of thiamine via the APB biosynthetic pathway. 150
45911 396900 pfam02559 CarD_CdnL_TRCF CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain, CdnL. CarD interacts with the zinc-binding protein CarG to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF (transcription-repair-coupling factor) proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase. The family includes members otherwise referred to as CdnL, for CarD N-terminal like, which differ functionally from CarD. The TRCF domain mentioned above is the RNA polymerase-interacting domain or RID. 89
45912 396901 pfam02560 Cyanate_lyase Cyanate lyase C-terminal domain. Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain. 65
45913 396902 pfam02561 FliS Flagellar protein FliS. FliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT, however this protein has no known function. 115
45914 396903 pfam02562 PhoH PhoH-like protein. PhoH is a cytoplasmic protein and predicted ATPase that is induced by phosphate starvation. 204
45915 396904 pfam02563 Poly_export Polysaccharide biosynthesis/export protein. This is a family of periplasmic proteins involved in polysaccharide biosynthesis and/or export. 74
45916 396905 pfam02565 RecO_C Recombination protein O C terminal. Recombination protein O (RecO) is involved in DNA repair and pfam00470 pathway recombination. 157
45917 396906 pfam02566 OsmC OsmC-like protein. Osmotically inducible protein C (OsmC) is a stress-induced protein found in E. coli. This family also contains an organic hydroperoxide detoxification protein that has a novel pattern of oxidative stress regulation. 99
45918 396907 pfam02567 PhzC-PhzF Phenazine biosynthesis-like protein. PhzC/PhzF is involved in dimerization of two 2,3-dihydro-3-oxo-anthranilic acid molecules to create PCA by P. fluorescens. This family also contains uncharacterized Mycobacterial proteins, though there is no significant sequence similarity to pfam00303 members. This family appears to be distantly related to pfam01678, including containing a weak internal duplication. However members of this family do not contain the conserved cysteines that are hypothesized to be active site residues (Bateman A pers obs). 280
45919 280691 pfam02568 ThiI Thiamine biosynthesis protein (ThiI). ThiI is required for thiazole synthesis, required for thiamine biosynthesis. 197
45920 396908 pfam02569 Pantoate_ligase Pantoate-beta-alanine ligase. Pantoate-beta-alanine ligase, also known as pantothenate synthase, (EC:6.3.2.1) catalyzes the formation of pantothenate from pantoate and alanine. 275
45921 396909 pfam02570 CbiC Precorrin-8X methylmutase. This is a family of Precorrin-8X methylmutases also known as Precorrin isomerase, CbiC/CobH, EC:5.4.1.2. This enzyme catalyzes the reaction: Precorrin-8X <=> hydrogenobyrinate. This enzyme is part of the Cobalamin (vitamin B12) biosynthetic pathway and catalyzes a methyl rearrangement. 191
45922 396910 pfam02571 CbiJ Precorrin-6x reductase CbiJ/CobK. This family consists of Precorrin-6x reductase EC:1.3.1.54. This enzyme catalyzes the reaction: precorrin-6Y + NADP(+) <=> precorrin-6X + NADPH. CbiJ and CobK both catalyze the reduction of the macrocycle in the cobalamin biosynthesis pathway. 248
45923 396911 pfam02572 CobA_CobO_BtuR ATP:corrinoid adenosyltransferase BtuR/CobO/CobP. This family consists of the BtuR, CobO, CobP proteins all of which are Cob(I)alamin adenosyltransferase, EC:2.5.1.17, involved in cobalamin (vitamin B12) biosynthesis. These enzymes catalyze the adenosylation reaction: ATP + cob(I)alamin + H2O <=> phosphate + diphosphate + adenosylcobalamin. 171
45924 396912 pfam02574 S-methyl_trans Homocysteine S-methyltransferase. This is a family of related homocysteine S-methyltransferase enzymes: 5-methyltetrahydrofolate--homocysteine S-methyltransferases also known as EC:2.1.1.13; Betaine--homocysteine S-methyltransferase (vitamin B12 dependent), EC:2.1.1.5; and Homocysteine S-methyltransferase, EC:2.1.1.10. 267
45925 396913 pfam02575 YbaB_DNA_bd YbaB/EbfC DNA-binding family. This is a family of DNA-binding proteins. Members of this family form homodimers which bind DNA via a tweezer-like structure. The conformation of the DNA is changed when bound to these proteins. In bacteria, these proteins may play a role in DNA replication-recovery following DNA damage. 90
45926 396914 pfam02576 DUF150 RimP N-terminal domain. This family represents the N-terminal domain from RimP. 73
45927 396915 pfam02577 DNase-RNase Bifunctional nuclease. This family is a bifunctional nuclease, with both DNase and RNase activity. It forms a wedge-shaped dimer, with each monomer being triangular in shape. A large groove at the thick end of the wedge contains a possible active site. 112
45928 396916 pfam02578 Cu-oxidase_4 Multi-copper polyphenol oxidoreductase laccase. Laccases are multi-copper oxidoreductases able to oxidize a wide variety of phenolic and non-phenolic compounds and are widely distributed among both prokaryotes and eukaryotes. There are two main active catalytic sites with conserved histidines that are capable of binding four copper atoms. 232
45929 396917 pfam02579 Nitro_FeMo-Co Dinitrogenase iron-molybdenum cofactor. This family contains several NIF (B, Y and X) proteins which are iron-molybdenum cofactors (FeMo-co) in the dinitrogenase enzyme which catalyzes the reduction of dinitrogen to ammonium. Dinitrogenase is a hetero-tetrameric (alpha(2)beta(2)) enzyme which contains the iron-molybdenum cofactor (FeMo-co) at its active site. 92
45930 396918 pfam02580 Tyr_Deacylase D-Tyr-tRNA(Tyr) deacylase. This family comprises several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several d-amino acids can be explained by an in vivo production of d-aminoacyl-tRNA molecules. Escherichia coli and yeast cells express an enzyme, d-Tyr-tRNA(Tyr) deacylase, capable of recycling such d-aminoacyl-tRNA molecules into free tRNA and d-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of d-amino acids increases. Orthologues of the deacylase are found in many cells. 143
45931 396919 pfam02581 TMP-TENI Thiamine monophosphate synthase. Thiamine monophosphate synthase (TMP) (EC:2.5.1.3) catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate. This Pfam family also includes the regulatory protein TENI, a protein from Bacillus subtilis that regulates the production of several extracellular enzymes by reducing alkaline protease production. While TenI shows high sequence similarity with thiamin phosphate synthase, the purified protein has no thiamin phosphate synthase activity. Instead, it is a thiazole tautomerase. 180
45932 396920 pfam02582 DUF155 Uncharacterized ACR, YagE family COG1723. 173
45933 396921 pfam02583 Trns_repr_metal Metal-sensitive transcriptional repressor. This is a family of metal-sensitive repressors, involved in resistance to metal ions. Members of this family bind copper, nickel or cobalt ions via conserved cysteine and histidine residues. In the absence of metal ions, these proteins bind to promoter regions and repress transcription. When bound to metal ions they are unable to bind DNA, leading to transcriptional derepression. 79
45934 396922 pfam02585 PIG-L GlcNAc-PI de-N-acetylase. Members of this family are related to PIG-L an N-acetylglucosaminylphosphatidylinositol de-N-acetylase (EC:3.5.1.89) that catalyzes the second step in GPI biosynthesis. 125
45935 396923 pfam02586 SRAP SOS response associated peptidase (SRAP). The SRAP family functions as a DNA-associated autoproteolytic switch that recruits diverse repair enzymes onto DNA damage. We propose that the human protein Q96FZ2:UniProtKB, the eukaryotic member of the SRAP family, which has been recently shown to bind specifically to DNA with 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxycytosine, is a sensor for these oxidized bases generated by the TET (ten-eleven translocation) enzymes from methylcytosine. Hence, its autoproteolytic activity might help it act as a switch that recruits DNA repair enzymes to remove these oxidized methylcytosine species as part of the DNA demethylation pathway downstream of the TET enzymes. 212
45936 396924 pfam02588 YitT_membrane Uncharacterized 5xTM membrane BCR, YitT family COG1284. This is probably a bacterial ABC transporter permease (personal obs:Yeats C). 206
45937 396925 pfam02589 LUD_dom LUD domain. This entry represents a domain found in lactate utilization proteins B (LutB) and C (LutC), as well as several uncharacterized proteins. LutB and LutC are encoded by the conserved LutABC operon in bacteria. They are involved in lactate utilization and are implicated in the oxidative conversion of L-lactate into pyruvate. 188
45938 396926 pfam02590 SPOUT_MTase Predicted SPOUT methyltransferase. This family of proteins are predicted to be SPOUT methyltransferases. 155
45939 396927 pfam02591 zf-RING_7 C4-type zinc ribbon domain. Zn-ribbon_9 is a Zn-ribbon domain rich in aromatic and positively charged amino acid residues. This C-terminal Zn-ribbon domain consists of two beta-strands acting as a scaffold for the two Zn knuckles. Both pairs of cysteines making up the two Zn knuckles are situated at highly conserved sharp beta-turns, an arrangement that facilitates the tetrahedral coordination of the divalent Zn ion. The two Zn-knuckle cysteine motifs are separated by 20 residues, 9 of which form an alpha-helix (helix 4). Structural modelling suggests this domain may bind nucleic acids. The domain appears to bind flaA-mRNA, thus contributing to flagellum formation and motility. 33
45940 396928 pfam02592 Vut_1 Putative vitamin uptake transporter. 154
45941 367127 pfam02593 DUF166 Domain of unknown function. This family catalyzes the synthesis of thymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP). The physiological co-substrate has not yet been identified. Previous designation of this family as being thymidylate synthase from one paper, PMID:10436953, has been shown to be erroneous. The proteins are uncharacterized. 218
45942 396929 pfam02594 DUF167 Uncharacterized ACR, YggU family COG1872. 75
45943 396930 pfam02595 Gly_kinase Glycerate kinase family. This is a family of Glycerate kinases. 367
45944 396931 pfam02596 DUF169 Uncharacterized ArCR, COG2043. 209
45945 396932 pfam02597 ThiS ThiS family. ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is coded in the thiCEFSGH operon in E. coli. This family of proteins has two conserved Glycines at the COOH terminus. Thiocarboxylate is formed at the last G in the activation process. Sulphur is transferred from ThiI to ThiS in a reaction catalyzed by IscS. MoaD, a protein involved in sulphur transfer in molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end. 74
45946 396933 pfam02598 Methyltrn_RNA_3 Putative RNA methyltransferase. This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces is opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase. 282
45947 396934 pfam02599 CsrA Global regulator protein family. This is a family of global regulator proteins. This protein is an RNA-binding protein and a global regulator of carbohydrate metabolism genes facilitating mRNA decay. In E. coli CsrA binds the CsrB RNA molecule to form the Csr regulatory system which has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora RmsA has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RmsA binds to RmsB regulatory RNA. 50
45948 396935 pfam02600 DsbB Disulfide bond formation protein DsbB. This family consists of disulfide bond formation protein DsbB from bacteria. The DsbB protein oxidizes the periplasmic protein DsbA which in turn oxidizes cysteines in other periplasmic proteins in order to make disulfide bonds. DsbB acts as a redox potential transducer across the cytoplasmic membrane and is an integral membrane protein. DsbB possesses six cysteines, four of which are necessary for its proper function in vivo. 149
45949 396936 pfam02601 Exonuc_VII_L Exonuclease VII, large subunit. This family consists of exonuclease VII, large subunit EC:3.1.11.6. This enzyme catalyzes exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones. 264
45950 396937 pfam02602 HEM4 Uroporphyrinogen-III synthase HemD. This family consists of uroporphyrinogen-III synthase HemD EC:4.2.1.75 also known as Hydroxymethylbilane hydro-lyase (cyclizing) from eukaryotes, bacteria and archaea. This enzyme catalyzes the reaction: Hydroxymethylbilane <=> uroporphyrinogen-III + H(2)O. Some members of this family are multi-functional proteins possessing other enzyme activities related to porphyrin biosynthesis, such as HemD with pfam00590, however the aligned region corresponds with the uroporphyrinogen-III synthase EC:4.2.1.75 activity only. Uroporphyrinogen-III synthase is the fourth enzyme in the heme pathway. Mutant forms of the Uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria in humans, a recessive inborn error of metabolism also known as Gunther disease. 230
45951 396938 pfam02603 Hpr_kinase_N HPr Serine kinase N-terminus. This family represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phospho-relay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains. 125
45952 396939 pfam02604 PhdYeFM_antitox Antitoxin Phd_YefM, type II toxin-antitoxin system. Members of this family act as antitoxins in type II toxin-antitoxin systems. When bound to their toxin partners, they can bind DNA via the N-terminus and repress the expression of operons containing genes encoding the toxin and the antitoxin. This domain complexes with Txe toxins containing pfam06769, Fic/DOC toxins containing pfam02661 and YafO toxins containing pfam13957. 67
45953 396940 pfam02605 PsaL Photosystem I reaction centre subunit XI. This family consists of the photosystem I reaction centre subunit XI, PsaL, from plants and bacteria. PsaL is one of the smaller subunits in photosystem I with only two transmembrane alpha helices and interacts closely with PsaI. 143
45954 396941 pfam02606 LpxK Tetraacyldisaccharide-1-P 4'-kinase. This family consists of tetraacyldisaccharide-1-P 4'-kinase also known as Lipid-A 4'-kinase or Lipid A biosynthesis protein LpxK, EC:2.7.1.130. This enzyme catalyzes the reaction: ATP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-(beta-D-1,6)-2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl beta-phosphate <=> ADP + 2,3,2',3'-tetrakis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6-beta-D-glucosamine 1,4'-bisphosphate. This enzyme is involved in the synthesis of the lipid A portion of the bacterial lipopolysaccharide layer (LPS). The family contains a P-loop motif at the N-terminus. 318
45955 396942 pfam02607 B12-binding_2 B12 binding domain. This B12 binding domain is found in methionine synthase EC:2.1.1.13, and other shorter proteins that bind to B12. This domain is always found to the N-terminus of pfam02310. The structure of this domain is known, it is a 4 helix bundle. Many of the conserved residues in this domain are involved in B12 binding, such as those in the MXXVG motif. 68
45956 396943 pfam02608 Bmp ABC transporter substrate-binding protein PnrA-like. Proteins containing this domain were originally annotated as basic membrane lipoproteins. However, several proteins containing this domain were later predicted as ABC transporter substrate-binding proteins, such as PnrA (also known as TmpC or TP0319) and RfuA (also known as Tpn38 or TP0298) from Treponema pallidum. PnrA transports purine nucleosides, while RfuA transports riboflavin. Proteins containing this domain also include Med from Bacillus subtilis. Med was annotated as a transcriptional activator protein that regulates comK. This domain can also be found at the N-terminus of glutamate receptor-like proteins from Dictyostelium (slime mold). 302
45957 396944 pfam02609 Exonuc_VII_S Exonuclease VII small subunit. This family consists of exonuclease VII, small subunit EC:3.1.11.6. This enzyme catalyzes exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones. 52
45958 396945 pfam02610 Arabinose_Isome L-arabinose isomerase. This is a family of L-arabinose isomerases, AraA, EC:5.3.1.4. These enzymes catalyze the reaction: L-arabinose <=> L-ribulose. This reaction is the first step in the pathway of L-arabinose utilisation as a carbon source: after entering the cell, L-arabinose is converted into L-ribulose by the L-arabinose isomerase enzyme. 356
45959 396946 pfam02611 CDH CDP-diacylglycerol pyrophosphatase. This is a family of CDP-diacylglycerol pyrophosphatases, EC:3.6.1.26. This enzyme catalyzes the reaction CDP-diacylglycerol + H2O <=> CMP + phosphatidate. 224
45960 396947 pfam02613 Nitrate_red_del Nitrate reductase delta subunit. This family is the delta subunit of the nitrate reductase enzyme. The delta subunit is not part of the nitrate reductase enzyme but is most likely needed for assembly of the multi-subunit enzyme complex. In the absence of the delta subunit the core alpha beta enzyme complex is unstable. The delta subunit is essential for enzyme activity in vivo and in vitro. The nitrate reductase enzyme, EC:1.7.99.4, catalyzes the conversion of nitrite to nitrate via the reduction of an acceptor. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen. This family also now contains the family TorD, a family of cytoplasmic chaperone proteins; like many prokaryotic molybdoenzymes, the TMAO reductase (TorA) of Escherichia coli requires the insertion of a bis(molybdopterin guanine dinucleotide) molybdenum (bis(MGD)Mo) cofactor in its catalytic site to be active and translocated to the periplasm. The TorD chaperone increases apoTorA activation up to four-fold, allowing maturation of most of the apoprotein. Therefore TorD is involved in the first step of TorA maturation to make it competent to receive the cofactor. 133
45961 396948 pfam02614 UxaC Glucuronate isomerase. This is a family of Glucuronate isomerases also known as D-glucuronate isomerase, uronic isomerase, uronate isomerase, or uronic acid isomerase, EC:5.3.1.12. This enzyme catalyzes the reactions: D-glucuronate <=> D-fructuronate and D-galacturonate <=> D-tagaturonate. It is not however clear where the experimental evidence for this functional assignment came from and thus this family has no literature reference. 464
45962 396949 pfam02615 Ldh_2 Malate/L-lactate dehydrogenase. This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively. 330
45963 280735 pfam02616 SMC_ScpA Segregation and condensation protein ScpA. This is a family of proteins that form part of the condensin complex that regulates chromosome segregation. This is the A subunit, which binds to the ScpB subunit, pfam04079, and SMC, pfam02463, to participate in chromosomal partition during cell division. The condensin complex pulls DNA away from the mid-cell into both cell halves. These proteins are part of the Kleisin superfamily. 225
45964 396950 pfam02617 ClpS ATP-dependent Clp protease adaptor protein ClpS. In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins. 80
45965 396951 pfam02618 YceG YceG-like family. This family of proteins is found in bacteria. Proteins in this family are typically between 332 and 389 amino acids in length. This family was previously incorrectly annotated and named as aminodeoxychorismate lyase. The structure of YceG was solved by X-ray crystallography. 274
45966 396952 pfam02620 DUF177 Uncharacterized ACR, COG1399. This family is nearly universally conserved in bacteria and plants except the Chlorophyceae algae. Thus far, mutational analyses in bacteria have not established a function. In contrast, mutants have embryo lethal phenotypes in maize and Arabidopsis. In maize, the mutant embryos arrest at an early transition stage. It has been suggested that family members specifically affect 23S rRNA accumulation in plastids as well as bacteria. 118
45967 396953 pfam02621 VitK2_biosynth Menaquinone biosynthesis. This family includes two enzymes which are involved in menaquinone biosynthesis. One which catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate, and one which may be involved in the conversion of chorismate to futalosine. These enzymes comprise two domains with alpha/beta structures, a large domain and a small domain. A pocket between the two domains may form the active site, a conserved histidine located within this pocket could be the catalytic base. 253
45968 396954 pfam02622 DUF179 Uncharacterized ACR, COG1678. 159
45969 396955 pfam02623 FliW FliW protein. The protein BSU35380 from Bacillus subtilis (renamed FliW) was characterized as being a flagellar assembly factor. Experimental characterization was also carried out in Treponema pallidum (TP0658). In Campylobacter jejuni, Cj1075 has been shown to be involved in motility and flagellin biosynthesis. The two paralogues in Helicobacter pylori (HP1154 and HP1377) were found to be able to bind to flagellin. FliW proteins are involved in flagellar assembly. FliW is part of a three-part feedback loop: in Bacillus subtilis FliW inhibits CsrA (an RNA-binding protein) which inhibits FliC translation; hence FliW is required for FliC (flagellin) production. 121
45970 396956 pfam02624 YcaO YcaO cyclodehydratase, ATP-ad Mg2+-binding. YcaO is an ATP- and Mg2+-binding protein involved in the peptidic biosynthesis of azoline. The three motifs involved in the binding are, in UniProtKB:P75838, 71-79: Sx3ExxER, 184-203: Sx6Ex3Qx3ExxER, and 286-290: RxxxE. Three slightly different functional families are represented in this family, proteins involved in TOMM (thiazole/oxazole-modified microcin) biogenesis, non-TOMM proteins such as UniProtKB:P75838, and TfuA-associated non-TOMM proteins involved in trifolitoxin biosynthesis. UniProtKB:P75838 hydrolyzes ATP to AMP and pyrophosphate. 319
45971 396957 pfam02625 XdhC_CoxI XdhC and CoxI family. This domain is often found in association with an NAD-binding region, related to TrkA-N (pfam02254; personal obs:C. Yeats). XdhC is believed to be involved in the attachment of molybdenum to Xanthine Dehydrogenase. 68
45972 396958 pfam02626 CT_A_B Carboxyltransferase domain, subdomain A and B. Urea carboxylase (UC) catalyzes a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the A and B subdomains of the CT domain. This domain covers the whole length of KipA (kinase A) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity. 263
45973 396959 pfam02627 CMD Carboxymuconolactone decarboxylase family. Carboxymuconolactone decarboxylase (CMD) EC:4.1.1.44 is involved in protocatechuate catabolism. In some bacteria a gene fusion event leads to expression of CMD with a hydrolase involved in the same pathway. In these bifunctional proteins CMD represents the C-terminal domain, pfam00561 represents the N-terminal domain. 84
45974 396960 pfam02628 COX15-CtaA Cytochrome oxidase assembly protein. This is a family of integral membrane proteins. CtaA is required for cytochrome aa3 oxidase assembly in Bacillus subtilis. COX15 is required for cytochrome c oxidase assembly in yeast. 322
45975 396961 pfam02629 CoA_binding CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. 97
45976 396962 pfam02630 SCO1-SenC SCO1/SenC. This family is involved in biogenesis of respiratory and photosynthetic systems. SCO1 is required for a post-translational step in the accumulation of subunits COXI and COXII of cytochrome c oxidase. SenC is required for optimal cytochrome c oxidase activity and maximal induction of genes encoding the light-harvesting and reaction centre complexes of R. capsulatus. 134
45977 396963 pfam02631 RecX RecX family. RecX is a putative bacterial regulatory protein. The gene encoding RecX is found downstream of recA, and is thought to interact with the RecA protein. 122
45978 396964 pfam02632 BioY BioY family. A number of bacterial genes are involved in bioconversion of pimelate into dethiobiotin. BioY is a component of the BioMNY transport system involved in biotin uptake in prokaryotes. 138
45979 396965 pfam02633 Creatininase Creatinine amidohydrolase. Creatinine amidohydrolase (EC:3.5.2.10), or creatininase, catalyzes the hydrolysis of creatinine to creatine. 226
45980 396966 pfam02634 FdhD-NarQ FdhD/NarQ family. A pan-bacterial lineage of proteins. Nitrate assimilation protein, NarQ, and FdhD are required for formate dehydrogenase activity. Structurally, they possess a deaminase fold with a characteristic binding pocket, suggesting that they might bind a nucleotide or related molecule allosterically to regulate the formate dehydrogenase catalytic subunit. 238
45981 396967 pfam02635 DrsE DsrE/DsrF-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. This family also includes DsrF. 118
45982 396968 pfam02636 Methyltransf_28 Putative S-adenosyl-L-methionine-dependent methyltransferase. This family is a putative S-adenosyl-L-methionine (SAM)-dependent methyltransferase. In eukaryotes it plays a role in mitochondrial complex I activity. 247
45983 396969 pfam02637 GatB_Yqey GatB domain. This domain is found in GatB. It is about 140 amino acid residues long. This domain is found at the C-terminus of GatB, which transamidates Glu-tRNA to Gln-tRNA. 148
45984 251441 pfam02638 GHL10 Glycosyl hydrolase-like 10. This is a family of bacterial glycosyl-hydrolase-like proteins falling into the family GHL10 as described above. 311
45985 396970 pfam02639 DUF188 Uncharacterized BCR, YaiI/YqxD family COG1671. 130
45986 396971 pfam02641 DUF190 Uncharacterized ACR, COG1993. 97
45987 396972 pfam02643 DUF192 Uncharacterized ACR, COG1430. Two structures have been solved for members of this large (>500 members) family of bacterial proteins present mostly in environmental bacteria and metagenomes (distant homologs are also present in several Plasmodium species). TOPSAN analysis for Structure 3pjy shows that there is much similarity with the other solved structure, Structure 3m7a, solved for UniProt:Q2GA55 (Saro_0823), a homolog of Thermotoga maritima TM1668, UniProt:Q9X1Z6. The homolog in Caulobacter crescentus (CC1388), UniProt:Q9A8G6, is associated with CspD, a cold shock protein (CC1387), UniProt:Q9A8G7. However, the genomic context of UniProt:Q2GA55 is most conserved with a putative xylose isomerase, suggesting a possible role in extracellular sugar processing. Saro_0821, UniProt:Q2GA57, is annotated as an AMP-dependent synthetase and ligase. Structure 3m7a corresponds to the C-terminal (27-165) fragment of the YP_496102 (Saro_0823) protein and it is structurally unique, as the best hits from Dali have a Z-score of 3.8 (1nt0, 2j1t, 3kq4) and it is thus a likely candidate for a new fold. Interestingly, many of the top Dali hits are involved in sugar metabolism. There are no obvious active site-like cavities on the protein surface of 3m7a (http://www.topsan.org/Proteins/JCSG/). 105
45988 396973 pfam02645 DegV Uncharacterized protein, DegV family COG1307. The structure of this protein revealed a bound fatty-acid molecule in a pocket between the two protein domains. The structure indicates that this family has the molecular function of fatty-acid binding and may play a role in the cellular functions of fatty acid transport or metabolism. 280
45989 396974 pfam02646 RmuC RmuC family. This family contains several bacterial RmuC DNA recombination proteins. The function of the RmuC protein is unknown but it is suspected that it is either a structural protein that protects DNA against nuclease action, or is itself involved in DNA cleavage at the regions of DNA secondary structures. 286
45990 396975 pfam02649 GCHY-1 Type I GTP cyclohydrolase folE2. This is a family of prokaryotic proteins with type I GTP cyclohydrolase activity. GTP cyclohydrolase I is the first enzyme of the de novo tetrahydrofolate biosynthetic pathway present in bacteria, fungi, and plants, and encoded in Escherichia coli by the folE gene; it is also the first enzyme of the biopterin (BH4) pathway in Homo sapiens. The invariate, highly conserved glutamate residue at position 216 in Neisseria gonorrhoeae FolE2 is likely to be the substrate ligand and the metal ligand is likely to be the cysteine at position 147. The enzyme is Zinc 2+ dependent. 262
45991 396976 pfam02650 HTH_WhiA WhiA C-terminal HTH domain. This domain is found at the C-terminus of the sporulation regulator WhiA. It is predicted to form a DNA-binding helix-turn-helix structure. The WhiA protein also contains two N-terminal domains that are distant homologs of LAGLIDADG homing endonucleases. 83
45992 280762 pfam02652 Lactate_perm L-lactate permease. L-lactate permease is an integral membrane protein probably involved in L-lactate transport. 522
45993 396977 pfam02653 BPD_transp_2 Branched-chain amino acid transport system / permease component. This is a large family mainly comprising high-affinity branched-chain amino acid transporter proteins such as E. coli LivH and LivM, both of which form part of the LIV-I transport system. Also found within this family are proteins from the galactose transport system permease and a ribose transport system. 269
45994 396978 pfam02654 CobS Cobalamin-5-phosphate synthase. This is a family of cobalamin-5-phosphate synthases, CobS, from bacteria. The CobS enzyme catalyzes the synthesis of AdoCbl-5'-p from AdoCbi-GDP and alpha-ribazole-5'-P. This enzyme is involved in the cobalamin (vitamin B12) biosynthesis pathway, in particular the nucleotide loop assembly stage in conjunction with CobC, CobU and CobT. 217
45995 396979 pfam02655 ATP-grasp_3 ATP-grasp domain. No functional information or experimental verification of function is known in this family. This family appears to be an ATP-grasp domain (Pers. obs. A Bateman). 160
45996 396980 pfam02656 DUF202 Domain of unknown function (DUF202). This family consists of hypothetical proteins some of which are putative membrane proteins. No functional information or experimental verification of function is known. This domain is around 100 amino acids long. 68
45997 396981 pfam02657 SufE Fe-S metabolism associated domain. This family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export. 119
45998 396982 pfam02659 Mntp Putative manganese efflux pump. MntP is a family of bacterial proteins with a signal peptide and four transmembrane domains. It is a putative manganese efflux pump, since deletion of the gene leads to profound manganese sensitivity and elevated intracellular manganese levels in bacteria. Manganese is a highly important trace nutrient for organisms from bacteria to humans, and acts as an important element in the defense against oxidative stress and as an enzyme cofactor. 152
45999 396983 pfam02660 G3P_acyltransf Glycerol-3-phosphate acyltransferase. This family of enzymes catalyzes the transfer of an acyl group from acyl-ACP to glycerol-3-phosphate to form lysophosphatidic acid. 174
46000 396984 pfam02661 Fic Fic/DOC family. This family consists of the Fic (filamentation induced by cAMP) protein and doc (death on curing). The Fic protein is involved in cell division and is suggested to be involved in the synthesis of PAB or folate, indicating that the Fic protein and cAMP are involved in a regulatory mechanism of cell division via folate metabolism. This family contains a central conserved motif HPFXXGNG in most members. The exact molecular function of these proteins is uncertain. P1 lysogens of Escherichia coli carry the prophage as a stable low copy number plasmid. The frequency with which viable cells cured of prophage are produced is about 10(-5) per cell per generation. A significant part of this remarkable stability can be attributed to a plasmid-encoded mechanism that causes death of cells that have lost P1. In other words, the lysogenic cells appear to be addicted to the presence of the prophage. The plasmid withdrawal response depends on a gene named doc (death on curing) that is represented by this family. Doc induces a reversible growth arrest of E. coli cells by targeting the protein synthesis machinery. Doc hosts the C-terminal domain of its antitoxin partner Phd (prevents host death) through fold complementation, a domain that is intrinsically disordered in solution but that folds into an alpha-helix on binding to Doc. This domain forms complexes with Phd antitoxins containing pfam02604. 94
46001 396985 pfam02662 FlpD Methyl-viologen-reducing hydrogenase, delta subunit. This family consists of methyl-viologen-reducing hydrogenase, delta subunit / heterodisulphide reductase. No specific functions have been assigned to this subunit. The aligned region corresponds to almost the entire delta chain sequence and contains 4 conserved cysteine residues. However, in two Archaeoglobus sequences this region corresponds to only the C-terminus of these proteins. 122
46002 396986 pfam02663 FmdE FmdE, Molybdenum formylmethanofuran dehydrogenase operon. This entry represents the FmdE protein that is encoded by the molybdenum formylmethanofuran dehydrogenase operon. FmdE does not co-purify with the molybdenum isozyme that is formed by FmdC and FmdB. The domain is typically found as a single copy, but is repeated two to three times in some sequences. It is also common for this domain to co-occur with a zinc-beta ribbon domain, suggesting that it may bind nucleic acid and be involved in transcription regulation. 89
46003 396987 pfam02664 LuxS S-Ribosylhomocysteinase (LuxS). This family consists of the LuxS protein involved in autoinducer AI2 synthesis and its hypothetical relatives. S-ribosylhomocysteinase (LuxS) catalyzes the cleavage of the thioether bond in S-ribosylhomocysteine (SRH) to produce homocysteine and 4,5-dihydroxy-2,3-pentanedione (DPD), the precursor of type II bacterial quorum sensing molecule. 154
46004 396988 pfam02665 Nitrate_red_gam Nitrate reductase gamma subunit. This family is the gamma subunit of the nitrate reductase enzyme, the gamma subunit is a b-type cytochrome that receives electrons from the quinone pool. It then transfers these via the iron-sulfur clusters of the beta subunit to the molybdenum cofactor found in the alpha subunit. The nitrate reductase enzyme, EC:1.7.99.4 catalyzes the conversion of nitrite to nitrate via the reduction of an acceptor. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen. 220
46005 396989 pfam02666 PS_Dcarbxylase Phosphatidylserine decarboxylase. This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyze the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine. 198
46006 280776 pfam02667 SCFA_trans Short chain fatty acid transporter. This family consists of two sequences annotated as short chain fatty acid transporters, however, there are no references giving details of experimental characterization of this function. 453
46007 367137 pfam02668 TauD Taurine catabolism dioxygenase TauD, TfdA family. This family consists of taurine catabolism dioxygenases of the TauD, TfdA family. TauD from E. coli is an alpha-ketoglutarate-dependent taurine dioxygenase. This enzyme catalyzes the oxygenolytic release of sulfite from taurine. TfdA from Burkholderia sp. is a 2,4-dichlorophenoxyacetic acid/alpha-ketoglutarate dioxygenase. TfdA from Alcaligenes eutrophus JMP134 is a 2,4-dichlorophenoxyacetate monooxygenase. Also included are gamma-Butyrobetaine hydroxylase enzymes EC:1.14.11.1. 264
46008 396990 pfam02669 KdpC K+-transporting ATPase, c chain. This family consists of K+-transporting ATPase, c chain, KdpC. KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilize the Kdp complex. It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA. The K+ transport system actively transports K+ ions via ATP hydrolysis. 179
46009 396991 pfam02670 DXP_reductoisom 1-deoxy-D-xylulose 5-phosphate reductoisomerase. This is a family of 1-deoxy-D-xylulose 5-phosphate reductoisomerases. This enzyme catalyzes the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH. This reaction is part of the terpenoid biosynthesis pathway. 127
46010 396992 pfam02671 PAH Paired amphipathic helix repeat. This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediates protein-protein interactions. 45
46011 396993 pfam02672 CP12 CP12 domain. The function of this domain is unknown, but it contains three conserved cysteines and a histidine, which suggests that it may be a zinc binding domain (Bateman A pers. observation). This domain is found associated with CBS domains (pfam00571) in some proteins. 68
46012 396994 pfam02673 BacA Bacitracin resistance protein BacA. Bacitracin resistance protein (BacA) is a putative undecaprenol kinase. BacA confers resistance to bacitracin, probably by phosphorylation of undecaprenol. More recent studies show that BacA has undecaprenyl pyrophosphate phosphatase activity. Undecaprenyl phosphate is a key lipid intermediate involved in the synthesis of various bacterial cell wall polymers. Bacitracin, a mixture of related cyclic polypeptide antibiotics, is used to treat surface tissue infections. Its primary mode of action is the inhibition of bacterial cell wall synthesis through sequestration of the essential carrier lipid undecaprenyl pyrophosphate, C55-PP, resulting in the loss of cell integrity and lysis. The characteristic phosphatase sequence-motif in this family is likely to be the PGxSRSGG, compared with the PSGH of the PAP family of phosphatases. 257
46013 396995 pfam02674 Colicin_V Colicin V production protein. Colicin V production protein is required in E. coli for colicin V production from plasmid pColV-K30. This protein is coded for in the purF operon. 144
46014 396996 pfam02675 AdoMet_dc S-adenosylmethionine decarboxylase. This family contains several S-adenosylmethionine decarboxylase proteins from bacterial and archaebacterial species. S-adenosylmethionine decarboxylase (AdoMetDC), a key enzyme in the biosynthesis of spermidine and spermine, is first synthesized as a proenzyme, which is cleaved post translationally to form alpha and beta subunits. The alpha subunit contains a covalently bound pyruvoyl group derived from serine that is essential for activity. 98
46015 396997 pfam02676 TYW3 Methyltransferase TYW3. The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3' position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner. 207
46016 396998 pfam02677 DUF208 Uncharacterized BCR, COG1636. 175
46017 396999 pfam02678 Pirin Pirin. This family consists of Pirin proteins from both eukaryotes and prokaryotes. The function of Pirin is unknown but the gene coding for this protein is known to be expressed in all tissues in the human body although it is expressed most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localized within the nucleoplasm and predominantly concentrated within dot-like subnuclear structures. A tomato homolog of human Pirin has been found to be induced during programmed cell death. Human Pirin interacts with Bcl-3 and NFI and hence is probably involved in the regulation of DNA transcription and replication. It appears to be an Fe(II)-containing member of the Cupin superfamily. 104
46018 397000 pfam02679 ComA (2R)-phospho-3-sulfolactate synthase (ComA). In methanobacteria (2R)-phospho-3-sulfolactate synthase (ComA) catalyzes the first step of the biosynthesis of coenzyme M from phosphoenolpyruvate (P-enolpyruvate). This novel enzyme catalyzes the stereospecific Michael addition of sulfite to P-enolpyruvate, forming L-2-phospho-3-sulfolactate (PSL). It is suggested that the ComA-catalyzed reaction is analogous to those reactions catalyzed by beta-elimination enzymes that proceed through an enolate intermediate. 238
46019 397001 pfam02680 DUF211 Uncharacterized ArCR, COG1888. 88
46020 397002 pfam02681 DUF212 Divergent PAP2 family. This family is related to the pfam01569 family (personal obs: C Yeats). 134
46021 397003 pfam02682 CT_C_D Carboxyltransferase domain, subdomain C and D. Urea carboxylase (UC) catalyzes a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the C and D subdomains of the CT domain. This domain covers the whole length of kipI (kinase A inhibitor) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity. 201
46022 280792 pfam02683 DsbD Cytochrome C biogenesis protein transmembrane region. This family consists of the transmembrane (i.e. non-catalytic) region of Cytochrome C biogenesis proteins also known as disulphide interchange proteins. These proteins possess a protein disulphide isomerase like domain that is not found within the aligned region of this family. 213
46023 397004 pfam02684 LpxB Lipid-A-disaccharide synthetase. This is a family of lipid-A-disaccharide synthetases, EC:2.4.2.128. These enzymes catalyze the reaction: UDP-2,3-bis(3-hydroxytetradecanoyl) glucosamine + 2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate <=> UDP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6 -beta-D-2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate. These enzymes catalyze the first disaccharide step in the synthesis of lipid-A-disaccharide. 374
46024 397005 pfam02685 Glucokinase Glucokinase. This is a family of glucokinases or glucose kinases EC:2.7.1.2. These enzymes phosphorylate glucose using ATP as a donor to give glucose-6-phosphate and ADP. 314
46025 397006 pfam02686 Glu-tRNAGln Glu-tRNAGln amidotransferase C subunit. This is a family of Glu-tRNAGln amidotransferase C subunits. The Glu-tRNAGln amidotransferase enzyme itself is an important translational fidelity mechanism replacing incorrectly charged Glu-tRNAGln with the correct Gln-tRNAGln via transamidation of the misacylated Glu-tRNAGln. This activity supplements the lack of glutaminyl-tRNA synthetase activity in gram-positive eubacteria, cyanobacteria, Archaea, and organelles. 70
46026 397007 pfam02687 FtsX FtsX-like permease family. This is a family of predicted permeases and hypothetical transmembrane proteins. Buchnera aphidicola LolC has been shown to transport lipids targeted to the outer membrane across the inner membrane. Both LolC and Streptococcus cristatus TptD have been shown to require ATP. This region contains three transmembrane helices. 96
46027 280797 pfam02689 Herpes_Helicase Helicase. This family consists of Helicases from the Herpes viruses. Helicases are responsible for the unwinding of DNA and are essential for replication and completion of the viral life cycle. 809
46028 397008 pfam02690 Na_Pi_cotrans Na+/Pi-cotransporter. This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. 137
46029 111576 pfam02691 VacA Vacuolating cytotoxin. This family consists of Vacuolating cytotoxin proteins from Proteobacteria. These proteins are an important virulence determinant in H. pylori and induce cytoplasmic vacuolation in a variety of mammalian cell lines. 1002
46030 111577 pfam02694 UPF0060 Uncharacterized BCR, YnfA/UPF0060 family. 107
46031 397009 pfam02696 UPF0061 Uncharacterized ACR, YdiU/UPF0061 family. 458
46032 397010 pfam02697 VAPB_antitox Putative antitoxin. Proteins in this family are possibly the antitoxin component of a VAPBC-like toxin-antitoxin (TA) module, which is widespread in both archaea and bacteria. 69
46033 397011 pfam02698 DUF218 DUF218 domain. This large family of proteins contains several highly conserved charged amino acids, suggesting this may be an enzymatic domain (Bateman A pers. obs). The family includes SanA, which is involved in Vancomycin resistance. This protein may be involved in murein synthesis. 137
46034 397012 pfam02699 YajC Preprotein translocase subunit. See. 78
46035 397013 pfam02700 PurS Phosphoribosylformylglycinamidine (FGAM) synthase. This family forms a component of the de novo purine biosynthesis pathway. 76
46036 397014 pfam02701 zf-Dof Dof domain, zinc finger. The Dof domain is a zinc finger DNA-binding domain, that shows resemblance to the Cys2 zinc finger. 57
46037 397015 pfam02702 KdpD Osmosensitive K+ channel His kinase sensor domain. This is a family of KdpD sensor kinase proteins that regulate the kdpFABC operon responsible for potassium transport. The aligned region corresponds to the N-terminal cytoplasmic part of the protein which may be the sensor domain responsible for sensing turgor pressure. 210
46038 308370 pfam02703 Adeno_E1A Early E1A protein. This is a family of adenovirus early E1A proteins. The E1A protein is 32 kDa; it can, however, be cleaved to yield the 28 kDa protein. The E1A protein is responsible for the transcriptional activation of the early genes within the viral genome at the start of the infection process as well as some cellular genes. 289
46039 397016 pfam02704 GASA Gibberellin regulated protein. This is the GASA gibberellin regulated cysteine rich protein family. The expression of these proteins is up-regulated by the plant hormone gibberellin; most of these proteins have some role in plant development. There are 12 cysteine residues conserved within the alignment, giving the potential for these proteins to possess 6 disulphide bonds. 60
46040 397017 pfam02705 K_trans K+ potassium transporter. This is a family of K+ potassium transporters that are conserved across phyla, having both bacterial (KUP), yeast (HAK), and plant (AtKT) sequences as members. 534
46041 367147 pfam02706 Wzz Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases. 74
46042 280810 pfam02707 MOSP_N Major Outer Sheath Protein N-terminal region. This is a family of spirochete major outer sheath protein N-terminal regions. These proteins are present on the bacterial cell surface. In T. denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola. 196
46043 397018 pfam02709 Glyco_transf_7C N-terminal domain of galactosyltransferase. This is the N-terminal domain of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferase activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin; in the absence of alpha-lactalbumin, EC:2.4.1.90 is the catalyzed reaction. 77
46044 397019 pfam02710 Hema_HEFG Hemagglutinin domain of haemagglutinin-esterase-fusion glycoprotein. 155
46045 367150 pfam02711 Pap_E4 E4 protein. This is a family of Papillomavirus proteins, E4, coded for by ORF4. A splice variant, E1--E4, exists but the function of neither E4 nor E1--E4 is known. 95
46046 397020 pfam02713 DUF220 Domain of unknown function DUF220. This family consists of a region in several Arabidopsis thaliana hypothetical proteins, none of which have any known function. The aligned region contains two cysteine residues. 73
46047 397021 pfam02714 RSN1_7TM Calcium-dependent channel, 7TM region, putative phosphate. RSN1_7TM is the seven transmembrane domain region of putative phosphate transporter. The family is the 7TM region of osmosensitive calcium-permeable cation channels. 273
46048 397022 pfam02718 Herpes_UL31 Herpesvirus UL31-like protein. This is a family of Herpesvirus proteins including UL31, UL53, and the product of ORF 69 in some strains. The proteins in this family have no known function. 251
46049 397023 pfam02719 Polysacc_synt_2 Polysaccharide biosynthesis protein. This is a family of diverse bacterial polysaccharide biosynthesis proteins including the CapD protein, WalL protein, mannosyl-transferase, and several putative epimerases (e.g. WbiI). 284
46050 397024 pfam02720 DUF222 Domain of unknown function (DUF222). This family is often found associated to the N-terminus of the HNH endonuclease domain pfam01844. The function of this domain is uncertain. This family has been called the 13E12 repeat family. 305
46051 145722 pfam02721 DUF223 Domain of unknown function DUF223. 95
46052 280819 pfam02722 MOSP_C Major Outer Sheath Protein C-terminal domain. This is a family of spirochete major outer sheath protein C-terminal regions. These proteins are present on the bacterial cell surface. In T. denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola. This domain forms an amphipathic beta rich structure with channel forming activity. 205
46053 397025 pfam02723 NS3_envE Non-structural protein NS3/Small envelope protein E. This is a family of small non-structural proteins, well conserved among Coronavirus strains. This protein is also found in murine hepatitis virus as small envelope protein E. 75
46054 397026 pfam02724 CDC45 CDC45-like protein. CDC45 is an essential gene required for initiation of DNA replication in S. cerevisiae, forming a complex with MCM5/CDC46. Homologs of CDC45 have been identified in human, mouse and smut fungus, among others. 539
46055 280822 pfam02725 Paramyxo_NS_C Non-structural protein C. This family consists of the polymerase accessory protein C from members of the paramyxoviridae. 164
46056 397027 pfam02727 Cu_amine_oxidN2 Copper amine oxidase, N2 domain. This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). 87
46057 397028 pfam02728 Cu_amine_oxidN3 Copper amine oxidase, N3 domain. This domain is the second or third structural domain in copper amine oxidases, it is known as the N3 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). 101
46058 397029 pfam02729 OTCace_N Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain. 140
46059 397030 pfam02730 AFOR_N Aldehyde ferredoxin oxidoreductase, N-terminal domain. Aldehyde ferredoxin oxidoreductase (AOR) catalyzes the reversible oxidation of aldehydes to their corresponding carboxylic acids with their accompanying reduction of the redox protein ferredoxin. This domain interacts with the tungsten cofactor. 200
46060 397031 pfam02731 SKIP_SNW SKIP/SNW domain. This domain is found in chromatin proteins. 152
46061 397032 pfam02732 ERCC4 ERCC4 domain. This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. 139
46062 397033 pfam02733 Dak1 Dak1 domain. This is the kinase domain of the dihydroxyacetone kinase family EC:2.7.1.29. 310
46063 397034 pfam02734 Dak2 DAK2 domain. This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. 175
46064 397035 pfam02735 Ku Ku70/Ku80 beta-barrel domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain. This domain is found in both the Ku70 and Ku80 proteins that form a DNA binding heterodimer. 193
46065 397036 pfam02736 Myosin_N Myosin N-terminal SH3-like domain. This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. 39
46066 397037 pfam02737 3HCDH_N 3-hydroxyacyl-CoA dehydrogenase, NAD binding domain. This family also includes lambda crystallin. 180
46067 397038 pfam02738 Ald_Xan_dh_C2 Molybdopterin-binding domain of aldehyde dehydrogenase. 541
46068 397039 pfam02739 5_3_exonuc_N 5'-3' exonuclease, N-terminal resolvase-like domain. 163
46069 397040 pfam02740 Colipase_C Colipase, C-terminal domain. SCOP reports duplication of common fold with Colipase N-terminal domain. 44
46070 397041 pfam02741 FTR_C FTR, proximal lobe. The FTR (Formylmethanofuran--tetrahydromethanopterin formyltransferase) enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. C-terminal proximal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with N-terminal distal lobe. 149
46071 397042 pfam02742 Fe_dep_repr_C Iron dependent repressor, metal binding and dimerization domain. This family includes the Diphtheria toxin repressor. 70
46072 397043 pfam02743 dCache_1 Cache domain. Double cache domain 1 covers the last three strands from the membrane distal PAS-like domain, the first two strands of the membrane proximal domain, and the connecting elements between the two domains. 238
46073 397044 pfam02744 GalP_UDP_tr_C Galactose-1-phosphate uridyl transferase, C-terminal domain. SCOP reports fold duplication with N-terminal domain. Both involved in Zn and Fe binding. 166
46074 397045 pfam02745 MCR_alpha_N Methyl-coenzyme M reductase alpha subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The N-terminal domain has a ferredoxin-like fold. 269
46075 397046 pfam02746 MR_MLE_N Mandelate racemase / muconate lactonizing enzyme, N-terminal domain. SCOP reports fold similarity with enolase N-terminal domain. 117
46076 280843 pfam02747 PCNA_C Proliferating cell nuclear antigen, C-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA. 128
46077 397047 pfam02748 PyrI_C Aspartate carbamoyltransferase regulatory chain, metal binding domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The C-terminal metal binding domain has a rubredoxin-like fold and provides the interface with the catalytic chain. 48
46078 397048 pfam02749 QRPTase_N Quinolinate phosphoribosyl transferase, N-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyzes the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The N-terminal domain has an alpha/beta hammerhead fold. 87
46079 308403 pfam02750 Synapsin_C Synapsin, ATP binding domain. Ca dependent ATP binding in this ATP grasp fold. Function unknown. 203
46080 397049 pfam02751 TFIIA_gamma_C Transcription initiation factor IIA, gamma subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The C-terminal domain of the gamma subunit is a 12 stranded beta-barrel. 43
46081 397050 pfam02752 Arrestin_C Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. 135
46082 397051 pfam02753 PapD_C Pili assembly chaperone PapD, C-terminal domain. Ig-like beta-sandwich fold. This domain is the C-terminal part of the pilus and flagellar-assembly chaperone protein PapD. 63
46083 397052 pfam02754 CCG Cysteine-rich domain. The key element of this family is the CX31-38CCX33-34CXXC sequence motif normally found at the C-terminus in archaeal and bacterial Hdr-like proteins. There may be one or two copies, and the motif is probably an iron-sulfur binding cluster. In some instances one of the cysteines is replaced by an aspartate, and aspartate can in principle also function as a ligand of an iron-sulfur cluster. The family includes a subunit from heterodisulphide reductase and a subunit from glycolate oxidase and glycerol-3-phosphate dehydrogenase. 84
46084 397053 pfam02755 RPEL RPEL repeat. The RPEL repeat is named after four conserved amino acids it contains. The RPEL motif binds to actin. 24
46085 367166 pfam02756 GYR GYR motif. The GYR motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues may suggest it could be a substrate for tyrosine kinases. 18
46086 367167 pfam02757 YLP YLP motif. The YLP motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues and its presence in human ERBB4 may suggest it could be a substrate for tyrosine kinases. 9
46087 397054 pfam02758 PYRIN PAAD/DAPIN/Pyrin domain. This domain is predicted to contain 6 alpha helices and to have the same fold as the pfam00531 domain. This similarity may mean that this is a protein-protein interaction domain. 75
46088 397055 pfam02759 RUN RUN domain. This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases. 129
46089 397056 pfam02760 HIN HIN-200/IF120x domain. This domain has no known function. It is found in one or two copies per protein, and is found associated with the PAAD/DAPIN domain pfam02758. 168
46090 397057 pfam02761 Cbl_N2 CBL proto-oncogene N-terminus, EF hand-like domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the central EF hand domain. 84
46091 397058 pfam02762 Cbl_N3 CBL proto-oncogene N-terminus, SH2-like domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the C-terminal SH2 domain. 80
46092 397059 pfam02763 Diphtheria_C Diphtheria toxin, C domain. N-terminal catalytic (C) domain - blocks protein synthesis by transfer of ADP-ribose from NAD to a diphthamide residue of EF-2. 187
46093 280860 pfam02764 Diphtheria_T Diphtheria toxin, T domain. Central domain of diphtheria toxin is the translocation (T) domain. pH induced conformational change in this domain triggers insertion into the endosomal membrane and facilitates the transfer of the catalytic domain into the cytoplasm. 180
46094 397060 pfam02765 POT1 Telomeric single stranded DNA binding POT1/CDC13. This domain binds single stranded telomeric DNA and adopts an OB fold. It includes the proteins POT1 and CDC13 which have been shown to regulate telomere length, replication and capping. POT1 is one component of the shelterin complex that protects telomere-ends from attack by DNA-repair mechanisms. 140
46095 397061 pfam02767 DNA_pol3_beta_2 DNA polymerase III beta subunit, central domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 115
46096 280863 pfam02768 DNA_pol3_beta_3 DNA polymerase III beta subunit, C-terminal domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 118
46097 397062 pfam02769 AIRS_C AIR synthase related protein, C-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The function of the C-terminal domain of AIR synthase is unclear, but the cleft formed between N and C domains is postulated as a sulphate binding site. 152
46098 397063 pfam02770 Acyl-CoA_dh_M Acyl-CoA dehydrogenase, middle domain. Central domain of Acyl-CoA dehydrogenase has a beta-barrel fold. 95
46099 397064 pfam02771 Acyl-CoA_dh_N Acyl-CoA dehydrogenase, N-terminal domain. The N-terminal domain of Acyl-CoA dehydrogenase is an all-alpha domain. 113
46100 397065 pfam02772 S-AdoMet_synt_M S-adenosylmethionine synthetase, central domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 119
46101 397066 pfam02773 S-AdoMet_synt_C S-adenosylmethionine synthetase, C-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 138
46102 397067 pfam02774 Semialdhyde_dhC Semialdehyde dehydrogenase, dimerization domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase. 167
46103 397068 pfam02775 TPP_enzyme_C Thiamine pyrophosphate enzyme, C-terminal TPP binding domain. 151
46104 397069 pfam02776 TPP_enzyme_N Thiamine pyrophosphate enzyme, N-terminal TPP binding domain. 169
46105 397070 pfam02777 Sod_Fe_C Iron/manganese superoxide dismutases, C-terminal domain. Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. The C-terminal domain is a mixed alpha/beta fold. 102
46106 397071 pfam02778 tRNA_int_endo_N tRNA intron endonuclease, N-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9. 67
46107 397072 pfam02779 Transket_pyr Transketolase, pyrimidine binding domain. This family includes transketolase enzymes, pyruvate dehydrogenases, and branched chain alpha-keto acid decarboxylases. 174
46108 397073 pfam02780 Transketolase_C Transketolase, C-terminal domain. The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site. 124
46109 397074 pfam02781 G6PD_C Glucose-6-phosphate dehydrogenase, C-terminal domain. 296
46110 397075 pfam02782 FGGY_C FGGY family of carbohydrate kinases, C-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the N-terminal domain. 197
46111 397076 pfam02783 MCR_beta_N Methyl-coenzyme M reductase beta subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The N-terminal domain has an alpha/beta ferredoxin-like fold. 182
46112 397077 pfam02784 Orn_Arg_deC_N Pyridoxal-dependent decarboxylase, pyridoxal binding domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. This domain has a TIM barrel fold. 241
46113 397078 pfam02785 Biotin_carb_C Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain. 107
46114 397079 pfam02786 CPSase_L_D2 Carbamoyl-phosphate synthase L chain, ATP binding domain. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesize carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. The ATP binding domain (this one) has an ATP-grasp fold. 209
46115 397080 pfam02787 CPSase_L_D3 Carbamoyl-phosphate synthetase large chain, oligomerization domain. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. 79
46116 397081 pfam02788 RuBisCO_large_N Ribulose bisphosphate carboxylase large chain, N-terminal domain. The N-terminal domain of RuBisCO large chain adopts a ferredoxin-like fold. 120
46117 397082 pfam02789 Peptidase_M17_N Cytosol aminopeptidase family, N-terminal domain. 127
46118 397083 pfam02790 COX2_TM Cytochrome C oxidase subunit II, transmembrane domain. The N-terminal domain of cytochrome C oxidase contains two transmembrane alpha-helices. 89
46119 397084 pfam02791 DDT DDT domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors) and is approximately 60 residues in length. Along with the WHIM motifs, it comprises an entirely alpha helical module found in diverse eukaryotic chromatin proteins. Based on the structure of Ioc3, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. In particular, the DDT domain, in combination with the WHIM1 and WHIM2 motifs form the SLIDE domain binding pocket. 58
46120 397085 pfam02792 Mago_nashi Mago nashi protein. This family was originally identified in Drosophila and called mago nashi; it is a strict maternal effect, grandchildless-like gene. The human homolog has been shown to interact with an RNA binding protein. An RNAi knockout of the C. elegans homolog causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination. Mago nashi has been found to be part of the exon-exon junction complex that binds 20 nucleotides upstream of exon-exon junctions. 131
46121 397086 pfam02793 HRM Hormone receptor domain. This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 64
46122 397087 pfam02794 HlyC RTX toxin acyltransferase family. Members of this family are enzymes (EC:2.3.1.-) involved in fatty acylation of the protoxins (HlyA) at lysine residues, thereby converting them to the active toxin. Acyl-acyl carrier protein (ACP) is the essential acyl donor. This family shows a number of conserved residues that are possible candidates for participation in acyl transfer. Site-directed mutagenesis of the single conserved histidine residue in Escherichia coli HlyC resulted in complete inactivation of the enzyme. 127
46123 397088 pfam02796 HTH_7 Helix-turn-helix domain of resolvase. 45
46124 397089 pfam02797 Chal_sti_synt_C Chalcone and stilbene synthases, C-terminal domain. This domain of chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in the N-terminal domain. 151
46125 397090 pfam02798 GST_N Glutathione S-transferase, N-terminal domain. Function: conjugation of reduced glutathione to a variety of targets. Also included in the alignment, but not GSTs: S-crystallins from squid (similarity to GST previously noted); eukaryotic elongation factors 1-gamma (not known to have GST activity and similarity not previously recognized); HSP26 family of stress-related proteins including auxin-regulated proteins in plants and stringent starvation proteins in E. coli (not known to have GST activity and similarity not previously recognized). The glutathione molecule binds in a cleft between the N- and C-terminal domains - the catalytically important residues are proposed to reside in the N-terminal domain. 76
46126 397091 pfam02799 NMT_C Myristoyl-CoA:protein N-myristoyltransferase, C-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. 193
46127 397092 pfam02800 Gp_dh_C Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. The C-terminal domain is a mixed alpha/antiparallel beta fold. 158
46128 397093 pfam02801 Ketoacyl-synt_C Beta-ketoacyl synthase, C-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. 118
46129 397094 pfam02803 Thiolase_C Thiolase, C-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase. 123
46130 397095 pfam02805 Ada_Zn_binding Metal binding domain of Ada. The Escherichia coli Ada protein repairs O6-methylguanine residues and methyl phosphotriesters in DNA by direct transfer of the methyl group to a cysteine residue. This domain contains four conserved cysteines that form a zinc binding site. One of these cysteines is a methyl group acceptor. The methylated domain can then specifically bind to the ada box on a DNA duplex. 62
46131 397096 pfam02806 Alpha-amylase_C Alpha amylase, C-terminal all-beta domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain. 93
46132 397097 pfam02807 ATP-gua_PtransN ATP:guanido phosphotransferase, N-terminal domain. The N-terminal domain has an all-alpha fold. 67
46133 397098 pfam02809 UIM Ubiquitin interaction motif. This motif is called the ubiquitin interaction motif. One of the proteins containing this motif is a receptor for poly-ubiquitination chains for the proteasome. This motif has a pattern of conservation characteristic of an alpha helix. 16
46134 397099 pfam02810 SEC-C SEC-C motif. The SEC-C motif is found in the C-terminus of the SecA protein, in the middle of some SWI2 ATPases and also solo in several proteins. The motif is predicted to chelate zinc with the CXC and C[HC] pairs that constitute the most conserved feature of the motif. It is predicted to be a potential nucleic acid binding domain. 19
46135 397100 pfam02811 PHP PHP domain. The PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain. 164
46136 397101 pfam02812 ELFV_dehydrog_N Glu/Leu/Phe/Val dehydrogenase, dimerization domain. 129
46137 280904 pfam02813 Retro_M Retroviral M domain. Retroviruses contain a small protein, MA (matrix), which forms a protein lining immediately beneath the phospholipid membrane of the mature virus particle. MA is located in the N-terminal region of the Gag precursor polyprotein. The N-terminal segment of MA proteins directs the Gag protein to the plasma membrane where budding takes place, and has been called the M domain. This domain forms an alpha helical bundle structure. 86
46138 397102 pfam02814 UreE_N UreE urease accessory protein, N-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. 62
46139 397103 pfam02815 MIR MIR domain. The MIR (protein mannosyltransferase, IP3R and RyR) domain is a domain that may have a ligand transferase function. 185
46140 397104 pfam02816 Alpha_kinase Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. 183
46141 397105 pfam02817 E3_binding e3 binding domain. This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit. 35
46142 397106 pfam02818 PPAK PPAK motif. These motifs are found in the PEVK region of titin. 27
46143 111689 pfam02819 Toxin_9 Spider toxin. This family of spider neurotoxins is thought to consist of calcium ion channel inhibitors. 43
46144 397107 pfam02820 MBT mbt repeat. The function of this repeat is unknown, but it is found in a number of nuclear proteins such as the Drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function. 68
46145 367201 pfam02821 Staphylokinase Staphylokinase/Streptokinase family. 117
46146 397108 pfam02822 Antistasin Antistasin family. Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. 26
46147 397109 pfam02823 ATP-synt_DE_N ATP synthase, Delta/Epsilon chain, beta-sandwich domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. The subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213). 79
46148 397110 pfam02824 TGS TGS domain. The TGS domain is named after ThrRS, GTPase, and SpoT. Interestingly, TGS domain was detected also at the amino terminus of the uridine kinase from the spirochaete Treponema pallidum (but not any other organism, including the related spirochaete Borrelia burgdorferi). TGS is a small domain that consists of ~50 amino acid residues and is predicted to possess a predominantly beta-sheet structure. There is no direct information on the functions of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role. 60
46149 397111 pfam02825 WWE WWE domain. The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. 64
46150 397112 pfam02826 2-Hacid_dh_C D-isomer specific 2-hydroxyacid dehydrogenase, NAD binding domain. This domain is inserted into the catalytic domain, the large dehydrogenase and D-lactate dehydrogenase families in SCOP. N-terminal portion of which is represented by family pfam00389. 176
46151 397113 pfam02827 PKI cAMP-dependent protein kinase inhibitor. Members of this family are extremely potent competitive inhibitors of cAMP-dependent protein kinase activity. These proteins interact with the catalytic subunit of the enzyme after the cAMP-induced dissociation of its regulatory chains. 69
46152 397114 pfam02828 L27 L27 domain. The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7. 52
46153 397115 pfam02829 3H 3H domain. This domain is predicted to be a small molecule binding domain, based on its occurrence with other domains. The domain is named after its three conserved histidine residues. 97
46154 367208 pfam02830 V4R V4R domain. The V4R (vinyl 4 reductase) domain is a predicted small molecular binding domain, that may bind to hydrocarbons. 62
46155 397116 pfam02831 gpW gpW. gpW is a 68 residue protein known to be present in phage particles. Extracts of phage-infected cells lacking gpW contain DNA-filled heads, and active tails, but no infectious virions. gpW is required for the addition of gpFII to the head, which is, in turn, required for the attachment of tails. Since gpFII and tails are known to be attached at the connector, gpW is also likely to assemble at this site. The addition of gpW to filled heads increases the DNase resistance of the packaged DNA, suggesting that gpW either forms a plug at the connector to prevent ejection of the DNA, or binds directly to the DNA. The large number of positively charged residues in gpW (its calculated pI is 10.8) is consistent with a role in DNA interaction. 62
46156 280922 pfam02832 Flavi_glycop_C Flavivirus glycoprotein, immunoglobulin-like domain. 97
46157 397117 pfam02833 DHHA2 DHHA2 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. 124
46158 397118 pfam02834 LigT_PEase LigT like Phosphoesterase. Members of this family are bacterial and archaeal RNA ligases that are able to ligate tRNA half molecules containing 2',3'-cyclic phosphate and 5' hydroxyl termini to products containing the 2',5' phosphodiester linkage. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. The structure of a related protein is known. They belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacteria and virus proteins. 87
46159 397119 pfam02836 Glyco_hydro_2_C Glycosyl hydrolases family 2, TIM barrel domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities. 302
46160 397120 pfam02837 Glyco_hydro_2_N Glycosyl hydrolases family 2, sugar binding domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold. The domain binds the sugar moiety during the sugar-hydrolysis reaction. 169
46161 397121 pfam02838 Glyco_hydro_20b Glycosyl hydrolase family 20, domain 2. This domain has a zincin-like fold. 123
46162 397122 pfam02839 CBM_5_12 Carbohydrate binding domain. This short domain is found in many different glycosyl hydrolase enzymes and is presumed to have a carbohydrate binding function. The domain has six aromatic groups that may be important for binding. 25
46163 397123 pfam02840 Prp18 Prp18 domain. The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles. 138
46164 397124 pfam02841 GBP_C Guanylate-binding protein, C-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP. 297
46165 397125 pfam02843 GARS_C Phosphoribosylglycinamide synthetase, C domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02787). 92
46166 397126 pfam02844 GARS_N Phosphoribosylglycinamide synthetase, N domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam00289). This domain is structurally related to the PreATP-grasp domain. 101
46167 397127 pfam02845 CUE CUE domain. CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. 42
46168 397128 pfam02847 MA3 MA3 domain. Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains. 113
46169 397129 pfam02852 Pyr_redox_dim Pyridine nucleotide-disulphide oxidoreductase, dimerization domain. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. 107
46170 397130 pfam02854 MIF4G MIF4G domain. MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. 203
46171 397131 pfam02861 Clp_N Clp amino terminal domain, pathogenicity island component. This short domain is found in one or two copies at the amino terminus of ClpA and ClpB proteins from bacteria and eukaryotes. The function of these domains is uncertain but they may form a protein binding site. In many bacterial species, including E. coli, this region represents the N-terminus of one of the key components of the pathogenicity island complex that injects toxin from one bacterium into another. 53
46172 397132 pfam02862 DDHD DDHD domain. The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family. 235
46173 397133 pfam02863 Arg_repressor_C Arginine repressor, C-terminal domain. 68
46174 397134 pfam02864 STAT_bind STAT protein, DNA binding domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. This family represents the DNA binding domain of STAT, which has an ig-like fold. STAT proteins also include an SH2 domain pfam00017. 133
46175 397135 pfam02865 STAT_int STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017. 119
46176 397136 pfam02866 Ldh_1_C lactate/malate dehydrogenase, alpha/beta C-terminal domain. L-lactate dehydrogenases are metabolic enzymes which catalyze the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyze the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. 173
46177 397137 pfam02867 Ribonuc_red_lgC Ribonucleotide reductase, barrel domain. 487
46178 397138 pfam02868 Peptidase_M4_C Thermolysin metallopeptidase, alpha-helical domain. 167
46179 397139 pfam02870 Methyltransf_1N 6-O-methylguanine DNA methyltransferase, ribonuclease-like domain. 77
46180 397140 pfam02872 5_nucleotid_C 5'-nucleotidase, C-terminal domain. 155
46181 397141 pfam02873 MurB_C UDP-N-acetylenolpyruvoylglucosamine reductase, C-terminal domain. Members of this family are UDP-N-acetylenolpyruvoylglucosamine reductase enzymes EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan. 99
46182 397142 pfam02874 ATP-synt_ab_N ATP synthase alpha/beta family, beta-barrel domain. This family includes the ATP synthase alpha and beta subunits and the ATP synthase associated with flagella. 69
46183 397143 pfam02875 Mur_ligase_C Mur ligase family, glutamate ligase domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl, and FolC. MurC, MurD, MurE and MurF catalyze consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these, N-acetylmuramic acid, attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, Meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesized by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate. 87
46184 397144 pfam02876 Stap_Strp_tox_C Staphylococcal/Streptococcal toxin, beta-grasp domain. 101
46185 397145 pfam02877 PARP_reg Poly(ADP-ribose) polymerase, regulatory domain. Poly(ADP-ribose) polymerase catalyzes the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 134
46186 397146 pfam02878 PGM_PMM_I Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain I. 138
46187 397147 pfam02879 PGM_PMM_II Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain II. 102
46188 397148 pfam02880 PGM_PMM_III Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain III. 114
46189 397149 pfam02881 SRP54_N SRP54-type protein, helical bundle domain. 75
46190 397150 pfam02882 THF_DHG_CYH_C Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain. 160
46191 397151 pfam02883 Alpha_adaptinC2 Adaptin C-terminal domain. Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins. 111
46192 397152 pfam02884 Lyase_8_C Polysaccharide lyase family 8, C-terminal beta-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 67
46193 397153 pfam02885 Glycos_trans_3N Glycosyl transferase family, helical bundle domain. This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate. 61
46194 397154 pfam02886 LBP_BPI_CETP_C LBP / BPI / CETP family, C-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar. 238
46195 397155 pfam02887 PK_C Pyruvate kinase, alpha/beta domain. As well as being found in pyruvate kinase this family is found as an isolated domain in some bacterial proteins. 114
46196 397156 pfam02888 CaMBD Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. 73
46197 397157 pfam02889 Sec63 Sec63 Brl domain. This domain (also known as the Brl domain) is required for assembly of functional endoplasmic reticulum translocons. 307
46198 308504 pfam02890 DUF226 Borrelia family of unknown function DUF226. This family of proteins is found in Borrelia. The proteins are about 190 amino acids long and have no known function. 139
46199 397158 pfam02891 zf-MIZ MIZ/SP-RING zinc finger. This domain has SUMO (small ubiquitin-like modifier) ligase activity and is involved in DNA repair and chromosome organisation. 50
46200 397159 pfam02892 zf-BED BED zinc finger. 44
46201 397160 pfam02893 GRAM GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. Note the alignment is lacking the last two beta strands and alpha helix. 112
46202 397161 pfam02894 GFO_IDH_MocA_C Oxidoreductase family, C-terminal alpha/beta domain. This family of enzymes utilize NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot. 204
46203 397162 pfam02895 H-kinase_dim Signal transducing histidine kinase, homodimeric domain. This helical bundle domain is the homodimer interface of the signal transducing histidine kinase family. 66
46204 397163 pfam02896 PEP-utilizers_C PEP-utilising enzyme, TIM barrel domain. 292
46205 397164 pfam02897 Peptidase_S9_N Prolyl oligopeptidase, N-terminal beta-propeller domain. This unusual 7-stranded beta-propeller domain protects the catalytic triad of prolyl oligopeptidase (see pfam00326), excluding larger peptides and proteins from proteolysis in the cytosol. 414
46206 397165 pfam02898 NO_synthase Nitric oxide synthase, oxygenase domain. 362
46207 397166 pfam02899 Phage_int_SAM_1 Phage integrase, N-terminal SAM-like domain. 84
46208 397167 pfam02900 LigB Catalytic LigB subunit of aromatic ring-opening dioxygenase. 260
46209 397168 pfam02901 PFL-like Pyruvate formate lyase-like. This family of enzymes includes pyruvate formate lyase, choline trimethylamine lyase, glycerol dehydratase, 4-hydroxyphenylacetate decarboxylase, and benzylsuccinate synthase. 646
46210 397169 pfam02902 Peptidase_C48 Ulp1 protease family, C-terminal catalytic domain. This domain contains the catalytic triad Cys-His-Asn. 202
46211 397170 pfam02903 Alpha-amylase_N Alpha amylase, N-terminal ig-like domain. 120
46212 397171 pfam02905 EBV-NA1 Epstein Barr virus nuclear antigen-1, DNA-binding domain. This domain has a ferredoxin-like fold. 139
46213 397172 pfam02906 Fe_hyd_lg_C Iron only hydrogenase large subunit, C-terminal domain. 277
46214 397173 pfam02907 Peptidase_S29 Hepatitis C virus NS3 protease. Hepatitis C virus NS3 protein is a serine protease which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. NS2-3 proteinase, a zinc-dependent enzyme, performs a single proteolytic cut to release the N-terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4A. 149
46215 397174 pfam02909 TetR_C Tetracyclin repressor, C-terminal all-alpha domain. 144
46216 397175 pfam02910 Succ_DH_flav_C Fumarate reductase flavoprotein C-term. This family contains fumarate reductases, succinate dehydrogenases and L-aspartate oxidases. 128
46217 397176 pfam02911 Formyl_trans_C Formyl transferase, C-terminal domain. 95
46218 397177 pfam02912 Phe_tRNA-synt_N Aminoacyl tRNA synthetase class II, N-terminal domain. 67
46219 397178 pfam02913 FAD-oxidase_C FAD linked oxidases, C-terminal domain. This domain has a ferredoxin-like fold. 248
46220 397179 pfam02914 DDE_2 Bacteriophage Mu transposase. 221
46221 397180 pfam02915 Rubrerythrin Rubrerythrin. This domain has a ferritin-like fold. 137
46222 397181 pfam02916 DNA_PPF DNA polymerase processivity factor. 115
46223 397182 pfam02917 Pertussis_S1 Pertussis toxin, subunit 1. 239
46224 280986 pfam02918 Pertussis_S2S3 Pertussis toxin, subunit 2 and 3, C-terminal domain. 109
46225 397183 pfam02919 Topoisom_I_N Eukaryotic DNA topoisomerase I, DNA binding fragment. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. This family may be more than one structural domain. 213
46226 397184 pfam02920 Integrase_DNA DNA binding domain of tn916 integrase. 58
46227 397185 pfam02921 UCR_TM Ubiquinol cytochrome reductase transmembrane region. Each subunit of the cytochrome bc1 complex provides a single helix (this family) to make up the transmembrane region of the complex. 66
46228 397186 pfam02922 CBM_48 Carbohydrate-binding module 48 (Isoamylase N-terminal domain). This domain is found in a range of enzymes that act on branched substrates - isoamylase, pullulanase and branching enzyme. This family also contains the beta subunit of 5' AMP activated kinase. 80
46229 280991 pfam02923 BamHI Restriction endonuclease BamHI. 157
46230 397187 pfam02924 HDPD Bacteriophage lambda head decoration protein D. 116
46231 280993 pfam02925 gpD Bacteriophage scaffolding protein D. 141
46232 397188 pfam02926 THUMP THUMP domain. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterized RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets. 143
46233 397189 pfam02927 CelD_N Cellulase N-terminal ig-like domain. 83
46234 397190 pfam02928 zf-C5HC2 C5HC2 zinc finger. Predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji. This domain may have a DNA binding function. 54
46235 397191 pfam02929 Bgal_small_N Beta galactosidase small chain. This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase. 230
46236 397192 pfam02931 Neur_chan_LBD Neurotransmitter-gated ion-channel ligand binding domain. This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 215
46237 397193 pfam02932 Neur_chan_memb Neurotransmitter-gated ion-channel transmembrane region. This family includes the four transmembrane helices that form the ion channel. 232
46238 397194 pfam02933 CDC48_2 Cell division protein 48 (CDC48), domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain. 64
46239 397195 pfam02934 GatB_N GatB/GatE catalytic domain. This domain is found in the GatB and GatE proteins. 283
46240 397196 pfam02935 COX7C Cytochrome c oxidase subunit VIIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII. 57
46241 397197 pfam02936 COX4 Cytochrome c oxidase subunit IV. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit IV. The Dictyostelium member of this family is called COX VI. The yeast protein MTC3 appears to be the yeast COX IV subunit. 132
46242 397198 pfam02937 COX6C Cytochrome c oxidase subunit VIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIc. 70
46243 397199 pfam02938 GAD GAD domain. This domain is found in some members of the GatB and aspartyl tRNA synthetases. 94
46244 397200 pfam02939 UcrQ UcrQ family. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This family represents the 9.5 kDa subunit of the complex. 76
46245 397201 pfam02940 mRNA_triPase mRNA capping enzyme, beta chain. The beta chain of mRNA capping enzyme has triphosphatase activity. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain (see pfam01331) 221
46246 397202 pfam02941 FeThRed_A Ferredoxin thioredoxin reductase variable alpha chain. 67
46247 281009 pfam02942 Flu_B_NS1 Influenza B non-structural protein (NS1). A specific region of the influenza B virus NS1 protein, which includes part of its effector domain, blocks the covalent linkage of ISG15 to its target proteins both in vitro and in infected cells. Of the several hundred proteins induced by interferon (IFN) alpha/beta, the ubiquitin-like ISG15 protein is one of the most predominant. Influenza A virus employs a different strategy: its NS1 protein does not bind the ISG15 protein, but little or no ISG15 protein is produced during infection. 247
46248 397203 pfam02943 FeThRed_B Ferredoxin thioredoxin reductase catalytic beta chain. 106
46249 397204 pfam02944 BESS BESS motif. The BESS motif is named after the proteins in which it is found (BEAF, Suvar(3)7 and Stonewall). The motif is 40 amino acid residues long and is composed of two predicted alpha helices. Based on the protein in which it is found and the presence of conserved positively charged residues it is predicted to be a DNA binding domain. This domain appears to be specific to drosophila. 35
46250 397205 pfam02945 Endonuclease_7 Recombination endonuclease VII. 82
46251 397206 pfam02946 GTF2I GTF2I-like repeat. This region of sequence similarity is found up to six times in a variety of proteins including GTF2I. It has been suggested that this may be a DNA binding domain. 75
46252 281014 pfam02947 Flt3_lig flt3 ligand. The flt3 ligand is a short chain cytokine with a 4 helical bundle fold. 131
46253 397207 pfam02948 Amelogenin Amelogenin. Amelogenins play a role in biomineralisation. They seem to regulate the formation of crystallites during the secretory stage of tooth enamel development. They are thought to play a major role in the structural organisation and mineralisation of developing enamel. They are found in the extracellular matrix. Mutations in X-chromosomal amelogenin can cause Amelogenesis imperfecta. 173
46254 251636 pfam02949 7tm_6 7tm Odorant receptor. This family is composed of 7 transmembrane receptors, that are probably drosophila odorant receptors. 313
46255 367269 pfam02950 Conotoxin Conotoxin. Conotoxins are small snail toxins that block ion channels. 74
46256 397208 pfam02951 GSH-S_N Prokaryotic glutathione synthetase, N-terminal domain. 116
46257 397209 pfam02952 Fucose_iso_C L-fucose isomerase, C-terminal domain. 142
46258 397210 pfam02953 zf-Tim10_DDP Tim10/DDP family zinc finger. Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localized to the mitochondrial intermembrane space. 62
46259 397211 pfam02954 HTH_8 Bacterial regulatory protein, Fis family. 41
46260 397212 pfam02955 GSH-S_ATP Prokaryotic glutathione synthetase, ATP-grasp domain. 175
46261 367272 pfam02956 TT_ORF1 TT viral orf 1. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF1 is a large 750 residue protein. The N-terminal half of this protein corresponds to the capsid protein. 526
46262 251643 pfam02957 TT_ORF2 TT viral ORF2. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals, and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF2 is a 150 residue protein. This family also includes the VP2 protein from the chicken anaemia virus which is a gyrovirus. Gyroviruses are small circular single stranded viruses. The proteins contain a set of conserved cysteine and histidine residues suggesting a zinc binding domain. 103
46263 397213 pfam02958 EcKinase Ecdysteroid kinase. This family includes ecdysteroid 22-kinase, an enzyme responsible for the phosphorylation of ecdysteroids (insect growth and moulting hormones) at C-22, to form physiologically inactive ecdysteroid 22-phosphates. 293
46264 281024 pfam02959 Tax HTLV Tax. Human T-cell leukaemia virus type I (HTLV-I) is the etiological agent for adult T-cell leukaemia (ATL), as well as for tropical spastic paraparesis (TSP) and HTLV-I associated myelopathy (HAM). A biological understanding of the involvement of HTLV-I in ATL has focused significantly on the workings of the virally-encoded 40 kDa phospho-oncoprotein, Tax. Tax is a transcriptional activator. Its ability to modulate the expression and function of many cellular genes has been reasoned to be a major contributory mechanism explaining HTLV-I-mediated transformation of cells. In activating cellular gene expression, Tax impinges upon several cellular signal-transduction pathways, including those for CREB/ATF and NF-kappaB. 222
46265 281025 pfam02960 K1 K1 glycoprotein. 120
46266 397214 pfam02961 BAF Barrier to autointegration factor. The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. 86
46267 397215 pfam02962 CHMI 5-carboxymethyl-2-hydroxymuconate isomerase. 124
46268 397216 pfam02963 EcoRI Restriction endonuclease EcoRI. 257
46269 281029 pfam02964 MeMO_Hyd_G Methane monooxygenase, hydrolase gamma chain. 161
46270 397217 pfam02965 Met_synt_B12 Vitamin B12 dependent methionine synthase, activation domain. 273
46271 397218 pfam02966 DIM1 Mitosis protein DIM1. 133
46272 281032 pfam02969 TAF TATA box binding protein associated factor (TAF). TAF proteins adopt a histone-like fold. 66
46273 397219 pfam02970 TBCA Tubulin binding cofactor A. 88
46274 397220 pfam02971 FTCD Formiminotransferase domain. 144
46275 145886 pfam02972 Phycoerythr_ab Phycoerythrin, alpha/beta chain. This family represents the non-globular alpha and beta chain components of phycoerythrin. The structure is a long beta-hairpin and a single alpha-helix. 57
46276 281035 pfam02973 Sialidase Sialidase, N-terminal domain. 188
46277 397221 pfam02974 Inh Protease inhibitor Inh. The Inh inhibitor is secreted into the periplasm where its presumed physiological function is to protect periplasmic proteins against the action of secreted proteases. A range of proteases including A, B and C from E. chrysanthemi, alkaline protease from Pseudomonas aeruginosa and the 50 kDa protease from Serratia marcescens are inhibited. 95
46278 397222 pfam02975 Me-amine-dh_L Methylamine dehydrogenase, L chain. 113
46279 397223 pfam02976 MutH DNA mismatch repair enzyme MutH. 103
46280 397224 pfam02977 CarbpepA_inh Carboxypeptidase A inhibitor. 40
46281 397225 pfam02978 SRP_SPB Signal peptide binding domain. 95
46282 397226 pfam02979 NHase_alpha Nitrile hydratase, alpha chain. 178
46283 397227 pfam02980 FokI_C Restriction endonuclease FokI, catalytic domain. 136
46284 397228 pfam02981 FokI_N Restriction endonuclease FokI, recognition domain. 135
46285 202497 pfam02982 Scytalone_dh Scytalone dehydratase. Scytalone dehydratases are structurally related to the NTF2 family (see pfam02136). 160
46286 397229 pfam02983 Pro_Al_protease Alpha-lytic protease prodomain. 57
46287 397230 pfam02984 Cyclin_C Cyclin, C-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). Human CCNO is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds with the C-terminal domain. 119
46288 397231 pfam02985 HEAT HEAT repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). 31
46289 397232 pfam02986 Fn_bind Fibronectin binding repeat. The ability of bacteria to bind fibronectin is thought to enable the colonisation of wound tissue and blood clots. The fibronectin binding repeat is found in bacterial fibronectin binding proteins and serum opacity factor. Bacterial fibronectin binding proteins are surface proteins that covalently link to the bacterial cell wall, mediate adherence of the bacteria to host cells and trigger the fibronectin/integrin-mediated uptake of bacteria by host cells. Each fibronectin binding repeat is an array of short motifs that bind to fibronectin type I domains. Fibronectin binding repeats are natively unfolded in the absence of fibronectin and are thought to adopt a well-defined conformation (tandem beta-zipper) upon binding. 33
46290 111833 pfam02987 LEA_4 Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. 44
46291 190495 pfam02988 PLA2_inh Phospholipase A2 inhibitor. 83
46292 281047 pfam02989 DUF228 Lyme disease proteins of unknown function. 182
46293 397233 pfam02990 EMP70 Endomembrane protein 70. 509
46294 281049 pfam02991 Atg8 Autophagy protein Atg8 ubiquitin like. Light chain 3 is proposed to function primarily as a subunit of microtubule associated proteins 1A and 1B and that its expression may regulate microtubule binding activity. Autophagy is generally known as a process involved in the degradation of bulk cytoplasmic components that are non-specifically sequestered into an autophagosome, where they are sequestered into double-membrane vesicles and delivered to the degradative organelle, the lysosome/vacuole, for breakdown and eventual recycling of the resulting macromolecules. The yeast proteins are involved in the autophagosome, and Atg8 binds Atg19, via its N-terminus and the C-terminus of Atg19. 104
46295 397234 pfam02992 Transposase_21 Transposase family tnp2. 211
46296 367287 pfam02993 MCPVI Minor capsid protein VI. This minor capsid protein may act as a link between the external capsid and the internal DNA-protein core. The C-terminal 11 residues may function as a protease cofactor leading to enzyme activation. 225
46297 397235 pfam02994 Transposase_22 L1 transposable element RBD-like domain. This entry represents the RBD-like domain. 98
46298 397236 pfam02995 DUF229 Protein of unknown function (DUF229). Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. 496
46299 397237 pfam02996 Prefoldin Prefoldin subunit. This family comprises several prefoldin subunits. The biogenesis of the cytoskeletal proteins actin and tubulin involves interaction of nascent chains of each of the two proteins with the oligomeric protein prefoldin (PFD) and their subsequent transfer to the cytosolic chaperonin CCT (chaperonin containing TCP-1). Electron microscopy shows that eukaryotic PFD, which has a similar structure to its archaeal counterpart, interacts with unfolded actin along the tips of its projecting arms. In its PFD-bound state, actin seems to acquire a conformation similar to that adopted when it is bound to CCT. 118
46300 111843 pfam02998 Lentiviral_Tat Lentiviral Tat protein. This family contains retroviral transactivating (Tat) proteins, from a variety of Lentiviruses. 86
46301 308571 pfam02999 Borrelia_orfD Borrelia orf-D family. Borrelia burgdorferi supercoiled plasmids encode multicopy tandem open reading frames called Orf-A, Orf-B, Orf-C and Orf-D. This family corresponds to Orf-D. The putative product of this gene has no known function. 100
46302 397238 pfam03000 NPH3 NPH3 family. Phototropism of Arabidopsis thaliana seedlings in response to a blue light source is initiated by nonphototropic hypocotyl 1 (NPH1), a light-activated serine-threonine protein kinase. Mutations in NPH3 disrupt early signaling occurring downstream of the NPH1 photoreceptor. The NPH3 gene encodes a NPH1-interacting protein. NPH3 is a member of a large protein family, apparently specific to higher plants, and may function as an adapter or scaffold protein to bring together the enzymatic components of a NPH1-activated phosphorelay. 219
46303 367290 pfam03002 Somatostatin Somatostatin/Cortistatin family. Members of this family are hormones. Somatostatin inhibits the release of somatotropin. Cortistatin is a peptide related to the somatostatins that depresses neuronal electrical activity but, unlike somatostatin, induces low-frequency waves in the cerebral cortex and antagonizes the effects of acetylcholine on hippocampal and cortical measures of excitability. 18
46304 281058 pfam03003 Pox_G9-A16 Pox virus entry-fusion-complex G9/A16. Pox_G9-A16 is a family of two of the eight entry-fusion complex proteins of pox viruses. The viral fusion proteins are components of the mature virion, MV, membrane. Extracellular enveloped virions (EVs), the infecting particles, are MVs with an additional membrane that is opened or removed prior to the fusion of the MV and cell membrane during virus entry. G9 and A16 interact closely with each other and each is required for membrane fusion and virus entry as well as for interaction with A56/K2. 128
46305 367291 pfam03004 Transposase_24 Plant transposase (Ptta/En/Spm family). Transposase proteins are necessary for efficient DNA transposition. This family includes various plant transposases from the Ptta and En/Spm families. 137
46306 397239 pfam03006 HlyIII Haemolysin-III related. Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea. This family belongs to the CREST superfamily, which are distantly related to GPCRs. 222
46307 281060 pfam03007 WES_acyltransf Wax ester synthase-like Acyl-CoA acyltransferase domain. This domain is found in wax ester synthase genes. In these proteins this domain catalyzes the CoA dependent acyltransferase reaction with fatty alcohols to form wax esters. 261
46308 397240 pfam03008 DUF234 Archaea bacterial proteins of unknown function. 91
46309 397241 pfam03009 GDPD Glycerophosphoryl diester phosphodiesterase family. E. coli has two sequence related isozymes of glycerophosphoryl diester phosphodiesterase (GDPD) - periplasmic and cytosolic. This family also includes agrocinopine synthase, the similarity to GDPD has been noted. This family appears to have weak but not significant matches to mammalian phospholipase C pfam00388, which suggests that this family may adopt a TIM barrel fold. 244
46310 281063 pfam03010 GP4 GP4. GP4 is a minor membrane-associated glycoprotein. This family contains envelope protein GP4 from equine arteritis virus. 152
46311 281064 pfam03011 PFEMP PFEMP DBL domain. PfEMP1 (Plasmodium falciparum erythrocyte membrane protein) has been identified as the rosetting ligand of the malaria parasite P. falciparum. Rosetting is the adhesion of infected erythrocytes with uninfected erythrocytes in the vasculature of the infected organ, and is associated with severe malaria. PfEMP1 interacts with Complement Receptor One on uninfected erythrocytes to form rosettes. The extreme variation within these proteins and the grouping of var genes implies that var gene recombination preferentially occurs within var gene groups. These groups reflect a functional diversification that has evolved to cope with the varying conditions of transmission and host immune response met by the parasite. A recombination hotspot was uncovered between Duffy-binding-like (DBL) subdomains. Solution of the crystal structure of the N-terminal and first DBL region of PfEMP1 from the VarO variant of the PfEMP1 protein is found to be directly implicated in rosetting as the heparin-binding site. 154
46312 397242 pfam03012 PP_M1 Phosphoprotein. This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit, which is thought to be a component of the active polymerase, and may be involved in template binding. 296
46313 397243 pfam03013 Pyr_excise Pyrimidine dimer DNA glycosylase. Pyrimidine dimer DNA glycosylases excise pyrimidine dimers by hydrolysis of the glycosylic bond of the 5' pyrimidine, followed by the intra-pyrimidine phosphodiester bond. Pyrimidine dimers are the major UV-lesions of DNA. 81
46314 367296 pfam03014 SP2 Structural protein 2. This family represents structural protein 2 of the hepatitis E virus. The high basic amino acid content of this protein has led to the suggestion of a role in viral genomic RNA encapsidation. 709
46315 397244 pfam03015 Sterile Male sterility protein. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included. 92
46316 397245 pfam03016 Exostosin Exostosin family. The EXT family is a family of tumor suppressor genes. Mutations of EXT1 on 8q24.1, EXT2 on 11p11-13, and EXT3 on 19p have been associated with the autosomal dominant disorder known as hereditary multiple exostoses (HME). This is the most common known skeletal dysplasia. The chromosomal locations of other EXT genes suggest association with other forms of neoplasia. EXT1 and EXT2 have both been shown to encode a heparan sulphate polymerase with both D-glucuronyl (GlcA) and N-acetyl-D-glucosaminoglycan (GlcNAC) transferase activities. The nature of the defect in heparan sulphate biosynthesis in HME is unclear. 290
46317 397246 pfam03017 Transposase_23 TNP1/EN/SPM transposase. 64
46318 397247 pfam03018 Dirigent Dirigent-like protein. This family contains a number of proteins which are induced during disease response in plants. Members of this family are involved in lignification. 144
46319 397248 pfam03020 LEM LEM domain. The LEM domain is 50 residues long and is composed of two parallel alpha helices. This domain is found in inner nuclear membrane proteins. It is called the LEM domain after LAP2, Emerin, and Man1. 40
46320 281073 pfam03021 CM2 Influenza C virus M2 protein. Influenza C virus M1 protein is encoded by a spliced mRNA. The unspliced mRNA is also found in small quantities and can encode the protein represented by this family. 139
46321 308585 pfam03022 MRJP Major royal jelly protein. Royal jelly is the food of queen bee larvae, and is responsible for the high reproductive ability of the queen. Major royal jelly proteins make up around 90% of larval jelly proteins. This family also includes the sequence-related yellow protein of drosophila which controls pigmentation of the adult cuticle and larval mouth parts. 288
46322 397249 pfam03023 MVIN MviN-like protein. Deletion of the mviN virulence gene in Salmonella enterica serovar Typhimurium greatly reduces virulence in a mouse model of typhoid-like disease. Open reading frames encoding homologs of MviN have since been identified in a variety of bacteria, including pathogens and non-pathogens and plant-symbionts. In the nitrogen-fixing symbiont Rhizobium tropici, mviN is required for motility. The MviN protein is predicted to be membrane-associated. 451
46323 397250 pfam03024 Folate_rec Folate receptor family. This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges. 172
46324 145918 pfam03025 Papilloma_E5 Papillomavirus E5. The E5 protein from papillomaviruses is about 80 amino acids long. The proteins contain three regions that are predicted to be transmembrane alpha helices. The function of this protein is unknown. 72
46325 281077 pfam03026 CM1 Influenza C virus M1 protein. This family represents the matrix 1 protein of influenza C virus. The protein is the product of a spliced mRNA. Small quantities of the unspliced mRNA are found in the cell additionally encoding the M2 protein (see pfam03021). 235
46326 397251 pfam03028 Dynein_heavy Dynein heavy chain and region D6 of dynein motor. This family represents the C-terminal region of dynein heavy chain. The chain also contains ATPase activity and microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules. Dynein is also involved in cilia and flagella movement. The dynein subunit consists of at least two heavy chains and a number of intermediate and light chains. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This C-terminal domain carries the D6 region of the dynein motor where the P-loop has been lost in evolution but the general structure of a potential ATP binding site appears to be retained. 113
46327 397252 pfam03029 ATP_bind_1 Conserved hypothetical ATP binding protein. Members of this family are found in a range of archaea and eukaryotes and have hypothesized ATP binding activity. 238
46328 397253 pfam03030 H_PPase Inorganic H+ pyrophosphatase. The H+ pyrophosphatase is a transmembrane proton pump involved in establishing the H+ electrochemical potential difference between the vacuole lumen and the cell cytosol. Vacuolar-type H(+)-translocating inorganic pyrophosphatases have long been considered to be restricted to plants and to a few species of photo-trophic bacteria. However, in recent investigations, these pyrophosphatases have been found in organisms as disparate as thermophilic Archaea and parasitic protists. 663
46329 397254 pfam03031 NIF NLI interacting factor-like phosphatase. This family contains a number of NLI interacting factor isoforms and also the N-terminal regions of RNA polymerase II CTD phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. 160
46330 281082 pfam03032 FSAP_sig_propep Frog skin active peptide family signal and propeptide. This family contains a number of defense peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins. The alignment for this family consists of the signal peptide and propeptide regions and does not include the active peptides. 46
46331 397255 pfam03033 Glyco_transf_28 Glycosyltransferase family 28 N-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). This N-terminal domain contains the acceptor binding site and likely membrane association site. This family also contains a large number of proteins that probably have quite distinct activities. 139
46332 397256 pfam03034 PSS Phosphatidyl serine synthase. Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II, which are membrane bound proteins that catalyze the replacement of the head group of a phospholipid (phosphatidylcholine or phosphatidylethanolamine) by L-serine. 273
46333 367308 pfam03035 RNA_capsid Calicivirus putative RNA polymerase/capsid protein. 226
46334 397257 pfam03036 Perilipin Perilipin family. The perilipin family includes lipid droplet-associated protein (perilipin) and adipose differentiation-related protein (adipophilin). 403
46335 367310 pfam03037 KMP11 Kinetoplastid membrane protein 11. Kinetoplastid membrane protein 11 is a major cell surface glycoprotein of the parasite Leishmania donovani. 90
46336 281088 pfam03038 Herpes_UL95 UL95 family. Members of this family, found in several herpesviruses, include EBV BGLF3 and other UL95 proteins (e.g. HCMV UL95, HVS-1 34, HSV6 U67). Their function is unknown. 319
46337 397258 pfam03039 IL12 Interleukin-12 alpha subunit. Interleukin 12 (IL-12) is a disulphide-bonded heterodimer consisting of a 35kDa alpha subunit and a 40kDa beta subunit. It is involved in the stimulation and maintenance of Th1 cellular immune responses, including the normal host defense against various intracellular pathogens, such as Leishmania, Toxoplasma, measles virus and HIV. IL-12 also has an important role in pathological Th1 responses, such as in inflammatory bowel disease and multiple sclerosis. Suppression of IL-12 activity in such diseases may have therapeutic benefit. On the other hand, administration of recombinant IL-12 may have therapeutic benefit in conditions associated with pathological Th2 responses. 214
46338 397259 pfam03040 CemA CemA family. Members of this family are probable integral membrane proteins. Their molecular function is unknown. CemA proteins are found in the inner envelope membrane of chloroplasts but not in the thylakoid membrane. A cyanobacterial member of this family has been implicated in CO2 transport, but is probably not a CO2 transporter itself. They are predicted to be haem-binding however this has not been proven experimentally. 228
46339 281091 pfam03041 Baculo_LEF-2 lef-2. The lef-2 gene (for late expression factor 2) from baculovirus is required for expression of late genes. This gene has been shown to be specifically required for expression from the vp39 and polh promoters. LEF-1 is a DNA primase and there is some evidence to suggest that LEF-2 may bind to both DNA and LEF-1. 184
46340 281092 pfam03042 Birna_VP5 Birnavirus VP5 protein. Birnaviruses are ds RNA viruses. Non structural protein VP5 is found in RNA segment A. The function of this small viral protein is unknown. The proteins are about 150 amino acids long and contain several conserved histidines and cysteines that might form a zinc binding site (Bateman A pers. obs.). 139
46341 281093 pfam03043 Herpes_UL87 Herpesvirus UL87 family. Members of this family are functionally uncharacterized. This family groups together EBV BcRF1, HSV-6 U58, HVS-1 24 and HCMV UL87. The proteins range from 575 to 950 amino acids in length. 523
46342 281094 pfam03044 Herpes_UL16 Herpesvirus UL16/UL94 family. This family groups together HSV-1 UL16, HSV-6 ORF11R, EHV-1 46, HCMV UL94, EBV BGLF2 and VZV 44. UL16 protein may play a role in capsid maturation including DNA packaging/cleavage. In immunofluorescence studies, UL16 was localized to the nucleus of infected cells in areas containing high concentrations of HSV capsid proteins. These nuclear compartments have been described previously as viral assemblons and are distinct from compartments containing replicating DNA. Localization within assemblons argues for a role of the UL16-encoded protein in capsid assembly or maturation. 328
46343 397260 pfam03045 DAN DAN domain. This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonize different TGF beta pfam00019 ligands. Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals; however, DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo. 108
46344 335196 pfam03047 ComC COMC family. This family consists exclusively of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double-glycine type. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine. A few COMC proteins and their precursors (not included in this family) do not fully follow the above description. In particular: the leader sequence in the CSP precursor from Streptococcus sanguis NCTC 7863 is not of the double-glycine type; the CSP from Streptococcus gordonii NCTC 3165 does not have a negatively charged N-terminus residue and has a lysine instead of arginine at the third position. Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. In streptococci, the (CSP mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic-phase of Bacillus. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognized by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching is not definitively determined. Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species. The family Antibacterial16, streptolysins from group A streptococci, has been merged into this family. 31
46345 281097 pfam03048 Herpes_UL92 UL92 family. Members of this family, found in several herpesviruses, include EBV BDLF4, HCMV UL92, HHV8 31, HSV6 U63. Their function is unknown. The N-terminus of this protein contains 6 conserved cysteines and histidines that might form a zinc binding domain (A Bateman pers. obs.). 189
46346 281098 pfam03049 Herpes_UL79 UL79 family. Members of this family are functionally uncharacterized proteins from herpesviruses. This family groups together HSV-6 U52, HVS-1 18 and HCMV UL79. 254
46347 397261 pfam03050 DDE_Tnp_IS66 Transposase IS66 family. Transposase proteins are necessary for efficient DNA transposition. This family includes IS66 from Agrobacterium tumefaciens. 282
46348 397262 pfam03051 Peptidase_C1_2 Peptidase C1-like family. This family is closely related to the Peptidase_C1 family pfam00112, containing several prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases. 438
46349 308597 pfam03052 Adeno_52K Adenoviral protein L1 52/55-kDa. The adenoviral protein L1 52/55-kDa is expressed in both the early and late stages of infection, which suggests that it could play multiple roles in the viral life cycle. The L1 52/55-kDa protein interacts with the viral IVa2 protein and is required for DNA packaging. L1 52/55-kDa is required to mediate stable association between the viral DNA and empty capsid. 198
46350 251695 pfam03053 Corona_NS3b ORF3b coronavirus protein. Members of this family are non-structural proteins, approximately 250 amino acid residues long. They are found in transmissible gastroenteritis coronavirus (TGEV) and porcine respiratory coronavirus (PRCV) isolates. These proteins are found on the same mRNA as another product, designated ORF3a. While ORF3a/b has been implicated in TGEV and PRCV pathogenesis, its precise role remains unclear. 226
46351 281101 pfam03054 tRNA_Me_trans tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs. 353
46352 397263 pfam03055 RPE65 Retinal pigment epithelial membrane protein. This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria. 445
46353 397264 pfam03057 DUF236 DUF236 repeat. This family represents a short repeat region found in a number of C. elegans proteins of unknown function. 31
46354 251699 pfam03058 Sar8_2 Sar8.2 family. Members of this family are found in Solanaceae plants, a taxonomic group (family) that includes pepper and tobacco plant species. Synthesis of these proteins is induced by tobacco mosaic virus (TMV) and salicylic acid; indeed they are thought to be involved in the development of systemic acquired resistance (SAR) after an initial hypersensitive response to microbial infection. SAR is characterized by long-lasting resistance to infection by a wide range of pathogens, extending to plant tissues distant from the initial infection site. 85
46355 308600 pfam03059 NAS Nicotianamine synthase protein. Nicotianamine synthase EC:2.5.1.43 catalyzes the trimerisation of S-adenosylmethionine to yield one molecule of nicotianamine. Nicotianamine has an important role in plant iron uptake mechanisms. Plants adopt two strategies (termed I and II) of iron acquisition. Strategy I is adopted by all higher plants except graminaceous plants, which adopt strategy II. In strategy I plants, the role of nicotianamine is not fully determined: possible roles include the formation of more stable complexes with ferrous than with ferric ion, which might serve as a sensor of the physiological status of iron within a plant, or which might be involved in the transport of iron. In strategy II (graminaceous) plants, nicotianamine is the key intermediate (and nicotianamine synthase the key enzyme) in the synthesis of the mugineic family (the only known family in plants) of phytosiderophores. Phytosiderophores are iron chelators whose secretion by the roots is greatly increased in instances of iron deficiency. The 3D structures of five example NAS from Methanothermobacter thermautotrophicus reveal the monomer to consist of a five-helical bundle N-terminal domain on top of a classic Rossmann fold C-terminal domain. The N-terminal domain is unique to the NAS family, whereas the C-terminal domain is homologous to the class I family of SAM-dependent methyltransferases. An active site is created at the interface of the two domains, at the rim of a large cavity that corresponds to the nucleotide binding site such as is found in other proteins adopting a Rossmann fold. 276
46356 367316 pfam03060 NMO Nitronate monooxygenase. Nitronate monooxygenase (NMO), formerly referred to as 2-nitropropane dioxygenase (NPD) (EC:1.13.11.32), is an FMN-dependent enzyme that uses molecular oxygen to oxidize (anionic) alkyl nitronates and, in the case of the enzyme from Neurospora crassa, (neutral) nitroalkanes to the corresponding carbonyl compounds and nitrite. Previously classified as 2-nitropropane dioxygenase, but it is now recognized that this was the result of the slow ionization of nitroalkanes to their nitronate (anionic) forms. The enzymes from the fungus Neurospora crassa and the yeast Williopsis saturnus var. mrakii (formerly classified as Hansenula mrakii) contain non-covalently bound FMN as the cofactor. Active towards linear alkyl nitronates of lengths between 2 and 6 carbon atoms and, with lower activity, towards propyl-2-nitronate. The enzyme from N. crassa can also utilize neutral nitroalkanes, but with lower activity. One atom of oxygen is incorporated into the carbonyl group of the aldehyde product. The reaction appears to involve the formation of an enzyme-bound nitronate radical and an a-peroxynitroethane species, which then decomposes, either in the active site of the enzyme or after release, to acetaldehyde and nitrite. 331
46357 397265 pfam03061 4HBT Thioesterase superfamily. This family contains a wide variety of enzymes, principally thioesterases. This family includes 4HBT (EC 3.1.2.23) which catalyzes the final step in the biosynthesis of 4-hydroxybenzoate from 4-chlorobenzoate in the soil dwelling microbe Pseudomonas CBS-3. This family includes various cytosolic long-chain acyl-CoA thioester hydrolases. Long-chain acyl-CoA hydrolases hydrolyze palmitoyl-CoA to CoA and palmitate, they also catalyze the hydrolysis of other long chain fatty acyl-CoA thioesters. 79
46358 281107 pfam03062 MBOAT MBOAT, membrane-bound O-acyltransferase family. The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue. 334
46359 397266 pfam03063 Prismane Prismane/CO dehydrogenase family. This family includes both hybrid-cluster proteins and the beta chain of carbon monoxide dehydrogenase. The hybrid-cluster proteins contain two Fe/S centers - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested. The prismane protein from Escherichia coli was shown to contain hydroxylamine reductase activity (NH2OH + 2e + 2 H+ -> NH3 + H2O). This activity is rather low. Hydroxylamine reductase activity was also found in CO-dehydrogenase in which the active site Ni was replaced by Fe. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre. 538
46360 281109 pfam03064 U79_P34 HSV U79 / HCMV P34. This family represents herpes virus protein U79 and cytomegalovirus early phosphoprotein P34 (UL112). 228
46361 397267 pfam03065 Glyco_hydro_57 Glycosyl hydrolase family 57. This family includes alpha-amylase (EC:3.2.1.1), 4-alpha-glucanotransferase (EC:2.4.1.-) and amylopullulanase enzymes. 293
46362 397268 pfam03066 Nucleoplasmin Nucleoplasmin/nucleophosmin domain. Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays. 102
46363 397269 pfam03067 LPMO_10 Lytic polysaccharide mono-oxygenase, cellulose-degrading. This domain is found associated with a wide variety of cellulose binding domains. This is a family of two very closely related proteins that together act as both a C1- and a C4-oxidising lytic polysaccharide mono-oxygenase, degrading cellulose. This domain is also found in baculoviral spheroidins and spindolins, proteins of unknown function. 186
46364 397270 pfam03068 PAD Protein-arginine deiminase (PAD). Members of this family are found in mammals. In the presence of calcium ions, PAD enzymes EC:3.5.3.15 catalyze the post-translational modification reaction responsible for the formation of citrulline residues: Protein L-arginine + H2O <=> Protein L-citrulline + NH3. Several types are recognized (and included in the family) on the basis of molecular mass, substrate specificity, and tissue localization. The expression of type I PAD is known to be under the control of oestrogen. 384
46365 397271 pfam03069 FmdA_AmdA Acetamidase/Formamidase family. This family includes amidohydrolases of formamide EC:3.5.1.49 and acetamide. Methylophilus methylotrophus FmdA forms a homotrimer suggesting all the members of this family also do. 271
46366 397272 pfam03070 TENA_THI-4 TENA/THI-4/PQQC family. Members of this family are found in all three major phyla of life: archaebacteria, eubacteria, and eukaryotes. In Bacillus subtilis, TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as alkaline protease, neutral protease and levansucrase. The THI-4 protein, which is involved in thiamine biosynthesis, is also a member of this family. The C-terminal part of these proteins consistently shows significant sequence similarity to TENA proteins. This similarity was first noted with the Neurospora crassa THI-4. This family includes bacterial coenzyme PQQ synthesis protein C or PQQC proteins. Pyrroloquinoline quinone (PQQ) is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC has been found to be required in the synthesis of PQQ but its function is unclear. The exact molecular function of members of this family is uncertain. 210
46367 397273 pfam03071 GNT-I GNT-I family. Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GNT-I, GLCNAC-T I) EC:2.4.1.101 transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide. This is an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localized to the Golgi apparatus, and is probably distributed in all tissues. The catalytic domain is located at the C-terminus. 434
46368 281117 pfam03072 DUF237 MG032/MG096/MG288 family 1. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03086, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but of course differ in the aligned residues. 137
46369 397274 pfam03073 TspO_MBR TspO/MBR family. Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown not only to retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues. 144
46370 397275 pfam03074 GCS Glutamate-cysteine ligase. This family represents the catalytic subunit of glutamate-cysteine ligase (E.C. 6.3.2.2), also known as gamma-glutamylcysteine synthetase (GCS). This enzyme catalyzes the rate limiting step in the biosynthesis of glutathione. The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy chain (this family). 369
46371 281120 pfam03076 GP3 Equine arteritis virus GP3. This protein is encoded by ORF3 of equine arteritis virus. The function is unknown. 160
46372 281121 pfam03077 VacA2 Putative vacuolating cytotoxin. This family contains a number of Helicobacter outer membrane proteins with multiple copies of this small conserved region. 58
46373 251715 pfam03078 ATHILA ATHILA ORF-1 family. ATHILA is a group of Arabidopsis thaliana retrotransposons belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown. 456
46374 281122 pfam03079 ARD ARD/ARD' family. The two acireductone dioxygenase enzymes (ARD and ARD', previously known as E-2 and E-2') from Klebsiella pneumoniae share the same amino acid sequence, but bind different metal ions: ARD binds Ni2+, ARD' binds Fe2+. ARD and ARD' can be experimentally interconverted by removal of the bound metal ion and reconstitution with the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but yield different products. ARD' yields the alpha-keto precursor of methionine (and formate), thus forming part of the ubiquitous methionine salvage pathway that converts 5'-methylthioadenosine (MTA) to methionine. This pathway is responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and transmethylation reactions. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the conversion of MTA to methionine. The role of the ARD catalyzed reaction is unclear: methylthiopropanoate is cytotoxic, and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels. This family also contains other members, whose functions are not well characterized. 157
46375 397276 pfam03080 Neprosin Neprosin. Pitcher plants are insectivorous and secrete a digestive fluid into the pitcher. This fluid contains a mixture of enzymes including peptidases. One of these is neprosin, characterized from the pitcher plant Nepenthes ventrata. This peptidase is of unknown catalytic type and is unaffected by standard peptidase inhibitors. Unusually, activity is directed towards prolyl bonds, but unlike most peptidase that cleave after proline, there is no restriction on sequence length or position of the proline residue. The peptidase is secreted and is presumed to possess an N-terminal activation peptide. The neprosin domain corresponds to the mature peptidase. It is not known if other proteins with this domain are peptidases. 221
46376 397277 pfam03081 Exo70 Exo70 exocyst complex subunit. The Exo70 protein forms one subunit of the exocyst complex. First discovered in S. cerevisiae, Exo70 and other exocyst proteins have been observed in several other eukaryotes, including humans. In S. cerevisiae, the exocyst complex is involved in the late stages of exocytosis, and is localized at the tip of the bud, the major site of exocytosis in yeast. Exo70 interacts with the Rho3 GTPase. This interaction mediates one of the three known functions of Rho3 in cell polarity: vesicle docking and fusion with the plasma membrane (the other two functions are regulation of actin polarity and transport of exocytic vesicles from the mother cell to the bud). In humans, the functions of Exo70 and the exocyst complex are less well characterized: Exo70 is expressed in several tissues and is thought to also be involved in exocytosis. 373
46377 281125 pfam03082 MAGSP Male accessory gland secretory protein. The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. This protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. The proteins are transferred to the female fly during copulation and are rapidly altered in the female genital tract. 267
46378 397278 pfam03083 MtN3_slv Sugar efflux transporter for intercellular exchange. This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologs are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter. 87
46379 367326 pfam03084 Sigma_1_2 Reoviral Sigma1/Sigma2 family. Reoviruses are double-stranded RNA viruses. They lack a membrane envelope and their capsid is organized in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3, mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought to initiate core formation, followed by mu2 and lambda2. Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil, anchors it in the virion, and a C-terminal globular head interacts with the cellular receptor. These two parts form by separate trimerisation events. The N-terminal fibrous tail forms on the polysome, without the involvement of ATP or chaperones. The post-translational assembly of the C-terminal globular head involves the chaperone activity of Hsp90, which is associated with phosphorylation of Hsp90 during the process. Sigma1 protein acts as a cell attachment protein, and determines viral virulence, pathways of spread, and tropism. Junctional adhesion molecule has been identified as a receptor for sigma1. In type 3 reoviruses, a small region, predicted to form a beta sheet, in the N-terminal tail was found to bind target cell surface sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis. The sigma1 protein also binds to the lambda2 core protein. 452
46380 367327 pfam03085 RAP-1 Rhoptry-associated protein 1 (RAP-1). Members of this family are found in Babesia species. Though not in this Pfam family, rhoptry-associated proteins are also found in Plasmodium falciparum. Indeed, animal infection with Babesia may produce a pattern similar to human malaria. Rhoptry organelles form part of the apical complex in apicomplexan parasites. Rhoptry-associated proteins are antigenic, and generate partially protective immune responses in infected mammals. Thus RAPs are among the targeted vaccine antigens for babesial (and malarial) parasites. However, RAP-1 proteins are encoded by a multigene family; thus RAP-1 proteins are polymorphic, with B and T cell epitopes that are conserved among strains, but not across species. Antibodies to Babesia RAP-1 may also be helpful in the serological detection of Babesia infections. 241
46381 281129 pfam03086 DUF240 MG032/MG096/MG288 family 2. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03072, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but of course differ in the aligned residues. 119
46382 397279 pfam03087 DUF241 Arabidopsis protein of unknown function. This family represents a number of Arabidopsis proteins. Their functions are unknown. 238
46383 281131 pfam03088 Str_synth Strictosidine synthase. Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyzes the condensation of tryptamine with secologanin to form strictosidine. 89
46384 397280 pfam03089 RAG2 Recombination activating protein 2. V-D-J recombination is the combinatorial process by which the huge range of immunoglobulin and T cell binding specificity is generated from a limited amount of genetic material. This process is synergistically activated by RAG1 and RAG2 in developing lymphocytes. Defects in RAG2 in humans are a cause of severe combined immunodeficiency B cell negative and Omenn syndrome. 338
46385 397281 pfam03090 Replicase Replicase family. This is a family of bacterial plasmid DNA replication initiator proteins. pfam01051 is a similar family. These RepA proteins exist as monomers and dimers in equilibrium: monomers bind directly to repeated DNA sequences and thus activate replication; dimers repress repA transcription by binding an inversely repeated DNA operator. Dimer dissociation can occur spontaneously or be mediated by Hsp70 chaperones. 128
46386 397282 pfam03091 CutA1 CutA1 divalent ion tolerance protein. Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognized structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila). 99
46387 308617 pfam03092 BT1 BT1 family. Members of this family are transmembrane proteins. Several are Leishmania putative proteins that are thought to be pteridine transporters. One such protein, previously termed (and still annotated as) ORFG, was shown to encode a biopterin transport protein using null mutants, thus being subsequently renamed BT1. The significant similarity of ORFG/BT1 to Trypanosoma brucei ESAG10 (a putative transmembrane protein and another member of this family) was previously noted. This family also contains five putative Arabidopsis thaliana proteins of unknown function. In addition, it also contains two predicted prokaryotic proteins (from the cyanobacteria Synechocystis and Synechococcus). 432
46388 397283 pfam03094 Mlo Mlo family. A family of plant integral membrane proteins, first discovered in barley. Mutants lacking wild-type Mlo proteins show broad spectrum resistance to the powdery mildew fungus, and dysregulated cell death control, with spontaneous cell death in response to developmental or abiotic stimuli. Thus wild-type Mlo proteins are thought to be inhibitors of cell death whose deficiency lowers the threshold required to trigger the cascade of events that result in plant cell death. Mlo proteins are localized in the plasma membrane and possess seven transmembrane regions; thus the Mlo family is the only major higher plant family to possess 7 transmembrane domains. It has been suggested that Mlo proteins function as G-protein coupled receptors in plants; however the molecular and biological functions of Mlo proteins remain to be fully determined. 484
46389 397284 pfam03095 PTPA Phosphotyrosyl phosphate activator (PTPA) protein. Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphorylase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumor suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1. 291
46390 397285 pfam03096 Ndr Ndr family. This family consists of proteins from different gene families: Ndr1/RTP/Drg1, Ndr2, and Ndr3. Their similarity was previously noted. The precise molecular and cellular function of members of this family is still unknown. Yet, they are known to be involved in cellular differentiation events. The Ndr1 group was the first to be discovered. Their expression is repressed by the proto-oncogenes N-myc and c-myc, and in line with this observation, Ndr1 protein expression is down-regulated in neoplastic cells, and is reactivated when differentiation is induced by chemicals such as retinoic acid. Ndr2 and Ndr3 expression is not under the control of N-myc or c-myc. Ndr1 expression is also activated by several chemicals: tunicamycin and homocysteine induce Ndr1 in human umbilical endothelial cells; nickel induces Ndr1 in several cell types. Members of this family are found in a wide variety of multicellular eukaryotes, including an Ndr1 type protein in Helianthus annuus (sunflower), known as Sf21. Interestingly, the highest scoring matches in the noise are all alpha/beta hydrolases pfam00561, suggesting that this family may have an enzymatic function (Bateman A pers. obs.). 285
46391 397286 pfam03097 BRO1 BRO1-like domain. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1. 369
46392 397287 pfam03098 An_peroxidase Animal haem peroxidase. 533
46393 397288 pfam03099 BPL_LplA_LipB Biotin/lipoate A/B protein ligase family. This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor. 129
46394 397289 pfam03100 CcmE CcmE. CcmE is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor. 129
46395 335217 pfam03101 FAR1 FAR1 DNA-binding domain. This domain contains a WRKY like fold and is therefore most likely a zinc binding DNA-binding domain. 90
46396 397290 pfam03102 NeuB NeuB family. NeuB is the prokaryotic N-acetylneuraminic acid (Neu5Ac) synthase. It catalyzes the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesize the 9-phosphate form, Neu5Ac-9-P, and utilize ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family. This family also contains SpsE spore coat polysaccharide biosynthesis proteins. 240
46397 397291 pfam03103 DUF243 Domain of unknown function (DUF243). This family of uncharacterized proteins is only found in fly proteins. It is found associated with YLP motifs pfam02757 in some proteins. 97
46398 397292 pfam03104 DNA_pol_B_exo1 DNA polymerase family B, exonuclease domain. This domain has 3' to 5' exonuclease activity and adopts a ribonuclease H type fold. 333
46399 397293 pfam03105 SPX SPX domain. We have named this region the SPX domain after SYG1, Pho81 and XPR1. This 180 residue long domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. Similarly, the N-terminus of the human XPR1 protein binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that all the members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The SPX domain of S. cerevisiae low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. NUC-2 contains several ankyrin repeats pfam00023. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with murine xenotropic and polytropic leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signalling and result in cell toxicity and death. The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae, and many other diverse organisms. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor. 340
46400 397294 pfam03106 WRKY WRKY DNA-binding domain. 57
46401 367338 pfam03107 C1_2 C1 domain. This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in pfam00130, therefore we have termed this domain DC1 for divergent C1 domain. This domain probably also binds to two zinc ions. The function of proteins with this domain is uncertain; however, this domain may bind to molecules such as diacylglycerol (A Bateman pers. obs.). This family is found in plant proteins. 48
46402 397295 pfam03108 DBD_Tnp_Mut MuDR family transposase. This region is found in plant proteins that are presumed to be the transposases for Mutator transposable elements. These transposons contain two ORFs. The molecular function of this region is unknown. 65
46403 281150 pfam03109 ABC1 ABC1 family. This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins are not clear; however, yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc1 complex, and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins. 117
46404 397296 pfam03110 SBP SBP domain. SBP domains (for SQUAMOSA-pROMOTER BINDING PROTEIN) are found in plant proteins. It is a sequence specific DNA-binding domain. Members of this family probably function as transcription factors involved in the control of early flower development. The domain contains 10 conserved cysteine and histidine residues that probably are zinc ligands. 75
46405 281152 pfam03112 DUF244 Uncharacterized protein family (ORF7) DUF. Several members of this family are Borrelia burgdorferi plasmid proteins of uncharacterized function. 161
46406 281153 pfam03113 RSV_NS2 Respiratory syncytial virus non-structural protein NS2. The molecular structure and function of the NS2 protein is not known. However, mutants lacking NS2 grow at slower rates when compared to the wild-type. Nevertheless, NS2 is not essential for viral replication. 124
46407 281154 pfam03114 BAR BAR domain. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different protein families. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysin, endophilin, BRAP and Nadrin. BAR domains are also frequently found alongside domains that determine lipid specificity, like pfam00169 and pfam00787 domains in beta centaurins and sorting nexins respectively. 234
46408 367340 pfam03115 Astro_capsid_N Astrovirus capsid protein precursor. This product is encoded by astrovirus ORF2, one of the three astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein undergoes an intracellular cleavage to form a 79kD protein. Subsequently, extracellular trypsin cleavage yields the three proteins forming the infectious virion. 431
46409 397297 pfam03116 NQR2_RnfD_RnfE NQR2, RnfD, RnfE family. This family of bacterial proteins includes a sodium-translocating NADH-ubiquinone oxidoreductase (i.e. a respiration linked sodium pump). In Vibrio cholerae, it negatively regulates the expression of virulence factors through inhibiting (by an unknown mechanism) the transcription of the transcriptional activator ToxT. The family also includes proteins involved in nitrogen fixation, RnfD and RnfE. The similarity of these proteins to NADH-ubiquinone oxidoreductases was previously noted. 304
46410 281157 pfam03117 Herpes_UL49_1 UL49 family. Members of this family, found in several herpesviruses, include EBV BFRF2 and other UL49 proteins (e.g. HCMVA UL49, HSV6 U33). There are eight conserved cysteine residues in this alignment, all lying towards the C-terminus. Their function is unknown. 243
46411 397298 pfam03118 RNA_pol_A_CTD Bacterial RNA polymerase, alpha chain C terminal domain. The alpha subunit of RNA polymerase consists of two independently folded domains, referred to as amino-terminal and carboxyl terminal domains. The amino terminal domain is involved in the interaction with the other subunits of the RNA polymerase. The carboxyl-terminal domain interacts with the DNA and activators. The amino acid sequence of the alpha subunit is conserved in prokaryotic and chloroplast RNA polymerases. There are three regions of particularly strong conservation, two in the amino-terminal and one in the carboxyl-terminal. 63
46412 397299 pfam03119 DNA_ligase_ZBD NAD-dependent DNA ligase C4 zinc finger domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This family is a small zinc binding motif that is presumably DNA binding. It is found only in NAD dependent DNA ligases. 26
46413 397300 pfam03120 DNA_ligase_OB NAD-dependent DNA ligase OB-fold domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This family is a small domain found after the adenylation domain pfam01653 in NAD dependent ligases. OB-fold domains generally are involved in nucleic acid binding. 79
46414 367342 pfam03121 Herpes_UL52 Herpesviridae UL52/UL70 DNA primase. Herpes simplex virus type 1 DNA replication in host cells is known to be mediated by seven viral-encoded proteins, three of which form a heterotrimeric DNA helicase-primase complex. This complex consists of UL5, UL8, and UL52 subunits. Heterodimers consisting of UL5 and UL52 have been shown to retain both helicase and primase activities. Nevertheless, UL8 is still essential for replication: though it lacks any DNA binding or catalytic activities, it is involved in the transport of UL5-UL52 and it also interacts with other replication proteins. The molecular mechanisms of the UL5-UL52 catalytic activities are not known. While UL5 is associated with DNA helicase activity and UL52 with DNA primase activity, the helicase activity requires the interaction of UL5 and UL52. It is not known if the primase activity can be maintained by UL52 alone. The region encompassed by residues 610-636 of HSV1 UL52 is thought to contain a divalent metal cation binding motif. Indeed, this region contains several aspartate and glutamate residues that might be involved in divalent cation binding. The biological significance of UL52-UL8 interaction is not known. Yeast two-hybrid analysis together with immunoprecipitation experiments have shown that the HSV1 UL52 region between residues 366-914 is essential for this interaction, while the first 349 N-terminal residues are dispensable. This family also includes protein UL70 from cytomegalovirus (CMV, a subgroup of the Herpesviridae) strains, which, by analogy with UL52, is thought to have DNA primase activity. Indeed, CMV strains also possess a DNA helicase-primase complex, the other subunits being protein UL105 (with known similarity to HSV1 UL5) and protein UL102. 75
46415 397301 pfam03122 Herpes_MCP Herpes virus major capsid protein. This family represents the major capsid protein (MCP) of herpes viruses. The capsid shell consists of 150 MCP hexamers and 12 MCP pentamers. One pentamer is found at each of the 12 apices of the icosahedral shell, and the hexamers form the edges and 20 faces. 1368
46416 397302 pfam03123 CAT_RBD CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains (pfam00874) that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template. 55
46417 397303 pfam03124 EXS EXS family. We have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles. 332
46418 251743 pfam03125 Sre C. elegans Sre G protein-coupled chemoreceptor. Caenorhabditis elegans Sre proteins are candidate chemosensory receptors. There are four main recognized groups of such receptors: Odr-10, Sra, Sro, and Srg. Sre (this family), Sra pfam02117 and Srb pfam02175 comprise the Sra group. All of the above receptors are thought to be G protein-coupled seven transmembrane domain proteins. The existence of several different chemosensory receptors underlies the fact that in spite of having only 20-30 chemosensory neurones, C. elegans detects hundreds of different chemicals, with the ability to discern individual chemicals among combinations. 363
46419 397304 pfam03126 Plus-3 Plus-3 domain. This domain is about 90 residues in length and is often found associated with the pfam02213 domain. The function of this domain is uncertain. It is possible that this domain is involved in DNA binding as it has three conserved positively charged residues, hence this domain has been named the plus-3 domain. It is found in yeast Rtf1 which may be a transcription elongation factor. 103
46420 397305 pfam03127 GAT GAT domain. The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 and ARF3. The GAT domain stabilizes membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins. 77
46421 397306 pfam03128 CXCXC CXCXC repeat. This repeat contains the conserved pattern CXCXC where X can be any amino acid. The repeat is found in up to five copies in Vascular endothelial growth factor C. In the salivary glands of the dipteran Chironomus tentans, a specific messenger ribonucleoprotein (mRNP) particle, the Balbiani ring (BR) granule, can be visualized during its assembly on the gene and during its nucleocytoplasmic transport. This repeat is found in over 70 copies in the Balbiani ring protein 3. It is also found in some silk proteins. 13
46422 397307 pfam03129 HGTP_anticodon Anticodon binding domain. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases; it is probably the anticodon binding domain. 94
46423 308641 pfam03130 HEAT_PBS PBS lyase HEAT-like repeat. This family contains a short bi-helical repeat that is related to pfam02985. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of attached phycobilins to each of their subunits. This family includes the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six). All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif. 27
46424 367348 pfam03131 bZIP_Maf bZIP Maf transcription factor. Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerization and DNA binding property. Thus, this family is probably related to pfam00170. This family also includes the DNA_binding domain of Skn-1; this domain lacks the leucine zipper found in other bZip domains, and binds DNA as a monomer. 92
46425 397308 pfam03133 TTL Tubulin-tyrosine ligase family. Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumor aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain. 291
46426 397309 pfam03134 TB2_DP1_HVA22 TB2/DP1, HVA22 family. This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein. 77
46427 367350 pfam03135 CagE_TrbE_VirB CagE, TrbE, VirB family, component of type IV transporter system. This family includes the Helicobacter pylori protein CagE, which together with other proteins from the cag pathogenicity island (PAI), encodes a type IV transporter secretion system. The precise role of CagE is not known, but studies in animal models have shown that it is essential for pathogenesis in Helicobacter pylori induced gastritis and peptic ulceration. Indeed, the expression of the cag PAI has been shown to be essential for stimulating human gastric epithelial cell apoptosis in vitro. Similar type IV transport systems are also found in other bacteria. This family includes the TrbE and VirB proteins from the respective trb and Vir conjugal transfer systems in Agrobacterium tumefaciens. Homologs of VirB proteins from other species are also members of this family, e.g. VirB from Brucella suis. 202
46428 397310 pfam03136 Pup_ligase Pup-ligase protein. Pupylation is a novel protein modification system found in some bacteria. The proteins in this family are the enzymes that conjugate proteins of the Pup family to lysine residues in target proteins, marking them for degradation. The archetypal protein in this family is PafA (proteasome accessory factor) from Mycobacterium tuberculosis. It has been suggested that these proteins are related to gamma-glutamyl-cysteine synthetases. 405
46429 397311 pfam03137 OATP Organic Anion Transporter Polypeptide (OATP) family. This family consists of several eukaryotic Organic-Anion-Transporting Polypeptides (OATPs). Several have been identified mostly in human and rat. Different OATPs vary in tissue distribution and substrate specificity. Since the numbering of different OATPs in particular species was based originally on the order of discovery, similarly numbered OATPs in humans and rats did not necessarily correspond in function, tissue distribution and substrate specificity (in spite of the name, some OATPs also transport organic cations and neutral molecules). Thus, Tamai et al. initiated the current scheme of using digits for rat OATPs and letters for human ones. Prostaglandin transporter (PGT) proteins are also considered to be OATP family members. In addition, the methotrexate transporter OATK is closely related to OATPs. This family also includes several predicted proteins from Caenorhabditis elegans and Drosophila melanogaster. This similarity was not previously noted. Note: Members of this family are described (in the Swiss-Prot database) as belonging to the SLC21 family of transporters. 529
46430 397312 pfam03139 AnfG_VnfG Vanadium/alternative nitrogenase delta subunit. The nitrogenase complex EC:1.18.6.1 catalyzes the conversion of molecular nitrogen to ammonia (nitrogen fixation) as follows: 8 reduced ferredoxin + 8 H(+) + N(2) + 16 ATP <=> 8 oxidized ferredoxin + 2 NH(3) + 16 ADP + 16 phosphate. The complex is hexameric, consisting of 2 alpha, 2 beta, and 2 delta subunits. This family represents the delta subunit of a group of nitrogenases that do not utilize molybdenum (Mo) as a cofactor, but instead use either vanadium (V nitrogenases), or iron (alternative nitrogenases). V nitrogenases are encoded by vnf operons, and alternative nitrogenases by anf operons. The delta subunits are VnfG and AnfG, respectively. 111
46431 397313 pfam03140 DUF247 Plant protein of unknown function. The function of the plant proteins constituting this family is unknown. 389
46432 335237 pfam03141 Methyltransf_29 Putative S-adenosyl-L-methionine-dependent methyltransferase. This family is a putative S-adenosyl-L-methionine (SAM)-dependent methyltransferase. 506
46433 367353 pfam03142 Chitin_synth_2 Chitin synthase. Members of this family are fungal chitin synthase EC:2.4.1.16 enzymes. They catalyze chitin synthesis as follows: UDP-N-acetyl-D-glucosamine + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N) <=> UDP + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N+1). 527
46434 397314 pfam03143 GTP_EFTU_D3 Elongation factor Tu C-terminal domain. Elongation factor Tu consists of three structural domains; this is the third domain. This domain adopts a beta barrel structure. This third domain is involved in binding to both charged tRNA and EF-Ts pfam00889. 105
46435 397315 pfam03144 GTP_EFTU_D2 Elongation factor Tu domain 2. Elongation factor Tu consists of three structural domains; this is the second domain. This domain adopts a beta barrel structure. This second domain is involved in binding to charged tRNA. This domain is also found in other proteins such as elongation factor G and translation initiation factor IF-2. This domain is structurally related to pfam03143, and in fact has weak sequence matches to this domain. 73
46436 397316 pfam03145 Sina Seven in absentia protein family. The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thus thought that Sina targets TTK88 for degradation, therefore promoting the R7 pathway. Murine and human homologs of Sina have also been identified. The human homolog Siah-1 also binds E2 enzymes (UbcH5) and through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remainder C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologs, whose similarity was noted previously, this family also includes putative homologs from Caenorhabditis elegans and Arabidopsis thaliana. 198
46437 397317 pfam03146 NtA Agrin NtA domain. Agrin is a multidomain heparan sulphate proteoglycan, that is a key organizer for the induction of postsynaptic specialisations at the neuromuscular junction. Binding of agrin to basement membranes requires the amino terminal (NtA) domain. This region mediates high affinity interaction with the coiled-coil domain of laminins. The binding of agrin to laminins via the NtA domain is subject to tissue-specific regulation. The NtA domain-containing form of agrin is expressed in non-neuronal cells or in neurons that project to non-neuronal cells, such as motor neurons. The structure of this domain is an OB-fold. 109
46438 397318 pfam03147 FDX-ACB Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold. 94
46439 397319 pfam03148 Tektin Tektin family. Tektins are cytoskeletal proteins. They have been demonstrated in such cellular sites as centrioles, basal bodies, and along ciliary and flagellar doublet microtubules. Tektins form unique protofilaments, organized as longitudinal polymers of tektin heterodimers with axial periodicity matching tubulin. Tektin polypeptides consist of several alpha-helical regions that are predicted to form coiled coils. Indeed, tektins share considerable structural similarities with intermediate filament proteins. Possible functional roles for tektins are: stabilisation of tubulin protofilaments; attachment of A and B-tubules in ciliary/flagellar microtubule doublets and C-tubules in centrioles; binding of axonemal components. 383
46440 397320 pfam03150 CCP_MauG Di-haem cytochrome c peroxidase. This is a family of distinct cytochrome c peroxidases (CCPs) that contain two haem groups. Similar to other cytochrome c peroxidases, they reduce hydrogen peroxide to water using c-type haem as an oxidisable substrate. However, since they possess two, instead of one, haem prosthetic groups, bacterial CCPs reduce hydrogen peroxide without the need to generate semi-stable free radicals. The two haem groups have significantly different redox potentials. The high potential (+320 mV) haem feeds electrons from electron shuttle proteins to the low potential (-330 mV) haem, where peroxide is reduced (indeed, the low potential site is known as the peroxidatic site). The CCP protein itself is structured into two domains, each containing one c-type haem group, with a calcium-binding site at the domain interface. This family also includes MauG proteins, whose similarity to di-haem CCP was previously recognized. 151
46441 308657 pfam03151 TPT Triose-phosphate Transporter family. This family includes transporters with a specificity for triose phosphate. 290
46442 397321 pfam03152 UFD1 Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterized by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation. 172
46443 397322 pfam03153 TFIIA Transcription factor IIA, alpha/beta subunit. Transcription initiation factor IIA (TFIIA) is a heterotrimer, the three subunits being known as alpha, beta, and gamma, in order of molecular weight. The N and C-terminal domains of the gamma subunit are represented in pfam02268 and pfam02751, respectively. This family represents the precursor that yields both the alpha and beta subunits. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II. Together with TFIID, TFIIA binds to the promoter region; this is the first step in the formation of a pre-initiation complex (PIC). Binding of the rest of the transcription machinery follows this step. After initiation, the PIC does not completely dissociate from the promoter. Some components, including TFIIA, remain attached and re-initiate a subsequent round of transcription. 237
46444 397323 pfam03154 Atrophin-1 Atrophin-1 family. Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity. 982
46445 397324 pfam03155 Alg6_Alg8 ALG6, ALG8 glycosyltransferase family. N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one. In the human alg6 gene, a C->T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147. 472
46446 367362 pfam03157 Glutenin_hmw High molecular weight glutenin subunit. Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm. 786
46447 281192 pfam03158 DUF249 Multigene family 530 protein. Members of this family are multigene family 530 proteins from African swine fever viruses. These proteins may be involved in promoting survival of infected macrophages. 192
46448 397325 pfam03159 XRN_N XRN 5'-3' exonuclease N-terminus. This family aligns residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to possess 5'-3' exonuclease activity EC:3.1.11.-. Thus, the aligned region may be necessary for 5' to 3' exonuclease function. The family also contains several Xrn1 and Xrn2 proteins. The 5'-3' exoribonucleases Xrn1p and Xrn2p/Rat1p function in the degradation and processing of several classes of RNA in Saccharomyces cerevisiae. Xrn1p is the main enzyme catalyzing cytoplasmic mRNA degradation in multiple decay pathways, whereas Xrn2p/Rat1p functions in the processing of rRNAs and small nucleolar RNAs (snoRNAs) in the nucleus. 231
46449 397326 pfam03160 Calx-beta Calx-beta domain. 91
46450 397327 pfam03161 LAGLIDADG_2 LAGLIDADG DNA endonuclease family. This is a family of site-specific DNA endonucleases encoded by DNA mobile elements. Similar to pfam00961, the members of this family are also LAGLIDADG endonucleases. 168
46451 397328 pfam03162 Y_phosphatase2 Tyrosine phosphatase family. This family is closely related to the pfam00102 and pfam00782 families. 150
46452 367367 pfam03164 Mon1 Trafficking protein Mon1. Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologs to domains of known structure that assemble into cargo vesicle adapter (AP) complexes. describes orthologues in Fugu rubripes. 400
46453 397329 pfam03165 MH1 MH1 domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localization signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx. 103
46454 397330 pfam03166 MH2 MH2 domain. This is the MH2 (MAD homology 2) domain found at the carboxy terminus of MAD related proteins such as Smads. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins and provides specificity and selectivity to Smad function and also is critical for mediating interactions in Smad oligomers. Unlike MH1, MH2 does not bind DNA. The well-studied MH2 domain of Smad4 is composed of five alpha helices and three loops enclosing a beta sandwich. Smads are involved in the propagation of TGF-beta signals by direct association with the TGF-beta receptor kinase which phosphorylates the last two Ser of a conserved 'SSXS' motif located at the C-terminus of MH2. 181
46455 397331 pfam03167 UDG Uracil DNA glycosylase superfamily. 154
46456 397332 pfam03168 LEA_2 Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. This family represents a group of LEA proteins that appear to be distinct from those in pfam02987. The family DUF1511, pfam07427, has now been merged into this family. 98
46457 367370 pfam03169 OPT OPT oligopeptide transporter protein. The OPT family of oligopeptide transporters is distinct from the ABC pfam00005 and PTR pfam00854 transporter families. OPT transporters were first recognized in fungi (Candida albicans and Schizosaccharomyces pombe), but this alignment also includes orthologues from Arabidopsis thaliana. OPT transporters are thought to have 12-14 transmembrane domains and contain the following motif: SPYxEVRxxVxxxDDP. 614
46458 397333 pfam03170 BcsB Bacterial cellulose synthase subunit. This family includes bacterial proteins involved in cellulose synthesis. Cellulose synthesis has been identified in several bacteria. In Agrobacterium tumefaciens, for instance, cellulose has a pathogenic role: it allows the bacteria to bind tightly to their host plant cells. While several enzymatic steps are involved in cellulose synthesis, potentially the only step unique to this pathway is that catalyzed by cellulose synthase. This enzyme is a multi subunit complex. This family encodes a subunit that is thought to bind the positive effector cyclic di-GMP. This subunit is found in several different bacterial cellulose synthase enzymes. The first recognized sequence for this subunit is BcsB. In the AcsII cellulose synthase, this subunit and the subunit corresponding to BcsA are found in the same protein. Indeed, this alignment only includes the C-terminal half of the AcsAII synthase, which corresponds to BcsB. 605
46459 397334 pfam03171 2OG-FeII_Oxy 2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans-4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB. 101
46460 397335 pfam03172 HSR HSR domain. The Sp100 protein is a constituent of nuclear domains, also known as nuclear dots (NDs). An ND-targeting region that coincides with a homodimerization domain was mapped in Sp100. Sequences similar to the Sp100 homodimerization/ND-targeting region occur in several other proteins and constitute a novel protein motif, termed HSR domain (for homogeneously-staining region). The HSR domain has also been named ASS (AIRE, Sp-100 and Sp140). This domain is usually found at the amino terminus of proteins that contain a SAND domain pfam01342. 99
46461 397336 pfam03173 CHB_HEX Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 152
46462 397337 pfam03174 CHB_HEX_C Chitobiase/beta-hexosaminidase C-terminal domain. This short domain represents the C terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure. The function of this domain is unknown. 76
46463 367375 pfam03175 DNA_pol_B_2 DNA polymerase type B, organellar and viral. Like pfam00136, members of this family are also DNA polymerase type B proteins. Those included here are found in plant and fungal mitochondria, and in viruses. 455
46464 308676 pfam03176 MMPL MMPL family. Members of this family are putative integral membrane proteins from bacteria. Several of the members are mycobacterial proteins. Many of the proteins contain two copies of this aligned region. The function of these proteins is not known, although it has been suggested that they may be involved in lipid transport. 332
46465 397338 pfam03177 Nucleoporin_C Non-repetitive/WGA-negative nucleoporin C-terminal. This is the C-terminal half of a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterized by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. RNA undergoing nuclear export first encounters the basket of the nuclear pore and many nucleoporins are accessible on the basket side of the pore. 559
46466 397339 pfam03178 CPSF_A CPSF A subunit region. This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding. 318
46467 397340 pfam03179 V-ATPase_G Vacuolar (H+)-ATPase G subunit. This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialized cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 subunit, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation. 105
46468 397341 pfam03180 Lipoprotein_9 NLPA lipoprotein. This family of bacterial lipoproteins contains several antigenic members, that may be involved in bacterial virulence. Their precise function is unknown. However they are probably distantly related to pfam00497 which are solute binding proteins. 236
46469 397342 pfam03181 BURP BURP domain. The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown. 215
46470 112017 pfam03183 Borrelia_rep Borrelia repeat protein. 18
46471 367380 pfam03184 DDE_1 DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localized to the centromere. 177
46472 397343 pfam03185 CaKB Calcium-activated potassium channel, beta subunit. 195
46473 397344 pfam03186 CobD_Cbib CobD/Cbib protein. This family includes CobD proteins from a number of bacteria; in Salmonella this protein is called Cbib. Salmonella CobD is a different protein. This protein is involved in cobalamin biosynthesis and is probably an enzyme responsible for the conversion of adenosylcobyric acid to adenosylcobinamide or adenosylcobinamide phosphate. 278
46474 281216 pfam03187 Corona_I Corona nucleocapsid I protein. 207
46475 397345 pfam03188 Cytochrom_B561 Eukaryotic cytochrome b561. Cytochrome b561 is a secretory vesicle-specific electron transport protein. It is an integral membrane protein, that binds two heme groups non-covalently. This is a eukaryotic family. Members of the 'prokaryotic cytochrome b561' family can be found in pfam01292. 137
46476 397346 pfam03189 Otopetrin Otopetrin. 435
46477 397347 pfam03190 Thioredox_DsbH Protein of unknown function, DUF255. 163
46478 367385 pfam03192 DUF257 Pyrococcus protein of unknown function, DUF257. 205
46479 397348 pfam03193 RsgA_GTPase RsgA GTPase. RsgA (also known as EngC and YjeQ) represents a protein family whose members are broadly conserved in bacteria and are indispensable for growth. The GTPase domain of RsgA is very similar to several P-loop GTPases, but differs in having a circular permutation of the GTPase structure described by a G4-G1-G3 pattern. 174
46480 397349 pfam03194 LUC7 LUC7 N_terminus. This family contains the N terminal region of several LUC7 protein homologs and only contains eukaryotic proteins. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The family also contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP). 246
46481 397350 pfam03195 LOB Lateral organ boundaries (LOB) domain. The lateral organ boundaries (LOB) gene encodes a plant-specific protein of unknown function that is expressed at the adaxial base of initiating lateral organs. The N-terminal of the LOB protein contains an approximately 100-amino acid conserved domain (the LOB domain) that is present in 42 other Arabidopsis thaliana proteins as well as in proteins from a variety of other plant species. The LOB domain contains conserved blocks of amino acids that identify the LOB domain (LBD) gene family. In particular, a conserved C-x(2)-C-x(6)-C-x(3)-C motif, which is the defining feature of the LOB domain, is present in all LBD proteins. 99
46482 281224 pfam03196 DUF261 Protein of unknown function, DUF261. 137
46483 281225 pfam03197 FRD2 Bacteriophage FRD2 protein. 103
46484 397351 pfam03198 Glyco_hydro_72 Glucanosyltransferase. This is a family of glycosylphosphatidylinositol-anchored beta(1-3)glucanosyltransferases. The active site residues in the Aspergillus fumigatus example are the two glutamate residues at 160 and 261. 315
46485 397352 pfam03199 GSH_synthase Eukaryotic glutathione synthase. 103
46486 397353 pfam03200 Glyco_hydro_63 Glycosyl hydrolase family 63 C-terminal domain. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the C-terminal catalytic domain. 494
46487 397354 pfam03201 HMD H2-forming N5,N10-methylene-tetrahydromethanopterin dehydrogenase. 88
46488 397355 pfam03202 Lipoprotein_10 Putative mycoplasma lipoprotein, C-terminal region. 129
46489 397356 pfam03203 MerC MerC mercury resistance protein. 105
46490 397357 pfam03205 MobB Molybdopterin guanine dinucleotide synthesis protein B. This protein contains a P-loop. 133
46491 397358 pfam03206 NifW Nitrogen fixation protein NifW. Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of these proteins in the functions of nitrogenase are unclear. Using yeast two-hybrid screening it has been shown that NifW can interact with itself as well as NifZ. 99
46492 367392 pfam03207 OspD Borrelia outer surface protein D (OspD). 254
46493 397359 pfam03208 PRA1 PRA1 family protein. This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. 141
46494 367394 pfam03209 PUCC PUCC protein. This protein is required for high-level transcription of the PUC operon. 401
46495 281237 pfam03210 Paramyx_P_V_C Paramyxovirus P/V phosphoprotein C-terminal. Paramyxoviridae P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L pfam00946. Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In mumps, simian virus type 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini. This C-terminal region of the P phosphoprotein is likely to be the nucleocapsid-binding domain, and is found to be intrinsically disordered and thus liable to induced folding. 154
46496 397360 pfam03211 Pectate_lyase Pectate lyase. 200
46497 367396 pfam03212 Pertactin Pertactin. 121
46498 281240 pfam03213 Pox_P35 Poxvirus P35 protein. 323
46499 367397 pfam03214 RGP Reversibly glycosylated polypeptide. 340
46500 367398 pfam03215 Rad17 Rad17 cell cycle checkpoint protein. 186
46501 397361 pfam03216 Rhabdo_ncap_2 Rhabdovirus nucleoprotein. 295
46502 397362 pfam03217 SLAP SLAP domain. This short domain is found in a variety of bacterial cell surface proteins. The domain is about 60 residues in length (although previously defined as 2 copies of this domain). It usually occurs in tandem pairs. It may be distantly related to the SH3 domain. 54
46503 397363 pfam03219 TLC TLC ATP/ADP transporter. 491
46504 397364 pfam03220 Tombus_P19 Tombusvirus P19 core protein. 171
46505 397365 pfam03221 HTH_Tnp_Tc5 Tc5 transposase DNA-binding domain. 63
46506 367404 pfam03222 Trp_Tyr_perm Tryptophan/tyrosine permease family. 393
46507 397366 pfam03223 V-ATPase_C V-ATPase subunit C. 369
46508 397367 pfam03224 V-ATPase_H_N V-ATPase subunit H. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the N terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation. 311
46509 251809 pfam03225 Viral_Hsp90 Viral heat shock protein Hsp90 homolog. 511
46510 397368 pfam03226 Yippee-Mis18 Yippee zinc-binding/DNA-binding /Mis18, centromere assembly. This family includes both Yippee-type proteins and Mis18 kinetochore proteins. Yippee are putative zinc-binding/DNA-binding proteins. Mis18 are proteins involved in the priming of centromeres for recruiting CENP-A. Mis18-alpha and beta form part of a small complex with Mis18-binding protein. Mis18-alpha is found to interact with DNA de-methylases through a Leu-rich region located at its carboxyl terminus. This entry also includes the CULT domain proteins such as Cereblon. 100
46511 397369 pfam03227 GILT Gamma interferon inducible lysosomal thiol reductase (GILT). This family includes the two characterized human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction. 106
46512 281252 pfam03228 Adeno_VII Adenoviral core protein VII. The function of this protein is unknown. It has a conserved amino terminus of 50 residues followed by a positively charged tail, suggesting it may interact with nucleic acid. The major core protein of the adenovirus, protein VII, was found to be associated with viral DNA throughout infection. The precursor to protein VII were shown to be in vivo and in vitro acceptors of ADP-ribose. The ADP-ribosylated core proteins were assembled into mature virus particles. ADP-ribosylation of adenovirus core proteins may have a role in virus decapsidation. 142
46513 251813 pfam03229 Alpha_GJ Alphavirus glycoprotein J. 126
46514 397370 pfam03230 Antirestrict Antirestriction protein. This family includes various proteins that are involved in antirestriction. The ArdB protein efficiently inhibits restriction by members of the three known families of type I systems of E. coli. 92
46515 251815 pfam03231 Bunya_NS-S_2 Bunyavirus non-structural protein NS-S. This family represents the Bunyavirus NS-S family. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. 444
46516 397371 pfam03232 COQ7 Ubiquinone biosynthesis protein COQ7. Members of this family contain two repeats of about 90 amino acids, that contains two conserved motifs. One of these DXEXXH may be part of an enzyme active site. 171
46517 251817 pfam03233 Cauli_AT Aphid transmission protein. This protein is found in various caulimoviruses. It codes for an 18 kDa protein (PII), which is dispensable for infection but which is required for aphid transmission of the virus. This protein interacts with the PIII protein. 163
46518 397372 pfam03234 CDC37_N Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp (Heat shock protein) 90-binding domain pfam08565. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function. 123
46519 397373 pfam03235 DUF262 Protein of unknown function DUF262. 177
46520 397374 pfam03237 Terminase_6 Terminase-like family. This family represents a group of terminase proteins. 203
46521 281258 pfam03238 ESAG1 ESAG protein. Expression-site-associated gene (ESAG) proteins are thought to be involved in VSG activation. This family includes ESAG 117A as well as ESAG IM. 227
46522 251822 pfam03239 FTR1 Iron permease FTR1 family. 284
46523 397375 pfam03241 HpaB 4-hydroxyphenylacetate 3-hydroxylase C terminal. HpaB encodes part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli. HpaB is part of a heterodimeric enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis. 196
46524 397376 pfam03242 LEA_3 Late embryogenesis abundant protein. Members of this family are similar to late embryogenesis abundant proteins. Members of the family have been isolated in a number of different screens. However, the molecular function of these proteins remains obscure. 93
46525 377009 pfam03243 MerB Alkylmercury lyase. Alkylmercury lyase (EC:4.99.1.2) cleaves the carbon-mercury bond of organomercurials such as phenylmercuric acetate. 126
46526 367413 pfam03244 PSI_PsaH Photosystem I reaction centre subunit VI. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. 139
46527 397377 pfam03245 Phage_lysis Bacteriophage Rz lysis protein. This protein is involved in host lysis. This family is not considered to be a peptidase according to the MEROPs database. This family Rz and the Rz1 protein (pfam06085) represent a unique example of two genes located in different reading frames in the same nucleotide sequence, which encode different proteins that are both required in the same physiological pathway. 126
46528 397378 pfam03246 Pneumo_ncap Pneumovirus nucleocapsid protein. 390
46529 397379 pfam03247 Prothymosin Prothymosin/parathymosin family. Prothymosin alpha and parathymosin are two ubiquitous small acidic nuclear proteins that are thought to be involved in cell cycle progression, proliferation, and cell differentiation. 111
46530 397380 pfam03248 Rer1 Rer1 family. RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C-terminus of yeast Rer1p interacts with a coatomer complex. 167
46531 397381 pfam03249 TSA Type specific antigen. There are several antigenic variants in Rickettsia tsutsugamushi, and a type-specific antigen (TSA) of 56-kilodaltons located on the rickettsial surface is responsible for the variation. TSA proteins are probably integral membrane proteins. 510
46532 397382 pfam03250 Tropomodulin Tropomodulin. Tropomodulin is a novel tropomyosin regulatory protein that binds to the end of erythrocyte tropomyosin and blocks head-to-tail association of tropomyosin along actin filaments. Limited proteolysis shows this protein is composed of two domains. The amino terminal domain contains the tropomyosin binding function. 143
46533 281269 pfam03251 Tymo_45kd_70kd Tymovirus 45/70Kd protein. Tymoviruses are single stranded RNA viruses. This family includes a protein of unknown function that has been named based on its molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Two of these overlap: this protein overlaps a larger ORF that is thought to be the polymerase. 468
46534 281270 pfam03252 Herpes_UL21 Herpesvirus UL21. The UL21 protein appears to be a dispensable component in herpesviruses. 524
46535 397383 pfam03253 UT Urea transporter. Members of this family transport urea across membranes. The family includes a bacterial homolog. 292
46536 397384 pfam03254 XG_FTase Xyloglucan fucosyltransferase. Plant cell walls are crucial for development, signal transduction, and disease resistance in plants. Cell walls are made of cellulose, hemicelluloses, and pectins. Xyloglucan (XG), the principal load-bearing hemicellulose of dicotyledonous plants, has a terminal fucosyl residue. This fucosyltransferase adds this residue. 458
46537 397385 pfam03255 ACCA Acetyl co-enzyme A carboxylase carboxyltransferase alpha subunit. Acetyl co-enzyme A carboxylase carboxyltransferase is composed of an alpha and beta subunit. 144
46538 367420 pfam03256 ANAPC10 Anaphase-promoting complex, subunit 10 (APC10). 185
46539 281275 pfam03257 Adhesin_P1 Mycoplasma adhesin P1. This family corresponds to a short 100 residue region found in adhesins from Mycoplasmas. 91
46540 367421 pfam03258 Baculo_FP Baculovirus FP protein. The FP protein is missing in baculovirus (Few Polyhedra) mutants. 147
46541 397386 pfam03259 Robl_LC7 Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. 91
46542 397387 pfam03260 Lipoprotein_11 Lepidopteran low molecular weight (30 kD) lipoprotein. 251
46543 397388 pfam03261 CDK5_activator Cyclin-dependent kinase 5 activator protein. 356
46544 281280 pfam03262 Corona_6B_7B Coronavirus 6B/7B protein. 206
46545 281281 pfam03263 Cucumo_2B Cucumovirus protein 2B. This protein may be a viral movement protein. 105
46546 397389 pfam03264 Cytochrom_NNT NapC/NirT cytochrome c family, N-terminal region. Within the NapC/NirT family of cytochrome c proteins, some members, such as NapC and NirT, bind four haem groups, while others, such as TorC, bind five haems. This family aligns the common N-terminal region that contains four haem-binding C-X(2)-CH motifs. 174
46547 397390 pfam03265 DNase_II Deoxyribonuclease II. 312
46548 397391 pfam03266 NTPase_1 NTPase. This domain is found across all species from bacteria to human, and the function was determined first in a hyperthermophilic bacterium to be an NTPase. The structure of one member-sequence represents a variation of the RecA fold, and implies that the function might be that of a DNA/RNA modifying enzyme. The sequence carries both a Walker A and Walker B motif which together are characteristic of ATPases or GTPases. The protein exhibits an increased expression profile in human liver cholangiocarcinoma when compared to normal tissue. 168
46549 367428 pfam03268 DUF267 Caenorhabditis protein of unknown function, DUF267. 360
46550 367429 pfam03269 DUF268 Caenorhabditis protein of unknown function, DUF268. 176
46551 397392 pfam03270 DUF269 Protein of unknown function, DUF269. Members of this family may be involved in nitrogen fixation, since they are found within nitrogen fixation operons. 121
46552 397393 pfam03271 EB1 EB1-like C-terminal motif. This motif is found at the C-terminus of proteins that are related to the EB1 protein. The EB1 proteins contain an N-terminal CH domain pfam00307. The human EB1 protein was originally discovered as a protein interacting with the C-terminus of the APC protein. This interaction is often disrupted in colon cancer, due to deletions affecting the APC C-terminus. Several EB1 orthologues are also included in this family. The interaction between EB1 and APC has been shown to have a potent synergistic effect on microtubule polymerization. Neither EB1 nor APC alone has this effect. It is thought that EB1 targets APC to the + ends of microtubules, where APC promotes microtubule polymerization. This process is regulated by APC phosphorylation by Cdc2, which disrupts APC-EB1 binding. Human EB1 protein can functionally substitute for the yeast EB1 homolog Mal3. In addition, Mal3 can substitute for human EB1 in promoting microtubule polymerization with APC. 39
46553 397394 pfam03272 Mucin_bdg Putative mucin or carbohydrate-binding module. This family is the putative binding domain for the substrates of enhancin, and other similar metallopeptidases. This is not the enzymically active, peptidase, part of the proteins - see pfam13402. 116
46554 397395 pfam03273 Baculo_gp64 Baculovirus gp64 envelope glycoprotein family. This family includes the gp64 glycoprotein from baculovirus as well as other viruses. 506
46555 281291 pfam03274 Foamy_BEL Foamy virus BEL 1/2 protein. 301
46556 397396 pfam03275 GLF UDP-galactopyranose mutase. 203
46557 281293 pfam03276 Gag_spuma Spumavirus gag protein. 614
46558 281294 pfam03277 Herpes_UL4 Herpesvirus UL4 family. 187
46559 281295 pfam03278 IpaB_EvcA IpaB/EvcA family. This family includes IpaB, which is an invasion plasmid antigen from Shigella, as well as EvcA from E. coli. Members of this family seem to be involved in pathogenicity of some enterobacteria. However the exact function of this component is not clear. 144
46560 281296 pfam03279 Lip_A_acyltrans Bacterial lipid A biosynthesis acyltransferase. 294
46561 397397 pfam03280 Lipase_chap Proteobacterial lipase chaperone protein. 193
46562 397398 pfam03281 Mab-21 Mab-21 protein. This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 287
46563 397399 pfam03283 PAE Pectinacetylesterase. 355
46564 397400 pfam03284 PHZA_PHZB Phenazine biosynthesis protein A/B. 161
46565 397401 pfam03285 Paralemmin Paralemmin. 312
46566 281302 pfam03286 Pox_Ag35 Pox virus Ag35 surface protein. 196
46567 308745 pfam03287 Pox_C7_F8A Poxvirus C7/F8A protein. 147
46568 367437 pfam03288 Pox_D5 Poxvirus D5 protein-like. This family includes D5 from Poxviruses which is necessary for viral DNA replication, and is a nucleic acid independent nucleoside triphosphatase. Members of this family are also found outside of poxviruses. This domain is a DNA-binding winged HTH domain. 85
46569 281305 pfam03289 Pox_I1 Poxvirus protein I1. 307
46570 281306 pfam03290 Peptidase_C57 Vaccinia virus I7 processing peptidase. 425
46571 281307 pfam03291 Pox_MCEL mRNA capping enzyme. This family of enzymes are related to pfam03919. 332
46572 281308 pfam03292 Pox_P4B Poxvirus P4B major core protein. 657
46573 397402 pfam03293 Pox_RNA_pol Poxvirus DNA-directed RNA polymerase, 18 kD subunit. 160
46574 397403 pfam03294 Pox_Rap94 RNA polymerase-associated transcription specificity factor, Rap94. 796
46575 281311 pfam03295 Pox_TAA1 Poxvirus trans-activator protein A1 C-terminal. 63
46576 281312 pfam03296 Pox_polyA_pol Poxvirus poly(A) polymerase nucleotidyltransferase domain. 147
46577 397404 pfam03297 Ribosomal_S25 S25 ribosomal protein. 100
46578 397405 pfam03298 Stanniocalcin Stanniocalcin family. 200
46579 397406 pfam03299 TF_AP-2 Transcription factor AP-2. 196
46580 281316 pfam03300 Tenui_NS4 Tenuivirus non-structural, movement protein NS4. 282
46581 281317 pfam03301 Trp_dioxygenase Tryptophan 2,3-dioxygenase. 346
46582 146106 pfam03302 VSP Giardia variant-specific surface protein. 397
46583 281318 pfam03303 WTF WTF protein. This is a family of hypothetical Schizosaccharomyces pombe proteins. Their function is unknown. 237
46584 308750 pfam03304 Mlp Mlp lipoprotein family. The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species. This family were previously known as 2.9 lipoprotein genes. These surface expressed genes may represent new candidate vaccinogens for Lyme disease. Members of this family generally are downstream of four ORFs called A,B,C and D that are involved in hemolytic activity. 121
46585 397407 pfam03305 Lipoprotein_X Mycoplasma MG185/MG260 protein. Most of the aligned regions in this family are found towards the middle of the member proteins. 183
46586 397408 pfam03306 AAL_decarboxy Alpha-acetolactate decarboxylase. 219
46587 281322 pfam03307 Adeno_E3_15_3 Adenovirus 15.3kD protein in E3 region. 117
46588 281323 pfam03308 ArgK ArgK protein. The ArgK protein acts as an ATPase enzyme and as a kinase, and phosphorylates periplasmic binding proteins involved in the LAO (lysine, arginine, ornithine)/AO transport systems. 272
46589 397409 pfam03309 Pan_kinase Type III pantothenate kinase. Type III pantothenate kinase catalyzes the phosphorylation of pantothenate (Pan), the first step in the universal pathway of CoA biosynthesis. 204
46590 251863 pfam03310 Cauli_DNA-bind Caulimovirus DNA-binding protein. 121
46591 397410 pfam03311 Cornichon Cornichon protein. 122
46592 397411 pfam03312 DUF272 Protein of unknown function (DUF272). This family of proteins is restricted to C.elegans and has no known function. The protein contains a ubiquitin fold. The GO annotation for the protein indicates that it has a function in nematode larval development. 123
46593 397412 pfam03313 SDH_alpha Serine dehydratase alpha chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. 259
46594 367444 pfam03314 DUF273 Protein of unknown function, DUF273. 219
46595 397413 pfam03315 SDH_beta Serine dehydratase beta chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. 146
46596 397414 pfam03317 ELF ELF protein. This is a family of hypothetical proteins from cereal crops. 123
46597 397415 pfam03318 ETX_MTX2 Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2. This family appears to be distantly related to pfam01117. 222
46598 397416 pfam03319 EutN_CcmL Ethanolamine utilisation protein EutN/carboxysome. The crystal structure of EutN contains a central five-stranded beta-barrel, with an alpha-helix at the open end of this barrel (Structure 2HD3). The structure also contains three additional beta-strands, which help the formation of a tight hexamer, with a hole in the center. This suggests that EutN forms a pore, with an opening of 26 Angstrom in diameter on one face and 14 Angstrom on the other face. EutN is involved in the cobalamin-dependent degradation of ethanolamine. 83
46599 397417 pfam03320 FBPase_glpX Bacterial fructose-1,6-bisphosphatase, glpX-encoded. 304
46600 397418 pfam03321 GH3 GH3 auxin-responsive promoter. 529
46601 397419 pfam03323 GerA Bacillus/Clostridium GerA spore germination protein. 467
46602 367447 pfam03324 Herpes_HEPA Herpesvirus DNA helicase/primase complex associated protein. This family includes HSV UL8, EHV-1 54, VZV 52 AND HCMV 102. 95
46603 397420 pfam03325 Herpes_PAP Herpesvirus polymerase accessory protein. The same proteins are also known as polymerase processivity factors. 166
46604 308764 pfam03326 Herpes_TAF50 Herpesvirus transcription activation factor (transactivator). This family includes EBV BRLF1 and similar ORF 50 proteins from other herpesviruses. 568
46605 397421 pfam03327 Herpes_VP19C Herpesvirus capsid shell protein VP19C. 262
46606 397422 pfam03328 HpcH_HpaI HpcH/HpaI aldolase/citrate lyase family. This family includes 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase and 4-hydroxy-2-oxovalerate aldolase. 221
46607 397423 pfam03330 DPBB_1 Lytic transglycolase. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N-terminus of pollen allergen. Recent studies show that the full-length RlpA protein from Pseudomonas aeruginosa is an outer membrane protein that is a lytic transglycolase with specificity for peptidoglycan lacking stem peptides. Residue D157 in UniProtKB:Q9X6V6 is critical for lytic activity. 82
46608 397424 pfam03331 LpxC UDP-3-O-acyl N-acetylglycosamine deacetylase. The enzymes in this family catalyze the second step in the biosynthetic pathway for lipid A. 271
46609 397425 pfam03332 PMM Eukaryotic phosphomannomutase. This enzyme EC:5.4.2.8 is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions. 220
46610 377022 pfam03333 PapB Adhesin biosynthesis transcription regulatory protein. This family includes PapB, DaaA, FanA, FanB, and AfaA. 91
46611 397426 pfam03334 PhaG_MnhG_YufB Na+/H+ antiporter subunit. This family includes PhaG from Rhizobium meliloti, MnhG from Staphylococcus aureus, YufB from Bacillus subtilis. 71
46612 335296 pfam03335 Phage_fiber Phage tail fibre repeat. 14
46613 281347 pfam03336 Pox_C4_C10 Poxvirus C4/C10 protein. 322
46614 281348 pfam03337 Pox_F12L Poxvirus F12L protein. 649
46615 281349 pfam03338 Pox_J1 Poxvirus J1 protein. 145
46616 281350 pfam03339 Pox_L3_FP4 Poxvirus L3/FP4 protein. 316
46617 397427 pfam03340 Pox_Rif Poxvirus rifampicin resistance protein. 542
46618 281352 pfam03341 Pox_mRNA-cap Poxvirus mRNA capping enzyme, small subunit. The small subunit of the poxvirus mRNA capping enzyme has been found to have a structure which suggests that it started life as an RNA cap 2-prime O-methyltransferase. It has subsequently evolved to a catalytically inactive form that has been retained in order to help stabilize the large subunit, D1, and to enhance its methyltransferase activity through an allosteric mechanism. 286
46619 281353 pfam03342 Rhabdo_M1 Rhabdovirus M1 matrix protein (M1 polymerase-associated protein). 227
46620 397428 pfam03343 SART-1 SART-1 family. SART-1 is a protein involved in cell cycle arrest and pre-mRNA splicing. It has been shown to be a component of U4/U6 x U5 tri-snRNP complex in human, Schizosaccharomyces pombe and Saccharomyces cerevisiae. SART-1 is a known tumor antigen in a range of cancers recognized by T cells. 569
46621 397429 pfam03344 Daxx Daxx N-terminal Rassf1C-interacting domain. The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis. Daxx forms a complex with Axin. Remodelling of the family to a short domain based on the Structure 2kzs structure gives a more representative family. DAXX is a scaffold protein shown to play diverse roles in transcription and cell cycle regulation. This N-terminal domain folds into a left-handed four-helix bundle (H1, H2, H4, H5) that binds to the N-terminal residues of the tumor-suppressor Rassf1C. 95
46622 397430 pfam03345 DDOST_48kD Oligosaccharyltransferase 48 kDa subunit beta. Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase EC:2.4.1.119, transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kD subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45kD polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kD subunit and the other orthologues. This family includes the 48kD-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1. 412
46623 281357 pfam03347 TDH Vibrio thermostable direct hemolysin. 165
46624 397431 pfam03348 Serinc Serine incorporator (Serinc). This is a family of eukaryotic membrane proteins which incorporate serine into membranes and facilitate the synthesis of the serine-derived lipids phosphatidylserine and sphingolipid. Members of this family contain 11 transmembrane domains and form intracellular complexes with key enzymes involved in serine and sphingolipid biosynthesis. 421
46625 397432 pfam03349 Toluene_X Outer membrane protein transport protein (OMPP1/FadL/TodX). This family includes TodX from Pseudomonas putida F1 and TbuX from Ralstonia pickettii PKO1. These are membrane proteins of uncertain function that are involved in toluene catabolism. Related proteins involved in the degradation of similar aromatic hydrocarbons are also in this family, such as CymD. This family also includes FadL involved in translocation of long-chain fatty acids across the outer membrane. It is also a receptor for the bacteriophage T2. 419
46626 397433 pfam03350 UPF0114 Uncharacterized protein family, UPF0114. 118
46627 397434 pfam03351 DOMON DOMON domain. The DOMON (named after dopamine beta-monooxygenase N-terminal) domain is 110-125 residues long. It is predicted to form an all beta fold with up to 11 strands and is secreted to the extracellular compartment. The beta-strand folding produces a hydrophobic pocket which appears to bind soluble haem. This is consistent with the predominant architectures where the protein is associated with cytochromes or enzymatic domains whose activity involves redox or electron transfer reactions potentially as a direct participant in the electron transfer process. The DOMON domain superfamily, of which this is just one member, shows (1) multiple hydrophobic residues that contribute to the hydrophobic core of the strands of the beta-sandwich, and small residues found at the boundaries of strands and loops, (2) a strongly conserved charged residue (usually arginine/lysine) at the end of strand 9, which possibly stabilizes the loop between 9 and 10, and (3) a polar residue (usually histidine, lysine or arginine), that interacts or coordinates with ligands. The suggested superfamily includes both haem- and sugar-binding members: the haem-binding families being the ethyl-Benzoate dehydrogenase family EB_dh, pfam09459, the cellobiose dehydrogenase family CBDH and this family, and the sugar-binding families being the xylanases, CBM_4_9, pfam02018. The common feature of the superfamily is the 11-beta-strand structure, although the first and eleventh strands are not well conserved either within families or between families. 119
46628 397435 pfam03352 Adenine_glyco Methyladenine glycosylase. The DNA-3-methyladenine glycosylase I is constitutively expressed and is specific for the alkylated 3-methyladenine DNA. 177
46629 397436 pfam03353 Lin-8 Ras-mediated vulval-induction antagonist. LIN-8 is a nuclear protein, present at the sites of transcriptional repressor complexes, which interacts with LIN-35 Rb. LIN-35 Rb is a product of the class B synMuv gene lin-35 which silences genes required for vulval specification through chromatin modification and remodelling. The biological role of the interaction has not yet been determined; however, predictions have been made. The interaction shows that class A synMuv genes control vulval induction through the transcriptional regulation of gene expression. LIN-8 normally functions as part of a protein complex; however, when the complex is absent, other family members can partially replace LIN-8 activity. 309
46630 281364 pfam03354 Terminase_1 Phage Terminase. The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function. 466
46631 146143 pfam03355 Pox_TAP Viral Trans-Activator Protein. These proteins function as a trans-activator of viral late genes. 260
46632 281365 pfam03356 Pox_LP_H2 Viral late protein H2. All Members of this family show similarity to the vaccinia virus late protein H2. This protein is often referred to by its gene name of H2R. Members from this family all belong to the viral taxon Poxviridae. 188
46633 397437 pfam03357 Snf7 Snf7. This family of proteins are involved in protein sorting and transport from the endosome to the vacuole/lysosome in eukaryotic cells. Vacuoles/lysosomes play an important role in the degradation of both lipids and cellular proteins. In order to perform this degradative function, vacuoles/lysosomes contain numerous hydrolases which have been transported in the form of inactive precursors via the biosynthetic pathway and are proteolytically activated upon delivery to the vacuole/lysosome. The delivery of transmembrane proteins, such as activated cell surface receptors to the lumen of the vacuole/lysosome, either for degradation/downregulation, or in the case of hydrolases, for proper localization, requires the formation of multivesicular bodies (MVBs). These late endosomal structures are formed by invaginating and budding of the limiting membrane into the lumen of the compartment. During this process, a subset of the endosomal membrane proteins is sorted into the forming vesicles. Mature MVBs fuse with the vacuole/lysosome, thereby releasing cargo containing vesicles into its hydrolytic lumen for degradation. Endosomal proteins that are not sorted into the intralumenal MVB vesicles are either recycled back to the plasma membrane or Golgi complex, or remain in the limiting membrane of the MVB and are thereby transported to the limiting membrane of the vacuole/lysosome as a consequence of fusion. Therefore, the MVB sorting pathway plays a critical role in the decision between recycling and degradation of membrane proteins. A few archaeal sequences are also present within this family. 170
46634 397438 pfam03358 FMN_red NADPH-dependent FMN reductase. 151
46635 397439 pfam03359 GKAP Guanylate-kinase-associated protein (GKAP) protein. 341
46636 397440 pfam03360 Glyco_transf_43 Glycosyltransferase family 43. 202
46637 281370 pfam03361 Herpes_IE2_3 Herpes virus intermediate/early protein 2/3. These viral sequences are similar to UL117 protein of human and chimpanzee cytomegalovirus, and to intermediate/early proteins 2 and 3 of certain herpes viruses. UL117 is thought to be a glycoprotein that is expressed at early and late times after infection. This region is close to the C-terminus of the protein and may be a transmembrane region. 162
46638 281371 pfam03362 Herpes_UL47 Herpesvirus UL47 protein. 448
46639 281372 pfam03363 Herpes_LP Herpesvirus leader protein. 174
46640 397441 pfam03364 Polyketide_cyc Polyketide cyclase / dehydrase and lipid transport. This family contains polyketide cyclases/dehydrases which are enzymes involved in polyketide synthesis. The family also includes proteins which are involved in the binding/transport of lipids. 125
46641 397442 pfam03366 YEATS YEATS family. We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity. 85
46642 397443 pfam03367 zf-ZPR1 ZPR1 zinc-finger domain. The zinc-finger protein ZPR1 is ubiquitous among eukaryotes. It is indeed known to be an essential protein in yeast. In quiescent cells, ZPR1 is localized to the cytoplasm. But in proliferating cells treated with EGF or with other mitogens, ZPR1 accumulates in the nucleolus. ZPR1 interacts with the cytoplasmic domain of the inactive EGF receptor (EGFR) and is thought to inhibit the basal protein tyrosine kinase activity of EGFR. This interaction is disrupted when cells are treated with EGF, though by themselves, inactive EGFRs are not sufficient to sequester ZPR1 to the cytoplasm. Upon stimulation by EGF, ZPR1 directly binds the eukaryotic translation elongation factor-1alpha (eEF-1alpha) to form ZPR1/eEF-1alpha complexes. These move into the nucleus, localising particularly at the nucleolus. Indeed, the interaction between ZPR1 and eEF-1alpha has been shown to be essential for normal cellular proliferation, and ZPR1 is thought to be involved in pre-ribosomal RNA expression. The ZPR1 domain consists of an elongation initiation factor 2-like zinc finger and a double-stranded beta helix with a helical hairpin insertion. ZPR1 binds preferentially to GDP-bound eEF1A but does not directly influence the kinetics of nucleotide exchange or GTP hydrolysis. The alignment for this family shows a domain of which there are two copies in ZPR1 proteins. This family also includes several hypothetical archaeal proteins (from both Crenarchaeota and Euryarchaeota), which only contain one copy of the aligned region. This similarity between ZPR1 and archaeal proteins was not previously noted. 159
46643 397444 pfam03368 Dicer_dimer Dicer dimerization domain. This domain is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules. It is essential for the activity of Dicer. It is a divergent double stranded RNA-binding domain. The N-terminal alpha helix of this domain is in a different orientation to that found in canonical dsRNA-binding domains. This results in a change of charge distribution at the potential dsRNA-binding surface and in the N- and C-termini of the domain being in close proximity. This domain has weak dsRNA-binding activity. It mediates heterodimerization of Dicer proteins with their respective protein partners. 90
46644 281377 pfam03369 Herpes_UL3 Herpesvirus UL3 protein. 135
46645 397445 pfam03370 CBM_21 Carbohydrate/starch-binding module (family 21). This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, phosphorylase a: these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localizes it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain. 113
46646 397446 pfam03371 PRP38 PRP38 family. Members of this family are related to the pre-mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation. 166
46647 397447 pfam03372 Exo_endo_phos Endonuclease/Exonuclease/phosphatase family. This large family of proteins includes magnesium dependent endonucleases and a large number of phosphatases involved in intracellular signalling. This family includes: AP endonuclease proteins EC:4.2.99.18, DNase I proteins EC:3.1.21.1, Synaptojanin an inositol-1,4,5-trisphosphate phosphatase EC:3.1.3.56, Sphingomyelinase EC:3.1.4.12, and Nocturnin. 228
46648 281381 pfam03373 Octapeptide Octapeptide repeat. This octapeptide repeat is found in several bacterial proteins. The function of this repeat is unknown. 8
46649 397448 pfam03374 ANT Phage antirepressor protein KilAC domain. This domain was called the KilAC domain by Iyer and colleagues. 105
46650 281383 pfam03376 Adeno_E3B Adenovirus E3B protein. 67
46651 397449 pfam03377 TAL_effector TAL effector repeat. The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices. 33
46652 367469 pfam03378 CAS_CSE1 CAS/CSE protein, C-terminus. Mammalian cellular apoptosis susceptibility (CAS) proteins are homologous to the yeast chromosome-segregation protein, CSE1. This family aligns the C-terminal halves (approximately). CAS is involved in both cellular apoptosis and proliferation. Apoptosis is inhibited in CAS-depleted cells, while the expression of CAS correlates to the degree of cellular proliferation. Like CSE1, it is essential for the mitotic checkpoint in the cell cycle (CAS depletion blocks the cell in the G2 phase), and has been shown to be associated with the microtubule network and the mitotic spindle, as is the protein MEK, which is thought to regulate the intracellular localization (predominantly nuclear vs. predominantly cytosolic) of CAS. In the nucleus, CAS acts as a nuclear transport factor in the importin pathway. The importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell cycle through its effect on the nuclear transport of these proteins. Since apoptosis also requires the nuclear import of several proteins (such as P53 and transcription factors), it has been suggested that CAS also enables apoptosis by facilitating the nuclear import of at least a subset of these essential proteins. 435
46653 281385 pfam03379 CcmB CcmB protein. CcmB is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor. CcmB is required for the export of haem to the periplasm. 215
46654 367470 pfam03380 DUF282 Caenorhabditis protein of unknown function, DUF282. 38
46655 397450 pfam03381 CDC50 LEM3 (ligand-effect modulator 3) family / CDC50 family. Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w, and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39. 276
46656 397451 pfam03382 DUF285 Mycoplasma protein of unknown function, DUF285. This region appears distantly related to leucine rich repeats. 120
46657 251914 pfam03383 Serpentine_r_xa Caenorhabditis serpentine receptor-like protein, class xa. This family contains various Caenorhabditis proteins, some of which are annotated as being serpentine receptors, mainly of the xa class. 153
46658 146165 pfam03384 DUF287 Drosophila protein of unknown function, DUF287. 55
46659 308795 pfam03385 STELLO STELLO glycosyltransferases. This domain family is found in Metazoa and in Viridiplantae. Two of the family members are characterized in Arabidopsis thaliana and named STELLO1 (STL1) and STELLO2 (STL2) respectively. They are Golgi-localized proteins that can interact with CesAs (cellulose synthase A) and control cellulose quantity. In the absence of STELLO function, the spatial distribution within the Golgi, secretion and activity of the CSCs are impaired indicating a central role of the STELLO proteins in CSC assembly. Point mutations in the predicted catalytic domains of the STELLO proteins indicate that they are glycosyltransferases facing the Golgi lumen. STL homologs are present throughout the plant kingdom, but STL proteins are distinct from distantly related proteins in nematodes, fungi and molluscs. 388
46660 397452 pfam03386 ENOD93 Early nodulin 93 ENOD93 protein. 78
46661 281391 pfam03387 Herpes_UL46 Herpesvirus UL46 protein. 435
46662 397453 pfam03388 Lectin_leg-like Legume-like lectin family. Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first recognized members of a family of animal lectins similar (19-24%) to the leguminous plant lectins. The alignment for this family aligns residues lying towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. However, while Fiedler and Simons identified these proteins as a new family of animal lectins, our alignment also includes yeast sequences. ERGIC-53 is a 53kD protein, localized to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin. Its dysfunction has been associated with combined factors V and VIII deficiency OMIM:227300 OMIM:601567, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway. 226
46663 397454 pfam03389 MobA_MobL MobA/MobL family. This family includes the MobA protein from the E. coli plasmid RSF1010, and the MobL protein from the Thiobacillus ferrooxidans plasmid PTF1. These sequences are mobilisation proteins, which are essential for specific plasmid transfer. 217
46664 397455 pfam03390 2HCT 2-hydroxycarboxylate transporter family. The 2-hydroxycarboxylate transporter family is a family of secondary transporters found exclusively in the bacterial kingdom. They function in the metabolism of the di- and tricarboxylates malate and citrate, mostly in fermentative pathways involving decarboxylation of malate or oxaloacetate. 415
46665 397456 pfam03391 Nepo_coat Nepovirus coat protein, central domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 167
46666 397457 pfam03392 OS-D Insect pheromone-binding family, A10/OS-D. 93
46667 308800 pfam03393 Pneumo_matrix Pneumovirus matrix protein. 252
46668 281397 pfam03394 Pox_E8 Poxvirus E8 protein. 238
46669 281398 pfam03395 Pox_P4A Poxvirus P4A protein. 882
46670 397458 pfam03396 Pox_RNA_pol_35 Poxvirus DNA-directed RNA polymerase, 35 kD subunit. 293
46671 281400 pfam03397 Rhabdo_matrix Rhabdovirus matrix protein. 168
46672 397459 pfam03398 Ist1 Regulator of Vps4 activity in the MVB pathway. ESCRT-I, -II, and -III are endosomal sorting complexes required for transporting proteins and carry out cargo sorting and vesicle formation in the multivesicular bodies, MVBs, pathway. These complexes are transiently recruited from the cytoplasm to the endosomal membrane where they bind transmembrane proteins previously marked for degradation by mono-ubiquitination. Assembly of ESCRT-III, a complex composed of at least four subunits (Vps2, Vps24, Vps20, Snf7), is intimately linked with MVB vesicle formation, its disassembly being an essential step in the MVB vesicle formation, a reaction that is carried out by Vps4, an AAA-type ATPase. The family Ist1 is a regulator of Vps4 activity; by interacting with Did2 and Vps4, Ist1 appears to regulate the recruitment and oligomerization of Vps4. Together Ist1, Did2, and Vta1 form a network of interconnected regulatory proteins that modulate Vps4 activity, thereby regulating the flow of cargo through the MVB pathway. 164
46673 397460 pfam03399 SAC3_GANP SAC3/GANP family. This family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localization of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. 293
46674 281403 pfam03400 DDE_Tnp_IS1 IS1 transposase. Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases. 131
46675 397461 pfam03401 TctC Tripartite tricarboxylate transporter family receptor. These probable extra-cytoplasmic solute receptors are strongly overrepresented in several beta-proteobacteria. This family, formerly known as Bug - Bordetella uptake gene (bug) product - is a family of bacterial tripartite tricarboxylate receptors of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. The TctABC system has been characterized in S. typhimurium, and TctC is the extracytoplasmic tricarboxylate-binding receptor which binds the transporters TctA and TctB, two integral membrane proteins. Complete three-component systems are found only in bacteria. 274
46676 112227 pfam03402 V1R Vomeronasal organ pheromone receptor family, V1R. This family represents one of two known vomeronasal organ receptor families, the V1R family. 265
46677 397462 pfam03403 PAF-AH_p_II Platelet-activating factor acetylhydrolase, isoform II. Platelet-activating factor acetylhydrolase (PAF-AH) is a subfamily of phospholipases A2, responsible for inactivation of platelet-activating factor through cleavage of an acetyl group. Three known PAF-AHs are the brain heterotrimeric PAF-AH Ib, whose catalytic beta and gamma subunits are aligned in pfam02266, the extracellular, plasma PAF-AH (pPAF-AH), and the intracellular PAF-AH isoform II (PAF-AH II). This family aligns pPAF-AH and PAF-AH II, whose similarity was previously noted. 372
46678 397463 pfam03404 Mo-co_dimer Mo-co oxidoreductase dimerization domain. This domain is found in molybdopterin cofactor (Mo-co) oxidoreductases. It is involved in dimer formation, and has an Ig-fold structure. 136
46679 397464 pfam03405 FA_desaturase_2 Fatty acid desaturase. 319
46680 397465 pfam03406 Phage_fiber_2 Phage tail fibre repeat. This repeat is found in the tail fibers of phage. For example protein K. The repeats are about 40 residues long. 38
46681 367480 pfam03407 Nucleotid_trans Nucleotide-diphospho-sugar transferase. Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases. 208
46682 281409 pfam03408 Foamy_virus_ENV Foamy virus envelope protein. Expression of the envelope (Env) glycoprotein is essential for viral particle egress. This feature is unique to the Spumavirinae, a subclass of the Retroviridae. 984
46683 397466 pfam03409 Glycoprotein Transmembrane glycoprotein. This family of proteins has some GO annotations for positive regulation of growth rate and nematode larval development. This is probably a family of membrane glycoproteins. 351
46684 281411 pfam03410 Peptidase_M44 Metallopeptidase from vaccinia pox. This is a family of Poxviridae metalloendopeptidases. The members were often originally named as G1 proteins. The family carries three zinc-binding ligands and a catalytic glutamate. The first two zinc ligands are histidine residues, found together with the catalytic glutamate in a HXXEH motif, an inverse of the classical metallopeptidase motif, HEXXH. The third zinc ligand is a glutamate C-terminal to the HXXEH motif within a motif ELENEY (see MEROPS). 596
46685 397467 pfam03411 Peptidase_M74 Penicillin-insensitive murein endopeptidase. 242
46686 367483 pfam03412 Peptidase_C39 Peptidase C39 family. Lantibiotic and non-lantibiotic bacteriocins are synthesized as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved. Peptide bacteriocins are exported across the cytoplasmic membrane by a dedicated ATP-binding cassette (ABC) transporter. The ABC transporter is the maturation protease and its proteolytic domain resides in the N-terminal part of the protein. This peptidase domain is found in a wide range of ABC transporters, however the presumed catalytic cysteine and histidine are not conserved in all members of this family. 133
46687 397468 pfam03413 PepSY Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). This model is likely to miss some members of this family as the separation from signal to noise is not clear. The name is derived from Peptidase & Bacillus subtilis YPEB. 59
46688 397469 pfam03414 Glyco_transf_6 Glycosyltransferase family 6. 289
46689 397470 pfam03415 Peptidase_C11 Clostripain family. 354
46690 397471 pfam03416 Peptidase_C54 Peptidase family C54. 271
46691 397472 pfam03417 AAT Acyl-coenzyme A:6-aminopenicillanic acid acyl-transferase. 223
46692 367487 pfam03418 Peptidase_A25 Germination protease. 354
46693 397473 pfam03419 Peptidase_U4 Sporulation factor SpoIIGA. 275
46694 397474 pfam03420 Peptidase_S77 Prohead core protein serine protease. 198
46695 397475 pfam03421 Acetyltransf_14 YopJ Serine/Threonine acetyltransferase. The Yersinia effector YopJ inhibits the innate immune response by blocking MAP kinase and NFkappaB signaling pathways. YopJ is a serine/threonine acetyltransferase which regulates signalling pathways by blocking phosphorylation. Specifically, YopJ has been shown to block phosphorylation of active site residues. It has also been shown that YopJ acetyltransferase is activated by eukaryotic host cell inositol hexakisphosphate. This family was previously incorrectly annotated in Pfam as being a peptidase family. 172
46696 397476 pfam03422 CBM_6 Carbohydrate binding module (family 6). 125
46697 367491 pfam03423 CBM_25 Carbohydrate binding domain (family 25). 101
46698 397477 pfam03424 CBM_17_28 Carbohydrate binding domain (family 17/28). 204
46699 397478 pfam03425 CBM_11 Carbohydrate binding domain (family 11). 175
46700 367493 pfam03426 CBM_15 Carbohydrate binding domain (family 15). 153
46701 112252 pfam03427 CBM_19 Carbohydrate binding domain (family 19). 61
46702 397479 pfam03428 RP-C Replication protein C N-terminal domain. Replication protein C is involved in the early stages of viral DNA replication. 174
46703 367495 pfam03429 MSP1b Major surface protein 1B. The major surface protein (MSP1) of the cattle pathogen Anaplasma is a heterodimer comprised of MSP1a and MSP1b. This family is the MSP1b chain. These MSP1 proteins are putative adhesins for bovine erythrocytes. 769
46704 281430 pfam03430 TATR Trans-activating transcriptional regulator. This family of trans-activating transcriptional regulator (TATR), also known as intermediate early protein 1, are common to the Nucleopolyhedroviruses. 575
46705 367496 pfam03431 RNA_replicase_B RNA replicase, beta-chain. This family is of Leviviridae RNA replicases. The replicase is also known as RNA dependent RNA polymerase. 538
46706 367497 pfam03432 Relaxase Relaxase/Mobilisation nuclease domain. Relaxases/mobilisation proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalyzing trans-esterification. 240
46707 367498 pfam03433 EspA EspA-like secreted protein. EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir are exported by a type III secretion system. These proteins are essential for attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which forms a direct link between the bacterium and the host cell. 180
46708 308824 pfam03434 DUF276 DUF276. This family is specific to Borrelia burgdorferi. The protein is encoded on extra-chromosomal DNA. This domain has no known function. 291
46709 397480 pfam03435 Sacchrp_dh_NADP Saccharopine dehydrogenase NADP binding domain. This family contains the NADP binding domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase. 120
46710 367499 pfam03436 DUF281 Domain of unknown function (DUF281). This family of worm domain has no known function. The boundaries of the presumed domain are rather uncertain. 54
46711 281436 pfam03437 BtpA BtpA family. The BtpA protein is tightly associated with the thylakoid membranes, where it stabilizes the reaction centre proteins of photosystem I. 254
46712 367500 pfam03438 Pneumo_NS1 Pneumovirus NS1 protein. This non-structural protein is one of two found in pneumoviruses. The protein is about 140 amino acids in length. The NS1 protein appears to be important for efficient replication but not essential. The NS1 protein has been shown by yeast two-hybrid to interact with the viral P protein. This protein is also known as the 1C protein. It has also been shown that NS1 can potently inhibit transcription and RNA replication. 136
46713 397481 pfam03439 Spt5-NGN Early transcription elongation factor of RNA pol II, NGN section. Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit. 84
46714 367501 pfam03440 APT Aerolysin/Pertussis toxin (APT) domain. This family represents the N-terminal domain of aerolysin and pertussis toxin and has a type-C lectin like fold. 87
46715 397482 pfam03441 FAD_binding_7 FAD binding domain of DNA photolyase. 201
46716 397483 pfam03442 CBM_X2 Carbohydrate binding domain X2. This domain binds to cellulose and to bacterial cell walls. It is found in glycosyl hydrolases and in scaffolding proteins of cellulosomes (multiprotein glycosyl hydrolase complexes). In the cellulosome it may aid cellulose degradation by anchoring the cellulosome to the bacterial cell wall and by binding it to its substrate. This domain has an Ig-like fold. 83
46717 397484 pfam03443 Glyco_hydro_61 Glycosyl hydrolase family 61. Although weak endoglucanase activity has been demonstrated in several members of this family, they lack the clustered conserved catalytic acidic amino acids present in most glycoside hydrolases. Many members of this family lack measurable cellulase activity on their own, but enhance the activity of other cellulolytic enzymes. They are therefore unlikely to be true glycoside hydrolases. The substrate-binding surface of this family is a flat Ig-like fold. 211
46718 281443 pfam03444 HrcA_DNA-bdg Winged helix-turn-helix transcription repressor, HrcA DNA-binding. This domain is always found with a pair of CBS domains pfam00571. 79
46719 397485 pfam03445 DUF294 Putative nucleotidyltransferase DUF294. This domain is found associated with pfam00571. This region is uncharacterized, however it seems to be similar to pfam01909, conserving the DXD motif. This strongly suggests that members of this family are also nucleotidyltransferases (Bateman A pers. obs.). 138
46720 397486 pfam03446 NAD_binding_2 NAD binding domain of 6-phosphogluconate dehydrogenase. The NAD binding domain of 6-phosphogluconate dehydrogenase adopts a Rossmann fold. 159
46721 281446 pfam03447 NAD_binding_3 Homoserine dehydrogenase, NAD binding domain. This domain adopts a Rossmann NAD binding fold. The C-terminal domain of homoserine dehydrogenase contributes a single helix to this structural domain, which is not included in the Pfam model. 116
46722 397487 pfam03448 MgtE_N MgtE intracellular N domain. This domain is found at the N-terminus of eubacterial magnesium transporters of the MgtE family pfam01769. This domain is an intracellular domain that has an alpha-helical structure. The crystal structure of the MgtE transporter shows two of 5 magnesium ions are in the interface between the N domain and the CBS domains. In the absence of magnesium there is a large shift between the N and CBS domains. 102
46723 397488 pfam03449 GreA_GreB_N Transcription elongation factor, N-terminal. This domain adopts a long alpha-hairpin structure. 71
46724 397489 pfam03450 CO_deh_flav_C CO dehydrogenase flavoprotein C-terminal domain. 103
46725 397490 pfam03451 HELP HELP motif. The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons. 73
46726 397491 pfam03452 Anp1 Anp1. The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins co-localize within the cis Golgi, and they are physically associated in two distinct complexes. 265
46727 397492 pfam03453 MoeA_N MoeA N-terminal region (domain I and II). This family contains two structural domains. One of these contains the conserved DGXA motif. This region is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this region is uncertain. 145
46728 397493 pfam03454 MoeA_C MoeA C-terminal region (domain IV). This domain is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this domain is uncertain. The structure of this domain is known and forms an incomplete beta barrel. 72
46729 397494 pfam03455 dDENN dDENN domain. This region is always found associated with pfam02141. It is predicted to form a globular domain. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. This N-terminal region of DENN folds into a longin module, consisting of a central antiparallel beta-sheet layered between helix H1 and helices H2 and H3 (strands S1-S5). Rab35 interacts with dDENN via residues in helix 1 and in the loop S3-S4. 50
46730 397495 pfam03456 uDENN uDENN domain. This region is always found associated with pfam02141. It is predicted to form an all beta domain. 62
46731 397496 pfam03457 HA Helicase associated domain. This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid. 63
46732 397497 pfam03458 UPF0126 UPF0126 domain. Domain always found as pair in bacterial membrane proteins of unknown function. This domain contains three transmembrane helices. The conserved glycines are suggestive of an ion channel (C. Yeats unpublished obs.). 74
46733 397498 pfam03459 TOBE TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain. 62
46734 377044 pfam03460 NIR_SIR_ferr Nitrite/Sulfite reductase ferredoxin-like half domain. Sulfite and Nitrite reductases are key to both biosynthetic assimilation of sulfur and nitrogen and dissimilation of oxidized anions for energy transduction. Two copies of this repeat are found in Nitrite and Sulfite reductases and form a single structural domain. 67
46735 397499 pfam03461 TRCF TRCF domain. 93
46736 397500 pfam03462 PCRF PCRF domain. This domain is found in peptide chain release factors. 192
46737 397501 pfam03463 eRF1_1 eRF1 domain 1. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 128
46738 397502 pfam03464 eRF1_2 eRF1 domain 2. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 133
46739 397503 pfam03465 eRF1_3 eRF1 domain 3. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 100
46740 397504 pfam03466 LysR_substrate LysR substrate binding domain. The structure of this domain is known and is similar to the periplasmic binding proteins. 209
46741 397505 pfam03467 Smg4_UPF3 Smg-4/UPF3 family. This family contains proteins that are involved in nonsense-mediated mRNA decay, a process that is triggered by premature stop codons in mRNA. The family includes Smg-4 and UPF3. 171
46742 397506 pfam03468 XS XS domain. The XS (rice gene X and SGS3) domain is found in a family of plant proteins including gene X and SGS3. SGS3 is thought to be involved in post-transcriptional gene silencing (PTGS). This domain contains a conserved aspartate residue that may be functionally important. The XS domain has recently been predicted to possess an RRM-like RNA-binding domain by fold recognition. 113
46743 397507 pfam03469 XH XH domain. The XH (rice gene X Homology) domain is found in a family of plant proteins including gene X. The molecular function of these proteins is unknown. However these proteins usually contain an XS domain that is also found in the PTGS protein SGS3. This domain contains a conserved glutamate residue that may be functionally important. 131
46744 251981 pfam03470 zf-XS XS zinc finger domain. This domain is a putative nucleic acid binding zinc finger found in proteins that also contain an XS domain. 43
46745 397508 pfam03471 CorC_HlyC Transporter associated domain. This small domain is found in a family of proteins with the pfam01595 domain and two CBS domains with this domain found at the C-terminus of the proteins, the domain is also found at the C-terminus of some Na+/H+ antiporters. This domain is also found in CorC that is involved in Magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates. 81
46746 397509 pfam03472 Autoind_bind Autoinducer binding domain. This domain is found in a large family of transcriptional regulators. This domain specifically binds to autoinducer molecules. 149
46747 397510 pfam03473 MOSC MOSC domain. The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters. 116
46748 397511 pfam03474 DMA DMRTA motif. This region is found to the C-terminus of the pfam00751. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown. 36
46749 397512 pfam03475 3-alpha 3-alpha domain. This small triple helical domain has been predicted to assume a topology similar to helix-turn-helix domains. These domains are found at the C-terminus of proteins related to Escherichia coli YiiM. 44
46750 281474 pfam03476 MOSC_N MOSC N-terminal beta barrel domain. This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold. 118
46751 397513 pfam03477 ATP-cone ATP cone domain. 86
46752 397514 pfam03478 DUF295 Protein of unknown function (DUF295). This family of proteins are found in plants. The function of the proteins is unknown. 57
46753 397515 pfam03479 DUF296 Domain of unknown function (DUF296). This putative domain is found in proteins that contain AT-hook motifs pfam02178, which strongly suggests a DNA-binding function for the proteins as a whole. There are three highly conserved histidine residues, eg at 117, 119 and 133 in Reut_B5223, which should be a structurally conserved metal-binding unit, based on structural comparison with known metal-binding structures. The proteins should work as trimers. 114
46754 397516 pfam03480 DctP Bacterial extracellular solute-binding protein, family 7. This family of proteins is involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes DctP, a C4-dicarboxylate-binding protein and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component. 285
46755 397517 pfam03481 SUA5 Putative GTP-binding controlling metal-binding. Structural investigation of this domain suggests that it might be a GTP-binding region that regulates metal binding and involves hydrolysis of ATP to AMP. It is found to the C-terminus of pfam01300. 132
46756 281480 pfam03482 SIC sic protein repeat. Serotype M1 group A Streptococcus strains cause epidemic waves of human infections. This 30 aa repeat occurs in the sic protein, an extracellular protein (streptococcal inhibitor of complement) that inhibits human complement. 30
46757 397518 pfam03483 B3_4 B3/4 domain. This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. 174
46758 397519 pfam03484 B5 tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits. 67
46759 397520 pfam03485 Arg_tRNA_synt_N Arginyl tRNA synthetase N terminal domain. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition. 83
46760 397521 pfam03486 HI0933_like HI0933-like protein. 404
46761 397522 pfam03487 IL13 Interleukin-13. 108
46762 112313 pfam03488 Ins_beta Nematode insulin-related peptide beta type. 48
46763 397523 pfam03489 SapB_2 Saposin-like type B, region 2. 34
46764 397524 pfam03491 5HT_transport_N Serotonin (5-HT) neurotransmitter transporter, N-terminus. This short domain lies at the very N-terminus of many serotonin and other transporter proteins, eg SNF, pfam00209. 41
46765 397525 pfam03492 Methyltransf_7 SAM dependent carboxyl methyltransferase. This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-Methylxanthine. Caffeine is synthesized through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyzes the second step to produce theobromine. 326
46766 397526 pfam03493 BK_channel_a Calcium-activated BK potassium channel alpha subunit. 98
46767 367525 pfam03494 Beta-APP Beta-amyloid peptide (beta-APP). 37
46768 367526 pfam03495 Binary_toxB Clostridial binary toxin B/anthrax toxin PA Ca-binding domain. This domain is a calcium binding domain in the anthrax toxin protective antigen. 78
46769 397527 pfam03496 ADPrib_exo_Tox ADP-ribosyltransferase exoenzyme. This is a family of bacterial and viral bi-glutamic acid ADP-ribosyltransferases, where, in Aeromonas salmonicida AexT, E403 is the catalytic residue and E401 contributes to the transfer of ADP-ribose to the target protein. In clostridial species it is actin that is being ADP-ribosylated; this result is lethal and dermonecrotic in infected mammals. 199
46770 397528 pfam03497 Anthrax_toxA Anthrax toxin LF subunit. 174
46771 397529 pfam03498 CDtoxinA Cytolethal distending toxin A/C domain. 149
46772 281496 pfam03500 Cellsynth_D Cellulose synthase subunit D. 144
46773 397530 pfam03501 S10_plectin Plectin/S10 domain. This presumed domain is found at the N-terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding. 92
46774 367531 pfam03502 Channel_Tsx Nucleoside-specific channel-forming protein, Tsx. 242
46775 308876 pfam03503 Chlam_OMP3 Chlamydia cysteine-rich outer membrane protein 3. 54
46776 281500 pfam03504 Chlam_OMP6 Chlamydia cysteine-rich outer membrane protein 6. 91
46777 308877 pfam03505 Clenterotox Clostridium enterotoxin. 197
46778 281501 pfam03506 Flu_C_NS1 Influenza C non-structural protein (NS1). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1. This protein contains 6 conserved cysteines that may be functionally important, perhaps binding to a metal ion. 162
46779 397531 pfam03507 CagA CagA exotoxin. 39
46780 397532 pfam03508 Connexin43 Gap junction alpha-1 protein (Cx43). 20
46781 397533 pfam03509 Connexin50 Gap junction alpha-8 protein (Cx50). 65
46782 281505 pfam03510 Peptidase_C24 2C endopeptidase (C24) cysteine protease family. 105
46783 397534 pfam03511 Fanconi_A Fanconi anaemia group A protein. 63
46784 397535 pfam03512 Glyco_hydro_52 Glycosyl hydrolase family 52. 414
46785 281508 pfam03513 Cloacin_immun Cloacin immunity protein. 80
46786 397536 pfam03514 GRAS GRAS domain family. Proteins in the GRAS (GAI, RGA, SCR) family are known as major players in gibberellin (GA) signaling, which regulates various aspects of plant growth and development. Mutation of the SCARECROW (SCR) gene results in a radial pattern defect, loss of a ground tissue layer, in the root. The PAT1 protein is involved in phytochrome A signal transduction. A sequence, structure and evolutionary analysis showed that the GRAS family emerged in bacteria and belongs to the Rossmann-fold, AdoMET (SAM)-dependent methyltransferase superfamily. All bacterial, and a subset of plant GRAS proteins, are predicted to be active and function as small-molecule methylases. Several plant GRAS proteins lack one or more AdoMet (SAM)-binding residues while preserving their substrate-binding residues. Although GRAS proteins are implicated to function as transcriptional factors, the above analysis suggests that they instead might either modify or bind small molecules. 374
46787 367536 pfam03515 Cloacin Colicin-like bacteriocin tRNase domain. The C-terminal region of colicin-like bacteriocins is either a pore-forming or an endonuclease-like domain. Cloacin and Pyocins have similar structures and activities to the colicins from E coli and the klebicins from Klebsiella spp. Colicins E5 and D cleave the anticodon loops of distinct tRNAs of Escherichia coli both in vivo and in vitro. The full-length molecule has an N-terminal translocation domain and a middle, double alpha-helical region which is receptor-binding. 273
46788 397537 pfam03516 Filaggrin Filaggrin. 56
46789 397538 pfam03517 Voldacs Regulator of volume decrease after cellular swelling. ICln is a ubiquitously expressed multi-functional protein that plays a critical role in regulating volume decrease in cells after cellular swelling. In plants, ICln induces Cl- currents, thus regulating Cl- homoeostasis in eukaryotes. Structurally, the fold resembles a pleckstrin homology fold, one of whose roles is to recruit and tether their host protein to the cell membrane; and although the surface charges of the ICln fold are not equivalent to those of the PH domain, ICln can be phosphorylated in vitro and the PH-nature of the domain may be the part involving it in the transposition from cytosol to cell membrane during cytotonic swelling. 139
46790 397539 pfam03519 Invas_SpaK Invasion protein B family. 78
46791 397540 pfam03520 KCNQ_channel KCNQ voltage-gated potassium channel. This family matches to the C-terminal tail of KCNQ type potassium channels. 190
46792 397541 pfam03521 Kv2channel Kv2 voltage-gated K+ channel. 288
46793 397542 pfam03522 SLC12 Solute carrier family 12. 415
46794 397543 pfam03523 Macscav_rec Macrophage scavenger receptor. 49
46795 397544 pfam03524 CagX Conjugal transfer protein. This family includes type IV secretion system CagX conjugation protein. Other members of this family are involved in conjugal transfer to plant cells of T-DNA. 217
46796 367544 pfam03525 Meiotic_rec114 Meiotic recombination protein rec114. 328
46797 112349 pfam03526 Microcin Colicin E1 (microcin) immunity protein. 55
46798 397545 pfam03527 RHS RHS protein. 38
46799 367545 pfam03528 Rabaptin Rabaptin. 486
46800 397546 pfam03529 TF_Otx Otx1 transcription factor. 89
46801 397547 pfam03530 SK_channel Calcium-activated SK potassium channel. 109
46802 397548 pfam03531 SSrecog Structure-specific recognition protein (SSRP1). SSRP1 has been implicated in transcriptional initiation and elongation and in DNA replication and repair. This domain belongs to the Pleckstrin homology fold superfamily. 69
46803 281524 pfam03532 OMS28_porin OMS28 porin. 253
46804 367549 pfam03533 SPO11_like SPO11 homolog. 43
46805 397549 pfam03534 SpvB Salmonella virulence plasmid 65kDa B protein. 286
46806 397550 pfam03535 Paxillin Paxillin family. Paxillin is a multi-domain protein that localizes in cultured cells primarily to sites of cell adhesion to the extracellular matrix (ECM) called focal adhesions. The family here represents the N-terminal regions with the proline-rich part as well as the Paxillin part. Focal adhesions form a structural link between the ECM and the actin cytoskeleton and are also important sites of signal transduction; their components propagate signals arising from the activation of integrins following their engagement with ECM proteins, such as fibronectin, collagen and laminin. Importantly, focal adhesion proteins including paxillin also serve as a point of convergence for signals resulting from stimulation of various classes of growth factor receptor. 198
46807 397551 pfam03536 VRP3 Salmonella virulence-associated 28kDa protein. 218
46808 397552 pfam03537 Glyco_hydro_114 Glycoside-hydrolase family GH114. This family is recognized as a glycosyl-hydrolase family, number 114. It is endo-alpha-1,4-polygalactosaminidase, a rare enzyme. It is proposed to be TIM-barrel, the most common structure amongst the catalytic domains of glycosyl-hydrolases. 221
46809 367552 pfam03538 VRP1 Salmonella virulence plasmid 28.1kDa A protein. 311
46810 112362 pfam03539 Spuma_A9PTase Spumavirus aspartic protease (A9). 163
46811 397553 pfam03540 TFIID_30kDa Transcription initiation factor TFIID 23-30kDa subunit. 49
46812 397554 pfam03542 Tuberin Tuberin. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or TSC2 tumor suppressor gene. The TSC2 gene codes for tuberin and interacts with hamartin pfam04388, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking. 351
46813 397555 pfam03543 Peptidase_C58 Yersinia/Haemophilus virulence surface antigen. 203
46814 397556 pfam03544 TonB_C Gram-negative bacterial TonB protein C-terminal. The TonB_C domain is the well-characterized C-terminal region of the TonB receptor molecule. This protein is bound to an inner membrane-bound protein ExbB via a globular domain and has a flexible middle region that is likely to help in positioning the C-terminal domain into the iron-transporter barrel in the outer membrane. TonB_C interacts with the N-terminal TonB box of the outer membrane transporter that binds the Fe3+-siderophore complex. The barrel of the transporter, consisting of 22 beta-sheets and an inside plug, binds the iron complex in the barrel entrance. 79
46815 397557 pfam03545 YopE Yersinia virulence determinant (YopE). 70
46816 397558 pfam03546 Treacle Treacher Collins syndrome protein Treacle. 524
46817 308904 pfam03547 Mem_trans Membrane transport protein. This family includes auxin efflux carrier proteins and other transporter proteins from all domains of life. 341
46818 397559 pfam03548 LolA Outer membrane lipoprotein carrier protein LolA. 165
46819 308906 pfam03549 Tir_receptor_M Translocated intimin receptor (Tir) intimin-binding domain. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir intimin-binding domain (Tir IBD) which is needed to bind intimin and support the predicted topology for Tir, with both N- and C-terminal regions in the mammalian cell cytosol. 65
46820 397560 pfam03550 LolB Outer membrane lipoprotein LolB. 149
46821 397561 pfam03551 PadR Transcriptional regulator PadR-like family. Members of this family are transcriptional regulators that appear to be related to the pfam01047 family. This family includes PadR, a protein that is involved in negative regulation of phenolic acid metabolism. 74
46822 281541 pfam03552 Cellulose_synt Cellulose synthase. Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterized genes found in Acetobacter and Agrobacterium sp. More correctly designated as 'cellulose synthase catalytic subunits', plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity. 715
46823 281542 pfam03553 Na_H_antiporter Na+/H+ antiporter family. This family includes integral membrane proteins, some of which are Na+/H+ antiporters. 303
46824 281543 pfam03554 Herpes_UL73 UL73 viral envelope glycoprotein. This family groups together the viral proteins BLRF1, U46, 53, and UL73. The UL73-like envelope glycoproteins, which associate in a high molecular mass complex with their counterpart, gM, induce neutralising antibody responses in the host. These glycoproteins are highly polymorphic, particularly in the N-terminal region. 74
46825 281544 pfam03555 Flu_C_NS2 Influenza C non-structural protein (NS2). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1 pfam03506 as well as the NS2 protein. The NS2 protein is only about 60 amino acids in length and of unknown function. 57
46826 397562 pfam03556 Cullin_binding Cullin binding. This domain binds to cullins and to Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. Neddylation is the process by which the C-terminal glycine of the ubiquitin-like protein Nedd8 is covalently linked to lysine residues in a protein through an isopeptide bond. The structure of this domain is composed entirely of alpha helices. 116
46827 281546 pfam03557 Bunya_G1 Bunyavirus glycoprotein G1. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G1 glycoprotein which is the viral attachment protein. 879
46828 367561 pfam03558 TBSV_P22 TBSV core protein P21/P22. This protein is required for cell-to-cell movement in plants. Furthermore, the membrane-associated protein is dispensable for both replication and transcription. 187
46829 397563 pfam03559 Hexose_dehydrat NDP-hexose 2,3-dehydratase. This family includes a range of proteins from antibiotic production pathways. The family includes gra-ORF27 product that probably functions at an early step, most likely as a dTDP-4-keto-6-deoxyglucose-2,3-dehydratase. Its homologs include dnmT from the daunorubicin biosynthetic gene cluster in S. peucetius, a similar gene from the daunomycin biosynthetic cluster in Streptomyces sp. strain C5, eryBVI from the erythromycin cluster in S. erythraea and snoH from the nogalamycin cluster in S. nogalater. The proteins in this family are composed of two copies of a 200 amino acid long unit that may be a structural domain. 203
46830 397564 pfam03561 Allantoicase Allantoicase repeat. This family is found in pairs in Allantoicases, forming the majority of the protein. These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)O = (-)-ureidoglycolate + urea. 103
46831 397565 pfam03562 MltA MltA specific insert domain. This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. 231
46832 397566 pfam03563 Bunya_G2 Bunyavirus glycoprotein G2. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G2 glycoprotein which interacts with the pfam03557 G1 glycoprotein. 281
46833 281552 pfam03564 DUF1759 Protein of unknown function (DUF1759). This is a family of proteins of unknown function. Most of the members are gag-polyproteins. 148
46834 397567 pfam03566 Peptidase_A21 Peptidase family A21. 574
46835 397568 pfam03567 Sulfotransfer_2 Sulfotransferase family. This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate. 235
46836 397569 pfam03568 Peptidase_C50 Peptidase family C50. 394
46837 281556 pfam03569 Peptidase_C8 Peptidase family C8. 208
46838 397570 pfam03571 Peptidase_M49 Peptidase family M49. 549
46839 397571 pfam03572 Peptidase_S41 Peptidase family S41. 165
46840 397572 pfam03573 OprD outer membrane porin, OprD family. This family includes outer membrane proteins related to OprD. OprD has been described as a serine type peptidase. However the proposed catalytic residues are not conserved suggesting that many of these proteins are not peptidases. 393
46841 397573 pfam03574 Peptidase_S48 Peptidase family S48. 149
46842 397574 pfam03575 Peptidase_S51 Peptidase family S51. 206
46843 397575 pfam03576 Peptidase_S58 Peptidase family S58. 309
46844 335383 pfam03577 Peptidase_C69 Peptidase family C69. 401
46845 397576 pfam03578 HGWP HGWP repeat. This short (30 amino acids) repeat is found in a number of plant proteins. It contains a conserved HGWP motif, hence its name. The function of these proteins is unknown. 28
46846 281565 pfam03579 SHP Small hydrophobic protein. The small hydrophobic integral membrane protein, SH (previously designated 1A) is found to have a variety of glycosylated forms. This protein is a component of the mature virion. 64
46847 281566 pfam03580 Herpes_UL14 Herpesvirus UL14-like protein. This is a family of Herpesvirus proteins including UL14. UL14 protein is a minor component of the virion tegument and is expressed late in infection. UL14 protein can influence the intracellular localization patterns of a number of proteins belonging to the capsid or the DNA encapsidation machinery. 146
46848 281567 pfam03581 Herpes_UL33 Herpesvirus UL33-like protein. This is a family of Herpesvirus proteins including UL33 and UL51. The proteins in this family are involved in packaging viral DNA. 72
46849 367570 pfam03583 LIP Secretory lipase. These lipases are expressed and secreted during the infection cycle of these pathogens. In particular, C. albicans has a large number of different lipases, possibly reflecting broad lipolytic activity, which may contribute to the persistence and virulence of C. albicans in human tissue. 286
46850 367571 pfam03584 Herpes_ICP4_N Herpesvirus ICP4-like protein N-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P that have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions, this family contains the N-terminal region that contains sites for DNA binding and homodimerization. 163
46851 281570 pfam03585 Herpes_ICP4_C Herpesvirus ICP4-like protein C-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P that have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions, this family contains the C-terminal region that probably acts as an enhancer for the N-terminal region. 444
46852 397577 pfam03586 Herpes_UL36 Herpesvirus UL36 tegument protein. The UL36 open reading frame (ORF) encodes the largest herpes simplex virus type 1 (HSV-1) protein, a 270-kDa polypeptide designated VP1/2, which is also a component of the virion tegument. A null mutation in the UL36 gene of herpes simplex virus type 1 results in accumulation of unenveloped DNA-filled capsids in the cytoplasm of infected cells. This family only covers a small central part of this large protein. 252
46853 397578 pfam03587 EMG1 EMG1/NEP1 methyltransferase. Members of this family are essential for 40S ribosomal biogenesis. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases. 205
46854 397579 pfam03588 Leu_Phe_trans Leucyl/phenylalanyl-tRNA protein transferase. 171
46855 397580 pfam03589 Antiterm Antitermination protein. 85
46856 397581 pfam03590 AsnA Aspartate-ammonia ligase. 228
46857 397582 pfam03591 AzlC AzlC protein. 135
46858 397583 pfam03592 Terminase_2 Terminase small subunit. Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site. 143
46859 397584 pfam03594 BenE Benzoate membrane transport protein. 378
46860 397585 pfam03595 SLAC1 Voltage-dependent anion channel. This family of transporters has ten alpha helical transmembrane segments. The structure of a bacterial homolog of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homolog, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterized as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, this family is found in the stomatal guard cells functioning as an anion-transporting pore. Many homologs are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins. 332
46861 281580 pfam03596 Cad Cadmium resistance transporter. 192
46862 397586 pfam03597 FixS Cytochrome oxidase maturation protein cbb3-type. 44
46863 397587 pfam03598 CdhC CO dehydrogenase/acetyl-CoA synthase complex beta subunit. 155
46864 397588 pfam03599 CdhD CO dehydrogenase/acetyl-CoA synthase delta subunit. 384
46865 397589 pfam03600 CitMHS Citrate transporter. 299
46866 397590 pfam03601 Cons_hypoth698 Conserved hypothetical protein 698. 305
46867 397591 pfam03602 Cons_hypoth95 Conserved hypothetical protein 95. 179
46868 397592 pfam03603 DNA_III_psi DNA polymerase III psi subunit. 126
46869 397593 pfam03604 DNA_RNApol_7kD DNA directed RNA polymerase, 7 kDa subunit. 32
46870 397594 pfam03605 DcuA_DcuB Anaerobic c4-dicarboxylate membrane transporter. 364
46871 281589 pfam03606 DcuC C4-dicarboxylate anaerobic carrier. 452
46872 397595 pfam03607 DCX Doublecortin. 55
46873 397596 pfam03608 EII-GUT PTS system enzyme II sorbitol-specific factor. 162
46874 397597 pfam03609 EII-Sor PTS system sorbose-specific iic component. 233
46875 397598 pfam03610 EIIA-man PTS system fructose IIA component. 114
46876 397599 pfam03611 EIIC-GAT PTS system sugar-specific permease component. This family includes bacterial transmembrane proteins with a putative sugar-specific permease function, including and analogous to the IIC component of the PTS system. It has been suggested that this permease may form part of an L-ascorbate utilisation pathway, with proposed specificity for 3-keto-L-gulonate (formed by hydrolysis of L-ascorbate). This family includes the IIC component of the galactitol specific GAT family PTS system. 397
46877 397600 pfam03612 EIIBC-GUT_N Sorbitol phosphotransferase enzyme II N-terminus. 184
46878 397601 pfam03613 EIID-AGA PTS system mannose/fructose/sorbose family IID component. 263
46879 281597 pfam03614 Flag1_repress Repressor of phase-1 flagellin. 170
46880 397602 pfam03615 GCM GCM motif protein. 140
46881 308938 pfam03616 Glt_symporter Sodium/glutamate symporter. 368
46882 281600 pfam03617 IBV_3A IBV 3A protein. The gene product of gene 3 from Avian infectious bronchitis virus. Currently, the function of this protein remains unknown. 57
46883 397603 pfam03618 Kinase-PPPase Kinase/pyrophosphorylase. This family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity. 255
46884 397604 pfam03619 Solute_trans_a Organic solute transporter Ostalpha. This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death. 261
46885 281603 pfam03620 IBV_3C IBV 3C protein. Product of ORF 3C from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown. 92
46886 397605 pfam03621 MbtH MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. MbtH and its homologs were first noted in gene clusters involved in non-ribosomal peptides and other secondary metabolites by Quadri et al. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. The structure of the PA2412 protein shows it adopts a beta-beta-beta-alpha-alpha topology with the short C-terminal helix forming the tip of an overall arrowhead shape. MbtH proteins have been shown to be required for the synthesis of antibiotics, siderophores and glycopeptidolipids. 52
46887 281605 pfam03622 IBV_3B IBV 3B protein. Product of ORF 3B from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown. 64
46888 397606 pfam03623 Focal_AT Focal adhesion targeting region. Focal adhesion kinase (FAK) is a tyrosine kinase found in focal adhesions, intracellular signaling complexes that are formed following engagement of the extracellular matrix by integrins. The C-terminal 'focal adhesion targeting' (FAT) region is necessary and sufficient for localising FAK to focal adhesions. The crystal structure of FAT shows it forms a four-helix bundle that resembles those found in two other proteins involved in cell adhesion, alpha-catenin and vinculin. The binding of FAT to the focal adhesion protein, paxillin, requires the integrity of the helical bundle, whereas binding to another focal adhesion protein, talin, does not. 130
46889 397607 pfam03625 DUF302 Domain of unknown function DUF302. Domain is found in an undescribed set of proteins. Normally occurs uniquely within a sequence, but is found as a tandem repeat. Shows interesting phylogenetic distribution with majority of examples in bacteria and archaea, but also in D. melanogaster. 63
46890 397608 pfam03626 COX4_pro Prokaryotic Cytochrome C oxidase subunit IV. Cytochrome c oxidase (COX) is a multi-subunit enzyme complex that catalyzes the final step of electron transfer through the respiratory chain on the mitochondrial inner membrane. This family is composed of cytochrome c oxidase subunit 4 from prokaryotes. 72
46891 112444 pfam03627 PapG_N PapG carbohydrate binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan. 226
46892 397609 pfam03628 PapG_C PapG chaperone-binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus and chaperone binding C-terminus (this domain). The chaperone-binding domain is highly conserved, and is essential for the correct assembly of the pili structure when aided by the chaperone molecule PapD. 108
46893 397610 pfam03629 SASA Carbohydrate esterase, sialic acid-specific acetylesterase. The catalytic triad of this esterase enzyme comprises residues Ser127, His403 and Asp391 in UniProtKB:P70665. 226
46894 397611 pfam03630 Fumble Fumble. Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that fumble encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of A. nidulans and mouse. A role of fumble in membrane synthesis has been proposed. 321
46895 397612 pfam03631 Virul_fac_BrkB Virulence factor BrkB. This family acts as a virulence factor. In Bordetella pertussis, BrkB is essential for resistance to complement-dependent killing by serum. This family was originally predicted to be ribonuclease BN, but this prediction has since been shown to be incorrect. 256
46896 281612 pfam03632 Glyco_hydro_65m Glycosyl hydrolase family 65 central catalytic domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom. The catalytic domain also forms the majority of the dimerization interface. 387
46897 397613 pfam03633 Glyco_hydro_65C Glycosyl hydrolase family 65, C-terminal domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The C-terminal domain forms a two layered jelly roll motif. This domain is situated at the base of the catalytic domain, however its function remains unknown. 50
46898 397614 pfam03634 TCP TCP family transcription factor. This is a family of TCP plant transcription factors. TCP proteins were named after the first characterized members (TB1, CYC and PCFs) and they are involved in multiple developmental control pathways. This region contains a DNA binding basic-Helix-Loop-Helix (bHLH) structure. 152
46899 397615 pfam03635 Vps35 Vacuolar protein sorting-associated protein 35. Vacuolar protein sorting-associated protein (Vps) 35 is one of around 50 proteins involved in protein trafficking. In particular, Vps35 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps26 and Vps29. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains. 732
46900 397616 pfam03636 Glyco_hydro_65N Glycosyl hydrolase family 65, N-terminal domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. This domain is believed to be essential for catalytic activity although its precise function remains unknown. 240
46901 397617 pfam03637 Mob1_phocein Mob1/phocein family. Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin. 170
46902 397618 pfam03638 TCR Tesmin/TSO1-like CXC domain, cysteine-rich domain. This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin and TSO1. This family is called a CXC domain in. 38
46903 397619 pfam03639 Glyco_hydro_81 Glycosyl hydrolase family 81. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein ENGL1, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release. 321
46904 397620 pfam03640 Lipoprotein_15 Secreted repeat of unknown function. This family occurs as tandem repeats in a set of lipoproteins. The alignment contains a Y-X4-D motif. 45
46905 397621 pfam03641 Lysine_decarbox Possible lysine decarboxylase. The members of this family share a highly conserved motif PGGXGTXXE that is probably functionally important. This family includes proteins annotated as lysine decarboxylases, although the evidence for this is not clear. 130
46906 367591 pfam03642 MAP MAP domain. This presumed 110 amino acid residue domain is found in multiple copies in MAP (MHC class II analogue protein). The protein has been found in a wide range of extracellular matrix proteins. 87
46907 397622 pfam03643 Vps26 Vacuolar protein sorting-associated protein 26. Vacuolar protein sorting-associated protein (Vps) 26 is one of around 50 proteins involved in protein trafficking. In particular, Vps26 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps29 and Vps35. This family also contains Down syndrome critical region 3/A. 275
46908 397623 pfam03644 Glyco_hydro_85 Glycosyl hydrolase family 85. Family of endo-beta-N-acetylglucosaminidases. These enzymes work on a broad spectrum of substrates. 291
46909 397624 pfam03645 Tctex-1 Tctex-1 family. Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. 98
46910 397625 pfam03646 FlaG FlaG protein. Although important for flagella the exact function of this protein is unknown. 101
46911 397626 pfam03647 Tmemb_14 Transmembrane proteins 14C. This family of short membrane proteins are as yet uncharacterized. 90
46912 397627 pfam03648 Glyco_hydro_67N Glycosyl hydrolase family 67 N-terminus. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the N-terminal region of alpha-glucuronidase. The N-terminal domain forms a two-layer sandwich, each layer being formed by a beta sheet of five strands. A further two helices form part of the interface with the central, catalytic, module (pfam07488). 120
46913 397628 pfam03649 UPF0014 Uncharacterized protein family (UPF0014). 242
46914 397629 pfam03650 MPC Uncharacterized protein family (UPF0041). 110
46915 397630 pfam03652 RuvX Holliday junction resolvase. This family of nucleases resolves the Holliday junction intermediates in genetic recombination. 134
46916 397631 pfam03653 UPF0093 Uncharacterized protein family (UPF0093). 146
46917 252088 pfam03656 Pam16 Pam16. The Pam16 protein is the fifth essential subunit of the pre-sequence translocase-associated protein import motor (PAM). In Saccharomyces cerevisiae, Pam16 is required for preprotein translocation into the matrix, but not for protein insertion into the inner membrane. Pam16 has a degenerate J domain. J-domain proteins play important regulatory roles as co-chaperones, recruiting Hsp70 partners and accelerating the ATP-hydrolysis step of the chaperone cycle. Pam16's J-like domain strongly interacts with Pam18's J domain, leading to a productive interaction of Pam18 with mtHsp70 at the mitochondria import channel. Pam18 stimulates the ATPase activity of mtHsp70. 127
46918 397632 pfam03657 UPF0113 Uncharacterized protein family (UPF0113). 72
46919 397633 pfam03658 Ub-RnfH RnfH family Ubiquitin. A member of the RnfH family of the ubiquitin superfamily. Members of this family strongly co-occur in two distinct gene neighborhood contexts. In one it is associated with a START domain protein, a membrane protein SmpA and the transfer mRNA binding protein SmpB. This association suggests a possible role in the SmpB-tmRNA-based tagging and degradation system of bacteria, which is interesting given that other members of the ubiquitin system are analogously involved in protein-tagging and degradation across eukaryotes and various prokaryotes. The second context in which the RnfH genes are present is in a membrane associated complex involved in transporting electrons for various reductive reactions such as nitrogen fixation. 83
46920 397634 pfam03659 Glyco_hydro_71 Glycosyl hydrolase family 71. Family of alpha-1,3-glucanases. 372
46921 397635 pfam03660 PHF5 PHF5-like protein. This family of proteins belongs to the superfamily of PHD-finger proteins. At least one example, from mouse, may act as a chromatin-associated protein. The S. pombe ini1 gene is essential, required for splicing. It is localized in the nucleus, but not detected in the nucleolus and can be complemented by human ini1. 104
46922 397636 pfam03661 UPF0121 Uncharacterized protein family (UPF0121). Uncharacterized integral membrane protein family. 237
46923 397637 pfam03662 Glyco_hydro_79n Glycosyl hydrolase family 79, N-terminal domain. Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesized as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumor cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity. 318
46924 397638 pfam03663 Glyco_hydro_76 Glycosyl hydrolase family 76. Family of alpha-1,6-mannanases. 348
46925 281639 pfam03664 Glyco_hydro_62 Glycosyl hydrolase family 62. Family of alpha-L-arabinofuranosidase (EC 3.2.1.55). This enzyme hydrolyzes aryl alpha-L-arabinofuranosides and cleaves arabinosyl side chains from arabinoxylan and arabinan. 272
46926 397639 pfam03665 UPF0172 Uncharacterized protein family (UPF0172). In Chlamydomonas reinhardtii the protein TLA1 (truncated light-harvesting chlorophyll antenna size) apparently regulates genes that define the chlorophyll-a antenna size in the photosynthetic apparatus. This family was formerly known as UPF0172. 192
46927 397640 pfam03666 NPR3 Nitrogen Permease regulator of amino acid transport activity 3. This family, also known in yeasts as Rmd11, complexes with NPR2, pfam06218. This complex heterodimer is responsible for inactivating TORC1, an evolutionarily conserved protein complex that controls cell size via nutritional input signals, specifically, in response to amino acid starvation. 446
46928 397641 pfam03668 ATP_bind_2 P-loop ATPase protein family. This family contains an ATP-binding site and could be an ATPase (personal obs:C Yeats). 284
46929 397642 pfam03669 UPF0139 Uncharacterized protein family (UPF0139). 96
46930 112485 pfam03670 UPF0184 Uncharacterized protein family (UPF0184). 83
46931 397643 pfam03671 Ufm1 Ubiquitin fold modifier 1 protein. This is a family of short ubiquitin-like proteins, that is like neither type-1 nor type-2. It is a ubiquitin-fold modifier 1 (Ufm1) that is synthesized in a precursor form of 85 amino-acid residues. In humans the enzyme for Ufm1 is Uba5 and the conjugating enzyme is Ufc1. Prior to activation by Uba5 the extra two amino acids at the C-terminal region of the human pro-Ufm1 protein are removed to expose Gly whose residue is necessary for conjugation to target molecule(s). The mature Ufm1 is conjugated to yet unidentified endogenous proteins. While Ubiquitin and many Ubls possess the conserved C-terminal di-glycine that is adenylated by each specific E1 or E1-like enzyme, respectively, in an ATP-dependent manner, Ufm1(1-83) possesses a single glycine at its C-terminus, which is followed by a Ser-Cys dipeptide in the precursor form of Ufm1. The C-terminally processed Ufm1(1-83) is specifically activated by Uba5, an E1-like enzyme, and then transferred to its cognate Ufc1, an E2-like enzyme. 75
46932 397644 pfam03672 UPF0154 Uncharacterized protein family (UPF0154). This family contains a set of short bacterial proteins of unknown function. 60
46933 308975 pfam03673 UPF0128 Uncharacterized protein family (UPF0128). The members of this family are about 240 amino acids in length. The proteins are as yet uncharacterized. 221
46934 397645 pfam03676 UPF0183 Uncharacterized protein family (UPF0183). This family of proteins includes Lin-10 from C. elegans. 395
46935 281647 pfam03677 UPF0137 Uncharacterized protein family (UPF0137). This family includes GP6-D a virulence plasmid encoded protein. 237
46936 308977 pfam03678 Adeno_hexon_C Hexon, adenovirus major coat protein, C-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length. 241
46937 308978 pfam03682 UPF0158 Uncharacterized protein family (UPF0158). 157
46938 397646 pfam03683 UPF0175 Uncharacterized protein family (UPF0175). This family contains small proteins of unknown function. 75
46939 397647 pfam03684 UPF0179 Uncharacterized protein family (UPF0179). The function of this family is unknown, however the proteins contain two cysteine clusters that may be iron sulphur redox centers. 139
46940 397648 pfam03685 UPF0147 Uncharacterized protein family (UPF0147). This family of small proteins have no known function. 81
46941 397649 pfam03686 UPF0146 Uncharacterized protein family (UPF0146). The function of this family of proteins is unknown. 129
46942 281654 pfam03687 UPF0164 Uncharacterized protein family (UPF0164). This family of uncharacterized proteins are only found in Treponema pallidum. These proteins belong to the membrane beta barrel superfamily. 326
46943 397650 pfam03688 Nepo_coat_C Nepovirus coat protein, C-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 163
46944 397651 pfam03689 Nepo_coat_N Nepovirus coat protein, N-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 91
46945 397652 pfam03690 UPF0160 Uncharacterized protein family (UPF0160). This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. 315
46946 397653 pfam03691 UPF0167 Uncharacterized protein family (UPF0167). The proteins in this family are about 200 amino acids long and each contain 3 CXXC motifs. 174
46947 308985 pfam03692 CxxCxxCC Putative zinc- or iron-chelating domain. This family of proteins contains 8 conserved cysteines. It has in the past been annotated as being one of the complex of proteins of the flagellar Fli complex. However this was due to a mis-annotation of the original Salmonella LT2 Genbank entry of 'fliB'. With all its conserved cysteines it is possibly a domain that chelates iron or zinc ions. 85
46948 281658 pfam03693 ParD_antitoxin Bacterial antitoxin of ParD toxin-antitoxin type II system and RHH. ParD is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is ParE, in pfam05016. The family contains several related antitoxins from Cyanobacteria, Proteobacteria and Actinobacteria. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all type II antitoxins. 80
46949 397654 pfam03694 Erg28 Erg28 like protein. This is a family of integral membrane proteins, which may contain four transmembrane helices. Members of this family are thought to be involved in sterol C-4 demethylation. In S. cerevisiae they may tether Erg26p (sterol dehydrogenase/decarboxylase) and Erg27p (3-ketoreductase) to the endoplasmic reticulum or may facilitate interaction between these proteins. The family contains a conserved arginine and histidine that may be functionally important. 110
46950 397655 pfam03695 UPF0149 Uncharacterized protein family (UPF0149). The proteins in this family are about 190 amino acids long. The function of these proteins is unknown. 170
46951 397656 pfam03698 UPF0180 Uncharacterized protein family (UPF0180). The members of this family are small uncharacterized proteins. 73
46952 397657 pfam03699 UPF0182 Uncharacterized protein family (UPF0182). This family contains uncharacterized integral membrane proteins. 753
46953 397658 pfam03700 Sorting_nexin Sorting nexin, N-terminal domain. These proteins bind to the cytoplasmic domain of plasma membrane receptors and are involved in endocytic protein trafficking. The N-terminal domain appears to be specific to sorting nexins 1 and 2. 78
46954 397659 pfam03701 UPF0181 Uncharacterized protein family (UPF0181). This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH. 50
46955 397660 pfam03702 AnmK Anhydro-N-acetylmuramic acid kinase. Anhydro-N-acetylmuramic acid kinase catalyzes the specific phosphorylation of 1,6-anhydro-N-acetylmuramic acid (anhMurNAc) with the simultaneous cleavage of the 1,6-anhydro ring, generating MurNAc-6-P. It is also required for the utilization of anhMurNAc, either imported from the medium, or derived from its own cell wall murein, and in so doing plays a role in cell wall recycling. 364
46956 397661 pfam03703 bPH_2 Bacterial PH domain. Domain found in uncharacterized family of membrane proteins. 1-3 copies found in each protein, with each copy flanked by transmembrane helices. Members of this family have a PH domain like structure. 79
46957 397662 pfam03704 BTAD Bacterial transcriptional activator domain. Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR, along with the C terminal region, is capable of independently directing actinorhodin production. This family contains TPR repeats. 146
46958 397663 pfam03705 CheR_N CheR methyltransferase, all-alpha domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase. 53
46959 397664 pfam03706 LPG_synthase_TM Lysylphosphatidylglycerol synthase TM region. LPG_synthase_TM is the N-terminal region of this family of bacterial phosphatidylglycerol lysyltransferases. The function of the family is to add lysyl groups to membrane lipids, and this region is the transmembrane domain of 7xTMs. In order to counteract attack by membrane-damaging external cationic antimicrobial molecules - from host immune systems, bacteriocins, defensins, etc - bacteria modify their anionic membrane phosphatidylglycerol with positively-charged L-lysine; this results in repulsion of the foreign cationic peptides. 300
46960 397665 pfam03707 MHYT Bacterial signalling protein N terminal repeat. Found as an N terminal triplet tandem repeat in bacterial signalling proteins. Family includes CoxC and CoxH from P.carboxydovorans. Each repeat contains two transmembrane helices. Domain is also described as the MHYT domain. 54
46961 281671 pfam03708 Avian_gp85 Avian retrovirus envelope protein, gp85. Family of avian-specific viral glycoproteins that form a receptor-binding gp85 polypeptide that is linked through disulfide to a membrane-spanning gp37 spike. Gp85 confers a high degree of subgroup specificity for interaction with distinct cell receptors. 246
46962 397666 pfam03709 OKR_DC_1_N Orn/Lys/Arg decarboxylase, N-terminal domain. This domain has a flavodoxin-like fold, and is termed the "wing" domain because of its position in the overall 3D structure. 111
46963 397667 pfam03710 GlnE Glutamate-ammonia ligase adenylyltransferase. Conserved repeated domain found in GlnE proteins. These proteins adenylate and deadenylate glutamine synthases: ATP + {L-Glutamate:ammonia ligase (ADP-forming)} = Diphosphate + Adenylyl-{L-Glutamate:Ammonia ligase (ADP-forming)}. The family is related to the pfam01909 domain. 249
46964 397668 pfam03711 OKR_DC_1_C Orn/Lys/Arg decarboxylase, C-terminal domain. 129
46965 397669 pfam03712 Cu2_monoox_C Copper type II ascorbate-dependent monooxygenase, C-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold. 157
46966 397670 pfam03713 DUF305 Domain of unknown function (DUF305). Domain found in small family of bacterial secreted proteins with no known function. Also found in Paramecium bursaria chlorella virus 1. This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important. This domain belongs to the ferritin superfamily. It contains two sequence similar repeats each of which is composed of two alpha helices. 151
46967 397671 pfam03714 PUD Bacterial pullulanase-associated domain. Domain is found in pullulanase - carbohydrate de-branching - proteins. It is found at either the N or the C terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function. 97
46968 397672 pfam03715 Noc2 Noc2p family. At least one member, Noc2p from yeast, is required for a late step in 60S subunit export from the nucleus. It has also been shown to co-precipitate with Nug1p, a nuclear GTPase also required for ribosome nucleus export. This family was formerly known as UPF0120. 299
46969 202737 pfam03716 WCCH WCCH motif. The WCCH motif is found in retrotransposons and Gemini viruses. A specific function has not been associated with this motif. 25
46970 397673 pfam03717 PBP_dimer Penicillin-binding Protein dimerization domain. This domain is found at the N-terminus of Class B High Molecular Weight Penicillin-Binding Proteins. Its function has not been precisely defined, but is strongly implicated in PBP polymerization. The domain forms a largely disordered 'sugar tongs' structure. 178
46971 397674 pfam03718 Glyco_hydro_49 Glycosyl hydrolase family 49. Family of dextranase (EC 3.2.1.11) and isopullulanase (EC 3.2.1.57). Dextranase hydrolyzes alpha-1,6-glycosidic bonds in dextran polymers. This domain corresponds to the C-terminal pectate lyase like domain. 118
46972 397675 pfam03719 Ribosomal_S5_C Ribosomal protein S5, C-terminal domain. 66
46973 397676 pfam03720 UDPG_MGDP_dh_C UDP-glucose/GDP-mannose dehydrogenase family, UDP binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 103
46974 397677 pfam03721 UDPG_MGDP_dh_N UDP-glucose/GDP-mannose dehydrogenase family, NAD binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 186
46975 397678 pfam03722 Hemocyanin_N Hemocyanin, all-alpha domain. This family includes arthropod hemocyanins and insect larval storage proteins. 124
46976 397679 pfam03723 Hemocyanin_C Hemocyanin, ig-like domain. This family includes arthropod hemocyanins and insect larval storage proteins. 243
46977 397680 pfam03724 META META domain. Small domain family found in proteins of unknown function. Some are secreted and implicated in motility in bacteria. Also occurs in Leishmania spp. as an essential gene. Over-expression in L.amazonensis increases virulence. A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond. 109
46978 397681 pfam03725 RNase_PH_C 3' exoribonuclease family, domain 2. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria. 67
46979 397682 pfam03726 PNPase Polyribonucleotide nucleotidyltransferase, RNA binding domain. This family contains the RNA binding domain of Polyribonucleotide nucleotidyltransferase (PNPase). PNPase is involved in mRNA degradation in a 3'-5' direction. 80
46980 397683 pfam03727 Hexokinase_2 Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. 241
46981 281690 pfam03728 Viral_DNA_Zn_bi Viral DNA-binding protein, zinc binding domain. This family represents the zinc binding domain of the viral DNA-binding protein, a multifunctional protein involved in DNA replication and transcription control. Two copies of this domain are found at the C-terminus of many members of the family. 94
46982 377113 pfam03729 DUF308 Short repeat of unknown function (DUF308). Family of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic). 73
46983 397684 pfam03730 Ku_C Ku70/Ku80 C-terminal arm. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the C terminal arm. This alpha helical region embraces the beta-barrel domain pfam02735 of the opposite subunit. 87
46984 397685 pfam03731 Ku_N Ku70/Ku80 N-terminal alpha/beta domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the amino terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six stranded beta sheet of the Rossman fold. 220
46985 367628 pfam03732 Retrotrans_gag Retrotransposon gag protein. Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved. 97
46986 397686 pfam03733 YccF Inner membrane component domain. Domain occurs as one or more copies in bacterial and eukaryotic proteins. These are membrane proteins of four TM regions, two appearing in each of the two copies when both are present. Many of the latter members also carry the sodium/calcium exchanger protein family pfam01699, which have multipass membrane regions. 51
46987 397687 pfam03734 YkuD L,D-transpeptidase catalytic domain. This family of proteins are found in a range of bacteria. It has been shown that this domain can act as an L,D-transpeptidase that gives rise to an alternative pathway for peptidoglycan cross-linking. This gives bacteria resistance to beta-lactam antibiotics that inhibit PBPs which usually carry out the cross-linking reaction. The conserved region contains a conserved histidine and cysteine, with the cysteine thought to be an active site residue. Several members of this family contain peptidoglycan binding domains. The molecular structure of YkuD protein shows this domain has a novel tertiary fold consisting of a beta-sandwich with two mixed sheets, one containing five strands and the other, six strands. The two beta-sheets form a cradle capped by an alpha-helix. This family was formerly called the ErfK/YbiS/YcfS/YnhG family, but is now named after the first protein of known structure. 89
46988 397688 pfam03735 ENT ENT domain. This presumed domain is named after Emsy N-terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. The N-terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important. 68
46989 397689 pfam03736 EPTP EPTP domain. Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member of the 7-bladed beta-propeller fold. 41
46990 397690 pfam03737 RraA-like Aldolase/RraA. Members of this family include regulator of ribonuclease E activity A (RraA) and 4-hydroxy-4-methyl-2-oxoglutarate (HMG)/4-carboxy-4-hydroxy-2-oxoadipate (CHA) aldolase, also known as RraA-like protein. RraA acts as a trans-acting modulator of RNA turnover, binding essential endonuclease RNase E and inhibiting RNA processing. RraA-like proteins seem to contain aldolase and/or decarboxylase activity either in place of or in addition to the RNase E inhibitor functions. 149
46991 397691 pfam03738 GSP_synth Glutathionylspermidine synthase preATP-grasp. This region contains the Glutathionylspermidine synthase enzymatic activity EC:6.3.1.8. This is the C-terminal region in bi-enzymes. Glutathionylspermidine (GSP) synthetases of Trypanosomatidae and Escherichia coli couple hydrolysis of ATP (to ADP and Pi) with formation of an amide bond between spermidine and the glycine carboxylate of glutathione (gamma-Glu-Cys-Gly). In the pathogenic trypanosomatids, this reaction is the penultimate step in the biosynthesis of the antioxidant metabolite, trypanothione (N1,N8-bis-(glutathionyl)spermidine), and is a target for drug design. This region, the pre-ATP grasp region, probably carries the substrate-binding site. 369
46992 397692 pfam03739 YjgP_YjgQ Predicted permease YjgP/YjgQ family. Members of this family are predicted integral membrane proteins of unknown function. They are about 350 amino acids long and contain about 6 transmembrane regions. They are predicted to be permeases although there is no verification of this. 351
46993 397693 pfam03740 PdxJ Pyridoxal phosphate biosynthesis protein PdxJ. Members of this family belong to the PdxJ family that catalyzes the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate. 234
46994 397694 pfam03741 TerC Integral membrane protein TerC family. This family contains a number of integral membrane proteins, including the TerC protein. TerC has been implicated in resistance to tellurium. This protein may be involved in efflux of tellurium ions. The tellurite-resistant Escherichia coli strain KL53 was found during testing of the group of clinical isolates for antibiotics and heavy metal ion resistance. The determinant of the tellurite resistance of the strain was located on a large conjugative plasmid. Analyses showed that the genes terB, terC, terD and terE are essential for conservation of the resistance. The members of the family contain a number of conserved aspartates that could be involved in binding to metal ions. 179
46995 397695 pfam03742 PetN PetN. PetN is a small hydrophobic protein, crucial for cytochrome b6-f complex assembly and/or stability. 29
46996 397696 pfam03743 TrbI Bacterial conjugation TrbI-like protein. Although not essential for conjugation, the TrbI protein greatly increases the conjugational efficiency. 186
46997 397697 pfam03744 BioW 6-carboxyhexanoate--CoA ligase. This family contains the enzyme 6-carboxyhexanoate--CoA ligase EC:6.2.1.14. This enzyme is involved in the first step of biotin synthesis, where it converts pimelate into pimeloyl-CoA. The enzyme requires magnesium as a cofactor and forms a homodimer. 239
46998 397698 pfam03745 DUF309 Domain of unknown function (DUF309). This domain is found in eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding. 59
46999 397699 pfam03746 LamB_YcsF LamB/YcsF family. This family includes LamB. The lam locus of Aspergillus nidulans consists of two divergently transcribed genes, lamA and lamB, involved in the utilisation of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. The exact molecular function of the proteins in this family is unknown. 238
47000 397700 pfam03747 ADP_ribosyl_GH ADP-ribosylglycohydrolase. This family includes enzymes that reverse ADP-ribosylations; for example, ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins; these proteins appear to have lost the presumed active site residues. 198
47001 397701 pfam03748 FliL Flagellar basal body-associated protein FliL. This FliL protein controls the rotational direction of the flagella during chemotaxis. FliL is a cytoplasmic membrane protein associated with the basal body. 98
47002 397702 pfam03749 SfsA Sugar fermentation stimulation protein. This family contains sugar fermentation stimulation proteins, which are probably regulatory factors involved in maltose metabolism. SfsA has been shown to bind DNA and it contains a helix-turn-helix motif that probably binds DNA at its C-terminus. 138
47003 397703 pfam03750 Csm2_III-A Csm2 Type III-A. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-associated) proteins. This entry represents Csm2 Type III-A, a family of Cas proteins also known as TM1810/Csm2. 112
47004 397704 pfam03752 ALF Short repeats of unknown function. This set of repeats is found in a small family of secreted proteins of no known function, though they are possibly involved in signal transduction. ALF stands for Alanine-rich (AL) - conserved Phenylalanine (F). 42
47005 367637 pfam03753 HHV6-IE Human herpesvirus 6 immediate early protein. The proteins in this family are poorly characterized, but an investigation has indicated that the immediate early protein is required for the down-regulation of MHC class I expression in dendritic cells. Human herpesvirus 6 immediate early protein is also referred to as U90. 964
47006 281714 pfam03754 DUF313 Domain of unknown function (DUF313). Family of proteins from Arabidopsis thaliana with uncharacterized function. 113
47007 397705 pfam03755 YicC_N YicC-like family, N-terminal region. Family of bacterial proteins. Although poorly characterized, the members of this protein family have been demonstrated to play a role in stationary phase survival. These proteins are not essential during stationary phase. 155
47008 367638 pfam03756 AfsA A-factor biosynthesis hotdog domain. The AfsA family are key enzymes in A-factor biosynthesis, which is essential for streptomycin production and resistance. This domain is distantly related to the thioester dehydratase FabZ family and therefore has a HotDog domain. 133
47009 397706 pfam03759 PRONE PRONE (Plant-specific Rop nucleotide exchanger). This is a functional guanine exchange factor (GEF) of plant Rho GTPase. 361
47010 397707 pfam03760 LEA_1 Late embryogenesis abundant (LEA) group 1. Family members are conserved along the entire coding region, especially within the hydrophobic internal 20 amino acid motif, which may be repeated. 71
47011 367641 pfam03761 DUF316 Chymotrypsin family Peptidase-S1. This is a family of trypsin-6 part of the chymotrypsin family S21, ie a serine peptidase. The C. elegans sequence UniProt:O01566 is trypsin-6: all the active site residues are present (His90, Asp168, Ser267). 281
47012 397708 pfam03762 VOMI Vitelline membrane outer layer protein I (VOMI). VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold. 168
47013 397709 pfam03763 Remorin_C Remorin, C-terminal region. Remorins are plant-specific plasma membrane-associated proteins. In tobacco remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich N-half and a more conserved C-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is alpha-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with a protein kinase(s). The biological functions of remorins are unknown but roles as components of the membrane/cytoskeleton are possible. 106
47014 397710 pfam03764 EFG_IV Elongation factor G, domain IV. This domain is found in elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ribosomal protein S5 domain 2-like fold. 121
47015 397711 pfam03765 CRAL_TRIO_N CRAL/TRIO, N-terminal domain. This all-alpha domain is found to the N-terminus of pfam00650. 53
47016 367646 pfam03766 Remorin_N Remorin, N-terminal region. Remorins are plant-specific plasma membrane-associated proteins. In tobacco remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich C-half and a more conserved N-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is alpha-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with a protein kinase(s). The biological functions of remorins are unknown but roles as components of the membrane/cytoskeleton are possible. 51
47017 397712 pfam03767 Acid_phosphat_B HAD superfamily, subfamily IIIB (Acid phosphatase). This family of proteins includes acid phosphatases and a number of vegetative storage proteins. 213
47018 397713 pfam03768 Attacin_N Attacin, N-terminal region. This family includes attacin and sarcotoxin, but not diptericin (which shares similarity with the C-terminal region of attacin). All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph, where they act synergistically to kill the invading microorganism. 64
47019 397714 pfam03769 Attacin_C Attacin, C-terminal region. This family includes attacin, sarcotoxin and diptericin. All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph, where they act synergistically to kill the invading microorganism. 120
47020 397715 pfam03770 IPK Inositol polyphosphate kinase. ArgRIII has been demonstrated to be an inositol polyphosphate kinase. 190
47021 367651 pfam03771 SPDY Domain of unknown function (DUF317). This is a sequence family found in a set of bacterial proteins with no known function. This domain is currently only found in Streptomyces bacteria. Most proteins contain two copies of this domain. 59
47022 397716 pfam03772 Competence Competence protein. Members of this family are integral membrane proteins with 6 predicted transmembrane helices. Some members of this family have been shown to be essential for bacterial competence in uptake of extracellular DNA. These proteins may transport DNA across the cell membrane. These proteins contain a highly conserved motif in the amino terminal transmembrane region that has two histidines that may form a metal binding site. 270
47023 281730 pfam03773 ArsP_1 Predicted permease. This family of integral membrane proteins are predicted to be permeases of unknown specificity. 316
47024 397717 pfam03775 MinC_C Septum formation inhibitor MinC, C-terminal domain. In Escherichia coli FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ. 97
47025 397718 pfam03776 MinE Septum formation topological specificity factor MinE. The E. coli minicell locus was shown to code for three gene products (MinC, MinD, and MinE) whose coordinate action is required for proper placement of the division septum. The minE gene codes for a topological specificity factor that, in wild-type cells, prevents the division inhibitor from acting at internal division sites while permitting it to block septation at polar sites. 67
47026 397719 pfam03777 DUF320 Small secreted domain (DUF320). Small domain found in a family of secreted Streptomyces proteins. It occurs singly or as a pair. Many of the domains have two cysteines that may form a disulphide bridge. 55
47027 112584 pfam03778 DUF321 Protein of unknown function (DUF321). This family may be related to the FARP (FMRFamide) family, pfam01581. Currently this repeat was only detectable in Arabidopsis thaliana. 20
47028 397720 pfam03779 SPW SPW repeat. A short repeat found in a small family of membrane-bound proteins. This repeat contains a conserved SPW motif in the first of two transmembrane helices. 48
47029 397721 pfam03780 Asp23 Asp23 family, cell envelope-related function. The alkaline shock protein Asp23 was identified as an alkaline shock protein that was expressed in a sigmaB-dependent manner in Staphylococcus aureus. Following an alkaline shock Asp23 accumulates in the soluble protein fraction of the S. aureus cell. Asp23 is one of the most abundant proteins in the cytosolic protein fraction of stationary S. aureus cells, with a copy-number of >25000 per cell. A second Asp23-family protein, AmaP, which is encoded within the asp23-operon, is required to localize Asp23 to the cell membrane. The overall function for the family is thus a cell envelope-related one in Gram-positive bacteria. 108
47030 397722 pfam03781 FGE-sulfatase Sulfatase-modifying factor enzyme 1. This domain is found in eukaryotic proteins required for post-translational sulfatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulfatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, the formylglycine (FGly)-generating enzyme, which is a sulfatase. Sulfatases are enzymes essential for degradation and remodelling of sulfate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulfatases. FGE is localized to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesized sulfatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilized by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulfide bond is present in the active site of FGE. An oxidized cysteine residue, possibly cysteine sulfenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulfatases. In Mycobacteria and Treponema denticola this enzyme functions as an iron(II)-dependent oxidoreductase. 259
47031 397723 pfam03782 AMOP AMOP domain. This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. 144
47032 397724 pfam03783 CsgG Curli production assembly/transport component CsgG. CsgG is an outer membrane-located lipoprotein that is highly resistant to protease digestion. During assembly of curli, an adhesive surface fibre, CsgG is required to maintain the stability of CsgA and CsgB. 209
47033 397725 pfam03784 Cyclotide Cyclotide family. This family contains a set of cyclic peptides with a variety of activities. The structure consists of a distorted triple-stranded beta-sheet and a cysteine-knot arrangement of the disulfide bonds. Cyclotides can be separated into two subfamilies, namely bracelet and moebius. The bracelet cyclotide subfamily tends to contain a larger number of positively charged residues and has a bracelet-like circularisation of the backbone. The moebius cyclotide subfamily contains a backbone twist due to a cis-Pro peptide bond and may conceptually be regarded as a molecular Moebius strip. 28
47034 367657 pfam03785 Peptidase_C25_C Peptidase family C25, C terminal ig-like domain. 74
47035 397726 pfam03786 UxuA D-mannonate dehydratase (UxuA). UxuA (this family) and UxuB are required for hexuronate degradation. 351
47036 397727 pfam03787 RAMPs RAMP superfamily. The molecular function of these proteins is not yet known. However, they have been identified and called the RAMP (Repair Associated Mysterious Proteins) superfamily. The members of this family have no known function; they are around 300 amino acids in length and have several conserved motifs. 188
47037 397728 pfam03788 LrgA LrgA family. This family is uncharacterized. It contains the protein LrgA that has been hypothesized to export murein hydrolases. 94
47038 397729 pfam03789 ELK ELK domain. This domain is required for the nuclear localization of these proteins. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily within homeobox pfam00046. 22
47039 397730 pfam03790 KNOX1 KNOX1 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization. 40
47040 397731 pfam03791 KNOX2 KNOX2 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization. 48
47041 397732 pfam03792 PBC PBC domain. The PBC domain is a member of the TALE (three-amino-acid loop extension) superclass of homeodomain proteins. 188
47042 397733 pfam03793 PASTA PASTA domain. This domain is found at the C termini of several Penicillin-binding proteins and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 63
47043 397734 pfam03795 YCII YCII-related domain. The majority of proteins in this family consist of a single copy of this domain, though it is also found as a repeat. A strongly conserved histidine and an aspartate suggest that the domain has an enzymatic function. This family also now includes the family formerly known as the DGPF domain (COG3795). Although its function is unknown, it is found fused to a sigma-70 factor family domain in CC_1329, suggesting that this domain plays a role in transcription initiation (Bateman A pers. obs.). This domain is named after the most conserved motif in the alignment. 94
47044 397735 pfam03796 DnaB_C DnaB-like helicase C terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis. 255
47045 397736 pfam03797 Autotransporter Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type V pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different protease is used and in some cases no cleavage occurs. 254
47046 397737 pfam03798 TRAM_LAG1_CLN8 TLC domain. 200
47047 397738 pfam03799 FtsQ Cell division protein FtsQ. FtsQ is one of several cell division proteins. FtsQ interacts with other Fts proteins, reviewed in. The precise function of FtsQ is unknown. 111
47048 397739 pfam03800 Nuf2 Nuf2 family. Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An Arabidopsis protein has been included in this family that has previously not been identified as a member of this family. The match is not strong, but in common with other members of this family it contains a coiled-coil to the C-terminus of this region. 139
47049 397740 pfam03801 Ndc80_HEC HEC/Ndc80p family. Members of this family are components of the mitotic spindle. It has been shown that Ndc80/HEC from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. 156
47050 397741 pfam03802 CitX Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. 164
47051 252175 pfam03803 Scramblase Scramblase. Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. 221
47052 281756 pfam03804 DUF325 Viral domain of unknown function. 73
47053 367667 pfam03805 CLAG Cytoadherence-linked asexual protein. Clag (cytoadherence linked asexual gene) is a malaria surface protein which has been shown to be involved in the binding of Plasmodium falciparum infected erythrocytes to host endothelial cells, a process termed cytoadherence. The cytoadherence phenomenon is associated with the sequestration of infected erythrocytes in the blood vessels of the brain, cerebral malaria. Clag is a multi-gene family in Plasmodium falciparum with at least 9 members identified to date. Orthologous proteins in the rodent malaria species Plasmodium chabaudi (Lawson D Unpubl. obs.) suggest that the gene family is found in other malaria species and may play a more generic role in cytoadherence. 1286
47054 397742 pfam03806 ABG_transport AbgT putative transporter family. 502
47055 397743 pfam03807 F420_oxidored NADP oxidoreductase coenzyme F420-dependent. 92
47056 397744 pfam03808 Glyco_tran_WecB Glycosyl transferase WecB/TagA/CpsF family. 169
47057 397745 pfam03810 IBN_N Importin-beta N-terminal domain. 72
47058 281762 pfam03811 Zn_Tnp_IS1 InsA N-terminal domain. This appears to be a short zinc binding domain found in IS1 InsA family protein. It is found at the N-terminus of the protein and may be a DNA-binding domain. 35
47059 397746 pfam03812 KdgT 2-keto-3-deoxygluconate permease. 298
47060 397747 pfam03813 Nrap Nrap protein domain 1. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. This domain has a nucleotidyltransferase structure. 144
47061 397748 pfam03814 KdpA Potassium-transporting ATPase A subunit. 549
47062 397749 pfam03815 LCCL LCCL domain. 96
47063 397750 pfam03816 LytR_cpsA_psr Cell envelope-related transcriptional attenuator domain. 149
47064 397751 pfam03817 MadL Malonate transporter MadL subunit. 117
47065 397752 pfam03818 MadM Malonate/sodium symporter MadM subunit. 55
47066 397753 pfam03819 MazG MazG nucleotide pyrophosphohydrolase domain. This domain is about 100 amino acid residues in length. It is found in the MazG protein from E. coli. It contains four conserved negatively charged residues that probably form an active site or metal binding site. This domain is found in isolation in some proteins as well as associated with pfam00590. This domain is clearly related to pfam01503 another pyrophosphohydrolase involved in histidine biosynthesis. This family may be structurally related to the NUDIX domain pfam00293 (Bateman A pers. obs.). 74
47067 397754 pfam03820 Mtc Tricarboxylate carrier. 319
47068 397755 pfam03821 Mtp Golgi 4-transmembrane spanning transporter. 231
47069 397756 pfam03822 NAF NAF domain. 56
47070 397757 pfam03823 Neurokinin_B Neurokinin B. 58
47071 397758 pfam03824 NicO High-affinity nickel-transport protein. High affinity nickel transporters involved in the incorporation of nickel into H2-uptake hydrogenase and urease enzymes. Essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Alcaligenes eutrophus is thought to be an integral membrane protein with seven transmembrane helices. The family also includes a cobalt transporter. 287
47072 281776 pfam03825 Nuc_H_symport Nucleoside H+ symporter. 400
47073 397759 pfam03826 OAR OAR domain. 19
47074 397760 pfam03827 Orexin_rec2 Orexin receptor type 2. 57
47075 397761 pfam03828 PAP_assoc Cid1 family poly A polymerase. This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. 60
47076 397762 pfam03829 PTSIIA_gutA PTS system glucitol/sorbitol-specific IIA component. 113
47077 397763 pfam03830 PTSIIB_sorb PTS system sorbose subfamily IIB component. 148
47078 397764 pfam03831 PhnA PhnA domain. 69
47079 397765 pfam03832 WSK WSK motif. This short motif is named after three conserved residues found in a WXSXK motif in protein kinase A anchoring proteins. 29
47080 397766 pfam03833 PolC_DP2 DNA polymerase II large subunit DP2. 866
47081 397767 pfam03834 Rad10 Binding domain of DNA repair protein Ercc1 (rad10/Swi10). Ercc1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA interstrand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of Ercc1 in targeting its catalytic XPF partner to the NER pre-incision complex. 114
47082 397768 pfam03835 Rad4 Rad4 transglutaminase-like domain. 144
47083 397769 pfam03836 RasGAP_C RasGAP C-terminus. This domain can be found in the C-terminus of the IQGAP family members, including human IQGAP1/2/3, S. cerevisiae Iqg1 and S. pombe Rng2. Some members function in cytoskeletal remodelling. Human IQGAP1 is a scaffolding protein that can assemble multi-protein complexes involved in cell-cell interaction, cell adherence, and movement via actin/tubulin-based cytoskeletal reorganization. IQGAP1 is also a regulator of the MAPK and Wnt/beta-catenin signaling pathways. Iqg1 and Rng2 are required for actomyosin ring construction during cytokinesis. 140
47084 397770 pfam03837 RecT RecT family. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to RecT. 192
47085 397771 pfam03838 RecU Recombination protein U. 162
47086 397772 pfam03839 Sec62 Translocation protein Sec62. 213
47087 397773 pfam03840 SecG Preprotein translocase SecG subunit. 69
47088 309101 pfam03841 SelA L-seryl-tRNA selenium transferase. 367
47089 146463 pfam03842 Silic_transp Silicon transporter. 513
47090 397774 pfam03843 Slp Outer membrane lipoprotein Slp family. 151
47091 281794 pfam03845 Spore_permease Spore germination protein. 321
47092 397775 pfam03846 SulA Cell division inhibitor SulA. 111
47093 367691 pfam03847 TFIID_20kDa Transcription initiation factor TFIID subunit A. 68
47094 397776 pfam03848 TehB Tellurite resistance protein TehB. 193
47095 397777 pfam03849 Tfb2 Transcription factor Tfb2. 362
47096 397778 pfam03850 Tfb4 Transcription factor Tfb4. This family appears to be distantly related to the VWA domain. 273
47097 252205 pfam03851 UvdE UV-endonuclease UvdE. 275
47098 281799 pfam03852 Vsr DNA mismatch endonuclease Vsr. 74
47099 397779 pfam03853 YjeF_N YjeF-related protein N-terminus. YjeF-N domain is a novel version of the Rossmann fold with a set of catalytic residues and structural features that are different from the conventional dehydrogenases. YjeF-N domain is fused to Ribokinases in bacteria (YjeF), where they may be phosphatases, and to divergent Sm and the FDF domain in eukaryotes (Dcp3p and FLJ21128), where they may be involved in decapping and catalyze hydrolytic RNA-processing reactions. 168
47100 281801 pfam03854 zf-P11 P-11 zinc finger. 50
47101 281802 pfam03855 M-factor M-factor. The M-factor is a pheromone produced upon nitrogen starvation. The production of M-factor is increased by the pheromone signal. The protein undergoes post-translational modification to remove the C-terminal signal peptide, and the carboxy-terminal cysteine residue is carboxy-methylated and S-alkylated with a farnesyl residue. 43
47102 397780 pfam03856 SUN Beta-glucosidase (SUN family). Members of this family include Nca3, Sun4 and Sim1. This is a family of yeast proteins, involved in a diverse set of functions (DNA replication, aging, mitochondrial biogenesis and cell septation). BGLA from Candida wickerhamii has been characterized as a Beta-glucosidase EC:3.2.1.21. 244
47103 367697 pfam03857 Colicin_im Colicin immunity protein. Colicin immunity proteins are plasmid-encoded proteins necessary for protecting the cell against colicins. Colicins are toxins released by bacteria during times of stress. 138
47104 367698 pfam03858 Crust_neuro_H Crustacean neurohormone H. These proteins are referred to as precursor-related peptides as they are typically co-transcribed and translated with the CHH neurohormone (pfam01147). However, in some species this neuropeptide is synthesized as a separate protein. Furthermore, neurohormone H can undergo proteolysis to give rise to 5 different neuropeptides. 41
47105 397781 pfam03859 CG-1 CG-1 domain. CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs. 114
47106 397782 pfam03860 DUF326 Domain of Unknown Function (DUF326). This family is a small cysteine-rich repeat. The cysteines mostly follow a C-X(2)-C-X(3)-C-X(2)-C-X(3) pattern, though they often appear at other positions in the repeat as well. 21
47107 397783 pfam03861 ANTAR ANTAR domain. ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil. 46
47108 397784 pfam03862 SpoVAC_SpoVAEB SpoVAC/SpoVAEB sporulation membrane protein. Members of this family are all transcribed from the spoVA operon. Bacillus and Clostridium are two well-studied endospore-forming bacteria. Spore formation provides a resistance mechanism in response to extreme or unfavourable environmental conditions such as heat, radiation, and chemical agents or nutrient deprivation. The reverse process termed germination takes place where spores develop into growing cells in response to nutrient availability or stress reduction. Nutrient germinant receptors (GRs) and the SpoVA proteins are important players in the germination process. In B. subtilis, SpoVAC and SpoVAEB, belonging to this domain family, are predicted to be membrane proteins with two to five membrane-spanning segments. Biophysical and biochemical studies suggest that SpoVAC acts as a mechano-sensitive channel with properties that would allow the release of Ca-DPA (dipicolinic acid) and amino acids during germination of the spore. The release of Ca-DPA is a crucial event during spore germination. When expressed in E. coli, SpoVAC provides protection against osmotic downshift. Furthermore, SpoVAC acts as a channel that facilitates the efflux down the concentration gradient of osmolytes up to a mass of at least 600 Da. Another conserved SpoVA protein in all spore-forming bacteria is SpoVAEb, which appears to be an integral membrane protein with no known function. 116
47109 281809 pfam03863 Phage_mat-A Phage maturation protein. 446
47110 397785 pfam03864 Phage_cap_E Phage major capsid protein E. Major capsid protein E is involved with the stabilisation of the condensed form of the DNA molecule in phage heads. 335
47111 397786 pfam03865 ShlB Haemolysin secretion/activation protein ShlB/FhaC/HecB. This family represents a group of sequences that are related to ShlB from Serratia marcescens. ShlB is an outer membrane protein pore involved in the Type Vb or Two-partner secretion system, where it functions to secrete and activate the haemolysin ShlA. The activation of ShlA occurs during secretion when ShlB imposes a conformational change in the inactive haemolysin to form the active protein. 316
47112 146478 pfam03866 HAP Hydrophobic abundant protein (HAP). Expression of HAP is thought to be developmentally regulated and possibly involved in spherule cell wall formation. 167
47113 397787 pfam03867 FTZ Fushi tarazu (FTZ), N-terminal region. This region contains the important motif (LXXLL) necessary for the interaction of FTZ with the nuclear receptor FTZ-F1. FTZ is thought to represent a category of LXXLL motif-dependent co-activators for nuclear receptors. 269
47114 397788 pfam03868 Ribosomal_L6e_N Ribosomal protein L6, N-terminal domain. 60
47115 397789 pfam03869 Arc Arc-like DNA binding domain. Arc repressor acts by the cooperative binding of two Arc repressor dimers to a 21-base-pair operator site. Each Arc dimer uses an antiparallel beta-sheet to recognize bases in the major groove. 50
47116 397790 pfam03870 RNA_pol_Rpb8 RNA polymerase Rpb8. Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits. 136
47117 397791 pfam03871 RNA_pol_Rpb5_N RNA polymerase Rpb5, N-terminal domain. Rpb5 has a bipartite structure which includes a eukaryote-specific N-terminal domain and a C-terminal domain resembling the archaeal RNAP subunit H. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA. 89
47118 397792 pfam03872 RseA_N Anti sigma-E protein RseA, N-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress. 87
47119 397793 pfam03873 RseA_C Anti sigma-E protein RseA, C-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress. 53
47120 397794 pfam03874 RNA_pol_Rpb4 RNA polymerase Rpb4. This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP), an essential subunit of RNA polymerase III. C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II. 115
47121 397795 pfam03875 Statherin Statherin. Statherin functions biologically to inhibit the nucleation and growth of calcium phosphate minerals. The N-terminus of statherin is highly charged, the glutamic acids of which have been shown to be important in the recognition of hydroxyapatite. 41
47122 397796 pfam03876 SHS2_Rpb7-N SHS2 domain found in N-terminus of Rpb7p/Rpc25p/MJ0397. Rpb7 binds to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during RNA polymerase II elongation. This family includes the homologs from RNA polymerase I and III. In RNA polymerase I, Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. The N-terminus of Rpb7p/Rpc25p/MJ0397 has a SHS2 domain that is involved in protein-protein interaction. 57
47123 397797 pfam03878 YIF1 YIF1. YIF1 (Yip1 interacting factor) is an integral membrane protein that is required for membrane fusion of ER derived vesicles. It also plays a role in the biogenesis of ER derived COPII transport vesicles. 243
47124 397798 pfam03879 Cgr1 Cgr1 family. Members of this family are coiled-coil proteins that are involved in pre-rRNA processing. 107
47125 397799 pfam03880 DbpA DbpA RNA binding domain. This RNA binding domain is found at the C-terminus of a number of DEAD helicase proteins. It is sufficient to confer specificity for hairpin 92 of 23S rRNA, which is part of the ribosomal A-site. However, several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond this domain [Karl Kossen, unpublished data]. 72
47126 397800 pfam03881 Fructosamin_kin Fructosamine kinase. This family includes eukaryotic fructosamine-3-kinase enzymes. The family also includes bacterial members that have not been characterized but probably have a similar or identical function. 287
47127 397801 pfam03882 KicB MukF winged-helix domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid. 115
47128 397802 pfam03883 H2O2_YaaD Peroxide stress protein YaaA. YaaA is a key element of the stress response to H2O2. It acts by reducing intracellular iron levels after peroxide stress, thereby attenuating the Fenton reaction and the DNA damage that this would cause. The molecular mechanism of action is not known. 233
47129 397803 pfam03884 YacG DNA gyrase inhibitor YacG. YacG inhibits all the catalytic activities of DNA gyrase by preventing its interaction with DNA. It acts by binding directly to the C-terminal domain of GyrB, which probably disrupts DNA binding by the gyrase. YacG has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins. The conserved four cysteine motif in this protein (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures. 48
47130 397804 pfam03885 DUF327 Protein of unknown function (DUF327). The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown. 141
47131 397805 pfam03886 ABC_trans_aux ABC-type transport auxiliary lipoprotein component. ABC_trans_aux is a family of bacterial proteins that act as auxiliaries to the ABC-transporter in the gamma-hexachlorocyclohexane uptake permease system in Sphingobium japonicum. Gamma-hexachlorocyclohexane, or Lindane, can be used as the sole source of carbon in S. japonicum in aerobic conditions. Lindane is an insecticide. 154
47132 397806 pfam03887 YfbU YfbU domain. This presumed domain is about 160 residues long. It is found in archaebacteria and eubacteria. In Corynebacterium glutamicum Ycg4L it is associated with a helix-turn-helix domain. This suggests that this may be a ligand binding domain. 165
47133 397807 pfam03888 MucB_RseB MucB/RseB N-terminal domain. Members of this family are regulators of the anti-sigma E protein RseD. 178
47134 397808 pfam03889 DUF331 Domain of unknown function. Members of this family are uncharacterized proteins from a number of bacterial species. The proteins range in size from 50-70 residues. 44
47135 397809 pfam03891 DUF333 Domain of unknown function (DUF333). This small domain of about 70 residues is found in a number of bacterial proteins. It is found at the N-terminus of the AF_1947 protein. The proteins containing this domain are uncharacterized. 46
47136 397810 pfam03892 NapB Nitrate reductase cytochrome c-type subunit (NapB). The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase. 122
47137 309136 pfam03893 Lipase3_N Lipase 3 N-terminal region. N terminal region to pfam01764, found on a subset of Lipase 3 containing proteins. 76
47138 397811 pfam03894 XFP D-xylulose 5-phosphate/D-fructose 6-phosphate phosphoketolase. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779. 177
47139 397812 pfam03895 YadA_anchor YadA-like membrane anchor domain. This region represents the C-terminal 120 amino acids of a family of surface-exposed bacterial proteins. YadA, an adhesin from Yersinia, was the first member of this family to be characterized. UspA2 from Moraxella was second. The Eib immunoglobulin-binding proteins from E. coli were third, followed by the DsrA proteins of Haemophilus ducreyi and others. These proteins are homologous at their C-terminal and have predicted signal sequences, but they diverge elsewhere. The C-terminal 9 amino acids, consisting of alternating hydrophobic amino acids ending in F or W, comprise a targeting motif for the outer membrane of the Gram negative cell envelope. This region is important for oligomerization. 60
47140 397813 pfam03896 TRAP_alpha Translocon-associated protein (TRAP), alpha subunit. The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane. 279
47141 146498 pfam03898 TNV_CP Satellite tobacco necrosis virus coat protein. 198
47142 397814 pfam03899 ATP-synt_I ATP synthase I chain. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI. AtpI is a Ca2+/Mg2+ transporter. 99
47143 397815 pfam03900 Porphobil_deamC Porphobilinogen deaminase, C-terminal domain. 72
47144 281842 pfam03901 Glyco_transf_22 Alg9-like mannosyltransferase family. Members of this family are mannosyltransferase enzymes. At least some members are localized in endoplasmic reticulum and involved in GPI anchor biosynthesis. 414
47145 397816 pfam03902 Gal4_dimer Gal4-like dimerization domain. 44
47146 397817 pfam03903 Phage_T4_gp36 Phage T4 tail fibre. 131
47147 112704 pfam03904 DUF334 Domain of unknown function (DUF334). Staphylococcus aureus plasmid proteins with no characterized function. 229
47148 146503 pfam03905 Corona_NS4 Coronavirus non-structural protein NS4. 45
47149 281845 pfam03906 Phage_T7_tail Phage T7 tail fibre protein. The bacteriophage T7 tail complex consists of a conical tail-tube surrounded by six kinked tail-fibers, which are oligomers of the viral protein gp17. 157
47150 397818 pfam03907 Spo7 Spo7-like protein. S. cerevisiae Spo7 has an unknown function, but has a role in formation of a spherical nucleus and meiotic division. 205
47151 112708 pfam03908 Sec20 Sec20. Sec20 is a membrane glycoprotein associated with the secretory pathway. 92
47152 397819 pfam03909 BSD BSD domain. This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. 55
47153 367721 pfam03910 Adeno_PV Adenovirus minor core protein PV. 355
47154 397820 pfam03911 Sec61_beta Sec61beta family. This family consists of homologs of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. It consists of a single putative transmembrane helix, preceded by a short stretch containing various charged residues; this arrangement may help determine orientation in the cell membrane. 38
47155 397821 pfam03912 Psb28 Psb28 protein. Psb28 is a 13 kDa soluble protein that is directly assembled in dimeric PSII supercomplexes. The negatively charged N-terminal region is essential for this process. This protein was formerly known as PsbW, but PsbW is now reserved for pfam07123. 106
47156 112713 pfam03913 Amb_V_allergen Amb V Allergen. 44
47157 397822 pfam03914 CBF CBF/Mak21 family. 152
47158 397823 pfam03915 AIP3 Actin interacting protein 3. 407
47159 397824 pfam03916 NrfD Polysulphide reductase, NrfD. NrfD is an integral transmembrane protein with loops in both the periplasm and the cytoplasm. NrfD is thought to participate in the transfer of electrons, from the quinone pool into the terminal components of the Nrf pathway. 313
47160 397825 pfam03917 GSH_synth_ATP Eukaryotic glutathione synthase, ATP binding domain. 475
47161 397826 pfam03918 CcmH Cytochrome C biogenesis protein. Members of this family include NrfF, CcmH, CycL, Ccl2. 143
47162 397827 pfam03919 mRNA_cap_C mRNA capping enzyme, C-terminal domain. 108
47163 397828 pfam03920 TLE_N Groucho/TLE N-terminal Q-rich domain. The N-terminal domain of the Groucho/TLE co-repressor proteins is involved in oligomerization. 125
47164 397829 pfam03921 ICAM_N Intercellular adhesion molecule (ICAM), N-terminal domain. ICAMs normally function to promote intercellular adhesion and signalling. However, the N-terminal domain of the receptor binds to the rhinovirus 'canyon' surrounding the icosahedral 5-fold axes, during the viral attachment process. This family is part of the Ig superfamily and is therefore related to the family ig (pfam00047). 86
47165 397830 pfam03922 OmpW OmpW family. This family includes outer membrane protein W (OmpW) proteins from a variety of bacterial species. This protein may form the receptor for S4 colicins in E. coli. 193
47166 397831 pfam03923 Lipoprotein_16 Uncharacterized lipoprotein. The function of this presumed lipoprotein is unknown. The family includes E. coli YajG. 149
47167 397832 pfam03924 CHASE CHASE domain. This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases. Predicted to be a ligand binding domain. 186
47168 397833 pfam03925 SeqA SeqA protein C-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor. 110
47169 397834 pfam03927 NapD NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase. 71
47170 397835 pfam03928 Haem_degrading Haem-degrading. Haem_bdg is a bacterial protein that is up-regulated in response to haemin- and peroxide-based oxidative stress. It interacts with the SenS/SenR two-component signal transduction system. Iron binds to surface-exposed lysine residues of an octameric assembly of the protein. 116
47171 397836 pfam03929 PepSY_TM PepSY-associated TM region. The PepSY_TM family is so named because it is an alignment of up to five transmembrane helices found in bacterial species, some of which carry a nested PepSY domain, pfam03413. 354
47172 397837 pfam03930 Flp_N Recombinase Flp protein N-terminus. 82
47173 397838 pfam03931 Skp1_POZ Skp1 family, tetramerisation domain. 60
47174 397839 pfam03932 CutC CutC family. Copper transport in Escherichia coli is mediated by the products of at least six genes, cutA, cutB, cutC, cutD, cutE, and cutF. A mutation in one or more of these genes results in an increased copper sensitivity. Members of this family are between 200 and 300 amino acids in length and are found in both eukaryotes and bacteria. 201
47175 397840 pfam03934 T2SSK Type II secretion system (T2SS), protein K. Members of this family are involved in the Type II protein secretion system. The T2SK family includes proteins such as ExeK, PulK, OutX and XcpX. 282
47176 397841 pfam03935 SKN1 Beta-glucan synthesis-associated protein (SKN1). This family consists of the beta-glucan synthesis-associated proteins KRE6 and SKN1. Beta1,6-Glucan is a key component of the yeast cell wall, interconnecting cell wall proteins, beta1,3-glucan, and chitin. It has been postulated that the synthesis of beta1,6-glucan begins in the endoplasmic reticulum with the formation of protein-bound primer structures and that these primer structures are extended in the Golgi complex by two putative glucosyltransferases that are functionally redundant, Kre6 and Skn1. This is followed by maturation steps at the cell surface and by coupling to other cell wall macromolecules. 500
47177 397842 pfam03936 Terpene_synth_C Terpene synthase family, metal binding domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiradiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase. 266
47178 397843 pfam03937 Sdh5 Flavinator of succinate dehydrogenase. This family includes the highly conserved mitochondrial and bacterial proteins Sdh5/SDHAF2/SdhE. Both yeast and human Sdh5/SDHAF2 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans. Bacterial homologs of Sdh5, termed SdhE, are functionally conserved, being required for the flavinylation of SdhA and succinate dehydrogenase activity. Like Sdh5, SdhE interacts with SdhA. Furthermore, SdhE was characterized as a FAD co-factor chaperone that directly binds FAD to facilitate the flavinylation of SdhA. Phylogenetic analysis demonstrates that SdhE/Sdh5 proteins evolved only once in an ancestral alpha-proteobacterium prior to the evolution of the mitochondria and now remain in subsequent descendants including eukaryotic mitochondria and the alpha, beta and gamma proteobacteria. This family was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true. The E. coli protein YgfY also acts as the antitoxin to the membrane-bound toxin family Cpta, pfam13166, whose E. coli member, YgfX, is expressed from the same operon as YgfY. 73
47179 397844 pfam03938 OmpH Outer membrane protein (OmpH-like). This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterized as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery. 140
47180 397845 pfam03939 Ribosomal_L23eN Ribosomal protein L23, N-terminal domain. The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a. 50
47181 281873 pfam03940 MSSP Male specific sperm protein. This family of Drosophila proteins is typified by the repetitive motif C-G-P. 51
47182 397846 pfam03941 INCENP_ARK-bind Inner centromere protein, ARK binding region. This region of the inner centromere protein has been found to be necessary and sufficient for binding to aurora-related kinase. This interaction has been implicated in the coordination of chromosome segregation with cell division in yeast. 53
47183 397847 pfam03942 DTW DTW domain. This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. 194
47184 397848 pfam03943 TAP_C TAP C-terminal domain. The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for nuclear export of mRNA. Tap has a modular structure, and its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling. The structure of the C-terminal domain is composed of four helices. The structure is related to the UBA domain. 48
47185 397849 pfam03944 Endotoxin_C delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 142
47186 397850 pfam03945 Endotoxin_N delta endotoxin, N-terminal domain. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 218
47187 397851 pfam03946 Ribosomal_L11_N Ribosomal protein L11, N-terminal domain. The N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain. 65
47188 397852 pfam03947 Ribosomal_L2_C Ribosomal Proteins L2, C-terminal domain. 123
47189 397853 pfam03948 Ribosomal_L9_C Ribosomal protein L9, C-terminal domain. 86
47190 397854 pfam03949 Malic_M Malic enzyme, NAD binding domain. 257
47191 397855 pfam03950 tRNA-synt_1c_C tRNA synthetases class I (E and Q), anti-codon binding domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln). 175
47192 397856 pfam03951 Gln-synt_N Glutamine synthetase, beta-Grasp domain. 82
47193 397857 pfam03952 Enolase_N Enolase, N-terminal domain. 131
47194 397858 pfam03953 Tubulin_C Tubulin C-terminal domain. This family includes the tubulin alpha, beta and gamma chains. Members of this family are involved in polymer formation. Tubulins are GTPases. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules. (The FtsZ GTPases have been split into their own family). 125
47195 397859 pfam03954 Lectin_N Hepatic lectin, N-terminal domain. 128
47196 281888 pfam03955 Adeno_PIX Adenovirus hexon-associated protein (IX). Hexon (PF01065) is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. 110
47197 397860 pfam03956 Lys_export Lysine exporter LysO. Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. This family includes lysine exporter LysO (YbjE) from E. coli. 190
47198 397861 pfam03957 Jun Jun-like transcription factor. 228
47199 397862 pfam03958 Secretin_N Bacterial type II/III secretion system short domain. This is a short, often repeated, domain found in bacterial type II/III secretory system proteins. All previous NolW-like domains fall into this family. 68
47200 397863 pfam03959 FSH1 Serine hydrolase (FSH1). This is a family of serine hydrolases. 208
47201 397864 pfam03960 ArsC ArsC family. This family is related to glutaredoxins pfam00462. 109
47202 397865 pfam03961 FapA Flagellar Assembly Protein A. Members of this family include FapA (flagellar assembly protein A), found in Vibrio vulnificus. The synthesis of flagella allows bacteria to respond to chemotaxis by facilitating motility. Studies examining the role of FapA show that the loss or delocalization of FapA results in a complete failure of the flagellar biosynthesis and motility in response to glucose mediated chemotaxis. The polar localization of FapA is required for flagellar synthesis, and dephosphorylated EIIAGlc (Glucose-permease IIA component) inhibited the polar localization of FapA through direct interaction. 451
47203 397866 pfam03962 Mnd1 Mnd1 family. This family of proteins includes MND1 from S. cerevisiae. The mnd1 protein forms a complex with hop2 to promote homologous chromosome pairing and meiotic double-strand break repair. 60
47204 397867 pfam03963 FlgD Flagellar hook capping protein - N-terminal region. FlgD is known to be absolutely required for hook assembly, yet it has not been detected in the mature flagellum. It appears to act as a hook-capping protein to enable assembly of hook protein subunits. FlgD regulates the assembly of the hook cap structure to prevent leakage of hook monomers into the medium and hook monomer polymerization and also plays a role in determination of the correct hook length, with the help of the FliK protein. This family represents the N-terminal conserved region of FlgD. A recent crystal structure showed that this region was likely to be flexible and was cleaved off during crystallisation. 75
47205 281897 pfam03964 Chorion_2 Chorion family 2. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 103
47206 397868 pfam03965 Penicillinase_R Penicillinase repressor. The penicillinase repressor negatively regulates expression of the penicillinase gene. The N-terminal region of this protein is involved in operator recognition, while the C-terminal is responsible for dimerization of the protein. 115
47207 397869 pfam03966 Trm112p Trm112p-like protein. The function of this family is uncertain. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C-terminus contains the strongest conservation. Trm112p is required for tRNA methylation in S. cerevisiae and is found in complexes with 2 tRNA methylases (TRM9 and TRM11) also with putative methyltransferase YDR140W. The zinc-finger protein Ynr046w is plurifunctional and a component of the eRF1 methyltransferase in yeast. The crystal structure of Ynr046w has been determined to 1.7 A resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three alpha-helices. 44
47208 397870 pfam03967 PRCH Photosynthetic reaction centre, H-chain N-terminal region. The family corresponds to the N-terminal cytoplasmic domain. 133
47209 397871 pfam03968 OstA OstA-like protein. This family of proteins are mostly uncharacterized. However the family does include E. coli OstA that has been characterized as an organic solvent tolerance protein. 113
47210 397872 pfam03969 AFG1_ATPase AFG1-like ATPase. This P-loop motif-containing family of proteins includes AFG1, LACE1 and ZapE. ATPase family gene 1 (AFG1) is a 377 amino acid yeast protein with an ATPase motif typical of the family. LACE1, the mammalian homolog of AFG1, is a mitochondrial integral membrane protein that is essential for maintenance of fused mitochondrial reticulum and lamellar cristae morphology. It has also been demonstrated that LACE1 mediates degradation of nuclear-encoded complex IV subunits COX4 (cytochrome c oxidase 4), COX5A and COX6A, and is required for normal activity of complexes III and IV of the respiratory chain. ZapE is a cell division protein found in Gram-negative bacteria. The bacterial cell division process relies on the assembly, positioning, and constriction of FtsZ ring (the so-called Z-ring), a ring-like network that marks the future site of the septum of bacterial cell division. ZapE is a Z-ring associated protein required for cell division under low-oxygen conditions. It is an ATPase that appears at the constricting Z-ring late in cell division. It reduces the stability of FtsZ polymers in the presence of ATP in vitro. 361
47211 281903 pfam03970 Herpes_UL37_1 Herpesvirus UL37 tegument protein. UL37 interacts with UL36, which is thought to be an important early step in tegumentation during virion morphogenesis in the cytoplasm. 267
47212 397873 pfam03971 IDH Monomeric isocitrate dehydrogenase. NADP(+)-dependent isocitrate dehydrogenase (ICD) is an important enzyme of the intermediary metabolism, as it controls the carbon flux within the citric acid cycle and supplies the cell with 2-oxoglutarate EC:1.1.1.42 and NADPH for biosynthetic purposes. 733
47213 397874 pfam03972 MmgE_PrpD MmgE/PrpD family. This family includes 2-methylcitrate dehydratase EC:4.2.1.79 (PrpD) that is required for propionate catabolism. It catalyzes the third step of the 2-methylcitric acid cycle. 437
47214 397875 pfam03973 Triabin Triabin. Triabin is a serine-protease inhibitor with a calycin fold. 147
47215 397876 pfam03974 Ecotin Ecotin. Ecotin is a broad range serine protease inhibitor, which forms homodimers. The C-terminal region contains the dimerization motif. Interestingly, the binding sites show a fluidity of protein contacts derived from ecotin's innate flexibility in fitting itself to proteases. 122
47216 397877 pfam03975 CheD CheD chemotactic sensory transduction. This chemotaxis protein stimulates methylation of MCP proteins. The chemotaxis machinery of Bacillus subtilis is similar to that of the well characterized system of Escherichia coli. However, B. subtilis contains several chemotaxis genes not found in the E. coli genome, such as CheC and CheD, indicating that the B. subtilis chemotactic system is more complex. CheD plays an important role in chemotactic sensory transduction for many organisms. CheD deamidates other B. subtilis chemoreceptors including McpB and McpC. Deamidation by CheD is required for B. subtilis chemoreceptors to effectively transduce signals to the CheA kinase. The structure of a complex between the signal-terminating phosphatase, CheC, and the receptor-modifying deamidase, CheD, reveals how CheC mimics receptor substrates to inhibit CheD and how CheD stimulates CheC phosphatase activity. CheD resembles other cysteine deamidases from bacterial pathogens that inactivate host Rho-GTPases. Phospho-CheY, the intracellular signal and CheC target, stabilizes the CheC-CheD complex and reduces availability of CheD. A model is proposed whereby CheC acts as a CheY-P-induced regulator of CheD; CheY-P would cause CheC to sequester CheD from the chemoreceptors, inducing adaptation of the chemotaxis system. 106
47217 397878 pfam03976 PPK2 Polyphosphate kinase 2 (PPK2). Inorganic polyphosphate (polyP) plays a role in metabolism and regulation and has been proposed to serve as an energy source in a pre-ATP world. In prokaryotes, the synthesis and utilisation of polyP are catalyzed by PPK1, PPK2 and polyphosphatases. Proteins with a single PPK2 domain catalyze polyP-dependent phosphorylation of ADP to ATP, whereas proteins containing 2 fused PPK2 domains phosphorylate AMP to ADP. The structure of PPK2 from Pseudomonas aeruginosa has revealed a 3-layer alpha/beta/alpha sandwich fold with an alpha-helical lid similar to the structures of microbial thymidylate kinases. 229
47218 397879 pfam03977 OAD_beta Na+-transporting oxaloacetate decarboxylase beta subunit. Members of this family are integral membrane proteins. The decarboxylation reactions they catalyze are coupled to the vectorial transport of Na+ across the cytoplasmic membrane, thereby creating a sodium ion motive force that is used for ATP synthesis. 345
47219 397880 pfam03978 Borrelia_REV Borrelia burgdorferi REV protein. This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete). The function of REV is unknown, although it is known that the gene is induced during the ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli. 157
47220 397881 pfam03979 Sigma70_r1_1 Sigma-70 factor, region 1.1. Region 1.1 modulates DNA binding by regions 2 and 4 when sigma is unbound by the core RNA polymerase. Region 1.1 is also involved in promoter binding. 69
47221 397882 pfam03980 Nnf1 Nnf1. NNF1 is an essential yeast gene that is necessary for chromosome segregation. It is associated with the spindle poles and forms part of a kinetochore subcomplex called MIND. 103
47222 397883 pfam03981 Ubiq_cyt_C_chap Ubiquinol-cytochrome C chaperone. 137
47223 112781 pfam03982 DAGAT Diacylglycerol acyltransferase. The terminal step of triacylglycerol (TAG) formation is catalyzed by the enzyme diacylglycerol acyltransferase (DAGAT). 297
47224 397884 pfam03983 SHD1 SLA1 homology domain 1, SHD1. NPFXD peptides specifically interact with the SHD1 domain. NPFXD is a clathrin-facilitated endocytic targeting signal. NPFXD was originally discovered in the cytoplasmic domain of the furin-like protease Kex2p. Sla1 is thought to function as an endocytic adaptor. 67
47225 367755 pfam03984 DUF346 Repeat of unknown function (DUF346). This repeat was found as seven tandem copies in one protein. It is predicted to be composed of beta-strands. Thus it is likely that it forms a beta-propeller structure. It is found in association with BNR repeats, which also form a beta-propeller. 36
47226 397885 pfam03985 Paf1 Paf1. Members of this family are components of the RNA polymerase II associated Paf1 complex. The Paf1 complex functions during the elongation phase of transcription in conjunction with Spt4-Spt5 and Spt16-Pob3i. 416
47227 367757 pfam03986 Autophagy_N Autophagocytosis associated protein (Atg3), N-terminal domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the lysosome/vacuole. Atg3 is a ubiquitin like modifier that is topologically similar to the canonical E2 enzyme. It catalyzes the conjugation of Atg8 and phosphatidylethanolamine. 126
47228 397886 pfam03987 Autophagy_act_C Autophagocytosis associated protein, active-site domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex. 188
47229 397887 pfam03988 DUF347 Repeat of Unknown Function (DUF347). This repeat is found as four tandem repeats in a family of bacterial membrane proteins. Each repeat contains two transmembrane regions and a conserved tryptophan. 50
47230 397888 pfam03989 DNA_gyraseA_C DNA gyrase C-terminal domain, beta-propeller. This repeat is found as 6 tandem copies at the C-termini of GyrA and ParC DNA gyrases. It is predicted to form 4 beta strands and to probably form a beta-propeller structure. This region has been shown to bind DNA non-specifically and may stabilize the DNA-topoisomerase complex. 48
47231 397889 pfam03990 DUF348 Domain of unknown function (DUF348). This domain normally occurs as tandem repeats; however it is found as a single copy in the S. cerevisiae DNA-binding nuclear protein YCR593. This protein is involved in sporulation as part of the SET3C complex, which is required to repress early/middle sporulation genes during meiosis. The bacterial proteins are likely to be involved in a cell wall function as they are found in conjunction with the pfam07501 domain, which is involved in various cell surface processes. 41
47232 112790 pfam03991 Prion_octapep Copper binding octapeptide repeat. This repeat is found at the amino terminus of prion proteins. It has been shown to bind to copper. 8
47233 397890 pfam03992 ABM Antibiotic biosynthesis monooxygenase. This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue. 74
47234 397891 pfam03993 DUF349 Domain of Unknown Function (DUF349). This domain is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand. 72
47235 397892 pfam03994 DUF350 Domain of Unknown Function (DUF350). This domain occurs in a small set of bacterial proteins. It has two transmembrane regions, and often occurs as tandem repeats. There are no conserved catalytic residues. 54
47236 377188 pfam03995 Inhibitor_I36 Peptidase inhibitor family I36. This domain is currently only found in a small set of S. coelicolor secreted proteins. There are four conserved cysteines that probably form two disulphide bonds. Proteins 2SCK31.15C and SCO3675 also have probable beta-propellers at their C-termini. This family includes Streptomyces nigrescens SmpI, a known peptidase inhibitor of known structure. This protein has a crystallin like fold pfam00030 and is distantly related by sequence. It is not known whether other members of this family are peptidase inhibitors. 69
47237 397893 pfam03996 Hema_esterase Hemagglutinin esterase. 384
47238 397894 pfam03997 VPS28 VPS28 protein. 187
47239 397895 pfam03998 Utp11 Utp11 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome. 245
47240 397896 pfam03999 MAP65_ASE1 Microtubule associated protein (MAP65/ASE1 family). 562
47241 397897 pfam04000 Sas10_Utp3 Sas10/Utp3/C1D family. This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. This family also includes the human protein Neuroguidin an initiation factor 4E (eIF4E) binding protein. 84
47242 397898 pfam04001 Vhr1 Transcription factor Vhr1. Vhr1 is a transcription factor which regulates the biotin-dependent expression of transporters VHT1 and BIO5. 91
47243 397899 pfam04002 RadC RadC-like JAB domain. A family of proteins present widely across the bacteria. This family was named initially with reference to the E. coli radC102 mutation which suggested that RadC was involved in repair of DNA lesions. However the relevant mutation has subsequently been shown to be in recG, where radC is in fact an allele of recG. In addition, a personal communication from Claverys, J-P, et al, indicates a total failure of all attempts to characterize a radiation-related function for RadC in Streptococcus pneumoniae, suggesting that it is not involved in repair of DNA lesions, in recombination during transformation, in gene conversion, nor in mismatch repair. Computational analysis, however, provides a possible function. The RadC-like family belongs to the JAB superfamily of metalloproteins. The domain shows fusions to an N-terminal Helix-hairpin-Helix (HhH) domain in most instances. Other domain combinations include fusions to the anti-restriction module ArdC, the DinG/RAD3-like superfamily II helicases and the DNAG-like primase. In some bacteria, closely related DinG/Rad3-like superfamily II helicases are fused to a 3'-5' exonuclease in the same position as the RadC-like JAB domain. These conserved domain associations lead to the hypothesis that the RadC-like JAB domains might function as a nuclease. 113
47244 397900 pfam04003 Utp12 Dip2/Utp12 Family. This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein; the yeast protein is called Utp12 or DIP2. 106
47245 397901 pfam04004 Leo1 Leo1-like protein. Members of this family are part of the Paf1/RNA polymerase II complex. The Paf1 complex probably functions during the elongation phase of transcription. The Leo1 subunit of the yeast Paf1-complex binds RNA and contributes to complex recruitment. The subunit acts by co-ordinating co-transcriptional chromatin modifications and helping recruitment of mRNA 3'-end processing factors. 163
47246 397902 pfam04005 Hus1 Hus1-like protein. Hus1, Rad1, and Rad9 are three evolutionarily conserved proteins required for checkpoint control in fission yeast. These proteins are known to form a stable complex in vivo. Hus1-Rad1-Rad9 complex may form a PCNA-like ring structure, and could function as a sliding clamp during checkpoint control. 288
47247 397903 pfam04006 Mpp10 Mpp10 protein. This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites. 591
47248 281936 pfam04007 DUF354 Protein of unknown function (DUF354). Members of this family are around 350 amino acids in length. They are found in archaebacteria and have no known function. 349
47249 397904 pfam04008 Adenosine_kin Adenosine specific kinase. The structure of a member of this family from the hyperthermophilic archaeon Pyrobaculum aerophilum contains a modified histidine residue which is interpreted as stable phosphorylation. In vitro binding studies confirmed that adenosine and AMP but not ADP or ATP bind to the protein. 154
47250 397905 pfam04009 DUF356 Protein of unknown function (DUF356). Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However it contains a conserved motif IHPPAH that may be involved in its function. 106
47251 397906 pfam04010 DUF357 Protein of unknown function (DUF357). Members of this family are short (less than 100 amino acid) proteins found in archaebacteria. The function of these proteins is unknown. 73
47252 397907 pfam04011 LemA LemA family. The members of this family are related to the LemA protein. LemA contains an amino terminal predicted transmembrane helix. It has been predicted that the small amino terminus is extracellular. The exact molecular function of this protein is uncertain. 149
47253 281941 pfam04012 PspA_IM30 PspA/IM30 family. This family includes PspA a protein that suppresses sigma54-dependent transcription. The PspA protein, a negative regulator of the Escherichia coli phage shock psp operon, is produced when virulence factors are exported through secretins in many Gram-negative pathogenic bacteria and its homolog in plants, VIPP1, plays a critical role in thylakoid biogenesis, essential for photosynthesis. Activation of transcription by the enhancer-dependent bacterial sigma(54) containing RNA polymerase occurs through ATP hydrolysis-driven protein conformational changes enabled by activator proteins that belong to the large AAA(+) mechanochemical protein family. It has been shown that PspA directly and specifically acts upon and binds to the AAA(+) domain of the PspF transcription activator. 218
47254 397908 pfam04013 Methyltrn_RNA_2 Putative SAM-dependent RNA methyltransferase. This family is likely to be an S-adenosyl-L-methionine (SAM)-dependent RNA methyltransferase. It is responsible for N1-methylation of pseudouridine 54 in archaeal tRNAs. 198
47255 397909 pfam04014 MazE_antitoxin Antidote-toxin recognition MazE, bacterial antitoxin. AbrB-like is a family of small proteins that operate in conjunction with a cognate toxin molecule. The commonly attributed role of toxin-antitoxin systems is to maintain low-copy number plasmids from one generation to the next. Such gene-pairs are also found on chromosomes and to be associated with a number of biological functions such as: reduction of protein synthesis, gene regulation and retardation of cell growth under nutritional stress. This family includes proteins from a number of different pairings, eg MazE, AbrB, VapB, PhoU, PemI-like and SpoVT. MazE is the antidote to the toxin MazF of E. coli. MazE-MazF in E. coli is a regulated prokaryotic chromosomal addiction module. MazE antidote is degraded by the ClpPA protease of the bacterial proteasome. MazE-MazF is thought to play a role in programmed cell death when cells suffer nutrient deprivation, and MazE-MazF modules have also been implicated in the bacteriostatic effects of other addiction modules. 44
47256 397910 pfam04015 DUF362 Domain of unknown function (DUF362). Domain that is sometimes present in iron-sulphur proteins. 199
47257 397911 pfam04016 DUF364 Putative heavy-metal chelation. This domain of unknown function has a PLP-dependent transferase-like fold. Its genomic context suggests that it may have a role in anaerobic vitamin B12 biosynthesis. This domain is often found at the C-terminus of proteins containing DUF4213, pfam13938. The structure of UniProtKB:B8FUJ5, Structure 3l5o, suggests that the protein has an enolase N-terminal-like fold and this Rossmann-like C-terminal domain. Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The protein may be playing a role in heavy-metal chelation. 147
47258 397912 pfam04017 DUF366 Domain of unknown function (DUF366). Archaeal domain of unknown function. 184
47259 397913 pfam04018 DUF368 Domain of unknown function (DUF368). Predicted transmembrane domain of unknown function. Family members have between 6 and 9 predicted transmembrane segments. 241
47260 397914 pfam04019 DUF359 Protein of unknown function (DUF359). This family of archaebacterial proteins are about 170 amino acids in length. They have no known function. The most conserved portion of the protein contains the sequence GEEDL that may be important for its function. 122
47261 397915 pfam04020 Phage_holin_4_2 Mycobacterial 4 TMS phage holin, superfamily IV. These proteins are predicted transmembrane proteins with probably four transmembrane spans. The 1.E.40 is represented by the mycobacterial 4 phage holin, but it also contains many cyanobacterial, proteobacterial and firmicute proteins. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as in those of the bacteriophage of these organisms. The primary function of holins appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded the enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases. 105
47262 397916 pfam04021 Class_IIIsignal Class III signal peptide. This family of archaeal proteins contains an amino terminal motif QXSXEXXXL that has been suggested to be part of a class III signal sequence, with the Q being the +1 residue of the signal peptidase cleavage site. Two members of this family are cleaved by a type IV pilin-like signal peptidase. 27
47263 281951 pfam04022 Staphylcoagulse Staphylocoagulase repeat. 27
47264 397917 pfam04023 FeoA FeoA domain. This family includes FeoA a small protein, probably involved in Fe2+ transport. This presumed short domain is also found at the C-terminus of a variety of metal dependent transcriptional regulators. This suggests that this domain may be metal-binding. In most cases this is likely to be either iron or manganese. 74
47265 397918 pfam04024 PspC PspC domain. This family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length. 57
47266 397919 pfam04025 DUF370 Domain of unknown function (DUF370). Bacterial domain of unknown function. 73
47267 397920 pfam04026 SpoVG SpoVG. Stage V sporulation protein G. Essential for sporulation and specific to stage V sporulation in Bacillus megaterium and subtilis. In B. subtilis, expression decreases after 30-60 minutes of cold shock. 82
47268 397921 pfam04027 DUF371 Domain of unknown function (DUF371). Archaeal domain of unknown function. 133
47269 397922 pfam04028 DUF374 Domain of unknown function (DUF374). Bacterial domain of unknown function. 69
47270 397923 pfam04029 2-ph_phosp 2-phosphosulpholactate phosphatase. Thought to catalyze 2-phosphosulpholactate = sulpholactate + phosphate. Probable magnesium cofactor. Involved in the second step of coenzyme M biosynthesis. Inhibited by vanadate in Methanococcus jannaschii. Also known as the ComB family. 220
47271 397924 pfam04030 ALO D-arabinono-1,4-lactone oxidase. This domain is specific to D-arabinono-1,4-lactone oxidase EC:1.1.3.-, which is involved in the final step of the D-erythroascorbic acid biosynthesis pathway. 259
47272 397925 pfam04031 Las1 Las1-like. Las1 is an essential nuclear protein involved in cell morphogenesis and cell surface growth. 147
47273 397926 pfam04032 Rpr2 RNAse P Rpr2/Rpp21/SNM1 subunit domain. This family contains a ribonuclease P subunit of humans and yeast. Other members of the family include the probable archaeal homologs. This family includes SNM1. It is a subunit of RNase MRP (mitochondrial RNA processing), a ribonucleoprotein endoribonuclease that has roles in both mitochondrial DNA replication and nuclear 5.8S rRNA processing. SNM1 is an RNA binding protein that binds the MRP RNA specifically. This subunit possibly binds the precursor tRNA. 88
47274 397927 pfam04033 DUF365 Domain of unknown function (DUF365). Archaeal domain of unknown function. 96
47275 397928 pfam04034 Ribo_biogen_C Ribosome biogenesis protein, C-terminal. This family represents the C-terminal domain of some putative ribosome biogenesis proteins in archaea. It has also been identified in the eukaryotic protein Tsr3, which is involved in ribosomal RNA biogenesis. 126
47276 397929 pfam04037 DUF382 Domain of unknown function (DUF382). This domain is specific to the human splicing factor 3b subunit 2 and its orthologues. Splicing factor 3b subunit 2 or SAP145 is a suppressor of U2 snRNA mutations. Pre-mRNA splicing is catalyzed by a large ribonucleoprotein complex called the spliceosome. Spliceosomes are multi-component enzymes that catalyze pre-mRNA splicing and form step-wise by the ordered interaction of UsnRNPs and non-snRNP proteins with short conserved regions of the pre-mRNA at the 5' and 3' splice sites and branch site. 127
47277 397930 pfam04038 DHNA Dihydroneopterin aldolase. 108
47278 397931 pfam04039 MnhB Domain related to MnhB subunit of Na+/H+ antiporter. Possible subunit of Na+/H+ antiporter. Predicted integral membrane protein, usually four transmembrane regions in this domain. Often found in bacterial NADH dehydrogenase subunit. 124
47279 397932 pfam04041 Glyco_hydro_130 beta-1,4-mannooligosaccharide phosphorylase. This is a family of glycosyl-hydrolases of the CAZy GH130 family. Several have been characterized as mannosylglucose phosphorylase. This enzyme is part of the mannan catalytic pathway and feeds into the glycolysis cycle. Specifically it catalyzes the reversible phosphorolysis of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine. This family was noted to belong to the Beta fructosidase superfamily in. 315
47280 397933 pfam04042 DNA_pol_E_B DNA polymerase alpha/epsilon subunit B. This family contains a number of DNA polymerase subunits. The B subunit of the DNA polymerase alpha plays an essential role at the initial stage of DNA replication in S. cerevisiae and is phosphorylated in a cell cycle-dependent manner. DNA polymerase epsilon is essential for cell viability and chromosomal DNA replication in budding yeast. In addition, DNA polymerase epsilon may be involved in DNA repair and cell-cycle checkpoint control. The enzyme consists of at least four subunits in mammalian cells as well as in yeast. The largest subunit of DNA polymerase epsilon is responsible for polymerase activity. In mouse, the DNA polymerase epsilon subunit B is the second largest subunit of the DNA polymerase. A part of the N-terminal was found to be responsible for the interaction with SAP18. Experimental evidence suggests that this subunit may recruit histone deacetylase to the replication fork to modify the chromatin structure. 209
47281 397934 pfam04043 PMEI Plant invertase/pectin methylesterase inhibitor. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical. 148
47282 397935 pfam04045 P34-Arc Arp2/3 complex, 34 kD subunit p34-Arc. Arp2/3 protein complex has been implicated in the control of actin polymerization in cells. The human complex consists of seven subunits which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. This family represents the p34-Arc subunit. 239
47283 397936 pfam04046 PSP PSP. Proline rich domain found in numerous spliceosome associated proteins. 45
47284 367782 pfam04048 Sec8_exocyst Sec8 exocyst complex component specific domain. 141
47285 397937 pfam04049 ANAPC8 Anaphase promoting complex subunit 8 / Cdc23. The anaphase-promoting complex is composed of eight protein subunits, including BimE (APC1), CDC27 (APC3), CDC16 (APC6), and CDC23 (APC8). 140
47286 397938 pfam04050 Upf2 Up-frameshift suppressor 2. Transcripts harbouring premature signals for translation termination are recognized and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay. 133
47287 397939 pfam04051 TRAPP Transport protein particle (TRAPP) component. TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localize TRAPP to the Golgi. 150
47288 397940 pfam04052 TolB_N TolB amino-terminal domain. TolB is an essential periplasmic component of the tol-dependent translocation system. The function of this amino terminal domain is uncertain. 105
47289 397941 pfam04053 Coatomer_WDAD Coatomer WD associated region. This region is composed of WD40 repeats. 439
47290 397942 pfam04054 Not1 CCR4-Not complex component, Not1. The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID. 365
47291 397943 pfam04055 Radical_SAM Radical SAM superfamily. Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. 159
47292 397944 pfam04056 Ssl1 Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. 175
47293 397945 pfam04057 Rep-A_N Replication factor-A protein 1, N-terminal domain. 99
47294 112856 pfam04059 RRM_2 RNA recognition motif 2. 97
47295 397946 pfam04060 FeS Putative Fe-S cluster. This family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster. 33
47296 397947 pfam04061 ORMDL ORMDL family. Evidence suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1). 135
47297 397948 pfam04062 P21-Arc ARP2/3 complex ARPC3 (21 kDa) subunit. The seven component ARP2/3 actin-organising complex is involved in actin assembly and function. 173
47298 397949 pfam04063 DUF383 Domain of unknown function (DUF383). 188
47299 397950 pfam04064 DUF384 Domain of unknown function (DUF384). 55
47300 397951 pfam04065 Not3 Not1 N-terminal domain, CCR4-Not complex component. 228
47301 397952 pfam04066 MrpF_PhaF Multiple resistance and pH regulation protein F (MrpF / PhaF). Members of the PhaF / MrpF family are predicted to be integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti. MrpF is part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate. The Mrp system in Bacilli may also have primary energisation capacities. 51
47302 397953 pfam04068 RLI Possible Fer4-like domain in RNase L inhibitor, RLI. Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L. 35
47303 397954 pfam04069 OpuAC Substrate binding domain of ABC-type glycine betaine transport system. Part of a high affinity multicomponent binding-protein-dependent transport system involved in bacterial osmoregulation. This domain is often fused to the permease component of the transporter complex. Family members are often integral membrane proteins or predicted to be attached to the membrane by a lipid anchor. Glycine betaine is involved in protection from high osmolarity environments for example in Bacillus subtilis. The family member OpuBC is closely related, and involved in choline transport. Choline is necessary for the biosynthesis of glycine betaine. L-carnitine is important for osmoregulation in Listeria monocytogenes. Family also contains proteins binding l-proline (ProX), histidine (HisX) and taurine (TauA). 257
47304 397955 pfam04070 DUF378 Domain of unknown function (DUF378). Predicted transmembrane domain of unknown function. The majority of the family have two predicted transmembrane regions. 60
47305 397956 pfam04071 zf-like Cysteine-rich small domain. Probable metal-binding domain. 80
47306 397957 pfam04072 LCM Leucine carboxyl methyltransferase. Family of leucine carboxyl methyltransferases EC:2.1.1.-. This family may need dividing, as the full alignment contains a significantly shorter mouse sequence. 188
47307 397958 pfam04073 tRNA_edit Aminoacyl-tRNA editing domain. This domain is found either on its own or in association with the tRNA synthetase class II core domain (pfam00587). It is involved in the tRNA editing of mis-charged tRNAs including Cys-tRNA(Pro), Cys-tRNA(Cys), Ala-tRNA(Pro). The structure of this domain shows a novel fold. 123
47308 397959 pfam04074 DUF386 Domain of unknown function (DUF386). This family consists of conserved hypothetical proteins, typically about 150 amino acids in length, with no known function. 148
47309 281995 pfam04075 F420H2_quin_red F420H(2)-dependent quinone reductase. This family of proteins is found in the genera Mycobacterium and Streptomyces. Member protein Rv3547 has been characterized as a deazaflavin-dependent nitroreductase. Rv1558 is an F420H(2)-dependent quinone reductase involved in oxidative stress protection. 129
47310 281996 pfam04076 BOF Bacterial OB fold (BOF) protein. Proteins in this family form an OB-fold. Analysis of the predicted binding site of BOF family proteins implies that they lack nucleic acid-binding properties. They contain a predicted N-terminal signal peptide which indicates that they localize in the periplasm where they may function to bind proteins, small molecules, or other typical OB-fold ligands. As hypothesized for the distantly related OB-fold containing bacterial enterotoxins, the loss of nucleotide-binding function and the rapid evolution of the BOF ligand-binding site may be associated with the presence of BOF proteins in mobile genetic elements and their potential role in bacterial pathogenicity. 103
47311 397960 pfam04077 DsrH DsrH like protein. DsrH is involved in oxidation of intracellular sulphur in the phototrophic sulphur bacterium Chromatium vinosum D. 87
47312 397961 pfam04078 Rcd1 Cell differentiation family, Rcd1-like. Two of the members in this family have been characterized as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation. 259
47313 397962 pfam04079 SMC_ScpB Segregation and condensation complex subunit ScpB. This is a family of prokaryotic proteins that form one of the subunits, ScpB, of the segregation and condensation complex, condensin, that plays a key role in the maintenance of the chromosome. In prokaryotes the complex consists of three proteins, SMC, ScpA (kleisin) and ScpB. ScpB dimerizes and binds to ScpA. As originally predicted, ScpB is structurally a winged-helix at both its N- and C-terminal halves. In Bacillus subtilis, one Smc dimer is bridged by a single ScpAB to generate asymmetric tripartite rings analogous to eukaryotic SMC complex ring-shaped assemblies. 160
47314 397963 pfam04080 Per1 Per1-like family. PER1 is required for GPI-phospholipase A2 activity and is involved in lipid remodelling of GPI-anchored proteins. PER1 is part of the CREST superfamily. 254
47315 367800 pfam04081 DNA_pol_delta_4 DNA polymerase delta, subunit 4. 131
47316 397964 pfam04082 Fungal_trans Fungal specific transcription factor domain. 262
47317 397965 pfam04083 Abhydro_lipase Partial alpha/beta-hydrolase lipase region. This family corresponds to a N-terminal part of an alpha/beta hydrolase domain. 62
47318 397966 pfam04084 ORC2 Origin recognition complex subunit 2. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 323
47319 397967 pfam04085 MreC rod shape-determining protein MreC. MreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. 112
47320 397968 pfam04086 SRP-alpha_N Signal recognition particle, alpha subunit, N-terminal. SRP is a complex of six distinct polypeptides and a 7S RNA that is essential for transferring nascent polypeptide chains that are destined for export from the cell to the translocation apparatus of the endoplasmic reticulum (ER) membrane. SRP binds hydrophobic signal sequences as they emerge from the ribosome, and arrests translation. 287
47321 397969 pfam04087 DUF389 Domain of unknown function (DUF389). Family of hypothetical bacterial proteins with an undetermined function. 137
47322 397970 pfam04088 Peroxin-13_N Peroxin 13, N-terminal region. Both termini of the Peroxin-13 are oriented to the cytosol. Peroxin-13 is required for peroxisomal association of peroxin-14. 141
47323 397971 pfam04089 BRICHOS BRICHOS domain. The BRICHOS domain is about 100 amino acids long. It is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for it include (a) in targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialized intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999, provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region. 88
47324 367807 pfam04090 RNA_pol_I_TF RNA polymerase I specific initiation factor. 199
47325 397972 pfam04091 Sec15 Exocyst complex subunit Sec15-like. 308
47326 397973 pfam04092 SAG SRS domain. Toxoplasma gondii is a persistent protozoan parasite capable of infecting almost any warm-blooded vertebrate. The surface of Toxoplasma is coated with a family of developmentally regulated glycosylphosphatidylinositol (GPI)-linked proteins (SRSs), of which SAG1 is the prototypic member. SRS proteins mediate attachment to host cells and interface with the host immune response to regulate the virulence of the parasite. SAG1 is composed of two disulphide linked SRS domains. These have 6 cysteines that form 1-6,2-5 and 3-4 pairings. The structure of the immunodominant SAG1 antigen reveals a homodimeric configuration. The SRS domain is found in a single copy in the SAG2 proteins. This family of surface antigens are found in other apicomplexans. 132
47327 282013 pfam04093 MreD rod shape-determining protein MreD. MreD (murein formation D) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. 160
47328 397974 pfam04095 NAPRTase Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase (EC:2.4.2.11) is the rate limiting enzyme that catalyzes the first reaction in the NAD salvage synthesis. This family also includes Pre-B cell enhancing factor that is a cytokine. This family is related to Quinolinate phosphoribosyltransferase pfam01729. 235
47329 397975 pfam04096 Nucleoporin2 Nucleoporin autopeptidase. 143
47330 397976 pfam04097 Nic96 Nup93/Nic96. Nup93/Nic96 is a component of the nuclear pore complex. It is required for the correct assembly of the nuclear pore complex. In Saccharomyces cerevisiae, Nic96 has been shown to be involved in the distribution and cellular concentration of the GTPase Gsp1. The structure of Nic96 has revealed a mostly alpha helical structure. 611
47331 367812 pfam04098 Rad52_Rad22 Rad52/22 family double-strand break repair protein. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs. 140
47332 282019 pfam04099 Sybindin Sybindin-like family. Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses. 134
47333 282020 pfam04100 Vps53_N Vps53-like, N-terminal. Vps53 complexes with Vps52 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events. 374
47334 397977 pfam04101 Glyco_tran_28_C Glycosyltransferase family 28 C-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site. 166
47335 397978 pfam04102 SlyX SlyX. The SlyX protein has no known function. It is short (less than 80 amino acids) and is found close to the slyD gene. The SlyX protein has a conserved PPH(Y/W) motif at its C-terminus. The protein may be a coiled-coil structure. 66
47336 397979 pfam04103 CD20 CD20-like family. This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton. 156
47337 397980 pfam04104 DNA_primase_lrg Eukaryotic and archaeal DNA primase, large subunit. DNA primase is the polymerase that synthesizes small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast S. cerevisiae). The large subunit of DNA primase forms interactions with the small subunit and the structure implicates that it is not directly involved in catalysis, but plays roles in correctly positioning the primase/DNA complex, and in the transfer of RNA to DNA polymerase. 222
47338 397981 pfam04106 APG5 Autophagy protein Apg5. Apg5 is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway. 201
47339 397982 pfam04107 GCS2 Glutamate-cysteine ligase family 2(GCS2). Also known as gamma-glutamylcysteine synthetase and gamma-ECS (EC:6.3.2.2). This enzyme catalyzes the first and rate limiting step in de novo glutathione biosynthesis. Members of this family are found in archaea, bacteria and plants. May and Leaver discuss the possible evolutionary origins of glutamate-cysteine ligase enzymes in different organisms and suggest that it evolved independently in different eukaryotes, from an ancestral bacterial enzyme. They also state that Arabidopsis thaliana gamma-glutamylcysteine synthetase is structurally unrelated to mammalian, yeast and Escherichia coli homologs. In plants, there are separate cytosolic and chloroplast forms of the enzyme. 289
47340 397983 pfam04108 APG17 Autophagy protein Apg17. Apg17 is required for activating Apg1 protein kinases. 387
47341 397984 pfam04109 APG9 Autophagy protein Apg9. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialized compartment essential for these vesicle-mediated alternative targeting pathways. 478
47342 397985 pfam04110 APG12 Ubiquitin-like autophagy protein Apg12. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. The Apg12 system is one of the ubiquitin-like protein conjugation systems conserved in eukaryotes. It was first discovered in yeast during systematic analyses of the apg mutants defective in autophagy. Covalent attachment of Apg12-Apg5 is essential for autophagy. 87
47343 397986 pfam04111 APG6 Autophagy protein Apg6. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway. 176
47344 397987 pfam04112 Mak10 Mak10 subunit, NatC N(alpha)-terminal acetyltransferase. NatC N(alpha)-terminal acetyltransferases contains Mak10p, Mak31p and Mak3p subunits. All three subunits are associated with each other to form the active complex. 164
47345 397988 pfam04113 Gpi16 Gpi16 subunit, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesized proteins. Gpi16 is an essential N-glycosylated transmembrane glycoprotein. Gpi16 is largely found on the lumenal side of the ER. It has a single C-terminal transmembrane domain and a small C-terminal, cytosolic extension with an ER retrieval motif. 526
47346 397989 pfam04114 Gaa1 Gaa1-like, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesized proteins. 496
47347 397990 pfam04115 Ureidogly_lyase Ureidoglycolate lyase. Ureidoglycolate lyase (EC:4.3.2.3) is one of the enzymes that acts upon ureidoglycolate, an intermediate of purine catabolism, releasing urea. The enzyme has in the past been wrongly assigned to EC:3.5.3.19, enzymes which release ammonia from ureidoglycolate. 162
47348 397991 pfam04116 FA_hydroxylase Fatty acid hydroxylase superfamily. This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins. 134
47349 397992 pfam04117 Mpv17_PMP22 Mpv17 / PMP22 family. The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesized on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance. 62
47350 397993 pfam04118 Dopey_N Dopey, N-terminal. DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologs are found in mammals. S. cerevisiae DOP1 is essential for viability and affects cellular morphogenesis. 297
47351 397994 pfam04119 HSP9_HSP12 Heat shock protein 9/12. These heat shock proteins (Hsp9 and Hsp12) are strongly expressed, an increase of 100 fold, upon entry into stationary phase in yeast. 59
47352 397995 pfam04120 Iron_permease Low affinity iron permease. 125
47353 397996 pfam04121 Nup84_Nup100 Nuclear pore protein 84 / 107. Nup84p forms a complex with five proteins, including Nup120p, Nup85p, Sec13p, and a Sec13p homolog. This Nup84p complex in conjunction with Sec13-type proteins is required for correct nuclear pore biogenesis. 699
47354 397997 pfam04122 CW_binding_2 Putative cell wall binding repeat 2. This repeat is found in multiple tandem copies in proteins including amidase enhancers and adhesins. 80
47355 397998 pfam04123 DUF373 Domain of unknown function (DUF373). Archaeal domain of unknown function. Predicted to be an integral membrane protein with six transmembrane regions. 336
47356 252394 pfam04124 Dor1 Dor1-like family. Dor1 is involved in vesicle targeting to the yeast Golgi apparatus and complexes with a number of other trafficking proteins, which include Sec34 and Sec35. 339
47357 309308 pfam04126 Cyclophil_like Cyclophilin-like. This domain has a cyclophilin-like fold, consisting of an eight-stranded beta-barrel with an alpha helix located between the beta-2 and beta-3 strands and a 310 helix located between the beta-7 and beta-8 strands. The catalytic site found in human cyclophilin is not conserved in this domain, suggesting a different function for this domain. 119
47358 397999 pfam04127 DFP DNA / pantothenate metabolism flavoprotein. The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism. 183
47359 282044 pfam04129 Vps52 Vps52 / Sac2 family. Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events. 508
47360 398000 pfam04130 Spc97_Spc98 Spc97 / Spc98 family. The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Members of this family all form components of the gamma-tubulin complex, GCP. 296
47361 398001 pfam04131 NanE Putative N-acetylmannosamine-6-phosphate epimerase. This family represents a putative ManNAc-6-P-to-GlcNAc-6P epimerase in the N-acetylmannosamine (ManNAc) utilisation pathway found mainly in pathogenic bacteria. 192
47362 398002 pfam04133 Vps55 Vacuolar protein sorting 55. Vps55 is involved in the secretion of the Golgi form of the soluble vacuolar carboxypeptidase Y, but not the trafficking of the membrane-bound vacuolar alkaline phosphatase. Both Vps55 and obesity receptor gene-related protein are important for functioning membrane trafficking to the vacuole/lysosome of eukaryotic cells. 118
47363 398003 pfam04134 DUF393 Protein of unknown function, DUF393. Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown. 113
47364 398004 pfam04135 Nop10p Nucleolar RNA-binding protein, Nop10p family. Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs. 51
47365 398005 pfam04136 Sec34 Sec34-like family. Sec34 and Sec35 form a sub-complex, in a seven protein complex that includes Dor1 (pfam04124). This complex is thought to be important for tethering vesicles to the Golgi. 146
47366 398006 pfam04137 ERO1 Endoplasmic Reticulum Oxidoreductin 1 (ERO1). Members of this family are required for the formation of disulphide bonds in the ER. 347
47367 398007 pfam04138 GtrA GtrA-like protein. Members of this family are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family are a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyzes the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b). This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose. 117
47368 398008 pfam04139 Rad9 Rad9. Rad9 is required for transient cell-cycle arrests and transcriptional induction of DNA repair in response to DNA damage. It contains a Bcl-2 homology domain 3 (BH3). 250
47369 367837 pfam04140 ICMT Isoprenylcysteine carboxyl methyltransferase (ICMT) family. The isoprenylcysteine o-methyltransferase (EC:2.1.1.100) family carry out carboxyl methylation of cleaved eukaryotic proteins that terminate in a CaaX motif. In Saccharomyces cerevisiae this methylation is carried out by Ste14p, an integral endoplasmic reticulum membrane protein. Ste14p is the founding member of the isoprenylcysteine carboxyl methyltransferase (ICMT) family, whose members share significant sequence homology. 94
47370 398009 pfam04142 Nuc_sug_transp Nucleotide-sugar transporter. This family of membrane proteins transports nucleotide sugars from the cytoplasm into Golgi vesicles. SLC35A1 transports CMP-sialic acid, SLC35A2 transports UDP-galactose and SLC35A3 transports UDP-GlcNAc. 315
47371 398010 pfam04143 Sulf_transp Sulphur transport. This is an integral membrane protein. It is predicted to have a function in the transport of sulphur-containing molecules. It contains several conserved glycines and an invariant cysteine that is probably an important functional residue. 310
47372 398011 pfam04144 SCAMP SCAMP family. In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences. 172
47373 398012 pfam04145 Ctr Ctr copper transporter family. The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport. 151
47374 398013 pfam04146 YTH YT521-B-like domain. A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotes YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily. 135
47375 398014 pfam04147 Nop14 Nop14-like family. Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production. 844
47376 398015 pfam04148 Erv26 Transmembrane adaptor Erv26. Erv26 is an integral membrane protein that is packed into COPII vesicles and cycles between the ER and Golgi compartments. It directs pro-alkaline phosphatase into endoplasmic reticulum-derived COPII transport vesicles. 202
47377 398016 pfam04149 DUF397 Domain of unknown function (DUF397). The function of this family is unknown. 50
47378 398017 pfam04151 PPC Bacterial pre-peptidase C-terminal domain. This domain is normally found at the C-terminus of secreted bacterial peptidases. They are not present in the active peptidase. It is possible that they fulfill a similar role to the PKD (pfam00801) domain, which also are found in this context. Visual analysis suggests that PKD and PPC are distantly related (personal obs:Bateman A, Yeats C). 68
47379 398018 pfam04152 Mre11_DNA_bind Mre11 DNA-binding presumed domain. The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1. 173
47380 398019 pfam04153 NOT2_3_5 NOT2 / NOT3 / NOT5 family. NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5. 124
47381 398020 pfam04155 Ground-like Ground-like domain. This family consists of the ground-like domain and is specific to C.elegans. It has been proposed that the ground-like domain containing proteins may bind and modulate the activity of Patched-like membrane molecules, reminiscent of the modulating activities of neuropeptides. 73
47382 398021 pfam04157 EAP30 EAP30/Vps36 family. This family includes EAP30 as well as the Vps36 protein. Vps36 is involved in Golgi to endosome trafficking. EAP30 is a subunit of the ELL complex. The ELL is an 80-kDa RNA polymerase II transcription factor. ELL interacts with three other proteins to form the complex known as ELL complex. The ELL complex is capable of increasing the catalytic rate of transcription elongation, but is unable to repress initiation of transcription by RNA polymerase II as is the case of ELL. EAP30 is thought to lead to the derepression of ELL's transcriptional inhibitory activity. 210
47383 398022 pfam04158 Sof1 Sof1-like domain. Sof1 is essential for cell growth and is a component of the nucleolar rRNA processing machinery. 87
47384 282069 pfam04159 NB NB glycoprotein. The NB glycoprotein is found in Influenza type B virus. Its function is unknown. 100
47385 282070 pfam04160 Borrelia_orfX Orf-X protein. This short protein has no known function and is found in Jaagsiekte sheep retrovirus. Jaagsiekte sheep retrovirus (JSRV) is the etiological agent of a contagious lung tumor of sheep known as sheep pulmonary adenomatosis. JSRV exhibits a simple genetic organisation, characteristic of the type D and type B retroviruses, with the canonical retroviral sequences gag, pro, pol and env encoding the structural proteins of the virion. An additional open reading frame (orf-x) of approximately 500 bp overlaps pol. 154
47386 398023 pfam04161 Arv1 Arv1-like family. Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis. 205
47387 282072 pfam04162 Gyro_capsid Gyrovirus capsid protein (VP1). Gyroviruses are small circular single stranded viruses. This family includes the VP1 protein from the chicken anaemia virus which is the viral capsid protein. 449
47388 282073 pfam04163 Tht1 Tht1-like nuclear fusion protein. 595
47389 282074 pfam04165 DUF401 Protein of unknown function (DUF401). Members of this family are predicted to have 10 transmembrane regions. 393
47390 398024 pfam04166 PdxA Pyridoxal phosphate biosynthetic protein PdxA. In Escherichia coli the coenzyme pyridoxal 5'-phosphate is synthesized de novo by a pathway that is thought to involve the condensation of 4-(phosphohydroxy)-L-threonine and 1-deoxy-D-xylulose, catalyzed by the enzymes PdxA and PdxJ, to form either pyridoxine (vitamin B6) or pyridoxine 5'-phosphate. 251
47391 398025 pfam04167 DUF402 Protein of unknown function (DUF402). Family member FomD is a predicted protein from a fosfomycin biosynthesis gene cluster in Streptomyces wedmorensis. Its function is unknown. 68
47392 398026 pfam04168 Alpha-E A predicted alpha-helical domain with a conserved ER motif. An uncharacterized alpha helical domain containing a highly conserved ER motif and typically found as a tandem duplication. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system comprising of a transglutaminase, a peptidase of the NTN-hydrolase superfamily, an active and inactive circularly permuted ATP-grasp domains and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase domain. 303
47393 398027 pfam04170 NlpE NlpE N-terminal domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes. 78
47394 398028 pfam04172 LrgB LrgB-like family. The two products of the lrgAB operon are potential membrane proteins, and LrgA and LrgB are both thought to control murein hydrolase activity and penicillin tolerance. 206
47395 282080 pfam04173 DoxD TQO small subunit DoxD. DoxD is a subunit of the terminal quinol oxidase present in the plasma membrane of Acidianus ambivalens, with calculated molecular mass of 20.4 kDa. Thiosulphate:quinone oxidoreductase (TQO) is one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in BT_0515. 167
47396 398029 pfam04174 CP_ATPgrasp_1 A circularly permuted ATPgrasp. An ATP-grasp family that is present both as catalytically active and inactive versions. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system that additionally contains a transglutaminase, an NTN-hydrolase, the Alpha-E domain, and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase. The inactive forms are often fused N-terminal to the Alpha-E domain. 332
47397 398030 pfam04175 DUF406 Protein of unknown function (DUF406). Members of this family appear to be found only in gamma proteobacteria. The function of this protein family is undetermined. Solution of the structures of the two members of this family investigated bear some resemblance to that of the single domain enzyme pterin-4a-carbinolamine dehydratase, PCD. Although the residues of PCDs involved in binding of metabolite are not conserved in the two structures under study, they do correspond to a surface-region structurally aligned with residues that are highly conserved, eg Glu 89, suggesting that this region is also involved in binding of a ligand, thereby possibly constituting a catalytic site of a yet uncharacterized enzyme specific for gamma proteobacteria. 91
47398 398031 pfam04176 TIP41 TIP41-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway. 168
47399 398032 pfam04177 TAP42 TAP42-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 (pfam04176) interacts with TAP42 and negatively regulates the TOR signaling pathway. 310
47400 398033 pfam04178 Got1 Got1/Sft2-like family. Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments. 114
47401 398034 pfam04179 Init_tRNA_PT Rit1 DUSP-like domain. This enzyme (EC:2.4.2.-) modifies exclusively the initiator tRNA in position 64 using 5'-phosphoribosyl-1'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. This C-terminal domain shows similarity to dual specificity phosphatases. 110
47402 398035 pfam04180 LTV Low temperature viability protein. The low-temperature viability protein LTV1 is involved in ribosome biogenesis 40S subunit production. 409
47403 398036 pfam04181 RPAP2_Rtr1 Rtr1/RPAP2 family. This family includes the human RPAP2 (RNAP II associated polypeptide) protein and the yeast Rtr1 protein. It has been suggested that this family of proteins are regulators of core RNA polymerase II function. 75
47404 367858 pfam04182 B-block_TFIIIC B-block binding subunit of TFIIIC. Yeast transcription factor IIIC (TFIIIC) is a multi-subunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding. 75
47405 398037 pfam04183 IucA_IucC IucA / IucC family. IucA and IucC catalyze discrete steps in biosynthesis of the siderophore aerobactin from N epsilon-acetyl-N epsilon-hydroxylysine and citrate. This family represents the N-terminal region. The C-terminal region appears to be related to iron transporter proteins. 210
47406 398038 pfam04184 ST7 ST7 protein. The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumor suppressor gene. The molecular function of this protein is uncertain. 528
47407 309350 pfam04185 Phosphoesterase Phosphoesterase family. This family includes both bacterial phospholipase C enzymes (EC:3.1.4.3) and eukaryotic acid phosphatases (EC:3.1.3.2). 348
47408 398039 pfam04186 FxsA FxsA cytoplasmic membrane protein. This is a bacterial family of cytoplasmic membrane proteins. It includes two transmembrane regions. The molecular function of FxsA is unknown, but in Escherichia coli its over-expression has been shown to alleviate the exclusion of phage T7 in those cells with an F plasmid. 109
47409 398040 pfam04187 Cofac_haem_bdg Haem-binding uptake, Tiki superfamily, ChaN. This is a family of putative bacterial lipoproteins necessary for the uptake of haem-iron. The structure of UniProtKB:Q0PBW2, Structure 2g5g, comprises a large parallel beta-sheet with flanking alpha-helices and a smaller domain consisting of alpha-helices. Two cofacial haem groups (~3.5 Angstrom apart with an inter-iron distance of 4.4 Angstrom) bind in a pocket formed by a dimer of two ChaN monomers. 209
47410 282094 pfam04188 Mannosyl_trans2 Mannosyltransferase (PIG-V). This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI. 439
47411 398041 pfam04189 Gcd10p Gcd10p family. eIF-3 is a multi-subunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3. The yeast protein Gcd10p has also been shown to be part of a complex with the methyltransferase Gcd14p that is involved in modifying tRNA. 292
47412 398042 pfam04190 DUF410 Protein of unknown function (DUF410). This family of proteins is from Caenorhabditis elegans and has no known function. The protein has some GO references indicating that the protein has a positive regulation of growth rate and is involved in nematode larval development. 251
47413 398043 pfam04191 PEMT Phospholipid methyltransferase. The S. cerevisiae phospholipid methyltransferase (EC:2.1.1.16) has a broad substrate specificity of unsaturated phospholipids. 106
47414 398044 pfam04192 Utp21 Utp21 specific WD40 associated putative domain. Utp21 is a subunit of U3 snoRNP, which is essential for synthesis of 18S rRNA. 230
47415 398045 pfam04193 PQ-loop PQ loop repeat. Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localization of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue. Family members are membrane transporters since two members, cystinosin and PQLC2, transport cystine and cationic amino acids, respectively, across the lysosomal membrane. The 2nd PQ-loop of cystinosin hosts the substrate-coupled H+ binding site underlying its H+ symport mechanism, suggesting that PQ-loop repeats have functional significance. It is thus likely that PQ-loop-containing proteins act as a family of membrane transporters. Some transport cystine and cationic amino acids, respectively, across the lysosomal membrane. Others transport lysine and or arginine across the lysosomal membrane in order to maintain the acidic homoeostasis. 61
47416 398046 pfam04194 PDCD2_C Programmed cell death protein 2, C-terminal putative domain. 126
47417 398047 pfam04195 Transposase_28 Putative gypsy type transposon. This family of plant genes are thought to be related to gypsy type transposons. 70
47418 282102 pfam04196 Bunya_RdRp Bunyavirus RNA dependent RNA polymerase. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The L segment codes for an RNA polymerase. This family contains the RNA dependent RNA polymerase on the L segment. 739
47419 398048 pfam04197 Birna_RdRp Birnavirus RNA dependent RNA polymerase (VP1). Birnaviruses are dsRNA viruses. This family corresponds to the RNA dependent RNA polymerase. This protein is also known as VP1. All of the birnavirus VP1 proteins contain conserved RdRp motifs that reside in the catalytic "palm" domain of all classes of polymerases. However, the birnavirus RdRps lack the highly conserved Gly-Asp-Asp (GDD) sequence, a component of the proposed catalytic site of this enzyme family that exists in the conserved motif VI of the palm domain of other RdRps. 802
47420 398049 pfam04198 Sugar-bind Putative sugar-binding domain. This probable domain is found in bacterial transcriptional regulators such as DeoR and SorC. These proteins have an amino-terminal helix-turn-helix pfam00325 that binds to DNA. This domain is probably the ligand regulator binding region. SorC is regulated by sorbose and other members of this family are likely to be regulated by other sugar substrates. 256
47421 398050 pfam04199 Cyclase Putative cyclase. Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. 158
47422 398051 pfam04200 Lipoprotein_17 Lipoprotein associated domain. This presumed domain is about 100 amino acids in length. It is found in lipoprotein of unknown function and is greatly expanded in Mycoplasma pulmonis. The domain is found in up to five copies in some proteins. This family also includes the Mycoplasma arthritidis MAA2 variable surface protein. MAA2 is implicated in cytoadherence and virulence and has been shown to exhibit both size and phase variability. 84
47423 398052 pfam04201 TPD52 tumor protein D52 family. The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologs from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologs of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other. 148
47424 112992 pfam04202 Mfp-3 Foot protein 3. Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels. 71
47425 398053 pfam04203 Sortase Sortase family. The founder member of this family is S.aureus sortase, a transpeptidase that attaches surface proteins by the threonine of an LPXTG motif to the cell wall. 123
47426 398054 pfam04204 HTS Homoserine O-succinyltransferase. 298
47427 398055 pfam04205 FMN_bind FMN-binding domain. This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins. 72
47428 398056 pfam04206 MtrE Tetrahydromethanopterin S-methyltransferase, subunit E. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 271
47429 398057 pfam04207 MtrD Tetrahydromethanopterin S-methyltransferase, subunit D. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 217
47430 398058 pfam04208 MtrA Tetrahydromethanopterin S-methyltransferase, subunit A. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 165
47431 398059 pfam04209 HgmA homogentisate 1,2-dioxygenase. Homogentisate dioxygenase cleaves the aromatic ring during the metabolic degradation of Phe and Tyr. Homogentisate dioxygenase deficiency causes alkaptonuria. The structure of homogentisate dioxygenase shows that the enzyme forms a hexamer arrangement comprised of a dimer of trimers. The active site iron ion is coordinated near the interface between the trimers. 424
47432 367872 pfam04210 MtrG Tetrahydromethanopterin S-methyltransferase, subunit G. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 64
47433 398060 pfam04211 MtrC Tetrahydromethanopterin S-methyltransferase, subunit C. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 266
47434 398061 pfam04212 MIT MIT (microtubule interacting and transport) domain. The MIT domain forms an asymmetric three-helix bundle and binds ESCRT-III (endosomal sorting complexes required for transport) substrates. 66
47435 398062 pfam04213 HtaA Htaa. This domain is found in HtaA, a secreted protein implicated in iron acquisition and transport. 159
47436 398063 pfam04214 DUF411 Protein of unknown function, DUF. The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance. 68
47437 398064 pfam04216 FdhE Protein involved in formate dehydrogenase formation. The function of these proteins is unknown. They may possibly be involved in the formation of formate dehydrogenase. 286
47438 398065 pfam04217 DUF412 Protein of unknown function, DUF412. This family consists of bacterial proteins, including yfbV from E. coli. YfbV is a membrane protein involved in insulating the chromosome from the TerR macrodomain. 133
47439 398066 pfam04218 CENP-B_N CENP-B N-terminal DNA-binding domain. Centromere Protein B (CENP-B) is a DNA-binding protein localized to the centromere. Within the N-terminal 125 residues, there is a DNA-binding region, which binds to a corresponding 17bp CENP-B box sequence. CENP-B dimers either bind two separate DNA molecules or alternatively, they may bind two CENP-B boxes on one DNA molecule, with the intervening stretch of DNA forming a loop structure. The CENP-B DNA-binding domain consists of two repeating domains, RP1 and RP2. This family corresponds to RP1, which has been shown to consist of four helices in a helix-turn-helix structure. 53
47440 398067 pfam04219 DUF413 Protein of unknown function, DUF. 89
47441 398068 pfam04220 YihI Der GTPase activator (YihI). YihI activates the GTPase activity of Der, a 50S ribosomal subunit stability factor. The stimulation is specific to Der as YihI does not stimulate the GTPase activity of Era or ObgE. The interaction of YihI with Der requires only the C-terminal 78 amino acids of YihI. A yihI deletion mutant is viable and shows a shorter lag period, but the same post-lag growth rate as a wild-type strain. yihI is expressed during the lag period. Overexpression of yihI inhibits cell growth and biogenesis of the 50S ribosomal subunit. YihI is an unusual, highly hydrophilic protein with an uneven distribution of charged residues, resulting in an N-terminal region with high pI and a C-terminal region with low pI. 156
47442 398069 pfam04221 RelB RelB antitoxin. RelE and RelB form a toxin-antitoxin system. RelE represses translation, probably through binding ribosomes. RelB stably binds RelE, presumably deactivating it. 82
47443 398070 pfam04222 DUF416 Protein of unknown function (DUF416). This is a bacterial protein family of unknown function. Proteins in this family adopt an alpha helical structure. Genome context analysis has suggested a high probability of a functional association with histidine kinases, which implicates proteins in this family to play a role in signalling (information from TOPSAN 2Q9R). 183
47444 398071 pfam04223 CitF Citrate lyase, alpha subunit (CitF). In citrate-utilising prokaryotes, citrate lyase EC:4.1.3.6 cleaves intracellular citrate into acetate and oxaloacetate, and is organized as a functional complex consisting of alpha, beta, and gamma subunits. The gamma subunit serves as an acyl carrier protein (ACP), and has a 2'-(5''-phosphoribosyl)-3'-dephospho-CoA prosthetic group. The citrate lyase is active only if this prosthetic group is acetylated; this acetylation is catalyzed by an acetate:SH-citrate lyase ligase. The alpha subunit substitutes citryl for the acetyl group to form citryl-S-ACP. The beta subunit completes the reaction by cleaving the citryl to yield oxaloacetate and (regenerated) acetyl-S-ACP. This family represents the alpha subunit EC:2.8.3.10. 466
47445 282127 pfam04224 DUF417 Protein of unknown function, DUF417. This family of uncharacterized proteins appears to be restricted to proteobacteria. 175
47446 282128 pfam04225 OapA Opacity-associated protein A LysM-like domain. This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation. This is a LysM-like domain. 84
47447 398072 pfam04226 Transgly_assoc Transglycosylase associated protein. Bacterial protein, predicted to be an integral membrane protein. Some family members have been annotated as transglycosylase associated proteins, but no experimental evidence is provided. This family was annotated based on the information found for Escherichia coli YmgE. 49
47448 398073 pfam04227 Indigoidine_A Indigoidine synthase A like protein. Indigoidine is a blue pigment synthesized by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. The recommended name for this protein is now pseudouridine-5'-phosphate glycosidase. 288
47449 398074 pfam04228 Zn_peptidase Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases (Prosite:PDOC00129). 291
47450 398075 pfam04229 GrpB GrpB protein. This family has been suggested to belong to the nucleotidyltransferase superfamily. It occurs at the C-terminus of dephospho-CoA kinase (CoaE) in a number of cases, where it plays a role in the proper folding of the enzyme. 160
47451 398076 pfam04230 PS_pyruv_trans Polysaccharide pyruvyl transferase. Pyruvyl-transferases involved in peptidoglycan-associated polymer biosynthesis. CsaB in Bacillus anthracis is necessary for the non-covalent anchoring of proteins containing an SLH (S-layer homology) domain to peptidoglycan-associated pyruvylated polysaccharides. WcaK and AmsJ are involved in the biosynthesis of colanic acid in Escherichia coli and of amylovoran in Erwinia amylovora. 233
47452 398077 pfam04231 Endonuclease_1 Endonuclease I. Bacterial periplasmic or secreted endonuclease I (EC:3.1.21.1). E. coli endonuclease I (EndoI) is a sequence independent endonuclease located in the periplasm. It is inhibited by different RNA species. It is thought to normally generate double strand breaks in DNA, except in the presence of high salt concentrations and RNA, when it generates single strand breaks in DNA. Its biological role is unknown. Other family members are known to be extracellular. This family also includes a non-specific, Mg2+ activated ribonuclease precursor. 235
47453 398078 pfam04232 SpoVS Stage V sporulation protein S (SpoVS). In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly. 81
47454 309383 pfam04233 Phage_Mu_F Phage Mu protein F like protein. Members of this family are found in double-stranded DNA bacteriophages, and in some bacteria. A member of this family is required for viral head morphogenesis in bacteriophage SPP1. This family is possibly a minor head protein. This family may be related to the family TT_ORF1 (pfam02956). 110
47455 398079 pfam04234 CopC CopC domain. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm. 93
47456 398080 pfam04235 DUF418 Protein of unknown function (DUF418). Probable integral membrane protein. 163
47457 398081 pfam04236 Transp_Tc5_C Tc5 transposase C-terminal domain. This family corresponds to a C-terminal cysteine rich region that probably binds to a metal ion and could be DNA binding (pers. obs. A Bateman). 63
47458 398082 pfam04237 YjbR YjbR. YjbR has a CyaY-like fold. 91
47459 398083 pfam04238 DUF420 Protein of unknown function (DUF420). Predicted membrane protein with four transmembrane helices. 130
47460 398084 pfam04239 DUF421 Protein of unknown function (DUF421). YDFR family 119
47461 398085 pfam04240 Caroten_synth Carotenoid biosynthesis protein. The representative member of this family is CruF, a C50 carotenoid 2',3'-hydratase involved in the synthesis of the C50 carotenoid bacterioruberin in the halophilic archaeon Haloarcula japonica. 209
47462 398086 pfam04241 DUF423 Protein of unknown function (DUF423). This family of proteins with unknown function is a possible integral membrane protein from Caenorhabditis elegans. This family of proteins has GO references indicating the protein is involved in nematode larval development and is a positive regulator of growth rate. 86
47463 377260 pfam04242 DUF424 Protein of unknown function (DUF424). This is a family of uncharacterized proteins. 92
47464 398087 pfam04244 DPRP Deoxyribodipyrimidine photo-lyase-related protein. This family appears to be related to pfam00875. 221
47465 398088 pfam04245 NA37 37-kD nucleoid-associated bacterial protein. 311
47466 398089 pfam04246 RseC_MucC Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions). In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, which is itself involved in thiamine biosynthesis. 129
47467 398090 pfam04247 SirB Invasion gene expression up-regulator, SirB. SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown. 120
47468 398091 pfam04248 NTP_transf_9 Domain of unknown function (DUF427). This domain contains a beta-tent fold. 93
47469 398092 pfam04250 DUF429 Protein of unknown function (DUF429). 213
47470 398093 pfam04252 RNA_Me_trans Predicted SAM-dependent RNA methyltransferase. This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases. 200
47471 398094 pfam04253 TFR_dimer Transferrin receptor-like dimerization domain. This domain is involved in dimerization of the transferrin receptor as shown in its crystal structure. 118
47472 398095 pfam04254 DUF432 Protein of unknown function (DUF432). Archaeal protein of unknown function. 120
47473 398096 pfam04255 DUF433 Protein of unknown function (DUF433). 54
47474 398097 pfam04256 DUF434 Protein of unknown function (DUF434). 55
47475 398098 pfam04257 Exonuc_V_gamma Exodeoxyribonuclease V, gamma subunit. The Exodeoxyribonuclease V enzyme is a multi-subunit enzyme comprised of the proteins RecB, RecC (this family) and RecD. This enzyme plays an important role in homologous genetic recombination, repair of double strand DNA breaks, and resistance to UV irradiation and chemical DNA-damage. The enzyme (EC:3.1.11.5) catalyzes ssDNA or dsDNA-dependent ATP hydrolysis, hydrolysis of ssDNA or dsDNA and unwinding of dsDNA. This family consists of two AAA domains. 762
47476 282158 pfam04258 Peptidase_A22B Signal peptide peptidase. The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080. 286
47477 367886 pfam04259 SASP_gamma Small, acid-soluble spore protein, gamma-type. The SASP family is a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. 84
47478 398099 pfam04260 DUF436 Protein of unknown function (DUF436). Family of bacterial proteins with undetermined function. 170
47479 398100 pfam04261 Dyp_perox Dyp-type peroxidase family. This family of dye-decolourising peroxidases lacks a typical heme-binding region. 315
47480 398101 pfam04262 Glu_cys_ligase Glutamate-cysteine ligase. Family of bacterial glutamate-cysteine ligases (EC:6.3.2.2) that carry out the first step of the glutathione biosynthesis pathway. 371
47481 398102 pfam04263 TPK_catalytic Thiamin pyrophosphokinase, catalytic domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 112
47482 398103 pfam04264 YceI YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to poly-isoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold. 101
47483 398104 pfam04265 TPK_B1_binding Thiamin pyrophosphokinase, vitamin B1 binding domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 66
47484 398105 pfam04266 ASCH ASCH domain. The ASCH domain adopts a beta-barrel fold similar to the pfam01472 domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation. 102
47485 398106 pfam04267 SoxD Sarcosine oxidase, delta subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2. 80
47486 398107 pfam04268 SoxG Sarcosine oxidase, gamma subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2. 152
47487 398108 pfam04269 DUF440 Protein of unknown function, DUF440. This family consists of uncharacterized bacterial proteins. 101
47488 398109 pfam04270 Strep_his_triad Streptococcal histidine triad protein. All members of this family are proteins from Streptococcal species. The proteins are characterized by having a HxxHxH motif that usually occurs multiple times throughout the protein. The histidine triad is predicted to bind metal cations, in particular Zn2+. The zinc is transferred, on the surface of the streptococcus from the Strep_his_triad protein, a zinc scavenger, to apo-ADCAII, a cell-surface lipoprotein transporter that leads to Zn2+ uptake into the bacterium. 51
47489 367891 pfam04272 Phospholamban Phospholamban. The regulation of calcium levels across the membrane of the sarcoplasmic reticulum involves the interplay of many membrane proteins. Phospholamban is a 52 residue integral membrane protein that is involved in reversibly inhibiting the Ca(2+) pump and regulating the flow of Ca ions across the sarcoplasmic reticulum membrane during muscle contraction and relaxation. Phospholamban is thought to form a pentamer in the membrane. 52
47490 398110 pfam04273 DUF442 Putative phosphatase (DUF442). Although this domain is uncharacterized it seems likely that it performs a phosphatase function. 110
47491 398111 pfam04275 P-mevalo_kinase Phosphomevalonate kinase. Phosphomevalonate kinase (EC:2.7.4.2) catalyzes the phosphorylation of 5-phosphomevalonate into 5-diphosphomevalonate, an essential step in isoprenoid biosynthesis via the mevalonate pathway. This family represents the animal type of the enzyme. The other is the ERG8 type, found in plants and fungi, and some bacteria (see pfam00288). 111
47492 398112 pfam04276 DUF443 Protein of unknown function (DUF443). Family of uncharacterized proteins. 202
47493 398113 pfam04277 OAD_gamma Oxaloacetate decarboxylase, gamma chain. 76
47494 398114 pfam04278 Tic22 Tic22-like family. The preprotein translocation at the inner envelope membrane of chloroplasts so far involves five proteins: Tic110, Tic55, Tic40, Tic22 (this family) and Tic20. The molecular function of these proteins has not yet been established. 244
47495 398115 pfam04279 IspA Intracellular septation protein A. 176
47496 398116 pfam04280 Tim44 Tim44-like domain. Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. This family includes the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members of this family is unknown but transport seems likely. The crystal structure of the C terminal of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane. 144
47497 398117 pfam04281 Tom22 Mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded pre-proteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins. The N terminal region of Tom22 has been shown to have chaperone-like activity, and the C terminal region faces the intermembrane face. 137
47498 398118 pfam04282 DUF438 Family of unknown function (DUF438). 67
47499 398119 pfam04283 CheF-arch Chemotaxis signal transduction system protein F from archaea. This is a family of proteins that are archaea-specific components of the bacterial-like chemotaxis signal transduction system of archaea. In H. salinarum, the CheF proteins interact with the chemotaxis proteins CheY, CheD and CheC2 as well as the flagella-accessory proteins FlaCE and FlaD, and are essential for any tactic response. CheF probably functions at the interface between the bacterial-like chemotaxis signal transduction system and the archaeal flagellar apparatus. 217
47500 398120 pfam04284 DUF441 Protein of unknown function (DUF441). Predicted to be an integral membrane protein. 139
47501 398121 pfam04285 DUF444 Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). 409
47502 398122 pfam04286 DUF445 Protein of unknown function (DUF445). Predicted to be a membrane protein. 368
47503 398123 pfam04287 DUF446 tRNA pseudouridine synthase C. This family is suggested to be the catalytic domain of tRNA pseudouridine synthase C by association. The structure has been solved for one member, as Structure 2HGK, which by inference is designated in this way. 97
47504 398124 pfam04288 MukE MukE-like family. Bacterial protein involved in chromosome partitioning, MukE 229
47505 398125 pfam04289 DUF447 Protein of unknown function (DUF447). Archaeal protein of unknown function. A fungal member UniProtKB:M2LN89 is clearly a Flavine-reductase enzyme by homology, and UniProtKB:O28442 has been shown to bind riboflavin 5'-phosphate (unpublished structural Xray analysis). 174
47506 377284 pfam04290 DctQ Tripartite ATP-independent periplasmic transporters, DctQ component. The function of the members of this family is unknown, but DctQ homologs are invariably found in the tripartite ATP-independent periplasmic transporters. 131
47507 398126 pfam04293 SpoVR SpoVR like protein. Bacillus subtilis stage V sporulation protein R is involved in spore cortex formation. Little is known about cortex biosynthesis, except that it depends on several sigma E controlled genes, including spoVR. 419
47508 398127 pfam04294 VanW VanW like protein. Family members include vancomycin resistance protein W (VanW). Genes encoding members of this family have been found in vancomycin resistance gene clusters vanB and vanG. The function of VanW is unknown. 130
47509 398128 pfam04295 GD_AH_C D-galactarate dehydratase / Altronate hydrolase, C-terminus. Family members include the C termini of D-galactarate dehydratase (EC:4.2.1.42) which is thought to catalyze the reaction D-galactarate = 5-keto-4-deoxy-D-glucarate + H2O, and altronate hydrolase (altronic acid hydratase, EC:4.2.1.7), which catalyzes D-altronate = 2-keto-2-deoxygluconate + H2O. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core. 393
47510 398129 pfam04296 DUF448 Protein of unknown function (DUF448). 74
47511 282193 pfam04297 UPF0122 Putative helix-turn-helix protein, YlxM / p13 like. Members of this family are predicted to contain a helix-turn-helix motif, for example residues 37-55 in Mycoplasma mycoides p13. Genes encoding family members are often part of operons that encode components of the SRP pathway, and this protein may regulate the expression of an operon related to the SRP pathway. 98
47512 398130 pfam04298 Zn_peptidase_2 Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142), of the characteristic HEXXH motif. 216
47513 398131 pfam04299 FMN_bind_2 Putative FMN-binding domain. In Bacillus subtilis, family member PAI 2/ORF-2 was found to be essential for growth. The SUPERFAMILY database finds that this domain is related to FMN-binding domains, suggesting this protein is also FMN-binding. 167
47514 398132 pfam04300 FBA F-box associated region. Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. FBXO2 is involved in binding to N-glycosylated proteins. 177
47515 113085 pfam04301 DUF452 Protein of unknown function (DUF452). 213
47516 398133 pfam04303 PrpF PrpF protein. PrpF is a protein found in the 2-methylcitrate pathway. It is structurally similar to DAP epimerase and proline racemase. This protein is likely to act to isomerise trans-aconitate to cis-aconitate. 384
47517 398134 pfam04304 DUF454 Protein of unknown function (DUF454). Predicted membrane protein. 115
47518 398135 pfam04305 DUF455 Protein of unknown function (DUF455). 243
47519 398136 pfam04306 DUF456 Protein of unknown function (DUF456). This family is a putative membrane protein that contains glycine zipper motifs. 135
47520 398137 pfam04307 YdjM LexA-binding, inner membrane-associated putative hydrolase. YdjM is a family of putative LexA-binding proteins. Members are predicted to be membrane-bound metal-dependent hydrolases that may be acting as phospholipases. It is a member of the SOS network, that rescues cells from UV and other DNA-damage. Expression of YdjM is regulated by LexA. 173
47521 398138 pfam04308 RNaseH_like Ribonuclease H-like. RNaseH_like is a family of uncharacterized eubacterial proteins that are distant homologs of Ribonuclease H-like. The family maintains all the core secondary structure elements of the RNase H-like fold and shares several conserved, presumably active site residues with RNase HI. This finding suggests that it functions as a nuclease. 145
47522 398139 pfam04309 G3P_antiterm Glycerol-3-phosphate responsive antiterminator. Intracellular glycerol is usually converted to glycerol-3-phosphate in an ATP-requiring phosphorylation reaction catalyzed by glycerol kinase (GlpK). Glycerol-3-phosphate activates the antiterminator GlpP. 173
47523 398140 pfam04310 MukB MukB N-terminal. This family represents the N-terminal region of MukB, one of a group of bacterial proteins essential for the movement of nucleoids from mid-cell towards the cell quarters (i.e. chromosome partitioning). The structure of the N-terminal domain consists of an antiparallel six-stranded beta sheet surrounded by one helix on one side and by five helices on the other side. It contains an exposed Walker A loop in an unexpected helix-loop-helix motif (in other proteins, Walker A motifs generally adopt a P loop conformation as part of a strand-loop-helix motif embedded in a conserved topology of alternating helices and (parallel) beta strands). 226
47524 398141 pfam04311 DUF459 Protein of unknown function (DUF459). Putative periplasmic protein. 322
47525 398142 pfam04312 DUF460 Protein of unknown function (DUF460). Archaeal protein of unknown function. 135
47526 398143 pfam04313 HSDR_N Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA. 139
47527 398144 pfam04314 PCuAC Copper chaperone PCu(A)C. PCu(A)C is a periplasmic copper chaperone. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase. 107
47528 398145 pfam04315 EpmC Elongation factor P hydroxylase. This family catalyzes the final step in the elongation factor P modification pathway. It hydroxylates Lys-34 of elongation factor P. Members of this family have a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. 162
47529 398146 pfam04316 FlgM Anti-sigma-28 factor, FlgM. FlgM binds and inhibits the activity of the transcription factor sigma 28. Inhibition of sigma 28 prevents the expression of genes from flagellar transcriptional class 3, which include genes for the filament and chemotaxis. Correctly assembled basal body-hook structures export FlgM, relieving inhibition of sigma 28 and allowing expression of class 3 genes. NMR studies show that free FlgM is mostly unfolded, which may facilitate its export. The C terminal half of FlgM adopts a tertiary structure when it binds to sigma 28. All mutations in FlgM that prevent sigma 28 inhibition affect the C-terminal domain, which is the region thought to constitute the binding domain. A minimal binding domain has been identified between Glu 64 and Arg 88 in Salmonella typhimurium. The N-terminal portion remains unstructured and may be necessary for recognition by the export machinery. 54
47530 398147 pfam04317 DUF463 YcjX-like family, DUF463. These proteins possess a P-loop motif. 443
47531 367903 pfam04318 DUF468 Protein of unknown function (DUF468). These conserved ORFs are probably not translated into protein [Personal communication, Val Wood]. 84
47532 398148 pfam04319 NifZ NifZ domain. This short protein is found in the nif (nitrogen fixation) operon. Its function is unknown, but it is probably involved in nitrogen fixation or regulating some component of this process. This 75 residue region is presumed to be a domain. It is found in isolation in some members and in the amino terminal half of the longer NifZ proteins. 70
47533 398149 pfam04320 DUF469 Protein with unknown function (DUF469). Family of bacterial proteins with no known function. 102
47534 398150 pfam04321 RmlD_sub_bind RmlD substrate binding domain. L-rhamnose is a saccharide required for the virulence of some bacteria. Its precursor, dTDP-L-rhamnose, is synthesized by four different enzymes, the final one of which is RmlD. The RmlD substrate binding domain is responsible for binding a sugar nucleotide. 284
47535 398151 pfam04322 DUF473 Protein of unknown function (DUF473). Family of uncharacterized Archaeal proteins. 118
47536 398152 pfam04324 Fer2_BFD BFD-like [2Fe-2S] binding domain. The two Fe ions are each coordinated by two conserved cysteine residues. This domain occurs alone in small proteins such as Bacterioferritin-associated ferredoxin (BFD). The function of BFD is not known, but it may be a general redox and/or regulatory component involved in the iron storage or mobilisation functions of bacterioferritin in bacteria. This domain is also found in nitrate reductase proteins in association with Nitrite and sulphite reductase 4Fe-4S domain (pfam01077), Nitrite/Sulfite reductase ferredoxin-like half domain (pfam03460) and Pyridine nucleotide-disulphide oxidoreductase (pfam00070). It is also found in NifU nitrogen fixation proteins, in association with NifU-like N terminal domain (pfam01592) and NifU-like domain (pfam01106). 50
47537 398153 pfam04325 DUF465 Protein of unknown function (DUF465). Family members are found in small bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, C terminal of the motor domain (Myosin pfam00063, Kinesin pfam00225). Members of this family may form coiled coil structures. 48
47538 398154 pfam04326 AlbA_2 Putative DNA-binding domain. This family belongs to the AlbA clan of DNA-binding domains. 116
47539 398155 pfam04327 Peptidase_Prp Cysteine protease Prp. This is a family of cysteine proteases that are found to cleave the N-terminal extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease. 102
47540 398156 pfam04328 Sel_put Selenoprotein, putative. This entry includes a group of putative selenoproteins from Proteobacteria, Actinobacteria and Firmicutes. The invariant cysteine at the C-terminus is encoded by a TGA Sec codon in some Epsilonproteobacteria, suggesting a redox activity for the protein. 61
47541 367905 pfam04332 DUF475 Protein of unknown function (DUF475). Predicted to be an integral membrane protein with multiple membrane spans. 295
47542 398157 pfam04333 MlaA MlaA lipoprotein. MlaA is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. MlaA is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor. 194
47543 113117 pfam04334 DUF478 Protein of unknown function (DUF478). This family contains uncharacterized proteins encoded on Trypanosoma kinetoplast minicircles. 68
47544 398158 pfam04335 VirB8 VirB8 protein. VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that it is a primary constituent of a DNA transporter. The periplasmic region interacts with VirB9, VirB10, and itself. This family also includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain has a similar fold to the NTF2 protein. 212
47545 398159 pfam04336 ACP_PD Acyl carrier protein phosphodiesterase. YajB, now renamed acpH, encodes an ACP hydrolase that converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine prosthetic group from ACP. 105
47546 398160 pfam04337 DUF480 Protein of unknown function, DUF480. This family consists of several proteins of uncharacterized function. 149
47547 398161 pfam04338 DUF481 Protein of unknown function, DUF481. This family includes several proteins of uncharacterized function. 211
47548 398162 pfam04339 FemAB_like Peptidoglycan biosynthesis/recognition. FemAB_like is a family of both bacterial and Viridiplantae proteins with responsibility for building interpeptide bridges in peptidoglycan. Such a function is feasible for bacteria but less likely for the plant members of this family. Perhaps the plant members are using homologous proteins to recognize bacterial peptidoglycans as part of their innate immune system. 369
47549 282228 pfam04340 DUF484 Protein of unknown function, DUF484. This family consists of several proteins of uncharacterized function. 219
47550 398163 pfam04341 DUF485 Protein of unknown function, DUF485. This family includes several putative integral membrane proteins. 89
47551 398164 pfam04342 DMT_6 Putative member of DMT superfamily (DUF486). This family contains several proteins of uncharacterized function. The family is represented in the Transport classification database as 2.A.7.34, though the exact nature of what is transported is not known. 104
47552 377317 pfam04343 DUF488 Protein of unknown function, DUF488. This family includes several proteins of uncharacterized function. 123
47553 398165 pfam04344 CheZ Chemotaxis phosphatase, CheZ. This family represents the bacterial chemotaxis phosphatase, CheZ. This protein forms a dimer characterized by a long four-helix bundle, composed of two helices from each monomer. CheZ dephosphorylates CheY in a reaction that is essential to maintain a continuous chemotactic response to environmental changes. It is thought that CheZ's conserved residue Gln 147 orientates a water molecule for nucleophilic attack at the CheY active site. 204
47554 113128 pfam04345 Chor_lyase Chorismate lyase. Chorismate lyase catalyzes the first step in ubiquinone synthesis, i.e. the removal of pyruvate from chorismate, to yield 4-hydroxybenzoate. 168
47555 398166 pfam04346 EutH Ethanolamine utilisation protein, EutH. EutH is a bacterial membrane protein whose molecular function is unknown. It has been suggested that it may act as an ethanolamine transporter, responsible for carrying ethanolamine from the periplasm to the cytoplasm. 351
47556 398167 pfam04347 FliO Flagellar biosynthesis protein, FliO. FliO is an essential component of the flagellum-specific protein export apparatus. It is an integral membrane protein. Its precise molecular function is unknown. 88
47557 367906 pfam04348 LppC LppC putative lipoprotein. This family includes several bacterial outer membrane antigens, whose molecular function is unknown. 532
47558 398168 pfam04349 MdoG Periplasmic glucan biosynthesis protein, MdoG. This family represents MdoG, a protein that is necessary for the synthesis of periplasmic glucans. The function of MdoG remains unknown. It has been suggested that it may catalyze the addition of branches to a linear glucan backbone. 474
47559 309474 pfam04350 PilO Pilus assembly protein, PilO. PilO proteins are involved in the assembly of pilin. However, the precise function of this family of proteins is not known. 145
47560 398169 pfam04351 PilP Pilus assembly protein, PilP. The PilP family are periplasmic proteins involved in the biogenesis of type IV pili. 148
47561 398170 pfam04352 ProQ ProQ/FINO family. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. This family includes several bacterial fertility inhibition (FINO) proteins. The conjugative transfer of F-like plasmids is repressed by FinO, an RNA binding protein. FinO interacts with the F-plasmid encoded traJ mRNA and its antisense RNA, FinP, stabilizing FinP against endonucleolytic degradation and facilitating sense-antisense RNA recognition. ProQ operates as an RNA-chaperone, binding RNA and bringing about both RNA strand-exchange and RNA duplexing. This suggests that in fact it does not regulate ProP transcription but rather regulates ProP translation through activity as an RNA-binding protein. 106
47562 398171 pfam04353 Rsd_AlgQ Regulator of RNA polymerase sigma(70) subunit, Rsd/AlgQ. This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homolog, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in Pseudomonas aeruginosa isolates from cystic fibrosis patients. 149
47563 398172 pfam04354 ZipA_C ZipA, C-terminal FtsZ-binding domain. This family represents the ZipA C-terminal domain. ZipA is involved in septum formation in bacterial cell division. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring. The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ. 127
47564 398173 pfam04355 SmpA_OmlA SmpA / OmlA family. Bacterial outer membrane lipoprotein, possibly involved in maintaining the structural integrity of the cell envelope. Lipid attachment site is a conserved N terminal cysteine residue. Sometimes found adjacent to the OmpA domain (pfam00691). 69
47565 398174 pfam04356 DUF489 Protein of unknown function (DUF489). Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis. 192
47566 398175 pfam04357 TamB TamB, inner membrane protein subunit of TAM complex. TamB is an integral inner membrane protein that forms a complex - the translocation and assembly module or TAM - with the outer membrane protein, TamA. TAM is responsible for the efficient secretion of the adhesin protein Ag43 in E.coli K-12. 383
47567 398176 pfam04358 DsrC DsrC like protein. Family member DsvC has been observed to co-purify with Desulfovibrio vulgaris dissimilatory sulfite reductase, and many members of this family are annotated as the third (gamma) subunit of dissimilatory sulphite reductase. However, this protein appears to be only loosely associated to the sulfite reductase, which suggests that DsrC may not be an integral part of the dissimilatory sulphite reductase. Members of this family are found in organisms such as E. coli and H. influenzae which do not contain dissimilatory sulphite reductases but can synthesize assimilatory sirohaem sulphite and nitrite reductases. It is speculated that DsrC may be involved in the assembly, folding or stabilisation of sirohaem proteins. The strictly conserved cysteine in the C-terminus suggests that DsrC may have a catalytic function in the metabolism of sulphur compounds. 103
47568 398177 pfam04359 DUF493 Protein of unknown function (DUF493). This domain is likely to act in a regulatory capacity like pfam01842 domains. This domain has a remarkable property in that the C-terminal residue of every protein in the family lines up in the alignment. This suggests that the C-terminal residue plays some important functional role (Bateman A pers obs). 83
47569 398178 pfam04360 Serglycin Serglycin. Serglycin is the most prevalent proteoglycan produced in haemopoietic cells. Serglycin is a proteinase resistant secretory granule proteoglycan. 148
47570 398179 pfam04361 DUF494 Protein of unknown function (DUF494). Members of this family of uncharacterized proteins are often named Smg. 153
47571 398180 pfam04362 Iron_traffic Bacterial Fe(2+) trafficking. This is a family of bacterial Fe(2+) trafficking proteins. 86
47572 398181 pfam04363 DUF496 Protein of unknown function (DUF496). 93
47573 398182 pfam04364 DNA_pol3_chi DNA polymerase III chi subunit, HolC. The DNA polymerase III holoenzyme (EC:2.7.7.7) is the polymerase responsible for the replication of the Escherichia coli chromosome. The holoenzyme is composed of the DNA polymerase III core, the sliding clamp, and the DnaX clamp loading complex. The DnaX complex contains either the tau or gamma product of gene dnax, complexed to delta.delta' and to chi psi. Chi forms a 1:1 heterodimer with psi. The chi psi complex functions by increasing the affinity of tau and gamma for delta.delta' allowing a functional clamp-loading complex to form at physiological subunit concentrations. Psi is responsible for the interaction with DnaX (gamma/tau), but psi is insoluble unless it is in a complex with chi. 135
47574 398183 pfam04365 BrnT_toxin Ribonuclease toxin, BrnT, of type II toxin-antitoxin system. BrnT is a ribonuclease toxin of a type II toxin-antitoxin system that exhibits a RelE-like fold. The antitoxin that neutralizes this toxin is pfam14384. BrnT is found in bacteria, archaea, bacteriophage, and plasmids. BrnT-BrnA forms a 2:2 tetrameric complex and autoregulates its own expression, which is induced by a number of different environmental stresses. Expression of BrnT alone results in cessation of bacterial growth which can be rescued after subsequent expression of BrnA. 77
47575 398184 pfam04366 Ysc84 Las17-binding protein actin regulator. Ysc84 is a family of Las17-binding proteins found in metazoa. Together, Las17 and Ysc84 are essential for proper polymerization of actin; Ysc84 is able to bind to and stabilize the actin dimer presented by Las17 and thereby promote polymerization. An active actin cytoskeleton is necessary for adequate endocytosis. (pfam00018), or a FYVE zinc finger (pfam01363). 127
47576 398185 pfam04367 DUF502 Protein of unknown function (DUF502). Predicted to be an integral membrane protein. 106
47577 398186 pfam04368 DUF507 Protein of unknown function (DUF507). Bacterial protein of unknown function. 182
47578 113152 pfam04369 Lactococcin Lactococcin-like family. Family of bacteriocins from lactic acid bacteria. 60
47579 367915 pfam04370 DUF508 Domain of unknown function (DUF508). This is a family of uncharacterized proteins from C. elegans. 142
47580 398187 pfam04371 PAD_porph Porphyromonas-type peptidyl-arginine deiminase. Peptidyl-arginine deiminase (PAD) enzymes catalyze the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (pfam03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologs). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyze the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have an FMN cofactor. 324
47581 398188 pfam04375 HemX HemX, putative uroporphyrinogen-III C-methyltransferase. This is a family of bacterial putative uroporphyrinogen-III C-methyltransferase proteins. It forms one of the members of a complex of proteins involved in the biogenesis of the inner membrane in E. coli. Uroporphyrinogen-III C-methyltransferase (HemX) is a single spanning inner membrane protein that regulates the activity of NAD(P)H:glutamyl-tRNA reductase (HemA) in the tetrapyrrole biosynthesis pathway. 236
47582 398189 pfam04376 ATE_N Arginine-tRNA-protein transferase, N-terminus. This family represents the N terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N-terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilizing amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. In S. cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity. Of these, only Cys 94 appears to be completely conserved in this family. 71
47583 398190 pfam04377 ATE_C Arginine-tRNA-protein transferase, C-terminus. This family represents the C terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N-terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilizing amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. 122
47584 398191 pfam04378 RsmJ Ribosomal RNA large subunit methyltransferase D, RlmJ. RlmJ, ribosomal RNA large subunit methyltransferase J, is required for full methylation of 23S ribosomal RNA (rRNA) during ribosome biogenesis. The ribosomal RNA of E. coli carries 24 residues that require methylation, and this methyltransferase, which modifies A2030, is the last to be described. RlmJ displays a variant of the Rossmann-like methyltransferase (MTase) fold with an inserted helical subdomain. On binding cofactor and substrate a large shift of the N-terminal motif X tail is induced in order to make it cover the cofactor-binding site and to trigger active-site changes in motifs IV and VIII. 245
47585 398192 pfam04379 DUF525 ApaG domain. Members of this family include the bacterial protein ApaG and the C termini of some F-box proteins (pfam00646). F-box proteins contain a carboxyl-terminal domain that interacts with protein substrates, so this family may be involved in protein-protein interaction. The function of ApaG proteins is unknown, but mutations in the Salmonella typhimurium ApaG homolog corD give a phenotype of low-level cobalt resistance and decreased magnesium efflux by effects on the CorA magnesium transport system. 87
47586 398193 pfam04380 BMFP Membrane fusogenic activity. BMFP consists of two structural domains, a coiled-coil C-terminal domain via which the protein self-associates as a trimer, and an N-terminal domain disordered at neutral pH but adopting an amphipathic alpha-helical structure in the presence of phospholipid vesicles, high ionic strength, acidic pH or SDS. BMFP interacts with phospholipid vesicles through the predicted amphipathic alpha-helix induced in the N-terminal half of the protein and promotes aggregation and fusion of vesicles in vitro. 70
47587 398194 pfam04381 RdgC Putative exonuclease, RdgC. Members of the RdgC family may have exonuclease activity. RdgC is required for efficient pilin variation in Neisseria gonorrhoeae, suggesting that it may be involved in recombination reactions. In Escherichia coli, RdgC is required for growth in recombination-deficient exonuclease-depleted strains. Under these conditions, RdgC may act as an exonuclease to remove collapsed replication forks, in the absence of the normal repair mechanisms. 295
47588 398195 pfam04382 SAB SAB domain. This presumed domain is found in proteins containing FERM domains pfam00373. This domain is found to bind to both spectrin and actin, hence the name SAB (Spectrin and Actin Binding) domain. 49
47589 367917 pfam04383 KilA-N KilA-N domain. The amino-terminal module of the D6R/N1R proteins defines a novel, conserved DNA-binding domain (the KilA-N domain) that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses. The KilA-N domain family also includes the previously defined APSES domain. The KilA-N and APSES domains may also share a common fold with the nucleic acid-binding modules of the LAGLIDADG nucleases and the amino-terminal domains of the tRNA endonuclease. 107
47590 398196 pfam04384 Fe-S_assembly Iron-sulphur cluster assembly. This family of proteins is likely to be involved in the assembly of iron-sulphur clusters. It may function as an adaptor protein. In Escherichia coli IscX forms part of the isc operon, which encodes genes involved in iron-sulphur cluster assembly. Its structure is entirely alpha helical, and it contains a modified wing-helix structure, usually found in DNA-binding proteins. It binds to Fe2+ and Fe3+ ions and to the cysteine desulfurase IscS; the same surface of the protein is involved in both binding to iron and to IscS. 64
47591 252557 pfam04385 FAINT Domain of unknown function, DUF529. This family represents a repeated region found in several Theileria parva proteins. The repeat is normally about 70 residues long and contains a conserved aromatic residue in the middle. 78
47592 398197 pfam04386 SspB Stringent starvation protein B. Escherichia coli stringent starvation protein B (SspB) is thought to enhance the specificity of degradation of tmRNA-tagged proteins by the ClpXP protease. The tmRNA tag, also known as ssrA, is an 11-aa peptide added to the C-terminus of proteins stalled during translation, and targets proteins for degradation by ClpXP and ClpAP. SspB is a cytoplasmic protein that specifically binds to residues 1-4 and 7 of the tag. Binding of SspB enhances degradation of tagged proteins by ClpX, and masks sequence elements important for ClpA interactions, inhibiting degradation by ClpA. However, more recent work has cast doubt on the importance of SspB in wild-type cells. SspB is encoded in an operon whose synthesis is stimulated by carbon, amino acid, and phosphate starvation. SspB may play a special role during nutrient stress, for example by ensuring rapid degradation of the products of stalled translation, without causing a global increase in degradation of all ClpXP substrates. 144
47593 398198 pfam04387 PTPLA Protein tyrosine phosphatase-like protein, PTPLA. This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types. 162
47594 398199 pfam04388 Hamartin Hamartin protein. This family includes the hamartin protein which is thought to function as a tumor suppressor. The hamartin protein interacts with the tuberin protein pfam03542. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or the TSC2 tumor suppressor gene. TSC1 encodes a protein, hamartin, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. The TSC2 gene codes for tuberin pfam03542. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking. 730
47595 398200 pfam04389 Peptidase_M28 Peptidase family M28. 192
47596 398201 pfam04390 LptE Lipopolysaccharide-assembly. LptE (formerly known as RlpB) is involved in lipopolysaccharide-assembly on the outer membrane of Gram-negative organisms. The lipopolysaccharide component of the outer bacterial membrane is transported from its source of origin to the outer membrane by a set of proteins constituting a transport machinery that is made up of LptA, LptB, LptC, LptD, LptE. LptD appears to be anchored in the outer membrane, and LptE forms a complex with it. This part of the machinery complex is involved in the assembly of lipopolysaccharide in the outer leaflet of the outer membrane. 49
47597 398202 pfam04391 DUF533 Protein of unknown function (DUF533). Some family members may be secreted or integral membrane proteins. 177
47598 398203 pfam04392 ABC_sub_bind ABC transporter substrate binding protein. This family contains many hypothetical proteins and some ABC transporter substrate binding proteins. 293
47599 398204 pfam04393 DUF535 Protein of unknown function (DUF535). Family member Shigella flexneri VirK is a virulence protein required for the expression, or correct membrane localization, of IcsA (VirG) on the bacterial cell surface. This family also includes Pasteurella haemolytica lapB, which is thought to be membrane-associated. 279
47600 282276 pfam04394 DUF536 Protein of unknown function, DUF536. This family aligns the C-terminal region from several bacterial proteins of unknown function that may be involved in a theta-type replication mechanism. 43
47601 282277 pfam04395 Poxvirus_B22R Poxvirus B22R protein. This is a highly conserved, C-rich central region of poxvirus proteins from, e.g., Fowlpox virus, Myxoma virus, Lumpy skin disease virus, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses. There are three pairs of conserved cysteine residues. 204
47602 398205 pfam04397 LytTR LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. 98
47603 398206 pfam04398 DUF538 Protein of unknown function, DUF538. This family consists of several plant proteins of unknown function. 109
47604 398207 pfam04399 Glutaredoxin2_C Glutaredoxin 2, C terminal domain. Glutaredoxins are a multifunctional family of glutathione-dependent disulphide oxidoreductases. Unlike other glutaredoxins, glutaredoxin 2 (Grx2) cannot reduce ribonucleotide reductase. Grx2 has significantly higher catalytic activity in the reduction of mixed disulphides with glutathione (GSH) compared with other glutaredoxins. The active site residues (Cys9-Pro10-Tyr11-Cys12, in Escherichia coli Grx2), which are found at the interface between the N- and C-terminal domains are identical to other glutaredoxins, but there is no other similarity between glutaredoxin 2 and other glutaredoxins. Grx2 is structurally similar to glutathione-S-transferases (GST), but there is no obvious sequence similarity. The inter-domain contacts are mainly hydrophobic, suggesting that the two domains are unlikely to be stable on their own. Both domains are needed for correct folding and activity of Grx2. It is thought that the primary function of Grx2 is to catalyze reversible glutathionylation of proteins with GSH in cellular redox regulation including the response to oxidative stress. 130
47605 398208 pfam04400 NqrM (Na+)-NQR maturation NqrM. The NqrM gene is often found adjacent to the nqr operons that encode (Na+)-NQR subunits. It is involved in the maturation of (Na+) translocating NADH:quinone oxidoreductase in proteobacteria. The four conserved Cys residues found in NqrM are required for (Na+)- NQR maturation and may serve as ligands for a metal ion or metal cluster used to build up the (Na+)-NQR molecule. 42
47606 398209 pfam04402 SIMPL Protein of unknown function (DUF541). Members of this family have so far been found in bacteria and mouse SwissProt or TrEMBL entries. However possible family members have also been identified in translated rat (Genbank:AW144450) and human (Genbank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerization. It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane. 156
47607 398210 pfam04403 PqiA Paraquat-inducible protein A. Paraquat is a superoxide radical-generating agent. The promoter for the pqiA gene is also inducible by other known superoxide generators. This is predicted to be a family of integral membrane proteins, possibly located in the inner membrane. This family is related to NADH dehydrogenase subunit 2 (pfam00361). 155
47608 398211 pfam04404 ERF ERF superfamily. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to ERF. 151
47609 398212 pfam04405 ScdA_N Domain of Unknown function (DUF542). This domain is always found in conjunction with the HHE domain (pfam03794) at the N-terminus. 55
47610 398213 pfam04406 TP6A_N Type IIB DNA topoisomerase. Type II DNA topoisomerases are ubiquitous enzymes that catalyze the ATP-dependent transport of one DNA duplex through a second DNA segment via a transient double-strand break. Type II DNA topoisomerases are now subdivided into two sub-families, type IIA and IIB DNA topoisomerases. TP6A_N is present in type IIB topoisomerase and is thought to be involved in DNA binding owing to its sequence similarity to E. coli catabolite activator protein (CAP). 62
47611 367925 pfam04407 DUF531 Protein of unknown function (DUF531). Family of hypothetical archaeal proteins. 170
47612 398214 pfam04408 HA2 Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding. 104
47613 398215 pfam04409 DUF530 Protein of unknown function (DUF530). Family of hypothetical archaeal proteins. 521
47614 398216 pfam04410 Gar1 Gar1/Naf1 RNA binding region. Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo. Naf1 is an essential protein that plays a role in ribosome biogenesis, modification of spliceosomal small nuclear RNAs and telomere synthesis, and is homologous to Gar1. 153
47615 398217 pfam04411 PDDEXK_7 PD-(D/E)XK nuclease superfamily. This domain has been identified as a member of the PD-(D/E)XK nuclease superfamily through transitive meta profile searches. The domain has two additional beta-strands inserted into the core fold after the first core alpha-helix. The PD-(D/E)XK signature is clearly conserved, corresponding to invariant PD (motif II) and DAK (motif III) motifs. There is also a conserved glutamic acid in motif I that is most likely to be involved in metal ion binding. The second core alpha-helix contains an invariant MHXYRD motif. It has been speculated that it could function as a methylation-dependent restriction enzyme. 161
47616 398218 pfam04412 DUF521 Protein of unknown function (DUF521). Family of hypothetical proteins. 392
47617 398219 pfam04413 Glycos_transf_N 3-Deoxy-D-manno-octulosonic-acid transferase (kdotransferase). Members of this family transfer activated sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. Members of the family transfer UDP, ADP, GDP or CMP linked sugars. The Glycos_transf_N region is flanked at the N-terminus by a signal peptide and at the C-terminus by Glycos_transf_1 (pfam00534). The eukaryotic glycogen synthases may be distant members of this bacterial family. 176
47618 398220 pfam04414 tRNA_deacylase D-aminoacyl-tRNA deacylase. Several aminoacyl-tRNA synthetases have the ability to transfer the D-isomer of their amino acid onto their cognate tRNA. D-aminoacyl-tRNA deacylases hydrolyze the ester bond between the polynucleotide and the D-amino acid, thereby preventing the accumulation of such mis-acylated and metabolically inactive tRNA molecules. 203
47619 282295 pfam04415 DUF515 Protein of unknown function (DUF515). Family of hypothetical Archaeal proteins. 449
47620 398221 pfam04417 DUF501 Protein of unknown function (DUF501). Family of uncharacterized bacterial proteins. 137
47621 398222 pfam04418 DUF543 Domain of unknown function (DUF543). This family of short eukaryotic proteins has no known function. Most of the members of this family are only 80 amino acid residues long. However the Arabidopsis homolog is over 300 residues long. The presumed domain contains a conserved amino terminal cysteine and a conserved motif GXGXGXG in the carboxy terminal half that may be functionally important. 75
47622 398223 pfam04419 4F5 4F5 protein family. Members of this family are short proteins that are rich in aspartate, glutamate, lysine and arginine. Although the function of these proteins is unknown, they are found to be ubiquitously expressed. 38
47623 398224 pfam04420 CHD5 CHD5-like protein. Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. In Saccharomyces cerevisiae this protein localizes to the ER and is thought to play a homeostatic role. 158
47624 398225 pfam04421 Mss4 Mss4 protein. 94
47625 398226 pfam04422 FrhB_FdhB_N Coenzyme F420 hydrogenase/dehydrogenase, beta subunit N-term. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the N termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037). 78
47626 398227 pfam04423 Rad50_zn_hook Rad50 zinc hook motif. The Mre11 complex (Mre11 Rad50 Nbs1) is central to chromosomal maintenance and functions in homologous recombination, telomere maintenance and sister chromatid association. The Rad50 coiled-coil region contains a dimer interface at the apex of the coiled coils in which pairs of conserved Cys-X-X-Cys motifs form interlocking hooks that bind one Zn ion. This alignment includes the zinc hook motif and a short stretch of coiled-coil on either side. 49
47627 398228 pfam04424 MINDY_DUB MINDY deubiquitinase. This entry represents a group of deubiquitinating (DUB) enzymes known as the MINDY family (MIU-containing novel DUB). Ubiquitin (Ub) is released one molecule at a time from the distal end of proteins with Lys48-linked polyubiquitin chains. Long polyubiquitin chains are preferred. The catalytic Cys and His residues have been identified by site-directed mutagenesis, as has the Gln that participates in formation of the oxyanion hole during catalysis. Despite the structural similarity to papain-like cysteine peptidases, a residue corresponding to the Asn that orientates the imidazolium ring of the catalytic His has not been identified. Members of the MINDY family of DUBs contain an MIU (motif interacting with Ub) motif, which is a helical motif that binds mono-Ub. 110
47628 398229 pfam04425 Bul1_N Bul1 N-terminus. This family contains the N-terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats. 445
47629 367936 pfam04426 Bul1_C Bul1 C-terminus. This family contains the C-terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats. 271
47630 398230 pfam04427 Brix Brix domain. 137
47631 367938 pfam04428 Choline_kin_N Choline kinase N-terminus. Found N terminal to choline/ethanolamine kinase regions (pfam01633) in some plant and fungal choline kinase enzymes (EC:2.7.1.32). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis. 51
47632 398231 pfam04430 DUF498 Protein of unknown function (DUF498/DUF598). This is a large family of uncharacterized proteins found in all domains of life. The structure shows a novel fold with three beta sheets. A dimeric form is found in the crystal structure. It was suggested that the cleft in between the two monomers might bind nucleic acid. 104
47633 398232 pfam04431 Pec_lyase_N Pectate lyase, N-terminus. This region is found N terminal to the pectate lyase domain (pfam00544) in some plant pectate lyase enzymes. 52
47634 398233 pfam04432 FrhB_FdhB_C Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C-terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037). 146
47635 398234 pfam04433 SWIRM SWIRM domain. This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn-helix motif and binds to DNA. 78
47636 309540 pfam04434 SWIM SWIM zinc finger. This domain is found in bacterial, archaeal and eukaryotic proteins. It is predicted to be organized into two N-terminal beta-strands and a C-terminal alpha helix, thus possibly adopting a fold similar to that of the C2H2 zinc finger (pfam00096). SWIM is thought to be a versatile domain that can interact with DNA or proteins in different contexts. 38
47637 398235 pfam04435 SPK Domain of unknown function (DUF545). Family of uncharacterized C. elegans proteins. The region represented by this family is found to be repeated up to four times in some proteins. 104
47638 398236 pfam04437 RINT1_TIP1 RINT-1 / TIP-1 family. This family includes RINT-1, a Rad50 interacting protein which participates in radiation induced checkpoint control, as well as the TIP-1 protein from yeast that seems to be involved in a complex with Sec20p that is required for Golgi transport. 511
47639 398237 pfam04438 zf-HIT HIT zinc finger. This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor. 30
47640 398238 pfam04439 Adenyl_transf Streptomycin adenylyltransferase. Also known as Aminoglycoside 6-adenylyltransferase (EC:2.7.7.-), this protein confers resistance to aminoglycoside antibiotics. 278
47641 398239 pfam04440 Dysbindin Dysbindin (Dystrobrevin binding protein 1). Dysbindin is an evolutionary conserved 40-kDa coiled-coil-containing protein that binds to alpha- and beta-dystrobrevin in muscle and brain. Dystrophin and alpha-dystrobrevin are co-immunoprecipitated with dysbindin, indicating that dysbindin is DPC-associated in muscle. Dysbindin co-localizes with alpha-dystrobrevin at the sarcolemma and is up-regulated in dystrophin-deficient muscle. In the brain, dysbindin is found primarily in axon bundles and especially in certain axon terminals, notably mossy fibre synaptic terminals in the cerebellum and hippocampus. Dysbindin may have implications for the molecular pathology of Duchenne muscular dystrophy and may provide an alternative route for anchoring dystrobrevin and the DPC to the muscle membrane. Genetic variation in the human dysbindin gene is also thought to be associated with Schizophrenia. 143
47642 398240 pfam04441 Pox_VERT_large Poxvirus early transcription factor (VETF), large subunit. The poxvirus early transcription factor (VETF), in addition to the viral RNA polymerase, is required for efficient transcription of early genes in vitro. VETF is a heterodimeric protein that binds specifically to early gene promoters. The heterodimer is comprised of an 82 kDa (this family) subunit and a 70 kDa subunit. 697
47643 398241 pfam04442 CtaG_Cox11 Cytochrome c oxidase assembly protein CtaG/Cox11. Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilizing the copper site. 148
47644 282319 pfam04443 LuxE Acyl-protein synthetase, LuxE. LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyzes the formation of an acyl-protein thioester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalyzed bioluminescence reaction. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The carboxyl terminal of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction. This family also includes Vibrio cholerae RBFN protein, which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid. 386
47645 398242 pfam04444 Dioxygenase_N Catechol dioxygenase N-terminus. This family consists of the N termini of catechol, chlorocatechol or hydroxyquinol 1,2-dioxygenase proteins. This region is always found adjacent to the dioxygenase domain (pfam00775). 75
47646 398243 pfam04445 SAM_MT Putative SAM-dependent methyltransferase. This is a family of putative SAM-dependent methyltransferases. 231
47647 398244 pfam04446 Thg1 tRNAHis guanylyltransferase. The Thg1 protein from Saccharomyces cerevisiae is responsible for adding a GMP residue to the 5' end of tRNA His. The catalytic domain of Thg1 contains a RRM (ferredoxin) fold palm domain, just like the viral RNA-dependent RNA polymerases, reverse transcriptases, family A and B DNA polymerases, adenylyl cyclases, diguanylate cyclases (GGDEF domain) and the predicted polymerase of the CRISPR system. Thg1 possesses an active site with three acidic residues that chelate Mg++ cations. Thg1 catalyzes polymerization similar to the 5'-3' polymerases. 127
47648 367945 pfam04447 DUF550 Protein of unknown function (DUF550). This family is found in a range of Proteobacteria and a few P-22 dsDNA virus particles. The function is currently not known. 97
47649 398245 pfam04448 DUF551 Protein of unknown function (DUF551). This family represents the carboxy terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages. 66
47650 398246 pfam04449 Fimbrial_CS1 CS1 type fimbrial major subunit. Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers, and CooA-CooD complexes exists, but the functional significance is unknown. A member of this family has also been identified in Salmonella typhi and Salmonella enterica. 134
47651 398247 pfam04450 BSP Peptidase of plants and bacteria. These basic secretory proteins (BSPs) are believed to be part of the plant's defense mechanism against pathogens. 205
47652 398248 pfam04451 Capsid_NCLDV Large eukaryotic DNA virus major capsid protein. This family includes the major capsid protein of iridoviruses, chlorella virus and Spodoptera ascovirus, which are all dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein. In Chlorella virus PBCV-1 the major capsid protein is a glycoprotein. The four families of large eukaryotic DNA viruses, Poxviridae, Asfarviridae, Iridoviridae, and Phycodnaviridae, are referred to collectively as nucleocytoplasmic large DNA viruses or NCLDV. The virions of different NCLDV have dramatically different structures. The major capsid proteins of iridoviruses and phycodnaviruses, both of which have icosahedral capsids surrounding an inner lipid membrane, showed a high level of sequence conservation. A more limited, but statistically significant sequence similarity was observed between these proteins and the major capsid protein (p72) of ASFV, which also has an icosahedral capsid. It was surprising, however, to find that all of these proteins shared a conserved domain with the poxvirus protein D13L, which is an integral virion component thought to form a scaffold for the formation of viral crescents and immature virion. 192
47653 398249 pfam04452 Methyltrans_RNA RNA methyltransferase. RNA methyltransferases modify nucleotides during ribosomal RNA maturation in a site-specific manner. The Escherichia coli member is specific for U1498 methylation. 221
47654 398250 pfam04453 OstA_C Organic solvent tolerance protein. Family involved in organic solvent tolerance in bacteria. The region contains several highly conserved, potentially catalytic, residues. 384
47655 398251 pfam04454 Linocin_M18 Encapsulating protein for peroxidase. The Linocin_M18 is found in eubacteria and archaea. These proteins, referred to as encapsulins, form nanocompartments within the bacterium which contain ferritin-like proteins or peroxidases, enzymes involved in oxidative-stress response. These enzymes are targeted to the interior of encapsulins via unique C-terminal extensions. 253
47656 398252 pfam04455 Saccharop_dh_N LOR/SDH bifunctional enzyme conserved region. Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to Saccharop_dh (pfam03435) in eukaryotes. 92
47657 398253 pfam04456 DUF503 Protein of unknown function (DUF503). Family of hypothetical bacterial proteins. 82
47658 398254 pfam04457 DUF504 Protein of unknown function (DUF504). Family of uncharacterized proteins. 76
47659 398255 pfam04458 DUF505 Protein of unknown function (DUF505). Family of uncharacterized prokaryotic proteins. 621
47660 398256 pfam04459 DUF512 Protein of unknown function (DUF512). Family of uncharacterized prokaryotic proteins. 203
47661 398257 pfam04461 DUF520 Protein of unknown function (DUF520). Family of uncharacterized proteins. 161
47662 398258 pfam04463 DUF523 Protein of unknown function (DUF523). Family of uncharacterized bacterial proteins. 142
47663 398259 pfam04464 Glyphos_transf CDP-Glycerol:Poly(glycerophosphate) glycerophosphotransferase. Wall-associated teichoic acids are a heterogeneous class of phosphate-rich polymers that are covalently linked to the cell wall peptidoglycan of gram-positive bacteria. They consist of a main chain of phosphodiester-linked polyols and/or sugar moieties attached to peptidoglycan via a linkage unit. CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase is responsible for the polymerization of the main chain of the teichoic acid by sequential transfer of glycerol-phosphate units from CDP-glycerol to the linkage unit lipid. 360
47664 367954 pfam04465 DUF499 Protein of unknown function (DUF499). Family of uncharacterized hypothetical prokaryotic proteins. 1016
47665 335802 pfam04466 Terminase_3 Phage terminase large subunit. Initiation of packaging of double-stranded viral DNA involves the specific interaction of the prohead with viral DNA in a process mediated by a phage-encoded terminase protein. The terminase enzymes are usually hetero-oligomers composed of a small and a large subunit. This region is found on the large subunit and possesses an endonuclease and ATPase activity that require Mg2+ and a neutral or slightly basic reaction. This region is also found in bacterial sequences. 201
47666 367955 pfam04467 DUF483 Protein of unknown function (DUF483). Family of uncharacterized prokaryotic proteins. 119
47667 398260 pfam04468 PSP1 PSP1 C-terminal conserved region. This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast. 86
47668 398261 pfam04471 Mrr_cat Restriction endonuclease. Prokaryotic family found in type II restriction enzymes containing the hallmark (D/E)-(D/E)XK active site. Presence of catalytic residues implicates this region in the enzymatic cleavage of DNA. 114
47669 398262 pfam04472 SepF Cell division protein SepF. SepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology. This family also includes archaeal related proteins of unknown function. 72
47670 282345 pfam04473 DUF553 Transglutaminase-like domain. This family of uncharacterized archaeal proteins is related to Transglutaminase-like domains. This family has previously been called DUF553 and UPF0252. 140
47671 398263 pfam04474 DUF554 Protein of unknown function (DUF554). Family of uncharacterized prokaryotic proteins. Multiple predicted transmembrane regions suggest that the region is membrane associated. 220
47672 398264 pfam04475 DUF555 Protein of unknown function (DUF555). Family of uncharacterized, hypothetical archaeal proteins. 101
47673 398265 pfam04476 4HFCP_synth 4-HFC-P synthase. (5-formylfuran-3-yl)methyl phosphate synthase, also known as 4-HFC-P synthase, is involved in the production of methanofuran. This family has a classical TIM-barrel structure whose biological unit is a homohexamer. 228
47674 398266 pfam04478 Mid2 Mid2 like cell wall stress sensor. This family represents a region near the C-terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway. 150
47675 398267 pfam04479 RTA1 RTA1 like protein. This family is comprised of fungal proteins with multiple transmembrane regions. RTA1 is involved in resistance to 7-aminocholesterol, while RTM1 confers resistance to an unknown toxic chemical in molasses. These proteins may bind to the toxic substance, and thus prevent toxicity. They are not thought to be involved in the efflux of xenobiotics. 210
47676 398268 pfam04480 DUF559 Protein of unknown function (DUF559). 95
47677 113257 pfam04481 DUF561 Protein of unknown function (DUF561). Protein of unknown function found in a cyanobacterium, and the chloroplasts of algae. 243
47678 398269 pfam04483 DUF565 Protein of unknown function (DUF565). Predicted transmembrane protein found in plants, chloroplasts and cyanobacteria. This family is also known as YCF20. 57
47679 398270 pfam04484 QWRF QWRF family. AUG8 belongs to the plant QWRF motif-containing protein family, which also includes microtubule-associated protein ENDOSPERM DEFECTIVE 1 and SNOWY COTYLEDON 3. AUG8 binds the microtubule plus-end and participates in the reorientation of microtubules in hypocotyls (the stem of a germinating seedling). 300
47680 398271 pfam04485 NblA Phycobilisome degradation protein nblA. In the cyanobacterium Synechococcus PCC 7942, nblA triggers degradation of light-harvesting phycobiliproteins in response to deprivation of nutrients including nitrogen, phosphorus and sulphur. The mechanism of nblA function is not known, but it has been hypothesized that nblA may act by disrupting phycobilisome structure, activating a protease or tagging phycobiliproteins for proteolysis. Members of this family have also been identified in the chloroplasts of some red algae. 50
47681 398272 pfam04486 SchA_CurD SchA/CurD like domain. Members of this family have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site. Many of these proteins contain two copies of this presumed domain. 118
47682 398273 pfam04487 CITED CITED. CITED, CBP/p300-interacting transactivator with ED-rich tail, are characterized by a conserved 32-amino acid sequence at the C-terminus. CITED proteins do not bind DNA directly and are thought to function as transcriptional co-activators. 210
47683 398274 pfam04488 Gly_transf_sug Glycosyltransferase sugar-binding region containing DXD motif. The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases. 93
47684 282358 pfam04489 DUF570 Protein of unknown function (DUF570). Protein of unknown function, found in herpesvirus and cytomegalovirus. 456
47685 282359 pfam04490 Pox_T4_C Poxvirus T4 protein, C-terminus. This family of poxvirus proteins are thought to be retained in the endoplasmic reticulum. M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection. 146
47686 282360 pfam04491 Pox_T4_N Poxvirus T4 protein, N-terminus. This family of poxvirus proteins are thought to be secreted or retained in the endoplasmic reticulum if the protein also contains an additional C terminal region (pfam04490). M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection. 46
47687 398275 pfam04492 Phage_rep_O Bacteriophage replication protein O. Replication protein O is necessary for the initiation of bacteriophage DNA replication. Protein O interacts with the lambda replication origin, and also with replication protein P to form an oligomer. It is speculated that the N-terminal half interacts with the replication origin while the C terminal half mediates protein-protein interaction. 92
47688 398276 pfam04493 Endonuclease_5 Endonuclease V. Endonuclease V is specific for single-stranded DNA or for duplex DNA that contains uracil or that is damaged by a variety of agents. 198
47689 398277 pfam04494 TFIID_NTD2 WD40 associated region in TFIID subunit, NTD2 domain. This region is an all-alpha domain associated with the WD40 helical bundle of the TAF5 subunit of transcription factor TFIID. The domain has distant structural similarity to RNA polymerase II CTD interacting factors. It contains several conserved clefts that are likely to be critical for TFIID complex assembly. The TAF5 subunit is present twice in the TFIID complex and is critical for the function and assembly of the complex, and the NTD2 and N-terminal domain is crucial for homodimerization. 125
47690 398278 pfam04495 GRASP55_65 GRASP55/65 PDZ-like domain. GRASP55 (Golgi re-assembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. This region appears to be related to the PDZ domain. 138
47691 282365 pfam04496 Herpes_UL35 Herpesvirus UL35 family. UL35 represents a true late gene which encodes a 12-kDa capsid protein. 109
47692 282366 pfam04497 Pox_E2-like Poxviridae protein. This family of proteins is restricted to Poxviridae. It contains a number of differently named uncharacterized proteins. 729
47693 282367 pfam04498 Pox_VP8_L4R Poxvirus nucleic acid binding protein VP8/L4R. The 25 kDa product of Vaccinia virus gene L4R is also known as VP8. VP8 is found in the cores of Vaccinia virions and is essential for the formation of transcriptionally competent viral particles. It binds both single stranded and double stranded DNA and RNA with similar affinities. Binding is thought to involve cooperative interactions between protein subunits. The protein is proteolytically cleaved during viral assembly at an Ala-Gly-Ala site. Possible roles for VP8 include packaging and maintaining the DNA genome in a transcribable configuration; binding ssDNA during transcription initiation; and cooperation with I8R protein to unwind early promoter regions. VP8 may also function in either transcription elongation or release of mRNA molecules from viral particles. 217
47694 398279 pfam04499 SAPS SIT4 phosphatase-associated protein. This family includes a conserved region from a group of yeast proteins that associate with the SIT4 phosphatase. This association is required for SIT4's role in G1 cyclin transcription and for bud formation. This family also includes homologous regions from other eukaryotes. 383
47695 398280 pfam04500 FLYWCH FLYWCH zinc finger domain. Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in. 62
47696 367967 pfam04501 Baculo_VP39 Baculovirus major capsid protein VP39. This family constitutes the 39 kDa major capsid protein of the Baculoviridae. 240
47697 398281 pfam04502 DUF572 Family of unknown function (DUF572). Family of eukaryotic proteins with undetermined function. 316
47698 398282 pfam04503 SSDP Single-stranded DNA binding protein, SSDP. This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene. 294
47699 398283 pfam04504 DUF573 Protein of unknown function, DUF573. 89
47700 398284 pfam04505 CD225 Interferon-induced transmembrane protein. This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression. 68
47701 398285 pfam04506 Rft-1 Rft protein. 513
47702 398286 pfam04507 DUF576 Csa1 family. This family contains several uncharacterized staphylococcal proteins. These proteins have been called conserved staphylococcal antigens (Csa). 225
47703 282377 pfam04508 Pox_A_type_inc Viral A-type inclusion protein repeat. The repeat is found in the A-type inclusion protein of the Poxvirus family. 22
47704 398287 pfam04509 CheC CheC-like family. The restoration of pre-stimulus levels of the chemotactic response regulator, CheY-P, is important for allowing bacteria to respond to new environmental stimuli. The members of this family, CheC, CheX, CheA and FliY are CheY-P phosphatases. CheC appears to be primarily involved in restoring normal CheY-P levels, whereas FliY seems to act on CheY-P constitutively. CheD enhances the activity of CheC 5-fold, which is normally relatively low. In some cases, the region represented by this entry is present as multiple copies. 38
47705 367974 pfam04510 DUF577 Family of unknown function (DUF577). Family of Arabidopsis thaliana proteins. Many of these members contain a repeated region. 173
47706 398288 pfam04511 DER1 Der1-like family. The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein, that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.). 191
47707 398289 pfam04512 Baculo_PEP_N Baculovirus polyhedron envelope protein, PEP, N-terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilize polyhedra and protect them from fusion or aggregation. 91
47708 309591 pfam04513 Baculo_PEP_C Baculovirus polyhedron envelope protein, PEP, C-terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilize polyhedra and protect them from fusion or aggregation. 140
47709 282383 pfam04514 BTV_NS2 Bluetongue virus non-structural protein NS2. This family includes NS2 proteins from other members of the Orbivirus genus. NS2 is a non-specific single-stranded RNA-binding protein that forms large homomultimers and accumulates in viral inclusion bodies of infected cells. Three RNA binding regions have been identified in Bluetongue virus serotype 17 at residues 2-11, 153-166 and 274-286. NS2 multimers also possess nucleotidyl phosphatase activity. The precise function of NS2 is not known, but it may be involved in the transport and condensation of viral mRNAs. 349
47710 398290 pfam04515 Choline_transpo Plasma-membrane choline transporter. This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. 324
47711 398291 pfam04516 CP2 CP2 transcription factor. This family represents a conserved region in the CP2 transcription factor family. 214
47712 282386 pfam04517 Microvir_lysis Microvirus lysis protein (E), C-terminus. E protein causes host cell lysis by inhibiting MraY, a peptidoglycan biosynthesis enzyme. This leads to cell wall failure at septation. The N terminal transmembrane region matches the signal peptide model and must be omitted from the family. 42
47713 309594 pfam04518 Effector_1 Effector from type III secretion system. This is a family of effector proteins which are secreted by the type III secretion system. The precise function of this family is unknown. 352
47714 398292 pfam04519 Bactofilin Polymer-forming cytoskeletal. This is a family of bactofilins, a functionally diverse class of cytoskeletal, polymer-forming, proteins that is widely conserved among bacteria. In the example species C. crescentus, two bactofilins assemble into a membrane-associated laminar structure that shows cell-cycle-dependent polar localization and acts as a platform for the recruitment of a cell wall biosynthetic enzyme involved in polar morphogenesis. Bactofilins display distinct subcellular distributions and dynamics in different bacterial species, suggesting that they are versatile structural elements that have adopted a range of different cellular functions. 89
47715 398293 pfam04520 Senescence_reg Senescence regulator. This protein regulates the expression of proteins associated with leaf senescence in plants. 171
47716 282390 pfam04521 Viral_P18 ssRNA positive strand viral 18kD cysteine rich protein. 137
47717 282391 pfam04522 DUF585 Protein of unknown function (DUF585). This region represents the N-terminus of bromovirus 2a protein, and is always found N terminal to a predicted RNA-dependent RNA polymerase region (pfam00978). 234
47718 282392 pfam04523 Herpes_U30 Herpes virus tegument protein U30. This family is named after the human herpesvirus protein, but has been characterized in cytomegalovirus as UL47. Cytomegalovirus UL47 is a component of the tegument, which is a protein layer surrounding the viral capsid. UL47 co-precipitates with UL48 and UL69 tegument proteins, and the major capsid protein UL86. A UL47-containing complex is thought to be involved in the release of viral DNA from the disassembling virus particle. 906
47719 398294 pfam04525 LOR LURP-one-related. The structure of this family has been solved. It comprises a 12-stranded beta barrel with a central C-terminal alpha helix. This helix is thought to be a transmembrane helix. It is structurally similar to the C-terminal domain of the Tubby protein. In plants it plays a role in defense against pathogens. 186
47720 398295 pfam04526 DUF568 Protein of unknown function (DUF568). Family of uncharacterized plant proteins. 100
47721 309599 pfam04527 Retinin_C Drosophila Retinin like protein. Family of Drosophila proteins related to the C-terminal region of the Drosophila Retinin protein. Conserved region is found towards the C-terminus of the member proteins. 63
47722 282396 pfam04528 Adeno_E4_34 Adenovirus early E4 34 kDa protein conserved region. Conserved region found in the Adenovirus E4 34 kDa protein. 145
47723 398296 pfam04529 Herpes_U59 Herpesvirus U59 protein. The proteins in this family have no known function. Cytomegalovirus UL88 is also a member of this family. 365
47724 282398 pfam04530 Viral_Beta_CD Viral Beta C/D like family. Family of ssRNA positive-strand viral proteins. Conserved region found in the Beta C and Beta D transcripts. 123
47725 398297 pfam04531 Phage_holin_1 Bacteriophage holin. This family of holins is found in several staphylococcal and streptococcal bacteriophages. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis. 82
47726 282400 pfam04532 DUF587 Protein of unknown function (DUF587). This family consists of the N termini of some human herpesvirus U58 proteins, and some cytomegalovirus UL87 proteins. This region is always found N terminal to the Pfam family UL87 (pfam03043), which has no known function. 227
47727 282401 pfam04533 Herpes_U44 Herpes virus U44 protein. This is a family of proteins from dsDNA beta-herpesvirinae and gamma-herpesvirinae viruses. The function is not known, and the proteins are named variously as U44, BSRF1, UL71, and M71. The family BSRF1 has been merged into this. 202
47728 398298 pfam04534 Herpes_UL56 Herpesvirus UL56 protein. In herpes simplex virus type 2, UL56 is thought to be a tail-anchored type II membrane protein involved in vesicular trafficking. The C terminal hydrophobic region is required for association with the cytoplasmic membrane, and the N terminal proline-rich region is important for the translocation of UL56 to the Golgi apparatus and cytoplasmic vesicles. 197
47729 367980 pfam04535 DUF588 Domain of unknown function (DUF588). This family of plant proteins contains a domain that may have a catalytic activity. It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices. 150
47730 398299 pfam04536 TPM_phosphatase TPM domain. This family was first named TPM domain after its founding proteins: TLP18.3, Psb32 and MOLO-1. In Arabidopsis, this domain is called the thylakoid acid phosphatase -TAP - domain and has a Rossmann-like fold. In plants, the family resides in the thylakoid lumen attached to the outer membrane of the chloroplast/plastid. It is active in the photosystem II. 125
47731 282405 pfam04537 Herpes_UL55 Herpesvirus UL55 protein. In infected cells, UL55 is associated with the nuclear matrix, and found adjacent to compartments containing the capsid protein ICP35. UL55 was not detected in assembled virions. It is thought that UL55 may play a role in virion assembly or maturation. 164
47732 398300 pfam04538 BEX Brain expressed X-linked like family. This is a family of transcription elongation factors which includes those referred to as Bex proteins as well as those named TCEAL7. Bex1 was shown to be a novel link between neurotrophin signalling, the cell cycle, and neuronal differentiation, suggesting it might function by coordinating internal cellular states with the ability of cells to respond to external signals. TCEAL7 has been shown to negatively regulate the NF-kappaB pathway, hence being important in ovarian cancer as it is one of the genes frequently downregulated in this cancer. A closely related protein, TFIIS/TCEA, found in pfam07500, is involved in transcription elongation and transcript fidelity. TFIIS/TCEA promotes 3' endoribonuclease activity of RNA polymerase II (pol II) and allows pol II to bypass transcript pause or 'arrest' during the elongation process. It is thus possible that BEX is also acting in this way. 103
47733 398301 pfam04539 Sigma70_r3 Sigma-70 region 3. Region 3 forms a discrete compact three helical domain within the sigma-factor. Region 3 is not normally involved in the recognition of promoter DNA, but in some specific bacterial promoters containing an extended -10 promoter element, residues within region 3 play an important role. Region 3 is primarily involved in binding the core RNA polymerase in the holoenzyme. 76
47734 282408 pfam04540 Herpes_UL51 Herpesvirus UL51 protein. UL51 protein is a virion protein. In pseudorabies virus, UL51 was identified as a component of the capsid. In herpes simplex virus type 1 there is evidence for post-translational modification of UL51. 158
47735 398302 pfam04541 Herpes_U34 Herpesvirus virion protein U34. The virion proteins in this family include membrane phosphoprotein-like proteins such as UL34, Epstein-Barr and R50, from dsDNA viruses, no RNA stage, Herpesvirales. The family Herpes_BFRF1, pfam05900, has been merged in. 182
47736 398303 pfam04542 Sigma70_r2 Sigma-70 region 2. Region 2 of sigma-70 is the most conserved region of the entire protein. All members of this class of sigma-factor contain region 2. The high conservation is due to region 2 containing both the -10 promoter recognition helix and the primary core RNA polymerase binding determinant. The core binding helix interacts with the clamp domain of the largest polymerase subunit, beta prime. The aromatic residues of the recognition helix, found at the C-terminus of this domain, are thought to mediate strand separation, thereby allowing transcription initiation. 69
47737 282411 pfam04544 Herpes_UL20 Herpesvirus egress protein UL20. UL20 is predicted to be a transmembrane protein with multiple membrane spans. It is involved in the trans-cellular transport of enveloped virions, and is therefore important for viral egress. However, UL20 operates in different cellular compartments and different stages of egress in pseudorabies virus and herpes simplex virus. This is thought to be due to differences in egress pathways between these two viruses. 179
47738 398304 pfam04545 Sigma70_r4 Sigma-70, region 4. Region 4 of sigma-70 like sigma-factors is involved in binding to the -35 promoter element via a helix-turn-helix motif. Due to the way Pfam works, the threshold has been set artificially high to prevent overlaps with other helix-turn-helix families. Therefore there are many false negatives. 50
47739 398305 pfam04546 Sigma70_ner Sigma-70, non-essential region. The domain is found in the primary vegetative sigma factor. The function of this domain is unclear and can be removed without loss of function. 169
47740 398306 pfam04547 Anoctamin Calcium-activated chloride channel. The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes. 422
47741 398307 pfam04548 AIG1 AIG1 family. Arabidopsis protein AIG1 appears to be involved in plant resistance to bacteria. 200
47742 367985 pfam04549 CD47 CD47 transmembrane region. This family represents the transmembrane region of CD47 leukocyte antigen. 147
47743 398308 pfam04550 Phage_holin_3_2 Phage holin family 2. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis. 86
47744 398309 pfam04551 GcpE GcpE protein. In a variety of organisms, including plants and several eubacteria, isoprenoids are synthesized by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. GcpE gene of Escherichia coli is involved in this pathway. 342
47745 398310 pfam04552 Sigma54_DBD Sigma-54, DNA binding domain. This DNA binding domain is based on peptide fragmentation data. This domain is proximal to DNA in the promoter/holoenzyme complex. Furthermore this region contains a putative helix-turn-helix motif. At the C-terminus, there is a highly conserved region known as the RpoN box and is the signature of the sigma-54 proteins. 159
47746 398311 pfam04553 Tis11B_N Tis11B like protein, N-terminus. Members of this family always contain a tandem repeat of CCCH zinc fingers pfam00642. Tis11B, Tis11D and their homologs are thought to be regulatory proteins involved in the response to growth factors. The function of the N-terminus is unknown. 105
47747 252669 pfam04554 Extensin_2 Extensin-like region. 57
47748 398312 pfam04555 XhoI Restriction endonuclease XhoI. This family consists of type II restriction enzymes (EC:3.1.21.4) that recognize the double-stranded sequence CTCGAG and cleave after C-1. 191
47749 398313 pfam04556 DpnII DpnII restriction endonuclease. Members of this family are type II restriction enzymes (EC:3.1.21.4). They recognize the double-stranded unmethylated sequence GATC and cleave before G-1. http://rebase.neb.com/rebase/enz/DpnII.html 276
47750 398314 pfam04557 tRNA_synt_1c_R2 Glutaminyl-tRNA synthetase, non-specific RNA binding region part 2. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function. 84
47751 398315 pfam04558 tRNA_synt_1c_R1 Glutaminyl-tRNA synthetase, non-specific RNA binding region part 1. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function. 162
47752 398316 pfam04559 Herpes_UL17 Herpesvirus UL17 protein. UL17 protein is required for DNA cleavage and packaging in herpes viruses. It has been shown to associate with immature B-type capsids, and is required for the localization of capsids and capsid proteins to the intranuclear sites where viral DNA is cleaved and packaged. In the virion, UL17 is a component of the tegument, which is a protein layer surrounding the viral capsid. 492
47753 398317 pfam04560 RNA_pol_Rpb2_7 RNA polymerase Rpb2, domain 7. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain is comprised of the structural domains anchor and clamp. The clamp region (C-terminal) contains a zinc-binding motif. The clamp region is named due to its interaction with the clamp domain found in Rpb1. The domain also contains a region termed "switch 4". The switches within the polymerase are thought to signal different stages of transcription. 85
47754 398318 pfam04561 RNA_pol_Rpb2_2 RNA polymerase Rpb2, domain 2. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain. DNA has been demonstrated to bind to the concave surface of the lobe domain, and plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 1 (DRI). 185
47755 398319 pfam04562 Dicty_spore_N Dictyostelium spore coat protein, N-terminus. The Dictyostelium spore coat is a polarised extracellular matrix composed of glycoproteins and cellulose. Four of the major coat glycoproteins exist as a multi-protein complex within the prespore vesicles before secretion. Of these, SP96 and SP70 are members of this family. The presence of SP96 and SP70 in the complex is necessary for the cellulose binding activity of the complex, which is in turn necessary for normal spore coat assembly. The function of this region of these proteins is not known. 114
47756 367994 pfam04563 RNA_pol_Rpb2_1 RNA polymerase beta subunit. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the protrusion domain. The other lobe (pfam04561) is nested within this domain. 396
47757 398320 pfam04564 U-box U-box domain. This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. 73
47758 398321 pfam04565 RNA_pol_Rpb2_3 RNA polymerase Rpb2, domain 3. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 3 is also known as the fork domain and is proximal to the catalytic site. 67
47759 398322 pfam04566 RNA_pol_Rpb2_4 RNA polymerase Rpb2, domain 4. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain. 62
47760 398323 pfam04567 RNA_pol_Rpb2_5 RNA polymerase Rpb2, domain 5. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 5 is also known as the external 2 domain. 54
47761 367997 pfam04568 IATP Mitochondrial ATPase inhibitor, IATP. ATP synthase inhibitor prevents the enzyme from switching to ATP hydrolysis during collapse of the electrochemical gradient, for example during oxygen deprivation. ATP synthase inhibitor forms a one to one complex with the F1 ATPase, possibly by binding at the alpha-beta interface. It is thought to inhibit ATP synthesis by preventing the release of ATP. The minimum inhibitory region for bovine inhibitor is from residues 39 to 72. The inhibitor has two oligomeric states, dimer (the active state) and tetramer. At low pH, the inhibitor forms a dimer via antiparallel coiled coil interactions between the C terminal regions of two monomers. At high pH, the inhibitor forms tetramers and higher oligomers by coiled coil interactions involving the N-terminus and inhibitory region, thus preventing the inhibitory activity. 98
47762 367998 pfam04569 DUF591 Protein of unknown function. This family represents a conserved region in a number of uncharacterized plant proteins. 41
47763 398324 pfam04570 zf-FLZ zinc-finger of the FCS-type, C2-C2. zf-FLZ is a FCS-like zinc-finger domain found in higher plants. It is bryophytic in origin. It carries a zf-FCS-like C2-C2 zinc finger, consisting of a consensus cysteine-signature sequence with conserved phenylalanine and serine residues associated with a third cysteine. It acts as a protein-protein interaction module. 53
47764 398325 pfam04571 Lipin_N lipin, N-terminal conserved region. Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. The conserved region is found at the N-terminus of the member proteins. 103
47765 398326 pfam04572 Gb3_synth Alpha 1,4-glycosyltransferase conserved region. The glycosphingolipids (GSL) form part of eukaryotic cell membranes. They consist of a hydrophilic carbohydrate moiety linked to a hydrophobic ceramide tail embedded within the lipid bilayer of the membrane. Lactosylceramide, Gal1,4Glc1Cer (LacCer), is the common synthetic precursor to the majority of GSL found in vertebrates. Alpha 1,4-glycosyltransferases utilize UDP donors and transfer the sugar to a beta-linked acceptor. This region appears to be confined to higher eukaryotes. No function has yet been assigned to this region. 125
47766 398327 pfam04573 SPC22 Signal peptidase subunit. Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC) which consists of four subunits in yeast and five in mammals. This family is common to yeast and mammals. 172
47767 368003 pfam04574 DUF592 Protein of unknown function (DUF592). This region is found in some SIR2 family proteins (pfam02146). 153
47768 398328 pfam04575 DUF560 Protein of unknown function (DUF560). Family of hypothetical bacterial proteins. 288
47769 398329 pfam04576 Zein-binding Zein-binding. This domain binds to zein proteins, pfam01559. Zein proteins are seed storage proteins. 92
47770 398330 pfam04577 DUF563 Protein of unknown function (DUF563). Family of uncharacterized proteins. 209
47771 398331 pfam04578 DUF594 Protein of unknown function, DUF594. 54
47772 398332 pfam04579 Keratin_matx Keratin, high-sulphur matrix protein. Family of Keratin, high-sulfur matrix proteins. The keratin products of mammalian epidermal derivatives such as wool and hair consist of microfibrils embedded in a rigid matrix of other proteins. The matrix proteins include the high-sulphur and high-tyrosine keratins, having molecular weights of 6-20 kDa, whereas microfibrils contain the larger, low-sulphur keratins (40-56 kDa). 96
47773 282445 pfam04580 Pox_D3 Chordopoxvirinae D3 protein. Chordopoxvirinae D3 protein conserved region. Region occupies entire length of D3 protein. 248
47774 368009 pfam04582 Reo_sigmaC Reovirus sigma C capsid protein. 130
47775 368010 pfam04583 Baculo_p74 Baculoviridae p74 conserved region. Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein. 218
47776 282447 pfam04584 Pox_A28 Poxvirus A28 family. Family of conserved Poxvirus A28 family proteins. Conserved region spans entire protein in the majority of family members. 140
47777 309640 pfam04586 Peptidase_S78 Caudovirus prohead serine protease. Family of Caudovirus prohead serine proteases also found in a number of bacteria possibly as the result of horizontal transfer. 160
47778 398333 pfam04587 ADP_PFK_GK ADP-specific Phosphofructokinase/Glucokinase conserved region. In archaea a novel type of glycolytic pathway exists that is deviant from the classical Embden-Meyerhof pathway. This pathway utilizes two novel proteins: an ADP-dependent Glucokinase and an ADP-dependent Phosphofructokinase. This conserved region is present at the C-terminal of both these proteins. Interestingly this family contains sequences from higher eukaryotes. 428
47779 398334 pfam04588 HIG_1_N Hypoxia induced protein conserved region. This family is found in proteins thought to be involved in the response to hypoxia. Family members mostly come from diverse eukaryotic organisms; however, eubacterial members have been identified. This region is found at the N-terminus of the member proteins which are predicted to be transmembrane. 50
47780 398335 pfam04589 RFX1_trans_act RFX1 transcription activation region. The RFX family is a family of winged-helix DNA binding proteins. RFX1 is a regulatory factor essential for expression of MHC class II genes. This region is found N terminal to the RFX DNA binding region (pfam02257) in some mammalian RFX proteins, and is thought to activate transcription when associated with DNA. Deletion analysis has identified the region 233-351 in human RFX1 as being required for maximal activation. 160
47781 398336 pfam04591 DUF596 Protein of unknown function, DUF596. This family contains several uncharacterized proteins. 68
47782 398337 pfam04592 SelP_N Selenoprotein P, N terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N-terminal region of SelP can exist independently of the C terminal region. Zebrafish selenoprotein Pb lacks the C terminal Sec-rich region, and a protein encoded by the rat SelP gene and lacking this region has also been reported. The N-terminal region contains a conserved SecxxCys motif, which is similar to the CysxxCys found in thioredoxins. It is speculated that the N terminal region may adopt a thioredoxin fold and catalyze redox reactions. The N-terminal region also contains a His-rich region, which is thought to mediate heparin binding. Binding to heparan proteoglycans could account for the membrane binding properties of SelP. The function of the bacterial members of this family is uncharacterized. 233
47783 335847 pfam04593 SelP_C Selenoprotein P, C terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N terminal region always contains one Sec residue, and this is separated from the C terminal region (9-16 Sec residues) by a histidine-rich sequence. The large number of Sec residues in the C-terminal portion of SelP suggests that it may be involved in selenium transport or storage. However, it is also possible that this region has a redox function. 133
47784 252691 pfam04595 Pox_I6 Poxvirus I6-like family. This family includes I6 proteins as well as the related F5L proteins. 320
47785 282455 pfam04596 Pox_F15 Poxvirus protein F15. 136
47786 398338 pfam04597 Ribophorin_I Ribophorin I. Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as Dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (EC:2.4.1.119). OST catalyzes the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C-terminus of this region. 439
47787 398339 pfam04598 Gasdermin Gasdermin family. The precise function of this protein is unknown. A deletion/insertion mutation is associated with an autosomal dominant non-syndromic hearing impairment form. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells. This family also includes the gasdermin protein. 240
47788 282458 pfam04599 Pox_G5 Poxvirus G5 protein. This protein has been predicted to be related to the FEN-1 endonuclease. 425
47789 282459 pfam04601 DUF569 Domain of unknown function (DUF569). Family of hypothetical proteins. Some family members contain two copies of the domain. 142
47790 377388 pfam04602 Arabinose_trans Mycobacterial cell wall arabinan synthesis protein. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol. 459
47791 398340 pfam04603 Mog1 Ran-interacting Mog1 protein. Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran. The human homolog of Mog1 is thought to be alternatively spliced. 136
47792 368018 pfam04604 L_biotic_typeA Type-A lantibiotic. Lantibiotics are antibiotic peptides distinguished by the presence of the rare thioether amino acids lanthionine and/or methyl-lanthionine. They are produced by Gram-positive bacteria as gene-encoded precursor peptides and undergo post-translational modification to generate the mature peptide. Based on their structural and functional features lantibiotics are currently divided into two major groups: the flexible amphiphilic type-A and the rather rigid and globular type-B. Type-A lantibiotics act primarily by pore formation in the bacterial membrane by a mechanism involving the interaction with specific docking molecules such as the membrane precursor lipid II. 51
47793 398341 pfam04606 Ogr_Delta Ogr/Delta-like zinc finger. This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F. This family also includes zinc fingers in recombinase proteins. 47
47794 398342 pfam04607 RelA_SpoT Region found in RelA / SpoT proteins. This region of unknown function is found in RelA and SpoT of Escherichia coli, and their homologs in plants and in other eubacteria. RelA is a guanosine 3',5'-bis-pyrophosphate (ppGpp) synthetase (EC:2.7.6.5) while SpoT is thought to be a bifunctional enzyme catalyzing both ppGpp synthesis and degradation (ppGpp 3'-pyrophosphohydrolase, (EC:3.1.7.2)). This region is often found in association with HD (pfam01966), a metal-dependent phosphohydrolase, TGS (pfam02824) which is a possible nucleotide-binding region, and the ACT regulatory domain (pfam01842). 113
47795 398343 pfam04608 PgpA Phosphatidylglycerophosphatase A. This family represents a family of bacterial phosphatidylglycerophosphatases (EC:3.1.3.27), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli. 144
47796 398344 pfam04609 MCR_C Methyl-coenzyme M reductase operon protein C. Methyl coenzyme M reductase (MCR) catalyzes the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown. This family now also includes family MtrC_related. 271
47797 377390 pfam04610 TrbL TrbL/VirB6 plasmid conjugal transfer protein. 214
47798 282467 pfam04611 AalphaY_MDB Mating type protein A alpha Y mating type dependent binding region. This region is important for the mating type dependent binding of Y protein to the A alpha Z protein of another mating type in Schizophyllum commune. 145
47799 398345 pfam04612 T2SSM Type II secretion system (T2SS), protein M. This family of membrane proteins consists of Type II secretion system protein M sequences from several Gram-negative (diderm) bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the T2SM (EpsM) protein interacts with the T2SL (EpsL) protein, and also forms homodimers. 159
47800 398346 pfam04613 LpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase, LpxD. UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) catalyzes an early step in lipid A biosynthesis: UDP-3-O-(3-hydroxytetradecanoyl)glucosamine + (R)-3-hydroxytetradecanoyl- [acyl carrier protein] -> UDP-2,3-bis(3-hydroxytetradecanoyl)glucosamine + [acyl carrier protein]. Members of this family also contain a hexapeptide repeat (pfam00132). This family constitutes the non-repeating region of LPXD proteins. 69
47801 398347 pfam04614 Pex19 Pex19 protein family. 246
47802 398348 pfam04615 Utp14 Utp14 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome. 729
47803 398349 pfam04616 Glyco_hydro_43 Glycosyl hydrolases family 43. The glycosyl hydrolase family 43 contains members that are arabinanases. Arabinanases hydrolyze the alpha-1,5-linked L-arabinofuranoside backbone of plant cell wall arabinans. The structure of arabinanase Arb43A from Cellvibrio japonicus reveals a five-bladed beta-propeller fold. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. 281
47804 398350 pfam04617 Hox9_act Hox9 activation region. This family constitutes the N termini of the paralogous homeobox proteins HoxA9, HoxB9, HoxC9 and HoxD9. The N terminal region is found to act as a transcription activation region. Btg1 and Btg2 - the B-cell translocation gene products - may function as cofactors for Hoxb9-mediated transcription. The Btg proteins modulate Hoxb9 transcriptional activity by recruiting a multiprotein Ccr4-like complex. 182
47805 398351 pfam04618 HD-ZIP_N HD-ZIP protein N-terminus. This family consists of the N termini of plant homeobox-leucine zipper proteins. Its function is unknown. 99
47806 309663 pfam04619 Adhesin_Dr Dr-family adhesin. This family of adhesins binds to the Dr blood group antigen component of decay-accelerating factor. This mediates adherence of uropathogenic Escherichia coli to the urinary tract. This family contains both fimbriated and afimbriated adherence structures. This protein also confers the phenotype of mannose-resistant hemagglutination, which can be inhibited by chloramphenicol. The N terminal portion of the protein is thought to be responsible for chloramphenicol sensitivity. 139
47807 309664 pfam04620 FlaA Flagellar filament outer layer protein FlaA. Periplasmic flagella are the organelles of spirochete mobility, and are structurally different from the flagella of other motile bacteria. They reside inside the cell within the periplasmic space, and confer mobility in viscous gel-like media such as connective tissue. The flagella are composed of an outer sheath of FlaA proteins and a core filament of FlaB proteins. Each species usually has several FlaA protein species. 230
47808 398352 pfam04621 ETS_PEA3_N PEA3 subfamily ETS-domain transcription factor N terminal domain. The N-terminus of the PEA3 transcription factors is implicated in transactivation and in inhibition of DNA binding. Transactivation is potentiated by activation of the Ras/MAP kinase and protein kinase A signalling cascades. The N terminal region contains conserved MAP kinase phosphorylation sites. 342
47809 398353 pfam04622 ERG2_Sigma1R ERG2 and Sigma1 receptor like protein. This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. C-8 sterol isomerase (delta-8--delta-7 sterol isomerase), catalyzes a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum, which interacts with endogenous steroid hormones, such as progesterone and testosterone. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia. 193
47810 113396 pfam04623 Adeno_E1B_55K_N Adenovirus E1B protein N-terminus. This family constitutes the amino termini of E1B 55 kDa (pfam01696). E1B 55K binds p53, the tumor suppressor protein, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The role of the N-terminus in the function of E1B is not known. 71
47811 309667 pfam04624 Dec-1 Dec-1 repeat. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). This repeat is usually found in 12 copies in the central region of the protein. Its function is unknown. Length polymorphisms of Dec-1 have been observed in wild-type strains, and are caused by changes in the numbers of the first five repeats. 27
47812 368028 pfam04625 DEC-1_N DEC-1 protein, N-terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). 403
47813 282480 pfam04626 DEC-1_C Dec-1 protein, C terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). Alternative splicing generates different carboxyl terminal ends in different protein isoforms, so this region is the most C terminal region that is present in the main isoforms. 131
47814 398354 pfam04627 ATP-synt_Eps Mitochondrial ATP synthase epsilon chain. This family constitutes the mitochondrial ATP synthase epsilon subunit. This is not to be confused with the bacterial epsilon subunit, which is homologous to the mitochondrial delta subunit (pfam00401 and pfam02823). The epsilon subunit is located in the extrinsic membrane section F1, which is the catalytic site of ATP synthesis. The epsilon subunit was not well ordered in the crystal structure of bovine F1, but it is known to be located in the stalk region of F1. E subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1. 49
47815 335859 pfam04628 Sedlin_N Sedlin, N-terminal conserved region. Mutations in this protein are associated with the X-linked spondyloepiphyseal dysplasia tarda syndrome (OMIM:313400). This family represents an N-terminal conserved region. 129
47816 398355 pfam04629 ICA69 Islet cell autoantigen ICA69, C-terminal domain. This family includes a 69 kD protein which has been identified as an islet cell autoantigen in type I diabetes mellitus. Its precise function is unknown. 254
47817 309671 pfam04630 Phage_TTP_1 Phage tail tube protein. This is a family of phage tail tube proteins from Myoviridae. 199
47818 282484 pfam04631 PIF2 Per os infectivity factor 2. This family includes several hypothetical baculoviral proteins, with predicted molecular weights of approximately 44 kD. Family members include per os infectivity factor 2 (PIF2). PIF2 forms a stable complex with PIF1, PIF3, PIF4 which is essential for oral infectivity of Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) in insect larvae, and P74 is also associated with this complex. 372
47819 398356 pfam04632 FUSC Fusaric acid resistance protein family. This family includes a conserved region found in two proteins associated with fusaric acid resistance, FusC from Burkholderia cepacia and fdt-2 from Klebsiella oxytoca. These proteins are likely to be membrane transporter proteins. 655
47820 282486 pfam04633 Herpes_BMRF2 Herpesvirus BMRF2 protein. 349
47821 398357 pfam04634 DUF600 Protein of unknown function, DUF600. This conserved region is found in several uncharacterized proteins from Gram positive bacteria. 144
47822 398358 pfam04636 PA26 PA26 p53-induced protein (sestrin). PA26 is a p53-inducible protein. Its function is unknown. It has similarity to pfam04636 in its N-terminus. 440
47823 282489 pfam04637 Herpes_pp85 Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25). This family includes UL25 proteins from HCMV, as well as U14 proteins from HHV 6 and HHV7. These 85 kD phosphoproteins appear to act as structural antigens, but their precise function is otherwise unknown. 542
47824 282490 pfam04639 Baculo_E56 Baculoviral E56 protein, specific to ODV envelope. This family represents the E56 protein, which localizes to the occlusion derived virus (ODV) envelope, but not to the budded virus (BV) envelope. 293
47825 398359 pfam04640 PLATZ PLATZ transcription factor. Plant AT-rich sequence and zinc-binding proteins (PLATZ) are zinc-dependent DNA binding proteins. They bind to AT rich sequences and function in transcriptional repression. 79
47826 398360 pfam04641 Rtf2 Rtf2 RING-finger. It is vital for effective cell-replication that replication is not stalled at any point by, for instance, damaged bases. Replication termination factor 2 (Rtf2) stabilizes the replication fork stalled at the site-specific replication barrier RTS1 by preventing replication restart until completion of DNA synthesis by a converging replication fork initiated at a flanking origin. The RTS1 element terminates replication forks that are moving in the cen2-distal direction while allowing forks moving in the cen2-proximal direction to pass through the region. Rtf2 contains a C2HC2 motif related to the C3HC4 RING-finger motif, and would appear to fold up, creating a RING finger-like structure but forming only one functional Zn2+ ion-binding site. This domain is also found at the N-terminus of peptidyl-prolyl cis-trans isomerase 4, a divergent cyclophilin family. 258
47827 282493 pfam04642 DUF601 Protein of unknown function, DUF601. This family represents a conserved region found in several uncharacterized plant proteins. 327
47828 368034 pfam04643 Motilin_assoc Motilin/ghrelin-associated peptide. This family represents a peptide sequence that lies C-terminal to motilin/ghrelin on the respective precursor peptide. Its function is unknown. 59
47829 398361 pfam04644 Motilin_ghrelin Motilin/ghrelin. Motilin is a gastrointestinal regulatory polypeptide produced by motilin cells in the duodenal epithelium. It is released into the general circulation at about 100-min intervals during the inter-digestive state and is the most important factor in controlling the inter-digestive migrating contractions. Motilin also stimulates endogenous release of the endocrine pancreas. This family also includes ghrelin, a growth hormone secretagogue synthesized by endocrine cells in the stomach. Ghrelin stimulates growth hormone secretagogue receptors in the pituitary. These receptors are distinct from the growth hormone-releasing hormone receptors, and thus provide a means of controlling pituitary growth hormone release by the gastrointestinal system. 28
47830 282496 pfam04645 DUF603 Protein of unknown function, DUF603. This family includes several uncharacterized proteins from Borrelia species. 181
47831 398362 pfam04646 DUF604 Protein of unknown function, DUF604. This family includes a conserved region found in several uncharacterized plant proteins. 256
47832 398363 pfam04647 AgrB Accessory gene regulator B. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. 185
47833 398364 pfam04648 MF_alpha Yeast mating factor alpha hormone. The hormone is excreted into the culture medium by haploid cells of the alpha mating type and acts on cells of the opposite mating type (type A). It inhibits DNA synthesis in type A cells synchronising them with type alpha, and so mediates the conjugation process. 13
47834 68229 pfam04649 VlpA_repeat Mycoplasma hyorhinis VlpA repeat. This repeat is found in the extracellular (C-terminal) region of the variant surface antigen A (VlpA) of Mycoplasma hyorhinis. Mutations that change the number of repeats in the protein are involved in antigenic variation and immune evasion of this swine pathogen. 13
47835 398365 pfam04650 YSIRK_signal YSIRK type signal peptide. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus. 25
47836 282501 pfam04651 Pox_A12 Poxvirus A12 protein. 183
47837 398366 pfam04652 Vta1 Vta1 like. Vta1 (VPS20-associated protein 1) is a positive regulator of Vps4. Vps4 is an ATPase that is required in the multivesicular body (MVB) sorting pathway to dissociate the endosomal sorting complex required for transport (ESCRT). Vta1 promotes correct assembly of Vps4 and stimulates its ATPase activity through its conserved Vta1/SBP1/LIP5 region. 133
47838 398367 pfam04654 DUF599 Protein of unknown function, DUF599. This family includes several uncharacterized proteins. 211
47839 398368 pfam04655 APH_6_hur Aminoglycoside/hydroxyurea antibiotic resistance kinase. The aminoglycoside phosphotransferases achieve inactivation of their antibiotic substrates by phosphorylation utilising ATP. Likewise hydroxyurea is inactivated by phosphorylation of the hydroxy group in the hydroxylamine moiety. 250
47840 282505 pfam04656 Pox_E6 Pox virus E6 protein. Family of pox virus E6 proteins. 566
47841 398369 pfam04657 DMT_YdcZ Putative inner membrane exporter, YdcZ. DMT_YdcZ is a family of putative inner membrane exporters from both Gram-positive and Gram-negative bacteria. 139
47842 398370 pfam04658 TAFII55_N TAFII55 protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein. 161
47843 398371 pfam04659 Arch_fla_DE Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD. 96
47844 282509 pfam04660 Nanovirus_coat Nanovirus coat protein. Family of conserved Nanoviral coat proteins. 177
47845 282510 pfam04661 Pox_I3 Poxvirus I3 ssDNA-binding protein. 257
47846 252728 pfam04662 Luteo_PO Luteovirus P0 protein. This family of proteins may be involved in suppression of PTGS, a plant defense mechanism. 208
47847 398372 pfam04663 Phenol_monoox Phenol hydroxylase conserved region. Under aerobic conditions, phenol is usually hydroxylated to catechol and degraded via the meta or ortho pathways. Two types of phenol hydroxylase are known: one is a multi-component enzyme; the other is a single-component monooxygenase. This region is found in both types of enzymes. 66
47848 398373 pfam04664 OGFr_N Opioid growth factor receptor (OGFr) conserved region. Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in neurotransmission/neuromodulation in the nervous system. The Opioid growth factor receptor is an integral membrane protein associated with the nucleus. The conserved region is situated at the N-terminus of the member proteins with a series of imperfect repeats lying immediately to its C-terminus. 208
47849 282513 pfam04665 Pox_A32 Poxvirus A32 protein. The A32 protein is thought to be involved in viral DNA packaging. 241
47850 398374 pfam04666 Glyco_transf_54 N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region. The complex-type of oligosaccharides are synthesized through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very far N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein. 278
47851 398375 pfam04667 Endosulfine cAMP-regulated phosphoprotein/endosulfine conserved region. Conserved region found in both cAMP-regulated phosphoprotein 19 (ARPP-19) and Alpha/Beta endosulfine. No function has yet been assigned to ARPP-19. Endosulfine is the endogenous ligand for the ATP-dependent potassium (K ATP) channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism. In both cases the region occupies the majority of the protein. 80
47852 398376 pfam04668 Tsg Twisted gastrulation (Tsg) protein conserved region. Tsg was identified in Drosophila as being required to specify the dorsal-most structures in the embryo, for example amnioserosa. Biochemical experiments have revealed three key properties of Tsg: it can synergistically inhibit Dpp/BMP action in both Drosophila and vertebrates by forming a tripartite complex between itself, SOG/chordin and a BMP ligand; Tsg seems to enhance the Tld/BMP-1-mediated cleavage rate of SOG/chordin and may change the preference of site utilisation; Tsg can promote the dissociation of chordin cysteine-rich-containing fragments from the ligand to inhibit BMP signalling. 97
47853 368048 pfam04669 Polysacc_synt_4 Polysaccharide biosynthesis. This family of proteins plays a role in xylan biosynthesis in plant cell walls. The precise role of IRX15/IRX15-L in xylan biosynthesis is unknown. Glucuronoxylan methyltransferase (GXMT) catalyzes 4-O-methylation of the glucuronic acid substituents of this polysaccharide. AtGXMT1 specifically transfers the methyl group from S-adenosyl-l-methionine to O-4 of alpha-d-glucopyranosyluronic acid residues that are linked to O-2 of the xylan backbone. The function of members of this family in animals and fungi is not known. 177
47854 398377 pfam04670 Gtr1_RagA Gtr1/RagA G protein conserved region. GTR1 was first identified in S. cerevisiae as a suppressor of a mutation in RCC1. Biochemical analysis revealed that Gtr1 is in fact a G protein of the Ras family. The RagA/B proteins are the human homologs of Gtr1. Included in this family is the human Rag C, a novel protein that has been shown to interact with RagA/B. 231
47855 309696 pfam04671 Ag332 Erythrocyte membrane-associated giant protein antigen 332. To date many different Plasmodium antigens recognized by hyperimmune human sera have been cloned, sequenced and characterized. The majority contain tandemly repeated amino acid sequences which make up a considerable portion of the protein sequence. It has been suggested that these repeat-containing antigens may provide an immunological 'smokescreen' to the parasite in order to evade the human immune system. This repeat is found exclusively in the Plasmodium falciparum Ag332 protein and occupies most of its length. 21
47856 252734 pfam04672 Methyltransf_19 S-adenosyl methyltransferase. This family contains a SAM (S-adenosyl methyltransferase) domain, with a central beta sheet with 3 alpha-helices on both sides. Crystal packing analysis of the structure 3giw suggests that a monomer is the solution state oligomeric form. An unidentified ligand (UNL, cyan) was found at the putative active site surrounded by the residues His57, His170, Phe171, Tyr216 and Met22. The UNL is likely to be a phenylalanine or phenylalanine-like molecule. (details derived from TOPSAN). 268
47857 398378 pfam04673 Cyclase_polyket Polyketide synthesis cyclase. This family represents a number of cyclases involved in polyketide synthesis in a number of actinobacterial species. 104
47858 398379 pfam04674 Phi_1 Phosphate-induced protein 1 conserved region. Family of conserved plant proteins. Conserved region identified in a phosphate-induced protein of unknown function. 265
47859 398380 pfam04675 DNA_ligase_A_N DNA ligase N-terminus. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I, and in Saccharomyces cerevisiae, this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In vaccinia virus this region was not essential for catalysis, but deletion decreased the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation. 169
47860 398381 pfam04676 CwfJ_C_2 Protein similar to CwfJ C-terminus 2. This region is found in the N-terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. 96
47861 309701 pfam04677 CwfJ_C_1 Protein similar to CwfJ C-terminus 1. This region is found in the N-terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. 122
47862 398382 pfam04678 MCU Mitochondrial calcium uniporter. MCU functions with MICU1, an essential gatekeeper component of calcium-channel transport, to facilitate Ca2+ uptake into the mitochondrion. 179
47863 398383 pfam04679 DNA_ligase_A_C ATP dependent DNA ligase C terminal region. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to constitute part of the catalytic core of ATP dependent DNA ligase. 94
47864 282527 pfam04680 OGFr_III Opioid growth factor receptor repeat. Proline-rich repeat found only in a human opioid growth factor receptor. 20
47865 113449 pfam04681 Bys1 Blastomyces yeast-phase-specific protein. The molecular function of this protein is not known. Its expression is specific to the high temperature, unicellular yeast morphology (as opposed to the lower temperature, multicellular mycelium form). 155
47866 282528 pfam04682 Herpes_BTRF1 Herpesvirus BTRF1 protein conserved region. Herpesvirus protein. 258
47867 398384 pfam04683 Proteasom_Rpn13 Proteasome complex subunit Rpn13 ubiquitin receptor. This family was thought originally to be involved in cell-adhesion, but the members are now known to be proteasome subunit Rpn13, a novel ubiquitin receptor. The 26S proteasome is a huge macromolecular protein-degradation machine consisting of a proteolytically active 20S core, in the form of four disc-like proteins, and one or two 19S regulatory particles. The regulatory particle(s) sit on the top and/or bottom of the core, de-ubiquitinate the substrate peptides, unfold them and guide them into the narrow channel through the centre of the core. Rpn13 and its homologs dock onto the regulatory particle through the N-terminal region which binds Rpn2. The C-terminal part of the domain binds de-ubiquitinating enzyme Uch37/UCHL5 and enhances its isopeptidase activity. Rpn13 binds ubiquitin via a conserved amino-terminal region called the pleckstrin-like receptor for ubiquitin, termed Pru, domain. The domain forms two contiguous anti-parallel beta-sheets with a configuration similar to the pleckstrin-homology domain (PHD) fold. Rpn13's ability to bind ubiquitin and the proteasome subunit Rpn2/S1 simultaneously supports evidence of its role as a ubiquitin receptor. Finally, when complexed to di-ubiquitin, via the Pru, and Uch37 via the C-terminal part, it frees up the distal ubiquitin for de-ubiquitination by the Uch37. 87
47868 398385 pfam04684 BAF1_ABF1 BAF1 / ABF1 chromatin reorganising factor. ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in the S. cerevisiae protein). The N-terminal two thirds of the protein are necessary for DNA binding, and the N-terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilize the protein structure. 501
47869 398386 pfam04685 DUF608 Glycosyl-hydrolase family 116, catalytic region. This represents a family of archaeal, bacterial and eukaryotic glycosyl hydrolases, that belong to superfamily GH116. The primary catabolic pathway for glucosylceramide is catalysis by the lysosomal enzyme glucocerebrosidase. In higher eukaryotes, glucosylceramide is the precursor of glycosphingolipids, a complex group of ubiquitous membrane lipids. Mutations in the human protein cause motor-neurone defects in hereditary spastic paraplegia. The catalytic nucleophile, identified in UniProtKB:Q97YG8_SULSO, is a glutamine-335, with the likely acid/base at Asp-442 and the aspartates at Asp-406 and Asp-458 residues also playing a role in the catalysis of glucosides and xylosides that are beta-bound to hydrophobic groups. The family is defined as GH116, which presently includes enzymes with beta-glucosidase, EC:3.2.1.21, beta-xylosidase, EC:3.2.1.37, and glucocerebrosidase EC:3.2.1.45 activity. 362
47870 398387 pfam04686 SsgA Streptomyces sporulation and cell division protein, SsgA. The precise function of SsgA is unknown. It has been found to be essential for spore formation, and to stimulate cell division. 97
47871 282533 pfam04687 Microvir_H Microvirus H protein (pilot protein). A single molecule of H protein is found on each of the 12 spikes on the microvirus shell. H is involved in the ejection of the phage DNA, and at least one copy is injected into the host's periplasmic space along with the ssDNA viral genome. Part of H is thought to lie outside the shell, where it recognizes lipopolysaccharide from virus-sensitive strains. Part of H may lie within the capsid, since mutations in H can influence the DNA ejection mechanism by affecting the DNA-protein interactions. H may span the capsid through the hydrophilic channels formed by G proteins. Elucidation of the DNA-ejection mechanism from the crystal structure of part of the H protein shows that this tail-less icosahedral, single-stranded DNA phiX174-like coliphage bacteriophage requires H as a pilot protein for its DNA-delivery. H oligomerizes to form a tube the function of which seems to be the delivery of the DNA genome across the host's periplasmic space into the host cytoplasm. The tube is constructed of ten alpha-helices with their amino termini arrayed in a right-handed super-helical coiled-coil and their carboxy termini arrayed in a left-handed super-helical coiled-coil. The tube spans the periplasmic space and is present while the genome is being delivered into the host cell's cytoplasm. 310
47872 398388 pfam04688 Holin_SPP1 SPP1 phage holin. This family constitutes holin proteins from the dsDNA Siphoviridae group bacteriophages with two transmembrane segments. Most bacteriophages require an endolysin and a holin for host lysis. During late gene expression, holins accumulate and oligomerize in the host cell membrane. They then suddenly trigger to permeabilise the membrane, which causes lysis by allowing endolysin to attack the peptidoglycan. There are thought to be at least 35 different families of holin genes. 74
47873 282535 pfam04689 S1FA DNA binding protein S1FA. S1FA is a DNA-binding protein found in plants that specifically recognizes the negative promoter element S1F. 66
47874 398389 pfam04690 YABBY YABBY protein. YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs. 163
47875 398390 pfam04691 ApoC-I Apolipoprotein C-I (ApoC-1). Apolipoprotein C-I (ApoC-1) is a water-soluble protein component of plasma lipoprotein. It solubilises lipids and regulates lipid metabolism. ApoC-1 transfers among HDL (high density lipoprotein), VLDL (very low-density lipoprotein) and chylomicrons. ApoC-1 activates lecithin:cholesterol acyltransferase (LCAT), inhibits cholesteryl ester transfer protein, can inhibit hepatic lipase and phospholipase 2 and can stimulate cell growth. ApoC-1 delays the clearance of beta-VLDL by inhibiting its uptake via the LDL receptor-related pathway. ApoC-1 has been implicated in hypertriglyceridemia, and Alzheimer's disease. 60
47876 398391 pfam04692 PDGF_N Platelet-derived growth factor, N terminal region. This family consists of the amino terminal regions of platelet-derived growth factor (PDGF, pfam00341) A and B chains. 77
47877 147046 pfam04693 DDE_Tnp_2 Archaeal putative transposase ISC1217. 327
47878 282539 pfam04694 Corona_3 Coronavirus ORF3 protein. 59
47879 398392 pfam04695 Pex14_N Peroxisomal membrane anchor protein (Pex14p) conserved region. Family of peroxisomal membrane anchor proteins which bind the PTS1 (peroxisomal targeting signal) receptor and are required for the import of PTS1-containing proteins into peroxisomes. Loss of functional Pex14p results in defects in both the PTS1 and PTS2-dependent import pathways. Deletion analysis of this conserved region implicates it in selective peroxisome degradation. In the majority of members this region is situated at the N-terminus of the protein. 46
47880 398393 pfam04696 Pinin_SDK_memA pinin/SDK/memA/ protein conserved region. Members of this family have very varied localizations within the eukaryotic cell. pinin is known to localize at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque. SDK2/3 is a dynamically localized nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing. memA is a tumor marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions. 130
47881 398394 pfam04697 Pinin_SDK_N pinin/SDK conserved region. SDK2/3 is localized in nuclear speckles whereas pinin is known to localize at the desmosomes where it is thought to be involved in anchoring intermediate filaments to the desmosomal plaque. The role of SDK2/3 in the nucleus is thought to be concerned with modulation of alternative pre-mRNA splicing. pinin has also been implicated as a tumor suppressor. The conserved region is found at the N-terminus of the member proteins. 132
47882 398395 pfam04698 Rab_eff_C Rab effector MyRIP/melanophilin C-terminus. This domain is found at the C-terminus of the Rab effector proteins MyRIP and melanophilin. 715
47883 398396 pfam04699 P16-Arc ARP2/3 complex 16 kDa subunit (p16-Arc). The Arp2/3 protein complex has been implicated in the control of actin polymerization. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. The precise function of p16-Arc is currently unknown. Its structure consists of a single domain containing a bundle of seven alpha helices. 147
47884 282545 pfam04700 Baculo_gp41 Structural glycoprotein p40/gp41 conserved region. Family of viral structural glycoproteins. 185
47885 282546 pfam04701 Pox_D2 Pox virus D2 protein. 139
47886 252749 pfam04702 Vicilin_N Vicilin N terminal region. This region is found in plant seed storage proteins, N-terminal to the Cupin domain (pfam00190). In Macadamia integrifolia, this region is processed into peptides of approximately 50 amino acids containing a C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides exhibit antimicrobial activity in vitro. 147
47887 398397 pfam04703 FaeA FaeA-like protein. This family represents a number of fimbrial protein transcription regulators found in Gram-negative bacteria. These proteins are thought to facilitate binding of the leucine-rich regulatory protein to regulatory elements, possibly by inhibiting deoxyadenosine methylation of these elements by deoxyadenosine methylase. 61
47888 398398 pfam04704 Zfx_Zfy_act Zfx / Zfy transcription activation region. Zfx and Zfy are transcription factors implicated in mammalian sex determination. This region is found N terminal to multiple copies of a C2H2 Zinc finger (pfam00096). This region has been shown to activate transcription when fused to a GAL4 DNA binding domain. 328
47889 368067 pfam04705 TSNR_N Thiostrepton-resistance methylase, N-terminus. This region is found in some members of the SpoU-type rRNA methylase family (pfam00588). 111
47890 398399 pfam04706 Dickkopf_N Dickkopf N-terminal cysteine-rich region. Dickkopf proteins are a class of Wnt antagonists. They possess two conserved cysteine-rich regions. This family represents the N-terminal one. The C-terminal region has been found to share significant sequence similarity to the colipase fold, pfam01114, pfam02740. 50
47891 398400 pfam04707 PRELI PRELI-like family. This family includes a conserved region found in the PRELI protein and yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. This region is also found in a number of other eukaryotic proteins. 156
47892 282551 pfam04708 Pox_F16 Poxvirus F16 protein. 215
47893 398401 pfam04709 AMH_N Anti-Mullerian hormone, N terminal region. Anti-Mullerian hormone, AMH is a signalling molecule involved in male and female sexual differentiation. Defects in synthesis or action of AMH cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism. This family represents the N terminal part of the protein, which is not thought to be essential for activity. AMH contains a TGF-beta domain (pfam00019), at the C-terminus. 390
47894 398402 pfam04710 Pellino Pellino. Pellino is involved in Toll-like signalling pathways, and associates with the kinase domain of the Pelle Ser/Thr kinase. 409
47895 398403 pfam04711 ApoA-II Apolipoprotein A-II (ApoA-II). Apolipoprotein A-II (ApoA-II) is the second major apolipoprotein of high density lipoprotein in human plasma. Mature ApoA-II is present as a dimer of two 77-amino acid chains joined by a disulphide bridge. ApoA-II regulates many steps in HDL metabolism, and its role in coronary heart disease is unclear. In bovine serum, the ApoA-II homolog is present in almost free form. Bovine ApoA-II shows antimicrobial activity against Escherichia coli and yeasts in phosphate buffered saline (PBS). 75
47896 398404 pfam04712 Radial_spoke Radial spokehead-like protein. This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologs, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene. 493
47897 282556 pfam04713 Pox_I5 Poxvirus protein I5. 75
47898 398405 pfam04714 BCL_N BCL7, N-terminal conserved region. Members of the BCL family have significant sequence similarity at their N-terminus, represented in this family. The function of BCL7 proteins is unknown. They may be involved in early development. In addition, BCL7B is commonly hemizygously deleted in patients with Williams syndrome. 48
47899 398406 pfam04715 Anth_synt_I_N Anthranilate synthase component I, N terminal region. Anthranilate synthase (EC:4.1.3.27) catalyzes the first step in the biosynthesis of tryptophan. Component I catalyzes the formation of anthranilate using ammonia and chorismate. The catalytic site lies in the adjacent region, described in the chorismate binding enzyme family (pfam00425). This region is involved in feedback inhibition by tryptophan. This family also contains a region of Para-aminobenzoate synthase component I (EC 4.1.3.-). 141
47900 398407 pfam04716 ETC_C1_NDUFA5 ETC complex I subunit conserved region. Family of eukaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins. 66
47901 398408 pfam04717 Phage_base_V Type VI secretion system, phage-baseplate injector. Family of bacterial and phage baseplate assembly proteins responsible for forming the small spike at the end of the tail or bacterial pathogenic needle-shaft. 75
47902 398409 pfam04718 ATP-synt_G Mitochondrial ATP synthase g subunit. The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). The function of subunit g is currently unknown. The conserved region covers all but the very N-terminus of the member sequences. No prokaryotic members have been identified thus far. 92
47903 398410 pfam04719 TAFII28 hTAFII28-like protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C-terminal of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. The conserved region contains four alpha helices and three loops arranged as in histone H3. 85
47904 398411 pfam04720 PDDEXK_6 PDDEXK-like family of unknown function. PDDEXK_6 is a family of plant proteins that are distant homologs of the PD-(D/E)XK nuclease superfamily. The core structure is retained, as alpha-beta-beta-beta-alpha-beta. It retains the characteristic PDDEXK motifs II and III in modified forms - xDxxx motif located in the second core beta-strand, where x is any hydrophobic residue, and a (D/E)X(D/N/S/C/G) pattern. The missing positively charged residue in motif III is possibly replaced by a conserved arginine in motif IV located in the proceeding alpha-helix. The family is not in general fused with any other domains, so its function cannot be predicted. 215
47905 398412 pfam04721 PAW PNGase C-terminal domain, mannose-binding module PAW. The PAW domain is found at the C-terminus of PNGase, or peptide-N-glycanase, enzymes. It was named for 'domain present in PNGases and other worm proteins'. PNGase catalyzes the deglycosylation of several misfolded N-linked glycoproteins by cleaving off the bulky glycan chain before these proteins are degraded by the proteasome. PNGase specifically acts on the unfolded form of high-mannose type N-glycosylated proteins, and this domain appears to be the mannose-binding domain, which contributes to the oligosaccharide-binding specificity of PNGase. 197
47906 398413 pfam04722 Ssu72 Ssu72-like protein. The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs. 189
47907 377402 pfam04723 GRDA Glycine reductase complex selenoprotein A. Found in clostridia, this protein contains one active site selenocysteine and catalyzes the reductive deamination of glycine, which is coupled to the esterification of orthophosphate resulting in the formation of ATP. A member of this family may also exist in Treponema denticola. 147
47908 398414 pfam04724 Glyco_transf_17 Glycosyltransferase family 17. This family represents beta-1,4-mannosyl-glycoprotein beta-1,4-N-acetylglucosaminyltransferase (EC:2.4.1.144). This enzyme transfers the bisecting GlcNAc to the core mannose of complex N-glycans. The addition of this residue is regulated during development and has functional consequences for receptor signalling, cell adhesion, and tumor progression. 349
47909 309736 pfam04725 PsbR Photosystem II 10 kDa polypeptide PsbR. This protein is associated with the oxygen-evolving complex of photosystem II. Its function in photosynthesis is not known. The C-terminal hydrophobic region functions as a thylakoid transfer signal but is not removed. 98
47910 282569 pfam04726 Microvir_J Microvirus J protein. This small protein is involved in DNA packaging, interacting with DNA via its hydrophobic carboxyl terminus. In bacteriophage phi-X174, J is present in 60 copies, and forms an S-shaped polypeptide chain without any secondary structure. It is thought to interact with DNA through simple charge interactions. 37
47911 398415 pfam04727 ELMO_CED12 ELMO/CED-12 family. This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. elegans gene, ced-12. CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C-terminus which are not present in this alignment. 165
47912 398416 pfam04728 LPP Lipoprotein leucine-zipper. This leucine-zipper is found in the enterobacterial outer membrane lipoprotein LPP. It is likely that this domain oligomerizes and is involved in protein-protein interactions. As such it is a bundle of alpha-helical coiled-coils, which are known to play key roles in mediating specific protein-protein interactions in molecular recognition and the assembly of multi-protein complexes. 53
47913 398417 pfam04729 ASF1_hist_chap ASF1 like histone chaperone. This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers. 154
47914 368086 pfam04730 Agro_virD5 Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterized products. This family represents the VirD5 protein. 672
47915 398418 pfam04731 Caudal_act Caudal like protein activation region. This family consists of the amino termini of proteins belonging to the caudal-related homeobox protein family. This region is thought to mediate transcription activation. The level of activation caused by mouse Cdx2 is affected by phosphorylation at serine 60 via the mitogen-activated protein kinase pathway. Caudal family proteins are involved in the transcriptional regulation of multiple genes expressed in the intestinal epithelium, and are important in differentiation and maintenance of the intestinal epithelial lining. Caudal proteins always have a homeobox DNA binding domain (pfam00046). 127
47916 368088 pfam04732 Filament_head Intermediate filament head (DNA binding) region. This family represents the N-terminal head region of intermediate filaments. Intermediate filament heads bind DNA. Vimentin heads are able to alter nuclear architecture and chromatin distribution, and the liberation of heads by HIV-1 protease may play an important role in HIV-1 associated cytopathogenesis and carcinogenesis. Phosphorylation of the head region can affect filament stability. The head has been shown to interact with the rod domain of the same protein. 83
47917 398419 pfam04733 Coatomer_E Coatomer epsilon subunit. This family represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex. 288
47918 398420 pfam04734 Ceramidase_alk Neutral/alkaline non-lysosomal ceramidase, N-terminal. This family represents N-terminal domain of a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. The EC classification is EC:3.5.1.23. The enzyme hydrolyzes ceramide to generate sphingosine and fatty acid. The enzyme plays a regulatory role in a variety of physiological events in eukaryotes and also functions as an exotoxin in particular bacteria. This N-terminal domain carries two metal-binding sites, the first for Zn2+ residing within the domain, and the second, for Mg2+/Ca2+ lying at the interface between the two domains. 473
47919 282577 pfam04735 Baculo_helicase Baculovirus DNA helicase. 1307
47920 368091 pfam04736 Eclosion Eclosion hormone. Eclosion hormone is an insect neuropeptide that triggers the performance of ecdysis behaviour, which causes shedding of the old cuticle at the end of a molt. 61
47921 398421 pfam04738 Lant_dehydr_N Lantibiotic dehydratase, N-terminus. Lantibiotics are ribosomally synthesized antimicrobial agents derived from ribosomally synthesized peptides. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues. This family constitutes the N-terminus of the enzyme proposed to catalyze the dehydration step, via glutamylation of the substrate during lantibiotic biosynthesis. The enzyme dehydrates Ser/Thr residues in the precursor by glutamylation. 648
47922 398422 pfam04739 AMPKBI 5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologs Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain (pfam02922) is sometimes found in proteins belonging to this family. 69
47923 398423 pfam04740 LXG LXG domain of WXG superfamily. This domain is present in the N-terminal region of a group of polymorphic toxin proteins in bacteria. It is predicted to use Type VII secretion pathway to mediate export of bacterial toxins. 202
47924 282582 pfam04741 InvH InvH outer membrane lipoprotein. This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localization to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion. 147
47925 398424 pfam04744 Monooxygenase_B Monooxygenase subunit B protein. Family of membrane associated monooxygenases (EC 1.13.12.-) which utilize O(2) to oxidize their substrate. Family members include both ammonia and methane monooxygenases involved in the oxidation of their respective substrates. These enzymes are multi-subunit complexes. This family represents the B subunit of the enzyme; the A subunit is thought to contain the active site. 379
47926 282584 pfam04745 Pox_A8 VITF-3 subunit protein. Family of Chordopoxvirus proteins composing one of the two subunits that make up VITF-3, a virally encoded complex necessary for intermediate stage transcription. 289
47927 113513 pfam04746 DUF575 Protein of unknown function (DUF575). Family of uncharacterized proteins. Contains several chlamydial members. 101
47928 282585 pfam04747 DUF612 Protein of unknown function, DUF612. This family includes several uncharacterized proteins from Caenorhabditis elegans. 511
47929 398425 pfam04748 Polysacc_deac_2 Divergent polysaccharide deacetylase. This family is divergently related to pfam01522 (personal obs:Yeats C). 212
47930 398426 pfam04749 PLAC8 PLAC8 family. This family includes the Placenta-specific gene 8 protein. 100
47931 368096 pfam04750 Far-17a_AIG1 FAR-17a/AIG1-like protein. This family includes the hamster androgen-induced FAR-17a protein, and its human homolog, the AIG1 protein. The function of these proteins is unknown. This family also includes homologous regions from a number of other metazoan proteins. 206
47932 398427 pfam04751 DUF615 Protein of unknown function (DUF615). This family of bacterial proteins has no known function. 139
47933 398428 pfam04752 ChaC ChaC-like protein. The ChaC family of proteins function as gamma-glutamyl cyclotransferases acting specifically to degrade glutathione but not other gamma-glutamyl peptides. It is conserved across all phyla and represents a new pathway for glutathione degradation in living cells. 176
47934 398429 pfam04753 Corona_NS2 Coronavirus non-structural protein NS2. 109
47935 368098 pfam04754 Transposase_31 Putative transposase, YhgA-like. This family of putative transposases includes the YhgA sequence from Escherichia coli and several prokaryotic homologs. 202
47936 309752 pfam04755 PAP_fibrillin PAP_fibrillin. This family identifies a conserved region found in a number of plastid lipid-associated proteins (PAPs), and in a number of putative fibrillin proteins. 196
47937 398430 pfam04756 OST3_OST6 OST3 / OST6 family, transporter family. The proteins in this family are part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation. These proteins are also thought to be novel Mg2+ transporters. 294
47938 398431 pfam04757 Pex2_Pex12 Pex2 / Pex12 amino terminal region. This region is found at the N terminal of a number of known and predicted peroxins including Pex2, Pex10 and Pex12. This conserved region is usually associated with a C terminal ring finger (pfam00097) domain. 213
47939 398432 pfam04758 Ribosomal_S30 Ribosomal protein S30. 58
47940 398433 pfam04759 DUF617 Protein of unknown function, DUF617. This family represents a conserved region in a number of uncharacterized plant proteins. 163
47941 398434 pfam04760 IF2_N Translation initiation factor IF-2, N-terminal region. This conserved feature at the N-terminus of bacterial translation initiation factor IF2 has recently had its structure solved. It shows structural similarity to the tRNA anticodon Stem Contact Fold domains of the methionyl-tRNA and glutaminyl-tRNA synthetases, and a similar fold is also found in the B5 domain of the phenylalanine-tRNA synthetase. 52
47942 282599 pfam04761 Phage_Treg Lactococcus bacteriophage putative transcription regulator. This family represents a number of putative transcription repressor proteins found in several Lactococcus bacteriophages. Horizontal transfer may account for the presence of similar proteins in Lactococcus. 61
47943 398435 pfam04762 IKI3 IKI3 family. Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. This region contains WD40 like repeats. 920
47944 368104 pfam04763 DUF562 Protein of unknown function (DUF562). Family of uncharacterized proteins. 146
47945 252787 pfam04764 DUF613 Protein of unknown function (DUF613). Family of chloroplast proteins of unknown function. Some members have two copies of the conserved region. 120
47946 398436 pfam04765 DUF616 Protein of unknown function (DUF616). Family of uncharacterized proteins. 303
47947 309761 pfam04766 Baculo_p26 Nucleopolyhedrovirus p26 protein. Family of Baculovirus p26 proteins. 234
47948 282602 pfam04767 Pox_F17 DNA-binding 11 kDa phosphoprotein. Family of poxvirus proteins required for virus morphogenesis. Protein function necessary for proteolytic processing of the major viral structural proteins, P4a and P4b. 93
47949 398437 pfam04768 NAT NAT, N-acetyltransferase, of N-acetylglutamate synthase. This is the C-terminal NAT or N-acetyltransferase domain of bifunctional N-acetylglutamate synthase/kinases. It catalyzes the first two steps in arginine biosynthesis. This domain contains the putative NAGS - N-acetylglutamate synthase - active site. It is found at the C-terminus of Neurospora crassa acetylglutamate synthase - amino-acid acetyltransferase, EC: 2.3.1.1. It is also found C-terminal to the amino acid kinase region (pfam00696) in some fungal acetylglutamate kinase enzymes. It stabilizes the yeast NAGK, N-acetyl-L-glutamate kinase, slows catalysis and modulates feed-back inhibition by arginine. This domain is found to be the N-acetyltransferase (NAT) domain, and it has a typical GCN5-related NAT fold and a site that catalyzes NAG synthesis which is located >25 Angstrom away from the L-arginine binding site in the N-terminal domain pfam00696. 166
47950 398438 pfam04769 MATalpha_HMGbox Mating-type protein MAT alpha 1 HMG-box. This family includes Saccharomyces cerevisiae mating type protein alpha 1. Mat alpha 1 is a transcription activator which activates mating-type alpha-specific genes. MAT alpha 1 and MCM 1 bind cooperatively to PQ elements upstream of alpha-specific genes. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (pfam04648) response pathway. In silico modelling of the MAT_Alpha1 domain indicates that its best scoring templates were structures of HMG-box proteins, and DOI: 10.4236/ojbiphy.2013.31001. Phylogenetic analysis suggests that the MAT_Alpha1 domain diverged from the MATA_HMG-box subfamily. The name of MATalpha_HMG-box was proposed for the MAT_alpha1 domain. 186
47951 398439 pfam04770 ZF-HD_dimer ZF-HD protein dimerization region. This family of proteins are plant transcription factors, and have been named ZF-HD for zinc finger homeodomain proteins, on the basis of similarity to proteins of known structure. This region is thought to be involved in the formation of homo and heterodimers, and may form a zinc finger. 55
47952 282606 pfam04771 CAV_VP3 Chicken anaemia virus VP-3 protein. This protein is found in the nucleus of infected cells and may act as a transcriptional regulator. It induces apoptosis, and is also known as apoptin. 121
47953 282607 pfam04772 Flu_B_M2 Influenza B matrix protein 2 (BM2). M2 is synthesized in the late phase of infection and incorporated into the virion. It may be phosphorylated in vivo. The function of BM2 is unknown. 109
47954 398440 pfam04773 FecR FecR protein. FecR is involved in regulation of iron dicitrate transport. In the absence of citrate FecR inactivates FecI. FecR is probably a sensor that recognizes iron dicitrate in the periplasm. 96
47955 398441 pfam04774 HABP4_PAI-RBP1 Hyaluronan / mRNA binding family. This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. HABP4 has been observed to bind hyaluronan (a glucosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function. 106
47956 398442 pfam04775 Bile_Hydr_Trans Acyl-CoA thioester hydrolase/BAAT N-terminal region. This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters. 120
47957 398443 pfam04776 protein_MS5 Protein MS5. Proteins are known only from species of Brassicaceae. Protein MS5 is essential for pairing of homologs during early prophase stage of meiosis but not necessary for the initiation of DNA double-strand breaks. 119
47958 398444 pfam04777 Evr1_Alr Erv1 / Alr family. Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter. 91
47959 398445 pfam04778 LMP LMP repeated region. This family consists of a repeated sequence element found in the LMP group of surface-located membrane proteins of Mycoplasma hominis. The number of repeats in the protein affects the tendency of cells to spontaneously aggregate. Agglutination may be an important factor in colonisation. Non-agglutinating microorganisms might easily be distributed whereas aggregation might provide a better chance to avoid an antibody response since some of the epitopes may be buried. 158
47960 398446 pfam04780 DUF629 Protein of unknown function (DUF629). This family represents a region of several plant proteins of unknown function. A C2H2 zinc finger is predicted in this region in some family members, but the spacing between the cysteine residues is not conserved throughout the family. 465
47961 398447 pfam04781 DUF627 Protein of unknown function (DUF627). This family represents the N-terminal region of several plant proteins of unknown function. 108
47962 398448 pfam04782 DUF632 Protein of unknown function (DUF632). This plant protein may be a leucine zipper, but there is no experimental evidence for this. 311
47963 398449 pfam04783 DUF630 Protein of unknown function (DUF630). This region is sometimes found at the N-terminus of putative plant bZIP proteins. Its function is not known. Structural modelling suggests this domain may bind nucleic acids. 59
47964 398450 pfam04784 DUF547 Protein of unknown function, DUF547. Family of uncharacterized proteins from C. elegans and A. thaliana. 120
47965 282619 pfam04785 Rhabdo_M2 Rhabdovirus matrix protein M2. M protein is involved in condensing and targeting the ribonucleoprotein (RNP) coil to the plasma membrane. M interacts specifically with the transmembrane spike protein (G) and is important for the incorporation of G protein into budding virions. 202
47966 398451 pfam04786 Baculo_DNA_bind ssDNA binding protein. Family of Baculovirus ssDNA binding proteins. 248
47967 282621 pfam04787 Pox_H7 Late protein H7. Family of poxvirus late H7 proteins. 143
47968 398452 pfam04788 DUF620 Protein of unknown function (DUF620). Family of uncharacterized proteins. 243
47969 398453 pfam04789 DUF621 Protein of unknown function (DUF621). Family of uncharacterized proteins. Some are annotated as having possible G-protein-coupled receptor-like activity. 301
47970 398454 pfam04790 Sarcoglycan_1 Sarcoglycan complex subunit protein. The dystrophin glycoprotein complex (DGC) is a membrane-spanning complex that links the interior cytoskeleton to the extracellular matrix in muscle. The sarcoglycan complex is a subcomplex within the DGC and is composed of several muscle-specific, transmembrane proteins (alpha-, beta-, gamma-, delta- and zeta-sarcoglycan). The sarcoglycans are asparagine-linked glycosylated proteins with single transmembrane domains. This family contains beta, gamma and delta members. 253
47971 398455 pfam04791 LMBR1 LMBR1-like membrane protein. Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor. 467
47972 398456 pfam04792 LcrV V antigen (LcrV) protein. Yersinia pestis, the aetiologic agent of plague, secretes a set of environmentally regulated, plasmid pCD1-encoded virulence proteins termed Yops and V antigen (LcrV) by a type III secretion mechanism. LcrV is a multifunctional protein that has been shown to act at the level of secretion control by binding the Ysc inner-gate protein LcrG and to modulate the host immune response by altering cytokine production. LcrV is also necessary for full induction of low-calcium response (LCR) stimulon virulence gene transcription. Family members are not confined to Yersinia pestis. 298
47973 368123 pfam04793 Herpes_BBRF1 BBRF1-like protein. Family of herpesvirus proteins including Epstein-Barr virus protein BBRF1. 282
47974 398457 pfam04794 YdjC YdjC-like protein. Family of YdjC-like proteins. This region is possibly involved in the cleavage of cellobiose-phosphate. 193
47975 398458 pfam04795 PAPA-1 PAPA-1-like conserved region. Family of proteins with a conserved region found in PAPA-1, a PAP-1 binding protein. 79
47976 309782 pfam04796 RepA_C Plasmid encoded RepA protein. Family of plasmid encoded proteins involved in plasmid replication. The role of RepA in the replication process is not clearly understood. 161
47977 309783 pfam04797 Herpes_ORF11 Herpesvirus dUTPase protein. This family of proteins is found in herpesviruses. This family includes proteins called ORF10 and ORF11 amongst others. However, these proteins seem to be related to other dUTPases (pfam00692), suggesting that these proteins are also dUTPases (Bateman A pers. obs.). 372
47978 282632 pfam04798 Baculo_19 Baculovirus 19 kDa protein conserved region. Family of Baculovirus proteins of approximate mass 19 kDa. 143
47979 398459 pfam04799 Fzo_mitofusin fzo-like conserved region. Family of putative transmembrane GTPase. The fzo protein is a mediator of mitochondrial fusion. This conserved region is also found in the human mitofusin protein. 159
47980 398460 pfam04800 ETC_C1_NDUFA4 ETC complex I subunit conserved region. Family of pankaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein. 96
47981 398461 pfam04801 Sin_N Sin-like protein conserved region. Family of higher eukaryotic proteins. SIN was identified as a protein that interacts specifically with SXL (sex lethal) in a yeast two-hybrid assay. The interaction is mediated by one of the SXL RNA binding domains. 426
47982 398462 pfam04802 SMK-1 Component of IIS longevity pathway SMK-1. SMK-1 is a component of the IIS longevity pathway which regulates aging in C. elegans. Specifically, SMK-1 influences DAF-16-dependent regulation of the aging process by regulating the transcriptional specificity of DAF-16 activity. SMK-1 plays a role in longevity by modulating the transcriptional specificity of DAF-16. 191
47983 398463 pfam04803 Cor1 Cor1/Xlr/Xmr conserved region. Cor1 is a component of the chromosome core in the meiotic prophase chromosomes. Xlr is a lymphoid cell specific protein. Xmr is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes. 132
47984 282638 pfam04805 Pox_E10 E10-like protein conserved region. Family of poxvirus proteins. 69
47985 398464 pfam04806 EspF EspF protein repeat. The enteropathogenic Escherichia coli EspF secreted protein induces host cell apoptosis. Its proline-rich structure suggests that it may act by binding to SH3 domains or EVH1 domains of host cell signalling proteins. 47
47986 147122 pfam04807 Gemini_AC4_5 Geminivirus AC4/5 conserved region. 33
47987 113574 pfam04808 CTV_P23 Citrus tristeza virus (CTV) P23 protein. This family consists of protein P23 from the citrus tristeza virus, which is a member of the Closteroviridae. CTV viruses produce more positive than negative RNA strands, and P23 controls this asymmetrical RNA accumulation. Amino acids 42-180 are essential for function and are thought to contain RNA-binding and zinc finger domains. 209
47988 398465 pfam04809 HupH_C HupH hydrogenase expression protein, C-terminal conserved region. This family represents a C-terminal conserved region found in these bacterial proteins necessary for hydrogenase synthesis. Their precise function is unknown. 109
47989 398466 pfam04810 zf-Sec23_Sec24 Sec23/Sec24 zinc finger. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be zinc binding domain. 38
47990 398467 pfam04811 Sec23_trunk Sec23/Sec24 trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. 241
47991 398468 pfam04812 HNF-1B_C Hepatocyte nuclear factor 1 (HNF-1), beta isoform C-terminus. This family consists of a region found within the alpha isoform and at the C-terminus of the beta isoform of the homeobox-containing transcription factor of HNF-1. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 258
47992 398469 pfam04813 HNF-1A_C Hepatocyte nuclear factor 1 (HNF-1), alpha isoform C-terminus. This family consists of an alternative C-terminus of homeobox-containing transcription factor HNF-1, found in the HNF-1A isoform. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription, and HNF-1A, which has this C-terminal extension, transactivates less well than the B and C isoforms. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 89
47993 398470 pfam04814 HNF-1_N Hepatocyte nuclear factor 1 (HNF-1), N-terminus. This family consists of the N-terminus of homeobox-containing transcription factor HNF-1. This region contains a dimerization sequence and an acidic region that may be involved in transcription activation. Mutations and the common Ala/Val 98 polymorphism in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 167
47994 398471 pfam04815 Sec23_helical Sec23/Sec24 helical domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices. 103
47995 398472 pfam04816 TrmK tRNA (adenine(22)-N(1))-methyltransferase. tRNA_MT is a family of bacterial tRNA (adenine(22)-N(1))-methyltransferase enzymes with a Rossmann-like fold. This enzyme carries out the function of N1-adenosine methylation at position 22 of bacterial tRNA. 205
47996 113583 pfam04817 Umbravirus_LDM Umbravirus long distance movement (LDM) family. The long distance movement protein of Umbraviruses mediates the movement of viral RNA through the phloem of infected plants. 231
47997 398473 pfam04818 CTD_bind RNA polymerase II-binding domain. This domain binds to the phosphorylated C-terminal domain (CTD) of RNA polymerase II. 120
47998 398474 pfam04819 DUF716 Family of unknown function (DUF716). This family is equally distributed in both metazoa and plants. Annotation associated with a Nicotiana tabacum mRNA suggests that it may be involved in response to viral attack in plants. However, no clear function has been assigned to this family. 134
47999 398475 pfam04820 Trp_halogenase Tryptophan halogenase. Tryptophan halogenase catalyzes the chlorination of tryptophan to form 7-chlorotryptophan. This is the first step in the biosynthesis of pyrrolnitrin, an antibiotic with broad-spectrum anti-fungal activity. Tryptophan halogenase is NADH-dependent. 457
48000 398476 pfam04821 TIMELESS Timeless protein. The timeless gene in Drosophila melanogaster and its homologs in a number of other insects and mammals (including human) are involved in circadian rhythm control. This family includes related proteins from a number of fungal species. 271
48001 398477 pfam04822 Takusan Takusan. This domain is named takusan, which is a Japanese word meaning 'many'. Members of this family regulate synaptic activity. 85
48002 398478 pfam04823 Herpes_UL49_2 Herpesvirus UL49 tegument protein. 82
48003 335912 pfam04824 Rad21_Rec8 Conserved region of Rad21 / Rec8 like protein. This family represents a conserved region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation. 55
48004 398479 pfam04825 Rad21_Rec8_N N-terminus of Rad21 / Rec8 like protein. This family represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation. 104
48005 398480 pfam04826 Arm_2 Armadillo-like. This domain contains armadillo-like repeats. Proteins containing this domain interact with numerous other proteins, through these interactions they are involved in a wide variety of processes including carcinogenesis, control of cellular ageing and survival, regulation of circadian rhythm and lysosomal sorting of G protein-coupled receptors. 252
48006 203098 pfam04827 Plant_tran Plant transposon protein. This family contains plant transposases which are putative members of the PIF / Ping-Pong family. 205
48007 398481 pfam04828 GFA Glutathione-dependent formaldehyde-activating enzyme. The GFA enzyme catalyzes the first step in the detoxification of formaldehyde. This domain has a beta-tent fold. 93
48008 398482 pfam04829 PT-VENN Pre-toxin domain with VENN motif. This family represents a conserved region found in many bacterial polymorphic toxins which is located before the C-terminal toxin modules. 52
48009 377410 pfam04830 DUF637 Possible hemagglutinin (DUF637). This family represents a conserved region found in a bacterial protein which may be a hemagglutinin or hemolysin. 170
48010 398483 pfam04831 Popeye Popeye protein conserved region. The function of Popeye proteins is not well understood. They are predominantly expressed in cardiac and skeletal muscle. This family represents a conserved region which includes three potential transmembrane domains. 226
48011 398484 pfam04832 SOUL SOUL heme-binding protein. This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologs. 173
48012 398485 pfam04833 COBRA COBRA-like protein. This family of plant proteins is designated COBRA-like (COBL) proteins. The 12 Arabidopsis members of the family are all GPI-linked. Some members of this family are annotated as phytochelatin synthase, but these annotations are incorrect. 167
48013 309809 pfam04834 Adeno_E3_14_5 Early E3 14.5 kDa protein. The E3B 14.5 kDa was first identified in Human adenovirus type 5. It is an integral membrane protein oriented with its C-terminus in the cytoplasm. It functions to down-regulate the epidermal growth factor receptor and prevent tumor necrosis factor cytolysis. It achieves this through the interaction with E3 10.4 kDa protein. 100
48014 282664 pfam04835 Pox_A9 A9 protein conserved region. Family of Chordopoxvirus A9 proteins. 53
48015 398486 pfam04836 IFRD_C Interferon-related protein conserved region. Family of proteins thought to be involved in regulating gene activity in the proliferative and/or differentiative pathways induced by NGF. 52
48016 113603 pfam04837 MbeB_N MbeB-like, N-term conserved region. This family represents an N-terminal conserved region of MbeB/MobB proteins. These proteins are essential for specific plasmid transfer. 52
48017 282666 pfam04838 Baculo_LEF5 Baculoviridae late expression factor 5. 156
48018 398487 pfam04839 PSRP-3_Ycf65 Plastid and cyanobacterial ribosomal protein (PSRP-3 / Ycf65). This small acidic protein is found in 30S ribosomal subunit of cyanobacteria and plant plastids. In plants it has been named plastid-specific ribosomal protein 3 (PSRP-3), and in cyanobacteria it is named Ycf65. Plastid-specific ribosomal proteins may mediate the effects of nuclear factors on plastid translation. The acidic PSRPs are thought to contribute to protein-protein interactions in the 30S subunit, and are not thought to bind RNA. 47
48019 282668 pfam04840 Vps16_C Vps16, C-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known. 320
48020 252829 pfam04841 Vps16_N Vps16, N-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known. 408
48021 368150 pfam04842 DUF639 Plant protein of unknown function (DUF639). Plant protein of unknown function. 230
48022 398488 pfam04843 Herpes_teg_N Herpesvirus tegument protein, N-terminal conserved region. 158
48023 398489 pfam04844 Ovate Transcriptional repressor, ovate. This is a family of transcriptional repressors. In plants, these proteins are important regulators of growth and development. 58
48024 282672 pfam04845 PurA PurA ssDNA and RNA-binding protein. This family represents most of the length of the protein. 219
48025 282673 pfam04846 Herpes_pp38 Herpesvirus pp38 phosphoprotein. This protein represents a conserved region found in most herpesvirus pp38 phosphoproteins. 63
48026 282674 pfam04847 Calcipressin Calcipressin. Calcipressin is also known as calcineurin-binding protein, since it inhibits calcineurin-mediated transcriptional modulation by binding to calcineurin's catalytic domain. 183
48027 282675 pfam04848 Pox_A22 Poxvirus A22 protein. 143
48028 398490 pfam04849 HAP1_N HAP1 N-terminal conserved region. This family represents an N-terminal conserved region found in several huntingtin-associated protein 1 (HAP1) homologs. HAP1 binds to huntingtin in a polyglutamine repeat-length-dependent manner. However, its possible role in the pathogenesis of Huntington's disease is unclear. This family also includes a similar N-terminal conserved region from hypothetical protein products of ALS2CR3 genes found in the human juvenile amyotrophic lateral sclerosis critical region 2q33-2q34. 305
48029 398491 pfam04850 Baculo_E66 Baculovirus E66 occlusion-derived virus envelope protein. 387
48030 398492 pfam04851 ResIII Type III restriction enzyme, res subunit. 162
48031 398493 pfam04852 DUF640 Protein of unknown function (DUF640). This family represents a conserved region found in plant proteins including Resistance protein-like protein. 126
48032 398494 pfam04854 DUF624 Protein of unknown function, DUF624. This family includes several uncharacterized bacterial proteins. 77
48033 398495 pfam04855 SNF5 SNF5 / SMARCB1 / INI1. SNF5 is a component of the yeast SWI/SNF complex, which is an ATP-dependent nucleosome-remodelling complex that regulates the transcription of a subset of yeast genes. SNF5 is a key component of all SWI/SNF-class complexes characterized so far. This family consists of the conserved region of SNF5, including a direct repeat motif. SNF5 is essential for the assembly, promoter targeting and chromatin remodelling activity of the SWI-SNF complex. SNF5 is also known as SMARCB1, for SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily b, member 1, and also INI1 for integrase interactor 1. Loss-of-function mutations in SNF5 are thought to contribute to oncogenesis in malignant rhabdoid tumors (MRTs). 178
48034 282682 pfam04856 Securin Securin sister-chromatid separation inhibitor. Securin is also known as pituitary tumor-transforming gene product. Over-expression of securin is associated with a number of tumors, and it has been proposed that this may be due to erroneous chromatid separation leading to chromosome gain or loss. 214
48035 398496 pfam04857 CAF1 CAF1 family ribonuclease. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localizes to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. 370
48036 398497 pfam04858 TH1 TH1 protein. TH1 is a highly conserved but uncharacterized metazoan protein. No homolog has been identified in Caenorhabditis elegans. TH1 binds specifically to A-Raf kinase. 579
48037 398498 pfam04859 DUF641 Plant protein of unknown function (DUF641). Plant protein of unknown function. 127
48038 398499 pfam04860 Phage_portal Phage portal protein. Bacteriophage portal proteins form a dodecamer that is located at a five-fold vertex of the viral capsid. The portal complex forms a channel through which the viral DNA is packaged into the capsid, and exits during infection. The portal protein is thought to rotate during DNA packaging. Portal proteins from different phage show little sequence homology, so this family does not represent all portal proteins. 323
48039 398500 pfam04862 DUF642 Protein of unknown function (DUF642). This family represents a duplicated conserved region found in a number of uncharacterized plant proteins, potentially in the stem. There is a conserved CGP sequence motif. 157
48040 398501 pfam04863 EGF_alliinase Alliinase EGF-like domain. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesized from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defense system. This family represents the N-terminal EGF-like domain. 56
48041 398502 pfam04864 Alliinase_C Allinase. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesized from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defense system. 363
48042 398503 pfam04865 Baseplate_J Baseplate J-like protein. The P2 bacteriophage J protein lies at the edge of the baseplate. This family also includes a number of bacterial homologs, which are thought to have been horizontally transferred. 253
48043 282690 pfam04866 Rota_NS6 Rotavirus non-structural protein 6. 92
48044 282691 pfam04867 DUF643 Protein of unknown function (DUF643). Protein of unknown function found in Borrelia burgdorferi, the Lyme disease spirochete. 114
48045 398504 pfam04868 PDE6_gamma Retinal cGMP phosphodiesterase, gamma subunit. Retinal rod and cone cGMP phosphodiesterases function as the effector enzymes in the vertebrate visual transduction cascade. This family represents the inhibitory gamma subunit, which is also expressed outside retinal tissues and has been shown to interact with the G-protein-coupled receptor kinase 2 signalling system to regulate the epidermal growth factor- and thrombin-dependent stimulation of p42/p44 mitogen-activated protein kinase in human embryonic kidney 293 cells. 82
48046 398505 pfam04869 Uso1_p115_head Uso1 / p115 like vesicle tethering protein, head region. Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerization, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. This family consists of part of the head region. The head region is highly conserved, but its function is unknown. It does not seem to be essential for vesicle tethering. The N-terminal part of the head region, not within this family, contains context-detected Armadillo/beta-catenin-like repeats (pfam00514). 311
48047 398506 pfam04870 Moulting_cycle Moulting cycle. This family of proteins plays a role in the moulting cycle of nematodes, which involves the synthesis of a new collagen-rich cuticle underneath the existing cuticle and the subsequent removal of the old cuticle. 343
48048 398507 pfam04871 Uso1_p115_C Uso1 / p115 like vesicle tethering protein, C terminal region. Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerization, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. This family consists of the acidic C-terminus, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another. 130
48049 282696 pfam04872 Pox_L5 Poxvirus L5 protein family. This family includes variola (smallpox) and vaccinia virus L5 proteins. However, not all proteins in this family are called L5. L5 is thought to contain a metal-binding region. 79
48050 398508 pfam04873 EIN3 Ethylene insensitive 3. Ethylene insensitive 3 (EIN3) proteins are a family of plant DNA-binding proteins that regulate transcription in response to the gaseous plant hormone ethylene, and are essential for ethylene-mediated responses including the triple response, cell growth inhibition, and accelerated senescence. 252
48051 398509 pfam04874 Mak16 Mak16 protein C-terminal region. The precise function of this eukaryotic protein family is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni Mak16 has been shown to target protein transport to the nucleolus. 99
48052 282699 pfam04875 DUF645 Protein of unknown function, DUF645. This family includes several uncharacterized proteins from Vibrio cholerae. There is some doubt regarding the existence of these proteins, they are encoded by open reading frames contained within a repeated region in the Vibrio superintegron. 59
48053 282700 pfam04876 Tenui_NCP Tenuivirus major non-capsid protein. This protein of unknown function accumulates in large amounts in tenuivirus infected cells. It is found in all forms of the inclusion bodies that are formed after infection. 173
48054 368170 pfam04877 Hairpins HrpZ. HrpZ from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants. The family of hairpinN proteins, Harpin, has been merged into this family. HrpN is a virulence determinant which elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis. 277
48055 282702 pfam04878 Baculo_p48 Baculovirus P48 protein. 370
48056 398510 pfam04879 Molybdop_Fe4S4 Molybdopterin oxidoreductase Fe4S4 domain. This domain is found in formate dehydrogenase H for which the structure is known. This first domain (residues 1 to 60) of Structure 1aa6 is an Fe4S4 cluster just below the protein surface. 55
48057 398511 pfam04880 NUDE_C NUDE protein, C-terminal conserved region. This family represents the C-terminal conserved region of the NUDE proteins. NUDE proteins are involved in nuclear migration. 169
48058 282705 pfam04881 Adeno_GP19K Adenovirus GP19K. This 19 kDa glycoprotein binds the major histocompatibility (MHC) class I antigens in the endoplasmic reticulum (ER). The ER retention signal at the C-terminus of GP19K causes retention of the complex in the ER, preventing lysis of the cell by cytotoxic T lymphocytes. 132
48059 398512 pfam04882 Peroxin-3 Peroxin-3. Peroxin-3 is a peroxisomal protein. It is thought to be involved in membrane vesicle assembly prior to the translocation of matrix proteins. 453
48060 398513 pfam04883 HK97-gp10_like Bacteriophage HK97-gp10, putative tail-component. This family of proteins is found in the caudovirales. It may be a tail component. 80
48061 398514 pfam04884 DUF647 Vitamin B6 photo-protection and homoeostasis. In plants, this domain plays a role in auxin-transport, plant growth and development and appears to be expressed by all cells in the plant as well as in plastids. The family has been shown to play a role in vitamin B6 photo-protection and homoeostasis in plants. 240
48062 398515 pfam04885 Stig1 Stigma-specific protein, Stig1. This family represents the Stig1 cysteine rich plant protein. The STIG1 gene is developmentally regulated and expressed specifically in the stigmatic secretory zone. 134
48063 282710 pfam04886 PT PT repeat. This short repeat is composed of the tetrapeptide XPTX. This repeat is found in a variety of proteins, however it is not clear if these repeats are homologous to each other. The alignment represents nine copies of this repeat. 36
48064 282711 pfam04887 Pox_M2 Poxvirus M2 protein. This family includes M2 protein from variola virus. The function of this protein is not known. 196
48065 398516 pfam04888 SseC Secretion system effector C (SseC) like family. SseC is a secreted protein that forms a complex together with SecB and SecD on the surface of Salmonella. All these proteins are secreted by the type III secretion system. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SecB, SseC and SecD are inserted into the target cell membrane, where they form a small pore or translocon. In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD which are thought to be directly involved in pore formation, and type III secretion system translocon. 312
48066 398517 pfam04889 Cwf_Cwc_15 Cwf15/Cwc15 cell cycle control protein. This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologs. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing. 243
48067 309840 pfam04890 DUF648 Family of unknown function (DUF648). Family of hypothetical Chlamydia proteins. This family may well consist of two domains, as some members only match the N-terminus. 289
48068 398518 pfam04891 NifQ NifQ. NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co), which is an integral part of the active site of dinitrogenase. The conserved C-terminal cysteine residues may be involved in metal binding. 159
48069 398519 pfam04892 VanZ VanZ like family. This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases. 131
48070 398520 pfam04893 Yip1 Yip1 domain. The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterized by the motifs DLYGP and GY. The Yip1 protein is a golgi protein involved in vesicular transport that interacts with GTPases. 173
48071 398521 pfam04894 Nre_N Archaeal Nre, N-terminal. This conserved region is found in the N-terminal region of archaeal Nre proteins. While most archaeal organisms encode only a single Nre protein, some encode two, NreA and NreB. 270
48072 398522 pfam04895 Nre_C Archaeal Nre, C-terminal. This conserved region is found in the C-terminal region of archaeal Nre proteins. While most archaeal organisms encode only a single Nre protein, some encode two, NreA and NreB. 110
48073 398523 pfam04896 AmoC Ammonia monooxygenase/methane monooxygenase, subunit C. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons. This family represents the AmoC subunit. It also includes the particulate methane monooxygenase subunit PmoC from methanotrophic bacteria. 245
48074 398524 pfam04898 Glu_syn_central Glutamate synthase central domain. The central domain of glutamate synthase connects the amino terminal amidotransferase domain with the FMN-binding domain and has an alpha / beta overall topology. This domain appears to be a rudimentary form of the FMN-binding TIM barrel according to SCOP. 281
48075 113664 pfam04899 MbeD_MobD MbeD/MobD like. The MbeD and MobD proteins are plasmid encoded, and are involved in plasmid mobilisation and transfer in the presence of conjugative plasmids. 70
48076 398525 pfam04900 Fcf1 Fcf1. Fcf1 is a nucleolar protein involved in pre-rRNA processing. Depletion of yeast Fcf1 and Fcf2 leads to a decrease in synthesis of the 18S rRNA and results in a deficit in 40S ribosomal subunits. 99
48077 398526 pfam04901 RAMP Receptor activity modifying family. The calcitonin-receptor-like receptor can function as either a calcitonin-gene-related peptide or an adrenomedullin receptor. The receptor's function is modified by receptor-activity-modifying protein or RAMP. RAMPs are single-transmembrane-domain proteins. 108
48078 398527 pfam04902 Nab1 Conserved region in Nab1. Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This C-terminal region is found only in the Nab1 subfamily. 190
48079 398528 pfam04904 NCD1 NAB conserved region 1 (NCD1). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This region consists of the N-terminal NAB conserved region 1, which interacts with the EGR1 inhibitory domain (R1). It may also mediate multimerisation. 79
48080 398529 pfam04905 NCD2 NAB conserved region 2 (NCD2). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This family consists of NAB conserved region 2, near the C-terminus of the protein. It is necessary for transcriptional repression by the Nab proteins. It is also required for transcription activation by Nab proteins at Nab-activated promoters. 123
48081 368183 pfam04906 Tweety Tweety. The tweety (tty) gene has not been characterized at the protein level. However, it is thought to form a membrane protein with five potential membrane-spanning regions. A number of potential functions have been suggested in. 406
48082 398530 pfam04908 SH3BGR SH3-binding, glutamic acid-rich protein. 92
48083 398531 pfam04909 Amidohydro_2 Amidohydrolase. These proteins are amidohydrolases that are related to pfam01979. 285
48084 398532 pfam04910 Tcf25 Transcriptional repressor TCF25. Members of this family are transcriptional repressors. They may act by increasing histone deacetylase activity at promoter regions. 321
48085 368187 pfam04911 ATP-synt_J ATP synthase j chain. 51
48086 398533 pfam04912 Dynamitin Dynamitin. Dynamitin is a subunit of the microtubule-dependent motor complex and is implicated in cell adhesion by binding to macrophage-enriched myristoylated alanine-rich C kinase substrate (MacMARCKS). 393
48087 282731 pfam04913 Baculo_Y142 Baculovirus Y142 protein. This domain family is found in Baculovirus proteins including protein AC142, which is expressed in the cytoplasm and nucleus throughout infection. It is required for nucleocapsid envelopment in the budding virus to form the occlusion-derived virus and subsequent embedding of virions into polyhedra. 440
48088 398534 pfam04914 DltD DltD protein. DltD is an integral membrane protein involved in the biosynthesis of D-alanyl-lipoteichoic acid. This is important in controlling the net ionic charge in lipoteichoic acid (LTA). This family is found in bacteria of the Bacillus/Clostridium group. DltD binds Dcp and ligates it with D-alanine. DltD does not ligate acyl carrier protein (ACP) with D-alanine. It also has thioesterase activity for mischarged D-alanyl-acyl carrier protein (ACP). DltD is thought to be responsible for discriminating between Dcp involved in the D-alanylation of LTA, and ACP involved in fatty acid biosynthesis. 349
48089 398535 pfam04916 Phospholip_B Phospholipase B. Phospholipase B (PLB) catalyzes the hydrolytic cleavage of both acylester bonds of glycerophospholipids. This family of PLB enzymes has been identified in mammals, flies and nematodes but not in yeast. In Drosophila this protein was named LAMA for laminin ancestor since it is expressed in the neuronal and glial precursors that surround the lamina. 536
48090 309860 pfam04917 Shufflon_N Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins. 324
48091 282736 pfam04919 DUF655 Protein of unknown function (DUF655). This family includes several uncharacterized archaeal proteins. This protein appears to contain two HHH motifs. 181
48092 282737 pfam04920 DUF656 Family of unknown function (DUF656). A family of hypothetical proteins from Beet necrotic yellow vein virus. 126
48093 398536 pfam04921 XAP5 XAP5, circadian clock regulator. This protein is found in a wide range of eukaryotes. It is a nuclear protein and is suggested to be DNA binding. In plants, this family is essential for correct circadian clock functioning by acting as a light-quality regulator coordinating the activities of blue and red light signalling pathways during plant growth - inhibiting growth in red light but promoting growth in blue light. 238
48094 398537 pfam04922 DIE2_ALG10 DIE2/ALG10 family. The ALG10 protein from Saccharomyces cerevisiae encodes the alpha-1,2 glucosyltransferase of the endoplasmic reticulum. This protein has been characterized in rat as potassium channel regulator 1. 383
48095 398538 pfam04923 Ninjurin Ninjurin. Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. 101
48096 282741 pfam04924 Pox_A6 Poxvirus A6 protein. 370
48097 398539 pfam04925 SHQ1 SHQ1 protein. S. cerevisiae SHQ1 protein is required for SnoRNAs of the box H/ACA Quantitative accumulation (unpublished). 176
48098 398540 pfam04926 PAP_RNA-bind Poly(A) polymerase predicted RNA binding domain. Based on its similarity structurally to the RNA recognition motif this domain is thought to be RNA binding. 177
48099 398541 pfam04927 SMP Seed maturation protein. Plant seed maturation protein. 59
48100 398542 pfam04928 PAP_central Poly(A) polymerase central domain. The central domain of Poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different. 344
48101 282746 pfam04929 Herpes_DNAp_acc Herpes DNA replication accessory factor. Replicative DNA polymerases are capable of polymerising tens of thousands of nucleotides without dissociating from their DNA templates. The high processivity of these polymerases is dependent upon accessory proteins that bind to the catalytic subunit of the polymerase or to the substrate. The Epstein-Barr virus (EBV) BMRF1 protein is an essential component of the viral DNA polymerase and is absolutely required for lytic virus replication. BMRF1 is also a transactivator. This family is predicted to have a UL42 like structure. 400
48102 398543 pfam04930 FUN14 FUN14 family. This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices. 93
48103 398544 pfam04931 DNA_pol_phi DNA polymerase phi. This family includes the fifth essential DNA polymerase in yeast EC:2.7.7.7. Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units. 765
48104 398545 pfam04932 Wzy_C O-Antigen ligase. This group of bacterial proteins is involved in the synthesis of O-antigen, a lipopolysaccharide found in the outer membrane in gram-negative bacteria. This family includes O-antigen ligases such as E. coli RfaL. 149
48105 398546 pfam04934 Med6 MED6 mediator sub complex component. Component of RNA polymerase II holoenzyme and mediator sub complex. 132
48106 398547 pfam04935 SURF6 Surfeit locus protein 6. The surfeit locus protein SURF-6 is shown to be a component of the nucleolar matrix and has a strong binding capacity for nucleic acids. 197
48107 282752 pfam04936 DUF658 Protein of unknown function (DUF658). Protein of unknown function found in Lactococcus lactis bacteriophages. 186
48108 398548 pfam04937 DUF659 Protein of unknown function (DUF 659). Transposase-like protein with no known function. 152
48109 398549 pfam04938 SIP1 Survival motor neuron (SMN) interacting protein 1 (SIP1). Survival motor neuron (SMN) interacting protein 1 (SIP1) interacts with SMN protein and plays a crucial role in the biogenesis of spliceosomes. There is evidence that the protein is linked to spinal muscular atrophy (SMA) and amyotrophic lateral sclerosis (ALS) in humans. 212
48110 398550 pfam04939 RRS1 Ribosome biogenesis regulatory protein (RRS1). This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae. 161
48111 398551 pfam04940 BLUF Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light. 91
48112 282757 pfam04941 LEF-8 Late expression factor 8 (LEF-8). Late expression factor 8 (LEF-8) is one of the primary components of RNA polymerase produced by polyhedrosis viruses. LEF-8 shows homology to the second largest subunit of prokaryotic DNA-directed RNA polymerase. 730
48113 368205 pfam04942 CC CC domain. This short domain contains four conserved cysteines that probably form two disulphide bonds. The domain is named after the characteristic CC motif. 34
48114 282759 pfam04943 Pox_F11 Poxvirus F11 protein. The protein F11 is an early virus protein. 409
48115 398552 pfam04945 YHS YHS domain. This short presumed domain is about 50 amino acid residues long. It often contains two cysteines that may be functionally important. This domain is found in copper transporting ATPases, some phenol hydroxylases and in a set of uncharacterized membrane proteins. This domain is named after three of the most conserved amino acids it contains. The domain may be metal binding, possibly copper ions. This domain is duplicated in some copper transporting ATPases. 47
48116 282761 pfam04947 Pox_VLTF3 Poxvirus Late Transcription Factor VLTF3 like. Members of this family are approximately 26 KDa, and are involved in trans-activator of late transcription. 168
48117 282762 pfam04948 Pox_A51 Poxvirus A51 protein. 337
48118 398553 pfam04949 Transcrip_act Transcriptional activator. This family of proteins may act as a transcriptional activator. It plays a role in stress response in plants. 154
48119 398554 pfam04950 RIBIOP_C 40S ribosome biogenesis protein Tsr1 and BMS1 C-terminal. RIBIOP_C is a family of eukaryotic proteins from the C-terminus of pre-rRNA-processing protein or ribosome biogenesis proteins BMS1 and TSR1. These proteins act, in the nucleolus, as a molecular switch during maturation of the 40S ribosomal subunit. This domain, domain IV of translation elongation factor selb, adopts the same fold as translation proteins such as domain II of GTP-elongation factor Tu proteins. 289
48120 398555 pfam04951 Peptidase_M55 D-aminopeptidase. Bacillus subtilis DppA is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a 'self-compartmentalising protease', a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. The DppA enzyme is composed of identical 30 kDa subunits organized in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterized by the SXDXEG key sequence. The only known substrates are D-ala-D-ala and D-ala-gly-gly. 263
48121 398556 pfam04952 AstE_AspA Succinylglutamate desuccinylase / Aspartoacylase family. This family includes Succinylglutamate desuccinylase EC:3.1.-.- that catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. The family also includes aspartoacylase EC:3.5.1.15 which cleaves acylaspartate into a fatty acid and aspartate. Mutations in human ASPA lead to Canavan disease. This family is probably structurally related to pfam00246 (Bateman A pers. obs.). 287
48122 398557 pfam04954 SIP Siderophore-interacting protein. 119
48123 398558 pfam04955 HupE_UreJ HupE / UreJ protein. This family of proteins are hydrogenase / urease accessory proteins. The alignment contains many conserved histidines that are likely to be involved in nickel binding. The members usually have five membrane-spanning regions. 179
48124 398559 pfam04956 TrbC TrbC/VIRB2 family. Conjugal transfer protein, TrbC has been identified as a subunit of the pilus precursor in bacteria. The protein undergoes three processing steps before gaining its mature cyclic structure. This family also contains several VIRB2 type IV secretion proteins. The virB2 gene encodes a putative type IV secretion system and is known to be a pathogenicity factor in Bartonella species. 98
48125 398560 pfam04957 RMF Ribosome modulation factor. This protein associates with 70s ribosomes and converts them to a dimeric form (100S ribosomes) which appear during the transition from the exponential growth phase to the stationary phase of Escherichia coli cells. 51
48126 398561 pfam04958 AstA Arginine N-succinyltransferase beta subunit. Arginine N-succinyltransferase EC:2.3.1.109 catalyzes the transfer of succinyl-CoA to arginine to produce succinyl-arginine. This is the first step in arginine catabolism by the arginine succinyltransferase pathway. 335
48127 398562 pfam04959 ARS2 Arsenite-resistance protein 2. Arsenite is a carcinogenic compound which can act as a co-mutagen by inhibiting DNA repair. Arsenite-resistance protein 2 is thought to play a role in arsenite resistance. 195
48128 398563 pfam04960 Glutaminase Glutaminase. This family of enzymes deaminates glutamine to glutamate EC:3.5.1.2. 283
48129 398564 pfam04961 FTCD_C Formiminotransferase-cyclodeaminase. Members of this family are thought to be Formiminotransferase-cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family. 181
48130 398565 pfam04962 KduI KduI/IolB family. This family includes the 5-keto 4-deoxyuronate isomerase enzyme EC:5.3.1.17 that is involved in pectin degradation. This family also includes bacterial Myo-inositol catabolism (IolB) proteins. The Bacillus subtilis inositol operon (iolABCDEFGHIJ) is involved in myo-inositol catabolism. Glucose repression of the iol operon induced by inositol is exerted through catabolite repression mediated by CcpA and the iol induction system mediated by IolR. The exact function of IolB is unknown. Members of this family possess a Cupin like structure. 260
48131 398566 pfam04963 Sigma54_CBD Sigma-54 factor, core binding domain. This domain makes a direct interaction with the core RNA polymerase, to form an enhancer dependent holoenzyme. The centre of this domain contains a very weak similarity to a helix-turn-helix motif which may represent the other DNA binding domain. 182
48132 398567 pfam04964 Flp_Fap Flp/Fap pilin component. 46
48133 398568 pfam04965 GPW_gp25 Gene 25-like lysozyme. This family includes the phage protein Gene 25 from T4 which is a structural component of the outer wedge of the baseplate that has acidic lysozyme activity. The family also includes relatives from bacteria that are also presumably lysozymes. 93
48134 335962 pfam04966 OprB Carbohydrate-selective porin, OprB family. 373
48135 282780 pfam04967 HTH_10 HTH DNA binding domain. 53
48136 398569 pfam04968 CHORD CHORD. CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development. 62
48137 398570 pfam04969 CS CS domain. The CS and CHORD (pfam04968) are fused into a single polypeptide chain in metazoans but are found in separate proteins in plants; this is thought to be indicative of an interaction between CS and CHORD. It has been suggested that the CS domain is a binding module for HSP90, implying that CS domain-containing proteins are involved in recruiting heat shock proteins to multiprotein assemblies. Two CS domains are found at the N-terminus of Ubiquitin carboxyl-terminal hydrolase 19 (USP19), these domains may play a role in the interaction of USP19 with cellular inhibitor of apoptosis 2. 76
48138 398571 pfam04970 LRAT Lecithin retinol acyltransferase. The full-length members of this family, are representatives of a novel class II tumor-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localization, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester. 106
48139 398572 pfam04971 Phage_holin_2_1 Bacteriophage P21 holin S. Phage_holin_2_1 is a family of small hydrophobic holin proteins with one or more transmembrane domains. Members of this family fall into the holin superfamily II, and Phage 21 S holin is the prototype for this superfamily. It has two transmembrane segments with both the N- and C-termini on the cytoplasmic side of the inner membrane in E. coli. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build up of a holin oligomer which causes the lysis. 64
48140 398573 pfam04972 BON BON domain. This domain is found in a family of osmotic shock protection proteins. It is also found in some Secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. 69
48141 398574 pfam04973 NMN_transporter Nicotinamide mononucleotide transporter. Members of this family are integral membrane proteins that are involved in transport of nicotinamide mononucleotide. 176
48142 398575 pfam04976 DmsC DMSO reductase anchor subunit (DmsC). The terminal electron transfer enzyme Me2SO reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC). 275
48143 398576 pfam04977 DivIC Septum formation initiator. DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. 69
48144 398577 pfam04978 DUF664 Protein of unknown function (DUF664). This family is commonly found in Streptomyces coelicolor and is of unknown function. These proteins contain several conserved histidines at their N-terminus that may form a metal binding site. 149
48145 398578 pfam04979 IPP-2 Protein phosphatase inhibitor 2 (IPP-2). Protein phosphatase inhibitor 2 (IPP-2) is a phosphoprotein conserved among all eukaryotes, and it appears in both the nucleus and cytoplasm of tissue culture cells. 130
48146 398579 pfam04981 NMD3 NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. Export of the 60S ribosomal subunit is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway. 242
48147 398580 pfam04982 HPP HPP family. These proteins are integral membrane proteins with four transmembrane spanning helices. The most conserved region of the alignment is a motif HPP. The function of these proteins is uncertain but they may be transporters. 122
48148 398581 pfam04983 RNA_pol_Rpb1_3 RNA polymerase Rpb1, domain 3. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking. 158
48149 398582 pfam04984 Phage_sheath_1 Phage tail sheath protein subtilisin-like domain. This entry represents the second domain in a variety of phage tail sheath proteins. According to ECOD this domain has a subtilisin-like structure. 157
48150 398583 pfam04985 Phage_tube Phage tail tube protein FII. The major structural components of the contractile tail of bacteriophage P2 are proteins FI and FII, which are believed to be the tail sheath and tube proteins, respectively. 162
48151 398584 pfam04986 Y2_Tnp Putative transposase. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases IS1294 and IS801. This is a rolling-circle transposase. 183
48152 398585 pfam04987 PigN Phosphatidylinositolglycan class N (PIG-N). Phosphatidylinositolglycan class N (PIG-N) is a mammalian homolog of the yeast protein MCD4P and is expressed in the endoplasmic reticulum. PIG-N is essential for glycosylphosphatidylinositol anchor synthesis. Glycosylphosphatidylinositol (GPI)-anchored proteins are cell surface-localized proteins that serve many important cellular functions. 455
48153 398586 pfam04988 AKAP95 A-kinase anchoring protein 95 (AKAP95). A-kinase (or PKA)-anchoring protein AKAP95 is implicated in mitotic chromosome condensation by acting as a targeting molecule for the condensin complex. The protein contains two zinc fingers which are thought to mediate the binding of AKAP95 to DNA. 134
48154 398587 pfam04989 CmcI Cephalosporin hydroxylase. Members of this family are about 220 amino acids long. The CmcI protein is presumed to represent the cephalosporin-7--hydroxylase. However this has not been experimentally verified. 206
48155 398588 pfam04990 RNA_pol_Rpb1_7 RNA polymerase Rpb1, domain 7. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 7, represents a mobile module of the RNA polymerase. Domain 7 forms a substantial interaction with the lobe domain of Rpb2 (pfam04561). 136
48156 398589 pfam04991 LicD LicD family. The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily. 224
48157 398590 pfam04992 RNA_pol_Rpb1_6 RNA polymerase Rpb1, domain 6. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 6, represents a mobile module of the RNA polymerase. Domain 6 forms part of the shelf module. This family appears to be specific to the largest subunit of RNA polymerase II. 188
48158 398591 pfam04993 TfoX_N TfoX N-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found as an isolated domain in some proteins suggesting this is an autonomous domain. 91
48159 398592 pfam04994 TfoX_C TfoX C-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found associated with pfam00383 in Neisseria meningitidis TadA. It is also found as an isolated domain in some proteins suggesting this is an autonomous domain. 81
48160 398593 pfam04995 CcmD Heme exporter protein D (CcmD). The CcmD protein is part of a C-type cytochrome biogenesis operon. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation. These proteins contain a predicted transmembrane helix. 44
48161 398594 pfam04996 AstB Succinylarginine dihydrolase. This enzyme transforms N(2)-succinylglutamate into succinate and glutamate. This is the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. 442
48162 398595 pfam04997 RNA_pol_Rpb1_1 RNA polymerase Rpb1, domain 1. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand. 320
48163 398596 pfam04998 RNA_pol_Rpb1_5 RNA polymerase Rpb1, domain 5. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 5, represents the discontinuous cleft domain that is required to form the central cleft or channel where the DNA is bound. 516
48164 398597 pfam04999 FtsL Cell division protein FtsL. In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein pfam04977, however this interaction may be indirect. 97
48165 398598 pfam05000 RNA_pol_Rpb1_4 RNA polymerase Rpb1, domain 4. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors. 108
48166 398599 pfam05001 RNA_pol_Rpb1_R RNA polymerase Rpb1 C-terminal repeat. The repetitive C-terminal domain (CTD) of Rpb1 (RNA polymerase Pol II) plays a critical role in the regulation of gene expression. The activity of the CTD is dependent on its state of phosphorylation. 12
48167 398600 pfam05002 SGS SGS domain. This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins. 82
48168 398601 pfam05003 DUF668 Protein of unknown function (DUF668). Uncharacterized plant protein. 88
48169 398602 pfam05004 IFRD Interferon-related developmental regulator (IFRD). Interferon-related developmental regulator (IFRD1) is the human homolog of the rat early response protein PC4 and its murine homolog TIS7. The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary to muscle differentiation and that it might have a role in signal transduction. This family also contains IFRD2 and its murine equivalent SKMc15 which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis. 314
48170 398603 pfam05005 Ocnus Janus/Ocnus family (Ocnus). This family is comprised of the Ocnus, Janus-A and Janus-B proteins. These proteins have been found to be testes specific in Drosophila melanogaster. 102
48171 368236 pfam05006 PIF3 Per os infectivity factor 3. This family contains viral proteins and includes Baculovirus Per os infectivity factor 3 (PIF3). PIF3 forms a complex on the occlusion-derived virus surface with PIF1, PIF2, and P74 which has an essential function in the initial stages of baculovirus oral infection. 149
48172 252941 pfam05007 Mannosyl_trans Mannosyltransferase (PIG-M). PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilize nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates. 259
48173 398604 pfam05008 V-SNARE Vesicle transport v-SNARE protein N-terminus. V-SNARE proteins are required for protein traffic between eukaryotic organelles. The v-SNAREs on transport vesicles interact with t-SNAREs on target membranes in order to facilitate this. This domain is the N-terminal half of the V-Snare proteins. 78
48174 282817 pfam05009 EBV-NA3 Epstein-Barr virus nuclear antigen 3 (EBNA-3). This family contains EBNA-3A, -3B, and -3C which are latent infection nuclear proteins important for Epstein-Barr virus (EBV)-induced B-cell immortalisation and the immune response to EBV infection. 254
48175 398605 pfam05010 TACC Transforming acidic coiled-coil-containing protein (TACC). This family contains the proteins TACC 1, 2 and 3 the genes for which are found concentrated in the centrosomes of eukaryotes and may play a conserved role in organising centrosomal microtubules. The human TACC proteins have been linked to cancer and TACC2 has been identified as a possible tumor suppressor (AZU-1). The functional homolog (Alp7) in Schizosaccharomyces pombe has been shown to be required for organisation of bipolar spindles. 201
48176 398606 pfam05011 DBR1 Lariat debranching enzyme, C-terminal domain. This presumed domain is found at the C-terminus of lariat debranching enzyme. This domain is always found in association with pfam00149. 136
48177 398607 pfam05013 FGase N-formylglutamate amidohydrolase. Formylglutamate amidohydrolase (FGase) catalyzes the terminal reaction in the five-step pathway for histidine utilisation in Pseudomonas putida. By this action, N-formyl-L-glutamate (FG) is hydrolyzed to produce L-glutamate plus formate. 218
48178 398608 pfam05014 Nuc_deoxyrib_tr Nucleoside 2-deoxyribosyltransferase. Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2'-deoxyribonucleosides. 115
48179 398609 pfam05015 HigB-like_toxin RelE-like toxin of type II toxin-antitoxin system HigB. This family carries several different examples of type II bacterial toxins of toxin-antitoxin systems including many HigB-like ones. The fold is referred to as the RelE/YoeB/Txe/Yeeu fold suggesting all examples of these are present in this family. Several plasmids with proteic killer gene-systems have been reported. All of them encode a stable toxin and an unstable antidote. Upon loss of the plasmid, the less stable inhibitor is inactivated more rapidly than the toxin, allowing the toxin to be activated. The activation of these systems results in cell filamentation and cessation of viable cell production. It has been verified that both the stable killer and the unstable inhibitor of the systems are short polypeptides. This family corresponds to the toxin. 93
48180 398610 pfam05016 ParE_toxin ParE toxin of type II toxin-antitoxin system, parDE. ParE is the toxin family of a type II toxin-antitoxin family. It is toxic towards DNA gyrase, but is neutralized by the antitoxin ParD. The family also encompasses RelE/ParE described in. 91
48181 368241 pfam05017 TMP TMP repeat. This short repeat consists of the motif WXXh where X can be any residue and h is a hydrophobic residue. The repeat is named TMP after its occurrence in the tape measure protein (TMP). Tape measure protein is a component of phage tail and probably forms a beta-helix. Truncated forms of TMP lead to shortened tail fibers. This repeat is also found in non-phage proteins where it may play a structural role. 11
48182 398611 pfam05018 DUF667 Protein of unknown function (DUF667). This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature. 185
48183 398612 pfam05019 Coq4 Coenzyme Q (ubiquinone) biosynthesis protein Coq4. Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial-targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN). 213
48184 398613 pfam05020 zf-NPL4 NPL4 family, putative zinc binding region. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterized step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. This region of the protein contains possibly two zinc binding motifs (Bateman A pers. obs.). Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing. 145
48185 398614 pfam05021 NPL4 NPL4 family. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterized step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing. 308
48186 398615 pfam05022 SRP40_C SRP40, C-terminal domain. This presumed domain is found at the C-terminus of the S. cerevisiae SRP40 protein and its homologs. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of mutant AC40 subunit of RNA polymerase I and III. 75
48187 398616 pfam05023 Phytochelatin Phytochelatin synthase. Phytochelatin synthase is the enzyme responsible for the synthesis of heavy-metal-binding peptides (phytochelatins) from glutathione and related thiols. The crystal structure of a member of this family shows it to possess a papain fold. The enzyme catalyzes the deglycination of a GSH donor molecule. The enzyme contains a catalytic triad of cysteine, histidine and aspartate residues. 207
48188 398617 pfam05024 Gpi1 N-acetylglucosaminyl transferase component (Gpi1). Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This chemically simple step is genetically complex because three or four genes are required in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively. 187
48189 398618 pfam05025 RbsD_FucU RbsD / FucU transport protein family. The Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). Of the six components, RbsD is the only one whose function is unknown although it is thought that it somehow plays a critical role in PtsG-mediated ribose transport. This family also includes FucU a protein from the fucose biosynthesis operon that is presumably also involved in fucose transport by similarity to RbsD. 133
48190 398619 pfam05026 DCP2 Dcp2, box A domain. This domain is always found to the amino terminal side of pfam00293. This domain is specific to mRNA decapping protein 2 and this region has been termed Box A. Removal of the cap structure is catalyzed by the Dcp1-Dcp2 complex. 83
48191 398620 pfam05028 PARG_cat Poly (ADP-ribose) glycohydrolase (PARG). Poly(ADP-ribose) glycohydrolase (PARG) is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death. 324
48192 398621 pfam05029 TIMELESS_C Timeless protein C terminal region. The timeless (tim) gene is essential for circadian function in Drosophila. Putative homologs of Drosophila tim have been identified in both mice and humans (mTim and hTIM, respectively). Mammalian TIM is not the true orthologue of Drosophila TIM, but is the likely orthologue of a fly gene, timeout (also called tim-2). mTim has been shown to be essential for embryonic development, but does not have substantiated circadian function. Some family members contain a SANT domain in this region. 88
48193 398622 pfam05030 SSXT SSXT protein (N-terminal region). The SSXT or SS18 protein is involved in synovial sarcoma in humans. A SYT-SSX fusion gene resulting from the chromosomal translocation t(X;18) (p11;q11) is characteristic of synovial sarcomas. This translocation fuses the SSXT (SYT) gene from chromosome 18 to either of two homologous genes at Xp11, SSX1 or SSX2. 60
48194 398623 pfam05031 NEAT Iron Transport-associated domain. NEAT domains are heme and/or hemoprotein-binding modules highly conserved in secondary structure. They have roles in hemoprotein binding, heme extraction and heme transfer. 113
48195 398624 pfam05032 Spo12 Spo12 family. This family of proteins includes Spo12 from S. cerevisiae. The Spo12 protein plays a regulatory role in two of the most fundamental processes of biology, mitosis and meiosis, and yet its biochemical function remains elusive. Spo12 is a nuclear protein. Spo12 is a component of the FEAR (Cdc fourteen early anaphase release) regulatory network, that promotes Cdc14 release from the nucleolus during early anaphase. The FEAR network is comprised of the polo kinase Cdc5, the separase Esp1, the kinetochore-associated protein Slk19, and Spo12. 33
48196 398625 pfam05033 Pre-SET Pre-SET motif. This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilizing SET domains. 98
48197 398626 pfam05034 MAAL_N Methylaspartate ammonia-lyase N-terminus. Methylaspartate ammonia-lyase EC:4.3.1.2 catalyzes the second step of fermentation of glutamate. It is a homodimer. This family represents the N-terminal region of Methylaspartate ammonia-lyase. This domain is structurally related to pfam03952. This domain is associated with the catalytic domain pfam07476. 160
48198 398627 pfam05035 DGOK 2-keto-3-deoxy-galactonokinase. 2-keto-3-deoxy-galactonokinase EC:2.7.1.58 catalyzes the second step in D-galactonate degradation. 284
48199 398628 pfam05036 SPOR Sporulation related domain. This 70 residue domain is composed of two 35 residue repeats found in proteins involved in sporulation and cell division such as FtsN, DedD, and CwlM. This domain is involved in binding peptidoglycan. Two tandem repeats fold into a pseudo-2-fold symmetric single-domain structure containing numerous contacts between the repeats. FtsN is an essential cell division protein with a simple bitopic topology, a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. These repeats lay at the periplasmic C-terminus. FtsN localizes to the septum ring complex. 76
48200 398629 pfam05037 DUF669 Protein of unknown function (DUF669). Members of this family are found in various phage proteins. 126
48201 398630 pfam05038 Cytochrom_B558a Cytochrome b558 alpha-subunit. Cytochrome b-245 light chain (p22-phox) is one of the key electron transfer elements of the NADPH oxidase in phagocytes. 177
48202 398631 pfam05039 Agouti Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein, agouti signal protein (ASIP), is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation. 87
48203 398632 pfam05041 Pecanex_C Pecanex protein (C-terminus). This family consists of C terminal region of the pecanex protein homologs. The pecanex protein is a maternal-effect neurogenic gene found in Drosophila. 227
48204 398633 pfam05042 Caleosin Caleosin related protein. This family contains plant proteins related to caleosin. Caleosins contain calcium-binding domains and have an oleosin-like association with lipid bodies. Caleosins are present at relatively low levels and are mainly bound to microsomal membrane fractions at the early stages of seed development. As the seeds mature, overall levels of caleosins increased dramatically and they were associated almost exclusively with storage lipid bodies. This family is probably related to EF hands pfam00036. 170
48205 398634 pfam05043 Mga Mga helix-turn-helix domain. M regulator protein trans-acting positive regulator (Mga) is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. This domain is found in the centre of the Mga proteins. This family also contains a number of bacterial RofA transcriptional regulators that seem to be largely restricted to streptococci. These proteins have been shown to regulate the expression of important bacterial adhesins. This is presumably a DNA-binding domain. 87
48206 398635 pfam05044 HPD Homeo-prospero domain. Prospero is a large Drosophila transcription factor protein that is expressed in all neural lineages of Drosophila embryos. It is needed for correct expression of several neural proteins and in determining the cell fates of neural stem cells. Homologs of prospero are found in a wide range of animals including humans with the highest level of similarity being found in the C-terminal 160 amino acids. This region was identified as containing an atypical homeobox domain followed by a prospero domain. However, the structure shows that these two regions form a single stable structural domain as defined here. This homeo-prospero domain binds to DNA. 152
48207 398636 pfam05045 RgpF Rhamnan synthesis protein F. This family consists of a group of proteins which are related to the Streptococcus rhamnose-glucose polysaccharide assembly protein (RgpF). Rhamnan backbones are found in several O polysaccharides of phytopathogenic bacteria and are regarded as pathogenic factors. 501
48208 398637 pfam05046 Img2 Mitochondrial large subunit ribosomal protein (Img2). This family of proteins has been identified as part of the mitochondrial large ribosomal subunit in yeast. 82
48209 368263 pfam05047 L51_S25_CI-B8 Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. Structurally related to thioredoxin-fold. 51
48210 398638 pfam05048 NosD Periplasmic copper-binding protein (NosD). NosD is a periplasmic protein which is thought to insert copper into the exported reductase apoenzyme (NosZ). This region forms a parallel beta helix domain. 215
48211 398639 pfam05049 IIGP Interferon-inducible GTPase (IIGP). Interferon-inducible GTPase (IIGP) is thought to play a role in intracellular defense. IIGP is predominantly associated with the Golgi apparatus and also localizes to the endoplasmic reticulum and exerts a distinct role in IFN-induced intracellular membrane trafficking or processing. 375
48212 398640 pfam05050 Methyltransf_21 Methyltransferase FkbM domain. This family has members from bacteria to human, and appears to be a methyltransferase. 173
48213 398641 pfam05051 COX17 Cytochrome C oxidase copper chaperone (COX17). Cox17 is essential for the assembly of functional cytochrome c oxidase (CCO) and for delivery of copper ions to the mitochondrion for insertion into the enzyme in yeast. The structure of Cox17 shows the protein to have an unstructured N-terminal region followed by two helices and several unstructured C-terminal residues. The Cu(I) binding site has been modelled as two-coordinate with ligation by conserved residues Cys23 and Cys26. 47
48214 309965 pfam05052 MerE MerE protein. The prokaryotic MerE (or URF-1) protein is part of the mercury resistance operon. The protein is thought not to have any direct role in conferring mercury resistance to the organism but may be a mercury resistance transposon. 75
48215 398642 pfam05053 Menin Menin. MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumor suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23. 617
48216 282858 pfam05054 AcMNPV_Ac109 Autographa californica nuclear polyhedrosis virus (AcMNPV) protein. This domain family is found in viral proteins such as Ac109 from Autographa californica nuclear polyhedrosis virus (AcMNPV). The gene (Orf1090) is essential and transcribed late in virus assembly, and protein AC109 has been shown to be important for the transport of the budded virion to the host nucleus. In mutants lacking the AC109 gene, virions are unable to enter the nucleus and remain in the cytoplasm. Although addition of AC109 allowed virions to enter the nucleus, the occlusion bodies were empty, indicating that AC109 is also important for the production of infectious budded virus. The exact function of this domain family remains unknown. 418
48217 252976 pfam05055 DUF677 Protein of unknown function (DUF677). This family consists of AT14A like proteins from Arabidopsis thaliana. At14a has a small domain that has sequence similarities to integrins from fungi, insects and humans. Transcripts of At14a are found in all Arabidopsis tissues, and the protein localizes partly to the plasma membrane. 336
48218 398643 pfam05056 DUF674 Protein of unknown function (DUF674). This family is found in Arabidopsis thaliana and contains several uncharacterized proteins. 449
48219 309968 pfam05057 DUF676 Putative serine esterase (DUF676). This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold. 212
48220 282861 pfam05058 ActA ActA Protein. The ActA family is found in Listeria and is associated with motility. ActA protein acts as a scaffold to assemble and activate host cell actin cytoskeletal factors at the bacterial surface, resulting in directional actin polymerization and propulsion of the bacterium through the cytoplasm of the host cell. 633
48221 282862 pfam05059 Orbi_VP4 Orbivirus VP4 core protein. Orbiviruses are double-stranded RNA viruses of which the bluetongue virus is a member. The core of bluetongue virus (BTV) is a multienzyme complex composed of two major proteins (VP7 and VP3) and three minor proteins (VP1, VP4 and VP6) in addition to the viral genome. VP4 has been shown to perform all RNA capping activities and has both methyltransferase type 1 and type 2 activities associated with it. 640
48222 398644 pfam05060 MGAT2 N-acetylglucosaminyltransferase II (MGAT2). UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N-acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors. 349
48223 282864 pfam05061 Pox_A11 Poxvirus A11 Protein. Family of conserved Chordopoxvirinae A11 family proteins. Conserved region spans entire protein in the majority of family members. 315
48224 309970 pfam05062 RICH RICH domain. This presumed domain is about 85 residues in length and very rich in charged residues, hence the name RICH (Rich In CHarged residues). It is found in secreted proteins such as PspC, SpsA and IgA FC receptor from Streptococcus agalactiae. This domain could be involved in bacterial adherence or cell wall binding. 81
48225 368270 pfam05063 MT-A70 MT-A70. MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs. 174
48226 398645 pfam05064 Nsp1_C Nsp1-like C-terminal region. This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. 116
48227 398646 pfam05065 Phage_capsid Phage capsid family. Family of bacteriophage hypothetical proteins and capsid proteins. 273
48228 398647 pfam05066 HARE-HTH HB1, ASXL, restriction endonuclease HTH domain. A winged helix-turn-helix domain present in the plant HB1, vertebrate ASXL, the H. pylori restriction endonuclease HpyAIII (HgrA), the RNA polymerase delta subunit (RpoE) of Gram positive bacteria and several restriction endonucleases. The domain is distinguished by the presence of a conserved one-turn helix between helix-3 and the preceding conserved turn. Its diverse architectures in eukaryotic species with extensive gene body methylation is suggestive of a chromatin function. The genetic interaction of the HARE-HTH containing ASXL with the methyl cytosine hydroxylating Tet2 protein is suggestive of a role for the domain in discriminating sequences with DNA modifications such as hmC. Bacterial versions include fusions to diverse restriction endonucleases, and a DNA glycosylase where it may play a similar role in detecting modified DNA. Certain bacterial versions of the HARE-HTH domain show fusions to the helix-hairpin-helix domain of the RNA polymerase alpha subunit and the HTH domains found in regions 3 and 4 of the sigma factors. These versions are predicted to function as a novel inhibitor of the binding of RNA polymerase to transcription start sites, similar to the Bacillus delta protein. 71
48229 252986 pfam05067 Mn_catalase Manganese containing catalase. Catalases are important antioxidant metalloenzymes that catalyze disproportionation of hydrogen peroxide, forming dioxygen and water. Two families of catalases are known, one having a heme cofactor, and this family that is a structurally distinct family containing non-heme manganese. 283
48230 398648 pfam05068 MtlR Mannitol repressor. The mannitol operon of Escherichia coli, encoding the mannitol-specific enzyme II of the phosphotransferase system (MtlA) and mannitol phosphate dehydrogenase (MtlD) contains an additional downstream open reading frame which encodes the mannitol repressor (MtlR). 164
48231 398649 pfam05069 Phage_tail_S Phage virion morphogenesis family. Protein S of phage P2 is thought to be involved in tail completion and stable head joining. 148
48232 398650 pfam05071 NDUFA12 NADH ubiquinone oxidoreductase subunit NDUFA12. This family contains the 17.2 kD subunit of complex I (NDUFA12) and its homologs. The family also contains a second related eukaryotic protein of unknown function. 78
48233 282873 pfam05072 Herpes_UL43 Herpesvirus UL43 protein. UL43 genes are expressed with true-late (gamma2) kinetics and have been identified as a virion tegument component. 373
48234 368276 pfam05073 Baculo_p24 Baculovirus P24 capsid protein. Baculovirus P24 is associated with nucleocapsids of budded and polyhedra-derived virions. 165
48235 368277 pfam05075 DUF684 Protein of unknown function (DUF684). This family contains several uncharacterized proteins from Caenorhabditis elegans. The GO annotation suggests that the protein is involved in nematode larval development and has a positive regulation on growth rate. 338
48236 398651 pfam05076 SUFU Suppressor of fused protein (SUFU). SUFU, encoding the human orthologue of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signaling. SUFU exerts its repressor role by physically interacting with GLI proteins in both the cytoplasm and the nucleus. SUFU has been found to be a tumor-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signaling pathway. Genomic contextual analysis of bacterial SUFU versions revealed that they are immunity proteins against diverse nuclease toxins in polymorphic toxin systems. 171
48237 282877 pfam05077 DUF678 Protein of unknown function (DUF678). This family contains several poxvirus proteins of unknown function. 73
48238 398652 pfam05078 DUF679 Protein of unknown function (DUF679). This family contains several uncharacterized plant proteins. 163
48239 398653 pfam05079 DUF680 Protein of unknown function (DUF680). This family contains several uncharacterized proteins which seem to be found exclusively in Rhizobium loti. 54
48240 113835 pfam05080 DUF681 Protein of unknown function (DUF681). This family contains several uncharacterized beak and feather disease virus proteins. 101
48241 282879 pfam05081 DUF682 Protein of unknown function (DUF682). This family consists of several uncharacterized baculovirus proteins. 157
48242 398654 pfam05082 Rop-like Rop-like. This family contains several uncharacterized bacterial proteins. These proteins are found in nitrogen fixation operons so are likely to play some role in this process. They consist of two alpha helices which are joined by a four residue linker. The helices form an antiparallel bundle and cross towards their termini. They are likely to form a rod-like dimer. They have structural similarity to the regulatory protein Rop, pfam01815. 60
48243 398655 pfam05083 LST1 LST-1 protein. B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia. 78
48244 398656 pfam05084 GRA6 Granule antigen protein (GRA6). This family contains the granule antigen protein GRA6 which is found in the parasitic protozoa Toxoplasma gondii and Neospora caninum. GRA6 protein plays an important role in the antigenicity and pathogenicity in these organisms. 216
48245 282883 pfam05085 DUF685 Protein of unknown function (DUF685). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete). There is some evidence to suggest that the proteins may be outer surface proteins. 265
48246 252996 pfam05086 Dicty_REP Dictyostelium (Slime Mold) REP protein. This family consists of REP proteins from Dictyostelium (Slime molds). REP protein is likely involved in transcription regulation and control of DNA replication, specifically amplification of plasmid at low copy numbers. The formation of homomultimers may be required for their regulatory activity. 910
48247 398657 pfam05087 Rota_VP2 Rotavirus VP2 protein. Rotavirus particles consist of three concentric proteinaceous capsid layers. The innermost capsid (core) is made of VP2. The genomic RNA and the two minor proteins VP1 and VP3 are encapsidated within this layer. The N-terminus of rotavirus VP2 is necessary for the encapsidation of VP1 and VP3. 882
48248 398658 pfam05088 Bac_GDH Bacterial NAD-glutamate dehydrogenase. This family consists of several bacterial proteins which are closely related to NAD-glutamate dehydrogenase found in Streptomyces clavuligerus. Glutamate dehydrogenases (GDHs) are a broadly distributed group of enzymes that catalyze the reversible oxidative deamination of glutamate to ketoglutarate and ammonia. 1530
48249 398659 pfam05089 NAGLU Alpha-N-acetylglucosaminidase (NAGLU) tim-barrel domain. Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This central domain has a tim barrel fold. 333
48250 398660 pfam05090 VKG_Carbox Vitamin K-dependent gamma-carboxylase. Using reduced vitamin K, oxygen, and carbon dioxide, gamma-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the gamma position of those amino acids. In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the gamma-glutamyl carboxylase. 431
48251 398661 pfam05091 eIF-3_zeta Eukaryotic translation initiation factor 3 subunit 7 (eIF-3). This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals. 530
48252 282889 pfam05092 PIF Per os infectivity. This is a family of dsDNA Baculovirus proteins. It is required for the infectivity of the OBs or occlusion bodies. It is a structural protein of the ODV envelope required only in the first steps of per os larva infection, as viruses being produced in cells expressing the gene for this protein but not containing it in their genomes are able to produce successful infections. Baculoviruses are large DNA viruses that infect arthropods, mainly members of the order Lepidoptera. In their life cycle, they produce two kinds of particles, a budded, non-occluded virus (BV), which buds out of the infected cell and is responsible for the cell-to-cell transmission of the virus, and an occluded form, the occlusion body (OB), which is responsible for protecting the virus between encounters with larvae. A variable number of virions are included in the para-crystalline structure of the OB, mainly constituted by the virus-encoded polyhedrin protein; these virions are called occlusion body-derived virions or ODVs. 519
48253 398662 pfam05093 CIAPIN1 Cytokine-induced anti-apoptosis inhibitor 1, Fe-S biogenesis. Anamorsin, subsequently named CIAPIN1 for cytokine-induced anti-apoptosis inhibitor 1, in humans is the homolog of yeast Dre2, a conserved soluble eukaryotic Fe-S cluster protein, that functions in cytosolic Fe-S protein biogenesis. It is found in both the cytoplasm and in the mitochondrial intermembrane space (IMS). CIAPIN1 is found to be up-regulated in hepatocellular cancer, is considered to be a downstream effector of the receptor tyrosine kinase-Ras signalling pathway, and is essential in mouse definitive haematopoiesis. Dre2 has been found to interact with the yeast reductase Tah18, forming a tight cytosolic complex implicated in the response to high levels of oxidative stress. 99
48254 282891 pfam05094 LEF-9 Late expression factor 9 (LEF-9). Late expression factor 9 (LEF-9) is one of the primary components of RNA polymerase produced by baculoviruses. LEF-9 is homologous to the largest beta-subunit of prokaryotic DNA-directed RNA polymerase. 493
48255 309989 pfam05095 DUF687 Protein of unknown function (DUF687). This family contains several uncharacterized Chlamydia proteins. 537
48256 398663 pfam05096 Glu_cyclase_2 Glutamine cyclotransferase. This family of enzymes EC:2.3.2.5 catalyze the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes. 240
48257 368284 pfam05097 DUF688 Protein of unknown function (DUF688). This family contains several uncharacterized proteins found in Arabidopsis thaliana. 443
48258 282893 pfam05098 LEF-4 Late expression factor 4 (LEF-4). Late expression factor 4 (LEF-4) is one of the Baculovirus late expression factor proteins. LEF-4 carries out all the enzymatic functions related to mRNA capping. 471
48259 398664 pfam05099 TerB Tellurite resistance protein TerB. This family contains the TerB tellurite resistance proteins from a number of bacteria. 142
48260 282895 pfam05100 Phage_tail_L Phage minor tail protein L. 206
48261 398665 pfam05101 VirB3 Type IV secretory pathway, VirB3-like protein. This family includes the Type IV secretory pathway VirB3 protein, that is found associated with bacterial inner and outer membranes. The family also includes the conjugal transfer protein TrbD family that contains a nucleotide binding motif and may provide energy for the export of DNA or the export of other Trb proteins. 82
48262 309993 pfam05102 Holin_BlyA holin, BlyA family. BlyA is a small holin found in Borrelia circular plasmids that is encoded by a prophage. BlyA contains two largely hydrophobic helices and a highly charged C-terminus and has two transmembrane segments. 61
48263 398666 pfam05103 DivIVA DivIVA protein. The Bacillus subtilis divIVA1 mutation causes misplacement of the septum during cell division, resulting in the formation of small, circular, anucleate mini-cells. Inactivation of divIVA produces a mini-cell phenotype, whereas overproduction of DivIVA results in a filamentation phenotype. These proteins appear to contain coiled-coils. 131
48264 398667 pfam05104 Rib_recp_KP_reg Ribosome receptor lysine/proline rich region. This highly conserved region is found towards the C-terminus of the transmembrane domain. The function is unclear. 139
48265 398668 pfam05105 Phage_holin_4_1 Bacteriophage holin family. Phage holins and lytic enzymes are both necessary for bacterial lysis and virus dissemination. This family also includes TcdE/UtxA involved in toxin secretion in Clostridium difficile. The 1.E.10 family is represented by Bacillus subtilis phi29 holin; 1.E.16 represents the Cph1 holin; and the 1.E.19 family is represented by the Clostridium difficile TcdE holin. Toxigenic strains of C. difficile produce two large toxins (TcdA and TcdB) encoded within a pathogenicity locus. tcdE, encoded between tcdA and tcdB, encodes a 166 aa protein which causes death to E. coli when expressed, and the structure of TcdE resembles holins. TcdE acts on the bacterial membrane. Since TcdA and TcdB lack signal peptides, they may be released via TcdE either prior to or subsequent to cell lysis. 109
48266 398669 pfam05106 Phage_holin_3_1 Phage holin family (Lysis protein S). This family represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. 99
48267 398670 pfam05107 Cas_Cas7 CRISPR-associated protein Cas7. CRISPR-associated protein Cas7 is one of the components of the type I-B cascade-like antiviral defense complex. In Haloferax volcanii, Cas5, Cas6 and Cas7 form a small complex that aids the stability of CRISPR-derived RNA. 252
48268 398671 pfam05108 T7SS_ESX1_EccB Type VII secretion system ESX-1, transport TM domain B. EccB is a family of largely Gram-positive bacterial transmembrane components of the type VII secretion system characterized in Mycobacterium tuberculosis, systems ESX1-5. Translocation of virulent peptides through the membranes is thought to be mediated via a complex that includes EccB, EccC, EccD, EccE, and MycP. EccB, EccC, EccD, and EccE form a stable complex in the mycobacterial cell envelope. 405
48269 282904 pfam05109 Herpes_BLLF1 Herpes virus major outer envelope glycoprotein (BLLF1). This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo. 886
48270 398672 pfam05110 AF-4 AF-4 proto-oncoprotein. This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homolog Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila. 515
48271 398673 pfam05111 Amelin Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. 417
48272 282907 pfam05112 Baculo_p47 Baculovirus P47 protein. This family consists of several Baculovirus P47 proteins. P47 is one of the primary components of Baculovirus encoded RNA polymerase, which initiates transcription from late and very late promoters. 306
48273 310001 pfam05113 DUF693 Protein of unknown function (DUF693). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete). 313
48274 398674 pfam05114 DUF692 Protein of unknown function (DUF692). This family consists of several uncharacterized bacterial proteins. 263
48275 398675 pfam05115 PetL Cytochrome B6-F complex subunit VI (PetL). This family consists of several Cytochrome B6-F complex subunit VI (PetL) proteins found in several plant species. PetL is one of the small subunits which make up the cytochrome b(6)f complex. PetL is strictly required neither for the accumulation nor for the function of cytochrome b6f; in its absence, however, the complex becomes unstable in vivo in aging cells and labile in vitro. It has been suggested that the N-terminus of the protein is likely to lie in the thylakoid lumen. 31
48276 398676 pfam05116 S6PP Sucrose-6F-phosphate phosphohydrolase. This family consists of Sucrose-6F-phosphate phosphohydrolase proteins found in plants and cyanobacteria. Sucrose-6(F)-phosphate phosphohydrolase catalyzes the final step in the pathway of sucrose biosynthesis. 246
48277 398677 pfam05117 DUF695 Family of unknown function (DUF695). Family of uncharacterized bacterial proteins. 129
48278 398678 pfam05118 Asp_Arg_Hydrox Aspartyl/Asparaginyl beta-hydroxylase. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. The structure of proline 3-hydroxylase contains the conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. This family represents the arginine, asparagine and proline hydroxylases. The aspartyl/asparaginyl beta-hydroxylase (EC:1.14.11.16) specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins. 157
48279 398679 pfam05119 Terminase_4 Phage terminase, small subunit. 92
48280 310007 pfam05120 GvpG Gas vesicle protein G. These proteins are involved in the formation of gas vesicles. 80
48281 398680 pfam05121 GvpK Gas vesicle protein K. These proteins are involved in the formation of gas vesicles. 81
48282 398681 pfam05122 SpdB Mobile element transfer protein. These proteins are involved in transferring a group of integrating conjugative DNA elements, such as pSAM2 from Streptomyces ambofaciens. Their precise role is not known. 50
48283 368294 pfam05123 S_layer_N S-layer like family, N-terminal region. 284
48284 368295 pfam05124 S_layer_C S-layer like family, C-terminal region. 221
48285 398682 pfam05125 Phage_cap_P2 Phage major capsid protein, P2 family. 326
48286 377463 pfam05127 Helicase_RecD Helicase. This domain contains a P-loop (Walker A) motif, suggesting that it has ATPase activity, and a Walker B motif. In tRNA(Met) cytidine acetyltransferase (TmcA) it may function as an RNA helicase motor (driven by ATP hydrolysis) which delivers the wobble base to the active centre of the GCN5-related N-acetyltransferase (GNAT) domain. It is found in the bacterial exodeoxyribonuclease V alpha chain (RecD), which has 5'-3' helicase activity. It is structurally similar to the motor domain 1A in other SF1 helicases. 175
48287 368297 pfam05128 DUF697 Domain of unknown function (DUF697). Family of bacterial hypothetical proteins that is sometimes associated with GTPase domains. 162
48288 398683 pfam05129 Elf1 Transcription elongation factor Elf1 like. This family of short proteins contains a putative zinc binding domain with four conserved cysteines. ELF1 has been identified as a transcription elongation factor in Saccharomyces cerevisiae. 77
48289 398684 pfam05130 FlgN FlgN protein. This family includes the FlgN protein and export chaperone involved in flagellar synthesis. 141
48290 398685 pfam05131 Pep3_Vps18 Pep3/Vps18/deep orange family. This region is found in a number of proteins identified as involved in Golgi function and vacuolar sorting. The molecular function of this region is unknown. The members of this family contain a C-terminal ring finger domain. 147
48291 398686 pfam05132 RNA_pol_Rpc4 RNA polymerase III RPC4. Specific subunit for Pol III, the tRNA specific polymerase. 138
48292 398687 pfam05133 Phage_prot_Gp6 Phage portal protein, SPP1 Gp6-like. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. During SPP1 morphogenesis, Gp6 participates in the procapsid assembly reaction. This family also includes the old Pfam family Phage_min_cap (PF05126). 416
48293 282928 pfam05134 T2SSL Type II secretion system (T2SS), protein L. This family consists of Type II secretion system protein L sequences from several Gram-negative (diderm) bacteria. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for extracellular secretion of a number of different proteins, including proteases and toxins. This pathway supports secretion of proteins across the cell envelope in two distinct steps, in which the second step, involving translocation through the outer membrane, is assisted by at least 13 different gene products. T2SL is predicted to contain a large cytoplasmic domain represented by this family and has been shown to interact with the autophosphorylating cytoplasmic membrane protein T2SE. It is thought that the tri-molecular complex of T2SL, T2SE (pfam00437) and T2SM (pfam04612) might be involved in regulating the opening and closing of the secretion pore and/or transducing energy to the site of outer membrane translocation. 230
48294 398688 pfam05135 Phage_connect_1 Phage gp6-like head-tail connector protein. This family of proteins contain head-tail connector proteins related to gp6 from bacteriophage HK97. A structure of this protein shows similarity to gp15 a well characterized connector component of bacteriophage SPP1. 94
48295 398689 pfam05136 Phage_portal_2 Phage portal protein, lambda family. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage capsid and the tail proteins. 343
48296 398690 pfam05137 PilN Fimbrial assembly protein (PilN). 77
48297 398691 pfam05138 PaaA_PaaC Phenylacetic acid catabolic protein. This family includes proteins such as PaaA and PaaC that are part of a catabolic pathway of phenylacetic acid. These proteins may form part of a dioxygenase complex. 258
48298 398692 pfam05139 Erythro_esteras Erythromycin esterase. This family includes erythromycin esterase enzymes that confer resistance to the erythromycin antibiotic. 312
48299 398693 pfam05140 ResB ResB-like family. This family includes both ResB and cytochrome c biogenesis proteins. Mutations in ResB indicate that they are essential for growth. ResB is predicted to be a transmembrane protein. 446
48300 398694 pfam05141 DIT1_PvcA Pyoverdine/dityrosine biosynthesis protein. DIT1 is involved in synthesising dityrosine. Dityrosine is a sporulation-specific component of the yeast ascospore wall that is essential for the resistance of the spores to adverse environmental conditions. Pyoverdine biosynthesis protein PvcA is involved in the biosynthesis of pyoverdine, a cyclized isocyano derivative of tyrosine. It has a modified Rossmann fold. 270
48301 398695 pfam05142 DUF702 Domain of unknown function (DUF702). Members of this family are found in various putative zinc finger proteins. 154
48302 368303 pfam05144 Phage_CRI Phage replication protein CRI. The phage replication protein CRI, also known as Gene II, is essential for DNA replication. 234
48303 398696 pfam05145 AbrB Transition state regulatory protein AbrB. Bacillus subtilis respond to a multitude of environmental stimuli by using transcription factors called transition state regulators (TSRs). They play an essential role in cell survival by regulating spore formation, competence, and biofilm development. AbrB is one of the most known TSRs, acting as a pleiotropic regulator for over 60 different genes where it directly binds to their promoter or regulatory regions. Many other genes are indirectly controlled by AbrB since it is a regulator of other regulatory proteins, including ScoC, Abh, SinR and SigH. Hence, AbrB is considered a global regulatory protein controlling processes such as Bacillus subtilis growth and cell division as well as production of extracellular degradative enzymes, nitrogen utilization and amino acid metabolism, motility, synthesis of antibiotics and their resistance determinants, development of competence, transport systems, oxidative stress response, phosphate metabolism, cell surface components and sporulation. AbrB is a tetramer consisting of identical 94 residue monomers. Its DNA-binding function resides solely in the N-terminal domain (AbrBN) of 53 residues. It does not recognize a well-defined DNA base-pairing sequence; instead, it appears to target a very weak pseudo consensus nucleotide sequence, TGGNA-5bp-TGGNA, which allows it to be rather promiscuous in binding. The N-terminal domains of very similar sequences are present in two more Bacillus subtilis proteins, Abh and SpoVT. Mutagenesis studies suggest that the role of the C-terminal domain is in forming multimers. 312
48304 398697 pfam05147 LANC_like Lanthionine synthetase C-like protein. Lanthionines are thioether bridges that are putatively generated by dehydration of Ser and Thr residues followed by addition of cysteine residues within the peptide. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have role in the immune surveillance of these organs. Lanthionines are found in lantibiotics, which are peptide-derived, post-translationally modified antimicrobials produced by several bacterial strains. This region contains seven internal repeats. 350
48305 398698 pfam05148 Methyltransf_8 Hypothetical methyltransferase. This family consists of several uncharacterized eukaryotic proteins which are related to methyltransferases pfam01209. 214
48306 368306 pfam05149 Flagellar_rod Paraflagellar rod protein. This family consists of several eukaryotic paraflagellar rod component proteins. The eukaryotic flagellum represents one of the most complex macromolecular structures found in any organism and contains more than 250 proteins. In addition to its locomotive role, the flagellum is probably involved in nutrient uptake since receptors for host low-density lipoproteins are localized on the flagellar membrane as well as on the flagellar pocket membrane. 287
48307 398699 pfam05150 Legionella_OMP Legionella pneumophila major outer membrane protein precursor. This family consists of major outer membrane protein precursors from Legionella pneumophila. 279
48308 398700 pfam05151 PsbM Photosystem II reaction centre M protein (PsbM). This family consists of several Photosystem II reaction centre M proteins (PsbM) from plants and cyanobacteria. During the photosynthetic light reactions in the thylakoid membranes of cyanobacteria, algae, and plants, photosystem II (PSII), a multi-subunit membrane protein complex, catalyzes oxidation of water to molecular oxygen and reduction of plastoquinone. 31
48309 282943 pfam05152 DUF705 Protein of unknown function (DUF705). This family contains several uncharacterized Baculovirus proteins. 302
48310 398701 pfam05153 MIOX Myo-inositol oxygenase. MIOX is the enzyme myo-inositol oxygenase. It catalyzes the first committed step in the glucuronate-xylulose pathway. It is a di-iron oxygenase with a key role in inositol metabolism. The structure reveals a monomeric, single-domain protein with a mostly helical fold that is distantly related to the diverse HD domain superfamily. The structural core is of five alpha-helices that contribute six ligands, four His and two Asp, to the di-iron centre where the two iron atoms are bridged by a putative hydroxide ion and one of the Asp ligands. The substrate, myo-inositol, is bound in a terminal substrate-binding mode to a di-iron cluster. Within the structure are two additional proteinous lids that cover and shield the enzyme's active site. 249
48311 377473 pfam05154 TM2 TM2 domain. This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts. 50
48312 282946 pfam05155 Phage_X Phage X family. This family is the product of Gene X. The function of this protein is unknown. 88
48313 398702 pfam05157 T2SSE_N Type II secretion system (T2SS), protein E, N-terminal domain. This domain is found at the N-terminus of members of the Type II secretion system protein E. Proteins in this subfamily are typically involved in Type 4 pilus biogenesis, though some are involved in other processes; for instance aggregation in Myxococcus xanthus. The structure of this domain is now known. 109
48314 398703 pfam05158 RNA_pol_Rpc34 RNA polymerase Rpc34 subunit. Subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment. 317
48315 282949 pfam05159 Capsule_synth Capsule polysaccharide biosynthesis protein. This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB. 310
48316 398704 pfam05160 DSS1_SEM1 DSS1/SEM1 family. This family contains the breast cancer tumor suppressor BRCA2-interacting protein DSS1 and its homolog SEM1, both of which are short acidic proteins. DSS1 has been shown to be a conserved component of the Rae1 mediated mRNA export pathway in Schizosaccharomyces pombe. 56
48317 398705 pfam05161 MOFRL MOFRL family. MOFRL(multi-organism fragment with rich Leucine) family exists in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases. 106
48318 398706 pfam05162 Ribosomal_L41 Ribosomal protein L41. 24
48319 398707 pfam05163 DinB DinB family. DNA damage-inducible (din) genes in Bacillus subtilis are coordinately regulated and together compose a global regulatory network that has been termed the SOS-like or SOB regulon. This family includes DinB from B. subtilis. 163
48320 398708 pfam05164 ZapA Cell division protein ZapA. ZapA is a cell division protein which interacts with FtsZ. FtsZ is part of a mid-cell cytokinetic structure termed the Z-ring that recruits a hierarchy of fission related proteins early in the bacterial cell cycle. The interaction of FtsZ with ZapA drives its polymerization and promotes FtsZ filament bundling thereby contributing to the spatio-temporal tuning of the Z-ring. 85
48321 147379 pfam05165 GCH_III GTP cyclohydrolase III. GTP cyclohydrolase (GCH) III from Methanocaldococcus jannaschii catalyzes the conversion of GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy). The reaction requires two bound magnesium ions for the catalysis and is activated by monovalent cations such as potassium and ammonium. The enzyme is a tetramer of identical subunits; each monomer is composed of an N- and a C-terminal domain that adopt nearly superimposable structures, suggesting that the protein has arisen by gene duplication. The family is found in archaea and bacteria. 246
48322 398709 pfam05166 YcgL YcgL domain. This family of proteins, formerly called DUF709, includes the E. coli gene ycgL. Homologs of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure. 73
48323 398710 pfam05167 DUF711 Uncharacterized ACR (DUF711). The proteins in this family are functionally uncharacterized. The proteins are around 450 amino acids long. It is likely that this family represents a group of glycerol-3-phosphate dehydrogenases. 402
48324 398711 pfam05168 HEPN HEPN domain. 117
48325 282957 pfam05170 AsmA AsmA family. The AsmA gene product is involved in the assembly of outer membrane proteins in Escherichia coli. AsmA mutations were isolated as extragenic suppressors of an OmpF assembly mutant. AsmA may have a role in LPS biogenesis. 608
48326 398712 pfam05171 HemS Haemin-degrading HemS.ChuX domain. The Yersinia enterocolitica O:8 periplasmic binding-protein-dependent transport system consisted of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family). The structure for HemS has been solved and consists of a tandem repeat of this domain. 128
48327 398713 pfam05172 Nup35_RRM Nup53/35/40-type RNA recognition motif. Members of this family belong to the nuclear pore complex, NPC, the only gateway between the nucleus and the cytoplasm. The NPC consists of several subcomplexes each one of which is made up of multiple copies of several individual Nup, Nic or Sec protein subunits. In yeast, this Nup or nucleoporin subunit is numbered Nup53, Nup40 in Schizo. pombe and in vertebrates as Nup35. This subunit forms part of the inner ring within the membrane and interacts directly with Nup-Ndc1, considered to be an anchor for the NPC in the pore membrane. This region of the Nup is the RNA-recognition region. 81
48328 398714 pfam05173 DapB_C Dihydrodipicolinate reductase, C-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The C-terminal domain of DapB has been proposed to be the substrate-binding domain. 134
48329 398715 pfam05175 MTS Methyltransferase small domain. This domain is found in ribosomal RNA small subunit methyltransferase C as well as other methyltransferases. 170
48330 398716 pfam05176 ATP-synt_10 ATP10 protein. ATP 10 is essential for the assembly of a functional mitochondrial ATPase complex. 255
48331 398717 pfam05177 RCSD RCSD region. Proteins containing this region include C.elegans UNC-89. This region is found repeated in UNC-89 and shows conservation in prolines, lysines and glutamic acids. Proteins with RCSD are involved in muscle M-line assembly, but the function of this region RCSD is not clear. 101
48332 398718 pfam05178 Kri1 KRI1-like family. The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. 101
48333 398719 pfam05179 CDC73_C RNA pol II accessory factor, Cdc73 family, C-terminal. CDC73 is an RNA polymerase II accessory factor, and forms part of the Paf1 complex that has roles in post-initiation events. More specifically, crystal structure analysis shows the C-terminus to be a Ras-like domain that adopts a fold that is highly similar to GTPases of the Ras superfamily. The canonical nucleotide binding pocket is altered in CDC73, and there is no nucleotide ligand, but it contributes to histone methylation and Paf1C recruitment to active genes. Thus together with Rtf1 it combines to couple the Paf1 complex to elongating polymerase. The family has been added to the P-loop clan on the basis of the topology of the b-stranded core, and its similarity to Ras. 155
48334 398720 pfam05180 zf-DNL DNL zinc finger. The domain is named after a short C-terminal motif of D(N/H)L. This domain is a novel zinc-finger protein essential for protein import into mitochondria. 64
48335 398721 pfam05181 XPA_C XPA protein C-terminus. 51
48336 398722 pfam05182 Fip1 Fip1 motif. This short motif is about 40 amino acids in length. The Fip1 protein is a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Th1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor. 43
48337 398723 pfam05183 RdRP RNA dependent RNA polymerase. This family of proteins are eukaryotic RNA dependent RNA polymerases. These proteins are involved in post transcriptional gene silencing where they are thought to amplify dsRNA templates. 554
48338 398724 pfam05184 SapB_1 Saposin-like type B, region 1. 36
48339 398725 pfam05185 PRMT5 PRMT5 arginine-N-methyltransferase. The human homolog of yeast Skb1 (Shk1 kinase-binding protein 1) is PRMT5, an arginine-N-methyltransferase. These proteins appear to be key mitotic regulators. They play a role in Jak signalling in higher eukaryotes. 171
48340 398726 pfam05186 Dpy-30 Dpy-30 motif. This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins, hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerization motif analogous to pfam02197 (Bateman A pers obs). 42
48341 398727 pfam05187 ETF_QO Electron transfer flavoprotein-ubiquinone oxidoreductase, 4Fe-4S. Electron-transfer flavoprotein-ubiquinone oxidoreductase (ETF-QO) in the inner mitochondrial membrane accepts electrons from electron-transfer flavoprotein which is located in the mitochondrial matrix and reduces ubiquinone in the mitochondrial membrane. The two redox centers in the protein, FAD and a [4Fe4S] cluster, are present in a 64-kDa monomer. 103
48342 398728 pfam05188 MutS_II MutS domain II. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS and resembles RNAse-H-like domains (see pfam00075). 133
48343 398729 pfam05189 RTC_insert RNA 3'-terminal phosphate cyclase (RTC), insert domain. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyze the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids. 102
48344 398730 pfam05190 MutS_IV MutS family domain IV. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam00488. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds in part with globular domain IV, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in. 92
48345 398731 pfam05191 ADK_lid Adenylate kinase, active site lid. Comparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain is clearly structurally homologous to Zinc-finger domains. However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or whether the lid-bound zinc serves a purely structural function. 36
48346 398732 pfam05192 MutS_III MutS domain III. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in. 290
48347 398733 pfam05193 Peptidase_M16_C Peptidase M16 inactive domain. Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp. 181
48348 398734 pfam05194 UreE_C UreE urease accessory protein, C-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. The C-terminal region of members of this family contains a His rich Nickel binding site. 86
48349 398735 pfam05195 AMP_N Aminopeptidase P, N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families. 121
48350 398736 pfam05196 PTN_MK_N PTN/MK heparin-binding protein family, N-terminal domain. 57
48351 398737 pfam05197 TRIC TRIC channel. TRIC (trimeric intracellular cation) channels are differentially expressed in intracellular stores in animal cell types. TRIC subtypes contain three proposed transmembrane segments, and form homo-trimers with a bullet-like structure. Electrophysiological measurements with purified TRIC preparations identify a monovalent cation-selective channel. 185
48352 398738 pfam05198 IF3_N Translation initiation factor IF-3, N-terminal domain. 70
48353 398739 pfam05199 GMC_oxred_C GMC oxidoreductase. This domain is found associated with pfam00732. 143
48354 398740 pfam05201 GlutR_N Glutamyl-tRNAGlu reductase, N-terminal domain. 149
48355 398741 pfam05202 Flp_C Recombinase Flp protein. 254
48356 368334 pfam05203 Hom_end_hint Hom_end-associated Hint. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the latter protein-splicing domain. 444
48357 368335 pfam05204 Hom_end Homing endonuclease. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. 110
48358 398742 pfam05205 COMPASS-Shg1 COMPASS (Complex proteins associated with Set1p) component shg1. The Shg1 subunit is one of the eight subunits of the COMPASS complex, complex associated with SET1, conserved in yeasts and in other eukaryotes up to humans. It is associated with the region of the Set1 protein that is N-terminal to the C-terminus, ie Set1-560-900. The function of Shg1 seems to be to slightly inhibit histone 3 lysine 4 (H3K4) di- and tri-methylation, and it is a pioneer protein. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for silencing of genes close to the telomeres of chromosomes. 100
48359 398743 pfam05206 TRM13 Methyltransferase TRM13. This is a family of eukaryotic proteins which are responsible for 2'-O-methylation of tRNA at position 4. TRM13 shows no sequence similarity to other known methyltransferases. 256
48360 398744 pfam05207 zf-CSL CSL zinc finger. This is a zinc binding motif which contains four cysteine residues which chelate zinc. This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine. 55
48361 398745 pfam05208 ALG3 ALG3 protein. The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man. ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase. 358
48362 282993 pfam05209 MinC_N Septum formation inhibitor MinC, N-terminal domain. In Escherichia coli FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ. 104
48363 398746 pfam05210 Sprouty Sprouty protein (Spry). This family consists of eukaryotic Sprouty protein homologs. Sprouty proteins have been revealed as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases. The sprouty gene has been found to be expressed in the brain, cochlea, nasal organs, teeth, salivary gland, lungs, digestive tract, kidneys and limb buds in mice. 101
48364 398747 pfam05211 NLBH Neuraminyllactose-binding hemagglutinin precursor (NLBH). This family is comprised of several flagellar sheath adhesin proteins also called neuraminyllactose-binding hemagglutinin precursor (NLBH) or N-acetylneuraminyllactose-binding fibrillar hemagglutinin receptor-binding subunits. NLBH is found exclusively in Helicobacter which are gut colonising bacteria and bind to sialic acid rich macromolecules present on the gastric epithelium. 229
48365 398748 pfam05212 DUF707 Protein of unknown function (DUF707). This family consists of several uncharacterized proteins from Arabidopsis thaliana. 292
48366 282997 pfam05213 Corona_NS2A Coronavirus NS2A protein. This family contains a number of corona virus non-structural proteins of unknown function. The family also includes a polymerase protein fragment from Berne virus and does not seem to be related to the pfam04753 Coronavirus NS2 family. This family is part of the 2H phosphoesterase superfamily. 267
48367 282998 pfam05214 Baculo_p33 Baculovirus P33. This family consists of a series of Baculovirus P33 protein homologs of unknown function. 247
48368 253093 pfam05215 Spiralin Spiralin. This family consists of Spiralin proteins found in spiroplasma bacteria. Spiroplasmas are helically shaped pathogenic bacteria related to the mycoplasmas. The surface of spiroplasma bacteria is crowded with the membrane-anchored lipoprotein spiralin whose structure and function are unknown although its cellular function is thought to be a structural and mechanical one rather than a catalytic one. 239
48369 398749 pfam05216 UNC-50 UNC-50 family. Gmh1p from S. cerevisiae is located in the Golgi membrane and interacts with ARF exchange factors. 223
48370 310081 pfam05217 STOP STOP protein. Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. Additionally knowledge of STOPs function and properties may help in the treatment of neuroleptics in illnesses such as schizophrenia, currently thought to result from synaptic defects. 35
48371 368344 pfam05218 DUF713 Protein of unknown function (DUF713). This family contains several proteins of unknown function from C.elegans. The GO annotation suggests that this protein is involved in nematode development and has a positive regulation on growth rate. 185
48372 253097 pfam05219 DREV DREV methyltransferase. This family contains DREV protein homologs from several eukaryotes. The function of this protein is unknown. However, these proteins appear to be related to other methyltransferases (Bateman A pers obs). 265
48373 283002 pfam05220 MgpC MgpC protein precursor. This family contains several Mycoplasma MgpC like-proteins. 226
48374 398750 pfam05221 AdoHcyase S-adenosyl-L-homocysteine hydrolase. 461
48375 398751 pfam05222 AlaDh_PNT_N Alanine dehydrogenase/PNT, N-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases. 136
48376 398752 pfam05223 MecA_N NTF2-like N-terminal transpeptidase domain. The structure of this domain from MecA is known and is found to be similar to that found in NTF2 pfam02136. This domain seems unlikely to have an enzymatic function, and its role remains unknown. 117
48377 398753 pfam05224 NDT80_PhoG NDT80 / PhoG like DNA-binding family. This family includes the DNA-binding region of NDT80 as well as PhoG and its homologs. The family contains VIB-1. VIB-1 is thought to be a regulator of conidiation in Neurospora crassa and shares a region of similarity to PHOG, a possible phosphate nonrepressible acid phosphatase in Aspergillus nidulans. It has been found that vib-1 is not the structural gene for nonrepressible acid phosphatase, but rather may regulate nonrepressible acid phosphatase activity. 180
48378 283007 pfam05225 HTH_psq helix-turn-helix, Psq domain. This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence. 45
48379 398754 pfam05226 CHASE2 CHASE2 domain. CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE2 domains are not known at this time. 266
48380 398755 pfam05227 CHASE3 CHASE3 domain. CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE3 domains are not known at this time. 138
48381 398756 pfam05228 CHASE4 CHASE4 domain. CHASE4 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in Archaea and in predicted diguanylate cyclases/phosphodiesterases in Bacteria. Environmental factors that are recognized by CHASE4 domains are not known at this time. 142
48382 398757 pfam05229 SCPU Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins, as well as a family of secreted pili proteins involved in motility and biofilm formation. This family is distantly related to fimbrial proteins. 139
48383 398758 pfam05230 MASE2 MASE2 domain. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins. 85
48384 398759 pfam05231 MASE1 MASE1. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins. This entry also includes members of the 8 transmembrane UhpB type (8TMR-UT) domain family. 299
48385 398760 pfam05232 BTP Chlorhexidine efflux transporter. This family represents a conserved pair of two transmembrane alpha-helices. All members carry the two pairs of TMs. BTP is a form of drug efflux pump that actively transports chlorhexidine out of the cell. Chlorhexidine, a bisbiguanide antimicrobial agent, is commonly used as an antiseptic and disinfectant in hospitals, and there is an increasing problem with resistance to it in some pathogenic bacteria. BTP is localized in the cytoplasmic membrane. 63
48386 398761 pfam05233 PHB_acc PHB accumulation regulatory domain. The proteins this domain is found in are typically involved in regulating polymer accumulation in bacteria, particularly poly-beta-hydroxybutyrate (PHB). The N-terminal region is likely to be the DNA-binding domain (pfam07879) while this domain probably binds PHB (personal obs:C Yeats). 40
48387 113985 pfam05234 UAF_Rrn10 UAF complex subunit Rrn10. The protein Rrn10 has been identified as a component of the Upstream Activating Factor (UAF), an RNA polymerase I (pol I) specific transcription stimulatory factor 122
48388 398762 pfam05235 CHAD CHAD domain. The CHAD domain is an alpha-helical domain functionally associated with the pfam01928 domains. It has conserved histidines that may chelate metals. 164
48389 398763 pfam05236 TAF4 Transcription initiation factor TFIID component TAF4 family. This region of similarity is found in Transcription initiation factor TFIID component TAF4. 259
48390 398764 pfam05238 CENP-N Kinetochore protein CHL4 like. CHL4 is a protein involved in chromosome segregation. It is a component of the central kinetochore which mediates the attachment of the centromere to the mitotic spindle. CENP-N is one of the components that assembles onto the CENP-A-nucleosome-associated (NAC) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. 403
48391 398765 pfam05239 PRC PRC-barrel domain. The PRC-barrel is an all beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea are composed entirely of a single stand-alone copy of the domain. 78
48392 398766 pfam05240 APOBEC_C APOBEC-like C-terminal domain. This domain is found at the C-termini of the Apolipoprotein B mRNA editing enzyme. 78
48393 398767 pfam05241 EBP Emopamil binding protein. Emopamil binding protein (EBP) is a gene that encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homolog of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities. 113
48394 368355 pfam05242 GLYCAM-1 Glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). This family consists of the lactophorin precursors proteose peptone component 3 (PP3) and glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). GlyCAM-1 functions as a ligand for L-selectin, a saccharide-binding protein on the surface of circulating leukocytes, and mediates the trafficking of blood-borne lymphocytes into secondary lymph nodes. In this context, sulphatation of the carbohydrates of GlyCAM-1 has been shown to be a critical structural requirement to be recognized by L-selectin. GlyCAM-1 is also expressed in pregnant and lactating mammary glands of mouse and in an unknown site in the lung, in the bovine uterus and rat cochlea. 135
48395 113995 pfam05244 Brucella_OMP2 Brucella outer membrane protein 2. This family consists of several outer membrane proteins (2a and 2b) from brucella bacteria. Brucellae are Gram-negative, facultative intracellular bacteria that can infect many species of animals and man. 240
48396 283023 pfam05246 DUF735 Protein of unknown function (DUF735). This family consists of several uncharacterized Borrelia burgdorferi (Lyme disease spirochete) proteins of unknown function. 211
48397 398768 pfam05247 FlhD Flagellar transcriptional activator (FlhD). This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in E. coli, this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 102
48398 310102 pfam05248 Adeno_E3A Adenovirus E3A. 104
48399 398769 pfam05250 UPF0193 Uncharacterized protein family (UPF0193). This family of proteins is functionally uncharacterized. 215
48400 398770 pfam05251 Ost5 Oligosaccharyltransferase subunit 5. Eukaryotic N-glycosylation is catalyzed in the ER lumen, where the enzyme oligosaccharyltransferase (OTase) transfers donor glycans from a dolichol pyrophosphate (DolP) carrier (Lipid-linked oligosaccharide; LLO) to polypeptides. The yeast OTase is a hetero-oligomeric complex composed of essential (Ost1, Ost2, Wbp1, Stt3, and Swp1) and nonessential (Ost3, Ost4, Ost5, and Ost6) subunits. This domain family is found in Ost5. The precise function of this subunit is not known, however Ost5 appears to form a sub-complex with Ost1, and this sub-complex associates with the catalytic Stt3 subunit of OTase. Down regulation of Ost5 resulted in a limited effect on glycosylation and no effect on the stability of Ost1 or Stt3 subunits. 73
48401 398771 pfam05253 zf-U11-48K U11-48K-like CHHC zinc finger. This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. 24
48402 398772 pfam05254 UPF0203 Uncharacterized protein family (UPF0203). This family of proteins is functionally uncharacterized. 69
48403 398773 pfam05255 UPF0220 Uncharacterized protein family (UPF0220). This family of proteins is functionally uncharacterized. 160
48404 398774 pfam05256 UPF0223 Uncharacterized protein family (UPF0223). This family of proteins is functionally uncharacterized. 85
48405 398775 pfam05257 CHAP CHAP domain. This domain corresponds to an amidase function. Many of these proteins are involved in cell wall metabolism of bacteria. This domain is found at the N-terminus of Escherichia coli gss, where it functions as a glutathionylspermidine amidase EC:3.5.1.78. This domain is found to be the catalytic domain of PlyCA. CHAP is the amidase domain of bifunctional Escherichia coli glutathionylspermidine synthetase/amidase, and it catalyzes the hydrolysis of Gsp (glutathionylspermidine) into glutathione and spermidine. 83
48406 398776 pfam05258 DUF721 Protein of unknown function (DUF721). This family contains several actinomycete proteins of unknown function. 88
48407 283034 pfam05259 Herpes_UL1 Herpesvirus glycoprotein L. This family consists of several herpesvirus glycoprotein L or UL1 proteins. Glycoprotein L is known to form a complex with glycoprotein H but the function of this complex is poorly understood. 103
48408 398777 pfam05261 Tra_M TraM protein, DNA-binding. The TraM protein is an essential part of the DNA transfer machinery of the conjugative resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential transfer protein TraM has at least two functions. First, a functional TraM protein was found to be required for normal levels of transfer gene expression. Second, experimental evidence was obtained that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo. Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal nucleoprotein complex and the membrane-bound DNA transfer apparatus. 126
48409 114011 pfam05262 Borrelia_P83 Borrelia P83/100 protein. This family consists of several Borrelia P83/P100 antigen proteins. 489
48410 283036 pfam05263 DUF722 Protein of unknown function (DUF722). This family contains several bacteriophage proteins of unknown function. 129
48411 310112 pfam05264 CfAFP Choristoneura fumiferana antifreeze protein (CfAFP). This family consists of several antifreeze proteins from the insect Choristoneura fumiferana (Spruce budworm). Antifreeze proteins (AFPs) and antifreeze glycoproteins (AFGPs) are present in many organisms that must survive sub-zero temperatures. These proteins bind to seed ice crystals and inhibit their growth through an adsorption-inhibition mechanism. 137
48412 283037 pfam05265 DUF723 Protein of unknown function (DUF723). This family contains several uncharacterized proteins from Neisseria meningitidis. These proteins may have a role in DNA-binding. 60
48413 398778 pfam05266 DUF724 Protein of unknown function (DUF724). This family contains several uncharacterized proteins found in Arabidopsis thaliana and other plants. This region is often found associated with Agenet domains and may contain coiled-coil. 188
48414 398779 pfam05267 DUF725 Protein of unknown function (DUF725). This family contains several Drosophila proteins of unknown function. 121
48415 147458 pfam05268 GP38 Phage tail fibre adhesin Gp38. This family contains several Gp38 proteins from T-even-like phages. Gp38, together with a second phage protein, gp57, catalyzes the organisation of gp37 but is absent from the phage particle. Gp37 is responsible for receptor recognition. 261
48416 398780 pfam05269 Phage_CII Bacteriophage CII protein. This family consists of several phage CII regulatory proteins. CII plays a key role in the lysis-lysogeny decision in bacteriophage lambda and related phages. 79
48417 398781 pfam05270 AbfB Alpha-L-arabinofuranosidase B (ABFB) domain. This family consists of several fungal alpha-L-arabinofuranosidase B proteins. L-Arabinose is a constituent of plant-cell-wall polysaccharides. It is found in a polymeric form in L-arabinan, in which the backbone is formed by 1,5-a-linked l-arabinose residues that can be branched via 1,2-a- and 1,3-a-linked l-arabinofuranose side chains. AbfB hydrolyzes 1,5-a, 1,3-a and 1,2-a linkages in both oligosaccharides and polysaccharides, which contain terminal non-reducing l-arabinofuranoses in side chains. 137
48418 147459 pfam05271 Tobravirus_2B Tobravirus 2B protein. This family consists of several tobravirus 2B proteins. It is known that the 2B protein is required for transmission by both Paratrichodorus pachydermus and P. anemones nematodes. 117
48419 398782 pfam05272 VirE Virulence-associated protein E. This family contains several bacterial virulence-associated protein E like proteins. These proteins contain a P-loop motif. 217
48420 398783 pfam05273 Pox_RNA_Pol_22 Poxvirus RNA polymerase 22 kDa subunit. This family consists of several poxvirus DNA-dependent RNA polymerase 22 kDa subunits. 184
48421 283043 pfam05274 Baculo_E25 Occlusion-derived virus envelope protein E25. This family consists of several nucleopolyhedrovirus occlusion-derived virus envelope E25 proteins. 190
48422 398784 pfam05275 CopB Copper resistance protein B precursor (CopB). This family consists of several bacterial copper resistance proteins. Copper is essential and serves as cofactor for more than 30 enzymes yet a surplus of copper is toxic and leads to radical formation and oxidation of biomolecules. Therefore, copper homeostasis is a key requisite for every organism. CopB serves to extrude copper when it approaches toxic levels. 207
48423 398785 pfam05276 SH3BP5 SH3 domain-binding protein 5 (SH3BP5). This family consists of several eukaryotic SH3 domain-binding protein 5 or c-Jun N-terminal kinase (JNK)-interacting proteins (SH3BP5 or Sab). Sab binds to and serves as a substrate for JNK in vitro, and has been found to interact with the Src homology 3 (SH3) domain of Bruton's tyrosine kinase (Btk). Inspection of the sequence of Sab reveals the presence of two putative mitogen-activated protein kinase interaction motifs (KIMs) similar to that found in the JNK docking domain of the c-Jun transcription factor, and four potential serine-proline JNK phosphorylation sites in the C-terminal half of the molecule. 231
48424 368366 pfam05277 DUF726 Protein of unknown function (DUF726). This family consists of several uncharacterized eukaryotic proteins. 341
48425 253129 pfam05278 PEARLI-4 Arabidopsis phospholipase-like protein (PEARLI 4). This family contains several phospholipase-like proteins from Arabidopsis thaliana which are homologous to PEARLI 4. 234
48426 191249 pfam05279 Asp-B-Hydro_N Aspartyl beta-hydroxylase N-terminal region. This family includes the N-terminal regions of the junctin, junctate and aspartyl beta-hydroxylase proteins. Junctate is an integral ER/SR membrane calcium binding protein, which comes from an alternatively spliced form of the same gene that generates aspartyl beta-hydroxylase and junctin. Aspartyl beta-hydroxylase catalyzes the post-translational hydroxylation of aspartic acid or asparagine residues contained within epidermal growth factor (EGF) domains of proteins. 240
48427 398786 pfam05280 FlhC Flagellar transcriptional activator (FlhC). This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC combines with FlhD to form a regulatory complex in E. coli, this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 171
48428 368368 pfam05281 Secretogranin_V Neuroendocrine protein 7B2 precursor (Secretogranin V). The neuroendocrine protein 7B2 has a critical role in the proteolytic conversion and activation of proPC2, the enzyme responsible for the proteolytic conversion of many peptide hormone precursors. The 7B2 protein acts as an intracellular binding protein for proPC2, facilitates its maturation, and is required for its enzymatic activity. Processing of many important peptide precursors does not occur in 7B2 nulls. 7B2 null mice exhibit a unique form of Cushing's disease with many atypical symptoms, such as hypoglycemia. 230
48429 398787 pfam05282 AAR2 AAR2 protein. This family consists of several eukaryotic AAR2-like proteins. The yeast protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth. 355
48430 368370 pfam05283 MGC-24 Multi-glycosylated core protein 24 (MGC-24), sialomucin. This family consists of several MGC-24 (or Cd164 antigen) proteins from eukaryotic organisms. MGC-24/CD164 is a sialomucin expressed in many normal and cancerous tissues. In humans, soluble and transmembrane forms of MGC-24 are produced by alternative splicing. 140
48431 398788 pfam05284 DUF736 Protein of unknown function (DUF736). This family consists of several uncharacterized bacterial proteins of unknown function. 98
48432 398789 pfam05285 SDA1 SDA1. This family consists of several SDA1 protein homologs. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the actin cytoskeleton. The protein is essential for cell viability and is localized in the nucleus. 288
48433 283053 pfam05287 PMG PMG protein. This family consists of several mouse anagen-specific protein mKAP13 (PMG1 and PMG2). PMG1 and 2 contain characteristic repeats reminiscent of the keratin-associated proteins (KAPs). Both genes are expressed in growing hair follicles in skin as well as in sebaceous and eccrine sweat glands. Interestingly, expression is also detected in the mammary epithelium where it is limited to the onset of the pubertal growth phase and is independent of ovarian hormones. Their broad, developmentally controlled expression pattern, together with their unique amino acid composition, demonstrate that pmg-1 and pmg-2 constitute a novel KAP gene family participating in the differentiation of all epithelial cells forming the epidermal appendages. 180
48434 283054 pfam05288 Pox_A3L Poxvirus A3L Protein. This family consists of several poxvirus A3L or A2_5L proteins. 70
48435 368373 pfam05289 BLYB Borrelia hemolysin accessory protein. This family consists of several borrelia hemolysin accessory proteins (BLYB). BLYB was thought to be an accessory protein, which was proposed to comprise a hemolysis system but it is now thought that BlyA and BlyB function instead as a prophage-encoded holin or holin-like system. 120
48436 368374 pfam05290 Baculo_IE-1 Baculovirus immediate-early protein (IE-0). The Autographa californica multinucleocapsid nuclear polyhedrosis virus (AcMNPV) ie-1 gene product (IE-1) is thought to play a central role in stimulating early viral transcription. IE-1 has been demonstrated to activate several early viral gene promoters and to negatively regulate the promoters of two other AcMNPV regulatory genes, ie-0 and ie-2. It is thought that IE-1 negatively regulates the expression of certain genes by binding directly, or as part of a complex, to promoter regions containing a specific IE-1-binding motif (5'-ACBYGTAA-3') near their mRNA start sites. 141
48437 398790 pfam05291 Bystin Bystin. Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at invasion front in the placenta from early pregnancy. This family also includes the yeast protein ENP1. ENP1 is an essential protein in Saccharomyces cerevisiae and is localized in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing as enp1 defective yeast cannot synthesize 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits. 289
48438 398791 pfam05292 MCD Malonyl-CoA decarboxylase C-terminal domain. This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidized. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK). 245
48439 114041 pfam05293 ASFV_L11L African swine fever virus (ASFV) L11L protein. L11L is an integral membrane protein of the African swine fever virus (ASFV) which is expressed late in the virus replication cycle. The protein is thought to be non-essential for growth in vitro and for virus virulence in domestic swine. 78
48440 253137 pfam05294 Toxin_5 Scorpion short toxin. This family contains various secreted scorpion short toxins and seems to be unrelated to pfam00451. 32
48441 253138 pfam05295 Luciferase_N Luciferase/LBP N-terminal domain. This family consists of a presumed N-terminal domain that is conserved between dinoflagellate luciferase and luciferin binding proteins. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence and luciferin binding protein (LBP) is known to bind to luciferin (the substrate for luciferase) to stop it reacting with the enzyme and therefore switching off the bioluminescence function. The expression of these two proteins is controlled by a circadian clock at the translational level, with synthesis and degradation occurring on a daily basis. However, this domain is not the catalytic part of the protein. It has been suggested that this region may mediate an interaction between LBP and Luciferase or their association with the vacuolar membrane. 82
48442 283059 pfam05296 TAS2R Taste receptor protein (TAS2R). This family consists of several forms of eukaryotic taste receptor proteins (TAS2Rs). TAS2Rs are G protein-coupled receptors expressed in subsets of taste receptor cells of the tongue and palate epithelia in humans and mice, and are organized in the genome in clusters. The proteins are genetically linked to loci that influence bitter perception in mice and humans. 303
48443 283060 pfam05297 Herpes_LMP1 Herpesvirus latent membrane protein 1 (LMP1). This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1 of EBV is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N-terminus and a long cytoplasmic carboxy tail of 200 amino acids. EBV latent membrane protein 1 (LMP1) is essential for EBV-mediated transformation and has been associated with several cases of malignancies. EBV-like viruses in Cynomolgus monkeys (Macaca fascicularis) have been associated with high lymphoma rates in immunosuppressed monkeys 386
48444 114046 pfam05298 Bombinin Bombinin. This family consists of Bombinin and Maximin proteins from Bombina maxima (Chinese red belly toad). Two groups of antimicrobial peptides have been isolated from skin secretions of Bombina maxima. Peptides in the first group, named maximins 1, 2, 3, 4 and 5, are structurally related to bombinin-like peptides (BLPs). Unlike BLPs, sequence variations in maximins occurred all through the molecules. In addition to the potent antimicrobial activity, cytotoxicity against tumor cells and spermicidal action of maximins, maximin 3 possessed a significant anti-HIV activity. Maximins 1 and 3 have been found to be toxic to mice. Peptides in the second group, termed maximins H1, H2, H3 and H4, are homologous with bombinin H peptides. 141
48445 398792 pfam05299 Peptidase_M61 M61 glycyl aminopeptidase. Glycyl aminopeptidase is an unusual peptidase in that it has a preference for substrates with an N-terminal glycine or alanine. These proteins are found in Bacteria and in Archaea. 116
48446 398793 pfam05300 DUF737 Protein of unknown function (DUF737). This family consists of several uncharacterized mammalian proteins of unknown function. 142
48447 398794 pfam05301 Acetyltransf_16 GNAT acetyltransferase, Mec-17. Mec-17 is the protein product of one of the 18 genes required for the development and function of the touch receptor neuron for gentle touch. Mec-17 is specifically required for maintaining the differentiation of the touch receptor. The family shares all the residue-motifs characteristic of Gcn5-related acetyl-transferases, though the exact function is still unknown. 176
48448 310131 pfam05302 DUF720 Protein of unknown function (DUF720). This family consists of several uncharacterized Chlamydia proteins of unknown function. 128
48449 398795 pfam05303 DUF727 Protein of unknown function (DUF727). This family consists of several uncharacterized eukaryotic proteins of unknown function. 103
48450 283066 pfam05304 DUF728 Protein of unknown function (DUF728). This family consists of several uncharacterized tobravirus proteins of unknown function. 139
48451 398796 pfam05305 DUF732 Protein of unknown function (DUF732). This family consists of several uncharacterized Mycobacterium tuberculosis and leprae proteins of unknown function. 72
48452 368381 pfam05306 DUF733 Protein of unknown function (DUF733). This family consists of several uncharacterized Drosophila melanogaster proteins of unknown function. 85
48453 398797 pfam05307 Bundlin Bundlin. This family consists of several bundlin proteins from E. coli. Bundlin is a type IV pilin protein that is the only known structural component of enteropathogenic Escherichia coli bundle-forming pili (BFP). BFP play a role in virulence, antigenicity, autoaggregation, and localized adherence to epithelial cells. These proteins contain an N-terminal methylation motif. 60
48454 398798 pfam05308 Mito_fiss_reg Mitochondrial fission regulator. In eukaryotes, this family of proteins induces mitochondrial fission. 242
48455 398799 pfam05309 TraE TraE protein. This family consists of several bacterial sex pilus assembly and synthesis proteins (TraE). Conjugal transfer of plasmids from donor to recipient cells is a complex process in which a cell-to-cell contact plays a key role. Many genes encoded by self-transmissible plasmids are required for various processes of conjugation, including pilus formation, stabilisation of mating pairs, conjugative DNA metabolism, surface exclusion and regulation of transfer gene expression. The exact function of the TraE protein is unknown. 182
48456 191255 pfam05310 Tenui_NS3 Tenuivirus movement protein. This family of ssRNA negative-strand crop plant tenuivirus proteins appears to combine PV2, NS2, NS3, and PV3 proteins. Plant viruses encode specific proteins known as movement proteins (MPs) to control their spread through plasmodesmata (PD) in walls between cells as well as from leaf to leaf via vascular-dependent transport. During this movement process, the virally encoded MPs interact with viral genomes for transport from the viral replication sites to the PDs in the walls of infected cells along the cytoskeleton and/or endoplasmic reticulum (ER) network. The virus is then thought to move through the PDs in the form of MP-associated ribonucleoprotein complexes or as virions. The NS3 protein appears to function as an RNA silencing suppressor. 186
48457 253146 pfam05311 Baculo_PP31 Baculovirus 33KDa late protein (PP31). Autographa californica nuclear polyhedrosis virus (AcMNPV) pp31 is a nuclear phosphoprotein that accumulates in the virogenic stroma, which is the viral replication centre in the infected-cell nucleus, binds to DNA, and serves as a late expression factor. 267
48458 283071 pfam05313 Pox_P21 Poxvirus P21 membrane protein. The P21 membrane protein of vaccinia virus, encoded by the A17L (or A18L) gene, has been reported to localize on the inner of the two membranes of the intracellular mature virus (IMV). It has also been shown that P21 acts as a membrane anchor for the externally located fusion protein P14 (A27L gene). 189
48459 283072 pfam05314 Baculo_ODV-E27 Baculovirus occlusion-derived virus envelope protein EC27. This family consists of several baculovirus occlusion-derived virus envelope proteins (EC27 or E27). The ODV-E27 protein has distinct functional characteristics compared to cellular and viral cyclins. Depending on the cdk protein, and perhaps other viral or cellular proteins yet to be described, the kinase-EC27 complex may have either cyclin B- or D-like activity. 295
48460 398800 pfam05315 ICEA ICEA Protein. This family consists of several ICEA proteins from Helicobacter pylori. Helicobacter pylori infection causes gastritis and peptic ulcer disease, and is classified as a definite carcinogen of gastric cancer. ICEA1 is speculated to be associated with peptic ulcer disease. 218
48461 283074 pfam05316 VAR1 Mitochondrial ribosomal protein (VAR1). This family consists of the yeast mitochondrial ribosomal proteins VAR1. Mitochondria possess their own ribosomes responsible for the synthesis of a small number of proteins encoded by the mitochondrial genome. In yeast the two ribosomal RNAs and a single ribosomal protein, VAR1, are products of mitochondrial genes, and the remaining approximately 80 ribosomal proteins are encoded in the nucleus. VAR1 along with 15S rRNA are necessary for the formation of mature 37S subunits. 337
48462 368384 pfam05317 Thermopsin Thermopsin. This family consists of several thermopsin proteins from archaebacteria. Thermopsin is a thermostable acid protease which is capable of hydrolysing the following bonds: Leu-Val, Leu-Tyr, Phe-Phe, Phe-Tyr, and Tyr-Thr. The specificity of thermopsin is therefore similar to that of pepsin, that is, it prefers large hydrophobic residues at both sides of the scissile bond. 253
48463 253150 pfam05318 Tombus_movement Tombusvirus movement protein. This family consists of several Tombusvirus movement proteins. These proteins allow the virus to move from cell-to-cell and allow host-specific systemic spread. 68
48464 398801 pfam05320 Pox_RNA_Pol_19 Poxvirus DNA-directed RNA polymerase 19 kDa subunit. This family contains several DNA-directed RNA polymerase 19 kDa polypeptides. The Poxvirus DNA-directed RNA polymerase (EC: 2.7.7.6) catalyzes DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time. 164
48465 398802 pfam05321 HHA Haemolysin expression modulating protein. This family consists of haemolysin expression modulating protein (HHA) homologs. YmoA and Hha are highly similar bacterial proteins downregulating gene expression in Yersinia enterocolitica and Escherichia coli, respectively. 56
48466 283078 pfam05322 NinE NINE Protein. This family consists of NINE proteins from several bacteriophages and from E. coli. 58
48467 283079 pfam05323 Pox_A21 Poxvirus A21 Protein. This family consists of several poxvirus A21 proteins. 111
48468 398803 pfam05324 Sperm_Ag_HE2 Sperm antigen HE2. This family consists of several variants of the human and chimpanzee sperm antigen proteins (HE2 and EP2 respectively). The EP2 gene codes for a family of androgen-dependent, epididymis-specific secretory proteins. The EP2 gene uses alternative promoters and differential splicing to produce a family of variant messages. The translated putative protein variants differ significantly from each other. Some of these putative proteins have similarity to beta-defensins, a family of antimicrobial peptides. 70
48469 114071 pfam05325 DUF730 Protein of unknown function (DUF730). This family consists of several uncharacterized Arabidopsis thaliana proteins of unknown function. 122
48470 398804 pfam05326 SVA Seminal vesicle autoantigen (SVA). This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins. Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19-kDa secretory glycoprotein suppresses the motility of spermatozoa by interacting with phospholipid. PIP has several known functions. In saliva, this protein plays a role in host defense by binding to microorganisms such as Streptococcus. PIP is an aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its interaction with CD4. 124
48471 398805 pfam05327 RRN3 RNA polymerase I specific transcription initiation factor RRN3. This family consists of several eukaryotic proteins which are homologous to the yeast RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in Saccharomyces cerevisiae. 543
48472 368389 pfam05328 CybS CybS, succinate dehydrogenase cytochrome B small subunit. This family consists of several eukaryotic succinate dehydrogenase [ubiquinone] cytochrome B small subunit, mitochondrial precursor (CybS) proteins. SDHD encodes the small subunit (cybS) of cytochrome b in succinate-ubiquinone oxidoreductase (mitochondrial complex II). Mitochondrial complex II is involved in the Krebs cycle and in the aerobic electron transport chain. It contains four proteins. The catalytic core consists of a flavoprotein and an iron-sulfur protein; these proteins are anchored to the mitochondrial inner membrane by the large subunit of cytochrome b (cybL) and cybS, which together comprise the heme-protein cytochrome b. Mutations in the SDHD gene can lead to hereditary paraganglioma, characterized by the development of benign, vascularised tumors in the head and neck. 133
48473 398806 pfam05331 DUF742 Protein of unknown function (DUF742). This family consists of several uncharacterized Streptomyces proteins as well as one from Mycobacterium tuberculosis. The function of these proteins is unknown. 114
48474 283085 pfam05332 DUF743 Protein of unknown function (DUF743). This family consists of several uncharacterized Calicivirus proteins of unknown function. 113
48475 398807 pfam05334 DUF719 Protein of unknown function (DUF719). This family consists of several eukaryotic proteins of unknown function. 189
48476 398808 pfam05335 DUF745 Protein of unknown function (DUF745). This family consists of several uncharacterized Drosophila melanogaster proteins of unknown function. 180
48477 398809 pfam05336 rhaM L-rhamnose mutarotase. This family contains L-rhamnose mutarotase which is a glycosyl hydrolase that converts the monosaccharide L-rhamnopyranose from the alpha to the beta stereoisomer. In Escherichia coli this enzyme is the product of the rhaM gene (also known as yiiL). The tertiary structure has been solved, in complex with L-rhamnose, and the catalytic mechanism determined. His22 is the proton donor. The enzyme naturally exists as a dimer. 100
48478 398810 pfam05337 CSF-1 Macrophage colony stimulating factor-1 (CSF-1). Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption. Structurally these are short-chain 4-helical cytokines. 140
48479 283089 pfam05338 DUF717 Protein of unknown function (DUF717). This family consists of several herpesvirus proteins of unknown function. 55
48480 283090 pfam05339 DUF739 Protein of unknown function (DUF739). This family contains several bacteriophage proteins. Some of the proteins in this family have been labeled putative cro repressor proteins. 69
48481 398811 pfam05340 DUF740 Protein of unknown function (DUF740). This family consists of several uncharacterized plant proteins of unknown function. 610
48482 283092 pfam05341 PIF6 Per os infectivity factor 6. Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV) Orf68 (also known as per os infectivity factor 6, PIF6 or ac68). PIF6 is present in both the budded virus (BV) and the occluded-derived virus (ODV). The ac68 gene overlaps the lef3 gene which encodes the single-stranded DNA binding protein, and knockout experiments of ac68 have to ensure that a functional lef3 gene is present. In ac68KO experiments, viral DNA replication and BV levels were unaffected as were mortality rates if caterpillars were injected with BV directly into the hemolymph bypassing the gut. However, in oral bioassays the ac68KO occlusion bodies failed to kill larvae, indicating that PIF6 is a per os infectivity factor. 105
48483 398812 pfam05342 Peptidase_M26_N M26 IgA1-specific Metallo-endopeptidase N-terminal region. These peptidases, which cleave mammalian IgA, are found in Gram-positive bacteria. Often found associated with pfam00746, they may be attached to the cell wall. 250
48484 398813 pfam05343 Peptidase_M42 M42 glutamyl aminopeptidase. These peptidases are found in Archaea and Bacteria. The example in Lactococcus lactis, PepA, aids growth on milk. Pyrococcus horikoshii contains a thermostable de-blocking aminopeptidase member of this family used commercially for N-terminal protein sequencing. 292
48485 283095 pfam05344 DUF746 Domain of Unknown Function (DUF746). This is a short conserved region found in some transposons. Structural modelling suggests this domain may bind nucleic acids. 64
48486 398814 pfam05345 He_PIG Putative Ig domain. This alignment represents the conserved core region of a ~90-residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei. 95
48487 398815 pfam05346 DUF747 Eukaryotic membrane protein family. This family is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH but this has since been disputed. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton. 311
48488 398816 pfam05347 Complex1_LYR Complex 1 protein (LYR family). Proteins in this family include an accessory subunit of the higher eukaryotic NADH dehydrogenase complex. In Saccharomyces cerevisiae, the Isd11 protein has been shown to play a role in Fe/S cluster biogenesis in mitochondria. We have named this family LYR after a highly conserved tripeptide motif close to the N-terminus of these proteins. 59
48489 398817 pfam05348 UMP1 Proteasome maturation factor UMP1. UMP1 is a short-lived chaperone present in the precursor form of the 20S proteasome and absent in the mature complex. UMP1 is required for the correct assembly and enzymatic activation of the proteasome. UMP1 seems to be degraded by the proteasome upon its formation. 115
48490 398818 pfam05349 GATA-N GATA-type transcription activator, N-terminal. GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations are often associated with certain congenital human disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly homologous and have two tandem zinc fingers. The classical GATA transcription factors function as transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This family represents the N-terminal domain of the family of GATA transcription activators. 176
48491 398819 pfam05350 GSK-3_bind Glycogen synthase kinase-3 binding. Glycogen synthase kinase-3 (GSK-3) sequentially phosphorylates four serine residues on glycogen synthase (GS), in the sequence SxxxSxxxSxxx-SxxxS(p), by recognising and phosphorylating the first serine in the sequence motif SxxxS(P) (where S(p) represents a phosphoserine). Interaction of GSK-3 with a peptide derived from GSK-3 binding protein (this family) prevents GSK-3 interaction with Axin. This interaction thereby inhibits the Axin-dependent phosphorylation of beta-catenin by GSK-3. 237
48492 398820 pfam05351 GMP_PDE_delta GMP-PDE, delta subunit. GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE)(EC:3.1.4.35). The precise function of PDE delta subunit in the rod specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. PDE delta subunit is thought to be a specific soluble transport factor for certain prenylated proteins and Arl2-GTP a regulator of PDE-mediated transport. 154
48493 147504 pfam05352 Phage_connector Phage Connector (GP10). The head-tail connector of bacteriophage phi29 is composed of twelve 36 kDa subunits with 12-fold symmetry. It is the central component of a rotary motor that packages the genomic dsDNA into pre-formed proheads. This motor consists of the head-tail connector, surrounded by a phi29-encoded, 174-base RNA and a viral ATPase protein. 281
48494 398821 pfam05353 Atracotoxin Delta Atracotoxin. Delta atracotoxin produces potentially fatal neurotoxic symptoms in primates by slowing the inactivation of voltage-gated sodium channels. The structure of atracotoxin comprises a core beta region containing a triple-stranded beta-sheet, a thumb-like extension protruding from the beta region and a C-terminal helix. The beta region contains a cystine knot motif, a feature seen in other neurotoxic polypeptides. 42
48495 283102 pfam05354 Phage_attach Phage Head-Tail Attachment. The phage head-tail attachment protein is required for the joining of phage heads and tails at the last step of morphogenesis. 117
48496 398822 pfam05355 Apo-CII Apolipoprotein C-II. Apolipoprotein C-II (ApoC-II) is the major activator of lipoprotein lipase, a key enzyme in the regulation of triglyceride levels in human serum. 77
48497 398823 pfam05356 Phage_Coat_B Phage Coat protein B. The major coat protein in the capsid of filamentous bacteriophage forms a helical assembly of about 7000 identical protomers, with each protomer comprised of 46 amino acids, after the cleavage of the signal peptide. Each protomer forms a slightly curved helix, and these combine to form a tubular structure that encapsulates the viral DNA. 56
48498 368403 pfam05357 Phage_Coat_A Phage Coat Protein A. Infection of Escherichia coli by filamentous bacteriophages is mediated by the minor phage coat protein A and involves two distinct cellular receptors, the F' pilus and the periplasmic protein TolA. These two receptors are contacted in a sequential manner, such that binding of TolA by the extreme N-terminal domain is conditional on a primary interaction of the second coat protein A domain with the F' pilus. 62
48499 283105 pfam05358 DicB DicB protein. DicB is part of the dic operon, which resides on cryptic prophage Kim. Under normal conditions, expression of dicB is actively repressed. When expression is induced, however, cell division rapidly ceases, and this division block is dependent on MinC with which it interacts. 62
48500 398824 pfam05359 DUF748 Domain of Unknown Function (DUF748). 152
48501 398825 pfam05360 YiaAB yiaA/B two helix domain. This domain consists of two transmembrane helices and a conserved linking section. 53
48502 398826 pfam05361 PP1_inhibitor PKC-activated protein phosphatase-1 inhibitor. Contractility of vascular smooth muscle depends on phosphorylation of myosin light chains, and is modulated by hormonal control of myosin phosphatase activity. Signaling pathways activate kinases such as PKC or Rho-dependent kinases that phosphorylate the myosin phosphatase inhibitor protein called CPI-17. Phosphorylation of CPI-17 at Thr-38 enhances its inhibitory potency 1000-fold, creating a molecular switch for regulating contraction. 141
48503 368405 pfam05362 Lon_C Lon protease (S16) C-terminal proteolytic domain. The Lon serine proteases must hydrolyze ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops. 205
48504 398827 pfam05363 Herpes_US12 Herpesvirus US12 family. US12 is a key factor in the evasion of cellular immune response against HSV-infected cells. Specific inhibition of the transporter associated with antigen processing (TAP) by US12 prevents peptide transport into the endoplasmic reticulum and subsequent loading of major histocompatibility complex (MHC) class I molecules. US12 is comprised of three helices and is associated with cellular membranes. 82
48505 283111 pfam05364 SecIII_SopE_N Salmonella type III secretion SopE effector N-terminus. Salmonella typhimurium employs a type III secretion system to inject bacterial toxins into the host cell cytosol. These toxins transiently activate Rho family GTP-binding protein-dependent signaling cascades to induce cytoskeletal rearrangements. SopE, one of these toxins, can activate Cdc42 in a Dbl-like fashion via its C-terminal GEF domain pfam07487. This family represents the N-terminal region of SopE. The function of this domain is unknown. 74
48506 398828 pfam05365 UCR_UQCRX_QCR9 Ubiquinol-cytochrome C reductase, UQCRX/QCR9 like. The UQCRX/QCR9 protein is the 9/10 subunit of complex III, encoding a protein of about 7-kDa. Deletion of QCR9 results in the inability of cells to grow on non-fermentable carbon sources in yeast. 53
48507 368407 pfam05366 Sarcolipin Sarcolipin. Sarcolipin is a 31 amino acid integral membrane protein that regulates Ca-ATPase activity in skeletal muscle. 31
48508 368408 pfam05367 Phage_endo_I Phage endonuclease I. The bacteriophage endonuclease I is a nuclease that is selective for the structure of the four-way Holliday DNA junction. 149
48509 398829 pfam05368 NmrA NmrA-like family. NmrA is a negative transcriptional regulator involved in the post-translational modification of the transcription factor AreA. NmrA is part of a system controlling nitrogen metabolite repression in fungi. This family only contains a few sequences as iteration results in significant matches to other Rossmann fold families. 236
48510 368410 pfam05369 MtmB Monomethylamine methyltransferase MtmB. Monomethylamine methyltransferase of the archaebacterium Methanosarcina barkeri contains a novel amino acid, pyrrolysine, encoded by the termination codon UAG. The structure reveals a homohexamer comprised of individual subunits with a TIM barrel fold. 450
48511 398830 pfam05370 DUF749 Domain of unknown function (DUF749). Archaeal domain of unknown function. This domain has been solved as part of a structural genomics project and comprises segregated helical and anti-parallel beta sheet regions. 87
48512 368412 pfam05371 Phage_Coat_Gp8 Phage major coat protein, Gp8. Class I phage major coat protein Gp8 or B. The coat protein is largely alpha-helix with a slight curve. 52
48513 398831 pfam05372 Delta_lysin Delta lysin family. Delta-lysin is a 26 amino acid, hemolytic peptide toxin secreted by Staphylococcus aureus. It is thought that delta-toxin forms an amphipathic helix upon binding to lipid bilayers. The precise mode of action of delta-lysin is unclear. 25
48514 398832 pfam05373 Pro_3_hydrox_C L-proline 3-hydroxylase, C-terminal. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. The structure contains conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. The structure differs significantly from many other 2-OG oxygenases in possessing a discrete C-terminal helical domain. 101
48515 310168 pfam05374 Mu-conotoxin Mu-Conotoxin. Mu-conotoxins are peptide inhibitors of voltage-sensitive sodium channels. 22
48516 253170 pfam05375 Pacifastin_I Pacifastin inhibitor (LCMII). Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors. 40
48517 398833 pfam05377 FlaC_arch Flagella accessory protein C (FlaC). Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells). 55
48518 398834 pfam05378 Hydant_A_N Hydantoinase/oxoprolinase N-terminal region. This family is found at the N-terminus of the pfam01968 family. 176
48519 283122 pfam05379 Peptidase_C23 Carlavirus endopeptidase. A peptidase involved in auto-proteolysis of a polyprotein from the plant pathogen blueberry scorch carlavirus (BBScV). Corresponds to Merops family C23. 88
48520 398835 pfam05380 Peptidase_A17 Pao retrotransposon peptidase. Corresponds to Merops family A17. These proteins are homologous to aspartic proteinases encoded by retroposons and retroviruses. 162
48521 398836 pfam05381 Peptidase_C21 Tymovirus endopeptidase. Corresponds to Merops family C21. The best-studied plant alpha-like virus proteolytic enzyme is the proteinase of turnip yellow mosaic virus (TYMV). The TYMV replicase protein undergoes auto-cleavage to yield two products. The auto-peptidase activity has been mapped to the central part of this polyprotein. 100
48522 283125 pfam05382 Amidase_5 Bacteriophage peptidoglycan hydrolase. At least one of the members of this family, the Pal protein from the pneumococcal bacteriophage Dp-1 has been shown to be a N-acetylmuramoyl-L-alanine amidase. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside at the N-terminal domain whereas the C-terminal domain binds to the choline residues of the cell wall teichoic acids. This family appears to be related to pfam00877. 142
48523 398837 pfam05383 La La domain. This presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins. The function of this region is uncertain. 59
48524 398838 pfam05384 DegS Sensor protein DegS. This is a small family of Bacillus DegS proteins. The DegS-DegU two-component regulatory system of Bacillus subtilis controls various processes that characterize the transition from the exponential to the stationary growth phase, including the induction of extracellular degradative enzymes, expression of late competence genes and down-regulation of the sigma D regulon. The family also contains one sequence from Thermoanaerobacter tengcongensis which is described as a sensory transduction histidine kinase. 159
48525 283128 pfam05385 Adeno_E4 Mastadenovirus early E4 13 kDa protein. This family consists of human and simian mastadenovirus early E4 13 kDa proteins. Human adenovirus type 9 (Ad9) is unique in eliciting exclusively estrogen-dependent mammary tumors in rats and in not requiring viral E1 region transforming genes for tumorigenicity. E4 codes for an oncoprotein essential for tumorigenesis by Ad9. 108
48526 398839 pfam05386 TEP1_N TEP1 N-terminal domain. This short sequence region is found in four copies at the N-terminus of the TEP1 telomerase component. The functional significance of the region is uncertain. However the conservation of two histidines and a cysteine suggests it is a potential zinc binding domain. 29
48527 310175 pfam05387 Chorion_3 Chorion family 3. This family consists of several Drosophila chorion proteins S36 and S38. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 277
48528 398840 pfam05388 Carbpep_Y_N Carboxypeptidase Y pro-peptide. This family is found at the N-terminus of several carboxypeptidase Y proteins and contains a signal peptide and pro-peptide regions. 126
48529 398841 pfam05389 MecA Negative regulator of genetic competence (MecA). This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription. 168
48530 398842 pfam05390 KRE9 Yeast cell wall synthesis protein KRE9/KNH1. This family contains several KRE9 and KNH1 proteins which are involved in encoding cell surface O glycoproteins, which are required for beta-1,6-glucan synthesis in yeast. 101
48531 398843 pfam05391 Lsm_interact Lsm interaction motif. This short motif is found at the C-terminus of Prp24 proteins and probably interacts with the Lsm proteins to promote U4/U6 formation. 19
48532 398844 pfam05392 COX7B Cytochrome C oxidase chain VIIB. 79
48533 283135 pfam05393 Hum_adeno_E3A Human adenovirus early E3A glycoprotein. This family consists of several early glycoproteins from human adenoviruses. 102
48534 368421 pfam05394 AvrB_AvrC Avirulence protein. This family consists of several avirulence proteins from Pseudomonas syringae and Xanthomonas campestris. 326
48535 398845 pfam05395 DARPP-32 Protein phosphatase inhibitor 1/DARPP-32. This family consists of several mammalian protein phosphatase inhibitor 1 (IPP-1) and dopamine- and cAMP-regulated neuronal phosphoprotein (DARPP-32) proteins. Protein phosphatase inhibitor-1 is involved in signal transduction and is an endogenous inhibitor of protein phosphatase-1. It has been demonstrated that DARPP-32, if phosphorylated, can inhibit protein-phosphatase-1. DARPP-32 has a key role in many neurotransmitter pathways throughout the brain and has been shown to be involved in controlling receptors, ion channels and other physiological factors including the brain's response to drugs of abuse, such as cocaine, opiates and nicotine. DARPP-32 is reciprocally regulated by the two neurotransmitters that are most often implicated in schizophrenia - dopamine and glutamate. Dopamine activates DARPP-32 through the D1 receptor pathway and disables DARPP-32 through the D2 receptor. Glutamate, acting through the N-methyl-d-aspartate receptor, renders DARPP-32 inactive. A mutant form of DARPP-32 has been linked with gastric cancers. 136
48536 147533 pfam05396 Phage_T7_Capsid Phage T7 capsid assembly protein. 123
48537 398846 pfam05397 Med15_fungi Mediator complex subunit 15. GAL11 or MED15 is one of the up to 32 subunits of the Mediator complex which is found from fungi to humans. The Mediator complex interacts with RNA polymerase II and other general transcription factors to form the RNA polymerase II holoenzyme, thereby affecting transcription through targeting of activators and repressors. This family is found in fungi and the small metazoan starlet anemone. 112
48538 310184 pfam05398 PufQ PufQ cytochrome subunit. This family consists of bacterial PufQ proteins. PufQ is required for bacteriochlorophyll biosynthesis, serving a regulatory function in the formation of photosynthetic complexes. 74
48539 368424 pfam05399 EVI2A Ecotropic viral integration site 2A protein (EVI2A). This family contains several mammalian ecotropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumors. 231
48540 398847 pfam05400 FliT Flagellar protein FliT. This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium. 85
48541 398848 pfam05401 NodS Nodulation protein S (NodS). This family consists of nodulation S (NodS) proteins. The products of the rhizobial nodulation genes are involved in the biosynthesis of lipochitin oligosaccharides (LCOs), which are host-specific signal molecules required for nodule formation. NodS is an S-adenosyl-L-methionine (SAM)-dependent methyltransferase involved in N methylation of LCOs. NodS uses N-deacetylated chitooligosaccharides, the products of the NodBC proteins, as its methyl acceptors. 199
48542 398849 pfam05402 PqqD Coenzyme PQQ synthesis protein D (PqqD). This family contains several bacterial coenzyme PQQ synthesis protein D (PqqD) sequences. This protein is required for coenzyme pyrrolo-quinoline-quinone (PQQ) biosynthesis. 64
48543 253181 pfam05403 Plasmodium_HRP Plasmodium histidine-rich protein (HRPII/III). This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum. 218
48544 398850 pfam05404 TRAP-delta Translocon-associated protein, delta subunit precursor (TRAP-delta). This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown. 162
48545 368426 pfam05405 Mt_ATP-synt_B Mitochondrial ATP synthase B chain precursor (ATP-synt_B). The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). 163
48546 398851 pfam05406 WGR WGR domain. This domain is found in a variety of polyA polymerases as well as the E. coli molybdate metabolism regulator and other proteins of unknown function. I have called this domain WGR after the most conserved central motif of the domain. The domain is found in isolation in proteins such as Rhizobium radiobacter Ych and is between 70 and 80 residues in length. I propose that this may be a nucleic acid binding domain. 79
48547 368428 pfam05407 Peptidase_C27 Rubella virus endopeptidase. Corresponds to Merops family C27. Required for processing of the rubella virus replication protein. 171
48548 283147 pfam05408 Peptidase_C28 Foot-and-mouth virus L-proteinase. Corresponds to Merops family C28. Protein fold of the peptidase unit for members of this family resembles that of papain. The leader proteinase of foot and mouth disease virus (FMDV) cleaves itself from the growing polyprotein and also cleaves the host translation initiation factor 4GI (eIF4G), thus inhibiting 5'-cap dependent translation. 201
48549 398852 pfam05409 Peptidase_C30 Coronavirus endopeptidase C30. Corresponds to Merops family C30. These peptidases are involved in viral polyprotein processing in replication. 274
48550 398853 pfam05410 Peptidase_C31 Porcine arterivirus-type cysteine proteinase alpha. Corresponds to Merops family C31. These peptidases are involved in viral polyprotein processing in replication. 105
48551 398854 pfam05411 Peptidase_C32 Equine arteritis virus putative proteinase. These proteins are characterized by a region that has been proposed to have peptidase activity involved in viral polyprotein processing in replication. 127
48552 114153 pfam05412 Peptidase_C33 Equine arterivirus Nsp2-type cysteine proteinase. Corresponds to Merops family C33. These peptidases are involved in viral polyprotein processing in replication. 108
48553 147545 pfam05413 Peptidase_C34 Putative closterovirus papain-like endopeptidase. Corresponds to Merops family C34. Putative closterovirus papain-like endopeptidase from the apple chlorotic leaf spot closterovirus. 92
48554 283149 pfam05414 DUF1717 Viral domain of unknown function (DUF1717). This domain is found in viral proteins of unknown function. 78
48555 283150 pfam05415 Peptidase_C36 Beet necrotic yellow vein furovirus-type papain-like endopeptidase. Corresponds to Merops family C36. This protease is involved in processing the viral polyprotein. 104
48556 253185 pfam05416 Peptidase_C37 Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins. 535
48557 283151 pfam05417 Peptidase_C41 Hepatitis E cysteine protease. Corresponds to MEROPs family C41. This papain-like protease cleaves the viral polyprotein encoded by ORF1 of the hepatitis E virus (HEV). 161
48558 283152 pfam05418 Apo-VLDL-II Apovitellenin I (Apo-VLDL-II). This family consists of several avian apovitellenin I sequences. As part of the avian reproductive effort, large quantities of triglyceride-rich very-low-density lipoprotein (VLDL) particles are transported by receptor-mediated endocytosis into the female germ cells. Although the oocytes are surrounded by a layer of granulosa cells harbouring high levels of active lipoprotein lipase, non-lipolysed VLDL is transported into the yolk. This is because VLDL particles from laying chickens are protected from lipolysis by apolipoprotein (apo)-VLDL-II, a potent dimeric lipoprotein lipase inhibitor. Apo-VLDL-II is produced in the liver and secreted into the blood stream when induced by estrogen production in female birds. 79
48559 398855 pfam05419 GUN4 GUN4-like. In Arabidopsis, GUN4 is required for the functioning of the plastid mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signaling by regulating magnesium-protoporphyrin IX synthesis or trafficking. 138
48560 398856 pfam05420 BCSC_C Cellulose synthase operon protein C C-terminus (BCSC_C). This family contains the C-terminal regions of several bacterial cellulose synthase operon C (BCSC) proteins. BCSC is involved in cellulose synthesis although the exact function of this protein is unknown. 336
48561 398857 pfam05421 DUF751 Protein of unknown function (DUF751). This family contains several plant, cyanobacterial and algal proteins of unknown function. The family is exclusively found in phototrophic organisms and may therefore play a role in photosynthesis (personal obs:Moxon SJ). 60
48562 398858 pfam05422 SIN1 Stress-activated map kinase interacting protein 1 (SIN1). SIN1 is the N-terminus of stress-activated map kinase interacting protein 1 (MAPKAP1 OR SIN1) sequences. This domain is likely to be the Ras-binding domain. The fission yeast Sty1/Spc1 mitogen-activated protein (MAP) kinase is a member of the eukaryotic stress-activated MAP kinase (SAPK) family. Sin1 interacts with Sty1/Spc1. Cells lacking Sin1 display many, but not all, of the phenotypes of cells lacking the Sty1/Spc1 MAP kinase including sterility, multiple stress sensitivity and a cell-cycle delay. Sin1 is phosphorylated after stress but this is not Sty1/Spc1-dependent. The separate CRIM and PH, pleckstrin-homology domains of the full-length SIN1 proteins have been separated into distinct families. 139
48563 283157 pfam05423 Mycobact_memb Mycobacterium membrane protein. This family contains several membrane proteins from Mycobacterium species. 138
48564 398859 pfam05424 Duffy_binding Duffy binding domain. This domain is found in Plasmodium Duffy binding proteins. Plasmodium vivax and Plasmodium knowlesi merozoites invade human erythrocytes that express Duffy blood group surface determinants. The Duffy receptor family is localized in micronemes, an organelle found in all organisms of the phylum Apicomplexa. This family is closely associated on PfEMP1 proteins with PFEMP, pfam03011. 187
48565 398860 pfam05425 CopD Copper resistance protein D. Copper sequestering activity displayed by some bacteria is determined by copper-binding protein products of the copper resistance operon (cop). CopD, together with CopC, perform copper uptake into the cytoplasm. 97
48566 398861 pfam05426 Alginate_lyase Alginate lyase. This family contains several bacterial alginate lyase proteins. Alginate is a family of 1-4-linked copolymers of beta-D-mannuronic acid (M) and alpha-L-guluronic acid (G). It is produced by brown algae and by some bacteria belonging to the genera Azotobacter and Pseudomonas. Alginate lyases catalyze the depolymerization of alginates by beta-elimination, generating a molecule containing 4-deoxy-L-erythro-hex-4-enepyranosyluronate at the nonreducing end. This family adopts an all alpha fold. 274
48567 398862 pfam05427 FIBP Acidic fibroblast growth factor binding (FIBP). Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that is thought to be involved in the intracellular function of aFGF. 360
48568 398863 pfam05428 CRF-BP Corticotropin-releasing factor binding protein (CRF-BP). This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH not only acts as a neurotransmitter and hypophysiotropin, it also acts as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is however evidence that CRH-BP may also exhibit diverse extra and intracellular roles in a cell specific fashion and at specific times in development. 298
48569 398864 pfam05430 Methyltransf_30 S-adenosyl-L-methionine-dependent methyltransferase. This family is a S-adenosyl-L-methionine (SAM)-dependent methyltransferase. It is often found in association with pfam01266, where it is responsible for catalyzing the transfer of a methyl group from S-adenosyl-L-methionine to 5-aminomethyl-2-thiouridine to form 5-methylaminomethyl-2-thiouridine. 124
48570 398865 pfam05431 Toxin_10 Insecticidal Crystal Toxin, P42. Family of Bacillus insecticidal crystal toxins. Strains of Bacillus that have this insecticidal activity use a binary toxin comprised of two proteins, P51 and P42 (this family). Members of this family are highly conserved between strains of different serotypes and phage groups. 169
48571 398866 pfam05432 BSP_II Bone sialoprotein II (BSP-II). Bone sialoprotein (BSP) is a major structural protein of the bone matrix that is specifically expressed by fully-differentiated osteoblasts. The expression of bone sialoprotein (BSP) is normally restricted to mineralized connective tissues of bones and teeth where it has been associated with mineral crystal formation. However, it has been found that ectopic expression of BSP occurs in various lesions, including oral and extraoral carcinomas, in which it has been associated with the formation of microcrystalline deposits and the metastasis of cancer cells to bone. 301
48572 398867 pfam05433 Rick_17kDa_Anti Glycine zipper 2TM domain. This family includes a putative two transmembrane alpha-helical region that contains glycine zipper motifs. This family includes several Rickettsia genus specific 17 kDa surface antigen proteins. 42
48573 398868 pfam05434 Tmemb_9 TMEM9. This family contains several eukaryotic transmembrane proteins which are homologous to human transmembrane protein 9. The TMEM9 gene encodes a 183 amino-acid protein that contains an N-terminal signal peptide, a single transmembrane region, three potential N-glycosylation sites and three conserved cys-rich domains in the N-terminus, but no known functional domains. The protein is highly conserved between species from Caenorhabditis elegans to man and belongs to a novel family of transmembrane proteins. The exact function of TMEM9 is unknown although it has been found to be widely expressed and localized to the late endosomes and lysosomes. Members of this family contain pfam03128 repeats in their N-terminal region. 142
48574 368442 pfam05435 Phi-29_GP3 Phi-29 DNA terminal protein GP3. This family consists of DNA terminal protein GP3 sequences from Phi-29 like bacteriophages. DNA terminal protein GP3 is linked to the 5' ends of both strands of the genome through a phosphodiester bond between the beta-hydroxyl group of a serine residue and the 5'-phosphate of the terminal deoxyadenylate. This protein is essential for DNA replication and is involved in the priming of DNA elongation. 266
48575 398869 pfam05436 MF_alpha_N Mating factor alpha precursor N-terminus. This family contains the N-terminal regions of the Saccharomyces mating factor alpha precursor protein. All proteins in this family contain one or more copies of pfam04648 further toward their C-terminus. 87
48576 398870 pfam05437 AzlD Branched-chain amino acid transport protein (AzlD). This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD is known to be involved in conferring resistance to 4-azaleucine although its exact role is uncertain. 99
48577 398871 pfam05438 TRH Thyrotropin-releasing hormone (TRH). This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localized on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans. 219
48578 398872 pfam05439 JTB Jumping translocation breakpoint protein (JTB). This family contains several jumping translocation breakpoint proteins or JTBs. Jumping translocation (JT) is an unbalanced translocation that comprises amplified chromosomal segments jumping to various telomeres. JTB, located at 1q21, has been found to fuse with the telomeric repeats of acceptor telomeres in a case of JT. hJTB (human JTB) encodes a trans-membrane protein that is highly conserved among divergent eukaryotic species. JT results in a hJTB truncation, which potentially produces an hJTB product devoid of the trans-membrane domain. hJTB is located in a gene-rich region at 1q21, called EDC (Epidermal Differentiation Complex). JTB has also been implicated in prostatic carcinomas. 110
48579 398873 pfam05440 MtrB Tetrahydromethanopterin S-methyltransferase subunit B. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 94
48580 398874 pfam05443 ROS_MUCR ROS/MUCR transcriptional regulator protein. This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti. This gene encodes a 15.5-kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid and is necessary for succinoglycan production. Sinorhizobium meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of alfalfa nodules. MucR from Sinorhizobium meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan. 122
48581 398875 pfam05444 DUF753 Protein of unknown function (DUF753). This family contains sequences which are repeated in several uncharacterized proteins from Drosophila melanogaster. 149
48582 283176 pfam05445 Pox_ser-thr_kin Poxvirus serine/threonine protein kinase. 434
48583 398876 pfam05448 AXE1 Acetyl xylan esterase (AXE1). This family consists of several bacterial acetyl xylan esterase proteins. Acetyl xylan esterases are enzymes that hydrolyze the ester linkages of the acetyl groups in position 2 and/or 3 of the xylose moieties of natural acetylated xylan from hardwood. These enzymes are one of the accessory enzymes which are part of the xylanolytic system, together with xylanases, beta-xylosidases, alpha-arabinofuranosidases and methylglucuronidases; these are all required for the complete hydrolysis of xylan. 316
48584 398877 pfam05449 Phage_holin_3_7 Putative 3TM holin, Phage_holin_3. This is a family of putative proteobacterial phage three-transmembrane-domain holins. 80
48585 310213 pfam05450 Nicastrin Nicastrin. Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesized in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin). A region featured in this family has a fold similar to human transferrin receptor (TfR) and a bacterial aminopeptidase. It is implicated in the pathogenesis of Alzheimer's disease. 227
48586 283180 pfam05451 Phytoreo_Pns Phytoreovirus nonstructural protein Pns10/11. This family consists of Phytoreovirus nonstructural proteins Pns10 and Pns11. Genome segment S11 of rice gall dwarf virus (RGDV), a member of Phytoreovirus encodes a putative protein of 40 kDa that exhibits approximately 37% homology at the amino acid level to the nonstructural proteins Pns10 of rice dwarf and wound tumor viruses, which are other members of Phytoreovirus. 359
48587 147565 pfam05452 Clavanin Clavanin. This family consists of clavanin proteins from the haemocytes of the invertebrate Styela clava, a solitary tunicate. The family is made up of four alpha-helical antimicrobial peptides, clavanins A, B, C and D. The tunicate peptides resemble magainins in size, primary sequence and antibacterial activity. Synthetic clavanin A displays comparable antimicrobial activity to magainins and cecropins. The presence of alpha-helical antimicrobial peptides in the haemocytes of a urochordate suggests that such peptides are primeval effectors of innate immunity in the vertebrate lineage. 80
48588 398878 pfam05453 Toxin_6 BmTXKS1/BmP02 toxin family. This family consists of toxin-like peptides that are isolated from the venom of Buthus martensii Karsch scorpion. The precursor consists of 60 amino acid residues, with a putative signal peptide of 28 residues and an extra residue, and a mature peptide of 31 residues with an amidated C-terminal. The peptides share close homology with other scorpion K+ channel toxins and should present a common three-dimensional fold - the Cysteine -stabilized alphabeta (CSalphabeta) motif. This family acts by blocking small conductance calcium activated potassium ion channels in their victim. 28
48589 398879 pfam05454 DAG1 Dystroglycan (Dystrophin-associated glycoprotein 1). Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in human. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in mouse brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear. 290
48590 310215 pfam05455 GvpH GvpH. This family consists of archaeal GvpH proteins which are thought to be involved in gas vesicle synthesis. 177
48591 398880 pfam05456 eIF_4EBP Eukaryotic translation initiation factor 4E binding protein (EIF4EBP). This family consists of several eukaryotic translation initiation factor 4E binding proteins (EIF4EBP1,2 and 3). Translation initiation in eukaryotes is mediated by the cap structure (m7GpppN, where N is any nucleotide) present at the 5' end of all cellular mRNAs, except organellar. The cap is recognized by eukaryotic initiation factor 4F (eIF4F), which consists of three polypeptides, including eIF4E, the cap-binding protein subunit. The interaction of the cap with eIF4E facilitates the binding of the ribosome to the mRNA. eIF4E activity is regulated in part by translational repressors, 4E-BP1, 4E-BP2 and 4E-BP3 which bind to it and prevent its assembly into eIF4F. 120
48592 283184 pfam05458 Siva Cd27 binding protein (Siva). Siva binds to the CD27 cytoplasmic tail. It has a DD homology region, a box-B-like ring finger, and a zinc finger-like domain. Overexpression of Siva in various cell lines induces apoptosis, suggesting an important role for Siva in the CD27-transduced apoptotic pathway. Siva-1 binds to and inhibits BCL-X(L)-mediated protection against UV radiation-induced apoptosis. Indeed, the unique amphipathic helical region (SAH) present in Siva-1 is required for its binding to BCL-X(L) and sensitising cells to UV radiation. Natural complexes of Siva-1/BCL-X(L) are detected in HUT78 and murine thymocyte, suggesting a potential role for Siva-1 in regulating T cell homeostasis. This family contains both Siva-1 and the shorter Siva-2 lacking the sequence coded by exon 2. It has been suggested that Siva-2 could regulate the function of Siva-1. 173
48593 398881 pfam05459 Herpes_UL69 Herpesvirus transcriptional regulator family. This family includes UL69 and IE63 that are transcriptional regulator proteins. 217
48594 368452 pfam05460 ORC6 Origin recognition complex subunit 6 (ORC6). This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. Despite differences in their structure and sequences among eukaryotic replicators, ORC is a conserved feature of replication initiation in all eukaryotes. ORC-related genes have been identified in organisms ranging from S. pombe to plants to humans. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 288
48595 398882 pfam05461 ApoL Apolipoprotein L. Apo L belongs to the high density lipoprotein family that plays a central role in cholesterol transport. The cholesterol content of membranes is important in cellular processes such as modulating gene transcription and signal transduction both in the adult brain and during neurodevelopment. There are six apo L genes located in close proximity to each other on chromosome 22q12 in humans. 22q12 is a confirmed high-susceptibility locus for schizophrenia and close to the region associated with velocardiofacial syndrome that includes symptoms of schizophrenia. 313
48596 283188 pfam05462 Dicty_CAR Slime mold cyclic AMP receptor. This family consists of cyclic AMP receptor (CAR) proteins from slime molds. CAR proteins are responsible for controlling development in Dictyostelium discoideum. 305
48597 398883 pfam05463 Sclerostin Sclerostin (SOST). This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterized by a generalized hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients. 198
48598 398884 pfam05464 Phi-29_GP4 Phi-29-like late genes activator (early protein GP4). This family consists of phi-29-like late genes activator (or early protein GP4). This protein is thought to be a positive regulator of late transcription and may function as a sigma like component of the host RNA polymerase. 123
48599 310220 pfam05465 Halo_GVPC Halobacterial gas vesicle protein C (GVPC) repeat. This family consists of Halobacterium gas vesicle protein C sequences which are thought to confer stability to the gas vesicle membranes. 32
48600 398885 pfam05466 BASP1 Brain acid soluble protein 1 (BASP1 protein). This family consists of several brain acid soluble protein 1 (BASP1) or neuronal axonal membrane protein NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent calmodulin-binding protein of unknown function. 239
48601 283192 pfam05467 Herpes_U47 Herpesvirus glycoprotein U47. 677
48602 283193 pfam05470 eIF-3c_N Eukaryotic translation initiation factor 3 subunit 8 N-terminus. The largest of the mammalian translation initiation factors, eIF3, consists of at least eight subunits ranging in mass from 35 to 170 kDa. eIF3 binds to the 40 S ribosome in an early step of translation initiation and promotes the binding of methionyl-tRNAi and mRNA. 544
48603 398886 pfam05472 Ter DNA replication terminus site-binding protein (Ter protein). This family contains several bacterial Ter proteins. The Ter protein specifically binds to DNA replication terminus sites on the host and plasmid genome and then blocks progress of the DNA replication fork. 296
48604 368457 pfam05473 UL45 UL45 protein, carbohydrate-binding C-type lectin-like. This family consists of several UL45 proteins. The herpes simplex virus UL45 gene encodes an 18 kDa virion envelope protein whose function remains unknown. It has been suggested that the 18 kDa UL45 gene product is required for efficient growth in the central nervous system at low doses and may play an important role under the conditions of a naturally acquired infection. This family also contains several Varicellovirus UL45 or gene 15 proteins. The Equine herpesvirus 1 UL45 protein represents a type II membrane glycoprotein which has been found to be non-essential for EHV-1 growth in vitro but deletion reduces the viruses' replication efficiency. Studies have shown that UL45 has a C-type lectin-like fold, suggesting that it might have a carbohydrate-binding function. 191
48605 368458 pfam05474 Semenogelin Semenogelin. This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated human semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg). 582
48606 368459 pfam05475 Chlam_vir Pgp3 C-terminal domain. This family consists of Chlamydia virulence proteins which are thought to be required for growth within mammalian cells. The C-terminal domain shows distant homology to the TNF superfamily. 146
48607 368460 pfam05476 PET122 PET122. The nuclear PET122 gene of S. cerevisiae encodes a mitochondrial-localized protein that activates initiation of translation of the mitochondrial mRNA from the COX3 gene, which encodes subunit III of cytochrome c oxidase. 259
48608 398887 pfam05477 SURF2 Surfeit locus protein 2 (SURF2). Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown. 240
48609 398888 pfam05478 Prominin Prominin. The prominins are an emerging family of proteins that among the multispan membrane proteins display a novel topology. Mouse prominin and human prominin (mouse)-like 1 (PROML1) are predicted to contain five membrane spanning domains, with an N-terminal domain exposed to the extracellular space followed by four, alternating small cytoplasmic and large extracellular, loops and a cytoplasmic C-terminal domain. The exact function of prominin is unknown although in humans defects in PROM1, the gene coding for prominin, cause retinal degeneration. 799
48610 398889 pfam05479 PsaN Photosystem I reaction centre subunit N (PSAN or PSI-N). This family contains several Photosystem I reaction centre subunit N (PSI-N) proteins. The protein has no known function although it is localized in the thylakoid lumen. PSI-N is a small extrinsic subunit at the lumen side and is very likely involved in the docking of plastocyanin. 132
48611 398890 pfam05480 Staph_haemo Staphylococcus haemolytic protein. This family consists of several different short Staphylococcal proteins; it contains SLUSH A, B and C proteins as well as haemolysin and gonococcal growth inhibitor. Some strains of the coagulase-negative Staphylococcus lugdunensis produce a synergistic hemolytic activity (SLUSH), phenotypically similar to the delta-hemolysin of S. aureus. Gonococcal growth inhibitor from Staphylococcus acts on the cytoplasmic membrane of the gonococcal cell causing cytoplasmic leakage and, eventually, death. 41
48612 398891 pfam05481 Myco_19_kDa Mycobacterium 19 kDa lipoprotein antigen. Most of the antigens of Mycobacterium leprae and M. tuberculosis that have been identified are members of stress protein families, which are highly conserved throughout many diverse species. Of the M. leprae and M. tuberculosis antigens identified by monoclonal antibodies, all except the 18-kDa M. leprae antigen and the 19-kDa M. tuberculosis antigen are strongly cross-reactive between these two species and are coded within very similar genes. 116
48613 368466 pfam05482 Serendipity_A Serendipity locus alpha protein (SRY-A). The Drosophila serendipity alpha (sry alpha) gene is specifically transcribed at the blastoderm stage, from nuclear cycle 11 to the onset of gastrulation, in all somatic nuclei. SRY-A is required for the cellularisation of the embryo and is involved in the localization of the actin filaments just prior to and during plasma membrane invagination. 542
48614 114219 pfam05483 SCP-1 Synaptonemal complex protein 1 (SCP-1). Synaptonemal complex protein 1 (SCP-1) is the major component of the transverse filaments of the synaptonemal complex. Synaptonemal complexes are structures that are formed between homologous chromosomes during meiotic prophase. 787
48615 398892 pfam05484 LRV_FeS LRV protein FeS4 cluster. This Iron sulphur cluster is found at the N-terminus of some proteins containing pfam01816 repeats. 53
48616 398893 pfam05485 THAP THAP domain. The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes. 74
48617 398894 pfam05486 SRP9-21 Signal recognition particle 9 kDa protein (SRP9). This family consists of several eukaryotic SRP9 proteins. SRP9 together with the Alu-homologous region of 7SL RNA and SRP14 comprise the "Alu domain" of SRP, which mediates pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP. This family also contains the homologous fungal SRP21. 82
48618 398895 pfam05488 PAAR_motif PAAR motif. This motif is found usually in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins. 71
48619 283209 pfam05489 Phage_tail_X Phage Tail Protein X. This domain is found in a family of phage tail proteins. Visual analysis suggests that it is related to pfam01476 (personal obs: C Yeats). The functional annotation of family members further confirms this hypothesis. 60
48620 398896 pfam05491 RuvB_C Holliday junction DNA helicase ruvB C-terminus. The RuvB protein makes up part of the RuvABC revolvasome which catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalyzed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family consists of the C-terminal region of the RuvB protein which is thought to be the helicase DNA-binding domain. 72
48621 398897 pfam05493 ATP_synt_H ATP synthase subunit H. ATP synthase subunit H is an extremely hydrophobic protein of approximately 9 kDa. This subunit may be required for assembly of vacuolar ATPase. 59
48622 398898 pfam05494 MlaC MlaC protein. MlaC is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. This family of proteins is involved in toluene tolerance, which is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. 164
48623 398899 pfam05495 zf-CHY CHY zinc finger. This family of domains are likely to bind to zinc ions. They contain many conserved cysteine and histidine residues. We have named this domain after the N-terminal motif CXHY. This domain can be found in isolation in some proteins, but is also often associated with pfam00097. One of the proteins in this family is a mitochondrial intermembrane space protein called Hot13. This protein is involved in the assembly of small TIM complexes. 76
48624 398900 pfam05496 RuvB_N Holliday junction DNA helicase ruvB N-terminus. The RuvB protein makes up part of the RuvABC revolvasome which catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalyzed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family contains the N-terminal region of the protein. 159
48625 398901 pfam05497 Destabilase Destabilase. Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine. 118
48626 398902 pfam05498 RALF Rapid ALkalinization Factor (RALF). RALF, a 5-kDa ubiquitous polypeptide in plants, arrests root growth and development. 65
48627 398903 pfam05499 DMAP1 DNA methyltransferase 1-associated protein 1 (DMAP1). DNA methylation can contribute to transcriptional silencing through several transcriptionally repressive complexes, which include methyl-CpG binding domain proteins (MBDs) and histone deacetylases (HDACs). The chief enzyme that maintains mammalian DNA methylation, DNMT1, can also establish a repressive transcription complex. The non-catalytic amino terminus of DNMT1 binds to HDAC2 and DMAP1 (for DNMT1 associated protein), and can mediate transcriptional repression. DMAP1 has intrinsic transcription repressive activity, and binds to the transcriptional co-repressor TSG101. DMAP1 is targeted to replication foci through interaction with the far N-terminus of DNMT1 throughout S phase, whereas HDAC2 joins DNMT1 and DMAP1 only during late S phase, providing a platform for how histones may become deacetylated in heterochromatin following replication. 163
48628 368473 pfam05501 DUF755 Domain of unknown function (DUF755). This family is predominated by ORFs from Circoviridae. The function of this family remains to be determined. 122
48629 368474 pfam05502 Dynactin_p62 Dynactin p62 family. Dynactin is a multi-subunit complex and a required cofactor for most, or all, of the cellular processes powered by the microtubule-based motor cytoplasmic dynein. p62 binds directly to the Arp1 subunit of dynactin. 472
48630 283219 pfam05503 Pox_G7 Poxvirus G7-like. 367
48631 398904 pfam05504 Spore_GerAC Spore germination B3/ GerAC like, C-terminal. The GerAC protein of the Bacillus subtilis spore is required for the germination response to L-alanine. Members of this family are thought to be located in the inner spore membrane. Although the function of this family is unclear, they are likely to encode the components of the germination apparatus that respond directly to this germinant, mediating the spore's response. 167
48632 398905 pfam05505 Ebola_NP Ebola nucleoprotein. This family consists of Ebola and Marburg virus nucleoproteins. These proteins are responsible for encapsidation of genomic RNA. It has been found that nucleoprotein DNA vaccines can offer protection from the virus. 717
48633 398906 pfam05506 DUF756 Domain of unknown function (DUF756). This domain is found, normally as a tandem repeat, at the C-terminus of bacterial phospholipase C proteins. 86
48634 398907 pfam05507 MAGP Microfibril-associated glycoprotein (MAGP). This family consists of several mammalian microfibril-associated glycoprotein (MAGP) 1 and 2 proteins. MAGP1 and 2 are components of elastic fibers. MAGP-1 has been proposed to bind a C-terminal region of tropoelastin, the soluble precursor of elastin. MAGP-2 was found to interact with fibrillin-1 and -2, as well as fibulin-1, another component of elastic fibers; this suggests that MAGP-2 may be important in the assembly of microfibrils. 133
48635 398908 pfam05508 Ran-binding RanGTP-binding protein. The small Ras-like GTPase Ran plays an essential role in the transport of macromolecules in and out of the nucleus and has been implicated in spindle and nuclear envelope formation during mitosis in higher eukaryotes. The S. cerevisiae ORF YGL164c encoding a novel RanGTP-binding protein, termed Yrb30p was identified. The protein competes with yeast RanBP1 (Yrb1p) for binding to the GTP-bound form of yeast Ran (Gsp1p) and is, like Yrb1p, able to form trimeric complexes with RanGTP and some of the karyopherins. 308
48636 398909 pfam05509 TraY TraY domain. This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid. These proteins have a ribbon-helix-helix fold and are likely to be DNA-binding proteins. 49
48637 368478 pfam05510 Sarcoglycan_2 Sarcoglycan alpha/epsilon. Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localization of these proteins along the nonjunctional sarcolemma is not clear. This family contains alpha and epsilon members. 385
48638 398910 pfam05511 ATP-synt_F6 Mitochondrial ATP synthase coupling factor 6. Coupling factor 6 (F6) is a component of mitochondrial ATP synthase which is required for the interactions of the catalytic and proton-translocating segments. 96
48639 398911 pfam05512 AWPM-19 AWPM-19-like family. Members of this family are 19 kDa membrane proteins. The levels of the plant protein AWPM-19 increase dramatically when there is an increased level of abscisic acid. The increased presence of this protein leads to greater tolerance of freezing. 142
48640 283228 pfam05513 TraA TraA. Conjugative transfer of a bacteriocin plasmid, pPD1, of Enterococcus faecalis is induced in response to a peptide sex pheromone, cPD1, secreted from plasmid-free recipient cells. cPD1 is taken up by a pPD1 donor cell and binds to an intracellular receptor, TraA. Once a recipient cell acquires pPD1, it starts to produce an inhibitor of cPD1, termed iPD1, which functions as a TraA antagonist and blocks self-induction in donor cells. TraA transduces the signal of cPD1 to the mating response. 120
48641 283229 pfam05514 HR_lesion HR-like lesion-inducing. Family of plant proteins that are associated with the hypersensitive response (HR) pathway of defense against plant pathogens. 138
48642 283230 pfam05515 Viral_NABP Viral nucleic acid binding. This family is common to ssRNA positive-strand viruses; its members are commonly described as nucleic acid binding proteins (NABP). 190
48643 398912 pfam05517 p25-alpha p25-alpha. This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila. 155
48644 253234 pfam05518 Totivirus_coat Totivirus coat protein. 753
48645 114252 pfam05520 Citrus_P18 Citrus tristeza virus P18 protein. 167
48646 377521 pfam05521 Phage_H_T_join Phage head-tail joining protein. 96
48647 114254 pfam05522 Metallothio_6 Metallothionein. This family consists of metallothioneins from several worm and sea urchin species. Metallothioneins are low molecular weight, cysteine rich proteins known to be involved in heavy metal detoxification and homeostasis. 65
48648 398913 pfam05523 FdtA WxcM-like, C-terminal. This family includes FdtA from Aneurinibacillus thermoaerophilus, which has been characterized as a dTDP-6-deoxy-3,4-keto-hexulose isomerase. It also includes WxcM from Xanthomonas campestris (pv. campestris). 129
48649 398914 pfam05524 PEP-utilizers_N PEP-utilising enzyme, N-terminal. 125
48650 283235 pfam05525 Branch_AA_trans Branched-chain amino acid transport protein. This family consists of several bacterial branched-chain amino acid transport proteins which are responsible for the transport of leucine, isoleucine and valine via proton motive force. 429
48651 398915 pfam05526 R_equi_Vir Rhodococcus equi virulence-associated protein. This family consists of several virulence-associated proteins from Rhodococcus equi. Rhodococcus equi is an important pulmonary pathogen of foals and is increasingly isolated from pneumonic infections and other infections in human immunodeficiency virus (HIV)-infected patients. Isolates from foals possess a large virulence plasmid, varying in size from 80 to 90 kb. Isolates lacking the plasmid are avirulent to foals. Little is known about the function of the plasmid apart from its encoding virulence-associated surface proteins. 177
48652 398916 pfam05527 DUF758 Domain of unknown function (DUF758). Family of eukaryotic proteins with unknown function, which are induced by tumor necrosis factor. 155
48653 283238 pfam05528 Coronavirus_5 Coronavirus gene 5 protein. Infectious bronchitis virus (IBV), a member of Coronaviridae family, has a single-stranded positive-sense RNA genome, which is 27 kb in length. Gene 5 contains two (5a and 5b) open reading frames. The function of the 5a and 5b proteins is unknown. 82
48654 398917 pfam05529 Bap31 B-cell receptor-associated protein 31-like. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31. 137
48655 368485 pfam05531 NPV_P10 Nucleopolyhedrovirus P10 protein. This family consists of several nucleopolyhedrovirus P10 proteins which are thought to be involved in the morphogenesis of the polyhedra. 75
48656 398918 pfam05532 CsbD CsbD-like. CsbD is a bacterial general stress response protein. Its expression is mediated by sigma-B, an alternative sigma factor. The role of CsbD in stress response is unclear. 53
48657 114264 pfam05533 Peptidase_C42 Beet yellows virus-type papain-like endopeptidase C42. Members of the Closteroviridae and Potyviridae families of plant positive-strand RNA viruses encode one or two papain-like leader proteinases, belonging to Merops peptidase family C42. 88
48658 310261 pfam05534 HicB HicB family. This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches. 51
48659 398919 pfam05535 Chromadorea_ALT Chromadorea ALT protein. This family consists of several ALT protein homologs found in nematodes. Lymphatic filariasis is a major tropical disease caused by the mosquito borne nematodes Brugia and Wuchereria. About 120 million people are infected and at risk of lymphatic pathology such as acute lymphangitis and elephantiasis. Expression of alt-1 and alt-2 is initiated midway through development in the mosquito, peaking in the infective larva and declining sharply following entry into the host. ALT-1 and the closely related ALT-2 have been found to be strong candidates for a future vaccine against human filariasis. 77
48660 336138 pfam05536 Neurochondrin Neurochondrin. This family contains several eukaryotic neurochondrin proteins. Neurochondrin induces hydroxyapatite resorptive activity in bone marrow cells resistant to bafilomycin A1, an inhibitor of macrophage- and osteoclast-mediated resorption. Expression of the gene is localized to chondrocyte, osteoblast, and osteocyte in the bone and to the hippocampus and Purkinje cell layer of cerebellum in the brain. 605
48661 283245 pfam05537 DUF759 Borrelia burgdorferi protein of unknown function (DUF759). This family consists of several uncharacterized proteins from the Lyme disease spirochete Borrelia burgdorferi. 429
48662 368487 pfam05538 Campylo_MOMP Campylobacter major outer membrane protein. This family consists of Campylobacter major outer membrane proteins. The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments. 421
48663 114270 pfam05539 Pneumo_att_G Pneumovirinae attachment membrane glycoprotein G. 408
48664 398920 pfam05540 Serpulina_VSP Serpulina hyodysenteriae variable surface protein. This family consists of several variable surface proteins from Serpulina hyodysenteriae. 394
48665 253243 pfam05541 Spheroidin Entomopoxvirus spheroidin protein. Entomopoxviruses (EPVs) are large (300-400 nm) oval-shaped viruses replicating in the cytoplasm of their insect host cells. At the end of their replicative cycle EPV virions are occluded in a highly expressed protein called spheroidin. This protein forms large (5-20 µm long) oval-shaped occlusion bodies (OBs) called spherules. The infectious cycle of EPVs begins with the ingestion by the insect host of the spherules, their dissolution by the alkaline reducing conditions of the midgut fluid and the release of virions in the midgut lumen. The infective particles first replicate in midgut epithelial cells, then pass the gut barrier to colonise the internal tissues, mainly the fat body cells. Whilst spheroidin has been demonstrated to be non-essential for viral replication, it plays an essential role in the natural biological cycle of the virus in protecting virions from adverse environmental conditions (e.g. UV degradation) and thus improving transmission efficacy. In this respect, spheroidins are functionally similar to polyhedrins of baculoviruses or cypoviruses. 943
48666 398921 pfam05542 DUF760 Protein of unknown function (DUF760). This family contains several uncharacterized plant proteins. 83
48667 398922 pfam05543 Peptidase_C47 Staphopain peptidase C47. Staphopains are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme. 174
48668 398923 pfam05544 Pro_racemase Proline racemase. This family consists of proline racemase (EC 5.1.1.4) proteins which catalyze the interconversion of L- and D-proline in bacteria. This family also contains several similar eukaryotic proteins including Trypanosoma cruzi PA45-A, a protein with B-cell mitogenic properties which has been characterized as a co-factor-independent proline racemase. 325
48669 398924 pfam05545 FixQ Cbb3-type cytochrome oxidase component FixQ. This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon. 49
48670 398925 pfam05546 She9_MDM33 She9 / Mdm33 family. Members of this family are mitochondrial inner membrane proteins with a role in inner mitochondrial membrane organisation and biogenesis. 198
48671 398926 pfam05547 Peptidase_M6 Immune inhibitor A peptidase M6. The insect pathogenic Gram-positive Bacillus thuringiensis secretes immune inhibitor A, a metallopeptidase, which specifically cleaves host antibacterial proteins. A homolog of immune inhibitor A, PrtV, has been identified in the Gram-negative human pathogen Vibrio cholerae. 644
48672 336143 pfam05548 Peptidase_M11 Gametolysin peptidase M11. In the unicellular biflagellated alga, Chlamydomonas reinhardtii, gametolysin, a zinc-containing metallo-protease, is responsible for the degradation of the cell wall. Homologs of gametolysin have also been reported in the simple multicellular organism, Volvox. 303
48673 253247 pfam05549 Allexi_40kDa Allexivirus 40kDa protein. 271
48674 283255 pfam05550 Peptidase_C53 Pestivirus Npro endopeptidase C53. Unique to pestiviruses, the N-terminal protein encoded by the bovine viral diarrhoea virus genome is a cysteine protease (Npro) responsible for a self-cleavage that releases the N-terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core. 168
48675 368495 pfam05551 zf-His_Me_endon Zinc-binding loop region of homing endonuclease. This domain is the short zinc-binding loops region of a number of much longer chain homing endonucleases. Such loops are probably stabilized by the zinc and may be viewed as small but separate domains. The common structural feature of these domains is that at least three zinc ligands lie very close to each other in the sequence and are not incorporated into regular secondary structural elements. The biological roles played by these small zinc-binding domains are presently unknown. 131
48676 398927 pfam05552 TM_helix Conserved TM helix. This alignment represents a conserved transmembrane helix as well as some flanking sequence. It is often found in association with pfam00924. 50
48677 398928 pfam05553 DUF761 Cotton fibre expressed protein. This family consists of several plant proteins of unknown function. Three of the sequences (from Gossypium hirsutum) in this family are described as cotton fibre expressed proteins. The remaining sequences, found in Arabidopsis thaliana, are uncharacterized. 35
48678 147629 pfam05554 Novirhabdo_Nv Viral hemorrhagic septicemia virus non-virion protein. This family consists of several viral hemorrhagic septicemia virus non-virion (Nv) proteins. The NV protein is a nonstructural protein absent from mature virions although it is present in infected cells. The function of this protein is unknown. 122
48679 283259 pfam05555 DUF762 Coxiella burnetii protein of unknown function (DUF762). This family consists of several uncharacterized proteins from the bacterium Coxiella burnetii. Coxiella burnetii is the causative agent of the Q fever disease. 244
48680 398929 pfam05556 Calsarcin Calcineurin-binding protein (Calsarcin). This family consists of several mammalian calcineurin-binding proteins. The calcium- and calmodulin-dependent protein phosphatase calcineurin has been implicated in the transduction of signals that control the hypertrophy of cardiac muscle and slow fibre gene expression in skeletal muscle. Calsarcin-1 and calsarcin-2 are expressed in developing cardiac and skeletal muscle during embryogenesis, but calsarcin-1 is expressed specifically in adult cardiac and slow-twitch skeletal muscle, whereas calsarcin-2 is restricted to fast skeletal muscle. Calsarcins represent a novel family of sarcomeric proteins that link calcineurin with the contractile apparatus, thereby potentially coupling muscle activity to calcineurin activation. Calsarcin-3 is expressed specifically in skeletal muscle and is enriched in fast-twitch muscle fibers. Like calsarcin-1 and calsarcin-2, calsarcin-3 interacts with calcineurin, and the Z-disc proteins alpha-actinin, gamma-filamin, and telethonin. 255
48681 368498 pfam05557 MAD Mitotic checkpoint protein. This family consists of several eukaryotic mitotic checkpoint (Mitotic arrest deficient or MAD) proteins. The mitotic spindle checkpoint monitors proper attachment of the bipolar spindle to the kinetochores of aligned sister chromatids and causes a cell cycle arrest in prometaphase when failures occur. Multiple components of the mitotic spindle checkpoint have been identified in yeast and higher eukaryotes. In S.cerevisiae, the existence of a Mad1-dependent complex containing Mad2, Mad3, Bub3 and Cdc20 has been demonstrated. 660
48682 368499 pfam05558 DREPP DREPP plasma membrane polypeptide. This family contains several plant plasma membrane proteins termed DREPPs as they are developmentally regulated plasma membrane polypeptides. 206
48683 398930 pfam05559 DUF763 Protein of unknown function (DUF763). This family consists of several uncharacterized bacterial and archaeal proteins of unknown function. 312
48684 114291 pfam05560 Bt_P21 Bacillus thuringiensis P21 molecular chaperone protein. This family contains several Bacillus thuringiensis P21 proteins. These proteins are thought to be molecular chaperones and have mosquitocidal properties. 182
48685 310276 pfam05561 DUF764 Borrelia burgdorferi protein of unknown function (DUF764). This family consists of proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 182
48686 398931 pfam05562 WCOR413 Cold acclimation protein WCOR413. This family consists of several WCOR413-like plant cold acclimation proteins. 181
48687 398932 pfam05563 SpvD Salmonella plasmid virulence protein SpvD. This family consists of several SpvD plasmid virulence proteins from different Salmonella species. The structure of the protein from Salmonella typhimurium has been solved and shows a papain-like fold, with a predicted catalytic triad of Cys73, His162 and Asp182. The protein has been shown to have deubiquitinating-like activity, releasing aminoluciferin (AML) from Ub-AML. 213
48688 398933 pfam05564 Auxin_repressed Dormancy/auxin associated protein. This family contains several plant dormancy-associated and auxin-repressed proteins, the functions of which are poorly understood. 117
48689 398934 pfam05565 Sipho_Gp157 Siphovirus Gp157. This family contains both viral and bacterial proteins which are related to the Gp157 protein of the Streptococcus thermophilus SFi bacteriophages. It is thought that bacteria possessing the gene coding for this protein have an increased resistance to the bacteriophage. 162
48690 283269 pfam05566 Pox_vIL-18BP Orthopoxvirus interleukin 18 binding protein. Interleukin-18 (IL-18) is a proinflammatory cytokine that plays a key role in the activation of natural killer and T helper 1 cell responses principally by inducing interferon-gamma (IFN-gamma). Several poxvirus genes encode proteins with sequence similarity to IL-18BPs. It has been shown that vaccinia, ectromelia and cowpox viruses secrete from infected cells a soluble IL-18BP (vIL-18BP) that may modulate the host antiviral response. The expression of vIL-18BPs by distinct poxvirus genera that cause local or general viral dissemination, or persistent or acute infections in the host, emphasises the importance of IL-18 in response to viral infections. 126
48691 398935 pfam05567 Neisseria_PilC Neisseria PilC beta-propeller domain. This family consists of several PilC protein sequences from Neisseria gonorrhoeae and N. meningitidis. PilC is a phase-variable protein associated with pilus-mediated adherence of pathogenic Neisseria to target cells. This domain has been shown to adopt a beta-propeller structure. 411
48692 114299 pfam05568 ASFV_J13L African swine fever virus J13L protein. This family consists of several African swine fever virus J13L proteins. 189
48693 310280 pfam05569 Peptidase_M56 BlaR1 peptidase M56. Production of beta-lactamase and penicillin-binding protein 2a (which mediate staphylococcal resistance to beta-lactam antibiotics) is regulated by a signal-transducing integral membrane protein and a transcriptional repressor. The signal transducer is a fusion protein with penicillin-binding and zinc metalloprotease domains. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. Homologs of this peptidase domain, which corresponds to Merops family M56, are also found in a number of other bacterial genome sequences. 299
48694 114301 pfam05570 DUF765 Circovirus protein of unknown function (DUF765). This family consists of several short (27-30aa) porcine and bovine circovirus ORF6 proteins of unknown function. 29
48695 398936 pfam05571 DUF766 Protein of unknown function (DUF766). This family consists of several eukaryotic proteins of unknown function. 292
48696 368506 pfam05572 Peptidase_M43 Pregnancy-associated plasma protein-A. Pregnancy-associated plasma protein A (PAPP-A) is a metallo-protease belonging to Merops family M43. It cleaves insulin-like growth factor (IGF) binding protein-4 (IGFBP-4), causing a dramatic reduction in its affinity for IGF-I and -II. Through this mechanism, PAPP-A is a regulator of IGF bioactivity in several systems, including the human ovary and the cardiovascular system. 152
48697 398937 pfam05573 NosL NosL. NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallo-centre assembly. 131
48698 114305 pfam05575 V_cholerae_RfbT Vibrio cholerae RfbT protein. This family consists of several RfbT proteins from Vibrio cholerae. It has been found that genetic alteration of the rfbT gene is responsible for serotype conversion of Vibrio cholerae O1 and determines the difference between the Ogawa and Inaba serotypes, in that the presence of rfbT is sufficient for Inaba-to-Ogawa serotype conversion. 286
48699 283275 pfam05576 Peptidase_S37 PS-10 peptidase S37. These serine proteases have been found in Streptomyces species. 448
48700 310284 pfam05577 Peptidase_S28 Serine carboxypeptidase S28. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase. 434
48701 398938 pfam05578 Peptidase_S31 Pestivirus NS3 polyprotein peptidase S31. These serine peptidases are involved in processing of the flavivirus polyprotein. 211
48702 253263 pfam05579 Peptidase_S32 Equine arteritis virus serine endopeptidase S32. Serine peptidases involved in processing nidovirus polyprotein. 297
48703 398939 pfam05580 Peptidase_S55 SpoIVB peptidase S55. The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis. 210
48704 398940 pfam05582 Peptidase_U57 YabG peptidase U57. YabG is a protease involved in the proteolysis and maturation of SpoIVA and YrbA proteins, which are concerned with cortex and/or coat assembly in Bacillus subtilis. 280
48705 283279 pfam05584 Sulfolobus_pRN Sulfolobus plasmid regulatory protein. This family consists of several plasmid regulatory proteins from the extreme thermophilic and acidophilic archaea Sulfolobus. 72
48706 147642 pfam05585 DUF1758 Putative peptidase (DUF1758). This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases. 164
48707 398941 pfam05586 Ant_C Anthrax receptor C-terminus region. This region is found in the putatively cytoplasmic C-terminus of the anthrax receptor. 93
48708 398942 pfam05587 Anth_Ig Anthrax receptor extracellular domain. This region is found in the putatively extracellular N-terminal half of the anthrax receptor. It is probably part of the Ig superfamily and most closely related to pfam01833 (personal obs: C Yeats). 100
48709 283282 pfam05588 Botulinum_HA-17 Clostridium botulinum HA-17 domain. This family consists of several Clostridium botulinum hemagglutinin (HA) subcomponents. Clostridium botulinum type D strain 4947 produces two different sizes of progenitor toxins (M and L) as intact forms without proteolytic processing. The M toxin is composed of neurotoxin (NT) and nontoxic-nonhemagglutinin (NTNHA), whereas the L toxin is composed of the M toxin and hemagglutinin (HA) subcomponents (HA-70, HA-17, and HA-33). 145
48710 398943 pfam05589 DUF768 Protein of unknown function (DUF768). This family consists of several uncharacterized hypothetical proteins from Rhizobium loti. 67
48711 283284 pfam05590 DUF769 Xylella fastidiosa protein of unknown function (DUF769). This family consists of several uncharacterized hypothetical proteins of unknown function from Xylella fastidiosa, the organism that causes Pierce's disease in plants. 259
48712 398944 pfam05591 T6SS_VipA Type VI secretion system, VipA, VC_A0107 or Hcp2. VipA is a family of Gram-negative bacterial proteins that form part of the type VI pathogenic secretion system. Members have been variously defined as VC_A0107 family, Hcp2 and VipA, for ClpV-interacting proteins. VipB and VipA proteins interact very closely to form the shaft of the pathogenic penetrating needle system. 153
48713 336149 pfam05592 Bac_rhamnosid Bacterial alpha-L-rhamnosidase concanavalin-like domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria. 102
48714 398945 pfam05593 RHS_repeat RHS Repeat. RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain. 36
48715 398946 pfam05594 Fil_haemagg Haemagglutinin repeat. This highly divergent repeat occurs in a number of proteins implicated in cell aggregation. The Pfam alignment probably contains three such repeats (personal obs: C Yeats). These are likely to have a beta-helical structure. 69
48716 398947 pfam05595 DUF771 Domain of unknown function (DUF771). Family of uncharacterized ORFs found in Bacteriophage and Lactococcus lactis. 90
48717 398948 pfam05596 Taeniidae_ag Taeniidae antigen. This family consists of several antigen proteins from Taenia and Echinococcus (tapeworm) species. 64
48718 398949 pfam05597 Phasin Poly(hydroxyalkanoate) granule associated protein (phasin). Polyhydroxyalkanoates (PHAs) are storage polyesters synthesized by various bacteria as intracellular carbon and energy reserve material. PHAs are accumulated as water-insoluble inclusions within the cells. This family consists of the phasins PhaF and PhaI which act as transcriptional regulators of PHA biosynthesis genes. PhaF has been proposed to repress expression of the phaC1 gene and the phaIF operon. 126
48719 398950 pfam05598 DUF772 Transposase domain (DUF772). This presumed domain is found at the N-terminus of many proteins found in transposons. 71
48720 191311 pfam05599 Deltaretro_Tax Deltaretrovirus Tax protein. This family consists of Rex/Tax proteins from human and simian T-cell leukaemia viruses. The exact function of these proteins is unknown. Tax is the viral transactivator; it is a nuclear phosphoprotein that interacts with CREB, coactivator CBP/p300 and PCAF to form a multiprotein complex, which activates viral LTR and stimulates virus expression. Tax is also involved in deregulated expression of numerous cellular genes leading to T-cell leukaemia. Rex is a nucleolar post transcriptional regulator that facilitates export to the cytoplasm of unspliced or incompletely spliced viral RNA [personal communication, Dr. S Nicot]. 87
48721 398951 pfam05600 DUF773 Protein of unknown function (DUF773). This family contains several eukaryotic sequences which are thought to be CDK5 activator-binding proteins; however, the function of this family is unknown. 505
48722 398952 pfam05602 CLPTM1 Cleft lip and palate transmembrane protein 1 (CLPTM1). This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis. 431
48723 398953 pfam05603 DUF775 Protein of unknown function (DUF775). This family consists of several eukaryotic proteins of unknown function. 200
48724 368516 pfam05604 DUF776 Protein of unknown function (DUF776). This family consists of several highly related mouse and human proteins of unknown function. 176
48725 398954 pfam05605 zf-Di19 Drought induced 19 protein (Di19), zinc-binding. This family consists of several drought induced 19 (Di19) like proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought. This domain is a zinc-binding domain. 54
48726 368518 pfam05606 DUF777 Borrelia burgdorferi protein of unknown function (DUF777). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 135
48727 368519 pfam05608 DUF778 Protein of unknown function (DUF778). This family consists of several eukaryotic proteins of unknown function. 136
48728 398955 pfam05609 LAP1C Lamina-associated polypeptide 1C (LAP1C). This family contains rat LAP1C proteins and several uncharacterized highly related sequences from both mice and humans. LAP1s (lamina-associated polypeptide 1s) are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane. LAP1s bind to both A- and B-type lamins and have a putative role in the membrane attachment and assembly of the nuclear lamina. 446
48729 398956 pfam05610 DUF779 Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function. 94
48730 368521 pfam05611 DUF780 Caenorhabditis elegans protein of unknown function (DUF780). This family consists of several short C. elegans proteins of unknown function. 71
48731 398957 pfam05612 Leg1 Leg1. Protein liver-enriched gene 1 (Leg1) has been suggested to function as a novel secreted regulator for the liver development. 331
48732 283304 pfam05613 Herpes_U15 Human herpesvirus U15 protein. 110
48733 114342 pfam05614 DUF782 Circovirus protein of unknown function (DUF782). This family consists of porcine and bovine circovirus proteins of unknown function. 104
48734 398958 pfam05615 THOC7 Tho complex subunit 7. The Tho complex is involved in transcription elongation and mRNA export from the nucleus. 135
48735 283306 pfam05616 Neisseria_TspB Neisseria meningitidis TspB protein. This family consists of several Neisseria meningitidis TspB virulence factor proteins. 517
48736 398959 pfam05617 Prolamin_like Prolamin-like. Prolamin_like (in which DUF784 and DUF1278 have been merged) is found to be expressed in the plant embryo sac and to be regulated by the Myb98 transcription factor. Computational analysis has revealed that members are homologous to the plant prolamin superfamily (Protease inhibitor-seed storage-LTP family, pfam00234). In contrast to typical prolamin members that have eight conserved Cys residues forming four pairs of disulfide bonds, this domain contains only six conserved Cys residues that may form three pairs of disulfide bonds. The domain may have a potential function in lipid transfer or protection during plant embryo sac development and reproduction. 70
48737 283308 pfam05618 Zn_protease Putative ATP-dependent zinc protease. Proteins in this family are annotated as being ATP-dependent zinc proteases. 138
48738 310307 pfam05619 DUF787 Borrelia burgdorferi protein of unknown function (DUF787). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 369
48739 398960 pfam05620 DUF788 Protein of unknown function (DUF788). This family consists of several eukaryotic proteins of unknown function. 167
48740 398961 pfam05621 TniB Bacterial TniB protein. This family consists of several bacterial TniB NTP-binding proteins. TniB is a probable ATP-binding protein which is involved in Tn5053 mercury resistance transposition. 189
48741 398962 pfam05622 HOOK HOOK protein. This family consists of several HOOK1, 2 and 3 proteins from different eukaryotic organisms. The different members of the human gene family are HOOK1, HOOK2 and HOOK3. Different domains have been identified in the three human HOOK proteins, and it was demonstrated that the highly conserved NH2-domain mediates attachment to microtubules, whereas the central coiled-coil motif mediates homodimerization and the more divergent C-terminal domains are involved in binding to specific organelles (organelle-binding domains). It has been demonstrated that endogenous HOOK3 binds to Golgi membranes, whereas both HOOK1 and HOOK2 are localized to discrete but unidentified cellular structures. In mice the Hook1 gene is predominantly expressed in the testis. Hook1 function is necessary for the correct positioning of microtubular structures within the haploid germ cell. Disruption of Hook1 function in mice causes abnormal sperm head shape and fragile attachment of the flagellum to the sperm head. 526
48742 398963 pfam05623 DUF789 Protein of unknown function (DUF789). This family consists of several plant proteins of unknown function. 294
48743 398964 pfam05624 LSR Lipolysis stimulated receptor (LSR). The lipolysis-stimulated receptor (LSR) is a lipoprotein receptor primarily expressed in the liver and activated by free fatty acids. It is thought to be involved in the clearance of triglyceride-rich lipoproteins, and has been shown in mice to be critical for liver and embryonic development. 48
48744 398965 pfam05625 PAXNEB PAXNEB protein. PAXNEB or PAX6 neighbor is found in several eukaryotic organisms. PAXNEB is an RNA polymerase II Elongator protein subunit. It is part of the HAP subcomplex of Elongator, which is a six-subunit component of the RNA polymerase II holoenzyme. The HAP subcomplex is required for Elongator structural integrity and histone acetyltransferase activity. This protein family has a P-loop motif. However, its sequence has degraded in many members of the family. 358
48745 398966 pfam05626 DUF790 Protein of unknown function (DUF790). This family consists of several hypothetical archaeal proteins of unknown function. 386
48746 398967 pfam05627 AvrRpt-cleavage Cleavage site for pathogenic type III effector avirulence factor Avr. This domain is conserved in small families of otherwise unrelated proteins in both mono-cots and di-cots, suggesting that it has a conserved, plant-specific function. It is found both in the plant RIN4 (resistance R membrane-bound host-target protein) where it appears to contribute to the binding of the protein to both RCS (AvrRpt2 auto-cleavage site) and AvrB, the virulence factor from the infecting bacterium. The cleavage site for the AvrRpt2 avirulence protein would appear to be the sequence motifs VPQFGDW and LPKFGEW, both of which are highly conserved within the domain. 36
48747 310316 pfam05628 Borrelia_P13 Borrelia membrane protein P13. This family consists of P13 proteins from Borrelia species. P13 is a 13kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism. 138
48748 283319 pfam05629 Nanovirus_C8 Nanovirus component 8 (C8) protein. This family consists of a group of 17.4 kDa nanovirus proteins which are highly related to the faba bean necrotic yellows virus component 8 protein whose function is unknown. 154
48749 398968 pfam05630 NPP1 Necrosis inducing protein (NPP1). This family consists of several NPP1 like necrosis inducing proteins from oomycetes, fungi and bacteria. Infiltration of NPP1 into leaves of Arabidopsis thaliana plants results in transcript accumulation of pathogenesis-related (PR) genes, production of ROS and ethylene, callose apposition, and HR-like cell death. 198
48750 283321 pfam05631 MFS_5 Sugar-transporters, 12 TM. MFS_5 is a family of sugar-transporters from both prokaryotes and eukaryotes. 356
48751 368533 pfam05632 DUF792 Borrelia burgdorferi protein of unknown function (DUF792). This family consists of several hypothetical proteins from the Lyme disease spirochete Borrelia burgdorferi. 184
48752 283323 pfam05633 BPS1 Protein BYPASS1-related. This family consists of several plant proteins and includes BYPASS1, which is required for normal root and shoot development. This protein prevents constitutive production of a root mobile carotenoid-derived signaling compound that is capable of arresting shoot and leaf development. 386
48753 398969 pfam05634 APO_RNA-bind APO RNA-binding. This domain contains conserved cysteine and histidine residues. It resembles zinc fingers, and binds to zinc. This domain functions as an RNA-binding domain. 194
48754 398970 pfam05635 23S_rRNA_IVP 23S rRNA-intervening sequence protein. This family consists of bacterial proteins encoded within an intervening sequence present within some 23S rRNA genes. It folds into an anti-parallel four-helix bundle and forms homopentamers. 106
48755 398971 pfam05636 HIGH_NTase1 HIGH Nucleotidyl Transferase. This family consists of HIGH Nucleotidyl Transferases 393
48756 368535 pfam05637 Glyco_transf_34 galactosyl transferase GMA12/MNN10 family. This family contains a number of glycosyltransferase enzymes that contain a DXD motif. This family includes a number of C. elegans homologs where the DXD is replaced by DXH. Some members of this family are included in glycosyltransferase family 34. 238
48757 398972 pfam05638 T6SS_HCP Type VI secretion system effector, Hcp. HCP is a family of proteins which are expressed in up to 1000 copies in Gram-negative bacteria. Together these copies aggregate into a needle-like shaft or tube that will penetrate other bacteria via a puncturing protein attached to its head. Initially Hcp forms a hexameric structure with a central channel of 40 Angstroms. These hexamers pile up one on top of each other forming nanotubes resembling the gp19 tail phage tube. 129
48758 368536 pfam05639 Pup Pup-like protein. This family consists of several short bacterial proteins formerly known as DUF797. It was recently shown that Mycobacterium tuberculosis contains a small protein, Pup (Rv2111c), that is covalently conjugated to the e-NH2 groups of lysines on several target proteins (pupylation) such as the malonyl CoA acyl carrier protein (FabD). Pupylation of FabD was shown to result in its recruitment to the mycobacterial proteasome and subsequent degradation analogous to eukaryotic ubiquitin-conjugated proteins. Searches recovered Pup orthologs in all major actinobacteria lineages including the basal bifidobacteria and also sporadically in certain other bacterial lineages. The Pup proteins were all between 50-90 residues in length and a multiple alignment shows that they all contain a conserved motif with a G [EQ] signature at the C-terminus. Thus, all of them are suitable for conjugation via the terminal glutamate or the deamidated glutamine (as shown in the case of the Mycobacterium Pup). The conserved globular core of Pup is predicted to form a bihelical unit with the extreme C-terminal 6-7 residues forming a tail in the extended conformation. Thus, Pup is structurally unrelated to the ubiquitin fold and has convergently evolved the function of protein modifier. 64
48759 398973 pfam05640 NKAIN Na,K-Atpase Interacting protein. NKAIN (Na,K-Atpase INteracting) proteins are a family of evolutionary conserved transmembrane proteins that localize to neurons, that are critical for neuronal function, and that interact with the beta subunits, beta1 in vertebrates and beta in Drosophila, of Na,K-ATPase. NKAINs have highly conserved trans-membrane domains but otherwise no other characterized domains. NKAINs may function as subunits of pore or channel structures in neurons or they may affect the function of other membrane proteins. They are likely to function within the membrane bilayer. 197
48760 398974 pfam05641 Agenet Agenet domain. This domain is related to the TUDOR domain pfam00567. The function of the agenet domain is unknown. This family now matches both the two Agenet domains in the FMR proteins. 61
48761 253298 pfam05642 Sporozoite_P67 Sporozoite P67 surface antigen. This family consists of several Theileria P67 surface antigens. A stage specific surface antigen of Theileria parva, p67, is the basis for the development of an anti-sporozoite vaccine for the control of East Coast fever (ECF) in cattle. The antigen has been shown to contain five distinct linear peptide sequences recognized by sporozoite-neutralising murine monoclonal antibodies. 727
48762 398975 pfam05643 DUF799 Putative bacterial lipoprotein (DUF799). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins. 185
48763 398976 pfam05644 Miff Mitochondrial and peroxisomal fission factor Mff. This protein has a role in mitochondrial and peroxisomal fission. 292
48764 398977 pfam05645 RNA_pol_Rpc82 RNA polymerase III subunit RPC82. This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa. 233
48765 398978 pfam05647 Epiglycanin_TR Tandem-repeating region of mucin, epiglycanin-like. The unusual mucin, epiglycanin, is membrane-bound at the C-terminus but has a long region of this tandem-repeat at the N-terminus. It was the first mucin identified to be associated with the malignant behaviour of carcinoma cells. Mouse Muc21/epiglycanin is thought to be a highly glycosylated molecule, which makes it likely that its function is dependent on its glycoforms. Cells expressing Muc21 are significantly less adherent to each other and to extracellular matrix components than control cells, and this loss of adhesion is mediated by the TR portion of Muc21. This family also now contains the repeat that was the C. elegans protein of unknown function (DUF801). 67
48766 398979 pfam05648 PEX11 Peroxisomal biogenesis factor 11 (PEX11). This family consists of several peroxisomal biogenesis factor 11 (PEX11) proteins from several eukaryotic species. The PEX11 peroxisomal membrane proteins promote peroxisome division in multiple eukaryotes. 223
48767 398980 pfam05649 Peptidase_M13_N Peptidase family M13. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk. 380
48768 398981 pfam05650 DUF802 Domain of unknown function (DUF802). This region is found as two or more repeats in a small number of hypothetical proteins. 53
48769 398982 pfam05651 Diacid_rec Putative sugar diacid recognition. This region is found in several proteins characterized as carbohydrate diacid regulators. An HTH DNA-binding motif is found at the C-terminus of these proteins suggesting that this region includes the sugar recognition region. 131
48770 398983 pfam05652 DcpS Scavenger mRNA decapping enzyme (DcpS) N-terminal. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the N-terminal domain of these proteins. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. 103
48771 398984 pfam05653 Mg_trans_NIPA Magnesium transporter NIPA. NIPA (nonimprinted in Prader-Willi/Angelman syndrome) is a family of integral membrane proteins which function as magnesium transporters. 295
48772 398985 pfam05655 AvrD Pseudomonas avirulence D protein (AvrD). This family consists of several avirulence D (AvrD) proteins primarily found in Pseudomonas syringae. 330
48773 398986 pfam05656 DUF805 Protein of unknown function (DUF805). This family consists of several bacterial proteins of unknown function. 108
48774 368546 pfam05657 DUF806 Protein of unknown function (DUF806). This family consists of several Siphovirus and Lactococcus proteins of unknown function. The viral sequences are thought to be tail component proteins. 121
48775 398987 pfam05658 YadA_head Head domain of trimeric autotransporter adhesin. This seven residue repeat makes up the majority sequence of a family of bacterial haemagglutinins and invasins. The representative alignment contains four repeats. 27
48776 398988 pfam05659 RPW8 Arabidopsis broad-spectrum mildew resistance protein RPW8. This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defense responses. The Arabidopsis thaliana locus Resistance To Powdery Mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localized, salicylic acid-dependent defenses similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance. 139
48777 283346 pfam05660 DUF807 Coxiella burnetii protein of unknown function (DUF807). This family consists of several proteins of unknown function from Coxiella burnetii (the causative agent of a zoonotic disease called Q fever). 142
48778 398989 pfam05661 DUF808 Protein of unknown function (DUF808). This family consists of several bacterial proteins of unknown function. 299
48779 398990 pfam05662 YadA_stalk Coiled stalk of trimeric autotransporter adhesin. This short motif is found in invasins and haemagglutinins, normally associated with (pfam05658). 43
48780 147685 pfam05663 DUF809 Protein of unknown function (DUF809). This family consists of several proteins of unknown function from Raphanus sativus (Radish) and Brassica napus (Rape). 138
48781 398991 pfam05664 DUF810 Plant family of unknown function (DUF810). This family is found in plant-symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but is not known in any other organism. It represents a candidate family for involvement in interactions with plants, or it may at least play a role in plant-associated lifestyles. 678
48782 368549 pfam05666 Fels1 Fels-1 Prophage Protein-like. 42
48783 398992 pfam05667 DUF812 Protein of unknown function (DUF812). This family consists of several eukaryotic proteins of unknown function. 590
48784 398993 pfam05669 Med31 SOH1. The family consists of Saccharomyces cerevisiae SOH1 homologs. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes. 94
48785 398994 pfam05670 DUF814 Domain of unknown function (DUF814). This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. This annotation comes from Bacillus subtilis YloA, where the N-terminal region is involved in this activity. Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important. 111
48786 398995 pfam05671 GETHR GETHR pentapeptide repeat (5 copies). This pentapeptide repeat is found mainly in C. elegans. The most conserved amino acid at each position leads to its name GETHR (Bateman A unpublished obs.). The family also includes a divergent repeat in a microneme protein. The function of this repeat is unknown. 25
48787 398996 pfam05672 MAP7 MAP7 (E-MAP-115) family. The organisation of microtubules varies with the cell type and is presumably controlled by tissue-specific microtubule-associated proteins (MAPs). The 115-kDa epithelial MAP (E-MAP-115/MAP7) has been identified as a microtubule-stabilizing protein predominantly expressed in cell lines of epithelial origin. The binding of this microtubule associated protein is nucleotide independent. 163
48788 398997 pfam05673 DUF815 Protein of unknown function (DUF815). This family consists of several bacterial proteins of unknown function. 250
48789 283357 pfam05674 DUF816 Baculovirus protein of unknown function (DUF816). This family includes proteins that are about 200 amino acids in length. The proteins are all from baculoviruses. This family includes ORF107 from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) and a variety of other numbered ORF proteins, such as ORF52, ORF140. The function of these proteins is unknown. 176
48790 398998 pfam05675 DUF817 Protein of unknown function (DUF817). This family consists of several bacterial proteins of unknown function. 234
48791 398999 pfam05676 NDUF_B7 NADH-ubiquinone oxidoreductase B18 subunit (NDUFB7). This family consists of several NADH-ubiquinone oxidoreductase B18 subunit proteins from different eukaryotic organisms. Oxidative phosphorylation is the well-characterized process in which ATP, the principal carrier of chemical energy of individual cells, is produced due to a mitochondrial proton gradient formed by the transfer of electrons from NADH and FADH2 to molecular oxygen. The oxidative phosphorylation (OXPHOS) system is located in the mitochondrial inner membrane and consists of five multi-subunit enzyme complexes and two small electron carriers: coenzyme Q10 and cytochrome C. At least 70 structural proteins involved in the formation of the whole OXPHOS system are encoded by nuclear genes, whereas 13 structural proteins are encoded by the mitochondrial genome. Deficiency of NADH ubiquinone oxidoreductase, the first enzyme complex of the mitochondrial respiratory chain, is one of the most frequent causes of human mitochondrial encephalomyopathies. 60
48792 253315 pfam05677 DUF818 Chlamydia CHLPS protein (DUF818). This family consists of several Chlamydia CHLPS proteins, the function of which is unknown. 364
48793 399000 pfam05678 VQ VQ motif. This short motif is found in a variety of plant proteins. These proteins vary greatly in length and are mostly composed of low complexity regions. They all conserve a short motif FXhVQChTG, where X is any amino acid and h is a hydrophobic amino acid. The function of this motif is uncertain, however one protein in this family has been found to bind the SigA sigma factor. It would seem plausible that this motif is needed for this activity and that this whole family might be involved in modulating plastid sigma factors (Bateman A pers. obs.). 28
48794 399001 pfam05679 CHGN Chondroitin N-acetylgalactosaminyltransferase. 501
48795 368558 pfam05680 ATP-synt_E ATP synthase E chain. This family consists of several ATP synthase E chain sequences which are components of the CF(0) subunit. 83
48796 399002 pfam05681 Fumerase Fumarate hydratase (Fumerase). This family consists of several bacterial fumarate hydratase proteins FumA and FumB. Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. Three fumarases, FumA, FumB, and FumC, have been reported in E. coli. fumA and fumB genes are homologous and encode products of identical sizes which form thermolabile dimers of Mr 120,000. FumA and FumB are class I enzymes and are members of the iron-dependent hydrolases, which include aconitase and malate hydratase. The active FumA contains a 4Fe-4S centre, and it can be inactivated upon oxidation to give a 3Fe-4S centre. 267
48797 399003 pfam05683 Fumerase_C Fumarase C-terminus. This family consists of the C terminal region of several bacterial fumarate hydratase proteins (FumA and FumB). Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. 204
48798 114410 pfam05684 DUF819 Protein of unknown function (DUF819). This family contains proteins of unknown function from archaeal, bacterial and plant species. 379
48799 399004 pfam05685 Uma2 Putative restriction endonuclease. This family consists of hypothetical proteins that are greatly expanded in cyanobacteria. The proteins are found sporadically in other bacteria. A small number of member proteins also contain pfam02861 domains that are involved in protein interactions. Solutions of several structures for members of this family show that it is likely to be acting as an endonuclease. 168
48800 310354 pfam05686 Glyco_transf_90 Glycosyl transferase family 90. This family of glycosyl transferases are specifically (mannosyl) glucuronoxylomannan/galactoxylomannan -beta 1,2-xylosyltransferases, EC:2.4.2.-. 396
48801 399005 pfam05687 BES1_N BES1/BZR1 plant transcription factor, N-terminal. This family consists of the N terminal regions of several plant transcription factors. It is classified as BES1/BZR1, a plant-specific transcription factor that cooperates with transcription factors such as BIM1 to regulate brassinosteroid-induced genes. 143
48802 399006 pfam05688 DUF824 Salmonella repeat of unknown function (DUF824). This family consists of several repeated sequences of around 45 residues. 108
48803 399007 pfam05689 DUF823 Salmonella repeat of unknown function (DUF823). This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium and Salmonella typhi. Sequences from this family are almost always found with pfam05688. 135
48804 399008 pfam05690 ThiG Thiazole biosynthesis protein ThiG. This family consists of several bacterial thiazole biosynthesis protein G sequences. ThiG, together with ThiF and ThiH, is proposed to be involved in the synthesis of 4-methyl-5-(b-hydroxyethyl)thiazole (THZ) which is an intermediate in the thiazole production pathway. This family also includes triosephosphate isomerase and pyridoxal 5'-phosphate synthase subunit PdxS. 247
48805 283371 pfam05691 Raffinose_syn Raffinose synthase or seed imbibition protein Sip1. This family consists of several raffinose synthase proteins, also known as seed imbibition (Sip1) proteins. Raffinose (O-alpha- D-galactopyranosyl- (1-->6)- O-alpha- D-glucopyranosyl-(1<-->2)- O-beta- D-fructofuranoside) is a widespread oligosaccharide in plant seeds and other tissues. Raffinose synthase (EC:2.4.1.82) is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway. Raffinose family oligosaccharides (RFOs) are ubiquitous in plant seeds and are thought to play critical roles in the acquisition of tolerance to desiccation and seed longevity. Raffinose synthases are alkaline alpha-galactosidases and are solely responsible for RFO breakdown in germinating maize seeds, whereas acidic galactosidases appear to have other functions. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K. This family includes enzymes from GH36C. 749
48806 368561 pfam05692 Myco_haema Mycoplasma haemagglutinin. This family consists of several haemagglutinin sequences from Mycoplasma synoviae and Mycoplasma gallisepticum. The major plasma membrane proteins, pMGAs, of Mycoplasma gallisepticum are cell adhesin (hemagglutinin) molecules. It has been shown that the genetic determinants that code for the haemagglutinins are organized into a large family of genes and that only one of these genes is predominately expressed in any given strain. 424
48807 399009 pfam05693 Glycogen_syn Glycogen synthase. This family consists of the eukaryotic glycogen synthase proteins GYS1, GYS2 and GYS3. Glycogen synthase (GS) is the enzyme responsible for the synthesis of alpha-1,4-linked glucose chains in glycogen. It is the rate limiting enzyme in the synthesis of the polysaccharide, and its activity is highly regulated through phosphorylation at multiple sites and also by allosteric effectors, mainly glucose 6-phosphate (G6P). 639
48808 399010 pfam05694 SBP56 56kDa selenium binding protein (SBP56). This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport. The Lotus japonicus homolog of SBP56, LjSBP is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins, in vesicular Golgi transport. 453
48809 283375 pfam05695 DUF825 Plant protein of unknown function (DUF825). This family consists of several plant proteins greater than 1000 residues in length. The function of this family is unknown. 1486
48810 368563 pfam05696 DUF826 Protein of unknown function (DUF826). This family consists of several enterobacterial and siphoviral sequences of unknown function. 65
48811 399011 pfam05697 Trigger_N Bacterial trigger factor protein (TF). In the E. coli cytosol, a fraction of the newly synthesized proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the N-terminal region of the protein. 144
48812 399012 pfam05698 Trigger_C Bacterial trigger factor protein (TF) C-terminus. In the E. coli cytosol, a fraction of the newly synthesized proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the C-terminal region of the protein. 162
48813 399013 pfam05699 Dimer_Tnp_hAT hAT family C-terminal dimerization region. This dimerization region is found at the C-terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerization region forms extremely stable dimers in vitro. 83
48814 399014 pfam05700 BCAS2 Breast carcinoma amplified sequence 2 (BCAS2). This family consists of several eukaryotic sequences of unknown function. The mammalian members of this family are annotated as breast carcinoma amplified sequence 2 (BCAS2) proteins. BCAS2 is a putative spliceosome associated protein. 204
48815 399015 pfam05701 WEMBL Weak chloroplast movement under blue light. WEMBL consists of several plant proteins required for the chloroplast avoidance response under high intensity blue light. This avoidance response consists in the relocation of chloroplasts on the anticlinal side of exposed cells. Acts in association with PMI2 to maintain the velocity of chloroplast photo-relocation movement via the regulation of cp-actin filaments. Thus several member-sequences are described as "myosin heavy chain-like". 562
48816 368568 pfam05702 Herpes_UL49_5 Herpesvirus UL49.5 envelope/tegument protein. UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope. 98
48817 399016 pfam05703 Auxin_canalis Auxin canalisation. This domain is frequently found at the N-terminus of proteins containing pfam08458 at the C-terminus. It is a component of the auto-regulatory loop which enables auxin canalisation by recruitment of the PIN1 auxin efflux protein to the cell membrane. 258
48818 368570 pfam05704 Caps_synth Capsular polysaccharide synthesis protein. This family consists of several capsular polysaccharide proteins. Capsular polysaccharide (CPS) is a major virulence factor in Streptococcus pneumoniae. 278
48819 399017 pfam05705 DUF829 Eukaryotic protein of unknown function (DUF829). This family consists of several uncharacterized eukaryotic proteins. 240
48820 399018 pfam05706 CDKN3 Cyclin-dependent kinase inhibitor 3 (CDKN3). This family consists of cyclin-dependent kinase inhibitor 3 or kinase associated phosphatase proteins from several mammalian species. The cyclin-dependent kinase (Cdk)-associated protein phosphatase (KAP) is a human dual specificity protein phosphatase that dephosphorylates Cdk2 on threonine 160 in a cyclin-dependent manner. 168
48821 283385 pfam05707 Zot Zonular occludens toxin (Zot). This family consists of bacterial and viral proteins which are very similar to the Zonular occludens toxin (Zot). Zot is elaborated by bacteriophages present in toxigenic strains of Vibrio cholerae. Zot is a single polypeptide chain of 44.8 kDa, with the ability to reversibly alter intestinal epithelial tight junctions, allowing the passage of macromolecules through mucosal barriers. 195
48822 399019 pfam05708 Peptidase_C92 Permuted papain-like amidase enzyme, YaeF/YiiX, C92 family. Amidase_YiiX is a family of permuted papain-like amidases. It has amidase specificity for the amide bond between a lipid and an amino acid (or peptide). From the structure, a tetramer, each monomer is made up of a layered alpha-beta fold with a central, 6-stranded, antiparallel beta-sheet that is protected by helices on either side. The catalytic Cys154 in UniProtKB:Q74NK7, Structure 3kw0, is located on the N-terminus of helix alphaF. The two additional helices located above Cys154 contribute to the formation of the active site, where the lysine ligand is bound. 157
48823 399020 pfam05709 Sipho_tail Phage tail protein. This family consists of several Siphovirus and other phage tail component proteins as well as some bacterial proteins of unknown function. 258
48824 283388 pfam05710 Coiled Coiled coil. This region is found in a group of Dictyostelium discoideum proteins. It is likely to form a coiled-coil. Some of the proteins are regulated by cyclic AMP and are expressed late in development. 90
48825 399021 pfam05711 TylF Macrocin-O-methyltransferase (TylF). This family consists of bacterial macrocin O-methyltransferase (TylF) proteins. TylF is responsible for the methylation of macrocin to produce tylosin. Tylosin is a macrolide antibiotic used in veterinary medicine to treat infections caused by Gram-positive bacteria and as an animal growth promoter in the swine industry. It is produced by several Streptomyces species. As with other macrolides, the antibiotic activity of tylosin is due to the inhibition of protein biosynthesis by a mechanism that involves the binding of tylosin to the ribosome, preventing the formation of the mRNA-aminoacyl-tRNA-ribosome complex. The structure of one representative sequence from this family, NovP, shows it to be an S-adenosyl-l-methionine-dependent O-methyltransferase that catalyzes the penultimate step in the biosynthesis of the aminocoumarin antibiotic novobiocin. Specifically, it methylates at 4-OH of the noviose moiety, and the resultant methoxy group is important for the potency of the mature antibiotic. It is likely that the key structural features of NovP are common to the rest of the family and include: a helical 'lid' region that gates access to the co-substrate binding pocket and an active centre that contains a 3-Asp putative metal binding site. A further conserved Asp probably acts as the general base that initiates the reaction by de-protonating the 4-OH group of the noviose unit. 256
48826 399022 pfam05712 MRG MRG. This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif. 184
48827 399023 pfam05713 MobC Bacterial mobilisation protein (MobC). This family consists of several bacterial MobC-like, mobilisation proteins. MobC proteins belong to the group of relaxases. Together with MobA and MobB they bind to a single cis-active site of a mobilising plasmid, the origin of transfer (oriT) region. The absence of MobC has several different effects on oriT DNA. Site- and strand-specific nicking by MobA protein is severely reduced, accounting for the lower frequency of mobilisation. The localized DNA strand separation required for this nicking is less affected, but becomes more sensitive to the level of active DNA gyrase in the cell. In addition, strand separation is not efficiently extended through the region containing the nick site. These effects suggest a model in which MobC acts as a molecular wedge for the relaxosome-induced melting of oriT DNA. The effect of MobC on strand separation may be partially complemented by the helical distortion induced by supercoiling. However, MobC extends the melted region through the nick site, thus providing the single-stranded substrate required for cleavage by MobA. 43
48828 399024 pfam05714 Borrelia_lipo_1 Borrelia burgdorferi virulent strain associated lipoprotein. This family consists of several virulent strain associated lipoproteins from the Lyme disease spirochete Borrelia burgdorferi. 184
48829 399025 pfam05715 zf-piccolo Piccolo Zn-finger. This (predicted) Zinc finger is found in the bassoon and piccolo proteins. There are eight conserved cysteines, suggesting that it coordinates two zinc ligands. 59
48830 399026 pfam05716 AKAP_110 A-kinase anchor protein 110 kDa (AKAP 110). This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction. 692
48831 399027 pfam05717 TnpB_IS66 IS66 Orf2 like protein. This protein is found in insertion sequences related to IS66. The function of these proteins is uncertain, but they are probably essential for transposition. 96
48832 283396 pfam05718 Pox_int_trans Poxvirus intermediate transcription factor. This family consists of several highly related Poxvirus sequences which are thought to be intermediate transcription factors. 382
48833 399028 pfam05719 GPP34 Golgi phosphoprotein 3 (GPP34). This family consists of several eukaryotic GPP34 like proteins. GPP34 localizes to the Golgi complex and is conserved from yeast to humans. The cytosolically exposed location of GPP34 predicts a role for a novel coat protein in Golgi trafficking. 197
48834 283398 pfam05720 Dicty_CAD Cell-cell adhesion domain. This family is based on a group of Dictyostelium discoideum proteins that are essential in early development. csbA and csbB are located on the cell surface and mediate cell-cell adhesion. 75
48835 399029 pfam05721 PhyH Phytanoyl-CoA dioxygenase (PhyH). This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues. 213
48836 114448 pfam05722 Ustilago_mating Ustilago B locus mating-type protein. This family consists of several Ustilago mating-type proteins. The b locus of the phytopathogenic fungus Ustilago maydis encodes a multiallelic recognition function that controls the ability of the fungus to form a dikaryon and complete the sexual stage of the life cycle. The b locus has at least 25 alleles and any combination of two different alleles, brought together by mating between haploid cells, allows the fungus to cause disease and undergo sexual development within the plant. 239
48837 399030 pfam05724 TPMT Thiopurine S-methyltransferase (TPMT). This family consists of thiopurine S-methyltransferase proteins from both eukaryotes and prokaryotes. Thiopurine S-methyltransferase (TPMT) is a cytosolic enzyme that catalyzes S-methylation of aromatic and heterocyclic sulfhydryl compounds, including anticancer and immunosuppressive thiopurines. 218
48838 191356 pfam05725 FNIP FNIP Repeat. This repeat is approximately 22 residues long and is only found in Dictyostelium discoideum. It appears to be related to pfam00560 (personal obs:C Yeats). The alignment consists of two tandem repeats. It is termed the FNIP repeat after the pattern of conserved residues. 44
48839 399031 pfam05726 Pirin_C Pirin C-terminal cupin domain. This region is found in the C-terminal half of the Pirin protein. 103
48840 283402 pfam05727 UPF0228 Uncharacterized protein family (UPF0228). This small family of proteins is currently restricted to Methanosarcina species. Members of this family are about 200 residues in length, except for MA_2565 that has two copies of this region. Although the function of this region is unknown the pattern of conservation suggests that this may be an enzyme, including multiple conserved aspartate and glutamate residues (Bateman A. pers. obs.). The most conserved motif in these proteins is NEL/MEXNE/D, where X can be any amino acid, which is found at the C-terminus of these proteins. 124
48841 283403 pfam05728 UPF0227 Uncharacterized protein family (UPF0227). Despite being classed as uncharacterized proteins, the members of this family are almost certainly enzymes that are distantly related to the pfam00561. 187
48842 399032 pfam05729 NACHT NACHT domain. This NTPase domain is found in apoptosis proteins as well as those involved in MHC transcription activation. This family is closely related to pfam00931. 166
48843 399033 pfam05730 CFEM CFEM domain. This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis. The structure of the CFEM domain containing protein 'Surface antigen protein 2' from Candida albicans has been solved. 66
48844 399034 pfam05731 TROVE TROVE domain. This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding. 362
48845 253356 pfam05732 RepL Firmicute plasmid replication protein (RepL). This family consists of Firmicute RepL proteins which are involved in plasmid replication. 165
48846 368585 pfam05733 Tenui_N Tenuivirus/Phlebovirus nucleocapsid protein. This family consists of several Tenuivirus and Phlebovirus nucleocapsid proteins. These are ssRNA viruses. 224
48847 283407 pfam05734 DUF832 Herpesvirus protein of unknown function (DUF832). This family consists of several herpesvirus proteins of unknown function. 228
48848 399035 pfam05735 TSP_C Thrombospondin C-terminal region. This region is found at the C-terminus of thrombospondin and related proteins. 198
48849 399036 pfam05736 OprF OprF membrane domain. This domain represents the presumed membrane spanning region of the OprF proteins. This region is involved in channel formation and is thought to form an 8-stranded beta-barrel. 156
48850 283410 pfam05737 Collagen_bind Collagen binding domain. The domain fold is a jelly-roll, composed of two antiparallel beta-sheets and two short alpha-helices. A groove on beta-sheet I exhibited the best surface complementarity to the collagen. This site partially overlaps with the peptide sequence previously shown to be critical for collagen binding. Recombinant proteins containing single amino acid mutations designed to disrupt the surface of the putative binding site exhibited significantly lower affinities for collagen. 129
48851 399037 pfam05738 Cna_B Cna protein B-type domain. This domain is found in Staphylococcus aureus collagen-binding surface protein. The structure of the repetitive B-region has been solved and forms a beta sandwich structure. 87
48852 399038 pfam05739 SNARE SNARE domain. Most if not all vesicular membrane fusion events in eukaryotic cells are believed to be mediated by a conserved fusion machinery, the SNARE [soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors] machinery. The SNARE domain is thought to act as a protein-protein interaction module in the assembly of a SNARE protein complex. 52
48853 399039 pfam05741 zf-nanos Nanos RNA binding domain. This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localized determinant of posterior pattern. Nanos RNA is localized to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localized source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localized and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development. 53
48854 399040 pfam05742 TANGO2 Transport and Golgi organisation 2. In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse Tango2 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif. This gene has been characterized in Drosophila melanogaster and named as transport and Golgi organisation 2, hence the name Tango2. 254
48855 399041 pfam05743 UEV UEV domain. This family includes the eukaryotic tumor susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. However, the involvement of this gene in neoplastic transformation and tumorigenesis is still elusive. TSG101 is required for normal cell function of embryonic and adult tissues, but this gene is not a tumor suppressor for sporadic forms of breast cancer. This family is related to the ubiquitin conjugating enzymes. 119
48856 283416 pfam05744 Benyvirus_P25 Benyvirus P25/P26 protein. This family consists of P25 and P26 proteins from the beet necrotic yellow vein viruses. 240
48857 368591 pfam05745 CRPA Chlamydia 15 kDa cysteine-rich outer membrane protein (CRPA). This family consists of several Chlamydia 15 kDa cysteine-rich outer membrane proteins which are associated with differentiation of reticulate bodies (RBs) into elementary bodies (EBs). 150
48858 399042 pfam05746 DALR_1 DALR anticodon binding domain. This all alpha helical domain is the anticodon binding domain in Arginyl and glycyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids. 117
48859 283418 pfam05748 Rubella_E1 Rubella membrane glycoprotein E1. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. The E1 has been shown to be a type 1 membrane protein, rich in cysteine residues with extensive intramolecular disulfide bonds. 496
48860 283419 pfam05749 Rubella_E2 Rubella membrane glycoprotein E2. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. 267
48861 399043 pfam05750 Rubella_Capsid Rubella capsid protein. Rubella virus is an enveloped positive-strand RNA virus of the family Togaviridae. Virions are composed of three structural proteins: a capsid and two membrane-spanning glycoproteins, E2 and E1. During virus assembly, the capsid interacts with genomic RNA to form nucleocapsids. It has been discovered that capsid phosphorylation serves to negatively regulate binding of viral genomic RNA. This may delay the initiation of nucleocapsid assembly until sufficient amounts of virus glycoproteins accumulate at the budding site and/or prevent non-specific binding to cellular RNA when levels of genomic RNA are low. It follows that at a late stage in replication, the capsid may undergo dephosphorylation before nucleocapsid assembly occurs. 269
48862 399044 pfam05751 FixH FixH. This family consists of several Rhizobium FixH like proteins. It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG. 150
48863 399045 pfam05752 Calici_MSP Calicivirus minor structural protein. This family consists of minor structural proteins largely from human calicivirus isolates. Human calicivirus causes gastroenteritis. The function of this family is unknown. 165
48864 399046 pfam05753 TRAP_beta Translocon-associated protein beta (TRAPB). This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion. 178
48865 399047 pfam05754 DUF834 Domain of unknown function (DUF834). This short presumed domain is found in a large number of hypothetical plant proteins. The domain is quite rich in conserved glycine residues. It occurs in some putative transposons but currently has no known function. 53
48866 399048 pfam05755 REF Rubber elongation factor protein (REF). This family consists of the highly related rubber elongation factor (REF), small rubber particle protein (SRPP) and stress-related protein (SRP) sequences. REF and SRPP are released from the rubber particle membrane into the cytosol during osmotic lysis of the sedimentable organelles (lutoids). The exact function of this family is unknown. 206
48867 399049 pfam05756 S-antigen S-antigen protein. S-antigens are heat stable proteins that are found in the blood of individuals infected with malaria. 92
48868 399050 pfam05757 PsbQ Oxygen evolving enhancer protein 3 (PsbQ). This family consists of the plant specific oxygen evolving enhancer protein 3 (PsbQ). Photosystem II (PSII) is a pigment-protein complex, which consists of at least 25 different protein subunits, at present denoted PsbA-Z according to the genes that encode them. PsbQ plays an important role in the lumenal oxygen-evolving activity of PSII from higher plants and green algae. 198
48869 368599 pfam05758 Ycf1 Ycf1. The chloroplast genomes of most higher plants contain two giant open reading frames designated ycf1 and ycf2. Although the function of Ycf1 is unknown, it is known to be an essential gene. 944
48870 399051 pfam05760 IER Immediate early response protein (IER). This family consists of several eukaryotic immediate early response (IER) 2 and 5 proteins. The role of IER5 is unclear although it plays an important role in mediating the cellular response to mitogenic signals. Again, little is known about the function of IER2 although it is thought to play a role in mediating the cellular responses to a variety of extracellular signals. 304
48871 399052 pfam05761 5_nucleotid 5' nucleotidase family. This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5. 444
48872 399053 pfam05762 VWA_CoxE VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA (von Willebrand factor type A) domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria. 221
48873 368601 pfam05763 DUF835 Protein of unknown function (DUF835). The members of this archaebacterial protein family are around 250-300 amino acid residues in length. The function of these proteins is not known. 136
48874 399054 pfam05764 YL1 YL1 nuclear protein. The proteins in this family are designated YL1. These proteins have been shown to be DNA-binding and may be a transcription factor. 246
48875 368603 pfam05766 NinG Bacteriophage Lambda NinG protein. NinG or Rap is involved in recombination. Rap (recombination adept with plasmid) increases lambda-by-plasmid recombination catalyzed by Escherichia coli's RecBCD pathway. 186
48876 283435 pfam05767 Pox_A14 Poxvirus virion envelope protein A14. This family consists of several Poxvirus virion envelope protein A14 like sequences. A14 is a component of the virion membrane and has been found to be an H1 phosphatase substrate in vivo and in vitro. A14 is hyperphosphorylated on serine residues in the absence of H1 expression. 93
48877 399055 pfam05768 DUF836 Glutaredoxin-like domain (DUF836). These proteins are related to the pfam00462 family. 80
48878 399056 pfam05769 SIKE SIKE family. This family consists of several eukaryotic proteins. Suppressor of IKBKE 1 (SIKE) is a physiological suppressor of IKK-epsilon and TBK1, which are two IKK-related kinases involved in virus- and TLR3-triggered activation of interferon regulatory factor 3 (IRF-3). Other members of this family are circulating cathodic antigen (CCA), found in Schistosoma mansoni (Blood fluke), and FGFR1 oncogene partner 2, which may be involved in wound healing pathway. 180
48879 368606 pfam05770 Ins134_P3_kin Inositol 1, 3, 4-trisphosphate 5/6-kinase. This family consists of several inositol 1, 3, 4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase. 201
48880 283438 pfam05771 Pox_A31 Poxvirus A31 protein. 113
48881 399057 pfam05772 NinB NinB protein. The ninR region of phage lambda contains two recombination genes, orf (ninB) and rap (ninG), that have roles when the RecF and RecBCD recombination pathways of E. coli, respectively, operate on phage lambda. NinB binds to single-stranded DNA. 117
48882 399058 pfam05773 RWD RWD domain. This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices. 111
48883 283441 pfam05774 Herpes_heli_pri Herpesvirus helicase-primase complex component. This family consists of several helicase-primase complex components from the Gammaherpesviruses. 127
48884 283442 pfam05775 AfaD Enterobacteria AfaD invasin protein. This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. 105
48885 147756 pfam05776 Papilloma_E5A Papillomavirus E5A protein. Human papillomaviruses (HPVs) are epitheliotropic viruses, and their life cycle is intimately linked to the stratification and differentiation state of the host epithelial tissues. The kinetics of E5a protein expression during the complete viral life cycle has been studied and the highest level was found to be coincidental with the onset of virion morphogenesis. 91
48886 283443 pfam05777 Acp26Ab Drosophila accessory gland-specific peptide 26Ab (Acp26Ab). This family consists of accessory gland-specific 26Ab peptides or male accessory gland secretory protein 355B from different Drosophila species. Drosophila males, like males of most other insects, transfer a group of specific proteins (Acp26Ab and Acp26Aa in Drosophila) to the females during mating. These proteins are produced primarily in the accessory gland and are likely to influence the female's reproduction. 90
48887 399059 pfam05778 Apo-CIII Apolipoprotein CIII (Apo-CIII). This family consists of several mammalian apolipoprotein CIII (Apo-CIII) sequences. Apolipoprotein C-III is a 79-residue glycoprotein. It is synthesized in the intestine and liver as part of the very low density lipoprotein (VLDL) and the high density lipoprotein (HDL) particles. Owing to its positive correlation with plasma triglyceride (Tg) levels, Apo-CIII is suggested to play a role in Tg metabolism and is therefore of interest regarding atherosclerosis. However, unlike other apolipoproteins such as Apo-AI, Apo E or CII for which many naturally occurring mutations are known, the structure-function relationships of apo C-III remain a subject of debate. One possibility is that apo C-III inhibits lipoprotein lipase (LPL) activity, as shown by in vitro experiments. Another suggestion is that elevated levels of Apo-CIII displace other apolipoproteins at the lipoprotein surface, modifying their clearance from plasma. 68
48888 399060 pfam05781 MRVI1 MRVI1 protein. This family consists of mammalian MRVI1 proteins which are related to the lymphoid-restricted membrane protein (JAW1) and the IP3 receptor associated cGMP kinase substrates A and B (IRAGA and IRAGB). The function of MRVI1 is unknown although mutations in the Mrvi1 gene induce myeloid leukaemia by altering the expression of a gene important for myeloid cell growth and/or differentiation so it has been speculated that Mrvi1 is a tumor suppressor gene. IRAG is very similar in sequence to MRVI1 and is an essential NO/cGKI-dependent regulator of IP3-induced calcium release. Activation of cGKI decreases IP3-stimulated elevations in intracellular calcium, induces smooth muscle relaxation and contributes to the antiproliferative and pro-apoptotic effects of NO/cGMP. Jaw1 is a member of a class of proteins with COOH-terminal hydrophobic membrane anchors and is structurally similar to proteins involved in vesicle targeting and fusion. This suggests that the function and/or the structure of the ER in lymphocytes may be modified by lymphoid-restricted resident ER proteins. 521
48889 399061 pfam05782 ECM1 Extracellular matrix protein 1 (ECM1). This family consists of several eukaryotic extracellular matrix protein 1 (ECM1) sequences. ECM1 has been shown to regulate endochondral bone formation, stimulate the proliferation of endothelial cells and induce angiogenesis. Mutations in the ECM1 gene can cause lipoid proteinosis, a disorder which causes generalized thickening of skin, mucosae and certain viscera. Classical features include beaded eyelid papules and laryngeal infiltration leading to hoarseness. 559
48890 368612 pfam05783 DLIC Dynein light intermediate chain (DLIC). This family consists of several eukaryotic dynein light intermediate chain proteins. The light intermediate chains (LICs) of cytoplasmic dynein consist of multiple isoforms, which undergo post-translational modification to produce a large number of species. DLIC1 is known to be involved in assembly, organisation, and function of centrosomes and mitotic spindles when bound to pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2 that may play a role in maintaining Golgi organisation by binding cytoplasmic dynein 2 to its Golgi-associated cargo. 468
48891 283448 pfam05784 Herpes_UL82_83 Betaherpesvirus UL82/83 protein N-terminus. This family represents the N terminal region of the Betaherpesvirus UL82 and UL83 proteins. As viruses are reliant upon their host cell to serve as proper environments for their replication, many have evolved mechanisms to alter intracellular conditions to suit their own needs. Human cytomegalovirus induces quiescent cells to enter the cell cycle and then arrests them in late G(1), before they enter the S phase, a cell cycle compartment that is presumably favourable for viral replication. The protein product of the human cytomegalovirus UL82 gene, pp71, can accelerate the movement of cells through the G(1) phase of the cell cycle. This activity would help infected cells reach the late G(1) arrest point sooner and thus may stimulate the infectious cycle. pp71 also induces DNA synthesis in quiescent cells, but a pp71 mutant protein that is unable to induce quiescent cells to enter the cell cycle still retains the ability to accelerate the G(1) phase. Thus, the mechanism through which pp71 accelerates G(1) cell cycle progression appears to be distinct from the one that it employs to induce quiescent cells to exit G(0) and subsequently enter the S phase. 343
48892 283449 pfam05785 CNF1 Rho-activating domain of cytotoxic necrotizing factor. This family consists of several bacterial cytotoxic necrotizing factor proteins as well as related dermonecrotic toxin (DNT) from Bordetella species. Cytotoxic necrotizing factor 1 (CNF1) causes necrosis of rabbit skin and re-organisation of the actin cytoskeleton in cultured cells. Bordetella dermonecrotic toxin (DNT) stimulates the assembly of actin stress fibers and focal adhesions by deamidating or polyaminating Gln63 of the small GTPase Rho. DNT is an A-B toxin which is composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain. 286
48893 399062 pfam05786 Cnd2 Condensin complex subunit 2. This family consists of several Barren protein homologs from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologs in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localizes to chromatin throughout mitosis. Colocalization and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase. This family forms one of the three non-structural maintenance of chromosomes (SMC) subunits of the mitotic condensation complex along with Cnd1 and Cnd3. 743
48894 399063 pfam05787 DUF839 Bacterial protein of unknown function (DUF839). This family consists of several bacterial proteins of unknown function that contain predicted beta-propeller repeats. 504
48895 399064 pfam05788 Orbi_VP1 Orbivirus RNA-dependent RNA polymerase (VP1). This family consists of the RNA-dependent RNA polymerase protein VP1 from the Orbiviruses. VP1 may have both enzymatic and structural roles in the virus life cycle. 1297
48896 283452 pfam05789 Baculo_VP1054 Baculovirus VP1054 protein. This family consists of several VP1054 proteins from the Baculoviruses. VP1054 is a virus structural protein required for nucleocapsid assembly. 379
48897 399065 pfam05790 C2-set Immunoglobulin C2-set domain. 80
48898 368615 pfam05791 Bacillus_HBL Bacillus haemolytic enterotoxin (HBL). This family consists of several Bacillus haemolytic enterotoxins (HblC, HblD, HblA, NheA, and NheB) which can cause food poisoning in humans. 177
48899 399066 pfam05792 Candida_ALS Candida agglutinin-like (ALS). This family consists of several agglutinin-like proteins from different Candida species. ALS genes of Candida albicans encode a family of cell-surface glycoproteins with a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif of variable number, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The ALS family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying ALS genes and their encoded proteins. Fungal adhesins, which include sexual agglutinins, virulence factors, and flocculins, are surface proteins that mediate cell-cell and cell-environment interactions. It is possible that both the serine/threonine-rich domain and the cysteine residues in the C-terminal and DIPSY pfam11763 participate in anchoring the terminal domains inside the wall, so that only the inner part of Map4p, including the repeat region, is sticking out as a fold-back loop then able to act in adhesing. 33
48900 310411 pfam05793 TFIIF_alpha Transcription initiation factor IIF, alpha subunit (TFIIF-alpha). Transcription initiation factor IIF, alpha subunit (TFIIF-alpha) or RNA polymerase II-associating protein 74 (RAP74) is the large subunit of transcription factor IIF (TFIIF), which is essential for accurate initiation and stimulates elongation by RNA polymerase II. 528
48901 399067 pfam05794 Tcp11 T-complex protein 11. This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants. 389
48902 310413 pfam05795 Plasmodium_Vir Plasmodium vivax Vir protein. This family consists of several Vir proteins specific to Plasmodium vivax. The vir genes are present at about 600-1,000 copies per haploid genome and encode proteins that are immunovariant in natural infections, indicating that they may have a functional role in establishing chronic infection through antigenic variation. 371
48903 283458 pfam05796 Chordopox_G2 Chordopoxvirus protein G2. This family consists of several Chordopoxvirus isatin-beta-thiosemicarbazone dependent protein (protein G2) sequences. Inactivation of the gene coding for this protein renders the virus dependent upon isatin-beta-thiosemicarbazone (IBT) for growth. 215
48904 283459 pfam05797 Rep_4 Yeast trans-acting factor (REP1/REP2). This family consists of the yeast trans-acting factor B and C (REP1 and 2) proteins. The yeast plasmid stability system consists of two plasmid-coded proteins, Rep1 and Rep2, and a cis-acting locus, STB. The Rep proteins show both self- and cross-interactions in vivo and in vitro, and bind to the STB DNA with assistance from host factor(s). Within the yeast nucleus, the Rep1 and Rep2 proteins tightly associate with STB-containing plasmids into well organized plasmid foci that form a cohesive unit in partitioning. It is generally accepted that the protein-protein and DNA-protein interactions engendered by the Rep-STB system are central to plasmid partitioning. Point mutations in Rep1 that knock out interaction with Rep2 or with STB simultaneously block the ability of these Rep1 variants to support plasmid stability. 369
48905 283460 pfam05798 Phage_FRD3 Bacteriophage FRD3 protein. This family consists of bacteriophage FRD3 proteins. 75
48906 399068 pfam05800 GvpO Gas vesicle synthesis protein GvpO. This family consists of archaeal GvpO proteins which are required for gas vesicle synthesis. The family also contains two related sequences from Streptomyces coelicolor. 94
48907 283462 pfam05801 DUF840 Lagovirus protein of unknown function (DUF840). This family consists of several Lagovirus sequences of unknown function, largely from rabbit hemorrhagic disease virus. 113
48908 368619 pfam05802 EspB Enterobacterial EspB protein. EspB is a type-III-secreted pore-forming protein of enteropathogenic Escherichia coli (EPEC) which is essential for EPEC pathogenesis. EspB is also found in Citrobacter rodentium. 165
48909 283464 pfam05803 Chordopox_L2 Chordopoxvirus L2 protein. This family consists of several Chordopoxvirus L2 proteins. 79
48910 253396 pfam05804 KAP Kinesin-associated protein (KAP). This family consists of several eukaryotic kinesin-associated (KAP) proteins. Kinesins are intracellular multimeric transport motor proteins that move cellular cargo on microtubule tracks. It has been shown that the sea urchin KRP85/95 holoenzyme associates with a KAP115 non-motor protein, forming a heterotrimeric complex in vitro, called the Kinesin-II. 708
48911 399069 pfam05805 L6_membrane L6 membrane protein. This family consists of several eukaryotic L6 membrane proteins. L6, IL-TMP, and TM4SF5 are cell surface proteins predicted to have four transmembrane domains. Previous sequence analysis led to their assignment as members of the tetraspanin superfamily; it has now been found that they are not significantly related to genuine tetraspanins, but instead constitute their own L6 family. Several members of this family have been implicated in human cancer. 192
48912 399070 pfam05806 Noggin Noggin. This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation. 215
48913 399071 pfam05808 Podoplanin Podoplanin. This family consists of several mammalian podoplanin like proteins which are thought to control specifically the unique shape of podocytes. 134
48914 399072 pfam05810 NinF NinF protein. This family consists of several bacteriophage NinF proteins as well as related sequences from E. coli. 57
48915 399073 pfam05811 DUF842 Eukaryotic protein of unknown function (DUF842). This family consists of a number of conserved eukaryotic proteins of unknown function. The sequences carry three sets of CxxxC motifs, which might suggest a type of zinc-finger formation. 126
48916 283470 pfam05812 Herpes_BLRF2 Herpesvirus BLRF2 protein. This family consists of several Herpesvirus BLRF2 proteins. 119
48917 283471 pfam05813 Orthopox_F7 Orthopoxvirus F7 protein. 80
48918 114536 pfam05814 Ac76 Orf76 (Ac76). This family consists mainly of baculovirus proteins. Family members include Autographa californica multiple nucleopolyhedrovirus (AcMNPV), protein AC76. Ac76 has been shown to be involved in intranuclear microvesicle formation. Functional studies suggest that ac76 is essential for both BV (budded virus) and ODV (occlusion-derived virus) development but is not required for viral DNA synthesis. 83
48919 283472 pfam05815 DUF844 Baculovirus protein of unknown function (DUF844). This family consists of several Baculovirus sequences of between 350 and 380 residues long. The family has no known function. 377
48920 399074 pfam05816 TelA Toxic anion resistance protein (TelA). This family consists of several prokaryotic TelA like proteins. TelA and KlA are associated with tellurite resistance and plasmid fertility inhibition. 330
48921 399075 pfam05817 Ribophorin_II Oligosaccharyltransferase subunit Ribophorin II. This family contains eukaryotic Ribophorin II (RPN2) proteins. The mammalian oligosaccharyltransferase (OST) is a protein complex that effects the cotranslational N-glycosylation of newly synthesized polypeptides, and is composed of the following proteins: ribophorins I and II (RI and RII), OST48, and Dadl, N33/IAP, OST4, STT3. The family also includes the SWP1 protein from yeast. In yeast the oligosaccharyltransferase complex is composed of 7 or 8 subunits, SWP1 being one of them. 625
48922 399076 pfam05818 TraT Enterobacterial TraT complement resistance protein. The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between an Escherichia coli and its surroundings. 214
48923 399077 pfam05819 NolX NolX protein. This family consists of Rhizobium NolX and Xanthomonas HrpF proteins. The interaction between the plant pathogen Xanthomonas campestris pv. vesicatoria and its host plants is controlled by hrp genes (hypersensitive reaction and pathogenicity), which encode a type III protein secretion system. Among type III-secreted proteins are avirulence proteins, effectors involved in the induction of plant defense reactions. HrpF, which is dispensable for protein secretion but required for AvrBs3 recognition in planta, is thought to function as a translocator of effector proteins into the host cell. NolX, a soybean cultivar specificity protein, is secreted by a type III secretion system (TTSS) and shows homology to HrpF of the plant pathogen Xanthomonas campestris pv. vesicatoria. It is not known whether NolX functions at the bacterium-plant interface or acts inside the host cell. NolX is expressed in planta only during the early stages of nodule development. 438
48924 399078 pfam05820 Ac81 Baculoviridae AC81. This family consists of several highly related Baculovirus proteins and includes Autographa californica multiple nucleopolyhedrovirus (AcMNPV) protein AC81, which is required for nucleocapsid envelopment. Ac81 contains a functional hydrophobic transmembrane (TM) domain, whose deletion resulted in a phenotype similar to that of Ac81 knockout. 178
48925 399079 pfam05821 NDUF_B8 NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI or NDUFB8). This family consists of several eukaryotic NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI) proteins. NADH:ubiquinone oxidoreductase (complex I) is an extremely complicated multiprotein complex located in the inner mitochondrial membrane. Its main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. Human complex I appears to consist of 41 subunits. 166
48926 310424 pfam05822 UMPH-1 Pyrimidine 5'-nucleotidase (UMPH-1). This family consists of several eukaryotic pyrimidine 5'-nucleotidase proteins. P5'N-1, also known as uridine monophosphate hydrolase-1 (UMPH-1), is a member of a large functional group of enzymes, characterized by the ability to dephosphorylate nucleic acids. P5'N-1 catalyzes the dephosphorylation of pyrimidine nucleoside monophosphates to the corresponding nucleosides. Deficiencies in this protein's function can lead to several different disorders in humans. 246
48927 399080 pfam05823 Gp-FAR-1 Nematode fatty acid retinoid binding protein (Gp-FAR-1). Parasitic nematodes produce at least two structurally novel classes of small helix-rich retinol- and fatty-acid-binding proteins that have no counterparts in their plant or animal hosts and thus represent potential targets for new nematicides. Gp-FAR-1 is a member of the nematode-specific fatty-acid- and retinol-binding (FAR) family of proteins but localizes to the surface of the organism, placing it in a strategic position for interaction with the host. Gp-FAR-1 functions as a broad-spectrum retinol- and fatty-acid-binding protein, and it is thought that it is involved in the evasion of primary host plant defense systems. 142
48928 399081 pfam05824 Pro-MCH Pro-melanin-concentrating hormone (Pro-MCH). This family consists of several mammalian pro-melanin-concentrating hormone (Pro-MCH) 1 and 2 proteins. Melanin-concentrating hormone (MCH) is a 19 amino acid cyclic peptide that was first isolated from the pituitary of teleost fish. It is produced from pro-MCH that encodes, in addition to MCH, NEI, and a putative peptide, NGE. In lower vertebrates, MCH acts to regulate skin colour by antagonising the melanin-dispersing actions of alpha-melanocyte stimulating hormone (alpha-MSH). In mammals, MCH serves as a neuropeptide and is found in many regions of the brain and especially the hypothalamus. It affects many types of behaviours such as appetite, sexual receptivity, aggression, and anxiety. MCH also stimulates the release of luteinising hormone. 85
48929 283482 pfam05825 PSP94 Beta-microseminoprotein (PSP-94). This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from human seminal plasma. The exact function of this protein is unknown. 94
48930 399082 pfam05826 Phospholip_A2_2 Phospholipase A2. This family consists of several phospholipase A2 like proteins mostly from insects. 95
48931 399083 pfam05827 ATP-synt_S1 Vacuolar ATP synthase subunit S1 (ATP6S1). This family consists of eukaryotic vacuolar ATP synthase subunit S1 proteins. The threshold is set high to avoid the inclusion of BIG1 ER integral membrane proteins which are involved in cell wall organisation and biogenesis. 149
48932 310429 pfam05829 Adeno_PX Adenovirus late L2 mu core protein (Protein X). This family consists of several Adenovirus late L2 mu core protein or Protein X sequences. 41
48933 399084 pfam05830 NodZ Nodulation protein Z (NodZ). The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal. 320
48934 399085 pfam05831 GAGE GAGE protein. This family consists of several GAGE and XAGE proteins which are found exclusively in humans. The function of this family is unknown although they have been implicated in human cancers. 107
48935 399086 pfam05832 DUF846 Eukaryotic protein of unknown function (DUF846). This family consists of several proteins of unknown function from a variety of eukaryotic organisms. 139
48936 399087 pfam05833 FbpA Fibronectin-binding protein A N-terminus (FbpA). This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host. 452
48937 310433 pfam05834 Lycopene_cycl Lycopene cyclase protein. This family consists of lycopene beta and epsilon cyclase proteins. Carotenoids with cyclic end groups are essential components of the photosynthetic membranes in all plants, algae, and cyanobacteria. These lipid-soluble compounds protect against photo-oxidation, harvest light for photosynthesis, and dissipate excess light energy absorbed by the antenna pigments. The cyclisation of lycopene (psi, psi-carotene) is a key branch point in the pathway of carotenoid biosynthesis. Two types of cyclic end groups are found in higher plant carotenoids: the beta and epsilon rings. Carotenoids with two beta rings are ubiquitous, and those with one beta and one epsilon ring are common; however, carotenoids with two epsilon rings are rare. 380
48938 399088 pfam05835 Synaphin Synaphin protein. This family consists of several eukaryotic synaphin 1 and 2 proteins. Synaphin/complexin is a cytosolic protein that preferentially binds to syntaxin within the SNARE complex. Synaphin promotes SNAREs to form precomplexes that oligomerize into higher order structures. A peptide from the central, syntaxin binding domain of synaphin competitively inhibits these two proteins from interacting and prevents SNARE complexes from oligomerising. It is thought that oligomerization of SNARE complexes into a higher order structure creates a SNARE scaffold for efficient, regulated fusion of synaptic vesicles. Synaphin promotes neuronal exocytosis by promoting interaction between the complementary syntaxin and synaptobrevin transmembrane regions that reside in opposing membranes prior to fusion. 142
48939 283492 pfam05836 Chorion_S16 Chorion protein S16. This family consists of several examples of the fruit fly specific chorion protein S16. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 108
48940 399089 pfam05837 CENP-H Centromere protein H (CENP-H). This family consists of several eukaryotic centromere protein H (CENP-H) sequences. The macromolecular centromere-kinetochore complex plays a critical role in sister chromatid separation, but its complete protein composition as well as its precise dynamic function during mitosis has not yet been clearly determined. CENP-H contains a coiled-coil structure and a nuclear localization signal. CENP-H is specifically and constitutively localized in kinetochores throughout the cell cycle. CENP-H may play a role in kinetochore organisation and function throughout the cell cycle. This is the C-terminus of the region, which is conserved from fungi to humans. 112
48941 399090 pfam05838 Glyco_hydro_108 Glycosyl hydrolase 108. This family acts as a lysozyme (N-acetylmuramidase), EC:3.2.1.17. It contains a conserved EGGY motif near the N-terminus, the glutamic acid within this motif is essential for catalytic activity. In bacteria, it may activate the secretion of large proteins via the breaking and rearrangement of the peptidoglycan layer during secretion. It is frequently found at the N-terminus of proteins containing a C-terminal pfam09374 domain. 86
48942 310437 pfam05839 Apc13p Apc13p protein. The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p. 89
48943 336220 pfam05840 Phage_GPA Bacteriophage replication gene A protein (GPA). This family consists of a group of bacteriophage replication gene A protein (GPA) like sequences from both viruses and bacteria. The members of this family are likely to be endonucleases. 321
48944 399091 pfam05841 Apc15p Apc15p protein. The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc15p. 118
48945 191385 pfam05842 Euplotes_phero Euplotes octocarinatus mating pheromone protein. This family consists of several mating pheromone proteins from Euplotes octocarinatus. Cells of the ten mating types of the ciliate Euplotes octocarinatus communicate by pheromones before they enter conjugation. The pheromones induce homotypic pairing when applied to mating types that do not secrete the same pheromone(s). Heterotypic pairs (i.e., those between cells of different mating types) are formed only when both mating types in a mixture secrete a pheromone that the other does not. The genetics of mating types is based on four codominant mating type alleles, each allele determining production of a different pheromone. The pheromones not only induce pair formation but also attract cells. 135
48946 399092 pfam05843 Suf Suppressor of forked protein (Suf). This family consists of several eukaryotic suppressor of forked (Suf) like proteins. The Drosophila melanogaster Suppressor of forked [Su(f)] protein shares homology with the yeast RNA14 protein and the 77-kDa subunit of human cleavage stimulation factor, which are proteins involved in mRNA 3' end formation. This suggests a role for Su(f) in mRNA 3' end formation in Drosophila. The su(f) gene produces three transcripts; two of them are polyadenylated at the end of the transcription unit, and one is a truncated transcript, polyadenylated in intron 4. It is thought that su(f) plays a role in the regulation of poly(A) site utilisation and an important role of the GU-rich sequence for this regulation to occur. 291
48947 283498 pfam05844 YopD YopD protein. This family consists of several bacterial YopD like proteins. Virulent Yersinia species harbour a common plasmid that encodes essential virulence determinants (Yersinia outer proteins [Yops]), which are regulated by the extracellular stimuli Ca2+ and temperature. YopD is thought to be a possible transmembrane protein and contains an amphipathic alpha-helix in its carboxy terminus. 292
48948 399093 pfam05845 PhnH Bacterial phosphonate metabolism protein (PhnH). This family consists of several bacterial PhnH sequences which are known to be involved in phosphonate metabolism. 182
48949 283500 pfam05846 Chordopox_A15 Chordopoxvirus A15 protein. This family consists of several Chordopoxvirus A15 like sequences. 90
48950 283501 pfam05847 Baculo_LEF-3 Nucleopolyhedrovirus late expression factor 3 (LEF-3). This family consists of LEF-3 Nucleopolyhedrovirus late expression factor 3 (LEF-3) sequences which are known to be ssDNA-binding proteins. Alkaline nuclease (AN) and LEF-3 may participate in homologous recombination of the baculovirus genome in a manner similar to that of exonuclease (Redalpha) and DNA-binding protein (Redbeta) of the Red-mediated homologous recombination system of bacteriophage lambda. LEF-3 is essential for transporting the putative baculovirus helicase protein P143 into the nucleus where they function together during viral DNA replication. LEF-3 and other proteins have been shown to bind to closely linked sites on viral chromatin in vivo, suggesting that they may form part of the baculovirus replisome. 364
48951 399094 pfam05848 CtsR Firmicute transcriptional repressor of class III stress genes (CtsR). This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon. 72
48952 368637 pfam05849 L-fibroin Fibroin light chain (L-fibroin). This family consists of several moth fibroin light chain (L-fibroin) proteins. Fibroin of the silkworm, Bombyx mori, is secreted into the lumen of posterior silk gland (PSG) from the surrounding PSG cells as a molecular complex consisting of a heavy (H)-chain of approximately 350 kDa, a light (L)-chain of 25 kDa and a P25 of about 27 kDa. The H- and L-chains are disulfide-linked but P25 is associated with the H-L complex by non-covalent force. 239
48953 147807 pfam05851 Lentivirus_VIF Lentivirus virion infectivity factor (VIF). This family consists of several feline specific Lentivirus virion infectivity factor (VIF) proteins. VIF is essential for productive FIV infection of host target cells in vitro. 250
48954 283504 pfam05852 DUF848 Gammaherpesvirus protein of unknown function (DUF848). This family consists of several uncharacterized proteins from the Gammaherpesvirinae. 145
48955 399095 pfam05853 BKACE beta-keto acid cleavage enzyme. BKACE, beta-keto acid cleavage enzyme, plays a role in lysine degradation. In certain instances it catalyzes the conversion of 3-keto-5-aminohexanoate and acetyl-CoA into acetoacetate and 3-aminobutyryl-CoA. The family is found to have at least 14 slightly different potential new enzymatic activities, all of which can therefore be designated as beta-keto acid cleavage enzymes. 274
48956 399096 pfam05854 MC1 Non-histone chromosomal protein MC1. This family consists of archaeal chromosomal protein MC1 sequences which protect DNA against thermal denaturation. 90
48957 399097 pfam05856 ARPC4 ARP2/3 complex 20 kDa subunit (ARPC4). This family consists of several eukaryotic ARP2/3 complex 20 kDa subunit (P20-ARC) proteins. The Arp2/3 protein complex has been implicated in the control of actin polymerization in cells. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3. It has been suggested that the complex promotes actin assembly in lamellipodia and may participate in lamellipodial protrusion. 166
48958 399098 pfam05857 TraX TraX protein. This family consists of several bacterial TraX proteins. TraX is responsible for the amino-terminal acetylation of F-pilin subunits. 215
48959 147812 pfam05858 BIV_Env Bovine immunodeficiency virus surface protein (SU). The bovine lentivirus also known as the bovine immunodeficiency-like virus (BIV) has conserved and hypervariable regions in the surface envelope gene. This family corresponds to the SU surface protein. 548
48960 399099 pfam05859 Mis12 Mis12 protein. Kinetochores are the chromosomal sites for spindle interaction and play a vital role in chromosome segregation. Fission yeast kinetochore protein Mis12 is required for correct spindle morphogenesis, determining metaphase spindle length. Thirty-five to sixty percent extension of metaphase spindle length takes place in Mis12 mutants. It has been shown that Mis12 genetically interacts with Mal2, another inner centromere core complex protein in S. pombe. 137
48961 399100 pfam05860 Haemagg_act haemagglutination activity domain. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins. 118
48962 399101 pfam05861 PhnI Bacterial phosphonate metabolism protein (PhnI). This family consists of several Proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organized in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex. 346
48963 147815 pfam05862 IceA2 Helicobacter pylori IceA2 protein. This family consists of several Helicobacter pylori specific IceA2 proteins. The function of this family is unknown. 59
48964 283512 pfam05864 Chordopox_RPO7 Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7). This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA. 63
48965 310449 pfam05865 Cypo_polyhedrin Cypovirus polyhedrin protein. This family consists of several Cypovirus polyhedrin proteins. Polyhedrin is known to form a crystalline matrix (polyhedra) in infected insect cells. 248
48966 399102 pfam05866 RusA Endodeoxyribonuclease RusA. This family consists of several bacterial and phage Holliday junction resolvase (RusA) like proteins. The RusA protein of Escherichia coli is an endonuclease that can resolve Holliday intermediates and correct the defects in genetic recombination and DNA repair associated with inactivation of RuvAB or RuvC. 122
48967 399103 pfam05867 DUF851 Protein of unknown function (DUF851). 241
48968 114586 pfam05868 Rotavirus_VP7 Rotavirus major outer capsid protein VP7. This family consists of several Rotavirus major outer capsid protein VP7 sequences. The rotavirus capsid is composed of three concentric protein layers. Proteins VP4 and VP7 comprise the outer layer. VP4 forms spikes and is the viral attachment protein. VP7 is a glycoprotein and the major constituent of the outer protein layer. 249
48969 399104 pfam05869 Dam DNA N-6-adenine-methyltransferase (Dam). This family consists of several bacterial and phage DNA N-6-adenine-methyltransferase (Dam) like sequences. 165
48970 399105 pfam05870 PA_decarbox Phenolic acid decarboxylase (PAD). This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. The Phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD. 158
48971 399106 pfam05871 ESCRT-II ESCRT-II complex subunit. This family of conserved eukaryotic proteins are subunits of the endosome associated complex ESCRT-II which recruits transport machinery for protein sorting at the multivesicular body (MVB). This protein complex transiently associates with the endosomal membrane and thereby initiates the formation of ESCRT-III, a membrane-associated protein complex that functions immediately downstream of ESCRT-II during sorting of MVB cargo. ESCRT-II in turn functions downstream of ESCRT-I, a protein complex that binds to ubiquitinated endosomal cargo. 133
48972 283518 pfam05872 DUF853 Bacterial protein of unknown function (DUF853). This family consists of several bacterial proteins of unknown function. BMEI1370 is thought to be an ATPase. 503
48973 399107 pfam05873 Mt_ATP-synt_D ATP synthase D chain, mitochondrial (ATP5H). This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts. 154
48974 368647 pfam05874 PBAN Pheromone biosynthesis activating neuropeptide (PBAN). This family consists of several moth pheromone biosynthesis activating neuropeptide (PBAN) sequences. Female moths produce and release species specific sex pheromones to attract males for mating. Pheromone biosynthesis is hormonally regulated by the Pheromone Biosynthesis Activating Neuropeptide (PBAN) which is biosynthesized in the subesophageal ganglion (SOG). 184
48975 399108 pfam05875 Ceramidase Ceramidase. This family consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates, EC:3.5.1.23. This family belongs to the CREST superfamily, which are distantly related to the GPCRs. 260
48976 368649 pfam05876 Terminase_GpA Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities. 559
48977 283523 pfam05878 Phyto_Pns9_10 Phytoreovirus nonstructural protein Pns9/Pns10. This family consists of the Phytoreovirus nonstructural proteins Pns9 and Pns10. The function of this family is unknown. 344
48978 399109 pfam05879 RHD3 Root hair defective 3 GTP-binding protein (RHD3). This family consists of several eukaryotic root hair defective 3 like GTP-binding proteins. It has been speculated that the RHD3 protein is a member of a novel class of GTP-binding proteins that is widespread in eukaryotes and required for regulated cell enlargement. The family also contains the homologous yeast synthetic enhancement of YOP1 (SEY1) protein which is involved in membrane trafficking. 639
48979 283524 pfam05880 Fiji_64_capsid Fijivirus 64 kDa capsid protein. This family consists of several Fijivirus 64 kDa capsid proteins. 554
48980 399110 pfam05881 CNPase 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP or CNPase). This family consists of the eukaryotic protein 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP). 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2',3'-cyclic nucleotides to produce 2'-nucleotides in vitro. Although physiologically relevant substrates with 2',3'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2',3'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity. CNPases belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT, vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacterial and viral proteins. 214
48981 283526 pfam05883 Baculo_RING Baculovirus U-box/Ring-like domain. This family consists of several Baculovirus proteins of around 130 residues in length. The function of this family is unknown, but it appears to be related to the U-box and ring finger domain by profile-profile comparison. 138
48982 368652 pfam05884 ZYG-11_interact Interactor of ZYG-11. This family of proteins represents the protein product of the gene W03D8.9 which has been identified as an interactor of ZYG-11. ZYG-11 is the substrate-recognition subunit for a CUL-2 based complex that regulates cell division and embryonic development. 295
48983 283528 pfam05886 Orthopox_F8 Orthopoxvirus F8 protein. This family consists of several Orthopoxvirus F8 proteins. The function of this family is unknown. 65
48984 368653 pfam05887 Trypan_PARP Procyclic acidic repetitive protein (PARP). This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of Trypanosoma brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated. 134
48985 399111 pfam05889 SepSecS O-phosphoseryl-tRNA(Sec) selenium transferase, SepSecS. Early annotation suggested this family, SepSecS, of several eukaryotic and archaeal proteins, was involved in antigen-antibodies responses in the liver and pancreas. Structural studies show that the family is O-phosphoseryl-tRNA(Sec) selenium transferase, an enzyme involved in the synthesis of the amino acid selenocysteine (Sec). Sec is the only amino acid whose biosynthesis occurs on its cognate transfer RNA (tRNA). SepSecS catalyzes the final step in the formation of the amino acid. The early observation that autoantibodies isolated from patients with type I autoimmune hepatitis targeted a ribonucleoprotein complex containing tRNASec led to the identification and characterization of the archaeal and the human SepSecS. SepSecS forms its own branch in the family of fold-type I pyridoxal phosphate (PLP) enzymes that goes back to the last universal common ancestor which explains why the archaeal sequences spcS and MK0229 are annotated as being pyridoxal phosphate-dependent enzymes. 389
48986 399112 pfam05890 Ebp2 Eukaryotic rRNA processing protein EBP2. This family consists of several Eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis. This family also plays a role in chromosome segregation. 273
48987 283530 pfam05891 Methyltransf_PK AdoMet dependent proline di-methyltransferase. This protein is expressed in the tail neuron PVT and in uterine cells in C. elegans [worm-base]. In Saccharomyces cerevisiae this is AdoMet dependent proline di-methyltransferase. This enzyme catalyzes the di-methylation of ribosomal proteins Rpl12 and Rps25 at N-terminal proline residues. The methyltransferases described here specifically recognize the N-terminal X-Pro-Lys sequence motif, and they may account for nearly all previously described eukaryotic protein N-terminal methylation reactions. A number of other yeast and human proteins also share the recognition motif and may be similarly modified. As with other methyltransferases, this family carries the characteristic GxGxG motif. 217
48988 283531 pfam05892 Tricho_coat Trichovirus coat protein. This family consists of several coat proteins which are specific to the ssRNA positive-strand, no DNA stage viruses such as the Trichovirus and Vitivirus. 195
48989 399113 pfam05893 LuxC Acyl-CoA reductase (LuxC). This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalyzed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components. 401
48990 399114 pfam05894 Podovirus_Gp16 Podovirus DNA encapsidation protein (Gp16). This family consists of several DNA encapsidation protein (Gp16) sequences from the phi-29-like viruses. Gene product 16 catalyzes the in vivo and in vitro genome-encapsidation reaction. 331
48991 310464 pfam05895 DUF859 Siphovirus protein of unknown function (DUF859). This family consists of several uncharacterized proteins from the Siphoviruses as well as one bacterial sequence. Some of the members of this family are described as putative minor structural proteins. 626
48992 399115 pfam05896 NQRA Na(+)-translocating NADH-quinone reductase subunit A (NQRA). This family consists of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration. 257
48993 399116 pfam05899 Cupin_3 Protein of unknown function (DUF861). This family consists of several proteins which seem to be specific to plants and bacteria. The function of this family is unknown. 74
48994 399117 pfam05901 Excalibur Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognized and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 36
48995 399118 pfam05902 4_1_CTD 4.1 protein C-terminal domain (CTD). At the C-terminus of all known 4.1 proteins is a sequence domain unique to these proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24-kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates. 106
48996 399119 pfam05903 Peptidase_C97 PPPDE putative peptidase domain. The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p). 151
48997 399120 pfam05904 DUF863 Plant protein of unknown function (DUF863). This family consists of a number of hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 939
48998 114618 pfam05906 DUF865 Herpesvirus-7 repeat of unknown function (DUF865). This family consists of a series of 12 repeats of 35 amino acids in length which are found exclusively in Herpesvirus-7. The function of this family is unknown. 35
48999 399121 pfam05907 DUF866 Eukaryotic protein of unknown function (DUF866). This family consists of a number of hypothetical eukaryotic proteins of unknown function with an average length of around 165 residues. 152
49000 399122 pfam05908 Gamma_PGA_hydro Poly-gamma-glutamate hydrolase. This family consists of a number of bacterial and phage proteins that function as gamma-PGA hydrolase enzymes. Structurally the protein in this family adopted an open alpha/beta mixed core structure with a seven-stranded parallel/anti-parallel beta-sheet. This structure shows similarity to mammalian carboxypeptidase A and related enzymes. 191
49001 399123 pfam05910 DUF868 Plant protein of unknown function (DUF868). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 266
49002 399124 pfam05911 FPP Filament-like plant protein, long coiled-coil. FPP is a family of long coiled-coil plant proteins that are filament-like. It interacts with the nuclear envelope-associated protein, MAF1, the WPP family pfam13943. 859
49003 399125 pfam05912 DUF870 Caenorhabditis elegans protein of unknown function (DUF870). This family consists of a number of hypothetical proteins which seem to be specific to Caenorhabditis elegans. The function of this family is unknown. 111
49004 399126 pfam05913 DUF871 Bacterial protein of unknown function (DUF871). This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown. 116
49005 399127 pfam05914 RIB43A RIB43A. This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialized set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterized in the unicellular biflagellate, Chlamydomonas reinhardtii although highly related sequences are present in several higher eukaryotes including humans. The function of this protein is unknown although the structure of RIB43A and its association with the specialized protofilament ribbons and with basal bodies is relevant to the proposed role of ribbons in forming and stabilizing doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologs could represent a structural requirement in centriole replication in dividing cells. 376
49006 399128 pfam05915 DUF872 Eukaryotic protein of unknown function (DUF872). This family consists of several uncharacterized eukaryotic proteins. The function of this family is unknown. 115
49007 399129 pfam05916 Sld5 GINS complex protein. The eukaryotic GINS complex is essential for the initiation and elongation phases of DNA replication. It consists of four paralogous protein subunits (Sld5, Psf1, Psf2 and Psf3), all of which are included in this family. The GINS complex is conserved from yeast to humans, and has been shown in human to bind directly to DNA primase. 105
49008 283549 pfam05917 DUF874 Helicobacter pylori protein of unknown function (DUF874). This family consists of several hypothetical proteins specific to Helicobacter pylori. The function of this family is unknown. 398
49009 399130 pfam05918 API5 Apoptosis inhibitory protein 5 (API5). This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterized by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti apoptosis gene located in human chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors. 534
49010 253459 pfam05919 Mitovir_RNA_pol Mitovirus RNA-dependent RNA polymerase. This family consists of several Mitovirus RNA-dependent RNA polymerase proteins. The family also contains fragment matches in the mitochondria of Arabidopsis thaliana. 495
49011 399131 pfam05920 Homeobox_KN Homeobox KN domain. This is a homeobox transcription factor KN domain conserved from fungi to human and plants. They were first identified as TALE homeobox genes in eukaryotes, (including KNOX and MEIS genes). They have been recently classified. 40
49012 399132 pfam05922 Inhibitor_I9 Peptidase inhibitor I9. This family includes the proteinase B inhibitor from Saccharomyces cerevisiae and the activation peptides from peptidases of the subtilisin family. The subtilisin propeptides are known to function as molecular chaperones, assisting in the folding of the mature peptidase, but have also been shown to act as 'temporary inhibitors'. 82
49013 399133 pfam05923 APC_r APC repeat. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). In the human protein many cancer-linked SNPs are found near the first three occurrences of the motif. These repeats bind beta-catenin. 24
49014 399134 pfam05924 SAMP SAMP Motif. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). This motif binds axin. 22
49015 283555 pfam05925 IpgD Enterobacterial virulence protein IpgD. This family consists of several enterobacterial IpgD like virulence factor proteins. In the Gram-negative pathogen Shigella flexneri, the virulence factor IpgD is translocated directly into eukaryotic cells and acts as a potent inositol 4-phosphatase that specifically dephosphorylates phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P(2)] into phosphatidylinositol 5-monophosphate [PtdIns(5)P] that then accumulates. Transformation of PtdIns(4,5)P(2) into PtdIns(5)P by IpgD is responsible for dramatic morphological changes of the host cell, leading to a decrease in membrane tether force associated with membrane blebbing and actin filament remodelling. 580
49016 399135 pfam05926 Phage_GPL Phage head completion protein (GPL). This family consists of several phage head completion protein (GPL) as well as related bacterial sequences. Members of this family allow the completion of filled heads by rendering newly packaged DNA in the heads resistant to DNase. The protein is thought to bind to DNA filled capsids. 139
49017 147854 pfam05927 Penaeidin Penaeidin. This family consists of several isoforms of the penaeidin protein which is specific to shrimps. Penaeidins, a unique family of antimicrobial peptides (AMPs) with both proline and cysteine-rich domains, were initially identified in the hemolymph of the Pacific white shrimp, Litopenaeus vannamei. 73
49018 114639 pfam05928 Zea_mays_MuDR Zea mays MURB-like protein (MuDR). This family consists of several Zea mays specific MURB-like proteins. The transposition of Mu elements underlying Mutator activity in maize requires a transcriptionally active MuDR element. Despite variation in MuDR copy number and RNA levels in Mutator lines, transposition events are consistently late in plant development, and Mu excision frequencies are similar. 207
49019 399136 pfam05929 Phage_GPO Phage capsid scaffolding protein (GPO) serine peptidase. This family consists of several bacteriophage capsid scaffolding proteins (GPO) and some related bacterial sequences. GPO is thought to function in both the assembly of proheads and the cleavage of GPN. The family is found to function as a serine peptidase, with a conserved Asp, His and Ser catalytic triad, as in subtilisin, and as represented in MEROPS:S73. The family includes capsid assembly scaffolding protein from Enterobacteria phage P2 which cleaves itself and then becomes the scaffold protein upon which the bacteriophage prohead is built - a mechanism quite common amongst phages. 272
49020 399137 pfam05930 Phage_AlpA Prophage CP4-57 regulatory protein (AlpA). This family consists of several short bacterial and phage proteins which are related to the E. coli protein AlpA. AlpA suppresses two phenotypes of a delta lon protease mutant, overproduction of capsular polysaccharide and sensitivity to UV light. Several of the sequences in this family are thought to be DNA-binding proteins. 51
49021 368676 pfam05931 AgrD Staphylococcal AgrD protein. This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr. 45
49022 399138 pfam05932 CesT Tir chaperone protein (CesT) family. This family consists of a number of bacterial sequences which are highly similar to the Tir chaperone protein in E. coli. In many Gram-negative bacteria, a key indicator of pathogenic potential is the possession of a specialized type III secretion system, which is utilized to deliver virulence effector proteins directly into the host cell cytosol. Many of the proteins secreted from such systems require small cytosolic chaperones to maintain the secreted substrates in a secretion-competent state. CesT serves a chaperone function for the enteropathogenic Escherichia coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment. This family also contains several DspF and related sequences from several plant pathogenic bacteria. The "disease-specific" (dsp) region next to the hrp gene cluster of Erwinia amylovora is required for pathogenicity but not for elicitation of the hypersensitive reaction. DspF and AvrF are small (16 kDa and 14 kDa) and acidic with predicted amphipathic alpha helices in their C termini; they resemble chaperones for virulence factors secreted by type III secretion systems of animal pathogens. 119
49023 283561 pfam05933 Fun_ATP-synt_8 Fungal ATP synthase protein 8 (A6L). This family consists of fungus specific ATP synthase protein 8 (EC:3.6.3.14). The family may be related to the ATP synthase protein 8 found in other eukaryotes pfam00895. 48
49024 310487 pfam05934 MCLC Mid-1-related chloride channel (MCLC). This family consists of several mid-1-related chloride channels. mid-1-related chloride channel (MCLC) proteins function as a chloride channel when incorporated in the planar lipid bilayer. 549
49025 399139 pfam05935 Arylsulfotrans Arylsulfotransferase (ASST). This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate. 368
49026 399140 pfam05936 T6SS_VasE Bacterial Type VI secretion, VC_A0110, EvfL, ImpJ, VasE. T6SS_VasE is a family of bacterial proteins that are essential for the type VI pathogenic secretion system, although the exact function of this particular component of the system is still not known. 427
49027 399141 pfam05937 EB1_binding EB-1 Binding Domain. This region, found at the C-terminus of the APC proteins, binds the microtubule-associating protein EB-1. At the C-terminus of the alignment is also a pfam00595 binding domain. A short motif in the middle of the region appears to be found in the APC2 proteins. 174
49028 399142 pfam05938 Self-incomp_S1 Plant self-incompatibility protein S1. This family consists of a series of plant proteins which are related to the Papaver rhoeas S1 self-incompatibility protein. Self incompatibility (SI) is the single most important outbreeding device found in angiosperms and is a mechanism that regulates the acceptance or rejection of pollen. S1 is known to exhibit specific pollen-inhibitory properties. 89
49029 399143 pfam05939 Phage_min_tail Phage minor tail protein. This family consists of a series of phage minor tail proteins and related sequences from several bacterial species. 107
49030 399144 pfam05940 NnrS NnrS protein. This family consists of several bacterial NnrS like proteins. NnrS is a putative heme-Cu protein (NnrS) and a member of the short-chain dehydrogenase family. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase. 367
49031 283569 pfam05941 Chordopox_A20R Chordopoxvirus A20R protein. This family consists of several Chordopoxvirus A20R proteins. The A20R protein is required for DNA replication, is associated with the processive form of the viral DNA polymerase, and directly interacts with the viral proteins encoded by the D4R, D5R, and H5R open reading frames. A20R may contribute to the assembly or stability of the multiprotein DNA replication complex. 335
49032 377572 pfam05942 PaREP1 Archaeal PaREP1/PaREP8 family. This family consists of several archaeal PaREP1 and PaREP8 proteins. The function of this family is unknown. 115
49033 399145 pfam05943 VipB Type VI secretion protein, EvpB/VC_A0108, tail sheath. EvpB is a family of Gram-negative probable type VI secretion system components of the tail sheath. They have been known as COG:COG3517. These sheath-components, of which there are many copies in the sheath, are also variously referred to as VipA/VipB and TssB/TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope and punches the target bacterial cell. 301
49034 399146 pfam05944 Phage_term_smal Phage small terminase subunit. This family consists of several phage small terminase subunit proteins as well as some related bacterial sequences. 129
49035 283573 pfam05946 TcpA Toxin-coregulated pilus subunit TcpA. This family consists of toxin-coregulated pilus subunit (TcpA) proteins from Vibrio cholerae and related sequences. The major virulence factors of toxigenic Vibrio cholerae are cholera toxin (CT), which is encoded by a lysogenic bacteriophage (CTXPhi), and toxin-coregulated pilus (TCP), an essential colonisation factor which is also the receptor for CTXPhi. The genes for the biosynthesis of TCP are part of a larger genetic element known as the TCP pathogenicity island. 130
49036 399147 pfam05947 T6SS_TssF Type VI secretion system, TssF. This is a family of Gram-negative bacterial proteins that form part of the type VI pathogenicity secretion system (T6SS), including TssF. TssF is homologous to phage tail proteins and is required for proper assembly of the Hcp tube (the T6SS inner tube) in bacteria. 606
49037 399148 pfam05949 DUF881 Bacterial protein of unknown function (DUF881). This family consists of a series of hypothetical bacterial proteins. One of the family members YlxW from Bacillus subtilis is thought to be involved in cell division and sporulation. 141
49038 283576 pfam05950 Orthopox_A36R Orthopoxvirus A36R protein. This family consists of several Orthopoxvirus A36R proteins. The A36R protein is predicted to be a type Ib membrane protein. 158
49039 399149 pfam05951 Peptidase_M15_2 Bacterial protein of unknown function (DUF882). This family consists of a series of hypothetical bacterial proteins of unknown function. 150
49040 368681 pfam05952 ComX Bacillus competence pheromone ComX. Natural genetic competence in Bacillus subtilis is controlled by quorum-sensing (QS). The ComP-ComA two-component system detects the signalling molecule ComX, and this signal is transduced by a conserved phosphotransfer mechanism. ComX is synthesized as an inactive precursor and is then cleaved and modified by ComQ before export to the extracellular environment. 55
49041 283579 pfam05953 Allatostatin Allatostatin. This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata. 11
49042 399150 pfam05954 Phage_GPD Phage late control gene D protein (GPD). This family includes a number of phage late control gene D proteins and related bacterial sequences. This family also includes Bacteriophage Mu P proteins and related sequences. 305
49043 368683 pfam05955 Herpes_gp2 Equine herpesvirus glycoprotein gp2. This family consists of a number of glycoprotein gp2 sequences from equine herpesviruses. 226
49044 399151 pfam05956 APC_basic APC basic domain. This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules. 337
49045 399152 pfam05957 DUF883 Bacterial protein of unknown function (DUF883). This family consists of several hypothetical bacterial proteins of unknown function. 53
49046 310503 pfam05958 tRNA_U5-meth_tr tRNA (Uracil-5-)-methyltransferase. This family consists of (Uracil-5-)-methyltransferases EC:2.1.1.35 from bacteria, archaea and eukaryotes. A 5-methyluridine (m(5)U) residue at position 54 is a conserved feature of bacterial and eukaryotic tRNAs. The methylation of U54 is catalyzed by the tRNA(m5U54)methyltransferase, which in Saccharomyces cerevisiae is encoded by the nonessential TRM2 gene. It is thought that tRNA modification enzymes might have a role in tRNA maturation not necessarily linked to their known catalytic activity. 357
49047 283584 pfam05959 DUF884 Nucleopolyhedrovirus protein of unknown function (DUF884). This family consists of several hypothetical Nucleopolyhedrovirus proteins of unknown function. 194
49048 399153 pfam05960 DUF885 Bacterial protein of unknown function (DUF885). This family consists of several hypothetical bacterial proteins, several of which are putative membrane proteins. 527
49049 283586 pfam05961 Chordopox_A13L Chordopoxvirus A13L protein. This family consists of A13L proteins from the Chordopoxviruses. A13L or p8 is one of the three most abundant membrane proteins of the intracellular mature Vaccinia virus. 69
49050 399154 pfam05962 HutD HutD. HutD from Pseudomonas fluorescens SBW25 is a component of the histidine uptake and utilisation operon. HutD is operonic with the well characterized repressor protein HutC. Genetic analysis using transcriptional fusions (lacZ) and deletion mutants shows that hutD is necessary to maintain fitness in environments replete with histidine. Evidence outlined by Zhang & Rainey (2007) suggests that HutD functions as a governor that sets an upper bound on the level of hut operon transcription. The mechanistic basis is unknown, but in silico molecular docking studies based on the crystal structure of PA5104 (HutD from Pseudomonas aeruginosa) show that urocanate (the first breakdown product of histidine) docks with the active site of HutD. 180
49051 283588 pfam05963 Cytomega_US3 Cytomegalovirus US3 protein. US3 of human cytomegalovirus is an endoplasmic reticulum resident transmembrane glycoprotein that binds to major histocompatibility complex class I molecules and prevents their departure. The endoplasmic reticulum retention signal of the US3 protein is contained in the luminal domain of the protein. 212
49052 399155 pfam05964 FYRN F/Y-rich N-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541. 51
49053 399156 pfam05965 FYRC F/Y rich C-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. 83
49054 283591 pfam05966 Chordopox_A33R Chordopoxvirus A33R protein. This family consists of several Chordopoxvirus A33R proteins. A33R plays a role in promoting Ab-resistant cell-to-cell spread of virus and interacts with A36R to incorporate the protein into the outer membrane of intracellular enveloped virions (IEV). 184
49055 399157 pfam05968 Bacillus_PapR Bacillus PapR protein. This family consists of the Bacillus species specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain specific. 45
49056 399158 pfam05969 PSII_Ycf12 Photosystem II complex subunit Ycf12. Ycf12 has been identified as a core subunit in the photosystem II (PSII) complex. PsbZ has been shown to be required for the association of PsbK and Ycf12 with PSII. 29
49057 399159 pfam05970 PIF1 PIF1-like helicase. This family includes homologs of the PIF1 helicase, which inhibits telomerase activity and is cell cycle regulated. This family includes a large number of largely uncharacterized plant proteins. This family includes a P-loop motif that is involved in nucleotide binding. 360
49058 399160 pfam05971 Methyltransf_10 Protein of unknown function (DUF890). This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The function of this family is unknown. 291
49059 283595 pfam05972 APC_15aa APC 15 residue motif. This motif, known as the 15 aa repeat, is found in the APC protein family. They are involved in binding beta-catenin along with the pfam05923 repeats. Many human cancer mutations map to the region around these motifs, and may be involved in disrupting their binding of beta-catenin. 15
49060 399161 pfam05973 Gp49 Phage derived protein Gp49-like (DUF891). This family consists of hypothetical bacterial proteins of unknown function as well as phage Gp49 proteins. 90
49061 399162 pfam05974 DUF892 Domain of unknown function (DUF892). This family consists of several hypothetical bacterial proteins of unknown function. 156
49062 399163 pfam05975 EcsB Bacterial ABC transporter protein EcsB. This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters. 383
49063 399164 pfam05977 MFS_3 Transmembrane secretion effector. This is a family of transport proteins. Members of this family include a protein responsible for the secretion of the ferric chelator, enterobactin, and a protein involved in antibiotic resistance. 523
49064 310513 pfam05978 UNC-93 Ion channel regulatory protein UNC-93. This family of proteins is a component of a multi-subunit protein complex which is involved in the coordination of muscle contraction. UNC-93 is most likely an ion channel regulatory protein. 157
49065 399165 pfam05979 DUF896 Bacterial protein of unknown function (DUF896). In B. subtilis, one small SOS response operon under the control of LexA, the yneA operon, is comprised of three genes: yneA, yneB, and ynzC. This family consists of several short, hypothetical bacterial proteins of unknown function. These proteins are mainly found in gram-positive firmicutes. Structures show that the N-terminus is composed of two alpha helices forming a helix-loop-helix motif. The structure of ynzC from B. subtilis forms a trimeric complex. Structural modelling suggests this domain may bind nucleic acids. This family is also known as UPF0291. 62
49066 368691 pfam05980 Toxin_7 Toxin 7. This family consists of several short spider neurotoxin proteins including many from the Funnel-web spider. 34
49067 399166 pfam05981 CreA CreA protein. This family consists of several bacterial CreA proteins, the function of which is unknown. 118
49068 399167 pfam05982 Sbt_1 Na+-dependent bicarbonate transporter superfamily. Family of bacterial proteins that are likely to be part of the Na(+)-dependent bicarbonate transporter (sbt) family. Members carry 10TMS in a 5+5 duplicated structure. The loop between helices 5 and 6 in Synechocystis PCC6803 is likely to be the location for regulatory mechanisms governing the activation of the transporter. 308
49069 399168 pfam05983 Med7 MED7 protein. This family consists of several eukaryotic proteins which are homologs of the yeast MED7 protein. Activation of gene transcription in metazoans is a multi-step process that is triggered by factors that recognize transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus. 180
49070 283605 pfam05984 Cytomega_UL20A Cytomegalovirus UL20A protein. This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein. 103
49071 399169 pfam05985 EutC Ethanolamine ammonia-lyase light chain (EutC). This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) EC:4.3.1.7 sequences. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. 233
49072 368694 pfam05986 ADAM_spacer1 ADAM-TS Spacer 1. This family represents the Spacer-1 region from the ADAM-TS family of metalloproteinases. 114
49073 399170 pfam05987 DUF898 Bacterial protein of unknown function (DUF898). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative membrane proteins. 336
49074 399171 pfam05988 DUF899 Bacterial protein of unknown function (DUF899). This family consists of several uncharacterized bacterial proteins of unknown function. 224
49075 283610 pfam05989 Chordopox_A35R Chordopoxvirus A35R protein. This family consists of several Chordopoxvirus sequences homologous to the Vaccinia virus A35R protein. The function of this family is unknown. 172
49076 399172 pfam05990 DUF900 Alpha/beta hydrolase of unknown function (DUF900). This family consists of several hypothetical proteins of unknown function mostly found in Rhizobium species. Members of this family have an alpha/beta hydrolase fold. 236
49077 399173 pfam05991 NYN_YacP YacP-like NYN domain. This family consists of bacterial proteins related to YacP. This family is uncharacterized functionally, but it has been suggested that these proteins are nucleases due to them containing a NYN domain. NYN (for N4BP1, YacP-like Nuclease) domains were discovered by Anantharaman and Aravind. Based on gene neighborhoods it was suggested that the bacterial YacP proteins interact with the Ribonuclease III and TrmH methylase in a processome complex that catalyzes the maturation of rRNA and tRNA. 166
49078 399174 pfam05992 SbmA_BacA SbmA/BacA-like family. The Rhizobium meliloti bacA gene encodes a function that is essential for bacterial differentiation into bacteroids within plant cells in the symbiosis between R. meliloti and alfalfa. An Escherichia coli homolog of BacA, SbmA, is implicated in the uptake of microcins and bleomycin. This family is likely to be a subfamily of the ABC transporter family. 315
49079 283614 pfam05993 Reovirus_M2 Reovirus major virion structural protein Mu-1/Mu-1C (M2). This family consists of several Reovirus major virion structural protein Mu-1/Mu-1C (M2) sequences. This family is thought to play a role in host cell membrane penetration. 647
49080 399175 pfam05994 FragX_IP Cytoplasmic Fragile-X interacting family. CYFIP1/2 (Cytoplasmic fragile X mental retardation interacting protein) like proteins form a highly conserved protein family. The function of CYFIPs is unclear, but CYFIP interaction with fragile X mental retardation interacting protein (FMRP) involves the domain of FMRP which also mediates homo- and heteromerization. 842
49081 399176 pfam05995 CDO_I Cysteine dioxygenase type I. Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production. 168
49082 399177 pfam05996 Fe_bilin_red Ferredoxin-dependent bilin reductase. This family consists of several different but closely related proteins which include phycocyanobilin:ferredoxin oxidoreductase EC:1.3.7.5 (PcyA), 15,16-dihydrobiliverdin:ferredoxin oxidoreductase EC:1.3.7.2 (PebA) and phycoerythrobilin:ferredoxin oxidoreductase EC:1.3.7.3 (PebB). Phytobilins are linear tetrapyrrole precursors of the light-harvesting prosthetic groups of the phytochrome photoreceptors of plants and the phycobiliprotein photosynthetic antennae of cyanobacteria, red algae, and cryptomonads. It is known that phytobilins are synthesized from heme via the intermediary of biliverdin IX alpha (BV), which is reduced subsequently by ferredoxin-dependent bilin reductases with different double-bond specificities. 228
49083 399178 pfam05997 Nop52 Nucleolar protein,Nop52. Nop52 is believed to be involved in the generation of 28S rRNA. 212
49084 283619 pfam05999 Herpes_U5 Herpesvirus U5-like family. This family of Herpesvirus includes U4, U5 and UL27. 488
49085 399179 pfam06001 DUF902 Domain of Unknown Function (DUF902). This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. This short domain is found to the C-terminus of bromodomains. The 40 residue domain contains four conserved cysteines suggesting that it may be stabilized by a zinc ion. In CREB this domain is to the N-terminus of another zinc binding PHD domain. 40
49086 310528 pfam06002 CST-I Alpha-2,3-sialyltransferase (CST-I). This family consists of several alpha-2,3-sialyltransferase (CST-I) proteins largely found in Campylobacter jejuni. 293
49087 399180 pfam06003 SMN Survival motor neuron protein (SMN). This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs. 263
49088 399181 pfam06004 DUF903 Bacterial protein of unknown function (DUF903). This family consists of several small bacterial proteins several of which are classified as putative lipoproteins. The function of this family is unknown. 48
49089 399182 pfam06005 ZapB Cell division protein ZapB. ZapB is a non-essential, abundant cell division factor that is required for proper Z-ring formation. 71
49090 368702 pfam06006 DUF905 Bacterial protein of unknown function (DUF905). This family consists of several short hypothetical Enterobacteria proteins of unknown function. Structural analysis of the surface features of the protein YvyC has revealed a single cluster of highly conserved residues on the surface. Additionally, these residues fall into two groups which lie within the two largest of the three cavities identified over the surface. The conclusion from this is that these two cavities, with Leu 58, Glu 75, Ile 82, Glu 83 and Pro 86 conserved, are likely to be important for the molecular function and reflect the cavities found on the surface of the FlaG proteins in pfam03646. 70
49091 399183 pfam06007 PhnJ Phosphonate metabolism protein PhnJ. This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown. 274
49092 310534 pfam06008 Laminin_I Laminin Domain I. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure. 258
49093 368703 pfam06009 Laminin_II Laminin Domain II. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure. 138
49094 399184 pfam06011 TRP Transient receptor potential (TRP) ion channel. This family of proteins are transient receptor potential (TRP) ion channels. They are essential for cellular viability and are involved in cell growth and cell wall synthesis. The genes for these proteins are homologous to polycystic kidney disease related ion channel genes. 424
49095 399185 pfam06012 DUF908 Domain of Unknown Function (DUF908). 350
49096 399186 pfam06013 WXG100 Proteins of 100 residues with WXG. ESAT-6 is a small protein that appears to be of fundamental importance in virulence and protective immunity in Mycobacterium tuberculosis. Homologs have been detected in other Gram-positive bacterial species. It may represent a novel secretion system potentially driven by the pfam01580 domains in the YukA-like proteins. 85
49097 399187 pfam06014 DUF910 Bacterial protein of unknown function (DUF910). This family consists of several short bacterial proteins of unknown function. 61
49098 283633 pfam06015 Chordopox_A30L Chordopoxvirus A30L protein. This family consists of several short Chordopoxvirus proteins which are homologous to the A30L protein of Vaccinia virus. The vaccinia virus A30L protein is required for the association of electron-dense, granular, proteinaceous material with the concave surfaces of crescent membranes, an early step in viral morphogenesis. A30L is known to interact with the G7L protein and it has been shown that the stability of each is dependent on its association with the other. 71
49099 283634 pfam06016 Reovirus_L2 Reovirus core-spike protein lambda-2 (L2). This family consists of several Reovirus core-spike protein lambda-2 (L2) sequences. The reovirus L2 genome segment encodes the core spike protein lambda-2, which mediates enzymatic reactions in 5' capping of the viral plus-strand transcripts. 1297
49100 399188 pfam06017 Myosin_TH1 Unconventional myosin tail, actin- and lipid-binding. Unconventional myosins, i.e. those that are not found in muscle, have the common, classical-type head domain, sometimes a neck with the IQ calmodulin-binding motifs, and then non-standard tails. These tails determine the subcellular localization of the unconventional myosins and also help determine their individual functions. The family carries several different unconventional myosins, e.g. Myo1f is expressed mainly in immune cells as well as in the inner ear where it can be associated with deafness, Myo1d has a lipid-binding module in its tail and is implicated in endosome vesicle recycling in epithelial cells. Myo1a, b, c and g from various eukaryotes are also found in this family. 196
49101 399189 pfam06018 CodY CodY GAF-like domain. This domain is a GAF-like domain found at the N-terminus of several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. Presumably this domain is involved in GTP binding. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk. 177
49102 147919 pfam06019 Phage_30_8 Phage GP30.8 protein. This family consists of several GP30.8 proteins from the T4-like phages. The function of this family is unknown. 124
49103 310540 pfam06020 Roughex Drosophila roughex protein. This family consists of several roughex (RUX) proteins specific to Drosophila species. Roughex can influence the intracellular distribution of cyclin A and is therefore defined as a distinct and specialized cell cycle inhibitor for cyclin A-dependent kinase activity. Rux is thought to regulate the metaphase to anaphase transition during development. 379
49104 399190 pfam06021 Gly_acyl_tr_N Aralkyl acyl-CoA:amino acid N-acyltransferase. This family consists of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13. 196
49105 399191 pfam06022 Cir_Bir_Yir Plasmodium variant antigen protein Cir/Yir/Bir. This family consists of several Cir, Yir and Bir proteins from the Plasmodium species P.chabaudi, P.yoelii and P.berghei. 253
49106 283640 pfam06023 Csa1 CRISPR-associated exonuclease Csa1. CRISPR (clustered regularly interspaced short palindromic repeats) elements and cas (CRISPR-associated) genes are widespread in Bacteria and Archaea. The CRISPR/Cas system operates as a defense mechanism against mobile genetic elements (i.e., viruses or plasmids). Csa1 is part of the archaeal subtype I-A system. Csa1 has not yet been enzymatically characterized. 292
49107 368709 pfam06024 Orf78 Orf78 (ac78). Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV), AC78 or Orf78. AC78 is a late gene in the viral life cycle and encodes an envelope structural protein that plays an essential role in embedding the occlusion-derived virus (ODV) in the occlusion body. Although AC78 is not essential for budding virus formation or nucleocapsid assembly and ODV formation, numbers are significantly reduced if the gene is knocked out. 101
49108 399192 pfam06025 DUF913 Domain of Unknown Function (DUF913). Members of this family are found in various ubiquitin protein ligases. 368
49109 399193 pfam06026 Rib_5-P_isom_A Ribose 5-phosphate isomerase A (phosphoriboisomerase A). This family consists of several ribose 5-phosphate isomerase A or phosphoriboisomerase A (EC:5.3.1.6) from bacteria, eukaryotes and archaea. 169
49110 283644 pfam06027 SLC35F Solute carrier family 35. This is a family of putative solute carrier proteins from eukaryotes. 299
49111 283645 pfam06028 DUF915 Alpha/beta hydrolase of unknown function (DUF915). This family consists of several bacterial proteins of unknown function. Members of this family have an alpha/beta hydrolase fold. 253
49112 399194 pfam06029 AlkA_N AlkA N-terminal domain. 118
49113 399195 pfam06030 DUF916 Bacterial protein of unknown function (DUF916). This family consists of several hypothetical bacterial proteins of unknown function. 120
49114 399196 pfam06031 SERTA SERTA motif. This family consists of a novel motif designated as SERTA (for SEI-1, RBT1, and TARA), corresponding to the largest conserved region among TRIP-Br proteins. The function of this motif is uncertain, but the CDK4-interacting segment of p34SEI-1 (amino acid residues 44-161) includes most of the SERTA motif. 36
49115 399197 pfam06032 DUF917 Protein of unknown function (DUF917). This family consists of hypothetical bacterial and archaeal proteins of unknown function. 350
49116 283650 pfam06033 DUF918 Nucleopolyhedrovirus protein of unknown function (DUF918). This family consists of several Nucleopolyhedrovirus proteins with no known function. 152
49117 114740 pfam06034 DUF919 Nucleopolyhedrovirus protein of unknown function (DUF919). This family consists of several short Nucleopolyhedrovirus proteins of unknown function. 62
49118 399198 pfam06035 Peptidase_C93 Bacterial transglutaminase-like cysteine proteinase BTLCP. Members of this family are predicted to be bacterial transglutaminase-like cysteine proteinases. They contain a conserved Cys-His-Asp catalytic triad. Their structure is predicted to be similar to that of Salmonella typhimurium N-hydroxyarylamine O-acetyltransferase, in pfam00797, however they lack the sub-domain which is important for arylamine recognition. 161
49119 399199 pfam06037 DUF922 Bacterial protein of unknown function (DUF922). This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. 159
49120 399200 pfam06039 Mqo Malate:quinone oxidoreductase (Mqo). This family consists of several bacterial Malate:quinone oxidoreductase (Mqo) proteins (EC:1.1.99.16). Mqo takes part in the citric acid cycle. It oxidizes L-malate to oxaloacetate and donates electrons to ubiquinone-1 and other artificial acceptors or, via the electron transfer chain, to oxygen. NAD is not an acceptor and the natural direct acceptor for the enzyme is most likely a quinone. The enzyme is therefore called malate:quinone oxidoreductase, abbreviated to Mqo. Mqo is a peripheral membrane protein and can be released from the membrane by addition of chelators. 488
49121 253527 pfam06040 Adeno_E3 Adenovirus E3 protein. This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells suggesting that the protein may function in virus-host interactions. 126
49122 399201 pfam06041 DUF924 Bacterial protein of unknown function (DUF924). This family consists of several hypothetical bacterial proteins of unknown function. Structurally, this family resembles TPR-like repeats. 185
49123 399202 pfam06042 NTP_transf_6 Nucleotidyltransferase. This family consists of several hypothetical bacterial proteins of unknown function. This family was recently identified as belonging to the nucleotidyltransferase superfamily. 157
49124 283656 pfam06043 Reo_P9 Reovirus P9-like family. 334
49125 399203 pfam06044 DpnI Dam-replacing family. Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. 182
49126 283658 pfam06045 Rhamnogal_lyase Rhamnogalacturonate lyase family. Rhamnogalacturonate lyase (EC:4.2.2.-) degrades the rhamnogalacturonan I (RG-I) backbone of pectin. This family contains mainly members from plants, but also contains the plant pathogen Erwinia chrysanthemi. 211
49127 399204 pfam06046 Sec6 Exocyst complex component Sec6. Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localize to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner. 566
49128 399205 pfam06047 SynMuv_product Ras-induced vulval development antagonist. This family is from synthetic multi-vulval genes which encode chromatin-associated proteins involved in transcriptional repression. This protein has a role in antagonising Ras-induced vulval development. 102
49129 399206 pfam06048 DUF927 Domain of unknown function (DUF927). Family of bacterial proteins of unknown function. The C-terminal half of this family contains a P-loop motif. The N-terminal domain appears to have a unique fold, which contains three helices and two strands. Structural analyses show that helicases containing this domain form a hexameric ring with a positively charged central pore threading a single DNA strand through, suggestive of a replicative function for this helicase. 286
49130 399207 pfam06049 LSPR Coagulation Factor V LSPD Repeat. These repeats are found in coagulation factor V (five). The name LSPD derives from the conserved residues in the middle of the repeat. They occur in the B domain, which is cleaved prior to activation of the protein. It has been suggested that domain B brings domains A and C together for activation. 9
49131 399208 pfam06050 HGD-D 2-hydroxyglutaryl-CoA dehydratase, D-component. Degradation of glutamate via the hydroxyglutarate pathway involves the syn-elimination of water from 2-hydroxyglutaryl-CoA. This anaerobic process is catalyzed by 2-hydroxyglutaryl-CoA dehydratase, an enzyme with two components (A and D) that reversibly associate during reaction cycles. This component contains one non-reducible [4Fe-4S]2+ cluster and a reduced riboflavin 5'-monophosphate. 339
49132 399209 pfam06051 DUF928 Domain of Unknown Function (DUF928). Family of uncharacterized bacterial proteins. 186
49133 399210 pfam06052 3-HAO 3-hydroxyanthranilic acid dioxygenase. In eukaryotes 3-hydroxyanthranilic acid dioxygenase (EC:1.13.11.6) is part of the kynurenine pathway for the degradation of tryptophan and the biosynthesis of nicotinic acid. The prokaryotic homolog is involved in the 2-nitrobenzoate degradation pathway. 151
49134 368722 pfam06053 DUF929 Domain of unknown function (DUF929). Family of proteins from the archaeon Sulfolobus, with undetermined function. 248
49135 283666 pfam06054 CoiA Competence protein CoiA-like family. Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterized protein. 377
49136 399211 pfam06055 ExoD Exopolysaccharide synthesis, ExoD. Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in the exoD result in altered exopolysaccharide production and defects in nodule invasion. 174
49137 310561 pfam06056 Terminase_5 Putative ATPase subunit of terminase (gpP-like). This family of proteins is annotated as ATPase subunits of phage terminase. Terminases are viral proteins that are involved in packaging viral DNA into the capsid. 58
49138 368724 pfam06057 VirJ Bacterial virulence protein (VirJ). This family consists of several bacterial VirJ virulence proteins. VirJ is thought to be involved in the type IV secretion system. It is thought that the substrate proteins localized to the periplasm may associate with the pilus in a manner that is mediated by VirJ, and suggest a two-step process for type IV secretion in Agrobacterium. 191
49139 399212 pfam06058 DCP1 Dcp1-like decapping family. An essential step in mRNA turnover is decapping. In yeast, two proteins have been identified that are essential for decapping, Dcp1 (this family) and Dcp2 (pfam05026). The precise role of these proteins in the decapping reaction has not been established. Evidence suggests that Dcp1 may enhance the function of Dcp2. 112
49140 399213 pfam06059 DUF930 Domain of Unknown Function (DUF930). Family of bacterial proteins with undetermined function. All bacteria in this family are from the Rhizobiales order. 99
49141 368727 pfam06060 Mesothelin Pre-pro-megakaryocyte potentiating factor precursor (Mesothelin). This family consists of several mammalian pre-pro-megakaryocyte potentiating factor precursor (MPF) or mesothelin proteins. Mesothelin is a glycosylphosphatidylinositol-linked glycoprotein highly expressed in mesothelial cells, mesotheliomas, and ovarian cancer, but the biological function of the protein is not known. 624
49142 310566 pfam06061 Baculo_ME53 Baculoviridae ME53. ME53 is one of the major early-transcribed genes. The ME53 protein is reported to contain a putative zinc finger motif. 339
49143 399214 pfam06062 UPF0231 Uncharacterized protein family (UPF0231). Family of uncharacterized Proteobacteria proteins. 121
49144 399215 pfam06064 Gam Host-nuclease inhibitor protein Gam. The Gam protein inhibits RecBCD nuclease and is found in both bacteria and bacteriophage. 98
49145 368729 pfam06066 SepZ SepZ. SepZ is a component of the type III secretion system used in bacteria. SepZ is a gene within the enterocyte effacement locus. SepZ mutants exhibit reduced invasion efficiency and lack of tyrosine phosphorylation of Hp90. 99
49146 399216 pfam06067 DUF932 Domain of unknown function (DUF932). Family of prokaryotic proteins with unknown function. Contains a number of highly conserved polar residues that could suggest an enzymatic activity. 227
49147 399217 pfam06068 TIP49 TIP49 C-terminus. This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the pfam00004 domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. Although the function of this gene family has not been elucidated, they are supposed to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases. 347
49148 310570 pfam06069 PerC PerC transcriptional activator. PerC is a transcriptional activator of EaeA/BfpA expression in enteropathogenic bacteria. 90
49149 283680 pfam06070 Herpes_UL32 Herpesvirus large structural phosphoprotein UL32. The large phosphorylated protein (UL32-like) of herpes viruses is the polypeptide most frequently reactive in immuno-blotting analyses with antisera when compared with other viral proteins. 1037
49150 399218 pfam06071 YchF-GTPase_C Protein of unknown function (DUF933). This domain is found at the C-terminus of the YchF GTP-binding protein and is possibly related to the ubiquitin-like and MoaD/ThiS superfamilies. 82
49151 283682 pfam06072 Herpes_US9 Alphaherpesvirus tegument protein US9. This family consists of several US9 and related proteins from the Alphaherpesviruses. The function of the US9 protein is unknown although in Bovine herpesvirus 5 Us9 is essential for the anterograde spread of the virus from the olfactory mucosa to the bulb. 61
49152 399219 pfam06073 DUF934 Bacterial protein of unknown function (DUF934). This family consists of several bacterial proteins of unknown function. One of the members of this family BMEI1764 is thought to be an oxidoreductase. 103
49153 399220 pfam06074 DUF935 Protein of unknown function (DUF935). This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein. 516
49154 399221 pfam06075 DUF936 Plant protein of unknown function (DUF936). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 680
49155 114778 pfam06076 Orthopox_F14 Orthopoxvirus F14 protein. This family consists of several short Orthopoxvirus F14 proteins. The function of this protein is unknown. 73
49156 399222 pfam06078 DUF937 Bacterial protein of unknown function (DUF937). This family consists of several hypothetical bacterial proteins of unknown function. 107
49157 399223 pfam06079 Apyrase Apyrase. This family consists of several eukaryotic apyrase proteins (EC:3.6.1.5). The salivary apyrases of blood-feeding arthropods are nucleotide hydrolysing enzymes implicated in the inhibition of host platelet aggregation through the hydrolysis of extracellular adenosine diphosphate. 289
49158 253548 pfam06080 DUF938 Protein of unknown function (DUF938). This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown. 201
49159 399224 pfam06081 ArAE_1 Aromatic acid exporter family member 1. This family consists of bacterial proteins with three transmembrane regions that are purported to be aromatic acid exporters. 141
49160 399225 pfam06082 YjbH Exopolysaccharide biosynthesis protein YjbH. YjbH is a family of Gram-negative beta-barrel outer-membrane lipoproteins that act as putative porins. YjbH is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YjbH and F are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production. 662
49161 399226 pfam06083 IL17 Interleukin-17. IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. 80
49162 399227 pfam06084 Cytomega_TRL10 Cytomegalovirus TRL10 protein. This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulfide-linked complex. 149
49163 368737 pfam06085 Rz1 Lipoprotein Rz1 precursor. This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces. This family Rz1 and the Rz protein Rz (pfam03245) represent a unique example of two genes located in different reading frames in the same nucleotide sequence, which encode different proteins that are both required in the same physiological pathway. 41
49164 399228 pfam06086 Pox_A30L_A26L Orthopoxvirus A26L/A30L protein. This family consists of several Orthopoxvirus A26L and A30L proteins. The Vaccinia A30L gene is regulated by a late promoter and encodes a protein of approximately 9 kDa. It is thought that the A30L protein is needed for vaccinia virus morphogenesis, specifically the association of the dense viroplasm with viral membranes. 219
49165 399229 pfam06087 Tyr-DNA_phospho Tyrosyl-DNA phosphodiesterase. Covalent intermediates between topoisomerase I and DNA can become dead-end complexes that lead to cell death. Tyrosyl-DNA phosphodiesterase can hydrolyze the bond between topoisomerase I and DNA. 433
49166 368739 pfam06088 TLP-20 Nucleopolyhedrovirus telokin-like protein-20 (TLP20). This family consists of several Nucleopolyhedrovirus telokin-like protein-20 (TLP20) sequences. The function of this family is unknown but TLP20 is known to share some antigenic similarities to the smooth muscle protein telokin although the amino acid sequence shows no homologies to telokin. 164
49167 399230 pfam06089 Asparaginase_II L-asparaginase II. This family consists of several bacterial L-asparaginase II proteins. L-asparaginase (EC:3.5.1.1) catalyzes the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source. 320
49168 399231 pfam06090 Ins_P5_2-kin Inositol-pentakisphosphate 2-kinase. This is a family of inositol-pentakisphosphate 2-kinases (EC 2.7.1.158) (also known as inositol 1,3,4,5,6-pentakisphosphate 2-kinase, Ins(1,3,4,5,6)P5 2-kinase and InsP5 2-kinase). This enzyme phosphorylates Ins(1,3,4,5,6)P5 to form Ins(1,2,3,4,5,6)P6 (also known as InsP6 or phytate). InsP6 is involved in many processes such as mRNA export, nonhomologous end-joining, endocytosis and ion channel regulation. 376
49169 399232 pfam06092 DUF943 Enterobacterial putative membrane protein (DUF943). This family consists of several hypothetical putative membrane proteins from Escherichia coli, Yersinia pestis and Salmonella typhi. 151
49170 399233 pfam06093 Spt4 Spt4/RpoE2 zinc finger. This family consists of several eukaryotic transcription elongation Spt4 proteins as well as archaebacterial RpoE2. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. RpoE2 is one of 13 subunits in the archaeal RNA polymerase. These proteins contain a C4-type zinc finger, and the structure has been solved. The structure reveals that Spt4-Spt5 binding is governed by an acid-dipole interaction between Spt5 and Spt4, and the complex binds to and travels along the elongating RNA polymerase. The Spt4-Spt5 complex is likely to be an ancient, core component of the transcription elongation machinery. 77
49171 399234 pfam06094 GGACT Gamma-glutamyl cyclotransferase, AIG2-like. GGACT, gamma-glutamylamine cyclotransferase, is a ubiquitous enzyme found in bacteria, plants, and metazoans from Dictyostelium through to humans. It converts gamma-glutamylamines to free amines and 5-oxoproline. 114
49172 368743 pfam06096 Baculo_8kDa Baculoviridae 8.2 KDa protein. Family of proteins from various Baculoviruses with undetermined function. 65
49173 399235 pfam06097 DUF945 Bacterial protein of unknown function (DUF945). This family consists of several hypothetical bacterial proteins of unknown function. 458
49174 399236 pfam06098 Radial_spoke_3 Radial spoke protein 3. This family consists of several radial spoke protein 3 (RSP3) sequences. Eukaryotic cilia and flagella present in diverse types of cells perform motile, sensory, and developmental functions in organisms from protists to humans. They are centred by precisely organized, microtubule-based structures, the axonemes. The axoneme consists of two central singlet microtubules, called the central pair, and nine outer doublet microtubules. These structures are well-conserved during evolution. The outer doublet microtubules, each composed of A and B sub-fibers, are connected to each other by nexin links, while the central pair is held at the centre of the axoneme by radial spokes. The radial spokes are T-shaped structures extending from the A-tubule of each outer doublet microtubule to the centre of the axoneme. Radial spoke protein 3 (RSP3), is present at the proximal end of the spoke stalk and helps in anchoring the radial spoke to the outer doublet. It is thought that radial spokes regulate the activity of inner arm dynein through protein phosphorylation and dephosphorylation. 286
49175 399237 pfam06099 Phenol_hyd_sub Phenol hydroxylase subunit. This family consists of several bacterial phenol hydroxylase subunit proteins which are part of a multicomponent phenol hydroxylase. Some bacteria can utilize phenol or some of its methylated derivatives as their sole source of carbon and energy. The first step in this process is the conversion of phenol into catechol. Catechol is then further metabolized via the meta-cleavage pathway into TCA cycle intermediates. 56
49176 399238 pfam06100 MCRA MCRA family. The MCRA (myosin-cross-reactive antigen) family of proteins were thought to have structural features in common with the beta chain of the class II antigens, as well as myosin, and may play an important role in the pathogenesis. More recent work shows that these proteins act as hydratase enzymes that convert linoleic acid and oleic acid to their respective 10-hydroxy derivatives. It has been suggested that MCRA proteins catalyze the first step in conjugated linoleic acid production. Proteins in this family act in an FAD dependent manner. The structure of a fatty acid double-bond hydratase from Lactobacillus acidophilus has been recently solved showing four structural domains. 492
49177 399239 pfam06101 Vps62 Vacuolar protein sorting-associated protein 62. Vps62 is a vacuolar protein sorting (VPS) protein required for cytoplasm to vacuole targeting of proteins. 539
49178 399240 pfam06102 RRP36 rRNA biogenesis protein RRP36. RRP36 is involved in the early processing steps of the pre-rRNA. 158
49179 399241 pfam06103 DUF948 Bacterial protein of unknown function (DUF948). This family consists of bacterial sequences several of which are thought to be general stress proteins. 83
49180 399242 pfam06105 Aph-1 Aph-1 protein. This family consists of several eukaryotic Aph-1 proteins. Gamma-secretase catalyzes the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signaling paradigm and as a key step in the pathogenesis of Alzheimer's disease. It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity. 224
49181 399243 pfam06106 SAUGI S. aureus uracil DNA glycosylase inhibitor. Uracil-DNA glycosylase inhibitors are DNA mimic proteins that prevent the DNA binding sites of UDGs (Uracil DNA glycosylase) from interacting with their DNA substrate. SSP0047 (SAUGI; for Staphylococcus aureus uracil-DNA glycosylase inhibitor) acts as a uracil-DNA glycosylase inhibitor that breaks the uracil-removing activity of S. aureus uracil-DNA glycosylase (SAUDG) pfam03167. The SAUGI/SAUDG complex has been determined, and shows that SAUGI binds to the SAUDG DNA binding region via several strong interactions, by using a hydrophobic pocket to hold SAUDG's protruding residue (i.e. SAUDG Leu184, E. coli UDG Leu191 and B. subtilis UDG Phe191). By binding to SAUDG in this way, SAUGI thus prevents SAUDG from binding to its DNA substrate and performing DNA repair activity. 112
49182 399244 pfam06107 DUF951 Bacterial protein of unknown function (DUF951). This family consists of several short hypothetical bacterial proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids. 55
49183 399245 pfam06108 DUF952 Protein of unknown function (DUF952). This family consists of several hypothetical bacterial and plant proteins of unknown function. 84
49184 399246 pfam06109 HlyE Haemolysin E (HlyE). This family consists of several enterobacterial haemolysin (HlyE) proteins. Hemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterized pore-forming E. coli hemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid borne ehxA gene of E. coli O157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a hemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. coli. Anaerobic expression is controlled by the transcription factor, FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host. 333
49185 399247 pfam06110 DUF953 Eukaryotic protein of unknown function (DUF953). This family consists of several hypothetical eukaryotic proteins of unknown function. 119
49186 399248 pfam06112 Herpes_capsid Gammaherpesvirus capsid protein. This family consists of several Gammaherpesvirus capsid proteins. The exact function of this family is unknown. 169
49187 399249 pfam06113 BRE Brain and reproductive organ-expressed protein (BRE). This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. BRE is a putative stress-modulating gene, found able to down-regulate TNF-alpha-induced-NF-kappaB activation upon over expression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities. 320
49188 399250 pfam06114 Peptidase_M78 IrrE N-terminal-like domain. This entry includes the catalytic domain of the protein ImmA, which is a metallopeptidase containing an HEXXH zinc-binding motif from peptidase family M78. ImmA is encoded on a conjugative transposon. Conjugating bacteria are able to transfer conjugative transposons that can, for example, confer resistance to antibiotics. The transposon is integrated into the chromosome, but during conjugation excises itself and then moves to the recipient bacterium and re-integrates into its chromosome. Typically a conjugative transposon encodes only the proteins required for this activity and the proteins that regulate it. During exponential growth, the ICEBs1 transposon of Bacillus subtilis is inactivated by the immunity repressor protein ImmR, which is encoded by the transposon and represses the genes for excision and transfer. Cleavage of ImmR relaxes repression and allows transfer of the transposon. ImmA has been shown to be essential for the cleavage of ImmR. This domain is also found in metalloprotease IrrE, a central regulator of DNA damage repair in Deinococcaceae, HTH-type transcriptional regulators RamB and PrpC. 122
49189 399251 pfam06115 DUF956 Domain of unknown function (DUF956). Family of bacterial sequences with undetermined function. 117
49190 283718 pfam06116 RinB Transcriptional activator RinB. This family consists of several Staphylococcus aureus bacteriophage RinB proteins and related sequences from their host. The int gene of staphylococcal bacteriophage phi 11 is the only viral gene responsible for the integrative recombination of phi 11. rinA and rinB, are both required to activate expression of the int gene. 51
49191 399253 pfam06119 NIDO Nidogen-like. This is a nidogen-like (NIDO) domain, an extracellular domain found in nidogen and hypothetical proteins of unknown function. 90
49192 399254 pfam06120 Phage_HK97_TLTM Tail length tape measure protein. This family consists of the tail length tape measure protein from bacteriophage HK97 and related sequences from Escherichia coli O157:H7. 288
49193 399255 pfam06121 DUF959 Domain of Unknown Function (DUF959). This N-terminal domain is not expressed in the 'Short' isoform of Collagen A. 192
49194 399256 pfam06122 TraH Conjugative relaxosome accessory transposon protein. The TraH protein is thought to be a relaxosome accessory component, also necessary for transfer but not for H-pilus synthesis within the conjugative transposon. 356
49195 399257 pfam06123 CreD Inner membrane protein CreD. This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is shown to be in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA. 428
49196 399258 pfam06124 DUF960 Staphylococcal protein of unknown function (DUF960). This family consists of several hypothetical proteins from several species of Staphylococcus. The function of this family is unknown. 94
49197 399259 pfam06125 DUF961 Bacterial protein of unknown function (DUF961). This family consists of several hypothetical bacterial proteins of unknown function. 96
49198 147993 pfam06126 Herpes_LAMP2 Herpesvirus Latent membrane protein 2. Family of Kaposi's sarcoma-associated herpesvirus (HHV8) latent membrane protein. 510
49199 377609 pfam06127 DUF962 Protein of unknown function (DUF962). This family consists of several eukaryotic and prokaryotic proteins of unknown function. The yeast protein YGL010W has been found to be non-essential for cell growth. 95
49200 283728 pfam06128 Shigella_OspC Shigella flexneri OspC protein. This family consists of the Shigella flexneri specific protein OspC. The function of this family is unknown but it is thought that Osp proteins may be involved in post invasion events related to virulence. Since bacterial pathogens adapt to multiple environments during the course of infecting a host, it has been proposed that Shigella evolved a mechanism to take advantage of a unique intracellular cue, which is mediated through MxiE, to express proteins when the organism reaches the eukaryotic cytosol. 292
49201 283729 pfam06129 Chordopox_G3 Chordopoxvirus G3 protein. This family consists of several Chordopoxvirus specific G3 proteins. The function of this family is unknown. 108
49202 399260 pfam06130 PTAC Phosphate propanoyltransferase. This family includes phosphotransacylases (PTACs) required for the degradation of 1,2-propanediol (1,2-PD). 67
49203 283731 pfam06131 DUF963 Schizosaccharomyces pombe repeat of unknown function (DUF963). This family consists of a series of repeated sequences from one hypothetical protein found in Schizosaccharomyces pombe. The function of this family is unknown. 36
49204 399261 pfam06133 Com_YlbF Control of competence regulator ComK, YlbF/YmcA. YlbF is a family of short Gram-positive and archaeal proteins that includes both YlbF and YmcA which may interact synergistically. The family is necessary for correct biofilm formation, as null mutants of ymcA and ylbF fail to form pellicles at air-liquid interfaces and grow on solid media as smooth, undifferentiated colonies. During development, YmcA, YlbF and YaaT, family PSPI, pfam04468, interact directly with one another forming a stable ternary complex, in vitro. All three proteins are required for competence, sporulation and the formation of biofilms. The YmcA-YlbF-YaaT complex affects the phosphotransfer between Spo0F and Spo0B, thus accelerating the production of Spo0A~P. The three processes of biofilm formation, mature spore formation and competence all require the active, phosphorylated form of Spo0A, Spo0A~P. 104
49205 399262 pfam06134 RhaA L-rhamnose isomerase (RhaA). This family consists of several bacterial L-rhamnose isomerase proteins (EC:5.3.1.14). 417
49206 399263 pfam06135 DUF965 Bacterial protein of unknown function (DUF965). This family consists of several hypothetical bacterial proteins. The function of the family is unknown. 77
49207 399264 pfam06136 DUF966 Domain of unknown function (DUF966). Family of plant proteins with unknown function. 366
49208 283736 pfam06138 Chordopox_E11 Chordopoxvirus E11 protein. This family consists of several Chordopoxvirus E11 proteins. The E11 gene of vaccinia virus encodes a 15-kDa polypeptide. Mutations in the E11 gene make the virus temperature-sensitive due to either the fact that virus infectivity requires a threshold level of active E11 protein or that E11 function is conditionally essential. 126
49209 399265 pfam06139 BphX BphX-like. Family of bacterial proteins located in the phenyl dioxygenase (bph) operon. The function of this family is unknown. 133
49210 399266 pfam06140 Ifi-6-16 Interferon-induced 6-16 family. 77
49211 399267 pfam06141 Phage_tail_U Phage minor tail protein U. Tail fibre component U of bacteriophage. 129
49212 114838 pfam06143 Baculo_11_kDa Baculovirus 11 kDa family. Family of uncharacterized Baculovirus proteins that are all about 11 kDa in size. 84
49213 399268 pfam06144 DNA_pol3_delta DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required for, along with delta' subunit, the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalyzed reaction. The delta subunit is also known as HolA. 174
49214 148007 pfam06145 Corona_NS1 Coronavirus nonstructural protein NS1. Bovine coronavirus NS1 encodes a 4.9 kDa protein. 29
49215 399269 pfam06146 PsiE Phosphate-starvation-inducible E. Phosphate-starvation-inducible E (PsiE) expression is under direct positive and negative control by PhoB and cAMP-CRP, respectively. The function of PsiE remains to be determined. 68
49216 399270 pfam06147 DUF968 Protein of unknown function (DUF968). Family of uncharacterized prophage proteins found in Gammaproteobacteria. These may be HNH-nucleases, as there are several conserved cysteines and histidines. 206
49217 399271 pfam06148 COG2 COG (conserved oligomeric Golgi) complex component, COG2. The COG complex comprises eight proteins COG1-8. The COG complex plays critical roles in Golgi structure and function. The proposed function of the complex is to mediate the initial physical contact between transport vesicles and their membrane targets. A comparable role in tethering vesicles has been suggested for at least six additional large multisubunit complexes, including the exocyst, a complex that mediates trafficking to the plasma membrane. COG2 structure reveals a six-helix bundle with few conserved surface features but a general resemblance to recently determined crystal structures of four different exocyst subunits. These bundles in COG2 may act as platforms for interaction with other trafficking proteins including SNAREs (soluble N-ethylmaleimide factor attachment protein receptors) and Rabs. 133
49218 399272 pfam06149 DUF969 Protein of unknown function (DUF969). Family of uncharacterized bacterial membrane proteins. 216
49219 399273 pfam06150 ChaB ChaB. This family of proteins contains a conserved 60 residue region. This protein is known as ChaB in E. coli and is found next to ChaA which is a cation transporter protein. ChaB may regulate ChaA function in some way. 60
49220 336318 pfam06151 Trehalose_recp Trehalose receptor. In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose. 411
49221 399274 pfam06152 Phage_min_cap2 Phage minor capsid protein 2. Family of related phage minor capsid proteins. 367
49222 283747 pfam06153 CdAMP_rec Cyclic-di-AMP receptor. CdAMP is a family of bacterial cyclic-di-AMP receptor proteins. Cyclic-di-AMP (c-di-AMP) is a bacterial secondary messenger involved in various processes, including sensing of DNA-integrity, cell wall metabolism and potassium transport. CdAMP_rec has a ferredoxin-like fold and is structurally related to Pii-signal transduction proteins. 109
49223 399275 pfam06154 CbeA_antitoxin CbeA_antitoxin, type IV, cytoskeleton bundling-enhancing factor A. CbeA_antitoxin is a family of cognate antitoxins to the CbtA toxins that act by inhibiting the polymerization of cytoskeletal proteins, see pfam06755. These are classified as a type IV toxin-antitoxin system. The family includes three proteins from E. coli YagB, YeeU and YfjZ, which act not by forming a complex with CbtA but through acting as antagonists to the CbtA toxicity, by stabilizing the CbtA target proteins. For example, YeeU binds directly to both MreB and FtsZ and enhances the bundling of their filaments in vitro. YeeU is also able to neutralize the toxicity caused by other MreB and FtsZ inhibitors, such as A22 [S-(3, 4-dichlorobenzyl)isothiourea] for MreB, and SulA and DicB for FtsZ. Thus CbeA, for cytoskeleton bundling-enhancing factor A, is proposed as a general name for all of these antitoxin proteins. 101
49224 399276 pfam06155 DUF971 Protein of unknown function (DUF971). This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown. 83
49225 399277 pfam06156 YabB Initiation control protein YabA. YabA is involved in initiation control of chromosome replication. It interacts with both DnaA and DnaN, acting as a bridge between these two proteins. 103
49226 368769 pfam06157 DUF973 Protein of unknown function (DUF973). This family consists of several hypothetical archaeal proteins of unknown function. 309
49227 399278 pfam06159 DUF974 Protein of unknown function (DUF974). Family of uncharacterized eukaryotic proteins. 243
49228 399279 pfam06160 EzrA Septation ring formation regulator, EzrA. During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerizes into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation. 542
49229 283754 pfam06161 DUF975 Protein of unknown function (DUF975). Family of uncharacterized bacterial proteins. 244
49230 310625 pfam06162 PgaPase_1 Putative pyroglutamyl peptidase PgaPase_1. PgaPase_1 is a family of functionally diverse Caenorhabditis proteins. The family is homologous to the cysteine-peptidases, but lack of a strictly conserved Glu-Cys-His catalytic triad or pGlu binding site implies that it has other functions that could have resulted in a change in reaction-specificity or even of catalytic activity. 166
49231 310626 pfam06163 DUF977 Bacterial protein of unknown function (DUF977). This family consists of several hypothetical bacterial proteins from Escherichia coli and Salmonella typhi. The function of this family is unknown. 134
49232 399280 pfam06165 Glyco_transf_36 Glycosyltransferase family 36. The glycosyltransferase family 36 includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-). Many members of this family contain two copies of this domain. 247
49233 399281 pfam06166 DUF979 Protein of unknown function (DUF979). This family consists of several putative bacterial membrane proteins. The function of this family is unclear. 311
49234 399282 pfam06167 Peptidase_M90 Glucose-regulated metallo-peptidase M90. MtfA (earlier known as YeeI) is a transcription factor A that binds Mlc (make large colonies), itself a repressor of glucose and hence a protein important in regulation of the phosphoenolpyruvate:glucose-phosphotransferase (ptsG) system, the major glucose transporter in E. coli. Mlc is a repressor of ptsG, and MtfA is found to bind and inactivate Mlc with high affinity. The membrane-bound protein EIICBGlc encoded by the ptsG gene is the major glucose transporter in Escherichia coli. MtfA is found to be a glucose-regulated peptidase, whose activity is regulated by binding to Mlc available in the cytoplasm, which in turn has been released from EIICBGlc during times when no glucose is taken up. A physiologically relevant target for this peptidase is not yet known. 243
49235 399283 pfam06168 DUF981 Protein of unknown function (DUF981). Family of uncharacterized proteins found in bacteria and archaea. 180
49236 399284 pfam06169 DUF982 Protein of unknown function (DUF982). This family consists of several hypothetical proteins from Rhizobium meliloti, Rhizobium loti and Agrobacterium tumefaciens. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 71
49237 399285 pfam06170 DUF983 Protein of unknown function (DUF983). This family consists of several bacterial proteins of unknown function. 85
49238 399286 pfam06172 Cupin_5 Cupin superfamily (DUF985). Family of uncharacterized proteins found in bacteria and eukaryotes that belongs to the Cupin superfamily. 138
49239 399287 pfam06173 DUF986 Protein of unknown function (DUF986). This family consists of several bacterial putative membrane proteins of unknown function. 148
49240 368775 pfam06174 DUF987 Protein of unknown function (DUF987). Family of bacterial proteins that are related to the hypothetical protein yeeT. 65
49241 114868 pfam06175 MiaE tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE). This family consists of several bacterial tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE) proteins. The modified nucleoside 2-methylthio-N-6-isopentenyl adenosine (ms2i6A) is present at position 37 (3' of the anticodon) of tRNAs that read codons beginning with U except tRNA(I,V Ser) in Escherichia coli. Salmonella typhimurium 2-methylthio-cis-ribozeatin (ms2io6A) is found in tRNA, probably in the corresponding species that have ms2i6A in E. coli. The miaE gene is absent in E. coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species. 199
49242 283765 pfam06176 WaaY Lipopolysaccharide core biosynthesis protein (WaaY). This family consists of several bacterial lipopolysaccharide core biosynthesis proteins (WaaY or RfaY). The waaY, waaQ, and waaP genes are located in the central operon of the waa (formerly rfa) locus on the chromosome of Escherichia coli. This locus contains genes whose products are involved in the assembly of the core region of the lipopolysaccharide molecule. WaaY is the enzyme that phosphorylates HepII in this system. 229
49243 399288 pfam06177 QueT QueT transporter. This family includes the queT gene encoding a hypothetical integral membrane protein with 5 predicted transmembrane regions. The queT genes in Firmicutes are often preceded by the PreQ1 (7-aminomethyl-7-deazaguanine) riboswitches of two distinct classes, suggesting involvement of the QueT transporters in uptake of a queuosine biosynthetic intermediate. 140
49244 399289 pfam06178 KdgM Oligogalacturonate-specific porin protein (KdgM). This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacteria Erwinia chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric. 215
49245 399290 pfam06179 Med22 Surfeit locus protein 5 subunit 22 of Mediator complex. This family consists of several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit locus. The Med22 subunit of Mediator complex is part of the essential core head region. 105
49246 399291 pfam06180 CbiK Cobalt chelatase (CbiK). This family consists of several bacterial cobalt chelatase (CbiK) proteins (EC:4.99.1.-). 261
49247 399292 pfam06181 Urate_ox_N Urate oxidase N-terminal. Cytochrome c urate oxidase (Uox) PuuD is involved in purine degradation. In contrast with soluble Uox it is a membrane protein with an 8-helix transmembrane N-terminal domain and a C-terminal cytochrome c. 295
49248 253605 pfam06182 ABC2_membrane_6 ABC-2 family transporter protein. This family acts as the transmembrane domain (TMD) of ABC transporters. The family includes proteins responsible for the transport of herbicides. 229
49249 399293 pfam06183 DinI DinI-like family. This family of short proteins includes DNA-damage-inducible protein I (DinI) and related proteins. The SOS response, a set of cellular phenomena exhibited by eubacteria, is initiated by various causes that include DNA damage-induced replication arrest, and is positively regulated by the co-protease activity of RecA. Escherichia coli DinI, a LexA-regulated SOS gene product, shuts off the initiation of the SOS response when overexpressed in vivo. Biochemical and genetic studies indicated that DinI physically interacts with RecA to inhibit its co-protease activity. The structure of DinI is known. 63
49250 399294 pfam06184 Potex_coat Potexvirus coat protein. This family consists of several Potexvirus coat proteins. 153
49251 399295 pfam06185 YecM YecM protein. This family consists of several bacterial YecM proteins of unknown function. 179
49252 399296 pfam06186 DUF992 Protein of unknown function (DUF992). This family consists of several hypothetical bacterial proteins of unknown function. 143
49253 399297 pfam06187 DUF993 Protein of unknown function (DUF993). This family consists of several hypothetical bacterial proteins of unknown function. 381
49254 368784 pfam06188 HrpE HrpE/YscL/FliH and V-type ATPase subunit E. This is a prokaryotic family that contains proteins of the FliH and HrpE/YscL family. These proteins are involved in type III secretion, which is the process that drives flagellar biosynthesis and mediates bacterial-eukaryotic interactions. This family also contains V-type ATPase subunit E. This subunit appears to form a tight interaction with subunit G in the F0 complex. Subunits E and G may act together as stators to prevent certain subunits from rotating with the central rotary element. pfam01991 also contains V-type ATPase subunit E proteins. 187
49255 399298 pfam06189 5-nucleotidase 5'-nucleotidase. This family consists of both eukaryotic and prokaryotic 5'-nucleotidase sequences (EC:3.1.3.5). 265
49256 399299 pfam06191 DUF995 Protein of unknown function (DUF995). Family of uncharacterized Proteobacteria proteins. 140
49257 283778 pfam06193 Orthopox_A5L Orthopoxvirus A5L protein-like. This family includes several Orthopoxvirus A5L proteins. The vaccinia virus WR A5L open reading frame (corresponding to open reading frame A4L in vaccinia virus Copenhagen) encodes an immunodominant late protein found in the core of the vaccinia virion. The A5 protein appears to be required for the immature virion to form the brick-shaped intracellular mature virion. 216
49258 283779 pfam06194 Phage_Orf51 Phage Conserved Open Reading Frame 51. Family of conserved bacteriophage open reading frames. 80
49259 399300 pfam06195 DUF996 Protein of unknown function (DUF996). Family of uncharacterized bacterial and archaeal proteins. 135
49260 399301 pfam06196 DUF997 Protein of unknown function (DUF997). Family of predicted bacterial membrane protein with unknown function. 77
49261 399302 pfam06197 DUF998 Protein of unknown function (DUF998). Family of conserved archaeal proteins. 185
49262 114890 pfam06198 DUF999 Protein of unknown function (DUF999). Family of conserved Schizosaccharomyces pombe proteins with unknown function. 143
49263 399303 pfam06199 Phage_tail_2 Phage tail tube protein. Characterized members are major tail tube proteins from various phages, including lactococcal temperate bacteriophage TP901-1. 134
49264 399304 pfam06200 tify tify domain. This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response. 34
49265 399305 pfam06201 PITH PITH domain. This family was formerly known as DUF1000. The full-length, Txnl1, protein which is a probable component of the 26S proteasome, uses its C-terminal, PITH, domain to associate specifically with the 26S proteasome. PITH derives from proteasome-interacting thioredoxin domain. 145
49266 283786 pfam06202 GDE_C Amylo-alpha-1,6-glucosidase. This family includes human glycogen debranching enzyme AGL. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homolog GDB1 that mutations in this region disrupt the enzyme's amylo-alpha-1,6-glucosidase activity (EC:3.2.1.33). 374
49267 399306 pfam06203 CCT CCT motif. This short motif is found in a number of plant proteins. It is rich in basic amino acids and has been called a CCT motif after Co, Col and Toc1. The CCT motif is about 45 amino acids long and contains a putative nuclear localization signal within the second half of the CCT motif. Toc1 mutants have been identified in this region. 44
49268 399307 pfam06206 CpeT CpeT/CpcT family (DUF1001). This family consists of proteins belonging to the CpeT/CpcT family. These proteins are around 200 amino acids in length. The proteins contain a conserved motif PYR in the amino terminal half of the protein that may be functionally important. The species distribution of the family is interesting. So far it is restricted to cyanobacteria, cryptomonads and plants. It has been shown that CpcT encodes a bilin lyase responsible for attachment of phycocyanobilin to the beta subunit of phycocyanin. 179
49269 399308 pfam06207 DUF1002 Protein of unknown function (DUF1002). This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria. 220
49270 283790 pfam06208 BDV_G Borna disease virus G protein. This family consists of Borna disease virus G glycoprotein sequences. Borna disease virus (BDV) infection produces a variety of clinical diseases, from behavioural illnesses to classical fatal encephalitis. G protein is important for viral entry into the host cell. 503
49271 399309 pfam06209 COBRA1 Cofactor of BRCA1 (COBRA1). This family consists of several cofactor of BRCA1 (COBRA1) like proteins. It is thought that COBRA1 along with BRCA1 is involved in chromatin unfolding. COBRA1 is recruited to the chromosome site by the first BRCT repeat of BRCA1, and is itself sufficient to induce chromatin unfolding. BRCA1 mutations that enhance chromatin unfolding also increase its affinity for, and recruitment of, COBRA1. It is thought that reorganisation of higher levels of chromatin structure is an important regulated step in BRCA1-mediated nuclear functions. 472
49272 399310 pfam06210 DUF1003 Protein of unknown function (DUF1003). This family consists of several hypothetical bacterial proteins of unknown function. 101
49273 283793 pfam06211 BAMBI BMP and activin membrane-bound inhibitor (BAMBI) N-terminal domain. This family consists of several eukaryotic BMP and activin membrane-bound inhibitor (BAMBI) proteins. Members of the transforming growth factor-beta (TGF-beta) superfamily, including TGF-beta, bone morphogenetic proteins (BMPs), activins and nodals, are vital for regulating growth and differentiation. BAMBI is related to TGF-beta-family type I receptors but lacks an intracellular kinase domain. BAMBI is co-expressed with the ventralising morphogen BMP4 during Xenopus embryogenesis and requires BMP signalling for its expression. The protein stably associates with TGF-beta-family receptors and inhibits BMP and activin as well as TGF-beta signalling. 107
49274 399311 pfam06212 GRIM-19 GRIM-19 protein. This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19 was reported to encode a small protein primarily distributed in the nucleus and was able to promote cell death induced by IFN-beta and RA. A bovine homolog of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localization and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity. 132
49275 399312 pfam06213 CobT Cobalamin biosynthesis protein CobT. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. 274
49276 368793 pfam06214 SLAM Signaling lymphocytic activation molecule (SLAM) protein. This family consists of several mammalian signaling lymphocytic activation molecule (SLAM) proteins. Optimal T cell activation and expansion require engagement of the TCR plus co-stimulatory signals delivered through accessory molecules. SLAM, a 70-kDa co-stimulatory molecule belonging to the Ig superfamily, is defined as a human cell surface molecule that mediates CD28-independent proliferation of human T cells and IFN-gamma production by human Th1 and Th2 clones. SLAM has also been recognized as a receptor for measles virus. 125
49277 368794 pfam06215 ISAV_HA Infectious salmon anaemia virus haemagglutinin. This family consists of several infectious salmon anaemia virus haemagglutinin proteins. Infectious salmon anaemia virus (ISAV), an orthomyxovirus-like virus, is an important fish pathogen in marine aquaculture. 380
49278 283798 pfam06216 RTBV_P46 Rice tungro bacilliform virus P46 protein. This family consists of several Rice tungro bacilliform virus P46 proteins. The function of this family is unknown. 389
49279 399313 pfam06217 GAGA_bind GAGA binding protein-like family. This family includes gbp, a protein from soybean that binds to GAGA element dinucleotide repeat DNA. It seems likely that this domain mediates DNA binding. This putative domain contains several conserved cysteines and a histidine, suggesting this may be a zinc-binding DNA interaction domain. 290
49280 368796 pfam06218 NPR2 Nitrogen permease regulator 2. This family of regulators are involved in post-translational control of nitrogen permease. 439
49281 399314 pfam06219 DUF1005 Protein of unknown function (DUF1005). Family of plant proteins with undetermined function. 430
49282 368798 pfam06220 zf-U1 U1 zinc finger. This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. Binding of TIA-1 in the vicinity of a 5' ss helps to stabilize U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably zinc-binding. It is found in multiple copies in some members of the family. 38
49283 399315 pfam06221 zf-C2HC5 Putative zinc finger motif, C2HC5-type. This zinc finger appears to be common in activating signal cointegrator 1/thyroid receptor interacting protein 4. 54
49284 368800 pfam06222 Phage_TAC_1 Phage tail assembly chaperone. 126
49285 283804 pfam06223 Phage_tail_T Minor tail protein T. Minor tail protein T is located at the distal end and is involved in the assembly of the initiator complex for tail polymerization. 100
49286 399316 pfam06224 HTH_42 Winged helix DNA-binding domain. This family contains two copies of a winged helix domain. 324
49287 399317 pfam06226 DUF1007 Protein of unknown function (DUF1007). Family of conserved bacterial proteins with unknown function. 210
49288 368802 pfam06227 Poxvirus dsDNA Poxvirus. This is a family of dsDNA viruses, with no RNA stage, Poxvirus proteins. 145
49289 399318 pfam06228 ChuX_HutX Haem utilisation ChuX/HutX. This family is found within haem utilisation operons. It has a similar structure to that of pfam05171. pfam05171 usually occurs as a duplicated domain, but this domain occurs as a single domain and forms a dimer. The organisation of the dimer is very similar to that of the duplicated pfam05171 domains. It binds haem via conserved histidines. 128
49290 368803 pfam06229 FRG1 FRG1-like domain. The human FRG1 gene maps to human chromosome 4q35 and has been identified as a candidate for facioscapulohumeral muscular dystrophy. Currently, the function of FRG1 is unknown. 189
49291 399319 pfam06230 DUF1009 Protein of unknown function (DUF1009). Family of uncharacterized bacterial proteins. 131
49292 283811 pfam06231 DUF1010 Protein of unknown function (DUF1010). Family of plasmid encoded proteins with unknown function. 81
49293 399320 pfam06232 ATS3 Embryo-specific protein 3, (ATS3). This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes. 125
49294 399321 pfam06233 Usg Usg-like family. Family of bacterial proteins, referred to as Usg. Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system. 80
49295 399322 pfam06234 TmoB Toluene-4-monooxygenase system protein B (TmoB). This family consists of several Toluene-4-monooxygenase system protein B (TmoB) sequences. Pseudomonas mendocina KR1 metabolizes toluene as a carbon source. The initial step of the pathway is hydroxylation of toluene to form p-cresol by a multicomponent toluene-4-monooxygenase (T4MO) system. TmoB adopts a ubiquitin fold. Although TmoB is a component of the T4MO system, its precise role remains unclear. 78
49296 368806 pfam06235 NAD4L NADH dehydrogenase subunit 4L (NAD4L). This family consists of NADH dehydrogenase subunit 4L (NAD4L) proteins from the mitochondria of several parasitic flatworms. 86
49297 399323 pfam06236 MelC1 Tyrosinase co-factor MelC1. This family consists of several tyrosinase co-factor MELC1 proteins from a number of Streptomyces species. The melanin operon (melC) of Streptomyces antibioticus contains two genes, melC1 and melC2 (apotyrosinase). It is thought that MelC1 forms a transient binary complex with the downstream apotyrosinase MelC2 to facilitate the incorporation of copper ion and the secretion of tyrosinase indicating that MelC1 is a chaperone for the apotyrosinase MelC2. 113
49298 399324 pfam06237 DUF1011 Protein of unknown function (DUF1011). Family of uncharacterized eukaryotic proteins. 98
49299 399325 pfam06239 ECSIT Evolutionarily conserved signalling intermediate in Toll pathway. Activation of NF-kappaB as a consequence of signaling through the Toll and IL-1 receptors is a major element of innate immune responses. ECSIT plays an important role in signalling to NF-kappaB, functioning as the intermediate in the signaling pathways between TRAF-6 and MEKK-1. 219
49300 399326 pfam06240 COXG Carbon monoxide dehydrogenase subunit G (CoxG). The CO dehydrogenase structural genes coxMSL are flanked by nine accessory genes arranged as the cox gene cluster. The cox genes are specifically and coordinately transcribed under chemolithoautotrophic conditions in the presence of CO as carbon and energy source. 140
49301 399327 pfam06241 Castor_Poll_mid Castor and Pollux, part of voltage-gated ion channel. This family represents a short region in the middle of largely plant proteins that belong to the TCDB:1.A.1.23.2 family of the voltage-gated ion channel superfamily, eg UniProtKB:Q5H8A6, Q5H8A5 and Q4VY51. 104
49302 399328 pfam06242 DUF1013 Protein of unknown function (DUF1013). Family of uncharacterized proteins found in Proteobacteria. 138
49303 399329 pfam06243 PaaB Phenylacetic acid degradation B. Phenylacetic acid degradation protein B (PaaB) is thought to be part of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation. 88
49304 399330 pfam06244 Ccdc124 Coiled-coil domain-containing protein 124. Ccdc124 is a centrosome and midbody protein involved in cytokinesis. 121
49305 399331 pfam06245 DUF1015 Protein of unknown function (DUF1015). Family of proteins with unknown function found in archaea and bacteria. 330
49306 399332 pfam06246 Isy1 Isy1-like splicing family. Isy1 protein is important in the optimisation of splicing. 245
49307 399333 pfam06247 Plasmod_Pvs28 Pvs28 EGF domain. This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunized animals. The structure of this protein shows it is composed of four EGF domains. 35
49308 368816 pfam06248 Zw10 Centromere/kinetochore Zw10. Zw10 and rough deal proteins are both required for correct metaphase check-pointing during mitosis. These proteins bind to the centromere/kinetochore. 543
49309 114941 pfam06249 EutQ Ethanolamine utilisation protein EutQ. The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The role of EutQ in this process is unclear. 152
49310 399334 pfam06250 DUF1016 Protein of unknown function (DUF1016). Family of uncharacterized proteins found in viruses, archaea and bacteria. 155
49311 399335 pfam06251 Caps_synth_GfcC Capsule biosynthesis GfcC. Many bacteria are covered in a layer of surface-associated polysaccharide called the capsule. These capsules can be divided into four groups depending upon the organisation of genes responsible for capsule assembly, the assembly pathway and regulation. This family plays a role in group 4 capsule biosynthesis. These proteins have a beta-grasp fold. Two beta-grasp domains, D2 and D3, are arranged in tandem. There is a C-terminal amphipathic helix which packs against D3. A helical hairpin insert in D2 binds to D3 and constrains its position, a conserved arginine residue at the end of this hairpin is essential for structural integrity. 229
49312 399336 pfam06252 DUF1018 Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function. 118
49313 399337 pfam06253 MTTB Trimethylamine methyltransferase (MTTB). This family consists of several trimethylamine methyltransferase (MTTB) (EC:2.1.1.-) proteins from numerous Rhizobium and Methanosarcina species. 489
49314 368819 pfam06254 YdaT_toxin Putative bacterial toxin ydaT. YdaT_toxin is a family of putative bacterial toxins that are neutralized by the putative antitoxin YdaS, UniProtKB:P76063, family pfam144549. 88
49315 283833 pfam06255 MafB Neisseria toxin MafB. MafB constitutes a family of secreted toxins in pathogenic Neisseria species, probably involved in interbacterial competition. Genes immediately downstream of mafB encode a specific immunity protein (MafI). MafB proteins exhibit a signal peptide sequence, a N-terminal conserved domain and a C-terminal variable region. Toxic domains identified at the C-terminus include pfam15542, pfam14437, pfam15524, and pfam14436. 312
49316 283834 pfam06256 Nucleo_LEF-12 Nucleopolyhedrovirus LEF-12 protein. This family consists of several Nucleopolyhedrovirus late expression factor-12 (LEF-12) proteins. The function of this family is unknown. 173
49317 399338 pfam06257 VEG Biofilm formation stimulator VEG. VEG is a family that is highly conserved among Gram-positive bacteria. It stimulates biofilm formation through inducing transcription of the tapA-sipW-tasA operon. The products of this operon are responsible for production of the amyloid fibre (TasA) component of the biofilm. Veg or a Veg-induced protein acts as an antirepressor of SinR - part of the major overall biofilm transcriptional control system - to regulate and stimulate biofilm formation. Veg is transcribed at high levels during both exponential growth and sporulation. 62
49318 399339 pfam06258 Mito_fiss_Elm1 Mitochondrial fission ELM1. In plants, this family is involved in mitochondrial fission. It binds to dynamin-related proteins and plays a role in their relocation from the cytosol to mitochondrial fission sites. Its function in bacteria is unknown. 301
49319 283837 pfam06259 Abhydrolase_8 Alpha/beta hydrolase. Members of this family are predicted to have an alpha/beta hydrolase fold. They contain a predicted Ser-His-Asp catalytic triad, in which the serine is likely to act as a nucleophile. 178
49320 283838 pfam06260 DUF1024 Protein of unknown function (DUF1024). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus phage phi proteins. The function of this family is unknown. 82
49321 368820 pfam06261 LktC Actinobacillus actinomycetemcomitans leukotoxin activator LktC. This family consists of several Actinobacillus actinomycetemcomitans leukotoxin activator (LktC) proteins. Actinobacillus actinomycetemcomitans is a Gram-negative bacterium that has been implicated in the etiology of several forms of periodontitis, especially localized juvenile periodontitis. LktC along with LktB and LktD are thought to be required for activation and localization of the leukotoxin. 150
49322 399340 pfam06262 Zincin_1 Zincin-like metallopeptidase. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family is a minimal version of the metalloprotease fold (Structure 3E11). 95
49323 399341 pfam06265 DUF1027 Protein of unknown function (DUF1027). This family consists of several hypothetical bacterial proteins of unknown function. 84
49324 336357 pfam06266 HrpF HrpF protein. The species Pseudomonas syringae encompasses plant pathogens with differing host specificities and corresponding pathovar designations. P. syringae requires the Hrp (type III protein secretion) system, encoded by a 25-kb cluster of hrp and hrc genes, in order to elicit the hypersensitive response (HR) in nonhosts or to be pathogenic in hosts. The exact function of HrpF is unknown but the protein is needed for pathogenicity. 74
49325 399342 pfam06267 DUF1028 Family of unknown function (DUF1028). Family of bacterial and archaeal proteins with unknown function. Some members are associated with a C-terminal peptidoglycan binding domain. So perhaps this could be an enzyme involved in peptidoglycan metabolism. 189
49326 399343 pfam06268 Fascin Fascin domain. This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture. Dictyostelium hisactophilin, another actin-binding protein, is a submembranous pH sensor that signals slight changes of the H+ concentration to actin by inducing actin polymerization and binding to microfilaments only at pH values below seven. Members of this family are histidine rich, typically contain the repeated motif of HHXH. 111
49327 283844 pfam06269 DUF1029 Protein of unknown function (DUF1029). This family consists of several short Chordopoxvirus proteins of unknown function. 53
49328 114962 pfam06270 DUF1030 Protein of unknown function (DUF1030). This family consists of several short Circovirus proteins of unknown function. 53
49329 399344 pfam06271 RDD RDD family. This family of proteins contain three highly conserved amino acids: one arginine and two aspartates, hence the name of RDD family. This region contains two predicted transmembrane regions. The arginine occurs at the N-terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.). 136
49330 310698 pfam06273 eIF-4B Plant specific eukaryotic initiation factor 4B. This family consists of several plant specific eukaryotic initiation factor 4B proteins. 502
49331 114966 pfam06275 DUF1031 Protein of unknown function (DUF1031). This family consists of several Lactococcus lactis bacteriophage and Lactococcus lactis proteins of unknown function. 80
49332 399345 pfam06276 FhuF Ferric iron reductase FhuF-like transporter. This family consists of several bacterial ferric iron reductase protein (FhuF) sequences. FhuF is involved in the reduction of ferric iron in cytoplasmic ferrioxamine B. This family also includes the IucA and IucC proteins. 163
49333 377642 pfam06277 EutA Ethanolamine utilisation protein EutA. This family consists of several bacterial EutA ethanolamine utilisation proteins. The EutA protein is thought to protect the lyase (EutBC) from inhibition by CNB12. 475
49334 399346 pfam06278 CNDH2_N Condensin II complex subunit CAP-H2 or CNDH2, N-terminal. CNDH2_N is the N-terminal domain of the H2 subunit of the condensin II complex, found in eukaryotes but not in fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII is concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seems to be triggered by extensive phosphorylations, eg, entry of the complex, in Sch.pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in HeLa cells thus promoting early stage chromosomal condensation by CII. 111
49335 399347 pfam06279 DUF1033 Protein of unknown function (DUF1033). This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown. 117
49336 399348 pfam06280 fn3_5 Fn3-like domain. Fn3_5 is an fn3-like domain which is frequently found as the first of three on streptococcal C5a peptidase (SCP), a highly specific protease and adhesin/invasin. The family is found in conjunction with pfam00082, pfam02225 and pfam00746. 112
49337 253656 pfam06281 DUF1035 Protein of unknown function (DUF1035). This family consists of several Sulfolobus and Sulfolobus virus proteins of unknown function. 73
49338 399349 pfam06282 DUF1036 Protein of unknown function (DUF1036). This family consists of several hypothetical bacterial proteins of unknown function. 111
49339 399350 pfam06283 ThuA Trehalose utilisation. This family consists of several bacterial ThuA like proteins. ThuA appears to be involved in utilisation of trehalose. The thuA and thuB genes form part of the trehalose/sucrose transport operon thuEFGKAB, which is located on the pSymB megaplasmid. The thuA and thuB genes are induced in vitro by trehalose but not by sucrose and the extent of its induction depends on the concentration of trehalose available in the medium. 213
49340 283854 pfam06284 Cytomega_UL84 Cytomegalovirus UL84 protein. This family consists of several Cytomegalovirus UL84 proteins. The open reading frame UL84 of human cytomegalovirus encodes a multifunctional regulatory protein which is required for viral DNA replication and binds with high affinity to the immediate-early transactivator IE2-p86. 586
49341 253659 pfam06286 Coleoptericin Coleoptericin. This family consists of several insect Coleoptericin, Acaloleptin, Holotricin and Rhinocerosin proteins which are all known to be antibacterial proteins. 143
49342 377644 pfam06287 DUF1039 Protein of unknown function (DUF1039). This family consists of several hypothetical bacterial proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unknown. 65
49343 399351 pfam06288 DUF1040 Protein of unknown function (DUF1040). This family consists of several bacterial YihD proteins of unknown function. 86
49344 399352 pfam06289 FlbD Flagellar protein (FlbD). This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown. 59
49345 399353 pfam06290 PsiB Plasmid SOS inhibition protein (PsiB). This family consists of several plasmid SOS inhibition protein (PsiB) sequences. 138
49346 399354 pfam06291 Lambda_Bor Bor protein. This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the Escherichia coli host cell in animal serum. This property is a well known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis. 77
49347 399355 pfam06292 DUF1041 Domain of Unknown Function (DUF1041). This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with pfam00168, pfam00130 and pfam00169 domains. 107
49348 399356 pfam06293 Kdo Lipopolysaccharide kinase (Kdo/WaaP) family. These lipopolysaccharide kinases are related to protein kinases pfam00069. This family includes the waaP (rfaP) gene product, which is required for the addition of phosphate to O-4 of the first heptose residue of the lipopolysaccharide (LPS) inner core region. It has previously been shown that WaaP is necessary for resistance to hydrophobic and polycationic antimicrobials in E. coli and that it is required for virulence in invasive strains of S. enterica. 206
49349 399357 pfam06294 CH_2 CH-like domain in sperm protein. Spef is a region of sperm flagellar proteins. It probably exerts a role in spermatogenesis in that the protein is expressed predominantly in adult tissue. It is present in the tails of developing and epididymal sperm internal to the fibrous sheath and around the dense outer fibers of the sperm flagellum. The amino-terminal domain (residues 1-110) shows a possible calponin homology (CH) domain; however Spef does not bind actin directly under in vitro conditions, so the function of the amino-terminal calponin-like domain is unclear. Transcription aberrations leading to a truncated protein result in immotile sperm. 91
49350 399358 pfam06295 DUF1043 Protein of unknown function (DUF1043). This family consists of several hypothetical bacterial proteins of unknown function. 123
49351 399359 pfam06296 RelE RelE toxin of RelE / RelB toxin-antitoxin system. RelE is a family of Gram-negative bacterial toxins of the RelE/RelB toxin-antitoxin system. Its cognate antitoxin is family RelB, pfam04221. 120
49352 399360 pfam06297 PET PET Domain. This domain is suggested to be involved in protein-protein interactions. The family is found in conjunction with pfam00412. 85
49353 399361 pfam06298 PsbY Photosystem II protein Y (PsbY). This family consists of several bacterial and plant photosystem II protein Y (PsbY) sequences. PsbY is a manganese-binding protein that has an L-arginine metabolising enzyme activity. 35
49354 399362 pfam06299 DUF1045 Protein of unknown function (DUF1045). This family consists of several hypothetical proteins from Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown. 159
49355 368834 pfam06300 Tsp45I Tsp45I type II restriction enzyme. This family consists of several type II restriction enzymes. 260
49356 399363 pfam06301 Lambda_Kil Bacteriophage lambda Kil protein. This family consists of several Bacteriophage lambda Kil protein like sequences from both phages and bacteria. Induction of a lambda prophage causes the death of the host cell even in the absence of phage replication and lytic functions due to expression of the lambda kil gene. 42
49357 399364 pfam06303 MatP MatP N-terminal domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the N-terminal domain of MatP. 84
49358 399365 pfam06304 DUF1048 Protein of unknown function (DUF1048). This family consists of several hypothetical bacterial proteins of unknown function. 103
49359 399366 pfam06305 LapA_dom Lipopolysaccharide assembly protein A domain. This family includes a domain found in lipopolysaccharide assembly protein A (LapA). LapA functions along with LapB in the assembly of lipopolysaccharide (LPS). Domains in this family are also found in some uncharacterized bacterial proteins. 64
49360 114995 pfam06306 CgtA Beta-1,4-N-acetylgalactosaminyltransferase (CgtA). This family consists of several beta-1,4-N-acetylgalactosaminyltransferase proteins from Campylobacter jejuni. 347
49361 253668 pfam06307 Herpes_IR6 Herpesvirus IR6 protein. This family consists of several Herpesvirus IR6 proteins. The equine herpesvirus 1 (EHV-1) IR6 protein forms typical rod-like structures in infected cells, influences virus growth at elevated temperatures, and determines the virulence of EHV-1 Rac strains. 214
49362 368836 pfam06308 ErmC 23S rRNA methylase leader peptide (ErmC). This family consists of several very short bacterial 23S rRNA methylase leader peptide (ErmC) sequences. ermC confers resistance to macrolide-lincosamide streptogramin B antibiotics by specifying a ribosomal RNA methylase, which results in decreased ribosomal affinity for these antibiotics. ermC expression is induced by exposure to erythromycin. 31
49363 399367 pfam06309 Torsin Torsin. This family consists of several eukaryotic torsin proteins. Torsion dystonia is an autosomal dominant movement disorder characterized by involuntary, repetitive muscle contractions and twisted postures. The most severe early-onset form of dystonia has been linked to mutations in the human DYT1 (TOR1A) gene encoding a protein termed torsinA. While causative genetic alterations have been identified, the function of torsin proteins and the molecular mechanism underlying dystonia remain unknown. Phylogenetic analysis of the torsin protein family indicates these proteins share distant sequence similarity with the large and diverse family of (pfam00004) proteins. It has been suggested that torsins play a role in effectively managing protein folding and that possible breakdown in a neuroprotective mechanism that is, in part, mediated by torsins may be responsible for the neuronal dysfunction associated with dystonia. 120
49364 399368 pfam06311 NumbF NUMB domain. This presumed domain is found in the Numb family of proteins adjacent to the PTB domain. 90
49365 368838 pfam06312 Neurexophilin Neurexophilin. This family consists of mammalian neurexophilin proteins. Mammalian brains contain four different neurexophilin proteins. Neurexophilins form a family of related glycoproteins that are proteolytically processed after synthesis and bind to alpha-neurexins. The structure and characteristics of neurexophilins indicate that they function as neuropeptides that may signal via alpha-neurexins. 203
49366 399369 pfam06313 ACP53EA Drosophila ACP53EA protein. This family consists of several Drosophila ACP53EA accessory gland (seminal) proteins. 90
49367 399370 pfam06314 ADC Acetoacetate decarboxylase (ADC). This family consists of several acetoacetate decarboxylase (ADC) proteins (EC:4.1.1.4). 239
49368 399371 pfam06315 AceK Isocitrate dehydrogenase kinase/phosphatase (AceK). This family consists of several bacterial isocitrate dehydrogenase kinase/phosphatase (AceK) proteins (EC:2.7.1.116). 560
49369 115004 pfam06316 Ail_Lom Enterobacterial Ail/Lom protein. This family consists of several bacterial and phage Ail/Lom-like proteins. The Yersinia enterocolitica Ail protein is a known virulence factor. Proteins in this family are predicted to consist of eight transmembrane beta-sheets and four cell surface-exposed loops. It is thought that Ail directly promotes invasion and loop 2 contains an active site, perhaps a receptor-binding domain. The phage protein Lom is expressed during lysogeny, and encode host-cell envelope proteins. Lom is found in the bacterial outer membrane, and is homologous to virulence proteins of two other enterobacterial genera. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis. 199
49370 399372 pfam06317 Arena_RNA_pol Arenavirus RNA polymerase. This family consists of several Arenavirus RNA polymerase proteins (EC:2.7.7.48). 2039
49371 399373 pfam06319 MmcB-like DNA repair protein MmcB-like. This family includes Caulobacter MmcB (CCNA_03580), which is involved in DNA repair. It has been proposed to be an endonuclease that creates the substrate for translesion synthesis. 148
49372 399374 pfam06320 GCN5L1 GCN5-like protein 1 (GCN5L1). This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown. 113
49373 399375 pfam06321 P_gingi_FimA Major fimbrial subunit protein (FimA). This family consists of several Porphyromonas gingivalis major fimbrial subunit protein (FimA) sequences. Fimbriae of Porphyromonas gingivalis, a periodontopathogen, play an important role in its adhesion to and invasion of host cells. The fimA genes encoding fimbrillin (FimA), a subunit protein of fimbriae, have been classified into five types, types I to V, based on nucleotide sequences. It has been found that type II FimA can bind to epithelial cells most efficiently through specific host receptors. Human dental plaque is a multispecies microbial biofilm that is associated with two common oral diseases, dental caries and periodontal disease. There is an inter-species contact-dependent communication system between P. gingivalis and S. cristatus that involves the Arc-A enzyme. 157
49374 283883 pfam06322 Phage_NinH Phage NinH protein. This family consists of several phage NinH proteins. The function of this family is unknown. 60
49375 399376 pfam06323 Phage_antiter_Q Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. Phage 82 gene Q encodes a phage-specific positive regulator of late gene expression, thought, by analogy to the corresponding gene of phage lambda, to be a transcription antiterminator. 220
49376 283885 pfam06324 Pigment_DH Pigment-dispersing hormone (PDH). This family consists of several eukaryotic pigment-dispersing hormone (PDH) proteins. The pigment-dispersing hormone (PDH) is produced in the eyestalks of Crustacea where it induces light-adapting movements of pigment in the compound eye and regulates the pigment dispersion in the chromatophores. 18
49377 399377 pfam06325 PrmA Ribosomal protein L11 methyltransferase (PrmA). This family consists of several Ribosomal protein L11 methyltransferase (EC:2.1.1.-) sequences. 295
49378 283887 pfam06326 Vesiculo_matrix Vesiculovirus matrix protein. This family consists of several Vesiculovirus matrix proteins. The matrix (M) protein of vesicular stomatitis virus (VSV) expressed in the absence of other viral components causes many of the cytopathic effects of VSV, including an inhibition of host gene expression and the induction of cell rounding. It has been shown that M protein also induces apoptosis in the absence of other viral components. It is thought that the activation of apoptotic pathways causes the inhibition of host gene expression and cell rounding by M protein. 240
49379 399378 pfam06327 DUF1053 Domain of Unknown Function (DUF1053). This domain is found in Adenylate cyclases. 100
49380 399379 pfam06328 Lep_receptor_Ig Ig-like C2-type domain. This domain is a ligand-binding immunoglobulin-like domain. The two cysteine residues form a disulphide bridge. 87
49381 310730 pfam06330 TRI5 Trichodiene synthase (TRI5). This family consists of several fungal trichodiene synthase proteins (EC:4.2.3.6). TRI5 encodes the enzyme trichodiene synthase, which has been shown to catalyze the first step in the trichothecene pathways of Fusarium and Trichothecium species. 353
49382 399380 pfam06331 Tfb5 Transcription factor TFIIH complex subunit Tfb5. This family is a component of the general transcription and DNA repair factor IIH. TFB5 has been shown to be required for efficient recruitment of TFIIH to a promoter. 67
49383 399381 pfam06333 Med13_C Mediator complex subunit 13 C-terminal. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med13 is part of the ancillary kinase module, together with Med12, CDK8 and CycC, which in yeast is implicated in transcriptional repression, though most of this activity is likely attributable to the CDK8 kinase. The large Med12 and Med13 proteins are required for specific developmental processes in Drosophila, zebrafish, and Caenorhabditis elegans but their biochemical functions are not understood. 322
49384 283893 pfam06334 Orthopox_A47 Orthopoxvirus A47 protein. This family consists of several Orthopoxvirus A47 proteins. The function of this family is unknown. 244
49385 399382 pfam06335 DUF1054 Protein of unknown function (DUF1054). This family consists of several hypothetical bacterial proteins of unknown function. 198
49386 283895 pfam06336 Corona_5a Coronavirus 5a protein. This family consists of several Coronavirus 5a proteins. The function of this family is unknown. 64
49387 399383 pfam06337 DUSP DUSP domain. The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. 80
49388 399384 pfam06338 ComK ComK protein. This family consists of several bacterial ComK proteins. The ComK protein of Bacillus subtilis positively regulates the transcription of several late competence genes as well as comK itself. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level. 152
49389 368850 pfam06339 Ectoine_synth Ectoine synthase. This family consists of several bacterial ectoine synthase proteins. The ectABC genes encode the diaminobutyric acid acetyltransferase (EctA), the diaminobutyric acid aminotransferase (EctB), and the ectoine synthase (EctC). Together these proteins constitute the ectoine biosynthetic pathway. 127
49390 283899 pfam06340 TcpF Vibrio cholerae toxin co-regulated pilus biosynthesis protein F. This family consists of several Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF) sequences. TcpF is known to be a secreted virulence protein but its exact function is unknown. 317
49391 283900 pfam06341 DUF1056 Protein of unknown function (DUF1056). This family consists of several putative head-tail joining bacteriophage proteins. 63
49392 115027 pfam06342 DUF1057 Alpha/beta hydrolase of unknown function (DUF1057). This family consists of several Caenorhabditis elegans specific proteins of unknown function. Members of this family have an alpha/beta hydrolase fold. 297
49393 283901 pfam06344 Parecho_VpG Parechovirus Genome-linked protein. This family is of the Parechovirus genome-linked protein Vpg type P3B. 20
49394 399385 pfam06345 Drf_DAD DRF Autoregulatory Domain. This motif is found in Diaphanous-related formins. It binds the N-terminal GTPase-binding domain; this link is broken when GTP-bound Rho binds to the GBD and activates the protein. The addition of DAD to mammalian cells induces actin filament formation, stabilizes microtubules, and activates serum-response mediated transcription. 15
49395 399386 pfam06346 Drf_FH1 Formin Homology Region 1. This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues. 154
49396 399387 pfam06347 SH3_4 Bacterial SH3 domain. This family consists of several hypothetical bacterial proteins of unknown function. These are composed of SH3-like domains. 56
49397 399388 pfam06348 DUF1059 Protein of unknown function (DUF1059). This family consists of several short hypothetical archaeal proteins of unknown function. 56
49398 399389 pfam06350 HSL_N Hormone-sensitive lipase (HSL) N-terminus. This family consists of several mammalian hormone-sensitive lipase (HSL) proteins (EC:3.1.1.-). Hormone-sensitive lipase, a key enzyme in fatty acid mobilisation, overall energy homeostasis, and possibly steroidogenesis, is acutely controlled through reversible phosphorylation by catecholamines and insulin. 306
49399 368854 pfam06351 Allene_ox_cyc Allene oxide cyclase. This family consists of several plant specific allene oxide cyclase proteins (EC:5.3.99.6). The allene oxide cyclase (AOC)-catalyzed step in jasmonate (JA) biosynthesis is important in the wound response of tomato. 175
49400 399390 pfam06353 DUF1062 Protein of unknown function (DUF1062). This family consists of several hypothetical bacterial proteins of unknown function. 135
49401 399391 pfam06355 Aegerolysin Aegerolysin. This family consists of several bacterial and fungal Aegerolysin-like proteins. It has been found that aegerolysin and ostreolysin are expressed during formation of primordia and fruiting bodies. It has been suggested that these haemolysins play an important role in initial phase of fungal fruiting. The bacterial members of this family are expressed during sporulation. Ostreolysin was found cytolytic to various erythrocytes and tumor cells. It forms transmembrane pores 4 nm in diameter. The activity is inhibited by total membrane lipids, and modulated by lysophosphatides. The potential use of aegerolysins is reviewed with special emphasis on their properties which would allow their use in therapeutics. Aegerolysin is part of the pleurotolysin pore-forming (Pleurotolysin) transporter superfamily. Member proteins assemble into a transmembrane pore complex. 131
49402 283910 pfam06356 DUF1064 Protein of unknown function (DUF1064). This family consists of several phage and bacterial proteins of unknown function. 117
49403 253691 pfam06357 Omega-toxin Omega-atracotoxin. This family consists of several Hadronyche versuta (Blue mountains funnel-web spider) specific omega-atracotoxin proteins. Omega-Atracotoxin-Hv1a is an insect-specific neurotoxin whose phylogenetic specificity derives from its ability to antagonise insect, but not vertebrate, voltage-gated calcium channels. Two spatially proximal residues, Asn(27) and Arg(35), form a contiguous molecular surface that is essential for toxin activity. It has been proposed that this surface of the beta-hairpin is a key site for interaction of the toxin with insect calcium channels. 37
49404 283911 pfam06358 DUF1065 Protein of unknown function (DUF1065). This family consists of several Benyvirus proteins of unknown function. 111
49405 399392 pfam06360 E_raikovi_mat Euplotes raikovi mating pheromone. This family consists of several Euplotes raikovi mating pheromone proteins. Diffusible polypeptide pheromones, which distinguish otherwise morphologically identical vegetative cell types from one another, are produced by some species of ciliates. In the marine sand-dwelling protozoan ciliate Euplotes raikovi, pheromone molecules promote the vegetative reproduction (mitogenic proliferation or growth) of the same cells from which they originate. As, understandably, such autocrine pheromone activity is primary to that of targeting and inducing a foreign cell to mate (paracrine functions), this finding provides an example of how the original function of a molecule can be obscured during evolution by the acquisition of a new one. 33
49406 283912 pfam06361 RTBV_P12 Rice tungro bacilliform virus P12 protein. This family consists of several Rice tungro bacilliform virus P12 proteins. The function of this family is unknown. 110
49407 115044 pfam06362 DUF1067 Protein of unknown function (DUF1067). This family consists of several hypothetical Mycobacterium leprae specific proteins. The function of this family is unknown. 97
49408 368858 pfam06363 Picorna_P3A Picornaviridae P3A protein. This family consists of the P3A protein of picornaviridae. P3A has been identified as a genome-linked protein (VPg) and is involved in replication. 98
49409 399393 pfam06364 DUF1068 Protein of unknown function (DUF1068). This family consists of several hypothetical plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 165
49410 399394 pfam06365 CD34_antigen CD34/Podocalyxin family. This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighboring foot processes in the glomerular epithelium by charge repulsion. 210
49411 399395 pfam06366 FlhE Flagellar protein FlhE. This family consists of several Enterobacterial FlhE flagellar proteins. The exact function of this family is unknown. 106
49412 368862 pfam06367 Drf_FH3 Diaphanous FH3 Domain. This region is found in the Formin-like and diaphanous proteins. 195
49413 399396 pfam06368 Met_asp_mut_E Methylaspartate mutase E chain (MutE). This family consists of several methylaspartate mutase E chain proteins (EC:5.4.99.1). Glutamate mutase catalyzes the first step in the fermentation of glutamate by Clostridium tetanomorphum. This is an unusual isomerisation in which L-glutamate is converted to threo-beta-methyl L-aspartate. 441
49414 399397 pfam06369 Anemone_cytotox Sea anemone cytotoxic protein. Sea anemones are a rich source of cytotoxic proteins. Cytolysins comprise a group of more than 30 highly basic proteins with molecular masses of about 20 kDa. Cytolysins isolated from the sea anemone, Heteractis magnifica, include magnificalysin I (HMg I), magnificalysin II (HMg II) and Heteractis magnifica toxin (HMgtxn). These are highly homologous at their N-terminals. HMg I and II have molecular masses of approximately 19 kDa, and pI values of 9.4 and 10.0, respectively. Cytolysins isolated from other sea anemones Actinia tenebrosa (Tenebrosin-C, TN-C), Actinia equina (Equinatoxin, EqT) and Stichodactyla helianthus (ShC) exhibit pore-forming, haemolytic, cytotoxic, and heart stimulatory activities. 176
49415 191504 pfam06370 DUF1069 Protein of unknown function (DUF1069). This family consists of several Maize streak virus 21.7 kDa proteins. The function of this family is unknown. 206
49416 399398 pfam06371 Drf_GBD Diaphanous GTPase-binding Domain. This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein. 190
49417 148150 pfam06372 Gemin6 Gemin6 protein. This family consists of several mammalian Gemin6 proteins. The exact function of Gemin6 is unknown but it has been found to form part of the pfam06003 complex. The SMN complex plays a key role in the biogenesis of spliceosomal small nuclear ribonucleoproteins (snRNPs) and other ribonucleoprotein particles. 169
49418 368865 pfam06373 CART Cocaine and amphetamine regulated transcript protein (CART). This family consists of several cocaine and amphetamine regulated transcript type I protein (CART) sequences. Cocaine and amphetamine regulated transcript (CART) peptide has been shown to be an anorectic peptide that inhibits both normal and starvation-induced feeding and completely blocks the feeding response induced by neuropeptide Y and regulated by leptin in the hypothalamus. The C-terminal part containing the three disulfide bridges is the biologically active part of the molecule affecting food intake. The solution structure of the active part of CART has a fold equivalent to other functionally distinct small proteins. CART consists mainly of turns and loops spanned by a compact framework composed by a few small stretches of antiparallel beta-sheet common to cystine knots. 70
49419 399399 pfam06374 NDUF_C2 NADH-ubiquinone oxidoreductase subunit b14.5b (NDUFC2). This family consists of several NADH-ubiquinone oxidoreductase subunit b14.5b proteins (EC:1.6.5.3). 109
49420 399400 pfam06375 AP3D1 AP-3 complex subunit delta-1. AP-3 complex subunit delta-1 (AP3D1) is part of the AP-3 complex, an adaptor-related complex which is not clathrin-associated. The complex is associated with the Golgi region as well as more peripheral structures. AP3D1 is required for efficient transport of VSV-G (vesicular stomatitis virus glycoprotein) from the trans-Golgi network to the cell surface. 157
49421 399401 pfam06376 AGP Arabinogalactan peptide. This entry represents the arabinogalactan peptide family found in plants. 35
49422 399402 pfam06377 Adipokin_hormo Adipokinetic hormone. This family consists of several insect adipokinetic hormone as well as the related crustacean red pigment concentrating hormone. Flight activity of insects comprises one of the most intense biochemical processes known in nature, and therefore provides an attractive model system to study the hormonal regulation of metabolism during physical exercise. In long-distance flying insects, such as the migratory locust, both carbohydrate and lipid reserves are utilized as fuels for sustained flight activity. The mobilization of these energy stores in Locusta migratoria is mediated by three structurally related adipokinetic hormones (AKHs), which are all capable of stimulating the release of both carbohydrates and lipids from the fat body. 51
49423 399403 pfam06378 DUF1071 Protein of unknown function (DUF1071). This family consists of several hypothetical bacterial and phage proteins of unknown function. 152
49424 115061 pfam06379 RhaT L-rhamnose-proton symport protein (RhaT). This family consists of several bacterial L-rhamnose-proton symport protein (RhaT) sequences. 344
49425 148156 pfam06380 DUF1072 Protein of unknown function (DUF1072). This family consists of several Barley yellow dwarf virus proteins of unknown function. 39
49426 399404 pfam06381 DUF1073 Protein of unknown function (DUF1073). This family consists of several hypothetical bacterial proteins. The function of this family is unknown. 355
49427 283927 pfam06382 DUF1074 Protein of unknown function (DUF1074). This family consists of several proteins which appear to be specific to Drosophila melanogaster. The function of this family is unknown. 125
49428 399405 pfam06384 ICAT Beta-catenin-interacting protein ICAT. This family consists of several eukaryotic beta-catenin-interacting (ICAT) proteins. Beta-catenin is a multifunctional protein involved in both cell adhesion and transcriptional activation. Transcription mediated by the beta-catenin/Tcf complex is involved in embryological development and is upregulated in various cancers. ICAT selectively inhibits beta-catenin/Tcf binding in vivo, without disrupting beta-catenin/cadherin interactions. 76
49429 283929 pfam06385 Baculo_LEF-11 Baculovirus LEF-11 protein. This family consists of several Baculovirus LEF-11 proteins. The exact function of this family is unknown although it has been shown that LEF-11 is required for viral DNA replication during the infection cycle. 93
49430 399406 pfam06386 GvpL_GvpF Gas vesicle synthesis protein GvpL/GvpF. This family consists of several bacterial and archaeal gas vesicle synthesis protein (GvpL/GvpF) sequences. The exact function of this family is unknown. 237
49431 399407 pfam06387 Calcyon D1 dopamine receptor-interacting protein (calcyon). This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca(2+) as well as cAMP-dependent signaling. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) and schizophrenia. 178
49432 399408 pfam06388 DUF1075 Protein of unknown function (DUF1075). This family consists of several eukaryotic proteins of unknown function. 124
49433 399409 pfam06389 Filo_VP24 Filovirus membrane-associated protein VP24. This family consists of several membrane-associated protein VP24 sequences from a variety of Ebola and Marburg viruses. The VP24 protein of Ebola virus is believed to be a secondary matrix protein and minor component of virions. VP24 possesses structural features commonly associated with viral matrix proteins and may have a role in virus assembly and budding. 227
49434 115071 pfam06390 NESP55 Neuroendocrine-specific golgi protein P55 (NESP55). This family consists of several mammalian neuroendocrine-specific golgi protein P55 (NESP55) sequences. NESP55 is a novel member of the chromogranin family and is a soluble, acidic, heat-stable secretory protein that is expressed exclusively in endocrine and nervous tissues, although less widely than chromogranins. 261
49435 399410 pfam06391 MAT1 CDK-activating kinase assembly factor MAT1. MAT1 is an assembly/targeting factor for cyclin-dependent kinase-activating kinase (CAK), which interacts with the transcription factor TFIIH. The domain found to the N-terminal side of this domain is a C3HC4 RING finger. 203
49436 368876 pfam06392 Asr Acid shock protein repeat. The Asr protein is synthesized as a precursor and the cleavage is essential for moderate to high acid tolerance. 22
49437 399411 pfam06393 BID BH3 interacting domain (BID). BID is a member of the BCL-2 superfamily of proteins, which are key regulators of programmed cell death, hence this family is related to pfam00452. BID is a pro-apoptotic member of the Bcl-2 superfamily and as such possesses the ability to target intracellular membranes and contains the BH3 death domain. The activity of BID is regulated by a Caspase 8-mediated cleavage event, exposing the BH3 domain and significantly changing the surface charge and hydrophobicity, which causes a change of cellular localization. 191
49438 368878 pfam06394 Pepsin-I3 Pepsin inhibitor-3-like repeated domain. Pepsin inhibitor-3 consists of two domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the active site flap region of pepsin. The two domains are tandem repeats of sequence, and each has therefore been termed a repeated domain. 74
49439 399412 pfam06395 CDC24 CDC24 Calponin. Is a calponin homology domain. 89
49440 399413 pfam06396 AGTRAP Angiotensin II, type I receptor-associated protein (AGTRAP). This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the carboxyl-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear. 146
49441 283940 pfam06397 Desulfoferrod_N Desulfoferrodoxin, N-terminal domain. Most members of this family are small (approximately 36 amino acids) proteins that form homodimeric complexes. Each subunit contains a high-spin iron atom tetrahedrally bound to four cysteinyl sulphur atoms. This family has a similar fold to the rubredoxin metal binding domain. It is also found as the N-terminal domain of desulfoferrodoxin, see (pfam01880). 36
49442 399414 pfam06398 Pex24p Integral peroxisomal membrane peroxin. Peroxisomes play diverse roles in the cell, compartmentalising many activities related to lipid metabolism and functioning in the decomposition of toxic hydrogen peroxide. Sequence similarity was identified between two hypothetical proteins and the peroxin integral membrane protein Pex24p. 369
49443 399415 pfam06399 GFRP GTP cyclohydrolase I feedback regulatory protein (GFRP). Tetrahydrobiopterin, the cofactor required for hydroxylation of aromatic amino acids, regulates its own synthesis via feedback inhibition of GTP cyclohydrolase I. This mechanism is mediated by the regulatory subunit called GTP cyclohydrolase I feedback regulatory protein (GFRP). 81
49444 399416 pfam06400 Alpha-2-MRAP_N Alpha-2-macroglobulin RAP, N-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. The N-terminal domain is predominately alpha helical. Two different studies have provided conflicting domain boundaries. 117
49445 399417 pfam06401 Alpha-2-MRAP_C Alpha-2-macroglobulin RAP, C-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. Two different studies have provided conflicting domain boundaries. 209
49446 399418 pfam06403 Lamprin Lamprin. This family consists of several lamprin proteins from the Sea lamprey Petromyzon marinus. Lamprin, an insoluble non-collagen, non-elastin protein, is the major connective tissue component of the fibrillar extracellular matrix of lamprey annular cartilage. Although not generally homologous to any other protein, soluble lamprins contain a tandemly repeated peptide sequence (GGLGY) which is present in both silkmoth chorion proteins and spider dragline silk. Strong homologies to this repeat sequence are also present in several mammalian and avian elastins. It is thought that these proteins share a structural motif which promotes self-aggregation and fibril formation in proteins through interdigitation of hydrophobic side chains in beta-sheet/beta-turn structures, a motif that has been preserved in recognisable form over several hundred million years of evolution. 91
49447 399419 pfam06404 PSK Phytosulfokine precursor protein (PSK). This family consists of several plant specific phytosulfokine precursor proteins. Phytosulfokines are active as either a pentapeptide or a C-terminally truncated tetrapeptide. These compounds were first isolated because of their ability to stimulate cell division in somatic embryo cultures of Asparagus officinalis. 89
49448 399420 pfam06405 RCC_reductase Red chlorophyll catabolite reductase (RCC reductase). This family consists of several red chlorophyll catabolite reductase (RCC reductase) proteins. Red chlorophyll catabolite (RCC) reductase (RCCR) and pheophorbide (Pheide) a oxygenase (PaO) catalyze the key reaction of chlorophyll catabolism, porphyrin macrocycle cleavage of Pheide a to a primary fluorescent catabolite (pFCC). 253
49449 310773 pfam06406 StbA StbA protein. This family consists of several bacterial StbA plasmid stability proteins. 317
49450 399421 pfam06407 BDV_P40 Borna disease virus P40 protein. This family consists of several Borna disease virus P40 proteins. Borna disease (BD) is a persistent viral infection of the central nervous system caused by the single-negative-strand, nonsegmented RNA Borna disease virus (BDV). P40 is known to be a nucleoprotein. 348
49451 399422 pfam06409 NPIP Nuclear pore complex interacting protein (NPIP). This family consists of a series of primate specific nuclear pore complex interacting protein (NPIP) sequences. The function of this family is unknown but is well conserved from African apes to humans. 262
49452 399423 pfam06411 HdeA HdeA/HdeB family. HdeA (hns-dependent expression protein A) is a single domain alpha-helical protein localized in the periplasmic space. HdeA is involved in acid resistance essential for infectivity of enteric bacterial pathogens. Functional studies demonstrate that HdeA is activated by a dimer-to-monomer transition at acidic pH, leading to suppression of aggregation by acid-denatured proteins. The gene encoding HdeA was initially identified as part of an operon regulated by the nucleoid protein H-NS. This family also contains HdeB. 92
49453 399424 pfam06412 TraD Conjugal transfer protein TraD. This family contains bacterial TraD conjugal transfer proteins. Mutations in the TraD gene result in loss of transfer. 61
49454 399425 pfam06413 Neugrin Neugrin. This family consists of several mouse and human neugrin proteins. Neugrin and m-neugrin are mainly expressed in neurons in the nervous system, and are thought to play an important role in the process of neuronal differentiation. 225
49455 399426 pfam06414 Zeta_toxin Zeta toxin. This family consists of several bacterial zeta toxin proteins. Zeta toxin is thought to be part of a postregulational killing system in bacteria. It relies on antitoxin/toxin systems that secure stable inheritance of low and medium copy number plasmids during cell division and kill cells that have lost the plasmid. 192
49456 399427 pfam06415 iPGM_N BPG-independent PGAM N-terminus (iPGM_N). This family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (EC:5.4.2.1). The family is found in conjunction with pfam01676 (located in the C-terminal region of the protein). 217
49457 368893 pfam06416 T3SS_NleG Effector protein NleG. Many bacterial pathogens deliver effector proteins into host cells via a type III secretion system. These effector proteins then alter the host cell's biology in ways that are advantageous to the pathogen. The NleG protein and its homologs form the largest family of effector proteins in the enterohemorrhagic Escherichia coli O157:H7, with 14 members identified in the Sakai strain alone. 113
49458 399428 pfam06417 DUF1077 Protein of unknown function (DUF1077). This family consists of several hypothetical eukaryotic proteins of unknown function. 118
49459 399429 pfam06418 CTP_synth_N CTP synthase N-terminus. This family consists of the N-terminal region of the CTP synthase protein (EC:6.3.4.2). This family is found in conjunction with pfam00117 located in the C-terminal region of the protein. CTP synthase catalyzes the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position. 265
49460 399430 pfam06419 COG6 Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localization. 612
49461 399431 pfam06420 Mgm101p Mitochondrial genome maintenance MGM101. The mgm101 gene was identified as essential for maintenance of the mitochondrial genome in Saccharomyces cerevisiae. Based on its DNA-binding activity, and experimental work with a temperature-sensitive mgm101 mutant, it has been proposed that the mgm101 gene product performs an essential function in the repair of oxidatively damaged mitochondrial DNA. 170
49462 399432 pfam06421 LepA_C GTP-binding protein LepA C-terminus. This family consists of the C-terminal region of several pro- and eukaryotic GTP-binding LepA proteins. 107
49463 399433 pfam06422 PDR_CDR CDR ABC transporter. Corresponds to a region of the PDR/CDR subgroup of ABC transporters comprising extracellular loop 3, transmembrane segment 6 and linker region. 92
49464 399434 pfam06423 GWT1 GWT1. Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast. 140
49465 399435 pfam06424 PRP1_N PRP1 splicing factor, N-terminal. This domain is specific to the N-terminal part of the prp1 splicing factor, which is involved in mRNA splicing (and possibly also poly(A)+ RNA nuclear export and cell cycle progression). 109
49466 399436 pfam06426 SATase_N Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria. 104
49467 399437 pfam06427 UDP-g_GGTase UDP-glucose:Glycoprotein Glucosyltransferase. UDP-g_GGTase is an important, central component of the QC system in the ER for checking that glycoproteins are folded correctly. This QC prevents incorrectly folded glycoproteins from leaving the ER. 109
49468 368902 pfam06428 Sec2p GDP/GTP exchange factor Sec2p. In Saccharomyces cerevisiae, Sec2p is a GDP/GTP exchange factor for Sec4p, which is required for vesicular transport at the post-Golgi stage of yeast secretion. 92
49469 377656 pfam06429 Flg_bbr_C Flagellar basal body rod FlgEFG protein C-terminal. This family consists of a number of C-terminal domains of unknown function. This domain seems to be specific to flagellar basal-body rod and flagellar hook proteins in which pfam00460 is often present at the extreme N-terminus. 74
49470 399438 pfam06430 L_lactis_RepB_C Lactococcus lactis RepB C-terminus. This family consists of the C-terminal region of RepB proteins from Lactococcus lactis (See pfam01051). 122
49471 283968 pfam06431 Polyoma_lg_T_C Polyomavirus large T antigen C-terminus. 417
49472 399439 pfam06432 GPI2 Phosphatidylinositol N-acetylglucosaminyltransferase. Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves products of three or four genes in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively. 267
49473 399440 pfam06433 Me-amine-dh_H Methylamine dehydrogenase heavy chain (MADH). Methylamine dehydrogenase (EC:1.4.99.3) is a periplasmic quinoprotein found in several methylotrophic bacteria. Induced when the bacteria are grown on methylamine as a carbon source, MADH catalyzes the oxidative deamination of amines to their corresponding aldehydes. MADH is a heterotetramer, comprised of two heavy chains (H) and two light chains (L). The H-chain forms a beta-propeller like structure. 343
49474 399441 pfam06434 Aconitase_2_N Aconitate hydratase 2 N-terminus. This family represents the N-terminal region of several bacterial Aconitate hydratase 2 proteins and is found in conjunction with pfam00330. 204
49475 283972 pfam06435 DUF1079 Repeat of unknown function (DUF1079). This family consists of several repeats of 31 residues in length and seems to be exclusive to Moraxella catarrhalis UspA proteins. The UspA1 and UspA2 proteins of Moraxella catarrhalis are structurally related and are exposed on the bacterial cell surface where they can function as adhesins. This family is commonly found with the pfam03895 family. 31
49476 399442 pfam06436 Pneumovirus_M2 Pneumovirus matrix protein 2 (M2). This family consists of several Pneumovirus matrix glycoprotein M2 sequences. This family functions as a transcription processivity factor that is essential for virus replication. 155
49477 399443 pfam06437 ISN1 IMP-specific 5'-nucleotidase. The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5'-nucleotidase, which catalyzes degradation of IMP to inosine as part of the purine salvage pathway. 407
49478 399444 pfam06438 HasA Heme-binding protein A (HasA). Free iron is limited in vertebrate hosts, thus an alternative to siderophores has been developed by pathogenic bacteria to access host iron bound in protein complexes. HasA is a secreted hemophore that has the ability to obtain iron from hemoglobin. Once bound to HasA, the heme is shuttled to the receptor HasR, which releases the heme into the bacterium. 184
49479 399445 pfam06439 DUF1080 Domain of Unknown Function (DUF1080). This family has structural similarity to an endo-1,3-1,4-beta glucanase belonging to glycoside hydrolase family 16. However, the structure surrounding the active site differs from that of the endo-1,3-1,4-beta glucanase. 182
49480 399446 pfam06440 DNA_pol3_theta DNA polymerase III, theta subunit. DNA polymerase III (EC 2.7.7.7) is comprised of three tightly associated subunits, alpha, epsilon and theta. This family contains the theta subunit. The structure of the theta subunit shows that the N-terminal two thirds is comprised of three helices while the C-terminal third is disordered. The function of the theta subunit is poorly understood, but the interaction of the theta subunit with the epsilon subunit is thought to enhance the 3' to 5' exonucleolytic proofreading activity of epsilon. 68
49481 399447 pfam06441 EHN Epoxide hydrolase N-terminus. This family represents the N-terminal region of the eukaryotic epoxide hydrolase protein. Epoxide hydrolases (EC:3.3.2.3) comprise a group of functionally related enzymes that catalyze the addition of water to oxirane compounds (epoxides), thereby usually generating vicinal trans-diols. EHs have been found in all types of living organisms, including mammals, invertebrates, plants, fungi and bacteria. In animals, the major interest in EH is directed towards their detoxification capacity for epoxides since they are important safeguards against the cytotoxic and genotoxic potential of oxirane derivatives that are often reactive electrophiles because of the high tension of the three-membered ring system and the strong polarization of the C--O bonds. This is of significant relevance because epoxides are frequent intermediary metabolites which arise during the biotransformation of foreign compounds. This family is often found in conjunction with pfam00561. 106
49482 368910 pfam06442 DHFR_2 R67 dihydrofolate reductase. R67 dihydrofolate reductase is a plasmid encoded enzyme that provides resistance to the antibacterial drug trimethoprim. The R67 dihydrofolate reductase does not share significant similarity to the chromosomal encoded dihydrofolate reductase. 78
49483 115120 pfam06443 SEF14_adhesin SEF14-like adhesin. Family of enterotoxigenic bacterial adhesins. 165
49484 368911 pfam06444 NADH_dehy_S2_C NADH dehydrogenase subunit 2 C-terminus. This family consists of the C-terminal region specific to the eukaryotic NADH dehydrogenase subunit 2 protein and is found in conjunction with pfam00361. 51
49485 399448 pfam06445 GyrI-like GyrI-like small molecule binding domain. This family contains the small molecule binding domain of a number of different bacterial transcription activators. This family also contains DNA gyrase inhibitors. The GyrI superfamily contains a diad of the SHS2 module, adapted for small-molecule binding. The GyrI superfamily includes a family of secreted forms that is found only in animals and the bacterial pathogen Leptospira. 153
49486 399449 pfam06446 Hepcidin Hepcidin. Hepcidin is an antibacterial and antifungal protein expressed in the liver and is also a signaling molecule in iron metabolism. The hepcidin protein is cysteine-rich and forms a distorted beta-sheet with an unusual disulphide bond found at the turn of the hairpin. 53
49487 399450 pfam06448 DUF1081 Domain of Unknown Function (DUF1081). This region is found in Apolipophorin proteins. 103
49488 368914 pfam06449 DUF1082 Mitochondrial domain of unknown function (DUF1082). This family consists of the C-terminal region of several plant mitochondria specific proteins. The function of this family is unknown. This family is found in conjunction with pfam02326. 51
49489 283984 pfam06450 NhaB Bacterial Na+/H+ antiporter B (NhaB). This family consists of several bacterial Na+/H+ antiporter B (NhaB) proteins. The exact function of this family is unknown. 515
49490 368915 pfam06451 Moricin Moricin. Moricin is an antibacterial peptide that is highly basic. The structure of moricin reveals that it is comprised of a long alpha-helix. The N-terminus of the helix is amphipathic, and the C-terminus of the helix is predominately hydrophobic. The amphipathic N-terminal segment of the alpha-helix is mainly responsible for the increase in permeability of the bacterial membrane which kills the bacteria. 41
49491 399451 pfam06452 CBM9_1 Carbohydrate family 9 binding domain-like. CBM9_1 is a C-terminal domain on bacterial xylanase proteins, and it is tandemly repeated in a number of family-members. The CBM9 module binds to amorphous and crystalline cellulose and a range of soluble di- and monosaccharides as well as to cello- and xylo-oligomers of different degrees of polymerization. Comparison of the glucose and cellobiose complexes during crystallisation reveals surprising differences in binding of these two substrates by CBM9-2. Cellobiose was found to bind in a distinct orientation from glucose, while still maintaining optimal stacking and electrostatic interactions with the reducing end sugar. 182
49492 115129 pfam06453 LT-IIB Type II heat-labile enterotoxin, B subunit (LT-IIB). Family of B subunits from the type II heat-labile enterotoxin. The B subunits form a pentameric ring, which interacts with one A subunit. Thus, the structural arrangement of type I and type II heat-labile enterotoxins are very similar. 122
49493 399452 pfam06454 DUF1084 Protein of unknown function (DUF1084). This family consists of several hypothetical plant specific proteins of unknown function. 271
49494 368917 pfam06455 NADH5_C NADH dehydrogenase subunit 5 C-terminus. This family represents the C-terminal region of several NADH dehydrogenase subunit 5 proteins and is found in conjunction with pfam00361 and pfam00662. 181
49495 399453 pfam06456 Arfaptin Arfaptin-like domain. Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin. 207
49496 115133 pfam06457 Ectatomin Ectatomin. Ectatomin is a toxic component from the Ectatomma tuberculatum ant venom. It is comprised of two subunits, A and B, which are homologous. The structure of ectatomin reveals that each subunit is comprised of two helices and a connecting hinge region that forms a hairpin structure that is stabilized by disulphide bridges. The two hinges are connected by a disulphide bond. 34
49497 399454 pfam06458 MucBP MucBP domain. The MucBP (MUCin-Binding Protein) domain is found in a wide variety of bacterial proteins, in several repeats. The domain is found in bacterial peptidoglycan bound proteins and is often found in conjunction with pfam00746 and pfam00560. 61
49498 399455 pfam06459 RR_TM4-6 Ryanodine Receptor TM 4-6. This region covers TM regions 4-6 of the ryanodine receptor 1 family. 280
49499 399456 pfam06460 NSP13 Coronavirus NSP13. This family covers the NSP13 region of the coronavirus polyprotein. This protein is predicted to function as an mRNA cap-1 methyltransferase. 297
49500 368921 pfam06461 DUF1086 Domain of Unknown Function (DUF1086). This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with pfam00176, pfam00271, pfam06465, pfam00385 and pfam00628. 138
49501 399457 pfam06462 Hyd_WA Propeller. Probable beta-propeller. 30
49502 399458 pfam06463 Mob_synth_C Molybdenum Cofactor Synthesis C. This region contains two iron-sulphur (3Fe-4S) binding sites. Mutations in this region of human MOCS1 cause MOCOD (Molybdenum Co-Factor Deficiency) type A. 127
49503 368923 pfam06464 DMAP_binding DMAP1-binding Domain. This domain binds DMAP1, a transcriptional co-repressor. 104
49504 399459 pfam06465 DUF1087 Domain of Unknown Function (DUF1087). Members of this family are found in various chromatin remodelling factors and transposases. Their exact function is, as yet, unknown. 60
49505 399460 pfam06466 PCAF_N PCAF (P300/CBP-associated factor) N-terminal domain. This region is spliced out of human KAT2A isoform 2. It is predicted to be of a mixed alpha/beta fold - though predominantly helical. 249
49506 399461 pfam06467 zf-FCS MYM-type Zinc finger with FCS sequence motif. MYM-type zinc fingers were identified in MYM family proteins. Human protein ZMYM3 is involved in a chromosomal translocation and may be responsible for X-linked retardation in XQ13.1. ZMYM2 is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1; in atypical myeloproliferative disorders it is rearranged. Members of the family generally are involved in development. This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses, NCLDV. This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons. 40
49507 399462 pfam06468 Spond_N Spondin_N. This conserved region is found in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells. 185
49508 399463 pfam06469 DUF1088 Domain of Unknown Function (DUF1088). This family is found in the neurobeachins. The function of this region is not known. 168
49509 399464 pfam06470 SMC_hinge SMC proteins Flexible Hinge Domain. This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. 115
49510 399465 pfam06471 NSP11 NSP11. This region of coronavirus polyproteins encodes the NSP11 protein. 515
49511 399466 pfam06472 ABC_membrane_2 ABC transporter transmembrane region 2. This domain covers the transmembrane of a small family of ABC transporters and shares sequence similarity with pfam00664. Mutations in this domain in ABCD3 are believed responsible for Zellweger Syndrome-2; mutations in ABCD1 are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids. 269
49512 399467 pfam06473 FGF-BP1 FGF binding protein 1 (FGF-BP1). This family consists of several mammalian FGF binding protein 1 sequences. Fibroblast growth factors (FGFs) play important roles during fetal and embryonic development. Fibroblast growth factor-binding protein (FGF-BP) 1 is a secreted protein that can bind fibroblast growth factors (FGFs) 1 and 2. 226
49513 399468 pfam06474 MLTD_N MltD lipid attachment motif. This short motif is a lipid attachment site. 34
49514 399469 pfam06475 Glycolipid_bind Putative glycolipid-binding. This family has a novel fold known as a spiral beta-roll, consisting of a 15-stranded beta sheet wrapped around a single alpha helix. It forms dimers. It has some structural similarity to the E. coli lipoprotein localization factors LolA and LolB. Its structure suggests that it may have a role in glycolipid binding. Its genomic context supports a role in glycolipid metabolism. 178
49515 399470 pfam06476 DUF1090 Protein of unknown function (DUF1090). This family consists of several bacterial proteins of unknown function and is known as YqjC in E. coli. 106
49516 399471 pfam06477 DUF1091 Protein of unknown function (DUF1091). This is a family of uncharacterized proteins. Based on its distant similarity to pfam02221 and conserved pattern of cysteine residues it is possible that these domains are also lipid binding. 83
49517 399472 pfam06478 Corona_RPol_N Coronavirus RPol N-terminus. This family covers the N-terminal region of the coronavirus RNA-directed RNA Polymerase. 353
49518 399473 pfam06479 Ribonuc_2-5A Ribonuclease 2-5A. This domain is an endoribonuclease. Specifically it cleaves an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated. 127
49519 377663 pfam06480 FtsH_ext FtsH Extracellular. This domain is found in the FtsH family of proteins. FtsH is the only membrane-bound ATP-dependent protease universally conserved in prokaryotes. It only efficiently degrades proteins that have a low thermodynamic stability - e.g. it lacks robust unfoldase activity. This feature may be key and implies that this could be a criterion for degrading a protein. In Oenococcus oeni FtsH is involved in protection against environmental stress, and shows increased expression under heat or osmotic stress. These two lines of evidence suggest that it is a fundamental prokaryotic self-protection mechanism that checks if proteins are correctly folded (personal obs: Yeats C). The precise function of this N-terminal region is unclear. 103
49520 399474 pfam06481 COX_ARM COX Aromatic Rich Motif. COX2 (Cytochrome O ubiquinol OXidase 2) is a major component of the respiratory complex during vegetative growth. It transfers electrons from a quinol to the binuclear centre of the catalytic subunit 1. The function of this region is not known. 46
49521 399475 pfam06482 Endostatin Collagenase NC10 and Endostatin. NC10 stands for Non-helical region 10 and is taken from COL15A1. A mutation in this region in COL18A1 is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumor suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor. Endostatin also binds a zinc ion near the N-terminus; this is likely to be of structural rather than functional importance. 222
49522 368936 pfam06483 ChiC Chitinase C. This ~170 aa region is found at the C-terminus of pfam00704. 174
49523 399476 pfam06484 Ten_N Teneurin Intracellular Region. This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out some transcriptional regulatory activity. The length of this region and the conservation suggests that there may be two structural domains here (personal obs:C Yeats). 367
49524 399477 pfam06485 DUF1092 Protein of unknown function (DUF1092). This family consists of several hypothetical proteins of unknown function all from photosynthetic organisms including plants and cyanobacteria. 268
49525 399478 pfam06486 DUF1093 Protein of unknown function (DUF1093). This family consists of several hypothetical bacterial proteins of unknown function. 81
49526 399479 pfam06487 SAP18 Sin3 associated polypeptide p18 (SAP18). This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death. 123
49527 115164 pfam06488 L_lac_phage_MSP Phage tail tube protein. This is a family of Siphoviridae phage tail tube proteins including several from Lactococcus lactis. 301
49528 399480 pfam06489 Orthopox_A49R Orthopoxvirus A49R protein. This family consists of several Orthopoxvirus A49R proteins. The function of this family is unknown. 150
49529 377666 pfam06490 FleQ Flagellar regulatory protein FleQ. This domain is found at the N-terminus of a subset of sigma54-dependent transcriptional activators that are involved in regulation of flagellar motility e.g. FleQ in Pseudomonas aeruginosa. It is clearly related to pfam00072, but lacks the conserved aspartate residue that undergoes phosphorylation in the classic two-component system response regulator (pfam00072). 108
49530 399481 pfam06491 Disulph_isomer Disulphide isomerase. This family of proteins has disulphide isomerase activity, EC:5.3.4.1. It has a similar fold to thioredoxin, with an alpha-beta-alpha-beta-alpha-beta-beta-alpha topology. It has a conserved CGC motif in the loop immediately downstream of the first beta strand. This motif is essential for activity. 136
49531 368941 pfam06493 DUF1096 Protein of unknown function (DUF1096). This family represents the N-terminal region of several proteins found in C. elegans. The family is often found with pfam02363. 52
49532 253769 pfam06495 Transformer Fruit fly transformer protein. This family consists of transformer proteins from several Drosophila species and also from Ceratitis capitata (Mediterranean fruit fly). The transformer locus (tra) produces an RNA processing protein that alternatively splices the doublesex pre-mRNA in the sex determination hierarchy of Drosophila melanogaster. 182
49533 399482 pfam06496 DUF1097 Protein of unknown function (DUF1097). This family consists of several bacterial putative membrane proteins. 139
49534 284024 pfam06497 DUF1098 Protein of unknown function (DUF1098). This family consists of several hypothetical Baculovirus proteins of unknown function. 99
49535 399483 pfam06500 DUF1100 Alpha/beta hydrolase of unknown function (DUF1100). This family consists of several hypothetical bacterial proteins of unknown function. Members of this family have an alpha/beta hydrolase fold. 410
49536 253772 pfam06501 Herpes_U55 Human herpesvirus U55 protein. This family consists of several human herpesvirus U55 proteins. The function of this family is unknown. 432
49537 115174 pfam06502 Equine_IAV_S2 Equine infectious anaemia virus S2 protein. This family consists of several equine infectious anaemia virus S2 proteins. The function of this family is unknown. 67
49538 284026 pfam06503 DUF1101 Protein of unknown function (DUF1101). This family consists of several hypothetical Fijivirus proteins of unknown function. 360
49539 399484 pfam06504 RepC Replication protein C (RepC). This family consists of several bacterial replication protein C (RepC) sequences. 273
49540 399485 pfam06505 XylR_N Activator of aromatic catabolism. This domain is found at the N-terminus of a subset of sigma54-dependent transcriptional activators in several proteobacteria, including activators of phenol degradation such as XylR. It is found adjacent to pfam02830. 100
49541 399486 pfam06506 PrpR_N Propionate catabolism activator. This domain is found at the N-terminus of several sigma54-dependent transcriptional activators including PrpR, which activates catabolism of propionate. 165
49542 399487 pfam06507 Auxin_resp Auxin response factor. A conserved region of auxin-responsive transcription factors. 83
49543 284031 pfam06508 QueC Queuosine biosynthesis protein QueC. This family of proteins participates in the biosynthesis of 7-carboxy-7-deazaguanine. They catalyze the conversion of 7-deaza-7-carboxyguanine to preQ0. 211
49544 368946 pfam06510 DUF1102 Protein of unknown function (DUF1102). This family consists of several hypothetical archaeal proteins of unknown function. 141
49545 399488 pfam06511 IpaD Invasion plasmid antigen IpaD. This family consists of several invasion plasmid antigen IpaD proteins. Entry of Shigella flexneri into epithelial cells and lysis of the phagosome involve the IpaB, IpaC, and IpaD proteins, which are secreted by type III secretion machinery. 355
49546 399489 pfam06512 Na_trans_assoc Sodium ion transport-associated. Members of this family contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of pfam00520 and, less often, one copy of pfam00612. 201
49547 115185 pfam06513 DUF1103 Repeat of unknown function (DUF1103). This family consists of several repeats of around 30 residues in length which are found specifically in mature-parasite-infected erythrocyte surface antigen proteins from Plasmodium falciparum. This family is often found in conjunction with pfam00226. 215
49548 284035 pfam06514 PsbU Photosystem II 12 kDa extrinsic protein (PsbU). This family consists of several photosystem II 12 kDa extrinsic protein (PsbU) proteins from cyanobacteria and algae. PsbU is an extrinsic protein of the photosystem II complex of cyanobacteria and red algae. PsbU is known to stabilize the oxygen-evolving machinery of the photosystem II complex against heat-induced inactivation. This family appears to be related to the Helix-hairpin-helix domain. 93
49549 115187 pfam06515 BDV_P10 Borna disease virus P10 protein. This family consists of several Borna disease virus P10 (or X) proteins. Borna disease virus (BDV) is unique among the non-segmented negative-strand RNA viruses of animals and man because it transcribes and replicates its genome in the nucleus of the infected cell. It has been suggested that the p10 protein plays a role in viral RNA synthesis or ribonucleoprotein transport. 87
49550 399490 pfam06516 NUP Purine nucleoside permease (NUP). This family consists of several purine nucleoside permeases from both bacteria and fungi. 304
49551 284037 pfam06517 Orthopox_A43R Orthopoxvirus A43R protein. This family consists of several Orthopoxvirus A43R proteins. The function of this family is unknown. 195
49552 399491 pfam06518 DUF1104 Protein of unknown function (DUF1104). This family consists of several hypothetical proteins of unknown function which appear to be found largely in Helicobacter pylori. 83
49553 399492 pfam06519 TolA TolA C-terminal. This family consists of several bacterial TolA proteins as well as two eukaryotic proteins of unknown function. Tol proteins are involved in the translocation of group A colicins. Colicins are bacterial protein toxins, which are active against Escherichia coli and other related species (See pfam01024). TolA is anchored to the cytoplasmic membrane by a single membrane spanning segment near the N-terminus, leaving most of the protein exposed to the periplasm. 94
49554 399493 pfam06521 PAR1 PAR1 protein. This family consists of several plant specific PAR1 proteins from Nicotiana tabacum and Arabidopsis thaliana. The function of this family is unknown. 156
49555 399494 pfam06522 B12D NADH-ubiquinone reductase complex 1 MLRQ subunit. The MLRQ subunit of mitochondrial NADH-ubiquinone reductase complex I is nuclear and is found in plants, insects, fungi and higher metazoans. It appears to act within the membrane and, in mammals, is highly expressed in muscle and neural tissue, indicative of a role in ATP generation. 69
49556 115195 pfam06523 DUF1106 Protein of unknown function (DUF1106). This family consists of several hypothetical bacterial proteins found in Escherichia coli and Citrobacter rodentium. The function of this family is unknown. 91
49557 368953 pfam06524 NOA36 NOA36 protein. This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown. 306
49558 310845 pfam06525 SoxE Sulfocyanin (SoxE) domain. This family consists of several archaeal sulfocyanin (or blue copper protein) sequences from a number of Sulfolobus species. 149
49559 399495 pfam06526 DUF1107 Protein of unknown function (DUF1107). This family consists of several short, hypothetical bacterial proteins of unknown function. 63
49560 399496 pfam06527 TniQ TniQ. This family consists of several bacterial TniQ proteins. TniQ along with TniA and B is involved in the transposition of the mercury-resistance transposon Tn5053 which carries the mer operon. It has been suggested that the tni genes are involved in the dissemination of integrons. 142
49561 399497 pfam06528 Phage_P2_GpE Phage P2 GpE. This family consists of several phage and bacterial proteins which are closely related to the GpE tail protein from Phage P2. 37
49562 368956 pfam06529 Vert_IL3-reg_TF Vertebrate interleukin-3 regulated transcription factor. This family includes vertebrate transcription factors, some of which are regulated by IL-3/adenovirus E4 promoter binding protein. Others were found to strongly repress transcription in a DNA-binding-site-dependent manner. 332
49563 399498 pfam06530 Phage_antitermQ Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. Antiterminator proteins control gene expression by recognising control signals near the promoter and preventing transcriptional termination which would otherwise occur at sites that may be a long way downstream. 118
49564 368958 pfam06531 DUF1108 Protein of unknown function (DUF1108). This family consists of several bacterial proteins from Staphylococcus aureus as well as a number of phage proteins. The function of this family is unknown. 84
49565 399499 pfam06532 DUF1109 Protein of unknown function (DUF1109). This family consists of several hypothetical bacterial proteins of unknown function. 204
49566 399500 pfam06533 DUF1110 Protein of unknown function (DUF1110). This family consists of hypothetical proteins specific to Oryza sativa. One sequence appears to be tandemly repeated. 189
49567 399501 pfam06534 RGM_C Repulsive guidance molecule (RGM) C-terminus. This family consists of several mammalian and one bird sequence from Gallus gallus (Chicken). This family represents the C-terminal region of several sequences but in others it represents the full protein. All of the mammalian proteins are hypothetical and have no known function but RGMA from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum. 171
49568 399502 pfam06535 RGM_N Repulsive guidance molecule (RGM) N-terminus. This family consists of the N-terminal region of several mammalian and one bird sequence from Gallus gallus (Chicken). All of the mammalian proteins are hypothetical and have no known function but RGMA from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum. 166
49569 368962 pfam06536 Av_adeno_fibre Avian adenovirus fibre, N-terminal. This family is the N-terminal region of avian adenovirus fibre proteins; the domain is frequently found repeated several times along the fibre. These fibers have been linked to variations in virulence. Avian adenoviruses possess penton capsomers that consist of a pentameric base associated with two fibers. 56
49570 399503 pfam06537 DHOR Di-haem oxidoreductase, putative peroxidase. DHOR is a family of di-haem oxidoreductases. It carries the two characteristic Cys-X-Y-Cys-His haem-binding motifs. The C-terminal high-potential site functions as an electron transfer centre, and the N-terminal low-potential site corresponds to the peroxidatic centre. Its probable function is as a peroxidase. 486
49571 399504 pfam06540 GMAP Galanin message associated peptide (GMAP). This family consists of several galanin message associated peptides. In rat preprogalanin, galanin is C-terminally flanked by a 60 amino acid long peptide: galanin message-associated peptide (GMAP). GMAP sequences in different species show high degree of homology, but the biological function of this family is unknown. 58
49572 399505 pfam06541 ABC_trans_CmpB Putative ABC-transporter type IV. CmpB is a family of membrane proteins that are likely to be part of a two-component type IV ABC-transporter system. Families can transport multiple drugs including ethidium and fluoroquinolones. UniProtKB:Q83XH0 is a member of TCDB family 3.A.1.121.4. 149
49573 368965 pfam06542 PHA-1 Regulator protein PHA-1. This family represents the protein product of the gene pha-1 which coordinates with lin-35 Rb during animal development. The protein is expressed during embryonic development and functions in the cytoplasm. PHA-1 acts in a parallel pathway with UBC-18 to regulate the activity of a common cellular target. 403
49574 284059 pfam06543 Lac_bphage_repr Lactococcus bacteriophage repressor. This family represents the C-terminus of Lactococcus bacteriophage repressor proteins. 49
49575 399506 pfam06544 DUF1115 Protein of unknown function (DUF1115). This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. 133
49576 399507 pfam06545 DUF1116 Protein of unknown function (DUF1116). This family contains hypothetical bacterial proteins of unknown function. 215
49577 399508 pfam06546 Vert_HS_TF Vertebrate heat shock transcription factor. This family represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell's recovery after the removal of stress. This C-terminal region is found with the N-terminal pfam00447, and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response. 269
49578 399509 pfam06547 DUF1117 Protein of unknown function (DUF1117). This family represents the C-terminus of a number of hypothetical plant proteins. 110
49579 368969 pfam06549 DUF1118 Protein of unknown function (DUF1118). This family consists of several hypothetical plant proteins of unknown function. 115
49580 284066 pfam06550 SPP Signal-peptide peptidase, presenilin aspartyl protease. SPP is a family of signal-peptide aspartyl proteases. The family carries the characteristic catalytic aspartate GXGD motif, and members are integral membrane peptidases of the presenilin-type with nine transmembrane regions. UniProtKB:Q18K19 is part of the TCDB family 1.A.54.3.4, the presenilin ER Ca(2+) leak channel (presenilin). 283
49581 368970 pfam06551 DUF1120 Protein of unknown function (DUF1120). This family consists of several hypothetical bacterial proteins of unknown function. 116
49582 284068 pfam06552 TOM20_plant Plant specific mitochondrial import receptor subunit TOM20. This family consists of several plant specific mitochondrial import receptor subunit TOM20 (translocase of outer membrane 20 kDa subunit) proteins. Most mitochondrial proteins are encoded by the nuclear genome, and are synthesized in the cytosol. TOM20 is a general import receptor that binds to mitochondrial pre-sequences in the early step of protein import into the mitochondria. 187
49583 399510 pfam06553 BNIP3 BNIP3. This family consists of several mammalian specific BCL2/adenovirus E1B 19-kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localization, homodimerization, as well as regulation of their pro-apoptotic activities. BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterized by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation. 184
49584 399511 pfam06554 Olfactory_mark Olfactory marker protein. This family consists of several olfactory marker proteins. Expression of the olfactory marker protein (OMP) is highly restricted to mature olfactory receptor neurons in virtually all vertebrate species from fish to man. 149
49585 284071 pfam06556 ASFV_p27 IAP-like protein p27 C-terminus. This family represents the C-terminal region of the African swine fever virus IAP-like protein p27. This family is found in conjunction with pfam00653. It has been suggested that the family may be a host range gene involved in aspects of infection in the arthropod host, ticks of the genus Ornithodoros. 131
49586 399512 pfam06557 DUF1122 Protein of unknown function (DUF1122). This family consists of several hypothetical archaeal and bacterial proteins of unknown function. 162
49587 399513 pfam06558 SecM Secretion monitor precursor protein (SecM). This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling. 146
49588 399514 pfam06559 DCD 2'-deoxycytidine 5'-triphosphate deaminase (DCD). This family consists of several bacterial 2'-deoxycytidine 5'-triphosphate deaminase proteins (EC:3.5.4.13). 360
49589 284075 pfam06560 GPI Glucose-6-phosphate isomerase (GPI). This family consists of several bacterial and archaeal glucose-6-phosphate isomerase (GPI) proteins (EC:5.3.1.9). 177
49590 284076 pfam06563 DUF1125 Protein of unknown function (DUF1125). This family consists of several short Lactococcus lactis and bacteriophage proteins. The function of this family is unknown. 55
49591 399515 pfam06564 CBP_BcsQ Cellulose biosynthesis protein BcsQ. This is a family of bacterial proteins involved in cellulose biosynthesis. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). A second component of the extracellular matrix of the multicellular morphotype (rdar) of Salmonella typhimurium and Escherichia coli is cellulose. The family does contain a P-loop sequence motif suggesting a nucleotide binding function, but this has not been confirmed. 234
49592 399516 pfam06565 DUF1126 DUF1126 PH-like domain. The structure of this domain shows that it has a PH-like fold. 105
49593 368978 pfam06566 Chon_Sulph_att Chondroitin sulphate attachment domain. This family represents the chondroitin sulphate attachment domain of vertebrate neural transmembrane proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains several potential sites of chondroitin sulphate attachment, as well as potential sites of N-linked glycosylation. 249
49594 368979 pfam06567 Neural_ProG_Cyt Neural chondroitin sulphate proteoglycan cytoplasmic domain. This family represents the C-terminal cytoplasmic domain of vertebrate neural chondroitin sulphate proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains a number of potential sites of phosphorylation by protein kinase C. 120
49595 399517 pfam06568 DUF1127 Domain of unknown function (DUF1127). This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region whereas in others it represents the whole sequence. 33
49596 399518 pfam06569 DUF1128 Protein of unknown function (DUF1128). This family consists of several short, hypothetical bacterial proteins of unknown function. 71
49597 399519 pfam06570 DUF1129 Protein of unknown function (DUF1129). This family consists of several hypothetical bacterial proteins of unknown function. 200
49598 399520 pfam06572 DUF1131 Protein of unknown function (DUF1131). This family consists of several hypothetical bacterial proteins of unknown function. 169
49599 399521 pfam06573 Churchill Churchill protein. This family consists of several eukaryotic Churchill proteins. This protein contains a novel zinc binding region that mediates FGF signaling during neural development (unpublished obs Sheng G and Stern C). 111
49600 399522 pfam06574 FAD_syn FAD synthetase. This family corresponds to the N terminal domain of the bifunctional enzyme riboflavin kinase / FAD synthetase. These enzymes have both ATP:riboflavin 5'-phospho transferase and ATP:FMN-adenylyltransferase activity. They catalyze the 5'-phosphorylation of riboflavin to FMN and the adenylylation of FMN to FAD. This domain is thought to have the flavin mononucleotide (FMN) adenylyltransferase activity. 158
49601 284087 pfam06575 DUF1132 Protein of unknown function (DUF1132). This family consists of several hypothetical proteins from Neisseria meningitidis. The function of this family is unknown. 101
49602 148278 pfam06576 DUF1133 Protein of unknown function (DUF1133). This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown. 176
49603 399523 pfam06577 DUF1134 Protein of unknown function (DUF1134). This family consists of several hypothetical bacterial proteins of unknown function. 159
49604 284089 pfam06578 YscK YOP proteins translocation protein K (YscK). This family consists of several YscK proteins. The function of this protein is unknown but it belongs to an operon involved in the secretion of Yop proteins across bacterial membranes. 209
49605 399524 pfam06579 Ly-6_related Caenorhabditis elegans ly-6-related protein. This family consists of several Caenorhabditis elegans specific ly-6-related HOT and ODR proteins. These proteins are involved in the olfactory system. Odr-2 mutants are known to be defective in the ability to chemotax to odorants that are recognized by the two AWC olfactory neurons. Odr-2 encodes a membrane-associated protein related to the Ly-6 superfamily of GPI-linked signaling proteins. 125
49606 399525 pfam06580 His_kinase Histidine kinase. This family represents a region within bacterial histidine kinase enzymes. Two-component signal transduction systems such as those mediated by histidine kinase are integral parts of bacterial cellular regulatory processes, and are used to regulate the expression of genes involved in virulence. Members of this family often contain pfam02518 and/or pfam00672. 79
49607 368986 pfam06581 p31comet Mad1 and Cdc20-bound-Mad2 binding. This family is involved in the cell-cycle surveillance mechanism called the spindle checkpoint. This mechanism monitors the proper bipolar attachment of sister chromatids to spindle microtubules and ensures the fidelity of chromosome segregation during mitosis. A key player in mitosis is Mad2, and Mad2 exhibits an unusual two-state behaviour. A Mad1-Mad2 core complex recruits cytosolic Mad2 to kinetochores through Mad2 dimerization and converts Mad2 to a conformer amenable to Cdc20 binding. p31comet inactivates the checkpoint by binding to Mad1- or Cdc20-bound Mad2 in such a way as to stop Mad2 activation and to promote the dissociation of the Mad2-Cdc20 complex. 265
49608 399526 pfam06582 DUF1136 Repeat of unknown function (DUF1136). This family consists of several eukaryote specific repeats of unknown function. This repeat seems to always be found with pfam00047. 27
49609 399527 pfam06583 Neogenin_C Neogenin C-terminus. This family represents the C-terminus of eukaryotic neogenin precursor proteins, which contains several potential phosphorylation sites. Neogenin is a member of the N-CAM family of cell adhesion molecules (and therefore contains multiple copies of pfam00047 and pfam00041) and is closely related to the DCC tumor suppressor gene product - these proteins may play an integral role in regulating differentiation programmes and/or cell migration events within many adult and embryonic tissues. 289
49610 399528 pfam06584 DIRP DIRP. DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway. 107
49611 399529 pfam06585 JHBP Haemolymph juvenile hormone binding protein (JHBP). This family consists of several insect-specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone regulates embryogenesis, maintains the status quo of larval development and stimulates reproductive maturation in the adult insect. JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). JHBP protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph. The crystal structure of the JHBP from Galleria mellonella shows an unusual fold consisting of a long alpha-helix wrapped in a much curved antiparallel beta-sheet. The folding pattern for this structure closely resembles that found in some tandem-repeat mammalian lipid-binding and bactericidal permeability-increasing proteins, with a similar organisation of the major cavity and a disulfide bond linking the long helix and the beta-sheet. It would appear that JHBP forms two cavities, only one of which, the one near the N- and C-termini, binds the hormone; binding induces a conformational change, of unknown significance. This family now includes DUF233, pfam03027. 239
49612 399530 pfam06586 TraK TraK protein. This family consists of several TraK proteins from Escherichia coli, Salmonella typhi and Salmonella typhimurium. TraK is known to be essential for pilus assembly but its exact role in this process is unknown. 228
49613 368992 pfam06587 DUF1137 Protein of unknown function (DUF1137). This family consists of several hypothetical proteins specific to Chlamydia species. The function of this family is unknown. 117
49614 284099 pfam06588 Muskelin_N Muskelin N-terminus. This family represents the N-terminal region of muskelin and is found in conjunction with several pfam01344 repeats. Muskelin is an intracellular, kelch repeat protein that is needed in cell-spreading responses to the matrix adhesion molecule, thrombospondin-1. 197
49615 399531 pfam06589 CRA Circumsporozoite-related antigen (CRA). This family consists of several circumsporozoite-related antigen (CRA) or exported protein-1 (EXP1) sequences found specifically in Plasmodium species. The function of this family is unknown. 123
49616 115260 pfam06590 PerB PerB protein. This family consists of several PerB or BfpV proteins found specifically in Escherichia coli. PerB is thought to play a role in regulating the expression of BfpA. 129
49617 148289 pfam06591 Phage_T4_Ndd T4-like phage nuclear disruption protein (Ndd). This family consists of several nuclear disruption (Ndd) proteins from T4-like phages. Early in a bacteriophage T4 infection, the phage ndd gene causes the rapid destruction of the structure of the Escherichia coli nucleoid. The targets of Ndd action may be the chromosomal sequences that determine the structure of the nucleoid. 154
49618 399532 pfam06592 DUF1138 Protein of unknown function (DUF1138). This family consists of several hypothetical short plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 73
49619 284102 pfam06593 RBDV_coat Raspberry bushy dwarf virus coat protein. This family consists of several Raspberry bushy dwarf virus coat proteins. 274
49620 399533 pfam06594 HCBP_related Haemolysin-type calcium binding protein related domain. This family consists of a number of bacteria specific domains which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with pfam00353 and is often found in multiple copies. 42
49621 284104 pfam06595 BDV_P24 Borna disease virus P24 protein. This family consists of several Borna disease virus (BDV) P24 proteins. The function of this family is unknown. 201
49622 399534 pfam06596 PsbX Photosystem II reaction centre X protein (PsbX). This family consists of several photosystem II reaction centre X protein (PsbX) sequences from both prokaryotes and eukaryotes. 37
49623 368996 pfam06597 Clostridium_P47 Clostridium P-47 protein. This family consists of several P-47 proteins from various Clostridium species as well as two related sequences from Pseudomonas putida. The function of this family is unknown. 469
49624 253815 pfam06598 Chlorovi_GP_rpt Chlorovirus glycoprotein repeat. This family consists of a number of repeats found in Chlorovirus glycoproteins. The function of this family is unknown. 34
49625 115269 pfam06599 DUF1139 Protein of unknown function (DUF1139). This family consists of several hypothetical Fijivirus proteins of unknown function. 309
49626 284107 pfam06600 DUF1140 Protein of unknown function (DUF1140). This family consists of several short, hypothetical phage and bacterial proteins. The function of this family is unknown. 99
49627 284108 pfam06601 Orthopox_F6 Orthopoxvirus F6 protein. This family consists of several Orthopoxvirus F6L proteins, the function of which is unknown. 72
49628 399535 pfam06602 Myotub-related Myotubularin-like phosphatase domain. This family represents the phosphatase domain within eukaryotic myotubularin-related proteins. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in genes encoding myotubularin-related proteins have been associated with disease. 330
49629 399536 pfam06603 UpxZ UpxZ family of transcription anti-terminator antagonists. The UpxZ family of proteins acts to inhibit transcription of heterologous capsular polysaccharide loci in Bacteroides species by interfering with the action of the UpxY family of transcription anti-terminators. As antagonists of polysaccharide locus-specific UpxY transcription anti-terminators, the UpxZ proteins exert a hierarchical level of regulation, ensuring that only one of the multiple phase-variable capsular polysaccharide loci per cell characteristic of this genus is transcribed at a time. 104
49630 399537 pfam06605 Prophage_tail Prophage endopeptidase tail. This family is of prophage tail proteins that are probably acting as endopeptidases. 251
49631 148298 pfam06607 Prokineticin Prokineticin. This family consists of several prokineticin proteins and related BM8 sequences. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock. 97
49632 284112 pfam06608 DUF1143 Protein of unknown function (DUF1143). This family consists of several hypothetical mammalian proteins (from mouse and human). The function of this family is unknown. 148
49633 115279 pfam06609 TRI12 Fungal trichothecene efflux pump (TRI12). This family consists of several fungal specific trichothecene efflux pump proteins. Many of the genes involved in trichothecene toxin biosynthesis in Fusarium sporotrichioides are present within a gene cluster. It has been suggested that TRI12 may play a role in F. sporotrichioides self-protection against trichothecenes. 598
49634 399538 pfam06610 AlaE L-alanine exporter. AlaE is a family of Gram-negative amino-acid transporters. It is not entirely clear why bacteria export metabolites but recent studies have shown that many excrete alanine. AlaE is likely to be the exporter protein for L-alanine. UniProtKB:A8ANM6, UniProt:G4R961 and UniProt:H5SVY7 are classified as putative alanine exporters. 141
49635 399539 pfam06611 DUF1145 Protein of unknown function (DUF1145). This family consists of several hypothetical bacterial proteins of unknown function. 56
49636 399540 pfam06612 DUF1146 Protein of unknown function (DUF1146). This family consists of several hypothetical bacterial proteins of unknown function. 48
49637 399541 pfam06613 KorB_C KorB C-terminal beta-barrel domain. This family consists of several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon. This beta-barrel domain is found at the C-terminus of KorB. 58
49638 369001 pfam06614 Neuromodulin Neuromodulin. This family consists of several neuromodulin (Axonal membrane protein GAP-43) sequences and is found in conjunction with pfam00612. GAP-43 is a neuronal calmodulin-binding phosphoprotein that is concentrated in growth cones and pre-synaptic terminals. 175
49639 115285 pfam06615 DUF1147 Protein of unknown function (DUF1147). This family consists of several short Circovirus proteins of unknown function. 59
49640 377679 pfam06616 BsuBI_PstI_RE BsuBI/PstI restriction endonuclease C-terminus. This family represents the C-terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (EC:3.1.21.4). The enzymes of the BsuBI restriction/modification (R/M) system recognize the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system. 153
49641 399542 pfam06617 M-inducer_phosp M-phase inducer phosphatase. This family represents a region within eukaryotic M-phase inducer phosphatases (EC:3.1.3.48), which also contain the pfam00581 domain. These proteins are involved in the control of mitosis. 228
49642 115288 pfam06618 DUF1148 Protein of unknown function (DUF1148). This family consists of several Maize streak virus proteins of unknown function. 114
49643 399543 pfam06619 DUF1149 Protein of unknown function (DUF1149). This family consists of several hypothetical bacterial proteins of unknown function. 122
49644 399544 pfam06620 DUF1150 Protein of unknown function (DUF1150). This family consists of several hypothetical bacterial proteins of unknown function. 76
49645 399545 pfam06621 SIM_C Single-minded protein C-terminus. This family represents the C-terminal region of the eukaryotic single-minded (SIM) protein. Drosophila single-minded acts as a positive master gene regulator in central nervous system midline formation. There are two homologs in mammals: SIM1 and SIM2, which are members of the basic-helix-loop-helix PAS family of transcription factors. SIM1 and SIM2 are novel heterodimerization partners for ARNT in vitro, and they may function both as positive and negative transcriptional regulators in vivo, during embryogenesis and in the adult organism. SIM2 is thought to contribute to some specific Down syndrome phenotypes. This family is found in conjunction with a pfam00989 domain and associated pfam00785 motif. 293
49646 115292 pfam06622 SepQ SepQ protein. This family consists of several enterobacterial SepQ proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unclear. 305
49647 399546 pfam06623 MHC_I_C MHC_I C-terminus. This family represents the C-terminal region of the MHC class I antigen. The family is found in conjunction with pfam00129 and pfam00047. 26
49648 399547 pfam06624 RAMP4 Ribosome associated membrane protein RAMP4. This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER. 57
49649 399548 pfam06625 DUF1151 Protein of unknown function (DUF1151). This family consists of several hypothetical eukaryotic proteins of unknown function. 114
49650 399549 pfam06626 DUF1152 Protein of unknown function (DUF1152). This family consists of several hypothetical archaeal proteins of unknown function. 294
49651 399550 pfam06627 DUF1153 Protein of unknown function (DUF1153). This family consists of several short, hypothetical bacterial proteins of unknown function. 87
49652 399551 pfam06628 Catalase-rel Catalase-related immune-responsive. This family represents a small conserved region within catalase enzymes (EC:1.11.1.6). All members also contain the Catalase family, pfam00199 domain. Catalase decomposes hydrogen peroxide into water and oxygen, serving to protect cells from its toxic effects. This domain carries the immune-responsive amphipathic octa-peptide that is recognized by T cells. 61
49653 399552 pfam06629 MipA MltA-interacting protein MipA. This family consists of several bacterial MltA-interacting protein (MipA) like sequences. As well as interacting with the membrane-bound lytic transglycosylase MltA, MipA is known to bind to PBP1B, a bifunctional murein transglycosylase/transpeptidase. MipA is considered to be a structural protein mediating the assembly of MltA to PBP1B into a complex. 221
49654 284130 pfam06630 Exonuc_VIII Enterobacterial exodeoxyribonuclease VIII. This family consists of several Enterobacterial exodeoxyribonuclease VIII proteins. 203
49655 399553 pfam06631 DUF1154 Protein of unknown function (DUF1154). This family represents a small conserved region of unknown function within eukaryotic phospholipase C (EC:3.1.4.3). All members also contain pfam00387 and pfam00388. 44
49656 369011 pfam06632 XRCC4 DNA double-strand break repair and V(D)J recombination protein XRCC4. This family consists of several eukaryotic DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalyzed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks. 336
49657 115303 pfam06633 DUF1155 Protein of unknown function (DUF1155). This family consists of several Cucumber mosaic virus ORF IIB proteins. The function of this family is unknown. 42
49658 369012 pfam06634 DUF1156 Protein of unknown function (DUF1156). This family represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids. 71
49659 399554 pfam06635 NolV Nodulation protein NolV. This family consists of several nodulation protein NolV sequences from different Rhizobium species. The function of this family is unclear. 206
49660 253836 pfam06636 DUF1157 Protein of unknown function (DUF1157). This family consists of several uncharacterized proteins from Melanoplus sanguinipes entomopoxvirus (MsEPV). The function of this family is unknown. 370
49661 399555 pfam06637 PV-1 PV-1 protein (PLVAP). This family consists of several PV-1 (PLVAP) proteins which seem to be specific to mammals. PV-1 is a novel protein component of the endothelial fenestral and stomatal diaphragms. The function of this family is unknown. 440
49662 399556 pfam06638 Strabismus Strabismus protein. This family consists of several strabismus (STB) or Van Gogh-like (VANGL) proteins 1 and 2. The exact function of this family is unknown. It is thought, however, that the STB1 and STB2 genes may be potent tumor suppressor gene candidates. 503
49663 369015 pfam06639 BAP Basal layer antifungal peptide (BAP). This family consists of several basal layer antifungal peptide (BAP) sequences specific to Zea mays. The BAP2 peptide exhibits potent broad-range activity against a range of filamentous fungi, including several plant pathogens. 76
49664 336460 pfam06640 P_C P protein C-terminus. This family represents the C-terminus of plant P proteins. The maize P gene is a transcriptional regulator of genes encoding enzymes for flavonoid biosynthesis in the pathway leading to the production of a red phlobaphene pigment, and P proteins are homologous to the DNA-binding domain of myb-like transcription factors. All members of this family contain the pfam00249 domain. 206
49665 399557 pfam06643 DUF1158 Protein of unknown function (DUF1158). This family consists of several enterobacterial YbdJ proteins. The function of this family is unknown. 78
49666 399558 pfam06644 ATP11 ATP11 protein. This family consists of several eukaryotic ATP11 proteins. In Saccharomyces cerevisiae, expression of functional F1-ATPase requires two proteins encoded by the ATP11 and ATP12 genes. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, the catalytic unit of the ATP synthase. 267
49667 399559 pfam06645 SPC12 Microsomal signal peptidase 12 kDa subunit (SPC12). This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12). 71
49668 284141 pfam06646 Mycoplasma_p37 High affinity transport system protein p37. This family consists of several high affinity transport system protein p37 sequences which are specific to Mycoplasma species. The p37 gene is part of an operon encoding two additional proteins which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in M. hyorhinis, a Gram-positive bacterium. 330
49669 148323 pfam06648 DUF1160 Protein of unknown function (DUF1160). This family consists of several hypothetical Baculovirus proteins of unknown function. 122
49670 399560 pfam06649 DUF1161 Protein of unknown function (DUF1161). This family consists of several short, hypothetical bacterial proteins of unknown function. 52
49671 399561 pfam06650 SHR-BD SHR-binding domain of vacuolar-sorting associated protein 13. SHR-BD is a family of eukaryotic proteins found on vacuolar-sorting associated proteins towards the C-terminus. In plants, the domain is found to be the region which interacts with SHR or the SHORT-ROOT transcription factor, a regulator of root-growth and asymmetric cell division that separates ground tissue into endodermis and cortex. The plant protein containing the SHR-BD is named SHRUBBY or SHBY, UniProtKB:Q9FT44. 272
49672 284144 pfam06651 DUF1163 Protein of unknown function (DUF1163). This family represents the C-terminus of hypothetical Arabidopsis thaliana proteins of unknown function. 67
49673 399562 pfam06652 Methuselah_N Methuselah N-terminus. This family represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. This family is found in conjunction with pfam00002. 179
49674 399563 pfam06653 Claudin_3 Tight junction protein, Claudin-like. This is a family of probable membrane tight junction, Claudin-like, proteins. 164
49675 284147 pfam06656 Tenui_PVC2 Tenuivirus PVC2 protein. This family consists of several Tenuivirus PVC2 proteins from Rice grassy stunt virus, Maize stripe virus and Rice hoja blanca virus. The function of this family is unknown. 784
49676 399564 pfam06657 Cep57_MT_bd Centrosome microtubule-binding domain of Cep57. This C-terminal region of Cep57 binds, nucleates and bundles microtubules. The N-terminal part, family Cep57_CLD, pfam14073, is the centrosome localization domain Cep57. 77
49677 399565 pfam06658 DUF1168 Protein of unknown function (DUF1168). This family consists of several hypothetical eukaryotic proteins of unknown function. 136
49678 284150 pfam06661 VirE3 VirE3. This family represents a conserved region within Agrobacterium tumefaciens VirE3. Agrobacterium tumefaciens (a plant pathogen) has a tumor-inducing (Ti) plasmid of which part, the transfer (T)-region, is transferred to plant cells during the infection process. Vir proteins mediate the processing of the T-region and the transfer of a single-stranded (ss) DNA copy of this region, the T-strand, into the recipient cells. VirE3 is a translocated effector protein, but its specific role has not been established. 316
49679 399566 pfam06662 C5-epim_C D-glucuronyl C5-epimerase C-terminus. This family represents the C-terminus of D-glucuronyl C5-epimerase (EC:5.1.3.-). Glucuronyl C5-epimerases catalyze the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdceA) units during the biosynthesis of glycosaminoglycans. 188
49680 399567 pfam06663 DUF1170 Protein of unknown function (DUF1170). This family represents a conserved region of unknown function within MAGUIN, a neuronal membrane-associated guanylate kinase-interacting protein. This region is situated between the pfam00595 and pfam00169 domains. All family members also contain an N-terminal pfam00536 domain. 214
49681 399568 pfam06664 MIG-14_Wnt-bd Wnt-binding factor required for Wnt secretion. MIG-14 is a Wnt-binding factor. Newly synthesized EGL-20/Wnt binds to MIG-14 in the Golgi, targeting the Wnt to the cell membrane for secretion. AP-2-mediated endocytosis and retromer retrieval at the sorting endosome would recycle MIG-14 to the Golgi, where it can bind to EGL-20/Wnt for the next cycle of secretion. 294
49682 399569 pfam06666 DUF1173 Protein of unknown function (DUF1173). This family contains a group of hypothetical bacterial proteins that contain three conserved cysteine residues towards the N-terminal. The function of these proteins is unknown. 380
49683 399570 pfam06667 PspB Phage shock protein B. This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one. 73
49684 399571 pfam06668 ITI_HC_C Inter-alpha-trypsin inhibitor heavy chain C-terminus. This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain. 188
49685 115333 pfam06670 Etmic-2 Microneme protein Etmic-2. This family consists of several Microneme protein Etmic-2 sequences from Eimeria tenella. Etmic-2 is a 50 kDa acidic protein, which is found within the microneme organelles of Eimeria tenella sporozoites and merozoites. 379
49686 369031 pfam06671 DUF1174 Repeat of unknown function (DUF1174). This family consists of a number of Caenorhabditis elegans specific repeats of around 36 residues in length which are found in two hypothetical proteins. This family is found in conjunction with pfam00024. 24
49687 399572 pfam06672 DUF1175 Protein of unknown function (DUF1175). This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown. 218
49688 115336 pfam06673 L_lactis_ph-MCP Lactococcus lactis bacteriophage major capsid protein. This family consists of several Lactococcus lactis bacteriophage major capsid proteins. 347
49689 399573 pfam06674 DUF1176 Protein of unknown function (DUF1176). This family consists of several hypothetical bacterial proteins of around 340 residues in length. Members of this family contain six highly conserved cysteine residues. The function of this family is unknown. 319
49690 399574 pfam06675 DUF1177 Protein of unknown function (DUF1177). This family consists of several hypothetical archaeal and bacterial proteins of around 300 residues in length. The function of this family is unknown. 270
49691 399575 pfam06676 DUF1178 Protein of unknown function (DUF1178). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 148
49692 399576 pfam06677 Auto_anti-p27 Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis. 38
49693 369033 pfam06678 DUF1179 Protein of unknown function (DUF1179). This family consists of several hypothetical Caenorhabditis elegans proteins of around 106 residues in length. The function of the family is unknown. 107
49694 369034 pfam06679 DUF1180 Protein of unknown function (DUF1180). This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown. 167
49695 284166 pfam06680 DUF1181 Protein of unknown function (DUF1181). This family consists of several hypothetical proteins of around 120 residues in length which are found specifically in Trypanosoma brucei. The function of this family is unknown. 120
49696 284167 pfam06681 DUF1182 Protein of unknown function (DUF1182). This family consists of several hypothetical proteins of around 360 residues in length and seems to be specific to Caenorhabditis elegans. The function of this family is unknown. It appears to carry seven TM regions. 208
49697 399577 pfam06682 SARAF SOCE-associated regulatory factor of calcium homoeostasis. SARAF is a family of eukaryotic proteins embedded in the ER. SARAF is SOCE-associated regulatory factor, where SOCE is store-operated calcium entry, i.e. a mechanism governing calcium homoeostasis in the cell and the mitochondria. SOCE involves the enabling of Ca2+ release-activated Ca2+ (CRAC) channels. SARAF is a single pass ER membrane protein whose cytosolic-facing domain is responsible for activity and whose luminal-facing domain carries out a regulatory function in conjunction with another membrane protein STIM, an ER single pass membrane protein that detects changes in ER Ca2+ levels through its EF-hand, conserved Ca2+ binding domain. STIM is the major target for SARAF regulation, and thus SARAF negatively regulates the SOCE entry of calcium into cells, protecting them from overfilling. 320
49698 310940 pfam06683 DUF1184 Protein of unknown function (DUF1184). This family contains a number of hypothetical proteins of unknown function from Arabidopsis thaliana. 203
49699 399578 pfam06684 AA_synth Amino acid synthesis. This family of proteins is structurally similar to proteins with the Bacillus chorismate mutase-like (BCM-like) fold. This structure, combined with its genomic context, suggests that it has a role in amino acid synthesis. 175
49700 369037 pfam06685 DUF1186 Protein of unknown function (DUF1186). This family consists of several hypothetical bacterial proteins of around 250 residues in length and is found in several Chlamydia and Anabaena species. The function of this family is unknown. 246
49701 399579 pfam06686 SpoIIIAC Stage III sporulation protein AC/AD protein family. This family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) and SpoIIIAD sequences. The exact function of this family is unknown. SpoIIIAD is an uncharacterized protein which is part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. The operon is regulated by sigmaG. 56
49702 399580 pfam06687 SUR7 SUR7/PalI family. This family consists of several fungal-specific SUR7 proteins. Its activity regulates expression of RVS161, a homolog of human endophilin, suggesting a function for both in endocytosis. The protein carries four transmembrane domains and is thus likely to act as an anchoring protein for the eisosome to the plasma membrane. Eisosomes are the immobile protein complexes, that include the proteins Pil1 and Lsp1, which co-localize with sites of protein and lipid endocytosis at the plasma membrane. SUR7 protein may play a role in sporulation. This family also includes PalI which is part of a pH signal transduction cascade. Based on the similarity of PalI to the yeast Rim9 meiotic signal transduction component it has been suggested that PalI might be a membrane sensor for ambient pH. 201
49703 115351 pfam06688 DUF1187 Protein of unknown function (DUF1187). This family consists of several short, hypothetical bacterial proteins of around 62 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi. The function of this family is unknown. 61
49704 399581 pfam06689 zf-C4_ClpX ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known. 39
49705 399582 pfam06690 DUF1188 Protein of unknown function (DUF1188). This family consists of several hypothetical archaeal proteins of around 260 residues in length which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown. 248
49706 399583 pfam06691 DUF1189 Protein of unknown function (DUF1189). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown. 240
49707 115355 pfam06692 MNSV_P7B Melon necrotic spot virus P7B protein. This family consists of several Melon necrotic spot virus (MNSV) P7B proteins. The function of this family is unknown. 61
49708 399584 pfam06693 DUF1190 Protein of unknown function (DUF1190). This family consists of several hypothetical Enterobacterial proteins of around 212 residues in length and is known as YjfM in Escherichia coli. The function of this family is unknown. 161
49709 369039 pfam06694 Plant_NMP1 Plant nuclear matrix protein 1 (NMP1). This family consists of several plant specific nuclear matrix protein 1 (NMP1) sequences. Nuclear Matrix Protein 1 is a ubiquitously expressed 36 kDa protein, which has no homologs in animals and fungi, but is highly conserved among flowering and non-flowering plants. NMP1 is located both in the cytoplasm and nucleus, and the nuclear fraction is associated with the nuclear matrix. NMP1 is a candidate for a plant-specific structural protein with a function both in the nucleus and cytoplasm. 318
49710 399585 pfam06695 Sm_multidrug_ex Putative small multi-drug export protein. This family contains a small number of putative small multi-drug export proteins. 112
49711 336475 pfam06696 Strep_SA_rep Streptococcal surface antigen repeat. This family consists of a number of ~25 residue long repeats found commonly in Streptococcal surface antigens although one copy is present in the HPSR2-heavy chain potential motor protein of Giardia lamblia. This family is often found in conjunction with pfam00746. 25
49712 399586 pfam06697 DUF1191 Protein of unknown function (DUF1191). This family contains hypothetical plant proteins of unknown function. 182
49713 399587 pfam06698 DUF1192 Protein of unknown function (DUF1192). This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown. 58
49714 399588 pfam06699 PIG-F GPI biosynthesis protein family Pig-F. PIG-F is involved in glycosylphosphatidylinositol (GPI) anchor biosynthesis. 183
49715 399589 pfam06701 MIB_HERC2 Mib_herc2. Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 66
49716 399590 pfam06702 Fam20C Golgi casein kinase, C-terminal, Fam20. Fam20C represents the C-terminus of eukaryotic secreted Golgi casein kinase proteins. Fam20C is the Golgi casein kinase that phosphorylates secretory pathway proteins within Ser-x-Glu/pSer motifs. Mutations in Fam20C cause Raine syndrome, an autosomal recessive osteosclerotic bone dysplasia. 218
49717 399591 pfam06703 SPC25 Microsomal signal peptidase 25 kDa subunit (SPC25). This family consists of several microsomal signal peptidase 25 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 25 kDa subunit (SPC25). 153
49718 284187 pfam06705 SF-assemblin SF-assemblin/beta giardin. This family consists of several eukaryotic SF-assemblin and related beta giardin proteins. During mitosis the SF-assemblin-based cytoskeleton is reorganized; it divides in prophase and is reduced to two dot-like structures at each spindle pole in metaphase. During anaphase, the two dots present at each pole are connected again. In telophase there is an asymmetrical outgrowth of new fibers. It has been suggested that SF-assemblin is involved in re-establishing the microtubular root system characteristic of interphase cells after mitosis. 247
49719 115368 pfam06706 CTV_P6 Citrus tristeza virus 6-kDa protein. This family consists of several Citrus tristeza virus (CTV) 6-kDa, 51 residue long hydrophobic (P6) proteins. The function of this family is unknown. 51
49720 369046 pfam06707 DUF1194 Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. 206
49721 399592 pfam06708 DUF1195 Protein of unknown function (DUF1195). This family consists of several plant specific hypothetical proteins of around 160 residues in length. The function of this family is unknown. 147
49722 399593 pfam06711 DUF1198 Protein of unknown function (DUF1198). This family consists of several bacterial proteins of around 150 residues in length which are specific to Escherichia coli, Salmonella species and Yersinia pestis. The function of this family is unknown. 142
49723 115374 pfam06712 DUF1199 Protein of unknown function (DUF1199). This family consists of several hypothetical Feline immunodeficiency virus (FIV) proteins. Members of this family are typically around 67 residues long and are often annotated as ORF3 proteins. The function of this family is unknown. 52
49724 369049 pfam06713 bPH_4 Bacterial PH domain. This family consists of several hypothetical proteins specific to Oceanobacillus and Bacillus species. Members of this family are typically around 130 residues in length. The function of this family is unknown. Members of this family have a PH domain like structure. 74
49725 148361 pfam06714 Gp5_OB Gp5 N-terminal OB domain. This domain is found at the N-terminus of the Gp5 baseplate protein of bacteriophage T4. This domain binds to the Gp27 protein. This domain has the common OB fold. 144
49726 310962 pfam06715 Gp5_C Gp5 C-terminal repeat (3 copies). This repeat composes the C-terminal part of the bacteriophage T4 baseplate protein Gp5. This region of the protein forms a needle like projection from the baseplate that is presumed to puncture the bacterial cell membrane. Structurally three copies of the repeated region trimerize to form a beta solenoid type structure. This family also includes repeats from bacterial Vgr proteins. 24
49727 284193 pfam06716 DUF1201 Protein of unknown function (DUF1201). This family consists of several Sugar beet yellow virus (SBYV) putative membrane-binding proteins of around 54 residues in length. The function of this family is unknown. 54
49728 284194 pfam06717 DUF1202 Protein of unknown function (DUF1202). This family consists of several hypothetical bacterial proteins of around 335 residues in length. Members of this family are found exclusively in Escherichia coli and Salmonella species and are often referred to as YggM proteins. The function of this family is unknown. 307
49729 399594 pfam06718 DUF1203 Protein of unknown function (DUF1203). This family consists of several hypothetical bacterial proteins of around 155 residues in length. Family members are present in Rhizobium, Agrobacterium and Streptomyces species. 116
49730 399595 pfam06719 AraC_N AraC-type transcriptional regulator N-terminus. This family represents the N-terminus of bacterial ARAC-type transcriptional regulators. In E. coli, these regulate the L-arabinose operon through sensing the presence of arabinose, and when the sugar is present, transmitting this information from the arabinose-binding domains to the protein's DNA-binding domains. This family might represent the N-terminal arm of the protein, which binds to the C-terminal DNA binding domains to hold them in a state where the protein prefers to loop and remain non-activating. All family members contain the pfam00165 domain. 152
49731 336481 pfam06720 Phi-29_GP16_7 Bacteriophage phi-29 early protein GP16.7. This family consists of several bacteriophage phi-29 early protein GP16.7 sequences of around 130 residues in length. The function of this family is unknown. 130
49732 284198 pfam06721 DUF1204 Protein of unknown function (DUF1204). This family represents the C-terminus of a number of Arabidopsis thaliana hypothetical proteins of unknown function. Family members contain a conserved DFD motif. 243
49733 369050 pfam06722 DUF1205 Protein of unknown function (DUF1205). This family represents a conserved region of unknown function within bacterial glycosyl transferases. Many family members contain pfam03033. 95
49734 399596 pfam06723 MreB_Mbl MreB/Mbl protein. This family consists of bacterial MreB and Mbl proteins as well as two related archaeal sequences. MreB is known to be a rod shape-determining protein in bacteria and goes to make up the bacterial cytoskeleton. Genes coding for MreB/Mbl are only found in elongated bacteria, not in coccoid forms. It has been speculated that constituents of the eukaryotic cytoskeleton (tubulin, actin) may have evolved from prokaryotic precursor proteins closely related to today's bacterial proteins FtsZ and MreB/Mbl. 327
49735 399597 pfam06724 DUF1206 Domain of Unknown Function (DUF1206). This region consists of a pair of transmembrane helices and occurs three times in each of the family member proteins. 70
49736 399598 pfam06725 3D 3D domain. This short presumed domain contains three conserved aspartate residues, hence the name 3D. It has been shown to be part of the catalytic double psi beta barrel domain of MltA. 72
49737 369052 pfam06726 BC10 Bladder cancer-related protein BC10. This family consists of a series of short proteins of around 90 residues in length. The human protein BC10 has been implicated in bladder cancer where the transcription of the gene coding for this protein is nearly completely abolished in highly invasive transitional cell carcinomas (TCCs). The protein is a small globular protein containing two transmembrane helices, and it is a multiply edited transcript. All the editing sites are found in either the 5'-UTR or the N-terminal section of the protein, which is predicted to be outside the membrane. The three coding edits are all non-synonymous and predicted to encode exposed residues. The function of this family is unknown. 65
49738 310969 pfam06727 DUF1207 Protein of unknown function (DUF1207). This family consists of a number of hypothetical bacterial proteins of around 410 residues in length which seem to be specific to Chlamydia species. The function of this family is unknown. 337
49739 399599 pfam06728 PIG-U GPI transamidase subunit PIG-U. Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the carboxyl-terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI. 374
49740 399600 pfam06729 CENP-R Kinetochore component, CENP-R. This family consists of mammalian kinetochore sub-complex proteins CENP-R, also referred to as nuclear receptor co-activator NRIF3 proteins. NRIF3 exhibits a distinct receptor specificity in interacting with and potentiating the activity of only TRs and RXRs but not other examined nuclear receptors. NRIF3 is a co-regulator that possesses both transactivation and transrepression domains and/or functions. Collectively, the NRIF3 family of co-regulators may play dual roles in mediating both positive and negative regulatory effects on gene expression. CENP-R is one of the 15 components that make up the constitutive centromere associated complex (CCAN) part of the kinetochore. A sub-complex of CCAN, consisting of CENP-P/O/R/Q/U, self-assembles on kinetochores with varying stoichiometry and undergoes a pre-mitotic maturation step. Kinetochore assembly is a cell cycle regulated multi-step process. The initial step occurs during interphase and involves loading of the 15-subunit constitutive centromere associated complex (CCAN). Kinetochores are multi-protein megadalton assemblies that are required for attachment of microtubules to centromeres and, in turn, the segregation of chromosomes in mitosis. 137
49741 284207 pfam06730 FAM92 FAM92 protein. This family of proteins has a role in embryogenesis. During embryogenesis it is essential for ectoderm and axial mesoderm development. It may regulate cell proliferation and apoptosis. 225
49742 399601 pfam06732 Pescadillo_N Pescadillo N-terminus. This family represents the N-terminal region of Pescadillo. Pescadillo protein localizes to distinct substructures of the interphase nucleus including nucleoli, the site of ribosome biogenesis. During mitosis pescadillo closely associates with the periphery of metaphase chromosomes and by late anaphase is associated with nucleolus-derived foci and prenucleolar bodies. Blastomeres in mouse embryos lacking pescadillo arrest at morula stages of development, the nucleoli fail to differentiate and accumulation of ribosomes is inhibited. It has been proposed that in mammalian cells pescadillo is essential for ribosome biogenesis and nucleologenesis and that disruption to its function results in cell cycle arrest. This family is often found in conjunction with a pfam00533 domain. 269
49743 399602 pfam06733 DEAD_2 DEAD_2. This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast. 168
49744 284210 pfam06734 UL97 UL97. This family represents a conserved region within viral UL97 phosphotransferases. UL97 participates in the phosphorylation of the nucleoside analog ganciclovir (GCV) to produce GCV-monophosphate. 187
49745 399603 pfam06736 DUF1211 Protein of unknown function (DUF1211). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins. 88
49746 399604 pfam06737 Transglycosylas Transglycosylase-like domain. This family of proteins are very likely to act as transglycosylase enzymes related to pfam00062 and pfam01464. These other families are weakly matched by this family, and include the known active site residues. 75
49747 399605 pfam06738 ThrE Putative threonine/serine exporter. ThrE is a family of bacterial and Archaeal proteins that catalyze the export of L-threonine from the cell. UniProtKB:Q79VD1 has been characterized as being necessary for this export. The domain exhibits 10 putative TMs and catalyzes the proton-motive-force-dependent efflux of threonine and serine. 241
49748 115400 pfam06739 SBBP Beta-propeller repeat. This family is related to pfam00400 and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller. 38
49749 369057 pfam06740 DUF1213 Protein of unknown function (DUF1213). This family represents a short conserved repeat within Drosophila melanogaster proteins of unknown function. Approximately 50 copies of this repeat are present in each protein. 32
49750 399606 pfam06741 LsmAD LsmAD domain. This domain is found associated with Lsm domain. 65
49751 399607 pfam06742 DUF1214 Protein of unknown function (DUF1214). This family represents the C-terminal region of several hypothetical proteins of unknown function. Family members are mostly bacterial, but a few are also found in eukaryotes and archaea. 109
49752 399608 pfam06743 FAST_1 FAST kinase-like protein, subdomain 1. This family represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases (EC:2.7.1.-) that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis. Note that many family members are hypothetical proteins. This region is often found immediately N-terminal to the FAST kinase-like protein, subdomain 2. 69
49753 399609 pfam06744 IcmF_C Type VI secretion protein IcmF C-terminal. IcmF_C family represents a conserved region situated towards the C-terminal end of IcmF-like proteins. It was thought to be involved in Vibrio cholerae cell surface reorganisation that results in increased adherence to epithelial cells leading to an increased conjugation frequency. IcmF as a whole interacts with DotU whereby these bind tightly and form the docking area of the T6SS within the inner membrane. The exact function of this domain is not clear. 106
49754 399610 pfam06745 ATPase KaiC. This family is in the P-loop NTPase superfamily and is found in archaea, bacteria and eukaryotes. More than one copy is sometimes found in each protein. This family includes KaiC, which is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria. 231
49755 369060 pfam06746 DUF1216 Protein of unknown function (DUF1216). This family represents a conserved region, within Arabidopsis thaliana proteins, of unknown function. Family members sometimes contain more than one copy. It has been reported that this domain will be found in other Brassicaceae. 132
49756 399611 pfam06747 CHCH CHCH domain. We have identified a conserved motif in the LOC118487 protein that we have called the CHCH motif. Alignment of this protein with related members showed the presence of three subgroups of proteins, which are called the S (Small), N (N-terminal extended) and C (C-terminal extended) subgroups. All three sub-groups of proteins have in common that they contain a predicted conserved [coiled coil 1]-[helix 1]-[coiled coil 2]-[helix 2] domain (CHCH domain). Within each helix of the CHCH domain, there are two cysteines present in a C-X9-C motif. The N-group contains an additional double helix domain, and each helix contains the C-X9-C motif. This family contains a number of characterized proteins: Cox19 protein - a nuclear gene of Saccharomyces cerevisiae, codes for an 11-kDa protein (Cox19p) required for expression of cytochrome oxidase. Because cox19 mutants are able to synthesize the mitochondrial and nuclear gene products of cytochrome oxidase, Cox19p probably functions post-translationally during assembly of the enzyme. Cox19p is present in the cytoplasm and mitochondria, where it exists as a soluble intermembrane protein. This dual location is similar to what was previously reported for Cox17p, a low molecular weight copper protein thought to be required for maturation of the CuA centre of subunit 2 of cytochrome oxidase. Cox19p has four conserved potential metal ligands: three cysteines and one histidine. Mrp10 - belongs to the class of yeast mitochondrial ribosomal proteins that are essential for translation. Eukaryotic NADH-ubiquinone oxidoreductase 19 kDa (NDUFA8) subunit. The CHCH domain was previously called DUF657. 35
49757 399612 pfam06748 DUF1217 Protein of unknown function (DUF1217). This family represents a conserved region that is found within bacterial proteins, most of which are hypothetical. Some members contain multiple copies. 149
49758 399613 pfam06749 DUF1218 Protein of unknown function (DUF1218). This family contains hypothetical plant proteins of unknown function. Family members contain a number of conserved cysteine residues. 95
49759 399614 pfam06750 DiS_P_DiS Bacterial Peptidase A24 N-terminal domain. This family is found at the N-terminus of the pre-pilin peptidases (pfam01478). Its function has not been specifically determined; however some of the family have been characterized as bifunctional, and this domain may contain the N-methylation activity (EC:2.1.1.-). It consists of an intracellular region between a pair of transmembrane. This region contains an invariant proline and two almost fully conserved disulphide bridges - hence the name DiS-P-DiS. The cysteines have been shown to be essential to the overall function of the enzyme in, but their role was incorrectly ascribed. 84
49760 399615 pfam06751 EutB Ethanolamine ammonia lyase large subunit (EutB). This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins (EC:4.3.1.7). Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC). 435
49761 399616 pfam06752 E_Pc_C Enhancer of Polycomb C-terminus. This family represents the C-terminus of eukaryotic enhancer of polycomb proteins, which have roles in heterochromatin formation. This family contains several conserved motifs. 229
49762 70231 pfam06753 Bradykinin Bradykinin. This family consists of several bradykinin sequences. The skins of anuran amphibians, in addition to mucus glands, contain highly specialized poison glands, which, in reaction to stress or attack, exude a complex noxious cocktail of biologically active molecules. These secretions often contain a plethora of peptides among which bradykinin or structural variants have been identified. 19
49763 399617 pfam06754 PhnG Phosphonate metabolism protein PhnG. This family consists of several bacterial phosphonate metabolism protein PhnG sequences. In Escherichia coli, the phn operon encodes proteins responsible for the uptake and breakdown of phosphonates. The exact function of PhnG is unknown, however it is thought likely that along with six other proteins PhnG makes up the C-P (carbon-phosphorus) lyase. 142
49764 369065 pfam06755 CbtA_toxin CbtA_toxin of type IV toxin-antitoxin system. CbtA is a family of bacterial and archaeal toxins of type IV toxin-antitoxin system. Toxins from such systems in free-living bacteria inhibit cell growth by targeting essential functions of cellular metabolism. In this case the toxin inhibits cell-division leading to changes in morphology and finally lysis, by interacting with two essential cytoskeletal proteins, FtsZ and MreB. For FtsZ it inhibits its GTPase activity and GTP-dependent polymerization, and for MreB it inhibits its ATP-dependent polymerization. These actions of CbtA appear to occur simultaneously. The cognate antitoxin family is represented by pfam06154. 108
49765 369066 pfam06756 S19 Chorion protein S19 C-terminal. This family represents the C-terminal region of eukaryotic chorion protein S19. In Drosophilidae, the S19 gene is known to form part of an autosomal cluster that also contains s16, s15 and s18. Note that members of this family contain a conserved PVA motif, and many contain pfam03964. 72
49766 399618 pfam06757 Ins_allergen_rp Insect allergen related repeat, nitrile-specifier detoxification. This family exemplifies a case of novel gene evolution. The case in point is the arms-race between plants and their infective insective herbivores in the area of the glucosinolate-myrosinase system. Brassicas have developed the glucosinolate-myrosinase system as chemical defense mechanism against the insects, and consequently the insects have adapted to produce a detoxifying molecule, nitrile-specifier protein (NSP). NSP is present in the small white butterfly Pieris rapae. NSP is structurally different from and has no amino acid homology to any known detoxifying enzymes, and it appears to have arisen by a process of domain and gene duplication of a sequence of unknown function that is widespread in insect species and referred to as insect-allergen-repeat protein. Thus this family is found either as a single domain or as a multiple repeat-domain. 173
49767 399619 pfam06758 DUF1220 Repeat of unknown function (DUF1220). 66
49768 399620 pfam06760 DUF1221 Protein of unknown function (DUF1221). This is a family of plant proteins, most of which are hypothetical and of unknown function. All members contain the pfam00069 domain, suggesting that they may possess kinase activity. 215
49769 399621 pfam06761 IcmF-related Intracellular multiplication and human macrophage-killing. This family represents a conserved region within several bacterial proteins that resemble IcmF, which has been proposed to be involved in Vibrio cholerae cell surface reorganisation, resulting in increased adherence to epithelial cells and increased conjugation frequency. Note that many family members are hypothetical proteins. 304
49770 399622 pfam06762 LMF1 Lipase maturation factor. This family of transmembrane proteins includes the lipase maturation factor, LMF1. Lipoprotein lipase and hepatic lipase require LMF1 to fold into their active states. The precise role of LMF1 in lipase folding has yet to be determined. 378
49771 284235 pfam06763 Minor_tail_Z Prophage minor tail protein Z (GPZ). This family consists of several prophage minor tail protein Z like sequences from Escherichia coli, Salmonella typhimurium and Lambda-like bacteriophages. 190
49772 399623 pfam06764 DUF1223 Protein of unknown function (DUF1223). This family consists of several hypothetical proteins of around 250 residues in length which are found in both plants and bacteria. The function of this family is unknown. Structurally it lies in the Thioredoxin-like superfamily. 201
49773 399624 pfam06766 Hydrophobin_2 Fungal hydrophobin. This is a family of fungal hydrophobins that seems to be restricted to ascomycetes. These are small, moderately hydrophobic extracellular proteins that have eight cysteine residues arranged in a strictly conserved motif. Hydrophobins are generally found on the outer surface of conidia and of the hyphal wall, and may be involved in mediating contact and communication between the fungus and its environment. Note that some family members contain multiple copies. 65
49774 399625 pfam06767 Sif Sif protein. This family consists of several SifA and SifB and SseJ proteins which seem to be specific to the Salmonella species. SifA, SifB and SseJ have been demonstrated to localize to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function. 336
49775 284238 pfam06769 YoeB_toxin YoeB-like toxin of bacterial type II toxin-antitoxin system. YoeB_toxin is a family of bacterial toxins that forms one component of the type II toxin-antitoxin system in E. coli whose antitoxin is represented by YefM, found in pfam02604. The plasmid encoded Axe-Txe proteins in Enterococcus faecium act as an antitoxin-toxin pair. When the plasmid is lost, the antitoxin is degraded relatively quickly by host enzymes. This allows the toxin to interact with its intracellular target, thus killing the cell or impeding cell growth. These toxins are highly potent protein synthesis inhibitors, specifically blocking the initiation of translation. In the case of YoeB, it binds to the 50 S ribosomal subunit in 70 S ribosomes and interacts with the A site leading to mRNA cleavage at this site. As a result, the 3'-end portion of the mRNA is released from ribosomes, and translation initiation is effectively inhibited. 80
49776 284239 pfam06770 Arif-1 Actin-rearrangement-inducing factor (Arif-1). This family consists of several Nucleopolyhedrovirus actin-rearrangement-inducing factor (Arif-1) proteins. In response to Autographa californica multicapsid nuclear polyhedrosis virus (AcMNPV) infection, a sequential rearrangement of the actin cytoskeleton occurs; this is induced by Arif-1. Arif-1 is tyrosine phosphorylated and is located at the plasma membrane as a component of the actin rearrangement-inducing complex. 205
49777 284240 pfam06771 Desmo_N Viral Desmoplakin N-terminus. This family represents the N-terminus of viral desmoplakin. Desmoplakin is a component of mature desmosomes, which are the main adhesive junctions in epithelia and cardiac muscle. Desmoplakin is also essential for the maturation of adherens junctions. Note that many family members are hypothetical. 97
49778 399626 pfam06772 LtrA Bacterial low temperature requirement A protein (LtrA). This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes. 352
49779 399627 pfam06773 Bim_N Bim protein N-terminus. This family represents the N-terminal region of several mammal specific Bim proteins. The Bim protein is one of the BH3-only proteins, members of the Bcl-2 family that have only one of the Bcl-2 homology regions, BH3. BH3-only proteins are essential initiators of apoptotic cell death. 40
49780 399628 pfam06775 Seipin Putative adipose-regulatory protein (Seipin). Seipin is a protein of approximately 400 residues, in humans, which is the product of a gene homologous to the murine guanine nucleotide-binding protein (G protein) gamma-3 linked gene. This gene is implicated in the regulation of body fat distribution and insulin resistance and particularly in the auto-immune disease Berardinelli-Seip congenital lipodystrophy type 2. Seipin has no similarity with other known proteins or consensus motifs that might predict its function, but it is predicted to contain two transmembrane domains at residues 28-49 and 237-258, in human, and a third transmembrane domain might be present at residues 155-173. Seipin may also be implicated in Silver spastic paraplegia syndrome and distal hereditary motor neuropathy type V. 194
49781 399629 pfam06776 IalB Invasion associated locus B (IalB) protein. This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is upregulated in response to environmental cues signaling vector-to-host transmission. Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aide B. bacilliformis survival under stress-inducing environmental conditions. The role of this protein in other bacterial species is unknown. 134
49782 399630 pfam06777 HBB Helical and beta-bridge domain. HBB is the domain on DEAD-box eukaryotic DNA repair helicases (EC:3.6.1.-) that appears to be a unique fold. Its conformation is of alpha-helices 12-16 plus a short beta-bridge to the FeS-cluster domain at the N-terminal. The full-length XPD protein verifies the presence of damage to DNA and allows DNA repair to proceed. XPD is an assembly of several domains to form a doughnut-shaped molecule that is able to separate two DNA strands and scan the DNA for damage. HBB helps to form the overall DNA-clamping architecture. This family represents a conserved region within a number of eukaryotic DNA repair helicases (EC:3.6.1.-). 190
49783 399631 pfam06778 Chlor_dismutase Chlorite dismutase. This family contains chlorite dismutase enzymes of bacterial and archaeal origin. This enzyme catalyzes the disproportionation of chlorite into chloride and oxygen. Note that many family members are hypothetical proteins. 190
49784 399632 pfam06779 MFS_4 Uncharacterized MFS-type transporter YbfB. This family represents putative bacterial membrane proteins which may be sugar transporters. Members carry twelve transmembrane regions which are characteristic of members of the major facilitator sugar-transporter superfamily. 365
49785 311003 pfam06780 Erp_C Erp protein C-terminus. This family represents the C-terminus of bacterial Erp proteins that seem to be specific to Borrelia burgdorferi (a causative agent of Lyme disease). Borrelia Erp proteins are particularly heterogeneous, which might enable them to interact with a wide variety of host components. 140
49786 399633 pfam06781 CrgA Cell division protein CrgA. CrgA is a trans-membrane (TM) protein, first described in Streptomyces as being required for sporulation through the coordination of several aspects of reproductive growth. In Mtb (Mycobacterium tuberculosis) CrgA is a central component of the divisome, and consists of 93 residues with two predicted TM helices (TM1: residues 29-51; and TM2: residues 66-88). CrgA facilitates the recruitment of the proteins essential for peptidoglycan synthesis to the divisome and also stabilizes the divisome. Reduced production of CrgA results in elongated cells and reduced growth rate, and loss of CrgA impairs peptidoglycan synthesis. CrgA has homologs in other actinomycetes. 88
49787 284250 pfam06782 UPF0236 Uncharacterized protein family (UPF0236). 479
49788 399634 pfam06783 UPF0239 Uncharacterized protein family (UPF0239). 83
49789 399635 pfam06784 UPF0240 Uncharacterized protein family (UPF0240). 167
49790 399636 pfam06785 UPF0242 Uncharacterized protein family (UPF0242). 194
49791 399637 pfam06786 UPF0253 Uncharacterized protein family (UPF0253). 65
49792 399638 pfam06787 UPF0254 Uncharacterized protein family (UPF0254). 162
49793 399639 pfam06788 UPF0257 Uncharacterized protein family (UPF0257). 235
49794 399640 pfam06789 UPF0258 Uncharacterized protein family (UPF0258). 148
49795 284258 pfam06790 UPF0259 Uncharacterized protein family (UPF0259). 248
49796 399641 pfam06791 TMP_2 Prophage tail length tape measure protein. This family represents a conserved region located towards the N-terminal end of prophage tail length tape measure protein (TMP). TMP is important for assembly of phage tails and involved in tail length determination. Mutated forms of TMP cause tail fibers to be shortened. 205
49797 399642 pfam06792 UPF0261 Uncharacterized protein family (UPF0261). 400
49798 399643 pfam06793 UPF0262 Uncharacterized protein family (UPF0262). 152
49799 399644 pfam06794 UPF0270 Uncharacterized protein family (UPF0270). 67
49800 284263 pfam06795 Erythrovirus_X Erythrovirus X protein. This family consists of several Erythrovirus X proteins which seem to be found exclusively in human parvovirus and human erythrovirus. The function of this family is unknown. 81
49801 369084 pfam06796 NapE Periplasmic nitrate reductase protein NapE. This family consists of several bacterial periplasmic nitrate reductase NapE proteins. Seven genes, napKEFDABC, encoding the periplasmic nitrate reductase system were cloned from the denitrifying phototrophic bacterium Rhodobacter sphaeroides f. sp. denitrificans IL106. NapE is thought to be a transmembrane protein. 53
49802 369085 pfam06797 DUF1229 Protein of unknown function (DUF1229). This family consists of several hypothetical proteins of around 415 residues in length which seem to be specific to the bacterium Leptospira interrogans. 146
49803 399645 pfam06798 PrkA PrkA serine protein kinase C-terminal domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This family corresponds to the C-terminal domain. 252
49804 399646 pfam06799 DUF1230 Protein of unknown function (DUF1230). This family consists of several hypothetical plant and photosynthetic bacterial proteins of around 160 residues in length. The function of this family is unknown although looking at the species distribution the protein may play a part in photosynthesis. 141
49805 284268 pfam06800 Sugar_transport Sugar transport protein. This is a family of bacterial sugar transporters approximately 300 residues long. Members include glucose uptake proteins, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes. These members are transmembrane proteins which are usually 5+5 duplications. This model recognizes a set of five TMs. 281
49806 284269 pfam06802 DUF1231 Protein of unknown function (DUF1231). This family consists of several Orthopoxvirus specific proteins predominantly of around 340 residues in length. This family contains both B17 and B15 proteins, the function of which are unknown. 340
49807 399647 pfam06803 DUF1232 Protein of unknown function (DUF1232). This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function. 37
49808 399648 pfam06804 Lipoprotein_18 NlpB/DapX lipoprotein. This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential. 292
49809 284272 pfam06805 Lambda_tail_I Bacteriophage lambda tail assembly protein I. This family consists of tail assembly proteins from lambdoid and T1 phages and related prophages, e.g. the tail assembly protein I (TAPI). Members of this family contain a core ubiquitin fold domain. The exact function of TAPI is not clear but it is not incorporated into the mature tail. Gene neighborhoods reveal that TAPI co-occurs with genes encoding the host-specificity protein TapJ, and TapK, which contains a JAB metallopeptidase fused to an NlpC/P60 peptidase. It is proposed that the TAPI protein is processed by the peptidase domains of TapK. 82
49810 399649 pfam06806 DUF1233 Putative excisionase (DUF1233). This family consists of several putative phage excisionase proteins of around 80 residues in length. 70
49811 399650 pfam06807 Clp1 Pre-mRNA cleavage complex II protein Clp1. This family consists of several pre-mRNA cleavage complex II Clp1 (or HeaB) proteins. Six different protein factors are required in vitro for 3' end formation of mammalian pre-mRNAs by endonucleolytic cleavage and polyadenylation. Clp1 is a subunit of cleavage complex IIA, which is required for cleavage, but not for polyadenylation of pre-mRNA. 112
49812 284275 pfam06808 DctM Tripartite ATP-independent periplasmic transporter, DctM component. This family contains a diverse range of predicted transporter proteins, including the DctM subunit of the bacterial and archaeal TRAP C4-dicarboxylate transport (Dct) system permease. In general, C4-dicarboxylate transport systems allow C4-dicarboxylates like succinate, fumarate, and malate to be taken up. TRAP C4-dicarboxylate carriers are secondary carriers that use an electrochemical H+ gradient as the driving force for transport. DctM is an integral membrane protein that is one of the constituents of TRAP carriers. Note that many family members are hypothetical proteins. 413
49813 284276 pfam06809 NPDC1 Neural proliferation differentiation control-1 protein (NPDC1). This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the C. elegans protein CAB-1, which is known to interact with AEX-3. 352
49814 399651 pfam06810 Phage_GP20 Phage minor structural protein GP20. This family consists of several phage minor structural protein GP20 sequences of around 180 residues in length. The function of this family is unknown. 148
49815 399652 pfam06812 ImpA_N ImpA, N-terminal, type VI secretion system. This family represents a conserved region located towards the N-terminal end of ImpA and related proteins. ImpA is an inner membrane protein, which has been suggested to be involved with proteins that are exported and associated with colony variations in Actinobacillus actinomycetemcomitans. The ImpA gene in Vibrio cholerae and many other bacteria is expressed from the virulence factor operon which produces the pathogenic injection, type VI secretion system; although the exact function of this gene-product is not known it appears to be an essential component of the pathogenic effect. 115
49816 284279 pfam06813 Nodulin-like Nodulin-like. This family represents a conserved region within plant nodulin-like proteins. 250
49817 369089 pfam06814 Lung_7-TM_R Lung seven transmembrane receptor. This family represents a conserved region with eukaryotic lung seven transmembrane receptors and related proteins. 294
49818 399653 pfam06815 RVT_connect Reverse transcriptase connection domain. This domain is known as the connection domain. This domain lies between the thumb and palm domains. 102
49819 399654 pfam06816 NOD NOTCH protein. NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. Role of NOD domain remains to be elucidated. 55
49820 399655 pfam06817 RVT_thumb Reverse transcriptase thumb domain. This domain is known as the thumb domain. It is composed of a four helix bundle. 66
49821 399656 pfam06818 Fez1 Fez1. This family represents the eukaryotic Fez1 protein. Fez1 contains a leucine-zipper region with similarity to the DNA-binding domain of the cAMP-responsive activating-transcription factor 5. There is evidence that Fez1 inhibits cancer cell growth through regulation of mitosis, and that its alterations result in abnormal cell growth. Note that some family members contain more than one copy of this region. 198
49822 399657 pfam06819 Arc_PepC Archaeal Peptidase A24 C-terminal Domain. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by JPred. 112
49823 148432 pfam06820 Phage_fiber_C Putative prophage tail fibre C-terminus. This family represents the C-terminus of a prophage tail fibre protein found mostly in E. coli. All family members contain a conserved RLGP motif. 64
49824 399658 pfam06821 Ser_hydrolase Serine hydrolase. Members of this family have serine hydrolase activity. They contain a conserved serine hydrolase motif, GXSXG/A, where the serine is a putative nucleophile. This family has an alpha-beta hydrolase fold. Eukaryotic members of this family have a conserved LXCXE motif, which binds to retinoblastomas. This motif is absent from prokaryotic members of this family. 171
49825 284287 pfam06822 DUF1235 Protein of unknown function (DUF1235). This family contains a number of viral proteins of unknown function. 261
49826 399659 pfam06823 DUF1236 Protein of unknown function (DUF1236). This family contains a number of hypothetical bacterial proteins of unknown function. Some family members contain more than one copy of the region represented by this family. 64
49827 399660 pfam06824 Glyco_hydro_125 Metal-independent alpha-mannosidase (GH125). This family, which contains bacterial and fungal glycoside hydrolases, is also known as GH125. They function as metal-independent alpha-mannosidases, with specificity for alpha-1,6-linked non-reducing terminal mannose residues. Structurally this family is part of the 6 hairpin glycosidase superfamily. 416
49828 399661 pfam06825 HSBP1 Heat shock factor binding protein 1. Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response. 51
49829 284291 pfam06826 Asp-Al_Ex Predicted Permease Membrane Region. This family represents five transmembrane helices that are normally found flanking (five either side) a pair of pfam02080 domains. This suggests that the paired regions form a ten helical structure, probably forming the pore, whereas the pfam02080 binds a ligand for export or regulation of the pore. Tetragenococcus halophilus aspT is described as an aspartate-alanine antiporter. In conjunction with aspD it forms a 'proton motive metabolic cycle catalyzed by an aspartate-alanine exchange'. The general conservation of domain architecture in this family suggests that they are functional orthologues. 167
49830 399662 pfam06827 zf-FPG_IleRS Zinc finger found in FPG and IleRS. This zinc binding domain is found at the C-terminus of isoleucyl tRNA synthetase and the enzyme Formamidopyrimidine-DNA glycosylase EC:3.2.2.23. 28
49831 399663 pfam06830 Root_cap Root cap. The cells at the periphery of the root cap are continuously sloughed off from the root into the mucilage, and are thought to be programmed to die. This family represents a conserved region approximately 60 residues in length within plant root cap proteins, which may be involved in the process. 57
49832 399664 pfam06831 H2TH Formamidopyrimidine-DNA glycosylase H2TH domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two turn-helix domain. 89
49833 399665 pfam06832 BiPBP_C Penicillin-Binding Protein C-terminus Family. This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (pfam00905). It is predicted by PROF to be an all beta fold. 90
49834 399666 pfam06833 MdcE Malonate decarboxylase gamma subunit (MdcE). This family consists of several bacterial malonate decarboxylase gamma subunit proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyzes the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcD and E together probably function as malonyl-S-ACP decarboxylase. 232
49835 399667 pfam06834 TraU TraU protein. This family consists of several bacterial TraU proteins. TraU appears to be more essential to conjugal DNA transfer than to assembly of pilus filaments. 306
49836 399668 pfam06835 LptC Lipopolysaccharide-assembly, LptC-related. This family consists of several related groups of proteins one of which is the LptC family. LptC is involved in lipopolysaccharide-assembly on the outer membrane of Gram-negative organisms. The lipopolysaccharide component of the outer bacterial membrane is transported from its source of origin to the outer membrane by a set of proteins constituting a transport machinery that is made up of LptA, LptB, LptC, LptD, LptE. LptC is located on the inner membrane side of the intermembrane space. 176
49837 336520 pfam06836 DUF1240 Protein of unknown function (DUF1240). This family consists of a number of hypothetical putative membrane proteins which seem to be specific to Yersinia pestis. The function of this family is unknown. 95
49838 284300 pfam06837 Fijivirus_P9-2 Fijivirus P9-2 protein. This family consists of several Fijivirus specific P9-2 proteins from Rice black streaked dwarf virus (RBSDV) and Fiji disease virus. The function of this family is unknown. 207
49839 399669 pfam06838 Met_gamma_lyase Methionine gamma-lyase. This is a putative pyridoxal 5'-phosphate-dependent methionine gamma-lyase enzyme involved in methionine catabolism. 405
49840 399670 pfam06839 zf-GRF GRF zinc finger. This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. 45
49841 399671 pfam06840 DUF1241 Protein of unknown function (DUF1241). This family consists of several programmed cell death 10 protein (PDCD10 or TFAR15) sequences. The function of this family is unknown. 150
49842 399672 pfam06841 Phage_T4_gp19 T4-like virus tail tube protein gp19. This family consists of several tail tube protein gp19 sequences from the T4-like viruses. This family also contains bacterial members which suggest lateral transfer of genes. 134
49843 399673 pfam06842 DUF1242 Protein of unknown function (DUF1242). This family consists of a number of eukaryotic proteins of around 72 residues in length. The function of this family is unknown. 35
49844 399674 pfam06844 DUF1244 Protein of unknown function (DUF1244). This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown. 65
49845 377719 pfam06847 Arc_PepC_II Archaeal Peptidase A24 C-terminus Type II. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by Prof. 93
49846 399675 pfam06848 Disaggr_repeat Disaggregatase related repeat. This family consists of several repeats which seem to be specific to the Methanosarcina archaea species and are often found in multiple copies in disaggregatase proteins. Members of this family are also found in single copies in several hypothetical proteins. This repeat is also known as DNRLRE repeat and is predicted to form a mainly beta-strand structure with two alpha-helices [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. It is found in some cell surface proteins. 179
49847 399676 pfam06849 DUF1246 Protein of unknown function (DUF1246). This family represents the N-terminus of a number of hypothetical archaeal proteins of unknown function. This family is structurally related to the PreATP-grasp domain. 122
49848 399677 pfam06850 PHB_depo_C PHB de-polymerase C-terminus. This family represents the C-terminus of bacterial poly(3-hydroxybutyrate) (PHB) de-polymerase. This degrades PHB granules to oligomers and monomers of 3-hydroxy-butyric acid. 203
49849 284311 pfam06851 DUF1247 Protein of unknown function (DUF1247). This family contains a number of hypothetical viral proteins of unknown function approximately 200 residues long. 149
49850 369108 pfam06852 DUF1248 Protein of unknown function (DUF1248). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to C. elegans. Note that some family members contain more than one copy of this region. 181
49851 399678 pfam06853 DUF1249 Protein of unknown function (DUF1249). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 116
49852 399679 pfam06854 Phage_Gp15 Bacteriophage Gp15 protein. This family consists of bacteriophage Gp15 proteins and related bacterial sequences. The function of this family is unknown. 172
49853 399680 pfam06855 YozE_SAM_like YozE SAM-like fold. YozE-like is a family of Firmicute proteins that carries a four-helix motif similar to sterile alpha motif (SAM) domains. The family is suggested to fall into two subfamilies, possibly with differing functions based on the different surface charges on the three structural representatives, YozE MW0776 and MW1311. What this function is is not yet known although it is likely to involve binding to DNA. 66
49854 369111 pfam06856 DUF1251 Protein of unknown function (DUF1251). This family consists of the N-terminal region of several hypothetical Nucleopolyhedrovirus proteins of unknown function. 121
49855 399681 pfam06857 ACP Malonate decarboxylase delta subunit (MdcD). This family consists of several bacterial malonate decarboxylase delta subunit (MdcD) proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyzes the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcC is the (apo) ACP subunit. The family also contains the CitD family of citrate lyase acyl carrier proteins. 83
49856 399682 pfam06858 NOG1 Nucleolar GTP-binding protein 1 (NOG1). This family represents a conserved region of approximately 60 residues in length within nucleolar GTP-binding protein 1 (NOG1). In S. cerevisiae, the NOG1 gene has been shown to be essential for cell viability, suggesting that NOG1 may play an important role in nucleolar functions. Family members include eukaryotic, bacterial and archaeal proteins. 57
49857 399683 pfam06859 Bin3 Bicoid-interacting protein 3 (Bin3). This family represents a conserved region of approximately 120 residues within eukaryotic Bicoid-interacting protein 3 (Bin3). Bin3, which shows similarity to a number of protein methyltransferases that modify RNA-binding proteins, interacts with Bicoid, which itself directs pattern formation in the early Drosophila embryo. The interaction might allow Bicoid to switch between its dual roles in transcription and translation. Note that family members contain a conserved HLN motif. 108
49858 115513 pfam06861 BALF1 BALF1 protein. This family consists of several BALF1 proteins which seem to be specific to the Lymphocryptoviruses. BALF1 inhibits the antiapoptotic activity of EBV BHRF1 and of KSBcl-2. 184
49859 369113 pfam06862 UTP25 Utp25, U3 small nucleolar RNA-associated SSU processome protein 25. UTP25 is a family of eukaryotic proteins. The family displays limited sequence similarity to DEAD-box RNA helicases, having alternative residues at the Walker A and DEAD-box sites, but conservation of structural and other key residues. The domain is required and sufficient for the interaction of Utp25 with Utp3. UTP25 interacts with nucleolar protein Nop19 in S. cerevisiae, and Nop19p is essential for the incorporation of Utp25p into pre-ribosomes. 471
49860 399684 pfam06863 DUF1254 Protein of unknown function (DUF1254). This family represents a conserved region about 130 residues long within hypothetical proteins of unknown function. Family members include eukaryotic, bacterial and archaeal proteins. 131
49861 399685 pfam06864 PAP_PilO Pilin accessory protein (PilO). This family consists of several enterobacterial PilO proteins. The function of PilO is unknown although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins, but PilO protein is translocated to the outer membrane in the presence of other Pil proteins. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. This family does not seem to be related to pfam04350. 412
49862 399686 pfam06865 DUF1255 Protein of unknown function (DUF1255). This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 91
49863 115518 pfam06866 DUF1256 Protein of unknown function (DUF1256). This family consists of several uncharacterized bacterial proteins which seem to be specific to the orders Clostridia and Bacillales. Family members are typically around 180 residues in length. The function of this family is unknown. These proteins are related to peptidase family M63 and so may be peptidases. 164
49864 399687 pfam06868 DUF1257 Protein of unknown function (DUF1257). This family contains hypothetical proteins of unknown function that are approximately 120 residues long. Family members include eukaryotic and bacterial proteins. 103
49865 369115 pfam06869 DUF1258 Protein of unknown function (DUF1258). This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to C. elegans. Note that this family contains a number of conserved cysteine and histidine residues. 250
49866 399688 pfam06870 RNA_pol_I_A49 A49-like RNA polymerase I associated factor. Saccharomyces cerevisiae A49 is a specific subunit associated with RNA polymerase I (Pol I) in eukaryotes. Pol I maintains transcription activities in A49 deletion mutants. However, such mutants are deficient in transcription activity at low temperatures. Deletion analysis of the fission yeast homolog indicates that only the C-terminal two thirds are required for function. Transcript analysis has demonstrated that A49 is maximising transcription of ribosomal DNA. 380
49867 284325 pfam06871 TraH_2 TraH_2. This family consists of several TraH proteins which seem to be specific to Agrobacterium and Rhizobium species. This protein is thought to be involved in conjugal transfer but its function is unknown. This family does not appear to be related to pfam06122. 207
49868 399689 pfam06872 EspG EspG protein. This family consists of several EspG like proteins from Citrobacter rodentium and Escherichia coli. EspG is secreted by the type III secretory system and is translocated into host epithelial cells. EspG is homologous with Shigella flexneri protein VirA and can rescue invasion in a Shigella virA mutant, indicating that these proteins are functionally equivalent in Shigella. EspG plays an accessory but as yet undefined role in EPEC virulence that may involve intestinal colonisation. 351
49869 284327 pfam06873 SerH Cell surface immobilisation antigen SerH. This family consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of Tetrahymena thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag). 418
49870 399690 pfam06874 FBPase_2 Firmicute fructose-1,6-bisphosphatase. This family consists of several bacterial fructose-1,6-bisphosphatase proteins (EC:3.1.3.11) which seem to be specific to phylum Firmicutes. Fructose-1,6-bisphosphatase (FBPase) is a well known enzyme involved in gluconeogenesis. This family does not seem to be structurally related to pfam00316. 638
49871 115526 pfam06875 PRF Plethodontid receptivity factor PRF. This family consists of several plethodontid receptivity factor (PRF) proteins which seem to be specific to Plethodon jordani (Jordan's salamander). PRF is a courtship pheromone produced by males to increase female receptivity. 214
49872 369118 pfam06876 SCRL Plant self-incompatibility response (SCRL) protein. This family consists of several Plant self-incompatibility response (SCRL) proteins. The male component of the self-incompatibility response in Brassica has been shown to be encoded by the S locus cysteine-rich gene (SCR). SCR is related, at the sequence level, to the pollen coat protein (PCP) gene family whose members encode small, cysteine-rich proteins located in the proteo-lipidic surface layer (tryphine) of Brassica pollen grains. 67
49873 399691 pfam06877 RraB Regulator of ribonuclease activity B. This family of proteins regulate mRNA abundance by binding to RNaseE and inhibiting its endonucleolytic activity. A subset of these proteins are predicted to function as immunity proteins. 97
49874 399692 pfam06878 Pkip-1 Pkip-1 protein. This family consists of several Pkip-1 proteins which seem to be specific to Nucleopolyhedroviruses. The function of this family is unknown although it has been found that Pkip-1 is not essential for virus replication in cell culture or by in vivo intrahaemocoelic injection. 163
49875 399693 pfam06880 DUF1262 Protein of unknown function (DUF1262). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that some family members contain more than one copy of this region. 101
49876 399694 pfam06881 Elongin_A RNA polymerase II transcription factor SIII (Elongin) subunit A. This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin. 105
49877 399695 pfam06882 DUF1263 Protein of unknown function (DUF1263). This family represents a conserved region located towards the C-terminus of a number of proteins of unknown function that seem to be specific to Oryza sativa. 57
49878 399696 pfam06883 RNA_pol_Rpa2_4 RNA polymerase I, Rpa2 specific domain. This domain is found between domain 3 (pfam04565) and domain 5 (pfam04565), but shows no homology to domain 4 of Rpb2. The external domains in multisubunit RNA polymerase (those most distant from the active site) are known to demonstrate more sequence variability. 53
49879 399697 pfam06884 DUF1264 Protein of unknown function (DUF1264). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 200 residues long. Some family members are annotated as putative lipoproteins. 169
49880 399698 pfam06886 TPX2 Targeting protein for Xklp2 (TPX2). This family represents a conserved region approximately 60 residues long within the eukaryotic targeting protein for Xklp2 (TPX2). Xklp2 is a kinesin-like protein localized on centrosomes throughout the cell cycle and on spindle pole microtubules during metaphase. In Xenopus, it has been shown that Xklp2 protein is required for centrosome separation and maintenance of spindle bi-polarity. TPX2 is a microtubule-associated protein that mediates the binding of the C-terminal domain of Xklp2 to microtubules. It is phosphorylated during mitosis in a microtubule-dependent way. 82
49881 369125 pfam06887 DUF1265 Protein of unknown function (DUF1265). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be restricted to C. elegans. The GO annotation for this protein indicates that it is involved in nematode larval development and positively regulates growth rate. 47
49882 284339 pfam06888 Put_Phosphatase Putative Phosphatase. This family contains a number of putative eukaryotic acid phosphatases. Some family members represent the products of the PSI14 phosphatase family in Lycopersicon esculentum (Tomato). 234
49883 399699 pfam06889 DUF1266 Protein of unknown function (DUF1266). This family consists of several hypothetical bacterial proteins of around 235 residues in length. Members of this family seem to be found exclusively in the Enterobacteria Salmonella typhimurium and Escherichia coli. The function of this family is unknown. 174
49884 399700 pfam06890 Phage_Mu_Gp45 Bacteriophage Mu Gp45 protein. This family consists of Bacteriophage Mu Gp45 related proteins from both phages and bacteria. The function of this family is unknown although it has been suggested that family members may be involved in baseplate assembly. 68
49885 399701 pfam06891 P2_Phage_GpR P2 phage tail completion protein R (GpR). This family consists of P2 phage tail completion protein R (GpR) like sequences. GpR is thought to be a tail completion protein which is essential for stable head joining. 131
49886 399702 pfam06892 Phage_CP76 Phage regulatory protein CII (CP76). This family consists of several phage regulatory protein CII (CP76) sequences which are thought to be DNA binding proteins which are involved in the establishment of lysogeny. 155
49887 284344 pfam06894 Phage_TAC_2 Bacteriophage lambda tail assembly chaperone, TAC, protein G. This family consists of Bacteriophage lambda minor tail protein G and related sequences. The construction of phage tails involves a stage of tail-tube formation, and tail-tube polymerization requires two additional proteins, gpG and gpGT. The open reading frames, ORFs, for gpG and gpGT are overlapping and are related by a programmed translational frameshift. During virion morphogenesis, gpG is expressed in large amounts, and about 3.5% of the time, a -1 translational frameshift leads to the production of the larger fusion protein, gpGT. The correct ratio of gpG to gpGT, as determined by the frequency of frameshifting, is crucial for tail assembly. Since gpG accumulates to high levels during a lambda infection and yet is not found in mature phage particles it is believed to act as a chaperone. 126
49888 311073 pfam06896 Phage_TAC_3 Phage tail assembly chaperone proteins, TAC. This is a family of phage tail tube assembly chaperone proteins from some Siphoviridae viruses. 115
49889 399703 pfam06897 DUF1269 Protein of unknown function (DUF1269). This family consists of several bacterial and archaeal proteins of around 200 residues in length. The function of this family is unknown. The family carries a repeated glycine-zipper sequence motif, GxxxGxxxG, where the x following the G is frequently found to be an alanine. As glycine-zippers occur in membrane proteins, this family is likely to be found spanning a membrane. 99
49890 399704 pfam06898 YqfD Putative stage IV sporulation protein YqfD. This family consists of several putative bacterial stage IV sporulation (SpoIV) proteins. YqfD of Bacillus subtilis is known to be essential for efficient sporulation although its exact function is unknown. 379
49891 399705 pfam06899 WzyE WzyE protein, O-antigen assembly polymerase. This family consists of several WzyE proteins which appear to be specific to Enterobacteria. Members of this family are described as putative ECA polymerases, but this has been found to be incorrect. The function of this family is unknown. The family is a transmembrane family with up to 11 TM regions, and is necessary for the assembly of O-antigen lipopolysaccharide. 446
49892 284349 pfam06900 DUF1270 Protein of unknown function (DUF1270). This family consists of several hypothetical Staphylococcus aureus and phage proteins of 53 residues in length. The function of this family is unknown. 53
49893 399706 pfam06901 FrpC RTX iron-regulated protein FrpC. This family consists of several RTX iron-regulated FrpC proteins which appear to be found exclusively in Neisseria meningitidis. FrpC has been shown to be related to the RTX family of bacterial cytotoxins. FrpC is found in the meningococcal outer membrane. The function of this family is unknown although it is thought to be a virulence factor. 228
49894 399707 pfam06902 Fer4_19 Divergent 4Fe-4S mono-cluster. Members of this family contain three highly conserved cysteine residues. This family includes proteins containing divergent domains which are most likely to bind to iron-sulfur clusters. 64
49895 399708 pfam06903 VirK VirK protein. This family consists of several bacterial VirK proteins of around 145 residues in length. The function of this family is unknown. 98
49896 399709 pfam06904 Extensin-like_C Extensin-like protein C-terminus. This family represents the C-terminus (approx. 120 residues) of a number of bacterial extensin-like proteins. Extensins are cell wall glycoproteins normally associated with plants, where they strengthen the cell wall in response to mechanical stress. Note that many family members of this family are hypothetical. 176
49897 399710 pfam06905 FAIM1 Fas apoptotic inhibitory molecule (FAIM1). This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology. 174
49898 399711 pfam06906 DUF1272 Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown. 57
49899 399712 pfam06907 Latexin Latexin. This family consists of several animal specific latexin proteins. Latexin is a carboxypeptidase A inhibitor and is expressed in a cell type-specific manner in both central and peripheral nervous systems in the rat. 216
49900 399713 pfam06908 DUF1273 Protein of unknown function (DUF1273). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 168
49901 399714 pfam06910 MEA1 Male enhanced antigen 1 (MEA1). This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localized in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. Intensive transcription of Mea-1 gene and specific localization of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis. 128
49902 399715 pfam06911 Senescence Senescence-associated protein. This family contains a number of plant senescence-associated proteins of approximately 450 residues in length. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program. This family extends to the higher vertebrates where the full-length protein is often a Spartin, associated with mitochondrial membranes and transportation along microtubules. 179
49903 399716 pfam06912 DUF1275 Protein of unknown function (DUF1275). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although most members have 6 TM regions, and may be putative permeases. 202
49904 369134 pfam06916 DUF1279 Protein of unknown function (DUF1279). This family represents the C-terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function. 88
49905 369135 pfam06917 Pectate_lyase_2 Periplasmic pectate lyase. This family consists of several Enterobacterial periplasmic pectate lyase proteins (EC:4.2.2.2). A major virulence determinant of the plant-pathogenic enterobacterium Erwinia chrysanthemi is the production of pectate lyase enzymes that degrade plant cell walls. 556
49906 369136 pfam06918 DUF1280 Protein of unknown function (DUF1280). This family represents a conserved region approximately 200 residues long within a number of proteins of unknown function that seem to be specific to C. elegans. 219
49907 284364 pfam06919 Phage_T4_Gp30_7 Phage Gp30.7 protein. This family consists of several phage Gp30.7 proteins of 121 residues in length. Family members seem to be exclusively from the T4-like viruses. The function of this family is unknown. 121
49908 399717 pfam06920 DHR-2 Dock homology region 2. This family represents a conserved region within a number of eukaryotic dedicator of cytokinesis proteins. These are potential guanine nucleotide exchange factors, which activate some small GTPases by exchanging bound GDP for free GTP. This region interacts with RAC1 and ELMO1. 489
49909 115570 pfam06922 CTV_P13 Citrus tristeza virus P13 protein. This family consists of several Citrus tristeza virus (CTV) P13 13-kDa proteins. Citrus tristeza virus (CTV), a member of the closterovirus group, is one of the more complex single-stranded RNA viruses. The function of this family is unknown. 119
49910 399718 pfam06923 GutM Glucitol operon activator protein (GutM). This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes. 105
49911 399719 pfam06924 DUF1281 Protein of unknown function (DUF1281). This family consists of several hypothetical enterobacterial proteins of around 170 residues in length. Members of this family are found in Escherichia coli, Salmonella typhimurium and Shigella species. The function of this family is unknown. 179
49912 284368 pfam06925 MGDG_synth Monogalactosyldiacylglycerol (MGDG) synthase. This family represents a conserved region of approximately 180 residues within plant and bacterial monogalactosyldiacylglycerol (MGDG) synthase (EC:2.4.1.46). In Arabidopsis, there are two types of MGDG synthase which differ in their N-terminal portion: type A and type B. 169
49913 284369 pfam06926 Rep_Org_C Putative replisome organizer protein C-terminus. This family represents the C-terminus (approximately 100 residues) of a putative replisome organizer protein in Lactococcus bacteriophages. 95
49914 399720 pfam06929 Rotavirus_VP3 Rotavirus VP3 protein. This family consists of several Rotavirus specific VP3 proteins. VP3 is known to be a viral guanylyltransferase and is thought to possess methyltransferase activity and therefore VP3 is a predicted multifunctional capping enzyme. 687
49915 369140 pfam06930 DUF1282 Protein of unknown function (DUF1282). This family consists of several hypothetical proteins of around 200 residues in length. The function of this family is unknown although a number of family members are thought to be putative membrane proteins. 172
49916 284372 pfam06931 Adeno_E4_ORF3 Mastadenovirus E4 ORF3 protein. This family consists of several Mastadenovirus E4 ORF3 proteins. Early proteins E4 ORF3 and E4 ORF6 have complementary functions during viral infection. Both proteins facilitate efficient viral DNA replication, late protein expression, and prevention of concatenation of viral genomes. A unique function of E4 ORF3 is the reorganisation of nuclear structures known as PML oncogenic domains (PODs). The function of these domains is unclear, but PODs have been implicated in a number of important cellular processes, including transcriptional regulation, apoptosis, transformation, and response to interferon. 113
49917 399721 pfam06932 DUF1283 Protein of unknown function (DUF1283). This family consists of several hypothetical proteins of around 115 residues in length which seem to be specific to Enterobacteria. The function of the family is unknown. 74
49918 115579 pfam06933 SSP160 Special lobe-specific silk protein SSP160. This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species. 758
49919 399722 pfam06934 CTI Fatty acid cis/trans isomerase (CTI). This family consists of several fatty acid cis/trans isomerase proteins which appear to be found exclusively in bacteria of the orders Vibrionales and Pseudomonadales. Cis/trans isomerase (CTI) catalyzes the cis-trans isomerisation of esterified fatty acids in phospholipids, mainly cis-oleic acid (C(16:1,9)) and cis-vaccenic acid (C(18:1,11)), in response to solvents. The CTI protein has been shown to be involved in solvent resistance in Pseudomonas putida. 692
49920 399723 pfam06935 DUF1284 Protein of unknown function (DUF1284). This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins. 102
49921 369143 pfam06936 Selenoprotein_S Selenoprotein S (SelS). This family consists of several mammalian selenoprotein S (SelS) sequences. SelS is a plasma membrane protein and is present in a variety of tissues and cell types. Selenoprotein S (SelS) is an intrinsically disordered protein. It forms a selenosulfide bond between Cys 174 and Sec 188 that has a redox potential of -234 mV. In vitro, SelS is an efficient reductase that depends on the presence of selenocysteine. Due to the high reactivity, SelS also has peroxidase activity that can catalyze the reduction of hydrogen peroxide. It is also resistant to inactivation by hydrogen peroxide which might provide evolutionary advantage compared to cysteine containing peroxidases. 192
49922 284377 pfam06937 EURL EURL protein. This family consists of several animal EURL proteins. EURL is preferentially expressed in chick retinal precursor cells as well as in the anterior epithelial cells of the lens at early stages of development. EURL transcripts are found primarily in the peripheral dorsal retina, i.e., the most undifferentiated part of the dorsal retina. EURL transcripts are also detected in the lens at stage 18 and remain abundant in the proliferating epithelial cells of the lens until at least day 11. The distribution pattern of EURL in the developing retina and lens suggest a role before the events leading to cell determination and differentiation. 283
49923 399724 pfam06938 DUF1285 Protein of unknown function (DUF1285). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. The structures revealed a conserved core with domain duplication and a superficial similarity of the C-terminal domain to pleckstrin homology-like folds. The conservation of the domain interface indicates a potential binding site that is likely to involve a nucleotide-based ligand, with genome-context and gene-fusion analyses additionally supporting a role for this family in signal transduction, possibly during oxidative stress. 145
49924 311095 pfam06939 DUF1286 Protein of unknown function (DUF1286). This family consists of several hypothetical archaeal proteins of around 120 residues in length. All members of this family seem to be Sulfolobus species specific. The function of this family is unknown. 111
49925 399725 pfam06940 DUF1287 Domain of unknown function (DUF1287). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. This family is related to pfam00877. 163
49926 284381 pfam06941 NT5C 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C protein (NT5C). This family consists of several 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C (NT5C) proteins. 5'(3')-Deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known. 180
49927 399726 pfam06942 GlpM GlpM protein. This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa. 107
49928 399727 pfam06943 zf-LSD1 LSD1 zinc finger. This family consists of several plant specific LSD1 zinc finger domains. Arabidopsis lsd1 mutants are hyper-responsive to cell death initiators and fail to limit the extent of cell death. Superoxide is a necessary and sufficient signal for cell death propagation. LSD1 monitors a superoxide-dependent signal and negatively regulates a plant cell death pathway. LSD1 protein contains three zinc finger domains, defined by CxxCxRxxLMYxxGASxVxCxxC. It has been suggested that LSD1 defines a zinc finger protein subclass and that LSD1 regulates transcription, via either repression of a pro-death pathway or activation of an anti-death pathway, in response to signals emanating from cells undergoing pathogen-induced hypersensitive cell death. 25
49929 399728 pfam06945 DUF1289 Protein of unknown function (DUF1289). This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N-terminus. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 47
49930 311099 pfam06946 Phage_holin_5_1 Bacteriophage A118-like holin, Hol118. This family consists of several Listeria bacteriophage holin proteins and related bacterial sequences. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build up of a holin oligomer which causes the lysis. 92
49931 399729 pfam06947 DUF1290 Protein of unknown function (DUF1290). This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown. 86
49932 399730 pfam06949 DUF1292 Protein of unknown function (DUF1292). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 77
49933 284388 pfam06950 DUF1293 Protein of unknown function (DUF1293). This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown. 115
49934 399731 pfam06951 PLA2G12 Group XII secretory phospholipase A2 precursor (PLA2G12). This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids. 186
49935 369145 pfam06952 PsiA PsiA protein. This family consists of several Enterobacterial PsiA proteins. The function of PsiA is unknown although it is thought that it may affect the generation of an SOS signal in Escherichia coli. 237
49936 399732 pfam06953 ArsD Arsenical resistance operon trans-acting repressor ArsD. This family consists of several bacterial arsenical resistance operon trans-acting repressor ArsD proteins. ArsD is a trans-acting repressor of the arsRDABC operon that confers resistance to arsenicals and antimonials in Escherichia coli. It possesses two-pairs of vicinal cysteine residues, Cys(12)-Cys(13) and Cys(112)-Cys(113), that potentially form separate binding sites for the metalloids that trigger dissociation of ArsD from the operon. However, as a homodimer it has four vicinal cysteine pairs. 120
49937 399733 pfam06954 Resistin Resistin. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes. 85
49938 399734 pfam06955 XET_C Xyloglucan endo-transglycosylase (XET) C-terminus. This family represents the C-terminus (approximately 60 residues) of plant xyloglucan endo-transglycosylase (XET). Xyloglucan is the predominant hemicellulose in the cell walls of most dicotyledons. With cellulose, it forms a network that strengthens the cell wall. XET catalyzes the splitting of xyloglucan chains and the linking of the newly generated reducing end to the non-reducing end of another xyloglucan chain, thereby loosening the cell wall. Note that all family members contain the pfam00722 domain. 48
49939 399735 pfam06956 RtcR Regulator of RNA terminal phosphate cyclase. RtcR is a sigma54-dependent enhancer binding protein that activates transcription of the rtcBA operon. The product of the rtcA gene is an RNA 3'-terminal phosphate cyclase. This domain is found at the N-terminus of the RtcR sequence. RtcR, and other sigma54-dependent activators, contain pfam00158 in the central region of the protein sequence. 183
49940 399736 pfam06957 COPI_C Coatomer (COPI) alpha subunit C-terminus. This family represents the C-terminus (approximately 500 residues) of the eukaryotic coatomer alpha subunit. Coatomer (COPI) is a large cytosolic protein complex which forms a coat around vesicles budding from the Golgi apparatus. Such coatomer-coated vesicles have been proposed to play a role in many distinct steps of intracellular transport. Note that many family members also contain the pfam04053 domain. 403
49941 399737 pfam06958 Pyocin_S S-type Pyocin. This family represents a conserved region approximately 180 residues long within bacterial S-type pyocins. Pyocins are polypeptide toxins produced by, and active against, bacteria. S-type pyocins cause cell death by DNA breakdown due to endonuclease activity. 139
49942 399738 pfam06959 RecQ5 RecQ helicase protein-like 5 (RecQ5). This family represents a conserved region approximately 200 residues long within eukaryotic RecQ helicase protein-like 5 (RecQ5). The RecQ helicases have been implicated in DNA repair and recombination, and RecQ5 may have an important role in DNA metabolism. 202
49943 399739 pfam06961 DUF1294 Protein of unknown function (DUF1294). This family includes a number of hypothetical bacterial and archaeal proteins of unknown function. 55
49944 399740 pfam06962 rRNA_methylase Putative rRNA methylase. This family contains a number of putative rRNA methylases. Note that many family members are hypothetical proteins. 135
49945 311113 pfam06963 FPN1 Ferroportin1 (FPN1). This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1. 430
49946 399741 pfam06964 Alpha-L-AF_C Alpha-L-arabinofuranosidase C-terminal domain. This family represents the C-terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase (EC:3.2.1.55). This catalyzes the hydrolysis of nonreducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. 192
49947 399742 pfam06965 Na_H_antiport_1 Na+/H+ antiporter 1. This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyze the exchange of H+ for Na+ in a manner that is highly dependent on the pH. 378
49948 399743 pfam06966 DUF1295 Protein of unknown function (DUF1295). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long. 235
49949 399744 pfam06967 Mo-nitro_C Mo-dependent nitrogenase C-terminus. This family represents the C-terminus (approximately 80 residues) of a number of bacterial Mo-dependent nitrogenases. These are involved in nitrogen fixation in cyanobacteria. Note that many family members are hypothetical proteins. 83
49950 399745 pfam06968 BATS Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB), EC:2.8.1.6, catalyzes the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this family) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerization (Finn, RD personal observation). 85
49951 399746 pfam06969 HemN_C HemN C-terminal domain. Members of this family are all oxygen-independent coproporphyrinogen-III oxidases (HemN). This enzyme catalyzes the oxygen-independent conversion of coproporphyrinogen-III to protoporphyrinogen-IX, one of the last steps in haem biosynthesis. The function of this domain is unclear, but comparison to other proteins containing a radical SAM domain (pfam04055) suggest it may be a substrate binding domain. 66
49952 399747 pfam06970 RepA_N Replication initiator protein A (RepA) N-terminus. This family of predicted proteins represents the N-terminus (approximately 80 residues) of replication initiator protein A (RepA), a DNA replication initiator in plasmids. Most proteins in this family are bacterial, but archaeal and eukaryotic members are also included. 76
49953 399748 pfam06971 Put_DNA-bind_N Putative DNA-binding protein N-terminus. This family represents the N-terminus (approximately 50 residues) of a number of putative bacterial DNA-binding proteins. 49
49954 399749 pfam06972 DUF1296 Protein of unknown function (DUF1296). This family represents a conserved region approximately 60 residues long within a number of plant proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids. 60
49955 399750 pfam06973 DUF1297 Domain of unknown function (DUF1297). This family represents the C-terminus (approximately 200 residues) of a number of archaeal proteins of unknown function. One member is annotated as being a possible carboligase enzyme. 188
49956 399751 pfam06974 DUF1298 Protein of unknown function (DUF1298). This family represents the C-terminus (approximately 170 residues) of a number of hypothetical plant proteins of unknown function. 144
49957 115620 pfam06975 DUF1299 Protein of unknown function (DUF1299). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that many family members contain multiple copies of this region. 47
49958 399752 pfam06977 SdiA-regulated SdiA-regulated. This family represents a conserved region approximately within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the pfam01436 repeat. 249
49959 399753 pfam06978 POP1 Ribonucleases P/MRP protein subunit POP1. This family represents a conserved region approximately 150 residues long located towards the N-terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins (EC:3.1.26.5). These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5' ends. 211
49960 399754 pfam06979 TMEM70 Assembly, mitochondrial proton-transport ATP synth complex. TMEM70 is a family of proteins essential for assembly of the mitochondrial proton-transporting ATP synthase complex within the inner mitochondrial membrane. 132
49961 399755 pfam06980 DUF1302 Protein of unknown function (DUF1302). This family contains a number of hypothetical bacterial proteins of unknown function that are approximately 600 residues long. Most family members seem to be from Pseudomonas. 569
49962 399756 pfam06983 3-dmu-9_3-mt 3-demethylubiquinone-9 3-methyltransferase. This family represents a conserved region approximately 100 residues long within a number of bacterial and archaeal 3-demethylubiquinone-9 3-methyltransferases (EC:2.1.1.64). Note that some family members contain more than one copy of this region, and that many members are hypothetical proteins. 116
49963 369158 pfam06984 MRP-L47 Mitochondrial 39-S ribosomal protein L47 (MRP-L47). This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure. 86
49964 369159 pfam06985 HET Heterokaryon incompatibility protein (HET). This family represents a conserved region approximately 150 residues long within various heterokaryon incompatibility proteins that seem to be restricted to ascomycete fungi. Genetic differences in specific het genes prevent a viable heterokaryotic fungal cell from being formed by the fusion of filaments from two different wild-type strains. Many family members also contain the pfam00400 repeat and the pfam05729 domain. 146
49965 399757 pfam06986 TraN Type-IV conjugative transfer system mating pair stabilisation. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilisation (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein. 239
49966 399758 pfam06988 NifT NifT/FixU protein. This family consists of several NifT/FixU bacterial proteins. NifT/FixU is a very small, conserved protein that is found in nif clusters; however, its function is unknown. It is thought that the protein may be involved in biosynthesis of the FeMo cofactor of nitrogenase, although perturbation of nifT expression in K. pneumoniae has only a limited effect on nitrogen fixation. 64
49967 369161 pfam06989 BAALC_N BAALC N-terminus. This family represents the N-terminal region of the mammalian BAALC proteins. BAALC (brain and acute leukaemia, cytoplasmic) is highly conserved among mammals but evidently absent from lower organisms. Two isoforms are specifically expressed in neuroectoderm-derived tissues, but not in tumors or cancer cell lines of non-neural tissue origin. It has been shown that blasts from a subset of patients with acute leukaemia greatly overexpress eight different BAALC transcripts, resulting in five protein isoforms. Among patients with acute myeloid leukaemia, those overexpressing BAALC show distinctly poor prognosis, pointing to a key role of the BAALC products in leukaemia. It has been suggested that BAALC is a gene implicated in both neuroectodermal and hematopoietic cell functions. 50
49968 254007 pfam06990 Gal-3-0_sulfotr Galactose-3-O-sulfotransferase. This family consists of several mammalian galactose-3-O-sulfotransferase proteins. Gal-3-O-sulfotransferase is thought to play a critical role in 3'-sulfation of N-acetyllactosamine in both O- and N-glycans. 400
49969 399759 pfam06991 MFAP1 Microfibril-associated/Pre-mRNA processing. MFAP1 was first named for proteins associated with microfibrils which are an important component of the extracellular matrix (ECM) of many tissues. For example, MFAP1 has been shown to be associated with elastin-like fibers at the base of Schlemm's canal endothelium cells, in the juxtacanalicular tissue, and in the uveal region. Based on its role in the ECM and the proximity of the MFAP1 gene to FBN1 it was hypothesized that mutations in MFAP1 contributed to heritable diseases affecting microfibrils, such as Marfan syndrome, but this has now been shown not to be the case. MFAP1 has also been shown to interact directly with certain pre-mRNA processing factor proteins, Prps, which are also spliceosome components, and is thus required for pre-mRNA processing. MFAP1 bound to Pr38 of yeast is necessary for cells in vivo to progress from G2 to M phase. 214
49970 399760 pfam06992 Phage_lambda_P Replication protein P. This family consists of several Bacteriophage lambda replication protein P like proteins. The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin. 162
49971 399761 pfam06993 DUF1304 Protein of unknown function (DUF1304). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 110
49972 399762 pfam06994 Involucrin2 Involucrin. This family represents a conserved region approximately 60 residues long, multiple copies of which are found within eukaryotic involucrin, and which is rich in glutamine and glutamic acid residues. Involucrin forms part of the insoluble cornified cell envelope (a specialized protective barrier) of stratified squamous epithelia. Members of this family seem to be restricted to mammals. 41
49973 399763 pfam06995 Phage_P2_GpU Phage P2 GpU. This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein, which is thought to be involved in tail assembly. 120
49974 399764 pfam06996 T6SS_TssG Type VI secretion, TssG. This is a family of Gram-negative bacterial proteins that form part of the type VI pathogenicity secretion system (T6SS), including TssG. TssG is homologous to phage tail proteins and is required for proper assembly of the Hcp tube in bacteria. One other member in this family, SciB (Q93IT4) from Salmonella enterica, is thought to be involved in virulence. 303
49975 399765 pfam06998 DUF1307 Protein of unknown function (DUF1307). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown. 114
49976 399766 pfam06999 Suc_Fer-like Sucrase/ferredoxin-like. This family contains a number of bacterial and eukaryotic proteins approximately 400 residues long that resemble ferredoxin and appear to have sucrolytic activity. 217
49977 399767 pfam07000 DUF1308 Protein of unknown function (DUF1308). This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown. 163
49978 399768 pfam07001 BAT2_N BAT2 N-terminus. This family represents the N-terminus (approximately 200 residues) of the proline-rich protein BAT2. BAT2 is similar to other proteins with large proline-rich domains, such as some nuclear proteins, collagens, elastin, and synapsin. 189
49979 284432 pfam07002 Copine Copine. This family represents a conserved region approximately 220 residues long within eukaryotic copines. Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth. 218
49980 311141 pfam07004 SHIPPO-rpt Sperm-tail PG-rich repeat. This family represents a short conserved region carrying a PGP motif that is repeated in eukaryotic proteins of sperm-tails. Shippo orthologues from some species may include up to 40 Pro-Gly-Pro repeats. 33
49981 399769 pfam07005 DUF1537 Putative sugar-binding N-terminal domain. This conserved region is found in proteins of unknown function in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini interacting to produce an interacting surface into which a threonate-ADP complex is bound, suggesting that a sugar binding site is on the N-terminal domain here, and a nucleotide binding site is in the C-terminal domain (manuscript in preparation). There is a critical motif, DDXTG, at approximately residues 22-25. 230
49982 399770 pfam07006 DUF1310 Protein of unknown function (DUF1310). This family consists of several hypothetical proteins of around 125 residues in length. Members of this family seem to be specific to Listeria and Streptococcus species. The function of this family is unknown. 116
49983 399771 pfam07007 LprI Lysozyme inhibitor LprI. This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. Family members include lipoprotein LprI from Mycobacterium, which binds to and inhibits macrophage lysozyme, which may aid bacterial survival. 103
49984 399772 pfam07009 NusG_II NusG domain II. This domain is found in some NusG proteins where it forms domain II. However most NusG proteins are missing this domain. In other cases this domain is found in isolation. The function of this domain is unknown. 107
49985 399773 pfam07010 Endomucin Endomucin. This family consists of several mammalian endomucin proteins. Endomucin is an early endothelial-specific antigen that is also expressed on putative hematopoietic progenitor cells. 260
49986 399774 pfam07011 DUF1313 Protein of unknown function (DUF1313). This family consists of several hypothetical plant proteins of around 100 residues in length. The function of this family is unknown. 83
49987 399775 pfam07012 Curlin_rpt Curlin associated repeat. This family consists of several bacterial repeats of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibers are thin aggregative surface fibers, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibers are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibers is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB. 34
49988 284441 pfam07013 DUF1314 Protein of unknown function (DUF1314). This family consists of several Alphaherpesvirus proteins of around 200 residues in length. The function of this family is unknown. 197
49989 399776 pfam07014 Hs1pro-1_C Hs1pro-1 protein C-terminus. This family represents the C-terminus (approximately 270 residues) of a number of plant Hs1pro-1 proteins, which are believed to confer nematode resistance. 258
49990 148565 pfam07015 VirC1 VirC1 protein. This family consists of several bacterial VirC1 proteins. In Agrobacterium tumefaciens, a cis-active 24-base-pair sequence adjacent to the right border of the T-DNA, called overdrive, stimulates tumor formation by increasing the level of T-DNA processing. It is thought that the virC operon which enhances T-DNA processing probably does so because the VirC1 protein interacts with overdrive. It has now been shown that the virC1 gene product binds to overdrive but not to the right border of T-DNA. 231
49991 284443 pfam07016 CRAM_rpt Cysteine-rich acidic integral membrane protein precursor. This family consists of several 24 residue repeats from the Trypanosoma brucei cysteine-rich, acidic integral membrane protein precursor (CRAM). CRAM is concentrated in the flagellar pocket, an invagination of the cell surface of the trypanosome where endocytosis has been documented. 22
49992 399777 pfam07017 PagP Antimicrobial peptide resistance and lipid A acylation protein PagP. This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response. 146
49993 399778 pfam07019 Rab5ip Rab5-interacting protein (Rab5ip). This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5. 79
49994 115660 pfam07020 Orthopox_C10L Orthopoxvirus C10L protein. This family consists of several Orthopoxvirus C10L proteins. C10L viral protein can play an important role in vaccinia virus evasion of the host immune system. This may involve the blockade of IL-1 receptors by the C10L protein, a homolog of the IL-1 Ra. 83
49995 399779 pfam07021 MetW Methionine biosynthesis protein MetW. This family consists of several bacterial and one archaeal methionine biosynthesis MetW proteins. Biosynthesis of methionine from homoserine in Pseudomonas putida takes place in three steps. The first step is the acylation of homoserine to yield an acyl-L-homoserine. This reaction is catalyzed by the products of the metXW genes and is equivalent to the first step in enterobacteria, gram-positive bacteria and fungi, except that in these microorganisms the reaction is catalyzed by a single polypeptide (the product of the metA gene in Escherichia coli and the met5 gene product in Neurospora crassa). In Pseudomonas putida, as in gram-positive bacteria and certain fungi, the second and third steps are a direct sulfhydrylation that converts the O-acyl-L-homoserine into homocysteine and further methylation to yield methionine. The latter reaction can be mediated by either of the two methionine synthetases present in the cells. 193
49996 311152 pfam07022 Phage_CI_repr Bacteriophage CI repressor helix-turn-helix domain. This family consists of several phage CI repressor proteins and related bacterial sequences. The CI repressor is known to function as a transcriptional switch, determining whether transcription is lytic or lysogenic. 65
49997 399780 pfam07023 DUF1315 Protein of unknown function (DUF1315). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown. 81
49998 399781 pfam07024 ImpE ImpE protein. This family consists of several bacterial proteins including ImpE from Rhizobium leguminosarum. It has been suggested that the imp locus is involved in the secretion to the environment of proteins, including periplasmic RbsB protein, that cause blocking of infection specifically in pea plants. The exact function of this family is unknown. 122
49999 284449 pfam07026 DUF1317 Protein of unknown function (DUF1317). This family consists of several hypothetical bacterial and phage proteins of around 60 residues in length. The function of this family is unknown. 60
50000 399782 pfam07027 DUF1318 Protein of unknown function (DUF1318). This family consists of several bacterial proteins of around 100 residues in length and is often known as YdbL. The function of this family is unknown. 86
50001 369174 pfam07028 DUF1319 Protein of unknown function (DUF1319). This family contains a number of viral proteins of unknown function approximately 200 residues long. Family members seem to be restricted to badnaviruses. 109
50002 369175 pfam07029 CryBP1 CryBP1 protein. This family consists of several CryBP1 like proteins from Bacillus thuringiensis and Paenibacillus popilliae. Members of this family are thought to be involved in the overall toxicity of the bacteria to their hosts. 180
50003 399783 pfam07030 DUF1320 Protein of unknown function (DUF1320). This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown. 109
50004 369177 pfam07032 DUF1322 Protein of unknown function (DUF1322). This family consists of several hypothetical 9.4 kDa Borrelia burgdorferi (Lyme disease spirochete) proteins of around 78 residues in length. The function of this family is unknown. 73
50005 284455 pfam07033 Orthopox_B11R Orthopoxvirus B11R protein. This family consists of several Orthopoxvirus B11R proteins of around 70 residues in length. The function of this family is unknown. 71
50006 399784 pfam07034 ORC3_N Origin recognition complex (ORC) subunit 3 N-terminus. This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication. 330
50007 399785 pfam07035 Mic1 Colon cancer-associated protein Mic1-like. This family represents the C-terminus (approximately 160 residues) of a number of proteins that resemble colon cancer-associated protein Mic1. 157
50008 399786 pfam07037 DUF1323 Putative transcription regulator (DUF1323). This family consists of several hypothetical Enterobacterial proteins of around 120 residues in length. This family appears to have an HTH domain and is therefore likely to act as a transcriptional regulator. 122
50009 70500 pfam07038 DUF1324 Protein of unknown function (DUF1324). This family consists of several Circovirus proteins of around 60 residues in length. The function of this family is unknown. 59
50010 399787 pfam07039 DUF1325 SGF29 tudor-like domain. This domain is found in the yeast protein SAGA-associated factor 29. This domain is related to members of the Tudor domain superfamily such as pfam05641. The SAGA complex is involved in RNA polymerase II-dependent transcriptional regulation. The membership of the tudor domain superfamily suggests this domain may bind to RNA. 131
50011 399788 pfam07040 DUF1326 Protein of unknown function (DUF1326). This family consists of several hypothetical bacterial proteins which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N-terminus. The function of this family is unknown. 175
50012 311164 pfam07041 DUF1327 Protein of unknown function (DUF1327). This family consists of several hypothetical bacterial proteins of around 115 residues in length which seem to be specific to Escherichia coli. The function of this family is unknown. 113
50013 115680 pfam07042 TrfA TrfA protein. This family consists of several bacterial TrfA proteins. The trfA operon of broad-host-range IncP plasmids is essential to activate the origin of vegetative replication in diverse species. The trfA operon encodes two ORFs. The first ORF is highly conserved and encodes a putative single-stranded DNA binding protein (Ssb). The second, trfA, contains two translational starts as in the IncP alpha plasmids, generating related polypeptides of 406 (TrfA1) and 282 (TrfA2) amino acids. TrfA2 is very similar to the IncP alpha product, whereas the N-terminal region of TrfA1 shows very little similarity to the equivalent region of IncP alpha TrfA1. This region has been implicated in the ability of IncP alpha plasmids to replicate efficiently in Pseudomonas aeruginosa. 282
50014 399789 pfam07043 DUF1328 Protein of unknown function (DUF1328). This family consists of several hypothetical bacterial proteins of around 50 residues in length. The function of this family is unknown. 38
50015 399790 pfam07044 DUF1329 Protein of unknown function (DUF1329). This family consists of several hypothetical bacterial proteins of around 475 residues in length. The majority of family members are from Pseudomonas species but the family also contains sequences from Shewanella oneidensis and Thauera aromatica. 366
50016 399791 pfam07045 DUF1330 Domain of unknown function (DUF1330). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 94
50017 336588 pfam07046 CRA_rpt Cytoplasmic repetitive antigen (CRA) like repeat. This family consists of several repeats of around 42 residues in length. These repeated sequences are found in multiple copies in Trypanosoma cruzi antigens; the cytoplasmic repetitive antigen (CRA) protein contains 23 copies of this repeat. 42
50018 399792 pfam07047 OPA3 Optic atrophy 3 protein (OPA3). This family consists of several optic atrophy 3 (OPA3) proteins. OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity. 125
50019 115686 pfam07048 DUF1331 Protein of unknown function (DUF1331). This family consists of several Circovirus proteins of around 35 residues in length. Members of this family are described as ORF-10 proteins and their function is unknown. 35
50020 399793 pfam07051 OCIA Ovarian carcinoma immunoreactive antigen (OCIA). This family consists of several ovarian carcinoma immunoreactive antigen (OCIA) and related eukaryotic sequences. The function of this family is unknown. 87
50021 399794 pfam07052 Hep_59 Hepatocellular carcinoma-associated antigen 59. This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins. 102
50022 369186 pfam07054 Pericardin_rpt Pericardin like repeat. This family consists of several repeated sequences of around 34 residues in length. This repeat is found in multiple copies in the Drosophila pericardin and other extracellular matrix proteins. 35
50023 399795 pfam07055 Eno-Rase_FAD_bd Enoyl reductase FAD binding domain. This family carries the region of the enzyme trans-2-enoyl-CoA reductase, at the very C-terminus, that binds to FAD. The activity was characterized in Euglena where an unusual fatty acid synthesis pathway in mitochondria performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation. The full enzyme catalyzes the reduction of enoyl-CoA to acyl-CoA. The conserved region is seen as the motif FGFxxxxxDY. 64
50024 399796 pfam07056 DUF1335 Protein of unknown function (DUF1335). This family represents a conserved region approximately 130 residues long within a number of proteins of unknown function that seem to be specific to the white spot syndrome virus (WSSV). 131
50025 399797 pfam07057 TraI DNA helicase TraI. This family represents a conserved region approximately 130 residues long within the bacterial DNA helicase TraI (EC:3.6.1.-). TraI is a bifunctional protein that catalyzes the unwinding of duplex DNA as well as acts as a sequence-specific DNA trans-esterase, providing the site- and strand-specific nick required to initiate DNA transfer. 123
50026 399798 pfam07058 MAP70 Microtubule-associated protein 70. This family represents a family of plant microtubule-associated proteins of size 70 kDa. The proteins contain four predicted coiled-coil domains, and truncation studies identify a central domain that targets the proteins to microtubules. It has no predicted trans-membrane domains, and the region between the coils from approximately residues 240-483 is the targeting region. 544
50027 399799 pfam07059 DUF1336 Protein of unknown function (DUF1336). This family represents the C-terminus (approximately 250 residues) of a number of hypothetical plant proteins of unknown function. 211
50028 399800 pfam07061 Swi5 Swi5. Swi5 is involved in meiotic DNA repair synthesis and meiotic joint molecule formation. It is known to interact with Swi2, Rhp51 and Swi6. 79
50029 311178 pfam07062 Clc-like Clc-like. This family contains a number of Clc-like proteins that are approximately 250 residues long. 212
50030 399801 pfam07063 DUF1338 Domain of unknown function (DUF1338). This domain is found in a variety of bacterial and fungal hypothetical proteins of unknown function. The structure of this domain has been solved by structural genomics. The structure implies a zinc-binding function, so it is a putative metal hydrolase (information derived from TOPSAN for Structure 3iuz). 322
50031 399802 pfam07064 RIC1 RIC1. RIC1 has been identified in yeast as a Golgi protein involved in retrograde transport to the cis-Golgi network. It forms a heterodimer with Rgp1 and functions as a guanyl-nucleotide exchange factor. 248
50032 399803 pfam07065 D123 D123. This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate. D123 proteins are regulators of eIF2, the central regulator of translational initiation. 300
50033 369194 pfam07066 DUF3882 Lactococcus phage M3 protein. This family consists of several Lactococcus phage middle-3 (M3) proteins of around 160 residues in length. The function of this family is unknown. 159
50034 115703 pfam07067 DUF1340 Protein of unknown function (DUF1340). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 235 residues in length. The function of this family is unknown. 236
50035 399804 pfam07068 Gp23 Major capsid protein Gp23. This family contains a number of major capsid Gp23 proteins approximately 500 residues long, from T4-like bacteriophages. 449
50036 115705 pfam07069 PRRSV_2b Porcine reproductive and respiratory syndrome virus 2b. This family consists of several Porcine reproductive and respiratory syndrome virus (PRRSV) ORF2b proteins. The function of this family is unknown; however, it is known that large amounts of 2b protein are present in the virion and it is thought that this protein may be an integral component of the virion. 73
50037 399805 pfam07070 Spo0M SpoOM protein. This family consists of several bacterial SpoOM proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH. 203
50038 399806 pfam07071 KDGP_aldolase KDGP aldolase. DgaF is part of the dga operon required for wild-type growth of Salmonella Typhimurium with D-glucosaminate. It catalyzes the conversion of keto-3-deoxygluconate 6-phosphate (KDGP) to yield pyruvate and glyceraldehyde-3-phosphate. Orthologues of the dga genes are largely restricted to certain enteric bacteria and a few species in the phylum Firmicutes. 217
50039 399807 pfam07072 ZapD Cell division protein. Cell division protein ZapD enhances FtsZ-ring assembly. It directly interacts with FtsZ and promotes bundling of FtsZ protofilaments, with a reduction in FtsZ GTPase activity. 210
50040 399808 pfam07073 ROF Modulator of Rho-dependent transcription termination (ROF). This family consists of several bacterial modulator of Rho-dependent transcription termination (ROF) proteins. ROF binds transcription termination factor Rho and inhibits Rho-dependent termination in vivo. 80
50041 399809 pfam07074 TRAP-gamma Translocon-associated protein, gamma subunit (TRAP-gamma). This family consists of several eukaryotic translocon-associated protein, gamma subunit (TRAP-gamma) sequences. The translocation site (translocon), at which nascent polypeptides pass through the endoplasmic reticulum membrane, contains a component previously called 'signal sequence receptor' that is now renamed as 'translocon-associated protein' (TRAP). The TRAP complex is comprised of four membrane proteins alpha, beta, gamma and delta which are present in a stoichiometric relation, and are genuine neighbors in intact microsomes. The gamma subunit is predicted to span the membrane four times. 170
50042 399810 pfam07075 DUF1343 Protein of unknown function (DUF1343). This family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown. 362
50043 399811 pfam07076 DUF1344 Protein of unknown function (DUF1344). This family consists of several short, hypothetical bacterial proteins of around 80 residues in length. Members of this family are found in Rhizobium, Agrobacterium and Brucella species. The function of this family is unknown. 59
50044 399812 pfam07077 DUF1345 Protein of unknown function (DUF1345). This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown. 171
50045 399813 pfam07078 FYTT Forty-two-three protein. This family consists of several mammalian proteins of around 320 residues in length called 40-2-3 proteins. The function of this family is unknown. 308
50046 284489 pfam07079 DUF1347 Protein of unknown function (DUF1347). This family consists of several hypothetical bacterial proteins of around 610 residues in length. Members of this family are highly conserved and seem to be specific to Chlamydia species. The function of this family is unknown. 548
50047 399814 pfam07080 DUF1348 Protein of unknown function (DUF1348). This family consists of several highly conserved hypothetical proteins of around 150 residues in length. The function of this family is unknown. 130
50048 399815 pfam07081 DUF1349 Protein of unknown function (DUF1349). This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown. 168
50049 115718 pfam07082 DUF1350 Protein of unknown function (DUF1350). This family consists of several hypothetical proteins from both cyanobacteria and plants. Members of this family are typically around 250 residues in length. The function of this family is unknown but the species distribution indicates that the family may be involved in photosynthesis. 250
50050 399816 pfam07083 DUF1351 Protein of unknown function (DUF1351). This family consists of several bacterial and phage proteins of around 230 residues in length. The function of this family is unknown. 210
50051 399817 pfam07084 Spot_14 Thyroid hormone-inducible hepatic protein Spot 14. This family consists of several thyroid hormone-inducible hepatic protein (Spot 14 or S14) sequences. Mainly expressed in tissues that synthesize triglycerides, the mRNA coding for Spot 14 has been shown to be increased in rat liver by insulin, dietary carbohydrates, glucose in hepatocyte culture medium, as well as thyroid hormone. In contrast, dietary fats and polyunsaturated fatty acids have been shown to decrease the amount of Spot 14 mRNA, while an elevated level of cAMP acts as a dominant negative factor. In addition, liver-specific factors or chromatin organisation of the gene have been shown to contribute to the regulation of its expression. Spot 14 protein is thought to be required for induction of hepatic lipogenesis. 144
50052 399818 pfam07085 DRTGG DRTGG domain. This presumed domain is about 120 amino acids in length. It is found associated with CBS domains pfam00571, as well as the CbiA domain pfam01656. The function of this domain is unknown. It is named the DRTGG domain after some of the most conserved residues. This domain may be very distantly related to a pair of CBS domains. There are no significant sequence similarities, but its length and association with CBS domains supports this idea (Bateman A, pers. obs.). 105
50053 399819 pfam07086 Jagunal Jagunal, ER re-organisation during oogenesis. Jagunal is an endoplasmic-reticulum (ER)-membrane protein found in eukaryotes. It is involved in reorganising the ER in cells that must increase exocytic membrane traffic during development, that is, in the oocyte during vitellogenesis. It facilitates vesicular traffic in the subcortex. 186
50054 399820 pfam07087 DUF1353 Protein of unknown function (DUF1353). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown. 91
50055 311196 pfam07088 GvpD GvpD gas vesicle protein. This family consists of several archaeal GvpD gas vesicle proteins. GvpD is thought to be involved in the regulation of gas vesicle formation. 484
50056 399821 pfam07090 GATase1_like Putative glutamine amidotransferase. This family consists of several hypothetical bacterial proteins of around 250 residues in length. The function of this family is unknown. The structure of this cytoplasmic domain was solved by the Midwest Center for Structural Genomics (MCSG). The structure has been classified as part of the Class-I Glutamine amidotransferase superfamily owing to similarity with other known structures. The monomer combines with itself to form a hexamer, and this hexamer exposes a potential catalytic surface rich in Glu, Asp, Tyr, Ser, Trp and His residues. 246
50057 336604 pfam07091 FmrO Ribosomal RNA methyltransferase (FmrO). This family consists of several bacterial ribosomal RNA methyltransferase (aminoglycoside-resistance methyltransferase) proteins. 252
50058 399822 pfam07092 DUF1356 Protein of unknown function (DUF1356). This family consists of several hypothetical mammalian proteins of around 250 residues in length. The function of this family is unknown. 225
50059 399823 pfam07093 SGT1 SGT1 protein. This family consists of several eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known to suppress GCR2 and is highly expressed in the muscle and heart. The function of this family is unknown although it has been speculated that SGT1 may be functionally analogous to the Gcr2p protein of Saccharomyces cerevisiae which is known to be a regulatory factor of glycolytic gene expression. 583
50060 284502 pfam07094 DUF1357 Protein of unknown function (DUF1357). This family consists of several hypothetical bacterial proteins of around 225 residues in length. Members of this family appear to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 223
50061 399824 pfam07095 IgaA Intracellular growth attenuator protein IgaA. This family consists of several bacterial intracellular growth attenuator (IgaA) proteins. IgaA is involved in negative control of bacterial proliferation within fibroblasts. IgaA is homologous to the E. coli YrfF and P. mirabilis UmoB proteins. Whereas the biological function of YrfF is currently unknown, UmoB has been shown elsewhere to act as a positive regulator of FlhDC, the master regulator of flagella and swarming. FlhDC has been shown to repress cell division during P. mirabilis swarming, suggesting that UmoB could repress cell division via FlhDC. This biological function, if maintained in S. enterica, could sustain a putative negative control of cell division and growth exerted by IgaA in intracellular bacteria. 696
50062 369206 pfam07096 DUF1358 Protein of unknown function (DUF1358). This family consists of several hypothetical eukaryotic proteins of around 125 residues in length. The function of this family is unknown. 115
50063 284505 pfam07097 DUF1359 Protein of unknown function (DUF1359). This family consists of several hypothetical bacterial and phage proteins of around 100 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this species. The function of this family is unknown. 104
50064 399825 pfam07098 DUF1360 Protein of unknown function (DUF1360). This family consists of several bacterial proteins of around 115 residues in length. Members of this family are found in Bacillus species and Streptomyces coelicolor. The function of the family is unknown. 102
50065 399826 pfam07099 DUF1361 Protein of unknown function (DUF1361). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins. 166
50066 399827 pfam07100 ASRT Anabaena sensory rhodopsin transducer. The family of bacterial Anabaena sensory rhodopsin transducers are likely to bind sugars or related metabolites. The entire protein is comprised of a single globular domain with an eight-stranded beta-sandwich fold. There are a few characteristics which define this beta-sandwich fold as being distinct from other so-named folds, and these are: 1) a well conserved tryptophan, usually following a polar residue, present at the start of the first strand; this tryptophan appears to be central to a hydrophobic interaction required to hold the two beta-sheets of the sandwich together, and 2) a nearly absolutely conserved asparagine located at the end of the second beta-strand, that hydrogen bonds with the backbone carbonyls of the residues 2 and 4 positions downstream from it, thereby stabilizing the characteristic tight turn between strands 2 and 3 of the structure. 119
50067 115736 pfam07101 DUF1363 Protein of unknown function (DUF1363). This family consists of several Trypanosoma brucei putative variant specific antigen proteins of around 80 residues in length. 124
50068 377777 pfam07102 DUF1364 Protein of unknown function (DUF1364). This family consists of several bacterial and phage proteins of around 95 residues in length. The function of this family is unknown. 91
50069 399828 pfam07103 DUF1365 Protein of unknown function (DUF1365). This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown. 227
50070 336610 pfam07104 DUF1366 Protein of unknown function (DUF1366). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 130 residues in length. One of the sequences in this family, from phage Sfi11 is known as Gp149. The function of this family is unknown. 116
50071 369209 pfam07105 DUF1367 Protein of unknown function (DUF1367). This family consists of several highly conserved, hypothetical phage proteins of around 200 residues in length. The function of this family is unknown. Some proteins are annotated as IrsA (intracellular response to stress). 192
50072 369210 pfam07106 TBPIP Tat binding protein 1(TBP-1)-interacting protein (TBPIP). This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat, then modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation. 61
50073 284513 pfam07107 WI12 Wound-induced protein WI12. This family consists of several plant wound-induced protein sequences related to WI12 from Mesembryanthemum crystallinum. Wounding, methyl jasmonate, and pathogen infection are known to induce local WI12 expression. WI12 expression is also thought to be developmentally controlled in the placenta and developing seeds. WI12 preferentially accumulates in the cell wall and it has been suggested that it plays a role in the reinforcement of cell wall composition after wounding and during plant development. This family seems partly related to the NTF2-like superfamily. 109
50074 369211 pfam07108 PipA PipA protein. This family consists of several Salmonella PipA (pathogenicity island-encoded protein A) and related phage sequences. PipA is thought to contribute to enteric but not to systemic salmonellosis. The family carries a highly conserved HEXXH sequence motif along with several highly conserved glutamic acid residues which might be indicative of the family being a metallo-peptidase. 200
50075 399829 pfam07109 Mg-por_mtran_C Magnesium-protoporphyrin IX methyltransferase C-terminus. This family represents the C-terminus (approximately 100 residues) of bacterial and eukaryotic Magnesium-protoporphyrin IX methyltransferase (EC:2.1.1.11). This converts magnesium-protoporphyrin IX to magnesium-protoporphyrin IX methylester using S-adenosyl-L-methionine as a cofactor. 97
50076 399830 pfam07110 EthD EthD domain. This family consists of several bacterial sequences which are related to the EthD protein of Rhodococcus ruber. In Rhodococcus ruber, EthD is thought to be involved in the degradation of ethyl tert-butyl ether (ETBE). EthD synthesis is induced by ETBE but its exact function is unknown; it is, however, thought to be essential to the ETBE degradation system. 95
50077 284517 pfam07111 HCR Alpha helical coiled-coil rod protein (HCR). This family consists of several mammalian alpha helical coiled-coil rod HCR proteins. The function of HCR is unknown but it has been implicated in psoriasis in humans and is thought to affect keratinocyte proliferation. 749
50078 254061 pfam07112 DUF1368 Protein of unknown function (DUF1368). This family consists of several proteins which seem to be specific to red algae plasmids. Members of this family are typically around 415 residues in length. The function of this family is unknown. 404
50079 399831 pfam07114 TMEM126 Transmembrane protein 126. This entry includes the transmembrane protein 126 A/B (TMEM126A/B) from animals. Human TMEM126B participates in constructing the membrane arm of mitochondrial respiratory complex I. 176
50080 254062 pfam07116 DUF1372 Protein of unknown function (DUF1372). This family consists of several Streptococcus bacteriophage sequences and related proteins from Streptococcus species. Members of this family are typically around 100 residues in length and their function is unknown. 104
50081 399832 pfam07117 DUF1373 Protein of unknown function (DUF1373). This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown. 213
50082 399833 pfam07118 DUF1374 Protein of unknown function (DUF1374). This family consists of several hypothetical Sulfolobus virus proteins of around 100 residues in length. The function of this family is unknown. 92
50083 399834 pfam07119 DUF1375 Protein of unknown function (DUF1375). This family consists of several hypothetical, putative lipoproteins of around 80 residues in length. Members of this family seem to be specific to the Class Gammaproteobacteria. The function of this family is unknown. 53
50084 399835 pfam07120 DUF1376 Protein of unknown function (DUF1376). This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 87
50085 284524 pfam07122 VLPT Variable length PCR target protein (VLPT). This family consists of a number of 29 residue repeats which seem to be specific to the Ehrlichia chaffeensis variable length PCR target (VLPT) protein. Ehrlichia chaffeensis is a tick-transmitted rickettsial agent and is responsible for human monocytic ehrlichiosis (HME). The function of this family is unknown. 30
50086 399836 pfam07123 PsbW Photosystem II reaction centre W protein (PsbW). This family consists of several plant specific photosystem II reaction centre W (PsbW) proteins. PsbW is a nuclear-encoded protein located in the thylakoid membrane of the chloroplast. PsbW is a core component of photosystem II but not photosystem I. This family does not appear to be related to pfam03912. 129
50087 399837 pfam07124 Phytoreo_P8 Phytoreovirus outer capsid protein P8. This family consists of several Phytoreovirus outer capsid protein P8 sequences. 427
50088 115758 pfam07125 DUF1378 Protein of unknown function (DUF1378). This family consists of hypothetical bacterial and phage proteins of around 59 residues in length. Bacterial members of this family seem to be specific to Enterobacteria. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 59
50089 399838 pfam07126 ZapC Cell-division protein ZapC. ZapC is one of four FtsZ-binding components of the Z ring in bacteria. Formation of the Z ring on the cytoplasmic surface of the membrane is the starting process for assembly of the cell-division apparatus. It binds directly to the Z ring, and although it is not essential for absolute cell division it contributes to it by enhancing the interactions between the FtsZ protofilaments (or polymers) which aggregate to form the ring conformation in the Z ring. 169
50090 399839 pfam07127 Nodulin_late Late nodulin protein. This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues towards the C-terminus which may be involved in metal-binding. 54
50091 399840 pfam07128 DUF1380 Protein of unknown function (DUF1380). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be specific to Enterobacteria. The function of this family is unknown. 137
50092 369223 pfam07129 DUF1381 Protein of unknown function (DUF1381). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus bacteriophage proteins of around 65 residues in length. The function of this family is unknown. 44
50093 399841 pfam07130 YebG YebG protein. This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS. 72
50094 369224 pfam07131 DUF1382 Protein of unknown function (DUF1382). This family consists of several hypothetical Escherichia coli and bacteriophage lambda-like proteins of around 60 residues in length. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 60
50095 284533 pfam07133 Merozoite_SPAM Merozoite surface protein (SPAM). This family consists of several Plasmodium falciparum SPAM (secreted polymorphic antigen associated with merozoites) proteins. Variation among SPAM alleles is the result of deletions and amino acid substitutions in non-repetitive sequences within and flanking the alanine heptad-repeat domain. Heptad repeats in which the a and d position contain hydrophobic residues generate amphipathic alpha-helices which give rise to helical bundles or coiled-coil structures in proteins. SPAM is an example of a P. falciparum antigen in which a repetitive sequence has features characteristic of a well-defined structural element. 182
50096 254068 pfam07134 DUF1383 Protein of unknown function (DUF1383). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 375 residues in length. The function of this family is unknown. 328
50097 399842 pfam07136 DUF1385 Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent. 231
50098 399843 pfam07137 VDE VDE lipocalin domain. This family represents a conserved region approximately 150 residues long within plant violaxanthin de-epoxidase (VDE). In higher plants, violaxanthin de-epoxidase forms part of a conserved system that dissipates excess energy as heat in the light-harvesting complexes of photosystem II (PSII), thus protecting them from photo-inhibitory damage. 240
50099 369226 pfam07138 DUF1386 Protein of unknown function (DUF1386). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 350 residues in length. The function of this family is unknown. 334
50100 399844 pfam07139 DUF1387 Protein of unknown function (DUF1387). This family represents a conserved region approximately 300 residues long within a number of hypothetical proteins of unknown function that seem to be restricted to mammals. 313
50101 284538 pfam07140 IFNGR1 Interferon gamma receptor (IFNGR1). This family consists of several eukaryotic and viral interferon gamma receptor proteins. Molecular interactions among cytokines and cytokine receptors in eukaryotes form the basis of many cell-signaling pathways relevant to immune function. Human interferon-gamma (IFN-gamma) signals through a multimeric receptor complex consisting of two different but structurally related transmembrane chains: the high-affinity receptor-binding subunit (IFN-gammaRalpha) and a species specific accessory factor (AF-1 or IFN-gammaRbeta). The vaccinia viral interferon gamma receptor has been shown to be secreted from infected cells during early infection. The structure has been halved such that the N-terminus of this family is now represented by Tissue_fac pfam01108. 133
50102 369228 pfam07141 Phage_term_sma Putative bacteriophage terminase small subunit. This family consists of several putative Lactococcus bacteriophage terminase small subunit proteins. The exact function of this family is unknown. 174
50103 399845 pfam07142 DUF1388 Repeat of unknown function (DUF1388). This family consists of several repeats of around 29 residues in length. Members of this family are found in the variable surface lipoproteins in Mycoplasma bovis and in mammalian neurofilament triplet H (NefH or NF-H) proteins. This repeat contains several Lys-Ser-Pro (KSP) motifs and in NefH these are thought to function as the main target for neurofilament directed protein kinases in vivo. 27
50104 399846 pfam07143 CrtC CrtC N-terminal lipocalin domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of family member NE1406 (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication. 175
50105 399847 pfam07145 PAM2 Ataxin-2 C-terminal region. The PABP-interacting motif PAM2 has been identified in various eukaryotic proteins as an important binding site for pfam00658. It has been found in a wide range of eukaryotic proteins. Strikingly, this motif appears to occur solely outside of globular domains. 16
50106 254076 pfam07146 DUF1389 Protein of unknown function (DUF1389). This family consists of several hypothetical bacterial proteins which seem to be specific to Chlamydia pneumoniae. Members of this family are typically around 400 residues in length. The function of this family is unknown. 311
50107 399848 pfam07147 PDCD9 Mitochondrial 28S ribosomal protein S30 (PDCD9). This family consists of several eukaryotic mitochondrial 28S ribosomal protein S30 (or programmed cell death protein 9 PDCD9) sequences. The exact function of this family is unknown although it is known to be a component of the mitochondrial ribosome and a component in cellular apoptotic signaling pathways. 441
50108 399849 pfam07148 MalM Maltose operon periplasmic protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown. 134
50109 399850 pfam07149 Pes-10 Pes-10. This family consists of several Caenorhabditis elegans pes-10 and related proteins. Members of this family are typically around 400 residues in length. The function of this family is unknown. 397
50110 284546 pfam07150 DUF1390 Protein of unknown function (DUF1390). This family consists of several Paramecium bursaria chlorella virus 1 (PBCV-1) proteins of around 250 residues in length. The function of this family is unknown. 226
50111 369233 pfam07151 DUF1391 Protein of unknown function (DUF1391). This family consists of several Enterobacterial proteins of around 50 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi where they are often known as YdfA. The function of this family is unknown. 48
50112 399851 pfam07152 YaeQ YaeQ protein. This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialized transcription elongation protein. YaeQ is known to compensate for loss of RfaH function. 172
50113 369234 pfam07153 Marek_SORF3 Marek's disease-like virus SORF3 protein. This family consists of several SORF3 proteins from the Marek's disease-like viruses. Members of this family are around 350 residues in length. The function of this family is unknown. 290
50114 284550 pfam07154 DUF1392 Protein of unknown function (DUF1392). This family consists of several hypothetical cyanobacterial proteins of around 150 residues in length which seem to be specific to Anabaena species. The function of this family is unknown. 150
50115 399852 pfam07155 ECF-ribofla_trS ECF-type riboflavin transporter, S component. This family is the substrate-binding component (S component) of the energy coupling-factor (ECF)-type riboflavin transporter. It is a transmembrane protein which binds riboflavin, and is responsible for riboflavin-uptake by cells. 168
50116 399853 pfam07156 Prenylcys_lyase Prenylcysteine lyase. This family contains prenylcysteine lyases (EC:1.8.3.5) that are approximately 500 residues long. Prenylcysteine lyase is a FAD-dependent thioether oxidase that degrades a variety of prenylcysteines, producing free cysteine, an isoprenoid aldehyde and hydrogen peroxide as products of the reaction. It has been noted that this enzyme has considerable homology with ClP55, a 55 kDa protein that is associated with chloride ion pumps. 362
50117 399854 pfam07157 DNA_circ_N DNA circularisation protein N-terminus. This family represents the N-terminus (approximately 100 residues) of a number of phage DNA circularisation proteins. 88
50118 115789 pfam07158 MatC_N Dicarboxylate carrier protein MatC N-terminus. This family represents the N-terminal region of the bacterial dicarboxylate carrier protein MatC. The MatC protein is an integral membrane protein that could function as a malonate carrier. 149
50119 399855 pfam07159 DUF1394 Protein of unknown function (DUF1394). This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown. 302
50120 399856 pfam07160 SKA1 Spindle and kinetochore-associated protein 1. Spindle and kinetochore-associated protein 1 (SKA1) is a component of the SKA1 complex (consists of Ska1, Ska2, and Ska3/Rama1), a microtubule-binding subcomplex of the outer kinetochore that is essential for proper chromosome segregation. 234
50121 399857 pfam07161 LppX_LprAFG LppX_LprAFG lipoprotein. This entry consists of several lipoproteins mainly from Mycobacterium species, collectively known as the LppX_LprAFG family. Proteins in this entry include LprG, LppX, LprF and LprA. 191
50122 399858 pfam07162 B9-C2 Ciliary basal body-associated, B9 protein. The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Meckel-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All three distinct types of B9-C2 proteins cooperatively localize to the basal body or centrosome of cilia. 165
50123 399859 pfam07163 Pex26 Pex26 protein. This family consists of Pex26 and related mammalian proteins. Pex26 is a type II peroxisomal membrane protein which recruits Pex6-Pex1 complexes to peroxisomes. Mutations in Pex26 can lead to human disorders. 301
50124 399860 pfam07165 DUF1397 Protein of unknown function (DUF1397). This family consists of several insect specific proteins. A member from Manduca sexta is annotated as being a haemolymph glycoprotein precursor. The function of this family is unknown. 203
50125 399861 pfam07166 DUF1398 Protein of unknown function (DUF1398). This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown. 118
50126 399862 pfam07167 PhaC_N Poly-beta-hydroxybutyrate polymerase (PhaC) N-terminus. This family represents the N-terminal region of the bacterial poly-beta-hydroxybutyrate polymerase (PhaC). Polyhydroxyalkanoic acids (PHAs) are carbon and energy reserve polymers produced in some bacteria when carbon sources are plentiful and another nutrient, such as nitrogen, phosphate, oxygen, or sulfur, becomes limiting. PHAs composed of monomeric units ranging from 3 to 14 carbons exist in nature. When the carbon source is exhausted, PHA is utilized by the bacterium. PhaC links D-(-)-3-hydroxybutyryl-CoA to an existing PHA molecule by the formation of an ester bond. This family appears to be a partial segment of an alpha/beta hydrolase domain. 173
50127 399863 pfam07168 Ureide_permease Ureide permease. Heterocyclic nitrogen compounds may serve as nitrogen sources or nitrogen transport compounds in plants that are not able to fix nitrogen. This family represents ureide permease, a transporter of a wide spectrum of oxo derivatives of heterocyclic nitrogen compounds, including allantoin, uric acid and xanthine; it has 10 putative transmembrane domains with a large cytosolic central domain containing a 'Walker A' motif. Ureide permease is likely to transport other purine degradation products when nitrogen sources are low. Transport is dependent on glucose and a proton gradient. The family is found in bacteria, plants and yeast. These transporters are constituted of two sets of 5xTMs. 358
50128 399864 pfam07171 MlrC_C MlrC C-terminus. This family represents the C-terminus (approximately 200 residues) of the product of a bacterial gene cluster that is involved in the degradation of the cyanobacterial toxin microcystin LR. Many members of this family are hypothetical proteins. 178
50129 254089 pfam07172 GRP Glycine rich protein family. This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. The family also contains proteins that are induced in response to various stresses. 91
50130 399865 pfam07173 GRDP-like Glycine-rich domain-containing protein-like. This entry includes Arabidopsis Glycine-rich domain-containing protein 1 and 2 (GRDP1/2). They are involved in development and stress responses. 139
50131 399866 pfam07174 FAP Fibronectin-attachment protein (FAP). This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix. 301
50132 399867 pfam07175 Osteoregulin Osteoregulin. This family represents a conserved region approximately 180 residues long within osteoregulin, a bone-remodelling protein expressed highly in osteocytes within trabecular and cortical bone. A conserved RGD motif is found towards the C-terminal end of this region, and this is potentially involved in integrin recognition. 160
50133 399868 pfam07176 DUF1400 Alpha/beta hydrolase of unknown function (DUF1400). This family contains a number of hypothetical proteins of unknown function that seem to be specific to cyanobacteria. Members of this family have an alpha/beta hydrolase fold. 127
50134 399869 pfam07177 Neuralized Neuralized. This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. 150
50135 399870 pfam07178 TraL TraL protein. This family consists of several bacterial TraL proteins. TraL is a predicted peripheral membrane protein which is thought to be involved in bacterial sex pilus assembly. The exact function of this family is unclear. 87
50136 399871 pfam07179 SseB SseB protein N-terminal domain. This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. This entry contains the presumed N-terminal domain of SseB. 120
50137 369250 pfam07180 CaiF_GrlA CaiF/GrlA transcriptional regulator. This is a family of transcriptional regulators. CaiF is involved in carnitine metabolism. GrlA is encoded within the LEE type III secretion system in the enteropathogenic E. coli O157:H. GrlR interacts with GrlA at its Helix-Turn-Helix (HTH) motif, preventing GrlA from binding to its target promoter DNA. 134
50138 284572 pfam07181 VirC2 VirC2 protein. This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear. 200
50139 399872 pfam07182 DUF1402 Protein of unknown function (DUF1402). This family consists of several hypothetical bacterial proteins of around 310 residues in length. Members of this family seem to be found exclusively in Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown. 300
50140 399873 pfam07183 DUF1403 Protein of unknown function (DUF1403). This family consists of several hypothetical bacterial proteins of around 320 residues in length. Members of this family are mainly found in Rhizobium and Agrobacterium species. The function of this family is unknown. 320
50141 115814 pfam07184 CTV_P33 Citrus tristeza virus P33 protein. This family consists of several Citrus tristeza virus (CTV) P33 proteins. The function of P33 is unclear although it is known that the protein is not needed for virion formation. 303
50142 369252 pfam07185 DUF1404 Protein of unknown function (DUF1404). This family consists of several archaeal proteins of around 180 residues in length. Members of this family seem to be found exclusively in Sulfolobus tokodaii and Sulfolobus solfataricus. The function of this family is unknown. 169
50143 399874 pfam07187 DUF1405 Protein of unknown function (DUF1405). This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown. 160
50144 284577 pfam07188 KSHV_K8 Kaposi's sarcoma-associated herpesvirus (KSHV) K8 protein. This family consists of Kaposi's sarcoma-associated herpesvirus (KSHV) K8 proteins. KSHV is a human Gammaherpesvirus related to Epstein-Barr virus (EBV) and herpesvirus saimiri. KSHV open reading frame K8 encodes a basic region-leucine zipper protein of 237 aa that homodimerizes. K8 interacts and co-localizes with human pfam04855, a cellular chromatin-remodelling factor, both in vivo and in vitro. K8 is thought to function as a transcriptional activator under specific conditions and its transactivation activity requires its interaction with the cellular chromatin remodelling factor hSNF5. 238
50145 399875 pfam07189 SF3b10 Splicing factor 3B subunit 10 (SF3b10). This family consists of several eukaryotic splicing factor 3B subunit 10 (SF3b10) proteins. SF3b10 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoproteins complex. SF3b10 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site. 75
50146 148663 pfam07190 DUF1406 Protein of unknown function (DUF1406). This family consists of several Orthopoxvirus proteins of around 185 residues in length. Members of this family seem to be exclusive to Vaccinia, Camelpox and Cowpox viruses. Some family members are annotated as being C8 proteins but their function is unknown. 170
50147 399876 pfam07191 zinc-ribbons_6 zinc-ribbons. This family consists of several short, hypothetical bacterial proteins of around 70 residues in length. Members of this family have 8 highly conserved cysteine residues, which form two zinc ribbon domains. 64
50148 369255 pfam07192 SNURF SNURF/RPN4 protein. This family consists of several mammalian SNRPN upstream reading frame (SNURF) proteins. SNURF or RPF4 is a RING-finger protein and a coregulator of androgen receptor-dependent transcription. It has been suggested that SNURF is involved in the regulation of processes required for late steps of spermatid maturation. 65
50149 284581 pfam07193 DUF1408 Protein of unknown function (DUF1408). This family consists of several hypothetical Lactococcus lactis and related phage proteins of around 75 residues in length. The function of this family is unknown. 71
50150 399877 pfam07194 P2 P2 response regulator binding domain. The response regulators for CheA bind to the P2 domain, which is found between pfam01627 and pfam02895 as either one or two copies. Highly flexible linkers connect P2 to the rest of CheA and impart remarkable mobility to the P2 domain. This feature is thought to enhance the inter CheA dimer phosphotransfer reactions within the signalling complex, thereby amplifying the phosphorylation signal. 80
50151 311255 pfam07195 FliD_C Flagellar hook-associated protein 2 C-terminus. The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the C-terminal region of this family of proteins. 235
50152 399878 pfam07196 Flagellin_IN Flagellin hook IN motif. The function of this region is not clear, but it is found in many flagellar hook proteins, including FliD homologs. It is normally repeated, but is also apparently seen as a singleton. A conserved IN is seen at the centre of the motif. The diversity of these motifs makes it likely that some members of the family are not identified. 55
50153 369256 pfam07197 DUF1409 Protein of unknown function (DUF1409). This family represents a short conserved region (approximately 50 residues long), sometimes repeated, within a number of hypothetical Oryza sativa proteins of unknown function. 48
50154 399879 pfam07198 DUF1410 Protein of unknown function (DUF1410). This family represents a conserved domain approximately 100 residues long, multiple copies of which are found within hypothetical Ureaplasma parvum proteins of unknown function, as well as related species. 61
50155 369258 pfam07199 DUF1411 Protein of unknown function (DUF1411). This family represents a conserved region approximately 150 residues long that is sometimes repeated within some Babesia bovis proteins of unknown function. 188
50156 399880 pfam07200 Mod_r Modifier of rudimentary (Mod(r)) protein. This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homolog of Vps37 interacts with Tsg101 (pfam05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors. 146
50157 399881 pfam07201 HrpJ HrpJ-like domain. This family represents a conserved region approximately 200 residues long within a number of bacterial hypersensitivity response secretion protein HrpJ and similar proteins. HrpJ forms part of a type III secretion system through which, in phytopathogenic bacterial species, virulence factors are thought to be delivered to plant cells. This family also includes the InvE invasion protein from Salmonella. This protein is involved in host parasite interactions and mutations in the InvE gene render Salmonella typhimurium non-invasive. InvE S. typhimurium mutants fail to elicit a rapid Ca2+ increase in cultured cells, an important event in the infection procedure and internalisation of S. typhimurium into epithelial cells. This family includes bacterial SepL and SsaL proteins. SepL plays an essential role in the infection process of enterohemorrhagic Escherichia coli and is thought to be responsible for the secretion of EspA, EspD, and EspB. SsaL of Salmonella typhimurium is thought to be a component of the type III secretion system. 165
50158 399882 pfam07202 Tcp10_C T-complex protein 10 C-terminus. This family represents the C-terminus (approximately 180 residues) of eukaryotic T-complex protein 10. The T-complex is involved in spermatogenesis in mice. 35
50159 311262 pfam07203 DUF1412 Protein of unknown function (DUF1412). This family consists of several Caenorhabditis elegans proteins of around 70-75 residues in length. The function of this family is unknown. 53
50160 115833 pfam07204 Orthoreo_P10 Orthoreovirus membrane fusion protein p10. This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction. 98
50161 399883 pfam07205 DUF1413 Domain of unknown function (DUF1413). This family consists of several hypothetical bacterial proteins which seem to be specific to firmicute species. Members of this family are typically around 100 residues in length. The function of this family is unknown. 69
50162 284592 pfam07206 Baculo_LEF-10 Baculovirus late expression factor 10 (LEF-10). This family consists of several Baculovirus specific late expression factor 10 (LEF-10) sequences. LEF-10 is thought to be a late expressed structural protein although its exact function is unknown. 71
50163 399884 pfam07207 Lir1 Light regulated protein Lir1. This family consists of several plant specific light regulated Lir1 proteins. Lir1 mRNA accumulates in the light, reaching maximum and minimum steady-state levels at the end of the light and dark period, respectively. Plants germinated in the dark have very low levels of lir1 mRNA, whereas plants germinated in continuous light express lir1 at an intermediate but constant level. It is thought that lir1 expression is controlled by light and a circadian clock. The exact function of this family is unclear. 134
50164 399885 pfam07208 DUF1414 Protein of unknown function (DUF1414). This family consists of several hypothetical bacterial proteins of around 70 residues in length. Members of this family are often referred to as YejL. The function of this family is unknown. 44
50165 399886 pfam07209 DUF1415 Protein of unknown function (DUF1415). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 171
50166 377793 pfam07210 DUF1416 Protein of unknown function (DUF1416). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown. 97
50167 399887 pfam07212 Hyaluronidase_1 Hyaluronidase protein (HylP). This family consists of several phage associated hyaluronidase proteins (EC:3.2.1.35) which seem to be specific to Streptococcus pyogenes and Streptococcus pyogenes bacteriophages. The substrate of hyaluronidase is hyaluronic acid, a sugar polymer composed of alternating N-acetylglucosamine and glucuronic acid residues. Hyaluronic acid is found in the ground substance of human connective tissue and the vitreous of the eye and also is the sole component of the capsule of group A streptococci. The capsule has been shown to be an important virulence factor of this organism by virtue of its ability to resist phagocytosis. Production by S. pyogenes of both a hyaluronic acid capsule and hyaluronidase enzymatic activity capable of destroying the capsule is an interesting, yet-unexplained, phenomenon. 278
50168 284598 pfam07213 DAP10 DAP10 membrane protein. This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells. 79
50169 369264 pfam07214 DUF1418 Protein of unknown function (DUF1418). This family consists of several hypothetical Enterobacterial proteins of around 100 residues in length. Members of this family are often described as YbjC. In E. coli the ybjC gene is located downstream of nfsA (which encodes the major oxygen-insensitive nitroreductase). It is thought that nfsA and ybjC form an operon and its promoter is a class I SoxS-dependent promoter. The function of this family is unknown. 94
50170 399888 pfam07215 DUF1419 Protein of unknown function (DUF1419). This family consists of several bacterial proteins of around 110 residues in length. Members of this family seem to be specific to Agrobacterium species and to Rhizobium loti. The function of this family is unknown. 112
50171 284601 pfam07216 LcrG LcrG protein. This family consists of several bacterial LcrG proteins. Yersiniae are equipped with the Yop virulon, an apparatus that allows extracellular bacteria to deliver toxic Yop proteins inside the host cell cytosol in order to sabotage the communication networks of the host cell or even to cause cell death. LcrG is a component of the Yop virulon involved in the regulation of secretion of the Yops. 91
50172 399889 pfam07217 Het-C Heterokaryon incompatibility protein Het-C. In filamentous fungi, het loci (for heterokaryon incompatibility) are believed to regulate self/nonself-recognition during vegetative growth. As filamentous fungi grow, hyphal fusion occurs within an individual colony to form a network. Hyphal fusion can occur also between different individuals to form a heterokaryon, in which genetically distinct nuclei occupy a common cytoplasm. However, heterokaryotic cells are viable only if the individuals involved have identical alleles at all het loci. 560
50173 284603 pfam07218 RAP1 Rhoptry-associated protein 1 (RAP-1). This family consists of several rhoptry-associated protein 1 (RAP-1) sequences which appear to be specific to Plasmodium falciparum. 793
50174 399890 pfam07219 HemY_N HemY protein N-terminus. This family represents the N-terminus (approximately 150 residues) of bacterial HemY porphyrin biosynthesis proteins. This is a membrane protein involved in a late step of protoheme IX synthesis. 104
50175 115849 pfam07220 DUF1420 Protein of unknown function (DUF1420). This family consists of several hypothetical putative lipoproteins which seem to be found specifically in the bacterium Leptospira interrogans. Members of this family are typically around 670 residues in length and their function is unknown. 672
50176 399891 pfam07221 GlcNAc_2-epim N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase). This family contains a number of eukaryotic and bacterial N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase) enzymes (EC:5.3.1.8) approximately 500 residues long. This converts N-acyl-D-glucosamine to N-acyl-D-mannosamine. 347
50177 311272 pfam07222 PBP_sp32 Proacrosin binding protein sp32. This family consists of several mammalian specific proacrosin binding protein sp32 sequences. sp32 is a sperm specific protein which is known to bind with 55- and 53-kDa proacrosins and the 49-kDa acrosin intermediate. The exact function of sp32 is unclear; it is thought, however, that the binding of sp32 to proacrosin may be involved in packaging the acrosin zymogen into the acrosomal matrix. 243
50178 399892 pfam07223 DUF1421 UBA-like domain (DUF1421). This domain represents a conserved region that has a UBA like fold. It is found in a number of plant proteins of unknown function. 45
50179 254111 pfam07224 Chlorophyllase Chlorophyllase. This family consists of several plant specific Chlorophyllase proteins (EC:3.1.1.14). Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of ester bond to yield chlorophyllide and phytol. 307
50180 399893 pfam07225 NDUF_B4 NADH-ubiquinone oxidoreductase B15 subunit (NDUFB4). This family consists of several NADH-ubiquinone oxidoreductase B15 subunit proteins (EC:1.6.5.3). 124
50181 399894 pfam07226 DUF1422 Protein of unknown function (DUF1422). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 114
50182 399895 pfam07227 PHD_Oberon PHD - plant homeodomain finger protein. PHD_oberon is a plant homeodomain finger domain of Oberon proteins from plants. Oberon is necessary for maintenance and/or establishment of both the shoot and root apical meristems in Arabidopsis. Oberon proteins are made up of a PHD finger domain and a coiled-coil domain. The PHD-finger domain is found in a wide variety of proteins involved in the regulation of chromatin structure. Oberon proteins mediate the TMO7 (the direct target of MP) expression through modification of, or binding to, chromatin at the TMO7 locus. TMO7 stands for the target of Monopteros 7 (MP) (or Auxin response factor 7). 130
50183 399896 pfam07228 SpoIIE Stage II sporulation protein E (SpoIIE). This family contains a number of bacterial stage II sporulation E proteins (EC:3.1.3.16). These are required for formation of a normal polar septum during sporulation. The N-terminal region is hydrophobic and is expected to contain up to 12 membrane-spanning segments. 192
50184 399897 pfam07229 VirE2 VirE2. This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C-terminus, with VirD4. Agrobacterium tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily. 557
50185 399898 pfam07230 Peptidase_S80 Bacteriophage T4-like capsid assembly protein (Gp20). This family consists of several bacteriophage T4-like capsid assembly (or portal) proteins. The exact mechanism by which the double-stranded (ds) DNA bacteriophages incorporate the portal protein at a unique vertex of the icosahedral capsid is unknown. In phage T4, there is evidence that this vertex, constituted by 12 subunits of gp20, acts as an initiator for the assembly of the major capsid protein and the scaffolding proteins into a prolate icosahedron of precise dimensions. The regulation of portal protein gene expression is an important regulator of prohead assembly in bacteriophage T4. This family represents the protease responsible for the proteolysis of head proteins, a critical step in the morphogenesis of many tailed phages. Cleavage facilitates the conversion of the prohead to the mature capsid. All these cleavages are carried out by action at consensus S/A/G-X-E recognition sequences at 39 cleavage sites. Evidence of multiple processing sites in nine phiKZ proteins appears to represent a built-in mechanism by which the phage ensures that the majority of the propeptide regions are removed, and emphasizes the essential nature of processing in phiKZ-head morphogenesis. The family is classified by MEROPS as a serine peptidase. 445
50186 399899 pfam07231 Hs1pro-1_N Hs1pro-1 N-terminus. This family represents the N-terminus (approximately 180 residues) of plant Hs1pro-1, which is believed to confer resistance to nematodes. 195
50187 115861 pfam07232 DUF1424 Putative rep protein (DUF1424). This family consists of several archaeal proteins of around 320 residues in length. Members of this family seem to be found exclusively in Halobacterium and Haloferax species. The function of this family is unknown. This protein is probably a rep protein due to conservation of functional motifs. 329
50188 399900 pfam07233 DUF1425 Protein of unknown function (DUF1425). This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown. 87
50189 284616 pfam07234 Babuvirus_MP Movement and RNA silencing protein. This family consists of several Babuvirus proteins of around 120 residues in length. Proteins in this family include movement and RNA silencing protein (also known as MP) from Banana bunchy top virus. MP acts as a suppressor of RNA-mediated gene silencing, also known as post-transcriptional gene silencing (PTGS), a mechanism of plant viral defense that limits the accumulation of viral RNAs. It transports the viral genome to neighboring plant cells directly through plasmodesmata, without any budding. The movement protein allows efficient cell to cell propagation, by bypassing the host cell wall barrier. 117
50190 399901 pfam07235 DUF1427 Protein of unknown function (DUF1427). This family consists of several bacterial proteins of around 100 residues in length. The function of this family is unknown. 84
50191 399902 pfam07236 Phytoreo_S7 Phytoreovirus S7 protein. This family consists of several Phytoreovirus S7 proteins which are thought to be viral core proteins. 505
50192 399903 pfam07237 DUF1428 Protein of unknown function (DUF1428). This family consists of several hypothetical bacterial and one archaeal sequence of around 120 residues in length. The function of this family is unknown. The structure of this family shows it to be part of the Dimeric-alpha-beta-barrel superfamily. Many members are annotated as being RNA signal recognition particle 4.5S RNA, but this could not be verified. 102
50193 399904 pfam07238 PilZ PilZ domain. PilZ is a c-di-GMP binding domain which is found C terminal to pfam07317. Proteins which contain PilZ are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias. This domain forms a beta barrel structure. 102
50194 399905 pfam07239 OpcA Outer membrane protein OpcA. This family consists of several Neisseria species specific OpcA outer membrane proteins. Opc (formerly called 5C) is one of the major outer membrane proteins and has been shown to play an important role in meningococcal adhesion and invasion of both epithelial and endothelial cells. 250
50195 399906 pfam07240 Turandot Stress-inducible humoral factor Turandot. This family consists of several Drosophila species specific Turandot proteins. The Turandot A (TotA) gene encodes a humoral factor, which is secreted from the fat body and accumulates in the body fluids. TotA is strongly induced upon bacterial challenge, as well as by other types of stress such as high temperature, mechanical pressure, dehydration, UV irradiation, and oxidative agents. It is also up-regulated during metamorphosis and at high age. Flies that over-express TotA show prolonged survival and retain normal activity at otherwise lethal temperatures. Although TotA is only induced by severe stress, it responds to a much wider range of stimuli than heat shock genes such as hsp70 or immune genes such as Cecropin A1. 81
50196 369281 pfam07242 DUF1430 Protein of unknown function (DUF1430). This family represents the C-terminus (approximately 120 residues) of a number of hypothetical bacterial proteins of unknown function. These are possibly membrane proteins involved in immunity. 100
50197 369282 pfam07243 Phlebovirus_G1 Phlebovirus glycoprotein G1. This family consists of several Phlebovirus glycoprotein G1 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. 527
50198 399907 pfam07244 POTRA Surface antigen variable number repeat. This family is found primarily in bacterial surface antigens, normally as variable number repeats at the N-terminus. The C-terminus of these proteins is normally represented by pfam01103. The alignment centers on a -GY- or -GF- motif. Some members of this family are found in the mitochondria. It is predicted to have a mixed alpha/beta secondary structure. 80
50199 399908 pfam07245 Phlebovirus_G2 Phlebovirus glycoprotein G2. This family consists of several Phlebovirus glycoprotein G2 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. 325
50200 369285 pfam07246 Phlebovirus_NSM Phlebovirus nonstructural protein NS-M. This family consists of several Phlebovirus nonstructural NS-M proteins which represent the N-terminal region of the M polyprotein precursor. The function of this family is unknown. 228
50201 369286 pfam07247 AATase Alcohol acetyltransferase. This family contains a number of alcohol acetyltransferase (EC:2.3.1.84) enzymes approximately 500 residues long found in both bacteria and metazoa. These catalyze the esterification of isoamyl alcohol by acetyl coenzyme A. 500
50202 399909 pfam07248 DUF1431 Protein of unknown function (DUF1431). This family contains a number of Drosophila melanogaster proteins of unknown function. These contain several conserved cysteine residues. 154
50203 369288 pfam07249 Cerato-platanin Cerato-platanin. This family contains a number of fungal cerato-platanin phytotoxic proteins approximately 150 residues long. Cerato-platanin contains four cysteine residues that form two disulphide bonds. 118
50204 399910 pfam07250 Glyoxal_oxid_N Glyoxal oxidase N-terminus. This family represents the N-terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyzes the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium. 243
50205 284628 pfam07252 DUF1433 Protein of unknown function (DUF1433). This family contains a number of hypothetical bacterial proteins of unknown function approximately 100 residues in length. 88
50206 254125 pfam07253 Gypsy Gypsy protein. This family consists of several Gypsy/Env proteins from Drosophila and Ceratitis fruit fly species. Gypsy is an endogenous retrovirus of Drosophila melanogaster. Phylogenetic studies suggest that occasional horizontal transfer events of gypsy occur between Drosophila species. Gypsy possesses infective properties associated with the products of the envelope gene that might be at the origin of these interspecies transfers. This family contains many members with full-length matches; however, it also includes a number of very short sequences and short matches of sequences with other unrelated domains on them, which cannot be excluded. These matches may represent remnants of once-functional genes. 472
50207 399911 pfam07254 Cpta_toxin Membrane-bound toxin component of toxin-antitoxin system. CptA is a family of bacterial proteins named for the member of this family, YGFX_ECOLI. YgfX was previously thought to be the toxic part of a toxin-antitoxin module along with the antitoxin, pfam03937 Sdh5. However, studies have shown that YgfX interferes with correct cell division and morphology. Furthermore, the function of YgfX-SdhE as a TA system could not be demonstrated in either E. coli or Serratia sp. ATCC 39006. YgfX is predicted to have a short N-terminal cytoplasmic domain followed by two transmembrane helices (TMHs) separated by a short periplasmic loop and finally, a larger C-terminal cytoplasmic domain. The TMHs of YgfX are required for activity, but the sequence of the cytoplasmic 13 N-terminal amino acids is not essential. Furthermore, the amino acids W34 and D117 are not required for localization but are necessary for YgfX multimerization, interaction with SdhE, and YgfX activity. It is proposed that the formation of YgfX multimeric membrane-bound proteins is required to enable the interaction with the cytoplasmic SDH assembly factor SdhE. Another study has demonstrated that sdhEygfX (bicistronic operon) affects pig biosynthesis, directly or indirectly, at the level of transcription of the biosynthetic operon (pigA-O). It has also been suggested that, in addition to indirect transcriptional activation of pigA-O, YgfX might facilitate the formation of a terminal pig biosynthetic complex consisting of PigB and PigC. 130
50208 284630 pfam07255 Benyvirus_14KDa Benyvirus 14KDa protein. This family consists of several Benyvirus specific 14KDa proteins of around 125 residues in length. Members of this family contain 9 conserved cysteine residues. The function of this family is unknown. 123
50209 399912 pfam07256 DUF1435 Protein of unknown function (DUF1435). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown. 75
50210 399913 pfam07258 COMM_domain COMM domain. The leucine-rich, 70-85 amino acid long COMM domain is predicted to form a beta-sheet and an extreme C-terminal alpha- helix. The COMM domain containing proteins are about 200 residues in length and passed the C-terminal COMM domain. 72
50211 399914 pfam07259 ProSAAS ProSAAS precursor. This family consists of several mammalian proSAAS precursor proteins. ProSAAS mRNA is expressed primarily in brain and other neuroendocrine tissues (pituitary, adrenal, pancreas); within brain, the mRNA is broadly distributed among neurons. ProSAAS is thought to be an endogenous inhibitor of prohormone convertase 1 and may function as a neuropeptide. N-terminal fragments of proSAAS in intracellular Pick Bodies (PBs) may cause a functional disturbance of neurons in Pick's disease. 266
50212 311292 pfam07260 ANKH Progressive ankylosis protein (ANKH). This family consists of several progressive ankylosis protein (ANK or ANKH) sequences. The ANK protein spans the outer cell membrane and shuttles inorganic pyrophosphate (PPi), a major inhibitor of physiologic and pathologic calcification, bone mineralisation and bone resorption. Mutations in ANK are thought to give rise to Craniometaphyseal dysplasia (CMD) which is a rare skeletal disorder characterized by progressive thickening and increased mineral density of craniofacial bones and abnormally developed metaphyses in long bones. This family shows distant homology to the MOP (TCDB) superfamily of transporters. 344
50213 399915 pfam07261 DnaB_2 Replication initiation and membrane attachment. This family consists of several bacterial replication initiation and membrane attachment (DnaB) proteins, as well as DnaD which is a component of the PriA primosome. The PriA primosome functions to recruit the replication fork helicase onto the DNA. The DnaB protein is essential for both replication initiation and membrane attachment of the origin region of the chromosome and plasmid pUB110 in Bacillus subtilis. It is known that there are two different classes (DnaBI and DnaBII) in the DnaB mutants; DnaBI is essential for both chromosome and pUB110 replication, whereas DnaBII is necessary only for chromosome replication. DnaD has been merged into this family. This family also includes Ftn6, a cyanobacterial-specific divisome component possibly playing a role at the interface between DNA replication and cell division. Ftn6 possesses a conserved domain localized within the N-terminus of the proteins. This domain, named FND, exhibits sequence and structure similarities with the DnaD-like domains pfam04271 now merged into pfam07261. 70
50214 399916 pfam07262 CdiI CDI immunity protein. CdiI immunity proteins function as part of the bacterial contact-dependent growth inhibition (CDI) system. CDI is mediated by the CdiB-CdiA two-partner secretion system. Each CdiA protein exhibits a distinct growth inhibition activity, which resides in the polymorphic C-terminal region (CdiA-CT). Cells with the CDI system also express a CdiI immunity protein that blocks the activity of cognate CdiA-CT, thereby protecting the cell from autoinhibition. In many CDI systems the cdiBAI genes are followed by orphan cdiA-CT/cdiI modules, suggesting that these modules are exchanged between the CDI systems of different bacteria. 155
50215 399917 pfam07263 DMP1 Dentin matrix protein 1 (DMP1). This family consists of several mammalian dentin matrix protein 1 (DMP1) sequences. The dentin matrix acidic phosphoprotein 1 (DMP1) gene has been mapped to human chromosome 4q21. DMP1 is a bone and teeth specific protein initially identified from mineralized dentin. DMP1 is primarily localized in the nuclear compartment of undifferentiated osteoblasts. In the nucleus, DMP1 acts as a transcriptional component for activation of osteoblast-specific genes like osteocalcin. During the early phase of osteoblast maturation, Ca(2+) surges into the nucleus from the cytoplasm, triggering the phosphorylation of DMP1 by a nuclear isoform of casein kinase II. This phosphorylated DMP1 is then exported out into the extracellular matrix, where it regulates nucleation of hydroxyapatite. DMP1 is a unique molecule that initiates osteoblast differentiation by transcription in the nucleus and orchestrates mineralized matrix formation extracellularly, at later stages of osteoblast maturation. The DMP1 gene has been found to be ectopically expressed in lung cancer although the reason for this is unknown. 522
50216 399918 pfam07264 EI24 Etoposide-induced protein 2.4 (EI24). This family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long as well as bacterial CysZ proteins (formerly known as DUF540). In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53. It has been suggested to play an important role in negative cell growth control. 161
50217 399919 pfam07265 TAP35_44 Tapetum specific protein TAP35/TAP44. This family consists of several plant tapetum specific proteins. Members of this family are found in Arabidopsis thaliana, Brassica napus and Sinapis alba. Members of this family may be involved in sporopollenin formation and/or deposition. 119
50218 369296 pfam07267 Nucleo_P87 Nucleopolyhedrovirus capsid protein P87. This family consists of several Nucleopolyhedrovirus capsid protein P87 sequences. P87 is expressed late in infection and concentrated in infected cell nuclei. 623
50219 284641 pfam07268 EppA_BapA Exported protein precursor (EppA/BapA). This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the Borrelia burgdorferi infectious cycle. 138
50220 284642 pfam07270 DUF1438 Protein of unknown function (DUF1438). This family consists of several hypothetical proteins of around 170 residues in length which appear to be mouse specific. The function of this family is unknown. 151
50221 284643 pfam07271 Cytadhesin_P30 Cytadhesin P30/P32. This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localized on the tip organelle. It is thought that it is important in cytadherence and virulence. 308
50222 148716 pfam07272 Orthoreo_P17 Orthoreovirus P17 protein. This family consists of several Orthoreovirus P17 proteins. P17 is specified by ORF2 of the S1 gene and represents a nonstructural protein which associates with cell membranes. 146
50223 399920 pfam07273 DUF1439 Protein of unknown function (DUF1439). This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown. 151
50224 399921 pfam07274 DUF1440 Protein of unknown function (DUF1440). This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins. 133
50225 399922 pfam07275 ArdA Antirestriction protein (ArdA). This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification. 153
50226 115901 pfam07276 PSGP Apopolysialoglycoprotein (PSGP). This family represents a series of 13-residue repeats found in the apopolysialoglycoprotein of Oncorhynchus mykiss (Rainbow trout) and Oncorhynchus masou (Cherry salmon). Polysialoglycoprotein (PSGP) of unfertilized eggs of rainbow trout consists of tandem repeats of a glycotridecapeptide, Asp-Asp-Ala-Thr*-Ser*-Glu-Ala-Ala-Thr*-Gly-Pro-Ser-Gly (* denotes the attachment site of a polysialoglycan chain). In response to egg activation, PSGP is discharged by exocytosis into the space between the vitelline envelope and the plasma membrane, i.e. the perivitelline space, where the 200-kDa PSGP molecules undergo rapid and dramatic depolymerization by proteolysis into glycotridecapeptides. 13
50227 399923 pfam07277 SapC SapC. This family contains a number of bacterial SapC proteins approximately 250 residues long. In Campylobacter fetus, SapC forms part of a paracrystalline surface layer (S-layer) that confers serum resistance. 221
50228 399924 pfam07278 DUF1441 Protein of unknown function (DUF1441). This family consists of several hypothetical Enterobacterial proteins of around 160 residues in length. The function of this family is unknown. However, it appears to be distantly related to other HTH families so may act as a transcriptional regulator. 149
50229 115904 pfam07279 DUF1442 Protein of unknown function (DUF1442). This family consists of several hypothetical Arabidopsis thaliana proteins of around 225 residues in length. The function of this family is unknown. 218
50230 148722 pfam07280 Ac110_PIF Per os infectivity factor AC110. This family consists of several Baculovirus proteins of around 55 residues in length. Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV) Per os infectivity factor AC110, which is required for oral infectivity. It may play a role after occlusion-derived virions pass through the host's peritrophic membrane. 43
50231 399925 pfam07281 INSIG Insulin-induced protein (INSIG). This family contains a number of eukaryotic Insulin-induced proteins (INSIG-1 and INSIG-2) approximately 200 residues long. INSIG-1 and INSIG-2 are found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations. 187
50232 284650 pfam07282 OrfB_Zn_ribbon Putative transposase DNA-binding domain. This putative domain is found at the C-terminus of a large number of transposase proteins. This domain contains four conserved cysteines suggestive of a zinc binding domain. Given the need for transposases to bind DNA as well as the large number of DNA-binding zinc fingers we hypothesize this domain is DNA-binding. 69
50233 369299 pfam07283 TrbH Conjugal transfer protein TrbH. This family contains TrbH, a bacterial conjugal transfer protein approximately 150 residues long. This contains a putative membrane lipoprotein lipid attachment site. 119
50234 399926 pfam07284 BCHF 2-vinyl bacteriochlorophyllide hydratase (BCHF). This family contains the bacterial enzyme 2-vinyl bacteriochlorophyllide hydratase (EC:4.2.1.-) (approximately 150 residues long). This is involved in the light-independent bacteriochlorophyll biosynthesis pathway by adding water across the 2-vinyl group. 139
50235 377802 pfam07285 DUF1444 Protein of unknown function (DUF1444). This family contains several hypothetical bacterial proteins of unknown function that are approximately 250 residues long. 264
50236 399927 pfam07286 DUF1445 Protein of unknown function (DUF1445). This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function. 143
50237 399928 pfam07287 AtuA Acyclic terpene utilisation family protein AtuA. This family consists of several bacterial and plant proteins of around 400 residues in length. One member of this family has been characterized in Pseudomonas citronellolis as AtuA, a member of a gene cluster that is essential for the acyclic terpene utilisation (Atu) pathway. 348
50238 399929 pfam07288 DUF1447 Protein of unknown function (DUF1447). This family consists of several bacterial proteins of around 70 residues in length. The function of this family is unknown. 68
50239 399930 pfam07289 BBL5 Bardet-Biedl syndrome 5 protein. BBS5 is part of the BBSome complex that may function as a coat complex required for sorting of specific membrane proteins to the primary cilia. Mutations in the BBS5 gene cause Bardet-Biedl syndrome 5. 334
50240 399931 pfam07290 DUF1449 Protein of unknown function (DUF1449). This family consists of several bacterial proteins of around 210 residues in length. The function of this family is unknown. 198
50241 399932 pfam07291 MauE Methylamine utilisation protein MauE. This family consists of several bacterial methylamine utilisation MauE proteins. Synthesis of enzymes involved in methylamine oxidation via methylamine dehydrogenase (MADH) is encoded by genes present in the mau cluster. MauE and MauD are specifically involved in the processing, transport, and/or maturation of the beta-subunit, and the absence of either of these proteins leads to production of a non-functional beta-subunit which becomes rapidly degraded. 184
50242 399933 pfam07292 NID Nmi/IFP 35 domain (NID). This family represents a domain of approximately 90 residues that is tandemly repeated within interferon-induced 35 kDa protein (IFP 35) and the homologous N-myc-interactor (Nmi). This domain mediates Nmi-Nmi protein interactions and subcellular localization. 89
50243 399934 pfam07293 DUF1450 Protein of unknown function (DUF1450). This family consists of several hypothetical bacterial proteins of around 80 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown. 75
50244 284662 pfam07294 Fibroin_P25 Fibroin P25. This family consists of several insect fibroin P25 proteins. Silk fibroin produced by the silkworm Bombyx mori consists of a heavy chain, a light chain, and a glycoprotein, P25. The heavy and light chains are linked by a disulfide bond, and P25 associates with disulfide-linked heavy and light chains by non-covalent interactions. P25 plays an important role in maintaining integrity of the complex. 196
50245 399935 pfam07295 DUF1451 Zinc-ribbon containing domain. This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein. 146
50246 254144 pfam07296 TraP TraP protein. This family consists of several bacterial conjugative transfer TraP proteins from Escherichia coli and Salmonella typhimurium. TraP appears to play a minor role in conjugation and may interact with TraB, which varies in sequence along with TraP, in order to stabilize the proposed transmembrane complex formed by the tra operon products. 202
50247 399936 pfam07297 DPM2 Dolichol phosphate-mannose biosynthesis regulatory protein (DPM2). This family consists of several eukaryotic dolichol phosphate-mannose biosynthesis regulatory (DPM2) proteins. Biosynthesis of glycosylphosphatidylinositol and N-glycan precursor is dependent upon a mannosyl donor, dolichol phosphate-mannose (DPM). DPM2, an 84 amino acid membrane protein expressed in the endoplasmic reticulum (ER), makes a complex with DPM1 that is essential for the ER localization and stable expression of DPM1. Moreover, DPM2 enhances binding of dolichol phosphate, a substrate of DPM synthase. Biosynthesis of DPM in mammalian cells is regulated by DPM2. 76
50248 399937 pfam07298 NnrU NnrU protein. This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor. 189
50249 399938 pfam07299 EF-G-binding_N Elongation factor G-binding protein, N-terminal. This domain can be found in the N-terminus of the FusB, FusC, and FusD proteins from Staphylococcus aureus. They are elongation factor G (EF-G) binding proteins that are linked to the fusidic acid (FA) resistance in S. aureus. The FusB proteins are two-domain metalloproteins, and this N-terminal domain forms a four-helical bundle whose helices help to stabilize the conformation of the treble-clef zinc-finger in the C-terminal domain. FA is an antibiotic that binds to EF-G, preventing its release from the ribosome, thus stalling bacterial protein synthesis. The FusB proteins provide FA resistance by preventing formation or facilitating dissociation of the FA-locked EF-G-ribosome complex during elongation and ribosome recycling. 82
50250 399939 pfam07301 DUF1453 Protein of unknown function (DUF1453). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales. 144
50251 399940 pfam07302 AroM AroM protein. This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown. 218
50252 399941 pfam07303 Occludin_ELL Occludin homology domain. This domain represents a conserved region approximately 100 residues long within eukaryotic occludin proteins and the RNA polymerase II elongation factor ELL. Occludin is an integral membrane protein that localizes to tight junctions, while ELL is an elongation factor that can increase the catalytic rate of RNA polymerase II transcription by suppressing transient pausing by polymerase at multiple sites along the DNA. This shared domain is thought to mediate protein interactions. 101
50253 399942 pfam07304 SRA1 Steroid receptor RNA activator (SRA1). This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs. 145
50254 399943 pfam07305 DUF1454 Protein of unknown function (DUF1454). This family consists of several Enterobacterial sequences of around 200 residues in length which are often known as YiiQ proteins. The function of this family is unknown. 190
50255 284672 pfam07306 DUF1455 Protein of unknown function (DUF1455). This family consists of several hypothetical putative outer membrane proteins which appear to be specific to Anaplasma marginale and Anaplasma ovis. 130
50256 399944 pfam07307 HEPPP_synt_1 Heptaprenyl diphosphate synthase (HEPPP synthase) subunit 1. This family contains subunit 1 of bacterial heptaprenyl diphosphate synthase (HEPPP synthase) (EC:2.5.1.30) (approximately 230 residues long). The enzyme consists of two subunits, both of which are required for catalysis of heptaprenyl diphosphate synthesis. 210
50257 399945 pfam07308 DUF1456 Protein of unknown function (DUF1456). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 68
50258 399946 pfam07309 FlaF Flagellar protein FlaF. This family consists of several bacterial FlaF flagellar proteins. FlaF and FlaG are trans-acting, regulatory factors that modulate flagellin synthesis during flagellum biogenesis. 113
50259 399947 pfam07310 PAS_5 PAS domain. This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. This region is distantly similar to other PAS domains. 136
50260 399948 pfam07311 Dodecin Dodecin. Dodecin is a flavin-binding protein found in several bacteria and a few archaea and represents a stand-alone version of the SHS2 domain. It most closely resembles the SHS2 domains of FtsA and Rpb7p, and represents a single domain small-molecule binding form. 62
50261 369313 pfam07312 DUF1459 Protein of unknown function (DUF1459). This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown. 81
50262 399949 pfam07313 DUF1460 Protein of unknown function (DUF1460). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown. 214
50263 399950 pfam07314 DUF1461 Protein of unknown function (DUF1461). This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. These are possibly integral membrane proteins. 175
50264 377812 pfam07315 DUF1462 Protein of unknown function (DUF1462). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown. 93
50265 311332 pfam07316 DUF1463 Protein of unknown function (DUF1463). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 137
50266 399951 pfam07317 YcgR Flagellar regulator YcgR. This domain is found N terminal to pfam07238. Proteins which contain YcgR domains are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias. 103
50267 399952 pfam07318 DUF1464 Protein of unknown function (DUF1464). This family consists of several hypothetical archaeal proteins of around 350 residues in length. The function of this family is unknown. 327
50268 399953 pfam07319 DnaI_N Primosomal protein DnaI N-terminus. This family represents the N-terminus (approximately 120 residues) of bacterial primosomal DnaI proteins, although one family member appears to be of viral origin. DnaI is one of the components of the Bacillus subtilis replication restart primosome, and is required for the DnaB75-dependent loading of the DnaC helicase. 90
50269 399954 pfam07321 YscO Type III secretion protein YscO. This family contains the bacterial type III secretion protein YscO, which is approximately 150 residues long. YscO has been shown to be required for high-level expression and secretion of the anti-host proteins V antigen and Yops in Yersinia pestis. 148
50270 284687 pfam07322 Seadorna_Vp10 Seadornavirus Vp10. This family consists of several Seadornavirus Vp10 proteins found in the Banna and Kadipiro viruses. Members of this family are typically around 240 residues in length. The function of this family is unknown. 241
50271 399955 pfam07323 DUF1465 Protein of unknown function (DUF1465). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 154
50272 369318 pfam07324 DGCR6 DiGeorge syndrome critical region 6 (DGCR6) protein. This family contains DiGeorge syndrome critical region 6 (DGCR6) proteins (approximately 200 residues long) of a number of vertebrates. DGCR6 is a candidate for involvement in the DiGeorge syndrome pathology by playing a role in neural crest cell migration into the third and fourth pharyngeal pouches, the structures from which derive the organs affected in DiGeorge syndrome. Also found in this family is the Drosophila melanogaster gonadal protein gdl. 187
50273 148753 pfam07325 Curto_V2 Curtovirus V2 protein. This family consists of several Curtovirus V2 proteins. The exact function of V2 is unclear but it is known that the protein is required for a successful host infection process. 126
50274 399956 pfam07326 DUF1466 Protein of unknown function (DUF1466). This family consists of several hypothetical mammalian proteins of around 240 residues in length. 229
50275 115951 pfam07327 Neuroparsin Neuroparsin. This family consists of several locust specific neuroparsin proteins. Neuroparsins are produced by the A1 type of protocerebral median neurosecretory cells of the PI-CC system and display pleiotropic activities: inhibition of the effect of juvenile hormone, stimulation of fluid reabsorption of isolated recta, induction of an increase in hemolymph lipid and trehalose levels, and neurotrophic effects. 103
50276 284691 pfam07328 VirD1 T-DNA border endonuclease VirD1. This family consists of several T-DNA border endonuclease VirD1 proteins which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among others, on the specialized bacterial virulence proteins VirD1 and VirD2 that excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process. 142
50277 399957 pfam07330 DUF1467 Protein of unknown function (DUF1467). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown. 82
50278 399958 pfam07331 TctB Tripartite tricarboxylate transporter TctB family. This family consists of several hypothetical bacterial proteins of around 150 residues in length. This family was formerly known as DUF1468. 136
50279 399959 pfam07332 Phage_holin_3_6 Putative Actinobacterial Holin-X, holin superfamily III. Phage_holin_3_6 is a family of small hydrophobic proteins with two or three transmembrane domains of the Hol-X family. Holin proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. 116
50280 284695 pfam07333 SLR1-BP S locus-related glycoprotein 1 binding pollen coat protein (SLR1-BP). This family consists of a number of cysteine rich SLR1 binding pollen coat like proteins. Adhesion of pollen grains to the stigmatic surface is a critical step during sexual reproduction in plants. In Brassica, S locus-related glycoprotein 1 (SLR1), a stigma-specific protein belonging to the S gene family of proteins, has been shown to be involved in this step. SLR1-BP specifically binds SLR1 with high affinity. The SLR1-BP gene is specifically expressed in pollen at late stages of development and is a member of the class A pollen coat protein (PCP) family, which includes PCP-A1, an SLG (S locus glycoprotein)-binding protein. 56
50281 284696 pfam07334 IFP_35_N Interferon-induced 35 kDa protein (IFP 35) N-terminus. This family represents the N-terminus of interferon-induced 35 kDa protein (IFP 35) (approximately 80 residues long), which contains a leucine zipper motif in an alpha helical configuration. This family also includes N-myc-interactor (Nmi), a homologous interferon-induced protein. 76
50282 399960 pfam07335 Glyco_hydro_75 Fungal chitosanase of glycosyl hydrolase group 75. This family consists of several fungal chitosanase proteins. Chitin, xylan, 6-O-sulphated chitosan and O-carboxymethyl chitin are indigestible by chitosanase. EC:3.2.1.132. The mechanism is likely to be inverting, and the probable catalytic nucleophile/base is Asp, with the probable catalytic proton donor being Glu. (see the Chitosanase web-page from CAZY). 165
50283 399961 pfam07336 ABATE Putative stress-induced transcription regulator. The structure of one member of the ABATE domain family consists of a two-domain organisation, with the N-terminal domain presenting a new fold called the ABATE domain that may bind an as yet unknown ligand. The C-terminal domain forms a treble-clef zinc-finger that is likely to be involved in DNA binding. Further computational analyses suggest a role as a stress-induced transcriptional regulator. Members of this family are found in Streptomyces, Rhizobium, Ralstonia, Agrobacterium and Bradyrhizobium species. 90
50284 284699 pfam07337 CagY_M DC-EC Repeat. This repeat is found in the CagY proteins - part of the CAG pathogenicity island - and involved in delivery of the protein CagA into host cells. It forms part of a surface needle structure, and this repeat may form an alpha-helical rod structure. A conserved -DC- and -EC- can be seen regularly spaced in the alignment. 32
50285 399962 pfam07338 DUF1471 Protein of unknown function (DUF1471). This family consists of several hypothetical Enterobacterial proteins of around 90 residues in length. Some members of this family are annotated as ydgH precursors and contain two copies of this region, one at the N-terminus and the other at the C-terminus. The function of this family is unknown. 56
50286 115962 pfam07339 DUF1472 Protein of unknown function (DUF1472). This family consists of several Enterobacterial proteins of around 125 residues in length and contains 6 highly conserved cysteine residues. The function of this family is unknown. 101
50287 284701 pfam07340 Herpes_IE1 Cytomegalovirus IE1 protein. Expression from a human cytomegalovirus early promoter (E1.7) has been shown to be activated in trans by the IE2 gene product. Although the IE1 gene product alone had no effect on this early viral promoter, maximal early promoter activity was detected when both IE1 and IE2 gene products were present. The IE1 protein from cytomegalovirus is also known as UL123. 391
50288 115964 pfam07341 DUF1473 Protein of unknown function (DUF1473). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 163
50289 369323 pfam07342 DUF1474 Protein of unknown function (DUF1474). This family consists of several bacterial proteins of around 100 residues in length. Members of this family seem to be found exclusively in Staphylococcus aureus. The function of this family is unknown. 100
50290 399963 pfam07343 DUF1475 Protein of unknown function (DUF1475). This family consists of several hypothetical plant proteins of around 250 residues in length. Members of this family seem to be found exclusively in Arabidopsis thaliana. The function of this family is unknown. 236
50291 399964 pfam07344 Amastin Amastin surface glycoprotein. This family contains the eukaryotic surface glycoprotein amastin (approximately 180 residues long). In Trypanosoma cruzi, amastin is particularly abundant during the amastigote stage. 156
50292 399965 pfam07345 DUF1476 Domain of unknown function (DUF1476). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown. 102
50293 311348 pfam07346 DUF1477 Protein of unknown function (DUF1477). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 100 residues in length. The function of this family is unknown. 115
50294 399966 pfam07347 CI-B14_5a NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a). This family contains the eukaryotic NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a) (EC:1.6.5.3). This is approximately 100 residues long, and forms part of a multiprotein complex that resides on the inner mitochondrial membrane. The main function of the complex is the transport of electrons from NADH to ubiquinone, accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. 91
50295 399967 pfam07348 Syd Syd protein (SUKH-2). This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association. Operon analysis showed that Syd protein may function as immunity protein in bacterial toxin systems. 174
50296 284709 pfam07349 DUF1478 Protein of unknown function (DUF1478). This family consists of several hypothetical Sapovirus proteins of around 165 residues in length. The function of this family is unknown. 161
50297 399968 pfam07350 DUF1479 Protein of unknown function (DUF1479). This family consists of several hypothetical Enterobacterial proteins, of around 420 residues in length. Members of this family are often known as YbiU. The function of this family is unknown. 404
50298 369328 pfam07351 DUF1480 Protein of unknown function (DUF1480). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown. 79
50299 399969 pfam07352 Phage_Mu_Gam Bacteriophage Mu Gam like protein. This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. 146
50300 284713 pfam07353 Uroplakin_II Uroplakin II. This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension. 161
50301 336682 pfam07354 Sp38 Zona-pellucida-binding protein (Sp38). This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90-kDa family of zona pellucida glycoproteins in a calcium-dependent manner. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur. 177
50302 399970 pfam07355 GRDB Glycine/sarcosine/betaine reductase selenoprotein B (GRDB). This family represents a conserved region approximately 350 residues long within the selenoprotein B component of the bacterial glycine, sarcosine and betaine reductase complexes. 347
50303 399971 pfam07356 DUF1481 Protein of unknown function (DUF1481). This family consists of several hypothetical bacterial proteins of around 230 residues in length. Members of this family are often referred to as YjaH and are found in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown. 186
50304 369330 pfam07357 DRAT Dinitrogenase reductase ADP-ribosyltransferase (DRAT). This family consists of several bacterial dinitrogenase reductase ADP-ribosyltransferase (DRAT) proteins. Members of this family seem to be specific to Rhodospirillum, Rhodobacter and Azospirillum species. Dinitrogenase reductase ADP-ribosyl transferase (DRAT) carries out the transfer of the ADP-ribose from NAD to the Arg-101 residue of one subunit of the dinitrogenase reductase homodimer, resulting in inactivation of that enzyme. Dinitrogenase reductase-activating glycohydrolase (DRAG) removes the ADP-ribose group attached to dinitrogenase reductase, thus restoring nitrogenase activity. The DRAT-DRAG system negatively regulates nitrogenase activity in response to exogenous NH4+ or energy limitation in the form of a shift to darkness or to anaerobic conditions. 256
50305 369331 pfam07358 DUF1482 Protein of unknown function (DUF1482). This family consists of several Enterobacterial proteins of around 60 residues in length. The function of this family is unknown. 57
50306 284718 pfam07359 LEAP-2 Liver-expressed antimicrobial peptide 2 precursor (LEAP-2). This family consists of several mammalian liver-expressed antimicrobial peptide 2 (LEAP-2) sequences. LEAP-2 is a cysteine-rich, and cationic protein. LEAP-2 contains a core structure with two disulfide bonds formed by cysteine residues in relative 1-3 and 2-4 positions. LEAP-2 is synthesized as a 77-residue precursor, which is predominantly expressed in the liver and highly conserved among mammals. The largest native LEAP-2 form of 40 amino acid residues is generated from the precursor at a putative cleavage site for a furin-like endoprotease. In contrast to smaller LEAP-2 variants, this peptide exhibits dose-dependent antimicrobial activity against selected microbial model organisms. The exact function of this family is unclear. 77
50307 399972 pfam07361 Cytochrom_B562 Cytochrome b562. This family contains the bacterial cytochrome b562. This forms a four-helix bundle that non-covalently binds a single heme prosthetic group. 101
50308 399973 pfam07362 CcdA Post-segregation antitoxin CcdA. This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein. 71
50309 284721 pfam07363 DUF1484 Protein of unknown function (DUF1484). This family consists of several hypothetical bacterial proteins of around 110 residues in length. Members of this family appear to be found exclusively in Ralstonia solanacearum. The function of this family is unknown. 109
50310 399974 pfam07364 DUF1485 Metallopeptidase family M81. This is a family of proteobacterial metallo-peptidases. 287
50311 191732 pfam07365 Toxin_8 Alpha conotoxin precursor. This family consists of several alpha conotoxin precursor proteins from a number of Conus species. The alpha-conotoxins are small peptide neurotoxins from the venom of fish-hunting cone snails which block nicotinic acetylcholine receptors (nAChRs). 50
50312 399975 pfam07366 SnoaL SnoaL-like polyketide cyclase. This family includes SnoaL a polyketide cyclase involved in nogalamycin biosynthesis. This family was formerly known as DUF1486. The proteins in this family adopt a distorted alpha-beta barrel fold. Structural data together with site-directed mutagenesis experiments have shown that SnoaL has a different mechanism to that of the classical aldolase for catalyzing intramolecular aldol condensation. 126
50313 399976 pfam07367 FB_lectin Fungal fruit body lectin. This family consists of several fungal fruit body lectin proteins. Fruit body lectins are thought to have insecticidal activity and may also function in capturing nematodes. 139
50314 254173 pfam07368 DUF1487 Protein of unknown function (DUF1487). This family consists of several uncharacterized proteins from Drosophila melanogaster. The function of this family is unknown. 215
50315 399977 pfam07369 DUF1488 Protein of unknown function (DUF1488). This family consists of several hypothetical bacterial proteins of around 85 residues in length. The function of this family is unknown. 82
50316 399978 pfam07370 DUF1489 Protein of unknown function (DUF1489). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 138
50317 399979 pfam07371 DUF1490 Protein of unknown function (DUF1490). This family consists of several hypothetical bacterial proteins of around 90 residues in length. Members of the family seem to be found exclusively in Mycobacterium species. The function of this family is unknown. 88
50318 399980 pfam07372 DUF1491 Protein of unknown function (DUF1491). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 103
50319 399981 pfam07373 CAMP_factor CAMP factor (Cfa). This family consists of several bacterial CAMP factor (Cfa) proteins which seem to be specific to Streptococcus species. The CAMP reaction is a synergistic lysis of erythrocytes by the interaction of an extracellular protein (CAMP factor) produced by some streptococcal species with the Staphylococcus aureus sphingomyelinase C (beta-toxin). 220
50320 311366 pfam07374 DUF1492 Protein of unknown function (DUF1492). This family consists of several hypothetical, highly conserved Streptococcal and related phage proteins of around 100 residues in length. The function of this family is unknown. It appears to be distantly related to pfam08281. 100
50321 284730 pfam07376 Prosystemin Prosystemin. This family consists of several plant specific prosystemin proteins. Prosystemin is the precursor protein of the 18 amino acid wound signal systemin which activates systemic defense in plant leaves against insect herbivores. 204
50322 399982 pfam07377 DUF1493 Protein of unknown function (DUF1493). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in Salmonella and Yersinia species and several have been described as being putative cytoplasmic proteins. The function of this family is unknown. 111
50323 399983 pfam07378 FlbT Flagellar protein FlbT. This family consists of several FlbT proteins. FlbT is a post-transcriptional regulator of flagellin. FlbT is associated with the 5' untranslated region (UTR) of fljK (25 kDa flagellin) mRNA, and this association requires a predicted loop structure in the transcript. Mutations within this loop abolish FlbT association and result in increased mRNA stability. It is therefore thought that FlbT promotes the degradation of flagellin mRNA by associating with the 5' UTR. 125
50324 284733 pfam07379 DUF1494 Protein of unknown function (DUF1494). This family consists of several bacterial proteins of around 175 residues in length. Members of this family seem to be found exclusively in Chlamydia species. The function of this family is unknown. 179
50325 284734 pfam07380 Pneumo_M2 Pneumovirus M2 protein. This family consists of several Pneumovirus M2 proteins. The M2-1 protein of respiratory syncytial virus (RSV) is a transcription processivity factor that is essential for virus replication. 89
50326 369338 pfam07381 DUF1495 Winged helix DNA-binding domain (DUF1495). This family consists of several hypothetical archaeal proteins of around 110 residues in length. The structure of this domain possesses a winged helix DNA-binding domain suggesting these proteins are bacterial transcription factors. 90
50327 369339 pfam07382 HC2 Histone H1-like nucleoprotein HC2. This family contains the bacterial histone H1-like nucleoprotein HC2 (approximately 200 residues long), which seems to be found mostly in Chlamydia. HC2 functions in DNA condensation, although it has been suggested that it also has other roles. 187
50328 399984 pfam07383 DUF1496 Protein of unknown function (DUF1496). This family consists of several bacterial proteins of around 90 residues in length. Members of this family seem to be found exclusively in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown. 51
50329 284738 pfam07384 DUF1497 Protein of unknown function (DUF1497). This family consists of several phage and bacterial proteins of around 59 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this organism. The function of this family is unknown. 59
50330 399985 pfam07385 Lyx_isomer D-lyxose isomerase. Members of this family of sugar isomerases belong to the cupin superfamily. The enzyme from Cohnella laevoribosii has been shown to be specific for D-lyxose, L-ribose, and D-mannose. E. coli sugar isomerase (EcSI) has been structurally and functionally characterized and shows a preference for D-lyxose and D-mannose. 223
50331 399986 pfam07386 DUF1499 Protein of unknown function (DUF1499). This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown. 114
50332 254182 pfam07387 Seadorna_VP7 Seadornavirus VP7. This family consists of several Seadornavirus specific VP7 proteins of around 305 residues in length. The function of this family is unknown. However, it appears to be distantly related to protein kinases. 308
50333 399987 pfam07388 A-2_8-polyST Alpha-2,8-polysialyltransferase (POLYST). This family contains the bacterial enzyme alpha-2,8-polysialyltransferase (EC:2.4.99.-) (approximately 500 residues long). This catalyzes the polycondensation of alpha-2,8-linked sialic acid required for the synthesis of polysialic acid (PSA). 322
50334 284742 pfam07389 DUF1500 Protein of unknown function (DUF1500). This family consists of several Orthopoxvirus specific proteins of around 100 residues in length. The function of this family is unknown. 97
50335 369342 pfam07390 P30 Mycoplasma P30 protein. This family consists of several P30 proteins which seem to be specific to Mycoplasma agalactiae. P30 is a 30-kDa immunodominant antigen and is known to be a transmembrane protein. 150
50336 369343 pfam07391 NPR NPR nonapeptide repeat (2 copies). This nine-residue repeat has been named NPR, after NonaPeptide Repeat. It is found in two malarial proteins and has the consensus EEhhEEhhP where h stands for a hydrophobic amino acid. 17
50337 369344 pfam07392 P19Arf_N Cyclin-dependent kinase inhibitor 2a p19Arf N-terminus. This family represents the N-terminus (approximately 50 residues) of cyclin-dependent kinase inhibitor 2a p19Arf, which seems to be restricted to mammals. This is a tumor-suppressor protein that has been shown to inhibit the growth of human tumor cells lacking functional p53 by inducing a transient G2 arrest and subsequently apoptosis. 51
50338 399988 pfam07393 Sec10 Exocyst complex component Sec10. This family contains the Sec10 component (approximately 650 residues long) of the eukaryotic exocyst complex, which specifically affects the synthesis and delivery of secretory and basolateral plasma membrane proteins. 704
50339 399989 pfam07394 DUF1501 Protein of unknown function (DUF1501). This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long. 392
50340 284747 pfam07395 Mig-14 Mig-14. This family contains a number of bacterial mig-14 proteins (approximately 270 residues long). In Salmonella, mig-14 contributes to resistance to antimicrobial peptides, although the mechanism is not fully understood. 264
50341 399990 pfam07396 Porin_O_P Phosphate-selective porin O and P. This family represents a conserved region approximately 400 residues long within the bacterial phosphate-selective porins O and P. These are anion-specific porins, the binding site of which has a higher affinity for phosphate than chloride ions. Porin O has a higher affinity for polyphosphates, while porin P has a higher affinity for orthophosphate. In P. aeruginosa, porin O was found to be expressed only under phosphate-starvation conditions during the stationary growth phase. 358
50342 116019 pfam07397 DUF1502 Repeat of unknown function (DUF1502). This family consists of a number of repeats of around 34 residues in length. Members of this family seem to be found exclusively in three hypothetical Murid herpesvirus 4 proteins. The function of this family is unknown. 34
50343 399991 pfam07398 MDMPI_C MDMPI C-terminal domain. This domain is found at the C-terminus of the mycothiol maleylpyruvate isomerase enzyme (MDMPI). The structure of this protein has been solved. This domain appears weakly similar to pfam08608. 88
50344 399992 pfam07399 Na_H_antiport_3 Putative Na+/H+ antiporter. This family consists of several hypothetical bacterial proteins of around 440 residues in length. The function of this family is unknown. Many members carry 11 or 12 transmembrane regions, suggesting that they might be transporters. One family member, UniProtKB:Q821X2 is classified by TCDB as being an NhaE type of Na+/H+ antiporter. 418
50345 399993 pfam07400 IL11 Interleukin 11. This family contains interleukin 11 (approximately 200 residues long). This is a secreted protein that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of interleukin 11. Family members seem to be restricted to mammals. 167
50346 116023 pfam07401 Lenti_VIF_2 Bovine Lentivirus VIF protein. This family consists of several Lentivirus viral infectivity factor (VIF) proteins. VIF is known to be essential for the ability of cell-free virus preparations to infect cells. Members of this family are specific to Bovine immunodeficiency virus (BIV) and Jembrana disease virus, which also infects cattle. 198
50347 284752 pfam07402 Herpes_U26 Human herpesvirus U26 protein. This family consists of several Human herpesvirus U26 proteins of around 300 residues in length. The function of this family is unknown. 293
50348 369351 pfam07403 DUF1505 Protein of unknown function (DUF1505). This family consists of several uncharacterized Caenorhabditis elegans proteins of around 115 residues in length. Members of this family contain 6 highly conserved cysteine residues. The function of this family is unknown. 114
50349 254188 pfam07404 TEBP_beta Telomere-binding protein beta subunit (TEBP beta). This family consists of several telomere-binding protein beta subunits which appear to be specific to the family Oxytrichidae. Telomeres are specialized protein-DNA complexes that compose the ends of eukaryotic chromosomes. Telomeres protect chromosome termini from degradation and recombination and act together with telomerase to ensure complete genome replication. TEBP beta forms a complex with TEBP alpha and this complex is able to recognize and bind ssDNA to form a sequence-specific, telomeric nucleoprotein complex that caps the very 3' ends of chromosomes. 375
50350 284754 pfam07405 DUF1506 Protein of unknown function (DUF1506). This family consists of several bacterial proteins of around 130 residues in length. Members of this family seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 127
50351 399994 pfam07406 NICE-3 NICE-3 protein. This family consists of several eukaryotic NICE-3 and related proteins. The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC) which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis. The function of NICE-3 is unknown. 181
50352 284756 pfam07407 Seadorna_VP6 Seadornavirus VP6 protein. This family consists of several VP6 proteins from the Banna virus as well as a related protein VP5 from the Kadipiro virus. Members of this family are typically of around 420 residues in length. The function of this family is unknown. 420
50353 399995 pfam07408 DUF1507 Protein of unknown function (DUF1507). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 84
50354 399996 pfam07409 GP46 Phage protein GP46. This family contains GP46 phage proteins (approximately 120 residues long). 115
50355 399997 pfam07410 Phage_Gp111 Streptococcus thermophilus bacteriophage Gp111 protein. This family consists of several Streptococcus thermophilus bacteriophage Gp111 proteins of around 110 residues in length. The function of this family is unknown. 118
50356 399998 pfam07411 DUF1508 Domain of unknown function (DUF1508). This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical. 48
50357 399999 pfam07412 Geminin Geminin. This family contains the eukaryotic protein geminin (approximately 200 residues long). Geminin inhibits DNA replication by preventing the incorporation of MCM complex into prereplication complex, and is degraded during the mitotic phase of the cell cycle. It has been proposed that geminin inhibits DNA replication during S, G2, and M phases and that geminin destruction at the metaphase-anaphase transition permits replication in the succeeding cell cycle. 194
50358 284761 pfam07413 Herpes_UL37_2 Betaherpesvirus immediate-early glycoprotein UL37. This family consists of several Betaherpesvirus immediate-early glycoprotein UL37 sequences. The human cytomegalovirus (HCMV) UL37 immediate-early regulatory protein is a type I integral membrane N-glycoprotein which traffics through the ER and the Golgi network. 334
50359 284762 pfam07415 Herpes_LMP2 Gammaherpesvirus latent membrane protein (LMP2) protein. This family consists of several Gammaherpesvirus latent membrane protein (LMP2) proteins. Epstein-Barr virus is a human Gammaherpesvirus that infects and establishes latency in B lymphocytes in vivo. The latent membrane protein 2 (LMP2) gene is expressed in latently infected B cells and encodes two protein isoforms, LMP2A and LMP2B, that are identical except for an additional N-terminal 119 aa cytoplasmic domain which is present in the LMP2A isoform. LMP2A is thought to play a key role in either the establishment or the maintenance of latency and/or the reactivation of productive infection from the latent state. The significance of LMP2B and its role in pathogenesis remain unclear. 497
50360 284763 pfam07416 Crinivirus_P26 Crinivirus P26 protein. This family consists of several Crinivirus P26 proteins which seem to be found exclusively in the Lettuce infectious yellows virus. The function of this family is unknown. 227
50361 400000 pfam07417 Crl Transcriptional regulator Crl. This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibers that are found on the surface of certain bacteria. 128
50362 400001 pfam07418 PCEMA1 Acidic phosphoprotein precursor PCEMA1. This family consists of several acidic phosphoprotein precursor PCEMA1 sequences which appear to be found exclusively in Plasmodium chabaudi. PCEMA1 is an antigen that is associated with the membrane of the infected erythrocyte throughout the entire intraerythrocytic cycle. The exact function of this family is unclear. 294
50363 369356 pfam07419 PilM PilM. This family contains the bacterial protein PilM (approximately 150 residues long). PilM is an inner membrane protein that has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. 135
50364 284767 pfam07420 DUF1509 Protein of unknown function (DUF1509). This family consists of several uncharacterized viral proteins from the Marek's disease-like viruses. Members of this family are typically around 400 residues in length. The function of this family is unknown. 384
50365 400002 pfam07421 Pro-NT_NN Neurotensin/neuromedin N precursor. This family contains the precursor of bacterial neurotensin/neuromedin N (approximately 170 residues long). This is the common precursor of two biologically active related peptides, neurotensin and neuromedin N. It undergoes tissue-specific processing leading to the formation in some tissues and cancer cell lines of large peptides ending with the neurotensin or neuromedin N sequence. 161
50366 400003 pfam07422 s48_45 Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation. This domain contains 6 conserved cysteines suggesting 3 disulphide bridges. 110
50367 400004 pfam07423 DUF1510 Protein of unknown function (DUF1510). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. 93
50368 369359 pfam07424 TrbM TrbM. This family contains the bacterial protein TrbM (approximately 180 residues long). In Comamonas testosteroni T-2, TrbM is derived from the IncP1beta plasmid pTSA, which encodes the widespread genes for p-toluenesulfonate (TSA) degradation. 156
50369 116046 pfam07425 Pardaxin Pardaxin. This family consists of several Pardaxin proteins. Pardaxin, a 33-amino-acid pore-forming polypeptide toxin isolated from the Red Sea Moses sole Pardachirus marmoratus, has a helix-hinge-helix structure. This is a common structural motif found both in antibacterial peptides that can act selectively on bacterial membranes (e.g., cecropin), and in cytotoxic peptides that can lyse both mammalian and bacterial cells (e.g., melittin). Pardaxin possesses a high antibacterial activity with a significantly reduced haemolytic activity towards human red blood cells compared with melittin. Pardaxin has also been found to have a shark repellent action. 33
50370 400005 pfam07426 Dynactin_p22 Dynactin subunit p22. This family contains p22, the smallest subunit of dynactin, a complex that binds to cytoplasmic dynein and is a required activator for cytoplasmic dynein-mediated vesicular transport. Dynactin localizes to the cleavage furrow and to the midbodies of dividing cells, suggesting that it may function in cytokinesis. Family members are approximately 170 residues long. 164
50371 400006 pfam07428 Tri3 15-O-acetyltransferase Tri3. This family represents a conserved region approximately 400 residues long within 15-O-acetyltransferase (Tri3), which seems to be restricted to ascomycete fungi. In Fusarium sporotrichioides, this is required for acetylation of the C-15 hydroxyl group of trichothecenes in the biosynthesis of T-2 toxin. 416
50372 400007 pfam07429 Glyco_transf_56 4-alpha-L-fucosyltransferase glycosyl transferase group 56. This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (EC 2.4.1.-) (approximately 360 residues long). This catalyzes the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc. 358
50373 311397 pfam07430 PP1 Phloem filament protein PP1 cystatin-like domain. This domain represents a conserved region related to cystatins, eight copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem. 78
50374 400008 pfam07431 DUF1512 Protein of unknown function (DUF1512). This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown. 355
50375 311399 pfam07432 Hc1 Histone H1-like protein Hc1. This family consists of several bacterial histone H1-like Hc1 proteins. In Chlamydia, Hc1 is expressed in the late stages of the life cycle, concomitant with the reorganisation of chlamydial reticulate bodies into elementary bodies. This suggests that Hc1 protein plays a role in the condensation of chromatin during intracellular differentiation. 124
50376 400009 pfam07433 DUF1513 Protein of unknown function (DUF1513). This family consists of several bacterial proteins of around 360 residues in length. The function of this family is unknown. 304
50377 400010 pfam07434 CblD CblD like pilus biogenesis initiator. This family consists of several minor pilin proteins including CblD from Burkholderia cepacia, which is known to be the initiator of pilus biogenesis. The family also contains a variety of Enterobacterial minor pilin proteins. 380
50378 400011 pfam07435 YycH YycH protein. This family contains the bacterial protein YycH which is approximately 450 residues long. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two component system together with its cognate response regulator YycF. PhoA fusion studies have shown that YycH is transported across the cytoplasmic membrane. It is postulated that YycH functions as an antagonist to YycG. The molecule is made up of three domains, and has a novel three-dimensional structure. The N-terminal domain features a calcium binding site and the central domain contains two conserved loop regions. 407
50379 369364 pfam07436 Curto_V3 Curtovirus V3 protein. This family consists of several Curtovirus V3 proteins of around 90 residues in length. The function of this family is unknown. 87
50380 284781 pfam07437 YfaZ YfaZ precursor. This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins. 180
50381 369365 pfam07438 DUF1514 Protein of unknown function (DUF1514). This family consists of several Staphylococcus aureus and related bacteriophage proteins of around 65 residues in length. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids. 62
50382 400012 pfam07439 DUF1515 Protein of unknown function (DUF1515). This family consists of several hypothetical bacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Rhizobium species. The function of this family is unknown. 113
50383 116061 pfam07440 Caerin_1 Caerin 1 protein. This family consists of several caerin 1 proteins from Litoria species. The caerin 1 peptides are among the most powerful of the broad-spectrum antibiotic amphibian peptides. 24
50384 400013 pfam07441 BofA SigmaK-factor processing regulatory protein BofA. This family contains the sigmaK-factor processing regulatory protein BofA (Bypass-of-forespore protein A) (approximately 80 residues long). During sporulation in Bacillus subtilis, transcription is controlled in the developing sporangium by a cascade of sporulation-specific transcription factors (sigma factors). Following engulfment, processing of sigmaK is inhibited by BofA. It has been suggested that this effect is exerted by alteration of the level of the SpoIVFA protein. 73
50385 116063 pfam07442 Ponericin Ponericin. This family contains a number of ponericin peptides (approximately 30 residues long) from the venom of the predatory ant Pachycondyla goeldii. These peptides exhibit antibacterial and insecticidal properties, and may adopt an amphipathic alpha-helical structure in polar environments such as cell membranes. 29
50386 400014 pfam07443 HARP HepA-related protein (HARP). This family represents a conserved region approximately 60 residues long within eukaryotic HepA-related protein (HARP). This exhibits single-stranded DNA-dependent ATPase activity, and is ubiquitously expressed in human and mouse tissues. Family members may contain more than one copy of this region. 55
50387 400015 pfam07444 Ycf66_N Ycf66 protein N-terminus. This family represents the N-terminus (approximately 80 residues) of Ycf66, a protein that seems to be restricted to eukaryotes that contain chloroplasts and to cyanobacteria. 76
50388 400016 pfam07445 PriC Primosomal replication protein priC. This family contains the bacterial primosomal replication protein priC. In Escherichia coli, this functions in the assembly of the primosome. 173
50389 400017 pfam07447 VP40 Matrix protein VP40. This family contains viral VP40 matrix proteins that seem to be restricted to the Filoviridae. These play an important role in the assembly process of virus particles by interacting with cellular factors, cellular membranes, and the ribonucleoprotein particle complex. It has been shown that the N-terminal region of VP40 folds into a mixture of hexameric and octameric states - these may have distinct roles. 292
50390 311407 pfam07448 Spp-24 Secreted phosphoprotein 24 (Spp-24) cystatin-like domain. This family represents a conserved region approximately 60 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates. This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover. 64
50391 400018 pfam07449 HyaE Hydrogenase-1 expression protein HyaE. This family contains bacterial hydrogenase-1 expression proteins approximately 120 residues long. This includes the E. coli protein HyaE, and the homologous proteins HoxO of R. eutropha and HupG of R. leguminosarum. Deletion of the hoxO gene in R. eutropha led to complete loss of the uptake [NiFe] hydrogenase activity, suggesting that it has a critical role in hydrogenase assembly. 108
50392 400019 pfam07450 HycH Formate hydrogenlyase maturation protein HycH. This family contains the bacterial formate hydrogenlyase maturation protein HycH, which is approximately 140 residues long. This may be required for the conversion of a precursor form of the large subunit of hydrogenlyase 3 into a mature form. 129
50393 400020 pfam07451 SpoVAD Stage V sporulation protein AD (SpoVAD). This family contains the bacterial stage V sporulation protein AD (SpoVAD), which is approximately 340 residues long. This is one of six proteins encoded by the spoVA operon, which is transcribed exclusively in the forespore at about the time of dipicolinic acid (DPA) synthesis in the mother cell. The functions of the proteins encoded by the spoVA operon are unknown, but it has been suggested they are involved in DPA transport during sporulation. 329
50394 400021 pfam07452 CHRD CHRD domain. CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologs. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG. 115
50395 284793 pfam07453 NUMOD1 NUMOD1 domain. This domain probably represents a DNA-binding helix-turn-helix based on its similarity to other families (Bateman A pers obs). 37
50396 400022 pfam07454 SpoIIP Stage II sporulation protein P (SpoIIP). This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation. Engulfment in Bacillus subtilis is mediated by two complementary systems: the first includes the proteins SpoIID, SpoIIM and SpoIIP (DMP) which carry out the engulfment, and the second includes the SpoIIQ-SpoIIIAGH (Q-AH) zipper, that recruits other proteins to the septum in a second-phase of the engulfment. The course of events follows as the incorporation firstly of SpoIIB into the septum during division to serve directly or indirectly as a landmark for localising SpoIIM and then SpoIIP and SpoIID to the septum. SpoIIP and SpoIID interact together to form part of the DMP complex. SpoIIP itself has been identified as an autolysin with peptidoglycan hydrolase activity. 263
50397 311413 pfam07455 Psu Phage polarity suppression protein (Psu). This family contains a number of phage polarity suppression proteins (Psu) (approximately 190 residues long). The Psu protein of bacteriophage P4 causes suppression of transcriptional polarity in Escherichia coli by overcoming Rho termination factor activity. 174
50398 400023 pfam07456 Hpre_diP_synt_I Heptaprenyl diphosphate synthase component I. This family contains component I of bacterial heptaprenyl diphosphate synthase (EC:2.5.1.30) (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7. 147
50399 400024 pfam07457 DUF1516 Protein of unknown function (DUF1516). This family contains a number of hypothetical bacterial proteins of unknown function approximately 120 residues long. 107
50400 400025 pfam07458 SPAN-X Sperm protein associated with nucleus, mapped to X chromosome. This family contains human sperm proteins associated with the nucleus and mapped to the X chromosome (SPAN-X) (approximately 100 residues long). SPAN-X proteins are cancer-testis antigens (CTAs), and thus represent potential targets for cancer immunotherapy because they are widely distributed in tumors but not in normal tissues, except testes. They are highly insoluble, acidic, and polymorphic. 94
50401 400026 pfam07459 CTX_RstB CTX phage RstB protein. This family contains a number of RstB proteins approximately 120 residues long, including RstB1 and RstB2, from the Vibrio cholerae phage CTX. Functional analyses indicate that rstB2 is required for integration of the CTXphi phage into the V. cholerae chromosome. 90
50402 400027 pfam07460 NUMOD3 NUMOD3 motif (2 copies). NUMOD3 is a DNA-binding motif found in homing endonucleases and related proteins. It occurs on its own or in tandem repeats in GIY-YIG (pfam01541) and HTH proteins. It constitutes a beta-turn-loop-helix subregion of the DNA-binding domain of I-TevI homing endonuclease. 37
50403 284801 pfam07461 NADase_NGA Nicotine adenine dinucleotide glycohydrolase (NADase). This family consists of several bacterial nicotine adenine dinucleotide glycohydrolase (NGA) proteins which appear to be specific to Streptococcus pyogenes. NAD glycohydrolase (NADase) is a potential virulence factor. Streptococcal NADase may contribute to virulence by its ability to cleave beta-NAD at the ribose-nicotinamide bond, depleting intracellular NAD pools and producing the potent vasoactive compound nicotinamide. 446
50404 400028 pfam07462 MSP1_C Merozoite surface protein 1 (MSP1) C-terminus. This family represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200-kDa protein expressed on the surface of the P. vivax merozoite. MSP-1 of Plasmodium species is synthesized as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19-kDa C-terminal fragment (MSP-119), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-119 inhibit merozoite entry into red cells, and immunisation with MSP-119 protects monkeys from challenging infections. Hence, MSP-119 is considered a promising vaccine candidate. 553
50405 400029 pfam07463 NUMOD4 NUMOD4 motif. NUMOD4 is a putative DNA-binding motif found in homing endonucleases and related proteins. 49
50406 369376 pfam07464 ApoLp-III Apolipophorin-III precursor (apoLp-III). This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage. 143
50407 400030 pfam07465 PsaM Photosystem I protein M (PsaM). This family consists of several plant and cyanobacterial photosystem I protein M (PsaM) sequences. PsaM forms part of the photosystem I complex and its binding is stabilized by PsaI. 29
50408 400031 pfam07466 DUF1517 Protein of unknown function (DUF1517). This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown. 183
50409 400032 pfam07467 BLIP Beta-lactamase inhibitor (BLIP). The structure of BLIP reveals two structural domains, which form a polar, concave surface that docks onto a predominantly polar, convex protrusion on beta-lactamase. The ability of BLIP to adapt to a variety of class A beta-lactamases is thought to be due to flexibility between these two domains. 123
50410 116089 pfam07468 Agglutinin Agglutinin domain. 141
50411 400033 pfam07469 DUF1518 Domain of unknown function (DUF1518). This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins. 58
50412 400034 pfam07470 Glyco_hydro_88 Glycosyl Hydrolase Family 88. Unsaturated glucuronyl hydrolase catalyzes the hydrolytic release of unsaturated glucuronic acids from oligosaccharides (EC:3.2.1.-) produced by the reactions of polysaccharide lyases. 343
50413 369382 pfam07471 Phage_Nu1 Phage DNA packaging protein Nu1. Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA. 164
50414 400035 pfam07472 PA-IIL Fucose-binding lectin II (PA-IIL). In Pseudomonas aeruginosa the fucose-binding lectin II (PA-IIL) contributes to the pathogenic virulence of the bacterium. PA-IIL functions as a tetramer when binding fucose. Each monomer is comprised of a nine-stranded, antiparallel beta-sandwich arrangement and contains two calcium cations that mediate the binding of fucose in a recognition mode unique among carbohydrate-protein interactions. 107
50415 400036 pfam07473 Toxin_11 Spasmodic peptide gm9a; conotoxin from Conus species. This family consists of several spasmodic peptide gm9a sequences. Conotoxin gm9a is a putative 27-residue polypeptide encoded by Conus gloriamaris and is known to be a homolog of the 'spasmodic peptide', tx9a, isolated from the venom of the mollusc-hunting cone shell Conus textile. Upon injection of this venom component, normal mice are converted into behavioural phenocopies of a well-known mutant, the spasmodic mouse. 28
50416 400037 pfam07474 G2F G2F domain. Nidogen, an invariant component of basement membranes, is a multifunctional protein that interacts with most other major basement membrane proteins. The G2 fragment or (G2F domain) contains binding sites for collagen IV and perlecan. The structure is composed of an 11-stranded beta-barrel with a central helix. This domain is structurally related to that of green fluorescent protein pfam01353. A large surface patch on the beta-barrel is conserved in all metazoan nidogens. 184
50417 400038 pfam07475 Hpr_kinase_C HPr Serine kinase C-terminal domain. This family represents the C terminal kinase domain of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. 171
50418 284814 pfam07476 MAAL_C Methylaspartate ammonia-lyase C-terminus. Methylaspartate ammonia-lyase EC:4.3.1.2 catalyzes the second step of fermentation of glutamate. It is a homodimer. This family represents the C-terminal region of Methylaspartate ammonia-lyase and contains a TIM barrel fold similar to the pfam01188. This family represents the catalytic domain and contains a metal binding site. 247
50419 400039 pfam07477 Glyco_hydro_67C Glycosyl hydrolase family 67 C-terminus. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the C terminal region of alpha-glucuronidase which is mainly alpha-helical. It wraps around the catalytic domain (pfam07488), making additional interactions both with the N-terminal domain (pfam03648) of its parent monomer and also forming the majority of the dimer-surface with the equivalent C-terminal domain of the other monomer of the dimer. 223
50420 400040 pfam07478 Dala_Dala_lig_C D-ala D-ala ligase C-terminus. This family represents the C-terminal, catalytic domain of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF). 205
50421 400041 pfam07479 NAD_Gly3P_dh_C NAD-dependent glycerol-3-phosphate dehydrogenase C-terminus. NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the C-terminal substrate-binding domain. 141
50422 369385 pfam07481 DUF1521 Domain of Unknown Function (DUF1521). This family of unknown function is found in a limited set of Bradyrhizobium proteins. There appears to be a periodic -DG- motif in it. 169
50423 284819 pfam07482 DUF1522 Domain of Unknown Function (DUF1522). 110
50424 400042 pfam07483 W_rich_C Tryptophan-rich Synechocystis species C-terminal domain. This domain is found at the C-terminus, normally between 2-3 copies, of a range of Synechocystis membrane proteins. This domain is fairly tryptophan rich as well. 105
50425 400043 pfam07484 Collar Phage Tail Collar Domain. This region is occasionally found in conjunction with pfam03335. Most of the family appear to be phage tail proteins; however, some appear to be involved in other processes. For instance a member from Rhizobium leguminosarum may be involved in plant-microbe interactions. A related protein MrpB is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologs but the intervening sequence is not. Much of the function of phage T4 appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the 'collar'. The domain may bind SO4; however, the residues accredited with this vary between the PDB file and the Swiss-Prot entry. The long unconserved region may be due to domain swapping in and out of a loop or reflective of rapid evolution. 57
50426 400044 pfam07485 DUF1529 Domain of Unknown Function (DUF1529). This family is the lppY/lpqO homolog family. 118
50427 400045 pfam07486 Hydrolase_2 Cell Wall Hydrolase. These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance Bacillus subtilis steB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyzes the cortex. A similar role is carried out by the partially redundant Bacillus subtilis CwlJ. It is not clear whether these enzymes are amidases or peptidases. 101
50428 400046 pfam07487 SopE_GEF SopE GEF domain. This family represents the C-terminal guanine nucleotide exchange factor (GEF) domain of SopE. Salmonella typhimurium employs a type III secretion system to inject bacterial toxins into the host cell cytosol. These toxins transiently activate Rho family GTP-binding protein-dependent signaling cascades to induce cytoskeletal rearrangements. SopE can activate Cdc42, an essential component of the host cellular signaling cascade, in a Dbl-like fashion despite its lack of sequence similarity to Dbl-like proteins, the Rho-specific eukaryotic guanine nucleotide exchange factors. 136
50429 400047 pfam07488 Glyco_hydro_67M Glycosyl hydrolase family 67 middle domain. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the central catalytic domain of alpha-glucuronidase. 324
50430 369388 pfam07489 Tir_receptor_C Translocated intimin receptor (Tir) C-terminus. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir C-terminal domain which has been reported to bind uninfected host cells and beta-1 integrins although the role of intimin binding to integrins is unclear. This intimin C-terminal domain has also been shown to be sufficient for Tir recognition. 222
50431 254231 pfam07490 Tir_receptor_N Translocated intimin receptor (Tir) N-terminus. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir N-terminal domain which is involved in Tir stability and Tir secretion. 269
50432 400048 pfam07491 PPI_Ypi1 Protein phosphatase inhibitor. These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor, and ppp1r11/hcgv, annotated as having protein phosphatase inhibitor activity. 57
50433 400049 pfam07492 Trehalase_Ca-bi Neutral trehalase Ca2+ binding domain. Neutral trehalases mobilise trehalose accumulated by fungal cells as a protective and storage carbohydrate. This family represents a calcium-binding domain similar to EF hand. Residues 97 and 108 in S. pombe ntp1 have been implicated in this interaction. It is thought that this domain may provide a general mechanism for regulating neutral trehalase activity in yeasts and filamentous fungi. 30
50434 400050 pfam07494 Reg_prop Two component regulator propeller. A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to pfam01011 and pfam00400 indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA. 24
50435 400051 pfam07495 Y_Y_Y Y_Y_Y domain. This domain is mostly found at the end of the beta propellers (pfam07494) in a family of two component regulators. However they are also found tandemly repeated in CTC_02402 without other signal conduction domains being present. It's named after the conserved tyrosines found in the alignment. The exact function is not known. 65
50436 400052 pfam07496 zf-CW CW-type Zinc Finger. This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialized mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found. 46
50437 400053 pfam07497 Rho_RNA_bind Rho termination factor, RNA-binding domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. 72
50438 400054 pfam07498 Rho_N Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain (pfam07497). 43
50439 400055 pfam07499 RuvA_C RuvA, C-terminal domain. Homologous recombination is a crucial process in all living organisms. In bacteria, the RuvA, RuvB, and RuvC proteins are involved in this process. More specifically, the proteins process the Holliday junction DNA. RuvA is comprised of three distinct domains. This domain represents the C-terminal domain and plays a significant role in the ATP-dependent branch migration of the hetero-duplex through direct contact with RuvB. Within the Holliday junction, the C-terminal domain makes no interaction with DNA. 47
50440 400056 pfam07500 TFIIS_M Transcription factor S-II (TFIIS), central domain. Transcription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and pfam01096 are required for transcription activity. 110
50441 400057 pfam07501 G5 G5 domain. This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function. 75
50442 400058 pfam07502 MANEC MANEC domain. This region of similarity, comprising 8 conserved cysteines, is found in the N-terminal region of several membrane-associated and extracellular proteins. Although formerly called MANSC (for motif at N-terminus with seven cysteines), it has now been renamed MANEC (motif at N-terminus with eight cysteines) by Richard Mitter and Stephen Fitzgerald after the discovery of an eighth conserved cysteine. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. 89
50443 400059 pfam07503 zf-HYPF HypF finger. The HypF family of proteins are involved in the maturation and regulation of hydrogenase. In the N-terminus they appear to have two Zinc finger domains, as modelled by this family. 33
50444 400060 pfam07504 FTP Fungalysin/Thermolysin Propeptide Motif. This motif is found in both the bacterial M4 peptidase propeptide and the fungal M36 propeptide. Its exact function is not clear, but it is likely either to inhibit the peptidase, so as to prevent its premature activation, or to have a chaperone activity. Both of these roles have been ascribed to the M4 and M36 propeptides. 51
50445 400061 pfam07505 DUF5131 Protein of unknown function (DUF5131). This is a family of bacterial and phage proteins of unknown function. There are three highly conserved cysteine residues in the disposition, Cx6Cxxc, amongst many highly conserved residues. 246
50446 311449 pfam07506 RepB RepB plasmid partitioning protein. This family includes proteins with sequence similarity to the RepB partitioning protein of the large Ti (tumor-inducing) plasmids of Agrobacterium tumefaciens. 185
50447 400062 pfam07507 WavE WavE lipopolysaccharide synthesis. These proteins are encoded by putative wav gene clusters, which are responsible for the synthesis of the core oligosaccharide (OS) region of Vibrio cholerae lipopolysaccharide. 305
50448 377856 pfam07508 Recombinase Recombinase. This domain is usually found associated with pfam00239 in putative integrases/recombinases of mobile genetic elements of diverse bacteria and phages. 102
50449 400063 pfam07509 DUF1523 Protein of unknown function (DUF1523). 175
50450 400064 pfam07510 DUF1524 Protein of unknown function (DUF1524). This family of uncharacterized proteins contains a conserved HXXP motif. A similar motif is seen in protein families in the His-Me finger endonuclease superfamily, which suggests this family of proteins may also act as endonucleases. 140
50451 400065 pfam07511 DUF1525 Protein of unknown function (DUF1525). 113
50452 400066 pfam07514 TraI_2 Putative helicase. Some members of this family have been annotated as helicases. 325
50453 400067 pfam07515 TraI_2_C Putative conjugal transfer nickase/helicase TraI C-term. 123
50454 400068 pfam07516 SecA_SW SecA Wing and Scaffold domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This family is composed of two C-terminal alpha helical subdomains: the wing and scaffold subdomains. 209
50455 369399 pfam07517 SecA_DEAD SecA DEAD-like domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the pfam00270. 379
50456 284851 pfam07519 Tannase Tannase and feruloyl esterase. This family includes fungal tannase and feruloyl esterase. It also includes several bacterial homologs of unknown function. 460
50457 400069 pfam07520 SrfB Virulence factor SrfB. This family includes homologs of SrfB. SsrAB is a two-component regulatory system encoded within the Salmonella pathogenicity island SPI-2. Among the products of genes activated by SsrAB within epithelial and macrophage cells is Salmonella typhimurium srfB. Homologs are found in several other proteobacteria. 985
50458 400070 pfam07521 RMMBL Zn-dependent metallo-hydrolase RNA specificity domain. The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. This, the fifth motif, appears to be specific to Zn-dependent metallohydrolases such as ribonuclease J 2 which are involved in the processing of mRNA. This domain adds essential structural elements to the CASP-domain and is unique to RNA/DNA-processing nucleases, showing that they are pre-mRNA 3'-end-processing endonucleases. 61
50459 400071 pfam07522 DRMBL DNA repair metallo-beta-lactamase. The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair. 107
50460 400072 pfam07523 Big_3 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins. 67
50461 400073 pfam07524 Bromo_TP Bromodomain associated. This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. 77
50462 400074 pfam07525 SOCS_box SOCS box. The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases. 38
50463 400075 pfam07526 POX Associated with HOX. The function of this domain is unknown. It is often found in plant proteins associated with pfam00046. 139
50464 400076 pfam07527 Hairy_orange Hairy Orange. The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute. 39
50465 284860 pfam07528 DZF DZF domain. The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. 248
50466 400077 pfam07529 HSA HSA. This domain is predicted to bind DNA and is often found associated with helicases. 71
50467 148888 pfam07530 PRE_C2HC Associated with zinc fingers. The function of this domain is unknown. It is often found associated with pfam00096. 68
50468 400078 pfam07531 TAFH NHR1 homology to TAF. This corresponds to the region NHR1 that is conserved between the product of the nervy gene in Drosophila and the human mtg8b protein, which is hypothesized to be a transcription factor. 89
50469 400079 pfam07532 Big_4 Bacterial Ig-like domain (group 4). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins. 59
50470 400080 pfam07533 BRK BRK domain. The function of this domain is unknown. It is often found associated with helicases and transcription factors. 43
50471 400081 pfam07534 TLD TLD. This domain is predicted to be an enzyme and is often found associated with pfam01476. Its structure consists of a beta-sandwich surrounded by two helices and two one-turn helices. 136
50472 400082 pfam07535 zf-DBF DBF zinc finger. This domain is predicted to bind metal ions and is often found associated with pfam00533 and pfam02178. It was first identified in the Drosophila chiffon gene product, and is associated with initiation of DNA replication. 42
50473 400083 pfam07536 HWE_HK HWE histidine kinase. Two-component systems, consisting of a histidine kinase and a cognate response regulator protein, represent the best-known apparatus for transducing external cues into a physiological response in bacteria. The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as pfam00512. The family was defined by the presence of a highly conserved H residue in the kinase domain and a WxE motif in a C-terminal ATPase domain that is related to pfam02518. These proteins are found in a variety of alpha- and gamma-proteobacteria, with significant enrichment in the rhizobia. 83
50474 400084 pfam07537 CamS CamS sex pheromone cAM373 precursor. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. 316
50475 400085 pfam07538 ChW Clostridial hydrophobic W. A novel extracellular macromolecular system has been proposed based on the proteins containing ChW repeats. ChW stands for Clostridial hydrophobic with conserved W (tryptophan). This repeat was originally described in Clostridium acetobutylicum but is also found in other Gram-positive bacteria including Enterococcus faecalis, Streptococcus agalactiae and Streptomyces coelicolor. 35
50476 400086 pfam07539 DRIM Down-regulated in metastasis. These eukaryotic proteins include DRIM (Down-Regulated In Metastasis), which is differentially expressed in metastatic and non-metastatic human breast carcinoma cells. It is believed to be involved in processing of non-coding RNA. 591
50477 400087 pfam07540 NOC3p Nucleolar complex-associated protein. Nucleolar complex-associated protein (Noc3p) is conserved in eukaryotes and has essential roles in replication and rRNA processing in Saccharomyces cerevisiae. 91
50478 400088 pfam07541 EIF_2_alpha Eukaryotic translation initiation factor 2 alpha subunit. These proteins share a region of similarity that falls towards the C-terminus from pfam00575. 112
50479 400089 pfam07542 ATP12 ATP12 chaperone protein. Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesized as precursors with amino-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly. 121
50480 369416 pfam07543 PGA2 Protein trafficking PGA2. A Saccharomyces cerevisiae member of this family (PGA2) is an ER protein which has been implicated in protein trafficking. 138
50481 400090 pfam07544 Med9 RNA polymerase II transcription mediator complex subunit 9. This family of Med9 proteins is conserved in yeasts. It forms part of the middle region of Mediator. Med9 has two functional domains. The species-specific amino-terminal half (aa 1-63) plays a regulatory role in transcriptional regulation, whereas this well-conserved carboxy-terminal half (aa 64-149) has a more fundamental function involved in direct binding to the amino-terminal portions of Med4 and Med7 and the assembly of Med9 into the Middle module. Also, some unidentified factor(s) in med9 extracts may impact the binding of TFIID to the promoter. 79
50482 400091 pfam07545 Vg_Tdu Vestigial/Tondu family. The mammalian TEF and the Drosophila scalloped genes belong to a conserved family of transcriptional factors that possesses a TEA/ATTS DNA-binding domain. Transcriptional activation by these proteins likely requires interactions with specific coactivators. In Drosophila, Scalloped (Sd) interacts with Vestigial (Vg) to form a complex, which binds DNA through the Sd TEA/ATTS domain. The Sd-Vg heterodimer is a key regulator of wing development, which directly controls several target genes and is able to induce wing outgrowth when ectopically expressed. This short conserved region is needed for interaction with Sd. 30
50483 400092 pfam07546 EMI EMI domain. The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains. 67
50484 400093 pfam07547 RSD-2 RSD-2 N-terminal domain. This domain is found in three copies in the N-terminus of the C. elegans RSD-2 protein. RSD-2 (RNAi spreading defective) is involved in systemic RNAi. Mutations in the rsd-2 gene do not affect somatic genes but only germline expressed genes. 83
50485 311484 pfam07548 ChlamPMP_M Chlamydia polymorphic membrane protein middle domain. This family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. This domain is found between the beta-helical repeats (pfam02415) and the C-terminal pfam03797. This domain is excised subsequent to secretion. 170
50486 400094 pfam07549 Sec_GG SecD/SecF GG Motif. This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC. SecD and SecF are required to maintain a proton motive force. This alignment encompasses a -GG- motif typically found in the N-terminal half of the SecD/SecF proteins. 27
50487 369421 pfam07550 DUF1533 Protein of unknown function (DUF1533). This family consists of several hypothetical bacterial proteins and is around 60 residues in length. Its function is not known. 59
50488 148905 pfam07551 DUF1534 Protein of unknown function (DUF1534). This family is found in a group of small bacterial proteins. Its function is not known. 48
50489 284881 pfam07552 Coat_X Spore Coat Protein X and V domain. This family is found in the Bacillales coat protein X as a tandem repeat and also in coat protein V. The proteins are found in the insoluble fraction. 54
50490 400095 pfam07553 Lipoprotein_Ltp Host cell surface-exposed lipoprotein. This is a family of lipoproteins that is involved in superinfection exclusion. Proteins in this family have been shown to act at the stage of DNA release from the phage head into the cell. 48
50491 400096 pfam07554 FIVAR FIVAR domain. This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Bacillus sp. Gellan lyase, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures. 69
50492 400097 pfam07555 NAGidase beta-N-acetylglucosaminidase. This family has previously been described as a hyaluronidase. However, more recently it has been shown that this family has beta-N-acetylglucosaminidase activity. 293
50493 400098 pfam07556 DUF1538 Protein of unknown function (DUF1538). This family contains several conserved glycines and phenylalanines. 209
50494 400099 pfam07557 Shugoshin_C Shugoshin C-terminus. Shugoshin-like proteins contain this conserved sequence at the C-terminus, which is rich in basic amino-acids. Shugoshin (Sgo1) protects Rec8 at centromeres during anaphase I (during meiosis) so that sister chromatids remain tethered. Sgo2 is a paralogue of Sgo1 and is involved in correctly orienting sister-centromeres. 25
50495 400100 pfam07558 Shugoshin_N Shugoshin N-terminal coiled-coil region. The Shugoshin protein is found to have this conserved N-terminal coiled-coil region and a highly conserved C-terminal basic region, family Shugoshin_C pfam07557. Shugoshin is a crucial target of Bub1 kinase function at kinetochores, necessary for both meiotic and mitotic localization of shugoshin to the kinetochore. Human shugoshin is diffusible and mediates kinetochore-driven formation of kinetochore-microtubules during bipolar spindle assembly. Further, the primary role of shugoshin is to ensure bipolar attachment of kinetochores, and its role in protecting cohesion has co-developed to facilitate this process. 45
50496 400101 pfam07559 FlaE Flagellar basal body protein FlaE. This family consists of several bacterial FlaE flagellar proteins. These proteins are part of the flagellar basal body rod complex. 85
50497 116179 pfam07560 DUF1539 Domain of Unknown Function (DUF1539). 126
50498 400102 pfam07561 DUF1540 Domain of Unknown Function (DUF1540). This family has four conserved cysteines, which is suggestive of a metal binding function. 40
50499 400103 pfam07562 NCD3G Nine Cysteines Domain of family 3 GPCR. This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins. 54
50500 400104 pfam07563 DUF1541 Protein of unknown function (DUF1541). This family consists of several hypothetical bacterial proteins and occurs as a tandem repeat. 52
50501 400105 pfam07564 DUF1542 Domain of Unknown Function (DUF1542). This domain is found in several cell surface proteins. Some are involved in antibiotic resistance and/or cellular adhesion. 77
50502 400106 pfam07565 Band_3_cyto Band 3 cytoplasmic domain. This family contains the cytoplasmic domain of the Band 3 anion exchange proteins that exchange Cl-/HCO3-. Band 3 constitutes the most abundant polypeptide in the red blood cell membrane, comprising 25% of the total membrane protein. The cytoplasmic domain of band 3 functions primarily as an anchoring site for other membrane-associated proteins. Included among the protein ligands of cdb3 are ankyrin, protein 4.2, protein 4.1, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphofructokinase, aldolase, hemoglobin, hemichromes, and the protein tyrosine kinase (p72syk). 258
50503 400107 pfam07566 DUF1543 Domain of Unknown Function (DUF1543). This domain is found as 1-2 copies in a small family of proteins of unknown function. 52
50504 400108 pfam07568 HisKA_2 Histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536. It is usually found adjacent to a C-terminal ATPase domain (pfam02518). This domain is found in a wide range of Bacteria and also several Archaea. 76
50505 400109 pfam07569 Hira TUP1-like enhancer of split. The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. These proteins also contain pfam00400. 206
50506 400110 pfam07571 TAF6_C TAF6 C-terminal HEAT repeat domain. TAF6_C is the C-terminal domain of the TAF6 subunit of the general transcription factor TFIID. The crystal structure reveals the presence of five conserved HEAT repeats. This region is necessary for the complexing together of the subunits TAF5, TAF6 and TAF9. 90
50507 400111 pfam07572 BCNT Bucentaur or craniofacial development. Bucentaur or craniofacial development protein 1 (BCNT) in ruminants has a different domain architecture to that in mouse and human. For this reason it has been used as a model for molecular evolution. Both bovine and human BCNTs are phosphorylated by casein kinase II in vitro. 71
50508 311502 pfam07573 AreA_N Nitrogen regulatory protein AreA N-terminus. The AreA nitrogen regulatory proteins (which are GATA type transcription factors) share a highly conserved N-terminus and pfam00320 at the C-terminus. 94
50509 400112 pfam07574 SMC_Nse1 Nse1 non-SMC component of SMC5-6 complex. S. cerevisiae Nse1 forms part of a complex with SMC5-SMC6. This non-structural maintenance of chromosomes (SMC) complex plays an essential role in genomic stability, being involved in DNA repair and DNA metabolism. It is conserved in eukaryotes from yeast to human. This domain lies immediately upstream of the DNA-binding zinc-finger domain, zf-RING-like pfam08746. 195
50510 400113 pfam07575 Nucleopor_Nup85 Nup85 Nucleoporin. A family of nucleoporins conserved from yeast to human. The nuclear pore complex is a large assembly composed of two essential complexes: the heptameric Nup84 complex and the heteromeric Nic96-containing complex. The Nup84 complex is composed of one copy each of Nup84, Nup85, Nup120, Nup133, Nup145C, Sec13, and Seh1. The structure of a complex of Nup85 and Seh1 was solved. The N-terminus of Nup85 is inserted and forms a three-stranded blade that completes the Seh1 6-bladed beta-propeller in trans. Following its N-terminal insertion blade, Nup85 forms a compact cuboid structure composed of 20 helices, with two distinct modules, referred to as crown and trunk. 562
50511 400114 pfam07576 BRAP2 BRCA1-associated protein 2. These proteins include BRCA1-associated protein 2 (BRAP2), which binds nuclear localization signals (NLSs) in vitro and in yeast two-hybrid screening. These proteins share a region of sequence similarity at their N-terminus. They also have pfam02148 at the C-terminus. 93
50512 284902 pfam07577 DUF1547 Domain of Unknown Function (DUF1547). This family appears to be found only in a small family of Chlamydia species. 60
50513 400115 pfam07578 LAB_N Lipid A Biosynthesis N-terminal domain. This family is found at the N-terminus of a group of Chlamydial Lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function. 68
50514 311507 pfam07579 DUF1548 Domain of Unknown Function (DUF1548). This family appears to be found only in a small family of Chlamydia proteins. 135
50515 400116 pfam07580 Peptidase_M26_C M26 IgA1-specific Metallo-endopeptidase C-terminal region. These peptidases, which cleave mammalian IgA, are found in Gram-positive bacteria. Often found associated with pfam00746, they may be attached to the cell wall. 734
50516 311509 pfam07581 Glug The GLUG motif. This family is found in the IgA1 (M26) peptidases, which are attached to the cell wall peptidoglycan by an amide bond. IgA1 protease selectively cleaves human IgA1 and is likely to be a pathogenicity factor in some pathogens. This family is also found in various other contexts, including with pfam05860. It is named GLUG after the mostly conserved G-L-any-G motif. 28
50517 377871 pfam07582 AP_endonuc_2_N AP endonuclease family 2 C-terminus. This highly-conserved sequence is found at the C-terminus of several apurinic/apyrimidinic (AP) endonucleases in a range of Gram-positive and Gram-negative bacteria. See also pfam01261. 55
50518 400117 pfam07583 PSCyt2 Protein of unknown function (DUF1549). A family of paralogues in the planctomyces. 208
50519 400118 pfam07584 BatA Aerotolerance regulator N-terminal. These proteins share a highly-conserved sequence at their N-terminus. They include several proteins from Rhodopirellula baltica and also several from proteobacteria. The proteins are produced by the Batl operon which appears to be important in pathogenicity and aerotolerance. This family is the conserved N-terminus, but the full length proteins carry multiple membrane-spanning domains. BatA ensures bacterial survival in the early stages of the infection process, when the infected sites are aerobic, and is produced under conditions of oxidative stress. 75
50520 377874 pfam07585 BBP7 Putative beta barrel porin-7 (BBP7). This is a family of putative beta barrel porin-7 BBP7 proteins identified initially in Rhodopirellula baltica. 350
50521 377875 pfam07586 HXXSHH Protein of unknown function (DUF1552). A family of proteins identified in Rhodopirellula baltica. 301
50522 400119 pfam07587 PSD1 Protein of unknown function (DUF1553). A family of proteins found in Rhodopirellula baltica. 213
50523 369437 pfam07588 DUF1554 Protein of unknown function (DUF1554). A family of proteins identified in Leptospira interrogans. 136
50524 377877 pfam07589 VPEP PEP-CTERM motif. This motif has been identified in a wide range of bacteria at their C-terminus. It has been suggested that this is a protein sorting signal. Based on phylogenetic profiling it has been suggested that the EpsH family of proteins mediate this function. 23
50525 284914 pfam07590 DUF1556 Protein of unknown function (DUF1556). 82
50526 400120 pfam07591 PT-HINT Pretoxin HINT domain. A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis. 136
50527 400121 pfam07592 DDE_Tnp_ISAZ013 Rhodopirellula transposase DDE domain. These transposases are found in the planctomycete Rhodopirellula baltica, the cyanobacterium Nostoc, and the Gram-positive bacterium Streptomyces. 308
50528 400122 pfam07593 UnbV_ASPIC ASPIC and UnbV. This conserved sequence is found associated with pfam00515 in several paralogous proteins in Rhodopirellula baltica. It is also found associated with pfam01839 in several eukaryotic integrin-like proteins (e.g. human ASPIC) and in several other bacterial proteins. 66
50529 284918 pfam07595 Planc_extracel Planctomycete extracellular. This motif is conserved as the N-terminus of several Rhodopirellula baltica proteins predicted to be extracellular. 24
50530 284919 pfam07596 SBP_bac_10 Protein of unknown function (DUF1559). A large family of paralogous proteins apparently unique to planctomycetes. 268
50531 284921 pfam07598 DUF1561 Protein of unknown function (DUF1561). A family of paralogous proteins in Leptospira interrogans. 625
50532 203693 pfam07599 DUF1563 Protein of unknown function (DUF1563). A small family of short hypothetical proteins in Leptospira interrogans. 43
50533 284922 pfam07600 DUF1564 Protein of unknown function (DUF1564). A family of paralogous proteins in Leptospira interrogans. Several have been annotated as possible CopG-like transcriptional regulators (see pfam01402). 167
50534 284923 pfam07602 DUF1565 Protein of unknown function (DUF1565). These proteins share a region of homology in their N termini, and are found in several phylogenetically diverse bacteria and in the archaeon Methanosarcina acetivorans. Some of these proteins also contain characterized domains such as pfam00395 and pfam03422. 256
50535 400123 pfam07603 DUF1566 Protein of unknown function (DUF1566). These proteins of unknown function are found in Leptospira interrogans and in several gamma proteobacteria. 118
50536 369441 pfam07606 DUF1569 Protein of unknown function (DUF1569). A family of hypothetical proteins identified in Rhodopirellula baltica. 152
50537 284926 pfam07607 DUF1570 Protein of unknown function (DUF1570). A family of hypothetical proteins in Rhodopirellula baltica. This family carries a highly conserved HExxH sequence motif characteristic of members of the Peptidase clan MA. 129
50538 400124 pfam07608 DUF1571 Protein of unknown function (DUF1571). A family of paralogous proteins in Rhodopirellula baltica. 208
50539 400125 pfam07609 DUF1572 Protein of unknown function (DUF1572). These proteins, from several diverse bacteria, share a short conserved sequence towards their N termini. 163
50540 400126 pfam07610 DUF1573 Protein of unknown function (DUF1573). These hypothetical proteins, from bacteria such as Rhodopirellula baltica, Bacteroides thetaiotaomicron, and Porphyromonas gingivalis, share a region of conserved sequence towards their N-termini. 98
50541 369442 pfam07611 DUF1574 Protein of unknown function (DUF1574). A family of hypothetical proteins in Leptospira interrogans. 342
50542 400127 pfam07613 DUF1576 Protein of unknown function (DUF1576). This small family is found in several undescribed proteins. The alignment is distinguished by the frequent occurrence of conserved glycine and aromatic residues. 176
50543 369443 pfam07614 DUF1577 Protein of unknown function (DUF1577). A family of hypothetical proteins in Leptospira interrogans. 256
50544 400128 pfam07615 Ykof YKOF-related Family. 81
50545 400129 pfam07617 DUF1579 Protein of unknown function (DUF1579). A family of paralogous hypothetical proteins identified in Rhodopirellula baltica that also has members in Gloeobacter violaceus, Sinorhizobium meliloti and Agrobacterium tumefaciens. 155
50546 284935 pfam07618 DUF1580 Protein of unknown function (DUF1580). A family of short hypothetical proteins found in Rhodopirellula baltica. 57
50547 284936 pfam07619 DUF1581 Protein of unknown function (DUF1581). Several Rhodopirellula baltica proteins share this probable domain. Most of these proteins are predicted to be secreted or membrane-associated. 84
50548 284937 pfam07621 DUF1582 Protein of unknown function (DUF1582). A family of hypothetical proteins in Rhodopirellula baltica. 29
50549 284938 pfam07622 DUF1583 Protein of unknown function (DUF1583). Most of these Rhodopirellula baltica hypothetical proteins also match pfam07619. 411
50550 377883 pfam07624 PSD2 Protein of unknown function (DUF1585). A conserved sequence region at the C-terminus of several cytochrome-like proteins in Rhodopirellula baltica. 74
50551 377884 pfam07626 PSD3 Protein of unknown function (DUF1587). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07624. 65
50552 377885 pfam07627 PSCyt3 Protein of unknown function (DUF1588). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07626 and pfam07624. 98
50553 284944 pfam07628 DUF1589 Protein of unknown function (DUF1589). A family of short hypothetical proteins in Rhodopirellula baltica. 164
50554 377886 pfam07631 PSD4 Protein of unknown function (DUF1592). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07627, pfam07626, and pfam07624. 128
50555 400130 pfam07632 DUF1593 Protein of unknown function (DUF1593). A family of proteins in Rhodopirellula baltica that are predicted to be secreted. Also, a member has been identified in Caulobacter crescentus. These proteins may be related to pfam01156. 261
50556 400131 pfam07634 RtxA RtxA repeat. This short repeat is found in the RtxA toxin family. 18
50557 369445 pfam07635 PSCyt1 Planctomycete cytochrome C. These proteins share a region of homology at their N-terminus that contains the C-{CPWHF}-{CPWR}-C-H-{CFYW} motif typical of cytochromes C, or CxxCH. 59
50558 148958 pfam07636 PSRT PSRT. This motif is found at the N-terminus of several short hypothetical proteins in Rhodopirellula baltica and the predicted Arylsulfatase B (EC:3.1.6.12). 32
50559 377888 pfam07637 PSD5 Protein of unknown function (DUF1595). A family of proteins in Rhodopirellula baltica, associated with pfam07635, pfam07626, pfam07631, pfam07627, and pfam07624. 62
50560 254323 pfam07638 Sigma70_ECF ECF sigma factor. These proteins are probably RNA polymerase sigma factors belonging to the extra-cytoplasmic function (ECF) subfamily and show sequence similarity to pfam04542 and pfam04545. 185
50561 377889 pfam07639 YTV YTV. These hypothetical proteins in Rhodopirellula baltica contain several repeats of a sequence whose core is the residues YTV. 40
50562 400132 pfam07642 BBP2 Putative beta-barrel porin-2, OmpL-like. BBP2 is a family of putative porin proteins that are likely to be outer membrane beta-barrel porins. 340
50563 377890 pfam07643 DUF1598 Protein of unknown function (DUF1598). A family of Rhodopirellula baltica hypothetical proteins of about 500 amino acids in length. 84
50564 311536 pfam07645 EGF_CA Calcium-binding EGF domain. 42
50565 400133 pfam07646 Kelch_2 Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila kel is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415. 47
50566 400134 pfam07647 SAM_2 SAM domain (Sterile alpha motif). 66
50567 400135 pfam07648 Kazal_2 Kazal-type serine protease inhibitor domain. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides. 50
50568 400136 pfam07650 KH_2 KH domain. 78
50569 400137 pfam07651 ANTH ANTH domain. AP180 is an endocytotic accessory protein that has been implicated in the formation of clathrin-coated pits. The domain is involved in phosphatidylinositol 4,5-bisphosphate binding and is a universal adaptor for nucleation of clathrin coats. 272
50570 400138 pfam07652 Flavi_DEAD Flavivirus DEAD domain. 146
50571 400139 pfam07653 SH3_2 Variant SH3 domain. SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organisation. First described in the Src cytoplasmic tyrosine kinase. The structure is a partly opened beta barrel. 52
50572 400140 pfam07654 C1-set Immunoglobulin C1-set domain. 85
50573 377891 pfam07655 Secretin_N_2 Secretin N-terminal domain. This is a short domain found in bacterial type II/III secretory system proteins. The architecture of these proteins suggest that this family may be functionally analogous to pfam03958. 91
50574 400141 pfam07657 MNNL N-terminus of Notch ligand. This entry represents a region of conserved sequence at the N-terminus of several Notch ligand proteins. 75
50575 400142 pfam07659 DUF1599 Domain of Unknown Function (DUF1599). 61
50576 377893 pfam07660 STN Secretin and TonB N-terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates. 51
50577 311548 pfam07661 MORN_2 MORN repeat variant. This family represents an apparent variant of the pfam02493 repeat (personal obs:C Yeats). 22
50578 400143 pfam07662 Nucleos_tra2_C Na+ dependent nucleoside transporter C-terminus. This family consists of nucleoside transport proteins. Rat Slc28a2 is a purine-specific Na+-nucleoside cotransporter localized to the bile canalicular membrane. Rat Slc28a1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. 205
50579 400144 pfam07663 EIIBC-GUT_C Sorbitol phosphotransferase enzyme II C-terminus. 92
50580 400145 pfam07664 FeoB_C Ferrous iron transport protein B C-terminus. Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N-terminus has been previously erroneously described as being ATP-binding. Recent work shows that it is similar to eukaryotic G-proteins and that it is a GTPase. 51
50581 254342 pfam07666 MpPF26 M penetrans paralogue family 26. These proteins include those ascribed to M penetrans paralogue family 26 in. 133
50582 284973 pfam07667 DUF1600 Protein of unknown function (DUF1600). These proteins appear to be specific to Mycoplasma species. 109
50583 284974 pfam07668 MpPF1 M penetrans paralogue family 1. This family of paralogous proteins identified in Mycoplasma penetrans includes homologs of p35. 313
50584 369456 pfam07669 Eco57I Eco57I restriction-modification methylase. Homologs of the Escherichia coli Eco57I restriction-modification methylase are found in several phylogenetically diverse bacteria. The structure of TaqI has been solved. 104
50585 400146 pfam07670 Gate Nucleoside recognition. This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'. 101
50586 400147 pfam07671 DUF1601 Protein of unknown function (DUF1601). This repeat is found in a small number of proteins and is apparently limited to Coxiella and related species. 37
50587 284977 pfam07672 MFS_Mycoplasma Mycoplasma MFS transporter. These proteins share some similarity with members of the Major Facilitator Superfamily (MFS). 267
50588 284979 pfam07675 Cleaved_Adhesin Cleaved Adhesin Domain. This is a family of bacterial protein modules thought to function in various roles including cell adhesion, cell lysis and carbohydrate binding. A tandem repeat of these modules (either two or three repeats) constitute the haemagglutinin/adhesin (HA) regions of the gingipains, RgpA, Kgp, and Lys-gingipain HG66 expressed by Porphyromonas gingivalis (Bacteroides gingivalis). They form components of the major extracellular virulence complex RgpA-Kgp - a mixture of proteinases and adhesin domains. The adhesin domains in this complex are found in proteinase-cleaved forms when isolated from the cell surface. Haemagglutinin genes of P. gingivalis (hagA1 HAGA1_PORGI - and hagA2 HAGA2_PORGI) suggest that such proteins are composed of eight to ten tandem repeats of these adhesin modules. Genomic data predicts that homologous protein modules are also expressed by a number of other bacteria and form part of putative multi-domain proteins. These domains may be acting in concert with other adhesion modules thought to be part of these multi-domain proteins such as fibronectin type III, pfam00041, and Meprin, A5, mu (MAM), pfam00629, domains. 166
50589 400148 pfam07676 PD40 WD40-like Beta Propeller Repeat. This family appears to be related to the pfam00400 repeat. This repeat corresponds to the RIVW repeat identified in cell surface proteins [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. 37
50590 400149 pfam07677 A2M_recep A-macroglobulin receptor. This family includes the receptor domain region of the alpha-2-macroglobulin family. 90
50591 400150 pfam07678 A2M_comp A-macroglobulin complement component. This family includes the complement components region of the alpha-2-macroglobulin family. 310
50592 400151 pfam07679 I-set Immunoglobulin I-set domain. 90
50593 400152 pfam07680 DoxA TQO small subunit DoxA. Thiosulphate:quinone oxidoreductase (TQO) is one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in BT_0515. 131
50594 400153 pfam07681 DoxX DoxX. These proteins appear to have some sequence similarity with pfam04173 but their function is unknown. 84
50595 400154 pfam07682 SOR Sulphur oxygenase reductase. The sulphur oxygenase/reductase (SOR) of the thermo-acidophilic archaeon Acidianus ambivalens is an unusual enzyme consisting of 24 identical subunits arranged in a perfectly symmetrical hollow sphere and containing a mononuclear non-heme iron centre (personal communication: A. Kletzin). At 85 degrees C in vitro, elemental sulphur is oxidized to sulphite, thiosulphate and hydrogen sulphide with no external cofactors needed. The proposed equation is: 4S + O2 + 4 H2O ---> 2 HSO3- + 2 H2S + 2 H+. 302
50596 377897 pfam07683 CobW_C Cobalamin synthesis protein cobW C-terminal domain. This is a large and diverse family of putative metal chaperones that can be separated into up to 15 subgroups. In addition to known roles in cobalamin biosynthesis and the activation of the Fe-type nitrile hydratase, this family is also known to be involved in the response to zinc limitation. The CobW subgroup involved in cobalamin synthesis represents only a small sub-fraction of the family. 94
50597 400155 pfam07684 NODP NOTCH protein. NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD and NODP domains remains to be elucidated. 58
50598 400156 pfam07685 GATase_3 CobB/CobQ-like glutamine amidotransferase domain. 189
50599 400157 pfam07686 V-set Immunoglobulin V-set domain. This domain is found in antibodies as well as neural protein P0 and CTL4 amongst others. 109
50600 400158 pfam07687 M20_dimer Peptidase dimerization domain. This domain consists of 4 beta strands and two alpha helices which make up the dimerization surface of members of the M20 family of peptidases. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases. 107
50601 400159 pfam07688 KaiA KaiA C-terminal domain. The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. The overall fold of the KaiA C-terminal domain is that of a four-helix bundle, which forms a dimer in the known structure. 122
50602 400160 pfam07689 KaiB KaiB domain. The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. Mutations in both proteins have been reported to alter or abolish circadian rhythmicity. KaiB adopts an alpha-beta meander motif and is found to be a dimer. 82
50603 369468 pfam07690 MFS_1 Major Facilitator Superfamily. 346
50604 400161 pfam07691 PA14 PA14 domain. This domain forms an insert in bacterial beta-glucosidases and is found in other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium prespore-cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding. 141
50605 254362 pfam07692 Fea1 Low iron-inducible periplasmic protein. In Chlamydomonas reinhardtii, the gene encoding Fe-assimilating protein 1 is induced by iron deficiency. In green algae, this protein is periplasmic. The two paralogues FEA1 and FEA2 are the major proteins secreted by iron-deficient Chlamydomonas reinhardtii, and both are up-regulated in response to iron deficiency. FEA1 but not FEA2 is up-regulated by high CO2 concentration. Both FEA1 and FEA2 are secreted into the periplasmic space and genetic evidence confirms that their association with the cell is required for growth in low iron. 359
50606 284995 pfam07693 KAP_NTPase KAP family P-loop domain. The KAP (after Kidins220/ARMS and PifA) family of predicted NTPases are sporadically distributed across a wide phylogenetic range in bacteria and in animals. Many of the prokaryotic KAP NTPases are encoded in plasmids and tend to undergo disruption to form pseudogenes. A unique feature of all eukaryotic and certain bacterial KAP NTPases is the presence of two or four transmembrane helices inserted into the P-loop NTPase domain. These transmembrane helices anchor KAP NTPases in the membrane such that the P-loop domain is located on the intracellular side. 293
50607 400162 pfam07694 5TM-5TMR_LYT 5TMR of 5TMR-LYT. This entry represents the transmembrane region of the 5TM-LYT (5TM Receptors of the LytS-YhcK type). 171
50608 377898 pfam07695 7TMR-DISM_7TM 7TM diverse intracellular signalling. This entry represents the transmembrane region of the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules). 207
50609 400163 pfam07696 7TMR-DISMED2 7TMR-DISM extracellular 2. This entry represents one of two distinct types of extracellular domain found in the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules) bacterial transmembrane proteins. It is possible that this domain adopts a jelly roll fold and acts as a receptor for carbohydrates and their derivatives. 127
50610 400164 pfam07697 7TMR-HDED 7TM-HD extracellular. This entry represents the extracellular domain of the 7TM-HD (7TM Receptors with HD hydrolase). 219
50611 400165 pfam07698 7TM-7TMR_HD 7TM receptor with intracellular HD hydrolase. These bacterial 7TM receptor proteins have an intracellular pfam01966. This entry corresponds to the 7 helix transmembrane domain. These proteins also contain an N-terminal extracellular domain. 190
50612 400166 pfam07699 Ephrin_rec_like Putative ephrin-receptor like. This family has repeats of a region rich in cysteines. 48
50613 400167 pfam07700 HNOB Haem-NO-binding. The HNOB (Haem NO Binding) domain is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. It is a haem protein sensor (SONO) that displays femtomolar affinity for nitric oxide, NO. It is predicted to function as a haem-dependent sensor for gaseous ligands and to transduce diverse downstream signals in both bacteria and animals. 163
50614 400168 pfam07701 HNOBA Heme NO binding associated. The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals. 215
50615 400169 pfam07702 UTRA UTRA domain. The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain that has a similar fold to pfam04345. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules. 141
50616 400170 pfam07703 A2M_N_2 Alpha-2-macroglobulin family N-terminal region. This family includes a region of the alpha-2-macroglobulin family. 141
50617 400171 pfam07704 PSK_trans_fac Rv0623-like transcription factor. This entry represents the Rv0623-like family of transcription factors associated with the PSK operon. 82
50618 400172 pfam07705 CARDB CARDB. Cell adhesion related domain found in bacteria. 101
50619 400173 pfam07706 TAT_ubiq Aminotransferase ubiquitination site. This segment contains a probable site of ubiquitination that ensures rapid degradation of tyrosine aminotransferase in rats. The half life of the enzyme in vivo is about 2-4 hours. In addition, unpublished information identifies at least 2 phosphorylation sites including CAPK at Ser29 and, at the other end of the protein, a casein kinase II site at S*QEECDK. This region of TAT is probably primarily related to regulatory events. Most other transaminases are much more stable and are not phosphorylated. 40
50620 400174 pfam07707 BACK BTB And C-terminal Kelch. This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation). 100
50621 285010 pfam07708 Tash_PEST Tash protein PEST motif. This motif is found in the Tash AT-hook proteins of Theileria annulata. These proteins are transported to the host's nucleus and are likely to be involved in pathogenesis. It is also often found in conjunction with pfam04385. It is suggested that they may be 'part of PEST motifs' (a signal for rapid proteolytic degradation), though this is not definite. This motif is also found in other T. annulata proteins, which have no other known domains (unpublished data: C Yeats). 18
50622 116323 pfam07709 SRR Seven Residue Repeat. Associated with pfam02969 in This repeat is found in some Plasmodium and Theileria proteins. 14
50623 400175 pfam07710 P53_tetramer P53 tetramerisation motif. 37
50624 400176 pfam07711 RabGGT_insert Rab geranylgeranyl transferase alpha-subunit, insert domain. Rab geranylgeranyl transferase (RabGGT) catalyzes the addition of two geranylgeranyl groups to the C-terminal cysteine residues of Rab proteins, which is crucial for membrane association and function of these proteins in intracellular vesicular trafficking. This domain is inserted between pfam01239 repeats. This domain adopts an Ig-like fold and is thought to be involved in protein-protein interactions and might be involved in the recognition and binding of REP. 101
50625 400177 pfam07712 SURNod19 Stress up-regulated Nod 19. 377
50626 400178 pfam07713 DUF1604 Protein of unknown function (DUF1604). This family is found at the N-terminus of several eukaryotic RNA processing proteins. 84
50627 400179 pfam07714 Pkinase_Tyr Protein tyrosine kinase. 258
50628 400180 pfam07715 Plug TonB-dependent Receptor Plug Domain. The Plug domain has been shown to be an independently folding subunit of the TonB-dependent receptors. It acts as the channel gate, blocking the pore until the channel is bound by ligand. At this point it undergoes conformational changes that open the channel. 107
50629 400181 pfam07716 bZIP_2 Basic region leucine zipper. 51
50630 400182 pfam07717 OB_NTP_bind Oligonucleotide/oligosaccharide-binding (OB)-fold. This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure 3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins. 82
50631 400183 pfam07718 Coatamer_beta_C Coatomer beta C-terminal region. This family is found at the C-terminus of the coatomer beta subunit proteins (Beta-coat proteins). This C-terminal domain probably adapts the function of the N-terminal pfam01602 domain. 138
50632 400184 pfam07719 TPR_2 Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by pfam00515. 33
50633 400185 pfam07720 TPR_3 Tetratricopeptide repeat. This Pfam entry includes tetratricopeptide-like repeats found in the LcrH/SycD-like chaperones. 34
50634 311590 pfam07721 TPR_4 Tetratricopeptide repeat. This Pfam entry includes tetratricopeptide-like repeats not detected by the pfam00515, pfam07719 and pfam07720 models. 26
50635 400186 pfam07722 Peptidase_C26 Peptidase C26. These peptidases have gamma-glutamyl hydrolase activity; that is they catalyze the cleavage of the gamma-glutamyl bond in poly-gamma-glutamyl substrates. They are structurally related to pfam00117, but contain extensions in four loops and at the C-terminus. 217
50636 336782 pfam07723 LRR_2 Leucine Rich Repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. 26
50637 400187 pfam07724 AAA_2 AAA domain (Cdc48 subfamily). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model. 168
50638 400188 pfam07725 LRR_3 Leucine Rich Repeat. This Pfam entry includes some LRRs that fail to be detected by the pfam00560 model. 20
50639 400189 pfam07726 AAA_3 ATPase family associated with various cellular activities (AAA). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model. 131
50640 400190 pfam07727 RVT_2 Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model. 243
50641 400191 pfam07728 AAA_5 AAA domain (dynein-related subfamily). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model. 135
50642 400192 pfam07729 FCD FCD domain. This domain is the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain pfam00392. This domain is found in Escherichia coli NanR and DgoR that are regulators of sugar biosynthesis operons. It is also in the known structure of FadR, where it binds to acyl-CoA; the domain is alpha helical. This family has been named FCD (FadR C-terminal Domain). 121
50643 400193 pfam07730 HisKA_3 Histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536. 68
50644 400194 pfam07731 Cu-oxidase_2 Multicopper oxidase. This entry contains many divergent copper oxidase-like domains that are not recognized by the pfam00394 model. 138
50645 400195 pfam07732 Cu-oxidase_3 Multicopper oxidase. This entry contains many divergent copper oxidase-like domains that are not recognized by the pfam00394 model. 119
50646 400196 pfam07733 DNA_pol3_alpha Bacterial DNA polymerase III alpha subunit. 259
50647 254394 pfam07734 FBA_1 F-box associated. Most of these proteins contain pfam00646 at the N-terminus, suggesting that they are effectors linked with ubiquitination. 159
50648 400197 pfam07735 FBA_2 F-box associated. Most of these proteins contain pfam00646 at the N-terminus, suggesting that they are effectors linked with ubiquitination. 66
50649 400198 pfam07736 CM_1 Chorismate mutase type I. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine. 115
50650 285037 pfam07737 ATLF Anthrax toxin lethal factor, N- and C-terminal domain. The C-terminal domain is the catalytically active domain whereas the N-terminal domain is likely to be inactive. 218
50651 400199 pfam07738 Sad1_UNC Sad1 / UNC-like C-terminal. The C. elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localizes at the spindle pole body. UNC-84 and Sad1 share a common C-terminal region that is often termed the SUN (Sad1 and UNC) domain. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2. The SUN domain of Sun2 has been demonstrated to be in the periplasm. 130
50652 400200 pfam07739 TipAS TipAS antibiotic-recognition domain. This domain is found at the C-terminus of some MerR family transcription factors. The domain has an alpha-helical globin-like fold. The family includes Mta, a central regulator of multidrug resistance in Bacillus subtilis. 117
50653 400201 pfam07740 Toxin_12 Ion channel inhibitory toxin. This is a family of potent toxins that function as ion-channel inhibitors for several different ions. Omega-Grammotoxin SIA is a VSCC antagonist that inhibits neuronal N- and P-type VSCC responses. Huwentoxin-IV, from the Chinese bird spider, is a highly potent neurotoxin that specifically inhibits the neuronal tetrodotoxin-sensitive voltage-gated sodium channel in rat dorsal root ganglion neurons. Hainantoxin-4, from the venom of spider Selenocosmia hainana, adopts an inhibitor cystine knot structural motif like huwentoxin-IV, and is a potent antagonist that acts at site 1 on tetrodotoxin-sensitive (TTX-S) sodium channels. Study of the molecular nature of toxin-receptor interactions has helped elucidate the functioning of many ion-channels. 30
50654 400202 pfam07741 BRF1 Brf1-like TBP-binding domain. This region covers both the Brf homology II and III regions. This region is involved in binding TATA binding protein. 98
50655 400203 pfam07742 BTG BTG family. 115
50656 400204 pfam07743 HSCB_C HSCB C-terminal oligomerization domain. This domain is the HSCB C-terminal oligomerization domain and is found on co-chaperone proteins. 75
50657 400205 pfam07744 SPOC SPOC domain. The SPOC (Spen paralogue and orthologue C-terminal) domain is involved in developmental signalling. 142
50658 311610 pfam07745 Glyco_hydro_53 Glycosyl hydrolase family 53. This domain belongs to family 53 of the glycosyl hydrolase classification. These enzymes are endo-1,4-beta-galactanases (EC:3.2.1.89). The structure of this domain is known and has a TIM barrel fold. 333
50659 400206 pfam07746 LigA Aromatic-ring-opening dioxygenase LigAB, LigA subunit. This is a family of aromatic ring opening dioxygenases which catalyze the ring-opening reaction of protocatechuate and related compounds. 87
50660 400207 pfam07747 MTH865 MTH865-like family. This domain has an EF-hand like fold. 70
50661 400208 pfam07748 Glyco_hydro_38C Glycosyl hydrolases family 38 C-terminal domain. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. 204
50662 400209 pfam07749 ERp29 Endoplasmic reticulum protein ERp29, C-terminal domain. ERp29 is a ubiquitously expressed endoplasmic reticulum protein found in mammals. ERp29 is comprised of two domains. This domain, the C-terminal domain, has an all helical fold. ERp29 is thought to form part of the thyroglobulin folding complex. 94
50663 400210 pfam07750 GcrA GcrA cell cycle regulator. GcrA is a master cell cycle regulator that, together with CtrA (see pfam00072 and pfam00486), is involved in controlling cell cycle progression and asymmetric polar morphogenesis. During this process, there are temporal and spatial variations in the concentrations of GcrA and CtrA. The variation in concentration produces time and space dependent transcriptional regulation of modular functions that implement cell-cycle processes. More specifically, GcrA acts as an activator of components of the replisome and the segregation machinery. 155
50664 400211 pfam07751 Abi_2 Abi-like protein. This family, found in various bacterial species, contains sequences that are similar to the Abi group of proteins, which are involved in bacteriophage resistance mediated by abortive infection in Lactococcus species. The proteins are thought to have helix-turn-helix motifs, found in many DNA-binding proteins, allowing them to perform their function. 184
50665 400212 pfam07752 S-layer S-layer protein. Archaeal S-layer proteins consist of two copies of this domain. 258
50666 285051 pfam07753 DUF1609 Protein of unknown function (DUF1609). This region is found in a number of hypothetical proteins thought to be expressed by the eukaryote Encephalitozoon cuniculi, an obligate intracellular microsporidial parasite. It is approximately 200 residues long. 227
50667 400213 pfam07754 DUF1610 Domain of unknown function (DUF1610). This zinc ribbon domain is found in archaeal species. It is likely to bind zinc via its four well-conserved cysteine residues. 24
50668 400214 pfam07755 DUF1611 Domain of unknown function (DUF1611_C) P-loop domain. This region is found in a number of hypothetical bacterial and archaeal proteins. According to structure it has a P-loop structure. 198
50669 400215 pfam07756 DUF1612 Protein of unknown function (DUF1612). This family includes sequences of largely unknown function but which share a number of features in common. They are expressed by bacterial species, and in many cases these bacteria are known to associate symbiotically with plants. Moreover, the majority are coded for by plasmids, which in many cases are known to confer on the organism the ability to interact symbiotically with leguminous plants. An example of such a plasmid is NGR234, which encodes Y4CF, a protein of unknown function that is a member of this family. Other members of this family are expressed by organisms with a documented genomic similarity to plant symbionts. 127
50670 254409 pfam07757 AdoMet_MTase Predicted AdoMet-dependent methyltransferase. Proteins in this family have been predicted to function as AdoMet-dependent methyltransferases. 112
50671 400216 pfam07758 DUF1614 Protein of unknown function (DUF1614). This is a family of sequences coming from hypothetical proteins found in both bacterial and archaeal species. 170
50672 400217 pfam07759 DUF1615 Protein of unknown function (DUF1615). This is a family of proteins of unknown function expressed by various bacterial species. Some members of this family are thought to be lipoproteins. Another member of this family is thought to be involved in photosynthesis. 320
50673 400218 pfam07760 DUF1616 Protein of unknown function (DUF1616). This is a family of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long. 301
50674 285058 pfam07761 DUF1617 Protein of unknown function (DUF1617). This is a family of sequences from hypothetical bacterial and bacteriophage proteins. The region in question is approximately 150 residues long and is highly conserved throughout the family. 143
50675 400219 pfam07762 DUF1618 Protein of unknown function (DUF1618). The members of this family are mainly hypothetical proteins expressed by Oryza sativa. 130
50676 400220 pfam07763 FEZ FEZ-like protein. This is a family of eukaryotic proteins thought to be involved in axonal outgrowth and fasciculation. The N-terminal regions of these sequences are less conserved than the C-terminal regions, and are highly acidic. The C. elegans homolog, UNC-76, may play structural and signalling roles in the control of axonal extension and adhesion (particularly in the presence of adjacent neuronal cells) and these roles have also been postulated for other FEZ family proteins. Certain homologs have been definitively found to interact with the N-terminal variable region (V1) of PKC-zeta, and this interaction causes cytoplasmic translocation of the FEZ family protein in mammalian neuronal cells. The C-terminal region probably participates in the association with the regulatory domain of PKC-zeta. The members of this family are predicted to form coiled-coil structures, which may interact with members of the RhoA family of signalling proteins, but are not thought to contain other characteristic protein motifs. Certain members of this family are expressed almost exclusively in the brain, whereas others (such as FEZ2) are expressed in other tissues, and are thought to perform similar but unknown functions in these tissues. 240
50677 311623 pfam07764 Omega_Repress Omega Transcriptional Repressor. The omega transcriptional repressor regulates expression of genes involved in copy number control and stable maintenance of plasmids. The omega protein belongs to the structural superfamily of MetJ/Arc repressors featuring a ribbon-helix-helix DNA-binding motif with the beta-ribbon located in and recognising the major groove of operator DNA. 71
50678 400221 pfam07765 KIP1 KIP1-like protein. This is a family of sequences found exclusively in plants. They are similar to kinase interacting protein 1 (KIP1), which has been found to interact with the kinase domain of PRK1, a receptor-like kinase. This particular region contains two coiled-coils, which are described as motifs involved in protein-protein interactions. It has also been suggested that the protein's coiled-coils allow it to dimerize in vivo. 74
50679 400222 pfam07766 LETM1 LETM1-like protein. Members of this family are inner mitochondrial membrane proteins which play a role in potassium and hydrogen ion exchange. Deletion of LETM1 is thought to be involved in the development of Wolf-Hirschhorn syndrome in humans. 264
50680 400223 pfam07767 Nop53 Nop53 (60S ribosomal biogenesis). This nucleolar family of proteins are involved in 60S ribosomal biogenesis. They are specifically involved in the processing beyond the 27S stage of 25S rRNA maturation. This family contains sequences that bear similarity to the glioma tumor suppressor candidate region gene 2 protein (p60). This protein has been found to interact with herpes simplex type 1 regulatory proteins. 350
50681 285065 pfam07768 PVL_ORF50 PVL ORF-50-like family. This is a family of sequences found in both bacteria and bacteriophages. This region is approximately 130 residues long and in some cases is found as part of the PVL (Panton-Valentine leukocidin) group of genes, which encode a member of the leukocidin group of bacterial toxins that kill leukocytes by creation of pores in the cell membrane. PVL appears to be a virulence factor associated with a number of human diseases. 116
50682 400224 pfam07769 PsiF_repeat psiF repeat. This region is approximately 35 residues long. It is found repeated in a number of putative phosphate starvation-inducible proteins expressed by various bacterial species. psiF is known to be an example of such phosphate starvation-inducible proteins. 34
50683 369508 pfam07771 TSGP1 Tick salivary peptide group 1. This contains a group of peptides derived from a salivary gland cDNA library of the tick Ixodes scapularis. Also present are peptides from a related tick species, Ixodes ricinus. They are characterized by a putative signal peptide indicative of secretion and conserved cysteine residues. 120
50684 400225 pfam07773 DUF1619 Protein of unknown function (DUF1619). This is a family of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich amino-terminus. 315
50685 400226 pfam07774 DUF1620 Protein of unknown function (DUF1620). These sequences are mainly derived from predicted eukaryotic proteins. The region in question lies towards the C-terminus of these large proteins and is approximately 300 amino acid residues long. 214
50686 285070 pfam07775 PaRep2b PaRep2b protein. This is a family of proteins, expressed in the crenarchaeon Pyrobaculum aerophilum, whose members are variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments are thought to indicate that most family members are no longer functional. 512
50687 400227 pfam07776 zf-AD Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA. 74
50688 400228 pfam07777 MFMR G-box binding protein MFMR. This region is found to the N-terminus of the pfam00170 transcription factor domain. It is between 150 and 200 amino acids in length. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain), whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C, classified according to motif composition. It has been suggested that some of these motifs may be involved in mediating protein-protein interactions. The MFMR region contains a nuclear localization signal in bZIP opaque and GBF-2. The MFMR also contains a transregulatory activity in TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention signals. 96
50689 369513 pfam07778 CENP-I Mis6. Mis6 is an essential centromere connector protein acting during G1-S phase of the cell cycle. Mis6 is thought to be required for recruiting CENP-A, the centromere-specific histone H3 variant, an important event for centromere function and chromosome segregation during mitosis. 511
50690 400229 pfam07779 Cas1_AcylT 10 TM Acyl Transferase domain found in Cas1p. Cas1p protein of Cryptococcus neoformans is required for the synthesis of O-acetylated glucuronoxylomannans, a constituent of the capsule, and is critical for its virulence. The multi TM domain of the Cas1p was unified with the 10 TM Sugar Acyltransferase superfamily. This superfamily is comprised of members from the OatA, MdoC, OpgC, NolL and GumG families in addition to the Cas1p family. The Cas1p protein has an N-terminal PC-Esterase domain with the opposing Acyl esterase activity. 474
50691 400230 pfam07780 Spb1_C Spb1 C-terminal domain. This presumed domain is found at the C-terminus of a family of FtsJ-like methyltransferases. Members of this family are involved in 60S ribosomal biogenesis. 209
50692 369516 pfam07781 Reovirus_Mu2 Reovirus minor core protein Mu-2. This family represents the Reovirus core protein Mu-2. Mu-2 is a microtubule associated protein and is thought to play a key role in the formation and structural organisation of reovirus inclusion bodies. 727
50693 400231 pfam07782 DC_STAMP DC-STAMP-like protein. This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family. 191
50694 400232 pfam07784 DUF1622 Protein of unknown function (DUF1622). This is a family of 14 highly conserved sequences, from hypothetical proteins expressed by both bacterial and archaeal species. 78
50695 400233 pfam07785 DUF1623 Protein of unknown function (DUF1623). The members of this family are all derived from relatively short hypothetical proteins thought to be expressed by various Nucleopolyhedroviruses. 90
50696 377915 pfam07786 DUF1624 Protein of unknown function (DUF1624). These sequences are found in hypothetical proteins of unknown function expressed by bacterial and archaeal species. The region in question is approximately 230 residues long. 222
50697 400234 pfam07787 TMEM43 Transmembrane protein 43. This entry represents the transmembrane protein 43 family of proteins, which may function as tetraspanin-like membrane organizers. 248
50698 369519 pfam07788 PDDEXK_10 PD-(D/E)XK nuclease superfamily. This family is found to carry modified motifs characteristic of the PD-(D/E)XK endonuclease superfamily. These are the conserved Glu of motif I, the Asp surrounded by hydrophobics of motif II, EIKS of motif III, and the lysine of motif IV, which has migrated to an alpha-helix following the third core beta-strand. The conserved patch of positively charged lysine and arginine residues in the motif IV alpha-helix might be involved in substrate-binding or be contributing to active site formation. Members with an additional N-terminal coiled-coil domain are annotated as tropomyosin, coiled-coil or microtubule binding proteins. 74
50699 369520 pfam07789 DUF1627 Protein of unknown function (DUF1627). This is a group of sequences found in hypothetical proteins predicted to be expressed in a number of bacterial species. The region in question is approximately 150 amino acid residues long. 155
50700 400235 pfam07790 Pilin_N Archaeal Type IV pilin, N-terminal. This entry represents the N-terminal domain of archaeal pilins, which play important roles in surface adhesion and twitching motility. This domain contains a conserved N-terminal hydrophobic motif. 78
50701 369521 pfam07791 DUF1629 Protein of unknown function (DUF1629). This family consists of sequences from hypothetical proteins thought to be expressed by two members of the Xanthomonas genus. The region in question is 125 amino acid residues long. 123
50702 400236 pfam07792 Afi1 Docking domain of Afi1 for Arf3 in vesicle trafficking. This domain occurs at the N-terminus of Afi1, an Arf3p-interacting protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3, one of the highly conserved small GTPases (ADP-ribosylation factors). Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homolog of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalization. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localization and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth. 119
50703 400237 pfam07793 DUF1631 Protein of unknown function (DUF1631). The members of this family are sequences derived from a group of hypothetical proteins expressed by certain bacterial species. The region concerned is approximately 440 amino acid residues in length. 741
50704 116408 pfam07794 DUF1633 Protein of unknown function (DUF1633). This family contains sequences derived from a group of hypothetical proteins expressed by Arabidopsis thaliana. These sequences are highly similar and the region concerned is about 100 residues long. 698
50705 400238 pfam07795 DUF1635 Protein of unknown function (DUF1635). The members of this family include sequences that are parts of hypothetical proteins expressed by plant species. The region in question is about 170 amino acids long. 223
50706 400239 pfam07796 DUF1638 Protein of unknown function (DUF1638). This family contains sequences covering an approximately 270 amino acid stretch of a group of hypothetical proteins. These proteins are expressed by archaeal species of the Methanosarcina genus. 161
50707 400240 pfam07797 DUF1639 Protein of unknown function (DUF1639). This approximately 50 residue region is found in a number of sequences derived from hypothetical plant proteins. This region features a highly basic 5 amino-acid stretch towards its centre. 50
50708 400241 pfam07798 DUF1640 Protein of unknown function (DUF1640). This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. 174
50709 400242 pfam07799 DUF1643 Protein of unknown function (DUF1643). The members of this family are all sequences found within hypothetical proteins expressed by various bacterial species. The region concerned is approximately 150 residues long. 133
50710 400243 pfam07800 DUF1644 Protein of unknown function (DUF1644). This family consists of sequences found in a number of hypothetical plant proteins of unknown function. The region of interest contains nine highly conserved cysteine residues and is approximately 160 amino acids in length, and is probably a zinc-binding domain. 164
50711 369527 pfam07801 DUF1647 Protein of unknown function (DUF1647). The sequences making up this family are all derived from hypothetical proteins expressed by C. elegans. The region in question is approximately 160 amino acids long. The GO annotation for this protein indicates the protein to be involved in nematode larval development and to have a positive regulation on growth rate. 141
50712 400244 pfam07802 GCK GCK domain. This domain is found in proteins carrying other domains known to be involved in intracellular signalling pathways (such as pfam00071) indicating that it might also be involved in these pathways. It has 4 highly conserved cysteine residues, suggesting that it can bind zinc ions. Moreover, it is found repeated in some members of this family; this may indicate that these domains are able to interact with one another, raising the possibility that this domain mediates heterodimerization. 74
50713 400245 pfam07803 GSG-1 GSG1-like protein. This family contains sequences bearing similarity to a region of GSG1, a protein specifically expressed in testicular germ cells. It is possible that overexpression of the human homolog may be involved in tumorigenesis of human testicular germ cell tumors. The region in question has four highly-conserved cysteine residues. 109
50714 400246 pfam07804 HipA_C HipA-like C-terminal domain. The members of this family are similar to a region close to the C-terminus of the HipA protein expressed by various bacterial species. This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis. When expressed alone, it is toxic to bacterial cells, but it is usually tightly associated with HipB, and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products. 221
50715 116420 pfam07806 Nod_GRP Nodule-specific GRP repeat. The region featured in this family is found repeated in a number of plant proteins, some of which are expressed specifically in nodules formed during symbiotic interactions with certain bacterial species. Some of these proteins are also termed glycine-rich proteins (GRPs), due to the presence of a glycine-rich C-terminal region in their structures. Bacterial infection is required for the induction of nodule-specific GRP genes, and it is thought that nodule-specific GRPs may play non-redundant roles required at specific stages of nodule development. Members of this group of proteins may be cytosolic, whereas others are thought to be membrane-associated. 38
50716 400247 pfam07807 RED_C RED-like protein C-terminal region. This family contains sequences that are similar to the C-terminal region of Red protein. This and related proteins are thought to be localized to the nucleus, and contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently. 107
50717 400248 pfam07808 RED_N RED-like protein N-terminal region. This family contains sequences that are similar to the N-terminal region of Red protein. This and related proteins contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences and a putative trimeric coiled-coil region, suggesting that these proteins are expressed in the nucleus. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently. 226
50718 400249 pfam07809 RTP801_C RTP801 C-terminal region. The members of this family are sequences similar to the C-terminal region of RTP801, the protein product of a hypoxia-inducible factor 1 (HIF-1)-responsive gene. Two members of this family expressed by Drosophila melanogaster, Scylla and Charybde, are designated by the GenBank as Hox targets. RTP801 is thought to be involved in various cellular processes. Its overexpression caused the apoptosis-resistant phenotype in cycling cells, and apoptosis sensitivity in growth arrested cells. Moreover, the protein product of the mouse homolog of RTP801 (dig2) is thought to be induced by diverse apoptotic signals, and also by dexamethasone treatment. 113
50719 400250 pfam07810 TMC TMC domain. These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters. 111
50720 400251 pfam07811 TadE TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues. 43
50721 400252 pfam07812 TfuA TfuA-like protein. This family consists of a group of sequences that are similar to a region of TfuA protein. This protein is involved in the production of trifolitoxin (TFX), a gene-encoded, post-translationally modified peptide antibiotic. The role of TfuA in TFX synthesis is unknown, and it may be involved in other cellular processes. 120
50722 400253 pfam07813 LTXXQ LTXXQ motif family protein. This protein family includes two copies of a five-residue motif that is found in a number of bacterial proteins bearing similarity to the protein CpxP. This is a periplasmic protein that aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH. Another member of this family, Spy, is also a periplasmic protein that may be involved in the response to stress. The homology between CpxP and Spy may indicate that these two proteins are functionally related. 97
50723 400254 pfam07814 WAPL Wings apart-like protein regulation of heterochromatin. This family contains sequences expressed in eukaryotic organisms bearing high similarity to the WAPL conserved region of D. melanogaster wings apart-like protein. This protein is involved in the regulation of heterochromatin structure. hWAPL, the human homolog, is found to play a role in the development of cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death. 344
50724 400255 pfam07815 Abi_HHR Abl-interactor HHR. The region featured in this family is found towards the N-terminus of a number of adaptor proteins that interact with Abl-family tyrosine kinases. More specifically, it is termed the homeo-domain homologous region (HHR), as it is similar to the DNA-binding region of homeo-domain proteins. Other homeo-domain proteins have been implicated in specifying positional information during embryonic development, and in the regulation of the expression of cell-type specific genes. The Abl-interactor proteins are thought to coordinate the cytoplasmic and nuclear functions of the Abl-family kinases, and seem to be involved in cytoskeletal reorganisation, but their precise role remains unclear. 64
50725 400256 pfam07816 DUF1645 Protein of unknown function (DUF1645). These sequences are derived from a number of hypothetical plant proteins. The region in question is approximately 270 amino acids long. Some members of this family are annotated as yeast pheromone receptor proteins AR781 but no literature was found to support this. 194
50726 400257 pfam07817 GLE1 GLE1-like protein. The members of this family are sequences that are similar to the human protein GLE1. This protein is localized at the nuclear pore complexes and functions in poly(A)+ RNA export to the cytoplasm. 231
50727 400258 pfam07818 HCNGP HCNGP-like protein. This family comprises sequences bearing significant similarity to the mouse transcriptional regulator protein HCNGP. This protein is localized to the nucleus and is thought to be involved in the regulation of beta-2-microglobulin genes. 91
50728 369540 pfam07819 PGAP1 PGAP1-like protein. The sequences found in this family are similar to PGAP1. This is an endoplasmic reticulum membrane protein with a catalytic serine containing motif that is conserved in a number of lipases. PGAP1 functions as a GPI inositol-deacylase; this deacylation is important for the efficient transport of GPI-anchored proteins from the endoplasmic reticulum to the Golgi body. 233
50729 369541 pfam07820 TraC TraC-like protein. The members of this family are sequences that are similar to TraC. The gene encoding this protein is one of a group of genes found on plasmid p42a of Rhizobium etli CFN42 that are thought to be involved in the process of plasmid self-transmission. Mobilisation of plasmid p42a is of importance as it is required for transfer of plasmid p42a, which is also known as plasmid pSym as it carries most of the genes required for nodulation and nitrogen fixation by the symbiotic bacterium. The predicted protein products of p42a are similar to known transfer proteins of Agrobacterium tumefaciens plasmid pTiC58. 88
50730 400259 pfam07821 Alpha-amyl_C2 Alpha-amylase C-terminal beta-sheet domain. This domain is organized as a five-stranded anti-parallel beta-sheet. It is the probable result of a decay of the common-fold. 59
50731 400260 pfam07822 Toxin_13 Neurotoxin B-IV-like protein. The members of this family resemble neurotoxin B-IV, which is a crustacean-selective neurotoxin produced by the marine worm Cerebratulus lacteus. This highly cationic peptide is approximately 55 residues and is arranged to form two antiparallel helices connected by a well-defined loop in a hairpin structure. The branches of the hairpin are linked by four disulphide bonds. Three residues identified as being important for activity, namely Arg-17, -25 and -34, are found on the same face of the molecule, while another residue important for activity, Trp30, is on the opposite side. The protein's mode of action is not entirely understood, but it may act on voltage-gated sodium channels, possibly by binding to an as yet uncharacterized site on these proteins. Its site of interaction may also be less specific, for example it may interact with negatively charged membrane lipids. 55
50732 311669 pfam07823 CPDase Cyclic phosphodiesterase-like protein. Cyclic phosphodiesterase (CPDase) is involved in the tRNA splicing pathway. This protein exhibits a bilobal arrangement of two alpha-beta modules. Two antiparallel helices are found on the outer side of each lobe and frame an antiparallel beta-sheet that is wrapped around an accessible cleft. Moreover, the beta-strands of each lobe interact with the other lobe. The central water-filled cavity houses the enzyme's active site. 199
50733 285114 pfam07824 Chaperone_III Type III secretion chaperone domain. Type III secretion chaperones are involved in delivering virulence effector proteins from bacterial pathogens directly into eukaryotic cells. The chaperones may prevent aggregation and degradation of their substrates, may target the effector to the secretion apparatus, and may ensure a secretion-competent unfolded conformation of their specific substrate. One member of this family, SigE, forms homodimers in crystal. The monomers have a novel fold with an alpha-beta(3)-alpha-beta(2)-alpha topology. 110
50734 116439 pfam07825 Exc Excisionase-like protein. The phage-encoded excisionase protein (Xis) is involved in excisive recombination by regulating the assembly of the excisive intasome and by inhibiting viral integration. It adopts an unusual 'winged'-helix structure in which two alpha helices are packed against two extended strands. Also present in the structure is a two-stranded anti-parallel beta-sheet, whose strands are connected by a four-residue 'wing'. During interaction with DNA, helix alpha2 is thought to insert into the major groove, while the wing contacts the adjacent minor groove or phosphodiester backbone. The C-terminal region of Xis is involved in interaction with phage-encoded integrase (Int), and a putative C-terminal alpha helix may fold upon interaction with Int and/or DNA. 72
50735 400261 pfam07826 IMP_cyclohyd IMP cyclohydrolase-like protein. This enzyme may catalyze the cyclization of 5-formylamidoimidazole-4-carboxamide ribonucleotide to inosine monophosphate (IMP), a reaction which is important in de novo purine biosynthesis in archaeal species. This single domain protein is arranged to form an overall fold that consists of a four-layered alpha-beta-beta-alpha core structure. The two antiparallel beta-sheets pack against each other and are covered by alpha-helices on one face of the molecule. The protein is structurally similar to members of the N-terminal nucleophile (NTN) hydrolase superfamily. A deep pocket was in fact found on the surface of IMP cyclohydrolase in a position equivalent to that of active sites of NTN-hydrolases, but an N-terminal nucleophile could not be found. Therefore, it is thought that this enzyme is structurally but not functionally similar to members of the NTN-hydrolase family. 194
50736 285116 pfam07827 KNTase_C KNTase C-terminal domain. Kanamycin nucleotidyltransferase (KNTase) is involved in conferring resistance to aminoglycoside antibiotics and catalyzes the transfer of a nucleoside monophosphate group from a nucleotide to kanamycin. This enzyme is dimeric with each subunit being composed of two domains. The C-terminal domain contains five alpha helices, four of which are organized into an up-and-down alpha helical bundle. Residues found in this domain may contribute to this enzyme's active site. 132
50737 369543 pfam07828 PA-IL PA-IL-like protein. The members of this family are similar to the galactophilic lectin-1 expressed by P. aeruginosa (PA-IL). Lectins recognising specific carbohydrates found on the surface of host cells are known to be involved in the initiation of infections by this organism. The protein is thought to be organized into an extensive network of beta-sheets, as is the case with many other lectins. 121
50738 311671 pfam07829 Toxin_14 Alpha-A conotoxin PIVA-like protein. Alpha-A conotoxin PIVA is the major paralytic toxin found in the venom produced by the piscivorous snail Conus purpurascens. This peptide acts by blocking the acetylcholine binding site of the nicotinic acetylcholine receptor at the neuromuscular junction. The overall shape of the peptide is described as an "iron" with a highly charged hydrophilic loop of 15S-19R forming the "handle" domain that is exposed to the exterior of the protein. The stability of the conotoxin is primarily governed by three disulphide bonds. A triangular structural motif formed by residues 19R, 12H and 6Y is thought to constitute a "binding core" that is important in binding to the acetylcholine receptor. 26
50739 400262 pfam07830 PP2C_C Protein serine/threonine phosphatase 2C, C-terminal domain. Protein phosphatase 2C (PP2C) is involved in regulating cellular responses to stress in various eukaryotes. It consists of two domains: an N-terminal catalytic domain and a C-terminal domain characteristic of mammalian PP2Cs. This domain consists of three antiparallel alpha helices, one of which packs against two corresponding alpha-helices of the N-terminal domain. The C-terminal domain does not seem to play a role in catalysis, but it may provide protein substrate specificity due to the cleft that is created between it and the catalytic domain. 79
50740 400263 pfam07831 PYNP_C Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer. 74
50741 400264 pfam07832 Bse634I Cfr10I/Bse634I restriction endonuclease. Cfr10I and Bse634I are two Type II restriction endonucleases. They exhibit a conserved tetrameric architecture that is of functional importance, wherein two dimers are arranged 'back-to-back' with their putative DNA-binding clefts facing opposite directions. These clefts are formed between two monomers that interact, mainly via hydrophobic interactions supported by a few hydrogen bonds, to form a U-shaped dimer. Each monomer is folded to form a compact alpha-beta structure, whose core is made up of a five-stranded mixed beta-sheet. The monomer may be split into separate N-terminal and C-terminal subdomains at a hinge located in helix alpha3. 281
50742 400265 pfam07833 Cu_amine_oxidN1 Copper amine oxidase N-terminal domain. Copper amine oxidases catalyze the oxidative deamination of primary amines to the corresponding aldehydes, while reducing molecular oxygen to hydrogen peroxide. These enzymes are dimers of identical subunits, each comprising four domains. The N-terminal domain, which is absent in some amine oxidases, consists of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the 'stalk' of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other. 93
50743 400266 pfam07834 RanGAP1_C RanGAP1 C-terminal domain. Ran-GTPase activating protein 1 (RanGAP1) is a GTPase activator for the nuclear Ras-related regulatory protein Ran, converting it to the putatively inactive GDP-bound state. Its C-terminal domain is required for RanGAP1 localization at the vertebrate nuclear pore complex, and is sumoylated by the small ubiquitin-related modifier protein (SUMO-1). This domain is composed almost entirely of helical substructures that are organized into an alpha-alpha superhelix fold, with the exception of the peptide containing the lysine residue required for SUMO-1 conjugation. 179
50744 400267 pfam07835 COX4_pro_2 Bacterial aa3 type cytochrome c oxidase subunit IV. Bacterial cytochrome c oxidase is found bound to the cell membrane, where it is involved in the generation of the transmembrane proton electrochemical gradient. It is composed of four subunits. Subunit IV consists of one transmembrane helix that does not interact directly with the other subunits, but maintains its position by indirect contacts via phospholipid molecules found in the structure. The function of subunit IV is as yet unknown. 42
50745 400268 pfam07836 DmpG_comm DmpG-like communication domain. This domain is found towards the C-terminal region of various aldolase enzymes. It consists of five alpha-helices, four of which form an antiparallel helical bundle that plugs the C-terminus of the N-terminal TIM barrel domain. The communication domain is thought to play an important role in the heterodimerization of the enzyme. 63
50746 400269 pfam07837 FTCD_N Formiminotransferase domain, N-terminal subdomain. The formiminotransferase (FT) domain of formiminotransferase-cyclodeaminase (FTCD) forms a homodimer, and each protomer comprises two subdomains. The N-terminal subdomain is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains. 173
50747 400270 pfam07839 CaM_binding Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defense responses and stolonisation or tuberization. 118
50748 400271 pfam07840 FadR_C FadR C-terminal domain. This family contains sequences that are similar to the fatty acid metabolism regulator protein (FadR). This functions as a dimer, with each monomer being composed of an N-terminal DNA-binding domain and a regulatory C-terminal domain. A linker comprising two short alpha helices joins the two domains. In the C-terminal domain, an antiparallel array of six alpha helices forms a barrel-like structure, while a seventh alpha helix forms a 'lid' at the end closest to the N-terminal domain. This structure was found to be similar to that of the C-terminal domain of the Tet repressor. Long-chain acyl-CoA thioesters interact directly and reversibly with the C-terminal domain, and this interaction affects the structure and therefore the DNA binding properties of the N-terminal domain. 163
50749 400272 pfam07841 DM4_12 DM4/DM12 family. This family contains sequences derived from hypothetical proteins expressed by two insect species, D. melanogaster and A. gambiae. The region in question is approximately 115 amino acid residues long and contains four highly-conserved cysteine residues. 86
50750 400273 pfam07842 GCFC GC-rich sequence DNA-binding factor-like protein. Sequences found in this family are similar to a region of a human GC-rich sequence DNA-binding factor homolog. This is thought to be a protein involved in transcriptional regulation due to partial homologies to a transcription repressor and histone-interacting protein. This family also contains tuftelin interacting protein 11 which has been identified as both a nuclear and cytoplasmic protein, and has been implicated in the secretory pathway. Sip1, a septin interacting protein is also a member of this family. 275
50751 400274 pfam07843 DUF1634 Protein of unknown function (DUF1634). This family contains many hypothetical bacterial and archaeal proteins. A few members of this family are annotated as being putative transmembrane proteins, and the region in question in fact contains many hydrophobic residues. 102
50752 400275 pfam07845 DUF1636 Protein of unknown function (DUF1636). The sequences featured in this family are derived from a number of hypothetical prokaryotic proteins. The region in question is approximately 130 amino acids long. 117
50753 285130 pfam07846 Metallothio_Cad Metallothionein family. The sequences making up Metallothio_Cad are found repeated in metallothionein proteins expressed by several different Tetrahymena species. Metallothioneins are low molecular mass, cysteine-rich metal-binding proteins that are thought to be involved in the regulation of levels of trace metals, and detoxification of these metals when present in excess. Some of the metallothioneins found in this family are known to be induced by cadmium and are thought to be involved in the cellular sequestration of toxic metal ions. The high proportion of cysteine residues allows the metal ions to be bound by the formation of clusters of metal-thiolate complexes. Tetrahymena spp. metallothioneins differ from other eukaryotic metallothioneins mainly in the length of their sequences and in the cysteine-containing motifs they exhibit. 20
50754 400276 pfam07847 PCO_ADO PCO_ADO. This entry includes cysteine oxidases (PCOs) from plants and 2-aminoethanethiol dioxygenases (ADOs) from animals. 201
50755 285132 pfam07848 PaaX PaaX-like protein. This family contains proteins that are similar to the product of the paaX gene of Escherichia coli. This protein is involved in the regulation of expression of a group of proteins known to participate in the metabolism of phenylacetic acid. In fact, some members of this family are annotated by InterPro as containing a winged helix DNA-binding domain. 70
50756 400277 pfam07849 DUF1641 Protein of unknown function (DUF1641). Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long. 39
50757 400278 pfam07850 Renin_r Renin receptor-like protein. The sequences featured in this family are similar to a region of the human renin receptor that bears a putative transmembrane spanning segment. The renin receptor is involved in intracellular signal transduction by the activation of the ERK1/ERK2 pathway, and it also serves to increase the efficiency of angiotensinogen cleavage by receptor-bound renin, therefore facilitating angiotensin II generation and action on a cell surface. 97
50758 400279 pfam07851 TMPIT TMPIT-like protein. A number of members of this family are annotated as being transmembrane proteins induced by tumor necrosis factor alpha, but no literature was found to support this. 324
50759 369557 pfam07852 DUF1642 Protein of unknown function (DUF1642). The sequences making up this family are derived from various hypothetical phage and prophage proteins. The region in question is approximately 140 amino acids long. 136
50760 400280 pfam07853 DUF1648 Protein of unknown function (DUF1648). Members of this family are hypothetical proteins expressed by either bacterial or archaeal species. Some of these are annotated as being transmembrane proteins, and in fact many of these sequences contain a high proportion of hydrophobic residues. 49
50761 285138 pfam07854 DUF1646 Protein of unknown function (DUF1646). Some of the members of this family are hypothetical bacterial and archaeal proteins, but others are annotated as being cation transporters expressed by the archaebacterium Methanosarcina mazei. 347
50762 400281 pfam07855 ATG101 Autophagy-related protein 101. Atg101 is a critical autophagy factor that functions together with ULK, Atg13 and FIP200. 152
50763 400282 pfam07856 Orai-1 Mediator of CRAC channel activity. ORAI-1 is a protein homolog of Drosophila Orai and human Orai1, Orai2 and Orai3. ORAI-1 GFP reporters are co-expressed with STIM-1 (ER Ca(2+) sensors) in the gonad and intestine. The protein has four predicted transmembrane domains with a highly conserved region between TM2 and TM3. This conserved domain is thought to function in channel regulation. ORAI1-related proteins are required for the production of the calcium channel, CRAC, along with STIM1-related proteins. 168
50764 285141 pfam07857 TMEM144 Transmembrane family, TMEM144 of transporters. Members of this family fall in to the drug/metabolite transporter (dmt) superfamily. They carry 10xTM domains arranged as 5+5. Although these two sets may originally have arisen by gene-duplication the divergence now is such that the two halves are no longer homologous. 333
50765 400283 pfam07858 LEH Limonene-1,2-epoxide hydrolase catalytic domain. Epoxide hydrolases catalyze the hydrolysis of epoxides to corresponding diols, which is important in detoxification, synthesis of signal molecules, or metabolism. Limonene-1,2-epoxide hydrolase (LEH) differs from many other epoxide hydrolases in its structure and its novel one-step catalytic mechanism. Its main fold consists of a six-stranded mixed beta-sheet, with three N-terminal alpha helices packed to one side to create a pocket that extends into the protein core. A fourth helix lies in such a way that it acts as a rim to this pocket. Although mainly lined by hydrophobic residues, this pocket features a cluster of polar groups that lie at its deepest point and constitute the enzyme's active site. 125
50766 400284 pfam07859 Abhydrolase_3 alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes. 208
50767 285144 pfam07860 CCD WisP family C-Terminal Region. This family is found at the C-terminus of the Tropheryma whipplei WisP family proteins. 130
50768 311693 pfam07861 WND WisP family N-Terminal Region. This family is found at the N-terminus of the Tropheryma whipplei WisP family proteins. 239
50769 400285 pfam07862 Nif11 Nif11 domain. This domain is found mainly in the Cyanobacteria and in Proteobacteria such as the nitrogen-fixing bacterium Azotobacter vinelandii. It is found in Nif11, a protein described in Azotobacter as linked to nitrogen fixation. It also constitutes a leader peptide in Nif11-derived peptides (N11P), which are thought to be post-translationally modified microcins derived from a putative nitrogen-fixing protein. N11P sequences have a classic leader peptide cleavage motif, usually Gly-Gly, which marks the end of family-wide similarity area and the beginning of a low-complexity region rich in Cys, Gly and Ser. 47
50770 400286 pfam07863 CtnDOT_TraJ homologs of TraJ from Bacteroides conjugative transposon. Members of this family have been implicated in an unusual form of DNA transfer (conjugation) in Bacteroides. The family has been named CtnDOT_TraJ to avoid confusion with other conjugative transfer systems. 66
50771 400287 pfam07864 DUF1651 Protein of unknown function (DUF1651). This is a family containing bacterial proteins of unknown function. 73
50772 311697 pfam07865 DUF1652 Protein of unknown function (DUF1652). This is a family containing hypothetical bacterial proteins. 67
50773 400288 pfam07866 DUF1653 Protein of unknown function (DUF1653). This is a family of hypothetical bacterial proteins of unknown function. 61
50774 311699 pfam07867 DUF1654 Protein of unknown function (DUF1654). This family consists of proteins from the Pseudomonadaceae. 70
50775 285151 pfam07868 DUF1655 Protein of unknown function (DUF1655). This protein is found in some prophages found in Lactobacillales lactis. 55
50776 400289 pfam07869 DUF1656 Protein of unknown function (DUF1656). This family contains bacterial proteins, many of which are hypothetical. Some proteins in this family are putative membrane proteins. 56
50777 377932 pfam07870 DUF1657 Protein of unknown function (DUF1657). This domain appears to be restricted to the Bacillales. 49
50778 285154 pfam07871 DUF1658 Protein of unknown function (DUF1658). This family of small proteins seems to be found in several places in the Coxiella genome. 21
50779 400290 pfam07872 DUF1659 Protein of unknown function (DUF1659). This family consists of hypothetical bacterial proteins of unknown function. 44
50780 400291 pfam07873 YabP YabP family. This family of proteins is involved in spore coat assembly during the process of sporulation. 64
50781 369565 pfam07874 DUF1660 Prophage protein (DUF1660). This protein is found in Lactobacillae prophages. 64
50782 400292 pfam07875 Coat_F Coat F domain. The Coat F proteins contribute to the Bacillales spore coat. This domain occurs multiple times in the genomes in which it is found. 63
50783 400293 pfam07876 Dabb Stress responsive A/B Barrel Domain. The function of this family is unknown, but it is upregulated in response to salt stress in Populus balsamifera. It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. Arthrobacter nicotinovorans ORF106 is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an a/b barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence. 96
50784 285160 pfam07877 DUF1661 Protein of unknown function (DUF1661). This is a family containing bacterial proteins of unknown function. Many of the proteins in this family are hypothetical. 31
50785 400294 pfam07878 RHH_5 CopG-like RHH_1 or ribbon-helix-helix domain, RHH_5. This family contains bacterial proteins that form a ribbon-helix-helix fold. This fold occurs in many examples of bacterial antitoxins. 43
50786 400295 pfam07879 PHB_acc_N PHB/PHA accumulation regulator DNA-binding domain. This domain is found at the N-terminus of the Polyhydroxyalkanoate (PHA) synthesis regulators. These regulators have been shown to directly bind DNA and PHA. The invariant nature of this domain compared to the C-terminal pfam05233 domain(s) suggests that it contains the DNA-binding function. 59
50787 400296 pfam07880 T4_gp9_10 Bacteriophage T4 gp9/10-like protein. The members of this family are similar to gene products 9 (gp9) and 10 (gp10) of bacteriophage T4. Both proteins are components of the viral baseplate. Gp9 connects the long tail fibers of the virus to the baseplate and triggers tail contraction after viral attachment to a host cell. The protein is active as a trimer, with each monomer being composed of three domains. The N-terminal domain consists of an extended polypeptide chain and two alpha helices. The alpha1 helix from each of the three monomers in the trimer interacts with its counterparts to form a coiled-coil structure. The middle domain is a seven-stranded beta-sandwich that is thought to be a novel protein fold. The C-terminal domain is thought to be essential for gp9 trimerisation and is organized into an eight-stranded antiparallel beta-barrel, which was found to resemble the 'jelly roll' fold found in many viral capsid proteins. The long flexible region between the N-terminal and middle domains may be required for the function of gp9 to transmit signals from the long tail fibers. Together with gp11, gp10 initiates the assembly of wedges that then go on to associate with a hub to form the viral baseplate. 285
50788 400297 pfam07881 Fucose_iso_N1 L-fucose isomerase, first N-terminal domain. The members of this family are similar to L-fucose isomerase expressed by E. coli (EC:5.3.1.3). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. Domain 1 demonstrates the beta-alpha-beta-alpha-beta Rossmann fold. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues, and domain 1 from the adjacent subunit contributing some other residues. 168
50789 400298 pfam07882 Fucose_iso_N2 L-fucose isomerase, second N-terminal domain. The members of this family are similar to L-fucose isomerase expressed by E. coli (EC:5.3.1.3). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues. 180
50790 400299 pfam07883 Cupin_2 Cupin domain. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). 70
50791 400300 pfam07884 VKOR Vitamin K epoxide reductase family. Vitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues. In some plant and bacterial homologs the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases. 132
50792 400301 pfam07885 Ion_trans_2 Ion channel. This family includes the two membrane helix type ion channels found in bacteria. 76
50793 400302 pfam07886 BA14K BA14K-like protein. The sequences found in this family are similar to the BA14K proteins expressed by Brucella abortus and by Brucella suis. BA14K was found to be strongly immunoreactive; it induces both humoral and cellular responses in hosts throughout the infective process. 29
50794 400303 pfam07887 Calmodulin_bind Calmodulin binding protein-like. The members of this family are putative or actual calmodulin binding proteins expressed by various plant species. Some members are known to be involved in the induction of plant defense responses. However, their precise function in this regard is as yet unknown. 291
50795 400304 pfam07888 CALCOCO1 Calcium binding and coiled-coil domain (CALCOCO1) like. Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region. 488
50796 400305 pfam07889 DUF1664 Protein of unknown function (DUF1664). The members of this family are hypothetical plant proteins of unknown function. The region featured in this family is approximately 100 amino acids long. 122
50797 400306 pfam07890 Rrp15p Rrp15p. Rrp15p is required for the formation of 60S ribosomal subunits. 126
50798 400307 pfam07891 DUF1666 Protein of unknown function (DUF1666). These sequences are derived from hypothetical plant proteins of unknown function. The region in question is approximately 250 residues long. 246
50799 400308 pfam07892 DUF1667 Protein of unknown function (DUF1667). Hypothetical archaeal and bacterial proteins make up this family. A few proteins are annotated as being potential metal-binding proteins, and in fact the members of this family have four highly conserved cysteine residues, but no further literature evidence was found in this regard. 82
50800 369579 pfam07893 DUF1668 Protein of unknown function (DUF1668). The hypothetical proteins found in this family are expressed by Oryza sativa and are of unknown function. 330
50801 400309 pfam07894 DUF1669 Protein of unknown function (DUF1669). This family is composed of sequences derived from hypothetical eukaryotic proteins of unknown function. Some members of this family are annotated as being potential phospholipases but no literature was found to support this. 276
50802 400310 pfam07895 DUF1673 Protein of unknown function (DUF1673). This family contains hypothetical proteins of unknown function expressed by two archaeal species. 207
50803 400311 pfam07896 DUF1674 Protein of unknown function (DUF1674). The members of this family are sequences derived from hypothetical eukaryotic and bacterial proteins. The region in question is approximately 60 residues long. 50
50804 400312 pfam07897 EAR Ethylene-responsive binding factor-associated repression. The EAR motif is the ethylene-responsive element binding factor-associated amphiphilic repression motif. This motif binds to the Groucho/Tup1-type co-repressor TOPLESS (TPL) and TPL-related proteins. The motif is frequently found at the N-terminus of NINJA, or Novel INteractor of JAZ, proteins. The EAR motif, defined by the consensus sequence patterns of either LxLxL or DLNxxP, is the most predominant form of transcriptional repression motif so far identified in plants. It is highly conserved in transcriptional regulators that are known to function as negative regulators in a broad range of developmental and physiological processes across evolutionarily diverse plant species. This family is closely related to family AUX_IAA Pfam:PF02309 which also has an LxLxL signature. 35
50805 400313 pfam07898 DUF1676 Protein of unknown function (DUF1676). This family contains sequences derived from proteins of unknown function expressed by Drosophila melanogaster and Anopheles gambiae. 172
50806 400314 pfam07899 Frigida Frigida-like protein. This family is composed of plant proteins that are similar to FRIGIDA protein expressed by Arabidopsis thaliana. This protein is probably nuclear and is required for the regulation of flowering time in the late-flowering phenotype. It is known to increase RNA levels of flowering locus C. Allelic variation at the FRIGIDA locus is a major determinant of natural variation in flowering time. 290
50807 369585 pfam07900 DUF1670 Protein of unknown function (DUF1670). The hypothetical eukaryotic proteins found in this family are of unknown function. 218
50808 285184 pfam07901 DUF1672 Protein of unknown function (DUF1672). This family is composed of hypothetical bacterial proteins of unknown function. 271
50809 369586 pfam07902 Gp58 gp58-like protein. Sequences found in this family are derived from a number of bacteriophage and prophage proteins. They are similar to gp58, a minor structural protein of Lactobacillus delbrueckii bacteriophage LL-H. 594
50810 285186 pfam07903 PaRep2a PaRep2a protein. This is a family of proteins expressed by the crenarchaeon Pyrobaculum aerophilum. The members are highly variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments is thought to indicate that most family members are no longer functional. 122
50811 400315 pfam07904 Eaf7 Chromatin modification-related protein EAF7. The S. cerevisiae member of this family is part of NuA4, the only essential histone acetyltransferase complex in Saccharomyces cerevisiae involved in global histone acetylation. 97
50812 400316 pfam07905 PucR Purine catabolism regulatory protein-like family. The bacterial proteins found in this family are similar to the purine catabolism regulatory protein expressed by Bacillus subtilis (PucR). PucR is thought to be a transcriptional activator involved in the induction of the purine degradation pathway, and may contain a LysR-like DNA-binding domain. It is similar to LysR-type regulators in that it represses its own expression. The other members of this family are also annotated as being putative regulatory proteins. 117
50813 369588 pfam07906 Toxin_15 ShET2 enterotoxin, N-terminal region. The members of this family are sequences that are similar to the N-terminal half of the ShET2 enterotoxin produced by Shigella flexneri and Escherichia coli. This protein was found to confer toxigenicity in the Ussing chamber, and the N-terminal region was found to be important for the protein's enterotoxic effect. It is thought to be a hydrophobic protein that forms inclusion bodies within the bacterial cell, and may be secreted by the Mxi system. Most members of this family are annotated as putative enterotoxins, but one member is a regulator of acetyl CoA synthetase, and another two members are annotated as ankyrin-like regulatory proteins and contain Ank repeats (pfam00023). 278
50814 400317 pfam07907 YibE_F YibE/F-like protein. The sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE and YibF. Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues. 240
50815 116521 pfam07909 DUF1663 Protein of unknown function (DUF1663). The members of this family are hypothetical proteins expressed by Trypanosoma cruzi, a eukaryotic parasite that causes Chagas' disease in humans. This region is found as multiple copies per protein. 514
50816 400318 pfam07910 Peptidase_C78 Peptidase family C78. This family formerly known as DUF1671 has been shown to be a cysteine peptidase called (Ufm1)-specific protease. 199
50817 400319 pfam07911 DUF1677 Protein of unknown function (DUF1677). The sequences found in this family are all derived from hypothetical plant proteins of unknown function. The region features a number of highly conserved cysteine residues. 89
50818 400320 pfam07912 ERp29_N ERp29, N-terminal domain. ERp29 is a ubiquitously expressed endoplasmic reticulum protein, and is involved in the processes of protein maturation and protein secretion in this organelle. The protein exists as a homodimer, with each monomer being composed of two domains. The N-terminal domain featured in this family is organized into a thioredoxin-like fold that resembles the a domain of human protein disulphide isomerase (PDI). However, this domain lacks the C-X-X-C motif required for the redox function of PDI; it is therefore thought that ERp29's function is similar to the chaperone function of PDI. The N-terminal domain is exclusively responsible for the homodimerization of the protein, without covalent linkages or additional contacts with other domains. 126
50819 285193 pfam07913 DUF1678 Protein of unknown function (DUF1678). This family is composed of uncharacterized proteins expressed by Methanopyrus kandleri, a hyperthermophilic archaebacterium. 196
50820 369592 pfam07914 DUF1679 Protein of unknown function (DUF1679). The region featured in this family is found in a number of C. elegans proteins, in one case as a repeat. In many of the family members, this region is associated with the CHK region described by SMART as being found in ZnF_C4 and HLH domain-containing kinases. In fact, one member of this family is annotated as being a member of the nuclear hormone receptor family, and contains regions typical of such proteins (Interpro:IPR000536, Interpro:IPR008946, and Interpro:IPR001628). 413
50821 400321 pfam07915 PRKCSH Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain. 72
50822 400322 pfam07916 TraG_N TraG-like protein, N-terminal region. The bacterial sequences found in this family are similar to the N-terminal region of the TraG protein. This is a membrane-spanning protein, with three predicted transmembrane segments and two periplasmic regions. TraG protein is known to be essential for DNA transfer in the process of conjugation, with the N-terminal portion being required for F pilus assembly. The protein is thought to interact with the periplasmic domain of TraN to stabilize mating-cell interactions. 400
50823 400323 pfam07918 CAP160 CAP160 repeat. The region featured in this family is repeated in spinach cold acclimation protein CAP160. CAP160 is induced during periods of drought stress; its precise function is unknown but it has been implicated in the stabilisation of membranes, cytoskeletal elements, and ribosomes. By acting as a compatible solute, it may reduce the toxic effects of cellular solutes that accumulate at high concentration during dehydration; it may also function as an enzyme that produces such a solute. Other members of this family are also induced by water stress, abscisic acid, and/or low temperature, such as desiccation-responsive protein 29B and CDet11-24 protein. 27
50824 400324 pfam07919 Gryzun Gryzun, putative trafficking through Golgi. The proteins featured in this family are all eukaryotic, and many of them are annotated as being Gryzun. Gryzun is distantly related to, but distinct from, the Trs130 subunit of the TRAPP complex but is absent from S. cerevisiae. RNAi of human Gryzun blocks Golgi exit. Thus the family is likely to be involved with trafficking of proteins through membranes, perhaps as part of the TRAPP complex. 590
50825 400325 pfam07920 DUF1684 Protein of unknown function (DUF1684). The sequences featured in this family are found in hypothetical archaeal and bacterial proteins of unknown function. The region in question is approximately 200 amino acids long. 141
50826 254516 pfam07921 Fibritin_C Fibritin C-terminal region. This family features sequences bearing similarity to the C-terminal portion of the bacteriophage T4 protein fibritin. This protein is responsible for attachment of long tail fibers to virus particle, and forms the 'whiskers' or fibers on the neck of the virion. The region seen in this family contains an N-terminal coiled-coil portion and the C-terminal globular foldon domain (residues 457-486), which is essential for fibritin trimerisation and folding. This domain consists of a beta-hairpin; three such hairpins come together in a beta-propeller-like arrangement in the trimer, which is stabilized by hydrogen bonds, salt bridges and hydrophobic interactions. 93
50827 400326 pfam07922 Glyco_transf_52 Glycosyltransferase family 52. This family features glycosyltransferases belonging to glycosyltransferase family 52, which have alpha-2,3-sialyltransferase (EC:4.2.99.4) and alpha-glucosyltransferase (EC 2.4.1.-) activity. For example, beta-galactoside alpha-2,3-sialyltransferase expressed by Neisseria meningitidis is a member of this family and is involved in a step of lipooligosaccharide biosynthesis requiring sialic acid transfer; these lipooligosaccharides are thought to be important in the process of pathogenesis. This family includes several bacterial lipooligosaccharide sialyltransferases similar to the Haemophilus ducreyi LST protein. Haemophilus ducreyi is the cause of the sexually transmitted disease chancroid and produces a lipooligosaccharide (LOS) containing a terminal sialyl N-acetyllactosamine trisaccharide. 271
50828 400327 pfam07923 N1221 N1221-like protein. The sequences featured in this family are similar to a hypothetical protein product of ORF N1221 in the CPT1-SPC98 intergenic region of the yeast genome. This encodes an acidic polypeptide with several possible transmembrane regions. 282
50829 400328 pfam07924 NuiA Nuclease A inhibitor-like protein. This family consists of protein sequences that are similar to the nuclease A inhibitor expressed by bacteria of the genus Anabaena (NuiA). This sequence is organized to form an alpha-beta-alpha sandwich fold, which is similar to the PR-1-like fold. NuiA interacts with nuclease A by means of residues located at one end of the molecule, including residues making up the loop between helices III and IV and the loop between strands C and D. The mechanism of inhibition of nuclease A by NuiA is as yet incompletely understood. 130
50830 369599 pfam07925 RdRP_5 Reovirus RNA-dependent RNA polymerase lambda 3. The sequences in this family are similar to the reoviral minor core protein lambda 3, which functions as a RNA-dependent RNA polymerase within the protein capsid. It is organized into 3 domains. N- and C-terminal domains create a 'cage' that encloses a conserved central catalytic domain within a hollow centre; this catalytic domain is arranged to form 'fingers', 'palm' and 'thumb' subdomains. Unlike other RNA polymerases, like HIV reverse transcriptase and T7 RNA polymerase, lambda 3 protein binds template and substrate with only localized rearrangements, and catalytic activity can occur with little structural change. However, the structure of the catalytic complex is similar to that of other polymerase catalytic complexes with known structure. 1271
50831 400329 pfam07926 TPR_MLP1_2 TPR/MLP1/MLP2-like protein. The sequences featured in this family are similar to a region of human TPR protein and to yeast myosin-like proteins 1 (MLP1) and 2 (MLP2). These proteins share a number of features; for example, they all have coiled-coil regions and all three are associated with nuclear pores. TPR is thought to be a component of nuclear pore complex-attached intra-nuclear filaments, and is implicated in nuclear protein import. Moreover, its N-terminal region is involved in the activation of oncogenic kinases, possibly by mediating the dimerization of kinase domains or by targeting these kinases to the nuclear pore complex. MLP1 and MLP2 are involved in the process of telomere length regulation, where they are thought to interact with proteins such as Tel1p and modulate their activity. 129
50832 400330 pfam07927 HicA_toxin HicA toxin of bacterial toxin-antitoxin. HicA_toxin is a bacterial family of toxins that act as mRNA interferases. The antitoxin that neutralizes this is family HicB, pfam15919. 56
50833 400331 pfam07928 Vps54 Vps54-like protein. This family contains various proteins that are homologs of the yeast Vps54 protein, such as the rat homolog, the human homolog, and the mouse homolog. In yeast, Vps54 associates with Vps52 and Vps53 proteins to form a trimolecular complex that is involved in protein transport between Golgi, endosomal, and vacuolar compartments. All Vps54 homologs contain a coiled coil region (not found in the region featured in this family) and multiple dileucine motifs. 133
50834 400332 pfam07929 PRiA4_ORF3 Plasmid pRiA4b ORF-3-like protein. Members of this family are similar to the protein product of ORF-3 found on plasmid pRiA4 in the bacterium Agrobacterium rhizogenes. This plasmid is responsible for tumorigenesis at wound sites of plants infected by this bacterium, but the ORF-3 product does not seem to be involved in the pathogenetic process. Other proteins found in this family are annotated as being putative TnpR resolvases, but no further evidence was found to back this. Moreover, another member of this family is described as a probable lexA repressor and in fact carries a LexA DNA binding domain (pfam01726), but no references were found to expand on this. 166
50835 400333 pfam07930 DAP_B D-aminopeptidase, domain B. D-aminopeptidase is a dimeric enzyme with each monomer being composed of three domains. Domain B is organized to form a beta barrel made up of eight antiparallel beta strands. It is connected to domain A, the catalytic domain, by an eight-residue sequence, and also interacts with both domains A and C via non-covalent bonds. Domain B probably functions in maintaining domain C in a good position to interact with domain A. 181
50836 400334 pfam07931 CPT Chloramphenicol phosphotransferase-like protein. The members of this family are all similar to chloramphenicol 3-O phosphotransferase (CPT) expressed by Streptomyces venezuelae. Chloramphenicol (Cm) is a metabolite produced by this bacterium that can inhibit ribosomal peptidyl transferase activity and therefore protein production. By transferring a phosphate group to the C-3 hydroxyl group of Cm, CPT inactivates this potentially lethal metabolite. 172
50837 400335 pfam07933 DUF1681 Protein of unknown function (DUF1681). This family is composed of sequences derived from a number of hypothetical eukaryotic proteins of unknown function. 156
50838 400336 pfam07934 OGG_N 8-oxoguanine DNA glycosylase, N-terminal domain. The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity (EC:3.2.2.23) and DNA lyase activity (EC:4.2.99.18). The region featured in this family is the N-terminal domain, which is organized into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket. 115
50839 311750 pfam07935 SSV1_ORF_D-335 ORF D-335-like protein. The sequences featured in this family are similar to a probable integrase expressed by the SSV1 virus of the archaebacterium Sulfolobus shibatae. This protein may be necessary for the integration of the virus into the host genome by a process of site-specific recombination. 63
50840 254527 pfam07936 Defensin_4 Potassium-channel blocking toxin. This family features the antihypertensive and antiviral proteins BDS-I and BDS-II expressed by Anemonia sulcata. BDS-I is organized into a triple-stranded antiparallel beta-sheet, with an additional small antiparallel beta-sheet at the N-terminus. Both peptides are known to specifically block the Kv3.4 potassium channel, and thus bring about a decrease in blood pressure. Moreover, they inhibit the cytopathic effects of mouse hepatitis virus strain MHV-A59 on mouse liver cells, by an unknown mechanism. 34
50841 285213 pfam07937 DUF1686 Protein of unknown function (DUF1686). The members of this family are all hypothetical proteins of unknown function expressed by the eukaryotic parasite Encephalitozoon cuniculi GB-M1. The region in question is approximately 250 amino acids long. 182
50842 369605 pfam07938 Fungal_lectin Fungal fucose-specific lectin. Lectins are involved in many recognition events at the molecular or cellular level. These fungal lectins, such as Aleuria aurantia lectin (AAL), specifically recognize fucosylated glycans. AAL is a dimeric protein, with each monomer being organized into a six-bladed beta-propeller fold and a small antiparallel two-stranded beta-sheet. The beta-propeller fold is important in fucose recognition; five binding pockets are found between the propeller blades. The small beta-sheet, on the other hand, is involved in the dimerization process. 303
50843 400337 pfam07939 DUF1685 Protein of unknown function (DUF1685). The members of this family are hypothetical eukaryotic proteins of unknown function. The region in question is approximately 100 amino acid residues long. 60
50844 400338 pfam07940 Hepar_II_III Heparinase II/III-like protein. This family features sequences that are similar to a region of the Flavobacterium heparinum proteins heparinase II and heparinase III. The former is known to degrade heparin and heparan sulphate, whereas the latter predominantly degrades heparan sulphate. Both are secreted into the periplasmic space upon induction with heparin. 235
50845 400339 pfam07941 K_channel_TID Potassium channel Kv1.4 tandem inactivation domain. This family features the tandem inactivation domain found at the N-terminus of the Kv1.4 potassium channel. It is composed of two subdomains. Inactivation domain 1 (ID1, residues 1-38) consists of a flexible N-terminus anchored at a 5-turn helix, and is thought to work by occluding the ion pathway, as is the case with a classical ball domain. Inactivation domain 2 (ID2, residues 40-50) is a 2.5 turn helix with a high proportion of hydrophobic residues that probably serves to attach ID1 to the cytoplasmic face of the channel. In this way, it can promote rapid access of ID1 to the receptor site in the open channel. ID1 and ID2 function together to bring about fast inactivation of the Kv1.4 channel, which is important for the channel's role in short-term plasticity. 71
50846 400340 pfam07942 N2227 N2227-like protein. This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defense response to stressful conditions. 268
50847 400341 pfam07943 PBP5_C Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides. 91
50848 400342 pfam07944 Glyco_hydro_127 Beta-L-arabinofuranosidase, GH127. One member of this family, from Bifidobacterium longum, UniProtKB:E8MGH8, has been characterized as an unusual beta-L-arabinofuranosidase enzyme, EC:3.2.1.185. It releases l-arabinose from the l-arabinofuranose (Araf)-beta1,2-Araf disaccharide and also transglycosylates 1-alkanols with retention of the anomeric configuration. Terminal beta-l-arabinofuranosyl residues have been found in arabinogalactan proteins from a number of different plant species. beta-l-Arabinofuranosyl linkages with 1-4 arabinofuranosides are also found in the sugar chains of extensin and solanaceous lectins, hydroxyproline (Hyp)2-rich glycoproteins that are widely observed in plant cell wall fractions. The critical residue for catalytic activity is Glu-338, in a ET/SCAS sequence context. 503
50849 116555 pfam07945 Toxin_16 Janus-atracotoxin. This family includes three peptides secreted by the spider Hadronyche versuta. These are insect-selective, excitatory neurotoxins that may function by antagonising muscle acetylcholine receptors, or acetylcholine receptor subtypes present in other invertebrate neurons. Janus atracotoxin-Hv1c (J-ACTX-Hv1c) is organized into a disulphide-rich globular core (residues 3-19) and a beta-hairpin (residues 20-34). There are 4 disulphide bridges, one of which is a vicinal disulphide bridge; this is known to be unimportant in the maintenance of structure but critical for insecticidal activity. 36
50850 400343 pfam07946 DUF1682 Protein of unknown function (DUF1682). The members of this family are all hypothetical eukaryotic proteins of unknown function. One member is described as being an adipocyte-specific protein, but no evidence of this was found. 326
50851 400344 pfam07947 YhhN YhhN family. The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many are annotated as possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues. A human member of this family, formerly known as TMEM86B, is a lysoplasmalogenase that catalyzes the hydrolysis of the vinyl ether bond of lysoplasmalogen. Putative conserved active site residues have been proposed for the YhhN family. 182
50852 285223 pfam07948 Nairovirus_M Nairovirus M polyprotein-like. The sequences in this family are similar to the Dugbe virus M polyprotein precursor, which includes glycoproteins G1 and G2. Both are thought to be inserted in the membrane of the Golgi complex of the infected host cell, and G1 is known to have a role in infection of vertebrate hosts. 657
50853 400345 pfam07949 YbbR YbbR-like protein. The members of this family are all hypothetical bacterial proteins of unknown function, and are similar to the YbbR protein expressed by Bacillus subtilis. One member is annotated as an uncharacterized secreted protein, whereas another member is described as a hypothetical protein in the 5' region of the def gene of Thermus thermophilus, which encodes a deformylase, but no further information was found in either case. This region is found repeated up to four times in many members of this family. 80
50854 400346 pfam07950 DUF1691 Protein of unknown function (DUF1691). This family of fungal proteins is uncharacterized. Each protein contains two copies of this region. 105
50855 285226 pfam07951 Toxin_R_bind_C Clostridium neurotoxin, C-terminal receptor binding. The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. This domain is the C-terminal receptor binding domain, which adopts a modified beta-trefoil fold with a six-stranded beta-barrel and a beta-hairpin triplet capping the domain. The first step in the intoxication process is a binding event between this domain and the pre-synaptic nerve ending. 217
50856 400347 pfam07952 Toxin_trans Clostridium neurotoxin, Translocation domain. The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. Subsequent to cell surface binding and receptor mediated endocytosis of the neurotoxin, an acid induced conformational change in the neurotoxin translocation domain is believed to allow the domain to penetrate the endosome and form a pore, thereby facilitating the passage of the catalytic domain across the membrane into the cytosol. The structure of the translocation domain reveals a pair of helices that are 105 Angstroms long and is structurally distinct from other pore forming toxins. 323
50857 400348 pfam07953 Toxin_R_bind_N Clostridium neurotoxin, N-terminal receptor binding. The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four-domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. This domain is the N-terminal receptor binding domain, which is comprised of two seven-stranded beta-sheets sandwiched together to form a jelly roll motif. The role of this domain in receptor binding appears to be indirect. 192
50858 369614 pfam07954 DUF1689 Protein of unknown function (DUF1689). Family of fungal proteins with unknown function. A member of this family has been found to localize in the mitochondria. 143
50859 400349 pfam07955 DUF1687 Protein of unknown function (DUF1687). This is a putative redox protein which is predicted to have a thioredoxin fold containing a single active cysteine. 124
50860 400350 pfam07956 DUF1690 Protein of Unknown function (DUF1690). Family of uncharacterized fungal proteins. 138
50861 400351 pfam07957 DUF3294 Protein of unknown function (DUF3294). This family was annotated as mitochondrial Ribosomal protein MRP8, based on the presumed similarity of the S.cerevisiae protein to an E.coli mitochondrial ribosomal protein; however, this similarity is spurious, and the function is not known [Wood, V]. 213
50862 400352 pfam07958 DUF1688 Protein of unknown function (DUF1688). A family of uncharacterized proteins. 420
50863 400353 pfam07959 Fucokinase L-fucokinase. In the salvage pathway of GDP-L-fucose, free cytosolic fucose is phosphorylated by L-fucokinase to form L-fucose-1-phosphate, which is then further converted to GDP-L-fucose in the reaction catalyzed by GDP-L-fucose pyrophosphorylase. 404
50864 369619 pfam07960 CBP4 CBP4. The CBP4 in S. cerevisiae is essential for the expression and activity of ubiquinol-cytochrome c reductase. This family appears to be fungal specific. 125
50865 254545 pfam07961 MBA1 MBA1-like protein. Mba1 is an inner membrane protein that is part of the mitochondrial protein export machinery. It binds to the large subunit of mitochondrial ribosomes and cooperates with the C-terminal ribosome-binding domain of Oxa1, which is a central component of the insertion machinery of the inner membrane. In the absence of both Mba1 and the C-terminus of Oxa1, mitochondrial translation products fail to be properly inserted into the inner membrane and serve as substrates of the matrix chaperone Hsp70. It is proposed that Mba1 functions as a ribosome receptor that cooperates with Oxa1 in the positioning of the ribosome exit site to the insertion machinery of the inner membrane. 235
50866 400354 pfam07962 Swi3 Replication Fork Protection Component Swi3. Replication fork pausing is required to initiate recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1. Swi3, together with Swi1, defines a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks. The Swi1-Swi3 complex is required for accurate replication, fork protection and replication checkpoint signalling. 83
50867 400355 pfam07963 N_methyl Prokaryotic N-terminal methylation motif. This short motif directs methylation of the conserved phenylalanine residue. It is most often found at the N-terminus of pilins and other proteins involved in secretion, see pfam00114, pfam05946, pfam02501 and pfam07596. 27
50868 369621 pfam07964 Red1 Rec10 / Red1. Rec10 / Red1 is involved in meiotic recombination and chromosome segregation during homologous chromosome formation. This protein localizes to the synaptonemal complex in S. cerevisiae and the analogous structures (linear elements) in S. pombe. This family is currently only found in fungi. 748
50869 400356 pfam07965 Integrin_B_tail Integrin beta tail domain. This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. 84
50870 400357 pfam07966 A1_Propeptide A1 Propeptide. Most eukaryotic endopeptidases (Merops Family A1) are synthesized with signal and propeptides. The animal pepsin-like endopeptidase propeptides form a distinct family of propeptides, which contain a conserved motif approximately 30 residues long. In pepsinogen A, the first 11 residues of the mature pepsin sequence are displaced by residues of the propeptide. The propeptide contains two helices that block the active site cleft; in particular, the conserved Asp11 residue in pepsin hydrogen bonds to a conserved Arg residue in the propeptide. This hydrogen bond stabilizes the propeptide conformation and is probably responsible for triggering the conversion of pepsinogen to pepsin under acidic conditions. 29
50871 400358 pfam07967 zf-C3HC C3HC zinc finger-like. This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) proteins. NIPA is implicated in an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signaling events. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. In Schizosaccharomyces pombe, the protein containing this domain is involved in mRNA export from the nucleus. 132
50872 400359 pfam07968 Leukocidin Leukocidin/Hemolysin toxin family. 250
50873 400360 pfam07969 Amidohydro_3 Amidohydrolase family. 464
50874 400361 pfam07970 COPIIcoated_ERV Endoplasmic reticulum vesicle transporter. This family is conserved from plants and fungi to humans. Erv46 works in close conjunction with Erv41 and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58 kDa beta subunit, which is required for ER localization. All proteins identified biochemically as Erv41p-Erv46p interactors are localized to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi. 223
50875 400362 pfam07971 Glyco_hydro_92 Glycosyl hydrolase family 92. Members of this family are alpha-1,2-mannosidases, enzymes which remove alpha-1,2-linked mannose residues from Man(9)(GlcNAc)(2) by hydrolysis. They are critical for the maturation of N-linked oligosaccharides and ER-associated degradation. 465
50876 400363 pfam07972 Flavodoxin_NdrI NrdI Flavodoxin like. 119
50877 400364 pfam07973 tRNA_SAD Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain. 43
50878 400365 pfam07974 EGF_2 EGF-like domain. This family contains EGF domains found in a variety of extracellular proteins. 26
50879 336887 pfam07975 C1_4 TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterized by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C (pfam00130). 55
50880 400366 pfam07976 Phe_hydrox_dim Phenol hydroxylase, C-terminal dimerization domain. Phenol hydroxylase acts as a homodimer to hydroxylate phenol to catechol or a similar product. The enzyme is comprised of three domains. The first two domains form the active site. The third domain, this domain, is involved in forming the dimerization interface. The domain adopts a thioredoxin-like fold. 166
50881 400367 pfam07977 FabA FabA-like domain. This enzyme domain has a HotDog fold. 134
50882 400368 pfam07978 NIPSNAP NIPSNAP. Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family which have putative roles in vesicular transport. This domain is often found in duplicate. 102
50883 311782 pfam07979 Intimin_C Intimin C-type lectin domain. This domain is found at the C-terminus of intimin. Its structure has been solved and shown to have a C-lectin type of structure. Intimin is a bacterial adhesion molecule involved in intimate attachment of enteropathogenic and enterohemorrhagic Escherichia coli to mammalian host cells. Intimin targets the translocated intimin receptor (Tir), which is exported by the bacteria and integrated into the host cell plasma membrane. 101
50884 400369 pfam07980 SusD_RagB SusD family. This domain is found in bacterial cell surface proteins such as SusD and SusD-like proteins, as well as RagB, an outer membrane surface receptor antigen. Bacteroidetes, one of the two dominant bacterial phyla in the human gut, are Gram-negative saccharolytic microorganisms that utilize a diverse array of glycans. Hence, they express a starch-utilization system (Sus) for glycan uptake. SusD has 551 amino acids, and is almost entirely alpha-helical, with 22 alpha-helices, eight of which form 4 tetratricopeptide repeats (TPRs: helix-turn-helix motifs involved in protein-protein interactions). The four TPRs pack together to create a right-handed super-helix. This is predicted to mediate the formation of the SusD and SusC porin complex at the cell surface. The interaction between SusC and the TPR1/TPR2 region of SusD is predicted to be of functional importance since it allows SusD to be in position for oligosaccharide capture from other Sus lipoproteins and delivery of these glycans to the SusC porin. The non-TPR containing portion of SusD is where starch binding occurs. The binding site is a shallow surface cavity located on top of TPR1. SusD homologs such as SusD-like proteins have a critical role in carbohydrate acquisition. Both SusD and its homologs contain about 15-20 residues at the N-terminus that might be a flexible linker region, anchoring the protein to the membrane and the glycan-binding domain. Other homologs to SusD have been examined in Porphyromonas gingivalis such as RagB, an immunodominant outer-membrane surface receptor antigen. Structural characterization of RagB shows substantial similarity with Bacteroides thetaiotaomicron SusD (i.e. alpha-helices and TPR regions). Based on this structural similarity, functional studies suggest that RagB binding of glycans occurs at pockets on the molecular surface that are distinct from those of SusD. 292
50885 400370 pfam07981 Plasmod_MYXSPDY Plasmodium repeat_MYXSPDY. This repeat is found in two hypothetical Plasmodium proteins. 17
50886 285256 pfam07982 Herpes_UL74 Herpes UL74 glycoproteins. Members of this family are viral glycoproteins that form part of an envelope complex. 418
50887 400371 pfam07983 X8 X8 domain. The X8 domain contains at least 6 conserved cysteine residues that presumably form three disulphide bridges. The domain is found in an Olive pollen allergen as well as at the C-terminus of several families of glycosyl hydrolases. This domain may be involved in carbohydrate binding. This domain is characteristic of GPI-anchored domains. 76
50888 400372 pfam07984 NTP_transf_7 Nucleotidyltransferase. This family contains many hypothetical proteins. It also includes four nematode prion-like proteins. This domain has been identified as part of the nucleotidyltransferase superfamily. 319
50889 400373 pfam07985 SRR1 SRR1. SRR1 proteins are signalling proteins involved in regulating the circadian clock in Arabidopsis. 54
50890 400374 pfam07986 TBCC Tubulin binding cofactor C. Members of this family are involved in the folding pathway of tubulins and form a beta helix structure. 119
50891 400375 pfam07987 DUF1775 Domain of unknown function (DUF1775). Domain found in bacteria with undetermined function. Its structure has been determined and is an immunoglobulin-like fold. 145
50892 400376 pfam07988 LMSTEN LMSTEN motif. This region of Myb proteins has previously been described as the transcriptional activation domain present in vertebrate c-Myb and A-Myb, but in neither vertebrate B-Myb proteins nor the Myb proteins of invertebrates. Because vertebrate B-Myb (but neither A-Myb nor c-Myb) can partially complement Drosophila Myb null mutants, this region appears to have been a relatively recent insertion. 45
50893 400377 pfam07989 Cnn_1N Centrosomin N-terminal motif 1. This domain has been identified in two microtubule associated proteins in Schizosaccharomyces pombe, Mto1 and Pcp1. Mto1 has been identified in association with spindle pole body and non-spindle pole body microtubules. The pericentrin homolog Pcp1 is also associated with the fungal centrosome or spindle pole body (SPB). Members of this family have been named centrosomins, and are an essential mitotic centrosome component required for assembly of all other known pericentriolar matrix proteins in order to achieve microtubule-organising activity in fission yeast. Cnn_1N is a short conserved motif towards the N-terminus. Motif 1 is found to be necessary for proper recruitment of gamma-tubulin, D-TACC (the homolog of vertebrate transforming acidic coiled-coil proteins [TACC]), and Minispindles (Msps) to embryonic centrosomes but is not required for assembly of other centrosome components including Aurora A kinase and CP60 in Drosophila. 69
50894 400378 pfam07990 NABP Nucleic acid binding protein NABP. Many members of this family are putative nucleic acid binding proteins. One member of this family has been partially characterized and contains two putative phosphorylation sites and a possible dimerization / leucine zipper domain. 387
50895 285265 pfam07991 IlvN Acetohydroxy acid isomeroreductase, NADPH-binding domain. Acetohydroxy acid isomeroreductase catalyzes the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine. This N-terminal region of the enzyme carries the binding-site for NADPH. The active-site for enzymatic activity lies in the C-terminal part, IlvC, pfam01450. 165
50896 400379 pfam07992 Pyr_redox_2 Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. 301
50897 400380 pfam07993 NAD_binding_4 Male sterility protein. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included. 257
50898 400381 pfam07994 NAD_binding_5 Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction. 435
50899 400382 pfam07995 GSDH Glucose / Sorbosone dehydrogenase. Members of this family are glucose/sorbosone dehydrogenases that possess a beta-propeller fold. 326
50900 400383 pfam07996 T4SS Type IV secretion system proteins. Members of this family are components of the type IV secretion system. They mediate intracellular transfer of macromolecules via a mechanism ancestrally related to that of bacterial conjugation machineries. 191
50901 400384 pfam07997 DUF1694 Protein of unknown function (DUF1694). This family contains many hypothetical proteins. 114
50902 191923 pfam07998 Peptidase_M54 Peptidase family M54. This is a family of metallopeptidases. Two human proteins have been reported to degrade synthetic substrates and peptides. 176
50903 191924 pfam07999 RHSP Retrotransposon hot spot protein. Members of this family are retrotransposon hot spot proteins. They are associated with polymorphic subtelomeric regions in Trypanosoma. These proteins contain a P-loop motif. 439
50904 400385 pfam08000 bPH_1 Bacterial PH domain. This family contains many bacterial hypothetical proteins. The structures 3hsa and 3dcx show similarities to the PH or pleckstrin homology domain. First evidence of PH-like domains in bacteria suggests a role in cell envelope stress response. 122
50905 285273 pfam08001 CMV_US CMV US. This is a family of unique short (US) cytoplasmic glycoproteins which are expressed in cytomegalovirus. 245
50906 400386 pfam08002 DUF1697 Protein of unknown function (DUF1697). This family contains many hypothetical bacterial proteins. 131
50907 400387 pfam08003 Methyltransf_9 Protein of unknown function (DUF1698). This family contains many hypothetical proteins. It also includes two putative methyltransferase proteins. 315
50908 369645 pfam08004 DUF1699 Protein of unknown function (DUF1699). This family contains many archaeal proteins which have very conserved sequences. 130
50909 400388 pfam08005 PHR PHR domain. This domain is called PHR as it was originally found in the proteins PAM, highwire, and RPM. This domain can be duplicated in the highwire, PAM and RPM sequences. The C-terminal region of the protein BTBD1 includes the PHR domain and is known to interact with Topoisomerase I, an enzyme which relaxes DNA supercoils. 150
50910 285278 pfam08006 DUF1700 Protein of unknown function (DUF1700). This family contains many hypothetical bacterial proteins and putative membrane proteins. 181
50911 400389 pfam08007 Cupin_4 Cupin superfamily protein. This family contains many hypothetical proteins that belong to the cupin superfamily. 319
50912 285280 pfam08008 Viral_cys_rich Viral cysteine rich. Members of this family are polydnavirus proteins that contain a cysteine rich motif. Some members of this family have multiple copies of this domain. 83
50913 400390 pfam08009 CDP-OH_P_tran_2 CDP-alcohol phosphatidyltransferase 2. This domain is found on CDP-alcohol phosphatidyltransferases. These enzymes catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond. 37
50914 285282 pfam08010 Phage_30_3 Bacteriophage protein GP30.3. Proteins in this family are bacteriophage GP30.3 proteins. Their function is poorly characterized. 138
50915 400391 pfam08011 PDDEXK_9 PD-(D/E)XK nuclease superfamily. This family contains many hypothetical bacterial proteins. It has been identified as a member of the PD-(D/E)XK nuclease superfamily through transitive meta profile searches. DUF1703 has the predicted secondary structure pattern of the restriction endonuclease-like fold core and contains an additional beta-strand at the C-terminus. 104
50916 400392 pfam08012 DUF1702 Protein of unknown function (DUF1702). This family of proteins contains many bacterial proteins that are encoded by the UnbL gene. The function of these proteins is unknown. 319
50917 400393 pfam08013 Tagatose_6_P_K Tagatose 6 phosphate kinase. Proteins in this family are tagatose 6 phosphate kinases. 420
50918 400394 pfam08014 DUF1704 Domain of unknown function (DUF1704). This family contains many hypothetical proteins. 365
50919 369652 pfam08015 Pheromone Fungal mating-type pheromone. This family corresponds to mating-type pheromone proteins. The homobasidiomycetes, or mushroom fungi, have arguably the most complex mating system of all known organisms. Many species possess a mating system known as bifactorial incompatibility, where two unlinked loci control the mating-type of an individual incompatibility loci (the A and B mating-type loci). Each A mating-type sublocus encodes a pair of divergently transcribed homeodomain transcription factors while the genes responsible for B mating-type activity encode lipopeptide pheromones and G-protein-coupled pheromone receptors. 67
50920 400395 pfam08016 PKD_channel Polycystin cation channel. This family contains the cation channel region of PKD1 and PKD2 proteins. 424
50921 311808 pfam08017 Fibrinogen_BP Fibrinogen binding protein. Proteins in this family bind to fibrinogen. Members of this family include the fibrinogen receptor, FbsA, which mediates platelet aggregation. 393
50922 285289 pfam08018 Antimicrobial_1 Frog antimicrobial peptide. This family includes antimicrobial peptides secreted from skins of frogs. The secretion of antimicrobial peptides from the skins of frogs plays an important role in the self defense of these frogs. Structural characterization of these peptides showed that they belonged to four known families: the brevinin-1 family, the esculentin-2 family, the ranatuerin-2 family and the temporin family. 24
50923 400396 pfam08019 DUF1705 Domain of unknown function (DUF1705). Some members of this family are putative bacterial membrane proteins. This domain is found immediately N terminal to the sulfatase domain in many sulfatases. 149
50924 400397 pfam08020 DUF1706 Protein of unknown function (DUF1706). This family contains many hypothetical proteins from bacteria and yeast. 161
50925 311811 pfam08021 FAD_binding_9 Siderophore-interacting FAD-binding domain. 118
50926 285293 pfam08022 FAD_binding_8 FAD-binding domain. 108
50927 400398 pfam08023 Antimicrobial_2 Frog antimicrobial peptide. This family consists of the major classes of antimicrobial peptides secreted from the skin of frogs that protect the frogs against invading microbes. They are typically between 10-50 amino acids long and are derived from proteolytic cleavage of larger precursors. Major classes of peptides such as esculentin, gaegurin, brevinin, rugosin and ranatuerin are included in this family. 31
50928 116634 pfam08024 Antimicrobial_4 Ant antimicrobial peptide. This family consists of the ponericin family of antimicrobial peptides isolated from predatory ant Pachycondyla goeldii. The ponericin peptides may adopt amphipathic alpha-helical structure in polar environments. In the ant colony, these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion. 24
50929 116635 pfam08025 Antimicrobial_3 Spider antimicrobial peptide. This family includes antimicrobial peptides isolated from the crude venom of the wolf spider Oxyopes kitabensis. These peptides, known as oxyopinins, are the largest linear cationic amphipathic peptides chemically characterized and exhibit disrupting activities towards biological membranes. 37
50930 369654 pfam08026 Antimicrobial_5 Bee antimicrobial peptide. This family consists of antimicrobial peptides produced by bees. These peptides have strong antimicrobial and some anti-fungal activity and have homology to abaecin which is the largest proline-rich antimicrobial peptide isolated from European bumblebee Bombus pascuorum. 31
50931 369655 pfam08027 Albumin_I Albumin I chain b. The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane bound 43 kDa receptor. The structure of this domain (chain b) reveals a knottin like fold, comprising three beta strands. 35
50932 400399 pfam08028 Acyl-CoA_dh_2 Acyl-CoA dehydrogenase, C-terminal domain. 133
50933 400400 pfam08029 HisG_C HisG, C-terminal domain. 73
50934 369657 pfam08030 NAD_binding_6 Ferric reductase NAD binding domain. 151
50935 369658 pfam08031 BBE Berberine and berberine like. This domain is found in the berberine bridge and berberine bridge-like enzymes which are involved in the biosynthesis of numerous isoquinoline alkaloids. They catalyze the transformation of the N-methyl group of (S)-reticuline into the C-8 berberine bridge carbon of (S)-scoulerine. 45
50936 400401 pfam08032 SpoU_sub_bind RNA 2'-O ribose methyltransferase substrate binding. This domain is a RNA 2'-O ribose methyltransferase substrate binding domain. 74
50937 400402 pfam08033 Sec23_BS Sec23/Sec24 beta-sandwich domain. 86
50938 400403 pfam08034 TES Trematode eggshell synthesis protein. This domain has been identified in a number of distantly related species of trematodes. This protein domain is crucial for eggshell synthesis in trematodes (Ebersberger I). 66
50939 400404 pfam08035 Op_neuropeptide Opioids neuropeptide. This family corresponds to the conserved YGG motif that is found in a wide variety of opioid neuropeptides such as enkephalin. 30
50940 285304 pfam08036 Antimicrobial_6 Diapausin family of antimicrobial peptide. This family consists of diapausin-related antimicrobial peptides. Diapause during periods of environmental adversity is an essential part of the life cycle of many organisms with the molecular basis being different among animals. Diapause-specific peptides provide anti-fungal activity and act as N-type voltage-gated calcium channel blocker. 39
50941 311820 pfam08037 Attractin Attractin family. This family consists of the attractin family of water-borne pheromone. Mate attraction in Aplysia involves a long-distance water-borne signal in the form of the attractin peptide, that is released during egg laying. These peptides contain 6 conserved cysteines and are folded into 2 antiparallel helices. The second helix contains the IEECKTS sequence conserved in Aplysia attractins. 55
50942 400405 pfam08038 Tom7 TOM7 family. This family consists of TOM7 family of mitochondrial import receptors. TOM7 forms part of the translocase of the outer mitochondrial membrane (TOM) complex and it appears to function as a modulator of the dynamics of the mitochondrial protein transport machinery by promoting the dissociation of subunits of the outer membrane translocase. 41
50943 400406 pfam08039 Mit_proteolip Mitochondrial proteolipid. This family consists of proteins with similarity to the mitochondrial proteolipids. Mitochondrial proteolipid consists of about 60 amino acids residues and is about 6.8 kDa in size. 60
50944 400407 pfam08040 NADH_oxidored MNLL subunit. This family consists of the MNLL subunits of NADH-ubiquinone oxidoreductase complex. NADH-ubiquinone oxidoreductase is involved in the transfer of electrons from NADH to the electron transport chain. This oxidation of NADH is coupled to proton transfer across the membrane, generating a proton motive force that is utilized for the synthesis of ATP. MNLL subunit is one of the many subunits found in the complex and it contains a mitochondrial import sequence. However, the role of MNLL subunit is unclear. 58
50945 400408 pfam08041 PetM PetM family of cytochrome b6f complex subunit 7. This family consists of the PetM family of cytochrome b6f complex subunit IV. The cytochrome b6f complex consists of 7 subunits and contains 2 beta hemes and 1 chlorophyll alpha per cytochrome f. It is highly active in transferring electrons from decylplastoquinol to oxidized plastocyanin. 29
50946 369666 pfam08042 PqqA PqqA family. This family consists of proteins belonging to the coenzyme Pyrroloquinoline quinone A (pqqA) family. PQQ is the non-covalently bound prosthetic group of many quinoproteins catalyzing reactions in the periplasm of Gram-negative bacteria. PQQ is formed by the fusion of glutamate and tyrosine and synthesis of PQQ requires the proteins encoded by the pqqABCDEF operon but details of the biosynthetic pathway are unclear. 19
50947 400409 pfam08043 Xin Xin repeat. The repeat has the consensus sequence GDV(K/Q/R)(T/S/G)X(R/K/T) WLFETXPLD. This repeat motif is typically found in the N-terminus of the proteins, with a copy number between 2 and 28 repeats. Direct evidence for binding to and stabilizing F-actin has been found in the human protein XIRP1. The homologs in mouse and chicken localize in the adherens junction complex of the intercalated disc in cardiac muscle and in the myotendon junction of skeletal muscle. mXin may co-localize with Vinculin which is known to attach the actin to the cytoplasmic membrane. It has been shown that the amino-terminus of human xin (CMYA1) binds the EVH1 domain of Mena/VASP/EVL, and the carboxy-terminus binds the, for the filamin family unique, domain 20 of filaminC. This confirms the proposed role of xin repeat containing proteins as F-actin-binding adapter proteins. 16
50948 400410 pfam08044 DUF1707 Domain of unknown function (DUF1707). This domain is found in a variety of Actinomycetales proteins. All of the proteins containing this domain are hypothetical and probably membrane bound or associated. Currently, the function of this domain is unclear. 52
50949 400411 pfam08045 CDC14 Cell division control protein 14, SIN component. Cdc14 is a component of the septation initiation network (SIN) and is required for the localization and activity of Sid1. Sid1 is a protein kinase that localizes asymmetrically to one spindle pole body (SPB) in anaphase and disappears prior to cell separation. 283
50950 285313 pfam08046 IlvGEDA_leader IlvGEDA operon leader peptide. This family consists of the leader peptides of ilvGEDA operon. The expression of the ilvGEDA operon of E. coli K-12 is multivalently controlled by the three branched-chain amino acids. Regulation is thought to occur by attenuation of transcription in response to the changing levels of the cognate tRNAs. Transcription of this operon is usually terminated at the end of the leader (regulatory) region. 32
50951 285314 pfam08047 His_leader Histidine operon leader peptide. This family consists of the leader peptide of the histidine (his) operon. The his operon contains all the genes necessary for histidine biosynthesis. The region corresponding to the untranslated 5' end of the transcript, named the his leader region, displays the typical features of the T box transcriptional attenuation mechanism which is involved in the regulation of many amino acid biosynthetic operons. 16
50952 116658 pfam08048 RepA1_leader Tap RepA1 leader peptide. This family consists of the RepA1 leader peptides. The frequency of replication of IncFII plasmid NR1 during the cell division cycle is regulated by the control of the synthesis of the plasmid-specific replication initiation protein (RepA1). When RepA1 is synthesized, it binds to the plasmid replication origin (ori) and effects the assembly of a replication complex composed of host proteins that mediate the replication of the plasmid. The tap gene encodes a 24-amino acids protein. The translation of tap is required for translation of repA. 25
50953 369669 pfam08049 IlvB_leader IlvB leader peptide. This family consists of the leader peptides of the ilvB operon. This region encodes a potential leader polypeptide containing 32 amino acids, 12 of which are the regulatory amino acids valine and leucine. A model for the multivalent regulation of this operon by valyl- and leucyl-tRNA is proposed on the basis of the mutually exclusive formation of five strong stem-and-loop structures in the leader mRNA. 32
50954 254601 pfam08050 Tet_res_leader Tetracycline resistance leader peptide. This family consists of the tetracycline resistance leader peptide. The presence of 3 inverted repeats which can form 2 different conformations of mRNA suggests that the tetracycline resistance (TcR) region is regulated by a translational attenuation mechanism. A Rho-independent transcriptional terminator structure is present immediately after the translational stop codon of the TET protein. 20
50955 369670 pfam08051 Ery_res_leader1 Erythromycin resistance leader peptide. This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the translational attenuation of erythromycin resistance genes. Interestingly, the consensus sequence of peptides conferring erythromycin resistance is similar to that of the leader peptides, thus indicating that a similar type of interaction between the nascent peptide and antibiotics can occur in both cases. This family also includes a small number of regions from within larger proteins from actinomycetes. 15
50956 285317 pfam08052 PyrBI_leader PyrBI operon leader peptide. This family consists of the pyrBI operon leader peptides. The expression of the pyrBI operon, which encodes the subunits of the pyrimidine biosynthetic enzyme aspartate transcarbamylase, is regulated primarily through a UTP-sensitive transcriptional attenuation control mechanism. In this mechanism, the concentration of UTP determines the extent of coupling between transcription and translation within the pyrBI leader region, hence determining the level of rho-independent transcriptional termination at an attenuator preceding the pyrB gene. 44
50957 369671 pfam08053 Tna_leader Tryptophanase operon leader peptide. This family consists of the tryptophanase (tna) operon leader peptide. Tna catalyzes the degradation of L-tryptophan to indole, pyruvate and ammonia, enabling the bacteria to utilize tryptophan as a source of carbon, nitrogen and energy. The tna operon of E. coli contains two major structural genes, tnaA and tnaB. Preceding tnaA in the tna operon is a 319-nucleotide transcribed regulatory region that contains the coding region for a 24-residue leader peptide, TnaC. The RNA sequence in the vicinity of the tnaC stop codon is rich in Cytidylate residues which is required for efficient Rho-dependent termination in the leader region of the tna operon. 23
50958 285319 pfam08054 Leu_leader Leucine operon leader peptide. This family consists of the leucine operon leader peptide. The leucine operon is involved in the control of the biosynthesis of leucine. Four adjacent leucine codons within the leucine leader RNA are critically important in transcription attenuation-mediated control of leucine operon expression in bacteria. The leader RNA contains translational start and stop signals, a cluster of four leucine codons and overlapping regions of dyad symmetry that are capable of forming stem-and-loop structures. 28
50959 116663 pfam08055 Trp_leader1 Tryptophan leader peptide. This family consists of the tryptophan (trp) leader peptides. Tryptophan accumulation is the principal event resulting in downregulation of transcription of the structural genes of the trp operon. The leader peptide of the trp operon forms mutually exclusive secondary structures that would either result in the termination of transcription of the trp operon when tryptophan is in plentiful supply or vice versa. 18
50960 285320 pfam08056 Trp_leader2 Tryptophan operon leader peptide. This family consists of the tryptophan operon leader peptides. The tryptophan operon is regulated by transcription attenuation in response to changes in the level of tryptophan. The transcript of the leader peptide can adopt alternative mutually-exclusive secondary structures that would either result in termination of transcription of the tryptophan structural genes or in transcription of the entire operon. 41
50961 71493 pfam08057 Ery_res_leader2 Erythromycin resistance leader peptide. This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the transcriptional attenuation control of the synthesis of the macrolide-lincosamide-streptogramin B resistance protein. It acts as a transcriptional attenuator, in contrast to other inducible erm genes. The mRNA leader sequence can fold in either of two mutually exclusive conformations, one of which is postulated to form in the absence of induction, and to contain two rho factor-independent terminators. 14
50962 400412 pfam08058 NPCC Nuclear pore complex component. Proteins containing this domain are components of the nuclear pore complex. One member of this family is Nucleoporin POM34, which is thought to have a role in anchoring peripheral Nups into the pore and mediating pore formation. 134
50963 400413 pfam08059 SEP SEP domain. The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain. 75
50964 400414 pfam08061 P68HR P68HR (NUC004) repeat. This short region is found in two copies in p68-like RNA helicases. 32
50965 285324 pfam08062 P120R P120R (NUC006) repeat. This characteristic repeat of proliferating cell nuclear antigen P120 is found in three copies. 22
50966 400415 pfam08063 PADR1 PADR1 (NUC008) domain. This domain is found in poly(ADP-ribose)-synthetases. The function of this domain is unknown. 53
50967 400416 pfam08064 UME UME (NUC010) domain. This domain is characteristic of UVSB PI-3 kinase, MEI-41 and ESR1. 102
50968 400417 pfam08065 K167R K167R (NUC007) repeat. This family represents the K167/Chmadrin repeat. The function of this repeat is unknown. 112
50969 400418 pfam08066 PMC2NT PMC2NT (NUC016) domain. This domain is found at the N-terminus of 3'-5' exonucleases with HRDC domains, and also in putative exosome components. 87
50970 400419 pfam08067 ROKNT ROKNT (NUC014) domain. This presumed domain is found at the N-terminus of RNP K-like proteins that also contain KH domains pfam00013. 42
50971 400420 pfam08068 DKCLD DKCLD (NUC011) domain. This is a TruB_N/PUA domain associated N-terminal domain of Dyskerin-like proteins. 58
50972 400421 pfam08069 Ribosomal_S13_N Ribosomal S13/S15 N-terminal domain. This domain is found at the N-terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021. 57
50973 400422 pfam08070 DTHCT DTHCT (NUC029) region. The DTHCT region is the C-terminal part of DNA gyrases B / topoisomerase IV / HATPase proteins. This region is composed of quite low complexity sequence. 96
50974 400423 pfam08071 RS4NT RS4NT (NUC023) domain. This is the N-terminal domain of Ribosomal S4 / S4e proteins. This domain is associated with S4 and KOW domains. 37
50975 400424 pfam08072 BDHCT BDHCT (NUC031) domain. This is a C-terminal domain in Bloom's syndrome DEAD helicase subfamily. 40
50976 400425 pfam08073 CHDNT CHDNT (NUC034) domain. The CHDNT domain is found in PHD/RING finger and chromo domain-associated helicases. 54
50977 400426 pfam08074 CHDCT2 CHDCT2 (NUC038) domain. The CHDCT2 C-terminal domain is found in PHD/RING finger and chromo domain-associated CHD-like helicases. 126
50978 400427 pfam08075 NOPS NOPS (NUC059) domain. This domain is found at the C-terminus of NONA and PSP1 proteins adjacent to 1 or 2 pfam00076 domains. 52
50979 285338 pfam08076 TetM_leader Tetracycline resistance determinant leader peptide. This family consists of the tetracycline resistance determinant tet(M) leader peptides. A short open reading frame corresponding to a 28 amino acid peptide which contains a number of inverted repeat sequences was found immediately upstream of the tet(M). Transcriptional analyses have found that expression of tet(M) resulted from an extension of a small transcript representing the upstream leader region into the resistance determinant. Thus this leader sequence is responsible for transcriptional attenuation and thus regulation of the transcription of tet(M). 28
50980 71513 pfam08077 Cm_res_leader Chloramphenicol resistance gene leader peptide. This family consists of chloramphenicol (Cm) resistance gene leader peptides. Inducible resistance to Cm in both Gram positive and Gram negative bacteria is controlled by translation attenuation. In translation attenuation, the ribosome-binding-site (RBS) for the resistance determinant is sequestered in a secondary structure domain within the mRNA. Preceding the secondary structure is a short, translated ORF termed the leader. Ribosome stalling in the leader causes the destabilization of the downstream secondary structure, allowing initiation of translation of the Cm resistance gene. 17
50981 369686 pfam08078 PsaX PsaX family. This family consists of the PsaX family of photosystem I (PSI) protein subunits. PSI is a large multi-subunit pigment protein complex embedded in the thylakoid membranes of green plants and cyanobacteria. PsaX is one of the 12 protein subunits found in PSI and these subunits are arranged as monomers or trimers within the membrane as shown by the structure of the trimeric complex from Synechococcus elongatus. 35
50982 400428 pfam08079 Ribosomal_L30_N Ribosomal L30 N-terminal domain. This presumed domain is found at the N-terminus of Ribosomal L30 proteins and has been termed RL30NT or NUC018. 66
50983 369688 pfam08080 zf-RNPHF RNPHF zinc finger. This domain is a putative zinc-binding domain (CHHC motif) in RNP H and F. The domain is often associated with pfam00076. 36
50984 400429 pfam08081 RBM1CTR RBM1CTR (NUC064) family. This C-terminal region is found in RBM1-like RNA binding hnRNPs. 45
50985 400430 pfam08082 PRO8NT PRO8NT (NUC069), PrP8 N-terminal domain. The PRO8NT domain is found at the N-terminus of pre-mRNA splicing factors of PRO8 family. The NLS or nuclear localization signal for these spliceosome proteins begins at the start and runs for 60 residues. N-terminal to this domain is a highly variable proline-rich region. 152
50986 400431 pfam08083 PROCN PROCN (NUC071) domain. The PROCN domain is the central domain in pre-mRNA splicing factors of PRO8 family. 402
50987 400432 pfam08084 PROCT PROCT (NUC072) domain. The PROCT domain is the C-terminal domain in pre-mRNA splicing factors of PRO8 family. 111
50988 400433 pfam08085 Entericidin Entericidin EcnA/B family. This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in bacterial population by means of post-segregational killing. 20
50989 116691 pfam08086 Toxin_17 Ergtoxin family. This family consists of ergtoxin peptides which are toxins secreted by the scorpions. The ergtoxins are capable of blocking the function of K+ channels. More than 100 ergtoxins have been found from scorpion venoms and they have been classified into three subfamilies according to their primary structures. 41
50990 191941 pfam08087 Toxin_18 Conotoxin O-superfamily. This family consists of members of the conotoxin O-superfamily. The O-superfamily of conotoxins consists of 3 groups of Conus peptides that belong to the same structural group. These 3 groups differ in their pharmacological properties: the w-conotoxins which inhibit calcium channels, the delta-conotoxins which slow down the inactivation rate of voltage-sensitive sodium channels and the muO-conotoxins which block the voltage-sensitive sodium currents. 31
50991 400434 pfam08088 Toxin_19 Conotoxin I-superfamily. This family consists of the I-superfamily of conotoxins. This is a new class of peptides in the venom of some Conus species. These toxins are characterized by four disulfide bridges and inhibit or modify ion channels of nerve cells. The I-superfamily conotoxins are found in five or six major clades of cone snails and could possibly be found in many more species. 40
50992 400435 pfam08089 Toxin_20 Huwentoxin-II family. This family consists of the huwentoxin-II (HWTX-II) family of toxins secreted by spiders. These toxins are found in venom secreted from the bird spider Selenocosmia huwena Wang. The HWTX-II adopts a novel scaffold different from the ICK motif that is found in other huwentoxins. HWTX-II consists of 37 amino acid residues including six cysteines involved in three disulfide bridges. 39
50993 116695 pfam08090 Enterotoxin_HS1 Heat stable E.coli enterotoxin 1. Heat-stable toxin 1 of entero-aggregative E. coli (EAST1) is a small toxin. It is not, however, solely associated with entero-aggregative E. coli but also with many other diarrhoeic E. coli families. Some studies have established the role of EAST1 in some human outbreaks of diarrhoea. Isolates from farm animals have been shown to carry the astA gene coding for EAST1. However, the relation between the presence of EAST1 and disease is not conclusive. 36
50994 400436 pfam08091 Toxin_21 Spider insecticidal peptide. This family consists of insecticidal peptides isolated from venom of spiders of Aptostichus schlingeri and Calisoga sp. Nine insecticidal peptides were isolated from the venom of the Aptostichus schlingeri spider and seven of these toxins cause flaccid paralysis to insect larvae within 10 min of injection. However, all nine peptides were lethal within 24 hours. The structure of Aps III was solved and shown to be an atypical knottin peptide with four disulphide bridges. 40
50995 149265 pfam08092 Toxin_22 Magi peptide toxin family. This family consists of Magi peptide toxins (Magi 1, 2 and 5) isolated from the venom of Hexathelidae spider. These insecticidal peptide toxins bind to sodium channels and induce flaccid paralysis when injected into lepidopteran larvae. However, these peptides are not toxic to mice when injected intracranially at 20 pmol/g. 38
50996 116698 pfam08093 Toxin_23 Magi 5 toxic peptide family. This family consists of toxic peptides (Magi 5) found in the venom of the Hexathelidae spider. Magi 5 is the first spider toxin with binding affinity to site 4 of a mammalian sodium channel and the toxin has an insecticidal effect on larvae, causing paralysis when injected into the larvae. 30
50997 311850 pfam08094 Toxin_24 Conotoxin TVIIA/GS family. This family consists of conotoxins isolated from the venom of cone snail Conus tulipa and Conus geographus. Conotoxin TVIIA, isolated from Conus tulipa displays little sequence homology with other well-characterized pharmacological classes of peptides, but displays similarity with conotoxin GS, a peptide from Conus geographus. Both these peptides block skeletal muscle sodium channels and also share several biochemical features and represent a distinct subgroup of the four-loop conotoxins. 33
50998 71530 pfam08095 Toxin_25 Hefutoxin family. This family consists of the hefutoxins that are found in the venom of the scorpion Heterometrus fulvipes. These toxins, kappa-hefutoxin1 and kappa-hefutoxin2, exhibit no homology to any known toxins. The hefutoxins are potassium channel toxins. 22
50999 71531 pfam08096 Bombolitin Bombolitin family. This family consists of the bombolitin peptides that are found in the venom of the bumblebee Megabombus pennsylvanicus. Bombolitins are structurally and functionally very similar. They lyse erythrocytes and liposomes, release histamine from rat peritoneal mast cells, and stimulate phospholipase A2 from different sources. 17
51000 71532 pfam08097 Toxin_26 Conotoxin T-superfamily. This family consists of the T-superfamily of conotoxins. Eight different T-superfamily peptides from five Conus species were identified. These peptides share a consensus signal sequence, and a conserved arrangement of cysteine residues. T-superfamily peptides were found expressed in venom ducts of all major feeding types of Conus, suggesting that the T-superfamily is a large and diverse group of peptides, widely distributed in the 500 different Conus species. 11
51001 400437 pfam08098 ATX_III Anemonia sulcata toxin III family. This family consists of the Anemonia sulcata toxin III (ATX III) neurotoxin family. ATX III is a neurotoxin that is produced by sea anemone; it adopts a compact structure containing four reverse turns and two other chain reversals, but no regular alpha-helix or beta-sheet. A hydrophobic patch found on the surface of the peptide may constitute part of the sodium channel binding surface. 23
51002 400438 pfam08099 Toxin_27 Scorpion calcine family. This family consists of the calcine family of scorpion toxins. The calcine family consists of Maurocalcine and Imperatoxin. These toxins have been shown to be potent effectors of ryanodine-sensitive calcium channels from skeletal muscles. These toxins are thus useful for dihydropyridine receptor/ryanodine receptor interaction studies. 33
51003 400439 pfam08100 dimerization dimerization domain. This domain is found at the N-terminus of a variety of plant O-methyltransferases. It has been shown to mediate dimerization of these proteins. 50
51004 400440 pfam08101 DUF1708 Domain of unknown function (DUF1708). This is a yeast domain of unknown function. 423
51005 116704 pfam08102 Antimicrobial_7 Scorpion antimicrobial peptide. This family consists of antimicrobial peptides secreted by scorpions. Novel antimicrobial peptides have been isolated from scorpions, namely the opistoporin and the pandinin. These peptides form essentially helical structures and demonstrate high antimicrobial activity against Gram-negative and Gram-positive bacteria respectively. 43
51006 116705 pfam08103 Antimicrobial_8 Uperin family. This family consists of the uperin family of antimicrobial peptides. Uperin is a wide-spectrum antibiotic peptide isolated from the Australian toadlet, Uperoleia mjobergii. Being only 17 amino acid residues long, it is smaller than most other wide-spectrum antibiotic peptides isolated from amphibians. Uperin adopts a well-defined amphipathic alpha-helix with distinct hydrophilic and hydrophobic faces. 17
51007 71539 pfam08104 Antimicrobial_9 Ponericin L family. This family consists of the ponericin L family of antimicrobial peptides that are isolated from the venom of the predatory ant Pachycondyla goeldii. Ponericin L family shares similarities with dermaseptins. Ponericin L may adopt an amphipathic alpha-helical structure in polar environments and these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion. 24
51008 311854 pfam08105 Antimicrobial10 Metchnikowin family. This family consists of the metchnikowin family of antimicrobial peptides from Drosophila. Metchnikowin is a proline-rich peptide whose expression is immune-inducible. Induction of the metchnikowin gene expression can be mediated either by the TOLL pathway or by the imd gene product. The metchnikowin peptide is unique among the Drosophila antimicrobial peptides in that it is active against both bacteria and fungi. 50
51009 71541 pfam08106 Antimicrobial11 Formaecin family. This family consists of the formaecin family of antimicrobial peptides isolated from the bulldog ant Myrmecia gulosa in response to bacterial infection. Formaecins are inducible peptide antibiotics and are active against growing Escherichia coli but were inactive against other Gram-negative and Gram-positive bacteria. Formaecin peptides are 16 amino acids long, are rich in proline and have N-acetylgalactosamine O-linked to a conserved threonine. 16
51010 116706 pfam08107 Antimicrobial12 Pleurocidin family. This family consists of the pleurocidin family of antimicrobial peptides. Pleurocidins are found in the skin mucous secretions of the winter flounder (Pleuronectes americanus) and these peptides exhibit antimicrobial activity against Escherichia coli. Pleurocidin is predicted to assume an amphipathic alpha-helical conformation similar to other linear antimicrobial peptides and may play a role in innate host defense. 42
51011 71543 pfam08108 Antimicrobial13 Halocidin family. This family consists of the halocidin family of antimicrobial peptides. Halocidins are isolated from the haemocytes of the tunicate, Halocynthia aurantium. They are dimeric structures formed via a disulfide linkage between cysteines of two different-sized monomers. Halocidins have been shown to have strong antimicrobial activities against a wide variety of pathogenic bacteria and could be ideal candidates as peptide antibiotics against multidrug-resistant bacteria. 15
51012 71544 pfam08109 Antimicrobial14 Lactocin 705 family. This family consists of lactocin 705 which is a bacteriocin produced by Lactobacillus casei CRL 705. Lactocin 705 is a class IIb bacteriocin, whose activity depends upon the complementation of two peptides (705-alpha and 705-beta) of 33 amino acid residues each. Lactocin 705 is active against several Gram-positive bacteria, including food-borne pathogens and is a good candidate to be used for biopreservation of fermented meats. 31
51013 149268 pfam08110 Antimicrobial15 Ocellatin family. This family consists of the ocellatin family of antimicrobial peptides. Ocellatins are produced from the electrical-stimulated skin secretions of the South American frog, Leptodactylus ocellatus. The family consists of three structurally related peptides, ocellatin 1, ocellatin 2 and ocellatin 3. These peptides present hemolytic activity against human erythrocytes and are also active against Escherichia coli. 19
51014 71546 pfam08111 Pea-VEAacid Pea-VEAacid family. This family consists of the PEA-VEAacid neuropeptides family. These neuropeptides are isolated from the abdominal perisympathetic organs of the American cockroach. These peptides are found together with Pea-YLS-amide and Pea-SKNacid, giving a unique neuropeptide pattern in abdominal perisympathetic organs. The functions of these neuropeptides are unknown. 15
51015 116708 pfam08112 ATP-synt_E_2 ATP synthase epsilon subunit. This family consists of epsilon subunits of the ATP synthase. The ATP synthase complex is composed of an oligomeric transmembrane sector (CF0), and a catalytic core (CF1). CF1 is composed of 5 subunits, of which the epsilon subunit functions as a potent inhibitor of ATPase activity in both soluble and bound CF1. Only when the epsilon inhibition is disabled is high ATPase activity detected in ATPase 56
51016 285350 pfam08113 CoxIIa Cytochrome c oxidase subunit IIa family. This family consists of the cytochrome c oxidase subunit IIa family. The bax-type cytochrome c oxidase from Thermus thermophilus is known as a two subunit enzyme. From its crystal structure, it was discovered that an additional transmembrane helix 'subunit IIa' spans the membrane. This subunit consists of 34 residues forming one helix across the membrane. The presence of this subunit seems to be important for the function of cytochrome c oxidases. 33
51017 285351 pfam08114 PMP1_2 ATPase proteolipid family. This family consists of small proteolipids associated with the plasma membrane H+ ATPase. Two proteolipids (PMP1 and PMP2) are associated with the ATPase and both genes are similarly expressed in the wild-type strain of yeast, and no modification of the level of transcription of one PMP gene is detected in a strain deleted of the other. Though both proteolipids show similarity with other small proteolipids associated with other cation-transporting ATPases, their functions remain unclear. 43
51018 285352 pfam08115 Toxin_28 SFI toxin family. This family consists of the SFI family of spider toxins. This family of toxins might share structural, evolutionary and functional relationships with other small, highly structurally constrained spider neurotoxins. These toxins are highly selective agonists/antagonists of different voltage-dependent calcium channels and are extremely valuable reagents in the analysis of neuromuscular function. 35
51019 149271 pfam08116 Toxin_29 PhTx neurotoxin family. This family consists of PhTx insecticidal neurotoxins that are found in the venom of the Brazilian spider Phoneutria nigriventer. The venom of Phoneutria nigriventer contains numerous neurotoxic polypeptides of 30-140 amino acids which exert a range of biological effects. While some of these neurotoxins are lethal to mice after intracerebroventricular injections, others are extremely toxic to insects of the orders Diptera and Dictyoptera but have much weaker toxic effects on mice. 31
51020 400441 pfam08117 Toxin_30 Ptu family. This family consists of toxic peptides that are isolated from the saliva of assassin bugs. The saliva contains a complex mixture of proteins that are used by the bug either to immobilise the prey or to digest it. One of the proteins (Ptu1) has been purified and shown to block reversibly the N-type calcium channels and to be less specific for the L- and P/Q- type calcium channels expressed in BHK cells. 36
51021 400442 pfam08118 MDM31_MDM32 Yeast mitochondrial distribution and morphology (MDM) proteins. Proteins in this family are yeast mitochondrial inner membrane proteins MDM31 and MDM32. These proteins are required for the maintenance of mitochondrial morphology, and the stability of mitochondrial DNA. 519
51022 71554 pfam08119 Toxin_31 Scorpion acidic alpha-KTx toxin family. This family consists of acidic alpha-KTx short chain scorpion toxins. These toxins named parabutoxins, block voltage-gated K channels and have extremely low pI values. Furthermore, they lack the crucial pore-plugging lysine. In addition, the second important residue of the dyad, the hydrophobic residue (Phe or Tyr) is also missing. 37
51023 71555 pfam08120 Toxin_32 Tamulustoxin family. This family consists of the tamulustoxins which are found in the venom of the Indian red scorpion (Mesobuthus tamulus). Tamulustoxin shares no similarity with other scorpion venom toxins, although the positions of its six cysteine residues suggest that it shares the same structural scaffold. Tamulustoxin acts as a potassium channel blocker. 35
51024 71556 pfam08121 Toxin_33 Waglerin family. This family consists of the lethal peptides (waglerins) that are found in the venom of Trimeresurus wagleri. Waglerins are 22-24 residue lethal peptides and are competitive antagonist of the muscle nicotinic receptor (nAChR). Waglerin-1 possesses a distinctive selectivity for the alpha-epsilon interface binding site of the mouse nAChR. 22
51025 400443 pfam08122 NDUF_B12 NADH-ubiquinone oxidoreductase B12 subunit family. This family consists of the NADH-ubiquinone oxidoreductase B12 subunit proteins. NADH is the central source of electrons in the mitochondrial and bacterial respiration. NADH-ubiquinone oxidoreductase is involved in the transfer of electrons from NADH to the electron transport chain. This oxidation of NADH is coupled to proton transfer across the membrane, generating a proton motive force that is utilized for the synthesis of ATP. The function of this subunit is unclear. 55
51026 149273 pfam08123 DOT1 Histone methylation protein DOT1. The DOT1 domain regulates gene expression by methylating histone H3. H3 methylation by DOT1 has been shown to be required for the DNA damage checkpoint in yeast. 205
51027 400444 pfam08124 Lyase_8_N Polysaccharide lyase family 8, N terminal alpha-helical domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 323
51028 369700 pfam08125 Mannitol_dh_C Mannitol dehydrogenase C-terminal domain. 246
51029 285357 pfam08126 Propeptide_C25 Propeptide_C25. This is found at the N terminal end of some of the members of the C25 peptidase family (PF01364). Little is known about the function of this motif. 205
51030 400445 pfam08127 Propeptide_C1 Peptidase family C1 propeptide. This motif is found at the N terminal of some members of the Peptidase_C1 family (pfam00112) and is involved in activation of this peptidase. 40
51031 71564 pfam08129 Antimicrobial17 Alpha/beta enterocin family. This family consists of the alpha and beta enterocins and lactococcin G peptides. These peptides have some antimicrobial properties; they inhibit the growth of Enterococcus spp. and a few other gram-positive bacteria. These peptides act as pore-forming toxins that create cell membrane channels through a barrel-stave mechanism and thus produce an ionic imbalance in the cell. This family of antimicrobial peptides belongs to the class II group of bacteriocins. 57
51032 116721 pfam08130 Antimicrobial18 Type A lantibiotic family. This family consists of the type A lantibiotic peptides. Both Pep5 and epicidin-280 are ribosomally-synthesized antimicrobial peptides produced by Gram-positive bacteria that are characterized by the presence of lanthionine and/or methyllanthionine residues. The lantibiotics family has a highly specific activity against multi-drug resistant bacteria and has potential to be utilized in a wide range of medical applications. 60
51033 400446 pfam08131 Defensin_3 Defensin-like peptide family. This family consists of the defensin-like peptides (DLPs) isolated from platypus venom. These DLPs show similar three-dimensional fold to that of beta-defensin-12 and sodium-channel neurotoxin Shl. However the side chains known to be functionally important to beta-defensin-12 and Shl are not conserved in DLPs. This suggests a different biological function. Consistent with this contention, DLPs have been shown to possess no anti-microbial properties and have no observable activity on rat dorsal-root-ganglion sodium-channel currents. 39
51034 369703 pfam08132 AdoMetDC_leader S-adenosyl-l-methionine decarboxylase leader peptide. This family consists of the S-adenosyl-l-methionine decarboxylase (AdoMetDC) leader peptides. AdoMetDC is a key regulatory enzyme in the biosynthesis of polyamines. All expressed plant AdoMetDC mRNA 5' leader sequences contain a highly conserved pair of overlapping upstream ORFs (uORFs) that overlap by one base. Sequences of the small uORFs are highly conserved between monocot, dicot and gymnosperm AdoMetDC mRNA species, suggesting a translational regulatory mechanism. 51
51035 285360 pfam08133 Nuclease_act Anticodon nuclease activator family. This family consists of the anticodon nuclease activator proteins. Pre-existing host tRNAs are reprocessed during bacteriophage T4 infection of certain Escherichia coli strains. In this pathway, tRNA(Lys) is cleaved 5' by the anticodon nuclease to the wobble base and is later restored in polynucleotide kinase and RNA ligase reactions. 26
51036 400447 pfam08134 cIII cIII protein family. This family consists of the cIII family of regulatory proteins. The lambda CIII protein has 54 amino acids and it forms an amphipathic helix within its amino acid sequence. Lambda cIII stabilizes the lambda cII protein and the host sigma factor 32, responsible for transcribing genes of the heat shock regulon. 37
51037 285362 pfam08135 EPV_E5 Major transforming protein E5 family. This family consists of the major transforming proteins (E5) of the bovine papilloma virus (BPV). The equine sarcoid is one of the most common dermatological lesions in equids. It is a benign, locally invasive dermal fibroblastic lesion and studies have shown an association of the lesions with BPV. E5 is a short hydrophobic membrane protein localising to the Golgi apparatus and other intracellular membranes. It binds to and constitutively activates the platelet-derived growth factor-beta in transformed cells. This stimulation activates a receptor signaling cascade which results in an intracellular growth stimulatory signal. 43
51038 311863 pfam08136 Ribosomal_S22 30S ribosomal protein subunit S22 family. This family consists of the 30S ribosomal proteins subunit S22 polypeptides. This polypeptide is 47 amino acids in length and has a molecular weight of about 5 kDa. The S22 subunit is a component of the stationary-phase-specific ribosomal protein and is assembled in the ribosomal particles in the stationary phase. This subunit along with other stationary-phase-specific ribosomal proteins result in compositional changes of ribosomes during the stationary phase. The significance of this change is not clear as yet. 44
51039 400448 pfam08137 DVL DVL family. This family consists of the DVL family of proteins. In a gain-of-function genetic screen for genes that influence fruit development in Arabidopsis, DEVIL (DVL) gene was identified. DVL is a small protein and overexpression of the protein results in pleiotropic phenotypes featured by shortened stature, rounder rosette leaves, clustered inflorescences, shortened pedicles, and siliques with pronged tips. DVL family is a novel class of small polypeptides and the overexpression phenotypes suggest that these polypeptides may have a role in plant development. 19
51040 285365 pfam08138 Sex_peptide Sex peptide (SP) family. This family consists of Sex Peptides (SP) that are found in Drosophila. On mating, the Drosophila female decreases her remating rate and increases her egg-laying rate due, in part, to the transfer of SP from the male to the female. SP are found in seminal fluids transferred from the male to the female during mating. The male seminal fluid proteins are referred to as accessory gland proteins (Acps). The SP is one of the most interesting Acps and plays an important role in reproduction. 55
51041 400449 pfam08139 LPAM_1 Prokaryotic membrane lipoprotein lipid attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached. 18
51042 369707 pfam08140 Cuticle_1 Crustacean cuticle protein repeat. This family consists of the cuticle proteins from the Cancer pagurus and the Homarus americanus. These proteins are isolated from the calcified regions of the crustacean and they contain two copies of an 18 residue sequence motif, which thus far has been found only in crustacean calcified exoskeletons. 40
51043 400450 pfam08141 SspH Small acid-soluble spore protein H family. This family consists of the small acid-soluble spore proteins (SASP) of the H type (sspH). SspH are unique to spores of Bacillus subtilis and are expressed only in the forespore compartment during sporulation of this organism. The sspH genes are monocistronic and are recognized by the forespore-specific sigma factor for RNA polymerase - sigma-G. The specific role of this protein is unclear but is thought to play a role in sporulation under conditions different from that of the common laboratory tests of spore properties. 58
51044 400451 pfam08142 AARP2CN AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU. 85
51045 311868 pfam08143 CBFNT CBFNT (NUC161) domain. This N terminal domain is found in proteins of CARG-binding factor A-like proteins. 60
51046 400452 pfam08144 CPL CPL (NUC119) domain. This C terminal domain is found in Penguin-like proteins associated with Pumilio like repeats. 140
51047 400453 pfam08145 BOP1NT BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins. 259
51048 400454 pfam08146 BP28CT BP28CT (NUC211) domain. This C terminal domain is found in BAP28-like nucleolar proteins. 146
51049 400455 pfam08147 DBP10CT DBP10CT (NUC160) domain. This C terminal domain is found in the Dbp10p subfamily of hypothetical RNA helicases. 63
51050 400456 pfam08148 DSHCT DSHCT (NUC185) domain. This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases. 158
51051 400457 pfam08149 BING4CT BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins. 79
51052 400458 pfam08150 FerB FerB (NUC096) domain. This is central domain B in proteins of the Ferlin family. 76
51053 400459 pfam08151 FerI FerI (NUC094) domain. This domain is present in proteins of the Ferlin family. It is often located between two C2 domains. 52
51054 400460 pfam08152 GUCT GUCT (NUC152) domain. This is the C terminal domain found in the RNA helicase II / Gu protein family. 91
51055 400461 pfam08153 NGP1NT NGP1NT (NUC091) domain. This N terminal domain is found in a subfamily of hypothetical nucleolar GTP-binding proteins similar to human NGP1. 130
51056 400462 pfam08154 NLE NLE (NUC135) domain. This domain is located N terminal to WD40 repeats. It is found in the microtubule-associated yeast ribosome biogenesis protein YTM1. 65
51057 400463 pfam08155 NOGCT NOGCT (NUC087) domain. This C terminal domain is found in the NOG subfamily of nucleolar GTP-binding proteins. 51
51058 400464 pfam08156 NOP5NT NOP5NT (NUC127) domain. This N terminal domain is found in RNA-binding proteins of the NOP5 family. 66
51059 400465 pfam08157 NUC129 NUC129 domain. This C terminal domain is found in a novel family of hypothetical nucleolar proteins. 63
51060 400466 pfam08158 NUC130_3NT NUC130/3NT domain. This N terminal domain is found in a novel nucleolar protein family. 50
51061 400467 pfam08159 NUC153 NUC153 domain. This small domain is found in a novel nucleolar family. 29
51062 400468 pfam08161 NUC173 NUC173 domain. This is the central domain of a novel family of hypothetical nucleolar proteins. 202
51063 400469 pfam08163 NUC194 NUC194 domain. This is domain B in the catalytic subunit of DNA-dependent protein kinases. 387
51064 400470 pfam08164 TRAUB Apoptosis-antagonizing transcription factor, C-terminal. This C terminal domain is found in traube proteins. This is the domain of the AATF proteins that interacts with BLOS2 or Ceap, that functions as an adaptor in processes such as protein and vesicle processing and transport, and perhaps transcription. 81
51065 400471 pfam08165 FerA FerA (NUC095) domain. This is central domain A in proteins of the Ferlin family. 58
51066 149302 pfam08166 NUC202 NUC202 domain. This domain is found in a novel family of nucleolar proteins. 61
51067 400472 pfam08167 RIX1 rRNA processing/ribosome biogenesis. Rix1 is a nucleoplasmic particle involved in rRNA processing/ribosome assembly. It associates with two other proteins, Ipi1 and Ipi3, to form the RIX1 complex that allows Rea1 - the AAA ATPase - to associate with the 60S ribosomal subunit. More than 170 assembly factors are involved in the construction and maturation of yeast ribosomes, and after these factors have completed their function they need to be released from the pre-ribosomes. Rea1 induces the release of the assembly protein complex in a mechanical fashion. This family is usually associated with NUC202, pfam08166. 187
51068 400473 pfam08168 NUC205 NUC205 domain. This domain is found in a novel family of nucleolar proteins. 44
51069 400474 pfam08169 RBB1NT RBB1NT (NUC162) domain. This domain is found N terminal to the ARID/BRIGHT domain in DNA-binding proteins of the Retinoblastoma-binding protein 1 family. 94
51070 400475 pfam08170 POPLD POPLD (NUC188) domain. This domain is found in POP1-like nucleolar proteins. 92
51071 400476 pfam08171 Mad3_BUB1_II Mad3/BUB1 homology region 2. This domain is found in checkpoint proteins which are involved in cell division. This region has been shown to be necessary and sufficient for the binding of MAD3 to BUB3 in Saccharomyces cerevisiae. This domain is present in BUB1 which also binds BUB3. 65
51072 400477 pfam08172 CASP_C CASP C terminal. This domain is the C-terminal region of the CASP family of proteins. It is a Golgi membrane protein which is thought to have a role in vesicle transport. 247
51073 400478 pfam08173 YbgT_YccB Membrane bound YbgT-like protein. This family contains a set of membrane proteins, typically 33 amino acids long. The family has no known function, but the protein is found in the operon CydAB in E. coli. Members have a consensus motif (MWYFXW) which is rich in aromatic residues. The protein forms a single membrane-spanning helix. This family seems to be restricted to Proteobacteria. 26
51074 400479 pfam08174 Anillin Cell division protein anillin. Anillin is a protein involved in septin organisation during cell division. It is an actin binding protein that is localized to the cleavage furrow, and it maintains the localization of active myosin, which ensures the spatial control of concerted contraction during cytokinesis. 140
51075 369735 pfam08175 SspO Small acid-soluble spore protein O family. This family consists of the small acid-soluble spore proteins (SASP) O type (sspO). SspO (originally cotK) are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. The sspO is the first gene in a likely operon with sspP and transcription of this gene is primarily by RNA polymerase with the forespore-specific sigma factor, sigma-G. Mutation deleting sspO causes the loss of the SspO from the forespore but had no discernible effect on sporulation, spore properties or spore germination. 50
51076 285399 pfam08176 SspK Small acid-soluble spore protein K family. This family consists of the small acid-soluble spore proteins (SASP) belonging to the K type (sspK). The sspK are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. The sspK gene is monocistronic and transcription is primarily by the RNA polymerase with the forespore-specific sigma factor, sigma-G. Mutation deleting sspK results in loss of SspK from the spore but had no discernible effect on sporulation, spore properties or spore germination. 47
51077 285400 pfam08177 SspN Small acid-soluble spore protein N family. This family consists of the small acid-soluble spore protein (SASP) N type (sspN). SspN is a 48 residues protein that is expressed only in the forespore compartment of sporulating Bacillus subtilis. The sspN gene is recognized equally by both sigma-G and sigma-F. The role of SspN is still not well-defined. 46
51078 285401 pfam08178 GnsAB_toxin GnsA/GnsB toxin of bacterial toxin-antitoxin system. This family consists of the GnsA/GnsB family. GnsA and GnsB are multicopy suppressors of the secG null mutation. These proteins participate in the synthesis of phospholipids, suggesting the functional relationship between SecG and membrane phospholipids. Over-expression of gnsA and gnsB causes a remarkable increase in the unsaturated fatty acid content. However, the gnsA-gnsB double null mutant exhibits no effect. Both proteins are predicted to possess a helix-turn-helix structure. GnsAB is a family of putative bacterial toxins (both GnsA and GnsB) that are neutralized by the antitoxin YmcE, pfam15939. 54
51079 400480 pfam08179 SspP Small acid-soluble spore protein P family. This family consists of the small acid-soluble spore proteins (SASP) P type (sspP). sspP is expressed only in the forespore compartment of the sporulating cell. sspP is also expressed under sigma-G control from the same promoter as sspO. Mutations deleting sspP cause no discernible effect on sporulation, spore properties or spore germination. 44
51080 400481 pfam08180 BAGE B melanoma antigen family. This family consists of the B melanoma antigen (BAGE) peptides. The BAGE gene encodes a human tumor antigen that is recognized by a cytolytic T lymphocyte. BAGE genes are expressed in melanomas, bladder and lung carcinomas and in a few tumors of other histological types. 28
51081 285404 pfam08181 DegQ DegQ (SacQ) family. This family consists of the DegQ (formerly sacQ) regulatory peptides. The DegQ family of peptides control the rates of synthesis of a class of both secreted and intracellular degradative enzymes in Bacillus subtilis. DegQ is 46 amino acids long and activates the synthesis of degradative enzymes. The expression of this peptide was shown to be subjected both to catabolite repression and DegS-DegU-mediated control. Thus allowing an increase in the rate of synthesis of degQ under conditions of nitrogen starvation. 46
51082 400482 pfam08182 Pedibin Pedibin/Hym-346 family. This family consists of the pedibin and Hym-346 signalling peptides. These two peptides have been isolated from Hydra vulgaris and Hydra magnipapillata. Experiments have indicated that both cause a reduction in the positional value gradient, the principle patterning process governing the maintenance of form in the adult hydra. The peptides cause an increase in the rate of foot regeneration following bisection of the body column. Thus both play important signalling roles in patterning processes in cnidaria and maybe in more complex metazoans. 35
51083 369736 pfam08183 SpoV Stage V sporulation protein family. This family consists of the stage V sporulation (SpoV) proteins of Bacillus subtilis which includes SpoVM. SpoVM is a small, 26 residue-long protein that is produced in the mother cell chamber of the sporangium during the process of sporulation in B. subtilis. SpoVM forms an amphipathic alpha-helix and is recruited to the polar septum shortly after the sporangium undergoes asymmetric division. The function of SpoVM depends on proper subcellular localization. 25
51084 116772 pfam08184 Cuticle_2 Cuticle protein 7 isoform family. This family consists of cuticle protein 7 isoforms that are isolated from the carapace cuticle of a juvenile horseshoe crab, Limulus polyphemus. There are 3 isoforms of cuticle protein 7. The 3 isoforms are N-terminally blocked but could be deblocked by treatment with pyroglutaminase, showing that the N-terminal residue is a pyroglutamine residue. 59
51085 369737 pfam08186 Wound_ind Wound-inducible basic protein family. This family consists of the wound-inducible basic proteins from plants. The metabolic activities of plants are dramatically altered upon mechanical injury or pathogen attack. A large number of proteins accumulate at wound or infection sites, such as the wound-inducible basic proteins. These proteins are small, 47 amino acids in length, have no signal peptides and are hydrophilic and basic. 44
51086 71621 pfam08187 Tetradecapep Myoactive tetradecapeptides family. This family consists of myoactive tetradecapeptides that are isolated from the gut of earthworms, Eisenia foetida and Pheretima vitata. These peptides were termed ETP and PTP respectively. Both peptides showed a potent excitatory action on spontaneous contractions of the anterior gut. These peptides show similarity to Molluscan tetradecapeptides and arthropodan tridecapeptides. 14
51087 71622 pfam08188 Protamine_3 Spermatozal protamine family. This family consists of the spermatozal protamines. Spermatozal protamines play an important role in remodelling of the sperm chromatin during mammalian spermiogenesis. Nuclear elongation and chromatin condensation are concomitant with modifications in the basic protein complement associated with DNA. Somatic histones are initially replaced by testis-specific histone variants, then by transitional proteins, and ultimately by protamines. 48
51088 400483 pfam08189 Meleagrin Meleagrin/Cygnin family. This family consists of meleagrin and cygnin basic peptides that are isolated from turkey and black swan respectively. Both peptides are low in molecular weight and contain three disulphide bonds with high concentrations of aromatic residues. These peptides show similarity to transferrins and probably play some vital role in avian eggs but the exact function is still unknown. 38
51089 369738 pfam08190 PIH1 pre-RNA processing PIH1/Nop17. This domain is involved in pre-rRNA processing. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae. The C-terminal region of this family has similarity to the CS domain pfam04969. 166
51090 369739 pfam08191 LRR_adjacent LRR adjacent. These are small, all beta strand domains, structurally described for the protein Internalin (InlA) and related proteins InlB, InlE, InlH from the pathogenic bacterium Listeria monocytogenes. Their function appears to be mainly structural: They are fused to the C-terminal end of leucine-rich repeats (LRR), significantly stabilizing the LRR, and forming a common rigid entity with the LRR. They are themselves not involved in protein-protein-interactions but help to present the adjacent LRR-domain for this purpose. These domains belong to the family of Ig-like domains in that they consist of two sandwiched beta sheets that follow the classical connectivity of Ig-domains. The beta strands in one of the sheets is, however, much smaller than in most standard Ig-like domains, making it somewhat of an outlier. 57
51091 369740 pfam08192 Peptidase_S64 Peptidase family S64. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1. The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS:S1) and to have a typical trypsin-like catalytic triad. 684
51092 400484 pfam08193 INO80_Ies4 INO80 complex subunit Ies4. The INO80 ATPase is a member of the SNF2 family of ATPases and functions as an integral component of a multisubunit ATP-dependent chromatin remodelling complex. This family of proteins corresponds to the fungal Ies4 subunit of INO80. 233
51093 400485 pfam08194 DIM DIM protein. Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila. This family includes DIMs 1 to 4 that have masses below 5 kDa. 36
51094 369743 pfam08195 TRI9 TRI9 protein. Putative gene of 129 bp in the Trichothecene gene cluster of Fusarium sporotrichioides and F. graminearum. It encodes a predicted protein of 43 amino acids whose function is unknown. 43
51095 285414 pfam08196 UL2 UL2 protein. Orf UL2 of Human cytomegalovirus (HCMV) which is a short protein of unknown function. 59
51096 285415 pfam08197 TT_ORF2a pORF2a truncated protein. Most isolated ORF2 of TT virus (TTV) encode a 49 amino acids protein (pORF2a) because of an in-frame stop codon. ORF2s isolated from G1 TTV encode 202 amino acids protein (pORF2ab). 49
51097 400486 pfam08198 Thymopoietin Thymopoietin protein. Short protein of 49 amino acids isolated from bovine spleen cells. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control. 48
51098 285417 pfam08199 E2 Bacteriophage E2-like protein. Short conserved protein described in Lactococcus Bacteriophage c2 of 37 amino acids. 37
51099 369745 pfam08200 Phage_1_1 Bacteriophage 1.1 Protein. Gene 1.1 in Bacteriophage T7 encodes a 42 amino acid protein, rich in basic amino acids suggesting its interaction with nucleic acids. Many homologs are present in different T7 and T3-like bacteriophage. 45
51100 369746 pfam08201 BssC_TutF BssC/TutF protein. BssC short protein (57 amino acids) has been described as the gamma-subunit of benzylsuccinate synthase from Thauera aromatica strain K172. TutF has been identified and described as highly similar to BssC in T.aromatica strain T1. 56
51101 369747 pfam08202 MIS13 Mis12-Mtw1 protein family. Mis12-Mtw1 is a eukaryotic conserved kinetochore protein that is involved in chromosome segregation. 288
51102 400487 pfam08203 RNA_polI_A14 Yeast RNA polymerase I subunit RPA14. This is a family of yeast proteins. A14 is one of the final two subunits of Saccharomyces cerevisiae RNA polymerase I and is proposed to play a role in the recruitment of pol I to the promoter. 76
51103 400488 pfam08204 V-set_CD47 CD47 immunoglobulin-like domain. This family represents the CD47 leukocyte antigen V-set like Ig domain. 93
51104 400489 pfam08205 C2-set_2 CD80-like C2-set immunoglobulin domain. These domains belong to the immunoglobulin superfamily. 89
51105 400490 pfam08206 OB_RNB Ribonuclease B OB domain. This family includes the N-terminal OB domain found in ribonuclease B proteins in one or two copies. 58
51106 400491 pfam08207 EFP_N Elongation factor P (EF-P) KOW-like domain. 58
51107 400492 pfam08208 RNA_polI_A34 DNA-directed RNA polymerase I subunit RPA34.5. This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription. 203
51108 369754 pfam08209 Sgf11 Sgf11 (transcriptional regulation protein). The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae. The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation. 33
51109 400493 pfam08210 APOBEC_N APOBEC-like N-terminal domain. A mechanism of generating protein diversity is mRNA editing. Members of this family are C-to-U editing enzymes. The N-terminal domain of APOBEC-1 like proteins is the catalytic domain, while the C-terminal domain is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc-dependent deaminase domain and is essential for cytidine deamination. APOBEC-3 like members contain two copies of this domain. RNA editing by APOBEC-1 requires homodimerization and this complex interacts with RNA binding proteins to form the editosome (and references therein). This family also includes the functionally homologous activation induced deaminase (AID), which is essential for the development of antibody diversity in B lymphocytes, and the sea lamprey PmCDA1 and PmCDA2, which are predicted to play an AID-like role in the adaptive immune response of jawless vertebrates. Divergent members of this family are present in various eukaryotes such as Nematostella, C. elegans, Micromonas and Emiliania, and prokaryotes such as Wolbachia and Pseudomonas brassicacearum. 170
51110 400494 pfam08211 dCMP_cyt_deam_2 Cytidine and deoxycytidylate deaminase zinc-binding region. 122
51111 400495 pfam08212 Lipocalin_2 Lipocalin-like domain. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel. 143
51112 400496 pfam08213 DUF1713 Mitochondrial domain of unknown function (DUF1713). This domain is found at the C terminal end of mitochondrial proteins of unknown function. 33
51113 400497 pfam08214 HAT_KAT11 Histone acetylation protein. Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin. KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast. This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11. 348
51114 400498 pfam08216 CTNNBL Catenin-beta-like, Arm-motif containing nuclear. CTNNBL is a family of eukaryotic nuclear proteins of the catenin-beta-like 1 type that contain an armadillo motif. A human nuclear protein with this domain is thought to have a role in apoptosis. The interaction of CTNNBL1 with its known partners (the Prp19-CDC5L complex and AID) is mediated by recognition of NLS (nuclear localization signal) motifs. The RNA-splicing factor Prp31 is also an interactor, with recognition also occurring through the NLS. CTNNBL1 uses its central armadillo (ARM) domain to bind NLS-containing partners. 104
51115 369760 pfam08217 DUF1712 Fungal domain of unknown function (DUF1712). The function of this family of proteins is unknown. 468
51116 285434 pfam08218 Citrate_ly_lig Citrate lyase ligase C-terminal domain. This family is composed of the C-terminal domain of citrate lyase ligase EC:6.2.1.22. 182
51117 400499 pfam08219 TOM13 Outer membrane protein TOM13. The TOM13 family of proteins are mitochondrial outer membrane proteins that mediate the assembly of beta-barrel proteins. 82
51118 285436 pfam08220 HTH_DeoR DeoR-like helix-turn-helix domain. 57
51119 400500 pfam08221 HTH_9 RNA polymerase III subunit RPC82 helix-turn-helix domain. This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa. This region is probably a DNA-binding helix-turn-helix. 62
51120 369763 pfam08222 HTH_CodY CodY helix-turn-helix domain. This family consists of the C-terminal helix-turn-helix domain found in several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk. 61
51121 369764 pfam08223 PaaX_C PaaX-like protein C-terminal domain. This family contains proteins that are similar to the product of the paaX gene of Escherichia coli. This protein is involved in the regulation of expression of a group of proteins known to participate in the metabolism of phenylacetic acid. 99
51122 400501 pfam08224 DUF1719 Domain of unknown function (DUF1719). This is a domain of unknown function. It may have a role in ATPase activation. 231
51123 400502 pfam08225 Antimicrobial19 Pseudin antimicrobial peptide. Pseudins are a subfamily of the FSAP family (Frog Secreted Active Peptides) extracted from the skin of the paradoxical frog Pseudis paradoxa (Pseudidae). The pseudins belong to the class of cationic, amphipathic-helical antimicrobial peptides. 23
51124 369766 pfam08226 DUF1720 Domain of unknown function (DUF1720). This domain is found in different combinations with cortical patch components EF hand, SH3 and ENTH and is therefore likely to be involved in cytoskeletal processes. This family contains many hypothetical proteins. 75
51125 400503 pfam08227 DASH_Hsk3 DASH complex subunit Hsk3 like. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. This family also includes several higher eukaryotic proteins. However, other DASH subunits do not appear to be conserved in higher eukaryotes. 45
51126 400504 pfam08228 RNase_P_pop3 RNase P subunit Pop3. This family of fungal proteins form a subunit of RNase P, the ribonucleoprotein enzyme that cleaves the leader sequence of precursor tRNAs to generate mature tRNAs. The structure of Pop3 has been assigned the L7Ae/L30e fold. This RNA-binding fold is also present in human RNase P subunit Rpp38, raising the possibility that Pop3p and Rpp38 are functional homologs. 158
51127 369769 pfam08229 SHR3_chaperone ER membrane protein SH3. This family of proteins are membrane localized chaperones that are required for correct plasma membrane localization of amino acid permeases (AAPs). SH3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of SH3, AAPs are retained in the ER. 185
51128 400505 pfam08230 CW_7 CW_7 repeat. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is also found in the cell wall hydrolases of human and livestock pathogens. CW_7 repeats make up a cell wall binding motif. 40
51129 400506 pfam08231 SYF2 SYF2 splicing factor. Proteins in this family are involved in cell cycle progression and pre-mRNA splicing. 150
51130 400507 pfam08232 Striatin Striatin family. Striatin is an intracellular protein which has a caveolin-binding motif, a coiled-coil structure, a calmodulin-binding site, and a WD (pfam00400) repeat domain. It acts as a scaffold protein and is involved in signalling pathways. 142
51131 400508 pfam08234 Spindle_Spc25 Chromosome segregation protein Spc25. This is a family of chromosome segregation proteins. It contains Spc25, which is a conserved eukaryotic kinetochore protein involved in cell division. In fungi the Spc25 protein is a subunit of the Nuf2-Ndc80 complex, and in vertebrates it forms part of the Ndc80 complex. 71
51132 400509 pfam08235 LNS2 LNS2 (Lipin/Ned1/Smp2). This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain (pfam04571). SMP2 (also known as PAH1) is involved in plasmid maintenance and respiration, and has been identified as a Mg2+-dependent phosphatidate phosphatase (EC:3.1.3.4) that contains a haloacid dehalogenase (HAD)-like domain. Lipin proteins are involved in adipose tissue development and insulin resistance. 226
51133 400510 pfam08236 SRI SRI (Set2 Rpb1 interacting) domain. The SRI (Set2 Rpb1 interacting) domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. This domain is conserved from yeast to humans. Members of this family form a compact, closed three-helix bundle, with an up-down-up topology. The first and second helices are antiparallel to each other and are of similar length; the third helix, which is packed across helices alpha1 and alpha2 is slightly shorter, consisting of only 15 amino acids. Most conserved hydrophobic residues are largely buried in the interior of the structure and form an extensive and contiguous hydrophobic core that stabilizes the packing of the three-helix bundle. 83
51134 369775 pfam08237 PE-PPE PE-PPE domain. This domain is found C terminal to the PE (pfam00934) and PPE (pfam00823) domains. The secondary structure of this domain is predicted to be a mixture of alpha helices and beta strands. 227
51135 400511 pfam08238 Sel1 Sel1 repeat. This short repeat is found in the Sel1 protein. It is related to TPR repeats. 35
51136 400512 pfam08239 SH3_3 Bacterial SH3 domain. 54
51137 400513 pfam08240 ADH_N Alcohol dehydrogenase GroES-like domain. This is the catalytic domain of alcohol dehydrogenases. Many of them contain an inserted zinc binding domain. This domain has a GroES-like structure. 106
51138 400514 pfam08241 Methyltransf_11 Methyltransferase domain. Members of this family are SAM dependent methyltransferases. 94
51139 400515 pfam08242 Methyltransf_12 Methyltransferase domain. Members of this family are SAM dependent methyltransferases. 98
51140 400516 pfam08243 SPT2 SPT2 chromatin protein. This family includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation. 105
51141 400517 pfam08244 Glyco_hydro_32C Glycosyl hydrolases family 32 C terminal. This domain corresponds to the C terminal domain of glycosyl hydrolase family 32. It forms a beta sandwich module. 162
51142 400518 pfam08245 Mur_ligase_M Mur ligase middle domain. 200
51143 400519 pfam08246 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. 58
51144 116832 pfam08247 ENOD40 ENOD40 protein. Rohrig et al. reported the in vitro translation of two peptides of 12 and 24 amino acids from the short, overlapping ORFs of soybean ENOD40 mRNA. The putative role of the enod40 genes has been in favour of organogenesis, such as induction of the cortical cell divisions that lead to initiation of nodule primordia, in developing lateral roots and embryonic tissues. This supports the hypothesis for a role of enod40 in lateral organ development. 12
51145 116833 pfam08248 Tryp_FSAP Tryptophyllin-3 skin active peptide. PdT-3 or Tryptophyllin-3 peptide is a subfamily of the family Tryptophyllin and of the superfamily FSAP (Frog Skin Active Peptide). Originally identified in skin extracts of Neotropical leaf frogs, Phyllomedusa sp. This subfamily has an average length of 13 amino acids. The pharmacological activity of the tryptophyllins remains to be established but it seems that these peptides possess an action on liver protein synthesis and body weight. 12
51146 116834 pfam08249 Mastoparan Mastoparan protein. Mastoparans are a family of tetradecapeptides from wasp venom, that have been shown to directly activate GTP-binding regulatory proteins. These peptides show selectivity among G proteins: they strongly activate Go and Gi but not Gs or Gt. The peptides of this family are composed of 14 amino acids but they can assume different structures. 14
51147 116835 pfam08250 Sperm_act_pep Sperm-activating peptides. The sperm-activating peptides (SAPs) are isolated in egg-conditioned media (egg jelly) of sea urchins. SAPs have several effects on sea urchin spermatozoa: stimulate sperm respiration and motility through intracellular alkalinization, transient elevation of cAMP, cGMP and Ca++ levels in sperm cells. 10
51148 116836 pfam08251 Mastoparan_2 Mastoparan peptide. Mastoparan (MP) peptides I II and III are extracted from the venom gland of the Neotropical social wasp Protopolybia exigua (Saussure). They are tetradecapeptides presenting from seven to ten hydrophobic amino acid residues and from two to four lysine residues in their primary sequences. These peptides cause the degranulation of mast cells. Protopolybia-MP-I also acts by causing hemolysis in erythrocytes. 14
51149 285459 pfam08252 Leader_CPA1 arg-2/CPA1 leader peptide. This family contains leader peptides involved in the regulation of the glutaminase subunit (small subunit) of arginine-specific carbamoyl phosphate synthetase. In Neurospora crassa it is a small upstream ORF of 24 codons above the arg-2 locus. In yeast it is the leader peptide of the CPA1 gene. The 5' region of CPA1 mRNA contains a 25 codon upstream open reading frame. The leader peptide, the product of the upstream open reading frame, plays an essential, negative role in the specific repression of CPA1 by arginine. 23
51150 285460 pfam08253 Leader_Erm Erm Leader peptide. These short proteins are Leader peptides (15-19 amino acids) of erm genes that code for resistance determinants in Staphylococcus aureus. 19
51151 285461 pfam08254 Leader_Thr Threonine leader peptide. Threonine leader peptide of the Threonine operon thrA1A2BC. It has been sequenced in different bacteria: E. coli, Serratia marcescens, Salmonella typhi. 22
51152 285462 pfam08255 Leader_Trp Trp-operon Leader Peptide. The tryptophan operon regulatory region of C. freundii (leader transcript) encodes a 14-residue peptide containing characteristic tandem tryptophan residues. It is about 10 nucleotides shorter than those of E. coli and S. typhimurium. 14
51153 116841 pfam08256 Antimicrobial20 Aurein-like antibiotic peptide. This family of antibacterial peptides are secreted from the granular dorsal glands of the Green and Golden Bell Frog Litoria aurea, Southern Bell Frog L. raniformis, Blue Mountains tree-frog Litoria citropa (genus Litoria) and frogs from genus Uperoleia. They are a part of the FSAP peptide family. Amongst the more active of these are aurein 1.2, aurein 2.2 and aurein 3.1; caerin 1.1, maculatin 1.1, uperin 3.6; citropin 1.1, citropin 1.2, citropin 1.3 and a minor peptide are wide-spectrum antibacterial peptides. 13
51154 400520 pfam08257 Sulfakinin Sulfakinin family. The sulfakinin (SK) family of neuropeptides has only been identified in crustaceans and insects. For most species there is the potential for producing two sulfakinin peptides, one having a short sulfakinin sequence. The function of the sulfakinins is difficult to assess. For the American cockroach, various forms of the endogenous sulfakinins have been shown to be active on the hindgut, and also on the heart. In C. vomitoria the peptides act as neurotransmitters or neuromodulators, linking the brain with all thoracic and abdominal ganglia. In adults of P. monodon they appear to be restricted to a few neurones in the brain with a neural pathway extending along to the ventral thoracic and abdominal ganglia. 9
51155 116843 pfam08258 WWamide WWamide peptide. This family contains neuropeptides isolated from ganglia of the African giant snail, Achatina fulica. Each peptide has a Trp residue at both the N- and C-termini. Purified WWamide-1, -2 and -3 showed an inhibitory effect on the phasic contractions of the anterior byssus retractor muscle (ABRM). 7
51156 400521 pfam08259 Periviscerokin Periviscerokinin family. Abdominal perisympathetic organs of insects contain periviscerokinin neuropeptides of about 11 amino acids. 11
51157 116845 pfam08260 Kinin Insect kinin peptide. These neuropeptides are the first members of the insect kinin-family isolated from the American cockroach. Their occurrence in the retrocerebral complex suggests a physiological role as a neurohormone. The C-terminal sequence Phe-X-Ser-Trp-Gly-NH2 characterized the peptides as members of the insect kinin family. Data suggest a possible involvement of insect kinins in water-balance by regulating the osmoregulation. These peptides have length from 6 to 14 amino acids. 8
51158 87473 pfam08261 Carcinustatin Carcinustatin peptide. A total of 20 peptides of the superfamily allostatin were isolated from the shore crab Carcinus maenas. They are named carcinustatin 1 to 20 and their length ranges from 5 to 27 amino acids. This family includes carcinustatin 8,9,15 and 16. 8
51159 116846 pfam08262 Lem_TRP Leucophaea maderae tachykinin-related peptide. These peptides are designated Leucophaea maderae tachykinin-related peptides (Lem TRPs). Some were isolated from the midgut of L. maderae, whereas others appear to be brain specific. The Lem TRPs of the brain are myotropic and induce increases in the amplitude and frequency of spontaneous contractions and tonus of hindgut muscle in L. maderae. They were also isolated from brain-corpora cardiaca-corpora allata-suboesophageal ganglion extracts of Locusta migratoria. They stimulate visceral muscle contractions of the oviduct and the foregut of Locusta migratoria. 10
51160 400522 pfam08263 LRRNT_2 Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. 41
51161 400523 pfam08264 Anticodon_1 Anticodon-binding domain of tRNA. This domain is found mainly in hydrophobic tRNA synthetases. The domain binds to the anticodon of the tRNA. 141
51162 400524 pfam08265 YL1_C YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be transcription factors. This domain is also found in proteins that are not YL1 proteins. 29
51163 400525 pfam08266 Cadherin_2 Cadherin-like. This cadherin domain is usually the most N-terminal copy of the domain. 83
51164 400526 pfam08267 Meth_synt_1 Cobalamin-independent synthase, N-terminal domain. The N-terminal domain and C-terminal domains of cobalamin-independent synthases together define a catalytic cleft in the enzyme. The N-terminal domain is thought to bind the substrate, in particular, the negatively charged polyglutamate chain. The N-terminal domain is also thought to stabilize a loop from the C-terminal domain. 310
51165 285468 pfam08268 FBA_3 F-box associated domain. 125
51166 377967 pfam08269 dCache_2 Cache domain. Double Cache domain 2 (dCache_2) may be a result of single Cache domain 2 (sCache_2) duplication. 297
51167 285470 pfam08270 PRD_Mga M protein trans-acting positive regulator (MGA) PRD domain. Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. This corresponds to the PRD like region. 220
51168 400527 pfam08271 TF_Zn_Ribbon TFIIB zinc-binding. The transcription factor TFIIB contains a zinc-binding motif near the N-terminus. This domain is involved in the interaction with RNA pol II and TFIIF and plays a crucial role in selecting the transcription initiation site. The domain adopts a zinc ribbon like structure. 43
51169 400528 pfam08272 Topo_Zn_Ribbon Topoisomerase I zinc-ribbon-like. Some Proteobacteria topoisomerase I contain two zinc-ribbon-like domains at the C-terminus that are structurally homologous to pfam01396. However, this domain no longer binds zinc. Indeed, only one of the four cysteine residues remains. 39
51170 400529 pfam08273 Prim_Zn_Ribbon Zinc-binding domain of primase-helicase. 37
51171 311948 pfam08274 PhnA_Zn_Ribbon PhnA Zinc-Ribbon. 30
51172 400530 pfam08275 Toprim_N DNA primase catalytic core, N-terminal domain. 128
51173 400531 pfam08276 PAN_2 PAN-like domain. 67
51174 400532 pfam08277 PAN_3 PAN-like domain. 71
51175 400533 pfam08278 DnaG_DnaB_bind DNA primase DnaG DnaB-binding. Eubacterial DnaG primases interact with several factors to form the replisome. One of these factors is DnaB, a helicase. This domain has been demonstrated to be responsible for the interaction between DnaG and DnaB. 122
51176 400534 pfam08279 HTH_11 HTH domain. This family includes helix-turn-helix domains in a wide variety of proteins. 52
51177 311953 pfam08280 HTH_Mga M protein trans-acting positive regulator (MGA) HTH domain. Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. 59
51178 400535 pfam08281 Sigma70_r4_2 Sigma-70, region 4. Region 4 of sigma-70 like sigma-factors are involved in binding to the -35 promoter element via a helix-turn-helix motif. 54
51179 400536 pfam08282 Hydrolase_3 haloacid dehalogenase-like hydrolase. This family contains haloacid dehalogenase-like hydrolase enzymes. 254
51180 285483 pfam08283 Gemini_AL1_M Geminivirus rep protein central domain. This is the central domain of the geminivirus rep proteins. 107
51181 400537 pfam08284 RVP_2 Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases. 134
51182 400538 pfam08285 DPM3 Dolichol-phosphate mannosyltransferase subunit 3 (DPM3). This family corresponds to subunit 3 of dolichol-phosphate mannosyltransferase, an enzyme which generates mannosyl donors for glycosylphosphatidylinositols, N-glycan and protein O- and C-mannosylation. DPM3 is an integral membrane protein and plays a role in stabilizing the dolichol-phosphate mannosyl transferase complex. 89
51183 400539 pfam08286 Spc24 Spc24 subunit of Ndc80. Spc24 is a component of the evolutionarily conserved kinetochore-associated Ndc80 complex and is involved in chromosome segregation. 106
51184 400540 pfam08287 DASH_Spc19 Spc19. Spc19 is a component of the DASH complex. The DASH complex associates with the spindle pole body and is important for spindle and kinetochore integrity during cell division. 148
51185 400541 pfam08288 PIGA PIGA (GPI anchor biosynthesis). This domain is found on phosphatidylinositol n-acetylglucosaminyltransferase proteins. These proteins are involved in GPI anchor biosynthesis and are associated with the disease paroxysmal nocturnal haemoglobinuria. 90
51186 285488 pfam08289 Flu_M1_C Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein. 97
51187 285489 pfam08290 Hep_core_N Hepatitis core protein, putative zinc finger. This short region is found at the N-terminus of some hepatitis core proteins. Its conservation of four cys and his suggests a zinc binding domain. 27
51188 400542 pfam08291 Peptidase_M15_3 Peptidase M15. 112
51189 400543 pfam08292 RNA_pol_Rbc25 RNA polymerase III subunit Rpc25. Rpc25 is a strongly conserved subunit of RNA polymerase III and has homology to Rpa43 in RNA polymerase I, Rpb7 in RNA polymerase II and the archaeal RpoE subunit. Rpc25 is required for transcription initiation and is not essential for the elongating properties of RNA polymerase III. 121
51190 400544 pfam08293 MRP-S33 Mitochondrial ribosomal subunit S27. This family of proteins corresponds to mitochondrial ribosomal subunit S27 in prokaryotes and to subunit S33 in humans. It is a small 106 residue protein. The evolutionary history of the mitoribosomal proteome that is encoded by a diverse subset of eukaryotic genomes, reveals an ancestral ribosome of alpha-proteobacterial descent that more than doubled its protein content in most eukaryotic lineages. Several new MRPs have originated via duplication of existing MRPs as well as by recruitment from outside of the mitoribosomal proteome. 87
51191 311963 pfam08294 TIM21 TIM21. TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane. 145
51192 400545 pfam08295 Sin3_corepress Sin3 family co-repressor. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. 97
51193 400546 pfam08297 U3_snoRNA_assoc U3 snoRNA associated. This family of proteins is associated with U3 snoRNA. U3 snoRNA is required for nucleolar processing of pre-18S ribosomal RNA. 88
51194 116881 pfam08298 AAA_PrkA PrkA AAA domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain. 358
51195 400547 pfam08299 Bac_DnaA_C Bacterial dnaA protein helix-turn-helix. 69
51196 369801 pfam08300 HCV_NS5a_1a Hepatitis C virus non-structural 5a zinc finger domain. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. This domain corresponds to the N-terminal zinc binding domain. 62
51197 149382 pfam08301 HCV_NS5a_1b Hepatitis C virus non-structural 5a domain 1b. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. This region corresponds to the 1b domain. 102
51198 400548 pfam08302 tRNA_lig_CPD Fungal tRNA ligase phosphodiesterase domain. This domain is found in fungal tRNA ligases and has cyclic phosphodiesterase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns. 253
51199 400549 pfam08303 tRNA_lig_kinase tRNA ligase kinase domain. This domain is found in fungal tRNA ligases and has kinase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns. This family contains a P-loop motif. 168
51200 400550 pfam08305 NPCBM NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.) 136
51201 400551 pfam08306 Glyco_hydro_98M Glycosyl hydrolase family 98. This domain is the putative catalytic domain of glycosyl hydrolase family 98 proteins. 328
51202 285502 pfam08307 Glyco_hydro_98C Glycosyl hydrolase family 98 C-terminal domain. This putative domain is found at the C-terminus of glycosyl hydrolase family 98 proteins. This domain is not expected to form part of the catalytic activity. 270
51203 285503 pfam08308 PEGA PEGA domain. This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. 70
51204 285504 pfam08309 LVIVD LVIVD repeat. This repeat is found in bacterial and archaeal cell surface proteins, many of which are hypothetical. The secondary structure corresponding to this repeat is predicted to comprise from 1-7 of 4-beta-strands which may associate to form a beta-propeller. The repeat copy number varies from 3-29. This repeat is sometimes found with the PKD domain pfam00801. 42
51205 400552 pfam08310 LGFP LGFP repeat. This 54 amino acid repeat is found in many hypothetical proteins. Several hypothetical proteins from C.glutamicum and C.efficiens along with PS1 protein contain this repeat region. The N-terminus region of PS1 contains an esterase domain which transfers corynomycolic acid. The C-terminus region consists of 4 tandem LGFP repeats. It is hypothesized that the PS1 proteins in Corynebacterium, when associated with the cell wall, may be anchored via the LGFP tandem repeats that may be important for maintaining cell wall integrity [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. Deletion of Corynebacterium glutamicum csp1 protein results in a 10-fold increase in the cell volume of the organism and infers the corresponding proteins involvement in the cell shape formation. The secondary structure of each repeat is predicted to comprise two beta-strands and one alpha-helix [Adindla et al. 2004]. 52
51206 400553 pfam08311 Mad3_BUB1_I Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. 123
51207 400554 pfam08312 cwf21 cwf21 domain. The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe. The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding. The structure of this domain has recently been solved which shows this domain to be composed of two alpha helices. 42
51208 400555 pfam08313 SCA7 SCA7, zinc-binding domain. This domain is found in the protein Sgf73/Sca7 which is a component of the multihistone acetyltransferase complexes SAGA and SILK. This domain is also found in Ataxin-7, a human protein which in its polyglutamine expanded pathological form, is responsible for the neurodegenerative disease spinocerebellar ataxia 7 (SCA7). Ataxin-7 is an integral component of the mammalian SAGA-like complexes, the TATA-binding protein-free TAF-containing complex (TFTC) and the SPT3/TAF9/GCN5 acetyltransferase complex (STAGA). This domain is a minimal domain in ataxin-7-like proteins that is required for interaction with TFTC/STAGA subunits and is conserved highly through evolution. The domain contains a conserved Cys(3)His motif that binds zinc, thus indicating this to be a new zinc-binding domain. 60
51209 400556 pfam08314 Sec39 Secretory pathway protein Sec39. Mnaimneh et al identified Sec39p as a protein involved in ER-Golgi transport in a large scale promoter shut down analysis of essential yeast genes. Kraynack et al. (2005) showed that Sec39p (Dsl3p) is required for Golgi-ER retrograde transport and is part of a very stable protein complex that also includes Dsl1p (in mammals ZW10), Tip20p (Rint-1) and the ER localized Q-SNARE proteins Ufe1p (syntaxin-18), Sec20p and Use1p. This was confirmed in a genome-wide analysis of protein complexes by Gavin et al (2006). 724
51210 400557 pfam08315 cwf18 cwf18 pre-mRNA splicing factor. The cwf18 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe. 135
51211 400558 pfam08316 Pal1 Pal1 cell morphology protein. Pal1 is a membrane associated protein that is involved in the maintenance of cylindrical cellular morphology. It localizes to sites of active growth. Pal1 physically interacts and displays overlapping localization with the Huntingtin-interacting-protein (Hip1)-related protein Sla2p/End4p. 135
51212 400559 pfam08317 Spc7 Spc7 kinetochore protein. This domain is found in cell division proteins which are required for kinetochore-spindle association. 311
51213 400560 pfam08318 COG4 COG4 transport protein. This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra-Golgi transport. 326
51214 400561 pfam08320 PIG-X PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. 205
51215 400562 pfam08321 PPP5 PPP5 TPR repeat region. This region is specific to the PPP5 subfamily of serine/threonine phosphatases and contains TPR repeats. 92
51216 400563 pfam08323 Glyco_transf_5 Starch synthase catalytic domain. 239
51217 400564 pfam08324 PUL PUL domain. The PUL (PLAP, Ufd3p and Lub1p) domain is a novel alpha-helical Ub-associated domain. It directly binds to Cdc48, a chaperone-like AAA ATPase that collects ubiquitylated substrates. 260
51218 311984 pfam08325 WLM WLM domain. This is a predicted metallopeptidase domain called WLM (Wss1p-like metalloproteases). These are linked to the Ub-system by virtue of fusions with the UB-binding PUG (PUB), Ub-like, and Little Finger domains. More specifically, genetic evidence implicates the WLM family in de-SUMOylation. 190
51219 400565 pfam08326 ACC_central Acetyl-CoA carboxylase, central region. The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process. 718
51220 400566 pfam08327 AHSA1 Activator of Hsp90 ATPase homolog 1-like protein. This family includes eukaryotic, prokaryotic and archaeal proteins that bear similarity to a C-terminal region of human activator of 90 kDa heat shock protein ATPase homolog 1 (AHSA1/p38). This protein is known to interact with the middle domain of Hsp90, and stimulate its ATPase activity. It is probably a general upregulator of Hsp90 function, particularly contributing to its efficiency in conditions of increased stress. p38 is also known to interact with the cytoplasmic domain of the VSV G protein, and may thus be involved in protein transport. It has also been reported as being underexpressed in Down's syndrome. This region is found repeated in two members of this family. 125
51221 400567 pfam08328 ASL_C Adenylosuccinate lyase C-terminal. This domain is found at the C-terminus of adenylosuccinate lyase (ASL; PurB in E. coli). It has been identified in bacteria, eukaryotes and archaea and is found together with the lyase domain pfam00206. ASL catalyzes the cleavage of succinylaminoimidazole carboxamide ribotide to aminoimidazole carboxamide ribotide and fumarate and the cleavage of adenylosuccinate to adenylate and fumarate. 115
51222 400568 pfam08329 ChitinaseA_N Chitinase A, N-terminal domain. This domain is found in a number of bacterial chitinases and similar viral proteins. It is organized into a fibronectin III module domain-like fold, comprising only beta strands. Its function is not known, but it may be involved in interaction with the enzyme substrate, chitin. It is separated by a hinge region from the catalytic domain (pfam00704); this hinge region is probably mobile, allowing the N-terminal domain to have different relative positions in solution. 130
51223 400569 pfam08331 DUF1730 Domain of unknown function (DUF1730). This domain of unknown function occurs in Iron-sulfur cluster-binding proteins together with the 4Fe-4S binding domain (pfam00037). 77
51224 285524 pfam08332 CaMKII_AD Calcium/calmodulin dependent protein kinase II association domain. This domain is found at the C-terminus of the Calcium/calmodulin dependent protein kinases II (CaMKII). These proteins also have a Ser/Thr protein kinase domain (pfam00069) at their N-terminus. The function of the CaMKII association domain is the assembly of the single proteins into large (8 to 14 subunits) multimers. 128
51225 400570 pfam08333 DUF1725 Protein of unknown function (DUF1725). This family includes many eukaryotic and one bacterial sequence. Many of its members are annotated as being putative L1 retrotransposons or LINE-1 reverse transcriptase homologs. The region in question is found repeated in some family members. 19
51226 400571 pfam08334 T2SSG Type II secretion system (T2SS), protein G. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for the transport of proteins across the outer membrane first exported to the periplasm by the Sec or Tat translocon in Gram-negative (diderm) bacteria. The T2SG family includes proteins such as EpsG (P45773) in Vibrio cholerae, XcpT also called PddA (Q00514) in Pseudomonas aeruginosa or PulG (P15746) in Klebsiella pneumoniae. The PulG is thought to be anchored in the inner membrane with its C-terminus directed towards the periplasm. Together with other members of the Type II secretion machinery, it is thought to assemble into a pilus-like structure that may function as a dynamic mechanism to push secreted proteins out of the cell. The polypeptide is organized into a long N-terminal alpha-helix followed by a loop region that separates it from a C-terminal anti-parallel beta-sheet. 106
51227 400572 pfam08335 GlnD_UR_UTase GlnD PII-uridylyltransferase. This is a family of bifunctional uridylyl-removing enzymes/uridylyltransferases (UR/UTases, GlnD) that are responsible for the modification (EC:2.7.7.59) of the regulatory protein P-II, or GlnB (pfam00543). In response to nitrogen limitation, these transferases catalyze the uridylylation of the PII protein, which in turn stimulates deadenylylation of glutamine synthetase (GlnA). Deadenylylated glutamine synthetase is the more active form of the enzyme. Moreover, uridylylated PII can act together with NtrB and NtrC to increase transcription of genes in the sigma54 regulon, which include glnA and other nitrogen-level controlled genes. It has also been suggested that the product of the glnD gene is involved in other physiological functions such as control of iron metabolism in certain species. The region described in this family is found in many of its members to be C-terminal to a nucleotidyltransferase domain (pfam01909), and N-terminal to an HD domain (pfam01966) and two ACT domains (pfam01842). 140
51228 400573 pfam08336 P4Ha_N Prolyl 4-Hydroxylase alpha-subunit, N-terminal region. The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known. 92
51229 400574 pfam08337 Plexin_cytopl Plexin cytoplasmic RasGAP domain. This family features the C-terminal regions of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in path finding and patterning of both neurons and developing blood vessels. The cytoplasmic region, which has been called a SEX domain in some members of this family, is involved in downstream signalling pathways, by interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins. This domain acts as a RasGAP domain. 504
51230 400575 pfam08338 DUF1731 Domain of unknown function (DUF1731). This domain of unknown function appears towards the C-terminus of proteins of the NAD dependent epimerase/dehydratase family (pfam01370) in bacteria, eukaryotes and archaea. Many of the proteins in which it is found are involved in cell-division inhibition. 44
51231 400576 pfam08339 RTX_C RTX C-terminal domain. This family describes the C-terminal region of various bacterial haemolysins and leukotoxins, which belong to the RTX family of toxins. These are produced by various Gram negative bacteria, such as E. coli and Actinobacillus pleuropneumoniae. RTX toxins may interact with lipopolysaccharide (LPS) to functionally impair and eventually kill leukocytes. This region is found in association with the RTX N-terminal domain (pfam02382) and multiple hemolysin-type calcium-binding repeats (pfam00353). 131
51232 400577 pfam08340 DUF1732 Domain of unknown function (DUF1732). This domain of unknown function is often found at the C-terminus of bacterial proteins, many of which are hypothetical, including proteins of the YicC family which have pfam03755 at the N-terminus. These include a protein important in the stationary phase of growth, and required for growth at high temperature. Structural modelling suggests this domain may bind nucleic acids. 85
51233 400578 pfam08341 TED Thioester domain. This domain is found near the N-terminus of a variety of bacterial surface proteins and pili. This domain contains an unusual covalent ester bond between a conserved cysteine and glutamine residue. 104
51234 400579 pfam08343 RNR_N Ribonucleotide reductase N-terminal. This domain is found at the N-terminus of bacterial ribonucleoside-diphosphate reductases (ribonucleotide reductases, RNRs) which catalyze the formation of deoxyribonucleotides. It occurs together with the RNR all-alpha domain (pfam00317) and the RNR barrel domain (pfam02867). 82
51235 400580 pfam08344 TRP_2 Transient receptor ion channel II. This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). 60
51236 400581 pfam08345 YscJ_FliF_C Flagellar M-ring protein C-terminal. This domain is found in bacterial flagellar M-ring (FliF) proteins together with the YscJ/FliF domain (pfam01514). 155
51237 400582 pfam08346 AntA AntA/AntB antirepressor. In E. coli the two proteins AntA and AntB have 62% amino acid identities near their N termini. AntA appears to be encoded by a truncated and divergent copy of AntB. The two proteins are homologous to putative antirepressors found in numerous bacteriophages, such as the hypothetical antirepressor protein encoded by the gene LO142 of the bacteriophage 933W. 68
51238 400583 pfam08347 CTNNB1_binding N-terminal CTNNB1 binding. This region tends to appear at the N-terminus of proteins also containing DNA-binding HMG (high mobility group) boxes (pfam00505) and appears to bind the armadillo repeat of CTNNB1 (beta-catenin), forming a stable complex. Signaling by Wnt through TCF/LEF is involved in developmental patterning, induction of neural tissues, cell fate decisions and stem cell differentiation. Isoforms of HMG T-cell factors lacking the N-terminal CTNNB1-binding domain cannot fulfill their role as transcriptional activators in T-cell differentiation. 206
51239 400584 pfam08348 PAS_6 YheO-like PAS domain. This family contains various hypothetical bacterial proteins that are similar to the E. coli protein YheO. Their function is unknown, but they are likely to be involved in signalling based on the presence of this PAS domain. 115
51240 400585 pfam08349 DUF1722 Protein of unknown function (DUF1722). This domain of unknown function is found in bacteria and archaea and is homologous to the hypothetical protein ybgA from E. coli. 115
51241 400586 pfam08350 DUF1724 Domain of unknown function (DUF1724). This domain of unknown function has so far only been found at the C-terminus of archaean proteins, including several transcriptional regulators of the ArsR family (see pfam01022). 62
51242 400587 pfam08351 DUF1726 Domain of unknown function (DUF1726). This domain of unknown function is often found at the N-terminus of proteins containing pfam05127. Its fold resembles that of pfam05127, but it does not appear to bind ATP. 92
51243 400588 pfam08352 oligo_HPY Oligopeptide/dipeptide transporter, C-terminal region. This family features a region found towards the C-terminus of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid. 65
51244 400589 pfam08353 DUF1727 Domain of unknown function (DUF1727). This domain of unknown function is found at the C-terminus of bacterial proteins which include UDP-N-acetylmuramyl tripeptide synthase and the related Mur ligase. 110
51245 369827 pfam08354 DUF1729 Domain of unknown function (DUF1729). This domain of unknown function is found in fatty acid synthase beta subunits together with the MaoC-like domain (pfam01575) and the Acyltransferase domain (pfam00698). The domain has been identified in fungi and bacteria. 353
51246 400590 pfam08355 EF_assoc_1 EF hand associated. This region typically appears on the C-terminus of EF hands in GTP-binding proteins such as Arht/Rhot (may be involved in mitochondrial homeostasis and apoptosis). The EF hand associated region is found in yeast, vertebrates and plants. 69
51247 400591 pfam08356 EF_assoc_2 EF hand associated. This region predominantly appears near EF-hands (pfam00036) in GTP-binding proteins. It is found in all three eukaryotic kingdoms. 85
51248 254756 pfam08357 SEFIR SEFIR domain. This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways. 150
51249 149427 pfam08358 Flexi_CP_N Carlavirus coat. This domain is found together with the viral coat protein domain (pfam00286) in coat/capsid proteins of Carlaviruses infecting plants. 52
51250 285548 pfam08359 TetR_C_4 YsiA-like protein, C-terminal region. The members of this family are thought to be TetR-type transcriptional regulators that bear particular similarity to YsiA, a hypothetical protein expressed by B. subtilis. 133
51251 400592 pfam08360 TetR_C_5 QacR-like protein, C-terminal region. This family features the C-terminal region of a number of proteins that bear similarity to the QacR protein, a transcriptional regulator of the TetR family. QacR is able to bind various environmental agents, which include a number of cationic lipophilic compounds, and thus regulate the transcription of QacA, a multidrug efflux pump. The C-terminal region contains the multifaceted, expansive drug-binding pocket, which is composed of several separate, but linked, binding sites. 131
51252 369831 pfam08361 TetR_C_2 MAATS-type transcriptional repressor, C-terminal region. This family is named after the various transcriptional regulatory proteins that it contains, including MtrR, AcrR, ArpR, TtgR and SmeT. These are members of the TetR family of transcriptional repressors, that are involved in the control of expression of multidrug resistance proteins. 121
51253 400593 pfam08362 TetR_C_3 YcdC-like protein, C-terminal region. This family comprises proteins that belong to the TetR family of transcriptional regulators. They bear particular similarity to YcdC, a putative HTH-containing protein. This family features the C-terminal region of these sequences, which does not include the helix-turn-helix. 143
51254 400594 pfam08363 GbpC Glucan-binding protein C. This domain is found in the Streptococcus Glucan-binding protein C (GbpC) and also in surface protein antigen (Spa)-family proteins which show sequence similarity to GbpC. 260
51255 400595 pfam08364 IF2_assoc Bacterial translation initiation factor IF-2 associated region. Most of the sequences in this alignment come from bacterial translation initiation factors (IF-2, also pfam04760), but the domain is also found in the eukaryotic translation initiation factor 4 gamma in yeast and in a hypothetical Euglenozoa protein of unknown function. 39
51256 400596 pfam08365 IGF2_C Insulin-like growth factor II E-peptide. This domain is found at the C-terminal domain of the insulin-like growth factor II (IGF-2, also see pfam00049) in vertebrates and seems to represent the E-peptide. 56
51257 400597 pfam08366 LLGL LLGL2. This domain is found in lethal giant larvae homolog 2 (LLGL2) proteins and syntaxin-binding proteins like tomosyn. It has been identified in eukaryotes and tends to be found together with WD repeats (pfam00400). 102
51258 400598 pfam08367 M16C_assoc Peptidase M16C associated. This domain appears in eukaryotes as well as bacteria and tends to be found near the C-terminus of the metalloprotease M16C (pfam05193). 245
51259 400599 pfam08368 FAST_2 FAST kinase-like protein, subdomain 2. This family represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases (EC:2.7.1.-) that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis. Note that many family members are hypothetical proteins. This subdomain is often found associated with the FAST kinase-like protein, subdomain 2. 87
51260 400600 pfam08369 PCP_red Proto-chlorophyllide reductase 57 kD subunit. This domain is found in bacteria and plant chloroplast proteins. It often appears at the C-terminal of Nitrogenase component 1 type Oxidoreductases (pfam00148) and sometimes independently in bacterial proteins such as the Proto-chlorophyllide reductase 57 kD subunit of the Cyanobacterium Synechocystis. 44
51261 400601 pfam08370 PDR_assoc Plant PDR ABC transporter associated. This domain is found on the C-terminus of ABC-2 type transporter domains (pfam01061). It seems to be associated with the plant pleiotropic drug resistance (PDR) protein family of ABC transporters. Like in yeast, plant PDR ABC transporters may also play a role in the transport of antifungal agents [pfam06422]. The PDR family is characterized by a configuration in which the ABC domain is nearer the N-terminus of the protein than the transmembrane domain. 62
51262 337028 pfam08372 PRT_C Plant phosphoribosyltransferase C-terminal. This domain is found at the C-terminus of phosphoribosyltransferases and phosphoribosyltransferase-like proteins. It contains putative transmembrane regions. It often appears together with calcium-ion dependent C2 domains (pfam00168). 156
51263 400602 pfam08373 RAP RAP domain. This domain is found in various eukaryotic species, where it is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain. The domain is involved in plant defense in response to bacterial infection. 58
51264 400603 pfam08374 Protocadherin Protocadherin. The structure of protocadherins is similar to that of classic cadherins (pfam00028), but particularly on the cytoplasmic domains they also have some unique features. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localized mainly at cell-cell contact sites. Their expression seems to be developmentally regulated. 217
51265 400604 pfam08375 Rpn3_C Proteasome regulatory subunit C-terminal. This eukaryotic domain is found at the C-terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function. It occurs together with the PCI/PINT domain (pfam01399). 57
51266 400605 pfam08376 NIT Nitrate and nitrite sensing. The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure. 234
51267 369841 pfam08377 MAP2_projctn MAP2/Tau projection domain. This domain is found in the MAP2/Tau family of proteins which includes MAP2, MAP4, Tau, and their homologs. All isoforms contain a conserved C-terminal domain containing tubulin-binding repeats (pfam00418), and a N-terminal projection domain of varying size. This domain has a net negative charge and exerts a long-range repulsive force. This provides a mechanism that can regulate microtubule spacing which might facilitate efficient organelle transport. 1134
51268 400606 pfam08378 NERD Nuclease-related domain. The nuclease-related domain (NERD) is found in a range of bacterial as well as archaeal and plant proteins. It has distant similarity to endonucleases (hence its name) and its predicted secondary structure is helix - sheet - sheet - sheet - sheet - weak sheet/long loop - helix - sheet - sheet. The majority of NERD-containing proteins are single-domain, but in several cases proteins containing NERD have additional domains which in 75% of cases are involved in DNA processing. 108
51269 400607 pfam08379 Bact_transglu_N Bacterial transglutaminase-like N-terminal region. This region is found towards the N-terminus of various archaeal and bacterial hypothetical proteins. Some of these are annotated as being transglutaminase-like proteins, and in fact contain a transglutaminase-like superfamily domain (pfam01841). 80
51270 400608 pfam08381 BRX Transcription factor regulating root and shoot growth via Pin3. The BREVIS RADIX (BRX) domain was characterized as being a transcription factor in plants regulating the extent of cell proliferation and elongation in the growth zone of the root. BRX is rate limiting for auxin-responsive gene-expression by mediating cross-talk with the brassinosteroid pathway. BRX has a ubiquitous, although quantitatively variable role in modulating the growth rate in both the root and the shoot. The family features a short region of alpha-helix, approximately 60 residues in length, which is found repeated up to three times. BRX is expressed in the vasculature and is rate-limiting for transcriptional auxin action. 56
51271 400609 pfam08383 Maf_N Maf N-terminal region. This region is found in various leucine zipper transcription factors of the Maf family. These are implicated in the regulation of insulin gene expression, in erythroid differentiation, and in differentiation of the neuroretina. 34
51272 400610 pfam08384 NPP Pro-opiomelanocortin, N-terminal region. This family features the N-terminal peptide of pro-opiomelanocortin (NPP). It is thought to represent an important pituitary peptide, given its high yield from pituitary glands, and exhibits a potent in vitro aldosterone-stimulating activity. 43
51273 400611 pfam08385 DHC_N1 Dynein heavy chain, N-terminal region 1. Dynein heavy chains interact with other heavy chains to form dimers, and with intermediate chain-light chain complexes to form a basal cargo binding unit. The region featured in this family includes the sequences implicated in mediating these interactions. It is thought to be flexible and not to adopt a rigid conformation. 559
51274 312032 pfam08386 Abhydrolase_4 TAP-like protein. This is a family of putative bacterial peptidases and hydrolases that bear similarity to a tripeptidyl aminopeptidase isolated from Streptomyces lividans. A member of this family is thought to be involved in the C-terminal processing of propionicin F, a bacteriocin characterized from Propionibacterium freudenreichii. 98
51275 400612 pfam08387 FBD FBD. This region is found in F-box (pfam00646) and other domain containing plant proteins; it is repeated in two family members. Its precise function is unknown, but it is thought to be associated with nuclear processes. In fact, several family members are annotated as being similar to transcription factors. 46
51276 400613 pfam08388 GIIM Group II intron, maturase-specific domain. This region is found mainly in various bacterial and archaeal species, but a few members of this family are expressed by fungal and chlamydomonal species. It has been implicated in the binding of intron RNA during reverse transcription and splicing. 80
51277 400614 pfam08389 Xpo1 Exportin 1-like protein. The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus. 147
51278 400615 pfam08390 TRAM1 TRAM1-like protein. This family comprises sequences that are similar to human TRAM1. This is a transmembrane protein of the endoplasmic reticulum, thought to be involved in the membrane transfer of secretory proteins. The region featured in this family is found N-terminal to the longevity-assurance protein region (pfam03798). 63
51279 400616 pfam08391 Ly49 Ly49-like protein, N-terminal region. The sequences making up this family are annotated as, or are similar to, Ly49 receptors. These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g. Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (pfam00059). 120
51280 400617 pfam08392 FAE1_CUT1_RppA FAE1/Type III polyketide synthase-like protein. The members of this family are described as 3-ketoacyl-CoA synthases, type III polyketide synthases, fatty acid elongases and fatty acid condensing enzymes, and are found in both prokaryotic and eukaryotic (mainly plant) species. The region featured in this family contains the active site residues, as well as motifs involved in substrate binding. 290
51281 400618 pfam08393 DHC_N2 Dynein heavy chain, N-terminal region 2. Dyneins are described as motor proteins of eukaryotic cells, as they can convert energy derived from the hydrolysis of ATP to force and movement along cytoskeletal polymers, such as microtubules. This region is found C-terminal to the dynein heavy chain N-terminal region 1 (pfam08385) in many members of this family. No functions seem to have been attributed specifically to this region. 331
51282 285580 pfam08394 Arc_trans_TRASH Archaeal TRASH domain. This region is found in the C-terminus of a number of archaeal transcriptional regulators. It is thought to function as a metal-sensing regulatory module. 37
51283 400619 pfam08395 7tm_7 7tm Chemosensory receptor. This family includes a number of gustatory and odorant receptors mainly from insect species such as A. gambiae and D. melanogaster. They are classified as G-protein-coupled receptors (GPCRs), or seven-transmembrane receptors. They show high sequence divergence, consistent with an ancient origin for the family. 370
51284 254775 pfam08396 Toxin_34 Spider toxin omega agatoxin/Tx1 family. The Tx1 family lethal spider neurotoxin induces excitatory symptoms in mice. 75
51285 400620 pfam08397 IMD IRSp53/MIM homology domain. The N-terminal predicted helical stretch of the insulin receptor tyrosine kinase substrate p53 (IRSp53) is an evolutionary conserved F-actin bundling domain involved in filopodium formation. The domain has been named IMD after the IRSp53 and missing in metastasis (MIM) proteins in which it occurs. Filopodium-inducing IMD activity is regulated by Cdc42 and Rac1 and is SH3-independent. 218
51286 285583 pfam08398 Parvo_coat_N Parvovirus coat protein VP1. This is the N-terminal region of the Parvovirus VP1 coat protein. Also see Parvovirus coat protein VP2 (pfam00740). 63
51287 400621 pfam08399 VWA_N VWA N-terminal. This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits. 123
51288 285585 pfam08400 phage_tail_N Prophage tail fibre N-terminal. This domain is found at the N-terminus of prophage tail fibre proteins. 134
51289 400622 pfam08401 DUF1738 Domain of unknown function (DUF1738). This region is found in a number of bacterial hypothetical proteins. Some members are annotated as being similar to replication primases, and in fact this region is often found together with the Toprim domain (pfam01751). 127
51290 400623 pfam08402 TOBE_2 TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulphate. Found in ABC transporters immediately after the ATPase domain. In this family a strong RPE motif is found at the presumed N-terminus of the domain. 73
51291 400624 pfam08403 AA_permease_N Amino acid permease N-terminal. This domain is found to the N-terminus of the amino acid permease domain (pfam00324) in metazoan Na-K-Cl cotransporters. 69
51292 285589 pfam08404 Baculo_p74_N Baculoviridae P74 N-terminal. This domain is found at the N-terminus of P74 occlusion-derived virus (ODV) envelope proteins which are required for oral infectivity. The envelope proteins are found in baculoviruses which are insect pathogens. The C-terminus of P74 is anchored to the membrane whereas the N-terminus is exposed to the virion surface. Furthermore P74 is unusual for a virus envelope protein as it lacks an N-terminal localization signal sequence. Also see pfam04583. 300
51293 285590 pfam08405 Calici_PP_N Viral polyprotein N-terminal. This domain is found at the N-terminus of non-structural viral polyproteins of the Caliciviridae subfamily. 358
51294 400625 pfam08406 CbbQ_C CbbQ/NirQ/NorQ C-terminal. This domain is found at the C-terminus of proteins of the CbbQ/NirQ/NorQ family of proteins which play a role in the post-translational activation of Rubisco. It is also found in the Thauera aromatica TutH protein which is similar to the CbbQ/NirQ/NorQ family, as well as in putative chaperones. The ATPase family associated with various cellular activities (AAA) pfam07728 is found in the same bacterial and archaeal proteins as the domain described here. 85
51295 400626 pfam08407 Chitin_synth_1N Chitin synthase N-terminal. This is the N-terminal domain of Chitin synthase (pfam01644). 70
51296 369858 pfam08408 DNA_pol_B_3 DNA polymerase family B viral insert. This viral domain is found between the exonuclease domain of the DNA polymerase family B (pfam03104) and the pfam00136 domain, connecting the two. 128
51297 400627 pfam08409 DUF1736 Domain of unknown function (DUF1736). This domain of unknown function is found in various hypothetical metazoan proteins. 74
51298 400628 pfam08410 DUF1737 Domain of unknown function (DUF1737). This domain of unknown function is found at the N-terminus of bacterial and viral hypothetical proteins. 51
51299 400629 pfam08411 Exonuc_X-T_C Exonuclease C-terminal. This bacterial domain is found at the C-terminus of Exodeoxyribonuclease I/Exonuclease I (pfam00929), which is a single-strand specific DNA nuclease affecting recombination and expression pathways. The exonuclease I protein in E. coli is associated with DNA deoxyribophosphodiesterase (dRPase). 267
51300 400630 pfam08412 Ion_trans_N Ion transport protein N-terminal. This metazoan domain is found to the N-terminus of pfam00520 in voltage- and cyclic nucleotide-gated K/Na ion channels. 43
51301 400631 pfam08414 NADPH_Ox Respiratory burst NADPH oxidase. This domain is found in plant proteins such as respiratory burst NADPH oxidase proteins which produce reactive oxygen species as a defense mechanism. It tends to occur to the N-terminus of an EF-hand (pfam00036), which suggests a direct regulatory effect of Ca2+ on the activity of the NADPH oxidase in plants. 100
51302 400632 pfam08416 PTB Phosphotyrosine-binding domain. The phosphotyrosine-binding domain (PTB, also phosphotyrosine-interaction or PI domain) in the protein tensin tends to be found at the C-terminus. Tensin is a multi-domain protein that binds to actin filaments and functions as a focal-adhesion molecule (focal adhesions are regions of plasma membrane through which cells attach to the extracellular matrix). Human tensin has actin-binding sites, an SH2 (pfam00017) domain and a region similar to the tumor suppressor PTEN. The PTB domain interacts with the cytoplasmic tails of beta integrin by binding to an NPXY motif. 128
51303 400633 pfam08417 PaO Pheophorbide a oxygenase. This domain is found in bacterial and plant proteins to the C-terminus of a Rieske 2Fe-2S domain (pfam00355). One of the proteins the domain is found in is Pheophorbide a oxygenase (PaO) which seems to be a key regulator of chlorophyll catabolism. Arabidopsis PaO (AtPaO) is a Rieske-type 2Fe-2S enzyme that is identical to Arabidopsis accelerated cell death 1 and homologous to lethal leaf spot 1 (LLS1) of maize, in which the domain described here is also found. 89
51304 400634 pfam08418 Pol_alpha_B_N DNA polymerase alpha subunit B N-terminal. This is the eukaryotic DNA polymerase alpha subunit B N-terminal domain which is involved in complex formation. Also see pfam04058. 240
51305 400635 pfam08421 Methyltransf_13 Putative zinc binding domain. This domain is found at the N-terminus of bacterial methyltransferases and contains four conserved cysteines suggesting a potential zinc binding domain. 62
51306 400636 pfam08423 Rad51 Rad51. Rad51 is a DNA repair and recombination protein and is a homolog of the bacterial ATPase RecA protein. 255
51307 400637 pfam08424 NRDE-2 NRDE-2, necessary for RNA interference. This is a family of eukaryotic proteins. Eukaryotic cells express a wide variety of endogenous small regulatory RNAs that regulate heterochromatin formation, developmental timing, defense against parasitic nucleic acids, and genome rearrangement. Many small regulatory RNAs are thought to function in nuclei, and in plants and fungi small interfering (si)RNAs associate with nascent transcripts and direct chromatin and/or DNA modifications. This family protein, NRDE-2, is required for small interfering (si)RNA-mediated silencing in nuclei. NRDE-2 associates with the Argonaute protein NRDE-3 within nuclei and is recruited by NRDE-3/siRNA complexes to nascent transcripts that have been targeted by RNA interference, RNAi, the process whereby double-stranded RNA (dsRNA) directs the sequence-specific degradation of mRNA. 315
51308 400638 pfam08426 ICE2 ICE2. ICE2 is a fungal ER protein which has been shown to play an important role in forming/maintaining the cortical ER. It has also been identified as a protein which is necessary for nuclear inner membrane targeting. 402
51309 400639 pfam08427 DUF1741 Domain of unknown function (DUF1741). This is a eukaryotic domain of unknown function. 230
51310 400640 pfam08428 Rib Rib/alpha-like repeat. The region featured in this family is found repeated in a number of bacterial surface proteins, such as Rib and alpha. These are expressed by group B streptococci, and Rib is thought to confer protective immunity. 76
51311 400641 pfam08429 PLU-1 PLU-1-like protein. Sequences in this family bear similarity to the central region of PLU-1. This is a nuclear protein that may have a role in DNA-binding and transcription, and is closely associated with the malignant phenotype of breast cancer. This region is found in various other Jumonji/ARID domain-containing proteins (see pfam02373, pfam01388). 334
51312 369872 pfam08430 Forkhead_N Forkhead N-terminal region. The region described in this family is found towards the N-terminus of various eukaryotic forkhead/HNF-3-related transcription factors (which contain the pfam00250 domain). These proteins play key roles in embryogenesis, maintenance of differentiated cell states, and tumorigenesis. 139
51313 400642 pfam08432 Vfa1 AAA-ATPase Vps4-associated protein 1. Vps Four-Associated 1, Vfa1, in yeast, is an endosomal protein that interacts with the AAA-ATPase Vps4. It would seem to be involved in regulating the trafficking of other proteins to the endocytic vacuole. There is a CCCH zinc finger at the N-terminus. 179
51314 400643 pfam08433 KTI12 Chromatin associated protein KTI12. This is a family of chromatin associated proteins which interact with the Elongator complex, a component of the elongating form of RNA polymerase II. The Elongator complex has histone acetyltransferase activity. 269
51315 400644 pfam08434 CLCA Calcium-activated chloride channel N terminal. The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. In addition to their role as chloride channels, some CLCA proteins function as adhesion molecules and may also have roles as tumor suppressors. This protein cleaves itself into an N-terminal portion and a C-terminal portion. The N-terminus contains an HEXXHXXXGXXDE motif which is essential for proteolytic cleavage. 266
51316 400645 pfam08435 Calici_coat_C Calicivirus coat protein C-terminal. This is the calicivirus coat protein (pfam00915) C-terminal region. 222
51317 400646 pfam08436 DXP_redisom_C 1-deoxy-D-xylulose 5-phosphate reductoisomerase C-terminal. This domain is found to the C-terminus of pfam02670 domains in bacterial and plant 1-deoxy-D-xylulose 5-phosphate reductoisomerases which catalyze the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH. 84
51318 400647 pfam08437 Glyco_transf_8C Glycosyl transferase family 8 C-terminal. This domain is found at the C-terminus of the pfam01501 domain in bacterial glucosyltransferase and galactosyltransferase proteins. 54
51319 400648 pfam08438 MMR_HSR1_C GTPase of unknown function C-terminal. This domain is found at the C-terminus of pfam01926 in archaeal and eukaryotic GTP-binding proteins. The C-terminal domain of the GTP-binding proteins is necessary for the complete activity of the protein, namely interacting with the 50S ribosome and binding both adenine and guanine nucleotides, with a preference for guanine nucleotides. 109
51320 400649 pfam08439 Peptidase_M3_N Oligopeptidase F. This domain is found to the N-terminus of the pfam01432 domain in bacterial and archaeal proteins including Oligoendopeptidase F. An example of this protein is Lactococcus lactis PepF. 70
51321 285618 pfam08440 Poty_PP Potyviridae polyprotein. This domain is found in polyproteins of the viral Potyviridae taxon. 277
51322 400650 pfam08441 Integrin_alpha2 Integrin alpha. This domain is found in integrin alpha and integrin alpha precursors to the C-terminus of a number of pfam01839 repeats and to the N-terminus of the pfam00357 cytoplasmic region. This region is composed of three immunoglobulin-like domains. 449
51323 400651 pfam08442 ATP-grasp_2 ATP-grasp domain. 202
51324 369879 pfam08443 RimK RimK-like ATP-grasp domain. This ATP-grasp domain is found in the ribosomal S6 modification enzyme RimK. 188
51325 117021 pfam08444 Gly_acyl_tr_C Aralkyl acyl-CoA:amino acid N-acyltransferase, C-terminal region. This family features the C-terminal region of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13. 89
51326 117022 pfam08445 FR47 FR47-like protein. The members of this family are similar to the C-terminal region of the D. melanogaster hypothetical protein FR47. This protein has been found to consist of two N-acyltransferase-like domains swapped with the C-terminal strands. 86
51327 400652 pfam08446 PAS_2 PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. 107
51328 369881 pfam08447 PAS_3 PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. 89
51329 312075 pfam08448 PAS_4 PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. 110
51330 312076 pfam08449 UAA UAA transporter family. This family includes transporters with a specificity for UDP-N-acetylglucosamine. 302
51331 400653 pfam08450 SGL SMP-30/Gluconolactonase/LRE-like region. This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30), gluconolactonase and luciferin-regenerating enzyme (LRE). SMP-30 is known to hydrolyze diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 and LRE. 246
51332 400654 pfam08451 A_deaminase_N Adenosine/AMP deaminase N-terminal. This domain is found to the N-terminus of the Adenosine/AMP deaminase domain (pfam00962) in metazoan proteins such as the Cat eye syndrome critical region protein 1 and its homologs. 95
51333 369884 pfam08452 DNAP_B_exo_N DNA polymerase family B exonuclease domain, N-terminal. This domain is found in viral DNA polymerases to the N-terminus of DNA polymerase family B exonuclease domains (pfam03104). 21
51334 400655 pfam08453 Peptidase_M9_N Peptidase family M9 N-terminal. This domain is found in microbial collagenase metalloproteases to the N-terminus of pfam01752. 183
51335 400656 pfam08454 RIH_assoc RyR and IP3R Homology associated. This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815. 98
51336 400657 pfam08455 SNF2_assoc Bacterial SNF2 helicase associated. This domain is found in bacterial proteins of the SWF/SNF/SWI helicase family to the N-terminus of the SNF2 family N-terminal domain (pfam00176) and together with the Helicase conserved C-terminal domain (pfam00271). The function of the domain is not clear. 369
51337 285632 pfam08456 Vmethyltransf_C Viral methyltransferase C-terminal. This domain is found to the C-terminus of the viral methyltransferase domain (pfam01660) in single-stranded-RNA positive-strand viruses with no DNA stage in the Virgaviridae family. 230
51338 400658 pfam08457 Sfi1 Sfi1 spindle body protein. This is a family of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins which are present in microtubule-organising centers. 570
51339 400659 pfam08458 PH_2 Plant pleckstrin homology-like region. This family describes a pleckstrin homology (PH)-like region found in several plant proteins of unknown function. 104
51340 400660 pfam08459 UvrC_HhH_N UvrC Helix-hairpin-helix N-terminal. This domain is found in the C subunits of the bacterial and archaeal UvrABC system which catalyzes nucleotide excision repair in a multi-step process. UvrC catalyzes the first incision on the fourth or fifth phosphodiester bond 3' and on the eighth phosphodiester bond 5' from the damage that is to be excised. The domain described here is found to the N-terminus of a helix hairpin helix (pfam00633) motif and also co-occurs with the pfam01541 catalytic domain which is found at the N-terminus of the same proteins. 150
51341 285636 pfam08460 SH3_5 Bacterial SH3 domain. 65
51342 285637 pfam08461 HTH_12 Ribonuclease R winged-helix domain. This domain is found at the amino terminus of Ribonuclease R and a number of presumed transcriptional regulatory proteins from archaebacteria. 66
51343 117039 pfam08462 Carmo_coat_C Carmovirus coat protein. This domain is found to the C-terminus of the pfam00729 domain in Carmoviruses. 99
51344 400661 pfam08463 EcoEI_R_C EcoEI R protein C-terminal. The restriction enzyme EcoEI recognizes 5'-GAGN(7)ATGC-3' and is composed of the three proteins R, M, and S. The domain described here is found at the C-terminus of the R protein (HsdR) which is required for both nuclease and ATPase activity. 158
51345 369889 pfam08464 Gemini_AC4_5_2 Geminivirus AC4/5 conserved region. This domain is found in replication initiator (Rep) associated proteins such as AC5 in the Geminivirus/Begomovirus. 43
51346 285639 pfam08465 Herpes_TK_C Thymidine kinase from Herpesvirus C-terminal. This domain is found towards the C-terminus in Herpesvirus Thymidine kinases. 33
51347 400662 pfam08466 IRK_N Inward rectifier potassium channel N-terminal. This metazoan domain is found to the N-terminus of the pfam01007 domain in Inward rectifier potassium channels (KIR2 or IRK2). 45
51348 285641 pfam08467 Luteo_P1-P2 Luteovirus RNA polymerase P1-P2/replicase. This domain is found in RNA-dependent RNA polymerase P1-P2 fusion/replicase proteins in plant Luteoviruses. 339
51349 400663 pfam08468 MTS_N Methyltransferase small domain N-terminal. This domain is found to the N-terminus of the methyltransferase small domain (pfam05175) in bacterial proteins. 157
51350 400664 pfam08469 NPHI_C Nucleoside triphosphatase I C-terminal. This viral domain is found to the C-terminus of Poxvirus nucleoside triphosphatase phosphohydrolase I (NPH I) together with the helicase conserved C-terminal domain (pfam00271). 148
51351 400665 pfam08470 NTNH_C Nontoxic nonhaemagglutinin C-terminal. Bacteria of the Clostridium genus produce protein neurotoxins, which are complexes consisting of neurotoxin (NT), haemagglutinin (HA), nontoxic nonhaemagglutinin (NTNH), and RNA. The domain described here is found at the C-terminus of the NTNH component. 162
51352 400666 pfam08471 Ribonuc_red_2_N Class II vitamin B12-dependent ribonucleotide reductase. This domain is found to the N-terminus of the ribonucleotide reductase barrel domain (pfam02867). It occurs in bacterial class II ribonucleotide reductase proteins which depend upon coenzyme B12 (deoxyadenosylcobalamine). 99
51353 369893 pfam08472 S6PP_C Sucrose-6-phosphate phosphohydrolase C-terminal. This is the Sucrose-6-phosphate phosphohydrolase (S6PP or SPP) C-terminal domain as found in plant sucrose phosphatases. These enzymes irreversibly catalyze the last step in sucrose synthesis following the formation of Sucrose-6-Phosphate via sucrose-phosphate synthase (SPS). 133
51354 400667 pfam08473 VGCC_alpha2 Neuronal voltage-dependent calcium channel alpha 2acd. This eukaryotic domain has been found in the neuronal voltage-dependent calcium channel (VGCC) alpha 2a, 2c, and 2d subunits. It is also found in other calcium channel alpha-2 delta subunits to the C-terminus of a Cache domain (pfam02743). 430
51355 400668 pfam08474 MYT1 Myelin transcription factor 1. This domain is found in the myelin transcription factor 1 (MYT1) of chordates. MYT1 contains C2HC zinc finger domains (pfam01530) and is expressed in developing neurons of the central nervous system where it is involved in the selection of neuronal precursor cells. 236
51356 285649 pfam08475 Baculo_VP91_N Viral capsid protein 91 N-terminal. This domain is found in Baculoviridae including the nucleopolyhedrovirus at the N-terminus of the viral capsid protein 91 (VP91). 192
51357 285650 pfam08476 VD10_N Viral D10 N-terminal. This domain is found on the N-terminus of the viral protein D10 (VD10) and the related MutT motif proteins. The VD10 protein is probably essential for virus replication and is often found to the N-terminus of a pfam00293 domain. 41
51358 400669 pfam08477 Roc Ras of Complex, Roc, domain of DAPkinase. Roc, or Ras of Complex, proteins are mitochondrial Rho proteins (Miro-1, and Miro-2) and atypical Rho GTPases. Full-length proteins have a unique domain organisation, with tandem GTP-binding domains and two EF hand domains (pfam00036) that may bind calcium. They are also larger than classical small GTPases. It has been proposed that they are involved in mitochondrial homeostasis and apoptosis. 120
51359 400670 pfam08478 POTRA_1 POTRA domain, FtsQ-type. FtsQ/DivIB bacterial division proteins (pfam03799) contain an N-terminal POTRA domain (for polypeptide-transport-associated domain). This is found in different types of proteins, usually associated with a transmembrane beta-barrel. FtsQ/DivIB may have chaperone-like roles, which has also been postulated for the POTRA domain in other contexts. 69
51360 369898 pfam08479 POTRA_2 POTRA domain, ShlB-type. The POTRA domain (for polypeptide-transport-associated domain) is found towards the N-terminus of ShlB family proteins (pfam03865). ShlB is important in the secretion and activation of the haemolysin ShlA. It has been postulated that the POTRA domain has a chaperone-like function over ShlA; it may fold back into the C-terminal beta-barrel channel. 76
51361 285654 pfam08480 Disaggr_assoc Disaggregatase related. This domain is found in disaggregatases and several hypothetical proteins of the archaeal genus Methanosarcina. Disaggregatases cause aggregates to separate into single cells and contain parallel beta-helix repeats. Also see pfam06848. 194
51362 400671 pfam08481 GBS_Bsp-like GBS Bsp-like repeat. This domain is found as a repeat in a number of Streptococcus proteins including some hypothetical proteins and Bsp. Bsp is a protein of group B Streptococcus (GBS) which might control cell morphology. 89
51363 400672 pfam08482 HrpB_C ATP-dependent helicase C-terminal. This domain is found near the C-terminus of bacterial ATP-dependent helicases such as HrpB. 126
51364 378004 pfam08483 IstB_IS21_ATP IstB-like ATP binding N-terminal. This bacterial domain is found to the N-terminus of the pfam01695 like ATP binding domain in proteins which are putative transposase subunits. 28
51365 400673 pfam08484 Methyltransf_14 C-methyltransferase C-terminal domain. This domain is found in bacterial C-methyltransferase proteins. This domain is found C-terminal to methyltransferase domains such as pfam08241 or pfam08242. But this domain is not a methyltransferase. 160
51366 400674 pfam08485 Polysacc_syn_2C Polysaccharide biosynthesis protein C-terminal. This domain is found to the C-terminus of the pfam02719 domain in bacterial polysaccharide biosynthesis enzymes including the capsule protein CapD and several putative epimerases/dehydratases. 48
51367 400675 pfam08486 SpoIID Stage II sporulation protein. This domain is found in the stage II sporulation protein SpoIID. SpoIID is necessary for membrane migration as well as for some of the earlier steps in engulfment during bacterial endospore formation. The domain is also found in amidase enhancer proteins. Amidases, like SpoIID, are cell wall hydrolases. 100
51368 400676 pfam08487 VIT Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. 111
51369 254827 pfam08488 WAK Wall-associated kinase. This domain is found together with the eukaryotic protein kinase domain pfam00069 in plant wall-associated kinases (WAKs) and related proteins. WAKs are serine-threonine kinases which might be involved in signalling to the cytoplasm and are required for cell expansion. 103
51370 400677 pfam08489 DUF1743 Domain of unknown function (DUF1743). This domain of unknown function is found in many hypothetical proteins and predicted DNA-binding proteins such as transcription-associated proteins. It is found in bacteria and archaea. 116
51371 400678 pfam08490 DUF1744 Domain of unknown function (DUF1744). This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C terminal to pfam03104 and pfam00136. 400
51372 400679 pfam08491 SE Squalene epoxidase. This domain is found in squalene epoxidase (SE) and related proteins which are found in taxonomically diverse groups of eukaryotes and also in bacteria. SE was first cloned from Saccharomyces cerevisiae where it was named ERG1. It contains a putative FAD binding site and is a key enzyme in the sterol biosynthetic pathway. Putative transmembrane regions are found to the protein's C-terminus. 276
51373 400680 pfam08492 SRP72 SRP72 RNA-binding domain. This region has been identified as the binding site of the SRP72 protein to SRP RNA. 57
51374 400681 pfam08493 AflR Aflatoxin regulatory protein. This domain is found in the aflatoxin regulatory protein (AflR) which is involved in the regulation of the biosynthesis of aflatoxin in the fungal genus Aspergillus. It occurs together with the fungal Zn(2)-Cys(6) binuclear cluster domain (pfam00172). 274
51375 400682 pfam08494 DEAD_assoc DEAD/H associated. This domain is found in ATP-dependent helicases as well as a number of hypothetical proteins together with the helicase conserved C-terminal domain (pfam00270) and the pfam00271 domain. 182
51376 400683 pfam08495 FIST FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids. 130
51377 400684 pfam08496 Peptidase_S49_N Peptidase family S49 N-terminal. This domain is found to the N-terminus of bacterial signal peptidases of the S49 family (pfam01343). 147
51378 400685 pfam08497 Radical_SAM_N Radical SAM N-terminal. This domain tends to occur to the N-terminus of the pfam04055 domain in hypothetical bacterial proteins. 298
51379 400686 pfam08498 Sterol_MT_C Sterol methyltransferase C-terminal. This domain is found to the C-terminus of a methyltransferase domain (pfam08241) in fungal and plant sterol methyltransferases. 63
51380 400687 pfam08499 PDEase_I_N 3'5'-cyclic nucleotide phosphodiesterase N-terminal. This domain is found to the N-terminus of the calcium/calmodulin-dependent 3'5'-cyclic nucleotide phosphodiesterase domain (pfam00233). 57
51381 149522 pfam08500 Tombus_P33 Tombusvirus p33. Tombusviruses, which replicate in a wide range of plant hosts, replicate with the help of viral replicase protein including the overlapping p33 and p92 proteins which contain the domain described here. 142
51382 400688 pfam08501 Shikimate_dh_N Shikimate dehydrogenase substrate binding domain. This domain is the substrate binding domain of shikimate dehydrogenase. 83
51383 400689 pfam08502 LeuA_dimer LeuA allosteric (dimerization) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyzes the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich. 112
51384 400690 pfam08503 DapH_N Tetrahydrodipicolinate succinyltransferase N-terminal. This domain is found at the N-terminus of tetrahydrodipicolinate N-succinyltransferase (DapH) which catalyzes the acylation of L-2-amino-6-oxopimelate to 2-N-succinyl-6-oxopimelate in the meso-diaminopimelate/lysine biosynthetic pathway of bacteria, blue-green algae, and plants. The N-terminal domain as defined here contains three alpha-helices and two twisted hairpin loops. 83
51385 400691 pfam08504 RunxI Runx inhibition domain. This domain lies to the C-terminus of Runx-related transcription factors and homologous proteins (AML, CBF-alpha, PEBP2). Its function might be to interact with functional cofactors. 98
51386 400692 pfam08505 MMR1 Mitochondrial Myo2 receptor-related protein. Myo2p, a class V myosin, is essential for mitochondrial distribution, class V being vital for organelle distribution in S. cerevisiae. It is the myosin essential for mitochondrial distribution. The established mechanism for distribution of cellular components by class V myosins is that they interact with the cargo at the C-terminal tail domain and transport it along the actin cytoskeleton using the N-terminal motor domain. Cargo-specific myosin receptors act as the link between the myosin tail and cargo. Myo2 binds with MMR1 (mitochondrial Myo2p receptor-related 1), the receptor on cargo, via the C-terminal domain. 267
51387 369909 pfam08506 Cse1 Cse1. This domain is present in Cse1 nuclear export receptor proteins. Cse1 mediates the nuclear export of importin alpha. This domain contains HEAT repeats. 370
51388 400693 pfam08507 COPI_assoc COPI associated protein. Proteins in this family colocalize with COPI vesicle coat proteins. 130
51389 400694 pfam08508 DUF1746 Fungal domain of unknown function (DUF1746). This is a fungal domain of unknown function. 116
51390 369912 pfam08509 Ad_cyc_g-alpha Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins. 48
51391 400695 pfam08510 PIG-P PIG-P. PIG-P (phosphatidylinositol N-acetylglucosaminyltransferase subunit P) is an enzyme involved in GPI anchor biosynthesis. 120
51392 400696 pfam08511 COQ9 COQ9. COQ9 is an enzyme that is required for the biosynthesis of coenzyme Q. It may either catalyze a reaction in the coenzyme Q biosynthetic pathway or have a regulatory role. 73
51393 400697 pfam08512 Rtt106 Histone chaperone Rttp106-like. This family includes Rttp106, a histone chaperone involved in heterochromatin-mediated silencing. This domain belongs to the Pleckstrin homology domain superfamily. 83
51394 369916 pfam08513 LisH LisH. The LisH (lis homology) domain mediates protein dimerization and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex. 25
51395 400698 pfam08514 STAG STAG domain. STAG domain proteins are subunits of cohesin complex - a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein, for example, mice and humans have a meiosis specific variant called STAG3, although budding yeast does not have a meiosis specific version. 108
51396 400699 pfam08515 TGF_beta_GS Transforming growth factor beta type I GS-motif. This motif is found in the transforming growth factor beta (TGF-beta) type I which regulates cell growth and differentiation. The name of the GS motif comes from its highly conserved GSGSGLP signature in the cytoplasmic juxtamembrane region immediately preceding the protein's kinase domain. Point mutations in the GS motif modify the signaling ability of the type I receptor. 28
51397 400700 pfam08516 ADAM_CR ADAM cysteine-rich. ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. 115
51398 400701 pfam08517 AXH Ataxin-1 and HBP1 module (AXH). AXH is a protein-protein and RNA binding motif found in Ataxin-1 (ATX1). ATX1 is responsible for the autosomal-dominant neurodegenerative disorder Spinocerebellar ataxia type-1 (SCA1) in humans. The AXH module has also been identified in the apparently unrelated transcription factor HBP1 which is thought to be involved in the architectural regulation of chromatin and in specific gene expression. 109
51399 400702 pfam08518 GIT_SHD Spa2 homology domain (SHD) of GIT. GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase. 29
51400 400703 pfam08519 RFC1 Replication factor RFC1 C terminal domain. This is the C terminal domain of replication factor C, RFC1. RFC complexes hydrolyze ATP and load sliding clamps such as PCNA (proliferating cell nuclear antigen) onto double-stranded DNA. RFC1 is essential for RFC function in vivo. 158
51401 400704 pfam08520 DUF1748 Fungal protein of unknown function (DUF1748). This is a family of fungal proteins of unknown function. 69
51402 400705 pfam08521 2CSK_N Two-component sensor kinase N-terminal. This domain is found in bacterial two-component sensor kinases towards the N-terminus. 140
51403 400706 pfam08522 DUF1735 Domain of unknown function (DUF1735). This domain of unknown function is found in a number of bacterial proteins including acylhydrolases. The structure of this domain has a beta-sandwich fold. 120
51404 400707 pfam08523 MBF1 Multiprotein bridging factor 1. This domain is found in the multiprotein bridging factor 1 (MBF1) which forms a heterodimer with MBF2. It has been shown to make direct contact with the TATA-box binding protein (TBP) and interacts with Ftz-F1, stabilizing the Ftz-F1-DNA complex. It is also found in the endothelial differentiation-related factor (EDF-1). Human EDF-1 is involved in the repression of endothelial differentiation, interacts with CaM and is phosphorylated by PKC. The domain is found in a wide range of eukaryotic proteins including metazoans, fungi and plants. A helix-turn-helix motif (pfam01381) is found to its C-terminus. 70
51405 400708 pfam08524 rRNA_processing rRNA processing. This is a family of proteins that are involved in rRNA processing. In a localization study they were found to localize to the nucleus and nucleolus. The family also includes other metazoa members from plants to mammals where the protein has been named BR22 and is associated with TTF-1, thyroid transcription factor 1. In the lungs, the family binds TTF-1 to form a complex which influences the expression of the key lung surfactant protein-B (SP-B) and -C (SP-C), the small hydrophobic surfactant proteins that maintain surface tension in alveoli. 142
51406 400709 pfam08525 OapA_N Opacity-associated protein A N-terminal motif. This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation. This motif occurs at the N-terminus of these proteins. It contains a conserved histidine followed by a run of hydrophobic residues. 28
51407 400710 pfam08526 PAD_N Protein-arginine deiminase (PAD) N-terminal domain. This family represents the N-terminal non-catalytic domain of protein-arginine deiminase. This domain has a cupredoxin-like fold. 113
51408 400711 pfam08527 PAD_M Protein-arginine deiminase (PAD) middle domain. This family represents the central non-catalytic domain of protein-arginine deiminase. This domain has an immunoglobulin-like fold. 159
51409 400712 pfam08528 Whi5 Whi5 like. In metazoans, cyclin-dependent kinase (CDK)-dependent phosphorylation of the retinoblastoma tumor suppressor protein (Rb) alleviates repression of E2F and thereby activates G1/S transcription. The cell size regulator Whi5 appears to be an analogous target of CDK activity during G1 phase. 25
51410 400713 pfam08529 NusA_N NusA N-terminal domain. This domain represents the RNA polymerase binding domain of NusA. 120
51411 400714 pfam08530 PepX_C X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain contains a beta sandwich domain. 216
51412 400715 pfam08531 Bac_rhamnosid_N Alpha-L-rhamnosidase N-terminal domain. This family consists of bacterial rhamnosidase A and B enzymes. This domain is probably involved in substrate recognition. 172
51413 369931 pfam08532 Glyco_hydro_42M Beta-galactosidase trimerisation domain. This is the non-catalytic domain B of beta-galactosidase enzymes belonging to the glycosyl hydrolase 42 family. This domain is related to glutamine amidotransferase enzymes, but the catalytic residues are replaced by non-functional amino acids. This domain is involved in trimerisation. 207
51414 400716 pfam08533 Glyco_hydro_42C Beta-galactosidase C-terminal domain. This domain is found at the C-terminus of beta-galactosidase enzymes that belong to the glycosyl hydrolase 42 family. 58
51415 400717 pfam08534 Redoxin Redoxin. This family of redoxins includes peroxiredoxin, thioredoxin and glutaredoxin proteins. 148
51416 400718 pfam08535 KorB KorB domain. This family consists of several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon. This domain includes the DNA-binding HTH motif. 88
51417 400719 pfam08536 Whirly Whirly transcription factor. This family contains the plant whirly transcription factors. 136
51418 400720 pfam08537 NBP1 Fungal Nap binding protein NBP1. NBP1 is a nuclear protein which has been shown in Saccharomyces cerevisiae to be essential for the G2/M transition of the cell cycle. 332
51419 369936 pfam08538 DUF1749 Protein of unknown function (DUF1749). This is a plant and fungal family of unknown function. This family contains many hypothetical proteins. 299
51420 400721 pfam08539 HbrB HbrB-like. HbrB is involved in hyphal growth and polarity. 159
51421 400722 pfam08540 HMG_CoA_synt_C Hydroxymethylglutaryl-coenzyme A synthase C terminal. 280
51422 400723 pfam08541 ACP_syn_III_C 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III C terminal. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.41, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria. 90
51423 400724 pfam08542 Rep_fac_C Replication factor C C-terminal domain. This is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A. 87
51424 400725 pfam08543 Phos_pyr_kin Phosphomethylpyrimidine kinase. This enzyme EC:2.7.4.7 is part of the Thiamine pyrophosphate (TPP) synthesis pathway, TPP is an essential cofactor for many enzymes. 246
51425 378019 pfam08544 GHMP_kinases_C GHMP kinases C terminal. This family includes homoserine kinases, galactokinases and mevalonate kinases. 86
51426 400726 pfam08545 ACP_syn_III 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.180, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria. 80
51427 400727 pfam08546 ApbA_C Ketopantoate reductase PanE/ApbA C terminal. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway. 125
51428 400728 pfam08547 CIA30 Complex I intermediate-associated protein 30 (CIA30). This protein is associated with mitochondrial Complex I intermediate-associated protein 30 (CIA30) in human and mouse. The family is also present in Schizosaccharomyces pombe which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. This means it is possible that this family of protein may not be directly involved in oxidative phosphorylation. 156
51429 400729 pfam08548 Peptidase_M10_C Peptidase M10 serralysin C terminal. Serralysins are peptidases related to mammalian matrix metallopeptidases (MMPs). The peptidase unit is found at the N terminal while this domain at the C terminal forms a corkscrew and is thought to be important for secretion of the protein through the bacterial cell wall. This domain contains the calcium ion binding domain pfam00353. 221
51430 400730 pfam08549 SWI-SNF_Ssr4 Fungal domain of unknown function (DUF1750). This is a fungal domain of unknown function. 714
51431 400731 pfam08550 DUF1752 Fungal protein of unknown function (DUF1752). This is a family of fungal proteins of unknown function. This short section domain is bounded by two highly conserved tryptophans. The family contains MKD1 that is thought to be a negative regulator of RAS-cAMP pathway in S. cerevisiae. The S. pombe member is a GAF1 transcription factor that is also associated with the zinc finger family GATA pfam00320. 28
51432 369945 pfam08551 DUF1751 Eukaryotic integral membrane protein (DUF1751). This domain is found in eukaryotic integral membrane proteins. YOL107W, a Saccharomyces cerevisiae protein, has been shown to localize to COP II vesicles. 99
51433 400732 pfam08552 Kei1 Inositolphosphorylceramide synthase subunit Kei1. Kei1 is a subunit of Saccharomyces cerevisiae inositol phosphorylceramide (IPC) synthase. It is localized to the Golgi and is cleaved by the late Golgi processing endopeptidase Kex2. Kei1 is essential for both the activity and the Golgi localization of IPC synthase. 181
51434 400733 pfam08553 VID27 VID27 cytoplasmic protein. This is a family of fungal and plant proteins and contains many hypothetical proteins. VID27 is a cytoplasmic protein that plays a potential role in vacuolar protein degradation. 356
51435 369948 pfam08555 DUF1754 Eukaryotic family of unknown function (DUF1754). This is a eukaryotic protein family of unknown function. 91
51436 400734 pfam08557 Lipid_DES Sphingolipid Delta4-desaturase (DES). Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes. Sphingolipid delta 4-desaturase catalyzes the formation of (E)-sphing-4-enine. Some proteins in this family have bifunctional delta 4-desaturase/C-4-hydroxylase activity. Delta 4-desaturated sphingolipids may play a role in early signalling required for entry into meiotic and spermatid differentiation pathways during Drosophila spermatogenesis. This small domain associates with FA_desaturase pfam00487 and appears to be specific to sphingolipid delta 4-desaturase. 37
51437 400735 pfam08558 TRF Telomere repeat binding factor (TRF). Telomere repeat binding factor (TRF) family proteins are important for the regulation of telomere stability. The two related human TRF proteins hTRF1 and hTRF2 form homodimers and bind directly to telomeric TTAGGG repeats via the myb DNA binding domain pfam00249 at the carboxy terminus. TRF1 is implicated in telomere length regulation and TRF2 in telomere protection. Other telomere complex associated proteins are recruited through their interaction with either TRF1 or TRF2. The fission yeast protein Taz1p (telomere-associated in Schizosaccharomyces pombe) has similarity to both hTRF1 and hTRF2 and may perform the dual functions of TRF1 and TRF2 at fission yeast telomeres. This domain is composed of multiple alpha helices arranged in a solenoid conformation similar to TPR repeats. The fungal members have now also been found to carry two double strand telomeric repeat binding factors. 212
51438 400736 pfam08559 Cut8 Cut8, nuclear proteasome tether protein. In Schizosaccharomyces pombe, Cut8 is a nuclear envelope protein that physically interacts with and tethers 26S proteasome in the nucleus resulting in the nuclear accumulation of proteasome. Cut8 comprises three functional domains: an N-terminal lysine-rich segment which binds to the proteasome when ubiquitinated, a central dimerization domain and a C-terminal nine-helix bundle (structure 3q5w) which shows structural similarity to 14-3-3 phosphoprotein-binding domains. The helical bundle is necessary for liposome and cholesterol binding. Cut8 is a proteasome substrate and the N-terminal segment is polyubiquitinated and functions as a degron tag. Ubiquitination of the N-terminal segment is essential for the function of Cut8. Lysine residues in the N-terminal segment of Cut8 are required for physical interaction with proteasome. In fission yeast the function of Cut8 has been demonstrated to be regulated by ubiquitin-conjugating Rhp6/Ubc2/Rad6 and ligating enzymes Ubr1. Cut8 homologs have been identified in Drosophila melanogaster, Anopheles gambiae and Dictyostelium discoideum. 196
51439 400737 pfam08560 DUF1757 Protein of unknown function (DUF1757). This family of proteins are about 150 amino acids in length and have no known function. 147
51440 400738 pfam08561 Ribosomal_L37 Mitochondrial ribosomal protein L37. This family includes yeast MRPL37 a mitochondrial ribosomal protein. 88
51441 400739 pfam08562 Crisp Crisp. This domain is found on Crisp proteins which contain pfam00188 and has been termed the Crisp domain. It is found in the mammalian reproductive tract and the venom of reptiles, and has been shown to regulate ryanodine receptor Ca2+ signalling. It contains 10 conserved cysteines which are all involved in disulphide bonds and is structurally related to the ion channel inhibitor toxins BgK and ShK. 55
51442 400740 pfam08563 P53_TAD P53 transactivation motif. The binding of the p53 transactivation domain by regulatory proteins regulates p53 transcription activation. This motif is comprised of a single amphipathic alpha helix and contains a highly conserved sequence. 25
51443 400741 pfam08564 CDC37_C Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (heat shock protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37 pfam03234. 87
51444 400742 pfam08565 CDC37_M Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain pfam03234, which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 pfam08564 whose function is unclear. 113
51445 400743 pfam08566 Pam17 Mitochondrial import protein Pam17. The presequence translocase-associated motor (PAM) drives the completion of preprotein translocation into the mitochondrial matrix. The Pam17 subunit is required for formation of a stable complex between cochaperones Pam16 and Pam18 and promotes the association of Pam16-Pam18 with the presequence translocase. Mitochondria lacking Pam17 are selectively impaired in the import of matrix proteins. 165
51446 400744 pfam08567 PH_TFIIH TFIIH p62 subunit, N-terminal domain. The N-terminal domain of the TFIIH basal transcription factor complex p62 subunit (BTF2-p62) forms an interaction with the 3' endonuclease XPG, which is essential for activity. The 3' endonuclease XPG is a major component of the nucleotide excision repair machinery. The structure of the N-terminal domain reveals that it adopts a pleckstrin homology (PH) fold. This PH-type domain has been shown to bind to a mono-phosphorylated inositide. 88
51447 400745 pfam08568 Kinetochor_Ybp2 Uncharacterized protein family, YAP/Alf4/glomulin. This entry contains a number of protein families with apparently unrelated functions. These include the YAP binding proteins of yeasts. These are stress response and redox homeostasis proteins, induced by hydrogen peroxide or induced in response to alkylating agent methyl methanesulphonate (MMS). The family includes Aberrant root formation protein 4 (Alf4) of Arabidopsis thaliana (Mouse-ear cress), which is required for the initiation of lateral roots independent from auxin signalling. It may also function in maintaining the pericycle in the mitotically competent state needed for lateral root formation. The family includes glomulin (FAP68), which is essential for normal development of the vasculature and may represent a naturally occurring ligand of the immunophilins FKBP59 and FKBP12. 624
51448 400746 pfam08569 Mo25 Mo25-like. Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast Mo25 is localized alternately to the spindle pole body and to the site of cell division in a cell cycle dependent manner. 328
51449 400747 pfam08570 DUF1761 Protein of unknown function (DUF1761). Family of conserved fungal and bacterial membrane proteins with unknown function. 122
51450 400748 pfam08571 Yos1 Yos1-like. In yeast, Yos1 is a subunit of the Yip1p-Yif1p complex and is required for transport between the endoplasmic reticulum and the Golgi complex. Yos1 appears to be conserved in eukaryotes. 80
51451 400749 pfam08572 PRP3 pre-mRNA processing factor 3 (PRP3). Pre-mRNA processing factor 3 (PRP3) is a U4/U6-associated splicing factor. The human PRP3 has been implicated in autosomal retinitis pigmentosa. 218
51452 400750 pfam08573 SAE2 DNA repair protein endonuclease SAE2/CtIP C-terminus. SAE2 is a protein involved in repairing meiotic and mitotic double-strand breaks in DNA. It has been shown to negatively regulate DNA damage checkpoint signalling. SAE2 is homologous to the CtIP proteins in mammals and an homologous protein in plants. Crucial sequence motifs that are highly conserved are the CxxC and the RHR motifs in this C-terminal part of the protein. It is now known to be an endonuclease. In budding yeast, genetic evidence suggests that the SAE2 protein is essential for the processing of hairpin DNA intermediates and meiotic double-strand breaks by Mre11/Rad50 complexes. SAE2 binds DNA and exhibits endonuclease activity on single-stranded DNA independently of Mre11/Rad50 complexes, but hairpin DNA structures are cleaved cooperatively in the presence of Mre11/Rad50 or Mre11/Rad50/Xrs2. Hairpin structures are not processed at the tip by SAE2 but rather at single-stranded DNA regions adjacent to the hairpin. The catalytic activities of SAE2 are important for its biological functions. 110
51453 400751 pfam08574 Iwr1 Transcription factor Iwr1. Iwr1 is involved in transcription from polymerase II promoters; it interacts with most of the polymerase II subunits. Deletion of this protein results in hypersensitivity to the K1 killer toxin. 73
51454 400752 pfam08576 DUF1764 Eukaryotic protein of unknown function (DUF1764). This is a family of eukaryotic proteins of unknown function. This family contains many hypothetical proteins. 99
51455 400753 pfam08577 PI31_Prot_C PI31 proteasome regulator. PI31 is a cellular regulator of proteasome formation and of proteasome-mediated antigen processing. 80
51456 400754 pfam08578 DUF1765 Protein of unknown function (DUF1765). This region represents a conserved region found in hypothetical proteins from fungi, mycetozoa and entamoebidae. 125
51457 369970 pfam08579 RPM2 Mitochondrial ribonuclease P subunit (RPM2). Ribonuclease P (RNase P) generates mature tRNA molecules by cleaving their 5' ends. RPM2 is a protein subunit of the yeast mitochondrial RNase P. It has the ability to act as transcriptional activator in the nucleus where it plays a role in defining the steady-state levels of mRNAs for some nucleus-encoded mitochondrial components. 119
51458 369971 pfam08580 KAR9 Yeast cortical protein KAR9. The KAR9 protein in Saccharomyces cerevisiae is a cytoskeletal protein required for karyogamy, correct positioning of the mitotic spindle and for orientation of cytoplasmic microtubules. KAR9 localizes at the shmoo tip in mating cells and at the tip of the growing bud in anaphase. 683
51459 400755 pfam08581 Tup_N Tup N-terminal. The N-terminal domain of the Tup protein has been shown to interact with the Ssn6 transcriptional co-repressor. 77
51460 400756 pfam08583 Cmc1 Cytochrome c oxidase biogenesis protein Cmc1 like. Cmc1 is a metallo-chaperone like protein which is known to localize to the inner mitochondrial membrane in Saccharomyces cerevisiae. It is essential for full expression of cytochrome c oxidase and respiration. Cmc1 contains two Cx9C motifs and is able to bind copper(I). Cmc1 is thought to play a role in mitochondrial copper trafficking and transfer to cytochrome c oxidase. 70
51461 400757 pfam08584 Ribonuc_P_40 Ribonuclease P 40kDa (Rpp40) subunit. The tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule and at least eight protein subunits. Subunits hpop1, Rpp21, Rpp29, Rpp30, Rpp38, and Rpp40 (this entry) are involved in extensive, but weak, protein-protein interactions in the holoenzyme complex. 277
51462 400758 pfam08585 RMI1_N RecQ mediated genome instability protein. RMI1_N is an N-terminal family of eukaryotic proteins. The domain probably carries an oligo-nucleotide-binding domain or OB-fold, and forms a stable complex with Bloom syndrome protein BLM and DNA topoisomerase 3-alpha. 193
51463 400759 pfam08586 Rsc14 RSC complex, Rsc14/Ldb7 subunit. RSC is an ATP-dependent chromatin remodelling complex found in yeast. The RSC components Rsc7/Npl6 and Rsc14/Ldb7 interact physically and/or functionally with Rsc3, Rsc30, and Htl1 to form a module important for a broad range of RSC functions. 99
51464 400760 pfam08587 UBA_2 Ubiquitin associated domain (UBA). This is a UBA (ubiquitin associated) domain. Ubiquitin is involved in intracellular proteolysis. 45
51465 400761 pfam08588 DUF1769 Protein of unknown function (DUF1769). Family of fungal proteins with unknown function. 55
51466 400762 pfam08589 DUF1770 Fungal protein of unknown function (DUF1770). The function of this family is unknown. These proteins are rather dissimilar except for a single strongly conserved motif (PDLRFEQ). 98
51467 400763 pfam08590 DUF1771 Domain of unknown function (DUF1771). This domain is always found adjacent to pfam01713. 62
51468 400764 pfam08591 RNR_inhib Ribonucleotide reductase inhibitor. This family includes S. pombe Spd1. Spd1p inhibits fission yeast RNR activity by interacting with the Cdc22p. 95
51469 400765 pfam08592 DUF1772 Domain of unknown function (DUF1772). This domain is of unknown function. 136
51470 400766 pfam08593 DUF1773 Domain of unknown function. This is the C-terminal part of some meiotically up-regulated gene products from fission yeast. The actual function is not yet known but the proteins are likely to be cell-surface glycoproteins. 58
51471 369983 pfam08594 UPF0300 Uncharacterized protein family (UPF0300). This family of proteins appears to be specific to S. pombe. 212
51472 400767 pfam08595 RXT2_N RXT2-like, N-terminal. The family represents the N-terminal region of RXT2-like proteins. In S. cerevisiae, RXT2 has been demonstrated to be involved in conjugation with cellular fusion (mating) and invasive growth. A high throughput localization study has localized RXT2 to the nucleus. 139
51473 400768 pfam08596 Lgl_C Lethal giant larvae(Lgl) like, C-terminal. The Lethal giant larvae (Lgl) tumor suppressor family is conserved from yeast to mammals. The Lgl family functions in cell polarity, at least in part, by regulating SNARE-mediated membrane delivery events at the cell surface. The N-terminal half of Lgl members contains WD40 repeats (see pfam00400), while the C-terminal half appears specific to the family. 393
51474 400769 pfam08597 eIF3_subunit Translation initiation factor eIF3 subunit. This is a family of proteins which are subunits of the eukaryotic translation initiation factor 3 (eIF3). In yeast it is called Hcr1. The Saccharomyces cerevisiae protein HCR1 has been shown to be required for processing of 20S pre-rRNA and binds to 18S rRNA and eIF3 subunits Rpg1p and Prt1p. 242
51475 400770 pfam08598 Sds3 Sds3-like. Repression of gene transcription is mediated by histone deacetylases containing repressor-co-repressor complexes, which are recruited to promoters of target genes via interactions with sequence-specific transcription factors. The co-repressor complex contains a core of at least seven proteins. This family represents the conserved region found in Sds3, Dep1 and BRMS1-homolog p40 proteins. 214
51476 400771 pfam08599 Nbs1_C DNA damage repair protein Nbs1. This C terminal region of the DNA damage repair protein Nbs1 has been identified to be necessary for the binding of Mre11 and Tel1. 62
51477 369989 pfam08600 Rsm1 Rsm1-like. Rsm1 is a protein involved in mRNA export from the nucleus. 97
51478 369990 pfam08601 PAP1 Transcription factor PAP1. The transcription factor Pap1 regulates antioxidant-gene transcription in response to H2O2. This region is cysteine rich. Alkylation of cysteine residues following treatment with a cysteine alkylating agent can mask the accessibility of the nuclear exporter Crm1, triggering nuclear accumulation and Pap1 dependent transcriptional expression. 363
51479 369991 pfam08602 Mgr1 Mgr1-like, i-AAA protease complex subunit. The S. cerevisiae Mgr1 protein has been shown to be required for mitochondrial viability in yeast lacking mitochondrial DNA. It is a mitochondrial inner membrane protein, which interacts with Yme1 and is a new subunit of the i-AAA protease complex. 374
51480 400772 pfam08603 CAP_C Adenylate cyclase associated (CAP) C terminal. 156
51481 400773 pfam08604 Nup153 Nucleoporin Nup153-like. This family contains both the nucleoporin Nup153 from human and Nup154 from fission yeast. These have been demonstrated to be functionally equivalent. 501
51482 369994 pfam08605 Rad9_Rad53_bind Fungal Rad9-like Rad53-binding. In Saccharomyces cerevisiae Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via Rad53 pfam00498 domains. This region is structurally composed of a pair of TUDOR domains. 129
51483 400774 pfam08606 Prp19 Prp19/Pso4-like. This region is found specifically in PRP19-like proteins. The region represented by this family covers the sequence implicated in self-interaction and a coiled-coil motif. PRP19-like proteins form an oligomer that is necessary for spliceosome assembly. 65
51484 400775 pfam08608 Wyosine_form Wyosine base formation. Some proteins in this family appear to be important in wyosine base formation in a subset of phenylalanine specific tRNAs. It has been proposed that they participate in converting tRNA(Phe)-m(1)G(37) to tRNA(Phe)-yW. 63
51485 400776 pfam08609 Fes1 Nucleotide exchange factor Fes1. Fes1 is a cytosolic homolog of Sls1, an ER protein which has nucleotide exchange factor activity. Fes1 in yeast has been shown to bind to the molecular chaperone Hsp70 and has adenyl-nucleotide exchange factor activity. 89
51486 400777 pfam08610 Pex16 Peroxisomal membrane protein (Pex16). Pex16 is a peripheral protein located at the matrix face of the peroxisomal membrane. 346
51487 400778 pfam08611 DUF1774 Fungal protein of unknown function (DUF1774). This is a fungal family of unknown function. 95
51488 400779 pfam08612 Med20 TATA-binding related factor (TRF) of subunit 20 of Mediator complex. This family of proteins is related to TATA-binding protein (TBP). TBP is a highly conserved RNA polymerase II general transcription factor that binds to the core promoter and initiates assembly of the preinitiation complex. Human TRF has been shown to associate with an RNA polymerase II-SRB complex. This Med20 subunit of Mediator is found in the non-essential part of the head. 200
51489 400780 pfam08613 Cyclin Cyclin. This family includes many different cyclin proteins. Members include the G1/S-specific cyclin pas1, and the phosphate system cyclin PHO80/PHO85. 149
51490 400781 pfam08614 ATG16 Autophagy protein 16 (ATG16). Autophagy is a ubiquitous intracellular degradation system for eukaryotic cells. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. ATG16 (also known as Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate in the yeast autophagy pathway. 176
51491 400782 pfam08615 RNase_H2_suC Ribonuclease H2 non-catalytic subunit (Ylr154p-like). This entry represents the non-catalytic subunit of RNase H2, which in S. cerevisiae is Ylr154p/Rnh203p. Whereas bacterial and archaeal RNases H2 are active as single polypeptides, the Saccharomyces cerevisiae homolog, Rnh2Ap, when expressed in Escherichia coli, fails to produce an active RNase H2. For RNase H2 activity three proteins are required [Rnh2Ap (Rnh201p), Ydr279p (Rnh202p) and Ylr154p (Rnh203p)]. Deletion of any one of the proteins or mutations in the catalytic site in Rnh2A leads to loss of RNase H2 activity. RNase H2 is an endonuclease that specifically degrades the RNA of RNA:DNA hybrids. It participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers during DNA replication. 133
51492 400783 pfam08616 SPA Stabilization of polarity axis. Yeast AFI1 has been shown to interact with the outer plaque of the spindle pole body. In Aspergillus nidulans the protein member is necessary for stabilization of the polarity axes during septation, and in S. cerevisiae it functions as a polarisation-specific docking factor. 113
51493 400784 pfam08617 CGI-121 Kinase binding protein CGI-121. CGI-121 has been shown to bind to the p53-related protein kinase (PRPK). PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumor suppressor protein p53. CGI-121 is part of a conserved protein complex, KEOPS. The KEOPS complex is involved in telomere uncapping and telomere elongation. Interestingly this family also includes archaeal homologs, formerly in the DUF509 family. A structure for these proteins has been solved by structural genomics. 159
51494 370005 pfam08618 Opi1 Transcription factor Opi1. Opi1 is a leucine zipper containing yeast transcription factor that negatively regulates phospholipid biosynthesis. It represses the expression of several UAS(INO) cis acting element containing genes and its activity is mediated by phosphorylations catalyzed by protein kinase A, protein kinase C and casein kinase II. 416
51495 400785 pfam08619 Nha1_C Alkali metal cation/H+ antiporter Nha1 C-terminus. The C-terminus of the plasma membrane Nha1 antiporter plays an important role in the immediate cell response to hypo-osmotic shock which prevents an excessive loss of ions and water. This domain is found with pfam00999. 323
51496 400786 pfam08620 RPAP1_C RPAP1-like, C-terminal. Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the C-terminal region that contains the motif GLHHH. This region is conserved from yeast to humans. 69
51497 400787 pfam08621 RPAP1_N RPAP1-like, N-terminal. Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the N-terminal region of RPAP-1 that is conserved from yeast to humans. 45
51498 400788 pfam08622 Svf1 Svf1-like N-terminal lipocalin domain. Family of proteins that are involved in survival during oxidative stress. This entry corresponds to the N-terminal lipocalin domain of a pair. 162
51499 400789 pfam08623 TIP120 TATA-binding protein interacting (TIP120). TIP120 (also known as cullin-associated and neddylation-dissociated protein 1) is a TATA binding protein interacting protein that enhances transcription. 165
51500 400790 pfam08624 CRC_subunit Chromatin remodelling complex Rsc7/Swp82 subunit. This family has been identified as a subunit of chromatin remodelling complexes. Saccharomyces cerevisiae NPL6 and its paralogue SWP82 have been identified as subunits of the RSC chromatin remodelling complex, and SWI/SNF chromatin remodelling complex respectively. 134
51501 400791 pfam08625 Utp13 Utp13 specific WD40 associated domain. Utp13 is a component of the five protein Pwp2 complex that forms part of a stable particle subunit independent of the U3 small nucleolar ribonucleoprotein that is essential for the initial assembly steps of the 90S pre-ribosome. Pwp2 is capable of interacting directly with the 35 S pre-rRNA 5' end. 141
51502 400792 pfam08626 TRAPPC9-Trs120 Transport protein Trs120 or TRAPPC9, TRAPP II complex subunit. This region is found at the N terminal of Saccharomyces cerevisiae Trs120 protein. Trs120 is a subunit of the multiprotein complex TRAPP (transport particle protein) which functions in ER to Golgi traffic. Trs120 is specific to the larger TRAPP complex, TRAPP II, along with Trs65p and Trs130p(TRAPPC10). It is suggested that Trs120p is required for the stability of the Trs130p subunit, suggesting that these two proteins might interact in some way. It is likely that there is a complex function for TRAPP II in multiple pathways. 1220
51503 400793 pfam08627 CRT-like CRT-like, chloroquine-resistance transporter-like. This region is found in proteins related to Plasmodium falciparum chloroquine resistance transporter (CRT). 331
51504 400794 pfam08628 Nexin_C Sorting nexin C terminal. This region is found at the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787. 111
51505 400795 pfam08629 PDE8 PDE8 phosphodiesterase. This region is found in members of the PDE8 phosphodiesterase family. It is found with pfam00233. 52
51506 400796 pfam08630 Dfp1_Him1_M Dfp1/Him1, central region. This is the middle region described by Ogino et al. This region, together with the C-terminal zinc finger (pfam07535), is essential for the mitotic and kinase activation functions of Dfp1/Him1. 128
51507 400797 pfam08631 SPO22 Meiosis protein SPO22/ZIP4 like. SPO22/ZIP4 in yeast is a meiosis specific protein involved in sporulation. It has been shown to regulate crossover distribution by promoting synaptonemal complex formation. 272
51508 370018 pfam08632 Zds_C Activator of mitotic machinery Cdc14 phosphatase activation C-term. This region of the Zds1 protein is critical for sporulation and has also been shown to suppress the calcium sensitivity of Zds1 deletions. The C-terminal motif is common to both Zds1 and Zds2 proteins, both of which are putative interactors of Cdc55 and are required for the completion of mitotic exit and cytokinesis. They both contribute to timely Cdc14 activation during mitotic exit and are required downstream of separase to facilitate nucleolar Cdc14 release. 49
51509 400798 pfam08633 Rox3 Rox3 mediator complex subunit. The mediator complex is part of the RNA polymerase II holoenzyme. Rox3 is a subunit of the mediator complex. 163
51510 400799 pfam08634 Pet127 Mitochondrial protein Pet127. Pet127 has been implicated in mitochondrial RNA stability and/or processing and is localized to the mitochondrial membrane. The Pet127 family is part of the PD-(D/E)XK nuclease superfamily including a full set of active site residues. 275
51511 117208 pfam08635 ox_reductase_C Putative oxidoreductase C terminal. This is the C terminal of a family of putative oxidoreductases. 142
51512 400800 pfam08636 Pkr1 ER protein Pkr1. Pkr1 has been identified as an ER protein of unknown function. 69
51513 400801 pfam08637 NCA2 ATP synthase regulation protein NCA2. NCA2 has been shown to be required for the regulation of ATP synthase subunits Atp6p and Atp8p in Saccharomyces cerevisiae. 288
51514 400802 pfam08638 Med14 Mediator complex subunit MED14. Saccharomyces cerevisiae RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes and sporulation, required for glucose repression, HO repression, RME1 repression and sporulation. This subunit is also found in higher eukaryotes and Med14 is the agreed unified nomenclature for this subunit. Med14 is found in the tail region of Mediator. 192
51515 400803 pfam08639 SLD3 DNA replication regulator SLD3. The SLD3 DNA replication regulator is required for loading and maintenance of Cdc45 on chromatin during DNA replication. 534
51516 400804 pfam08640 U3_assoc_6 U3 small nucleolar RNA-associated protein 6. This is a family of U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA. 77
51517 400805 pfam08641 Mis14 Kinetochore protein Mis14 like. Mis14 is a kinetochore protein which is known to be recruited to kinetochores independently of CENP-A. 131
51518 400806 pfam08642 Rxt3 Histone deacetylation protein Rxt3. Rxt3 has been shown in yeast to be required for histone deacetylation. 113
51519 370028 pfam08643 DUF1776 Fungal family of unknown function (DUF1776). This is a fungal family of unknown function. One of the proteins in this family YSC83 has been localized to the mitochondria. 295
51520 400807 pfam08644 SPT16 FACT complex subunit (SPT16/CDC68). Proteins in this family are subunits of the FACT complex. The FACT complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin. 151
51521 370030 pfam08645 PNK3P Polynucleotide kinase 3 phosphatase. Polynucleotide kinase 3 phosphatases play a role in the repair of single breaks in DNA induced by DNA-damaging agents such as gamma radiation and camptothecin. 161
51522 400808 pfam08646 Rep_fac-A_C Replication factor-A C terminal domain. This domain is found at the C terminal of replication factor A. Replication factor A (RPA) binds single-stranded DNA and is involved in replication, repair, and recombination of DNA. 146
51523 400809 pfam08647 BRE1 BRE1 E3 ubiquitin ligase. BRE1 is an E3 ubiquitin ligase that has been shown to act as a transcriptional activator through direct activator interactions. 95
51524 400810 pfam08648 DUF1777 Protein of unknown function (DUF1777). This is a family of eukaryotic proteins of unknown function. Some of the proteins in this family are putative nucleic acid binding proteins. 56
51525 400811 pfam08649 DASH_Dad1 DASH complex subunit Dad1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules. Dad1 remains bound to kinetochores throughout the cell cycle and its association is dependent on Mis6 and Mal2. 55
51526 400812 pfam08650 DASH_Dad4 DASH complex subunit Dad4. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. 71
51527 400813 pfam08651 DASH_Duo1 DASH complex subunit Duo1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. 72
51528 400814 pfam08652 RAI1 RAI1 like PD-(D/E)XK nuclease. RAI1 is homologous to Caenorhabditis elegans DOM-3 and human DOM3Z and binds to a nuclear exoribonuclease. It is required for 5.8S rRNA processing. Profile-profile comparison tools demonstrate this to be a PD-(D/E)XK nuclease, with a full set of canonical active site signature motifs characteristic to the PD-(D/E)XK nuclease superfamily. 69
51529 400815 pfam08653 DASH_Dam1 DASH complex subunit Dam1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules. 56
51530 400816 pfam08654 DASH_Dad2 DASH complex subunit Dad2. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. 99
51531 400817 pfam08655 DASH_Ask1 DASH complex subunit Ask1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules. 64
51532 400818 pfam08656 DASH_Dad3 DASH complex subunit Dad3. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. 75
51533 400819 pfam08657 DASH_Spc34 DASH complex subunit Spc34. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules. 279
51534 400820 pfam08658 Rad54_N Rad54 N terminal. This is the N terminal of the DNA repair protein Rad54. 180
51535 370044 pfam08659 KR KR domain. This enzymatic domain is part of bacterial polyketide synthases and catalyzes the first step in the reductive modification of the beta-carbonyl centers in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group. 180
51536 370045 pfam08660 Alg14 Oligosaccharide biosynthesis protein Alg14 like. Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane. 171
51537 400821 pfam08661 Rep_fac-A_3 Replication factor A protein 3. Replication factor A is involved in eukaryotic DNA replication, recombination and repair. 105
51538 400822 pfam08662 eIF2A Eukaryotic translation initiation factor eIF2A. This is a family of eukaryotic translation initiation factors. 194
51539 400823 pfam08663 HalX HalX domain. HalX is a domain of unknown function, previously (mis)annotated as HoxA-like transcriptional regulator. 68
51540 400824 pfam08664 YcbB YcbB domain. YcbB is a DNA-binding domain. 136
51541 400825 pfam08665 PglZ PglZ domain. This family is a member of the Alkaline phosphatase clan. 176
51542 400826 pfam08666 SAF SAF domain. This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. 61
51543 400827 pfam08667 BetR BetR domain. This family includes an N-terminal helix-turn-helix domain. 148
51544 400828 pfam08668 HDOD HDOD domain. 196
51545 400829 pfam08669 GCV_T_C Glycine cleavage T-protein C-terminal barrel domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. 80
51546 400830 pfam08670 MEKHLA MEKHLA domain. The MEKHLA domain shares similarity with the PAS domain and is found in the 3' end of plant HD-ZIP III homeobox genes, and bacterial proteins. 141
51547 400831 pfam08671 SinI Anti-repressor SinI. SinR is a pleiotropic regulator of several late growth processes. It is a tetrameric DNA binding protein whose activity is down-regulated through the formation of a SinI:SinR protein complex. When complexed with SinI, the SinR tetramer is disrupted such that it is no longer able to bind DNA. 28
51548 400832 pfam08672 ANAPC2 Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyze the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein. 60
51549 400833 pfam08673 RsbU_N Phosphoserine phosphatase RsbU, N-terminal domain. RsbU is a phosphoserine phosphatase which acts as a positive regulator of the general stress-response factor of gram positive organisms, sigma-B. The phosphatase activity of RsbU is stimulated by association with the RsbT kinase. Deletions in the N terminal domain are deleterious to the activity of RsbU. 75
51550 400834 pfam08674 AChE_tetra Acetylcholinesterase tetramerisation domain. The acetylcholinesterase tetramerisation domain is found at the C-terminus and forms a left handed superhelix. 35
51551 400835 pfam08675 RNA_bind RNA binding domain. This domain corresponds to the RNA binding domain of Poly(A)-specific ribonuclease (PARN). 75
51552 400836 pfam08676 MutL_C MutL C terminal dimerization domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognizes mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerization. 144
51553 400837 pfam08677 GP11 GP11 baseplate wedge protein. GP11 is a viral structural protein that connects short tail fibers to the baseplate. The tail region is responsible for attachment to the host bacteria during infection. 252
51554 400838 pfam08678 Rsbr_N Rsbr N terminal. Rsbr is a regulator of the RNA polymerase sigma factor subunit sigma(B). The structure of the N terminal domain belongs to the globin fold superfamily. 130
51555 370055 pfam08679 DsrD Dissimilatory sulfite reductase D (DsrD). The structure of the DsrD protein has shown it to contain a winged-helix motif similar to those found in DNA binding proteins. The structure suggests a possible role for DsrD in transcription or translation of genes which catalyze dissimilatory sulfite reduction. 64
51556 400839 pfam08680 DUF1779 TATA-box binding. TATA-box_bdg is a family of bacterial proteins. YwmB from Bacillus subtilis contains a circularly permuted TATA-box binding protein-like fold. Jian-Xiang Liu, Qi Xie, Jun Lin. Protein Structural Data Mining and Evolutionary Bioinformatic Analysis on Domains of TATA-box Binding Protein-like Fold. Life Science Journal 2014; 11(2): 298-302 (not yet in PubMed 27-02-2014). 184
51557 400840 pfam08681 DUF1778 Protein of unknown function (DUF1778). This is a family of uncharacterized proteins. The structure of one of the hypothetical proteins in this family has been solved and it forms a helix structure which may form interactions with DNA. 80
51558 285845 pfam08682 DUF1780 Putative endonuclease, protein of unknown function (DUF1780). This is a family of uncharacterized proteins. The structure of a hypothetical protein from Pseudomonas aeruginosa has shown it to adopt an alpha/beta fold, placing it in the Endonuclease superfamily/clan of restriction endonucleases. 208
51559 400841 pfam08683 CAMSAP_CKK Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin. 119
51560 400842 pfam08684 ocr DNA mimic ocr. The structure of an ocr protein from bacteriophage T7 has shown that this protein mimics the size and shape of a bent DNA molecule. ocr has also been shown to be an inhibitor of the complex type I DNA restriction enzymes. 100
51561 400843 pfam08685 GON GON domain. The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. 198
51562 400844 pfam08686 PLAC PLAC (protease and lacunin) domain. The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. 31
51563 400845 pfam08687 ASD2 Apx/Shroom domain ASD2. This region is found in the actin binding protein Shroom which mediates apical constriction in epithelial cells and is required for neural tube closure. 288
51564 400846 pfam08688 ASD1 Apx/Shroom domain ASD1. This region is found in the actin binding protein Shroom which mediates apical constriction in epithelial cells and is required for neural tube closure. ASD1 has been implicated directly in F-actin binding. 170
51565 370064 pfam08689 Med5 Mediator complex subunit Med5. The mediator complex is required for the expression of nearly all RNA pol II dependent genes in Saccharomyces cerevisiae. Deletion of the MED5 gene leads to increased transcription of nuclear genes encoding components of the oxidative phosphorylation machinery, and decreased transcription of mitochondrial genes encoding components of the same machinery. There is no orthologue from pombe, and this subunit appears to be fungal specific. 1082
51566 400847 pfam08690 GET2 GET complex subunit GET2. This family corresponds to the GET complex subunit GET2. The GET complex is involved in the retrieval of ER resident proteins from the Golgi. 308
51567 370066 pfam08691 Nse5 DNA repair proteins Nse5 and Nse6. Nse5 and Nse6 are non essential nuclear proteins that are critical for chromosome segregation in fission yeast. Nse5 forms a dimer with Nse6 and facilitates DNA repair as part of the Smc5-Smc6 holocomplex. 503
51568 400848 pfam08692 Pet20 Mitochondrial protein Pet20. Pet20 is a mitochondrial protein which is thought to play a role in the correct assembly/maintenance of mitochondrial components. 134
51569 400849 pfam08693 SKG6 Transmembrane alpha-helix domain. SKG6/Axl2 are membrane proteins that show polarised intracellular localization. SKG6_Tmem is the highly conserved transmembrane alpha-helical domain of SKG6 and Axl2 proteins. The full-length fungal protein has a negative regulatory function in cytokinesis. 38
51570 400850 pfam08694 UFC1 Ubiquitin-fold modifier-conjugating enzyme 1. Ubiquitin-like (UBL) post-translational modifiers are covalently linked to most, if not all, target protein(s) through an enzymatic cascade analogous to ubiquitylation, consisting of E1 (activating), E2 (conjugating), and E3 (ligating) enzymes. Ubiquitin-fold modifier 1 (Ufm1) a ubiquitin-like protein is activated by a novel E1-like enzyme, Uba5, by forming a high-energy thioester bond. Activated Ufm1 is then transferred to its cognate E2-like enzyme, Ufc1, in a similar thioester linkage. This family represents the E2-like enzyme. 155
51571 400851 pfam08695 Coa1 Cytochrome oxidase complex assembly protein 1. Coa1 is an inner mitochondrial membrane protein that associates with Shy1 and is required for cytochrome oxidase complex IV assembly. It contains a conserved hydrophobic segment (amino acids 74-92) with the potential to form a membrane-spanning helix. The N-terminus of Coa1 is rich in positively charged amino acids and could form an amphipathic alpha helix, characteristic of a mitochondrial presequence. A cleavage site for the mitochondrial processing peptidase is predicted adjacent to the presequence. Upon in vitro import into mitochondria, Coa1 is processed to a mature form, indicating that it possesses a cleavable presequence. The eukaryotic cytochrome oxidase complex consists of 12-13 subunits, with three mitochondrial encoded subunits, Cox1-Cox3, forming the core enzyme. Translation of the Cox1 transcript requires the two promoters, Pet309 and Mss51, and the latter has an additional role in translational elongation. Coa1 is necessary for linking the activity of Mss51 to Cox1 insertion into the assembly complex. 117
51572 400852 pfam08696 Dna2 DNA replication factor Dna2. Dna2 is a DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease (5'-flap endonuclease) and helicase activities. It is required for Okazaki fragment processing and is involved in DNA repair pathways. 203
51573 400853 pfam08698 Fcf2 Fcf2 pre-rRNA processing. This is a family of eukaryotic nucleolar proteins that are involved in pre-rRNA processing. 94
51574 400854 pfam08699 ArgoL1 Argonaute linker 1 domain. ArgoL1 is a region found in argonaute proteins. It normally co-occurs with pfam02179 and pfam02171. It is a linker region between the N-terminal and the PAZ domains. It contains an alpha-helix packed against a three-stranded antiparallel beta-sheet with two long beta-strands (beta8 and beta9) of the sheet spanning one face of the adjacent N and PAZ domains. L1 together with linker 2, L2, PAZ and ArgoN forms a compact global fold. 52
51575 400855 pfam08700 Vps51 Vps51/Vps67. This family includes a presumed domain found in a number of components of vesicular transport. The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved in recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1. Cog1_N is the N-terminus of the Cog1 subunit of the eight-unit Conserved Oligomeric Golgi (COG) complex that participates in retrograde vesicular transport and is required to maintain normal Golgi structure and function. The subunits are located in two lobes and Cog1 serves to bind the two lobes together probably via the highly conserved N-terminal domain of approximately 85 residues. 86
51576 400856 pfam08701 GN3L_Grn1 GNL3L/Grn1 putative GTPase. Grn1 (yeast) and GNL3L (human) are putative GTPases which are required for growth and play a role in processing of nucleolar pre-rRNA. This family contains a potential nuclear localization signal. 74
51577 400857 pfam08702 Fib_alpha Fibrinogen alpha/beta chain family. Fibrinogen is a protein involved in platelet aggregation and is essential for the coagulation of blood. This domain forms part of the central coiled coil region of the protein which is formed from two sets of three non-identical chains (alpha, beta and gamma). 142
51578 400858 pfam08703 PLC-beta_C PLC-beta C terminal. This domain corresponds to the alpha helical C terminal domain of phospholipase C beta. 176
51579 312288 pfam08704 GCD14 tRNA methyltransferase complex GCD14 subunit. GCD14 is a subunit of the tRNA methyltransferase complex and is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA. 242
51580 312289 pfam08705 Gag_p6 Gag protein p6. HIV protein p6 contains two late-budding domains (L domains) which are short sequence motifs essential for viral particle release. p6 interacts with the endosomal sorting complex and represents a docking site for several cellular and binding factors. The PTAP motif interacts with the cellular budding factor TSG101. This domain is also found in some chimpanzee immunodeficiency virus (SIV-cpz) proteins. 37
51581 378029 pfam08706 D5_N D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases. 145
51582 400859 pfam08707 PriCT_2 Primase C terminal 2 (PriCT-2). This alpha helical domain is found at the C terminal of primases. 76
51583 400860 pfam08708 PriCT_1 Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases. 64
51584 400861 pfam08709 Ins145_P3_rec Inositol 1,4,5-trisphosphate/ryanodine receptor. This domain corresponds to the ligand binding region on inositol 1,4,5-trisphosphate receptor, and the N terminal region of the ryanodine receptor. Both receptors are involved in Ca2+ release. They can couple to the activation of neurotransmitter-gated receptors and voltage-gated Ca2+ channels on the plasma membrane, thus allowing the endoplasmic reticulum to discriminate between different types of neuronal activity. 213
51585 285872 pfam08710 nsp9 nsp9 replicase. nsp9 is a single-stranded RNA-binding viral protein likely to be involved in RNA synthesis. Its structure comprises a single beta barrel. 111
51586 400862 pfam08711 Med26 TFIIS helical bundle-like domain. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. Notably, the 'small' and 'large' Mediator complexes differ in their subunit composition: the Med26 subunit preferentially associates with the small, active complex, whereas cdk8, cyclin C, Med12 and Med13 associate with the large Mediator complex. This family includes the C terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be a Pol II transcription elongation factor and interacts with Spt6 and Spt5. 52
51587 400863 pfam08712 Nfu_N Scaffold protein Nfu/NifU N terminal. This domain is found at the N-terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters. 81
51588 378033 pfam08713 DNA_alkylation DNA alkylation repair enzyme. Proteins in this family are predicted to be DNA alkylation repair enzymes. The structure of a hypothetical protein in this family shows it to adopt a supercoiled alpha helical structure. 212
51589 400864 pfam08714 Fae Formaldehyde-activating enzyme (Fae). Formaldehyde-activating enzyme is an enzyme required for energy metabolism and formaldehyde detoxification. It catalyzes the condensation of formaldehyde and tetrahydromethanopterin to methylene tetrahydromethanopterin. 158
51590 400865 pfam08715 Viral_protease Papain like viral protease. This family of viral proteases are similar to the papain protease and are required for proteolytic processing of the replicase polyprotein. The structure of this protein has shown it adopts a fold similar to that of de-ubiquitinating enzymes. 320
51591 285878 pfam08716 nsp7 nsp7 replicase. nsp7 (non structural protein 7) has been implicated in viral RNA replication and is predominantly alpha helical in structure. It forms a hexadecameric supercomplex with nsp8 that adopts a hollow cylinder-like structure. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase. 83
51592 400866 pfam08717 nsp8 nsp8 replicase. Viral nsp8 (non structural protein 8) forms a hexadecameric supercomplex with nsp7 that adopts a hollow cylinder-like structure. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase. 197
51593 400867 pfam08718 GLTP Glycolipid transfer protein (GLTP). GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. 137
51594 400868 pfam08719 DUF1768 Domain of unknown function (DUF1768). This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 155
51595 72144 pfam08720 Hema_stalk Influenza C hemagglutinin stalk. This domain corresponds to the stalk segment of hemagglutinin in influenza C virus. It forms a coiled coil structure. 175
51596 400869 pfam08721 Tn7_Tnp_TnsA_C TnsA endonuclease C terminal. The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. The C terminal domain of TnsA binds DNA. 83
51597 400870 pfam08722 Tn7_Tnp_TnsA_N TnsA endonuclease N terminal. The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. The N terminal domain of TnsA is catalytic. 83
51598 72147 pfam08723 Gag_p15 Gag protein p15. Gag p15 is a viral membrane-binding matrix protein which is alpha helical in structure. 123
51599 312303 pfam08724 Rep_N Rep protein catalytic domain like. Adeno-associated virus (AAV) Replication (Rep) protein is essential for viral replication and integration. The catalytic domain has DNA binding and endonuclease activity. 187
51600 400871 pfam08725 Integrin_b_cyt Integrin beta cytoplasmic domain. Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can give a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit. 44
51601 400872 pfam08726 EFhand_Ca_insen Ca2+ insensitive EF hand. EF hands are helix-loop-helix binding motifs involved in the regulation of many cellular processes. EF hands usually bind to Ca2+ ions which causes a major conformational change that allows the protein to interact with its designated targets. This domain corresponds to an EF hand which has partially or entirely lost its calcium-binding properties. The calcium insensitive EF hand is still able to mediate protein-protein recognition. 69
51602 400873 pfam08727 P3A Poliovirus 3A protein like. This domain is found in positive-strand RNA viruses. The 3A protein is a critical component of the poliovirus replication complex, and is also an inhibitor of host cell ER to Golgi transport. 59
51603 400874 pfam08728 CRT10 CRT10. CRT10 is a transcriptional regulator of ribonucleotide reductase (RNR) genes. RNR catalyzes the rate limiting step in dNTP synthesis. Mutations in CRT10 have been shown to enhance hydroxyurea resistance. 618
51604 400875 pfam08729 HUN HPC2 and ubinuclein domain. HPC2 (Histone promoter control 2) is required for cell-cycle regulation of histone transcription. It regulates transcription of the histone genes during the S-phase of the cell cycle by repressing transcription at other cell cycle stages. HPC2 mutants display synthetic interactions with the FACT complex which allows RNA Pol II to elongate through nucleosomes. Hpc2 is one of the proteins of one of the multi-subunit complexes that mediate replication-independent nucleosome assembly, along with histone chaperone proteins. The Hip4 sequence from Schizosaccharomyces pombe is an integral component of this complex that is required for transcriptional silencing at multiple loci. HPC2, ubinuclein/yemanuclein, and the cell cycle regulator FLJ25778 share a conserved domain that is predicted to bind histone tails. This domain is also referred to as the HRD or Hpc2-related domain. 52
51605 400876 pfam08730 Rad33 Rad33. Rad33 is involved in nucleotide excision repair (NER). NER is the main pathway for repairing DNA lesions induced by UV. Cells deleted for RAD33 display intermediate UV sensitivity that is epistatic with NER. 165
51606 400877 pfam08731 AFT Transcription factor AFT. AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This family includes the paralogous pair of transcription factors AFT1 and AFT2. 91
51607 400878 pfam08732 HIM1 HIM1. HIM1 (high induction of mutagenesis protein 1) plays a role in the control of spontaneous and induced mutagenesis. It is thought to participate in the control of processing of mutational intermediates appearing during error-prone bypass of DNA damage. 169
51608 400879 pfam08733 PalH PalH/RIM21. PalH (also known as RIM21) is a transmembrane protein required for proteolytic cleavage of Rim101/PacC transcription factors which are activated by C terminal proteolytic processing. Rim101/PacC family proteins play a key role in pH-dependent responses and PalH has been implicated as a pH sensor. 335
51609 400880 pfam08734 GYD GYD domain. This protein is found in a range of bacteria. It is usually less than 100 amino acids in length. The function of the protein is unknown. It may belong to the dimeric alpha/beta barrel superfamily. 89
51610 400881 pfam08735 DUF1786 Putative pyruvate formate-lyase activating enzyme (DUF1786). This family is annotated as pyruvate formate-lyase activating enzyme (EC:1.97.1.4) in UniProt. It is not clear where this annotation comes from. 251
51611 400882 pfam08736 FA FERM adjacent (FA). This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing proteins. The region has been hypothesized to play a role in regulatory adaptation, based on similarity to other protein kinase substrates. 44
51612 400883 pfam08737 Rgp1 Rgp1. Rgp1 forms a heterodimer with Ric1 (pfam07064) which associates with Golgi membranes and functions as a guanyl-nucleotide exchange factor. 414
51613 370092 pfam08738 Gon7 Gon7 family. In S. cerevisiae Gon7 is a member of the KEOPS protein complex, a complex proposed to be involved in transcription and in promoting telomere uncapping and telomere elongation. 105
51614 400884 pfam08740 BCS1_N BCS1 N terminal. This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein. 179
51615 400885 pfam08741 YwhD YwhD family. This family of proteins are currently uncharacterized. They are around 170 amino acids in length. 162
51616 400886 pfam08742 C8 C8 domain. This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to avoid overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826. 68
51617 400887 pfam08743 Nse4_C Nse4 C-terminal. Nse4 is a component of the Smc5/6 DNA repair complex. It forms interactions with Smc5 and Nse1. The exact function of this highly conserved C-terminal domain is not known. 87
51618 285901 pfam08744 NOZZLE Plant transcription factor NOZZLE. NOZZLE is a transcription factor that plays a role in patterning the proximal-distal and adaxial-abaxial axes. 335
51619 400888 pfam08745 PIN_5 PINc domain ribonuclease. This is a family of bacterial and archaeal PINc domains. PIN domains are characterized by the conservation of three acidic residues, possibly four, an Asp at residue 13, a Glu at 63, and then Asps at 172 and 194 in UniProtKB:Q58360. 206
51620 400889 pfam08746 zf-RING-like RING-like domain. This is a zinc finger domain that is related to the C3HC4 RING finger domain (pfam00097). 43
51621 400890 pfam08747 DUF1788 Domain of unknown function (DUF1788). Putative uncharacterized domain in proteins of length around 200 amino acids. 119
51622 370098 pfam08748 Phage_TAC_4 Phage tail assembly chaperone. This is a family of phage tail assembly chaperone proteins largely from phage T1 Gp40. 124
51623 400891 pfam08750 CNP1 CNP1-like family. This family of proteins are likely to be lipoproteins. CNP1 (cryptic neisserial protein) has been expressed in E. coli and shown to be localized to the periplasm. 135
51624 400892 pfam08751 TrwC TrwC relaxase. Relaxases are DNA strand transferases which function during the conjugative cell to cell DNA transfer. TrwC binds to the origin of transfer (oriT) and melts the double helix. 283
51625 400893 pfam08752 COP-gamma_platf Coatomer gamma subunit appendage platform subdomain. COPI-coated vesicles function in retrograde transport from the Golgi to the ER, and in intra-Golgi transport. This is the platform subdomain of the coatomer gamma subunit appendage domain. It carries a protein-protein interaction site at UniProt:P53620, residue W776, which in yeast binds to the ARFGAP Glo3p, and in mammalian gamma-COP binds to a Glo3p orthologue, ARFGAP2. 149
51626 400894 pfam08753 NikR_C NikR C terminal nickel binding domain. NikR is a transcription factor that regulates nickel uptake. It consists of two dimeric DNA binding domains separated by a tetrameric regulatory domain that binds nickel. This domain corresponds to the C terminal regulatory domain which contains four nickel binding sites at the tetramer interface. 74
51627 400895 pfam08755 YccV-like Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. 94
51628 378044 pfam08756 YfkB YfkB-like domain. This protein is adjacent to YfkA in B. subtilis. In other bacterial species it is fused to YfkA. As YfkA contains a Radical SAM domain, this suggests that this domain interacts with such domains. 149
51629 400896 pfam08757 CotH CotH kinase protein. Members of this family include the spore coat protein H (cotH). This protein is an atypical protein kinase that phosphorylates CotB and CotG. 318
51630 400897 pfam08758 Cadherin_pro Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions. 90
51631 400898 pfam08759 GT-D Glycosyltransferase GT-D fold. This domain is found at the C-terminus of proteins such as the probable glycosyltransferase Gly that also contain the glycosyl transferase domain at the N-terminus. It is also found N-terminal in numerous putative glycosyltransferases such as GalT1. GalT1 has been shown to catalyze the third step of Fap1 glycosylation. This domain is structurally distinct from all known GT folds of glycosyltransferases and contains a metal binding site. This new glycosyltransferase fold has been named GT-D. 223
51632 400899 pfam08760 DUF1793 Domain of unknown function (DUF1793). This presumed domain is found at the C-terminus of a glutaminase protein from fungi. This domain is also found as a single domain protein in Bacteroides thetaiotaomicron. 169
51633 400900 pfam08761 dUTPase_2 dUTPase. 2'-Deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTPase) catalyzes the hydrolysis of dUTP to dUMP and pyrophosphate (EC:3.6.1.23). Members of this family have a novel all-alpha fold and are unrelated to the all-beta fold found in dUTPases of the majority of organisms. This family contains both dUTPases and homologs of dUTPase, including dCTPase of phage T4. 162
51634 370104 pfam08762 CRPV_capsid CRPV capsid protein like. This is a family of capsid proteins found in positive stranded ssRNA viruses such as cricket paralysis virus (CRPV). It forms an all beta sheet structure. 198
51635 400901 pfam08763 Ca_chan_IQ Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium dependent feedback mechanisms, calcium dependent inactivation (CDI) and calcium-dependent facilitation (CDF). 75
51636 312335 pfam08764 Coagulase Staphylococcus aureus coagulase. Staphylococcus aureus secretes a cofactor called coagulase. Coagulase is an extracellular protein that forms a complex with human prothrombin, and activates it without the usual proteolytic cleavages. The resulting complex directly initiates blood clotting. 279
51637 400902 pfam08765 Mor Mor transcription activator family. Mor (Middle operon regulator) is a sequence specific DNA binding protein. It mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. The N terminal region of Mor is the dimerization region, and the C terminal contains a helix-turn-helix motif which binds DNA. 107
51638 400903 pfam08766 DEK_C DEK C terminal domain. DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases. 54
51639 370107 pfam08767 CRM1_C CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat. 323
51640 400904 pfam08768 DUF1794 Domain of unknown function (DUF1794). This domain forms a beta barrel structure but the function is unknown. The GO annotation for this protein indicates that the protein has a function in nematode larval development and has a positive regulation on growth rate. 150
51641 400905 pfam08769 Spo0A_C Sporulation initiation factor Spo0A C terminal. The response regulator Spo0A comprises a phosphoacceptor domain and a transcription activation domain. This domain corresponds to the transcription activation domain and forms an alpha helical structure comprising 6 alpha helices. The structure contains a helix-turn-helix and binds DNA. 104
51642 400906 pfam08770 SoxZ Sulphur oxidation protein SoxZ. SoxZ forms an anti parallel beta structure and forms a complex with SoxY. Sulphur oxidation occurs at the thiol of a conserved cysteine residue of the SoxY subunit. 94
51643 400907 pfam08771 FRB_dom FKBP12-rapamycin binding domain. The macrolide antibiotic rapamycin and the cytosol protein FKBP12 can form a complex which specifically inhibits the TORC1 complex, leading to growth arrest. The FKBP12-rapamycin complex interferes with TORC1 function by binding to the FKBP12-rapamycin binding domain (FRB) of the TOR proteins. This entry represents the FRB domain. 98
51644 400908 pfam08772 NOB1_Zn_bind Nin one binding (NOB1) Zn-ribbon like. This domain corresponds to a zinc ribbon and is found on the RNA binding protein NOB1 (Nin one binding). 72
51645 400909 pfam08773 CathepsinC_exc Cathepsin C exclusion domain. Cathepsin C (dipeptidyl peptidase I) is the physiological activator of a group of serine proteases. This domain corresponds to the exclusion domain whose structure excludes the approach of a polypeptide apart from its termini. It forms an enclosed beta barrel structure composed from 8 anti-parallel beta strands. Based on a structural comparison and interaction data, it is suggested that the exclusion domain originates from a metallo-protease inhibitor. 118
51646 400910 pfam08774 VRR_NUC VRR-NUC domain. 114
51647 400911 pfam08775 ParB ParB family. ParB is a component of the par system which mediates accurate DNA partition during cell division. It recognizes A-box and B-box DNA motifs. ParB forms an asymmetric dimer with 2 extended helix-turn-helix (HTH) motifs that bind to A-boxes. The HTH motifs emanate from a beta sheet coiled coil DNA binding module. Both DNA binding elements are free to rotate around a flexible linker; this enables them to bind to complex arrays of A- and B-box elements on adjacent DNA arms of the looped partition site. 125
51648 400912 pfam08776 VASP_tetra VASP tetramerisation domain. Vasodilator-stimulated phosphoprotein (VASP) is an actin cytoskeletal regulatory protein. This region corresponds to the tetramerisation domain which forms a right handed alpha helical coiled coil structure. 36
51649 400913 pfam08777 RRM_3 RNA binding motif. This domain is found in protein La which functions as an RNA chaperone during RNA polymerase III transcription, and can also stimulate translation initiation. It contains a five stranded beta sheet which forms an atypical RNA recognition motif. 102
51650 400914 pfam08778 HIF-1a_CTAD HIF-1 alpha C terminal transactivation domain. Hypoxia inducible factor-1 alpha (HIF-1 alpha) is the regulatory subunit of the heterodimeric transcription factor HIF-1. It plays a key role in cellular response to low oxygen tension. This region corresponds to the C terminal transactivation domain. 36
51651 400915 pfam08779 SARS_X4 SARS coronavirus X4 like. The structure of the coronavirus X4 protein (also known as 7a and U122) shows similarities to the immunoglobulin like fold and suggests a binding activity to integrin I domains. In SARS-CoV-infected cells, the X4 protein is expressed and retained intra-cellularly within the Golgi network. X4 has been implicated to function during the replication cycle of SARS-CoV. 107
51652 400916 pfam08780 NTase_sub_bind Nucleotidyltransferase substrate binding protein like. Nucleotidyltransferases (EC 2.7.7) comprise a large enzyme family with diverse roles in polynucleotide synthesis and modification. This domain is structurally related to kanamycin nucleotidyltransferase (KNTase) and forms a complex with HI0073, a sequence homolog of the nucleotide-binding domain of this nucleotidyltransferase superfamily. 126
51653 400917 pfam08781 DP Transcription factor DP. DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer and negatively regulates the G1-S transition. 138
51654 400918 pfam08782 c-SKI_SMAD_bind c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4. 91
51655 400919 pfam08783 DWNN DWNN domain. DWNN is a ubiquitin like domain found at the N-terminus of the RBBP6 family of splicing-associated proteins. The DWNN domain is independently expressed in higher vertebrates so it may function as a novel ubiquitin-like modifier of other proteins. 73
51656 400920 pfam08784 RPA_C Replication protein A C terminal. This domain corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints. 106
51657 400921 pfam08785 Ku_PK_bind Ku C terminal domain like. The non-homologous end joining (NHEJ) pathway is one method by which double stranded breaks in chromosomal DNA are repaired. Ku is a component of a multi-protein complex that is involved in the NHEJ. Ku has affinity for DNA ends and recruits the DNA-dependent protein kinase catalytic subunit (DNA-PKcs). This domain is found at the C terminal of Ku which binds to DNA-PKcs. 117
51658 400922 pfam08786 DcrB DcrB. DcrB is a bacterial protein required for phages C1 and C6 adsorption. It may be involved in the opening or formation of diffusion channels in the outer membrane. Structurally, it consists of an antiparallel beta sheet with some alpha helical regions. 126
51659 400923 pfam08787 Alginate_lyase2 Alginate lyase. Alginate lyases are enzymes that degrade the linear polysaccharide alginate. They cleave the glycosidic linkage of alginate through a beta-elimination reaction. This family forms an all beta fold and differs from the all alpha fold of pfam05426. 222
51660 370124 pfam08788 NHR2 NHR2 domain like. The NHR2 (Nervy homology 2) domain is found in the ETO protein where it mediates oligomerization and protein-protein interactions. It forms an alpha-helical tetramer. 67
51661 285942 pfam08789 PBCV_basic_adap PBCV-specific basic adaptor domain. The small PBCV-specific basic adaptor domain is found fused to S/T protein kinases and the 2-Cysteine domain. 38
51662 400924 pfam08790 zf-LYAR LYAR-type C2HC zinc finger. This C2HC zinc finger is found in LYAR proteins, which are involved in cell growth regulation. 28
51663 285944 pfam08792 A2L_zn_ribbon A2L zinc ribbon domain. This zinc ribbon domain is found associated with some viral A2L transcription factors. 33
51664 285945 pfam08793 2C_adapt 2-cysteine adaptor domain. The virus-specific 2-cysteine adaptor domain is found fused to OTU/A20-like peptidases and S/T protein kinases. The domain associations of these proteins indicate that they might function as viral adaptors connecting the kinases and OTU/A20 peptidases to specific targets. 35
51665 400925 pfam08794 Lipoprot_C Lipoprotein GNA1870 C terminal like. GNA1870 is a surface exposed lipoprotein in Neisseria meningitidis that is a potent antigen of Meningococcus. The structure of the C terminal domain consists of an anti-parallel beta barrel overlaid by a short alpha helical region. 155
51666 400926 pfam08795 DUF1796 Putative papain-like cysteine peptidase (DUF1796). 165
51667 400927 pfam08796 DUF1797 Protein of unknown function (DUF1797). This is a domain of unknown function. It forms a central anti-parallel beta sheet with flanking alpha helical regions. 67
51668 400928 pfam08797 HIRAN HIRAN domain. The HIRAN domain (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. The HIRAN domain is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this domain functions as a DNA-binding domain that probably recognizes features associated with damaged DNA or stalled replication forks. 96
51669 400929 pfam08798 CRISPR_assoc CRISPR associated protein. This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. 216
51670 400930 pfam08799 PRP4 pre-mRNA processing factor 4 (PRP4) like. This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing. 29
51671 400931 pfam08800 VirE_N VirE N-terminal domain. This presumed domain is found at the N-terminus of VirE proteins. 133
51672 400932 pfam08801 Nucleoporin_N Nup133 N terminal like. Nup133 is a nucleoporin that is crucial for nuclear pore complex (NPC) biogenesis. The N terminal forms a seven-bladed beta propeller structure. This family now contains other sized nucleoporins, including Nup155, Nup8, Nup132, Nup15 and Nup170. 426
51673 312365 pfam08802 CytB6-F_Fe-S Cytochrome B6-F complex Fe-S subunit. The cytochrome B6-F complex mediates electron transfer between photosystem II (PSII) and photosystem I (PSI), cyclic electron flow around PSI, and state transitions. This domain corresponds to the alpha helical transmembrane domain of the cytochrome B6-F complex iron-sulphur subunit. 39
51674 400933 pfam08803 ydhR Putative mono-oxygenase ydhR. ydhR is a homodimeric protein that comprises a central four-stranded beta sheet and four surrounding alpha helices. It shows structural homology to the ActVA-Orf6 and YgiN proteins which indicates it could be a mono-oxygenase. 96
51675 400934 pfam08804 gp32 gp32 DNA binding protein like. gp32 is a single stranded (ss) DNA binding protein in bacteriophage T4 that is essential for DNA replication, recombination and repair. The ssDNA binding cleft of gp32 comprises regions from three structural subdomains. 203
51676 285957 pfam08805 PilS PilS N terminal. Type IV pili are bacterial virulence-associated adhesins that promote bacterial attachment to host cells. In Salmonella typhi, the structural pilin protein PilS interacts with the cystic fibrosis transmembrane conductance regulator. Mutagenesis studies suggest that residues on an alpha-beta loop and the C terminal disulphide-bonded region of PilS might be involved in binding specificity of the pilus. 137
51677 400935 pfam08806 Sep15_SelM Sep15/SelM redox domain. Sep15 and SelM are eukaryotic selenoproteins that have a thioredoxin-like domain and a surface accessible active site redox motif. This suggests that they function as thiol-disulphide isomerases involved in disulphide bond formation in the endoplasmic reticulum. Structurally it resembles the thioredoxin-fold. 75
51678 400936 pfam08807 DUF1798 Bacterial domain of unknown function (DUF1798). This domain is found in many hypothetical proteins. The structure of one of the proteins in this family has been solved and it adopts an all alpha helical fold. 108
51679 400937 pfam08808 RES RES domain. This presumed domain contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. The domain is found widely distributed in bacteria. The domain is about 150 residues in length. 154
51680 400938 pfam08809 DUF1799 Phage related hypothetical protein (DUF1799). Members of this family are about 100 amino acids in length and are uncharacterized. 75
51681 378054 pfam08810 KapB Kinase associated protein B. This bacterial protein forms an anti-parallel beta sheet with an extending alpha helical region. 111
51682 400939 pfam08811 DUF1800 Protein of unknown function (DUF1800). This is a family of large bacterial proteins of unknown function. 436
51683 400940 pfam08812 YtxC YtxC-like family. This family includes proteins similar to B. subtilis YtxC, an uncharacterized protein. 215
51684 370135 pfam08813 Phage_tail_3 Phage tail tube protein, TTP. This is a family of phage tail tube proteins. A few members have an associated bacterial Ig-like domain, pfam02368, at their C-terminus. 165
51685 400941 pfam08814 XisH XisH protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required. 133
51686 400942 pfam08815 Nuc_rec_co-act Nuclear receptor coactivator. This region is found on eukaryotic nuclear receptor coactivators and forms an alpha helical structure. 47
51687 400943 pfam08816 Ivy Inhibitor of vertebrate lysozyme (Ivy). This bacterial family is a strong inhibitor of vertebrate lysozyme. 117
51688 400944 pfam08817 YukD WXG100 protein secretion system (Wss), protein YukD. The YukD protein family members participate in the formation of a translocon required for the secretion of WXG100 proteins (pfam06013) in monoderm bacteria, with the WXG100 protein secretion system (Wss). Like the cytoplasmic protein EsaC in Staphylococcus aureus, YukD was hypothesized to play a role of a chaperone. YukD adopts a ubiquitin-like fold. Usually, ubiquitin covalently binds to proteins and flags them for degradation; however, conjugation assays have indicated that the classical YukD lacks the capacity for covalent bond formation with other proteins. In contrast to the situation in firmicutes, YukD-like proteins in actinobacteria are often fused to a transporter involved in the ESAT-6/ESX/Wss secretion pathway. Members of the YukD family are also associated in gene neighborhoods with other enzymatic members of the ubiquitin signaling and degradation pathway such as the E1, E2 and E3 trienzyme complex that catalyze ubiquitin transfer to substrates, and the JAB family metallopeptidases that are involved in its release. This suggests that a subset of the YukD family in bacteria are conjugated and released from proteins as in the eukaryotic ubiquitin-mediated signaling and degradation pathway. 77
51689 400945 pfam08818 DUF1801 Domain of unknown function (DUF1801). This large family of bacterial proteins is uncharacterized. They contain a presumed domain about 110 amino acids in length. 96
51690 400946 pfam08819 DUF1802 Domain of unknown function (DUF1802). The function of this family is unknown. This region is found associated with pfam04471, suggesting they could be part of a restriction modification system. 175
51691 400947 pfam08820 DUF1803 Domain of unknown function (DUF1803). This small domain is found in one or two copies in proteins from bacteria. The function of this domain is unknown. 91
51692 400948 pfam08821 CGGC CGGC domain. This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function. 105
51693 400949 pfam08822 DUF1804 Protein of unknown function (DUF1804). This family of bacterial proteins is uncharacterized. 164
51694 255058 pfam08823 PG_binding_2 Putative peptidoglycan binding domain. This family may be a peptidoglycan binding domain. 74
51695 400950 pfam08824 Serine_rich Serine rich protein interaction domain. This is a serine rich domain that is found in the docking protein p130(cas) (Crk-associated substrate). This domain folds into a four helix bundle which is associated with protein-protein interactions. 156
51696 400951 pfam08825 E2_bind E2 binding domain. E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains. 81
51697 117396 pfam08826 DMPK_coil DMPK coiled coil domain like. This domain is found in the myotonic dystrophy protein kinase (DMPK) and adopts a coiled coil structure. It plays a role in dimerization. 61
51698 400952 pfam08827 DUF1805 Domain of unknown function (DUF1805). This domain is found in bacteria and archaea and has an N terminal tetramerisation region that is composed of beta sheets. 58
51699 370143 pfam08828 DSX_dimer Doublesex dimerization domain. Doublesex (DSX) is a transcription factor that regulates somatic sexual differences in Drosophila. The structure of this domain has revealed a novel dimeric arrangement of ubiquitin-associated folds that has not previously been identified in a transcription factor. 60
51700 337221 pfam08829 AlphaC_N Alpha C protein N terminal. The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N terminal of ACP is associated with virulence and forms a beta sandwich and a three helix bundle. ACP consists of an N-terminal domain (170 amino acids) followed by a variable number of tandem repeats (82 amino acids each) and a C-terminal domain (45 amino acids) containing an LPXTG peptidoglycan-anchoring motif. This entry is the N-terminal domain of ACP (NtACP). NtACP can be further divided into two structurally distinct domains, D1 and D2. D1, the more distal (amino-terminal) portion, consists of a beta sandwich with strong structural homology to fibronectin's integrin-binding region (FnIII10). D2 consists of three antiparallel alpha helix coils containing a portion of the glycosaminoglycan (GAG)-binding domain adjacent to the repeat region. NtACP binds to heparin and GAGs only when it is covalently associated with the adjacent repeat region. NtACP's D1 region contains a K144-T145-D146 (KTD) motif, located within a loop region that is structurally analogous to the loop containing the RGD integrin-binding motif in FnIII10. A single mutation within the KTD motif (D146A), present in the D1 domain, reduces NtACP binding to a1b1 integrin. The a1b1-integrin is one of four collagen-binding I-domain-containing integrins. Structural analysis shows that the D1 domain, in particular the region containing the putative integrin-binding loop and KTD motif, shares strong structural homology with the integrin-binding region of FnIII10. Amino acid sequence alignment of Alps indicates that KTD is highly conserved. 106
51701 400953 pfam08830 DUF1806 Protein of unknown function (DUF1806). This is a bacterial family of uncharacterized proteins. The structure of one of the proteins in this family has been solved and it adopts a beta barrel-like structure. 112
51702 400954 pfam08831 MHCassoc_trimer Class II MHC-associated invariant chain trimerisation domain. The class II associated invariant chain peptide is required for folding and localization of MHC class II heterodimers. This domain is involved in trimerisation of the ectodomain and interferes with DM/class II binding. The trimeric protein forms a cylindrical shape which is thought to be important for interactions between the invariant chain and class II molecules. 69
51703 400955 pfam08832 SRC-1 Steroid receptor coactivator. This domain is found in steroid/nuclear receptor coactivators and contains two LXXLL motifs that are involved in receptor binding. The family includes SRC-1/NcoA-1, NcoA-2/TIF2, pCIP/ACTR/GRIP-1/AIB1. 87
51704 400956 pfam08833 Axin_b-cat_bind Axin beta-catenin binding domain. This domain is found on the scaffolding protein Axin which is a component of the beta-catenin destruction complex. It competes with the tumor suppressor adenomatous polyposis coli protein (APC) for binding to beta-catenin. 37
51705 400957 pfam08837 DUF1810 Protein of unknown function (DUF1810). This is a family of uncharacterized proteins. The structure of one of the members in this family has been solved and it adopts a mainly alpha helical structure. 136
51706 400958 pfam08838 DUF1811 Protein of unknown function (DUF1811). This is a bacterial family of uncharacterized proteins. Some of the proteins are annotated as being transcriptional regulators. The structure of one of the proteins in this family has revealed a beta-barrel like structure with helix-turn-helix like motif. 99
51707 400959 pfam08839 CDT1 DNA replication factor CDT1 like. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin. 171
51708 400960 pfam08840 BAAT_C BAAT / Acyl-CoA thioester hydrolase C terminal. This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acyltransferases (BAAT). 211
51709 400961 pfam08841 DDR Diol dehydratase reactivase ATPase-like domain. Diol dehydratase (DDH, EC:4.2.1.28) and its isofunctional homolog glycerol dehydratase (GDH, EC:4.2.1.30) are enzymes which catalyze the conversion of glycerol, 1,2-propanediol, and 1,2-ethanediol to aldehydes. These reactions require coenzyme B12. Cleavage of the Co-C bond of coenzyme B12 by substrates or coenzyme analogues results in inactivation during which coenzyme B12 remains tightly bound to the apoenzyme. This family comprises the large subunit of the diol dehydratase and glycerol dehydratase reactivating factors whose function is to reactivate the holoenzyme by exchange of a damaged cofactor for intact coenzyme. 328
51710 400962 pfam08842 Mfa2 Fimbrillin-A associated anchor proteins Mfa1 and Mfa2. This family of proteins may be lipoproteins principally from bacilli. They are between 300 and 400 residues. Many Bacteroides-like bacterial species, including Porphyromonas gingivalis, the causal agent of periodontal infection, carry at least two types of fimbriae, namely FimA and Mfa1 fimbriae, following the names of their major subunit proteins. Normally, FimA fimbriae are long filaments that are easily detached from cells, whereas Mfa1 fimbriae are short filaments that are tightly bound to cells; however, in the absence of Mfa2 protein, the Mfa1 fimbriae are also very long and are not attached. Mfa2 and Mfa1 are associated with each other in whole P. gingivalis cells to the extent that Mfa2 is located on the cell surface and probably associated with Mfa1 fimbriae in such a way that it anchors the Mfa1 fimbriae to the cell surface and regulates Mfa1 filament length. 276
51711 400963 pfam08843 AbiEii Nucleotidyl transferase AbiEii toxin, Type IV TA system. This family was recently identified as belonging to the nucleotidyltransferase superfamily. AbiEii is the cognate toxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 238
51712 400964 pfam08844 DUF1815 Domain of unknown function (DUF1815). This presumed domain is about 100 amino acids in length and is functionally uncharacterized. 98
51713 370150 pfam08845 SymE_toxin Toxin SymE, type I toxin-antitoxin system. SymE (SOS-induced yjiW gene with similarity to MazE) is an SOS-induced toxin. It inhibits cell growth, decreases protein synthesis and increases RNA degradation. It may play a role in the recycling of RNAs damaged under SOS response-inducing conditions. It is predicted to have an AbrB fold, similar to that of the antitoxin MazE. Its translation is repressed by the antisense RNA SymR, which acts as an antitoxin. 54
51714 400965 pfam08846 DUF1816 Domain of unknown function (DUF1816). Crocosphaera watsonii CpcD is associated with the pfam01383 domain suggesting this presumed domain could have a role in phycobilisomes. 64
51715 400966 pfam08847 Crr6 Chlororespiratory reduction 6. Chlororespiratory reduction 6 (Crr6) is a factor required for the assembly or stabilisation of the chloroplast NAD(P)H dehydrogenase complex in Arabidopsis. 150
51716 400967 pfam08848 DUF1818 Domain of unknown function (DUF1818). This presumed domain is found in a small family of cyanobacterial proteins. These proteins are functionally uncharacterized. 113
51717 400968 pfam08849 DUF1819 Putative inner membrane protein (DUF1819). These proteins are functionally uncharacterized. Several are annotated as putative inner membrane proteins. 181
51718 400969 pfam08850 DUF1820 Domain of unknown function (DUF1820). This family includes small functionally uncharacterized proteins around 100 amino acids in length. 97
51719 370155 pfam08852 DUF1822 Protein of unknown function (DUF1822). This family of proteins is functionally uncharacterized. 370
51720 400970 pfam08853 DUF1823 Domain of unknown function (DUF1823). This presumed domain is functionally uncharacterized. 111
51721 400971 pfam08854 DUF1824 Domain of unknown function (DUF1824). This uncharacterized family of proteins is principally found in cyanobacteria. 124
51722 400972 pfam08855 DUF1825 Domain of unknown function (DUF1825). This uncharacterized family of proteins is principally found in cyanobacteria. 103
51723 400973 pfam08856 DUF1826 Protein of unknown function (DUF1826). These proteins are functionally uncharacterized. 197
51724 400974 pfam08857 ParBc_2 Putative ParB-like nuclease. This domain is probably distantly related to pfam02195, suggesting that these uncharacterized proteins have a nuclease function. 159
51725 400975 pfam08858 IDEAL IDEAL domain. This short domain is found at the C-terminus of proteins in the UPF0302 family. The domain is named after the sequence of the most conserved region in some members. The function of this domain is unknown. 37
51726 400976 pfam08859 DGC DGC domain. This domain appears to be a zinc binding domain from the conservation of four potential chelating cysteines. The domain is named after a conserved central motif. The function of this domain is unknown. 103
51727 400977 pfam08860 DUF1827 Domain of unknown function (DUF1827). This presumed domain has no known function. 91
51728 400978 pfam08861 DUF1828 Domain of unknown function DUF1828. This presumed domain is functionally uncharacterized. 90
51729 400979 pfam08862 DUF1829 Domain of unknown function DUF1829. This short domain is usually associated with pfam08861. 87
51730 400980 pfam08863 YolD YolD-like protein. Members of this family are functionally uncharacterized. However it has been predicted that these proteins are functionally equivalent to the UmuD subunit of polymerase V from gram-negative bacteria. This family has been shown to belong to the WYL-like superfamily. 94
51731 400981 pfam08864 UPF0302 UPF0302 domain. This family is known as UPF0302. It is currently uncharacterized. 105
51732 400982 pfam08865 DUF1830 Domain of unknown function (DUF1830). This family of short proteins is functionally uncharacterized. 66
51733 400983 pfam08866 DUF1831 Putative amino acid metabolism. Solution of the structure of the Lactobacillus plantarum protein from this family has indicated a potential new fold with remote similarities to TBP-like (TATA-binding protein) structures. This similarity, in combination with genomic context analysis, leads us to propose an involvement in amino-acid metabolism. The potentially novel fold is an alpha + beta fold comprising two beta sheets packed against a single helix. The enzyme is present in the cytosol. 110
51734 400984 pfam08867 FRG FRG domain. This presumed domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterized. 93
51735 370164 pfam08868 YugN YugN-like family. This family of proteins related to B. subtilis YugN are functionally uncharacterized. 130
51736 400985 pfam08869 XisI XisI protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required. 103
51737 400986 pfam08870 DndE DNA sulphur modification protein DndE. DndE is a small protein of 126 residues. It is a putative carboxylase homologous to NCAIR synthetase. It is encoded by an operon that is associated with a sulphur-based modification to DNA. 110
51738 370166 pfam08872 KGK KGK domain. This presumed domain is found in one or two copies in cyanobacterial proteins. It is named after a short sequence motif. 111
51739 370167 pfam08873 DUF1834 Domain of unknown function (DUF1834). This family of proteins are functionally uncharacterized. One member is the Gp37 protein from the FluMu prophage. 204
51740 400987 pfam08874 DUF1835 Domain of unknown function (DUF1835). This family of proteins are functionally uncharacterized. 122
51741 286019 pfam08875 DUF1833 Domain of unknown function (DUF1833). This family of proteins are functionally uncharacterized and are predicted to adopt an all-beta fold. They are often found in gene neighborhoods containing genes for an NlpC peptidase and a Ubiquitin domain predicted to be involved in tail assembly. 150
51742 400988 pfam08876 DUF1836 Domain of unknown function (DUF1836). This family of proteins are functionally uncharacterized. 102
51743 400989 pfam08877 MepB MepB protein. MepB is a functionally uncharacterized protein in the mepRAB gene cluster of Staphylococcus aureus. 122
51744 378078 pfam08878 DUF1837 Domain of unknown function (DUF1837). This family of proteins are functionally uncharacterized. 230
51745 370168 pfam08879 WRC WRC. The WRC domain, named after the conserved Trp-Arg-Cys motif, contains two distinctive features: a putative nuclear localization signal and a zinc-finger motif (C3H). It is suggested that the WRC domain functions in DNA binding. 42
51746 400990 pfam08880 QLQ QLQ. The QLQ domain is named after the conserved Gln, Leu, Gln motif. The QLQ domain is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. This domain has thus been postulated to be involved in mediating protein interactions. 35
51747 400991 pfam08881 CVNH CVNH domain. CyanoVirin-N Homology domains are found in the sugar-binding antiviral protein cyanovirin-N (CVN) as well as filamentous ascomycetes and in the fern Ceratopteris richardii. 101
51748 400992 pfam08882 Acetone_carb_G Acetone carboxylase gamma subunit. Acetone carboxylase is the key enzyme of bacterial acetone metabolism, catalyzing the condensation of acetone and CO(2) to form acetoacetate. 113
51749 400993 pfam08883 DOPA_dioxygen Dopa 4,5-dioxygenase family. This family of proteins are related to a DOPA 4,5-dioxygenase that is involved in synthesis of betalain. DOPA-dioxygenase is the key enzyme involved in betalain biosynthesis. It converts 3,4-dihydroxyphenylalanine to betalamic acid, a yellow chromophore. 106
51750 400994 pfam08884 Flagellin_D3 Flagellin D3 domain. This domain is found in the central portion of bacterial flagellin FliC. The domain contains a structural motif called a beta-folium fold. Although no specific function is assigned to this domain, its deletion leads to a reduction in filament stability. 88
51751 400995 pfam08885 GSCFA GSCFA family. This family of proteins are functionally uncharacterized. They have been named GSCFA after a highly conserved N-terminal motif in the alignment. Distant similarity to the pfam00657 lipases suggests these proteins are likely to be enzymes. 237
51752 400996 pfam08886 GshA Glutamate-cysteine ligase. This is a rare family of glutamate--cysteine ligases, EC:6.3.2.2, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. The structure of this family has been solved, and is similar to that of human glutathione synthetase and very different to gamma-glutamylcysteine synthetase from Escherichia coli. 402
51753 400997 pfam08887 GAD-like GAD-like domain. This domain is functionally uncharacterized, but it appears to be distantly related to the GAD domain pfam02938. 102
51754 400998 pfam08888 HopJ HopJ type III effector protein. Pathovars of Pseudomonas syringae interact with their plant hosts via the action of Hrp outer protein (Hop) effector proteins, injected into plant cells by the type III secretion system. The proteins in this family are called HopJ after the original member HopPmaJ. 109
51755 400999 pfam08889 WbqC WbqC-like protein family. This family of proteins are functionally uncharacterized. However it is found in an O-antigen gene cluster in E. coli and other bacteria suggesting a role in O-antigen production. Feng et al. suggest that wbnG may code for a glycine transferase. 217
51756 401000 pfam08890 Phage_TAC_5 Phage XkdN-like tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins, TACs, from Gram-positive bacteriophages, in particular PBSX from Firmicutes. 135
51757 401001 pfam08891 YfcL YfcL protein. This family of proteins are functionally uncharacterized. They are related to the short YfcL protein from E. coli. 85
51758 401002 pfam08892 YqcI_YcgG YqcI/YcgG family. This family of proteins are functionally uncharacterized. The family includes YqcI and YcgG from B. subtilis. The alignment contains a conserved FPC motif at the N-terminus and CPF at the C-terminus. 211
51759 401003 pfam08893 DUF1839 Domain of unknown function (DUF1839). This family of proteins are functionally uncharacterized. 312
51760 378086 pfam08894 DUF1838 Protein of unknown function (DUF1838). This family of proteins are functionally uncharacterized. 235
51761 401004 pfam08895 DUF1840 Domain of unknown function (DUF1840). This family of proteins are functionally uncharacterized. 108
51762 401005 pfam08896 DUF1842 Domain of unknown function (DUF1842). This domain is found at the N-terminus of proteins that are functionally uncharacterized. 110
51763 401006 pfam08897 DUF1841 Domain of unknown function (DUF1841). This family of proteins are functionally uncharacterized. 135
51764 370178 pfam08898 DUF1843 Domain of unknown function (DUF1843). This domain is found at the C-terminus of a family of proteins that are functionally uncharacterized. The presumed domain is about 60 amino acid residues in length and is found independently in some proteins. 52
51765 401007 pfam08899 DUF1844 Domain of unknown function (DUF1844). This family of proteins are functionally uncharacterized. 72
51766 401008 pfam08900 DUF1845 Domain of unknown function (DUF1845). This family of proteins are functionally uncharacterized. 215
51767 401009 pfam08901 DUF1847 Protein of unknown function (DUF1847). This family of proteins are functionally uncharacterized. They contain 4 N-terminal cysteines that may form a zinc binding domain. 157
51768 401010 pfam08902 DUF1848 Domain of unknown function (DUF1848). This family of proteins are functionally uncharacterized. The C-terminus contains a cluster of cysteines that are similar to the iron-sulfur cluster found at the N-terminus of pfam04055. 262
51769 401011 pfam08903 DUF1846 Domain of unknown function (DUF1846). This family of proteins are functionally uncharacterized. Some members of the family are annotated as ATP-dependent peptidases. However, we can find no support for this annotation. 489
51770 401012 pfam08904 DUF1849 Domain of unknown function (DUF1849). This family of proteins are functionally uncharacterized. 248
51771 401013 pfam08905 DUF1850 Domain of unknown function (DUF1850). This family of proteins are functionally uncharacterized. Some members of this family appear to be misannotated as RocC, an amino acid transporter from B. subtilis. 86
51772 401014 pfam08906 DUF1851 Domain of unknown function (DUF1851). This domain is found at the C-terminus of a variety of proteins that are functionally uncharacterized. 72
51773 401015 pfam08907 DUF1853 Domain of unknown function (DUF1853). This family of proteins are functionally uncharacterized. 282
51774 401016 pfam08908 DUF1852 Domain of unknown function (DUF1852). This family of proteins are functionally uncharacterized. 321
51775 401017 pfam08909 DUF1854 Domain of unknown function (DUF1854). This potential domain is functionally uncharacterized. It is found at the C-terminus of a number of ATP transporter proteins suggesting this domain may be involved in ligand binding. 126
51776 401018 pfam08910 Aida_N Aida N-terminus. This is the N-terminal domain of the axin interactor, dorsalization-associated protein family. 103
51777 401019 pfam08911 NUP50 NUP50 (Nucleoporin 50 kDa). Nucleoporin 50 kDa (NUP50) acts as a cofactor for the importin-alpha:importin-beta heterodimer, which in turn allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. The C-terminus of NUP50 binds importin-beta through RAN-GTP, the N-terminus binds the C-terminus of importin-alpha, while a central domain binds importin-beta. NUP50:importin-alpha:importin-beta then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to actively displace nuclear localization signals from importin-alpha. 64
51778 401020 pfam08912 Rho_Binding Rho Binding. Rho Binding Domain is responsible for the recognition and binding of Rho binding domain-containing proteins (such as ROCK) to Rho, resulting in activation of the GTPase which in turn modulates the phosphorylation of various signalling proteins. This domain is within an amphipathic alpha-helical coiled-coil and interacts with Rho through predominantly hydrophobic interactions. 67
51779 312463 pfam08913 VBS Vinculin Binding Site. Vinculin binding sites are predominantly found in talin and talin-like molecules, enabling binding of vinculin to talin, stabilizing integrin-mediated cell-matrix junctions. Talin, in turn, links integrins to the actin cytoskeleton. The consensus sequence for Vinculin binding sites is LxxAAxxVAxxVxxLIxxA, with a secondary structure prediction of four amphipathic helices. The hydrophobic residues that define the VBS are themselves 'masked' and are buried in the core of a series of helical bundles that make up the talin rod. 125
51780 286058 pfam08914 Myb_DNA-bind_2 Rap1 Myb domain. The Rap1 Myb domain adopts a canonical three-helix bundle tertiary structure, with the second and third helices forming a helix-turn-helix variant motif. The function of this domain is unclear: it may either interact with DNA via an adaptor protein or it may be only involved in protein-protein interactions. 65
51781 401021 pfam08915 tRNA-Thr_ED Archaea-specific editing domain of threonyl-tRNA synthetase. Archaea-specific editing domain of threonyl-tRNA synthetase, with marked structural similarity to D-amino acids deacylases found in eubacteria and eukaryotes. This domain can bind D-amino acids, and ensures high fidelity during translation. It is especially responsible for removing incorrectly attached serine from tRNA-Thr. The domain forms a fold that can be defined as two layers of beta-sheets (a three-stranded sheet and a five-stranded sheet), with two alpha-helices located adjacent to the five-stranded sheet. 137
51782 401022 pfam08916 Phe_ZIP Phenylalanine zipper. The phenylalanine zipper consists of aromatic side chains from ten phenylalanine residues that are stacked within a hydrophobic core. This zipper mediates dimerization of various proteins, such as APS, SH2-B and Lnk. 57
51783 401023 pfam08917 ecTbetaR2 Transforming growth factor beta receptor 2 ectodomain. The Transforming growth factor beta receptor 2 ectodomain is a compact fold consisting of nine beta-strands and a single helix stabilized by a network of six intra strand disulphide bonds. The folding topology includes a central five-stranded antiparallel beta-sheet, eight-residues long at its centre, covered by a second layer consisting of two segments of two-stranded antiparallel beta-sheets (beta1-beta4, beta3-beta9). 103
51784 401024 pfam08918 PhoQ_Sensor PhoQ Sensor. The PhoQ Sensor is required for the virulence of various Gram-negative bacteria by allowing interaction of PhoPQ with the intracellular membrane, resulting in remodelling of the bacterial cell surface and subsequent bacterial resistance to host antimicrobial peptides. The domain contains a major flat acidic surface, which binds to at least 3 calcium ions, neutralising the domain's negative charge and allowing interaction with the negatively charged membrane. 179
51785 401025 pfam08919 F_actin_bind F-actin binding. The F-actin binding domain forms a compact bundle of four antiparallel alpha-helices, which are arranged in a left-handed topology. Binding of F-actin to the F-actin binding domain may result in cytoplasmic retention and subcellular distribution of the protein, as well as possible inhibition of protein function. 106
51786 401026 pfam08920 SF3b1 Splicing factor 3B subunit 1. This family consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C-terminus beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for 'A' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and 'E' splicing complex assembly. 116
51787 401027 pfam08921 DUF1904 Domain of unknown function (DUF1904). This domain is found in a set of hypothetical bacterial proteins. 107
51788 401028 pfam08922 DUF1905 Domain of unknown function (DUF1905). This domain is found in a set of hypothetical bacterial proteins. 78
51789 312472 pfam08923 MAPKK1_Int Mitogen-activated protein kinase kinase 1 interacting. Mitogen-activated protein kinase kinase 1 interacting protein is a small subcellular adaptor protein required for MAPK signaling and ERK1/2 activation. The overall topology of this domain has a central five-stranded beta-sheet sandwiched between a two alpha-helix and a one alpha-helix layer. 119
51790 401029 pfam08924 DUF1906 Domain of unknown function (DUF1906). This domain is found in a set of uncharacterized hypothetical bacterial proteins. 179
51791 401030 pfam08925 DUF1907 Domain of Unknown Function (DUF1907). The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase. 281
51792 401031 pfam08926 DUF1908 Domain of unknown function (DUF1908). This domain is found in a set of hypothetical/structural eukaryotic proteins. 282
51793 401032 pfam08928 DUF1910 Domain of unknown function (DUF1910). This domain is found in a set of hypothetical bacterial proteins. 117
51794 401033 pfam08929 DUF1911 Domain of unknown function (DUF1911). This domain is found in a set of hypothetical bacterial proteins. 105
51795 286073 pfam08930 DUF1912 Domain of unknown function (DUF1912). This domain has no known function. It is found in various Streptococcal proteins. 84
51796 286074 pfam08931 Caudo_bapla_RBP Receptor-binding protein of phage tail base-plate Siphoviridae, head. Caudo_bapla_RBP is a family of proteins expressed from ORF18 of the Lactococcus P2-like phage. This is one of three protein species, shoulders, neck, and head, that form the phage tail base-plate. In the overall structure this head domain exists as six trimers, and is necessary for specific recognition of the receptors at the host cell surface. Siphoviridae are the P2-like Caudovirales of Lactococcus. This family now includes DUF1914. Family Baseplate, pfam16774, is the ORF15 or shoulder component of the base-plate complex. 262
51797 255115 pfam08933 DUF1864 Domain of unknown function (DUF1864). This domain has no known function. It is found in various hypothetical and conserved domain proteins. 387
51798 401034 pfam08934 Rb_C Rb C-terminal domain. The Rb C-terminal domain is required for high-affinity binding to E2F-DP complexes and for maximal repression of E2F-responsive promoters, thereby acting as a growth suppressor by blocking the G1-S transition of the cell cycle. This domain has a strand-loop-helix structure, which directly interacts with both E2F1 and DP1, followed by a tail segment that lacks regular secondary structure. 151
51799 286076 pfam08935 VP4_2 Viral protein VP4 subunit. This domain is predominantly found in viral proteins from the family Picornaviridae. It is VP4 of the viral polyprotein which, in poliovirus, is part of the capsid that consists of 60 copies each of four proteins VP1, VP2, VP3, and VP4 arranged on an icosahedral lattice. VP4 is on the inside and differs from the others in being small, myristoylated and having an extended structure. Productive infection involves the externalisation of the VP4, which is cleaved from the rest, along with the N-terminus of VP1. There thus seem to be three stages of the virus, ie a multi-step process for cell entry involving RNA translocation through a membrane channel formed by the externalised N termini of VP1. 84
51800 401035 pfam08936 CsoSCA Carboxysome Shell Carbonic Anhydrase. Carboxysome Shell Carbonic Anhydrase is a bacterial carbonic anhydrase localized in the carboxysome, where it converts bicarbonate ions to carbon dioxide for use in carbon fixation. It contains three domains, these being: (1) an N-terminal domain composed primarily of four alpha-helices; (2) a catalytic domain containing a tightly bound zinc ion and (3) a C-terminal domain with weak structural similarity to the catalytic domain. 455
51801 401036 pfam08937 DUF1863 MTH538 TIR-like domain (DUF1863). This domain adopts the flavodoxin fold, that is, five parallel beta-strands and four helical segments. The structure is a three-layer sandwich with alpha-1 and alpha-4 on one side of the beta-sheet, and alpha-2 and alpha-3 on the other side. Probable role in signal transduction as a phosphorylation-independent conformational switch protein. This domain is similar to the TIR domain. 130
51802 401037 pfam08938 HBS1_N HBS1 N-terminus. This domain is found at the N-terminus of HBS1 proteins. It interacts with the ribosomal protein rpS3 at the mRNA entry site. 74
51803 401038 pfam08939 DUF1917 Domain of unknown function (DUF1917). This domain is found in various hypothetical and basophilic leukaemia proteins. It has no known function. 258
51804 401039 pfam08940 DUF1918 Domain of unknown function (DUF1918). This domain, found in various hypothetical bacterial proteins, has no known function. 58
51805 401040 pfam08941 USP8_interact USP8 interacting. This domain interacts with the UBP deubiquitinating enzyme USP8. 179
51806 401041 pfam08942 DUF1919 Domain of unknown function (DUF1919). This domain has no known function. It is found in various hypothetical and putative bacterial proteins. 191
51807 401042 pfam08943 CsiD CsiD. This family consists of various bacterial proteins pertaining to the non-haem Fe(II)-dependent oxygenase family. Exact function is unknown, but a putative role includes involvement in the control of utilisation of gamma-aminobutyric acid. 294
51808 401043 pfam08944 p47_phox_C NADPH oxidase subunit p47Phox, C terminal domain. The C terminal domain of the phagocyte NADPH oxidase subunit p47Phox contains conserved PxxP motifs that allow binding to SH3 domains, with subsequent activation of the NADPH oxidase, and generation of superoxide, which plays a crucial role in host defense against microbial infection. 32
51809 401044 pfam08945 Bclx_interact Bcl-x interacting, BH3 domain. This domain is a long alpha helix, required for interaction with Bcl-x. It is found in BAM, Bim and Bcl2-like protein 11. This domain is also known as the BH3 domain between residues 146 and 161. 39
51810 401045 pfam08946 Osmo_CC Osmosensory transporter coiled coil. The osmosensory transporter coiled coil is a C-terminal domain found in various bacterial osmoprotective transporters, such as ProP, Proline/betaine transporter, Proline permease 2 and the citrate proton symporters. It adopts an antiparallel coiled-coil structure, and is essential for osmosensory and osmoprotectant transporter function. 46
51811 401046 pfam08947 BPS BPS (Between PH and SH2). The BPS (Between PH and SH2) domain, comprised of 2 beta strands and a C-terminal helix, is an approximately 45 residue region found in the adaptor proteins Grb7/10/14 that mediates inhibition of the tyrosine kinase domain of the insulin receptor by binding of the N-terminal portion of the BPS domain to the substrate peptide groove of the kinase, acting as a pseudosubstrate inhibitor. 45
51812 401047 pfam08948 DUF1859 Domain of unknown function (DUF1859). This domain has no known function. It is predominantly found in the N-terminus of bacteriophage spike proteins. 126
51813 401048 pfam08949 DUF1860 Domain of unknown function (DUF1860). This domain has no known function. It is predominantly found in the C-terminus of bacteriophage spike proteins. 219
51814 401049 pfam08950 DUF1861 Protein of unknown function (DUF1861). This hypothetical protein, found in bacteria and in the eukaryote Leishmania, has no known function. 295
51815 401050 pfam08951 EntA_Immun Enterocin A Immunity. Gram-positive lactobacilli produce bacteriocins to kill closely-related competitor species. To protect themselves from the bacteriocidal activity of this molecule they co-express an immunity protein (for discussion of this operon see Bacteriocin_IIc pfam10439). The immunity protein structure is a soluble, cytoplasmic, antiparallel four alpha-helical globular bundle with a fifth, more flexible and more divergent C-terminal helical hair-pin. The C-terminal hair-pin recognizes the C-terminus of the producer bacteriocin and this interaction is sufficient to dis-orient the bacteriocin within the membrane and close up the permeabilising pore that on its own the bacteriocin creates. These immunity proteins interact in the same way with other bacteriocins, family Bacteriocin_II, pfam01721. Since many enterococci can produce more than one bacteriocin it seems likely that the whole operon can be carried on transferable plasmids. 67
51816 286093 pfam08952 DUF1866 Domain of unknown function (DUF1866). This domain, found in Synaptojanin, has no known function. 146
51817 401051 pfam08953 DUF1899 Domain of unknown function (DUF1899). This set of domains is found in various eukaryotic proteins. Function is unknown. 66
51818 401052 pfam08954 Trimer_CC Trimerisation motif. This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It appears to have the function of stabilizing the topology of short coiled-coils in proteins. 52
51819 401053 pfam08955 BofC_C BofC C-terminal domain. The C-terminal domain of the bacterial protein 'bypass of forespore C' contains a three-stranded beta-sheet and three alpha-helices. Its exact function is, as yet, unknown. 74
51820 401054 pfam08956 DUF1869 Domain of unknown function (DUF1869). This domain is found in a set of hypothetical bacterial proteins. 56
51821 401055 pfam08958 DUF1871 Domain of unknown function (DUF1871). This set of hypothetical proteins is produced by prokaryotes pertaining to the Bacillus genus. 77
51822 401056 pfam08960 DUF1874 Domain of unknown function (DUF1874). This domain is found in a set of hypothetical viral and bacterial proteins. 100
51823 401057 pfam08961 NRBF2 Nuclear receptor-binding factor 2, autophagy regulator. NRBF2 plays an essential role in autophagy, the cellular pathway that degrades long-lived proteins and other cytoplasmic contents through lysosomes. NRBF2 binds Atg14L - a Beclin-binding protein - directly via the MIT domain and enhances Atg14L-linked Vps34 kinase (a class III phosphatidylinositol-3 kinase) activity and autophagy induction. 197
51824 401058 pfam08962 DUF1876 Domain of unknown function (DUF1876). This domain is found in a set of hypothetical bacterial proteins. 82
51825 401059 pfam08963 DUF1878 Protein of unknown function (DUF1878). This domain is found in a set of hypothetical bacterial proteins. 110
51826 401060 pfam08964 Crystall_3 Beta/Gamma crystallin. This family of beta/gamma crystallins includes the N-terminal domain of Dictyostelium discoideum Calcium-dependent cell adhesion molecule 1, which mediates cell-cell adhesion through homophilic interactions. 86
51827 401061 pfam08965 DUF1870 Domain of unknown function (DUF1870). This domain is found in a set of hypothetical bacterial proteins. It contains a helix-turn-helix domain so may be a DNA-binding protein. 117
51828 401062 pfam08966 DUF1882 Domain of unknown function (DUF1882). This domain is found in a set of hypothetical bacterial proteins. 72
51829 370216 pfam08967 DUF1884 Domain of unknown function (DUF1884). This domain is found in a set of hypothetical bacterial proteins. It shows similarity to the N-terminus of ATP-synthase. 92
51830 370217 pfam08968 DUF1885 Domain of unknown function (DUF1885). This domain is found in a set of hypothetical proteins produced by bacteria of the Bacillus genus. 131
51831 401063 pfam08969 USP8_dimer USP8 dimerization domain. This domain is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It forms a five helical bundle that dimerizes. 112
51832 370219 pfam08970 Sda Sporulation inhibitor A. Members of this protein family contain two antiparallel alpha helices that are linked by a highly structured inter-helix loop to form a helical hairpin; the structure is stabilized by numerous hydrophobic and electrostatic interactions. These sporulation inhibitors are antikinases that bind to the histidine kinase KinA phosphotransfer domain and act as a molecular barricade that inhibits productive interaction between the ATP binding site and the phosphorylatable KinA His residue. This results in the inhibition of sporulation (by preventing phosphorylation of spo0A). 45
51833 401064 pfam08971 GlgS Glycogen synthesis protein. Members of this family are involved in glycogen synthesis in Enterobacteria. The structure of the polypeptide chain comprises a bundle of two parallel amphipathic helices, alpha-1 and alpha-3, and a short hydrophobic helix alpha-2 sandwiched between them. 66
51834 312506 pfam08972 DUF1902 Domain of unknown function (DUF1902). Members of this family of prokaryotic proteins adopt a fold consisting of one alpha-helix and four beta-strands. Their function has not, as yet, been elucidated. 75
51835 401065 pfam08973 TM1506 Domain of unknown function (DUF1893). A member of the deaminase fold that binds an unknown ligand in the crystal structure. The protein is ADP-ribosylated at a conserved aspartate. Contextual analysis suggests that the domain is likely to bind NAD or ADP ribose either to sense redox states or to function as a regulatory ADP ribosyltransferase. 126
51836 401066 pfam08974 DUF1877 Domain of unknown function (DUF1877). This domain is found in a set of hypothetical bacterial proteins. 163
51837 401067 pfam08975 2H-phosphodiest Domain of unknown function (DUF1868). This group of 2H-phosphodiesterases comprises a single family typified by the protein mlr3352 from M.loti. Members are also present in various alpha-proteobacteria, Synechocystis, Streptococcus and Chilo iridescent virus. The presence of a member of this predominantly bacterial group in a large eukaryotic DNA virus represents a potential case of horizontal transfer from a bacterial source into a virus. Several proteins of bacterial origin have been noticed in the insect viruses (L.M.Iyer, E.V.Koonin and L.Aravind, unpublished observations) and these appear to have been acquired from endo-symbiotic or parasitic bacteria that share the same host cells with the viruses. Presence of 2H proteins in the proteomes of large DNA viruses (e.g. T4 57B protein and the Fowl-pox virus FPV025) may point to some role for these proteins in regulating the viral tRNA metabolism. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. 116
51838 401068 pfam08976 EF-hand_11 EF-hand domain. This domain is found predominantly in DJ binding proteins. 105
51839 286116 pfam08977 BOFC_N Bypass of Forespore C, N terminal. The N-terminal domain of 'bypass of forespore C' is composed of a four-stranded beta-sheet covered by an alpha-helix. The beta-sheet has a beta2-beta1-beta4-beta3 topology, where strands beta1 and beta2 and strands beta3 and beta4 are connected by beta-turns, whereas strands beta2 and beta3 are joined by an alpha-helix that runs across one face of the beta-sheet. This domain is similar to the third immunoglobulin G-binding domain of protein G from Streptococcus, the latter belonging to a large and diverse group of cell surface-associated proteins that bind to immunoglobulins. It has been hypothesized that this domain may be a mediator of protein-protein interactions involved in proteolytic events at the cell surface. 49
51840 401069 pfam08978 Reoviridae_Vp9 Reoviridae VP9. This domain is found in various VP9 viral outer-coat proteins. It has no known function. 280
51841 401070 pfam08979 DUF1894 Domain of unknown function (DUF1894). Members of this family have an important role in methanogenesis. They assume an alpha-beta globular structure consisting of six beta-strands and three alpha-helices forming the secondary structural topological arrangement of alpha1-beta1-alpha2-beta2-beta3-beta4-beta5-beta6-alpha3. 85
51842 401071 pfam08980 DUF1883 Domain of unknown function (DUF1883). This domain is found in a set of hypothetical bacterial proteins. 86
51843 401072 pfam08982 DUF1857 Domain of unknown function (DUF1857). This domain has no known function. It is found in various hypothetical bacterial and fungal proteins. 146
51844 370225 pfam08983 DUF1856 Domain of unknown function (DUF1856). This domain has no known function. It is found in the C-terminal segment of various vasopressin receptors. 48
51845 401073 pfam08984 DUF1858 Domain of unknown function (DUF1858). This domain has no known function. It is found in various hypothetical bacterial proteins. 57
51846 378109 pfam08985 DP-EP DP-EP family. The DP-EP family of proteins, formerly known as DUF1888 have been shown to catalyze a cleavage of an internal peptide bond. 120
51847 401074 pfam08986 DUF1889 Domain of unknown function (DUF1889). This domain is found in a set of hypothetical bacterial proteins. 119
51848 370227 pfam08987 DUF1892 Protein of unknown function (DUF1892). Members of this family, that are synthesized by Saccharomycetes, adopt a structure consisting of a four-stranded beta-sheet, with strand order beta2-beta1-beta4-beta3, and two alpha-helices, with an overall topology of beta-beta-alpha-beta-beta-alpha. They have no known function. 107
51849 401075 pfam08988 T3SS_needle_E Type III secretion system, cytoplasmic E component of needle. T3SS_needle_E is a family of proteins from the operon that builds and controls the needle of the injection system of type III secretion. The YscE protein, produced by the pathogen Yersinia, assumes a secondary structure composed of two anti-parallel alpha-helices separated by a flexible loop. The family is cytoplasmic and may help to stabilize and prevent early polymerization of the needle-protein F. 66
51850 401076 pfam08989 DUF1896 Domain of unknown function (DUF1896). This domain is found in a set of hypothetical bacterial proteins. 142
51851 401077 pfam08990 Docking Erythronolide synthase docking. The N terminal docking domain found in modular polyketide synthase assumes an alpha-helical structure, wherein two alpha-helices are connected by a short loop. Two such N-terminal domains dimerize to form amphipathic parallel alpha-helical coiled coils: dimerization is essential for protein function. 29
51852 401078 pfam08991 MTCP1 Mature-T-Cell Proliferation I type. Members of this family adopt a coiled coil structure, with two antiparallel alpha-helices that are tightly strapped together by two disulfide bridges at each end. The protein sequence shows a cysteine motif, required for the stabilisation of the coiled-coil-like structure. Additional inter-helix hydrophobic contacts impart stability to this scaffold. The precise function of this eukaryotic domain is, as yet, unknown. MTCP1 is found in mitochondria. MTCP1 (Mature-T-Cell Proliferation) is the first gene unequivocally identified in the group of uncommon leukemias with a mature phenotype. 62
51853 401079 pfam08992 QH-AmDH_gamma Quinohemoprotein amine dehydrogenase, gamma subunit. Members of this family contain a cross-linked, proteinous quinone cofactor, cysteine tryptophylquinone, which is required for catalysis of the oxidative deamination of a wide range of aliphatic and aromatic amines. The domain assumes a globular secondary structure, with two short alpha-helices having many turns and bends. 75
51854 401080 pfam08993 T4_Gp59_N T4 gene Gp59 loader of gp41 DNA helicase. Bacteriophage T4 gene-59 helicase assembly protein is required for recombination-dependent DNA replication, which is the predominant mode of DNA replication in the late stage of T4 infection. T4 gene-59 helicase assembly protein accelerates the loading of the T4 gene-41 helicase during DNA synthesis by the T4 replication system in vitro. T4 gene-59 helicase assembly protein binds to both T4 gene-41 helicase and T4 gene-32 single-stranded DNA binding protein, and to single and double-stranded DNA. The structure of T4 gene-59 helicase assembly protein reveals a novel alpha-helical bundle fold with two domains of similar size, this being the N-terminal domain that consists of six alpha-helices linked by loop segments and short turns. The surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. This domain has structural similarity to members of the high-mobility-group (HMG) family of DNA minor groove binding proteins including rat HMG1A and lymphoid enhancer-binding factor, and is required for binding of the helicase to the DNA minor groove. 93
51855 401081 pfam08994 T4_Gp59_C T4 gene Gp59 loader of gp41 DNA helicase C-term. Bacteriophage T4 gene-59 helicase assembly protein is required for recombination-dependent DNA replication, which is the predominant mode of DNA replication in the late stage of T4 infection. T4 gene-59 helicase assembly protein accelerates the loading of the T4 gene-41 helicase during DNA synthesis by the T4 replication system in vitro. T4 gene-59 helicase assembly protein binds to both T4 gene-41 helicase and T4 gene-32 single-stranded DNA binding protein, and to single and double-stranded DNA. The structure of T4 gene-59 helicase assembly protein reveals a novel alpha-helical bundle fold with two domains of similar size, this being the C-terminal domain that consists of seven alpha-helices with short intervening loops and turns. The surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. The hydrophobic region on the 'bottom' surface of the domain near the C-terminal helix binds the leading strand DNA, whilst the hydrophobic region on the 'top' surface of the domain lies between the two arms of the fork DNA, allowing for T4 gene 41 helicase binding and assembly into a hexameric complex around the lagging strand. 109
51856 117561 pfam08995 NIP_1 Necrosis inducing protein-1. Necrosis inducing protein-1, a fungal avirulence protein produced by plants, consists of two parts containing beta-sheets of two and three anti-parallel strands, respectively. Five intramolecular disulfide bonds stabilize these parts and their position with respect to each other, providing a high level of stability. 82
51857 401082 pfam08996 zf-DNA_Pol DNA Polymerase alpha zinc finger. The DNA Polymerase alpha zinc finger domain adopts an alpha-helix-like structure, followed by three turns, all of which involve proline. The resulting motif is a helix-turn-helix motif, in contrast to other zinc finger domains, which show anti-parallel sheet and helix conformation. Zinc binding occurs due to the presence of four cysteine residues positioned to bind the metal centre in a tetrahedral coordination geometry. Function of this domain is uncertain: it has been proposed that the zinc finger motif may be an essential part of the DNA binding domain. 184
51858 401083 pfam08997 UCR_6-4kD Ubiquinol-cytochrome C reductase complex, 6.4kD protein. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is an essential component of the mitochondrial cellular respiratory chain. This family represents the 6.4kD protein, which may be closely linked to the iron-sulphur protein in the complex and function as an iron-sulphur protein binding factor. 51
51859 370234 pfam08998 Epsilon_antitox Bacterial epsilon antitoxin. The epsilon antitoxin, produced by various prokaryotes, forms part of a postsegregational killing system which is involved in the initiation of programmed cell death of plasmid-free cells. The protein is folded into a three-helix bundle that directly interacts with the zeta toxin, inactivating it. 89
51860 286135 pfam08999 SP_C-Propep Surfactant protein C, N terminal propeptide. The N-terminal propeptide of surfactant protein C adopts an alpha-helical structure, with turn and extended regions. Its main function is the stabilisation of metastable surfactant protein C (SP-C), since the latter can irreversibly transform from its native alpha-helical structure to beta-sheet aggregates and form amyloid-like fibrils. The correct intracellular trafficking of proSP-C has also been reported to depend on the propeptide. 96
51861 401084 pfam09000 Cytotoxic Cytotoxic. The cytotoxic domain confers cytotoxic activity to proteins, enabling the formation of nucleolytic breaks in 16S ribosomal RNA. The structure of the domain reveals a highly twisted central beta-sheet elaborated with a short N-terminal alpha-helix. 82
51862 401085 pfam09001 DUF1890 Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins. 141
51863 312526 pfam09002 DUF1887 Domain of unknown function (DUF1887). This domain is found in a set of hypothetical bacterial proteins. 379
51864 370236 pfam09003 Arm-DNA-bind_1 Bacteriophage lambda integrase, Arm DNA-binding domain. The amino terminal domain of bacteriophage lambda integrase folds into a three-stranded, antiparallel beta-sheet that packs against a C-terminal alpha-helix, adopting a fold that is structurally related to the three-stranded beta-sheet family of DNA-binding domains (which includes the GCC-box DNA-binding domain and the N-terminal domain of Tn916 integrase). This domain is responsible for high-affinity binding to each of the five DNA arm-type sites and is also a context-sensitive modulator of DNA cleavage. 72
51865 117570 pfam09004 DUF1891 Domain of unknown function (DUF1891). This domain is found in a set of hypothetical eukaryotic proteins. 38
51866 401086 pfam09005 DUF1897 Domain of unknown function (DUF1897). This domain is found in Psi proteins produced by Drosophila, and in various eukaryotic hypothetical proteins. It has no known function. 36
51867 286141 pfam09006 Surfac_D-trimer Lung surfactant protein D coiled-coil trimerisation. This domain, predominantly found in lung surfactant protein D, forms a triple-helical parallel coiled coil, and mediates trimerisation of the protein. 46
51868 401087 pfam09007 EBP50_C EBP50, C-terminal. This C terminal domain allows interaction of EBP50 with FERM (four-point one ERM) domains, resulting in the activation of Ezrin-radixin-moesin (ERM), with subsequent cytoskeletal modulation and cellular growth control. It includes a disordered section between two reasonably well conserved hydrophobic regions. 127
51869 401088 pfam09008 Head_binding Head binding. The head binding domain found in the Phage P22 tailspike protein contains two regular beta-sheets, A and B, oriented nearly perpendicular to each other and composed of five and three strands respectively. The topology of the strands is exclusively antiparallel. The tailspike protein trimerizes through this domain, and the direction of the strands with respect to the molecular triad is almost parallel for beta-sheet A, whereas beta-sheet B is perpendicular to the triad, forming a dome-like structure. This domain is dispensable for thermostability and SDS resistance of the intact protein, and its deletion has only minor effects on tailspike folding kinetics. 103
51870 401089 pfam09009 Exotox-A_cataly Exotoxin A catalytic. Members of this family, which are found in prokaryotic exotoxin A, catalyze the transfer of ADP ribose from nicotinamide adenine dinucleotide (NAD) to elongation factor-2 in eukaryotic cells, with subsequent inhibition of protein synthesis. 168
51871 401090 pfam09010 AsiA Anti-Sigma Factor A. Anti-sigma factor A is a transcriptional inhibitor that inhibits sigma 70-directed transcription by weakening its interaction with the core of the host's RNA polymerase. It is an all-helical protein, composed of six helical segments and intervening loops and turns, as well as a helix-turn-helix DNA binding motif, although neither free anti-sigma factor nor anti-sigma factor bound to sigma-70 has been shown to interact directly with DNA. In solution, the protein forms a symmetric dimer of small (10.59 kDa) protomers, which are composed of helix and coil regions and are devoid of beta-strand/sheet secondary structural elements. 85
51872 401091 pfam09011 HMG_box_2 HMG-box domain. This short 71 residue domain is an HMG-box domain. HMG-box domains mediate re-modelling of chromatin-structure. Mammalian HMG-box proteins are of two types: those that are non-sequence-specific DNA-binding proteins with two HMG-box domains and a long highly acidic C-tail; and a diverse group of sequence-specific transcription factor-proteins with either a single HMG-box or up to six copies, and no acidic C-tail. 71
51873 401092 pfam09012 FeoC FeoC like transcriptional regulator. This family contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor. 69
51874 401093 pfam09013 YopH_N YopH, N-terminal. The N-terminal domain of YopH is a compact structure composed of four alpha-helices and two beta-hairpins. Helices alpha-1 and alpha-3 are parallel to each other and antiparallel to helices alpha-2 and alpha-4. This domain targets YopH for secretion from the bacterium and translocation into eukaryotic cells, and has phosphotyrosyl peptide-binding activity, allowing for recognition of p130Cas and paxillin. 123
51875 401094 pfam09014 Sushi_2 Beta-2-glycoprotein-1 fifth domain. The fifth domain of beta-2-glycoprotein-1 (b2GP-1) is composed of four well-defined anti-parallel beta-strands and two short alpha-helices, as well as a long highly flexible loop. It plays an important role in the binding of b2GP-1 to negatively charged compounds and subsequent capture for binding of anti-b2GP-1 antibodies. 88
51876 401095 pfam09015 NgoMIV_restric NgoMIV restriction enzyme. Members of this family are prokaryotic DNA restriction enzymes, exhibiting an alpha/beta structure, with a central region comprising a mixed six-stranded beta-sheet with alpha-helices on each side. A long 'arm' protrudes out of the core of the domain between strands beta2 and beta3 and is mainly involved in the tetramerisation interface of the protein. These restriction enzymes recognize the double-stranded sequence GCCGGC and cleave after G-1. 275
51877 312534 pfam09016 Pas_Saposin Pas factor saposin fold. Members of this family adopt a compact structure comprising five alpha helices. Charged and polar residues are exposed mostly on the surface, while most of the hydrophobic residues are buried inside the hydrophobic core of the helical bundle. The precise function of this domain is unknown, but it has been shown to induce secretion of periplasmic proteins, especially collagenase. 76
51878 117583 pfam09017 Transglut_prok Microbial transglutaminase. Microbial transglutaminase (MTG) catalyzes an acyl transfer reaction by means of a Cys-Asp diad mechanism, in which the gamma-carboxyamide groups of peptide-bound glutamine residues act as the acyl donors. The MTG molecule forms a single, compact domain belonging to the alpha+beta folding class, containing 11 alpha-helices and 8 beta-strands. The alpha-helices and the beta-strands are concentrated mainly at the amino and carboxyl ends of the polypeptide, respectively. These secondary structures are arranged so that a beta-sheet is surrounded by alpha-helices, which are clustered into three regions. 414
51879 312535 pfam09018 Phage_Capsid_P3 P3 major capsid protein. The P3 major capsid protein adopts a 'double-barrel' structure comprising two eight-stranded viral beta-barrels or jelly rolls, each of which contains a 12-residue alpha-helix. This protein then trimerizes through a 'trimerisation loop' sequence, and is incorporated within the viral capsid. 394
51880 401096 pfam09019 EcoRII-C EcoRII C terminal. The C-terminal catalytic domain of the Restriction Endonuclease EcoRII has a restriction endonuclease-like fold with a central five-stranded mixed beta-sheet surrounded on both sides by alpha-helices. It cleaves DNA specifically at single 5' CCWGG sites. 165
51881 286154 pfam09020 YopE_N YopE, N terminal. The N terminal YopE domain targets YopE for secretion from the bacterium and translocation into eukaryotic cells. 126
51882 401097 pfam09021 HutP HutP. The HutP protein family regulates the expression of Bacillus 'hut' structural genes by an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of alpha/beta proteins, arranged in the order: alpha-alpha-beta-alpha-alpha-beta-beta-beta in the primary structure, and the four antiparallel beta-strands form a beta-sheet in the order beta1-beta2-beta3-beta4, with two alpha-helices each on the front (alpha1 and alpha2) and at the back (alpha3 and alpha4) of the beta-sheet. 128
51883 286156 pfam09022 Staphostatin_A Staphostatin A. The staphostatin A polypeptide chain folds into a slightly deformed, eight-stranded beta-barrel, with strands beta-4 through beta-8 forming an antiparallel sheet while the N-terminus forms a psi-loop motif. Members of this family constitute a class of cysteine protease inhibitors distinct in the fold and the mechanism of action from any known inhibitors of these enzymes. 105
51884 401098 pfam09023 Staphostatin_B Staphostatin B. Staphostatin B inhibits the cysteine protease Staphopain B, produced by Staphylococcus aureus, by blocking the active site of the enzyme. The domain adopts an eight-stranded mixed beta-barrel structure, with a deviation from the up-down topology of canonical beta-barrels in the amino-terminal part of the molecule. 105
51885 286158 pfam09025 T3SS_needle_reg YopR, type III needle-polymerization regulator. The YopR core domain, predominantly found in the Gammaproteobacteria virulence factor YopR, is composed of five alpha-helices, four of which are arranged in an antiparallel bundle. Little is known about this domain, though it may contribute to the virulence of the protein YopR. YopR controls the selective access of early (YscF, YscI and YscP) substrates to the type III secretion machines of yersiniae and other Gammaproteobacteriae. YopR is a mobile regulatory component thought to function as a checkpoint probing the completion of discrete intermediary stages in the assembly of the type III injection pathway. The location of secreted YopR (into the medium) is directly controlling the secretion of YscF, the polymerized needle protein pfam09392, thereby impacting the assembly of type III machines. 145
51886 286159 pfam09026 CENP-B_dimeris Centromere protein B dimerization domain. The centromere protein B (CENP-B) dimerization domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerization of CENP-B is mediated by this domain, in which monomers dimerize to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation. 100
51887 401099 pfam09027 GTPase_binding GTPase binding. The GTPase binding domain binds to the G protein Cdc42, inhibiting both its intrinsic and stimulated GTPase activity. The domain is largely unstructured in the absence of Cdc42. 66
51888 401100 pfam09028 Mac-1 Mac 1. The bacterial protein Mac 1 adopts an alpha/beta fold, with 14 beta strands and 9 alpha helices. The N-terminal domain is made up predominantly of alpha helices, whereas the C-terminal domain consists predominantly of beta sheets. Mac 1 blocks polymorphonuclear opsonophagocytosis, inhibits the production of reactive oxygen species and contains IgG endopeptidase activity. 347
51889 401101 pfam09029 Preseq_ALAS 5-aminolevulinate synthase presequence. The N terminal presequence domain found in 5-aminolevulinate synthase exists as an amphipathic helix, with a positively charged surface provided by lysine residues and no stable helix at the N-terminus. The domain is essential for the import process by which ALAS is transported into the mitochondria: translocase of the outer membrane (Tom) and translocase of the inner membrane protein complexes appear responsible for recognition and import through the mitochondrial membrane. The protein Tom20 is anchored to the mitochondrial outer membrane, and its interaction with presequences is thought to be the recognition step which allows subsequent import. 114
51890 401102 pfam09030 Creb_binding Creb binding. The Creb binding domain assumes a structure comprising three alpha-helices which pack in a bundle, exposing a hydrophobic groove between alpha-1 and alpha-3 within which complementary domains found in the protein 'activator for thyroid hormone and retinoid receptors' (ACTR) can dock. Docking of these domains is required for the recruitment of RNA polymerase II and the basal transcription machinery. 104
51891 401103 pfam09032 Siah-Interact_N Siah interacting protein, N terminal. The N terminal domain of Siah interacting protein (SIP) adopts a helical hairpin structure with a hydrophobic core stabilized by a classic knobs-and-holes arrangement of side chains contributed by the two amphipathic helices. Little is known about this domain's function, except that it is crucial for interactions with Siah. It has also been hypothesized that SIP can dimerize through this N terminal domain. 76
51892 401104 pfam09033 DFF-C DNA Fragmentation factor 45kDa, C terminal domain. The C terminal domain of DNA Fragmentation factor 45kDa (DFF-C) consists of four alpha-helices, which are folded in a helix-packing arrangement, with alpha-2 and alpha-3 packing against a long C-terminal helix (alpha-4). The main function of this domain is the inhibition of DFF40 by binding to its C-terminal catalytic domain through ionic interactions, thereby inhibiting the fragmentation of DNA in the apoptotic process. In addition to blocking the DNase activity of DFF40, the C-terminal region of DFF45 is also important for the DFF40-specific folding chaperone activity, as demonstrated by the ability of DFF45 to refold DFF40. 165
51893 401105 pfam09034 TRADD_N TRADD, N-terminal domain. The N terminal domain of 'tumor necrosis factor receptor type 1 associated death domain protein' (TRADD) folds into an alpha-beta sandwich with a four-stranded beta sheet and six alpha helices, each forming one layer of the structure. The domain allows docking of TRADD onto 'tumor necrosis factor receptor-associated factor' (TRAF): the binding is at the beta-sandwich domain, away from the coiled-coil domain. Binding ensures the recruitment of cIAPs to the signaling complex, which may be important for direct caspase-8 inhibition and the immediate suppression of apoptosis at the apical point of the cascade. 111
51894 286167 pfam09035 Tn916-Xis Excisionase from transposon Tn916. The phage-encoded excisionase protein Tn916-Xis adopts a winged-helix structure that consists of a three-stranded anti-parallel beta-sheet that packs against a helix-turn-helix (HTH) motif and a third C-terminal alpha-helix. It is encoded for by Tn916, which also codes for the integrase Tn916-Int. The protein interacts with DNA by the insertion of helix alpha-2 into the major groove and the contact of the hairpin that connects strands beta-2 and beta-3 with the adjacent phosphodiester backbone and/or minor groove. Tn916-Xis stimulates phage excision and inhibits viral integration by stabilizing distorted DNA structures. 62
51895 401106 pfam09036 Bcr-Abl_Oligo Bcr-Abl oncoprotein oligomerization domain. The Bcr-Abl oncoprotein oligomerization domain consists of a short N-terminal helix (alpha-1), a flexible loop and a long C-terminal helix (alpha-2). Together these form an N-shaped structure, with the loop allowing the two helices to assume a parallel orientation. The monomeric domains associate into a dimer through the formation of an antiparallel coiled coil between the alpha-2 helices and domain swapping of two alpha-1 helices, where one alpha-1 helix swings back and packs against the alpha-2 helix from the second monomer. Two dimers then associate into a tetramer. The oligomerization domain is essential for the oncogenicity of the Bcr-Abl protein. 73
51896 401107 pfam09037 Sulphotransf Stf0 sulphotransferase. Members of this family are essential for the biosynthesis of sulpholipid-1 in prokaryotes. They adopt a structure that belongs to the sulphotransferase superfamily, consisting of a single domain with a core four-stranded parallel beta-sheet flanked by alpha-helices. 244
51897 401108 pfam09038 53-BP1_Tudor tumor suppressor p53-binding protein-1 Tudor. Members of this family consist of ten beta-strands and a carboxy-terminal alpha-helix. The amino-terminal five beta-strands and the C-terminal five beta-strands adopt folds that are identical to each other. This domain is essential for the recruitment of proteins to double stranded breaks in DNA, which is mediated by interaction with methylated Lys 79 of histone H3. 122
51898 370256 pfam09039 HTH_Tnp_Mu_2 Mu DNA binding, I gamma subdomain. Members of this family are responsible for binding the DNA attachment sites at each end of the Mu genome. They adopt a secondary structure comprising a four helix bundle tightly packed around a hydrophobic core consisting of aliphatic and aromatic amino acid residues. Helices 1 and 2 are oriented antiparallel to each other. Helix 3 crosses helices 1 and 2 at angles of 60 and 120 degrees, respectively. Excluding the C-terminal helix 4, the fold of the I-gamma subdomain is remarkably similar to that of the homeodomain family of helix-turn-helix DNA-binding proteins, although their amino acid sequences are completely unrelated. 109
51899 312544 pfam09040 H-K_ATPase_N Gastric H+/K+-ATPase, N terminal domain. Members of this family adopt an alpha-helical conformation under hydrophobic conditions. The domain contains tyrosine residues, phosphorylation of which regulates the function of the ATPase. Additionally, the domain also interacts with various structural proteins, including the spectrin-binding domain of ankyrin III. 43
51900 370257 pfam09041 Aurora-A_bind Aurora-A binding. The Aurora-A binding domain binds to two distinct sites on the Aurora kinase: the upstream residues bind at the N-terminal lobe, whilst the downstream residues bind in an alpha-helical conformation between the N- and C-terminal lobes. The two Aurora-A binding motifs are connected by a flexible linker that is variable in length and sequence across species. Binding of the domain results in strong activation of Aurora-A and protection from deactivating dephosphorylation by phosphatase PP1. 68
51901 401109 pfam09042 Titin_Z Titin Z. The titin Z domain, that recognizes and binds to the C-terminal calmodulin-like domain of alpha-actinin-2 (Act-EF34), adopts a helical structure, and binds in a groove formed by the two planes between the helix pairs of Act-EF34. This interaction is essential for sarcomere assembly. 40
51902 401110 pfam09043 Lys-AminoMut_A D-Lysine 5,6-aminomutase TIM-barrel domain of alpha subunit. Members of this family are involved in the 1,2 rearrangement of the terminal amino group of DL-lysine and of L-beta-lysine, using adenosylcobalamin (AdoCbl) and pyridoxal-5'-phosphate as co-factors. The structure is predominantly a PLP-binding TIM barrel domain, with several additional alpha-helices and beta-strands at the N and C termini. These helices and strands form an intertwined accessory clamp structure that wraps around the sides of the TIM barrel and extends up toward the Ado ligand of the Cbl co-factor, providing most of the interactions observed between the protein and the Ado ligand of the Cbl, suggesting that its role is mainly in stabilizing AdoCbl in the precatalytic resting state. This is a TIM-barrel domain. 508
51903 370259 pfam09044 Kp4 Kp4. Members of this fungal family of toxins specifically inhibit voltage-gated calcium channels in mammalian cells. They adopt an alpha/beta-sandwich structure, comprising a five-stranded antiparallel beta-sheet with two antiparallel alpha-helices lying at approximately 45 degrees to these strands. 123
51904 312549 pfam09045 L27_2 L27_2. The L27_2 domain is a protein-protein interaction domain capable of organising scaffold proteins into supramolecular assemblies by formation of heteromeric L27_2 domain complexes. L27_2 domain-mediated protein assemblies have been shown to play essential roles in cellular processes including asymmetric cell division, establishment and maintenance of cell polarity, and clustering of receptors and ion channels. Members of this family form specific heterotetrameric complexes, in which each domain contains three alpha-helices. The two N-terminal helices of each L27_2 domain pack together to form a tight, four-helix bundle in the heterodimer, whilst the third helix of each L27_2 domain forms another four-helix bundle that assembles the two units of the heterodimer into a tetramer. 58
51905 401111 pfam09046 AvrPtoB-E3_ubiq AvrPtoB E3 ubiquitin ligase. The E3 ubiquitin ligase domain found in the bacterial protein AvrPtoB inhibits immunity-associated programmed cell death (PCD) when translocated into plant cells, probably by recruiting E2 enzymes and transferring ubiquitin molecules to cellular proteins involved in regulation of PCD and targeting them for degradation. The structure of this domain reveals a globular fold centred on a four-stranded beta-sheet that packs against two helices on one face and has three very extended loops connecting the elements of secondary structure, with remarkable homology to the RING-finger and U-box families of proteins involved in ubiquitin ligase complexes in eukaryotes. 118
51906 370261 pfam09047 MEF2_binding MEF2 binding. The myocyte enhancer factor-2 (MEF2) binding domain, predominantly found in the calcineurin-binding protein CABIN 1, adopts an amphipathic alpha-helical structure, which allows it to bind a hydrophobic groove on the MEF2S domain, forming a triple-helical interaction. Interaction of this domain with MEF2 causes repression of transcription. 35
51907 401112 pfam09048 Cro Cro. Members of this family are involved in the repression of transcription by binding as a homodimer to palindromic DNA operator sites in phage lambda: they repress genes expressed in early phage development and are necessary for the late stage of lytic growth. These proteins have a secondary structure consisting of three alpha-helices and three beta-sheets, and dimerize through interactions between the two antiparallel beta-strands. 59
51908 401113 pfam09049 SNN_transmemb Stannin transmembrane. Members of this family consist of a single highly hydrophobic transmembrane helix that transverses the lipid bilayer at a 20 degree angle with respect to the membrane normal. They contain a conserved cysteine residue (Cys32) that, together with Cys34 found in the stannin unstructured linker domain, constitutes the putative trimethyltin-binding site that resides at the end of the transmembrane domain close to the lipid/solvent interface. 32
51909 370263 pfam09050 SNN_linker Stannin unstructured linker. Members of this family are unstructured, acting as connectors of the stannin helical domains. They contain a conserved CXC metal-binding motif and a putative 14-3-3-zeta binding domain. Upon coordinating dimethyltin, considerable structural or dynamic changes in the flexible loop region of SNN may take place, recruiting other binding partners such as 14-3-3-zeta, and thereby initiating the apoptotic cascade. 26
51910 401114 pfam09051 SNN_cytoplasm Stannin cytoplasmic. Members of this family consist of a distorted cytoplasmic helix that is partially absorbed into the plane of the lipid bilayer with a tilt angle of approximately 80 degrees from the membrane normal. They interact with the surface of the lipid bilayer, and contribute to the initiation of the apoptotic cascade on binding of the unstructured linker domain to dimethyltin. 26
51911 401115 pfam09052 SipA Salmonella invasion protein A. Salmonella invasion protein A is an actin-binding protein that contributes to host cytoskeletal rearrangements by stimulating actin polymerization and counteracting F-actin destabilizing proteins. Members of this family possess an all-helical fold consisting of eight alpha-helices arranged so that six long, amphipathic helices form a compact fold that surrounds a final, predominantly hydrophobic helix in the middle of the molecule. 213
51912 401116 pfam09053 CagZ CagZ. CagZ is a 23 kDa protein consisting of a single compact L-shaped domain, composed of seven alpha-helices that run antiparallel to each other. 70% of the residues are in alpha-helix conformation and no beta-sheet is present. CagZ is essential for the translocation of the pathogenic protein CagA into host cells. 198
51913 401117 pfam09055 Sod_Ni Nickel-containing superoxide dismutase. Nickel containing superoxide dismutase (NiSOD) is a metalloenzyme containing a hexameric assembly of right-handed 4-helix bundles of up-down-up-down topology with an N-terminal His-Cys-X-X-Pro-Cys-Gly-X-Tyr motif that chelates the active site Ni ions. NiSOD catalyzes the disproportionation of superoxide to peroxide and molecular oxygen through alternate oxidation and reduction of Ni, protecting cells from the toxic products of aerobic metabolism. 127
51914 401118 pfam09056 Phospholip_A2_3 Prokaryotic phospholipase A2. The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments. 102
51915 401119 pfam09057 Smac_DIABLO Second Mitochondria-derived Activator of Caspases. Second Mitochondria-derived Activator of Caspases promotes apoptosis by activating caspases in the cytochrome c/Apaf-1/caspase-9 pathway, and by opposing the inhibitory activity of inhibitor of apoptosis proteins (XIAP-BIR3). The protein assumes an elongated three-helix bundle structure, and forms a dimer in solution. 237
51916 401120 pfam09058 L27_1 L27_1. The L27 domain is a protein interaction module that exists in a large family of scaffold proteins, functioning as an organisation centre of large protein assemblies required for the establishment and maintenance of cell polarity. L27 domains form specific heterotetrameric complexes, in which each domain contains three alpha-helices. 61
51917 401121 pfam09059 TyeA TyeA. Members of this family are composed of two pairs of parallel alpha-helices, and interact with the bacterial protein YopN via hydrophobic residues located on the helices. Association of TyeA with the C-terminus of YopN is accompanied by conformational changes in both polypeptides that create order out of disorder: the resulting structure then serves as an impediment to type III secretion of YopN. 81
51918 401122 pfam09060 L27_N L27_N. The L27_N domain plays a role in the biogenesis of tight junctions and in the establishment of cell polarity in epithelial cells. Each L27_N domain consists of three alpha-helices, the first two of which form an antiparallel coiled-coil. Two L27 domains come together to form a four-helical bundle with the antiparallel coiled-coils formed by the first two helices. The third helix of each domain forms another coiled-coil packing at one end of the four-helix bundle, creating a large hydrophobic interface: the hydrophobic interactions are the major force that drives heterodimer formation. 48
51919 401123 pfam09061 Stirrup Stirrup. The Stirrup domain, found in the prokaryotic protein ribonucleotide reductase, has a molecular mass of 9 kDa and is folded into an alpha/beta structure. It allows for binding of the reductase to DNA via electrostatic interactions, since it has a predominance of positive charges distributed on its surface. 79
51920 401124 pfam09062 Endonuc_subdom PI-PfuI Endonuclease subdomain. The endonuclease subdomain, found in the prokaryotic protein ribonucleotide reductase, assumes an alpha-beta-beta-alpha-beta-beta-alpha-alpha topology. The four stranded beta-sheet forms a saddle-shaped surface and assembles together through an interface made of alpha-helices. The presence of 14 basic residues on the surface of the beta-sheets suggests that this large groove may be involved in DNA binding. 98
51921 401125 pfam09063 Phage_coat Phage PP7 coat protein. Members of this family form the capsid of P. aeruginosa phage PP7. They adopt a secondary structure consisting of a six stranded beta sheet and an alpha helix. 127
51922 312559 pfam09064 Tme5_EGF_like Thrombomodulin like fifth domain, EGF-like. Members of this family adopt a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C. 34
51923 72483 pfam09065 Haemadin Haemadin. Members of this family adopt a secondary structure consisting of five short beta-strands (beta1-beta5), which are arranged in two antiparallel distorted sheets formed by strands beta1-beta4-beta5 and beta2-beta3 facing each other. This beta-sandwich is stabilized by six enclosed cysteines arranged in a [1-2, 3-5, 4-6] disulphide pairing resulting in a disulphide-rich hydrophobic core that is largely inaccessible to bulk solvent. The close proximity of disulfide bonds [3-5] and [4-6] organizes haemadin into four distinct loops. The N-terminal segment of this domain binds to the active site of thrombin, inhibiting it. 27
51924 401126 pfam09066 B2-adapt-app_C Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerization. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15). 109
51925 401127 pfam09067 EpoR_lig-bind Erythropoietin receptor, ligand binding. Members of this family interact with erythropoietin (EPO), with subsequent initiation of the downstream chain of events associated with binding of EPO to the receptor, including EPO-induced erythroblast proliferation and differentiation through induction of the JAK2/STAT5 signaling cascade. The domain adopts a secondary structure composed of a short amino-terminal helix, followed by two beta-sandwich regions. 104
51926 401128 pfam09068 EF-hand_2 EF hand. Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding. 121
51927 401129 pfam09069 EF-hand_3 EF-hand. Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding. 90
51928 401130 pfam09070 PFU PFU (PLAA family ubiquitin binding). This domain is found N terminal to pfam08324 and binds to ubiquitin. 111
51929 286197 pfam09071 Alpha-amyl_C Alpha-amylase, C terminal. Members of this family, which are found in the prokaryotic protein glycosyltrehalose trehalohydrolase, assume a gamma-crystallin-type fold with a five-stranded anti-parallel beta-sheet that packs against the C-terminal side of a beta-alpha barrel. This domain is common to family 13 glycosidases and typically contains a five to ten strand beta-sheet, however its precise fold varies. 67
51930 401131 pfam09072 TMA7 Translation machinery associated TMA7. TMA7 plays a role in protein translation. Deletion of the TMA7 gene results in altered protein synthesis rates. 62
51931 401132 pfam09073 BUD22 BUD22. BUD22 has been shown in yeast to be a nuclear protein involved in bud-site selection. It plays a role in positioning the proximal bud pole signal. More recently it has been shown to be involved in ribosome biogenesis. 426
51932 401133 pfam09074 Mer2 Mer2. Mer2 (Rec107) forms part of a complex that is required for meiotic double strand DNA break formation. Mer2 increases in abundance and is phosphorylated during the prophase phase of cell division. Blocking double strand break formation results in delayed dephosphorylation and dissociation of Mer2 from the chromosome. 193
51933 401134 pfam09075 STb_secrete Heat-stable enterotoxin B, secretory. Members of this family assume a helical secondary structure, with two alpha helices forming a disulphide crosslinked alpha-helical hairpin. The disulphide bonds are crucial for the toxic activity of the protein, and are required for maintenance of the tertiary structure, and subsequent interaction with the particulate form of guanylate cyclase, increasing cyclic GMP levels within the host intestinal epithelial cells. 48
51934 370280 pfam09076 Crystall_2 Beta/Gamma crystallin. Members of this family assume a beta-gamma-crystallin fold, wherein nine beta-strands are connected by loops, and are separated into two sheets, each sheet forming the Greek key motif. The two Greek key motifs face each other in the global topology. The three-dimensional structure of the molecule is a 'sandwich'-shaped beta-barrel structure: hydrophobic side-chains are packed in the large interface area of the beta-sheets. In Streptomyces killer toxin-like protein, the domain confers a cytocidal effect to the toxin, causing cell death in both budding and fission yeasts, and morphological changes in yeasts and filamentous fungi. This family also includes chitin-binding antifungal proteins. 69
51935 401135 pfam09077 Phage-MuB_C Mu B transposition protein, C terminal. The C terminal domain of the B transposition protein from Bacteriophage Mu comprises four alpha-helices arranged in a loosely packed bundle, where helix alpha1 runs parallel to alpha3, and anti-parallel to helices alpha2 and alpha4. The domain allows for non-specific binding of Mu to double-stranded DNA, allowing for integration into the bacterial genome, and mediates dimerization of the protein. 78
51936 401136 pfam09078 CheY-binding CheY binding. Members of this family adopt a secondary structure consisting of an open-face beta/alpha sandwich, with four antiparallel beta-strands and two alpha-helices. They bind to a corresponding domain on CheY, with subsequent phosphorylation of the CheY Asp57 residue, and activation of CheY, which then affects flagellar rotation. 63
51937 401137 pfam09079 Cdc6_C CDC6, C terminal winged helix domain. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localization factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity. 83
51938 401138 pfam09080 K-cyclin_vir_C K cyclin, C terminal. Members of this family adopt a secondary structure consisting of a five alpha-helix cyclin fold. Interaction with cyclin dependent kinases (CDKs) at a PSTAIRE sequence motif within the catalytic cleft of CDK results in the regulation of CDK activity. 104
51939 401139 pfam09081 DUF1921 Domain of unknown function (DUF1921). This domain, which is found in a set of prokaryotic amylases, has no known function. 51
51940 401140 pfam09082 DUF1922 Domain of unknown function (DUF1922). Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C-terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown. 65
51941 72501 pfam09083 DUF1923 Domain of unknown function (DUF1923). Members of this family are found in maltosyltransferases, and adopt a secondary structure consisting of eight antiparallel beta-strands, which form an open-sided 'jelly roll' Greek key beta-barrel. Their exact function is, as yet, unknown. 64
51942 401141 pfam09084 NMT1 NMT1/THI5 like. This family contains the NMT1 and THI5 proteins. These proteins are proposed to be required for the biosynthesis of the pyrimidine moiety of thiamine. They are regulated by thiamine. The protein adopts a fold related to the periplasmic binding protein (PBP) family. Both pyridoxal-5'-phosphate (PLP) and an iron atom are bound to the protein suggesting numerous residues of the active site necessary for HMP-P biosynthesis. The yeast protein is a dimer and, although exceptionally using PLP as a substrate, has notable similarities with enzymes dependent on this molecule as a cofactor. 216
51943 401142 pfam09085 Adhes-Ig_like Adhesion molecule, immunoglobulin-like. Members of this family are found in a set of mucosal cellular adhesion proteins and adopt an immunoglobulin-like beta-sandwich structure, with seven strands arranged in two beta-sheets in a Greek-key topology. They are essential for recruitment of lymphocytes to specific tissues. 107
51944 401143 pfam09086 DUF1924 Domain of unknown function (DUF1924). This domain is found in a set of bacterial proteins, including Cytochrome c-type protein. It is functionally uncharacterized. 91
51945 401144 pfam09087 Cyc-maltodext_N Cyclomaltodextrinase, N-terminal. Members of this family assume a beta-sandwich structure composed of eight antiparallel beta-strands. A ten residue linker is also present at the C-terminal end, which connects the N terminal domain to a distal domain in the protein. This domain participates in oligomerization of the protein, wherein the N-terminal domain of one subunit contacts the active centre of the other subunit, and is also required for binding of cyclodextrin to substrate. 88
51946 370287 pfam09088 MIF4G_like MIF4G like. Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices. 191
51947 337290 pfam09089 gp12-short_mid Phage short tail fibre protein gp12, middle domain. Members of this family adopt a right-handed triple-stranded beta-helix fold, and are found in the middle of the phage short tail fibre protein gp12. 81
51948 401145 pfam09090 MIF4G_like_2 MIF4G like. Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices. 263
51949 401146 pfam09092 Lyase_N Lyase, N terminal. Members of this family are predominantly found in chondroitin ABC lyase I, and adopt a jelly-roll fold topology consisting of a two-layered bent beta-sheet sandwich with one short alpha-helix. The convex beta sheet is composed of five antiparallel strands, whilst the concave beta-sheet contains five antiparallel beta-strands with a loop between two consecutive strands folding back onto the concave surface. This domain is required for binding of the protein to long glycosaminoglycan chains. 167
51950 401147 pfam09093 Lyase_catalyt Lyase, catalytic. Members of this family are predominantly found in chondroitin ABC lyase I, and adopt a helical structure, with fifteen alpha-helices which are at least two turns long and several short helical turns. The bulk of the domain is formed by ten alpha-helices forming five hairpin-like pairs and arranged into an incomplete toroid, the (alpha/alpha)5 fold. Additionally, two long and two short alpha-helices at the N-terminus of the domain wrap around the toroid. At the C-terminal end of the toroid there is one additional short alpha-helix. This domain is required for degradation of polysaccharides containing 1,4-beta-D-hexosaminyl and 1,3-beta-D-glucuronosyl or 1,3-alpha-L-iduronosyl linkages to disaccharides containing 4-deoxy-beta-D-gluc-4-enuronosyl groups. 361
51951 370291 pfam09094 DUF1925 Domain of unknown function (DUF1925). Members of this family, which are found in a set of prokaryotic transferases, adopt an immunoglobulin/albumin-binding domain-like fold, with a bundle of three alpha-helices. Their function is, as yet, unknown. 80
51952 401148 pfam09095 DUF1926 Domain of unknown function (DUF1926). Members of this family, which are found in a set of prokaryotic transferases, adopt a beta-sandwich fold, in which two layers of anti-parallel beta-sheets are arranged in a nearly parallel fashion. The exact function of this family is, as yet, unknown, however it has been proposed that they may play a role in transglycosylation reactions. 274
51953 286219 pfam09096 Phage-tail_2 Baseplate structural protein, domain 2. Members of this family adopt a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. They are structural components of the viral baseplate, predominantly found in the structural protein gp27. 173
51954 286220 pfam09097 Phage-tail_1 Baseplate structural protein, domain 1. Members of this family adopt a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. They are structural components of the viral baseplate, predominantly found in the structural protein gp27. 196
51955 401149 pfam09098 Dehyd-heme_bind Quinohemoprotein amine dehydrogenase A, alpha subunit, haem binding. Members of this family are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase. They have a predominantly alpha-helical structure and can be divided into two subdomains, each binding a haem C group via a conserved CXXCH motif. 164
51956 286222 pfam09099 Qn_am_d_aIII Quinohemoprotein amine dehydrogenase, alpha subunit domain III. Members of this family, which are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopt an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined. 82
51957 401150 pfam09100 Qn_am_d_aIV Quinohemoprotein amine dehydrogenase, alpha subunit domain IV. Members of this family, which are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopt an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined. 133
51958 286224 pfam09101 Exotox-A_bind Exotoxin A binding. Members of this family are found in Pseudomonas aeruginosa exotoxin A, and are responsible for binding of the toxin to the alpha-2-macroglobulin receptor, with subsequent internalisation into endosomes. The domain adopts a thirteen-strand antiparallel beta jelly roll topology, which belongs to the Concanavalin A-like lectins/glucanases fold superfamily. 274
51959 401151 pfam09102 Exotox-A_target Exotoxin A, targeting. Members of this family are found in Pseudomonas aeruginosa exotoxin A, and are responsible for transmembrane targeting of the toxin, as well as transmembrane translocation of the catalytic domain into the cytoplasmic compartment. A furin cleavage site is present within the domain: cleavage generates a 37 kDa carboxy-terminal fragment, which includes the enzymatic domain, which is then translocated into the cytoplasm. The domain adopts a helical structure, with six alpha-helices forming a bundle. 142
51960 401152 pfam09103 BRCA-2_OB1 BRCA2, oligonucleotide/oligosaccharide-binding, domain 1. Members of this family assume an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB1 has a shallow groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for weak single strand DNA binding. The domain also binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome. 120
51961 401153 pfam09104 BRCA-2_OB3 BRCA2, oligonucleotide/oligosaccharide-binding, domain 3. Members of this family assume an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB3 has a pronounced groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for strong ssDNA binding. 137
51962 370297 pfam09105 SelB-wing_1 Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding. 61
51963 401154 pfam09106 SelB-wing_2 Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding. 57
51964 401155 pfam09107 SelB-wing_3 Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding. 46
51965 401156 pfam09108 Xol-1_N Switch protein XOL-1, N-terminal. Members of this family, which are required for the formation of the active site of the sex-determining protein Xol-1, adopt a secondary structure consisting of five alpha helices and six antiparallel beta sheets, in a beta-alpha-beta-beta-beta-alpha-beta-alpha-alpha-alpha-beta arrangement. The fold of this family is similar to that found in ribosomal protein S5 domain 2-like. 160
51966 401157 pfam09109 Xol-1_GHMP-like Switch protein XOL-1, GHMP-like. Members of this family, which are required for the formation of the active site of the sex-determining protein Xol-1, adopt a secondary structure consisting of five alpha helices and seven antiparallel beta sheets, in a beta-alpha-beta-alpha-alpha-alpha-beta-beta-alpha-beta-beta-beta arrangement. The fold of this family is structurally similar to that found in the C-terminal domain of GHMP Kinase. 196
51967 401158 pfam09110 HAND HAND. The HAND domain adopts a secondary structure consisting of four alpha helices, three of which (H2, H3, H4) form an L-like configuration. Helix H2 runs antiparallel to helices H3 and H4, packing closely against helix H4, whilst helix H1 reposes in the concave surface formed by these three helices and runs perpendicular to them. The domain confers DNA and nucleosome binding properties to the protein. 110
51968 401159 pfam09111 SLIDE SLIDE. The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb (pfam00249) repeats or homeodomains. 114
51969 401160 pfam09112 N-glycanase_N Peptide-N-glycosidase F, N terminal. Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins. 147
51970 401161 pfam09113 N-glycanase_C Peptide-N-glycosidase F, C terminal. Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins. 137
51971 312591 pfam09114 MotA_activ Transcription factor MotA, activation domain. Members of this family of viral protein domains are implicated in transcriptional activation. They are almost completely alpha-helical, with five alpha-helices and a short, two-stranded, beta-ribbon. Four alpha helices (alpha1, alpha3, alpha4 and alpha5) are amphipathic and pack their hydrophobic surfaces around the central helix alpha2. 95
51972 401162 pfam09115 DNApol3-delta_C DNA polymerase III, delta subunit, C terminal. Members of this family, which are predominantly found in prokaryotic DNA polymerase III, assume an alpha helical structure, with a core of five alpha helices, and an additional small helix. They are essential for the formation of the polymerase clamp loader. 112
51973 401163 pfam09116 gp45-slide_C gp45 sliding clamp, C terminal. Members of this family are essential for the interaction of the gp45 sliding clamp with the corresponding polymerase. They adopt a DNA clamp fold, consisting of two alpha helices and two beta sheets - the fold is duplicated and has internal pseudo two-fold symmetry. 105
51974 370303 pfam09117 MiAMP1 MiAMP1. MiAMP1 is a highly basic protein from the nut kernel of Macadamia integrifolia which inhibits the growth of several microbial plant pathogens in vitro while having no effect on mammalian or plant cells. It consists of eight beta-strands which are arranged in two Greek key motifs. These Greek key motifs then associate to form a Greek key beta-barrel. 76
51975 401164 pfam09118 DUF1929 Domain of unknown function (DUF1929). Members of this family adopt a secondary structure consisting of a bundle of seven, mostly antiparallel, beta-strands surrounding a hydrophobic core. The 7 strands are arranged in 2 sheets, in a Greek-key topology. Their precise function has not, as yet, been defined, though they are mostly found in sugar-utilising enzymes, such as galactose oxidase. 91
51976 401165 pfam09119 SicP-binding SicP binding. Members of this family bind the chaperone SicP, which is required both to maintain the stability of SptP, as well as to ensure the eventual secretion of the protein. The domain is found in the Salmonella effector protein SptP, which interacts with SicP chaperone dimers mainly through four regions of its chaperone-binding domain. The structure of the SptP-SicP complex contains four molecules of SicP, aligned in a linear fashion and arranged in two sets of tightly bound homodimers that bind two SptP molecules. The SicP homodimers do not interact with each other, but are held together by a molecular interface formed between two SptP molecules. Each SptP molecule is wrapped around by three SicP chaperones (two chaperones from one homodimer and a third one from the opposite homodimer pair). 84
51977 401166 pfam09121 Tower Tower. Members of this family adopt a secondary structure consisting of a pair of long, antiparallel alpha-helices (the stem) that support a three-helix bundle (3HB) at their end. The 3HB contains a helix-turn-helix motif and is similar to the DNA binding domains of the bacterial site-specific recombinases, and of eukaryotic Myb and homeodomain transcription factors. The Tower domain has an important role in the tumor suppressor function of BRCA2, and is essential for appropriate binding of BRCA2 to DNA. 42
51978 401167 pfam09122 DUF1930 Domain of unknown function (DUF1930). Members of this family are found in 3-mercaptopyruvate sulfurtransferase, and have no known function. They adopt a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase. 67
51979 401168 pfam09123 DUF1931 Domain of unknown function (DUF1931). Members of this family, which are found in a set of hypothetical bacterial proteins, contain a core of six alpha-helices, where one central helix is surrounded by the other five. The exact function of this family has not, as yet, been determined. The known structure shows this domain contains two copies of the histone fold. 138
51980 401169 pfam09124 Endonuc-dimeris T4 recombination endonuclease VII, dimerization. Members of this family, which are predominantly found in Bacteriophage T4 recombination endonuclease VII, adopt a helical secondary structure, with three alpha helices oriented parallel to each other. They mediate dimerization of the protein, as well as binding to the DNA major groove. 54
51981 255195 pfam09125 COX2-transmemb Cytochrome C oxidase subunit II, transmembrane. Members of this family adopt a tertiary structure consisting of two antiparallel transmembrane helices, in a transmembrane helix hairpin fold. 38
51982 401170 pfam09126 NaeI Restriction endonuclease NaeI. Members of this family adopt a secondary structure consisting of nine alpha-helices, six 3-10 helices and 13 beta-strands. They bind two GCC-CGG recognition sequences to cleave DNA into blunt-ended products. 288
51983 401171 pfam09127 Leuk-A4-hydro_C Leukotriene A4 hydrolase, C-terminal. Members of this family adopt a structure consisting of two layers of parallel alpha-helices, five in the inner layer and four in the outer, arranged in an antiparallel manner, with perpendicular loops containing short helical segments on top. They are required for the formation of a deep cleft harbouring the catalytic Zn2+ site in Leukotriene A4 hydrolase. 112
51984 401172 pfam09128 RGS-like Regulator of G protein signalling-like domain. Members of this family adopt a structure consisting of twelve helices that fold into a compact domain that contains the overall structural scaffold observed in other RGS proteins and three additional helical elements that pack closely to it. Helices 1-9 comprise the RGS (pfam00615) fold, in which helices 4-7 form a classic antiparallel bundle adjacent to the other helices. Like other RGS structures, helices 7 and 8 span the length of the folded domain and form essentially one continuous helix with a kink in the middle. Helices 10-12 form an apparently stable C-terminal extension of the structural domain, and although other RGS proteins lack this structure, these elements are intimately associated with the rest of the structural framework by hydrophobic interactions. Members of the family bind to active G-alpha proteins, promoting GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways. 188
51985 401173 pfam09129 Chol_subst-bind Cholesterol oxidase, substrate-binding. The substrate-binding domain found in Cholesterol oxidase is composed of an eight-stranded mixed beta-pleated sheet and six alpha-helices. This domain is positioned over the isoalloxazine ring system of the FAD cofactor bound by FAD_binding_4 (PF:PF01565) and forms the roof of the active site cavity, allowing for catalysis of oxidation and isomerisation of cholesterol to cholest-4-en-3-one. 321
51986 401174 pfam09130 DUF1932 Domain of unknown function (DUF1932). This domain is found in a set of hypothetical prokaryotic proteins. Its exact function has not, as yet, been described. 70
51987 117687 pfam09131 Endotoxin_mid Bacillus thuringiensis delta-Endotoxin, middle domain. Members of this family adopt a structure consisting of three four-stranded beta-sheets, each with a Greek key fold, with internal pseudo threefold symmetry. Thus they act as a receptor binding beta-prism, binding to insect-specific receptors of gut epithelial cells. 206
51988 312601 pfam09132 BmKX BmKX. Members of this family assume a structure adopted by most short-chain scorpion toxins, consisting of a cysteine-stabilized alpha/beta scaffold consisting of a short 3-10-helix and a two-stranded antiparallel beta-sheet. They are predominantly found in short-chain scorpion toxins, and their biological method of action has not, as yet, been defined. 30
51989 401175 pfam09133 SANTA SANTA (SANT Associated). The SANTA domain (SANT Associated domain) is approximately 90 amino acids in length and is conserved in Eukaryota. It is sometimes found in association with the SANT domain (pfam00249, also known as Myb-like DNA-binding domain) implying a putative function in regulating chromatin remodelling. Sequence analysis has shown that the SANTA domain is likely to form four central beta-sheets with three flanking alpha-helices. Many conserved hydrophobic residues are present, implying a possible role in protein-protein interactions. 87
51990 401176 pfam09134 Invasin_D3 Invasin, domain 3. Members of this family adopt a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, arranged in a Greek-key topology. It forms part of the extracellular region of the protein, which can be expressed as a soluble protein (Inv497) that binds integrins and promotes subsequent uptake by cells when attached to bacteria. 98
51991 401177 pfam09135 Alb1 Alb1. Alb1 is a nuclear shuttling factor involved in ribosome biogenesis. 105
51992 401178 pfam09136 Glucodextran_B Glucodextranase, domain B. Members of this family adopt a structure consisting of seven/eight-strand antiparallel beta-sheets, in a Greek-key topology, similar to the immunoglobulin beta-sandwich fold. They act as cell wall anchors, where they interact with the S-layer present in the cell wall of Gram-positive bacteria by hydrophobic interactions. In glucodextranase, Domain B is buried in the S-layer, and a flexible linker located between domain B and the catalytic unit confers motion to the catalytic unit, which is capable of efficient hydrolysis of the substrates located close to the cell surface. 89
51993 401179 pfam09137 Glucodextran_N Glucodextranase, domain N. Members of this family, which are uniquely found in bacterial and archaeal glucoamylases and glucodextranases, adopt a structure consisting of 17 antiparallel beta-strands. These beta-strands are divided into two beta-sheets, and one of the beta-sheets is wrapped by an extended polypeptide, which appears to stabilize the domain. Members of this family are mainly concerned with catalytic activity, hydrolysing alpha-1,6-glucosidic linkages of dextran to release beta-D-glucose from the non-reducing end via an inverting reaction mechanism. 263
51994 370315 pfam09138 Urm1 Urm1 (Ubiquitin related modifier). Urm1 is a ubiquitin related protein that modifies proteins in the yeast ubiquitin-like pathway urmylation. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily. 96
51995 401180 pfam09139 Mmp37 Mitochondrial matrix Mmp37. Mmp37 is a mitochondrial matrix protein that functions in the translocation of proteins across the mitochondrial inner membrane. It has been shown that Mmp37 proteins possess the NTase fold but they have only one active site carboxylate and thus probably are not able to carry out an enzymatic reaction. These potentially non-active members of the NTase fold superfamily may bind ATP, hydrolysis of which is necessary for the translocation of proteins through the membrane. 322
51996 401181 pfam09140 MipZ ATPase MipZ. MipZ is an ATPase that forms a complex with the chromosome partitioning protein ParB near the chromosomal origin of replication. It is responsible for the temporal and spatial regulation of FtsZ ring formation. 262
51997 401182 pfam09141 Talin_middle Talin, middle domain. Members of this family adopt a structure consisting of five alpha helices that fold into a bundle. They contain a Vinculin binding site (VBS) composed of a hydrophobic surface spanning five turns of helix four. Activation of the VBS causes subsequent recruitment of Vinculin, which enables maturation of small integrin/talin complexes into more stable adhesions. Formation of the complex between VBS and Vinculin requires prior unfolding of this middle domain: once released from the talin hydrophobic core, the VBS helix is then available to induce the 'bundle conversion' conformational change within the vinculin head domain thereby displacing the intramolecular interaction with the vinculin tail, allowing vinculin to bind actin. 161
51998 370319 pfam09142 TruB_C tRNA Pseudouridine synthase II, C terminal. The C terminal domain of tRNA Pseudouridine synthase II adopts a PUA (pfam01472) fold, with a four-stranded mixed beta-sheet flanked by one alpha-helix on each side. It allows for binding of the enzyme to RNA, as well as stabilisation of the RNA molecule. 56
51999 401183 pfam09143 AvrPphF-ORF-2 AvrPphF-ORF-2. Members of this family of plant pathogenic proteins adopt an elongated structure somewhat reminiscent of a mushroom that can be divided into 'stalk' and 'head' subdomains. The stalk subdomain is composed of the N-terminal helix (alpha1) and beta strands beta3-beta4. An antiparallel beta sheet (beta5, beta7-beta8) forms the base of the head subdomain that interacts with the stalk. A pair of twisted antiparallel beta sheets (beta1 and beta6; beta2 and beta9/9') supported by alpha2 form the dome of the head. The head subdomain possesses weak structural similarity with the catalytic portion of a number of ADP-ribosyltransferase toxins. 175
52000 72561 pfam09144 YpM Yersinia pseudo-tuberculosis mitogen. Members of this family of Yersinia pseudo-tuberculosis mitogens adopt a sandwich structure consisting of nine strands in two beta sheets, in a jelly-roll topology. As with other super-antigens, they are able to excessively activate T cells by binding to the T cell receptor. 117
52001 401184 pfam09145 Ubiq-assoc Ubiquitin-associated. Ubiquitin associated domains contain approximately 40 residues and bind ubiquitin non-covalently. They adopt a secondary structure consisting of three alpha-helices, and have been identified in various modular proteins involved in protein trafficking, clathrin assembly/disassembly, DNA repair, proteasomal degradation, and cell cycle regulation. 44
52002 286257 pfam09147 DUF1933 Domain of unknown function (DUF1933). Members of this family are predominantly found in carbapenam synthetase, and are composed of two antiparallel six-stranded beta-sheets that form a sandwich, flanked on each side by two alpha-helices. Their exact function has not, as yet, been determined. 201
52003 401185 pfam09148 DUF1934 Domain of unknown function (DUF1934). Members of this family are found in a set of hypothetical bacterial proteins. Their precise function has not, as yet, been defined. 123
52004 401186 pfam09149 DUF1935 Domain of unknown function (DUF1935). Members of this family are found in various bacterial and eukaryotic hypothetical proteins, as well as in the cysteine protease calpain. Their exact function has not, as yet, been defined. 100
52005 401187 pfam09150 Carot_N Orange carotenoid protein, N-terminal. Members of this family adopt an alpha-helical structure consisting of two four-helix bundles. They are predominantly found in prokaryotic orange carotenoid protein, and carotenoid binding proteins. 149
52006 401188 pfam09151 DUF1936 Domain of unknown function (DUF1936). This domain is found in a set of hypothetical Archaeal proteins. Its exact function has not, as yet, been defined. It possesses a zinc ribbon fold. 34
52007 370325 pfam09152 DUF1937 Domain of unknown function (DUF1937). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described. 111
52008 401189 pfam09153 DUF1938 Domain of unknown function (DUF1938). Members of this family, which are predominantly found in the archaeal protein O6-alkylguanine-DNA alkyltransferase, adopt a secondary structure consisting of a three stranded antiparallel beta-sheet and three alpha helices. Their exact function has not, as yet, been defined, though it has been postulated that they confer thermostability to the archaeal protein. 90
52009 401190 pfam09154 DUF1939 Domain of unknown function (DUF1939). Members of this family, which are predominantly found in Archaeal amylase, adopt a secondary structure consisting of an eight-stranded antiparallel beta-sheet containing a Greek key motif. Their exact function has not, as yet, been determined. 57
52010 312612 pfam09155 DUF1940 Domain of unknown function (DUF1940). Members of this family adopt a secondary structure consisting of six alpha helices, in which four long helices (alpha1, alpha2, alpha5, alpha6) form a left-handed, antiparallel alpha helical bundle. The function of this family of Archaeal hypothetical proteins has not, as yet, been defined. 143
52011 401191 pfam09156 Anthrax-tox_M Anthrax toxin lethal factor, middle domain. Members of this family, which are predominantly found in anthrax toxin lethal factor, adopt a structure consisting of a core of antiparallel beta sheets and alpha helices. They form a long deep groove within the protein that anchors the 16-residue N-terminal tail of MAPKK-2 before cleavage. It has been noted that this domain resembles the ADP-ribosylating toxin from Bacillus cereus, but the active site has been modified to augment substrate recognition. 286
52012 401192 pfam09157 TruB-C_2 Pseudouridine synthase II TruB, C-terminal. Members of this family adopt a secondary structure consisting of a four-stranded beta sheet and one alpha helix. They are predominantly RNA-binding domains, mostly found in Pseudouridine synthase II TruB. 57
52013 286268 pfam09158 MotCF Bacteriophage T4 MotA, C-terminal. Members of this family adopt a compact alpha/beta structure comprising three alpha-helices and six beta-strands in the order: alpha1-beta1-beta2-beta3-beta4-alpha2-beta5-beta6-alpha3. The beta-strands form a single anti-parallel beta-sheet and the three alpha-helices pack side-by-side onto one surface of the beta-sheet. In this architecture, the domain's hydrophobic core is at the sheet-helix interface, and the second surface of the beta-sheet is completely exposed. The domain is a DNA-binding motif, with a consensus sequence containing nine base pairs (5'-TTTGCTTTA-3'), that appears to bind to various mot boxes, allowing access to the minor groove towards the 5'-end of this sequence and the major groove towards the 3'-end. 103
52014 401193 pfam09159 Ydc2-catalyt Mitochondrial resolvase Ydc2 / RNA splicing MRS1. Members of this family adopt a secondary structure consisting of two beta sheets and one alpha helix, arranged as a beta-alpha-beta motif. Each beta sheet has five strands, arranged in a 32145 order, with the second strand being antiparallel to the rest. Mitochondrial resolvase Ydc2 is capable of resolving Holliday junctions and cleaves DNA after 5'-CT-3' and 5'-TT-3' sequences. This family also contains the mitochondrial RNA-splicing protein MRS1 which is involved in the excision of group I introns. 251
52015 370330 pfam09160 FimH_man-bind FimH, mannose binding. Members of this family adopt a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. They are predominantly found in bacterial mannose-specific adhesins, since they are capable of binding to D-mannose. 144
52016 401194 pfam09162 Tap-RNA_bind Tap, RNA-binding. Members of this family adopt a structure consisting of an alpha+beta sandwich with an antiparallel beta-sheet, arranged in a 2(beta-alpha-beta) motif. They are mainly found in mRNA export factors, and mediate the sequence nonspecific nuclear export of cellular mRNAs as well as the sequence-specific export of retroviral mRNAs bearing the constitutive transport element. 83
52017 401195 pfam09163 Form-deh_trans Formate dehydrogenase N, transmembrane. Members of this family are predominantly found in the beta subunit of formate dehydrogenase, and consist of a single transmembrane helix. They act as a transmembrane anchor, and allow for conduction of electrons within the protein. 43
52018 401196 pfam09164 VitD-bind_III Vitamin D binding protein, domain III. Members of this family are predominantly found in Vitamin D binding protein, and adopt a multi-helical structure. They are required for formation of an actin 'clamp', allowing the protein to bind to actin. 65
52019 401197 pfam09165 Ubiq-Cytc-red_N Ubiquinol-cytochrome c reductase 8 kDa, N-terminal. Members of this family adopt a structure consisting of many antiparallel beta sheets, with few alpha helices, in a non-globular arrangement. They are required for proper functioning of the respiratory chain. 74
52020 401198 pfam09166 Biliv-reduc_cat Biliverdin reductase, catalytic. Members of this family adopt a structure consisting of four alpha helices and six beta sheets, in an alpha-beta-alpha-alpha-alpha-beta-beta-beta-beta-beta arrangement. They contain a catalytic active site, capable of reducing the gamma-methene bridge of the open tetrapyrrole, biliverdin IX alpha, to bilirubin with the concomitant oxidation of a NADH or NADPH cofactor. 113
52021 401199 pfam09167 DUF1942 Domain of unknown function (DUF1942). Members of this family of bacterial proteins assume a beta-sandwich structure consisting of two antiparallel beta-sheets similar to an immunoglobulin-like fold, with an additional small, antiparallel beta-sheet. The longer-stranded beta-sheet is made up of four antiparallel beta-strands. The shorter-stranded beta-sheet consists of five beta-strands, four of which form an antiparallel beta-sheet. The exact function of this family of proteins is unknown, though a putative role includes involvement in host-bacterial interactions involved in endocytosis or phagocytosis, possibly during bacterial internalisation. 123
52022 401200 pfam09168 PepX_N X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. Members of this family adopt a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand 5' of catalytic domain. The domain mediates dimerization of the protein, with two proline residues present in the domain being critical for interaction. 156
52023 401201 pfam09169 BRCA-2_helical BRCA2, helical. Members of this family adopt a helical structure, consisting of a four-helix cluster core (alpha 1, alpha 8, alpha 9, alpha 10) and two successive beta-hairpins (beta 1 to beta 4). An approx. 50-amino acid segment that contains four short helices (alpha 2 to alpha 4), meanders around the surface of the core structure. In BRCA2, the alpha 9 and alpha 10 helices pack with BRCA-2_OB1 (pfam09103) through van der Waals contacts involving hydrophobic and aromatic residues, and also through side-chain and backbone hydrogen bonds. The domain binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome. 187
52024 401202 pfam09170 STN1_2 CST, Suppressor of cdc thirteen homolog, complex subunit STN1. STN1 is a component of the CST complex, a complex that binds to single-stranded DNA and is required for protecting telomeres from DNA degradation. The CST complex binds single-stranded DNA with high affinity in a sequence-independent manner, while isolated subunits bind DNA with low affinity on their own. In addition to telomere protection, the CST complex probably has a more general role in DNA metabolism at non-telomeric sites. 176
52025 401203 pfam09171 AGOG N-glycosylase/DNA lyase. This domain is predominantly found in the Archaeal protein N-glycosylase/DNA lyase. 248
52026 401204 pfam09172 DUF1943 Domain of unknown function (DUF1943). Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined. 302
52027 401205 pfam09173 eIF2_C Initiation factor eIF2 gamma, C terminal. Members of this family, which are found in the initiation factors eIF2 and EF-Tu, adopt a structure consisting of a beta barrel with Greek key topology. They are required for formation of the ternary complex with GTP and initiator tRNA. 85
52028 401206 pfam09174 Maf1 Maf1 regulator. Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB. 174
52029 401207 pfam09175 DUF1944 Domain of unknown function (DUF1944). Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined. 167
52030 401208 pfam09176 Mpt_N Methylene-tetrahydromethanopterin dehydrogenase, N-terminal. Members of this family adopt an alpha-beta structure, with a core comprising three alpha/beta/alpha layers, in which each sheet contains four strands. They are predominantly found in prokaryotic methylene-tetrahydromethanopterin dehydrogenase, which catalyzes the dehydrogenation of methylene-tetrahydromethanopterin and the reversible dehydrogenation of methylene-H(4)F. 76
52031 401209 pfam09177 Syntaxin-6_N Syntaxin 6, N-terminal. Members of this family, which are found in the amino terminus of various SNARE proteins, adopt a structure consisting of an antiparallel three-helix bundle. Their exact function has not been determined, though it is known that they regulate the SNARE motif, as well as mediate various protein-protein interactions involved in membrane-transport. 91
52032 286287 pfam09178 DUF1945 Domain of unknown function (DUF1945). Members of this family, which are predominantly found in prokaryotic 4-alpha-glucanotransferase, adopt a structure composed of six antiparallel beta-strands, four of which form a beta-sheet and another two form a type I' beta-hairpin. The role of this family of domains has not, as yet, been defined. 50
52033 401210 pfam09179 TilS TilS substrate binding domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. 66
52034 401211 pfam09180 ProRS-C_1 Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif. 70
52035 401212 pfam09181 ProRS-C_2 Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif. 66
52036 401213 pfam09182 PuR_N Bacterial purine repressor, N-terminal. The N-terminal domain of the bacterial purine repressor PuR is a winged-helix domain, a subdivision of the HTH structural family. It consists of a canonical arrangement of secondary structures: a1-b1-a2-T-a3-b2-W-b3, where a2-T-a3 is the HTH motif, a3 is the recognition helix, and W is the wing. The domain allows for recognition of a conserved CGAA sequence in the centre of a DNA PurBox, resulting in binding to the major groove of DNA. 70
52037 401214 pfam09183 DUF1947 Domain of unknown function (DUF1947). Members of this family are found in a set of hypothetical Archaeal proteins. Their exact function has not, as yet, been defined. 63
52038 401215 pfam09184 PPP4R2 PPP4R2. PPP4R2 (protein phosphatase 4 core regulatory subunit R2) is the regulatory subunit of the histone H2A phosphatase complex. It has been shown to confer resistance to the anticancer drug cisplatin in yeast, and may confer resistance in higher eukaryotes. 287
52039 401216 pfam09185 DUF1948 Domain of unknown function (DUF1948). Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central alpha-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined. 140
52040 401217 pfam09186 DUF1949 Domain of unknown function (DUF1949). Members of this family pertain to a set of functionally uncharacterized hypothetical bacterial proteins. They adopt a ferredoxin-like fold, with a beta-alpha-beta-beta-alpha-beta arrangement. 56
52041 401218 pfam09187 RdDM_RDM1 RNA-directed DNA methylation 1. This family of plant proteins includes RDM1 from Arabidopsis, which is a component of the RNA-directed DNA methylation (RdDM) effector complex and may have a role in linking siRNA production with pre-existing or de novo cytosine methylation. As part of the DDR complex with two other RdDM components, it has been shown to facilitate association of PolV to chromatin. 119
52042 286297 pfam09188 DUF1951 Domain of unknown function (DUF1951). Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with a buried central helix. Their function has not, as yet, been determined. 137
52043 401219 pfam09189 DUF1952 Domain of unknown function (DUF1952). Members of this family are found in various Thermus thermophilus proteins. Their exact function has not, as yet, been determined. 73
52044 401220 pfam09190 DALR_2 DALR domain. This DALR domain is found in cysteinyl-tRNA-synthetases. 63
52045 401221 pfam09191 CD4-extracel CD4, extracellular. Members of this family adopt an immunoglobulin-like beta-sandwich, with seven strands in 2 beta sheets, in a Greek key topology. They are predominantly found in the extracellular portion of CD4 proteins, where they enable interaction with major histocompatibility complex class II antigens. 105
52046 370352 pfam09192 Act-Frag_cataly Actin-fragmin kinase, catalytic. Members of this family assume a secondary structure consisting of eight beta strands and 11 alpha-helices, organized in two lobes. They are predominantly found in actin-fragmin kinase, where they act as a catalytic domain that mediates the phosphorylation of actin. 278
52047 401222 pfam09193 CholecysA-Rec_N Cholecystokinin A receptor, N-terminal. Members of this family are found in the extracellular region of the cholecystokinin A receptor, where they adopt a tertiary structure consisting of a few helical turns and a disulphide-crosslinked loop. They are required for interaction of the cholecystokinin A receptor with its corresponding hormonal ligand. 47
52048 370354 pfam09194 Endonuc-BsobI Restriction endonuclease BsobI. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence CYCGRG (where Y = T/C, and R = A/G) and cleave after C-1. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. 314
52049 370355 pfam09195 Endonuc-BglII Restriction endonuclease BglII. Members of this family are predominantly found in prokaryotic restriction endonuclease BglII, and adopt a structure consisting of an alpha/beta core containing a six-stranded beta-sheet surrounded by five alpha-helices, two of which are involved in homodimerization of the endonuclease. They recognize the double-stranded DNA sequence AGATCT and cleave after A-1, resulting in specific double-stranded fragments with terminal 5'-phosphates. 161
52050 286305 pfam09196 DUF1953 Domain of unknown function (DUF1953). This domain is found in the Archaeal protein maltooligosyl trehalose synthase produced by Sulfolobus spp. Its function has not, as yet, been defined. 63
52051 401223 pfam09197 Rap1-DNA-bind Rap1, DNA-binding. Members of this family, which are predominantly found in the yeast protein rap1, assume a secondary structure consisting of a three-helix bundle and an N-terminal arm. They contain an Arg-Asp-Arg-Lys sequence that interacts with an ACACC region in the 3' region of the DNA-binding site. 108
52052 72614 pfam09198 T4-Gluco-transf Bacteriophage T4 beta-glucosyltransferase. Members of this family are DNA-modifying enzymes encoded by bacteriophage T4 that transfer glucose from uridine diphosphoglucose to 5-hydroxymethyl cytosine bases of phage T4 DNA. 38
52053 401224 pfam09199 SSL_OB Staphylococcal superantigen-like OB-fold domain. This OB-fold domain folds into a five-stranded beta-barrel. Members of this family are found in various staphylococcal toxins described as staphylococcal superantigen-like (SSL) proteins that are related to the staphylococcal enterotoxins (SEs) or superantigens. These SSL proteins, of which 11 have so far been characterized, have a typical SE tertiary structure consisting of a distinct oligonucleotide/oligosaccharide binding (OB-fold), this domain, linked to a beta-grasp domain, family Stap_Strp_tox_C, pfam02876. SSLs do not bind to T-cell receptors or major histocompatibility complex class II molecules and do not stimulate T cells. SSLs target components of innate immunity, such as complement, Fc receptors, and myeloid cells. SSL protein 7 (SSL7) is the best characterized of the SSLs and binds complement factor C5 and IgA with high affinity and inhibits the end stage of complement activation and IgA binding to FcalphaR. 84
52054 312644 pfam09200 Monellin Monellin. Monellin, a protein produced by the West African plant Dioscoreophyllum cumminsii, is approximately 70,000 times sweeter than sucrose on a molar basis. The protein adopts an alpha-beta structure, with a cystatin-like fold, where each helix packs against a coiled antiparallel beta-sheet. 40
52055 370358 pfam09201 SRX SRX, signal recognition particle receptor alpha subunit. Members of this family, which are predominantly found in eukaryotic signal recognition particle receptor alpha, consist of a central six-stranded anti-parallel beta-sheet sandwiched by helix alpha1 on one side and helices alpha2-alpha4 on the other. They interact with the small GTPase SR-beta, forming a complex that matches a class of small G protein-effector complexes, including Rap-Raf, Ras-PI3K(gamma), Ras-RalGDS, and Arl2-PDE(delta). Structurally the alpha subunit is SNARE-like. 148
52056 401225 pfam09202 Rio2_N Rio2, N-terminal. Members of this family are found in Rio2, and are structurally homologous to the winged helix (wHTH) domain. They adopt a structure consisting of four alpha helices followed by two beta strands and a fifth alpha helix. The domain confers DNA binding properties to the protein, as per other winged helix domains. 82
52057 401226 pfam09203 MspA MspA. MspA is a membrane porin produced by Mycobacteria, allowing hydrophilic nutrients to enter the bacterium. The protein forms a tightly interconnected octamer with eightfold rotation symmetry that resembles a goblet and contains a central channel. Each subunit fold contains a beta-sandwich of Ig-like topology and a beta-ribbon arm that forms an oligomeric transmembrane barrel. 175
52058 401227 pfam09204 Colicin_immun Bacterial self-protective colicin-like immunity. Colicin D, which is synthesized by various prokaryotes, adopts an antiparallel four helical bundle fold: the helices are tightly packed, forming a compact cylindrical molecule. The protein specifically cleaves the anticodon loop of all four tRNA-Arg isoacceptors, thereby inactivating prokaryotic protein synthesis and leading to cell death. This family also contains immunity proteins to klebicins and microcins. Many bacteria produce proteins that destroy their competitors. Colicin D is one such. The immunity proteins are expressed on the same operon as their cognate bacteriocins and protect the expressing bacterium from the effects of its own bacteriocin. 84
52059 401228 pfam09205 DUF1955 Domain of unknown function (DUF1955). Members of this family are found in hypothetical proteins synthesized by the Archaeal organism Sulfolobus. Their exact function has not, as yet, been determined. 159
52060 401229 pfam09206 ArabFuran-catal Alpha-L-arabinofuranosidase B, catalytic. Members of this family, which are present in fungal alpha-L-arabinofuranosidase B, adopt a beta-sandwich fold similar to that of Concanavalin A-like lectins/glucanase. The beta-sandwich fold consists of two anti-parallel beta-sheets with seven and six strands, respectively. In addition, there are four helices outside of the beta-strands. The beta-sandwich strands are closely packed and curved with a jelly roll topology, creating a small catalytic pocket. The domain catalyzes the hydrolysis of alpha-1,2-, alpha-1,3- and alpha-1,5-L-arabinofuranosidic bonds in L-arabinose-containing hemicelluloses such as arabinoxylan and L-arabinan. 315
52061 312650 pfam09207 Yeast-kill-tox Yeast killer toxin. Members of this family, which are produced by Williopsis fungi, adopt a secondary structure consisting of eight strands in two beta sheets, in a Greek-key topology. 87
52062 401230 pfam09208 Endonuc-MspI Restriction endonuclease MspI. Members of this family of prokaryotic restriction endonucleases recognize the palindromic tetranucleotide sequence 5'-CCGG and cleave between the first and second nucleotides, leaving 2 base 5' overhangs. They fold into an alpha/beta architecture, with a five-stranded mixed beta-sheet sandwiched on both sides by alpha-helices. 258
52063 401231 pfam09209 DUF1956 Domain of unknown function (DUF1956). Members of this family are found in various prokaryotic transcriptional regulator proteins. Their exact function has not, as yet, been identified. 124
52064 401232 pfam09210 DUF1957 Domain of unknown function (DUF1957). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been defined. 98
52065 401233 pfam09211 DUF1958 Domain of unknown function (DUF1958). Members of this functionally uncharacterized family are found in prokaryotic penicillin-binding protein 4. 63
52066 117765 pfam09212 CBM27 Carbohydrate binding module 27. Members of this family are carbohydrate binding modules that bind to beta-1,4-manno-oligosaccharides, carob galactomannan, and konjac glucomannan, but not to cellulose (insoluble and soluble) or soluble birchwood xylan. They adopt a beta sandwich structure comprising 13 beta strands with a single, small alpha-helix and a single metal atom. 173
52067 117766 pfam09213 M3 M3. Members of this family of viral chemokine binding proteins adopt a structure consisting of two different beta-sandwich domains of partial topological similarity to immunoglobulin-like folds. They bind with the CC-chemokine MCP-1, acting as cytokine decoy receptors. 367
52068 401234 pfam09214 Prd1-P2 Bacteriophage Prd1, adsorption protein P2. Members of this family form a set of bacteriophage adsorption proteins, composed mainly of beta-strands whose complicated topology forms an elongated seahorse-shaped molecule with a distinct head, containing a pseudo-beta propeller structure with approximate 6-fold symmetry, and tail. They are required for the attachment of the phage to the host conjugative DNA transfer complex. This is a poorly understood large transmembrane complex of unknown architecture, with at least 11 different proteins. 554
52069 401235 pfam09215 Phage-Gp8 Bacteriophage T4, Gp8. Members of this family of viral baseplate structural proteins adopt a structure consisting of a three-layer beta-sandwich with two finger-like loops containing an alpha-helix at the opposite sides of the sandwich. The two peripheral, five-stranded, antiparallel beta-sheets are stacked against the middle, four-stranded, antiparallel beta-sheet. Attachment of this family of proteins to the baseplate during assembly creates a binding site for subsequent attachment of Gp6. 337
52070 401236 pfam09216 Pfg27 Pfg27. Members of this family are essential for gametocytogenesis in Plasmodium falciparum. They contain a fold composed of two pseudo dyad-related repeats of the helix-turn-helix motif, serving as a platform for RNA and Src homology-3 (SH3) binding. 176
52071 401237 pfam09217 EcoRII-N Restriction endonuclease EcoRII, N-terminal. The N-terminal effector-binding domain of the Restriction Endonuclease EcoRII has a DNA recognition fold, allowing for binding to 5'-CCWGG sequences. It assumes a structure composed of an eight-stranded beta-sheet with the strands in the order of b2, b5, b4, b3, b7, b6, b1 and b8. They are mostly antiparallel to each other except that b3 is parallel to b7. Alternatively, it may also be viewed as consisting of two mini beta-sheets of four antiparallel beta-strands, sheet I from beta-strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed beta-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not. 148
52072 401238 pfam09218 DUF1959 Domain of unknown function (DUF1959). This domain is found in a set of uncharacterized Archaeal hypothetical proteins. Its function has not, as yet, been described. 116
52073 401239 pfam09220 LA-virus_coat L-A virus, major coat protein. Members of this family form the major coat protein of the Saccharomyces cerevisiae L-A virus. 439
52074 401240 pfam09221 Bacteriocin_IId Bacteriocin class IId cyclical uberolysin-like. Members of this family are membrane-interacting peptides produced by Firmicutes that display a broad anti-microbial spectrum against Gram-positive and Gram-negative bacteria. They adopt a helical structure, with four or five alpha helices forming a Saposin-like fold. The structure has been found to be cyclical. It should be pointed out that one reference implies that both circularin A and gassericin A are class V or IIc-type bacteriocins; however, we find that these two proteins fall into different Pfam families, this one and BacteriocIIc_cy, pfam12173. 67
52075 370370 pfam09222 Fim-adh_lectin Fimbrial adhesin F17-AG, lectin domain. Members of this family are carbohydrate-specific lectin domains found in bacterial fimbrial adhesins. They adopt a compact, elongated structure consisting of a beta-sandwich with two major sheets: one consisting of five long strands in mixed orientations, and a front sheet with four antiparallel strands, forming an immunoglobulin-like fold. 171
52076 401241 pfam09223 ZinT ZinT (YodA) periplasmic lipocalin-like zinc-recruitment. ZinT plays a critical role in recruiting periplasmic zinc to the bacterial zinc-uptake complex ZnuABC, consisting of families pfam01297, pfam00950, pfam00005, regulated by the transcription-regulator FUR, pfam01475. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA (TroA), pfam01297. Members of this family of prokaryotic domains were first identified as part of the response of bacteria to a challenge with the toxic heavy metal cadmium. They are able to bind to cadmium, and ensure its subsequent elimination. 181
52077 401242 pfam09224 DUF1961 Domain of unknown function (DUF1961). Members of this family are found in a set of hypothetical bacterial proteins. Their exact function has not, as yet, been determined. 214
52078 370372 pfam09225 Endonuc-PvuII Restriction endonuclease PvuII. Members of this family are predominantly found in prokaryotic restriction endonuclease PvuII. They recognize the double-stranded DNA sequence 5'-CAGCTG-3' and cleave after G-3, resulting in specific double-stranded fragments with terminal 5'-phosphates. 154
52079 312659 pfam09226 Endonuc-HincII Restriction endonuclease HincII. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence 5'-GTYRAC-3' and cleave after Y-3. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. 247
52080 286330 pfam09227 DUF1962 Domain of unknown function (DUF1962). Members of this family of fungal domains are functionally uncharacterized. 64
52081 401243 pfam09228 Prok-TraM Prokaryotic Transcriptional repressor TraM. Members of this family of transcriptional repressors adopt a T-shaped structure, with a core composed of two antiparallel alpha-helices. These proteins can be divided into two parts, a 'globular head' and an 'elongated tail', and they negatively regulate conjugation and the expression of tra genes by antagonising traR/AAI-dependent activation. 102
52082 401244 pfam09229 Aha1_N Activator of Hsp90 ATPase, N-terminal. Members of this family, which are predominantly found in the protein 'Activator of Hsp90 ATPase' adopt a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity. 130
52083 401245 pfam09230 DFF40 DNA fragmentation factor 40 kDa. Members of this family of eukaryotic apoptotic proteins induce DNA fragmentation and chromatin condensation during apoptosis. 225
52084 286334 pfam09231 RDV-p3 Rice dwarf virus p3. Members of this family are core structural proteins found in the double-stranded RNA virus Phytoreovirus. They are large proteins without apparent domain division, with a number of all-alpha regions and one all beta domain near the C-terminal end. 963
52085 370376 pfam09232 Caenor_Her-1 Caenorhabditis elegans Her-1. Her-1 adopts an all-helical structure with two subdomains: residues 19-80 comprise a left-handed three-helix bundle with an overhand connection between the second and third helices, whilst residues 81-164 comprise a left-handed anti-parallel four-helix bundle in which the first helix consists of four consecutive turns of 3-10-helix. Fourteen Cys are conserved in all known HER-1 sequences and form seven disulfide bonds. The protein dictates male development in Caenorhabditis elegans, probably by playing a direct role in cell signaling during C. elegans sex determination. It also inhibits the function of tra-2a. 131
52086 401246 pfam09233 Endonuc-EcoRV Restriction endonuclease EcoRV. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence 5'-GATATC-3' and cleave after T-3. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. 240
52087 401247 pfam09234 DUF1963 Domain of unknown function (DUF1963). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described. 178
52088 401248 pfam09235 Ste50p-SAM Ste50p, sterile alpha motif. The fungal Ste50p SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerization and heterodimerization (and in some cases oligomerization) of the protein. 75
52089 401249 pfam09236 AHSP Alpha-haemoglobin stabilizing protein. Alpha-haemoglobin stabilizing protein (AHSP) acts as a molecular chaperone for free alpha-haemoglobin, preventing the harmful aggregation of alpha-haemoglobin during normal erythroid cell development: it specifically protects free alpha-haemoglobin from precipitation. AHSP adopts a helical secondary structure consisting of an elongated antiparallel three alpha-helix bundle. 87
52090 150045 pfam09237 GAGA GAGA factor. Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognizes the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognizes the A in the fourth position of the consensus sequence. 54
52091 401250 pfam09238 IL4Ra_N Interleukin-4 receptor alpha chain, N-terminal. Members of this family are related in overall topology to fibronectin type III modules and fold into a sandwich comprising seven antiparallel beta strands arranged in a three-strand and a four-strand beta-pleated sheet. They are required for binding of interleukin-4 to the receptor alpha chain, which is a crucial event for the generation of a Th2-dominated early immune response. 90
52092 401251 pfam09239 Topo-VIb_trans Topoisomerase VI B subunit, transducer. Members of this family adopt a structure consisting of a four-stranded beta-sheet backed by three alpha-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme. 157
52093 401252 pfam09240 IL6Ra-bind Interleukin-6 receptor alpha chain, binding. Members of this family adopt a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, in a Greek-key topology. They are required for binding to the cytokine Interleukin-6. 96
52094 286342 pfam09241 Herp-Cyclin Herpesviridae viral cyclin. Members of this family of viral cyclins adopt a helical structure consisting of five alpha-helices, with one helix surrounded by the others. They specifically activate CDK6 of host cells to a very high degree. 106
52095 401253 pfam09242 FCSD-flav_bind Flavocytochrome c sulphide dehydrogenase, flavin-binding. Members of this family adopt a structure consisting of a beta(3,4)-alpha(3) core, and an alpha+beta sandwich. They are required for binding to flavin, and subsequent electron transfer. 68
52096 401254 pfam09243 Rsm22 Mitochondrial small ribosomal subunit Rsm22. Rsm22 has been identified as a mitochondrial small ribosomal subunit and is a methyltransferase. In Schizosaccharomyces pombe, Rsm22 is tandemly fused to Cox11 (a factor required for copper insertion into cytochrome oxidase) and the two proteins are proteolytically cleaved after import into the mitochondria. 275
52097 401255 pfam09244 DUF1964 Domain of unknown function (DUF1964). Members of this family of bacterial domains adopt a beta-sandwich fold, with Greek-key topology. They are C-terminal to the catalytic sucrose phosphorylase beta/alpha barrel domain, and are functionally uncharacterized. 67
52098 312671 pfam09245 MA-Mit Mycoplasma arthritidis-derived mitogen. Mycoplasma arthritidis-derived mitogen (MA-Mit) adopts a completely alpha-helical structure consisting of ten alpha helices. It is a superantigen that can activate large fractions of T cells bearing particular TCR V-beta elements. Two MA-Mit molecules form an asymmetric dimer and cross-link two MHC antigens to form a dimerized MA-Mit-MHC complex. 213
52099 204178 pfam09246 PHAT PHAT. The PHAT (pseudo-HEAT analogous topology) domain assumes a structure consisting of a layer of three parallel helices packed against a layer of two antiparallel helices, into a cylindrical shaped five-helix bundle. It is found in the RNA-binding protein Smaug, where it is essential for high-affinity RNA binding. 108
52100 370383 pfam09247 TBP-binding TATA box-binding protein binding. Members of this family adopt a structure consisting of three alpha helices and a beta-hairpin. They bind to TATA box-binding protein (TBP), inhibiting TBP interaction with the TATA element, thereby resulting in shutting down of gene transcription. 58
52101 401256 pfam09248 DUF1965 Domain of unknown function (DUF1965). Members of this family of fungal domains adopt a structure that consists of an alpha/beta motif. Their exact function has not, as yet, been determined. 69
52102 401257 pfam09249 tRNA_NucTransf2 tRNA nucleotidyltransferase, second domain. Members of this family adopt a structure consisting of a five helical bundle core. They are predominantly found in Archaeal tRNA nucleotidyltransferase, following the catalytic nucleotidyltransferase domain. 111
52103 401258 pfam09250 Prim-Pol Bifunctional DNA primase/polymerase, N-terminal. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities. 158
52104 312676 pfam09251 PhageP22-tail Salmonella phage P22 tail-spike. Members of this family of viral domains adopt a structure consisting of a single-stranded right-handed beta-helix, which in turn is made of parallel beta-strands and short turns. They are required for recognition of the O-antigenic repeating units of the cell surface, and for subsequent infection of the bacterial cell. 550
52105 401259 pfam09252 Feld-I_B Allergen Fel d I-B chain. Members of this family of cat allergens adopt a helical structure consisting of eight alpha helices, in a Uteroglobin-like fold. They are one of the most important causes of allergic asthma worldwide. 67
52106 401260 pfam09253 Ole-e-6 Pollen allergen ole e 6. Members of this family consist of two nearly antiparallel alpha-helices, that are connected by a short loop and followed by a long, unstructured C-terminal tail. They are highly allergenic, primarily mediating olive allergy. 39
52107 401261 pfam09254 Endonuc-FokI_C Restriction endonuclease FokI, C terminal. Members of this family are predominantly found in prokaryotic restriction endonuclease FokI, and adopt a structure consisting of an alpha/beta/alpha core containing a five-stranded beta-sheet. They recognize the double-stranded DNA sequence 5'-GGATG-3' and cleave DNA phosphodiester groups 9 base pairs away on this strand and 13 base pairs away on the complementary strand. 188
52108 286355 pfam09255 Antig_Caf1 Caf1 Capsule antigen. Members of this family are predominantly found in the F1 capsule antigen Caf1 synthesized by Yersinia bacteria. They adopt a structure consisting of seven strands arranged in two beta-sheets, in a Greek-key topology, and mediate targeting of the bacterium to sites of infection. 136
52109 401262 pfam09256 BaffR-Tall_bind BAFF-R, TALL-1 binding. Members of this family, which are predominantly found in the tumor necrosis factor receptor superfamily member 13c, BAFF-R, are required for binding to tumor necrosis factor ligand TALL-1. 28
52110 401263 pfam09257 BCMA-Tall_bind BCMA, TALL-1 binding. Members of this family, which are predominantly found in the tumor necrosis factor receptor superfamily member 17, BCMA, are required for binding to tumor necrosis factor ligand TALL-1. 37
52111 401264 pfam09258 Glyco_transf_64 Glycosyl transferase family 64 domain. Members of this family catalyze the transfer reaction of N-acetylglucosamine and N-acetylgalactosamine from the respective UDP-sugars to the non-reducing end of [glucuronic acid]beta 1-3[galactose]beta 1-O-naphthalenemethanol, an acceptor substrate analog of the natural common linker of various glycosylaminoglycans. They are also required for the biosynthesis of heparan-sulphate. 240
52112 401265 pfam09259 Fve Fungal immunomodulatory protein Fve. Fve is a major fruiting body protein from Flammulina velutipes, a mushroom possessing immunomodulatory activity. It stimulates lymphocyte mitogenesis, suppresses systemic anaphylaxis reactions and oedema, enhances transcription of IL-2, IFN-gamma and TNF-alpha, and haemagglutinates red blood cells. It appears to be a lectin with specificity for complex cell-surface carbohydrates. Fve adopts a tertiary structure consisting of an immunoglobulin-like beta-sandwich, with seven strands arranged in two beta sheets, in a Greek-key topology. It forms a non-covalently linked homodimer containing no Cys, His or Met residues; dimerization occurs by 3-D domain swapping of the N-terminal helices and is stabilized predominantly by hydrophobic interactions. 111
52113 370390 pfam09260 DUF1966 Domain of unknown function (DUF1966). This domain is found in various fungal alpha-amylase proteins. Its exact function has not, as yet, been defined. 90
52114 401266 pfam09261 Alpha-mann_mid Alpha mannosidase middle domain. Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase. 98
52115 401267 pfam09262 PEX-1N Peroxisome biogenesis factor 1, N-terminal. Members of this family adopt a double psi beta-barrel fold, similar in structure to the Cdc48 N-terminal domain. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation. 77
52116 401268 pfam09263 PEX-2N Peroxisome biogenesis factor 1, N-terminal. Members of this family adopt a Cdc48 domain 2-like fold, with a beta-alpha-beta(3) arrangement. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation. 83
52117 401269 pfam09264 Sial-lect-inser Vibrio cholerae sialidase, lectin insertion. Members of this family are predominantly found in Vibrio cholerae sialidase, and adopt a beta sandwich structure consisting of 12-14 strands arranged in two beta-sheets. They bind to lectins with high affinity helping to target the protein to sialic acid-rich environments, thereby enhancing the catalytic efficiency of the enzyme. 198
52118 401270 pfam09265 Cytokin-bind Cytokinin dehydrogenase 1, FAD and cytokinin binding. Members of this family adopt an alpha+beta sandwich structure with an antiparallel beta-sheet, in a ferredoxin-like fold. They are predominantly found in plant cytokinin dehydrogenase 1, where they are capable of binding both FAD and cytokinin substrates. The substrate displays a 'plug-into-socket' binding mode that seals the catalytic site and precisely positions the carbon atom undergoing oxidation in close contact with the reactive locus of the flavin. 278
52119 286365 pfam09266 VirDNA-topo-I_N Viral DNA topoisomerase I, N-terminal. Members of this family are predominantly found in viral DNA topoisomerase, and assume a beta(2)-alpha-beta-alpha-beta(2) fold, with a left-handed crossover between strands beta2 and beta3. 58
52120 370396 pfam09267 Dict-STAT-coil Dictyostelium STAT, coiled coil. Members of this family are found in Dictyostelium STAT proteins and adopt a structure consisting of four long alpha-helices, folded into a coiled coil. They are responsible for nuclear export of the protein. 114
52121 401271 pfam09268 Clathrin-link Clathrin, heavy-chain linker. Members of this family adopt a structure consisting of an alpha-alpha superhelix. They are predominantly found in clathrin, where they act as a heavy-chain linker domain. 24
52122 401272 pfam09269 DUF1967 Domain of unknown function (DUF1967). Members of this family contain a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. They are predominantly found in the bacterial GTP-binding protein Obg, and are still functionally uncharacterized. 67
52123 401273 pfam09270 BTD Beta-trefoil DNA-binding domain. Members of this family of DNA binding domains adopt a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. In the DNA-binding protein LAG-1, it also is the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and co-repressors (SMRT/N-Cor and CIR). 123
52124 401274 pfam09271 LAG1-DNAbind LAG1, DNA binding. Members of this family are found in various eukaryotic hypothetical proteins and in the DNA-binding protein LAG-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology, and allow for DNA binding. This domain is also known as RHR-N (Rel-homology region) as it is related to Rel domain proteins. 135
52125 401275 pfam09272 Hepsin-SRCR Hepsin, SRCR. Members of this family form an extracellular domain of the serine protease hepsin. They are formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate. 110
52126 401276 pfam09273 Rubis-subs-bind Rubisco LSMT substrate-binding. Members of this family adopt a multihelical structure, with an irregular array of long and short alpha-helices. They allow binding of the protein to substrate, such as the N-terminal tails of histones H3 and H4 and the large subunit of the Rubisco holoenzyme complex. 121
52127 117819 pfam09274 ParG ParG. Members of this family of plasmid partition proteins adopt a ribbon-helix-helix fold, with a core of four alpha-helices. They are an essential component of the DNA partition complex of the multidrug resistance plasmid TP228. 76
52128 286372 pfam09275 Pertus-S4-tox Pertussis toxin S4 subunit. Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology. 110
52129 286373 pfam09276 Pertus-S5-tox Pertussis toxin S5 subunit. Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology. 97
52130 401277 pfam09277 Erythro-docking Erythronolide synthase, docking. Members of this family of docking domains are found in prokaryotic erythronolide synthase. They adopt a structure consisting of a bundle of four alpha-helices, and mediate homodimerization of the protein, stabilizing the resulting complex. 58
52131 401278 pfam09278 MerR-DNA-bind MerR, DNA binding. Members of this family of DNA-binding domains are predominantly found in the prokaryotic transcriptional regulator MerR. They adopt a structure consisting of a core of three alpha helices, with an architecture that is similar to that of the 'winged helix' fold. 65
52132 401279 pfam09279 EF-hand_like Phosphoinositide-specific phospholipase C, efhand-like. Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme. 85
52133 401280 pfam09280 XPC-binding XPC-binding domain. Members of this family adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair. 57
52134 370405 pfam09281 Taq-exonuc Taq polymerase, exonuclease. Members of this family are found in prokaryotic Taq DNA polymerase, where they assume a ribonuclease H-like motif. The domain confers 5'-3' exonuclease activity to the polymerase. 129
52135 401281 pfam09282 Mago-bind Mago binding. Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions. 27
52136 401282 pfam09284 RhgB_N Rhamnogalacturonan lyase B, N-terminal. Members of this family are found in fungi, bacteria and wood-eating arthropods. The domain is found at the N-terminus of rhamnogalacturonase B, a member of the polysaccharide lyase family 4. The domain adopts a structure consisting of a beta super-sandwich, with eighteen strands in two beta-sheets. The three domains of the whole protein rhamnogalacturonan lyase (RGL4) are involved in the degradation of rhamnogalacturonan-I, RG-I, an important pectic plant cell-wall polysaccharide. The active-site residues are a lysine at position 169 in UniProtKB:Q00019 and a histidine at 229; Lys169 is likely to be a proton abstractor and His229 a proton donor in the mechanism. The substrate is a disaccharide, and RGL4, in contrast to other rhamnogalacturonan hydrolases, cleaves the alpha-1,4 linkages of RG-I between Rha and GalUA through a beta-elimination resulting in a double bond in the nonreducing GalUA residue, and is thus classified as a polysaccharide lyase (PL). 251
52137 401283 pfam09285 Elong-fact-P_C Elongation factor P, C-terminal. Members of this family of nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology. 56
52138 401284 pfam09286 Pro-kuma_activ Pro-kumamolisin, activation domain. Members of this family are found in various subtilase propeptides, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptide. 142
52139 401285 pfam09287 CEP1-DNA_bind CEP-1, DNA binding. Members of this family of DNA-binding domains are found in the transcription factor CEP-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology. 198
52140 117832 pfam09288 UBA_3 Fungal ubiquitin-associated domain. Members of this family of ubiquitin binding domains adopt a structure consisting of a three alpha-helix bundle. They are predominantly found in fungal ubiquitin-protein ligases. 55
52141 401286 pfam09289 FOLN Follistatin/Osteonectin-like EGF domain. Members of this family are predominantly found in osteonectin and follistatin and adopt an EGF-like fold. 22
52142 401287 pfam09290 AcetDehyd-dimer Prokaryotic acetaldehyde dehydrogenase, dimerization. Members of this family are found in prokaryotic acetaldehyde dehydrogenase (acylating), and adopt a structure consisting of an alpha-beta-alpha-beta(3) core. They mediate dimerization of the protein. 138
52143 401288 pfam09291 DUF1968 Domain of unknown function (DUF1968). Members of this family are found in mammalian T-cell antigen receptor, and adopt an immunoglobulin-like beta-sandwich fold, with seven strands in two beta-sheets in a Greek-key topology. Their exact function has not, as yet, been determined. 80
52144 401289 pfam09292 Neil1-DNA_bind Endonuclease VIII-like 1, DNA bind. Members of this family are predominantly found in Endonuclease VIII-like 1 and adopt a glucocorticoid receptor-like fold. They allow for DNA binding. 39
52145 401290 pfam09293 RNaseH_C T4 RNase H, C terminal. Members of this family are found in T4 RNaseH ribonuclease, and adopt a SAM domain-like fold, consisting of a bundle of four/five helices. These residues may have a role in providing a docking site for other proteins or enzymes in the replication fork. 124
52146 401291 pfam09294 Interfer-bind Interferon-alpha/beta receptor, fibronectin type III. Members of this family adopt a secondary structure consisting of seven beta-strands arranged in an immunoglobulin-like beta-sandwich, in a Greek-key topology. They are required for binding to interferon-alpha. 104
52147 401292 pfam09295 ChAPs ChAPs (Chs5p-Arf1p-binding proteins). ChAPs (Chs5p-Arf1p-binding proteins) are required for the export of specialized cargo from the Golgi. They physically interact with Chs3, Chs5 and the small GTPase Arf1, and they also form interactions with each other. 395
52148 401293 pfam09296 NUDIX-like NADH pyrophosphatase-like rudimentary NUDIX domain. The N-terminal domain in NADH pyrophosphatase, which has a rudimentary Nudix fold according to SCOP. 96
52149 401294 pfam09297 zf-NADH-PPase NADH pyrophosphatase zinc ribbon domain. This domain is found in between two duplicated NUDIX domains. It has a zinc ribbon structure. 32
52150 401295 pfam09298 FAA_hydrolase_N Fumarylacetoacetase N-terminal. The N-terminal domain of fumarylacetoacetate hydrolase is functionally uncharacterized, and adopts a structure consisting of an SH3-like barrel. 106
52151 401296 pfam09299 Mu-transpos_C Mu transposase, C-terminal. Members of this family are found in various prokaryotic integrases and transposases. They adopt a beta-barrel structure with Greek-key topology. 61
52152 286393 pfam09300 Tecti-min-caps Tectiviridae, minor capsid. Members of this family form the minor capsid protein of various Tectiviridae. 83
52153 401297 pfam09301 DUF1970 Domain of unknown function (DUF1970). Members of this family consist of various uncharacterized viral hypothetical proteins. 118
52154 401298 pfam09302 XLF XLF-Cernunnos, XRcc4-like factor, NHEJ component. XLF (also called Cernunnos) is Xrcc4-like-factor, and interacts with the XRCC4-DNA ligase IV complex to promote DNA non-homologous end-joining. It directly interacts with the XRCC4-Ligase IV complex and siRNA-mediated down-regulation of XLF in human cell lines leads to radio-sensitivity and impaired DNA non-homologous end-joining. This family contains Nej1 (non-homologous end-joining factor), and Lif1, ligase-interacting factor. XLF forms one of the components of the NHEJ machinery for DNA non-homologous end-joining. 181
52155 401299 pfam09303 KcnmB2_inactiv KCNMB2, ball and chain domain. Members of this family are found in the cytoplasmic N-terminus of KCNMB2, the beta-2 subunit of large conductance calcium and voltage-activated potassium channels. They are responsible for the fast inactivation of these channels. 30
52156 312712 pfam09304 Cortex-I_coil Cortexillin I, coiled coil. Members of this family are predominantly found in the actin-bundling protein Cortexillin I from Dictyostelium discoideum. They adopt a structure consisting of an 18-heptad-repeat alpha-helical coiled-coil, and are a prerequisite for the assembly of Cortexillin I. 107
52157 370419 pfam09305 TACI-CRD2 TACI, cysteine-rich domain. Members of this family are predominantly found in tumor necrosis factor receptor superfamily, member 13b (TACI), and are required for binding to the ligands APRIL and BAFF. 39
52158 286399 pfam09306 Phage-scaffold Bacteriophage, scaffolding protein. Members of this family of scaffolding proteins are produced by various bacteriophages. 303
52159 401300 pfam09307 MHC2-interact CLIP, MHC2 interacting. Members of this family are found in class II invariant chain-associated peptide (CLIP), and are required for association with class II major histocompatibility complex (MHC) in the MHC class II processing pathway. 109
52160 401301 pfam09308 LuxQ-periplasm LuxQ, periplasmic. Members of this family constitute the periplasmic sensor domain of the prokaryotic protein LuxQ, and assume a structure consisting of two tandem Per/ARNT/Simple-minded (PAS) folds. 238
52161 401302 pfam09309 FCP1_C FCP1, C-terminal. The C-terminal domain of FCP-1 is required for interaction with the carboxy terminal domain of RAP74. Interaction relies extensively on van der Waals contacts between hydrophobic residues situated within alpha-helices in both domains. 260
52162 401303 pfam09310 PD-C2-AF1 POU domain, class 2, associating factor 1. Members of this family are transcriptional coactivators that specifically associate with either OCT1 or OCT2, through recognition of their POU domains. They are essential for the response of B-cells to antigens and required for the formation of germinal centers. 248
52163 401304 pfam09311 Rab5-bind Rabaptin-like protein. Members of this family are predominantly found in Rabaptin and allow for binding to the GTPase Rab5. This interaction is necessary and sufficient for Rab5-dependent recruitment of Rabaptin5 to early endosomal membranes. 307
52164 401305 pfam09312 SurA_N SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment. 118
52165 401306 pfam09313 DUF1971 Domain of unknown function (DUF1971). Members of this family of functionally uncharacterized domains are predominantly found in bacterial Tellurite resistance protein. 80
52166 370426 pfam09314 DUF1972 Domain of unknown function (DUF1972). Members of this family of functionally uncharacterized domains are found in bacterial glycosyltransferases and rhamnosyltransferases. 186
52167 401307 pfam09316 Cmyb_C C-myb, C-terminal. Members of this family are predominantly found in the proto-oncogene c-myb and the viral transforming protein myb. Truncation of the domain results in 'activation' of c-myb and subsequent tumorigenesis. 164
52168 401308 pfam09317 DUF1974 Domain of unknown function (DUF1974). Members of this family of functionally uncharacterized domains are predominantly found in various prokaryotic acyl-coenzyme a dehydrogenases. 284
52169 370428 pfam09318 Glyco_trans_A_1 Glycosyl transferase 1 domain A. Glyco_trans_A_1 is a family found predominantly at the N-terminus of various prokaryotic alpha-glucosyltransferases. Whether the domain exists as a whole molecule or as a half molecule determines the number of sugar residues that the molecule transfers. Two-domain proteins are processive in that they transfer more than one sugar residue; single-domain proteins transfer just one sugar moiety. 199
52170 401309 pfam09320 DUF1977 Domain of unknown function (DUF1977). Members of this family of functionally uncharacterized domains are predominantly found in dnaj-like proteins. 104
52171 312723 pfam09321 DUF1978 Domain of unknown function (DUF1978). Members of this family are found in various hypothetical proteins produced by the bacterium Chlamydia pneumoniae. Their exact function has not, as yet, been identified. 244
52172 401310 pfam09322 DUF1979 Domain of unknown function (DUF1979). Members of this family of functionally uncharacterized domains are found in various Oryza sativa mutator-like transposases. 58
52173 401311 pfam09323 DUF1980 Domain of unknown function (DUF1980). Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined. 179
52174 401312 pfam09324 DUF1981 Domain of unknown function (DUF1981). Members of this family of functionally uncharacterized domains are found in various plant and yeast protein transport proteins. 84
52175 401313 pfam09325 Vps5 Vps5 C terminal like. Vps5 is a sorting nexin that functions in membrane trafficking. This is the C terminal dimerization domain. 219
52176 401314 pfam09326 NADH_dhqG_C NADH-ubiquinone oxidoreductase subunit G, C-terminal. Members of this family are found at the C-terminus of NADH dehydrogenase subunit G or NADH-ubiquinone oxidoreductase subunit G. EC:1.6.99.5. 41
52177 401315 pfam09327 DUF1983 Domain of unknown function (DUF1983). Members of this family of functionally uncharacterized domains are found in various bacteriophage host specificity proteins. 75
52178 401316 pfam09328 Phytochelatin_C Domain of unknown function (DUF1984). Members of this family of functionally uncharacterized domains are found at the C-terminus of plant phytochelatin synthases. 252
52179 401317 pfam09329 zf-primase Primase zinc finger. This zinc finger is found in yeast Mcm10 proteins and DnaG-type primases. 46
52180 401318 pfam09330 Lact-deh-memb D-lactate dehydrogenase, membrane binding. Members of this family are predominantly found in prokaryotic D-lactate dehydrogenase, forming the cap-membrane-binding domain, which consists of a large seven-stranded antiparallel beta-sheet flanked on both sides by alpha-helices. They allow for membrane association. 290
52181 401319 pfam09331 DUF1985 Domain of unknown function (DUF1985). Members of this family of functionally uncharacterized domains are found in a set of Arabidopsis thaliana hypothetical proteins. 133
52182 401320 pfam09332 Mcm10 Mcm10 replication factor. Mcm10 is a eukaryotic DNA replication factor that regulates the stability and chromatin association of DNA polymerase alpha. 344
52183 401321 pfam09333 ATG_C Autophagy-related protein C terminal domain. ATG2 (also known as Apg2) is a peripheral membrane protein. It functions in both cytoplasm-to-vacuole targeting and in autophagy. 96
52184 401322 pfam09334 tRNA-synt_1g tRNA synthetases class I (M). This family includes methionyl tRNA synthetases. 387
52185 401323 pfam09335 SNARE_assoc SNARE associated Golgi protein. This is a family of SNARE associated Golgi proteins. The yeast member of this family localizes with the t-SNARE Tlg2. 120
52186 401324 pfam09336 Vps4_C Vps4 C terminal oligomerization domain. This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerization. 61
52187 370444 pfam09337 zf-H2C2 His(2)-Cys(2) zinc finger. This domain binds to histone upstream activating sequence (UAS) elements that are found in histone gene promoters. Added to clan to resolve overlaps with PF16721 but neither are classic zf_C2H2 zinc-fingers. 39
52188 401325 pfam09338 Gly_reductase Glycine/sarcosine/betaine reductase component B subunits. This is a family of glycine reductase, sarcosine reductase and betaine reductases. These enzymes catalyze the following reactions. sarcosine reductase: Acetyl phosphate + methylamine + thioredoxin disulphide = N-methylglycine + phosphate + thioredoxin Acetyl phosphate + NH(3) + thioredoxin disulphide = glycine + phosphate + thioredoxin. betaine reductase: Acetyl phosphate + trimethylamine + thioredoxin disulphide = N,N,N-trimethylglycine + phosphate + thioredoxin. 426
52189 401326 pfam09339 HTH_IclR IclR helix-turn-helix domain. 52
52190 401327 pfam09340 NuA4 Histone acetyltransferase subunit NuA4. The NuA4 histone acetyltransferase (HAT) multisubunit complex is responsible for acetylation of histone H4 and H2A N-terminal tails in yeast. NuA4 complexes are highly conserved in eukaryotes and play primary roles in transcription, cellular response to DNA damage, and cell cycle control. 78
52191 401328 pfam09341 Pcc1 Transcription factor Pcc1. Pcc1 is a transcription factor that functions in regulating genes involved in cell cycle progression and polarised growth. 75
52192 286432 pfam09342 DUF1986 Domain of unknown function (DUF1986). This domain is found in serine proteases and is predicted to contain disulphide bonds. 116
52193 401329 pfam09343 DUF2460 Conserved hypothetical protein 2217 (DUF2460). This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes. 200
52194 401330 pfam09344 Cas_CT1975 CT1975-like protein. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum. 365
52195 401331 pfam09345 DUF1987 Domain of unknown function (DUF1987). This family of proteins are functionally uncharacterized. 121
52196 401332 pfam09346 SMI1_KNR4 SMI1 / KNR4 family (SUKH-1). Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. Genome contextual information showed that SMI1 are primary immunity proteins in bacterial toxin systems. 119
52197 401333 pfam09347 DUF1989 Domain of unknown function (DUF1989). This family of proteins are functionally uncharacterized. 165
52198 401334 pfam09348 DUF1990 Domain of unknown function (DUF1990). This family of proteins are functionally uncharacterized. 152
52199 401335 pfam09349 OHCU_decarbox OHCU decarboxylase. The proteins in this family are OHCU decarboxylase - enzymes of the purine catabolism that catalyze the conversion of OHCU into S(+)-allantoin. This is the third step of the conversion of uric acid (a purine derivative) to allantoin. Step one is catalyzed by urate oxidase (pfam01014) and step two is catalyzed by HIUases (pfam00576). 155
52200 401336 pfam09350 DUF1992 Domain of unknown function (DUF1992). This family of proteins are functionally uncharacterized. 72
52201 401337 pfam09351 DUF1993 Domain of unknown function (DUF1993). This family of proteins are functionally uncharacterized. 161
52202 401338 pfam09353 DUF1995 Domain of unknown function (DUF1995). This family of proteins are functionally uncharacterized. 202
52203 401339 pfam09354 HNF_C HNF3 C-terminal domain. This presumed domain is found in the C-terminal region of Hepatocyte Nuclear Factor 3 alpha and beta chains. Its specific function is uncertain. The N-terminal region of this presumed domain contains an EH1 (engrailed homology 1) motif, that is characterized by the FxIxxIL sequence. 66
52204 286444 pfam09355 Phage_Gp19 Phage protein Gp19/Gp15/Gp42. This family of proteins are functionally uncharacterized. They are found in a variety of bacteriophage. 116
52205 401340 pfam09356 Phage_BR0599 Phage conserved hypothetical protein BR0599. This entry describes a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions. 80
52206 401341 pfam09357 RteC RteC protein. Human colonic Bacteroides species harbor a family of large conjugative transposons, called tetracycline resistance (Tcr) elements. Activities of these elements are enhanced by pregrowth of bacteria in medium containing tetracycline, indicating that at least some Tcr element genes are regulated by tetracycline. An insertional disruption in the rteC gene abolished self-transfer of the Tcr element to Bacteroides recipients, indicating that the gene was essential for self-transfer. 218
52207 401342 pfam09358 E1_UFD Ubiquitin fold domain. The ubiquitin fold domain is found at the C-terminus of ubiquitin-activating E1 family enzymes. This domain binds to E2 enzymes. 93
52208 401343 pfam09359 VTC VTC domain. This presumed domain is found in the yeast vacuolar transport chaperone proteins VTC2, VTC3 and VTC4. This domain is also found in a variety of bacterial proteins. 235
52209 401344 pfam09360 zf-CDGSH Iron-binding zinc finger CDGSH type. The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm. 42
52210 401345 pfam09361 Phasin_2 Phasin protein. This entry describes a group of small proteins found associated with inclusions in bacterial cells. Most associate with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB). These are designated granule-associate proteins or phasins. 90
52211 401346 pfam09362 DUF1996 Domain of unknown function (DUF1996). This family of proteins are functionally uncharacterized. 234
52212 401347 pfam09363 XFP_C XFP C-terminal domain. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. 200
52213 401348 pfam09364 XFP_N XFP N-terminal domain. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779. 364
52214 401349 pfam09365 DUF2461 Conserved hypothetical protein (DUF2461). Members of this family are widely (though sparsely) distributed bacterial proteins, about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown. 207
52215 401350 pfam09366 DUF1997 Protein of unknown function (DUF1997). This family of proteins are functionally uncharacterized. 153
52216 401351 pfam09367 CpeS CpeS-like protein. This family, that includes CpeS proteins, is functionally uncharacterized. 169
52217 401352 pfam09368 Sas10 Sas10 C-terminal domain. Sas10 is an essential subunit of the U3-containing Small Subunit (SSU) processome complex involved in the production of the 18S rRNA and assembly of the small ribosomal subunit. 74
52218 401353 pfam09369 DUF1998 Domain of unknown function (DUF1998). This family of proteins are functionally uncharacterized. They are mainly found in helicase proteins so could be RNA binding. This family includes a probable zinc binding motif at its C-terminus. 83
52219 401354 pfam09370 PEP_hydrolase Phosphoenolpyruvate hydrolase-like. This domain has a TIM barrel fold related to IGPS and to phosphoenolpyruvate mutase/aldolase/carboxylase. 266
52220 401355 pfam09371 Tex_N Tex-like protein N-terminal domain. This presumed domain is found at the N-terminus of Bordetella pertussis tex. This protein defines a novel family of prokaryotic transcriptional accessory factors. 183
52221 286461 pfam09372 PRANC PRANC domain. This presumed domain is found at the C-terminus of a variety of Pox virus proteins. The PRANC (Pox proteins Repeats of ANkyrin - C terminal) domain is also found on its own in some proteins. The function of this domain is unknown, but it appears to be related to the F-box domain and may play a similar role. 95
52222 117915 pfam09373 PMBR Pseudomurein-binding repeat. Methanothermobacter thermautotrophicus is a methanogenic Gram-positive microorganism with a cell wall consisting of pseudomurein. This repeat specifically binds to pseudomurein. This repeat is found at the N-terminus of PeiW and PeiP which are pseudomurein binding phage proteins. 33
52223 401356 pfam09374 PG_binding_3 Predicted Peptidoglycan domain. This family contains a potential peptidoglycan binding domain. 76
52224 401357 pfam09375 Peptidase_M75 Imelysin. The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14. 287
52225 401358 pfam09376 NurA NurA domain. This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius. 252
52226 401359 pfam09377 SBDS_C SBDS protein C-terminal domain. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. Members of this family play a role in RNA metabolism. 116
52227 401360 pfam09378 HAS-barrel HAS barrel domain. The HAS barrel is named after HerA-ATP Synthase. In ATP synthases, this domain is implicated in the assembly of the catalytic toroid and docking of accessory subunits, such as the subunit of the ATP synthase complex. Similar roles in docking of the functional partner, the NurA nuclease, and assembly of the HerA toroid complex appear likely for the HAS-barrel of the HerA family. 91
52228 401361 pfam09379 FERM_N FERM N-terminal domain. This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. 64
52229 401362 pfam09380 FERM_C FERM C-terminal PH-like domain. 85
52230 401363 pfam09381 Porin_OmpG Outer membrane protein G (OmpG). Porins are channel proteins in the outer membrane of gram negative bacteria which mediate the uptake of molecules required for growth and survival. Escherichia coli OmpG forms a 14 stranded beta-barrel and in contrast to most porins, appears to function as a monomer. The central pore of OmpG is wider than other E. coli porins and it is speculated that it may form a non-specific channel for the transport of larger oligosaccharides. 285
52231 401364 pfam09382 RQC RQC domain. This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain. 109
52232 401365 pfam09383 NIL NIL domain. This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family. 73
52233 401366 pfam09384 UTP15_C UTP15 C terminal. U3 snoRNA is ubiquitous in eukaryotes and is required for nucleolar processing of pre-18S ribosomal RNA. It is a component of the ribosomal small subunit (SSU) processome. UTP15 is needed for optimal pre-ribosomal RNA transcription by RNA polymerase I, together with a subset of U3 proteins required for transcription (t-UTPs). This entry represents the C terminal of UTP15, and is found adjacent to WD40 repeats (pfam00400). 147
52234 286473 pfam09385 HisK_N Histidine kinase N terminal. This domain is found at the N terminal of sensor histidine kinase proteins. 129
52235 401367 pfam09386 ParD Antitoxin ParD. ParD is a plasmid anti-toxin that forms a ribbon-helix-helix DNA binding structure. It stabilizes plasmids by inhibiting ParE toxicity in cells that express ParD and ParE. ParD forms a dimer and also regulates its own promoter (parDE). 80
52236 401368 pfam09387 MRP Mitochondrial RNA binding protein MRP. MRP1 and MRP2 are mitochondrial RNA binding proteins that form a heteromeric complex. The MRP1/MRP2 heterotetrameric complex binds to guide RNAs and stabilizes them in an unfolded conformation suitable for RNA-RNA hybridisation. Each MRP subunit adopts a 'whirly' transcription factor fold. 192
52237 401369 pfam09388 SpoOE-like Spo0E like sporulation regulatory protein. Spore formation is an extreme response to starvation and can also be a component of disease transmission. Sporulation is controlled by an expanded two-component system where starvation signals result in sensor kinase activation and phosphorylation of the master sporulation response regulator Spo0A. Phosphatases such as Spo0E dephosphorylate Spo0A thereby inhibiting sporulation. This is a family of Spo0E-like phosphatases. The structure of a Bacillus anthracis member of this family has revealed an anti-parallel alpha-helical structure. 42
52238 370462 pfam09390 DUF1999 Protein of unknown function (DUF1999). This family contains a putative Fe-S binding reductase whose structure adopts an alpha and beta fold. 151
52239 401370 pfam09391 DUF2000 Protein of unknown function (DUF2000). This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold. 133
52240 401371 pfam09392 T3SS_needle_F Type III secretion needle MxiH, YscF, SsaG, EprI, PscF, EscF. Type III secretion systems are essential virulence determinants for many gram-negative bacterial pathogens. MxiH is an extracellular alpha helical needle that is required for translocation of effector proteins into host cells. Once inside, the effector proteins subvert normal cell function to aid infection. The needle protein F, polymerizes to form a shaft. 67
52241 401372 pfam09393 DUF2001 Phage tail tube protein. This is a family of phage tail tube proteins including protein XkdM from the phage-like element PBSX, whose structure adopts a beta barrel flanked by alpha helical regions. 139
52242 401373 pfam09394 Inhibitor_I42 Chagasin family peptidase inhibitor I42. Chagasin is a cysteine peptidase inhibitor which forms a beta barrel structure. 89
52243 401374 pfam09396 Thrombin_light Thrombin light chain. Thrombin is an enzyme that cleaves bonds after Arg and Lys, converts fibrinogen to fibrin and activates factors V, VII, VIII. Prothrombin is activated on the surface of a phospholipid membrane where factor Xa removes the activation peptide and cleaves the remaining part into light and heavy chains. This domain corresponds to the light chain of thrombin. 47
52244 401375 pfam09397 Ftsk_gamma Ftsk gamma domain. This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding. 63
52245 312785 pfam09398 FOP_dimer FOP N terminal dimerization domain. Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerization. 81
52246 401376 pfam09399 SARS_lipid_bind SARS lipid binding protein. This is a family of proteins found in SARS coronavirus. The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules. This cavity is likely to be involved in membrane attachment. 97
52247 401377 pfam09400 DUF2002 Protein of unknown function (DUF2002). This is a family of putative cytoplasmic proteins. The structures of these proteins form an antiparallel beta sheet and contain some alpha helical regions. 110
52248 286486 pfam09401 NSP10 RNA synthesis protein NSP10. Non-structural protein 10 (NSP10) is involved in RNA synthesis. It is synthesized as a polyprotein whose cleavage generates many non-structural proteins. NSP10 contains two zinc binding motifs and forms two anti-parallel helices which are stacked against an irregular beta sheet. A cluster of basic residues on the protein surface suggests a nucleic acid-binding function. 119
52249 401378 pfam09402 MSC Man1-Src1p-C-terminal domain. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad. This C-terminal tail is also found in S. cerevisiae and is thought to consist of three conserved helices followed by two downstream strands. 333
52250 401379 pfam09403 FadA Adhesion protein FadA. FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. 99
52251 401380 pfam09404 DUF2003 Eukaryotic protein of unknown function (DUF2003). This is a family of proteins of unknown function which adopt an alpha helical and beta sheet structure. 440
52252 401381 pfam09405 Btz CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide. 116
52253 401382 pfam09406 DUF2004 Protein of unknown function (DUF2004). This is a family of proteins with unknown function. The structure of one of the proteins in this family has revealed a novel alpha-beta fold. 106
52254 401383 pfam09407 AbiEi_1 AbiEi antitoxin C-terminal domain. AbiEi_1 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 143
52255 401384 pfam09408 Spike_rec_bind Spike receptor binding domain. Spike is an envelope glycoprotein which aids viral entry into the host cell. This domain is the immunogenic receptor binding domain of the protein which binds to angiotensin-converting enzyme 2 (ACE2). 177
52256 401385 pfam09409 PUB PUB domain. The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as a AAA ATPase binding domain. This domain is also found on other proteins linked to the ubiquitin-proteasome system. 77
52257 401386 pfam09411 PagL Lipid A 3-O-deacylase (PagL). PagL is an outer membrane protein with lipid A 3-O-deacylase activity. It forms an 8 stranded beta barrel structure. 129
52258 401387 pfam09412 XendoU Endoribonuclease XendoU. This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with 2'-3' cyclic phosphate termini. 264
52259 401388 pfam09413 DUF2007 Putative prokaryotic signal transducing protein. This is a family of putative prokaryotic signal transducing proteins of Pii-type. 66
52260 401389 pfam09414 RNA_ligase RNA ligase. This is a family of RNA ligases. The enzyme repairs RNA strand breaks in nicked DNA:RNA and RNA:RNA but not in DNA:DNA duplexes. 132
52261 401390 pfam09415 CENP-X CENP-S associating Centromere protein X. The centromere, essential for faithful chromosome segregation during mitosis, has a network of constitutive centromere-associated (CCAN) proteins associating with it during mitosis. So far in vertebrates at least 15 centromere proteins have been identified, which are divided into several subclasses based on functional and biochemical analyses. These provide a platform for the formation of a functional kinetochore during mitosis. CENP-S is one that does not associate with the CENP-H-containing complex but rather interacts with CENP-X to form a stable assembly of outer kinetochore proteins that functions downstream of other components of the CCAN. This complex may directly allow efficient and stable formation of the outer kinetochore on the CCAN platform. 72
52262 401391 pfam09416 UPF1_Zn_bind RNA helicase (UPF2 interacting domain). UPF1 is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved in nonsense-mediated mRNA decay (NMD). 152
52263 401392 pfam09418 DUF2009 Protein of unknown function (DUF2009). This is a eukaryotic family of proteins with unknown function. 454
52264 286502 pfam09419 PGP_phosphatase Mitochondrial PGP phosphatase. This is a family of proteins that acts as a mitochondrial phosphatase in cardiolipin biosynthesis. Cardiolipin is a unique dimeric phosphoglycerolipid predominantly present in mitochondrial membranes. The inverted phosphatase motif includes the highly conserved DKD triad. 166
52265 401393 pfam09420 Nop16 Ribosome biogenesis protein Nop16. Nop16 is a protein involved in ribosome biogenesis. 209
52266 401394 pfam09421 FRQ Frequency clock protein. The frequency clock protein is the central component of the frq-based circadian negative feedback loop and regulates various aspects of the circadian clock in Neurospora crassa. This protein has been shown to interact with itself via a coiled-coil. 982
52267 401395 pfam09422 WTX WTX protein. The WTX protein is found to be inactivated in one third of Wilms tumors. The WTX protein is functionally uncharacterized. 468
52268 401396 pfam09423 PhoD PhoD-like phosphatase. 342
52269 401397 pfam09424 YqeY Yqey-like protein. The function of this domain found in the YqeY protein is uncertain. 143
52270 401398 pfam09425 CCT_2 Divergent CCT motif. This short motif is found in a number of plant proteins. It appears to be related to the N-terminal half of the CCT motif. The CCT motif is about 45 amino acids long and contains a putative nuclear localization signal within the second half of the CCT motif. 25
52271 401399 pfam09426 Nyv1_N Vacuolar R-SNARE Nyv1 N terminal. This domain corresponds to the N terminal domain of vacuolar R-SNARE Nyv1 which adopts a longin fold. In yeast it has been shown that this domain is sufficient to direct the transport of Nyv1 to the limiting membrane of the vacuole. 138
52272 401400 pfam09427 DUF2014 Domain of unknown function (DUF2014). This domain is found at the C terminal of a family of ER membrane bound transcription factors called sterol regulatory element binding proteins (SREBP). 263
52273 401401 pfam09428 DUF2011 Fungal protein of unknown function (DUF2011). This is a family of fungal proteins whose function is unknown. 90
52274 401402 pfam09429 Wbp11 WW domain binding protein 11. The WW domain is a small protein module with a triple-stranded beta-sheet fold. This is a family of WW domain binding proteins. 76
52275 401403 pfam09430 DUF2012 Protein of unknown function (DUF2012). This is a eukaryotic family of uncharacterized proteins. 122
52276 401404 pfam09431 DUF2013 Protein of unknown function (DUF2013). This region is found at the C terminal of a group of cytoskeletal proteins. 136
52277 401405 pfam09432 THP2 Tho complex subunit THP2. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1. 129
52278 401406 pfam09435 DUF2015 Fungal protein of unknown function (DUF2015). This is a fungal family of uncharacterized proteins. 110
52279 312811 pfam09436 DUF2016 Domain of unknown function (DUF2016). A predicted alpha+beta domain that is usually fused N-terminal to the JAB metallopeptidase. This protein in turn is found in conserved gene neighborhoods that include genes encoding the bacterial homologs of the ubiquitin modification system such as the E1, E2 and Ub proteins. The domain is also known as the JAB-N domain. 72
52280 117976 pfam09437 Pombe_5TM Pombe specific 5TM protein. 219
52281 401407 pfam09438 DUF2017 Domain of unknown function (DUF2017). This is an alpha-helical domain found in gene neighborhoods that contain genes encoding ubiquitin, cysteine synthases and JAB peptidases. 170
52282 370490 pfam09439 SRPRB Signal recognition particle receptor beta subunit. The beta subunit of the signal recognition particle receptor (SRP) is a transmembrane GTPase which anchors the alpha subunit to the endoplasmic reticulum membrane. 181
52283 401408 pfam09440 eIF3_N eIF3 subunit 6 N terminal domain. This is the N terminal domain of subunit 6 translation initiation factor eIF3. 132
52284 401409 pfam09441 Abp2 ARS binding protein 2. This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals. 171
52285 401410 pfam09442 DUF2018 Domain of unknown function (DUF2018). Acid-adaptive protein possibly of physiological significance when H.pylori colonises the human stomach, which adopts a unique four alpha-helical triangular conformation. The biologically active form is thought to be a tetramer. The protein is expressed along with six other proteins, some of which are related to iron storage and haem biosynthesis. 83
52286 401411 pfam09443 CFC Cripto_Frl-1_Cryptic (CFC). CFC domain is one half of the membrane protein Cripto, a protein overexpressed in many tumors and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2. CFC is approx 40-residues long, compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain. 35
52287 401412 pfam09444 MRC1 MRC1-like domain. This putative domain is found to be the most conserved region in mediator of replication checkpoint protein 1. 141
52288 370496 pfam09445 Methyltransf_15 RNA cap guanine-N2 methyltransferase. RNA cap guanine-N2 methyltransferases such as Schizosaccharomyces pombe Tgs1 and Giardia lamblia Tgs2 catalyze methylation of the exocyclic N2 amine of 7-methylguanosine. 165
52289 401413 pfam09446 VMA21 VMA21-like domain. This presumed short domain appears to contain two potential transmembrane helices. VMA21 is localized in the ER where it is needed as an accessory factor for assembly of the V0 component of the vacuolar ATPase. 64
52290 401414 pfam09447 Cnl2_NKP2 Cnl2/NKP2 family protein. This family includes the Cnl2 kinetochore protein. 65
52291 370499 pfam09448 MmlI Methylmuconolactone methyl-isomerase. MmlI is a short, approx 115 residue, protein of two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species and tetrameric in others. The known structure shows two copies of the protein form a dimeric alpha beta barrel. 114
52292 401415 pfam09449 DUF2020 Domain of unknown function (DUF2020). Protein of unknown function found in bacteria. 144
52293 312824 pfam09450 DUF2019 Domain of unknown function (DUF2019). Protein of unknown function found in bacteria. 105
52294 401416 pfam09451 ATG27 Autophagy-related protein 27. 261
52295 370502 pfam09452 Mvb12 ESCRT-I subunit Mvb12. The endosomal sorting complex required for transport (ESCRT) complexes play a critical role in receptor down-regulation and retroviral budding. A new component of the ESCRT-I complex was identified, multivesicular body sorting factor of 12 kD (Mvb12), which binds to the coiled-coil domain of the ESCRT-I subunit vacuolar protein sorting 23 (Vps23). 90
52296 401417 pfam09453 HIRA_B HIRA B motif. The HirA B (Histone regulatory homolog A binding) motif is the essential binding interface between HIRA pfam07569 and ASF1a, of approx. 40 residues. It forms an antiparallel beta-hairpin that binds perpendicular to the strands of the beta-sandwich of ASF1a N-terminal core domain, via beta-sheet, salt bridge and van der Waals interactions. The two histone chaperone proteins, HIRA and ASF1a, form a heterodimer with histones H3 and H4. HIRA is the human orthologue of Hir proteins known to silence histone gene expression and create transcriptionally silent heterochromatin in yeast, flies, plants and humans. The yeast CAF1B proteins which bind H3 also carry this motif at their very C-terminus. 23
52297 401418 pfam09454 Vps23_core Vps23 core domain. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. The core domain of the Vps23 subunit of the heterotrimeric ESCRT-I complex is a helical hairpin sandwiched in a fan-like formation between two other helical hairpins from Vps28 (pfam03997) and Vps37. Vps23 gives ESCRT-I complex its stability. 60
52298 401419 pfam09455 Cas_DxTHG CRISPR-associated (Cas) DxTHG family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown. 313
52299 401420 pfam09456 RcsC RcsC Alpha-Beta-Loop (ABL). This domain is found in the C-terminus of the phospho-relay kinase RcsC between pfam00512 and pfam00072, and forms a discrete alpha/beta/loop structure. 91
52300 401421 pfam09457 RBD-FIP FIP domain. The FIP domain is the Rab11-binding domain (RBD) at the C-terminus of a family of Rab11-interacting proteins (FIPs). The Rab proteins constitute the largest family of small GTPases (>60 members in mammals). Among them Rab11 is a well characterized regulator of endocytic and recycling pathways. Rab11 associates with a broad range of post-Golgi organelles, including recycling endosomes. 41
52301 401422 pfam09458 H_lectin H-type lectin domain. The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754. 67
52302 401423 pfam09459 EB_dh Ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyzes the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria. 193
52303 370509 pfam09460 Saf-Nte_pilin Saf-pilin pilus formation protein. This domain consists of the adjacent Saf-Nte and Saf-pilin chains of the pilus-forming complex. Pilus assembly in Gram-negative bacteria involves a Donor-strand exchange mechanism between the C- and the N-termini of this domain. The C-terminal subunit forms an incomplete Ig-fold which is then complemented by the 10-18 residue N-terminus of another, incoming, pilus subunit which is not involved in the Ig-fold. The N-terminus sequences contain a motif of alternating hydrophobic residues that occupy the P2 to P5 binding pockets in the groove of the first pilus subunit. 144
52304 370510 pfam09461 PcF Phytotoxin PcF protein. PcF is a 52 residue protein factor of two alpha helices, containing a 4-hydroxyproline and three cysteine bridges. The presence of the hydroxyproline is unique in relation to other fungal phytotoxic proteins. The protein has a high content of acidic side-chains implying a lack of binding with lipid-rich components of membranes and appears to be an extracellular phytotoxin that causes leaf necrosis in strawberries. 43
52305 370511 pfam09462 Mus7 Mus7/MMS22 family. This family includes a conserved region from the Mus7 protein. Mus7 is involved in the repair of replication-associated DNA damage in the fission yeast Schizosaccharomyces pombe. Mus7 functions in the same pathway as Mus81, a subunit of the Mus81-Eme1 structure-specific endonuclease, which has been implicated in the repair of the replication-associated DNA damage. The MMS22 proteins are involved in repairing double-stranded DNA breaks created by the cleavage reaction of topoisomerase II. 610
52306 401424 pfam09463 Opy2 Opy2 protein. Opy2p acts as a membrane anchor in the HOG signalling pathway. 35
52307 401425 pfam09465 LBR_tudor Lamin-B receptor of TUDOR domain. The Lamin-B receptor, found on the TUDOR domain pfam00567, is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner Nuclear Envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with Importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the NE. 55
52308 312837 pfam09466 Yqai Hypothetical protein Yqai. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands. 66
52309 401426 pfam09467 Yopt Hypothetical protein Yopt. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms homo-dimers, with each monomer consisting of one alpha helix and three beta strands. 71
52310 401427 pfam09468 RNase_H2-Ydr279 Ydr279p protein family (RNase H2 complex component). RNases H are enzymes that specifically hydrolyze RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p), this family represents the homologs of Ydr279p. It is not known whether non yeast proteins in this family fulfil the same function. 157
52311 312839 pfam09469 Cobl Cordon-bleu ubiquitin-like domain. The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node. This domain has a ubiquitin-like fold. 79
52312 401428 pfam09470 Telethonin Telethonin protein. Telethonin is a 167-residue protein which complexes with the large muscle protein, titin. The very N-terminus of titin, composed of two immunoglobulin-like (Ig) domains, referred to as Z1 and Z2, interacts with the N-terminal region (residues 1-53) of telethonin, mediating the antiparallel assembly of two Z1Z2 domains. The C-terminus of the telethonin appears to induce dimerization of this 2:1 titin/telethonin structure which thus forms a complex necessary for myofibril assembly and maintenance of the intact Z-disk of skeletal and cardiac muscles. 154
52313 401429 pfam09471 Peptidase_M64 IgA Peptidase M64. This is a family of highly selective metallo-endopeptidases. The primary structure of the Clostridium ramosum IgA proteinase shows no significant overall similarity to any other known metallo-endopeptidase. 259
52314 401430 pfam09472 MtrF Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF). Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C terminal region of what appears to be a fusion of the MtrA and MtrF proteins. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism. 62
52315 401431 pfam09474 Type_III_YscX Type III secretion system YscX (type_III_YscX). Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery. 121
52316 286550 pfam09475 Dot_icm_IcmQ Dot/Icm secretion system protein (dot_icm_IcmQ). Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favour calling this the Dot/Icm system. This protein was shown to be essential for translocation. 178
52317 401432 pfam09476 Pilus_CpaD Pilus biogenesis CpaD protein (pilus_cpaD). Proteins in this entry consist of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function of the homologs is not known. 201
52318 401433 pfam09477 Type_III_YscG Bacterial type III secretion system chaperone protein (type_III_yscG). YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins), in Yersinia. This entry consists of YscG from Yersinia and functionally equivalent type III secretion proteins in other species: e.g. AscG in Aeromonas and LscG in Photorhabdus luminescens. 116
52319 286553 pfam09478 CBM49 Carbohydrate binding domain CBM49. This domain is found at the C terminal of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose. 80
52320 401434 pfam09479 Flg_new Listeria-Bacteroides repeat domain (List_Bact_rpt). This model describes a conserved core region of about 43 residues, which occurs in at least two families of tandem repeats. These include 78-residue repeats which occur from 2 to 15 times in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats found in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few other bacteria. 65
52321 401435 pfam09480 PrgH Type III secretion system protein PrgH-EprH (PrgH). In Salmonella, the gene encoding this protein is part of a four-gene operon PrgHIJK, while in other organisms it is found in type III secretion operons. PrgH has been shown to be required for type III secretion and is a structural component of the needle complex, which is the core component of type III secretion systems. 374
52322 401436 pfam09481 CRISPR_Cse1 CRISPR-associated protein Cse1 (CRISPR_cse1). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry, represented by CT1972 from Chlorobaculum tepidum, is found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse1. 465
52323 370521 pfam09482 OrgA_MxiK Bacterial type III secretion apparatus protein (OrgA_MxiK). This protein is encoded by genes which are found in type III secretion operons, and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the needle complex, which is the core component of type III secretion systems. 181
52324 401437 pfam09483 HpaP Type III secretion protein (HpaP). This entry represents proteins encoded by genes which are always found in type III secretion operons, although their function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence. see also PMID:18584024 90
52325 401438 pfam09484 Cas_TM1802 CRISPR-associated protein TM1802 (cas_TM1802). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This minor cas protein is found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial. 584
52326 401439 pfam09485 CRISPR_Cse2 CRISPR-associated protein Cse2 (CRISPR_cse2). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family of proteins, represented by CT1973 from Chlorobaculum tepidum, is encoded by genes found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse2. 144
52327 370523 pfam09486 HrpB7 Bacterial type III secretion protein (HrpB7). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia. 157
52328 370524 pfam09487 HrpB2 Bacterial type III secretion protein (HrpB2). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia. 114
52329 401440 pfam09488 Osmo_MPGsynth Mannosyl-3-phosphoglycerate synthase (osmo_MPGsynth). This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP) EC:2.4.1.217, comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus, this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase. 380
52330 401441 pfam09489 CbtB Probable cobalt transporter subunit (CbtB). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional transmembrane segments. 50
52331 401442 pfam09490 CbtA Probable cobalt transporter subunit (CbtA). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five transmembrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site. 240
52332 401443 pfam09491 RE_AlwI AlwI restriction endonuclease. This family includes the AlwI (recognizes GGATC), Bsp6I (recognizes GC^NGC), BstNBI (recognizes GASTC), PleI (recognizes GAGTC) and MlyI (recognizes GAGTC) restriction endonucleases. 441
52333 401444 pfam09492 Pec_lyase Pectic acid lyase. Members of this family are isozymes of pectate lyase (EC:4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase. 289
52334 401445 pfam09493 DUF2389 Tryptophan-rich protein (DUF2389). Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo. 62
52335 401446 pfam09494 Slx4 Slx4 endonuclease. The Slx4 protein is a heteromeric structure-specific endonuclease found from fungi to mammals. Slx4 with Slx1 acts as a nuclease on branched DNA substrates, particularly simple-Y, 5'-flap, or replication fork structures by cleaving the strand bearing the 5' non-homologous arm at the branch junction and thus generating ligatable nicked products from 5'-flap or replication fork substrates. 59
52336 401447 pfam09495 DUF2462 Protein of unknown function (DUF2462). This protein is highly conserved, but its function is unknown. It can be isolated from HeLa cell nucleoli and is found to be homologous with Leydig cell tumor protein whose function is unknown. 74
52337 401448 pfam09496 CENP-O Cenp-O kinetochore centromere component. This eukaryotic protein is a component of the inner kinetochore subcomplex of the centromere. It has been shown to be involved in chromosome segregation via regulation of the spindle in both yeast and human. 202
52338 401449 pfam09497 Med12 Transcription mediator complex subunit Med12. Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signalling pathway via its interaction with Gli3 within the RNA polymerase II transcriptional Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription. This subunit forms part of the Kinase section of Mediator. 63
52339 401450 pfam09498 DUF2388 Protein of unknown function (DUF2388). This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1 and in Pseudomonas putida (strain KT2440), four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown. 70
52340 286574 pfam09499 RE_ApaLI ApaLI-like restriction endonuclease. This family includes R.ApaLI and R.XbaI restriction endonucleases. ApaLI recognizes and cleaves the sequence GTGCAC. 189
52341 401451 pfam09500 YiiD_C Putative thioesterase (yiiD_Cterm). This entry consists of a broadly distributed uncharacterized domain often found as a standalone protein. The member from Shewanella oneidensis is described from crystallography work as a putative thioesterase because it belongs to the HotDog clan of enzymes. About half of the members of this family are fused to an Acetyltransf_1 domain pfam00583. 144
52342 312864 pfam09501 Bac_small_YrzI Probable sporulation protein (Bac_small_yrzI). Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, while arrays of six members in tandem are found in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small acid-soluble spore proteins (SASP). A role in sporulation is possible. 45
52343 401452 pfam09502 HrpB4 Bacterial type III secretion protein (HrpB4). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia. 217
52344 401453 pfam09504 RE_Bsp6I Bsp6I restriction endonuclease. This family includes the Bsp6I (recognizes and cleaves GC^NGC) restriction endonucleases. 179
52345 370535 pfam09505 Dimeth_Pyl Dimethylamine methyltransferase (Dimeth_PyL). This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of Methanosarcina acetivorans, Methanosarcina barkeri, and Methanosarcina mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence. 462
52346 401454 pfam09506 Salt_tol_Pase Glucosylglycerol-phosphate phosphatase (Salt_tol_Pase). Proteins in this family are glucosylglycerol-phosphate phosphatases, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol. 388
52347 370537 pfam09507 CDC27 DNA polymerase subunit Cdc27. This protein forms the C subunit of DNA polymerase delta. It carries the essential residues for binding to the Pol1 subunit of polymerase alpha, from residues 293-332, which are characterized by the motif D--G--VT, referred to as the DPIM motif. The first 160 residues of the protein form the minimal domain for binding to the B subunit, Cdc1, of polymerase delta, the final 10 C-terminal residues, 362-372, being the DNA sliding clamp, PCNA, binding motif. 427
52348 401455 pfam09508 Lact_bio_phlase Lacto-N-biose phosphorylase N-terminal TIM barrel domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides. 434
52349 401456 pfam09509 Hypoth_Ymh Protein of unknown function (Hypoth_ymh). This entry consists of a relatively rare prokaryotic protein family (about 8 occurrences per 200 genomes). Genes for members of this family appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. Their function is unknown. 118
52350 401457 pfam09510 Rtt102p Rtt102p-like transcription regulator protein. This protein is found in fungi. The family includes Rtt102p, a transcription regulator protein which appears to be integrally associated with both the Swi-Snf and the RSC chromatin remodelling complexes. 130
52351 401458 pfam09511 RNA_lig_T4_1 RNA ligase. Members of this family include T4 phage proteins with ATP-dependent RNA ligase activity. Host defense to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. This family also includes fungal tRNA ligases that have adenylyltransferase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns. 221
52352 401459 pfam09512 ThiW Thiamine-precursor transporter protein (ThiW). Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine. 150
52353 401460 pfam09514 SSXRD SSXRD motif. SSX1 can repress transcription, and this has been attributed to a putative Kruppel associated box (KRAB) repression domain at the N-terminus. However, from the analysis of these deletion constructs further repression activity was found at the C-terminus of SSX1, in a region which has been called the SSXRD (SSX Repression Domain). The potent repression exerted by full-length SSX1 appears to localize to this region. 30
52354 401461 pfam09515 Thia_YuaJ Thiamine transporter protein (Thia_YuaJ). Members of this protein family have been assigned as thiamine transporters by a phylogenetic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation. 177
52355 401462 pfam09516 RE_CfrBI CfrBI restriction endonuclease. This family includes the CfrBI (recognizes and cleaves C^CWWGG) restriction endonuclease. 257
52356 370543 pfam09517 RE_Eco29kI Eco29kI restriction endonuclease. This family includes the Eco29kI (recognizes and cleaves CCGC^GG) restriction endonuclease. 161
52357 286590 pfam09518 RE_HindIII HindIII restriction endonuclease. This family includes the HindIII (recognizes and cleaves A^AGCTT) restriction endonuclease. 284
52358 370544 pfam09519 RE_HindVP HindVP restriction endonuclease. This family includes the HindVP (recognizes GRCGYC but the cleavage site is unknown) restriction endonucleases. 324
52359 401463 pfam09520 RE_TdeIII Type II restriction endonuclease, TdeIII. This family includes many TdeIII restriction endonucleases that recognize and cleave at GGNCC sites. TdeIII cleave unmethylated double-stranded DNA. 239
52360 401464 pfam09521 RE_NgoPII NgoPII restriction endonuclease. This family includes the NgoPII (recognizes and cleaves GG^CC) restriction endonuclease. 262
52361 337439 pfam09522 RE_R_Pab1 R.Pab1 restriction endonuclease. 119
52362 401465 pfam09523 DUF2390 Protein of unknown function (DUF2390). Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model. 107
52363 401466 pfam09524 Phg_2220_C Conserved phage C-terminus (Phg_2220_C). This entry represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown. 74
52364 401467 pfam09526 DUF2387 Probable metal-binding protein (DUF2387). Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various proteobacteria. 75
52365 401468 pfam09527 ATPase_gene1 Putative F0F1-ATPase subunit Ca2+/Mg2+ transporter. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. It carries two transmembrane helices and is a magnesium or calcium uniporter. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI. 54
52366 312885 pfam09528 Ehrlichia_rpt Ehrlichia tandem repeat (Ehrlichia_rpt). This entry represents a 30 amino acid tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen. These short tandem repeats elicit a strong antibody response in the hosts. 36
52367 401469 pfam09529 Intg_mem_TP0381 Integral membrane protein (intg_mem_TP0381). This entry represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc. 210
52368 401470 pfam09531 Ndc1_Nup Nucleoporin protein Ndc1-Nup. Ndc1 is a nucleoporin protein that is a component of the Nuclear Pore Complex, and, in fungi, also of the Spindle Pole Body. It consists of six transmembrane segments, three lumenal loops, both concentrated at the N-terminus and cytoplasmic domains largely at the C-terminus, all of which are well conserved. 546
52369 370547 pfam09532 FDF FDF domain. The FDF domain, so called because of the conserved FDF at its N termini, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C-terminus of Scd6p-like SM domains. It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N terminal to the YjeF-N domain, a novel Rossmann fold domain. 102
52370 370548 pfam09533 DUF2380 Predicted lipoprotein of unknown function (DUF2380). This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown. 187
52371 401471 pfam09534 Trp_oprn_chp Tryptophan-associated transmembrane protein (Trp_oprn_chp). Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis. 180
52372 370549 pfam09535 Gmx_para_CXXCG Protein of unknown function (Gmx_para_CXXCG). This entry consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment. 236
52373 370550 pfam09536 DUF2378 Protein of unknown function (DUF2378). This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622 and 12 in Stigmatella aurantiaca DW4/3-1. Members are about 200 amino acids in length. The function is unknown. 177
52374 401472 pfam09537 DUF2383 Domain of unknown function (DUF2383). Members of this protein family are found mostly in the Proteobacteria, although one member is found in the marine planctomycete Pirellula sp. strain 1. The function is unknown. 106
52375 401473 pfam09538 FYDLN_acid Protein of unknown function (FYDLN_acid). Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown. 108
52376 401474 pfam09539 DUF2385 Protein of unknown function (DUF2385). Members of this uncharacterized protein family are found in a number of alphaproteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown. 88
52377 370553 pfam09543 DUF2379 Protein of unknown function (DUF2379). This family consists of at least 7 paralogs in Myxococcus xanthus and 6 in Stigmatella aurantiaca, both members of the Deltaproteobacteria. The function is unknown. 120
52378 370554 pfam09544 DUF2381 Protein of unknown function (DUF2381). This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown. 287
52379 286611 pfam09545 RE_AccI AccI restriction endonuclease. This family includes the AccI (recognizes and cleaves GT^MKAC) restriction endonuclease. 366
52380 401475 pfam09546 Spore_III_AE Stage III sporulation protein AE (spore_III_AE). This represents the stage III sporulation protein AE, which is encoded in a spore formation operon spoIIIAABCDEFGH under the control of sigma G. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. 321
52381 401476 pfam09547 Spore_IV_A Stage IV sporulation protein A (spore_IV_A). SpoIVA is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. 490
52382 401477 pfam09548 Spore_III_AB Stage III sporulation protein AB (spore_III_AB). SpoIIIAB represents the stage III sporulation protein AB, which is encoded in a spore formation operon: spoIIIAABCDEFGH that is under sigma G regulation. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. 169
52383 370555 pfam09549 RE_Bpu10I Bpu10I restriction endonuclease. This family includes the Bpu10I (recognizes and cleaves CCTNAGC (-5/-2)) restriction endonucleases. 220
52384 401478 pfam09550 Phage_TAC_6 Phage tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins largely derived from the Rhodobacter species viral agent GTA (gene transfer agent) gp10. 58
52385 401479 pfam09551 Spore_II_R Stage II sporulation protein R (spore_II_R). SpoIIR is designated stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. SpoIIR is a signalling protein that links the activation of sigma E to the transcriptional activity of sigma F during sporulation. 125
52386 286618 pfam09552 RE_BstXI BstXI restriction endonuclease. This family includes the BstXI (recognizes and cleaves CCANNNNN^NTGG) restriction endonuclease. 290
52387 401480 pfam09553 RE_Eco47II Eco47II restriction endonuclease. This family includes the Eco47II (which recognizes GGNCC, but the cleavage site is unknown) restriction endonuclease. 202
52388 370557 pfam09554 RE_HaeII HaeII restriction endonuclease. This family includes the HaeII (recognizes and cleaves RGCGC^Y) restriction endonuclease. 338
52389 401481 pfam09556 RE_HaeIII HaeIII restriction endonuclease. This family includes the HaeIII (recognizes and cleaves GG^CC) restriction endonuclease. 298
52390 401482 pfam09557 DUF2382 Domain of unknown function (DUF2382). This entry describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction centre complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known. 111
52391 286623 pfam09558 DUF2375 Protein of unknown function (DUF2375). Two members of this family are found in Colwellia psychrerythraea (strain 34H / ATCC BAA-681) and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by PEP_anchor (IPR013424). 69
52392 401483 pfam09559 Cas6 Cas6 Crispr. The Cas6 Crispr family of proteins averaging 140 residues are characterized by having a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus. The CRISPR-Cas system is possibly a mechanism of defense against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes. 190
52393 401484 pfam09560 Spore_YunB Sporulation protein YunB (Spo_YunB). Spo_YunB is the sporulation protein YunB. In Bacillus subtilis its expression is controlled by sigmaE. The gene YunB seems to code for a protein involved, at least indirectly, in the pathway leading to the activation of sigmaK. Inactivation of YunB delays sigmaK activation and results in reduced sporulation efficiency. 91
52394 401485 pfam09561 RE_HpaII HpaII restriction endonuclease. This family includes the HpaII (recognizes and cleaves C^CGG) restriction endonuclease. 352
52395 401486 pfam09562 RE_LlaMI LlaMI restriction endonuclease. This family includes the LlaMI (recognizes and cleaves CC^NGG) restriction endonuclease. 242
52396 370562 pfam09563 RE_LlaJI LlaJI restriction endonuclease. This family includes the LlaJI (recognizes GACGC) restriction endonucleases. 365
52397 401487 pfam09564 RE_NgoBV NgoBV restriction endonuclease. This family includes the NgoBV (recognizes GGNNCC but cleavage site is unknown) restriction endonuclease. 238
52398 401488 pfam09565 RE_NgoFVII NgoFVII restriction endonuclease. This family includes the NgoFVII (recognizes GCSGC but cleavage site unknown) restriction endonuclease. 293
52399 401489 pfam09566 RE_SacI SacI restriction endonuclease. This family includes the SacI (recognizes and cleaves GAGCT^C) restriction endonuclease. 267
52400 401490 pfam09567 RE_MamI MamI restriction endonuclease. This family includes the MamI (recognizes and cleaves GATNN^NNATC) restriction endonuclease. 183
52401 401491 pfam09568 RE_MjaI MjaI restriction endonuclease. This family includes the MjaI (recognizes CTAG but cleavage site unknown) restriction endonuclease. 164
52402 401492 pfam09569 RE_ScaI ScaI restriction endonuclease. This family includes the ScaI (recognizes and cleaves AGT^ACT) restriction endonuclease. 192
52403 312916 pfam09570 RE_SinI SinI restriction endonuclease. This family includes the SinI (recognizes and cleaves G^GWCC) restriction endonuclease. 218
52404 286631 pfam09571 RE_XcyI XcyI restriction endonuclease. This family includes the XcyI (recognizes and cleaves C^CCGGG) restriction endonucleases. 305
52405 401493 pfam09572 RE_XamI XamI restriction endonuclease. This family includes the XamI (recognizes GTCGAC but cleavage site unknown) restriction endonuclease. 254
52406 401494 pfam09573 RE_TaqI TaqI restriction endonuclease. This family includes the TaqI (recognizes and cleaves T^CGA) restriction endonuclease. 229
52407 370569 pfam09574 DUF2374 Protein of unknown function (Duf2374). This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC 7966. 42
52408 286635 pfam09575 Spore_SspJ Small spore protein J (Spore_SspJ). Spore_SspJ represents a group of small acid-soluble proteins (SASP) from Bacillus sp., which are present in spores but not in growing cells. The sspJ gene is transcribed in the forespore compartment by RNA polymerase with the forespore-specific sigmaG. Loss of SspJ causes a slight decrease in the rate of spore outgrowth in an otherwise wild-type background. 46
52409 312918 pfam09577 Spore_YpjB Sporulation protein YpjB (SpoYpjB). These proteins are found in the endospore-forming bacteria which include Bacillus species. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon. Sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect, but this gene is not, however, a part of the endospore formation minimal gene set. 223
52410 401495 pfam09578 Spore_YabQ Spore cortex protein YabQ (Spore_YabQ). This protein is predicted to span the membrane several times. It is only found in genomes of species that perform sporulation, such as Bacillus subtilis, Clostridium tetani, and other members of the Firmicutes (low-GC Gram-positive bacteria). Mutation of this sigmaE-dependent gene blocks development of the spore cortex. The length of the C-terminal region, which includes some hydrophobic regions, is variable. 75
52411 401496 pfam09579 Spore_YtfJ Sporulation protein YtfJ (Spore_YtfJ). Proteins in this family are encoded by bacterial genomes if, and only if, the species is capable of endospore formation. YtfJ was confirmed in spores of B. subtilis; it appears to be expressed in the forespore under control of SigF. 81
52412 401497 pfam09580 Spore_YhcN_YlaJ Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ). This entry contains YhcN and YlaJ, which are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic, 40-residue C-terminal domain. 157
52413 401498 pfam09581 Spore_III_AF Stage III sporulation protein AF (Spore_III_AF). This family represents the stage III sporulation protein AF (Spore_III_AF) of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of these proteins is poorly conserved. 184
52414 401499 pfam09582 AnfO_nitrog Iron only nitrogenase protein AnfO (AnfO_nitrog). Proteins in this entry include Anf1 from Rhodobacter capsulatus (Rhodopseudomonas capsulata) and AnfO from Azotobacter vinelandii. They are found exclusively in species which contain the iron-only nitrogenase, and are encoded immediately downstream of the structural genes for the nitrogenase enzyme in these species. 190
52415 401500 pfam09583 Phageshock_PspG Phage shock protein G (Phageshock_PspG). This protein was previously designated as YjbO in Escherichia coli. It is found only in genomes that have the phage shock operon (psp), but it is only rarely encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins and heat shock. 64
52416 401501 pfam09584 Phageshock_PspD Phage shock protein PspD (Phageshock_PspD). Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown. 61
52417 401502 pfam09585 Lin0512_fam Conserved hypothetical protein (Lin0512_fam). This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a well conserved motif GxGxDxHG near the N-terminus. 114
52418 401503 pfam09586 YfhO Bacterial membrane protein YfhO. This protein is a conserved membrane protein. The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system. Some members of this family have been annotated as glycosyl transferases of the PMT family. 839
52419 401504 pfam09587 PGA_cap Bacterial capsule synthesis protein PGA_cap. This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein. 247
52420 401505 pfam09588 YqaJ YqaJ-like viral recombinase domain. This protein family is found in many different bacterial species but is of viral origin. The protein forms an oligomer and functions as a processive alkaline exonuclease that digests linear double-stranded DNA in a Mg(2+)-dependent reaction. It has a preference for 5'-phosphorylated DNA ends. It thus forms part of the two-component SynExo viral recombinase functional unit. 142
52421 370573 pfam09589 HrpA_pilin HrpA pilus formation protein. HrpA is an essential component of the type III secretion system (TTSS) which pathogens use to inject virulence factors directly into their host cells, and to cause disease. The TTSS has an Hrp pilus appendage for channelling effector proteins through the plant cell wall and this pilus elongates by the addition of HrpA pilin subunits at the distal end. 96
52422 150302 pfam09590 Env-gp36 Lentivirus surface glycoprotein. This protein is found in feline immunodeficiency retrovirus. It represents the surface glycoprotein which is found in the polyprotein C-terminal to the Env protein. 591
52423 286649 pfam09591 DUF2463 Protein of unknown function (DUF2463). This protein is found in eukaryotic, parasitic microsporidia. Its function is unknown. 210
52424 401506 pfam09592 DUF2031 Protein of unknown function (DUF2031). This protein is expressed in Plasmodium; its function is unknown. It may be the product of gene family pyst-b. 227
52425 401507 pfam09593 Pathogen_betaC1 Beta-satellite pathogenicity beta C1 protein. Cotton leaf-curl disease - CLCuD - is of major economic importance in cotton-growing areas of the far-east. The infectious agent appears to be a single-stranded DNA molecule of approx 1350 nucleotides in length, which, when inoculated with the Begomovirus into cotton, induces symptoms typical of CLCuD. This molecule requires the Begomovirus for replication and encapsidation. DNA beta encodes a single protein, betaC1. The intracellular distribution of betaC1 is consistent with the hypothesis that it has a role in transporting the DNA A of Begomovirus from the nuclear site of replication to the plasmodesmatal exit sites of the infected cell. The DNA beta-encoded protein, betaC1, is the determinant of both pathogenicity and suppression of gene silencing. 117
52426 401508 pfam09594 GT87 Glycosyltransferase family 87. The enzymes in this family are glycosyltransferases. PimE is involved in phosphatidylinositol mannoside (PIM) synthesis, a major class of glycolipids in all mycobacteria. PimE is a polyprenol-phosphate-mannose-dependent mannosyltransferase that transfers the fifth mannose of PIM. The family also includes alpha(1-->3) arabinofuranosyltransferase, involved in the synthesis of mycobacterial arabinogalactan. 237
52427 312932 pfam09595 Metaviral_G Metaviral_G glycoprotein. This is a viral attachment glycoprotein from region G of metaviruses. It is high in serine and threonine suggesting it is highly glycosylated. 183
52428 401509 pfam09596 MamL-1 MamL-1 domain. The MamL-1 domain is a polypeptide of up to 70 residues, numbers 15-67 of which adopt an elongated kinked helix that wraps around ANK and CSL forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors. 58
52429 401510 pfam09597 IGR IGR protein motif. This domain is found in fungal proteins and contains a conserved IGR motif. Its function is unknown. 55
52430 401511 pfam09598 Stm1_N Stm1. This region is found at the N terminal of the Stm1 protein. Stm1 is a G4 quadruplex and purine motif triplex nucleic acid-binding protein. It has been implicated in many biological processes including apoptosis and telomere biosynthesis. Stm1 is known to interact with CDC13, and is known to associate with ribosomes and nuclear telomere cap complexes. 62
52431 286655 pfam09599 IpaC_SipC Salmonella-Shigella invasin protein C (IpaC_SipC). This entry represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri and SipC from Salmonella typhimurium. Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins. 334
52432 401512 pfam09600 Cyd_oper_YbgE Cyd operon protein YbgE (Cyd_oper_YbgE). This entry describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria. 76
52433 401513 pfam09601 DUF2459 Protein of unknown function (DUF2459). This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species. 171
52434 370578 pfam09602 PhaP_Bmeg Polyhydroxyalkanoic acid inclusion protein (PhaP_Bmeg). This entry describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage. 168
52435 401514 pfam09603 Fib_succ_major Fibrobacter succinogenes major domain (Fib_succ_major). This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron. 172
52436 401515 pfam09604 Potass_KdpF F subunit of K+-transporting ATPase (Potass_KdpF). This entry describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (IPR004623). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation. 24
52437 401516 pfam09605 Trep_Strep Hypothetical bacterial integral membrane protein (Trep_Strep). This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6. 185
52438 312941 pfam09606 Med15 ARC105 or Med15 subunit of Mediator complex non-fungal. The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development. 732
52439 401517 pfam09607 BrkDBD Brinker DNA-binding domain. This DNA-binding domain is the first approx. 100 residues of the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway. 58
52440 401518 pfam09608 Alph_Pro_TM Putative transmembrane protein (Alph_Pro_TM). This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome. 227
52441 312943 pfam09609 Cas_GSU0054 CRISPR-associated protein, GSU0054 family (Cas_GSU0054). This entry represents a rare CRISPR-associated protein. So far, members are found in Geobacter sulfurreducens and in two unpublished genomes: Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins. 552
52442 401519 pfam09610 Myco_arth_vir_N Mycoplasma virulence signal region (Myco_arth_vir_N). This entry represents the N-terminal region of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. It includes a probable signal sequence or signal anchor, which, in most instances, has four consecutive Lys residues before the hydrophobic stretch. 32
52443 401520 pfam09611 Cas_Csy1 CRISPR-associated protein (Cas_Csy1). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1. 376
52444 401521 pfam09612 HtrL_YibB Bacterial protein of unknown function (HtrL_YibB). The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein. 267
52445 312947 pfam09613 HrpB1_HrpK Bacterial type III secretion protein (HrpB1_HrpK). This family of proteins is encoded by genes found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia. 126
52446 401522 pfam09614 Cas_Csy2 CRISPR-associated protein (Cas_Csy2). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2. 295
52447 401523 pfam09615 Cas_Csy3 CRISPR-associated protein (Cas_Csy3). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3. 327
52448 401524 pfam09617 Cas_GSU0053 CRISPR-associated protein GSU0053 (Cas_GSU0053). This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria). 170
52449 401525 pfam09618 Cas_Csy4 CRISPR-associated protein (Cas_Csy4). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4. 181
52450 401526 pfam09619 YscW Type III secretion system lipoprotein chaperone (YscW). This entry is encoded within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC. YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC. 105
52451 401527 pfam09620 Cas_csx3 CRISPR-associated protein (Cas_csx3). This entry is encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3). 79
52452 286674 pfam09621 LcrR Type III secretion system regulator (LcrR). This family of proteins is encoded within type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR). 138
52453 401528 pfam09622 DUF2391 Putative integral membrane protein (DUF2391). This entry is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. Proteins containing this entry appear to span the membrane seven times. 269
52454 370585 pfam09623 Cas_NE0113 CRISPR-associated protein NE0113 (Cas_NE0113). Members of this minor CRISPR-associated (Cas) protein family are encoded in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum. 210
52455 401529 pfam09624 DUF2393 Protein of unknown function (DUF2393). The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus. 142
52456 312957 pfam09625 VP9 VP9 protein. VP9 is a protein containing a ferredoxin fold. Two dimers come together to form one asymmetric unit which possesses a DNA recognition fold and specific metal binding sites possibly for zinc. It is postulated that being a non-structural protein VP9 is involved in the transcriptional regulation of the White spot syndrome virus, WSSV, from which it comes. WSSV is the major viral pathogen in shrimp aquaculture. VP9 is found N-terminal to the pfam07056 domain. 73
52457 401530 pfam09626 DHC Dihaem cytochrome c. Dihaem cytochrome c (DHC) is a soluble c-type cytochrome that folds into two distinct domains, each binding a single haem group and connected by a small linker region. Despite little sequence similarity, the N-terminal domain (residues 12-75) is a class I type cytochrome c, that binds one of the haems, but the domain surrounding the other haem is structurally unique. DHC binds electrostatically to an oxygen-binding protein, sphaeroides haem protein (SHP), as a component of a conserved electron transfer pathway. DHC acts as the physiological electron donor for SHP during phototrophic growth. In certain species DHC is found upstream of pfam01292. 110
52458 286680 pfam09627 PrgU PrgU-like protein. This hypothetical protein of 125 residues is expressed in bacteria but is thought to be plasmid in origin. It forms a six beta-strand barrel with three accompanying alpha helices and is probably a homo-dimer in the cell. It may be involved in pheromone-inducible conjugation. 105
52459 401531 pfam09628 YvfG YvfG protein. YvfG is a hypothetical protein of 71 residues expressed in some bacteria. The monomer consists of two parallel alpha helices, and the protein crystallizes as a homo-dimer. 68
52460 401532 pfam09629 YorP YorP protein. YorP is a 71 residue protein found in bacteria. As it is also found in a bacteriophage it might be of viral origin. The structure is of an alpha helix between two of five beta strands. The function is unknown. 71
52461 401533 pfam09630 DUF2024 Domain of unknown function (DUF2024). This protein of 86 residues is expressed in bacteria. It consists of four alpha helices and two beta strands. Its function is unknown. One UniProt entry gives the gene name as Traf5. 81
52462 401534 pfam09631 Sen15 Sen15 protein. The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this hetero-tetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly. 101
52463 401535 pfam09632 Rac1 Rac1-binding domain. The Rac1-binding domain is the C-terminal portion of YpkA from Yersinia. It is an all-helical molecule consisting of two distinct subdomains connected by a linker. The N-terminal end, residues 434-615, consists of six helices organized into two three-helix bundles packed against each other. This region is involved with binding to GTPases. The C-terminal end, residues 705-732, is a novel and elongated fold consisting of four helices clustered into two pairs, and this fold carries the helix implicated in actin activation. Rac1-binding domain mimics host guanine nucleotide dissociation inhibitors (GDIs) of the Rho GTPases, thereby inhibiting nucleotide exchange in Rac1 and causing cytoskeletal disruption in the host. It is usually found downstream of pfam00069. 293
52464 401536 pfam09633 DUF2023 Protein of unknown function (DUF2023). This protein of approx. 120 residues consists of three beta strands and five alpha helices, thought to fold into a homo-dimer. It is expressed in bacteria. 99
52465 312962 pfam09634 DUF2025 Protein of unknown function (DUF2025). This protein is produced from gene PA1123 in Pseudomonas. It contains three alpha helices and six beta strands and is thought to be monomeric. It appears to be present in the biofilm layer and may be a lipoprotein. 105
52466 401537 pfam09635 MetRS-N MetRS-N binding domain. The MetRS-N domain binds an Arc1-P domain in a tetrameric complex resembling a classical GST homo-dimer. Domain-swapping between symmetrically related MetRS-N and Arc1p-N domains generates a 2:2 tetramer held together by van der Waals forces. This domain is necessary for formation of the aminoacyl-tRNA synthetase complex necessary for tRNA nuclear export and shuttling as part of the translational apparatus. The domain is associated with pfam09334. 121
52467 401538 pfam09636 XkdW XkdW protein. This protein of approx. 100 residues contains two alpha helices and two beta strands and is probably monomeric. It is expressed in bacteria but is probably viral in origin. Its function is unknown. 106
52468 401539 pfam09637 Med18 Med18 protein. Med18 is one subunit of Mediator, a head-module multiprotein complex, that stimulates basal RNA polymerase II (Pol II) transcription. Med18 consists of an eight-stranded beta-barrel with a central pore and three flanking helices. It complexes with Med8 and Med20 proteins by forming a heterodimer of two-fold symmetry with Med20 and binding the C-terminal alpha-helix region of Med8 across the top of its barrel. This complex creates a multipartite TBP-binding site that can be modulated by transcriptional activators. 229
52469 286688 pfam09638 Ph1570 Ph1570 protein. This is a hypothetical protein from Pyrococcus horikoshii of unknown function. It contains six alpha helices and eight beta strands and is thought to be monomeric. 156
52470 401540 pfam09639 YjcQ YjcQ protein. YjcQ is a protein of approx. 100 residues containing four alpha helices and three beta strands. It is expressed in bacteria and also in viruses. It appears to be under the regulation of SigD RNA polymerase which is responsible for the expression of many genes encoding cell-surface proteins related to flagellar assembly, motility, chemotaxis and autolysis in the late exponential growth phase. The exact function of YjcQ is unknown. However, it is thought to be a prophage head protein in viruses. 94
52471 401541 pfam09640 DUF2027 Domain of unknown function (DUF2027). This protein domain is of unknown function, though putatively involved in DNA mismatch repair. It is associated with pfam01713. 154
52472 378228 pfam09641 DUF2026 Protein of unknown function (DUF2026). This protein of approx. 100 residues is found in bacteria. It contains up to five alpha helices and up to seven beta strands and is probably monomeric. Its function is unknown. It is cited as a major prophage head protein, so might generally be of viral origin. 205
52473 401542 pfam09642 YonK YonK protein. YonK protein is expressed by the bacterial prophage SPbetaC. It is a 63 residue protein that associates into a homo-octamer in the form of a beta-stranded barrel with four outer helical features at points of the compass. Its function is unknown. 58
52474 401543 pfam09643 YopX YopX protein. YopX is a protein that is largely helical, with three identical chains probably complexing into a twelve-chain structure. 122
52475 401544 pfam09644 Mg296 Mg296 protein. This protein of 129 residues is expressed in bacteria. It consists of three identical chains of five alpha helices. Two copies of each chain associate into a complex of six units of possible biological significance but of unknown function. 126
52476 401545 pfam09645 F-112 F-112 protein. F-112 protein is of 70-110 residues and is found in viruses. Its winged-helix structure suggests a DNA-binding function. 71
52477 401546 pfam09646 Gp37 Gp37 protein. This protein of 154 residues consists of a unit of helices and beta sheets that crystallizes into a beautiful asymmetrical dodecameric barrel-structure, of two six-membered rings one on top of the other. It is expressed in bacteria but is of viral origin as it is found in phage BcepMu and is probably a pathogenesis factor. 134
52478 401547 pfam09648 YycI YycH protein. This domain is exclusively found in YycI proteins in the low GC content Gram positive species. These two domains share the same structural fold with domains two and three of YycH pfam07435. Both YycH and YycI are always found as a pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies. 229
52479 401548 pfam09649 CHZ Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones. 34
52480 401549 pfam09650 PHA_gran_rgn Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn). Proteins in this entry are encoded by genes involved in either polyhydroxyalkanoic acid (PHA) biosynthesis or utilisation, including proteins found at the surface of PHA granules. These proteins have so far been found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria. 79
52481 401550 pfam09651 Cas_APE2256 CRISPR-associated protein (Cas_APE2256). This entry represents a conserved region of about 150 amino acids found in at least five archaeal and three bacterial species. These species all contain CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the protein is encoded in the vicinity of a CRISPR/Cas locus. 136
52482 401551 pfam09652 Cas_VVA1548 Putative CRISPR-associated protein (Cas_VVA1548). This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas (CRISPR-associated) genes. 91
52483 401552 pfam09654 DUF2396 Protein of unknown function (DUF2396). These conserved hypothetical proteins have so far been found only in the Cyanobacteria. They are about 170 amino acids long and contain a CxxCx(14)CxxH motif near the N-terminus. 159
52484 401553 pfam09655 Nitr_red_assoc Conserved nitrate reductase-associated protein (Nitr_red_assoc). Proteins in this entry are found in the Cyanobacteria, and are mostly encoded near nitrate reductase and molybdopterin biosynthesis genes. Molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. These proteins are sometimes annotated as nitrate reductase-associated proteins, though their function is unknown. 144
52485 370603 pfam09656 PGPGW Putative transmembrane protein (PGPGW). Proteins in this entry are putative Actinobacterial proteins of about 150 amino acids in length, with three predicted transmembrane helices and an unusual motif with consensus sequence PGPGW. 53
52486 286705 pfam09657 Cas_Csx8 CRISPR-associated protein Csx8 (Cas_Csx8). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes proteins of unknown function which are encoded in the midst of a cas gene operon. 493
52487 370604 pfam09658 Cas_Csx9 CRISPR-associated protein (Cas_Csx9). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes archaeal proteins encoded in cas gene regions. 377
52488 401554 pfam09659 Cas_Csm6 CRISPR-associated protein (Cas_Csm6). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 370
52489 401555 pfam09660 DUF2397 Protein of unknown function (DUF2397). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). 483
52490 401556 pfam09661 DUF2398 Protein of unknown function (DUF2398). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). 360
52491 286710 pfam09662 Phenyl_P_gamma Phenylphosphate carboxylase gamma subunit (Phenyl_P_gamma). Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs. 83
52492 401557 pfam09663 Amido_AtzD_TrzD Amidohydrolase ring-opening protein (Amido_AtzD_TrzD). Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC:3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC:3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring opening of barbituric acid to ureidomalonic acid. 361
52493 401558 pfam09664 DUF2399 Protein of unknown function C-terminus (DUF2399). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). Just the C-terminal region is included here. 152
52494 401559 pfam09665 RE_Alw26IDE Type II restriction endonuclease (RE_Alw26IDE). Members of this entry are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of the three members are GGTCTC, CGTCTC and the shared subsequence GTCTC. 443
52495 401560 pfam09666 Sororin Sororin protein. Sororin is an essential, cell cycle-dependent mediator of sister chromatid cohesion. The protein is nuclear in interphase cells, dispersed from the chromatin in mitosis, and interacts with the cohesin complex. 155
52496 312980 pfam09667 DUF2028 Domain of unknown function (DUF2028). This region of similarity is found in the vertebrate homologs of the drosophila Bobby Sox. 198
52497 312981 pfam09668 Asp_protease Aspartyl protease. This family of eukaryotic aspartyl proteases has a fold similar to retroviral proteases, which implies they function proteolytically during regulated protein turnover. 124
52498 401561 pfam09669 Phage_pRha Phage regulatory protein Rha (Phage_pRha). Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that Rha is a phage regulatory protein. 91
52499 401562 pfam09670 Cas_Cas02710 CRISPR-associated protein (Cas_Cas02710). Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia). 377
52500 401563 pfam09671 Spore_GerQ Spore coat protein (Spore_GerQ). Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase. 76
52501 401564 pfam09673 TrbC_Ftype Type-F conjugative transfer system pilin assembly protein. This entry represents TrbC, a protein that is an essential component of the F-type conjugative pilus assembly system for the transfer of plasmid DNA. The N-terminal portion of these proteins is heterogeneous. 111
52502 401565 pfam09674 DUF2400 Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation. 228
52503 286721 pfam09675 Chlamy_scaf Chlamydia-phage Chp2 scaffold (Chlamy_scaf). Members of this entry are encoded by genes in chlamydia-phage such as Chp2. These viruses have around eight genes and obligately infect intracellular bacterial pathogens of the genus Chlamydia. This protein is annotated as VP3 or structural protein (as if a protein of mature viral particles), however, it is displaced from procapsids as DNA is packaged, and therefore is more correctly described as a scaffolding protein. 109
52504 401566 pfam09676 TraV Type IV conjugative transfer system lipoprotein (TraV). This entry includes TraV, which is a component of conjugative type IV secretion system. TraV is an outer membrane lipoprotein that is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half. 127
52505 401567 pfam09677 TrbI_Ftype Type-F conjugative transfer system protein (TrbI_Ftype). This entry represents TrbI, an essential component of the F-type conjugative transfer system for plasmid DNA transfer that has been shown to be localized to the periplasm. 106
52506 401568 pfam09678 Caa3_CtaG Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG). Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as found in Bacillus subtilis. 238
52507 370613 pfam09679 TraQ Type-F conjugative transfer system pilin chaperone (TraQ). This entry represents TraQ, a protein that makes a specific interaction with pilin (TraA) to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly. 85
52508 401569 pfam09680 Tiny_TM_bacill Protein of unknown function (Tiny_TM_bacill). This entry represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV. 24
52509 401570 pfam09681 Phage_rep_org_N N-terminal phage replisome organizer (Phage_rep_org_N). This entry represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region is N-terminal to the low-complexity region. 121
52510 401571 pfam09682 Phage_holin_6_1 Bacteriophage holin of superfamily 6 (Holin_LLH). Phage_holin_6_1 or Holin_LLH identifies a family of phage holins from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is large for holins (about 100-160 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. Holin LLH family is found in phage of Firmicutes and have an N-terminal transmembrane segment. 100
52511 401572 pfam09683 Lactococcin_972 Bacteriocin (Lactococcin_972). These sequences represent bacteriocins related to lactococcin. Members tend to be found in association with a seven transmembrane putative immunity protein. 63
52512 401573 pfam09684 Tail_P2_I Phage tail protein (Tail_P2_I). These sequences represent the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria. 132
52513 401574 pfam09685 DUF4870 Domain of unknown function (DUF4870). 107
52514 401575 pfam09686 Plasmid_RAQPRD Plasmid protein of unknown function (Plasmid_RAQPRD). This entry identifies a family of proteins, which are about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function of these proteins is unknown. 75
52515 401576 pfam09687 PRESAN Plasmodium RESA N-terminal. The short, four-helical domain first identified in the Plasmodium export proteins PHISTa and PHISTc has been extended to become this six-helical PRESAN domain identified in the P. falciparum-specific RESA-type (Ring-infected erythrocyte surface antigen) proteins in association with the DnaJ domain. Overall, at least 67 proteins have been detected in P. falciparum with complete copies of the PRESAN domain. No versions of this domain were detected in other apicomplexan genera, suggesting that the domain was 'invented' after the divergence of the lineage leading to the genus Plasmodium undergoing a dramatic proliferation only in P. falciparum. A secondary structure-prediction derived from the multiple alignment of the PRESAN family reveals that it is composed of an all-helical fold with six conserved helical segments. There is some evidence it might localize to membranes. 125
52516 401577 pfam09688 Wx5_PLAF3D7 Protein of unknown function (Wx5_PLAF3D7). This set of protein sequences represent a family of at least four proteins in Plasmodium falciparum (isolate 3D7). An interesting feature is five perfectly conserved Trp residues. 144
52517 401578 pfam09689 PY_rept_46 Plasmodium yoelii repeat (PY_rept_46). This repeat is found in the products of only 2 genes in Plasmodium yoelii, in each of these proteins it is repeated 9 times. It is found in no other organism. 51
52518 401579 pfam09690 PYST-C1 Plasmodium yoelii subtelomeric region (PYST-C1). This group of sequences are defined by the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes that contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (IPR006491). 55
52519 401580 pfam09691 T2SS_PulS_OutS Type II secretion system pilotin lipoprotein (PulS_OutS). This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae (P20440), the OutS protein of Erwinia chrysanthemi (Q01567) and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. In the pilotin from this four-helix protein from enterohemorrhagic Escherichia coli, the straight helix alpha2, the curved helix alpha3 and the bent helix alpha4 surround the central N-terminal helix alpha1. These helices create a prominent groove, mainly formed by side chains of helices 1,2 and 3 suggesting this groove is important as a binding site. 96
52520 401581 pfam09692 Arb1 Argonaute siRNA chaperone (ARC) complex subunit Arb1. Arb1 is required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation in fission yeast. 401
52521 401582 pfam09693 Phage_XkdX Phage uncharacterized protein (Phage_XkdX). This entry identifies a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes. 40
52522 286740 pfam09694 Gcw_chp Bacterial protein of unknown function (Gcw_chp). This entry represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis and Ralstonia solanacearum, usually as part of a paralogous family. The function is unknown. 228
52523 401583 pfam09695 YtfJ_HI0045 Bacterial protein of unknown function (YtfJ_HI0045). These are sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ. 158
52524 401584 pfam09696 Ctf8 Ctf8. Ctf8 (chromosome transmissions fidelity 8) is a component of the Ctf18 RFC-like complex which is a DNA clamp loader involved in sister chromatid cohesion. 127
52525 401585 pfam09697 Porph_ging Protein of unknown function (Porph_ging). This family of proteins of unknown function is found in Porphyromonas gingivalis (Bacteroides gingivalis). 81
52526 286744 pfam09698 GSu_C4xC__C2xCH Geobacter CxxxxCH...CXXCH motif (GSu_C4xC__C2xCH). This motif occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches the cytochrome c family haem-binding site signature, suggesting that the sequence may be involved in haem-binding. 36
52527 401586 pfam09699 Paired_CXXCH_1 Doubled CXXCH motif (Paired_CXXCH_1). This entry represents a domain of about 41 amino acids that contains, among other motifs, two copies of the motif CXXCH associated with haem binding. This domain is predicted to be a high molecular weight c-type cytochrome and is often found in multiple copies. Members are found mostly in species of Shewanella, Geobacter, and Vibrio. 41
52528 401587 pfam09700 Cas_Cmr3 CRISPR-associated protein (Cas_Cmr3). CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family, found in at least ten different archaeal and bacterial species, is represented by TM1793 from Thermotoga maritima. 372
52529 401588 pfam09701 Cas_Cmr5 CRISPR-associated protein (Cas_Cmr5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species. 120
52530 401589 pfam09702 Cas_Csa5 CRISPR-associated protein (Cas_Csa5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus. 104
52531 286749 pfam09703 Cas_Csa4 CRISPR-associated protein (Cas_Csa4). CRISPR loci appear to be mobile elements with a wide host range. This entry represents a protein that tends to be found near CRISPR repeats. The species range for this protein, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 355
52532 401590 pfam09704 Cas_Cas5d CRISPR-associated protein (Cas_Cas5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum. 215
52533 401591 pfam09706 Cas_CXXC_CXXC CRISPR-associated protein (Cas_CXXC_CXXC). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes a conserved region of about 65 amino acids from an otherwise highly divergent protein found in a minority of CRISPR-associated protein regions. This region features two motifs of CXXC. 69
52534 401592 pfam09707 Cas_Cas2CT1978 CRISPR-associated protein (Cas_Cas2CT1978). This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799. Cas proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. 86
52535 401593 pfam09709 Cas_Csd1 CRISPR-associated protein (Cas_Csd1). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins that tend to be found near CRISPR repeats. The species range, so far, is exclusively bacterial and mesophilic, although CRISPR loci are particularly common among the archaea and thermophilic bacteria. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. 561
52536 286754 pfam09710 Trep_dent_lipo Treponema clustered lipoprotein (Trep_dent_lipo). This entry represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown. 397
52537 401594 pfam09711 Cas_Csn2 CRISPR-associated protein (Cas_Csn2). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins found only in CRISPR-containing species, near other CRISPR-associated proteins (cas). The species range so far for these proteins is pathogenic bacteria only. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). 176
52538 370627 pfam09712 PHA_synth_III_E Poly(R)-hydroxyalkanoic acid synthase subunit (PHA_synth_III_E). This entry represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species. 314
52539 401595 pfam09713 A_thal_3526 Plant protein 1589 of unknown function (A_thal_3526). This plant-specific family of proteins is defined by an uncharacterized region 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana and Oryza sativa. The function of the proteins is unknown. 51
52540 401596 pfam09715 Plasmod_dom_1 Plasmodium protein of unknown function (Plasmod_dom_1). These sequences represent an uncharacterized family consisting of a small number of hypothetical proteins of the malaria parasite Plasmodium falciparum (isolate 3D7). 66
52541 401597 pfam09716 ETRAMP Malarial early transcribed membrane protein (ETRAMP). These sequences represent a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich, stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both Plasmodium falciparum and P. yoelii chromosomes. 81
52542 401598 pfam09717 CPW_WPC Plasmodium falciparum domain of unknown function (CPW_WPC). This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown. 59
52543 401599 pfam09718 Tape_meas_lam_C Lambda phage tail tape-measure protein (Tape_meas_lam_C). This represents a relatively well-conserved region near the C-terminus of the tape measure protein of a lambda and related phage. The protein, which controls phage tail length, is typically about 1000 residues in length. Both low-complexity sequence and insertion/deletion events appear common in this family. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behaviour is attributed to proteins from distantly related or unrelated families in other phage. 76
52544 401600 pfam09719 C_GCAxxG_C_C Putative redox-active protein (C_GCAxxG_C_C). This entry represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters. 115
52545 401601 pfam09720 Unstab_antitox Putative addiction module component. This entry defines several short bacterial proteins, typically about 75 amino acids long, which are always found as part of a pair (at least) of small genes. The other protein in the pair always belongs to a family of plasmid stabilisation proteins (IPR007712). It is likely that this protein and its partner comprise some form of addiction module - a pair of genes consisting of a stable toxin and an unstable antitoxin which mediate programmed cell death - although these gene-pairs are usually found on the bacterial main chromosome. 54
52546 401602 pfam09721 Exosortase_EpsH Transmembrane exosortase (Exosortase_EpsH). Members of this family are designated exosortase, analogous to sortase in cell wall sorting mediated by LPXTG domains in Gram-positive bacteria. The phylogenetic distribution of the proteins in this entry is nearly perfectly correlated with the distribution of the proteins having the PEP-CTERM anchor motif, IPR013424. Members of this entry are integral membrane proteins with eight predicted transmembrane helices in common. Some members of this family have long trailing sequences past the region described by this model. This model does not include the region of the first predicted transmembrane region. The best characterized member is EpsH of Methylobacillus sp. 12S, where it is part of a locus associated with biosynthesis of the exopolysaccharide methanol-an. 249
52547 401603 pfam09722 DUF2384 Protein of unknown function (DUF2384). Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. The function is unknown. 51
52548 401604 pfam09723 Zn-ribbon_8 Zinc ribbon domain. This entry represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence. 41
52549 401605 pfam09724 Dcc1 Sister chromatid cohesion protein Dcc1. Sister chromatid cohesion protein Dcc1 is a component of the Ctf18-RFC complex. This complex is required for the efficient establishment of chromosome cohesion during S-phase and for loading the replication clamp Pol30/PCNA, which functions in DNA replication and repair. Ctf18-RFC loads PCNA onto DNA in an ATP-dependent manner. It may also have PCNA-unloading activity. 318
52550 401606 pfam09725 Fra10Ac1 Folate-sensitive fragile site protein Fra10Ac1. This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved protein of unknown function that is nuclear and highly expressed in brain. 113
52551 401607 pfam09726 Macoilin Macoilin family. The Macoilin proteins have an N-terminal portion that is composed of 5 transmembrane helices, followed by a C-terminal coiled-coil region. Macoilin is a highly conserved protein present in eukaryotes. Macoilin appears to be found in the ER and be involved in the function of neurons. 664
52552 401608 pfam09727 CortBP2 Cortactin-binding protein-2. This entry is the first approximately 250 residues of cortactin-binding protein 2. In addition to being a positional candidate for autism this protein is expressed at highest levels in the brain in humans. The human protein has six associated ankyrin repeat domains pfam00023 towards the C-terminus which act as protein-protein interaction domains. 187
52553 401609 pfam09728 Taxilin Myosin-like coiled-coil protein. Taxilin contains an extraordinarily long coiled-coil domain in its C-terminal half and is ubiquitously expressed. It is a novel binding partner of several syntaxin family members and is possibly involved in Ca2+-dependent exocytosis in neuroendocrine cells. Gamma-taxilin, described as leucine zipper protein Factor Inhibiting ATF4-mediated Transcription (FIAT), localizes to the nucleus in osteoblasts and dimerizes with ATF4 to form inactive dimers, thus inhibiting ATF4-mediated transcription. 302
52554 401610 pfam09729 Gti1_Pac2 Gti1/Pac2 family. In S. pombe the gti1 protein promotes the onset of gluconate uptake upon glucose starvation. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade. 112
52555 370639 pfam09730 BicD Microtubule-associated protein Bicaudal-D. BicD proteins consist of three coiled-coil domains and are involved in dynein-mediated minus end-directed transport from the Golgi apparatus to the endoplasmic reticulum (ER). For full functioning they bind with GSK-3beta pfam05350 to maintain the anchoring of microtubules to the centromere. It appears that amino-acid residues 437-617 of BicD and the kinase activity of GSK-3 are necessary for the formation of a complex between BicD and GSK-3beta in intact cells. 720
52556 370640 pfam09731 Mitofilin Mitochondrial inner membrane protein. Mitofilin controls mitochondrial cristae morphology. Mitofilin is enriched in the narrow space between the inner boundary and the outer membranes, where it forms a homotypic interaction and assembles into a large multimeric protein complex. The first 78 amino acids contain a typical amino-terminal-cleavable mitochondrial presequence rich in positive-charged and hydroxylated residues and a membrane anchor domain. In addition, it has three centrally located coiled coil domains. 583
52557 401611 pfam09732 CactinC_cactus Cactus-binding C-terminus of cactin protein. CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream. 125
52558 401612 pfam09733 VEFS-Box VEFS-Box of polycomb protein. The VEFS-Box (VRN2-EMF2-FIS2-Su(z)12) box is the C-terminal region of these proteins, characterized by an acidic cluster and a tryptophan/methionine-rich sequence, the acidic-W/M domain. Some of these sequences are associated with a zinc-finger domain about 100 residues towards the N-terminus. This protein is one of the polycomb cluster of proteins which control HOX gene transcription as it functions in heterochromatin-mediated repression. 137
52559 401613 pfam09734 Tau95 RNA polymerase III transcription factor (TF)IIIC subunit. TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription. 150
52560 401614 pfam09735 Nckap1 Membrane-associated apoptosis protein. Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease. It is involved in the regulation of actin polymerization in the brain as part of a WAVE2 signalling complex. 1114
52561 401615 pfam09736 Bud13 Pre-mRNA-splicing factor of RES complex. This entry is characterized by proteins with alternating conserved and low-complexity regions. Bud13 together with Snu17p and a newly identified factor, Pml1p/Ylr016c, form a novel trimeric complex called the RES complex (pre-mRNA retention and splicing complex). Subunits of this complex are not essential for viability of yeasts but they are required for efficient splicing in vitro and in vivo. Furthermore, inactivation of this complex causes pre-mRNA leakage from the nucleus. Bud13 contains a unique, phylogenetically conserved C-terminal region of unknown function. 143
52562 401616 pfam09737 Det1 De-etiolated protein 1 Det1. This is the C-terminal conserved 400 residues of Det1 proteins of approximately 550 amino acids. Det1 (de-etiolated-1) is an essential negative regulator of plant light responses, and it is a component of the Arabidopsis CDD complex containing DDB1 and COP10 ubiquitin E2 variant. Mammalian Det1 forms stable DDD-E2 complexes, consisting of DDB1, DDA1 (DET1, DDB1 Associated 1), and a member of the UBE2E group of canonical ubiquitin conjugating enzymes and modulates Cul4A function. 409
52563 401617 pfam09738 LRRFIP LRRFIP family. LRRFIP1 is a transcriptional repressor which preferentially binds to the GC-rich consensus sequence (5'-AGCCCCCGGCG-3') and may regulate expression of TNF, EGFR and PDGFA. LRRFIP2 may function as activator of the canonical Wnt signalling pathway, in association with DVL3, upstream of CTNNB1/beta-catenin. 305
52564 401618 pfam09739 MCM_bind Mini-chromosome maintenance replisome factor. This entry is of proteins of approximately 600 residues in length containing alternating regions of conservation and low complexity. The Arabidopsis protein is a replisome factor found to bind with the mini-chromosome maintenance, MCM-binding, complex and is crucial for efficient DNA replication. The family now spans the full-length proteins. 571
52565 401619 pfam09740 DUF2043 Uncharacterized conserved protein (DUF2043). This is a 100 residue conserved region of a family of proteins found from fungi to humans. This region contains three conserved Cysteines and a motif of {CP}{y/l}{HG}. 103
52566 401620 pfam09741 DUF2045 Uncharacterized conserved protein (DUF2045). This entry is the conserved 250 residues of proteins of approximately 450 amino acids. It contains several highly conserved motifs including a CVxLxxxD motif. The function is unknown. 225
52567 401621 pfam09742 Dymeclin Dyggve-Melchior-Clausen syndrome protein. Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus. 640
52568 401622 pfam09743 E3_UFM1_ligase E3 UFM1-protein ligase 1. The ubiquitin fold modifier 1 (Ufm1) is the most recently discovered ubiquitin-like modifier whose conjugation (ufmylation) system is conserved in multicellular organisms. Ufm1 is known to covalently attach to cellular protein(s) via a specific E1-activating enzyme (Uba5), an E2-conjugating enzyme (Ufc1), and an E3-ligating enzyme. 272
52569 401623 pfam09744 Jnk-SapK_ap_N JNK_SAPK-associated protein-1. This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. 149
52570 401624 pfam09745 DUF2040 Coiled-coil domain-containing protein 55 (DUF2040). This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown. 121
52571 401625 pfam09746 Membralin tumor-associated protein. Membralin is evolutionarily highly conserved; though it seems to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologs have been identified in plants including grape, cotton and tomato. 377
52572 401626 pfam09747 DUF2052 Coiled-coil domain containing protein (DUF2052). This entry is of sequences of two conserved domains separated by a region of low complexity, spanning some 200 residues. The function is unknown. 195
52573 401627 pfam09748 Med10 Transcription factor subunit Med10 of Mediator complex. Med10 is one of the protein subunits of the Mediator complex, tethered to Rgr1 protein. The Mediator complex is required for the transcription of most RNA polymerase II (Pol II)-transcribed genes. Med10 specifically mediates basal-level HIS4 transcription via Gcn4, and, additionally, there is a putative requirement for Med10 in Bas2-mediated transcription. Med10 is part of the middle region of Mediator. 119
52574 401628 pfam09749 HVSL Uncharacterized conserved protein. This entry is of proteins of approximately 300 residues conserved from plants to humans. It contains two conserved motifs, HxSL and FHVSL. The function is unknown. 239
52575 401629 pfam09750 DRY_EERY Alternative splicing regulator. This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains pfam01805 and an arginine-serine-rich binding region towards the C-terminus. 129
52576 401630 pfam09751 Es2 Nuclear protein Es2. This entry is of a family of proteins of approximately 500 residues with alternating regions of low complexity and conservation where the domain similarities are strong. Apart from a predicted coiled-coil domain, no other known functional domains have been characterized. The protein appears to be expressed in the nucleus and particularly highly in the pons sub-region of the brain. The protein is clearly necessary for normal development of the nervous system. 419
52577 370661 pfam09752 DUF2048 Abhydrolase domain containing 18. The proteins in this family are conserved from plants to vertebrates. The function is unknown. 352
52578 401631 pfam09753 Use1 Membrane fusion protein Use1. This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport. 243
52579 401632 pfam09754 PAC2 PAC2 family. This PAC2 (Proteasome assembly chaperone) family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 247 and 307 amino acids in length. These proteins function as a chaperone for the 26S proteasome. The 26S proteasome mediates ubiquitin-dependent proteolysis in eukaryotic cells. A number of studies including very recent ones have revealed that assembly of its 20S catalytic core particle is an ordered process that involves several conserved proteasome assembly chaperones (PACs). Two heterodimeric chaperones, PAC1-PAC2 and PAC3-PAC4, promote the assembly of rings composed of seven alpha subunits. 167
52580 401633 pfam09755 DUF2046 Uncharacterized conserved protein H4 (DUF2046). This is the conserved N-terminal 350 residues of a family of proteins of unknown function possibly containing a coiled-coil domain. 304
52581 370664 pfam09756 DDRGK DDRGK domain. This is a family of proteins of approximately 300 residues, found in plants and vertebrates. They contain a highly conserved DDRGK motif. 188
52582 401634 pfam09757 Arb2 Arb2 domain. A second fission yeast Argonaute complex (Argonaute siRNA chaperone, ARC) that contains two previously uncharacterized proteins, Arb1 and Arb2, both of which are required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation. This family includes a region found in Arb2 and the Hda1 protein. 252
52583 401635 pfam09758 FPL Uncharacterized conserved protein. This entry represents an N-terminal region of approximately 150 residues of a family of proteins of unknown function. It contains a highly conserved FPL motif. 148
52584 401636 pfam09759 Atx10homo_assoc Spinocerebellar ataxia type 10 protein domain. This is the conserved C-terminal 100 residues of Ataxin-10. Ataxin-10 belongs to the family of armadillo repeat proteins and in solution it tends to form homotrimeric complexes, which associate via a tip-to-tip association in a horseshoe-shaped contact with the concave sides of the molecules facing each other. This domain may represent the homo-association site since that is located near the C-terminus of Ataxin-10. The protein does not contain a signal sequence for secretion or any subcellular compartment confirming its cytoplasmic localization, specifically to the olivocerebellar region. 99
52585 401637 pfam09762 KOG2701 Coiled-coil domain-containing protein (DUF2037). This entry represents the conserved N-terminal 200 residues of a family of proteins conserved from plants to vertebrates. In Drosophila it comes from the Fidipidine gene, and is of unknown function. 173
52586 401638 pfam09763 Sec3_C Exocyst complex component Sec3. This entry is the conserved middle and C-terminus of the Sec3 protein. Sec3 binds to the C-terminal cytoplasmic domain of GLYT1 (glycine transporter protein 1). Sec3 is the exocyst component that is closest to the plasma membrane docking site and it serves as a spatial landmark in the plasma membrane for incoming secretory vesicles. Sec3 is recruited to the sites of polarised membrane growth through its interaction with Rho1p, a small GTP-binding protein. 696
52587 401639 pfam09764 Nt_Gln_amidase N-terminal glutamine amidase. This protein is conserved from plants to humans. It represents a family of N terminal glutamine amidases. The enzyme removes the NH2 group from a Gln, at the N-terminal, rendering it a Glu. 172
52588 401640 pfam09765 WD-3 WD-repeat region. This entry is of a region of approximately 100 residues containing three WD repeats and six cysteine residues possibly as three cystine-bridges. These regions are contained within the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. The WD repeats are required for interaction with other subunits of the FA complex. 87
52589 401641 pfam09766 FimP Fms-interacting protein. This entry carries part of the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C-terminus contains a putative nuclear localization sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes. 339
52590 401642 pfam09767 DUF2053 Predicted membrane protein (DUF2053). This entry is of the conserved N-terminal 150 residues of proteins conserved from plants to humans. The function is unknown although some annotation suggests it to be a transmembrane protein. 157
52591 401643 pfam09768 Peptidase_M76 Peptidase M76 family. This is a family of metalloproteases. Proteins in this family are also annotated as Ku70-binding proteins. 168
52592 401644 pfam09769 ApoO Apolipoprotein O. Members of this family promote cholesterol efflux from macrophage cells. They are present in various lipoprotein complexes, including HDL, LDL and VLDL. The apoprotein is secreted by a microsomal triglyceride transfer protein (MTTP)-dependent mechanism, probably as a VLDL-associated protein that is subsequently transferred to HDL. 121
52593 401645 pfam09770 PAT1 Topoisomerase II-associated protein PAT1. Members of this family are necessary for accurate chromosome transmission during cell division. 846
52594 401646 pfam09771 Tmemb_18A Transmembrane protein 188. The function of this family of transmembrane proteins has not, as yet, been determined. 118
52595 370678 pfam09772 Tmem26 Transmembrane protein 26. The function of this family of transmembrane proteins has not, as yet, been determined. 288
52596 401647 pfam09773 Meckelin Meckelin (Transmembrane protein 67). Members of this family are thought to be related to the ciliary basal body. Defects result in Meckel syndrome type 3, an autosomal recessive disorder characterized by a combination of renal cysts and variably associated features including developmental anomalies of the central nervous system (typically encephalocele), hepatic ductal dysplasia and cysts, and polydactyly. Joubert syndrome type 6 is also a manifestation of certain mutations; it is an autosomal recessive congenital malformation of the cerebellar vermis and brainstem with abnormalities of axonal decussation (crossing in the brain) affecting the corticospinal tract and superior cerebellar peduncles. Individuals with Joubert syndrome have motor and behavioral abnormalities, including an inability to walk due to severe clumsiness and 'mirror' movements, and cognitive and behavioural disturbances. 816
52597 401648 pfam09774 Cid2 Caffeine-induced death protein 2. Members of this family of proteins mediate the disruption of the DNA replication checkpoint (S-M checkpoint) mechanism caused by caffeine. 156
52598 401649 pfam09775 Keratin_assoc Keratinocyte-associated protein 2. Members of this family comprise various keratinocyte-associated proteins. Their exact function has not, as yet, been determined. 129
52599 401650 pfam09776 Mitoc_L55 Mitochondrial ribosomal protein L55. Members of this family are involved in mitochondrial biogenesis and G2/M phase cell cycle progression. They form a component of the mitochondrial ribosome large subunit (39S) which comprises a 16S rRNA and about 50 distinct proteins. 115
52600 401651 pfam09777 OSTMP1 Osteopetrosis-associated transmembrane protein 1 precursor. Members of this family of proteins are required for osteoclast and melanocyte maturation and function. Mutations give rise to autosomal recessive osteopetrosis; also called autosomal recessive Albers-Schonberg disease. 229
52601 401652 pfam09778 Guanylate_cyc_2 Guanylylate cyclase. Members of this family of proteins catalyze the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate. 211
52602 401653 pfam09779 Ima1_N Ima1 N-terminal domain. This domain occurs at the N-terminus of the Schizosaccharomyces pombe inner nuclear membrane protein, Ima1. Ima1 interacts with other inner nuclear membrane proteins. 131
52603 401654 pfam09781 NDUF_B5 NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit. Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol. 186
52604 370687 pfam09782 NDUF_B6 NADH:ubiquinone oxidoreductase, NDUFB6/B17 subunit. Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol. 128
52605 401655 pfam09783 Vac_ImportDeg Vacuolar import and degradation protein. Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteasome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole. 162
52606 401656 pfam09784 L31 Mitochondrial ribosomal protein L31. This is a family of mitochondrial ribosomal proteins. L31 is essential for mitochondrial function in yeast. 102
52607 401657 pfam09785 Prp31_C Prp31 C terminal domain. This is the C terminal domain of the pre-mRNA processing factor Prp31. Prp31 is required for U4/U6.U5 tri-snRNP formation. In humans this protein has been linked to autosomal dominant retinitis pigmentosa. 121
52608 401658 pfam09786 CytochromB561_N Cytochrome B561, N terminal. Members of this family are found in the N terminal region of cytochrome B561, as well as in various other putative uncharacterized proteins. 581
52609 401659 pfam09787 Golgin_A5 Golgin subfamily A member 5. Members of this family of proteins are involved in maintaining Golgi structure. They stimulate the formation of Golgi stacks and ribbons, and are involved in intra-Golgi retrograde transport. Two main interactions have been characterized: one with RAB1A that has been activated by GTP-binding and another with isoform CASP of CUTL1. 306
52610 401660 pfam09788 Tmemb_55A Transmembrane protein 55A. Members of this family catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 4,5-bisphosphate, in the reaction: 1-phosphatidyl-myo-inositol 4,5-bisphosphate + H(2)O = 1-phosphatidyl-1D-myo-inositol 5-phosphate + phosphate. 245
52611 401661 pfam09789 DUF2353 Uncharacterized coiled-coil protein (DUF2353). Members of this family of uncharacterized proteins have no known function. 311
52612 401662 pfam09790 Hyccin Hyccin. Members of this family of proteins may have a role in the beta-catenin-Tcf/Lef signaling pathway, as well as in the process of myelination of the central and peripheral nervous system. Defects in Hyccin are the cause of hypomyelination with congenital cataracts. This disorder is characterized by congenital cataracts, progressive neurologic impairment, and diffuse myelin deficiency. Affected individuals experience progressive pyramidal and cerebellar dysfunction, muscle weakness and wasting prevailing in the lower limbs. 323
52613 401663 pfam09791 Oxidored-like Oxidoreductase-like protein, N-terminal. Members of this family are found in the N terminal region of various oxidoreductase like proteins. Their exact function is, as yet, unknown. 46
52614 401664 pfam09792 But2 Ubiquitin 3 binding protein But2 C-terminal domain. This family is of proteins conserved in yeasts. It binds to Uba3 and is involved in the NEDD8 signalling pathway. This family represents a presumed C-terminal domain. 141
52615 401665 pfam09793 AD Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. It is an anticodon-binding domain of a prolyl-tRNA synthetase, whose PDB structure is available under the identifier 1h4q. 90
52616 401666 pfam09794 Avl9 Transport protein Avl9. Avl9 is a protein involved in exocytic transport from the Golgi. It has been speculated that Avl9 could play a role in deforming membranes for vesicle fission and/or in recruiting cargo. 376
52617 401667 pfam09795 Atg31 Autophagy-related protein 31. Autophagy is an intracellular degradation system that responds to nutrient starvation. Cis1/Atg31 has been shown to be required for autophagosome formation in Saccharomyces cerevisiae. It interacts with Atg17. 154
52618 401668 pfam09796 QCR10 Ubiquinol-cytochrome-c reductase complex subunit (QCR10). The QCR10 family of proteins are a component of the ubiquinol-cytochrome c reductase complex (also known as complex III or cytochrome b-c1 complex). This complex is located on the inner mitochondrial membrane and it couples electron transfer from ubiquinol to cytochrome. This subunit (QCR10) is required for stable association of the iron-sulfur protein with the complex. 62
52619 401669 pfam09797 NatB_MDM20 N-acetyltransferase B complex (NatB) non catalytic subunit. This is the non-catalytic subunit of the N-terminal acetyltransferase B complex (NatB). The NatB complex catalyzes the acetylation of the amino-terminal methionine residue of all proteins beginning with Met-Asp or Met-Glu and of some proteins beginning with Met-Asn or Met-Met. In Saccharomyces cerevisiae this subunit is called MDM20 and in Schizosaccharomyces pombe it is called Arm1. NatB acetylates the Tpm1 protein and regulates tropomyosin-actin interactions. This subunit is required by the NatB complex for the N-terminal acetylation of Tpm1. 371
52620 401670 pfam09798 LCD1 DNA damage checkpoint protein. This is a family of proteins which regulate checkpoint kinases. In Schizosaccharomyces pombe this protein is called Rad26 and in Saccharomyces cerevisiae it is called LCD1. 616
52621 401671 pfam09799 Transmemb_17 Predicted membrane protein. This is a 100 amino acid region of a family of proteins conserved from nematodes to humans. It is predicted to be a transmembrane region but its function is not known. 104
52622 370705 pfam09801 SYS1 Integral membrane protein S linking to the trans Golgi network. Members of this family are integral membrane proteins involved in protein trafficking between the late Golgi and endosome. They may also serve as a receptor for ADP-ribosylation factor-related protein 1 (ARFRP1). Sys1p is a small integral membrane protein with four predicted transmembrane domains that localizes to the trans Golgi network (TGN) in yeast and human cells. 144
52623 401672 pfam09802 Sec66 Preprotein translocase subunit Sec66. Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures. 176
52624 401673 pfam09803 Pet100 Pet100. Pet100 is a chaperone required for the assembly of cytochrome c oxidase. The human Pet100 homolog (also known as C19orf79) has been shown to be located in the mitochondrial inner membrane and forms a ~300 kDa subcomplex with mitochondrial complex IV subunits. 63
52625 401674 pfam09804 DUF2347 Uncharacterized conserved protein (DUF2347). Members of this family of hypothetical proteins have no known function. 283
52626 401675 pfam09805 Nop25 Nucleolar protein 12 (25kDa). Members of this family of proteins are part of the yeast nuclear pore complex-associated pre-60S ribosomal subunit. The family functions as a highly conserved exonuclease that is required for the 5'-end maturation of 5.8S and 25S rRNAs, demonstrating that 5'-end processing also has a redundant pathway. Nop25 binds late pre-60S ribosomes, accompanying them from the nucleolus to the nuclear periphery; and there is evidence for both physical and functional links between late 60S subunit processing and export. 135
52627 370710 pfam09806 CDK2AP Cyclin-dependent kinase 2-associated protein. Members of this family of proteins are cell-growth suppressors, associating with and influencing the biological activities of important cell cycle regulators in the S phase including monomeric non-phosphorylated cyclin-dependent kinase 2 (CDK2) and DNA polymerase alpha/primase. An association between mutations in the gene coding for this protein and oral cancer has been described. 205
52628 401676 pfam09807 ELP6 Elongation complex protein 6. ELP6 is a subunit of the RNA polymerase II elongator complex. The Elongator complex promotes RNA-polymerase II transcript elongation through histone acetylation in the nucleus and tRNA modification in the cytoplasm. ELP5 and ELP6 play a major role in the migration, invasion and tumorigenicity of melanoma cells, as integral subunits of Elongator. 250
52629 401677 pfam09808 SNAPc_SNAP43 Small nuclear RNA activating complex (SNAPc), subunit SNAP43. Members of this family are part of the SNAPc complex required for the transcription of both RNA polymerase II and III small-nuclear RNA genes. They bind to the proximal sequence element (PSE), a non-TATA-box basal promoter element common to these 2 types of genes. Furthermore, they also recruit TBP and BRF2 to the U6 snRNA TATA box. 191
52630 286847 pfam09809 MRP-L27 Mitochondrial ribosomal protein L27. Members of this family of proteins are components of the mitochondrial ribosome large subunit. They are also involved in apoptosis and cell cycle regulation. 97
52631 401678 pfam09810 Exo5 Exonuclease V - a 5' deoxyribonuclease. Exonuclease V is a monomeric 5' deoxyribonuclease that is localized in the nucleus. It degrades single-stranded, but not double-stranded, DNA from the 5'-end, and the products are dinucleotides, except the 3'-terminal tri- and tetranucleotides, which are not degraded. The initial hydrolytic cut of exonuclease V on the dephosphorylated substrate produces a mixture of dinucleoside monophosphates and trinucleoside diphosphates. The enzyme is processive in action. Exo5 is specific for single-stranded DNA and does not hydrolyze RNA. However, Exo5 has the capacity to slide across 5' double-stranded DNA or 5' RNA sequences and resume cutting two nucleotides downstream of the double-stranded-to-single-stranded junction or RNA-to-DNA junction, respectively. 363
52632 401679 pfam09811 Yae1_N Essential protein Yae1, N terminal. Members of this family are found in the N terminal region of the essential protein Yae1. Their exact function has not, as yet, been determined. The family DUF1715, pfam08215 has now been merged into this family. 39
52633 401680 pfam09812 MRP-L28 Mitochondrial ribosomal protein L28. Members of this family are components of the mitochondrial large ribosomal subunit. Mature mitochondrial ribosomes consist of a small (37S) and a large (54S) subunit. The 37S subunit contains at least 33 different proteins and 1 molecule of RNA (15S). The 54S subunit contains at least 45 different proteins and 1 molecule of RNA (21S). 157
52634 286851 pfam09813 Coiled-coil_56 Coiled-coil domain-containing protein 56. Members of this family of proteins have no known function. 101
52635 401681 pfam09814 HECT_2 HECT-like Ubiquitin-conjugating enzyme (E2)-binding. HECT_2 is a family of UbcH10-binding proteins. 305
52636 401682 pfam09815 XK-related XK-related protein. Members of this family comprise various XK-related proteins that are involved in sodium-dependent transport of neutral amino acids or oligopeptides. These proteins are responsible for the Kx blood group system - defects result in McLeod syndrome, an X-linked multi-system disorder characterized by late onset abnormalities in the neuromuscular and hematopoietic systems. 337
52637 401683 pfam09816 EAF RNA polymerase II transcription elongation factor. Members of this family act as transcriptional transactivators of ELL and ELL2 elongation activities. Eaf proteins form a stable heterodimer complex with ELL proteins to facilitate the binding of RNA polymerase II to activate transcription elongation. The N-terminus of approx 120 residues is globular and highly conserved. 101
52638 401684 pfam09817 Zwilch RZZ complex, subunit zwilch. The protein Zwilch is an essential component of the mitotic checkpoint, which prevents cells from prematurely exiting mitosis. It is required for the assembly of the dynein-dynactin, Mad2 complexes and spindly/CG15415 onto kinetochores. 573
52639 401685 pfam09818 ABC_ATPase Predicted ATPase of the ABC class. Members of this family include various bacterial predicted ABC class ATPases. 445
52640 401686 pfam09819 ABC_cobalt ABC-type cobalt transport system, permease component. Members of this family of prokaryotic proteins include various hypothetical proteins as well as ABC-type cobalt transport systems. 121
52641 401687 pfam09820 AAA-ATPase_like Predicted AAA-ATPase. This family contains many hypothetical bacterial proteins. This family was previously the N-terminal part of the Pfam DUF1703 (pfam08011) family before it was split into two. This region is predicted to be an AAA-ATPase domain. 277
52642 401688 pfam09821 AAA_assoc_C C-terminal AAA-associated domain. This had been thought to be an ATPase domain of ABC-transporter proteins. However, only one member has any trans-membrane regions. It is associated with an upstream ATP-binding cassette family, pfam00005. 117
52643 401689 pfam09822 ABC_transp_aux ABC-type uncharacterized transport system. This domain is found in various eukaryotic and prokaryotic intra-flagellar transport proteins involved in gliding motility, as well as in several hypothetical proteins. 264
52644 401690 pfam09823 DUF2357 Domain of unknown function (DUF2357). This entry was previously the N terminal portion of DUF524 (pfam04411) before it was split into two. This domain has no known function. It is predicted to adopt an all beta secondary structure pattern followed by mainly alpha-helical structures. 248
52645 401691 pfam09824 ArsR ArsR transcriptional regulator. Members of this family of archaeal proteins are conserved transcriptional regulators belonging to the ArsR family. 159
52646 401692 pfam09825 BPL_N Biotin-protein ligase, N terminal. The function of this structural domain is unknown. It is found to the N-terminus of the biotin protein ligase catalytic domain. 373
52647 401693 pfam09826 Beta_propel Beta propeller domain. Members of this family comprise secreted bacterial proteins containing a C-terminal beta-propeller domain distantly related to WD-40 repeats. Jpred secondary-structure prediction shows the family to be a series of 4 short beta-strands, characteristic of beta-propeller families. 505
52648 401694 pfam09827 CRISPR_Cas2 CRISPR associated protein Cas2. Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages. 72
52649 401695 pfam09828 Chrome_Resist Chromate resistance exported protein. Members of this family of bacterial proteins are involved in the reduction of chromate accumulation and are essential for chromate resistance. 131
52650 401696 pfam09829 DUF2057 Uncharacterized protein conserved in bacteria (DUF2057). This domain, found in various prokaryotic proteins, has no known function. 191
52651 401697 pfam09830 ATP_transf ATP adenylyltransferase. Members of this family of proteins catabolize Ap4N nucleotides (where N is A,C,G or U). Additionally they catalyze the conversion of adenosine-5-phosphosulfate (AMPs) plus Pi to ADP plus sulphate, the exchange of NDP and phosphate and the synthesis of Ap4A from AMPs plus ATP. 65
52652 401698 pfam09831 DUF2058 Uncharacterized protein conserved in bacteria (DUF2058). This domain, found in various prokaryotic proteins, has no known function. 173
52653 401699 pfam09832 DUF2059 Uncharacterized protein conserved in bacteria (DUF2059). This domain, found in various prokaryotic proteins, has no known function. 59
52654 401700 pfam09834 DUF2061 Predicted membrane protein (DUF2061). This domain, found in various prokaryotic proteins, has no known function. 51
52655 401701 pfam09835 DUF2062 Uncharacterized protein conserved in bacteria (DUF2062). This domain, found in various prokaryotic proteins, has no known function. 139
52656 401702 pfam09836 DUF2063 Putative DNA-binding domain. This family represents the N-terminal part of a Neisseria protein, UniProtKB:Q5F5I0, Structure 3dee. It runs from residues 31-117 as a helical bundle with 4 main helices. From genomic context and the fold of the C-terminal part, it is suggested that this protein is involved in transcriptional regulation. 87
52657 401703 pfam09837 DUF2064 Uncharacterized protein conserved in bacteria (DUF2064). This family has structural similarity to proteins in the nucleotide-diphospho-sugar transferases superfamily. The similarity suggests that it is an enzyme with a sugar substrate. 120
52658 401704 pfam09838 DUF2065 Uncharacterized protein conserved in bacteria (DUF2065). This domain, found in various prokaryotic proteins, has no known function. 56
52659 401705 pfam09839 DUF2066 Uncharacterized protein conserved in bacteria (DUF2066). This domain, found in various prokaryotic proteins, has no known function. 230
52660 378268 pfam09840 DUF2067 Uncharacterized protein conserved in archaea (DUF2067). This domain, found in various archaeal proteins, has no known function. 186
52661 401706 pfam09842 DUF2069 Predicted membrane protein (DUF2069). This domain, found in various prokaryotes, has no known function. 104
52662 401707 pfam09843 DUF2070 Predicted membrane protein (DUF2070). This is a family of Archaeal 7-TM proteins. There are 6 closely assembled TM-regions at the N-terminus followed by a long intracellular, from residues 220-590, highly conserved region, of unknown function, terminating with one more TM-region. The short 25 residue section between TMs 5 and 6 might lie on the outer surface of the membrane and be acting as a receptor (from TMHMM). 561
52663 401708 pfam09844 DUF2071 Uncharacterized conserved protein (COG2071). This conserved protein (similar to YgjF), found in various prokaryotes, has no known function. 211
52664 401709 pfam09845 DUF2072 Zn-ribbon containing protein. This archaeal protein has no known function. 134
52665 401710 pfam09846 DUF2073 Uncharacterized protein conserved in archaea (DUF2073). This archaeal protein has no known function. 105
52666 286883 pfam09847 12TM_1 Membrane protein of 12 TMs. This family carries twelve transmembrane regions. It does not have any characteristic nucleotide-binding-domains of the GxSGSGKST type, so it may not be an ATP-binding cassette transporter. However, it may well be a transporter of some description. ABC transporters always have two nucleotide binding domains; this has two unusual conserved sequence-motifs: 'KDhKxhhR' and 'LxxLP'. 448
52667 401711 pfam09848 DUF2075 Uncharacterized conserved protein (DUF2075). This domain, found in various prokaryotic proteins (including putative ATP/GTP binding proteins), has no known function. 355
52668 401712 pfam09849 DUF2076 Uncharacterized protein conserved in bacteria (DUF2076). This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins. 257
52669 401713 pfam09850 DotU Type VI secretion system protein DotU. DotU is a family of Gram-negative bacterial proteins that form part of the membrane-joining complex of the type VI secretion system. DotU binds tightly to IcmF and together they are tethered to the inner membrane at one end and the peptidoglycan layer at the other; they interact with Lip1 which then tethers the peptidoglycan layer to the outer membrane. 204
52670 401714 pfam09851 SHOCT Short C-terminal domain. 28
52671 401715 pfam09852 DUF2079 Predicted membrane protein (DUF2079). This domain, found in various hypothetical prokaryotic proteins, has no known function. 455
52672 313137 pfam09853 DUF2080 Putative transposon-encoded protein (DUF2080). This domain, found in various hypothetical archaeal proteins, has no known function. 50
52673 401716 pfam09855 zinc_ribbon_13 Nucleic-acid-binding protein containing Zn-ribbon domain (DUF2082). This domain, found in various hypothetical prokaryotic proteins, as well as some Zn-ribbon nucleic-acid-binding proteins, has no known function. 63
52674 401717 pfam09856 DUF2083 Predicted transcriptional regulator (DUF2083). This domain is found in various prokaryotic transcriptional regulatory proteins belonging to the XRE family. Its exact function is, as yet, unknown. 157
52675 401718 pfam09857 YjhX_toxin Putative toxin of bacterial toxin-antitoxin pair. YjhX_toxin is a putative toxin of a bacterial toxin-antitoxin pair, which is neutralized by the proteins YjhQ in family pfam00583. 85
52676 401719 pfam09858 DUF2085 Predicted membrane protein (DUF2085). This domain, found in various hypothetical prokaryotic proteins, has no known function. 90
52677 401720 pfam09859 Oxygenase-NA Oxygenase, catalyzing oxidative de-methylation of damaged DNA. This family of bacterial sequences is predicted to catalyze oxidative de-methylation of damaged bases in DNA. 172
52678 401721 pfam09860 DUF2087 Uncharacterized protein conserved in bacteria (DUF2087). This domain, found in various hypothetical prokaryotic proteins and transcriptional activators, has no known function. Structural modelling suggests this domain may bind nucleic acids. 67
52679 401722 pfam09861 DUF2088 Domain of unknown function (DUF2088). This domain, found in various hypothetical prokaryotic proteins, has no known function. 204
52680 401723 pfam09862 DUF2089 Protein of unknown function (DUF2089). This domain, found in various hypothetical prokaryotic proteins, has no known function. This domain is a zinc-ribbon. 115
52681 401724 pfam09863 DUF2090 Uncharacterized protein conserved in bacteria (DUF2090). This domain, found in various prokaryotic carbohydrate kinases, has no known function. 310
52682 401725 pfam09864 MliC Membrane-bound lysozyme-inhibitor of c-type lysozyme. Lysozymes are ancient and important components of the innate immune system of animals that hydrolyze peptidoglycan, the major bacterial cell wall polymer. Various mechanisms have evolved by which bacteria can evade this bactericidal enzyme, one being the production of lysozyme inhibitors. MliC (membrane bound lysozyme inhibitor of c-type lysozyme) of E. coli and Pseudomonas aeruginosa, possess lysozyme inhibitory activity and confer increased lysozyme tolerance upon expression in E. coli. Structural analyses show that the invariant loop of MliC plays a crucial role in the inhibition of the lysozyme by its insertion into the active site cleft of the lysozyme, where the loop forms hydrogen and ionic bonds with the catalytic residues. 68
52683 401726 pfam09865 DUF2092 Predicted periplasmic protein (DUF2092). This domain, found in various hypothetical prokaryotic proteins, has no known function. 209
52684 401727 pfam09866 DUF2093 Uncharacterized protein conserved in bacteria (DUF2093). This domain, found in various hypothetical prokaryotic proteins, has no known function. 41
52685 401728 pfam09867 DUF2094 Uncharacterized protein conserved in bacteria (DUF2094). This domain, found in various hypothetical prokaryotic proteins, has no known function. 135
52686 370731 pfam09868 DUF2095 Uncharacterized protein conserved in archaea (DUF2095). This domain, found in various hypothetical prokaryotic proteins, has no known function. 129
52687 401729 pfam09869 DUF2096 Uncharacterized protein conserved in archaea (DUF2096). This domain, found in various hypothetical prokaryotic proteins, has no known function. 168
52688 401730 pfam09870 DUF2097 Uncharacterized protein conserved in archaea (DUF2097). This domain, found in various hypothetical prokaryotic proteins, has no known function. 83
52689 401731 pfam09871 DUF2098 Uncharacterized protein conserved in archaea (DUF2098). This domain, found in various hypothetical prokaryotic proteins, has no known function. 94
52690 401732 pfam09872 DUF2099 Uncharacterized protein conserved in archaea (DUF2099). This domain, found in various hypothetical prokaryotic proteins, has no known function. 257
52691 401733 pfam09873 DUF2100 Uncharacterized protein conserved in archaea (DUF2100). This domain, found in various hypothetical archaeal proteins, has no known function. 210
52692 255617 pfam09874 DUF2101 Predicted membrane protein (DUF2101). This domain, found in various archaeal and bacterial proteins, has no known function. 206
52693 401734 pfam09875 DUF2102 Uncharacterized protein conserved in archaea (DUF2102). This domain, found in various hypothetical archaeal proteins, has no known function. 102
52694 401735 pfam09876 DUF2103 Predicted metal-binding protein (DUF2103). This domain, found in various putative metal binding prokaryotic proteins, has no known function. 98
52695 401736 pfam09877 DUF2104 Predicted membrane protein (DUF2104). This domain, found in various hypothetical archaeal proteins, has no known function. 99
52696 401737 pfam09878 DUF2105 Predicted membrane protein (DUF2105). This domain, found in various hypothetical archaeal proteins, has no known function. 200
52697 401738 pfam09879 DUF2106 Predicted membrane protein (DUF2106). This domain, found in various hypothetical archaeal proteins, has no known function. 151
52698 401739 pfam09880 DUF2107 Predicted membrane protein (DUF2107). This domain, found in various hypothetical archaeal proteins, has no known function. 73
52699 401740 pfam09881 DUF2108 Predicted membrane protein (DUF2108). This domain, found in various hypothetical archaeal proteins, has no known function. 70
52700 401741 pfam09882 DUF2109 Predicted membrane protein (DUF2109). This domain, found in various hypothetical archaeal proteins, has no known function. 76
52701 401742 pfam09883 DUF2110 Uncharacterized protein conserved in archaea (DUF2110). This domain, found in various hypothetical archaeal proteins, has no known function. 223
52702 401743 pfam09884 DUF2111 Uncharacterized protein conserved in archaea (DUF2111). This domain, found in various hypothetical archaeal proteins, has no known function. 83
52703 401744 pfam09885 DUF2112 Uncharacterized protein conserved in archaea (DUF2112). This domain, found in various hypothetical archaeal proteins, has no known function. 143
52704 401745 pfam09886 DUF2113 Uncharacterized protein conserved in archaea (DUF2113). This domain, found in various hypothetical archaeal proteins, has no known function. 185
52705 401746 pfam09887 DUF2114 Uncharacterized protein conserved in archaea (DUF2114). This domain, found in various hypothetical archaeal proteins, has no known function. 449
52706 401747 pfam09888 DUF2115 Uncharacterized protein conserved in archaea (DUF2115). This domain, found in various hypothetical archaeal proteins, has no known function. 163
52707 401748 pfam09889 DUF2116 Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids. 59
52708 401749 pfam09890 DUF2117 Uncharacterized protein conserved in archaea (DUF2117). This domain, found in various hypothetical archaeal proteins, has no known function. 213
52709 378294 pfam09891 DUF2118 Uncharacterized protein conserved in archaea (DUF2118). This domain, found in various hypothetical archaeal proteins, has no known function. 149
52710 401750 pfam09892 DUF2119 Uncharacterized protein conserved in archaea (DUF2119). This domain, found in various hypothetical archaeal proteins, has no known function. 186
52711 401751 pfam09893 DUF2120 Uncharacterized protein conserved in archaea (DUF2120). This domain, found in various hypothetical archaeal proteins, has no known function. 136
52712 401752 pfam09894 DUF2121 Uncharacterized protein conserved in archaea (DUF2121). This domain, found in various hypothetical archaeal proteins, has no known function. 196
52713 378295 pfam09895 DUF2122 RecB-family nuclease (DUF2122). This domain, found in various hypothetical archaeal proteins, has no known function. 106
52714 401753 pfam09897 DUF2124 Uncharacterized protein conserved in archaea (DUF2124). This domain, found in various hypothetical archaeal proteins, has no known function. 141
52715 401754 pfam09898 DUF2125 Uncharacterized protein conserved in bacteria (DUF2125). This domain, found in various hypothetical prokaryotic proteins, has no known function. 306
52716 401755 pfam09899 DUF2126 Putative amidoligase enzyme (DUF2126). Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown, but they are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes). 822
52717 401756 pfam09900 DUF2127 Predicted membrane protein (DUF2127). This domain, found in various hypothetical prokaryotic and archaeal proteins, has no known function. 140
52718 401757 pfam09902 DUF2129 Uncharacterized protein conserved in bacteria (DUF2129). This domain, found in various hypothetical prokaryotic proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids. 70
52719 401758 pfam09903 DUF2130 Uncharacterized protein conserved in bacteria (DUF2130). This domain, found in various hypothetical prokaryotic proteins, has no known function. 248
52720 370745 pfam09904 HTH_43 Winged helix-turn helix. This family, found in various hypothetical prokaryotic proteins, is a probable winged helix DNA-binding domain. 89
52721 401759 pfam09905 VF530 DNA-binding protein VF530. VF530 contains a unique four-helix motif that shows some similarity to the C-terminal double-stranded DNA (dsDNA) binding domain of RecA, as well as other nucleic acid binding domains. 63
52722 401760 pfam09906 DUF2135 Uncharacterized protein conserved in bacteria (DUF2135). This domain, found in various hypothetical prokaryotic proteins, has no known function. 52
52723 401761 pfam09907 HigB_toxin HigB_toxin, RelE-like toxic component of a toxin-antitoxin system. HigB_toxin is a family of RelE-like prokaryotic proteins that function as mRNA interferases. HigB cleaves translated mRNA only, and cleavage depends on translation of the target RNAs. HigB belongs to the RelE super-family of RNases. The toxin-antitoxin gene-pair is induced by environmental stress factors. 73
52724 370747 pfam09909 DUF2138 Uncharacterized protein conserved in bacteria (DUF2138). This domain, found in various hypothetical prokaryotic proteins, has no known function. 546
52725 401762 pfam09910 DUF2139 Uncharacterized protein conserved in archaea (DUF2139). This domain, found in various hypothetical archaeal proteins, has no known function. 340
52726 401763 pfam09911 DUF2140 Uncharacterized protein conserved in bacteria (DUF2140). This domain, found in various hypothetical prokaryotic proteins, has no known function. 186
52727 401764 pfam09912 DUF2141 Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function. 112
52728 401765 pfam09913 DUF2142 Predicted membrane protein (DUF2142). This domain, found in various hypothetical prokaryotic proteins, has no known function. 393
52729 401766 pfam09916 DUF2145 Uncharacterized protein conserved in bacteria (DUF2145). This domain, found in various hypothetical prokaryotic proteins, has no known function. 199
52730 401767 pfam09917 DUF2147 Uncharacterized protein conserved in bacteria (DUF2147). This domain, found in various hypothetical prokaryotic proteins, has no known function. 112
52731 401768 pfam09918 DUF2148 Uncharacterized protein containing a ferredoxin domain (DUF2148). This domain, found in various hypothetical bacterial proteins containing a ferredoxin domain, has no known function. 65
52732 401769 pfam09919 DUF2149 Uncharacterized conserved protein (DUF2149). This domain, found in various hypothetical prokaryotic proteins, has no known function. 92
52733 401770 pfam09920 DUF2150 Uncharacterized protein conserved in archaea (DUF2150). This domain, found in various hypothetical archaeal proteins, has no known function. 189
52734 378308 pfam09921 DUF2153 Uncharacterized protein conserved in archaea (DUF2153). This domain, found in various hypothetical archaeal proteins, has no known function. 123
52735 401771 pfam09922 DUF2154 Cell wall-active antibiotics response 4TMS YvqF. 114
52736 401772 pfam09923 DUF2155 Uncharacterized protein conserved in bacteria (DUF2155). This domain, found in various hypothetical prokaryotic proteins, has no known function. 89
52737 401773 pfam09924 DUF2156 Uncharacterized conserved protein (DUF2156). This domain, found in various hypothetical prokaryotic proteins, has no known function. 297
52738 401774 pfam09925 DUF2157 Predicted membrane protein (DUF2157). This domain, found in various hypothetical prokaryotic proteins, has no known function. 141
52739 401775 pfam09926 DUF2158 Uncharacterized small protein (DUF2158). Members of this family of prokaryotic proteins have no known function. 52
52740 401776 pfam09928 DUF2160 Predicted small integral membrane protein (DUF2160). The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet. 88
52741 401777 pfam09929 DUF2161 Putative PD-(D/E)XK phosphodiesterase (DUF2161). This family of proteins is functionally uncharacterized. This family of proteins is found in prokaryotes. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily. 111
52742 401778 pfam09930 DUF2162 Predicted transporter (DUF2162). Members of this family of bacterial proteins are thought to be membrane transporters, but their exact function has not, as yet, been elucidated. 223
52743 401779 pfam09931 DUF2163 Uncharacterized conserved protein (DUF2163). This domain, found in various hypothetical prokaryotic proteins, has no known function. 163
52744 401780 pfam09932 DUF2164 Uncharacterized conserved protein (DUF2164). This domain, found in various hypothetical prokaryotic proteins, has no known function. 73
52745 401781 pfam09933 DUF2165 Predicted small integral membrane protein (DUF2165). This domain, found in various hypothetical prokaryotic proteins, has no known function. 157
52746 401782 pfam09935 DUF2167 Protein of unknown function (DUF2167). This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function. 238
52747 401783 pfam09936 Methyltrn_RNA_4 SAM-dependent RNA methyltransferase. This family has a Rossmannoid fold, with a deep trefoil knot in its C-terminal region. It has structural similarity to RNA methyltransferases, and is likely to function as an S-adenosyl-L-methionine (SAM)-dependent RNA 2'-O methyltransferase. 181
52748 401784 pfam09937 DUF2169 Uncharacterized protein conserved in bacteria (DUF2169). This domain, found in various hypothetical prokaryotic proteins, has no known function. 294
52749 401785 pfam09938 DUF2170 Uncharacterized protein conserved in bacteria (DUF2170). This domain, found in various hypothetical prokaryotic proteins, has no known function. 137
52750 401786 pfam09939 DUF2171 Uncharacterized protein conserved in bacteria (DUF2171). This domain, found in various hypothetical prokaryotic proteins, has no known function. 63
52751 401787 pfam09940 DUF2172 Domain of unknown function (DUF2172). This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet. Rebuilding from Structure 3kt9 shows this is an inserted (nested) domain within the aminopeptidase. The function of this small domain is not known. 91
52752 370757 pfam09941 DUF2173 Uncharacterized conserved protein (DUF2173). This domain, found in various hypothetical prokaryotic proteins, has no known function. 104
52753 370758 pfam09943 DUF2175 Uncharacterized protein conserved in archaea (DUF2175). This domain, found in various hypothetical archaeal proteins, has no known function. 98
52754 401788 pfam09945 DUF2177 Predicted membrane protein (DUF2177). This domain, found in various hypothetical bacterial proteins, has no known function. 120
52755 401789 pfam09946 DUF2178 Predicted membrane protein (DUF2178). This domain, found in various hypothetical archaeal proteins, has no known function. 106
52756 401790 pfam09947 DUF2180 Uncharacterized protein conserved in archaea (DUF2180). This domain, found in various hypothetical archaeal proteins, has no known function. A few of the family members contain a zinc finger domain. 68
52757 401791 pfam09948 DUF2182 Predicted metal-binding integral membrane protein (DUF2182). This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function. 188
52758 401792 pfam09949 DUF2183 Uncharacterized conserved protein (DUF2183). This domain, found in various hypothetical bacterial proteins, has no known function. 99
52759 401793 pfam09950 DUF2184 Uncharacterized protein conserved in bacteria (DUF2184). This domain, found in various hypothetical bacterial proteins, has no known function. 251
52760 401794 pfam09951 DUF2185 Protein of unknown function (DUF2185). This domain, found in various hypothetical bacterial proteins, has no known function. 85
52761 370761 pfam09952 AbiEi_2 Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_2 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 142
52762 401795 pfam09953 DUF2187 Uncharacterized protein conserved in bacteria (DUF2187). This domain, found in various hypothetical bacterial proteins, has no known function. 60
52763 378323 pfam09954 DUF2188 Uncharacterized protein conserved in bacteria (DUF2188). This domain, found in various hypothetical bacterial proteins, has no known function. 62
52764 401796 pfam09955 DUF2189 Predicted integral membrane protein (DUF2189). Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established. 126
52765 401797 pfam09956 DUF2190 Uncharacterized conserved protein (DUF2190). This domain, found in various hypothetical prokaryotic proteins, as well as in some putative RecA/RadA recombinases, has no known function. 103
52766 401798 pfam09957 VapB_antitoxin Bacterial antitoxin of type II TA system, VapB. VapB is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is VapC, pfam05016. The family contains several related antitoxins from Cyanobacteria and Actinobacterial families. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the Toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all antitoxins. 43
52767 378325 pfam09958 DUF2192 Uncharacterized protein conserved in archaea (DUF2192). This domain, found in various hypothetical archaeal proteins, has no known function. 229
52768 401799 pfam09959 DUF2193 Uncharacterized protein conserved in archaea (DUF2193). This domain, found in various hypothetical archaeal proteins, has no known function. 498
52769 401800 pfam09960 DUF2194 Uncharacterized protein conserved in bacteria (DUF2194). This domain, found in various hypothetical bacterial proteins, has no known function. 586
52770 401801 pfam09961 DUF2195 Uncharacterized protein conserved in bacteria (DUF2195). This domain, found in various hypothetical bacterial proteins, has no known function. 117
52771 401802 pfam09962 DUF2196 Uncharacterized conserved protein (DUF2196). This domain, found in various hypothetical bacterial proteins, has no known function. 59
52772 313229 pfam09963 DUF2197 Uncharacterized protein conserved in bacteria (DUF2197). This domain, found in various hypothetical bacterial proteins, has no known function. 56
52773 401803 pfam09964 DUF2198 Uncharacterized protein conserved in bacteria (DUF2198). This domain, found in various hypothetical bacterial proteins, has no known function. 72
52774 401804 pfam09965 DUF2199 Uncharacterized protein conserved in bacteria (DUF2199). This domain, found in various hypothetical bacterial proteins, has no known function. 145
52775 401805 pfam09966 DUF2200 Uncharacterized protein conserved in bacteria (DUF2200). This domain, found in various hypothetical bacterial proteins, has no known function. 110
52776 401806 pfam09967 DUF2201 VWA-like domain (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. However, it is clearly related to the VWA domain. 123
52777 401807 pfam09968 DUF2202 Uncharacterized protein domain (DUF2202). This domain, found in various hypothetical archaeal proteins, has no known function. 162
52778 401808 pfam09969 DUF2203 Uncharacterized conserved protein (DUF2203). This domain, found in various hypothetical bacterial proteins, has no known function. 121
52779 378331 pfam09970 DUF2204 Nucleotidyl transferase of unknown function (DUF2204). This domain, found in various hypothetical archaeal proteins, has no known function. However, this family was identified as belonging to the nucleotidyltransferase superfamily. 181
52780 401809 pfam09971 DUF2206 Predicted membrane protein (DUF2206). This domain, found in various hypothetical archaeal proteins, has no known function. 380
52781 401810 pfam09972 DUF2207 Predicted membrane protein (DUF2207). This domain, found in various hypothetical bacterial proteins, has no known function. 434
52782 378334 pfam09973 DUF2208 Predicted membrane protein (DUF2208). This domain, found in various hypothetical archaeal proteins, has no known function. 231
52783 401811 pfam09974 DUF2209 Uncharacterized protein conserved in archaea (DUF2209). This domain, found in various hypothetical archaeal proteins, has no known function. 121
52784 401812 pfam09976 TPR_21 Tetratricopeptide repeat-like domain. This family resembles a single unit of a TPR repeat. 194
52785 401813 pfam09977 Tad_C Putative Tad-like Flp pilus-assembly. This domain, found in various hypothetical prokaryotic proteins, is likely to be involved in Flp pilus biogenesis. 93
52786 401814 pfam09979 DUF2213 Uncharacterized protein conserved in bacteria (DUF2213). Members of this family of bacterial proteins comprise various hypothetical and phage-related proteins. The exact function of these proteins has not, as yet, been determined. 166
52787 401815 pfam09980 DUF2214 Predicted membrane protein (DUF2214). This domain, found in various hypothetical bacterial proteins, has no known function. 144
52788 401816 pfam09981 DUF2218 Uncharacterized protein conserved in bacteria (DUF2218). This domain, found in various hypothetical bacterial proteins, has no known function. 88
52789 401817 pfam09982 DUF2219 Uncharacterized protein conserved in bacteria (DUF2219). This domain, found in various hypothetical bacterial proteins, has no known function. 294
52790 401818 pfam09983 DUF2220 Uncharacterized protein conserved in bacteria C-term(DUF2220). This domain, found in various hypothetical bacterial proteins, has no known function. The family represents just the C-terminus. 181
52791 401819 pfam09984 DUF2222 Uncharacterized signal transduction histidine kinase domain (DUF2222). Members of this family of domains are found in various BarA-like signal transduction histidine kinases, which are involved in the regulation of carbon metabolism via the csrA/csrB regulatory system. The role of this domain has not, as yet, been established. 146
52792 401820 pfam09985 Glucodextran_C C-terminal binding-module, SLH-like, of glucodextranase. Glucodextran_C is the C-terminal domain of glucodextranase-like proteins found in various prokaryotic membrane-anchored proteins. It shows homology to the carbohydrate-binding unit of some glycosidases. 228
52793 401821 pfam09986 DUF2225 Uncharacterized protein conserved in bacteria (DUF2225). This domain, found in various hypothetical bacterial proteins, has no known function. 213
52794 255677 pfam09987 DUF2226 Uncharacterized protein conserved in archaea (DUF2226). This domain, found in various hypothetical archaeal proteins, has no known function. 252
52795 401822 pfam09988 DUF2227 Uncharacterized metal-binding protein (DUF2227). Members of this family of hypothetical bacterial proteins possess metal binding properties; however, their exact function has not, as yet, been determined. 172
52796 401823 pfam09989 DUF2229 CoA enzyme activase uncharacterized domain (DUF2229). Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined. 213
52797 401824 pfam09990 DUF2231 Predicted membrane protein (DUF2231). This domain, found in various hypothetical bacterial proteins, has no known function. 100
52798 370773 pfam09991 DUF2232 Predicted membrane protein (DUF2232). This family of bacterial proteins are multi-pass membrane proteins with up to 10 (2 x 4/5) transmembrane regions. The exact function of this potential pore molecule is not known, but in many instances it is associated with ABC-transporter-like domains, implying that it is part of a secretion system that uses energy. 290
52799 401825 pfam09992 NAGPA Phosphodiester glycosidase. This is a family conserved from bacteria to humans. The structure of a member from Bacteroides has been crystallized and modelled onto the luminal region of the human member of the family, the transmembrane glycoprotein N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase. There is some conservation of potentially functional residues, implying that in the bacterial members this family acts in some way as a phosphodiester glycosidase. The human protein is also present, so the eukaryotic members are likely to be catalyzing the second step in the formation of the mannose 6-phosphate targeting signal on lysosomal enzyme oligosaccharides. 169
52800 401826 pfam09994 DUF2235 Uncharacterized alpha/beta hydrolase domain (DUF2235). This domain, found in various hypothetical bacterial proteins, has no known function. 283
52801 401827 pfam09995 DUF2236 Uncharacterized protein conserved in bacteria (DUF2236). This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity. 223
52802 401828 pfam09996 DUF2237 Uncharacterized protein conserved in bacteria (DUF2237). This domain, found in various hypothetical bacterial proteins, has no known function. 108
52803 401829 pfam09997 DUF2238 Predicted membrane protein (DUF2238). This domain, found in various hypothetical bacterial proteins, has no known function. 140
52804 401830 pfam09998 DUF2239 Uncharacterized protein conserved in bacteria (DUF2239). This domain, found in various hypothetical bacterial proteins, has no known function. 181
52805 401831 pfam09999 DUF2240 Uncharacterized protein conserved in archaea (DUF2240). This domain, found in various hypothetical archaeal proteins, has no known function. 144
52806 401832 pfam10000 ACT_3 ACT domain. This domain, found in various hypothetical bacterial proteins, has no known function. However, its structure is similar to the ACT domain which suggests that it binds to amino acids and regulates other protein activity. This family was formerly known as DUF2241. 69
52807 401833 pfam10001 DUF2242 Uncharacterized protein conserved in bacteria (DUF2242). This domain is found in various hypothetical bacterial proteins, and has no known function. 121
52808 401834 pfam10002 DUF2243 Predicted membrane protein (DUF2243). This domain, found in various hypothetical bacterial proteins, has no known function. 139
52809 401835 pfam10003 DUF2244 Integral membrane protein (DUF2244). This domain, found in various bacterial hypothetical and putative membrane proteins, has no known function. 135
52810 401836 pfam10004 DUF2247 Uncharacterized protein conserved in bacteria (DUF2247). This domain, found in various hypothetical bacterial proteins, has no known function. 158
52811 401837 pfam10005 zinc-ribbon_6 zinc-ribbon domain. This family appears to be a true zinc-ribbon, with two sets of putative zinc-binding domains in tandem. 93
52812 401838 pfam10006 DUF2249 Uncharacterized conserved protein (DUF2249). Members of this family of hypothetical bacterial proteins have no known function. 70
52813 401839 pfam10007 DUF2250 Uncharacterized protein conserved in archaea (DUF2250). Members of this family of hypothetical archaeal proteins have no known function. 93
52814 401840 pfam10008 DUF2251 Uncharacterized protein conserved in bacteria (DUF2251). Members of this family of hypothetical bacterial proteins have no known function. 89
52815 401841 pfam10009 DUF2252 Uncharacterized protein conserved in bacteria (DUF2252). This domain, found in various hypothetical bacterial proteins, has no known function. 387
52816 401842 pfam10011 DUF2254 Predicted membrane protein (DUF2254). Members of this family of bacterial proteins comprise various hypothetical and putative membrane proteins. Their exact function has not, as yet, been defined. 371
52817 401843 pfam10012 DUF2255 Uncharacterized protein conserved in bacteria (DUF2255). Members of this family of hypothetical bacterial proteins have no known function. 113
52818 401844 pfam10013 DUF2256 Uncharacterized protein conserved in bacteria (DUF2256). Members of this family of hypothetical bacterial proteins have no known function. 40
52819 401845 pfam10014 2OG-Fe_Oxy_2 2OG-Fe dioxygenase. This family contains 2-oxoglutarate (2OG) and Fe-dependent dioxygenases. It includes L-isoleucine dioxygenase (IDO). 191
52820 401846 pfam10015 DUF2258 Uncharacterized protein conserved in archaea (DUF2258). Members of this family of hypothetical archaeal proteins have no known function. Structural modelling suggests this domain may bind nucleic acids. 78
52821 401847 pfam10016 DUF2259 Predicted secreted protein (DUF2259). Members of this family of hypothetical bacterial proteins have no known function. 189
52822 401848 pfam10017 Methyltransf_33 Histidine-specific methyltransferase, SAM-dependent. The mycobacterial members of this family are expressed from part of the ergothioneine biosynthetic gene cluster. EGTD is the histidine methyltransferase that transfers three methyl groups to the alpha-amino moiety of histidine, in the first stage of the production of this histidine betaine derivative that carries a thiol group attached to the C2 atom of an imidazole ring. 305
52823 401849 pfam10018 Med4 Vitamin-D-receptor interacting Mediator subunit 4. Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter. 184
52824 401850 pfam10020 DUF2262 Uncharacterized protein conserved in bacteria (DUF2262). This domain, found in various hypothetical bacterial proteins, has no known function. 141
52825 401851 pfam10021 DUF2263 Uncharacterized protein conserved in bacteria (DUF2263). This domain, found in various hypothetical bacterial and eukaryotic proteins, has no known function. 136
52826 401852 pfam10022 DUF2264 Uncharacterized protein conserved in bacteria (DUF2264). Members of this family of hypothetical bacterial proteins have no known function. 351
52827 401853 pfam10023 Aminopep Putative aminopeptidase. This family of bacterial proteins has a conserved HEXXH motif, suggesting that members are putative peptidases of zincin fold. 322
52828 401854 pfam10025 DUF2267 Uncharacterized conserved protein (DUF2267). This domain, found in various hypothetical bacterial proteins, has no known function. 122
52829 401855 pfam10026 DUF2268 Predicted Zn-dependent protease (DUF2268). This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function. 195
52830 401856 pfam10027 DUF2269 Predicted integral membrane protein (DUF2269). Members of this family of bacterial hypothetical integral membrane proteins have no known function. 150
52831 401857 pfam10028 DUF2270 Predicted integral membrane protein (DUF2270). This domain, found in various hypothetical bacterial proteins, has no known function. 182
52832 401858 pfam10029 DUF2271 Predicted periplasmic protein (DUF2271). This domain, found in various hypothetical bacterial proteins and misannotated lysozyme proteins, has no known function. 136
52833 370785 pfam10030 DUF2272 Uncharacterized protein conserved in bacteria (DUF2272). Members of this family of hypothetical bacterial proteins have no known function. However, given its similarity to the CHAP domain it seems likely that this is an enzyme involved in cleaving peptidoglycan. 191
52834 401859 pfam10031 DUF2273 Small integral membrane protein (DUF2273). Members of this family of hypothetical bacterial proteins have no known function. 47
52835 401860 pfam10032 Pho88 Phosphate transport (Pho88). Members of this family of proteins are involved in regulating inorganic phosphate transport, as well as telomere length regulation and maintenance. 175
52836 401861 pfam10033 ATG13 Autophagy-related protein 13. Members of this family of phosphoproteins are involved in cytoplasm to vacuole transport (Cvt), and more specifically in Cvt vesicle formation. They are probably involved in the switching machinery regulating the conversion between the Cvt pathway and autophagy. Finally, ATG13 is also required for glycogen storage. 229
52837 401862 pfam10034 Dpy19 Q-cell neuroblast polarisation. Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL, accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant. 645
52838 401863 pfam10035 DUF2179 Uncharacterized protein conserved in bacteria (DUF2179). This domain, found in various hypothetical bacterial proteins, has no known function. 55
52839 401864 pfam10036 RLL Putative carnitine deficiency-associated protein. This family of proteins conserved from nematodes to humans is of approximately 250 amino acids. It is purported to be carnitine deficiency-associated protein but this could not be confirmed. It carries a characteristic RLL sequence-motif. The function is unknown. 243
52840 401865 pfam10037 MRP-S27 Mitochondrial 28S ribosomal protein S27. Members of this family of small ribosomal proteins possess one of three conserved blocks of sequence found in proteins that stimulate the dissociation of guanine nucleotides from G-proteins, leaving open the possibility that MRP-S27 might be a functional partner of GTP-binding ribosomal proteins. 391
52841 401866 pfam10038 DUF2274 Protein of unknown function (DUF2274). Members of this family of hypothetical bacterial proteins have no known function. 69
52842 401867 pfam10039 DUF2275 Predicted integral membrane protein (DUF2275). This domain, found in various hypothetical bacterial proteins and in the RNA polymerase sigma factor, has no known function. 201
52843 401868 pfam10040 CRISPR_Cas6 CRISPR-associated endoribonuclease Cas6. Cas6 is a member of the RAMP (repeat-associated mysterious protein) superfamily. It is among the most widely distributed Cas proteins and is found in both bacteria and archaea. Cas6 functions in the generation of CRISPR-derived guide RNAs for invader defense in prokaryotes. 65
52844 401869 pfam10041 DUF2277 Uncharacterized conserved protein (DUF2277). Members of this family of hypothetical bacterial proteins have no known function. 74
52845 401870 pfam10042 DUF2278 Uncharacterized conserved protein (DUF2278). Members of this family of hypothetical bacterial proteins have no known function. 205
52846 370792 pfam10043 DUF2279 Predicted periplasmic lipoprotein (DUF2279). This domain, found in various hypothetical bacterial proteins, has no known function. 91
52847 401871 pfam10044 LIN52 Retinal tissue protein. LIN52 is a family of proteins of approximately 112 amino acids in length which is conserved from nematodes to humans. The proposed tertiary structure is of almost entirely alpha helix interrupted only by loops located at proline residues. Three sites in the protein sequence reveal two types of possible post-translation modification. A serine residue, at position 41, is a candidate for protein kinase C phosphorylation. Glycine residues at position 69 and 91 are probable sites for acetylation by covalent amide linkage of myristate via N-myristoyl transferase. LIN52 is differentially expressed in the trout retina between parr and smolt developmental stages (smoltification). It is likely to be a house-keeping protein. LIN52 forms a complex (LINC) required for transcriptional activation of G2/M genes. The LINC core complex consists of at least five subunits including the chromatin-associated LIN-9 and RbAp48 proteins. LINC associates with a large number of E2F-regulated promoters in quiescent cells. Family members are required for spermatogenesis by repressing testis-specific gene expression. 92
52848 401872 pfam10045 DUF2280 Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function. 103
52849 401873 pfam10046 BLOC1_2 Biogenesis of lysosome-related organelles complex-1 subunit 2. Members of this family of proteins play a role in cellular proliferation, as well as in the biogenesis of specialized organelles of the endosomal-lysosomal system. 95
52850 378366 pfam10047 DUF2281 Protein of unknown function (DUF2281). Members of this family of hypothetical bacterial proteins have no known function. 64
52851 401874 pfam10048 DUF2282 Predicted integral membrane protein (DUF2282). Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function. 52
52852 401875 pfam10049 DUF2283 Protein of unknown function (DUF2283). Members of this family of hypothetical bacterial proteins have no known function. 49
52853 401876 pfam10050 DUF2284 Predicted metal-binding protein (DUF2284). Members of this family of metal-binding hypothetical bacterial proteins have no known function. 161
52854 378369 pfam10051 DUF2286 Uncharacterized protein conserved in archaea (DUF2286). Members of this family of hypothetical archaeal proteins have no known function. 138
52855 401877 pfam10052 DUF2288 Protein of unknown function (DUF2288). Members of this family of hypothetical bacterial proteins have no known function. 89
52856 370798 pfam10053 DUF2290 Uncharacterized conserved protein (DUF2290). Members of this family of hypothetical bacterial proteins have no known function. 195
52857 401878 pfam10054 DUF2291 Predicted periplasmic lipoprotein (DUF2291). Members of this family of hypothetical bacterial proteins have no known function. 199
52858 401879 pfam10055 DUF2292 Uncharacterized small protein (DUF2292). Members of this family of hypothetical bacterial proteins have no known function. 37
52859 401880 pfam10056 DUF2293 Uncharacterized conserved protein (DUF2293). This domain, found in various hypothetical bacterial proteins, has no known function. 85
52860 401881 pfam10057 DUF2294 Uncharacterized conserved protein (DUF2294). Members of this family of hypothetical bacterial proteins have no known function. 111
52861 401882 pfam10058 zinc_ribbon_10 Predicted integral membrane zinc-ribbon metal-binding protein. This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, is probably a zinc-ribbon. 51
52862 401883 pfam10060 DUF2298 Uncharacterized membrane protein (DUF2298). This domain, found in various hypothetical bacterial proteins, has no known function. 485
52863 401884 pfam10061 DUF2299 Uncharacterized conserved protein (DUF2299). Members of this family of hypothetical bacterial proteins have no known function. 137
52864 401885 pfam10062 DUF2300 Predicted secreted protein (DUF2300). This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function. 122
52865 401886 pfam10063 DUF2301 Uncharacterized integral membrane protein (DUF2301). This domain, found in various hypothetical bacterial proteins, has no known function. 133
52866 401887 pfam10065 DUF2303 Uncharacterized conserved protein (DUF2303). Members of this family of hypothetical bacterial proteins have no known function. 268
52867 401888 pfam10066 DUF2304 Uncharacterized conserved protein (DUF2304). Members of this family of hypothetical archaeal proteins have no known function. 106
52868 370803 pfam10067 DUF2306 Predicted membrane protein (DUF2306). Members of this family of hypothetical bacterial proteins have no known function. 147
52869 401889 pfam10069 DICT Sensory domain in DIguanylate Cyclases and Two-component system. DICT is a sensory domain found associated with GGDEF, EAL, HD-GYP, STAS, and two component systems (histidine-kinase type). It assumes an alpha+beta fold with a 4-stranded beta-sheet and might have a role in light response (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter) 126
52870 401890 pfam10070 DUF2309 Uncharacterized protein conserved in bacteria (DUF2309). Members of this family of hypothetical bacterial proteins have no known function. 758
52871 401891 pfam10071 DUF2310 Zn-ribbon-containing, possibly nucleic-acid-binding protein (DUF2310). Members of this family of proteobacterial zinc ribbon proteins are thought to bind to nucleic acids, however their exact function has not as yet been defined. 255
52872 401892 pfam10073 DUF2312 Uncharacterized protein conserved in bacteria (DUF2312). Members of this family of hypothetical bacterial proteins have no known function. Structural modelling suggests this domain may bind nucleic acids. 72
52873 401893 pfam10074 DUF2285 Uncharacterized conserved protein (DUF2285). This domain, found in various hypothetical bacterial proteins, has no known function. 102
52874 370807 pfam10075 CSN8_PSD8_EIF3K CSN8/PSMD8/EIF3K family. This domain is conserved from plants to humans. It is a signature protein motif found in components of CSN (COP9 signalosome) where it functions as a structural scaffold for subunit-subunit interactions within the complex and is a key regulator of photomorphogenic development. It is found in Eukaryotic translation initiation factor 3 subunit K, a component of the eukaryotic translation initiation factor 3 (eIF-3) complex required for the initiation of protein synthesis. It is also found in 26S proteasome non-ATPase regulatory subunit 8 (PSMD8), a regulatory subunit of the 26S proteasome. 137
52875 401894 pfam10076 DUF2313 Uncharacterized protein conserved in bacteria (DUF2313). Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins. 150
52876 401895 pfam10077 DUF2314 Uncharacterized protein conserved in bacteria (DUF2314). This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined. 136
52877 401896 pfam10078 DUF2316 Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function. 89
52878 401897 pfam10079 BshC Bacillithiol biosynthesis BshC. Members of this protein family include BshC, which is an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme. 538
52879 401898 pfam10080 DUF2318 Predicted membrane protein (DUF2318). Members of this family of hypothetical bacterial proteins have no known function. 98
52880 401899 pfam10081 Abhydrolase_9 Alpha/beta-hydrolase family. This is a family of alpha/beta hydrolases which may function as lipases. This domain is the catalytic domain and includes the catalytic triad and the GXSXG sequence motif which is a characteristic of these enzymes. 282
52881 401900 pfam10082 BBP2_2 Putative beta-barrel porin 2. This domain is a putative beta-barrel porin type 2. 378
52882 401901 pfam10083 DUF2321 Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function. 156
52883 401902 pfam10084 DUF2322 Uncharacterized protein conserved in bacteria (DUF2322). Members of this family of hypothetical bacterial proteins have no known function. 99
52884 401903 pfam10086 DUF2324 Putative membrane peptidase family (DUF2324). This domain, found in various hypothetical bacterial proteins, has no known function. This family appears to be related to the prenyl protease 2 family pfam02517, suggesting this family may be peptidases. 223
52885 401904 pfam10087 DUF2325 Uncharacterized protein conserved in bacteria (DUF2325). Members of this family of hypothetical bacterial proteins have no known function. 94
52886 401905 pfam10088 DUF2326 Uncharacterized protein conserved in bacteria (DUF2326). This domain, found in various hypothetical bacterial proteins, has no known function. 135
52887 401906 pfam10090 HPTransfase Histidine phosphotransferase C-terminal domain. HPTransfase is a family of essential histidine phosphotransferases. It controls the activity of the master bacterial cell-cycle regulator CtrA through phosphorylation. It behaves as a homodimer by adopting the domain architecture of the intracellular part of class I histidine kinases. Each subunit consists of two distinct domains: an N-terminal helical hairpin domain and a C-terminal [alpha]/[beta] domain. The two N-terminal domains are adjacent within the dimer, forming a four-helix bundle. The C-terminal domain adopts an atypical Bergerat ATP-binding fold. 123
52888 401907 pfam10091 Glycoamylase Putative glucoamylase. The structure of UniProt:Q5LIB7 has an alpha/alpha toroid fold and is similar structurally to a number of glucoamylases. Most of these structural homologs are glucoamylases, involved in breaking down complex sugars (e.g. starch). The biologically relevant state is likely to be monomeric. The putative active site is located at the centre of the toroid with a well defined large cavity. 215
52889 370813 pfam10092 DUF2330 Uncharacterized protein conserved in bacteria (DUF2330). Members of this family of hypothetical bacterial proteins have no known function. 311
52890 401908 pfam10093 DUF2331 Uncharacterized protein conserved in bacteria (DUF2331). Members of this family of hypothetical bacterial proteins have no known function. 373
52891 401909 pfam10094 DUF2332 Uncharacterized protein conserved in bacteria (DUF2332). Members of this family of hypothetical bacterial proteins have no known function. 334
52892 401910 pfam10095 DUF2333 Uncharacterized protein conserved in bacteria (DUF2333). Members of this family of hypothetical bacterial proteins have no known function. 330
52893 401911 pfam10096 DUF2334 Uncharacterized protein conserved in bacteria (DUF2334). This domain, found in various hypothetical bacterial proteins, has no known function. 206
52894 401912 pfam10097 DUF2335 Predicted membrane protein (DUF2335). Members of this family of hypothetical bacterial proteins have no known function. 50
52895 401913 pfam10098 DUF2336 Uncharacterized protein conserved in bacteria (DUF2336). Members of this family of hypothetical bacterial proteins have no known function. 258
52896 401914 pfam10099 RskA Anti-sigma-K factor rskA. This domain, formerly known as DUF2337, is the anti-sigma-K factor, RskA. In Mycobacterium tuberculosis the protein positively regulates expression of the antigenic proteins MPB70 and MPB83. 180
52897 401915 pfam10100 DUF2338 Uncharacterized protein conserved in bacteria (DUF2338). Members of this family of hypothetical bacterial proteins have no known function. 423
52898 401916 pfam10101 DUF2339 Predicted membrane protein (DUF2339). This domain, found in various hypothetical bacterial proteins, has no known function. 650
52899 401917 pfam10102 DUF2341 Domain of unknown function (DUF2341). Members of this family are found in various bacterial proteins, including MotA/TolQ/ExbB proton channels and other transport proteins. The exact function of this set of domains has not, as yet, been determined. 82
52900 401918 pfam10103 Zincin_2 Zincin-like metallopeptidase. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family has similarity to Peptidase_M1 (pfam01433, Structure 3CMN). 340
52901 401919 pfam10104 Brr6_like_C_C Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport. Brr6 in yeast is an essential integral membrane protein of the NE-ER, with two predicted transmembrane domains, and is a dosage suppressor of Apq12, pfam12716. 133
52902 401920 pfam10105 DUF2344 Uncharacterized protein conserved in bacteria (DUF2344). This domain, found in various hypothetical bacterial proteins and Radical Sam domain proteins, has no known function. This domain is distantly related to tRNA pseudouridine synthases, suggesting this family may carry out a function related to RNA modification. But this family appears to lack the catalytic aspartate found in pseudouridine synthases. 178
52903 401921 pfam10106 DUF2345 Uncharacterized protein conserved in bacteria (DUF2345). Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins. 151
52904 401922 pfam10107 Endonuc_Holl Endonuclease related to archaeal Holliday junction resolvase. This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases. 159
52905 401923 pfam10108 DNA_pol_B_exo2 Predicted 3'-5' exonuclease related to the exonuclease domain of PolB. This domain is found in various prokaryotic 3'-5' exonucleases and hypothetical proteins. 213
52906 401924 pfam10109 Phage_TAC_7 Phage tail assembly chaperone proteins, E, or 41 or 14. This is a family of various Myoviridae bacteriophage tail assembly chaperone, or TAC, proteins. 76
52907 401925 pfam10110 GPDPase_memb Membrane domain of glycerophosphoryl diester phosphodiesterase. Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase. 321
52908 313356 pfam10111 Glyco_tranf_2_2 Glycosyltransferase like family 2. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis. 276
52909 401926 pfam10112 Halogen_Hydrol 5-bromo-4-chloroindolyl phosphate hydrolysis protein. Members of this family of prokaryotic proteins mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds. 186
52910 401927 pfam10113 Fibrillarin_2 Fibrillarin-like archaeal protein. Members of this family of proteins include archaeal fibrillarin homologs. 500
52911 401928 pfam10114 PocR Sensory domain found in PocR. PocR, a ligand binding domain, has a novel variant of the PAS-like Fold. Evidence suggests that it binds small hydrocarbon derivatives such as 1,3-propanediol. See 'Natural history of sensor domains in bacterial signaling systems' by Aravind L, Iyer LM, Anantharaman V, in 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition', Caister Academic Press, 2010 (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter). 162
52912 401929 pfam10115 HlyU Transcriptional activator HlyU. This domain, found in various hypothetical prokaryotic proteins, has no known function. One of the sequences in this family corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members. 91
52913 401930 pfam10116 Host_attach Protein required for attachment to host cells. Members of this family of bacterial proteins are required for the attachment of the bacterium to host cells. 136
52914 370821 pfam10117 McrBC McrBC 5-methylcytosine restriction system component. Members of this family of bacterial proteins modify the specificity of mcrB restriction by expanding the range of modified sequences restricted. 319
52915 401931 pfam10118 Metal_hydrol Predicted metal-dependent hydrolase. Members of this family of proteins comprise various bacterial transition metal-dependent hydrolases. 250
52916 401932 pfam10119 MethyTransf_Reg Predicted methyltransferase regulatory domain. Members of this family of domains are found in various prokaryotic methyltransferases, where they regulate the activity of the methyltransferase domain. 84
52917 401933 pfam10120 ThiP_synth Thiamine-phosphate synthase. This family is thiamine-phosphate synthase, and it belongs to the SCOP phosphomethylpyrimidine kinase C-terminal domain-like family. Vitamin B1 (thiamine pyrophosphate) is involved in several microbial metabolic functions. Thiamine biosynthesis is accomplished by joining two intermediate molecules that are synthesized separately, HMP-PP and HET-P. In the archaeon Natrialba magadii, ThiE and ThiN are known to join HMP-PP (hydroxymethylpyrimidine pyrophosphate) and HET-P (hydroxyethylthiazole phosphate) to generate thiamine phosphate. Whereas ThiE in Natrialba magadii is a mono-functional protein, ThiN exists as a C-terminal domain in a ThiDN fusion protein - examples of all three forms, from various prokaryotes, are found in this family. 164
52918 287133 pfam10122 Mu-like_Com Mu-like prophage protein Com. Members of this family of proteins comprise the translational regulator of mom. 52
52919 401934 pfam10123 Mu-like_Pro Mu-like prophage I protein. Members of this family of proteins comprise various viral Mu-like prophage I proteins. 325
52920 401935 pfam10124 Mu-like_gpT Mu-like prophage major head subunit gpT. Members of this family of proteins comprise various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT. 289
52921 401936 pfam10125 NADHdeh_related NADH dehydrogenase I, subunit N related protein. This family comprises a set of NADH dehydrogenase I, subunit N related proteins found in archaea. Their exact function has not, as yet, been determined. 218
52922 401937 pfam10126 Nit_Regul_Hom Uncharacterized protein, homolog of nitrogen regulatory protein PII. This domain, found in various hypothetical archaeal proteins, has no known function. It is distantly similar to the nitrogen regulatory protein PII. 107
52923 401938 pfam10127 Nuc-transf Predicted nucleotidyltransferase. Members of this family of bacterial proteins catalyze the transfer of nucleotide residues from nucleoside diphosphates or triphosphates into dimer or polymer forms. 246
52924 401939 pfam10128 OpcA_G6PD_assem Glucose-6-phosphate dehydrogenase subunit. Members of this family are found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. The exact function of the domain is, as yet, unknown. 255
52925 401940 pfam10129 OpgC_C OpgC protein. This domain is found in various hypothetical and OpgC prokaryotic proteins. It is likely to act as an acyltransferase enzyme. 358
52926 370825 pfam10130 PIN_2 PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases). 132
52927 401941 pfam10131 PTPS_related 6-pyruvoyl-tetrahydropterin synthase related domain; membrane protein. This domain is found in various bacterial hypothetical membrane proteins, as well as in tetratricopeptide TPR_2 repeat protein. The exact function of the domain has not, as yet, been established. 621
52928 401942 pfam10133 RNA_bind_2 Predicted RNA-binding protein. Members of this family of bacterial proteins are thought to have RNA-binding properties; however, their exact function has not, as yet, been defined. 60
52929 401943 pfam10134 RPA Replication initiator protein A. Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination. 229
52930 401944 pfam10135 Rod-binding Rod binding protein. Members of this family are involved in the assembly of the prokaryotic flagellar rod. 50
52931 401945 pfam10136 SpecificRecomb Site-specific recombinase. Members of this family of bacterial proteins are found in various putative site-specific recombinase transmembrane proteins. 640
52932 401946 pfam10137 TIR-like Predicted nucleotide-binding protein containing TIR-like domain. Members of this family of bacterial nucleotide-binding proteins contain a TIR-like domain. Their exact function has not, as yet, been defined. 120
52933 401947 pfam10138 vWA-TerF-like vWA found in TerF C-terminus. vWA domain fused to TerD domain typified by the TerF protein. Sometimes found as solos. 200
52934 401948 pfam10139 Virul_Fac Putative bacterial virulence factor. Members of this family of prokaryotic proteins include various putative virulence factor effector proteins. Their exact function is, as yet, unknown. 872
52935 401949 pfam10140 YukC WXG100 protein secretion system (Wss), protein YukC. Members of this family of proteins include predicted membrane proteins homologous to YukC in B. subtilis. The YukC protein family would participate in the formation of a translocon required for the secretion of WXG100 proteins (pfam06013) in monoderm bacteria, the WXG100 protein secretion system (Wss). This family includes EssB in Staphylococcus aureus. 357
52936 401950 pfam10141 ssDNA-exonuc_C Single-strand DNA-specific exonuclease, C terminal domain. Members of this set of prokaryotic domains are found in a set of single-strand DNA-specific exonucleases, including RecJ. Their exact function has not, as yet, been determined. 202
52937 401951 pfam10142 PhoPQ_related PhoPQ-activated pathogenicity-related protein. Members of this family of bacterial proteins are involved in the virulence of some pathogenic proteobacteria. 366
52938 401952 pfam10143 PhosphMutase 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate]. 171
52939 401953 pfam10144 SMP_2 Bacterial virulence factor haemolysin. Members of this family of bacterial proteins are membrane proteins that effect the expression of haemolysin under anaerobic conditions. 159
52940 401954 pfam10145 PhageMin_Tail Phage-related minor tail protein. Members of this family are found in putative phage tail tape measure proteins. 200
52941 401955 pfam10146 zf-C4H2 Zinc finger-containing protein. This is a family of proteins which appears to have a highly conserved zinc finger domain at the C terminal end, described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is predicted to contain a coiled coil. Members are annotated as being tumor-associated antigen HCA127 in humans but this could not be confirmed. 213
52942 401956 pfam10147 CR6_interact Growth arrest and DNA-damage-inducible proteins-interacting protein 1. Members of this family of proteins act as negative regulators of G1 to S cell cycle phase progression by inhibiting cyclin-dependent kinases. Inhibitory effects are additive with GADD45 proteins but occur also in the absence of GADD45 proteins. Furthermore, they act as a repressor of the orphan nuclear receptor NR4A1 by inhibiting AB domain-mediated transcriptional activity. 204
52943 401957 pfam10148 SCHIP-1 Schwannomin-interacting protein 1. Members of this family are coiled-coil proteins involved in linking membrane proteins to the cytoskeleton. 233
52944 401958 pfam10149 TM231 Transmembrane protein 231. This is a family of transmembrane proteins, given the number 231, of unknown function. It is conserved in eukaryotes. 301
52945 401959 pfam10150 RNase_E_G Ribonuclease E/G family. Ribonuclease E and Ribonuclease G are related enzymes that cleave a wide variety of RNAs. 267
52946 401960 pfam10151 TMEM214 TMEM214, C-terminal, caspase 4 activator. This is the N-terminal domain of transmembrane family 214, from eukaryotes. The family is localized on the endoplasmic reticulum where it recruits procaspase 4 to the ER and subsequently allows this to be cleaved to caspase 4 so leading to apoptosis. 661
52947 401961 pfam10152 CCDC53 Subunit CCDC53 of WASH complex. CCDC53 is a component of the WASH complex, which plays a key role in the fission of tubules that serve as transport intermediates during endosome sorting. 146
52948 401962 pfam10153 Efg1 rRNA-processing protein Efg1. Efg1 is involved in rRNA processing. 114
52949 401963 pfam10154 DUF2362 Uncharacterized conserved protein (DUF2362). This is a family of proteins conserved from nematodes to humans. The function is not known. 500
52950 401964 pfam10155 DUF2363 Uncharacterized conserved protein (DUF2363). This is a region of 120 amino acids of a family of proteins conserved from plants to humans. The function is not known. 124
52951 401965 pfam10156 Med17 Subunit 17 of Mediator complex. This Mediator complex subunit was formerly known as Srb4 in yeasts or Trap80 in Drosophila and human. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of S. cerevisiae lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures. 441
52952 401966 pfam10157 BORCS6 BLOC-1-related complex sub-unit 6. This is a family of conserved proteins found from nematodes to humans. Family members include BORCS6 (BLOC-1-related complex sub-unit 6) also known as Lyspersin (lysosome-dispersing protein) or C17orf59. It constitutes sub-unit 6 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery. 152
52953 287167 pfam10158 LOH1CR12 tumor suppressor protein. This is a region of 130 amino acids that is the most conserved region of hypothetical proteins involved in loss of heterozygosity and thus tumor suppression. The exact function of family members is not known. This region is also found in subunit 5 of the BLOC-1-related complex, which is also found in the BORC complex. BLOC-1 is important for the biogenesis of lysosome-related organelles, and BORC is important for the positioning of the lysosome in the cytoplasm. The BORC complex associates with the lysosomal membrane where it recruits the small GTPase Arl8, which leads in turn to the kinesin-dependent movement of lysosomes toward the plus ends of microtubules in the peripheral cytoplasm. 131
52954 401967 pfam10159 MMtag Kinase phosphorylation protein. This is a glycine-rich domain that is the most highly conserved region of a family of proteins that in vertebrates are associated with tumors in multiple myelomas. The region may contain phosphorylation sites for several protein kinases, as well as N-myristoylation sites and nuclear localization signals, so it might act as a signal molecule in the nucleus. 78
52955 401968 pfam10160 Tmemb_40 Predicted membrane protein. This is a region of 280 amino acids from a group of proteins conserved from plants to humans. It is predicted to be a membrane protein but its function is otherwise unknown. 258
52956 401969 pfam10161 DDDD Putative mitochondrial precursor protein. This is a family of small conserved proteins found from nematodes to humans. The C-terminal region is rich in asparagine. Members are putatively assigned to be mitochondrial precursor proteins but this could not be confirmed. 76
52957 401970 pfam10162 G8 G8 domain. This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. 123
52958 401971 pfam10163 EnY2 Transcription factor e(y)2. EnY2 is a small transcription factor which is combined in a complex with the TAFII40 protein. The protein is conserved from paramecium to humans. 79
52959 401972 pfam10164 DUF2367 Uncharacterized conserved protein (DUF2367). This is a highly conserved family of proteins which contains three pairs of cysteine residues within a length of 42 amino acids and is rich in proline residues towards the N-terminus. The function is unknown. Several members are putatively assigned as brain protein i3 but this was not validated. 105
52960 401973 pfam10165 Ric8 Guanine nucleotide exchange factor synembryn. Ric8 is involved in the EGL-30 neurotransmitter signalling pathway. It is a guanine nucleotide exchange factor that regulates neurotransmitter secretion. 406
52961 401974 pfam10166 DUF2368 Uncharacterized conserved protein (DUF2368). This family is conserved from nematodes to humans. The function is not known. 134
52962 401975 pfam10167 BORCS8 BLOC-1-related complex sub-unit 8. This is the N-terminal 80 residues of a family of proteins conserved from plants to humans. It contains a characteristic NEP sequence motif. Family members include BORCS8 (BLOC-1-related complex sub-unit 8) also known as MEF2BNB. It constitutes sub-unit 8 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery. 107
52963 401976 pfam10168 Nup88 Nuclear pore component. Nup88 can be divided into two structural domains; the N-terminal two-thirds of the protein has no obvious structural motifs but is the region for binding to Nup98, one of the components of the nuclear pore. The C-terminal end is a predicted coiled-coil domain. Nup88 is overexpressed in tumor cells. 713
52964 401977 pfam10169 Laps Learning-associated protein. This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA. 124
52965 401978 pfam10170 C6_DPF Cysteine-rich domain. This is the N-terminal approximately 100 amino acids of a family of proteins found from nematodes to humans. It contains between six and eight highly conserved cysteine residues and a characteristic DPF sequence motif. One member is putatively named as receptor for egg jelly protein but this could not be confirmed. 94
52966 401979 pfam10171 Tim29 Translocase of the Inner Mitochondrial membrane 29. This is a family of proteins conserved from nematodes to humans. The function is not known. However, family members such as the import inner membrane translocase sub-unit Tim29 (C19orf52) found in human are shown to be required for the stability of the TIM22 complex. TIM22 complex imports and inserts multi-pass trans-membrane proteins into the mitochondrial inner membrane by formation of a twin-pore translocase with components in the outer and inner membranes. TIM29 is integrated into the inner membrane with the C-terminus exposed to the inter-membrane space and able to contact the translocase of the outer membrane. It is required for complex stability and for the addition of the TIMM22 protein to the complex. 169
52967 401980 pfam10172 DDA1 Det1 complexing ubiquitin ligase. DDA1 (De-etiolated 1, Damaged DNA binding protein 1 associated 1) protein binds strongly with DDB1 and Det1 forming a DDD complex which is part of the ubiquitin conjugation system. 66
52968 401981 pfam10173 Mit_KHE1 Mitochondrial K+-H+ exchange-related. The members of this family function as mitochondrial potassium-hydrogen exchange transporters. The family is part of a large mitochondrial KHE protein complex. 191
52969 401982 pfam10174 Cast RIM-binding protein of the cytomatrix active zone. This is a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion. The C-terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). The family also contains four coiled-coil domains. 765
52970 401983 pfam10175 MPP6 M-phase phosphoprotein 6. This is a family of M-phase phosphoprotein 6s which is necessary for generation of the 3' end of the 5.8S rRNA precursor. It preferentially binds to poly(C) and poly(U). 130
52971 401984 pfam10176 DUF2370 Protein of unknown function (DUF2370). This family is conserved from fungi to humans. The human member is annotated as a Golgi-associated protein-Nedd4 WW domain-binding protein but this could not be confirmed. 215
52972 401985 pfam10177 DUF2371 Uncharacterized conserved protein (DUF2371). This is a family of proteins conserved from nematodes to humans. The function is not known. 141
52973 401986 pfam10178 PAC3 Proteasome assembly chaperone 3. PAC3 is a family of eukaryotic proteasome assembly chaperone 3 proteins conserved from fungi to plants to humans. PAC3 plays a crucial part in the assembly of the 20S core proteasome unit, in conjunction with PAC4. 86
52974 401987 pfam10179 DUF2369 Uncharacterized conserved protein (DUF2369). This is a proline-rich region of a group of proteins found from plants to fungi. The function is not known. 94
52975 401988 pfam10180 DUF2373 Uncharacterized conserved protein (DUF2373). This is the C-terminal conserved region of a family of proteins found from fungi to humans. The function is not known. 62
52976 401989 pfam10181 PIG-H GPI-GlcNAc transferase complex, PIG-H component. PIG-H is a family of conserved proteins that complexes with three other proteins to form the GPI-GnT (glycosylphosphatidylinositol anchor biosynthesis transferase) complex. It appears to be a peripheral membrane protein facing the cytoplasm involved in the first step in GPI anchor formation. 67
52977 401990 pfam10182 Flo11 Flo11 domain. This presumed domain is found at the N-terminus of the S. cerevisiae Flo11 protein. Flo11 is required for diploid pseudohyphal formation and haploid invasive growth. It belongs to a family of proteins involved in invasive growth, cell-cell adhesion, and mating, many of which can substitute for each other under abnormal conditions. 151
52978 401991 pfam10183 ESSS ESSS subunit of NADH:ubiquinone oxidoreductase (complex I). This subunit is part of the mitochondrial NADH:ubiquinone oxidoreductase (complex I). It carries mitochondrial import sequences. 115
52979 370865 pfam10184 DUF2358 Uncharacterized conserved protein (DUF2358). DUF2358 is a family of conserved proteins found from plants to humans. The function is unknown. 113
52980 401992 pfam10185 Mesd Chaperone for wingless signalling and trafficking of LDL receptor. Mesd is a family of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region. The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation). 155
52981 370867 pfam10186 Atg14 Vacuolar sorting 38 and autophagy-related subunit 14. The Atg14 or Apg14 proteins are hydrophilic proteins with a predicted molecular mass of 40.5 kDa, and have a coiled-coil motif at the N-terminus region. Yeast cells with mutant Atg14 are defective not only in autophagy but also in sorting of carboxypeptidase Y (CPY), a vacuolar-soluble hydrolase, to the vacuole. Subcellular fractionation indicates that Apg14p and Apg6p are peripherally associated with a membrane structure(s). Apg14p was co-immunoprecipitated with Apg6p, suggesting that they form a stable protein complex. These results imply that Apg6/Vps30p has two distinct functions: in the autophagic process and in the vacuolar protein sorting pathway. Apg14p may be a component specifically required for the function of Apg6/Vps30p through the autophagic pathway. There are 17 auto-phagosomal component proteins which are categorized into six functional units, one of which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The AS-PI3K complex and the Atg2-Atg18 complex are essential for nucleation, and the specific function of the AS-PI3K apparently is to produce phosphatidylinositol 3-phosphate (PtdIns(3)P) at the pre-autophagosomal structure (PAS). The localization of this complex at the PAS is controlled by Atg14. Autophagy mediates the cellular response to nutrient deprivation, protein aggregation, and pathogen invasion in humans, and malfunction of autophagy has been implicated in multiple human diseases including cancer. This effect seems to be mediated through direct interaction of the human Atg14 with Beclin 1 in the human phosphatidylinositol 3-kinase class III complex. 342
52982 401993 pfam10187 Nefa_Nip30_N N-terminal domain of NEFA-interacting nuclear protein NIP30. This is the N-terminal 100 amino acids of a family of proteins conserved from plants to humans. The full-length protein has putatively been called NEFA-interacting nuclear protein NIP30; however, no reference could be found to confirm this. 102
52983 401994 pfam10188 Oscp1 Organic solute transport protein 1. Oscp1 is a family of proteins conserved from plants to humans. It is called organic solute transport protein or oxido-red-nitro domain-containing protein 1; however, no reference could be found to confirm the function of the protein. 173
52984 401995 pfam10189 Ints3 Integrator complex subunit 3. The Integrator complex is involved in small nuclear RNA (snRNA) U1 and U2 transcription, and in their 3'-box-dependent processing. This complex associates with the C-terminal domain of RNA polymerase II largest subunit and is recruited to the U1 and U2 snRNAs genes. This entry represents subunit 3 of this complex. 225
52985 401996 pfam10190 Tmemb_170 Putative transmembrane protein 170. Tmem170 is a family of putative transmembrane proteins conserved from fungi to nematodes to humans. The protein is only approximately 130 amino acids in length. The function is unknown. 106
52986 337664 pfam10191 COG7 Golgi complex component 7 (COG7). COG7 is a component of the conserved oligomeric Golgi complex which is required for normal Golgi morphology and localization. Mutation in COG7 causes a congenital disorder of glycosylation. 736
52987 401997 pfam10192 GpcrRhopsn4 Rhodopsin-like GPCR transmembrane domain. This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers. 257
52988 401998 pfam10193 Telomere_reg-2 Telomere length regulation protein. This family is the central conserved 110 amino acid region of a group of proteins called telomere-length regulation or clock abnormal protein-2 which are conserved from plants to humans. The full-length protein regulates telomere length and contributes to silencing of sub-telomeric regions. In vitro the protein binds to telomeric DNA repeats. 112
52989 401999 pfam10195 Phospho_p8 DNA-binding nuclear phosphoprotein p8. P8 is a short 80-82 amino acid protein that is conserved from nematodes to humans. It carries at least one protein kinase C domain suggesting a possible role in signal transduction and it is thought to be a phosphoprotein, but the sites of phosphorylation and the kinases involved remain to be determined. 58
52990 402000 pfam10197 Cir_N N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex. 37
52991 402001 pfam10198 Ada3 Histone acetyltransferases subunit 3. Ada3 is a family of proteins conserved from yeasts to humans. It is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. Ada3 plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumor suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage. 123
52992 370877 pfam10199 Adaptin_binding Alpha and gamma adaptin binding protein p34. p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin. It has been speculated that p34 may play a chaperone role such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. Another possible function is in aiding the recruitment of soluble adaptors onto the membrane. 93
52993 370878 pfam10200 Ndufs5 NADH:ubiquinone oxidoreductase, NDUFS5-15kDa. This is a family of short, approximately 105 amino acid residue, proteins which form part of NADH:ubiquinone oxidoreductase complex I. Complex I is the first multisubunit inner membrane protein complex of the mitochondrial electron transport chain and it transfers two electrons from NADH to ubiquinone. The protein carries four highly conserved cysteine residues but these do not appear to be in a configuration which would favour metal binding so the exact function of the protein is uncertain. 95
52994 402002 pfam10203 Pet191_N Cytochrome c oxidase assembly protein PET191. Pet191_N is the conserved N-terminal of a family of conserved proteins found from nematodes to humans. It carries six highly conserved cysteine residues. Pet191 is required for the assembly of active cytochrome c oxidase but does not form part of the final assembled complex. 67
52995 402003 pfam10204 DuoxA Dual oxidase maturation factor. DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. It is conserved from nematodes to humans. 274
52996 402004 pfam10205 KLRAQ Predicted coiled-coil domain-containing protein. This is the N-terminal 100 amino acid domain of a family of proteins conserved from nematodes to humans. It carries a characteristic KLRAQ sequence-motif. The function is not known. 100
52997 402005 pfam10206 WRW Mitochondrial F1F0-ATP synthase, subunit f. This is a family of small proteins of approximately 110 amino acids, which are highly conserved from nematodes to humans. Some members of the family have been annotated in Swiss-Prot as being the f subunit of mitochondrial F1F0-ATP synthase but this could not be confirmed. The sequence has a well-conserved WRW motif. The exact function of the protein is not known. 102
52998 402006 pfam10208 Armet Degradation arginine-rich protein for mis-folding. This is a family of small proteins of approximately 170 residues which contain four di-sulfide bridges that are highly conserved from nematodes to humans. Armet is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved with dealing with mis-folded proteins in the ER, thus in quality control of ER stress. 145
52999 402007 pfam10209 DUF2340 Uncharacterized conserved protein (DUF2340). This is a family of small proteins of approximately 150 amino acids of unknown function. 120
53000 402008 pfam10210 MRP-S32 Mitochondrial 28S ribosomal protein S32. This entry is of a family of short, approximately 100 amino acid residues, proteins which are mitochondrial 28S ribosomal proteins named as MRP-S32. Their exact function could not be confirmed. 92
53001 402009 pfam10211 Ax_dynein_light Axonemal dynein light chain. Axonemal dynein light chain proteins play a dynamic role in flagellar and cilia motility. Eukaryotic cilia and flagella are complex organelles consisting of a core structure, the axoneme, which is composed of nine microtubule doublets forming a cylinder that surrounds a pair of central singlet microtubules. This ultra-structural arrangement seems to be one of the most stable micro-tubular assemblies known and is responsible for the flagellar and ciliary movement of a large number of organisms ranging from protozoan to mammals. This light chain interacts directly with the N-terminal half of the heavy chains. 182
53002 402010 pfam10212 TTKRSYEDQ Predicted coiled-coil domain-containing protein. This is the C-terminal 500 amino acids of a family of proteins with a predicted coiled-coil domain conserved from nematodes to humans. It carries a characteristic TTKRSYEDQ sequence-motif. The function is not known. 523
53003 287217 pfam10213 MRP-S28 Mitochondrial ribosomal subunit protein. This is a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae the protein is MRP-S24 whereas in humans it is MRP-S28. The human mitochondrial ribosome has 29 distinct proteins in the small subunit and these have homologs in, for example, Drosophila melanogaster, Caenorhabditis elegans, and in the genomes of several fungi. 127
53004 402011 pfam10214 Rrn6 RNA polymerase I-specific transcription-initiation factor. RNA polymerase I-specific transcription-initiation factor Rrn6 and Rrn7 represent components of a multisubunit transcription factor essential for the initiation of rDNA transcription by Pol I. These proteins are found in fungi. 847
53005 402012 pfam10215 Ost4 Oligosaccaryltransferase. Ost4 is a very short, approximately 30 residues, enzyme found from fungi to vertebrates. It is a member of the ER oligosaccharyltransferase complex, EC 2.4.1.119, that catalyzes the asparagine-linked glycosylation of proteins. It appears to be an integral membrane protein that mediates the en bloc transfer of a preassembled high-mannose oligosaccharide onto asparagine residues of nascent polypeptides as they enter the lumen of the rough endoplasmic reticulum (RER). 34
53006 402013 pfam10216 ChpXY CO2 hydration protein (ChpXY). This small family of proteins includes paralogues ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations. 352
53007 402014 pfam10217 DUF2039 Uncharacterized conserved protein (DUF2039). This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown. 89
53008 402015 pfam10218 DUF2054 Uncharacterized conserved protein (DUF2054). This entry contains 14 conserved cysteines, three of which are CC-dimers. The region is of approximately 200 residues in length but its function is unknown. 128
53009 402016 pfam10220 Smg8_Smg9 Smg8_Smg9. Smg8 and Smg9 are two subunits of the Smg-1 complex. They suppress Smg-1 kinase activity in the isolated Smg-1 complex, and are involved in nonsense-mediated mRNA decay (NMD) in both mammals and nematodes. 868
53010 402017 pfam10221 DUF2151 Cell cycle and development regulator. This is a set of proteins conserved from worms to humans. The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells. 680
53011 287225 pfam10222 DUF2152 Uncharacterized conserved protein (DUF2152). This is a family of proteins conserved from worms to humans. Its function is unknown. 605
53012 402018 pfam10223 DUF2181 Uncharacterized conserved protein (DUF2181). This is a region of approximately 250 residues conserved from worms to humans. Its function is unknown. 240
53013 402019 pfam10224 DUF2205 Predicted coiled-coil protein (DUF2205). This entry represents a highly conserved 100 residue region which is likely to be a coiled-coil structure. The exact function is unknown. 71
53014 402020 pfam10225 NEMP NEMP family. This entry includes a group of nuclear envelope integral membrane proteins from animals and plants, including NEMP1 from Xenopus laevis. NEMP1 is a RanGTP-binding protein and is involved in eye development. 249
53015 402021 pfam10226 CCDC85 CCDC85 family. This entry includes human CCDC85A/B/C and C. elegans Picc-1 protein. Picc-1 serves as a linker protein which helps to recruit the Rho GTPase-activating protein, pac-1, to adherens junctions. Human CCDC85B suppresses the beta-catenin activity in a p53-dependent manner. 190
53016 402022 pfam10228 DUF2228 Uncharacterized conserved protein (DUF2228). This is a family of conserved proteins of approximately 700 residues found from worms to humans. 250
53017 402023 pfam10229 MMADHC Methylmalonic aciduria and homocystinuria type D protein. This entry represents methylmalonic aciduria and homocystinuria type D protein and homologs. These proteins are involved in cobalamin (vitamin B12) metabolism. 272
53018 370901 pfam10230 LIDHydrolase Lipid-droplet associated hydrolase. This family of proteins is conserved from plants to humans. The function is as a lipid-droplet hydrolase in the yeast members. 261
53019 402024 pfam10231 DUF2315 Uncharacterized conserved protein (DUF2315). This is a family of small conserved proteins found from worms to humans. The function is not known. 118
53020 402025 pfam10232 Med8 Mediator of RNA polymerase II transcription complex subunit 8. Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex. The N-terminus of Med8 binds to the essential core Head part of Mediator and the C-terminus hinges to Med18 on the non-essential part of the Head that also includes Med20. 231
53021 402026 pfam10233 Cg6151-P Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined. 113
53022 402027 pfam10234 Cluap1 Clusterin-associated protein-1. This protein is conserved from worms to humans. The protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell. 268
53023 402028 pfam10235 Cript Microtubule-associated protein CRIPT. The CRIPT protein is a cytoskeletal protein involved in microtubule production. The C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners. SAP90 is concentrated in the post synaptic density of glutamatergic neurons. 87
53024 370907 pfam10236 DAP3 Mitochondrial ribosomal death-associated protein 3. This is a family of conserved proteins which were originally described as death-associated-protein-3 (DAP-3). The proteins carry a P-loop DNA-binding motif, and induce apoptosis. DAP3 has been shown to be a pro-apoptotic factor in the mitochondrial matrix and to be crucial for mitochondrial biogenesis and so has also been designated as MRP-S29 (mitochondrial ribosomal protein subunit 29). 310
53025 402029 pfam10237 N6-adenineMlase Probable N6-adenine methyltransferase. This is a protein of approximately 200 residues which is conserved from plants to humans. It contains a highly conserved QFW motif close to the N-terminus and a DPPF motif in the centre. The DPPF motif is characteristic of N-6 adenine-specific DNA methylases, and this family is found in eukaryotes. 118
53026 402030 pfam10238 Eapp_C E2F-associated phosphoprotein. This entry represents the conserved C-terminal portion of an E2F binding protein. E2F transcription factors play an essential role in cell proliferation and apoptosis and their activity is frequently deregulated in human cancers. E2F activity is regulated by a variety of mechanisms, frequently mediated by proteins binding to individual members or a subgroup of the family. EAPP interacts with a subset of E2F factors and influences E2F-dependent promoter activity. EAPP is present throughout the cell cycle but disappears during mitosis. 133
53027 402031 pfam10239 DUF2465 Protein of unknown function (DUF2465). FAM98A and B proteins are found from worms to humans but their function is unknown. This entry is of a family of proteins that is rich in glycines. 321
53028 402032 pfam10240 DUF2464 Multivesicular body subunit 12. MVB12A (also known as CFBP) and MVB12B are subunits of the ESCRT-I complex, which mediates the sorting of ubiquitinated cargo protein from the plasma membrane to the endosomal vesicle. MVB12A plays a key role in the ligand-mediated internalization and down-regulation of the EGF receptor. 256
53029 402033 pfam10241 KxDL Uncharacterized conserved protein. This is a family of short proteins which are conserved over a region of 80 residues. There is a characteristic KxDL motif towards the C-terminus. The function is unknown. 80
53030 370913 pfam10242 L_HMGIC_fpl Lipoma HMGIC fusion partner-like protein. This is a group of proteins expressed from a series of genes referred to as Lipoma HMGIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumor cell lines. 181
53031 402034 pfam10243 MIP-T3 Microtubule-binding protein MIP-T3. This protein, which interacts with both microtubules and TRAF3 (tumor necrosis factor receptor-associated factor 3), is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved. 113
53032 402035 pfam10244 MRP-L51 Mitochondrial ribosomal subunit. MRP-L51 is a family of small proteins from the intact 55 S mitochondrial ribosome. It has otherwise been referred to as bMRP-64. The exact function of this family is not known. 93
53033 402036 pfam10245 MRP-S22 Mitochondrial 28S ribosomal protein S22. This is the conserved N-terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them. 241
53034 402037 pfam10246 MRP-S35 Mitochondrial ribosomal protein MRP-S35. This is a family of short mitochondrial ribosomal proteins, less than 200 amino acids long, that are highly conserved from worms to humans. The structure has previously been referred to as MRP-S18 but the current numbering fits the preferred nomenclature from these authors. 105
53035 402038 pfam10247 Romo1 Reactive mitochondrial oxygen species modulator 1. This is a family of small, approximately 100 amino acid, proteins found from yeasts to humans. The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression. Members of this family are mitochondrial reactive oxygen species modulator 1 (Romo1) proteins that are responsible for increasing the level of ROS in cells. Increased Romo1 expression can have a number of other effects including: inducing premature senescence of cultured human fibroblasts and increased resistance to 5-fluorouracil. 66
53036 402039 pfam10248 Mlf1IP Myelodysplasia-myeloid leukemia factor 1-interacting protein. This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. The structure contains a putative 14-3-3 binding motif involved in the subcellular localization of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions. 174
53037 402040 pfam10249 NDUFB10 NADH-ubiquinone oxidoreductase subunit 10. NDUFB10 is a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus. 126
53038 402041 pfam10250 O-FucT GDP-fucose protein O-fucosyltransferase. This is a family of conserved proteins representing the enzyme responsible for adding O-fucose to EGF (epidermal growth factor-like) repeats. Six highly conserved cysteines are present in O-FucT-1 as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and C. elegans. Both features are characteristic of several glycosyltransferase families. The enzyme is a membrane-bound protein released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese. 254
53039 402042 pfam10251 PEN-2 Presenilin enhancer-2 subunit of gamma secretase. This entry is a short 101 peptide protein which is the smallest subunit of the gamma-secretase aspartyl protease complex that catalyzes the intramembrane cleavage of a subset of type I transmembrane proteins. The other active constituents of the complex are presenilin (PS), nicastrin and anterior pharynx defective-1 (APH-1) protein. PEN-2 adopts a hairpin orientation in the membrane with its N- and C-terminal domains facing the luminal/extracellular space, and the C-terminal domain maintains PS stability within the complex. 93
53040 402043 pfam10252 PP28 Casein kinase substrate phosphoprotein PP28. This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II. 80
53041 402044 pfam10253 PRCC Mitotic checkpoint regulator, MAD2B-interacting. This family constitutes the major, conserved, portion of PRCC proteins. In humans this family interacts with MAD2B, the mitotic checkpoint protein. In Schizosaccharomyces pombe this protein is part of the Cwf-complex that is known to be involved in pre-mRNA splicing. 128
53042 402045 pfam10254 Pacs-1 PACS-1 cytosolic sorting protein. PACS-1 is a cytosolic sorting protein that directs the localization of membrane proteins in the trans-Golgi network (TGN)/endosomal system. PACS-1 connects the clathrin adaptor AP-1 to acidic cluster sorting motifs contained in the cytoplasmic domain of cargo proteins such as furin, the cation-independent mannose-6-phosphate receptor and in viral proteins such as human immunodeficiency virus type 1 Nef. 413
53043 402046 pfam10255 Paf67 RNA polymerase I-associated factor PAF67. RNA polymerase I is a multisubunit enzyme and its transcription competence is dependent on the presence of PAF67. This family of proteins is conserved from worms to humans. 399
53044 402047 pfam10256 Erf4 Golgin subfamily A member 7/ERF4 family. This family of proteins includes Golgin subfamily A member 7 proteins as well as Ras modification protein ERF4. 116
53045 370927 pfam10257 RAI16-like Retinoic acid induced 16-like protein. This is the conserved N-terminal 450 residues of a family of proteins described as retinoic acid-induced protein 16-like proteins. The exact function is not known. The proteins are found from worms to humans. 357
53046 402048 pfam10258 RNA_GG_bind PHAX RNA-binding domain. RNA_GG_bind is the highly conserved U3 snoRNA-binding domain of PHAX (phosphorylated adaptor for RNA export) whose function is to transport U3 snoRNA from the nucleus after transcription. It is characterized by having two pairs of adjacent glycines, as GGx12GG. 84
53047 402049 pfam10259 Rogdi_lz Rogdi leucine zipper containing protein. This is a family of conserved proteins which have been suggested as containing leucine-zipper domains. A leucine zipper domain is a region of 30 amino acids with leucines repeating every seven or eight residues; these proteins do have many such leucines. The protein in Drosophila comes from the gene ROGDI. 295
53048 402050 pfam10260 SAYSvFN Uncharacterized conserved domain (SAYSvFN). This domain of approximately 75 residues contains a highly conserved SATSv/iFN motif. The function is unknown but the domain is conserved from plants to humans. 65
53049 402051 pfam10261 Scs3p Inositol phospholipid synthesis and fat-storage-inducing TM. This is a family of transmembrane proteins which are variously annotated as possibly being inositol phospholipid synthesis protein and fat-storage-inducing. The members are conserved from yeasts to humans and are localized to the endoplasmic reticulum where they are involved in triglyceride lipid droplet formation. 242
53050 402052 pfam10262 Rdx Rdx family. This entry is an approximately 100 residue region of selenoprotein-T, conserved from plants to humans. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. However, despite its binding to UGTR and that its mRNA is up-regulated in extended asphyxia, the function of the protein and hence of this region of it is unknown. Selenoprotein W contains selenium as selenocysteine in the primary protein structure and levels of this selenoprotein are affected by selenium. 74
53051 402053 pfam10263 SprT-like SprT-like family. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc metallopeptidases. This family includes the bacterial SprT protein. 105
53052 402054 pfam10264 Stork_head Winged helix Storkhead-box1 domain. This is the conserved N-terminal winged helix domain of Storkhead-box1 protein which is likely to be a DNA binding domain. In humans the full-length protein controls polyploidization of extravillus trophoblast and is implicated in pre-eclampsia. 79
53053 402055 pfam10265 Miga Mitoguardin. Mitoguardin (Miga) was first identified in flies as a mitochondrial outer-membrane protein that promotes mitochondrial fusion. Later, the mammalian Miga homologs, Miga1 and Miga2, were identified. They are found to promote mitochondrial fusion by regulating mitochondrial phospholipid metabolism via MitoPLD. 535
53054 402056 pfam10266 Strumpellin Hereditary spastic paraplegia protein strumpellin. This is a family of proteins conserved from plants to humans, in which two closely situated point mutations in the human protein lead to the condition of hereditary spastic paraplegia. Strumpellin contains one known domain called a spectrin repeat that consists of three alpha-helices of a characteristic length wrapped in a left-handed coiled coil. The spectrin proteins have multiple copies of this repeat, which can then form multimers in the cell. Spectrin associates with the cell membrane via spectrin repeats in the ankyrin protein. The spectrin repeat is a structural platform for cytoskeletal protein assemblies. 1081
53055 402057 pfam10267 Tmemb_cc2 Predicted transmembrane and coiled-coil 2 protein. This family of transmembrane coiled-coil containing proteins is conserved from worms to humans. Its function is unknown. 386
53056 402058 pfam10268 Tmemb_161AB Predicted transmembrane protein 161AB. Transmemb_161AB is a family of conserved proteins found from worms to humans. Members are putative transmembrane proteins but otherwise the function is not known. 485
53057 402059 pfam10269 Tmemb_185A Transmembrane Fragile-X-F protein. This is a family of conserved transmembrane proteins that appear in humans to be expressed from a region upstream of the FragileXF site and to be intimately linked with the Fragile-X syndrome. Absence of TMEM185A does not necessarily lead to developmental delay, but might in combination with other, yet unknown, factors. Otherwise, the lack of the TMEM185A protein is either disposable (redundant) or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B. 245
53058 402060 pfam10270 MMgT Membrane magnesium transporter. This entry represents a novel family of membrane magnesium transporters (MMgT). The proteins, MMgT1 and MMgT2, are localized to the Golgi complex and post-Golgi vesicles, including the early endosomes, suggesting that they may provide regulated pathways for Mg(2+) transport in the Golgi and post-Golgi organelles of epithelium-derived cells. 117
53059 402061 pfam10271 Tmp39 Putative transmembrane protein. This is a family of conserved proteins found from worms to humans. They are putative transmembrane proteins but the function is unknown. 429
53060 402062 pfam10272 Tmpp129 Putative transmembrane protein precursor. This is a family of proteins conserved from worms to humans. The proteins are purported to be transmembrane protein-precursors but the function is unknown. 351
53061 402063 pfam10273 WGG Pre-rRNA-processing protein TSR2. This entry represents the central conserved section of a family of proteins described as pre-rRNA-processing protein TSR2. The region has a distinctive WGG motif but the function is unknown. 80
53062 402064 pfam10274 ParcG Parkin co-regulated protein. This family of proteins is transcribed anti-sense along the DNA to the Parkin gene product and the two appear to be transcribed under the same promoter. The protein has predicted alpha-helical and beta-sheet domains which suggest its function is in the ubiquitin/proteasome system. Mutations in parkin are the genetic cause of early-onset and autosomal recessive juvenile parkinsonism. 183
53063 402065 pfam10275 Peptidase_C65 Peptidase C65 Otubain. This family of proteins conserved from plants to humans is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryote being a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumor domain) in which there is an active cysteine protease triad (ii) a nuclear localization signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif. 242
53064 402066 pfam10276 zf-CHCC Zinc-finger domain. This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C. 37
53065 402067 pfam10277 Frag1 Frag1/DRAM/Sfk1 family. This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumor-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localization of Stt4p to the actin cytoskeleton. 219
53066 287279 pfam10278 Med19 Mediator of RNA pol II transcription subunit 19. Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators. 178
53067 192511 pfam10279 Latarcin Latarcin precursor. This family represents the precursor proteins for a number of short antimicrobial peptides called Latarcins. Latarcins were discovered in the venom of the spider Lachesana tarabaevi. Latarcins are likely to adopt amphipathic alpha-helical structure in the plasma membrane. 64
53068 402068 pfam10280 Med11 Mediator complex protein. Mediator is a large, modular protein complex that is conserved from yeast to human and conveys regulatory signals from DNA-binding transcription factors to RNA polymerase II. Not only are the polypeptides conserved but the structural organisation is also largely conserved. One or two subunits are either fungal or vertebral specific but Med11 is one of the subunits that is conserved from fungi to humans. Med11 appears to be necessary for the full and successful assembly of the core head sub-region. 119
53069 402069 pfam10281 Ish1 Putative stress-responsive nuclear envelope protein. This entry represents a repeat found in the fungal protein Ish1, a putative stress-responsive nuclear envelope protein. 37
53070 402070 pfam10282 Lactonase Lactonase, 7-bladed beta-propeller. This entry contains bacterial 6-phosphogluconolactonases (6PGL)YbhE-type (EC:3.1.1.31) which hydrolyze 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonising enzyme carboxy-cis,cis-muconate cyclase (EC:5.5.1.5) and muconate cycloisomerase (EC:5.5.1.1), which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold. 340
53071 402071 pfam10283 zf-CCHH Zinc-finger (CX5CX6HX5H) motif. This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism. 26
53072 118808 pfam10284 Luciferase_3H Luciferase helical bundle domain. This domain is found associated with the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. This domain has a three helix bundle structure that holds four important histidines that are thought to play a role in the pH regulation of the enzyme. 66
53073 118809 pfam10285 Luciferase_cat Luciferase catalytic domain. This domain is the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. The core part of the domain is a 10 stranded beta barrel that is structurally similar to lipocalins and FABP. 296
53074 402072 pfam10287 DUF2401 Putative TOS1-like glycosyl hydrolase (DUF2401). This family of proteins is conserved in fungi. One member is annotated putatively as OPEL, a house-keeping protein, but this could not be confirmed. It contains 5 highly conserved cysteines two of which form a characteristic CGC sequence motif. It has recently been shown that this family is related to known glycosyl hydrolases. 223
53075 402073 pfam10288 CTU2 Cytoplasmic tRNA 2-thiolation protein 2. CTU2 is a family of proteins necessary for the formation of the wobble nucleoside 5-methoxycarbonylmethyl-2-thiouridine in Saccharomyces cerevisiae. The family is conserved from plants to humans. It plays a central role in the 2-thiolation of 5-methoxycarbonylmethyl-2-thiouridine, or the wobble nucleoside. This wobble modification in tRNAs, 5-methoxycarbonylmethyl-2-thiouridine (mcm(5)s(2)U), is required for the proper decoding of NNR codons in eukaryotes. The 2-thio group gives rigidity by largely fixing the C3'-endo ribose puckering, ensuring stable and accurate codon-anticodon pairing. 106
53076 402074 pfam10290 DUF2403 Glycine-rich protein domain (DUF2403). This domain is found in the N-terminal region of members of DUF2401 pfam10287. The function of this glycine-rich region is unknown. 59
53077 402075 pfam10291 muHD Muniscin C-terminal mu homology domain. The muniscins are a family of endocytic adaptors that is conserved from yeast to humans. This C-terminal domain is structurally similar to mu homology domains, and is the region of the muniscin proteins involved in the interactions with the endocytic adaptor-scaffold proteins Ede1-eps15. This interaction influences muniscin localization. The muniscins provide a combined adaptor-membrane-tubulation activity that is important for regulating endocytosis. 255
53078 370953 pfam10292 7TM_GPCR_Srab Serpentine type 7TM GPCR receptor class ab chemoreceptor. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srab is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The expression pattern of the srab genes is biologically intriguing. Of the six promoters successfully expressed in transgenic organisms, one was exclusively expressed in the tail phasmid neurons, two were exclusively expressed in a head amphid neuron, and two were expressed both in the head and tail neurons as well as a limited number of other cells. 324
53079 402076 pfam10293 DUF2405 Domain of unknown function (DUF2405). This is a conserved region of a family of proteins conserved in fungi. The function is unknown. 152
53080 313513 pfam10294 Methyltransf_16 Lysine methyltransferase. Methyltrans_16 is a lysine methyltransferase. Characterized members of this family are protein methyltransferases targeting Lys residues in specific proteins, including calmodulin, VCP, Kin17 and Hsp70 proteins. 172
53081 402077 pfam10295 DUF2406 Uncharacterized protein (DUF2406). This is a family of small proteins conserved in fungi. The function is not known. 58
53082 402078 pfam10296 MMM1 Maintenance of mitochondrial morphology protein 1. MMM1 is conserved from plants to humans. MMM1 is an integral ER protein. It is N-glycosylated, and forms a complex with Mdm10, Mdm12 and Mdm34 to tether the mitochondria to the endoplasmic reticulum. 314
53083 402079 pfam10297 Hap4_Hap_bind Minimal binding motif of Hap4 for binding to Hap2/3/5. In Saccharomyces cerevisiae, the haem-activated protein complex Hap2/3/4/5 plays a major role in the transcription of genes involved in respiration. Hap4_Hap_bind is the essential domain of Hap4 which allows it to associate with Hap2, Hap3 and Hap5 to form the Hap complex. 17
53084 402080 pfam10298 WhiA_N WhiA N-terminal LAGLIDADG-like domain. This domain is found at the N terminal of sporulation factor WhiA. This domain is related to the LAGLIDADG Homing endonuclease domain while the C terminal domain of WhiA is predicted to be a DNA binding helix-turn-helix domain. 86
53085 402081 pfam10300 DUF3808 Protein of unknown function (DUF3808). This is a family of proteins conserved from fungi to humans. Members of this family also carry a TPR_2 domain pfam07719 at their C-terminus. 474
53086 370959 pfam10302 DUF2407 DUF2407 ubiquitin-like domain. This is a family of proteins found in fungi. The function is not known. This domain is related to the ubiquitin domain. 101
53087 402082 pfam10303 DUF2408 Protein of unknown function (DUF2408). This is a family of proteins conserved in fungi. The function is unknown. 128
53088 402083 pfam10304 RTP1_C2 Required for nuclear transport of RNA pol II C-terminus 2. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10363. 34
53089 402084 pfam10305 Fmp27_SW RNA pol II promoter Fmp27 protein domain. Fmp27_SW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic SW and GKG sequence motifs. 101
53090 402085 pfam10306 FLILHELTA Hypothetical protein FLILHELTA. This is a family of conserved proteins found in fungi. It contains a characteristic FL(I)LHE(L)TA sequence motif, where the bracketed residues are I, L or V. The function is not known. 82
53091 402086 pfam10307 DUF2410 Hypothetical protein (DUF2410). This is a family of proteins conserved in fungi. The function is not known. There are two characteristic sequence motifs, GGWW and TGR. 198
53092 402087 pfam10309 NCBP3 Nuclear cap-binding protein subunit 3. NCBP3 and NCBP1 form an alternative cap-binding complex in higher eukaryotes. NCBP3 binds mRNA, associates with components of the mRNA processing machinery and contributes to polyA RNA export. 59
53093 402088 pfam10310 DUF5427 Family of unknown function (DUF5427). This is a domain of unknown function. Family members found in Saccharomyces cerevisiae, are synthetic lethal with genes involved in maintenance of telomere capping. However, experimental evidence is yet to verify the exact function of family members and the domain. 456
53094 402089 pfam10311 Ilm1 Increased loss of mitochondrial DNA protein 1. This is a family of proteins of approximately 200 residues that are conserved in fungi. Ilm1 is part of the peroxisome, a complex that is the sole site of beta-oxidation in Saccharomyces cerevisiae and known to be required for optimal growth in the presence of fatty acid. Ilm1 may participate in the control of the C16/C18 ratio since it interacts strongly with Mga2p, a transcription factor that controls expression of Ole1, the sole fatty acyl desaturase in S. cerevisiae responsible for conversion of the saturated fatty acids stearate (C18) and palmitate (C16) to oleate and palmitoleate, respectively. 160
53095 402090 pfam10312 Cactin_mid Conserved mid region of cactin. This is the conserved middle region of a family of proteins referred to as cactins. The region contains two of three predicted coiled-coil domains. Most members of this family have a CactinC_cactus pfam09732 domain at the C-terminal end. Upstream of Mid_cactin in Drosophila members are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localization signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. 186
53096 402091 pfam10313 DUF2415 Uncharacterized protein domain (DUF2415). This is a short, 30 residue domain, from a family of proteins conserved in fungi. The function is unknown. There is a characteristic DLL sequence motif. 43
53097 402092 pfam10315 Aim19 Altered inheritance of mitochondria protein 19. This is a family of conserved proteins found in fungi. The function is not known. 111
53098 402093 pfam10316 7TM_GPCR_Srbc Serpentine type 7TM GPCR chemoreceptor Srbc. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srbc is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 273
53099 402094 pfam10317 7TM_GPCR_Srd Serpentine type 7TM GPCR chemoreceptor Srd. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srd is part of the larger Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 292
53100 402095 pfam10318 7TM_GPCR_Srh Serpentine type 7TM GPCR chemoreceptor Srh. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srh is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 302
53101 370974 pfam10319 7TM_GPCR_Srj Serpentine type 7TM GPCR chemoreceptor Srj. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srj is part of the Str superfamily of chemoreceptors. The srj family is designated as the out-group based on its location in preliminary phylogenetic analyses of the entire superfamily. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 310
53102 255903 pfam10320 7TM_GPCR_Srsx Serpentine type 7TM GPCR chemoreceptor Srsx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 257
53103 402096 pfam10321 7TM_GPCR_Srt Serpentine type 7TM GPCR chemoreceptor Srt. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srt is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 313
53104 370976 pfam10322 7TM_GPCR_Sru Serpentine type 7TM GPCR chemoreceptor Sru. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sru is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 304
53105 370977 pfam10323 7TM_GPCR_Srv Serpentine type 7TM GPCR chemoreceptor Srv. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 283
53106 402097 pfam10324 7TM_GPCR_Srw Serpentine type 7TM GPCR chemoreceptor Srw. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 318
53107 402098 pfam10325 7TM_GPCR_Srz Serpentine type 7TM GPCR chemoreceptor Srz. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srz is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srz appear to be under strong adaptive evolutionary pressure. 265
53108 402099 pfam10326 7TM_GPCR_Str Serpentine type 7TM GPCR chemoreceptor Str. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Str is a member of the Str superfamily of chemoreceptors. Almost a quarter (22.5%) of str and srj family genes and pseudogenes in C. elegans appear to have been newly formed by gene duplications since the species split. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 307
53109 313539 pfam10327 7TM_GPCR_Sri Serpentine type 7TM GPCR chemoreceptor Sri. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sri is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 303
53110 402100 pfam10328 7TM_GPCR_Srx Serpentine type 7TM GPCR chemoreceptor Srx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 262
53111 402101 pfam10329 DUF2417 Region of unknown function (DUF2417). This is a region of a family of proteins conserved in fungi some of whose members also have the Abhydrolase_1, pfam00561, domain in their sequence. The function of this region is not known. 234
53112 402102 pfam10330 Stb3 Putative Sin3 binding protein. This is a family of the conserved N-terminal end of a group of proteins conserved in fungi. It is likely to be a Sin3 binding protein. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein. 79
53113 402103 pfam10332 DUF2418 Protein of unknown function (DUF2418). This is a conserved 100 residue central region of a family of proteins found in fungi. It carries a characteristic EYD sequence motif. The function is not known. 96
53114 370985 pfam10333 Pga1 GPI-Mannosyltransferase II co-activator. Pga1 is found only in yeasts and not in mammals. It localizes in the ER as a glycosylated integral membrane protein. It binds to the GPI-mannosyltransferase II subunit of the GPI and it is responsible for the second mannose addition to GPI precursors. The GPI-anchoring complex is a glycolipid that functions as a membrane anchor for many cell-surface proteins. 174
53115 370986 pfam10334 ArAE_2 Aromatic acid exporter family member 2. This is a family of proteins conserved in fungi. The function is not known. 228
53116 402104 pfam10335 DUF294_C Putative nucleotidyltransferase substrate binding domain. This domain is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains. 144
53117 402105 pfam10336 DUF2420 Protein of unknown function (DUF2420). This is a family of proteins conserved in fungi. The function is not known. 107
53118 370988 pfam10337 ArAE_2_N Putative ER transporter, 6TM, N-terminal. This is a family of proteins conserved in fungi. The function is not known. This family is the C-terminal half of some member proteins which contain the DUF2421 pfam10334 domain at their N-terminus. These proteins are putative endoplasmic reticulum transporters, with a total of 12 TMs. 470
53119 402106 pfam10338 DUF2423 Protein of unknown function (DUF2423). This is a family of proteins conserved in fungi. The function is not known. 44
53120 402107 pfam10339 Vel1p Yeast-specific zinc responsive. This is a small family of proteins from Saccharomyces and related species. The function is not known but member proteins are highly induced in zinc-depleted conditions and have increased expression in NAP1-deletion mutants. The S. cerevisiae genes are named VEL by association with Velum formation in the wine making process http://www.ajevonline.org/content/48/1/55.abstract 202
53121 313549 pfam10340 Say1_Mug180 Steryl acetyl hydrolase. This entry includes budding yeast steryl acetyl hydrolase 1 (Say1) and fission yeast Mug180. Say1 is a membrane-anchored deacetylase required for the deacetylation of acetylated sterols. It is involved in the resistance to eugenol and pregnenolone toxicity. Mug180 has a role in meiosis. 374
53122 402108 pfam10341 TPP1 Shelterin complex subunit, TPP1/ACD. TPP1 is a component of the telomerase holoenzyme, involved in telomere replication. It has been demonstrated that TPP1 dimerizes and binds to DNA and RNA. Furthermore, TPP1 stimulates the dissociation of RNA/DNA hetero-duplexes. Yeast telomerase protein TPP1 (Est3 in yeast) is a novel type of GTPase. The key residues in yeast EST3 are an Asp at residue 86 and the Arg at residue 110. The Asp is totally conserved in the family, whereas the Arg is not so well conserved. The N-terminal of TPP1 is likely to be the binding surface for TINF2, whereas the C-terminus probably binds to POT1, thereby tethering POT1 to the shelterin complex. The complex bound to telomeric DNA increases the activity and processivity of the human telomerase core enzyme, thus helping to maintain the length of the telomeres. This domain is conserved from fungi to mammals, hence family Telomere_Pot1 has been merged into the family. The human shelterin complex includes six proteins: telomere repeat binding factor 1 (TRF1), TRF2, repressor/activator protein 1 (RAP1), TRF1-interacting nuclear protein 2 (TIN2), TIN2-interacting protein 1 (TPP1) and protection of telomeres 1 (POT1). 105
53123 370992 pfam10342 GPI-anchored Ser-Thr-rich glycosyl-phosphatidyl-inositol-anchored membrane family. Some members of this family appear to be serine- threonine-rich membrane-anchored proteins, anchored by glycosyl-phosphatidylinositol. In A. fumigatus these proteins play a role in fungal cell wall organisation. In Lentinula edodes this family is involved in fruiting body formation, and may have a more general role in signalling in other organisms as it interacts with MAPK. The family is also found in archaea and bacteria. 93
53124 402109 pfam10343 Q_salvage Potential Queuosine, Q, salvage protein family. Q_salvage proteins occur in most Eukarya as well as in a few bacteria, possibly via horizontal gene transfer. Queuosine (Q) is a chemical modification found at the wobble position of tRNAs that have GUN anticodons. Most bacteria synthesize queuosine de novo, whereas eukaryotes rely solely on salvaging this essential component from the environment or the gut flora. The exact enzymatic function of the domain has yet to be determined, but structural similarity with DNA glycosidases suggests a ribonucleoside hydrolase role. 285
53125 402110 pfam10344 Fmp27 Mitochondrial protein from FMP27. This family contains mitochondrial FMP27 proteins which in yeasts together with SEN1 are long genes that exist in a looped conformation, effectively bringing together their promoter and terminator regions. Pol-II is located at both ends of FMP27 when this gene is transcribed from a GAL1 promoter under induced and non-induced conditions. The exact function of the Fmp27 protein is not certain. 864
53126 402111 pfam10345 Cohesin_load Cohesin loading factor. Cohesin_load is a common cohesin loading factor protein that is conserved in fungi. It is associated with the cohesin complex and is required in G1 for cohesin binding to chromosomes but dispensable in G2 when cohesion has been established. It is referred to as both Ssl3, in pombe, and Scc4, in S.cerevisiae. It complexes with Mis4. 594
53127 402112 pfam10346 Con-6 Conidiation protein 6. Con-6 is the conserved N-terminal region of a family of small proteins found in fungi. It is expressed at approximately 6 hours after the induction of development and is induced just prior to major constriction-chain growth. 33
53128 402113 pfam10347 Fmp27_GFWDK RNA pol II promoter Fmp27 protein domain. Fmp27_GFWDK is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic GFWDK sequence motifs. Some members are associated with domain Fmp27_SW (pfam10305) towards the N-terminus. 155
53129 402114 pfam10348 DUF2427 Domain of unknown function (DUF2427). This is the N-terminal region of a family of proteins conserved in fungi. Several members are annotated as being Ftp1 but this could not be confirmed. The function is not known. 105
53130 402115 pfam10350 DUF2428 Putative death-receptor fusion protein (DUF2428). This is a family of proteins conserved from plants to humans. The function is not known. Several members have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these could be confirmed. 256
53131 402116 pfam10351 Apt1 Golgi-body localization protein domain. This is the C-terminus of a family of proteins conserved from plants to humans. The plant members are localized to the Golgi proteins and appear to regulate membrane trafficking, as they are required for rapid vesicle accumulation at the tip of the pollen tube. The C-terminus probably contains the Golgi localization signal and it is well-conserved. 468
53132 118874 pfam10353 DUF2430 Protein of unknown function (DUF2430). This is a family of short, 111 residue, proteins found in S. pombe. The function is not known. 107
53133 402117 pfam10354 DUF2431 Domain of unknown function (DUF2431). This is the N-terminal domain of a family of proteins found from plants to humans. The function is not known. 163
53134 402118 pfam10355 Ytp1 Protein of unknown function (Ytp1). This is a family of proteins found in fungi. The region appears to contain regions similar to mitochondrial electron transport proteins. The C-terminal domain is hydrophobic and negatively charged. There are consensus sites for both N-linked glycosylation and cAMP-dependent protein kinase phosphorylation. 274
53135 287343 pfam10356 DUF2034 Protein of unknown function (DUF2034). This protein is expressed in fungi but its function is unknown. 185
53136 402119 pfam10357 Kin17_mid Domain of Kin17 curved DNA-binding protein. Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle. 123
53137 402120 pfam10358 NT-C2 N-terminal C2 in EEIG1 and EHBP1 proteins. This version of the C2 domain was initially identified in the vertebrate estrogen early-induced gene 1 (EEIG1), and its Drosophila ortholog required for uptake of dsRNA via the endocytotic machinery to induce RNAi silencing. It is also in C.elegans ortholog Sym-3 (SYnthetic lethal with Mec-3) and the mammalian protein EHBP1 (EH domain Binding Protein-1) that regulates endocytotic recycling and two plant proteins, RPG that regulates Rhizobium-directed polar growth and PMI1 (Plastid Movement Impaired 1) that is essential for intracellular movement of chloroplasts in response to blue light. 143
53138 402121 pfam10359 Fmp27_WPPW RNA pol II promoter Fmp27 protein domain. Fmp27_WPPW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic HQR and WPPW sequence motifs and is towards the C-terminal in members which contain Fmp27_SW pfam10305. 481
53139 402122 pfam10360 DUF2433 Protein of unknown function (DUF2433). This is a conserved 120 residue region of a family of proteins found in fungi. The function is not known. 95
53140 402123 pfam10361 DUF2434 Protein of unknown function (DUF2434). This is a family of proteins conserved in fungi. The function is not known. 294
53141 402124 pfam10363 RTP1_C1 Required for nuclear transport of RNA pol II C-terminus 1. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10304. 108
53142 402125 pfam10364 NKWYS Putative capsular polysaccharide synthesis protein. Found only in Vibrio species, pombe and one other fungus, this is the N-terminal 150 residues of a family of proteins of unknown function. There is a characteristic NKWYS sequence motif. 132
53143 287351 pfam10365 DUF2436 Domain of unknown function (DUF2436). This domain is found on peptidase C25 proteins and has no known function. 164
53144 371008 pfam10366 Vps39_1 Vacuolar sorting protein 39 domain 1. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. The precise function of this domain has not been characterized. 108
53145 402126 pfam10367 Vps39_2 Vacuolar sorting protein 39 domain 2. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localization and in mediating the interactions of Vps39 with Vps11. 109
53146 402127 pfam10368 YkyA Putative cell-wall binding lipoprotein. YkyA is a family of proteins containing a lipoprotein signal and a hydrolase domain. It is similar to cell wall binding proteins and might also be recognisable by a host immune defense system. It is thus likely to belong to pathways important for pathogenicity. 185
53147 402128 pfam10369 ALS_ss_C Small subunit of acetolactate synthase. ALS_ss_C is the C-terminal half of a family of proteins which are the small subunits of acetolactate synthase. Acetolactate synthase is a tetrameric enzyme, containing probably two large and two small subunits, which catalyzes the first step in branched-chain amino acid biosynthesis. This reaction is sensitive to certain herbicides. 73
53148 402129 pfam10370 DUF2437 Domain of unknown function (DUF2437). This is the N-terminal 50 amino acids of a group of bacterial proteins annotated as fumarylacetoacetate hydrolase-containing enzymes. In most cases members are associated with FAA_hydrolase pfam01557 further towards the C-terminus. 50
53149 402130 pfam10371 EKR Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) pfam01558 and the 4Fe-4S binding domain Fer4 pfam00037. It contains a characteristic EKR sequence motif. The exact function of this domain is not known. 54
53150 402131 pfam10372 YojJ Bacterial membrane-spanning protein N-terminus. YojJ is the N-terminus of a family of bacterial proteins some of which are associated with DUF147 pfam02457 towards the C-terminus. It is a putative membrane-spanning protein. 69
53151 402132 pfam10373 EST1_DNA_bind Est1 DNA/RNA binding domain. Est1 is a protein which recruits or activates telomerase at the site of polymerization. This is the DNA/RNA binding domain of EST1. 279
53152 402133 pfam10374 EST1 Telomerase activating protein Est1. Est1 is a protein which recruits or activates telomerase at the site of polymerization. Structurally it resembles a TPR-like repeat. 130
53153 402134 pfam10375 GRAB GRIP-related Arf-binding domain. The GRAB (GRIP-related Arf-binding) domain is towards the C-terminus of Rud3 type proteins. This domain is related to the GRIP domain, but the conserved tyrosine residue found at position 4 in all GRIP domains is replaced by a leucine residue. The Arf small GTPase is localized to the cis-Golgi where it recruits proteins via their GRAB domain, as part of the transport of cargo from the endoplasmic reticulum to the plasma membrane. 49
53154 371014 pfam10376 Mei5 Double-strand recombination repair protein. Mei5 is one of a pair of meiosis-specific proteins which facilitate the loading of Dmc1 on to Rad51 on DNA at double-strand breaks during recombination. Recombination is carried out by a large protein complex based around the two RecA homologs, Rad51 and Dmc1. This complex may play both a catalytic and a structural role in the interaction between homologous chromosomes during meiosis. Mei5 is seen to contain a coiled-coil region. 207
53155 402135 pfam10377 ATG11 Autophagy-related protein 11. The function of this family is conflicting. In the fission yeast, Schizosaccharomyces pombe, this protein has been shown to interact with the telomere cap complex. However, in budding yeast, Saccharomyces cerevisiae, this protein is called ATG11 and is shown to be involved in autophagy. 131
53156 402136 pfam10378 RRM Putative RRM domain. This is a putative RRM, RNA-binding, domain found only in fungi. It occurs in proteins annotated as Nrd1 yeast proteins, which are known to carry RRM domains. It is not homologous with any of the other RRM domains, eg RRM_1 pfam00076. 46
53157 402137 pfam10379 nec1 Virulence protein nec1. This is a family of virulence proteins that are found in pathogenic Streptomyces species. 184
53158 371017 pfam10380 CRF1 Transcription factor CRF1. CRF1 is a transcription factor that co-represses ribosomal genes with FHL1 via the TOR signalling pathway and protein kinase A. 122
53159 371018 pfam10381 Autophagy_C Autophagocytosis associated protein C-terminal. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. It carries a highly characteristic conserved FLKF sequence motif. 25
53160 402138 pfam10382 DUF2439 Protein of unknown function (DUF2439). Proteins in this family have been implicated in telomere maintenance in Saccharomyces cerevisiae and in meiotic chromosome segregation in Schizosaccharomyces pombe. 74
53161 402139 pfam10383 Clr2 Transcription-silencing protein Clr2. Clr2 is a chromatin silencing protein, one of a quartet of proteins forming the core of SHREC, a multienzyme effector complex that mediates hetero-chromatic transcriptional gene silencing in fission yeast. Clr2 does not have any obvious well-conserved domains but, along with the other core proteins, binds to the histone deacetylase Clr3, and on its own might also have a role in chromatin organisation at the cnt domain, the site of kinetochore assembly. 138
53162 402140 pfam10384 Scm3 Centromere protein Scm3. Scm3 is a centromere protein that has been shown in Saccharomyces cerevisiae to be required for G2/M progression and Cse4 localization. The C terminal region of Scm3 proteins is variable in size and sometimes consists of DNA binding motifs. 53
53163 402141 pfam10385 RNA_pol_Rpb2_45 RNA polymerase beta subunit external 1 domain. RNA polymerases catalyze the DNA-dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This domain in prokaryotes spans the gap between domains 4 and 5 of the yeast protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region. 66
53164 402142 pfam10386 DUF2441 Protein of unknown function (DUF2441). This is a family of highly conserved, predicted, proteins from Bacillus species. The structure forms a homo-dimer. The function is unknown. 141
53165 402143 pfam10387 DUF2442 Protein of unknown function (DUF2442). This family of bacterial and fungal proteins has several members annotated as being putative molybdopterin-guanine dinucleotide biosynthesis protein A; however this could not be verified. Hence the function is not known. This family also includes the DUF3532 that was found to be related and was merged into this family. Members of this family also fall into the NE0471 N-terminal domain-like superfamily, a family of proteins with a unique fold in SCOP:143880. 72
53166 402144 pfam10388 YkuI_C EAL-domain associated signalling protein domain. In Bacillus species this highly conserved region of the YkuI protein lies immediately downstream of the EAL (diguanylate cyclase/phosphodiesterase domain 2) pfam00563 domain so that together they form a monomer which dimerizes for its enzymatic action. The region contains three alpha helices and five beta strands and is the C-terminal half of the structure. 166
53167 313589 pfam10389 CoatB Bacteriophage coat protein B. CoatB is a single filamentous bacteriophage alpha helix of approximately 44 residues. It is likely to assemble into a complex of 35 monomers in a Catherine-wheel like formation. It is the major coat protein of the virion. 46
53168 402145 pfam10390 ELL RNA polymerase II elongation factor ELL. ELL is a family of RNA polymerase II elongation factors. It is bound stably to elongation-associated factors 1 and 2, EAFs, and together these act as a strong regulator of transcription activity by direct interaction with Pol II. ELL binds to pol II on its own but the affinity is greatly increased by the cooperation of EAF. Some members carry an Occludin domain pfam07303 just downstream. There is no S. cerevisiae member. 281
53169 402146 pfam10391 DNA_pol_lambd_f Fingers domain of DNA polymerase lambda. DNA polymerases catalyze the addition of dNMPs onto the 3-prime ends of DNA chains. There is a general polymerase fold consisting of three subdomains that have been likened to the fingers, palm, and thumb of a right hand. DNA_pol_lambd_f is the central three-helical region of DNA polymerase lambda referred to as the F and G helices of the fingers domain. Contacts with DNA involve this conserved helix-hairpin-helix motif in the fingers region which interacts with the primer strand. This motif is common to several DNA binding proteins and confers a sequence-independent interaction with the DNA backbone. 50
53170 192566 pfam10392 COG5 Golgi transport complex subunit 5. The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7. 132
53171 402147 pfam10393 Matrilin_ccoil Trimeric coiled-coil oligomerization domain of matrilin. This short domain is a coiled coil structure and has a single cysteine residue at the start which is likely to form a di-sulfide bridge with a corresponding cysteine in an upstream EGF (pfam00008) domain thereby spanning a VWA (pfam00092) domain. All three domains can be associated together as in the cartilage matrix protein matrilin, where this domain is likely to be responsible for oligomerization. 43
53172 402148 pfam10394 Hat1_N Histone acetyl transferase HAT1 N-terminus. This domain is the N-terminal half of the structure of histone acetyl transferase HAT1. It is often found in association with the C-terminal part of the GNAT Acetyltransf_1 (pfam00583) domain. It seems to be motifs C and D of the structure. Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to the lysine E-amino groups on the N-terminal tails of histones. HATs are involved in transcription since histones tend to be hyper-acetylated in actively transcribed regions of chromatin, whereas in transcriptionally silent regions histones are hypo-acetylated. 157
53173 402149 pfam10395 Utp8 Utp8 family. Utp8 is an essential component of the nuclear tRNA export machinery in Saccharomyces cerevisiae. It is a tRNA binding protein that acts at a step between tRNA maturation /aminoacylation, and translocation of the tRNA across the nuclear pore complex. 690
53174 402150 pfam10396 TrmE_N GTP-binding protein TrmE N-terminus. This family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyzes GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff's base intermediate. This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein. 117
53175 402151 pfam10397 ADSL_C Adenylosuccinate lyase C-terminus. This is the C-terminal seven alpha helices of the structure whose full length represents the enzyme adenylosuccinate lyase. This sequence lies C-terminal to the conserved motif necessary for beta-elimination reactions. Adenylosuccinate lyase catalyzes two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide, the eighth step of the de novo pathway, and the formation of adenosine monophosphate (AMP) from adenylosuccinate, the second step in the conversion of inosine monophosphate into AMP. 79
53176 371028 pfam10398 DUF2443 Protein of unknown function (DUF2443). This is a small family of highly conserved proteins from bacteria, in particular Helicobacter species. The structure is a bundle of alpha helices. The function is not known. 79
53177 402152 pfam10399 UCR_Fe-S_N Ubiquitinol-cytochrome C reductase Fe-S subunit TAT signal. This is the N-terminal region of the E or R chain, Ubiquitinol-cytochrome C reductase Fe-S subunit, of the hetero-hexameric cytochrome bc1 complex. This region is a TAT-signal region. The cytochrome bc1 complex is an oligomeric membrane protein complex that is a component of respiratory and photosynthetic electron transfer chains. The enzyme couples the transfer of electrons from ubiquinol to cytochrome c with the generation of a proton gradient across the membrane. The motif is also associated with Rieske (pfam00355), UCR_TM (pfam02921) and Ubiq-Cytc-red_N (pfam09165). 41
53178 402153 pfam10400 Vir_act_alpha_C Virulence activator alpha C-term. This structure is homo-dimeric, and the domain here is the C-terminal half of the structure, often associated with PadR upstream, (pfam03551), which is a transcriptional regulator. 85
53179 402154 pfam10401 IRF-3 Interferon-regulatory factor 3. This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha,beta-stimulated gene). IRF-3 preexists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection. 179
53180 402155 pfam10403 BHD_1 Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA. 53
53181 402156 pfam10404 BHD_2 Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA. 63
53182 402157 pfam10405 BHD_3 Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA. 73
53183 402158 pfam10406 TAF8_C Transcription factor TFIID complex subunit 8 C-term. This is the C-terminal, Delta, part of the TAF8 protein. The N-terminal is generally the histone fold domain, Bromo_TP (pfam07524). TAF8 is one of the key subunits of the transcription factor for pol II, TFIID. TAF8 is one of the several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery. 48
53184 402159 pfam10407 Cytokin_check_N Cdc14 phosphatase binding protein N-terminus. Cytokinesis in yeasts involves a family of proteins whose essential function is to bind Cdc14-family phosphatase and prevent this from being sequestered and inhibited in the nucleolus. This is the highly conserved N-terminus of a family of proteins which act as cytokinesis checkpoint controls by allowing cells to cope with cytokinesis defects. These proteins are required for rDNA silencing and mini-chromosome maintenance. 71
53185 402160 pfam10408 Ufd2P_core Ubiquitin elongating factor core. This is the most conserved part of the core region of Ufd2P ubiquitin elongating factor or E4, running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C-terminus, pfam04564, which has ligase activity. 594
53186 402161 pfam10409 PTEN_C2 C2 domain of PTEN tumor-suppressor protein. This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane. 133
53187 402162 pfam10410 DnaB_bind DnaB-helicase binding domain of primase. This domain is the C-terminal region three-helical domain of primase. Primases synthesize short RNA strands on single-stranded DNA templates, thereby generating the hybrid duplexes required for the initiation of synthesis by DNA polymerases. Primases are recruited to single-stranded DNA by helicases, and this domain is the region of the primase which binds DnaB-helicase. It is associated with the Toprim domain (pfam01751) which is the central catalytic core. 56
53188 402163 pfam10411 DsbC_N Disulfide bond isomerase protein N-terminus. This is the N-terminal domain of the disulfide bond isomerase DsbC. The whole molecule is V-shaped, where each arm is a DsbC monomer of two domains linked by a hinge; and the N-termini of each monomer join to form the dimer interface at the base of the V, so are vital for dimerization. DsbC is required for disulfide bond formation and functions as a disulfide bond isomerase during oxidative protein-folding in bacterial periplasm. It also has chaperone activity. 54
53189 313610 pfam10412 TrwB_AAD_bind Type IV secretion-system coupling protein DNA-binding domain. The plasmid conjugative coupling protein TrwB forms hexamers from six structurally very similar protomers. This hexamer contains a central channel running from the cytosolic pole (made up by the AADs) to the membrane pole ending at the transmembrane pore shaped by 12 transmembrane helices, rendering an overall mushroom-like structure. The TrwB_AAD (all-alpha domain) domain appears to be the DNA-binding domain of the structure. TrwB, a basic integral inner-membrane nucleoside-triphosphate-binding protein, is the structural prototype for the type IV secretion system coupling proteins, a family of proteins essential for macromolecular transport between cells and export. 386
53190 313611 pfam10413 Rhodopsin_N Amino terminal of the G-protein receptor rhodopsin. Rhodopsin is the archetypal G-protein-coupled receptor. Such receptors participate in virtually all physiological processes, as signalling molecules. They utilize heterotrimeric guanosine triphosphate (GTP)-binding proteins to transduce extracellular signals to intracellular events. Rhodopsin is important because of the pivotal role it plays in visual signal transduction. Rhodopsin is a dimeric transmembrane protein and its intradiskal surface consists of this amino terminal domain and three loops connecting six of the seven transmembrane helices. The N-terminus is a compact domain of alpha-helical regions with breaks and bends at proline residues outside the membrane. The transmembrane part of rhodopsin is represented by 7tm_1 (pfam00001). The N-terminal domain is extracellular and is necessary for successful dimerization and molecular stability. 35
53191 402164 pfam10414 CysG_dimerizer Sirohaem synthase dimerization region. Bacterial sulfur metabolism depends on the iron-containing porphinoid sirohaem. CysG, S-adenosyl-L-methionine (SAM)-dependent bis-methyltransferase, dehydrogenase and ferrochelatase, synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer of two structurally similar protomers held together asymmetrically through a number of salt-bridges across complementary residues in the CysG_dimerizer region to produce a series of active sites, accounting for CysG's multifunctionality, catalyzing four diverse reactions: two SAM-dependent methylations, NAD+-dependent tetrapyrrole dehydrogenation and metal chelation. The CysG_dimerizer region holding the two protomers together is of 74 residues. 58
53192 402165 pfam10415 FumaraseC_C Fumarase C C-terminus. Fumarase C catalyzes the stereo-specific interconversion of fumarate to L-malate as part of the Krebs cycle. The full-length protein forms a tetramer with visible globular shape. FumaraseC_C is the C-terminal 65 residues referred to as domain 3. The core of the molecule consists of a bundle of 20 alpha-helices from the five-helix bundle of domain 2. The projections from the core of the tetramer are generated from domains 1 and 3 of each subunit. FumaraseC_C does not appear to be part of either the active site or the activation site but is helical in structure forming a little bundle. 54
53193 402166 pfam10416 IBD Transcription-initiator DNA-binding domain IBD. In Trichomonas vaginalis, thought to be the earliest extant eukaryote, the sole initiator element for control of the start of transcription is Inr, and this is recognized by the initiator binding protein IBP39. IBP39 contains an N-terminal Inr binding domain, IBD, connected via a flexible, proteolytically sensitive, linker (residues 127-145) to a C-terminal domain. The IBD structure reveals a winged-helix-wing conformation with each element binding to DNA, the central helix-turn-helix contributing the majority of the specificity-determining contacts with the Inr core motif TCAPy(T/A). The binding of IBP39 to the Inr directly recruits RNA polymerase II and in this way initiates transcription. 125
53194 402167 pfam10417 1-cysPrx_C C-terminal domain of 1-Cys peroxiredoxin. This is the C-terminal domain of 1-Cys peroxiredoxin (1-cysPrx), a member of the peroxiredoxin superfamily which protect cells against membrane oxidation through glutathione (GSH)-dependent reduction of phospholipid hydroperoxides to corresponding alcohols. The C-terminal domain is crucial for providing the extra cysteine necessary for dimerization of the whole molecule. Loss of the enzyme's peroxidase activity is associated with oxidation of the catalytic cysteine, upstream of this domain; and glutathionylation, presumably through its disruption of protein structure, facilitates access for GSH, resulting in spontaneous reduction of the mixed disulfide to the sulfhydryl and consequent activation of the enzyme. The domain is associated with family AhpC-TSA, pfam00578, which carries the catalytic cysteine. 40
53195 402168 pfam10418 DHODB_Fe-S_bind Iron-sulfur cluster binding domain of dihydroorotate dehydrogenase B. Lactococcus lactis is one of the few organisms with two dihydroorotate dehydrogenases, DHODs, A and B. The B enzyme is a prototype for DHODs in Gram-positive bacteria that use NAD+ as the second substrate. DHODB is a hetero-tetramer composed of a central homodimer of PyrDB subunits resembling the DHODA structure and two PyrK subunits along with three different cofactors: FMN, FAD, and a [2Fe-2S] cluster. The [2Fe-2S] iron-sulfur cluster binds to this C-terminal domain of the PyrK subunit, which is at the interface between the flavin and NAD binding domains and contains three beta-strands. The four cysteine residues at the N-terminal part of this domain are the ones that bind, in pairs, to the iron-sulfur cluster. The conformation of the whole molecule means that the iron-sulfur cluster is localized in a well-ordered part of this domain close to the FAD binding site. The FAD and NAD binding domains are FAD_binding_6, pfam00970 and NAD_binding_1, pfam00175. 40
53196 402169 pfam10419 TFIIIC_sub6 TFIIIC subunit. This is a family of protein subunits of TFIIIC. TFIIIC in yeast and humans is required for transcription of tRNA and 5S RNA genes by RNA polymerase III. Yeast members of this family are fused to a phosphoglycerate mutase domain. 99
53197 402170 pfam10420 IL12p40_C Cytokine interleukin-12p40 C-terminus. IL12p40_C is the largely beta stranded C-terminal, D3, domain of interleukin-12p40 or interleukin-12B. This interleukin is produced on stimulation by macrophage-engulfed micro-organisms and other stimuli, when it dimerizes with interleukin-12p35 to form a heterodimer which then binds to receptors on natural killer cells to activate them to destroy the micro-organisms. This domain contains two disulfide bridges, one of which serves to bind p40 to p35 and the other to hold the beta strands within the domain together. The cupped shape of the p35 binding interface matches the elbow-like bend between D2 and D3 in p40. The domain is often associated with family fn3, pfam00041. 85
53198 402171 pfam10421 OAS1_C 2'-5'-oligoadenylate synthetase 1, domain 2, C-terminus. This is the largely alpha-helical, C-terminal half of 2'-5'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between 320 and 344 at the extreme C-terminal end. Oligoadenylate synthetases are antiviral enzymes that counteract viral attack by degrading viral RNA. The enzyme uses ATP in 2'-specific nucleotidyl transfer reactions to synthesize 2'-5'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication. This domain is often associated with NTP_transf_2 pfam01909. 185
53199 255978 pfam10422 LRS4 Monopolin complex subunit LRS4. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. The orthologous complex in Schizosaccharomyces pombe is not required for meiosis I chromosome segregation, but is proposed to play a similar physiological role in clamping microtubule binding sites. In S.cerevisiae this subunit is called LRS4, and in S. pombe it is known as Mde4. 211
53200 402172 pfam10423 AMNp_N Bacterial AMP nucleoside phosphorylase N-terminus. This is the N-terminal domain of bacterial AMP nucleoside phosphorylase (AMNp). The N- and C-termini form distinct domains which intertwine with each other to form a stable monomer which associates with five other monomers to yield the active hexamer. The N-terminus consists of a long helix and a four-stranded sheet with a novel topology. The C-terminus binds the nucleoside whereas the N-terminus acts as the enzymatic regulatory domain. AMNp (EC:3.2.2.4) catalyzes the hydrolysis of AMP to form adenine and ribose 5-phosphate, thereby regulating intracellular AMP levels. 151
53201 402173 pfam10425 SdrG_C_C C-terminus of bacterial fibrinogen-binding adhesin. This is the C-terminal half of a bacterial fibrinogen-binding adhesin SdrG. SdrG is a Gram-positive cell-wall-anchored adhesin that allows attachment of the bacterium to host tissues via specific binding to the beta-chain of human fibrinogen (Fg). SdrG binds to its ligand with a dynamic "dock, lock, and latch" mechanism which represents a general mode of ligand-binding for structurally related cell wall-anchored proteins in most Gram-positive bacteria. The C-terminal part of SdrG(276-596) is integral to the folding of the immunoglobulin-like whole to create the docking grooves necessary for Fg binding. The domain is associated with families of Cna_B, pfam05738. 156
53202 287407 pfam10426 zf-RAG1 Recombination-activating protein 1 zinc-finger domain. This is a C2-H2 zinc-finger domain closely resembling the classical TFIIIA-type zinc-finger, CX3FX5LX2-3H, despite having a valine and a tyrosine at the core instead of a phenylalanine and a leucine, hence CX3VX1LX2YX2H. The structure, nevertheless, contains the characteristic two-stranded beta-sheet and alpha-helix of a classical zinc-finger. The domain binds one zinc and, in complex with the zinc-RING-finger domain, helps to stabilize the whole of the dimerization region of recombination activating protein 1 (RAG1). The function of the whole is to bind double-stranded DNA. 30
53203 371044 pfam10427 Ago_hook Argonaute hook. This region has been called the argonaute hook. It has been shown to bind to the Piwi domain pfam02171 of Argonaute proteins. 150
53204 402174 pfam10428 SOG2 RAM signalling pathway protein. SOG2 proteins in Saccharomyces cerevisiae are involved in cell separation and cytokinesis. 479
53205 402175 pfam10429 Mtr2 Nuclear pore RNA shuttling protein Mtr2. Mtr2 is a monomeric, dual-action, RNA-shuttle protein found in yeasts. Transport across the nuclear-cytoplasmic membrane is via the macro-molecular membrane-spanning nuclear pore complex, NPC. The pore is lined by a subset of NPC members called nucleoporins that present FG (Phe-Gly) receptors, characteristically GLFG and FXFG motifs, for shuttling RNAs and proteins. RNA cargo is bound to soluble transport proteins (nuclear export factors) such as Mex67 in yeasts, and TAP in metazoa, which pass along the pore by binding to successive FG receptors. Mtr2 when bound to Mex67 maximises this FG-binding. Mtr2 also acts independently of Mex67 in transporting the large ribosomal RNA subunit through the pore. 164
53206 402176 pfam10430 Ig_Tie2_1 Tie-2 Ig-like domain 1. 95
53207 402177 pfam10431 ClpB_D2-small C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerization, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilize the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724. 80
53208 402178 pfam10432 bact-PGI_C Bacterial phospho-glucose isomerase C-terminal SIS domain. This is the C-terminal SIS domain of a bacterial phospho-glucose isomerase EC:5.3.1.9 protein which is similar to eukaryote homologs to the extent that the sequence includes the cluster of threonines and serines that forms the sugar phosphate-binding site in conventional PGI. This domain contributes a good proportion of the active catalytic site residues. This PGI uses the same catalytic mechanisms for both glucose ring-opening and isomerisation for the interconversion of glucose 6-phosphate to fructose 6-phosphate. It is associated with family SIS, pfam01380. 147
53209 402179 pfam10433 MMS1_N Mono-functional DNA-alkylating methyl methanesulfonate N-term. MMS1 is a protein that protects against replication-dependent DNA damage in Saccharomyces cerevisiae. MMS1 belongs to the DDB1 family of cullin 4 adaptors and the two proteins are homologous. MMS1 bridges the interaction of MMS22 and Crt10 with Cul8/Rtt101. Cul8/Rtt101 is a cullin protein involved in the regulation of DNA replication subsequent to DNA damage. The N-terminal region of MMS1 and the C-terminal of MMS22 are required for the MMS1-MMS22 interaction. The human HIV-1 virion-associated protein Vpr assembles with DDB1 through interaction with DCAF1 (chromatin assembly factor) to form an E3 ubiquitin ligase that targets cellular substrates for proteasome-mediated degradation and subsequent G2 arrest. 486
53210 371048 pfam10434 MAM1 Monopolin complex protein MAM1. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. MAM1 is required in S. cerevisiae for monopolar attachment. 255
53211 402180 pfam10435 BetaGal_dom2 Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyzes the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C-terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N-terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, pfam01301, which is N-terminal to it, but itself has no metazoan members. 180
53212 402181 pfam10436 BCDHK_Adom3 Mitochondrial branched-chain alpha-ketoacid dehydrogenase kinase. Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c pfam02518 which is towards the C-terminal. 159
53213 402182 pfam10437 Lip_prot_lig_C Bacterial lipoate protein ligase C-terminus. This is the C-terminal domain of a bacterial lipoate protein ligase. There is no conservation between this C-terminus and that of vertebrate lipoate protein ligase C-termini, but both are associated with the domain BPL_LipA_LipB pfam03099, further upstream. This domain is required for adenylation of lipoic acid by lipoate protein ligases. The domain is not required for transfer of lipoic acid from the adenylate to the lipoyl domain. Upon adenylation, this domain rotates 180 degrees away from the active site cleft. Therefore, the domain does not interact with the lipoyl domain during transfer. 84
53214 402183 pfam10438 Cyc-maltodext_C Cyclo-malto-dextrinase C-terminal domain. This domain is at the very C-terminus of cyclo-malto-dextrinase proteins and consists of 8 beta strands, is largely globular and appears to help stabilize the active sites created by upstream domains, Cyc-maltodext_N pfam09087, and Alpha-amylase pfam00128. Cyclo-malto-dextrinases hydrolyze cyclodextrans to maltose and glucose and catalyze trans-glycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. 76
53215 402184 pfam10439 Bacteriocin_IIc Bacteriocin class II with double-glycine leader peptide. This is a family of bacteriocidal bacteriocins secreted by Streptococcal species in order to kill off closely-related competitor Gram-positives. The sequence includes the peptide precursor, this being cleaved off proteolytically at the double-glycine. The family does not carry the YGNGVXC motif characteristic of pediocin-like Bacteriocins, Bacteriocin_II pfam01721. The producer bacteria are protected from the effects of their own bacteriocins by production of a specific immunity protein which is co-transcribed with the genes encoding the bacteriocins, eg family EntA_Immun pfam08951. The bacteriocins are structurally more specific than their immunity-protein counterparts. Typically, production of the bacteriocin gene is from within an operon carrying up to 6 genes including a typical two-component regulatory system (R and H), a small peptide pheromone (C), and a dedicated ABC transporter (A and -B) as well as an immunity protein. The ABC transporter is thought to recognize the N termini of both the pheromone and the bacteriocins and to transport these peptides across the cytoplasmic membrane, concurrent with cleavage at the conserved double-glycine motif. Cleaved extracellular C can then bind to the sensor kinase, H, resulting in activation of R and up-regulation of the entire gene cluster via binding to consensus sequences within each promoter. It seems likely that this whole regulon is carried on a transmissible plasmid which is passed between closely related Firmicute species since many clinical isolates from different Firmicutes can produce at least two bacteriocins, and the same bacteriocins can be produced by different species. 58
53216 402185 pfam10440 WIYLD Ubiquitin-binding WIYLD domain. This presumed domain has been predicted to contain three alpha helices. The domain was named the WIYLD domain based on the pattern of most conserved residues. It binds ubiquitin. In the Arabidopsis thaliana histone-lysine N-methyltransferase SUVR4, binding of ubiquitin to this domain stimulates enzymatic activity and converts its activity from a strict dimethylase to a di/trimethylase. 58
53217 402186 pfam10441 Urb2 Urb2/Npa2 family. This family includes the Urb2 protein from yeast that is involved in ribosome biogenesis. 213
53218 402187 pfam10442 FIST_C FIST C domain. The FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids. 135
53219 371055 pfam10443 RNA12 RNA12 protein. This family includes RNA12 from S. cerevisiae. That protein contains an RRM domain. This region is C-terminal to that and includes a P-loop motif suggesting this region binds to NTP. The RNA12 protein is involved in pre-rRNA maturation. 429
53220 402188 pfam10444 Nbl1_Borealin_N Nbl1 / Borealin N terminal. Nbl1 is a subunit of the conserved CPC, the chromosomal passenger complex, which regulates mitotic chromosome segregation. In Fungi and Animalia, this complex consists of the kinase Aurora B/AIR-2/Ipl1p, INCENP/ICP-1/Sli15p, and Survivin/BIR-1/Bir1p. In Animalia, a fourth subunit (Borealin/Dasra/CSC-1) is required for targeting CPC to centromeres and central spindles. Nbl1 has been shown in budding yeast to be essential for viability, and for CPC localization, stability, integrity, and function. The N-terminus of Borealin is homologous to Nbl1. This family contains both Nbl1, and the N terminal region of Borealin. 55
53221 402189 pfam10445 DUF2456 Protein of unknown function (DUF2456). This is a family of uncharacterized proteins. 94
53222 402190 pfam10446 DUF2457 Protein of unknown function (DUF2457). This is a family of uncharacterized proteins. 458
53223 402191 pfam10447 EXOSC1 Exosome component EXOSC1/CSL4. This family of proteins are components of the exosome 3'->5' exoribonuclease complex. The exosome mediates degradation of unstable mRNAs that contain AU-rich elements (AREs) within their 3' untranslated regions. 112
53224 402192 pfam10448 POC3_POC4 20S proteasome chaperone assembly proteins 3 and 4. This family contains chaperones of the 20S proteasome which function in early 20S proteasome assembly. The structures of two of the proteins in this family (POC3 and POC4) have been solved, and they closely resemble those of the mammalian proteasome assembling chaperone PAC3, although there is little sequence similarity between them. 136
53225 402193 pfam10450 POC1 POC1 chaperone. In yeast, POC1 is a chaperone of the 20S proteasome which functions in early 20S proteasome assembly. 223
53226 371062 pfam10451 Stn1 Telomere regulation protein Stn1. The budding yeast protein Stn1 is a DNA-binding protein which has specificity for telomeric DNA. Structural profiling has predicted an OB-fold. This domain is the N-terminal part of the molecule, which adopts the OB fold. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution. 252
53227 371063 pfam10452 TCO89 TORC1 subunit TCO89. TCO89 is a component of the TORC1 complex. TORC1 is responsible for a wide range of rapamycin-sensitive cellular activities. 546
53228 402194 pfam10453 NUFIP1 Nuclear fragile X mental retardation-interacting protein 1 (NUFIP1). Proteins in this family have been implicated in the assembly of the large subunit of the ribosome and in telomere maintenance. Some proteins in this family contain a CCCH zinc finger. This family contains a protein called human fragile X mental retardation-interacting protein 1, which is known to bind RNA and is phosphorylated upon DNA damage. 53
53229 402195 pfam10454 DUF2458 Protein of unknown function (DUF2458). This is a family of uncharacterized proteins. 172
53230 402196 pfam10455 BAR_2 Bin/amphiphysin/Rvs domain for vesicular trafficking. This Pfam entry includes proteins that are not matched by pfam03114. 286
53231 313646 pfam10456 BAR_3_WASP_bdg WASP-binding domain of Sorting nexin protein. The C-terminal region of the Sorting nexin group of proteins appears to carry a BAR-like (Bin/amphiphysin/Rvs) domain. This domain is very diverse and the similarities with other BAR domains are few. In the Sorting nexins it is associated with family PX, pfam00787.13, and in combination with PX appears to be necessary to bind WASP along with p85 to form a multimeric signalling complex. 236
53232 402197 pfam10457 MENTAL Cholesterol-capturing domain. Human metastatic lymph node (MLN) 64 is a late endosomal membrane protein, and carries this MENTAL (MLN64 N-terminal) domain at its N-terminus. The domain is composed of four trans-membrane helices with three short intervening loops. The function of the domain is to capture cholesterol and pass it to the associated START domain pfam01852 for transfer to a cytosolic acceptor protein or membrane. In mammals, the MENTAL domain is involved in the localization of MLN64 and MENTHO in late endosomes, and also in homo- and hetero-interactions of these two proteins. 176
53233 402198 pfam10458 Val_tRNA-synt_C Valyl tRNA synthetase tRNA binding arm. This domain is found at the C-terminus of Valyl tRNA synthetases. 66
53234 402199 pfam10459 Peptidase_S46 Peptidase S46. Dipeptidyl-peptidase 7 (DPP-7) is the best characterized member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 694
53235 378438 pfam10460 Peptidase_M30 Peptidase M30. This family contains the metallopeptidase hyicolysin. Hyicolysin has a zinc ion which is liganded by two histidine and one glutamate residue. 364
53236 402200 pfam10461 Peptidase_S68 Peptidase S68. This family of serine peptidases contains PIDD proteins. PIDD forms a complex with RAIDD and procaspase-2 that is known as the 'PIDDosome'. The PIDDosome forms when DNA damage occurs and either activates NF-kappaB, leading to cell survival, or caspase-2, which leads to apoptosis. 34
53237 287440 pfam10462 Peptidase_M66 Peptidase M66. This family of metallopeptidases contains StcE, a virulence factor found in Shiga toxigenic Escherichia coli organisms. StcE peptidase cleaves C1 esterase inhibitor. 306
53238 402201 pfam10463 Peptidase_U49 Peptidase U49. This family contains Lit peptidase from Escherichia coli. Lit protease functions in bacterial cell death in response to infection by bacteriophage T4. Following binding of Gol peptide to domains II and III of elongation factor Tu, the Lit peptidase cleaves domain I of the elongation factor. This prevents binding of guanine nucleotides, shuts down translation and leads to cell death. 198
53239 287442 pfam10464 Peptidase_U40 Peptidase U40. This family contains P5 murein endopeptidase from bacteriophage phi-6. P5 murein endopeptidase has lytic activity against several gram-negative bacteria. It is thought that the enzyme cleaves the cell wall peptide bridge formed by meso-2,6-diaminopimelic acid and D-Ala. 212
53240 287443 pfam10465 Inhibitor_I24 PinA peptidase inhibitor. PinA inhibits the endopeptidase La. It binds to the La homotetramer but does not interfere with the ATP binding site or the active site of La. 140
53241 402202 pfam10466 Inhibitor_I34 Saccharopepsin inhibitor I34. The saccharopepsin inhibitor is highly specific for the aspartic peptidase saccharopepsin. It is largely unstructured in the absence of saccharopepsin, but in the presence, the inhibitor undergoes a conformation change forming an almost perfect alpha-helix from Asn2 to Met32 in the active site cleft of the peptidase. 69
53242 313652 pfam10467 Inhibitor_I48 Peptidase inhibitor clitocypin. Clitocypin binds and inhibits cysteine proteinases. It has no similarity to any other known cysteine proteinase inhibitors but bears some similarity to a lectin-like family of proteins from mushrooms. 142
53243 402203 pfam10468 Inhibitor_I68 Carboxypeptidase inhibitor I68. This is a family of tick carboxypeptidase inhibitors. 74
53244 402204 pfam10469 AKAP7_NLS AKAP7 2'5' RNA ligase-like domain. AKAP7_NLS is the N-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes. AKAP7_NLS carries the nuclear localization signal (NLS) KKRKK, that indicates the cellular destiny of this anchor protein. Binding to the regulatory subunits RI and RII of PKA is mediated via the family AKAP7_RIRII_bdg at the C-terminus. This family represents a region that contains two 2'5' RNA ligase like domains pfam02834. Presumably this domain carries out some as yet unknown enzymatic function. 207
53245 371072 pfam10470 AKAP7_RIRII_bdg PKA-RI-RII subunit binding domain of A-kinase anchor protein. AKAP7_RIRII_bdg is the C-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA, for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes, by binding to its regulatory subunits, RI and RII, hence being known as a dual-specific AKAP. The 25 crucial amino acids of RII-binding domains in general form structurally conserved amphipathic helices with unrelated sequences; hydrophobic amino acid residues form the backbone of the interaction and hydrogen bond- and salt-bridge-forming amino acid residues increase the affinity of the interaction. The N-terminus, of family AKAP7_NLS, carries the nuclear localization signal. 57
53246 402205 pfam10471 ANAPC_CDC26 Anaphase-promoting complex APC subunit CDC26. The anaphase-promoting complex (APC) or cyclosome is a cell cycle-regulated ubiquitin-protein ligase that regulates important events in mitosis such as the initiation of anaphase and exit from telophase. The APC, in conjunction with other enzymes, assembles multi-ubiquitin chains on a variety of regulatory proteins thereby targeting them for proteolysis by the 26S proteasome. CDC26 is one of the nine or so subunits identified within APC but its exact function is not known. The APC/C becomes active at the metaphase/anaphase transition and remains active during G1 phase. One mechanism linked to activation of the APC/C is phosphorylation. The yeast APC/C is composed of at least 13 subunits, but the function of many of the subunits is unknown. Hcn1 is the smallest subunit of the S. pombe APC/C, and is found to be essential for cell viability, APC/C integrity, and proper APC/C regulation. In addition, Hcn1 phosphorylation indicates a specific role for the phosphorylation of this subunit late in the cell cycle. 65
53247 371074 pfam10472 CReP_N eIF2-alpha phosphatase phosphorylation constitutive repressor. This is the conserved N-terminal domain of CReP, constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, catalytic subunit. It functions in the dephosphorylation of eIF2-alpha under basal conditions in the absence of stress. In response to translation inhibition, there is reduced synthesis of the labile CReP that contributes to elevated levels of eIF2-alpha phosphorylation. The C-terminus, family PP1c, is shared with the apoptosis-associated protein Gadd34 and herpes simplex virus. 411
53248 402206 pfam10473 CENP-F_leu_zip Leucine-rich repeats of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance. 140
53249 402207 pfam10474 DUF2451 Protein of unknown function C-terminus (DUF2451). This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2450. 229
53250 402208 pfam10475 Vps54_N Vacuolar-sorting protein 54, of GARP complex. This is a family of vacuolar-sorting proteins 54, from eukaryotes. Along with VPS52 and VPS53 this forms the Golgi-associated retrograde protein complex GARP. VPS54 is separated into N- and C-terminal regions each of which has a different function. This N-terminal family is important for GARP complex assembly and stability, whereas the C-terminal domain, pfam07928, brings about localization to an early endocytic compartment. 291
53251 402209 pfam10476 DUF2448 Protein of unknown function C-terminus (DUF2448). The family DUF2349 is the N-terminal part of this family. This protein is found in eukaryotes but its function is not known. 204
53252 371079 pfam10477 EIF4E-T Nucleocytoplasmic shuttling protein for mRNA cap-binding EIF4E. EIF4E-T is the transporter protein for shuttling the mRNA cap-binding protein EIF4E protein, targeting it for nuclear import. EIF4E-T contains several key binding domains including two functional leucine-rich NESs (nuclear export signals) between residues 438-447 and 613-638 in the human protein. The other two binding domains are an EIF4E-binding site, between residues 27-42 in Q9EST3, and a bipartite NLS (nuclear localization signals) between 194-211, and these lie in family EIF4E-T_N. EIF4E is the eukaryotic translation initiation factor 4E that is the rate-limiting factor for cap-dependent translation initiation. 646
53253 313663 pfam10479 FSA_C Fragile site-associated protein C-terminus. This is the conserved C-terminal half of the protein KIAA1109 which is the fragile site-associated protein FSA. Genome-wide-association studies showed this protein to be linked to the susceptibility to coeliac disease. The protein may also be associated with polycystic kidney disease. 726
53254 119000 pfam10480 ICAP-1_inte_bdg Beta-1 integrin binding protein. ICAP-1 is a serine/threonine-rich protein that binds to the cytoplasmic domains of beta-1 integrins in a highly specific manner, binding to a NPXY sequence motif on the beta-1 integrin. The cytoplasmic domains of integrins are essential for cell adhesion, and the phosphorylation of ICAP-1 upon interaction with the cell-matrix implies an important role of ICAP-1 during integrin-dependent cell adhesion. Overexpression of ICAP-1 strongly reduces the integrin-mediated cell spreading on extracellular matrix and inhibits both Cdc42 and Rac1. In addition, ICAP-1 induces release of Cdc42 from cellular membranes and prevents the dissociation of GDP from this GTPase. An additional function of ICAP-1 is to promote differentiation of osteoprogenitors by supporting their condensation through modulating the integrin high affinity state. 200
53255 402210 pfam10481 CENP-F_N Cenp-F N-terminal domain. Mitosin or centromere-associated protein-F (Cenp-F) is found bound across the centromere as one of the proteins of the outer layer of the kinetochore. Most of the kinetochore/centromere functions appear to depend upon binding of the C-terminal part of the molecule, whereas the N-terminal part, here, may be a cytoplasmic player in controlling the function of microtubules and dynein. 304
53256 402211 pfam10482 CtIP_N tumor-suppressor protein CtIP N-terminal domain. CtIP is predominantly a nuclear protein that complexes with both BRCA1 and the BRCA1-associated RING domain protein (BARD1). At the protein level, CtIP expression varies with cell cycle progression in a pattern identical to that of BRCA1. Thus, the steady-state levels of CtIP polypeptides, which remain low in resting cells and G1 cycling cells, increase dramatically as dividing cells traverse the G1/S boundary. CtIP can potentially modulate the functions ascribed to BRCA1 in transcriptional regulation, DNA repair, and/or cell cycle checkpoint control. This N-terminal domain carries a coiled-coil region and is essential for homodimerization of the protein. The C-terminal domain is family pfam08573. 119
53257 402212 pfam10483 Elong_Iki1 Elongator subunit Iki1. This family is a component of the RNA polymerase II elongator complex. This complex is involved in elongation of RNA polymerase II transcription and in modification of wobble nucleosides in tRNA. 278
53258 402213 pfam10484 MRP-S23 Mitochondrial ribosomal protein S23. MRP-S23 is one of the proteins that makes up the 55S ribosome in eukaryotes from nematodes to humans. It does not appear to carry any common motifs, either RNA binding or ribosomal protein motifs. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble coordinately with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system. MRP-S23 is significantly up-regulated in uterine cancer cells. 124
53259 402214 pfam10486 PI3K_1B_p101 Phosphoinositide 3-kinase gamma adapter protein p101 subunit. Class I PI3Ks are dual-specific lipid and protein kinases involved in numerous intracellular signaling pathways. Class IB PI3K, p110gamma, is mainly activated by seven-transmembrane G-protein-coupled receptors (GPCRs), through its regulatory subunit p101 and G-protein beta-gamma subunits. 860
53260 402215 pfam10487 Nup188 Nucleoporin subcomplex protein binding to Pom34. This is one of the many peptides that make up the nucleoporin complex (NPC), and is found across eukaryotes. The Nup188 subcomplex (Nic96p-Nup188p-Nup192p-Pom152p) is one of at least six that make up the NPC, and as such is symmetrically localized on both faces of the NPC at the nuclear end, being integrally bound to the C-terminus of Pom34p. 916
53261 402216 pfam10488 PP1c_bdg Phosphatase-1 catalytic subunit binding region. This conserved C-terminus appears to be a protein phosphatase-1 catalytic subunit (PP1C) binding region, which may in some circumstances also be retroviral in origin since it is found in both herpes simplex virus and in mouse and man. This domain is found in Gadd-34 apoptosis-associated proteins as well as the constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, regulatory (inhibitor) subunit 15b, otherwise known as CReP. Diverse stressful conditions are associated with phosphorylation of the alpha subunit of eukaryotic translation initiation factor 2 (eIF2-alpha) on serine 51. This signaling event, which is conserved from yeast to mammals, negatively regulates the guanine nucleotide exchange factor, eIF2-B and inhibits the recycling of eIF2 to its active GTP bound form. In mammalian cells eIF2-alpha phosphorylation emerges as an important event in stress signaling that impacts on gene expression at both the translational and transcriptional levels. 287
53262 402217 pfam10490 CENP-F_C_Rb_bdg Rb-binding domain of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. This domain is at the very C-terminus of the C-terminal coiled-coil, and is one of the key Rb-binding domains. 47
53263 402218 pfam10491 Nrf1_DNA-bind NLS-binding and DNA-binding and dimerization domains of Nrf1. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein there is also an NLS domain at 88-116, and a DNA binding and dimerization domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity. 213
53264 313673 pfam10492 Nrf1_activ_bdg Nrf1 activator activation site binding domain. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is an activation domain at 303-469, the most conserved part of which is this domain 446-469. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity. The family Nrf1_DNA-bind is associated with this domain towards the N-terminal, as is the N terminal of the activation domain. 82
53265 402219 pfam10493 Rod_C Rough deal protein C-terminal region. Rod, the Rough deal protein, displays a dynamic intracellular staining pattern, localising first to kinetochores in pro-metaphase, but moving to kinetochore microtubules at metaphase. Early in anaphase the protein is once again restricted to the kinetochores, where it persists until the end of telophase. This behaviour is in all respects similar to that described for ZW10, and indeed the two proteins function together, localization of each depending upon the other. These two proteins are found at the kinetochore in complex with a third, Zwilch, in both flies and humans. The C-terminus is the most conserved part of the protein. During pro-metaphase, the ZW10-Rod complex, dynein/dynactin, and Mad2 all accumulate on unattached kinetochores; microtubule capture leads to Mad2 depletion as it is carried off by dynein/dynactin; ZW10-Rod complex accumulation continues, replenishing kinetochore dynein. The continuing recruitment of the ZW10-Rod complex during metaphase may serve to maintain adequate dynein/dynactin complex on kinetochores for assisting chromatid movement during anaphase. The ZW10-Rod complex acts as a bridge whose association with Zwint-1 links Mad1 and Mad2, components that are directly responsible for generating the diffusible 'wait anaphase' signal, to a structural, inner kinetochore complex containing Mis12 and KNL-1AF15q14, the last of which has been proved to be essential for kinetochore assembly in C. elegans. Removal of ZW10 or Rod inactivates the mitotic checkpoint. 560
53266 402220 pfam10494 Stk19 Serine-threonine protein kinase 19. This serine-threonine protein kinase number 19 is expressed from the MHC and predominantly in the nucleus. Protein kinases are involved in signal transduction pathways and play fundamental roles in the regulation of cell functions. This is a novel Ser/Thr protein kinase, that has Mn2+-dependent protein kinase activity that phosphorylates alpha-casein at Ser/Thr residues and histone at Ser residues. It can be covalently modified by the reactive ATP analogue 5'-p-fluorosulfonylbenzoyladenosine in the absence of ATP, and this modification is prevented in the presence of 1 mM ATP, indicating that the kinase domain is capable of binding ATP. 244
53267 402221 pfam10495 PACT_coil_coil Pericentrin-AKAP-450 domain of centrosomal targeting protein. This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly. 77
53268 402222 pfam10496 Syntaxin-18_N SNARE-complex protein Syntaxin-18 N-terminus. This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes. 86
53269 402223 pfam10497 zf-4CXXC_R1 Zinc-finger domain of monoamine-oxidase A repressor R1. R1 is a transcription factor repressor that inhibits monoamine oxidase A gene expression. This domain is a four-CXXC zinc finger putative DNA-binding domain found at the C-terminal end of R1. The domain carries 12 cysteines of which four pairs are of the CXXC type. 95
53270 402224 pfam10498 IFT57 Intra-flagellar transport protein 57. Eukaryotic cilia and flagella are specialized organelles found at the periphery of cells of diverse organisms. Intra-flagellar transport (IFT) is required for the assembly and maintenance of eukaryotic cilia and flagella, and consists of the bidirectional movement of large protein particles between the base and the distal tip of the organelle. IFT particles contain multiple copies of two distinct protein complexes, A and B, which contain at least 6 and 11 protein subunits. IFT57 is part of complex B but is not, however, required for the core subunits to stay associated. This protein is known as Huntington-interacting protein-1 in humans. 359
53271 402225 pfam10500 SR-25 Nuclear RNA-splicing-associated protein. SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissues. At the N-terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localization signals strongly implies that this is a nuclear protein that may contribute to RNA splicing. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1) signalling pathway. 228
53272 402226 pfam10501 Ribosomal_L50 Ribosomal subunit 39S. The 39S ribosomal protein appears to be a subunit of one of the larger mitochondrial 66S or 70S units. Under conditions of ethanol-stress in rats the larger subunit is largely dissociated into its smaller components. In E. coli, in the absence of the enzyme pseudouridine synthase (RluD), there is an accumulation of 50S and 30S subunits and the appearance of abnormal particles (62S and 39S), with concomitant loss of 70S ribosomes. 109
53273 402227 pfam10502 Peptidase_S26 Signal peptidase, peptidase S26. This is a family of membrane signal serine endopeptidases which function in the processing of newly-synthesized secreted proteins. Peptidase S26 removes the hydrophobic, N-terminal, signal peptides as proteins are translocated across membranes. The active site residues take the form of a catalytic dyad that is Ser, Lys in subfamily S26A; the Ser is the nucleophile in catalysis, and the Lys is the general base. 162
53274 402228 pfam10503 Esterase_phd Esterase PHB depolymerase. This family of proteins include acetyl xylan esterases (AXE), feruloyl esterases (FAE), and poly(3-hydroxybutyrate) (PHB) depolymerases. 219
53275 371098 pfam10504 DUF2452 Protein of unknown function (DUF2452). This protein is found in eukaryotes but its function is unknown. 152
53276 402229 pfam10505 NARG2_C NMDA receptor-regulated gene protein 2 C-terminus. The transition of neuronal cells from precursor to mature state is regulated by the N-methyl-d-aspartate (NMDA) receptor, a glutamate-gated ion channel that is permeable to Ca2+. NMDA receptors probably mediate this activity by permitting expression of NARG2. NARG2 is transiently expressed, being a regulatory protein that is present in the nucleus of dividing cells and then down-regulated as progenitors exit the cell cycle and begin to differentiate. NARG2 contains repeats of (S/T)PXX, (11 in mouse, six in human), a putative DNA-binding motif that is found in many gene-regulatory proteins including Kruppel, Hunchback and Antennapedia. 206
53277 402230 pfam10506 MCC-bdg_PDZ PDZ domain of MCC-2 bdg protein for Usher syndrome. The protein has a high homology to the tumor suppressor MCC (mutated in colon cancer; or MCC1 hereafter) and was named MCC2. MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumor suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75. 65
53278 402231 pfam10507 TMEM65 Transmembrane protein 65. TMEM65 is an intercalated disc protein that interacts with connexin 43 (Cx43) and is required for correct localization of Cx43 to the intercalated disc. It is essential for cardiac function in zebrafish. 108
53279 371102 pfam10508 Proteasom_PSMB Proteasome non-ATPase 26S subunit. The 26S proteasome, a eukaryotic ATP-dependent, dumb-bell shaped, protease complex with a molecular mass of approx 2000 kDa consists of a central 20S proteasome, functioning as a catalytic machine, and two large V-shaped terminal modules, having possible regulatory roles, composed of multiple subunits of 25-110 kDa attached to the central portion in opposite orientations. It is responsible for degradation of abnormal intracellular proteins, including oxidatively damaged proteins, and may play a role as a component of a cellular anti-oxidative system. Expression of catalytic core subunits including PSMB5 and peptidase activities of the proteasome were elevated following incubation with 3-methylcholanthrene. The 20S proteasome comprises a cylindrical stack of four rings, two outer rings formed by seven alpha-subunits (alpha1-alpha7) and two inner rings of seven beta-subunits (beta1-beta7). Two outer rings of alpha subunits maintain structure, while the central beta rings contain the proteolytic active core subunits beta1 (PSMB6), beta2 (PSMB7), and beta5 (PSMB5). Expression of PSMB5 can be altered by chemical reactants, such as 3-methylcholanthrene. 497
53280 402232 pfam10509 GalKase_gal_bdg Galactokinase galactose-binding signature. This is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry. The function of this domain appears to be to bind galactose, and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6. This domain is associated with the families GHMP_kinases_C, pfam08544 and GHMP_kinases_N, pfam00288. 50
53281 402233 pfam10510 PIG-S Phosphatidylinositol-glycan biosynthesis class S protein. PIG-S is one of several key, core, components of the glycosylphosphatidylinositol (GPI) trans-amidase complex that mediates GPI anchoring in the endoplasmic reticulum. Anchoring occurs when a protein's C-terminal GPI attachment signal peptide is replaced with a pre-assembled GPI. Mammalian GPITransamidase consists of at least five components: Gaa1, Gpi8, PIG-S, PIG-T, and PIG-U, all five of which are required for function. It is possible that Gaa1, Gpi8, PIG-S, and PIG-T form a tightly associated core that is only weakly associated with PIG-U. The exact function of PIG-S is unclear. 497
53282 371104 pfam10511 Cementoin Trappin protein transglutaminase binding domain. Trappin-2, itself a protease inhibitor, has this unique N-terminal domain that enables it to become cross-linked to extracellular matrix proteins by transglutaminase. This domain contains several repeated motifs with the consensus sequence Gly-Gln-Asp-Pro-Val-Lys, and these together can anchor the whole molecule to extracellular matrix proteins, such as laminin, fibronectin, beta-crystallin, collagen IV, fibrinogen, and elastin, by transglutaminase-catalyzed cross-links. The whole domain is rich in glutamine and lysine, thus allowing transglutaminase(s) to catalyze the formation of an intermolecular epsilon-(gamma-glutamyl)lysine isopeptide bond. Cementoin is associated with the WAP family, pfam00095, at the C-terminus. 17
53283 402234 pfam10512 Borealin Cell division cycle-associated protein 8. The chromosomal passenger complex of Aurora B kinase, INCENP, and Survivin has essential regulatory roles at centromeres and the central spindle in mitosis. Borealin is also a member of the complex. Approximately half of Aurora B in mitotic cells is complexed with INCENP, Borealin, and Survivin. Depletion of Borealin by RNA interference delays mitotic progression and results in kinetochore-spindle mis-attachments and an increase in bipolar spindles associated with ectopic asters. 120
53284 402235 pfam10513 EPL1 Enhancer of polycomb-like. This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes. 166
53285 402236 pfam10514 Bcl-2_BAD Pro-apoptotic Bcl-2 protein, BAD. BAD is a Bcl-2 homology domain 3 (BH3)-only pro-apoptotic member of the Bcl-2 protein family that is regulated by phosphorylation in response to survival factors. Binding of BAD to mitochondria is thought to be exclusively mediated by its BH3 domain. Membrane localization of BAD mediates membrane translocation of Bcl-XL. The C-terminal part of BAD is sufficient for membrane binding. There are two segments with differing lipid-binding preferences, LBD1 and LBD2, that are responsible for this binding: (i) LBD1 located in the proximity of the BH3 domain (amino acids 122-131) and (ii) LBD2, the putative C-terminal alpha-helix-5. Phosphorylation-regulated 14-3-3 protein binding may expose the cholesterol-preferring LBD1 and bury the LBD2, thereby mediating translocation of BAD to raft-like micro-domains. 166
53286 402237 pfam10515 APP_amyloid beta-amyloid precursor protein C-terminus. This is the amyloid, C-terminal, protein of the beta-Amyloid precursor protein (APP) which is a conserved and ubiquitous transmembrane glycoprotein strongly implicated in the pathogenesis of Alzheimer's disease but whose normal biological function is unknown. The C-terminal 100 residues are released and aggregate into amyloid deposits which are strongly implicated in the pathology of Alzheimer's disease plaque-formation. The domain is associated with family A4_EXTRA, pfam02177, further towards the N-terminus. 52
53287 402238 pfam10516 SHNi-TPR SHNi-TPR. SHNi-TPR family members contain a reiterated sequence motif that is an interrupted form of TPR repeat. 38
53288 402239 pfam10517 DM13 Electron transfer DM13. The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme. 104
53289 402240 pfam10518 TAT_signal TAT (twin-arginine translocation) pathway signal sequence. 26
53290 402241 pfam10520 TMEM189_B_dmain B domain of TMEM189, localization domain. TMEM189_B is the B domain or probable localization domain of the transmembrane protein TMEM189 which in some mammals is fused with Kua ubiquitin-conjugation E2 enzyme proteins. The domain is also found on fatty acid saturase FAD4 in Arabidopsis. 176
53291 402242 pfam10521 Tti2 Tti2 family. Budding yeast Tti2 is a subunit of the ASTRA complex, which is involved in chromatin remodelling. Tti2 homolog from humans, TELO2-interacting protein 2, is part of the TTT complex that is involved in the cellular resistance to DNA damage stresses. 282
53292 371112 pfam10522 RII_binding_1 RII binding domain. This domain is found in a wide variety of AKAPs (A kinase anchoring proteins). The domain is also found on microtubule-associated proteins. 19
53293 402243 pfam10523 BEN BEN domain. The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription. 77
53294 371114 pfam10524 NfI_DNAbd_pre-N Nuclear factor I protein pre-N-terminus. The Nuclear factor I (NFI) family of site-specific DNA-binding proteins (also known as CTF or CAAT box transcription factor) functions both in viral DNA replication and in the regulation of gene expression in higher organisms. The N-terminal 200 residues contain the DNA-binding and dimerization domain, but also have an 8-47 residue highly conserved region 5' of this, whose function is not known. Deletion of the N-terminal 200 amino acids removes the DNA-binding activity, dimerization-ability and the stimulation of adenovirus DNA replication. 41
53295 402244 pfam10525 Engrail_1_C_sig Engrailed homeobox C-terminal signature domain. Engrailed homeobox proteins are characterized by the presence of a conserved region of some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. This domain of approximately 20 residues forms a kind of a signature pattern for this subfamily of proteins. 31
53296 402245 pfam10528 GLEYA GLEYA domain. The GLEYA domain is related to lectin-like binding domains found in the S. cerevisiae Flo proteins and the C. glabrata Epa proteins. It is a carbohydrate-binding domain that is found in fungal adhesins (also referred to as agglutinins or flocculins). Adhesins with a GLEYA domain possess a typical N-terminal signal peptide and a domain of conserved sequence repeats, but lack glycosylphosphatidylinositol (GPI) anchor attachment signals. They contain a conserved motif G(M/L)(E/A/N/Q)YA, hence the name GLEYA. Based on sequence homology, it is suggested that the GLEYA domain would predominantly contain beta sheets. The GLEYA domain is also found in S. pombe putative cell agglutination protein fta5, thought to be a kinetochore protein (Sim4 complex subunit); however, no direct evidence for kinetochore association has been found. Furthermore, a global protein localization study in S. pombe identified it as a secreted protein localized to the Golgi complex. 91
53297 402246 pfam10529 Hist_rich_Ca-bd Histidine-rich Calcium-binding repeat region. This is a histidine-rich calcium binding repeat which appears in proteins called histidine-rich-calcium binding proteins (HRC). HRC is a high capacity, low affinity Ca2+-binding protein, residing in the lumen of the sarcoplasmic reticulum. HRC binds directly to triadin. This binding interaction occurs between the histidine-rich region of HRC and multiple clusters of charged amino acids, named as the KEKE motifs, in the lumenal domain of triadin. The region in which this repeat is found in many copies is long and variable but is the acidic region of the protein. There is also a cysteine-rich region further towards the C-terminus. HRC may regulate sarcoplasmic reticular calcium transport and play a critical role in maintaining calcium homeostasis and function in the heart. HRC is a candidate regulator of sarcoplasmic reticular calcium uptake. 15
53298 287498 pfam10530 Toxin_35 Toxin with inhibitor cystine knot ICK or Knottin scaffold. Spider toxins of the CSTX family are ion channel toxins containing an inhibitor cystine knot (ICK) structural motif or Knottin scaffold. The four disulfide bonds present in the CSTX spider toxin family are arranged in the following pattern: 1-4, 2-5, 3-8 and 6-7. CSTX-1 is the most important component of C. salei venom in terms of relative abundance and toxicity and therefore is likely to contribute significantly to the overall toxicity of the whole venom. CSTX-1 blocked rat neuronal L-type, but no other types of HVA Cav channels. Interestingly, the omega-toxins from Phoneutria nigriventer venom (another South American species also belonging to the Ctenidae family) are included as they carry the same disulfide bond arrangement, suggesting that CSTX-1 may interact with Cav channels. Calcium ion voltage channel heteromultimer containing an L-type pore-forming alpha1-subunit is the most probable candidate for the molecular target of CSTX-1 and these toxins. 61
53299 402247 pfam10531 SLBB SLBB domain. 56
53300 119052 pfam10532 Plant_all_beta Plant specific N-all beta domain. This domain was identified by Babu and colleagues. It is found associated with the WRKY domain pfam03106. 114
53301 402248 pfam10533 Plant_zn_clust Plant zinc cluster domain. This zinc binding domain was identified by Babu and colleagues and found associated with the WRKY domain pfam03106. 42
53302 402249 pfam10534 CRIC_ras_sig Connector enhancer of kinase suppressor of ras. The CRIC - Connector enhancer of kinase suppressor of ras - domain functions as a scaffold in several signal cascades and acts on proliferation, differentiation and apoptosis. 93
53303 402250 pfam10536 PMD Plant mobile domain. This domain was identified by Babu and colleagues in a variety of transposases. 345
53304 402251 pfam10537 WAC_Acf1_DNA_bd ATP-utilising chromatin assembly and remodelling N-terminal. ACF (for ATP-utilising chromatin assembly and remodelling factor) is a chromatin-remodelling complex that catalyzes the ATP-dependent assembly of periodic nucleosome arrays. The WAC (WSTF/Acf1/cbp146) domain is an approximately 110-residue module present at the N-termini of Acf1-related proteins in a variety of organisms. The DNA-binding region of Acf1 includes the WAC domain, which is necessary for the efficient binding of ACF complex to DNA. 101
53305 402252 pfam10538 ITAM_Cys-rich Immunoreceptor tyrosine-based activation motif. Signal transduction by T and B cell antigen receptors and certain receptors for Ig Fc regions involves a conserved sequence motif, termed an immunoreceptor tyrosine-based activation motif (ITAM). It is also found in the cytoplasmic domain of apoptosis receptor. 38
53306 402253 pfam10539 Dev_Cell_Death Development and cell death domain. The DCD domain is found in plant proteins involved in development and cell death. The DCD domain is an approximately 130 amino acid long stretch that contains several mostly invariable motifs. These include a FGLP and a LFL motif at the N-terminus and a PAQV and a PLxE motif towards the C-terminus of the domain. The DCD domain is present in proteins with different architectures. Some of these proteins contain additional recognisable motifs, like the KELCH repeats or the ParB domain. 126
53307 402254 pfam10540 Membr_traf_MHD Munc13 (mammalian uncoordinated) homology domain. Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster, mouse, rat and human, some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical. 141
53308 402255 pfam10541 KASH Nuclear envelope localization domain. The KASH (for Klarsicht/ANC-1/Syne-1 homology) or KLS domain is a highly hydrophobic nuclear envelope localization domain of approximately 60 amino acids comprising a 20-amino-acid transmembrane region and a 30-35-residue C-terminal region that lies between the inner and the outer nuclear membranes. During meiotic prophase, telomeres cluster to form a bouquet arrangement of chromosomes. SUN and KASH domain proteins form complexes that span both membranes of the nuclear envelope. The KASH domain links the dynein motor complex of the microtubules, through the outer nuclear membrane to the Sad1 domain in the inner nuclear membrane which then interacts with the bouquet proteins Bqt1 and Bqt2 that are complexed with Bqt4, Rap1 and Taz1 and attached to the telomere. SUN domain-containing proteins are essential for recruiting KASH domain proteins at the outer nuclear membrane, and KASH domains provide a generic NE tethering device for functionally distinct proteins whose cytoplasmic domains mediate nuclear positioning, maintain physical connections with other cellular organelles, and possibly even influence chromosome dynamics. 58
53309 371125 pfam10542 Vitelline_membr Vitelline membrane cysteine-rich region. In Drosophila melanogaster the vitelline membrane (VM) is the first layer of the eggshell produced by the follicular epithelium. It is composed of at least four different proteins. VM proteins are similarly organized with a central highly conserved 38-amino acid domain which is flanked by unrelated regions. The domain contains three highly conserved cysteines. 37
53310 402256 pfam10543 ORF6N ORF6N domain. This domain was identified by Iyer and colleagues. 82
53311 402257 pfam10544 T5orf172 T5orf172 domain. This domain was identified by Iyer and colleagues. 98
53312 402258 pfam10545 MADF_DNA_bdg Alcohol dehydrogenase transcription factor Myb/SANT-like. The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. 85
53313 402259 pfam10546 P63C P63C domain. This domain was identified by Iyer and colleagues. 93
53314 402260 pfam10547 P22_AR_N P22_AR N-terminal domain. This domain was identified by Iyer and colleagues. 110
53315 402261 pfam10548 P22_AR_C P22AR C-terminal domain. This domain was identified by Iyer and colleagues. It is found associated with pfam10547. 73
53316 287514 pfam10549 ORF11CD3 ORF11CD3 domain. This domain was identified by Iyer and colleagues. 52
53317 402262 pfam10550 Toxin_36 Conantokin-G mollusc-toxin. The conantokins are a family of neuroactive peptides found in the venoms of fish-hunting cone snails. They possess a high content of gamma-carboxyglutamic acid (Gla) (4-5 residues), a non-standard amino-acid made by the post-translational modification of glutamate (Glu) residue. Conantokins are the only natural biochemically characterized peptides known to be N-methyl-D-aspartate (NMDA) receptor antagonists. 17
53318 402263 pfam10551 MULE MULE transposase domain. This domain was identified by Babu and colleagues. 98
53319 402264 pfam10552 ORF6C ORF6C domain. This domain was identified by Iyer and colleagues. 114
53320 287517 pfam10553 MSV199 MSV199 domain. This domain was identified by Iyer and colleagues. 132
53321 402265 pfam10554 Phage_ASH Ash protein family. This family was identified by Iyer and colleagues. It includes the Ash protein from bacteriophage P4. 102
53322 402266 pfam10555 MraY_sig1 Phospho-N-acetylmuramoyl-pentapeptide-transferase signature 1. Phospho-N-acetylmuramoyl-pentapeptide-transferase (EC 2.7.8.13) (mraY) is a bacterial enzyme responsible for the formation of the first lipid intermediate of the cell wall peptidoglycan synthesis. It catalyzes the formation of undecaprenyl-pyrophosphoryl-N-acetylmuramoyl-pentapeptide from UDP-MurNAc-pentapeptide and undecaprenyl-phosphate. It is an integral membrane protein with probably ten transmembrane domains. This domain is located at the end of the first cytoplasmic loop and the beginning of the second transmembrane domain. 13
53323 402267 pfam10557 Cullin_Nedd8 Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognizes and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue. 62
53324 402268 pfam10558 MTP18 Mitochondrial 18 KDa protein (MTP18). This family of proteins are mitochondrial 18KDa proteins that are often misannotated as carbonic anhydrases. It was shown that knockdown of MTP18 protein results in a cytochrome c release from mitochondria and consequently leads to apoptosis. Overexpression studies suggest that MTP18 is required for mitochondrial fission. 170
53325 402269 pfam10559 Plug_translocon Plug domain of Sec61p. The Sec61/SecY translocon mediates translocation of proteins across the membrane and integration of membrane proteins into the lipid bilayer. The structure of the translocon revealed a plug domain blocking the pore on the lumenal side. The plug is unlikely to be important for sealing the translocation pore in yeast but it plays a role in stabilizing Sec61p during translocon formation. The domain runs from residues 52-74. 33
53326 402270 pfam10561 UPF0565 Uncharacterized protein family UPF0565. This family of proteins has no known function. 290
53327 402271 pfam10562 CaM_bdg_C0 Calmodulin-binding domain C0 of NMDA receptor NR1 subunit. This is a very short highly conserved domain that is C-terminal to the cytosolic transmembrane region IV of the NMDA-receptor 1. It has been shown to bind Calmodulin-Calcium with high affinity. The ionotropic N-methyl-D-aspartate receptor (NMDAR) is a major source of calcium flux into neurons in the brain and plays a critical role in learning, memory, neural development, and synaptic plasticity. Calmodulin (CaM) regulates NMDARs by binding tightly to the C0 and C1 regions of their NR1 subunit. The conserved tryptophan is considered to be the anchor residue. 29
53328 402272 pfam10563 CdCA1 Cadmium carbonic anhydrase repeat. This domain is the cadmium carbonic anhydrase repeat unit of the beta-carbonic anhydrase of a marine diatom, that uses both zinc and cadmium for catalysis of the reversible hydration of carbon dioxide for use in inorganic carbon acquisition for photosynthesis (thus being a cambialistic enzyme). Compared with alpha- and gamma-carbonic anhydrases that use three histidines to coordinate the zinc-atom, this beta-carbonic anhydrase has two cysteines and one histidine, and rapidly binds cadmium. 182
53329 402273 pfam10564 MAR_sialic_bdg Sialic-acid binding micronemal adhesive repeat. This domain is a novel carbohydrate-binding domain found on micronemal proteins. Micronemal proteins (MICs) are released onto the parasite surface just before invasion of host cells and play important roles in host cell recognition, attachment and penetration. Toxoplasma gondii can infect and replicate within all nucleated cells. This domain interacts with sialylated oligosaccharides; the protein in Toxoplasma gondii is a monomer but several MAR domains are carried on the protein. Each MAR domain contains one central sialic acid-binding pocket. 94
53330 402274 pfam10565 NMDAR2_C N-methyl D-aspartate receptor 2B3 C-terminus. This domain is found at the C-terminus of many NMDA-receptor proteins, many of which also carry the Ligated ion-channel family pfam00060 further upstream as well as the ANF_receptor family pfam01094. This region is predicted to be a large extra-cellular domain of the NMDA receptor proteins, being highly hydrophilic, and is thought to be integrally involved in the function of the receptor. The region also carries a number of potential N-glycosylation sites. 634
53331 402275 pfam10566 Glyco_hydro_97 Glycoside hydrolase 97. This domain is the catalytic region of the bacterial glycosyl-hydrolase family 97. This central part of the GH97 family protein sequences represents a typical and complete (beta/alpha)8-barrel or catalytic TIM-barrel type domain. The N- and C-terminal parts of the sequences, mainly consisting of beta-strands, form two additional non-catalytic domains. In all known glycosidases with the (beta-alpha)8-barrel fold, the amino acid residues at the active site are located on the C-termini of the beta-strands. 278
53332 402276 pfam10567 Nab6_mRNP_bdg RNA-recognition motif. This conserved domain is found in fungal proteins and appears to be involved in RNA-processing. It binds to poly-adenylated RNA, interacts genetically with mRNA 3'-end processing factors, copurifies with the nuclear cap-binding protein Cbp20p, and is found in complexes containing other translation factors, such as EIF4G. 315
53333 402277 pfam10568 Tom37 Outer mitochondrial membrane transport complex protein. The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding cytosolic mitochondrial beta-barrel proteins from the cytosol across the outer mitochondrial membrane into the intra-membrane space. In conjunction with TOM70 it guides peptides without an MTS into TOM40, the protein that forms the passage through the outer membrane. It has homology with Metaxin-1, also part of the outer mitochondrial membrane beta-barrel protein transport complex. 125
53334 287532 pfam10570 Myelin-PO_C Myelin-PO cytoplasmic C-term p65 binding region. Myelin protein zero is the major myelin protein in the peripheral nervous system and is essential for normal myelination. The family is a single-pass transmembrane molecule containing one Ig-like loop in the extracellular domain and this highly basic 69 residue C-terminal cytoplasmic domain which is the region that interacts with protein p65. 65
53335 402278 pfam10571 UPF0547 Uncharacterized protein family UPF0547. This domain contains a zinc-ribbon motif. 26
53336 402279 pfam10572 UPF0556 Uncharacterized protein family UPF0556. This family of proteins has no known function. 126
53337 287535 pfam10573 UPF0561 Uncharacterized protein family UPF0561. This family of proteins has no known function. 120
53338 402280 pfam10574 UPF0552 Uncharacterized protein family UPF0552. This family of proteins has no known function. 224
53339 402281 pfam10576 EndIII_4Fe-2S Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micrococcus UV endonuclease. 17
53340 402282 pfam10577 UPF0560 Uncharacterized protein family UPF0560. This family of proteins has no known function. 819
53341 371146 pfam10578 SVS_QK Seminal vesicle protein repeat. 12
53342 402283 pfam10579 Rapsyn_N Rapsyn N-terminal myristoylation and linker region. Neuromuscular junction formation relies upon the clustering of acetylcholine receptors and other proteins in the muscle membrane. Rapsyn is a peripheral membrane protein that is selectively concentrated at the neuromuscular junction and is essential for the formation of synaptic acetylcholine receptor aggregates. Acetylcholine receptors fail to aggregate beneath nerve terminals in mice where rapsyn has been knocked out. The N-terminal six amino acids of rapsyn are its myristoylation site, and myristoylation is necessary for the targeting of the protein to the membrane. 80
53343 402284 pfam10580 Neuromodulin_N Gap junction protein N-terminal region. 30
53344 402285 pfam10581 Synapsin_N Synapsin N-terminal. This highly conserved domain of synapsin proteins has a serine at position 9 or 10 which is a phosphorylation site. The domain appears to be the part of the molecule that binds to calmodulin. 32
53345 402286 pfam10583 Involucrin_N Involucrin of squamous epithelia N-terminus. This is the N-terminal three beta strands of involucrin, a protein present in keratinocytes of epidermis and other stratified squamous epithelia. Involucrin first appears in the cell cytosol, but ultimately becomes cross-linked to membrane proteins by transglutaminase thus helping in the formation of an insoluble envelope beneath the plasma membrane. Apigenin is a plant-derived flavonoid that has significant promise as a skin cancer chemopreventive agent. It has been found that apigenin regulates normal human keratinocyte differentiation by suppressing it and this is associated with reduced cell proliferation without apoptosis. The downstream part of the protein is represented by the family Involucrin, pfam00904. 69
53346 402287 pfam10584 Proteasome_A_N Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins. 23
53347 402288 pfam10585 UBA_e1_thiolCys Ubiquitin-activating enzyme active site. Ubiquitin-activating enzyme (E1 enzyme) activates ubiquitin by first adenylating with ATP its C-terminal glycine residue and thereafter linking this residue to the side chain of a cysteine residue in E1, yielding an ubiquitin-E1 thiolester and free AMP. Later the ubiquitin moiety is transferred to a cysteine residue on one of the many forms of ubiquitin-conjugating enzymes (E2). This domain carries the last of five conserved cysteines that is part of the active site of the enzyme, responsible for ubiquitin thiolester complex formation, the active site being represented by the sequence motif PICTLKNFP. 252
53348 402289 pfam10587 EF-1_beta_acid Eukaryotic elongation factor 1 beta central acidic region. 28
53349 402290 pfam10588 NADH-G_4Fe-4S_3 NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 40
53350 402291 pfam10589 NADH_4Fe-4S NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 83
53351 402292 pfam10590 PNP_phzG_C Pyridoxine 5'-phosphate oxidase C-terminal dimerization region. This domain represents one of the two dimerization regions of the protein, located at the edge of the dimer interface, at the C-terminus, being the last three beta strands, S6, S7, and S8 along with the last three residues to the end. In Myxococcus xanthus PdxH, S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. The extended loop of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule. To date, the only time functional oxidase or phenazine biosynthesis activities have been experimentally demonstrated is when the sequences contain both pfam01243 and pfam10590. The role performed by each domain in bringing about the molecular functions of either oxidase or phenazine activity is unknown. 42
53352 402293 pfam10591 SPARC_Ca_bdg Secreted protein acidic and rich in cysteine Ca binding region. The SPARC_Ca_bdg domain of Secreted Protein Acidic and Rich in Cysteine is responsible for the anti-spreading activity of human urothelial cells. It is rich in alpha-helices. This extracellular calcium-binding domain contains two EF-hands that each coordinates one Ca2+ ion, forming a helix-loop-helix structure that not only drives the conformation of the protein but is also necessary for biological activity. The anti-spreading activity was dependent on the coordination of Ca2+ by a Glu residue at the Z position of EF-hand 2. 108
53353 402294 pfam10592 AIPR AIPR protein. This family of proteins was identified as an abortive infection phage resistance protein often found in restriction modification system operons. 297
53354 402295 pfam10593 Z1 Z1 domain. This uncharacterized domain was identified by Iyer and colleagues. It is found associated with a helicase domain of superfamily type II. 222
53355 402296 pfam10595 UPF0564 Uncharacterized protein family UPF0564. This family of proteins has no known function. However, one of the members, TTHERM_01026310, is annotated as an EF-hand family protein. 362
53356 371153 pfam10596 U6-snRNA_bdg U6-snRNA interacting domain of PrP8. This domain incorporates the interacting site for the U6-snRNA as part of the U4/U6.U5 tri-snRNPs complex of the spliceosome, and is the prime candidate for the role of cofactor for the spliceosome's RNA core. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor. 159
53357 402297 pfam10597 U5_2-snRNA_bdg U5-snRNA binding site 2 of PrP8. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor. 134
53358 402298 pfam10598 RRM_4 RNA recognition motif of the spliceosomal PrP8. The large RNA-protein complex of the spliceosome catalyzes pre-mRNA splicing. One of the most conserved core proteins is PrP8 which occupies a central position in the catalytic core of the spliceosome, and has been implicated in several crucial molecular rearrangements that occur there, and has recently come under the spotlight for its role in the inherited human disease, Retinitis Pigmentosa. The RNA-recognition motif of PrP8 is highly conserved and provides a possible RNA binding centre for the 5-prime SS, BP, or 3-prime SS of pre-mRNA which are known to contact with Prp8. The most conserved regions of an RRM are defined as the RNP1 and RNP2 sequences. Recognition of RNA targets can also be modulated by a number of other factors, most notably the two loops beta1-alpha1, beta2-beta3 and the amino acid residues C-terminal to the RNP2 domain. 92
53359 371156 pfam10599 Nup_retrotrp_bd Retro-transposon transporting motif. This is the highly conserved C-terminal motif GRKIxxxxxRRKx of nucleoporins that plays a critical and unique role in the nuclear import of retro-transposons in both yeasts and higher organisms. It would appear that the arginine residues at positions 2 and 9-10 constitute a bipartite nuclear localization signal, with two basic peptide motifs separated by an interchangeable spacer sequence, that is crucial for the retro-transposon activity. 86
53360 402299 pfam10600 PDZ_assoc PDZ-associated domain of NMDA receptors. This domain is found in higher eukaryotes between the second and third PDZ domains, pfam00595, of glutamate receptor like proteins. Its exact function is not known. 68
53361 402300 pfam10601 zf-LITAF-like LITAF-like zinc ribbon domain. Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumor necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure. 67
53362 402301 pfam10602 RPN7 26S proteasome subunit RPN7. RPN7 (known as the non ATPase regulatory subunit 6 in higher eukaryotes) is one of the lid subunits of the 26S proteasome and has been shown in Saccharomyces cerevisiae to be required for structural integrity. The 26S proteasome is involved in the ATP-dependent degradation of ubiquitinated proteins. 174
53363 402302 pfam10604 Polyketide_cyc2 Polyketide cyclase / dehydrase and lipid transport. This family contains polyketide cyclases/dehydrases which are enzymes involved in polyketide synthesis. It also includes other proteins of the START superfamily. 132
53364 402303 pfam10605 3HBOH 3HB-oligomer hydrolase (3HBOH). D-(-)-3-hydroxybutyrate oligomer hydrolase (also known as 3HB-oligomer hydrolase) functions in the degradation of poly-3-hydroxybutyrate (PHB). It catalyzes the hydrolysis of D(-)-3-hydroxybutyrate oligomers (3HB-oligomers) into 3HB-monomers. 690
53365 402304 pfam10606 GluR_Homer-bdg Homer-binding domain of metabotropic glutamate receptor. This is the proline-rich region of metabotropic glutamate receptor proteins that binds Homer-related synaptic proteins. The Homer proteins form a physical tether linking mGluRs with the inositol trisphosphate receptors (IP3R) that appears to be due to the proline-rich "Homer ligand" (PPXXFr). Activation of PI turnover triggers intracellular calcium release. MGluR function is altered in the mouse model of human Fragile X syndrome mental retardation, a disorder caused by loss of function mutations in the Fragile X mental retardation gene Fmr1. Homer 3 (and to a lesser extent Homer 1b/c) has been shown to form a multimeric complex with mGlu1a and the IP3 receptor, indicating that Homers may play a role in the localization of receptors to their signalling partners. 50
53366 402305 pfam10607 CLTH CTLH/CRA C-terminal to LisH motif domain. RanBPM is a scaffolding protein and is important in regulating cellular function in both the immune system and the nervous system. This domain is at the C-terminus of the proteins and is the binding domain for the CRA motif (for CT11-RanBPM), which is comprised of approximately 100 amino acids at the C terminal of RanBPM. It was found to be important for the interaction of RanBPM with fragile X mental retardation protein (FMRP), but its functional significance has yet to be determined. This region contains CTLH and CRA domains annotated by SMART; however, these may be a single domain, and it is referred to as a C-terminal to LisH motif. 143
53367 402306 pfam10608 MAGUK_N_PEST Polyubiquitination (PEST) N-terminal domain of MAGUK. The residues upstream of this domain are the probable palmitoylation sites, particularly two cysteines. The domain has a putative PEST site at the very start that seems to be responsible for poly-ubiquitination. PEST domains are polypeptide sequences enriched in proline (P), glutamic acid (E), serine (S) and threonine (T) that target proteins for rapid destruction. The whole domain, in conjunction with a C-terminal domain of the longer protein, is necessary for dimerization of the whole protein. 89
53368 402307 pfam10609 ParA NUBPL iron-transfer P-loop NTPase. This family contains ATPases involved in plasmid partitioning. It also contains the cytosolic Fe-S cluster assembling factor NBP35 which is required for biogenesis and export of both ribosomal subunits. 246
53369 313764 pfam10610 Tafi-CsgC Thin aggregative fimbriae synthesis protein. Fimbriae are cell-surface protein polymers, of eg. E coli and Salmonella spp, that mediate interactions important for host and environmental persistence, development of biofilms, motility, colonisation and invasion of cells, and conjugation. Four general assembly pathways for different fimbriae have been proposed, one of which is extracellular nucleation-precipitation (ENP), that differs from the others in that fibre-growth occurs extracellularly. Thin aggregative fimbriae (Tafi) are the only fimbriae dependent on the ENP pathway. Tafi were first identified in Salmonella spp and the controlling operon termed agf; however subsequent isolation of the homologous operon in E coli led to its being called csg. Tafi are known as curli because, in the absence of extracellular polysaccharides, their morphology appears curled; however, when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. The gene agfC is found to be transcribed at low levels, localized to the periplasm in a mature form, and in combination with AgfE is important for AgfA extracellular assembly, which facilitates the synthesis of Tafi. The genes involved in Tafi production are organized into two adjacent divergently transcribed operons, agfBAC and agfDEFG, both of which are required for biosynthesis and assembly. 98
53370 313765 pfam10611 DUF2469 Protein of unknown function (DUF2469). Member proteins often found in Actinomycetes clustered with signal peptidase and/or RNAse-HII. 100
53371 402308 pfam10612 Spore-coat_CotZ Spore coat protein Z. This family has members annotated as Spore coat protein Z, otherwise known as CotZ. It is a cysteine-rich spore coat family, and along with CotY is necessary for assembly of intact exosporium. 157
53372 402309 pfam10613 Lig_chan-Glu_bd Ligated ion channel L-glutamate- and glycine-binding site. This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan, pfam00060. 111
53373 402310 pfam10614 CsgF Type VIII secretion system (T8SS), CsgF protein. The extracellular nucleation-precipitation (ENP) pathway or Type VIII secretion system (T8SS) in Gram-negative (diderm) bacteria is responsible for the secretion and assembly of prepilins for fimbriae biogenesis, the prototypical curli. Besides the T2SS that can be involved in the assembly of prototypical Type 4 pilus, the T4SS that can be involved in the biogenesis of the prototypical pilus T, the T3SS involved in the assembly of the injectisome and the T7SS involved in the formation of the prototypical Type 1 pilus, the T8SS differs in that fibre-growth occurs extracellularly. The curli, also called thin aggregative fimbriae (Tafi), are the only fimbriae dependent on the T8SS. Tafi were first identified in Salmonella spp and the controlling operon termed agf; however subsequent isolation of the homologous operon in E coli led to its being called csg. In the absence of extracellular polysaccharides Tafi appear curled, although when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery. 118
53374 402311 pfam10615 DUF2470 Protein of unknown function (DUF2470). This family is a putative haem-iron utilisation family, as many members are annotated as being pyridoxamine 5'-phosphate oxidase-related, FMN-binding; however this could not be confirmed. 73
53375 402312 pfam10616 DUF2471 Protein of unknown function (DUF2471). The function of this family is unknown. Members all come from Burkholderia spp. BDAG_04162 is annotated as Serine/threonine-protein kinase, but this could not be confirmed. 118
53376 402313 pfam10617 DUF2474 Protein of unknown function (DUF2474). This family of short proteins has no known function. 39
53377 402314 pfam10618 Tail_tube Phage tail tube protein. This bacterial family of proteins contains phage tail tube proteins related to the Mu phage tail tube protein M. Bacteriophage Mu has an icosahedral head and contractile tail. The tail is composed of an outer sheath and an inner tube. 117
53378 402315 pfam10620 MdcG Phosphoribosyl-dephospho-CoA transferase MdcG. MdcG is a phosphoribosyl-dephospho-CoA transferase that is involved in the biosynthesis of the prosthetic group of malonate decarboxylase. Malonate decarboxylase from Klebsiella pneumoniae contains an acyl carrier protein (MdcC) to which a 2'-(5''-phosphoribosyl)-3'-dephospho-CoA prosthetic group is attached via phosphodiester linkage. MdcG catalyzes the following reaction: 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA + apo-[acyl-carrier-protein] = holo-[acyl-carrier-protein] + diphosphate. 196
53379 287577 pfam10621 FpoO F420H2 dehydrogenase subunit FpoO. This is the FpoO subunit of F420H2 dehydrogenase, an enzyme which oxidizes reduced coenzyme F420. Reduced coenzyme F420 is a universal electron carrier in methanogens. 110
53380 402316 pfam10622 Ehbp Energy-converting hydrogenase B subunit P (EhbP). Ehb (energy-converting hydrogenase B) is a methanogenic archaeal enzyme that functions in one of the metabolic pathways involved in methanol reduction to methane. This family contains subunit P of Ehb. 77
53381 151143 pfam10623 PilI Plasmid conjugative transfer protein PilI. The thin pilus of plasmid R64 belongs to the type IV family and is required for liquid matings. pilI is one of 14 genes that have been identified as being involved in biogenesis of the R64 thin pilus. 83
53382 287579 pfam10624 TraS Plasmid conjugative transfer entry exclusion protein TraS. Entry exclusion (Eex) is a process which prevents redundant transfer of DNA between donor cells. TraS is a protein involved in Eex. It blocks redundant conjugative DNA synthesis and transport between donor cells, and it is suggested that TraS interferes with a signalling pathway that is required to trigger DNA transfer. TraS on the recipient cell is known to form an interaction with TraG on the donor cell. 162
53383 337810 pfam10625 UspB Universal stress protein B (UspB). UspB in Escherichia coli is a 14kDa protein which is predicted to be an integral membrane protein. Overexpression of UspB results in cell death in stationary phase, and mutants of uspB are sensitive to ethanol exposure during stationary phase. 107
53384 402317 pfam10626 TraO Conjugative transposon protein TraO. This is a family of conjugative transposon proteins. 168
53385 402318 pfam10627 CsgE Curli assembly protein CsgE. Curli are a class of highly aggregated surface fibers that are part of a complex extracellular matrix. They promote biofilm formation in addition to other activities. CsgE is a non-structural protein involved in curli biogenesis. CsgE forms an outer membrane complex with the curli assembly proteins CsgG and CsgF. 105
53386 402319 pfam10628 CotE Outer spore coat protein E (CotE). CotE is a morphogenic protein that is required for the assembly of the outer coat of the endospore and spore resistance to lysozyme. CotE also regulates the expression of cotA, cotB, cotC and other genes encoding spore outer coat proteins. The timing of cotE expression has been shown in Bacillus subtilis to affect spore coat morphology but not lysozyme resistance. 177
53387 402320 pfam10629 DUF2475 Protein of unknown function (DUF2475). This family of proteins has no known function. 67
53388 402321 pfam10630 DUF2476 Protein of unknown function (DUF2476). This is a family of proteins of unknown function. The family is rich in proline residues. 258
53389 313781 pfam10631 DUF2477 Protein of unknown function (DUF2477). This is a family of proteins with no known function. The family is rich in proline residues. 141
53390 378460 pfam10632 He_PIG_assoc He_PIG associated, NEW1 domain of bacterial glycohydrolase. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW1 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the He_PIG family, pfam05345, a putative Ig-containing domain. 29
53391 402322 pfam10633 NPCBM_assoc NPCBM-associated, NEW3 domain of alpha-galactosidase. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW3 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the NPCBM family, pfam08305, a novel putative carbohydrate binding module found at the N-terminus of glycosyl hydrolases. 78
53392 402323 pfam10634 Iron_transport Fe2+ transport protein. This is a bacterial family of periplasmic proteins that are thought to function in high-affinity Fe2+ transport. 150
53393 402324 pfam10635 DisA-linker DisA bacterial checkpoint controller linker region. The DisA protein is a bacterial checkpoint protein that dimerizes into an octameric complex. The protein consists of three distinct domains. The first, N-terminal region, from 1-145 is globular and is represented by family DisA_N, pfam02457; the next 146-289 residues is this domain that consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha7-9), thus forming a spine-like linker between domains 1 and 3. The C-terminal residues of domain 3 are family HHH, pfam00633, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis. 141
53394 402325 pfam10636 hemP Hemin uptake protein hemP. This is a bacterial family of proteins that are involved in the uptake of the iron source hemin. 37
53395 402326 pfam10637 Ofd1_CTDD Oxoglutarate and iron-dependent oxygenase degradation C-term. Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N in the presence of oxygen. The domain is conserved from yeasts to humans. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts. 255
53396 402327 pfam10638 Sfi1_C Spindle body associated protein C-terminus. This C-terminal domain of spindle-body-associated protein Sfi1 has an important role to play in the bridge-splitting during bi-polar spindle assembly, and this separation event possibly requires interaction with integral components of the nuclear envelope, such as the Mps2-Bbp1 complex. Centrally to this domain is a region carrying centrin-binding repeats with repeating units containing tryptophan, family Sfi1_central, pfam08457. 100
53397 371174 pfam10639 TMEM234 Putative transmembrane family 234. TMEM234 is a family of putative inner membrane proteins. Many bacterial members are annotated as putative L-Ara4N-phosphoundecaprenol flippase subunit ArnE, and as inner membrane proteins. They may be transporters of the multi-drug-resistant superfamily. 113
53398 287595 pfam10640 Pox_ATPase-GT mRNA capping enzyme N-terminal, ATPase and guanylyltransferase. This domain is the N-terminus of the large subunit viral mRNA capping enzyme, and carries both the ATPase and the guanylyltransferase activities of the enzyme. The guanylyltransferase enzymatic region runs from residues 242 (leucine)-273(arginine), the core of the active site being the lysine residue at 260. The ATPase activity is at the very N-terminal part of the domain. 311
53399 402328 pfam10642 Tom5 Mitochondrial import receptor subunit or translocase. This protein family is very short and is only found in yeasts. Tom5 is one of three very small translocases of the mitochondrial outer membrane. Tom5 links mitochondrial preprotein receptors to the general import pore. Although Tom5 has allegedly been identified in vertebrates this could not be confirmed. 47
53400 402329 pfam10643 Cytochrome-c551 Photosystem P840 reaction-centre cytochrome c-551. A photosynthetic reaction-centre complex is found in certain green sulphur bacteria such as Chlorobium vibrioforme which are anaerobic photo-auto-trophic organisms. The primary electron donor is P840, a probable B-Chl a dimer, and the primary electron acceptor is a B-Chl monomer. Also on the donor side c-type cytochromes are known to function as electron donors to photo-oxidized P840. This family is thus the secondary endogenous donor of the photosynthetic reaction-centre complex and is a membrane-bound cytochrome containing a single haem group. 207
53401 402330 pfam10644 Misat_Tub_SegII Misato Segment II tubulin-like domain. The misato protein contains three distinct, conserved domains, segments I, II and III. Segments I and III are common to Tubulins pfam00091, but segment II aligns with myosin heavy chain sequences from D. melanogaster (PIR C35815), rabbit (SP P04460), and human (PIR S12458). Segment II of misato is a major contributor to its greater length compared with the various tubulins. The most significant sequence similarities to this 54-amino acid region are from a motif found in the heavy chains of myosins from different organisms. A comparison of segment II with the vertebrate myosin heavy chains reveals that it is homologous to a myosin peptide in the hinge region linking the S2 and LMM domains. Segment II also contains heptad repeats which are characteristic of the myosin tail alpha-helical coiled-coils. This myosin-like homology may be due only to the fact that both myosin and Misato carry coiled-coils, which appear similar but are not necessarily homologous (Wood V, personal communication). 115
53402 402331 pfam10645 Carb_bind Carbohydrate binding. This is a carbohydrate binding domain which has been shown in Schizosaccharomyces pombe to be required for septum localization. 49
53403 402332 pfam10646 Germane Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 pfam01520, Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold. 114
53404 402333 pfam10647 Gmad1 Lipoprotein LpqB beta-propeller domain. The Gmad1 domain is found associated with the GerMN family, pfam10646, in bacterial spore formation. It is predicted to have a beta-propeller fold and to have a passive binding role rather than a catalytic function owing to the low number of conserved hydrophilic residues. 255
53405 402334 pfam10648 Gmad2 Immunoglobulin-like domain of bacterial spore germination. This domain is found linked to the GerMN domain pfam10646 in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold. 85
53406 402335 pfam10649 DUF2478 Protein of unknown function (DUF2478). This is a family of hypothetical bacterial proteins found in the vicinity of Molybdenum ABC transporter ATP-binding gene-products MobA MobB and MobC. However the function could not be confirmed. This family appears to belong to the P-loop superfamily by alignment to pfam03266. However, the characteristic P-loop sequence motif appears to have diverged beyond recognition in this family. 159
53407 402336 pfam10650 zf-C3H1 Putative zinc-finger domain. This domain is conserved in fungi and might be a zinc-finger domain as it contains three conserved Cs and an H in the C-x8-C-x5-C-x3-H conformation typical of a zinc-finger. 22
53408 371181 pfam10651 DUF2479 Domain of unknown function (DUF2479). This domain is found in phage from a number of different bacteria. It is purported to be a putative long tail fibre (Bacteriophage A118) protein, but this could not be confirmed. 143
53409 402337 pfam10652 DUF2480 Protein of unknown function (DUF2480). All the members of this family are uncharacterized proteins, but the environment in which they are found on the bacterial genome suggests a function as a glucose-6-phosphate isomerase (EC 5.3.1.9). This could not, however, be confirmed. 165
53410 287607 pfam10653 Phage-A118_gp45 Protein gp45 of Bacteriophage A118. This domain is found in bacteriophage and is thought to have a gp45 function within the phage tail-fibre system. 62
53411 287608 pfam10654 DUF2481 Protein of unknown function (DUF2481). This is a hypothetical protein family homologous to Lmo2305 in Bacteriophage A118 systems. 126
53412 402338 pfam10655 DUF2482 Hypothetical protein of unknown function (DUF2482). All the members of this very small, very short family are derived from bacteriophages, of the SA bacteriophages 11, Mu50B, system, and from the Staphylococcal_phi-Mu50B-like_prophages subsystem. All members are hypothetical proteins. 98
53413 371183 pfam10656 DUF2483 Hypothetical protein of unknown function (DUF2483). This is a family of proteins found in bacteriophage particularly of the SA bacteriophages 11, Mu50B, family, homologous to phi-ETA orf16. 72
53414 402339 pfam10657 RC-P840_PscD Photosystem P840 reaction centre protein PscD. The photosynthetic reaction centers (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilizing the PscB component since it is found to co-precipitate with FMO (Fenna-Mathews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin. 144
53415 402340 pfam10658 DUF2484 Protein of unknown function (DUF2484). A role of this family in UDP-N-acetylenolpyruvoylglucosamine reductase, as MurB, could not be confirmed. 76
53416 287612 pfam10659 Trypan_glycop_C Trypanosome variant surface glycoprotein C-terminal domain. The trypanosome parasite expresses these proteins to evade the immune response. 104
53417 313801 pfam10660 MitoNEET_N Iron-containing outer mitochondrial membrane protein N-terminus. MitoNEET_N is the N-terminal region of the MitoNEET and Miner-type proteins that carry a zf-CDGSH, pfam09360, redox-active 2Fe-2S cluster. The whole protein regulates oxidative capacity. The domain is an anchor sequence that tethers the protein to the outer membrane. 64
53418 287614 pfam10661 EssA WXG100 protein secretion system (Wss), protein EssA. The WXG100 protein secretion system (Wss) is responsible for the secretion of WXG100 proteins (pfam06013) such as ESAT-6 and CFP-10 in Mycobacterium tuberculosis or EsxA and EsxB in Staphylococcus aureus. In S. aureus, the Wss seems to be encoded by a locus of eight CDS, called ess (eSAT-6 secretion system). This locus encodes, amongst several other proteins, EssA, a protein predicted to possess one transmembrane domain. Due to its predicted membrane location and its absolute requirement for WXG100 protein secretion, it has been speculated that EssA could form a secretion apparatus in conjunction with the polytopic membrane protein EsaA, YukC (pfam10140) and YukAB, which is a membrane-bound ATPase containing Ftsk/SpoIIIE domains (pfam01580) called EssC in S. aureus and Snm1/Snm2 in Mycobacterium tuberculosis. Proteins homologous to EssA, YukC, EsaA and YukD seem absent from mycobacteria. 148
53419 402341 pfam10662 PduV-EutP Ethanolamine utilisation - propanediol utilisation. Members of this family function in ethanolamine and propanediol degradation pathways. PduV may be involved in the association of the bacterial microcompartments (BMCs) to filaments. 137
53420 402342 pfam10664 NdhM Cyanobacterial and plastid NDH-1 subunit M. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 sub-complexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of Synechocystis 6803 -cyanobacterium. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The cyanobacterial NDH-1 complex contains additional subunits, NdhM and NdhN, compared with the minimal set of the bacterial enzyme and these seem to be specific for thylakoid-located NDH-1 of photosynthetic organisms. The three subunits of NDH-1, NdhM, NdhN and NdhO are essential for effecting cyclic electron flow around photosystem I, by supplying extra-ATP for photosynthesis in both plastids and cyanobacteria. 107
53421 402343 pfam10665 Minor_capsid_1 Minor capsid protein. This is a putative tail-knob or minor capsid protein from bacteriophages. 104
53422 402344 pfam10666 Phage_TAC_8 Phage tail assembly chaperone protein Gp14 ()A118. This phage protein family is expressed from within a cluster of tail- and base plate-producing genes. It is a family of tail assembly chaperone proteins. 140
53423 402345 pfam10667 DUF2486 Protein of unknown function (DUF2486). This family is made up of members from various Burkholderia spp. The function is unknown. 251
53424 402346 pfam10668 Phage_terminase Phage terminase small subunit. This family of small highly conserved proteins come from a subset of Firmicute species. Its putative function is as a phage terminase small subunit. 67
53425 402347 pfam10669 Phage_Gp23 Protein gp23 (Bacteriophage A118). This is the highly conserved family of the major tail subunit protein. 120
53426 402348 pfam10670 DUF4198 Domain of unknown function (DUF4198). This family was previously misannotated in Pfam as NikM. 209
53427 402349 pfam10671 TcpQ Toxin co-regulated pilus biosynthesis protein Q. The toxin-coregulated pilus (TCP) of Vibrio cholerae and the soluble TcpF protein that is secreted via the TCP biogenesis apparatus are essential for intestinal colonisation in the disease of cholera. TcpQ is part of an outer membrane complex of the TCP biogenesis apparatus, comprised of TcpC and TcpQ, and the TcpQ is required for proper localization of TcpC to the outer membrane. The domain is found in other Proteobacterial species apart from Vibrio. 80
53428 287624 pfam10672 Methyltrans_SAM S-adenosylmethionine-dependent methyltransferase. Members of this family are S-adenosylmethionine-dependent methyltransferases from gamma-proteobacterial species. The diversity in the roles of methylation is matched by the almost bewildering number of methyltransferase enzymes that catalyze the methylation reaction. Although several classes of methyltransferase enzymes are known, the great majority of methylation reactions are catalyzed by the S-adenosylmethionine-dependent methyltransferases. 286
53429 402350 pfam10673 DUF2487 Protein of unknown function (DUF2487). This is a bacterial family of uncharacterized proteins. 142
53430 402351 pfam10674 Ycf54 Protein of unknown function (DUF2488). This protein is conserved in the green lineage and located in the chloroplast. 91
53431 402352 pfam10675 DUF2489 Protein of unknown function (DUF2489). This is a bacterial family of uncharacterized proteins. 130
53432 371190 pfam10676 gerPA Spore germination protein gerPA/gerPF. This is a bacterial family of proteins that are required for the formation of functionally normal spores. Proteins in this family may be involved in establishing normal coat structure and/or permeability which could control the access of germinants to their receptor. 69
53433 402353 pfam10677 DUF2490 Protein of unknown function (DUF2490). This is a bacterial family of uncharacterized proteins. They appear to belong to the outer membrane beta barrel superfamily. 180
53434 402354 pfam10678 DUF2492 Protein of unknown function (DUF2492). This is a bacterial family of uncharacterized proteins. 77
53435 402355 pfam10679 DUF2491 Protein of unknown function (DUF2491). This is a bacterial family of uncharacterized proteins. 211
53436 402356 pfam10680 RRN9 RNA polymerase I specific transcription initiation factor. Initiation of transcription of ribosomal DNA (rDNA) in yeast involves an interaction of upstream activation factor (UAF) with the upstream element of the promoter, to form a stable UAF-template complex. UAF, together with the TATA-binding transcription initiation factor protein TBP, then recruits an essential core factor to the promoter, to form a stable preinitiation complex. This Rrn9 domain, which seems to be constrained to fungi, is the two highly conserved regions of proteins which form one of the subunits of UAF and appears to be the region responsible for the interaction with TBP. The family includes the S.pombe Arc1 protein, which is found to be essential for the accumulation of condensin at kinetochores. 65
53437 402357 pfam10681 Rot1 Chaperone for protein-folding within the ER, fungal. This conserved fungal family is an essential molecular chaperone in the endoplasmic reticulum. Molecular chaperones transiently interact with unfolded proteins to inhibit their self-aggregation and to support their folding and/or assembly. Rot1 is a general chaperone with some substrate specificity, its substrates being the structurally unrelated Kre5 Kre6 Big1 Atg22, which are type I, type II, and polytopic membrane proteins. The dependencies of each for Rot1 do not share similarities. However, their folding does require BiP, and one of these proteins was simultaneously associated with both Rot1 and BiP. In addition, Rot1 may cooperate with BiP/Kar2 in the folding of Kre6. 208
53438 287634 pfam10682 UL40 Glycoprotein of human cytomegalovirus HHV-5. This is glycoprotein UL40 from human cytomegalovirus or herpesvirus 5. The signal sequence of the UL40 polypeptide contains an HLA-E ligand identical with HLA-Cw*0304. The first 37 residues of UL40, including this ligand, are predicted to encode a signal peptide. The virus thus prevents the lysis by NK (natural killer) cells of the cell it has invaded. 214
53439 402358 pfam10683 DBD_Tnp_Hermes Hermes transposase DNA-binding domain. This domain confers specific DNA-binding on Hermes transposase. 67
53440 313818 pfam10684 BDM Putative biofilm-dependent modulation protein. This is a family of tightly conserved proteins from Enterobacteriaceae which are annotated as being biofilm-dependent modulation protein homologs. 71
53441 402359 pfam10685 KGG Stress-induced bacterial acidophilic repeat motif. This repeat is found in proteins which are expressed under conditions of stress in bacteria. The repeat contains a highly conserved, characteristic sequence motif, KGG, that is also recognized by plants and lower eukaryotes and repeated in their LEA (late embryogenesis abundant) family of proteins, thereby rendering those proteins bacteriostatic. An example of such an LEA family is LEA_5, pfam00477. Further downstream from this motif is a Walker A, nucleotide binding, motif GXXXXGK(S,T), that in YciG of E coli is QSGGNKSGKS. YciG is expressed as part of a three-gene operon, yciGFE, and this operon is induced by stress and is regulated by RpoS, which controls the general stress-response in E coli. YciG was shown to be important for stationary-phase resistance to thermal stress and in particular to acid stress. 21
53442 402360 pfam10686 DUF2493 Protein of unknown function (DUF2493). Members of this family are all Proteobacteria. The function is not known. 66
53443 402361 pfam10688 Imp-YgjV Bacterial inner membrane protein. This is a family of inner membrane proteins. Many of the members are YgjV protein. 155
53444 402362 pfam10689 DUF2496 Protein of unknown function (DUF2496). This family consists of proteins from Gammaproteobacteria spp. Many members are annotated as being like the E coli protein YbaM. 43
53445 151186 pfam10690 Myticin-prepro Myticin pre-proprotein from the mussel. Myticin is a cysteine-rich peptide produced in three isoforms, A, B and C, by Mytilus galloprovincialis, the Mediterranean mussel. Some isoforms show antibacterial activity against gram-positive bacteria, while others are additionally active against the fungus Fusarium oxysporum and a gram-negative bacterium, Escherichia coli D31. Myticin-prepro is the precursor peptide. The mature molecule, named myticin, consists of 40 residues, with four intramolecular disulfide bridges and a cysteine array in the primary structure different from that of previously characterized cysteine-rich antimicrobial peptides. The first 20 amino acids are a putative signal peptide, and the antimicrobial peptide sequence is a 36-residue C-terminal extension. Such a structure suggests that myticins are synthesized as prepro-proteins that are then processed by various proteolytic events before storage in the haemocytes as the active peptide. Myticin precursors are expressed mainly in the haemocytes. The family Mytilin has been merged into this family. 98
53446 402363 pfam10691 DUF2497 Protein of unknown function (DUF2497). Members of this family belong to the Alphaproteobacteria. The function of the family is not known. 70
53447 402364 pfam10692 DUF2498 Protein of unknown function (DUF2498). Members of this family are Gammaproteobacteria. Many are annotated as like E coli protein YciN. The function is not known. 79
53448 402365 pfam10693 DUF2499 Protein of unknown function (DUF2499). Members of this family are found in plants, lower eukaryotes, and bacteria and the chloroplast where it is annotated as Ycf49 or Ycf49-like. The function is not known though several members are annotated as putative membrane proteins. 87
53449 402366 pfam10694 DUF2500 Protein of unknown function (DUF2500). The members of this family are largely confined to the Gammaproteobacteria. The function is not known. 102
53450 402367 pfam10696 DUF2501 Protein of unknown function (DUF2501). Members of this family are all Proteobacteria. Several are annotated as being YjjA or YjjA-like, but this protein is uncharacterized. 77
53451 371199 pfam10697 DUF2502 Protein of unknown function (DUF2502). Members of this family are all Gammaproteobacteria. The function is not known. 90
53452 402368 pfam10698 DUF2505 Protein of unknown function (DUF2505). Members of this family are all Actinobacteria. The function is not known. 151
53453 402369 pfam10699 HAP2-GCS1 Male gamete fusion factor. The gene encoding Arabidopsis HAP2 is allelic with GCS1 (Generative cell-specific protein 1). HAP2 is expressed only in the haploid sperm and is required for efficient guidance of the pollen tube to the ovules. In Arabidopsis the protein is a predicted membrane protein with an N-terminal secretion signal, a single transmembrane domain and a C-terminal histidine-rich domain. HAP2-GCS1 is found from plants to lower eukaryotes and is necessary for the fusion of the gametes in fertilisation. Studies in the green alga Chlamydomonas and the malaria organism Plasmodium showed that it is involved in a novel mechanism for gamete fusion where a first species-specific protein binds male and female gamete membranes together after which a second, broadly conserved protein, either directly or indirectly, causes fusion of the two membranes together. The broadly conserved protein is represented by this HAP2-GCS1 domain, conserved from plants to lower eukaryotes. In Plasmodium berghei the protein is expressed only in male gametocytes and gametes, having a male-specific function during the interaction with female gametes, and being indispensable for parasite fertilisation. The gene in plants and eukaryotes might well have originated from acquisition of plastids from red algae. 433
53454 402370 pfam10702 DUF2507 Protein of unknown function (DUF2507). This family is conserved in Firmicutes. The function is not known. 123
53455 402371 pfam10703 MoaF MoaF N-terminal domain. MoaF protein is essential for the production of the monoamine-inducible 30kDa protein in Klebsiella. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha. It is conserved in Proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis. This entry corresponds to the N-terminal domain. 108
53456 402372 pfam10704 DUF2508 Protein of unknown function (DUF2508). This family is conserved in Firmicutes. Several members are annotated as being the protein YaaL. The function is not known. 71
53457 313834 pfam10705 Ycf15 Chloroplast protein precursor Ycf15 putative. In some species of plants the ycf15 gene is probably not a protein-coding gene because the protein in these species has premature stop codons. Most of the members of the family are hypothetical or uncharacterized. 86
53458 402373 pfam10706 Aminoglyc_resit Aminoglycoside-2''-adenylyltransferase. This family is conserved in Bacteria. It confers resistance to kanamycin, gentamicin, and tobramycin. The protein is also produced by plasmids in various bacterial species and confers resistance to essentially all clinically available aminoglycosides except streptomycin, and it eliminates the synergism between aminoglycosides and cell-wall active agents. 174
53459 402374 pfam10707 YrbL-PhoP_reg PhoP regulatory network protein YrbL. This is a family of proteins that are activated by PhoP. PhoP protein controls the expression of a large number of genes that mediate adaptation to low Mg2+ environments and/or virulence in several bacterial species. YrbL is proposed to be acting in a loop activity with PhoP and PmrA analogous to the multicomponent loop in Salmonella where the PhoP-dependent PmrD protein activates the regulatory protein PmrA, and the activated PmrA then represses transcription from the PmrD promoter which harbours binding sites for both the PhoP and PmrA proteins. Expression of YrbL is induced in low Mg2+ in a PhoP-dependent fashion and repressed by Fe3+ in a PmrA-dependent manner. 185
53460 402375 pfam10708 DUF2510 Protein of unknown function (DUF2510). This is a family of proteins conserved in Actinobacteria. Many members are annotated as putative membrane proteins but this could not be confirmed. 35
53461 402376 pfam10709 DUF2511 Protein of unknown function (DUF2511). This family is conserved in bacteria. The function is not known. 87
53462 371204 pfam10710 DUF2512 Protein of unknown function (DUF2512). Proteins in this family are predicted to be integral membrane proteins, and many of them are annotated as being YndM protein. They are all found in Firmicutes. The true function is not known. 136
53463 402377 pfam10711 DUF2513 Hypothetical protein (DUF2513). This family is found in bacteria. The function is not known. 98
53464 402378 pfam10712 NAD-GH NAD-specific glutamate dehydrogenase. The members of this family are annotated as being NAD-specific glutamate dehydrogenase encoded in an antisense gene pair with DnaK-J. 576
53465 402379 pfam10713 DUF2509 Protein of unknown function (DUF2509). This family is conserved in Proteobacteria. The function is not known but many of the members are annotated as protein YgdB. 131
53466 371207 pfam10714 LEA_6 Late embryogenesis abundant protein 18. This is a family of late embryogenesis-abundant proteins. There is high accumulation of this protein in dry seeds, and in the roots of full-grown plants in response to dehydration and ABA (abscisic acid) treatments. This LEA protein disappears after germination. It accumulates in growing regions of well irrigated hypocotyls and meristems suggesting a role in seedling growth resumption on rehydration. As a group the LEA proteins are highly hydrophilic, contain a high percentage of glycine residues, lack Cys and Trp residues and do not coagulate upon exposure to high temperature, and for these reasons are considered to be members of a group of proteins called hydrophilins. Expression of the protein is negatively regulated during etiolating growth, particularly in roots, in contrast to its expression patterns during normal growth. 75
53467 402380 pfam10715 REGB_T4 T4-phage Endoribonuclease RegB. The RegB endoribonuclease encoded by bacteriophage T4 is a unique sequence-specific nuclease that cleaves in the middle of GGAG or, in a few cases, GGAU tetranucleotides, preferentially those found in the Shine-Dalgarno regions of early phage mRNAs. Phage RB49 in addition to RegB utilizes Escherichia coli endoribonuclease E for the degradation of its transcripts for gene regB. The deduced primary structure of RegB proteins of 32 phages studied is almost identical to that of T4, while the sequences of RegB encoded by phages RB69, TuIa and RB49 show substantial divergence from their T4 counterpart. Judging from the structure 2hx6, this family does not fall into the Lysozyme-like family, but rather is a new member of the RelE/YoeB structural and functional family of ribonucleases specialising in mRNA inactivation within the ribosome. 150
53468 402381 pfam10716 NdhL NADH dehydrogenase transmembrane subunit. The NdhL family is a component of the NDH-1L complex that is one of the proton-pumping NADH:ubiquinone oxidoreductases that catalyze the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. NDH-1L is essential for photoheterotrophic cell growth. NdhL appears to contain two transmembrane helices and it is necessary for the functioning of though not the correct assembly of the NDH-1 complex in Synechocystis 6803. The conservation between cyanobacteria and green plants suggests that chloroplast NDH-1 complexes contain related subunits. 76
53469 287662 pfam10717 ODV-E18 Occlusion-derived virus envelope protein ODV-E18. This family of occlusion-derived viral envelope proteins are detected in viral-induced intranuclear microvesicles and are not detected in the plasma membrane, cytoplasmic membranes, or the nuclear envelope. The ODV-E18 protein is encoded by baculovirus late genes with transcription initiating from a TAAG motif. It exists as a dimer in the ODV envelope and contains a hydrophobic domain which is putatively acting as a target or retention signal for intranuclear microvesicles. 87
53470 402382 pfam10718 Ycf34 Hypothetical chloroplast protein Ycf34. This family is of proteins annotated as hypothetical chloroplast protein YCF34. The function is not known. 78
53471 402383 pfam10719 ComFB Late competence development protein ComFB. This family is conserved in bacteria. Some members, with three conserved cysteines, are annotated as late competence development protein ComFB. 79
53472 371209 pfam10720 DUF2515 Protein of unknown function (DUF2515). This family is conserved in Firmicutes. Several members are annotated as YppC. The function is not known. 303
53473 287666 pfam10721 DUF2514 Protein of unknown function (DUF2514). This family is conserved in bacteria and some viruses. The function is not known. 161
53474 402384 pfam10722 YbjN Putative bacterial sensory transduction regulator. YbjN is a putative sensory transduction regulator protein found in Proteobacteria. As it is a multi-copy suppressor of the coenzyme A-associated temperature sensitivity in temperature-sensitive mutant strains of Escherichia coli the suggestion is that it both helps CoA-A1 and possibly works as a general stabilizer for some other unstable proteins. This family was expanded to subsume other related families: DUF1790, DUF1821 and DUF2596. 126
53475 287668 pfam10723 RepB-RCR_reg Replication regulatory protein RepB. This is a family of proteins which regulate replication of rolling circle replication (RCR) plasmids that have a double-strand replication origin (dso). Regulation of replication of RCR plasmids occurs mainly at initiation of leading strand synthesis at the dso, such that Rep protein concentration controls plasmid replication. 81
53476 402385 pfam10724 DUF2516 Protein of unknown function (DUF2516). This family is conserved in Actinobacteria. The function is not known. 91
53477 402386 pfam10725 DUF2517 Protein of unknown function (DUF2517). This family is conserved in Proteobacteria. Several members are annotated as being protein YbfA. The function is not known. 61
53478 402387 pfam10726 DUF2518 Protein of unknown function (DUF2518). This family is conserved in Cyanobacteria. Several members are annotated as the protein Ycf51. The function is not known. 142
53479 287672 pfam10727 Rossmann-like Rossmann-like domain. This family of proteins contain a Rossmann-like domain. 127
53480 402388 pfam10728 DUF2520 Domain of unknown function (DUF2520). This presumed domain is found C-terminal to a Rossmann-like domain suggesting that these proteins are oxidoreductases. 126
53481 313850 pfam10729 CedA Cell division activator CedA. CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet. 75
53482 371213 pfam10730 DUF2521 Protein of unknown function (DUF2521). Family of unknown function specific to Bacillus. 146
53483 287676 pfam10731 Anophelin Thrombin inhibitor from mosquito. Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing. 65
53484 287677 pfam10732 DUF2524 Protein of unknown function (DUF2524). This family of proteins with unknown function appears to be restricted to Bacillaceae bacteria. 84
53485 402389 pfam10733 DUF2525 Protein of unknown function (DUF2525). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. The family has a highly conserved sequence. 60
53486 371214 pfam10734 DUF2523 Protein of unknown function (DUF2523). This is a family of phage related proteins whose function is uncharacterized. 80
53487 402390 pfam10735 DUF2526 Protein of unknown function (DUF2526). This family of proteins with unknown function is restricted to Enterobacteriaceae. The family has a highly conserved sequence. 76
53488 119256 pfam10736 DUF2527 Protein of unknown function (DUF2627). This family of proteins with unknown function appears to be restricted to a family of Enterobacterial proteins. It has a highly conserved sequence. 38
53489 371215 pfam10737 GerPC Spore germination protein GerPC. GerPC is required for the formation of functionally normal spores. The gerP locus encodes a number of proteins which are thought to be involved in the establishment of normal spore coat structure and/or permeability, which allows the access of germinants to their receptor. 173
53490 402391 pfam10738 Lpp-LpqN Probable lipoprotein LpqN. This family is conserved in Mycobacteriaceae and is likely to be a lipoprotein. 171
53491 402392 pfam10739 DUF2550 Protein of unknown function (DUF2550). This family is conserved in Corynebacterineae. The function is not known though most members are annotated as either secreted, or membrane, proteins. 127
53492 402393 pfam10740 DUF2529 Domain of unknown function (DUF2529). This domain is conserved in the Bacillales. The function is not known, but given this domain's relationship to the SIS domain it may carry out a sugar isomerase reaction. Several members are annotated as being YWJG, a protein expressed downstream of pyrG, a gene encoding cytidine triphosphate synthetase. 167
53493 402394 pfam10741 T2SSM_b Type II secretion system (T2SS), protein M subtype b. The T2SMb family is conserved in Proteobacteria and Actinobacteria, and differs from the T2SM proteins in Vibrio spp. (pfam04612). 111
53494 402395 pfam10742 DUF2555 Protein of unknown function (DUF2555). This family is conserved in Cyanobacteria. The function is not known. 55
53495 371218 pfam10743 Phage_Cox Regulatory phage protein cox. This family of phage Cox proteins is expressed by Enterobacteria phages. The Cox protein is a 79-residue basic protein with a predicted strong helix-turn-helix DNA-binding motif. It inhibits integrative recombination and it activates site-specific excision of the HP1 genome from the Haemophilus influenzae chromosome, Hp1. Cox appears to function as a tetramer. Cox binding sites consist of two direct repeats of the consensus motif 5'-GGTMAWWWWA, one Cox tetramer binding to each motif. Cox binding interferes with the interaction of HP1 integrase with one of its binding sites, IBS5. This competition is central to directional control. Both Cox binding sites are needed for full inhibition of integration and for activating excision, because it plays a positive role in assembling the nucleoprotein complexes that produce excisive recombination, by inducing the formation of a critical conformation in those complexes. 87
53496 402396 pfam10744 Med1 Mediator of RNA polymerase II transcription subunit 1. Mediator complexes are basic necessities for linking transcriptional regulators to RNA polymerase II. This domain, Med1, is conserved from plants to fungi to humans and forms part of the Med9 submodule of the Srb/Med complex. It is one of three subunits essential for viability of the whole organism via its role in environmentally-directed cell-fate decisions. Med1 is part of the tail region of the Mediator complex. 377
53497 402397 pfam10745 DUF2530 Protein of unknown function (DUF2530). This family of proteins with unknown function appears to be restricted to mycobacteria. 73
53498 287690 pfam10746 Phage_holin_2_2 Phage holin T7 family, holin superfamily II. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. 55
53499 402398 pfam10747 SirA Sporulation inhibitor of replication protein SirA. This entry represents the Sporulation inhibitor of replication (sirA) family of proteins from Bacillus sp. Induction of sporulation in rapidly growing cells inhibits replication; this is thought to be through the action of SirA protein and independent of phosphorylated Spo0A; however SirA protein synthesis is induced by Spo0A. 140
53500 402399 pfam10748 DUF2531 Protein of unknown function (DUF2531). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 132
53501 402400 pfam10749 DUF2534 Protein of unknown function (DUF2534). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 80
53502 371221 pfam10750 DUF2536 Protein of unknown function (DUF2536). This family of proteins with unknown function appears to be restricted to Bacillus spp. Structural modelling suggests this domain may bind nucleic acids. 68
53503 313866 pfam10751 DUF2535 Protein of unknown function (DUF2535). This family of proteins with unknown function appears to be restricted to Bacillus spp. 83
53504 313867 pfam10752 DUF2533 Protein of unknown function (DUF2533). This family of proteins with unknown function appears to be restricted to Bacillus spp. 83
53505 402401 pfam10753 Toxin_GhoT_OrtT Toxin GhoT_OrtT. GhoT is part of the GhoT-GhoS type V toxin-antitoxin (TA) system. OrtT is homologous to GhoT, but it is not part of a TA pair. In this case, it acts as an independent toxin to reduce growth during stress related to amino acid and DNA synthesis. 55
53506 402402 pfam10754 DUF2569 Protein of unknown function (DUF2569). This family is conserved in bacteria. The function is not known, but several members are annotated as being YdgK or a homolog thereof. 142
53507 402403 pfam10755 DUF2585 Protein of unknown function (DUF2585). This family is conserved in Proteobacteria. The function is not known. 164
53508 371225 pfam10756 bPH_6 Bacterial PH domain. This domain has a bacterial type PH domain structure. This domain was previously known as DUF2581. This family is conserved in the Actinomycetales. Although several members are annotated as RbiX homologs, RbiX being a putative regulator of riboflavin biosynthesis, the function could not be confirmed. 73
53509 371226 pfam10757 YbaJ Biofilm formation regulator YbaJ. YbaJ regulates biofilm formation. It also has an important role in the regulation of motility in the biofilm. YbaJ increases conjugation and aggregation and decreases motility, resulting in an increase of biofilm. 114
53510 402404 pfam10758 DUF2586 Protein of unknown function (DUF2586). This bacterial family of proteins has no known function. 363
53511 402405 pfam10759 DUF2587 Protein of unknown function (DUF2587). This is a bacterial family of proteins with no known function. 161
53512 402406 pfam10761 DUF2590 Protein of unknown function (DUF2590). This family of proteins has no known function. 98
53513 402407 pfam10762 DUF2583 Protein of unknown function (DUF2583). Some members in this family of proteins are annotated as YchH however currently no function is known. 86
53514 402408 pfam10763 DUF2584 Protein of unknown function (DUF2584). This bacterial family of proteins has no known function. 77
53515 402409 pfam10764 Gin Inhibitor of sigma-G Gin. Gin allows sigma-F to delay late forespore transcription by preventing sigma-G from taking over before the cell has reached a critical stage of development. Gin is also known as CsfB. 44
53516 402410 pfam10765 DUF2591 Protein of unknown function (DUF2591). This bacterial family of proteins has no known function. 107
53517 371231 pfam10766 AcrZ Multidrug efflux pump-associated protein AcrZ. AcrZ is associated with the AcrA-TolC multidrug efflux pump; it may enhance the ability of the pump to recognize and export certain substrates. 44
53518 402411 pfam10767 DUF2593 Protein of unknown function (DUF2593). This family of proteins appears to be restricted to Enterobacteriaceae. Some members in the family are annotated as YbjO; however, currently there is no known function. 141
53519 402412 pfam10768 FliX Class II flagellar assembly regulator. The FliX protein is possibly a transient component of the flagellum that is required for the assembly process. FliX may contribute to the targeting or assembly of the P- and L-ring protein monomers at the cell pole. The family carries a potential N-terminal signal sequence and at least one transmembrane domain indicating that it might function either in or in association with the cell membrane. 135
53520 371234 pfam10769 DUF2594 Protein of unknown function (DUF2594). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 74
53521 402413 pfam10771 DUF2582 Winged helix-turn-helix domain (DUF2582). This family is conserved in bacteria and archaea. The function is not known. The structure of two proteins in this family were solved using NMR and shown to adopt a winged helix-turn-helix fold. Structural analysis shows that these proteins form an unusual dimeric conformation. This dimer was shown to be similar to that found in the FadR and TubR wHTH domains. It was suggested that these proteins are not very likely to bind to DNA. 65
53522 402414 pfam10772 DUF2597 Protein of unknown function (DUF2597). This family of proteins has no known function. 134
53523 402415 pfam10774 DUF4226 Domain of unknown function (DUF4226). This family of mycobacterial proteins is uncharacterized. 115
53524 402416 pfam10775 ATP_sub_h ATP synthase complex subunit h. Subunit h is a component of the yeast mitochondrial F1-F0 ATP synthase. It is essential for the correct assembly and functioning of this enzyme. Subunit h occupies a central place in the peripheral stalk between the F1 sector and the membrane. 67
53525 378492 pfam10776 DUF2600 Protein of unknown function (DUF2600). This is a bacterial family of proteins. Some members in the family are annotated as YtpB however currently no function is known. 328
53526 402417 pfam10777 YlaC Inner membrane protein YlaC. Members of this family include proteins annotated as inner membrane protein YlaC in E. coli and Salmonella. The function of this family is unknown. 154
53527 402418 pfam10778 DehI Halocarboxylic acid dehydrogenase DehI. Haloacid dehalogenases catalyze the removal of halides from organic haloacids. DehI can process both L- and D-substrates. A crucial aspartate residue is predicted to activate a water molecule for nucleophilic attack of the substrate chiral centre resulting in an inversion of the configuration of either L- or D-substrates in contrast to D-only enzymes. 148
53528 402419 pfam10779 XhlA Haemolysin XhlA. XhlA is a cell-surface associated haemolysin that lyses the two most prevalent types of insect immune cells (granulocytes and plasmatocytes) as well as rabbit and horse erythrocytes. This family has had DUF1267, pfam06895, merged into it. 67
53529 402420 pfam10780 MRP_L53 39S ribosomal protein L53/MRP-L53. MRP-L53 is also known as Mrp144. It is part of the 39S ribosome. 52
53530 402421 pfam10781 DSRB Dextransucrase DSRB. DSRB is a novel dextransucrase which produces a dextran different from the typical dextran, as it contains (1-6) and (1-2) linkages, when this strain is grown in the presence of sucrose. 61
53531 402422 pfam10782 zf-C2HCIx2C Zinc-finger. This bacterial family of proteins is a zinc-finger domain of the C2HC type with an additional cysteine. 58
53532 402423 pfam10783 DUF2599 Protein of unknown function (DUF2599). This family is conserved in Actinobacteria. The function is not known. 94
53533 119304 pfam10784 Plasmid_stab_B Plasmid stability protein. This family is conserved in the Enterobacteriales. It is a putative plasmid stability protein in that it is expressed from the operon involved in stability, but its actual function has not yet been characterized. 72
53534 402424 pfam10785 NADH-u_ox-rdase NADH-ubiquinone oxidoreductase complex I, 21 kDa subunit. This family is the N-terminal domain of NADH-ubiquinone oxidoreductase 21 kDa subunits from fungi, lower metazoa and plants. 84
53535 402425 pfam10786 G6PD_bact Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49). This family is conserved in Firmicutes and Proteobacteria. Several members are annotated as being glucose-6-phosphate 1-dehydrogenase (EC:1.1.1.49) but this could not be confirmed. 213
53536 402426 pfam10787 YfmQ Uncharacterized protein from Bacillus cereus group. This family is conserved in the Bacillus cereus group. Several members are called YfmQ but the function is not known. 141
53537 402427 pfam10788 DUF2603 Protein of unknown function (DUF2603). This family is conserved in Epsilon-proteobacteria. The function is not known. 134
53538 256166 pfam10789 Phage_RpbA Phage RNA polymerase binding, RpbA. Upon infection the phage-encoded RpbA protein binds to the ADP-ribosylated core RNA polymerase and modulates its function to preferentially bind T4 promoters. This protein is non-essential to the phage life cycle. 108
53539 402428 pfam10790 DUF2604 Protein of Unknown function (DUF2604). Family of bacterial proteins with undetermined function. 76
53540 402429 pfam10791 F1F0-ATPsyn_F Mitochondrial F1-F0 ATP synthase subunit F of fungi. The membrane bound F1-FO-type H+ ATP synthase of mitochondria catalyzes the terminal step in oxidative respiration converting the generation of the electrochemical gradient into ATP for cellular biosynthesis. The general structure and the core subunits of the enzyme are highly conserved in both prokaryotic and eukaryotic organisms. 87
53541 402430 pfam10792 DUF2605 Protein of unknown function (DUF2605). This family is conserved in Cyanobacteria. The function is not known. 96
53542 371249 pfam10793 Gloverin Gloverin-like protein. This family of proteins are Gloverin-like. Gloverin is a 13.8kDa inducible antibacterial insect protein which inhibits the synthesis of vital outer membrane proteins leading to a permeable outer membrane. Gloverin contains a large number of glycine residues. 161
53543 287732 pfam10794 DUF2606 Protein of unknown function (DUF2606). Family of bacterial proteins with unknown function. These proteins have been classified as membrane proteins. 134
53544 402431 pfam10795 DUF2607 Protein of unknown function (DUF2607). This family is conserved in Gammaproteobacteria. The function is not known. 94
53545 402432 pfam10796 Anti-adapt_IraP Sigma-S stabilisation anti-adaptor protein. This family is conserved in Enterobacteriaceae. It is one of a series of proteins, expressed by these bacteria in response to stress, that help to regulate Sigma-S, the stationary phase sigma factor of Escherichia coli and Salmonella. IraP is essential for Sigma-S stabilisation in some but not all starvation conditions. 86
53546 402433 pfam10797 YhfT Protein of unknown function. This family is conserved in Firmicutes and Proteobacteria. The function is not known but several members are annotated as being homologs of E coli YhfT, a protein thought to be involved in fatty acid oxidation. 422
53547 402434 pfam10798 YmgB Biofilm development protein YmgB/AriR. YmgB is part of the three gene cluster ymgABC which has a role in biofilm development and stability. YmgB represses biofilm formation in rich medium containing glucose, decreases cellular motility and also protects the cell from acid which indicates that YmgB has an important function in acid-resistance. YmgB binds as a dimer to genes which are important for biofilm formation via a ligand. Due to its important function in acid resistance it is also known as AriR (regulator of acid resistance influenced by indole). 59
53548 313902 pfam10799 YliH Biofilm formation protein (YliH/bssR). YliH is induced in biofilms and is involved in repression of motility in the biofilms. YliH is also known as bssR (regulator of biofilm through signal secretion). 126
53549 402435 pfam10800 DUF2528 Protein of unknown function (DUF2528). This family of proteins has no known function. Some of the sequences are annotated as ea10 however the function of this protein is unknown. 103
53550 402436 pfam10801 DUF2537 Protein of unknown function (DUF2537). This bacterial family of proteins has no known function. 75
53551 402437 pfam10802 DUF2540 Protein of unknown function (DUF2540). This family of proteins with unknown function appears to be restricted to Methanococcus. 75
53552 313906 pfam10803 GerPB Spore germination GerPB. Members of this family are required for formation of functionally normal spores. They may be involved in the establishment of spore coat structure or permeability. 52
53553 402438 pfam10804 DUF2538 Protein of unknown function (DUF2538). This family of proteins has no known function. 155
53554 402439 pfam10805 DUF2730 Protein of unknown function (DUF2730). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 101
53555 402440 pfam10806 SAM35 SAM35, subunit of SAM complex. SAM35 is a family of fungal proteins found in the peripheral mitochondrial outer membrane. It is essential for cell viability. It forms a subunit of the SAM (sorting and assembly machinery) complex and is crucial for the assembly of the precursors of Tom40 and porin, the outer membrane beta-barrel proteins involved in mitochondrial biogenesis. SAM35 is required in order for the Sam50 subunit of the SAM complex to bind outer membrane substrate proteins. 125
53556 287745 pfam10807 DUF2541 Protein of unknown function (DUF2541). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. All proteins are annotated as YaaI precursor however currently no function is known. 130
53557 151258 pfam10808 DUF2542 Protein of unknown function (DUF2542). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. The family has a highly conserved sequence. 79
53558 402441 pfam10809 DUF2732 Protein of unknown function (DUF2732). This family of proteins has no known function. 74
53559 371254 pfam10810 DUF2545 Protein of unknown function (DUF2545). This family of proteins with unknown function is restricted to Enterobacteriaceae. The sequence is highly conserved. 80
53560 287748 pfam10811 DUF2532 Protein of unknown function (DUF2532). This bacterial family of proteins has no known function. 158
53561 402442 pfam10812 DUF2561 Protein of unknown function (DUF2561). This family of proteins with unknown function appears to be restricted to Mycobacterium spp. 204
53562 287750 pfam10813 DUF2733 Protein of unknown function (DUF2733). This viral family of proteins has no known function. 32
53563 402443 pfam10814 CwsA Cell wall synthesis protein CwsA. Cell wall synthesis protein CwsA is required for cell division, cell wall synthesis and cell shape maintenance. 133
53564 313911 pfam10815 ComZ ComZ. ComZ is part of a two gene operon. It affects competence regulation by negatively affecting the transcription of the ComG operon. ComZ contains a leucine zipper motif. 55
53565 402444 pfam10816 DUF2760 Domain of unknown function (DUF2760). This is a bacterial family of uncharacterized proteins. 123
53566 287754 pfam10817 DUF2563 Protein of unknown function (DUF2563). This family of proteins with unknown function appears to be restricted to Mycobacterium. 104
53567 402445 pfam10818 DUF2547 Protein of unknown function (DUF2547). This bacterial family of proteins has no known function. 96
53568 287756 pfam10819 DUF2564 Protein of unknown function (DUF2564). This family of proteins with unknown function appears to be restricted to Bacillus spp. 78
53569 402446 pfam10820 DUF2543 Protein of unknown function (DUF2543). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. The family has a highly conserved sequence. 81
53570 402447 pfam10821 DUF2567 Protein of unknown function (DUF2567). This is a bacterial family of proteins with unknown function. 166
53571 402448 pfam10823 DUF2568 Protein of unknown function (DUF2568). One member in this family is annotated as yrdB which is part of a four gene operon however currently no function is known. 92
53572 371259 pfam10824 T7SS_ESX_EspC Excreted virulence factor EspC, type VII ESX diderm. T7SS_ESX-EspC is a family of exported virulence proteins from largely Actinobacteria and a few Firmicutes, Gram-positive bacteria. It is exported in conjunction with EspA as an interacting pair. 100
53573 402449 pfam10825 DUF2752 Protein of unknown function (DUF2752). This family is conserved in bacteria. Many members are annotated as being putative membrane proteins. 47
53574 402450 pfam10826 DUF2551 Protein of unknown function (DUF2551). This Archaeal family of proteins has no known function. 83
53575 371260 pfam10827 DUF2552 Protein of unknown function (DUF2552). This bacterial family of proteins has no known function. 79
53576 402451 pfam10828 DUF2570 Protein of unknown function (DUF2570). This is a family of proteins with unknown function. 108
53577 313921 pfam10829 DUF2554 Protein of unknown function (DUF2554). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 76
53578 371262 pfam10830 DUF2553 Protein of unknown function (DUF2553). This family of bacterial proteins has no known function. 75
53579 402452 pfam10831 DUF2556 Protein of unknown function (DUF2556). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 53
53580 402453 pfam10832 DUF2559 Protein of unknown function (DUF2559). This family of proteins appears to be restricted to Enterobacteriaceae. The sequences are annotated as yhfG however currently no function is known. 52
53581 402454 pfam10833 DUF2572 Protein of unknown function (DUF2572). This bacterial family of proteins has no known function. 220
53582 287770 pfam10834 DUF2560 Protein of unknown function (DUF2560). This family of proteins has no known function. 72
53583 313925 pfam10835 DUF2573 Protein of unknown function (DUF2573). Some members in this bacterial family of proteins are annotated as YusU however no function is currently known. This family of proteins appears to be restricted to Bacillus spp. 75
53584 287772 pfam10836 DUF2574 Protein of unknown function (DUF2574). This family of proteins appears to be restricted to Enterobacteriaceae. Members of the family are annotated as yehE however currently no function is known. 93
53585 313926 pfam10838 DUF2677 Protein of unknown function (DUF2677). Members in this family of proteins are annotated as UL121 however currently no function is known. 166
53586 287774 pfam10839 DUF2647 Protein of unknown function (DUF2647). This eukaryotic family of proteins is annotated as ycf68 but has no known function. 70
53587 337858 pfam10840 DUF2645 Protein of unknown function (DUF2645). This family of proteins appears to be restricted to Enterobacteriaceae. Some members in the family are annotated as YjeO however no function for this protein is currently known. 98
53588 402455 pfam10841 DUF2644 Protein of unknown function (DUF2644). This family of proteins with unknown function appears to be restricted to Pasteurellaceae. 59
53589 402456 pfam10842 DUF2642 Protein of unknown function (DUF2642). This family of proteins with unknown function appears to be restricted to Bacillus spp. 61
53590 371266 pfam10843 RGI1 Respiratory growth induced protein 1. This family of fungal proteins includes RGI1, standing for respiratory growth induced 1. RGI1 is involved in aerobic energetic metabolism. 194
53591 402457 pfam10844 DUF2577 Protein of unknown function (DUF2577). This family of proteins has no known function. 106
53592 313931 pfam10845 DUF2576 Protein of unknown function (DUF2576). The function of this viral family of proteins is unknown. 48
53593 371267 pfam10846 DUF2722 Protein of unknown function (DUF2722). This eukaryotic family of proteins has no known function. 369
53594 402458 pfam10847 DUF2656 Protein of unknown function (DUF2656). This bacterial family of proteins has no known function. 141
53595 287783 pfam10849 DUF2654 Protein of unknown function (DUF2654). Some members in this family of proteins are annotated as a-gt.4 however currently no function is known. 70
53596 313933 pfam10850 DUF2653 Protein of unknown function (DUF2653). This family of proteins with unknown function appears to be restricted to Bacillus spp. 88
53597 402459 pfam10851 DUF2652 Protein of unknown function (DUF2652). This family of proteins has no known function. 118
53598 371270 pfam10852 DUF2651 Protein of unknown function (DUF2651). This family of proteins with unknown function appears to be restricted to Bacillus spp. 73
53599 402460 pfam10853 DUF2650 Protein of unknown function (DUF2650). This family of proteins with unknown function appears to be restricted to Caenorhabditis elegans. 35
53600 287788 pfam10854 DUF2649 Protein of unknown function (DUF2649). Members in this family of proteins are annotated as Plectrovirus orf 10 transmembrane proteins however currently no function is known. 67
53601 402461 pfam10856 DUF2678 Protein of unknown function (DUF2678). This family of proteins has no known function. 118
53602 287791 pfam10857 DUF2701 Protein of unknown function (DUF2701). This viral family of proteins has no known function. 63
53603 402462 pfam10858 DUF2659 Protein of unknown function (DUF2659). This bacterial family of proteins has no known function. 224
53604 313938 pfam10859 DUF2660 Protein of unknown function (DUF2660). This is a family of proteins with unknown function. 91
53605 287794 pfam10860 DUF2661 Protein of unknown function (DUF2661). This viral family of proteins has no known function. 113
53606 402463 pfam10861 DUF2784 Protein of Unknown function (DUF2784). This is a family of uncharacterized proteins. The function is not known however it is conserved in Bacteria. 105
53607 402464 pfam10862 FcoT FcoT-like thioesterase domain. Proteins in this family have a HotDog fold. This family was formerly known as domain of unknown function 2662 (DUF2662). The structure of Rv0098 from M. tuberculosis suggested a thioesterase function. Assays showed that this protein was a thioesterase with a preference for long chain fatty acyl groups. The maximal Kcat was observed for palmitoyl-CoA although longer and shorter molecules were also cleaved. In solution this protein forms a homo-hexameric complex. 150
53608 402465 pfam10863 NOP19 Nucleolar protein 19. Nucleolar protein 19 plays an essential role in 40S ribosomal subunit biogenesis. 140
53609 371275 pfam10864 DUF2663 Protein of unknown function (DUF2663). Some members in this family of proteins are annotated as YpbF however currently no function is known. 130
53610 402466 pfam10865 DUF2703 Domain of unknown function (DUF2703). This family of proteins has no known function, but it may be distantly related to the thioredoxin fold. It contains the CXXC motif that is characteristic of thioredoxins. 120
53611 313944 pfam10866 DUF2704 Protein of unknown function (DUF2704). This viral family of proteins has no known function. 167
53612 287800 pfam10867 DUF2664 Protein of unknown function (DUF2664). This family of proteins is a viral family, annotated as UL96. Currently no function is known. 89
53613 402467 pfam10868 Defensin_like Cysteine-rich antifungal protein 2, defensin-like. This is a family of plant antifungal proteins. It has insecticidal and antifungal activity against certain plant pathogens. 50
53614 402468 pfam10869 DUF2666 Protein of unknown function (DUF2666). This Archaeal family of proteins has no known function. 135
53615 287802 pfam10870 DUF2729 Protein of unknown function (DUF2729). This viral family of proteins has no known function. 57
53616 402469 pfam10871 DUF2748 Protein of unknown function (DUF2748). This is a bacterial family of proteins with unknown function. 439
53617 287804 pfam10872 DUF2740 Protein of unknown function (DUF2740). This family of proteins with unknown function has a highly conserved sequence. 48
53618 313948 pfam10873 CYYR1 Cysteine and tyrosine-rich protein 1. Members in this family of proteins are annotated as Cysteine and tyrosine-rich protein 1, however currently no function is known. 149
53619 371279 pfam10874 DUF2746 Protein of unknown function (DUF2746). This family of proteins has no known function. 101
53620 287807 pfam10875 DUF2670 Protein of unknown function (DUF2670). This bacterial family of proteins has no known function. 145
53621 337861 pfam10876 Phage_TAC_9 Phage tail assembly chaperone protein, TAC. This is a family of putative phage tail assembly chaperone proteins largely from Haemophilus and Xylella species. 133
53622 402470 pfam10877 DUF2671 Protein of unknown function (DUF2671). This family of proteins with unknown function appears to be restricted to Rickettsia spp. 89
53623 313952 pfam10878 DUF2672 Protein of unknown function (DUF2672). This family of proteins with unknown function appears to be restricted to Rickettsiae. 67
53624 402471 pfam10879 DUF2674 Protein of unknown function (DUF2674). This family of proteins with unknown function appears to be conserved in Rickettsia spp. 63
53625 313953 pfam10880 DUF2673 Protein of unknown function (DUF2673). This family of proteins with unknown function appears to be restricted to Rickettsiae spp. 82
53626 371282 pfam10881 DUF2726 Protein of unknown function (DUF2726). This bacterial family of proteins has no known function. 127
53627 378504 pfam10882 bPH_5 Bacterial PH domain. This family of proteins with unknown function appears to be related to bacterial PH domains. This family was formerly known as DUF2679. 100
53628 402472 pfam10883 DUF2681 Protein of unknown function (DUF2681). This family of proteins is found in bacteria. Proteins in this family are typically between 81 and 117 amino acids in length. 87
53629 287815 pfam10884 DUF2683 Protein of unknown function (DUF2683). This family of proteins with unknown function appears to be restricted to Methanosarcinaceae. 78
53630 287817 pfam10886 DUF2685 Protein of unknown function (DUF2685). Members in this family of proteins are annotated as uvdY.-2 which is an open reading frame within uvsY. However currently there is no known function. 55
53631 287818 pfam10887 DUF2686 Protein of unknown function (DUF2686). Some members in this family of proteins are annotated as yjfZ however currently no function is known. 285
53632 371285 pfam10888 DUF2742 Protein of unknown function (DUF2742). Members in this family of phage proteins are the product of the gene phiRv1, however no function is known. 97
53633 371286 pfam10890 Cyt_b-c1_8 Cytochrome b-c1 complex subunit 8. This entry represents subunit 8 of the Cytochrome b-c1 complex. 72
53634 151339 pfam10891 DUF2719 Protein of unknown function (DUF2719). This family of proteins with unknown function appears to be restricted to Nucleopolyhedrovirus. 81
53635 371287 pfam10892 DUF2688 Protein of unknown function (DUF2688). Members in this family of proteins are annotated as KleB however currently no function is known. 56
53636 371288 pfam10893 DUF2724 Protein of unknown function (DUF2724). This is a family of proteins with unknown function. 63
53637 313958 pfam10894 DUF2689 Protein of unknown function (DUF2689). Members in this family of proteins are annotated as TrbD however currently no function is known. 57
53638 371289 pfam10895 DUF2715 Domain of unknown function (DUF2715). This family of proteins with unknown function appears to be largely found in spirochaete bacteria. It is related to membrane beta barrel proteins. 153
53639 402473 pfam10896 DUF2714 Protein of unknown function (DUF2714). This family of proteins with unknown function appears to be restricted to Mycoplasmataceae. 143
53640 313960 pfam10897 DUF2713 Protein of unknown function (DUF2713). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 235
53641 371291 pfam10898 DUF2716 Protein of unknown function (DUF2716). This bacterial family of proteins has no known function. 140
53642 402474 pfam10899 AbiGi Putative abortive phage resistance protein AbiGi, antitoxin. This is a bacterial family of proteins with unknown function. AbiGi is a family of putative type IV toxin-antitoxin system antitoxins. The AbiG abortive phage resistance system affects lactococcal bacteriophages phiP335 and phiQ30 but not the other P335 phage species. AbiGii toxin appears to confer resistance to phages by a mechanism of abortive infection that acts by interfering with phage RNA synthesis. The cognate toxin is found in pfam16873. 178
53643 402475 pfam10901 DUF2690 Protein of unknown function (DUF2690). This bacterial family of proteins has no known function. 86
53644 402476 pfam10902 WYL_2 WYL_2, Sm-like SH3 beta-barrel fold. WYL_2 is a family of Sm-like SH3 beta-barrel fold containing domains. WYL is named for three conserved amino acids found in a subset of domains of this superfamily. These residues are not strongly conserved throughout the family. Rather, the conservation pattern includes four basic residues and a position often occupied by a cysteine, which are predicted to line a ligand-binding groove typical of the Sm-like SH3 beta-barrels. It is predicted to be a ligand-sensing domain that could bind negatively charged ligands, such as nucleotides or nucleic acid fragments, to regulate CRISPR-Cas and other defense systems such as the abortive infection AbiG system. 74
53645 371294 pfam10903 DUF2691 Protein of unknown function (DUF2691). This bacterial family of proteins has no known function. 152
53646 287827 pfam10904 DUF2694 Protein of unknown function (DUF2694). This family of proteins with unknown function appears to be restricted to Mycobacterium spp. 97
53647 402477 pfam10905 DUF2695 Protein of unknown function (DUF2695). This bacterial family of proteins has no known function. 53
53648 371295 pfam10906 Mrx7 MIOREX complex component 7. This entry includes budding yeast MIOREX complex component 7 (Mrx7), which associates with mitochondrial ribosome. Its function is not clear. 66
53649 337866 pfam10907 DUF2749 Protein of unknown function (DUF2749). This bacterial family of proteins appears to come from the Trb operon however currently no function is known. 64
53650 402478 pfam10908 DUF2778 Protein of unknown function (DUF2778). This is a bacterial family of uncharacterized proteins. 119
53651 151356 pfam10909 DUF2682 Protein of unknown function (DUF2682). This viral family of proteins has no known function. 77
53652 287830 pfam10910 DUF2744 Protein of unknown function (DUF2744). This is a viral family of proteins with unknown function. 119
53653 151358 pfam10911 DUF2717 Protein of unknown function (DUF2717). Members in this family of proteins are annotated as gene 6.5 protein however currently there is no known function. 77
53654 371296 pfam10912 DUF2700 Protein of unknown function (DUF2700). This family of proteins with unknown function appears to be restricted to Caenorhabditis elegans. 136
53655 313971 pfam10913 DUF2706 Protein of unknown function (DUF2706). This family of proteins with unknown function appears to be restricted to Rickettsia spp. 59
53656 371297 pfam10914 DUF2781 Protein of unknown function (DUF2781). This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins. 145
53657 402479 pfam10915 DUF2709 Protein of unknown function (DUF2709). This bacterial family of proteins has no known function. 237
53658 402480 pfam10916 DUF2712 Protein of unknown function (DUF2712). This family of proteins with unknown function appears to be restricted to Bacillales. 115
53659 371300 pfam10917 Fungus-induced Fungus-induced protein. This entry represents fungus-induced proteins which may have a role in hypoxia response. 49
53660 313975 pfam10918 DUF2718 Protein of unknown function (DUF2718). This viral family of proteins has no known function. 129
53661 402481 pfam10920 DUF2705 Protein of unknown function (DUF2705). This bacterial family of proteins has no known function. 226
53662 287837 pfam10921 DUF2710 Protein of unknown function (DUF2710). This family of proteins with unknown function appears to be restricted to Mycobacteriaceae. 104
53663 313976 pfam10922 DUF2745 Protein of unknown function (DUF2745). This is a viral family of proteins with unknown function. 85
53664 402482 pfam10923 DUF2791 P-loop Domain of unknown function (DUF2791). This is a family of proteins found in archaea and bacteria. This domain contains a P-loop motif suggesting it binds to a nucleotide such as ATP. 412
53665 402483 pfam10924 DUF2711 Protein of unknown function (DUF2711). Some members in this family of proteins are annotated as ywbB however currently there is no known function. 216
53666 371301 pfam10925 DUF2680 Protein of unknown function (DUF2680). Members in this family of proteins are annotated as yckD however currently no function is known. 57
53667 402484 pfam10926 DUF2800 Protein of unknown function (DUF2800). This is a family of uncharacterized proteins found in bacteria and viruses. Some members of this family are annotated as being Phi APSE P51-like proteins. 366
53668 287842 pfam10927 DUF2738 Protein of unknown function (DUF2738). This is a viral family of proteins with unknown function. 236
53669 402485 pfam10928 DUF2810 Protein of unknown function (DUF2810). This is a bacterial family of uncharacterized proteins. 53
53670 402486 pfam10929 DUF2811 Protein of unknown function (DUF2811). This is a bacterial family of uncharacterized proteins. 57
53671 402487 pfam10930 DUF2737 Protein of unknown function (DUF2737). This family of proteins has no known function. 53
53672 371303 pfam10931 DUF2735 Protein of unknown function (DUF2735). Some members in this family of proteins are annotated as glutamine synthetase translation inhibitor however this function can not be confirmed. 52
53673 402488 pfam10932 DUF2783 Protein of unknown function (DUF2783). This is a bacterial family of uncharacterized proteins. 59
53674 402489 pfam10933 DUF2827 Protein of unknown function (DUF2827). This is a family of uncharacterized proteins found in Burkholderia. 362
53675 402490 pfam10934 DUF2634 Protein of unknown function (DUF2634). Some members in this family of proteins are annotated as phage related, xkdS however currently there is no known function. 104
53676 402491 pfam10935 DUF2637 Protein of unknown function (DUF2637). This family of proteins has no known function. 161
53677 402492 pfam10936 DUF2617 Protein of unknown function DUF2617. This bacterial family of proteins has no known function. 156
53678 402493 pfam10937 S36_mt Ribosomal protein S36, mitochondrial. This entry is represented by a mitochondrial ribosomal protein of the small subunit, which has similarity to human mitochondrial ribosomal protein MRP-S36. 116
53679 402494 pfam10938 YfdX YfdX protein. YfdX is a protein of unknown function found in Proteobacteria. The gene coding for this protein is regulated by EvgA in E. coli. 148
53680 402495 pfam10939 DUF2631 Protein of unknown function (DUF2631). This is a bacterial family of proteins with unknown function. 63
53681 287855 pfam10940 DUF2618 Protein of unknown function (DUF2618). This bacterial family of proteins has no known function. The sequences within the family are highly conserved. 40
53682 402496 pfam10941 DUF2620 Protein of unknown function DUF2620. This is a bacterial family of proteins with unknown function. 116
53683 402497 pfam10942 DUF2619 Protein of unknown function (DUF2619). This bacterial family of proteins has no known function. 69
53684 287858 pfam10943 DUF2632 Protein of unknown function (DUF2632). This is a family of membrane proteins with unknown function. 233
53685 402498 pfam10944 DUF2630 Protein of unknown function (DUF2630). This bacterial family of proteins has no known function. 80
53686 402499 pfam10945 CBP_BcsR Cellulose biosynthesis protein BcsR. CBP_BcsR is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation). 42
53687 402500 pfam10946 DUF2625 Protein of unknown function DUF2625. Some members in this family of proteins are annotated as ybfG however currently no function is known. 207
53688 402501 pfam10947 DUF2628 Protein of unknown function (DUF2628). Some members in this family of proteins are annotated as yigF however currently no function is known. 78
53689 378516 pfam10948 DUF2635 Protein of unknown function (DUF2635). This is a family of phage proteins with unknown function. 46
53690 371311 pfam10949 DUF2777 Protein of unknown function (DUF2777). This family of proteins with unknown function appears to be restricted to Bacillus cereus. 181
53691 402502 pfam10950 Organ_specific Organ specific protein. This eukaryotic family includes a number of plant organ-specific proteins. While their function is unknown, their predicted amino acid sequence suggests that these proteins could be exported and glycosylated. 117
53692 402503 pfam10951 DUF2776 Protein of unknown function (DUF2776). This bacterial family of proteins has no known function. 348
53693 287867 pfam10952 DUF2753 Protein of unknown function (DUF2753). This bacterial family of proteins has no known function. 140
53694 402504 pfam10953 DUF2754 Protein of unknown function (DUF2754). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 70
53695 402505 pfam10954 DUF2755 Protein of unknown function (DUF2755). Some members in this family of proteins are annotated as YaiY however no function is known. The family appears to be restricted to Enterobacteriaceae. 100
53696 402506 pfam10955 DUF2757 Protein of unknown function (DUF2757). Members in this family of proteins are annotated as YabK however currently no function is known. 73
53697 402507 pfam10956 DUF2756 Protein of unknown function (DUF2756). Some members in this family of proteins are annotated yhhA however currently no function is known. The family appears to be restricted to Enterobacteriaceae. 104
53698 337871 pfam10957 Spore_Cse60 Sporulation protein Cse60. Cse60 is expressed during sporulation in Bacillus subtilis. Transcription commences around 2h after the start of sporulation and has an absolute requirement for the transcription factor sigmaE. Cse60 is an acidic product of only 60 residues, whose function is not known. 60
53699 378517 pfam10958 DUF2759 Protein of unknown function (DUF2759). This family of proteins with unknown function appears to be restricted to Bacillaceae. 50
53700 371315 pfam10959 DUF2761 Protein of unknown function (DUF2761). Members in this family of proteins are annotated as KleF however no function is known. 94
53701 402508 pfam10960 Holin_BhlA BhlA holin family. The Phage_holin_BhlA family is a family of holin-like proteins from both bacteriophages and bacterial chromosomes. In bacteriophage, holins are small membrane proteins that accumulate and oligomerize to form non-specific lesions in the cytoplasmic membrane allowing the release of the second protein, endolysins, to access the peptidoglycan. Most holins share common structural features: two or three transmembrane domains separated by a beta-turn, a short hydrophilic N-terminus, a highly charged C-terminus and a dual translational start motif. The BhlA holin of Bacillus is found to be toxic to the host cell, where its site of action is the cell membrane, and it causes bacterial death by cell membrane disruption. 66
53702 402509 pfam10961 SelK_SelG Selenoprotein SelK_SelG. This entry includes a group of eukaryotic selenoproteins, such as SelK and SelG. SelK seems to play an important role in protecting cells from endoplasmic reticulum stress induced apoptosis. SelG may be involved in regulating the redox state of the cell. 81
53703 402510 pfam10962 DUF2764 Protein of unknown function (DUF2764). This bacterial family of proteins has no known function. 271
53704 402511 pfam10963 Phage_TAC_10 Phage tail assembly chaperone. This is a family of phage tail assembly chaperone proteins. 82
53705 371318 pfam10964 DUF2766 Protein of unknown function (DUF2766). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 79
53706 402512 pfam10965 DUF2767 Protein of unknown function (DUF2767). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 67
53707 402513 pfam10966 DUF2768 Protein of unknown function (DUF2768). This family of proteins with unknown function appears to be restricted to Bacillus spp. 58
53708 402514 pfam10967 DUF2769 Protein of unknown function (DUF2769). This family of proteins has no known function. 57
53709 371319 pfam10968 DUF2770 Protein of unknown function (DUF2770). Members in this family of proteins are annotated as yceO however currently no function is known. 36
53710 402515 pfam10969 DUF2771 Protein of unknown function (DUF2771). This bacterial family of proteins has no known function. 128
53711 287884 pfam10970 GerPE Spore germination protein GerPE. GerPE is required for the formation of functionally normal spores. It could be involved in the establishment of a normal spore coat structure and (or) permeability, which allows the access of germinants to their receptor. 123
53712 287885 pfam10971 DUF2773 Protein of unknown function (DUF2773). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. 81
53713 402516 pfam10972 CsiV Peptidoglycan-binding protein, CsiV. CsiV, a small periplasmic protein (cell-shape integrity in Vibrio), is essential for growth of Vibrio cholerae in the presence of DAA, non-canonical amino-acids, the typical components of peptidoglycan side-chains in Vibrio cholerae. CsiV interacts with LpoA, the lipoprotein activator of penicillin-binding-protein1A that is necessary for mediating the assembly of peptidoglycan. CsiV acts through LpoA to promote peptidoglycan biogenesis in V. cholerae and other vibrio species as well as in the other genera where this protein is found. 210
53714 402517 pfam10973 DUF2799 Protein of unknown function (DUF2799). Some members in this family of proteins are annotated as yfiL which has no known function. 86
53715 402518 pfam10974 DUF2804 Protein of unknown function (DUF2804). This is a family of proteins with unknown function. 321
53716 402519 pfam10975 DUF2802 Protein of unknown function (DUF2802). This bacterial family of proteins has no known function. 63
53717 402520 pfam10976 DUF2790 Protein of unknown function (DUF2790). This family of proteins with unknown function appears to be restricted to Pseudomonadaceae. 77
53718 402521 pfam10977 DUF2797 Protein of unknown function (DUF2797). This family of proteins has no known function. 228
53719 402522 pfam10978 DUF2785 Protein of unknown function (DUF2785). Some members in this family are annotated as hypothetical membrane spanning proteins however this cannot be confirmed. The family has no known function. 174
53720 402523 pfam10979 DUF2786 Protein of unknown function (DUF2786). This family of proteins has no known function. 40
53721 402524 pfam10980 DUF2787 Protein of unknown function (DUF2787). This bacterial family of proteins has no known function. 128
53722 402525 pfam10981 DUF2788 Protein of unknown function (DUF2788). This bacterial family of proteins has no known function. 51
53723 402526 pfam10982 DUF2789 Protein of unknown function (DUF2789). This bacterial family of proteins has no known function. 75
53724 402527 pfam10983 DUF2793 Protein of unknown function (DUF2793). This is a bacterial family of proteins with unknown function. 87
53725 402528 pfam10984 DUF2794 Protein of unknown function (DUF2794). This is a bacterial family of proteins with unknown function. 85
53726 402529 pfam10985 DUF2805 Protein of unknown function (DUF2805). This is a bacterial family of proteins with unknown function. 71
53727 402530 pfam10986 DUF2796 Protein of unknown function (DUF2796). This bacterial family of proteins has no known function. 163
53728 371326 pfam10987 DUF2806 Protein of unknown function (DUF2806). This bacterial family of proteins has no known function. 221
53729 402531 pfam10988 DUF2807 Putative auto-transporter adhesin, head GIN domain. This bacterial family of proteins shows structural similarity to other pectin lyase families. Although structures from this family align with acetyl-transferases, there is no conservation of catalytic residues found. It is likely that the function is one of cell-adhesion. In Structure 3jx8, it is interesting to note that the sequence contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that this family belongs to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN). 181
53730 402532 pfam10989 DUF2808 Protein of unknown function (DUF2808). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 144
53731 402533 pfam10990 DUF2809 Protein of unknown function (DUF2809). Some members in this family of proteins are annotated as yjgA however currently no function for the protein is known. 86
53732 402534 pfam10991 DUF2815 Protein of unknown function (DUF2815). This is a phage related family of proteins with unknown function. 167
53733 402535 pfam10992 DUF2816 Protein of unknown function (DUF2816). This eukaryotic family of proteins has no known function. 83
53734 402536 pfam10993 DUF2818 Protein of unknown function (DUF2818). This bacterial family of proteins has no known function. 93
53735 378533 pfam10994 DUF2817 Protein of unknown function (DUF2817). This family of proteins has no known function. 340
53736 402537 pfam10995 CBP_GIL GGDEF I-site like or GIL domain. The GIL domain, for GGDEF I-site like domain, is a c-di-GMP binding domain on the BcsE proteins of enterobacteria. It is not essential for cellulose synthesis but is critical for maximal cellulose production. Cellulose production in enterobacteria is controlled by a two-tiered c-di-GMP-dependent system involving BcsE and the PilZ domain containing glycosyltransferase BcsA. 513
53737 402538 pfam10996 Beta-Casp Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. 107
53738 402539 pfam10997 Amj Alternate to MurJ. This bacterial family of proteins has no known function. However, family members include lipid II flippase Amj, which is required for bacterial cell wall synthesis. It transports lipid-linked peptidoglycan precursors from the inner to the outer surface of the cytoplasmic membrane. 253
53739 402540 pfam10998 DUF2838 Protein of unknown function (DUF2838). This bacterial family of proteins has no known function. 108
53740 402541 pfam10999 DUF2839 Protein of unknown function (DUF2839). This bacterial family of unknown function appear to be restricted to Cyanobacteria. 67
53741 402542 pfam11000 DUF2840 Protein of unknown function (DUF2840). This bacterial family of proteins have no known function. 148
53742 402543 pfam11001 DUF2841 Protein of unknown function (DUF2841). This family of proteins with unknown function are all present in yeast. 122
53743 402544 pfam11002 RDM RFPL defining motif (RDM). The RDM domain is found on RFPL (Ret finger protein like) proteins. In humans, RFPL transcripts can be detected at the onset of neurogenesis in differentiating human embryonic stem cells, and in the developing human neocortex. The RDM domain is thought to have emerged from a neofunctionalisation event. It is found N terminal to the SPRY domain (pfam00622). 42
53744 402545 pfam11003 DUF2842 Protein of unknown function (DUF2842). This bacterial family of proteins have no known function. 61
53745 402546 pfam11004 Kdo_hydroxy 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) hydroxylase. This is a family of 3-deoxy-D-manno-oct-2-ulosonic acid 3-hydroxylases, which catalyze the conversion of 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) to D-glycero-D-talo-oct-2-ulosonic acid (Ko). It contains a potential iron-binding motif, HXDX(n)H (n>40). Hydroxylation activity is iron-dependent. 273
53746 402547 pfam11005 DUF2844 Protein of unknown function (DUF2844). This bacterial family of proteins has no known function. 129
53747 402548 pfam11006 DUF2845 Protein of unknown function (DUF2845). This bacterial family of proteins has no known function. 78
53748 402549 pfam11007 CotJA Spore coat associated protein JA (CotJA). CotJA is part of the CotJ operon which contains CotJA and CotJC. The operon encodes spore coat proteins. Interaction of CotJA with CotJC is required for the assembly of both CotJA and CotJC into the spore coat. 35
53749 402550 pfam11008 DUF2846 Protein of unknown function (DUF2846). Some members in this family of proteins with unknown function are annotated as lipoproteins however this cannot be confirmed. 89
53750 402551 pfam11009 DUF2847 Protein of unknown function (DUF2847). Some members in this bacterial family of proteins with unknown function are annotated as YtxJ, a putative general stress protein. This cannot be confirmed. 103
53751 402552 pfam11010 DUF2848 Protein of unknown function (DUF2848). This bacterial family of proteins has no known function. 194
53752 402553 pfam11011 DUF2849 Protein of unknown function (DUF2849). This bacterial family of proteins has no known function. 86
53753 314054 pfam11012 DUF2850 Protein of unknown function (DUF2850). This family of proteins with unknown function appear to be restricted to Vibrionaceae. 78
53754 402554 pfam11013 DUF2851 Protein of unknown function (DUF2851). This bacterial family of proteins has no known function. 369
53755 402555 pfam11014 DUF2852 Protein of unknown function (DUF2852). This bacterial family of proteins has no known function. 116
53756 402556 pfam11015 DUF2853 Protein of unknown function (DUF2853). This bacterial family of proteins has no known function. 101
53757 402557 pfam11016 DUF2854 Protein of unknown function (DUF2854). This family of proteins has no known function. 145
53758 402558 pfam11017 DUF2855 Protein of unknown function (DUF2855). This family of proteins has no known function. 334
53759 371343 pfam11018 Cuticle_3 Pupal cuticle protein C1. Insect cuticles are composite structures whose mechanical properties are optimized for biological function. The major components are the chitin filament system and the cuticular proteins, and the cuticle's properties are determined largely by the interactions between these two sets of molecules. The proteins can be ordered by species. 186
53760 378542 pfam11019 DUF2608 Protein of unknown function (DUF2608). This family is conserved in Bacteria. The function is not known. 240
53761 371344 pfam11020 DUF2610 Domain of unknown function (DUF2610). This family is conserved in Proteobacteria. One member is annotated as being elongation factor P but this could not be confirmed. This domain is related to the Ribbon-helix-helix superfamily so may be a DNA-binding protein. 78
53762 402559 pfam11021 DUF2613 Protein of unknown function (DUF2613). This is a family of putative small secreted proteins expressed by Actinobacteria. The function is not known. 55
53763 402560 pfam11022 DUF2611 Protein of unknown function (DUF2611). This family is conserved in the Dikarya of Fungi. The function is not known. 64
53764 402561 pfam11023 DUF2614 Zinc-ribbon containing domain. This is a family of proteins conserved in the Bacillaceae family. Some members are annotated as being protein YgzB. The function is not known. 111
53765 337892 pfam11024 DGF-1_4 Dispersed gene family protein 1 of Trypanosoma cruzi region 4. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_5. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C. 70
53766 314067 pfam11025 GP40 Glycoprotein GP40 of Cryptosporidium. This family is highly conserved in Cryptosporidium spp. Many members are annotated as being a 60 kDa glycoprotein. 164
53767 402562 pfam11026 DUF2721 Protein of unknown function (DUF2721). This family is conserved in bacteria. The function is not known. 127
53768 402563 pfam11027 DUF2615 Protein of unknown function (DUF2615). This small, approximately 100 residue, family is conserved from worms to humans. It is cysteine-rich with a characteristic FDxCEC sequence motif. The function is not known. 102
53769 402564 pfam11028 DUF2723 Protein of unknown function (DUF2723). This family is conserved in bacteria. The function is not known. 168
53770 402565 pfam11029 DAZAP2 DAZ associated protein 2 (DAZAP2). DAZ associated protein 2 has a highly conserved sequence throughout evolution including a conserved polyproline region and several SH2/SH3 binding sites. It occurs as a single copy gene with a four-exon organisation and is located on chromosome 12. It encodes a ubiquitously expressed protein and binds to DAZ and DAZL1 through DAZ repeats. 129
53771 287944 pfam11030 Nucleocapsid-N Nucleocapsid protein N. This is the N protein of the nucleocapsid. The nucleocapsid functions to protect the RNA against nuclease degradation and to promote its reverse transcription. The NC protein promotes viral RNA dimerization and encapsidation and initiates reverse transcription by activating the annealing of the primer tRNA to the initiation site. 167
53772 287945 pfam11031 Phage_holin_T Bacteriophage T holin. Bacteriophage effects host lysis with T holin along with an endolysin. T disrupts the membrane allowing sequential events which lead to the attack of the peptidoglycan. T has an unusual periplasmic domain which transduces environmental information for the real-time control of lysis timing. 233
53773 402566 pfam11032 ApoM ApoM domain. ApoM is a 25 kDa plasma protein associated with high-density lipoproteins (HDLs). ApoM is important in the formation of pre-ss-HDL and also in increasing cholesterol efflux from macrophage foam cells. Lipoproteins consist of lipids solubilized by apolipoproteins. ApoM lacks an external amphipathic motif and is uniquely secreted to plasma without cleavage of its terminal signal peptide. 188
53774 402567 pfam11033 ComJ Competence protein J (ComJ). ComJ is a competence specific protein. 122
53775 402568 pfam11034 Grg1 Glucose-repressible protein Grg1. This fungal protein increases during glucose deprivation. Its function is unknown. 65
53776 402569 pfam11035 SnAPC_2_like Small nuclear RNA activating complex subunit 2, SNAP190 Myb. This family of proteins is snRNA-activating protein complex subunit 2 (SnAPC subunit 2). SnAPC complex allows the transcription of human small nuclear RNA genes to occur by recognition of the proximal sequence element, the TATA box. The family functions both to specifically recognize the proximal sequence element present in the core promoters of human snRNA genes and to stimulate TBP recognition of the neighboring TATA box present in human U6 snRNA promoters. 331
53777 151483 pfam11036 YqgB Virulence promoting factor. YqgB encodes adaptive factors that act in synergy with vqfZ, enabling the bacteria to cope with the physical environment in vivo, facilitating colonisation of the host. 43
53778 371352 pfam11037 Musclin Insulin-resistance promoting peptide in skeletal muscle. Musclin is a muscle derived secretory peptide which induces insulin resistance in vitro. It encodes a 130 amino acid sequence including a NH(2) terminal 30 amino acid signal sequence. Musclin expression level is tightly regulated by nutritional changes. 132
53779 287951 pfam11038 DGF-1_5 Dispersed gene family protein 1 of Trypanosoma cruzi region 5. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_4. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C. 278
53780 371353 pfam11039 DUF2824 Protein of unknown function (DUF2824). This family of proteins has no known function. Some members in the family are annotated as the P22 head assembly protein gp14 however this cannot be confirmed. 151
53781 337895 pfam11040 DGF-1_C Dispersed gene family protein 1 of Trypanosoma cruzi C-terminus. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. This is the very C-terminal part of the protein. 87
53782 402570 pfam11041 DUF2612 Protein of unknown function (DUF2612). This is a phage protein family expressed from a range of Proteobacteria species. The function is not known. 181
53783 402571 pfam11042 DUF2750 Protein of unknown function (DUF2750). This family is conserved in Proteobacteria. The function is not known. 102
53784 287956 pfam11043 DUF2856 Protein of unknown function (DUF2856). Some members in this viral family of proteins with unknown function are annotated as Abc2 however this cannot be confirmed. 97
53785 371355 pfam11044 TMEMspv1-c74-12 Plectrovirus spv1-c74 ORF 12 transmembrane protein. This is a family of proteins expressed by Plectroviruses. The plectroviruses are single-stranded DNA viruses belonging to the Inoviridae. Except that it is a putative transmembrane protein the function is not known. 49
53786 402572 pfam11045 YbjM Putative inner membrane protein of Enterobacteriaceae. This family is conserved in the Enterobacteriaceae. It is a putative inner membrane protein, named YbjM, but the function is not known. 117
53787 402573 pfam11046 HycA_repressor Transcriptional repressor of hyc and hyp operons. This family is conserved in Proteobacteria. It is likely to be the transcriptional repressor molecule for the hyc and hyp operons, which express, amongst others, the protein HycA. This protein may be harnessed for the reduction of technetium oxide, an unwelcome product of radio-nucleotide bioaccumulation. HycA produces formate hydrogenlyase, one of the key proteins necessary for metal compound reduction. 140
53788 287960 pfam11047 SopD Salmonella outer protein D. SopD is a type III virulence effector protein whose structure consists of 38% alpha-helix and 26% beta-strand. 319
53789 287961 pfam11049 KSHV_K1 Glycoprotein K1 of Kaposi's sarcoma-associated herpes virus. This is a highly glycosylated cytoplasmic and membrane protein similar to the immunoglobulin receptor family that is expressed as an inducible early-lytic-cycle gene product in primary effusion lymphoma cell-lines. This domain would appear to be the cytoplasmic region of the protein. 71
53790 287962 pfam11050 Viral_env_E26 Virus envelope protein E26. E26 is a multifunctional protein. One form of E26 associates with viral DNA or DNA binding proteins, while a second form associates with intracellular membranes. 225
53791 402574 pfam11051 Mannosyl_trans3 Mannosyltransferase putative. This family is conserved in fungi. Several members are annotated as being alpha-1,3-mannosyltransferase but this could not be confirmed. 273
53792 337898 pfam11052 Tr-sialidase_C Trans-sialidase of Trypanosoma hydrophobic C-terminal. This is a highly conserved sequence motif that is the very C-terminus of a number of more diverse proteins from Trypanosoma cruzi. All members of the family are annotated putatively as being trans-sialidase but this appears to be a diverse group. 23
53793 287965 pfam11053 DNA_Packaging Terminase DNA packaging enzyme. Phage T4 terminase functions in packaging concatemeric DNA. The T4 terminase is composed of a large subunit, gp17, and a small subunit, gp16. The role of gp16 is not well characterized; however, it is known that it binds to double-stranded DNA but not single-stranded DNA. 157
53794 402575 pfam11054 Surface_antigen Sporozoite TA4 surface antigen. This family of proteins is a Eukaryotic family of surface antigens. One of the better characterized members of the family is the sporulated TA4 antigen. The TA4 gene encodes a single polypeptide of 25 kDa which contains a 17 and a 8 kD polypeptide. 209
53795 402576 pfam11055 Gsf2 Glucose signalling factor 2. Gsf2 is localized to the ER and functions to promote the secretion of certain hexose transporters. 371
53796 402577 pfam11056 UvsY Recombination, repair and ssDNA binding protein UvsY. UvsY protein enhances the rate of single-stranded-DNA-dependant ATP hydrolysis by UvsX protein. The enhancement of ATP hydrolysis by UvsY protein is shown to result from the ability of UvsY protein to increase the affinity of UvsX protein for single-stranded DNA. 128
53797 314086 pfam11057 Cortexin Cortexin of kidney. In the middle of cortexin protein there is a single membrane-spanning domain which indicates that this protein may be a membrane protein involved in intracellular or extracellular signalling of the kidney or brain, since it is expressed specifically in the kidneys and brain only. The protein is highly conserved among species. Cortexin is also thought to be important to neurons of both the developing and adult cerebral cortex. 73
53798 287970 pfam11058 Ral Antirestriction protein Ral. Ral alleviates restriction and enhances modification by the E. coli restriction and modification system. 66
53799 371359 pfam11059 DUF2860 Protein of unknown function (DUF2860). This bacterial family of proteins has no known function. 297
53800 402578 pfam11060 DUF2861 Protein of unknown function (DUF2861). This bacterial family of proteins has no known function. 267
53801 402579 pfam11061 DUF2862 Protein of unknown function (DUF2862). This family of proteins has no known function. 60
53802 402580 pfam11062 DUF2863 Protein of unknown function (DUF2863). This bacterial family of proteins have no known function. 398
53803 402581 pfam11064 DUF2865 Protein of unknown function (DUF2865). This bacterial family of proteins has no known function. 110
53804 402582 pfam11065 DUF2866 Protein of unknown function (DUF2866). This bacterial family of proteins have no known function. 64
53805 402583 pfam11066 DUF2867 Protein of unknown function (DUF2867). This bacterial family of proteins have no known function. 144
53806 402584 pfam11067 DUF2868 Protein of unknown function (DUF2868). Some members in this family of proteins with unknown function are annotated as putative membrane proteins. However, this cannot be confirmed. 309
53807 402585 pfam11068 YlqD YlqD protein. The structure of a representative of this family has been solved (Structure 4dci) and found to form a tetrameric structure of prefoldin-like architecture with the beta-barrel core and helical coiled coil tentacles. This suggests that this family may act as molecular chaperones. 131
53808 402586 pfam11069 DUF2870 Protein of unknown function (DUF2870). This is a eukaryotic family of proteins with unknown function. 95
53809 402587 pfam11070 DUF2871 Protein of unknown function (DUF2871). This family of proteins has no known function. 133
53810 402588 pfam11071 Nuc_deoxyri_tr3 Nucleoside 2-deoxyribosyltransferase YtoQ. 140
53811 402589 pfam11072 DUF2859 Protein of unknown function (DUF2859). This is a bacterial family of uncharacterized proteins. 145
53812 402590 pfam11073 NSs Rift valley fever virus non structural protein (NSs) like. This family contains several Phlebovirus non structural proteins which act as a major determinant of virulence by antagonising interferon beta gene expression. 242
53813 402591 pfam11074 DUF2779 Domain of unknown function(DUF2779). This domain is conserved in bacteria. The function is not known. 126
53814 402592 pfam11075 DUF2780 Protein of unknown function VcgC/VcgE (DUF2780). This is a bacterial family of uncharacterized proteins. 175
53815 402593 pfam11076 YbhQ Putative inner membrane protein YbhQ. This family is conserved in Proteobacteria. The function is not known but most members are annotated as being inner membrane protein YbhQ. 132
53816 371366 pfam11077 DUF2616 Protein of unknown function (DUF2616). This cysteine-rich family is expressed by the double-stranded Nucleopolyhedrovirus, a member of the Baculoviridae family of dsDNA viruses. The function is not known. 172
53817 337907 pfam11078 Optomotor-blind Optomotor-blind protein N-terminal region. This family is conserved in Drosophila spp. Optomotor-blind is one of the essential toolkit proteins for coordinating development in diverse animal taxa, and in Drosophila it plays a key role in establishing the abdominal pigmentation pattern, in development of the central nervous system and leg and wing imaginal disc-formation of Drosophila melanogaster. This is the N-terminal region of the protein and does not include the T-box-containing transcription factor that plays a part in DNA-binding. 79
53818 402594 pfam11079 YqhG Bacterial protein YqhG of unknown function. This family of putative proteins is conserved in the Bacillaceae family of the Firmicutes. The function is not known. 258
53819 402595 pfam11080 GhoS Endoribonuclease GhoS. GhoS is part of the GhoT-GhoS type V toxin-antitoxin (TA) system. GhoT is inhibited by antitoxin GhoS, which specifically cleaves its mRNA. 87
53820 314108 pfam11081 DUF2890 Protein of unknown function (DUF2890). This family is conserved in dsDNA adenoviruses of vertebrates. The function is not known. 168
53821 287992 pfam11082 DUF2880 Protein of unknown function (DUF2880). This bacterial family of proteins has no known function. 79
53822 314109 pfam11083 Streptin-Immun Lantibiotic streptin immunity protein. Streptococcal species produce a lantibiotic, streptin, in a similar manner to the production of nisin and subtilin by other lactic acid bacteria, in order to compete against competing bacteria within the environment. The immunity protein protects the bacterium from destruction by its own lantibiotic. In general, there is little homology between the immunity proteins of different genera of bacteria. 93
53823 402596 pfam11084 DUF2621 Protein of unknown function (DUF2621). This family is conserved in the Bacillaceae family. Several members are named as YneK. The function is not known. 139
53824 314111 pfam11085 YqhR Conserved membrane protein YqhR. This family is conserved in the Bacillaceae family of the Firmicutes. The function is not known. 165
53825 402597 pfam11086 DUF2878 Protein of unknown function (DUF2878). This bacterial family of proteins has no known function. Some members annotate the proteins as the permease component of a Mn2+/Zn2+ transport system however this cannot be confirmed. 150
53826 151532 pfam11087 PRD1_DD PRD1 phage membrane DNA delivery. This small family of phage proteins are bound in the viral membrane and assist, along with P11 and P18 in the delivery of DNA. 54
53827 287997 pfam11088 RL11D Glycoprotein encoding membrane proteins RL5A and RL6. RL5A and RL6 are part of the RL11 family which are predicted to encode membrane glycoproteins. Two adjacent open reading frames potentially encode a domain that is the hallmark of proteins encoded by the RL11 family. 99
53828 402598 pfam11089 SyrA Exopolysaccharide production repressor. SyrA is a small protein located in the cytoplasmic membrane that lacks an apparent DNA binding domain. SyrA mediates the transcriptional up-regulation of exo genes involved in the biosynthesis of the symbiotic exopolysaccharide succinoglycan. It does this through a mechanism which requires a two component system. 38
53829 151535 pfam11090 DUF2833 Protein of unknown function (DUF2833). This family of proteins with unknown function are found in the bacteriophage T7. Some of the members of this family are annotated as gene 13 protein. 86
53830 371369 pfam11091 T4_tail_cap Tail-tube assembly protein. This tail tube protein is also referred to as Gp48. It is required for the assembly and length regulation of the tail tube of bacteriophage T4. 348
53831 402599 pfam11092 Alveol-reg_P311 Neuronal protein 3.1 (p311). P311 has several PEST-like motifs and is found in neuron and muscle cells. P311 could have some function in myo-fibroblast transformation and prevention of fibrosis. It has also been identified as a potential regulator of alveolar generation. 66
53832 402600 pfam11093 Mitochondr_Som1 Mitochondrial export protein Som1. Som1 is a component of the mitochondrial protein export system. The various Som1 proteins exhibit a highly conserved region and a pattern of cysteine residues. Stabilisation of Som1 occurs through an interaction between Som1 and Imp1, a peptidase required for proteolytic processing of certain proteins during their transport across the mitochondrial membrane. This suggests that Som1 represents a third subunit of the Imp1 peptidase complex 81
53833 288002 pfam11094 UL11 Membrane-associated tegument protein. The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated which is crucial for lipid raft association. 39
53834 402601 pfam11095 Gemin7 Gem-associated protein 7 (Gemin7). Gemin7 is a novel component of the survival of motor neuron complex which functions in the assembly of spliceosomal small nuclear ribonucleoproteins. Gemin7 interacts with several Sm proteins of spliceosomal small nuclear ribonucleoproteins, especially SmE. 76
53835 288004 pfam11097 DUF2883 Protein of unknown function (DUF2883). This family of proteins have no known function but appear to be restricted to phage. 75
53836 371373 pfam11098 Chlorosome_CsmC Chlorosome envelope protein C. Chlorosomes are light-harvesting antennae found in green bacteria. CsmC is one of the proteins that exists in the chlorosome envelope. CsmC has been shown to exist as a homomultimer with CsmD in the chlorosome envelope. CsmC is thought to be important in chlorosome elongation and shape. 139
53837 402602 pfam11099 M11L Apoptosis regulator M11L like. Apoptosis regulators function to modulate the apoptotic cascades and thereby favour productive viral replication. M11L inhibits mitochondrial-dependant apoptosis by mimicking and competing with host proteins for the binding and blocking of Bak and Bax, two executioner proteins. 141
53838 314118 pfam11100 TrbE Conjugal transfer protein TrbE. TrbE is essential for conjugation and phage adsorption. It contains four common motifs and one conserved domain. 66
53839 402603 pfam11101 DUF2884 Protein of unknown function (DUF2884). Some members in this bacterial family of proteins are annotated as YggN which currently has no known function. 228
53840 402604 pfam11102 YjbF Group 4 capsule polysaccharide lipoprotein gfcB, YjbF. This family includes lipoprotein GfcB (YmcC), involved in group 4 capsule polysaccharide formation. YjbF is a family of Gram-negative bacterial outer-membrane lipoproteins, predicted to be a beta-barrel and possibly a porin that is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YbjF and H are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production. 189
53841 371375 pfam11103 DUF2887 Protein of unknown function (DUF2887). This bacterial family of proteins has no known function. These proteins may be distantly related to the PD(D/E)XK superfamily. 200
53842 402605 pfam11104 PilM_2 Type IV pilus assembly protein PilM. The type IV pilus assembly protein PilM is required for competency and pilus biogenesis. It binds to PilN and ATP. 340
53843 314123 pfam11105 CCAP Arthropod cardioacceleratory peptide 2a. CCAP exerts a reversible and dose-dependant cardio-stimulatory effect on the semi-isolated heart of experimental beetles. CCAP also increases free hemolymph sugar concentration in young larvae and adults of the meal-worm beetle. 128
53844 402606 pfam11106 YjbE Exopolysaccharide production protein YjbE. YjbE is part of a four gene operon which is involved in exopolysaccharide production. The expression of YjbE is higher than the rest of the operon yjbEFGH. It appears to be restricted to Enterobacteriaceae. YbjE is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YbjH and F are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production. 79
53845 402607 pfam11107 FANCF Fanconi anemia group F protein (FANCF). FANCF regulates its own expression by methylation at both mRNA and protein levels. Methylation-induced inactivation of FANCF has an important role on the occurrence of ovarian cancers by disrupting the FA-BRCA pathway. 345
53846 371377 pfam11108 Phage_glycop_gL Viral glycoprotein L. GL forms a complex with gH, a glycoprotein known to be essential for entry of HSV-1 into cells and virus-induced cell fusion. It is a hetero-oligomer of gH and gL which is incorporated into virions and transported to the cell surface which acts during entry of virus into cells 98
53847 402608 pfam11109 RFamide_26RFa Orexigenic neuropeptide Qrfp/P518. Qrfp/P518 has a direct role in maintaining bone mineral density. Qrfp has also been found to be important in energy homeostasis by regulating appetite and energy expenditure in mice. The C-terminal 28 residues are the functional 26RFa. 131
53848 314127 pfam11110 Phage_hub_GP28 Baseplate hub distal subunit. These baseplate proteins are also referred to as Gp28. Gp28 is the structural component of the central part of the bacteriophage T4 baseplate, which possesses a hydrophobic region and is membrane bound. Gp28 forms a complex with gp27 which is another structural component of the baseplate. 154
53849 402609 pfam11111 CENP-M Centromere protein M (CENP-M). The prime candidate for specifying centromere identity is the array of nucleosomes assembled with CENP-A. CENP-A recruits a nucleosome associated complex (NAC) comprised of CENP-M along with two other proteins. Assembly of the CENP-A NAC at centromeres is partly dependent on CENP-M. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival. 171
53850 402610 pfam11112 PyocinActivator Pyocin activator protein PrtN. PrtN is a transcriptional activator for pyocin synthesis genes. It activates the expression of various pyocin genes by interaction with the DNA sequences conserved in the 5' noncoding regions of the pyocin genes. 74
53851 371381 pfam11113 Phage_head_chap Head assembly gene product. This head assembly protein is also referred to as gene product 40 (Gp40). A specific gp20-gp40 membrane insertion structure constitutes the T4 prohead assembly initiation complex. This protein in T4 stimulates head formation. 56
53852 402611 pfam11114 Minor_capsid_2 Minor capsid protein. Most of the members of this family are annotated as being minor capsid proteins. The genomes carrying the genes usually have three similar proteins adjacent to each other, hence this one being named as No.2. 113
53853 371383 pfam11115 DUF2623 Protein of unknown function (DUF2623). This family is conserved in the Enterobacteriaceae family. Several members are named as YghW. The function is not known. 93
53854 402612 pfam11116 DUF2624 Protein of unknown function (DUF2624). This family is conserved in the Bacillaceae family. Several members are named as YqfT. The function is not known. 83
53855 378559 pfam11117 DUF2626 Protein of unknown function (DUF2626). This family is conserved in the Bacillaceae family. Several members are named as YqgY. The function is not known. 73
53856 378560 pfam11118 DUF2627 Protein of unknown function (DUF2627). This family is conserved in the Bacillaceae family. Several members are named as YqzF. The function is not known. 72
53857 402613 pfam11119 DUF2633 Protein of unknown function (DUF2633). This family is conserved largely in the Bacillaceae family. Several members are named as YfgG. The function is not known. 54
53858 402614 pfam11120 CBP_BcsF Cellulose biosynthesis protein BcsF. CBP_BcsF is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 56
53859 288026 pfam11121 DUF2639 Protein of unknown function (DUF2639). This family is conserved in the Bacillaceae family. Several members are named as being YflJ, but the function is not known. 37
53860 371387 pfam11122 Spore-coat_CotD Inner spore coat protein D. This family is conserved in the Enterobacteriaceae family. CotD is an inner spore coat protein that is expressed in the middle phase of mother cell gene expression. Along with CotD, CotH, CotS and CotT it is assumed to assemble into the loose skeleton of the matrix, between the shells of SpoIVA and CotE. Coat proteins do not share much sequence similarity between species, but this does not imply they do not share secondary, tertiary, or quaternary features. 86
53861 402615 pfam11123 DNA_Packaging_2 DNA packaging protein. This DNA packaging protein is also referred to as gene 18 product (gp18). This protein is required for DNA packaging and functions in a complex with gp19. 82
53862 402616 pfam11124 Pho86 Inorganic phosphate transporter Pho86. Pho86p is an ER protein which is produced in response to phosphate starvation. It is essential for growth when phosphate levels are limiting. Pho86p is also involved in the regulation of Pho84p, a high-affinity phosphate transporter which is localized to the endoplasmic reticulum (ER) in low phosphate medium. When the level of phosphate increases Pho84p is transported to the vacuole. Pho86p is required for packaging of Pho84p in to COPII vesicles. 284
53863 288030 pfam11125 DUF2830 Protein of unknown function (DUF2830). Several members in this viral family of proteins are annotated as lysis proteins. 54
53864 288031 pfam11126 Phage_DsbA Transcriptional regulator DsbA. DsbA is a double stranded binding protein found in bacteriophage T4 which is involved in transcriptional regulation. DsbA, along with other viral proteins, interacts with the host RNA polymerase core enzyme enabling initiation of transcription. DsbA acts as an enhancer protein of late genes in vitro. The protein consists of mainly alpha helices. 67
53865 402617 pfam11127 DUF2892 Protein of unknown function (DUF2892). This family is conserved in bacteria. The function is not known. 66
53866 371390 pfam11128 Nucleocap_ssRNA Plant viral coat protein nucleocapsid. This family of nucleocapsid proteins is from ssRNA negative-strand viruses of plant origin. 179
53867 151573 pfam11129 EIAV_Rev Rev protein of equine infectious anaemia virus. The sequence of this family is highly conserved and carries a nuclear export signal from residues 31-55, and RNA binding/nuclear localization signals of RRDR at residue 76 and KRRRK at residue 159. Rev is an essential regulatory protein required for nucleocytoplasmic transport of incompletely spliced viral mRNAs that encode structural proteins. Rev has been shown to down-regulate the expression of viral late genes and alter sensitivity to Gag-specific cytotoxic-T-lymphocytes (CTL). Equine infectious anaemia virus (EIAV) exhibits a high rate of genetic variation in vivo, and results in a clinically variable disease in infected horses. 134
53868 402618 pfam11130 TraC_F_IV F pilus assembly Type-IV secretion system for plasmid transfer. This family of TraC proteins is conserved in Proteobacteria. TraC is a cytoplasmic, peripheral membrane protein and is one of the proteins encoded by the F transfer region of the conjugative plasmid that is required for the assembly of F pilin into the mature F pilus structure. F pili are filamentous appendages that help establish the physical contact between donor and recipient cells involved in the conjugation process. 231
53869 288035 pfam11131 PhrC_PhrF Rap-phr extracellular signalling. PhrC and PhrF stimulate ComA-dependent gene expression to different levels and are both required for full expression of genes activated by ComA, which activates the expression of genes involved in competence development and the production of several secreted products. 38
53870 371391 pfam11132 SplA Transcriptional regulator protein (SplA). The SplA protein functions in trans as a negative regulator of the level of splB-lacZ expression in the developing forespore. 73
53871 256308 pfam11133 Phage_head_fibr Head fiber protein. This head fiber protein is also referred to as Gp8.5. Gp8.5 is a structural protein in phage. It is a dispensable head protein. 277
53872 288037 pfam11134 Phage_stabilize Phage stabilisation protein. Members of this family are phage proteins that are probably involved with stabilizing the condensed DNA within the capsid. 469
53873 288038 pfam11135 DUF2888 Protein of unknown function (DUF2888). Some members in this family of proteins with unknown function are annotated as immediate early protein ICP-18 however this cannot be confirmed. 144
53874 402619 pfam11136 DUF2889 Protein of unknown function (DUF2889). This bacterial family of proteins has no known function. 123
53875 402620 pfam11137 DUF2909 Protein of unknown function (DUF2909). This is a family of proteins conserved in Proteobacteria of unknown function. 60
53876 402621 pfam11138 DUF2911 Protein of unknown function (DUF2911). This bacterial family of proteins has no known function. 141
53877 378566 pfam11139 SfLAP Sap, sulfolipid-1-addressing protein. SAP is a transmembrane transport protein with six predicted transmembrane helices, with a hydrophilic domain between helices 3 and 4. This hydrophobic region is highly variable among identified Gap-like (GPL, peptidoglycolipid, addressing protein) proteins and may be involved in substrate recognition. SAP also belongs to the LysE protein superfamily (pfam01810), whose members have been implicated in small molecule transport in bacteria. Other Gap proteins export metabolites across the cell membrane so it is possible that Sap specifically may be involved in transport of sulfolipid-1 across the membrane. 213
53878 402622 pfam11140 DUF2913 Protein of unknown function (DUF2913). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 207
53879 402623 pfam11141 DUF2914 Protein of unknown function (DUF2914). This bacterial family of proteins has no known function. 62
53880 402624 pfam11142 DUF2917 Protein of unknown function (DUF2917). This bacterial family of proteins appears to be restricted to Proteobacteria. 59
53881 402625 pfam11143 DUF2919 Protein of unknown function (DUF2919). This bacterial family of proteins has no known function. Some members are annotated as YfeZ however this cannot be confirmed. 146
53882 314148 pfam11144 DUF2920 Protein of unknown function (DUF2920). This bacterial family of proteins has no known function. 394
53883 402626 pfam11145 DUF2921 Protein of unknown function (DUF2921). This eukaryotic family of proteins has no known function. 891
53884 402627 pfam11146 DUF2905 Protein of unknown function (DUF2905). This is a conserved family of bacterial proteins of unknown function. 64
53885 402628 pfam11148 DUF2922 Protein of unknown function (DUF2922). This bacterial family of proteins has no known function. 63
53886 402629 pfam11149 DUF2924 Protein of unknown function (DUF2924). This bacterial family of proteins has no known function. 134
53887 402630 pfam11150 DUF2927 Protein of unknown function (DUF2927). This family is conserved in Proteobacteria. Several members are described as being putative lipoproteins, but otherwise the function is not known. 204
53888 402631 pfam11151 DUF2929 Protein of unknown function (DUF2929). This family of proteins with unknown function appears to be restricted to Firmicutes. 56
53889 402632 pfam11152 CCB2_CCB4 Cofactor assembly of complex C subunit B, CCB2/CCB4. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins: CCB1-4. CCB2 and CCB4 are paralogues derived from a unique cyanobacterial ancestor. Orthologues are conserved in higher plants. 192
53890 402633 pfam11153 DUF2931 Protein of unknown function (DUF2931). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently, there is no known function. 189
53891 402634 pfam11154 DUF2934 Protein of unknown function (DUF2934). This bacterial family of proteins has no known function. 37
53892 402635 pfam11155 DUF2935 Domain of unknown function (DUF2935). This family of proteins with unknown function appears to be restricted to Firmicutes. The structure of this protein has been solved and each domain is composed of four alpha helices. A metal cluster composed of iron and magnesium lies between the two domains. 121
53893 402636 pfam11157 DUF2937 Protein of unknown function (DUF2937). This family of proteins with unknown function appears to be restricted to Proteobacteria. 160
53894 402637 pfam11158 DUF2938 Protein of unknown function (DUF2938). This bacterial family of proteins has no known function. Some members are thought to be membrane proteins however this cannot be confirmed. 150
53895 402638 pfam11159 DUF2939 Protein of unknown function (DUF2939). This bacterial family of proteins has no known function. 92
53896 402639 pfam11160 DUF2945 Protein of unknown function (DUF2945). This family of proteins has no known function. 59
53897 402640 pfam11161 DUF2944 Protein of unknown function (DUF2946). This family of proteins with unknown function appear to be restricted to Proteobacteria. 183
53898 402641 pfam11162 DUF2946 Protein of unknown function (DUF2946). This family of proteins has no known function. 116
53899 314165 pfam11163 DUF2947 Protein of unknown function (DUF2947). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 151
53900 402642 pfam11164 DUF2948 Protein of unknown function (DUF2948). This family of proteins with unknown function appears to be restricted to Proteobacteria. 137
53901 402643 pfam11165 DUF2949 Protein of unknown function (DUF2949). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 56
53902 288067 pfam11166 DUF2951 Protein of unknown function (DUF2951). This family of proteins has no known function. It has a highly conserved sequence. 98
53903 378578 pfam11167 DUF2953 Protein of unknown function (DUF2953). This family of proteins has no known function. 53
53904 402644 pfam11168 DUF2955 Protein of unknown function (DUF2955). Some members in this family of proteins with unknown function are annotated as membrane proteins. However, this cannot be confirmed. 140
53905 402645 pfam11169 DUF2956 Protein of unknown function (DUF2956). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 101
53906 402646 pfam11170 DUF2957 Protein of unknown function (DUF2957). Some members are annotated as putative lipoproteins however this cannot be confirmed. Currently no function is known for this family of proteins. 298
53907 402647 pfam11171 DUF2958 Protein of unknown function (DUF2958). Some members are annotated as lipoproteins however this cannot be confirmed. This family of proteins has no known function. 111
53908 402648 pfam11172 DUF2959 Protein of unknown function (DUF2959). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 190
53909 402649 pfam11173 DUF2960 Protein of unknown function (DUF2960). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 79
53910 402650 pfam11174 DUF2970 Protein of unknown function (DUF2970). This short family is conserved in Proteobacteria. The function is not known. 56
53911 402651 pfam11175 DUF2961 Protein of unknown function (DUF2961). This family of proteins has no known function. 234
53912 402652 pfam11176 Tma16 Translation machinery-associated protein 16. Proteins in this family localize to the nucleus. Their function is not clear. 146
53913 402653 pfam11177 DUF2964 Protein of unknown function (DUF2964). This family of proteins with unknown function appears to be restricted to Proteobacteria. 62
53914 314179 pfam11178 DUF2963 Protein of unknown function (DUF2963). This family of proteins with unknown function appears to be restricted to Mollicutes. 51
53915 402654 pfam11179 DUF2967 Protein of unknown function (DUF2967). This family of proteins with unknown function appears to be restricted to Drosophila. 954
53916 402655 pfam11180 DUF2968 Protein of unknown function (DUF2968). This family of proteins has no known function. 180
53917 402656 pfam11181 YflT Heat induced stress protein YflT. YflT is a heat induced protein. 100
53918 402657 pfam11182 AlgF Alginate O-acetyl transferase AlgF. AlgF is essential for the addition of O-acetyl groups to alginate, an extracellular polysaccharide. The presence of O-acetyl groups plays an important role in the ability of the polymer to act as a virulence factor. 164
53919 402658 pfam11183 PmrD Polymyxin resistance protein PmrD. PmrB forms a two-component system (TCS) with PmrA that allows Gram-negative bacteria to survive the cationic antimicrobial peptide polymyxin G. The TCS is linked to another one via the polymyxin resistance protein PmrD. PmrD is the first protein identified to mediate the connectivity between the two TCSs. It binds to the N terminal domain of the PmrA response regulator which prevents its dephosphorylation, thereby promoting the transcription of genes involved in polymyxin resistance. 81
53920 402659 pfam11184 DUF2969 Protein of unknown function (DUF2969). This family of proteins with unknown function appears to be restricted to Lactobacillales. 71
53921 402660 pfam11185 DUF2971 Protein of unknown function (DUF2971). This bacterial family of proteins has no known function. 89
53922 288086 pfam11186 DUF2972 Protein of unknown function (DUF2972). Some members in this family of proteins with unknown function are annotated as sugar transferase proteins, however this cannot be confirmed. 198
53923 402661 pfam11187 DUF2974 Protein of unknown function (DUF2974). This bacterial family of proteins has no known function. 224
53924 402662 pfam11188 DUF2975 Protein of unknown function (DUF2975). This family of bacterial proteins has no known function. These proteins are likely to be integral membrane proteins. The proteins contain a highly conserved glutamic acid close to their C-terminus. 130
53925 402663 pfam11189 DUF2973 Protein of unknown function (DUF2973). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently they have no known function. 68
53926 402664 pfam11190 DUF2976 Protein of unknown function (DUF2976). This family of proteins has no known function. Some members are annotated as membrane proteins however this cannot be confirmed. 87
53927 402665 pfam11191 DUF2782 Protein of unknown function (DUF2782). This is a bacterial family of proteins whose function is unknown. 88
53928 288092 pfam11192 DUF2977 Protein of unknown function (DUF2977). This family of proteins has no known function. 61
53929 402666 pfam11193 DUF2812 Protein of unknown function (DUF2812). This is a bacterial family of uncharacterized proteins, however some members of this family are annotated as membrane proteins. 108
53930 402667 pfam11195 DUF2829 Protein of unknown function (DUF2829). This is an uncharacterized family of proteins found in bacteria and bacteriophages. 71
53931 402668 pfam11196 DUF2834 Protein of unknown function (DUF2834). This is a bacterial family of uncharacterized proteins. 95
53932 402669 pfam11197 DUF2835 Protein of unknown function (DUF2835). This is a bacterial family of uncharacterized proteins. One member of this family is annotated as the A subunit of Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV). 71
53933 402670 pfam11198 DUF2857 Protein of unknown function (DUF2857). This is a bacterial family of uncharacterized proteins. 174
53934 402671 pfam11199 DUF2891 Protein of unknown function (DUF2891). This is a bacterial family of uncharacterized proteins. 323
53935 402672 pfam11200 DUF2981 Protein of unknown function (DUF2981). This eukaryotic family of proteins has no known function. 334
53936 402673 pfam11201 DUF2982 Protein of unknown function (DUF2982). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 151
53937 402674 pfam11202 PRTase_1 Phosphoribosyl transferase (PRTase). This PRTase family is fused to a C-terminal RNA binding Pelota domain, pfam01248. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 246
53938 402675 pfam11203 EccE Putative type VII ESX secretion system translocon, EccE. EccE is a family of largely Gram-positive bacterial transmembrane components of the type VII secretion system characterized in Mycobacterium tuberculosis, systems ESX1-5. Translocation of virulent peptides through the membranes is thought to be mediated via a complex that includes EccB, EccC, EccD, EccE, and MycP. EccB, EccC, EccD, and EccE form a stable complex in the mycobacterial cell envelope. 97
53939 402676 pfam11204 DUF2985 Protein of unknown function (DUF2985). This eukaryotic family of proteins has no known function. 78
53940 402677 pfam11205 DUF2987 Protein of unknown function (DUF2987). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 145
53941 371419 pfam11207 DUF2989 Protein of unknown function (DUF2989). Some members in this bacterial family of proteins are annotated as lipoproteins however this cannot be confirmed. 201
53942 402678 pfam11208 DUF2992 Protein of unknown function (DUF2992). This bacterial family of proteins has no known function. However, the cis-regulatory yjdF motif, just upstream from the gene encoding the proteins for this family, is a small non-coding RNA, Rfam:RF01764. The yjdF motif is found in many Firmicutes, including Bacillus subtilis. In most cases, it resides in potential 5' UTRs of homologs of the yjdF gene whose function is unknown. However, in Streptococcus thermophilus, a yjdF RNA motif is associated with an operon whose protein products synthesize nicotinamide adenine dinucleotide (NAD+). Also, the S. thermophilus yjdF RNA lacks typical yjdF motif consensus features downstream of and including the P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the S. thermophilus RNAs might sense a distinct compound that structurally resembles the ligand bound by other yjdF RNAs. On the other hand, perhaps these RNAs have an alternative solution forming a similar binding site, as is observed with some SAM riboswitches. 132
53943 402679 pfam11209 DUF2993 Protein of unknown function (DUF2993). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 216
53944 402680 pfam11210 DUF2996 Protein of unknown function (DUF2996). This family of proteins has no known function. 121
53945 402681 pfam11211 DUF2997 Protein of unknown function (DUF2997). This family of proteins has no known function. 47
53946 288110 pfam11212 DUF2999 Protein of unknown function (DUF2999). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 82
53947 402682 pfam11213 DUF3006 Protein of unknown function (DUF3006). This family of proteins has no known function. 67
53948 402683 pfam11214 Med2 Mediator complex subunit 2. This family of mediator complex subunit 2 proteins is conserved in fungi. Cyclin-dependent kinase CDK8 or Srb10 interacts with and phosphorylates Med2. Post-translational modifications of Mediator subunits are important for regulation of gene expression. 100
53949 402684 pfam11215 DUF3010 Protein of unknown function (DUF3010). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 137
53950 402685 pfam11216 DUF3012 Protein of unknown function (DUF3012). This family of proteins with unknown function is restricted to Gammaproteobacteria. 32
53951 402686 pfam11217 DUF3013 Protein of unknown function (DUF3013). This bacterial family of proteins with unknown function appears to be restricted to Firmicutes. 159
53952 402687 pfam11218 DUF3011 Protein of unknown function (DUF3011). This bacterial family of proteins has no known function. Most members belong to Proteobacteria. 197
53953 378600 pfam11219 DUF3014 Protein of unknown function (DUF3014). This family of proteins with unknown function appears to be restricted to Proteobacteria. 156
53954 402688 pfam11220 DUF3015 Protein of unknown function (DUF3015). This bacterial family of proteins has no known function. 137
53955 402689 pfam11221 Med21 Subunit 21 of Mediator complex. Med21 has been known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in Drosophila. The heterodimer of the two subunits Med7 and Med21 appears to act as a hinge between the middle and the tail regions of Mediator. 140
53956 402690 pfam11222 DUF3017 Protein of unknown function (DUF3017). This bacterial family of proteins with unknown function appears to be restricted to Actinobacteria. 74
53957 402691 pfam11223 DUF3020 Protein of unknown function (DUF3020). This family of fungal proteins is conserved towards the C-terminus of HMG domains. The function is not known. 49
53958 288122 pfam11224 DUF3023 Protein of unknown function (DUF3023). This bacterial family of proteins with unknown function appears to be restricted to Alphaproteobacteria. 130
53959 402692 pfam11225 DUF3024 Protein of unknown function (DUF3024). This family of proteins has no known function. 56
53960 402693 pfam11226 DUF3022 Protein of unknown function (DUF3022). This family of proteins with unknown function appears to be restricted to Proteobacteria. 103
53961 402694 pfam11227 DUF3025 Protein of unknown function (DUF3025). Some members in this bacterial family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently this family of proteins has no known function. 210
53962 402695 pfam11228 DUF3027 Protein of unknown function (DUF3027). This family of proteins with unknown function appears to be restricted to Actinobacteria. 193
53963 402696 pfam11229 Focadhesin Focadhesin. Focadhesin (FOCAD) is a focal adhesion protein with potential tumor suppressor function in gliomas. 589
53964 402697 pfam11230 DUF3029 Protein of unknown function (DUF3029). Some members in this family of proteins are annotated as ykkI. Currently no function is known. 485
53965 402698 pfam11231 DUF3034 Protein of unknown function (DUF3034). This family of proteins with unknown function appears to be restricted to Proteobacteria. 256
53966 402699 pfam11232 Med25 Mediator complex subunit 25 PTOV activation and synapsin 2. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This family is the combined PTOV and SD2 domains, the PTOV domain being the domain through which Med25 co-operates with the histone acetyltransferase CBP, but the function of the SD2 domain is unclear. 147
53967 402700 pfam11233 DUF3035 Protein of unknown function (DUF3035). This family of proteins with unknown function appears to be restricted to Alphaproteobacteria. 140
53968 402701 pfam11235 Med25_SD1 Mediator complex subunit 25 synapsin 1. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA, domain, this SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. The function of the SD domains is unclear. 157
53969 402702 pfam11236 DUF3037 Protein of unknown function (DUF3037). This bacterial family of proteins has no known function. 118
53970 402703 pfam11237 DUF3038 Protein of unknown function (DUF3038). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 169
53971 402704 pfam11238 DUF3039 Protein of unknown function (DUF3039). This family of proteins with unknown function appears to be restricted to Actinobacteria. 56
53972 402705 pfam11239 DUF3040 Protein of unknown function (DUF3040). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed. 82
53973 402706 pfam11240 DUF3042 Protein of unknown function (DUF3042). This family of proteins with unknown function appears to be restricted to Firmicutes. 54
53974 402707 pfam11241 DUF3043 Protein of unknown function (DUF3043). Some members in this family of proteins with unknown function are annotated as membrane proteins. This cannot be confirmed. 171
53975 402708 pfam11242 DUF2774 Protein of unknown function (DUF2774). This is a viral family of proteins with unknown function. 63
53976 288140 pfam11243 DUF3045 Protein of unknown function (DUF3045). Members in this family of proteins are annotated as gene protein 30.1. Currently no function is known. 88
53977 402709 pfam11244 Med25_NR-box Mediator complex subunit 25 C-terminal NR box-containing. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA, domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and this C-terminal NR box-containing domain (646-650) from 646-747. The NR box of MED25 is critical for its recruitment to the promoter, probably through an interaction with pre-bound RAR. 90
53978 288142 pfam11245 DUF2544 Protein of unknown function (DUF2544). This is a bacterial family of proteins with unknown function. 246
53979 402710 pfam11246 Phage_gp53 Base plate wedge protein 53. The baseplate of bacteriophage T4 controls host cell recognition, attachment, tail sheath contraction and viral DNA ejection. The structure of the baseplate suggests a mechanism of baseplate structural transition during the initial stages of T4 infection. The baseplate is assembled from six identical wedges that surround the central hub. Gp53, along with other T4 gene products, combine sequentially to assemble a wedge. 200
53980 314235 pfam11247 DUF2675 Protein of unknown function (DUF2675). Members in this family of proteins are annotated as Gene protein 5.5. Currently no function is known. 99
53981 402711 pfam11248 DUF3046 Protein of unknown function (DUF3046). This family of proteins with unknown function appears to be restricted to Actinobacteria. 62
53982 402712 pfam11249 DUF3047 Protein of unknown function (DUF3047). This bacterial family of proteins has no known function. 198
53983 402713 pfam11250 FAF Fantastic Four meristem regulator. FAF is a family of plant proteins that regulate the size of the shoot meristem by modulating the CLV3-WUS feedback loop. The proteins are expressed in the centre of the shoot meristem, overlapping with the site of WUS - the homeodomain transcription factor WUSCHEL- expression. FAF proteins are capable of modulating shoot growth by repressing WUS in the organising centre of the shoot meristem. The ability of plants to form new organs throughout their life cycle requires tight control of the meristems to avoid unregulated growth. Plants have evolved an elaborate genetic network that controls meristem size and maintenance. WUS and the CLAVATA (CLV) ligand-receptor system are at the core of the network that regulates the size of the stem cell population in the shoot meristem. 54
53984 402714 pfam11251 DUF3050 Protein of unknown function (DUF3050). This bacterial family of proteins has no known function. 232
53985 151694 pfam11252 DUF3051 Protein of unknown function (DUF3051). This viral family of proteins has no known function. 189
53986 402715 pfam11253 DUF3052 Protein of unknown function (DUF3052). This family of proteins with unknown function appears to be restricted to Actinobacteria. 123
53987 402716 pfam11254 DUF3053 Protein of unknown function (DUF3053). Some members in this family of proteins are annotated as the membrane protein YiaF. No function is currently known. 219
53988 402717 pfam11255 DUF3054 Protein of unknown function (DUF3054). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 110
53989 402718 pfam11256 DUF3055 Protein of unknown function (DUF3055). This family of proteins with unknown function appears to be restricted to Firmicutes. 80
53990 402719 pfam11258 DUF3048 Protein of unknown function (DUF3048) N-terminal domain. Some members in this bacterial family of proteins are annotated as YerB. However currently no function is known. This entry represents the N-terminal domain. 143
53991 402720 pfam11259 DUF3060 Protein of unknown function (DUF3060). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. 58
53992 371436 pfam11260 Spidroin_MaSp Major ampullate spidroin 1 and 2. Dragline silk is composed of two proteins, major ampullate spidroin 1 (MaSp1) and major ampullate spidroin 2 (MaSp2). MaSp1 contains five alpha-helices. Only the C-terminus of the proteins are shown. 85
53993 402721 pfam11261 IRF-2BP1_2 Interferon regulatory factor 2-binding protein zinc finger. IRF-2BP1 and IRF-2BP2 are nuclear transcriptional repressor proteins and can inhibit both enhancer-activated and basal transcription. They both contain N-terminal zinc finger represented in this family and C-terminal RING finger domains. 52
53994 402722 pfam11262 Tho2 Transcription factor/nuclear export subunit protein 2. THO and TREX form a eukaryotic complex which functions in messenger ribonucleoprotein metabolism and plays a role in preventing the transcription-associated genetic instability. Tho2, along with four other subunits, forms THO. 304
53995 402723 pfam11263 Attachment_P66 Borrelia burgdorferi attachment protein P66. P66 is an outer membrane protein in Borrelia burgdorferi, the agent of Lyme disease. P66 has a role in the attachment of Borrelia burgdorferi to human cell-surface receptors. 253
53996 402724 pfam11264 ThylakoidFormat Thylakoid formation protein. THF1 is localized to the outer plastid membrane and the stroma. THF1 has a role in sugar signalling. THF1 is also thought to have a role in chloroplast and leaf development. THF1 has been shown to play a crucial role in vesicle-mediated thylakoid membrane biogenesis. 216
53997 402725 pfam11265 Med25_VWA Mediator complex subunit 25 von Willebrand factor type A. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain which is this one, an SD1 domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This VWA or von Willebrand factor type A domain when bound to RAR and the histone acetyltransferase CBP is responsible for recruiting Med1 to the rest of the Mediator complex. 213
53998 402726 pfam11266 Ald_deCOase Long-chain fatty aldehyde decarbonylase. This cyanobacterial family of fatty aldehyde decarbonylases acts on mainly C16 and C18 substrates to form hydrocarbons and carbon monoxide. Note that the corresponding EC number (EC:4.1.99.5) dating from 1989 refers to a nonorthologous Pisum sativum enzyme that acts on C18 and longer chains and attaches the overly narrow name octadecanal decarbonylase. 215
53999 402727 pfam11267 DUF3067 Domain of unknown function (DUF3067). This family of proteins found in plants and cyanobacteria has no known function. The structure of this domain has been solved by NMR for the alr2454 protein. The structure was determined to be a novel fold composed of four alpha helices and a sheet of three anti-parallel beta-strands. 90
54000 402728 pfam11268 DUF3071 Protein of unknown function (DUF3071). Some members in this family of proteins are annotated as DNA-binding proteins however this cannot be confirmed. Currently no function is known. 165
54001 314254 pfam11269 DUF3069 Protein of unknown function (DUF3069). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 118
54002 288164 pfam11270 DUF3070 Protein of unknown function (DUF3070). This eukaryotic family of proteins has no known function. 23
54003 402729 pfam11271 DUF3068 Protein of unknown function (DUF3068). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed. 297
54004 402730 pfam11272 DUF3072 Protein of unknown function (DUF3072). This bacterial family of proteins has no known function. 56
54005 402731 pfam11273 DUF3073 Protein of unknown function (DUF3073). This family of proteins with unknown function appears to be restricted to Actinobacteria. 63
54006 402732 pfam11274 DUF3074 Protein of unknown function (DUF3074). This eukaryotic family of proteins has no known function but appears to be part of the START superfamily. 181
54007 371447 pfam11275 DUF3077 Protein of unknown function (DUF3077). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 77
54008 402733 pfam11276 DUF3078 Protein of unknown function (DUF3078). This bacterial family of proteins has no known function. 90
54009 402734 pfam11277 Med24_N Mediator complex subunit 24 N-terminal. This subunit of the Mediator complex appears to be conserved only from insects to humans. It is essential for correct retinal development in fish. Subunit composition of the mediator contributes to the control of differentiation in the vertebrate CNS as there are divergent functions of the mediator subunits Crsp34/Med27, Trap100/Med24, and Crsp150/Med14. 994
54010 402735 pfam11278 DUF3079 Protein of unknown function (DUF3079). This family of proteins with unknown function appears to be restricted to Proteobacteria. 50
54011 402736 pfam11279 DUF3080 Protein of unknown function (DUF3080). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently this family has no known function. 315
54012 337968 pfam11280 DUF3081 Protein of unknown function (DUF3081). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 77
54013 314264 pfam11281 DUF3083 Protein of unknown function (DUF3083). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 315
54014 402737 pfam11282 DUF3082 Protein of unknown function (DUF3082). This family of proteins has no known function. 80
54015 402738 pfam11283 DUF3084 Protein of unknown function (DUF3084). This bacterial family of proteins has no known function. 77
54016 402739 pfam11284 DUF3085 Protein of unknown function (DUF3085). This family of proteins with unknown function appears to be restricted to Proteobacteria. 89
54017 402740 pfam11285 DUF3086 Protein of unknown function (DUF3086). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 275
54018 402741 pfam11286 DUF3087 Protein of unknown function (DUF3087). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 165
54019 402742 pfam11287 DUF3088 Protein of unknown function (DUF3088). This family of proteins with unknown function appears to be restricted to Proteobacteria. 112
54020 402743 pfam11288 DUF3089 Protein of unknown function (DUF3089). This family of proteins has no known function but appears to have an alpha/beta hydrolase domain and so is likely to be enzymatic. 200
54021 402744 pfam11289 APA3_viroporin Coronavirus accessory protein 3a. APA3_viroporin is a pro-apoptosis-inducing protein. It localizes to the endoplasmic reticulum (ER)-Golgi compartment. The Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) causes apoptosis of infected cells, and this is one of the culprits. Multi-pass membrane protein that forms a homotetrameric potassium-sensitive ion channel called a viroporin whose activity causes ER-stress to the host cell. 273
54022 402745 pfam11290 DUF3090 Protein of unknown function (DUF3090). This family of proteins with unknown function appears to be restricted to Actinobacteria. 171
54023 151732 pfam11291 DUF3091 Protein of unknown function (DUF3091). This eukaryotic family of proteins has no known function. 100
54024 402746 pfam11292 DUF3093 Protein of unknown function (DUF3093). This family of proteins with unknown function appears to be restricted to Actinobacteria. Some members are annotated as alanine rich membrane proteins however this cannot be confirmed. 140
54025 402747 pfam11293 DUF3094 Protein of unknown function (DUF3094). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 55
54026 402748 pfam11294 DUF3095 Protein of unknown function (DUF3095). Some members in this bacterial family of proteins are annotated as adenylyl cyclase however this cannot be confirmed. Currently no function is known. 377
54027 371454 pfam11295 DUF3096 Protein of unknown function (DUF3096). This family of proteins with unknown function appears to be restricted to Proteobacteria. 37
54028 402749 pfam11296 DUF3097 Protein of unknown function (DUF3097). This family of proteins with unknown function appears to be restricted to Actinobacteria. 270
54029 402750 pfam11297 DUF3098 Protein of unknown function (DUF3098). This bacterial family of proteins has no known function. 67
54030 402751 pfam11298 DUF3099 Protein of unknown function (DUF3099). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 72
54031 402752 pfam11299 DUF3100 Protein of unknown function (DUF3100). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 235
54032 402753 pfam11300 DUF3102 Protein of unknown function (DUF3102). This family of proteins has no known function. 129
54033 402754 pfam11301 DUF3103 Protein of unknown function (DUF3103). This family of proteins with unknown function appears to be restricted to Proteobacteria. 351
54034 402755 pfam11302 DUF3104 Protein of unknown function (DUF3104). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 69
54035 402756 pfam11303 DUF3105 Protein of unknown function (DUF3105). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 123
54036 402757 pfam11304 DUF3106 Protein of unknown function (DUF3106). Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known. 98
54037 402758 pfam11305 DUF3107 Protein of unknown function (DUF3107). Some members in this family of proteins are annotated as ATP-binding proteins however this cannot be confirmed. Currently no function is known. 73
54038 402759 pfam11306 DUF3108 Protein of unknown function (DUF3108). This is a bacterial family of putative lipoproteins. Structure 3fzx, the first structural template for this large family including several homologs in the human gut microbiome and in metagenomic datasets, folds into a beta barrel that topologically looks like a small-scale porin (such as FepA). Bacteroides fragilis glyA is a putative exported protein, and this fold is of the YmcC-like type, with a predicted signal peptide SpI cleavage site AGAMA|QNQDC, and a Phobius server prediction of non-cytoplasmic localization for amino acids 21-236. The possibility of it being a membrane protein can be ruled out by the hydrophilic nature of the solvent exposed surface outside the barrels. Analysis of sequence conservation suggests that an area near Glu172/Trp206 is potentially interesting. These two residues are also conserved in Dali hit Structure 2in5, a hypothetical lipoprotein classified as a new YmcC-like fold in SCOP (SCOP:159271, with a 12-stranded meander beta-sheet folded into a deformed beta-barrel) despite large structural differences between the two structures, suggesting similarity in function. 225
54039 402760 pfam11307 DUF3109 Protein of unknown function (DUF3109). This bacterial family of proteins has no known function. 181
54040 402761 pfam11308 Glyco_hydro_129 Glycosyl hydrolases related to GH101 family, GH129. This family of bacterial and lower eukaryote glycosyl hydrolases is related to CAZy family GH129, and distantly to GH101, and is made up of sub-families GHL1-GHL3. 324
54041 402762 pfam11309 DUF3112 Protein of unknown function (DUF3112). This eukaryotic family of proteins has no known function. 217
54042 288203 pfam11310 DUF3113 Protein of unknown function (DUF3113). This family of proteins has no known function. It has a highly conserved sequence. 60
54043 402763 pfam11311 DUF3114 Protein of unknown function (DUF3114). Some members in this family of proteins with unknown function are annotated as cytosolic proteins. This cannot be confirmed. 253
54044 402764 pfam11312 Methyltransf_34 Putative SAM-dependent methyltransferase. The proteins in this largely fungal family are likely to be methyltransferases. This was determined through multiple motif screening in yeast. 294
54045 402765 pfam11313 DUF3116 Protein of unknown function (DUF3116). This family of proteins with unknown function appears to be restricted to Bacillales. 84
54046 371461 pfam11314 DUF3117 Protein of unknown function (DUF3117). This family of proteins with unknown function appears to be restricted to Actinobacteria. 50
54047 402766 pfam11315 Med30 Mediator complex subunit 30. Med30 is a metazoan-specific subunit of Mediator, having no homologs in yeasts. 147
54048 371463 pfam11316 Rhamno_transf Putative rhamnosyl transferase. Most members of this family are uncharacterized, but one is a putative side-chain-rhamnosyl transferase. 235
54049 402767 pfam11317 DUF3119 Protein of unknown function (DUF3119). This family of proteins has no known function. 108
54050 402768 pfam11318 DUF3120 Protein of unknown function (DUF3120). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 199
54051 402769 pfam11319 VasI Type VI secretion system VasI, EvfG, VC_A0118. VasI is a family of Gram-negative proteins that form part of the pathogenicity apparatus for bacteria-to-bacteria attack. The exact function of this component is not known. 183
54052 402770 pfam11320 DUF3122 Protein of unknown function (DUF3122). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 134
54053 402771 pfam11321 DUF3123 Protein of unknown function (DUF3123). This eukaryotic family of proteins has no known function. 113
54054 402772 pfam11322 DUF3124 Protein of unknown function (DUF3124). This bacterial family of proteins has no known function. 123
54055 402773 pfam11324 DUF3126 Protein of unknown function (DUF3126). This family of proteins with unknown function appears to be restricted to Alphaproteobacteria. 63
54056 402774 pfam11325 DUF3127 Domain of unknown function (DUF3127). This bacterial family of proteins has no known function. However, it does show distant similarity to pfam00436, with proteins such as Prevotella buccalis HMPREF0650_0099 being similar to both families. This suggests that this family may have a DNA-binding function. 84
54057 402775 pfam11326 DUF3128 Protein of unknown function (DUF3128). This eukaryotic family of proteins has no known function. 80
54058 402776 pfam11327 DUF3129 Protein of unknown function (DUF3129). This eukaryotic family of proteins has no known function. 183
54059 288220 pfam11328 DUF3130 Protein of unknown function (DUF3130). This bacterial family of proteins has no known function. 89
54060 402777 pfam11329 DUF3131 Protein of unknown function (DUF3131). This bacterial family of proteins has no known function. 367
54061 371472 pfam11330 DUF3132 Protein of unknown function (DUF3132). The proteins in this viral family are 55 kDa. No function is currently known. 242
54062 402778 pfam11331 zinc_ribbon_12 Probable zinc-ribbon domain. This eukaryotic family of proteins has no known function. 45
54063 402779 pfam11332 DUF3134 Protein of unknown function (DUF3134). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 68
54064 402780 pfam11333 DUF3135 Protein of unknown function (DUF3135). This family of proteins with unknown function appears to be restricted to Proteobacteria. 80
54065 402781 pfam11334 DUF3136 Protein of unknown function (DUF3136). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 64
54066 402782 pfam11335 DUF3137 Protein of unknown function (DUF3137). This bacterial family of proteins has no known function. 142
54067 402783 pfam11336 DUF3138 Protein of unknown function (DUF3138). This family of proteins with unknown function appears to be restricted to Proteobacteria. 525
54068 402784 pfam11337 DUF3139 Protein of unknown function (DUF3139). This family of proteins with unknown function appears to be restricted to Firmicutes. 77
54069 402785 pfam11338 DUF3140 Protein of unknown function (DUF3140). Some members in this family of proteins are annotated as DNA binding proteins. No function is currently known. 92
54070 402786 pfam11339 DUF3141 Protein of unknown function (DUF3141). This family of proteins with unknown function appears to be restricted to Proteobacteria. 582
54071 371478 pfam11340 DUF3142 Protein of unknown function (DUF3142). This bacterial family of proteins has no known function. 223
54072 402787 pfam11341 DUF3143 Protein of unknown function (DUF3143). This family of proteins has no known function. 65
54073 402788 pfam11342 DUF3144 Protein of unknown function (DUF3144). This family of proteins with unknown function appears to be restricted to Proteobacteria. 77
54074 402789 pfam11343 DUF3145 Protein of unknown function (DUF3145). This family of proteins with unknown function appears to be restricted to Actinobacteria. 157
54075 402790 pfam11344 DUF3146 Protein of unknown function (DUF3146). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 80
54076 288237 pfam11345 DUF3147 Protein of unknown function (DUF3147). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 111
54077 402791 pfam11346 DUF3149 Protein of unknown function (DUF3149). This bacterial family of proteins has no known function. 39
54078 402792 pfam11347 DUF3148 Protein of unknown function (DUF3148). This family of proteins has no known function. 61
54079 402793 pfam11348 DUF3150 Protein of unknown function (DUF3150). This bacterial family of proteins with unknown function appears to be restricted to Proteobacteria. 256
54080 402794 pfam11349 DUF3151 Protein of unknown function (DUF3151). This family of proteins with unknown function appears to be restricted to Actinobacteria. 127
54081 371483 pfam11350 DUF3152 Protein of unknown function (DUF3152). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function. 207
54082 402795 pfam11351 GTA_holin_3TM Holin of 3TMs, for gene-transfer release. This is a family of bacterial 3TM holins. In Rhodobacter capsulatus the protein is expressed just overlapping and downstream of a putative N-acetylmuramidase lysozyme (an endolysin) thought to be responsible for lysing a phage particle, RcGTA - a gene-transfer agent. A holin would be necessary for such an endolysin to access the peptidoglycan. Gene-transfer agents are bacteriophage-like genetic elements with the sole known function of horizontal gene transfer, serving an important role in microbial evolution. In order to be released from the cell these require the combined action of an endolysin and this holin. 123
54083 402796 pfam11352 DUF3155 Protein of unknown function (DUF3155). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 88
54084 402797 pfam11353 DUF3153 Protein of unknown function (DUF3153). This family of proteins with unknown function appears to be restricted to Cyanobacteria. Some members are annotated as membrane proteins however this cannot be confirmed. 191
54085 402798 pfam11354 DUF3156 Protein of unknown function (DUF3156). This family of proteins with unknown function appears to be restricted to Proteobacteria. 161
54086 371487 pfam11355 DUF3157 Protein of unknown function (DUF3157). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 199
54087 402799 pfam11356 T2SSC Type II secretion system protein C. This is the greater N-terminal region of GspC-type proteins. GspC proteins form part of the sophisticated transport mechanism of Gram-negative pathogens for injecting diverse proteins into their hosts, a type-II secretion system - T2SS. The region is made up of a short N-terminal cytoplasmic domain that is followed by the single transmembrane helix, a Pro-rich linker, and the so-called homology region domain in the periplasm. This inner membrane GspC interacts with the outer membrane secretin GspD via periplasmic domains, an interaction which is critical for the effectiveness of type II secretion. 142
54088 402800 pfam11357 Spy1 Cell cycle regulatory protein. Speedy (Spy1) is a cell cycle regulatory protein which activates CDK2, the major kinase that allows progression through G1/S phase and further replication events. Spy1 expression overcomes a p27-induced cell cycle arrest to allow for DNA synthesis, so cell cycle progression occurs due to an interaction between Spy1 and p27. Spy1 is also known as Ringo protein A. 131
54089 402801 pfam11358 DUF3158 Protein of unknown function (DUF3158). Some members in this family of proteins are annotated as integrase regulator R however this cannot be confirmed. This family of proteins with unknown function appears to be restricted to Proteobacteria. 156
54090 288251 pfam11359 gpUL132 Glycoprotein UL132. Glycoprotein UL132 is a low-abundance structural component of Human cytomegalovirus (HCMV). The function of this protein is not fully understood. 238
54091 402802 pfam11360 DUF3110 Protein of unknown function (DUF3110). This family of proteins has no known function. 84
54092 371490 pfam11361 DUF3159 Protein of unknown function (DUF3159). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed. Currently this family of proteins has no known function. 188
54093 402803 pfam11362 DUF3161 Protein of unknown function (DUF3161). This eukaryotic family of proteins has no known function. 83
54094 371491 pfam11363 DUF3164 Protein of unknown function (DUF3164). This family of proteins has no known function. 194
54095 402804 pfam11364 DUF3165 Protein of unknown function (DUF3165). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function. 81
54096 402805 pfam11365 SOGA Protein SOGA. The SOGA (suppressor of glucose by autophagy) family consists of proteins SOGA1, SOGA2, and SOGA3. SOGA1 regulates autophagy by playing a role in the reduction of glucose production in an adiponectin and insulin dependent manner. 95
54097 402806 pfam11367 DUF3168 Protein of unknown function (DUF3168). This family of proteins has no known function but is likely to be a component of bacteriophage. 117
54098 402807 pfam11368 DUF3169 Protein of unknown function (DUF3169). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function. 248
54099 402808 pfam11369 DUF3160 Protein of unknown function (DUF3160). This family of proteins has no known function. 606
54100 402809 pfam11371 DUF3172 Protein of unknown function (DUF3172). This family of proteins has no known function. 139
54101 402810 pfam11372 DUF3173 Domain of unknown function (DUF3173). This family of proteins with unknown function appears to be restricted to Firmicutes. These proteins appear to be distantly related to HHH domains and are therefore likely to be DNA-binding. Genomic environment-visualisation confirms the likely function as being DNA-binding, as this short protein lies very closely between an integrase and a replication protein (http://www.microbesonline.org/). 58
54102 402811 pfam11373 DUF3175 Protein of unknown function (DUF3175). This family of proteins with unknown function appears to be restricted to Proteobacteria. 84
54103 402812 pfam11374 DUF3176 Protein of unknown function (DUF3176). This eukaryotic family of proteins has no known function. 107
54104 402813 pfam11375 DUF3177 Protein of unknown function (DUF3177). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function. 189
54105 402814 pfam11376 DUF3179 Protein of unknown function (DUF3179). This family of proteins has no known function. 289
54106 402815 pfam11377 DUF3180 Protein of unknown function (DUF3180). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function. 137
54107 402816 pfam11378 DUF3181 Protein of unknown function (DUF3181). This family of proteins has no known function. 88
54108 402817 pfam11379 DUF3182 Protein of unknown function (DUF3182). This family of proteins with unknown function appears to be restricted to Proteobacteria. 353
54109 402818 pfam11380 Stealth_CR2 Stealth protein CR2, conserved region 2. Stealth_CR2 is the second of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR2 carries a well-conserved NDD sequence-motif. The domain is found in tandem with CR1, CR3 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system. 108
54110 402819 pfam11381 DUF3185 Protein of unknown function (DUF3185). Some members in this bacterial family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. 56
54111 402820 pfam11382 MctB Copper transport outer membrane protein, MctB. Outer membrane channel protein MctB in Mycobacterium tuberculosis is part of a Cu resistance mechanism that ensures low intracellular Cu levels in the bacterium. Human resistance to bacteria, mediated via IFN-gamma-mediated activation of macrophages, may involve the use of reactive Cu(1) in the presence of hydrogen peroxide, since activated Cu is toxic to bacteria. IFN-gamma stimulates the trafficking of the Cu transporter ATP7a to the vesicles that fuse with phagosomes and these phagosomes are found to have a high Cu content and an increased bactericidal activity against E. coli. Using MctB, Mycobacterium tuberculosis may limit the amount of excess Cu within it while in the host. 298
54112 402821 pfam11383 DUF3187 Protein of unknown function (DUF3187). This family of proteins with unknown function appears to be restricted to Proteobacteria. These proteins are likely to be outer membrane proteins. 320
54113 402822 pfam11384 DUF3188 Protein of unknown function (DUF3188). This bacterial family of proteins has no known function. 50
54114 402823 pfam11385 DUF3189 Protein of unknown function (DUF3189). This family of proteins with unknown function appears to be restricted to Firmicutes. 147
54115 371502 pfam11386 VERL Vitelline envelope receptor for lysin. VERL, the egg vitelline envelope (VE) receptor for lysin, is a giant unbranched glycoprotein comprising 30% of the vitelline envelope. Lysin binds to VERL and creates a hole as VERL molecules lose cohesion and splay apart. These proteins are important in the mediation of fertilisation. 78
54116 402824 pfam11387 DUF2795 Protein of unknown function (DUF2795). This family of proteins has no known function. 44
54117 402825 pfam11388 DotA Phagosome trafficking protein DotA. DotA is essential for intracellular growth in Legionella. DotA is thought to play an important role in regulating initial phagosome trafficking decisions either upon or immediately after macrophage uptake. 105
54118 371504 pfam11389 Porin_OmpL1 Leptospira porin protein OmpL1. OmpL1 is a member of the outer membrane (OM) proteins in the mammalian pathogen Leptospira. Specifically, it is a porin. 272
54119 402826 pfam11390 FdsD NADH-dependent formate dehydrogenase delta subunit FdsD. FdsD is the delta subunit of the enzyme formate dehydrogenase. This subunit may play a role in maintaining the quaternary structure by means of electrostatic interactions with the other subunits. The delta subunit is not involved in the active centre of the enzyme. 61
54120 402827 pfam11391 DUF2798 Protein of unknown function (DUF2798). This family of proteins has no known function. 58
54121 402828 pfam11392 DUF2877 Protein of unknown function (DUF2877). This bacterial family of proteins are putative carboxylase proteins however this cannot be confirmed. 109
54122 402829 pfam11393 T4BSS_DotI_IcmL Type-IV b secretion system, inner-membrane complex component. IcmL contains two amphipathic beta-sheet regions, required for the pore-forming ability which may be related to the transfer of this protein into a host cell membrane. The icmL gene shows significant similarity to plasmid genes involved in conjugation however IcmL is thought to be required for macrophage killing. It is unknown whether conjugation plays a role in macrophage killing. This is a family of DotI/IcmL proteins of type IVb secretion systems, that reside in the inner-membrane. It carries a single transmembrane helix in the N-terminal conserved region, has an extra-periplasmic domain, and is conserved in all T4BSSs including I-type conjugation systems (TraM). DotI/IcmL (and DotJ) may form an inner membrane complex that associates with the core complex. 180
54123 288282 pfam11394 DUF2875 Protein of unknown function (DUF2875). This family of proteins with unknown function appears to be restricted to Proteobacteria. 462
54124 402830 pfam11395 DUF2873 Protein of unknown function (DUF2873). This viral family of proteins has no known function. 43
54125 402831 pfam11396 PepSY_like Putative beta-lactamase-inhibitor-like, PepSY-like. This family of bacterial proteins is probably periplasmic. Members are found predominantly in microbes of the human gut and oral cavity. Structurally, one member of this family is found to show similarity to the beta-lactamase-inhibitor PepSY proteins, so the overall function may be inhibitory. There are tandem repeats of the domain on many family members. 86
54126 371506 pfam11397 GlcNAc Glycosyltransferase (GlcNAc). GlcNAc is an enzyme that carries out the first glycosylation step of hydroxylated Skp1, a ubiquitous eukaryotic protein, in the cytoplasm. 351
54127 402832 pfam11398 DUF2813 Protein of unknown function (DUF2813). This entry contains YjbD from Escherichia coli, which is annotated as a nucleotide triphosphate hydrolase. 372
54128 378660 pfam11399 DUF3192 Protein of unknown function (DUF3192). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. 101
54129 402833 pfam11401 Tetrabrachion Tetrabrachion. Tetrabrachion forms a parallel right-handed coiled coil structure with hydrophobic interactions and salt bridges forming a thermostable tetrameric structure. It contains large hydrophobic cavities. No function is known for this family of proteins. 49
54130 402834 pfam11402 Antifungal_prot Antifungal protein. Antifungal protein consists of five antiparallel beta strands which are highly twisted creating a beta barrel stabilized by four internal disulphide bridges. A cationic site adjacent to a hydrophobic stretch on the protein surface may constitute a phospholipid binding site. 50
54131 402835 pfam11403 Yeast_MT Yeast metallothionein. Metallothioneins are characterized by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster. 39
54132 402836 pfam11404 Potassium_chann Potassium voltage-gated channel. Fast inactivation of voltage-dependent potassium channels occurs by a 'ball-and-chain'-type mechanism. It controls membrane excitability and signal propagation in central neurons. Inactivation is regulated by protein phosphorylation where phosphorylation of serine residues leads to a reduction of the fast inactivation. 29
54133 402837 pfam11405 Inhibitor_I67 Bromelain inhibitor VI. Bromelain inhibitor VI is a double-chain inhibitor consisting of a 11-residue and a 41-residue chain. This protein is the 41-residue heavy chain which is joined to the 11-residue chain by disulphide bonds. The inhibitor acts to inhibit the cysteine proteinase bromelain. 41
54134 371512 pfam11406 Tachystatin_A Antimicrobial peptide tachystatin A. Tachystatin A contains a cysteine-stabilized triple-stranded beta-sheet and shows features common to membrane-interactive peptides. Tachystatin A is thought to have an antimicrobial activity similar to defensins. Tachystatin A is also a chitin-binding peptide. 44
54135 402838 pfam11407 RestrictionMunI Type II restriction enzyme MunI. Type II restriction enzyme MunI recognizes the palindromic sequence C/AATTG. It makes contact with the DNA via the major groove. 202
54136 151847 pfam11408 Helicase_Sgs1 Sgs1 RecQ helicase. RecQ helicases unwind DNA in an ATP-dependent manner. Sgs1 has a HRDC (helicase and RNaseD C-terminal) domain which modulates the helicase function via auxiliary contacts to DNA. 79
54137 402839 pfam11409 SARA Smad anchor for receptor activation (SARA). Smad proteins mediate transforming growth factor-beta (TGF-beta) signaling from the transmembrane serine-threonine receptor kinases to the nucleus. SARA recruits Smad2 to the TGF-beta receptors for phosphorylation. 37
54138 371514 pfam11410 Antifungal_pept Antifungal peptide. This peptide has six cysteines involved in three disulphide bonds. It contains a global fold which involves a cysteine-knotted three-stranded antiparallel beta-sheet along with a flexible loop and four beta-reverse turns. It also has an amphiphilic character which is the main structural basis of its biological function. 33
54139 402840 pfam11411 DNA_ligase_IV DNA ligase IV. DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface. 34
54140 402841 pfam11412 DsbC Disulphide bond corrector protein DsbC. DsbC rearranges incorrect disulphide bonds during oxidative protein folding. It is activated by the N-terminal domain of DsbD, a transmembrane electron transporter. DsbD binds to a DsbC dimer and selectively activates it using electrons from the cytoplasm. 117
54141 402842 pfam11413 HIF-1 Hypoxia-inducible factor-1. HIF-1 is a transcriptional complex and controls cellular systemic homeostatic responses to oxygen availability. In the presence of oxygen HIF-1 alpha is targeted for proteasomal degradation by pVHL, a ubiquitination complex. 32
54142 402843 pfam11414 Suppressor_APC Adenomatous polyposis coli tumor suppressor protein. The tumor suppressor protein, APC, has a nuclear export activity as well as many different intracellular functions. The structure consists of three alpha-helices forming two separate antiparallel coiled coils. 82
54143 151854 pfam11415 Toxin_37 Antifungal peptide termicin. Termicin is a cysteine-rich antifungal peptide which exhibits antibacterial activity. A cysteine stabilized alpha beta motif is formed due to an alpha-helical segment and a two-stranded antiparallel beta-sheet. 35
54144 402844 pfam11416 Syntaxin-5_N Syntaxin-5 N-terminal, Sly1p-binding domain. Syntaxin-5_N is the Sed5 N-terminal and the N-terminus of Syntaxin-5-like proteins. It is the region of Syntaxin that interacts with Sly1p, a positive regulator of intracellular membrane fusion, allowing SM (cytosolic Sec1/munc18-like) proteins to stay associated with the assembling fusion machinery. This allows the SM proteins to participate in late fusion steps. 22
54145 402845 pfam11417 Inhibitor_G39P Loader and inhibitor of phage G40P. G39P inhibits the initiation of DNA replication by blocking G40P replicative helicase. G39P has a bipartite structure consisting of a folded N-terminal domain and an unfolded C-terminal domain. The C terminal is essential for helicase interaction. 66
54146 402846 pfam11418 Scaffolding_pro Phi29 scaffolding protein. This protein is also referred to as gp7. The protein contains a DNA-binding function and may have a role in mediating the structural transition from prohead to mature virus and also scaffold release. Gp7 is arranged within the capsid as a series of concentric shells. 100
54147 402847 pfam11419 DUF3194 Protein of unknown function (DUF3194). This family of proteins has no known function however the structure has been determined. The protein consists of two alpha-helices packed on the same side of a central beta-hairpin. 83
54148 371520 pfam11420 Subtilosin_A Bacteriocin subtilosin A. Subtilosin A is a bacteriocin from Bacillus subtilis. The protein has a cyclized peptide backbone and forms three cross-links between the sulphurs of Cys13, Cys7 and Cys4 and the alpha-positions of Phe22, Thr28 and Phe31. 35
54149 402848 pfam11421 Synthase_beta ATP synthase F1 beta subunit. The NMR solution structure of the protein in SDS micelles was found to contain two helices, an N-terminal amphipathic alpha-helix and a C-terminal alpha-helix separated by a large unstructured internal domain. The N-terminal alpha-helix is the Tom20 receptor binding site whereas the C-terminal alpha-helix is located upstream of the mitochondrial processing peptidase cleavage site. 47
54150 402849 pfam11422 IBP39 Initiator binding protein 39 kDa. IBP39 recognizes the initiator which is solely responsible for transcription start site selection. IBP39 contains an N-terminal Inr binding domain connected to a C-terminal domain. The C domain structure indicates that it interacts with the T. vaginalis RNAP II large subunit C-terminal domain. Binding of IBP39 to Inr recruits RNAP II and initiates transcription. 179
54151 402850 pfam11423 Repressor_Mnt Regulatory protein Mnt. Mnt is a repressor which is involved in the genetic switch between lysogenic and lytic growth in bacteriophage P22. The C-terminal domain of the protein consists of a dimer of two antiparallel coiled coils with a right handed twist, which is both stronger and has closer inter-helical separation compared with those found in left-handed coiled coils. 25
54152 402851 pfam11424 DUF3195 Protein of unknown function (DUF3195). This archaeal family of proteins has no known function. 85
54153 371524 pfam11426 Tn7_TnsC_Int Tn7 transposition regulator TnsC. TnsC is a molecular switch that regulates transposition and interacts with TnsA which is a component of the transposase. The two proteins interact via the residues 504-555 on TnsC. The TnsA/TnsC interaction is very important in Tn7 transposition. 47
54154 192757 pfam11427 HTH_Tnp_Tc3_1 Tc3 transposase. Tc3 is transposase with a specific DNA-binding domain which contains three alpha-helices, two of which form a helix-turn-helix motif which makes four base-specific contacts with the major groove. The N-terminus makes contacts with the minor groove. There is a base specific recognition between Tc3 and the transposon DNA. The DNA binding domain forms a dimer in which each monomer binds a separate transposon end. This implicates that the dimer has a role in synapsis and is necessary for the simultaneous cleavage of both transposon termini. 50
54155 402852 pfam11428 DUF3196 Protein of unknown function (DUF3196). This protein is the product of the gene MPN330 and is thought to be involved in a cellular function that has yet to be characterized. The protein has 11 helices and a novel fold. No function is currently known for this protein. 264
54156 402853 pfam11429 Colicin_D Colicin D. Colicin D is a tRNase which kills sensitive E.coli cells via a specific tRNA cleavage. It targets the four isoaccepting tRNAs for Arg and cleaves the phosphodiester bond between positions 38 and 39 at the 3' junction of the anticodon stem and the loop. 81
54157 402854 pfam11430 EGL-1 Programmed cell death activator EGL-1. Initiation of programmed cell death in C.elegans occurs by the binding of EGL-1 to CED-9 which disrupts a complex involving CED-4/CED-9 and allows CED-4 to activate CED-3, a caspase. It is the C terminal domain of EGL-1 which is involved in the formation of the complex with CED-9. The formation of the complex induces structural rearrangements in CED-9 and EGL-1 adopts an extended alpha-helical conformation. 20
54158 402855 pfam11431 Transport_MerF Membrane transport protein MerF. The mercury transport membrane protein, MerF has a core helix-loop-helix domain. It has two vicinal pairs of cysteine residues which are involved in the transport of Hg(II) across the membrane and are exposed to the cytoplasm. 45
54159 371529 pfam11432 DUF3197 Protein of unknown function (DUF3197). This bacterial family of proteins has no known function. 113
54160 371530 pfam11433 DUF3198 Protein of unknown function (DUF3198). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently, this archaeal family has no known function. 49
54161 371531 pfam11434 CHIPS Chemotaxis-inhibiting protein CHIPS. The chemotaxis inhibitory protein, CHIPS, is an excreted virulence factor which acts by binding to C5a and formylated peptide receptor (FPR), blocking phagocyte responses. A fragment of CHIPS, which contains residues 31-121, comprises an alpha helix packed onto a four stranded anti-parallel beta-sheet. Most of the conserved residues of CHIPS are present in the alpha-helix. 91
54162 402856 pfam11435 She2p RNA binding protein She2p. She2p is an RNA binding protein which binds to RNA via a helical hairpin. The protein is required for the actin dependent transport of ASH1 mRNA in yeast, a form of mRNP translocation. ASH1 mRNP requires recognition of zip code elements by the RNA binding protein She2p. She2p contains a globular domain consisting of a bundle of five alpha-helices. 199
54163 402857 pfam11436 DUF3199 Protein of unknown function (DUF3199). Some members in this family of proteins with unknown function are annotated as YqbG however this cannot be confirmed. Currently the protein has no known function. 123
54164 371534 pfam11437 Vanabin-2 Vanadium-binding protein 2. The Vanadium binding protein, Vanabin2, contains four alpha-helices connected by nine disulphide bonds. Vanadium accumulates in Ascidians however the biological reason remains unclear. 93
54165 288317 pfam11438 N36 36-mer N-terminal peptide of the N protein (N36). The arginine-rich motif of the N protein is involved in transcriptional antitermination of phage lambda. N36 forms a complex with boxB RNA by binding tightly to the major groove of the boxB hairpin via hydrophobic and electrostatic interactions forming a bent alpha helix. 35
54166 371535 pfam11439 T3SchapCesA Type III secretion system filament chaperone CesA. This family represents a chaperone protein for the type III secretion system - TTSS - translocon protein EspA, to prevent the latter's self-polymerization. The TTSS is a highly specialized bacterial protein secretory pathway, similar in many ways to the flagellar system, that is essential for the pathogenesis of many Gram-negative bacteria. The twenty or so proteins making up the TTSS apparatus, referred to as the needle complex, allow the injection of virulence proteins (known as effectors) directly into the cytoplasm of the eukaryotic host cells they infect; however, the injection process itself is mediated by a subset of extracellular proteins that are secreted by the needle complex to the bacterial surface and assembled into the type III translocon - EspA, EspB and EspD. EspA polymerizes into an extracellular filament, and, as with other fibrous proteins, is apt to undergo massive polymerization when overexpressed. CesA is the secretion chaperone protein that binds to EspA. CesA is dimeric and helical, and it traps EspA in a monomeric state and inhibits its polymerization. 95
54167 288318 pfam11440 AGT DNA alpha-glucosyltransferase. The T4 bacteriophage of E.coli protects its DNA via two glycosyltransferases which glucosylate 5-hydroxymethyl cytosines (5-HMC) using UDP-glucose. These two proteins are the retaining alpha-glucosyltransferase (AGT) and the inverting beta-glucosyltransferase (BGT). The proteins in this family are AGT. AGT adopts the GT-B fold and binds both the sugar donor and acceptor to the C-terminal domain. There is evidence for a role of AGT in the base-flipping mechanism and for its specific recognition of the acceptor base. 355
54168 288319 pfam11441 MxiM Pilot protein MxiM. MxiM, a Shigella pilot protein, is essential for the assembly and membrane association of the Shigella secretin MxiD. MxiM contains an orthologous secretin component and has a specific binding domain for the acyl chains of bacterial lipids. The C terminal domain of MxiD hinders lipid binding to MxiM. 115
54169 288320 pfam11442 DUF2826 Protein of unknown function (DUF2826). This is a family of uncharacterized proteins that is highly conserved in Trypanosoma cruzi. 158
54170 402858 pfam11443 DUF2828 Domain of unknown function (DUF2828). This is an uncharacterized domain found in eukaryotes and viruses. 612
54171 402859 pfam11444 DUF2895 Protein of unknown function (DUF2895). This is a bacterial family of uncharacterized proteins. 189
54172 402860 pfam11445 DUF2894 Protein of unknown function (DUF2894). This is a bacterial family of uncharacterized proteins. 181
54173 402861 pfam11446 DUF2897 Protein of unknown function (DUF2897). This is a bacterial family of uncharacterized proteins. 50
54174 402862 pfam11447 DUF3201 Protein of unknown function (DUF3201). This archaeal family of proteins has no known function. 153
54175 402863 pfam11448 DUF3005 Protein of unknown function (DUF3005). This is a bacterial family of uncharacterized proteins. 109
54176 402864 pfam11449 ArsP_2 Putative, 10TM heavy-metal exporter. This is a family of putative manganese transporters with 9-11 TMs. Members carry two well-conserved characteristic sequence motifs of 'PGCG'. 363
54177 402865 pfam11450 DUF3008 Protein of unknown function (DUF3008). This is a bacterial family of uncharacterized proteins. 57
54178 402866 pfam11452 DUF3000 Protein of unknown function (DUF3000). This is a bacterial family of uncharacterized proteins. 173
54179 402867 pfam11453 DUF2950 Protein of unknown function (DUF2950). This is a bacterial family of uncharacterized proteins. 273
54180 402868 pfam11454 DUF3016 Protein of unknown function (DUF3016). This is a bacterial family of uncharacterized proteins. 139
54181 402869 pfam11455 DUF3018 Protein of unknown function (DUF3018). This is a bacterial family of uncharacterized proteins. 64
54182 378667 pfam11456 DUF3019 Protein of unknown function (DUF3019). This is a bacterial family of uncharacterized proteins. 102
54183 402870 pfam11457 DUF3021 Protein of unknown function (DUF3021). This is a bacterial family of uncharacterized proteins. 130
54184 402871 pfam11458 Mistic Membrane-integrating protein Mistic. Mistic is an integral membrane protein that folds autonomously into the membrane. The protein forms a helical bundle with a polar lipid-facing surface. Mistic can be used for high-level production of other membrane proteins in their native conformations. 74
54185 402872 pfam11459 AbiEi_3 Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 159
54186 402873 pfam11460 DUF3007 Protein of unknown function (DUF3007). This is a family of uncharacterized proteins found in bacteria and eukaryotes. 96
54187 402874 pfam11461 RILP Rab interacting lysosomal protein. RILP contains a domain which contains two coiled-coil regions and is found mainly in the cytosol. RILP is recruited onto late endosomal and lysosomal membranes by Rab7 and acts as a downstream effector of Rab7. This recruitment process is important for phagosome maturation and fusion with late endosomes and lysosomes. 58
54188 314397 pfam11462 DUF3203 Protein of unknown function (DUF3203). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. 67
54189 314398 pfam11463 R-HINP1I R.HinP1I restriction endonuclease. Hinp1I is a type II restriction endonuclease, recognising and cleaving a palindromic tetranucleotide sequence (G/CGC) resulting in 2 nt 5' overhanging ends. HINP1I has a conserved catalytic core domain containing an active site motif SDC18QXK and a DNA-binding domain. 205
54190 402875 pfam11464 Rbsn Rabenosyn Rab binding domain. Rabenosyn-5 (Rbsn) is a multivalent effector that interacts with the Rab family. Rbsn contains distinct Rab4 and Rab5 binding sites within residues 264-500 and 627-784 respectively. Rab proteins are GTPases involved in the regulation of all stages of membrane trafficking. 39
54191 288342 pfam11465 Receptor_2B4 Natural killer cell receptor 2B4. 2B4 is a transmembrane receptor which is expressed primarily on natural killer cells. It plays a role in activating NK-mediated cytotoxicity through its interaction with CD48 on target cells in a subset of CD8 T cells. The structure of 2B4 consists of an immunoglobulin variable domain fold and contains two beta-sheets. One of the beta-sheets, the six-stranded sheet, contains structural features that may have a role in ligand recognition and receptor function. 108
54192 402876 pfam11466 Doppel Prion-like protein Doppel. Dpl is a homolog related to the prion protein (PrP). Dpl is toxic to neurons and is expressed in the brains of mice that do not express PrP. In DHPC and SDS micelles, Dpl shows about 40% alpha-helical structure however in aqueous solution it consists of a random coil. The alpha helical segment can adopt a transmembrane localization also in a membrane. The unprocessed Dpl protein is thought to possess a possible channel formation mechanism which may be related to toxicity through direct interaction with cell membranes and damage to the cell membrane. 30
54193 402877 pfam11467 LEDGF Lens epithelium-derived growth factor (LEDGF). LEDGF is a chromatin-associated protein that protects cells from stress-induced apoptosis. It is the binding partner of HIV-1 integrase in human cells. The integrase binding domain (IBD) of LEDGF is a compact right-handed bundle composed of five alpha-helices. The residues essential for the interaction with the integrase are present in the inter-helical loop regions of the bundle structure. 112
54194 371546 pfam11468 PTase_Orf2 Aromatic prenyltransferase Orf2. In vivo Orf2 attaches a geranyl group to a 1,3,6,8-tetrahydroxynaphthalene-derived polyketide during naphterpin biosynthesis. In vitro, Orf2 catalyzes carbon-carbon based and carbon-oxygen based prenylation of hydroxyl-containing aromatic acceptors of synthetic, microbial and plant origin. 287
54195 402878 pfam11469 Ribonucleas_3_2 Ribonuclease III. This is a family of archaeal ribonuclease_III proteins. 119
54196 402879 pfam11470 TUG-UBL1 TUG ubiquitin-like domain. TUG is a GLUT4 regulating protein and functions to retain membrane vesicles containing GLUT4 intracellularly. TUG releases the GLUT4 containing vesicles to the cellular exocytic machinery in response to insulin stimulation which allows translocation to the plasma membrane. TUG has an N-terminal ubiquitin-like domain (UBL1) which in similar proteins appears to participate in protein-protein interactions. The region does have an area of negative electrostatic potential and increased backbone motility which leads to suggestions of a potential protein-protein interaction site. This domain is also found at the N-terminus of yeast UBX4. 65
54197 402880 pfam11471 Sugarporin_N Maltoporin periplasmic N-terminal extension. This domain would appear to be the periplasmic, N-terminal extension of the outer membrane maltoporins, pfam02264, LamB. 31
54198 402881 pfam11472 DUF3206 Protein of unknown function (DUF3206). This bacterial family of proteins has no known function. 128
54199 314404 pfam11473 B2 RNA binding protein B2. B2 is expressed by the insect Flock House virus (FHV) as a counter-defense mechanism against antiviral RNA silencing during infection. In vitro, B2 binds to dsRNA as a dimer and inhibits the cleavage of it by Dicer. B2 blocks cleavage of the FHV genome by Dicer and also the incorporation of FHV small interfering RNAs into the RNA-induced silencing complex. 72
54200 151913 pfam11474 N-Term_TEN Telomerase reverse transcriptase TEN domain. This is the N terminal domain of the protein telomerase reverse transcriptase called TEN. The TEN domain is able to bind both RNA and telomeric DNA and contributes towards telomerase catalysis. The TEN domain has a structure that consists of a core beta sheet surrounded by seven alpha helices and a short beta hairpin. 188
54201 288349 pfam11475 VP_N-CPKC Virion protein N terminal domain. This is the N terminal domain of a family of virion proteins which contains a zinc finger domain. Currently no function is known. 32
54202 402882 pfam11476 TgMIC1 Toxoplasma gondii micronemal protein 1 TgMIC1. TgMIC1 is released as part of a complex by Toxoplasma gondii prior to invasion. The complex which consists of TgMIC4-MIC1-MIC6 participates in host cell attachment and penetration and is critical in invasion. This is the C terminal domain of TgMIC1 which has a Galectin-like fold which interacts with and stabilizes TgMIC6 providing a mechanism for an exit from the early secretory compartments and trafficking of the complex to micronemes. 137
54203 402883 pfam11477 PM0188 Sialyltransferase PMO188. PMO188 is a sialyltransferase from P.multocida. It transfers sialic acid from cytidine 5'-monophosphonuraminic acid to an acceptor sugar. It has important catalytic residues such as Asp141, His311, Glu338, Ser355 and Ser356. 385
54204 314405 pfam11478 Tachystatin_B Antimicrobial chitin binding protein tachystatin B. Tachystatin B is an antimicrobial chitin binding peptide and consists of two isopeptides B1 and B2. Both structures contain a short antiparallel beta sheet with an inhibitory cysteine knot motif. Tyr(14) and Arg(17) are thought to be the essential residues for chitin binding. 42
54205 371552 pfam11479 Suppressor_P21 RNA silencing suppressor P21. P21 is produced by Beet yellows virus to suppress the antiviral silencing response mounted by the host. P21 acts by binding directly to siRNA which is a mediator in the process. P21 has an octameric ring structure with a large central cavity. 80
54206 288352 pfam11480 ImmE5 Colicin-E5 Imm protein. Imms bind specifically to cognate colicins in order to protect their host cells. Imm-E5 is a specific inhibitor protein of colicin E5. It binds to E5 C-terminal ribonuclease domain (CRD) to prevent cell death. The binding mode of E5-CRD and Imm-E5 mimics that of mRNA and tRNA suggesting an evolutionary pathway from the RNA-RNA interaction through the RNA-protein interaction of tRNA/E5-CRD. 82
54207 402884 pfam11482 DUF3208 Protein of unknown function (DUF3208). This bacterial family of proteins has no known function. 108
54208 402885 pfam11483 DUF3209 Protein of unknown function (DUF3209). This family of proteins has no known function. 123
54209 402886 pfam11485 DUF3211 Protein of unknown function (DUF3211). This archaeal family of proteins has no known function. 136
54210 288356 pfam11486 DUF3212 Protein of unknown function (DUF3212). Members in this family of proteins are annotated as YfmB however currently no function for this protein is known. 119
54211 402887 pfam11487 RestrictionSfiI Type II restriction enzyme SfiI. SfiI is a restriction enzyme that can cleave two DNA sites simultaneously to leave 3-base 3' overhangs. It acts as a homo-tetramer and recognizes a specific eight base-pair palindromic DNA sequence. After binding two copies of its recognition sequence, SfiI becomes activated leading to cleavage of all four DNA strands. The structure of SfiI consists of a central twisted beta-sheet surrounded by alpha-helices. 216
54212 402888 pfam11488 Lge1 Transcriptional regulatory protein LGE1. This family of proteins is conserved from fungi to human. In yeasts it is involved in the ubiquitination of histones H2A and H2B. This ubiquitination step is a vital one in the regulation of the transcriptional activity of RNA polymerase II. In S. cerevisiae, Rad6 and Bre1 are present in a complex, also containing Lge1, that is required for H2B ubiquitination. Bre1 is the H2B ubiquitin ligase that interacts with acidic activators, such as Gal4, and recruits Rad6 and its binding partner Lge1 to target promoters. In S. pombe the equivalent protein to Lge1 appears to be Shf1. In human, periphilin acts as a transcriptional co-repressor and regulates cell cycle progression. 71
54213 371558 pfam11489 Aim21 Altered inheritance of mitochondria protein 21. This is a family of proteins conserved in yeasts. Saccharomyces cerevisiae Aim21 may be involved in mitochondrial migration along actin filament. It may also interact with ribosomes. 677
54214 402889 pfam11490 DNA_pol3_a_NII DNA polymerase III polC-type N-terminus II. This is the second N-terminal domain, NII domain, of the DNA polymerase III polC subunit A that is found only in Firmicutes. DNA polymerase polC-type III enzyme functions as the 'replicase' in low G + C Gram-positive bacteria. Purine asymmetry is a characteristic of organisms with a heterodimeric DNA polymerase III alpha-subunit constituted by polC which probably plays a direct role in the maintenance of strand-biased gene distribution; since, among prokaryotic genomes, the distribution of genes on the leading and lagging strands of the replication fork is known to be biased. It has been predicted that the N-terminus of polC folds into two globular domains, NI and NII. A predicted hydrophobic surface patch suggests this domain may be involved in protein binding. This domain is associated with DNA_pol3_alpha pfam07733 and DNA_pol3_a_NI pfam14480. 117
54215 402890 pfam11491 DUF3213 Protein of unknown function (DUF3213). The backbone structure of this family of proteins has been determined however the function remains unknown. The protein has an alpha and beta structure with a ferredoxin-like fold. 85
54216 288362 pfam11492 Dicistro_VP4 Cricket paralysis virus, VP4. This is a family of minor capsid proteins, known as VP4, from the dicistroviridae. The dicistroviridae is a group of small, RNA-containing viruses that are closely structurally related to the picornaviridae. VP4 is a short, extended polypeptide chain found within the viral capsid, at the interface between the external protein shell and packaged RNA genome. 53
54217 402891 pfam11493 TSP9 Thylakoid soluble phosphoprotein TSP9. The plant-specific protein, TSP9 is phosphorylated and released in response to changing light conditions from the photosynthetic membrane. The protein resembles the characteristics of transcription/translation regulatory factors. The structure of the protein is predicted to consist of a random coil. 80
54218 402892 pfam11494 Ta0938 Ta0938. Ta0938 is a protein of unknown function however the structure has been determined. The protein has a novel fold and a putative Zn-binding motif. The structure has two different parts, one region contains a beta sheet flanked by two alpha helices and the other contains a bundle of loops which contain all cysteines in the protein. 92
54219 402893 pfam11495 Regulator_TrmB Archaeal transcriptional regulator TrmB. TrmB is an alpha-glucoside sensing transcriptional regulator. The protein is the transcriptional repressor for gene cluster encoding trehalose/maltose ABC transporter in T.litoralis and P.furiosus. TrmB has lost its DNA binding domain but retained its sugar recognition site. A nonreducing glucosyl residue is shared by all substrates bound to TrmB which suggests that it is a common recognition motif. 233
54220 402894 pfam11496 HDA2-3 Class II histone deacetylase complex subunits 2 and 3. This family of class II histone deacetylase complex subunits HDA2 and HDA3 is found in fungi. The member from S. pombe is referred to as Ccq1 (coiled-coil quantitatively-enriched protein 1). These proteins associate with HDA1 to generate the activity of the HDA1 histone deacetylase complex. HDA1 interacts with itself and with the HDA2-HDA3 subcomplex to form a probable tetramer and these interactions are necessary for catalytic activity. The HDA1 histone deacetylase complex is responsible for the deacetylation of lysine residues on the N-terminal part of the core histones (H2A, H2B, H3 and H4). Histone deacetylation gives a tag for epigenetic repression and plays an important role in transcriptional regulation, cell cycle progression and developmental events. HDA2 and HDA3 have a conserved coiled-coil domain towards their C-terminus. 281
54221 402895 pfam11497 NADH_Oxid_Nqo15 NADH-quinone oxidoreductase chain 15. This protein, Nqo15, is a part of respiratory complex 1 which is a complex that plays a central role in cellular energy production in both bacteria and mitochondria. Nqo15 has a similar fold to Frataxin, the mitochondrial iron chaperone. This protein may have a role in iron-sulphur cluster regeneration in the complex. This domain represents more than half the molecular mass of the entire complex. 123
54222 151935 pfam11498 Activator_LAG-3 Transcriptional activator LAG-3. The C.elegans Notch pathway, involved in the control of growth, differentiation and patterning in animal development, relies on either of the receptors GLP-1 or LIN-12. Both these receptors promote signalling by the recruitment of LAG-3 to target promoters, where it then acts as a transcriptional activator. LAG-3 works as a ternary complex together with the DNA binding protein, LAG-1. 476
54223 402896 pfam11500 Cut12 Spindle pole body formation-associated protein. This is the central coiled-coil region of cut12 also found in other fungi, barring S. cerevisiae. The full protein has two predicted coiled-coil regions, and one consensus phosphorylation site for p34cdc2 and two for MAP kinase. During fission yeast mitosis, the duplicated spindle pole bodies (SPBs) nucleate microtubule arrays that interdigitate to form the mitotic spindle. Cut12 is localized to the SPB throughout the cell cycle, predominantly around the inner face of the interphase SPB, adjacent to the nucleus. Cut12 associates with Fin1 and is important in this context for the activity of Plo1. 149
54224 402897 pfam11501 Nsp1 Non structural protein Nsp1. Nsp1 is the N-terminal cleavage product from the viral replicase that mediates RNA replication and processing. The specific function of the protein is unknown however the structure has been determined. The protein has a novel alpha/beta fold formed by a 6 stranded beta barrel with an alpha helix covering one end of the barrel and another helix alongside the barrel. Nsp1 could be involved in the degradation of mRNA. 138
54225 371566 pfam11502 BCL9 B-cell lymphoma 9 protein. The Wnt pathway plays a role in embryonic development, stem cell growth and tumorigenesis. BCL9 associates with beta-catenin and Tcf in the nucleus when the Wnt pathway is stimulated leading to the transactivation of Wnt target genes. 39
54226 371567 pfam11503 DUF3215 Protein of unknown function (DUF3215). This family of proteins with unknown function appears to be restricted to Saccharomycetaceae. 72
54227 314418 pfam11504 Colicin_Ia Colicin Ia. Colicins are toxic molecules secreted to kill other bacteria in times of stress. Colicin Ia kills susceptible E.coli cells by binding to the colicin I receptor leading to the formation of a voltage-dependent ion channel. The protein can be divided into three domains, a translocation domain, a receptor binding domain and a channel forming domain. 72
54228 402898 pfam11505 DUF3216 Protein of unknown function (DUF3216). This family of archaeal proteins with unknown function appears to be restricted to Thermococcaceae. 91
54229 402899 pfam11506 DUF3217 Protein of unknown function (DUF3217). This family of proteins with unknown function appears to be restricted to Mycoplasma. Some members in this family of proteins are annotated as MG376 however this cannot be confirmed. 104
54230 402900 pfam11507 Transcript_VP30 Ebola virus-specific transcription factor VP30. VP30 is a nucleocapsid-associated Ebola virus-specific transcription factor. It acts by stabilizing nascent mRNA in Ebola virus replication. The C terminal domain of VP30 folds into a dimeric helical assembly. VP30 assembles into hexamers in solution by an N-terminal oligomerization domain which activates the transcription function of the protein. The oligomerization is mediated by hydrophobic amino acids at 94-112. 131
54231 402901 pfam11508 DUF3218 Protein of unknown function (DUF3218). This family of proteins with unknown function appears to be restricted to Pseudomonas. 213
54232 288377 pfam11510 FA_FANCE Fanconi Anaemia group E protein FANCE. Fanconi Anaemia (FA) is a cancer predisposition disorder. In response to DNA damage, the FA core complex monoubiquitinates the downstream FANCD2 protein. The protein FANCE has an important role in DNA repair as it is the FANCD2-binding protein in the FA core complex so it represents the link between the FA core complex and FANCD2. The sequence shown is the C terminal domain of the protein which consists predominantly of helices and does not contain any beta-strand. The fold of the polypeptide is a continuous right-handed solenoidal pattern from the N terminal to the C terminal end. 262
54233 314420 pfam11511 RhodobacterPufX Intrinsic membrane protein PufX. PufX organizes RC-LH1, the photosynthesis reaction centre-light harvesting complex 1 core complex of Rhodobacter sphaeroides. It also facilitates the exchange of quinol for quinone between the reaction centre and cytochrome bc(1) complexes. In organic solvent, PufX contains two hydrophobic helices which are flanked by unstructured regions and connected by a helical bend. 67
54234 371572 pfam11512 Atu4866 Agrobacterium tumefaciens protein Atu4866. Atu4866 is a protein with unknown function from Agrobacterium tumefaciens however the structure has been determined. Atu4866 adopts a streptavidin-like fold and has a beta-barrel/sandwich which is formed by eight antiparallel beta-strands. Atu4866 has a potential ligand-binding site where it has a stretch of conserved residues on the surface. 75
54235 314422 pfam11513 TA0956 Thermoplasma acidophilum protein TA0956. TA0956 is a protein from Thermoplasma acidophilum which currently has no known function however the structure has been determined. The protein has a two-layered alpha/beta-sandwich topology and is a putative Elongation factor 1-alpha binding motif. 110
54236 402902 pfam11514 DUF3219 Protein of unknown function (DUF3219). This family of proteins with unknown function appears to be restricted to Bacillaceae. Some members in this family of proteins are annotated as YkvR however this cannot be confirmed. 94
54237 402903 pfam11515 Cul7 Mouse development and cellular proliferation protein Cullin-7. The Cullin Ring Ligase family member, Cul7, is required for normal mouse development and cellular proliferation. Cul7 has a CPH domain which is a p53 interaction domain. The CPH domain interaction surface of P53 is present in the tetramerisation domain. 75
54238 151953 pfam11516 DUF3220 Protein of unknown function (DUF3220). This family of proteins with unknown function appears to be restricted to Bordetella. 106
54239 402904 pfam11517 Nab2 Nuclear abundant poly(A) RNA-bind protein 2 (Nab2). Nab2 is a yeast heterogeneous nuclear ribonucleoprotein that modulates poly(A) tail length and mRNA. This is the N terminal domain of the protein which mediates interactions with the C-terminal globular domain, Myosin-like protein 1 and the mRNA export factor, Gfd1. The N-terminal domain of Nab2 shows a structure of a helical fold. The N terminal domain of Nab2 is thought to mediate protein-protein interactions that facilitate the nuclear export of mRNA. An essential hydrophobic Phe73 patch on the N terminal domain is thought to be an important component of the interface between Nab2 and Mlp1. 101
54240 402905 pfam11518 DUF3221 Protein of unknown function (DUF3221). This family of proteins with unknown function appears to be restricted to Bacillus. Some members in this family of proteins are annotated as YobA however this cannot be confirmed. YobA is a protein with unknown function. 82
54241 402906 pfam11519 DUF3222 Protein of unknown function (DUF3222). This family of proteins with unknown function appears to be restricted to Rhodopseudomonas. 75
54242 402907 pfam11520 Cren7 Chromatin protein Cren7. Cren7 is a chromatin protein found in Crenarchaeota and has a higher affinity for double-stranded DNA than for single-stranded DNA. The protein constrains negative DNA supercoils and is associated with genomic DNA in vivo. Cren7 interacts with duplex DNA through a beta-sheet and a long flexible loop. The function has not been completely determined but it is thought that the protein may have a role similar to that of archaeal proteins in Euryarchaea. 55
54243 402908 pfam11521 TFIIE-A_C C-terminal general transcription factor TFIIE alpha. TFIIE is compiled of two subunits, alpha and beta. This family of proteins are the C terminal domain of the alpha subunit of the protein which is the largest subunit and contains several functional domains which are important for basal transcription and cell growth. The C terminal end of the protein binds directly to the amino-terminal PH domain of p62/Tfb1 (of IIH) which is involved in the recruitment of the general transcription factor IIH to the transcription preinitiation complex. P53 competes for the same binding site as TFIIE alpha which shows their structural similarity. Like p53, TFIIE alpha 336-439 can activate transcription in vivo. 83
54244 402909 pfam11522 Pik1 Yeast phosphatidylinositol-4-OH kinase Pik1. Pik1 is a regulator of membrane traffic and participates in the mating-pheromone signal-transduction cascade. The protein is localized to the nucleus and cytoplasm in the Golgi. Pik1 is thought to have an actin-independent role in membrane transport. 50
54245 402910 pfam11523 DUF3223 Protein of unknown function (DUF3223). This family of proteins has no known function. 75
54246 314430 pfam11524 SeleniumBinding Selenium binding protein. Selenium is an important nutrient that needs to be regulated since lack of the nutrient leads to cell abnormalities and high concentrations are toxic. SeBP regulates the level of free selenium in the cell by sequestering the nutrient during transport. SeBP acts as a pentamer and delivers the selenium to the selenophosphate synthetase enzyme. Each subunit is composed of an alpha helix on top of a four stranded twisted ss sheet, stabilized by hydrogen bonds. 82
54247 371581 pfam11525 CopK Copper resistance protein K. CopK is a periplasmic dimeric protein which is strongly up-regulated in the presence of copper, leading to a high periplasmic accumulation. CopK has two different binding sites for Cu(I), each with a different affinity for the metal. Binding of the first Cu(I) ion induces a conformational change of CopK which involves dissociation of the dimeric apo-protein. Binding of a second Cu(I) further increases the plasticity of the protein. CopK has features that are common with functionally related proteins such as a structure consisting of an all-beta fold and a methionine-rich Cu(I) binding site. 70
54248 402911 pfam11526 CFIA_Pcf11 Subunit of cleavage factor IA Pcf11. Pcf11 is a subunit of an essential polyadenylation factor in Saccharomyces cerevisiae, CFIA. Pcf11 binds to Clp1, another subunit of CFIA whose interaction is responsible for maintaining a tight coupling between the Clp1 nucleotide binding subunit and the other components of the polyadenylation machinery. 83
54249 402912 pfam11527 ARL2_Bind_BART The ARF-like 2 binding protein BART. BART binds specifically to ARL2.GTP with a high affinity however it does not bind to ARL2.GDP. It is thought that this specific interaction is due to BART being the first identified ARL2-specific effector. The function is not completely characterized. BART is predominantly cytosolic but can also be found to be associated with mitochondria. BART is also involved in binding to the adenine nucleotide transporter ANT1. 111
54250 402913 pfam11528 DUF3224 Protein of unknown function (DUF3224). This bacterial family of proteins has no known function. 124
54251 151966 pfam11529 AvrL567-A Melampsora lini avirulence protein AvrL567-A. AvrL567-A is a protein from the fungal pathogen flax rust (Melampsora lini) which induces plant disease resistance in flax plants. The protein has a novel fold. 127
54252 288394 pfam11530 Pilin_PilX Minor type IV pilin, PilX. PilX is a protein from Neisseria meningitidis which is crucial for the formation of bacterial aggregates and adhesion to human cells. The structure of PilX is similar to all pilins as it has the common alpha/beta roll fold. PilX subunits have surface-exposed motifs which are thought to stabilize bacterial aggregates against pilus retraction. It also illustrates how a minor pilus component can modulate the virulence properties of pili which have a simple composition and structure. 127
54253 402914 pfam11531 CARM1 Coactivator-associated arginine methyltransferase 1 N terminal. CARM1 is an arginine methyltransferase which methylates a variety of different proteins and plays a role in gene expression. This is the N terminal domain of the protein which has a PH domain, normally present to regulate protein-protein interactions. A molecular switch is also present on the N terminal domain. 105
54254 402915 pfam11532 HnRNP_M Heterogeneous nuclear ribonucleoprotein M. HnRNP M is a splicing regulatory factor that binds to the auxiliary RNA cis-element ISE/ISS-3 which promotes splicing of exon IIIb and silencing of exon IIIc in the fibroblast growth factor receptor 2 (FGFR2). By binding to ISE/ISS-3, HnRNP M plays a role in the regulation of alternative splicing in FGFR2 as it induces exon skipping and promotes exon inclusion. 30
54255 402916 pfam11533 DUF3225 Protein of unknown function (DUF3225). This bacterial family of proteins has no known function. 126
54256 402917 pfam11534 HTHP Hexameric tyrosine-coordinated heme protein (HTHP). HTHP is from the marine bacterium Silicibacter pomeroyi and has peroxidase and catalase activity. HTHP consists of six monomers which each binds a solvent accessible heme group and is stabilized by the interaction of three neighboring monomers. The heme iron is penta-coordinated with a tyrosine residue as proximal ligand. 72
54257 402918 pfam11535 Calci_bind_CcbP Calcium binding. CcbP is a Ca(2+) binding protein which, in Anabaena, is thought to bind Ca(2+) by protein surface charge. When bound to Ca(2+), the protein becomes more compact and the level of free calcium decreases. The free Ca(2+) concentration which is regulated by CcbP is critical for the differentiation process. Calcium signalling is widespread in bacterial species, and prokaryotic cells like eukaryotes are equipped with all the elements to maintain Ca2+ homeostasis. 105
54258 402919 pfam11536 DUF3226 Protein of unknown function (DUF3226). This archaeal family of proteins has no known function. 237
54259 371589 pfam11537 DUF3227 Protein of unknown function (DUF3227). This archaeal family of proteins has no known function. 93
54260 402920 pfam11538 Snurportin1 Snurportin1. Snurportin1 is a novel nuclear import receptor which contains an N-terminal importin beta binding domain which is essential for its function of a snRNP-specific nuclear import receptor. Snurportin1 interacts with m3G-cap where it enhances the m3G-cap dependent nuclear import of U snRNPs in Xenopus laevis oocytes and digitonin-permeabilized HeLa cells. 40
54261 402921 pfam11539 DUF3228 Protein of unknown function (DUF3228). This family of proteins has no known function. 192
54262 402922 pfam11540 Dynein_IC2 Cytoplasmic dynein 1 intermediate chain 2. Intermediate chain IC 2 forms part of the complex cytoplasmic dynein 1 along with a heavy chain (HC), two light intermediate chains (LICs) and three light chains (LCs). The complex is responsible for hydrolysing ATP to generate force toward the minus end of microtubules. IC binds to the HC via the N terminal binding domain on the HC and ICs contain binding sites for the LCs. The ICs are responsible for binding to kinetochores and the Golgi apparatus through an interaction with the p150Glued subunit of dynactin which is another complex. 29
54263 402923 pfam11542 Mdv1 Mitochondrial division protein 1. Mdv1 is a component of the mitochondrial fission machinery in Saccharomyces cerevisiae. The protein is also involved in peroxisome proliferation. Mdv1 along with Fis1 is also involved in controlling Dnm1-dependent division; Dnm1 is a GTPase involved in the mediation of mitochondrial division. In this role, Mdv1 is the linker between Fis1 and Dnm1. Mdv1 plays a key role in the regulation of Dnm1 self-assembly. 49
54264 402924 pfam11543 UN_NPL4 Nuclear pore localization protein NPL4. Npl4 is part of the heterodimer UN along with Ufd1 which is involved in the recruitment of p97, an AAA ATPase, for tasks involving the ubiquitin pathway. Npl4 has a ubiquitin-like domain which has within its structure a beta-grasp fold with a helical insert. 74
54265 402925 pfam11544 Spc42p Spindle pole body component Spc42p. Spc42p is a 42-kD component of the S. cerevisiae spindle pole body (SPB) that localizes to the electron dense central region of the SPB. Spc42p is a phosphoprotein which forms a polymeric layer at the periphery of the SPB central plaque. This functions during SPB duplication and also facilitates the attachment of the SPB to the nuclear membrane. 72
54266 288407 pfam11545 HemeBinding_Shp Cell surface heme-binding protein Shp. Shp is part of a complex which functions in heme uptake in Streptococcus pyogenes. During uptake, Shp transfers its heme to HtsA which is a component of an ABC transporter. The heme binding region of Shp contains an immunoglobulin-like beta-sandwich fold and has a unique heme-iron coordination with the axial ligands being two methionine residues from the same Shp molecule. Surrounding the heme pocket, there is a negative surface which may serve as a docking interface for heme transfer. 151
54267 371594 pfam11546 CompInhib_SCIN Staphylococcal complement inhibitor SCIN. SCIN is released by Staphylococcus aureus to counteract the host immune defense. The protein binds to and inhibits C3 convertases on the bacterial surface, reducing phagocytosis and blocking downstream effector functions by C3b deposition on its surface. An 18 residue stretch 31-48 is crucial for SCIN activity. 114
54268 402926 pfam11547 E3_UbLigase_EDD E3 ubiquitin ligase EDD. EDD, the ER ubiquitin ligase from the HECT ligases, contains an N-terminal ubiquitin-associated domain which binds ubiquitin. Ubiquitin is recognized by helices alpha-1 and -3 in the UBA domain. EDD is involved in DNA damage repair pathways and binds to mono-ubiquitinated proteins. 52
54269 402927 pfam11548 Receptor_IA-2 Protein-tyrosine phosphatase receptor IA-2. IA-2 is a protein-tyrosine phosphatase receptor that upon exocytosis, the cytoplasmic domain is cleaved and moves to the nucleus where it enhances transcription of the insulin gene. The mature exodomain of IA-2 participates in adhesion to the extracellular matrix and is self-proteolyzed in vitro by reactive oxygen species which may be a new shedding mechanism. 89
54270 402928 pfam11549 Sec31 Protein transport protein SEC31. Sec31 is involved in COPII coat formation as it forms through the sequential binding of three cytoplasmic proteins: Sar1, Sec23/24 and Sec13/31. Sec13/31 is recruited by the pre-budding complex and polymerization of Sec13/31 occurs to form an octahedral cage that is the outer shell of the COPII coat. Sec13/31 is a hetero-tetramer which is organized as a linear array of alpha-solenoid and beta-propeller domains to form a rod in which twenty-four copies assemble to form the COPII cub-octahedron. 48
54271 402929 pfam11550 IglC Intracellular growth locus C protein. IglC protein is involved in the escape of F.tularensis live vaccine strain. It has been shown that the expression of IglC is essential for F.tularensis to induce macrophage apoptosis. IglC adopts a beta-sandwich conformation that has no similarity to any known protein structure. 210
54272 402930 pfam11551 Omp28 Outer membrane protein Omp28. Omp28 is a 28-kDa outer membrane protein from Porphyromonas gingivalis. Omp28 is thought to be a surface adhesion/receptor protein. Omp28 is expressed in a wide distribution of P.gingivalis strains. 171
54273 402931 pfam11553 DUF3231 Protein of unknown function (DUF3231). This bacterial family of proteins has no known function. 161
54274 402932 pfam11554 DUF3232 Protein of unknown function (DUF3232). This bacterial family of proteins has no known function. 151
54275 402933 pfam11555 Inhibitor_Mig-6 EGFR receptor inhibitor Mig-6. When the kinase domain of EGFR binds to segment one of Mitogen induced gene 6 (Mig-6), EGFR becomes inactive due to the conformation it adopts which is Src/CDK like. The binding of the two proteins prevents EGFR acting as a cyclin-like activator for other kinase domains. The structure of Mig-6(1) consists of alpha helices-G and -H with a polar surface and hydrophobic residues for interactions with EGFR. A critical step for the activation of EGFR is the formation of an asymmetric dimer involving the kinase domains of the protein. Since Mig-6 binds to the kinase domain it blocks this process and EGFR becomes inactive. 71
54276 402934 pfam11556 EBA-175_VI Erythrocyte binding antigen 175. EBA-175 is involved in the formation of a tight junction, a necessary step in invasion. This family represents the region VI which is a cysteine rich domain essential for EBA-175 trafficking. The structure is a homodimer that contains a five-alpha-helical core stabilized by four disulphide bridges. 80
54277 371601 pfam11557 Omp_AT Solitary outer membrane autotransporter beta-barrel domain. Omp_AT is a family of Gram-negative Gamma-proteobacteria outer membrane autotransporter beta-barrel proteins. Secondary structure prediction indicates a beta-barrel domain of 12 beta-strands, with an N terminal helix running along the central barrel axis perpendicular to the 12 antiparallel strands that form the barrel. Autotransporter translocation units are defined by a beta-barrel of 12 to 14 antiparallel strands with an N terminal helix perpendicular to the barrel. 327
54278 402935 pfam11558 HET-s_218-289 Het-s 218-289. This family of proteins is residues 218-289 of Het-s, a protein of Podospora anserina. Het-s plays a role in heterokaryon incompatibility which prevents different forms of parasitism. This region of the protein is the C-terminal end and is unstructured in solution but forms infectious fibrils in vitro which has a structure consisting of a left-handed beta solenoid which contains two windings per molecule. 61
54279 402936 pfam11559 ADIP Afadin- and alpha -actinin-Binding. This family is found in mammals where it is localized at cell-cell adherens junctions, and in Sch. pombe and other fungi where it anchors spindle-pole bodies to spindle microtubules. It is a coiled-coil structure, and in pombe, it is required for anchoring the minus end of spindle microtubules to the centrosome equivalent, the spindle-pole body. The name ADIP derives from the family being composed of Afadin- and alpha -Actinin-Binding Proteins localized at Cell-Cell Adherens Junctions. 151
54280 371604 pfam11560 LAP2alpha Lamina-associated polypeptide 2 alpha. LAPs are components of the nuclear lamina which supports the nuclear envelope. LAP2alpha is a non-membrane-associated member of the LAP family which is unique. This family of proteins is the C terminal domain of LAP2alpha which consists of residues 459-693 and constitutes a dimeric structure with an antiparallel coiled coil. LAP2alpha is involved in cell-cycle regulation and chromatin organisation and preferentially binds to lamin A/C. 234
54281 371605 pfam11561 Saw1 Single strand annealing-weakened 1. This family of yeast proteins is involved in single-strand-annealing, or SSA. SSA entails multiple steps: end resection and ssDNA formation; annealing of complementary ssDNAs; removal of 3' single-stranded non-homologous tails; gap fill-in synthesis; and ligation. Saw1 in combination with Slx4 catalyzes the 3' non-homologous tail removal during recombination. Saw1 interacts physically with Rad1/Rad10, Msh2/Msh3, and Rad52 proteins, and works by targeting Rad1/Rad10 to Rad52-coated recombination intermediates. 244
54282 402937 pfam11563 Protoglobin Protoglobin. This family includes protoglobin from Methanosarcina acetivorans C2A. It is also found near the N-terminus of the Haem-based aerotactic transducer HemAT in Bacillus subtilis. It is part of the haemoglobin superfamily. Protoglobin has specific loops and an amino-terminal extension which leads to the burying of the haem within the matrix of the protein. Protoglobin-specific apolar tunnels allow the access of O2, CO and NO to the haem distal site. In HemAT it acts as an oxygen sensor domain. 153
54283 402938 pfam11564 BpuJI_N Restriction endonuclease BpuJI - N terminal. BpuJI is a restriction endonuclease which recognizes the asymmetric sequence 5'-CCCGT and cuts at multiple sites in the surrounding area of the target sequence. This family of proteins is the N terminal domain of BpuJI which has DNA recognition functions. The recognition domain has two subdomains D1 and D2. The recognition of the target sequence occurs through major groove contacts of amino acids on the helix-turn-helix region and the N-terminal arm. 278
54284 288422 pfam11565 PorB Alpha helical Porin B. Porin B is a porin from Corynebacterium glutamicum which allows the exchange of material across the mycolic acid layer which is the protective nonpolar barrier. Porin B has an alpha helical core structure consisting of four alpha-helices surrounding a nonpolar interior. There is a disulphide bridge between helices 1 and 4 to form a stable covalently bound ring. The channel of PorB is oligomeric. 99
54285 402939 pfam11566 PI31_Prot_N PI31 proteasome regulator N-terminal. PI31 is a regulatory subunit of the immuno-proteasome which is an inhibitor of the 20 S proteasome in vitro. PI31 is also an F-box protein Fbxo7.Skp1 binding partner which requires an N terminal FP domain in both proteins for the interaction to occur via the FP beta sheets. The structure of PI31 FP domain contains a novel alpha/beta-fold and two intermolecular contact surfaces. This is the N-terminal domain of the members. 156
54286 402940 pfam11567 PfUIS3 Plasmodium falciparum UIS3 membrane protein. UIS3 is a membrane protein essential for sporozoite development in infected hepatocytes. This family is 130-229 of the Plasmodium falciparum UIS3 protein which is compact and has an all alpha-helical structure. PfUIS3(130-229) interacts with lipids, phospholipid liposomes, the human liver fatty acid-binding protein and with the lipid phosphatidylethanolamine. The interaction with liver fatty acid-binding protein provides the parasite with a method to import essential fatty acids/lipids during rapid growth phases of sporozoites. 101
54287 402941 pfam11568 Med29 Mediator complex subunit 29. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med29, along with Med11 and Med28, in mammals, is part of the core head-region of the complex. Med29 is the apparent orthologue of the Drosophila melanogaster Intersex protein, which interacts directly with, and functions as a transcriptional coactivator for, the DNA-binding transcription factor Doublesex, so it is likely that mammalian Med29 serves as a target for one or more DNA-binding transcriptional activators. 140
54288 288426 pfam11569 Homez Homeodomain leucine-zipper encoding, Homez. Homez contains two leucine zipper-like motifs and an acidic domain and belongs to the superfamily of homeobox-containing proteins. The presence of leucine zippers suggests that Homez can function as a homo or heterodimer in the nucleus. It is thought that the first leucine zipper and homeodomain 1 (HD1) of Homez are responsible for dimerization and HD2 has a specific DNA-binding activity. Homez is also thought to function as a transcriptional repressor due to the acidic region in its C-terminal domain. Homez is involved in a complex regulatory network. 55
54289 314459 pfam11570 E2R135 Coiled-coil receptor-binding R-domain of colicin E2. E2 is a DNase which utilizes the outer membrane receptor BtuB to bind to and enter the cell. This family of proteins is E2R135 (residues 321-443) which is the part of E2 which is responsible for binding to BtuB in a coiled coil formation. 136
54290 402942 pfam11571 Med27 Mediator complex subunit 27. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. The ubiquitous expression of Med27 mRNA suggests a universal requirement for Med27 in transcriptional initiation. Loss of Crsp34/Med27 decreases amacrine cell number, but increases the number of rod photoreceptor cells. 85
54291 402943 pfam11572 DUF3234 Protein of unknown function (DUF3234). This bacterial family of proteins has no known function. Some members in this family of proteins are annotated as TTHA0547 however this cannot be confirmed. 95
54292 402944 pfam11573 Med23 Mediator complex subunit 23. Med23 is one of the subunits of the Tail portion of the Mediator complex that regulates RNA polymerase II activity. Med23 is required for heat-shock-specific gene expression, and has been shown to mediate transcriptional activation of E1A in mice. 1302
54293 402945 pfam11574 DUF3235 Protein of unknown function (DUF3235). Some members in this family of proteins with unknown function are annotated as RpfA however this cannot be confirmed. 83
54294 402946 pfam11575 FhuF_C FhuF 2Fe-2S C-terminal domain. This family consists of several bacterial ferric iron reductase protein (FhuF) sequences. FhuF is involved in the reduction of ferric iron in cytoplasmic ferrioxamine B. This domain is the C-terminal domain that contains 4 conserved cysteine residues that are found to be part of a 2Fe-2S cluster. 21
54295 402947 pfam11576 DUF3236 Protein of unknown function (DUF3236). This family of proteins with unknown function appears to be restricted to Methanobacteria. 152
54296 371612 pfam11577 NEMO NF-kappa-B essential modulator NEMO. NEMO is a regulatory protein which is part of the IKK complex along with the catalytic IKKalpha and beta kinases. The IKK complex phosphorylates IkappaB targeting it for degradation which results in the release of NF-kappaB which initiates the inflammatory response, cell proliferation or cell differentiation. NEMO activates the IKK complex's activity by associating with the unphosphorylated IKK kinase C termini. The core domain of NEMO is a dimer which binds to two fragments of IKK. 67
54297 402948 pfam11578 DUF3237 Protein of unknown function (DUF3237). This family of proteins has no known function. 149
54298 314467 pfam11579 DUF3238 Protein of unknown function (DUF3238). This family of proteins with unknown function appears to be restricted to Bacillus cereus. 192
54299 402949 pfam11580 DUF3239 Protein of unknown function (DUF3239). This bacterial family of proteins may be membrane proteins however this cannot be confirmed. Currently there is no known function. 125
54300 371613 pfam11581 Argos Antagonist of EGFR signalling, Argos. Argos is a natural secreted antagonist of EGFR signalling which functions by binding growth factor ligands that activate EGFR by forming a clamp like structure using three disulphide-bonded beta-sheet domains. 129
54301 338041 pfam11582 DUF3240 Protein of unknown function (DUF3240). This family of proteins with unknown function appears to be restricted to Proteobacteria. 101
54302 402950 pfam11583 AurF P-aminobenzoate N-oxygenase AurF. This family is a metalloenzyme which is involved in the biosynthesis of antibiotic aureothin by catalyzing the formation of p-nitrobenzoic acid from p-aminobenzoic acid. AurF is a non-heme di-iron monooxygenase which creates nitroarenes via the sequential oxidation of aminoarenes. 280
54303 402951 pfam11584 Toxin_ToxA Proteinaceous host-selective toxin ToxA. ToxA is produced by particular Pyrenophora tritici-repentis races and is a proteinaceous host-selective toxin. It is necessary and sufficient to cause cell death in sensitive wheat cultivars. ToxA adopts a single-domain, beta-sandwich fold which has novel topology. The protein is directly involved in recognition events required for ToxA action. It is thought to be distantly related to FnIII proteins, gaining entry to the host via an integrin-like receptor. 117
54304 152021 pfam11585 Stomoxyn Insect antimicrobial peptide, stomoxyn. Stomoxyn, localized in the gut epithelium, is an insect antimicrobial peptide which functions in killing a range of microorganisms, parasites and some viruses. Stomoxyn has a structure consisting of a random coil in water however in TFE it adopts a stable helical structure. Stomoxyn is thought to have a similar function to cecropin A from Hyalophora cecropia due to structural similarities. 42
54305 402952 pfam11586 DUF3242 Protein of unknown function (DUF3242). This protein from Thermotoga maritima is a hypothetical ORFan protein, TM1622, whose structure has been determined. The protein is composed of seven beta strands and three alpha helices. 125
54306 371614 pfam11587 Prion_bPrPp Major prion protein bPrPp - N terminal. This family represents the N-terminal domain (1-30) of the bovine prion protein (bPrPp). The protein's structure consists mainly of alpha helices. BPrPp forms a stable helix which inserts in a transmembrane location in the bilayer, with the N-terminal (1-30) functioning as a cell-penetrating peptide. 30
54307 378681 pfam11588 DUF3243 Protein of unknown function (DUF3243). This family of proteins with unknown function appears to be restricted to Firmicutes. 79
54308 402953 pfam11589 DUF3244 Domain of unknown function (DUF3244). This domain adopts an immunoglobulin-like beta-sandwich fold and structurally is most similar to fibronectin. 100
54309 314473 pfam11590 DNAPolymera_Pol DNA polymerase catalytic subunit Pol. This family of proteins represents the catalytic subunit, Pol, of the Herpes simplex virus DNA polymerase. Pol binds UL42, making up the DNA polymerase. UL42 is a processivity subunit which binds to the C-terminal of Pol in a similar way that the cell cycle regulator p21 binds to PCNA. 36
54310 152027 pfam11591 2Fe-2S_Ferredox Ferredoxin chloroplastic transit peptide. The chloroplast ferredoxin transit peptide is unstructured in water; however, in a 30:70 molar-ratio mixture of 2,2,2-trifluoroethanol, residues 3 to 13 form an alpha-helix. The rest of the peptide remains unstructured. This family is the N-terminal of the [2Fe-2S] ferredoxin from C. reinhardtii. This protein catalyzes the final reaction in a pathway which allows the production of H(2) from water in the chloroplast. 34
54311 288446 pfam11592 AvrPto Central core of the bacterial effector protein AvrPto. This family of proteins represents the bacterial effector protein AvrPto from Pseudomonas syringae. This is the central core region of the protein which consists of a three-helix bundle motif. AvrPto is part of a type III secretion system from P.syringae which is involved in the bacterial speck disease of tomato. In resistant plants, AvrPto interacts with the host Pto kinase, which elicits an antibacterial defense response. In plants lacking resistance, the Pto kinase is not present and AvrPto acts as a virulence factor, promoting bacterial growth. 105
54312 402954 pfam11593 Med3 Mediator complex subunit 3 fungal. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator subunit Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast transcriptional co-repressor. 398
54313 402955 pfam11594 Med28 Mediator complex subunit 28. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Subunit Med28 of the Mediator may function as a scaffolding protein within Mediator by maintaining the stability of a submodule within the head module, and components of this submodule act together in a gene-regulatory programme to suppress smooth muscle cell differentiation. Thus, mammalian Mediator subunit Med28 functions as a repressor of smooth muscle-cell differentiation, which could have implications for disorders associated with abnormalities in smooth muscle cell growth and differentiation, including atherosclerosis, asthma, hypertension, and smooth muscle tumors. 101
54314 371618 pfam11595 DUF3245 Protein of unknown function (DUF3245). This is a family of proteins conserved in fungi. The function is not known, and there is no S. cerevisiae member. 148
54315 371619 pfam11596 DUF3246 Protein of unknown function (DUF3246). This is a small family of fungal proteins one of whose members, MUC1.5 from Pichia stipitis is described as being an extremely serine rich protein-mucin-like protein. 241
54316 402956 pfam11597 Med13_N Mediator complex subunit 13 N-terminal. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med13 is part of the ancillary kinase module, together with Med12, CDK8 and CycC, which in yeast is implicated in transcriptional repression, though most of this activity is likely attributable to the CDK8 kinase. The large Med12 and Med13 proteins are required for specific developmental processes in Drosophila, zebrafish, and Caenorhabditis elegans but their biochemical functions are not understood. 311
54317 402957 pfam11598 COMP Cartilage oligomeric matrix protein. This family of proteins represents the five-stranded coiled-coil domain of cartilage oligomeric matrix protein (COMP). This region has a binding site between two internal rings formed by Leu37 and Thr40. 43
54318 402958 pfam11599 AviRa RRNA methyltransferase AviRa. This family of proteins represents the methyltransferase AviRa from Streptomyces viridochromogenes. This protein mediates the resistance to the antibiotic avilamycin. AviRa methylates a specific guanine base within the peptidyl-transferase loop of the 23S ribosomal RNA. 232
54319 402959 pfam11600 CAF-1_p150 Chromatin assembly factor 1 complex p150 subunit, N-terminal. CAF-1_p150 is a polypeptide subunit of CAF-1, which functions in depositing newly synthesized and acetylated histones H3/H4 into chromatin during DNA replication and repair. CAF-1_p150 includes the HP1 interaction site, the PEST, KER and ED interacting sites. CAF-1_p150 interacts directly with newly synthesized and acetylated histones through the acidic KER and ED domains. The PEST domain is associated with proteins that undergo rapid proteolysis. 164
54320 402960 pfam11601 Shal-type Shal-type voltage-gated potassium channels, N-terminal. This family represents the short N-terminal helical domain of Shal-type voltage-gated potassium channels. The domain interacts with Kv channel-interacting proteins to modulate cell surface expression and the function of Kv4 channels. The interaction of the N-terminus of Shal-type protein Kv4.2 and the Kv interacting protein KChiP1 forms a structure which is like the structure between calmodulin and its target peptides when they interact. Interactions of an N terminal alpha helix in Kv4.2 and a C terminal alpha helix in KChIP1 are essential for the modulation of Kv4.2 by KChIPs. 28
54321 402961 pfam11602 NTPase_P4 ATPase P4 of dsRNA bacteriophage phi-12. P4 is a packaging motor which is involved in the packaging of phi-12 genome into preformed capsids using ATP. P4 is located at the vertices of the icosahedral capsid. ATP drives RNA translocation through cooperative conformational changes. 320
54322 402962 pfam11603 Sir1 Regulatory protein Sir1. Sir1p interacts with the BAH domain of the Orc1p subunit of the origin recognition complex (ORC) resulting in the establishment of silent chromatin at HMR and HML in S.cerevisiae. The amino acids from the ORC interaction region of Sir1p are presented on a conserved, convex surface that forms a complementary interface with the Orc1 BAH domain, critical for transcriptional silencing. 120
54323 402963 pfam11604 CusF_Ec Copper binding periplasmic protein CusF. CusF is a periplasmic protein involved in copper and silver resistance in Escherichia coli. CusF forms a five-stranded beta-barrel OB fold. Cu(I) binds to H36, M47 and M49 which are conserved residues in the protein. 67
54324 402964 pfam11605 Vps36_ESCRT-II Vacuolar protein sorting protein 36 Vps36. Vps36 is a subunit of ESCRT-II, a protein involved in driving protein sorting from endosomes to lysosomes. The GLUE domain of Vps36 allows for a tight interaction to occur between the protein and Vps28, a subunit of ESCRT-I. This interaction is critical for ubiquitinated cargo progression from early to late endosomes. 92
54325 152042 pfam11606 AlcCBM31 Family 31 carbohydrate binding protein. This family of proteins represents the family 31 carbohydrate-binding module of beta-1,2-xylanase. This protein is from Alcaligenes sp. strain XY-234. The AlcCBM31 module makes a beta-sandwich structure with an immunoglobulin fold and contains two intra-molecular disulfide bonds. AlcCBM31 shows affinity with only beta-1,3-xylan. 93
54326 402965 pfam11607 DUF3247 Protein of unknown function (DUF3247). This family of proteins is the protein product of the gene XC5848 from Xanthomonas campestris. The protein has no known function however its structure has been determined. The protein adopts a Lsm fold however differences with the fold were observed at the N-terminal and internal regions. 92
54327 402966 pfam11608 Limkain-b1 Limkain b1. This family of proteins represents Limkain b1, which is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. 89
54328 288462 pfam11609 DUF3248 Protein of unknown function (DUF3248). This family of proteins is thought to be the product of the gene TT1592 from Thermus thermophilus however this cannot be confirmed. Currently there is no known function. 62
54329 402967 pfam11610 Ste5 Scaffold protein Ste5, Fus3-binding region. This family of proteins represents the Fus3 binding region of Ste5. Ste5 functions in the yeast mating pathway and is required for signalling through the mating response MAPK pathway. Ste5 has separate binding sites for each member of the MAPK cascade. This region of Ste5 allosterically activates autophosphorylation of Fus3, a mitogen-activated protein kinase. Auto-activated Fus3 has a negative regulatory role, and promotes Ste5 phosphorylation which leads to a decrease in pathway transcriptional output. 30
54330 402968 pfam11611 DUF4352 Domain of unknown function (DUF4352). Members of this family are putative lipoproteins that fall into the Antigen MPT63/MPB63 (immunoprotective extracellular protein) superfamily. 125
54331 402969 pfam11612 T2SSJ Type II secretion system (T2SS), protein J. The T2SJ proteins are pseudopilins, which are targeted to the membrane in E. coli. T2SJ forms a complex with T2SI (pfam02501) and T2SK (pfam03934) which is part of the Type II secretion apparatus involved in the translocation of proteins across the outer membrane in E. coli. The T2SK-I-J complex has quasihelical characteristics. 137
54332 402970 pfam11614 FixG_C IG-like fold at C-terminal of FixG, putative oxidoreductase. This domain is part of a transmembrane protein, FixG, itself part of the FixGHIS operon closely associated with the FixNOPQ operon that is the symbiotically essential cbb3-type haem-copper oxidase complex. FixG expression is induced by oxygen-deprivation. This C-terminal domain adopts an E-set Ig-like fold. 116
54333 402971 pfam11615 Caf4 CCR4-associated factor 4. Caf4 is a WD40 repeats containing protein involved in mitochondrial fission. It displays physical interactions with CCR4-NOT complex. It has a paralogue, Mdv1. Both Caf4 and Mdv1 act as adapter proteins, binding to Fis1 on the mitochondrial outer membrane and recruiting the dynamin-like GTPase Dnm1 to form mitochondrial fission complexes. However, Fis1 and Caf4, but not Mdv1, determine the polar localization of Dnm1 clusters on the mitochondrial surface. 60
54334 402972 pfam11616 EZH2_WD-Binding WD repeat binding protein EZH2. This family of proteins represents Enhancer of zeste homolog 2 (EZH2), a 30 residue peptide which binds to a WD-repeat domain of EED by residues 39-68. EED is a component of the PRC2 complex which is involved in gene expression. This interaction is required for the HMTase activity of PRC2. 30
54335 402973 pfam11617 Cu-binding_MopE Putative metal-binding motif. The sequence of structure 2vov is not matched in any other sequence either in UniProt or in NCBI (Sep2014). The model is of a short repeat not found on the G1UBC6 - 2vov - protein. The presence of conserved cysteine residues and the lack of hydrophobic residues suggests that this repeat might be a metal-binding site, perhaps for zinc or calcium ions. 28
54336 402974 pfam11618 C2-C2_1 First C2 domain of RPGR-interacting protein 1. This domain is the first, more N-terminal, C2 domain on X-linked retinitis pigmentosa GTPase regulator-interacting proteins, or RPGR-interacting proteins. 140
54337 314489 pfam11619 P53_C Transcription factor P53 - C terminal domain. This family of proteins is the C terminal domain of the transcription factor P53. While the rest of the protein is quite conserved between the different transcription factors such as p53 and p73, the C terminal domain is highly divergent. The DM-p53 structure is characterized by an additional N-terminal beta-strand and a C-terminal helix. 67
54338 402975 pfam11620 GABP-alpha GA-binding protein alpha chain. This family of proteins represents the transcription factor GABP alpha. This alpha domain is a five-stranded beta-sheet crossed by a distorted helix termed an OST domain. The surface of the GABP alpha OST domain contains two clusters of negatively-charged residues suggesting there are positively-charged partner proteins. The OST domain binds to the CH1 and CH3 domains of the co-activator histone acetyltransferase CBP/p300; a direct link between GABP and transcriptional machinery has thus been made. 81
54339 288473 pfam11621 Sbi-IV C3 binding domain 4 of IgG-bind protein SBI. This family of proteins represents Sbi domain IV which binds the central complement protein C3. Sbi-IV interacts with Sbi-III to induce a consumption of complement via alternative pathway activation. When not interacting with Sbi-III, Sbi-IV inhibits the alternative pathway without complement consumption. The structure of Sbi-IV consists of a three-helix bundle fold. 69
54340 402976 pfam11622 DUF3251 Protein of unknown function (DUF3251). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. Some members of this family are annotated as putative lipoprotein YajI however this cannot be confirmed. 156
54341 402977 pfam11623 NdhS NAD(P)H dehydrogenase subunit S. This family is found in Bacteria and Streptophyta includes members such as NdhS (NAD(P)H-quinone oxidoreductase subunit S). NdhS, also known as CRR31 (chlororespiratory reduction 31), is a subunit of the chloroplast NADH dehydrogenase-like (NDH) complex. It is also a subunit of the cyanobacterial NDH-1 complex. NAD(P)H-oxidizing subunits have not been found in chloroplasts or cyanobacteria, where ferredoxin is probably the electron donor. NdhS contributes to the formation of a ferredoxin binding site of NDH and is necessary for high affinity binding of ferredoxin. The cyanobacterial NDH-1 complex, also known as NADPH:plastoquinone oxidoreductase or type I NAD(P)H dehydrogenase, is involved in plastoquinone reduction and cyclic electron transfer (CET) around photosystem I. The chloroplast NDH is more similar to cyanobacterial NDH-1, which is believed to be the origin of chloroplast NDH, than to mitochondrial NADH dehydrogenase present in the same species. The NDH complexes of chloroplasts, however, contain many subunits that are absent from cyanobacterial NDH-1 complexes. 52
54342 402978 pfam11624 M157 MHC class I-like protein M157. This family of proteins represents M157, a divergent form of MHC class I-like proteins which is the protein product of the mouse cytomegalovirus. This protein is unique in its ability to engage both activating (Ly49H) and inhibitory (Ly49I) natural killer cell receptors. M157 is involved in intra- and intermolecular interactions within and between its domains to form a compact MHC-like molecule. 247
54343 402979 pfam11625 DUF3253 Protein of unknown function (DUF3253). This bacterial family of proteins has no known function. 81
54344 402980 pfam11626 Rap1_C TRF2-interacting telomeric protein/Rap1 - C terminal domain. This family of proteins represents the C-terminal domain of the protein Rap-1, which plays a distinct role in silencing at the silent mating-type loci and telomeres. The Rap-1 C-terminus adopts an all-helical fold. Rap1 carries out its function by recruiting the Sir3 and Sir4 proteins to chromatin via its C terminal domain. Rap1 is otherwise known as TRF2-interacting protein, as it is one of the six subunit components of the Shelterin complex. Shelterin protects telomere ends from attack by DNA-repair mechanisms. Model doesn't capture Sch. pombe as it cuts this sequence into two. 80
54345 402981 pfam11627 HnRNPA1 Nuclear factor hnRNPA1. This family of proteins represents hnRNPA1, a nuclear factor that binds to Pol II transcripts. The family of hnRNP proteins are involved in numerous RNA-related activities. 38
54346 402982 pfam11628 TCR_zetazeta T-cell surface glycoprotein CD3 zeta chain. The incorporation of the zetazeta signalling module requires one basic TCR alpha and two zetazeta aspartic acid TM residues. The structure of the zetazeta(TM) dimer consists of a left-handed coiled coil with polar contacts. Two aspartic acids are critical for zetazeta dimerization and assembly with TCR. 31
54347 402983 pfam11629 Mst1_SARAH C terminal SARAH domain of Mst1. This family of proteins represents the C terminal SARAH domain of Mst1. SARAH controls apoptosis and cell cycle arrest via the Ras, RASSF, MST pathway. The Mst1 SARAH domain interacts with Rassf1 and Rassf5 by forming a heterodimer which mediates the apoptosis process. 48
54348 152066 pfam11630 DUF3254 Protein of unknown function (DUF3254). This family of proteins is most likely a family of anti-lipopolysaccharide factor proteins however this cannot be confirmed. 97
54349 288482 pfam11631 DUF3255 Protein of unknown function (DUF3255). Members in this family of proteins are annotated as YxeF however no function is currently known. The family appears to be restricted to Bacillus. 123
54350 402984 pfam11632 LcnG-beta Lactococcin G-beta. This family of proteins is LcnG-beta, which with LcnG-alpha constitute the two-peptide bacteriocin lactococcin G (LcnG). This family of proteins represents the N terminal domain which has an alpha-helical structure and is amphiphilic. Both peptides have a GxxxG motif which they use for interaction through a helix-helix structure. 61
54351 314498 pfam11633 SUD-M Single-stranded poly(A) binding domain. This family of proteins represents Nsp3c, the product of ORF1a in group 2 coronavirus. The domain exhibits a macrodomain fold containing the nsp3 residues 528 to 648, with a flexibly extended N-terminal tail from residues 513 to 527 and a C-terminal flexible tail of residues 649 to 651. SUD-M(527-651) binds single-stranded poly(A); the contact area with this RNA on the protein surface, and the electrophoretic mobility shift assays confirm that SUD-M has higher affinity for purine bases than for pyrimidine bases. 143
54352 288483 pfam11634 IPI_T4 Nuclease inhibitor from bacteriophage T4. This family of proteins represents IPI from bacteriophage T4. This protein is a nuclease inhibitor which is injected by T4 to protect its DNA from gmrS/gmrD CT of pathogenic Escherichia coli into the infected host. The structure of this protein consists of two small beta-sheets flanked by N and C termini by alpha-helices. The protein has a gmrS/gmrD hydrophobic binding site. 76
54353 371639 pfam11635 Med16 Mediator complex subunit 16. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med16 is one of the subunits of the Tail portion of the Mediator complex and is required for lipopolysaccharide gene-expression. Several members including the human protein MED16 have one or more WD40 domains on them, pfam00400. 755
54354 371640 pfam11636 Troponin-I_N Troponin I residues 1-32. This family of proteins represents the cardiac N-extension of troponin I. This region of the protein (1-32) interacts with the N-lobe of cTnC and modulates myofilament calcium(2) sensitivity. 32
54355 402985 pfam11637 UvsW ATP-dependant DNA helicase UvsW. This family of proteins represents the DNA helicase UvsW from bacteriophage T4. The protein is a member of the monomeric SF2 helicase superfamily and shows structural homology to the eukaryotic SF2 helicase Rad54. UvsW is thought to have a role in recombination and the rescue of stalled replication forks. 56
54356 402986 pfam11638 DnaA_N DnaA N-terminal domain. This family of proteins represents the N-terminal domain of DnaA, a protein involved in the initiation of bacterial chromosomal replication. The structure of this domain is known. It is also found in three copies in some proteins. The exact function of this domain is uncertain but it has been suggested to play a role in oligomerization. 65
54357 402987 pfam11639 HapK REDY-like protein HapK. This family of proteins represents HapK, a protein of unknown function, with two homologs PigK and RedY. The monomer structure of the protein contains a four-stranded anti parallel beta-sheet, three alpha-helices and a short C terminal tail which it uses for dimer formation. The surface of HapK has a deep cavity which consists of a kinked helix and a beta-four strand. HapK could be involved in prodigiosin biosynthesis, specifically the binding of a bipyrrole intermediate such as HBM or MBM. 103
54358 402988 pfam11640 TAN Telomere-length maintenance and DNA damage repair. ATM is a large protein kinase, in humans, critical for responding to DNA double-strand breaks (DSBs). Tel1, the orthologue from budding yeast, also regulates responses to DSBs. Tel1 is important for maintaining viability and for phosphorylation of the DNA damage signal transducer kinase Rad53 (an orthologue of mammalian CHK2). In addition to functioning in the response to DSBs, numerous findings indicate that Tel1/ATM regulates telomeres. The overall domain structure of Tel1/ATM is shared by proteins of the phosphatidylinositol 3-kinase (PI3K)-related kinase (PIKK) family, but this family carries a unique and functionally important TAN sequence motif, near its N-terminal, LxxxKxxE/DRxxxL, which is conserved specifically in the Tel1/ATM subclass of the PIKKs. The TAN motif is essential for both telomere length maintenance and Tel1 action in response to DNA damage. It is classified as an EC:2.7.11.1. 151
54359 152077 pfam11641 Antigen_Bd37 Glycosylphosphatidylinositol-anchored merozoite surface protein. This family of proteins represents the core region of Bd37, a surface antigen of B.divergens which is GPI-anchored at the surface of the merozoite. The structure of the protein consists of mainly alpha folds and has three sub domains. 224
54360 402989 pfam11642 Blo-t-5 Mite allergen Blo t 5. This family of proteins is Blo t 5, an allergen protein from Blomia tropicalis mites. This protein shows strong reactivity with IgE in asthmatic and rhinitis patients. The structure of the protein contains three alpha helices which form a coiled-coil. 118
54361 402990 pfam11644 DUF3256 Protein of unknown function (DUF3256). This family of proteins with unknown function appears to be restricted to Bacteroidales. 195
54362 402991 pfam11645 PDDEXK_5 PD-(D/E)XK endonuclease. This family of endonucleases includes a group I intron-encoded endonuclease. This family belongs to the PD-(D/E)XK superfamily. 137
54363 402992 pfam11646 DUF3258 Protein of unknown function DUF3258. This viral family are possible phage integrase proteins however this cannot be confirmed. 99
54364 402993 pfam11647 MLD Membrane Localization Domain. This is a membrane localization domain found in multiple families of bacterial toxins including all of the clostridial glucosyltransferase toxins and various MARTX toxins (multifunctional-autoprocessing RTX toxins). In the Pasteurella multocida toxin (PMT) C-terminal fragment, structural analyses have indicated that the C1 domain possesses a signal that leads the toxin to the cell membrane. Furthermore, the C1 domain was found to structurally resemble the phospholipid-binding domain of C. difficile toxin B. Functional studies in Vibrio cholerae indicate that the subdomain at the N-terminus of RID (Rho-inactivation domain), homologous to the membrane targeting C1 domain of Pasteurella multocida toxin, is a conserved membrane localization domain essential for proper localization. The Rho-inactivation domain (RID) of MARTX (Multifunctional Autoprocessing RTX toxin) is responsible for inactivating the Rho-family of small GTPases in Vibrio cholerae. It is a bacterial toxin that self-processes by a cysteine peptidase mechanism. These cysteine peptidases belong to MEROPS peptidase family C80 (RTX self-cleaving toxin, clan CD). 67
54365 402994 pfam11648 RIG-I_C-RD C-terminal domain of RIG-I. This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerization. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity. 115
54366 288494 pfam11649 T4_neck-protein Virus neck protein. This family of proteins represents gene product 14, a major component of the neck in T4-like viruses along with gene product 13. Gene product 14 is rich in beta-sheets. The formation of the neck to the head of the bacteriophage is crucial for the tail attachment. 254
54367 402995 pfam11650 P22_Tail-4 P22 tail accessory factor. This tail accessory factor of the P22 virus is also referred to as gene product 4 (Gp4). The protein's structure consists of 60% alpha helices. Gp4 is the first tail accessory factor to be added to newly DNA-filled capsids during P22-morphogenesis. In solution, the protein acts as a monomer and has low structural stability. The interaction of gp4 with the portal protein involves the binding of two non-equivalent sets of six gp4 proteins. Gp4 acts as a structural adaptor for gp10 and gp26, the other tail accessory factors. 148
54368 402996 pfam11651 P22_CoatProtein P22 coat protein - gene protein 5. This family of proteins represents gene product 5 from bacteriophage P22. This protein is involved in the formation of the pro-capsid shells in the bacteriophage. In total, there are 415 molecules of the coat protein which are arranged in an icosahedral shell. 416
54369 402997 pfam11652 FAM167 FAM167. This entry describes a eukaryotic protein family of unknown function designated FAM167. 84
54370 314509 pfam11653 VirionAssem_T7 Bacteriophage T7 virion assembly protein. This family of proteins represents the gene product 7.3 from T7 bacteriophage. The protein is localized to the tail and is thought to be important in virion assembly. Particles assembled in the absence of the protein fail to adsorb to cells. 99
54371 402998 pfam11654 NCE101 Non-classical export protein 1. This entry represents the non classical export protein 1 family. Family members are involved in a novel pathway of export of proteins that lack a cleavable signal sequence. 45
54372 371654 pfam11655 DUF2589 Protein of unknown function (DUF2589). This family of proteins has no known function. 150
54373 402999 pfam11656 DUF3811 YjbD family (DUF3811). This is a family of proteobacteria proteins of unknown function. This family is unrelated to pfam03960 which contains a set of transcription factors that are also named YjbD. 88
54374 338055 pfam11657 Activator-TraM Transcriptional activator TraM. TraM is required for quorum dependence. It binds to and in-activates TraR which controls the replication of the tumor-inducing virulence plasmid. TraM interacts in a two-step process with DNA-TraR to form a large, stable anti-activation complex. 142
54375 403000 pfam11658 CBP_BcsG Cellulose biosynthesis protein BcsG. CBP_BcsG is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 516
54376 403001 pfam11659 DUF3261 Protein of unknown function (DUF3261). This family of proteins with unknown function appears to be restricted to Proteobacteria. The family is related to the LolB family suggesting a role in lipoprotein insertion in the outer membrane. 139
54377 403002 pfam11660 DUF3262 Protein of unknown function (DUF3262). This family of proteins with unknown function appears to be restricted to Proteobacteria. 76
54378 314517 pfam11661 DUF2986 Protein of unknown function (DUF2986). This family of proteins has no known function. 43
54379 403003 pfam11662 DUF3263 Protein of unknown function (DUF3263). This family of proteins with unknown function appears to be restricted to Actinobacteria. 74
54380 403004 pfam11663 Toxin_YhaV Toxin with endonuclease activity, of toxin-antitoxin system. YhaV causes reversible bacteriostasis and is part of a toxin-antitoxin system in Escherichia coli along with PrlF. The toxicity of YhaV is counteracted by PrlF by the formation of a tight complex which binds to the promoter of the prlF-yhaV operon. In vitro, YhaV also has endonuclease activity. 138
54381 403005 pfam11665 DUF3265 Protein of unknown function (DUF3265). This family of proteins with unknown function appears to be restricted to Vibrio. 28
54382 403006 pfam11666 DUF2933 Protein of unknown function (DUF2933). This bacterial family of proteins has no known function. 50
54383 403007 pfam11667 DUF3267 Putative zincin peptidase. This family of proteins has a conserved HEXXH motif, suggesting the members are putative peptidases of zincin fold. 103
54384 288512 pfam11668 Gp_UL130 HCMV glycoprotein pUL130. This family of proteins represents pUL130 from Human cytomegalovirus, a glycoprotein secreted from infected cells that is incorporated into the virion envelope as a Golgi-matured form. The protein promotes endothelial cell infection through a producer cell modification of the virion. 159
54385 403008 pfam11669 WBP-1 WW domain-binding protein 1. This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain. 102
54386 371661 pfam11670 MSP1a Major surface protein 1a (MSP1a). MSP1a is part of the A.marginale major surface protein 1 (MSP1) complex and exists as a heterodimer with MSP1b. The complex has adhesive functions in bovine erythrocytes invasion. 252
54387 152107 pfam11671 Apis_Csd Complementary sex determiner protein. This family of proteins represents the complementary sex determiner in the honeybee. In the honeybee, the mechanism of sex determination depends on the csd gene which produces an SR-type protein. Males are hemizygous or homozygous while females are heterozygous for the csd gene. Heterozygosity generates an active protein which initiates female development. 146
54388 403009 pfam11672 DUF3268 zinc-finger-containing domain. This is a family of bacterial and plasmid sequences that carry at least one zinc-finger towards the N-terminus and a possible second at the C-terminus. 118
54389 288515 pfam11673 DUF3269 Protein of unknown function (DUF3269). This family of proteins has no known function. 73
54390 403010 pfam11674 DUF3270 Protein of unknown function (DUF3270). This family of proteins with unknown function appears to be restricted to Streptococcus. 86
54391 403011 pfam11675 DUF3271 Protein of unknown function (DUF3271). This family of proteins with unknown function appears to be restricted to Plasmodium. 248
54392 403012 pfam11676 DUF3272 Protein of unknown function (DUF3272). This family of proteins with unknown function appears to be restricted to Streptococcus. 61
54393 403013 pfam11677 DUF3273 Protein of unknown function (DUF3273). Some members in this family of proteins are annotated as multi-transmembrane proteins however this cannot be confirmed. Currently this family has no known function. 265
54394 371665 pfam11678 DUF3274 Protein of unknown function (DUF3274). This bacterial family of proteins has no known function. 286
54395 371666 pfam11679 DUF3275 Protein of unknown function (DUF3275). This family of proteins with unknown function appears to be restricted to Proteobacteria. 207
54396 403014 pfam11680 DUF3276 Protein of unknown function (DUF3276). This bacterial family of proteins has no known function. 128
54397 403015 pfam11681 DUF3277 Protein of unknown function (DUF3277). This family of proteins represents a putative bacteriophage protein. No function is currently known. 144
54398 288524 pfam11682 zinc_ribbon_11 Probable zinc-ribbon. This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. It is probably a zinc-ribbon. 127
54399 371668 pfam11683 DUF3278 Protein of unknown function (DUF3278). This bacterial family of proteins has no known function. 127
54400 403016 pfam11684 DUF3280 Protein of unknown function (DUF3280). This family of proteins with unknown function appears to be restricted to Proteobacteria. 133
54401 314533 pfam11685 DUF3281 Protein of unknown function (DUF3281). This family of bacterial proteins has no known function. 267
54402 403017 pfam11686 DUF3283 Protein of unknown function (DUF3283). This family of proteins with unknown function appears to be restricted to Proteobacteria. 60
54403 403018 pfam11687 DUF3284 Domain of unknown function (DUF3284). This family of proteins with unknown function appears to be restricted to Firmicutes. 116
54404 403019 pfam11688 DUF3285 Protein of unknown function (DUF3285). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 44
54405 314536 pfam11690 DUF3287 Protein of unknown function (DUF3287). This eukaryotic family of proteins has no known function. 121
54406 403020 pfam11691 DUF3288 Protein of unknown function (DUF3288). This family of proteins with unknown function appears to be restricted to Cyanobacteria. 88
54407 403021 pfam11692 DUF3289 Protein of unknown function (DUF3289). This family of proteins with unknown function appears to be restricted to Proteobacteria. 272
54408 403022 pfam11693 DUF2990 Protein of unknown function (DUF2990). This family of proteins represents a fungal protein with unknown function. 62
54409 403023 pfam11694 DUF3290 Protein of unknown function (DUF3290). This family of proteins with unknown function appears to be restricted to Firmicutes. 144
54410 403024 pfam11695 DUF3291 Domain of unknown function (DUF3291). This bacterial family of proteins has no known function. 136
54411 403025 pfam11696 DUF3292 Protein of unknown function (DUF3292). This eukaryotic family of proteins has no known function. 648
54412 403026 pfam11697 DUF3293 Protein of unknown function (DUF3293). This bacterial family of proteins has no known function. 73
54413 403027 pfam11698 V-ATPase_H_C V-ATPase subunit H. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the C terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation. 117
54414 403028 pfam11699 CENP-C_C Mif2/CENP-C like. CENP-C_C is a C-terminal family of fungal and eukaryote proteins necessary for centromere formation. CENP-C is the inner-kinetochore centromere (CEN) binding protein. In the budding-yeast, Mif2, the yeast homolog, binds in the CDEIII region of the centromere, and has been shown to recruit a substantial subset of all inner and outer kinetochore proteins. Mif2 adopts a cupin fold and is extremely similar both in polypeptide chain conformation and in dimer geometry to the dimerization domain of a bacterial transcription factor. The Mif2 dimer appears to be part of an enhanceosome-like structure that nucleates kinetochore assembly in budding yeast. This C-terminal domain is the region via which CENP-C localizes to centromeres throughout the cell cycle. 85
54415 371676 pfam11700 ATG22 Vacuole effluxer Atg22 like. Autophagy is a major survival mechanism in which eukaryotes recycle cellular nutrients during stress conditions. Atg22, Avt3 and Avt4 are partially redundant vacuolar effluxers, which mediate the efflux of leucine and other amino acids resulting from autophagy. This family also includes other transporter proteins. 479
54416 403029 pfam11701 UNC45-central Myosin-binding striated muscle assembly central. The UNC-45 or small muscle protein 1 of C.elegans is expressed in two forms from different genomic positions in mammals, as a general tissue protein UNC-45a and a specific form Unc-45b expressed only in striated and skeletal muscle. All members carry up to three amino-terminal tetratricopeptide repeat (TPR) domains towards their N-terminal, a UCS domain at the C-terminal that contains a number of Arm repeats pfam00514 and this central region of approximately 400 residues. Both the general form and the muscle form of UNC-45 function in myotube formation through cell fusion. Myofibril formation requires both GC and SM UNC-45, consistent with the fact that the cytoskeleton is necessary for the development and maintenance of organized myofibrils. The S. pombe Rng3p, is crucial for cell shape, normal actin cytoskeleton, and contractile ring assembly, and is essential for assembly of the myosin II-containing progenitors of the contractile ring. Widespread defects in the cytoskeleton are found in null mutants of all three fungal proteins. Mammalian Unc45 is found to act as a specific chaperone during the folding of myosin and the assembly of striated muscle by forming a stable complex with the general chaperone Hsp90. The exact function of this central region is not known. 150
54417 403030 pfam11702 DUF3295 Protein of unknown function (DUF3295). This family is conserved in fungi but the function is not known. 489
54418 403031 pfam11703 UPF0506 UPF0506. This uncharacterized family is found in Schistosoma genomes. Although uncharacterized it appears to belong to the knottin fold. The sequence is composed of two repeats of a 6 cysteine motif. 59
54419 403032 pfam11704 Folliculin Vesicle coat protein involved in Golgi to plasma membrane transport. In yeast cells this family functions in the regulated delivery of Gap1p (a general amino acid permease) to the cell surface, perhaps as a component of a post-Golgi secretory-vesicle coat complex. Birt-Hogg-Dube (BHD)4 syndrome is an autosomal dominant disorder characterized by hamartomas of skin follicles, lung cysts, spontaneous pneumothorax, and renal cell carcinoma. Folliculin is the protein from the BHD4 gene and is found to have no significant homology to any other human proteins. It is expressed in most tissues. These same symptoms also occur in TSC or tuberous sclerosis complex, suggesting that the same pathway is involved, and it is likely that the target is the down-stream Tor2 - an essential gene. Folliculin appears to bind Tor2, and down-regulation of Tor2 activity leads to up-regulation of nitrogen responsive genes including membrane transporters and amino acid permeases. 163
54420 403033 pfam11705 RNA_pol_3_Rpc31 DNA-directed RNA polymerase III subunit Rpc31. RNA polymerase III contains seventeen subunits in yeasts and in human cells. Twelve of these are akin to RNA polymerase I or II and the other five are RNA pol III-specific, and form the functionally distinct groups (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31, Rpc34 and Rpc82 form a cluster of enzyme-specific subunits that contribute to transcription initiation in S.cerevisiae and H.sapiens. There is evidence that these subunits are anchored at or near the N-terminal Zn-fold of Rpc1, itself prolonged by a highly conserved but RNA polymerase III-specific domain. 230
54421 403034 pfam11706 zf-CGNR CGNR zinc finger. This family consists of a C-terminal zinc finger domain. It seems likely to be DNA-binding given the conservation of many positively charged residues. The domain is named after a highly conserved motif found in many members of the family. 44
54422 403035 pfam11707 Npa1 Ribosome 60S biogenesis N-terminal. Npa1p is required for ribosome biogenesis and operates in the same functional environment as Rsa3p and Dbp6p during early maturation of 60S ribosomal subunits. The protein partners of Npa1p include eight putative helicases as well as the novel Npa2p factor. Npa1p can also associate with a subset of H/ACA and C/D small nucleolar RNPs (snoRNPs) involved in the chemical modification of residues in the vicinity of the peptidyl transferase centre. The protein has also been referred to as Urb1, and this domain at the N-terminal is one of several conserved regions along the length. 332
54423 403036 pfam11708 Slu7 Pre-mRNA splicing Prp18-interacting factor. The spliceosome, an assembly of snRNAs (U1, U2, U4/U6, and U5) and proteins, catalyzes the excision of introns from pre-mRNAs in two successive trans-esterification reactions. Step 2 depends upon integral spliceosome constituents such as U5 snRNA and Prp8 and non-spliceosomal proteins Prp16, Slu7, Prp18, and Prp22. ATP hydrolysis by the DEAH-box enzyme Prp16 promotes a conformational change in the spliceosome that leads to protection of the 3'ss from targeted RNase H cleavage. This change, which probably reflects binding of the 3'ss PyAG in the catalytic centre of the spliceosome, requires the ordered recruitment of Slu7, Prp18, and Prp22 to the spliceosome. There is a close functional relationship between Prp8, Prp18, and Slu7, and Prp18 interacts with Slu7, so that together they recruit Prp22 to the spliceosome. Most members of the family carry a zinc-finger of the CCHC-type upstream of this domain. 258
54424 403037 pfam11709 Mit_ribos_Mrp51 Mitochondrial ribosomal protein subunit. This family is the mitochondrial ribosomal small-subunit protein Mrp51. Its function is not entirely clear, but deletion of the MRP51 gene completely blocked mitochondrial gene expression. 355
54425 371685 pfam11710 Git3 G protein-coupled glucose receptor regulating Gpa2. Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C. 201
54426 403038 pfam11711 Tim54 Inner membrane protein import complex subunit Tim54. Mitochondrial function depends on the import of hundreds of different proteins synthesized in the cytosol. Protein import is a multi-step pathway which includes the binding of precursor proteins to surface receptors, translocation of the precursor across one or both mitochondrial membranes, and folding and assembly of the imported protein inside the mitochondrion. Most precursor proteins carry amino-terminal targeting signals, called pre-sequences, and are imported into mitochondria via import complexes located in both the outer and the inner membrane (IM). The IM complex, TIM, is made up of at least two proteins which mediate translocation of proteins into the matrix by removing their signal peptide and another pair of proteins, Tim54 and Tim22, that insert the polytopic proteins, that carry internal targetting information, into the inner membrane. 372
54427 403039 pfam11712 Vma12 Endoplasmic reticulum-based factor for assembly of V-ATPase. The yeast vacuolar proton-translocating ATPase (V-ATPase) is the best characterized member of the V-ATPase family. A total of thirteen genes are required for encoding the subunits of the enzyme complex itself and an additional three for providing factors necessary for the assembly of the whole. Vma12 is one of these latter, all three of which are localized to the endoplasmic reticulum. 139
54428 288550 pfam11713 Peptidase_C80 Peptidase C80 family. This family belongs to cysteine peptidase family C80. 152
54429 152150 pfam11714 Inhibitor_I53 Thrombin inhibitor Madanin. Members of this family are the peptidase inhibitor madanin proteins. These proteins were isolated from tick saliva. 78
54430 403040 pfam11715 Nup160 Nucleoporin Nup120/160. Nup120 is conserved from fungi to plants to humans, and is homologous with the Nup160 of vertebrates. The nuclear pore complex, or NPC, mediates macromolecular transport across the nuclear envelope. Deletion of the NUP120 gene causes clustering of NPCs at one side of the nuclear envelope, moderate nucleolar fragmentation and slower cell growth. The vertebrate NPC is estimated to contain between 30 and 60 different proteins, most of which are not known. Two important ones in creating the nucleoporin basket are Nup98 and Nup153, and Nup120, in conjunction with Nup133, interacts with these two and itself plays a role in mRNA export. Nup160, Nup133, Nup96, and Nup107 are all targets of phosphorylation. The phosphorylation sites are clustered mainly at the N-terminal regions of these proteins, which are predicted to be natively disordered. The entire Nup107-160 sub-complex is stable throughout the cell cycle, thus it seems unlikely that phosphorylation affects interactions within the Nup107-160 sub-complex, but rather that it regulates the association of the sub-complex with the NPC and other proteins. 536
54431 371689 pfam11716 MDMPI_N Mycothiol maleylpyruvate isomerase N-terminal domain. 139
54432 403041 pfam11717 Tudor-knot RNA binding activity-knot of a chromodomain. This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 (residues 61-63) that is located at the side opposite the knot in the tudor domain-chromodomain; stabilisation of Ht2 is essential for RNA binding. 55
54433 403042 pfam11718 CPSF73-100_C Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term. This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known. 204
54434 371692 pfam11719 Drc1-Sld2 DNA replication and checkpoint protein. Genome duplication is precisely regulated by cyclin-dependent kinases CDKs, which bring about the onset of S phase by activating replication origins and then prevent re-licensing of origins until mitosis is completed. The optimum sequence motif for CDK phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found to have at least 11 potential phosphorylation sites. Drc1 is required for DNA synthesis and S-M replication checkpoint control. Drc1 associates with Cdc2 and is phosphorylated at the onset of S phase when Cdc2 is activated. Thus Cdc2 promotes DNA replication by phosphorylating Drc1 and regulating its association with Cut5. Sld2 and Sld3 represent the minimal set of S-CDK substrates required for DNA replication. 391
54435 288556 pfam11720 Inhibitor_I78 Peptidase inhibitor I78 family. This family includes Aspergillus elastase inhibitor and belongs to MEROPS peptidase inhibitor family I78. 65
54436 403043 pfam11721 Malectin Di-glucose binding within endoplasmic reticulum. Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognizes and binds Glc2-N-glycan. It carries a signal peptide from residues 1-26, a C-terminal transmembrane helix from residues 255-274, and a highly conserved central part of approximately 190 residues followed by an acidic, glutamate-rich region. Carbohydrate-binding is mediated by the four aromatic residues, Y67, Y89, Y116, and F117 and the aspartate at D186. NMR-based ligand-screening studies have shown binding of the protein to maltose and related oligosaccharides, on the basis of which the protein has been designated "malectin", and its endogenous ligand is found to be Glc2-high-mannose N-glycan. 164
54437 403044 pfam11722 zf-TRM13_CCCH CCCH zinc finger in TRM13 protein. This domain is found at the N-terminus of TRM13 methyltransferase proteins. It is presumed to be a zinc binding domain. 29
54438 403045 pfam11723 Aromatic_hydrox Homotrimeric ring hydroxylase. This domain is found on aromatic hydroxylating enzymes such as 2-oxo-1,2-dihydroquinoline 8-monooxygenase from Pseudomonas putida and carbazole 1,9a-dioxygenase from Janthinobacterium. These enzymes are homotrimers and are distantly related to the typical oxygenase. This domain is found C terminal to the Rieske domain which binds an iron-sulphur cluster. 241
54439 378697 pfam11724 YvbH_ext YvbH-like oligomerization region. This region is found at the C-terminus of a group of bacterial PH domains. This region is composed of a helical hairpin that appears to mediate oligomerization based on the known structure. This elaboration of the bacterial PH domain is only found in Bacillales. 61
54440 338078 pfam11725 AvrE Pathogenicity factor. This family is secreted by gram-negative Gammaproteobacteria such as Pseudomonas syringae of tomato and the fire blight plant pathogen Erwinia amylovora, amongst others. It is an essential pathogenicity factor of approximately 198 kDa. Its injection into the host-plant is dependent upon the bacterial type III or Hrp secretion system. The family is long and carries a number of predicted functional regions, including in Erwinia stewartii, an ERMS or endoplasmic reticulum membrane retention signal at both the C- and the N-termini, a leucine-zipper motif from residues 539-560, and a nuclear localization signal at 1358-1361. This conserved AvrE-family of effectors is among the few that are required for full virulence of many phytopathogenic pseudomonads, erwinias and pantoeas. A double beta-propeller structure is found towards the N-terminus. 1879
54441 403046 pfam11726 Inovirus_Gp2 Inovirus Gp2. Isoform G2P plays an essential role in viral DNA replication; it binds to the origin of replication and cleaves the dsDNA replicative form I (RFI) and becomes covalently bound to it via phosphotyrosine bond, generating the dsDNA replicative form II (RFII). 179
54442 403047 pfam11727 ISG65-75 Invariant surface glycoprotein. This family is found in Trypanosome species, and appears to be one of two invariant surface glycoproteins, ISG65 and ISG75, that are found in the mammalian stage of the parasitic protozoan. The sequence suggests the two families are polypeptides with N-terminal signal sequences, hydrophilic extracellular domains, single trans-membrane alpha-helices and short cytoplasmic domains. They are both expressed in the bloodstream form but not in the midgut stage. Both polypeptides are distributed over the entire surface of the parasite. 280
54443 403048 pfam11728 ArAE_1_C Putative aromatic acid exporter C-terminal domain. This region is a presumed intracellular domain found in a set of bacterial presumed transporter proteins. The region is about 160 amino acids in length. 161
54444 288565 pfam11729 Capsid-VNN nodavirus capsid protein. The capsid or coat protein of this family is expressed in Nodaviridae, that are ssRNA positive-strand viruses, with no DNA stage. These viruses are the causative agents of viral nervous necrosis in marine fish. 340
54445 403049 pfam11730 DUF3297 Protein of unknown function (DUF3297). This family is expressed in Proteobacteria and Actinobacteria. The function is not known. 71
54446 403050 pfam11731 Cdd1 Pathogenicity locus. Cdd1 is expressed as part of the pathogenicity locus operon in several different orders of bacteria. Many members of the family are annotated as being putative mitomycin resistance proteins but this could not be confirmed. 81
54447 403051 pfam11732 Thoc2 Transcription- and export-related complex subunit. The THO/TREX complex is the transcription- and export-related complex associated with spliceosomes that preferentially deal with spliced mRNAs as opposed to unspliced mRNAs. Thoc2 plays a role in RNA polymerase II (RNA pol II)-dependent transcription and is required for the stability of DNA repeats. In humans, the TREX complex is comprised of the exon-junction-associated proteins Aly/REF and UAP56 together with the THO proteins THOC1 (hHpr1/p84), Thoc2 (hRlr1), THOC3 (hTex1), THOC5 (fSAP79), THOC6 (fSAP35), and THOC7 (fSAP24). Although much evidence indicates that the function of the TREX complex as an adaptor between the mRNA and components of the export machinery is conserved among eukaryotes, in Drosophila the majority of mRNAs can be exported from the nucleus independently of the THO complex. 75
54448 314574 pfam11733 NP1-WLL Non-capsid protein NP1. This family is the non-capsid protein NP1 of the ssDNA, Parvovirinae virus Bocavirus of cattle and humans. 213
54449 403052 pfam11734 TilS_C TilS substrate C-terminal domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. 74
54450 403053 pfam11735 CAP59_mtransfer Cryptococcal mannosyltransferase 1. The capsule of pathogenic fungi is a complex polysaccharide whose formation is determined by a number of enzymes including, most importantly, alpha-1,3-mannosyltransferase 1, EC:2.4.1.-. 225
54451 403054 pfam11736 DUF3299 Protein of unknown function (DUF3299). This is a family of bacterial proteins of unknown function. 106
54452 403055 pfam11737 DUF3300 Protein of unknown function (DUF3300). This hypothetical bacterial gene product has a long hydrophobic segment and is thus likely to be a membrane protein. 229
54453 403056 pfam11738 DUF3298 Protein of unknown function (DUF3298). This family of bacterial protein C-terminal regions is highly conserved but the function is not known. Several members are annotated as being endo-1,4-beta-xylanase-like, but this could not be confirmed, and the structure can be defined as a heat-shock cognate 70kd protein 44kd ATPase. 81
54454 403057 pfam11739 DctA-YdbH Dicarboxylate transport. In certain bacterial families this protein is expressed from the ydbH gene, and there is a suggestion that this is a form of DctA or dicarboxylate transport protein. Dicarboxylate transport proteins are found in aerobic bacteria which grow on succinate or other C4-dicarboxylates. 204
54455 403058 pfam11740 KfrA_N Plasmid replication region DNA-binding N-term. The broad host-range plasmid RK2 is able to replicate in and be inherited in a stable manner in diverse Gram-negative bacterial species. It encodes a number of co-ordinately regulated operons including a central control korF1 operon that represses the kfrA operon. The KfrA polypeptide is a site-specific DNA-binding protein whose operator overlaps the kfrA promoter. The N-terminus, containing an helix-turn-helix motif, is essential for function. Downstream from this family is an extended coiled-coil domain containing a heptad repeat segment which is probably responsible for formation of multimers, and may provide an example of a bridge to host structures required for plasmid partitioning. 117
54456 403059 pfam11741 AMIN AMIN domain. This N-terminal domain of various bacterial protein families is crucial for the targetting of periplasmic or extracellular proteins to specific regions of the bacterial envelope. AMIN is derived from the N-terminal domain of AmiC, an N-acetylmuramoyl-l-alanine amidase of Escherichia coli which localizes to the septal ring during division and plays a key role in the separation of daughter cells. The AMIN domain is present in several protein families besides amidases suggesting that AMIN may represent a general targetting determinant involved in the localization of periplasmic protein complexes. 96
54457 314583 pfam11742 DUF3302 Protein of unknown function (DUF3302). This family of unknown function is expressed by proteobacteria. 77
54458 403060 pfam11743 DUF3301 Protein of unknown function (DUF3301). This family is conserved in Proteobacteria, but the function is not known. 93
54459 314585 pfam11744 ALMT Aluminium activated malate transporter. 469
54460 403061 pfam11745 DUF3304 Protein of unknown function (DUF3304). This is a family of bacterial proteins of unknown function. 104
54461 403062 pfam11746 DUF3303 Protein of unknown function (DUF3303). Several members are annotated as being LysM domain-like proteins, but these did not match any LysM domains reported in the literature. 90
54462 403063 pfam11747 RebB Killing trait. RebB is one of three proteins necessary for the production of R- bodies, refractile inclusion bodies produced by a small number of bacterial species, essential for the expression of the killing trait of the endosymbiont bacteria that produce them for attack upon the host Paramecium. R-bodies are highly insoluble protein ribbons which coil into cylindrical structures in the cell and the genes for their synthesis and assembly are encoded on a plasmid. One of these three proteins is RebB. 68
54463 403064 pfam11748 DUF3306 Protein of unknown function (DUF3306). This family of proteobacterial species proteins has no known function. 120
54464 403065 pfam11749 DUF3305 Protein of unknown function (DUF3305). Several members of this family are annotated as being molybdopterin-guanine dinucleotide biosynthesis protein A; however, this could not be confirmed. The family is found in proteobacteria. 141
54465 371703 pfam11750 DUF3307 Protein of unknown function (DUF3307). This family of bacterial proteins has no known function. 124
54466 403066 pfam11751 PorP_SprF Type IX secretion system membrane protein PorP/SprF. This entry describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Cytophaga johnsonae (Flavobacterium johnsoniae), that exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of the rapid gliding phenotype. 274
54467 403067 pfam11752 DUF3309 Protein of unknown function (DUF3309). This family is conserved in bacteria but its function is not known. 49
54468 403068 pfam11753 DUF3310 Protein of unknown function (DUF3310). This is a family of conserved bacteriophage proteins of unknown function. 60
54469 403069 pfam11754 Velvet Velvet factor. The velvet factor is conserved in many fungal species and is found to have gained different roles depending on the organism's need, expanding the conserved role in developmental programmes. The velvet factor orthologues can be adapted to the fungal-specific life cycle and may be involved in diverse functions such as sclerotia formation and toxin production, as in A. parasiticus, nutrition-dependent sporulation, as in A. fumigatus, or the microconidia-to-macroconidia ratio and cell wall formation, as in the heterothallic fungus Fusarium verticillioides. 237
54470 403070 pfam11755 DUF3311 Protein of unknown function (DUF3311). This is a family of short bacterial proteins of unknown function. 59
54471 403071 pfam11756 YgbA_NO Nitrous oxide-stimulated promoter. The function of ygbA is not known but it is a promoter that is stimulated by the presence of nitrous oxide. It is regulated by the gene-product of the bacterial nsrR gene. 106
54472 403072 pfam11757 RSS_P20 Suppressor of RNA silencing P21-like. This is a large family of putative suppressors of RNA silencing proteins, P20-P25, from ssRNA positive-strand viruses such as Closterovirus, Potyvirus and Cucumovirus families. RNA silencing is one of the major mechanisms of defense against viruses, and, in response, some viruses have evolved or acquired functions for suppression of RNA silencing. These counter-defensive viral proteins with RNA silencing suppressor (RSS) activity were originally discovered in the members of plant virus genera Potyvirus and Cucumovirus. Each of the conserved blocks of amino acids found in P21-like proteins corresponds to a computer-predicted alpha-helix, with the most C-terminal element being 42 residues long. This suggests conservation of the predominantly alpha-helical secondary structure in the P21-like proteins. 94
54473 403073 pfam11758 Bacteriocin_IIi Aureocin-like type II bacteriocin. This is a small family of type II bacteriocins usually encoded on a plasmid. Characteristically the members are small, cationic, rich in Lys and Trp, and bring about a generalized membrane permeabilisation leading to leakage of ions. The family includes aureocin A, lacticins Q and Z, and BhtB as well as an archaeal member. 51
54474 314599 pfam11759 KRTAP Keratin-associated matrix. The major structural proteins of mammalian hair are the hair keratin intermediate filaments (KIFs) and the keratin-associated proteins (KRTAPs). In the hair cortex, hair keratins are embedded in an inter-filamentous matrix consisting of KRTAPs which are essential for the formation of a rigid and resistant hair shaft as a result of disulfide bonds between cysteine residues. There are essentially three groups of KRTAPs, viz: the high-sulfur (HS) and ultra-high-sulfur (UHS) KRTAPs (cysteine content: 16-30 and >30 mol%, respectively) and the high-glycine/tyrosine (HGT: 35-60 mol% glycine and tyrosine) KRTAPs. 59
54475 403074 pfam11760 CbiG_N Cobalamin synthesis G N-terminal. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. Within the cobalamin synthesis pathway CbiG catalyzes both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B. The N-terminal of the enzyme is conserved in this family, and the C-terminal and the mid-sections are conserved independently in other families, CbiG_C and CbiG_mid, although the distinct function of each region is unclear. 79
54476 403075 pfam11761 CbiG_mid Cobalamin biosynthesis central region. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. 88
54477 403076 pfam11762 Arabinose_Iso_C L-arabinose isomerase C-terminal domain. This is a family of L-arabinose isomerases, AraA, EC:5.3.1.4. These enzymes catalyze the reaction: L-arabinose <=> L-ribulose. This reaction is the first step in the pathway of L-arabinose utilisation as a carbon source: after entering the cell, L-arabinose is converted into L-ribulose by the L-arabinose isomerase enzyme. This is a C-terminal non-catalytic domain. 114
54478 403077 pfam11763 DIPSY Cell-wall adhesin ligand-binding C-terminal. The DIPSY domain is characterized by the distinctive D*I*PSY motif at the very C-terminus of yeast cell-wall glycoproteins. It appears not to be conserved in any other species, however. In fungi, cell adhesion is required for flocculation, mating and virulence, and is mediated by covalently bound cell wall proteins termed adhesins. Map4, an adhesin required for mating in Schizosaccharomyces pombe, is N-glycosylated and O-glycosylated, and is an endogenous substrate for the mannosyl transferase Oma4p. Map4 has a modular structure with an N-terminal signal peptide, a serine and threonine (S/T)-rich domain that includes nine repeats of 36 amino acids (rich in serine and threonine residues, but lacking glutamines), and a C-terminal DIPSY domain with no glycosyl-phosphatidyl inositol (GPI)-anchor signal. The N-terminal S/T-rich regions are required for cell wall attachment, but the C-terminal DIPSY domain is required for agglutination and mating in liquid and solid media. 122
54479 403078 pfam11764 N-SET COMPASS (Complex proteins associated with Set1p) component N. The n-SET or N-SET domain is a component of the COMPASS complex, associated with SET1, conserved in yeasts and in other eukaryotes up to humans. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for the silencing of genes close to the telomeres of chromosomes. This domain promotes trimethylation in conjunction with an RRM domain and is necessary for binding of the Spp1 component of COMPASS into the complex. 172
54480 403079 pfam11765 Hyphal_reg_CWP Hyphally regulated cell wall protein N-terminal. The proteins in this family are all fungal and largely annotated as being hyphally regulated cell wall proteins, and several are listed as the enzyme EC:3.2.1.18. This enzyme is acetylneuraminyl hydrolase or exo-alpha-sialidase, that hydrolyzes glycosidic linkages of terminal sialic acid residues in oligosaccharides, glycoproteins, glycolipids, colominic acid and synthetic substrates. 322
54481 403080 pfam11766 Candida_ALS_N Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins. 241
54482 403081 pfam11767 SET_assoc Histone lysine methyltransferase SET associated. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. A subset of SET domains have been called PR domains. The SET domain consists of two regions known as N-SET and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure. This domain is found in fungi associated with SET and N-SET domains. 65
54483 314607 pfam11768 Frtz WD repeat-containing and planar cell polarity effector protein Fritz. Fritz is a probable effector of the planar cell polarity signaling pathway which regulates the septin cytoskeleton in both ciliogenesis and collective cell movements. In Drosophila melanogaster, fritz regulates both the location and the number of wing cell prehair initiation sites. 545
54484 403082 pfam11769 DUF3313 Protein of unknown function (DUF3313). This is a bacterial family of proteins which are annotated as putative lipoproteins. 186
54485 403083 pfam11770 GAPT GRB2-binding adapter (GAPT). This is a family of transmembrane proteins which bind the growth factor receptor-bound protein 2 (GRB2) in B cells. In contrast to other transmembrane adaptor proteins, GAPT is not phosphorylated upon BCR ligation. It associates with GRB2 constitutively through its proline-rich region. 155
54486 403084 pfam11771 DUF3314 Protein of unknown function (DUF3314). This small family contains human, mouse and fish members but the function is not known. 162
54487 403085 pfam11772 EpuA DNA-directed RNA polymerase subunit beta. This short 60-residue long bacterial family is the beta subunit of the DNA-directed RNA polymerase, likely to be EC:2.7.7.6. It is membrane-bound and is referred to by the name EpuA. 46
54488 403086 pfam11773 PulG Type II secretory pathway pseudopilin. The secreton (type II secretion) and type IV pilus biogenesis branches of the general secretory pathway in Gram-negative bacteria share many features that suggest a common evolutionary origin. Five components of the secreton, the pseudopilins, are similar to subunits of type IV pili. Pseudopilin PulG is one of the secreton pseudopilins, and is found to assemble into pilus-like bundles. PulG interacts with proteins H, I and J within the multi-protein complex as well as blocking extracellular secretion and reducing the amount of PulE protein as well as the amounts of PulL, PulM, PulC and PulD when G is over-expressed. In Klebsiella the pilus-like structure is composed largely of PulG. 82
54489 403087 pfam11774 Lsr2 Lsr2. Lsr2 is a small, basic DNA-bridging protein present in Mycobacterium and related actinomycetes. It is a functional homolog of the H-NS-like proteins. H-NS proteins play a role in nucleoid organisation and also function as a pleiotropic regulator of gene expression. 109
54490 288608 pfam11775 CobT_C Cobalamin biosynthesis protein CobT VWA domain. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. 220
54491 403088 pfam11776 RcnB Nickel/cobalt transporter regulator. RcnB is a family of Proteobacteria proteins. RcnB is required for maintaining metal ion homeostasis, in conjunction with the efflux pump RcnA, family NicO, pfam03824. 51
54492 403089 pfam11777 DUF3316 Protein of unknown function (DUF3316). This family of bacterial proteins has no known function. Several members are, however, annotated as being putative acyl-CoA synthetase, but this could not be confirmed. 107
54493 403090 pfam11778 SID Septation initiation. This family is required for activation of the spg1 GTPase signalling cascade which leads to the initiation of septation and the subsequent termination of mitosis. It may act as a scaffold at the spindle pole body to which other components of the spg1 signalling cascade attach in pombe. In S.cerevisiae it is both required for the proper formation of the spindle pole body outer plaque and may also connect the outer plaque to the central plaque embedded in the nuclear envelope. 135
54494 403091 pfam11779 SPT_ssu-like Small subunit of serine palmitoyltransferase-like. Serine palmitoyltransferase (SPT) catalyzes the first committed step in sphingolipid biosynthesis. In mammals, two small subunits of serine palmitoyltransferase, ssSPTa and ssSPTb, substantially enhance the activity of SPT, conferring full enzyme activity upon it. The 2 ssSPT isoforms share a conserved hydrophobic central domain, which is predicted to reside in the membrane. 54
54495 378716 pfam11780 DUF3318 Protein of unknown function (DUF3318). This is a bacterial family of uncharacterized proteins. 141
54496 403092 pfam11781 zf-RRN7 Zinc-finger of RNA-polymerase I-specific TFIIB, Rrn7. This is the zinc-finger at the start of transcription-binding factor that associates strongly with both Rrn6 and Rrn7 to form a complex which itself binds the TATA-binding protein and is required for transcription by the core domain of the RNA PolI promoter. 32
54497 403093 pfam11782 DUF3319 Protein of unknown function (DUF3319). This is a family of short bacterial proteins, a few of which are annotated as being minor tail protein. Otherwise the function is unknown. 89
54498 403094 pfam11783 Cytochrome_cB Cytochrome c bacterial. This is a family of long bacterial cytochrome c proteins, found in Proteobacteria and Chlorobi families. 173
54499 403095 pfam11784 DUF3320 Protein of unknown function (DUF3320). This family is conserved in Proteobacteria and Chlorobi families. Many members are annotated as being putative DNA helicase-related proteins. 50
54500 403096 pfam11785 Aft1_OSA Aft1 osmotic stress response (OSM) domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The OSM domain has been shown to be involved in the osmotic stress response. 57
54501 371723 pfam11786 Aft1_HRA Aft1 HRA domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The HRA domain is involved in meiotic recombination. It has been shown to be necessary and sufficient to activate recombination. 76
54502 403097 pfam11787 Aft1_HRR Aft1 HRR domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The HRR domain is involved in meiotic recombination. It has been shown to be necessary and sufficient to repress recombination. 68
54503 371725 pfam11788 MRP-L46 39S mitochondrial ribosomal protein L46. This is the L46 subunit of the mammalian mitochondrial ribosome, conserved from plants and fungi. 115
54504 403098 pfam11789 zf-Nse Zinc-finger of the MIZ type in Nse subunit. Nse1 and Nse2 are novel non-SMC subunits of the fission yeast Smc5-6 DNA repair complex. This family is the zinc-finger domain similar to the MIZ type of zinc-finger. 57
54505 371727 pfam11790 Glyco_hydro_cc Glycosyl hydrolase catalytic core. This family is probably a glycosyl hydrolase, and is conserved in fungi and some Proteobacteria. The pombe member is annotated as being from IPR013781. 235
54506 403099 pfam11791 Aconitase_B_N Aconitate B N-terminal domain. This family represents the N-terminal domain of Aconitase B. 152
54507 288625 pfam11792 Baculo_LEF5_C Baculoviridae late expression factor 5 C-terminal domain. This C-terminal domain is likely to be a zinc-binding domain. 42
54508 403100 pfam11793 FANCL_C FANCL C-terminal domain. This domain is found at the C-terminus of the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. 70
54509 403101 pfam11794 HpaB_N 4-hydroxyphenylacetate 3-hydroxylase N terminal. HpaB is part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli. HpaB is part of a heterodimeric enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis. 266
54510 403102 pfam11795 DUF3322 Uncharacterized protein conserved in bacteria N-term (DUF3322). This domain, found in various hypothetical bacterial proteins, has no known function. The family represents just the N-terminus. 187
54511 403103 pfam11796 DUF3323 Protein of unknown function N-terminus (DUF3323). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Beta-proteobacteria). 209
54512 403104 pfam11797 DUF3324 Protein of unknown function C-terminal (DUF3324). This family consists of several hypothetical bacterial proteins of unknown function. 138
54513 403105 pfam11798 IMS_HHH IMS family HHH motif. These proteins are involved in UV protection, eg. 32
54514 403106 pfam11799 IMS_C impB/mucB/samB family C-terminal domain. These proteins are involved in UV protection. 110
54515 403107 pfam11800 RP-C_C Replication protein C C-terminal region. Replication protein C is involved in the early stages of viral DNA replication. 206
54516 371732 pfam11801 Tom37_C Tom37 C-terminal domain. The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding cytosolic mitochondrial beta-barrel proteins from the cytosol across the outer mitochondrial membrane into the intra-membrane space. In conjunction with TOM70 it guides peptides without an MTS into TOM40, the protein that forms the passage through the outer membrane. It has homology with Metaxin-1, also part of the outer mitochondrial membrane beta-barrel protein transport complex. 145
54517 371733 pfam11802 CENP-K Centromere-associated protein K. CENP-K is one of seven new CENP-A-nucleosome distal (CAD) centromere components (the others being CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S) that are identified as assembling on the CENP-A nucleosome associated complex, NAC. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival despite continued centromere-derived mitotic checkpoint signalling. CENP-K is centromere-associated through its interaction with one or more components of the CENP-A NAC. 263
54518 403108 pfam11803 UXS1_N UDP-glucuronate decarboxylase N-terminal. The N-terminus of the UDP-glucuronate decarboxylases may be involved in localization to the perinuclear Golgi membrane. 75
54519 403109 pfam11804 DUF3325 Protein of unknown function (DUF3325). This family of short proteins are functionally uncharacterized. This family is restricted to Alpha-, Beta- and Gamma-proteobacteria. 101
54520 403110 pfam11805 DUF3326 Protein of unknown function (DUF3326). This protein is functionally uncharacterized. It is about 300-500 amino acids in length. This family is found in plants and bacteria. 336
54521 403111 pfam11806 DUF3327 Domain of unknown function (DUF3327). 122
54522 403112 pfam11807 DUF3328 Domain of unknown function (DUF3328). This family of proteins are functionally uncharacterized. This family is only found in eukaryotes. 220
54523 403113 pfam11808 DUF3329 Domain of unknown function (DUF3329). This family of proteins are functionally uncharacterized. This family is only found in bacteria. 83
54524 288642 pfam11809 DUF3330 Domain of unknown function (DUF3330). This family of proteins are functionally uncharacterized. This family is only found in bacteria. 69
54525 403114 pfam11810 DUF3332 Domain of unknown function (DUF3332). This family of proteins are functionally uncharacterized. This family is only found in bacteria. 160
54526 288644 pfam11811 DUF3331 Domain of unknown function (DUF3331). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family vary in length from 96 to 160 amino acids. 90
54527 403115 pfam11812 DUF3333 Domain of unknown function (DUF3333). This family of proteins are functionally uncharacterized. This family is only found in bacteria. This presumed domain is typically between 116 to 159 amino acids in length. 150
54528 314646 pfam11813 DUF3334 Protein of unknown function (DUF3334). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family are typically between 227 to 238 amino acids in length. 226
54529 403116 pfam11814 DUF3335 Peptidase_C39 like family. 206
54530 403117 pfam11815 DUF3336 Domain of unknown function (DUF3336). This family of proteins are functionally uncharacterized. This family is found in bacteria and eukaryotes. This presumed domain is typically between 143 to 227 amino acids in length. 139
54531 403118 pfam11816 DUF3337 Domain of unknown function (DUF3337). This family of proteins are functionally uncharacterized. This family is only found in eukaryotes. This presumed domain is typically between 285 to 342 amino acids in length. 168
54532 371742 pfam11817 Foie-gras_1 Foie gras liver health family 1. Mutating the gene foie gras in zebrafish has been shown to affect development; the mutants develop large, lipid-filled hepatocytes in the liver, resembling those in individuals with fatty liver disease. Foie-gras protein is long and has several well-defined domains though none of them has a known function. We have annotated this one as the first. The C-terminus of this region contains TPR repeats. 262
54533 403119 pfam11818 DUF3340 C-terminal domain of tail specific protease (DUF3340). This presumed domain is found at the C-terminus of tail specific proteases. Its function is unknown. This family is found in bacteria and eukaryotes. This presumed domain is typically between 88 to 187 amino acids in length. 149
54534 403120 pfam11819 DUF3338 Domain of unknown function (DUF3338). This family of proteins are functionally uncharacterized. This family is found in eukaryotes. This presumed domain is about 130 amino acids in length. 135
54535 403121 pfam11820 DUF3339 Protein of unknown function (DUF3339). This family of proteins are functionally uncharacterized. This family is found in eukaryotes. Proteins in this family are about 70 amino acids in length. 66
54536 403122 pfam11821 DUF3341 Protein of unknown function (DUF3341). This family of proteins are functionally uncharacterized. This family is found in bacteria. Proteins in this family are about 170 amino acids in length. 170
54537 403123 pfam11822 DUF3342 Domain of unknown function (DUF3342). This family of proteins are functionally uncharacterized. This family is found in bacteria. The domain is a BTB-like domain. 97
54538 403124 pfam11823 DUF3343 Protein of unknown function (DUF3343). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 78 to 102 amino acids in length. 63
54539 403125 pfam11824 DUF3344 Protein of unknown function (DUF3344). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 367 to 1857 amino acids in length. 267
54540 403126 pfam11825 Nuc_recep-AF1 Nuclear/hormone receptor activator site AF-1. Nuclear receptors (NRs) are a family of ligand-inducible transcription factors, and, like other transcription factors, they contain a distinct DNA binding domain that allows for target gene recognition and several activation domains that possess the ability to activate transcription. One of these activation domains is at the N-terminal, although there are two distinct motifs within this domain, between residues 20-36 and between 74 and the end of this domain, which are the binding regions. One of the co-activators is TIF1beta, which appears to bind at the first motif. 113
54541 288659 pfam11826 DUF3346 Protein of unknown function (DUF3346). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 231 to 659 amino acids in length. 225
54542 403127 pfam11827 DUF3347 Protein of unknown function (DUF3347). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 169 to 570 amino acids in length. 93
54543 403128 pfam11828 DUF3348 Protein of unknown function (DUF3348). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 244 to 323 amino acids in length. 247
54544 403129 pfam11829 DUF3349 Protein of unknown function (DUF3349). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 99 to 124 amino acids in length. 94
54545 403130 pfam11830 DUF3350 Domain of unknown function (DUF3350). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 50 to 64 amino acids in length. 62
54546 403131 pfam11831 Myb_Cef pre-mRNA splicing factor component. This family is a region of the Myb-Related Cdc5p/Cef1 proteins, in fungi, and is part of the pre-mRNA splicing factor complex. 226
54547 403132 pfam11832 DUF3352 Protein of unknown function (DUF3352). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 538 to 575 amino acids in length. 529
54548 403133 pfam11833 CPP1-like Protein CHAPERONE-LIKE PROTEIN OF POR1-like. This entry includes proteins from bacteria and eukaryotes. The plant member, CHAPERONE-LIKE PROTEIN OF POR1 (CPP1), is an essential protein for chloroplast development, plays a role in the regulation of POR (light-dependent protochlorophyllide oxidoreductase) stability and function. 193
54549 403134 pfam11834 KHA KHA, dimerization domain of potassium ion channel. KHA is the tetramerisation domain of eukaryotic voltage-dependent potassium ion-channel proteins. In plants the domain lies at the C-terminus whereas in many chordates it lies at the N-terminus. 67
54550 371753 pfam11835 DUF3355 Domain of unknown function (DUF3355). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 111 to 177 amino acids in length. 89
54551 403135 pfam11836 Phage_TAC_11 Phage tail tube protein, GTA-gp10. This is a family of phage tail tube proteins. 98
54552 403136 pfam11837 DUF3357 Domain of unknown function (DUF3357). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 96 to 119 amino acids in length. 108
54553 403137 pfam11838 ERAP1_C ERAP1-like C-terminal domain. This large domain is composed of 16 alpha helices organized as 8 HEAT-like repeats. This domain forms a concave face that faces towards the active site of the peptidase. 316
54554 371756 pfam11839 Alanine_zipper Alanine-zipper, major outer membrane lipoprotein. This is a family of a major outer membrane lipoprotein, OprL that is an alanine-zipper. Zipper motifs are a seven-repeat motif where the first and fourth positions are occupied by an aliphatic residue, usually a leucine. These residues are positioned on the outside of the coil such as to bind firmly to one or more monomers of the protein to create a triple or five-helical coiled-coil that probably forms a seam in a membrane. 69
54555 403138 pfam11840 DUF3360 Protein of unknown function (DUF3360). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 489 to 517 amino acids in length. 485
54556 403139 pfam11841 DUF3361 Domain of unknown function (DUF3361). This domain is functionally uncharacterized. This domain is found in eukaryotes and predominantly in ELMO (Elongation and Cell motility) proteins where it may play an important role in defining the functions of the ELMO family members and may be functionally linked to the ELMO domain in these proteins. 153
54557 403140 pfam11842 DUF3362 Domain of unknown function (DUF3362). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 117 to 158 amino acids in length. 148
54558 403141 pfam11843 DUF3363 Protein of unknown function (DUF3363). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 323 to 658 amino acids in length. 380
54559 403142 pfam11844 DUF3364 Domain of unknown function (DUF3364). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 60 amino acids in length. 56
54560 403143 pfam11845 DUF3365 Protein of unknown function (DUF3365). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 198 to 657 amino acids in length. 167
54561 403144 pfam11846 Wzy_C_2 Virulence factor membrane-bound polymerase, C-terminal. Wzy is a membrane-bound polymerase of 12 TMs, found in Gram-positive bacteria such as Streptococcus pneumoniae. It forms part of the EPS or exopolysaccharide system. This family is the 6xTMs at the C-terminal end of the molecule. Wzy functions in polymerizing the oligosaccharide repeat subunits to form high-molecular-weight capsular polysaccharides. A contiguous membrane-bound flippase, Wzx, pfam01943, transports the repeat units to the external surface of the membrane. These polysaccharides form the capsule and their differing compositions contribute to the multitudinous pneumococcal capsular serotypes, all being structurally and antigenically different. 186
54562 403145 pfam11847 DUF3367 Domain of unknown function (DUF3367). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 667 to 694 amino acids in length. 642
54563 403146 pfam11848 DUF3368 Domain of unknown function (DUF3368). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is about 50 amino acids in length. 46
54564 403147 pfam11849 DUF3369 Domain of unknown function (DUF3369). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 170 amino acids in length. The domain appears to be related to the GAF domain. 168
54565 403148 pfam11850 DUF3370 Protein of unknown function (DUF3370). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 452 to 532 amino acids in length. 422
54566 403149 pfam11851 DUF3371 Domain of unknown function (DUF3371). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 125 to 142 amino acids in length. 127
54567 403150 pfam11852 DUF3372 Domain of unknown function (DUF3372). This domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This presumed domain is about 170 amino acids in length. 167
54568 371762 pfam11853 DUF3373 Protein of unknown function (DUF3373). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 472 to 574 amino acids in length. 405
54569 403151 pfam11854 MtrB_PioB Putative outer membrane beta-barrel porin, MtrB/PioB. MtrB-PioB is a family of bacterial putative outer membrane porins. This family is secreted as part of the pio (phototrophic iron oxidation) operon that has been found to couple the oxidation of ferrous iron [Fe(II)] to reductive CO2 fixation using light energy. PioABC is found in Rhodopseudomonas palustris and MtrB-PioB is likely to be a beta-barrel porin. Similar to other outer membrane porins, PioB and MtrB are predicted to have long loops protruding into the extracellular space and short turns on the periplasmic side. 640
54570 403152 pfam11855 DUF3375 Protein of unknown function (DUF3375). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 479 to 499 amino acids in length. 469
54571 371765 pfam11856 DUF3376 Protein of unknown function (DUF3376). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 770 to 1142 amino acids in length. 521
54572 403153 pfam11857 DUF3377 Domain of unknown function (DUF3377). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 70 amino acids in length. 72
54573 403154 pfam11858 DUF3378 Domain of unknown function (DUF3378). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 80 amino acids in length. 76
54574 403155 pfam11859 DUF3379 Protein of unknown function (DUF3379). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 234 to 251 amino acids in length. 232
54575 403156 pfam11860 Muraidase N-acetylmuramidase. Endolysins are bacteriophage encoded proteins synthesized at the end of the lytic infection cycle. They degrade the peptidoglycan (PG) of the host bacterium to allow viral progeny release. This domain family is found in bacteria and viruses. It is also found associated with pfam01471. One of the family members is the modular Gp110 endolysin found in the Salmonella phage. This domain represents the catalytic region found in the C-terminal of Gp110. It has been demonstrated to have N-acetylmuramidase (lysozyme) activity cleaving the beta-(1,4) glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine residues in the sugar backbone of the PG. Furthermore, sequence alignments containing this domain show that the Gp110 E101 residue is conserved (suggesting that it is the catalytic residue), and followed by serine (a common feature in lysozymes). The structure of endolysins varies depending on their origin. In general, most of the endolysins from phages infecting Gram-positive bacteria have a modular structure consisting of one or two N-terminal enzymatic active domains (EADs) and a C-terminal cell wall binding domain (CBD) separated by a short linker. In silico analysis indicates that this endolysin has a modular structure harboring this EAD family at the C-terminus and a PG_binding_1 CBD at the N-terminus. 173
54576 403157 pfam11861 DUF3381 Domain of unknown function (DUF3381). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 156 to 174 amino acids in length. This domain is found associated with pfam07780, pfam01728. 146
54577 403158 pfam11862 DUF3382 Domain of unknown function (DUF3382). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 100 amino acids in length. This domain is found associated with pfam02653. 97
54578 403159 pfam11863 DUF3383 Protein of unknown function (DUF3383). This family of proteins are functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 356 to 501 amino acids in length. 493
54579 403160 pfam11864 DUF3384 Domain of unknown function (DUF3384). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 422 to 486 amino acids in length. This domain is found associated with pfam02145. 407
54580 403161 pfam11865 DUF3385 Domain of unknown function (DUF3385). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 160 to 172 amino acids in length. This domain is found associated with pfam00454, pfam02260, pfam02985, pfam02259 and pfam08771. 160
54581 403162 pfam11866 DUF3386 Protein of unknown function (DUF3386). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are about 220 amino acids in length. 211
54582 403163 pfam11867 DUF3387 Domain of unknown function (DUF3387). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 255 to 340 amino acids in length. This domain is found associated with pfam04851, pfam04313. 331
54583 314698 pfam11868 DUF3388 Protein of unknown function (DUF3388). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 261 to 275 amino acids in length. This protein is found associated with pfam01842. 190
54584 403164 pfam11869 DUF3389 Protein of unknown function (DUF3389). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 80 amino acids in length. 75
54585 403165 pfam11870 DUF3390 Domain of unknown function (DUF3390). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 90 amino acids in length. This domain is found associated with pfam02589. This domain is found on most LutB proteins in association with DUF162 and usually Fer4_8. The LutABC operon is involved in lactate-utilisation and is essential. DUF162, pfam02589, is over-represented in the human gut-microbiome. 86
54586 403166 pfam11871 DUF3391 Domain of unknown function (DUF3391). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is typically between 122 to 139 amino acids in length. This domain is found associated with pfam01966. 136
54587 403167 pfam11872 DUF3392 Protein of unknown function (DUF3392). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length. 103
54588 403168 pfam11873 DUF3393 Domain of unknown function (DUF3393). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is typically between 188 to 206 amino acids in length. This domain is found associated with pfam01464. 161
54589 403169 pfam11874 DUF3394 Domain of unknown function (DUF3394). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with pfam06808. 180
54590 403170 pfam11875 DUF3395 Domain of unknown function (DUF3395). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 147 to 176 amino acids in length. This domain is found associated with pfam00226. 144
54591 403171 pfam11876 DUF3396 Protein of unknown function (DUF3396). This family of proteins are functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 302 to 382 amino acids in length. 205
54592 403172 pfam11877 DUF3397 Protein of unknown function (DUF3397). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 114 to 128 amino acids in length. 112
54593 403173 pfam11878 DUF3398 Domain of unknown function (DUF3398). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. 111
54594 403174 pfam11879 DUF3399 Domain of unknown function (DUF3399). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. This domain is found associated with pfam02214, pfam00520. 103
54595 403175 pfam11880 DUF3400 Domain of unknown function (DUF3400). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 50 amino acids in length. This domain is found associated with pfam02754, pfam02913, pfam01565. 45
54596 403176 pfam11881 SPAR_C C-terminal domain of SPAR protein. This domain is found at the C-terminus of many spine-associated Rap GTPase-activating - SPAR - proteins in eukaryotes. This domain is found associated with pfam02145, pfam00595. The exact function is not known. 242
54597 403177 pfam11882 DUF3402 Domain of unknown function (DUF3402). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 350 to 473 amino acids in length. This domain is found associated with pfam07923. 429
54598 403178 pfam11883 DUF3403 Domain of unknown function (DUF3403). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 50 amino acids in length. This domain is found associated with pfam00069, pfam08276, pfam00954, pfam01453. 47
54599 403179 pfam11884 DUF3404 Domain of unknown function (DUF3404). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 260 amino acids in length. This domain is found associated with pfam02518, pfam00512. 259
54600 403180 pfam11885 DUF3405 Protein of unknown function (DUF3405). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 636 to 810 amino acids in length. 493
54601 403181 pfam11886 TOC159_MAD Translocase of chloroplast 159/132, membrane anchor domain. This is the membrane-anchor domain of translocase of chloroplast 159, TOC159/132. This domain is present in plants at the C-terminus of the GTPase, AIG1, pfam04548, and anchors the GTPase region to the outer membrane of the chloroplast. The domain may carry a very C-terminal sequence motif that resembles a transit peptide. 267
54602 403182 pfam11887 Mce4_CUP1 Cholesterol uptake porter CUP1 of Mce4, putative. Mce4_CUP1 is a family of putative Mce4 transporters of cholesterol. The domain is found associated with pfam02470. The full TCDB classification for this family in conjunction with PF02470 is TC:3.A.1.27.4. 238
54603 403183 pfam11888 DUF3408 Protein of unknown function (DUF3408). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 128 to 160 amino acids in length. 136
54604 288721 pfam11889 DUF3409 Domain of unknown function (DUF3409). This domain is functionally uncharacterized. This domain is found in viruses. This presumed domain is about 60 amino acids in length. This domain is found associated with pfam00271, pfam05550, pfam05578. 56
54605 403184 pfam11890 DUF3410 Domain of unknown function (DUF3410). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 90 amino acids in length. This domain is found associated with pfam02826, pfam00389. This domain has a conserved RRE sequence motif. 81
54606 403185 pfam11891 RETICULATA-like Protein RETICULATA-related. This entry represents RETICULATA and related proteins from plants. Arabidopsis RETICULATA protein is involved in differential development of bundle sheath and mesophyll cell chloroplasts. 177
54607 403186 pfam11892 DUF3412 Domain of unknown function (DUF3412). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 120 amino acids in length. This domain is found associated with pfam03641. 121
54608 403187 pfam11893 DUF3413 Domain of unknown function (DUF3413). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 250 amino acids in length. This domain is found associated with pfam00884. 246
54609 403188 pfam11894 Nup192 Nuclear pore complex scaffold, nucleoporins 186/192/205. This is a family of eukaryotic nucleoporins of several different sizes. All of them are long and form the scaffold of the nuclear pore complex. Nup192 in particular modulates the permeability of the central channel of the nuclear pore complex (NPC). 1483
54610 403189 pfam11895 Peroxidase_ext Fungal peroxidase extension region. This region is found as an extension to a haem peroxidase domain in some fungi. This region is about 80 amino acids in length and forms an extended structure on the surface of the peroxidase domain pfam00141. 72
54611 403190 pfam11896 DUF3416 Domain of unknown function (DUF3416). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 190 amino acids in length. This domain is found associated with pfam00128. 185
54612 403191 pfam11897 DUF3417 Protein of unknown function (DUF3417). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 145 to 860 amino acids in length. This protein is found associated with pfam00343. This protein has a conserved AYF sequence motif. 109
54613 403192 pfam11898 DUF3418 Domain of unknown function (DUF3418). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 582 to 594 amino acids in length. This domain is found associated with pfam07717, pfam00271, pfam04408. 587
54614 403193 pfam11899 DUF3419 Protein of unknown function (DUF3419). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 to 802 amino acids in length. 383
54615 314729 pfam11900 DUF3420 Domain of unknown function (DUF3420). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 50 amino acids in length. This domain is found associated with pfam00023. 47
54616 403194 pfam11901 DUF3421 Protein of unknown function (DUF3421). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 119 to 296 amino acids in length. 114
54617 403195 pfam11902 DUF3422 Protein of unknown function (DUF3422). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 426 to 444 amino acids in length. 415
54618 314732 pfam11903 ParD_like ParD-like antitoxin of type II bacterial toxin-antitoxin system. ParD-like antitoxin is a family of archaeal and bacterial proteins of a type II bacterial toxin-antitoxin system. Many of the cognate toxins for these molecules fall into family ParE-like_toxin, pfam15781. Gene-pairs are expressed from the same operon, the toxin of the pair being expressed first, eg, for UniProtKB:Q3AQ93 and UniProtKB:Q3AQ94. 73
54619 403196 pfam11904 GPCR_chapero_1 GPCR-chaperone. This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2. 300
54620 403197 pfam11905 DUF3425 Domain of unknown function (DUF3425). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 120 to 143 amino acids in length. 128
54621 403198 pfam11906 DUF3426 Protein of unknown function (DUF3426). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 262 to 463 amino acids in length. 147
54622 403199 pfam11907 DUF3427 Domain of unknown function (DUF3427). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 243 to 275 amino acids in length. This domain is found associated with pfam04851, pfam00271. 281
54623 403200 pfam11909 NdhN NADH-quinone oxidoreductase cyanobacterial subunit N. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 subcomplexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of the cyanobacterium Synechocystis 6803. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The cyanobacterial NDH-1 complex contains additional subunits, NdhM and NdhN, compared with the minimal set of the bacterial enzyme and these seem to be specific for thylakoid-located NDH-1 of photosynthetic organisms. 150
54624 403201 pfam11910 NdhO Cyanobacterial and plant NDH-1 subunit O. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 subcomplexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of the cyanobacterium Synechocystis 6803. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The three nuclear-encoded subunits NdhM, NdhN and NdhO are vital for the functional integrity of the plastidial complex. 65
54625 403202 pfam11911 DUF3429 Protein of unknown function (DUF3429). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 147 to 245 amino acids in length. 137
54626 256719 pfam11912 DUF3430 Protein of unknown function (DUF3430). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 209 to 265 amino acids in length. 204
54627 403203 pfam11913 DUF3431 Protein of unknown function (DUF3431). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 291 to 390 amino acids in length. This protein has a conserved NLRC sequence motif. 211
54628 403204 pfam11914 DUF3432 Domain of unknown function (DUF3432). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 100 amino acids in length. This domain is found associated with pfam00096. This domain has two conserved sequence motifs: YPSPV and PSP. 98
54629 403205 pfam11915 DUF3433 Protein of unknown function (DUF3433). This is a family of functionally uncharacterized proteins. The family is found in eukaryotes, and represents the conserved central region of the member proteins. 91
54630 403206 pfam11916 Vac14_Fig4_bd Vacuolar protein 14 C-terminal Fig4p binding. Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. The C-terminal region of Vac14 binds to Fig4p. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system. 179
54631 371796 pfam11917 DUF3435 Protein of unknown function (DUF3435). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 435 to 791 amino acids in length. This family is related to pfam00589 suggesting it may be an integrase enzyme. 418
54632 403207 pfam11918 Peptidase_S41_N N-terminal domain of Peptidase_S41 in eukaryotic IRBP. Peptidase_S41_N is a family found at the N-terminus of the functional unit of interphotoreceptor retinoid binding proteins 3, IRBP, in eukaryotes. From the structure of Structure 1j7x, the domain forms the N-terminal end of the module which is characterized as a serine-peptidase, pfam03572. Peptidase_S41_N forms a three-helix bundle followed by a small beta strand and is termed domain A. Part of the peptidase domain folds back over domain A to create a largely hydrophobic cleft between the two domains. On binding of ligand domain A is structurally rearranged with respect to domain B. 129
54633 403208 pfam11919 DUF3437 Domain of unknown function (DUF3437). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 142 to 163 amino acids in length. 86
54634 403209 pfam11920 DUF3438 Protein of unknown function (DUF3438). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 276 to 307 amino acids in length. 289
54635 288750 pfam11921 DUF3439 Domain of unknown function (DUF3439). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 46 to 94 amino acids in length. This domain is found associated with pfam01462, pfam00560. 122
54636 403210 pfam11922 DUF3440 Domain of unknown function (DUF3440). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 53 to 190 amino acids in length. This domain is found associated with pfam01507. This domain has a conserved KND sequence motif. 181
54637 403211 pfam11923 DUF3441 Domain of unknown function (DUF3441). This presumed domain is functionally uncharacterized. This domain is found in archaea and eukaryotes. This domain is typically between 104 to 119 amino acids in length. This domain is found associated with pfam05833, pfam05670. This domain has two conserved residues (P and G) that may be functionally important. 104
54638 403212 pfam11924 IAT_beta Inverse autotransporter, beta-domain. This is a family of beta-barrel porin-like outer membrane proteins from enteropathogenic Gram-negative bacteria. Intimins and invasins are virulence factors produced by pathogenic Gram-negative bacteria. They carry C-terminal extracellular passenger domains that are involved in adhesion to host cells and N-terminal beta domains that are embedded in the outer membrane. This family represents the beta-barrel porin-like domain in the outer membrane that can be found in intimins, invasins and some inverse autotransporters. 276
54639 314750 pfam11925 DUF3443 Protein of unknown function (DUF3443). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 400 to 434 amino acids in length. This protein has two conserved sequence motifs: NPV and DNNG. 365
54640 403213 pfam11926 DUF3444 Domain of unknown function (DUF3444). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 210 amino acids in length. This domain is found associated with pfam00226. This domain has two conserved sequence motifs: FSH and FSH. 210
54641 403214 pfam11927 DUF3445 Protein of unknown function (DUF3445). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 264 to 418 amino acids in length. This protein has a conserved RLP sequence motif. This protein has two completely conserved R residues that may be functionally important. 231
54642 403215 pfam11928 DUF3446 Domain of unknown function (DUF3446). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 80 to 99 amino acids in length. This domain is found associated with pfam00096. This domain has a single completely conserved residue P that may be functionally important. 76
54643 371804 pfam11929 DUF3447 Domain of unknown function (DUF3447). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 80 amino acids in length. This domain is found associated with pfam00023. This domain has a conserved SHN sequence motif. It seems likely that this region represents divergent Ankyrin repeats. 76
54644 403216 pfam11931 DUF3449 Domain of unknown function (DUF3449). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 181 to 207 amino acids in length. This domain has two conserved sequence motifs: PIP and CEICG. The domain carries a zinc-finger domain of the C2H2-type. 191
54645 403217 pfam11932 DUF3450 Protein of unknown function (DUF3450). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are about 260 amino acids in length. 238
54646 403218 pfam11933 Na_trans_cytopl Cytoplasmic domain of voltage-gated Na+ ion channel. This is a large cytoplasmic domain towards the start of voltage-dependent sodium ion channel proteins in eukaryotes. It is found closely associated with pfam06512 and pfam00520. 205
54647 403219 pfam11934 DUF3452 Domain of unknown function (DUF3452). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 124 to 150 amino acids in length. This domain is found associated with pfam01858, pfam01857. This domain has a single completely conserved residue W that may be functionally important. 131
54648 403220 pfam11935 DUF3453 Domain of unknown function (DUF3453). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 239 to 261 amino acids in length. 217
54649 403221 pfam11936 DUF3454 Domain of unknown function (DUF3454). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 60 amino acids in length. This domain is found associated with pfam00066, pfam00008, pfam06816, pfam07684, pfam00023. 63
54650 314761 pfam11937 DUF3455 Protein of unknown function (DUF3455). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 174 to 251 amino acids in length. 142
54651 403222 pfam11938 DUF3456 TLR4 regulator and MIR-interacting MSAP. This family of proteins, found from plants to humans, is PRAT4 (A and B), a Protein Associated with Toll-like receptor 4. The Toll family of receptors - TLRs - plays an essential role in innate recognition of microbial products, the first line of defense against bacterial infection. PRAT4A influences the subcellular distribution and the strength of TLR responses and alters the relative activity of each TLR. PRAT4B regulates TLR4 trafficking to the cell surface and the extent of its expression there. TLR4 recognizes lipopolysaccharide (LPS), one of the most immuno-stimulatory glycolipids constituting the outer membrane of the Gram-negative bacteria. This family has also been described as a SAP-like MIR-interacting protein family. 137
54652 403223 pfam11939 NiFe-hyd_HybE [NiFe]-hydrogenase assembly, chaperone, HybE. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain. 147
54653 403224 pfam11940 DUF3458 Domain of unknown function (DUF3458) Ig-like fold. This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. The domain has an Ig-like fold. This domain is found associated with pfam01433. 95
54654 403225 pfam11941 DUF3459 Domain of unknown function (DUF3459). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 110 amino acids in length. This domain is found associated with pfam00128, pfam02922. 92
54655 403226 pfam11942 Spt5_N Spt5 transcription elongation factor, acidic N-terminal. This is the very acidic N-terminal region of the early transcription elongation factor Spt5. The Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The actual function of this N-terminal domain is not known although it is dispensable for binding to Spt4. 97
54656 403227 pfam11943 DUF3460 Protein of unknown function (DUF3460). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 70 amino acids in length. This protein has a conserved WDK sequence motif. 58
54657 403228 pfam11944 DUF3461 Protein of unknown function (DUF3461). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length. This protein has two conserved sequence motifs: KFK and HLE. 124
54658 403229 pfam11945 WASH_WAHD WAHD domain of WASH complex. This domain forms part of the WASH-complex of domains and proteins that activates the Arp2/3 complex, see pfam04062. The Arp2/3 complex regulates endocytosis, sorting, and trafficking within the cell. The WAHD domain attaches to the FAM21 proteins via its N-terminal residues and to the microtubules via its C-terminal residues. 286
54659 403230 pfam11946 DUF3463 Domain of unknown function (DUF3463). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 140 amino acids in length. This domain is found associated with pfam04055. This domain has two conserved sequence motifs: CTPWG and PCYL, plus a highly conserved CxxCxxHC motif. 134
54660 403231 pfam11947 DUF3464 Protein of unknown function (DUF3464). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 137 to 196 amino acids in length. 150
54661 403232 pfam11948 DUF3465 Protein of unknown function (DUF3465). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 131 to 151 amino acids in length. This protein has a conserved HWTH sequence motif. 124
54662 403233 pfam11949 DUF3466 Protein of unknown function (DUF3466). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 564 to 612 amino acids in length. 604
54663 403234 pfam11950 DUF3467 Protein of unknown function (DUF3467). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and viruses. Proteins in this family are typically between 101 to 118 amino acids in length. 92
54664 403235 pfam11951 Fungal_trans_2 Fungal specific transcription factor domain. Members of this family are likely to be transcription factors. This protein is found in fungi. Proteins in this family are typically between 454 to 826 amino acids in length. This protein is found associated with pfam00172. 384
54665 403236 pfam11952 XTBD XRN-Two Binding Domain, XTBD. XTBD is a family of eukaryotic proteins that act as an XRN2-binding module. XRN2 is an essential exoribonuclease in eukaryotes that processes and degrades a number of different substrates. XTBD is found on a number of different proteins to link them to XRN, such as PAXT-1. 85
54666 403237 pfam11953 DUF3470 Domain of unknown function (DUF3470). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 50 amino acids in length. This domain is found associated with pfam00037. This domain has a single completely conserved residue N that may be functionally important. 42
54667 403238 pfam11954 DUF3471 Domain of unknown function (DUF3471). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144. 94
54668 403239 pfam11955 PORR Plant organelle RNA recognition domain. This family, which was previously known as DUF860, has been shown to be a component of group II intron ribonucleoprotein particles in maize chloroplasts. The domain is required for the splicing of the introns with which it associates, and promotes splicing in the context of a heterodimer with the RNase III-domain protein RNC1. All of the members are predicted to localize to mitochondria or chloroplasts. It seems likely that most PORR proteins function in organellar RNA metabolism. 328
54669 403240 pfam11956 KCNQC3-Ank-G_bd Ankyrin-G binding motif of KCNQ2-3. Interactions with ankyrin-G are crucial to the localization of voltage-gated sodium channels (VGSCs) at the axon initial segment and for neurons to initiate action potentials. This conserved 9-amino acid motif ((V/A)P(I/L)AXXE(S/D)D) is required for ankyrin-G binding and functions to localize sodium channels to a variety of 'excitable' membrane domains both inside and outside of the nervous system. This motif has also been identified in the potassium channel 6TM proteins KCNQ2 and KCNQ3, that correspond to the M channels that exert a crucial influence over neuronal excitability. KCNQ2/KCNQ3 channels are preferentially localized to the surface of axons both at the axonal initial segment and more distally, and this axonal initial segment targeting of surface KCNQ channels is mediated by these ankyrin-G binding motifs of KCNQ2 and KCNQ3. KCNQ3 is a major determinant of M channel localization to the AIS, rather than KCNQ2. Phylogenetic analysis reveals that anchor motifs evolved sequentially in chordates (NaV channel) and jawed vertebrates (KCNQ2/3). 101
54670 403241 pfam11957 efThoc1 THO complex subunit 1 transcription elongation factor. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1. The THO complex is a nuclear complex that is required for transcription elongation through genes containing tandemly repeated DNA sequences. The THO complex is also part of the TREX (TRanscription EXport) complex that is involved in coupling transcription to export of mRNAs to the cytoplasm. 490
54671 403242 pfam11958 DUF3472 Domain of unknown function (DUF3472). This presumed domain is functionally uncharacterized. This domain is found in bacteria, eukaryotes and viruses. This domain is typically between 174 to 190 amino acids in length. This domain has a single completely conserved residue G that may be functionally important. 173
54672 403243 pfam11959 DUF3473 Domain of unknown function (DUF3473). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 130 amino acids in length. This domain is found associated with pfam01522. This domain has two completely conserved residues (P and H) that may be functionally important. 130
54673 403244 pfam11960 DUF3474 Domain of unknown function (DUF3474). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 126 to 140 amino acids in length. This domain is found associated with pfam00487. 127
54674 403245 pfam11961 DUF3475 Domain of unknown function (DUF3475). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 60 amino acids in length. This domain is found associated with pfam05003. 57
54675 371823 pfam11962 Peptidase_G2 Peptidase_G2, IMC autoproteolytic cleavage domain. This domain is found at the very C-terminus of bacteriophage parallel beta-helical tailspike proteins. It carries the enzymic residues that induce autoproteolytic cleavage to bring about maturation of the folding process of the helix in a chaperone-like manner. The domain thus mediates the assembly of a large tailspike protein and then releases itself after maturation. These C-terminal regions that autoproteolytically release themselves after maturation are exchangeable between functionally unrelated N-terminal proteins and have been identified in a number of bacteriophage tailspike proteins. 221
54676 152398 pfam11963 DUF3477 Protein of unknown function (DUF3477). This family of proteins is functionally uncharacterized. This protein is found in viruses. Proteins in this family are typically between 246 to 7162 amino acids in length. This protein is found associated with pfam08716, pfam01661, pfam05409, pfam08717, pfam01831, pfam08715, pfam08710. 355
54677 403246 pfam11964 SpoIIAA-like SpoIIAA-like. These proteins adopt an alpha/beta SpoIIAA-like fold, similar to that found in STAT (pfam01740). They adopt open and closed conformations arising from different arrangements of their alpha-2 and alpha-3 helices. They may be membrane associated and may function as carriers of non-polar compounds. 104
54678 403247 pfam11965 DUF3479 Domain of unknown function (DUF3479). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is about 160 amino acids in length. This domain is found associated with pfam02514. 159
54679 403248 pfam11966 SSURE Fibronectin-binding repeat. Streptococcal surface repeat domain - SSURE - is a protein fragment found to bind to extracellular matrix protein fibronectin but not to collagen or submaxillary mucin in Streptococci. Anti-SSURE antibodies recognized the corresponding protein on the surface of streptococcal cells. The full-length proteins are thus fibronectin-binding surface adhesins. 149
54680 403249 pfam11967 RecO_N Recombination protein O N terminal. Recombination protein O (RecO) is involved in DNA repair and pfam00470 pathway recombination. This domain forms a beta barrel structure. 80
54681 371825 pfam11968 Bmt2 25S rRNA (adenine(2142)-N(1))-methyltransferase, Bmt2. This entry represents Bmt2 and its homologs. In Saccharomyces cerevisiae, Bmt2 is a nucleolar S-adenosylmethionine-dependent rRNA methyltransferase that is responsible for the N-1-methyl-adenosine base modification of 25S rRNA. It specifically methylates the N1 position of adenine 2142 in 25S rRNA. 221
54682 403250 pfam11969 DcpS_C Scavenger mRNA decapping enzyme C-term binding. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the C-terminal region. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The C-terminal domain contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function. 114
54683 371826 pfam11970 GPR_Gpa2_C G protein-coupled glucose receptor regulating Gpa2 C-term. GPR1 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. The protein contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This family is the conserved C-terminal domain of the member proteins. 76
54684 314792 pfam11971 CAMSAP_CH CAMSAP CH domain. This domain is the N-terminal CH domain from the CAMSAP proteins. 85
54685 403251 pfam11972 HTH_13 HTH DNA binding domain. This is a helix-turn-helix DNA binding domain. 54
54686 403252 pfam11973 NQRA_SLBB NQRA C-terminal domain. This family consists of the C-terminal domain of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration. 51
54687 403253 pfam11974 MG1 Alpha-2-macroglobulin MG1 domain. This is the N-terminal MG1 domain from alpha-2-macroglobulin. 102
54688 403254 pfam11975 Glyco_hydro_4C Family 4 glycosyl hydrolase C-terminal domain. 168
54689 403255 pfam11976 Rad60-SLD Ubiquitin-2 like Rad60 SUMO-like. The small ubiquitin-related modifier SUMO-1 is a Ub/Ubl family member, and although SUMO-1 shares structural similarity to Ub, SUMO's cellular functions remain distinct insomuch as SUMO modification alters protein function through changes in activity, cellular localization, or by protecting substrates from ubiquitination. Rad60 family members contain functionally enigmatic, integral SUMO-like domains (SLDs). Despite their divergence from SUMO, each Rad60 SLD interacts with a subset of SUMO pathway enzymes: SLD2 specifically binds the SUMO E2 conjugating enzyme (Ubc9), whereas SLD1 binds the SUMO E1 (Fub2, also called Uba2) activating and E3 (Pli1, also called Siz1 and Siz2) specificity enzymes. Structural analysis of Structure 2uyz reveals a mechanistic basis for the near-synonymous roles of Rad60 and SUMO in survival of genotoxic stress and suggests unprecedented DNA-damage-response functions for SLDs in regulating SUMOylation. The Rad60 branch of this family is also known as RENi (Rad60-Esc2-Nip45), and biologically it should be two distinct families SUMO and RENi (Rad60-Esc2-Nip45). 72
54690 403256 pfam11977 RNase_Zc3h12a Zc3h12a-like Ribonuclease NYN domain. This domain is found in the Zc3h12a protein which has been shown to be a ribonuclease that controls the stability of a set of inflammatory genes. It has been suggested that this domain belongs to the PIN domain superfamily. This domain has also been identified as part of the NYN domain family. 154
54691 403257 pfam11978 MVP_shoulder Shoulder domain. This domain is found in the Major Vault Protein and has been called the shoulder domain. This family includes two bacterial proteins, suggesting that some bacteria may possess vault particles. 117
54692 403258 pfam11979 DUF3480 Domain of unknown function (DUF3480). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 350 to 362 amino acids in length. This domain is found associated with pfam01363. 353
54693 403259 pfam11980 DUF3481 C-terminal domain of neuropilin glycoprotein. This domain is found in eukaryotes at the C-terminus of neuropilins. It represents the transmembrane region of these transmembrane glycoproteins, that are predominantly co-receptors for another class of proteins known as semaphorins. The domain is found associated with pfam00754, pfam00431, pfam00629. 82
54694 403260 pfam11981 DUF3482 Domain of unknown function (DUF3482). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 289 to 301 amino acids in length. This domain is found associated with pfam01926. The central region of these proteins contains a hydrophobic region that is similar to pfam05433. 290
54695 403261 pfam11982 DUF3483 Domain of unknown function (DUF3483). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 230 amino acids in length. This domain is found associated with pfam02754. 217
54696 403262 pfam11983 DUF3484 Membrane-attachment and polymerization-promoting switch. This family is the C-terminal region of essential streptococcal FtsA proteins and their homologs. It acts as an intra-molecular switch, triggered by ATP, to promote polymerization of the whole protein and to attach it to the membrane. FtsA is essential for the formation of the septum that divides fully-grown cells into two daughter cells at cell-division. FtsA anchors the constricting FtsZ ring to the membrane. 65
54697 403263 pfam11984 DUF3485 Protein of unknown function (DUF3485). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 223 to 526 amino acids in length. This protein is found associated with pfam09721. 195
54698 403264 pfam11985 DUF3486 Protein of unknown function (DUF3486). This family of proteins is functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are about 190 amino acids in length. 179
54699 288812 pfam11986 PB1-F2 Influenza A Proapoptotic protein. PB1-F2 is a protein found in almost all known strains of Influenza A virus - a negative sense ssRNA Orthomyxovirus. It originates from translation of the viral polymerase gene in an alternative reading frame. PB1-F2 consists of two independent structural domains, two closely neighboring short helices at the N-terminus, and an extended C-terminal helix. Although the protein has originally been described to induce apoptosis, it has now been shown that PB1-F2 more likely acts as an apoptosis promoter in concert with other apoptosis-inducing agents. PB1-F2 promotes apoptosis by localising to the mitochondria where it destabilizes the membrane. This will cause release of cytochrome C which activates the caspase cascade of apoptosis through the endogenous pathway. In this way it acts like the Bcl-2 protein family which are physiological apoptotic regulators in cells. 87
54700 403265 pfam11987 IF-2 Translation-initiation factor 2. IF-2 is a translation initiator in each of the three main phylogenetic domains (Eukaryotes, Bacteria and Archaea). IF2 interacts with formylmethionine-tRNA, GTP, IF1, IF3 and both ribosomal subunits. Through these interactions, IF2 promotes the binding of the initiator tRNA to the A site in the smaller ribosomal subunit and catalyzes the hydrolysis of GTP following initiation-complex formation. 105
54701 403266 pfam11988 Dsl1_N Retrograde transport protein Dsl1 N terminal. Dsl1 is a peripheral membrane protein required for transport between the Golgi and the endoplasmic reticulum. It is localized to the ER membrane, and in vitro it specifically binds to coatomer, the major component of the protein coat of COPI vesicles. It is comprised primarily of alpha helical bundles. It complexes with another subunit of the Dsl1p complex called Tip20 which forms heterodimers by pairing the N termini of each protein. A central disorganized region between the N and C termini of Dsl1 contains binding sites for coatomer. The C-terminus of Dsl1 contains a binding site to the Sec39 subunit of the Dsl1p complex. 350
54702 371836 pfam11989 Dsl1_C Retrograde transport protein Dsl1 C terminal. Dsl1 is a peripheral membrane protein required for transport between the Golgi and the endoplasmic reticulum. It is localized to the ER membrane, and in vitro it specifically binds to coatomer, the major component of the protein coat of COPI vesicles. Binding sites for coatomer are found on a disorganized region between the C and N termini of Dsl1. The C terminal domain is involved in binding to the Sec39 subunit of the Dsl1p complex. The N terminal complexes with another subunit of the Dsl1p complex called Tip20 which forms heterodimers by pairing the N termini of each protein. 194
54703 403267 pfam11990 DUF3487 Protein of unknown function (DUF3487). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 121 to 136 amino acids in length. This protein has a conserved RLN sequence motif. 114
54704 403268 pfam11991 Trp_DMAT Tryptophan dimethylallyltransferase. This family of proteins represents tryptophan dimethylallyltransferase (EC:2.5.1.34), which catalyzes the first step of ergot alkaloid biosynthesis. Ergot alkaloids, which are produced by endophyte fungi, can enhance plant host fitness, but also cause livestock toxicosis to host plants. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 390 to 465 amino acids in length. 356
54705 403269 pfam11992 DUF3488 Domain of unknown function (DUF3488). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 323 to 339 amino acids in length. This domain is found associated with pfam01841. This domain has a conserved PLW sequence motif. This domain contains 6 transmembrane helices. 339
54706 403270 pfam11993 Ribosomal_S4Pg Ribosomal S4P (gammaproteobacterial). This family of proteins are ribosomal SSU S4 p proteins. This protein is found in gamma-proteobacteria. Proteins in this family are typically between 162 to 178 amino acids in length. 158
54707 403271 pfam11994 DUF3489 Protein of unknown function (DUF3489). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 84 to 211 amino acids in length. This protein has a single completely conserved residue W that may be functionally important. 68
54708 403272 pfam11995 DUF3490 Domain of unknown function (DUF3490). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 160 amino acids in length. This domain is found associated with pfam00225. This domain has two conserved sequence motifs: EVE and ESA. 161
54709 314813 pfam11996 DUF3491 Protein of unknown function (DUF3491). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 286 to 3225 amino acids in length. This protein is found associated with pfam04488. 946
54710 403273 pfam11997 DUF3492 Domain of unknown function (DUF3492). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 259 to 282 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: GGVS and EHGIY. 278
54711 403274 pfam11998 DUF3493 Protein of unknown function (DUF3493). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 79 to 331 amino acids in length. 73
54712 403275 pfam11999 DUF3494 Protein of unknown function (DUF3494). This family of proteins is functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 243 to 678 amino acids in length. This protein has a single completely conserved residue G that may be functionally important. 202
54713 403276 pfam12000 Glyco_trans_4_3 Glycosyl transferase family 4 group. This domain is found associated with pfam00534. 168
54714 403277 pfam12001 DUF3496 Domain of unknown function (DUF3496). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 110 amino acids in length. 109
54715 403278 pfam12002 MgsA_C MgsA AAA+ ATPase C terminal. The MgsA protein possesses DNA-dependent ATPase and ssDNA annealing activities. MgsA contributes to the recovery of stalled replication forks and therefore prevents genomic instability caused by aberrant DNA replication. Additionally, MgsA may play a role in chromosomal segregation. This is consistent with a report that MgsA co-localizes with the replisome and affects chromosome segregation. This domain represents the C terminal region of MgsA. 158
54716 403279 pfam12004 DUF3498 Domain of unknown function (DUF3498). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 433 to 538 amino acids in length. This domain is found associated with pfam00616, pfam00168. This domain has two conserved sequence motifs: DLQ and PLSFQNP. 465
54717 403280 pfam12005 DUF3499 Protein of unknown function (DUF3499). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 125 to 163 amino acids in length. 119
54718 403281 pfam12006 DUF3500 Protein of unknown function (DUF3500). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 335 to 438 amino acids in length. This protein has a conserved GHH sequence motif. This protein has two completely conserved G residues that may be functionally important. 294
54719 403282 pfam12007 DUF3501 Protein of unknown function (DUF3501). This family of proteins is functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are about 200 amino acids in length. The structure of protein BPSS1837 from B. pseudomallei has been solved. This protein contains two domains, domain I (1:31, 46:81) is a helical domain, domain II (32:45,82-193) is a mainly beta protein with a beta barrel. According to crystal contacts the proteins probably functions as a dimer. The gene neighborhood analysis suggests that this protein may be functionally related to rubrerythrin and ferredoxin. The wedge surface between the two domains might be functionally important. The fold of this protein could best be described as a circularly permuted C2-like fold (details derived from TOPSAN). 187
54720 403283 pfam12008 EcoR124_C Type I restriction and modification enzyme - subunit R C terminal. This enzyme has been characterized and shown to belong to a new family of the type I class of restriction and modification enzymes. This family is involved in bacterial defense by making double strand breaks in specific double stranded DNA sequences, e.g. that of invading bacteriophages. EcoR124 is made up of three subunits, HsdR, HsdS and HsdM. The R subunit has ATPase and restriction endonuclease activity. This domain is the C terminal of the R subunit. 232
54721 403284 pfam12009 Telomerase_RBD Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA. 127
54722 403285 pfam12010 DUF3502 Domain of unknown function (DUF3502). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 140 amino acids in length. This domain is found associated with pfam01547. 131
54723 288834 pfam12011 NPH-II RNA helicase NPH-II. RNA helicase NPH-II or I8 is found in Poxviridae. It is essential for viral replication and plays an important role during transcription of early mRNAs, presumably by preventing R-loop formation behind the elongating RNA polymerase. It acts as NTP-dependent helicase that catalyzes unidirectional unwinding of 3'tailed duplex RNAs. It might also play a role in the export of newly synthesized mRNA chains out of the core into the cytoplasm and is required for propagation of viral particles. 168
54724 403286 pfam12012 DUF3504 Domain of unknown function (DUF3504). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. 163
54725 371847 pfam12013 OrsD Orsellinic acid/F9775 biosynthesis cluster protein D. This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 247 to 1018 amino acids in length. Family members include orsellinic acid/F9775 biosynthesis cluster protein D (orsD) from Emericella nidulans. The orsD gene is part of the cluster that encodes components for the biosynthesis of orsellinic acid, as well as biosynthesis of the cathepsin K inhibitors F9775 A and F9775 B, but the function of orsD is unknown. OrsD contains two segments that are likely to be C2H2 zinc binding domains. 114
54726 403287 pfam12014 DUF3506 Domain of unknown function (DUF3506). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 131 to 148 amino acids in length. This domain has a conserved KLTGD sequence motif. 134
54727 403288 pfam12015 DUF3507 Domain of unknown function (DUF3507). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 180 amino acids in length. This domain has a conserved ENL sequence motif. 182
54728 403289 pfam12016 Stonin2_N Stonin 2. Stonin 2 is involved in clathrin mediated endocytosis. It binds to Eps15 by its highly conserved NPF motif. The complex formed has been shown to directly associate with the clathrin adaptor complex AP-2, and to localize to clathrin-coated pits (CCPs). In addition, stonin2 was recently identified as a specific sorting adaptor for synaptotagmin, and may thus regulate synaptic vesicle recycling. 338
54729 288840 pfam12017 Tnp_P_element Transposase protein. Proteins in this family are transposases found in insects. This region is about 230 amino acids in length and is found associated with pfam05485. 219
54730 403290 pfam12018 FAP206 Domain of unknown function. This domain of about 280 residues is found in eukaryotes. There are two conserved sequence motifs: GFC and GLL. This family is also known as UPF0704. This domain is found in FAP206, a protein associated with cilia and flagella. In the ciliate Tetrahymena, the cilium has radial spokes, each of which is a macromolecular complex essential for motility. A triplet of three radial spokes, RS1, RS2, and RS3, is repeated every 96 nm along the doublet microtubule. Each spoke has a distinct base that docks to the doublet and is linked to different inner dynein arms. Knockout of the FAP206 gene results in slow cell motility and the 96-nm repeats lack RS2 and dynein c. FAP206 is probably part of the front prong and docks RS2 and dynein c to the microtubule. 271
54731 403291 pfam12019 GspH Type II transport protein GspH. GspH is involved in bacterial type II export systems. Like all pilins, GspH has an N-terminus alpha helix. This helix is followed by nine beta strands forming two beta sheets, one of five antiparallel strands and one of four antiparallel strands. GspH is a minor pseudopilin; it is expressed much less than other pseudopilins in the type II secretion pilus (major pilins). The function and localization of minor pseudo-pilins are still to be fully unraveled. It has been suggested that some minor pseudopilins may assemble either into the base or the tip of pili, or both. They function as initiators or regulators of pilus biogenesis and dynamics, and/or as adaptors between various pseudopilin component and other members of the T2SS. 108
54732 403292 pfam12020 TAFA TAFA family. This family of secreted proteins are brain specific and thought to be chemokines. These proteins are found in vertebrates. Proteins in this family are typically between 94 to 133 amino acids in length and contain a number of conserved cysteines. 89
54733 403293 pfam12021 DUF3509 Protein of unknown function (DUF3509). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 92 to 110 amino acids in length. This protein has two completely conserved residues (G and R) that may be functionally important. 87
54734 403294 pfam12022 DUF3510 Domain of unknown function (DUF3510). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 130 amino acids in length. This domain is found associated with pfam06148. 129
54735 403295 pfam12023 DUF3511 Domain of unknown function (DUF3511). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 50 amino acids in length. This domain has two completely conserved residues (Y and K) that may be functionally important. 45
54736 403296 pfam12024 DUF3512 Domain of unknown function (DUF3512). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 231 to 249 amino acids in length. This domain is found associated with pfam00439. 185
54737 288848 pfam12025 Phage_C Phage protein C. This family of phage proteins is functionally uncharacterized. Proteins in this family are typically between 68 to 86 amino acids in length. 68
54738 403297 pfam12026 DUF3513 Domain of unknown function (DUF3513). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 192 to 218 amino acids in length. This domain is found associated with pfam00018, pfam08824. This domain has a conserved QPP sequence motif. 207
54739 314839 pfam12027 DUF3514 Protein of unknown function (DUF3514). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 368 to 823 amino acids in length. 256
54740 403298 pfam12028 DUF3515 Protein of unknown function (DUF3515). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 166 to 214 amino acids in length. This protein has a conserved RCG sequence motif. 159
54741 403299 pfam12029 DUF3516 Domain of unknown function (DUF3516). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 460 to 473 amino acids in length. This domain is found associated with pfam00270, pfam00271. 460
54742 403300 pfam12030 DUF3517 Domain of unknown function (DUF3517). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 340 amino acids in length. This domain is found associated with pfam00443. 408
54743 403301 pfam12031 BAF250_C SWI/SNF-like complex subunit BAF250/Osa. This entry represents the mammalian BAF250a/b and its homolog osa from fruit flies. They are part of the SWI/SNF-like ATP-dependent chromatin remodelling complex that regulates gene expression through regulating nucleosome remodelling. In humans there are two BAF250 isoforms, BAF250a/ARID1a and BAF250b/ARID1b. BAF250a/b may be E3 ubiquitin ligases that target histone H2B. 257
54744 403302 pfam12032 CLIP Regulatory CLIP domain of proteinases. CLIP is a regulatory domain which controls the proteinase action of various proteins of the trypsin family, e.g. easter and pap2. The CLIP domain remains linked to the protease domain after cleavage of a conserved residue which retains the protein in zymogen form. It is named CLIP because it can be drawn in the shape of a paper clip. It has many disulphide bonds and highly conserved cysteine residues, and so it folds extensively. 54
54745 152468 pfam12033 DUF3519 Protein of unknown function (DUF3519). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 117 to 1154 amino acids in length. This protein has a single completely conserved residue Q that may be functionally important. 104
54746 403303 pfam12034 DUF3520 Domain of unknown function (DUF3520). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 180 amino acids in length. This domain is found associated with pfam00092. 182
54747 403304 pfam12036 DUF3522 Protein of unknown function (DUF3522). This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 220 to 787 amino acids in length. This family belongs to the CREST superfamily, which are distant members of the GPCR superfamily. 183
54748 403305 pfam12037 DUF3523 Domain of unknown function (DUF3523). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 257 to 277 amino acids in length. This domain is found associated with pfam00004. This domain has a conserved LER sequence motif. 264
54749 403306 pfam12038 DUF3524 Domain of unknown function (DUF3524). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: HENQ and FNS. This domain has a single completely conserved residue S that may be functionally important. 165
54750 152474 pfam12039 DUF3525 Protein of unknown function (DUF3525). This family of proteins is functionally uncharacterized. This protein is found in viruses. Proteins in this family are about 360 amino acids in length. 404
54751 403307 pfam12040 DUF3526 Domain of unknown function (DUF3526). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 149 to 170 amino acids in length. This domain has a single completely conserved residue P that may be functionally important. 140
54752 403308 pfam12041 DELLA Transcriptional regulator DELLA protein N terminal. Gibberellins are plant hormones which have great impact on growth signalling. DELLA proteins are transcriptional regulators of growth related proteins which are downregulated when gibberellins bind to their receptor GID1. GID1 forms a complex with DELLA proteins and signals them towards 26S proteasome. The N terminal of DELLA proteins contains conserved DELLA and VHYNP motifs which are important for GID1 binding and proteolysis of the DELLA proteins. 68
54753 314852 pfam12042 RP1-2 Tubuliform egg casing silk strands structural domain. Spiders use fibroins to make silk strands. This family includes tubuliform silk fibroins which are used to protect egg cases. This domain is a structural domain which is found in repeats of up to 20 in many individuals (although this is not necessarily the case). RP1 makes up structural domains in the N terminal while RP2 makes up structural domains in the C terminal. 167
54754 403309 pfam12043 DUF3527 Domain of unknown function (DUF3527). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 120 amino acids in length. This domain has a conserved CDCGGWD sequence motif. 350
54755 371862 pfam12044 Metallopep Putative peptidase family. This family of proteins is functionally uncharacterized. However, it does contain an HEXXH motif characteristic of metallopeptidases. This protein is found mainly in fungi. Proteins in this family are typically between 625 to 773 amino acids in length. 425
54756 403310 pfam12045 DUF3528 Protein of unknown function (DUF3528). This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 185 to 298 amino acids in length. This protein is found associated with pfam00046. 141
54757 403311 pfam12046 CCB1 Cofactor assembly of complex C subunit B. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins, CCB1-4. The four CCBs are well conserved between green algae and plants. 166
54758 403312 pfam12047 DNMT1-RFD Cytosine specific DNA methyltransferase replication foci domain. This domain is part of a cytosine specific DNA methyltransferase enzyme. It functions non-catalytically to target the protein towards replication foci. This allows the DNMT1 protein to methylate the correct residues. This domain targets DMAP1 and HDAC2 to the replication foci during the S phase of mitosis. They are thought to have some importance in conversion of critical histone lysine moieties. 141
54759 403313 pfam12048 DUF3530 Protein of unknown function (DUF3530). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 272 to 336 amino acids in length. These proteins are distantly related to alpha/beta hydrolases so they may act as enzymes. 307
54760 403314 pfam12049 DUF3531 Protein of unknown function (DUF3531). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 149 to 199 amino acids in length. 139
54761 403315 pfam12051 DUF3533 Protein of unknown function (DUF3533). This family of transmembrane proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 393 to 772 amino acids in length. 378
54762 403316 pfam12052 VGCC_beta4Aa_N Voltage gated calcium channel subunit beta domain 4Aa N terminal. The beta subunit of voltage gated calcium channels is coded for by four genes 1-4. Gene 4 can produce two types of beta4A domain (beta4Aa and beta4Ab) according to how the gene splicing is carried out. This family is part of the beta4Aa N terminal domain. It is made up of an alpha helix and a beta strand. It is thought to regulate the channel properties through protein-protein interactions with non Ca channel proteins. 42
54763 403317 pfam12053 DUF3534 N-terminal of Par3 and HAL proteins. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length. This eukaryotic domain is found associated with pfam00595. It has a conserved GILD sequence motif. Family members have been found to be essential for cell polarity establishment and maintenance such as Par3 (partitioning defective) and involved in conversion of histidine into ammonia (a crucial step for forming histamine in humans) such as Histidine ammonia lyase (HAL). This N-terminal domain is found to mediate oligomerization critical for the membrane localization of Par-3. It is also found to possess a self-association capacity via a front-to-back mode in Par-3 and HAL proteins. However, unlike the Par-3 N-terminal domain which self-assembles into a left-handed helical filament, the HAL N-terminal domain does not tend to form a helical filament but rather self-assembles into circular oligomeric particles. This has been suggested to be likely due to the absence of equivalent charged residues that are essential for the longitudinal packing of the Par-3 N-terminal domain filament. 82
54764 403318 pfam12054 DUF3535 Domain of unknown function (DUF3535). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 439 to 459 amino acids in length. This domain is found associated with pfam00271, pfam02985, pfam00176. This domain has two completely conserved residues (P and K) that may be functionally important. 445
54765 403319 pfam12055 DUF3536 Domain of unknown function (DUF3536). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 274 to 285 amino acids in length. This domain is found associated with pfam03065. 284
54766 403320 pfam12056 DUF3537 Protein of unknown function (DUF3537). This family of transmembrane proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 427 to 453 amino acids in length. 392
54767 403321 pfam12057 DUF3538 Domain of unknown function (DUF3538). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 120 amino acids in length. This domain is found associated with pfam00240. This domain has a conserved SDL sequence motif. 114
54768 403322 pfam12058 DUF3539 Protein of unknown function (DUF3539). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 90 amino acids in length. This protein has a conserved NHP sequence motif. 86
54769 403323 pfam12059 DUF3540 Protein of unknown function (DUF3540). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 212 to 238 amino acids in length. This protein has a conserved SCL sequence motif. 199
54770 403324 pfam12060 DUF3541 Domain of unknown function (DUF3541). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 230 amino acids in length. 225
54771 403325 pfam12061 NB-LRR Late blight resistance protein R1. R1 is a gene for resistance to late blight, the most destructive disease in potato cultivation worldwide. The R1 gene belongs to the class of plant genes for pathogen resistance that have a leucine zipper motif, a putative nucleotide binding domain and a leucine-rich repeat domain. This protein is found associated with PF00931. 297
54772 403326 pfam12062 HSNSD heparan sulfate-N-deacetylase. This family of proteins consists of heparan sulfate N-deacetylase enzymes. This protein is found in eukaryotes. This enzyme is often found associated with pfam00685. 491
54773 403327 pfam12063 DUF3543 Domain of unknown function (DUF3543). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 217 to 291 amino acids in length. This domain is found associated with pfam00069. This domain has a single completely conserved residue A that may be functionally important. 251
54774 403328 pfam12064 DUF3544 Domain of unknown function (DUF3544). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 198 to 216 amino acids in length. This domain is found associated with pfam00628, pfam01753, pfam00439, pfam00855. 202
54775 403329 pfam12065 DUF3545 Protein of unknown function (DUF3545). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 60 to 77 amino acids in length. This protein has two completely conserved residues (R and L) that may be functionally important. 58
54776 403330 pfam12066 DUF3546 Domain of unknown function (DUF3546). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 93 to 114 amino acids in length. This domain has two completely conserved Y residues that may be functionally important. 110
54777 403331 pfam12067 Sox17_18_mid Sox 17/18 central domain. This is the central region of eukaryotic SOX17 and 18 transcription factor proteins. It lies just downstream of the HMG-box family, pfam00505, and is followed by a C-terminal domain. 49
54778 371881 pfam12068 DUF3548 Domain of unknown function (DUF3548). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 184 to 216 amino acids in length. This domain is found associated with pfam00566. This domain is found at the N-terminus of GYP7 proteins. 170
54779 403332 pfam12069 DUF3549 Protein of unknown function (DUF3549). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 340 amino acids in length. This protein has a conserved LDE sequence motif. 338
54780 403333 pfam12070 SCAI Protein SCAI. SCAI is a transcriptional cofactor and tumor suppressor that suppresses MKL1-induced SRF transcriptional activity. It may function in the RHOA-DIAPH1 signal transduction pathway and regulate cell migration through transcriptional regulation of ITGB1. 520
54781 371883 pfam12071 DUF3551 Protein of unknown function (DUF3551). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 79 to 104 amino acids in length. This protein has a single completely conserved residue C that may be functionally important. 77
54782 403334 pfam12072 DUF3552 Domain of unknown function (DUF3552). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is about 200 amino acids in length. This domain is found associated with pfam00013, pfam01966. This domain has a single completely conserved residue A that may be functionally important. 201
54783 403335 pfam12073 DUF3553 Protein of unknown function (DUF3553). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 60 amino acids in length. This protein has two conserved sequence motifs: GQVQS and TVNF. 48
54784 403336 pfam12074 Gcn1_N Domain of unknown function (DUF3554). This domain is found in the N-terminal region of Gcn1 protein, which acts as a translation activator that mediates translational control by regulating Gcn2 kinase activity. 350
54785 403337 pfam12075 KN_motif KN motif. This small motif is found at the N-terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 to 1202 amino acids in length. This protein is found associated with pfam00023. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerization and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors. 39
54786 403338 pfam12076 Wax2_C WAX2 C-terminal domain. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam04116. This domain has a conserved LEGW sequence motif. This region has similarity to short chain dehydrogenases. 164
54787 403339 pfam12077 DUF3556 Transmembrane protein of unknown function (DUF3556). This family of transmembrane proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 576 to 592 amino acids in length. 573
54788 371889 pfam12078 DUF3557 Domain of unknown function (DUF3557). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length. 154
54789 403340 pfam12079 DUF3558 Protein of unknown function (DUF3558). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 177 to 195 amino acids in length. 172
54790 403341 pfam12080 GldM_C GldM C-terminal domain. This domain is found in bacteria at the C-terminus of the GldM protein. This domain is typically between 169 to 182 amino acids in length. This domain has two completely conserved residues (Y and N) that may be functionally important. GldM is named for the member from Cytophaga johnsonae (Flavobacterium johnsoniae), which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. 176
54791 403342 pfam12081 GldM_N GldM N-terminal domain. This domain is found in bacteria at the N-terminus of the GldM protein. This domain is typically between 169 to 182 amino acids in length. This domain has two completely conserved residues (Y and N) that may be functionally important. GldM is named for the member from Cytophaga johnsonae (Flavobacterium johnsoniae), which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. 178
54792 403343 pfam12083 DUF3560 Domain of unknown function (DUF3560). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 120 amino acids in length. This domain has a conserved GHHSE sequence motif. 124
54793 403344 pfam12084 DUF3561 Protein of unknown function (DUF3561). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length. 107
54794 378806 pfam12085 DUF3562 Protein of unknown function (DUF3562). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 62 to 84 amino acids in length. This protein has two completely conserved residues (A and Y) that may be functionally important. 60
54795 403345 pfam12086 DUF3563 Protein of unknown function (DUF3563). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 50 amino acids in length. This protein has conserved AYL and DLE sequence motifs. 57
54796 403346 pfam12087 DUF3564 Protein of unknown function (DUF3564). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 118 to 142 amino acids in length. This protein has a conserved WSRE sequence motif. 120
54797 403347 pfam12088 DUF3565 Protein of unknown function (DUF3565). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 30 to 78 amino acids in length. This protein has two conserved sequence motifs: WVA and CGH. 58
54798 403348 pfam12089 DUF3566 Transmembrane domain of unknown function (DUF3566). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 136 to 304 amino acids in length. This region represents a transmembrane region found at the C-terminus of the proteins. 118
54799 371892 pfam12090 Spt20 Spt20 family. This presumed domain is found in the Spt20 proteins from both human and yeast. The Spt20 protein is part of the SAGA complex which is a large complex mediating histone deacetylation. Yeast Spt20 has been shown to play a role in structural integrity of the SAGA complex as no intact SAGA could be purified in spt20 deletion strains. 155
54800 403349 pfam12091 DUF3567 Protein of unknown function (DUF3567). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 90 amino acids in length. This protein has a conserved EIVDK sequence motif. 85
54801 403350 pfam12092 DUF3568 Protein of unknown function (DUF3568). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length. 124
54802 152528 pfam12093 Corona_NS8 Coronavirus NS8 protein. This family of proteins is functionally uncharacterized. This protein is found in coronaviruses. Proteins in this family are typically between 39 to 121 amino acids in length. This protein has two conserved sequence motifs: EDPCP and INCQ. 126
54803 378811 pfam12094 DUF3570 Protein of unknown function (DUF3570). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 396 to 444 amino acids in length. 417
54804 403351 pfam12095 CRR7 Protein CHLORORESPIRATORY REDUCTION 7. This entry includes protein from blue-green algae and plants, including CRR7 protein from Arabidopsis. CRR7 is part of the chloroplastic NAD(P)H dehydrogenase complex (NDH Complex) involved in respiration, photosystem I (PSI) cyclic electron transport and CO2 uptake. It is essential for the stable formation of the NDH Complex. 78
54805 403352 pfam12096 DUF3572 Protein of unknown function (DUF3572). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 100 amino acids in length. 81
54806 288914 pfam12097 DUF3573 Protein of unknown function (DUF3573). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 372 to 530 amino acids in length. 383
54807 403353 pfam12098 DUF3574 Protein of unknown function (DUF3574). This family of proteins is functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 144 to 163 amino acids in length. This protein has a conserved TPRF sequence motif. 103
54808 403354 pfam12099 DUF3575 Protein of unknown function (DUF3575). This family of proteins is functionally uncharacterized. This family is only found in bacteria. Proteins in this family are typically between 187 to 236 amino acids in length. 178
54809 403355 pfam12100 DUF3576 Domain of unknown function (DUF3576). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 100 amino acids in length. This domain has a single completely conserved residue G that may be functionally important. 101
54810 403356 pfam12101 DUF3577 Protein of unknown function (DUF3577). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 143 to 307 amino acids in length. 133
54811 403357 pfam12102 DUF3578 Domain of unknown function (DUF3578). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 177 to 191 amino acids in length. 183
54812 403358 pfam12103 Lipl32 Surface lipoprotein of Spirochaetales order. Lipl32 is an outer membrane surface lipoprotein of Leptospira like bacteria. 178
54813 403359 pfam12104 Tcell_CD4_C T cell CD4 receptor C terminal region. This domain is the C terminal domain of the CD4 T cell receptor. The C terminal domain is the cytoplasmic domain which relays the signal for T cell activation. This process involves co-receptor internalisation. This domain is involved in binding to the N terminal of Lck co-receptor in a Zn2+ clasp structure. 28
54814 403360 pfam12105 SpoU_methylas_C SpoU, rRNA methylase, C-terminal. This domain is found in bacteria. This domain is about 60 amino acids in length. This domain is found in association with pfam00588. This domain has a conserved LFE sequence motif. Some members of the Pfam family SpoU_methylase, pfam00588, carry this very distinctive sequence region at their extreme C-terminus. The exact function of this region is not known. 54
54815 314909 pfam12106 Colicin_E5 Colicin E5 ribonuclease domain. Colicin is a protein produced by bacteria with Col plasmids. Its function is to attack E. coli through actions on its inner membrane ion channels or through ribonuclease or deoxyribonuclease actions. The C terminal domain is the ribonuclease domain. It specifically cleaves tRNA anticodons which recognize codons in the form NAY (N:any nucleotide, A:adenosine, Y:pyrimidine) which corresponds to Tyrosine, Histidine, Asparagine and Aspartic Acid. E5-CRD can be referred to as an RNA restriction enzyme that specifically recognizes and cleaves single-stranded GU sequences. 88
54816 152542 pfam12107 VEK-30 Plasminogen (Pg) ligand in fibrinolytic pathway. Pg is an important mediator of angiostatin production in the fibrinolytic pathway. Pg is made up of five subunit kringle molecules (Pg-K1 to Pg-K5), of which the first three make the protein angiostatin. VEK-30 is a domain of the group A streptococcal protein PAM. It binds to Pg-K2 of angiostatin and activates the molecule to mediate its anti-angiogenic effects. VEK-30 binds to angiostatin via a C terminal lysine with argininyl and glutamyl side chain residues known as a 'through space isostere'. 17
54817 403361 pfam12108 SF3a60_bindingd Splicing factor SF3a60 binding domain. This domain is found in eukaryotes. This domain is about 30 amino acids in length. This domain has a single completely conserved residue Y that may be functionally important. SF3a60 makes up the SF3a complex with SF3a66 and SF3a120. This domain is the binding site of SF3a60 for SF3a120. The SF3a complex is part of the spliceosome, a protein complex involved in splicing mRNA after transcription. 27
54818 403362 pfam12109 CXCR4_N CXCR4 Chemokine receptor N terminal. CXCR4 and its ligand stromal cell-derived factor-1 (a.k.a. CXCL12) are essential for proper fetal development. CXCR4 is also the major coreceptor for T-tropic strains of human immunodeficiency virus 1 (HIV-1), and SDF-1 inhibits HIV-1 infection. Additionally, SDF-1 and CXCR4 mediate cancer cell migration and metastasis. The N terminal domain of most chemokine receptors is the ligand binding domain and so the N terminal domain of CXCR4 is the binding site for SDF-1. 33
54819 403363 pfam12110 Nup96 Nuclear protein 96. Nup96 (often known by the name of its yeast homolog Nup145C) is part of the Nup84 heptameric complex in the nuclear pore complex. Nup96 complexes with Sec13 in the middle of the heptamer. The function of the heptamer is to coat the curvature of the nuclear pore complex between the inner and outer nuclear membranes. Nup96 is predicted to be an alpha helical solenoid. The interaction between Nup96 and Sec13 is the point of curvature in the heptameric complex. 287
54820 403364 pfam12111 PNPase_C Polyribonucleotide phosphorylase C terminal. PNPase regulates the expression of small non-coding RNAs that control expression of outer-membrane proteins. The enzyme also affects complex processes, such as the tissue-invasive virulence of Salmonella enterica and the regulation of a virulence-factor secretion system in Yersinia. In Escherichia coli, PNPase is involved in the quality control of ribosomal RNA precursors and is required for growth following cold shock. This family contains the C terminal protomer domain of the PNPase core. The function of the C terminal protomer is to catalyze phosphorolysis through its two active sites. 37
54821 403365 pfam12112 DUF3579 Protein of unknown function (DUF3579). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 98 to 126 amino acids in length. This protein has a conserved FRP sequence motif. 87
54822 288929 pfam12113 SVM_signal SVM protein signal sequence. This region is presumed to be a signal peptide sequence found in Sequence-variable mosaic (SVM) proteins. This domain is found in phytoplasmas. This presumed signal sequence is about 30 amino acids in length. 33
54823 403366 pfam12114 Period_C Period protein 2/3 C-terminal region. This domain is found in eukaryotes. This domain is typically between 164 to 200 amino acids in length. This domain is found associated with pfam08447. 195
54824 314916 pfam12115 Salp15 Salivary protein of 15kDa inhibits CD4+ T cell activation. This is a family of 15kDa salivary proteins from Acari Arachnids that is induced on feeding and assists the parasite to remain attached to its arthropod host. By repressing calcium fluxes triggered by TCR engagement, Salp15 inhibits CD4+ T cell activation. Salp15 shows weak similarity to Inhibin A, a member of the TGF-beta superfamily that inhibits the production of cytokines and the proliferation of T cells. 112
54825 403367 pfam12116 SpoIIID Stage III sporulation protein D. This stage III sporulation protein is a small DNA-binding family that is essential for gene expression of the mother-cell compartment during sporulation. The domain is found in bacteria and viruses, and is about 40 amino acids in length. It has a conserved RGG sequence motif. 82
54826 288933 pfam12117 DUF3580 Protein of unknown function (DUF3580). This domain is found in viruses, and is about 120 amino acids in length. It is found in association with pfam01057. 121
54827 288934 pfam12118 SprA-related SprA-related family. This family of bacterial proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. Proteins in this family are typically between 234 to 465 amino acids in length. Most members are annotated as being SprA-related. 310
54828 403368 pfam12119 DUF3581 Protein of unknown function (DUF3581). This protein is found in bacteria. Proteins in this family are about 240 amino acids in length. 217
54829 403369 pfam12120 Arr-ms Rifampin ADP-ribosyl transferase. This protein is found in bacteria. Proteins in this family are typically between 136 to 150 amino acids in length. The opportunistic pathogen Mycobacterium smegmatis is resistant to rifampin because of the presence of a chromosomally encoded rifampin ADP-ribosyltransferase (Arr-ms). Arr-ms is a small enzyme whose activity thus renders rifamycin antibiotics ineffective. 99
54830 403370 pfam12121 DD_K Dermaseptin. This protein is found in eukaryotes. Proteins in this family are typically between 30 to 76 amino acids in length. This protein is found associated with pfam03032. This domain is part of a dermaseptin protein which is used as an antimicrobial agent. The full protein is almost completely defined in an alpha helical domain. It creates high levels of disorder at the level of the phospholipid head group of bacterial membranes suggesting that it partitions into the bilayer where it severely disrupts membrane packing. 23
54831 403371 pfam12122 Rhomboid_N Cytoplasmic N-terminal domain of rhomboid serine protease. Rhomboid_N is the N-terminal cytoplasmic domain of the rhomboid intra-membraneous serine protease, otherwise known as Peptidase_S54, pfam01694. This N-terminal domain has similarity to other GlnB-like domains, some of which appear to have a binding role, eg to peptidoglycan. It is not clear exactly what the function of this domain is in the protease, but its presence is critical for maintaining a catalytically competent state for the protein. 86
54832 338253 pfam12123 Amidase02_C N-acetylmuramoyl-l-alanine amidase. This domain is found in bacteria and viruses. This domain is about 50 amino acids in length. This domain is classified with the enzyme classification code EC:3.5.1.28. This domain is the C terminal of the enzyme which hydrolyzes the link between N-acetylmuramoyl residues and L-amino acid residues in certain cell-wall glycopeptides. 44
54833 288939 pfam12124 Nsp3_PL2pro Coronavirus polyprotein cleavage domain. This domain is found in SARS coronaviruses, and is about 70 amino acids in length. It is found associated with various other coronavirus proteins due to the polyprotein nature of most viral translation. PL2pro is a domain of the non-structural protein nsp3. The domain performs three of the cleavages required to separate the translated polyprotein into its distinct proteins. 66
54834 403372 pfam12125 Beta-TrCP_D D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with pfam00646, pfam00400. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerization of the protein. Dimerization is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation. 39
54835 403373 pfam12126 DUF3583 Protein of unknown function (DUF3583). This domain is found in eukaryotes, and is typically between 302 and 338 amino acids in length. It is found in association with pfam00097 and pfam00643. Most members are promyelocytic leukemia proteins, and this family lies towards the C-terminus. 329
54836 403374 pfam12127 YdfA_immunity SigmaW regulon antibacterial. This protein is found in bacteria. Proteins in this family are about 330 amino acids in length. The operon from which this protein is derived confers immunity for the host species to a broad range of antibacterial compounds, unlike the specific immunity proteins that are linked to and co-regulated with their antibiotic-synthesis proteins. 314
54837 403375 pfam12128 DUF3584 Protein of unknown function (DUF3584). This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 943 to 1234 amino acids in length. This family contains a P-loop motif suggesting it is a nucleotide binding protein. It may be involved in replication. 1186
54838 403376 pfam12129 Phtf-FEM1B_bdg Male germ-cell putative homeodomain transcription factor. This domain is found in bacteria and eukaryotes, and is typically between 101 and 140 amino acids in length. Phtf proteins do not display any sequence similarity to known or predicted proteins, but their conservation among species suggests an essential function. The 84 kDa Phtf1 protein is an integral membrane protein, anchored to a cell membrane by six to eight trans-membrane domains, that is associated with a domain of the endoplasmic reticulum (ER) juxtaposed to the Golgi apparatus. It is present during meiosis and spermiogenesis, and, by the end of spermiogenesis, is released from the mature spermatozoon within the residual bodies. Phtf1 enhances the binding of FEM1B (feminisation homolog 1B) to cell membranes. Fem-1 was initially identified in the signaling pathway for sex determination, as well as being implicated in apoptosis, but its biochemical role is still unclear, and neither FEM1B nor PHTF1 is directly implicated in apoptosis in spermatogenesis. It is the ANK domain of FEM1B that is necessary for the interaction with the N-terminal region of Phtf1. 154
54839 403377 pfam12130 DUF3585 Protein of unknown function (DUF3585). This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with pfam00307. 129
54840 403378 pfam12131 DUF3586 Protein of unknown function (DUF3586). This domain is found in eukaryotes. This domain is about 80 amino acids in length and is found associated with pfam08246, and pfam00112. 75
54841 371912 pfam12132 DUF3587 Protein of unknown function (DUF3587). This protein is found in viruses. Proteins in this family are typically between 209 and 248 amino acids in length. 201
54842 403379 pfam12133 Sars6 Open reading frame 6 from SARS coronavirus. This family is found in Coronaviruses. Proteins in this family are typically between 42 to 63 amino acids in length. 62
54843 403380 pfam12134 PRP8_domainIV PRP8 domain IV core. This domain is found in eukaryotes, and is about 20 amino acids in length. It is found associated with pfam10597, pfam10596, pfam10598, pfam08083, pfam08082, pfam01398, pfam08084. There is a conserved LILR sequence motif. The domain is a selenomethionine domain in a subunit of the spliceosome. The function of PRP8 domain IV is believed to be interaction with the spliceosomal core. 230
54844 403381 pfam12135 Sialidase_penC Sialidase enzyme penultimate C terminal domain. This domain is found in bacteria and eukaryotes, and is about 30 amino acids in length. The protein from which this domain is found is a sialidase enzyme which is used by virulent bacteria as a toxin. It is the penultimate C terminal domain. 25
54845 152571 pfam12136 RNA_pol_Rpo13 RNA polymerase Rpo13 subunit HTH domain. This domain is found in archaea, and is about 40 amino acids in length. It has a single completely conserved residue E that may be functionally important. It is found in the archaeal DNA dependent RNA polymerase. The domain is a 'helix-turn-helix' (HTH) domain in the Rpo13 subunit of the RNA polymerase. This domain is involved in downstream DNA binding, and the entire subunit has also been implicated in contacting transcription factor II B. 40
54846 403382 pfam12137 RapA_C RNA polymerase recycling family C-terminal. This domain is found in bacteria. This domain is about 360 amino acids in length. This domain is found associated with pfam00271, pfam00176. The function of this domain is not known, but structurally it forms an alpha-beta fold in nature with a central beta-sheet flanked by helices and loops, the beta-sheet being mainly antiparallel and flanked by four alpha helices, among which the two longer helices exhibit a coiled-coil arrangement. 360
54847 403383 pfam12138 Spherulin4 Spherulation-specific family 4. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 250 and 398 amino acids in length. There is a conserved NPG sequence motif and there are two completely conserved G residues that may be functionally important. Starvation will often induce spherulation - the production of spores - and this process may involve DNA-methylation. Changes in the methylation of spherulin4 are associated with the formation of spherules, but these changes are probably transient. Methylation of the gene accompanies its transcriptional activation, and spherulin4 mRNA is only detectable in late spherulating cultures and mature spherules. It is a spherulation-specific protein. 238
54848 378819 pfam12139 APS-reductase_C Adenosine-5'-phosphosulfate reductase beta subunit. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 112 to 142 amino acids in length. This family is found in association with pfam00037, and has a conserved FPIRTT sequence motif. The whole beta subunit has the enzymic properties of EC:1.8.99.2. 82
54849 403384 pfam12140 SLED SLED domain. The SLED (Scm-Like Embedded Domain) domain is a double-stranded DNA binding domain found in Scml2 which is a member of the Polycomb group of proteins involved in epigenetic gene silencing. 112
54850 403385 pfam12141 DUF3589 Protein of unknown function (DUF3589). This family of proteins is found in eukaryotes. Proteins in this family are typically between 541 and 717 amino acids in length. The function of this family is not known. 485
54851 403386 pfam12142 PPO1_DWL Polyphenol oxidase middle domain. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length, and the family is found in association with pfam00264. Most members are annotated as being polyphenol oxidases, and many are from plants or plastids. There is a conserved DWL sequence motif which gives the family its name. 52
54852 371919 pfam12143 PPO1_KFDV Protein of unknown function (DUF_B2219). This domain family is found in eukaryotes, and is typically between 138 and 152 amino acids in length, and the family is found in association with pfam00264. Many members are plant or plastid polyphenol oxidases, and there is a highly conserved sequence motif: KFDV, from which the name derives. This is the C-terminal domain of these oxidases. 130
54853 403387 pfam12144 Med12-PQL Eukaryotic Mediator 12 catenin-binding domain. This domain is found in eukaryotes, and is typically between 325 and 354 amino acids in length. Both development and carcinogenesis are driven by signal transduction within the canonical Wnt/beta-catenin pathway through both programmed and unprogrammed changes in gene transcription. Beta-catenin physically and functionally targets this PQL (proline-, glutamine-, leucine-rich) region of the Med12 subunit of Mediator to activate transcription. The beta-catenin transactivation domain binds directly to isolated Med12 and intact Mediator both in vitro and in vivo, and Mediator is recruited to Wnt-responsive genes in a beta-catenin-dependent manner. 209
54854 403388 pfam12145 Med12-LCEWAV Eukaryotic Mediator 12 subunit domain. This domain is found in eukaryotes, and is typically between 325 and 354 amino acids in length. The function of this particular region of the Mediator subunit Med12 is not known, but there is a conserved sequence motif: LCEWAV, from which the name derives. 466
54855 403389 pfam12146 Hydrolase_4 Serine aminopeptidase, S33. This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. The majority of the members in this family carry the exopeptidase active-site residues of Ser-122, Asp-239 and His-269 as in UniProtKB:Q7ZWC2. 237
54856 403390 pfam12147 Methyltransf_20 Putative methyltransferase. This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. The family shows homology to methyltransferases. 309
54857 403391 pfam12148 TTD Tandem tudor domain within UHRF1. TTD, tandem tudor domain within UHRF1 preferentially binds H3 histone tails trimethylated at Lys-9. It specifically recognizes H3 tail peptides with the heterochromatin-associated modification state of trimethylated lysine 9 and unmodified lysine 4 (H3K4me0/K9me3). This domain is found in eukaryotes and is found in association with pfam00097, pfam02182, pfam00628, pfam00240. 154
54858 403392 pfam12149 HSV_VP16_C Herpes simplex virus virion protein 16 C terminal. This domain is found in viruses, and is about 30 amino acids in length. It is found in association with pfam02232. This domain is the C terminal of the HSV virion protein 16. This protein is a transcription promoter. The C terminal domain is the carboxyl subdomain of the acidic transcriptional activation domain. The protein binds to DNA binding proteins to carry out its function. Such proteins include TATA binding protein, CBP, TBP-binding protein, etc. 26
54859 403393 pfam12150 MFP2b Cytosolic motility protein. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. These proteins are found in nematodes. They complex with MSP (major sperm protein) to allow motility. Their action is quite similar to the action of bacterial actin molecules. 343
54860 314942 pfam12151 MVL Mannan-binding protein. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a single completely conserved residue G that may be functionally important. The domain occurs in two types of proteins. In mannan binding proteins, it forms a homodimeric molecule which complexes into a homo-octamer. In thiamidases it occurs without repeats but in the presence of other domains. MVL is distinct amongst other oligomannoside binding proteins in that it exhibits specificity for certain tetrasaccharides. Each molecule of MVL has four distinct carbohydrate binding sites. 36
54861 403394 pfam12152 eIF_4G1 Eukaryotic translation initiation factor 4G1. This domain is found in eukaryotes, and is about 80 amino acids in length. It is found in association with pfam02854. This domain is part of the protein eIF_4G. It binds to eIF_4E by wrapping around its N terminal to form the eIF_4F complex. This complex binds various eIF_4E-BPs (binding proteins) to regulate initiation of translation. 60
54862 403395 pfam12153 CAP18_C LPS binding domain of CAP18 (C terminal). This domain family is found in eukaryotes, and is approximately 30 amino acids in length, and the family is found in association with pfam00666. CAP18 is a protein which is derived from rabbit granulocytes. It has two domains, an N terminal DUF and a C terminal Gram negative LPS binding domain. This domain is the C terminal domain. 27
54863 152589 pfam12154 HCMVantigenic_N Glycoprotein B N-terminal antigenic domain of HCMV. This domain is found in viruses, and is approximately 40 amino acids in length. The domain is found in association with pfam00606. There are two conserved sequence motifs: SVS and TSS. This family is the amino-terminal antigenic domain of glycoprotein B of human cytomegalovirus. 36
54864 152590 pfam12155 NADHdh-2_N NADH dehydrogenase subunit 2 N-terminal. This domain is found in eukaryotes, and is approximately 90 amino acids in length. It is found associated with pfam00361. All members are annotated as being NADH dehydrogenase subunit 2, and this region is the N-terminus. 88
54865 403396 pfam12156 ATPase-cat_bd Putative metal-binding domain of cation transport ATPase. This domain is found in bacteria, and is approximately 90 amino acids in length. It is found associated with pfam00403, pfam00122, pfam00702. The cysteine-rich nature and composition suggest this might be a cation-binding domain; most members are annotated as being cation transport ATPases. 86
54866 403397 pfam12157 DUF3591 Protein of unknown function (DUF3591). This domain is found in eukaryotes and is typically between 445 to 462 amino acids in length. Most members are annotated as being transcription initiation factor TFIID subunit 1, and this region is the conserved central portion of these proteins. 455
54867 403398 pfam12158 DUF3592 Protein of unknown function (DUF3592). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 150 and 242 amino acids in length. 138
54868 403399 pfam12159 DUF3593 Protein of unknown function (DUF3593). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 98 and 228 amino acids in length. There is a conserved LHG sequence motif. 88
54869 403400 pfam12160 Fibrinogen_aC Fibrinogen alpha C domain. This domain family is found in eukaryotes, and is approximately 70 amino acids in length, and the family is found in association with pfam08702. This domain is the C terminal domain of fibrinogen in mammals. The domain lies in the C terminal half of the alpha C region in these proteins. The function of the domain is that of intramolecular and intermolecular interactions to form fibrin. 68
54870 403401 pfam12161 HsdM_N HsdM N-terminal domain. This domain is found at the N-terminus of the methylase subunit of Type I DNA methyltransferases. This domain family is found in bacteria and archaea, and is typically between 123 and 138 amino acids in length. The family is found in association with pfam02384. Mutations in this region of EcoKI methyltransferase abolish the normally strong preference of this system for methylating hemimethylated substrate. The structure of this domain has been shown to be all alpha-helical. 123
54871 371931 pfam12162 STAT1_TAZ2bind STAT1 TAZ2 binding domain. This domain family is found in eukaryotes, and is approximately 20 amino acids in length, and the family is found in association with pfam02865, pfam00017, pfam01017, pfam02864. This domain is the C terminal domain of STAT1. This domain binds selectively to the TAZ2 domain of CRB (CREB-binding protein). In this process it becomes a transcriptional activator and can initiate transcription of certain genes. 25
54872 403402 pfam12163 HobA DNA replication regulator. This family of proteins is found exclusively in epsilon-proteobacteria. Proteins in this family are approximately 180 amino acids in length. The structure of HobA is a modified Rossmann fold consisting of a five-stranded parallel beta-sheet (beta1-5) flanked on one side by alpha-2, alpha-3 and alpha-6 helices and alpha-4 and alpha-5 on the other. The alpha-1 helix is extended away from and has minimal interaction with the globular part of the protein. Four monomers interact to form a tetrameric molecule. Four calcium atoms bind to the tetramer and these binding sites may have functional relevance. The function of HobA is to regulate DNA replication and it does this by binding to DNA-A, but the exact mechanism of how this regulation occurs is purely speculative. 180
54873 403403 pfam12164 SporV_AA Stage V sporulation protein AA. This domain family is found in bacteria - primarily Firmicutes, and is approximately 90 amino acids in length. There is a single completely conserved residue G that may be functionally important. Most annotation associated with this domain suggests that it is involved in the fifth stage of sporulation, however there is little publication to back this up. 89
54874 403404 pfam12165 Alfin Alfin. The Alfin family includes PHD finger protein Alfin1 and Alfin1-like proteins. Alfin1 is a histone-binding component that specifically recognizes H3 tails trimethylated on 'Lys-4' (H3K4me3), which marks transcription start sites of virtually all active genes. 126
54875 403405 pfam12166 Piezo_RRas_bdg Piezo non-specific cation channel, R-Ras-binding domain. This is an extracellular domain at the C-terminus of Piezo, or FAM38 mechanosensitive non-specific cation channel proteins. It seems likely that this region of the Piezo proteins may be responsible for R-Ras recruitment because this region is capable of relocalising R-Ras to the ER in eukaryotes. 419
54876 403406 pfam12167 Arm-DNA-bind_2 Arm DNA-binding domain. This domain is found at the N-terminus of various phage integrases. The domain binds to DNA. 65
54877 403407 pfam12168 DNA_pol3_tau_4 DNA polymerase III subunits tau domain IV DnaB-binding. This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau. 82
54878 403408 pfam12169 DNA_pol3_gamma3 DNA polymerase III subunits gamma and tau domain III. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau. 143
54879 403409 pfam12170 DNA_pol3_tau_5 DNA polymerase III tau subunit V interacting with alpha. This domain family is found in bacteria, and is approximately 140 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau. The extreme C-terminal region of this domain 5 is the part which interacts with the alpha subunit of the DNA polymerase III holoenzyme. 142
54880 403410 pfam12171 zf-C2H2_jaz Zinc-finger double-stranded RNA-binding. This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localize in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localization. 26
54881 403411 pfam12172 DUF35_N Rubredoxin-like zinc ribbon domain (DUF35_N). This domain has no known function and is found in conserved hypothetical archaeal and bacterial proteins. The domain is duplicated in Mycobacterium tuberculosis Rv3521. The structure of a DUF35 representative reveals two long N-terminal helices followed by a rubredoxin-like zinc ribbon domain represented in this family and a C-terminal OB fold domain. Zinc is chelated by the four conserved cysteines in the alignment. 37
54882 152608 pfam12173 BacteriocIIc_cy Bacteriocin class IIc cyclic gassericin A-like. This class of bacteriocins was previously described as class V. The members include gassericin A, acidocin B and butyrovibriocin AR10, all of which are hydrophobic cyclical structures. The N- and C-termini are covalently linked, and the circular molecule is resistant to several proteases and peptidases. The immunity protein that protects Lactobacillus gasseri from the toxic effects of its bacteriocin, gassericin A, has been identified. It is found to be a small positively-charged hydrophobic peptide of 53 amino acids containing a putative transmembrane segment - a structure unlike that of the more common immunity proteins as found in pfam08951. 91
54883 403412 pfam12174 RST RCD1-SRO-TAF4 (RST) plant domain. This domain is found in plant RCD1, SRO and TAF4 proteins, hence its name of RST. It is required for interaction with multiple plant transcription factors. Radical-Induced Cell Death1 (RCD1) is an important regulator of stress and hormonal and developmental responses in Arabidopsis thaliana, as is its closest homolog, SRO1 - Similar To RCD-One1. TBP-Associated Factor 4 (TAF4) and TAF4-b are components of the transcription initiation factor complex TFIID. 67
54884 403413 pfam12175 WSS_VP White spot syndrome virus structural envelope protein VP. This family of proteins is found in viruses. Proteins in this family are approximately 210 amino acids in length. There is a conserved NNT sequence motif. These proteins are structural envelope proteins in viruses. This is the beta barrel C terminal domain. There is a protruding N terminal domain which completes the proteins. Three of four envelope proteins in white spot syndrome virus share sequence homology with each other and are present in this family - VP24, VP26 and VP28. VP19 is the other major envelope protein but shares no sequence homology with the other proteins. These proteins are essential for entry into cells of the crustacean host. 201
54885 403414 pfam12176 MtaB Methanol-cobalamin methyltransferase B subunit. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 460 amino acids in length. MtaB folds as a TIM barrel and contains a novel zinc-binding motif. Zinc(II) lies at the bottom of a funnel formed at the C-terminal beta-barrel end and ligates to two cysteinyl sulfurs (Cys-220 and Cys-269) and one carboxylate oxygen (Glu-164). The function of this protein is to catalyze the cleavage of the C-O bond in methanol by an SN2 mechanism. It complexes with MtaA and MtaC to perform this function. 459
54886 403415 pfam12177 Proho_convert Prohormone convertase enzyme. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam01483, pfam00082. There are two completely conserved residues (Y and D) that may be functionally important. This protein is the C terminal domain of a prohormone convertase enzyme which targets hormones in dense core secretory granules. This C terminal tail domain is the domain responsible for targeting these dense core secretory granules. The domain adopts an alpha helical structure. 39
54887 403416 pfam12178 INCENP_N Chromosome passenger complex (CPC) protein INCENP N terminal. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. INCENP is a regulatory protein in the chromosome passenger complex. It is involved in regulation of the catalytic protein Aurora B. It performs this function in association with two other proteins - Survivin and Borealin. These proteins form a tight three-helical bundle. The N terminal domain is the domain involved in formation of this three helical bundle. 33
54888 403417 pfam12179 IKKbetaNEMObind I-kappa-kinase-beta NEMO binding domain. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. These proteins are involved in inflammatory reactions. They cause release of NF-kappa-B into the nucleus of inflammatory cells and upregulation of transcription of proinflammatory cytokines. They perform this function by phosphorylating I-kappa-B proteins which are targeted for degradation to release NF-kappa-B. This kinase (I-kappa-kinase-beta) is found in association with IKK-alpha and NEMO (NF-kappa-B essential modulator). This domain is the binding site of IKK-beta for NEMO. 35
54889 403418 pfam12180 EABR TSG101 and ALIX binding domain of CEP55. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. This domain is the active domain of CEP55. CEP55 is a protein involved in cytokinesis, specifically in abscission of the plasma membrane at the midbody. To perform this function, CEP55 complexes with ESCRT-I (by a Proline rich sequence in its TSG101 domain) and ALIX. This is the domain on CEP55 which binds to both TSG101 and ALIX. It also acts as a hinge between the N and C termini. This domain is called EABR. 34
54890 403419 pfam12181 MogR_DNAbind DNA binding domain of the motility gene repressor (MogR). This domain family is found in bacteria, and is approximately 150 amino acids in length. MogR is involved in repression of transcription of the flagellar gene in Listeria bacteria. This allows a phenotypical switch from an extracellular bacterium to an intracellular pathogen. MogR binds AT rich flagellar gene promoter regions upstream of the flagellar gene. These regions follow the pattern 5'-TTTTNNNNNAAAA-3'. This domain is the DNA binding domain of MogR. 151
54891 403420 pfam12182 DUF3642 Bacterial lipoprotein. This domain family is found in bacteria, and is approximately 60 amino acids in length. There is a single completely conserved Y residue that may be functionally important. This domain is from a bacterial lipoprotein, a major virulence factor in Gram negative bacteria. 83
54892 403421 pfam12183 NotI Restriction endonuclease NotI. This family of proteins is found in bacteria. Proteins in this family are typically between 270 and 341 amino acids in length. There is a conserved CPF sequence motif. The type IIP restriction enzyme, NotI, is a homodimer that recognizes the 8 bp DNA sequence 5'-GC/GGCCGC-3' and cleaves both strands of DNA to create 5', 4 base cohesive overhangs. 232
54893 403422 pfam12185 IR1-M Nup358/RanBP2 E3 ligase domain. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00638, pfam00641, pfam00160. There are two conserved sequence motifs: TFFC and EDF. Nup358/RanBP2 is a nucleoporin involved in ubiquitination of many different protein targets from various cellular pathways. It complexes with Ubc9, SUMO-1 and RanGAP1 to perform this function. This is the ligase domain which binds to Ubc9. 59
54894 403423 pfam12186 AcylCoA_dehyd_C Acyl-CoA dehydrogenase C terminal. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam02770, pfam00441, pfam02771. There is a conserved ARRL sequence motif. The C terminal domain is an alpha helical domain. The flavin ring of Acyl-CoA dehydrogenase is buried in the crevice between the two alpha helical domains and the beta-sheet domain of one subunit, and the adenosine pyrophosphate moiety is stretched into the subunit junction of a neighboring subunit, composed of two C terminal domains. 111
54895 288997 pfam12187 VirArc_Nuclease Viral/Archaeal nuclease. This family of proteins is found in archaea and viruses. Proteins in this family are typically between 211 and 244 amino acids in length. These proteins are nucleases from fusseloviruses and sulfolobus archaea. 149
54896 371951 pfam12188 STAT2_C Signal transducer and activator of transcription 2 C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam02865, pfam00017, pfam01017, pfam02864. There is a conserved DLP sequence motif. STATs are involved in transcriptional regulation and are the only regulators known to be modulated by tyrosine phosphorylation. STAT2 forms a trimeric complex with STAT1 and IRF-9 (Interferon Regulatory Factor 9), on activation of the cell by interferon, which is called ISGF3 (Interferon-stimulated gene factor 3). The C terminal domain of STAT2 contains a nuclear export signal (NES) which allows export of STAT2 into the cytoplasm along with any complexed molecules. 53
54897 288999 pfam12189 VirE1 Single-strand DNA-binding protein. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved IELE sequence motif. VirE1 is an acidic chaperone protein which binds to VirE2, a ssDNA binding protein. These proteins are virulence factors of the plant pathogens Agrobacteria. VirE1 competes for the ssDNA binding site of VirE2. 63
54898 371952 pfam12190 amfpi-1 Fungal protease inhibitor. This protein family is found in eukaryotes, and is approximately 50 amino acids in length. These proteins are fungal protease inhibitors. 91
54899 403424 pfam12191 stn_TNFRSF12A tumor necrosis factor receptor stn_TNFRSF12A_TNFR domain. This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 184 amino acids in length. This is the stn_TNFRSF12A_TNFR domain from the tumor necrosis factor receptor. The function of this domain is unknown. 129
54900 371954 pfam12192 CBP Fungal calcium binding protein. This domain is found in eukaryotes, and is approximately 60 amino acids in length. There is a single completely conserved residue C that may be functionally important. This is a calcium binding domain from the fungal protein CBP (calcium binding protein). This protein is a virulence factor with unknown virulence mechanisms. CBP complexes as a highly intertwined homodimer. Each monomer is comprised of four alpha helices which adopt the saposin fold, characteristic of a protein family that binds to membranes and lipids. 76
54901 289003 pfam12193 Sulf_coat_C Sulfolobus virus coat protein C terminal. This domain family is found in viruses, and is approximately 70 amino acids in length. It is the C terminal of a coat protein in sulfolobus viruses. 69
54902 403425 pfam12194 Ste5_C Protein kinase Fus3-binding. This domain family is found in eukaryotes, and is approximately 190 amino acids in length. This domain is the penultimate C terminal domain from the protein ste5 which co-catalyzes the phosphorylation of fus3 by ste7. It is involved in the MAPK pathways. This domain is the minimal scaffold domain of ste5. It binds to the mitogen activated protein kinase fus3 before it is phosphorylated. 189
54903 152630 pfam12195 End_beta_barrel Beta barrel domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 80 amino acids in length. This domain is the beta barrel domain of bacteriophage endosialidase which represents one of the two sialic acid binding sites of the enzyme. The domain is nested in the beta propeller domain of the endosialidase enzyme. The endosialidase protein complexes to form homotrimeric molecules. 83
54904 403426 pfam12196 hNIFK_binding FHA Ki67 binding domain of hNIFK. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00076. There are two conserved sequence motifs: TPVCTP and LERRKS. This domain is found on the human nucleolar protein hNIFK. It binds to the fork-head-associated domain of human Ki67. High-affinity binding requires sequential phosphorylation by two kinases, CDK1 and GSK3, yielding pThr238, pThr234 and pSer230. This interaction is involved in cell cycle regulation. 40
54905 314977 pfam12197 lci Bacillus cereus group antimicrobial protein. This domain is found in bacteria, and is approximately 40 amino acids in length. This domain is found in bacillus cereus group bacteria. It is an antimicrobial protein. 42
54906 289006 pfam12198 Tuberculin Theoretical tuberculin protein. This domain family is found in bacteria, and is approximately 30 amino acids in length. This protein is a theoretical model of the tuberculin protein from Mycobacterium tuberculosis. 34
54907 289007 pfam12199 efb-c Extracellular fibrinogen binding protein C terminal. This domain family is found in bacteria, and is approximately 70 amino acids in length. There is a conserved VLK sequence motif. It is the C terminal domain of bacterial extracellular fibrinogen binding protein. It contains a helical motif involved in complement regulation. This motif binds to complement and changes its conformation to a form which cannot activate downstream components of the complement cascade. 65
54908 403427 pfam12200 DUF3597 Domain of unknown function (DUF3597). This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 126 and 281 amino acids in length. The function of this domain is unknown. The structure of this domain has been found to contain five helices with a long flexible loop between helices one and two. 127
54909 371957 pfam12201 bcl-2I13 Bcl2-interacting killer, BH3-domain containing. This is a family of pro-apoptotic Bcl-x proteins, B cell leukaemia/lymphoma 2, or BIKs. BIK proteins rely for their activity upon an intact BH3 domain lying between residues 48 and 80, as in UniProt:Q13323. 155
54910 403428 pfam12202 OSR1_C Oxidative-stress-responsive kinase 1 C-terminal domain. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. There is a single completely conserved residue F that may be functionally important. OSR1 is involved in the signalling cascade which activates Na/K/2Cl cotransporter during osmotic stress. This domain is the C terminal domain of OSR1 which recognizes a motif (Arg-Phe-Xaa-Val) on the OSR1-activating protein WNK1. 64
54911 403429 pfam12203 HDAC4_Gln Glutamine rich N terminal domain of histone deacetylase 4. This domain is found in eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam00850. The domain forms an alpha helix which complexes to form a tetramer. The glutamine rich domains have many intra- and inter-helical interactions which are thought to be involved in reversible assembly and disassembly of proteins. The domain is part of histone deacetylase 4 (HDAC4) which removes acetyl groups from histones. This restores their positive charge to allow stronger DNA binding thus restricting transcriptional activity. 91
54912 403430 pfam12204 DUF3598 Domain of unknown function (DUF3598). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 230 and 398 amino acids in length. These proteins are formed entirely from B sheets which form a barrel structure similar to those seen in the lipocalin superfamily. 267
54913 403431 pfam12205 GIT1_C G protein-coupled receptor kinase-interacting protein 1 C term. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam01412, pfam00023, pfam08518. GIT1 plays an important role in cell adhesion, motility, cytoskeletal remodeling and membrane trafficking. To perform this function, it localizes p21-activated kinase (PAK) and PAK-interactive exchange factor to focal adhesions. Its activation is regulated by interaction between its paxillin-binding C terminal and the LD motifs of paxillin. The C terminal folds into a four helix bundle. 116
54914 403432 pfam12206 DUF3599 Domain of unknown function (DUF3599). This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. This domain is the phage-like element pbsx protein xkdh. 115
54915 403433 pfam12207 DUF3600 Domain of unknown function (DUF3600). This family of proteins is found in bacteria. Proteins in this family are approximately 230 amino acids in length. This domain is the C terminal of the putative ecf-type sigma factor negative effector. 157
54916 314984 pfam12208 DUF3601 Domain of unknown function (DUF3601). This domain family is found in bacteria, and is approximately 80 amino acids in length. 77
54917 371963 pfam12209 SAC3 Leucine permease transcriptional regulator helical domain. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam03399. This domain is a helical domain in the middle of leucine permease transcriptional regulator. 77
54918 403434 pfam12210 Hrs_helical Hepatocyte growth factor-regulated tyrosine kinase substrate. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam00790, pfam01363, pfam02809. This domain is the helical region of Hrs which forms the core complex of ESCRT with STAM. 95
54919 314987 pfam12211 LMWSLP_N Low molecular weight S layer protein N terminal. This family of proteins is found in bacteria. Proteins in this family are typically between 328 and 381 amino acids in length. There is a conserved LGDG sequence motif. Clostridial species have a layer of surface proteins surrounding their membrane. This layer is comprised of a high molecular weight protein and a low molecular weight protein. This domain is the N terminal domain of the low molecular weight protein. It is a structural domain. 258
54920 314988 pfam12212 PAZ_siRNAbind PAZ domain. This entry corresponds to the PAZ domain found in some archaeal argonaute proteins. It is an siRNA binding domain. 127
54921 371965 pfam12213 Dpoe2NT DNA polymerases epsilon N terminal. This domain is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam04042. There is a single completely conserved residue F that may be functionally important. This domain is the N terminal domain of DNA polymerase epsilon subunit B. It forms a primarily alpha helical structure in which four helices are arranged in two hairpins with connecting loops containing beta strands which form a short parallel sheet. DNA polymerase epsilon is required in DNA replication for synthesis of the leading strand. This domain has close structural relation to AAA+ protein C terminal domains. 71
54922 403435 pfam12214 TPX2_importin Cell cycle regulated microtubule associated protein. This domain is found in eukaryotes. This domain is typically between 127 to 182 amino acids in length. This domain is found associated with pfam06886. This domain is found in the protein TPX2 (a.k.a p100) which is involved in cell cycling. It is only expressed between the start of the S phase and completion of cytokinesis. The microtubule-associated protein TPX2 has been reported to be crucial for mitotic spindle formation. This domain is close to the C terminal of TPX2. The protein importin alpha regulates the activity of TPX2 by binding to the nuclear localization signal in this domain. 127
54923 403436 pfam12215 Glyco_hydr_116N beta-glucosidase 2, glycosyl-hydrolase family 116 N-term. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 320 to 354 amino acids in length. This domain is found associated with pfam04685. It is found just after the extreme N-terminus. The N-terminal is thought to be the luminal domain while the C terminal is the cytosolic domain. The catalytic domain of GBA-2 is unknown. The primary catabolic pathway for glucosylceramide is catalysis by the lysosomal enzyme glucocerebrosidase. In higher eukaryotes, glucosylceramide is the precursor of glycosphingolipids, a complex group of ubiquitous membrane lipids. Mutations in the human protein cause motor-neurone defects in hereditary spastic paraplegia. The catalytic nucleophile, identified in UniProtKB:Q97YG8_SULSO, is a glutamine-335 in the downstream family pfam04685. 309
54924 371968 pfam12216 m04gp34like Immune evasion protein. This protein is found in archaea and viruses. Proteins in this family are typically between 265 to 342 amino acids in length. The proteins in this family are or are related to the m04 encoded protein gp34 of pathogenic microorganisms such as murine cytomegalovirus. m06 and m152 genes are expressed earlier in the intracellular replication phases of these microorganisms' life cycles. They function to inhibit MHC-1 loading and export. gp34 is theorized to prevent immune reactions from NK cells which would ordinarily recognize and attack cells lacking MHC. 267
54925 152652 pfam12217 End_beta_propel Catalytic beta propeller domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is typically between 443 and 460 amino acids in length. This domain is the highly conserved beta propeller of bacteriophage endosialidase which represents the catalytically active part of the enzymes. This core domain forms stable SDS-resistant trimers. There is a nested beta barrel domain in this domain (pfam12195). The endosialidase protein complexes to form a homotrimeric molecule. 449
54926 314993 pfam12218 End_N_terminal N terminal extension of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 70 amino acids in length. This domain is found in the bacteriophage protein endosialidase. The two N-terminal domains (this domain and the beta propeller) assemble in the compact 'cap' whereas the C-terminal domain forms an extended tail-like structure. The very N-terminal part of the 'cap' region (residues 246 to 312) holds the only alpha-helix of the protein and is presumably the residual part of the deleted N-terminal head-binding domain. The endosialidase protein complexes to form homotrimeric molecules. 67
54927 152654 pfam12219 End_tail_spike Catalytic domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 160 amino acids in length. There are two conserved sequence motifs: VSR and YGA. This domain is the C terminal domain of the bacteriophage protein endosialidase. The endosialidase protein forms homotrimeric molecules and this domain complexes into a tail-spike stalk. The stalk region folds in a triple beta-helix that is interrupted by a small triple beta-prism domain. The tail-spike is a multifunctional protein device used by the phage to fulfill the following functions: (i) to adsorb to the bacterial polySia capsule (ii) to de-polymerize the capsule to gain access to the outer bacterial membrane, and finally (iii) to mediate tight adhesion to the membrane, a prerequisite for the initiation of the infection cycle. 160
54928 403437 pfam12220 U1snRNP70_N U1 small nuclear ribonucleoprotein of 70kDa MW N terminal. This domain is found in eukaryotes. This domain is about 90 amino acids in length. This domain is found associated with pfam00076. This domain is part of U1 snRNP, which is the pre-mRNA binding protein of the penta-snRNP spliceosome complex. It extends over a distance of 180 A from its RNA binding domain, wraps around the core domain of U1 snRNP consisting of the seven Sm proteins and finally contacts U1-C, which is crucial for 5'-splice-site recognition. 90
54929 403438 pfam12221 HflK_N Bacterial membrane protein N terminal. This domain is found in bacteria. This domain is typically between 65 to 81 amino acids in length. This domain is found associated with pfam01145. This domain is the N terminal of the bacterial membrane protein HflK. HflK complexes with HflC to form a membrane protease which is modulated by the GTPase HflX. The N terminal domain of HflK is the membrane spanning region which anchors the protein in the bacterial membrane. 44
54930 403439 pfam12222 PNGaseA Peptide N-acetyl-beta-D-glucosaminyl asparaginase amidase A. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 558 and 775 amino acids in length. There is a conserved TGG sequence motif. PNGase A is a protein which cleaves glycopeptides. 434
54931 403440 pfam12223 DUF3602 Protein of unknown function (DUF3602). This domain family is found in eukaryotes, and is typically between 78 and 89 amino acids in length. 80
54932 403441 pfam12224 Amidoligase_2 Putative amidoligase enzyme. This family of proteins are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes). 235
54933 403442 pfam12225 MTHFR_C Methylene-tetrahydrofolate reductase C terminal. This family is found in bacteria and archaea, and is approximately 100 amino acids in length. There is a conserved NGPCGG sequence motif. This family is the C terminal of methylene-tetrahydrofolate reductase. This protein reduces FAD using the reducing equivalents from reduced FAD, subsequently reduces tetrahydrofolate. The C terminal of MTHFR contains the FAD binding site and is the catalytic portion of the enzyme. 89
54934 371974 pfam12226 Astro_capsid_p Turkey astrovirus capsid protein. This family of proteins is found in viruses. Proteins in this family are typically between 241 and 261 amino acids in length. These proteins are capsid proteins from various astrovirus strains. 361
54935 371975 pfam12227 DUF3603 Protein of unknown function (DUF3603). This protein is found in bacteria and eukaryotes. Proteins in this family are about 250 amino acids in length. 214
54936 371976 pfam12228 DUF3604 Protein of unknown function (DUF3604). This family of proteins is found in bacteria. Proteins in this family are typically between 621 and 693 amino acids in length. 591
54937 403443 pfam12229 PG_binding_4 Putative peptidoglycan binding domain. This domain is found associated with the L,D-transpeptidase domain pfam03734. The structure of this domain has been solved and shows a mixed alpha-beta fold composed of nine beta strands and four alpha helices. This domain is usually found to be duplicated. Therefore, it seems likely that this domain acts to bind the two unlinked peptidoglycan chains and bring them into close association so they can be cross linked by the transpeptidase domain (Bateman A pers. observation). 117
54938 403444 pfam12230 PRP21_like_P Pre-mRNA splicing factor PRP21 like protein. This domain family is found in eukaryotes, and is typically between 212 and 238 amino acids in length. The family is found in association with pfam01805. There are two completely conserved residues (W and H) that may be functionally important. PRP21 is required for assembly of the prespliceosome and it interacts with U2 snRNP and/or pre-mRNA in the prespliceosome. This family also contains proteins similar to PRP21, such as the mammalian SF3a. SF3a also interacts with U2 snRNP from the prespliceosome, converting it to its active form. 213
54939 403445 pfam12231 Rif1_N Rap1-interacting factor 1 N terminal. This domain family is found in eukaryotes, and is typically between 135 and 146 amino acids in length. Rif1 is a protein which interacts with Rap1 to regulate telomere length. Interaction with telomeres limits their length. The N terminal region contains many HEAT- and ARMADILLO- type repeats. These are helical folds which form extended curved proteins or RNA interface surfaces. 363
54940 403446 pfam12232 Myf5 Myogenic determination factor 5. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00010, pfam01586. There is a conserved CSD sequence motif. Myf5 is responsible for directing cells to the skeletal myocyte lineage during development. Myf5 is likely to act in a similar way to the other MRF4 proteins such as MyoD which perform the same function. These are histone acetyltransferases and histone deacetylases which activate and repress genes involved in the myocyte lineage. 71
54941 289037 pfam12233 p12I Human adult T cell leukemia/lymphoma virus protein. This family of proteins is found in viruses. Proteins in this family are approximately 100 amino acids in length. p12I binds to the immature beta and gamma-c chains of the interleukin-2 receptor retarding their translocation to the plasma membrane. p12I forms dimers which bind to these chains. 99
54942 403447 pfam12234 Rav1p_C RAVE protein 1 C terminal. This domain family is found in eukaryotes, and is typically between 621 and 644 amino acids in length. This family is the C terminal region of the protein RAVE (regulator of the ATPase of vacuolar and endosomal membranes). Rav1p is involved in regulating the glucose dependent assembly and disassembly of vacuolar ATPase V1 and V0 subunits. 637
54943 403448 pfam12235 FXMRP1_C_core Fragile X-related 1 protein core C terminal. This domain family is found in eukaryotes, and is typically between 126 and 160 amino acids in length. The family is found in association with pfam05641, pfam00013. This family is the core C terminal region of the fragile X related 1 proteins FXR1P, FXR2 and FMR1. These different proteins have different regions at their very C-terminus. The Glutamine-arginine rich region facilitates protein interactions. This family contains two blocks of RGG repeats that bind to G-quartet sequences in a wide variety of mRNAs. 121
54944 403449 pfam12236 Head-tail_con Bacteriophage head to tail connecting protein. This family of head-tail connector proteins is found in bacteria and viruses. Proteins in this family are typically between 516 and 555 amino acids in length. This protein is found in Phage T7 and T3 among others. 479
54945 403450 pfam12237 PCIF1_WW Phosphorylated CTD interacting factor 1 WW domain. This domain family is found in bacteria and eukaryotes, and is approximately 180 amino acids in length. This domain is the WW domain of PCIF1. PCIF1 interacts with phosphorylated RNA polymerase II carboxy-terminal domain (CTD). The WW domain of PCIF1 can directly and preferentially bind to the phosphorylated CTD compared to the unphosphorylated CTD. PCIF1 binds to the hyperphosphorylated RNAP II (RNAP IIO) in vitro and in vivo. Double immunofluorescence labeling in HeLa cells demonstrated that PCIF1 and endogenous RNAP IIO are co-localized in the cell nucleus. Thus, PCIF1 may play a role in mRNA synthesis by modulating RNAP IIO activity. 172
54946 289042 pfam12238 MSA-2c Merozoite surface antigen 2c. This family of proteins is found in eukaryotes. Proteins in this family are typically between 263 and 318 amino acids in length. There is a conserved SFT sequence motif. MSA-2 is a plasma membrane glycoprotein which can be found in Babesia bovis species. 216
54947 403451 pfam12239 DUF3605 Protein of unknown function (DUF3605). This family of proteins is found in eukaryotes and viruses. Proteins in this family are typically between 161 and 256 amino acids in length. 155
54948 403452 pfam12240 Angiomotin_C Angiomotin C terminal. This domain family is found in eukaryotes, and is typically between 197 and 211 amino acids in length. This family is the C terminal region of angiomotin. Angiomotin regulates the action of angiogenesis inhibitor angiostatin. The C terminal region of angiomotin appears to be involved in directing the protein chemotactically. 200
54949 403453 pfam12241 Enoyl_reductase Trans-2-enoyl-CoA reductase catalytic region. This family of trans-2-enoyl-CoA reductases, EC:1.3.1.44, carries the catalytic sites of the enzyme, characterized by the conserved sequence motifs: YNThhhFxK, and YShAPxR. In Euglena where the enzyme has been characterized it catalyzes the reduction of enoyl-CoA to acyl-CoA in an unusual fatty acid pathway in mitochondria. The whole pathway performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation. 236
54950 403454 pfam12242 Eno-Rase_NADH_b NAD(P)H binding domain of trans-2-enoyl-CoA reductase. This family carries the region of the enzyme trans-2-enoyl-CoA reductase, EC:1.3.1.44, which binds NAD(P)H. The activity of the enzyme was characterized in Euglena where an unusual fatty acid synthesis pathway in the mitochondria performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation. The full enzyme catalyzes the reduction of enoyl-CoA to acyl-CoA. The binding site is conserved as GA/CSpGYG, where p is any polar residue. 78
54951 403455 pfam12243 CTK3 CTD kinase subunit gamma CTK3. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2. 123
54952 403456 pfam12244 DUF3606 Protein of unknown function (DUF3606). This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 85 amino acids in length. There is a single completely conserved residue G that may be functionally important. 54
54953 403457 pfam12245 Big_3_2 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins. 122
54954 403458 pfam12246 MKT1_C Temperature dependent protein affecting M2 dsRNA replication. This domain family is found in eukaryotes, and is typically between 231 and 255 amino acids in length. There is a single completely conserved residue P that may be functionally important. MKT1 is required for maintenance of K2 toxin above 30 degrees C in strains with the L-A-HN variant of the L-A double-stranded RNA virus of Saccharomyces cerevisiae. MKT1 is a 93 kDa protein with serine-rich regions and the retroviral protease signature, DTG. This family is the C terminal region of MKT1. 242
54955 403459 pfam12247 MKT1_N Temperature dependent protein affecting M2 dsRNA replication. This domain family is found in eukaryotes, and is typically between 231 and 255 amino acids in length. There is a single completely conserved residue P that may be functionally important. MKT1 is required for maintenance of K2 toxin above 30 degrees C in strains with the L-A-HN variant of the L-A double-stranded RNA virus of Saccharomyces cerevisiae. MKT1 is a 93 kDa protein with serine-rich regions and the retroviral protease signature, DTG. This family is the N terminal region of MKT1. 84
54956 403460 pfam12248 Methyltransf_FA Farnesoic acid O-methyl transferase. This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH). 101
54957 403461 pfam12249 AftA_C Arabinofuranosyltransferase A C terminal. This domain family is found in bacteria, and is typically between 179 and 190 amino acids in length. This family is the C terminal region of AftA. The enzyme catalyzes the addition of the first key arabinofuranosyl residue from the sugar donor beta-D-arabinofuranosyl-1-monophosphoryldecaprenol to the galactan domain of the cell wall, thus priming the galactan for further elaboration by the arabinofuranosyltransferases. The C terminal region is predicted to be directed towards the periplasm. 177
54958 403462 pfam12250 AftA_N Arabinofuranosyltransferase N terminal. This domain family is found in bacteria, and is typically between 430 and 441 amino acids in length. This family is the N terminal region of AftA. The enzyme catalyzes the addition of the first key arabinofuranosyl residue from the sugar donor beta-D-arabinofuranosyl-1-monophosphoryldecaprenol to the galactan domain of the cell wall, thus priming the galactan for further elaboration by the arabinofuranosyltransferases. The N terminal region has been predicted to span 11 transmembrane regions. 424
54959 403463 pfam12251 zf-SNAP50_C snRNA-activating protein of 50kDa MW C terminal. This domain family is found in eukaryotes, and is typically between 196 and 207 amino acids in length. There is a conserved CEH sequence motif. SNAP50 is part of the snRNA-activating protein complex which activates RNA polymerases II and III. There is a cysteine-histidine cluster which contains two possible zinc finger motifs. 189
54960 403464 pfam12252 SidE Dot/Icm substrate protein. This family of proteins is found in bacteria. Proteins in this family are typically between 397 and 1543 amino acids in length. This family is the SidE protein in the Dot/Icm pathway of Legionella pneumophila bacteria. There is little literature describing the family. 220
54961 403465 pfam12253 CAF1A Chromatin assembly factor 1 subunit A. The CAF-1 or chromatin assembly factor-1 consists of three subunits, and this is the first, or A. The A domain is uniquely required for the progression of S phase in mouse cells, independent of its ability to promote histone deposition but dependent on its ability to interact with HP1 - heterochromatin protein 1-rich heterochromatin domains next to centromeres that are crucial for chromosome segregation during mitosis. This HP1-CAF-1 interaction module functions as a built-in replication control for heterochromatin, which, like a control barrier, has an impact on S-phase progression in addition to DNA-based checkpoints. 76
54962 403466 pfam12254 DNA_pol_alpha_N DNA polymerase alpha subunit p180 N terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00136, pfam08996, pfam03104. This family is the N terminal of DNA polymerase alpha subunit p180 protein. The N terminal contains the catalytic region of the alpha subunit. 64
54963 371995 pfam12255 TcdB_toxin_midC Insecticide toxin TcdB middle/C-terminal region. This domain family is found in bacteria, and is approximately 150 amino acids in length. The family is found in association with pfam03534. This family is the C-terminal-sided middle region of the bacterial insecticide toxin TcdB. 140
54964 403467 pfam12256 TcdB_toxin_midN Insecticide toxin TcdB middle/N-terminal region. This domain family is found in bacteria and archaea, and is typically between 164 and 180 amino acids in length. The family is found in association with pfam05593. This family is the N-terminal-sided middle region of the bacterial insecticide toxin TcdB. This region appears related to the FG-GAP repeat pfam01839. 181
54965 403468 pfam12257 IML1 Vacuolar membrane-associated protein Iml1. Proteins in this family contain a DEP domain, which is a globular domain of about 80 residues. This entry includes vacuolar membrane-associated protein Iml1 and DEP domain-containing protein 5/DDB_G0279099. In Saccharomyces cerevisiae, Iml1 is a subunit of both the SEA (Seh1-associated) and Iml1 complexes (Iml1-Npr2-Npr3). The SEA complex associates dynamically with the vacuole and is involved in autophagy. The Iml1 complex is required for non-nitrogen-starvation (NNS)-induced autophagy. 278
54966 403469 pfam12258 Microcephalin Microcephalin protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 384 and 835 amino acids in length. Microcephalin is involved in determining the size of the brain in animals. It is a protein, which if expressed homozygously causes the organism to have the condition microcephaly. Organisms expressing the mutated form of this protein in a homozygous manner develop a condition called microcephaly - a drastically reduced brain mass and volume. Microcephalin is predicted to contain three BRCA1 C-terminal domains, the first of which is the probable microcephaly mutation site. 391
54967 371998 pfam12259 Baculo_F Baculovirus F protein. This protein is found in a variety of baculoviruses. It is known as the F protein. Matches to this family are additionally found in some presumed transposons. 606
54968 403470 pfam12260 PIP49_C Protein-kinase domain of FAM69. This is the C-terminal region of a family of FAM69 proteins from Metazoa and Viridiplantae that are active protein-kinases. The family members have a short transmembrane helix close to the N-terminus, and thereafter are highly enriched with cysteines. FAM69 proteins are localized to the endoplasmic reticulum. Many members also have a short EF-hand, calcium-binding, domain just upstream of the kinase domain. The exact function of the more N-terminal family is uncertain. 189
54969 403471 pfam12261 T_hemolysin Thermostable hemolysin. This family of proteins is found in bacteria. Proteins in this family are typically between 200 and 228 amino acids in length. T_hemolysin is a pore-forming toxin of bacteria, able to lyse erythrocytes from a number of mammalian species. 171
54970 372000 pfam12262 Lipase_bact_N Bacterial virulence factor lipase N-terminal. This domain family is found in bacteria, and is typically between 258 and 271 amino acids in length. There are two conserved sequence motifs: DGT and DGWST. This family is the N-terminal region of bacterial virulence factor lipase. The N-terminal region contains a potential signalling sequence. 238
54971 403472 pfam12263 DUF3611 Protein of unknown function (DUF3611). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 180 and 205 amino acids in length. There are two completely conserved residues (W and G) that may be functionally important. 176
54972 152699 pfam12264 Waikav_capsid_1 Waikavirus capsid protein 1. The rice tungro spherical waikavirus polyprotein is cleaved into 7 proteins, including three capsid proteins, by the tungro spherical virus-type peptidase pfam12381. This family represents the capsid protein 1. 197
54973 403473 pfam12265 CAF1C_H4-bd Histone-binding protein RBBP4 or subunit C of CAF1 complex. The CAF-1 complex is a conserved heterotrimeric protein complex that promotes histone H3 and H4 deposition onto newly synthesized DNA during replication or DNA repair; specifically it facilitates replication-dependent nucleosome assembly with the major histone H3 (H3.1). This domain is an alpha helix which sits just upstream of the WD40 seven-bladed beta-propeller in the human RbAp46 protein. RbAp46 folds into the beta-propeller and binds histone H4 in a groove formed between this N-terminal helix and an extended loop inserted into blade six. 69
54974 289069 pfam12266 DUF3613 Protein of unknown function (DUF3613). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 126 amino acids in length. 67
54975 152702 pfam12267 DUF3614 Protein of unknown function (DUF3614). This family of proteins is found in viruses. Proteins in this family are typically between 162 and 495 amino acids in length. 173
54976 372003 pfam12268 DUF3612 Protein of unknown function (DUF3612). This domain family is found in bacteria, and is approximately 180 amino acids in length. The family is found in association with pfam01381. 176
54977 403474 pfam12269 zf-CpG_bind_C CpG binding protein zinc finger C terminal domain. This domain family is found in eukaryotes, and is approximately 240 amino acids in length. This domain is the zinc finger domain of a CpG binding DNA methyltransferase protein. It contains a CxxC motif which forms the zinc finger and binds to DNA. 233
54978 403475 pfam12270 Cyt_c_ox_IV Cytochrome c oxidase subunit IV. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. This family is the fourth subunit of the cytochrome c oxidase complex. This subunit does not have a catalytic capacity but instead, is required for assembly and/or stability of the complex. 132
54979 403476 pfam12271 Chs3p Chitin synthase III catalytic subunit. This family of proteins is found in eukaryotes. Proteins in this family are typically between 288 and 332 amino acids in length. This family is the catalytic domain of chitin synthase III. Chitin is a major component of fungal cell walls and this enzyme is responsible for its formation. 283
54980 403477 pfam12273 RCR Chitin synthesis regulation, resistance to Congo red. RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5. 113
54981 403478 pfam12274 DUF3615 Protein of unknown function (DUF3615). This domain family is found in bacteria and eukaryotes, and is typically between 86 and 97 amino acids in length. There is a conserved FAE sequence motif. There is a single completely conserved residue F that may be functionally important. 94
54982 403479 pfam12275 DUF3616 Protein of unknown function (DUF3616). This family of proteins is found in bacteria. Proteins in this family are typically between 335 and 392 amino acids in length. There is a conserved GLRGPV sequence motif. 328
54983 403480 pfam12276 DUF3617 Protein of unknown function (DUF3617). This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important. 133
54984 403481 pfam12277 DUF3618 Protein of unknown function (DUF3618). This domain family is found in bacteria, and is approximately 50 amino acids in length. 47
54985 372010 pfam12278 SDP_N Sex determination protein N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 168 and 410 amino acids in length. This family is the N terminal end of the sex determination protein of many different animals. It plays a role in the gender determination of around 20% of all animals. 137
54986 403482 pfam12279 DUF3619 Protein of unknown function (DUF3619). This protein is found in bacteria. Proteins in this family are about 140 amino acids in length. This protein has two conserved sequence motifs: AAR and DDLP. 123
54987 403483 pfam12280 BSMAP Brain specific membrane anchored protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 285 and 331 amino acids in length. BSMAP has a putative transmembrane domain and is predicted to be a type I membrane glycoprotein. 195
54988 403484 pfam12281 NTP_transf_8 Nucleotidyltransferase. This is a family of bacterial proteins that have a nucleotidyltransferase fold. The fold-prediction is backed up by conservation of three highly characteristic sequence motifs found in all other nucleotidyl transferases: i) pDhDhhh(h/p), where p is a polar residue and h is a hydrophobic residue; ii) upstream of the first, a GG/S; iii) a conserved D/E in a hydrophobic surround. In the classification of nucleotidyltransferases proposed in this is a group XVIII NTP-transferase. Many of these sequences were classified in the COG database as COG5397. The exact function is not known. 208
54989 403485 pfam12282 H_kinase_N Signal transduction histidine kinase. This domain is found in bacteria. This domain is about 150 amino acids in length. This domain is found associated with pfam07568, pfam08448, pfam02518. This domain has a single completely conserved residue P that may be functionally important. This family is mostly annotated as a histidine kinase involved in signal transduction but there is little published evidence to support this. 139
54990 289084 pfam12283 Protein_K Bacteriophage protein K. This family of proteins is found in viruses. Proteins in this family are approximately 60 amino acids in length. This family is a protein expressed by bacteriophages which has an unknown function. There is evidence that it is non-essential for in vivo production of a mature phage. 56
54991 403486 pfam12284 HoxA13_N Hox protein A13 N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 149 and 306 amino acids in length. The family is found in association with pfam00046. This family is the N terminal of the Hox gene protein involved in formation of the digital arch of the hands and feet as well as in correct genital formation. Mutation of the protein is associated with hand-foot-genital syndrome. 120
54992 204871 pfam12285 DUF3621 Protein of unknown function (DUF3621). This family of proteins is found in viruses. Proteins in this family are typically between 49 and 62 amino acids in length. There are two conserved sequence motifs: QPLDLS and EQQ. 49
54993 403487 pfam12286 DUF3622 Protein of unknown function (DUF3622). This family of proteins is found in bacteria. Proteins in this family are typically between 72 and 107 amino acids in length. There is a conserved VSK sequence motif. 71
54994 403488 pfam12287 Caprin-1_C Cytoplasmic activation/proliferation-associated protein-1 C term. This family of proteins is found in eukaryotes. Proteins in this family are typically between 343 and 708 amino acids in length. This family is the C terminal region of caprin-1. Caprin-1 is a protein involved in regulating cellular proliferation. In mutated phenotypes, the G1 phase of the cell cycle is greatly lengthened, impairing normal proliferation. The C terminal region of caprin-1 contains RGG motifs which are characteristic of RNA binding domains. It is possible that caprin-1 functions through an RNA binding mechanism. 319
54995 403489 pfam12288 CsoS2_M Carboxysome shell peptide mid-region. This domain family is found in bacteria and eukaryotes, and is approximately 430 amino acids in length. This family is annotated frequently as a carboxysome shell peptide, however there is little publication to confirm this. 420
54996 403490 pfam12289 Rotavirus_VP1 Rotavirus VP1 C-terminal domain. This domain is the C-terminal bracelet domain of the rotavirus VP1 RNA-directed RNA polymerase. It surrounds the exit tunnel for dsRNA produced by replication and for the RNA template for transcription. 312
54997 315053 pfam12290 DUF3802 Protein of unknown function (DUF3802). This family of proteins is found in bacteria. Proteins in this family are typically between 114 and 143 amino acids in length. There is a conserved KNLFD sequence motif. 112
54998 403491 pfam12291 DUF3623 Protein of unknown function (DUF3623). This family of proteins is found in bacteria. Proteins in this family are typically between 261 and 345 amino acids in length. 255
54999 403492 pfam12292 DUF3624 Protein of unknown function (DUF3624). This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There is a conserved GRC sequence motif. 74
55000 403493 pfam12293 T4BSS_DotH_IcmK Putative outer membrane core complex of type IVb secretion. T4BSS_DotH_IcmK is a family of bacterial transporter proteins from Proteobacteria. DotH is an integral outer membrane component and it may form an outer membrane complex along with DotD and DotC functionally equivalent to secretins. DotH is the strongest candidate for the VirB9 counterpart of other T4BSS systems. 238
55001 403494 pfam12294 DUF3626 Protein of unknown function (DUF3626). This family of proteins is found in bacteria. Proteins in this family are typically between 294 and 374 amino acids in length. 301
55002 403495 pfam12295 Symplekin_C Symplekin tight junction protein C terminal. This domain family is found in eukaryotes, and is approximately 180 amino acids in length. There is a single completely conserved residue P that may be functionally important. Symplekin has been localized, by light and electron microscopy, to the plaque associated with the cytoplasmic face of the tight junction-containing zone (zonula occludens) of polar epithelial cells and of Sertoli cells of testis. However, both the mRNA and the protein can also be detected in a wide range of cell types that do not form tight junctions. Careful analyses have revealed that the protein occurs in all these diverse cells in the nucleoplasm, and only in those cells forming tight junctions is it recruited, partly but specifically, to the plaque structure of the zonula occludens. 185
55003 403496 pfam12296 HsbA Hydrophobic surface binding protein A. This protein is found in eukaryotes. Proteins in this family are typically between 171 to 275 amino acids in length. Although the HsbA amino acid sequence suggests that HsbA may be hydrophilic, HsbA adsorbed to hydrophobic PBSA (Polybutylene succinate-co-adipate) surfaces in the presence of NaCl or CaCl2. When HsbA was adsorbed on the hydrophobic PBSA surfaces, it promoted PBSA degradation via the CutL1 polyesterase. CutL1 interacts directly with HsbA attached to the hydrophobic QCM electrode surface. These results suggest that when HsbA is adsorbed onto the PBSA surface, it recruits CutL1, and that when CutL1 is accumulated on the PBSA surface, it stimulates PBSA degradation. 123
55004 403497 pfam12297 EVC2_like Ellis van Creveld protein 2 like protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 571 and 1310 amino acids in length. There are two conserved sequence motifs: LPA and ELH. EVC2 is implicated in Ellis van Creveld chondrodysplastic dwarfism in humans. Mutations in this protein can give rise to this congenital condition. LIMBIN is a protein which shares around 80% sequence homology with EVC2 and it is implicated in a similar condition in bovine chondrodysplastic dwarfism. 429
55005 403498 pfam12298 Bot1p Eukaryotic mitochondrial regulator protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 168 and 381 amino acids in length. Bot1p localizes to the mitochondria in live cells and cofractionates with purified mitochondrial ribosomes. Bot1p has a novel function in the control of cell respiration by acting on the mitochondrial protein synthesis machinery. Observations also indicate that in fission yeast, alterations of mitochondrial function are linked to changes in cell cycle and cell morphology control mechanisms. 172
55006 372024 pfam12299 DUF3627 Protein of unknown function (DUF3627). This domain family is found in bacteria and viruses, and is approximately 90 amino acids in length. The family is found in association with pfam02498. 93
55007 372025 pfam12300 RhlB ATP-dependent RNA helicase RhlB. Proteins in this entry are DEAD Box RhlB RNA Helicases found in Xanthomonadaceae bacteria. 181
55008 403499 pfam12301 CD99L2 CD99 antigen like protein 2. This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum. 159
55009 372027 pfam12302 DUF3629 Protein of unknown function (DUF3629). This family of proteins is found in eukaryotes. Proteins in this family are typically between 256 and 292 amino acids in length. 218
55010 403500 pfam12304 BCLP Beta-casein like protein. This protein is found in eukaryotes. Proteins in this family are typically between 216 and 240 amino acids in length. This protein has two conserved sequence motifs: VLR and TRIY. BCLP is associated with cell morphology and regulation of the growth pattern of tumors. It is found in adenocarcinomas of uterine cervical tissues. 184
55011 403501 pfam12305 DUF3630 Protein of unknown function (DUF3630). This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a single completely conserved residue D that may be functionally important. 91
55012 403502 pfam12306 PixA Inclusion body protein. This family of proteins is found in bacteria. Proteins in this family are typically between 173 and 191 amino acids in length. PixA is thought to be specifically produced in Xenorhabdus nematophila. It is an inclusion body protein. 165
55013 403503 pfam12307 DUF3631 Protein of unknown function (DUF3631). This protein is found in bacteria. Proteins in this family are typically between 180 and 701 amino acids in length. 185
55014 403504 pfam12308 Noelin-1 Neurogenesis glycoprotein. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam02191. There are two conserved sequence motifs: SAQ and VQN. Noelin-1 is a glycoprotein which is secreted mainly by postmitotic neurogenic tissues in the developing central and peripheral nervous systems, first appearing after neural tube closure. It is likely that it forms large multimeric complexes. It has a divergent function in neurogenesis. In animal caps neuralized by expression of noggin, co-expression of Noelin-1 causes expression of neuronal differentiation markers several stages before neurogenesis normally occurs in this tissue. Finally, only secreted forms of the protein can activate sensory marker expression, while all forms of the protein can induce early neurogenesis. 100
55015 403505 pfam12309 KBP_C KIF-1 binding protein C terminal. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 365 and 621 amino acids in length. There is a conserved LLP sequence motif. KBP is a binding partner for KIF1Balpha that is a regulator of its transport function and thus represents a type of kinesin interacting protein. 360
55016 403506 pfam12310 Elf-1_N Transcription factor protein N terminal. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00178. There is a conserved PAVIVE sequence motif. Elf-1 is an immune cell specific transcription factor. It is found in T cells, B cells, megakaryocytes, and mast cells and is involved in the control of transcription for various immune proteins. These include IL-2, GM-CSF, IL-5, IL-2 receptor alpha chain, and CD4 in T cells, IgH, blk, and lyn in B cells, TdT in T and B cells, IL-3 in megakaryocytes, and SCL and Fc-epsilon-RI alpha chain in mast cells. 109
55017 403507 pfam12311 DUF3632 Protein of unknown function (DUF3632). This domain family is found in eukaryotes, and is approximately 170 amino acids in length. There is a conserved ALE sequence motif. 185
55018 372036 pfam12312 NeA_P2 Nepovirus subgroup A polyprotein. This family of proteins is found in viruses. Proteins in this family are typically between 259 and 1110 amino acids in length. The family is found in association with pfam03688, pfam03689, pfam03391. This family is one of the polyproteins expressed by Nepoviruses in subgroup A. 175
55019 372037 pfam12313 NPR1_like_C NPR1/NIM1 like defense protein C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 251 and 588 amino acids in length. The family is found in association with pfam00023, pfam00651. There are two conserved sequence motifs: LENRV and DLN. NPR1 (NIM1) is a defense protein in many plant species. 204
55020 403508 pfam12314 IMCp Inner membrane complex protein. This domain is found in bacteria and eukaryotes. This domain is about 120 amino acids in length. This family is the inner membrane complex of parasitic organisms. This is a cytoskeletal structure associated with the pellicle of these parasites. 87
55021 403509 pfam12315 DA1-like Protein DA1. Proteins in this family include protein DA1 and its homologs. In Arabidopsis thaliana, DA1 is an ubiquitin receptor that limits final seed and organ size by restricting the period of cell proliferation. It may act maternally to control seed mass. 214
55022 403510 pfam12316 Dsh_C Segment polarity protein dishevelled (Dsh) C terminal. This domain family is found in eukaryotes, and is typically between 177 and 207 amino acids in length. The family is found in association with pfam00778, pfam02377, pfam00610, pfam00595. The segment polarity gene dishevelled (dsh) is required for pattern formation of the embryonic segments. It is involved in the determination of body organisation through the Wingless pathway (analogous to the Wnt-1 pathway). 211
55023 403511 pfam12317 IFT46_B_C Intraflagellar transport complex B protein 46 C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 298 and 416 amino acids in length. IFT46 is a flagellar protein of complex B. Like all IFT proteins, it is required for transport of IFT particles into the flagella. 203
55024 403512 pfam12318 FAD-SLDH Membrane bound FAD containing D-sorbitol dehydrogenase. This family of proteins is found in bacteria. Proteins in this family are typically between 168 and 189 amino acids in length. There is a conserved ALM sequence motif. This family is a membrane protein (FAD-SLDH) involved in oxidation of D-sorbitol to L-sorbose. 160
55025 403513 pfam12319 TryThrA_C Tryptophan-Threonine-rich plasmodium antigen C terminal. This protein is found in eukaryotes. Proteins in this family are typically between 254 and 536 amino acids in length. This family is the C terminal of a surface antigen of malarial Plasmodium species. It is currently being targeted for use as part of a subunit vaccine against Plasmodium falciparum, the main species involved in causing human malaria. 214
55026 403514 pfam12320 SbcD_C Type 5 capsule protein repressor C-terminal domain. This domain is found in bacteria and archaea. This domain is about 90 amino acids in length. This domain is found associated with pfam00149. SbcD works in complex with SbcC (SbcDC) which is a transcription regulator. It down-regulates transcription of arl and mgr to inhibit type 5 capsule protein production. It acts as part of the SOS pathway of bacteria. 96
55027 315081 pfam12321 DUF3634 Protein of unknown function (DUF3634). This family of proteins is found in bacteria. Proteins in this family are typically between 103 and 114 amino acids in length. 98
55028 289120 pfam12322 T4_baseplate T4 bacteriophage base plate protein. This protein is found in viruses. Proteins in this family are typically between 208 and 249 amino acids in length. This protein has a single completely conserved residue S that may be functionally important. This family includes the two base plate proteins in T4 bacteriophages. These are gp51 and gp26, encoded by late genes. 132
55029 403515 pfam12323 HTH_OrfB_IS605 Helix-turn-helix domain. This is the N terminal helix-turn-helix domain of Transposase_2 pfam01385. 47
55030 372044 pfam12324 HTH_15 Helix-turn-helix domain of alkylmercury lyase. Alkylmercury lyase (EC:4.99.1.2) cleaves the carbon-mercury bond of organomercurials such as phenylmercuric acetate. This is the N terminal helix-turn-helix domain associated with pfam03243. 74
55031 403516 pfam12325 TMF_TATA_bd TATA element modulatory factor 1 TATA binding. This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family. 115
55032 403517 pfam12326 EOS1 N-glycosylation protein. This family is not required for survival of S.cerevisiae, but its deletion leads to heightened sensitivity to oxidative stress. It appears to be involved in N-glycosylation, and resides in the endoplasmic reticulum. 160
55033 403518 pfam12327 FtsZ_C FtsZ family, C-terminal domain. This family includes the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ is a GTPase, like tubulin. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. 95
55034 372048 pfam12328 Rpp20 Rpp20 subunit of nuclear RNase MRP and P. The nuclear RNase P of Saccharomyces cerevisiae is made up of at least nine protein subunits; Pop1, Pop3, Pop4, Pop5, Pop6, Pop7, Pop8, Rpr2 and Rpp1. Many of these subunits seem to be present also in the RNase MRP, with the exception of Rpr2 (Rpp21) which is unique to RNase P. Human nuclear RNase P and MRP appear to contain at least 10 protein subunits, Rpp14, Rpp20, Rpp21, Rpp25, Rpp29, Rpp30, Rpp38, Rpp40, hPop1 and hPop5, although there is recent evidence that not all of these subunits are shared between P and MRP. Archaeal RNase P has at least four protein subunits homologous to eukaryotic RNase P/MRP proteins. In the yeast RNase P, Pop6 and Pop7 (the Rpp20 homolog) interact with each other and they are both interaction partners of Pop4; in the human MRP Rpp25 and Rpp20 interact with each other and Rpp25 binds to Rpp29 (Pop4). 118
55035 372049 pfam12329 TMF_DNA_bd TATA element modulatory factor 1 DNA binding. This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. 74
55036 403519 pfam12330 Haspin_kinase Haspin like kinase domain. This family represents the haspin-like kinase domains. 369
55037 403520 pfam12331 DUF3636 Protein of unknown function (DUF3636). This domain family is found in eukaryotes, and is approximately 160 amino acids in length. 148
55038 403521 pfam12333 Ipi1_N Rix1 complex component involved in 60S ribosome maturation. This domain family is found in eukaryotes, and is typically between 91 and 105 amino acids in length. This family is the N terminal of Ipi1, a component of the Rix1 complex which works in conjunction with Rea1 to mature the 60S ribosome. 101
55039 372053 pfam12334 rOmpB Rickettsia outer membrane protein B. This domain family is found in bacteria, and is approximately 220 amino acids in length. The family is found in association with pfam03797. This family is the middle region of one of the outer membrane proteins of Rickettsia which is involved in adhesion to eukaryotic cells for uptake. 223
55040 403522 pfam12335 SBF2 Myotubularin protein. This domain family is found in eukaryotes, and is approximately 220 amino acids in length. The family is found in association with pfam02141, pfam03456, pfam03455. This family is the middle region of SBF2, a member of the myotubularin family. Myotubularin-related proteins have been suggested to work in phosphoinositide-mediated signalling events that may also convey control of myelination. Mutations of SBF2 are implicated in Charcot-Marie-Tooth disease. 227
55041 403523 pfam12336 SOXp SOX transcription factor. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam00505. There are two conserved sequence motifs: KKDK and LPG. This family is made up of SOX transcription factors. These are involved in upregulation of nestin, a neural promoter. 88
55042 289134 pfam12337 DUF3637 Protein of unknown function (DUF3637). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam00073, pfam08935. 67
55043 403524 pfam12338 RbcS Ribulose-1,5-bisphosphate carboxylase small subunit. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00101. There is a conserved APF sequence motif. There are two completely conserved residues (L and P) that may be functionally important. This family is the small subunit of ribulose-1,5-bisphosphate. 45
55044 403525 pfam12339 DNAJ_related DNA-J related protein. This domain family is found in bacteria, and is approximately 130 amino acids in length. The family is found in association with pfam00226. There is a conserved YYLD sequence motif. Most of the sequences in this family are annotated as DNA-J related proteins but there is little published evidence to back this up. 120
55045 403526 pfam12340 DUF3638 Protein of unknown function (DUF3638). This domain family is found in eukaryotes, and is approximately 230 amino acids in length. There are two conserved sequence motifs: LLE and NMG. 225
55046 403527 pfam12341 Mcl1_mid Minichromosome loss protein, Mcl1, middle region. Mcl1_mid, or the middle domain of minichromosome loss protein 1, is the domain that lies between a 7-bladed beta-propeller at the N-terminus, family WD40 pfam00400 etc, and a Homeobox (HMG) domain, pfam00505, at the C-terminus. The full length proteins with all three domains are referred to as DNA polymerase alpha accessory factor Mcl1, but the exact function of this domain is not known. 288
55047 152777 pfam12342 DUF3640 Protein of unknown function (DUF3640). This family of proteins is found in viruses. Proteins in this family are typically between 25 and 211 amino acids in length. 26
55048 403528 pfam12343 DEADboxA Cold shock protein DEAD box A. This domain family is found in bacteria, and is typically between 68 and 89 amino acids in length. The family is found in association with pfam00270, pfam00271, pfam03880. This family is the C terminal region of DEAD box A, a protein expressed under conditions of cold shock which is involved in various cellular processes such as transcription, translation and DNA recombination. 67
55049 403529 pfam12344 UvrB Ultra-violet resistance protein B. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00271, pfam02151, pfam04851. There are two conserved sequence motifs: YAD and RRR. This family is the C terminal region of the UvrB protein which conveys mutational resistance against UV light to various different species. 43
55050 403530 pfam12345 DUF3641 Protein of unknown function (DUF3641). This domain family is found in bacteria and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam04055. This family consists of proteins which are commonly annotated as Radical SAM domains but there is little annotation to back this up. 135
55051 372060 pfam12346 HJURP_mid Holliday junction recognition protein-associated repeat. Vertebrate Holliday junction recognition proteins carry an SCM3 domain at their N-terminus as do the eukaryotic fungi, but they also carry this central, conserved region. The function of this family is not known. Further downstream there is also a repeated domain, also of unknown function. Investigation of Scm3 and associated proteins is likely to be directly relevant to understanding the mechanism of HJURP-mediated CENP-A chromatin assembly at human centromeres. 115
55052 403531 pfam12347 HJURP_C Holliday junction regulator protein family C-terminal repeat. Although this family is conserved in the Holliday junction regulator, HJURP, proteins in higher eukaryotes, alongside an Scm3, pfam10384, family, its exact function is not known. The C-terminal region of Scm3 proteins has been evolving rapidly, and this short repeat at the C-terminal end can be present in up to two copies in the higher eukaryotes. 60
55053 403532 pfam12348 CLASP_N CLASP N terminal. This region is found at the N terminal of CLIP-associated proteins (CLASPs). CLASPs are widely conserved microtubule plus-end-tracking proteins that regulate the stability of dynamic microtubules. In yeast, Drosophila, and Xenopus, a single CLASP orthologue is present. In mammals, a second paralogue (CLASP2) exists which has some functional overlap with CLASP1. 227
55054 403533 pfam12349 Sterol-sensing Sterol-sensing domain of SREBP cleavage-activation. Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus. 153
55055 403534 pfam12350 CTK3_C CTD kinase subunit gamma CTK3 C-terminus. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2. It is not clear what independent function this C-terminal domain has. 62
55056 372065 pfam12351 Fig1 Ca2+ regulator and membrane fusion protein Fig1. During the mating process of yeast cells, two Ca2+ influx pathways become activated. The resulting elevation of cytosolic free Ca2+ activates downstream signaling factors that promote long term survival of unmated cells. Fig1 is a regulator of the low affinity Ca2+ influx system (LACS), and is also required for efficient membrane fusion during yeast mating. 181
55057 289148 pfam12352 V-SNARE_C Snare region anchored in the vesicle membrane C-terminus. Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. 66
55058 403535 pfam12353 eIF3g Eukaryotic translation initiation factor 3 subunit G. This domain family is found in eukaryotes, and is approximately 130 amino acids in length. The family is found in association with pfam00076. This family is subunit G of the eukaryotic translation initiation factor 3. Subunit G is required for eIF3 integrity. 126
55059 289150 pfam12354 Internalin_N Bacterial adhesion/invasion protein N terminal. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00560, pfam08191, pfam09479. There are two completely conserved residues (I and F) that may be functionally important. Internalin mediates bacterial adhesion and invasion of epithelial cells in the human intestine through specific interaction with its host cell receptor E-cadherin. This family is the N terminal of internalin, the cap domain of the protein. The cap domain is conserved between different internalin types. The cap domain does not interact with E cadherin, therefore its function is presumably structural: capping the hydrophobic core. 50
55060 315106 pfam12355 Dscam_C Down syndrome cell adhesion molecule C terminal. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam00047, pfam07679, pfam00041. The Down syndrome cell adhesion molecule (Dscam) belongs to a family of cell membrane molecules involved in the differentiation of the nervous system. This is the C terminal cytoplasmic tail region of Dscam. 118
55061 403536 pfam12356 BIRC6 Baculoviral IAP repeat-containing protein 6. BIRC6 is an anti-apoptotic protein which can regulate cell death by controlling caspases and by acting as an E3 ubiquitin-protein ligase. 175
55062 403537 pfam12357 PLD_C Phospholipase D C terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00168, pfam00614. There is a conserved FPD sequence motif. This family is the C terminal of phospholipase D. PLD is a major plant lipid-degrading enzyme which is involved in signal transduction. 69
55063 403538 pfam12358 DUF3644 Protein of unknown function (DUF3644). This domain family is found in bacteria, and is typically between 65 and 80 amino acids in length. 71
55064 403539 pfam12359 DUF3645 Protein of unknown function (DUF3645). This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a conserved HPD sequence motif. 33
55065 403540 pfam12360 Pax7 Paired box protein 7. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00046, pfam00292. Pax7 belongs to a family of genes that encode paired-box-containing transcription factors involved in the control of developmental processes. Pax7 has a distinct role in the specification of myogenic satellite cells. 45
55066 403541 pfam12361 DBP Duffy-antigen binding protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 449 and 1061 amino acids in length. The family is found in association with pfam05424. There are two conserved sequence motifs: NKNGG and QKHDF. This family is part of the Duffy-antigen binding protein of Plasmodium spp. This protein is an antigen on these parasites which enable them to invade erythrocytes. 156
55067 403542 pfam12362 DUF3646 DNA polymerase III gamma and tau subunits C terminal. This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam00004. The proteins in this family are frequently annotated as the gamma and tau subunits of DNA polymerase III, however there is little accompanying literature to back this up. 114
55068 372073 pfam12363 Phage_TAC_12 Phage tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins from Siphoviridae phages. 111
55069 372074 pfam12364 DUF3648 Protein of unknown function (DUF3648). This family of proteins is found in eukaryotes and viruses. Proteins in this family are typically between 53 and 3115 amino acids in length. There are two completely conserved residues (A and F) that may be functionally important. 141
55070 403543 pfam12365 DUF3649 Protein of unknown function (DUF3649). This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length. 26
55071 372076 pfam12366 Casc1 Cancer susceptibility candidate 1. This domain family is found in eukaryotes, and is typically between 216 and 263 amino acids in length. Casc1 has many SNPs associated with cancer susceptibility. 240
55072 403544 pfam12367 PFO_beta_C Pyruvate ferredoxin oxidoreductase beta subunit C terminal. This domain family is found in bacteria and archaea, and is approximately 70 amino acids in length. The family is found in association with pfam02775. There are two completely conserved residues (A and G) that may be functionally important. PFO is involved in carbon dioxide fixation via a reductive TCA cycle. It forms a heterodimer (alpha/beta). The beta subunit has binding motifs for Fe-S clusters and thiamine pyrophosphate. 63
55073 403545 pfam12368 Rhodanese_C Rhodanase C-terminal. Rhodanase_C is found as the domain-extension to Rhodanase enzyme in some members of the Rhodanase family. Rhodanase is pfam00581. 63
55074 403546 pfam12369 GnHR_trans Gonadotropin hormone receptor transmembrane region. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00560, pfam00001. There are two completely conserved C residues that may be functionally important. This family contains the transmembrane region of the receptors for follicle-stimulating hormone and luteinizing hormone - the two major gonadotropin hormone receptors. These receptors are G protein coupled receptors involved in development and maturation of germ cells in both fecund genders. The transmembrane region is conserved between the two different receptors while the extracellular ligand binding domains are less well conserved. 68
55075 372078 pfam12371 TMEM131_like Transmembrane protein 131-like. TMEM131_like is a family of bacterial, plant and other metazoa transmembrane proteins. Many of the members are multi-pass transmembrane proteins. 84
55076 403547 pfam12372 DUF3652 Huntingtin protein region. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam02985. This family is in the middle region of the Huntingtin protein associated with Huntington's disease. The protein is of unknown function, however it is known that a polyglutamine (CAG) repeat in the gene coding for it results in the development of Huntington's disease. 41
55077 372080 pfam12373 Msg2_C Major surface glycoprotein 2 C terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam02349. This family is the C terminal of major surface glycoprotein 2 of virulent bacteria. It is a virulence factor antigen. 30
55078 403548 pfam12374 Dmrt1 Double-sex mab3 related transcription factor 1. This domain family is found in eukaryotes, and is typically between 61 and 73 amino acids in length. The family is found in association with pfam00751. This family is a transcription factor involved in sex determination. The proteins in this family contain a zinc finger-like DNA-binding motif, DM domain. 72
55079 372082 pfam12375 DUF3653 Phage protein. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 112 and 194 amino acids in length. 66
55080 289169 pfam12376 DUF3654 Protein of unknown function (DUF3654). This family of proteins is found in eukaryotes. Proteins in this family are typically between 193 and 612 amino acids in length. 138
55081 315124 pfam12377 DuffyBP_N Duffy binding protein N terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam05424. This family contains the N-terminus of the Duffy receptor binding domain. 67
55082 315125 pfam12378 CytadhesinP1 Trypsin-sensitive surface-exposed protein. This domain family is found in bacteria, and is typically between 67 and 79 amino acids in length. This family contains trypsin-sensitive surface-exposed proteins called cytadhesins. Cytadhesins are virulence factor proteins which mediate attachment of bacterial cells to host cells for invasion. 72
55083 403549 pfam12379 DUF3655 Protein of unknown function (DUF3655). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam08716, pfam01661, pfam05409, pfam06471, pfam08717, pfam06478, pfam09401, pfam06460, pfam08715, pfam08710. 151
55084 289173 pfam12380 Peptidase_C62 Gill-associated viral 3C-like peptidase. This peptidase is from a positive-stranded RNA virus of prawns, and has been called yellow head virus protease and gill-associated virus 3C-like peptidase. The GAV cysteine protease is predicted to be the key enzyme in the processing of the GAV replicase polyprotein precursors, pp1a and pp1ab. This protease employs a Cys(2968)-His(2879) catalytic dyad. 284
55085 152816 pfam12381 Peptidase_C3G Tungro spherical virus-type peptidase. This is the protease for self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The protease activity of the polyprotein is at the C-terminal end, adjacent to the putative RNA polymerase. 231
55086 152817 pfam12382 Peptidase_A2E Retrotransposon peptidase. This is a small family of fungal retroviral aspartyl peptidases. 137
55087 289174 pfam12383 SARS_3b Severe acute respiratory syndrome coronavirus 3b protein. This family of proteins is found in viruses. Proteins in this family are typically between 32 and 154 amino acids in length. This family contains the SARS coronavirus 3b protein which is predominantly localized in the nucleolus, and induces G0/G1 arrest and apoptosis in transfected cells. 153
55088 152819 pfam12384 Peptidase_A2B Ty3 transposon peptidase. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species. 177
55089 403550 pfam12385 Peptidase_C70 Papain-like cysteine protease AvrRpt2. This is a family of cysteine proteases, found in actinobacteria, proteobacteria and firmicutes. Papain-like cysteine proteases play a crucial role in plant-pathogen/pest interactions. On entering the host they act on non-self substrates, thereby manipulating the host to evade proteolysis. AvrRpt2 from Pseudomonas syringae pv. tomato DC3000 triggers resistance to P. syringae-2-dependent defense responses, including hypersensitive cell death, by cleaving the Arabidopsis RIN4 protein which is monitored by the cognate resistance protein RPS2. 143
55090 403551 pfam12386 Peptidase_C71 Pseudomurein endo-isopeptidase Pei. This peptidase has the catalytic triad C-H-D at the C-terminal end, a triad similar to that in thiol proteases and animal transglutaminases. It catalyzes the in vitro lysis of M. marburgensis cells under reducing conditions and exhibits characteristics of metal-activated peptidases. 149
55091 289175 pfam12387 Peptidase_C74 Pestivirus NS2 peptidase. The pestivirus NS2 peptidase is responsible for single cleavage between NS2 and NS3 of the bovine viral diarrhea virus polyprotein, a cleavage that is correlated with cytopathogenicity. The peptidase is activated by its interaction with 'J-domain protein interacting with viral protein'. 200
55092 403552 pfam12388 Peptidase_M57 Dual-action HEIGH metallo-peptidase. The catalytic triad for this family of proteases is HE-H-H, which in many members is in the sequence motif HEIGH. 211
55093 372085 pfam12389 Peptidase_M73 Camelysin metallo-endopeptidase. 196
55094 403553 pfam12390 Se-cys_synth_N Selenocysteine synthase N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam03841. There is a single completely conserved residue P that may be functionally important. This family is the N terminal region of selenocysteine synthase which catalyzes the conversion of seryl-tRNA(Sec) into selenocysteyl-tRNA(Sec). 40
55095 403554 pfam12391 PCDO_beta_N Protocatechuate 3,4-dioxygenase beta subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam00775. There are two completely conserved residues (Y and R) that may be functionally important. This family is the N terminal region of the beta subunit of protocatechuate 3,4-dioxygenase. This enzyme utilizes a mononuclear, non-heme Fe3+ centre to catalyze metabolic cellular reactions. 32
55096 403555 pfam12392 DUF3656 Collagenase. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam01136. 102
55097 152828 pfam12393 Dr_adhesin Dr family adhesin. This domain family is found in bacteria, and is approximately 20 amino acids in length. The family is found in association with pfam04619. This family is the Dr-family adhesin expressed by uropathogenic E. coli. 21
55098 403556 pfam12394 DUF3657 Protein FAM135. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam05057. 64
55099 403557 pfam12395 DUF3658 Protein of unknown function. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam08874. There are two completely conserved residues (D and R) that may be functionally important. 107
55100 403558 pfam12396 DUF3659 Protein of unknown function (DUF3659). This domain family is found in bacteria and eukaryotes, and is approximately 70 amino acids in length. 59
55101 403559 pfam12397 U3snoRNP10 U3 small nucleolar RNA-associated protein 10. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam08146. This family is the protein associated with U3 snoRNA which is involved in the processing of pre-rRNA. 116
55102 403560 pfam12398 DUF3660 Receptor serine/threonine kinase. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00954, pfam01453, pfam00069, pfam08276. There is a conserved ELPL sequence motif. 42
55103 403561 pfam12399 BCA_ABC_TP_C Branched-chain amino acid ATP-binding cassette transporter. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00005. There is a conserved AYLG sequence motif. This family is the C terminal of an ATP dependent branched-chain amino acid transporter. 23
55104 403562 pfam12400 STIMATE STIMATE family. STIMATE is an ER-resident multi-transmembrane protein that serves as a positive regulator of Ca(2+) influx in vertebrates. It interacts with the ER-resident Ca2+ sensor protein STIM1 to promote the STIM1 conformational switch. This entry also includes budding yeast YPL162C. 124
55105 403563 pfam12401 DUF3662 Protein of unknown function (DUF3662). This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam00498. 115
55106 403564 pfam12402 nlz1 NocA-like zinc-finger protein 1. This domain family is found in eukaryotes, and is typically between 42 and 57 amino acids in length. There is a conserved GAY sequence motif. There is a single completely conserved residue G that may be functionally important. Nlz1 self-associated via its C-terminus, interacted with Nlz2, and bound to histone deacetylases. 58
55107 403565 pfam12403 Pax2_C Paired-box protein 2 C terminal. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00292. This family is the C terminal of the paired-box protein 2 which is a transcription factor involved in embryonic development and organogenesis. 112
55108 403566 pfam12404 DUF3663 Peptidase. This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam00883. There is a conserved WAF sequence motif. 77
55109 289191 pfam12406 DUF3664 Surface protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 131 and 312 amino acids in length. 99
55110 403567 pfam12407 Abdominal-A Homeobox protein. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00046. This family is a homeobox protein involved in differentiation of embryonic cells to form the abdominal region. 24
55111 403568 pfam12408 DUF3666 Ribose-5-phosphate isomerase. This domain family is found in bacteria, and is approximately 50 amino acids in length. The family is found in association with pfam02502. There are two completely conserved residues (D and F) that may be functionally important. 46
55112 403569 pfam12409 P5-ATPase P5-type ATPase cation transporter. This domain family is found in eukaryotes, and is typically between 110 and 126 amino acids in length. The family is found in association with pfam00122, pfam00702. P-type ATPases comprise a large superfamily of proteins, present in both prokaryotes and eukaryotes, that transport inorganic cations and other substrates across cell membranes. 125
55113 289195 pfam12410 rpo30_N Poxvirus DNA dependent RNA polymerase 30kDa subunit. This family of proteins is found in viruses. Proteins in this family are typically between 193 and 259 amino acids in length. The family is found in association with pfam01096. There are two conserved sequence motifs: GIEYSKD and LRY. This family is N terminal of the 30 kDa subunit of poxvirus DNA-d-RNA-pol. It has structural similarity to the eukaryotic transcriptional elongation factor SII. 135
55114 403570 pfam12411 Choline_sulf_C Choline sulfatase enzyme C terminal. This domain family is found in bacteria, eukaryotes and viruses, and is approximately 60 amino acids in length. The family is found in association with pfam00884. There are two completely conserved residues (R and W) that may be functionally important. This family is the C terminal of choline sulfatase, the enzyme responsible for catalyzing the conversion of choline-O-sulfate and, at a lower rate, phosphorylcholine, into choline. 53
55115 403571 pfam12412 DUF3667 Protein of unknown function (DUF3667). This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. There is a single completely conserved residue P that may be functionally important. 45
55116 403572 pfam12413 DLL_N Homeobox protein distal-less-like N terminal. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam00046. This family is the N terminal of a homeobox protein involved in embryonic development and adult neural regeneration. 83
55117 403573 pfam12414 Fox-1_C Calcitonin gene-related peptide regulator C terminal. This domain family is found in eukaryotes, and is typically between 69 and 99 amino acids in length. The family is found in association with pfam00076. This family is the C terminal of Fox-1, a protein involved in the regulation of calcitonin gene-related peptide to mediate the neuron-specific splicing pattern. Fox-1, with Fox-2, functions to repress exon 4 inclusion. 95
55118 289200 pfam12415 rpo132 Poxvirus DNA dependent RNA polymerase. This domain family is found in viruses, and is approximately 30 amino acids in length. The family is found in association with pfam04566, pfam00562, pfam04567, pfam04560, pfam04565. This family is the second largest subunit of the poxvirus DNA dependent RNA polymerase. It has structural similarity to the second-largest RNA polymerase subunits of eubacteria, archaebacteria, and eukaryotes. 32
55119 403574 pfam12416 DUF3668 Cep120 protein. This family includes the Cep120 protein which is associated with centriole structure and function. 226
55120 403575 pfam12417 DUF3669 Zinc finger protein. This domain family is found in eukaryotes, and is typically between 64 and 80 amino acids in length. 66
55121 403576 pfam12418 AcylCoA_DH_N Acyl-CoA dehydrogenase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam02770, pfam00441, pfam02771. This family is one of the enzymes involved in AcylCoA interaction in beta-oxidation. 32
55122 403577 pfam12419 DUF3670 SNF2 Helicase protein. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam00271, pfam00176. Most of the proteins in this family are annotated as SNF2 helicases but there is little accompanying literature to confirm this. 136
55123 372098 pfam12420 DUF3671 Protein of unknown function. This domain family is found in eukaryotes, and is typically between 96 and 116 amino acids in length. 114
55124 289206 pfam12421 DUF3672 Fibronectin type III protein. This domain family is found in bacteria and viruses, and is typically between 126 and 146 amino acids in length. The family is found in association with pfam09327, pfam00041. There are two completely conserved G residues that may be functionally important. Many of the proteins in this family are annotated as fibronectin type III however there is little accompanying literature to confirm this. 133
55125 403578 pfam12422 Condensin2nSMC Condensin II non structural maintenance of chromosomes subunit. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. This family is part of a non-SMC subunit of condensin II which is involved in maintenance of the structural integrity of chromosomes. Condensin II is made up of SMC (structural maintenance of chromosomes) and non-SMC subunits. The non-SMC subunits bind to the catalytic ends of the SMC subunit dimer. The condensin holocomplex is able to introduce superhelical tension into DNA in an ATP hydrolysis-dependent manner, resulting in the formation of positive supercoils in the presence of topoisomerase I and of positive knots in the presence of topoisomerase II. 148
55126 403579 pfam12423 KIF1B Kinesin protein 1B. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells. 43
55127 403580 pfam12424 ATP_Ca_trans_C Plasma membrane calcium transporter ATPase C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00689, pfam00122, pfam00702, pfam00690. There is a conserved QTQ sequence motif. This family is the C terminal of a calcium transporting ATPase located in the plasma membrane. 47
55128 338347 pfam12425 DUF3673 Protein of unknown function (DUF3673). This domain family is found in eukaryotes, and is approximately 50 amino acids in length. 53
55129 289211 pfam12426 DUF3674 RNA dependent RNA polymerase. This domain family is found in viruses, and is approximately 40 amino acids in length. There is a conserved MFNLKF sequence motif. There are two completely conserved residues (E and P) that may be functionally important. 41
55130 372102 pfam12427 DUF3665 Branched-chain amino acid aminotransferase. This domain family is found in bacteria, and is typically between 23 and 35 amino acids in length. The family is found in association with pfam01063. There is a conserved TRT sequence motif. 22
55131 403581 pfam12428 DUF3675 Protein of unknown function (DUF3675). This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam00097. There are two completely conserved residues (R and L) that may be functionally important. 119
55132 338349 pfam12429 DUF3676 Protein of unknown function (DUF3676). This domain family is found in eukaryotes, and is approximately 230 amino acids in length. 230
55133 403582 pfam12430 ABA_GPCR Abscisic acid G-protein coupled receptor. This domain family is found in eukaryotes, and is typically between 177 and 216 amino acids in length. This family is part of the abscisic acid (ABA) G-protein coupled receptor. ABA is a stress hormone in plants. 186
55134 403583 pfam12431 CitT Transcriptional regulator. This domain family is found in bacteria, and is approximately 30 amino acids in length. The family is found in association with pfam00072. There is a single completely conserved residue G that may be functionally important. CitT is a transcriptional regulator which allows transcription of the citM gene which codes for the secondary transporter in the Mg-citrate transport complex. 30
55135 403584 pfam12432 DUF3677 Protein of unknown function (DUF3677). This domain family is found in eukaryotes, and is approximately 80 amino acids in length. 81
55136 403585 pfam12433 PV_NSP1 Parvovirus non-structural protein 1. This family of proteins is found in viruses. Proteins in this family are typically between 109 and 668 amino acids in length. Parvoviral NSPs regulate host gene expression through histone acetylation. 71
55137 289219 pfam12434 Malate_DH Malate dehydrogenase enzyme. This domain family is found in bacteria, and is approximately 30 amino acids in length. The family is found in association with pfam00390, pfam03949, pfam01515. There is a conserved AAL sequence motif. There is a single completely conserved residue R that may be functionally important. Malate dehydrogenase is one of the enzymes involved in the citric acid cycle in mitochondria. It converts malate to oxaloacetate using NAD as a cofactor. 28
55138 372107 pfam12435 DUF3678 Protein of unknown function (DUF3678). This domain family is found in eukaryotes, and is approximately 40 amino acids in length. 35
55139 403586 pfam12436 USP7_ICP0_bdg ICP0-binding domain of Ubiquitin-specific protease 7. This domain is one of two C-terminal domains on the much longer ubiquitin-specific proteases. This particular one is found to interact with the herpesvirus 1 trans-acting transcriptional protein ICP0/VMW110. 246
55140 403587 pfam12437 GSIII_N Glutamine synthetase type III N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 160 amino acids in length. The family is found in association with pfam00120. This family is the N terminal region of glutamine synthetase type III, which is one of the enzymes responsible for generation of glutamine through conversion of glutamate to glutamine by the incorporation of ammonia (NH3). 160
55141 315165 pfam12438 DUF3679 Protein of unknown function (DUF3679). This domain family is found in bacteria, and is approximately 60 amino acids in length. 56
55142 403588 pfam12439 GDE_N Glycogen debranching enzyme N terminal. This domain family is found in bacteria and archaea, and is typically between 218 and 229 amino acids in length. The family is found in association with pfam06202. Glycogen debranching enzyme catalyzes the debranching of amylopectin in glycogen. This is done by transferring three glucose subunits of glycogen from one parallel chain to another. This has the effect of enabling the glucose residues to become more accessible for glycolysis. 209
55143 403589 pfam12440 MAGE_N Melanoma associated antigen family N terminal. This domain family is found in eukaryotes, and is typically between 82 and 96 amino acids in length. The family is found in association with pfam01454. This family is the N terminal of various melanoma associated antigens. These are tumor rejection antigens which are expressed on HLA-A1 of tumor cells and they are recognized by cytotoxic T lymphocytes (CTLs). 90
55144 372110 pfam12441 CopG_antitoxin CopG antitoxin of type II toxin-antitoxin system. CopG antitoxin is a member of a type II toxin-antitoxin system family found in bacteria and archaea. Most antitoxins encoded by the relBE and parDE loci belong to the MetJ/Arc/CopG family of dimeric proteins which bind DNA through N-terminal ribbon-helix-helix (RHH) motifs. The toxin for CopG proteins falls into the family BrnT_toxin, pfam04365. 79
55145 403590 pfam12442 DUF3681 Protein of unknown function (DUF3681). This family of proteins is found in eukaryotes. Proteins in this family are typically between 112 and 212 amino acids in length. There is a single completely conserved residue G that may be functionally important. 95
55146 403591 pfam12443 AKNA AT-hook-containing transcription factor. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. This family contains a transcription factor which regulates the expression of the costimulatory molecules on lymphocytes. 96
55147 403592 pfam12444 Sox_N Sox developmental protein N terminal. This domain family is found in eukaryotes, and is typically between 69 and 88 amino acids in length. The family is found in association with pfam00505. There are two conserved sequence motifs: YDW and PVR. This family contains Sox8, Sox9 and Sox10 proteins which have structural similarity. Sox proteins are involved in developmental processes. 76
55148 372114 pfam12445 FliC Flagellin protein. This domain family is found in bacteria, and is typically between 125 and 147 amino acids in length. The family is found in association with pfam00669, pfam00700. There are two completely conserved G residues that may be functionally important. This family is the flagellin motor protein which confers motility to bacterial cells. 127
55149 289231 pfam12446 DUF3682 Protein of unknown function (DUF3682). This domain family is found in eukaryotes, and is typically between 125 and 136 amino acids in length. 129
55150 403593 pfam12447 DUF3683 Protein of unknown function (DUF3683). This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam02754, pfam01565, pfam02913. 114
55151 403594 pfam12448 Milton Kinesin associated protein. This domain family is found in eukaryotes, and is typically between 143 and 173 amino acids in length. The family is found in association with pfam04849. This family is a region of the protein milton. Milton recruits the heavy chain of kinesin to mitochondria to allow the motor movement function of kinesin. 168
55152 403595 pfam12449 DUF3684 Protein of unknown function (DUF3684). This domain family is found in eukaryotes, and is typically between 1072 and 1090 amino acids in length. 1093
55153 403596 pfam12450 vWF_A von Willebrand factor. This domain family is found in bacteria, and is approximately 100 amino acids in length. The family is found in association with pfam00092. There are two conserved sequence motifs: STF and DVD. There are two completely conserved residues (E and N) that may be functionally important. In hemostasis, platelet adhesion to the damaged vessel wall is mediated by several proteins, including von Willebrand factor. In solution vWF becomes immobilized via its A3 domain on the fibrillar collagen of the vessel wall and acts as an intermediary between collagen and the platelet receptor glycoprotein Ibalpha (GPIbalpha), which is the only platelet receptor that does not require prior activation for bond formation. 94
55154 403597 pfam12451 VPS11_C Vacuolar protein sorting protein 11 C terminal. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. Vps 11 is one of the evolutionarily conserved class C vacuolar protein sorting genes (c-vps: vps11, vps16, vps18, and vps33), whose products physically associate to form the c-vps protein complex required for vesicle docking and fusion. 44
55155 403598 pfam12452 DUF3685 Protein of unknown function (DUF3685). This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. There are two completely conserved residues (L and D) that may be functionally important. 192
55156 403599 pfam12453 PTP_N Protein tyrosine phosphatase N terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00041. There is a single completely conserved residue L that may be functionally important. This family consists of various protein tyrosine phosphatase haematopoietic receptors, e.g. CD45, which dephosphorylate growth stimulating proteins. This limits growth signalling in haematopoietic cells. 26
55157 372120 pfam12454 Ecm33 GPI-anchored cell wall organization protein. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. Ecm33 is an essential cell wall component and is important for cell wall integrity. 40
55158 403600 pfam12455 Dynactin Dynein associated protein. This domain family is found in eukaryotes, and is approximately 280 amino acids in length. The family is found in association with pfam01302. There is a single completely conserved residue E that may be functionally important. Dynactin has been associated with Dynein, a motor protein which is involved in organelle transport, mitotic spindle assembly and chromosome segregation. Dynactin anchors Dynein to specific subcellular structures. 286
55159 403601 pfam12456 hSac2 Inositol phosphatase. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam02383. hSac2 functions as an inositol polyphosphate 5-phosphatase. 110
55160 403602 pfam12457 TIP_N Tuftelin interacting protein N terminal. This domain family is found in eukaryotes, and is typically between 99 and 114 amino acids in length. The family is found in association with pfam08697, pfam01585. There are two completely conserved residues (G and F) that may be functionally important. TIP is involved in enamel assembly by interacting with one of the major proteins responsible for biomineralisation of enamel - tuftelin. 93
55161 403603 pfam12458 DUF3686 ATPase involved in DNA repair. This domain family is found in bacteria, and is approximately 450 amino acids in length. There are two conserved sequence motifs: DVF and SPNGED. 446
55162 403604 pfam12459 DUF3687 D-Ala-teichoic acid biosynthesis protein. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There are two completely conserved residues (L and Y) that may be functionally important. 43
55163 403605 pfam12460 MMS19_C RNAPII transcription regulator C-terminal. MMS19 is required for both nucleotide excision repair (NER) and RNA polymerase II (RNAP II) transcription. This C-terminal domain, along with the N-terminal, MMS19_N, form part of a silencing complex in fission yeast that contains Dos2, Rik1, Mms19 and Cdc20 (the catalytic subunit of DNA polymerase-epsilon). This complex regulates RNA polymerase II (RNA Pol II) activity in heterochromatin and is required for DNA replication and heterochromatin assembly. 423
55164 315187 pfam12461 DUF3688 Protein of unknown function (DUF3688). This domain family is found in bacteria and viruses, and is typically between 79 and 104 amino acids in length. There is a conserved YRW sequence motif. There is a single completely conserved residue Y that may be functionally important. 727
55165 403606 pfam12462 Helicase_IV_N DNA helicase IV / RNA helicase N terminal. This domain family is found in bacteria, and is approximately 170 amino acids in length. This family is found in bacterial DNA helicase IV, at the N-terminus of pfam00580. 164
55166 403607 pfam12463 DUF3689 Protein of unknown function (DUF3689). This family of proteins is found in eukaryotes. Proteins in this family are typically between 399 and 797 amino acids in length. 309
55167 403608 pfam12464 Mac Maltose acetyltransferase. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00132. Mac uses acetyl-CoA as acetyl donor to acetylate cytoplasmic maltose. 52
55168 403609 pfam12465 Pr_beta_C Proteasome beta subunits C terminal. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00227. There is a conserved GTT sequence motif. There is a single completely conserved residue Y that may be functionally important. This family includes the C terminal of the beta-type subunits of the proteasome, a multimeric complex that degrades proteins into peptides as part of the MHC class I-mediated Ag-presenting pathway. 35
55169 372129 pfam12466 GDH_N Glutamate dehydrogenase N terminal. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam05088. There is a conserved ALR sequence motif. Glutamate dehydrogenase (GDH) is a homohexameric, mitochondrial enzyme that reversibly catalyzes the oxidative deamination of L-glutamate to 2-oxoglutarate using either NADP(H) or NAD(H) with comparable efficacy. 95
55170 289250 pfam12467 CMV_1a Cucumber mosaic virus 1a protein family. This domain family is found in viruses, and is typically between 156 and 171 amino acids in length. The family is found in association with pfam01443, pfam01660. 1a protein is the major virulence factor of the cucumber mosaic virus (CMV). The Ns strain of CMV causes necrotic lesions to Nicotiana spp. while other strains cause systemic mosaic. The determinant of the pathogenesis of these different strains is the specific amino acid residue at position 461 of the 1a protein. 184
55171 403610 pfam12468 TTSSLRR Type III secretion system leucine rich repeat protein. This domain family is found in bacteria, and is approximately 50 amino acids in length. There are two completely conserved residues (Y and W) that may be functionally important. This family consists of leucine-rich repeat proteins involved in type III secretion. 54
55172 403611 pfam12469 DUF3692 CRISPR-associated protein. This domain family is found in bacteria and archaea, and is typically between 101 and 138 amino acids in length. The proteins in this family are frequently annotated as CRISPR-associated proteins however there is little accompanying literature to confirm this. 112
55173 403612 pfam12470 SUFU_C Suppressor of Fused Gli/Ci N terminal binding domain. This domain family is found in eukaryotes, and is typically between 192 and 219 amino acids in length. The family is found in association with pfam05076. There is a conserved HGRHFT sequence motif. This family is the C terminal domain of the Suppressor of Fused protein (Su(fu)). Su(fu) is a repressor of the Gli and Ci transcription factors of the Hedgehog signalling cascade. It functions by binding these proteins and preventing their translocation to the nucleus. The C terminal domain is only found in eukaryotic Su(fu) proteins; it is not present in bacterial homologs. The C terminal domain binds to the N terminal of Gli/Ci while the N terminal of Su(fu) binds to the C terminal of Gli/Ci. This dual binding mechanism is likely an evolutionary advancement in this signalling cascade which is not present in bacterial homologs. 215
55174 403613 pfam12471 GTP_CH_N GTP cyclohydrolase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. This family is the N terminal of GTP cyclohydrolase, the rate limiting enzyme in the synthesis of tetrahydrobiopterin. 193
55175 372132 pfam12472 DUF3693 Phage related protein. This domain family is found in bacteria and viruses, and is approximately 60 amino acids in length. 60
55176 403614 pfam12473 DUF3694 Kinesin protein. This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important. 149
55177 403615 pfam12474 PKK Polo kinase kinase. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam00069. Polo-like kinase 1 (Plx1) is essential during mitosis for the activation of Cdc25C, for spindle assembly, and for cyclin B degradation. This family is Polo kinase kinase (PKK) which phosphorylates Polo kinase and Polo-like kinase to activate them. PKK is a serine/threonine kinase. 140
55178 289258 pfam12475 Amdo_NSP Amdovirus non-structural protein. This domain family is found in viruses, and is approximately 50 amino acids in length. This family contains proteins of each of the four types of Amdovirus non-structural protein. 54
55179 403616 pfam12476 DUF3696 Protein of unknown function (DUF3696). This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length. 53
55180 403617 pfam12477 TraW_N Sex factor F TraW protein N terminal. This domain family is found in bacteria, and is approximately 30 amino acids in length. There is a single completely conserved residue G that may be functionally important. The traW gene of the E. coli K-12 sex factor, F, encodes one of the numerous proteins required for conjugative transfer of this plasmid. 30
55181 372135 pfam12478 DUF3697 Ubiquitin-associated protein 2. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00627. There are two conserved sequence motifs: AVEMPG and QFG. 32
55182 372136 pfam12479 DUF3698 Protein of unknown function (DUF3698). This domain family is found in eukaryotes, and is typically between 89 and 105 amino acids in length. 101
55183 403618 pfam12480 DUF3699 Protein of unknown function (DUF3699). This domain family is found in eukaryotes, and is approximately 80 amino acids in length. 71
55184 403619 pfam12481 DUF3700 Aluminium induced protein. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: YGL and LRDR. This family is related to GATase enzyme domains. 228
55185 372139 pfam12482 DUF3701 Phage integrase protein. This domain family is found in bacteria, and is approximately 100 amino acids in length. The family is found in association with pfam00589. 88
55186 403620 pfam12483 GIDE E3 Ubiquitin ligase. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 150 and 163 amino acids in length. There is a single completely conserved residue E that may be functionally important. GIDE is an E3 ubiquitin ligase which is involved in inducing apoptosis. 160
55187 372141 pfam12484 PE_PPE_C Polymorphic PE/PPE proteins C terminal. This domain family is found in bacteria, and is approximately 90 amino acids in length. The family is found in association with pfam00823. There is a conserved SVP sequence motif. There is a single completely conserved residue W that may be functionally important. The proteins in this family are PE/PPE proteins implicated in immunostimulation and virulence. 80
55188 403621 pfam12485 SLY Lymphocyte signaling adaptor protein. This domain family is found in eukaryotes, and is typically between 144 and 156 amino acids in length. The family is found in association with pfam07647, pfam07653. There is a conserved LGKK sequence motif. SLY contains a Src homology 3 domain and a sterile alpha motif, suggesting that it functions as a signaling adaptor protein in lymphocytes. 154
55189 403622 pfam12486 VasL Type VI secretion system, EvfB, or VasL. EvfB or VasL is a domain found on many Gram-negative proteins with an ImpA_N domain at the N-terminus. These proteins are expressed from the pathogenicity locus that forms the bacterial type VI secretion system. The exact function of VasL is not known. One E. coli member is annotated as being EvfB, though the E. coli equivalent of ImpA would be expected to be EvfG. It is possible that in many bacteria what is a single protein in one species, e.g. E. coli, is a fusion of two genes in others, which would explain an ImpA at the N-terminus and a VasL at the C-terminus. 147
55190 403623 pfam12487 DUF3703 Protein of unknown function (DUF3703). This family of proteins is found in bacteria. Proteins in this family are typically between 113 and 135 amino acids in length. 109
55191 403624 pfam12488 DUF3704 Protein of unknown function (DUF3704). This domain family is found in eukaryotes, and is approximately 30 amino acids in length. 27
55192 403625 pfam12489 ARA70 Nuclear coactivator. This domain family is found in eukaryotes, and is typically between 127 and 138 amino acids in length. This family is ARA70, a nuclear coactivator which interacts with peroxisome proliferator-activated receptor gamma (PPARgamma) to regulate transcription and the addition of the PPARgamma ligand (prostaglandin J2) enhances this interaction. 131
55193 403626 pfam12490 BCAS3 Breast carcinoma amplified sequence 3. This domain family is found in eukaryotes, and is typically between 229 and 245 amino acids in length. The proteins in this family have been shown to be proto-oncogenes implicated in the development of breast cancer. 240
55194 403627 pfam12491 ApoB100_C Apolipoprotein B100 C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. There are two conserved sequence motifs: QLS and LIDL. ApoB100 has an essential role in the assembly and secretion of triglyceride-rich lipoproteins and lipids transport. 57
55195 289275 pfam12493 DUF3709 Protein of unknown function (DUF3709). This domain family is found in bacteria, and is approximately 30 amino acids in length. There are two conserved sequence motifs: RCLMK and LIEL. 33
55196 372148 pfam12494 DUF3695 Protein of unknown function (DUF3695). This family of proteins is found in eukaryotes. Proteins in this family are typically between 157 and 192 amino acids in length. There is a single completely conserved residue D that may be functionally important. 95
55197 403628 pfam12495 Vip3A_N Vegetative insecticide protein 3A N terminal. This family of proteins is found in bacteria. Proteins in this family are typically between 170 and 789 amino acids in length. The family is found in association with pfam02018. Vip3A represents a novel class of proteins insecticidal to lepidopteran insect larvae. 177
55198 403629 pfam12496 BNIP2 Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2. This domain family is found in eukaryotes, and is typically between 119 and 133 amino acids in length. There is a conserved HGGY sequence motif. This family is Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2. It interacts with pro- and anti- apoptotic molecules in the cell. 135
55199 403630 pfam12497 ERbeta_N Estrogen receptor beta. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00104, pfam00105. There is a conserved IPS sequence motif. There are two completely conserved residues (Y and W) that may be functionally important. ERbeta binds estrogens with an affinity similar to that of ERalpha, and activates expression of reporter genes containing estrogen response elements in an estrogen-dependent manner. ERbeta acts as a transcription factor once bound to its ligand and it can dimerize with ERalpha. 114
55200 403631 pfam12498 bZIP_C Basic leucine-zipper C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 174 and 411 amino acids in length. The family is found in association with pfam00170. There is a conserved KVK sequence motif. There is a single completely conserved residue K that may be functionally important. Various bZIP proteins have been found and shown to play a role in seed-specific gene expression. bZIP binds to the alpha-globulin gene promoter, but not to promoters of other major storage genes such as glutelin, prolamin and albumin. 122
55201 403632 pfam12499 DUF3707 Pherophorin. This domain family is found in eukaryotes, and is typically between 147 and 160 amino acids in length. The proteins in this family are frequently annotated as pherophorins however there is little accompanying literature to confirm this. 139
55202 403633 pfam12500 TRSP TRSP domain C-terminus to PRTase_2. This domain occurs C-terminal to PRTase_2 and has highly conserved GXXE and TRSP signatures. It is found in bacteria. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 128
55203 403634 pfam12501 DUF3708 Phosphate ATP-binding cassette transporter. This domain family is found in bacteria, and is typically between 143 and 173 amino acids in length. The family is found in association with pfam00528. There is a single completely conserved residue P that may be functionally important. 165
55204 403635 pfam12502 DUF3710 Protein of unknown function (DUF3710). This family of proteins is found in bacteria. Proteins in this family are typically between 237 and 284 amino acids in length. There are two conserved sequence motifs: DLG and DGPRW. 177
55205 289285 pfam12503 CMV_1a_C Cucumber mosaic virus 1a protein C terminal. This domain family is found in viruses, and is approximately 90 amino acids in length. The family is found in association with pfam01443, pfam01660. There is a conserved GLG sequence motif. 1a protein is the major virulence factor of the cucumber mosaic virus (CMV). The Ns strain of CMV causes necrotic lesions to Nicotiana spp. while other strains cause systemic mosaic. The determinant of the pathogenesis of these different strains is the specific amino acid residue at the 461 residue of the 1a protein. 84
55206 403636 pfam12505 DUF3712 Protein of unknown function (DUF3712). This domain family is found in eukaryotes, and is approximately 130 amino acids in length. 125
55207 289287 pfam12506 DUF3713 Protein of unknown function (DUF3713). This family of proteins is found in bacteria. Proteins in this family are typically between 92 and 1225 amino acids in length. There is a single completely conserved residue S that may be functionally important. 115
55208 403637 pfam12507 HCMV_UL139 Human Cytomegalovirus UL139 protein. This family of proteins is found in eukaryotes and viruses. Proteins in this family are approximately 140 amino acids in length. UL139 product shared sequence homology with human CD24, a signal transducer modulating B-cell activation responses, and the sequences in the G1c variant of UL139 contained a specific attachment site of prokaryotic membrane lipoprotein lipid. 100
55209 403638 pfam12508 Transposon_TraM Conjugative transposon, TraM. Proteins in this entry are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. 194
55210 403639 pfam12509 DUF3715 Protein of unknown function (DUF3715). This domain family is found in eukaryotes, and is approximately 170 amino acids in length. 150
55211 403640 pfam12510 Smoothelin Smoothelin cytoskeleton protein. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00307. Smoothelin is a cytoskeletal protein specifically expressed in differentiated smooth muscle cells and has been shown to co-localize with smooth muscle alpha actin. 50
55212 403641 pfam12511 DUF3716 Protein of unknown function (DUF3716). This domain family is found in eukaryotes, and is approximately 60 amino acids in length. 59
55213 403642 pfam12512 DUF3717 Protein of unknown function (DUF3717). This family of proteins is found in bacteria. Proteins in this family are typically between 75 and 117 amino acids in length. There is a conserved AIN sequence motif. There are two completely conserved residues (L and Y) that may be functionally important. 65
55214 403643 pfam12513 SUV3_C Mitochondrial degradosome RNA helicase subunit C terminal. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00271. The yeast mitochondrial degradosome (mtEXO) is an NTP-dependent exoribonuclease involved in mitochondrial RNA metabolism. mtEXO is made up of two subunits: an RNase (DSS1) and an RNA helicase (SUV3). These co-purify with mitochondrial ribosomes. 47
55215 378871 pfam12514 DUF3718 Protein of unknown function (DUF3718). This domain family is found in bacteria and viruses, and is approximately 70 amino acids in length. There is a single completely conserved residue C that may be functionally important. 66
55216 403644 pfam12515 CaATP_NAI Ca2+-ATPase N terminal autoinhibitory domain. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00689, pfam00122, pfam00702, pfam00690. There is a conserved RRFR sequence motif. There are two completely conserved residues (F and W) that may be functionally important. This family is the N terminal autoinhibitory domain of an endosomal Ca2+-ATPase. 45
55217 403645 pfam12516 DUF3719 Protein of unknown function (DUF3719). This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved HLR sequence motif. There are two completely conserved residues (W and H) that may be functionally important. 65
55218 338388 pfam12517 DUF3720 Protein of unknown function (DUF3720). This domain family is found in eukaryotes, and is approximately 100 amino acids in length. There are two completely conserved A residues that may be functionally important. 99
55219 403646 pfam12518 DUF3721 Protein of unknown function. This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length. There is a conserved WMPC sequence motif. There are two completely conserved residues (A and C) that may be functionally important. 33
55220 403647 pfam12519 MDM10 Mitochondrial distribution and morphology protein 10. MDM10 is a family of eukaryotic proteins that forms a subunit of the SAM complex for biogenesis of beta-barrel proteins, though not porins, into the outer mitochondrial membrane. 434
55221 372162 pfam12520 DUF3723 Protein of unknown function (DUF3723). This family of proteins is found in eukaryotes. Proteins in this family are typically between 374 and 1069 amino acids in length. There is a conserved LGF sequence motif. 504
55222 152955 pfam12521 DUF3724 Protein of unknown function (DUF3724). This domain family is found in viruses, and is approximately 20 amino acids in length. The family is found in association with pfam00073. There is a single completely conserved residue Y that may be functionally important. 23
55223 315236 pfam12522 UL73_N Cytomegalovirus glycoprotein N terminal. This domain family is found in viruses, and is approximately 30 amino acids in length. The family is found in association with pfam03554. This family is an envelope glycoprotein of human cytomegalovirus (HCMV). 27
55224 152957 pfam12523 DUF3725 Protein of unknown function (DUF3725). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam01577. There is a conserved FLE sequence motif. 74
55225 315237 pfam12524 GlyL_C dsDNA virus glycoprotein L C terminal. This domain family is found in viruses, and is typically between 55 and 80 amino acids in length. The family is found in association with pfam05259. This family is the C terminal of glycoprotein L from various types of double stranded DNA viruses (dsDNA). 65
55226 403648 pfam12525 DUF3726 Protein of unknown function (DUF3726). This domain family is found in bacteria and eukaryotes, and is approximately 80 amino acids in length. There is a single completely conserved residue E that may be functionally important. 74
55227 372164 pfam12526 DUF3729 Protein of unknown function (DUF3729). This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important. 115
55228 403649 pfam12527 DUF3727 Protein of unknown function (DUF3727). This domain family is found in bacteria and eukaryotes, and is approximately 100 amino acids in length. 97
55229 403650 pfam12528 T2SSppdC Type II secretion prepilin peptidase dependent protein C. 81
55230 403651 pfam12529 Xylo_C Xylosyltransferase C terminal. This domain family is found in eukaryotes, and is typically between 169 and 183 amino acids in length. The family is found in association with pfam02485. There is a single completely conserved residue G that may be functionally important. Xylosyltransferases are enzymes involved in the biosynthesis of the glycosaminoglycan linker region in proteoglycans. 181
55231 403652 pfam12530 DUF3730 Protein of unknown function (DUF3730). This domain family is found in eukaryotes, and is typically between 220 and 262 amino acids in length. 227
55232 403653 pfam12531 DUF3731 DNA-K related protein. This domain family is found in bacteria, and is approximately 250 amino acids in length. There are two conserved sequence motifs: RPG and WRR. The proteins in this family are frequently annotated as DNA-K related proteins however there is little accompanying literature to confirm this. 247
55233 403654 pfam12532 DUF3732 Protein of unknown function (DUF3732). This domain family is found in bacteria and eukaryotes, and is typically between 180 and 198 amino acids in length. There is a conserved DQP sequence motif. 184
55234 403655 pfam12533 Neuro_bHLH Neuronal helix-loop-helix transcription factor. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found C-terminal to pfam00010. There is a single completely conserved residue W that may be functionally important. Neuronal basic helix-loop-helix (bHLH) transcription factors such as neuroD and neurogenin have been shown to play important roles in neuronal development. 122
55235 403656 pfam12534 Pannexin_like Pannexin-like TM region of LRRC8. Pannexin_like is a family of the four transmembrane domains of metazoan leucine-rich-repeat-containing 8 proteins. These four TMs associate into hexamers resulting in homo- or heteromeric channels that connect the cytosol to the extracellular space. The family is found in association with pfam00560. 342
55236 403657 pfam12535 Nudix_N Hydrolase of X-linked nucleoside diphosphate N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 847 and 5344 amino acids in length. These enzymes hydrolyze the molecular motif of a nucleoside diphosphate linked to some other moiety, X. 54
55237 403658 pfam12536 DUF3734 Patatin phospholipase. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam01734. There are two completely conserved residues (F and G) that may be functionally important. The proteins in this family are frequently annotated as patatin family phospholipases however there is little accompanying literature to confirm this. 106
55238 403659 pfam12537 GPHR_N The Golgi pH Regulator (GPHR) Family N-terminal. GPHR_N is the N-terminal 5TM region of the Golgi pH regulator proteins in eukaryotes. It plays vital roles in the transport of newly synthesized proteins from the Golgi to the plasma membrane, in the glycosylation of proteins along the exocytic pathway and the structural organisation of the Golgi apparatus. 68
55239 372173 pfam12538 FtsK_SpoIIIE_N DNA transporter. This domain family is found in bacteria, and is typically between 107 and 121 amino acids in length. The family is found in association with pfam01580. The FtsK/SpoIIIE family of DNA transporters are responsible for translocating missegregated chromosomes after the completion of cell division. 115
55240 403660 pfam12539 Csm1 Chromosome segregation protein Csm1/Pcs1. Saccharomyces cerevisiae Csm1 is part of the monopolin complex. Csm1 forms a complex with Lrs4 and promotes monoorientation during meiosis. Csm1 also plays a mitotic role in DNA replication. This family also contains the Schizosaccharomyces pombe homolog to Csm1, Pcs1. Pcs1 forms a complex with Mde4 and acts in the central kinetochore domain to clamp microtubule binding sites together. The two complexes (Csm1/Lrs4 and Pcs1/Mde4) contribute to the prevention of merotelic attachment. 84
55241 403661 pfam12540 DUF3736 Protein of unknown function (DUF3736). This domain family is found in eukaryotes, and is typically between 135 and 160 amino acids in length. 138
55242 403662 pfam12541 DUF3737 Protein of unknown function (DUF3737). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 281 and 297 amino acids in length. 274
55243 403663 pfam12542 CWC25 Pre-mRNA splicing factor. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam10197. There is a single completely conserved residue Y that may be functionally important. Cwc25 has been identified to associate with pre-mRNA splicing factor Cef1/Ntc85, a component of the Prp19-associated complex (NTC) involved in spliceosome activation. Cwc25 is neither tightly associated with NTC nor required for spliceosome activation, but is required for the first catalytic reaction. 98
55244 403664 pfam12543 DUF3738 Protein of unknown function (DUF3738). This family of proteins is found in bacteria. Proteins in this family are typically between 251 and 457 amino acids in length. 188
55245 378876 pfam12544 LAM_C Lysine-2,3-aminomutase. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 111 and 127 amino acids in length. The family is found in association with pfam04055. LAM catalyzes the interconversion of L-alpha-lysine and L-beta-lysine, which proceeds by migration of the amino group from C2 to C3 concomitant with cross-migration of the 3-pro-R hydrogen of L-alpha-lysine to the 2-pro-R position of L-beta-lysine. 127
55246 372178 pfam12545 DUF3739 Filamentous haemagglutinin family outer membrane protein. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam05860. 111
55247 403665 pfam12546 Cryptochrome_C Blue/Ultraviolet sensing protein C terminal. This domain family is found in eukaryotes, and is typically between 113 and 125 amino acids in length. The family is found in association with pfam03441, pfam00875. Cryptochromes are blue/ultraviolet-A light sensing photoreceptors involved in regulating various growth and developmental responses in plants. 121
55248 403666 pfam12547 ATXN-1_C Capicua transcriptional repressor modulator. This family of proteins is found in eukaryotes. Proteins in this family are typically between 49 and 781 amino acids in length. There is a conserved IQT sequence motif. ATXN1 directly binds Capicua and modulates Capicua repressor activity in Drosophila and mammalian cells. The polyglutamine expanded mutant type of ATXN-1 does not bind Capicua with as high affinity as wild-type ATXN-1. It is associated with spinocerebellar ataxia type 1 (SCA1). 50
55249 403667 pfam12548 DUF3740 Sulfatase protein. This domain family is found in eukaryotes, and is typically between 144 and 173 amino acids in length. The family is found in association with pfam00884. 139
55250 403668 pfam12549 TOH_N Tyrosine hydroxylase N terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. There is a single completely conserved residue G that may be functionally important. Tyrosine hydroxylase converts L-tyrosine to L-DOPA in the catecholamine synthesis pathway. 25
55251 403669 pfam12550 GCR1_C Transcriptional activator of glycolytic enzymes. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. This family activates the transcription of glycolytic enzymes. 80
55252 403670 pfam12551 PHBC_N Poly-beta-hydroxybutyrate polymerase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam07167, pfam00561. There is a single completely conserved residue W that may be functionally important. PHBC is the third enzyme of the poly-beta-hydroxybutyrate biosynthetic pathway. 41
55253 403671 pfam12552 DUF3741 Protein of unknown function (DUF3741). This domain family is found in eukaryotes, and is approximately 50 amino acids in length. 45
55254 372185 pfam12553 DUF3742 Protein of unknown function (DUF3742). This domain family is found in bacteria, and is approximately 50 amino acids in length. There is a single completely conserved residue Y that may be functionally important. 114
55255 403672 pfam12554 MOZART1 Mitotic-spindle organizing gamma-tubulin ring associated. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, this family does not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. MOZART1 is required for gamma-TuRC recruitment to centrosomes. 45
55256 403673 pfam12555 TPPK_C Thiamine pyrophosphokinase C terminal. This domain family is found in bacteria, and is approximately 50 amino acids in length. The proteins in this family catalyze the pyrophosphorylation of thiamine in yeast and synthesize thiamine pyrophosphate (TPP), a thiamine coenzyme. 50
55257 403674 pfam12556 CobS_N Cobaltochelatase CobS subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam07728. There are two completely conserved residues (P and F) that may be functionally important. This family is the N terminal of the CobS subunit of cobaltochelatase. Cobaltochelatase belongs to the AAA+ superfamily of proteins. CobS and CobT form a chaperone like complex. 33
55258 403675 pfam12557 Co_AT_N Cob(I)alamin adenosyltransferase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 20 amino acids in length. The family is found in association with pfam02572. Cob(I)alamin adenosyltransferase adenosylates Co(I) in an ATP-dependent manner in the conversion of aquacobalamin to its coenzyme form. This is the third step in this process, after two steps involved in the reduction of Co(III) to Co(I). 23
55259 403676 pfam12558 DUF3744 ATP-binding cassette cobalt transporter. This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam00005. There is a conserved REP sequence motif. There is a single completely conserved residue P that may be functionally important. The proteins in this family are frequently annotated as ABC Cobalt transporters however there is little accompanying literature to confirm this. 73
55260 372187 pfam12559 Inhibitor_I10 Serine endopeptidase inhibitors. This family includes both microviridins and marinostatins. It seems likely that in both cases it is the C-terminus which becomes the active inhibitor after post-translational modifications of the full-length pre-peptide. It is the ester linkages within the key 12-residue region that circularize the molecule, giving it its inhibitory conformation. 64
55261 403677 pfam12560 RAG1_imp_bd RAG1 importin binding. This region of RAG1 is responsible for binding to importin alpha. 287
55262 403678 pfam12561 TagA ToxR activated gene A lipoprotein. This domain family is found in bacteria, and is approximately 140 amino acids in length. The family is found in association with pfam10462. There is a conserved GAG sequence motif. This family is a bacterial lipoprotein. 99
55263 289339 pfam12562 DUF3746 Protein of unknown function (DUF3746). This domain family is found in viruses, and is approximately 40 amino acids in length. The family is found in association with pfam04595. 37
55264 403679 pfam12563 Hemolysin_N Hemolytic toxin N terminal. This domain family is found in bacteria, and is approximately 190 amino acids in length. The family is found in association with pfam07968, pfam00652. This family is a bacterial virulence factor - hemolysin - which forms pores in erythrocytes and causes them to lyse. 192
55265 403680 pfam12564 TypeIII_RM_meth Type III restriction/modification enzyme methylation subunit. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam01555. There are two completely conserved residues (F and S) that may be functionally important. This family is a bacterial phage resistance protein. It functions in a type III restriction/modification enzyme complex. It is part of the methylation subunit of the complex. It binds DNA and methylates it. 56
55266 403681 pfam12565 DUF3747 Protein of unknown function (DUF3747). This family of proteins is found in bacteria. Proteins in this family are typically between 215 and 413 amino acids in length. There is a conserved DSNGYS sequence motif. 171
55267 403682 pfam12566 DUF3748 Protein of unknown function (DUF3748). This domain family is found in bacteria and eukaryotes, and is approximately 120 amino acids in length. 119
55268 403683 pfam12567 CD45 Leukocyte receptor CD45. This family of proteins is found in eukaryotes. Proteins in this family are typically between 77 and 1130 amino acids in length. The family is found in association with pfam00041. CD45 plays a critical role in T-cell receptor (TCR)-mediated signaling. CD45 interacts with SKAP55 which is a transcriptional activator of IL-2. 59
55269 403684 pfam12568 DUF3749 Acetyltransferase (GNAT) domain. This domain family is found in bacteria, and is approximately 40 amino acids in length. The proteins in this family are acetyltransferases of the GNAT family. 128
55270 403685 pfam12569 NARP1 NMDA receptor-regulated protein 1. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam07719, pfam00515. There is a single completely conserved residue L that may be functionally important. NARP1 is the mammalian homolog of a yeast N-terminal acetyltransferase that regulates entry into the G(0) phase of the cell cycle. 514
55271 403686 pfam12570 DUF3750 Protein of unknown function (DUF3750). This family of proteins is found in bacteria. Proteins in this family are typically between 175 and 265 amino acids in length. 129
55272 403687 pfam12571 DUF3751 Phage tail-collar fibre protein. This domain family is found in bacteria and viruses, and is approximately 160 amino acids in length. There are two completely conserved residues (K and W) that may be functionally important. The members are annotated as being putative phage tail or tail-collar proteins. 149
55273 403688 pfam12572 DUF3752 Protein of unknown function (DUF3752). This domain family is found in eukaryotes, and is typically between 140 and 163 amino acids in length. 150
55274 403689 pfam12573 OxoDH_E1alpha_N 2-oxoisovalerate dehydrogenase E1 alpha subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam00676. There are two conserved sequence motifs: VPEP and RPG. This family is the alpha subunit of the E1 component of 2-oxoisovalerate dehydrogenase. This is the enzyme complex responsible for metabolism of pyruvate, 2-oxoglutarate, branched chain 2-oxo acids and acetoin. The E1 component is a heterotetramer of alpha2beta2. The homodimerized beta subunits are flanked by two alpha subunits in a 'vise' structure. 41
55275 403690 pfam12574 120_Rick_ant 120 KDa Rickettsia surface antigen. This domain family is found in bacteria, and is approximately 40 amino acids in length. This family is a Rickettsia surface antigen of 120 KDa which may be used as an antigen for immune response against the bacterial species. 238
55276 289352 pfam12575 Pox_EPC_I2-L1 Poxvirus entry protein complex L1 and I2. Pox_EPC_I2-L1 family of proteins is found in poxviruses. Proteins in this family are approximately 70 amino acids in length. There is a conserved YLK sequence motif. 71
55277 403691 pfam12576 DUF3754 Protein of unknown function (DUF3754). This domain family is found in bacteria, archaea and eukaryotes, and is typically between 135 and 166 amino acids in length. There is a single completely conserved residue P that may be functionally important. 136
55278 403692 pfam12577 PPARgamma_N PPAR gamma N-terminal region. Peroxisome proliferator-activated receptors (PPAR) are nuclear hormone receptors that control the expression of genes involved in lipid homeostasis in mammals. This sequence region is found at the N-terminus of these proteins. The family is found in association with pfam00104, pfam00105. It is not clear if this region is a separate protein domain. 79
55279 403693 pfam12578 3-PAP Myotubularin-associated protein. This domain family is found in eukaryotes, and is typically between 115 and 138 amino acids in length. Myotubularin is a dual-specific phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. 3-PAP is a catalytically inactive member of the myotubularin gene family, which coprecipitates lipid phosphatidylinositol 3-phosphate-3-phosphatase activity from lysates of human platelets. 128
55280 403694 pfam12579 DUF3755 Protein of unknown function (DUF3755). This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a single completely conserved residue N that may be functionally important. 34
55281 403695 pfam12580 TPPII Tripeptidyl peptidase II. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N-terminus of oligopeptides. 187
55282 315289 pfam12581 DUF3756 Protein of unknown function (DUF3756). This domain family is found in viruses, and is approximately 40 amino acids in length. 41
55283 403696 pfam12582 DUF3757 Protein of unknown function (DUF3757). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 154 amino acids in length. 122
55284 403697 pfam12583 TPPII_N Tripeptidyl peptidase II N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N-terminus of oligopeptides. 136
55285 403698 pfam12584 TRAPPC10 Trafficking protein particle complex subunit 10, TRAPPC10. This domain forms part of the TRAPP complex for mediating vesicle docking and fusion in the Golgi apparatus. The fungal version is referred to as Trs130, and an alternative vertebrate alias is TMEM1. 147
55286 403699 pfam12585 DUF3759 Protein of unknown function (DUF3759). This family of proteins is found in eukaryotes. Proteins in this family are typically between 107 and 132 amino acids in length. There is a single completely conserved residue H that may be functionally important. 91
55287 315294 pfam12586 DUF3760 Protein of unknown function (DUF3760). This domain family is found in eukaryotes, and is typically between 46 and 64 amino acids in length. 44
55288 403700 pfam12587 DUF3761 Protein of unknown function (DUF3761). This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 157 amino acids in length. 87
55289 403701 pfam12588 PSDC Phosphatidylserine decarboxylase. This domain family is found in bacteria and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam02666. Phosphatidylserine decarboxylase (PSD) is an important enzyme in the synthesis of phosphatidylethanolamine in both prokaryotes and eukaryotes. 140
55290 403702 pfam12589 WBS_methylT Methyltransferase involved in Williams-Beuren syndrome. This domain family is found in eukaryotes, and is typically between 72 and 83 amino acids in length. The family is found in association with pfam08241. This family is made up of S-adenosylmethionine-dependent methyltransferases. The proteins are deleted in Williams-Beuren syndrome (WBS), a complex developmental disorder with multisystemic manifestations including supravalvular aortic stenosis (SVAS) and a specific cognitive phenotype. 81
55291 403703 pfam12590 Acyl-thio_N Acyl-ATP thioesterase. This domain family is found in bacteria and eukaryotes, and is typically between 120 and 131 amino acids in length. The family is found in association with pfam01643. The plant acyl-acyl carrier protein (ACP) thioesterases (TEs) have roles in fatty acid synthesis. 131
55292 153025 pfam12591 DUF3762 Protein of unknown function (DUF3762). This domain family is found in viruses, and is approximately 80 amino acids in length. The family is found in association with pfam05533. 80
55293 403704 pfam12592 DUF3763 Protein of unknown function (DUF3763). This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam07728. There is a single completely conserved residue F that may be functionally important. 55
55294 289369 pfam12593 McyA_C Microcystin synthetase C terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam08242, pfam00501. There is a conserved YAN sequence motif. Microcystins form a large family of small cyclic heptapeptides harbouring extensive modifications in amino acid residue composition and functional group chemistry. These peptide hepatotoxins contain a range of non-proteinogenic amino acids and unusual peptide bonds, and are typically N-methylated. They are synthesized on large enzyme complexes consisting of non-ribosomal peptide synthetases and polyketide synthases. This family is made up of the C terminal of microcystin synthetase, one of the proteins involved in this synthesis pathway. 43
55295 403705 pfam12594 DUF3764 Protein of unknown function (DUF3764). This family of proteins is found in bacteria. Proteins in this family are typically between 89 and 101 amino acids in length. 84
55296 403706 pfam12595 Rhomboid_SP Rhomboid serine protease. This domain family is found in eukaryotes, and is approximately 210 amino acids in length. The family is found in association with pfam01694. Rhomboid is a seven-transmembrane spanning protein that resides in the Golgi and acts as a serine protease to cleave Spitz. 216
55297 257152 pfam12596 Tnp_P_element_C 87kDa Transposase. This domain family is found in eukaryotes, and is typically between 78 and 110 amino acids in length. The family is found in association with pfam05485. There are two completely conserved residues (D and G) that may be functionally important. This family is an 87kDa transposase protein which catalyzes both the precise and imprecise excision of a nonautonomous P transposable element. 107
55298 403707 pfam12597 DUF3767 Protein of unknown function (DUF3767). This family of proteins is found in eukaryotes. Proteins in this family are typically between 112 and 199 amino acids in length. 101
55299 403708 pfam12598 TBX T-box transcription factor. This domain family is found in eukaryotes, and is typically between 77 and 89 amino acids in length. The family is found in association with pfam00907. There are two completely conserved residues (S and P) that may be functionally important. T-box genes encode transcription factors involved in morphogenesis and organogenesis of vertebrates and invertebrates. 83
55300 403709 pfam12599 DUF3768 Protein of unknown function (DUF3768). This family of proteins is found in bacteria. Proteins in this family are typically between 108 and 129 amino acids in length. There are two conserved sequence motifs: NDP and RVLT. 83
55301 403710 pfam12600 DUF3769 Protein of unknown function (DUF3769). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 560 and 931 amino acids in length. 443
55302 289375 pfam12601 Rubi_NSP_C Rubivirus non-structural protein. This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam05407. The rubella virus (RUB) nonstructural (NS) protein (NSP) ORF encodes a protease that cleaves the NSP precursor (240 kDa) at a single site to produce two products. 66
55303 315304 pfam12602 FinO_N Fertility inhibition protein N terminal. This domain family is found in bacteria, and is typically between 62 and 102 amino acids in length. The family is found in association with pfam04352. The FinOP (fertility inhibition) system of F-like plasmids consists of an antisense RNA (FinP) and a 22 kDa protein (FinO) which act in concert to prevent the translation of TraJ, the positive regulator of the transfer operon. 62
55304 289377 pfam12603 DUF3770 Protein of unknown function (DUF3770). This domain family is found in viruses, and is approximately 250 amino acids in length. The family is found in association with pfam04196. 235
55305 403711 pfam12604 gp37_C Tail fibre protein gp37 C terminal. This domain family is found in bacteria and viruses, and is typically between 49 and 166 amino acids in length. The family is found in association with pfam03906. In T-even phages, gp37 and gp38 are components of the tail fibre that are critical for phage-host interaction. 156
55306 403712 pfam12605 CK1gamma_C Casein kinase 1 gamma C terminal. This domain family is found in eukaryotes, and is typically between 54 and 99 amino acids in length. The family is found in association with pfam00069. CK1gamma is a membrane-bound member of the CK1 family. Gain-of-function and loss-of-function experiments show that CK1gamma is both necessary and sufficient to transduce LRP6 signalling in vertebrates and Drosophila cells. 99
55307 403713 pfam12606 RELT tumor necrosis factor receptor superfamily member 19. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 49 and 288 amino acids in length. There are two completely conserved residues (K and Y) that may be functionally important. The members of tumor necrosis factor receptor (TNFR) superfamily have been designated as the "guardians of the immune system" due to their roles in immune cell proliferation, differentiation, activation, and death (apoptosis). The messenger RNA of RELT is especially abundant in hematologic tissues such as spleen, lymph node, and peripheral blood leukocytes as well as in leukemias and lymphomas. RELT is able to activate the NF-kappaB pathway and selectively binds tumor necrosis factor receptor-associated factor 1. 42
55308 403714 pfam12607 DUF3772 Protein of unknown function (DUF3772). This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00924. 63
55309 289381 pfam12608 T4bSS_IcmS Type IVb secretion, IcmS, effector-recruitment. This is a family of Gram-negative bacterial proteins involved in the Dot/Icm type IVb transport system. Members are small acidic cytoplasmic proteins required for Dot/Icm-dependent activities. Binary complexes of IcmW-IcmS and of IcmS-LvgA have been consistently reported, suggestive of the binary WXG100 system. The IcmW-IcmS complex may play a role in recruitment of effector proteins to the transport apparatus. 92
55310 403715 pfam12609 DUF3774 Wound-induced protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 81 and 97 amino acids in length. The proteins in the family are often annotated as wound-induced proteins however there is little accompanying literature to confirm this. 77
55311 403716 pfam12610 SOCS Suppressor of cytokine signalling. This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam07525, pfam00017. The suppressors of cytokine signaling (SOCS) family play important roles in regulating a variety of signal transduction pathways that are involved in immunity, growth and development of organisms. 58
55312 403717 pfam12611 Flagellar_put Putative flagellar. Proteins in this entry are encoded in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus. 24
55313 403718 pfam12612 TFCD_C Tubulin folding cofactor D C terminal. This domain family is found in eukaryotes, and is typically between 182 and 199 amino acids in length. The family is found in association with pfam02985. There is a single completely conserved residue R that may be functionally important. Tubulin folding cofactor D does not co-polymerize with microtubules either in vivo or in vitro, but instead modulates microtubule dynamics by sequestering beta-tubulin from GTP-bound alphabeta-heterodimers in microtubules. 186
55314 403719 pfam12613 FliC_SP Flagellin structural protein. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00669, pfam00700. This family is the bacterial flagellin structural protein. It is involved with cell motility. 53
55315 403720 pfam12614 RRF_GI Ribosome recycling factor. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 130 amino acids in length. There are two conserved sequence motifs: LPS and LKR. Overproduction of ribosome recycling factor (RRF) reduces tna operon expression and increases the rate of cleavage of TnaC-tRNA(2)(Pro), relieving the growth inhibition associated with plasmid-mediated tnaC overexpression. 126
55316 403721 pfam12615 TraD_N F sex factor protein N terminal. This domain family is found in bacteria, and is typically between 96 and 107 amino acids in length. The family is found in association with pfam10412. TraD is a cytoplasmic membrane protein with possible DNA binding domains. It is part of the bacterial F sex factor complex. 93
55317 403722 pfam12616 DUF3775 Protein of unknown function (DUF3775). This domain family is found in bacteria, and is approximately 80 amino acids in length. There is a single completely conserved residue G that may be functionally important. 69
55318 403723 pfam12617 LdpA_C Iron-Sulfur binding protein C terminal. This domain family is found in bacteria and eukaryotes, and is typically between 179 and 201 amino acids in length. The family is found in association with pfam00037. LdpA (light-dependent period) plays a role in controlling the redox state in cyanobacteria to modulate its circadian clock. LdpA is a protein with Iron-Sulfur cluster-binding motifs. 183
55319 372226 pfam12618 DUF3776 Protein of unknown function (DUF3776). This domain family is found in eukaryotes, and is approximately 100 amino acids in length. 76
55320 403724 pfam12619 MCM2_N Mini-chromosome maintenance protein 2. This domain family is found in eukaryotes, and is typically between 138 and 153 amino acids in length. The family is found in association with pfam00493. Mini-chromosome maintenance (MCM) proteins are essential for DNA replication. These proteins use ATPase activity to perform this function. 148
55321 403725 pfam12620 DUF3778 Protein of unknown function (DUF3778). This domain family is found in eukaryotes, and is typically between 48 and 61 amino acids in length. There is a conserved LRF sequence motif. 64
55322 403726 pfam12621 PHM7_ext Extracellular tail, of 10TM putative phosphate transporter. This PHM7_ext family is found in plants and fungi. It represents the C-terminal extracellular domain of the putative phosphate transporter, PHM7. The three N-terminal TMS are found in family RSN1_TM, pfam02714; the cytoplasmic domain is pfam14703, and the remaining 7TM region is in pfam02714. 84
55323 403727 pfam12622 NpwBP mRNA biogenesis factor. The full-length Wbp11 proteins carry several copies of a PPGPPP motif throughout their length. This motif is thought to be necessary for folding of the molecule as it helps to bind the WW domain, Wbp11, pfam09429. This domain together with Wbp11 may function as components of an mRNA factory in the nucleus. 47
55324 403728 pfam12623 Hen1_L RNA repair, ligase-Pnkp-associating, region of Hen1. This domain is the N-terminal region of the bacterial Hen1 protein. This protein forms stable hetero-tetramer with Pnkp. The hetero-tetramer was able to repair transfer RNAs cleaved by ribotoxins in vitro. This domain provides the ligase activity of the hetero-tetramer. 230
55325 403729 pfam12624 Chorein_N N-terminal region of Chorein or VPS13. Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport. 109
55326 403730 pfam12625 Arabinose_bd Arabinose-binding domain of AraC transcription regulator, N-term. AraC is a bacterial transcriptional regulatory protein with a DNA-binding domain at the C-terminus, HTH_AraC, pfam00165, and this dimerization domain which harbours the arabinose-binding pocket at the N-terminus. AraC positively and negatively regulates expression of the proteins required for the uptake and catabolism of the sugar L-arabinose. 185
55327 403731 pfam12626 PolyA_pol_arg_C Polymerase A arginine-rich C-terminus. The C-terminus of polymerase A in E coli is arginine-rich and is necessary for full functioning of the enzyme. 116
55328 403732 pfam12627 PolyA_pol_RNAbd Probable RNA and SrmB- binding site of polymerase A. This region encompasses much of the RNA and SrmB binding motifs on polymerase A. 64
55329 403733 pfam12628 Inhibitor_I71 Falstatin, cysteine peptidase inhibitor. This family of peptidase inhibitors is expressed from plasmodial protozoal species. Falstatin is found to be a potent reversible inhibitor of the P. falciparum cysteine proteases falcipain-2 and falcipain-3, as well as other parasite- and non-parasite-derived cysteine proteases, but is only a relatively weak inhibitor of the P. falciparum cysteine proteases falcipain-1 and dipeptidyl aminopeptidase 1. Thus, P. falciparum requires expression of falstatin to limit proteolysis by certain host or parasite cysteine proteases during erythrocyte invasion. 173
55330 289401 pfam12629 Pox_polyA_pol_C Poxvirus poly(A) polymerase C-terminal domain. This domain is found at the C-terminus of the pox virus PolyA polymerase protein. 199
55331 403734 pfam12630 Pox_polyA_pol_N Poxvirus poly(A) polymerase N-terminal domain. This domain is found at the N-terminus of the pox virus Poly(A) polymerase protein. According to SCOP this domain contains a helix-hairpin-helix motif. 108
55332 403735 pfam12631 MnmE_helical MnmE helical domain. The tRNA modification GTPase MnmE consists of three domains. An N-terminal domain, a helical domain and a GTPase domain which is nested within the helical domain. This family represents the helical domain. 326
55333 403736 pfam12632 Vezatin Myosin-binding motif of peroxisomes. Vezatin is a peroxisome transmembrane receptor that is involved in membrane-membrane and cell-cell adhesions. In the movement of peroxisomes it binds to class V and class VIIa myosins to guide the organelle through the microtubules and allow pathogens to internalize themselves into host cells. Vezatin is crucial for spermatozoan production. In mouse cells it interacts with the cadherin-catenin complex bridging it to the C-terminal FERM domain of myosin VIIA. 242
55334 403737 pfam12633 Adenyl_cycl_N Adenylate cyclase NT domain. 198
55335 372234 pfam12634 Inp1 Inheritance of peroxisomes protein 1. Inp1 is a family of peripheral membrane proteins of peroxisomes. Inp1p binds Pex25p, Pex30p, and Vps1p, all of which are involved in controlling peroxisome division. The levels of Inp1p vary with the cell cycle, and Inp1 acts as a factor that retains peroxisomes in cells and controls peroxisome division. Inp1p promotes the retention of peroxisomes in mother cells and buds of budding yeast by attaching peroxisomes to as-yet-unidentified cortical structures. 137
55336 403738 pfam12635 DUF3780 Protein of unknown function (DUF3780). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 189 and 206 amino acids in length. There are two conserved sequence motifs: PEERWWL and GWR. This family is found in a very sporadic set of bacterial species, suggesting that it may have been horizontally transferred. One protein is annotated as plasmid borne. 184
55337 403739 pfam12636 DUF3781 Protein of unknown function (DUF3781). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 82 and 98 amino acids in length. There are two conserved sequence motifs: GKNWY and ITA. 72
55338 403740 pfam12637 TSCPD TSCPD domain. This family of proteins is found in bacteria, archaea and viruses. The domain is found in isolation in many proteins where it has a conserved C-terminal motif TSCPD after which the domain is named. Most copies of the domain possess 4 conserved cysteines that may be part of an Iron-sulfur cluster. This domain is found at the C-terminus of some ribonucleoside-diphosphate reductase enzymes. 74
55339 403741 pfam12638 Staygreen Staygreen protein. This family of proteins has been implicated in chlorophyll degradation. Intriguingly, members of this family are also found in non-photosynthetic bacteria. 147
55340 403742 pfam12639 Colicin-DNase DNase/tRNase domain of colicin-like bacteriocin. Colicin-like bacteriocins are complex structures with an N-terminal beta-barrel translocation domain (pfam09000), a long double-alpha-helical receptor-binding domain (pfam11570) and this C-terminal RNAse/DNase domain with endonuclease activity. Their competitor bacteriocidal action is by a process that involves binding to a surface receptor, entering the cell, and, finally, killing it. The lethal action of colicin E3 is a specific cleavage in the ribosomal decoding A site. The crystal structure of colicin E3 reveals a Y-shaped molecule with the receptor binding domain forming a 100 Angstrom long stalk and the two globular heads of the translocation domain and this catalytic domain comprising the two arms. 96
55341 403743 pfam12640 UPF0489 UPF0489 domain. This family is probably an enzyme which is related to the Arginase family. 161
55342 403744 pfam12641 Flavodoxin_3 Flavodoxin domain. This family represents a flavodoxin domain. 159
55343 403745 pfam12642 TpcC Conjugative transposon protein TcpC. This family of proteins are annotated as conjugative transposon protein TcpC. The transfer clostridial plasmid (tcp) locus is part of some conjugative antibiotic resistance and virulence plasmids. TcpC was one of five genes whose products had low-level sequence identity to Tn916 proteins, having similarity to ORF13 homologs from Tn916, Tn5397, and CW459tet. This family of proteins is found in bacteria. Proteins in this family are typically between 302 and 351 amino acids in length. 230
55344 403746 pfam12643 MazG-like MazG-like family. This family of short proteins are distantly related to the MazG enzyme. This suggests that these proteins are enzymes that catalyze a related reaction. 84
55345 372238 pfam12644 DUF3782 Protein of unknown function (DUF3782). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 91 and 186 amino acids in length. 78
55346 403747 pfam12645 HTH_16 Helix-turn-helix domain. This domain appears to be a helix-turn-helix domain suggesting that this might be a transcriptional regulatory protein. Some members of this family are annotated as conjugative transposon domains. 65
55347 403748 pfam12646 DUF3783 Domain of unknown function (DUF3783). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length. 56
55348 403749 pfam12647 RNHCP RNHCP domain. This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 143 amino acids in length. There is a conserved RNHCP sequence motif. 85
55349 403750 pfam12648 TcpE TcpE family. This family of proteins includes TcpE, a conjugative transposon membrane protein. This family of proteins is found in bacteria. Proteins in this family are typically between 122 and 168 amino acids in length. 104
55350 403751 pfam12650 DUF3784 Domain of unknown function (DUF3784). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 96 and 110 amino acids in length. 94
55351 289422 pfam12651 RHH_3 Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding. 44
55352 403752 pfam12652 CotJB CotJB protein. CotJ is a sigma E-controlled operon involved in the spore coat of Bacillus subtilis. This protein has been identified as a spore coat protein. 76
55353 403753 pfam12653 DUF3785 Protein of unknown function (DUF3785). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. These proteins share two CXXC motifs suggesting these are zinc binding proteins. This protein is found in clostridia in an operon with three signalling proteins, suggesting this protein may be a DNA-binding transcription regulator downstream of an as yet unknown signalling pathway (Bateman A pers obs). 136
55354 403754 pfam12654 DUF3786 Domain of unknown function (DUF3786). This presumed domain is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 201 and 257 amino acids in length. Some proteins also contain an iron-sulfur cluster. 176
55355 403755 pfam12655 DUF3787 Domain of unknown function (DUF3787). This family of proteins is functionally uncharacterized. This family of proteins is found in Clostridia. Proteins in this family are approximately 60 amino acids in length. There is a conserved TAAW sequence motif that may be functionally important. 52
55356 372242 pfam12656 G-patch_2 G-patch domain. Yeast Spp2, a G-patch protein and spliceosome component, interacts with the ATP-dependent DExH-box splicing factor Prp2. As this interaction involves the G-patch sequence in Spp2 and is required for the recruitment of Prp2 to the spliceosome before the first catalytic step of splicing, it is proposed that Spp2 might be an accessory factor that confers spliceosome specificity on Prp2. 61
55357 403756 pfam12657 TFIIIC_delta Transcription factor IIIC subunit delta N-term. In humans there are six subunits of transcription factor IIIC, and this one is the 90 kDa subunit; whereas in fungi the complex resolves into nine different subunits and this is No. 9 in yeasts. The whole subunit is involved in RNA polymerase III-mediated transcription. It is possible that this N-terminal domain interacts with TFIIIC subunit 8. 174
55358 403757 pfam12658 Ten1 Telomere capping, CST complex subunit. Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast, and a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. S.pombe has Stn1- and Ten1-like proteins that are essential for chromosome end protection. Stn1 orthologues exist in all species that have Pot1, whereas Ten1-like proteins can be found in all fungi. Fission yeast Stn1 and Ten1 localize at telomeres in a manner that correlates with the length of the ssDNA overhang, suggesting that they specifically associate with the telomeric ssDNA. Two separate protein complexes are required for chromosome end protection in fission yeast. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution. Ten1 is one of the three components of the CST complex, which, in conjunction with the Shelterin complex helps protect telomeres from attack by DNA-repair mechanisms. 115
55359 403758 pfam12659 Stn1_C Telomere capping C-terminal wHTH. This domain consists of tandem winged helix-turn-helix motifs. Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast, and a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. S.pombe has Stn1- and Ten1-like proteins that are essential for chromosome end protection. Stn1 orthologues exist in all species that have Pot1, whereas Ten1-like proteins can be found in all fungi. Fission yeast Stn1 and Ten1 localize at telomeres in a manner that correlates with the length of the ssDNA overhang, suggesting that they specifically associate with the telomeric ssDNA. Two separate protein complexes are required for chromosome end protection in fission yeast. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution. 119
55360 372246 pfam12660 zf-TFIIIC Putative zinc-finger of transcription factor IIIC complex. This zinc-finger domain is at the very C-terminus of a number of different TFIIIC subunit proteins. This domain might be involved in protein-DNA and/or protein-protein interactions. 87
55361 403759 pfam12661 hEGF Human growth factor-like EGF. hEGF, or human growth factor-like EGF, domains have six conserved residues disulfide-bonded into the characteristic 'ababcc' pattern. They are involved in growth and proliferation of cells, in proteins of the Notch/Delta pathway, neuregulin and selectins. hEGFs are also found in mosaic proteins with four-disulfide laminin EGFs such as aggrecan and perlecan. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal Cys residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In hEGFs the C-terminal thiol resides in the beta-turn, resulting in shorter loop-lengths between the Cys residues of disulfide 'c', typically C[8-9]XC. These shorter loop-lengths are also typical of the four-disulfide EGF domains, laminin and integrin. Tandem hEGF domains have six linking residues between terminal cysteines of adjacent domains. hEGF domains may or may not bind calcium in the linker region. hEGF domains with the consensus motif CXD4X[F,Y]XCXC are hydroxylated exclusively in the Asp residue. 22
55362 403760 pfam12662 cEGF Complement Clr-like EGF-like. cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue. 24
55363 403761 pfam12663 DUF3788 Protein of unknown function (DUF3788). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 137 and 149 amino acids in length. This family may be distantly related to RelE proteins. 128
55364 315357 pfam12664 DUF3789 Protein of unknown function (DUF3789). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two completely conserved residues (V and C) that may be functionally important. 32
55365 403762 pfam12666 PrgI PrgI family protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 116 and 146 amino acids in length. This protein is found in an operon that is part of a Type IV secretion system. 91
55366 403763 pfam12667 NigD_N NigD-like N-terminal OB domain. This family of proteins is functionally uncharacterized. This family of proteins is found in Bacteroides species. Proteins in this family are typically between 234 and 260 amino acids in length. These proteins possess an N-terminal lipoprotein attachment site. The family includes NigD a protein found in the Nig operon that encodes a bacteriocin called nigrescin. It has been suggested that NigD may be the immunity protein for nigrescin (NigC) because it is directly downstream. This domain has an OB fold. 66
55367 403764 pfam12668 DUF3791 Protein of unknown function (DUF3791). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 71 and 125 amino acids in length. 60
55368 403765 pfam12669 P12 Virus attachment protein p12 family. This family of proteins is related to Virus attachment protein p12 from the African swine fever virus. The family appears to contain an N-terminal signal peptide followed by a short cysteine rich region. The cysteine rich region is extremely variable and it is possible that only the N-terminal region is homologous. 45
55369 403766 pfam12670 DUF3792 Protein of unknown function (DUF3792). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. These proteins are integral membrane proteins. 110
55370 403767 pfam12671 Amidase_6 Putative amidase domain. 161
55371 403768 pfam12672 DUF3793 Protein of unknown function (DUF3793). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 211 amino acids in length. There are two conserved sequence motifs: PHE and LGYP. 171
55372 403769 pfam12673 DUF3794 Domain of unknown function (DUF3794). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. The family is found in association with pfam01476. 80
55373 403770 pfam12674 Zn_ribbon_2 Putative zinc ribbon domain. This domain appears to be a zinc binding DNA-binding domain. 76
55374 403771 pfam12675 DUF3795 Protein of unknown function (DUF3795). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 99 and 171 amino acids in length. This protein is likely to be zinc binding given the conserved cysteines. 81
55375 403772 pfam12676 DUF3796 Protein of unknown function (DUF3796). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 112
55376 289447 pfam12677 DUF3797 Domain of unknown function (DUF3797). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 50 amino acids in length. There is a conserved CGN sequence motif. 48
55377 403773 pfam12678 zf-rbx1 RING-H2 zinc finger domain. There are 8 cysteine/histidine residues which are proposed to be the conserved residues involved in zinc binding. The protein, of which this domain is the conserved region, participates in diverse functions relevant to chromosome metabolism and cell cycle control. 55
55378 403774 pfam12679 ABC2_membrane_2 ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family. 281
55379 403775 pfam12680 SnoaL_2 SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold. 101
55380 403776 pfam12681 Glyoxalase_2 Glyoxalase-like domain. This domain is related to the Glyoxalase domain pfam00903. 118
55381 403777 pfam12682 Flavodoxin_4 Flavodoxin. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN. 155
55382 403778 pfam12683 DUF3798 Protein of unknown function (DUF3798). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 247 and 417 amino acids in length. Most of the proteins in this family have an N-terminal lipoprotein attachment site. These proteins have distant similarity to periplasmic ligand binding families such as pfam02608, which suggests that this family has a similar role. 271
55383 403779 pfam12684 DUF3799 PDDEXK-like domain of unknown function (DUF3799). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 265 and 420 amino acids in length. It appears that these proteins are distantly related to the PDDEXK superfamily and so these domains are likely to be nucleases. This family has a C-terminal cysteine cluster similar to that found in pfam01930. 228
55384 403780 pfam12685 SpoIIIAH SpoIIIAH-like protein. Stage III sporulation protein AH (SpoIIIAH) is a protein that is involved in forespore engulfment. It forms a channel with SpoIIIAH that is open on the forespore end and closed (or gated) on the mother cell end. This allows sigma-E-directed gene expression in the mother-cell compartment of the sporangium to trigger the activation of sigma-G forespore-specific gene expression by a pathway of intercellular signaling. This family of proteins is found in bacteria, archaea and eukaryotes and so must have a wider function than in sporulation. Proteins in this family are typically between 174 and 223 amino acids in length. 195
55385 403781 pfam12686 DUF3800 Protein of unknown function (DUF3800). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 215 and 302 amino acids in length. There is a DE motif at the N-terminus and a QXXD motif at the C-terminus that may be functionally important. 112
55386 403782 pfam12687 DUF3801 Protein of unknown function (DUF3801). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 158 and 187 amino acids in length. This family includes the PcfB protein. 131
55387 403783 pfam12688 TPR_5 Tetratrico peptide repeat. BH0479 of Bacillus halodurans is a hypothetical protein which contains a tetratrico peptide repeat (TPR) structural motif. The TPR motif is often involved in mediating protein-protein interactions. This protein is likely to function as a dimer. The first 48 amino acids are not present in the clone construct. This Pfam entry includes tetratricopeptide-like repeats not detected by the pfam00515, pfam07719, pfam07720 and pfam07221 models. 119
55388 372256 pfam12689 Acid_PPase Acid Phosphatase. This family contains phosphatase enzymes and other proteins of the HAD superfamily. It includes MDP-1 which is a eukaryotic magnesium-dependent acid phosphatase. 169
55389 403784 pfam12690 BsuPI Intracellular proteinase inhibitor. This is a bacterial domain which has been named BsuPI in Bacillus subtilis. This domain is found in Bacillus subtilis ipi, where it has been suggested to regulate the major intracellular proteinase (ISP-1) activity in vivo. The structure of proteins in this family adopts a beta barrel topology. 100
55390 403785 pfam12691 Minor_capsid_3 Minor capsid protein from bacteriophage. This family is from one of three adjacent genes, all of which are involved in formation of the minor phage capsid. 117
55391 403786 pfam12692 Methyltransf_17 S-adenosyl-L-methionine methyltransferase. This domain is found in bacterial proteins. The structure of the proteins in this family suggests that they function as a methyltransferase. 160
55392 289463 pfam12693 GspL_C GspL periplasmic domain. This domain is the periplasmic domain of the GspL/EpsL family proteins. These proteins are involved in type II secretion systems. 158
55393 403787 pfam12694 MoCo_carrier Putative molybdenum carrier. The structure of proteins in this family contains central beta strands with flanking alpha helices. The structure is similar to that of a molybdenum cofactor carrier protein. 145
55394 315383 pfam12695 Abhydrolase_5 Alpha/beta hydrolase family. This family contains a diverse range of alpha/beta hydrolase enzymes. 164
55395 403788 pfam12696 TraG-D_C TraM recognition site of TraD and TraG. This family includes both TraG and TraD as well as VirD4 proteins. TraG is essential for DNA transfer in bacterial conjugation. These proteins are thought to mediate interactions between the DNA-processing (Dtr) and the mating pair formation (Mpf) systems. This domain interacts with the relaxosome component TraM via the latter's tetramerisation domain. TraD is a hexameric ring ATPase that forms the cytoplasmic face of the conjugative pore. 125
55396 403789 pfam12697 Abhydrolase_6 Alpha/beta hydrolase family. This family contains alpha/beta hydrolase enzymes of diverse specificity. 212
55397 403790 pfam12698 ABC2_membrane_3 ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061. 345
55398 403791 pfam12699 phiKZ_IP phiKZ-like phage internal head proteins. Phage internal head proteins (IP) are proteins that are encoded by a bacteriophage and assembled into the mature virion inside the capsid head. The most analogous characterized IP proteins are those of bacteriophage T4, which are known to be proteolytically processed during phage maturation, and then subsequently injected into the host cell during infection. The phiKZ_IP family consists of internal head proteins encoded by phiKZ-like phages. Each phage encodes three to six members of this family. Members of the family reside in the head and are cleaved during phage maturation to separate an N-terminal propeptide from a C-terminal domain. The C-terminal domain remains in the mature capsid. The N-terminal propeptide domain is either mostly or completely removed from the mature capsid. In one case, an unrelated polypeptide is embedded in the propeptide and also remains in the mature capsid. The phiKZ-like IP proteins are not discernibly homologous to the T4 IP proteins, and it is not known if the phiKZ-like IP proteins are injected into the host cell, or have some other function within the head. The alignment and HMM model exclude most of the propeptide region, but include the cleavage sites. The first 100 residues, including the cleavage sites, constitute the most conservative part of the seed alignment. 323
55399 403792 pfam12700 HlyD_2 HlyD family secretion protein. This family is related to pfam00529. 413
55400 403793 pfam12701 LSM14 Scd6-like Sm domain. The Scd6-like Sm domain is found in Scd6p from S. cerevisiae, Rap55 from the newt Pleurodeles walt, and its orthologs from fungi, animals, plants and apicomplexans. The domain is also found in Dcp3p and the human EDC3/FLJ21128 protein where it is fused to the Rossmanoid YjeF-N domain. In addition both EDC3 and Scd6p are found fused to the FDF domain. 75
55401 403794 pfam12702 Lipocalin_3 Lipocalin-like. This is a family of proteins of 115 residues on average. The family has two highly conserved tryptophan residues. The fold is very similar to the lipocalin-like fold from several comparable structures. 92
55402 403795 pfam12703 ptaRNA1_toxin Toxin of toxin-antitoxin type 1 system. This family is the toxin of a type 1 toxin-antitoxin system which is found in a relatively widespread range of bacterial species. The species distribution suggests frequent horizontal gene transfer. In a type 1 system, as characterized for the plasmid-encoded E. coli hok/sok system, the toxin-encoding stable mRNA encodes a protein which rapidly leads to cell death unless the translation is suppressed by a short-lived small RNA. The plasmid-encoded module prevents the growth of plasmid-free offspring, thus ensuring the persistence of the plasmid in the population. Plasmid-free cells arising after cell-division will be killed because the stable mRNA toxin is present while the comparably unstable anti-toxin is rapidly degraded. Where the system is transcribed chromosomally, the mechanism is poorly understood. 73
55403 403796 pfam12704 MacB_PCD MacB-like periplasmic core domain. This family represents the periplasmic core domain found in a variety of ABC transporters. The structure of this family has been solved for the MacB protein. Some structural similarity was found to the periplasmic domain of the AcrB multidrug efflux transporter. 209
55404 403797 pfam12705 PDDEXK_1 PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily. 249
55405 403798 pfam12706 Lactamase_B_2 Beta-lactamase superfamily domain. This family is part of the beta-lactamase superfamily and is related to pfam00753. 196
55406 403799 pfam12707 DUF3804 Protein of unknown function (DUF3804). This family is approximately 130 residues. Dali search indicates this protein carries a NTF2-fold with a hydrophobic cavity as a structural homolog to 1JB2, 2R4I, 3FSD and 2UX0. In this hydrophobic cavity, Arg 118 provides the H-bonding force to hold a PEG molecule from crystallisation. The interface interaction suggests that the biomolecule of PMN2A_0505 is a dimer. Two members of the family are annotated as putative EF-Tu domain 2 but there is no match to this family so this is likely to be a false assignment. There are two highly conserved tryptophan residues towards the C-terminal end of the family. 128
55407 403800 pfam12708 Pectate_lyase_3 Pectate lyase superfamily protein. This family of proteins possesses a beta helical structure like Pectate lyase. This family is most closely related to glycosyl hydrolase family 28. 213
55408 403801 pfam12709 Kinetocho_Slk19 Central kinetochore-associated. This is a family of proteins integrally involved in the central kinetochore. Slk19 is a yeast member and it may play an important role in the timing of nuclear migration. It may also participate, directly or indirectly, in the maintenance of centromeric tensile strength during mitotic stagnation, for instance during activation of checkpoint controls, when cells need to preserve nuclear integrity until cell cycle progression can be resumed. 77
55409 403802 pfam12710 HAD haloacid dehalogenase-like hydrolase. 187
55410 403803 pfam12712 DUF3805 Domain of unknown function (DUF3805). This family represents the N-terminal domain of the structure. In two related Bacteroides species the gene lies immediately upstream from a putative ATP binding component of an ATP transporter and a putative histidinol phosphatase. The structure of this domain is strikingly similar to the N-terminal structure of 1tui, also of unknown function. The domain carries four conserved tryptophan residues. 152
55411 403804 pfam12713 DUF3806 Domain of unknown function (DUF3806). This family represents the C-terminal domain of the structure. In two related Bacteroides species the gene lies immediately upstream from a putative ATP binding component of an ATP transporter and a putative histidinol phosphatase. The structure of this domain is strikingly similar to the N-terminal structure of 1ma7 whose C-terminal domain is a phage integrase, pfam00589. 86
55412 315397 pfam12714 TILa TILa domain. This cysteine rich domain occurs alongside the TIL pfam01826 domain and is likely to be a distant relative. 54
55413 403805 pfam12715 Abhydrolase_7 Abhydrolase family. This is a family of probable bacterial abhydrolases. 387
55414 403806 pfam12716 Apq12 Nuclear pore assembly and biogenesis. This is a family of conserved fungal proteins involved in nuclear pore assembly. Apq12 is an integral membrane protein of the nuclear envelope (NE) and endoplasmic reticulum. Its absence leads to a partial block in mRNA export and cold-sensitive defects in the growth and localization of a subset of nucleoporins, particularly those asymmetrically localized to the cytoplasmic fibrils. The defects in nuclear pore assembly appear to be due to defects in regulating membrane fluidity. 53
55415 403807 pfam12717 Cnd1 non-SMC mitotic condensation complex subunit 1. The three non-SMC (structural maintenance of chromosomes) subunits of the mitotic condensation complex are Cnd1-3. The whole complex is essential for viability and the condensing of chromosomes in mitosis. 162
55416 403808 pfam12718 Tropomyosin_1 Tropomyosin like. This family is a set of eukaryotic tropomyosins. Within the yeast Tpm1 and Tpm2, biochemical and sequence analyses indicate that Tpm2p spans four actin monomers along a filament, whereas Tpm1p spans five. Despite its shorter length, Tpm2p can compete with Tpm1p for binding to F-actin. Over-expression of Tpm2p in vivo alters the axial budding of haploids to a bipolar pattern, and this can be partially suppressed by co-over-expression of Tpm1p. This suggests distinct functions for the two tropomyosins, and indicates that the ratio between them is important for correct morphogenesis. The family also contains higher eukaryote Tpm3 members. 142
55417 403809 pfam12719 Cnd3 Nuclear condensing complex subunits, C-term domain. The Cnd1-3 proteins are the three non-SMC (structural maintenance of chromosomes) proteins that go to make up the mitotic condensation complex along with the two SMC protein families, XCAP-C and XCAP-E, (or in the case of fission yeast, Cut3 and Cut14). The five-member complex seems to be conserved from yeasts to vertebrates. This domain is the C-terminal, cysteine-rich domain of Cnd3. The complex shuttles between the nucleus, during mitosis, and the cytoplasm during the rest of the cycle. Thus this family is made up of the C-termini of XCAP-Gs, Ycg1 and Ycs5 members. 286
55418 403810 pfam12720 DUF3807 Protein of unknown function (DUF3807). This is a family of conserved fungal proteins of unknown function. 178
55419 403811 pfam12721 RHIM RIP homotypic interaction motif. RIP proteins are receptor-interacting serine/threonine-protein kinases or cell death proteins. This interacting domain is involved in virus recognition. The RHIM domain is necessary for the recruitment of RIP and RIP3 by the IFN-inducible protein DNA-dependent activator of IRFs (DAI), also known as DLM-1 or Z-DNA binding protein (ZBP1). Both the RIP kinases contribute to DAI-induced NF-kappaB activation. RIP3 undergoes auto phosphorylation on binding to DAI. 52
55420 403812 pfam12722 Hid1 High-temperature-induced dauer-formation protein. Hid1 (high-temperature-induced dauer-formation protein 1) represents proteins approximately 800 residues long and is conserved from fungi to humans. It contains up to seven potential transmembrane domains separated by regions of low complexity. Functionally it might be involved in vesicle secretion or be an inter-cellular signalling protein or be a novel insulin receptor. 804
55421 372272 pfam12723 DUF3809 Protein of unknown function (DUF3809). This family of proteins is functionally uncharacterized. This family of proteins is found in Deinococci bacteria. Proteins in this family are typically between 117 and 157 amino acids in length. 136
55422 403813 pfam12724 Flavodoxin_5 Flavodoxin domain. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN. 144
55423 403814 pfam12725 DUF3810 Protein of unknown function (DUF3810). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 333 and 377 amino acids in length. There is a conserved HEXXH sequence motif that is characteristic of metallopeptidases. This family may therefore belong to an as yet uncharacterized family of peptidase enzymes. 318
55424 403815 pfam12726 SEN1_N SEN1 N terminal. This domain is found at the N terminal of the helicase SEN1. SEN1 is a Pol II termination factor for noncoding RNA genes. The N terminal of SEN1, unlike the C terminal, is not required for growth. 744
55425 403816 pfam12727 PBP_like PBP superfamily domain. This family belongs to the periplasmic binding domain superfamily. It is often associated with a helix-turn-helix domain. 193
55426 403817 pfam12728 HTH_17 Helix-turn-helix domain. This domain is a DNA-binding helix-turn-helix domain. 51
55427 403818 pfam12729 4HB_MCP_1 Four helix bundle sensory module for signal transduction. This family is a four helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterized in E. coli chemoreceptors. 181
55428 403819 pfam12730 ABC2_membrane_4 ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061. 179
55429 372275 pfam12731 Mating_N Mating-type protein beta 1. This domain is found in some fungi and is the C-terminus of a homeodomain-containing transcription factor protein involved in mating. 95
55430 403820 pfam12732 YtxH YtxH-like protein. This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 143 amino acids in length. The N-terminal region is the most conserved. Proteins in this family are functionally uncharacterized. 71
55431 403821 pfam12733 Cadherin-like Cadherin-like beta sandwich domain. This domain is found in several bacterial, metazoan and chlorophyte algal proteins. A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain. In animal proteins it is associated with an ATP-grasp domain. 89
55432 403822 pfam12734 CYSTM Cysteine-rich TM module stress tolerance. The members of this family are short cysteine-rich membrane proteins that most probably dimerize together to form a transmembrane sulfhydryl-lined pore. The CYSTM module is always present at the extreme C-terminus of the protein in which it is present. Furthermore, like the yeast prototypes, the majority of the proteins also possess a proline/glutamine-rich segment upstream of the CYSTM module that is likely to form a polar, disordered head in the cytoplasm. The presence of an atypical well-conserved acidic residue at the C-terminal end of the TM helix suggests that this might interact with a positively charged moiety in the lipid head group. Consistently across the eukaryotes, the different versions of the CYSTM module appear to have roles in stress-response or stress-tolerance, and, more specifically, in resistance to deleterious substances, implying that these might be general functions of the whole family. 37
55433 403823 pfam12735 Trs65 TRAPP trafficking subunit Trs65. This family is one of the subunits of the TRAPP Golgi trafficking complex. TRAPP subunits are found in two different sized complexes, TRAPP I and TRAPP II. While both complexes contain the same seven subunits, Bet3p, Bet5p, Trs20p, Trs23p, Trs31p, Trs33p and Trs85p, with TRAPPC human equivalents, TRAPP II has the additional three subunits, Trs65p, Trs120p and Trs130p. While it has been implicated in cell wall biogenesis and stress response, the role of Trs65 in TRAPP II is supported by the findings that the protein co-localizes with Trs130p, and deletion of TRS65 in yeast leads to a conditional lethal phenotype if either one of the other TRAPP II-specific subunits is modified. Furthermore, the trs65 mutant has reduced Ypt31/32p guanine nucleotide exchange, GEF, activity. 309
55434 403824 pfam12736 CABIT Cell-cycle sustaining, positive selection,. The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologs (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2. 261
55435 372279 pfam12737 Mating_C C-terminal domain of homeodomain 1. Mating in fungi is controlled by the loci that determine the mating type of an individual, and only individuals with differing mating types can mate. Basidiomycete fungi have evolved a unique mating system, termed tetrapolar or bifactorial incompatibility, in which mating type is determined by two unlinked loci; compatibility at both loci is required for mating to occur. The multi-allelic tetrapolar mating system is considered to be a novel innovation that could have only evolved once, and is thus unique to the mushroom fungi. This domain is C-terminal to the homeodomain transcription factor region. 412
55436 403825 pfam12738 PTCB-BRCT twin BRCT domain. This is a BRCT domain that appears in duplicate in most member sequences. BRCT domains are peptide- and phosphopeptide-binding modules. BRCT domains are present in a number of proteins involved in DNA checkpoint controls and DNA repair. 63
55437 403826 pfam12739 TRAPPC-Trs85 ER-Golgi trafficking TRAPP I complex 85 kDa subunit. This family is one of the subunits of the TRAPP Golgi trafficking complex. TRAPP subunits are found in two different sized complexes, TRAPP I and TRAPP II, and this Trs85 is in the smaller complex. TRAPP I, but not TRAPP II, functions in ER-Golgi transport. Trs85p was reported to function in the cytosol-to-vacuole targeting pathway, suggesting a role for this subunit in autophagy as well as in secretion. The overall architecture of TRAPP I shows the other components to be Bet3p (TRAPPC3), Bet5p (TRAPPC1), Trs20p (TRAPPC2), Trs23p (TRAPPC4), Trs31p (TRAPPC5), Trs33p (TRAPPC6a and b) and Trs85p. 392
55438 403827 pfam12740 Chlorophyllase2 Chlorophyllase enzyme. This family consists of several chlorophyllase and chlorophyllase-2 (EC:3.1.1.14) enzymes. Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of an ester bond to yield chlorophyllide and phytol. The family includes both plant and Amphioxus members. 254
55439 403828 pfam12741 SusD-like SusD and RagB outer membrane lipoprotein. This is a family of SusD-like proteins, one member of which, BT1043, is an outer membrane lipoprotein involved in host glycan metabolism. The structures of this and SusD-homologs in the family are dominated by tetratrico peptide repeats that may facilitate association with outer membrane beta-barrel transporters required for glycan uptake. The structure of BT1043 complexed with N-acetyllactosamine reveals that recognition is mediated via hydrogen bonding interactions with the reducing end of beta-N-acetylglucosamine, suggesting a role in binding glycans liberated from the mucin polypeptide. Mammalian distal gut bacteria have an expanded capacity to utilize glycans. In the absence of dietary sources, some species rely on host-derived mucosal glycans. The ability of Bacteroides thetaiotaomicron, a prominent human gut symbiont, to forage host glycans contributes to both its ability to persist within an individual host and its ability to be transmitted naturally to new hosts at birth. 495
55440 403829 pfam12742 Gryzun-like Gryzun, putative Golgi trafficking. Members of this family are involved in Golgi trafficking. 56
55441 403830 pfam12743 ESR1_C Oestrogen-type nuclear receptor final C-terminal. This is the very C-terminal region of a subfamily of nuclear receptors that includes oestrogen receptors and other subfamily 3 group A members. The actual function of this region is not known, but the domain is absent from all the other types of nuclear receptors. Oestrogen receptors modulate AP-1-dependent transcription through two distinct mechanisms: via protein-protein interactions on DNA; and via non-genomic actions. The mechanism used depends on the cellular localization of the receptor. In addition to the more extensively studied cross-talk on DNA, additional non-genomic actions might be very important in target tissues in which membrane-associated ERs are found. These non-genomic actions probably contribute to the overall physiological responses mediated by ligand-bound ERs and might possibly be mediated via this C-terminal domain. 40
55442 403831 pfam12744 ATG19_autophagy Autophagy protein Atg19, Atg8-binding. Autophagy is generally known as a process involved in the degradation of bulk cytoplasmic components that are non-specifically sequestered into an autophagosome, where they are sequestered into double-membrane vesicles and delivered to the degradative organelle, the lysosome/vacuole, for breakdown and eventual recycling of the resulting macromolecules. In contrast to autophagy, however, the Cvt pathway is a highly selective process that involves the sequestration of at least two specific cargos that are resident vacuolar hydrolases, aminopeptidase I (Ape1) and alpha-mannosidase (Ams1). These proteins are sequestered within a double-membrane vesicle, termed a Cvt vesicle. The Cvt vesicle is fairly consistent in size, and is much smaller than the autophagosome, being 140-160 nm in diameter. The prApe1 is sequestered within either Cvt vesicles or autophagosomes, depending on the nutrient conditions, and delivered to the vacuole. Autophagy and the Cvt pathway are topologically and mechanistically similar and share most of the same machinery. The Ape1 complex is ultimately enwrapped within either Cvt vesicles or autophagosomes at the perivacuolar PAS. The receptor protein Atg19 binds to the Ape1 complex through the prApe1 propeptide to form the Cvt complex in the cytosol. In the absence of Atg19, prApe1 can form an Ape1 complex, but does not localize at the PAS. Atg19 is a peripheral membrane protein with differing binding sites for both Ape1 and Ams1. The Atg8-binding region in the yeast proteins is this very C-terminal residues. 246
55443 403832 pfam12745 HGTP_anticodon2 Anticodon binding domain of tRNAs. This is an HGTP_anticodon binding domain, found largely on Gcn2 proteins which bind tRNA to down regulate translation in certain stress situations. 261
55444 403833 pfam12746 GNAT_acetyltran GNAT acetyltransferase. Many of the members are annotated as being Zwittermicin A resistance proteins, whereas others are listed as being GNAT acetyltransferases. The family has similarities to the GNAT acetyltransferase family. 239
55445 372287 pfam12747 DdrB DdrB-like protein. This family includes the Deinococcus DdrB protein which is a ssDNA binding protein. This family also includes some possibly distantly related cyanobacterial proteins. However, these are not strongly supported. The structure of DdrB is known. 126
55446 315429 pfam12749 Metallothio_Euk Eukaryotic metallothionein. This is a family of eukaryotic metallothioneins. 66
55447 403834 pfam12750 Maff2 Maff2 family. This family of short membrane proteins is related to the protein Maff2. Maff2 lies just outside the direct repeats of a tetracycline resistance transposable element. This protein may contain transmembrane helices. 70
55448 403835 pfam12751 Vac7 Vacuolar segregation subunit 7. Vac7 is localized at the vacuole membrane, a location which is consistent with its involvement in vacuole morphology and inheritance. Vac7 has been shown to function as an upstream regulator of the Fab1 lipid kinase pathway. The Fab1 lipid pathway is important for correct regulation of membrane trafficking events. 382
55449 403836 pfam12752 SUZ SUZ domain. The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched in positively charged amino acids. It was first characterized in the C. elegans protein Szy-20 where it has been shown to bind RNA and allow their localization to the centrosome. Warning- the domain has a compositionally biased character. 56
55450 403837 pfam12753 Nro1 Nuclear pore complex subunit Nro1. In fission yeast, this protein is a positive regulator of the stability of Sre1N, the sterol regulatory element-binding protein which is an ER membrane-bound transcription factor that controls adaptation to low oxygen-growth. In addition, the fission yeast Nro1 is a direct inhibitor of a protein that inhibits SreN1 degradation, Ofd1 (an oxoglutamate deoxygenase). The outcome of this reactivity is that Ofd1 acts as an oxygen sensor that regulates the binding of Nro1 to Ofd1 to control the stability of Sre1N. Solution of the structure of Nro1 reveals it to be made up of a number of TPR coils. TPR proteins are composed of three to 16 tandem peptide repeat motifs of 34 amino acids with degenerate sequence. The helical pairs adopt a helix-turn-helix anti-parallel arrangement with interacting helices. In general, TPR motifs are stacked together so that helix A from TPRn is packed between helix B from TPRn and helix A from TPRn+1. In Nro1, the 12 alpha helices forming the six TPR motifs are organized as follows from N-terminus to C-terminus - TPR1A, TPR1B, TPR2A, TPR2B, TPR3A, TPR3B, TPR4A, TPR4B, TPR5A, TPR5B, TPR6A, and TPR6B with the C-terminal helix (hC) running above the sixth TPR motif with an angle of approx 45 degrees with TPR6A and TPR6B. The corresponding TPRs structural motifs are longer (50 residues) than are canonical ones (34 amino acids) and are organized into two subdomains - Nro1-N (residues 55-225) and Nro1-C (residues 226-393). The Nro1/Etti protein plays a role in nuclear import suggesting that it is residues 4-19 that are interacting with Ofd1. 414
55451 372291 pfam12754 Blt1 Blt1 N-terminal domain. During size-dependent cell cycle transitions controlled by the ubiquitous cyclin-dependent kinase Cdk1, Blt1 has been shown to co-localize with Cdr2 in the medial interphase nodes, as well as with Mid1 which was previously shown to localize to similar interphase structures. Physical interactions between Blt1-Mid1, Blt1-Cdr2 and Cdr2-Mid1 were detected, indicating that medial cortical nodes are formed by the ordered, Cdr2-dependent assembly of multiple interacting proteins during interphase. This domain shows similarity to ubiquitin family proteins. 150
55452 403838 pfam12755 Vac14_Fab1_bd Vacuolar 14 Fab1-binding region. Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. This domain appears to be the one responsible for binding to Fab1. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system. 97
55453 403839 pfam12756 zf-C2H2_2 C2H2 type zinc-finger (2 copies). This family contains two copies of a C2H2-like zinc finger domain. 98
55454 403840 pfam12757 Eisosome1 Eisosome protein 1. Eisosome protein 1 is required for normal formation of eisosomes, large cytoplasmic protein assemblies that localize to specialized domains on plasma membrane and mark the site of endocytosis. 125
55455 372295 pfam12758 DUF3813 Protein of unknown function (DUF3813). This is an uncharacterized family of Bacillus proteins. 60
55456 289525 pfam12759 HTH_Tnp_IS1 InsA C-terminal domain. This short domain is found at the C-terminus of the InsA protein. This domain contains a helix-turn-helix domain. 46
55457 403841 pfam12760 Zn_Tnp_IS1595 Transposase zinc-ribbon domain. This zinc binding domain is found in a range of transposase proteins such as ISSPO8, ISSOD11, ISRSSP2 etc. It is likely a zinc-binding beta ribbon domain that could bind the DNA. 46
55458 403842 pfam12761 End3 Actin cytoskeleton-regulatory complex protein END3. Endocytosis is accomplished through the sequential recruitment at endocytic sites of proteins that drive cargo sorting, membrane invagination and vesicle release. End3p is part of the coat module protein complex Pan1, along with Pan1p, Sla1p, and Sla2p. The proteins in this complex are regulated by phosphorylation events. End3p also regulates the cortical actin cytoskeleton. The subunits of the Pan1 complex are homologous to mammalian intersectin. 200
55459 403843 pfam12762 DDE_Tnp_IS1595 ISXO2-like transposase domain. This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2. 153
55460 289529 pfam12763 EF-hand_4 Cytoskeletal-regulatory complex EF hand. This is an EF-hand family from the N-terminus of actin cytoskeleton-regulatory complex END3 and similar proteins from fungi and closely related species. 104
55461 403844 pfam12764 Gly-rich_Ago1 Glycine-rich region of argonaut. This domain is often found at the very N-terminal of argonaut-like proteins. 103
55462 403845 pfam12765 Cohesin_HEAT HEAT repeat associated with sister chromatid cohesion. This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble. 42
55463 403846 pfam12766 Pyridox_oxase_2 Pyridoxamine 5'-phosphate oxidase. Pyridoxamine 5'-phosphate oxidase catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP), the terminal step in the de novo biosynthesis of PLP in Escherichia coli and part of the salvage pathway of this coenzyme in both E. coli and mammalian cells. This region is the flavoprotein FMN-binding domain. 99
55464 403847 pfam12767 SAGA-Tad1 Transcriptional regulator of RNA polII, SAGA, subunit. The yeast SAGA complex is a multifunctional coactivator that regulates transcription by RNA polymerase II. It is formed of five major modular subunits and shows a high degree of structural conservation to human TFTC and STAGA. The complex can also be conceived of as consisting of two histone-fold-containing core subunits, and this family is one of these. As a family it is likely to carry binding regions for interactions with a number of the other components of the complex. 135
55465 403848 pfam12768 Rax2 Cortical protein marker for cell polarity. Diploid yeast cells repeatedly polarise and bud from their poles, due probably to the presence of highly stable membrane markers, and Rax2 is one such marker. It is inherited immutably at the cell cortex for multiple generations, and has a half-life exceeding several generations. The persistent inheritance of cortical protein markers would provide a means of coupling a cell's history with the future development of a precise morphogenetic form. Both Rax1 and Rax2 localize to the distal pole as well as to the division site and they interact both with each other and with Bud8p and Bud9p in the establishment and/or maintenance of the cortical markers for bipolar budding. Thus Rax2 is likely to control cell polarity during vegetative growth, and in fission yeast this is done by regulating the localization of for3p. 211
55466 403849 pfam12769 PNTB_4TM 4TM region of pyridine nucleotide transhydrogenase, mitoch. PNTB_4TM is the region upstream of family PNTB, pfam02233, that carries four of this transporter's transmembrane regions. PNTB is the beta-subunit of pyridine nucleotide transhydrogenase. This family forms part of the Proton-translocating Transhydrogenase (PTH) Family. 84
55467 378942 pfam12770 CHAT CHAT domain. These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins. 289
55468 403850 pfam12771 SusD-like_2 Starch-binding associating with outer membrane. SusD is a secreted starch-binding protein with an N-terminal lipid tail that allows it to associate with the outer membrane. 413
55469 403851 pfam12772 GHBP Growth hormone receptor binding. Growth hormone receptor binding protein is produced either by proteolysis of the GHR (growth hormone receptor) at the cell surface thereby releasing its extracellular domain, the GHBP (growth hormone-binding protein), or, in rodents, by alternative processing of the GHR transcript. The sheddase proteolytic enzyme responsible for the cleavage is TACE (tumor necrosis factor-alpha-converting enzyme). Growth hormone (GH) binding to GH receptor (GHR) is the initial step that leads to the physiological functions of the hormone. The biological effects of GHBP are determined by the serum levels of growth hormone (GH), which can vary. Low levels of GH can result in a dwarf phenotype and have been positively correlated with an increased life expectancy. High levels of GH can lead to gigantism or a clinical syndrome termed acromegaly and have been implicated in diabetic eye and kidney damage. 303
55470 403852 pfam12773 DZR Double zinc ribbon. This family consists of a pair of zinc ribbon domains. 45
55471 403853 pfam12774 AAA_6 Hydrolytic ATP binding site of dynein motor region D1. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D1 unit of the motor and contains the hydrolytic ATP binding site. 327
55472 403854 pfam12775 AAA_7 P-loop containing dynein motor region D3. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 region and is an ATP binding site. 181
55473 403855 pfam12776 Myb_DNA-bind_3 Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. In particular pfam10545 seems most related. This family is greatly expanded in plants and appears in several proteins annotated as transposon proteins. 96
55474 289543 pfam12777 MT Microtubule-binding stalk of dynein motor. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component. 344
55475 403856 pfam12778 PXPV PXPV repeat (3 copies). This short repeat is found in multiple copies in a variety of Burkholderia proteins. The function of this region is unknown. 22
55476 403857 pfam12779 YXWGXW YXWGXW repeat (2 copies). This short repeat contains the motif YXWXXGXW where X can be any amino acid. It is generally found in 2-5 copies in short secreted bacterial proteins. Its function is as yet unknown. 26
55477 403858 pfam12780 AAA_8 P-loop containing dynein motor region D4. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor. 259
55478 403859 pfam12781 AAA_9 ATP-binding dynein motor region D5. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop. 217
55479 372312 pfam12782 Innate_immun Invertebrate innate immunity transcript family. The immune response of the purple sea urchin appears to be more complex than previously believed in that it uses immune-related gene families homologous to vertebrate Toll-like and NOD/NALP-like receptor families as well as C-type lectins and a rudimentary complement system. In addition, the species also produces this unusual family of mRNAs, also known as 185/333, which is strongly upregulated in response to pathogen challenge. 291
55480 403860 pfam12783 Sec7_N Guanine nucleotide exchange factor in Golgi transport N-terminal. The full-length Sec7 functions proximally in the secretory pathway as a protein binding scaffold for the coat protein complexes COPII-COPI. The COPII-COPI-protein switch is necessary for maturation of the vesicular-tubular cluster, VTC, intermediate compartments for Golgi compartment biogenesis. This N-terminal domain however does not appear to be binding either of the COP or the ARF. 154
55481 403861 pfam12784 PDDEXK_2 PD-(D/E)XK nuclease family transposase. Members of this family belong to the PD-(D/E)XK nuclease superfamily. These proteins are transposase proteins. 227
55482 372315 pfam12785 VESA1_N Variant erythrocyte surface antigen-1. This family represents the N-terminal of the variant erythrocyte surface antigen 1, versions a and b, of Babesia. Babesia bovis is a tick-borne, intra-erythrocytic, protozoal parasite of cattle that shares many lifestyle parallels with the most virulent of the human malarial parasites, Plasmodium falciparum. Babesia uses antigenic variation to establish consistent infections of long duration. The two variants of VESA1, a and b, are expressed from different but closely related genes, and variation is achieved through the involvement of a segmental gene conversion mechanism and low-frequency epigenetic in situ switching of transcriptional activity from the VESA1 gene-pair to a possible other gene pair. 457
55483 193262 pfam12786 GBV-C_env GB virus C genotype envelope. This is the envelope protein from the ssRNA GB virus genotype C. 413
55484 403862 pfam12787 EcsC EcsC protein family. Proteins in this family are related to EcsC from B. subtilis. This protein is found in an operon with EcsA and EcsB which are components of an ABC transport system. The function of this protein is unknown. 245
55485 403863 pfam12788 YmaF YmaF family. This family of proteins contains 6 HXH motifs and is named after the B. subtilis YmaF protein. It seems likely that these are involved in metal binding. The function of this protein is unknown. 97
55486 372316 pfam12789 PTR Phage tail repeat like. This family largely contains proteins from the eukaryote Trichomonas vaginalis. These proteins contain multiple HXH repeats. Some proteins in this family are annotated as having phage tail repeats. The function of this family is unknown. 60
55487 403864 pfam12790 T6SS-SciN Type VI secretion lipoprotein, VasD, EvfM, TssJ, VC_A0113. One of the virulence mechanisms of E. coli is the production of toxins which it produces from dedicated machineries called secretion systems. Seven secretion systems have been described, which assemble from 3 to up to more than 20 subunits. These secretion systems derive from or have co-evolved with bacterial organelles such as ABC transporters (type I), type IV pili (type II), flagella (type III), or conjugative machines (type IV). The type VI secretion system (T6SS) is present in most pathogens that have contact with animals, plants, or humans. SciN is a lipoprotein tethered to the outer membrane and expressed in the periplasm of E. coli and is essential for T6S-dependent secretion of the Hcp-like SciD protein and for biofilm formation. 120
55488 403865 pfam12791 RsgI_N Anti-sigma factor N-terminus. The heat shock genes in B. subtilis can be classified into several groups according to their regulation, and the sigma gene, sigI, of Bacillus subtilis belongs to the group IV heat-shock response genes and has many orthologues in the bacterial phylum Firmicutes. Regulation of sigma factor I is carried out by RsgI from the same operon, and this N-terminal cytoplasmic portion of RsgI ('upstream' of the single transmembrane helix) has been shown to interact directly with Sigma-I. 53
55489 403866 pfam12792 CSS-motif CSS motif domain associated with EAL. This family with its characteristic highly conserved CSS sequence motif is found N-terminal to the EAL, pfam00563, domain in many cyclic diguanylate phosphodiesterases. 209
55490 403867 pfam12793 SgrR_N Sugar transport-related sRNA regulator N-term. Small, non-coding RNA molecules play important regulatory roles in a variety of physiological processes in bacteria. SgrR_N is the N-terminus of a family of proteins which regulate the transcription of these sRNAs, in particular SgrS. SgrR_N contains a helix-turn-helix motif characteristic of winged-helix DNA-binding transcriptional regulators. SgrS is a small RNA required for recovery from glucose-phosphate stress in bacteria. In examining the regulation of sgrR expression it was found that SgrR negatively auto-regulates its own transcription in the presence and absence of stress, and thus SgrR coordinates the response to glucose-phosphate stress by binding specifically to sgrS promoter DNA. 115
55491 403868 pfam12794 MscS_TM Mechanosensitive ion channel inner membrane domain 1. The small mechanosensitive channel, MscS, is a part of the turgor-driven solute efflux system that protects bacteria from lysis in the event of osmotic shock. The MscS protein alone is sufficient to form a functional mechanosensitive channel gated directly by tension in the lipid bilayer. The MscS proteins are heptamers of three transmembrane subunits with seven converging M3 domains, and this domain is one of the inner membrane domains. 333
55492 403869 pfam12795 MscS_porin Mechanosensitive ion channel porin domain. The small mechanosensitive channel, MscS, is a part of the turgor-driven solute efflux system that protects bacteria from lysis in the event of osmotic shock. The MscS protein alone is sufficient to form a functional mechanosensitive channel gated directly by tension in the lipid bilayer. The MscS proteins are heptamers of three transmembrane subunits with seven converging M3 domains, and this MscS_porin is towards the N-terminal of the molecules. The high concentration of negative charges at the extracellular entrance of the pore helps select the cations for efflux. 235
55493 403870 pfam12796 Ank_2 Ankyrin repeats (3 copies). 91
55494 403871 pfam12797 Fer4_2 4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 22
55495 403872 pfam12798 Fer4_3 4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 15
55496 403873 pfam12799 LRR_4 Leucine Rich repeats (2 copies). Leucine rich repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains. 44
55497 403874 pfam12800 Fer4_4 4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 17
55498 403875 pfam12801 Fer4_5 4Fe-4S binding domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 48
55499 403876 pfam12802 MarR_2 MarR family. The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif. 61
55500 289567 pfam12803 G-7-MTase mRNA (guanine-7-)methyltransferase (G-7-MTase). The Sendai virus RNA-dependent RNA polymerase complex, which consists of L and P proteins, participates in the synthesis of viral mRNAs that possess a methylated cap structure. The N-terminal of the L protein acts as the RNA-dependent RNA polymerase part of the molecule, family Paramyx_RNA_pol, pfam00946. This domain is the C-terminal part of the L protein and it catalyzes cap methylation through its mRNA (guanine-7-)methyltransferase (G-7-MTase) activity. 317
55501 403877 pfam12804 NTP_transf_3 MobA-like NTP transferase domain. This family includes the MobA protein (Molybdopterin-guanine dinucleotide biosynthesis protein A). The family also includes a wide range of other NTP transferase domains. 159
55502 403878 pfam12805 FUSC-like FUSC-like inner membrane protein yccS. This family has similarities to the fusaric acid resistance protein family. The proteins are lodged in the inner membrane. 284
55503 403879 pfam12806 Acyl-CoA_dh_C Acetyl-CoA dehydrogenase C-terminal like. This domain appears to be the very C-terminal region of many bacterial acetyl-CoA dehydrogenases. 127
55504 403880 pfam12807 eIF3_p135 Translation initiation factor eIF3 subunit 135. Translation initiation factor eIF3 is a multi-subunit protein complex required for initiation of protein biosynthesis in eukaryotic cells. The complex promotes ribosome dissociation, the binding of the initiator methionyl-tRNA to the 40 S ribosomal subunit, and mRNA recruitment to the ribosome. The protein product from TIF31 genes in yeast is p135 which associates with the eIF3 but does not seem to be necessary for protein translation initiation. 168
55505 315477 pfam12808 Mto2_bdg Micro-tubular organizer Mto1 C-term Mto2-binding region. The C-terminal region of the micro-tubular organizer protein 1 (mto1) is the binding domain for attachment to Mto2p. The full-length Mto1 protein is required for microtubule nucleation from non-spindle pole body MTOCs in fission yeast. The interaction of Mto2p with this region of Mto1 is critical for anchoring the cytokinetic actin ring to the medial region of the cell and for proper coordination of mitosis with cytokinesis. 52
55506 403881 pfam12809 Metallothi_Euk2 Eukaryotic metallothionein. This is a family of eukaryotic metallothioneins. 69
55507 403882 pfam12810 Gly_rich Glycine rich protein. This family of proteins is greatly expanded in Trichomonas vaginalis. The proteins are composed of several glycine rich motifs interspersed through the sequence. Although many proteins in the family have been annotated by similarity, given the biased composition of the sequences these annotations are unlikely to be functionally relevant. 257
55508 403883 pfam12811 BaxI_1 Bax inhibitor 1 like. The Bax-inhibitor-1 region of the receptor molecules is conserved from bacteria to humans. 235
55509 372324 pfam12812 PDZ_1 PDZ-like domain. PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. This is a family of PDZ-like domains from bacteria, plants and fungi. 78
55510 372325 pfam12813 XPG_I_2 XPG domain containing. This family is largely of fungal proteins and is related to the XP-G protein family. 249
55511 403884 pfam12814 Mcp5_PH Meiotic cell cortex C-terminal pleckstrin homology. The PH domain of these largely fungal proteins is necessary for the cortical localization of the protein during meiosis, since the overall function of the protein is to anchor dynein at the cell cortex during the horsetail phase. During prophase I of fission yeast, horsetail nuclear movement occurs, and this starts when all the telomeres become bundled at the spindle pole body - SPB. Subsequent to this, the nucleus undergoes a dynamic oscillation, resulting in elongated nuclear morphology. Horsetail nuclear movement is thought to be predominantly due to the pulling of astral microtubules that link the SPB to cortical microtubule-attachment sites at the opposite end of the cell; the pulling force is believed to be provided by cytoplasmic dynein and dynactin. 119
55512 372327 pfam12815 CTD Spt5 C-terminal nonapeptide repeat binding Spt4. The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif. 71
55513 403885 pfam12816 Vps8 Golgi CORVET complex core vacuolar protein 8. Vps8 is one of the Golgi complex components necessary for vacuolar sorting. Eukaryotic cells contain a highly dynamic endo-membrane system, in which individual organelles keep their identity despite continuous vesicle generation and fusion. Vesicles that bud from a donor membrane are targeted and delivered to each individual organelle, where they release their cargo after fusion with the acceptor membrane. Vps8 is the core component of the endosomal tethering complex CORVET (class C core vacuole/endosome tethering). Vps8 co-operates with Vps21-GTP to mediate endosomal clustering in a reaction that is dependent on Vps3. Vps8 is the only CORVET subunit that is enriched on late endosomes, suggesting that it is a marker for the maturation of late endosomes. Late endosomes form intralumenal vesicles, and the resulting multivesicular bodies fuse with the vacuole to release their cargoes. 194
55514 289579 pfam12818 Tegument_dsDNA dsDNA viral tegument protein. This is a family of tegument proteins from double-stranded DNA herpesvirus and related viral species. 277
55515 403886 pfam12819 Malectin_like Carbohydrate-binding protein of the ER. Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognizes and binds Glc2-N-glycan. The domain is found on a number of plant receptor kinases. 328
55516 403887 pfam12820 BRCT_assoc Serine-rich domain associated with BRCT. This domain is found on BRCA1 proteins. 164
55517 403888 pfam12821 ThrE_2 Threonine/Serine exporter, ThrE. ThrE_2 is a family of membrane proteins involved in the export of threonine and serine. L-threonine and L-serine are both substrates for the exporter. The exporter exhibits nine-ten predicted transmembrane-spanning helices with long charged C and N termini and an amphipathic helix present within the N-terminus. L-Threonine can be made by the amino acid-producing bacterium Corynebacterium glutamicum, but the potential for amino acid formation can be considerably improved by reducing its intracellular degradation into glycine and increasing its export by this exporter. Members of the family are found in Bacteria, Archaea, and the fungal kingdoms, and the family can exist either as a single long polypeptide chain or as two short polypeptides. All family members show an extended hydrophilic N-terminal domain with weak sequence similarity to portions of hydrolases (proteases, peptidases, and glycosidases); this suggests that since this region is cytoplasmic to the membrane it may be generating the transport substrate, so may imply that threonine may not be the primary substrate and the ThrE has a subsidiary function. 129
55518 403889 pfam12822 ECF_trnsprt ECF transporter, substrate-specific component. Energy-coupling factor (ECF) transporters consist of a substrate-specific component (known as the S component), and an energy-coupling module. The substrate-binding component is a small integral membrane protein which captures specific substrates and forms an active transporter in the presence of the energy-coupling AT module. 156
55519 403890 pfam12823 DUF3817 Domain of unknown function (DUF3817). This domain is of unknown function. It is sometimes found adjacent to pfam07690 and pfam03176 which are both transporter domains. 89
55520 403891 pfam12824 MRP-L20 Mitochondrial ribosomal protein subunit L20. This family is the essential mitochondrial ribosomal protein subunit L20 of fungi. 165
55521 403892 pfam12825 DUF3818 Domain of unknown function in PX-proteins (DUF3818). This domain is found on proteins carrying a PX domain. Its function is unknown. 335
55522 403893 pfam12826 HHH_2 Helix-hairpin-helix motif. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain. 64
55523 403894 pfam12827 Peroxin-22 Peroxisomal biogenesis protein family. Peroxin-22 is an integral peroxisomal membrane protein family. The N-terminus is in the matrix and the C-terminus is in the cytosol. The N-terminus carries a 25-amino acid peroxisome membrane-targeting signal. It interacts with the ubiquitin-conjugating peripheral peroxisomal membrane enzyme Pex4p anchoring it at the peroxisomal membrane. Both Pex proteins are involved at the same stage of peroxisome biogenesis. 109
55524 403895 pfam12828 PXB PX-associated. This domain is associated with the PX domain. 131
55525 403896 pfam12829 Mhr1 Transcriptional regulation of mitochondrial recombination. This family is involved in the transcriptional regulation of recombination in the mitochondria. 82
55526 403897 pfam12830 Nipped-B_C Sister chromatid cohesion C-terminus. This domain lies towards the C-terminus of nipped-B or sister chromatid cohesion proteins. 177
55527 403898 pfam12831 FAD_oxidored FAD dependent oxidoreductase. This family of proteins contains FAD dependent oxidoreductases and related proteins. 420
55528 372338 pfam12832 MFS_1_like MFS_1 like family. This family contains proteins related to the MFS superfamily. 362
55529 403899 pfam12833 HTH_18 Helix-turn-helix domain. 81
55530 372339 pfam12834 Phage_int_SAM_2 Phage integrase, N-terminal. This is a family of DNA-binding prophage integrases. It is found largely in Proteobacteria. 91
55531 372340 pfam12835 Integrase_1 Integrase. This is a family of DNA-binding prophage integrases found in Proteobacteria. 149
55532 403900 pfam12836 HHH_3 Helix-hairpin-helix motif. The HhH domain is a short DNA-binding domain. 65
55533 403901 pfam12837 Fer4_6 4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 24
55534 403902 pfam12838 Fer4_7 4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters. 52
55535 403903 pfam12840 HTH_20 Helix-turn-helix domain. This domain represents a DNA-binding Helix-turn-helix domain found in transcriptional regulatory proteins. 61
55536 403904 pfam12841 YvrJ YvrJ protein family. This family of short proteins are related to B. subtilis YvrJ protein. None of the members of this family have been functionally characterized. 36
55537 403905 pfam12842 DUF3819 Domain of unknown function (DUF3819). This is an uncharacterized domain that is found on the CCR4-Not complex component Not1. Not1 is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID. 143
55538 403906 pfam12843 QSregVF_b Putative quorum-sensing-regulated virulence factor. QSregVF_b is a family of short Pseudomonas proteins that are potential virulence factors. The structure of UniProtKB:Q9HY15, a secreted protein, has been solved and deposited as Structure 3npd, from pfam13652. It is predicted that these two adjacent proteins form a single transcriptional unit based on the prediction that together they interact with their adjacent protein PotD, which is the putrescine-binding periplasmic protein in the polyamine uptake system comprising PotABCD. These two adjacent proteins are predicted to be quorum-sensing-regulated virulence factors. 66
55539 403907 pfam12844 HTH_19 Helix-turn-helix domain. Members of this family contain a DNA-binding helix-turn-helix domain. This family contains many example antitoxins from bacterial toxin-antitoxin systems. These antitoxins are likely to be DNA-binding domains. 64
55540 403908 pfam12845 TBD TBD domain. The Tbk1/Ikki binding domain (TBD) is a 40 amino acid domain that is able to bind kinases and has been found to be essential for poly(I:C)-induced IRF activation. The domain is found in the SINTBAD, TANK and NAP1 proteins. This domain is predicted to form an alpha-helix with residues essential for kinase binding clustering on one side. 55
55541 315512 pfam12846 AAA_10 AAA-like domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. 362
55542 403909 pfam12847 Methyltransf_18 Methyltransferase domain. Proteins in this family function as methyltransferases. 151
55543 403910 pfam12848 ABC_tran_Xtn ABC transporter. This domain is an extension of some members of pfam00005 and other ABC-transporter families. 85
55544 403911 pfam12849 PBP_like_2 PBP superfamily domain. This domain belongs to the periplasmic binding protein superfamily. 270
55545 403912 pfam12850 Metallophos_2 Calcineurin-like phosphoesterase superfamily domain. Members of this family are part of the Calcineurin-like phosphoesterase superfamily. 153
55546 372343 pfam12851 Tet_JBP Oxygenase domain of the 2OGFeDO superfamily. A double-stranded beta helix (DSBH) fold domain of the 2-oxoglutarate (2OG)-Fe(II)-dependent dioxygenase (2OGFeDO) superfamily found in various eukaryotes, bacteria and bacteriophages. Members of this family catalyze nucleic acid modifications, such as thymidine hydroxylation during base J synthesis in kinetoplastids, and the conversion of 5 methyl-cytosine (5-mC) to 5-hydroxymethyl-cytosine (hmC), or further oxidation to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Metazoan TET proteins contain a cysteine-rich region inserted into the core of the DSBH fold. Vertebrate TET proteins are oncogenes that are mutated in various myeloid cancers. Fungal and algal versions of this family are linked to a predicted transposase and show lineage-specific expansions. 166
55547 403913 pfam12852 Cupin_6 Cupin. This is a family of bacterial and eukaryotic proteins that belong to the Cupin superfamily. Some of the proteins in this family are annotated as being members of the AraC family of transcription factors, in which case this domain corresponds to the ligand binding domain. 184
55548 372344 pfam12853 NADH_u_ox_C C-terminal of NADH-ubiquinone oxidoreductase 21 kDa subunit. This family is the C-terminal domain of NADH-ubiquinone oxidoreductase 21 kDa subunits from fungi. 89
55549 403914 pfam12854 PPR_1 PPR repeat. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. The exact function is not known. 34
55550 403915 pfam12855 Ecl1 Life-span regulatory factor. This family is involved in the chronological life-span of S. cerevisiae. Over-expression leads to an extended viability of wild-type strains, indicating a role in regulation. 171
55551 403916 pfam12856 ANAPC9 Anaphase-promoting complex subunit 9. Apc9 is one of the subunits of the anaphase-promoting complex, or cyclosome, which is essential for regulating entry into anaphase and exit from mitosis. The APC is a ubiquitin-protein ligase complex. All APC subunits are members of the cullin family proteins, which bind to a ring-finger subunit via a conserved cullin domain. The APC is made up of four parts, the third of which is a tetratricopeptide repeat arm (TPR) that contains Apc9. 112
55552 403917 pfam12857 TOBE_3 TOBE-like domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain. 59
55553 403918 pfam12859 ANAPC1 Anaphase-promoting complex subunit 1. Apc1 is the largest of the subunits of the anaphase-promoting complex or cyclosome. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. Infection of human fibroblasts with human cytomegalovirus (HCMV) leads to cell cycle dysregulation, which is associated with the inactivation of the anaphase-promoting complex. 120
55554 403919 pfam12860 PAS_7 PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya. 115
55555 403920 pfam12861 zf-ANAPC11 Anaphase-promoting complex subunit 11 RING-H2 finger. Apc11 is one of the subunits of the anaphase-promoting complex or cyclosome. The APC subunits are cullin family proteins with ubiquitin ligase activity. Polyubiquitination marks proteins for degradation by the 26S proteasome and is carried out by a cascade of enzymes that includes ubiquitin-activating enzymes (E1s), ubiquitin-conjugating enzymes (E2s), and ubiquitin ligases (E3s). Apc11 acts as an E3 enzyme and is responsible for recruiting E2s to the APC and for mediating the subsequent transfer of ubiquitin to APC substrates in vivo. In Saccharomyces cerevisiae this RING-H2 finger protein defines the minimal ubiquitin ligase activity of the APC, and the integrity of the RING-H2 finger is essential for budding yeast cell viability. 85
55556 403921 pfam12862 ANAPC5 Anaphase-promoting complex subunit 5. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Although it does not harbour a classical RNA binding domain, Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This region of the Apc5 member proteins carries a TPR-like motif. 91
55557 403922 pfam12863 DUF3821 Domain of unknown function (DUF3821). This is a domain largely confined to sequences from Methanomicrobiales found on putative lipases. The function is not known. 202
55558 403923 pfam12864 DUF3822 Protein of unknown function (DUF3822). This is a family of uncharacterized bacterial proteins. However, structural-similarity searches indicate the family takes on an actin-like ATPase fold. 241
55559 403924 pfam12866 DUF3823 Protein of unknown function (DUF3823). This is a family of uncharacterized proteins from Bacteroidetes. It has characteristic DN and DR sequence-motifs. The function is not known. 92
55560 403925 pfam12867 DinB_2 DinB superfamily. The DinB family are an uncharacterized family of potential enzymes. The structure of these proteins is composed of a four helix bundle. 128
55561 372351 pfam12868 DUF3824 Domain of unknown function (DUF3824). This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known. 145
55562 378982 pfam12869 tRNA_anti-like tRNA_anti-like. This is a family of bacterial, archaeal and viral proteins that is related to the tRNA_anti family pfam01336. The major characteristic of families like tRNA_anti is their OB-fold, and many of them bind DNA. 162
55563 403926 pfam12870 DUF4878 Domain of unknown function (DUF4878). This is a family of putative lipoproteins from bacteria. The family is probably related to the NTF2-like transpeptidase family. 112
55564 403927 pfam12871 PRP38_assoc Pre-mRNA-splicing factor 38-associated hydrophilic C-term. This domain is a hydrophilic region found at the C-terminus of plant and metazoan pre-mRNA-splicing factor 38 proteins. The function is not known. 98
55565 403928 pfam12872 OST-HTH OST-HTH/LOTUS domain. This predicted RNA-binding domain, found in insect Oskar and vertebrate TDRD5/TDRD7 proteins that nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. The domain adopts the winged helix-turn-helix fold and binds RNA with a potential specificity for dsRNA. In eukaryotes this domain is often combined in the same polypeptide with protein-protein or lipid interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized RNase domain belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains. 65
55566 403929 pfam12873 DUF3825 Domain of unknown function (DUF3825). Potential uncharacterized enzymatic domain associated with bacterial pfam12872 domains. Has conserved residues suggestive of an enzymatic role probably related to RNA metabolism. 239
55567 403930 pfam12874 zf-met Zinc-finger of C2H2 type. This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding. 25
55568 403931 pfam12875 DUF3826 Protein of unknown function (DUF3826). This is a putative sugar-binding family. 186
55569 372355 pfam12876 Cellulase-like Sugar-binding cellulase-like. This is a putative cellulase family. The structure is a TIM-barrel. 355
55570 403932 pfam12877 DUF3827 Domain of unknown function (DUF3827). This family contains the human KIAA1549 protein which has been found to be fused to the BRAF gene in many cases of pilocytic astrocytomas. The fusion is due mainly to a tandem duplication of 2 Mb at 7q34. Although nothing is known about the function of the human KIAA1549 protein, the BRAF protein is a well characterized oncoprotein. It is a serine/threonine protein kinase which is implicated in MAP/ERK signalling, a critical pathway for the regulation of cell division, differentiation and secretion. 677
55571 403933 pfam12878 SICA_beta SICA extracellular beta domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. There can be between 1 and 10 copies of this cysteine-rich domain. 172
55572 372358 pfam12879 SICA_C SICA C-terminal inner membrane domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. The C-terminal domain is thought to remain in the erythrocyte, found juxtaposition to the single transmembrane domain. To date, all full length proteins contain a single copy of this domain. 136
55573 403934 pfam12881 NUT NUT protein. This family includes the NUT protein. The gene encoding for NUT protein (Nuclear Testis protein) is found fused to BRD3 or BRD4 genes, in some aggressive types of carcinoma, due to chromosomal translocations. Proteins of the BRD family contain two bromodomains that bind transcriptionally active chromatin through associations with acetylated histones H3 and H4. Such proteins are crucial for the regulation of cell cycle progression. On the other hand, little is known about NUT protein. NUT is known to have a Nuclear Export Sequence (NES) as well as a Nuclear localization Signal (NLS), both located towards the C-terminal end of the protein. A fused NUT-GFP protein showed either cytoplasmic or nuclear localization, suggesting that it is subject to nuclear/cytoplasmic shuttling. Consistent with this possibility, treatment with leptomycin B an inhibitor of CRM1-dependent nuclear export resulted in re-distribution of NUT-GFP to the nucleus. Inspection of NUT revealed a C-terminal sequence similar to known nuclear export sequences (NES) which are often regulated by phosphorylation. This family carries some natively unstructured sequence. 717
55574 403935 pfam12883 DUF3828 Protein of unknown function (DUF3828). This is a family of bacterial proteins of unknown function. 122
55575 403936 pfam12884 TORC_N Transducer of regulated CREB activity, N-terminus. This family includes the N terminal region of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). The proteins display a highly conserved predicted N-terminal coiled-coil domain and an invariant sequence matching a protein kinase A (PKA) phosphorylation consensus sequence (RKXS). The coiled-coil structure interacts with the bZIP domain of CREB. This interaction may occur via ionic bonds because it is disrupted under high-salt conditions. In addition to CREB-binding, the N-terminal region plays a role in the tetramer formation of TORCs, but the physiological function of the multimeric complex has not been clarified yet. 63
55576 403937 pfam12885 TORC_M Transducer of regulated CREB activity middle domain. This family includes the region between the N and C-terminus of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). Although the C- and N- terminal domains of these proteins have been well characterized, no functional role has been assigned to the central region yet. 160
55577 403938 pfam12886 TORC_C Transducer of regulated CREB activity, C-terminus. This family includes the C terminal region of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). The C-terminus region is negatively charged, resembling the transcription activation domains. When this domain, from all three human TORC proteins, was expressed as fusion proteins with the DNA-binding domain of GAL4 (GAL4-BD), and tested for induction of a minimal promoter linked to GAL4-binding sites (UAS-GAL4), UAS-GAL4 was potently induced by GAL4-BD fusions containing the C-terminal portion of all three human TORCs. 75
55578 403939 pfam12887 SICA_alpha SICA extracellular alpha domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. This domain is typically found at the N-terminus, with 1 or 2 copies per protein. The domain is cysteine-rich and similar to pfam12878. 187
55579 403940 pfam12888 Lipid_bd Lipid-binding putative hydrolase. This is a small family of lipid-binding proteins found in Bacteroidetes. 140
55580 403941 pfam12889 DUF3829 Protein of unknown function (DUF3829). This is a small family of proteins from several bacterial species, whose function is not known. It may, however, be related to the GvpL_GvpF family of proteins, pfam06386. 283
55581 315550 pfam12890 DHOase Dihydro-orotase-like. This is a small family of dihydro-orotase-like proteins from various bacteria. 142
55582 403942 pfam12891 Glyco_hydro_44 Glycoside hydrolase family 44. This is a family of bacterial glycoside hydrolases formerly known as cellulase family J, and now known as Cel44A. It is one of the major enzymatic components of the cellulosome of Clostridium thermocellum strain F1 and of many other Firmicutes. 234
55583 403943 pfam12892 FctA Spy0128-like isopeptide containing domain. The FCT and equivalent region genes of Streptococcus pyogenes and other related bacteria encode surface proteins that include fibronectin- and collagen-binding proteins and the serological markers known as T antigens. Some of these proteins give rise to pilus-like appendages. The FctA family is found in many Firmicutes and related bacteria. In S. pyogenes, the pili have a role in bacterial adherence and colonisation of human tissues. Members of this family have a conserved N-terminal lysine and C-terminal asparagine that can form a covalent isopeptide bond. 113
55584 403944 pfam12893 Lumazine_bd_2 Putative lumazine-binding. This is a family of uncharacterized proteins. However, the family belongs to the NTF2-like superfamily of various enzymes, and some of the members of the family are putative dehydrogenases. 116
55585 403945 pfam12894 ANAPC4_WD40 Anaphase-promoting complex subunit 4 WD40 domain. Apc4 contains an N-terminal propeller-shaped WD40 domain. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. 91
55586 403946 pfam12895 ANAPC3 Anaphase-promoting complex, cyclosome, subunit 3. Apc3, otherwise known as Cdc27, is one of the subunits of the anaphase-promoting complex or cyclosome. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. The protein members of this family contain TPR repeats just as those of Apc7 do, and it appears that these TPR units bind the C-termini of the APC co-activators CDH1 and CDC20. 82
55587 403947 pfam12896 ANAPC4 Anaphase-promoting complex, cyclosome, subunit 4. Apc4 is one of the larger of the subunits of the anaphase-promoting complex or cyclosome. This family represents the long domain downstream of the WD40 repeat/s that are present on the Apc4 subunits. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. Results in C. elegans show that the primary essential role of the spindle assembly checkpoint is not in the chromosome segregation process itself but rather in delaying anaphase onset until all chromosomes are properly attached to the spindle. The APC/C is likely to be required for all metaphase-to-anaphase transitions in a multicellular organism. 203
55588 403948 pfam12897 Aminotran_MocR Alanine-glyoxylate amino-transferase. These proteins catalyze the reversible transfer of an amino group from the amino acid substrate to an acceptor alpha-keto acid. They require pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze this reaction. Trans-amination reactions are of central importance in amino acid metabolism and in links to carbohydrate and fat metabolism. This class of aminotransferases acts as dimers in a head-to-tail configuration. 419
55589 403949 pfam12898 Stc1 Stc1 domain. The domain contains 8 conserved cysteines that may bind to zinc. In S. pombe this protein acts as a protein linker which links the chromatin modifying CLRC complex to RNAi by tethering it to the RITS complex. The region is reported as a LIM domain here, but has a slightly different arrangement of its CxxC pairs from the Pfam LIM domain pfam00412, hence why it is not part of that family. The tandem zinc-finger structure could mediate protein-protein interactions. 78
55590 403950 pfam12899 Glyco_hydro_100 Alkaline and neutral invertase. This is a family of bacterial and plant alkaline and neutral invertases, EC:3.2.1.26, previously known as Invertase_neut pfam04853. 429
55591 403951 pfam12900 Pyridox_ox_2 Pyridoxamine 5'-phosphate oxidase. Pyridoxamine 5'-phosphate oxidase is a FMN flavoprotein that catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP). This entry contains several pyridoxamine 5'-phosphate oxidases, and related proteins. 133
55592 403952 pfam12901 SUZ-C SUZ-C motif. The SUZ-C domain is a conserved motif found in one or more copies in several RNA-binding proteins. It is always found at the C-terminus of the protein and appears to be required for localization of the protein to specific subcellular structures. It was first characterized in the C. elegans protein Szy-20 which localizes to the centrosome. It is widely distributed in eukaryotes. 33
55593 403953 pfam12902 Ferritin-like Ferritin-like. This is a family of bacterial ferritin-like substances that also includes a C-terminal domain of VioB, polyketide synthase enzymes, that make up one of the key components of the violacein biosynthesis pathway. Violacein is a purple-coloured, broad-spectrum antibacterial pigment. 222
55594 403954 pfam12903 DUF3830 Protein of unknown function (DUF3830). This is a family of bacterial and archaeal proteins, the structure for one of whose members has been characterized. Structure 3kop probably adopts a new hexameric form compared to previous structures. The putative active site is near the domain interface. 3kop is most closely related, structurally, to Structure 1zx8, where the potential active site is located near residues E51 and Y53 (conserved in 1zx8). Beyond the two residues above, the other residues are not conserved. Also the shape of the active site differs from that of 1zx8. Structure 1zx8 belongs to family DUF369, pfam04126, which is part of the cyclophilin-like clan. 144
55595 403955 pfam12904 Collagen_bind_2 Putative collagen-binding domain of a collagenase. This domain is likely to be the collagen-binding domain of a family of bacterial collagenase enzymes. It is the C-terminal part of the Structure 3kzs (information derived from TOPSAN). 92
55596 403956 pfam12905 Glyco_hydro_101 Endo-alpha-N-acetylgalactosaminidase. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by the S. pneumoniae protein Endo-alpha-N-acetylgalactosaminidase, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor. 273
55597 403957 pfam12906 RINGv RING-variant domain. 47
55598 403958 pfam12907 zf-met2 Zinc-binding. This is a small family of metazoan zinc-binding proteins. 38
55599 403959 pfam12910 PHD_like Antitoxin of toxin-antitoxin, RelE / RelB, TA system. This domain appears to be the N-terminus of the RelB antitoxin of toxin-antitoxin stability system or prevent-host death system. Together RelE toxin and the RelB antitoxin form a non-toxic complex. Although toxin-antitoxin gene cassettes were first found in plasmids, it is clear that these loci are abundant in free-living prokaryotes, including many pathogenic bacteria, and these toxin-antitoxin loci provide a control mechanism that helps free-living prokaryotes cope with nutritional stress. 139
55600 403960 pfam12911 OppC_N N-terminal TM domain of oligopeptide transport permease C. Oligopeptide permeases (Opp) have been identified in numerous gram-negative and -positive bacteria. These transport systems belong to the superfamily of highly conserved ATP-binding cassette transporters. Typically, Opp importers comprise a complex of five proteins. The oligopeptide-binding protein OppA is responsible for the capture of peptides from the external medium. Two integral highly hydrophobic membrane spanning proteins, OppB and OppC, form a channel through the membrane used for peptide translocation. This N-terminal domain appears to be the first TM domain of the molecule. 53
55601 403961 pfam12912 N_NLPC_P60 NLPC_P60 stabilizing domain, N term. This domain, at the N-terminus, appears to be the stabilizing domain for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The next domain is an SH3b1, the third an SH3b2 and the last, the C-terminal region, the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN). 106
55602 403962 pfam12913 SH3_6 SH3 domain (SH3b1 type). This domain appears to be an SH3 domain of the SH3b1-type, and is just C-terminal to an N-terminal domain that is probably the stabilizing domain for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The next domain is an SH3b2 and the last, the C-terminal region, is the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN). 51
55603 403963 pfam12914 SH3_7 SH3 domain of SH3b2 type. This domain appears to be an SH3 domain of the SH3b2-type, and is the second SH3 domain to be found, downstream of an N-terminal domain that is probably the stabilizing domain, for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The last, the C-terminal region, is the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN). 46
55604 403964 pfam12915 DUF3833 Protein of unknown function (DUF3833). This is a family of uncharacterized proteins found in Proteobacteria. 163
55605 315571 pfam12916 DUF3834 Protein of unknown function (DUF3834). This family is likely to be related to solute-binding lipo-proteins. 201
55606 403965 pfam12917 HD_2 HD containing hydrolase-like enzyme. This is a family of bacterial and archaeal hydrolases. 182
55607 403966 pfam12918 TcdB_N TcdB toxin N-terminal helical domain. This is a short helical bundle domain found associated with the catalytic domain of the TcdB toxin from C. difficile. The function of this domain is unknown, but it may be involved in substrate recognition. 66
55608 372382 pfam12919 TcdA_TcdB TcdA/TcdB catalytic glycosyltransferase domain. This domain represents the N-terminal glycosyltransferase from a set of toxins found in some bacteria. This domain in TcdB glycosylates the host RhoA protein. 382
55609 372383 pfam12920 TcdA_TcdB_pore TcdA/TcdB pore forming domain. This family represents the most conserved region within the C. difficile Toxin A and Toxin B pore forming region. 626
55610 372384 pfam12921 ATP13 Mitochondrial ATPase expression. ATP13 is necessary for the expression of subunit 9 of mitochondrial ATPase. The protein has a basic amino terminal signal sequence that is cleaved upon import into mitochondria. 114
55611 403967 pfam12922 Cnd1_N non-SMC mitotic condensation complex subunit 1, N-term. The three non-SMC (structural maintenance of chromosomes) subunits of the mitotic condensation complex are Cnd1-3. The whole complex is essential for viability and the condensing of chromosomes in mitosis. This is the conserved N-terminus of the subunit 1. 164
55612 403968 pfam12923 RRP7 Ribosomal RNA-processing protein 7 (RRP7). RRP7 is an essential protein in yeast that is involved in pre-rRNA processing and ribosome assembly. It is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. 119
55613 403969 pfam12924 APP_Cu_bd Copper-binding of amyloid precursor, CuBD. This short domain, part of the extra-cellular N-terminus of the amyloid precursor protein, APP, can bind both copper and zinc, CuBD. The structure of Cu2+-bound CuBD reveals that the metal ligands are His147, His151, Tyr168 and two water molecules, which are arranged in a square pyramidal geometry. The structure of Cu+-bound CuBD is almost identical to the Cu2+-bound structure except for the loss of one of the water ligands. The geometry of the site is unfavourable for Cu+, thus providing a mechanism by which CuBD could readily transfer Cu ions to other proteins. 56
55614 403970 pfam12925 APP_E2 E2 domain of amyloid precursor protein. The E2 domain is the largest of the conserved domains of the amyloid precursor protein. The structure of E2 consists of two coiled-coil sub-structures connected through a continuous helix, and bears an unexpected resemblance to the spectrin family of protein structures. E2 can reversibly dimerize in solution, and the dimerization occurs along the longest dimension of the molecule in an antiparallel orientation, which enables the N-terminal substructure of one monomer to pack against the C-terminal substructure of a second monomer. The high degree of conservation of residues at the putative dimer interface suggests that the E2 dimer observed in the crystal could be physiologically relevant. Heparin sulfate proteoglycans, the putative ligands for the precursor present in extracellular matrix, bind to E2 at a conserved and positively charged site near the dimer interface. 190
55615 403971 pfam12926 MOZART2 Mitotic-spindle organizing gamma-tubulin ring associated. FAM128A and FAM128B proteins have been re-named MOZART2A and B. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, the MOZART proteins, both 1 and 2, do not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. The exact function of MOZART2 is not clear. 90
55616 403972 pfam12927 DUF3835 Domain of unknown function (DUF3835). This is a C-terminal domain conserved in fungi. 73
55617 403973 pfam12928 tRNA_int_end_N2 tRNA-splicing endonuclease subunit sen54 N-term. This is an N-terminal family of archaeal and metazoan sen54 proteins that forms one of the tRNA-splicing endonuclease subunits. 69
55618 403974 pfam12929 Mid1 Stretch-activated Ca2+-permeable channel component. MID1 is a yeast Saccharomyces cerevisiae gene encoding a plasma membrane protein required for Ca2+ influx induced by the mating pheromone, alpha-factor. Mid1 protein plays a crucial role in supplying Ca2+ during the mating process. Mid1 is composed of 548 amino-acid residues with four hydrophobic regions named H1, H2, H3 and H4, and two cysteine-rich regions (C1 and C2) at the C-terminal. This family contains the H3, H4, C1 and C2 regions. H1 is thought to be a signal sequence responsible for the alpha-factor-induced Mid1 delivery to the plasma membrane. The region from H1 to H3 is required for the localization of Mid1 in the plasma and ER membranes. Trafficking of Mid1-GFP to the plasma membrane is dependent on the N-glycosylation of Mid1 and the transporter protein Sec12. This finding suggests that the trafficking of Mid1-GFP to the plasma membrane requires a Sec12-dependent pathway from the ER to the Golgi, and that Mid1 is recruited via a Sec6- and Sec7-independent pathway from the Golgi to the plasma membrane. 430
55619 403975 pfam12930 DUF3836 Family of unknown function (DUF3836). Family of uncharacterized proteins found in Bacteroidales species. 121
55620 403976 pfam12931 Sec16_C Sec23-binding domain of Sec16. Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure. 279
55621 403977 pfam12932 Sec16 Vesicle coat trafficking protein Sec16 mid-region. Sec16 is a multi-domain vesicle coat protein. This central region is the functional part of the molecules and thus is vital for the family's role in mediating the movement of protein-cargo between the organelles of the secretory pathway. 118
55622 403978 pfam12933 FTO_NTD FTO catalytic domain. This domain is the catalytic AlkB-like domain from the FTO protein. This domain catalyzes a demethylase activity with a preference for 3-methylthymidine. 275
55623 403979 pfam12934 FTO_CTD FTO C-terminal domain. This domain is found at the C-terminus of the FTO protein which was shown to be associated with increased BMI and obesity risk in humans. The N-terminal domain of this protein is a DNA demethylase and this domain is found to associate with the N-terminal domain in the crystal structure. This domain is alpha helical with three helices that form a bundle. 167
55624 315590 pfam12935 Sec16_N Vesicle coat trafficking protein Sec16 N-terminus. Sec16 is a multi-domain vesicle coat protein. The overall function of Sec16 is in mediating the movement of protein-cargo between the organelles of the secretory pathway. Over-expression of truncated mutants of only the N-terminus are lethal, and this portion does not appear to be essential for function so may act as a stabilizing region. 236
55625 403980 pfam12936 Kri1_C KRI1-like family C-terminal. The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. This is the C-terminal domain of the family. 89
55626 403981 pfam12937 F-box-like F-box-like. This is an F-box-like family. 45
55627 372400 pfam12938 M_domain M domain of GW182. 240
55628 403982 pfam12939 DUF3837 Domain of unknown function (DUF3837). A small, compact all-alpha helical domain of unknown function. This domain is currently only found in Clostridiales species. 92
55629 315595 pfam12940 RAG1 Recombination-activation protein 1 (RAG1), recombinase. This family is one of the two different components of the RAG1-RAG2 V(D)J recombinase complex. The RAG complex, consisting of two RAG1 and two RAG2 proteins, is a multi-protein complex that mediates DNA cleavage during V(D)J (variable-diversity-joining) recombination. RAG1 mediates DNA-binding to the conserved recombination signal sequences (RSS). Many of the proteins in this family are fragments. Solution of the structure of the complex of RAG1 and RAG2 shows that each protein dimerizes with itself and each pair then complexes together to form the RAG1-RAG2 V(D)J recombinase enzyme. The different structural elements in RAG1 for UniProtKB:P15919 are: an N-terminal nonamer-binding domain from residues 391-459; a dimerization and DNA-binding domain from 459-515; an extended pre-RNase H domain from 515-588; the catalytic RNase H domain from 588-719; a ZnC2 domain from 719-791; a ZnH2 domain from 791-962; and a three-helix C-terminal domain from 962-1008. 653
55630 289693 pfam12941 HCV_NS5a_C HCV NS5a protein C-terminal region. This is a family of proteins found in the hepatitis C virus. This family contains the C-terminal region of the NS5A protein. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. 242
55631 289694 pfam12942 Archaeal_AmoA Archaeal ammonia monooxygenase subunit A (AmoA). This is an archaeal family that contains ammonia monooxygenase subunit A. Ammonia monooxygenase is an enzyme that oxidizes ammonia to nitrite and nitrate, thus playing a significant role in the nitrogen cycle. Ammonia-oxidising archaea (AOA) are widespread in marine environments. 183
55632 372401 pfam12943 DUF3839 Protein of unknown function (DUF3839). This is a family of uncharacterized proteins that are found in Trichomonas. 242
55633 403983 pfam12944 HAV_VP Hepatitis A virus viral protein VP. This is a family of the viral protein found in hepatitis A viruses. HAV is unique among picornaviruses in targeting the liver. 169
55634 403984 pfam12945 YcgR_2 Flagellar protein YcgR. This domain is found N terminal to pfam07238. Proteins which contain YcgR domains are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias. This family contains members not captured by pfam07317. 85
55635 403985 pfam12946 EGF_MSP1_1 MSP1 EGF domain 1. This EGF-like domain is found at the C-terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite. 37
55636 403986 pfam12947 EGF_3 EGF domain. This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein. 36
55637 403987 pfam12948 MSP7_C MSP7-like protein C-terminal domain. MSP7 is a protein family from the malaria parasite that has been found to be associated with processed fragments from the MSP1 protein in a complex involved in red blood cell invasion. 125
55638 403988 pfam12949 HeH HeH/LEM domain. This is a HeH domain. HeH domains form helix-extended loop-helix (HeH) structures. This domain is closely related to pfam03020 and pfam02037. 35
55639 403989 pfam12950 TaqI_C TaqI-like C-terminal specificity domain. This domain is found at the C-terminus of the TaqI protein and is involved in DNA-binding and substrate recognition. 119
55640 403990 pfam12951 PATR Passenger-associated-transport-repeat. This Autotransporter-associated beta strand repeat model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. These repeats are likely to have a beta-helical structure. This repeat plays a role in the efficient transport of autotransporter virulence factors to the bacterial surface during growth and infection. The repeat is always associated with the passenger domain of the autotransporter. For these reasons it has been coined the Passenger-associated Transport Repeat (PATR). The mechanism by which the PATR motif promotes transport is uncertain but it is likely that the conserved glycines (see HMM Logo) are required for flexibility of folding and that this folding drives secretion. Autotransporters that contain PATR(s) associate with distinct virulence traits such as subtilisin (S8) type protease domains and polymorphic outer-membrane protein repeats, whilst SPATE (S6) type protease and lipase-like autotransporters do not tend to contain PATR motifs. 28
55641 403991 pfam12952 DUF3841 Domain of unknown function (DUF3841). This presumed domain is around 190 amino acids in length. As yet no function has been given to any member of the family. 178
55642 403992 pfam12953 DUF3842 Domain of unknown function (DUF3842). This short protein is found mainly in firmicute bacteria. It is functionally uncharacterized. 130
55643 403993 pfam12954 DUF3843 Protein of unknown function (DUF3843). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 409
55644 403994 pfam12955 DUF3844 Domain of unknown function (DUF3844). This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins that are thought to be found in the endoplasmic reticulum. 104
55645 403995 pfam12956 DUF3845 Domain of Unknown Function with PDB structure. Member Structure 3GF6 has statistically significant similarity to the TNF-like jelly roll fold, which may indicate an immunomodulatory function or a bioadhesion role 221
55646 403996 pfam12957 DUF3846 Domain of unknown function (DUF3846). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is found associated with a pfam07275-like domain. This suggests that this family may also be involved in evading host restriction. 92
55647 403997 pfam12958 DUF3847 Protein of unknown function (DUF3847). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 81
55648 403998 pfam12959 DUF3848 Protein of unknown function (DUF3848). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3849. 93
55649 403999 pfam12960 DUF3849 Protein of unknown function (DUF3849). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3848. 124
55650 404000 pfam12961 DUF3850 Domain of Unknown Function with PDB structure (DUF3850). The search results from NCBI sequence alignment indicate a conserved domain belonging to ASCH superfamily. Dali searching results show that the protein is structurally similar to the PUA domain, suggesting it may be involved in RNA recognition. It has been reported that the deletion of PUA genes results in impaired growth (RluD) and competitive disadvantage (TruB) in Escherichia coli. Suggestions have been put forward that, apart from their usual catalytic role, certain PUS enzymes (e.g. TruB) may also act as chaperones for RNA folding. The interface interaction indicates that the biomolecule of protein NP_809782.1 should be a dimer. 77
55651 289714 pfam12962 DUF3851 Protein of unknown function (DUF3851). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 126
55652 404001 pfam12963 DUF3852 Protein of unknown function (DUF3852). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3848. 107
55653 372411 pfam12964 DUF3853 Protein of unknown function (DUF3853). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 96
55654 404002 pfam12965 DUF3854 Domain of unknown function (DUF3854). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is likely to be related to the Toprim domain. 124
55655 404003 pfam12966 AtpR N-ATPase, AtpR subunit. Membrane protein with three predicted transmembrane segments, two of which contain conserved Arg residues. AtpR genes are found in the N-ATPase (archaeal-type F1-Fo-ATPase) operons and are predicted to interact with the conserved Glu/Asp residues in the c subunits, regulating the assembly and/or function of the membrane-embedded ring of 'c' (proteolipid) subunits (pfam00137). 86
55656 404004 pfam12967 DUF3855 Domain of Unknown Function with PDB structure (DUF3855). Family based on orphan protein (TM0875) from Thermotoga maritima that has been structurally determined as Structure 1O22. The TM0875 gene of Thermotoga maritima encodes a hypothetical protein NP_228683 of unknown function. Analysis of TM0875 genomic context reveals the presence of MMT1 (a predicted Co/Zn/Cd cation transporter) and an inactive homolog of metal-dependent proteases. 1O22 shows weak structural similarity with the phosphoribosylformylglycinamidine synthase 1t4a (Dali Z-scr=4.6), the yggU protein (PDB structure:1n91; with DALI Z-scr=3), and with the thioesterase superfamily member (PDB structure 2cy9 - found using FATCAT), even though they have very low sequence identity. 157
55657 404005 pfam12968 DUF3856 Domain of Unknown Function (DUF3856). TPR-like protein. The 2hr2 structure belongs to the SCOP all alpha class, TPR-like superfamily, CT2138-like family. A DALI search gives hits with the putative peptidyl-prolyl isomerase 2fbn (Z=16), the SGTA protein (Z=16), the PLCR protein 2qfc (Z=16), a putative FK506-binding protein (Structure 1qz2-A; DALI Z-score 15.3; RMSD 2.9; 16% sequence identity within 132 superimposed residues), and with the tetratricopeptide repeats of the protein phosphatase 5 (Structure 2bug; DALI Z-score 15.1; RMSD 2.5; 19% sequence identity within 117 superimposed residues). 142
55658 404006 pfam12969 DUF3857 Domain of Unknown Function with PDB structure (DUF3857). This family is based on the first domain of the PDB structure 3KD4(residues 1-228). It is structurally similar to domains in other hydrolases, eg. M1 family aminopeptidase (3ebi, Z=10, rmsd 3.6A for 152 CA, seq id 12%), despite lack of any significant sequence similarity. 131
55659 404007 pfam12970 DUF3858 Domain of Unknown Function with PDB structure (DUF3858). This family is based on the third domain of the PDB structure 3KD4 (residues 410-525). It is structurally similar to part of neuropilin-2 (Z=4.6, rmsd 3.6A for 83 CA, 7% seq id). This domain and the second domain appear to be part of peptide-n-glycanase (1x3w, 2g9f). 116
55660 404008 pfam12971 NAGLU_N Alpha-N-acetylglucosaminidase (NAGLU) N-terminal domain. Alpha-N-acetylglucosaminidase, a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This N-terminal domain has an alpha-beta fold. 81
55661 404009 pfam12972 NAGLU_C Alpha-N-acetylglucosaminidase (NAGLU) C-terminal domain. Alpha-N-acetylglucosaminidase, a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This C-terminal domain has an all alpha helical fold. 258
55662 404010 pfam12973 Cupin_7 ChrR Cupin-like domain. Members of this family are part of the cupin superfamily. This family includes the transcriptional activator ChrR. 91
55663 404011 pfam12974 Phosphonate-bd ABC transporter, phosphonate, periplasmic substrate-binding protein. This is a family of periplasmic proteins which are part of the transport system for alkylphosphonate uptake. 241
55664 404012 pfam12975 DUF3859 Domain of unknown function (DUF3859). This short domain is functionally uncharacterized. 127
55665 315622 pfam12976 DUF3860 Domain of Unknown Function with PDB structure (DUF3860). A protein family created to cover Structure 2OD5. 2OD5 is a hypothetical protein (JCVI_PEP_1096688149193) from an environmental metagenome (unidentified marine microbe). 92
55666 404013 pfam12977 DUF3861 Domain of Unknown Function with PDB structure (DUF3861). The 3cjl structure is likely a representative of a new fold with some resemblance to 3-helical bundle folds such as the serum albumin-like fold of SCOP. No significant hits reported by a Dali search. This protein is the first structural representative of a small (about 60 proteins) family of proteins that are found among proteo- and enterobacteria (REF http://www.topsan.org/Proteins/JCSG/3CJL). 88
55667 289729 pfam12978 DUF3862 Domain of Unknown Function with PDB structure (DUF3862). Structure 3D4E shared structural similarity to beta-lactamase inhibitory proteins (BLIP) which already include 1XXM, 1S0W, 1JTG, 2G2U, 2G2W, 2B5R, and 3due. All of structures are involved in beta-lactamase inhibitor complex. (REF http://www.topsan.org/Proteins/JCSG/3d4e) 159
55668 372415 pfam12979 DUF3863 Domain of Unknown Function with PDB structure (DUF3863). Domain based on 1-364 domain of Structure 3LM3 which is encoded by the BDI_3119 gene from Parabacteroides distasonis atcc 8503. 349
55669 289731 pfam12980 DUF3864 Domain of Unknown Function with PDB structure (DUF3864). Domain based on 366-449 domain of Structure 3LM3 which is encoded by the BDI_3119 gene from Parabacteroides distasonis atcc 8503. 80
55670 289732 pfam12981 DUF3865 Domain of Unknown Function with PDB structure (DUF3865). Family based on Structure 3B5P, encoded by ZP_00108531 from the nitrogen-fixing cyanobacterium Nostoc punctiforme pcc 73102, which is a CADD-like protein of unknown function. Superposition between protein structures encoded by CT610 from Chlamydia trachomatis (Structure 1rwc), pyrroloquinolinquinone synthase C (PqqC, Structure 1otv) and ZP_00108531 revealed that putative active sites in CT610 and ZP_00108531 are identical. 224
55671 404014 pfam12982 DUF3866 Protein of unknown function (DUF3866). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 352 and 374 amino acids in length. 317
55672 404015 pfam12983 DUF3867 Protein of unknown function (DUF3867). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 190 amino acids in length. 185
55673 404016 pfam12984 DUF3868 Domain of unknown function, B. Theta Gene description (DUF3868). Based on Bacteroides thetaiotaomicron gene BT_1065, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 102
55674 404017 pfam12985 DUF3869 Domain of unknown function (DUF3869). A family based on the N-terminal domain of 3KOG, which shows weak but consistent remote homology with adhesive families such as immunoglobulins and cadherins, suggesting it might form an attachment module. 97
55675 404018 pfam12986 DUF3870 Domain of unknown function (DUF3870). A family based on the C-terminal domain of 3KOG which shows structural similarity to pore-forming proteins, suggesting it may have a lytic function. 94
55676 404019 pfam12987 DUF3871 Domain of unknown function, B. Theta Gene description (DUF3871). Based on Bacteroides thetaiotaomicron gene BT_2984, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 318
55677 404020 pfam12988 DUF3872 Domain of unknown function, B. Theta Gene description (DUF3872). Based on Bacteroides thetaiotaomicron gene BT_2593, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 113
55678 372420 pfam12989 DUF3873 Domain of unknown function, B. Theta Gene description (DUF3873). Based on Bacteroides thetaiotaomicron gene BT_2286, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 68
55679 404021 pfam12990 DUF3874 Domain of unknown function from B. Theta Gene description (DUF3874). Based on Bacteroides thetaiotaomicron gene BT_4228, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 71
55680 404022 pfam12991 DUF3875 Domain of unknown function, B. Theta Gene description (DUF3875). Based on Bacteroides thetaiotaomicron gene BT_4769, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 50
55681 404023 pfam12992 DUF3876 Domain of unknown function, B. Theta Gene description (DUF3876). Based on Bacteroides thetaiotaomicron gene BT_0092, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture. 91
55682 404024 pfam12993 DUF3877 Domain of unknown function, E. rectale Gene description (DUF3877). Based on Eubacterium rectale gene EUBREC_0237. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture. 173
55683 404025 pfam12994 DUF3878 Domain of unknown function, E. rectale Gene description (DUF3878). Based on Eubacterium rectale gene EUBREC_0973. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture. 300
55684 289746 pfam12995 DUF3879 Domain of unknown function, E. rectale Gene description (DUF3879). Based on Eubacterium rectale gene EUBREC_1343. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture. 179
55685 404026 pfam12996 DUF3880 DUF based on E. rectale Gene description (DUF3880). Based on Eubacterium rectale gene EUBREC_3218. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture. 78
55686 404027 pfam12997 DUF3881 Domain of unknown function, E. rectale Gene description (DUF3881). Based on Eubacterium rectale gene EUBREC_3695. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture. 283
55687 404028 pfam12998 ING Inhibitor of growth proteins N-terminal histone-binding. Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified. INGs carry a well-characterized C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail. 100
55688 372423 pfam12999 PRKCSH-like Glucosidase II beta subunit-like. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. 176
55689 372424 pfam13000 Acatn Acetyl-coenzyme A transporter 1. The mouse Acatn is a 61 kDa hydrophobic protein with six to 10 transmembrane domains. It appears to promote 9-O-acetylation in gangliosides. 543
55690 404029 pfam13001 Ecm29 Proteasome stabilizer. The proteasome consists of two subunits, and the capacity of the proteasome to degrade protein depends crucially on the interaction between these two subunits. This interaction is affected by a wide range of factors including metabolites, such as ATP, and proteasome-associated proteins such as Ecm29. Ecm29 stabilizes the interaction between the two subunits. 494
55691 404030 pfam13002 LDB19 Arrestin_N terminal like. This is a family of proteins related to the Arrestin_N terminal family. 183
55692 404031 pfam13004 BACON Putative binding domain, N-terminal. The BACON (Bacteroidetes-Associated Carbohydrate-binding Often N-terminal) domain is an all-beta domain found in diverse architectures, principally in combination with carbohydrate-active enzymes and proteases. These architectures suggest a carbohydrate-binding function which is also supported by the nature of BACON's few conserved amino-acids. The phyletic distribution of BACON and other data tentatively suggest that it may frequently function to bind mucin. Further work with the characterized structure of a member of glycoside hydrolase family 5 enzyme, Structure 3ZMR, has found no evidence for carbohydrate-binding for this domain. 61
55693 404032 pfam13005 zf-IS66 zinc-finger binding domain of transposase IS66. This is a zinc-finger region of the N-terminus of the insertion element IS66 transposase. 46
55694 404033 pfam13006 Nterm_IS4 Insertion element 4 transposase N-terminal. This family represents the N-terminal region of proteins carrying the transposase enzyme, DDE_Tnp_1 (that was Transposase_11), pfam01609, at the C-terminus. The full-length members are Insertion Element 4, IS4. Within the collection of E.coli strains, ECOR, the number of IS4 elements varies from zero to 14, with an average of 5 copies/strain. 95
55695 404034 pfam13007 LZ_Tnp_IS66 Transposase C of IS166 homeodomain. This is a leucine-zipper-like or homeodomain-like region of transposase TnpC of insertion element IS66. 68
55696 372427 pfam13008 zf-Paramyx-P Zinc-binding domain of Paramyxoviridae V protein. The Paramyxoviridae, which include such respiroviruses as para-influenzae and measles, produce phosphoproteins - protein P - that are integral to the polymerase transcription-replication complex. Protein P consists of two functionally distinct moieties, an N-terminal PNT, and a C-terminal PCT. The P gene region transcribes proteins from all three ORFs, and the V protein consists of the PNT moiety and a more C-terminal 2-zinc-binding domain. This conserved region consists of the two-zinc-binding section sandwiched between beta sheets 6 and 7 of the overall V protein. It is the binding of this core domain of V protein with the DDB1 protein (part of the ubiquitin-ligase complex) of eukaryotes which represents the key element of the virus-host protein interaction. In the Henipavirus family which includes Nipah and Hendra viruses, the V protein is able to block IFN (interferon) signalling by preventing IFN-induced STAT phosphorylation and nuclear translocation. The P gene of morbillivirus is co-transcriptionally edited leading to a V protein being produced. 45
55697 404035 pfam13009 Phage_Integr_2 Putative phage integrase. This family is found in association with IS elements. 323
55698 289758 pfam13010 pRN1_helical Primase helical domain. This alpha helical domain is found in a set of bacterial plasmid replication proteins. The domain is found to the C-terminus of the primase/polymerase domain. Mutants of this domain are defective in template binding, dinucleotide formation and conformation change prior to DNA extension. 138
55699 289759 pfam13011 LZ_Tnp_IS481 leucine-zipper of insertion element IS481. This is the upstream region of the conjoined ORF AB of insertion element 481. The significance of IS481 in the detection of Bordetella pertussis is discussed in. The B portion of the ORF AB carries the transposase activity in family rve, pfam00665. 85
55700 404036 pfam13012 MitMem_reg Maintenance of mitochondrial structure and function. This is C-terminal to the Mov24 region of the yeast proteasomal subunit Rpn11 and seems likely to regulate the mitochondrial fission and tubulation processes, ie the outer mitochondrial membrane proteins. This function appears to be unrelated to the proteasome activity of the N-terminal region. 72
55701 404037 pfam13013 F-box-like_2 F-box-like domain. The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression. 107
55702 404038 pfam13015 PRKCSH_1 Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. The beta-subunit confers substrate specificity for di- and monoglucosylated glycans on the glucose-trimming activity of the alpha-subunit. 154
55703 404039 pfam13016 Gliadin Cys-rich Gliadin N-terminal. This is a cysteine-rich N-terminal region of gliadin and avenin plant proteins. The exact function is not known. 75
55704 404040 pfam13017 Maelstrom piRNA pathway germ-plasm component. Maelstrom is a germ-plasm component protein, that is shown to be functionally involved in the piRNA pathway. It is conserved throughout Eukaryota, though it appears to have been lost from all examined teleost fish species. The domain architecture shows that it is coupled with several DNA- and RNA- related domains such as HMG box, SR-25-like and HDAC_interact domains. Sequence analysis and fold recognition have found a distant similarity between Maelstrom domain and the DnaQ 3'-5' exonuclease family with the RNase H fold (Exonuc_X-T, pfam00929); notably, that the Maelstrom domains from basal eukaryotes contain the conserved 3'-5' exonuclease active site residues (Asp-Glu-Asp-His-Asp, DEDHD). However, the animal and some amoeba maelstrom contain another set of conserved residues (Glu-His-His-Cys-His-Cys, EHHCHC). This evolutionary link together with structural examinations leads to the hypothesis that Maelstrom domains may have a potential nuclease-transposase activity or RNA-binding ability that may be implicated in piRNA biogenesis. A protein function evolution mode, namely "active site switch", has been proposed, in which the amoeba Maelstrom domains are the possible evolutionary intermediates due to their harbouring of the specific characteristics of both 3'-5' exonuclease and Maelstrom domains. 212
55705 404041 pfam13018 ESPR Extended Signal Peptide of Type V secretion system. This conserved domain is called ESPR for Extended Signal Peptide Region. It is present at the N-terminus of the signal peptides of proteins belonging to the Type V secretion systems, including the autotransporters (T5aSS), TpsA exoproteins of the two-partner system (T5bSS) and trimeric autotransporters (TAAs). So far, the ESPR is present only in Gram-negative bacterial proteins originating from the classes Beta- and Gamma-proteobacteria. ESPR severely impairs inner membrane translocation, suggesting that it adopts a particular conformation or it interacts with a cytoplasmic or inner membrane co-factor, prior to exportation. Deletion of ESPR causes mis-folding of the TAAs passenger domain in the periplasm, substantially impairing its translocation across the outer membrane. 24
55706 372432 pfam13019 Telomere_Sde2 Telomere stability and silencing. Sde2 has been identified in fission yeast as an important factor in telomere formation and maintenance. This is a more N-terminal domain on these nuclear proteins, and is essential for telomeric silencing and genomic stability. 165
55707 404042 pfam13020 DUF3883 Domain of unknown function (DUF3883). This domain is uncharacterized. It is found on restriction endonucleases. 91
55708 404043 pfam13021 DUF3885 Domain of unknown function (DUF3885). A putative Rac prophage DNA binding protein. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved YDDRG sequence motif. There is a single completely conserved residue D that may be functionally important. 38
55709 404044 pfam13022 HTH_Tnp_1_2 Helix-turn-helix of insertion element transposase. This is a family of largely phage proteins that are likely to be helix-turn-helix insertion elements. 122
55710 404045 pfam13023 HD_3 HD domain. HD domains are metal dependent phosphohydrolases. 163
55711 338588 pfam13024 DUF3884 Protein of unknown function (DUF3884). This family of proteins is functionally uncharacterized. However several proteins are annotated as Tagatose 1,6-diphosphate aldolase, but evidence to support this could not be found. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 106 amino acids in length. There are two completely conserved residues (Y and F) that may be functionally important. 73
55712 315655 pfam13025 DUF3886 Protein of unknown function (DUF3886). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two completely conserved L residues that may be functionally important. 68
55713 404046 pfam13026 DUF3887 Protein of unknown function (DUF3887). This domain family is found in bacteria and archaea, and is approximately 90 amino acids in length. 91
55714 404047 pfam13027 DUF3888 Protein of unknown function (DUF3888). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 111 and 149 amino acids in length. 87
55715 404048 pfam13028 DUF3889 Protein of unknown function (DUF3889). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two completely conserved residues (A and Y) that may be functionally important. 84
55716 289775 pfam13029 DUF3890 Domain of unknown function (DUF3890). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 70 amino acids in length. 84
55717 404049 pfam13030 DUF3891 Protein of unknown function (DUF3891). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 250 amino acids in length. 215
55718 404050 pfam13031 DUF3892 Protein of unknown function (DUF3892). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 87 and 104 amino acids in length. 70
55719 404051 pfam13032 DUF3893 Domain of unknown function (DUF3893). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 123 and 144 amino acids in length. There is a single completely conserved residue E that may be functionally important. 288
55720 372437 pfam13033 DUF3894 Protein of unknown function (DUF3894). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 66 and 79 amino acids in length. There are two conserved sequence motifs: FNIC and MALLNLT. 54
55721 372438 pfam13034 DUF3895 Protein of unknown function (DUF3895). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two completely conserved residues (Y and L) that may be functionally important. 76
55722 315663 pfam13035 DUF3896 Protein of unknown function (DUF3896). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 61
55723 404052 pfam13036 LpoB Peptidoglycan-synthase activator LpoB. This is a family of Gram-negative bacterial outer membrane lipoproteins. LpoB is required for the function of the major peptidoglycan synthase enzyme PBP1B. It interacts with PBP1B protein via the UvrB-like non-catalytic domain on that protein. LpoB has a 54-aa-long flexible N-terminal stretch followed by a globular domain with similarity to the N-terminal domain of the prevalent periplasmic protein TolB. The long, flexible N-terminal region of LpoB enables it to span the periplasm and reach its docking site in PBP1B. Peptidoglycan is the essential polymer within the sacculus that surrounds the cytoplasmic membrane of bacteria. 147
55724 404053 pfam13037 DUF3898 Domain of unknown function (DUF3898). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. There are two conserved sequence motifs: DFG and FEKG. 89
55725 404054 pfam13038 DUF3899 Domain of unknown function (DUF3899). Putative Tryptophanyl-tRNA synthetase. 83
55726 372442 pfam13039 DUF3900 Protein of unknown function (DUF3900). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 360 amino acids in length. 249
55727 379027 pfam13040 Fur_reg_FbpB Fur-regulated basic protein B. This family of proteins is regulated by the ferric uptake regulator protein Fur. This family represses expression of the lutABC operon encoding iron sulfur-containing enzymes necessary for growth on lactate. 39
55728 404055 pfam13041 PPR_2 PPR repeat family. This repeat has no known function. It is about 35 amino acids long and is found in up to 18 copies in some proteins. The family appears to be greatly expanded in plants and fungi. The repeat has been called PPR. 50
55729 289787 pfam13042 DUF3902 Protein of unknown function (DUF3902). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length. There is a conserved LGI sequence motif. 161
55730 404056 pfam13043 DUF3903 Domain of unknown function (DUF3903). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length. 40
55731 289789 pfam13044 Fusion_F0 Fusion glycoprotein F0, Isavirus. Fusion between viral and cellular membranes is mediated by viral membrane fusion glycoproteins. This entry represents fusion glycoprotein F0 from the infectious salmon anemia virus (ISAV). The precursor protein F0 is proteolytically cleaved to F1 and F2, which are held together by disulphide bridges. 436
55732 315671 pfam13045 DUF3905 Protein of unknown function (DUF3905). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 84
55733 289791 pfam13046 DUF3906 Protein of unknown function (DUF3906). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved EKK sequence motif. 64
55734 379028 pfam13047 DUF3907 Protein of unknown function (DUF3907). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. There is a conserved AYTG sequence motif. 146
55735 315673 pfam13048 DUF3908 Protein of unknown function (DUF3908). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 140 amino acids in length. There is a single completely conserved residue Y that may be functionally important. 134
55736 404057 pfam13049 DUF3910 Protein of unknown function (DUF3910). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. 93
55737 404058 pfam13050 DUF3911 Protein of unknown function (DUF3911). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 77
55738 404059 pfam13051 DUF3912 Protein of unknown function (DUF3912). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 92
55739 315677 pfam13052 DUF3913 Protein of unknown function (DUF3913). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 57
55740 372448 pfam13053 DUF3914 Protein of unknown function (DUF3914). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two conserved sequence motifs: KFDIR and DLW. 89
55741 315678 pfam13054 DUF3915 Protein of unknown function (DUF3915). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 126
55742 404060 pfam13055 DUF3917 Protein of unknown function (DUF3917). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 71
55743 289801 pfam13056 DUF3918 Protein of unknown function (DUF3918). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two completely conserved residues (G and R) that may be functionally important. 43
55744 404061 pfam13057 DUF3919 Protein of unknown function (DUF3919). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 251 and 262 amino acids in length. There is a conserved YLNG sequence motif. 227
55745 372451 pfam13058 DUF3920 Protein of unknown function (DUF3920). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. 126
55746 404062 pfam13059 DUF3922 Protein of unknown function (DUF3922). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 87 and 98 amino acids in length. 79
55747 289805 pfam13060 DUF3921 Protein of unknown function (DUF3921). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 58
55748 404063 pfam13061 DUF3923 Protein of unknown function (DUF3923). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 65
55749 289807 pfam13062 DUF3924 Protein of unknown function (DUF3924). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 62
55750 372453 pfam13063 DUF3925 Protein of unknown function (DUF3925). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. 65
55751 372454 pfam13064 DUF3927 Protein of unknown function (DUF3927). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 50 amino acids in length. There is a conserved SVL sequence motif. There is a single completely conserved residue D that may be functionally important. 53
55752 372455 pfam13065 DUF3928 Protein of unknown function (DUF3928). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. 95
55753 372456 pfam13066 DUF3929 Protein of unknown function (DUF3929). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. 64
55754 404064 pfam13067 DUF3930 Protein of unknown function (DUF3930). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 51 and 67 amino acids in length. 51
55755 372458 pfam13068 DUF3932 Protein of unknown function (DUF3932). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 81
55756 404065 pfam13069 DUF3933 Protein of unknown function (DUF3933). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 53
55757 289815 pfam13070 DUF3934 Protein of unknown function (DUF3934). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two conserved sequence motifs: GTG and SKG. 40
55758 372460 pfam13071 DUF3935 Protein of unknown function (DUF3935). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved sequence motifs: FVF and LGV. 70
55759 289817 pfam13072 DUF3936 Protein of unknown function (DUF3936). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved GKAW sequence motif. There is a single completely conserved residue G that may be functionally important. 37
55760 372461 pfam13073 DUF3937 Protein of unknown function (DUF3937). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 72
55761 372462 pfam13074 DUF3938 Protein of unknown function (DUF3938). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. 98
55762 289820 pfam13075 DUF3939 Protein of unknown function (DUF3939). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. 133
55763 315692 pfam13076 Fur_reg_FbpA Fur-regulated basic protein A. This family of proteins is regulated by the ferric uptake regulator protein Fur. This family does not regulate the lutABC operon encoding iron sulfur-containing enzymes necessary for growth on lactate. 36
55764 404066 pfam13077 DUF3909 Protein of unknown function (DUF3909). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 108
55765 289823 pfam13078 DUF3942 Protein of unknown function (DUF3942). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. 137
55766 404067 pfam13079 DUF3916 Protein of unknown function (DUF3916). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length. There is a single completely conserved residue S that may be functionally important. 147
55767 404068 pfam13080 DUF3926 Protein of unknown function (DUF3926). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 46 and 63 amino acids in length. There is a single completely conserved residue P that may be functionally important. 43
55768 289826 pfam13081 DUF3941 Domain of unknown function (DUF3941). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 30 amino acids in length. There is a conserved YSK sequence motif. 24
55769 289827 pfam13082 DUF3931 Protein of unknown function (DUF3931). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 66
55770 404069 pfam13083 KH_4 KH domain. 73
55771 404070 pfam13084 DUF3943 Domain of unknown function (DUF3943). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length. 108
55772 404071 pfam13085 Fer2_3 2Fe-2S iron-sulfur cluster binding domain. The 2Fe-2S ferredoxin family has a general core structure consisting of beta(2)-alpha-beta(2), which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. 106
55773 404072 pfam13086 AAA_11 AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. 248
55774 404073 pfam13087 AAA_12 AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. 196
55775 404074 pfam13088 BNR_2 BNR repeat-like domain. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases. 280
55776 404075 pfam13089 PP_kinase_N Polyphosphate kinase N-terminal domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. 106
55777 404076 pfam13090 PP_kinase_C Polyphosphate kinase C-terminal domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. This C-terminal domain has a structure similar to phospholipase D. 172
55778 404077 pfam13091 PLDc_2 PLD-like domain. 132
55779 404078 pfam13092 CENP-L Kinetochore complex Sim4 subunit Fta1. CENP-L is one of the components that assembles onto the CENP-A-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta1 is the equivalent component of the fission yeast Sim4 complex. 158
55780 404079 pfam13093 FTA4 Kinetochore complex Fta4 of Sim4 subunit, or CENP-50. Fission yeast has three kinetochore protein complexes. Two complexes, Sim4 and Ndc80-MIND-Spc7 (NMS), are constitutive components, whereas the third complex, DASH, is transiently associated with kinetochores only in mitosis and is required for precise chromosome segregation. The Sim4 complex functions as a loading dock for the DASH complex. Sim4 consists of a number of different proteins including Ftas 1-7 and Dad1. 199
55781 404080 pfam13094 CENP-Q CENP-Q, a CENPA-CAD centromere complex subunit. CENP-Q is one of the components that assembles onto the CENPA-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENPA nucleosomes directly recruit a proximal CENPA-nucleosome-associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENPA NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENPA-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta7 is the equivalent component of the fission yeast Sim4 complex. 158
55782 404081 pfam13095 FTA2 Kinetochore Sim4 complex subunit FTA2. Fission yeast has three kinetochore protein complexes. Two complexes, Sim4 and Ndc80-MIND-Spc7 (NMS), are constitutive components, whereas the third complex, DASH, is transiently associated with kinetochores only in mitosis and is required for precise chromosome segregation. The Sim4 complex functions as a loading dock for the DASH complex. Sim4 consists of a number of different proteins including Ftas 1-7 and Dad1. The equivalent higher eukaryotic protein is CENP-P. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. 204
55783 289841 pfam13096 CENP-P CENP-A-nucleosome distal (CAD) centromere subunit, CENP-P. CENP-P is one of the components that assembles onto the CENP-A-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta7 is the equivalent component of the fission yeast Sim4 complex. 177
55784 404082 pfam13097 CENP-U CENP-A nucleosome associated complex (NAC) subunit. CENP-U is one of the components that assembles onto the CENP-A-nucleosome associated complex (NAC). The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. FTA4 is the equivalent component of the fission yeast Sim4 complex. 175
55785 379034 pfam13098 Thioredoxin_2 Thioredoxin-like domain. 103
55786 289844 pfam13099 DUF3944 Domain of unknown function (DUF3944). This short domain is sometimes found N terminal to pfam03981. 35
55787 404083 pfam13100 OstA_2 OstA-like protein. This is a family of OstA-like proteins that are related to pfam03968. 158
55788 404084 pfam13101 DUF3945 Protein of unknown function (DUF3945). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This is a C-terminal repeated region. 59
55789 404085 pfam13102 Phage_int_SAM_5 Phage integrase SAM-like domain. A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This family appears related to the N-terminal domain of phage integrases. 99
55790 404086 pfam13103 TonB_2 TonB C terminal. This family contains TonB members that are not captured by pfam03544. 85
55791 289849 pfam13104 DUF3956 Protein of unknown function (DUF3956). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. 45
55792 404087 pfam13105 DUF3959 Protein of unknown function (DUF3959). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 260 amino acids in length. 241
55793 372478 pfam13106 DUF3961 Domain of unknown function (DUF3961). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 40 amino acids in length. 39
55794 289852 pfam13107 DUF3964 Protein of unknown function (DUF3964). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two conserved sequence motifs: FYF and AFW. 109
55795 404088 pfam13108 DUF3969 Protein of unknown function (DUF3969). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 102
55796 404089 pfam13109 AsmA_1 AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli. 213
55797 404090 pfam13110 DUF3966 Protein of unknown function (DUF3966). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 86 amino acids in length. 42
55798 404091 pfam13111 DUF3962 Protein of unknown function (DUF3962). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 233 and 796 amino acids in length. There is a conserved FSY sequence motif. 397
55799 289857 pfam13112 DUF3965 Protein of unknown function (DUF3965). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 380 amino acids in length. 291
55800 372483 pfam13113 DUF3970 Protein of unknown function (DUF3970). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved NPKY sequence motif. 55
55801 404092 pfam13114 RecO_N_2 RecO N terminal. This entry contains members that are not captured by pfam11967. 71
55802 404093 pfam13115 YtkA YtkA-like. 86
55803 404094 pfam13116 DUF3971 Protein of unknown function. Some members of this family are related to the AsmA family proteins. 288
55804 404095 pfam13117 Cag12 Cag pathogenicity island protein Cag12. This is a Proteobacterial family of Cag pathogenicity island proteins. 92
55805 372487 pfam13118 DUF3972 Protein of unknown function (DUF3972). This is a Proteobacterial family of unknown function. Some of the proteins in this family are annotated as being kinesin-like proteins. 125
55806 404096 pfam13119 DUF3973 Domain of unknown function (DUF3973). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved YCI sequence motif. 40
55807 372489 pfam13120 DUF3974 Domain of unknown function (DUF3974). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length. 126
55808 404097 pfam13121 DUF3976 Domain of unknown function (DUF3976). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length. 40
55809 289867 pfam13122 DUF3977 Protein of unknown function (DUF3977). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 77
55810 372491 pfam13123 DUF3978 Protein of unknown function (DUF3978). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. 144
55811 289869 pfam13124 DUF3963 Protein of unknown function (DUF3963). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 85 amino acids in length. There is a conserved DIQKW sequence motif. 40
55812 404098 pfam13125 DUF3958 Protein of unknown function (DUF3958). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. There are two conserved sequence motifs: RLF and TWH. 107
55813 315729 pfam13126 DUF3975 Protein of unknown function (DUF3975). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 80
55814 404099 pfam13127 DUF3955 Protein of unknown function (DUF3955). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 68 and 87 amino acids in length. There are two completely conserved residues (G and E) that may be functionally important. 59
55815 315731 pfam13128 DUF3954 Protein of unknown function (DUF3954). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 60 amino acids in length. 49
55816 404100 pfam13129 DUF3953 Protein of unknown function (DUF3953). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 47 and 76 amino acids in length. 40
55817 315733 pfam13130 DUF3952 Domain of unknown function (DUF3952). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length. There is a conserved VMSAS sequence motif. 101
55818 315734 pfam13131 DUF3951 Protein of unknown function (DUF3951). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 56 and 71 amino acids in length. There is a conserved YTP sequence motif. 52
55819 404101 pfam13132 DUF3950 Domain of unknown function (DUF3950). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 30 amino acids in length. There is a conserved NFS sequence motif. 30
55820 315735 pfam13133 DUF3949 Protein of unknown function (DUF3949). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 69 and 87 amino acids in length. 60
55821 315736 pfam13134 DUF3948 Protein of unknown function (DUF3948). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. 35
55822 289880 pfam13135 DUF3947 Protein of unknown function (DUF3947). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 91
55823 372493 pfam13136 DUF3984 Protein of unknown function (DUF3984). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 393 and 442 amino acids in length. 325
55824 289882 pfam13137 DUF3983 Protein of unknown function (DUF3983). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 40 amino acids in length. There is a conserved AWRN sequence motif. 34
55825 404102 pfam13138 DUF3982 Protein of unknown function (DUF3982). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 47 and 73 amino acids in length. There are two conserved sequence motifs: EKL and EIP. 35
55826 372494 pfam13139 DUF3981 Domain of unknown function (DUF3981). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length. 115
55827 289885 pfam13140 DUF3980 Domain of unknown function (DUF3980). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. 87
55828 404103 pfam13141 DUF3979 Protein of unknown function (DUF3979). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 115
55829 372496 pfam13142 DUF3960 Domain of unknown function (DUF3960). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 72 and 89 amino acids in length. 89
55830 404104 pfam13143 DUF3986 Protein of unknown function (DUF3986). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. 87
55831 404105 pfam13144 ChapFlgA Chaperone for flagella basal body P-ring formation. ChapFlgA is a family similar to the SAF family, and includes chaperones for flagellar basal-body proteins and pilus-assembly proteins, FlgA, RcpB and CpaB. ChapFlgA is necessary for the formation of the P-ring of the flagellum, FlgI, which sits in the peptidoglycan layer of the outer membrane of the bacterium. FlgA plays an auxiliary role in P-ring assembly. 122
55832 404106 pfam13145 Rotamase_2 PPIC-type PPIASE domain. 121
55833 404107 pfam13146 TRL TRL-like protein family. This family includes the TRL protein that is found in a locus that includes several tRNAs. The function of this protein is not known. The proteins in this family usually have a lipoprotein attachment site at their N-terminus. 77
55834 404108 pfam13148 DUF3987 Protein of unknown function (DUF3987). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 365
55835 404109 pfam13149 Mfa_like_1 Fimbrillin-like. A family of putative fimbrillin proteins found by clustering human gut metagenomic sequences. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins. 244
55836 404110 pfam13150 DUF3989 Protein of unknown function (DUF3989). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 86
55837 404111 pfam13151 DUF3990 Protein of unknown function (DUF3990). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 151
55838 289896 pfam13152 DUF3967 Protein of unknown function (DUF3967). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 173 and 249 amino acids in length. 35
55839 404112 pfam13153 DUF3985 Protein of unknown function (DUF3985). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. 44
55840 404113 pfam13154 DUF3991 Protein of unknown function (DUF3991). This family of proteins is often associated with family Toprim, pfam01751. 73
55841 404114 pfam13155 Toprim_2 Toprim-like. This is a family of Toprim-like proteins. 87
55842 404115 pfam13156 Mrr_cat_2 Restriction endonuclease. Prokaryotic family found in type II restriction enzymes containing the hallmark (D/E)-(D/E)XK active site. Presence of catalytic residues implicates this region in the enzymatic cleavage of DNA. 127
55843 372502 pfam13157 DUF3992 Protein of unknown function (DUF3992). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 98 and 122 amino acids in length. There is a single completely conserved residue T that may be functionally important. 88
55844 372503 pfam13158 DUF3993 Protein of unknown function (DUF3993). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. 118
55845 372504 pfam13159 DUF3994 Domain of unknown function (DUF3994). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 97 and 111 amino acids in length. 99
55846 404116 pfam13160 DUF3995 Protein of unknown function (DUF3995). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 138 and 149 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important. 124
55847 315755 pfam13161 DUF3996 Protein of unknown function (DUF3996). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 172 and 203 amino acids in length. 154
55848 404117 pfam13162 DUF3997 Protein of unknown function (DUF3997). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. 107
55849 404118 pfam13163 DUF3999 Protein of unknown function (DUF3999). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 440 and 470 amino acids in length. There is a single completely conserved residue D that may be functionally important. 421
55850 404119 pfam13164 Diedel Diedel. Diedel (die) was identified as an insect immune response protein. It is up-regulated after a septic injury and may act as a negative regulator of the JAK/STAT signalling pathway. Its homologs can be found in Drosophila and Acyrtosiphon pisum. Interestingly, the orthologues of the die gene are present in the genome of insect DNA viruses of the Baculoviridae and Ascoviridae families. The viral homologs suppress the immune deficiency (IMD) pathway in Drosophila. 75
55851 404120 pfam13165 SCIFF Six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase. 43
55852 372509 pfam13166 AAA_13 AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. This family includes the PrrC protein that is thought to be the active component of the anticodon nuclease. 712
55853 404121 pfam13167 GTP-bdg_N GTP-binding GTPase N-terminal. This is the N-terminal region of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. This N-terminal region is necessary for stability of the whole protein. 87
55854 289912 pfam13168 Poxvirus_B22R_C Poxvirus B22R protein C-terminal. This is the highly conserved C-terminal region of poxvirus proteins from eg, Fowlpox virus, Myxoma virus, Lumpy skin disease, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses. 195
55855 289913 pfam13169 Poxvirus_B22R_N Poxvirus B22R protein N-terminal. This is the highly conserved N-terminal region of poxvirus proteins from eg, Fowlpox virus, Myxoma virus, Lumpy skin disease, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses. 88
55856 404122 pfam13170 DUF4003 Protein of unknown function (DUF4003). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 327 and 345 amino acids in length. 296
55857 404123 pfam13171 DUF4004 Protein of unknown function (DUF4004). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 210 amino acids in length. 196
55858 404124 pfam13173 AAA_14 AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 129
55859 404125 pfam13174 TPR_6 Tetratricopeptide repeat. 33
55860 404126 pfam13175 AAA_15 AAA ATPase domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 392
55861 372510 pfam13176 TPR_7 Tetratricopeptide repeat. 36
55862 404127 pfam13177 DNA_pol3_delta2 DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required, along with the delta' subunit, for the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalyzed reaction. The delta subunit is also known as HolA. 161
55863 404128 pfam13178 DUF4005 Protein of unknown function (DUF4005). This is a C-terminal region of plant IQ-containing putative calmodulin-binding proteins. 97
55864 404129 pfam13179 DUF4006 Family of unknown function (DUF4006). This is a family of short, approx 65 residue-long, bacterial proteins of unknown function. 63
55865 404130 pfam13180 PDZ_2 PDZ domain. 74
55866 404131 pfam13181 TPR_8 Tetratricopeptide repeat. 33
55867 404132 pfam13182 DUF4007 Protein of unknown function (DUF4007). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 284 and 326 amino acids in length. This domain is found associated with pfam01507 in some proteins, suggesting a functional link. 287
55868 404133 pfam13183 Fer4_8 4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters. 65
55869 404134 pfam13184 KH_5 NusA-like KH domain. 69
55870 404135 pfam13185 GAF_2 GAF domain. 137
55871 404136 pfam13186 SPASM Iron-sulfur cluster-binding domain. This domain occurs as an additional C-terminal iron-sulfur cluster binding domain in many radical SAM domain, pfam04055 proteins. The domain occurs in a number of proteins that modify a protein to become an active enzyme, or a peptide to become a ribosomal natural product. The domain is named SPASM because it occurs in the maturases of Subtilosin, PQQ, Anaerobic Sulfatases, and Mycofactocin. 66
55872 404137 pfam13187 Fer4_9 4Fe-4S dicluster domain. 50
55873 404138 pfam13188 PAS_8 PAS domain. 65
55874 404139 pfam13189 Cytidylate_kin2 Cytidylate kinase-like family. This family includes enzymes related to cytidylate kinase. 176
55875 404140 pfam13190 PDGLE PDGLE domain. This short presumed domain is usually found on its own. However, it is also found associated with pfam01891 suggesting it may have a role in cobalt uptake. The domain is named after a short motif found within many members of the family. 89
55876 404141 pfam13191 AAA_16 AAA ATPase domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 166
55877 404142 pfam13192 Thioredoxin_3 Thioredoxin domain. 71
55878 404143 pfam13193 AMP-binding_C AMP-binding enzyme C-terminal domain. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices. 76
55879 404144 pfam13194 DUF4010 Domain of unknown function (DUF4010). This is a family of putative membrane proteins found in archaea and bacteria. It is sometimes found C terminal to pfam02308. 209
55880 404145 pfam13195 DUF4011 Protein of unknown function (DUF4011). This family of proteins is found in archaea and bacteria. Many members are annotated as being putative DNA helicase-related proteins. 164
55881 404146 pfam13196 DUF4012 Protein of unknown function (DUF4012). This is a family of uncharacterized proteins found in archaea and bacteria. 144
55882 404147 pfam13197 DUF4013 Protein of unknown function (DUF4013). This is a family of uncharacterized proteins that is found in archaea and bacteria. 167
55883 338629 pfam13198 DUF4014 Protein of unknown function (DUF4014). This is a bacterial and viral family of uncharacterized proteins. 72
55884 404148 pfam13199 Glyco_hydro_66 Glycosyl hydrolase family 66. This family is a set of glycosyl hydrolase enzymes including cycloisomaltooligosaccharide glucanotransferase (EC:2.4.1.-) and dextranase (EC:3.2.1.11) activities. 557
55885 404149 pfam13200 DUF4015 Putative glycosyl hydrolase domain. This domain is related to other known glycosyl hydrolases suggesting this domain is also involved in carbohydrate break down. 313
55886 404150 pfam13201 PCMD Putative carbohydrate metabolism domain. This domain has been suggested to participate in carbohydrate metabolism. Structural evidence indicates that it might be a carbohydrate binding domain, with or without enzymatic activity. In particular, it has been hypothesized that it might act as a glycoside hydrolase. 238
55887 404151 pfam13202 EF-hand_5 EF hand. 25
55888 404152 pfam13203 DUF2201_N Putative metallopeptidase domain. This domain, found in various hypothetical bacterial proteins, has no known function. However, it is related to pfam01435. 271
55889 404153 pfam13204 DUF4038 Protein of unknown function (DUF4038). A family of putative cellulases. 320
55890 404154 pfam13205 Big_5 Bacterial Ig-like domain. 106
55891 404155 pfam13206 VSG_B Trypanosomal VSG domain. This family represents the B-type variant surface glycoproteins from trypanosomal parasites. This family is related to pfam00913. 354
55892 404156 pfam13207 AAA_17 AAA domain. 136
55893 404157 pfam13208 TerB_N TerB N-terminal domain. The TerB_N domain is found N-terminal to TerB, and TerB_C containing proteins. It has a predominantly alpha-helical structure and contains an absolutely conserved glutamate. The presence of a conserved acidic residue suggests that it might chelate metal like TerB. These proteins occur in a two-gene operon containing an AAA+ ATPase and SF-II DNA helicase suggesting a role in stress-response or phage defense. 203
55894 289952 pfam13209 DUF4017 Protein of unknown function (DUF4017). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 60
55895 404158 pfam13210 DUF4018 Domain of unknown function (DUF4018). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 190 amino acids in length. 198
55896 404159 pfam13211 DUF4019 Protein of unknown function (DUF4019). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 130 and 183 amino acids in length. There is a single completely conserved residue E that may be functionally important. 104
55897 372518 pfam13212 DUF4020 Domain of unknown function (DUF4020). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 176 and 195 amino acids in length. 174
55898 289956 pfam13213 DUF4021 Protein of unknown function (DUF4021). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved YGM sequence motif. 46
55899 289957 pfam13214 DUF4022 Protein of unknown function (DUF4022). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 85 amino acids in length. 77
55900 372519 pfam13215 DUF4023 Protein of unknown function (DUF4023). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved KLP sequence motif. 35
55901 315802 pfam13216 DUF4024 Protein of unknown function (DUF4024). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved RDE sequence motif. 35
55902 315803 pfam13217 DUF4025 Protein of unknown function (DUF4025). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved EGT sequence motif. 50
55903 404160 pfam13218 DUF4026 Protein of unknown function (DUF4026). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 450 amino acids in length. The family is found in association with pfam10077. 320
55904 404161 pfam13219 DUF4027 Protein of unknown function (DUF4027). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved CLGGF sequence motif. 36
55905 289962 pfam13220 DUF4028 Protein of unknown function (DUF4028). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 67 and 93 amino acids in length. There are two conserved sequence motifs: IVKI and YVKKWF. 65
55906 372521 pfam13221 DUF4029 Protein of unknown function (DUF4029). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 119 amino acids in length. 96
55907 289964 pfam13222 DUF4030 Protein of unknown function (DUF4030). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 164 and 197 amino acids in length. 142
55908 404162 pfam13223 DUF4031 Protein of unknown function (DUF4031). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 91 and 130 amino acids in length. There is a conserved HYD sequence motif. 75
55909 404163 pfam13224 DUF4032 Domain of unknown function (DUF4032). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 170 amino acids in length. The family is found in association with pfam06293. 163
55910 404164 pfam13225 DUF4033 Domain of unknown function (DUF4033). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 80 amino acids in length. 88
55911 404165 pfam13226 DUF4034 Domain of unknown function (DUF4034). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 280 amino acids in length. There is a conserved PRW sequence motif. 274
55912 404166 pfam13227 DUF4035 Protein of unknown function (DUF4035). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 67 and 93 amino acids in length. 55
55913 404167 pfam13228 DUF4037 Domain of unknown function (DUF4037). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 100 amino acids in length. There is a single completely conserved residue P that may be functionally important. 101
55914 404168 pfam13229 Beta_helix Right handed beta helix region. This region contains a parallel beta helix region that shares some similarity with Pectate lyases. 157
55915 404169 pfam13230 GATase_4 Glutamine amidotransferases class-II. This family captures members that are not found in pfam00310. 272
55916 404170 pfam13231 PMT_2 Dolichyl-phosphate-mannose-protein mannosyltransferase. This family contains members that are not captured by pfam02366. 160
55917 315815 pfam13232 Complex1_LYR_1 Complex1_LYR-like. This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria. 99
55918 404171 pfam13233 Complex1_LYR_2 Complex1_LYR-like. This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria. 79
55919 404172 pfam13234 rRNA_proc-arch rRNA-processing arch domain. Mtr4 is the essential RNA helicase, and is an exosome-activating cofactor. This arch domain is carried in Mtr4 and Ski2 (the cytosolic homolog of Mtr4). The arch domain is required for proper 5.8S rRNA processing, and appears to function independently of canonical helicase activity. 266
55920 404173 pfam13236 CLU Clustered mitochondria. The CLU domain (CLUstered mitochondria) is a eukaryotic domain found in proteins from fungi, protozoa, plants to humans. It is required for correct functioning of the mitochondria and mitochondrial transport although the exact function of the domain is unknown. In Dictyostelium the full-length protein is required for a very late step in fission of the outer mitochondrial membrane suggesting that mitochondria are transported along microtubules, as in mammalian cells, rather than along actin filaments, as in budding yeast. Disruption of the protein impaired cytokinesis and caused mitochondria to cluster at the cell centre. It is likely that CLU functions in a novel pathway that positions mitochondria within the cell based on their physiological state. Disruption of the CLU pathway may enhance oxidative damage, alter gene expression, cause mitochondria to cluster at microtubule plus ends, and lead eventually to mitochondrial failure. 225
55921 404174 pfam13237 Fer4_10 4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich. 56
55922 404175 pfam13238 AAA_18 AAA domain. 128
55923 404176 pfam13239 2TM 2TM domain. This short region contains two transmembrane alpha helices that are found associated with a wide range of other domains. This domain may be involved in cell lysis or peptidoglycan turnover. 80
55924 404177 pfam13240 zinc_ribbon_2 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773. 21
55925 404178 pfam13241 NAD_binding_7 Putative NAD(P)-binding. This domain is found in fungi, plants, archaea and bacteria. 104
55926 404179 pfam13242 Hydrolase_like HAD-hydrolase-like. 75
55927 404180 pfam13243 SQHop_cyclase_C Squalene-hopene cyclase C-terminal domain. Squalene-hopene cyclase, EC:5.4.99.17, catalyzes the cyclisation of squalene into hopene in bacteria. This reaction is part of a cationic cyclisation cascade, which is homologous to a key step in cholesterol biosynthesis. This family is the C-terminal half of the molecule. 319
55928 404181 pfam13244 DUF4040 Domain of unknown function (DUF4040). 65
55929 404182 pfam13245 AAA_19 AAA domain. 136
55930 404183 pfam13246 Cation_ATPase Cation transport ATPase (P-type). This domain is found in cation transport ATPases, including phospholipid-transporting ATPases, calcium-transporting ATPases, and sodium-potassium ATPases. 91
55931 404184 pfam13247 Fer4_11 4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters. 99
55932 404185 pfam13248 zf-ribbon_3 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773. 25
55933 404186 pfam13249 SQHop_cyclase_N Squalene-hopene cyclase N-terminal domain. Squalene-hopene cyclase, EC:5.4.99.17, catalyzes the cyclisation of squalene into hopene in bacteria. This reaction is part of a cationic cyclisation cascade, which is homologous to a key step in cholesterol biosynthesis. This family is the N-terminal domain. 290
55934 404187 pfam13250 DUF4041 Domain of unknown function (DUF4041). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and viruses, and is approximately 60 amino acids in length. The family is found in association with pfam10544. 55
55935 404188 pfam13251 DUF4042 Domain of unknown function (DUF4042). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 180 amino acids in length. 182
55936 404189 pfam13252 DUF4043 Protein of unknown function (DUF4043). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 369 and 424 amino acids in length. There is a single completely conserved residue G that may be functionally important. 352
55937 404190 pfam13253 DUF4044 Protein of unknown function (DUF4044). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 56 amino acids in length. There is a single completely conserved residue M that may be functionally important. 33
55938 404191 pfam13254 DUF4045 Domain of unknown function (DUF4045). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length. 426
55939 404192 pfam13255 DUF4046 Protein of unknown function (DUF4046). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 64 and 331 amino acids in length. 90
55940 315838 pfam13256 DUF4047 Domain of unknown function (DUF4047). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length. There are two conserved sequence motifs: TEA and FPKT. 125
55941 372535 pfam13257 DUF4048 Domain of unknown function (DUF4048). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 228 and 257 amino acids in length. 252
55942 289998 pfam13258 DUF4049 Domain of unknown function (DUF4049). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 310 and 324 amino acids in length. 333
55943 404193 pfam13259 DUF4050 Protein of unknown function (DUF4050). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 173 amino acids in length. There are two conserved sequence motifs: IPL and FLVD. 124
55944 372537 pfam13260 DUF4051 Protein of unknown function (DUF4051). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 52
55945 315842 pfam13261 DUF4052 Protein of unknown function (DUF4052). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 220 amino acids in length. 217
55946 404194 pfam13262 DUF4054 Protein of unknown function (DUF4054). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 120 and 152 amino acids in length. 105
55947 404195 pfam13263 PHP_C PHP-associated. This is a subunit, probably the alpha, of bacterial and eukaryotic DNA polymerase III, associated with the PHP domain, pfam02811. 56
55948 404196 pfam13264 DUF4055 Domain of unknown function (DUF4055). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 140 amino acids in length. 135
55949 372540 pfam13265 DUF4056 Protein of unknown function (DUF4056). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 355 and 380 amino acids in length. 266
55950 404197 pfam13266 DUF4057 Protein of unknown function (DUF4057). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 279 and 322 amino acids in length. 299
55951 379094 pfam13267 DUF4058 Protein of unknown function (DUF4058). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 244 and 264 amino acids in length. 252
55952 404198 pfam13268 DUF4059 Protein of unknown function (DUF4059). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved DKT sequence motif. 72
55953 372542 pfam13269 DUF4060 Protein of unknown function (DUF4060). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There are two conserved sequence motifs: VEVV and SYVAT. 73
55954 404199 pfam13270 DUF4061 Domain of unknown function (DUF4061). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 90 amino acids in length. There is a conserved AFG sequence motif. 88
55955 404200 pfam13271 DUF4062 Domain of unknown function (DUF4062). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif. 78
55956 315853 pfam13272 Holin_2-3 Putative 2/3 transmembrane domain holin. Holin_2-3 is a family of putative holins with 2 or 3 transmembrane segments. It consists of many proteobacterial proteins ranging in size from about 70 aas to 120 aas. They have 2 or 3 predicted TMSs. Although some are annotated as phage proteins or holins, none appears to be functionally characterized. 106
55957 404201 pfam13273 DUF4064 Protein of unknown function (DUF4064). 102
55958 404202 pfam13274 DUF4065 Protein of unknown function (DUF4065). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 155 and 202 amino acids in length. 108
55959 404203 pfam13275 S4_2 S4 domain. The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4. 65
55960 404204 pfam13276 HTH_21 HTH-like domain. This domain contains a predicted helix-turn-helix suggesting a DNA-binding function. 59
55961 404205 pfam13277 YmdB YmdB-like protein. This family of putative phosphoesterases contains the B. subtilis protein YmdB. 251
55962 404206 pfam13279 4HBT_2 Thioesterase-like superfamily. This family contains a wide variety of enzymes, principally thioesterases. These enzymes are part of the Hotdog fold superfamily. 121
55963 404207 pfam13280 WYL WYL domain. WYL is a Sm-like SH3 beta-barrel fold containing domain. It is a member of the WYL-like superfamily, named for three conserved amino acids found in a subset of the superfamily. However, these residues are not strongly conserved throughout the family. Rather, the conservation pattern includes four basic residues and a position often occupied by a cysteine, which are predicted to line a ligand-binding groove typical of the Sm-like SH3 beta-barrels. A WYL domain protein (sll7009) is a negative regulator of the I-D CRISPR-Cas system in Synechocystis sp. It is predicted to be a ligand-sensing domain that could bind negatively charged ligands, such as nucleotides or nucleic acid fragments, to regulate CRISPR-Cas and other defense systems such as the abortive infection AbiG system. 173
55964 404208 pfam13281 DUF4071 Domain of unknown function (DUF4071). This domain is found at the N-terminus of many serine-threonine kinase-like proteins. 372
55965 404209 pfam13282 DUF4070 Domain of unknown function (DUF4070). This is a bacterial domain often found at the C-terminus of Radical_SAM methylases. 142
55966 372547 pfam13283 NfrA_C Bacteriophage N adsorption protein A C-term. The function of this domain is unknown but it is found at the C-terminus of bacteriophage N4 adsorption protein A, in association with an N-terminal region of TPR repeats. 173
55967 404210 pfam13284 DUF4072 Domain of unknown function (DUF4072). This short domain is normally found at the very N-terminus of Hydrolases pfam00702. 47
55968 290024 pfam13285 DUF4073 Domain of unknown function (DUF4073). This family is frequently found at the C-terminus of bacterial proteins carrying the family, Metallophos pfam00149. 157
55969 404211 pfam13286 HD_assoc Phosphohydrolase-associated domain. This domain is found on bacterial and archaeal metal-dependent phosphohydrolases. 91
55970 404212 pfam13287 Fn3_assoc Fn3 associated. 59
55971 404213 pfam13288 DXPR_C DXP reductoisomerase C-terminal domain. This is the C-terminal domain of the 1-deoxy-D-xylulose-5-phosphate reductoisomerase enzyme. This domain forms a left handed super-helix. 114
55972 404214 pfam13289 SIR2_2 SIR2-like domain. This family of proteins are related to the sirtuins. 141
55973 404215 pfam13290 CHB_HEX_C_1 Chitobiase/beta-hexosaminidase C-terminal domain. 67
55974 404216 pfam13291 ACT_4 ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains. These ACT domains are found at the C-terminus of the RelA protein. 79
55975 404217 pfam13292 DXP_synthase_N 1-deoxy-D-xylulose-5-phosphate synthase. This family contains 1-deoxyxylulose-5-phosphate synthase (DXP synthase), an enzyme which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate, to yield 1-deoxy-D-xylulose-5-phosphate, a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). 273
55976 404218 pfam13293 DUF4074 Domain of unknown function (DUF4074). This family is found at the C-terminal of Homeobox proteins in Metazoa. 62
55977 372549 pfam13294 DUF4075 Domain of unknown function (DUF4075). The members of this family are putative mature parasite-infected erythrocyte surface antigen protein from Bacillus spp. 79
55978 290034 pfam13295 DUF4077 Domain of unknown function (DUF4077). This is the N-terminal region of methyl-accepting chemotaxis proteins from Bacillus spp. The function is not known. 176
55979 404219 pfam13296 T6SS_Vgr Putative type VI secretion system Rhs element Vgr. This is a family of putative type VI secretion system Rhs element Vgr proteins from Proteobacteria. 108
55980 404220 pfam13297 Telomere_Sde2_2 Telomere stability C-terminal. This short C-terminal domain is found in higher eukaryotes further downstream from the Sde2 family, pfam13019. It is found in all Sde2-related proteins except those from fission yeast, fly, and mosquito. Its exact function in telomere formation and maintenance has not yet been established. 60
55981 404221 pfam13298 LigD_N DNA polymerase Ligase (LigD). This is the N-terminal region of ATP-dependent DNA ligase. 103
55982 404222 pfam13299 CPSF100_C Cleavage and polyadenylation factor 2 C-terminal. This family lies at the C-terminus of many fungal and plant cleavage and polyadenylation specificity factor subunit 2 proteins. The exact function of the domain is not known, but is likely to function as a binding domain for the protein within the overall CPSF complex. 161
55983 404223 pfam13300 DUF4078 Domain of unknown function (DUF4078). This family is found from fungi to humans, but its exact function is not known. 86
55984 404224 pfam13301 DUF4079 Protein of unknown function (DUF4079). This is an uncharacterized family of proteins. 141
55985 379112 pfam13302 Acetyltransf_3 Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions. 139
55986 404225 pfam13303 PTS_EIIC_2 Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate. 327
55987 404226 pfam13304 AAA_21 AAA domain, putative AbiEii toxin, Type IV TA system. Several members are annotated as being of the abortive phage resistance system, in which case the family would be acting as the toxin for a type IV toxin-antitoxin resistance system. 303
55988 404227 pfam13305 WHG WHG domain. This presumed domain is around 80 amino acids in length. It is found to the C-terminus of a DNA-binding helix-turn-helix domain. This domain may be involved in binding to an as yet unknown ligand that allows a transcriptional regulation response to that molecule. The domain is named WHG after three conserved residues near the C-terminus of the domain. 102
55989 404228 pfam13306 LRR_5 Leucine rich repeats (6 copies). This family includes a number of leucine rich repeats. This family contains a large number of BSPA-like surface antigens from Trichomonas vaginalis. 127
55990 404229 pfam13307 Helicase_C_2 Helicase C-terminal domain. This domain is found at the C-terminus of DEAD-box helicases. 166
55991 404230 pfam13308 YARHG YARHG domain. This presumed extracellular domain is about 70 amino acids in length. It is named YARHG after a conserved motif in the sequence. This domain is associated with peptidases and bacterial kinase proteins. Its molecular function is unknown. 84
55992 404231 pfam13309 HTH_22 HTH domain. This domain is a helix-turn-helix domain that is likely to act as a DNA-binding domain. 63
55993 404232 pfam13310 Virulence_RhuM Virulence protein RhuM family. There are currently no experimental data for members of this group or their homologs. However, these proteins are implicated in virulence/pathogenicity because RhuM is encoded in the SPI-3 pathogenicity island in Salmonella typhimurium. 252
55994 404233 pfam13311 DUF4080 Protein of unknown function (DUF4080). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. 188
55995 404234 pfam13312 DUF4081 Domain of unknown function (DUF4081). This domain is often found N-terminal to the GNAT acetyltransferase domain, pfam00583 and FR47, pfam08445. 105
55996 404235 pfam13313 DUF4082 Domain of unknown function (DUF4082). This family appears to be a parallel beta-helix repeated region that sits between successive Cadherin domains, pfam00028. 143
55997 372557 pfam13314 DUF4083 Domain of unknown function (DUF4083). This is a family of very short, approximately 60 residue, proteins from Firmicutes, that are all putatively annotated as being MutT/Nudix. However, the characteristic Nudix motif of GX(5)EX(7)REUXEE is absent. 57
55998 372558 pfam13315 DUF4085 Protein of unknown function (DUF4085). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 101 and 269 amino acids in length. 206
55999 372559 pfam13316 DUF4087 Protein of unknown function (DUF4087). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 140 and 280 amino acids in length. There is a conserved RCGW sequence motif. 94
56000 404236 pfam13317 DUF4088 Protein of unknown function (DUF4088). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 258 and 300 amino acids in length. 226
56001 372560 pfam13318 DUF4089 Protein of unknown function (DUF4089). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 50
56002 404237 pfam13319 DUF4090 Protein of unknown function (DUF4090). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 84
56003 404238 pfam13320 DUF4091 Domain of unknown function (DUF4091). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 70 amino acids in length. There is a single completely conserved residue G that may be functionally important. 66
56004 372562 pfam13321 DUF4084 Domain of unknown function (DUF4084). This family of Firmicute proteins is frequently associated with the EAL, GGDEF and PAS families, pfam00563, pfam00990, and pfam00989. The exact function is not known. 304
56005 290061 pfam13322 DUF4092 Domain of unknown function (DUF4092). This family is found in Proteobacteria. The function is not known. 176
56006 404239 pfam13323 HPIH N-terminal domain with HPIH motif. This family is found in fungi on proteins carrying the PAS, pfam00989, domain. There is a well-conserved characteristic HPIH motif, but the function is not known. 152
56007 404240 pfam13324 GCIP Grap2 and cyclin-D-interacting. GCIP, or Grap2 and cyclin-D-interacting protein, is found in eukaryotes, and in the human protein CCNDBP1, residues 149-190 constitute a helix-loop-helix domain, residues 190-240 an acidic region, and 240-261 a leucine zipper domain. GCIP interacts with full-length Grap2 protein and with the COOH-terminal unique and SH3 domains (designated QC domain) of Grap2. It is potentially involved in the regulation of cell differentiation and proliferation through Grap2 and cyclin D-mediated signalling pathways. In mice, it is involved in G1/S-phase progression of hepatocytes, which in older animals is associated with the development of liver tumors. In vitro it acts as an inhibitory HLH protein, for example, blocking transcription of the HNF-4 promoter. In its function as a cyclin D1-binding protein it is able to reduce CDK4-mediated phosphorylation of the retinoblastoma protein and to inhibit E2F-mediated transcriptional activity. GCIP has also been shown to interact physically with Rad (Ras associated with diabetes), Rad being important in regulating cellular senescence. 261
56008 404241 pfam13325 MCRS_N N-terminal region of micro-spherule protein. This domain is found in plants and higher eukaryotes, and is the N-terminal region of micro-spherule proteins which repress the transactivation activities of Nrf1 (p45 nuclear factor-erythroid 2 (p45 NF-E2)-related factor 1). In conjunction with DIPA the full-length protein acts as a transcription repressor. The exact function of the region is not known. 199
56009 404242 pfam13326 PSII_Pbs27 Photosystem II Pbs27. This family of proteins contains Pbs27, a highly conserved component of photosystem II. Pbs27 is comprised of four helices arranged in a right handed up-down-up-down fold, with a less ordered region located at the N-terminus. 134
56010 404243 pfam13327 T3SS_LEE_assoc Type III secretion system subunit. This is a family of bacterial putative type III secretion apparatus proteins associated with the locus of enterocyte effacement (LEE). 162
56011 404244 pfam13328 HD_4 HD domain. HD domains are metal dependent phosphohydrolases. 157
56012 404245 pfam13329 ATG2_CAD Autophagy-related protein 2 CAD motif. The Atg2 protein, an integral membrane protein, is required for a range of functions including the regulation of autophagy in conjunction with the Atg1-Atg13 complex. Atg2 binds Atg9. The precise function of this region, with its characteristic highly conserved CAD sequence motif, is not known. 154
56013 404246 pfam13330 Mucin2_WxxW Mucin-2 protein WxxW repeating region. This family is a repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in an uncharacterized protein from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues. 84
56014 404247 pfam13331 DUF4093 Domain of unknown function (DUF4093). This domain lies at the C-terminus of primase proteins carrying the TOPRIM, pfam01751, domain. The exact function of the domain is not known. 85
56015 404248 pfam13332 Fil_haemagg_2 Haemagglutinin repeat. 169
56016 372570 pfam13333 rve_2 Integrase core domain. 52
56017 404249 pfam13334 DUF4094 Domain of unknown function (DUF4094). This domain is found in plant proteins that often carry a galactosyltransferase domain, pfam01762, at their C-terminus. 91
56018 404250 pfam13335 Mg_chelatase_C Magnesium chelatase, subunit ChlI C-terminal. This is a family of the C-terminal of putative bacterial magnesium chelatase subunit ChlI proteins. Most members have the associated pfam01078. 93
56019 404251 pfam13336 AcetylCoA_hyd_C Acetyl-CoA hydrolase/transferase C-terminal domain. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA. 154
56020 404252 pfam13337 Lon_2 Putative ATP-dependent Lon protease. This is a family of proteins that are annotated as ATP-dependent Lon proteases. 450
56021 404253 pfam13338 AbiEi_4 Transcriptional regulator, AbiEi antitoxin. AbiEi_4 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 49
56022 404254 pfam13339 AATF-Che1 Apoptosis antagonizing transcription factor. The N-terminal and leucine-zipper region of the apoptosis antagonizing transcription factor-Che1. 130
56023 404255 pfam13340 DUF4096 Putative transposase of IS4/5 family (DUF4096). 76
56024 404256 pfam13341 RAG2_PHD RAG2 PHD domain. This domain is found at the C-terminus of the RAG2 protein. The structure of this domain has been shown bound to histone H3 trimethylated at lysine 4 (H3K4me3). 78
56025 404257 pfam13342 Toprim_Crpt C-terminal repeat of topoisomerase. 60
56026 404258 pfam13343 SBP_bac_6 Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins. 247
56027 404259 pfam13344 Hydrolase_6 Haloacid dehalogenase-like hydrolase. This family is part of the HAD superfamily. 101
56028 404260 pfam13346 ABC2_membrane_5 ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061. 206
56029 404261 pfam13347 MFS_2 MFS/sugar transport protein. This family is part of the major facilitator superfamily of membrane transport proteins. 427
56030 404262 pfam13349 DUF4097 Putative adhesin. This has a putative all-beta structure with a twenty-residue repeat with a highly conserved repeating GD, gly-asp, motif. It may form part of a bacterial adhesin. 247
56031 404263 pfam13350 Y_phosphatase3 Tyrosine phosphatase family. This family is closely related to the pfam00102 and pfam00782 families. 243
56032 404264 pfam13351 DUF4099 Protein of unknown function (DUF4099). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. The C-terminal repeat region of this family is DUF4098, pfam13345. 80
56033 372575 pfam13352 DUF4100 Protein of unknown function (DUF4100). This is a family of uncharacterized proteins found in Physcomitrella. 207
56034 404265 pfam13353 Fer4_12 4Fe-4S single cluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich. 138
56035 404266 pfam13354 Beta-lactamase2 Beta-lactamase enzyme family. This family is closely related to Beta-lactamase, pfam00144, the serine beta-lactamase-like superfamily, which contains the distantly related pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 201
56036 404267 pfam13355 DUF4101 Protein of unknown function (DUF4101). This is a family of uncharacterized proteins, and is sometimes found in combination with pfam00226. 117
56037 404268 pfam13356 Arm-DNA-bind_3 Arm DNA-binding domain. This DNA-binding domain is found at the N-terminus of a wide variety of phage integrase proteins. 78
56038 404269 pfam13358 DDE_3 DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 145
56039 404270 pfam13359 DDE_Tnp_4 DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 156
56040 404271 pfam13360 PQQ_2 PQQ-like domain. This domain contains several repeats of the PQQ repeat. 233
56041 404272 pfam13361 UvrD_C UvrD-like helicase C-terminal domain. This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold. 498
56042 372578 pfam13362 Toprim_3 Toprim domain. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation. 96
56043 404273 pfam13363 BetaGal_dom3 Beta-galactosidase, domain 3. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyzes the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain has an Ig-like fold. 65
56044 404274 pfam13364 BetaGal_dom4_5 Beta-galactosidase jelly roll domain. This domain is found in beta galactosidase enzymes. It has a jelly roll fold. 111
56045 404275 pfam13365 Trypsin_2 Trypsin-like peptidase domain. This family includes trypsin-like peptidase domains. 142
56046 404276 pfam13366 PDDEXK_3 PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily 117
56047 404277 pfam13367 PrsW-protease Protease prsW family. This is a family of putative peptidases, possibly belonging to the MEROPS M79 family. PrsW appears to be a member of a widespread family of membrane proteins that includes at least one previously known protease. PrsW appears to be responsible for Site-1 cleavage of the RsiW anti-sigma factor, the cognate anti-sigma factor, and it senses antimicrobial peptides that damage the cell membrane and other agents that cause cell envelope stress. The three acidic residues, E75, E76 and E95 in Aflv_1074, appear to be crucial since their mutation to alanine renders the protein inactive. Based on predictions of the bioinformatics programme TMHMM it is likely that these residues are located on the extracytoplasmic face of PrsW placing them in a position to act as a sensor for cell envelope stress. 195
56048 404278 pfam13368 Toprim_C_rpt Topoisomerase C-terminal repeat. This domain is repeated up to five times to form the C-terminal region of bacterial topoisomerase immediately downstream of the zinc-finger motif. 48
56049 404279 pfam13369 Transglut_core2 Transglutaminase-like superfamily. 155
56050 404280 pfam13370 Fer4_13 4Fe-4S single cluster domain of Ferredoxin I. Fer4_13 is a ferredoxin I from sulfate-reducing bacteria. Chemical sequence analysis suggests that this characteristic [4Fe-4S] cluster sulfur environment is widely distributed among ferredoxins. 58
56051 404281 pfam13371 TPR_9 Tetratricopeptide repeat. 73
56052 404282 pfam13372 Alginate_exp Alginate export. This domain forms an 18-stranded beta-barrel pore which is likely to act as an alginate export channel. 394
56053 404283 pfam13373 DUF2407_C DUF2407 C-terminal domain. This is a family of proteins found in fungi. The function is not known. There is a characteristic GFDRL sequence motif. 141
56054 404284 pfam13374 TPR_10 Tetratricopeptide repeat. 42
56055 404285 pfam13375 RnfC_N RnfC Barrel sandwich hybrid domain. This domain is part of the barrel sandwich hybrid superfamily. It is found at the N-terminus of the RnfC Electron transport complex protein. It appears to be most related to the N-terminal NQRA domain (pfam05896). 101
56056 404286 pfam13376 OmdA Bacteriocin-protection, YdeI or OmpD-Associated. This is a family of archaeal and bacterial proteins predicted to be periplasmic. YdeI is important for resistance to polymyxin B in broth and for bacterial survival in mice upon oral, but not intraperitoneal inoculation, suggesting a role for YdeI in the gastrointestinal tract of mice. Production of the ydeI gene is regulated by the Rcs (regulator of capsule synthesis) phospho-relay system pathway independently of RcsA, and additionally transcription of the protein is regulated by the stationary-phase sigma factor, RpoS (sigma-S). YdeI confers protection against cationic AMPs (Antimicrobial peptides) or bacteriocins in conjunction with the general porin Omp, thus justifying its name of OmdA, for OmpD-Associated protein. 60
56057 404287 pfam13377 Peripla_BP_3 Periplasmic binding protein-like domain. This domain is found in a variety of transcriptional regulatory proteins. It is related to bacterial periplasmic binding proteins, although this domain is unlikely to be found in the periplasm. This domain likely acts to bind a small molecule ligand that the DNA-binding domain responds to. 160
56058 404288 pfam13378 MR_MLE_C Enolase C-terminal domain-like. This domain appears at the C-terminus of many of the proteins that carry the MR_MLE_N pfam02746 domain. EC:4.2.1.40. 205
56059 404289 pfam13379 NMT1_2 NMT1-like family. This family is closely related to the pfam09084 family. 254
56060 404290 pfam13380 CoA_binding_2 CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. 116
56061 404291 pfam13382 Adenine_deam_C Adenine deaminase C-terminal domain. This family represents a C-terminal region of the adenine deaminase enzyme. 168
56062 372586 pfam13383 Methyltransf_22 Methyltransferase domain. This family appears to be a methyltransferase domain. 252
56063 404292 pfam13384 HTH_23 Homeodomain-like domain. 50
56064 404293 pfam13385 Laminin_G_3 Concanavalin A-like lectin/glucanases superfamily. This domain belongs to the Concanavalin A-like lectin/glucanases superfamily. 151
56065 404294 pfam13386 DsbD_2 Cytochrome C biogenesis protein transmembrane region. 199
56066 404295 pfam13387 DUF4105 Domain of unknown function (DUF4105). This is a family of uncharacterized bacterial proteins. There is a highly conserved histidine residue and a well-conserved NCT motif. 166
56067 404296 pfam13388 DUF4106 Protein of unknown function (DUF4106). This family of proteins are found in large numbers in the Trichomonas vaginalis proteome. The function of this protein is unknown. 431
56068 372588 pfam13389 DUF4107 Protein of unknown function (DUF4107). This family of putative proteins are found in Trichomonas vaginalis in large numbers. The function of this protein is unknown. 167
56069 290126 pfam13390 DUF4108 Protein of unknown function (DUF4108). This family of putative proteins are found in Trichomonas vaginalis in large numbers. The function of this protein is unknown. 145
56070 404297 pfam13391 HNH_2 HNH endonuclease. 64
56071 404298 pfam13392 HNH_3 HNH endonuclease. This is a zinc-binding loop of Fold group 7 as found in endo-deoxy-ribonucleases and HNH nucleases. 46
56072 404299 pfam13393 tRNA-synt_His Histidyl-tRNA synthetase. This is a family of class II aminoacyl-tRNA synthetase-like and ATP phosphoribosyltransferase regulatory subunits. 305
56073 404300 pfam13394 Fer4_14 4Fe-4S single cluster domain. 117
56074 404301 pfam13395 HNH_4 HNH endonuclease. This HNH nuclease domain is found in CRISPR-related proteins. 54
56075 404302 pfam13396 PLDc_N Phospholipase_D-nuclease N-terminal. This family is often found at the very N-terminus of proteins from the phospholipase_D-nuclease family, PLDc, pfam00614. However, a large number of members are full-length within this family. 43
56076 404303 pfam13397 RbpA RNA polymerase-binding protein. RbpA is a family of bacterial RNA polymerase-binding proteins. RbpA acts as a transcription factor by binding to the sigma subunit of RNA polymerase. 104
56077 404304 pfam13398 Peptidase_M50B Peptidase M50B-like. This is a family of bacterial and plant peptidases in the same family as MEROPS:M50B. 194
56078 404305 pfam13399 LytR_C LytR cell envelope-related transcriptional attenuator. This family appears at the C-terminus of members of the LytR_cpsA_psr, pfam03816, family 87
56079 404306 pfam13400 Tad Putative Flp pilus-assembly TadE/G-like. This is an N-terminal domain on a family of putative Flp pilus-assembly proteins. The exact function is not known. The Flp-pilus biogenesis genes include the Tad genes, and some members of this family are putatively assigned as being TadG. 47
56080 379165 pfam13401 AAA_22 AAA domain. 129
56081 404307 pfam13402 Peptidase_M60 Peptidase M60, enhancin and enhancin-like. This family of peptidases contains a zinc metallopeptidase motif (HEXXHX(8,28)E) and possesses mucinase activity. It includes the viral enhancins as well as enhancin-like peptidases from bacterial species. Enhancins are a class of metalloproteases found in some baculoviruses that enhance viral infection by degrading the peritrophic membrane (PM) of the insect midgut. Bacterial enhancins are found to be cytotoxic when compared to viral enhancin, however, suggesting that the bacterial enhancins do not enhance infection in the same way as viral enhancin. Bacterial enhancins may have evolved a distinct biochemical function. These bacterial domains are peptidases targeting host glycoproteins and thus probably play an important role in successful colonisation of both vertebrate mucosal surfaces and the invertebrate digestive tract by both mutualistic and pathogenic microbes. This family has been augmented by a merge with the sequences in the Enhancin Pfam family. 268
56082 404308 pfam13403 Hint_2 Hint domain. This domain is found in inteins. 147
56083 404309 pfam13404 HTH_AsnC-type AsnC-type helix-turn-helix domain. 41
56084 404310 pfam13405 EF-hand_6 EF-hand domain. 30
56085 404311 pfam13406 SLT_2 Transglycosylase SLT domain. This family is related to the SLT domain pfam01464. 292
56086 404312 pfam13407 Peripla_BP_4 Periplasmic binding protein domain. This domain is found in a variety of bacterial periplasmic binding proteins. 256
56087 404313 pfam13408 Zn_ribbon_recom Recombinase zinc beta ribbon domain. This short bacterial protein contains a zinc ribbon domain that is likely to be DNA-binding. This domain is found in site specific recombinase proteins. This family appears most closely related to pfam04606. 57
56088 404314 pfam13409 GST_N_2 Glutathione S-transferase, N-terminal domain. This family is closely related to pfam02798. 68
56089 404315 pfam13410 GST_C_2 Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043. 67
56090 404316 pfam13411 MerR_1 MerR HTH family regulatory protein. 66
56091 404317 pfam13412 HTH_24 Winged helix-turn-helix DNA-binding. 45
56092 404318 pfam13413 HTH_25 Helix-turn-helix domain. This domain is a helix-turn-helix domain that probably binds to DNA. 62
56093 315977 pfam13414 TPR_11 TPR repeat. 42
56094 404319 pfam13415 Kelch_3 Galactose oxidase, central domain. 49
56095 404320 pfam13416 SBP_bac_8 Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins. 279
56096 404321 pfam13417 GST_N_3 Glutathione S-transferase, N-terminal domain. 75
56097 404322 pfam13418 Kelch_4 Galactose oxidase, central domain. 49
56098 404323 pfam13419 HAD_2 Haloacid dehalogenase-like hydrolase. 178
56099 404324 pfam13420 Acetyltransf_4 Acetyltransferase (GNAT) domain. 153
56100 404325 pfam13421 Band_7_1 SPFH domain-Band 7 family. 211
56101 404326 pfam13422 DUF4110 Domain of unknown function (DUF4110). This is a family that is found predominantly at the C-terminus of Kelch-containing proteins. However, the exact function of this region is not known. 95
56102 404327 pfam13423 UCH_1 Ubiquitin carboxyl-terminal hydrolase. 308
56103 315987 pfam13424 TPR_12 Tetratricopeptide repeat. 77
56104 404328 pfam13425 O-antigen_lig O-antigen ligase like membrane protein. 461
56105 404329 pfam13426 PAS_9 PAS domain. 102
56106 404330 pfam13427 DUF4111 Domain of unknown function (DUF4111). Although the exact function of this domain is not known it frequently appears downstream of the family Nucleotidyltransferase, pfam01909. It is also found in species associated with methicillin-resistant bacteria. 102
56107 404331 pfam13428 TPR_14 Tetratricopeptide repeat. 39
56108 404332 pfam13429 TPR_15 Tetratricopeptide repeat. 279
56109 404333 pfam13430 DUF4112 Domain of unknown function (DUF4112). This family has several highly conserved GD sequence-motifs of unknown function. The family is found in bacteria, archaea and fungi. 104
56110 404334 pfam13431 TPR_17 Tetratricopeptide repeat. 34
56111 404335 pfam13432 TPR_16 Tetratricopeptide repeat. This family is found predominantly at the C-terminus of transglutaminase enzyme core regions. 68
56112 404336 pfam13433 Peripla_BP_5 Periplasmic binding protein domain. This domain is found in a variety of bacterial periplasmic binding proteins. 363
56113 404337 pfam13434 K_oxygenase L-lysine 6-monooxygenase (NADPH-requiring). This is a family of Rossmann fold oxidoreductases that catalyze the NADPH-dependent hydroxylation of lysine at the N6 position, EC:1.14.13.59. 341
56114 372604 pfam13435 Cytochrome_C554 Cytochrome c554 and c-prime. This family is a tetra-haem cytochrome involved in the oxidation of ammonia. It is found in both phototrophic and denitrifying bacteria. 84
56115 404338 pfam13436 Gly-zipper_OmpA Glycine-zipper domain. 44
56116 404339 pfam13437 HlyD_3 HlyD family secretion protein. This is a family of largely bacterial haemolysin translocator HlyD proteins. 104
56117 404340 pfam13438 DUF4113 Domain of unknown function (DUF4113). Although the function is not known this domain occurs almost invariably at the very C-terminus of the IMS family DNA-polymerase repair proteins, IMS, pfam00817. 51
56118 404341 pfam13439 Glyco_transf_4 Glycosyltransferase Family 4. 170
56119 404342 pfam13440 Polysacc_synt_3 Polysaccharide biosynthesis protein. 293
56120 404343 pfam13441 Gly-zipper_YMGG YMGG-like Gly-zipper. 45
56121 404344 pfam13442 Cytochrome_CBB3 Cytochrome C oxidase, cbb3-type, subunit III. 67
56122 404345 pfam13443 HTH_26 Cro/C1-type HTH DNA-binding domain. This is a helix-turn-helix domain that probably binds to DNA. 63
56123 404346 pfam13444 Acetyltransf_5 Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions. 102
56124 404347 pfam13445 zf-RING_UBOX RING-type zinc-finger. This zinc-finger is a typical RING-type of plant ubiquitin ligases. 40
56125 404348 pfam13446 RPT A repeated domain in UCH-protein. This is a repeated domain found in de-ubiquitinating proteins. Its exact function is not known although it is likely to be involved in the binding of the Ubps in the complex with Rsp5 and Rup1. 59
56126 290183 pfam13447 Multi-haem_cyto Seven times multi-haem cytochrome CxxCH. This domain carries up to seven CxxCH repeated sequence motifs, characteristic of multi-haem cytochromes. 269
56127 404349 pfam13448 DUF4114 Domain of unknown function (DUF4114). This is a repeated domain that is found towards the C-terminal of many different types of bacterial proteins. There are highly conserved glutamate and aspartate residues suggesting that this domain might carry enzymic activity. 86
56128 404350 pfam13449 Phytase-like Esterase-like activity of phytase. This is a repeated domain that carries several highly conserved Glu and Asp residues indicating the likelihood that the domain incorporates the enzymic activity of the PLC-like phospho-diesterase part of the proteins. 284
56129 404351 pfam13450 NAD_binding_8 NAD(P)-binding Rossmann-like domain. 68
56130 404352 pfam13451 zf-trcl Probable zinc-ribbon domain. This is a probable zinc-binding domain with two CxxC sequence motifs, found in various families of bacteria. 48
56131 404353 pfam13452 MaoC_dehydrat_N N-terminal half of MaoC dehydratase. It is clear from the structures of bacterial members of MaoC dehydratase, pfam01575, that the full-length functional dehydratase enzyme is made up of two structures that dimerize to form a whole. Divergence of the N- and C- monomers in higher eukaryotes has led to two distinct domains, this one and MaoC_dehydratas. However, in order to function as an enzyme both are required together. 132
56132 404354 pfam13453 zf-TFIIB Transcription factor zinc-finger. 41
56133 404355 pfam13454 NAD_binding_9 FAD-NAD(P)-binding. 155
56134 372608 pfam13455 MUG113 Meiotically up-regulated gene 113. This is a family of fungal proteins found to be up-regulated in meiosis. 73
56135 404356 pfam13456 RVT_3 Reverse transcriptase-like. This domain is found in plants and appears to be part of a retrotransposon. 123
56136 404357 pfam13457 SH3_8 SH3-like domain. 74
56137 404358 pfam13458 Peripla_BP_6 Periplasmic binding protein. This family includes a diverse range of periplasmic binding proteins. 342
56138 404359 pfam13459 Fer4_15 4Fe-4S single cluster domain. 66
56139 404360 pfam13460 NAD_binding_10 NAD(P)H-binding. 183
56140 404361 pfam13462 Thioredoxin_4 Thioredoxin. 165
56141 404362 pfam13463 HTH_27 Winged helix DNA-binding domain. 68
56142 404363 pfam13464 DUF4115 Domain of unknown function (DUF4115). This short domain is often found at the C-terminus of proteins containing a helix-turn-helix domain. The function of this domain is unknown. 68
56143 404364 pfam13465 zf-H2C2_2 Zinc-finger double domain. 26
56144 404365 pfam13466 STAS_2 STAS domain. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. 80
56145 404366 pfam13467 RHH_4 Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding. 65
56146 404367 pfam13468 Glyoxalase_3 Glyoxalase-like domain. This domain is related to the Glyoxalase domain pfam00903. 175
56147 404368 pfam13469 Sulfotransfer_3 Sulfotransferase family. 216
56148 404369 pfam13470 PIN_3 PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases). 117
56149 404370 pfam13471 Transglut_core3 Transglutaminase-like superfamily. This family includes uncharacterized proteins that are related to the transglutaminase like domain pfam01841. 117
56150 404371 pfam13472 Lipase_GDSL_2 GDSL-like Lipase/Acylhydrolase family. This family of presumed lipases and related enzymes are similar to pfam00657. 176
56151 379208 pfam13473 Cupredoxin_1 Cupredoxin-like domain. The cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. 104
56152 404372 pfam13474 SnoaL_3 SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold. 121
56153 404373 pfam13475 DUF4116 Domain of unknown function (DUF4116). 49
56154 404374 pfam13476 AAA_23 AAA domain. 190
56155 404375 pfam13477 Glyco_trans_4_2 Glycosyl transferase 4-like. 139
56156 404376 pfam13478 XdhC_C XdhC Rossmann domain. This entry is the rossmann domain found in the Xanthine dehydrogenase accessory protein. 143
56157 404377 pfam13479 AAA_24 AAA domain. This AAA domain is found in a wide variety of presumed phage proteins. 199
56158 404378 pfam13480 Acetyltransf_6 Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions. 143
56159 404379 pfam13481 AAA_25 AAA domain. This AAA domain is found in a wide variety of presumed DNA repair proteins. 195
56160 404380 pfam13482 RNase_H_2 RNase_H superfamily. 165
56161 404381 pfam13483 Lactamase_B_3 Beta-lactamase superfamily domain. This family is part of the beta-lactamase superfamily and is related to pfam00753. 160
56162 404382 pfam13484 Fer4_16 4Fe-4S double cluster binding domain. 65
56163 290220 pfam13485 Peptidase_MA_2 Peptidase MA superfamily. 247
56164 372615 pfam13486 Dehalogenase Reductive dehalogenase subunit. This family is most frequently associated with a Fer4 iron-sulfur cluster towards the C-terminal region. 288
56165 404383 pfam13487 HD_5 HD domain. HD domains are metal dependent phosphohydrolases. 64
56166 404384 pfam13488 Gly-zipper_Omp Glycine zipper. 46
56167 404385 pfam13489 Methyltransf_23 Methyltransferase domain. This family appears to be a methyltransferase domain. 162
56168 404386 pfam13490 zf-HC2 Putative zinc-finger. This is a putative zinc-finger found in some anti-sigma factor proteins. 34
56169 404387 pfam13491 FtsK_4TM 4TM region of DNA translocase FtsK/SpoIIIE. 4TM_FtsK is the integral membrane domain of the FtsK DNA translocases. During sporulation in Bacillus subtilis, the SpoIIIE protein is believed to form a translocation pore at the leading edge of the nearly closed septum. The E. coli FtsK protein is homologous to SpoIIIE, and can free chromosomes trapped in vegetative septa. 171
56170 379221 pfam13492 GAF_3 GAF domain. 129
56171 404388 pfam13493 DUF4118 Domain of unknown function (DUF4118). This domain is found in a wide variety of bacterial signalling proteins. It is likely to be a transmembrane domain involved in ligand sensing. 107
56172 404389 pfam13494 DUF4119 Domain of unknown function, B. Theta Gene description (DUF4119). Based on Bacteroides thetaiotaomicron gene BT_0594, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 92
56173 404390 pfam13495 Phage_int_SAM_4 Phage integrase, N-terminal SAM-like domain. 84
56174 404391 pfam13496 DUF4120 Domain of unknown function (DUF4120). Based on Bacteroides thetaiotaomicron gene BT_2585, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 95
56175 404392 pfam13497 DUF4121 Domain of unknown function (DUF4121). Based on Bacteroides thetaiotaomicron gene BT_2588, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 264
56176 404393 pfam13498 DUF4122 Domain of unknown function (DUF4122). Based on Bacteroides thetaiotaomicron gene BT_2607, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 220
56177 404394 pfam13499 EF-hand_7 EF-hand domain pair. 67
56178 404395 pfam13500 AAA_26 AAA domain. This domain is found in a number of proteins involved in cofactor biosynthesis such as dethiobiotin synthase and cobyric acid synthase. This domain contains a P-loop motif. 198
56179 404396 pfam13501 SoxY Sulfur oxidation protein SoxY. This domain is found in the sulfur oxidation protein SoxY. It is closely related to the Desulfoferrodoxin family pfam01880. Dissimilatory oxidation of thiosulfate is carried out by the ubiquitous sulfur-oxidizing (Sox) multi-enzyme system. In this system, SoxY plays a key role, functioning as the sulfur substrate-binding protein that offers its sulfur substrate, which is covalently bound to a conserved C-terminal cysteine, to another oxidizing Sox enzyme. The structure of this domain shows an Ig-like fold. 109
56180 404397 pfam13502 AsmA_2 AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli. 233
56181 404398 pfam13503 DUF4123 Domain of unknown function (DUF4123). This presumed domain is functionally uncharacterized. It is about 120 amino acids in length and contains several conserved motifs that may be functionally important. This domain is sometimes associated with the FHA domain. 128
56182 404399 pfam13505 OMP_b-brl Outer membrane protein beta-barrel domain. This domain is found in a wide range of outer membrane proteins. This domain assumes a membrane bound beta-barrel fold. 175
56183 404400 pfam13506 Glyco_transf_21 Glycosyl transferase family 21. This is a family of ceramide beta-glucosyltransferases - EC:2.4.1.80. 174
56184 404401 pfam13507 GATase_5 CobB/CobQ-like glutamine amidotransferase domain. This family captures members that are not found in pfam00310, pfam07685 and pfam13230. 260
56185 404402 pfam13508 Acetyltransf_7 Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions. 83
56186 404403 pfam13509 S1_2 S1 domain. The S1 domain was originally identified as a repeat motif in the ribosomal S1 protein. It was later identified in a wide range of proteins. The S1 domain has an OB-fold structure. The S1 domain is involved in nucleic acid binding. 59
56187 404404 pfam13510 Fer2_4 2Fe-2S iron-sulfur cluster binding domain. The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This cluster appears within sarcosine oxidase proteins. 82
56188 404405 pfam13511 DUF4124 Domain of unknown function (DUF4124). This presumed domain is found in a variety of bacterial proteins. It is found at the N-terminus, associated with other domains such as the SLT domain and glutaredoxin domains in some proteins. The function of this domain is unknown, but it may have an Ig-like fold. 53
56189 404406 pfam13512 TPR_18 Tetratricopeptide repeat. 145
56190 404407 pfam13513 HEAT_EZ HEAT-like repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). These EZ repeats are found in subunits of cyanobacterial phycocyanin lyase and other proteins and probably carry out a scaffolding role. 53
56191 404408 pfam13514 AAA_27 AAA domain. This domain is found in a number of double-strand DNA break proteins. This domain contains a P-loop motif. 207
56192 404409 pfam13515 FUSC_2 Fusaric acid resistance protein-like. 126
56193 404410 pfam13516 LRR_6 Leucine Rich repeat. 24
56194 404411 pfam13517 VCBS Repeat domain in Vibrio, Colwellia, Bradyrhizobium and Shewanella. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggests a role for this domain in adhesion (TIGR). 61
56195 404412 pfam13518 HTH_28 Helix-turn-helix domain. This helix-turn-helix domain is often found in transposases and is likely to be DNA-binding. 52
56196 404413 pfam13519 VWA_2 von Willebrand factor type A domain. 103
56197 404414 pfam13520 AA_permease_2 Amino acid permease. 427
56198 404415 pfam13521 AAA_28 AAA domain. 163
56199 404416 pfam13522 GATase_6 Glutamine amidotransferase domain. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes, such as asparagine synthetase and glutamine--fructose-6-phosphate transaminase. 130
56200 404417 pfam13523 Acetyltransf_8 Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions. 145
56201 404418 pfam13524 Glyco_trans_1_2 Glycosyl transferases group 1. 93
56202 404419 pfam13525 YfiO Outer membrane lipoprotein. This outer membrane lipoprotein carries a TPR-like region towards its N-terminal. YfiO in E.coli is one of three outer membrane lipoproteins that form a multicomponent YaeT complex in the outer membrane of Gram-negative bacteria that is involved in the targeting and folding of beta-barrel outer membrane proteins. YfiO is the only essential lipoprotein component of the complex. It is required for the proper assembly and/or targeting of outer membrane proteins to the outer membrane. Through its interactions with NlpB it maintains the functional integrity of the YaeT complex. 200
56203 404420 pfam13526 DUF4125 Protein of unknown function (DUF4125). 197
56204 404421 pfam13527 Acetyltransf_9 Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions. 124
56205 404422 pfam13528 Glyco_trans_1_3 Glycosyl transferase family 1. 321
56206 379241 pfam13529 Peptidase_C39_2 Peptidase_C39 like family. 139
56207 404423 pfam13530 SCP2_2 Sterol carrier protein domain. 103
56208 404424 pfam13531 SBP_bac_11 Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins. 224
56209 404425 pfam13532 2OG-FeII_Oxy_2 2OG-Fe(II) oxygenase superfamily. 191
56210 404426 pfam13533 Biotin_lipoyl_2 Biotin-lipoyl like. 50
56211 404427 pfam13534 Fer4_17 4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich. 61
56212 316093 pfam13535 ATP-grasp_4 ATP-grasp domain. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. 160
56213 404428 pfam13536 EmrE Putative multidrug resistance efflux transporter. This is a membrane protein family whose members are purported to be related to the DMT or Drug/Metabolite Transporter (DMT) Superfamily. Members are all uncharacterized. 259
56214 404429 pfam13537 GATase_7 Glutamine amidotransferase domain. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes such as asparagine synthetase and glutamine-fructose-6-phosphate transaminase. 123
56215 404430 pfam13538 UvrD_C_2 UvrD-like helicase C-terminal domain. This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold. 51
56216 404431 pfam13539 Peptidase_M15_4 D-alanyl-D-alanine carboxypeptidase. This family resembles VanY, pfam02557, which is part of the peptidase M15 family. 67
56217 404432 pfam13540 RCC1_2 Regulator of chromosome condensation (RCC1) repeat. 30
56218 404433 pfam13541 ChlI Subunit ChlI of Mg-chelatase. 121
56219 404434 pfam13542 HTH_Tnp_ISL3 Helix-turn-helix domain of transposase family ISL3. 51
56220 404435 pfam13543 KSR1-SAM SAM like domain present in kinase suppressor RAS 1. 129
56221 404436 pfam13545 HTH_Crp_2 Crp-like helix-turn-helix domain. This family represents a crp-like helix-turn-helix domain that is likely to bind DNA. 69
56222 404437 pfam13546 DDE_5 DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 270
56223 404438 pfam13547 GTA_TIM GTA TIM-barrel-like domain. This domain is found in the gene transfer agent protein. An unusual system of genetic exchange exists in the purple nonsulfur bacterium Rhodobacter capsulatus. DNA transmission is mediated by a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. The genes involved in this process appear to be found widely in bacteria. According to the SUPERFAMILY database this domain has a TIM barrel fold. 299
56224 404439 pfam13548 DUF4126 Domain of unknown function (DUF4126). 174
56225 404440 pfam13549 ATP-grasp_5 ATP-grasp domain. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. 222
56226 404441 pfam13550 Phage-tail_3 Putative phage tail protein. This putative domain is found in the large gene transfer agent protein. These produce defective phage like particles. This domain is similar to other phage-tail protein families. 164
56227 379256 pfam13551 HTH_29 Winged helix-turn helix. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding. 64
56228 404442 pfam13552 DUF4127 Protein of unknown function (DUF4127). This family of uncharacterized bacterial proteins are about 500 amino acids in length. 493
56229 404443 pfam13553 FIIND Function to find. The function to find (FIIND) was initially discovered in two proteins, NLRP1 (aka NALP1, CARD7, NAC, DEFCAP) and CARD8 (aka TUCAN, Cardinal). NLRP1 is a member of the Nod-like receptor (NLR) protein superfamily and is involved in apoptosis and inflammation. To date, it is the only NLR protein known to have a FIIND domain. The FIIND domain is also present in the CARD8 protein where, like in NLRP1, it is followed by a C-terminal CARD domain. Both proteins are described to form an "inflammasome", a macro-molecular complex able to process caspase 1 and activate pro-IL1beta. The FIIND domain is present in only a very small subset of the kingdom of life, comprising primates, rodents (mouse, rat), carnivores (dog) and a few more, such as horse. The function of this domain is yet to be determined. Publications describing the newly discovered NLRP1 protein failed to identify it as a separate domain; for example, it was taken as part of the adjacent leucine rich repeat domain (LRR). Upon discovery of CARD8 it was noted that the N-terminal region shared significant sequence identity with an undescribed region in NLRP1. Before getting its final name, FIIND, this domain was termed NALP1-associated domain (NAD). 251
56230 404444 pfam13554 DUF4128 Bacteriophage related domain of unknown function. The three-dimensional structure of NP_888769.1, Structure 2L25, reveals a tail terminator protein gpU fold, which suggests that the protein could have a bacteriophage origin. 127
56231 404445 pfam13555 AAA_29 P-loop containing region of AAA domain. 61
56232 404446 pfam13556 HTH_30 PucR C-terminal helix-turn-helix domain. This helix-turn-helix domain is often found at the C-terminus of PucR-like transcriptional regulators such as Bacillus subtilis pucR and is likely to be DNA-binding. 55
56233 404447 pfam13557 Phenol_MetA_deg Putative MetA-pathway of phenol degradation. 238
56234 404448 pfam13558 SbcCD_C Putative exonuclease SbcCD, C subunit. Possible exonuclease SbcCD, C subunit, on AAA proteins. 90
56235 404449 pfam13559 DUF4129 Domain of unknown function (DUF4129). This presumed domain is found at the C-terminus of proteins that contain a transglutaminase core domain. The function of this domain is unknown. The domain has a conserved TXXE motif. 70
56236 404450 pfam13560 HTH_31 Helix-turn-helix domain. This domain is a helix-turn-helix domain that probably binds to DNA. 64
56237 404451 pfam13561 adh_short_C2 Enoyl-(Acyl carrier protein) reductase. 236
56238 404452 pfam13562 NTP_transf_4 Sugar nucleotidyl transferase. This is a probable sugar nucleotidyl transferase family. 147
56239 404453 pfam13563 2_5_RNA_ligase2 2'-5' RNA ligase superfamily. This family contains proteins related to pfam02834. These proteins are likely to be enzymes, but they may not share the RNA ligase activity. 152
56240 404454 pfam13564 DoxX_2 DoxX-like family. This family of uncharacterized proteins are related to DoxX pfam07681. 103
56241 404455 pfam13565 HTH_32 Homeodomain-like domain. 73
56242 404456 pfam13566 DUF4130 Domain of unknown function (DUF4130). 157
56243 379269 pfam13567 DUF4131 Domain of unknown function (DUF4131). This domain is frequently found to the N-terminus of the Competence domain, pfam03772. 165
56244 404457 pfam13568 OMP_b-brl_2 Outer membrane protein beta-barrel domain. This domain is found in a wide range of outer membrane proteins. This domain assumes a membrane bound beta-barrel fold. 175
56245 404458 pfam13569 DUF4132 Domain of unknown function (DUF4132). This domain might be involved in the biosynthesis of the molybdopterin cofactor in E.coli. 179
56246 404459 pfam13570 PQQ_3 PQQ-like domain. 38
56247 404460 pfam13571 DUF4133 Domain of unknown function (DUF4133). Based on Bacteroides thetaiotaomicron gene BT_0094, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 91
56248 404461 pfam13572 DUF4134 Domain of unknown function (DUF4134). Based on Bacteroides thetaiotaomicron gene BT_0095, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 92
56249 404462 pfam13573 SprB SprB repeat. This repeat occurs several times in SprB, a cell surface protein involved in gliding motility in the bacterium Flavobacterium johnsoniae. 37
56250 372637 pfam13574 Reprolysin_2 Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B. 193
56251 404463 pfam13575 DUF4135 Domain of unknown function (DUF4135). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 380 amino acids in length. The family is found in association with pfam05147. This domain may be involved in synthesis of a lantibiotic compound. 367
56252 404464 pfam13576 Pentapeptide_3 Pentapeptide repeats (9 copies). 48
56253 404465 pfam13577 SnoaL_4 SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold. 125
56254 404466 pfam13578 Methyltransf_24 Methyltransferase domain. This family appears to be a methyltransferase domain. 106
56255 404467 pfam13579 Glyco_trans_4_4 Glycosyl transferase 4-like domain. 158
56256 404468 pfam13580 SIS_2 SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. 138
56257 404469 pfam13581 HATPase_c_2 Histidine kinase-like ATPase domain. 126
56258 404470 pfam13582 Reprolysin_3 Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B. 122
56259 404471 pfam13583 Reprolysin_4 Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B. 203
56260 404472 pfam13584 BatD Oxygen tolerance. This family of proteins carries up to three membrane spanning regions and is involved in tolerance to oxygen in Bacteroides spp. 95
56261 404473 pfam13585 CHU_C C-terminal domain of CHU protein family. The function of this C-terminal domain is not known; there are several conserved tryptophan and asparagine residues. 85
56262 404474 pfam13586 DDE_Tnp_1_2 Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 90
56263 404475 pfam13588 HSDR_N_2 Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA. 110
56264 404476 pfam13589 HATPase_c_3 Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase. This family represents, additionally, the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90. 135
56265 404477 pfam13590 DUF4136 Domain of unknown function (DUF4136). This domain is found in bacterial lipoproteins. The function is not known. 112
56266 404478 pfam13591 MerR_2 MerR HTH family regulatory protein. 84
56267 404479 pfam13592 HTH_33 Winged helix-turn helix. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding. 60
56268 404480 pfam13593 SBF_like SBF-like CPA transporter family (DUF4137). These family members are 7TM putative membrane transporter proteins. The family is similar to the SBF family of bile-acid symporters, pfam01758. 313
56269 404481 pfam13595 DUF4138 Domain of unknown function (DUF4138). Based on Bacteroides thetaiotaomicron gene BT_4780, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 244
56270 404482 pfam13596 PAS_10 PAS domain. 106
56271 404483 pfam13597 NRDD Anaerobic ribonucleoside-triphosphate reductase. 554
56272 404484 pfam13598 DUF4139 Domain of unknown function (DUF4139). This family is usually found at the C-terminus of proteins. 317
56273 404485 pfam13599 Pentapeptide_4 Pentapeptide repeats (9 copies). 78
56274 404486 pfam13600 DUF4140 N-terminal domain of unknown function (DUF4140). This family is often found at the N-terminus of its member proteins, with DUF4139, pfam13598, at the C-terminus. 99
56275 404487 pfam13601 HTH_34 Winged helix DNA-binding domain. 80
56276 404488 pfam13602 ADH_zinc_N_2 Zinc-binding dehydrogenase. 132
56277 404489 pfam13603 tRNA-synt_1_2 Leucyl-tRNA synthetase, Domain 2. This is a family of the conserved region of Leucine-tRNA ligase or Leucyl-tRNA synthetase, EC:6.1.1.4. 184
56278 404490 pfam13604 AAA_30 AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. 191
56279 404491 pfam13605 DUF4141 Domain of unknown function (DUF4141). Based on Bacteroides thetaiotaomicron gene BT_4772, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture. 53
56280 404492 pfam13606 Ank_3 Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities. 30
56281 404493 pfam13607 Succ_CoA_lig Succinyl-CoA ligase like flavodoxin domain. This domain contains the catalytic domain from Succinyl-CoA ligase alpha subunit and other related enzymes. A conserved histidine is involved in phosphoryl transfer. 136
56282 290339 pfam13608 Potyvirid-P3 Protein P3 of Potyviral polyprotein. This is the P3 protein section of the Potyviridae polyproteins. The function is not known except that the protein is essential to viral survival. 452
56283 404494 pfam13609 Porin_4 Gram-negative porin. 310
56284 404495 pfam13610 DDE_Tnp_IS240 DDE domain. This DDE domain is found in a wide variety of transposases including those found in IS240, IS26 and IS6100. 138
56285 372647 pfam13611 Peptidase_S76 Serine peptidase of plant viral polyprotein, P1. This family is the P1 protein of the Potyviridae polyproteins that is a serine peptidase at the N-terminus. The catalytic triad in the genome polyprotein of ssRNA positive-strand Brome streak mosaic rymovirus, is His-311, Asp-322 and Ser-355. 119
56286 404496 pfam13612 DDE_Tnp_1_3 Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. 154
56287 404497 pfam13613 HTH_Tnp_4 Helix-turn-helix of DDE superfamily endonuclease. This domain is the probable DNA-binding region of transposase enzymes, necessary for efficient DNA transposition. Most of the members derive from the IS superfamily IS5 and rather fewer from IS4. 53
56288 404498 pfam13614 AAA_31 AAA domain. This family includes a wide variety of AAA domains including some that have lost essential nucleotide binding residues in the P-loop. 177
56289 404499 pfam13616 Rotamase_3 PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline. 116
56290 404500 pfam13617 Lipoprotein_19 YnbE-like lipoprotein. This family includes lipoproteins similar to E. coli YnbE. Proteins in this family are typically 60 amino acids in length and contain an N-terminal lipid attachment site, which has been included in the alignment to increase sensitivity. The specific function of these proteins is unknown. 34
56291 404501 pfam13618 Gluconate_2-dh3 Gluconate 2-dehydrogenase subunit 3. This family corresponds to subunit 3 of the Gluconate 2-dehydrogenase enzyme that catalyzes the conversion of gluconate to 2-dehydro-D-gluconate EC:1.1.99.3. 134
56292 404502 pfam13619 KTSC KTSC domain. This short domain is named after Lysine tRNA synthetase C-terminal domain. It is found at the C-terminus of some Lysyl tRNA synthetases as well as a single domain in bacterial proteins. The domain is about 60 amino acids in length and contains a reasonably conserved YXY motif in the centre of the sequence. The function of this domain is unknown but it could be an RNA binding domain. 58
56293 404503 pfam13620 CarboxypepD_reg Carboxypeptidase regulatory-like domain. 81
56294 404504 pfam13621 Cupin_8 Cupin-like domain. This cupin like domain shares similarity to the JmjC domain. 251
56295 404505 pfam13622 4HBT_3 Thioesterase-like superfamily. This family contains a wide variety of enzymes, principally thioesterases. These enzymes are part of the Hotdog fold superfamily. 240
56296 404506 pfam13623 SurA_N_2 SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment. 139
56297 404507 pfam13624 SurA_N_3 SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment. 162
56298 404508 pfam13625 Helicase_C_3 Helicase conserved C-terminal domain. This domain family is found in a wide variety of helicases and helicase-related proteins. 121
56299 404509 pfam13627 LPAM_2 Prokaryotic lipoprotein-attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached. 23
56300 404510 pfam13628 DUF4142 Domain of unknown function (DUF4142). This is a bacterial family of unknown function. 138
56301 404511 pfam13629 T2SS-T3SS_pil_N Pilus formation protein N terminal region. 72
56302 404512 pfam13630 SdpI SdpI/YhfL protein family. This family of proteins includes the SdpI and YhfL proteins from B. subtilis. The SdpI protein is a multipass integral membrane protein that protects toxin-producing cells from being killed. Killing is mediated by the exported toxic protein SdpC an extracellular protein that induces the synthesis of an immunity protein. 71
56303 379304 pfam13631 Cytochrom_B_N_2 Cytochrome b(N-terminal)/b6/petB. 169
56304 404513 pfam13632 Glyco_trans_2_3 Glycosyl transferase family group 2. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis. 194
56305 404514 pfam13634 Nucleoporin_FG Nucleoporin FG repeat region. This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145. 90
56306 404515 pfam13635 DUF4143 Domain of unknown function (DUF4143). This domain is almost always found C-terminal to an ATPase core family. 160
56307 404516 pfam13636 Methyltranf_PUA RNA-binding PUA-like domain of methyltransferase RsmF. Methyltranf_PUA is the second of two C-terminal domains found on bacterial methyltransferase RsmF that modifies the 16S ribosomal RNA. It has some structural similarity to the RNA-binding PUA domains suggesting that it is involved in RNA recognition. It lies downstream of the catalytic centre of this methyltransferase, family pfam01189. 50
56308 372654 pfam13637 Ank_4 Ankyrin repeats (many copies). 54
56309 404517 pfam13638 PIN_4 PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases). 131
56310 404518 pfam13639 zf-RING_2 Ring finger domain. 44
56311 404519 pfam13640 2OG-FeII_Oxy_3 2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. 94
56312 404520 pfam13641 Glyco_tranf_2_3 Glycosyltransferase like family 2. Members of this family of prokaryotic proteins include putative glucosyltransferase, which are involved in bacterial capsule biosynthesis. 230
56313 404521 pfam13642 DUF4144 protein structure with unknown function. A family based on the three-dimensional structure of YP_926445.1 (Structure 2L6O) 95
56314 404522 pfam13643 DUF4145 Domain of unknown function (DUF4145). This domain is found in a variety of restriction endonuclease enzymes. The exact function of this domain is uncertain. 88
56315 404523 pfam13644 DKNYY DKNYY family. This family represents a group of proteins found enriched in fusobacteria. These proteins contain many repeats of a DKNXXYY motif. The repeats are spaced at about 35 amino acid residues intervals. These proteins are likely to be associated with the membrane. The specific function of these proteins is unknown. 150
56316 404524 pfam13645 YkuD_2 L,D-transpeptidase catalytic domain. This family is related to pfam03734. 169
56317 404525 pfam13646 HEAT_2 HEAT repeats. This family includes multiple HEAT repeats. 88
56318 404526 pfam13647 Glyco_hydro_80 Glycosyl hydrolase family 80 of chitosanase A. This is a small family of bacterial chitosanases. These have lysozyme-like activity. 307
56319 404527 pfam13648 Lipocalin_4 Lipocalin-like domain. 89
56320 404528 pfam13649 Methyltransf_25 Methyltransferase domain. This family appears to be a methyltransferase domain. 97
56321 404529 pfam13650 Asp_protease_2 Aspartyl protease. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. 90
56322 404530 pfam13651 EcoRI_methylase Adenine-specific methyltransferase EcoRI. This methylase recognizes the double-stranded sequence GAATTC, causes specific methylation on A-3 on both strands, and protects the DNA from cleavage by the EcoRI endonuclease. 343
56323 404531 pfam13652 QSregVF Putative quorum-sensing-regulated virulence factor. This is a family of short ~14 kDa proteins from Pseudomonas. The structure of UniProtKB:Q9HY15 a secreted protein has been solved and deposited as Structure 3npd. It comprises one structural domain with five beta-strands and five alpha-helices. Various comparative structural prediction methods plus its genomic location point to the protein forming a functional dimer with its adjacent genomic partner, UniProtKB:Q9HY14, in pfam12843. Together these might be regulated by the other product from the PotABCD operon, namely the putrescine-binding periplasmic protein UniProtKB:Q9HY16, which has been implicated in quorum-sensing. QSregVF is certainly up-regulated in quorum-sensing, and is predicted to be a virulence factor. 110
56324 404532 pfam13653 GDPD_2 Glycerophosphoryl diester phosphodiesterase family. This family also includes glycerophosphoryl diester phosphodiesterases as well as agrocinopine synthase, the similarity to GDPD has been noted. This family appears to have weak but not significant matches to mammalian phospholipase C pfam00388, which suggests that this family may adopt a TIM barrel fold. 30
56325 404533 pfam13654 AAA_32 AAA domain. This family includes a wide variety of AAA domains including some that have lost essential nucleotide binding residues in the P-loop. 514
56326 404534 pfam13655 RVT_N N-terminal domain of reverse transcriptase. This domain is found at the N-terminus of bacterial reverse transcriptases. 83
56327 404535 pfam13656 RNA_pol_L_2 RNA polymerase Rpb3/Rpb11 dimerization domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp. Many of the members of this family carry only the N-terminal region of Rpb11. 75
56328 404536 pfam13657 Couple_hipA HipA N-terminal domain. This domain is found to the N-terminus of HipA-like proteins. It is also found in isolation in some proteins. 95
56329 404537 pfam13660 DUF4147 Domain of unknown function (DUF4147). This domain is frequently found at the N-terminus of proteins carrying the glycerate kinase-like domain MOFRL, pfam05161. 231
56330 379319 pfam13661 2OG-FeII_Oxy_4 2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. 93
56331 404538 pfam13662 Toprim_4 Toprim domain. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation. 83
56332 404539 pfam13663 DUF4148 Domain of unknown function (DUF4148). 62
56333 404540 pfam13664 DUF4149 Domain of unknown function (DUF4149). 101
56334 404541 pfam13665 DUF4150 Domain of unknown function (DUF4150). 108
56335 404542 pfam13667 ThiC-associated ThiC-associated domain. This domain is most frequently found at the N-terminus of the ThiC family of proteins, pfam01964. The function is not known. 71
56336 404543 pfam13668 Ferritin_2 Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins. 138
56337 404544 pfam13669 Glyoxalase_4 Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. 109
56338 404545 pfam13670 PepSY_2 Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). The name is derived from Peptidase & Bacillus subtilis YPEB. 83
56339 404546 pfam13671 AAA_33 AAA domain. This family of domains contain only a P-loop motif, that is characteristic of the AAA superfamily. Many of the proteins in this family are just short fragments so there is no Walker B motif. 143
56340 404547 pfam13672 PP2C_2 Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase. 209
56341 404548 pfam13673 Acetyltransf_10 Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions such as Elp3-related proteins. 128
56342 404549 pfam13675 PilJ Type IV pili methyl-accepting chemotaxis transducer N-term. This domain is found on many type IV pili methyl-accepting chemotaxis transducer proteins where there is also a HAMP, signature towards the C-terminus. 112
56343 404550 pfam13676 TIR_2 TIR domain. This is a family of bacterial Toll-like receptors. 119
56344 404551 pfam13677 MotB_plug Membrane MotB of proton-channel complex MotA/MotB. This is the MotB member of the E.coli MotA/MotB proton-channel complex that forms the stator of the bacterial membrane flagellar motor. Key residues act as a plug to prevent premature proton flow. The plug is in the periplasm just C-terminal to the MotB TM, consisting of an amphipathic alpha helix flanked by Pro-52 and Pro-65. In addition to the Pro residues, Ile-58, Tyr-61, and Phe-62 are also essential for plug function. 58
56345 372668 pfam13678 Peptidase_M85 NFkB-p65-degrading zinc protease. This family of bacterial metallo-peptidases is thought to compromise the inflammatory response by degrading p65 thereby down-regulating the NF-kappaB signalling pathway. NF-kappa-B is a pleiotropic transcription factor which is present in almost all cell types and is involved in many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. NF-kappa-B is a homo- or heterodimeric complex formed by the Rel-like domain-containing proteins RELA/p65, RELB, NFKB1/p105, NFKB1/p50, REL and NFKB2/p52; and the heterodimeric p65-p50 complex appears to be the most abundant one. 251
56346 379330 pfam13679 Methyltransf_32 Methyltransferase domain. This family appears to be a methyltransferase domain. 138
56347 404552 pfam13680 DUF4152 Protein of unknown function (DUF4152). This family of proteins is functionally uncharacterized. This family of proteins is found in archaea. Proteins in this family are approximately 230 amino acids in length. The structure of PF2046 from Pyrococcus furiosus has been solved. It shows an RNaseH like fold that conserves critical catalytic residues. This suggests that these proteins may cleave nucleic acid. 224
56348 404553 pfam13681 PilX Type IV pilus assembly protein PilX C-term. This family is likely to be the C-terminal region of type IV pilus assembly PilX or PilW proteins. 92
56349 404554 pfam13682 CZB Chemoreceptor zinc-binding domain. The chemoreceptor zinc-binding domain (CZB) is found in bacterial signal transduction proteins - most frequently receptors involved in chemotaxis and motility, but also in c-di-GMP signalling and nitrate/nitrite-sensing. Originally discovered in the cytoplasmic chemoreceptor TlpD from Helicobacter pylori, it is often found C-terminal to the MCPsignal domain in cytoplasmic chemoreceptor proteins. The CZB domain contains a core sequence motif, Hxx[WFYL]x21-28Cx[LFMVI]Gx[WFLVI]x18-27HxxxH. The highly-conserved H-C-H-H residues of this motif are believed to coordinate zinc; mutating the latter two histidines of the motif to alanines abolishes Zn binding. This domain binds zinc with high affinity, with a Kd in the femtomolar range. This domain has been shown in E. coli to be a zinc sensor that regulates the catalytic activity of pfam00990. 64
56350 404555 pfam13683 rve_3 Integrase core domain. 67
56351 404556 pfam13684 Dak1_2 Dihydroxyacetone kinase family. This is the kinase domain of the dihydroxyacetone kinase family. 313
56352 404557 pfam13685 Fe-ADH_2 Iron-containing alcohol dehydrogenase. 251
56353 404558 pfam13686 DrsE_2 DsrE/DsrF/DrsH-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. The family also includes YrkE proteins. 156
56354 404559 pfam13687 DUF4153 Domain of unknown function (DUF4153). Members of this family are annotated as putative inner membrane proteins. 216
56355 372673 pfam13688 Reprolysin_5 Metallo-peptidase family M12. 191
56356 404560 pfam13689 DUF4154 Domain of unknown function (DUF4154). This family of proteins is found in bacteria. Proteins in this family are typically between 172 and 207 amino acids in length. Many members are annotated as valyl-tRNA synthetase but this could not be confirmed. 140
56357 404561 pfam13690 CheX Chemotaxis phosphatase CheX. CheX is very closely related to the CheC chemotaxis phosphatase, but it dimerizes in a different way, via a continuous beta sheet between the subunits. CheC and CheX both dephosphorylate CheY, although CheC requires binding of CheD to achieve the activity of CheX. The ability of bacteria to modulate their swimming behaviour in the presence of external chemicals (nutrients and repellents) is one of the most rudimentary behavioural responses known, but the individual components are very sensitively tuned. 94
56358 404562 pfam13691 Lactamase_B_4 tRNase Z endonuclease. This is a family of tRNase Z enzymes that are closely related structurally to the Lactamase_B family members. tRNase Z is the endonuclease that is involved in tRNA 3'-end maturation through removal of the 3'-trailer sequences from tRNA precursors. The fission yeast Schizosaccharomyces pombe contains two candidate tRNase Zs encoded by two essential genes. The first, trz1, is targeted to the nucleus and has an SV40 nuclear localization signal at its N-terminus, consisting of four consecutive arginine and lysine residues between residues 208 and 211 (KKRK) that is critical for the NLS function. The second, trz2, is targeted to the mitochondria, with an N-terminal mitochondrial targeting signal within the first 38 residues. 63
56359 404563 pfam13692 Glyco_trans_1_4 Glycosyl transferases group 1. 138
56360 404564 pfam13693 HTH_35 Winged helix-turn-helix DNA-binding. 70
56361 404565 pfam13694 Hph Sec63/Sec62 complex-interacting family. This is a family of closely related Hph proteins that are integral endoplasmic reticulum (ER) membrane proteins required for yeast survival under environmental stress conditions. They interact with several subunits of the Sec63/Sec62 complex that mediates post-translational translocation of proteins into the ER. Cells with mutant Hph1 and Hph2 proteins revealed phenotypes resembling those of mutants defective for vacuolar proton ATPase (V-ATPase) activity. The yeast V-ATPase is a multisubunit complex whose function, structure, and assembly have been well characterized. Cells with impaired V-ATPase activity fail to acidify the vacuole, cannot grow at alkaline pH, and are sensitive to high concentrations of extracellular calcium. 187
56362 404566 pfam13695 zf-3CxxC Zinc-binding domain. This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. 95
56363 404567 pfam13696 zf-CCHC_2 Zinc knuckle. This is a zinc-binding domain of the form CxxCxxxGHxxxxC from a variety of different species. 21
56364 404568 pfam13698 DUF4156 Domain of unknown function (DUF4156). The function of this family is unknown but members are annotated as putative lipoprotein outer membrane proteins. 89
56365 404569 pfam13699 DUF4157 Domain of unknown function (DUF4157). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. This domain contains an HEXXH motif that is characteristic of many families of metallopeptidases. However, no peptidase activity has been shown for this domain. 79
56366 404570 pfam13700 DUF4158 Domain of unknown function (DUF4158). The exact function of this domain is not clear, but it frequently occurs as an N-terminal region of transposase 3 or IS3 family of insertion elements. 166
56367 404571 pfam13701 DDE_Tnp_1_4 Transposase DDE domain group 1. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 433
56368 404572 pfam13702 Lysozyme_like Lysozyme-like. 165
56369 404573 pfam13704 Glyco_tranf_2_4 Glycosyl transferase family 2. Members of this family of prokaryotic proteins include putative glucosyltransferases, 97
56370 404574 pfam13705 TRC8_N TRC8 N-terminal domain. This region is found at the N-terminus of the TRC8 protein. TRC8 is an E3 ubiquitin-protein ligase also known as RNF139. This region contains 12 transmembrane domains. This region has been suggested to contain a sterol sensing domain. It has been found that TRC8 protein levels are sterol responsive and that it binds and stimulates ubiquitylation of the endoplasmic reticulum anchor protein INSIG. 491
56371 404575 pfam13707 RloB RloB-like protein. This family includes the RloB protein that is found within a bacterial restriction modification operon. This family includes the AbiLii protein that is found as part of a plasmid encoded phage abortive infection mechanism. Deletion within abiLii abolished the phage resistance. The family includes some proteins annotated as CRISPR Csm2 proteins. 152
56372 404576 pfam13708 DUF4942 Domain of unknown function (DUF4942). The function of this family is not known. 187
56373 404577 pfam13709 DUF4159 Domain of unknown function (DUF4159). Members of this family are hypothetical proteins. 191
56374 404578 pfam13710 ACT_5 ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains. These ACT domains are found at the C-terminus of the RelA protein. 61
56375 404579 pfam13711 DUF4160 Domain of unknown function (DUF4160). 61
56376 404580 pfam13712 Glyco_tranf_2_5 Glycosyltransferase like family. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis. 210
56377 404581 pfam13713 BRX_N Transcription factor BRX N-terminal domain. The BREVIS RADIX (BRX) domain was characterized as being a transcription factor in plants regulating the extent of cell proliferation and elongation in the growth zone of the root. BRX is rate limiting for auxin-responsive gene-expression by mediating cross-talk with the brassino-steroid pathway. BRX has a ubiquitous, although quantitatively variable role in modulating the growth rate in both the root and the shoot. This family features a short region, also alpha-helical, N-terminal to the repeated alpha-helices of family BRX, pfam08381. BRX is expressed in the vasculature and is rate-limiting for transcriptional auxin action. 37
56378 404582 pfam13714 PEP_mutase Phosphoenolpyruvate phosphomutase. This domain includes the enzyme Phosphoenolpyruvate phosphomutase (EC:5.4.2.9). This protein has been characterized as catalyzing the formation of a carbon-phosphorus bond by converting phosphoenolpyruvate (PEP) to phosphonopyruvate (P-Pyr). This enzyme has a TIM barrel fold. 241
56379 404583 pfam13715 CarbopepD_reg_2 CarboxypepD_reg-like domain. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam07715 and pfam00593. 88
56380 404584 pfam13716 CRAL_TRIO_2 Divergent CRAL/TRIO domain. This family includes divergent members of the CRAL-TRIO domain family. This family includes ECM25 that contains a divergent CRAL-TRIO domain identified by Gallego and colleagues. 140
56381 404585 pfam13717 zinc_ribbon_4 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773. 36
56382 404586 pfam13718 GNAT_acetyltr_2 GNAT acetyltransferase 2. This domain has N-acetyltransferase activity. It has a GCN5-related N-acetyltransferase (GNAT) fold. 228
56383 316258 pfam13719 zinc_ribbon_5 zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773. 37
56384 404587 pfam13720 Acetyltransf_11 Udp N-acetylglucosamine O-acyltransferase; Domain 2. This is domain 2, or the C-terminal domain, of Udp N-acetylglucosamine O-acyltransferase. This enzyme is a zinc-dependent enzyme that catalyzes the deacetylation of UDP-3-O-((R)-3-hydroxymyristoyl)-N-acetylglucosamine to form UDP-3-O-(R-hydroxymyristoyl)glucosamine and acetate. 82
56385 404588 pfam13721 SecD-TM1 SecD export protein N-terminal TM region. This domain appears to be the first transmembrane region of the SecD export protein. SecD is directly involved in protein secretion and important for the release of proteins that have been translocated across the cytoplasmic membrane. 103
56386 404589 pfam13722 CstA_5TM 5TM C-terminal transporter carbon starvation CstA. CstA_5TM is the last five transmembrane regions of the peptide transporter carbon starvation family CstA. 118
56387 404590 pfam13723 Ketoacyl-synt_2 Beta-ketoacyl synthase, N-terminal domain. 226
56388 404591 pfam13724 DNA_binding_2 DNA-binding domain. This domain, often found on ovate proteins, binds to single-stranded and double-stranded DNA. Binding to DNA is not sequence-specific. 48
56389 404592 pfam13725 tRNA_bind_2 Possible tRNA binding domain. This domain, found at the C-terminus of tRNA(Met) cytidine acetyltransferase, may be involved in tRNA-binding. 231
56390 404593 pfam13726 Na_H_antiport_2 Na+-H+ antiporter family. This family includes integral membrane proteins, some of which are NA+-H+ antiporters. 88
56391 379351 pfam13727 CoA_binding_3 CoA-binding domain. 175
56392 404594 pfam13728 TraF F plasmid transfer operon protein. TraF protein undergoes proteolytic processing associated with export. The 19 amino acids at the amino terminus of the polypeptides appear to constitute a typical membrane leader peptide - not included in this family, while the remainder of the molecule is predicted to be primarily hydrophilic in character. F plasmid TraF and TraH are required for F pilus assembly and F plasmid transfer, and they are both localized to the outer membrane in the presence of the complete F transfer region, especially TraV, the putative anchor. 224
56393 404595 pfam13729 TraF_2 F plasmid transfer operon, TraF, protein. 268
56394 404596 pfam13730 HTH_36 Helix-turn-helix domain. 55
56395 404597 pfam13731 WxL WxL domain surface cell wall-binding. The WxL motif appears in two or three copies in these bacterial proteins and confers a cell surface localization function. It seems likely that this region is the cell wall-binding domain of gram-positive bacteria, and may interact with the peptidoglycan. 205
56396 404598 pfam13732 DUF4162 Domain of unknown function (DUF4162). This domain is found at the C-terminus of bacterial ABC transporter proteins. The function is not known. 82
56397 404599 pfam13733 Glyco_transf_7N N-terminal region of glycosyl transferase group 7. This is the N-terminal half of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferase activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction. 133
56398 404600 pfam13734 Inhibitor_I69 Spi protease inhibitor. This family includes the inhibitor Spi and the pro-peptides of streptopain (SpeB). SpeB is produced as a 43 kDa pre-pro-protein, which is secreted via the recently described Sec secretory pathway Exportal. There is tight coupling between this inhibitor and its associated protease: the gene for the inhibitor Spi is located directly downstream from the gene for the streptococcal cysteine protease SpeB, and the sequence of the inhibitor is very similar to that of the SpeB propeptide. This is an example of an inhibitor molecule that is a structural homolog of the cognate propeptide, and is genetically linked to the protease gene. 98
56399 404601 pfam13735 tRNA_NucTran2_2 tRNA nucleotidyltransferase domain 2 putative. 149
56400 404602 pfam13737 DDE_Tnp_1_5 Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 112
56401 404603 pfam13738 Pyr_redox_3 Pyridine nucleotide-disulphide oxidoreductase. 296
56402 404604 pfam13739 DUF4163 Domain of unknown function (DUF4163). The structure of this domain is an alpha-beta two-layer sandwich, identified from a Fervidobacterium nodosum Rt17-B1 like protein. The function is not known except that it is found in association with Heat-shock cognate 70kd protein 44kd ATPase, pfam11738. 94
56403 404605 pfam13740 ACT_6 ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains. 76
56404 404606 pfam13741 MRP-S25 Mitochondrial ribosomal protein S25. This is the family of fungal 37S mitochondrial ribosomal S25 proteins. 221
56405 404607 pfam13742 tRNA_anti_2 OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. 95
56406 404608 pfam13743 Thioredoxin_5 Thioredoxin. 176
56407 404609 pfam13744 HTH_37 Helix-turn-helix domain. Members of this family contain a DNA-binding helix-turn-helix domain. 80
56408 404610 pfam13746 Fer4_18 4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich. 114
56409 404611 pfam13747 DUF4164 Domain of unknown function (DUF4164). This is a family of short, approx 100 residue-long, bacterial proteins of unknown function. There are several conserved LE/LD sequence pairs. 89
56410 404612 pfam13748 ABC_membrane_3 ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. 237
56411 404613 pfam13749 HATPase_c_4 Putative ATP-dependent DNA helicase recG C-terminal. This domain may well interact selectively and non-covalently with ATP, adenosine 5'-triphosphate, a universally important coenzyme and enzyme regulator. 88
56412 404614 pfam13750 Big_3_3 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins. 157
56413 404615 pfam13751 DDE_Tnp_1_6 Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 125
56414 404616 pfam13752 DUF4165 Domain of unknown function (DUF4165). 119
56415 404617 pfam13753 SWM_repeat Putative flagellar system-associated repeat. This family appears to be a repeated unit that can occur up to 29 times in these outer membrane proteins. It is putatively associated with a novel flagellar system. 87
56416 404618 pfam13754 Big_3_4 Domain of unknown function. This is a family of uncharacterized Clostridiales proteins. 105
56417 404619 pfam13755 Sensor_TM1 Sensor N-terminal transmembrane domain. This domain is found at the N-terminus of the sensor component of the two-component regulatory system. It includes a transmembrane region and part of the periplasmic region, which is likely to be involved in stimulus sensing. 68
56418 404620 pfam13756 Stimulus_sens_1 Stimulus-sensing domain. This domain is found in the periplasmic region of the sensor component of the two-component regulatory system. The periplasmic region is likely to be involved in stimulus sensing. 110
56419 404621 pfam13757 VIT_2 Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. 78
56420 372710 pfam13758 Prefoldin_3 Prefoldin subunit. This family includes prefoldin subunits that are not detected by pfam02996. 99
56421 404622 pfam13759 2OG-FeII_Oxy_5 Putative 2OG-Fe(II) oxygenase. This family has structural similarity to the 2OG-Fe(II) oxygenase superfamily. 101
56422 404623 pfam13761 DUF4166 Domain of unknown function (DUF4166). This domain is often found at the C-terminus of proteins containing pfam03435. 177
56423 372712 pfam13762 MNE1 Mitochondrial splicing apparatus component. MNE1 is a novel component of the mitochondrial splicing apparatus responsible for the processing of a COX1 group I intron in yeast. Yeast cells lacking MNE1 are deficient in intron splicing in the gene encoding the Cox1 subunit of cytochrome oxidase but do contain wild-type levels of the bc1 complex. 141
56424 404624 pfam13763 DUF4167 Domain of unknown function (DUF4167). 75
56425 404625 pfam13764 E3_UbLigase_R4 E3 ubiquitin-protein ligase UBR4. This is a family of E3 ubiquitin ligase enzymes. 807
56426 404626 pfam13765 PRY SPRY-associated domain. SPRY and PRY domains occur on PYRIN proteins. Their function is not known. 49
56427 404627 pfam13767 DUF4168 Domain of unknown function (DUF4168). 78
56428 372716 pfam13768 VWA_3 von Willebrand factor type A domain. 155
56429 404628 pfam13769 Virulence_fact Virulence factor. This domain is found in conserved virulence factors. It is often found in association with pfam02985 and pfam08712. 81
56430 404629 pfam13770 DUF4169 Domain of unknown function (DUF4169). 54
56431 404630 pfam13771 zf-HC5HC2H PHD-like zinc-binding domain. The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH. 88
56432 372719 pfam13772 AIG2_2 AIG2-like family. This family is found in bacteria and metazoa. 83
56433 404631 pfam13773 DUF4170 Domain of unknown function (DUF4170). 68
56434 404632 pfam13774 Longin Regulated-SNARE-like domain. Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain. 78
56435 404633 pfam13775 DUF4171 Domain of unknown function (DUF4171). This short family is frequently found at the N-terminus of Homeobox proteins. 127
56436 404634 pfam13776 DUF4172 Domain of unknown function (DUF4172). The family is often found in association with pfam02661. 82
56437 404635 pfam13777 DUF4173 Domain of unknown function (DUF4173). This domain of unknown function contains multiple predicted transmembrane domains. 188
56438 404636 pfam13778 DUF4174 Domain of unknown function (DUF4174). This domain of unknown function is found in a putative tumor suppressor gene and in a ligand for the urokinase-type plasminogen activator receptor, which plays a role in cellular migration and adhesion. 120
56439 404637 pfam13779 DUF4175 Domain of unknown function (DUF4175). 818
56440 404638 pfam13780 DUF4176 Domain of unknown function (DUF4176). 74
56441 404639 pfam13781 DoxX_3 DoxX-like family. This family of uncharacterized proteins is related to DoxX, pfam07681. 101
56442 404640 pfam13782 SpoVAB Stage V sporulation protein AB. This family of proteins is required for sporulation. 109
56443 404641 pfam13783 DUF4177 Domain of unknown function (DUF4177). 65
56444 404642 pfam13784 Fic_N Fic/DOC family N-terminal. This domain is found at the N-terminus of the Fic/DOC family, pfam02661. 79
56445 404643 pfam13785 DUF4178 Domain of unknown function (DUF4178). 143
56446 404644 pfam13786 DUF4179 Domain of unknown function (DUF4179). 93
56447 404645 pfam13787 HXXEE Protein of unknown function with HXXEE motif. This domain contains an HXXEE motif, another conserved histidine and a YXPG motif. Its function is unknown. 100
56448 404646 pfam13788 DUF4180 Domain of unknown function (DUF4180). 108
56449 372726 pfam13789 DUF4181 Domain of unknown function (DUF4181). 108
56450 372727 pfam13790 SR1P SR1 protein. This family of proteins is encoded by the dual function SR1 RNA. SR1 is a sRNA which regulates arginine metabolism, it also encodes a short protein that binds to glyceraldehyde-3-phosphate dehydrogenase (GapA) and stabilizes the gapA operon mRNAs. 37
56451 404647 pfam13791 Sigma_reg_C Sigma factor regulator C-terminal. This family is the C-terminal domain of a sigma factor regulator, this may represent a sensory domain. 156
56452 404648 pfam13793 Pribosyltran_N N-terminal domain of ribose phosphate pyrophosphokinase. This family is frequently found N-terminal to the Pribosyltran, pfam00156. 117
56453 205967 pfam13794 MiaE_2 tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE)-like. 185
56454 404649 pfam13795 HupE_UreJ_2 HupE / UreJ protein. These proteins contain many conserved histidines that may be involved in nickel binding. 153
56455 404650 pfam13796 Sensor Putative sensor. This family is often found at the N-terminus of proteins containing pfam07730 and pfam02518. The N-termini of proteins containing these two domains often function in stimulus sensing. 179
56456 404651 pfam13797 Post_transc_reg Post-transcriptional regulator. This family includes post-transcriptional regulators. 83
56457 404652 pfam13798 PCYCGC Protein of unknown function with PCYCGC motif. This domain contains a PCYCGC motif and four other conserved cysteines. Its function is unknown. 153
56458 404653 pfam13799 DUF4183 Domain of unknown function (DUF4183). This domain of unknown function contains a highly conserved ING motif. 74
56459 404654 pfam13800 Sigma_reg_N Sigma factor regulator N-terminal. This domain is found near the N-terminus of a sigma factor regulator. The N-terminus is responsible for interaction with the sigma factor. 89
56460 404655 pfam13801 Metal_resist Heavy-metal resistance. This is a metal-binding protein which is involved in resistance to heavy-metal ions. The protein forms a four-helix hooked hairpin, consisting of two long alpha helices each flanked by a shorter alpha helix. It binds a metal ion in a type-2 like centre. It contains two copies of an LTXXQ motif. 119
56461 404656 pfam13802 Gal_mutarotas_2 Galactose mutarotase-like. This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily. 67
56462 404657 pfam13803 DUF4184 Domain of unknown function (DUF4184). This domain of unknown function contains several highly conserved histidines. 230
56463 290518 pfam13804 HERV-K_env_2 Retro-transcribing viruses envelope glycoprotein. This family comes from human endogenous retrovirus K envelope glycoproteins. 169
56464 404658 pfam13805 Pil1 Eisosome component PIL1. In the budding yeast, S. cerevisiae, Pil1 and another cytoplasmic protein, Lsp1, together form large immobile assemblies at the plasma membrane that mark sites for endocytosis, called eisosomes. Endocytosis functions to recycle plasma membrane components, to regulate cell-surface expression of signalling receptors and to internalize nutrients in all eukaryotic cells. 265
56465 404659 pfam13806 Rieske_2 Rieske-like [2Fe-2S] domain. 103
56466 379381 pfam13807 GNVR G-rich domain on putative tyrosine kinase. This domain is found between two families, Wzz, pfam02706 and CbiA pfam01656. There is a highly conserved GNVR sequence motif which characterizes this domain. The function is not known. 82
56467 404660 pfam13808 DDE_Tnp_1_assoc DDE_Tnp_1-associated. This domain is frequently found N-terminal to the transposase, IS family DDE_Tnp_1, pfam01609 and its relatives. 88
56468 404661 pfam13809 Tubulin_2 Tubulin like. Many of the residues conserved in Tubulin, pfam00091, are also highly conserved in this family. 337
56469 404662 pfam13810 DUF4185 Domain of unknown function (DUF4185). 308
56470 404663 pfam13811 DUF4186 Domain of unknown function (DUF4186). 109
56471 316342 pfam13812 PPR_3 Pentatricopeptide repeat domain. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. In the case of the Arabidopsis protein UniProtKB:Q66GI4, the repeated helices in this N-terminal region, of protein-only RNase P (PRORP) enzymes, form the pentatricopeptide repeat (PPR) domain which enhances pre-tRNA binding affinity. PRORP enzymes process precursor tRNAs in human mitochondria and in all tRNA-using compartments of Arabidopsis thaliana. 63
56472 372734 pfam13813 MBOAT_2 Membrane bound O-acyl transferase family. 84
56473 404664 pfam13814 Replic_Relax Replication-relaxation. This family includes proteins which are essential for plasmid replication and plasmid DNA relaxation. 190
56474 372735 pfam13815 Dzip-like_N Iguana/Dzip1-like DAZ-interacting protein N-terminal. The DAZ gene-product - Deleted in Azoospermia - and a closely related sequence are required early in germ-cell development in order to maintain germ-cell populations. This family is the N-terminal region that is the only part of the protein in some fungi and lower metazoa. 118
56475 404665 pfam13816 Dehydratase_hem Haem-containing dehydratase. This family includes aldoxime dehydratase, EC:4.99.1.5. This is a haem-containing enzyme, which catalyzes the dehydration of aldoximes to their corresponding nitrile. It also includes phenylacetaldoxime dehydratase, EC:4.99.1.7. This haem-containing enzyme catalyzes the dehydration of Z-phenylacetaldoxime to phenylacetonitrile. The enzyme forms an elliptic beta barrel, composed of eight beta-strands, flanked by alpha-helices. 307
56476 404666 pfam13817 DDE_Tnp_IS66_C IS66 C-terminal element. 39
56477 372737 pfam13820 Nucleic_acid_bd Putative nucleic acid-binding region. This is a family of putative nucleic acid-binding proteins. Several members are annotated as being nuclear receptor coactivator 6 proteins but this could not be confirmed. 143
56478 404667 pfam13821 DUF4187 Domain of unknown function (DUF4187). This family is found at the very C-terminus of proteins that carry a G-patch domain, pfam01585. The domain is short and cysteine-rich. 50
56479 404668 pfam13822 ACC_epsilon Acyl-CoA carboxylase epsilon subunit. This family includes the epsilon subunits of propionyl-CoA carboxylase, EC:6.4.1.3, and acetyl-CoA carboxylase, EC:6.4.1.2. These enzymes are involved in the biosynthesis of long-chain fatty acids. The epsilon subunit is necessary for an efficient interaction between the alpha and beta subunits of these enzymes. 60
56480 404669 pfam13823 ADH_N_assoc Alcohol dehydrogenase GroES-associated. This short domain is frequently found at the N-terminus of the alcohol dehydrogenase GroES-like domain, pfam08240. 23
56481 404670 pfam13824 zf-Mss51 Zinc-finger of mitochondrial splicing suppressor 51. Mss51 regulates the expression of cytochrome oxidase, so this domain is probably DNA-binding. 54
56482 372740 pfam13825 Paramyxo_PNT Paramyxovirus structural protein V/P N-terminus. This family consists of several Paramyxoviridae structural protein P and V sequences. From a structural point of view, P is the best-characterized protein of the replicative complex. P is organized into two moieties that are functionally and structurally distinct: a C-terminal moiety (PCT) and an N-terminal moiety (PNT). PCT is the most conserved in sequence and contains all regions required for virus transcription, whereas PNT, which is poorly conserved, provides several additional functions required for replication. P protein plays a crucial role in the enzyme by positioning L onto the N/RNA template through an interaction with the C-terminal domain of N. Without P, L is not functional. The N, P, and L proteins of SeV and measles and mumps viruses are functionally equivalent. However, sequence identity between proteins from these viruses is limited, and the viruses have been placed in different genera (Respirovirus, Morbillivirus, and Rubulavirus, respectively). SeV P protein (568 aa) is a modular protein with distinct functional domains. The N-terminal part of P (PNT) is a chaperone for N and prevents it from binding to non-viral RNA in the infected cell. 309
56483 404671 pfam13826 DUF4188 Domain of unknown function (DUF4188). 117
56484 404672 pfam13827 DUF4189 Domain of unknown function (DUF4189). This domain of unknown function contains six well-conserved cysteine residues. 97
56485 404673 pfam13828 DUF4190 Domain of unknown function (DUF4190). This integral membrane domain is functionally uncharacterized. One of the membrane helices contains two GXXG motifs that are usually associated with dimerization. 61
56486 404674 pfam13829 DUF4191 Domain of unknown function (DUF4191). 219
56487 404675 pfam13830 DUF4192 Domain of unknown function (DUF4192). 318
56488 404676 pfam13831 PHD_2 PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 34
56489 404677 pfam13832 zf-HC5HC2H_2 PHD-zinc-finger like domain. 108
56490 404678 pfam13833 EF-hand_8 EF-hand domain pair. 54
56491 404679 pfam13834 DUF4193 Domain of unknown function (DUF4193). This domain of unknown function contains four conserved cysteines and a conserved histidine, including a CXXXXH motif. 98
56492 404680 pfam13835 DUF4194 Domain of unknown function (DUF4194). 165
56493 404681 pfam13836 DUF4195 Domain of unknown function (DUF4195). This family is found at the N-terminus of metazoan proteins that carry PHD-like zinc-finger domains. The function is not known. 183
56494 404682 pfam13837 Myb_DNA-bind_4 Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT-like DNA binding domains. In particular pfam10545 seems most related. This family is greatly expanded in plants and appears in several proteins annotated as transposon proteins. 84
56495 404683 pfam13838 Clathrin_H_link Clathrin-H-link. This short domain is found on clathrins, and often appears on proteins directly downstream from the Clathrin-link domain pfam09268. 66
56496 404684 pfam13839 PC-Esterase GDSL/SGNH-like Acyl-Esterase family found in Pmr5 and Cas1p. The PC-Esterase family is comprised of Cas1p, the Homo sapiens C7orf58, Arabidopsis thaliana PMR5 and a group of plant freezing resistance/cold acclimatization proteins typified by Arabidopsis thaliana ESKIMO1, animal FAM55D proteins, and animal FAM113 proteins. The PC-Esterase family has features that are both similar and different from the canonical GDSL/SGNH superfamily. The members of this family are predicted to have Acyl esterase activity and predicted to modify cell-surface biopolymers such as glycans and glycoproteins. The Cas1p protein has a Cas1_AcylT domain, in addition, with the opposing acyltransferase activity. The C7orf58 family has a ATP-Grasp domain fused to the PC-Esterase and is the first identified secreted tubulin-tyrosine ligase like enzyme in eukaryotes. The plant family with PMR5, ESK1, TBL3 etc have a N-terminal C rich potential sugar binding domain followed by PC-Esterase domain. 281
56497 404685 pfam13840 ACT_7 ACT domain. The ACT domain is a structural motif of 70-90 amino acids that functions in the control of metabolism, solute transport and signal transduction. They are thus found in a variety of different proteins in a variety of different arrangements. In mammalian phenylalanine hydroxylase the domain forms no contacts but promotes an allosteric effect despite the apparent lack of ligand binding. 65
56498 404686 pfam13841 Defensin_beta_2 Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation. 30
56499 404687 pfam13842 Tnp_zf-ribbon_2 DDE_Tnp_1-like zinc-ribbon. This zinc-ribbon domain is frequently found at the C-terminal of proteins derived from transposable elements. 31
56500 372752 pfam13843 DDE_Tnp_1_7 Transposase IS4. 352
56501 404688 pfam13844 Glyco_transf_41 Glycosyl transferase family 41. This family of glycosyltransferases includes O-linked beta-N-acetylglucosamine (O-GlcNAc) transferase, an enzyme which catalyzes the addition of O-GlcNAc to serine and threonine residues. In addition to its function as an O-GlcNAc transferase, human OGT also appears to proteolytically cleave the epigenetic cell-cycle regulator HCF-1. 543
56502 404689 pfam13845 Septum_form Septum formation. This domain is found in a protein which is predicted to play a role in septum formation during cell division. 227
56503 372754 pfam13846 DUF4196 Domain of unknown function (DUF4196). This is a short region of ccdc82_homologs that is conserved from Schizo. pombe up to humans. The function is not known. 116
56504 404690 pfam13847 Methyltransf_31 Methyltransferase domain. This family appears to have methyltransferase activity. 150
56505 404691 pfam13848 Thioredoxin_6 Thioredoxin-like domain. 184
56506 404692 pfam13850 ERGIC_N Endoplasmic Reticulum-Golgi Intermediate Compartment (ERGIC). This family is the N-terminal of ERGIC proteins, ER-Golgi intermediate compartment clusters, otherwise known as Ervs, and is associated with family COPIIcoated_ERV, pfam07970. 91
56507 404693 pfam13851 GAS Growth-arrest specific micro-tubule binding. This family is the highly conserved central region of a number of metazoan proteins referred to as growth-arrest proteins. In mouse, Gas8 is predominantly a testicular protein, whose expression is developmentally regulated during puberty and spermatogenesis. In humans, it is absent in infertile males who lack the ability to generate gametes. The localization of Gas8 in the motility apparatus of post-meiotic gametocytes and mature spermatozoa, together with the detection of Gas8 also in cilia at the apical surfaces of epithelial cells lining the pulmonary bronchi and Fallopian tubes suggests that the Gas8 protein may have a role in the functioning of motile cellular appendages. Gas8 is a microtubule-binding protein localized to regions of dynein regulation in mammalian cells. 200
56508 404694 pfam13852 DUF4197 Protein of unknown function (DUF4197). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 228 and 249 amino acids in length. 199
56509 404695 pfam13853 7tm_4 Olfactory receptor. The members of this family are transmembrane olfactory receptors. 278
56510 404696 pfam13854 Kelch_5 Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415. 41
56511 404697 pfam13855 LRR_8 Leucine rich repeat. 61
56512 404698 pfam13856 Gifsy-2 ATP-binding sugar transporter from pro-phage. Members of this short family are putative ATP-binding sugar transporter-like proteins. 98
56513 404699 pfam13857 Ank_5 Ankyrin repeats (many copies). 56
56514 404700 pfam13858 DUF4199 Protein of unknown function (DUF4199). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 167 and 182 amino acids in length. 156
56515 372760 pfam13859 BNR_3 BNR repeat-like domain. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases. 302
56516 404701 pfam13860 FlgD_ig FlgD Ig-like domain. This domain has an immunoglobulin-like beta sandwich fold. It is found in the FlgD protein, the flagellar hook capping protein. The structure for this domain shows that it is inserted within a TUDOR-like beta barrel domain. 78
56517 404702 pfam13861 FLgD_tudor FlgD Tudor-like domain. This domain has a tudor domain-like beta barrel fold. It is found in the FlgD protein the flagellar hook capping protein. The structure for this domain shows that it contains a nested Ig-like domain within it. However in some firmicute proteins this inserted domain is absent such as Q67K21. 136
56518 404703 pfam13862 BCIP p21-C-terminal region-binding protein. This family of p21-binding proteins is important as a modulator of p21 activity. The domain binds the C-terminal region of p21 in a ternary complex with CDK2, which results in inhibition of the kinase activity of CDK2. 206
56519 404704 pfam13863 DUF4200 Domain of unknown function (DUF4200). This family is found in eukaryotes. It is a coiled-coil domain of unknown function. 119
56520 404705 pfam13864 Enkurin Calmodulin-binding. This is a family of apparent calmodulin-binding proteins found at high levels in the testis and vomeronasal organ and at lower levels in certain other tissues. Enkurin is a scaffold protein that binds PI3 kinase to sperm transient receptor potential (canonical) (TRPC) channels. The mammalian transient receptor potential (canonical) channels are the primary candidates for the Ca(2+) entry pathway activated by the hormones, growth factors, and neurotransmitters that exert their effect through activation of PLC. Calmodulin binds to the C-terminus of all TRPC channels, and dissociation of calmodulin from TRPC4 results in profound activation of the channel. 96
56521 404706 pfam13865 FoP_duplication C-terminal duplication domain of Friend of PRMT1. Fop, or Friend of Prmt1, proteins are conserved from fungi and plants to vertebrates. There is little that is actually conserved except for this C-terminal LDXXLDAYM region (where X is any amino acid). The Fop proteins themselves are nuclear proteins localized to regions with low levels of DAPI, with a punctate/speckle-like distribution. Fop is a chromatin-associated protein and it co-localizes with facultative heterochromatin. It is critical for oestrogen-dependent gene activation. 80
56522 404707 pfam13866 zf-SAP30 SAP30 zinc-finger. SAP30 is a subunit of the histone deacetylase complex, and this domain is a zinc-finger. Solution of the structure shows a novel fold comprising two beta-strands and two alpha-helices with the zinc organising centre showing remote resemblance to the treble clef motif. In silico analysis of the structure revealed a highly conserved surface dominated by basic residues. NMR-based analysis of potential ligands for the SAP30 zn-finger motif indicated a strong preference for nucleic acid substrates. The zinc-finger of SAP30 probably functions as a double-stranded DNA-binding motif, thereby expanding the known functions of both SAP30 and the mammalian Sin3 co-repressor complex. 72
56523 404708 pfam13867 SAP30_Sin3_bdg Sin3 binding region of histone deacetylase complex subunit SAP30. This C-terminal domain of the SAP30 proteins appears to be the binding region for Sin3. 53
56524 404709 pfam13868 TPH Trichohyalin-plectin-homology domain. This family is a mixture of two different families of eukaryotic proteins. Trichoplein or mitostatin, was first defined as a meiosis-specific nuclear structural protein. It has since been linked with mitochondrial movement. It is associated with the mitochondrial outer membrane, and over-expression leads to reduction in mitochondrial motility whereas lack of it enhances mitochondrial movement. The activity appears to be mediated through binding the mitochondria to the actin intermediate filaments (IFs). The family is in the trichohyalin-plectin-homology domain. 352
56525 404710 pfam13869 NUDIX_2 Nucleotide hydrolase. Nudix hydrolases are found in all classes of organism and hydrolyze a wide range of organic pyrophosphates, including nucleoside di- and triphosphates, di-nucleoside and diphospho-inositol polyphosphates, nucleotide sugars and RNA caps, with varying degrees of substrate specificity. 188
56526 404711 pfam13870 DUF4201 Domain of unknown function (DUF4201). This is a family of coiled-coil proteins from eukaryotes. The function is not known. 177
56527 404712 pfam13871 Helicase_C_4 C-terminal domain on Strawberry notch homolog. Strawberry notch proteins carry DExD/H-box groups upstream of this domain. The function of this domain is not known. These proteins promote the expression of diverse targets, potentially through interactions with transcriptional activator or repressor complexes. 271
56528 404713 pfam13872 AAA_34 P-loop containing NTP hydrolase pore-1. 301
56529 404714 pfam13873 Myb_DNA-bind_5 Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 78
56530 404715 pfam13874 Nup54 Nucleoporin complex subunit 54. This is the human Nup54 subunit of the nucleoporin complex, equivalent to Nup57 of yeast. Nup54, Nup58 and Nup62 all have similar affinities for importin-beta. It seems likely that they are the only FG-repeat nucleoporins of the central channel, and as such they would form a zone of equal affinity spanning the central channel. The diffusion of importin-beta import complexes through the central channel may be a stochastic process as the affinities are similar, whereas movement from cytoplasmic fibrils to the central channel and from the central channel to the nuclear basket would be facilitated by the subtle differences in affinity between them. 139
56531 404716 pfam13875 DUF4202 Domain of unknown function (DUF4202). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 187 and 205 amino acids in length. There are two conserved sequence motifs: LED and KMS. The function of these proteins is unknown, although many are incorrectly annotated as glutamyl tRNA synthetases. 183
56532 404717 pfam13876 Phage_gp49_66 Phage protein (N4 Gp49/phage Sf6 gene 66) family. This family of phage proteins is functionally uncharacterized. The family includes bacteriophage Sf6 gene 66 as well as phage N4 GP49 protein. Proteins in this family are typically between 87 and 154 amino acids in length. There is a conserved NGF sequence motif. 77
56533 404718 pfam13877 RPAP3_C Potential Monad-binding region of RPAP3. This domain is found at the C-terminus of RNA-polymerase II-associated proteins. These proteins bind to Monad and are involved in regulating apoptosis. They contain TPR-repeats towards the N-terminus. 89
56534 404719 pfam13878 zf-C2H2_3 zinc-finger of acetyl-transferase ESCO. 40
56535 404720 pfam13879 KIAA1430 KIAA1430 homolog. This is a family of KIAA1430 homologs. The function is not known. 97
56536 404721 pfam13880 Acetyltransf_13 ESCO1/2 acetyl-transferase. 69
56537 372780 pfam13881 Rad60-SLD_2 Ubiquitin-2 like Rad60 SUMO-like. 111
56538 404722 pfam13882 Bravo_FIGEY Bravo-like intracellular region. This is the very C-terminal intracellular region of neural adhesion molecule L1 proteins that are also known as Bravo or NrCAM. It lies upstream of the IG and Fn3 domains and has the highly conserved motif FIGEY. The function is not known. 84
56539 404723 pfam13883 Pyrid_oxidase_2 Pyridoxamine 5'-phosphate oxidase. 167
56540 404724 pfam13884 Peptidase_S74 Chaperone of endosialidase. This is the very C-terminal, chaperone, domain of the bacteriophage protein endosialidase. It releases itself, via the serine-lysine dyad at the N-terminus, from the remainder of the end-tail-spike. Cleavage occurs after the threonine which is the final residue of the End-tail-spike family, pfam12219. The endosialidase protein forms homotrimeric molecules in bacteriophages. The catalytic dyad allows this portion of the molecule to be cleaved from the more N-terminal region such that the latter can fold and presumably bind to DNA. 56
56541 404725 pfam13885 Keratin_B2_2 Keratin, high sulfur B2 protein. 45
56542 404726 pfam13886 DUF4203 Domain of unknown function (DUF4203). This is the N-terminal region of 7tm proteins. The function is not known. 200
56543 404727 pfam13887 MRF_C1 Myelin gene regulatory factor -C-terminal domain 1. This domain is found just downstream of Peptidase_S74, pfam13884. The function is not known. 36
56544 404728 pfam13888 MRF_C2 Myelin gene regulatory factor C-terminal domain 2. This domain is found further downstream of Peptidase_S74, pfam13884, and MRF_C1, pfam13887. The function is not known. 135
56545 404729 pfam13889 Chromosome_seg Chromosome segregation during meiosis. The proteins come from eukaryotes, plants and animals, and are necessary for chromosome segregation during meiosis. 55
56546 404730 pfam13890 Rab3-GTPase_cat Rab3 GTPase-activating protein catalytic subunit. This family is the probable catalytic subunit of the GTPase activating protein that has specificity for Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). It is likely to convert active Rab3-GTP to the inactive form Rab3-GDP. Rab3 proteins are involved in regulated exocytosis of neurotransmitters and hormones. The Rab3 GTPase-activating complex is a heterodimer composed of RAB3GAP and RAB3-GAP150. This complex interacts with DMXL2. 159
56547 404731 pfam13891 zf-C3Hc3H Potential DNA-binding domain. This domain is likely to be the DNA-binding domain of chromatin re-modelling proteins and helicases. 62
56548 404732 pfam13892 DBINO DNA-binding domain. DBINO is a DNA-binding domain found on global transcription activator SNF2L1 proteins and chromatin re-modelling proteins. 130
56549 372791 pfam13893 RRM_5 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins. 125
56550 404733 pfam13894 zf-C2H2_4 C2H2-type zinc finger. This family contains a number of divergent C2H2 type zinc fingers. 24
56551 404734 pfam13895 Ig_2 Immunoglobulin domain. This domain contains immunoglobulin-like domains. 76
56552 404735 pfam13896 Glyco_transf_49 Glycosyl-transferase for dystroglycan. This glycosyl-transferase brings about the glycosylation of the alpha-dystroglycan subunit. Dystroglycan is an integral member of the skeletal muscular dystrophin glycoprotein complex, which links dystrophin to proteins in the extracellular matrix. 326
56553 404736 pfam13897 GOLD_2 Golgi-dynamics membrane-trafficking. Sec14-like Golgi-trafficking domain The GOLD domain is always found combined with lipid- or membrane-association domains. 133
56554 404737 pfam13898 DUF4205 Domain of unknown function (DUF4205). The proteins in this family are uncharacterized but often named FAM188B. 348
56555 404738 pfam13899 Thioredoxin_7 Thioredoxin-like. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. 82
56556 404739 pfam13901 zf-RING_9 Putative zinc-RING and/or ribbon. This is a family of cysteine-rich proteins. Many members also carry a pleckstrin-homology domain, pfam00169 201
56557 404740 pfam13902 R3H-assoc R3H-associated N-terminal domain. This family is found at the N-terminus of R3H, pfam01424, domain-containing proteins. The function is not known. 117
56558 372799 pfam13903 Claudin_2 PMP-22/EMP/MP20/Claudin tight junction. Members of this family are claudins, that form tight junctions between cells. 191
56559 404741 pfam13904 DUF4207 Domain of unknown function (DUF4207). This family is found in eukaryotes; it has several conserved tryptophan residues. The function is not known. 249
56560 404742 pfam13905 Thioredoxin_8 Thioredoxin-like. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. 94
56561 404743 pfam13906 AA_permease_C C-terminus of AA_permease. This is the C-terminus of AA-permease enzymes that is not captured by the models pfam00324 and pfam13520. 51
56562 404744 pfam13907 DUF4208 Domain of unknown function (DUF4208). This domain is found at the C-terminus of chromodomain-helicase-DNA-binding proteins. The exact function of the domain is undetermined. 93
56563 404745 pfam13908 Shisa Wnt and FGF inhibitory regulator. Shisa is a transcription factor-type molecule that physically interacts with immature forms of the Wnt receptor Frizzled and the FGF receptor within the endoplasmic reticulum to inhibit their post-translational maturation and trafficking to the cell surface. 175
56564 404746 pfam13909 zf-H2C2_5 C2H2-type zinc-finger domain. 25
56565 404747 pfam13910 DUF4209 Domain of unknown function (DUF4209). This short domain is found in bacteria and eukaryotes, though not in yeasts or Archaea. It carries a highly conserved RNxxxHG sequence motif. 89
56566 372807 pfam13911 AhpC-TSA_2 AhpC/TSA antioxidant enzyme. This family contains proteins related to alkyl hydro-peroxide reductase (AhpC) and thiol specific antioxidant (TSA). 113
56567 404748 pfam13912 zf-C2H2_6 C2H2-type zinc finger. 27
56568 404749 pfam13913 zf-C2HC_2 zinc-finger of a C2HC-type. This family contains a number of divergent C2H2 type zinc fingers. 25
56569 404750 pfam13914 Phostensin Phostensin PP1-binding and SH3-binding region. Phostensin has been identified as a PP1 regulatory protein binding PP1 at the KISF motif. The domain also appears to carry an incomplete SH3-binding domain PxRxP further upstream. It is likely that Phostensin targets PP1 to the F-actin cytoskeleton. Phostensin binds to actin and decreases the elongation and depolymerization rates of actin filament pointed ends. 142
56570 404751 pfam13915 DUF4210 Domain of unknown function (DUF4210). This short domain is found in fungi, plants and animals, and the proteins appear to be necessary for chromosome segregation during meiosis. 66
56571 404752 pfam13916 Phostensin_N PP1-regulatory protein, Phostensin N-terminal. Phostensin has been identified as a PP1 regulatory protein binding protein. This domain is N-terminal to the PP1- and SH3-binding regions though may carry an additional SH3-binding motif. It is likely that Phostensin targets PP1 to the F-actin cytoskeleton. Phostensin binds to actin and decreases the elongation and depolymerization rates of actin filament pointed ends. 86
56572 404753 pfam13917 zf-CCHC_3 Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger. 41
56573 372814 pfam13918 PLDc_3 PLD-like domain. 177
56574 404754 pfam13919 ASXH Asx homology domain. A conserved alpha helical domain with a characteristic LXXLL motif. The LXXLL motif is detected in diverse transcription factors, coactivators and corepressors and is implicated in mediating interactions between them. The ASXH domain is found in animals, fungi and plants and is predicted to play a role in mediating contact between transcription factors and chromatin-associated complexes. In Drosophila Asx and Human ASXL1, the ASXH domain is predicted to mediate interactions with the Calypso and BAP1 deubiquitinases (DUBs) which further belong to the UCHL5/UCH37 clade of DUBs. 128
56575 404755 pfam13920 zf-C3HC4_3 Zinc finger, C3HC4 type (RING finger). 50
56576 372817 pfam13921 Myb_DNA-bind_6 Myb-like DNA-binding domain. This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family. 60
56577 316444 pfam13922 PHD_3 PHD domain of transcriptional enhancer, Asx. This is the DNA-binding domain on the additional sex combs-like 1 proteins. The Asx protein acts as an enhancer of trithorax and polycomb in displaying bidirectional homoeotic phenotypes in Drosophila, suggesting that it is required for maintenance of both activation and silencing of Hox genes. Asx is required for normal adult haematopoiesis and its function depends on its cellular context. 68
56578 404756 pfam13923 zf-C3HC4_2 Zinc finger, C3HC4 type (RING finger). 40
56579 404757 pfam13924 Lipocalin_5 Lipocalin-like domain. This family includes domains distantly related to lipocalins. However, they do contain the important GXW motif in the first strand. The proteins in this family include aln5, which is involved in biosynthesis of alnumycin. The family also includes the ZFK protein from Trypanosoma brucei which is a protein kinase. This domain is at the C-terminus of that protein. The domain is also found as the C-terminal domain in StiJ, a protein involved in producing stigmatellin. This domain has been assumed to catalyze a final cyclisation reaction. 140
56580 404758 pfam13925 Katanin_con80 con80 domain of Katanin. The con80 domain of katanin is the C-terminal region of the protein that binds to the N-terminal domain of katanin-p60, the catalytic ATPase. The complex associates with a specific subregion of the mitotic spindle leading to increased microtubule disassembly and targeting of p60 to the spindle poles. The assembly and function of the mitotic spindle requires the activity of a number of microtubule-binding proteins. Katanin, a heterodimeric microtubule-severing ATPase, is found localized at mitotic spindle poles. A proposed model is that katanin is targeted to spindle poles through a combination of direct microtubule binding by the p60 subunit and through interactions between the WD40 domain and an unknown protein. 153
56581 404759 pfam13926 DUF4211 Domain of unknown function (DUF4211). 139
56582 404760 pfam13927 Ig_3 Immunoglobulin domain. This family contains immunoglobulin-like domains. 79
56583 404761 pfam13928 Flocculin_t3 Flocculin type 3 repeat. This repeat is found in the Flocculation protein FLO9 close to its C-terminus. 44
56584 404762 pfam13929 mRNA_stabil mRNA stabilisation. This domain is an mRNA stabilisation factor. 288
56585 404763 pfam13930 Endonuclea_NS_2 DNA/RNA non-specific endonuclease. 133
56586 404764 pfam13931 Microtub_bind Kinesin-associated microtubule-binding. This domain binds to microtubules. 129
56587 404765 pfam13932 GIDA_assoc GidA associated domain. The GidA associated domain is a domain that has been identified at the C-terminus of protein GidA. It consists of several helices, the last three being rather short and forming a small bundle. GidA is a tRNA modification enzyme found in bacteria and mitochondria. Based on mutational analysis this domain has been suggested to be implicated in binding of the D-stem of tRNA and to be responsible for the interaction with protein MnmE. Structures of GidA in complex with either tRNA or MnmE are missing. Reported to bind to Pfam family MnmE, pfam12631. 212
56588 404766 pfam13933 HRXXH Putative peptidase family. This family of putative peptidases are closely related to the M35 family pfam02102. In this family the metal binding HEXXH motif is replaced with HRXXH. The exact function of these proteins is unknown. Members of this family are found to be fungal allergens. 244
56589 404767 pfam13934 ELYS Nuclear pore complex assembly. ELYS (embryonic large molecule derived from yolk sac) is conserved from fungi such as Aspergillus nidulans and Schizosaccharomyces pombe to human. It is important for the assembly of the nuclear pore complex. 219
56590 379401 pfam13935 Ead_Ea22 Ead/Ea22-like protein. This family contains phage proteins and bacterial proteins that are likely to represent integrated phage proteins. This family includes the Lambda phage Ea22 early protein as well as the Bacteriophage P22 Ead protein. 139
56591 404768 pfam13936 HTH_38 Helix-turn-helix domain. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding. 44
56592 404769 pfam13937 DUF4212 Domain of unknown function (DUF4212). This family includes several putative integral membrane proteins. 77
56593 404770 pfam13938 DUF4213 Putative heavy-metal chelation. This domain of unknown function has an enolase N-terminal domain-like fold. Its genomic context suggests that it may have a role in anaerobic vitamin B12 biosynthesis. This domain is often found at the N-terminus of proteins containing DUF364, pfam04016. The structure of UniProtKB:B8FUJ5, Structure 3l5o, suggests that the whole protein has this enolase N-terminal-like fold and a Rossmann-like C-terminal domain. Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The protein may be playing a role in heavy-metal chelation. 73
56594 404771 pfam13939 TisB_toxin Toxin TisB, type I toxin-antitoxin system. TisB (toxicity-induced by SOS B) is an SOS-induced toxic peptide. It is a hydrophobic membrane-spanning protein which inhibits cell growth. Its expression is inhibited by the antisense RNA IstR-1, which acts as an antitoxin. 28
56595 404772 pfam13940 Ldr_toxin Toxin Ldr, type I toxin-antitoxin system. This family includes the Ldr (long direct repeat) toxins. In Escherichia coli there are four Ldr toxins, LdrA, LdrB, LdrC and LdrD. These toxins inhibit cell growth, decrease cell viability and cause nucleoid condensation. LdrD expression is inhibited by the antisense RNA RdlD, which functions as an antitoxin. 35
56596 404773 pfam13941 MutL MutL protein. This small family includes GlmL/MutL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family. 448
56597 404774 pfam13942 Lipoprotein_20 YfhG lipoprotein. This family includes the YfhG protein from E. coli. Members of this family have an N-terminal lipoprotein attachment site. The members of this family are functionally uncharacterized. 175
56598 404775 pfam13943 WPP WPP domain. 100
56599 404776 pfam13944 Calycin_like Calycin-like beta-barrel domain. 121
56600 372833 pfam13945 NST1 Salt tolerance down-regulator. NST1 is a family of proteins that seem to be involved, directly or indirectly, in the salt sensitivity of some cellular functions in yeast. It does this without affecting sodium accumulation. It negatively affects salt-tolerance through an interaction with the splicing factor Msl1p. This interaction stresses the importance of efficient RNA processing under salt stress conditions. 186
56601 404777 pfam13946 DUF4214 Domain of unknown function (DUF4214). This domain is found on a variety of different proteins including transferases, and allergen V5/Tpx-1 related proteins. 72
56602 404778 pfam13947 GUB_WAK_bind Wall-associated receptor kinase galacturonan-binding. This cysteine-rich GUB_WAK_bind domain is the extracellular part of this serine/threonine kinase that binds to the cell-wall pectins. 63
56603 290659 pfam13948 DUF4215 Domain of unknown function (DUF4215). The function of this family is unknown. 47
56604 404779 pfam13949 ALIX_LYPXL_bnd ALIX V-shaped domain binding to HIV. The binding of the LYPxL motif of late HIV p6Gag and EIAV p9Gag to this domain is necessary for viral budding. This domain is generally central between an N-terminal Bro1 domain, pfam03097 and a C-terminal proline-rich domain. The retroviruses thus used this domain to hijack the ESCRT system of the cell. 295
56605 404780 pfam13952 DUF4216 Domain of unknown function (DUF4216). This DUF is sometimes found at the C-terminal end of proteins carrying a Transposase_21 domain, pfam02992. 72
56606 404781 pfam13953 PapC_C PapC C-terminal domain. The PapC C-terminal domain is a structural domain found at the C-terminus of the E. coli PapC protein. Pili are assembled using the chaperone usher system. In E.coli this is composed of the chaperone PapD and the usher PapC. This domain represents the C-terminal domain from PapC and its homologs. This domain has a beta-sandwich structure similar to the plug domain of PapC. 66
56607 404782 pfam13954 PapC_N PapC N-terminal domain. The PapC N-terminal domain is a structural domain found at the N-terminus of the E. coli PapC protein. Pili are assembled using the chaperone usher system. In E.coli this is composed of the chaperone PapD and the usher PapC. This domain represents the N-terminal domain from PapC and its homologs. This domain is involved in substrate binding. 146
56608 372839 pfam13955 Fst_toxin Toxin Fst, type I toxin-antitoxin system. Fst (faecalis plasmid stabilization toxin), also known as RNA I, is a toxic peptide. Its N-terminus forms a transmembrane alpha helix, its C-terminus is disordered and is likely to be cytosolic. Its translation is inhibited by the antisense RNA, RNA II, which acts as an antitoxin. 21
56609 206126 pfam13956 Ibs_toxin Toxin Ibs, type I toxin-antitoxin system. The Ibs (induction brings stasis) proteins are a family of toxic peptides. Their expression is inhibited by the Sib antisense RNAs, which act as antitoxins. 19
56610 404783 pfam13957 YafO_toxin Toxin YafO, type II toxin-antitoxin system. YafO is a toxin which inhibits protein synthesis. It acts as a ribosome-dependent mRNA interferase. It forms part of a type II toxin-antitoxin system, where the YafN protein acts as an antitoxin. This domain forms complexes with yafN antitoxins containing pfam02604. 101
56611 404784 pfam13958 ToxN_toxin Toxin ToxN, type III toxin-antitoxin system. ToxN acts as a toxin, it is part of a type III toxin-antitoxin system. It acts as a ribosome independent endoribonuclease. It interacts with, and is inhibited by, the RNA antitoxin, ToxI. Three ToxN monomers bind to three ToxI monomers to create a trimeric ToxN-ToxI complex. 155
56612 404785 pfam13959 DUF4217 Domain of unknown function (DUF4217). This short domain is found at the C-terminus of many helicase proteins. 61
56613 404786 pfam13960 DUF4218 Domain of unknown function (DUF4218). 112
56614 404787 pfam13961 DUF4219 Domain of unknown function (DUF4219). This domain is very short and is found at the N-terminal of many Gag-pol polyprotein and related proteins. There is a highly conserved YxxWxxxM sequence motif. 27
56615 404788 pfam13962 PGG Domain of unknown function. The PGG domain is named for the highly conserved sequence motif found at the start of the domain. The function is not known. 114
56616 404789 pfam13963 Transpos_assoc Transposase-associated domain. 74
56617 404790 pfam13964 Kelch_6 Kelch motif. 50
56618 404791 pfam13965 SID-1_RNA_chan dsRNA-gated channel SID-1. This is a family of proteins that are transmembrane dsRNA-gated channels. They passively transport dsRNA into cells and do not act as ATP-dependent pumps. They are required for systemic RNA interference. This family of proteins belong to the CREST superfamily, which are distantly related to GPCRs. 590
56619 404792 pfam13966 zf-RVT zinc-binding in reverse transcriptase. This domain would appear to be a zinc-binding region of a putative reverse transcriptase. 84
56620 404793 pfam13967 RSN1_TM Late exocytosis, associated with Golgi transport. This family represents the first three transmembrane regions of 11-TM proteins involved in vesicle transport. In S. cerevisiae these proteins are members of the yeast facilitator superfamily and are integral membrane proteins localized to the cell periphery, in particular to the bud-neck region. The distribution is consistent with a role in late exocytosis which is in agreement with the proteins' ability to substitute for the function of Sro7p, required for the sorting of the protein Enap1 into Golgi-derived vesicles destined for the cell surface. 158
56621 404794 pfam13968 DUF4220 Domain of unknown function (DUF4220). This family is found in plants and is often associated with DUF294, pfam04578. 316
56622 372852 pfam13969 Pab87_oct Pab87 octamerisation domain. This domain was first characterized as the C-terminal domain of Pab87 serine protease from Pyrococcus abyssi. The domain is reported to play a crucial role in Pab87 octamerisation and active site compartmentalisation. Its up-and-down 8-stranded beta-barrel 3D structure is reminiscent of the one found in lipocalins. 96
56623 404795 pfam13970 DUF4221 Domain of unknown function (DUF4221). This family of bacterial proteins contains highly conserved asparagine and cysteine residues. The function is not known. 310
56624 404796 pfam13971 Mei4 Meiosis-specific protein Mei4. This family of meiosis specific proteins is required for correct meiotic chromosome segregation and recombination. It is required for meiotic DNA double-strand break (DSB) formation. 340
56625 404797 pfam13972 TetR Bacterial transcriptional repressor. This family of bacterial transcriptional repressors is characterized by the short approximately 50 amino acid stretch of residues constituting the helix-turn-helix DNA binding motif, around the YRFhY motif. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The regulatory network in which TetR itself is involved is in being released in the presence of tetracycline, binding to the target operator, and repressing tetA transcription. 143
56626 372855 pfam13973 DUF4222 Domain of unknown function (DUF4222). This short protein is likely to be of phage origin. For example it is found in the Enterobacteria phage YYZ-2008. It is largely found in enteric bacteria. The molecular function of this protein is unknown. 53
56627 404798 pfam13974 YebO YebO-like protein. This short protein is uncharacterized. It seems likely to be of phage origin as it is found in Enterobacteria phage HK022 Gp20 and Enterobacteria phage HK97 Gp15. The protein is also found in a variety of enteric bacteria. 80
56628 404799 pfam13975 gag-asp_proteas gag-polyprotein putative aspartyl protease. This family of putative aspartyl proteases is found pre-dominantly in retroviral proteins. 92
56629 372857 pfam13976 gag_pre-integrs GAG-pre-integrase domain. This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins. 67
56630 404800 pfam13977 TetR_C_6 BetI-type transcriptional repressor, C-terminal. This family comprises the C-terminal portion of proteins that belong to the TetR family of transcriptional regulators. The C-terminus represents the regulatory region, and does not include the DNA binding helix-turn-helix domain. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. One of the target proteins is BetI, an osmoprotectant which controls the choline-glycine betaine pathway in E.coli. 115
56631 404801 pfam13978 DUF4223 Protein of unknown function (DUF4223). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. These proteins are likely to be lipoproteins (attachment site currently included in alignment). 54
56632 372858 pfam13979 SopA_C SopA-like catalytic domain. This domain is found in the E. coli Type III secretion effector proteins SopA and NleL. These proteins have been shown to act as E3 ubiquitin ligase enzymes. This domain contains the active site cysteine residue. 166
56633 404802 pfam13980 UPF0370 Uncharacterized protein family (UPF0370). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved DWP sequence motif. 61
56634 404803 pfam13981 SopA SopA-like central domain. This domain is found in the E. coli Type III secretion effector proteins SopA and NleL. These proteins have been shown to act as E3 ubiquitin ligase enzymes. 126
56635 372861 pfam13982 YbfN YbfN-like lipoprotein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. Members of this family are lipoproteins. 88
56636 404804 pfam13983 YsaB YsaB-like lipoprotein. This family of proteins is functionally uncharacterized. These proteins are related to E.coli YsaB. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. These proteins are lipoproteins. 75
56637 404805 pfam13984 MsyB MsyB protein. The MsyB protein has been found to be able to restore protein export defects caused by a temperature-sensitive secY or secA mutation. However, its exact molecular function is still unknown, but it may play a role in protein export. Proteins in this family are approximately 120 amino acids in length. This family of proteins is found in bacteria. 120
56638 404806 pfam13985 YbgS YbgS-like protein. This family of proteins is functionally uncharacterized. The family includes the YbgS protein from E. coli. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. Some members of this family are annotated as homeobox protein, but this annotation cannot be verified. 120
56639 404807 pfam13986 DUF4224 Domain of unknown function (DUF4224). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 50 amino acids in length. The protein is likely to be of phage origin and is found as protein Gp02 in the Xylella phage Xfas53. 45
56640 404808 pfam13987 YedD YedD-like protein. This family of proteins related to the YedD protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. These proteins are lipoproteins. 106
56641 404809 pfam13988 DUF4225 Protein of unknown function (DUF4225). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 182 and 282 amino acids in length. 163
56642 404810 pfam13989 YejG YejG-like protein. The YejG protein family is a group of functionally uncharacterized proteins related to Escherichia coli yejG. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 106
56643 404811 pfam13990 YjcZ YjcZ-like protein. This family of proteins is functionally uncharacterized. The family includes the YjcZ protein from E. coli. This family of proteins is found in enteric bacteria. Proteins in this family are approximately 300 amino acids in length. There are two conserved sequence motifs: FGD and MPR. 272
56644 404812 pfam13991 BssS BssS protein family. The BssS protein family is a group of proteins that are involved in regulation of biofilm formation. Proteins in this family are approximately 80 amino acids in length. 72
56645 404813 pfam13992 YecR YecR-like lipoprotein. The YecR-like family of lipoproteins includes the YecR protein from E. coli. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 110 amino acids in length. 73
56646 404814 pfam13993 YccJ YccJ-like protein. The YccJ-like family of proteins includes the E. coli YccJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 67
56647 372868 pfam13994 PgaD PgaD-like protein. This family includes the PgaD protein from E. coli. The homopolymer poly-beta-1,6-N-acetyl-D-glucosamine (beta-1,6-GlcNAc; PGA) serves as an adhesin for the maintenance of biofilm structural stability in eubacteria. The pgaABCD operon is required for its synthesis and export. It has been shown that PgaD is essential for this process. 148
56648 404815 pfam13995 YebF YebF-like protein. The YebF-like protein family appears to be a group of colicin immunity proteins. As well as YebF the family includes cmi, the colicin M immunity protein. This domain family is found in bacteria, and is approximately 80 amino acids in length. The alignment contains two conserved cysteine residues that form a disulphide bond in the solved structure. 89
56649 404816 pfam13996 YobH YobH-like protein. The YobH-like protein family includes the YobH protein from E. coli, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There are two conserved sequence motifs: GYG and GLGL. 70
56650 404817 pfam13997 YqjK YqjK-like protein. The YqjK-like protein family includes the E. coli YqjK protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a single completely conserved residue R that may be functionally important. 72
56651 404818 pfam13998 MgrB MgrB protein. The MgrB protein is a short lipoprotein. The mgrB gene has a mg2+ responsive promoter. Deletion of mgrB results in a potent increase in PhoP-regulated transcription. The PhoQ/PhoP signaling system responds to low magnesium and the presence of certain cationic antimicrobial peptides. Over-expression of mgrB decreased transcription at both high and low concentrations of magnesium. Localization and bacterial two-hybrid studies suggest that MgrB resides in the inner-membrane and interacts directly with PhoQ. This domain family is found in bacteria, and is approximately 40 amino acids in length. There are two conserved sequence motifs: CDQ and GIC. 29
56652 404819 pfam13999 MarB MarB protein. The MarB protein is found in the multiple antibiotic resistance (mar) locus in Escherichia coli. The MarB protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved GSDKSD sequence motif. 63
56653 290707 pfam14000 Packaging_FI DNA packaging protein FI. This family includes the lambda phage DNA-packaging protein FI. Proteins in this family are typically between 124 and 140 amino acids in length. There is a conserved EEE sequence motif. 131
56654 404820 pfam14001 YdfZ YdfZ protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved YDRNRN sequence motif. The E. coli protein has been shown to bind selenium. 64
56655 404821 pfam14002 YniB YniB-like protein. The YniB-like protein family includes the E. coli YniB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 180 amino acids in length. This family of proteins are integral membrane proteins. 166
56656 404822 pfam14003 YlbE YlbE-like protein. The YlbE-like protein family includes the B. subtilis protein YlbE, which is functionally uncharacterized. This family of cytosolic proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There is a conserved WYR sequence motif. 61
56657 372877 pfam14004 DUF4227 Protein of unknown function (DUF4227). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 71
56658 404823 pfam14005 YpjP YpjP-like protein. The YpjP-like protein family includes the B. subtilis YpjP protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 200 amino acids in length. 133
56659 379412 pfam14006 YqzL YqzL-like protein. The YqzL-like protein family includes the B. subtilis YqzL protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. 40
56660 379413 pfam14007 YtpI YtpI-like protein. The YtpI-like protein family includes the B. subtilis YtpI protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 101 amino acids in length. 87
56661 404824 pfam14008 Metallophos_C Iron/zinc purple acid phosphatase-like protein C. This domain is found at the C-terminus of Purple acid phosphatase proteins. 63
56662 404825 pfam14009 DUF4228 Domain of unknown function (DUF4228). This domain is found in plants. The function is not known. 148
56663 404826 pfam14010 PEPcase_2 Phosphoenolpyruvate carboxylase. This family of phosphoenolpyruvate carboxylases is based on sequences not picked up by the model for PEPcase, PF00311. Most of the family members are from Archaea. 496
56664 404827 pfam14011 ESX-1_EspG EspG family. This family of proteins contains the EspG1, EspG2 and EspG3 proteins from M. tuberculosis. These proteins are involved in the ESAT-6 secretion system 1 (ESX-1) of Mycobacterium tuberculosis which is important for virulence and intercellular spread. Proteins in this family are typically between 254 and 295 amino acids in length. 241
56665 404828 pfam14012 DUF4229 Protein of unknown function (DUF4229). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 122 amino acids in length. 67
56666 404829 pfam14013 MT0933_antitox MT0933-like antitoxin protein. This family of proteins contains the MT0933 protein, which has been identified as an antitoxin to protein MT0934. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 90 amino acids in length. 49
56667 404830 pfam14014 DUF4230 Protein of unknown function (DUF4230). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 203 and 228 amino acids in length. 134
56668 404831 pfam14015 DUF4231 Protein of unknown function (DUF4231). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 148 and 288 amino acids in length. 104
56669 404832 pfam14016 DUF4232 Protein of unknown function (DUF4232). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 177 and 242 amino acids in length. Many members of this family are lipoproteins. 130
56670 404833 pfam14017 DUF4233 Protein of unknown function (DUF4233). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 122 and 147 amino acids in length. Proteins in this family are integral membrane proteins. 106
56671 404834 pfam14018 DUF4234 Domain of unknown function (DUF4234). This presumed integral membrane protein domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 70 amino acids in length. 69
56672 404835 pfam14019 DUF4235 Protein of unknown function (DUF4235). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 88 and 119 amino acids in length. 77
56673 404836 pfam14020 DUF4236 Protein of unknown function (DUF4236). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 69 and 402 amino acids in length. 55
56674 404837 pfam14021 TNT Tuberculosis necrotizing toxin. This is the C-terminal domain secreted by Mycobacterium tuberculosis (Mtb). It induces necrosis of infected cells to evade immune responses. Mtb utilizes the protein CpnT to kill human macrophages by secreting its C-terminal domain (CTD), named tuberculosis necrotizing toxin (TNT) that induces necrosis. It acts as a NAD+ glycohydrolase which hydrolyzes the essential cellular coenzyme NAD+ in the cytosol of infected macrophages resulting in necrotic cell death. CpnT transports its toxic CTD from the cell surface of M. tuberculosis by proteolytic cleavage, where the toxin is cleaved to induce host cell death. Structural analysis determined that the TNT core contains only six beta-strands as opposed to seven found in all known NAD+-utilizing toxins, and is significantly smaller, with only two short alpha-helices and two 3/10 helices. Furthermore, the putative NAD+ binding pocket identified Q822, Y765 and R757 as residues possibly involved in NAD+-binding and hydrolysis based on similar positions of catalytic amino acids of ADP-ribosylating toxins. The glutamine 822 residue was found to be highly conserved among TNT homologs. 84
56675 404838 pfam14022 DUF4238 Protein of unknown function (DUF4238). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 274 and 374 amino acids in length. 279
56676 372881 pfam14023 DUF4239 Protein of unknown function (DUF4239). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 254 and 270 amino acids in length. 212
56677 404839 pfam14024 DUF4240 Protein of unknown function (DUF4240). This presumed domain is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 169 and 263 amino acids in length. This domain is often associated with the WGR domain pfam05406. 128
56678 404840 pfam14025 DUF4241 Protein of unknown function (DUF4241). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 205 and 315 amino acids in length. There is a conserved GDG sequence motif at the C-terminus. 187
56679 404841 pfam14026 DUF4242 Protein of unknown function (DUF4242). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 90 and 170 amino acids in length. There is a single completely conserved residue C that may be functionally important. 74
56680 404842 pfam14027 DUF4243 Protein of unknown function (DUF4243). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 348 and 477 amino acids in length. 336
56681 404843 pfam14028 Lant_dehydr_C Lantibiotic biosynthesis dehydratase C-term. Lant_dehydr_C is the C-terminal domain of a family of dehydratases that are involved in the biosynthesis of lantibiotics. While the extensive N-terminal domain, pfam04738, is involved in the serine-threonine glutamylation step of the synthetic process, this C-terminal domain, once thought to be a separate domain from the dehydratase enzymic activity, is necessary for the final glutamate-elimination step in the generation of the lantibiotic. Lantibiotics are a class of peptide antibiotic that contains one or more thioether bonds. 289
56682 404844 pfam14029 DUF4244 Protein of unknown function (DUF4244). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 66 and 95 amino acids in length. There is a conserved EYA sequence motif. 50
56683 404845 pfam14030 DUF4245 Protein of unknown function (DUF4245). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 188 and 235 amino acids in length. 166
56684 404846 pfam14031 D-ser_dehydrat Putative serine dehydratase domain. This domain is found at the C-terminus of yeast D-serine dehydratase. Structures have been solved for two bacterial members of this family. The yeast protein has been shown to be a zinc-dependent enzyme. 96
56685 404847 pfam14032 PknH_C PknH-like extracellular domain. This domain is functionally uncharacterized. It is found as the periplasmic domain of the bacterial protein kinase PknH. The domain is also found in isolation in numerous proteins, for example the lipoproteins lpqQ, lprH, lppH and lpqA from M. tuberculosis. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 268 amino acids in length. There are two completely conserved C residues that are likely to form a disulphide bond. A second pair of cysteines, less well conserved, probably forms a second disulphide bond. It seems likely that this domain functions to bind some as yet unknown ligand. 187
56686 404848 pfam14033 DUF4246 Protein of unknown function (DUF4246). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and fungi. Proteins in this family are typically between 392 and 644 amino acids in length. 421
56687 379432 pfam14034 Spore_YtrH Sporulation protein YtrH. This family of proteins is involved in sporulation. It may contribute to the formation and stability of the thick peptidoglycan layer between the two membranes of the spore, known as the cortex. In Bacillus subtilis its expression is regulated by sigma-E. 99
56688 404849 pfam14035 YlzJ YlzJ-like protein. The YlzJ-like protein family includes the B. subtilis YlzJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 72 amino acids in length. There are two completely conserved residues (L and G) that may be functionally important. 65
56689 379434 pfam14036 YlaH YlaH-like protein. The YlaH-like protein family includes the B. subtilis YlaH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a conserved LGFA sequence motif. 76
56690 372886 pfam14037 YoqO YoqO-like protein. The YoqO-like protein family includes the B. subtilis YoqO protein, which is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 120 amino acids in length. There are two completely conserved residues (I and Y) that may be functionally important. 116
56691 372887 pfam14038 YqzE YqzE-like protein. The YqzE-like protein family includes the B. subtilis YqzE protein, which is functionally uncharacterized. It is a part of the ComG operon, which is regulated by the competence transcription factor ComK. This family of proteins is found in bacteria. Proteins in this family are typically between 49 and 66 amino acids in length. 53
56692 404850 pfam14039 YusW YusW-like protein. The YusW-like protein family includes the B. subtilis YusW protein, which is functionally uncharacterized. This family of proteins is found in bacteria, and is approximately 90 amino acids in length. 91
56693 404851 pfam14040 DNase_NucA_NucB Deoxyribonuclease NucA/NucB. Members of this family act as deoxyribonucleases. 113
56694 404852 pfam14041 Lipoprotein_21 LppP/LprE lipoprotein. The family includes putative lipoproteins LppP and LprE from species of Mycobacterium. LppP is required for optimal growth of M. tuberculosis. 86
56695 404853 pfam14042 DUF4247 Domain of unknown function (DUF4247). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 143 and 271 amino acids in length. 122
56696 379438 pfam14043 WVELL WVELL protein. This family includes the B. subtilis YfjH protein, which is functionally uncharacterized. This is not a homolog of E. coli YfjH, a synonym for IscX, which belongs to pfam04384. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length and contain a highly conserved WVELL motif. 74
56697 404854 pfam14044 NETI NETI protein. This family includes the B. subtilis YebG protein, which is functionally uncharacterized. This is not a homolog of E. coli YebG, which belongs to pfam07130. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 66 amino acids in length and contain a conserved NETI motif. 56
56698 404855 pfam14045 YIEGIA YIEGIA protein. This family includes the B. subtilis YphB protein, which is functionally uncharacterized. Its expression is regulated by the sporulation transcription factor sigma-F, however it is not essential for sporulation or germination. This is not a homolog of E. coli YphB, which belongs to pfam01263. This family of proteins is found in bacteria. Proteins in this family are typically between 276 and 300 amino acids in length and contain a conserved YIEGIA motif. 282
56699 404856 pfam14046 NR_Repeat Nuclear receptor repeat. This is a repeat domain involved in dimerization of nuclear receptor proteins and in transcriptional regulation in general. It contains a Leu-Xaa-Xaa-Leu-Leu motif which has been characterized for the orphan nuclear receptor Dax-1, which represses the constitutively expressed protein Ad4BP/SF-1. The LXXLL motif plays an important role in binding of Dax-1 to Ad4BP/SF-1. The domain is subject to structure determination by the Joint Center of Structural Genomics. 47
56700 404857 pfam14047 DCR Dppa2/4 conserved region. This domain has been characterized in the finding of a developmental pluripotency associated gene (Dppa) in the lower vertebrate Xenopus laevis. Previous to this discovery, Dppa genes were known only in higher vertebrates. The domain is subject to structure determination by the Joint Center of Structural Genomics. 67
56701 404858 pfam14048 MBD_C C-terminal domain of methyl-CpG binding protein 2 and 3. CpG-methylation is a frequently occurring epigenetic modification of vertebrate genomes resulting in transcriptional repression. This domain was found at the C-terminus of the methyl-CpG-binding domain (MBD) containing proteins MBD2 and MBD3, the latter was shown to not bind directly to methyl-CpG DNA but rather interact with components of the NuRD/Mi2 complex, an abundant deacetylase complex. The domain is subject to structure determination by the Joint Center of Structural Genomics. 93
56702 404859 pfam14049 Dppa2_A Dppa2/4 conserved region in higher vertebrates. Developmental pluripotency associated genes (Dppa) in lower vertebrates have remained undetected until the discovery of a Dppa homolog in Xenopus laevis, reporting a new domain termed Dppa2/4 conserved region (DCR). In higher vertebrate Dppa proteins the DCR domain is located next to the here-reported domain. The domain is subject to structure determination by the Joint Center of Structural Genomics. 85
56703 404860 pfam14050 Nudc_N N-terminal conserved domain of Nudc. The N-terminus of nuclear distribution gene C homolog (NUDC) proteins contains a highly conserved region consisting of a predicted three helix bundle. In the human homolog this segment has been targeted for structure determination by the Joint Center for Structural Genomics. NUDC forms a complex with other NUD proteins and is involved in several cellular division activities. Recently it was shown that NUDC regulates platelet-activating factor (PAF) acetylhydrolase with PAF being a pro-inflammatory secondary lipidic messenger. 60
56704 404861 pfam14051 Requiem_N N-terminal domain of DPF2/REQ. This putative domain has been detected on the human DPF2 protein and was subsequently targeted for structure determination by the Joint Center for Structural Genomics (JCSG). Possibly, the C-terminus extends by 30 amino acids and forms a separate domain. DPF2 interacts with estrogen related receptor alpha (Err-alpha), an orphan receptor which acts as a regulator in energy metabolism. It was also identified as an adaptor molecule that links nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappa-B) dimer RelB/p52 and switch/sucrose-nonfermentable (SWI/SNF) chromatin remodeling factor. 67
56705 404862 pfam14052 Caps_assemb_Wzi Capsule assembly protein Wzi. Many bacteria are covered in a layer of surface-associated polysaccharide called the capsule. These capsules can be divided into four groups depending upon the organisation of genes responsible for capsule assembly, the assembly pathway and regulation. This family plays a role in group 1 capsule biosynthesis. It is likely to be involved in the later stages of capsule assembly. It is likely to consist of a beta-barrel structure. 392
56706 404863 pfam14053 DUF4248 Domain of unknown function (DUF4248). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 86 amino acids in length. 66
56707 404864 pfam14054 DUF4249 Domain of unknown function (DUF4249). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 279 and 365 amino acids in length. There are two completely conserved residues (C and G) that may be functionally important. 257
56708 404865 pfam14055 NVEALA NVEALA protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 75 and 92 amino acids in length. There is a conserved NVEALA sequence motif. 64
56709 404866 pfam14056 DUF4250 Domain of unknown function (DUF4250). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There are two completely conserved residues (N and R) that may be functionally important. 55
56710 404867 pfam14057 GGGtGRT GGGtGRT protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 330 amino acids in length and contain many highly conserved residues including a GGGtGRT motif. 326
56711 404868 pfam14058 PcfK PcfK-like protein. The PcfK-like protein family includes the Enterococcus faecalis PcfK protein, which is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 137 and 257 amino acids in length. There are two completely conserved residues (D and L) that may be functionally important. 137
56712 404869 pfam14059 DUF4251 Domain of unknown function (DUF4251). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 164 and 196 amino acids in length. 132
56713 404870 pfam14060 DUF4252 Domain of unknown function (DUF4252). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 182 amino acids in length. 123
56714 404871 pfam14061 Mtf2_C Polycomb-like MTF2 factor 2. Mammalian Polycomb-like gene MTF2/PCL2 forms a complex with Polycomb repressive complex-2 (PRC2) and collaborates with PRC1 to achieve repression of Hox gene expression. The human MTF2 gene is expressed in three splicing variants, each of them contains the short C-terminal domain defined here. The domain is subject to structure determination by the Joint Center of Structural Genomics. 48
56715 404872 pfam14062 DUF4253 Domain of unknown function (DUF4253). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length. 109
56716 404873 pfam14063 DUF4254 Protein of unknown function (DUF4254). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 195 and 207 amino acids in length. 144
56717 404874 pfam14064 HmuY HmuY protein. HmuY is a novel heme-binding protein that recruits heme from host carriers and delivers it to its cognate outer-membrane transporter, the TonB-dependent receptor HmuR. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 278 amino acids in length. 155
56718 404875 pfam14065 DUF4255 Protein of unknown function (DUF4255). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 190 and 320 amino acids in length. 176
56719 404876 pfam14066 DUF4256 Protein of unknown function (DUF4256). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 190 amino acids in length. 173
56720 404877 pfam14067 LssY_C LssY C-terminus. This domain is found at the C-terminus of Legionella LssY proteins, which may be a part of the type I secretion system. This domain is functionally uncharacterized. This domain is found in bacteria, and is typically between 182 and 195 amino acids in length. It is often found in association with pfam09335 and PF01569. There are two completely conserved residues (P and W) that may be functionally important. 188
56721 404878 pfam14068 YuiB Putative membrane protein. This family of bacterial proteins is functionally uncharacterized. Proteins in this family are approximately 100 amino acids in length. There is a conserved FGIGF sequence motif, and many members are putative membrane proteins. 101
56722 404879 pfam14069 SpoVIF Stage VI sporulation protein F. The sporulation-specific SpoVIF (YjcC) protein of Bacillus subtilis is essential for the development of heat-resistant spores. Its expression is governed by SigK. 72
56723 404880 pfam14070 YjfB_motility Putative motility protein. This family of proteins is regulated in B. subtilis by SigD, and is likely to be involved in motility or flagellin production. Proteins in this family are approximately 60 amino acids in length, and contain two highly conserved asparagine residues. 57
56724 404881 pfam14071 YlbD_coat Putative coat protein. This is a family of putative bacterial coat proteins. Proteins in this family are approximately 140 amino acids in length. 127
56725 404882 pfam14072 DndB DNA-sulfur modification-associated. This is a family of bacterial proteins likely to be necessary for binding to DNA and recognising the modification sites. Members are found in bacteria, archaea and on viral plasmids, and are typically between 354 and 474 amino acids in length. There is a conserved DGQHR sequence motif. 337
56726 404883 pfam14073 Cep57_CLD Centrosome localization domain of Cep57. The CLD or centrosome localization domain of Cep57 is found at the N-terminus, and lies approximately between residues 58 and 239. This region lies within the first alpha-helical coiled-coil segment of Cep57, and localizes to the centrosome internally to gamma-tubulin, suggesting that it is either on both centrioles or on a centromatrix component. This N-terminal region can also multimerize with the N-terminus of other Cep57 molecules. The C-terminal part, Family Cep57_MT_bd, pfam06657, is the microtubule-binding region of Cep57. 178
56727 372902 pfam14074 DUF4257 Protein of unknown function (DUF4257). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 80
56728 404884 pfam14075 UBN_AB Ubinuclein conserved middle domain. Ubinuclein 1 and 2 (UBN1, UBN2) are members of a histone chaperone complex involved in the formation of a certain type of facultative heterochromatin, called senescence-associated heterochromatin foci (SAHF). The domain described here is conserved in many eukaryotes such as human, rat, drosophila, and zebra-fish and has been targeted for protein structure determination by the Joint Center for Structural Genomics. 211
56729 404885 pfam14076 DUF4258 Domain of unknown function (DUF4258). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 95 and 124 amino acids in length. 70
56730 404886 pfam14077 WD40_alt Alternative WD40 repeat motif. WD repeats are short subdomains of about 40 amino acids and fold into 4 antiparallel beta hairpins. This domain here has been detected on the C-terminus of WD repeat-containing protein 18 during target selection by the Joint Center for Structural Genomics. 48
56731 404887 pfam14078 DUF4259 Domain of unknown function (DUF4259). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 118 and 145 amino acids in length. 130
56732 404888 pfam14079 DUF4260 Domain of unknown function (DUF4260). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 114 and 126 amino acids in length. There is a conserved GLK sequence motif. 112
56733 404889 pfam14080 DUF4261 Domain of unknown function (DUF4261). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 80 amino acids in length. 77
56734 404890 pfam14081 DUF4262 Domain of unknown function (DUF4262). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 147 and 227 amino acids in length. Some members may be incorrectly annotated as the KatG protein. 127
56735 404891 pfam14082 DUF4263 Domain of unknown function (DUF4263). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 244 and 403 amino acids in length. 163
56736 290790 pfam14083 PGDYG PGDYG protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. There is a conserved PGDYG motif. 101
56737 404892 pfam14084 DUF4264 Protein of unknown function (DUF4264). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 52
56738 404893 pfam14085 DUF4265 Domain of unknown function (DUF4265). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 139 and 168 amino acids in length. 111
56739 404894 pfam14086 DUF4266 Domain of unknown function (DUF4266). This presumed lipoprotein domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 50 amino acids in length. 50
56740 404895 pfam14087 DUF4267 Domain of unknown function (DUF4267). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 126 and 142 amino acids in length. 110
56741 404896 pfam14088 DUF4268 Domain of unknown function (DUF4268). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 151 and 387 amino acids in length. 138
56742 372908 pfam14089 KbaA KinB-signalling pathway activation in sporulation. This family of small proteins is found in the membrane and is necessary for kinase KinB signalling during sporulation. There is a conserved GFF sequence motif. The initiation of sporulation in Bacillus subtilis is dependent on the phosphorylation of the Spo0A transcription factor mediated by the phospho-relay and by two major kinases, KinA and KinB. 179
56743 404897 pfam14090 HTH_39 Helix-turn-helix domain. This helix-turn-helix domain is often found in phage proteins and is likely to be DNA-binding. 70
56744 404898 pfam14091 DUF4269 Domain of unknown function (DUF4269). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 176 and 187 amino acids in length. There is a conserved KTE sequence motif. 151
56745 404899 pfam14092 DUF4270 Domain of unknown function (DUF4270). This family of lipoproteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 444 and 534 amino acids in length. 442
56746 404900 pfam14093 DUF4271 Domain of unknown function (DUF4271). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 221 and 326 amino acids in length. 207
56747 404901 pfam14094 DUF4272 Domain of unknown function (DUF4272). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 221 and 399 amino acids in length. 207
56748 404902 pfam14096 DUF4274 Domain of unknown function (DUF4274). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 80 amino acids in length. 76
56749 404903 pfam14097 SpoVAE Stage V sporulation protein AE1. Members of this family are all described as putative stage V sporulation protein AE, although this could not be confirmed. Proteins in this family are approximately 190 amino acids in length. 179
56750 404904 pfam14098 SSPI Small, acid-soluble spore protein I. This family of proteins is putatively assigned as a small, acid-soluble spore protein 1. Proteins in this family are approximately 70 amino acids in length. There is a conserved LPGLGV sequence motif. 65
56751 404905 pfam14099 Polysacc_lyase Polysaccharide lyase. This family includes heparin lyase I, EC:4.2.2.7. Heparin lyase I depolymerizes heparin by cleaving the glycosidic linkage next to an iduronic acid moiety. The structure of heparin lyase I consists of a beta-jelly roll domain with a long, deep substrate-binding groove and an unusual thumb domain containing many basic residues extending from the main body of the enzyme. This family also includes glucuronan lyase, EC:4.2.2.14. The structure of glucuronan lyase is a beta-jelly roll. 213
56752 404906 pfam14100 PmoA Methane oxygenase PmoA. This family is a putative methane oxygenase. 272
56753 316612 pfam14101 DUF4275 Domain of unknown function (DUF4275). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. 139
56754 404907 pfam14102 Caps_synth_CapC Capsule biosynthesis CapC. This family of proteins plays a role in capsule biosynthesis. They are essential for gamma-polyglutamic acid (PGA) production. 119
56755 404908 pfam14103 DUF4276 Domain of unknown function (DUF4276). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 190 and 224 amino acids in length. There is a single completely conserved residue E that may be functionally important. 186
56756 404909 pfam14104 DUF4277 Domain of unknown function (DUF4277). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 110 amino acids in length. There is a conserved NGLGF sequence motif. 109
56757 404910 pfam14105 DUF4278 Domain of unknown function (DUF4278). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 58 and 136 amino acids in length. There is a single completely conserved residue R that may be functionally important. 56
56758 404911 pfam14106 DUF4279 Domain of unknown function (DUF4279). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 134 and 145 amino acids in length. 116
56759 404912 pfam14107 DUF4280 Domain of unknown function (DUF4280). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 129 and 456 amino acids in length. There is a single completely conserved residue C that may be functionally important. 109
56760 404913 pfam14108 DUF4281 Domain of unknown function (DUF4281). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 147 and 232 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important. 127
56761 404914 pfam14109 GldH_lipo GldH lipoprotein. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum. Proteins in this family are typically between 155 and 167 amino acids in length. Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility. 129
56762 404915 pfam14110 DUF4282 Domain of unknown function (DUF4282). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 93 and 155 amino acids in length. There is a single completely conserved residue E that may be functionally important. 86
56763 404916 pfam14111 DUF4283 Domain of unknown function (DUF4283). This domain family is found in plants, and is approximately 100 amino acids in length. Considering the very diverse range of other domains it is associated with it is possible that this domain is a binding/guiding region. There are two highly conserved tryptophan residues. 145
56764 404917 pfam14112 DUF4284 Immunity protein 22. A predicted immunity protein with an alpha+beta fold and conserved tryptophan, tyrosine and acidic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the ColD/E5, Tox-REase-4, Ntox49 or Ntox14 families. The domain is also found in heterogeneous polyimmunity loci. 121
56765 404918 pfam14113 Tae4 Type VI secretion system (T6SS), amidase effector protein 4. Tae4 is a new form of toxin-antitoxin system protein for a type VI secretion system, T6SS. T6SS has roles in interspecies interactions, as well as higher order host-infection, by injecting effector proteins into the periplasmic compartment of the recipient cells of closely related species. Pseudomonas aeruginosa produces at least three effector proteins to other cells and thus has three specific cognate immunity proteins to protect itself. Tae4, or type VI amidase effector 4, in Enterobacter cloacae has a cognate Tai4 or type VI amidase immunity 4 protein. The immunity protein is Tai4, pfam16695. 114
56766 404919 pfam14114 DUF4286 Domain of unknown function (DUF4286). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 100 and 112 amino acids in length. 95
56767 316626 pfam14115 YuzL YuzL-like protein. The YuzL-like protein family includes the B. subtilis YuzL protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. 41
56768 404920 pfam14116 YyzF YyzF-like protein. The YyzF-like protein family includes the B. subtilis YyzF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 48
56769 404921 pfam14117 DUF4287 Domain of unknown function (DUF4287). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 70 and 180 amino acids in length. 60
56770 290824 pfam14118 YfzA YfzA-like protein. The YfzA-like protein family includes the B. subtilis YfzA protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. 90
56771 379479 pfam14119 DUF4288 Domain of unknown function (DUF4288). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 84
56772 379480 pfam14120 YhzD YhzD-like protein. The YhzD-like protein family includes the B. subtilis YhzD protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved GKL sequence motif. 61
56773 404921 pfam14121 Porin_10 Putative porin. This family of membrane beta-barrel proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 655 and 722 amino acids in length. SRI_1264 is identified by Gene3D as a membrane bound beta-barrel. These sequences are putative porins. 602
56774 316632 pfam14122 YokU YokU-like protein, putative antitoxin. The YokU-like protein family includes the B. subtilis YokU protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved CXXC sequence motifs. This is likely to be a family of bacterial antitoxins, as the sequence bears remote homology to the RelE fold family. 87
56775 404923 pfam14123 DUF4290 Domain of unknown function (DUF4290). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 200 and 221 amino acids in length. There are two conserved sequence motifs: EYGR and KLWD. 172
56776 404924 pfam14124 DUF4291 Domain of unknown function (DUF4291). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 190 and 214 amino acids in length. There are two conserved sequence motifs: VYQAY and RMTW. 180
56777 404925 pfam14125 DUF4292 Domain of unknown function (DUF4292). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 243 and 287 amino acids in length. 207
56778 404926 pfam14126 DUF4293 Domain of unknown function (DUF4293). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 136 and 154 amino acids in length. 153
56779 404927 pfam14127 DUF4294 Domain of unknown function (DUF4294). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 192 and 226 amino acids in length. 149
56780 404928 pfam14128 DUF4295 Domain of unknown function (DUF4295). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There are two completely conserved residues (K and Y) that may be functionally important. 47
56781 404929 pfam14129 DUF4296 Domain of unknown function (DUF4296). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. 87
56782 404930 pfam14130 DUF4297 Domain of unknown function (DUF4297). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is typically between 207 and 221 amino acids in length. 212
56783 404931 pfam14131 DUF4298 Domain of unknown function (DUF4298). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 94 and 105 amino acids in length. There are two completely conserved residues (Y and D) that may be functionally important. 87
56784 404932 pfam14132 DUF4299 Domain of unknown function (DUF4299). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 275 and 313 amino acids in length. There are two conserved sequence motifs: RGF and DAY. There are two completely conserved residues (P and D) that may be functionally important. 301
56785 404933 pfam14133 DUF4300 Domain of unknown function (DUF4300). This family of lipoproteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 281 and 303 amino acids in length. There are two conserved sequence motifs: NCR and PYQ. 252
56786 404934 pfam14134 DUF4301 Domain of unknown function (DUF4301). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 505 and 516 amino acids in length. 508
56787 404935 pfam14135 DUF4302 Domain of unknown function (DUF4302). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 344 and 443 amino acids in length. There are two completely conserved residues (R and L) that may be functionally important. 234
56788 404936 pfam14136 DUF4303 Domain of unknown function (DUF4303). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 169 and 192 amino acids in length. 153
56789 404937 pfam14137 DUF4304 Domain of unknown function (DUF4304). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 223 amino acids in length. 114
56790 404938 pfam14138 COX16 Cytochrome c oxidase assembly protein COX16. This family represents homologs of COX16 which has been shown to be involved in assembly of cytochrome oxidase. Proteins in this family are typically between 106 and 134 amino acids in length. 79
56791 404939 pfam14139 YpzG YpzG-like protein. The YpzG-like protein family includes the B. subtilis YpzG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a conserved QVNG sequence motif. 49
56792 290845 pfam14140 YpzI YpzI-like protein. The YpzI-like protein family includes the B. subtilis YpzI protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. 42
56793 372925 pfam14141 YqzM YqzM-like protein. The YqzM-like protein family includes the B. subtilis YqzM protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. 40
56794 290847 pfam14142 YrzO YrzO-like protein. The YrzO-like protein family includes the B. subtilis YrzO protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. 46
56795 372926 pfam14143 YrhC YrhC-like protein. The YrhC-like protein family includes the B. subtilis YrhC protein, which is functionally uncharacterized. YrhC is on the same operon as the MccA and MccB genes, which are involved in the conversion of methionine to cysteine. Expression of this operon is repressed in the presence of sulphate or cysteine. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 72
56796 404940 pfam14144 DOG1 Seed dormancy control. This family of plant proteins appears to be a highly specific controller of seed dormancy. 76
56797 404941 pfam14145 YrhK YrhK-like protein. The YrhK-like protein family includes the B. subtilis YrhK protein, which is functionally uncharacterized. Its expression is under the control of the motility sigma factor sigma-D. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length. 57
56798 316654 pfam14146 DUF4305 Domain of unknown function (DUF4305). This family includes the B. subtilis YdiK protein, which is functionally uncharacterized. This is not a homolog of E. coli YdiK, which belongs to pfam01594. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 37
56799 290852 pfam14147 Spore_YhaL Sporulation protein YhaL. This family of proteins is involved in sporulation. In B. subtilis its expression is regulated by the early mother-cell-specific transcription factor sigma-E. 52
56800 290853 pfam14148 YhdB YhdB-like protein. The YhdB-like protein family includes the B. subtilis YhdB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 57 and 82 amino acids in length. There are two conserved sequence motifs: LMVRT and FLHAY. 71
56801 404942 pfam14149 YhfH YhfH-like protein. The YhfH-like protein family includes the B. subtilis YhfH protein, which is functionally uncharacterized. Its expression is repressed by the Spx paralogue MgsR, which regulates genes involved in stress response. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 53 amino acids in length. 37
56802 372929 pfam14150 YesK YesK-like protein. The YesK-like protein family includes the B. subtilis YesK protein, which is functionally uncharacterized. Its expression is regulated by the sporulation-specific sigma factor sigma-E. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. 81
56803 372930 pfam14151 YfhD YfhD-like protein. The YfhD-like protein family includes the B. subtilis YfhD protein, which is functionally uncharacterized. Its expression is regulated by the sporulation-specific sigma factor sigma-F. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a single completely conserved residue E that may be functionally important. 59
56804 372931 pfam14152 YfhE YfhE-like protein. The YfhE-like protein family includes the B. subtilis YfhE protein, which is functionally uncharacterized. Its expression may be regulated by the sigma factor sigma-B, which regulates the expression of stress-response proteins. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved QEV sequence motif. 36
56805 372932 pfam14153 Spore_coat_CotO Spore coat protein CotO. Bacillus spores are protected by a protein shell consisting of over 50 different polypeptides, known as the coat. This family of proteins has an important morphogenetic role in coat assembly, it is involved in the assembly of at least 5 different coat proteins including CotB, CotG, CotS, CotSA and CotW. It is likely to act at a late stage of coat assembly. 180
56806 316659 pfam14154 DUF4306 Domain of unknown function (DUF4306). This family includes the B. subtilis YjdJ protein, which is functionally uncharacterized. This is not a homolog of E. coli YjdJ, which belongs to pfam00583. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 152 amino acids in length. 88
56807 404943 pfam14155 DUF4307 Domain of unknown function (DUF4307). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 132 and 153 amino acids in length. There is a single completely conserved residue C that may be functionally important. 111
56808 404944 pfam14156 AbbA_antirepres Antirepressor AbbA. This family inactivates the repressor AbrB, which represses genes switched on during the transition from the exponential to the stationary phase of growth. It binds to AbrB and prevents it from binding to DNA. 63
56809 404945 pfam14157 YmzC YmzC-like protein. The YmzC-like protein family includes the B. subtilis YmzC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 91 amino acids in length. There is a conserved ELR sequence motif. 58
56810 404946 pfam14158 YndJ YndJ-like protein. The YndJ-like protein family includes the B. subtilis YndJ protein, which is functionally uncharacterized. This family is found in bacteria and archaea, and is typically between 222 and 269 amino acids in length. There are two completely conserved G residues that may be functionally important. 260
56811 404947 pfam14159 CAAD CAAD domains of cyanobacterial aminoacyl-tRNA synthetase. This domain is present in aminoacyl-tRNA synthetases (aaRSs), enzymes that couple tRNAs to their cognate amino acids. aaRSs from cyanobacteria containing the CAAD (for cyanobacterial aminoacyl-tRNA synthetases appended domain) protein domains are localized in the thylakoid membrane. The domain bears two putative transmembrane helices and is present in glutamyl-, isoleucyl-, leucyl-, and valyl-tRNA synthetases, the latter of which has probably recruited the domain more than once during evolution. 85
56812 404948 pfam14160 FAM110_C Centrosome-associated C-terminus. This is the C-terminus of a family of proteins that colocalize with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis. 113
56813 404949 pfam14161 FAM110_N Centrosome-associated N-terminus. This is the N-terminus of a family of proteins that colocalize with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis. 107
56814 316666 pfam14162 YozD YozD-like protein. The YozD-like protein family includes the B. subtilis YozD protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 57
56815 404950 pfam14163 SieB Super-infection exclusion protein B. This family includes superinfection exclusion proteins. These proteins prevent the growth of superinfecting phage which are insensitive to repression. It aborts lytic development of superinfecting phage. 147
56816 372939 pfam14164 YqzH YqzH-like protein. The YqzH-like protein family includes the B. subtilis YqzH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 64
56817 316669 pfam14165 YtzH YtzH-like protein. The YtzH-like protein family includes the B. subtilis YtzH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There is a conserved DIL sequence motif. 86
56818 372940 pfam14166 YueH YueH-like protein. The YueH-like protein family includes the B. subtilis YueH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 79
56819 372941 pfam14167 YfkD YfkD-like protein. The YfkD-like protein family includes the B. subtilis YfkD protein, which is functionally uncharacterized. Its expression is regulated by the sigma factor sigma-B, which regulates the expression of stress-response proteins, and by the forespore-specific sigma factor sigma-G. This family of proteins is found in bacteria. Proteins in this family are typically between 254 and 265 amino acids in length. 232
56820 404951 pfam14168 YjzC YjzC-like protein. The YjzC-like protein family includes the B. subtilis YjzC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 55
56821 372943 pfam14169 YdjO Cold-inducible protein YdjO. This family includes the B. subtilis YdjO protein, which is functionally uncharacterized. This is not a homolog of E. coli YdjO. B. subtilis YdjO is cold-inducible. Its expression is induced by the extracytoplasmic function sigma factor sigma-W. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. 59
56822 404952 pfam14171 SpoIISA_toxin Toxin SpoIISA, type II toxin-antitoxin system. SpoIISA is a toxin which causes lysis of vegetatively growing cells. It forms part of a type II toxin-antitoxin system, where the SpoIISB protein, pfam14185, acts as an antitoxin. It is a transmembrane protein, with a cytoplasmic domain accounting for approximately two-thirds of the protein. The structure of the cytoplasmic domain resembles that of the GAF domains, pfam01590. SpoIISB binds to the cytoplasmic domain of SpoIISA with high affinity. 236
56823 404953 pfam14172 DUF4309 Domain of unknown function (DUF4309). This family includes the B. subtilis YjgB protein, which is functionally uncharacterized. This is not a homolog of E. coli YjgB. Expression of B. subtilis YjgB is regulated by the alternative transcription factor sigma-B. This family is found in bacteria, and is approximately 140 amino acids in length. 130
56824 372946 pfam14173 ComGG ComG operon protein 7. This family is required for DNA-binding during transformation of competent bacterial cells. 95
56825 379493 pfam14174 YycC YycC-like protein. The YycC-like protein family includes the B. subtilis YycC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a conserved HIL sequence motif. 50
56826 404954 pfam14175 YaaC YaaC-like Protein. The YaaC-like protein family includes the B. subtilis YaaC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 320 and 333 amino acids in length. 313
56827 404955 pfam14176 YxiJ YxiJ-like protein. The YxiJ-like protein family includes the B. subtilis YxiJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 110
56828 372949 pfam14177 YkyB YkyB-like protein. The YkyB-like protein family includes the B. subtilis YkyB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. There are two conserved sequence motifs: NRHAKTA and HLG. 135
56829 290882 pfam14178 YppF YppF-like protein. The YppF-like protein family includes the B. subtilis YppF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved LLDF sequence motif. 59
56830 372950 pfam14179 YppG YppG-like protein. The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important. 101
56831 404956 pfam14181 YqfQ YqfQ-like protein. The YqfQ-like protein family includes the B. subtilis YqfQ protein, also known as VrrA, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 146 and 237 amino acids in length. There are two conserved sequence motifs: QYGP and PKLY. 166
56832 372951 pfam14182 YgaB YgaB-like protein. The YgaB-like protein family includes the B. subtilis YgaB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 76
56833 404957 pfam14183 YwpF YwpF-like protein. The YwpF-like protein family includes the B. subtilis YwpF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 146 and 167 amino acids in length. There is a conserved IIN sequence motif. 134
56834 404958 pfam14184 YrvL Regulatory protein YrvL. YrvL prevents expression and activity of the YrvI sigma factor. It may function as an anti-sigma factor. 121
56835 290888 pfam14185 SpoIISB_antitox Antitoxin SpoIISB, type II toxin-antitoxin system. Members of this family act as antitoxins. They bind to the SpoIISA toxin, pfam14171. They are disordered proteins which adopt structure only when bound to SpoIISA. 55
56836 404959 pfam14186 Aida_C2 Cytoskeletal adhesion. This is the C-terminal domain of the axin-interacting protein family, and is a distinct version of the C2 domain. This domain is critical for interactions with the cytoskeleton in the context of cellular adhesion points. 139
56837 404960 pfam14187 DUF4310 Domain of unknown function (DUF4310). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 231 amino acids in length. 208
56838 404961 pfam14188 DUF4311 Domain of unknown function (DUF4311). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 260 amino acids in length. 212
56839 404962 pfam14189 DUF4312 Domain of unknown function (DUF4312). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 99 and 118 amino acids in length. 84
56840 404963 pfam14190 DUF4313 Domain of unknown function (DUF4313). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 136 and 171 amino acids in length. 103
56841 404964 pfam14191 YodL YodL-like. The YodL-like protein family includes the B. subtilis YodL protein, which is functionally uncharacterized. This domain family is found in bacteria, and is approximately 100 amino acids in length. There are two completely conserved residues (Y and D) that may be functionally important. 101
56842 404965 pfam14192 DUF4314 Domain of unknown function (DUF4314). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 56 and 93 amino acids in length. 69
56843 404966 pfam14193 DUF4315 Domain of unknown function (DUF4315). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 79
56844 404967 pfam14194 Cys_rich_VLP Cysteine-rich VLP. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. It contains 6 conserved cysteines and a conserved VLP sequence motif. 57
56845 404968 pfam14195 DUF4316 Domain of unknown function (DUF4316). This domain is functionally uncharacterized. This domain is found in bacteria, and is typically between 56 and 95 amino acids in length. 44
56846 404969 pfam14196 ATC_hydrolase L-2-amino-thiazoline-4-carboxylic acid hydrolase. This family of enzymes catalyzes the conversion of L-2-amino-delta2-thiazoline-4-carboxylic acid (L-ATC) to N-carbamoyl-L-cysteine. It cleaves the carbon-sulphur bond in the ring structure of L-ATC to produce N-carbamoyl-L-cysteine. 145
56847 372959 pfam14197 Cep57_CLD_2 Centrosome localization domain of PPC89. The N-terminal region of the fission yeast spindle pole body protein PPC89 has low similarity to the human Cep57 protein. The CLD or centrosome localization domain of Cep57 and PPC89 is found at the N-terminus. This region localizes to the centrosome internally to gamma-tubulin, suggesting that it is either on both centrioles or on a centromatrix component. This N-terminal region can also multimerize with the N-terminus of other Cep57 molecules. The C-terminal part, Family Cep57_MT_bd, pfam06657, is the microtubule-binding region of Cep57 and PPC89. 67
56848 404970 pfam14198 TnpV Transposon-encoded protein TnpV. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 114 and 125 amino acids in length. 112
56849 404971 pfam14199 DUF4317 Domain of unknown function (DUF4317). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 225 and 451 amino acids in length. There is a single completely conserved residue P that may be functionally important. 370
56850 404972 pfam14200 RicinB_lectin_2 Ricin-type beta-trefoil lectin domain-like. 89
56851 404973 pfam14201 DUF4318 Domain of unknown function (DUF4318). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There is a single completely conserved residue F that may be functionally important. 75
56852 404974 pfam14202 TnpW Transposon-encoded protein TnpW. This family of proteins is found in bacteria. Proteins in this family are typically between 54 and 75 amino acids in length. There is a single completely conserved residue G that may be functionally important. 35
56853 404975 pfam14203 TTRAP Putative transposon-transfer assisting protein. TTRAP is a family of small bacterial proteins largely from Clostridium difficile. From comparative and other structural studies of the Structure 2L7K, UniProtKB:Q18AW3, it has been suggested that this family is required for interacting with other proteins in order to facilitate the transfer of the transposon CTn4 between different bacterial species. Structure 2L7K comprises an alpha-helical fold of four alpha-helices leading to the production of two clefts, the larger of which displays two highly conserved residues in close proximity, Glu-8 and Lys-48. The gene concerned is part of an operon within transposon CTn4, and is expressed alongside a putative DNA primase, a DNA topoisomerase and conjugal transfer proteins. 62
56854 404976 pfam14204 Ribosomal_L18_c Ribosomal L18 C-terminal region. This domain is the C-terminal end of ribosomal L18/L5 proteins. 93
56855 404977 pfam14205 Cys_rich_KTR Cysteine-rich KTR. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 60 amino acids in length. There are 4 conserved cysteines and a conserved KTR sequence motif. 54
56856 404978 pfam14206 Cys_rich_CPCC Cysteine-rich CPCC. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 68 and 104 amino acids in length. There are six conserved cysteines and a conserved CPCC sequence motif. 75
56857 404979 pfam14207 DpnD-PcfM DpnD/PcfM-like protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 57 and 153 amino acids in length. There are two completely conserved residues (E and A) that may be functionally important. 46
56858 404980 pfam14208 DUF4320 Domain of unknown function (DUF4320). This family of proteins is found in bacteria. Proteins in this family are typically between 120 and 131 amino acids in length. There are two completely conserved residues (G and Y) that may be functionally important. 117
56859 404981 pfam14209 DUF4321 Domain of unknown function (DUF4321). This family of proteins is functionally uncharacterized. It is found in bacteria, and is approximately 50 amino acids in length. 48
56860 290912 pfam14210 DUF4322 Domain of unknown function (DUF4322). This presumed domain is functionally uncharacterized. This domain family is found in archaea, and is approximately 60 amino acids in length. There is a conserved QTV sequence motif. 66
56861 404982 pfam14213 DUF4325 STAS-like domain of unknown function (DUF4325). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 99 and 341 amino acids in length. This domain is distantly related to the STAS domain. 63
56862 404983 pfam14214 Helitron_like_N Helitron helicase-like domain at N-terminus. This family is found in Helitrons, recently recognized eukaryotic transposons that are predicted to amplify by a rolling-circle mechanism. In many instances a protein-coding gene is disrupted by their insertion. 197
56863 404984 pfam14215 bHLH-MYC_N bHLH-MYC and R2R3-MYB transcription factors N-terminal. This is the N-terminal region of a family of MYB and MYC transcription factors. The DNA-binding HLH domain is further downstream, pfam00010. Members of the MYB and MYC family regulate the biosynthesis of phenylpropanoids in several plant species (DOI:10.1007/s11295-009-0232-y). 121
56864 404985 pfam14216 DUF4326 Domain of unknown function (DUF4326). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important. 82
56865 404986 pfam14217 DUF4327 Domain of unknown function (DUF4327). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 67
56866 404987 pfam14218 COP23 Circadian oscillating protein COP23. This family includes the circadian oscillating protein COP23 from Cyanothece sp. (strain PCC 8801). The levels of this peripheral membrane protein display a circadian oscillation. 138
56867 404988 pfam14219 DUF4328 Domain of unknown function (DUF4328). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 218 and 342 amino acids in length. 166
56868 404989 pfam14220 DUF4329 Domain of unknown function (DUF4329). This domain is functionally uncharacterized. It is found in bacteria and eukaryotes, and is approximately 130 amino acids in length. It is often found in association with pfam05593 and pfam03527. There is a single completely conserved residue D and a highly conserved HTH motif which may be functionally important. 111
56869 404990 pfam14221 DUF4330 Domain of unknown function (DUF4330). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 165 and 177 amino acids in length. There is a single completely conserved residue G that may be functionally important. 167
56870 404991 pfam14222 MOR2-PAG1_N Cell morphogenesis N-terminal. This family is the conserved N-terminal region of proteins that are involved in cell morphogenesis. 547
56871 404992 pfam14223 Retrotran_gag_2 gag-polypeptide of LTR copia-type. This family is found in Plants and fungi, and contains LTR-polyproteins, or retrotransposons of the copia-type. 138
56872 404993 pfam14224 DUF4331 Domain of unknown function (DUF4331). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 223 and 526 amino acids in length. There is a conserved FPY sequence motif. 414
56873 404994 pfam14225 MOR2-PAG1_C Cell morphogenesis C-terminal. This family is the conserved C-terminal region of proteins that are involved in cell morphogenesis. 252
56874 404995 pfam14226 DIOX_N non-haem dioxygenase in morphine synthesis N-terminal. This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. 118
56875 372971 pfam14228 MOR2-PAG1_mid Cell morphogenesis central region. This family is the conserved central region of proteins that are involved in cell morphogenesis. 1114
56876 404996 pfam14229 DUF4332 Domain of unknown function (DUF4332). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 134 and 356 amino acids in length. This domain contains helix-hairpin-helix motifs. 122
56877 404997 pfam14230 DUF4333 Domain of unknown function (DUF4333). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 140 and 255 amino acids in length. There are two completely conserved C residues that may be functionally important. 76
56878 404998 pfam14231 GXWXG GXWXG protein. This domain is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. There is a conserved GXWXG motif. This domain is frequently found at the N-terminus of pfam14232. 59
56879 404999 pfam14232 DUF4334 Domain of unknown function (DUF4334). This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. This domain is frequently found at the C-terminus of pfam14231. 55
56880 405000 pfam14233 DUF4335 Domain of unknown function (DUF4335). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 204 and 480 amino acids in length. There are two completely conserved residues (G and D) that may be functionally important. 184
56881 405001 pfam14234 DUF4336 Domain of unknown function (DUF4336). 321
56882 405002 pfam14235 DUF4337 Domain of unknown function (DUF4337). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 201 amino acids in length. There is a single completely conserved residue Q that may be functionally important. 153
56883 372975 pfam14236 DUF4338 Domain of unknown function (DUF4338). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 206 and 475 amino acids in length. 231
56884 405003 pfam14237 DUF4339 Domain of unknown function (DUF4339). This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important. 50
56885 405004 pfam14238 DUF4340 Domain of unknown function (DUF4340). This domain is found in bacteria, and is typically between 183 and 196 amino acids in length. 183
56886 379519 pfam14239 RRXRR RRXRR protein. This domain is found in bacteria, eukaryotes and viruses, and is approximately 180 amino acids in length. It contains a conserved RRXRR motif. It is often found in association with pfam01844. 173
56887 405005 pfam14240 YHYH YHYH protein. This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif. 187
56888 405006 pfam14242 DUF4342 Domain of unknown function (DUF4342). This family of proteins is found in bacteria. Proteins in this family are typically between 97 and 206 amino acids in length. There is a single completely conserved residue P that may be functionally important. 79
56889 405007 pfam14243 DUF4343 Domain of unknown function (DUF4343). This domain family is found in bacteria, eukaryotes and viruses, and is typically between 127 and 142 amino acids in length. 172
56890 405008 pfam14244 Retrotran_gag_3 gag-polypeptide of LTR copia-type. This family is found in Plants and fungi, and contains pol polyprotein-like retroelements or retrotransposons of the copia-type. It is a short domain at the very start of these polypeptides. 48
56891 290944 pfam14245 Pilin_PilA Type IV pilin PilA. This family consists of proteins which form type IV pili. In M. xanthus these pili are required for social motility. 136
56892 405009 pfam14246 TetR_C_7 AefR-like transcriptional repressor, C-terminal region. This family comprises the C-terminal domain of transcriptional regulators of the TetR family. It includes the AefR transcriptional regulator from P. syringae. It is found in association with pfam00440. 119
56893 405010 pfam14247 DUF4344 Putative metallopeptidase. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 247 and 291 amino acids in length. There is a conserved EED sequence motif. This is a putative metallopeptidase. 214
56894 405011 pfam14248 DUF4345 Domain of unknown function (DUF4345). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 125 and 141 amino acids in length. There is a single completely conserved residue E that may be functionally important. 119
56895 405012 pfam14249 Tocopherol_cycl Tocopherol cyclase. This family contains tocopherol cyclases. These enzymes are involved in the synthesis of tocopherols and tocotrienols (vitamin E). 332
56896 405013 pfam14250 AbrB-like AbrB-like transcriptional regulator. This family of DNA-binding proteins is likely to act as a transcriptional regulator. This family does not include E.coli AbrB, which belongs to pfam05145. 67
56897 405014 pfam14251 DUF4346 Domain of unknown function (DUF4346). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 127 and 502 amino acids in length. There are two conserved sequence motifs: LDP and DHA. Many members of this family have been annotated as dihydropteroate synthases, however no experimental evidence can be found for this and MJ0107 has been shown not to possess dihydropteroate synthase activity. 118
56898 405015 pfam14252 DUF4347 Domain of unknown function (DUF4347). This domain family is found in bacteria and eukaryotes, and is approximately 160 amino acids in length. There are two completely conserved residues (C and G) that may be functionally important. 164
56899 405016 pfam14253 AbiH Bacteriophage abortive infection AbiH. This family of proteins confers resistance to bacteriophage. 249
56900 405017 pfam14254 DUF4348 Domain of unknown function (DUF4348). Two structures have been solved from this DUF, Structure 4mjf and Structure 3sbu. TOPSAN records that both proteins are the only structural representatives of Pfam PF14254, DUF4348. There are no other significant hits in FFAS. DUF4348 has ~200 proteins, all from Bacteroidetes, and all with a single domain architecture with just one DUF4348 domain. There appears to be a possible gene duplication in the protein as the N-terminal domain (approx residues 25-174) and C-terminal domain (approx residues 175-286) superimpose quite well with ~1.9A r.m.s.d. and ~30% sequence identity. 230
56901 405018 pfam14255 Cys_rich_CPXG Cysteine-rich CPXCG. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There are 5 conserved cysteines which occur in a CPXCG motif and a DCXXCCXP motif. 49
56902 405019 pfam14256 YwiC YwiC-like protein. The YwiC-like protein family includes the B. subtilis YwiC protein, which is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length. There is a single completely conserved residue G that may be functionally important. 124
56903 405020 pfam14257 DUF4349 Domain of unknown function (DUF4349). This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 282 and 353 amino acids in length. There is a single completely conserved residue D that may be functionally important. 213
56904 405021 pfam14258 DUF4350 Domain of unknown function (DUF4350). This domain family is found in bacteria, archaea and eukaryotes, and is approximately 70 amino acids in length. 69
56905 405022 pfam14260 zf-C4pol C4-type zinc-finger of DNA polymerase delta. In fission yeast this zinc-finger domain appears to be the region of Pol3 that binds directly to the B-subunit, Cdc1. Pol delta is a hetero-tetrameric enzyme comprising four evolutionarily well-conserved proteins: the catalytic subunit Pol3 and three smaller subunits Cdc1, Cdc27 and Cdm1. 70
56906 405023 pfam14261 DUF4351 Domain of unknown function (DUF4351). This domain is found in bacteria, and is approximately 60 amino acids in length. 59
56907 405024 pfam14262 Cthe_2159 Carbohydrate-binding domain-containing protein Cthe_2159. Cthe_2159 from Clostridium thermocellum is the first representative of a novel family of cellulose and/or acid-sugar binding beta-helix proteins that share structural similarities with polysaccharide lyases. 264
56908 372985 pfam14263 DUF4354 Domain of unknown function (DUF4354). Several members of this family are annotated as being ATP/GTP-binding site motif A (P-loop) proteins, but this could not be confirmed. The one structure solved for this family, Structure 3NRF, exhibits an immunoglobulin-like beta-sandwich fold. Crystal packing suggests that a tetramer is a significant oligomerization state, and a disulfide bridge is formed between Cys 125 at the C-terminal end of the monomer, and Cys 69. 125
56909 405025 pfam14264 Glucos_trans_II Glucosyl transferase GtrII. This family includes glucosyl transferase II from the Shigella phage SfII, which mediates seroconversion of S. flexneri when the phage is integrated into the host chromosome. 307
56910 405026 pfam14265 DUF4355 Domain of unknown function (DUF4355). This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 180 and 214 amino acids in length. 119
56911 405027 pfam14266 YceG_bac Putative component of 'biosynthetic module'. YceG is a family of proteins found in bacteria. Proteins in this family are approximately 540 amino acids in length. YceG is an additional gene-product encoded in the Ter operon and might thus be part of a 'biosynthetic module' encoding certain enzymes. 480
56912 405028 pfam14267 DUF4357 Domain of unknown function (DUF4357). This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important. 54
56913 405029 pfam14268 YoaP YoaP-like. The YoaP-like domain is found at the C-terminus of the B. subtilis YoaP protein. It is found in bacteria and archaea, and is approximately 40 amino acids in length. The family is found in association with pfam00583. There is a single completely conserved residue A that may be functionally important. 41
56914 372989 pfam14269 Arylsulfotran_2 Arylsulfotransferase (ASST). 301
56915 405030 pfam14270 DUF4358 Domain of unknown function (DUF4358). This domain family is found in bacteria, and is approximately 110 amino acids in length. 97
56916 405031 pfam14271 DUF4359 Domain of unknown function (DUF4359). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. There are two completely conserved residues (P and S) that may be functionally important. 103
56917 405032 pfam14272 Gly_rich_SFCGS Glycine-rich SFCGS. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. There are a number of highly conserved motifs including an SFCGSGGAGA motif. 113
56918 405033 pfam14273 DUF4360 Domain of unknown function (DUF4360). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 200 and 228 amino acids in length. There is a conserved GCP sequence motif near the N-terminus. 178
56919 405034 pfam14274 DUF4361 Domain of unknown function (DUF4361). 141
56920 372994 pfam14275 DUF4362 Domain of unknown function (DUF4362). This family of proteins is found in bacteria. Proteins in this family are typically between 93 and 146 amino acids in length. There is a conserved IRIV sequence motif. 93
56921 405035 pfam14276 DUF4363 Domain of unknown function (DUF4363). This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. 107
56922 405036 pfam14277 DUF4364 Domain of unknown function (DUF4364). This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 180 amino acids in length. 162
56923 405037 pfam14278 TetR_C_8 Transcriptional regulator C-terminal region. This domain is a tetracycline repressor, domain 2, or C-terminus. 103
56924 405038 pfam14279 HNH_5 HNH endonuclease. This domain is related to other HNH domain families such as pfam01844, suggesting that these proteins have a nucleic acid cleaving function. 56
56925 405039 pfam14280 DUF4365 Domain of unknown function (DUF4365). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 182 and 530 amino acids in length. There is a single completely conserved residue D that may be functionally important. 141
56926 405040 pfam14281 PDDEXK_4 PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily. 178
56927 405041 pfam14282 FlxA FlxA-like protein. This family includes FlxA from E. coli. The expression of FlxA is regulated by the FliA sigma factor, a transcription factor specific for class 3 flagellar operons. However FlxA is not required for flagellar function or formation. 101
56928 405042 pfam14283 DUF4366 Domain of unknown function (DUF4366). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 227 and 387 amino acids in length. 144
56929 405043 pfam14284 PcfJ PcfJ-like protein. The PcfJ-like protein family includes the E. faecalis PcfJ protein, which is functionally uncharacterized. It is found in bacteria and viruses, and is typically between 159 and 170 amino acids in length. There is a conserved HCV sequence motif. 146
56930 379540 pfam14285 DUF4367 Domain of unknown function (DUF4367). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 229 and 435 amino acids in length. 110
56931 405044 pfam14286 DHHW DHHW protein. This family of proteins is found in bacteria. Proteins in this family are typically between 366 and 404 amino acids in length. There is a conserved DHHW motif. There is some distant homology to the Lipase_GDSL_2 family. 385
56932 405045 pfam14287 DUF4368 Domain of unknown function (DUF4368). This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam00239 and pfam07508. There is a single completely conserved residue G that may be functionally important. 64
56933 405046 pfam14288 FKS1_dom1 1,3-beta-glucan synthase subunit FKS1, domain-1. The FKS1_dom1 domain is likely to be the 'Class I' region just N-terminal to the first set of transmembrane helices that is involved in 1,3-beta-glucan synthesis itself. This family is found on proteins with family Glucan_synthase, pfam02364. 111
56934 405047 pfam14289 DUF4369 Domain of unknown function (DUF4369). This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam00578. 92
56935 372997 pfam14290 DUF4370 Domain of unknown function (DUF4370). 237
56936 405048 pfam14291 DUF4371 Domain of unknown function (DUF4371). 236
56937 405049 pfam14292 SusE SusE outer membrane protein. This family includes the SusE outer membrane protein from Bacteroides thetaiotaomicron. This protein has a role in starch utilisation, but is not essential for growth on starch. 106
56938 405050 pfam14293 YWFCY YWFCY protein. This family is found in bacteria, and is approximately 60 amino acids in length. There is a conserved YWFCY motif. It is often found in association with pfam02534. 60
56939 405051 pfam14294 DUF4372 Domain of unknown function (DUF4372). This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam01609. There is a single completely conserved residue G that may be functionally important. 74
56940 405052 pfam14295 PAN_4 PAN domain. 51
56941 405053 pfam14296 O-ag_pol_Wzy O-antigen polysaccharide polymerase Wzy. This family includes O-antigen polysaccharide polymerases. These enzymes link O-units via a glycosidic linkage to form a long O-antigen. These enzymes vary in specificity and sequence. 468
56942 405054 pfam14297 DUF4373 Domain of unknown function (DUF4373). This domain is found in bacteria, eukaryotes and viruses, and is approximately 90 amino acids in length. 89
56943 405055 pfam14298 DUF4374 Domain of unknown function (DUF4374). This family of proteins is found in bacteria. Proteins in this family are typically between 406 and 466 amino acids in length. 427
56944 405056 pfam14299 PP2 Phloem protein 2. Phloem protein 2 (PP2) is one of the most abundant and enigmatic proteins in the phloem sap. PP2 is translocated in the assimilate stream where its lectin activity or RNA-binding properties can exert effects over long distances. 152
56945 405057 pfam14300 DUF4375 Domain of unknown function (DUF4375). This family of proteins is found in bacteria. Proteins in this family are typically between 156 and 204 amino acids in length. There is a single completely conserved residue G that may be functionally important. 116
56946 405058 pfam14301 DUF4376 Domain of unknown function (DUF4376). This domain family is found in bacteria and viruses, and is approximately 110 amino acids in length. 102
56947 405059 pfam14302 DUF4377 Domain of unknown function (DUF4377). This domain family is found in bacteria and archaea, and is approximately 80 amino acids in length. 76
56948 405060 pfam14303 NAM-associated No apical meristem-associated C-terminal domain. This domain is found in a number of different types of plant proteins including NAM-like proteins. 164
56949 405061 pfam14304 CSTF_C Transcription termination and cleavage factor C-terminal. The C-terminal section of CSTF proteins is a discrete structure that is crucial for mRNA 3'-end processing. This domain interacts with Pcf11 and possibly PC4, thus linking CstF2 to transcription, transcriptional termination, and cell growth. 41
56950 291003 pfam14305 ATPgrasp_TupA TupA-like ATPgrasp. A member of the ATP-grasp fold predicted to be involved in the biosynthesis of cell surface polysaccharides such as the O-antigen in proteobacteria, the capsule in firmicutes and the polyglutamate chain of teichuronopeptide. 241
56951 405062 pfam14306 PUA_2 PUA-like domain. This PUA like domain is found at the N-terminus of ATP-sulfurylase enzymes. 159
56952 405063 pfam14307 Glyco_tran_WbsX Glycosyltransferase WbsX. Members of this family are found within O-antigen biosynthesis clusters in Gram negative bacteria, where they are predicted to function as glycosyltransferases. 312
56953 405064 pfam14308 DnaJ-X X-domain of DnaJ-containing. In certain plant and yeast proteins, the DnaJ-1 proteins have a three-domain structure. The x-domain lies between the N-terminal DnaJ and the C-terminal Z domains. The exact function is not known. 205
56954 405065 pfam14309 DUF4378 Domain of unknown function (DUF4378). 157
56955 405066 pfam14310 Fn3-like Fibronectin type III-like domain. This domain has a fibronectin type III-like structure. It is often found in association with pfam00933 and pfam01915. Its function is unknown. 71
56956 405067 pfam14311 DUF4379 Probable Zinc-ribbon domain. This domain is found in bacteria, eukaryotes and viruses, and is approximately 60 amino acids in length. It contains a CXXCXH motif and a CPXC motif. 54
56957 316802 pfam14312 FG-GAP_2 FG-GAP repeat. 49
56958 316803 pfam14313 Soyouz_module N-terminal region of Paramyxovirinae phosphoprotein (P). The soyouz module moiety is the N-terminal region of the phosphoprotein (P) from the subfamily Paramyxovirinae of the family Paramyxoviridae viruses. The main genera in this subfamily include the Rubulaviruses, avulaviruses, respiroviruses, henipaviruses, and morbilliviruses, all of which are enveloped viruses with a non-segmented, negative, single-stranded RNA genome encapsidated by the nucleoprotein (N) within a helical nucleocapsid. 58
56959 316804 pfam14314 Methyltrans_Mon Virus-capping methyltransferase. This is the methyltransferase region of the Mononegavirales single-stranded RNA viral RNA polymerase enzymes. This region is involved in the mRNA-capping of the virion particles. 685
56960 373010 pfam14315 DUF4380 Domain of unknown function (DUF4380). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 288 and 372 amino acids in length. There are two completely conserved residues (G and E) that may be functionally important. 295
56961 405068 pfam14316 DUF4381 Domain of unknown function (DUF4381). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 158 and 180 amino acids in length. 145
56962 405069 pfam14317 YcxB YcxB-like protein. The YcxB-like protein family includes the B. subtilis YcxB protein, which is a functionally uncharacterized transmembrane protein. This family of proteins is found in bacteria, and is approximately 60 amino acids in length. 61
56963 405070 pfam14318 Mononeg_mRNAcap Mononegavirales mRNA-capping region V. This V domain of L RNA-polymerase carries a new motif, GxxTx(n)HR, that is essential for mRNA cap formation. Nonsegmented negative-sense (NNS) RNA viruses, Mononegavirales, cap their mRNA by an unconventional mechanism. Specifically, 5'-monophosphate mRNA is transferred to GDP derived from GTP through a reaction that involves a covalent intermediate between the large polymerase protein L and mRNA. The V region is essential for this process. 241
56964 405071 pfam14319 Zn_Tnp_IS91 Transposase zinc-binding domain. This domain is likely to be a zinc-binding domain. It is found at the N-terminus of transposases belonging to the IS91 family. 92
56965 291018 pfam14320 Paramyxo_PCT Phosphoprotein P region PCT disordered. The N-terminal half of the phosphoprotein P of the Paramyxovirinae viruses. The very first 60 residues have been built as the family Soyouz-module, pfam14313. The remaining part of the region, here, is disordered, and is liable to induced folding under the right physiological conditions. The region undergoes an unstructured-to-structured transition upon binding to Measles virus tail, C, unstructured region. 311
56966 405072 pfam14321 DUF4382 Domain of unknown function (DUF4382). This family is found in bacteria and archaea, and is typically between 142 and 161 amino acids in length. 149
56967 405073 pfam14322 SusD-like_3 Starch-binding associating with outer membrane. SusD is a secreted polysaccharide-binding protein with an N-terminal lipid moiety that allows it to associate with the outer membrane. SusD probably mediates xyloglucan-binding prior to xyloglucan transport in the periplasm for degradation. This domain is found N-terminal to pfam07980. 185
56968 405074 pfam14323 GxGYxYP_C GxGYxYP putative glycoside hydrolase C-terminal domain. This family carries a characteristic sequence motif, GxGYxYP, and is a putative glycoside hydrolase. This domain is found in association with pfam16216. Associated families are sugar-processing domains. 226
56969 405075 pfam14324 PINIT PINIT domain. The PINIT domain is a protein domain that is found in PIAS proteins. The PINIT domain is about 180 amino acids in length. 133
56970 379556 pfam14325 DUF4383 Domain of unknown function (DUF4383). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 137 and 164 amino acids in length. 125
56971 405076 pfam14326 DUF4384 Domain of unknown function (DUF4384). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 80 amino acids in length. 81
56972 405077 pfam14327 CSTF2_hinge Hinge domain of cleavage stimulation factor subunit 2. The hinge domain of cleavage stimulation factor subunit 2 proteins, CSTF2, is necessary for binding to the subunit CstF-77 within the polyadenylation complex and subsequent nuclear localization. This suggests that nuclear import of a pre-formed CSTF complex is an essential step in polyadenylation. Accurate and efficient polyadenylation is essential for transcriptional termination, nuclear export, translation, and stability of eukaryotic mRNAs. CSTF2 is an important regulatory subunit of the polyadenylation complex. 81
56973 405078 pfam14328 DUF4385 Domain of unknown function (DUF4385). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 149 and 163 amino acids in length. 143
56974 405079 pfam14329 DUF4386 Domain of unknown function (DUF4386). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 214 and 245 amino acids in length. 210
56975 405080 pfam14330 DUF4387 Domain of unknown function (DUF4387). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are approximately 110 amino acids in length. There is a conserved RSKN sequence motif. 98
56976 405081 pfam14331 ImcF-related_N ImcF-related N-terminal domain. This domain is found in bacterial ImcF (intracellular multiplication and human macrophage-killing) proteins. It is found to the N-terminus of the ImcF-related domain, pfam06761. 258
56977 405082 pfam14332 DUF4388 Domain of unknown function (DUF4388). This domain family is found in bacteria, and is typically between 102 and 135 amino acids in length. 106
56978 405083 pfam14333 DUF4389 Domain of unknown function (DUF4389). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 104 and 223 amino acids in length. There is a single completely conserved residue R that may be functionally important. 76
56979 405084 pfam14334 DUF4390 Domain of unknown function (DUF4390). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 192 and 203 amino acids in length. 158
56980 405085 pfam14335 DUF4391 Domain of unknown function (DUF4391). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 220 and 257 amino acids in length. 219
56981 405086 pfam14336 DUF4392 Domain of unknown function (DUF4392). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 282 and 585 amino acids in length. There are two completely conserved G residues that may be functionally important. 289
56982 405087 pfam14337 DUF4393 Domain of unknown function (DUF4393). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 254 and 285 amino acids in length. 192
56983 405088 pfam14338 Mrr_N Mrr N-terminal domain. This domain is found at the N-terminus of the Mrr restriction endonuclease catalytic domain, pfam04471. Fold recognition analysis predicts that it is a diverged member of the winged helix variant of helix turn helix proteins. It may play a role in DNA sequence recognition. 82
56984 405089 pfam14339 DUF4394 Domain of unknown function (DUF4394). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 262 and 476 amino acids in length. 228
56985 405090 pfam14340 DUF4395 Domain of unknown function (DUF4395). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 142 and 168 amino acids in length. There are two completely conserved C residues that may be functionally important. 130
56986 405091 pfam14341 PilX_N PilX N-terminal. This domain is found at the N-terminus of the PilX prepilin-like proteins which are involved in type 4 fimbrial biogenesis. 51
56987 405092 pfam14342 DUF4396 Domain of unknown function (DUF4396). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 167 and 310 amino acids in length. 141
56988 405093 pfam14343 PrcB_C PrcB C-terminal. This domain is found at the C-terminus of Treponema denticola PrcB. PrcB interacts with the PrtP protease (dentilisin) and is required for the stability of the protease complex. 56
56989 405094 pfam14344 DUF4397 Domain of unknown function (DUF4397). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 120 amino acids in length. 115
56990 405095 pfam14345 GDYXXLXY GDYXXLXY protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 171 and 199 amino acids in length. It contains a conserved GDYXXLXY motif. 155
56991 405096 pfam14346 DUF4398 Domain of unknown function (DUF4398). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 127 and 269 amino acids in length. 78
56992 405097 pfam14347 DUF4399 Domain of unknown function (DUF4399). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 135 and 1079 amino acids in length. 91
56993 405098 pfam14348 DUF4400 Domain of unknown function (DUF4400). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 209 and 249 amino acids in length. There is a single completely conserved residue P that may be functionally important. 204
56994 405099 pfam14349 SprA_N Motility related/secretion protein. This domain is found repeated three times in the N-terminal half of the gliding motility-related SprA proteins. The role of this domain in motility is uncertain. It is also found in proteins required for secretion. 509
56995 405100 pfam14350 Beta_protein Beta protein. This family includes the beta protein from Bacteriophage T4. Beta protein prevents the gop protein from killing the bacterial host cell. 340
56996 405101 pfam14351 DUF4401 Domain of unknown function (DUF4401). This family of proteins is found in bacteria. Proteins in this family are typically between 357 and 735 amino acids in length. The family is found in association with pfam09925. There is a single completely conserved residue K that may be functionally important. 308
56997 405102 pfam14352 DUF4402 Domain of unknown function (DUF4402). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 182 amino acids in length. 129
56998 405103 pfam14353 CpXC CpXC protein. This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea, and is typically between 122 and 134 amino acids in length. It contains four conserved cysteines forming two CpXC motifs. 121
56999 405104 pfam14354 Lar_restr_allev Restriction alleviation protein Lar. This family includes the restriction alleviation protein Lar encoded by the Rac prophage of Escherichia coli. This protein modulates the activity of the Escherichia coli restriction and modification system. 60
57000 405105 pfam14355 Abi_C Abortive infection C-terminus. This domain is found at the C-terminus of the Lactococcus lactis abortive infection protein Abi-859. This protein confers bacteriophage resistance. 83
57001 405106 pfam14356 DUF4403 Domain of unknown function (DUF4403). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 455 and 518 amino acids in length. There is a single completely conserved residue W that may be functionally important. 425
57002 405107 pfam14357 DUF4404 Domain of unknown function (DUF4404). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two completely conserved residues (P and G) that may be functionally important. 83
57003 405108 pfam14358 DUF4405 Domain of unknown function (DUF4405). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length. There are two conserved histidines that may be functionally important. This family is N-terminally truncated compared to other members of the clan. 65
57004 405109 pfam14359 DUF4406 Domain of unknown function (DUF4406). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 98 and 145 amino acids in length. 90
57005 405110 pfam14360 PAP2_C PAP2 superfamily C-terminal. This family is closely related to the C-terminal region of PAP2. 74
57006 405111 pfam14361 RsbRD_N RsbT co-antagonist protein rsbRD N-terminal domain. This domain is found at the N-terminus of a number of anti-sigma-factor antagonist proteins including B. subtilis RsbRD. These proteins are negative regulators of the general stress transcription factor sigma(B). It is found in association with pfam01740. 104
57007 405112 pfam14362 DUF4407 Domain of unknown function (DUF4407). This family of proteins is found in bacteria. Proteins in this family are typically between 366 and 597 amino acids in length. There is a single completely conserved residue R that may be functionally important. 296
57008 405113 pfam14363 AAA_assoc Domain associated at C-terminal with AAA. This domain is found in association with the AAA family, pfam00004. 96
57009 405114 pfam14364 DUF4408 Domain of unknown function (DUF4408). This domain is found at the N-terminus of members of the DUF761 family, pfam05553. Many members are plant proteins. 33
57010 405115 pfam14365 Neprosin_AP Neprosin activation peptide. Pitcher plants are insectivorous and secrete a digestive fluid into the pitcher. This fluid contains a mixture of enzymes including peptidases. One of these is neprosin, characterized from the pitcher plant Nepenthes ventrata. This peptidase is of unknown catalytic type and is unaffected by standard peptidase inhibitors. Unusually, activity is directed towards prolyl bonds, but unlike most peptidase that cleave after proline, there is no restriction on sequence length or position of the proline residue. The peptidase is secreted and is presumed to possess an N-terminal activation peptide. This domain corresponds to the presumed activation peptide. 119
57011 379584 pfam14366 DUF4410 Domain of unknown function (DUF4410). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 238 amino acids in length. 119
57012 405116 pfam14367 DUF4411 Domain of unknown function (DUF4411). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 153 and 170 amino acids in length. There is a single completely conserved residue D that may be functionally important. 159
57013 405117 pfam14368 LTP_2 Probable lipid transfer. The members of this family are probably involved in lipid transfer. The family has several highly conserved cysteines, paired in various ways. 91
57014 405118 pfam14369 zinc_ribbon_9 zinc-ribbon. 35
57015 405119 pfam14370 Topo_C_assoc C-terminal topoisomerase domain. This domain is found at the C-terminal of topoisomerase and other similar enzymes. 68
57016 405120 pfam14371 DUF4412 Domain of unknown function (DUF4412). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 75 and 104 amino acids in length. 191
57017 405121 pfam14372 DUF4413 Domain of unknown function (DUF4413). This domain is part of an RNase-H fold section of longer proteins some of which are transposable elements possibly of the Pong type, since some members are putative Tam3 transposases. 100
57018 405122 pfam14373 Imm_superinfect Superinfection immunity protein. This family includes the E. coli bacteriophage T4 superinfection immunity (imm) protein. When E. coli is sequentially infected with two T-even type bacteriophage the DNA of the superinfecting phage is excluded from the host, into the periplasmic space. The immunity protein plays a role in this process. 42
57019 405124 pfam14374 Ribos_L4_asso_C 60S ribosomal protein L4 C-terminal domain. This family is found at the very C-terminus of 60S ribosomal L4 proteins. 74
57020 405124 pfam14375 Cys_rich_CWC Cysteine-rich CWC. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 74 and 102 amino acids in length. It contains eight conserved cysteines, including a conserved CWC sequence motif. 50
57021 405125 pfam14376 Haem_bd Haem-binding domain. This domain contains a potential haem-binding motif, CXXCH. This family is found in association with pfam00034 and pfam03150. 133
57022 405126 pfam14377 DUF4414 Domain of unknown function (DUF4414). This family is frequently found on DNA binding proteins of the URE-B1 type and on ligases. 32
57023 405127 pfam14378 PAP2_3 PAP2 superfamily. 190
57024 405128 pfam14379 Myb_CC_LHEQLE MYB-CC type transfactor, LHEQLE motif. This family is found towards the C-terminus of Myb-CC type transcription factors, and carries a highly conserved LHEQLE sequence motif. 47
57025 405129 pfam14380 WAK_assoc Wall-associated receptor kinase C-terminal. This WAK_assoc domain is cysteine-rich and lies C-terminal to the binding domain, GUB_WAK_bind, pfam13947. 96
57026 405130 pfam14381 EDR1 Ethylene-responsive protein kinase Le-CTR1. EDR1 regulates disease resistance and ethylene-induced senescence, and is also involved in stress response signalling and cell death regulation. 199
57027 405131 pfam14382 ECR1_N Exosome complex exonuclease RRP4 N-terminal region. ECR1_N is an N-terminal region of the exosome complex exonuclease RRP proteins. It is a G-rich domain which structurally is a rudimentary single hybrid fold with a permuted topology. 36
57028 405132 pfam14383 VARLMGL DUF761-associated sequence motif. This family is found frequently at the N-terminus of family DUF3741, pfam12552. 32
57029 405133 pfam14384 BrnA_antitoxin BrnA antitoxin of type II toxin-antitoxin system. BrnA is family of antitoxins that neutralizes the toxin BrnT, pfam04365. It consists of 3 alpha-helices and a C-terminal ribbon-helix-helix DNA binding domain. As in other toxin-antitoxin systems, BrnA negatively autoregulates the brnTA operon and has higher affinity for the DNA operator when complexed with BrnT. It dimerizes with two molecules of its toxin BrnT. 67
57030 405134 pfam14385 DUF4416 Domain of unknown function (DUF4416). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 176 and 187 amino acids in length. There is a conserved DPG sequence motif. 162
57031 405135 pfam14386 DUF4417 Domain of unknown function (DUF4417). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 220 and 340 amino acids in length. There is a single completely conserved residue G that may be functionally important. 182
57032 405136 pfam14387 DUF4418 Domain of unknown function (DUF4418). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 132 and 150 amino acids in length. 118
57033 405137 pfam14388 DUF4419 Domain of unknown function (DUF4419). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 348 and 454 amino acids in length. 302
57034 405138 pfam14389 Lzipper-MIP1 Leucine-zipper of ternary complex factor MIP1. This leucine-zipper is towards the N-terminus of MIP1 proteins. These proteins, here largely from plants, are subunits of the TORC2 (rictor-mTOR) protein complex controlling cell growth and proliferation. The leucine-zipper is likely to be the region that interacts with plant MADS-box factors, 82
57035 405139 pfam14390 DUF4420 Putative PD-(D/E)XK family member, (DUF4420). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 310 and 334 amino acids in length. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily. 310
57036 405140 pfam14391 DUF4421 Domain of unknown function (DUF4421). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 336 and 370 amino acids in length. 298
57037 405141 pfam14392 zf-CCHC_4 Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following: CX2CX4HX4C, where X can be any amino acid. This particular family is found in plant proteins. 49
57038 405142 pfam14393 DUF4422 Domain of unknown function (DUF4422). This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 255 and 371 amino acids in length. 219
57039 405143 pfam14394 DUF4423 Domain of unknown function (DUF4423). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 170 amino acids in length. 168
57040 373045 pfam14395 COOH-NH2_lig Phage phiEco32-like COOH.NH2 ligase-type 2. A family of COOH-NH2 ligases/GCS superfamily found in the neighborhood of YheC/D-like ATP-grasp and the CotE family of proteins in the firmicutes. Contextual analysis suggests that it might be involved in cell wall modification and spore coat biosynthesis. 253
57041 405144 pfam14396 CFTR_R Cystic fibrosis TM conductance regulator (CFTR), regulator domain. 213
57042 405145 pfam14397 ATPgrasp_ST Sugar-transfer associated ATP-grasp. A member of the ATP-grasp fold predicted to be involved in the biosynthesis of cell surface polysaccharides. 278
57043 405146 pfam14398 ATPgrasp_YheCD YheC/D like ATP-grasp. A member of the ATP-grasp fold predicted to be involved in the modification/biosynthesis of spore-wall and capsular proteins. 256
57044 405147 pfam14399 BtrH_N Butirosin biosynthesis protein H, N-terminal. BtrH_N is the N-terminus of the acyl carrier protein:aminoglycoside acyltransferase BtrH. Alternatively it can be referred to as butirosin biosynthesis protein H. BtrH transfers the unique (S)-4-amino-2-hydroxybutyrate (AHBA) side chain, which protects the antibiotic butirosin from several common resistance mechanisms. Butirosin, an aminoglycoside antibiotic produced by Bacillus circulans, exhibits improved antibiotic properties over its parent molecule and retains bactericidal activity toward many aminoglycoside-resistant strains. Butirosin is unique in carrying the AHBA side-chain. BtrH transfers the AHBA from the acyl carrier protein BtrI to the parent aminoglycoside ribostamycin as a gamma-glutamylated dipeptide. 134
57045 405148 pfam14400 Transglut_i_TM Inactive transglutaminase fused to 7 transmembrane helices. A family of inactive transglutaminases fused to seven transmembrane helices. The transglutaminase domain is predicted to be extracellularly located. Members of this family are associated in gene neighborhoods with a pepsin-like peptidase and an ATP-grasp of the RimK-family. The ATP-grasp is predicted to modify the 7TM protein or a cofactor that interacts with it. 161
57046 405149 pfam14401 RLAN RimK-like ATPgrasp N-terminal domain. An uncharacterized alpha+beta fold domain that is mostly fused to a RimK-like ATP-grasp and is found in bacteria and euryarchaea. Members of this family are almost always associated in gene neighborhoods with a GNAT-like acetyltransferase fused to a papain-like peptidase. Additionally, M20-like peptidases, GCS2, 4Fe-4S Ferredoxins, a distinct metal-sulfur cluster protein and ribosomal proteins are found in the gene neighborhoods. Contextual analysis suggests a role for these in peptide biosynthesis. 150
57047 405150 pfam14402 7TM_transglut 7 transmembrane helices usually fused to an inactive transglutaminase. A family of seven transmembrane helices fused to an inactive transglutaminase domain. The transglutaminase domain is predicted to be extracellularly located. Members of this family are associated in gene neighborhoods with a pepsin-like peptidase and an ATP-grasp of the RimK-family. The ATP-grasp is predicted to modify the 7TM protein or a cofactor that interacts with it. 248
57048 405151 pfam14403 CP_ATPgrasp_2 Circularly permuted ATP-grasp type 2. Circularly permuted ATP-grasp prototyped by Roseiflexus RoseRS_2616 that is associated in gene neighborhoods with a GCS2-like COOH-NH2 ligase, alpha/beta hydrolase fold peptidase, GAT-II -like amidohydrolase, and M20 peptidase. Members of this family are predicted to be involved in the biosynthesis of small peptides. 378
57049 405152 pfam14404 Strep_pep Ribosomally synthesized peptide in Streptomyces species. A ribosomally synthesized peptide related to microviridin and marinostatin, usually in the gene neighborhood of one or more RimK-like ATP-grasp. The gene-context suggests that it is further modified by the ATP-grasp. The peptide is predicted to function in a defensive or developmental role, or as an antibiotic. 63
57050 405153 pfam14406 Bacteroid_pep Ribosomally synthesized peptide in Bacteroidetes. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp, and an ABC ATPase fused to a papain-like domain. It is often present in multiple tandem gene copies. The gene contexts suggest that it is modified by the ATP-grasp as in the biosynthesis of microviridin and marinostatin. They might function in defense or development or as peptide antibiotics. 52
57051 373050 pfam14407 Frankia_peptide Ribosomally synthesized peptide prototyped by Frankia Franean1_4349. Ribosomally synthesized peptide linked to cyclases in chloroflexi. It may have a link to cyclic nucleotide signaling. 62
57052 405154 pfam14408 Actino_peptide Ribosomally synthesized peptide in actinomycetes. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp and an aspartyl-O-methylase. Gene contexts suggest that it is further modified by the ATP-grasp and the methylase. It might function in defense or development, or as a peptide antibiotic. 58
57053 291106 pfam14409 Herpeto_peptide Ribosomally synthesized peptide in Herpetosiphon. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp, and an ABC ATPase fused to a papain-like domain. It is often present in multiple tandem gene copies. Gene contexts suggest that it is modified by the ATP-grasp. It might function in defense or development, or as a peptide antibiotic. 56
57054 405155 pfam14410 GH-E HNH/ENDO VII superfamily nuclease with conserved GHE residues. A predicted nuclease of the HNH/EndoVII superfamily of the treble clef fold which is closely related to the NucA-like family. The name is derived from the conserved G, H and E residues. It is found in several bacterial polymorphic toxin systems. Some GH-E members preserve the conserved cysteines of the treble-clef suggesting that they might represent potential evolutionary intermediates from a classical HNH domain to the derived NucA-like form. 70
57055 405156 pfam14411 LHH A nuclease of the HNH/ENDO VII superfamily with conserved LHH. LHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif, LHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. Like WHH and AHH, the LHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis. 76
57056 405157 pfam14412 AHH A nuclease family of the HNH/ENDO VII superfamily with conserved AHH. AHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif, AHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. Like WHH and LHH, the AHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis. 113
57057 405158 pfam14413 Thg1C Thg1 C terminal domain. Thg1 polymerases contain an additional region of conservation C-terminal to the core palm domain that comprises 5 helices and two strands. This region has several well-conserved charged residues, including a basic residue found towards the end of the first helix of this unit that might contribute to the Thg1-specific active site. This C-terminal module of Thg1 is predicted to form a helical bundle that functions equivalently to the fingers of the other nucleic acid polymerases, probably in interacting with the template HtRNA. 117
57058 405159 pfam14414 WHH A nuclease of the HNH/ENDO VII superfamily with conserved WHH. WHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif WHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. WHH is the shortest version of HNH nuclease families. Like AHH and LHH, the WHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis. 43
57059 405160 pfam14415 DUF4424 Domain of unknown function (DUF4424). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 310 and 361 amino acids in length. 303
57060 405161 pfam14416 PMR5N PMR5 N terminal Domain. The plant family containing PMR5, ESK1, TBL3, etc. has an N-terminal C-rich predicted sugar-binding domain followed by the PC-Esterase (acyl esterase) domain. 55
57061 405162 pfam14417 MEDS MEDS: MEthanogen/methylotroph, DcmR Sensory domain. MEDS is prototyped by DcmR and is likely to function with the PocR domain in certain organisms in sensing hydrocarbon derivatives. The MEDS domain occurs fused to Histidine Kinase and as stand-alone version. Sequence analysis shows that it is a catalytically inactive version of the P-loop NTPase domain of the RecA superfamily. 160
57062 405163 pfam14418 OHA OST-HTH Associated domain. OHA occurs with OST-HTH. 74
57063 405164 pfam14419 SPOUT_MTase_2 AF2226-like SPOUT RNA Methylase fused to THUMP. SPOUT superfamily RNA methylase fused to RNA binding THUMP domain. 173
57064 405165 pfam14420 Clr5 Clr5 domain. This domain is found at the N-terminus of the Clr5 protein which has been shown to be involved in silencing in fission yeast. This domain has been found to often be associated with proteins that contain ankyrin repeats and large regions of disordered sequence. 54
57065 405166 pfam14421 LmjF365940-deam A distinct subfamily of CDD/CDA-like deaminases. A distinct branch of the CDD/CDA-like deaminases prototyped by Leishmania LmjF36.5940. Members of this family are widely distributed across several microbial eukaryotes such as kinetoplastids, chlorophyte algae, stramenopiles and the alveolate Perkinsus. Domain architectures suggest that these proteins might possess mRNA editing or DNA mutagenizing activity. 197
57066 405167 pfam14423 Imm5 Immunity protein Imm5. A predicted Immunity protein, with an all-alpha fold, present in bacterial polymorphic toxin systems as an immediate neighbor of the toxin. 186
57067 405168 pfam14424 Toxin-deaminase The BURPS668_1122 family of deaminases. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Burkholderia BURPS668_1122. Members of this family are found as toxins in polymorphic toxin systems in a wide range of bacteria and in the eukaryote Perkinsus. Members of this family typically possess a DxE catalytic motif in Helix-2 of the core fold instead of the more common C[H]xE motif. The Perkinsus versions are predicted to be inactive. 135
57068 373060 pfam14425 Imm3 Immunity protein Imm3. A predicted Immunity protein, with a mostly all-alpha fold, present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene. 119
57069 339223 pfam14426 Imm2 Immunity protein Imm2. A predicted Immunity protein, with a mostly all-alpha fold, present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene. 60
57070 373061 pfam14427 Pput2613-deam Pput_2613-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Pseudomonas Pput_2613. Members of this family are predicted to function as toxins in bacterial polymorphic toxin systems. 118
57071 373062 pfam14428 SCP1201-deam SCP1.201-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Streptomyces SCP1.201. Members of this family are predicted to function as toxins in bacterial polymorphic toxin systems. 136
57072 405169 pfam14429 DOCK-C2 C2 domain in Dock180 and Zizimin proteins. The Dock180/Dock1 and Zizimin proteins are atypical GTP/GDP exchange factors for the small GTPases Rac and Cdc42 and are implicated in cell migration and phagocytosis. Across all Dock180 proteins, two regions are conserved: C-terminus termed CZH2 or DHR2 (or the Dedicator of cytokinesis) whereas CZH1/DHR1 contain a new family of the C2 domain. 186
57073 405170 pfam14430 Imm1 Immunity protein Imm1. A predicted immunity protein, with an alpha+beta fold and a conserved C-terminal tryptophan residue. The protein is present in a wide range of bacteria in polymorphic toxin systems as an immediate gene neighbor of the toxin gene. 125
57074 373065 pfam14431 YwqJ-deaminase YwqJ-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bacillus YwqJ. Members of this family are present in a wide phyletic range of bacteria and a few basidiomycetes. Bacterial versions are predicted to function as toxins in bacterial polymorphic toxin systems. 134
57075 373066 pfam14432 DYW_deaminase DYW family of nucleic acid deaminases. A family of nucleic acid deaminases prototyped by the plant PPR DYW proteins that are implicated in chloroplast and mitochondrial RNA transcript maturation by numerous C to U editing events. The name derives from the DYW motif present at the C-terminus of the classical plant PPR DYW deaminases. Members of this family are present in bacteria, plants, Naegleria, and fungi. Plants and Naegleria show lineage-specific expansions of this family. The classical DYW family contain an additional C-terminal metal-binding cluster composed of 2 histidines and a CxC motif and are often fused to PPR repeats. Ascomycete versions, which are independent lateral transfers, contain a large insert within the domain and are often fused to ankyrin repeats. Bacterial versions are predicted to function as toxins in polymorphic toxin systems. 125
57076 405171 pfam14433 SUKH-3 SUKH-3 immunity protein. This family belongs to the SUKH superfamily and functions as immunity proteins in bacterial toxin systems. 140
57077 405172 pfam14434 Imm6 Immunity protein Imm6. A predicted immunity protein, with an alpha+beta fold (mostly alpha helices). The protein is present in polymorphic toxin systems as an immediate gene neighbor of the toxin gene. 121
57078 405173 pfam14435 SUKH-4 SUKH-4 immunity protein. This family belongs to the SUKH superfamily and functions as immunity proteins in bacterial toxin systems. 140
57079 405174 pfam14436 EndoU_bacteria Bacterial EndoU nuclease. This is a bacterial version of EndoU nuclease. It is found at the C-terminal region of polymorphic toxin proteins. 129
57080 405175 pfam14437 MafB19-deam MafB19-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Neisseria MafB19. Members of this family are present in a wide phyletic range of bacteria and are predicted to function as toxins in bacterial polymorphic toxin systems. 138
57081 405176 pfam14438 SM-ATX Ataxin 2 SM domain. This SM domain is found in Ataxin-2. 78
57082 405177 pfam14439 Bd3614-deam Bd3614-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bdellovibrio Bd3614. They are typified by a distinct N-terminal globular domain. The Bdellovibrio version occurs in a predicted operon with a 23S rRNA G2445-modifying methylase suggesting that it might be involved in RNA editing. 113
57083 405178 pfam14440 XOO_2897-deam Xanthomonas XOO_2897-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Xanthomonas XOO_2897. Members of this family are present in a wide phyletic range of bacteria and are predicted to function as toxins in bacterial polymorphic toxin systems. The Xanthomonas XOO_2897 lacks an immunity protein and is predicted to be deployed against its eukaryotic host. 101
57084 405179 pfam14441 OTT_1508_deam OTT_1508-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Orientia OTT_1508. Members of this family are present in a wide phyletic range of bacteria, including several intracellular parasites and eukaryotes such as fungi, Leishmania, Selaginella, and some apicomplexa. In bacteria, these deaminases are predicted to function as toxins in bacterial polymorphic toxin systems. Versions in intracellular bacteria lack immunity proteins and are likely to be deployed against their eukaryotic hosts. Eukaryotic versions are predicted to function as nucleic acid (either DNA or RNA) deaminases. Among eukaryotes, some fungi show lineage-specific expansions of this family. Many fungal versions are fused to a distinct N-terminal globular domain. Various fungal versions are fused to domains involved in chromatin function. Apicomplexan versions are fused to tRNA guanine transglycosylase domain. 66
57085 405180 pfam14442 Bd3614_N Bd3614-like deaminase N-terminal. This is a globular domain that occurs N-terminal to the Bd3614-like deaminases, which are predicted to be involved in RNA editing. 92
57086 405181 pfam14443 DBC1 DBC1. DBC1 and its homologs from diverse eukaryotes are a catalytically inactive version of the Nudix hydrolase (MutT) domain. DBC1 is predicted to bind NAD metabolites and regulate the activity of SIRT1 or related deacetylases by sensing the soluble products or substrates of the NAD-dependent deacetylation reaction. 123
57087 405182 pfam14444 S1-like S1-like. S1-like RNA binding domain found in DBC1 58
57088 379611 pfam14445 Prok-RING_2 Prokaryotic RING finger family 2. RING finger family found sporadically in bacteria and archaea, and associated with other components of the ubiquitin-based signaling and degradation system, including ubiquitin and the E1 and E2 proteins. The bacterial versions contain transmembrane helices. 56
57089 405183 pfam14446 Prok-RING_1 Prokaryotic RING finger family 1. RING finger family found sporadically in bacteria and archaea, and associated in gene neighborhoods with other components of the ubiquitin-based signaling and degradation system, including ubiquitin, the E1 and E2 proteins and the JAB-like metallopeptidase. The bacterial versions contain transmembrane helices. 52
57090 405184 pfam14447 Prok-RING_4 Prokaryotic RING finger family 4. RING finger family domain found sporadically in bacteria. The finger is fused to an N-terminal alpha-helical domain, ROT/Trove-like repeats and a C-terminal TerD domain. The architecture suggests a possible role in an RNA-processing complex. 46
57091 373075 pfam14448 Nuc_N Nuclease N terminal. This is a conserved short region that is found in many bacterial polymorphic toxin proteins. It is often located before C-terminal nuclease domains. 58
57092 379612 pfam14449 PT-TG Pre-toxin TG. PT-TG is a conserved region found in many bacterial toxin proteins. It could function as a linker that links N-terminal secretion-related domain and C-terminal toxin domain. It contains a TG motif. 68
57093 405185 pfam14450 FtsA Cell division protein FtsA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). FtsA has a SHS2 domain PF02491 inserted in to the RnaseH fold PF02491. 168
57094 405186 pfam14451 Ub-Mut7C Mut7-C ubiquitin. This member of the ubiquitin superfamily is found at the N-terminus of Mut7-C like RNAses, suggestive of an RNA-binding role. 81
57095 405187 pfam14452 Multi_ubiq Multiubiquitin. A ubiquitin superfamily domain that is often present in multiple tandem copies in the same polypeptide. Members of this family are associated in gene neighborhoods, or on occasions fused to, bacterial homologs of components of ubiquitin-dependent modification system such as the E1, E2 and JAB metallopeptidase enzymes and a distinct metal-binding domain. The E2/UBC fold domain appears to be inactive. The JAB domain in these operons is usually fused to the E1 domain. 69
57096 405188 pfam14453 ThiS-like ThiS-like ubiquitin. A member of the ubiquitin superfamily that is often fused to the ThiF-like (E1)- ubiquitin activating enzyme and is present in gene neighborhoods with components of the thiamine biosynthesis pathway. 57
57097 405189 pfam14454 Prok_Ub Prokaryotic Ubiquitin. A Ubiquitin-superfamily protein that is present across several bacterial lineages, and found in gene neighborhoods with components of the ubiquitin modification system such as the E1, E2 and JAB proteins, and a novel alpha-helical protein, which is predicted to be enzymatic. 63
57098 405190 pfam14455 Metal_CEHH Predicted metal binding domain. A predicted metal-binding domain that is found in gene-neighborhood associations with genes encoding components of the bacterial homologs of the ubiquitin modification pathway including the E1, E2, JAB metallopeptidase and ubiquitin proteins. The domain is characterized by a conserved motif with a CxxxxxEYHxxxxH signature. 177
57099 373077 pfam14456 alpha-hel2 Alpha-helical domain 2. An alpha-helical domain found in gene neighborhoods encoding genes containing bacterial homologs of components of the ubiquitin modification pathway such as the E1, E2, Ub and JAB peptidase proteins. 303
57100 405191 pfam14457 Prok-E2_A Prokaryotic E2 family A. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are very similar to the eukaryotic E2 proteins. Members of this family are usually fused to E1 and JAB domains C-terminal to the E2 domain. The protein is usually in the gene neighborhood of a gene encoding a distinct metallobetalactamase family protein. 162
57101 405192 pfam14459 Prok-E2_C Prokaryotic E2 family C. A divergent member of the E2/UBC superfamily of proteins found in bacteria. Members of the family contain a conserved cysteine in place of the histidine of the classical E2/UBC proteins. Members of this family are usually fused to an E1 domain at their C-terminus. The protein is usually in the gene neighborhood of a gene encoding a JAB peptidase and another encoding a predicted metal binding domain. 131
57102 405193 pfam14460 Prok-E2_D Prokaryotic E2 family D. A member of the E2/UBC superfamily of proteins found in several bacteria. Members of this family lack the conserved histidine of the classical E2-fold. However, they have an absolutely conserved histidine carboxyl-terminal to the conserved cysteine. Members of this family are usually present in a conserved gene neighborhood with genes encoding members of the Ub modification pathway such as the E1, Ub and JAB proteins. These neighborhoods also contain a gene encoding a rapidly diverging alpha-helical protein. 169
57103 405194 pfam14461 Prok-E2_B Prokaryotic E2 family B. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are similar to the eukaryotic E2 proteins but lack the conserved asparagine. Members of this family are usually fused to an E1 domain at the C-terminus. The protein is usually in the gene neighborhood of a gene encoding a member of the pol-beta nucleotidyltransferase superfamily. Many of the operons in this family are in ICE-like mobile elements and plasmids. 133
57104 405195 pfam14462 Prok-E2_E Prokaryotic E2 family E. A member of the E2/UBC superfamily of proteins found in diverse bacteria. Analysis of the active site residues suggest that members of this family are inactive as they lack the characteristic catalytic residues of the E2 enzymes. They are usually fused to or in the neighborhood of a multi/poly ubiquitin domain protein. Other proteins of the ubiquitin modification pathway such as the E1 and JAB proteins are also found in its gene neighborhood along with a distinct predicted metal-binding protein. 119
57105 405196 pfam14463 E1-N E1 N-terminal domain. An uncharacterized alpha/beta domain fused to E1 proteins. This protein is usually present in gene neighborhoods with genes encoding a JAB protein and a predicted metal-binding protein. In related E1 proteins, the E1-N domain is replaced by an E2/UBC superfamily domain. 151
57106 405197 pfam14464 Prok-JAB Prokaryotic homologs of the JAB domain. These are metalloenzymes that function as the ubiquitin isopeptidase/ deubiquitinase in the ubiquitin-based signaling and protein turnover pathways in eukaryotes. Prokaryotic JAB domains are predicted to have a similar role in their cognates of the ubiquitin modification pathway. The domain is widely found in bacteria, archaea and phages where they are present in several gene contexts in addition to those that correspond to the prokaryotic cognates of the eukaryotic Ub pathway. Other contexts in which JAB domains are present include gene neighbor associations with ubiquitin fold domains in cysteine and siderophore biosynthesis, and phage tail morphogenesis, where they are shown or predicted to process the associated ubiquitin. A distinct family, the RadC-like JAB domains are widespread in bacteria and are predicted to function as nucleases. In halophilic archaea the JAB domain shows strong gene-neighborhood associations with a nucleotidyltransferase suggesting a role in nucleotide metabolism. 113
57107 405198 pfam14465 NFRKB_winged NFRKB Winged Helix-like. This domain covers regions 370-495 of human nuclear factor related to kappaB binding (NFRKB) protein. 102
57108 405199 pfam14466 PLCC PLAT/LH2 and C2-like Ca2+-binding lipoprotein. A small family of bacterial proteins, found in several Bacteroides species. Structure determination (NMR and X-ray) shows an immunoglobulin beta-barrel fold. Multiple homologs have been found in human gut metagenomics data sets. Structural experimentation shows it to share features with two well-established protein architectures in the SCOP database, ie, C2 (calcium/lipid-binding domain) of the Pfam PF00168 and PLAT/LH2 (lipase/lipoxygenase domain) of the Pfam PF01477. The C2 and PLAT/LH2 domains bind Ca2+ in their functions of targeting proteins to cell-membranes; this domain is also shown to bind Ca2+ as well as to be a novel fold. 128
57109 405200 pfam14467 DUF4426 Domain of unknown function (DUF4426). Members of this entry are found mostly in g-proteobacteria, especially in Vibrio. Strangely enough, there seems to be one eukaryotic homolog in Nematostella vectensis (NEMVEDRAFT_v1g226006), where the PA0388-like domain is fused with a domain homologous to the Methionine biosynthesis protein MetW (see below). In several Pseudomonas species, but also in Vibrio vulnificus and Azotobacter vinelandii PA0388 homologs are genomic neighbors of Nucleoside 5-triphosphatase RdgB (dHAPTP, dITP, XTP-specific) (EC 3.6.1.15) and Methionine biosynthesis protein MetW. On the other hand, in most Vibrio species it appears as a part of a conserved operon involved in possible response to stress. 119
57110 291157 pfam14468 DUF4427 Protein of unknown function (DUF4427). This domain is often found at the C-terminal of proteins with pfam10899 domain, for instance in STY1911 protein from a multiple drug resistant Salmonella enterica serovar Typhi CT18. 132
57111 405201 pfam14469 AKAP28 28 kDa A-kinase anchor. 28 kDa AKAP (AKAP28) is highly enriched in human airway axonemes. The mRNA for AKAP28 is up-regulated as primary airway cells differentiate and is specifically expressed in tissues containing cilia and/or flagella. Homologs of AKAP28 are present in all animals and in some, including mice, the AKAP28-like domain is preceded by another uncharacterized domain. 121
57112 405202 pfam14470 bPH_3 Bacterial PH domain. Proteins in this family are distantly related to PH domains. 94
57113 405203 pfam14471 DUF4428 Domain of unknown function (DUF4428). This putative zinc finger domain is found in uncharacterized bacterial proteins. 51
57114 405204 pfam14472 DUF4429 Domain of unknown function (DUF4429). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and viruses, and is approximately 90 amino acids in length. This domain is often found in two tandem copies. 95
57115 405205 pfam14473 RD3 RD3 protein. RD3 is a human protein that is found preferentially expressed in the retina. Mutations in RD3 cause Leber Congenital Amaurosis type 12. 127
57116 405206 pfam14474 RTC4 RTC4-like domain. This presumed domain is found in the RTC4 protein from yeasts. In Saccharomyces cerevisiae, Cdc13 binds telomeric DNA to recruit telomerase and to "cap" chromosome ends. RTC4 was identified in a screen to identify novel proteins and pathways that cap telomeres, or that respond to uncapped telomeres. This domain is also found in proteins that contain a DNA-binding myb domain. 121
57117 405207 pfam14475 Mso1_Sec1_bdg Sec1-binding region of Mso1. Mso1p is a component of the secretory vesicle docking complex whose function is closely associated with that of Sec1p. It is a small hydrophilic protein that is enriched in the microsomal membrane fraction, and this binding domain is towards the N-terminus of Mso1. The yeast Sec1p protein functions in the docking of secretory transport vesicles to the plasma membrane. Mso1p and Sec1p interact at sites of exocytosis and the Mso1p-Sec1p interaction site depends on a functional Rab GTPase Sec4p and its GEF Sec2p. The C-terminal region of Mso1 (not built) assists in targeting Sec1 to the sites of polarised membrane transport. 40
57118 405208 pfam14476 Chloroplast_duf Petal formation-expressed. The members of this plant family from Arabidopsis thaliana appear to be proteins found in the chloroplast, expressed in the pollen tube during the petal differentiation and expansion stage. The function is not known. 319
57119 316951 pfam14477 Mso1_C Membrane-polarising domain of Mso1. Mso1p is a component of the secretory vesicle docking complex whose function is closely associated with that of Sec1p. It is a small hydrophilic protein that is enriched in the microsomal membrane fraction. The yeast Sec1p protein functions in the docking of secretory transport vesicles to the plasma membrane. Mso1p and Sec1p interact at sites of exocytosis and the Mso1p-Sec1p interaction site depends on a functional Rab GTPase Sec4p and its GEF Sec2p. This C-terminal region of Mso1 assists in targeting Sec1 to the sites of polarised membrane transport, the SNARES and Sec4. 54
57120 405209 pfam14478 DUF4430 Domain of unknown function (DUF4430). Although this family has overlaps with SLBB, the majority of its sequences are unique. Several family members, eg UniProtKB:A0RGA8, that do not overlap have an LPXTG-cell wall anchor at their C-terminus, a SSF_Family 10_polysaccharide_lyase or Glycosyltransferase structure associated with them in the middle region, as shown by InterPro, as well as this domain at the N-terminus. 72
57121 405210 pfam14479 HeLo Prion-inhibition and propagation. This N-terminal region, HeLo, has a prion-inhibitory effect in cis on its own prion-forming domain (PFD) and in trans on HET-s prion propagation. The domain is found exclusively in the fungal kingdom. Its structure, as it occurs in the HET-s/HET-S proteins, consists of two bundles of alpha-helices that pack into a single globular domain. The domain boundary determined from its structure and from protease-resistance experiments overlaps with the C-terminal prion-forming domain of HET-s (PF11558). The HeLo domains of HET-s and HET-S are very similar and their few differences (and not the prion-forming domains) determine the compatibility-phenotype of the fungi in which the proteins are expressed. The mechanism of the HeLo domain-function in heterokaryon-incompatibility is still under investigation, however the HeLo domain is found in similar protein architectures as other cell death and apoptosis-inducing domains. The only other HeLo protein to which a function has been associated is LopB from L. maculans. Although its specific role in L. maculans is unknown, LopB- mutants have impaired ability to form lesions on oilseed rape. The HeLo domain is not related to the HET domain (PF06985) which is another domain involved in heterokaryon incompatibility. 167
57122 405211 pfam14480 DNA_pol3_a_NI DNA polymerase III polC-type N-terminus I. This is the first N-terminal domain, NI domain, of the DNA polymerase III polC subunit A that is found only in Firmicutes. DNA polymerase polC-type III enzyme functions as the 'replicase' in low G + C Gram-positive bacteria. Purine asymmetry is a characteristic of organisms with a heterodimeric DNA polymerase III alpha-subunit constituted by polC which probably plays a direct role in the maintenance of strand-biased gene distribution; since, among prokaryotic genomes, the distribution of genes on the leading and lagging strands of the replication fork is known to be biased. It has been predicted that the N-terminus of polC folds into two globular domains, NI and NII. A predicted patch of electrostatic potential at the surface of this domain suggests a possible involvement in nucleic acid binding. This domain is associated with DNA_pol3_alpha pfam07733 and DNA_pol3_a_NI pfam11490. 72
57123 373090 pfam14481 Fimbrial_PilY2 Type 4 fimbrial biogenesis protein PilY2. Members of this family were experimentally shown to be involved in fimbrial biogenesis, but its exact role appears to be unknown. 105
57124 405212 pfam14484 FISNA Fish-specific NACHT associated domain. This domain is frequently found associated with the NACHT domain (pfam05729) in fish and other vertebrates. 70
57125 373092 pfam14485 DUF4431 Domain of unknown function (DUF4431). 48
57126 405213 pfam14486 DUF4432 Domain of unknown function (DUF4432). 303
57127 405214 pfam14487 DUF4433 Domain of unknown function (DUF4433). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 201 and 230 amino acids in length. There is a single completely conserved residue E that may be functionally important. This family is distantly similar to pfam01885 suggesting these may be ADP-ribosylases. 201
57128 405215 pfam14488 DUF4434 Domain of unknown function (DUF4434). 168
57129 405216 pfam14489 QueF QueF-like protein. This protein is involved in the biosynthesis of queuosine. In some proteins this domain appears to be fused to pfam06508. 81
57130 405217 pfam14490 HHH_4 Helix-hairpin-helix containing domain. This presumed domain contains at least one helix-hairpin-helix motif. This domain is often found in RecD helicases. 91
57131 405218 pfam14491 DUF4435 Protein of unknown function (DUF4435). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 285 and 362 amino acids in length. This domain is sometimes associated with AAA domains. 232
57132 405219 pfam14492 EFG_II Elongation Factor G, domain II. This domain is found in Elongation Factor G. It shares a similar structure with domain V (pfam00679). 75
57133 405220 pfam14493 HTH_40 Helix-turn-helix domain. This presumed domain is found at the C-terminus of a large number of helicase proteins. 89
57134 379624 pfam14494 DUF4436 Domain of unknown function (DUF4436). This is a family of membrane and transmembrane proteins from mycobacterial and related species. The function is not known. 255
57135 405221 pfam14495 Cytochrom_C550 Cytochrome c-550 domain. This domain is a heme binding cytochrome known as cytochrome c550, or cytochrome c549, or PsbV. 136
57136 405222 pfam14496 NEL C-terminal novel E3 ligase, LRR-interacting. This NEL or novel E3 ligase domain is found at the C-terminus of bacterial virulence factors. Its sequence is different from those of the eukaryotic HECT and RING-finger E3 ligases, and it subverts the host ubiquitination process. At the N-terminus of the family-members there is a series of LRR repeats, and the NEL domain interacts with the most N-terminal repeat. The key residue for the ligation step is the cysteine, eg found at position 386 in UniProtKB:E7K2H2. The LRR section sequesters this active site until invasion has occurred. 216
57137 405223 pfam14497 GST_C_3 Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043. 102
57138 405224 pfam14498 Glyco_hyd_65N_2 Glycosyl hydrolase family 65, N-terminal domain. This domain represents a domain found to the N-terminus of the glycosyl hydrolase 65 family catalytic domain. 233
57139 379626 pfam14499 DUF4437 Domain of unknown function (DUF4437). This family of proteins is found in bacteria. Proteins in this family are typically between 152 and 283 amino acids in length. 250
57140 405225 pfam14500 MMS19_N Dos2-interacting transcription regulator of RNA-Pol-II. This domain, along with the C-terminal part, pfam12460, is an essential component of a silencing complex in fission yeast that contains Dos2, Rik1, Mms19 and Cdc20 (the catalytic subunit of DNA polymerase-epsilon). This complex regulates RNA polymerase II (RNA Pol II) activity in heterochromatin and is required for DNA replication and heterochromatin assembly. 258
57141 405226 pfam14501 HATPase_c_5 GHKL domain. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90. 99
57142 405227 pfam14502 HTH_41 Helix-turn-helix domain. 48
57143 405228 pfam14503 YhfZ_C YhfZ C-terminal domain. This domain is often found in association with the helix-turn-helix domain HTH_41 (pfam14502). It includes YhfZ proteins from Escherichia coli and Shigella flexneri. 236
57144 405229 pfam14504 CAP_assoc_N CAP-associated N-terminal. The function of this domain is unknown, but it is found towards the N-terminus of bacterial proteins carrying the CAP domain, pfam00188. All members that do not otherwise carry an additional Cu_amine_oxidN1, pfam07833, domain are likely to be extracellular as they start with a signal-peptide. Most other non-bacterial proteins with the CAP domain are allergenic. 140
57145 405230 pfam14505 DUF4438 Domain of unknown function (DUF4438). 256
57146 291192 pfam14506 CppA_N CppA N-terminal. This is the N-terminal domain of the CppA protein found in species of Streptococcus. CppA is a putative C3-glycoprotein degrading proteinase, involved in pathogenicity. It is often found associated with pfam14507. 123
57147 405231 pfam14507 CppA_C CppA C-terminal. This is the C-terminal domain of the CppA protein found in species of Streptococcus. CppA is a putative C3-glycoprotein degrading proteinase, involved in pathogenicity. It is often found associated with pfam14506. 97
57148 405232 pfam14508 GH97_N Glycosyl-hydrolase 97 N-terminal. This N-terminal domain of glycosyl-hydrolase-97 contributes part of the active site pocket. It is also important for contact with the catalytic and C-terminal domains of the whole. 235
57149 405233 pfam14509 GH97_C Glycosyl-hydrolase 97 C-terminal, oligomerization. Glycosyl-hydrolase-97 is made up of three tightly linked and highly conserved globular domains. The C-terminal domain is found to be necessary for oligomerization of the whole molecule in order to create the active-site pocket and the Ca++-binding site. 97
57150 405234 pfam14510 ABC_trans_N ABC-transporter extracellular N-terminal. This domain is found at the N-terminus of ABC-transporter proteins from fungi, plants to higher eukaryotes. It would appear to be an extracellular domain. 80
57151 405235 pfam14511 RE_EcoO109I Type II restriction endonuclease EcoO109I. This is a family of Type II restriction endonucleases. 194
57152 405236 pfam14512 TM1586_NiRdase Putative TM nitroreductase. Compared with the more traditional NADH oxidase/flavin reductase family, this family is a duplication, consisting of two similar domains arranged as the subunits of the dimeric NADH oxidase/flavin reductase with one conserved active site. 214
57153 405237 pfam14513 DAG_kinase_N Diacylglycerol kinase N-terminus. This domain is found at the N-terminus of diacylglycerol kinases. 107
57154 405238 pfam14514 TetR_C_9 Transcriptional regulator, TetR, C-terminal. This family comprises proteins that belong to the TetR family of transcriptional regulators. This family features the C-terminal region of these sequences, which does not include the N-terminal helix-turn-helix. 130
57155 405239 pfam14515 HOASN Haem-oxygenase-associated N-terminal helices. This domain represents a pair of alpha helices, which are found at the N-terminus of some Haem-oxygenase globular domain. 92
57156 373105 pfam14516 AAA_35 AAA-like domain. This family of proteins are part of the AAA superfamily. 331
57157 405240 pfam14517 Tachylectin Tachylectin. This family of lectins binds N-acetylglucosamine and N-acetylgalactosamine and may be involved in innate immunity. It has a five-bladed beta-propeller structure with five carbohydrate-binding sites, one per beta sheet. 230
57158 405241 pfam14518 Haem_oxygenas_2 Iron-containing redox enzyme. The CADD, Chlamydia protein associating with death domains, crystal structure reveals a dimer of seven-helical bundles. Each bundle contains a di-iron centre adjacent to an internal cavity that forms an active site similar to that of methane mono-oxygenase hydrolase. 178
57159 405242 pfam14519 Macro_2 Macro-like domain. This domain is an ADP-ribose binding module. It is found in a number of yeast proteins. 290
57160 405243 pfam14520 HHH_5 Helix-hairpin-helix domain. 57
57161 405244 pfam14521 Aspzincin_M35 Lysine-specific metallo-endopeptidase. This is the catalytic region of aspzincins, a group of lysine-specific metallo-endopeptidases in the MEROPS:M35 family. They exhibit the following active-site architecture. The active site is composed of two helices and a loop region and includes the HExxH and GTxDxxYG motifs. In UniProt:P81054, His117, His121 and Asp130 coordinate to the catalytic zinc ligands. An electrostatically negative region composed of Asp154 and Glu157 attracts a positively charged Lys side chain of a substrate in a specific manner. 145
57162 405245 pfam14522 Cytochrome_C7 Cytochrome c7 and related cytochrome c. This family includes cytochromes c7 and c7-type. In cytochromes c7 all three haems are bis-His co-ordinated. In c7-type the last haem is His-Met co-ordinated. 62
57163 405246 pfam14523 Syntaxin_2 Syntaxin-like protein. This domain includes syntaxin-like domains including from the Vam3p protein. 101
57164 405247 pfam14524 Wzt_C Wzt C-terminal domain. This domain is found at the C-terminus of the Wzt protein. The crystal structure of C-Wzt(O9a) reveals a beta sandwich with an immunoglobulin-like topology that contains the O-antigenic polysaccharide binding pocket. This domain is often associated with the ABC-transporter domain. 143
57165 405248 pfam14525 AraC_binding_2 AraC-binding-like domain. This domain is related to the AraC ligand binding domain pfam02311. 173
57166 405249 pfam14526 Cass2 Integron-associated effector binding protein. This family contains Cass2 from Vibrio cholerae, an integron-associated protein that has been shown to bind cationic drug compounds with submicromolar affinity. Cass2 has been proposed to be representative of a larger family of independent effector-binding proteins associated with lateral gene transfer within Vibrio and other closely-related species. 149
57167 405250 pfam14527 LAGLIDADG_WhiA WhiA LAGLIDADG-like domain. This domain is found within the sporulation regulator WhiA. It is a LAGLIDADG superfamily like domain. 93
57168 405251 pfam14528 LAGLIDADG_3 LAGLIDADG-like domain. This domain is part of the LAGLIDADG superfamily. 82
57169 405252 pfam14529 Exo_endo_phos_2 Endonuclease-reverse transcriptase. This domain represents the endonuclease region of retrotransposons from a range of bacteria, archaea and eukaryotes. These are enzymes largely from class EC:2.7.7.49. 118
57170 405253 pfam14530 DUF4439 Domain of unknown function (DUF4439). This domain has a ferritin-like fold. 131
57171 405254 pfam14531 Kinase-like Kinase-like. This family includes the pseudokinases ROP2 and ROP8 from Toxoplasma gondii. These proteins have a typical bilobed protein kinase fold, but lack catalytic activity. 288
57172 316998 pfam14532 Sigma54_activ_2 Sigma-54 interaction domain. 138
57173 405255 pfam14533 USP7_C2 Ubiquitin-specific protease C-terminal. This C-terminal domain on many long ubiquitin-specific proteases has no known function. 205
57174 405256 pfam14534 DUF4440 Domain of unknown function (DUF4440). 107
57175 405257 pfam14535 AMP-binding_C_2 AMP-binding enzyme C-terminal domain. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices. 96
57176 405258 pfam14536 DUF4441 Domain of unknown function (DUF4441). This family is largely made up of uncharacterized proteins from the Ciliophora. The function is not known. 114
57177 405259 pfam14537 Cytochrom_c3_2 Cytochrome c3. 79
57178 405260 pfam14538 Raptor_N Raptor N-terminal CASPase like domain. This domain is found at the N-terminus of the Raptor protein. It has been identified to have a CASPase like structure. It conserves the characteristic cys/his dyad of the caspases suggesting it may have a peptidase activity. 152
57179 405261 pfam14539 DUF4442 Domain of unknown function (DUF4442). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 139 and 165 amino acids in length. There is a conserved PYF sequence motif. There is a single completely conserved residue N that may be functionally important. 131
57180 405262 pfam14540 NTF-like Nucleotidyltransferase-like. Structural comparisons with Structure 1kny indicate that this N-terminal domain resembles a nucleotidyltransferase fold. 117
57181 405263 pfam14541 TAXi_C Xylanase inhibitor C-terminal. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival. 160
57182 405264 pfam14542 Acetyltransf_CG GCN5-related N-acetyl-transferase. This family of GCN5-related N-acetyl-transferases bind both CoA and acetyl-CoA. They are characterized by highly conserved glycine, a cysteine residue in the acetyl-CoA binding site near the acetyl group, their small size compared with other GNATs and a lack of an obvious substrate-binding site. It is proposed that they transfer an acetyl group from acetyl-CoA to one or more unidentified aliphatic amines via an acetyl (cysteine) enzyme intermediate. The substrate might be another macromolecule. 80
57183 405265 pfam14543 TAXi_N Xylanase inhibitor N-terminal. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival. 172
57184 373123 pfam14544 DUF4443 Domain of unknown function (DUF4443). This is a family of archaeal proteins. The domain is a putative gyrase domain. 112
57185 405266 pfam14545 DBB Dof, BCAP, and BANK (DBB) motif. The DBB domain is named from the Drosophila (Downstream of FGFR - Dof, also known as Heartbroken or Stumps) protein, the BANKS and BCAP, both signalling in B-cell pathway, proteins. This domain defines a minimal region required for mediating Dof dimerization. Since this domain can interact both with itself and with a region in the C-terminal part of the molecule, it may mediate either intermolecular or intramolecular interactions. Mutants lacking this domain disrupt FGFR signal transduction and fibroblast growth-factor signalling. 139
57186 405267 pfam14547 Hydrophob_seed Hydrophobic seed protein. This domain has a four-helix bundle structure. It contains four disulfide bonds, of which three function to keep the C- and N-terminal parts of the molecule in place. 85
57187 405268 pfam14549 P22_Cro DNA-binding transcriptional regulator Cro. Bacteriophage P22 Cro protein represses genes normally expressed in early phage development and is necessary for the late stage of lytic growth. It does this by binding to the OL and OR operator-regions normally used by the repressor protein for lysogenic maintenance. 60
57188 405269 pfam14550 Peptidase_S78_2 Putative phage serine protease XkdF. This domain is largely found on phage proteins. In a number of cases the domain is associated with a SAM-dependent methyltransferase. Members are serine peptidases. 120
57189 405270 pfam14551 MCM_N MCM N-terminal domain. This family contains the N-terminal domain of MCM proteins. 93
57190 405271 pfam14552 Tautomerase_2 Tautomerase enzyme. 82
57191 405272 pfam14553 YqbF YqbF, hypothetical protein domain. This N-terminal domain is found in Bacillus and related spp. The function is not known. 43
57192 405273 pfam14554 VEGF_C VEGF heparin-binding domain. This short domain is found at the C-terminus of VEGF. It has been shown to have heparin binding activity. 49
57193 405274 pfam14555 UBA_4 UBA-like domain. 42
57194 373128 pfam14556 AF2331-like AF2331-like. AF2331-like is a 11-kDa orphan protein of unknown function from Archaeoglobus fulgidus. The structure consists of an alpha + beta fold formed by an unusual homodimer, where the two core beta-sheets are interdigitated, containing strands alternating from both subunits. AF2331 contains multiple negatively charged surface clusters and is located on the same operon as the basic protein AF2330. It is suggested that AF2331 and AF2330 may form a charge-stabilized complex in vivo, though the role of the negatively charged surface clusters is not clear. 90
57195 317018 pfam14557 AphA_like Putative AphA-like transcriptional regulator. Members of this family are putative transcriptional regulators that appear to be related to the pfam03551 family. This family includes AphA-like members. 174
57196 405275 pfam14558 TRP_N ML-like domain. This domain is distantly similar to pfam02221 and conserves its pattern of conserved cysteines. This suggests that this domain may be involved in lipid binding. 137
57197 405276 pfam14559 TPR_19 Tetratricopeptide repeat. 68
57198 405277 pfam14560 Ubiquitin_2 Ubiquitin-like domain. This entry contains ubiquitin-like domains. 83
57199 405278 pfam14561 TPR_20 Tetratricopeptide repeat. 90
57200 373131 pfam14562 Endonuc_BglI Restriction endonuclease BglI. This restriction endonuclease binds DNA as a dimer. BglI recognizes and cleaves the interrupted DNA sequence GCCNNNNNGGC and cleaves between the fourth and fifth unspecified base pair to produce 3' overhanging ends. 285
57201 405279 pfam14563 DUF4444 Domain of unknown function (DUF4444). This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved LIPL sequence motif. There are two completely conserved G residues that may be functionally important. 41
57202 405280 pfam14564 Membrane_bind Membrane binding. This family includes the C-terminal domain of Dictyostelium discoideum Calcium-dependent cell adhesion molecule 1, which has an immunoglobulin-like fold. It tethers the protein to the cell membrane. 109
57203 405281 pfam14565 IL22 Interleukin 22 IL-10-related T-cell-derived-inducible factor. Interleukin-22 is distantly related to interleukin (IL)-10, and is produced by activated T cells. IL-22 is a ligand for CRF2-4, a member of the class II cytokine receptor family. 139
57204 405282 pfam14566 PTPlike_phytase Inositol hexakisphosphate. Inositol hexakisphosphate, often called phytate, is found in abundance in seeds and acts as an inorganic phosphate reservoir. Phytases are phosphatases that hydrolyze phytate to less-phosphorylated myo-inositol derivatives and inorganic phosphate. The active-site sequence (HCXXGXGR) of the phytase identified from the gut micro-organism Selenomonas ruminantium forms a loop (P loop) at the base of a substrate binding pocket that is characteristic of protein tyrosine phosphatases (PTPs). The depth of this pocket is an important determinant of the substrate specificity of PTPs. In humans this enzyme is thought to aid bone mineralization and salvage the inositol moiety prior to apoptosis. 157
57205 405283 pfam14567 SUKH_5 SMI1-KNR4 cell-wall. Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins. 133
57206 405284 pfam14568 SUKH_6 SMI1-KNR4 cell-wall. Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins. 120
57207 405285 pfam14569 zf-UDP Zinc-binding RING-finger. This RING/U-box type zinc-binding domain is frequently found in the catalytic subunit (irx3) of cellulose synthase. The enzymic class is EC:2.4.1.12, whereby the synthase removes the glucose from UDP-glucose and adds it to the growing cellulose, thereby releasing UDP. The domain-structure is treble-clef like (Structure 1weo). 75
57208 405286 pfam14570 zf-RING_4 RING/Ubox like zinc-binding domain. 47
57209 405287 pfam14571 Di19_C Stress-induced protein Di19, C-terminal. C-terminal domain of Di19, a protein that increases the sensitivity of plants to environmental stress, such as salinity, drought, osmotic stress and cold. The protein is also induced by an increased supply of stress-related hormones such as abscisic acid (ABA) and ethylene. There is a zinc-finger at the N-terminus, zf-Di19, pfam05605. 102
57210 405288 pfam14572 Pribosyl_synth Phosphoribosyl synthetase-associated domain. This family includes several examples of enzymes from class EC:2.7.6.1, phosphoribosyl-pyrophosphate transferase. 184
57211 373139 pfam14573 PP-binding_2 Acyl-carrier. 96
57212 405289 pfam14574 DUF4445 Domain of unknown function (DUF4445). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 525 and 664 amino acids in length. The family is found in association with pfam00111. 261
57213 405290 pfam14575 EphA2_TM Ephrin type-A receptor 2 transmembrane domain. EphA2_TM represents the left-handed dimer transmembrane domain of the EphA2 receptor. This domain oligomerizes and is important for the active signalling process. 76
57214 405291 pfam14576 SEO_N Sieve element occlusion N-terminus. Sieve element occlusion (SEO) proteins, or forisomes, are phloem proteins which accumulate during sieve element differentiation. This domain represents the N-terminus of SEO proteins. 287
57215 405292 pfam14577 SEO_C Sieve element occlusion C-terminus. Sieve element occlusion (SEO) proteins, or forisomes, are phloem proteins which accumulate during sieve element differentiation. This domain represents the C-terminus of SEO proteins. 232
57216 405293 pfam14578 GTP_EFTU_D4 Elongation factor Tu domain 4. Elongation factor Tu consists of several structural domains, and this is usually the fourth. 86
57217 405294 pfam14579 HHH_6 Helix-hairpin-helix motif. The HHH domain is a short DNA-binding domain. 88
57218 405295 pfam14580 LRR_9 Leucine-rich repeat. 175
57219 405296 pfam14581 SseB_C SseB protein C-terminal domain. This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. This presumed domain is found at the C-terminus of SseB proteins. 106
57220 405297 pfam14582 Metallophos_3 Metallophosphoesterase, calcineurin superfamily. Members of this family are part of the Calcineurin-like phosphoesterase superfamily. 259
57221 291262 pfam14583 Pectate_lyase22 Oligogalacturonate lyase. This is a family of oligogalacturonate lyases, referred to more generally as pectate lyase family 22. These proteins fold into 7-bladed beta-propellers. 386
57222 405298 pfam14584 DUF4446 Protein of unknown function (DUF4446). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 165 and 176 amino acids in length. 150
57223 405299 pfam14585 CagY_I CagY type 1 repeat. This repeat is found at the N-terminus of the CagY proteins - part of the CAG pathogenicity island - and involved in delivery of the protein CagA into host cells. 65
57224 291265 pfam14586 MHC_I_2 Class I Histocompatibility antigen, NKG2D ligand, domains 1 and 2. Members of this family are known as retinoic-acid-inducible proteins. They are ligands for the activating immunoreceptor NKG2D, which is widely expressed on natural killer cells, T cells, and macrophages. 174
57225 405300 pfam14587 Glyco_hydr_30_2 O-Glycosyl hydrolase family 30. 367
57226 405301 pfam14588 YjgF_endoribonc YjgF/chorismate_mutase-like, putative endoribonuclease. YjgF_Endoribonuc is a putative endoribonuclease. The structure is of beta-alpha-beta-alpha-beta(2) domains common both to bacterial chorismate mutase and to members of the YjgF family. These proteins form trimers with a three-fold symmetry with three closely-packed beta-sheets. The YjgF family is a large, widely distributed family of proteins of unknown biochemical function that are highly conserved among eubacteria, archaea and eukaryotes. 148
57227 373147 pfam14589 NrfD_2 Polysulfide reductase. Bacterial polysulfide reductase is an integral membrane protein complex responsible for quinone-coupled reduction of polysulfide, a process important in extreme environments such as deep-sea vents and hot springs. Polysulfides are a class of compounds composed of chains of sulfur atoms, which in their simplest form are present as an anion with general formula Sn(2-). In nature, polysulfides are found in particularly high concentrations in extreme volcanic or geothermically active environments. Here, the reduction and oxidation of polysulfides are vital processes for many bacteria and are essential steps in the global sulfur cycle. In particular, the reduction of polysulfide to hydrogen sulfide in these environments is usually linked to energy-generating respiratory processes, supporting growth of many microorganisms, particularly hyperthermophiles. 281
57228 317044 pfam14590 DUF4447 Domain of unknown function (DUF4447). This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length. 166
57229 373148 pfam14591 AF0941-like AF0941-like. Members of this family are of unknown function. 113
57230 405302 pfam14592 Chondroitinas_B Chondroitinase B. This family includes chondroitinases. These enzymes cleave the glycosaminoglycan dermatan sulfate. 426
57231 405303 pfam14593 PH_3 PH domain. 103
57232 405304 pfam14594 Sipho_Gp37 Siphovirus ReqiPepy6 Gp37-like protein. This family includes numerous phage proteins from Siphoviruses. The function of this protein is uncertain, but it is related to pfam06605. In Rhodococcus phage ReqiPepy6 this protein is called Gp37. 337
57233 405305 pfam14595 Thioredoxin_9 Thioredoxin. 129
57234 291272 pfam14596 STAT6_C STAT6 C-terminal. This family represents the C-terminus of mammalian STAT6 (Signal transducer and activator of transcription 6), it contains an LXXLL motif which binds to NCOA1 (Nuclear receptor coactivator 1). 193
57235 373150 pfam14597 Lactamase_B_5 Metallo-beta-lactamase superfamily. This is a small family of putative metal-dependent hydrolases. 199
57236 405306 pfam14598 PAS_11 PAS domain. This family includes the PAS-B domain of NCOA1 (Nuclear receptor coactivator 1), which binds to an LXXLL motif in the C-terminal region of STAT6 (Signal transducer and activator of transcription 6). 109
57237 405307 pfam14599 zinc_ribbon_6 Zinc-ribbon. This is a typical zinc-ribbon finger, with each pair of zinc-ligands coming from more-or-less either side of two knuckles. It is found in eukaryotes. 59
57238 373153 pfam14600 CBM_5_12_2 Cellulose-binding domain. This C-terminal domain belongs to the CAZy family of carbohydrate-binding domains that are associated with glycosyl-hydrolases. It is suggested to bind cellulose. 62
57239 405308 pfam14601 TFX_C DNA_binding protein, TFX, C-term. This is the C-terminal region of TFX-like DNA-binding proteins. 83
57240 405309 pfam14602 Hexapep_2 Hexapeptide repeat of succinyl-transferase. 33
57241 405310 pfam14603 hSH3 Helically-extended SH3 domain. This domain is the 70 C-terminal residues of ADAP - Adhesion and de-granulation promoting adapter protein. It shows homology to SH3 domains; however, conserved residues of the fold are absent. It thus represents an altered SH3 domain fold. An N-terminal, amphipathic, helix makes extensive contacts to residues of the regular SH3 domain fold thereby creating a composite surface with unusual surface properties. The domain can no longer bind conventional proline-rich peptides. There are key phosphorylation sites within the two hSH3 domains and it would appear that binding at these sites does not materially affect the folding of these regions although the equilibrium towards the unfolded state may be slightly altered. The binding partners of the hSH3 domains are still unknown. 89
57242 405311 pfam14604 SH3_9 Variant SH3 domain. 49
57243 373156 pfam14605 Nup35_RRM_2 Nup53/35/40-type RNA recognition motif. 53
57244 405312 pfam14606 Lipase_GDSL_3 GDSL-like Lipase/Acylhydrolase family. 178
57245 405313 pfam14607 GxDLY N-terminus of Esterase_SGNH_hydro-type. This domain lies upstream of SGNH hydrolase, but its function is not known. There is a highly conserved GxDLY sequence-motif. 146
57246 405314 pfam14608 zf-CCCH_2 RNA-binding, Nab2-type zinc finger. This is an unusual zinc-finger family, and is represented by fingers 5-7 of Nab2. Nab2 ZnF5-7 are zinc-fingers of the type C-x8-C-x5-C-x3-H. Nab2 ZnFs function in the generation of export-competent mRNPs. Nab2 is a conserved polyadenosine-RNA-binding Zn finger protein required for both mRNA export and polyadenylation regulation and becomes attached to the mRNP after splicing and during or immediately after polyadenylation. The three ZnFs, 5-7, have almost identical folds and, most unusually, associate with one another to form a single coherent structural unit. ZnF5-7 bind to eight consecutive adenines, and chemical shift perturbations identify residues on each finger that interact with RNA. 18
57247 405315 pfam14609 GCP5-Mod21 gamma-Tubulin ring complex non-core subunit mod21. GCP5-Mod21 is a non-core subunit of the larger gamma-tubulin ring complex that effects microtubule nucleation from both centrosomal and non-centrosomal sites. This subunit, unlike GCP2 and GCP3 and others, is not thought to be essential for viability in the fission yeast, and may not be expressed in very high concentrations. Fission yeast can form a large gamma-Tubulin complex C similar to that found in higher eukaryotes and this complex is important for maintaining normal levels of microtubule nucleation in vivo. 676
57248 405316 pfam14610 DUF4448 Protein of unknown function (DUF4448). This is a family of predicted membrane glycoproteins from fungi. However there appears, visually, to be some similarity with the family of GPI-anchored fungal proteins, pfam10342. 185
57249 405317 pfam14611 SLS Mitochondrial inner-membrane-bound regulator. SLS is a fungal domain found bound to the mitochondrial inner-membrane. It reacts physically with fungal Kar2p to promote translocation across the endoplasmic-reticulum membrane. This action appeared to be mediated via the promotion of the Sec63p-mediated activation of Kar2p's ATPase activity. This indicates that the Sls1p protein is a GrpE-like protein in the endoplasmic reticulum. In S.cerevisiae the SLS1 gene (ScSLS1) is not essential but is also involved in ERAD and folding. 197
57250 405318 pfam14612 Ino80_Iec3 IEC3 subunit of the Ino80 complex, chromatin re-modelling. This is a family of fungal chromatin re-modelling proteins found in one of the chromatin-central complexes, Ino80. The function was identified in Schizosaccharomyces pombe but there is no orthologue in S. cerevisiae. 231
57251 405319 pfam14613 DUF4449 Protein of unknown function (DUF4449). This is a fungal DUF of unknown function. 156
57252 405320 pfam14614 DUF4450 Domain of unknown function (DUF4450). This is a family of bacterial proteins of unknown function. 213
57253 405321 pfam14615 Rsa3 Ribosome-assembly protein 3. This is a family of 60S ribosome-assembly proteins, from fungi. 46
57254 405322 pfam14616 DUF4451 Domain of unknown function (DUF4451). This is a family of fungal proteins up-regulated during meiosis. 120
57255 373164 pfam14617 CMS1 U3-containing 90S pre-ribosomal complex subunit. This is a family of fungal and plant CMS1-like proteins. The family has similarity to the DEAD-box helicases. 250
57256 405323 pfam14618 DUF4452 Domain of unknown function (DUF4452). This fungal family has no known function. However, it is rich in paired, as CXXC, cysteines and histidines, but these do not fall in the conformation that might suggest zinc-binding. 169
57257 405324 pfam14619 SnAC Snf2-ATP coupling, chromatin remodelling complex. This domain appears to play a crucial role in chromatin remodelling for yeast SWI/SNF. It binds histones. It is required for mobilising nucleosomes and lies within the catalytic subunit of the yeast SWI/SNF. It is found to be universally conserved. 71
57258 405325 pfam14620 YPEB YpeB sporulation. YPEB is a protein that is necessary for the functioning of SleB during spore-cortex hydrolysis. 361
57259 405326 pfam14621 RFX5_DNA_bdg RFX5 DNA-binding domain. RFX5 and RFXAP reveals molecular details associated with MHCII gene expression. 219
57260 405327 pfam14622 Ribonucleas_3_3 Ribonuclease-III-like. Members of this family are involved in rDNA transcription and rRNA processing. They probably also cleave a stem-loop structure at the 3' end of U2 snRNA to ensure formation of the correct U2 3' end; they are involved in polyadenylation-independent transcription termination. Some members may be mitochondrial ribosomal protein subunit L15, others may be 60S ribosomal protein L3. 128
57261 405328 pfam14623 Vint Hint-domain. This short domain is a conserved region of intein-containing proteins from lower eukaryotes. 165
57262 405329 pfam14624 Vwaint VWA / Hh protein intein-like. VWA-Hint proteins carry this conserved domain of around 300 residues, now named the Vwaint domain. Such proteins do not seem to have a signal peptide for secretion. Generally, this domain lies between the N-terminal VWA domain and the more C-terminal 'Vint'-type Hint domain. The exact function of this domain is not known. 70
57263 405330 pfam14625 Lustrin_cystein Lustrin, cysteine-rich repeated domain. This repeated domain is found in proteins from lower eukaryotes in lustrin, perlucin, pearl nacre, and other similar protein-types. Each repeat lies between Kunitz-BPTI repeats, in certain species, which are also cysteine-rich. The cysteines may form the disulfide bonds observed for other members of this superfamily. 43
57264 405331 pfam14626 RNase_Zc3h12a_2 Zc3h12a-like Ribonuclease NYN domain. This family is found to be a divergent form of the NYN-domain-containing RNase family. 122
57265 405332 pfam14627 DUF4453 Domain of unknown function (DUF4453). This short domain is found only on a small subgroup of proteins from Gram-negative Proteobacteria that also carry a YARHG domain, pfam13308. They carry three conserved tryptophan and three conserved cysteine residues. 107
57266 405333 pfam14628 DUF4454 Domain of unknown function (DUF4454). This C-terminal domain is found only on a small subgroup of proteins from Gram-positive Clostridiales that also carry a YARHG domain, pfam13308. 210
57267 405334 pfam14629 ORC4_C Origin recognition complex (ORC) subunit 4 C-terminus. This entry represents the C-terminus of origin recognition complex subunit 4. 212
57268 405335 pfam14630 ORC5_C Origin recognition complex (ORC) subunit 5 C-terminus. This entry represents the C-terminus of origin recognition complex subunit 5. 260
57269 405336 pfam14631 FancD2 Fanconi anaemia protein FancD2 nuclease. The Fanconi anaemia protein FancD2 is a nuclease necessary for the repair of DNA interstrand-crosslinks. 1345
57270 405337 pfam14632 SPT6_acidic Acidic N-terminal SPT6. The N-terminus of SPT6 is highly acidic. The full SPT6 protein is a transcription regulator, but the exact function of this acidic region is not certain. 88
57271 405338 pfam14633 SH2_2 SH2 domain. 206
57272 405339 pfam14634 zf-RING_5 zinc-RING finger domain. 43
57273 291309 pfam14635 HHH_7 Helix-hairpin-helix motif. 104
57274 405340 pfam14636 FNIP_N Folliculin-interacting protein N-terminus. This is the N-terminus of folliculin-interacting proteins. 79
57275 405341 pfam14637 FNIP_M Folliculin-interacting protein middle domain. This is the middle domain of folliculin-interacting proteins. 226
57276 405342 pfam14638 FNIP_C Folliculin-interacting protein C-terminus. This is the C-terminus of folliculin-interacting proteins. This region is responsible for binding to folliculin. 189
57277 258777 pfam14639 YqgF Holliday-junction resolvase-like of SPT6. The YqgF domain of SPT6 proteins is homologous to the E. coli RuvC, but its putative catalytic site lacks the carboxylate side chains critical for coordinating magnesium ions that mediate phosphodiester bond-cleavage. 150
57278 405343 pfam14640 TMEM223 Transmembrane protein 223. 172
57279 405344 pfam14641 HTH_44 Helix-turn-helix DNA-binding domain of SPT6. This helix-turn-helix represents the first of two DNA-binding domains on the SPT6 proteins. 115
57280 405345 pfam14642 FAM47 FAM47 family. The function of this Chordate family of proteins is not known. 257
57281 405346 pfam14643 DUF4455 Domain of unknown function (DUF4455). This domain family is found in bacteria and eukaryotes, and is approximately 480 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important. 469
57282 405347 pfam14644 DUF4456 Domain of unknown function (DUF4456). This domain family is found in bacteria and eukaryotes, and is approximately 210 amino acids in length. There is a single completely conserved residue E that may be functionally important. 209
57283 405348 pfam14645 Chibby Chibby family. This family includes the eukaryotic chibby proteins. These proteins inhibit the wingless/Wnt pathway by binding to beta-catenin and inhibiting beta-catenin-mediated transcriptional activation. Chibby is Japanese for small, and is named after the RNAi phenotype seen in Drosophila. 114
57284 405349 pfam14646 MYCBPAP MYCBP-associated protein family. This family of eukaryotic proteins includes the mammalian MYCBP-associated proteins. These proteins may be involved in synaptic processes and may have a role in spermatogenesis. 429
57285 405350 pfam14647 FAM91_N FAM91 N-terminus. 306
57286 405351 pfam14648 FAM91_C FAM91 C-terminus. 393
57287 405352 pfam14649 Spatacsin_C Spatacsin C-terminus. This family includes the C-terminus of spatacsin. 295
57288 405353 pfam14650 FAM75 FAM75 family. 382
57289 405354 pfam14651 Lipocalin_7 Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). 128
57290 405355 pfam14652 DUF4457 Domain of unknown function (DUF4457). This family of proteins is found in eukaryotes. It is found repeated several times in the vertebrate KIAA0556 proteins. 325
57291 405356 pfam14653 IGFL Insulin growth factor-like family. This family includes the insulin growth factor-like proteins. These proteins are potential ligands for the IGFLR1 cell membrane receptor. 83
57292 405357 pfam14654 Epiglycanin_C Mucin, catalytic, TM and cytoplasmic tail region. This family represents the non-tandem repeat domain including cleavage site, the transmembrane helix domain, and the cytoplasmic tail of epiglycanin and related mucins. 100
57293 405358 pfam14655 RAB3GAP2_N Rab3 GTPase-activating protein regulatory subunit N-terminus. This family includes the N-terminus of the Rab3 GTPase-activating protein non-catalytic subunit. Rab3 GTPase-activating protein is a GTPase activating protein with specificity for Rab3 subfamily. 416
57294 405359 pfam14656 RAB3GAP2_C Rab3 GTPase-activating protein regulatory subunit C-terminus. This family includes the C-terminus of the Rab3 GTPase-activating protein non-catalytic subunit. Rab3 GTPase-activating protein is a GTPase activating protein with specificity for Rab3 subfamily. 595
57295 405360 pfam14657 Arm-DNA-bind_4 Arm DNA-binding domain. This family includes AP2-like domains found in a variety of phage integrase proteins. These domains bind to Arm DNA sites. 44
57296 405361 pfam14658 EF-hand_9 EF-hand domain. 66
57297 405362 pfam14659 Phage_int_SAM_3 Phage integrase, N-terminal SAM-like domain. This domain is found in a variety of phage integrase proteins. 55
57298 405363 pfam14660 DUF4458 Domain of unknown function (DUF4458). This domain is found in tandem repeats on the N-terminus of secreted LRR proteins from human-associated Bacteroidetes. Domain boundaries are based on the JCSG solved 3D structure of JCSG target SP16667A (BT_0210). 108
57299 405364 pfam14661 HAUS6_N HAUS augmin-like complex subunit 6 N-terminus. This family includes the N-terminus of HAUS augmin-like complex subunit 6. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis. 227
57300 405365 pfam14662 KASH_CCD Coiled-coil region of CCDC155 or KASH. This coiled-coil region is found in the central part of KASH or Klarsicht/ANC-1/Syne/homology proteins. KASH proteins are meiosis-specific proteins that localize at telomeres and interact with SUN1, thus being implicated in meiotic chromosome dynamics and homolog pairing. 191
57301 405366 pfam14663 RasGEF_N_2 Rapamycin-insensitive companion of mTOR RasGEF_N domain. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. 107
57302 405367 pfam14664 RICTOR_N Rapamycin-insensitive companion of mTOR, N-term. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the N-terminal conserved section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. 372
57303 405368 pfam14665 RICTOR_phospho Rapamycin-insensitive companion of mTOR, phosphorylation-site. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient- and growth-factor signalling. This short region is the phosphorylation site. Rictor does interact with 14-3-3 in a Thr1135-dependent manner. Rictor can be inhibited by short-term rapamycin treatment showing that Thr1135 is an mTORC1-regulated site. 112
57304 405369 pfam14666 RICTOR_M Rapamycin-insensitive companion of mTOR, middle domain. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. 104
57305 405370 pfam14667 Polysacc_synt_C Polysaccharide biosynthesis C-terminal domain. This family represents the C-terminal integral membrane region of polysaccharide biosynthesis proteins. 142
57306 405371 pfam14668 RICTOR_V Rapamycin-insensitive companion of mTOR, domain 5. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. These long eukaryotic proteins carry several well-conserved domains, and this is No.5. 71
57307 373208 pfam14669 Asp_Glu_race_2 Putative aspartate racemase. This is a small family of vertebrate putative aspartate racemases. The family lies on TOPAZ 1 proteins. 176
57308 405372 pfam14670 FXa_inhibition Coagulation Factor Xa inhibitory site. This short domain on coagulation enzyme factor Xa is found to be the target for a potent inhibitor of coagulation, TAK-442. 36
57309 405373 pfam14671 DSPn Dual specificity protein phosphatase, N-terminal half. The active core of the dual specificity protein phosphatase is made up of two globular domains both with the DSP-like fold. This family represents the N-terminal half of the core. These domains are arranged in tandem, and are associated via an extensive interface to form a single globular whole. The conserved PTP signature motif (Cys-[X]5-Arg) that defines the catalytic centre of all PTP-family members is located within the C-terminal domain, family DSPc, pfam00782. Although the centre of the catalytic site is formed from DSPc, two loops from the N-terminal domain, DSPn, also contribute to the catalytic site, facilitating peptide substrate specificity. 138
57310 405374 pfam14672 LCE Late cornified envelope. This is a family of late cornified envelope proteins that are expressed in skin. 77
57311 405375 pfam14673 DUF4459 Domain of unknown function (DUF4459). This family appears only on sequences from Salmonella spp. These sequences also all carry a YARHG domain, pfam13308. 159
57312 405376 pfam14674 FANCI_S1-cap FANCI solenoid 1 cap. This is the solenoid 1 cap (S1-cap) domain of the Fanconi anemia group I protein. 53
57313 405377 pfam14675 FANCI_S1 FANCI solenoid 1. This is the solenoid 1 (S1) domain of the Fanconi anemia group I protein. 221
57314 405378 pfam14676 FANCI_S2 FANCI solenoid 2. This is the solenoid 2 (S2) domain of the Fanconi anemia group I protein. 152
57315 405379 pfam14677 FANCI_S3 FANCI solenoid 3. This is the solenoid 3 (S3) domain of the Fanconi anemia group I protein. 217
57316 405380 pfam14678 FANCI_S4 FANCI solenoid 4. This is the solenoid 4 (S4) domain of the Fanconi anemia group I protein. 244
57317 405381 pfam14679 FANCI_HD1 FANCI helical domain 1. This is the helical domain 1 (HD1) of the Fanconi anemia group I protein. 86
57318 405382 pfam14680 FANCI_HD2 FANCI helical domain 2. This is the helical domain 2 (HD2) of the Fanconi anemia group I protein. 237
57319 405383 pfam14681 UPRTase Uracil phosphoribosyltransferase. This family includes the enzyme uracil phosphoribosyltransferase (EC:2.4.2.9). This enzyme catalyzes the first step of UMP biosynthesis. 204
57320 379667 pfam14682 SPOB_ab Sporulation initiation phospho-transferase B, C-terminal. Sporulation initiation phospho-transferase B or Spo0B is part of a phospho-relay that initiates sporulation in Bacillus subtilis. Spo0B is a two-domain protein consisting of an N-terminal alpha-helical hairpin domain and a C-terminal alpha/beta domain, represented by this family. Two subunits of Spo0B dimerize by a parallel association of helical hairpins to form a novel four-helix bundle from which the active histidine - involved in the auto-phosphorylation - protrudes. In the phospho-relay, the signal-receptor histidine kinases are dephosphorylated by a common response regulator, Spo0F. Spo0B then takes phosphorylated Spo0F as substrate thereby mediating the transfer of a phosphoryl group to Spo0A, the ultimate transcription factor. 113
57321 405384 pfam14683 CBM-like Polysaccharide lyase family 4, domain III. CBM-like is domain III of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain possesses a jelly roll beta-sandwich fold structurally homologous to carbohydrate binding modules (CBMs), and it carries two sulfate ions and a hexa-coordinated calcium ion. 157
57322 405385 pfam14684 Tricorn_C1 Tricorn protease C1 domain. This domain is the C1 core domain of tricorn protease. This is a mixed alpha-beta domain. 70
57323 405386 pfam14685 Tricorn_PDZ Tricorn protease PDZ domain. This domain is the PDZ domain of tricorn protease. 88
57324 405387 pfam14686 fn3_3 Polysaccharide lyase family 4, domain II. FnIII-like is domain II of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain displays an immunoglobulin-like or more specifically Fibronectin-III type fold and shows highest structural similarity to the C-terminal beta-sandwich subdomain of the pro-hormone/propeptide processing enzyme carboxypeptidase gp180 from duck. It serves to assist in producing the deep pocket, with domain III, into which the substrate fits. 74
57325 405388 pfam14687 DUF4460 Domain of unknown function (DUF4460). This domain family is found in eukaryotes, and is typically between 103 and 119 amino acids in length. There is a conserved HPD sequence motif. There are two completely conserved residues (N and F) that may be functionally important. 104
57326 405389 pfam14688 DUF4461 Domain of unknown function (DUF4461). This domain family is found in eukaryotes, and is approximately 310 amino acids in length. 306
57327 405390 pfam14689 SPOB_a Sensor_kinase_SpoOB-type, alpha-helical domain. Sporulation initiation phospho-transferase B or SpoOB is part of a phospho-relay that initiates sporulation in Bacillus subtilis. Spo0B is a two-domain protein consisting of an N-terminal alpha-helical hairpin domain and a C-terminal alpha/beta domain. Two subunits of Spo0B dimerize by a parallel association of helical hairpins to form a novel four-helix bundle from which the active histidine - involved in the auto-phosphorylation - protrudes. In the phospho-relay, the signal-receptor histidine kinases are dephosphorylated by a common response regulator, Spo0F. Spo0B then takes phosphorylated Spo0F as substrate thereby mediating the transfer of a phosphoryl group to Spo0A, the ultimate transcription factor. The exact function of this alpha-helical domain is not known; it does not always occur just as the N-terminal domain of SPOB_ab, pfam14682. SCOP describes this domain as a histidine kinase-like fold lacking the kinase ATP-binding site. 60
57328 379669 pfam14690 zf-ISL3 zinc-finger of transposase IS204/IS1001/IS1096/IS1165. 47
57329 405391 pfam14691 Fer4_20 Dihydropyrimidine dehydrogenase domain II, 4Fe-4S cluster. Domain II of the enzyme dihydropyrimidine dehydrogenase binds FAD. Dihydropyrimidine dehydrogenase catalyzes the first and rate-limiting step of pyrimidine degradation by converting pyrimidines to the corresponding 5,6-dihydro compounds. This domain carries two 4Fe-4S clusters. 113
57330 405392 pfam14692 DUF4462 Domain of unknown function (DUF4462). This domain family is found in eukaryotes, and is approximately 30 amino acids in length. 28
57331 405393 pfam14693 Ribosomal_TL5_C Ribosomal protein TL5, C-terminal domain. This family contains the C-terminal domain of ribosomal protein TL5. The N-terminal domain, which binds to 5S rRNA, is contained in family Ribosomal_L25p, pfam01386. Full length (N- and C-terminal domain) homologs of TL5 are also known as CTC proteins. TL5 or CTC are not found in Eukarya or Archaea. In some Bacteria, including E. coli, this ribosomal subunit occurs as a single domain protein (named Ribosomal subunit L25), where the only domain is homologous to TL5 N-terminal domain (hence included in family pfam01386). The function of the C-terminal domain of TL5 is at present unknown. 84
57332 405394 pfam14694 LINES_N Lines N-terminus. This family represents the N-terminus of protein lines. In Drosophila this protein is involved in embryonic segmentation and may function as a transcriptional regulator. 350
57333 405395 pfam14695 LINES_C Lines C-terminus. This family represents the C-terminus of protein lines. In Drosophila this protein is involved in embryonic segmentation and may function as a transcriptional regulator. 37
57334 405396 pfam14696 Glyoxalase_5 Hydroxyphenylpyruvate dioxygenase, HPPD, N-terminal. This domain is one of two barrel-shaped regions that together form the active enzyme, 4-hydroxyphenylpyruvic acid dioxygenase, EC:1.13.11.27. As can be deduced from the disposition of the various Glyoxalase families, _2, _3 and _4 in Pfam, pfam00903, pfam12681, pfam13468, pfam13669, these two regions are similar enough to be indicative of a gene-duplication event. At the individual sequence level slight differences in conformation have given rise to slightly different functions. In the case of UniProt:P80064, 4-hydroxyphenylpyruvic acid dioxygenase catalyzes the formation of homogentisate from 4-hydroxyphenylpyruvate, and the pyruvate part of the HPPD substrate (4-hydroxyphenylpyruvate), derived from L-tyrosine, and the O2 molecule occupy the three free coordination sites of the catalytic iron atom in the C-terminal domain. In plants and photosynthetic bacteria, the tyrosine degradation pathway is crucial because homogentisate, a tyrosine degradation product, is a precursor for the biosynthesis of photosynthetic pigments, such as quinones or tocopherols. 138
57335 405397 pfam14697 Fer4_21 4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters. 59
57336 405398 pfam14698 ASL_C2 Argininosuccinate lyase C-terminal. This domain is found at the C-terminus of argininosuccinate lyase. 68
57337 405399 pfam14699 hGDE_N N-terminal domain from the human glycogen debranching enzyme. This domain is found at the very N-terminus of eukaryotic variants of the glycogen debranching enzyme (GDE), where it is immediately followed by the aldolase-like domain. The eukaryotic GDE performs two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed by the N- and C-terminal halves of the enzyme, respectively. The domain is involved in the glucosyltransferase activity, probably as a substrate-binding module (by analogy with other glucosyltransferases). 87
57338 405400 pfam14700 RPOL_N DNA-directed RNA polymerase N-terminal. This is the N-terminal domain of DNA-directed RNA polymerase. This domain has a role in interaction with regions of upstream promoter DNA and the nascent RNA chain, leading to the processivity of the enzyme. In order to make mRNA transcripts the RNA polymerase undergoes a transition from the initiation phase (which only makes short fragments of RNA) to an elongation phase. This domain undergoes a structural change in the transition from initiation to elongation phase. The structural change results in abolition of the promoter binding site, creation of a channel accommodating the heteroduplex in the active site and formation of an exit tunnel which the RNA transcript passes through after peeling off the heteroduplex. 286
57339 405401 pfam14701 hDGE_amylase Glycogen debranching enzyme, glucanotransferase domain. This is a glucanotransferase catalytic domain of the eukaryotic variant of the glycogen debranching enzyme (GDE). The eukaryotic GDEs perform two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed by the N- and C-terminal halves of the enzymes, respectively. The domain is a catalytic domain responsible for the glucanotransferase function. It belongs to the alpha-amylase clan and is predicted to have a structure of an 8-stranded alpha/beta barrel (TIM barrel) where strands are interrupted by long loops and additional mini-domains. In most other amylases, the catalytic domain is followed by a beta-barrel substrate-binding domain, but the presence of such a domain cannot be verified in the human (and other eukaryotic) GDE enzymes. 439
57340 405402 pfam14702 hGDE_central Central domain of human glycogen debranching enzyme. This is a central domain of the eukaryotic variant of the glycogen debranching enzyme (GDE). The eukaryotic GDE performs two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed by the N- and C-terminal halves of the enzyme, respectively. This central domain follows the glucanotransferase domain and precedes the glucosidase (GDE_N) domain. It is very likely that the current definition contains two or more domains; by analogy with bacterial GDEs, this domain should be involved in substrate binding for the N-terminal glucanotransferase, the C-terminal glucosidase, or both. 242
57341 405403 pfam14703 PHM7_cyt Cytosolic domain of 10TM putative phosphate transporter. PHM7_cyt is the predicted cytosolic domain of integral membrane proteins, such as yeast PHM7 and TM63A_HUMAN TRANSMEMBRANE PROTEIN 63A. This domain usually precedes the 7TM region, pfam02714, and follows a RSN1_TM, pfam13967. Fold recognition programs consistently and with high significance predict this domain to be distantly homologous to RNA binding proteins from the RRM clan. 163
57342 405404 pfam14704 DERM Dermatopontin. Members of this family mediate cell adhesion via cell surface integrin binding. They also induce haemagglutination and aggregation of amebocytes. 149
57343 405405 pfam14705 Costars Costars. This domain is found both alone and at the C-terminus of actin-binding Rho-activating protein (ABRA). It binds to actin, and in muscle regulates the actin cytoskeleton and cell motility. It has a winged helix-like fold consisting of three alpha-helices and four antiparallel beta strands. Unlike typical winged helix proteins it does not bind to DNA, but contains a hydrophobic groove which may be responsible for interaction with other proteins. 75
57344 405406 pfam14706 Tnp_DNA_bind Transposase DNA-binding. This domain occurs at the C-terminus of transposases including E. coli tnpA. TnpA encodes a transposase and an inhibitor protein, the inhibitor only differs from the transposase by the absence of the N-terminal 55 amino acids, which includes most of this domain. This domain consists of alpha helices and turns, and functions as a DNA-binding domain. 58
57345 405407 pfam14707 Sulfatase_C C-terminal region of aryl-sulfatase. 122
57346 405408 pfam14709 DND1_DSRM double strand RNA binding domain from DEAD END PROTEIN 1. A C-terminal domain in human dead end protein 1 (DND1_HUMAN) homologous to double strand RNA binding domains (PF00035, PF00333) 80
57347 405409 pfam14710 Nitr_red_alph_N Respiratory nitrate reductase alpha N-terminal. This is the N-terminal tail of the respiratory nitrate reductase alpha chain. The nitrate reductase complex is a dimer of heterotrimers each consisting of an alpha, beta and gamma chain. The N-terminal tail of the alpha chain interacts with the beta chain and contributes to the stability of the heterotrimer. 37
57348 405410 pfam14711 Nitr_red_bet_C Respiratory nitrate reductase beta C-terminal. This domain occurs near the C-terminus of the respiratory nitrate reductase beta chain. The nitrate reductase complex is a dimer of heterotrimers each consisting of an alpha, beta and gamma chain. This domain plays a role in the interactions between subunits and shielding of the Fe-S clusters. 81
57349 405411 pfam14712 Snapin_Pallidin Snapin/Pallidin. This family of proteins includes Snapin, this protein is associated with the SNARE complex, which mediates synaptic vesicle docking and fusion. It also includes the yeast snapin-like protein SNN1, which is a part of a complex involved in endosomal cargo sorting. The family also includes pallidin, a component of a complex involved in biogenesis of lysosome-related organelles. 89
57350 405412 pfam14713 DUF4464 Domain of unknown function (DUF4464). This family of proteins is found in eukaryotes. Proteins in this family are typically between 224 and 241 amino acids in length. There is a conserved YID sequence motif. 229
57351 405413 pfam14714 KH_dom-like KH-domain-like of EngA bacterial GTPase enzymes, C-terminal. The KH-like domain at the C-terminus of the EngA subfamily of essential bacterial GTPases has a unique domain structure position. The two adjacent GTPase domains (GD1 and GD2), two domains of family MMR_HSR1, pfam01926, pack at either side of the C-terminal domain. This C-terminal domain resembles a KH domain but is missing the distinctive RNA recognition elements. Conserved motifs of the nucleotide binding site of GD1 are integral parts of the GD1-KH domain interface, suggesting the interactions between these two domains are directly influenced by the GTP/GDP cycling of the protein. In contrast, the GD2-KH domain interface is distal to the GDP binding site of GD2. This family has not been added to the KH clan since SCOP classifies it separately due to its missing the key KH motif/fold. 79
57352 405414 pfam14715 FixP_N N-terminal domain of cytochrome oxidase-cbb3, FixP. This is the N-terminal domain of FixP, the cytochrome oxidase type-cbb3. The exact function is not known. 47
57353 405415 pfam14716 HHH_8 Helix-hairpin-helix domain. 67
57354 405416 pfam14717 DUF4465 Domain of unknown function (DUF4465). A large family of uncharacterized proteins mostly from human gut bacteroides, but also some environmental and water bacteria (Planctomycetes) as well as metagenomic samples. Most proteins from this family are secreted or located on the outer surface and may participate in cell-cell interactions or cell-nutrient interactions. This function is supported by a solved structure of a Bacteroides ovatus homolog, which adopts a galactose-binding (jelly-roll) beta-barrel structure. 170
57355 405417 pfam14718 SLT_L Soluble lytic murein transglycosylase L domain. Soluble lytic murein transglycosylase (SLT) consists of three domains, an N-terminal U domain, an L domain (linker domain) and a C-terminal domain (C). The L domain may be involved in the interaction of the enzyme with peptidoglycan. 67
57356 405418 pfam14719 PID_2 Phosphotyrosine interaction domain (PTB/PID). 184
57357 405419 pfam14720 NiFe_hyd_SSU_C NiFe/NiFeSe hydrogenase small subunit C-terminal. This domain is found at the C-terminus of hydrogenase small subunits including periplasmic [NiFeSe] hydrogenase small subunit, uptake hydrogenase small subunit and periplasmic [NiFe] hydrogenase small subunit. This C-terminal domain binds two of the three iron-sulfur clusters in this enzyme. 79
57358 405420 pfam14721 AIF_C Apoptosis-inducing factor, mitochondrion-associated, C-term. This C-terminal domain appears to be a dimerization domain of the mitochondrial apoptosis-inducing factor 1 protein. The domain also appears at the C-terminus of FAD-dependent pyridine nucleotide-disulfide oxidoreductases. Apoptosis inducing factor (AIF) is a bifunctional mitochondrial flavoprotein critical for energy metabolism and induction of caspase-independent apoptosis. On reduction with NADH, AIF undergoes dimerization and forms tight, long-lived FADH2-NAD charge-transfer complexes proposed to be functionally important. 129
57359 405421 pfam14722 KRAP_IP3R_bind Ki-ras-induced actin-interacting protein-IP3R-interacting domain. This family includes the N-terminus of the actin-interacting protein sperm-specific antigen 2, or KRAP (Ki-ras-induced actin-interacting protein). This region is found to be the residues that interact with inositol 1,4,5-trisphosphate receptor (IP3R). KRAP was first localized as a membrane-bound form with extracellular regions suggesting it might be involved in the regulation of filamentous actin and signals from the outside of the cells. It has now been shown to be critical for the proper subcellular localization and function of IP3R. Inositol 1,4,5-trisphosphate receptor functions as the Ca2+ release channel on specialized endoplasmic reticulum membranes, so the subcellular localization of IP3R is crucial for its proper function. 143
57360 405422 pfam14723 SSFA2_C Sperm-specific antigen 2 C-terminus. This family includes the C-terminus of the actin-interacting protein sperm-specific antigen 2. 170
57361 405423 pfam14724 mit_SMPDase Mitochondrial-associated sphingomyelin phosphodiesterase. The GO annotation for this family indicates that it is a single-pass membrane protein, and it appears to be found in mitochondrial membranes. Sphingolipids play important roles in regulating cellular responses, and although mitochondria contain sphingolipids, direct regulation of their levels in mitochondria or mitochondria-associated membranes is mostly unclear. Sphingomyelin phosphodiesterases catalyze the hydrolysis of sphingomyelin to ceramide and phosphocholine, and these metabolites are involved in signalling pathways. 765
57362 405424 pfam14725 DUF4466 Domain of unknown function (DUF4466). 307
57363 405425 pfam14726 RTTN_N Rotatin, an armadillo repeat protein, centriole functioning. Rotatin and its homologs such as Ana3 in Drosophila are found to be essential for centriole function. A deficiency of rotatin in mice leads to randomized heart tube looping, defects in embryonic turning, and abnormal expression of HNF3beta, lefty, and nodal. Thus it is required for left-right and axial patterning. Ana3 - the Drosophila homolog - is present in centrioles and basal bodies, is required for the structural integrity of both centrioles and basal bodies and for centriole cohesion. Rotatin also localizes to centrioles and basal bodies and appears to be essential for cilia function. This family represents the N-terminal domain. 97
57364 405426 pfam14727 PHTB1_N PTHB1 N-terminus. This family includes the N-terminus of PTHB1 protein. This protein forms a part of the BBSome complex, which is required for ciliogenesis. 413
57365 405427 pfam14728 PHTB1_C PTHB1 C-terminus. This family includes the C-terminus of PTHB1 protein. This protein forms a part of the BBSome complex, which is required for ciliogenesis. 370
57366 291399 pfam14729 DUF4467 Domain of unknown function with cystatin-like fold (DUF4467). Large family of predicted lipoproteins from Gram-positive bacteria. Experimentally determined structure shows a cystatin-like fold, allowing us to classify this family in the NTF2 clan, despite lack of any detectable sequence similarity between members of this family and other families in this clan. 94
57367 405428 pfam14730 DUF4468 Domain of unknown function (DUF4468) with TBP-like fold. A large family of (predicted) secreted proteins with unknown functions from human gut and oral cavity. Typically forms an N-terminal domain with an FMN-binding domain at the C-terminus. Experimentally determined 3D structure of this domain shows a variant of a TATA-box-binding-like fold, but no detectable sequence similarity to other proteins with this fold. 88
57368 291401 pfam14731 Staphopain_pro Staphopain proregion. This domain is the proregion of the cysteine protease staphopain. Like many papain type peptidases, staphopain is synthesized as an inactive precursor and cleavage of the proregion is required for activation. This proregion has a half-barrel or barrel-sandwich hybrid fold. The proregion blocks the active site cleft of the mature enzyme on one side of the nucleophilic cysteine 169
57369 405429 pfam14732 UAE_UbL Ubiquitin/SUMO-activating enzyme ubiquitin-like domain. This is the C-terminal domain of ubiquitin-activating enzyme and SUMO-activating enzyme 2. It is structurally similar to ubiquitin. This domain is involved in E1-SUMO-thioester transfer to the SUMO E2 conjugating protein. 88
57370 405430 pfam14733 ACDC AP2-coincident C-terminal. This family is found at the C-terminus of apicomplexan proteins containing the AP2 domain (pfam00847). 89
57371 405431 pfam14734 DUF4469 Domain of unknown function (DUF4469) with IG-like fold. A C-terminal domain in a large family of (predicted) secreted proteins with unknown functions from human gut bacteroides. 101
57372 405432 pfam14735 HAUS4 HAUS augmin-like complex subunit 4. This family includes HAUS augmin-like complex subunit 4. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis. 235
57373 405433 pfam14736 N_Asn_amidohyd Protein N-terminal asparagine amidohydrolase. This family of enzymes catalyzes the deamidation of N-terminal asparagines in peptides and proteins to aspartic acid. 267
57374 405434 pfam14737 DUF4470 Domain of unknown function (DUF4470). This family is conserved from fungi to Metazoa and includes plants. The function is not known, but several members have zinc-finger domain, zf-MYND, pfam01753, at their very C-terminus. Others are also associated with DUF1279, pfam06916. 97
57375 405435 pfam14738 PaaSYMP Solute carrier (proton/amino acid symporter), TRAMD3 or PAT1. PAT1 (proton amino acid transporter 1), also known as TRAMD3 of AAT-1, is the molecular correlate of the intestinal imino acid carrier. It is a proton-amino acid co-transporter having a stoichiometry of 1:1. Due to its mechanism, PAT1 activity increases at acidic pH, which correlates well with the acidic micro-climate close to the brush-border in the intestine. Glycine, proline, and alanine are the preferred substrates of the transporter. The maximum velocity is similar for the three substrates. All substrates are transported with low affinity, showing Km values in the range of 2-10 mM. The transporter does not discriminate between L- and D-isoforms of these amino acids; in addition, beta-alanine is transported with similar affinity as alpha-alanine. Similar to the IMINO transporter, the amino acid analog MeAIB is recognized by PAT1. The transporter is strongly expressed in the small intestine, colon, kidney, and brain. 153
57376 405436 pfam14739 DUF4472 Domain of unknown function (DUF4472). This family is specific to the Chordates. Some members also carry Kinesin-motor domains at their N-terminus, Kinesin, pfam00225. 106
57377 405437 pfam14740 DUF4471 Domain of unknown function (DUF4471). This family is conserved from fungi to Metazoa and includes plants. The function is not known, but several members have zinc-finger domain, zf-MYND, pfam01753, at their very C-terminus. Others are also associated with DUF1279, pfam06916. This domain is more C-terminal in many members to DUF4470, pfam14737. 303
57378 373261 pfam14741 GH114_assoc N-terminal glycosyl-hydrolase-114-associated domain. This short domain is also a very small family found at the N-terminus of GH114, glycosyl-hydrolases. 126
57379 405438 pfam14742 GDE_N_bis N-terminal domain of (some) glycogen debranching enzymes. This domain is found on the N-terminal of some glycogen debranching enzymes and is usually followed by the GDE_C (PF06202) and in this sense it is analogous (but probably not homologous) to the GDE_N (PF12439). Its exact function is unknown 193
57380 405439 pfam14743 DNA_ligase_OB_2 DNA ligase OB-like domain. This domain has an OB-like fold, but does not appear to be related to pfam03120. It is found at the C-terminus of the ATP dependent DNA ligase domain pfam01068. 60
57381 405440 pfam14744 WASH-7_mid WASH complex subunit 7. This family is the central, conserved region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes. 346
57382 405441 pfam14745 WASH-7_N WASH complex subunit 7, N-terminal. This family is the conserved N-terminal region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes. 566
57383 405442 pfam14746 WASH-7_C WASH complex subunit 7, C-terminal. This family is the conserved C-terminal region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes. The C-terminus is predicted to include a transmembrane region. 175
57384 373266 pfam14747 DUF4473 Domain of unknown function (DUF4473). This short family is largely confined to Caenorhabditis proteins. The function is not known. There are two well-conserved aspartate residues. 78
57385 405443 pfam14748 P5CR_dimer Pyrroline-5-carboxylate reductase dimerization. Pyrroline-5-carboxylate reductase consists of two domains, an N-terminal catalytic domain (pfam03807) and a C-terminal dimerization domain. This is the dimerization domain. 105
57386 405444 pfam14749 Acyl-CoA_ox_N Acyl-coenzyme A oxidase N-terminal. Acyl-coenzyme A oxidase consists of three domains. An N-terminal alpha-helical domain, a beta sheet domain (pfam02770) and a C-terminal catalytic domain (pfam01756). This entry represents the N-terminal alpha-helical domain. 120
57387 405445 pfam14750 INTS2 Integrator complex subunit 2. This family of proteins are subunits of the integrator complex involved in snRNA transcription and processing. 1048
57388 405446 pfam14751 DUF4474 Domain of unknown function (DUF4474). Domain found at the N-terminus of a few families of uncharacterized Clostridia proteins. Typically followed by a proline-rich domain or other kinds of repeats. 239
57389 405447 pfam14752 RBP_receptor Retinol binding protein receptor. Proteins in this family function as retinol binding protein receptors. 602
57390 405448 pfam14753 FAM221 Protein FAM221A/B. This family of proteins is found in eukaryotes. Proteins in this family are typically between 99 and 305 amino acids in length. 195
57391 291424 pfam14754 IFR3_antag Papain-like auto-proteinase. The replicase polyproteins of the Nidoviruses such as, porcine arterivirus PRRSV, equine arterivirus EAV, human coronavirus 229E, and severe acute respiratory syndrome coronavirus (SARS-CoV), are predicted to be cleaved into 14 non-structural proteins (nsps) by the nsp4 main proteinase pfam05579 and three accessory proteinases residing in nsp1-alpha, nsp1-beta and nsp2. This family is the two nsp1 proteins that together act in a papain-like way to separate off the rest of the various functional domains of the polyprotein. Once inside the host cell, this nsp1 interferes with the regulation of interferon, thereby enabling the virus to replicate. 249
57392 373271 pfam14755 NSP2_middle Middle region of RNA-arterivirus nonstructural protein 2 (nsp2). This domain represents the middle region of nsp2 of the RNA-arteriviruses, such as porcine arterivirus PRRSV and equine arterivirus EAV, C-terminal to the peptidase C33 family catalytic domain. 148
57393 258893 pfam14756 Pdase_C33_assoc Peptidase_C33-associated domain. The nsps or non-structural protein subunits of the arteriviral polyproteins such as porcine arterivirus PRRSV and equine arterivirus EAV are auto-cleaved into functional units. The function of this particular domain is not known. 147
57394 373272 pfam14757 NSP2-B_epitope Immunogenic region of nsp2 protein of arterivirus polyprotein. This domain is in a non-essential part of the nsp2 (non-structural protein) subunit section of the arterivirus polyprotein. This domain carries seven small sequence-regions that are predicted to be potential B-cell epitopes. 272
57395 373273 pfam14758 NSP2_assoc Non-essential region of nsp2 of arterivirus polyprotein. This non-essential region of the nsp2 subunit of the arterivirus polyprotein of such as porcine arterivirus PRRSV and equine arterivirus EAV may offer immunogenic surfaces to B-cells. It is associated with Peptidase_C33, pfam05412. 198
57396 405449 pfam14759 Reductase_C Reductase C-terminal. This domain occurs at the C-terminus of various reductase enzymes, including putidaredoxin reductase, ferredoxin reductase, 3-phenylpropionate/cinnamic acid dioxygenase ferredoxin--NAD(+) reductase component, benzene 1,2-dioxygenase system ferredoxin--NAD(+) reductase subunit, rhodocoxin reductase, biphenyl dioxygenase system ferredoxin--NAD(+) reductase component, rubredoxin-NAD(+) reductase and toluene 1,2-dioxygenase system ferredoxin--NAD(+) reductase component. In putidaredoxin reductase this domain is involved in dimerization. In the FAD-containing NADH-ferredoxin reductase (BphA4) it is responsible for interaction with the Rieske-type [2Fe-2S] ferredoxin (BphA3). 83
57397 405450 pfam14760 Rnk_N Rnk N-terminus. This domain occurs at the N-terminus of Rnk, an RNA polymerase-interacting protein of the GreA/GreB family (pfam01272). It has a coiled coil structure. 41
57398 405451 pfam14761 HPS3_N Hermansky-Pudlak syndrome 3. This domain is at the N-terminus of these vertebrate proteins. This region carries the clathrin-binding motif LLDFE at residues 172-176 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072. 211
57399 405452 pfam14762 HPS3_Mid Hermansky-Pudlak syndrome 3, middle region. This domain is downstream of the N-terminus of these vertebrate proteins. This region carries a number of tyrosine sorting motifs and one of two di-leucine sorting boxes at residues 542-548 as well as a peroxisomal matrix targeting motif at residues 614-623 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072. 387
57400 405453 pfam14763 HPS3_C Hermansky-Pudlak syndrome 3, C-terminal. This domain is downstream of the mid domain family, pfam14762, of these vertebrate proteins. This region carries a number of tyrosine sorting motifs and the second of two di-leucine sorting boxes at residues 711-717 as well as the ER membrane-retention signal KKPL at residues 1000-1003 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072. 350
57401 405454 pfam14764 SPG48 AP-5 complex subunit, vesicle trafficking. This family would appear to be the second of the two larger subunits of the fifth Adaptor-Protein complex, AP-5. Adaptor protein (AP) complexes facilitate the trafficking of cargo from one membrane compartment of the cell to another by recruiting other proteins to particular types of vesicles. AP-5 is involved in trafficking proteins from endosomes towards other membranous compartments. There are genetic links between AP-5 and hereditary spastic paraplegia, a group of human genetic disorders characterized by progressive spasticity in the lower limbs. 118
57402 405455 pfam14765 PS-DH Polyketide synthase dehydratase. This is the dehydratase domain of polyketide synthases. Structural analysis shows these DH domains are double hotdogs in which the active site contains a histidine from the N-terminal hotdog and an aspartate from the C-terminal hotdog. Studies have uncovered that a substrate tunnel formed between the DH domains may be essential for loading substrates and unloading products. 294
57403 405456 pfam14766 RPA_interact_N Replication protein A interacting N-terminal. This family of proteins represents the N-terminal domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. The N-terminal domain is responsible for interaction with importin beta. 38
57404 405457 pfam14767 RPA_interact_M Replication protein A interacting middle. This family of proteins represents the middle domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. This domain is responsible for interaction with RPA. 82
57405 405458 pfam14768 RPA_interact_C Replication protein A interacting C-terminal. This family of proteins represents the C-terminal domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. The C-terminal domain is a putative zinc finger. 79
57406 405459 pfam14769 CLAMP Flagellar C1a complex subunit C1a-32. This family represents one small subunit, C1a-32, of the C1a projection (the seventh projection of flagellar). Numerous studies have indicated that each of the seven projections associated with the central pair of microtubules in flagellar plays a distinct role in regulating eukaryotic ciliary/flagellar motility. The C1a projection is a complex of proteins including PF6, C1a-86, C1a-34, C1a-32, C1a-18, and calmodulin. C1a projection is involved in modulating flagellar beat frequency and this is mediated via the C1a-34, C1a-32, and C1a-18 sub-complex by modulating the activity of both the inner and outer dynein arms. 97
57407 405460 pfam14770 TMEM18 Transmembrane protein 18. The function of this family is not known, however it is predicted to be a three-pass membrane protein. 118
57408 405461 pfam14771 DUF4476 Domain of unknown function (DUF4476). 91
57409 405462 pfam14772 NYD-SP28 Sperm tail. NYD-SP28 is expressed in a development-dependent manner, localized in spermatogenic cell cytoplasm and human spermatozoa tail. It is post-translationally modified during sperm capacitation and ultimately contributes to the success of fertilisation. 101
57410 405463 pfam14773 VIGSSK Helicase-associated putative binding domain, C-terminal. The function of this short, serine-rich C-terminal region is not known. However, as it is frequently found at the very C-terminus of P-loop containing nucleoside triphosphate hydrolases, it might possibly be a binding domain. 62
57411 405464 pfam14774 FAM177 FAM177 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 205 amino acids in length. 117
57412 405465 pfam14775 NYD-SP28_assoc Sperm tail C-terminal domain. NYD-SP28 is expressed in a development-dependent manner, localized in spermatogenic cell cytoplasm and human spermatozoa tail. It is post-translationally modified during sperm capacitation and ultimately contributes to the success of fertilisation. This short region is found at the very C-terminus of family members of family NYD-SP28, pfam14772. 60
57413 405466 pfam14776 UNC-79 Cation-channel complex subunit UNC-79. This family is a component of a cation-channel complex. 525
57414 405467 pfam14777 BBIP10 Cilia BBSome complex subunit 10. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. BBIP10 localizes to the primary cilium, and is present exclusively in ciliated organisms. It is required for cytoplasmic microtubule polymerization and acetylation, two functions not shared with any other BBSome subunits. BBIP10 physically interacts with HDAC6. BBSome-bound BBIP10 may therefore function to couple acetylation of axonemal microtubules and ciliary membrane growth. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. 55
57415 405468 pfam14778 ODR4-like Olfactory receptor 4-like. In C. elegans, odr-4 and odr-8 are required for localising a subset of odorant GPCRs to the cilia of olfactory neurons. Olfactory receptors (ORs) are synthesized in the endoplasmic reticulum of the olfactory neurons, trafficked to the cell surface membrane and transported to the tip of the olfactory cilium, where they bind with odorants. Various accessory proteins are required for proper targeting of different ORs to the cell membrane. ODR-4 was the first accessory protein to be described. 368
57416 405469 pfam14779 BBS1 Ciliary BBSome complex subunit 1. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS1 predominantly localizes to the basal body and/or transitional zone of ciliated cells. It has been found in a heptameric complex with BBS2, BBS5, BBS7, BBS8, and BBS9, termed the BBSome. Mutations in BBS1 can lead to retinal inadequacy. 254
57417 405470 pfam14780 DUF4477 Domain of unknown function (DUF4477). 187
57418 405471 pfam14781 BBS2_N Ciliary BBSome complex subunit 2, N-terminal. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia. 107
57419 405472 pfam14782 BBS2_C Ciliary BBSome complex subunit 2, C-terminal. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia. 429
57420 405473 pfam14783 BBS2_Mid Ciliary BBSome complex subunit 2, middle region. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia. 108
57421 405474 pfam14784 ECSIT_C C-terminal domain of the ECSIT protein. This family represents the C-terminal domain of the evolutionarily conserved signaling intermediate in Toll pathway protein, an adapter protein of the Toll-like and IL-1 receptor signaling pathway, which is involved in the activation of NF-kappa-B via MAP3K1. This domain is missing in isoform 2. Fold recognition suggests that this domain may be distantly homologous to the pleckstrin homology domain. 131
57422 405475 pfam14785 MalF_P2 Maltose transport system permease protein MalF P2 domain. This is the second periplasmic domain (P2 domain) of the maltose transport system permease protein MalF. 164
57423 405476 pfam14786 Death_2 Tube Death domain. This Tube-Death domain has an insertion between helices 2 and 3, and a C-terminal tail compared with the Death domain of Pelle proteins in Drosophila. The two N-terminal Death domains of the serine/threonine kinase Pelle and the adaptor protein Tube interact to form a six-helix bundle fold arranged in an open-ended linear array with plastic interfaces mediating their interactions. This interaction leads to the nuclear translocation of the transcription factor Dorsal and activation of zygotic patterning genes during Drosophila embryogenesis, and is assisted by the significant and indispensable contacts in the heterodimer contributed by the insertion and C-terminal tail described above. 137
57424 373297 pfam14787 zf-CCHC_5 GAG-polyprotein viral zinc-finger. 36
57425 405477 pfam14788 EF-hand_10 EF hand. 50
57426 405478 pfam14789 THDPS_M Tetrahydrodipicolinate N-succinyltransferase middle. This is the middle domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. 41
57427 339376 pfam14790 THDPS_N Tetrahydrodipicolinate N-succinyltransferase N-terminal. This is the N-terminal domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. 167
57428 405479 pfam14791 DNA_pol_B_thumb DNA polymerase beta thumb. The catalytic region of DNA polymerase beta is split into three domains. An N-terminal fingers domain, a central palm domain and a C-terminal thumb domain. This entry represents the thumb domain. 63
57429 405480 pfam14792 DNA_pol_B_palm DNA polymerase beta palm. The catalytic region of DNA polymerase beta is split into three domains. An N-terminal fingers domain, a central palm domain and a C-terminal thumb domain. This entry represents the palm domain. 110
57430 405481 pfam14793 DUF4478 Domain of unknown function (DUF4478). This domain is found in bacteria, and is approximately 110 amino acids in length. It is found in association with pfam03641 and pfam11892. 109
57431 405482 pfam14794 DUF4479 Domain of unknown function (DUF4479). This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam01588. 71
57432 373300 pfam14795 Leucyl-specific Leucine-tRNA synthetase-specific domain. This short region is found only in leucyl-tRNA synthetases. It is flexibly linked to the enzyme-core by beta-ribbon structures. 56
57433 405483 pfam14796 AP3B1_C Clathrin-adaptor complex-3 beta-1 subunit C-terminal. This domain lies at the C-terminus of the clathrin-adaptor protein complex-3 beta-1 subunit. The AP-3 complex is associated with the Golgi region of the cell as well as with more peripheral structures. The AP-3 complex may be directly involved in trafficking to lysosomes or alternatively it may be involved in another pathway, but that mis-sorting in that pathway may indirectly lead to defects in pigment granules. 147
57434 373302 pfam14797 SEEEED Serine-rich region of AP3B1, clathrin-adaptor complex. This short low-complexity, highly serine-rich region lies on clathrin-adaptor complex 3 beta-1 subunit proteins, between family Adaptin_N, pfam01602, and a C-terminal domain, AP3B1_C, pfam14796. 125
57435 405484 pfam14798 Ca_hom_mod Calcium homeostasis modulator. This family of proteins control cytosolic calcium concentration. They are transmembrane proteins which may be pore-forming ion channels. 250
57436 405485 pfam14799 FAM195 FAM195 family. 98
57437 405486 pfam14800 DUF4481 Domain of unknown function (DUF4481). 293
57438 405487 pfam14801 GCD14_N tRNA methyltransferase complex GCD14 subunit N-term. This is the N-terminal domain of GCD14, itself a subunit of the tRNA methyltransferase complex that is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA. The exact function of the N-terminus is not known but it is necessary for maintaining the overall folding and for full enzymatic activity. 51
57439 405488 pfam14802 TMEM192 TMEM192 family. The function of this family of transmembrane proteins is unknown. In vertebrates, proteins in this family are located in the lysosomal membrane and late endosome. In Arabidopsis, a member of this family has been found to weakly interact with FRIGIDA, a determinant of flowering time. 234
57440 405489 pfam14803 Nudix_N_2 Nudix N-terminal. This domain occurs at the N-terminus of several Nudix (Nucleoside Diphosphate linked to X) hydrolases. 32
57441 405490 pfam14804 Jag_N Jag N-terminus. This domain is found at the N-terminus of proteins containing pfam13083 and pfam01424, including the jag proteins. 50
57442 405491 pfam14805 THDPS_N_2 Tetrahydrodipicolinate N-succinyltransferase N-terminal. This is the N-terminal domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. 67
57443 405492 pfam14806 Coatomer_b_Cpla Coatomer beta subunit appendage platform. This family is found at the C-terminus of the coatomer beta subunit proteins (Beta-coat proteins). It is a platform domain on the appendage that carries a highly conserved tryptophan. 128
57444 405493 pfam14807 AP4E_app_platf Adaptin AP4 complex epsilon appendage platform. This domain is found at the C terminal of clathrin-adaptor epsilon subunit, and at the C-terminus of the appendage on the platform domain. 100
57445 405494 pfam14808 TMEM164 TMEM164 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 214 and 330 amino acids in length. There are two conserved sequence motifs: LNPCH and DPF. 250
57446 373310 pfam14809 TGT_C1 C1 domain of tRNA-guanine transglycosylase dimerization. This short region of the tRNA-guanine transglycosylase enzyme acts as the dimerization domain of the whole protein. 70
57447 405495 pfam14810 TGT_C2 Patch-forming domain C2 of tRNA-guanine transglycosylase. Domain C2 of tRNA-guanine transglycosylase is formed by a four-stranded anti-parallel beta-sheet lined with two alpha helices. It has conserved basic residues on the surface of the beta-sheets as does the C-terminal domain PUA, pfam01472. The catalytic domain, TGT has conserved basic residues on the outer surface of the N-terminal three-stranded beta sheet, which closes the barrel, and it is postulated that these basic residues from the three domains form a continuous, positively charged patch to which the tRNA binds. 70
57448 405496 pfam14811 TPD Protein of unknown function TPD sequence-motif. This is a family of eukaryotic proteins of unknown function. A few members have an associated zinc-finger domain. All members carry a highly conserved TPD sequence-motif. 138
57449 405497 pfam14812 PBP1_TM Transmembrane domain of transglycosylase PBP1 at N-terminal. This is the N-terminal, transmembrane, domain of the transglycosylases (penicillin-binding proteins), the multi-domain membrane proteins essential for cell wall synthesis that are targeted by penicillin antibiotics. The TM domain is a single helix, several of whose residues lie in close proximity to hydrophobic residues in the TGT domain. The TM helix seems to be necessary for stabilizing the protein-membrane interaction, and the resulting orientation limits the interaction between PBP1b and lipid II in the membrane in a 2D lateral diffusion fashion. 85
57450 405498 pfam14813 NADH_B2 NADH dehydrogenase 1 beta subcomplex subunit 2. This family represents an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I), that is believed not to be involved in catalysis. 69
57451 405499 pfam14814 UB2H Bifunctional transglycosylase second domain. UB2H is the second domain of the transglycosylases (or penicillin-binding proteins, PBP1bs), the multi-domain membrane proteins essential for cell wall synthesis that are targeted by penicillin antibiotics. The exact function of the UB2H domain is uncertain, but it may act as the binding component of PBP1b with different binding partners, or it may participate in the regulation between DNA repair and/or synthesis and cell wall formation during the bacterial cell cycle. 85
57452 405500 pfam14815 NUDIX_4 NUDIX domain. 115
57453 405501 pfam14816 FAM178 Family of unknown function, FAM178. 373
57454 405502 pfam14817 HAUS5 HAUS augmin-like complex subunit 5. This family includes HAUS augmin-like complex subunit 5. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis. 642
57455 405503 pfam14818 DUF4482 Domain of unknown function (DUF4482). This family is found in eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam11365. 138
57456 405504 pfam14819 QueF_N Nitrile reductase, 7-cyano-7-deazaguanine-reductase N-term. The QueF monomer is made up of two ferredoxin-like domains aligned together with their beta-sheets that have additional embellishments. This subunit is composed of a three-stranded beta-sheet and two alpha-helices. QueF reduces a nitrile bond to a primary amine. The two monomer units together create suitable substrate-binding pockets. 110
57457 373317 pfam14820 SPRR2 Small proline-rich 2. This family of small proteins is rich in proline, cysteine and glutamate. They contain a tandemly repeated nonamer, PKCPEPCPP. They are components of the cornified envelope of keratinocytes. 68
57458 405505 pfam14821 Thr_synth_N Threonine synthase N-terminus. This domain is found at the N-terminus of many threonine synthase enzymes. 79
57459 405506 pfam14822 Vasohibin Vasohibin. This family of proteins function as angiogenesis inhibitors in animals. 245
57460 405507 pfam14823 Sirohm_synth_C Sirohaem biosynthesis protein C-terminal. This domain is the C-terminus of a multifunctional enzyme which catalyzes the biosynthesis of sirohaem. Both of the catalytic activities of this enzyme, precorrin-2 dehydrogenase (EC:1.3.1.76) and sirohydrochlorin ferrochelatase (EC:4.99.1.4), are located in the N-terminal domain of this enzyme, pfam13241. 66
57461 405508 pfam14824 Sirohm_synth_M Sirohaem biosynthesis protein central. This is the central domain of a multifunctional enzyme which catalyzes the biosynthesis of sirohaem. Both of the catalytic activities of this enzyme, precorrin-2 dehydrogenase (EC:1.3.1.76) and sirohydrochlorin ferrochelatase (EC:4.99.1.4), are located in the N-terminal domain of this enzyme, pfam13241. 25
57462 405509 pfam14825 DUF4483 Domain of unknown function (DUF4483). This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 326 amino acids in length. There is a single completely conserved residue N that may be functionally important. 157
57463 405510 pfam14826 FACT-Spt16_Nlob FACT complex subunit SPT16 N-terminal lobe domain. The FACT or facilitator of chromatin transcription complex binds to and alters the properties of nucleosomes. This family represents the N-terminal lobe of the NTD, or N-terminal domain, and acts as a protein-protein interaction domain presumably with partners outside of the FACT complex. Knockout of the whole NTD domain, 1-450 residues in UniProt:P32558, in yeast serves to render the cells sensitive to DNA replication stress but is not lethal. The C-terminal half of NTD is structurally similar to aminopeptidases, and the most highly conserved surface residues line a cleft equivalent to the aminopeptidase substrate-binding site, family peptidase_M24, pfam00557. 160
57464 405511 pfam14827 dCache_3 Double sensory domain of two-component sensor kinase. Cache_3 is the periplasmic sensor domains of sensor histidine kinase of E. coli DcuS. This domain forms one of the components of the two-component signalling system that allows bacteria to adapt to changing environments. The ability of bacteria to monitor and adapt to their environment is crucial to their survival, and two-component signal transduction systems mediate most of these adaptive responses. One component is a histidine kinase sensor - this domain - most commonly part of a homodimeric transmembrane sensor protein, and the second component is a cytoplasmic response regulator. The two components interact in tandem through a phospho-transfer cascade. 238
57465 405512 pfam14828 Amnionless Amnionless. The amnionless protein forms a complex with cubilin. This complex is necessary for vitamin B12 uptake. 442
57466 373324 pfam14829 GPAT_N Glycerol-3-phosphate acyltransferase N-terminal. GPAT_N is the N-terminal domain of glycerol-3-phosphate acyltransferases, and it forms a four-helix bundle. Glycerol-3-phosphate (1)-acyltransferase (G3PAT) catalyzes the incorporation of an acyl group from either acyl-acyl carrier proteins or acyl-CoAs into the sn-1 position of glycerol 3-phosphate to yield 1-acylglycerol-3-phosphate. G3PATs can either be selective, preferentially using the unsaturated fatty acid, oleate (C18:1), as the acyl donor, or non-selective, using either oleate or the saturated fatty acid, palmitate (C16:0), at comparable rates. The differential substrate-specificity for saturated versus unsaturated fatty acids seen within this enzyme family has been implicated in the sensitivity of plants to chilling temperatures. The exact function of this domain is not known. It lies upstream of family Acyltransferase, pfam01553. 76
57467 405513 pfam14830 Haemocyan_bet_s Haemocyanin beta-sandwich. This antiparallel beta sandwich domain occurs in mollusc haemocyanins. Each mollusc haemocyanin contains several globular oxygen binding functional units. Each unit consists of an alpha-helical copper binding domain (pfam00264) and an antiparallel beta sandwich domain. 103
57468 405514 pfam14831 DUF4484 Domain of unknown function (DUF4484). This domain is found, in a few members, at the C-terminus of family Avl9, pfam09794. The function is not known. 183
57469 373327 pfam14832 Tautomerase_3 Putative oxalocrotonate tautomerase enzyme. 4-oxalocrotonate tautomerase enzyme is involved in the anthranilate synthase pathway. 136
57470 405515 pfam14833 NAD_binding_11 NAD-binding of NADP-dependent 3-hydroxyisobutyrate dehydrogenase. 3-Hydroxyisobutyrate is a central metabolite in the valine catabolic pathway, and is reversibly oxidized to methylmalonate semi-aldehyde by a specific dehydrogenase belonging to the 3-hydroxyacid dehydrogenase family. The reaction is NADP-dependent and this region of the enzyme binds NAD. The NAD-binding domain of 6-phosphogluconate dehydrogenase adopts an alpha helical structure. 122
57471 405516 pfam14834 GST_C_4 Glutathione S-transferase, C-terminal domain. GST conjugates reduced glutathione to a variety of targets including S-crystallin from squid, the eukaryotic elongation factor 1-gamma, the HSP26 family of stress-related proteins and auxin-regulated proteins in plants. Stringent starvation proteins in E. coli are also included in the alignment but are not known to have GST activity. The glutathione molecule binds in a cleft between N and C-terminal domains. The catalytically important residues are proposed to reside in the N-terminal domain. 117
57472 405517 pfam14835 zf-RING_6 zf-RING of BARD1-type protein. The RING domain of the breast and ovarian cancer tumor-suppressor BRCA1 interacts with multiple cognate proteins, including the RING protein BARD1. Proper function of the BRCA1 RING domain is critical, as evidenced by the many cancer-predisposing mutations found within this domain. A dimer is formed between the RING domains of BRCA1 and BARD1. The BRCA1-BARD1 structure provides a model for its ubiquitin ligase activity, illustrates how the BRCA1 RING domain can be involved in associations with multiple protein partners and provides a framework for understanding cancer-causing mutations at the molecular level. The corresponding BRCA1-RING domain is on family zf-C3HC4_2, pfam13923. 65
57473 405518 pfam14836 Ubiquitin_3 Ubiquitin-like domain. This ubiquitin-like domain is found in several ubiquitin carboxyl-terminal hydrolases and in gametogenetin-binding protein. 88
57474 405519 pfam14837 INTS5_N Integrator complex subunit 5 N-terminus. This family of proteins represents the N-terminus of subunit 5 of the integrator complex involved in snRNA transcription and processing. 208
57475 405520 pfam14838 INTS5_C Integrator complex subunit 5 C-terminus. This family of proteins represents the C-terminus of subunit 5 of the integrator complex involved in snRNA transcription and processing. 693
57476 405521 pfam14839 DOR DOR family. This family of proteins regulate autophagy and gene transcription. 206
57477 405522 pfam14840 DNA_pol3_delt_C Processivity clamp loader gamma complex DNA pol III C-term. This domain lies at the C-terminus of the delta subunit of the DNA polymerase III clamp loader gamma complex. Within the complex, the C-terminal domains of gamma, delta and delta' form a helical scaffold on which the rest of the subunits are hung. The gamma complex, an AAA+ ATPase, is the bacterial homolog of the eukaryotic replication factor C that loads the sliding clamp (beta, homologous to PCNA) onto DNA. 125
57478 405523 pfam14841 FliG_M FliG middle domain. This is the middle domain of the flagellar rotor protein FliG. 76
57479 405524 pfam14842 FliG_N FliG N-terminal domain. This is the N-terminal domain of the flagellar rotor protein FliG. 101
57480 405525 pfam14843 GF_recep_IV Growth factor receptor domain IV. This is the fourth extracellular domain of receptor tyrosine protein kinases. Interaction between this domain and the furin-like domain (pfam00757) regulates the binding of ligands to the receptor L domains (pfam01030). 132
57481 405526 pfam14844 PH_BEACH PH domain associated with Beige/BEACH. This PH domain is found in proteins containing the Beige/BEACH domain (pfam02138), it immediately precedes the Beige/BEACH domain. 99
57482 373334 pfam14845 Glycohydro_20b2 beta-acetyl hexosaminidase like. 133
57483 405527 pfam14846 DUF4485 Domain of unknown function (DUF4485). This family is found in eukaryotes, and is approximately 90 amino acids in length. 81
57484 405528 pfam14847 Ras_bdg_2 Ras-binding domain of Byr2. This domain is the binding/interacting region of several protein kinases, such as the Schizosaccharomyces pombe Byr2. Byr2 is a Ser/Thr-specific protein kinase acting as mediator of signals for sexual differentiation in S. pombe by initiating a MAPK module, which is a highly conserved element in eukaryotes. Byr2 is activated by interacting with Ras, which then translocates the molecule to the plasma membrane. Ras proteins are key elements in intracellular signaling and are involved in a variety of vital processes such as DNA transcription, growth control, and differentiation. They function like molecular switches cycling between GTP-bound 'on' and GDP-bound 'off' states. 95
57485 405529 pfam14848 HU-DNA_bdg DNA-binding domain. 123
57486 405530 pfam14849 YidC_periplas YidC periplasmic domain. This is the periplasmic domain of YidC, a bacterial membrane protein which is required for the insertion and assembly of inner membrane proteins. 267
57487 405531 pfam14850 Pro_dh-DNA_bdg DNA-binding domain of Proline dehydrogenase. This domain lies at the N-terminus of bifunctional proline-dehydrogenases and is found to bind DNA. 113
57488 405532 pfam14851 FAM176 FAM176 family. Members of the FAM176 family regulate autophagy and apoptosis. 145
57489 405533 pfam14852 Fis1_TPR_N Fis1 N-terminal tetratricopeptide repeat. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the N-terminal tetratricopeptide repeat. 33
57490 405534 pfam14853 Fis1_TPR_C Fis1 C-terminal tetratricopeptide repeat. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the C-terminal tetratricopeptide repeat. 53
57491 373342 pfam14854 LURAP Leucine rich adaptor protein. This family of proteins activate the canonical NF-kappa-B pathway, promote proinflammatory cytokine production and promote the antigen presenting and priming functions of dendritic cells. 117
57492 405535 pfam14855 PapJ Pilus-assembly fibrillin subunit, chaperone. PapJ is part of the Pap pilus assembly complex that plays an auxiliary role by ensuring the proper integration of PapA into the fimbrial shaft. PapA is the major shaft protein of the pilus. 187
57493 405536 pfam14856 Hce2 Pathogen effector; putative necrosis-inducing factor. The domain corresponds to the mature part of the Ecp2 effector protein from the tomato pathogen Cladosporium fulvum. Effectors are low molecular weight proteins that are secreted by bacteria, oomycetes and fungi to manipulate their hosts and adapt to their environment. Ecp2 is a 165 amino acid secreted protein that was originally identified as a virulence factor in C. fulvum, since disruption reduces virulence of the fungus on tomato plants. We have recently determined that Ecp2 is a member of a novel multigene superfamily, widely distributed and highly diversified within the fungal kingdom, which we have designated Hce2, for Homologs of C. fulvum Ecp2 effector. Although Ecp2 is present in most organisms as a small secreted protein, the mature part of this protein can be found fused to other protein domains, including the fungal Glycoside Hydrolase family 18, Glyco_hydro_18 pfam00704 and other, unknown, protein domains. The intrinsic function of Ecp2 remains unknown but it is postulated that it is a necrosis-inducing factor in plants that serves pathogenicity on the host. 103
57494 405537 pfam14857 TMEM151 TMEM151 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 338 and 558 amino acids in length. 424
57495 405538 pfam14858 DUF4486 Domain of unknown function (DUF4486). This domain family is found in eukaryotes, and is typically between 542 and 565 amino acids in length. 542
57496 258996 pfam14859 Colicin_M Colicin M. Colicin M is a toxin produced by, and active against, Escherichia coli. It catalyzes the hydrolysis of lipid I and lipid II peptidoglycan intermediates, therefore inhibiting peptidoglycan biosynthesis and leading to lysis of the bacterial cells. 269
57497 405539 pfam14860 DrrA_P4M DrrA phosphatidylinositol 4-phosphate binding domain. This domain binds to phosphatidylinositol 4-phosphate. It is found in Legionella pneumophila DrrA, a protein involved in the redirection of endoplasmic reticulum-derived vesicles to the Legionella-containing vacuoles. 103
57498 405540 pfam14861 Antimicrobial21 Plant antimicrobial peptide. This family includes plant antimicrobial peptides. They adopt an alpha-helical hairpin fold stabilized by two disulphide bonds. 30
57499 373348 pfam14862 Defensin_big Big defensin. Big defensins are antimicrobial peptides. They consist of a hydrophobic N-terminal half, which is active against Gram-positive bacteria, and a cationic C-terminal half, which is active against Gram-negative bacteria. The C-terminal half adopts a beta-defensin-like structure. 55
57500 405541 pfam14863 Alkyl_sulf_dimr Alkyl sulfatase dimerization. This domain is found in alkyl sulfatases such as the Pseudomonas aeruginosa SDS hydrolase, where it acts as a dimerization domain. 138
57501 405542 pfam14864 Alkyl_sulf_C Alkyl sulfatase C-terminal. This domain is found at the C-terminus of alkyl sulfatases. Together with the N-terminal catalytic domain, this domain forms a hydrophobic chute and may recruit hydrophobic substrates. 124
57502 373349 pfam14865 Macin Macin. The macins are antimicrobial proteins. They form a disulphide-stabilized alpha-beta motif. 60
57503 259003 pfam14866 Toxin_38 Potassium channel toxin. This family includes scorpion potassium channel toxins. 55
57504 291529 pfam14867 Lantibiotic_a Lantibiotic alpha. Lantibiotics are two-component lanthionine-containing peptide antibiotics active on Gram-positive bacteria. 29
57505 405543 pfam14868 DUF4487 Domain of unknown function (DUF4487). This family of proteins is found in eukaryotes. Proteins in this family are typically between 209 and 938 amino acids in length. There is a conserved WCF sequence motif. There is a single completely conserved residue W that may be functionally important. 555
57506 405544 pfam14869 DUF4488 Domain of unknown function (DUF4488). In most members this domain covers almost the whole sequence, but a few member-sequences also carry a TonB_C domain, PF03544. This domain has a lipocalin fold. 122
57507 405545 pfam14870 PSII_BNR Photosynthesis system II assembly factor YCF48. YCF48 is one of several assembly factors of the photosynthesis system II. The photosynthesis system II occurs in Cyanobacteria that are Gram-negative bacteria performing oxygenic photosynthesis. One of the three membranes surrounding these bacteria is the inner thylakoid membrane (TM) system that is localized within the cell and houses the large pigment-protein complexes of the photosynthetic electron transfer chain, i.e. Photosystem (PS) II, PSI, the cytochrome b6f complex, and the ATP synthase. YCF48 is necessary for efficient assembly and repair of the PSII. YCF48 is found predominantly in the thylakoid membrane. It is a BNR repeat protein. 304
57508 405546 pfam14871 GHL6 Hypothetical glycosyl hydrolase 6. GHL6 is a family of hypothetical glycoside hydrolases. 135
57509 405547 pfam14872 GHL5 Hypothetical glycoside hydrolase 5. GHL5 is a family of hypothetical glycoside hydrolases. 803
57510 405548 pfam14873 BNR_assoc_N N-terminal domain of BNR-repeat neuraminidase. This domain is usually found at the N-terminus of the BNR-repeat neuraminidase protein family. 149
57511 373355 pfam14874 PapD-like Flagellar-associated PapD-like. This domain is a putative PapD periplasmic pilus chaperone protein family. 102
57512 405549 pfam14875 PIP49_N N-term cysteine-rich ER, FAM69. The FAM69 family of cysteine-rich type II transmembrane proteins localize to the endoplasmic reticulum (ER) in cultured cells, probably via N-terminal di-arginine motifs. These proteins carry at least 14 luminal cysteines which are conserved in all FAM69s. There are currently few indications of the involvement of FAM69 members in human diseases. It would appear that FAM69 proteins are predicted to have a protein kinase structure and function. Analysis of three-dimensional structure models and conservation of the classic catalytic motifs of protein kinases in four of human FAM69 proteins suggests they might have retained catalytic phosphotransferase activity. An EF-hand Ca2+-binding domain, inserted within the structure of the kinase domain, suggests they function as Ca2+-dependent kinases (unpublished). 157
57513 291538 pfam14876 RSF Respiratory growth transcriptional regulator. This is a family of transcriptional regulators that determine the transition from fermentative activity to growth on glycerol. 380
57514 405550 pfam14877 mIF3 Mitochondrial translation initiation factor. This is a family of mitochondrial initiation factors IF3. 169
57515 405551 pfam14879 DUF4489 Domain of unknown function (DUF4489). 139
57516 405552 pfam14880 COX14 Cytochrome oxidase c assembly. COX14 plays an essential role in cytochrome oxidase assembly. The COX14 product is a low-molecular weight membrane protein of mitochondria, but it is not a subunit of cytochrome oxidase. Orthology-prediction methods have identified the vertebrate C12orf62 orthologues to be orthologues of the yeast COX14. 59
57517 373359 pfam14881 Tubulin_3 Tubulin domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Misato from Drosophila and Dml1p from fungi are descendants of an ancestral tubulin-like protein, and exhibit regions with similarity to members of a GTPase family that includes eukaryotic tubulin and prokaryotic FtsZ. Dml1p and Misato have been co-opted into a role in mtDNA inheritance in yeast, and into a cell division-related mechanism in flies, respectively. Dml1p might additionally function in the partitioning of the mitochondrial organelle itself, or in the segregation of chromosomes, thereby explaining its essential requirement. This domain is subject to extensive post-translational modifications. 180
57518 405553 pfam14882 PHINT_rpt Phage-integrase repeat unit. This repeat family is found on phage-integrase proteins in up to 15 copies. The function is not known. 54
57519 405554 pfam14883 GHL13 Hypothetical glycosyl hydrolase family 13. GHL13 is a family of hypothetical glycoside hydrolases. 325
57520 405555 pfam14884 EFF-AFF Type I membrane glycoproteins cell-cell fusogen. EFF-AFF was first identified when EFF1 mutants were found to block cell fusion in all epidermal and vulval epithelia in the worm. However, fusion between the anchor cell and the utse syncytium that establishes a continuous uterine-vulval tube proceeds normally in eff-1 mutants and thus Aff1 was established as necessary for this and the fusion of heterologous cells in C. elegans. The transmembrane forms of FF proteins, like most viral fusogens, possess an N-terminal signal sequence followed by a long extracellular portion, a predicted transmembrane domain, and a short intracellular tail. A striking conservation in the position and number of all 16 cysteines in the extracellular portion of FF proteins from different nematode species suggests that these proteins are folded in a similar 3D structure that is essential for their fusogenic activity. C. elegans AFF-1 and EFF-1 proteins are essential for developmental cell-to-cell fusion and can merge insect cells. Thus FFs comprise an ancient family of cellular fusogens that can promote fusion when expressed on a viral particle. 471
57521 405556 pfam14885 GHL15 Hypothetical glycosyl hydrolase family 15. GHL15 is a family of hypothetical glycoside hydrolases. 272
57522 405557 pfam14886 FAM183 FAM183A and FAM183B related. The function of this family of metazoan sequences is not known. 106
57523 373364 pfam14887 HMG_box_5 HMG (high mobility group) box 5. Nucleolar transcription factor/upstream binding factor contains six HMG box domains. This is the fifth HMG box domain in these proteins. This domain has lost DNA-binding ability. 84
57524 405558 pfam14888 PBP-Tp47_c Penicillin-binding protein Tp47 domain C. Domain C is the largest domain in this unusual penicillin-binding protein (PBP), Tp47. This domain is mainly characterized by an immunoglobulin fold with two opposing beta-sheets that form the typical barrel-like structure. In contrast to the classical immunoglobulin fold, however, this has an additional beta-strand inserted after strand 3. Also, the strands are connected by rather large loops. Helices are inserted between strands 2 and 3 and between strands 4 and 5. Domain C interacts with domain B via a surface that has a slightly concave, goblet-like shape. Tp47 is unusual in that it displays beta-lactamase activity, and thus it does not fit the classical structural and mechanistic paradigms for PBPs, and thus Tp47 appears to represent a new class of PBP. 158
57525 405559 pfam14889 PBP-Tp47_a Penicillin-binding protein Tp47 domain a. This is the first domain in this unusual penicillin-binding protein (PBP), Tp47. It is mainly composed of beta-strands and is sequentially non-contiguous. The first three domains in Tp47 interact with each other through intimate domain-domain interfaces. Domain A contacts domain B through its N-terminal segment. Domain A also interacts tightly with domain C. Tp47 is unusual in that it displays beta-lactamase activity, and thus it does not fit the classical structural and mechanistic paradigms for PBPs, and thus Tp47 appears to represent a new class of PBP. 161
57526 405560 pfam14890 Intein_splicing Intein splicing domain. Inteins are segments of protein which excise themselves from a precursor protein and mediate the rejoining of the remainder of the precursor (the extein). Most inteins consist of a splicing domain which is split into two segments by a homing endonuclease domain. This domain represents the splicing domain. 382
57527 405561 pfam14891 Peptidase_M91 Effector protein. This family of proteins contains an HEXXH motif, typical of zinc metallopeptidases. The family includes the E. coli effector protein NleD, which cleaves and inactivates c-Jun N-terminal kinase (JNK). 173
57528 405562 pfam14892 DUF4490 Domain of unknown function (DUF4490). This family of proteins is found in eukaryotes. Proteins in this family are typically between 101 and 220 amino acids in length. In mice, a member of this family whose expression is induced by p53 may play a role in DNA damage response. 99
57529 405563 pfam14893 PNMA PNMA. The PNMA family includes paraneoplastic antigens Ma 1, 2 and 3, found in the serum of patients with paraneoplastic neurological disorders. The family also includes modulator of apoptosis 1, which has a role in death receptor-dependent apoptosis. 327
57530 405564 pfam14894 Lsm_C Lsm C-terminal. This domain is found at the C-terminus of archaeal Lsm (like-Sm) proteins. 57
57531 405565 pfam14895 PPPI_inhib Protein phosphatase 1 inhibitor. This family of proteins interacts with and inhibits the phosphatase activity of protein phosphatase 1 (PP1) complexes. 342
57532 405566 pfam14896 Arabino_trans_C EmbC C-terminal domain. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol. This domain represents the C-terminal extracellular domain that is likely to bind to carbohydrate. 384
57533 405567 pfam14897 EpsG EpsG family. This family of proteins is related to the EpsG protein from B. subtilis. These proteins are likely glycosyl transferases belonging to the membrane protein GT-C clan. 323
57534 405568 pfam14898 DUF4491 Domain of unknown function (DUF4491). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 107 amino acids in length. There is a conserved EYY sequence motif. 90
57535 405569 pfam14899 DUF4492 Domain of unknown function (DUF4492). This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. The function of these proteins is unknown. 63
57536 405570 pfam14900 DUF4493 Domain of unknown function (DUF4493). This family of proteins is found in bacteria. Proteins in this family are typically between 264 and 710 amino acids in length. Many of these proteins have a lipid attachment site suggesting they are lipoproteins. 221
57537 405571 pfam14901 Jiv90 Cleavage inducing molecular chaperone. Jiv90 is a fragment of the DnaJ protein in eukaryotes and in J-domain protein interacting with viral protein (Jiv) located in the N terminal region of the pestivirus viral polypeptide. The viral protein interacts stably with non structural (NS) protein NS2, causing a conformational change in NS2-NS3 and stimulates NS2-NS3 cleavage in trans. Cleavage of NS2-NS3 increases cytopathogenicity and consequently aids viral replication. Jiv therefore acts as a regulating cofactor for NS2 auto-protease. The efficient release of NS3 from the viral polypeptide by Jiv is considered crucial to the pestivirus cytopathogenicity. In eukaryotes, it usually lies 40 residues downstream of DnaJ family pfam00226. However, the function in eukaryotes is still unknown. 89
57538 405572 pfam14902 DUF4494 Domain of unknown function (DUF4494). This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 172 amino acids in length. There are two conserved sequence motifs: VDA and EAE. There is a single completely conserved residue E that may be functionally important. 139
57539 405573 pfam14903 WG_beta_rep WG containing repeat. This repeat contains an N-terminal WG repeat motif. The extent of the repeat is poorly defined. This repeat may form a beta solenoid structure (Bateman A pers. obs.). 35
57540 405574 pfam14904 FAM86 Family of unknown function. Function of this protein family is not known. 94
57541 405575 pfam14905 OMP_b-brl_3 Outer membrane protein beta-barrel family. This family includes proteins annotated as TonB dependent receptors. But it is also likely to contain other membrane beta barrel proteins of other functions. 407
57542 405576 pfam14906 DUF4495 Domain of unknown function (DUF4495). This domain family is found in eukaryotes, and is typically between 322 and 336 amino acids in length. There are two conserved sequence motifs: QMW and DLW. Proteins in this family vary in length from 793 to 1184 amino acids. 318
57543 405577 pfam14907 NTP_transf_5 Uncharacterized nucleotidyltransferase. This family is likely to be an uncharacterized group of nucleotidyltransferases. 249
57544 405578 pfam14908 DUF4496 Domain of unknown function (DUF4496). This domain family is found in eukaryotes, and is typically between 134 and 154 amino acids in length. Proteins in this family vary in length between 264 and 772 amino acid residues. 89
57545 405579 pfam14909 SPATA6 Spermatogenesis-assoc protein 6. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. The family has similarity to the motor domain of kinesin related proteins and with the Caenorhabditis elegans neural calcium sensor protein (NCS-2). 139
57546 405580 pfam14910 MMS22L_N S-phase genomic integrity recombination mediator, N-terminal. MMS22L (Methyl methanesulfonate-sensitivity protein 22-like) is found in yeast, plants and vertebrates, and is integrally concerned with DNA forking and repair mechanisms during replication. MMS22L complexes with TONSL and this complex accumulates at regions of ssDNA associated with distressed replication forks or at processed DNA breaks. Its depletion results in high levels of endogenous DNA double-strand breaks caused by an inability to complete DNA synthesis after replication fork collapse. Thus the complex mediates recovery from replication stress and homologous recombination in vertebrates, yeasts and plants. This family is the more N-terminal region of the proteins. 708
57547 405581 pfam14911 MMS22L_C S-phase genomic integrity recombination mediator, C-terminal. MMS22L (Methyl methanesulfonate-sensitivity protein 22-like) is found in yeast, plants and vertebrates, and is integrally concerned with DNA forking and repair mechanisms during replication. MMS22L complexes with TONSL and this complex accumulates at regions of ssDNA associated with distressed replication forks or at processed DNA breaks. Its depletion results in high levels of endogenous DNA double-strand breaks caused by an inability to complete DNA synthesis after replication fork collapse. Thus the complex mediates recovery from replication stress and homologous recombination in vertebrates, yeasts and plants. This family is the more C-terminal region of the proteins. 373
57548 405582 pfam14912 THEG Testicular haploid expressed repeat. This repeat is the only conserved part of the THEG proteins from vertebrate spermatids. Both human and mouse THEG are specifically expressed in the nucleus of haploid male germ cells and are involved in the regulation of nuclear functions. Although the differential gene expression of THEG in spermatid-Sertoli cell co-culture supports the relevance of germ cell-Sertoli cell interaction for gene regulation during spermatogenesis, THEG was not found to be essential for spermatogenesis in mice. 59
57549 405583 pfam14913 DPCD DPCD protein family. This protein is found in eukaryotes and a mutation in this protein is thought to cause Primary Ciliary Dyskinesia (PCD). This protein is 203 amino acids in length, 23 kDa in size and its function remains unknown. The gene that encodes this protein is a candidate gene for PCD and is expressed during ciliogenesis. PCD affects the airways and reproductive organs, and probing Northern blots show DPCD expression in humans is highest in the testes. Additionally, there is no indication of major splice variants. 190
57550 405584 pfam14914 LRRC37AB_C LRRC37A/B like protein 1 C-terminal domain. This family represents the C-terminal domain of the putative Leucine Rich Repeat Containing protein 37A or protein 37B (LRRC37A/B) found in eukaryotes. The Leucine Rich Repeats (LRR) lie in the central region. The gene that encodes this protein is found in the chromosomal position 17q11.2, and its microdeletion results in the disease, neurofibromatosis type-1 (NF1). The function of the protein, LRRC37B is unknown, however experimental data shows expression in the aorta, heart, skeletal muscle, liver and brain during gestation. 147
57551 405585 pfam14915 CCDC144C CCDC144C protein coiled-coil region. This family includes the human protein CCDC144C and the ankyrin repeat domain-containing protein 26-like 1 found in eukaryotes. Its function remains unknown, however, it is known to contain a coiled-coil domain which corresponds to this region. The ankyrin repeat which features in this protein is a common amino acid motif. 305
57552 373383 pfam14916 CCDC92 Coiled-coil domain of unknown function. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The function is not known and the proteins carry no other domains. 57
57553 405586 pfam14917 CCDC74_C Coiled coil protein 74, C terminal. This is a C-terminal conserved domain of coiled-coil proteins from vertebrates. The function is not known. Expression levels in humans are elevated in breast cancer. 121
57554 405587 pfam14918 MTBP_N MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14919, PF14920. 254
57555 405588 pfam14919 MTBP_mid MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14918, PF14920. 339
57556 405589 pfam14920 MTBP_C MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14918, PF14919. 257
57557 405590 pfam14921 APCDDC Adenomatosis polyposis coli down-regulated 1. The domain is duplicated in most members of this family. APCDD is directly regulated by the beta-catenin/Tcf complex, and its elevated expression promotes proliferation of colonic epithelial cells in vitro and in vivo. APCDD1 has an N-terminal signal-peptide and a C-terminal transmembrane region. The domain is rich in cysteines, there being up to 12 such residues, a structural motif important for interaction between Wnt ligands and their receptors. APCDD1 is expressed in a broad repertoire of cell types, indicating that it may regulate a diverse range of biological processes controlled by Wnt signalling. 234
57558 405591 pfam14922 FWWh Protein of unknown function. This is a family of eukaryotic proteins. Most members carry a highly distinctive, conserved sequence motif of FWWh, where h represents a hydrophobic residue. The function of the family is not known. 150
57559 405592 pfam14923 CCDC142 Coiled-coil protein 142. The function of this coiled-coil domain-containing family is not known. It is found in eukaryotes. 455
57560 405593 pfam14924 DUF4497 Protein of unknown function (DUF4497). This domain family is found in eukaryotes, and is typically between 107 and 123 amino acids in length. There are two completely conserved G residues that may be functionally important. 107
57561 405594 pfam14925 HPHLAWLY Domain of unknown function. Members of this family carry two distinct, highly conserved sequence motifs, CPPPLYYTHL and HPHLAWLY. The family is found in eukaryotes, and the function is not known. This family lies at the C-terminus of members. 641
57562 405595 pfam14926 DUF4498 Domain of unknown function (DUF4498). This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 308 amino acids in length. 245
57563 405596 pfam14927 Neurensin Neurensin. The neurensin family includes the neuronal membrane proteins neurensin-1 and neurensin-2. Neurensin-1 plays a role in neurite extension. 132
57564 405597 pfam14928 S_tail_recep_bd Short tail fibre protein receptor-binding domain. This domain is a receptor binding domain found on bacteriophage short tail fibre proteins. It contains a zinc-binding site and a potential lipopolysaccharide-binding site. 93
57565 405598 pfam14929 TAF1_subA TAF RNA Polymerase I subunit A. TATA box binding protein associated factor RNA Polymerase I subunit A is found in eukaryotes and is encoded by the gene TAF1A in humans. Its function is to aid transcription of DNA into RNA by binding to the promoter at the -10 TATA box site. It is a component of the transcription factor SL1/TIF-IB complex, involved in PIC assembly (pre-initiation complex) during RNA polymerase I-dependent transcription. The rate of PIC formation depends on the rate of association of this protein. This protein also stabilizes nucleolar transcription factor 1/UBTF on rDNA. 365
57566 405599 pfam14930 Qn_am_d_aII Quinohemoprotein amine dehydrogenase, alpha subunit domain II. This is the second domain of the alpha subunit of quinohemoprotein amine dehydrogenase. 107
57567 405600 pfam14931 IFT20 Intraflagellar transport complex B, subunit 20. IFT20 is subunit 20 of the intraflagellar transport complex B. The intraflagellar transport complex assembles and maintains eukaryotic cilia and flagella. IFT20 is localized to the Golgi complex and is anchored there by the Golgi polypeptide, GMAP210, whereas all other subunits except IFT172 localize to cilia and the peri-basal body or centrosomal region at the base of cilia. IFT20 accompanies Golgi-derived vesicles to the point of exocytosis near the basal bodies where the other IFT polypeptides are present, and where the intact IFT particle is assembled in association with the inner surface of the cell membrane. Passage of the IFT complex then follows, through the flagellar pore recognition site at the transition region, into the ciliary compartment. There also appears to be a role of intraflagellar transport (IFT) polypeptides in the formation of the immune synapse in non ciliated cells. The flagellum, in addition to being a sensory and motile organelle, is also a secretory organelle. A number of IFT components are expressed in haematopoietic cells, which have no cilia, indicating an unexpected role of IFT proteins in immune synapse-assembly and intracellular membrane trafficking in T lymphocytes; this suggests that the immune synapse could represent the functional homolog of the primary cilium in these cells. 109
57568 405601 pfam14932 HAUS-augmin3 HAUS augmin-like complex subunit 3. This domain is subunit three of the augmin complex found from Drosophila to humans. The HAUS-augmin complex is made up of eight subunits. The augmin complex interacts with gamma-TuRC, and attenuation of this interaction severely impairs spindle MT generation. Furthermore, we provide evidence that human augmin plays critical and non-redundant roles in the kinetochore-MT attachment and also central spindle formation during anaphase in human cells. The HAUS complex is required for mitotic spindle assembly and for maintenance of centrosome integrity. 261
57569 373400 pfam14933 CEP19 CEP19-like protein. This family includes the centrosomal protein of 19 kDa found in eukaryotes. In humans, it is encoded for by the gene CEP19 which is also known as C3orf34. These proteins localize in the centrosomes. Centrosomes are dynamic organelles that assemble around the centrioles. They organize the microtubule cytoskeleton and mitotic spindle apparatus and are required for cell division and cell migration. C3orf34 localizes near the centrosome in early interphase, to spindle poles during mitosis, and to distinct foci oriented towards the midbody at telophase. 150
57570 405602 pfam14934 DUF4499 Domain of unknown function (DUF4499). This family contains a protein found in eukaryotes. Transmembrane protein C10orf57 is encoded for by the gene chromosome 10 open reading frame 57 (C10orf57) located in chromosomal position 10q22.3. The exact function of this protein is still unknown, however it is thought to be an integral membrane protein. The protein sequence is 123 amino acids in length and has a mass of approximately 14.2 kDa. The family also includes some longer proteins that possess an N-terminal dehydrogenase domain, pfam01073. 88
57571 405603 pfam14935 TMEM138 Transmembrane protein 138. This family of proteins is found in eukaryotes and members are approximately 160 amino acids in length. There are two conserved sequence motifs: YYY and DPR. This transmembrane protein belongs to a family found in eukaryotes and is involved in the biogenesis and degradation of ciliated cells. Mutations in this protein cause the disease Joubert syndrome (JBTS) where the cilia become non-motile. Ciliopathy can be severe since cilia provide the cell with large amounts of information through signals. Ciliopathy can affect cell behaviour as the appropriate signals between the cell and its environment are not made, which can affect cell survival. 119
57572 405604 pfam14936 p53-inducible11 tumor protein p53-inducible protein 11. TP53 is a tumor suppressor gene; when switched on, it suppresses tumor development by inducing stable growth arrest or cell apoptosis. The tumor protein TP53 inducible protein 11 encoded for by the gene TP53I11, has a protein sequence of 189 amino acids in length and 21 kDa in mass. The role of this protein is thought to negatively regulate cell proliferation in response to stress, and therefore suppress tumor formation. 182
57573 405605 pfam14937 DUF4500 Domain of unknown function (DUF4500). This family is found in eukaryotes. The function of this protein remains unknown. The gene which encodes for this protein is named chromosome 6 open reading frame 162 (C6orf162) and is found between the chromosomal positions 6q15-q16.1. It is thought that this protein may be an important part of membrane function. 81
57574 405606 pfam14938 SNAP Soluble NSF attachment protein, SNAP. The soluble NSF attachment protein (SNAP) proteins are involved in vesicular transport between the endoplasmic reticulum and Golgi apparatus. They act as adaptors between SNARE (integral membrane SNAP receptor) proteins and NSF (N-ethylmaleimide-sensitive factor). They are structurally similar to TPR repeats. 273
57575 405607 pfam14939 DCAF15_WD40 DDB1-and CUL4-substrate receptor 15, WD repeat. DCAFs, Ddb1- and Cul4-associated factors, are substrate receptors for the Cul4-Ddb1 Ubiquitin Ligase. There are 18 different factors, the majority of which are WD40-repeat-proteins. 203
57576 405608 pfam14940 TMEM219 Transmembrane 219. This protein belongs to a family found in eukaryotes. Proteins in this family are typically between 240 and 315 amino acids in length. The domains in this family vary in length from 202 to 249 amino acids. Its exact function remains unknown, however, it is thought to have a role as a transmembrane protein. More specifically, it is possible that this transmembrane protein may have a role as an insulin-like growth factor binding protein 3-receptor (IGFBP-3R). This receptor binds to the ligand, insulin growth factor 3, which is a p53-induced, apoptosis factor important for cancer prevention. 236
57577 405609 pfam14941 OAF Transcriptional regulator, Out at first. This family of proteins is found in eukaryotes. Proteins in this family are typically between 198 and 332 amino acids in length. The domains in this family vary in length from 239 to 242 amino acids. The gene, OAF (out at first), which encodes this protein, has a promoter which may help mediate regulation of neighboring genes. An alternative name for this protein is HCV NS5A-transactivated protein 13 target protein 2, which stands for Hepatitis C virus nonstructural 5A-transactivated protein 13 target protein 2. NS5A inhibits double-stranded-RNA-activated protein kinase (PKR) activity, which is thought to allow Hepatitis C Virus replication to continue in the presence of an alpha interferon (IFN)-induced antiviral response. 242
57578 405610 pfam14942 Muted Organelle biogenesis, Muted-like protein. The protein is a coiled-coil protein and belongs to a family found in eukaryotes. It undergoes alternative splicing forming two isoforms. The larger isoform is 187 amino acids long in protein sequence length and 21 kDa in mass. The smaller isoform is 110 amino acids long in protein sequence length and 12 kDa in mass. This protein associates with other proteins in order to form biogenesis of lysosome-related organelles complex-1 BLOC1 complex. BLOC-1 is required for the normal biogenesis of specialized organelles of the endosomal-lysosomal system. 141
57579 405611 pfam14943 MRP-S26 Mitochondrial ribosome subunit S26. This family of proteins corresponds to mitochondrial ribosomal subunit S26 in eukaryotes. 169
57580 405612 pfam14944 TCRP1 Tongue Cancer Chemotherapy Resistant Protein 1. This family of proteins is found in eukaryotes. Tongue Cancer Chemotherapy Resistant-associated Protein 1 (TCRP1) is resistant to the chemotherapy drug, cisplatin, which induces apoptosis in tumor cells. There is suggestion that TCRP1 can be targeted to reverse chemotherapy resistance. The precise mechanism of TCRP1 inducing resistance against chemotherapy is still not clear, but it is thought that TCRP1 alters cell signalling pathways affecting apoptosis or DNA repair capacity. Proteins in this family are typically between 194 and 235 amino acids in length. 243
57581 405613 pfam14945 LLC1 Normal lung function maintenance, Low in Lung Cancer 1 protein. This protein is part of a family found in eukaryotes. It is 137 amino acids long in protein sequence length and mass is approximately 15.7 kDa. The protein is present in the normal lung epithelium, but absent or downregulated in most primary non-small lung cancers. The gene is known as Low in Lung Cancer 1 (LLC1). This protein is thought to have a role in the maintenance of normal lung function and its absence may lead to lung tumorigenesis. 118
57582 405614 pfam14946 DUF4501 Domain of unknown function (DUF4501). This family of proteins is found in eukaryotes. Proteins in this family are typically between 167 and 308 amino acids in length. The exact function of this protein remains unknown, but it is thought to be a single-pass membrane protein. This family contains many highly conserved cysteine residues. 177
57583 405615 pfam14947 HTH_45 Winged helix-turn-helix. This winged helix-turn-helix domain contains an extended C-terminal alpha helix which is responsible for dimerization of this domain. 77
57584 405616 pfam14948 RESP18 RESP18 domain. This domain is found in the glucocorticoid-responsive protein regulated endocrine-specific protein 18 (RESP18) and in the N-terminal extracellular region of receptor-type tyrosine-protein phosphatases containing the protein-tyrosine phosphatase receptor IA-2 domain (pfam11548). 77
57585 405617 pfam14949 ARF7EP_C ARF7 effector protein C-terminus. This family represents the C-terminus of the ARF7 effector protein (ARF7EP). ARF7EP interacts with ADP-ribosylation factor-like protein 14 and unconventional myosin-Ie and through this interaction controls movement of MHC-II-containing vesicles along the actin cytoskeleton in dendritic cells. It contains a conserved CXCXXXXCXXCXXXCXXCXXXXCXXXCXC motif in its C-terminal half. 102
57586 405618 pfam14950 DUF4502 Domain of unknown function (DUF4502). This family of proteins is found in eukaryotes. Proteins in this family are typically between 181 and 876 amino acids in length. 351
57587 405619 pfam14951 DUF4503 Domain of unknown function (DUF4503). This family of proteins is found in eukaryotes. Proteins in this family are typically between 313 and 876 amino acids in length. 391
57588 405620 pfam14952 zf-tcix Putative treble-clef, zinc-finger, Zn-binding. This domain resembles the zinc-binding domain of prokaryotic topoisomerases, family DNA_ligase_ZBD pfam03119. The function of the eukaryotic proteins it is carried on is not known. 42
57589 405621 pfam14953 DUF4504 Domain of unknown function (DUF4504). This family of proteins is found in eukaryotes. Proteins in this family are typically between 253 and 329 amino acids in length. There are two conserved sequence motifs: LLGYP and SFS. 254
57590 373420 pfam14954 LIX1 Limb expression 1. This entry represents the limb expression 1 (LIX1) family. 242
57591 405622 pfam14955 MRP-S24 Mitochondrial ribosome subunit S24. This family of proteins corresponds to mitochondrial ribosomal subunit S24 in eukaryotes. 135
57592 405623 pfam14956 DUF4505 Domain of unknown function (DUF4505). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 166 and 225 amino acids in length. 178
57593 405624 pfam14957 BORG_CEP Cdc42 effector. The Cdc42 effector (CEP) or binder of Rho GTPases (BORG) proteins are involved in the organisation of the actin cytoskeleton. They may function as negative regulators of Rho GTPase signaling. 118
57594 405625 pfam14958 DUF4506 Domain of unknown function (DUF4506). This domain family is found in eukaryotes, and is approximately 140 amino acids in length. 140
57595 405626 pfam14959 GSAP-16 gamma-Secretase-activating protein C-term. GSAP, or gamma-secretase-activating protein, also known as PION, regulates gamma-secretase activity. The holo-protein is a large, approx 850 residue protein that is rapidly cleaved to an active 16 kDa C-terminal fragment that is the stable, predominant form. GSAP is expressed in inclusion bodies and is important in brain function. It dramatically and selectively increases neurotoxic beta-Amyloid production in the brain through a mechanism involving its interactions with both gamma-secretase and its substrate, the amyloid precursor protein C-terminal fragment (APP-CTF). Accumulation of neurotoxic beta-Amyloid is a major hallmark of Alzheimer's disease. Formation of beta-Amyloid is catalyzed by gamma-secretase, a protease with numerous substrates that catalyzes the intra-membrane cleavage of integral membrane proteins such as Notch receptors and APP (beta-amyloid precursor protein). The secondary structure of GSAP is largely alpha-helical, lacking well-defined tertiary structure. GSAP represents a type of gamma-secretase regulator that directs enzyme specificity by interacting with a specific substrate. 108
57596 373426 pfam14960 ATP_synth_reg ATP synthase regulation. Members of this family are subunits of mitochondrial ATP synthase (F-ATPase) and vacuolar ATPase (V-ATPase). In F-ATPase, this subunit regulates mitochondrial ATP synthase population. 50
57597 405627 pfam14961 BROMI Broad-minded protein. Broad-minded protein (BROMI) interacts with cell cycle-related kinase (CCRK), together these proteins regulate ciliary membrane and axonemal growth. 1290
57598 373428 pfam14962 AIF-MLS Mitochondria localization Sequence. This family contains a protein found in eukaryotes. Proteins in this family are typically between 240 and 613 amino acids in length. The family is found in association with pfam07992. This protein family is an N-terminal domain for the mitochondrial localization sequence for an apoptosis-inducing factor. The protein is also known as Corneal endothelium-specific protein 1 or as Ovary-specific acidic protein. It is thought to be important for membrane function and is expressed in the ovary and corneal endothelium. 192
57599 405628 pfam14963 CAML Calcium signal-modulating cyclophilin ligand. Calcium signal-modulating cyclophilin ligand was originally identified in a screen for cyclophilin B-interacting proteins. It is likely to be involved in calcium signalling. It has also been shown to interact with many other signalling molecules including proto-oncogene tyrosine-protein kinase LCK, tumor necrosis factor receptor superfamily member 13B and EGFR. 269
57600 405629 pfam14964 DUF4507 Domain of unknown function (DUF4507). This family of proteins is found in eukaryotes. Proteins in this family are typically between 346 and 434 amino acids in length. 359
57601 405630 pfam14965 BRI3BP Negative regulator of p53/TP53. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 213 and 245 amino acids in length. It is found in various tissues, including the brain, liver and kidneys. It was first discovered as a functional unknown gene, murine brain I3 (BRI3). This protein is also known as HCCRBP-1 and it plays a role in tumorigenesis, as it binds to an oncogene, HCCR-1, and acts as a negative regulator of p53/TP53 tumor suppressor. BRI3BP induces tumorigenesis by activating protein kinase C (PKC) activity but decreasing the pro-apoptotic PKC-alpha and PKC-delta isoform levels. BRI3BP is over-expressed in many tumors. 180
57602 405631 pfam14966 DNA_repr_REX1B DNA repair REX1-B. This family of proteins includes Chlamydomonas reinhardtii REX1-B (Required for Excision 1-B) which is involved in a light-independent DNA repair pathway. 94
57603 405632 pfam14967 FAM70 FAM70 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 241 and 349 amino acids in length. The function of this family is unknown. 325
57604 405633 pfam14968 CCDC84 Coiled coil protein 84. The function of this coiled-coil domain-containing family is not known. It is found in eukaryotes. 328
57605 405634 pfam14969 DUF4508 Domain of unknown function (DUF4508). This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 253 amino acids in length. 96
57606 405635 pfam14970 DUF4509 Domain of unknown function (DUF4509). This family of proteins is found in eukaryotes. Proteins in this family are typically between 212 and 449 amino acids in length. There is a conserved WLL sequence motif. 187
57607 405636 pfam14971 DUF4510 Domain of unknown function (DUF4510). This family of proteins is found in eukaryotes. Proteins in this family are typically between 242 and 452 amino acids in length. There are two conserved sequence motifs: LEA and WMD. 153
57608 405637 pfam14972 Mito_morph_reg Mitochondrial morphogenesis regulator. This family of proteins regulates mitochondrial morphogenesis via a mechanism which is independent of mitofusins and dynamin-related protein 1. 162
57609 405638 pfam14973 TINF2_N TERF1-interacting nuclear factor 2 N-terminus. This is the N-terminus of TERF1-interacting nuclear factor 2. It is required for the formation of the shelterin complex. The shelterin complex is involved in the protection and maintenance of telomeres. 143
57610 405639 pfam14974 P_C10 Protein C10. The function of this protein family is unknown. Mutations in protein C (C12orf57) are implicated in the pathogenesis of colobomatous microphthalmia. 103
57611 405640 pfam14975 DUF4512 Domain of unknown function (DUF4512). This family of proteins is found in eukaryotes. Proteins in this family are typically between 74 and 104 amino acids in length. There are two completely conserved residues (C and P) that may be functionally important. 103
57612 405641 pfam14976 FAM72 FAM72 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 264 amino acids in length. The function of this family is unknown. 145
57613 405642 pfam14977 FAM194 FAM194 protein. This family is found in eukaryotes, and is approximately 210 amino acids in length. There is a conserved YPSG sequence motif. The function of this family is unknown. 196
57614 405643 pfam14978 MRP-63 Mitochondrial ribosome protein 63. This family of proteins is present in the intact 55S subunit of the mitochondrial ribosome. It is not known if it belongs to the 28S or to the 39S subunit. 89
57615 405644 pfam14979 TMEM52 Transmembrane 52. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 160 and 236 amino acids in length. There is a conserved LLCG sequence motif. The function of this family is unknown. 143
57616 317403 pfam14980 TIP39 TIP39 peptide. 51
57617 317404 pfam14981 FAM165 FAM165 family. This family of proteins known as FAM165 are found in eukaryotes. Members of this family are as yet uncharacterized. Proteins in this family are typically short membrane proteins between 55 and 70 amino acids in length. 50
57618 291643 pfam14982 UPF0731 UPF0731 family. The UPF0731 family of uncharacterized proteins is found in mammals. 78
57619 291644 pfam14983 DUF4513 Domain of unknown function (DUF4513). This family of uncharacterized proteins is found in chordates. 132
57620 405645 pfam14984 CD24 CD24 protein. 52
57621 373447 pfam14985 TM140 TM140 protein family. Members of this family of uncharacterized membrane proteins are called transmembrane protein 140. They are found in mammals. 180
57622 373448 pfam14986 DUF4514 Domain of unknown function (DUF4514). This family of uncharacterized proteins is found in mammals. 60
57623 405646 pfam14987 NADHdh_A3 NADH dehydrogenase 1 alpha subcomplex subunit 3. Proteins in this family are accessory subunits of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I). This subunit is not believed to be catalytic. 78
57624 405647 pfam14988 DUF4515 Domain of unknown function (DUF4515). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 198 and 469 amino acids in length. There are two completely conserved L residues that may be functionally important. 206
57625 405648 pfam14989 CCDC32 Coiled-coil domain containing 32. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 188 amino acids in length. The gene that encodes this protein is C15orf57 but its protein product is called Protein CCDC32 (Coiled-coil domain containing 32). The exact function of this protein is still unknown. 150
57626 405649 pfam14990 DUF4516 Domain of unknown function (DUF4516). This family of proteins is found in eukaryotes. Proteins in this family are typically between 56 and 69 amino acids in length. 46
57627 405650 pfam14991 MLANA Protein melan-A. 117
57628 405651 pfam14992 TMCO5 TMCO5 family. The TMCO5 family includes human transmembrane and coiled-coil domain-containing proteins 5A and 5B. 281
57629 405652 pfam14993 Neuropeptide_S Neuropeptide S precursor protein. 65
57630 405653 pfam14994 TSGA13 Testis-specific gene 13 protein. This family of uncharacterized proteins are found in chordates. In humans this gene is found to be expressed specifically in the testes. 273
57631 405654 pfam14995 TMEM107 Transmembrane protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 138 and 164 amino acids in length. There are two completely conserved residues (H and E) that may be functionally important and four transmembrane helices. The domains in this family vary in length from 124 to 126 amino acids. The precise function of the protein family is still unknown. 123
57632 405655 pfam14996 RMP Retinal Maintenance. RMP is encoded for by a gene, C8orf37. Mutations in the gene cause two types of retinal dystrophies: cone-rod dystrophy type 16 (CORD16) and retinitis pigmentosa type 64 (RP64). CORD16 affects the cone receptors which detect red, green or blue wavelengths of light and RP64 affects the cone receptors first and then the rod receptors. Both of these affect the photo-receptors in the eye leading to colour blindness or blindness respectively. 154
57633 405656 pfam14997 CECR6_TMEM121 CECR6/TMEM121 family. This family includes Cat eye syndrome critical region protein 6, a protein which has been identified in a screen for candidate genes for the developmental disorder Cat Eye Syndrome (CES). It also includes the TMEM121 transmembrane proteins. The function of this family is unknown. 194
57634 405657 pfam14998 Ripply Transcription Regulator. The precise function of this family is not clear, but it is thought to play a role in somitogenesis, development and transcriptional repression. Ripply is also known by an alternative name, Bowline. Bowline, is an associate protein of the transcriptional co-repressor XGrg-4. This family contains two conserved sequence motifs: WRPW and FPVQATI. The WRPW motif is thought to be required for binding to tle/groucho proteins. Ripply3 is also known as Down Syndrome Critical Region Protein 6 homolog. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 154 amino acids in length. 85
57635 373461 pfam14999 Shadoo Shadow of prion protein, neuroprotective. This protein family is a Prion-like protein and its function is neuroprotective and similar to PrP(C)-like. Shadoo is mainly expressed in the brain, and highly expressed in the hippocampus, the area of the brain which co-ordinates memory as well as spatial memory and navigation. This protein may also alter the biological actions of normal and abnormal Prion Protein (PrP) which lead to lethal neurodegenerative diseases. This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length, of which the first 90 are alanine rich. 133
57636 405658 pfam15000 TUSC2 tumor suppressor candidate 2. This family of proteins are candidate tumor suppressors. 111
57637 405659 pfam15001 AP-5_subunit_s1 AP-5 complex subunit sigma-1. This family of proteins are subunits of the adaptor protein complex AP-5. 191
57638 405660 pfam15002 ERK-JNK_inhib ERK and JNK pathways, inhibitor. This coiled-coil domain protein, CCDC134, is a secretory protein that inhibits Mitogen activated protein kinase (MAPK) pathways such as Raf-1/MEK/ERK and JNK/SAPK but not p38. CCDC134 is widely expressed in normal adult tissues, tumor tissues and cell lines, which shows its importance in cell signal transduction pathways, transcription regulation and therefore cell survival. Additionally, CCDC134 is known to bind to a transcription adaptor, hADA2a, which forms part of the general control nonderepressible 5 (GCN5) histone acetyltransferase complex. Acetylation usually 'switches genes on' for transcription. Moreover, knocking out CCDC134 suppressed hADA2a-induced cell apoptosis activity and G1/S cell cycle arrest suggesting its importance in cell survival. This family of proteins is found in eukaryotes. Proteins in this family are typically between 188 and 257 amino acids in length. This family is a coiled-coil domain containing protein 134 (CCDC134) whereby the coiled-coil domain is a ubiquitous motif involved in oligomerization. 197
57639 373465 pfam15003 HAUS2 HAUS augmin-like complex subunit 2. This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 291 amino acids in length. HAUS augmin-like complex subunit 2 is alternatively called centrosomal protein of 27 kDa (CEP27). It is localized in the microtubule organising centre, the centrosome. These microtubules are part of the cytoskeleton and give the cell its shape, provide it with a platform for motility and are crucial for mitosis. This protein is part of the HAUS augmin-like complex. This interacts with the gamma-tubulin ring complex (gamma-TuRC) which is required for spindle generation. HAUS2 may also increase the tension between spindle and kinetochore allowing for chromosome segregation during mitosis. This protein is involved in mitotic spindle assembly, maintenance of centrosome integrity and completion of cytokinesis. 191
57640 405661 pfam15004 MYEOV2 Myeloma-overexpressed-like. This family of proteins is found in eukaryotes. It includes human myeloma-overexpressed gene 2 protein. Proteins in this family are typically between 45 and 74 amino acids in length. There are two conserved sequence motifs: MKP and DEMF. The function of this family is unknown. 57
57641 405662 pfam15005 IZUMO Izumo sperm-egg fusion, Ig domain-associated. This IZUMO family is a domain just upstream of the immunoglobulin domain on Izumo proteins in higher eukaryotes. The actual function of this region of the Izumo proteins is not known. The full-length protein is a molecule with a single immunoglobulin (Ig) domain. It is thought that Izumo proteins bind to putative Izumo receptors on the oocyte. Izumo is not detectable on the surface of fresh sperm but becomes exposed only after an exocytotic process, the acrosome reaction, has occurred. Studies have shown that knock-out mice (Izumo-/- males) were sterile despite normal mating behaviour and ejaculation, indicating the importance of the protein in fertilisation. There are cysteine residues thought to form a disulphide bridge. Izumo is a typical type I membrane glycoprotein with one immunoglobulin-like domain and a putative N-glycoside link motif (Asn 204). There is a conserved GCL sequence motif. Izumo expression has been found to be testis-specific. 142
57642 373468 pfam15006 DUF4517 Domain of unknown function (DUF4517). The function of this protein remains unknown. This family of proteins is found in eukaryotes and are typically between 160 and 182 amino acids in length. 152
57643 405663 pfam15007 CEP44 Centrosomal spindle body, CEP44. CEP44 is a coiled coil domain found localized in the centrosome and spindle poles. 127
57644 405664 pfam15008 DUF4518 Domain of unknown function (DUF4518). The precise function of this protein family is unknown but it is thought to be involved in apoptosis regulation. 263
57645 405665 pfam15009 TMEM173 Transmembrane protein 173. Transmembrane protein 173, also known as stimulator of interferon genes protein (STING), is a transmembrane adaptor protein which is involved in innate immune signalling processes. It induces expression of type I interferons (IFN-alpha and IFN-beta) via the NF-kappa-B and IRF3, pathways in response to non-self cytosolic RNA and dsDNA. 293
57646 405666 pfam15010 FAM131 Putative cell signalling. The precise function of this protein family is unknown, however studies have shown it undergoes Protein N-myristoylation; a type of lipid modification in eukaryotic and viral proteins. Protein N-myristoylation is usually an irreversible co-translational protein modification which is useful in cell signal transduction pathways. This indicates that FAM131 may have some sort of role in cell signalling due to its ability to be myristoylated. This family of proteins is found in eukaryotes and are typically between 257 and 361 amino acids in length. 278
57647 405667 pfam15011 CK2S Casein Kinase 2 substrate. It is suggested that CK2S (C10orf109) is important in the regulation of cancer cell proliferation. Studies have indicated that CK2S is the downstream target of a protein kinase, casein kinase 2 (CK2), which is upregulated in cancer cells. CK2S has been found to be upregulated in cancer cells. The precise mechanism of CK2 targeting CK2S is not well characterized. It is found to be localized in the nucleus and cytoplasm. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 221 amino acids in length. There is a single completely conserved residue P that may be functionally important. 158
57648 373474 pfam15012 DUF4519 Domain of unknown function (DUF4519). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 59 amino acids in length. There are two conserved sequence motifs: KET and VLP. There is a single completely conserved residue P that may be functionally important. 55
57649 405668 pfam15013 CCSMST1 CCSMST1 family. This family of proteins was discovered in a screen of Bos taurus placental ESTs. The B. taurus member of this family was named cattle cerebrum and skeletal muscle-specific transcript 1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 97 and 157 amino acids in length. There is a single completely conserved residue D that may be functionally important. The function of this family is unknown. 74
57650 405669 pfam15014 CLN5 Ceroid-lipofuscinosis neuronal protein 5. 301
57651 373477 pfam15015 NYD-SP12_N Spermatogenesis-associated, N-terminal. NYD-SP12, also known as SPATA16, is a germ-cell specific participant in the Golgi apparatus, and its expression is confined to spermatogenic epithelium, not being found in interstitial cells. Computer analysis of the protein-sequence showed that NYD-SP12 contains a cluster of phosphorylation sites for protein kinase C as well as for cyclic nucleotide-dependent protein kinases. It is postulated that since the mutation of some Golgi apparatus' proteins are responsible for male infertility that NYD-SP12 might play a role in modification and sorting of acrosomal enzymes. OMIM:102530. 564
57652 405670 pfam15016 DUF4520 Domain of unknown function (DUF4520). This family of proteins is found in eukaryotes. Proteins in this family are typically between 197 and 638 amino acids in length. This is the C-terminal domain of the member proteins. 85
57653 405671 pfam15017 WRNPLPNID Putative WW-binding domain and destruction box. This short conserved region is a putative destruction-box, with its RxxLxxI sequence motif, though the homology is not absolute. The domain occurs on a number of tumorigenic proteins, on some RNA-binding proteins and serine-threonine regulatory proteins. The second less well-conserved motif, WITPS, is a potential WW domain ligand-binding motif for recruiting proteins to their substrates. WW domains bind tightly to short proline-containing peptides that are typically in regions of native disordered polypeptide, as this family is as it lies between a PIN domain and a zinc-binding domain. 61
57654 405672 pfam15018 InaF-motif TRP-interacting helix. This highly conserved motif is thought to be a transmembrane helix that binds to transient receptor potential (TRP) calcium channel. It is known that proline-rich proteins inactivate tannins found in food compounds, and it is putatively thought that PRR24 does too. This is important since tannins often inhibit the uptake of iron. InaF is a protein required for TRP calcium channel function in Drosophila. TRP-related channels have been suggested to mediate store-operated calcium entry, important for Ca2+ homeostasis in a wide variety of cell types. The amino acid sequence of PRR-24 contains two completely conserved Y residues that may be functionally important. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. 35
57655 405673 pfam15019 C9orf72-like C9orf72-like protein family. The precise function of this family is unknown but members have been found to be localized in the cytoplasm of brain tissue. Defects in the gene, C9orf72, are the cause of frontotemporal dementia and/or amyotrophic lateral sclerosis (FTDALS) which is an autosomal dominant neurodegenerative disorder. The disorder is caused by a large expansion of a GGGGCC hexa-nucleotide within the first C9orf72 intron located between the first and the second non-coding exons. The expansion leads to the loss of transcription of one of the two transcripts encoding isoform 1 and to the formation of nuclear RNA foci. This domain family is found in eukaryotes, and is typically between 230 and 250 amino acids in length. There is a single completely conserved residue F that may be functionally important. 230
57656 405674 pfam15020 CATSPERD Cation channel sperm-associated protein subunit delta. The CATSPER (cation channel of sperm) complex is a tetrameric complex consisting of CATSPER1, CATSPER2, CATSPER3 and CATSPER4, it functions as an alkalinisation-activated calcium channel. This complex requires several auxiliary subunits, including CATSPERD. CATSPERD is essential for the cation channel function and may play a role in channel assembly or transport. 727
57657 405675 pfam15021 DUF4521 Protein of unknown function (DUF4521). This family of vertebrate proteins is functionally uncharacterized. The family includes the Chromosome 20 protein C20orf196. 198
57658 405676 pfam15022 DUF4522 Protein of unknown function (DUF4522). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. In human this protein is known as C4orf36. 117
57659 405677 pfam15023 DUF4523 Protein of unknown function (DUF4523). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. 166
57660 405678 pfam15024 Glyco_transf_18 Glycosyltransferase family 18. Enzymes belonging to glycosyltransferase family 18 (alpha-1,6-mannosylglycoprotein 6-beta-N-acetylglucosaminyltransferase) contribute to the creation of branches in complex-type N-glycans. This domain is responsible for the catalytic activity of the enzyme. 557
57661 405679 pfam15025 DUF4524 Domain of unknown function (DUF4524). This family of proteins is found in eukaryotes. Proteins in this family are typically between 197 and 638 amino acids in length. This is the N-terminal domain of the member proteins. The human gene is from C5orf34. 145
57662 373487 pfam15027 DUF4525 Domain of unknown function (DUF4525). This domain is found in eukaryotes. It is often found at the N-terminus of glycosyltransferase family 18 enzymes (pfam15024). It is also found in coiled-coil domain-containing protein 126. 137
57663 373488 pfam15028 PTCRA Pre-T-cell antigen receptor. The pre-T-cell antigen receptor (pre-TCR), expressed by immature thymocytes, has a pivotal role in early T-cell development, including TCR beta-selection, survival and proliferation of CD4(-)CD8(-) double-negative thymocytes, and subsequent alpha/beta T-cell lineage differentiation. This protein contains an immunoglobulin domain. 127
57664 405680 pfam15029 TMEM174 Transmembrane protein 174. This family of proteins is found in chordates and includes the human integral membrane protein TMEM174 protein. 235
57665 405681 pfam15030 DUF4527 Protein of unknown function (DUF4527). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. 276
57666 405682 pfam15031 DUF4528 Domain of unknown function (DUF4528). This family of proteins is found in eukaryotes. Proteins in this family are typically between 95 and 154 amino acids in length. This family includes Human C15orf61. 126
57667 405683 pfam15032 DUF4529 Protein of unknown function (DUF4529). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. The proteins contain a conserved VLPPLK sequence motif. 402
57668 405684 pfam15033 Kinocilin Kinocilin protein. This family of kinocilin proteins is found in vertebrates. In mouse it has been shown that this protein is expressed primarily in the kinocilium of sensory cells in the inner ear. 123
57669 291693 pfam15034 KRTAP7 KRTAP type 7 family. This family of keratin-associated proteins is found in vertebrates. 84
57670 405685 pfam15035 Rootletin Ciliary rootlet component, centrosome cohesion. 189
57671 405686 pfam15036 IL34 Interleukin 34. 157
57672 405687 pfam15037 IL17_R_N Interleukin-17 receptor extracellular region. This domain is found at the N-terminus (extracellular region) of interleukin-17 receptor C and Interleukin-17 receptor E. This is the presumed ligand-binding domain. Human putative interleukin-17 receptor E-like consists only of this domain. 388
57673 405688 pfam15038 Jiraiya Jiraiya. Jiraiya inhibits bone morphogenetic protein (BMP) signaling during embryogenesis. The human member of this family is TMEM221. 170
57674 405689 pfam15039 DUF4530 Domain of unknown function (DUF4530). This family of proteins is found in eukaryotes. Proteins in this family are typically around 140 amino acids in length. The human member of this family is C19orf69. 113
57675 373499 pfam15040 Humanin Humanin family. This family of proteins is found exclusively in humans. Humanin is a short anti-apoptotic peptide that interacts with Bax. 24
57676 405690 pfam15041 DUF4531 Domain of unknown function (DUF4531). This family of uncharacterized proteins is found in mammals. This family includes the human protein C19orf71. 184
57677 405691 pfam15042 LELP1 Late cornified envelope-like proline-rich protein 1. This family of uncharacterized proteins is found in mammals. 106
57678 405692 pfam15043 CNRIP1 CB1 cannabinoid receptor-interacting protein 1. This family of proteins interacts with cannabinoid receptor 1 (CNR1) and attenuates CNR1-mediated tonic inhibition of voltage-gated calcium channels. 152
57679 405693 pfam15044 CLU_N Mitochondrial function, CLU-N-term. CLU_N is the N-terminal domain of the Clueless protein, also known as TIF31-like in other organisms. The function of this domain is not known. This family is found in association with pfam13236. 79
57680 405694 pfam15045 Clathrin_bdg Clathrin-binding box of Aftiphilin, vesicle trafficking. Aftiphilin forms a stable complex with p200 and gamma-synergin. This family contains a clathrin box, with two identified clathrin-binding motifs. This family of proteins is found in eukaryotes. 80
57681 405695 pfam15046 DUF4532 Protein of unknown function (DUF4532). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. 279
57682 405696 pfam15047 DUF4533 Protein of unknown function (DUF4533). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. This family includes two human proteins: C12orf60 and C12orf69. 226
57683 405697 pfam15048 OSTbeta Organic solute transporter subunit beta protein. 122
57684 405698 pfam15049 DUF4534 Protein of unknown function (DUF4534). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. Proteins in this family are typically between 170 and 190 amino acids in length. The protein includes the human integral membrane TMEM217 protein. 163
57685 405699 pfam15050 SCIMP SCIMP protein. This family contains the SCIMP proteins, which are transmembrane adaptor proteins involved in major histocompatibility complex class II signaling. 132
57686 373510 pfam15051 FAM198 FAM198 protein. This family of proteins is found in eukaryotes. The function of this family is unknown. Murine FAM198B is downregulated by FGFR signalling. 327
57687 405700 pfam15052 TMEM169 TMEM169 protein family. This domain is thought to be structured transmembrane helices and includes the intermediary cytoplasmic domain. It is found in eukaryotes, and is approximately 130 amino acids in length. 130
57688 405701 pfam15053 Njmu-R1 Njmu-R1-like protein family. This protein family is thought to have a role in spermatogenesis. This family of proteins is found in eukaryotes. In humans, it is found in chromosome 17 open reading frame 75 (C17orf75). Proteins in this family are typically between 217 and 399 amino acids in length. 345
57689 405702 pfam15054 DUF4535 Domain of unknown function (DUF4535). This family includes the uncharacterized protein C7orf73 that is found in eukaryotes. Members are generally less than 100 residues in length. Although the precise function of the domain is still unknown, members have a predicted N-terminal signal peptide sequence which suggests they are short secreted peptides. 45
57690 405703 pfam15055 DUF4536 Domain of unknown function (DUF4536). This domain family is thought to be a transmembrane helix. It is found in eukaryotes, and is approximately 50 amino acids in length. In humans, it is located in the chromosomal position, C9orf123. The family contains the uncharacterized Sch. pombe protein TAM6 which is found in the mitochondrion. 47
57691 405704 pfam15056 NRN1 Neuritin protein family. The domain family Neuritin1 (NRN1) is a GPI-anchored protein expressed in post-mitotic-differentiating neurons in the developing nervous system. NRN1 is a glutamate and neurotrophin receptor target encoding a neuronal protein that functions extracellularly to modulate neurite outgrowth (OMIM:607409). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 158 amino acids in length. 89
57692 405705 pfam15057 DUF4537 Domain of unknown function (DUF4537). The function of this domain family is unknown. It is found in eukaryotes, and is typically between 119 and 141 amino acids in length. In humans, it is found in the chromosomal position C11orf16. 122
57693 373517 pfam15058 Speriolin_N Speriolin N-terminus. This family represents the N-terminus of the sperm centrosome protein speriolin. 200
57694 405706 pfam15059 Speriolin_C Speriolin C-terminus. This family represents the C-terminus of the sperm centrosome protein speriolin. 148
57695 405707 pfam15060 PPDFL Differentiation and proliferation regulator. Pancreatic progenitor cell differentiation and proliferation factor-like protein (PPDFL) is alternatively named Exocrine differentiation and proliferation factor-like protein. PPDFL regulates exocrine cell fate. This protein is highly expressed in exocrine progenitor cells which eventually differentiate to form exocrine pancreatic cells. 110
57696 405708 pfam15061 DUF4538 Domain of unknown function (DUF4538). This protein family is thought to be a transmembrane helix. Its function remains unknown. This family of proteins is found in eukaryotes. Proteins in this family are typically between 58 and 87 amino acids in length. 56
57697 405709 pfam15062 ARL6IP6 Haemopoietic lineage transmembrane helix. ADP-ribosylation factor-like protein 6-interacting protein 6 (ARP6) is a transmembrane helix present in the J2E erythro-leukaemic cell line, but not its myeloid variants. In tissues, ARL-6 mRNA was most abundant in brain and kidney. While ARL-6 protein was predominantly cytosolic, it is known to bind to SEC61-beta subunit of a protein conducting channel SEC61p. 86
57698 405710 pfam15063 TC1 Thyroid cancer protein 1. Thyroid cancer protein 1 (TC1) is thought to decrease apoptosis and increase cell proliferation. It is found to be expressed in thyroid papillary carcinoma. This suggests its importance in thyroid cancer. The molecular mechanism of TC1 involves up-regulating cell signalling through the ERK-1/2 signalling pathway, and it positively regulates the transition between the G1 and S phases of the cell cycle. It is thought to positively regulate the Wnt/beta-catenin signalling pathway by interacting with its repressor. In humans, it is located in the chromosomal position, C8orf4. This family of proteins is found in eukaryotes and contains a conserved NIF sequence motif. 74
57699 373523 pfam15064 CATSPERG Cation channel sperm-associated protein subunit gamma. This family represents the gamma subunit of the CATSPER, or cation channel sperm-associated protein complex. The complex appears only to be expressed in the flagellum of sperm. The complex is activated at alkaline intracellular pH, and being restricted to the flagellum is the mediating calcium channel. 971
57700 405711 pfam15065 NCU-G1 Lysosomal transcription factor, NCU-G1. NCU-G1 is a set of highly conserved nuclear proteins rich in proline with a molecular weight of approximately 44 kDa. Especially high levels are detected in human prostate, liver and kidney. NCU-G1 is a dual-function family capable of functioning as a transcription factor as well as a nuclear receptor co-activator by stimulating the transcriptional activity of peroxisome proliferator-activated receptor-alpha (PPAR-alpha). 356
57701 405712 pfam15066 CAGE1 Cancer-associated gene protein 1 family. CAGE-1 is a family of proteins overexpressed in tumor tissues compared with surrounding tissues. CAGE-1 gene showed testis-specific expression among normal tissues and displayed wide expression in a variety of cancer cell lines and cancer tissues. CAGE-1 is predominantly expressed during post-meiotic stages. It localizes to the acrosomal matrix and acrosomal granule showing it to be a component of the acrosome of mammalian spermatids and spermatozoa. 528
57702 405713 pfam15067 FAM124 FAM124 family. The exact function of this protein family remains unknown. This family of proteins is found in eukaryotes. Proteins in this family are approximately 480 amino acids in length. There is a conserved LFL sequence motif. 235
57703 405714 pfam15068 FAM101 FAM101 family. This protein family includes the actin regulators, Refilin A and B, however the exact function of this protein family remains unknown. Refilin is thought to stabilize peri-nuclear actin filament bundles, important in fibroblasts. Refilin is important as changes in localization and shape in the nucleus plays a role in cellular and developmental processes. 208
57704 405715 pfam15069 FAM163 FAM163 family. This protein family is alternatively named Neuroblastoma-derived secretory proteins. Highly expressed in neuroblastoma compared to other tissues, suggesting that it may be used as a marker for metastasis in bone marrow. 163
57705 405716 pfam15070 GOLGA2L5 Putative golgin subfamily A member 2-like protein 5. The function of the GOLGA2L5 protein family remains unknown. This family of proteins is thought to be found in the Golgi apparatus of eukaryotes. Proteins in this family are typically between and 840 amino acids in length. 523
57706 405717 pfam15071 TMEM220 Transmembrane family 220, helix. Transmembrane 220 (TMEM220) is a domain of unknown function. It is thought to be a transmembrane helix. The length of this protein is typically between 150 and 160 amino acids. In humans, it is found in the chromosomal position 17p13.1. 99
57707 405718 pfam15072 DUF4539 Domain of unknown function (DUF4539). This family of proteins is found in eukaryotes. Proteins in this family are typically between 230 and 625 amino acids in length. 85
57708 405719 pfam15073 DUF4540 Domain of unknown function (DUF4540). This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 302 amino acids in length. In humans, it is found in the chromosomal position, C7orf72. 128
57709 405720 pfam15074 DUF4541 Domain of unknown function (DUF4541). This family of proteins is found in eukaryotes. Proteins in this family are typically between 100 and 163 amino acids in length. There is a conserved KLHRDDR sequence motif. There is a single completely conserved residue Y that may be functionally important. In humans, the gene is found in the chromosomal location, C5orf49. 92
57710 405721 pfam15075 DUF4542 Domain of unknown function (DUF4542). This family of proteins is found in eukaryotes. Proteins in this family are typically between 123 and 173 amino acids in length. There is a conserved IPPYN sequence motif. The gene that encodes this protein in humans, is found in the chromosomal position, C17orf98. 132
57711 373535 pfam15076 DUF4543 Domain of unknown function (DUF4543). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 90 amino acids in length. The human member of this family is C17orf67. 74
57712 405722 pfam15077 MAJIN Membrane-anchored junction protein. Membrane-anchored junction protein (MAJIN) is a meiosis-specific telomere-associated protein involved in meiotic telomere attachment to the nucleus inner membrane, a crucial step for homologous pairing and synapsis. It is a component of the MAJIN-TERB1-TERB2 complex, which promotes telomere cap exchange by mediating attachment of telomeric DNA to the inner nuclear membrane and replacement of the protective cap of telomeric chromosomes. 241
57713 405723 pfam15078 DUF4545 Domain of unknown function (DUF4545). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 417 amino acids in length. The human member of this family is C1orf141. 465
57714 373538 pfam15079 Tsc35 Testis-specific protein 35. Tsc35 (also referred to in the literature as Tsc24) is essential for spermatogenesis in mammalian male reproduction. It is expressed in the testis from day 35 onwards in mice. 199
57715 373539 pfam15080 DUF4547 Domain of unknown function (DUF4547). This family of proteins is found in eukaryotes. Proteins in this family are typically between 144 and 206 amino acids in length. The human member of this family is C3orf43. 196
57716 373540 pfam15081 DUF4548 Domain of unknown function (DUF4548). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 178 amino acids in length. The human member of this family is C1orf105. 167
57717 405724 pfam15082 DUF4549 Domain of unknown function (DUF4549). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 1871 amino acids in length. The human member of this family is C6orf183. 142
57718 373542 pfam15083 Colipase-like Colipase-like. This is a family of colipase-like proteins. 90
57719 405725 pfam15084 DUF4550 Domain of unknown function (DUF4550). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. This domain contains an N-terminal HXE motif. 95
57720 405726 pfam15085 NPFF Neuropeptide FF. 109
57721 405727 pfam15086 UPF0542 Uncharacterized protein family UPF0542. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved LSWKL sequence motif. This family includes human protein C5orf43. 67
57722 405728 pfam15087 DUF4551 Protein of unknown function (DUF4551). This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. This family includes human protein C12orf56. 571
57723 405729 pfam15088 NADH_dh_m_C1 NADH dehydrogenase [ubiquinone] 1 subunit C1, mitochondrial. 49
57724 405730 pfam15089 DUF4552 Domain of unknown function (DUF4552). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. Proteins in this family are typically between 425 and 649 amino acids in length. 425
57725 405731 pfam15090 DUF4553 Domain of unknown function (DUF4553). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes the human protein C10orf12. 474
57726 405732 pfam15091 DUF4554 Domain of unknown function (DUF4554). This family of proteins is functionally uncharacterized. This family of proteins is found in some vertebrates. This family includes human protein C11orf80. 456
57727 405733 pfam15092 UPF0728 Uncharacterized protein family UPF0728. This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. There is a conserved GPY sequence motif. 88
57728 405734 pfam15093 DUF4555 Domain of unknown function (DUF4555). This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. This family includes the human protein C7orf31. 284
57729 405735 pfam15094 DUF4556 Domain of unknown function (DUF4556). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes human protein C1orf127. 215
57730 405736 pfam15095 IL33 Interleukin 33. 266
57731 405737 pfam15096 G6B G6B family. 220
57732 405738 pfam15097 Ig_J_chain Immunoglobulin J chain. 134
57733 405739 pfam15098 TMEM89 TMEM89 protein family. The function of this family of transmembrane proteins, TMEM89, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 159 amino acids in length. 131
57734 405740 pfam15099 PIRT Phosphoinositide-interacting protein family. The function of this family, PIRT, is not known, however it is predicted to be a multi-pass membrane protein. This family of proteins is thought to have a role in positively regulating TRPV1 channel activity via phosphatidylinositol 4,5-bisphosphate (PIP2). This family of proteins is found in eukaryotes. Proteins in this family are located in the cell membrane. Proteins in this family are approximately 140 amino acids in length. 131
57735 405741 pfam15100 TMEM187 TMEM187 protein family. The function of this family, TMEM187, is not known, however it is predicted to be a multi-pass membrane protein. Members of this family are as yet uncharacterized. This protein family is also alternatively named ITBA1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 239 and 267 amino acids in length. 244
57736 405742 pfam15101 TERB2 Telomere-associated protein TERB2. TERB2 is a meiosis-specific telomere-associated protein involved in meiotic telomere attachment to the nucleus inner membrane, a crucial step for homologous pairing and synapsis. 207
57737 405743 pfam15102 TMEM154 TMEM154 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. However, it is thought to be a therapeutic target for ovine lentivirus infection. This family of proteins is found in eukaryotes and members are typically between 138 and 320 amino acids in length. 153
57738 405744 pfam15103 G0-G1_switch_2 G0/G1 switch protein 2. This family of proteins regulates apoptosis by binding to Bcl-2 and preventing the formation of the anti-apoptotic BAX-BCL2 heterodimers. 105
57739 405745 pfam15104 DUF4558 Domain of unknown function (DUF4558). This family of proteins is found in eukaryotes. Proteins in this family are typically between 78 and 121 amino acids in length. One member is annotated as being a flagellar associated protein. 86
57740 405746 pfam15105 TMEM61 TMEM61 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 150 and 211 amino acids in length. 182
57741 405747 pfam15106 TMEM156 TMEM156 protein family. The function of this family of transmembrane proteins, TMEM 156, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 310 amino acids in length. In humans, the gene encoding this protein is located in the chromosomal position, 4p14. 226
57742 405748 pfam15107 FAM216B FAM216B protein family. The function of this family of proteins, FAM216B, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length. In humans, the gene encoding this protein is located in the position, C13orf30. 103
57743 373566 pfam15108 TMEM37 Voltage-dependent calcium channel gamma-like subunit protein family. This family of transmembrane proteins, TMEM37, has a role in stabilizing the calcium channel in an inactivated (closed) state. It is a subunit of the L-type calcium channels. This family of proteins is found in eukaryotes. Proteins in this family are approximately 210 amino acids in length. 182
57744 405749 pfam15109 TMEM125 TMEM125 protein family. The function of this family of transmembrane proteins, TMEM125, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 55 and 232 amino acids in length. 109
57745 373568 pfam15110 TMEM141 TMEM141 protein family. The function of this family of transmembrane proteins, TMEM141, has not, as yet, been determined. Members of this family remain uncharacterized. TMEM141 protein family is found in eukaryotes. Proteins in this family are typically between 103 and 124 amino acids in length. There are two completely conserved residues (C and W) that may be functionally important. 91
57746 405750 pfam15111 TMEM101 TMEM101 protein family. The function of this family of transmembrane proteins, TMEM101, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 127 and 257 amino acids in length. 249
57747 405751 pfam15112 DUF4559 Domain of unknown function (DUF4559). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein CXorf38. 311
57748 405752 pfam15113 TMEM117 TMEM117 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 181 and 504 amino acids in length. 410
57749 405753 pfam15114 UPF0640 Uncharacterized protein family UPF0640. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 70 and 80 amino acids in length. There are two conserved sequence motifs: PGK and YRFLP. 66
57750 405754 pfam15115 HDNR Domain of unknown function with conserved HDNR motif. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 219 amino acids in length. There is a conserved HDNR sequence motif. The function is not known. 174
57751 405755 pfam15116 CD52 CAMPATH-1 antigen. 41
57752 373575 pfam15117 UPF0697 Uncharacterized protein family UPF0697. This family of uncharacterized proteins is found in vertebrates. Proteins in this family are typically around 100 amino acids in length. 98
57753 373576 pfam15118 DUF4560 Domain of unknown function (DUF4560). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 66 and 78 amino acids in length. There are two conserved sequence motifs: FCK and RTL. 64
57754 405756 pfam15119 APOC4 Apolipoprotein C4. 94
57755 405757 pfam15120 SPACA9 Sperm acrosome-associated protein 9. This family of proteins found in eukaryotes represents sperm acrosome-associated protein 9 (SPACA9, previously known as C9orf9 or MAST). Sperm acrosome-associated protein 9 has been suggested to form a complex with calcium-binding proteins calreticulin and caldendrin localized to the acrosome. Despite this, no known protein interaction motifs have been identified in MAST/SPACA9. 164
57756 405758 pfam15121 TMEM71 TMEM71 protein family. The function of this family, TMEM71, is not known, however it is predicted to be a transmembrane protein. This family of proteins is found in eukaryotes and located in the cell membrane. Proteins in this family vary between 41 and 291 amino acids in length. 150
57757 405759 pfam15122 TMEM206 TMEM206 protein family. The function of this family of transmembrane proteins, TMEM206, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 350 amino acids in length. 296
57758 405760 pfam15123 DUF4562 Domain of unknown function (DUF4562). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved HRYQNPW sequence motif. This family includes the human protein C4orf45. 112
57759 373581 pfam15124 FANCD2OS FANCD2 opposite strand protein. This family of proteins of unknown function gets its name from its position in the mammalian genome: Fanconi anemia group D2 protein opposite strand transcript protein. 175
57760 405761 pfam15125 TMEM238 TMEM238 protein family. The function of this family of transmembrane proteins, TMEM238, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 61 and 153 amino acids in length. 64
57761 405762 pfam15127 SmAKAP Small membrane A-kinase anchor protein. SmAKAP is a small membrane-bound PKA-RI-specific protein kinase A-anchoring protein, referred to as small membrane-AKAP. It is probably tethered to the plasma membrane through a dual acylation of its N-terminal Met-Gly-Cys- motif (myristoylation and palmitoylation, respectively). It specifically targets PKA-RI isoforms to the plasma membrane. It localizes to plasma membranes, is enriched at cell-cell junctions and associates with filopodia. 97
57762 405763 pfam15128 T_cell_tran_alt T-cell leukemia translocation-altered. This family of proteins is required for osteoclastogenesis. 92
57763 405764 pfam15129 FAM150 FAM150 family. This family of proteins known as FAM150 is found in eukaryotes. Members of this family are as yet uncharacterized. Proteins in this family are approximately 143 amino acids in length. The function of this family has not, as yet, been determined, however it is predicted to be a secretory protein family. 124
57764 405765 pfam15130 DUF4566 Domain of unknown function (DUF4566). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein C6orf62. 226
57765 373586 pfam15131 DUF4567 Domain of unknown function (DUF4567). This family of proteins is functionally uncharacterized. This family of proteins is found in some mammals. 75
57766 405766 pfam15132 DUF4568 Domain of unknown function (DUF4568). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. 194
57767 405767 pfam15133 DUF4569 Domain of unknown function (DUF4569). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein CXorf21. 304
57768 405768 pfam15134 DUF4570 Domain of unknown function (DUF4570). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. 110
57769 405769 pfam15135 UPF0515 Uncharacterized protein UPF0515. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There are two conserved sequence motifs: PLT and HSC. 271
57770 405770 pfam15136 UPF0449 Uncharacterized protein family UPF0449. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved LPTRP sequence motif. 99
57771 405771 pfam15137 DUF4571 Domain of unknown function (DUF4571). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes human protein C21orf62. 214
57772 405772 pfam15138 Syncollin Syncollin. This family has a role in zymogen granule exocytosis. 112
57773 405773 pfam15139 DUF4572 Domain of unknown function (DUF4572). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 220 amino acids in length. 195
57774 405774 pfam15140 DUF4573 Domain of unknown function (DUF4573). This family of proteins is found in eukaryotes. Proteins in this family are typically approximately 360 amino acids in length. 176
57775 405775 pfam15141 DUF4574 Domain of unknown function (DUF4574). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 86 amino acids in length. 87
57776 405776 pfam15142 INCA1 INCA1. This family of proteins inhibits cyclin-dependent kinase activity. 178
57777 291801 pfam15143 DUF4575 Domain of unknown function (DUF4575). This family of uncharacterized proteins is found in eukaryotes. 129
57778 373598 pfam15144 DUF4576 Domain of unknown function (DUF4576). This family of uncharacterized proteins is found in eukaryotes. 88
57779 405777 pfam15145 DUF4577 Domain of unknown function (DUF4577). The function of this family of proteins has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically 128 amino acids in length. 128
57780 405778 pfam15146 FANCAA Fanconi anemia-associated. This family of proteins plays a role in the Fanconi anemia-associated DNA damage response. 437
57781 405779 pfam15147 DUF4578 Domain of unknown function (DUF4578). This family of proteins is found in eukaryotes. Proteins in this family are typically between 44 and 137 amino acids in length. 126
57782 405780 pfam15148 Apolipo_F Apolipoprotein F. 198
57783 405781 pfam15149 CATSPERB Cation channel sperm-associated protein subunit beta protein family. The function of this family of transmembrane proteins, CATSPERB, has not, as yet, been determined. However, it is thought to play a role in sperm hyperactivation by associating with CATSPER1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 220 and 1107 amino acids in length. 520
57784 317555 pfam15150 PMAIP1 Phorbol-12-myristate-13-acetate-induced. This family carries a BH3 domain between residues 23 and 40. 54
57785 405782 pfam15151 RGCC Response gene to complement 32 protein family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 44 and 130 amino acids in length. There is a conserved KLGDT sequence motif. 127
57786 405783 pfam15152 Kisspeptin Kisspeptin. 76
57787 405784 pfam15153 CYTL1 Cytokine-like protein 1. The function of this family of proteins, CYTL1, has not, as yet, been determined. However it is thought to be a secretory protein expressed in CD34+ haemopoietic cells. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 145 amino acids in length. There are two conserved sequence motifs: PPTCYSR and DDC. 127
57788 317559 pfam15155 MRFAP1 MORF4 family-associated protein1. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 127 amino acids in length. 119
57789 405785 pfam15156 CLN6 Ceroid-lipofuscinosis neuronal protein 6. This family of proteins is found in eukaryotes. Proteins in this family are typically between 190 and 310 amino acids in length. 280
57790 405786 pfam15157 IQCJ-SCHIP1 Fusion protein IQCJ-SCHIP1 with IQ-like motif. This is a family of eukaryotic fusion proteins. It bridges two adjacent genes that encode distinct proteins, IQCJ, a novel IQ motif containing protein and SCHIP1, a schwannomin interacting protein. It contains a unique calmodulin-binding IQ motif at the N-terminus not shared with its shorter isoform SCHIP1, suggesting a distinctive function for this protein. It is localized to cytoplasm and actin-rich regions, and in differentiated PC12 cells is seen in neurite extensions. The exact physiological function is unclear. 150
57791 405787 pfam15158 DUF4579 Domain of unknown function (DUF4579). This family of proteins is found in eukaryotes. Proteins in this family are typically between 192 and 239 amino acids in length. The human member of this family is C8orfK29. 184
57792 405788 pfam15159 PIG-Y Phosphatidylinositol N-acetylglucosaminyltransferase subunit Y. This family of proteins represents subunit Y of the GPI-N-acetylglucosaminyltransferase (GPI-GnT) complex. It may regulate activity of the complex by binding the catalytic subunit, PIG-A. 70
57793 373611 pfam15160 SASRP1 Spermatogenesis-associated serine-rich protein 1. Spermatogenesis-associated serine-rich protein 1 is a serine-rich protein differentially expressed during spermatogenesis. 236
57794 317565 pfam15161 Neuropep_like Neuropeptide-like. This family contains putative neuropeptides. 61
57795 405789 pfam15162 DUF4580 Domain of unknown function (DUF4580). This family of proteins is found in eukaryotes. Proteins in this family are typically between 63 and 185 amino acids in length. 162
57796 405790 pfam15163 Meiosis_expr Meiosis-expressed. This family of proteins is essential for spermiogenesis. 75
57797 373614 pfam15164 WBS28 Williams-Beuren syndrome chromosomal region 28 protein homolog. WBS28 is an integral membrane family. These proteins have been identified as being linked to Williams-Beuren syndrome, OMIM:194050. This family of proteins is found in eukaryotes, and its members are typically 266 amino acids in length. 266
57798 405791 pfam15165 REC114-like Meiotic recombination protein REC114-like. REC114-like members are necessary for meiotic DNA double-strand break formation. It functions in conjunction with Mei4. This family of proteins is found in eukaryotes. Proteins in this family are typically between 43 and 259 amino acids in length. 239
57799 291823 pfam15167 DUF4581 Domain of unknown function (DUF4581). This family of proteins is found in eukaryotes. Proteins in this family are typically 131 amino acids in length. 131
57800 405792 pfam15168 TRIQK Triple QxxK/R motif-containing protein family. TRIQK member-proteins share a characteristic triple repeat of the sequence QXXK/R, as well as a hydrophobic C-terminal region. Xenopus and mouse triqk genes are broadly expressed throughout embryogenesis, and mtriqk is also generally expressed in mouse adult tissues. TRIQK proteins are localized to the endoplasmic reticulum membrane. This family is found in eukaryotes and members are typically between and 86 amino acids in length. 79
57801 405793 pfam15169 DUF4564 Domain of unknown function (DUF4564). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes the human protein C17orf62. 184
57802 373618 pfam15170 CaM-KIIN Calcium/calmodulin-dependent protein kinase II inhibitor. CaM-KIIN is the inhibitor of Calcium/calmodulin-dependent protein kinase II (CaMKII). CaMKII plays a central part in long-term potentiation, which underlies some forms of learning and memory. CaM-KIIN is a natural, specific inhibitor of CaMKII. This family is found in eukaryotes. 79
57803 373619 pfam15171 Spexin Neuropeptide secretory protein family, NPQ, spexin. Spexin, alternatively named NPQ, is a peptide hormone and is derived from a pro-hormone. This family of proteins has a role in inducing stomach wall contraction and is expressed in the submucosal layer of the mouse oesophagus and stomach. Spexin, like most peptide hormones, is a ligand for G-protein coupled receptors. Spexin is also thought to have a role in controlling arterial blood pressure as well as salt and water balance. 90
57804 405794 pfam15172 Prolactin_RP Prolactin-releasing peptide. 45
57805 405795 pfam15173 FAM180 FAM180 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 182 amino acids in length. There are two conserved sequence motifs: ELAS and DFE. The function of this family is unknown. 137
57806 291830 pfam15174 PRNT Prion-related protein testis-specific. PRNT is a family of prion-related proteins expressed in the testis. This family of proteins is found in eukaryotes. Proteins in this family are typically between 52 and 94 amino acids in length. 51
57807 373622 pfam15175 SPATA24 Spermatogenesis-associated protein 24. This family of proteins bind to DNA and to TBP (TATA box binding protein), TATA-binding protein (TBP)-related protein 2 (TRF2) and several polycomb factors. It is likely to function as a transcription regulator. 170
57808 405796 pfam15176 LRR19-TM Leucine-rich repeat family 19 TM domain. LRR19-TM is the single-span transmembrane region of LRRC19, a leucine-rich repeat protein family. LRRC19 functions as a transmembrane receptor inducing pro-inflammatory cytokines. This suggests its role in innate immunity. This family of proteins is found in eukaryotes. 101
57809 405797 pfam15177 IL28A Interleukin-28A. The protein family, Interleukin-28A, plays an important role in modulating the immune system. This protein family is induced by viral infection and interacts with a class II receptor. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 195 amino acids in length. 156
57810 405798 pfam15179 Myc_target_1 Myc target protein 1. This family of proteins is regulated by the c-Myc oncoprotein. It regulates the expression of several other c-Myc target genes. 193
57811 405799 pfam15180 NPBW Neuropeptides B and W. The function of this family, NPBW, which includes Neuropeptides B and W, is thought to be involved in activating G-protein coupled receptors, GPR7 and GPR8. It is thought to play a regulatory role in the organisation of neuroendocrine signals accessing the anterior pituitary gland. It is predicted that this effect will stimulate the increase in water-drinking and food-intake. This suggests it plays a role in the hypothalamic response to stress. This family of proteins is found in eukaryotes. 113
57812 373627 pfam15181 SMRP1 Spermatid-specific manchette-related protein 1. This family of proteins, SMRP1, is thought to have a role in spermatogenesis and may be involved in differentiation or function of ciliated cells. This family of proteins is found in eukaryotes. Proteins in this family are typically approximately 260 amino acids in length. 262
57813 405800 pfam15182 OTOS Otospiralin. This family of proteins, Otospiralin, has a role in maintaining the neurosensory epithelium of the inner ear. This family of proteins is found in eukaryotes. Proteins in this family are approximately 90 amino acids in length. 69
57814 405801 pfam15183 MRAP Melanocortin-2 receptor accessory protein family. This family is thought to be involved in cell trafficking. It is required for MC2R expression in certain cell types, suggesting that it is involved in the processing, trafficking or function of MC2R. MRAP may be involved in the intracellular trafficking pathways in adipocyte cells. This family of proteins is found in eukaryotes. Proteins in this family are typically between 47 and 205 amino acids in length. 89
57815 291840 pfam15184 TOM6p Mitochondrial import receptor subunit TOM6 homolog. TOMM6 forms part of the pre-protein translocase complex of the outer mitochondrial membrane (TOM complex). This family of proteins is found in eukaryotes. Proteins in this family are typically between 43 and 74 amino acids in length. 74
57816 291841 pfam15185 BMF Bcl-2-modifying factor, apoptosis. BMF is thought to play a role in inducing apoptosis. It is thought to bind to Bcl-2 proteins. This family of proteins is found in eukaryotes. Proteins in this family are typically between 75 and 190 amino acids in length. There are two conserved sequence motifs: GNA and DQF. 224
57817 405802 pfam15186 TEX13 Testis-expressed sequence 13 protein family. The function of this family of proteins has not, as yet, been determined. However, members are thought to be encoded for by spermatogonially-expressed, germ-cell-specific genes. This family of proteins is found in eukaryotes. Proteins in this family are typically between 177 and 384 amino acids in length. There are two conserved sequence motifs: FIN and LAL. 148
57818 373630 pfam15187 Augurin Oesophageal cancer-related gene 4. Augurin is alternatively named oesophageal cancer-related gene 4 protein. The function of this family of transmembrane proteins, is to induce the senescence of oligodendrocyte and neural precursor cells, characterized by G1 arrest, RB1 dephosphorylation and accelerated CCND1 and CCND3 proteasomal degradation. Augurin has been found to stimulate the release of ACTH via the release of hypothalamic CRF. This family of proteins is found in eukaryotes. Proteins in this family are typically 145 amino acids in length. 115
57819 405803 pfam15188 CCDC-167 Coiled-coil domain-containing protein 167. The function of this family of coiled-coil domains, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 103 amino acids in length. 82
57820 405804 pfam15189 MEIOC Meiosis-specific coiled-coil domain-containing protein MEIOC. This family of proteins is found in eukaryotes. In humans, it is encoded for on the chromosomal position C17orf104. 162
57821 405805 pfam15190 TMEM251 Transmembrane protein 251. This family of proteins, also known as UPF0694, is found in eukaryotes. Proteins in this family are around 135 amino acids in length. In humans, it is found on the chromosomal position, C14orf109. 128
57822 405806 pfam15191 Synaptonemal_3 Synaptonemal complex central element protein 3. 85
57823 405807 pfam15192 TMEM213 TMEM213 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 154 amino acids in length. The function of this family is unknown. 79
57824 405808 pfam15193 FAM24 FAM24 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 87 and 101 amino acids in length. There are two conserved sequence motifs: FDLRT and CLY. The function of this family is unknown. 69
57825 373637 pfam15194 TMEM191C TMEM191C family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 302 amino acids in length. There are two conserved sequence motifs: QDC and RLF. The function of this family is unknown. 121
57826 405809 pfam15195 TMEM210 TMEM210 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 149 amino acids in length. The function of this family is unknown. 112
57827 259330 pfam15196 Harakiri Activator of apoptosis harakiri. 94
57828 291852 pfam15198 Dexa_ind Dexamethasone-induced. 90
57829 373639 pfam15199 DAOA D-amino acid oxidase activator. 82
57830 405810 pfam15200 KRTDAP Keratinocyte differentiation-associated. 76
57831 291855 pfam15201 Rod_cone_degen Progressive rod-cone degeneration. This family of proteins is involved in vision. 54
57832 317593 pfam15202 Adipogenin Adipogenin. This family of proteins is involved in the stimulation of adipocyte differentiation and development. 79
57833 405811 pfam15203 TMEM95 TMEM95 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 102 and 231 amino acids in length. There is a conserved LGG sequence motif. The function of this family is unknown. 151
57834 373642 pfam15204 KKLCAg1 Kita-kyushu lung cancer antigen 1. This is a family of cancer antigens. 85
57835 405812 pfam15205 PLAC9 Placenta-specific protein 9. This family of proteins was identified as being enriched in placenta. 74
57836 405813 pfam15206 FAM209 FAM209 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 170 amino acids in length. The function of this family is unknown. 148
57837 405814 pfam15207 TMEM240 TMEM240 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 54 and 175 amino acids in length. The function of this family is unknown. 174
57838 405815 pfam15208 Rab15_effector Rab15 effector. This family of proteins has a role in receptor recycling from the endocytic recycling compartment. 237
57839 405816 pfam15209 IL31 Interleukin 31. 127
57840 373648 pfam15210 SFTA2 Surfactant-associated protein 2. 59
57841 405817 pfam15211 CXCL17 VEGF co-regulated chemokine 1. 89
57842 291866 pfam15212 SPATA19 Spermatogenesis-associated protein 19, mitochondrial. 130
57843 405818 pfam15213 CDRT4 CMT1A duplicated region transcript 4 protein. 137
57844 373651 pfam15214 PXT1 Peroxisomal testis-specific protein 1. This family of proteins is testis-specific. 50
57845 291869 pfam15215 FDC-SP Follicular dendritic cell secreted peptide. 65
57846 405819 pfam15216 TSLP Thymic stromal lymphopoietin. 125
57847 405820 pfam15217 TSC21 TSC21 family. This family of proteins is testis-specific. 180
57848 373654 pfam15218 SPATA25 Spermatogenesis-associated protein 25. This family of proteins may be involved in spermatogenesis. 222
57849 317606 pfam15219 TEX12 Testis-expressed 12. 96
57850 291874 pfam15220 HILPDA Hypoxia-inducible lipid droplet-associated. This family of proteins stimulate intracellular lipid accumulation, function as autocrine growth factors and enhance cell growth. 63
57851 373655 pfam15221 LEP503 Lens epithelial cell protein LEP503. This protein may be involved in lens epithelial cell differentiation. 69
57852 259356 pfam15222 KAR Kidney androgen-regulated. The function of this family is unknown. 105
57853 405821 pfam15223 DUF4584 Domain of unknown function (DUF4584). This family of proteins is found in eukaryotes. Proteins in this family are approximately 835 amino acids in length. The family is found in association with pfam02437. 418
57854 405822 pfam15224 SCRG1 Scrapie-responsive protein 1. This protein family has an important function in acting against the prion protein, Scrapie. This family of proteins is found in eukaryotes. Proteins in this family are approximately 98 amino acids in length. 76
57855 405823 pfam15225 IL32 Interleukin 32. 100
57856 373659 pfam15226 HPIP HCF-1 beta-propeller-interacting protein family. HPIP is a small cellular polypeptide that binds to the beta-propeller domain of HCF-1. HPIP regulates HCF-1 activity by modulating its subcellular localization. HCF-1 is a cellular protein required by VP16 to activate the herpes simplex virus immediate-early genes. VP16 is a component of the viral tegument and, after release into the cell, binds to HCF-1 and translocates to the nucleus to form a complex with the POU domain protein Oct-1 and a VP16-responsive DNA sequence. HPIP-mediated export may provide the pool of cytoplasmic HCF-1 required for import of virion-derived VP16 into the nucleus. 133
57857 405824 pfam15227 zf-C3HC4_4 zinc finger of C3HC4-type, RING. This is a family of primate-specific Ret finger protein-like (RFPL) zinc-fingers of the C3HC4 type. Ret finger protein-like proteins are primate-specific target genes of Pax6, a key transcription factor for pancreas, eye and neocortex development. This domain is likely to be DNA-binding. This zinc-finger domain together with the RDM domain, pfam11002, forms a large zinc-finger structure of the RING/U-Box superfamily. RING-containing proteins are known to exert an E3 ubiquitin protein ligase activity with the zinc-finger structure being mandatory for binding to the E2 ubiquitin-conjugating enzyme. 42
57858 405825 pfam15228 DAP Death-associated protein. 95
57859 405826 pfam15229 POM121 POM121 family. 234
57860 405827 pfam15230 SRRM_C Serine/arginine repetitive matrix protein C-terminus. This domain is found near to the C-terminus of Serine/arginine repetitive matrix proteins 3 and 4. 67
57861 405828 pfam15231 VCX_VCY Variable charge X/Y family. The variable charge X/Y (VCX/VCY) family of proteins has members on the human X and Y chromosomes, is expressed in male germ cells and may play a role in spermatogenesis or in sex ratio distortion. 133
57862 405829 pfam15232 DUF4585 Domain of unknown function (DUF4585). The function of this protein domain family is yet to be characterized. It is putatively thought to lie in the C-terminal domain of the DNA nucleotide repair protein, Xeroderma pigmentosa complementation group A (XPA). The function of XPA is to bind to DNA and repair any mismatched base pairs. This domain family is often found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved DPE sequence motif. In humans, this protein is encoded for in the chromosomal position, Chromosome 5 open reading frame 65. Mutations in the gene lead to myelodysplastic syndromes, where there is inefficient stem cell production in the bone marrow. This suggests that the protein may have a role in forming blood cells. 71
57863 405830 pfam15233 SYCE1 Synaptonemal complex central element protein 1. This family of proteins includes synaptonemal complex central element protein 1, a component of the synaptonemal complex involved in meiosis, and synaptonemal complex central element protein 1-like, which may be involved in meiosis. 147
57864 405831 pfam15234 LAT Linker for activation of T-cells. 233
57865 405832 pfam15235 GRIN_C G protein-regulated inducer of neurite outgrowth C-terminus. This represents the C-terminus of the G protein-regulated inducer of neurite outgrowth proteins. 126
57866 405833 pfam15236 CCDC66 Coiled-coil domain-containing protein 66. This protein family, named Coiled-coil domain-containing protein 66 (CCDC) refers to a protein domain found in eukaryotes, and is approximately 160 amino acids in length. CCDC66 protein is detected mainly in the inner segments of photoreceptors in many vertebrates including mice and humans. It has been found in dogs, that a mutation in the CCDC66 gene causes generalized progressive retinal atrophy (gPRA). This shows that the protein encoded for by this gene is vital for healthy vision and guards against photoreceptor cell degeneration. The structure of CCDC66 proteins includes a heptad repeat pattern which contains at least one coiled-coil domain. There are at least two or more alpha-helices which form a cable-like structure. 152
57867 405834 pfam15237 PTRF_SDPR PTRF/SDPR family. This family of proteins includes muscle-related coiled-coil protein (MURC), protein kinase C delta-binding protein (PRKCDBP), polymerase I and transcript release factor (PTRF) and serum deprivation-response protein (SDPR). MURC activates the Rho/ROCK pathway. PRKCDBP appears to act as an immune potentiator. PTRF is involved in caveolae formation and function. SDPR is involved in the targeting of protein kinase Calpha to caveolae. 250
57868 405835 pfam15238 FAM181 FAM181. This family of proteins is found in eukaryotes. Proteins in this family are typically between 256 and 426 amino acids in length. 283
57869 405836 pfam15239 DUF4586 Domain of unknown function (DUF4586). This protein family, refers to a domain of unknown function. The precise role of this protein domain remains to be elucidated. This family of proteins is found in eukaryotes and are typically between 256 and 320 amino acids in length. There is a single completely conserved residue, phenylalanine (F), that may be functionally important. In humans, the protein is found in the position, chromosome 4 open reading frame 47. 294
57870 405837 pfam15240 Pro-rich Proline-rich. This family includes several eukaryotic proline-rich proteins. 167
57871 405838 pfam15241 Cylicin_N Cylicin N-terminus. This is the N-terminus of cylicin proteins, which may play a role in spermatid differentiation. 107
57872 405839 pfam15242 FAM53 Family of FAM53. The FAM53 protein family refers to a family of proteins, which bind to a transcriptional regulator that modulates cell proliferation. It is known to be highly important in neural tube development. It is found in eukaryotes and is typically between 303 and 413 amino acids in length. 304
57873 405840 pfam15243 ANAPC15 Anaphase-promoting complex subunit 15. This is a component of the anaphase promoting complex/cyclosome. 87
57874 405841 pfam15244 HSD3 Spermatogenesis-associated protein 7, or HSD3. Spermatogenesis-associated protein HSD3 also goes by the name of spermatogenesis-associated protein 7 or SPAT7. The family carries a single transmembrane domain. It functions in several tissues, and is expressed in the developing and mature mouse retina; it is expressed in multiple retinal layers in the adult mouse retina. Mutations lead to LCA disease, or Leber congenital amaurosis, which results in a number of retinal dystrophies. The disease phenotype is characterized by severe visual loss at birth, nystagmus, a variety of fundus changes, and minimal or absent recordable responses on the electroretinogram (ERG). 413
57875 405842 pfam15245 VGLL4 Transcription cofactor vestigial-like protein 4. These proteins act as transcriptional enhancer factor (TEF-1) cofactors. 210
57876 405843 pfam15246 NCKAP5 Nck-associated protein 5, Peripheral clock protein. NCKAP5 is short for Nck-associated protein 5, which is also known as the Peripheral clock protein. NCKAP5 is a protein family, which interacts with the SH3-containing region of the adaptor protein Nck. Nck is a protein that interacts with receptor tyrosine kinases and guanine nucleotide exchange factor Sos. The role of Nck can be thought of as similar to Grb2. The role of NCKAP5 is to assist Nck with its adaptor protein role. 309
57877 405844 pfam15247 SLBP_RNA_bind Histone RNA hairpin-binding protein RNA-binding domain. This family represents the RNA-binding domain of histone RNA hairpin-binding protein. 69
57878 405845 pfam15248 DUF4587 Domain of unknown function (DUF4587). This protein family is a domain of unknown function. The precise function of this protein domain remains to be elucidated. This domain family is found in eukaryotes, and is typically between 64 and 79 amino acids in length. There are two conserved sequence motifs: QNAQ and HHH. In humans, it is found in the position, chromosome 21 open reading frame 58. 74
57879 405846 pfam15249 GLTSCR1 Conserved region of unknown function on GLTSCR protein. This domain family is found in eukaryotes, and is typically between 105 and 124 amino acids in length. It is found on glioma tumor suppressor candidate region gene proteins. 102
57880 405847 pfam15250 Raftlin Raftlin. This family of proteins plays a role in the formation and/or maintenance of lipid rafts. 448
57881 405848 pfam15251 DUF4588 Domain of unknown function (DUF4588). This family of proteins is found in eukaryotes. Proteins in this family are typically between 200 and 274 amino acids in length. There is a conserved LYK sequence motif. There is a single completely conserved residue A that may be functionally important. 238
57882 405849 pfam15252 DUF4589 Domain of unknown function (DUF4589). This protein family is a domain of unknown function. The precise function of the protein domain remains to be elucidated. This family of proteins is found in eukaryotes and are typically between 215 and 293 amino acids in length. The protein contains two conserved sequence motifs: SSS and KST. 245
57883 405850 pfam15253 STIL_N SCL-interrupting locus protein N-terminus. 404
57884 405851 pfam15254 CCDC14 Coiled-coil domain-containing protein 14. This protein family, Coiled-coil domain-containing protein 14 (CCDC14) is a domain of unknown function. This family of proteins is found in eukaryotes. Proteins in this family are typically between 301 and 912 amino acids in length. 862
57885 405852 pfam15255 CAP-ZIP_m WASH complex subunit CAP-Z interacting, central region. This domain is found on WASH complex subunits FAM21 and CAP-ZIP proteins, as well as on VPEF (vaccinia virus penetration factor). This family of proteins is found in eukaryotes. Proteins in this family are typically between 305 and 1321 amino acids in length. The exact function of this region is not known. 125
57886 405853 pfam15256 SPATIAL SPATIAL. SPATIAL (stromal protein associated with thymii and lymph node) proteins may be involved in spermatid differentiation. 199
57887 405854 pfam15257 DUF4590 Domain of unknown function (DUF4590). This family of proteins remains to be characterized and is a domain of unknown function. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: CCE and PCY. In humans, the gene encoding this protein lies in the position, chromosome 1 open reading frame 173. 106
57888 405855 pfam15258 FAM222A Protein family of FAM222A. This protein family, FAM222A are a domain of unknown function. This family of proteins is found in eukaryotes and are typically between 411 and 562 amino acids in length. In humans, the gene encoding this protein domain lies in the position, chromosome 12 open reading frame 34. 528
57889 405856 pfam15259 GTSE1_N G-2 and S-phase expressed 1. This family is the N-terminus of GTSE1 proteins. GTSE-1 (G2 and S phase-expressed-1) protein is specifically expressed during S and G2 phases of the cell cycle. It is mainly localized to the microtubules and when overexpressed delays the G2 to M transition. The full protein negatively regulates p53 transactivation function, protein levels, and p53-dependent apoptosis. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. There is a conserved FDFD sequence motif. 145
57890 405857 pfam15260 FAM219A Protein family FAM219A. This protein family, FAM219A is a domain of unknown function. This protein family has been found in eukaryotes. Proteins in this family are typically between 144 and 191 amino acids in length. There are two conserved sequence motifs: QLL and LDE. 124
57891 405858 pfam15261 DUF4591 Domain of unknown function (DUF4591). This protein family is a domain of unknown function. It is found in eukaryotes, and is approximately 120 amino acids in length. In humans, the gene encoding this protein lies in the position chromosome 11 open reading frame 63. 126
57892 405859 pfam15262 DUF4592 Domain of unknown function (DUF4592). This protein family is a domain of unknown function, which lies to the N-terminus of the protein. This domain family is found in eukaryotes, and is typically between 114 and 130 amino acids in length. There are two completely conserved residues (L and A) that may be functionally important. In humans, the gene that encodes this protein lies in the position, chromosome 2 open reading frame 55. 132
57893 373696 pfam15264 TSSC4 Tumor suppressing sub-chromosomal transferable candidate 4. This family of proteins is expressed from a gene cluster where in humans the TSSC4 gene is not imprinted. This same cluster is associated with the Beckwith-Wiedemann syndrome. This domain family is found in eukaryotes, and is typically between 120 and 147 amino acids in length. There is a conserved YSL sequence motif. 126
57894 405860 pfam15265 FAM196 FAM196 family. This protein family is a domain of unknown function. This family of proteins is found in eukaryotes and are typically between 441 and 534 amino acids in length. 491
57895 405861 pfam15266 DUF4594 Domain of unknown function (DUF4594). This protein family is a domain of unknown function. The protein family is found in eukaryotes, and is typically between 170 and 183 amino acids in length. In humans, the gene encoding this protein lies in the position, chromosome 15 open reading frame 52. 174
57896 405862 pfam15268 Dapper Dapper. This is a family of signalling proteins. They act in a diverse range of signaling pathways and have a range of binding partners. They act as homo- and heterodimers. 710
57897 259402 pfam15269 zf-C2H2_7 Zinc-finger. This is a family of eukaryotic zinc-fingers. 54
57898 405863 pfam15270 ACI44 Metallo-carboxypeptidase inhibitor. ACI44, a metallo-carboxypeptidase inhibitor, is one member of a battery of selective inhibitors protecting roundworms of the genus Ascaris, common parasites of the human gastrointestinal tract, from host enzymes and the immune system. 58
57899 373701 pfam15271 BBP1_N Spindle pole body component BBP1, Mps2-binding protein. This N-terminal domain of BBP1, a spindle pole body component, interacts directly, though transiently, with the polo-box domain of Cdc5p. Full-length BBP1 localizes at the cytoplasmic side of the central plaque periphery of the spindle pole body (SPB) and plays an important role in inserting a duplication plaque into the nuclear envelope and assembling a functional inner plaque. Although not a membrane protein itself, BBP1 binds to Mps2 as well as to Spc29 and the half-bridge protein Kar1, thus providing a model for how the SPB core is tethered within the nuclear envelope and to the half-bridge. 151
57900 405864 pfam15272 BBP1_C Spindle pole body component BBP1, C-terminal. This C-terminal domain of BBP1, a spindle pole body component, carries coiled-coils that are necessary for the localization of BBP1 to the spindle pole body (SPB). Although not a membrane protein itself, BBP1 binds to Mps2 as well as to Spc29 and the half-bridge protein Kar1, thus providing a model for how the SPB core is tethered within the nuclear envelope and to the half-bridge. 183
57901 405865 pfam15273 NHS NHS-like. This family of proteins includes Nance-Horan syndrome protein (NHS). 641
57902 405866 pfam15274 MLIP Muscular LMNA-interacting protein. MLIP is a Muscle-enriched A-type Lamin-interacting Protein, an innovation of amniotes, and is expressed ubiquitously and most abundantly in heart, skeletal, and smooth muscle. MLIP interacts directly and co-localizes with lamin A and C in the nuclear envelope. MLIP also co-localizes with promyelocytic leukemia (PML) bodies within the nucleus. PML, like MLIP, is only found in amniotes, suggesting that a functional link between the nuclear envelope and PML bodies may exist through MLIP. 253
57903 405867 pfam15275 PEHE PEHE domain. This domain was first identified in drosophila MSL1 (male-specific lethal 1). In drosophila it binds to the histone acetyltransferase males-absent on the first protein (MOF) and to protein male-specific lethal-3 (MSL3). 127
57904 405868 pfam15276 PP1_bind Protein phosphatase 1 binding. This domain contains a protein phosphatase 1 (PP1) binding site. 63
57905 405869 pfam15277 Sec3-PIP2_bind Exocyst complex component SEC3 N-terminal PIP2 binding PH. This is the N-terminal domain of fungal and eukaryotic Sec3 proteins. Sec3 is a component of the exocyst complex that is involved in the docking of exocytic vesicles with fusion sites on the plasma membrane. This N-terminal domain contains a cryptic pleckstrin homology (PH) fold, and all six positively charged lysine and arginine residues in the PH domain predicted to bind the PIP2 head group are conserved. The exocyst complex is essential for many exocytic events, by tethering vesicles at the plasma membrane for fusion. In fission yeast, polarised exocytosis for growth relies on the combined action of the exocyst at cell poles and myosin-driven transport along actin cables. 81
57906 259411 pfam15278 Sec3_C_2 Sec3 exocyst complex subunit. This small Sec3 C-terminal domain family is based around the fission yeast protein, and is rather shorter than the budding yeast/vertebrate domain family Sec3_C, pfam09763. In fact it is only this coiled-coil region that they carry in common. The full length fission yeast, UniProtKB:Q10324, protein Sec3 is redundant with Exo70 for viability and for the localization of other exocyst subunits, suggesting that these components act as exocyst tethers at the plasma membrane. Sec3, Exo70 and Sec5 are transported by the myosin V Myo52 along actin cables. The exocyst holo-complex, including Sec3 and Exo70, is present on exocytic vesicles, which can reach cell poles by either myosin-driven transport or random walk. 86
57907 405870 pfam15279 SOBP Sine oculis-binding protein. SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein. 321
57908 405871 pfam15280 BORA_N Protein aurora borealis N-terminus. This family of proteins is required for the activation of the protein kinase Aurora-A. 205
57909 405872 pfam15281 Consortin_C Consortin C-terminus. Consortin is a trans-Golgi network cargo receptor involved in targeting connexins to the plasma membrane. 113
57910 405873 pfam15282 BMP2K_C BMP-2-inducible protein kinase C-terminus. This family represents the C-terminus of BMP2K and related proteins. 255
57911 405874 pfam15283 DUF4595 Domain of unknown function (DUF4595) with porin-like fold. Large family of predicted secreted proteins mostly from CFG group, but also from Burkholderia, Pseudomonas and Streptomyces. Function of these proteins is not known. A 3D structure of a representative of this family from Bacteroides uniformis was solved by JCSG and deposited to PDB as 4ghb. There is some overlap with RHS-repeat (PF05593) family despite lack of obvious repeats in the structure. 195
57912 317659 pfam15284 PAGK Phage-encoded virulence factor. PAGK represents a new family of virulence factors that is translocated into the host cytoplasm via bacterial outer membrane vesicles (OMV). Members are small proteins composed of about 70 amino acids. In Salmonella they are secreted independently of the SPI-2 type-III secretion system, T3SS. The OMV functions as a vehicle for transferring virulence determinants to the cytoplasm of the infected host cell. OMVs are released from the cell envelopes of Gram-negative bacteria and comprise a variety of outer membrane and periplasmic constituents, including proteins, phospholipids, lipopolysaccharides, and DNA. 64
57913 373713 pfam15285 BH3 Beclin-1 BH3 domain, Bcl-2-interacting. The BH3 domain is a short motif known to bind to Bcl-xLs. This interaction is important in apoptosis. 23
57914 373714 pfam15286 Bcl-2_3 Apoptosis regulator M11, B cell 2 leukaemia/lymphoma like. pfam02180. Bcl-2_3 is a small family of eukaryotic proteins associated with autophagy. The family is found in association with pfam00452. 126
57915 405875 pfam15287 KRBA1 KRBA1 family repeat. KRBA1 is a short repeating motif found in mammalian proteins. It is characterized by a highly conserved sequence of residues, SSPLxxLxxCLK. The function of the repeat, which can be present in up to seven copies, is unknown as is the function of the full length proteins. 43
57916 405876 pfam15288 zf-CCHC_6 Zinc knuckle. This Zinc knuckle is found in FAM90A mammalian proteins. 36
57917 405877 pfam15289 RFXA_RFXANK_bdg Regulatory factor X-associated C-terminal binding domain. This C-terminal domain of Regulatory factor X-associated protein binds to RFXANK, the Ankyrin-repeat regulatory factor X proteins. RFXA is part of the RFX complex. Mutants of either RFXAP or RFXANK protein fail to bind to each other. RFX5 binds only to the RFXANK-RFXAP scaffold and not to either protein alone, and neither the scaffold nor RFX5 alone can bind DNA. The binding of the RFXANK-RFXAP scaffold to RFX5 leads to a conformational change in the latter that exposes the DNA-binding domain of RFX5. The DNA-binding domain of RFX5 anchors the RFX complex to MHC class II X and S promoter boxes. 122
57918 405878 pfam15290 Syntaphilin Golgi-localized syntaxin-1-binding clamp. Syntaphilin or Syntabulin is a family of eukaryotic proteins. Syntaphilin binds to syntaxin-1 thereby inhibiting SNARE complex formation by absorbing free syntaxin-1. So it is a syntaxin-1 clamp that controls SNARE assembly. 308
57919 405879 pfam15291 Dermcidin Dermcidin, antibiotic peptide. Dermcidin is a family of peptides produced in the sweat to protect against pathogenic Gram-positive bacteria. 84
57920 405880 pfam15292 Treslin_N Treslin N-terminus. This family represents the N-terminus of treslin, a checkpoint regulator which plays a role in DNA replication preinitiation complex formation. 793
57921 405881 pfam15293 NUFIP2 Nuclear fragile X mental retardation-interacting protein 2. 596
57922 405882 pfam15294 Leu_zip Leucine zipper. This family includes Leucine zipper transcription factor-like protein 1 (LZTFL1) and Leucine zipper protein 2 (LUZP2). 276
57923 405883 pfam15295 CCDC50_N Coiled-coil domain-containing protein 50 N-terminus. 126
57924 405884 pfam15296 Codanin-1_C Codanin-1 C-terminus. This domain is found near to the C-terminus of codanin-1. 119
57925 405885 pfam15297 CKAP2_C Cytoskeleton-associated protein 2 C-terminus. This family includes the C-terminus of CKAP2 and CKAP2L. CKAP2 is a microtubule associated protein which stabilizes microtubules. 346
57926 405886 pfam15298 AJAP1_PANP_C AJAP1/PANP C-terminus. This family includes the C-terminus of adherens junction-associated protein 1 (AJAP1) and of PILR-associating neural protein (PANP). AJAP1 inhibits cell adhesion and migration. PANP is a ligand for the immune inhibitory receptor paired immunoglobulin-like type 2 receptor alpha. 204
57927 405887 pfam15299 ALS2CR8 Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8. This domain is found in amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8 protein. 216
57928 405888 pfam15300 INT_SG_DDX_CT_C INTS6/SAGE1/DDX26B/CT45 C-terminus. This domain is found at the C-terminus of integrator complex subunit 6 (INTS6), sarcoma antigen 1 (SAGE1), protein DDX26B (DDX26B) and members of the cancer/testis antigen family 45. 62
57929 405889 pfam15301 SLAIN SLAIN motif-containing family. The SLAIN motif containing family is named after the presence of a SLAIN motif in SLAIN1. They are a family of microtubule plus-end tracking proteins. 435
57930 405890 pfam15302 P33MONOX P33 mono-oxygenase. This family of proteins contains a flavine-containing mono-oxygenase motif. It may have a role in the regulation of neuronal survival, differentiation and axonal outgrowth. 291
57931 405891 pfam15303 RNF111_N E3 ubiquitin-protein ligase Arkadia N-terminus. This domain is found at the N-terminus of E3 ubiquitin-protein ligase Arkadia. 273
57932 405892 pfam15304 AKAP2_C A-kinase anchor protein 2 C-terminus. This family includes the C-terminus of A-kinase anchor protein 2 (AKAP2). It includes the site where the regulatory subunits (RII) of protein kinase AII binds. 346
57933 405893 pfam15305 IFT43 Intraflagellar transport protein 43. Intraflagellar transport protein 43 (IFT43) is a subunit of the IFT complex A (IFT-A) machinery of primary cilia. 136
57934 405894 pfam15306 LIN37 LIN37. LIN37 is a component of the DREAM (or LINC) complex which represses cell cycle-dependent genes in quiescent cells and plays a role in the cell cycle-dependent activation of G2/M genes. 156
57935 405895 pfam15307 SPACA7 Sperm acrosome-associated protein 7. SPACA7 is a family of eukaryotic proteins expressed in the testes. Proteins in this family are typically between 104 and 195 amino acids in length. There is a conserved DEIL sequence motif. The function is not known. 108
57936 405896 pfam15308 CEP170_C CEP170 C-terminus. This family includes the C-terminus of centrosomal protein of 170 kDa (CEP170). 667
57937 405897 pfam15309 ALMS_motif ALMS motif. This domain is found at the C-terminus of Alstrom syndrome protein 1 (ALMS1), KIAA1731 and C10orf90. 131
57938 405898 pfam15310 VAD1-2 Vitamin A-deficiency (VAD) rat model signalling. VAD1-2 is a family of proteins found in eukaryotes. The family is expressed in testes and is involved in signalling during spermatogenesis. 249
57939 405899 pfam15311 HYLS1_C Hydrolethalus syndrome protein 1 C-terminus. 89
57940 405900 pfam15312 JSRP Junctional sarcoplasmic reticulum protein. JSRP, junctional sarcoplasmic reticulum protein 1, or junctional-face membrane protein of 45 kDa homolog, is a family of eukaryotic proteins. The family is localized to the junctional face membrane of the skeletal muscle sarcoplasmic reticulum (SR); it colocalizes with its Ca2+-release channel (the ryanodine receptor), and interacts with calsequestrin and the skeletal-muscle dihydro-pyridine receptor Cav1. It is key for the functional expression of voltage-dependent Ca2+ channels. 63
57941 405901 pfam15313 HEXIM Hexamethylene bis-acetamide-inducible protein. HEXIM is a transcriptional regulator that functions as a general RNA polymerase II transcription inhibitor. In cooperation with 7SK snRNA it sequesters P-TEFb in a large inactive 7SK snRNP complex preventing RNA polymerase II phosphorylation and subsequent transcriptional elongation. HEXIM may also regulate NF-kappa-B, ESR1, NR3C1 and CIITA-dependent transcriptional activity. 135
57942 405902 pfam15314 PRAP Proline-rich acidic protein 1, pregnancy-specific uterine. PRAP, or proline-rich acidic protein 1, is a family of eukaryotic proteins. PRAP is abundantly expressed in the epithelial cells of the human liver, kidney, gastrointestinal tract, and cervix. It is significantly down-regulated in hepatocellular carcinoma and right colon adenocarcinoma compared with the respective adjacent normal tissues. In rodents it is expressed in the epithelial cells of the mouse and rat gastrointestinal tracts, and in the pregnant mouse uterus. PRAP plays an important role in maintaining normal growth suppression. 45
57943 405903 pfam15315 FRG2 Facioscapulohumeral muscular dystrophy candidate 2. This family of proteins is found in eukaryotes. The family is localized close to the D4Z4 repeats on chromosome 4 and 10 that are associated with the autosomal dominant facioscapulohumeral muscular dystrophy (FSHD). FRG2 are transcriptionally upregulated in FSHD myoblast cultures suggesting involvement in the pathogenesis of FSHD. 182
57944 405904 pfam15316 MDFI MyoD family inhibitor. Members of this family inhibit the transactivation activity of the MyoD family of myogenic factors. They affect axin-mediated regulation of the Wnt and JNK signaling pathways, and regulate expression from viral promoters. 168
57945 405905 pfam15317 Lbh Cardiac transcription factor regulator, Developmental protein. The family of proteins are cardiac transcription regulators, named Lbh, short for Limb, bud and heart. They regulate embryological development in the heart. More specifically, in humans, they may act as transcriptional activators in MAPK signaling pathway to mediate cellular functions. This family of proteins is found in eukaryotes. Proteins in this family are typically between 92 and 116 amino acids in length. 88
57946 373745 pfam15318 Bclt Putative Bcl-2 like protein of testis. This family of proteins is found in eukaryotes. The family may represent a set of Bcl-2-like proteins involved in apoptosis, see UniProt:Q9BQM9. 175
57947 405906 pfam15319 RHINO RAD9, RAD1, HUS1-interacting nuclear orphan protein. RHINO, or RAD9, RAD1, HUS1-interacting nuclear orphan, is a family of eukaryotic proteins. Under genotoxic stresses such as ionizing radiation during the S phase, RHINO plays a role in DNA damage response signalling. It is recruited to sites of DNA damage through interaction with the 9-1-1 cell-cycle checkpoint response complex and TOPBP1 in an ATR-dependent (ataxia telangiectasia and Rad3-related) manner. It is required for the progression of the G1 to S phase transition of breast cancer cells, and it is known to play a role in the stimulation of CHEK1 phosphorylation. It interacts with RAD9A, RAD18, TOPBP1 and UBE2N. 244
57948 405907 pfam15320 RAM mRNA cap methylation, RNMT-activating mini protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 102 and 154 amino acids in length. There is a single completely conserved residue D that may be functionally important. RAM is a family of eukaryotic proteins that are an obligate component of the mammalian cap methyltransferase, RNMT (RNA guanine-7 methyltransferase). RAM consists of an N-terminal RNMT-activating domain and a C-terminal RNA-binding domain. Either RAM or RNMT independently have rather weak binding affinity for RNA, but together their RNA affinity is significantly increased. RAM is necessary for efficient cap methylation, maintaining mRNA expression levels, for mRNA translation and for cell viability. 83
57949 405908 pfam15321 ATAD4 ATPase family AAA domain containing 4. ATAD4 is a family of proteins found in eukaryotes. The family is also known as PRR15L, or proline-rich 15-like. ATAD4 is expressed almost exclusively in post-mitotic cells both during foetal development and in adult tissues, such as the intestinal epithelium and the testis. Its expression in mouse and human gastrointestinal tumors is linked, directly or indirectly, to the disruption of the Wnt signaling pathway. 98
57950 405909 pfam15322 PMSI1 Protein missing in infertile sperm 1, putative. This family of proteins is found in eukaryotes. Proteins in this family are typically between 249 and 341 amino acids in length. 309
57951 405910 pfam15323 Ashwin Developmental protein. This family of proteins is found in eukaryotes. These proteins have an important role in developmental biology, particularly embryogenesis. They play an important role in cell survival and axial patterning. They are also thought to be a crucial subunit in the tRNA splicing ligase complex. Proteins in this family are typically between 141 and 232 amino acids in length. There are two conserved sequence motifs: HPE and PQR. 220
57952 405911 pfam15324 TALPID3 Hedgehog signalling target. TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene noticed first in chickens lead to multiple abnormalities of development. 1253
57953 405912 pfam15325 MRI Modulator of retrovirus infection. MRI, or modulator of retrovirus infection, is a family of eukaryotic proteins that regulate the activity of the proteasome in the uncoating of retroviruses. 104
57954 405913 pfam15326 TEX15 Testis expressed sequence 15. TEX15 is a family of eukaryotic proteins that is required for chromosomal synapsis and meiotic recombination. TEX15 regulates the loading of DNA repair proteins onto sites of double-stranded-breaks and, thus, its absence causes a failure in meiotic recombination. Two polymorphisms in the TEX15 gene could be considered the genetic risk factors for spermatogenic failure in the Chinese Han population. 234
57955 405914 pfam15327 Tankyrase_bdg_C Tankyrase binding protein C terminal domain. This protein domain family is found at the C-terminal end of the Tankyrase binding protein in eukaryotes. The precise function of this protein is still unknown. However, it is known to interact with the enzyme tankyrase, a telomeric poly(ADP-ribose) polymerase, by binding to it. Tankyrase catalyzes poly(ADP-ribose) chain formation onto proteins. More specifically, the protein binds to the ankyrin domain in tankyrase. The protein domain is approximately 170 amino acids in length and contains two conserved sequence motifs: FPG and LKA. 166
57956 405915 pfam15328 GCOM2 Putative GRINL1B complex locus protein 2. This protein family is named Putative GRINL1B complex locus protein 2. GRINL1B is short for: glutamate receptor, ionotropic, N-methyl D-aspartate-like 1B. The name indicates what sort of receptor it is thought to be, a ligand gated ion channel specific to the neurotransmitter Glutamate. This family of proteins is found in eukaryotes. Proteins in this family are typically between 325 and 463 amino acids in length. The protein is thought to be the product of a pseudogene with a role in helping assemble a gene transcription unit. 216
57957 405916 pfam15330 SIT SHP2-interacting transmembrane adaptor protein, SIT. SIT, or SHP2-interacting transmembrane adaptor protein, is a disulfide-linked dimer that regulates human T Cell activation. 114
57958 405917 pfam15331 TP53IP5 Cellular tumor antigen p53-inducible 5. TP53IP5 suppresses cell growth, and its intracellular location and expression change in a cell-cycle-dependent manner. 218
57959 405918 pfam15332 LIME1 Lck-interacting transmembrane adapter 1. LIME1 is a family of eukaryotic transmembrane adaptors. It plays an important role in linking BCR stimulation to B-cell activation and is expressed in primary B cells. LIME localizes to lipid rafts in T cells in response to TCR stimulation, and is phosphorylated by Lck and recruits signalling molecules such as Lck, PI3K, Grb2, Gads, and SHP-2. LIME acts as the transmembrane adaptor linking BCR-induced membrane-proximal signalling to B-cell activation. 224
57960 405919 pfam15333 TAF1D TATA box-binding protein-associated factor 1D. TAF1D is a family of eukaryotic proteins that are members of the SL1 complex. The SL1 complex includes TBP and TAF1A, TAF1B and TAF1C, and plays a role in RNA polymerase I transcription. Alternative names have included 'JOSD3, Josephin domain containing 3'. 222
57961 405920 pfam15334 AIB Aurora kinase A and ninein interacting protein. AIB is a family of eukaryotic proteins necessary for the adequate functioning of Aurora-A, a protein involved in chromosome alignment, centrosome maturation, mitotic spindle assembly and aspects of tumorigenesis. AIB is likely to act as a regulator of Aurora-A activity. 326
57962 405921 pfam15335 CAAP1 Caspase activity and apoptosis inhibitor 1. CAAP1, or caspase activity and apoptosis inhibitor 1, is a family of eukaryotic proteins involved in the regulation of apoptosis. It modulates a caspase-10 dependent mitochondrial caspase-3/9 feedback amplification loop. 62
57963 405922 pfam15336 Auts2 Autism susceptibility gene 2 protein. Auts2, or FBRSL2, Fibrosin-1-like protein 2, is a family of eukaryotic proteins associated both with a susceptibility to autism and with influencing the number of corpora lutea produced by breeding sows. 217
57964 405923 pfam15337 Vasculin Vascular protein family Vasculin-like 1. GC-rich promoter-binding protein 1-like 1 or Vasculin-like protein family 1, is likely to be a transcription factor. The domain family is found in eukaryotes, and is approximately 90 amino acids in length. 94
57965 373764 pfam15338 TPIP1 p53-regulated apoptosis-inducing protein 1. TPIP1 is a family of eukaryotic proteins whose expression is induced by wild-type p53. Ectopically expressed TPIP1, which is localized within mitochondria, leads to apoptotic cell death through dissipation of the mitochondrial membrane potential (ΔΨm). Phosphorylation of p53 Ser-46 regulates the transcriptional activation of TPIP1, thereby mediating p53-dependent apoptosis. 123
57966 405924 pfam15339 Afaf Acrosome formation-associated factor. Afaf is a family of single pass type I membrane proteins. Afaf is a vesicle factor derived from the early endosome trafficking pathway that is involved in the biogenesis of the acrosome on the maturing spermatozoon head. 198
57967 405925 pfam15340 COPR5 Cooperator of PRMT5 family. COPR5 is a family of histone H4-binding proteins expressed in the nucleus. It interacts with the N-terminus of histone H4, thereby mediating the association between histone H4 and PRMT5, the Janus kinase-binding protein 1 that catalyzes the formation of symmetric dimethyl-arginine residues in proteins. COPR5 is specifically required for histone H4 'Arg-3' methylation mediated by PRMT5, but not histone H3 'Arg-8' methylation, suggesting that it modulates the substrate specificity of PRMT5. This family of proteins is found in eukaryotes. 151
57968 405926 pfam15341 SLX9 Ribosome biogenesis protein SLX9. SLX9 is present in pre-ribosomes from an early stage and is implicated in the processing events that remove the ITS1 spacer sequences. In eukaryotes, biogenesis of ribosomes starts in the nucleolus with transcription by RNA polymerase I of a large precursor RNA molecule, called 35S pre-rRNA in yeast, in which the 18S, 5.8S, and 25S mature rRNAs reside, while RNA polymerase III transcribes a 3'-extended pre-5S rRNA. The 35S precursor also contains external transcribed spacer elements (5' and 3'-ETS) at either end as well as internal transcribed spacers (ITS1 and ITS2) that separate the mature sequences. 118
57969 405927 pfam15342 FAM212 FAM212 family. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. 60
57970 405928 pfam15343 DEPP Decidual protein induced by progesterone family. DEPP is a family of proteins expressed in various tissues, including pancreas, placenta, ovary, testis and kidney. High levels are found during the first trimester. Its expression is induced by progesterone, testosterone and, to a much lower extent, oestrogen. The family is alternatively known as fasting-induced gene protein, FIG. 185
57971 405929 pfam15344 FAM217 FAM217 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 329 and 507 amino acids in length. There is a conserved YPDFLP sequence motif. 230
57972 405930 pfam15345 TMEM51 Transmembrane protein 51. This family of proteins is found in eukaryotes. Proteins in this family are typically between 233 and 253 amino acids in length. 237
57973 405931 pfam15346 ARGLU Arginine and glutamate-rich 1. ARGLU, arginine and glutamate-rich 1 protein family, is required for the oestrogen-dependent expression of ESR1 target genes. It functions in cooperation with MED1. The family of proteins is found in eukaryotes. 151
57974 405932 pfam15347 PAG Phosphoprotein associated with glycosphingolipid-enriched. PAG, or Cbp/PAG (Csk binding protein/phospho-protein associated with glycosphingolipid-enriched microdomains) is a transmembrane family that has a negative regulatory role in T-cell activation through being an adapter for C-terminal Src kinase, Csk. This family of proteins is found in eukaryotes. 429
57975 405933 pfam15348 GEMIN8 Gemini of Cajal bodies-associated protein 8. GEMIN8 proteins are found in the nuclear bodies called gems (Gemini of Cajal bodies) that are often in proximity to Cajal (coiled) bodies themselves. They are also found in the cytoplasm. The family is part of the SMN (survival motor neurone) complex that plays an essential role in spliceosomal snRNP assembly in the cytoplasm and is required for pre-mRNA splicing in the nucleus. GEMIN8 binds directly to SMN1 and mediates the interaction of the GEMIN6-GEMIN7 heterodimer. 231
57976 405934 pfam15349 DCA16 DDB1- and CUL4-associated factor 16. DCA16 is a family of eukaryotic proteins that interacts with DDB1 and CUL4A. The family may function as a substrate receptor for the CUL4-DDB1 E3 ubiquitin-protein ligase complex. 167
57977 405935 pfam15350 ETAA1 Ewing's tumor-associated antigen 1 homolog. This family of proteins is found in eukaryotes, where members are expressed at high levels in the brain, liver, kidney and Ewing tumor cell lines. Proteins in this family are typically between 648 and 898 amino acids in length. 819
57978 405936 pfam15351 JCAD Junctional protein associated with coronary artery disease. JCAD is a component of VE-cadherin-based cell-cell junctions in endothelial cells. The cell-cell or adherens junction is an adhesion complex that plays a crucial role in the organisation and function of epithelial and endothelial cellular sheets. These junctions join the actin cytoskeleton to the plasma membrane to form adhesive contacts between cells or between cells and extracellular matrix. The junctions also mediate both cell adhesion and cell-signalling. JCAD localizes close to the apical membrane in epithelial cells. This family is found in eukaryotes. 1358
57979 405937 pfam15352 K1377 Susceptibility to monomelic amyotrophy. This family of proteins is associated with a susceptibility to monomelic amyotrophy. 981
57980 405938 pfam15353 HECA Headcase protein family homolog. HECA was characterized first in Drosophila where it regulates the proliferation and differentiation of cells during adult morphogenesis. In humans, HECA affects cell cycle progression and proliferation in head and neck cancer cells. It slows down cell division of oral squamous cell carcinoma cells and may thereby act as a tumor-suppressor in head and neck cancers. 101
57981 259486 pfam15354 KAAG1 Kidney-associated antigen 1. KAAG1, kidney-associated antigen 1, or RU2AS (RU2 antisense gene protein) has been found in mammals. It is expressed in testis and kidney, and, at lower levels, in urinary bladder and liver. It is expressed by a high proportion of tumors of various histologic origin, including melanomas, sarcomas and colorectal carcinomas. 84
57982 405939 pfam15355 Chisel Stretch-responsive small skeletal muscle X protein, Chisel. The murine X-linked gene Chisel (Csl/Smpx) is selectively expressed in cardiac and skeletal muscle cells. It localizes to the costameric cytoskeleton of muscle cells through its association with focal adhesion proteins, where it may participate in regulating the dynamics of actin through the Rac1/p38 kinase pathway. Thus it is implicated in the maintenance of muscle integrity and in responses to biomechanical stress. 86
57983 292000 pfam15356 SPR1 Psoriasis susceptibility locus 2. SPR1 is psoriasis susceptibility locus 2 protein family. 114
57984 373780 pfam15357 SEEK1 Psoriasis susceptibility 1 candidate 1. This family is considered a candidate for susceptibility to psoriasis. 149
57985 405940 pfam15358 TSKS Testis-specific serine kinase substrate. TSKS, testis-specific serine kinase substrate, is expressed in the testis and is downregulated in cancerous testicular tissue, in comparison with adjacent normal tissue. TSKS expression is very low to undetectable in seminoma, teratocarcinoma, embryonal, and Leydig cell tumors, while high in testicular tissue adjacent to tumors which contain pre-malignant carcinoma in situ. Recently it has been shown in human testis to be localized to the equatorial segment of ejaculated human sperm. The finding of a TSKS family member in mature sperm suggests that this family of kinases might play a role in sperm function. TSKS is localized during spermiogenesis to the centrioles of post-meiotic spermatids, where it reaches its greatest concentration during the period of flagellogenesis. 556
57986 405941 pfam15359 CDV3 Carnitine deficiency-associated protein 3. This family of proteins is found in eukaryotes. Proteins in this family are typically between 128 and 251 amino acids in length. CDV3 is also known as TPP36 - tyrosine-phosphorylated protein 36. The function is not known. 125
57987 405942 pfam15360 Apelin APJ endogenous ligand. Apelin is among the most potent stimulators of cardiac contractility known. The apelin-APJ signaling pathway is an important novel mediator of cardiovascular control. Apelin is an adipokine secreted by adipocytes, where it is co-expressed with the apelin receptor (APJ). It suppresses adipogenesis through MAPK kinase/ERK dependent pathways and prevents lipid droplet fragmentation, thereby inhibiting basal lipolysis through AMP kinase dependent enhancement of perilipin expression. It also inhibits hormone-stimulated acute lipolysis through decreasing perilipin phosphorylation. Apelin induces a decrease of free fatty acid release via its dual inhibition on adipogenesis and lipolysis. As a vaso-active and vascular cell growth-regulating peptide Apelin is a target of the BMP pathway, the TGF-beta/bone morphogenic protein (BMP) system - a major pathway for angiogenesis. 55
57988 405943 pfam15361 RIC3 Resistance to inhibitors of cholinesterase homolog 3. RIC3 is a protein associated with nicotinic acetylcholine receptors (nAChRs), neurotransmitter-gated ion channels expressed at the neuromuscular junction and within the central and peripheral nervous systems. It can enhance functional expression of multiple nAChR subtypes. RIC3 promotes functional expression of homomeric alpha-7 and alpha-8 nicotinic acetylcholine receptors at the cell surface. 146
57989 405944 pfam15362 Enamelin Enamelin. ENAMELIN is involved in the mineralisation and structural organisation of enamel. It is necessary for the extension of enamel during the secretory stage of dental enamel formation. The proteins are expressed in teeth, particularly in odontoblasts, ameloblasts and cementoblasts. 907
57990 405945 pfam15363 DUF4596 Domain of unknown function (DUF4596). This domain family is found in eukaryotes, and is approximately 50 amino acids in length. There is a conserved ELET sequence motif. There are two completely conserved residues (S and E) that may be functionally important. 46
57991 405946 pfam15364 PAXIP1_C PAXIP1-associated-protein-1 C term PTIP binding protein. This protein domain family is the C-terminal domain of PAXIP1-associated-protein-1, which also goes by the name PTIP-associated protein 1. This family of proteins is found in eukaryotes. The function of this protein is to localize at the site of DNA damage and form foci with PTIP at the DNA break point. Furthermore, studies have shown that depletion of PA1 increases cellular sensitivity to ionizing radiation. Proteins in this family are typically between 122 and 254 amino acids in length. 132
57992 405947 pfam15365 PNRC Proline-rich nuclear receptor coactivator motif. The PNRC family, proline-rich nuclear receptor coactivator, is found in eukaryotes. Studies in S. pombe show that the proteins carrying this motif are mRNA decapping proteins. In addition, this motif is found in two intrinsically disordered Saccharomyces cerevisiae decapping enhancers, Edc1 and Edc2, which show limited sequence conservation with human PNRC2. This motif in the N-terminal domain serves two purposes: it enhances the activity of the catalytic domain by recognizing part of the mRNA cap structure (i.e. activation motif), and secondly, it directly interacts with the decapping activator Dcp1. Mutation in the (YAG) sequence led to loss of the ability to activate the decapping complex. Hence the activity of the family members involved in mRNA processing mechanisms depends on the YAG activation motif that is 11-13 residues N-terminal of a conserved LPxP Dcp1 interaction motif. 19
57993 373789 pfam15366 DUF4597 Domain of unknown function (DUF4597). This family of proteins is found in eukaryotes. Proteins in this family are typically between 63 and 76 amino acids in length. There is a conserved TPPTPT sequence motif. 62
57994 405948 pfam15367 CABS1 Calcium-binding and spermatid-specific protein 1. CABS1 is a family of proteins found in eukaryotes. It is also known as NYD-SP26. It binds calcium and is specifically expressed in the elongate spermatids and then localized into the principal piece of flagella of matured spermatozoa. 397
57995 405949 pfam15368 BioT2 Spermatogenesis family BioT2. BioT2 is a family of eukaryotic proteins expressed only in the testes. BioT2 is found abundantly in five types of murine cancer cell lines, suggesting it plays a role in testes development as well as tumorigenesis. 168
57996 405950 pfam15369 KIAA1328 Uncharacterized protein KIAA1328. The function of this protein family remains uncharacterized. This family of proteins is found in eukaryotes. 325
57997 405951 pfam15370 DUF4598 Domain of unknown function (DUF4598). This family of proteins is found in eukaryotes. Proteins in this family are typically between 159 and 251 amino acids in length. 111
57998 405952 pfam15371 DUF4599 Domain of unknown function (DUF4599). The function of this family of eukaryotic proteins is not known. 88
57999 405953 pfam15372 DUF4600 Domain of unknown function (DUF4600). 128
58000 405954 pfam15373 DUF4601 Domain of unknown function (DUF4601). This protein family is a domain of unknown function, which is found in eukaryotes. In humans, the gene encoding this protein is found in the position, chromosome 19 open reading frame 45. 437
58001 405955 pfam15374 CCDC71L Coiled-coil domain-containing protein 71L. The protein family, Coiled-coil domain-containing protein 71L, is a domain of unknown function, which is found in eukaryotes. 393
58002 405956 pfam15375 DUF4602 Domain of unknown function (DUF4602). This family of proteins is found in eukaryotes. Proteins in this family are typically between 173 and 294 amino acids in length. This family includes Human C1orf131. 132
58003 405957 pfam15376 DUF4603 Domain of unknown function (DUF4603). This protein family is a domain of unknown function. In particular, this domain lies at the C-terminal end of a protein found in eukaryotes. 1293
58004 405958 pfam15377 DUF4604 Domain of unknown function (DUF4604). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 141 and 174 amino acids in length and contain a conserved LSF sequence motif. 170
58005 405959 pfam15378 DUF4605 Domain of unknown function (DUF4605). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 82 and 137 amino acids in length. 59
58006 405960 pfam15379 DUF4606 Domain of unknown function (DUF4606). This domain family is found in eukaryotes, and is approximately 100 amino acids in length. 103
58007 373803 pfam15380 DUF4607 Domain of unknown function (DUF4607). This family of proteins is found in eukaryotes. Proteins in this family are typically between 207 and 359 amino acids in length. 264
58008 405961 pfam15382 DUF4609 Domain of unknown function (DUF4609). This family of proteins is found in eukaryotes. Proteins in this family are typically between 70 and 139 amino acids in length. 68
58009 405962 pfam15383 TMEM237 Transmembrane protein 237. This protein family is found in eukaryotes. The function of this protein is to aid the production of new cilia in ciliogenesis. Mutations in the protein cause a disease, named Joubert syndrome type 14 (JBTS14) and also affect cell signalling using the Wnt pathway. Proteins in this family are typically between 203 and 512 amino acids in length. There are two completely conserved G residues that may be functionally important. 248
58010 405963 pfam15384 PAXX PAXX, PAralog of XRCC4 and XLF, also called C9orf142. PAXX is a set of eukaryotic proteins that belong to the XRCC4 superfamily of DNA-double-strand break-repair proteins. PAXX interacts directly with DSB-repair protein Ku and is recruited to DNA-damage sites in cells thus functioning with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents. 195
58011 405964 pfam15385 SARG Specifically androgen-regulated gene protein. This family of proteins is found in eukaryotes. The function of this protein is still unknown but it is thought to be an androgen receptor. Protein expression is up-regulated in the presence of androgens, but not in the presence of glucocorticoids. SARG tends to be highly expressed in prostate tissue. Proteins in this family are typically between 340 and 587 amino acids in length. There is a conserved EETI sequence motif. 567
58012 405965 pfam15386 Tantalus Drosophila Tantalus-like. An alpha+beta fold domain found in metazoan proteins such as Drosophila Tantalus. Drosophila Tantalus binds the chromatin protein Additional sex combs (Asx) and also binds DNA in vitro. 53
58013 405966 pfam15387 DUF4611 Domain of unknown function (DUF4611). This family of proteins is found in eukaryotes. Proteins in this family are typically between 71 and 100 amino acids in length. There is a conserved AKR sequence motif. 96
58014 405967 pfam15388 FAM117 Protein Family FAM117. This protein family is a domain of unknown function found in eukaryotes. Proteins in this family are typically between 269 and 453 amino acids in length. There are two conserved sequence motifs: RRT and TQT. 309
58015 405968 pfam15389 DUF4612 Domain of unknown function (DUF4612). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 109 and 323 amino acids in length. 111
58016 405969 pfam15390 WDCP WD repeat and coiled-coil-containing protein family. This family includes WD repeat and coiled-coil-containing protein (WDCP, previously known as C2orf44), which is found in eukaryotes and consists of around 721 amino acids. The N-terminal contains two WD (tryptophan-aspartic acid) repeats (WD1 and WD2). WD repeats may be involved in a range of biological functions including apoptosis, transcriptional regulation and signal transduction. The C-terminal contains a proline-rich sequence (PPRLPQR), and is predicted to have a leucine-rich coiled coil region (CC). WDCP was identified in a proteomic screen to find signalling components that interact with Hck (hematopoietic cell kinase), a non-receptor tyrosine kinase. WDCP was shown to bind tightly and specifically to the SH3 domain of Hck in U937 human monocytic cells. WDCP was also shown to exist as an oligomer when expressed in mammalian cells. While the function of WDCP is unknown, it has been identified in a gene fusion event with anaplastic lymphoma kinase (ALK) in colorectal cancer patients. 684
58017 405970 pfam15391 DUF4614 Domain of unknown function (DUF4614). This domain family is found in eukaryotes, and is approximately 180 amino acids in length. There is a conserved EALT sequence motif. 176
58018 405971 pfam15392 Joubert Joubert syndrome-associated. This family of proteins is a domain of unknown function, which is found in eukaryotes. However, mutations in the gene lead to Joubert's Syndrome, indicating that the protein the gene encodes is vital for correct ciliogenesis. 280
58019 405972 pfam15393 DUF4615 Domain of unknown function (DUF4615). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 161 and 229 amino acids in length. There is a single completely conserved residue F that may be functionally important. 132
58020 373816 pfam15394 DUF4616 Domain of unknown function (DUF4616). This protein family is a domain of unknown function found at the C-terminal domain of the proteins. This protein family is found in eukaryotes. Proteins in this family are typically between 166 and 538 amino acids in length. 491
58021 405973 pfam15395 DUF4617 Domain of unknown function (DUF4617). This family of proteins is found in eukaryotes. Proteins in this family are typically between 702 and 1745 amino acids in length. 1082
58022 405974 pfam15396 FAM60A Protein Family FAM60A. This protein family, FAM60A, is found in eukaryotes. It is known to be a cell cycle protein that binds to the promoter of a gene transcription repressor complex, named SIN4-HDAC complex. This means that FAM60A has an important role to play in 'switching on' gene expression. Proteins in this family are typically between 179 and 324 amino acids in length. 207
58023 405975 pfam15397 DUF4618 Domain of unknown function (DUF4618). This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 363 amino acids in length. There are two conserved sequence motifs: EYP and KCTPD. 258
58024 405976 pfam15398 DUF4619 Domain of unknown function (DUF4619). This family of proteins is found in eukaryotes. Proteins in this family are typically between 128 and 299 amino acids in length. 296
58025 405977 pfam15399 DUF4620 Domain of unknown function (DUF4620). 113
58026 405978 pfam15400 TEX33 Testis-expressed sequence 33 protein family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 147 and 280 amino acids in length. There are two conserved sequence motifs: NIRH and SYT. The function is not known. 138
58027 405979 pfam15401 TAA-Trp-ring Tryptophan-ring motif of head of Trimeric autotransporter adhesin. TAA-head_Trp-ring is the tryptophan-ring motif of some Gram-negative Enterobacteriaceae. The Trp-ring folds into a beta-meander type on the top of the head domain of its trimeric autotransporter adhesin proteins. In conjunction with the GIN domain it is thought to be the region of the head that adheres to fibronectin. 65
58028 373822 pfam15402 Spc7_N N-terminus of kinetochore NMS complex subunit Spc7. 917
58029 405980 pfam15403 HiaBD2 HiaBD2_N domain of Trimeric autotransporter adhesin (GIN). HiaBD2_N may represent the GIN domain of the Head region of TAAs - trimeric autotransporter adhesins. Not all TAAs carry this domain; however, in those that do, the GIN in combination with the Trp-ring domain is necessary for adhesion to fibronectin in the host cell. 52
58030 405981 pfam15404 PH_4 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 181
58031 405982 pfam15405 PH_5 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 135
58032 373825 pfam15406 PH_6 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 112
58033 405983 pfam15407 Spo7_2_N Sporulation protein family 7. Spo7_2 constitutes a different set of fungal and related species from those found in Spo7. This domain is found in general at the N-terminus. In many members the domain is associated with a Pleckstrin-homology - PH - domain. 64
58034 405984 pfam15409 PH_8 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 89
58035 405985 pfam15410 PH_9 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 118
58036 405986 pfam15411 PH_10 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 120
58037 405987 pfam15412 Nse4-Nse3_bdg Binding domain of Nse4/EID3 to Nse3-MAGE. This family includes Nse4 and EID3 members, that bind over this region to the Nse3 pocket, in MAGE family pfam01454. 56
58038 405988 pfam15413 PH_11 Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species. 105
58039 405989 pfam15414 DUF4621 Protein of unknown function (DUF4621). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 350 amino acids in length. 329
58040 405990 pfam15415 Mfa_like_2 Fimbrillin-like. This family of proteins is found in bacteria. Proteins in this family are typically between 348 and 360 amino acids in length. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins. 312
58041 405991 pfam15416 DUF4623 Domain of unknown function (DUF4623). This family of proteins is found in bacteria. Proteins in this family are approximately 470 amino acids in length. There are two conserved sequence motifs: HLL and RYL. 448
58042 405992 pfam15417 DUF4624 Domain of unknown function (DUF4624). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. 132
58043 405993 pfam15418 DUF4625 Domain of unknown function (DUF4625). This family contains a likely bacterial Ig-like fold, suggesting it may be a family of lipoproteins. 131
58044 405994 pfam15419 LNP1 Leukemia NUP98 fusion partner 1. This family of proteins includes leukemia NUP98 fusion partner 1; the gene encoding this protein is involved in a chromosomal translocation with the NUP98 locus in a form of T-cell acute lymphoblastic leukemia. 177
58045 405995 pfam15420 Abhydrolase_9_N Alpha/beta-hydrolase family N-terminus. This is the N-terminal transmembrane domain of a family of alpha/beta hydrolases which may function as lipases. The C-terminal domain (pfam10081) is the catalytic domain. 208
58046 405996 pfam15421 Polysacc_deac_3 Putative polysaccharide deacetylase. 423
58047 405997 pfam15423 FLYWCH_N FLYWCH-type zinc finger-containing protein. This family is the N-terminus of some FLYWCH-zinc-finger proteins, found in eukaryotes. The family is found in association with pfam04500. There are two conserved sequence motifs: EQE and QEPS. 107
58048 405998 pfam15424 ODAM Odontogenic ameloblast-associated family. 264
58049 405999 pfam15425 DUF4627 Domain of unknown function (DUF4627). This family of proteins is found in bacteria. Proteins in this family are approximately 230 amino acids in length. There is a conserved WYK sequence motif. 195
58050 373837 pfam15427 S100PBPR S100P-binding protein. S100PBPR is a family of proteins found in eukaryotes, and localized to cell nuclei where S100P is also present, and the two proteins co-immunoprecipitate. S100P is a member of the S100 family of calcium-binding proteins and there have been several recent reports of its over-expression in pancreatic ductal adenocarcinoma. In situ hybridisation shows S100PBPR transcripts to be found in islet cells but not duct cells of the healthy pancreas. An interaction between S100P and S100PBPR may be involved in early pancreatic cancer. 386
58051 406000 pfam15428 Imm26 Immunity protein 26. A predicted immunity protein with mostly all-beta fold and several conserved hydrophobic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI1 or Tox-HNH family. The protein is also found in heterogeneous poly-immunity loci. 101
58052 406001 pfam15429 DUF4628 Domain of unknown function (DUF4628). This family of proteins is found in eukaryotes. Proteins in this family are typically between 152 and 673 amino acids in length. 274
58053 406002 pfam15430 SVWC Single domain von Willebrand factor type C. SVWC is a family of single-domain von Willebrand factor type C proteins from lower eukaryotes. The canonical pattern of most von Willebrand factor type C (VWC) domains is of ten cysteines, however this family, largely but not exclusively of arthropod proteins, contains only eight. SVWC family proteins respond to environmental challenges, such as bacterial infection and nutritional status. They also are involved in anti-viral immunity, and all of these functions seem linked to SVWC expression being induced by Dicer2. 66
58054 292071 pfam15431 TMEM190 Transmembrane protein 190. 133
58055 406003 pfam15432 Sec-ASP3 Accessory Sec secretory system ASP3. Sec-ASP3 is a family of bacterial proteins involved in the Sec secretory system. The family forms part of the accessory SecA2/SecY2 system specifically required to export GspB, a serine-rich repeat cell-wall glycoprotein adhesin encoded upstream in the same operon. 123
58056 406004 pfam15433 MRP-S31 Mitochondrial 28S ribosomal protein S31. MRP-S31 is the mitochondrial 28S ribosomal subunit S31. This family of proteins is found in eukaryotes. Proteins in this family are typically between 246 and 395 amino acids in length. There are two conserved sequence motifs: RHFMELV and GLSKN. 301
58057 406005 pfam15434 FAM104 Family 104. This family of proteins is found in eukaryotes. Proteins in this family are typically between 113 and 185 amino acids in length. There is a conserved SLQ sequence motif. 109
58058 406006 pfam15435 UNC119_bdg UNC119-binding protein C5orf30 homolog. UNC119_bdg is a family of eukaryotic proteins that probably plays a role in trafficking of proteins, via interaction with unc119 family cargo adapters. The family may play a role in ciliary membrane localization. 198
58059 406007 pfam15436 PGBA_N Plasminogen-binding protein pgbA N-terminal. PGBA_N is an N-terminal family of bacterial proteins that bind plasminogen. This activity was identified in Helicobacter pylori where it is thought to contribute to the virulence of this bacterium. Both PgbA and PgbB are surface-exposed proteins that mediate binding to plasminogen such that it can be converted into plasmin in the presence of a plasminogen activator. 217
58060 292077 pfam15437 PGBA_C Plasminogen-binding protein pgbA C-terminal. PGBA_C is a C-terminal family of bacterial proteins that bind plasminogen. This activity was identified in Helicobacter pylori where it is thought to contribute to the virulence of this bacterium. Both PgbA and PgbB are surface-exposed proteins that mediate binding to plasminogen such that it can be converted into plasmin in the presence of a plasminogen activator. 84
58061 373844 pfam15438 Phyto-Amp Antigenic membrane protein of phytoplasma. Phyto-Amp is a family of phytopathogenic wall-less bacterial antigenic membrane proteins. The bacteria are limited to the phloem and pose a major threat to agriculture worldwide. They are transmitted in a persistent, propagative manner by phloem-sucking Hemipteran insects. Phytoplasma membrane proteins are in direct contact with hosts and are assumed to be involved in determining vector specificity. Phyto-Amp is thought to be one family of proteins that mediates such specificity. The proteins appear to be encoded by circular extrachromosomal elements, at least one of which is a plasmid. 147
58062 406008 pfam15439 NYAP_N Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter. NYAP_N is an N-terminal family of eukaryotic proteins that are substrates of tyrosine kinase in the brain. When first identified, the family members were referred to as unconventional myosin XVI, or Myr 8. However, proteins have now been identified as being integrally involved in neuronal function and morphogenesis. The family is involved in both the activation of phosphoinositide 3-kinase (PI3K) and the recruitment of the downstream effector WAVE complex to the close vicinity of PI3K; it also appears to regulate the brain size and neurite outgrowth in mice. 381
58063 406009 pfam15440 THRAP3_BCLAF1 THRAP3/BCLAF1 family. This family includes thyroid hormone receptor-associated protein 3 (THRAP3), which is a spliceosome component and a subunit of the TRAP complex which plays a role in pre-mRNA splicing and in mRNA decay. It also includes the transcriptional repressor Bcl-2-associated transcription factor 1 (BCLAF1). 614
58064 406010 pfam15441 ARHGEF5_35 Rho guanine nucleotide exchange factor 5/35. This family includes Rho guanine nucleotide exchange factor 5 and Rho guanine nucleotide exchange factor 35. 488
58065 406011 pfam15442 DUF4629 Domain of unknown function (DUF4629). This domain family is found in eukaryotes, and is approximately 150 amino acids in length. There are two conserved sequence motifs: MHML and LGKK. 150
58066 373849 pfam15443 DUF4630 Domain of unknown function (DUF4630). This family of proteins is found in eukaryotes. Proteins in this family are typically between 124 and 286 amino acids in length. 156
58067 373850 pfam15444 TMEM247 Transmembrane protein 247. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 197 and 222 amino acids in length. The function of this family is unknown. 211
58068 373851 pfam15445 ATS acidic terminal segments, variant surface antigen of PfEMP1. ATS is the intracellular and relatively conserved acidic terminal segment of the Plasmodium falciparum erythrocyte membrane protein-1 (PfEMP1). This domain appears to be present in all variants of the highly polymorphic PfEMP1 proteins. 446
58069 406012 pfam15446 zf-PHD-like PHD/FYVE-zinc-finger like domain. This family appears to be a combination domain of several consecutive zinc-binding regions. 170
58070 406013 pfam15447 NTS N-terminal segments of PfEMP1. This family, the N-terminal segment, is the most variable part of the variant surface antigen family of Plasmodium falciparum, the erythrocyte membrane protein-1 (PfEMP1) proteins. PfEMP1 is an important target for protective immunity and is implicated in the pathology of malaria through its ability to adhere to host endothelial receptors. A structural and functional study of the N-terminal domain of PfEMP1 from the VarO variant comprising the N-terminal segment (NTS) and the first DBL domain (DBL1alpha1), shows this region is directly implicated in rosetting. NTS, previously thought to be a structurally independent component of PfEMP1, forms an integral part of the DBL1alpha domain that is found to be the important heparin-binding site. This family is closely associated with PFEMP, pfam03011, and Duffy_binding, pfam05424. 36
58071 406014 pfam15448 NTS_2 N-terminal segments of P. falciparum erythrocyte membrane protein. NTS_2 is a family of the most variable part of the variant surface antigen family of Plasmodium falciparum, the erythrocyte membrane protein-1 (PfEMP1). However, in this group of proteins conservation is high. PfEMP1 is an important target for protective immunity and is implicated in the pathology of malaria through its ability to adhere to host endothelial receptors. 50
58072 406015 pfam15449 Retinal Retinal protein. This family of proteins is found in the photoreceptor cells of the retina. Mutations of the gene encoding this protein have been associated with retinal disorders such as retinitis pigmentosa and late-onset progressive retinal atrophy. The function of this family of proteins is unknown, but it is likely to be important in the development and function of the retina. 1293
58073 406016 pfam15450 CCDC154 Coiled-coil domain-containing protein 154. CCDC154 is an osteopetrosis-related protein that suppresses cell proliferation by inducing G2/M arrest. 526
58074 373857 pfam15451 DUF4632 Domain of unknown function (DUF4632). This family of proteins is found in eukaryotes. Proteins in this family are typically between 59 and 190 amino acids in length. 71
58075 406017 pfam15452 NYAP_C Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter. NYAP_C is a C-terminal family of eukaryotic proteins that are substrates of tyrosine kinase in the brain. When first identified, the family members were referred to as unconventional myosin XVI, or Myr 8. However, proteins have now been identified as being integrally involved in neuronal function and morphogenesis. The family is involved in both the activation of phosphoinositide 3-kinase (PI3K) and the recruitment of the downstream effector WAVE complex to the close vicinity of PI3K; it also appears to regulate the brain size and neurite outgrowth in mice. 261
58076 406018 pfam15453 Pilt Protein incorporated later into Tight Junctions. Pilt is a family of eukaryotic tight junction-proteins that binds to guanylate-kinase. Pilt is a component of TJs (Tight junctions) rather than AJs (Adhesin junctions). The protein is incorporated into TJs after TJ strands are formed, thereby suggesting the name Pilt for 'protein incorporated later into TJs'. Pilt binds to the guanylate-kinase region of hDlg otherwise known as Disk large homolog. 362
58077 406019 pfam15454 LAMTOR Late endosomal/lysosomal adaptor and MAPK and MTOR activator. LAMTOR is a family of eukaryotic proteins that have otherwise been referred to as Lipid raft adaptor protein p18, Late endosomal/lysosomal adaptor and MAPK and MTOR activator 1, and Protein associated with DRMs and endosomes. It is found to be one of three small proteins constituting the Rag complex or Ragulator that interact with each other, localize to endosomes and lysosomes, and play positive roles in the MAPK pathway. The complex does this by interacting with the Rag GTPases, recruiting them to lysosomes, and bringing about mTORC1 activation. 69
58078 317808 pfam15455 Pro-rich_19 Proline-rich 19. This family includes proline-rich protein 19. 363
58079 406020 pfam15456 Uds1 Up-regulated During Septation. Uds1 is a domain family found mostly in fungi, and is typically between 120 and 138 amino acids in length. The GO annotation for the S.pombe protein describes the protein as barrier septum assembly involved in cell cycle cytokinesis, GO:0071937. Many of the uncharacterized members are listed as being involucrin repeat proteins, but this cannot be substantiated. 120
58080 406021 pfam15457 HopW1-1 Type III T3SS secreted effector HopW1-1/HopPmaA. HopW1-1 is a family of bacterial modular P. syringae Avr effectors that induce accumulation of the signal molecule salicylic acid (SA) and the transcripts of HWI1 (HOPW1-1-INDUCED GENE1) in Arabidopsis. Thus HopW1-1 elicits a resistance response in Arabidopsis. 321
58081 406022 pfam15458 NTR2 Nineteen complex-related protein 2. NTR2 or Nineteen complex-related protein 2 is a family of largely fungal and plant proteins that form a complex with the DExD/H-box RNA helicase Prp43. Along with NTR1 it is an accessory factor of Prp43 in catalyzing spliceosome disassembly. Disassembly of the spliceosome after completion of the splicing reaction is necessary for recycling of splicing factors to promote efficient splicing. NTR2 and NTR1 associate with a post-splicing complex containing the excised intron and the spliceosomal U2, U5, and U6 snRNAs, that supports a link with a late stage in the pre-mRNA splicing process. 310
58082 406023 pfam15459 RRP14 60S ribosome biogenesis protein Rrp14. RRP14 is a family of nucleolar 60S ribosomal biogenesis proteins from eukaryotes. RRP14 functions in ribosome synthesis as it is required for the maturation of both small and large subunit rRNAs and it helps to prevent premature cleavage of the pre-rRNA at site C2. It also plays a role in cell polarity and/or spindle positioning. 63
58083 406024 pfam15460 SAS4 Something about silencing, SAS, complex subunit 4. SAS4 is a family of largely fungal silencing regulators. This silencing is mediated by chromatin. SAS4 specifically silences the yeast mating-type genes HML and HMR. SAS4 is found to be one subunit of a complex, the SAS complex, that interacts with chromatin assembly factor Asf1p, and asf1 mutants show silencing defects similar to mutants in the SAS complex. Thus, ASF1-dependent chromatin-assembly may mediate the role of the SAS complex in silencing. Co-expression of Sas2, SAS4, and Sas5 in Escherichia coli leads to formation of a stable SAS complex that acetylates histones. SAS4 is essential for the acetyltransferase activity of Sas2, and Sas5 is also important. 99
58084 406025 pfam15461 BCD Beta-carotene 15,15'-dioxygenase. This is a family of bacterial and archaeal proteins that catalyzes or regulates the conversion of beta-carotene to retinal. Characterization of BCD proteins shows them to cleave beta-carotene at its central double bond (15,15') to yield two molecules of all-trans-retinal. However, the oxygen atom of retinal originated not from water but from molecular oxygen, suggesting that the enzyme was a beta-carotene 15,15'-dioxygenase, rather than a mono-oxygenase that catalyzes the same biochemical reaction. 264
58085 406026 pfam15462 Barttin Bartter syndrome, infantile, with sensorineural deafness (Barttin). Barttin is a family of mammalian proteins that are chloride ion channel beta-subunits crucial for renal Cl- re-absorption and inner ear K+ secretion. Bartter syndrome is a term covering a heterogeneous group of autosomal recessive salt-losing nephropathies that are caused by disturbed transepithelial sodium chloride re-absorption in the distal nephron. Mutations in the BCD proteins lead to sensorial deafness. 223
58086 406027 pfam15463 ECM11 Extracellular mutant protein 11. ECM11 is a family of largely fungal proteins. ECM11 interacts with Cdc6, an essential protein involved in the initiation of DNA replication, and is a nuclear protein involved in maintaining chromatin structure. It was previously identified as a protein involved in yeast cell wall biogenesis and organisation, but is also found to be required in meiosis where its function is related to DNA replication and crossing-over. 133
58087 406028 pfam15464 DUF4633 Domain of unknown function (DUF4633). This family of proteins is found in eukaryotes. Proteins in this family are typically between 94 and 123 amino acids in length. 114
58088 406029 pfam15465 DUF4634 Domain of unknown function (DUF4634). This family of proteins is found in eukaryotes. Proteins in this family are typically between 98 and 133 amino acids in length. 131
58089 406030 pfam15466 DUF4635 Domain of unknown function (DUF4635). This family of proteins is found in eukaryotes. Proteins in this family are typically between 120 and 154 amino acids in length. There are two conserved sequence motifs: LEQ and DLE. 134
58090 406031 pfam15467 SGIII Secretogranin-3. Secretogranin_3 is a family of vertebrate proteins that is one of the granin family. Granins are rich in acidic amino acids, exhibit aggregation at low pH, and possess a high capacity for calcium binding. Because granins are restricted in their localization to secretory granules of neuroendocrine cells, two interesting characteristics of their sorting mechanisms have been observed. These are, first, that they aggregate on low pH/high calcium concentrations and second that two of them carry an N-terminal disulfide loop, mutations in which lead to mis-sorting. Thus, granins are thought to be essential for the sorting of secretory proteins at the trans-Golgi network. Chromogranin A (CgA) binds to SGIII in secretory granules of endocrine cells. SGIII directly binds to cholesterol components of the secretory granule membrane and targets CgA to secretory granules in pituitary and pancreatic endocrine cells. Mutations in the SGIII gene may influence the risk of obesity through possible regulation of hypothalamic neuropeptide secretion. 449
58091 373871 pfam15468 DUF4636 Domain of unknown function (DUF4636). This family of proteins is found in eukaryotes. Proteins in this family are typically between 196 and 244 amino acids in length. 243
58092 406032 pfam15469 Sec5 Exocyst complex component Sec5. This conserved Sec5 family of eukaryotic proteins does not represent the Sec5-Ral binding site. 186
58093 406033 pfam15470 DUF4637 Domain of unknown function (DUF4637). This family of proteins is found in eukaryotes. Proteins in this family are typically between 142 and 178 amino acids in length. 164
58094 373874 pfam15471 TMEM171 Transmembrane protein family 171. This family of proteins is found in eukaryotes. TMEM171 is also known as parturition-related protein 2. Proteins in this family are typically between 242 and 326 amino acids in length. 317
58095 406034 pfam15472 DUF4638 Domain of unknown function (DUF4638). This family of proteins is found in eukaryotes. Proteins in this family are typically between 240 and 272 amino acids in length. 262
58096 406035 pfam15473 PCNP PEST, proteolytic signal-containing nuclear protein family. PCNP is a PEST-containing nuclear protein that is ubiquitinated by NIRF, a Np95/ICBP90-like RING finger protein. PEST sequences, which are rich in proline (P), glutamic acid (E), serine (S) and threonine (T), are found in a number of short-lived proteins, such as transcription factors and cell cycle-associated proteins. Their function is generally controlled by proteolysis, mostly via ubiquitin-mediated degradation. Thus, NIRF and PCNP are a ubiquitin ligase and its substrate, respectively, that may constitute a novel signalling pathway with some relation to cell proliferation. 156
58097 406036 pfam15474 MU117 Meiotically up-regulated gene family. This protein was identified as being up-regulated during meiosis in S.pombe. This family of proteins is found largely in plants and fungi. Proteins in this family are typically between 128 and 920 amino acids in length. 104
58098 406037 pfam15475 UPF0444 Transmembrane protein C12orf23, UPF0444. This family of proteins is found in eukaryotes. Proteins in this family are typically between 94 and 119 amino acids in length. 91
58099 406038 pfam15476 SAP25 Histone deacetylase complex subunit SAP25. SAP25 is a family of proteins found in eukaryotes. SAP25 is a core component of the mSin3 co-repressor complex whose subcellular location is regulated by PML. mSin3, the transcriptional co-repressor, is associated with histone deacetylases (HDACs) and is utilized by many DNA-binding transcriptional repressors. SAP25 is a nucleo-cytoplasmic shuttling protein that is actively exported from the nucleus by a CRM1-dependent mechanism. It binds to the PAH1 domain of mSin3A, associates with the mSin3A-HDAC complex in vivo, and represses transcription when tethered to DNA. 202
58100 406039 pfam15477 SMAP Small acidic protein family. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a single completely conserved residue G that may be functionally important. 73
58101 406040 pfam15478 LKAAEAR Family of unknown function with LKAAEAR motif. This family of proteins is found in eukaryotes. Proteins in this family are typically between 119 and 235 amino acids in length. There is a conserved LKAAEAR sequence motif. 137
58102 406041 pfam15479 DUF4639 Domain of unknown function (DUF4639). This family of proteins is found in eukaryotes. Proteins in this family are typically between 161 and 601 amino acids in length. 580
58103 406042 pfam15480 DUF4640 Domain of unknown function (DUF4640). This family of proteins is found in eukaryotes. Proteins in this family are typically between 99 and 306 amino acids in length. 292
58104 406043 pfam15481 CPG4 Chondroitin proteoglycan 4. CPG4 is a domain family found in nematodes of one of nine core chondroitin proteoglycans. Vertebrates produce multiple chondroitin sulfate proteoglycans that play important roles in development and tissue mechanics. In the nematode Caenorhabditis elegans, the chondroitin chains lack sulfate but nevertheless play essential roles in embryonic development and vulval morphogenesis. CPG4 has the largest predicted mass of the C. elegans CPGs at 84 kDa. The majority of its 35 predicted glycosaminoglycan attachment sites reside in the COOH-terminal half of the protein, of which four sites were confirmed by DTT modification. The family is rich in conserved cysteines. 94
58105 406044 pfam15482 CCER1 Coiled-coil domain-containing glutamate-rich protein family 1. This is a family of coiled-coil family proteins found in eukaryotes. Proteins in this family are typically between 160 and 397 amino acids in length. 213
58106 406045 pfam15483 DUF4641 Domain of unknown function (DUF4641). This family of proteins is found in eukaryotes. Proteins in this family are typically between 201 and 519 amino acids in length. 443
58107 406046 pfam15484 DUF4642 Domain of unknown function (DUF4642). This family of proteins is found in eukaryotes. Proteins in this family are typically between 115 and 196 amino acids in length. 155
58108 406047 pfam15485 DUF4643 Domain of unknown function (DUF4643). This family of proteins is found in eukaryotes. Proteins in this family are typically between 254 and 462 amino acids in length. 263
58109 406048 pfam15486 DUF4644 Domain of unknown function (DUF4644). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 191 amino acids in length. 161
58110 406049 pfam15487 FAM220 FAM220 family. This protein family is a domain of unknown function which is found in eukaryotes. Proteins in this family are typically between 217 and 277 amino acids in length. There are two completely conserved residues (S and L) that may be functionally important. 275
58111 406050 pfam15488 DUF4645 Domain of unknown function (DUF4645). This family of proteins is found in eukaryotes. Proteins in this family are typically between 200 and 298 amino acids in length. 294
58112 406051 pfam15489 CTC1 CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. Mutations in human CTC1 have been recognized as contributing to cerebroretinal microangiopathy. 1139
58113 373893 pfam15490 Ten1_2 Telomere-capping, CST complex subunit. Ten1_2 is a family of primarily plant and vertebrate telomere-capping proteins that is evolutionarily related to the mostly fungal family of Ten1, pfam12658. 117
58114 406052 pfam15491 CTC1_2 CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. This family largely represents sequences from plant species. 287
58115 406053 pfam15492 Nbas_N Neuroblastoma-amplified sequence, N terminal. Nbas_N is an N-terminal family of metazoan sequences. This domain lies at the N-terminal of several WD40-containing proteins. The human protein is over-expressed in neuroblastoma cells. 282
58116 406054 pfam15493 YrpD Domain of unknown function, YrpD. This family of proteins is found in bacteria. Proteins in this family are typically between 236 and 351 amino acids in length. The member from Bacillus subtilis, UniProtKB:O05411, is named YrpD. 203
58117 406055 pfam15494 SRCR_2 Scavenger receptor cysteine-rich domain. SRCR_2 is a scavenger receptor cysteine-rich domain family found largely on vertebrate sequences up-stream of the trypsin-like transmembrane serine protease, Spinesin. 99
58118 406056 pfam15495 Fimbrillin_C Major fimbrial subunit protein type IV, Fimbrillin, C-terminal. Fimbrillin_C is a C-terminal family of major fimbrial subunit protein type IV proteins largely from Bacillus species. The family is associated with family P_gingi_FimA, pfam06321. 83
58119 406057 pfam15496 DUF4646 Domain of unknown function (DUF4646). This is a family of proteins largely from fungi. The function is not known. 120
58120 406058 pfam15497 SNAPc19 snRNA-activating protein complex subunit 19, SNAPc subunit 19. SNAPc19 is a family of proteins found in eukaryotes. It is one of the five core components of the snRNA-activating protein complex or SNAPc that helps direct the nucleation of RNA polymerases II and III. The core RNA polymerase II snRNA promoters consist of a single essential element, the proximal sequence element (PSE), whereas the core RNA polymerase III snRNA promoters consist of both a PSE and a TATA box. The SNAPc binds to the PSE of both of these. SNAPc recognizes the PSE sequence common to all human snRNA genes, irrespective of polymerase specificity. SNAPc is also known as the PSE transcription factor (PTF) or PSE-binding protein (PBP). The human SNAP19 and SNAP45 subunits are dispensable for transcription in vitro and are not as widely conserved as the other three, SNAP190, SNAP43 and SNAP50, suggesting that these vertebrate-specific SNAPc subunits may have adapted specialized regulatory roles for snRNA gene transcription. 88
58121 373901 pfam15498 Dendrin Nephrin and CD2AP-binding protein, Dendrin. Dendrin is a family of eukaryotic proteins found in the podocytes of the kidneys. Dendrin, originally identified in telencephalic dendrites, is a constituent of the slit diaphragm, SD, complex of podocytes, where it directly binds to nephrin and CD2AP. Kidney podocytes and their slit diaphragms (SDs) form the final barrier to urinary protein loss. SD proteins also participate in intracellular signalling pathways. Dendrin appears to prevent programmed cell death (apoptosis) through its binding to nephrin. The SD protein nephrin serves as a component of a signalling complex that directly links podocyte junctional integrity to actin cytoskeletal dynamics. Thus, dendrin is identified as an SD family with proapoptotic signalling properties that accumulates in the podocyte nucleus in response to glomerular injury. 656
58122 406059 pfam15499 Peptidase_C98 Ubiquitin-specific peptidase-like, SUMO isopeptidase. Peptidase_C98 is a small family of SUMO - small ubiquitin-related modifier - isopeptidases found in eukaryotes. Reversible attachment of SUMO is an essential protein modification in all eukaryotic cells. The family neither binds nor cleaves ubiquitin, but is a potent SUMO isopeptidase, and the invariant residues required for SUMO binding and cleavage, in UniProtKB:Q5W0Q7, are Cys-236, His-456 and Asp-472, all of which are fully conserved in the family. Member proteins are low-abundance proteins that colocalize with coilin in Cajal bodies. Peptidase_C98 depletion does not affect global sumoylation, but causes striking coilin mis-localization and impairs cell proliferation, functions that are not dependent on the catalytic activity. Thus, Peptidase_C98 represents a third type of SUMO protease, with essential functions in Cajal body biology. 272
58123 259631 pfam15500 Ntox1 Putative RNase-like toxin, toxin_1. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and conserved cysteine, histidine and glutamate residues that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 96
58124 406060 pfam15501 MDM1 Nuclear protein MDM1. This family of proteins is present in the nucleus. The function of MDM1 is not known. 515
58125 406061 pfam15502 MPLKIP M-phase-specific PLK1-interacting protein. 62
58126 406062 pfam15503 PPP1R35_C Protein phosphatase 1 regulatory subunit 35 C-terminus. This is the C-terminus of protein phosphatase 1 regulatory subunit 35. This protein interacts with and inhibits the serine/threonine-protein phosphatase PPP1CA. 144
58127 406063 pfam15504 DUF4647 Domain of unknown function (DUF4647). This family of proteins is found in eukaryotes. Proteins in this family are typically between 282 and 480 amino acids in length. 465
58128 373907 pfam15505 DUF4648 Domain of unknown function (DUF4648). This family of proteins is found in eukaryotes. Proteins in this family are typically between 115 and 207 amino acids in length. 80
58129 292144 pfam15506 OCC1 OCC1 family. The human member of this family, overexpressed in colon carcinoma 1 protein, has been shown to be overexpressed in several colon carcinomas. 61
58130 406064 pfam15507 DUF4649 Domain of unknown function (DUF4649). This family of Firmicute sequences has members that are annotated as ribose-phosphate pyrophosphokinase; however there is no evidence for this attribution. Member proteins are all shorter than 100 residues in length. 68
58131 406065 pfam15508 NAAA-beta beta subunit of N-acylethanolamine-hydrolyzing acid amidase. NAAA-beta is a family of vertebrate sequences that form the beta subunit of vertebrate N-acylethanolamine-hydrolyzing acid amidase, a member of the choloylglycine hydrolase acid ceramidase family. The alpha subunit is represented by family CBAH, pfam02275. 63
58132 406066 pfam15509 DUF4650 Domain of unknown function (DUF4650). This family of vertebrate proteins lies to the C-terminus of Ubiquitin-specific peptidase-like protein family peptidase_C98, pfam15499. It might be acting as the exosite for the peptidase. 519
58133 373910 pfam15510 CENP-W CENP-W protein. CENP-W is a family of vertebrate kinetochore proteins that associates directly with CENP-T. CENP-W members are histone-fold proteins. The histone fold region is critical for binding to centromeric DNA. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes. 88
58134 373911 pfam15511 CENP-T_C Centromere kinetochore component CENP-T histone fold. CENP-T is a family of vertebrate kinetochore proteins that associates directly with CENP-W. The N-terminus of CENP-T proteins interacts directly with the Ndc80 complex in the outer kinetochore. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes. This domain is the C-terminal histone fold domain of CENP-T, which associates with chromatin. 108
58135 406067 pfam15512 CAF-1_p60_C Chromatin assembly factor complex 1 subunit p60, C-terminal. CAF-1_p60_C is a family of vertebrate proteins that is involved in chromatin assembly. CAF-1_p60 is one of the three subunits of the CAF-1 complex, and this domain binds to the C-terminal region of CAF-1_p150, family pfam12253. The N-terminal part of the CAF-1_p60 proteins is a WD-repeat structure, pfam00400. 177
58136 406068 pfam15513 DUF4651 Domain of unknown function (DUF4651). A family of short, secreted proteins specific to the Streptococcus genus, with distant homologs, not recognized by this HMM, found in other cocci. In all sequenced genomes, proteins from this family appear in a conserved genomic context with a thioredoxin, tRNA synthase and tRNA binding protein, but the functional implication of this is unclear. 61
58137 292152 pfam15514 ThaI Restriction endonuclease ThaI. This family of restriction endonucleases belongs to the PD-(D/E)XK superfamily. It cuts the recognition site CG^CG leaving blunt ends. 202
58138 406069 pfam15515 MvaI_BcnI MvaI/BcnI restriction endonuclease family. This family of proteins includes the restriction endonucleases MvaI and BcnI. These enzymes both function as monomers. MvaI cleaves the sequence CC/WGG, where W is an A or a T nucleotide, leaving sticky ends. BcnI cleaves the sequence CC/SGG, where S is G or C, leaving sticky ends. 226
58139 373914 pfam15516 BpuSI_N BpuSI N-terminal domain. This is the N-terminal (nuclease) domain of the BpuSI restriction endonuclease. 168
58140 406070 pfam15517 TBPIP_N TBP-interacting protein N-terminus. This is the N-terminal restriction endonuclease-like domain found in several archaeal TATA-binding protein (TBP)-interacting proteins. 100
58141 373916 pfam15518 L_protein_N L protein N-terminus. This endonuclease domain is found at the N-terminus of many bunyavirus L proteins. 93
58142 406071 pfam15519 RBM39linker linker between RRM2 and RRM3 domains in RBM39 protein. A conserved linker between the second and the third RRM domain in human RBM39 (CAPER) protein, also present in other RNA binding proteins, especially those involved in RNA splicing. This linker was implicated in interactions with ESR1 and ESR2. Preliminary results from JCSG suggest that this is a structured domain with a well defined fold. 84
58143 292158 pfam15520 Ntox10 Novel toxin 10. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the type 2 secretion system. 193
58144 373918 pfam15521 Ntox11 Novel toxin 11. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin contains two structural domains, an N-terminal alpha/beta domain and a C-terminal all-beta domain. The domain contains conserved GxR, RxxxoH GxE and GxxH motifs and a conserved histidine residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 256
58145 259653 pfam15522 Ntox14 Novel toxin 14. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 218
58146 406072 pfam15523 Ntox16 Novel toxin 16. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha helical fold and conserved (DNE)xxH motif and arginine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, or Photorhabdus virulence cassette (PVC)-type secretion system. 85
58147 406073 pfam15524 Ntox17 Novel toxin 17. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-beta fold and a conserved ExD motif and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 7 or TcdB/TcaC-type secretion system. 98
58148 406074 pfam15525 DUF4652 Domain of unknown function (DUF4652). This family of uncharacterized proteins from Clostridia and Bacilli classes has an unusual structure of three beta propeller repeats that do not form a barrel, as in well known 6-, 7- etc beta propeller barrels, but instead are stacked in a three-layer beta-sheet sandwich. The function of all the proteins from this family is unknown. 193
58149 406075 pfam15526 Ntox21 Novel toxin 21. Bacterial genomes and plasmids encode a variety of peptide and protein toxins that mediate inter-bacterial competition. Bacteriocins are diffusible proteins that parasitize cell-envelope proteins to enter and kill bacteria. Contact-dependent growth inhibition (CDI) is one mechanism of inter-bacterial competition. Novel Toxin 21 (alternatively 16S rRNA endonuclease CdiA) belongs to a family of prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. This RNase toxin found in bacterial polymorphic toxin systems, is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, with two conserved lysine residues and [DS]xDxxxH, RxG[ST] and RxxD motifs. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 4, type 5 or type 7 secretion systems. This is also referred to as the E. cloacae CdiAC. The CdiAC proteins carry a variety of sequence-diverse C-terminal domains, which represent a collection of distinct toxins. Many CdiA-CT toxins have nuclease activities. In accord with the structural homology, CdiA-CT cleaves 16S rRNA at the same site as colicin E3 and this nuclease activity is responsible for growth inhibition. 71
58150 406076 pfam15527 Ntox22 Bacterial toxin 22. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly beta fold and two conserved histidines, two aspartates and a glutamate residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 5 secretion system. 129
58151 373924 pfam15528 Ntox23 Bacterial toxin 23. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved ND and DxxR motifs and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or TcdB/TcaC secretion system. 190
58152 406077 pfam15529 Ntox24 Bacterial toxin 24. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved ND and DxxR motifs and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or TcdB/TcaC secretion system. Interestingly, the toxin is also found in type-II toxin-antitoxin systems. 96
58153 373925 pfam15530 Ntox25 Bacterial toxin 25. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-beta fold and conserved FGPY motif and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 5 secretion system. 167
58154 406078 pfam15531 Ntox27 Bacterial toxin 27. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and conserved aspartate and glutamate residues, and an RxW motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion systems. 130
58155 373927 pfam15532 Ntox30 Bacterial toxin 30. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and two conserved histidines present in an RxH and THIP motif. The domain additionally has a highly conserved arginine residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6 or type 7 secretion systems. 103
58156 292170 pfam15533 Ntox33 Bacterial toxin 33. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and [DN]xHxxK and DxxxD motifs. It is usually exported by the Type 2 secretory system. 65
58157 406079 pfam15534 Ntox35 Bacterial toxin 35. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains a conserved histidine residue and a KH motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2 secretion system. 77
58158 373929 pfam15535 Ntox37 Bacterial toxin 37. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and a conserved glutamate residue, and [KR] and Hx[DH] motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion systems. 64
58159 292173 pfam15536 Ntox3 Bacterial toxin 3. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved aspartate, arginine, histidine and cysteine residues that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 133
58160 292174 pfam15537 Ntox43 Bacterial toxin 43. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold with two conserved histidine residues. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2 or TcdB/TcaC-type secretion system. An example of this, the Pseudomonas RhsT-C, has been experimentally characterized. 127
58161 406080 pfam15538 Ntox46 Bacterial toxin 46. A predicted toxin domain found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold with a conserved glutamine residue and a [KR]STxxPxxDxx[ST] motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system. 157
58162 406081 pfam15539 CAF1-p150_C2 CAF1 complex subunit p150, region binding to CAF1-p60 at C-term. CAF1-p150_C2 is part of the binding region of the CAF1 complex p150 subunit to the p60 subunit. The CAF1 complex is essential in human cells for the de novo deposition of histones H3 and H4 at the DNA replication fork. 288
58163 406082 pfam15540 Ntox47 Bacterial toxin 47. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains two conserved aspartates, a glutamate, a histidine and an arginine residue and an RT motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6 or type 7 secretion system. 111
58164 292178 pfam15541 Ntox4 Bacterial toxin 4. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 109
58165 406083 pfam15542 Ntox50 Bacterial toxin 50. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains two conserved histidine, a serine, two lysine, and a threonine residue and a HxVP motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6, type 7, and MuF-type secretion systems. 93
58166 292180 pfam15543 Ntox5 Bacterial toxin 5. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 142
58167 373932 pfam15544 Ntox6 Bacterial toxin 6. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system. 279
58168 292182 pfam15545 Ntox8 Bacterial toxin 8. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and HxR and HxxxH motifs and is usually exported by the type 2 and type 6 secretion system. 74
58169 317880 pfam15546 DUF4653 Domain of unknown function (DUF4653). This family of proteins is found in eukaryotes. Proteins in this family are typically between 93 and 229 amino acids in length. 229
58170 406084 pfam15547 DUF4654 Domain of unknown function (DUF4654). This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 169 amino acids in length. There is a conserved IDC sequence motif. 137
58171 406085 pfam15548 DUF4655 Domain of unknown function (DUF4655). This family of proteins is found in eukaryotes. Proteins in this family are typically between 533 and 570 amino acids in length. 534
58172 406086 pfam15549 PGC7_Stella PGC7/Stella/Dppa3 domain. The domain belongs to a fast evolving family known only from the placental mammals. The PGC7/Stella/Dppa3 protein protects imprinted regions from demethylation post-fertilization. This suggests that it might bind methylated DNA sequences directly. The conserved core includes a positively charged helical segment and a C-terminal CXCXXC motif that is predicted to chelate a metal ion. Most placental mammals contain 3-6 paralogs of this domain family. The CXCXXC motif is also conserved in a subset of fungal MBD4-like proteins. 166
58173 406087 pfam15550 Draxin Draxin. This family of proteins inhibit Wnt signaling and act as chemorepulsive axon guidance molecules. 319
58174 406088 pfam15551 DUF4656 Domain of unknown function (DUF4656). This family of proteins is found in eukaryotes. Proteins in this family are typically between 286 and 398 amino acids in length. 361
58175 406089 pfam15552 DUF4657 Domain of unknown function (DUF4657). This family of proteins is found in eukaryotes. Proteins in this family are typically between 305 and 370 amino acids in length. 294
58176 406090 pfam15553 TEX19 Testis-expressed protein 19. This family of proteins is expressed in testis. 159
58177 406091 pfam15554 FSIP1 FSIP1 family. 399
58178 406092 pfam15555 DUF4658 Domain of unknown function (DUF4658). This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 161 amino acids in length. 123
58179 406093 pfam15556 Zwint ZW10 interactor. This family of proteins is found in eukaryotes. Proteins in this family are typically between 127 and 281 amino acids in length. 252
58180 373941 pfam15557 CAF1-p150_N CAF1 complex subunit p150, region binding to PCNA. CAF1-p150_N is part of the N-terminus of the CAF1 complex p150 subunit that binds to PCNA - proliferating cell nuclear antigen. The PCNA mediates the connection between CAF-1 and the DNA replication fork. The CAF1 complex is essential in human cells for the de novo deposition of histones H3 and H4 at the DNA replication fork. 230
58181 406094 pfam15558 DUF4659 Domain of unknown function (DUF4659). This family of proteins is found in eukaryotes. Proteins in this family are typically between 427 and 674 amino acids in length. There are two completely conserved residues (D and I) that may be functionally important. 374
58182 406095 pfam15559 DUF4660 Domain of unknown function (DUF4660). This family of proteins is found in eukaryotes. Proteins in this family are typically between 93 and 189 amino acids in length. 107
58183 292197 pfam15560 Imm12 Immunity protein 12. A predicted immunity protein with an alpha+beta fold and several conserved charged and hydrophobic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family. The protein is also found in heterogeneous poly-immunity loci. 138
58184 406096 pfam15561 Imm15 Immunity protein 15. A predicted immunity protein with an alpha+beta fold and several conserved polar and hydrophobic residues. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems. 160
58185 406097 pfam15562 Imm17 Immunity protein 17. A predicted immunity protein with two transmembrane helices, and a WxW motif and a conserved arginine between the two helices. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems. 60
58186 406098 pfam15563 Imm19 Immunity protein 19. A predicted immunity protein with an alpha+beta fold and a conserved HxxRN motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems. 227
58187 259695 pfam15564 Imm25 Immunity protein 25. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in heterogeneous poly-immunity loci of polymorphic toxin systems. 131
58188 406099 pfam15565 Imm30 Immunity protein 30. A predicted immunity protein with a mostly alpha-helical fold and a conserved DxG motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-SHH family of HNH/Endonuclease VII fold nucleases. 96
58189 292202 pfam15566 Imm32 Immunity protein 32. A predicted immunity protein with an alpha+beta fold and a conserved histidine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox12, Ntox37 or Ntox7 families. 54
58190 379730 pfam15567 Imm35 Immunity protein 35. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a protease domain such as Tox-PL1 and Ntox40. In some instances, it is also fused to a papain-like toxin, ADP-ribosyl glycohydrolase and a S8-like peptidase. Based on these associations the domain is likely to be a protease inhibitor. 84
58191 259699 pfam15568 Imm39 Immunity protein 39. A predicted immunity protein with an alpha+beta fold and conserved GR, and GxK motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family of nucleases. 131
58192 373946 pfam15569 Imm40 Immunity protein 40. A predicted immunity protein with an alpha+beta fold and conserved phenylalanine and tryptophan residues and a GGD motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox19 family. 93
58193 406100 pfam15570 Imm43 Immunity protein 43. A predicted immunity protein with an alpha+beta fold with conserved tryptophan, proline, aspartate, serine and arginine residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-AHH family of HNH/Endonuclease VII fold nucleases. The gene for this toxin is also found in heterogeneous poly-immunity loci. 124
58194 292206 pfam15571 Imm44 Immunity protein 44. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI1, Tox-URI2 or Tox-ParBL1 families. The gene for this toxin is also found in heterogeneous poly-immunity loci that show variations in structure even between closely related strains. 126
58195 406101 pfam15572 Imm45 Immunity protein 45. A predicted immunity protein with an alpha+beta fold and a conserved C-terminal tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-ColE3 family. 95
58196 406102 pfam15573 Imm47 Immunity protein 47. A predicted immunity protein with an alpha+beta fold and a conserved KxGDxxK motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems. 258
58197 317899 pfam15574 Imm48 Immunity protein 48. A predicted immunity protein with an all alpha-helical fold and a conserved HRG motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems. 123
58198 406103 pfam15575 Imm49 Immunity protein 49. A predicted immunity protein with an all alpha-helical fold and a conserved proline residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-REase-1 or Tox-REase-6 families. 212
58199 373950 pfam15576 DUF4661 Domain of unknown function (DUF4661). This family of proteins is found in eukaryotes. Proteins in this family are typically between 281 and 302 amino acids in length. 253
58200 406104 pfam15577 Spc7_C2 Spc7_C2. Spc7_C2 is a short family to the C-terminus of fungal Spc7 proteins. The Ndc80-MIND-Spc7 complex plays a role in kinetochore function during late meiotic prophase and throughout the mitotic cell cycle. The N-terminal region of Spc7 co-localizes with the mitotic spindle, and it has been argued that Spc7 has the potential to associate with spindle microtubules and that this association is regulated by the C-terminal part of the Spc7 protein. However, this family represents only the conserved region towards the end of the C-terminus; the majority of the C-terminal part is in family Spc7, pfam08317. 62
58201 406105 pfam15578 DUF4662 Domain of unknown function (DUF4662). This family of proteins is found in eukaryotes. Proteins in this family are approximately 290 amino acids in length. 268
58202 406106 pfam15579 Imm52 Immunity protein 52. A predicted immunity protein with an alpha+beta fold and conserved tryptophan and phenylalanine residues, and a GT motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-REase-5 family. 102
58203 373954 pfam15580 Imm53 Immunity protein 53. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan, and WE and PGW motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox24 or Ntox10 families. 90
58204 292215 pfam15581 Imm58 Immunity protein 58. A predicted immunity protein with an alpha+beta fold and YxxxD, WxG, KxxxE motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene. 109
58205 292216 pfam15582 Imm65 Immunity protein 65. A predicted immunity protein with an alpha+beta fold and a conserved YxC motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-JAB1 family. The immunity protein typically contains a signal peptide and a lipobox. 321
58206 406107 pfam15583 Imm68 Immunity protein 68. A predicted immunity protein with an alpha+beta fold and a conserved glutamate residue. The domain is often fused to one or more immunity domains in poly-immunity proteins. 152
58207 292218 pfam15584 Imm72 Immunity protein 72. A predicted immunity protein with a mostly all-beta fold and GxxE, WxDxRY motifs and a glutamate residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox48 family. This domain is often fused to the Imm71 immunity domain. 81
58208 406108 pfam15585 Imm7 Immunity protein 7. A predicted immunity protein with an alpha+beta fold and a conserved GxaG motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a Tox-REase-3 domain. 130
58209 406109 pfam15586 Imm8 Immunity protein 8. A predicted immunity protein with an alpha+beta fold and a conserved WEa (a: aromatic) motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox7 family. 114
58210 406110 pfam15587 Imm9 Immunity protein 9. A predicted immunity protein with an alpha+beta fold and a conserved lysine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family. The protein is also found in heterogeneous poly-immunity loci. 165
58211 379733 pfam15588 Imm10 Immunity protein 10. A predicted immunity protein with a mostly all-beta fold and a conserved arginine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a Pput_2613 deaminase domain. The protein is also found in heterogeneous poly-immunity loci. 104
58212 406111 pfam15589 Imm21 Immunity protein 21. A predicted immunity protein with an alpha+beta fold and conserved WxG and YxxxC motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the NGO1392-family of HNH/Endonuclease VII fold nucleases. 156
58213 373957 pfam15590 Imm27 Immunity protein 27. A predicted immunity protein with an alpha+beta fold and a conserved aspartate and GGxP motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox10 or Tox-ParB families. 67
58214 292225 pfam15591 Imm31 Immunity protein 31. A predicted immunity protein with a mostly all-beta fold and a conserved GxS motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox17 or Ntox7 families. 73
58215 406112 pfam15592 Imm41 Immunity protein 41. A predicted immunity protein with an alpha+beta fold and a conserved SF motif and tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox21, Ntox29 or Tox-ART-RSE-like ADP-ribosyltransferase families. 108
58216 406113 pfam15593 Imm42 Immunity protein 42. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox18 family. 162
58217 406114 pfam15594 Imm50 Immunity protein 50. A predicted immunity protein with an all-beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-HHH or Ntox24 families. 118
58218 406115 pfam15595 Imm51 Immunity protein 51. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan and Dx[DE] motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-RES or Tox-URI1 families. Proteins containing this domain are present in heterogeneous poly immunity loci in polymorphic toxin systems. 105
58219 373959 pfam15596 Imm57 Immunity protein 57. A predicted immunity protein with a mostly alpha-helical fold and conserved aspartate and cysteine residues and an SE motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the LD-peptidase or Tox-Caspase families. 111
58220 406116 pfam15597 Imm59 Immunity protein 59. A predicted immunity protein with an alpha+beta fold and a conserved [DE]R motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox13 or Ntox40 families. In some proteins this domain is fused to the Imm38, pfam15599 immunity domain. 100
58221 406117 pfam15598 Imm61 Immunity protein 61. A predicted immunity protein with an alpha+beta fold and a conserved arginine. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox40 family. 153
58222 406118 pfam15599 Imm63 Immunity protein 63. A predicted immunity protein with an alpha+beta fold and conserved E+G and ExxY motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox40, Tox-CdiAC and Tox-ARC families. The protein is also found in poly-immunity loci in polymorphic toxin systems. 83
58223 406119 pfam15600 Imm64 Immunity protein 64. A predicted immunity protein with an alpha+beta fold and a conserved DxEA motif and arginine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-ColD family. 207
58224 406120 pfam15601 Imm70 Immunity protein 70. A predicted immunity protein with an alpha+beta fold and conserved tyrosine and tryptophan residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-REase-10 family. 131
58225 379737 pfam15602 Imm71 Immunity protein 71. A predicted immunity protein with a mostly alpha-helical fold and conserved arginine and phenylalanine residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox48 family. This domain is often fused to the Imm72 immunity domain. 158
58226 406121 pfam15603 Imm74 Immunity protein 74. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-ARC family. This domain is also found in heterogeneous poly-immunity loci. 80
58227 406122 pfam15604 Ntox15 Novel toxin 15. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-alpha helical fold and a conserved HxxD motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion systems. This is shown to be a type IV secretion system protein that behaves as DNase. 154
58228 373963 pfam15605 Ntox28 Bacterial toxin 28. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and [DN]HxxE motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5 or type 7 secretion system. 104
58229 292240 pfam15606 Ntox34 Bacterial toxin 34. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha helical fold and conserved lysine and cysteine residues, and GNxxD and WxCxH motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system. 80
58230 406123 pfam15607 Ntox44 Bacterial toxin 44. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha-helical fold with conserved DxK, GNxxxG, and DxxxD motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6 or type 7 secretion systems. 94
58231 406124 pfam15608 PELOTA_1 PELOTA RNA binding domain. This RNA binding Pelota domain is at the C-terminus of a PRTase family. These PRTase+Pelota genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 79
58232 406125 pfam15609 PRTase_2 Phosphoribosyl transferase. This PRTase family, and C-terminal TRSP domain, are related to OPRTases, and are predicted to use Orotate as substrate. These genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 189
58233 379741 pfam15610 PRTase_3 PRTase ComF-like. This PRTase family is related to the ComF PRTases. These genes are found in the smaller biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress-response. 265
58234 406126 pfam15611 EH_Signature EH_Signature domain. This domain with a strongly conserved glutamate at the N-terminus and a histidine at the C-terminus, is found in a SWI2/SNF2 four gene operon. Its strict-neighborhood association with SWI2/SNF2 ATPase strongly suggests a function in conjunction with it. The other genes in the operon are a OmpA protein and a TM protein. This has a DNA related function along with the TerY-P triad. 347
58235 406127 pfam15612 WHIM1 WSTF, HB1, Itc1p, MBD9 motif 1. A conserved alpha helical motif that along with the WHIM2 and WHIM3 motifs, and the DDT domain comprise an alpha helical module found in diverse eukaryotic chromatin proteins. Based on the Ioc3 structure, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. The conserved basic residue in WHIM1 is involved in packing with the DDT motif. The module shows a great domain architectural diversity and is often combined with other modified histone peptide recognising and DNA binding domains, some of which discriminate methylated DNA. 46
58236 406128 pfam15613 WSD Williams-Beuren syndrome DDT (WSD), D-TOX E motif. This family represents the combined alpha-helical module found in diverse eukaryotic chromatin proteins. Based on the Ioc3 structure, the N-terminus of this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. The acidic residue from the GxD signature at the N-terminus is a major determinant of the interaction between the ISWI and WHIM motifs. The N-terminal portion also contacts the inter-nucleosomal linker DNA. The module shows a great domain architectural diversity and is often combined with other modified histone peptide recognizing and DNA binding domains, some of which discriminate methylated DNA. The WSD module constitutes the inter-nucleosomal linker DNA binding site in the major groove of DNA, and was first identified as WSD, the D-TOX E motif of plant homeodomains homologous with the mutant transcription factor causing Williams-Beuren syndrome in association with the DDT-domain. 69
58237 406129 pfam15615 TerB_C TerB-C domain. TerB-C occurs C-terminal of TerB in TerB-N containing proteins. This domain displays multiple conserved acidic residues (TerBC). The presence of conserved acidic residues in both TerB-N and TerB-C suggests that they, like the TerB domain, might also chelate metals. These two domains may also occur together in the same protein independently of TerB. 143
58238 406130 pfam15616 TerY_C TerY-C metal binding domain. TerY-C is found C-terminal to TerY-like vWA domains in some proteins. It has 8 conserved metal chelating cysteines or histidines. It occasionally occurs as solos. 129
58239 406131 pfam15617 C-C_Bond_Lyase C-C_Bond_Lyase of the TIM-Barrel fold. This family of TIM-Barrel fold C-C bond lyase is related to citrate-lyase. These genes are found in the biosynthetic operon, with other enzymatic domains, associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 320
58240 406132 pfam15619 Lebercilin Ciliary protein causing Leber congenital amaurosis disease. Lebercilin is a family of eukaryotic ciliary proteins. Mutations in the gene, LCA5, are implicated in the disease Leber congenital amaurosis. In photoreceptors, lebercilin is uniquely localized at the cilium that bridges the inner and outer segments. Lebercilin functions as an integral element of selective protein transport through photoreceptor cilia. Lebercilin specifically interacts with the intraflagellar transport (IFT), and disruption of IFT can lead to Leber congenital amaurosis. 186
58241 406133 pfam15620 CENP-C_mid Centromere assembly component CENP-C middle DNMT3B-binding region. CENP-C is a component of the centromere assembly complex in eukaryotes. CENP-C recruits the DNA methyltransferases DNMT3B, in order to establish the necessary epigenetic DNA-methylation essential for maintenance of chromatin structure and genomic stability. This middle region of CENP-C is the binding-domain for DNMT3B. Binding of CENP-C and DNMT3B to DNA occurs at both centromeric and peri-centromeric satellite repeats. CENP-C and DNMT3B regulate the histone code in these regions. 259
58242 406134 pfam15621 PROL5-SMR Proline-rich submaxillary gland androgen-regulated family. SMR is a family of proteins found in eukaryotes. The family of SMR proteins is expressed in the submaxillary gland. SMR members may play a role in protection or detoxification. 102
58243 406135 pfam15622 CENP_C_N Kinetochore assembly subunit CENP-C N-terminal. CENP-C is a vertebrate family that forms a core component of the centromeric chromatin. On depletion of CENP-C proper formation of both centromeres and kinetochores is prevented. The N-terminal of CENP-C is necessary for recruitment of some but not all components of the Mis12 complex of the kinetochore. 287
58244 406136 pfam15623 CT47 Cancer/testis gene family 47. CT47 is a family of proteins found in eukaryotes. Proteins in this family are typically between 262 and 291 amino acids in length. There is a conserved HIL sequence motif. The function of this family is not known. 278
58245 406137 pfam15624 Mif2_N Kinetochore CENP-C fungal homolog, Mif2, N-terminal. Mif2_N is a family of fungal proteins homologous to mammalian CENP-C. On depletion of CENP-C proper formation of both centromeres and kinetochores is prevented. The N-terminal of CENP-C is necessary for recruitment of some but not all components of the Mis12 complex of the kinetochore. 133
58246 406138 pfam15625 CC2D2AN-C2 CC2D2A N-terminal C2 domain. Many ciliary proteins are involved in ciliogenesis and implicated in ciliopathies. A recent study has shown that many of them contain various new versions of C2 domains which are predicted to mediate membrane localizations for Y-shaped linkers of transition zone of cilia. This is the first C2 domain of ciliary CC2D2A proteins which also have another C2 domain (CC2D2AC-C2) and a new inactive transglutaminase-like peptidase domain (CC2D2A-TGL). 176
58247 373975 pfam15626 mono-CXXC single CXXC unit. This is a solo version of the zf-CXXC domain with a conserved CXXCXXCX(n)C, zinc-binding motif. This is, thus far, only detected in the plant lineage in diverse chromatin proteins. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases. The regular zf-CXXC domain binds nonmethyl-CpG dinucleotides. 53
58248 406139 pfam15627 CEP76-C2 CEP76 C2 domain. Many ciliary proteins are involved in ciliogenesis and implicated in ciliopathies. A recent study has shown that many of them contain various new versions of C2 domains which are predicted to mediate membrane localizations for Y-shaped linkers of transition zone of cilia. This is the new C2 domain that is contained by ciliary CEP76 proteins. 154
58249 373977 pfam15628 RRM_DME RRM in Demeter. This is a predicted RRM-fold domain present at the C-terminus of Demeter-like glycosylases. These proteins are involved in DNA demethylation in plants where they catalyze removal of the 5mC base and subsequently cleave the backbone through lyase activity. Orthologs of Demeter are present in plants and stramenopiles. The RRM fold domain is predicted to facilitate interaction of the catalytic domain with ssDNA or regulatory RNA. 102
58250 406140 pfam15629 Perm-CXXC Permuted single zf-CXXC unit. This is a permuted version of a single unit of the zf-CXXC domain that is detected in the Demeter-like proteins of land plants. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases. The classical zf-CXXC domain binds nonmethyl-CpG dinucleotides. 32
58251 406141 pfam15630 CENP-S CENP-S protein. CENP-S is a family of vertebrate and fungal kinetochore component proteins. CENP-S complexes with CENP-X to form a stable CENP-T-W-S-X heterotetramer. 76
58252 406142 pfam15631 Imm-NTF2-2 NTF2 fold immunity protein. A predicted immunity protein of the NTF2 fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-NucA family. This domain is also fused to ankyrin repeats and the pfam14025. 72
58253 406143 pfam15632 ATPgrasp_Ter ATP-grasp in the biosynthetic pathway with Ter operon. This ATP-grasp family is related to carbamoyl phosphate synthetase. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response. 131
58254 406144 pfam15633 Tox-ART-HYD1 HYD1 signature containing ADP-ribosyltransferase. A predicted toxin of the ADP-ribosyltransferase superfamily present in bacterial polymorphic toxin systems. The domain has characteristic histidine, tyrosine and aspartate residues that comprise the active site. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, or type 7 secretion system. 97
58255 292267 pfam15634 Tox-ART-HYE1 HYE1 signature containing ADP-ribosyltransferase. A predicted toxin of the ADP-ribosyltransferase superfamily present in bacterial polymorphic toxin systems. The domain has characteristic histidine, tyrosine and glutamate residues that comprise the active site. 282
58256 373981 pfam15635 Tox-GHH2 GHH signature containing HNH/Endo VII superfamily nuclease toxin 2. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic s[AGP]HH signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type secretion system. 112
58257 406145 pfam15636 Tox-GHH GHH signature containing HNH/Endo VII superfamily nuclease toxin. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus. 78
58258 373983 pfam15637 Tox-HNH-HHH HNH/Endo VII superfamily nuclease toxin with a HHH motif. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with characteristic conserved s[GD]xxR and HHH motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion system. 103
58259 406146 pfam15638 Tox-MPTase2 Metallopeptidase toxin 2. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems. 196
58260 406147 pfam15639 Tox-MPTase3 Metallopeptidase toxin 3. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems. 137
58261 406148 pfam15640 Tox-MPTase4 Metallopeptidase toxin 4. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems. 132
58262 373985 pfam15641 Tox-MPTase5 Metallopeptidase toxin 5. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems. 110
58263 292272 pfam15642 Tox-ODYAM1 Toxin in Odyssella and Amoebophilus. A predicted all-alpha fold toxin present in bacterial polymorphic toxin systems of the endosymbionts Odyssella and Amoebophilus. 385
58264 317949 pfam15643 Tox-PL-2 Papain fold toxin 2. A papain fold toxin domain found in bacterial polymorphic toxin systems. 102
58265 406149 pfam15644 Gln_amidase Papain fold toxin 1, glutamine deamidase. A papain fold toxin domain found in bacterial polymorphic toxin systems. In these systems they might function either as a releasing peptidase or toxin. In Shigella flexneri, UniProtKB:Q8VSD5, this protein is expressed from a plasmid, and delivered into the host via the type III secretion system where it deamidates the glutamine residue at position 100 in ubiquitin-activating enzyme E2, UBC13, to a glutamic acid residue. Invasion of host cells by pathogens normally invokes an acute inflammatory response through activating the TRAF6-mediated signalling pathway. UBC13 helps to activate TRAF6. Thus deamidation of UBC13 results in the dampening of the inflammatory response. The key glutaminase deamidase activity is mediated by a cys-his-glu triad, present in all members of the family. 112
58266 406150 pfam15645 Tox-PLDMTX Dermonecrotoxin of the Papain-like fold. A papain fold toxin domain found in bacterial polymorphic toxin systems. 142
58267 373987 pfam15646 Tox-REase-2 Restriction endonuclease fold toxin 2. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 7 or PrsW-peptidase dependent secretion system. 129
58268 373988 pfam15647 Tox-REase-3 Restriction endonuclease fold toxin 3. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or PrsW-peptidase dependent secretion system. 102
58269 406151 pfam15648 Tox-REase-5 Restriction endonuclease fold toxin 5. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses. 96
58270 373990 pfam15649 Tox-REase-7 Restriction endonuclease fold toxin 7. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or type 7 secretion system. 86
58271 379747 pfam15650 Tox-REase-9 Restriction endonuclease fold toxin 9. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion system. 87
58272 406152 pfam15651 Tox-SGS Salivary gland secreted protein domain toxin. An alpha+beta fold domain with four conserved cysteine residues and a conserved [DE]xx[ND] motif. This domain is mainly present at the C-terminus of RHS repeats containing proteins in insects and crustaceans. Although no bacterial homologs have been identified, the domain architecture suggests an origin from bacterial polymorphic toxin systems. 96
58273 379748 pfam15652 Tox-SHH HNH/Endo VII superfamily toxin with a SHH signature. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with two conserved histidine residues. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6 or type 7 secretion system. 97
58274 406153 pfam15653 Tox-URI2 URI fold toxin 2. A predicted toxin of the URI nuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system. 86
58275 406154 pfam15654 Tox-WTIP Toxin with a conserved tryptophan and TIP tripeptide motif. A predicted toxin domain with two membrane spanning alpha helices and RxxR, Wx[ST]IP motifs. The domain is present in bacterial polymorphic toxin systems. The toxin is usually exported by the type 2 or Photorhabdus virulence cassette (PVC)-type secretion system. 74
58276 406155 pfam15655 Imm-NTF2 NTF2 fold immunity protein. A predicted immunity protein of the NTF2 fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-JAB-2 family. 129
58277 317959 pfam15656 Tox-HDC Toxin with a H, D/N and C signature. A predicted alpha/beta fold peptidase domain with a strongly conserved triad of a histidine, aspartate/asparagine and cysteine residues that are predicted to comprise the active site of the predicted peptidase. Proteins bearing this predicted toxin domain are particularly common in both intracellular and extracellular pathogens. 130
58278 373994 pfam15657 Tox-HNH-EHHH HNH/Endo VII superfamily nuclease toxins. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic conserved [ED]H motif and two histidine residues. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion system. 69
58279 317961 pfam15658 Latrotoxin_C Latrotoxin C-terminal domain. A toxin domain present in arthropod alphaproteobacterial, gammaproteobacterial endosymbionts and also at the C-termini of the latrotoxins of the black widow spider. The domain is characterized by a conserved, hydrophobic helix and is predicted to associate with the cell membrane. 137
58280 406156 pfam15659 Toxin-JAB1 JAB-like toxin 1. 86
58281 317963 pfam15660 Imm75 Putative Immunity protein 75. This family is highly conserved suggesting it might derive from a phage protein. Members are less than 90 residues in length, and the function is not known. 84
58282 406157 pfam15661 CF222 C6orf222, uncharacterized family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 618 and 652 amino acids in length. 648
58283 317965 pfam15662 SPATA3 Spermatogenesis-associated protein 3 family. The SPATA3 family of proteins is expressed significantly in testis and faintly in epididymis in the ten tissues of testis, ovary, spleen, kidney, lung, heart, brain, epididymis, liver and skeletal muscle in mouse. Members are not expressed in the eight other tissues. This suggests that SPATA3 plays potential roles in spermatogenesis cell apoptosis or spermatogenesis. 191
58284 406158 pfam15663 zf-CCCH_3 Zinc-finger containing family. zf-CCCH_3 family is found in eukaryotes, and is typically between 155 and 169 amino acids in length. 110
58285 406159 pfam15664 TMEM252 Transmembrane protein 252 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 152 and 182 amino acids in length. The function is not known. 139
58286 406160 pfam15665 FAM184 Family with sequence similarity 184, A and B. The function of FAM184 is not known. 211
58287 406161 pfam15666 HGAL Germinal center-associated lymphoma. HGAL is a family of mammalian sequences typically between 104 and 179 amino acids in length. Members were discovered in a search for proteins precipitating diffuse large B-cell lymphomas. HGAL interacts with the cytoskeleton and aids the activity of interleukin-6 on cell migration. It also modulates the RhoA signalling pathway. 88
58288 406162 pfam15667 GDWWSH Protein of unknown function with motif GDWWSH. This family of proteins is found in eukaryotes. Proteins in this family are typically between 135 and 289 amino acids in length. There are three conserved sequence motifs: GDWWSH, RSDF and KRHG. 238
58289 406163 pfam15668 DUF4663 Domain of unknown function (DUF4663). This family of proteins is found in eukaryotes. Proteins in this family are typically between 289 and 334 amino acids in length. There are two completely conserved residues (W and G) that may be functionally important. 335
58290 406164 pfam15669 CCDC24 Coiled-coil domain-containing protein 24 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 187 and 319 amino acids in length. There are two completely conserved residues (G and P) that may be functionally important. 195
58291 406165 pfam15670 Spem1 Spermatid maturation protein 1. Spem1 is a family of mammalian proteins. Proteins are exclusively expressed in the cytoplasm of the last three steps of spermiogenesis in the mouse testis, and male mice deficient in Spem1 are completely infertile because of deformed sperm. 254
58292 406166 pfam15671 PRR18 Proline-rich protein family 18. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 297 amino acids in length. The function is not known but there are many highly conserved proline residues. 264
58293 406167 pfam15672 Mucin15 Cell-membrane associated Mucin15. Mucin15 is a family of vertebrate mucins associated with the cell-membrane. The function is not known. Members of the family are typically between 284 and 335 amino acids in length. 315
58294 374006 pfam15673 Ciart Circadian-associated transcriptional repressor. Circadian-associated transcriptional repressor (Ciart or Chrono) is a negative regulatory component of the circadian clock. It functions as a transcriptional repressor, modulating BMAL1-CLOCK activity. It also regulates metabolic pathways such as the glucocorticoid response triggered by behavioral stress. 278
58295 406168 pfam15674 CCDC23 Coiled-coil domain-containing protein 23. This family of proteins is found in eukaryotes. Proteins in this family are typically between 66 and 78 amino acids in length. There are two completely conserved residues (K and E) that may be functionally important. 57
58296 406169 pfam15675 CLLAC CLLAC-motif containing domain. This short domain is found in chordates. It carries a highly conserved CLLAC sequence motif. The function is not known. 30
58297 406170 pfam15676 S6OS1 Six6 opposite strand transcript 1 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 114 and 587 amino acids in length. The function is not known. 557
58298 374010 pfam15677 CEND1 Cell cycle exit and neuronal differentiation protein 1. This family of neuron-specific proteins may have a role in the differentiation of neuroblastoma cells and neuronal precursors. It is involved in development of the cerebellum. 143
58299 406171 pfam15678 SPICE Centriole duplication and mitotic chromosome congression. SPICE is a family of proteins found in chordates. It localizes to spindle microtubules in mitosis and to centrioles throughout the cell cycle. Deletion of SPICE compromises the architecture of spindles, the integrity of the spindle pole and the process of aligning chromosomes on the spindle (chromosome congression). 408
58300 406172 pfam15679 DUF4665 Domain of unknown function (DUF4665). This family of proteins is found in eukaryotes. Proteins in this family are typically between 45 and 100 amino acids in length. 99
58301 406173 pfam15680 OFCC1 Orofacial cleft 1 candidate gene 1 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 125 and 276 amino acids in length. 109
58302 406174 pfam15681 LAX Lymphocyte activation family X. LAX is a family of proteins found in chordates. LAX is membrane-associated and expressed in B cells, T cells, and other lymphoid-specific cell types. It down-regulates antigen-receptor signalling in T cells by inhibiting TCR-mediated p38 MAPK activation. 350
58303 374015 pfam15682 Mustang Musculoskeletal, temporally activated-embryonic nuclear protein 1. Mustang is a family of short, approx 80 residue, proteins found in chordates. It localizes to the nucleus and specifically, spatially in mesenchymal cells of the developing limbs and tail as well as in the fracture callus, especially in periosteal osteoprogenitor cells, proliferating chondrocytes, and young active osteoblasts. It is highly expressed during embryogenesis and inactivated in most adult tissues with the exception of skeletal muscle and tendon where it is acutely and differentially expressed during bone regeneration. 75
58304 406175 pfam15683 TDRP Testis development-related protein. TDRP is a family of proteins found in chordates. It is predominantly expressed in the testis, distributed in both the cytoplasm and the nuclei of spermatogenic cells. It may act as a nuclear factor with an important role in spermatogenesis. 145
58305 406176 pfam15684 AROS Active regulator of SIRT1, or 40S ribosomal protein S19-binding 1. AROS is a family of chordate proteins active in the nucleolus. It has a stretch of polylysines at the N-terminus and in the middle regions and it localizes to the nucleus and especially the nucleolus in high concentrations. It binds to the 40S ribosomal protein RPS19, which is implicated in erythropoiesis. AROS is an active regulator of Sirtuin (SIRT1), an NAD+-dependent deacetylase protein that plays a role in cell survival and hormonal signalling, and AROS regulates the activity of SIRT1 by enhancing SIRT1-mediated de-acetylation of p53 and thus regulates growth of the cell. 128
58306 406177 pfam15685 GGN Gametogenetin. GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking. 639
58307 406178 pfam15686 LYRIC Lysine-rich CEACAM1 co-isolated protein family. LYRIC is a family of proteins found in eukaryotes. It is a type-1b membrane protein with a single transmembrane domain and localizes to the endoplasmic reticulum and the nuclear envelope. It is also found in the nucleolus, suggesting functional relationships between these two cellular compartments. It is found to be colocalized with tight junction proteins ZO-1 and occludin in polarised epithelial cells, suggesting that LYRIC is part of the tight junction complex. LYRIC has been shown to promote tumor cell migration and invasion by activating the transcription factor NF-kappaB. 419
58308 406179 pfam15687 NRIP1_repr_1 Nuclear receptor-interacting protein 1 repression 1. This domain is the first (N-terminal) repression domain of nuclear receptor-interacting protein 1. 308
58309 406180 pfam15688 NRIP1_repr_2 Nuclear receptor-interacting protein 1 repression 2. This domain is the second repression domain of nuclear receptor-interacting protein 1. 331
58310 406181 pfam15689 NRIP1_repr_3 Nuclear receptor-interacting protein 1 repression 3. This domain is the third repression domain of nuclear receptor-interacting protein 1. 88
58311 406182 pfam15690 NRIP1_repr_4 Nuclear receptor-interacting protein 1 repression 4. This domain is the fourth (C-terminal) repression domain of nuclear receptor-interacting protein 1. 311
58312 406183 pfam15691 PPP1R32 Protein phosphatase 1 regulatory subunit 32. PPP1R32 is a family of eukaryotic proteins thought to be involved in the interactome of protein phosphatase-1. 418
58313 406184 pfam15692 NKAP NF-kappa-B-activating protein. NKAP is a family of eukaryotic proteins that interacts with NF-kappa-B. It is a nuclear regulator of TNF- and IL-1-induced NF-kappa-B activation. NKAP does not interact with RIP in mammalian cells. This family is often found in association with pfam06047. 84
58314 406185 pfam15693 Med26_C Mediator complex subunit 26 C-terminal. Med26_C is the C-terminal domain of subunit 26 of the Mediator complex in eukaryotes. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator. The C-terminal domain is critical and sufficient for its assembly into Mediator and its interaction with Pol II. The most highly conserved C-terminal amino acids are critical for these interactions because deletion of the last eight amino acids from the Med26 C-terminus disrupted binding to Mediator and Pol II. 182
58315 406186 pfam15694 Med26_M Mediator complex subunit 26 middle domain. Med26_M is the middle domain of subunit 26 of Mediator. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator. 255
58316 292323 pfam15695 HERV-K_REC Rec (regulator of expression encoded by corf) of HERV-K-113. REC is a family of rec proteins from the HERV-K viral polyprotein family. Rec is a functional homolog of Rev and Rex, and binds to an RNA element, the Rec-responsive element (RcRE), in the 3'LTR of HTDV/HERV-K transcripts. Thus Rec mediates nuclear export of RNA by binding to its responsive element, RcRE, present in a transcript. The human small glutamine-rich tetratricopeptide repeat-containing protein (hSGT) that controls mitotic processes and is a checkpoint protein during pro-metaphase is found to be a Rec-interacting partner. This interaction interferes with its role as a negative regulator of the androgen receptor, leading to enhanced androgen receptor activity. HERV-K(HML-2) elements benefit from this enhanced activity, as this leads to a vicious cycle that can result in increased cell proliferation, an inhibition of apoptosis, and eventually tumorigenesis. 87
58317 406187 pfam15696 RAD51_interact RAD51 interacting motif. This motif interacts with RAD51. 39
58318 406188 pfam15697 DUF4666 Domain of unknown function (DUF4666). This family of proteins is found in plants. Proteins in this family are typically between 103 and 140 amino acids in length. There are two conserved sequence motifs: LQRS and FRR. 109
58319 406189 pfam15698 Phosphatase Phosphatase. Members of this family have phosphatase activity. 256
58320 406190 pfam15699 NPR1_interact NPR1 interacting. This family of proteins interacts via a motif at the C-terminus with the regulatory protein NPR1. 108
58321 406191 pfam15700 DUF4667 Domain of unknown function (DUF4667). This family of proteins is found in fungi. Proteins in this family are typically between 172 and 313 amino acids in length. 231
58322 406192 pfam15701 DUF4668 Domain of unknown function (DUF4668). This family of proteins is found in eukaryotes. Proteins in this family are typically between 142 and 211 amino acids in length. 162
58323 406193 pfam15702 HPS6 Hermansky-Pudlak syndrome 6 protein. 778
58324 406194 pfam15703 LAT2 Linker for activation of T-cells family member 2. 177
58325 406195 pfam15704 Mt_ATP_synt Mitochondrial ATP synthase subunit. This plant mitochondrial ATP synthase subunit may be the equivalent of the mitochondrial ATP synthase d subunit. 188
58326 406196 pfam15705 TMEM132D_N Mature oligodendrocyte transmembrane protein, TMEM132D, N-term. TMEM132D_N is the N-terminal family of chordate proteins implicated in panic disorder. TMEM132D is a single-pass transmembrane protein that is highly expressed in the cortical regions of the human and mouse brain. The function is still unknown. It may act as a cell-surface marker for oligodendrocyte differentiation. Additionally, as it may be most strongly expressed in neurons and it colocalizes with actin filaments TMEM132D may be implicated in neuronal sprouting and connectivity in brain regions important for anxiety-related behaviour. 131
58327 406197 pfam15706 TMEM132D_C Mature oligodendrocyte transmembrane protein, TMEM132D, C-term. TMEM132D_C is the C-terminal family of chordate proteins implicated in panic disorder. TMEM132D is a single-pass transmembrane protein that is highly expressed in the cortical regions of the human and mouse brain. The function is still unknown. It may act as a cell-surface marker for oligodendrocyte differentiation. Additionally, as it may be most strongly expressed in neurons and it colocalizes with actin filaments TMEM132D may be implicated in neuronal sprouting and connectivity in brain regions important for anxiety-related behaviour. 84
58328 406198 pfam15707 MCCD1 Mitochondrial coiled-coil domain protein 1. This is a family of uncharacterized proteins known as mitochondrial coiled-coil domain protein 1. 90
58329 406199 pfam15708 PRR20 Proline-rich protein family 20. This family of proteins is found in eukaryotes. Proteins in this family are typically between 73 and 221 amino acids in length. There is a conserved AYV sequence motif. 221
58330 406200 pfam15709 DUF4670 Domain of unknown function (DUF4670). This family of proteins is found in eukaryotes. Proteins in this family are typically between 373 and 763 amino acids in length. 522
58331 406201 pfam15710 DUF4671 Domain of unknown function (DUF4671). This family of proteins is found in eukaryotes. Proteins in this family are typically between 385 and 652 amino acids in length. 678
58332 406202 pfam15711 ILEI Interleukin-like EMT inducer. ILEI is a family of proteins found in vertebrates. It is heavily involved in the process of the transition from epithelial to mesenchymal tissue - EMT - during all of embryonic development, cancer progression, metastasis, and chronic inflammation/fibrosis. ILEI is upregulated exclusively at the level of translation, and abnormal ILEI expression, ie cytoplasmic over-expression instead of vesicular localization, is associated with EMT in human cancerous tissue. In order to induce and maintain the EMT of hepatocytes in a TGF-beta-independent fashion ILEI needs the cooperation of oncogenic Ras. 89
58333 406203 pfam15712 NPAT_C NPAT C-terminus. 685
58334 406204 pfam15713 PTPRCAP Protein tyrosine phosphatase receptor type C-associated. 150
58335 406205 pfam15714 SpoVT_C Stage V sporulation protein T C-terminal, transcription factor. SpoVT_C is the C-terminal part of the stage V sporulation protein T, a transcription factor involved in endospore formation in Gram-positive bacteria such as Bacillus subtilis. Sporulation is induced by conditions of environmental stress to protect the genome. SpoVT behaves as a tetramer that shows an overall significant distortion mediated by electrostatic interactions. Two monomers dimerize via the highly charged N-terminal AbrB-like domains, family pfam04014, to form swapped-hairpin beta-barrels. These asymmetric dimers then form tetramers through the formation of mixed helix bundles between their C-terminal domains. The C-termini themselves fold as GAF (cGMP-specific and cGMP-stimulated phosphodiesterases, Anabaena adenylate cyclases, and Escherichia coli FhlA) domains. 128
58336 406206 pfam15715 PAF PCNA-associated factor. 131
58337 406207 pfam15716 DUF4672 Domain of unknown function (DUF4672). This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 199 amino acids in length. 173
58338 406208 pfam15717 PCM1_C Pericentriolar material 1 C-terminus. 612
58339 406209 pfam15718 MNR Protein moonraker. Protein moonraker is a centriolar satellite component involved in centriole duplication. It promotes centriole duplication by localizing WDR62 to the centrosome. 933
58340 406210 pfam15719 DUF4674 Domain of unknown function (DUF4674). This family of proteins is found in eukaryotes. Proteins in this family are typically between 126 and 221 amino acids in length. 191
58341 406211 pfam15720 DUF4675 Domain of unknown function (DUF4675). This family of proteins is found in eukaryotes. Proteins in this family are approximately 190 amino acids in length. 198
58342 406212 pfam15721 ANXA2R Annexin-2 receptor. This family of proteins acts as annexin-2 receptors. 190
58343 406213 pfam15722 FAM153 FAM153 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 289 amino acids in length. 114
58344 406214 pfam15723 MqsR_toxin Motility quorum-sensing regulator, toxin of MqsA. MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins. 96
58345 406215 pfam15724 TMEM119 TMEM119 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 217 and 283 amino acids in length. 252
58346 292353 pfam15725 RCDG1 Renal cancer differentiation gene 1 protein. This family includes human protein C4orf46, also known as renal cancer differentiation gene 1 protein (RCDG1). 83
58347 292354 pfam15726 DUF4677 Domain of unknown function (DUF4677). This family of proteins is found in eukaryotes. Proteins in this family are typically between 157 and 195 amino acids in length. 198
58348 406216 pfam15727 DUF4678 Domain of unknown function (DUF4678). This family of proteins is found in eukaryotes. Proteins in this family are typically between 318 and 395 amino acids in length. 380
58349 406217 pfam15728 DUF4679 Domain of unknown function (DUF4679). This family of proteins is found in eukaryotes. Proteins in this family are typically between 213 and 412 amino acids in length. 399
58350 406218 pfam15729 ALS2CR11 Amyotrophic lateral sclerosis 2 candidate 11. This family of proteins is found in eukaryotes. Proteins in this family are typically between 286 and 727 amino acids in length. 418
58351 406219 pfam15730 DUF4680 Domain of unknown function (DUF4680). This family of proteins is found in eukaryotes. Proteins in this family are typically between 65 and 178 amino acids in length. There are two conserved sequence motifs: VISRM and ENE. 144
58352 292359 pfam15731 MqsA_antitoxin Antitoxin component of bacterial toxin-antitoxin system, MqsA. MqsA_antitoxin is a family of prokaryotic proteins that act as antidotes to the mRNA interferase MqsR. It has a zinc-binding domain at the very N-terminus, indicating its DNA-binding capacity. MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins. 131
58353 374059 pfam15732 DUF4681 Domain of unknown function (DUF4681). This family of proteins is found in eukaryotes. Proteins in this family are typically between 101 and 127 amino acids in length. 127
58354 406220 pfam15733 DUF4682 Domain of unknown function (DUF4682). This domain family is found in eukaryotes, and is typically between 152 and 183 amino acids in length. The family is found in association with pfam00566. There is a conserved NHLL sequence motif. 122
58355 406221 pfam15734 MIIP Migration and invasion-inhibitory. This family of proteins binds to insulin-like growth factor binding protein 2 (IGFBP-2) and inhibits the invasion of glioma cells. 337
58356 406222 pfam15735 DUF4683 Domain of unknown function (DUF4683). This domain family is found in eukaryotes, and is typically between 384 and 400 amino acids in length. 391
58357 406223 pfam15736 DUF4684 Domain of unknown function (DUF4684). This family of proteins is found in eukaryotes. Proteins in this family are typically between 531 and 1277 amino acids in length. 365
58358 406224 pfam15737 DUF4685 Domain of unknown function (DUF4685). This domain family is found in eukaryotes, and is typically between 106 and 131 amino acids in length. There are two conserved sequence motifs: SGE and VRF. 117
58359 406225 pfam15738 YafQ_toxin Bacterial toxin of type II toxin-antitoxin system, YafQ. YafQ is a family of bacterial toxin ribonucleases of type II toxin-antitoxin systems. The E. coli gene is expressed from the dinB operon. The cognate antitoxin for the E. coli protein is DinJ, in family RelB_antitoxin, pfam02604. 88
58360 406226 pfam15739 TSNAXIP1_N Translin-associated factor X-interacting N-terminus. This domain is found at the N-terminus of translin-associated factor X-interacting protein, a protein which may play a role in spermatogenesis. 104
58361 406227 pfam15740 PPP1R26_N Protein phosphatase 1 regulatory subunit 26 N-terminus. This domain represents the N-terminus of protein phosphatase 1 regulatory subunit 26. 872
58362 406228 pfam15741 LRIF1 Ligand-dependent nuclear receptor-interacting factor 1. This family of proteins interacts with the retinoic acid receptor RARalpha and inhibits its ligand-dependent transcriptional activation. 739
58363 406229 pfam15742 DUF4686 Domain of unknown function (DUF4686). This family of proteins is found in eukaryotes. Proteins in this family are typically between 498 and 775 amino acids in length. There is a conserved DLK sequence motif. 384
58364 406230 pfam15743 SPATA1_C Spermatogenesis-associated C-terminus. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. There is a single completely conserved residue E that may be functionally important. 149
58365 406231 pfam15744 UPF0492 Uncharacterized protein family UPF0492. This family of proteins is found in eukaryotes. Proteins in this family are typically between 78 and 408 amino acids in length. 364
58366 406232 pfam15745 AP1AR AP-1 complex-associated regulatory protein. 275
58367 374072 pfam15746 TMEM215 TMEM215 family. This family of proteins is found in eukaryotes. Proteins in this family are approximately 230 amino acids in length. 225
58368 374073 pfam15747 DUF4687 Domain of unknown function (DUF4687). This family of proteins is found in eukaryotes. Proteins in this family are typically between 76 and 140 amino acids in length. 120
58369 406233 pfam15748 CCSAP Centriole, cilia and spindle-associated. This family of microtubule-binding proteins may play a role in embryonic brain development and cilia beating. 255
58370 406234 pfam15749 MRNIP MRN-interacting protein. This family is found in eukaryotes. Family members include MRN complex-interacting protein (MRNIP), which plays a role in preventing the accumulation of damaged DNA in cells. It associates with the MRE11-RAD50-NBS1 (MRN) damage-sensing complex and is rapidly recruited to sites of DNA damage. Phosphorylation of a serine promotes nuclear localization of MRNIP. 100
58371 406235 pfam15750 UBZ_FAAP20 Ubiquitin-binding zinc-finger. This domain is the ubiquitin-binding zinc-finger of the Fanconi anemia-associated protein of 20 kDa. 35
58372 406236 pfam15751 FANCA_interact FAAP20 FANCA interaction domain. This domain is found at the N-terminus of Fanconi anemia-associated protein of 20 kDa (FAAP20), where it is responsible for interaction with Fanconi anemia group A protein (FANCA). 108
58373 406237 pfam15752 DUF4688 Domain of unknown function (DUF4688). This family of proteins is found in eukaryotes. Proteins in this family are typically between 331 and 596 amino acids in length. 400
58374 406238 pfam15753 BLOC1S3 Biogenesis of lysosome-related organelles complex 1 subunit 3. This family of proteins are components of the biogenesis of lysosome-related organelles complex-1 (BLOC-1). 168
58375 406239 pfam15754 SPESP1 Sperm equatorial segment protein 1. 318
58376 406240 pfam15755 DUF4689 Domain of unknown function (DUF4689). This family of proteins is found in eukaryotes. Proteins in this family are typically between 202 and 224 amino acids in length. 223
58377 406241 pfam15756 DUF4690 Domain of unknown function (DUF4690). This family of proteins is found in eukaryotes. Proteins in this family are typically between 100 and 122 amino acids in length. There are two conserved sequence motifs: LGPGAI and LRKF. 96
58378 406242 pfam15757 Amelotin Amelotin. This ameloblast-specific family of proteins may play a role in dental enamel formation. 194
58379 406243 pfam15758 HRCT1 Histidine-rich carboxyl terminus protein 1. 77
58380 406244 pfam15759 TMEM108 TMEM108 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 258 and 575 amino acids in length. 511
58381 406245 pfam15760 DLEU7 Leukemia-associated protein 7. 194
58382 406246 pfam15761 IMUP Immortalisation up-regulated protein. This family of proteins is found in eukaryotes. Proteins in this family are approximately 100 amino acids in length. There are two conserved sequence motifs: GDPK and KKPK. 101
58383 406247 pfam15762 DUF4691 Domain of unknown function (DUF4691). This family of proteins is found in eukaryotes. Proteins in this family are typically between 71 and 317 amino acids in length. 179
58384 406248 pfam15763 DUF4692 Domain of unknown function (DUF4692). This family of proteins is found in eukaryotes. Proteins in this family are approximately 170 amino acids in length. 167
58385 406249 pfam15764 DUF4693 Domain of unknown function (DUF4693). This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 436 amino acids in length. 284
58386 406250 pfam15765 DUF4694 Domain of unknown function (DUF4694). This family of proteins is found in eukaryotes. Proteins in this family are typically between 154 and 217 amino acids in length. There is a conserved SSGY sequence motif. 155
58387 374090 pfam15766 DUF4695 Domain of unknown function (DUF4695). This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 206 amino acids in length. There is a conserved RFKTQP sequence motif. 107
58388 406251 pfam15767 DUF4696 Domain of unknown function (DUF4696). This family of proteins is found in eukaryotes. Proteins in this family are typically between 599 and 780 amino acids in length. There is a conserved AFP sequence motif. 583
58389 406252 pfam15768 CC190 Coiled-coil domain-containing protein 190. This family of proteins is found in eukaryotes. Proteins in this family are typically between 234 and 297 amino acids in length. 269
58390 406253 pfam15769 DUF4698 Domain of unknown function (DUF4698). This family of proteins is found in eukaryotes. Proteins in this family are typically between 464 and 550 amino acids in length. 488
58391 406254 pfam15770 DUF4699 Domain of unknown function (DUF4699). This family of proteins is found in eukaryotes. Proteins in this family are typically between 303 and 319 amino acids in length. 310
58392 406255 pfam15771 IHO1 Interactor of HORMAD1 protein 1. Interactor of HORMAD1 protein 1 (IHO1, previously known as coiled-coil domain-containing protein 36 or DUF4700) is required for DNA double-strand breaks (DSBs) formation in unsynapsed regions during meiotic recombination. It is thought to function, in collaboration with SPO11-auxiliary proteins MEI4 and REC114, through the formation of DSB-promoting recombinosomes on chromatin at the onset of meiosis. 576
58393 406256 pfam15772 UPF0688 UPF0688 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 176 and 243 amino acids in length. 232
58394 406257 pfam15773 DUF4701 Domain of unknown function (DUF4701). This family of proteins is found in eukaryotes. Proteins in this family are typically between 111 and 520 amino acids in length. 502
58395 406258 pfam15774 DUF4702 Domain of unknown function (DUF4702). This family of proteins is found in eukaryotes. Proteins in this family are typically between 346 and 637 amino acids in length. 399
58396 406259 pfam15775 DUF4703 Domain of unknown function (DUF4703). This family of proteins is found in eukaryotes. Proteins in this family are typically between 149 and 210 amino acids in length. 186
58397 374100 pfam15776 PRR22 Proline-rich protein family 22. This family of proteins is found in eukaryotes. Proteins in this family are typically between 217 and 420 amino acids in length. 366
58398 406260 pfam15777 Anti-TRAP Tryptophan RNA-binding attenuator protein inhibitory protein. 59
58399 406261 pfam15778 UNC80 Cation channel complex component UNC80. UNC80 is a family of proteins found in eukaryotes, and is typically between 193 and 224 amino acids in length. NALCN and UNC80 form a complex in mouse brain, both being tyrosine-phosphorylated; this phosphorylation can be inhibited by PP1. NALCN is the cation channel activated by the substance P receptor, and the coupling from receptor to channel is facilitated by UNC80 and Src kinases rather than by a G-protein. 187
58400 406262 pfam15779 LRRC37 Leucine-rich repeat-containing protein 37 family. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The function of this protein is unknown but it is likely to be upregulated by androgen. 73
58401 406263 pfam15780 ASH Abnormal spindle-like microcephaly-assoc'd, ASPM-SPD-2-Hydin. The ASH domain, or N-terminal domain of abnormal spindle-like microcephaly-associated protein, is found in proteins associated with cilia, flagella, the centrosome and the Golgi complex. The domain is also found in Hydin and OCRL whose deficiencies are associated with hydrocephalus and Lowe oculocerebrorenal syndrome (OCRL), respectively. The fact that human ASPM protein carries an ASH domain indicates possible roles for ASPM in sperm flagella or in ependymal cells' cilia. The presence of ASH in centrosomal and ciliary proteins indicates that ASPM may possess roles not only in mitotic spindle regulation, but also in ciliary and flagellar function. 98
58402 406264 pfam15781 ParE-like_toxin ParE-like toxin of type II bacterial toxin-antitoxin system. 87
58403 406265 pfam15782 GREB1 Gene regulated by oestrogen in breast cancer. GREB1 (gene regulated by estrogen in breast cancer 1) was first identified as an oestrogen-regulated gene expressed in breast cancer. Its exact function is not known but its expression is regulated by the coordinated binding of oestrogen-receptors to distal sites interacting with Pol II to activate gene transcription from core promoters located at a considerable distance from the greb1 gene. 1925
58404 406266 pfam15783 FSIP2 Fibrous sheath-interacting protein 2. FSIP2, fibrous sheath-interacting protein 2, is the C-terminal portion of a family of proteins found in mammals. The function is not known but the domain appears to be repeated up to 10 times in some members. 876
58405 406267 pfam15784 GPS2_interact G-protein pathway suppressor 2-interacting domain. GPS2_interact is the more N-terminal domain of two co-repressor protein-families found in vertebrates. The domain is found in NCoR and SMRT proteins; N-CoR (nuclear receptor co-repressor) and SMRT (silencing mediator for retinoid and thyroid receptors) are related corepressors that mediate transcriptional repression by unliganded nuclear receptors and other classes of transcriptional repressors. GPS2 is a stoichiometric subunit of the N-CoR-HDAC3 complex. GPS2 links the complex to membrane receptor-related intracellular JNK (c-Jun amino-terminal kinase) signalling pathways. 89
58406 406268 pfam15785 SMG1 Serine/threonine-protein kinase smg-1. SMG1 is a family of eukaryotic proteins. In humans this family acts as an mRNA-surveillance protein. In C. elegans, SMG1, a phosphatidylinositol kinase-related protein kinase, is a key regulator of growth. Loss of SMG1 leads to hyperactive responses to injury and subsequent growth that continues out of control. It has an antagonistic role to mTOR signalling in these worms and possibly also in higher eukaryotes. 613
58407 406269 pfam15786 PET117 PET assembly of cytochrome c oxidase, mitochondrial. PET117 is a family of eukaryotic proteins found from fungi and plants to human. It is likely to be involved in the assembly of cytochrome C oxidase, and is found in the mitochondrion. 66
58408 406270 pfam15787 DUF4704 Domain of unknown function (DUF4704). This domain of unknown function is found in eukaryotes on neurobeachin proteins. 262
58409 406271 pfam15788 DUF4705 Domain of unknown function (DUF4705). DUF4705 is a family of repeated domains that is found in eukaryotes. It can occur up to 10 times in any one sequence. The repeat is rich in glycine and proline residues. 52
58410 374113 pfam15789 Hyr1 Hyphally regulated cell wall GPI-anchored protein 1. Hyr1 family is a repeated domain found up to 39 times in a range of fungal and vertebral proteins. Hyr1 is a hypha-specific protein. 41
58411 406272 pfam15790 EP400_N E1A-binding protein p400, N-terminal. EP400_N is a family of eukaryote proteins. The exact function of this domain is not known. This family is largely low-complexity residues. 490
58412 374115 pfam15791 DMRT-like Doublesex-and mab-3-related transcription factor C1 and C2. DMRT-like is a C-terminal domain found on eukaryotic proteins for doublesex-and mab-3-related transcription factors C1 and C2. This is not the DM DNA-binding region. The family is all disorder and low-complexity. 119
58413 406273 pfam15792 LAS2 Lung adenoma susceptibility protein 2. LAS2 is a family of eukaryotic proteins. Deletion of LAS2 is observed in approx. 40% of human lung adenocarcinomas, suggesting that loss of function of LAS2 may be a key step for promoting lung tumorigenesis. 75
58414 406274 pfam15793 FAM35_C Protein family FAM35, C-terminal. FAM35_C is a family of proteins found in eukaryotes. The function is not known. 174
58415 406275 pfam15794 CCDC106 Coiled-coil domain-containing protein 106. CCDC106, coiled-coil domain-containing protein 106, is a family of eukaryote proteins. Yeast two-hybrid screening has identified CCDC106 as a p53-interacting partner. CCDC106 is a negative regulator of p53 and may be involved in tumorigenesis in some cancers by promoting the degradation of p53 protein and inhibiting its transactivity. 223
58416 406276 pfam15795 Spec3 Ectodermal ciliogenesis protein. Spec3 is a family of eukaryotic membrane proteins. In the sea urchin, Spec3 is expressed predominantly during ectodermal ciliogenesis. 85
58417 406277 pfam15796 KELK KELK-motif containing domain of MRCK Ser/Thr protein kinase. KELK is a domain of eukaryotic proteins found in serine/threonine-protein kinase MRCK-type proteins. The region is low-complexity, but it is not a predicted disordered-binding domain. The name comes from a highly conserved sequence motif within the domain. The function is not known. 79
58418 406278 pfam15797 DUF4706 Domain of unknown function (DUF4706). This domain family is found in eukaryotes, and is approximately 110 amino acids in length. 103
58419 406279 pfam15798 PRAS Proline-rich AKT1 substrate 1. The PRAS domain family is found in eukaryotes, and is typically between 117 and 132 amino acids in length. It is a proline-rich family that can be phosphorylated by AKT, and in the phosphorylated state binds to 14-3-3. The AKT signalling pathway contributes to regulation of apoptosis after a variety of cell death stimuli, and PRAS is found to be a substrate. PRAS plays an important role in regulating cell survival downstream of the PI3-K/Akt pathway after re-perfusion injury after transient focal cerebral ischemia. Copper/zinc-SOD (SOD1), a cytosolic isoenzyme of superoxide dismutase, SOD, is highly protective against ischemia and re-perfusion injury after transient focal cerebral ischemia, and SOD1 thus contributes to the inhibition of direct oxidation of PRAS and the activation of its signalling pathway. PRAS is also a mTOR binding partner, and PRAS phosphorylation by AKT and its association with 14-3-3, a cytosolic anchor protein, are crucial for insulin to stimulate mTOR (mammalian target of rapamycin). 123
58420 406280 pfam15799 CCD48 Coiled-coil domain-containing protein 48. This family of proteins is found in eukaryotes. Proteins in this family are typically between 161 and 575 amino acids in length. 579
58421 406281 pfam15800 CiPC Clock interacting protein circadian. CiPC is a family of proteins found in eukaryotes. The protein was identified in sheep as a gene-orthologue involved in regulation of the circadian clock. Proteins in this family are typically between 220 and 400 amino acids in length. 329
58422 406282 pfam15801 zf-C6H2 zf-MYND-like zinc finger, mRNA-binding. zf-C6H2 is an unusual zinc-finger similar to zf-MYND, pfam01753. This zinc-finger is found at the N-terminus of Pfam families Exo_endo_phos pfam03372 and Peptidase_M24 pfam00557. The domain is missing in prokaryotic methionine aminopeptidases, and is a unique type of zinc-finger domain. It consists of a C2-C2 zinc-finger motif similar to the RING finger family followed by a C2H2 motif similar to zinc-fingers involved in RNA-binding. In yeast the domain chelates zinc in a 2:1 ratio. The domain is found in yeast, plants and mammals. The domain is necessary for the association of the methionine aminopeptidase with the ribosome and the normal processing of the peptidase. 46
58423 406283 pfam15802 DCAF17 DDB1- and CUL4-associated factor 17. DCAF17, DDB1- and CUL4-associated factor 17, is a family of proteins found in eukaryotes. It may function as a substrate-receptor for CUL4-DDB1 E3 ubiquitin-protein ligase complex. Mutations in the human protein, otherwise known as C2orf37, are responsible for Woodhouse-Sakati Syndrome. Woodhouse-Sakati Syndrome is a rare autosomal recessive multi-systemic disorder characterized by hypogonadism, alopecia, diabetes mellitus, mental retardation, and extrapyramidal syndrome. 474
58424 406284 pfam15803 zf-SCNM1 Zinc-finger of sodium channel modifier 1. zf-SCNM1 is a C2H2 type zinc-finger conserved in eukaryotes found at the N-terminus of SCNM1, sodium channel modifier protein 1. Phylogenetic analysis of these zinc finger sequences places SCNM1 within the U1C subfamily of RNA binding proteins that is commonly found in RNA-processing proteins, suggesting that SCNM1 is involved in splicing activities. 27
58425 406285 pfam15804 CCDC168_N Coiled-coil domain-containing protein 168. CCDC168_N is the N-terminal region of eukaryotic coiled-coil proteins 168 family. There are up to 17, on average 6, copies of this repeat in most members. 205
58426 406286 pfam15805 SCNM1_acidic Acidic C-terminal region of sodium channel modifier 1 SCNM1. SCNM1_acidic is the C-terminal acidic region of eukaryotic sodium channel modifier protein 1. Deletion of this region affects the splicing and normal activity of the sodium channel Nav1.6 from gene Scn8a. SCNM1 sits within the U1C subfamily of RNA binding proteins that is commonly found in RNA-processing proteins, suggesting that SCNM1 is involved in splicing activities. SCNM1 and LUC7L2 associate with the mammalian spliceosomal subunit U1 snRNP. 47
58427 406287 pfam15806 DUF4707 Domain of unknown function (DUF4707). This family of proteins is found in eukaryotes. The function is not known. 438
58428 406288 pfam15807 MAP17 Membrane-associated protein 117 kDa, PDZK1-interacting protein 1. MAP17 is a family of proteins found in eukaryotes. It is a small non-glycosylated two-pass membrane protein, that is overexpressed in many tumors of different origins, including carcinomas. 117
58429 406289 pfam15808 BCOR BCL-6 co-repressor, non-ankyrin-repeat region. BCOR is a domain family found in eukaryotes, and is approximately 220 amino acids in length. This domain lies just upstream of the ankyrin-repeat region at the C-terminus of BCL-6 co-repressor proteins. The function of this region is not known. 218
58430 406290 pfam15809 STG Simian taste bud-specific gene product family. STG was first isolated from rhesus monkey taste buds. The exact function of STG is not known, but it has been implicated in follicular lymphomas, though not with psoriasis at least in a Swedish population despite lying close to the PSOR1 gene-locus. 240
58431 406291 pfam15810 CCDC117 Coiled-coil domain-containing protein 117. CCDC117 is a family of coiled-coil proteins found in eukaryotes. Proteins in this family are typically between 203 and 279 amino acids in length. There is a conserved MELV sequence motif. The function is not known. 142
58432 374135 pfam15811 SVIP Small VCP/p97-interacting protein. SVIP, small VCP/p97-interacting protein, is a family of proteins found in eukaryotes. SVIP was identified by yeast two-hybrid screening to be an interactive partner of VCP/p97. Mammalian VCP/p97 and its yeast counterpart Cdc48p participate in the formation of organelles, including the endoplasmic reticulum (ER), Golgi apparatus, and nuclear envelope. Over-expression of SVIP caused the formation of large vacuoles that seemed to be derived from the ER. The family has two putative coiled-coil regions and contains proteins of approximately 80 amino acids in length. 77
58433 374136 pfam15812 MREG Melanoregulin. Melanoregulin is a family of proteins found in eukaryotes. It is a putative membrane fusion regulator. MREG forms a complex with peripherin-2. It is required for lysosome maturation and plays a role in intracellular trafficking. It is a negative regulator of melanosome intercellular transfer and it regulates intercellular melanosome transfer through palmitoylation. 148
58434 406292 pfam15813 DUF4708 Domain of unknown function (DUF4708). This family of proteins is found in eukaryotes. 274
58435 406293 pfam15814 FAM199X Protein family FAM199X. This family of proteins is found in eukaryotes. The function of FAM199X is not known. 320
58436 406294 pfam15815 MKRN1_C E3 ubiquitin-protein ligase makorin-1, C-terminal. MKRN1_C is the very C-terminus of E3 ubiquitin-protein ligase makorin-1, or MKRN1, a family of eukaryotic putative ribonucleoproteins with a distinctive array of zinc-finger motifs. MKRN1 plays an important role in modulating the homeostasis of telomere-length through a dynamic balance involving the stability of the protein hTERT. MKRN1 has been shown to be a transcriptional co-regulator and an E3 ligase. It functions simultaneously as a differentially negative regulator of p53 and p21, preferentially leading cells to p53-dependent apoptosis by suppressing p21. The exact function of the C-terminal region has not been determined. 87
58437 406295 pfam15816 TMEM82 Transmembrane protein 82. TMEM82 is a family of proteins found in eukaryotes. The function is not known. 298
58438 374141 pfam15817 TMEM40 Transmembrane protein 40 family. TMEM40 is a family of eukaryotic membrane proteins. 120
58439 406296 pfam15818 CCDC73 Coiled-coil domain-containing protein 73 family. CCDC73 is a family of eukaryotic coiled-coil containing proteins. The function is not known. The alternative name is sarcoma antigen NY-SAR-79. 1050
58440 406297 pfam15819 Fibin Fin bud initiation factor homolog. Fibin is a family of eukaryotic proteins expressed in the lateral plate mesoderm of presumptive pectoral fin bud regions. It acts as a signal molecule for the expression of Tbx5, a gene involved in the specification of fore-limb identity. Fibin is found to be expressed in cerebellum, skeletal muscle and many other embryonic as well as adult mouse tissues, suggesting roles in both embryogenesis and in adult life. Although Fibin is routed through the endoplasmic reticulum (ER) no significant evidence for secretion is found. Fibin is post-translationally modified and forms dimers when expressed heterologously and its expression is regulated by a number of cellular signalling pathways. 189
58441 292448 pfam15820 ECSCR Endothelial cell-specific chemotaxis regulator. ECSCR, endothelial cell-specific chemotaxis regulator, is a family of proteins found in eukaryotes. It is also known as ARIA for apoptosis regulator through modulating IAP expression. It is a cell surface protein that regulates endothelial chemotaxis and tube formation, and interacts with filamin A. Filamin A anchors transmembrane proteins to the actin cytoskeleton becoming a scaffold for various signalling proteins. ECSCR is also known to interact with and regulate the function of several endothelial transmembrane molecules. It has been shown to play a role in angiogenesis, a complex process involving the migration, proliferation, and lumen formation of blood vessels by endothelial cells. ECSCR appears also to regulate endothelial apoptosis, probably through modulating proteasomal degradation of cIAP-1 and cIAP-2 in endothelial cells. 104
58442 406298 pfam15821 DUF4709 Domain of unknown function (DUF4709). This domain family is found in eukaryotes, and is approximately 110 amino acids in length. There is a conserved QQL sequence motif. 109
58443 318115 pfam15822 MISS MAPK-interacting and spindle-stabilizing protein-like. MISS is a family of eukaryotic MAPK-interacting and spindle-stabilizing protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest. 238
58444 406299 pfam15823 UPF0524 UPF0524 of C3orf70. UPF0524 is a family of proteins found in eukaryotes. Proteins in this family are typically between 183 and 250 amino acids in length. The function is not known. 239
58445 374146 pfam15824 SPATA9 Spermatogenesis-associated protein 9. SPATA9, spermatogenesis-associated protein 9, or testis development protein NYD-SP16, is a family of eukaryotic proteins associated with sperm production. It is highly expressed in human testis and contains one transmembrane domain. Its localization indicates it is likely to play an important role in testicular development and spermatogenesis and may be an important factor in male infertility. 253
58446 374147 pfam15825 FAM25 FAM25 family. FAM25 is a family of proteins found in eukaryotes. Proteins in this family are typically between 54 and 95 amino acids in length. There is a conserved GEK sequence motif. The function is not known. 65
58447 374148 pfam15826 PUMA Bcl-2-binding component 3, p53 upregulated modulator of apoptosis. PUMA (p53 upregulated modulator of apoptosis) is a family of eukaryotic proteins that are a target for activation by p53. The proteins contain BH3 domains and are induced in cells after p53 activation. They bind to Bcl-2, localize to the mitochondria to induce cytochrome c release, and activate the rapid induction of apoptosis. 189
58448 406300 pfam15827 UPF0730 UPF0730 unknown protein family. UPF0730 is a family of proteins found in eukaryotes. Proteins in this family are typically between 51 and 156 amino acids in length. 46
58449 406301 pfam15828 DUF4710 Domain of unknown function (DUF4710). This family of proteins is found in eukaryotes. Proteins in this family are typically between 60 and 150 amino acids in length. 75
58450 406302 pfam15829 DUF4711 Domain of unknown function (DUF4711). This family of proteins is found in eukaryotes. Proteins in this family are typically between 130 and 288 amino acids in length. 217
58451 406303 pfam15830 DUF4712 Domain of unknown function (DUF4712). This family of proteins is found in eukaryotes. Proteins in this family are typically between 133 and 267 amino acids in length. 250
58452 406304 pfam15831 DUF4713 Domain of unknown function (DUF4713). This family of proteins is found in eukaryotes. Proteins in this family are typically between 68 and 91 amino acids in length. Members are single-pass membrane proteins. 56
58453 406305 pfam15832 FAM27 FAM27 D and E protein family. FAM27 is a family of proteins found in eukaryotes. Proteins in this family are typically between 57 and 131 amino acids in length. 92
58454 406306 pfam15833 DUF4714 Domain of unknown function (DUF4714). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 164 amino acids in length. 149
58455 406307 pfam15834 THEG4 Testis highly expressed protein 4. THEG4, testis highly expressed protein 4, is a family of proteins found in eukaryotes. Proteins in this family are typically between 152 and 232 amino acids in length. 201
58456 374157 pfam15835 DUF4715 Domain of unknown function (DUF4715). This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length. The proteins are described as coiled-coil domain-containing protein ENSP00000299415-like. 139
58457 406308 pfam15836 SSTK-IP SSTK-interacting protein, TSSK6-activating co-chaperone protein. SSTK-IP, SSTK-interacting protein or TSSK6-activating co-chaperone, is a family of proteins found in eukaryotes. SSTK-IP directly binds to HSP70, is found associated with HSP70 and HSP90 in cells, and facilitates HSP90-dependent enzymatic activation of SSTK. SSTK is a small serine/threonine kinase expressed post-meiotically and essential for male fertility along with two other serine threonine kinases. SSTK is one of the smallest protein kinases, consisting only of N- and C-lobes of a kinase catalytic domain, and forms stable associations with heat shock protein (HSP) 70 and 90. SSTK-IP, its interacting protein, thus represents the first germ cell-specific co-chaperone and protein kinase that requires the HSP90 machinery for catalytic activation. 125
58458 406309 pfam15837 DUF4716 Domain of unknown function (DUF4716). This domain family is found in eukaryotes, and is approximately 60 amino acids in length. 60
58459 406310 pfam15838 DUF4717 Domain of unknown function (DUF4717). This family of proteins is found in eukaryotes. Proteins in this family are typically between 103 and 139 amino acids in length. There are two conserved sequence motifs: LLLL and CFNLAS. 72
58460 406311 pfam15839 TEX29 Testis-expressed sequence 29 protein. TEX29, testis-expressed sequence 29 protein, is a family of proteins found in eukaryotes. Proteins in this family are typically between 39 and 150 amino acids in length. 69
58461 406312 pfam15840 ARL17 ADP-ribosylation factor-like protein 17. ARL17 is a family of proteins found in primates. Proteins in this family are typically between 82 and 130 amino acids in length. Members of this family are also referred to as NBR2 or neighbor of BRCA1 gene 2. 61
58462 374162 pfam15841 TMEM239 Transmembrane protein 239 family. This family of proteins is found in primates. Proteins in this family are typically between 152 and 198 amino acids in length. 155
58463 374163 pfam15842 DUF4718 Domain of unknown function (DUF4718). This family of proteins is found in eukaryotes. Proteins in this family are typically between 130 and 224 amino acids in length. 183
58464 374164 pfam15843 DUF4719 Domain of unknown function (DUF4719). This family of proteins is found in eukaryotes. Proteins in this family are typically between 67 and 240 amino acids in length. 207
58465 374165 pfam15844 TMCCDC2 Transmembrane and coiled-coil domain-containing protein 2. This family of proteins is found in primates. Proteins in this family are approximately 180 amino acids in length. 171
58466 406313 pfam15845 NICE-1 Cysteine-rich C-terminal 1 family. NICE-1 is a family of proteins found in primates. Proteins in this family are typically between 51 and 105 amino acids in length. 89
58467 374167 pfam15846 DUF4720 Domain of unknown function (DUF4720). This family of proteins is found in vertebrates. Proteins in this family are typically between 101 and 117 amino acids in length. 94
58468 406314 pfam15847 Loricrin Major keratinocyte cell envelope protein. Loricrin is a family of major keratinocyte cell envelope proteins found in primates. It acts as an important epidermal barrier, and is initially expressed in the granular layer comprising 70% of the total protein mass of the cornified layer. Expression of Loricrin is regulated by TNF-alpha via a c-Jun N-terminal kinase-dependent pathway. 312
58469 374169 pfam15848 DUF4721 Domain of unknown function (DUF4721). This domain family is found in primates. 107
58470 292477 pfam15849 DUF4722 Domain of unknown function (DUF4722). This family of proteins is found in vertebrates. Proteins in this family are typically between 86 and 203 amino acids in length. 167
58471 406315 pfam15851 DUF4723 Domain of unknown function (DUF4723). This family of proteins is found in mammals. There are a number of conserved cysteines but it is unlikely to be a zinc-finger family. 81
58472 406316 pfam15852 DUF4724 Domain of unknown function (DUF4724). This family of proteins is found in mammals. There is a conserved KVKPL sequence motif. 93
58473 406317 pfam15854 DUF4725 Domain of unknown function (DUF4725). This family of proteins is found in vertebrates. Proteins in this family are approximately 80 amino acids in length. 80
58474 318139 pfam15855 DUF4726 Domain of unknown function (DUF4726). This family of proteins is found in vertebrates. Proteins in this family are typically between 40 and 110 amino acids in length. 101
58475 406318 pfam15856 DUF4727 Domain of unknown function (DUF4727). This family of proteins is found in vertebrates. There are a number of conserved cysteines, but the domain is not a zinc-finger. 216
58476 406319 pfam15858 LCE6A Late cornified envelope protein 6A family. LCE6A is a family of proteins found in mammals. It was identified in a large-scale screening experiment as being involved in the barrier function of the epidermis. 81
58477 406320 pfam15859 DEC1 Deleted in esophageal cancer 1 family. DEC1 is a family of proteins found in primates. The protein has been identified as being deleted in oesophageal cancers so is also referred to as candidate tumor suppressor CTS9. Proteins in this family are approximately 70 amino acids in length. 70
58478 374176 pfam15860 DUF4728 Domain of unknown function (DUF4728). This family of arthropod proteins is functionally uncharacterized. 91
58479 406321 pfam15861 partial_CstF Partial cleavage stimulation factor domain. Partial_CstF domain is a protein domain that occurs in proteins from apicomplexan parasites. Currently (as of 2012), little is known about the function of this domain. However, it is homologous to the amino-terminal part of the cleavage stimulation factor, which is thought to be involved with mRNA maturation in mammals. 62
58480 406322 pfam15862 Coilin_N Coilin N-terminus. 138
58481 406323 pfam15863 EELM2 Extended EGL-27 and MTA1 homology domain. EELM2, the extended EGL-27 and MTA1 homology domain is a protein domain that occurs in proteins from apicomplexan parasites. Part of the EELM2 domain is homologous to the ELM2 domain, but is 'extended' in that its boundaries (the region of conservation) are longer than in the ELM2 domain. Currently (as of 2012), little is known about the function of this domain. However, some proteins that contain an EELM2 domain also contain a PHD finger domain, which is thought to be involved in chromatin remodelling. This suggests an associated role for the EELM2 domain. 170
58482 406324 pfam15864 PglL_A Protein glycosylation ligase. PglL_A is a pilin glycosylation ligase domain found in Gram negative bacteria. PglL protein O-oligosaccharyltransferases differ from the wider Wzy_C family, pfam04932, which contains both WaaL O-antigen ligases, in its substrate-specificity. PglL O-oligosaccharyltransferases (O-OTase) transfer oligosaccharide to serine or threonine in a protein. A further indication that the genes identified are PglL rather than WaaL homologs is that they are not located within lipopolysaccharide biosynthetic loci. The specific pilin glycosylation ligases are a subset of the more general bacterial protein o-oligosaccharyltransferases. 26
58483 406325 pfam15865 Fanconi_A_N Fanconi anaemia group A protein N-terminus. 333
58484 406326 pfam15866 DUF4729 Domain of unknown function (DUF4729). This family of proteins is functionally uncharacterized. This family of proteins is found in insects. Proteins in this family are typically between 238 and 666 amino acids in length. 208
58485 406327 pfam15867 Dynein_attach_N Dynein attachment factor N-terminus. This family represents the N-terminus of a dynein arm attachment factor which is required for dynein arm assembly and cilia motility. 68
58486 406328 pfam15868 MBF2 Transcription activator MBF2. MBF2 activates transcription via its interaction with TFIIA. In Bombyx mori, it has been found to form a complex with MBF1 and the DNA-binding regulator FTZ-F1. 90
58487 406329 pfam15869 TolB_like TolB-like 6-blade propeller-like. 295
58488 406330 pfam15870 EloA-BP1 ElonginA binding-protein 1. This domain family is found in eukaryotes, and is typically between 144 and 167 amino acids in length. 162
58489 406331 pfam15871 JMY Junction-mediating and -regulatory protein. JMY, Junction-mediating and -regulatory protein is also a WASP homolog-associated protein with actin, membranes and microtubules. This middle region is the coiled-coil region that putatively binds microtubules to the scaffold. This ability to interact with microtubules plays a role in membrane tubulation. 298
58490 318153 pfam15872 SRTM1 Serine-rich and transmembrane domain-containing protein 1. This family of proteins is found in eukaryotes. Proteins in this family are approximately 100 amino acids in length. 103
58491 292498 pfam15873 DUF4730 Domain of unknown function (DUF4730). This family of proteins is found in eukaryotes. Proteins in this family are approximately 60 amino acids in length. 55
58492 406332 pfam15874 Il2rg Putative Interleukin 2 receptor, gamma chain. This family of proteins is found in eukaryotes. Proteins in this family are typically between 137 and 197 amino acids in length. 92
58493 406333 pfam15875 DUF4731 Domain of unknown function (DUF4731). This family of proteins is found in eukaryotes. Proteins in this family are typically between 37 and 78 amino acids in length. 75
58494 374189 pfam15876 DUF4732 Domain of unknown function (DUF4732). This family of proteins is found in eukaryotes. Proteins in this family are typically between 107 and 201 amino acids in length. 112
58495 406334 pfam15877 TMEM232 Transmembrane protein family 232. This family of proteins is found in eukaryotes. The function is not known. 452
58496 406335 pfam15878 DUF4733 Domain of unknown function (DUF4733). This family of proteins is found in eukaryotes. Proteins in this family are typically between 73 and 99 amino acids in length. 91
58497 406336 pfam15879 MWFE NADH-ubiquinone oxidoreductase MWFE subunit. MWFE is a short subunit of NADH-ubiquinone oxidoreductase found in eukaryotes. It is necessary for the activity of NADH-ubiquinone oxidoreductase complex I in mitochondria. This subunit is essential for the assembly and function of the enzyme. MWFE is found to be phosphorylated, eg in rat heart mitochondria. The short family includes much of a signal peptide. 55
58498 406337 pfam15880 NDUFV3 NADH dehydrogenase [ubiquinone] flavoprotein 3, mitochondrial. 35
58499 406338 pfam15881 DUF4734 Domain of unknown function (DUF4734). This domain family is found in species of Drosophila, and is approximately 90 amino acids in length. The family is found in association with pfam07707. 91
58500 406339 pfam15882 DUF4735 Domain of unknown function (DUF4735). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 288 and 410 amino acids in length. There are two completely conserved C residues that may be functionally important. In mammals this protein family is thyroid-specific. 290
58501 374196 pfam15883 DUF4736 Domain of unknown function (DUF4736). This family of proteins is functionally uncharacterized. This family of proteins is found in insects. Proteins in this family are typically between 186 and 228 amino acids in length. 186
58502 406340 pfam15884 QIL1 Protein QIL1. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 111 and 169 amino acids in length. 77
58503 406341 pfam15886 CBM39 Carbohydrate binding domain (family 32). This domain is found at the N-terminus of beta-1,3-glucan-binding proteins involved in recognition of invading micro-organisms. It often co-occurs with pfam00722 (Glycosyl hydrolases family 16). It recognizes and binds to a triple-helical beta-1,3-glucan structure. 107
58504 406342 pfam15887 Peptidase_Mx Putative zinc-binding metallo-peptidase. This family has a highly conserved HHExxH motif with a highly conserved ED pairing downstream. HExxH is indicative of a zinc-binding metallo-peptidase. 240
58505 374199 pfam15888 FOG_N Folded gastrulation N-terminus. This is the N-terminal domain of the folded gastrulation protein. Folded gastrulation is required for morphogenic movements during gastrulation and nervous system development. It may act as a secreted signal and activate the G protein alpha subunit. This domain may be the G protein-coupled receptor ligand. 112
58506 406343 pfam15889 DUF4738 Domain of unknown function (DUF4738). Family of uncharacterized proteins found in CFB group of bacteria, mostly from Bacteroides and Prevotella genera present in human gut and oral cavity, respectively. For JCSG target SP13584B, the experimentally determined structure consists of two WD40-like beta sheet repeats forming a beta sandwich. 134
58507 406344 pfam15890 Peptidase_Mx1 Putative zinc-binding metallo-peptidase. This family is a putative zinc-binding metallo-peptidase. There are two highly conserved motifs, HHExxH and ED. HExxH with ED is indicative of zinc-binding metallo-peptidases. 238
58508 406345 pfam15891 Nuc_deoxyri_tr2 Nucleoside 2-deoxyribosyltransferase like. 105
58509 406346 pfam15892 BNR_4 BNR repeat-containing family member. BNR_4 is a family which carries the unique sequence motif SxDxGxTW which is so characteristic of the repeats of the BNR family, pfam02012. It is unclear whether or not this unit is repeated throughout the sequences of this family, but if it is then the family is likely to be bacterial neuraminidase. 272
58510 406347 pfam15893 DUF4739 Domain of unknown function (DUF4739). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 138 and 167 amino acids in length. 235
58511 406348 pfam15894 SgrT Inhibitor of glucose uptake transporter SgrT. 53
58512 374204 pfam15895 CAAX_1 CAAX box cerebral protein 1. CAAX_1 is a family of primate proteins. CAAX refers to the highly characteristic C-terminal residues, a cysteine and two aliphatic residues followed by any residue, a C-terminal tetrapeptide recognition motif called the Ca1a2X box. This motif on substrates is recognized by prenyltransferases that then attach an isoprenoid lipid (a process termed prenylation), one of the many post-translational modifications that occur in cells. The function of the prenylated family is not known. 209
58513 406349 pfam15897 DUF4741 Domain of unknown function (DUF4741). 169
58514 406350 pfam15898 PRKG1_interact cGMP-dependent protein kinase interacting domain. This domain is found at the C-terminus of protein phosphatase 1 regulatory subunits 12A, 12B and 12C. In protein phosphatase 1 regulatory subunit 12A it has been found to bind to cGMP-dependent protein kinase 1 via a leucine zipper motif located at the C-terminus of this domain. 101
58515 406351 pfam15899 BNR_6 BNR-Asp box repeat. This BNR repeat is found in proteins such as human sortilin. The model complements family BNR_5. 14
58516 406352 pfam15901 Sortilin_C Sortilin, neurotensin receptor 3, C-terminal. Sortilin_C is the C-terminal cytoplasmic tail of sortilin, a Vps10p domain-containing family of proteins. Most sortilin is expressed within intracellular compartments, where it chaperones diverse ligands, including proBDNF and acid hydrolases. The sortilin cytoplasmic tail is homologous to mannose 6-phosphate receptor and is required for the intracellular trafficking of cargo proteins via interactions with distinct adaptor molecules. In addition to mediating lysosomal targeting of specific acid hydrolases, the sortilin cytoplasmic tail also directs trafficking of BDNF to the secretory pathway in neurons, where it can be released in response to depolarisation to modulate cell survival and synaptic plasticity. 164
58517 406353 pfam15902 Sortilin-Vps10 Sortilin, neurotensin receptor 3. Sortilin, also known in mammals as neurotensin receptor-3, is the archetypical member of a Vps10-domain (Vps10-D) that binds neurotrophic factors and neuropeptides. This domain constitutes the entire luminal part of Sortilin and is activated in the trans-Golgi network by enzymatic propeptide cleavage. The structure of the domain has been determined as a ten-bladed propeller, with up to 9 BNR or beta-hairpin turns in it. The mature receptor binds various ligands, including its own propeptide (Sort-pro), neurotensin, the pro-forms of nerve growth factor-beta (NGF) and brain-derived neurotrophic factor (BDNF), lipoprotein lipase (LpL), apo lipoprotein AV and the receptor-associated protein (RAP). 443
58518 406354 pfam15903 PL48 Filopodia upregulated, FAM65. PL48 is associated with cytotrophoblast and lineage-specific HL-60 cell differentiation. The N-terminal part of the family is found to induce the formation of filopodia. It is found in vertebrates. 346
58519 406355 pfam15904 LIP1 LKB1 serine/threonine kinase interacting protein 1. LIP1 is a protein found in eukaryotes. It represents the N-terminus of a leucine-rich-repeat protein that is implicated in Peutz-Jeghers syndrome. LIP1 interacts with the TGF-beta-regulated transcription factor SMAD4 to form a LKB1-LIP1-SMAD4 ternary complex. Mutations in SMAD4 lead to juvenile polyposis, suggesting a mechanistic link between these two diseases. 88
58520 406356 pfam15905 HMMR_N Hyaluronan mediated motility receptor N-terminal. HMMR_N is the N-terminal region of eukaryotic hyaluronan-mediated motility receptor proteins. The protein is functionally associated with BRCA1 and thus predicted to be a common, low-penetrance breast cancer candidate. 331
58521 318178 pfam15906 zf-NOSIP Zinc-finger of nitric oxide synthase-interacting protein. 75
58522 406357 pfam15907 Itfg2 Integrin-alpha FG-GAP repeat-containing protein 2. Members of this family are annotated as being integrin-alpha FG-GAP repeat-containing protein 2. 332
58523 406358 pfam15908 HMMR_C Hyaluronan mediated motility receptor C-terminal. HMMR_C is the C-terminal region of eukaryotic hyaluronan-mediated motility receptor proteins. The protein is functionally associated with BRCA1 and thus predicted to be a common, low-penetrance breast cancer candidate. 157
58524 406359 pfam15909 zf-C2H2_8 C2H2-type zinc ribbon. This family carries three zinc-fingers in tandem. 98
58525 374216 pfam15910 V-set_2 ICOS V-set domain. This family contains divergent V-set ig domains found in the ICOS protein. 113
58526 406360 pfam15911 WD40_3 WD domain, G-beta repeat. 57
58527 406361 pfam15912 VIR_N Virilizer, N-terminal. VIR_N is the conserved N-terminus of the protein virilizer, necessary for male and female viability and required for the production of eggs capable of embryonic development. 265
58528 406362 pfam15913 Furin-like_2 Furin-like repeat, cysteine-rich. The furin-like cysteine rich region has been found in a variety of proteins from eukaryotes that are involved in the mechanism of signal transduction by receptor tyrosine kinases, which involves receptor aggregation. 102
58529 406363 pfam15914 FAM193_C FAM193 family C-terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. This C-terminal region of these proteins carries the most conserved residues. 56
58530 406364 pfam15915 BAT GAF and HTH_10 associated domain. GAF-HTH_assoc domain is always found between GAF-2 and HTH_10 domains on bacterio-opsin activator proteins. The exact function is not known. 156
58531 406365 pfam15916 DUF4743 Domain of unknown function (DUF4743). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 150 amino acids in length. The family is found in association with pfam00293. 119
58532 406366 pfam15917 PIEZO Piezo. This domain is found in proteins belonging to the piezo family. Piezo proteins are components of cation channels. This domain is found in association with pfam12166. 163
58533 374223 pfam15918 DUF4744 Domain of unknown function (DUF4744). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 81 and 415 amino acids in length. 66
58534 406367 pfam15919 HicB_lk_antitox HicB_like antitoxin of bacterial toxin-antitoxin system. This is a family of HicB-like antitoxins. 123
58535 406368 pfam15920 WHAMM-JMY_N N-terminal of Junction-mediating and WASP homolog-associated. WHAMM-JMY_N is the very N-terminus of WHAMM and JMY proteins. The function of this conserved region is not known; there are two highly conserved tryptophan residues. 49
58536 318193 pfam15921 CCDC158 Coiled-coil domain-containing protein 158. CCDC158 is a family of proteins found in eukaryotes. The function is not known. 1112
58537 292544 pfam15922 YjeJ YjeJ-like. YjeJ is a family of bacterial proteins. The domains and proteins in this family vary in length from 283 to 284 amino acids. The function is not yet known. All proteins are from Gammaproteobacteria. 283
58538 374226 pfam15923 DUF4745 Domain of unknown function (DUF4745). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 180 amino acids in length. 133
58539 406369 pfam15924 ALG11_N ALG11 mannosyltransferase N-terminus. 208
58540 374228 pfam15925 SOSSC SOSS complex subunit C. SOSS complex subunit C is a component of the SOSS complex, a single-stranded DNA binding complex involved in genomic stability, double-stranded break repair and ataxia telangiectasia-mutated-dependent signaling pathways. 95
58541 406370 pfam15926 RNF220 E3 ubiquitin-protein ligase RNF220. This family represents the central region of the E3 ubiquitin-protein ligase RNF220. 246
58542 406371 pfam15927 Casc1_N Cancer susceptibility candidate 1 N-terminus. This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 200 amino acids in length. The family is found in association with pfam12366. There are two completely conserved residues (N and W) that may be functionally important. 201
58543 406372 pfam15928 DUF4746 Domain of unknown function (DUF4746). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes, and is typically between 247 and 324 amino acids in length. The family is found in association with pfam00085. 290
58544 374232 pfam15929 Myofilin Myofilin. Myofilin is an insect muscle protein found in thick muscle filaments. 146
58545 318201 pfam15930 YdiH Domain of unknown function. YdiH is a family of proteins found in bacteria. Proteins in this family are typically between 62 and 80 amino acids in length. The function is not known. 62
58546 406373 pfam15931 DUF4747 Domain of unknown function (DUF4747). This family of proteins is found in bacteria. Proteins in this family are typically between 263 and 305 amino acids in length. 257
58547 406374 pfam15932 DUF4748 Domain of unknown function (DUF4748). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 114 and 139 amino acids in length. 51
58548 406375 pfam15933 RnlB_antitoxin Antitoxin to bacterial toxin RNase LS or RnlA. RnlB_antitoxin, formerly known as yfjO, has been found to be the antidote protein to RNase LS or RnlA in E. coli. Bacterial toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB. 94
58549 318204 pfam15934 Yuri_gagarin Yuri gagarin. The yuri gagarin protein is found in Drosophila, where it plays roles in spermatogenesis. 234
58550 406376 pfam15935 RnlA_toxin RNase LS, bacterial toxin. RnlA_toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB. 87
58551 406377 pfam15936 DUF4749 Domain of unknown function (DUF4749). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 121 and 170 amino acids in length. It is usually found in association with pfam00595 (PDZ) and pfam00412 (LIM), and often contains the conserved Zasp-like motif (IPR006643). 96
58552 374238 pfam15937 PrlF_antitoxin prlF antitoxin for toxin YhaV_toxin. PrlF_antitoxin is a family of bacterial antitoxins that neutralizes the toxin YhaV. PrlF is labile and forms a homodimer that then binds to the YhaV toxin thereby neutralising its ribonuclease activity. Alone, it can also act as a transcription factor. The YhaV/PrlF complex binds the prlF-yhaV operon, probably regulating its expression negatively. Over-expression of PrlF leads to increased doubling time. 95
58553 406378 pfam15938 DUF4750 Domain of unknown function (DUF4750). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 76 and 92 amino acids in length. There are two completely conserved W residues that may be functionally important. 52
58554 292561 pfam15939 YmcE_antitoxin Putative antitoxin of bacterial toxin-antitoxin system. YmcE_antitoxin is the putative antitoxin for the supposed bacterial toxin GnsA, UniProtKB:P0AC92, family pfam08178. 76
58555 406379 pfam15940 YjcB Family of unknown function. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. 90
58556 374241 pfam15941 FidL_like FidL-like putative membrane protein. FidL-like is a family of bacterial proteins that are purported to be membrane proteins. 90
58557 374242 pfam15942 DUF4751 Domain of unknown function (DUF4751). This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. 121
58558 406380 pfam15943 YdaS_antitoxin Putative antitoxin of bacterial toxin-antitoxin system, YdaS/YdaT. YdaS_antitoxin is a family of putative bacterial antitoxins, neutralising the toxin YdaT, family pfam06254. 65
58559 339554 pfam15944 DUF4752 Domain of unknown function (DUF4752). This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 90 and 105 amino acids in length. There is a conserved GLA sequence motif. 84
58560 374244 pfam15946 DUF4754 Domain of unknown function (DUF4754). This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 80
58561 318214 pfam15947 DUF4755 Domain of unknown function (DUF4755). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. 129
58562 406381 pfam15948 DUF4756 Domain of unknown function (DUF4756). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. 158
58563 406382 pfam15949 DUF4757 Domain of unknown function (DUF4757). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 145 and 166 amino acids in length. The family is found in association with pfam00412. There are two completely conserved residues (W and L) that may be functionally important. 167
58564 406383 pfam15950 DUF4758 Putative sperm flagellar membrane protein. 124
58565 406384 pfam15951 MITF_TFEB_C_3_N MITF/TFEB/TFEC/TFE3 N-terminus. This domain is found at the N-terminus of several transcription factors including microphthalmia-associated transcription factor, transcription factor EB, transcription factor EC and transcription factor E3. 153
58566 374249 pfam15952 ESM4 Enhancer of split M4 family. This family of proteins includes enhancer of split M4, enhancer of split M2 and enhancer of split MAlpha. These proteins are part of the Notch signaling pathway. 174
58567 292575 pfam15953 PDU_like Putative propanediol utilisation. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. 153
58568 374250 pfam15955 Cuticle_4 Cuticle protein. 74
58569 374251 pfam15956 DUF4760 Domain of unknown function (DUF4760). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 147 and 190 amino acids in length. There is a single completely conserved residue R that may be functionally important. 143
58570 374252 pfam15957 Comm Commissureless. Commissureless regulates Roundabout (Robo) levels and as a result controls axon guidance across the embryo midline. 110
58571 318223 pfam15958 DUF4761 Domain of unknown function (DUF4761). This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 105
58572 406385 pfam15959 DUF4762 Domain of unknown function (DUF4762). This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved TTC sequence motif. 61
58573 406386 pfam15960 DUF4763 Domain of unknown function (DUF4763). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 237 and 332 amino acids in length. There are two completely conserved residues (C and R) that may be functionally important. 236
58574 406387 pfam15961 DUF4764 Domain of unknown function (DUF4764). 798
58575 374256 pfam15962 DUF4765 Domain of unknown function (DUF4765). This domain family is found in bacteria, and is approximately 90 amino acids in length. 1128
58576 406388 pfam15963 Myb_DNA-bind_7 Myb DNA-binding like. 85
58577 318229 pfam15964 CCCAP Centrosomal colon cancer autoantigen protein family. CCCAP is a family of proteins found in eukaryotes. CCCAP is also known as SDCCAG8, serologically defined colon cancer antigen 8. It is associated with the centrosome. 703
58578 406389 pfam15965 zf-TRAF_2 TRAF-like zinc-finger. 93
58579 406390 pfam15966 F-box_4 F-box. 115
58580 406391 pfam15967 Nucleoporin_FG2 Nucleoporin FG repeated region. Nucleoporin_FG2, or nucleoporin p58/p45, is a family of chordate nucleoporins. The proteins carry many repeats of the FG sequence motif. 598
58581 292590 pfam15968 RexB Membrane-anchored ion channel, Abi component. RexB is a family of anti-lambda phage inner-membrane ion-channels with four transmembrane domains. On infection by phage, a phage protein-DNA complex is produced as a replication or recombination intermediate which activates RexA. RexA is an intracellular sensor that activates the membrane-anchored RexB. At least two RexA proteins are needed to activate one RexB protein. Activation opens the ion-channel leading to a drop in membrane potential, the outcome of which is the death of the host cell but also the cessation or abortion of the phage infection. RexA-RexB is one of the most well characterized bacterial abortive infection systems, or Abis. 139
58582 374261 pfam15969 RexA Intracellular sensor of Lambda phage, Abi component. RexA is a family of bacterial anti-phage proteins. It forms one partner in the two-component abortive infection system, Abi, of E. coli in partnership with RexB, a membrane-anchored ion-channel. Two RexA are needed to activate one RexB, and activation causes opening of the channel, the efflux of cations, a drop in cellular levels of ATP and subsequent death of the host cell and abortion of the phage infecting process which requires ATP. 235
58583 292592 pfam15970 HicB-like_2 HicB_like antitoxin of bacterial toxin-antitoxin system. This is a family of HicB-like antitoxins. 81
58584 374262 pfam15971 Mannosyl_trans4 DolP-mannose mannosyltransferase. This family catalyzes the transfer of mannose from DolP-mannose to the N-linked tetrasaccharide bound to the S-layer glycoprotein to form a pentasaccharide. 163
58585 374263 pfam15972 Unpaired Unpaired protein. Unpaired protein activates the JAK pathway. 177
58586 374264 pfam15973 DUF4766 Domain of unknown function (DUF4766). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 106 and 128 amino acids in length. There is a conserved KVI sequence motif. 115
58587 406392 pfam15974 Cadherin_tail Cadherin C-terminal cytoplasmic tail, catenin-binding region. Cadherin_tail is the cytoplasmic domain at the C-terminus of cadherin proteins. This domain binds p120 catenin, an action critical for the surface stability of cadherin-catenin cell-cell adhesion complexes. 134
58588 406393 pfam15975 Flot Flotillin. Flotillin is a family of lipid-membrane-associated proteins found in bacteria, archaea and eukaryotes. The family is found in association with pfam01145, another integral membrane-associated domain. Flotillins in vertebrates are associated with sphingolipids and cholesterol-enriched membrane microdomains known as lipid-rafts. These rafts along with other membrane components are important in cell-signalling. Flotillins in other organisms have roles in viral pathogenesis, endocytosis, and membrane shaping. 121
58589 406394 pfam15976 CooC_C CS1-pili formation C-terminal. CooC_C is a highly conserved C-terminal domain on fimbrial outer membrane usher proteins like TcfC. The protein is required for CS1 pilus formation. 93
58590 406395 pfam15977 HTH_46 Winged helix-turn-helix DNA binding. 68
58591 379756 pfam15978 TnsD Tn7-like transposition protein D. TnsD is a family of putative Tn7-like transposition proteins type D. 360
58592 406396 pfam15979 Glyco_hydro_115 Glycosyl hydrolase family 115. Glyco_hydro_115 is a family of glycoside hydrolases likely to have the activity of xylan a-1,2-glucuronidase, EC:3.2.1.131, or a-(4-O-methyl)-glucuronidase EC:3.2.1.-. 334
58593 406397 pfam15980 ComGF Putative Competence protein ComGF. ComGF is a family of putative bacterial competence proteins. 99
58594 292603 pfam15981 EAV_GP5 Envelope glycoprotein GP 5 of equine arteritis virus. EAV_GP5 is a domain family found in equine arteritis virus envelope. It is approximately 80 amino acids in length and is found in association with pfam00951. 80
58595 374269 pfam15982 TMEM135_C_rich N-terminal cysteine-rich region of Transmembrane protein 135. TMEM135_C_rich is a family of putative peroxisomal membrane proteins found in eukaryotes. This is the highly conserved N-terminal region that has several highly conserved cysteine residues. The domain is associated with family Tim17, pfam02466. 134
58596 374270 pfam15983 DUF4767 Domain of unknown function (DUF4767). This domain family is found in bacteria, and is approximately 140 amino acids in length. There is a single completely conserved residue Q that may be functionally important. 138
58597 406398 pfam15984 Collagen_mid Bacterial collagen, middle region. Collagen_mid is the conserved central region of bacterial collagen triple helix repeat proteins. 192
58598 406399 pfam15985 KH_6 KH domain. KH motifs bind RNA in vitro. Auto-antibodies to Nova, a KH domain protein, cause para-neoplastic opsoclonus ataxia. 47
58599 406400 pfam15989 DUF4768 Domain of unknown function (DUF4768). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 116 and 180 amino acids in length. There is a conserved FFFGQY sequence motif. 87
58600 406401 pfam15990 UPF0767 UPF0767 family. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 92 amino acids in length. There are two conserved sequence motifs: IGYN and SPSL. 83
58601 406402 pfam15991 G_path_suppress G-protein pathway suppressor. This family of proteins inhibits G-protein- and mitogen-activated protein kinase-mediated signal transduction. 273
58602 406403 pfam15992 DUF4769 Domain of unknown function (DUF4769). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 291 and 501 amino acids in length. 256
58603 406404 pfam15993 Fuseless Fuseless. This family includes Drosophila fuseless protein and contains four WXGXW motifs. Fuseless is a transmembrane protein which regulates pre-synaptic calcium channels. 299
58604 406405 pfam15994 DUF4770 Domain of unknown function (DUF4770). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 169 and 182 amino acids in length. There is a single completely conserved residue L that may be functionally important. 181
58605 406406 pfam15995 DUF4771 Domain of unknown function (DUF4771). This domain family is found in eukaryotes, and is approximately 160 amino acids in length. There is a conserved RYGK sequence motif. 159
58606 406407 pfam15996 PNISR Arginine/serine-rich protein PNISR. 178
58607 374280 pfam15997 DUF4772 Domain of unknown function (DUF4772). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 107 and 124 amino acids in length. There is a single completely conserved residue V that may be functionally important. 112
58608 406408 pfam15998 DUF4773 Domain of unknown function (DUF4773). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. 118
58609 406409 pfam15999 DUF4774 Domain of unknown function (DUF4774). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, eukaryotes and viruses, and is approximately 50 amino acids in length. 57
58610 406410 pfam16000 CARMIL_C CARMIL C-terminus. This domain is found near to the C-terminus of leucine-rich repeat-containing proteins in the CARMIL family. In leucine-rich repeat-containing protein 16A (LRRC16A) it includes the region responsible for interaction with F-actin-capping protein subunit alpha-2 (CAPZA2). 287
58611 406411 pfam16001 DUF4775 Domain of unknown function (DUF4775). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 308 and 484 amino acids in length. 456
58612 406412 pfam16002 Headcase Headcase protein. This domain is found in Drosophila Headcase protein and the human Headcase protein homolog. In humans, it may have a role in some cancers. 194
58613 406413 pfam16003 DUF4776 Domain of unknown function (DUF4776). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 444 and 485 amino acids in length. There is a conserved TLR sequence motif. 502
58614 406414 pfam16004 EFTUD2 116 kDa U5 small nuclear ribonucleoprotein component N-terminus. 76
58615 406415 pfam16005 MOEP19 KH-like RNA-binding domain. MOEP19 is a family of mammalian KH-like RNA-binding motifs. The family is expressed during early embryogenesis. It appears to effect an early form of molecular asymmetry within the murine oocyte cytoplasm. The family marks a defined cortical cytoplasmic domain in oocytes and provides evidence for mammalian oocyte polarity and a form of pre-patterning that persists in zygotes and early embryos through the morula stage. 85
58616 406416 pfam16006 NUSAP Nucleolar and spindle-associated protein. This family of microtubule-associated proteins has a role in spindle microtubule organisation. 277
58617 374290 pfam16007 DUF4777 Domain of unknown function (DUF4777). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. 66
58618 374291 pfam16008 DUF4778 Domain of unknown function (DUF4778). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 321 and 791 amino acids in length. There is a single completely conserved residue P that may be functionally important. 289
58619 406417 pfam16009 DUF4779 Domain of unknown function (DUF4779). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 234 and 351 amino acids in length. 160
58620 406418 pfam16010 CDH-cyt Cytochrome domain of cellobiose dehydrogenase. CDH-cyt is the cytochrome domain, at the N-terminus, of cellobiose dehydrogenase. CDH-cyt folds as a beta sandwich with the topology of the antibody Fab V(H) domain and binds iron. The haem iron is ligated by Met83 and His181 in UniProtKB:Q01738. 177
58621 406419 pfam16011 CBM9_2 Carbohydrate-binding family 9. CBM9_2 is a family of putative endoxylanase-like proteins that belong to the Carbohydrate-binding family 9. 199
58622 406420 pfam16012 DUF4780 Domain of unknown function (DUF4780). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 132 and 144 amino acids in length. There is a single completely conserved residue W that may be functionally important. 177
58623 406421 pfam16013 DUF4781 Domain of unknown function (DUF4781). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 288 and 306 amino acids in length. 308
58624 406422 pfam16014 SAP130_C Histone deacetylase complex subunit SAP130 C-terminus. 406
58625 406423 pfam16015 Promethin Promethin. 96
58626 406424 pfam16016 DUF4782 Domain of unknown function (DUF4782). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. The family is found in association with pfam02893. 147
58627 406425 pfam16017 BTB_3 BTB/POZ domain. 106
58628 406426 pfam16018 Anillin_N Anillin N-terminus. This domain is found towards the N-terminus of anillin. In mammalian anillin this domain is repeated. This domain overlaps with the region responsible for nuclear localization of anillin. 86
58629 406427 pfam16019 CSRNP_N Cysteine/serine-rich nuclear protein N-terminus. This presumed domain is found at the N-terminus of cysteine/serine-rich nuclear proteins. These proteins act as transcriptional activators. 217
58630 406428 pfam16020 Deltameth_res Deltamethrin resistance. This presumed domain is found in the deltamethrin-resistance protein prag01 from Culex pipiens pallens. 49
58631 406429 pfam16021 PDCD7 Programmed cell death protein 7. 306
58632 406430 pfam16022 DUF4783 Domain of unknown function (DUF4783). This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. There is a single completely conserved residue F that may be functionally important. Recent structures show this domain has an NTF2 fold. 102
58633 406431 pfam16023 DUF4784 Domain of unknown function (DUF4784). This is a family of uncharacterized proteins from Bacteroidetes. 409
58634 406432 pfam16024 DUF4785 Domain of unknown function (DUF4785). This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 442 amino acids in length. 376
58635 406433 pfam16025 CALM_bind Calcium-dependent calmodulin binding. This domain is found at the N-terminus of centriolar coiled-coil protein of 110 kDa (CCP110), where it binds calmodulin. Binding of calmodulin to this domain is calcium dependent. 78
58636 406434 pfam16026 MIEAP Mitochondria-eating protein. This domain is found at the C-terminus of mitochondria-eating proteins. This family of proteins regulates mitochondrial quality. They have a role in the degradation of damaged mitochondrial proteins and in the degradation of damaged mitochondria. 198
58637 406435 pfam16027 DUF4786 Domain of unknown function (DUF4786). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 209 and 353 amino acids in length. 162
58638 406436 pfam16028 SLC3A2_N Solute carrier family 3 member 2 N-terminus. This domain is found at the N-terminus of solute carrier family 3 member 2 proteins (4F2 cell-surface antigen heavy chain). 77
58639 406437 pfam16029 DUF4787 Domain of unknown function (DUF4787). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. 62
58640 406438 pfam16030 GD_N Serine protease gd N-terminus. This domain is found at the N-terminus of the serine protease gd (gastrulation defective) in insects. 108
58641 406439 pfam16031 TonB_N TonB N-terminal region. TonB_N is a short domain found just downstream of the cytoplasmic-membrane anchor at the N-terminus of TonB proteins. The exact function is not known. 132
58642 406440 pfam16032 DUF4788 Domain of unknown function (DUF4788). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 230 amino acids in length. There is a single completely conserved residue D that may be functionally important. 229
58643 406441 pfam16033 DUF4789 Domain of unknown function (DUF4789). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 87 and 100 amino acids in length. There is a conserved GPC sequence motif. There are two completely conserved C residues that may be functionally important. 95
58644 406442 pfam16034 JAKMIP_CC3 JAKMIP CC3 domain. This domain is found at the C-terminus of proteins belonging to the JAKMIP family (Janus kinase and microtubule-interacting proteins) and is predicted to be a coiled coil. It interacts with the Janus family kinases Tyk2 and Jak1. 199
58645 406443 pfam16035 Chalcone_2 Chalcone isomerase like. 203
58646 406444 pfam16036 Chalcone_3 Chalcone isomerase-like. Chalcone_3 is a family of largely bacterial members. 165
58647 406445 pfam16037 DUF4790 Domain of unknown function (DUF4790). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 191 amino acids in length. There is a single completely conserved residue C that may be functionally important. 93
58648 406446 pfam16038 TMIE TMIE protein. This family of proteins includes the mammalian transmembrane inner ear expressed protein. Its function is unknown. 85
58649 406447 pfam16039 DUF4791 Domain of unknown function (DUF4791). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 189 and 203 amino acids in length. There are two conserved sequence motifs: PLPL and LGN. There is a single completely conserved residue N that may be functionally important. 162
58650 406448 pfam16040 DUF4792 Domain of unknown function (DUF4792). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. 71
58651 406449 pfam16041 DUF4793 Domain of unknown function (DUF4793). This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. There are two completely conserved C residues that may be functionally important. 108
58652 406450 pfam16042 DUF4794 Domain of unknown function (DUF4794). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 74 and 92 amino acids in length. 76
58653 406451 pfam16043 DUF4795 Domain of unknown function (DUF4795). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 285 and 978 amino acids in length. 181
58654 406452 pfam16044 DUF4796 Domain of unknown function (DUF4796). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 194 and 289 amino acids in length. There is a single completely conserved residue C that may be functionally important. 189
58655 406453 pfam16045 LisH_2 LisH. 28
58656 406454 pfam16046 FAM76 FAM76 protein. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 233 and 341 amino acids in length. 298
58657 292666 pfam16047 Antimicrobial22 Frog antimicrobial peptide. This family includes the antimicrobial peptides Grahamin and Nigrocin which are secreted from frog skin. 21
58658 292667 pfam16048 Antimicrobial23 Frog antimicrobial peptide. This family includes antimicrobial peptides such as Ranacyclin which are secreted from frog skin. 17
58659 292668 pfam16049 Antimicrobial24 Frog antimicrobial peptide. This family includes antimicrobial peptides such as Aurein-5 and Caerin 2 which are secreted from frog skin. 25
58660 406455 pfam16050 CDC73_N Paf1 complex subunit CDC73 N-terminal. CDC73_N is the N-terminal region of the members of CDC73_C, pfam05179. CDC73 forms part of the Paf1 post-initiation complex. The exact function within the complex is not known. 302
58661 406456 pfam16051 DUF4797 Domain of unknown function (DUF4797). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a conserved SGLPT sequence motif. There are two completely conserved residues (P and G) that may be functionally important. 43
58662 406457 pfam16053 MRP-S34 Mitochondrial 28S ribosomal protein S34. 128
58663 406458 pfam16054 TMEM72 Transmembrane protein family 72. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 275 amino acids in length. 153
58664 406459 pfam16055 DUF4798 Domain of unknown function (DUF4798). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 80 and 365 amino acids in length. There is a single completely conserved residue H that may be functionally important. 103
58665 318308 pfam16056 DUF4799 Domain of unknown function (DUF4799). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 362 and 1493 amino acids in length. 375
58666 406460 pfam16057 DUF4800 Domain of unknown function (DUF4800). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 310 amino acids in length. The family is found in association with pfam02138, pfam00400. There is a conserved RDN sequence motif. 254
58667 406461 pfam16058 Mucin-like Mucin-like. This domain is found repeated at the C-terminus (C-tail) of bile salt-activated lipase, where it is O-glycosylated. 100
58668 406462 pfam16059 DUF4801 Domain of unknown function (DUF4801). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00907. 52
58669 406463 pfam16060 DUF4802 Domain of unknown function (DUF4802). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There are two conserved sequence motifs: CRC and YFDC. 65
58670 406464 pfam16061 DUF4803 Domain of unknown function (DUF4803). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 351 and 686 amino acids in length. There is a conserved RRY sequence motif. 255
58671 406465 pfam16062 DUF4804 Domain of unknown function (DUF4804). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 504 amino acids in length. 447
58672 374340 pfam16063 DUF4805 Domain of unknown function (DUF4805). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 244 and 363 amino acids in length. There is a conserved WEL sequence motif. 265
58673 406466 pfam16064 DUF4806 Domain of unknown function (DUF4806). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. 86
58674 406467 pfam16065 DUF4807 Domain of unknown function (DUF4807). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 171 and 270 amino acids in length. There is a conserved STLGG sequence motif. 126
58675 374343 pfam16066 DUF4808 Domain of unknown function (DUF4808). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 106 and 135 amino acids in length. 121
58676 406468 pfam16067 DUF4809 Domain of unknown function (DUF4809). This family of proteins is found in bacteria. Proteins in this family are typically between 120 and 137 amino acids in length. There is a conserved GGCNAC sequence motif. 129
58677 406469 pfam16068 DUF4810 Domain of unknown function (DUF4810). This family of proteins is found in bacteria. Proteins in this family are typically between 117 and 134 amino acids in length. There is a conserved PES sequence motif. It is a putative lipoprotein. 84
58678 406470 pfam16069 DUF4811 Domain of unknown function (DUF4811). This family of proteins is found in bacteria. Proteins in this family are typically between 188 and 241 amino acids in length. There is a single completely conserved residue Y that may be functionally important. 154
58679 406471 pfam16070 TMEM132 Transmembrane protein family 132. This presumed domain is found in members of the TMEM132 family. TMEM132A may be involved in embryonic and postnatal brain development. TMEM132D may be a marker for oligodendrocyte differentiation. 344
58680 406472 pfam16071 DUF4812 Domain of unknown function (DUF4812). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam03791, pfam03790. There are two completely conserved residues (H and I) that may be functionally important. 65
58681 318322 pfam16072 DUF4813 Domain of unknown function (DUF4813). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length. 291
58682 406473 pfam16073 SAT Starter unit:ACP transacylase in aflatoxin biosynthesis. SAT is the N-terminal starter unit:ACP transacylase of the aflatoxin biosynthesis pathway. SAT selects the hexanoyl starter unit from a pair of specialized fungal fatty acid synthase subunits (HexA/HexB) and transfers it onto the polyketide synthase A acyl-carrier protein to prime polyketide chain elongation. The family is found in association with pfam02801, pfam00109, pfam00550, pfam00975, pfam00698. 239
58683 406474 pfam16074 PilW Type IV Pilus-assembly protein W. PilW is a family of putative type IV pilus-assembly proteins. PilW is one of the component proteins of the pilus biogenesis process whereby pilus fibers are assembled in the periplasm, emerge onto the cell surface and are there stabilized, to allow bacterial attachment to host cells. PilW is an outer-membrane protein necessary for both the functionality of fibers and their stabilisation. 125
58684 374348 pfam16075 DUF4815 Domain of unknown function (DUF4815). 570
58685 406475 pfam16076 Acyltransf_C Acyltransferase C-terminus. This domain is found at the C-terminus of several different acyltransferases including 1-acyl-sn-glycerol-3-phosphate acyltransferase, acyl-CoA:lysophosphatidylglycerol acyltransferase 1 and lysocardiolipin acyltransferase 1. 73
58686 406476 pfam16077 Spaetzle Spaetzle. This family of proteins are nerve growth factor-like ligands required in the pathway that establishes the dorsal-ventral pattern of the embryo. They form a cystine knot structure. 93
58687 406477 pfam16078 2-oxogl_dehyd_N 2-oxoglutarate dehydrogenase N-terminus. This domain is found at the N-terminus of 2-oxoglutarate dehydrogenases. 41
58688 406478 pfam16079 Phage_holin_5_2 Phage holin family Hol44, in holin superfamily V. Phage_holin_5_2 is a family of small hydrophobic proteins with three transmembrane domains of the Hol44 family. These proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. Full activity of the endolysin Lys44 from oenophage fOg44 requires sudden ion-nonspecific dissipation of the proton motive force, undertaken by the fOg44 holin during phage-infection. 66
58689 406479 pfam16080 Phage_holin_2_3 Bacteriophage holin family HP1. Phage_holin_2_3 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. 56
58690 292699 pfam16081 Phage_holin_7_1 Mycobacterial 2 TMS Phage Holin (M2 Hol) Family. Phage_holin_7_1 is a family of two transmembrane mycobacteriophage holins, small hydrophobic proteins that effect lysis of host mycobacterial cells in conjunction with a mycobacteria-specific lysin, lysB. The product of lysB gene targets the mycobacteria outer membrane, the last barrier to bacteriophage release. 139
58691 406480 pfam16082 Phage_holin_2_4 Bacteriophage holin family, superfamily II-like. Phage_holin_2_4 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. 76
58692 406481 pfam16083 Phage_holin_3_3 LydA holin phage, holin superfamily III. Phage_holin_3_3 is a family of small hydrophobic holin proteins with one or more transmembrane domains. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as those of the bacteriophages of these organisms. Their primary function appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded, these enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases. LydA and lydB are encoded on the dar operon. The phenotype of a rapid lysis in the absence of active LydB suggests that this protein might be an antagonist of the holin LydA. 78
58693 292702 pfam16084 LydB LydA-holin antagonist. LydB is a family of proteins that are antagonistic to the lysing action of holin LydA. 147
58694 318333 pfam16085 Phage_holin_3_5 Bacteriophage holin Hol, superfamily III. Phage_holin_3_5 is a family of holins classified as 1.E.20 in the TC database. The hol gene (PRF9) product (117 aas) of Pseudomonas aeruginosa PAO1 exhibits a hydrophobicity profile similar to holins of P2 and phiCTX phages with two peaks of hydrophobicity that might correspond to either one or two TMSs. Hol functions in conjunction with the lytic enzyme, Lys, a glycosyl hydrolase that breaks up the murein in the bacterial cell wall, causing lysis of the cell and hence entry of phage particles. Several members are annotated as pyocin R2_PP when encoded on the chromosome. 113
58695 374353 pfam16086 DUF4816 Domain of unknown function (DUF4816). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 178 and 456 amino acids in length. There is a conserved WKP sequence motif. 43
58696 406482 pfam16087 DUF4817 Domain of unknown function (DUF4817). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 322 amino acids in length. There are two completely conserved residues (G and R) that may be functionally important. 54
58697 406483 pfam16088 BORCS7 BLOC-1-related complex sub-unit 7. This is a family of unknown function found in eukaryotes. Family members include BORCS7 (BLOC-1-related complex sub-unit 7) also known as Diaskedin (from the Ancient Greek diaskedazo, meaning to disperse) or C10orf32. It constitutes sub-unit 7 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery. 103
58698 406484 pfam16089 DUF4818 Domain of unknown function (DUF4818). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 176 and 214 amino acids in length. There is a single completely conserved residue W that may be functionally important. 109
58699 406485 pfam16090 DUF4819 Domain of unknown function (DUF4819). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 82 and 99 amino acids in length. 84
58700 374358 pfam16091 DUF4820 Domain of unknown function (DUF4820). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 320 and 483 amino acids in length. There are two conserved sequence motifs: WSLP and RPLPW. 226
58701 406486 pfam16092 DUF4821 Domain of unknown function (DUF4821). 264
58702 406487 pfam16093 PAC4 Proteasome assembly chaperone 4. PAC4 or proteasome assembly chaperone 4 protein promotes assembly of the 20S proteasome. It interacts with PSMG3. It associates with alpha subunits of the 20S proteasome. At the very C-terminal is a crucial HbYX or hydrophobic-tyrosine-X sequence motif that, in proteasome activators, opens the 20S proteasome entry pore. 72
58703 406488 pfam16094 PAC1 Proteasome assembly chaperone 1. PAC1 is a family of eukaryotic proteasome assembly chaperone 1 proteins that promotes assembly of the core 20S proteasome as part of a heterodimer with PAC2. 282
58704 406489 pfam16095 COR C-terminal of Roc, COR, domain. The C-terminal of Roc domain, COR, along with Roc functions as the putative regulator of kinase activity. It functions as a proper GTP-binding protein with a low GTPase activity somehow stimulating the kinase activity. 196
58705 406490 pfam16096 FXR_C1 Fragile X-related 1 protein C-terminal region 2. FXR_C1 is a small highly conserved region of the C-terminus of Fragile X-related proteins 1 and 2, FRX1, FRX2. The family is found in association with pfam05641, pfam00013. This family is immediately C-terminal to the core C terminal region, PF12235, and contains at least one block of RGG repeats that bind to G-quartet sequences in a wide variety of mRNAs. 75
58706 406491 pfam16097 FXR_C3 Fragile X-related 1 protein C-terminal region 3. FXR_C3 is a small highly conserved region at the very C-terminus of Fragile X-related proteins 1 and 2, FRX1, FRX2. The family is found in association with pfam05641, pfam00013, PF16096. 68
58707 406492 pfam16098 FXMR_C2 Fragile X-related mental retardation protein C-terminal region 2. FXMR_C2 is a small highly conserved region at the very C-terminus of Fragile X-related proteins FMR1. The family is found in association with pfam05641, pfam00013, PF16096. 86
58708 406493 pfam16099 RMI1_C Recq-mediated genome instability protein 1, C-terminal OB-fold. RMI1_C is a C-terminal oligo-nucleotide binding domain of Recq-mediated genome instability proteins. This domain interacts with RMI2-OB folds to make up the RMI core complex. The RMI core interface is crucial for BLM, Bloom syndrome, dissolvasome assembly and may have additional cellular roles as a docking hub for other proteins. 136
58709 406494 pfam16100 RMI2 RecQ-mediated genome instability protein 2. RMI2 is a eukaryotic family of OB3, oligo-nucleotide-binding proteins. It is an essential component of the RMI complex that plays a vital role in the processing of homologous recombination intermediates in order to limit DNA-crossover-formation in cells. 124
58710 406495 pfam16101 PRIMA1 Proline-rich membrane anchor 1. 122
58711 406496 pfam16102 ACTH_assoc ACTH-associated domain. ACTH_assoc is the low-complexity regions immediately adjacent to the highly conserved binding motif of the ACTH_domain, pfam00976. The exact function is not known. 28
58712 406497 pfam16103 DUF4822 Domain of unknown function (DUF4822). A lipocalin-like domain found in functionally uncharacterized bacterial proteins, often as a repeat of two domains. Proteins with this domain are found in a wide range of bacteria and are often annotated as S-layer proteins, but the origin of this annotation is not clear. 121
58713 292722 pfam16104 FPRL1_inhibitor Formyl peptide receptor-like 1 inhibitory protein. This family consists of several formyl peptide receptor-like 1 inhibitory proteins from Staphylococcus aureus. These are secreted proteins that block the formyl peptide receptor-like 1 found in neutrophils, monocytes, B cells, and NK cells; and inhibit the binding of chemoattractants (such as formylated peptides) to FPRL1, which initiate phagocyte mobilization towards the infection site. 105
58714 374367 pfam16105 DUF4823 Domain of unknown function (DUF4823). This family consists of hypothetical lipoproteins around 210 residues of length and is mainly found in various Pseudomonas species. The function of this family is unknown. 141
58715 406498 pfam16106 DUF4824 Domain of unknown function (DUF4824). This family consists of several hypothetical lipoproteins around 270 residues in length and is mainly found in Pseudomonas species. The function of this family is unknown. 253
58716 406499 pfam16107 DUF4825 Domain of unknown function (DUF4825). This domain forms the N-terminal, extracellular domain of some homologs of Staph BlaR1 proteases, where it replaces the penicillin-binding domain of BlaR1. It is also found in many uncharacterized proteins in a broad range of bacteria. Its association with BlaR1 homologs suggests it may be involved in substrate-, possibly antibiotic-binding, but this prediction has not been verified experimentally. 95
58717 339611 pfam16108 DUF4826 Domain of unknown function (DUF4826). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Shewanella species. The function of this protein is unknown. 124
58718 406500 pfam16109 DUF4827 Domain of unknown function (DUF4827). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. Distant homology prediction algorithms consistently suggest a homology between this family and FKBP-type peptidyl-prolyl cis-trans isomerases (PF00254), but this relation is as yet not confirmed. The function of this family is unknown. 177
58719 406501 pfam16110 DUF4828 Domain of unknown function (DUF4828). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Enterococcus and Lactobacillus species. The function of this family is unknown. 79
58720 379768 pfam16111 DUF4829 Domain of unknown function (DUF4829). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 117
58721 406502 pfam16112 DUF4830 Domain of unknown function (DUF4830). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in Clostridium, Eubacterium, and Ruminococcus species. The function of this family is unknown. 84
58722 406503 pfam16113 ECH_2 Enoyl-CoA hydratase/isomerase. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitate racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This family differs from pfam00378 in the structure of its C-terminus. 329
58723 406504 pfam16114 Citrate_bind ATP citrate lyase citrate-binding. This is the citrate-binding domain of ATP citrate lyase. This domain has a Rossmann fold. 177
58724 406505 pfam16115 DUF4831 Domain of unknown function (DUF4831). This family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 318
58725 406506 pfam16116 DUF4832 Domain of unknown function (DUF4832). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides and Capnocytophaga species. The function of this family is unknown. Distant homology analysis suggests a possible similarity of proteins from this family to TIM barrel glycoside hydrolases and, subsequently its involvement in carbohydrate metabolism. The domain lies downstream of glycosyl hydrolases 42 suggesting that as a domain it might represent the carbohydrate-binding region of the enzyme. 208
58726 406507 pfam16117 DUF4833 Domain of unknown function (DUF4833). This family consists of uncharacterized proteins around 170 residues in length and is mainly found in various Parabacteroides and Bacteroides species. The function of this family is unknown. 136
58727 406508 pfam16118 DUF4834 Domain of unknown function (DUF4834). This family consists of uncharacterized proteins around 90 residues in length and is mainly found in various Parabacteroides and Bacteroides species. Proteins in this family are characterized by a strongly conserved KDEGEYVD motif on the C-terminal and a very divergent N-terminal. The function of this family is unknown. 91
58728 406509 pfam16119 DUF4835 Domain of unknown function (DUF4835). This family consists of uncharacterized proteins of around 300 residues in length and is mainly found in bacteria from the Cytophaga-Flavobacteria-Bacteroides (CFB) group, both environmental and from human microbiome. The function of this family is unknown. 276
58729 406510 pfam16120 DUF4836 Domain of unknown function (DUF4836). This family consists of several uncharacterized proteins around 520 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 474
58730 406511 pfam16121 40S_S4_C 40S ribosomal protein S4 C-terminus. This domain is found at the C-terminus of 40S ribosomal protein S4. 48
58731 406512 pfam16122 40S_SA_C 40S ribosomal protein SA C-terminus. This domain is found at the C-terminus of 40S ribosomal protein SA. 96
58732 406513 pfam16123 HAGH_C Hydroxyacylglutathione hydrolase C-terminus. This domain is found at the C-terminus of hydroxyacylglutathione hydrolase enzymes. Substrate binding occurs at the interface between this domain and the catalytic domain (pfam00753). 82
58733 406514 pfam16124 RecQ_Zn_bind RecQ zinc-binding. This domain is the zinc-binding domain of ATP-dependent DNA helicase RecQ. 64
58734 406515 pfam16125 DUF4837 Domain of unknown function (DUF4837). This family consists of uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 292
58735 406516 pfam16126 DUF4838 Domain of unknown function (DUF4838). This family consists of several uncharacterized proteins found in various Bacteroides and Chloroflexus species. The function of this family is unknown. 263
58736 406517 pfam16127 DUF4839 Domain of unknown function (DUF4839). This family consists of uncharacterized proteins around 300 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 122
58737 406518 pfam16128 DUF4840 Domain of unknown function (DUF4840). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 141
58738 406519 pfam16129 DUF4841 Domain of unknown function (DUF4841). This domain is found on the N-terminal of several uncharacterized proteins found in various Bacteroides species. Solved structure of one of them (BACOVA_00967) from Bacteroides ovatus shows a small beta barrel with an immunoglobulin-like fold. DUF4841 domain shows weak overlap with the DUF4114 family, suggesting a possible distant relation. Function of this domain is unknown. 69
58739 406520 pfam16130 DUF4842 Domain of unknown function (DUF4842). This domain is found on the C-terminal of a large number of uncharacterized proteins with broad phylogenetic distribution, which includes human gut Bacteroides, g-proteobacteria (Vibrio and Shewanella) and also spirochetes from Leptospira genus. Solved structure of Bacteroides ovatus protein BACOVA_00967 shows a large beta barrel with an immunoglobulin-like fold and significant structural similarity to collagen-binding domain of adhesin from S. aureus (1amx), but with several additional long loops and secondary structure elements. Function of this domain is unknown. 201
58740 406521 pfam16131 Torus Torus domain. This domain is found in pre-mRNA-splicing factor CWC2. It includes a CCCH-type zinc finger. 109
58741 374383 pfam16132 DUF4843 Domain of unknown function (DUF4843). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. Distant homology analysis suggests a distant relation between this family and other families of proteins with immunoglobulin-like folds, which are often involved in substrate binding. However, specific function of this family is unknown. There is distant homology to the Calx-beta family pfam03160. 163
58742 406522 pfam16133 DUF4844 Domain of unknown function (DUF4844). This family consists of short uncharacterized proteins found mostly in different strains of Acinetobacter baumannii, but also in several Shewanella species and in some bacteria from the CFB group. Solved structure of ABAYE3784 protein from Acinetobacter baumannii AYE shows a five helical bundle with a very strong structural similarity to a bromodomain. However, the specific function of proteins from the DUF4844 family is unknown. 117
58743 406523 pfam16134 THOC2_N THO complex subunit 2 N-terminus. This family represents the N-terminus of THO complex subunit 2. 617
58744 406524 pfam16135 Jas TPL-binding domain in jasmonate signalling. The Jas domain is a short region of sequence characterized by IxCxCx(12)HAG found in plant transcriptional repressors. This motif appears to bind to the Groucho/Tup1-type co-repressor TOPLESS (TPL) and TPL-related proteins (TPRs). This binding is a crucial step in the jasmonate signalling pathway, involved in plant disease and defense. 48
58745 406525 pfam16136 NINJA_B Putative nuclear localization signal. NINJA proteins are Novel INteractor of JAZ proteins found in plants. NINJA proteins act as a transcriptional repressor, the activity of which is mediated by a functional TPL-binding EAR repression motif upstream from this domain. 114
58746 406526 pfam16137 DUF4845 Domain of unknown function (DUF4845). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Pseudomonas species. Distant homology analysis suggests that proteins from this family are related to pilin type IV proteins from the Bundlin (PF05307) family; this prediction is, however, not confirmed by any experimental evidence. 85
58747 406527 pfam16138 DUF4846 Domain of unknown function (DUF4846). This family consists of uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 239
58748 406528 pfam16139 DUF4847 Domain of unknown function (DUF4847). This uncharacterized domain has a lipocalin fold. 142
58749 406529 pfam16140 DUF4848 Domain of unknown function (DUF4848). A small family of uncharacterized proteins around 310 residues in length and found in various Bacteroides species. The function of this family is unknown. 216
58750 406530 pfam16141 DUF4849 Putative glycoside hydrolase Family 18, chitinase_18. This DUF is likely to be a form of glycosyl hydrolase from CAZy family 18, possibly chitinase 18. This would have the EC number of EC:3.2.1.14. 318
58751 292760 pfam16142 DUF4850 Domain of unknown function (DUF4850). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Acinetobacter species. The function of this family is unknown. 184
58752 374391 pfam16143 DUF4851 Domain of unknown function (DUF4851). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Desulfovibrio species. The function of this family is unknown. 195
58753 406531 pfam16144 DUF4852 Domain of unknown function (DUF4852). This family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Parabacteroides, Bacteroides and Porphyromonas species. The function of this family is unknown. 121
58754 406532 pfam16145 DUF4853 Domain of unknown function (DUF4853). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Actinomyces species. The function of this family is unknown. 135
58755 406533 pfam16146 DUF4854 Domain of unknown function (DUF4854). This family consists of uncharacterized proteins found in firmicutes and high GC Gram+ bacteria associated with human and animal guts. The function of this family is unknown. 105
58756 406534 pfam16147 DUF4855 Domain of unknown function (DUF4855). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. Several proteins are annotated as glycerophosphodiester phosphodiesterases, but the origin of this annotation is not clear. 313
58757 406535 pfam16148 DUF4856 Domain of unknown function (DUF4856). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown. 358
58758 406536 pfam16149 DUF4857 Domain of unknown function (DUF4857). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 270
58759 406537 pfam16150 DUF4858 Domain of unknown function (DUF4858). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 182
58760 406538 pfam16151 DUF4859 Domain of unknown function (DUF4859). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown. 116
58761 406539 pfam16152 DUF4860 Domain of unknown function (DUF4860). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Eubacterium and Clostridium species. The function of this family is unknown. 98
58762 406540 pfam16153 DUF4861 Domain of unknown function (DUF4861). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. However, in many instances the domain lies upstream of a glycosyl hydrolase family, usually family 88, so it might be involved in carbohydrate binding. 382
58763 406541 pfam16154 DUF4862 Domain of unknown function (DUF4862). This family consists of uncharacterized proteins around 300 residues in length and is mainly found in various high GC Gram+ bacteria, but also in several pathogenic and non-pathogenic enterobacteria (Salmonella, E. coli). Distant homology analysis suggests this could be a branch of Xylose isomerase-like TIM barrel family, but this prediction is currently not confirmed by experiment. 291
58764 406542 pfam16155 DUF4863 Domain of unknown function (DUF4863). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various delta-proteobacteria, but also several fungal species. Distant homology analysis suggests proteins from this family have a cupin-like fold and may be related to a group of lyases involved in the metabolism of benzoate. Few proteins from this family are annotated as p-hydroxylaminobenzoate lyases, NbaB, and this proposed function matches well their phylogenetic distribution, but there seems to be no direct experimental verification of this function, therefore at this point we call it a DUF. 153
58765 406543 pfam16156 DUF4864 Domain of unknown function (DUF4864). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Anabaena and Nostoc species. Distant homology analysis suggests this family is related to NTF2-like proteins and specifically to proteins that bind small molecules. HMM partly overlaps with Tol_Tol_Ttg2 (PF05494) involved in Toluene tolerance and lumazine binding family (PF12870) and these families should form a clan. 101
58766 406544 pfam16157 DUF4865 Domain of unknown function (DUF4865). This family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Bacillus species. Distant homology and fold prediction suggests proteins from this family would have a ferredoxin dimeric fold and specifically be related to the putative mono-oxygenase ydhR family PF08803, however this prediction has not been verified by experiment. 181
58767 406545 pfam16158 N_BRCA1_IG Ig-like domain from next to BRCA1 gene. Domain present between positions 365-485 in the human next to BRCA1 gene 1 protein Q14596 (NBR1_HUMAN). Distant homology and fold prediction analysis suggests this domain has an immunoglobulin like fold and is distantly homologous to domains involved in cell adhesion such as CARDB (PF07705). JCSG construct was crystallized confirming the domain boundaries. 101
58768 406546 pfam16159 FOXP-CC FOXP coiled-coil domain. This domain, approximately 60-70 residues in length, is mainly found in Forkhead box proteins in various Mammalia species. It is a coiled-coil domain, which modulates the dimeric associations of FOXP transcription factors. Several key disease mutations, for instance those found in the IPEX syndrome are located in this domain 68
58769 406547 pfam16160 DUF4866 Domain of unknown function (DUF4866). This family consists of uncharacterized proteins around 250 residues in length and is mainly found in various human gut Firmicute species and abundant in human gut metagenomic datasets. The function of this family is unknown. 246
58770 406548 pfam16161 DUF4867 Domain of unknown function (DUF4867). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various human gut Firmicutes and a few eubacteria species. It is also amply represented in human gut metagenomic datasets. Distant homology analysis and marginal HMM overlaps suggest this family is a distant homolog of Ureidoglycolate hydrolase pfam04115, but this prediction is not verified by experiment, therefore the function of this family is still unknown. 199
58771 406549 pfam16162 DUF4868 Domain of unknown function (DUF4868). This family consists of uncharacterized proteins around 320 residues in length and is found in a phylogenetically broad range of bacteria associated with the human gut microbiome. A member of this family from Lactobacillus casei CRL 705 is part of the gene cluster involved in synthesis of bacteriocin toxin, but the specific function of this family is unknown. 186
58772 406550 pfam16163 DUF4869 Domain of unknown function (DUF4869). This family consists of uncharacterized proteins around 150 residues in length. Its members are found in human gut Firmicutes and are also abundant in human gut metagenomics datasets. The function of this family is unknown. 128
58773 406551 pfam16164 VWA_N2 VWA N-terminal. This domain is found in von Willebrand factor proteins, where it is found to the N-terminus of the first VWA domain (pfam00092). 79
58774 406552 pfam16165 Ferlin_C Ferlin C-terminus. This domain is found at the C-terminus of proteins belonging to the ferlin family, including dysferlin, myoferlin, otoferlin and fer-1-like proteins. 154
58775 374405 pfam16166 TIC20 Chloroplast import apparatus Tic20-like. Chloroplast function requires the import of nuclear encoded proteins from the cytoplasm across the chloroplast double membrane. This is accomplished by two protein complexes, the Toc complex located at the outer membrane and the Tic complex located at the inner membrane. The Toc complex recognizes specific proteins by a cleavable N-terminal sequence and is primarily responsible for translocation through the outer membrane, while the Tic complex translocates the protein through the inner membrane. This entry represents Tic20, a core member of the Tic complex. This protein is deeply embedded in the inner envelope membrane and is thought to function as a protein-conducting component of the Tic complex. 177
58776 406553 pfam16167 DUF4871 Domain of unknown function (DUF4871). This family consists of uncharacterized proteins around 170 residues in length and is mainly found in various Bacillus species (B. cereus, B. thuringiensis and B. anthracis). The solved structure of B. anthracis homologs has a variant of the Greek-key beta barrel fold, making the DUF4870 family a member of a large group of bacterial immunoglobulin like domains, but the functional consequences of this classification remain unknown. 128
58777 406554 pfam16168 AIDA Adhesin of bacterial autotransporter system, probable stalk. The AIDA repeat is found on bacterial autotransporter proteins. As the repeat is short and occurs multiple times, it is likely to be the region of the transporter that acts as the stalk between the beta-barrel inserted into the membrane and the N-terminal head domain. 57
58778 406555 pfam16169 DUF4872 Domain of unknown function (DUF4872). Members of this family are often found in the gene neighborhood, or fused to, non-ribosomal peptide synthetases. 173
58779 406556 pfam16170 DUF4873 Domain of unknown function (DUF4873). This family consists mostly of short uncharacterized proteins found in various high GC Gram positive bacteria, primarily Mycobacterium species. However in some proteins, such as for instance Rv0943c proteins from Mycobacterium tuberculosis H37Rv, DUF4873 domain is found at the C-terminus, following the flavin-binding monooxygenase-like domain pfam00743, which is why probably many proteins with DUF4873 domains are annotated as monooxygenases. However these functions are not confirmed experimentally and the function of DUF4873 domain is still unknown. 91
58780 406557 pfam16171 CENP-T_N Centromere kinetochore component CENP-T N-terminus. CENP-T is a family of vertebrate kinetochore proteins that associates directly with CENP-W. The N-terminus of CENP-T proteins interacts directly with the Ndc80 complex in the outer kinetochore. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes. This family represents the N-terminus of CENP-T. 378
58781 406558 pfam16172 DOCK_N DOCK N-terminus. This family is found near to the N-terminus of dedicator of cytokinesis (DOCK) proteins, between the variant SH3 domain (pfam07653) and the C2 domain (pfam14429). 378
58782 406559 pfam16173 DUF4874 Domain of unknown function (DUF4874). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 161 and 175 amino acids in length. There is a conserved WGE sequence motif. 162
58783 406560 pfam16174 IHABP4_N Intracellular hyaluronan-binding protein 4 N-terminal. IHABP4_N is the N-terminal region of intracellular hyaluronan-binding protein 4-like and SERPINE1 mRNA binding protein 1-like proteins. This region carries nuclear localization sites, and may also be involved in the binding to some of the partners in the translational machinery. 155
58784 292793 pfam16175 DUF4875 Domain of unknown function (DUF4875). Small protein family, with members present in a few proteobacteria, mostly Desulfovibrio species, but also a Vibrio phage vB, suggesting a possible phage origin. The experimentally determined structure shows a fold reminiscent of a thioesterase/thiol ester dehydrase-isomerase fold, but the functional consequences of this similarity are not clear. 127
58785 406561 pfam16176 T-box_assoc T-box transcription factor-associated. This domain lies downstream of the T-box in many eukaryotic T-box proteins. The exact function is not known. 226
58786 406562 pfam16177 ACAS_N Acetyl-coenzyme A synthetase N-terminus. This domain is found at the N-terminus of many acetyl-coenzyme A synthetase enzymes. 55
58787 406563 pfam16178 Anoct_dimer dimerization domain of Ca+-activated chloride-channel, anoctamin. This family appears to be the cytoplasmic domain of the calcium-activated chloride-channel, anoctamin, protein. It is responsible for creating the homodimeric architecture of the chloride-channel proteins. 224
58788 406564 pfam16179 RHD_dimer Rel homology dimerization domain. The Rel homology domain (RHD) is composed of two structural domains, an N-terminal DNA_binding domain (pfam00554) and a C-terminal dimerization domain. This is the dimerization domain. 103
58789 374415 pfam16180 RelB_leu_zip RelB leucine zipper. This domain is a leucine zipper found in RelB transcription factors. 84
58790 406565 pfam16181 RelB_transactiv RelB transactivation domain. This domain is the transactivation domain of the transcription factor RelB. 181
58791 406566 pfam16182 AbLIM_anchor Putative adherens-junction anchoring region of AbLIM. AbLIM_anchor is a domain lying between the LIM actin-binding and the villin-head domain of actin-binding LIM proteins. It is likely that this domain is involved in anchoring abLIMs to circumferential actin bundles in specific cell types. 324
58792 406567 pfam16183 Kinesin_assoc Kinesin-associated. 171
58793 406568 pfam16184 Cadherin_3 Cadherin-like. 111
58794 406569 pfam16185 MTABC_N Mitochondrial ABC-transporter N-terminal five TM region. MTABC_N is the N-terminal five transmembrane helices of eukaryotic mitochondrial ABC-transporters. 244
58795 406570 pfam16186 Arm_3 Atypical Arm repeat. This atypical Arm repeat appears at the very C-terminus of eukaryotic proteins such as importin subunit alpha-2, as the last of the repeating units. 48
58796 406571 pfam16187 Peptidase_M16_M Middle or third domain of peptidase_M16. Peptidase_M16_M is the third domain of peptidase_M16 in eukaryotes of the insulin-degrading-enzyme type. Insulin-degrading enzymes - insulysin - are zinc metallopeptidases that metabolize several bioactive peptides, including insulin and the amyloid-beta-peptide. The tertiary structure of insulin-degrading enzymes resembles a clamshell composed of four structurally similar domains arranged to enclose a large central chamber. Substrates must enter the chamber, and it is likely that a hinge-like conformational change allows substrate binding and product release. Triphosphates are found to dock between the inner surfaces of the non-catalytic domains three and four. 283
58797 406572 pfam16188 Peptidase_M24_C C-terminal region of peptidase_M24. This is a short region at the C-terminus of a number of metallo-peptidases of the M24 family. 63
58798 406573 pfam16189 Creatinase_N_2 Creatinase/Prolidase N-terminal domain. 159
58799 406574 pfam16190 E1_FCCH Ubiquitin-activating enzyme E1 FCCH domain. This domain is found in the ubiquitin-activating E1 family enzymes. 69
58800 406575 pfam16191 E1_4HB Ubiquitin-activating enzyme E1 four-helix bundle. This domain is found in the ubiquitin-activating E1 family enzymes. 64
58801 406576 pfam16192 PMT_4TMC C-terminal four TMM region of protein-O-mannosyltransferase. PMT_4TMC is the C-terminal four membrane-pass region of protein-O-mannosyltransferases and similar enzymes. 198
58802 406577 pfam16193 AAA_assoc_2 AAA C-terminal domain. AAA_assoc_2 is found at the C-terminus of a relatively small set of AAA domains in proteins ranging from archaeal to fungi, plants and mammals. 80
58803 406578 pfam16195 UBA2_C SUMO-activating enzyme subunit 2 C-terminus. 93
58804 406579 pfam16197 KAsynt_C_assoc Ketoacyl-synthetase C-terminal extension. KAsynt_C_assoc represents the very C-terminus of a subset of proteins from the keto-acyl-synthetase 2 family. It is found in proteins ranging from bacteria to human. 111
58805 406580 pfam16198 TruB_C_2 tRNA pseudouridylate synthase B C-terminal domain. This C-terminal region is found on a subset of TruB_B protein family members pfam01509. It is found from bacteria and archaea to fungi, plants and human. 65
58806 406581 pfam16199 Radical_SAM_C Radical_SAM C-terminal domain. This domain is found as a C-terminal extension to a subset of Radical_SAM domains. It is found in archaeal, bacterial, fungal, plant and human proteins. 83
58807 406582 pfam16200 Band_7_C C-terminal region of band_7. This domain is found on a subset of proteins as a C-terminal extension of the Band_7 family, pfam01145. It is found in proteins from bacteria to fungi, plants and mammals. 63
58808 406583 pfam16201 NopRA1 Nucleolar pre-ribosomal-associated protein 1. This family is found on the long vertebrate and plant nucleolar proteins that also carry Npa1, pfam11707. 196
58809 406584 pfam16202 BLM_N N-terminal region of Bloom syndrome protein. BLM_N is the very N-terminal region of chordate Bloom syndrome proteins. The exact function is not known. 368
58810 406585 pfam16203 ERCC3_RAD25_C ERCC3/RAD25/XPB C-terminal helicase. This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases. 248
58811 406586 pfam16204 BDHCT_assoc BDHCT-box associated domain on Bloom syndrome protein. This family is found on Bloom syndrome-associated DEAD-box helicases in higher eukaryotes. It lies between the BDHCT, and DEAD-box families, pfam08072 and pfam00270. 223
58812 406587 pfam16205 Ribosomal_S17_N Ribosomal_S17 N-terminal. This short N-terminal region is found in a number of higher eukaryotic ribosomal subunit 17 proteins. 67
58813 374431 pfam16206 Mon2_C C-terminal region of Mon2 protein. Mon2 proteins are found from fungi to plants to human and are scaffold proteins involved in multiple aspects of endomembrane trafficking. This C-terminal region is essential for Mon2 activity. 827
58814 406588 pfam16207 RAWUL RAWUL domain RING finger- and WD40-associated ubiquitin-like. The RAWUL domain is found at the C-terminus of poly-comb group RING finger proteins. It is a ubiquitin-like domain. RAWUL binds directly to PUFD, a domain on BCOR proteins (BCL6 corepressor). BCOR has emerged as an important player in development and health. 66
58815 406589 pfam16208 Keratin_2_head Keratin type II head. 160
58816 406590 pfam16209 PhoLip_ATPase_N Phospholipid-translocating ATPase N-terminal. PhoLip_ATPase_N is found at the N-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes. 67
58817 406591 pfam16210 Keratin_2_tail Keratin type II cytoskeletal 1 tail. 135
58818 406592 pfam16211 Histone_H2A_C C-terminus of histone H2A. 35
58819 406593 pfam16212 PhoLip_ATPase_C Phospholipid-translocating P-type ATPase C-terminal. PhoLip_ATPase_C is found at the C-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes. 249
58820 406594 pfam16213 DCB dimerization and cyclophilin-binding domain of Mon2. DCB is the N-terminal domain of Mon2- and GIG1-like proteins from metazoa. Mon2 and BIG1 like proteins play an important role in the cytoplasm-to-vacuole transport pathway and are required for Golgi homeostasis. 172
58821 318454 pfam16214 AC_N Adenylyl cyclase N-terminal extracellular and transmembrane region. This family covers the N-terminal extracellular region and the first transmembrane 5-6 pass region of adenylate cyclase. 415
58822 406595 pfam16215 DUF4876 Protein of unknown function (DUF4876). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 433 amino acids in length. There is a conserved NNS sequence motif. 186
58823 406596 pfam16216 GxGYxYP_N GxGYxY sequence motif in domain of unknown function N-terminal. This domain is found in bacteria, archaea and eukaryotes, and is typically between 213 and 231 amino acids in length. This domain is found in association with pfam14323. 215
58824 406597 pfam16217 M64_N Peptidase M64 N-terminus. This domain is found at the N-terminus of IgA Peptidase M64. Its function is unknown. 115
58825 406598 pfam16218 Peptidase_C101 Peptidase family C101. This is a family of cysteine-peptidases that is conserved in vertebrates. The key residues as found in human OTULIN are Asp126, Cys129, His339 and Asn341. 264
58826 406599 pfam16219 DUF4879 Domain of unknown function (DUF4879). A family of short bacterial proteins of phage origin, exemplified by protein SPBc2p013 from Bacillus phage SPBc2, found in various Bacillus and Pseudomonas species. The structure shows unexpected structural similarity to Greek key beta barrels from the E-set domain, especially to domains involved in carbohydrate and protein-protein binding. However, the functional consequences of this similarity are not confirmed. 123
58827 406600 pfam16220 DUF4880 Domain of unknown function (DUF4880). This domain can be found at the N-terminus of uncharacterized proteins from various Rhodopseudomonas and Pseudomonas species, often, but not always, followed by the iron siderophore sensor protein family (FecR, PF04773). The function of this domain is unknown. 43
58828 379798 pfam16221 HTH_47 winged helix-turn-helix. HTH_47 is an example of a circularly permuted winged helix-turn-helix domain. HTH_47 is found at the very C-terminus of DUF2172, which is structurally similar to M28-peptidases but lacking one of the key zinc-binding residues. 77
58829 339666 pfam16222 DUF4881 Domain of unknown function (DUF4881). This small family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Desulfovibrio species. The function of this protein is unknown. 180
58830 374442 pfam16223 DUF4882 Domain of unknown function (DUF4882). This small family consists of several uncharacterized proteins around 325 residues in length and is mainly found in various Acinetobacter species. The function of this family is unknown. 267
58831 406601 pfam16224 DUF4883 Domain of unknown function (DUF4883). This family consists of several uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 118
58832 406602 pfam16225 DUF4884 Domain of unknown function (DUF4884). This family consists of several uncharacterized proteins around 90 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 49
58833 292843 pfam16226 DUF4885 Domain of unknown function (DUF4885). This family consists of several uncharacterized proteins around 390 residues in length and is mainly found in various Bacillus subtilis species. This family is predicted to be functional in biosynthesis of rhizocticins and antifungal phosphonate oligopeptides, but the specific function of this family is still unknown. 325
58834 406603 pfam16227 DUF4886 Domain of unknown function (DUF4886). This domain is mainly found in uncharacterized proteins around 290 residues in length and is mainly found in various Bacteroides species. It has a curved central beta sheet flanked by helices. Distant homology analysis showed it has similarity to the GDSL-like Lipase/Acylhydrolase family. The function of this domain is still unknown. 250
58835 374444 pfam16228 DUF4887 Domain of unknown function (DUF4887). This family consists of uncharacterized proteins around 210 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown. 176
58836 292846 pfam16229 DUF4888 Domain of unknown function (DUF4888). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown. 141
58837 406604 pfam16230 DUF4889 Domain of unknown function (DUF4889). This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various Staphylococcus aureus species. The function of this family is unknown. 71
58838 406605 pfam16231 DUF4890 Domain of unknown function (DUF4890). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 109
58839 406606 pfam16232 DUF4891 Domain of unknown function (DUF4891). This family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 92
58840 406607 pfam16233 DUF4893 Domain of unknown function (DUF4893). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Pseudomonas species. The function of this family is unknown. 171
58841 406608 pfam16234 DUF4892 Domain of unknown function (DUF4892). This family consists of uncharacterized proteins around 270 residues in length and is mainly found in various Pseudomonas aeruginosa species. The function of this family is unknown. 182
58842 406609 pfam16235 DUF4894 Domain of unknown function (DUF4894). A small family of uncharacterized proteins around 180 residues in length and found in various Thermotoga species. The function of this family is unknown. 122
58843 406610 pfam16236 DUF4895 Domain of unknown function (DUF4895). A small family of uncharacterized proteins around 250 residues in length and found in various Thermotoga species. The function of this family is unknown. 218
58844 406611 pfam16237 DUF4896 Domain of unknown function (DUF4896). A small family of uncharacterized proteins around 50 or 570 residues in length and found in various Thermotoga species. The function of this family is unknown. 44
58845 406612 pfam16238 DUF4897 Domain of unknown function (DUF4897). A small family of uncharacterized proteins around 200 residues in length and found in various Thermotoga species. The function of this family is unknown. 152
58846 406613 pfam16239 DUF4898 Domain of unknown function (DUF4898). A small family of uncharacterized proteins around 100 residues in length and found in various Sulfolobus species. The function of this family is unknown. 82
58847 406614 pfam16240 DUF4899 Domain of unknown function (DUF4899). A small family of uncharacterized proteins around 340 residues in length and found in various Thermotoga and Thermosipho species. The function of this family is unknown. 283
58848 292858 pfam16241 DUF4900 Domain of unknown function (DUF4900). This family consists of uncharacterized proteins around 600 residues in length and is mainly found in various Thermotoga and Fervidobacterium species. The function of this family is unknown. 89
58849 406615 pfam16242 Pyrid_ox_like Pyridoxamine 5'-phosphate oxidase like. This domain, approximately 140 residues in length, is mainly found in general stress proteins in various Xanthomonas species. It is composed of a six-stranded antiparallel beta-barrel flanked by five alpha-helices and can bind to FMN and FAD, suggesting that it may help the bacteria to react against the oxidative stress induced by the defense mechanisms of the plant. 149
58850 374451 pfam16243 Sm_like Sm_like domain. This domain, approximately 150 residues, is mainly found in several uncharacterized proteins in various Prochlorococcus and Synechococcus species. The crystal structure of ECX21941 reveals unexpected similarity to Sm/LSm proteins, which are important RNA-binding proteins, despite no detectable sequence similarity. The specific function of this family is unknown, but the structure analysis of ECX21941 indicates nucleic acid-binding capabilities and suggests a role in RNA and/or DNA processing. 85
58851 406616 pfam16244 DUF4901 Domain of unknown function (DUF4901). This family consists of uncharacterized proteins around 470 residues in length and is mainly found in various Bacillus subtilis species. The function of this family is unknown. 228
58852 406617 pfam16245 DUF4902 Domain of unknown function (DUF4902). A family of uncharacterized proteins around 140 residues in length and found in various Acidithiobacillaceae and Acinetobacter species. It may be functional in the extreme acidophile Acidithiobacillus ferrooxidans, but the specific function of this family is unknown. 118
58853 406618 pfam16246 DUF4903 Domain of unknown function (DUF4903). A small family of uncharacterized proteins around 210 residues in length and found in various Bacteroides and Prevotella species. The function of this family is unknown. 190
58854 292864 pfam16247 DUF4904 Domain of unknown function (DUF4904). This domain, approximately 130 residues in length, is mainly found in several uncharacterized proteins around 340 residues in Actinobacteria, Cyanobacteria and Metazoa species. It is mainly composed of antiparallel beta sheets and has a cystatin-like fold, but the specific function of this family is unknown. 127
58855 379805 pfam16248 DUF4905 Domain of unknown function (DUF4905). A small family of uncharacterized proteins around 270 residues in length and found in various Cytophagales, Sphingobacteriaceae and Ignavibacteriaceae species. The function of this family is unknown. 81
58856 406619 pfam16249 DUF4906 Domain of unknown function (DUF4906). A family of uncharacterized proteins around 300 residues in length and found in various Bacteroides species. The function of this family is unknown. 203
58857 406620 pfam16250 DUF4907 Domain of unknown function (DUF4907). A family of uncharacterized proteins around 110 residues in length and found in various Bacteroides species. The function of this family is unknown. 65
58858 406621 pfam16251 NAR Nucleic acid-binding domain (NAR). This domain, approximately 100 residues in length, is mainly found in Orf1a polyproteins in severe acute respiratory syndrome coronavirus. The global domain of the NAR represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands and a group of residues form a positively charged patch on the protein surface as the binding site responsible for binding affinity for nucleic acids. 129
58859 318482 pfam16252 DUF4908 Domain of unknown function (DUF4908). A small family of uncharacterized proteins around 260 residues in length and found in various Caulobacter and Brevundimonas species. The function of this family is unknown. 221
58860 292870 pfam16253 DUF4909 Domain of unknown function (DUF4909). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. Several members are associated with vancomycin virulence in Staph. aureus in some way. These proteins are all lipoproteins, carrying the characteristic prokaryotic membrane-attachment site at their N-termini. 127
58861 406622 pfam16254 DUF4910 Domain of unknown function (DUF4910). 339
58862 318484 pfam16255 Lipase_GDSL_lke GDSL-like Lipase/Acylhydrolase. 202
58863 406623 pfam16256 DUF4911 Domain of unknown function (DUF4911). This family consists of uncharacterized proteins around 75 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown. 57
58864 406624 pfam16257 UxaE tagaturonate epimerase. This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteria species, such as Thermotoga, Paenibacillus and Rhodothermus. A newly recognized enzyme from the galacturonate utilization pathway in T. maritima with tagaturonate epimerase activity. 475
58865 406625 pfam16258 DUF4912 Domain of unknown function (DUF4912). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown. 117
58866 406626 pfam16259 DUF4913 Domain of unknown function (DUF4913). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Arthrobacter species. This family may function in enabling the growth of Arthrobacter sp. strain JBH1 with nitroglycerin as the sole source of carbon and nitrogen. 105
58867 406627 pfam16260 DUF4914 Domain of unknown function (DUF4914). This family consists of uncharacterized proteins around 630 residues in length and is mainly found in various Thermotoga, Thermoanaerobacter and Carboxydibrachium species. The function of this family is unknown. 606
58868 406628 pfam16261 DUF4915 Domain of unknown function (DUF4915). This family consists of uncharacterized proteins around 370 residues in length and is mainly found in various species, such as Shewanella, Rheinheimera, Saccharophagus, Leptolyngbya and so on. It contains several TPR repeat-containing proteins. The function of this family is unknown. 314
58869 406629 pfam16262 DUF4916 Domain of unknown function (DUF4916). This domain family consists of uncharacterized proteins around 175 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown. This family is related to the NUDIX hydrolases. 169
58870 374458 pfam16263 DUF4917 Domain of unknown function (DUF4917). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Burkholderia and Brucella species. The function of this family is unknown. 311
58871 339675 pfam16264 SatD SatD family (SatD). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Streptococcus species. This family is involved in acid resistance. 211
58872 406630 pfam16265 DUF4918 Domain of unknown function (DUF4918). This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Listeria species. The function of this family is unknown. 224
58873 406631 pfam16266 DUF4919 Domain of unknown function (DUF4919). This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown. 184
58874 406632 pfam16267 DUF4920 Domain of unknown function (DUF4920). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 85
58875 406633 pfam16268 DUF4921 Domain of unknown function (DUF4921). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various Corynebacterium species. Several proteins are predicted as galactose-1-phosphate uridylyltransferases. The function of this family is unknown. 425
58876 406634 pfam16269 DUF4922 Domain of unknown function (DUF4922). This family consists of uncharacterized proteins around 310 residues in length and is mainly found in various Bacteroides and Parabacteroides species. Several members are annotated as putative glycosyltransferases, but the specific function of this family is still unknown. 188
58877 406635 pfam16270 DUF4923 Domain of unknown function (DUF4923). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown. 175
58878 406636 pfam16271 DUF4924 Domain of unknown function (DUF4924). This family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Parabacteroides and Bacteroides species. The function of this family is unknown. 179
58879 406637 pfam16272 DUF4925 Domain of unknown function (DUF4925). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 339
58880 406638 pfam16273 NuDC Nuclear distribution C domain. This domain, approximately 40-50 residues in length, is mainly found in nuclear migration proteins in various Mammalia species. It may play a role not only in mitosis and cytokinesis, but also in interkinetic nuclear migration and neuronal migration during neocortical development. 64
58881 406639 pfam16274 Qua1 Qua1 domain. This domain, approximately 40 residues in length, is mainly found in KH-domain containing, RNA-binding, signal transduction-associated protein 1 from yeast to human. It forms a homodimer composed of a perpendicular interaction of two helical hairpins, and the Qua1 domain is sufficient for homodimerization which is required for the regulation of alternative splicing. 52
58882 406640 pfam16275 SF1-HH Splicing factor 1 helix-hairpin domain. This domain, approximately 100 residues in length, is mainly found in splicing factor 1 from yeast to human. It is a helix-hairpin domain, which forms a secondary, hydrophobic interface with U2AF65(UHM) to lock the orientation of the two subunits, which is essential for cooperative formation of the ternary SF1-U2AF65-RNA complex. This domain contains a highly conserved SPSP motif at its C-terminus, and phosphorylation of the SPSP motif induces a disorder-to-order transition within a novel SF1/U2AF65 interface, indicating a phosphorylation-dependent control of pre-mRNA splicing factors. 114
58883 406641 pfam16276 NPM1-C Nucleophosmin C-terminal domain. This domain, approximately 50 residues in length, is mainly found in Nucleophosmin proteins in mammalia species. Nucleophosmin, a nucleocytoplasmic shuttling protein, is related with cancer and involved in several cellular functions, such as ribosome maturation and export, centrosome duplication, and response to stress stimuli. This domain has a three-helix bundle which can bind G-quadruplex DNA and the interaction involves helices H1 and H2 of the NPM1-C domain mainly through electrostatic contacts with G-quadruplex phosphates, indicating a crucial role in rescuing its function in leukemia. 47
58884 406642 pfam16277 DUF4926 Domain of unknown function (DUF4926). This family consists of uncharacterized proteins around 70 residues in length and is mainly found in various Caulobacter, Microcystis and Cyanothece species. The function of this family is unknown. 58
58885 406643 pfam16278 zf-C2HE C2HE / C2H2 / C2HC zinc-binding finger. zf-C2HE is an unusual zinc-binding domain found in fungi, plants and metazoa. It is often found at the C-terminus of HIT-domain-containing proteins, pfam01230. In fungi the fourth ligand is a Glu, in plants it is Cys and in metazoans it is usually a His. The fourth ligand is often mutated in neurodegenerative disease-states. 60
58886 406644 pfam16279 DUF4927 Domain of unknown function (DUF4927). This family, around 80 residues, consists of uncharacterized and nuclear receptor coactivator 2 proteins and is mainly found in mammalia species. The specific function of this family is still unknown. 89
58887 374468 pfam16280 DUF4928 Domain of unknown function (DUF4928). This family consists of uncharacterized proteins around 330 residues in length and is mainly found in various Bacteria species, such as Enterobacteriales, Clostridiales, Actinomycetales and so on. The function of this family is unknown. 306
58888 406645 pfam16282 SANT_DAMP1_like SANT/Myb-like domain of DAMP1. This domain, approximately 90 residues, is mainly found in DNA methyltransferase 1-associated protein 1 (DAMP1) that plays an important role in development and maintenance of genome integrity in various mammalia species. It mainly consists of tandem repeats of three alpha-helices that are arranged in a helix-turn-helix motif and shows a structural similarity with SANT domain and Myb DNA-binding domain, indicating it contains a putative DNA binding site. 80
58889 406646 pfam16283 DUF4929 Domain of unknown function (DUF4929). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various species, such as Bacteroides, Capnocytophaga and Prevotella. The function of this family is unknown. 366
58890 374471 pfam16284 DUF4930 Domain of unknown function (DUF4930). A small family of uncharacterized proteins around 150 residues in length and found in various Staphylococcus aureus species. The function of this family is unknown. 144
58891 406647 pfam16285 DUF4931 Domain of unknown function (DUF4931). This family consists of uncharacterized proteins around 270 residues in length and is mainly found in various Bacillus cereus species. Some members of this family are annotated as Galactose-1-phosphate uridylyltransferases, but the specific function of this family is unknown. 245
58892 406648 pfam16286 DUF4932 Domain of unknown function (DUF4932). This family consists of uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis, Bacteroides sp. and so on. Several members are annotated as putative metalloproteases, but the specific function of this family is unknown. 330
58893 406649 pfam16287 DUF4933 Domain of unknown function (DUF4933). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various species, such as Bacteroides and Parabacteroides. Several members are annotated as putative transmembrane proteins, but the specific function of this family is unknown. 386
58894 406650 pfam16288 DUF4934 Domain of unknown function (DUF4934). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown. 102
58895 406651 pfam16289 DUF4935 Domain of unknown function (DUF4935). This family consists of uncharacterized proteins around 350 residues in length and is mainly found in various species, such as Prevotella, Pseudomonas, Leptospira and so on. The function of this family is unknown. 171
58896 406652 pfam16290 DUF4936 Domain of unknown function (DUF4936). This family consists of uncharacterized proteins around 100 residues in length and is mainly found in various Burkholderiales species, such as Herbaspirillum, Cupriavidus, Ralstonia and so on. The function of this family is unknown. 87
58897 406653 pfam16291 DUF4937 Domain of unknown function (DUF4937). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Bacillus species, such as Bacillus subtilis and Bacillus amyloliquefaciens. Several members are annotated as ydbC, but the specific function of this family is unknown. 89
58898 292908 pfam16292 DUF4938 Domain of unknown function (DUF4938). This small family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Chloroflexus, Comamonas, Delftia, Rubrivivax and Roseiflexus species. Several members are annotated as cyanophycin synthetases, but the function of this family is unknown. 302
58899 406654 pfam16293 zf-C2H2_9 C2H2 type zinc-finger (1 copy). 57
58900 406655 pfam16294 RSB_motif RNPS1-SAP18 binding (RSB) motif. The RSB motif on the Acinus protein is the core around which the ASAP complex is built. The apoptosis and splicing-associated protein complex, ASAP, is made up of three proteins, SAP18 (Sin3-associated protein of 18 kDa), RNA-binding protein S1 (RNPS1) and apoptotic chromatin inducer in the nucleus (Acinus). The ASAP complex appears to be an assembly of proteins at the interface between transcription, splicing and NMD, acting as a hub in the network of protein-interactions that regulate gene-expression. 91
58901 292911 pfam16295 TetR_C_10 Tetracycline repressor, C-terminal all-alpha domain. 132
58902 406656 pfam16296 TM_PBP2_N N-terminal of TM subunit in PBP-dependent ABC transporters. This family mainly consists of Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs, such as Binding-protein-dependent transport systems inner membrane component and Maltose transport permease MalF. It is around 580 residues in length and is mainly found in various species, such as Thermotoga, Dictyoglomus, Thermosipho, Fervidobacterium, Mesotoga and so on. The function of this family is unknown. 78
58903 406657 pfam16297 DUF4939 Domain of unknown function (DUF4939). This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various mammalia species. LDOC1, a member of this family and a novel MZF-1-interacting protein, inhibits NF-kappaB activation and relates with cancer and some other diseases. But the specific function of this family is still unknown. 114
58904 406658 pfam16298 DUF4940 Domain of unknown function (DUF4940). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown. 206
58905 406659 pfam16299 DUF4941 Domain of unknown function (DUF4941). This family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown. 265
58906 406660 pfam16300 WD40_4 Type of WD40 repeat. Most members of this family form part of the 7-bladed beta-propeller at the N-terminus of coronin proteins. 44
58907 339683 pfam16301 DUF4943 Domain of unknown function (DUF4943). This small family consists of several uncharacterized proteins around 170 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 150
58908 292918 pfam16302 DUF4944 Domain of unknown function (DUF4944). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Bacillus species. The function of this family is unknown. 128
58909 292919 pfam16303 DUF4945 Domain of unknown function (DUF4945). This small family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown. 115
58910 318518 pfam16304 DUF4946 Domain of unknown function (DUF4946). This small family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Pseudomonas species, especially in Pseudomonas aeruginosa. The function of this family is unknown. 152
58911 406661 pfam16305 DUF4947 Domain of unknown function (DUF4947). This small family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Streptococcus mutans species. The function of this family is unknown. 169
58912 374480 pfam16306 DUF4948 Domain of unknown function (DUF4948). This small family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides, Paraprevotella, Parabacteroides and Alistipes species. The function of this family is unknown. 171
58913 292923 pfam16307 DUF4949 Domain of unknown function (DUF4949). This small family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Legionella pneumophila and longbeachae species. The function of this family is unknown. 107
58914 292924 pfam16308 DUF4950 Domain of unknown function (DUF4950). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Enterococcus faecalis species. The function of this family is unknown. 191
58915 339686 pfam16309 DUF4951 Domain of unknown function (DUF4951). This family consists of several uncharacterized proteins around 125 residues in length and is mainly found in various Acinetobacter baumannii species. The function of this family is unknown. 83
58916 374481 pfam16310 DUF4952 Domain of unknown function (DUF4952). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in various Leptospira, Pseudomonas, Stenotrophomonas and Desulfovibrio species. The function of this family is unknown. 77
58917 406662 pfam16311 TMEM100 Transmembrane protein 100. This family of proteins is found in eukaryotes. Proteins in this family are approximately 130 amino acids in length. There is some apparent similarity with the phosphoinositide-interacting protein family PIRT, pfam15099, because those proteins are also transmembrane proteins. 132
58918 406663 pfam16312 Oberon_cc Coiled-coil region of Oberon. Oberon_cc is the coiled-coil region of Oberon proteins from plants. Oberon is necessary for maintenance and/or establishment of both the shoot and root apical meristems in Arabidopsis. Most Oberon proteins carry a PHD finger domain, pfam07227 and this coiled-coil domain. Oberon proteins mediate the TMO7 (the direct target of MP) expression through modification of, or binding to, chromatin at the TMO7 locus. TMO7 stands for the target of Monopteros 7 (or Auxin response factor 7). 129
58919 406664 pfam16313 DUF4953 Met-zincin. This is a family of uncharacterized proteins that carry the highly characteristic met-zincin motif HExxHxxGxxH, the extended zinc-binding domain of metallopeptidases. 319
58920 406665 pfam16314 DUF4954 Domain of unknown function (DUF4954). This family consists of uncharacterized proteins around 660 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 653
58921 406666 pfam16315 DUF4955 Domain of unknown function (DUF4955). This family consists of uncharacterized proteins around 850 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 149
58922 406667 pfam16316 DUF4956 Domain of unknown function (DUF4956). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 169
58923 406668 pfam16317 Glyco_hydro_99 Glycosyl hydrolase family 99. This domain, around 350 residues, is mainly found in some uncharacterized proteins from bacteroides to human. Some proteins in this family, annotated as endo-alpha-mannosidases cleave mannoside linkages internally within an N-linked glycan chain, short circuiting the classical N-glycan biosynthetic pathway. This domain reveals a (beta-alpha)(8) barrel fold in which the catalytic centre is present in a long substrate-binding groove, consistent with cleavage within the N-glycan chain, providing a foundation upon which to develop new enzyme inhibitors targeting the hijacking of N-glycan synthesis in viral disease and cancer. 342
58924 406669 pfam16318 DUF4957 Domain of unknown function (DUF4957). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown. 141
58925 406670 pfam16319 DUF4958 Domain of unknown function (DUF4958). This family consists of uncharacterized proteins around 720 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 731
58926 406671 pfam16320 Ribosomal_L12_N Ribosomal protein L7/L12 dimerization domain. This is the N-terminal dimerization domain of ribosomal protein L7/L12. 48
58927 406672 pfam16321 Ribosom_S30AE_C Sigma 54 modulation/S30EA ribosomal protein C-terminus. This domain often occurs at the C-terminus of proteins containing pfam02482. 53
58928 406673 pfam16322 Tub_N Tubby N-terminal. Tub_N is the N-terminal region of Tubby proteins. It carries a nuclear localization signal and is able to activate transcription. 200
58929 406674 pfam16323 DUF4959 Domain of unknown function (DUF4959). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides, Pedobacter and Parabacteroides species. Several proteins are annotated as Galactose-binding like proteins, but the specific function of this protein is unknown. 106
58930 406675 pfam16324 DUF4960 Domain of unknown function (DUF4960). This family consists of uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 253
58931 406676 pfam16325 Peptidase_U32_C Peptidase family U32 C-terminal domain. This domain is found at the C-terminus of many members of Peptidase family U32 (pfam01136). 80
58932 406677 pfam16326 ABC_tran_CTD ABC transporter C-terminal domain. This domain is found at the C-terminus of ABC transporters. It has a coiled coil structure with an atypical 3(10)-helix in the alpha-hairpin region. It is involved in DNA binding. 69
58933 406678 pfam16327 CcmF_C Cytochrome c-type biogenesis protein CcmF C-terminal. This C-terminal region of CcmF, one of the cytochrome c-type biogenesis proteins, is associated at the C-terminal with Cytochrome_C_asm family pfam01578. It is possible that it is this domain which delivers reductant to haem on CcmE. 323
58934 292944 pfam16328 DUF4961 Domain of unknown function (DUF4961). This small family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 317
58935 318535 pfam16329 Pestivirus_E2 Pestivirus envelope glycoprotein E2. 372
58936 406679 pfam16330 MukB_hinge MukB hinge domain. The hinge domain of chromosome partition protein MukB is responsible for dimerization and is also involved in protein-DNA interactions and conformational flexibility. 167
58937 406680 pfam16331 TolA_bind_tri TolA binding protein trimerisation. This is the N-terminal domain of the YbgF protein. YbgF binds to TolA. This domain mediates trimerisation. 72
58938 406681 pfam16332 DUF4962 Domain of unknown function (DUF4962). This family consists of uncharacterized proteins around 870 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 476
58939 406682 pfam16334 DUF4964 Domain of unknown function (DUF4964). This family consists of uncharacterized proteins around 840 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glutaminases, but the function of this protein is unknown. 87
58940 406683 pfam16335 DUF4965 Domain of unknown function (DUF4965). This family consists of uncharacterized proteins around 840 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glutaminases, but the function of this protein is unknown. 174
58941 406684 pfam16338 DUF4968 Domain of unknown function (DUF4968). This family consists of uncharacterized proteins around 830 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as alpha-glucosidases, but the function of this protein is unknown. 90
58942 318542 pfam16339 DUF4969 Domain of unknown function (DUF4969). This small family consists of several uncharacterized proteins around 540 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 79
58943 406685 pfam16341 DUF4971 Domain of unknown function (DUF4971). This small family consists of uncharacterized proteins around 370 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 139
58944 292954 pfam16342 DUF4972 Domain of unknown function (DUF4972). This family consists of uncharacterized proteins around 490 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 128
58945 406686 pfam16343 DUF4973 Domain of unknown function (DUF4973). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown. 130
58946 406687 pfam16344 DUF4974 Domain of unknown function (DUF4974). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this protein is unknown. 70
58947 406688 pfam16346 DUF4975 Domain of unknown function (DUF4975). This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glycosyl hydrolases, but the function of this protein is unknown. 176
58948 406689 pfam16347 DUF4976 Domain of unknown function (DUF4976). This family consists of uncharacterized proteins around 530 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Arylsulfatases, but the function of this protein is unknown. 103
58949 406690 pfam16348 Corona_NSP4_C Coronavirus nonstructural protein 4 C-terminus. This is the C-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex (RTC) to modified endoplasmic reticulum membranes. This predominantly alpha-helical domain may be involved in protein-protein interactions. 92
58950 406691 pfam16349 DUF4978 Domain of unknown function (DUF4978). This family consists of uncharacterized proteins around 540 residues in length and is mainly found in various Bacteroides and Prevotella species. Several proteins in this family are annotated as Glycoside hydrolases, but the function of this protein is unknown. 172
58951 406692 pfam16350 FAO_M FAD dependent oxidoreductase central domain. This domain occurs in several FAD dependent oxidoreductases, including Sarcosine dehydrogenase and Dimethylglycine dehydrogenase. It is situated between the DAO domain (pfam01266) and the GCV_T domain (pfam01571). 56
58952 406693 pfam16351 DUF4979 Domain of unknown function (DUF4979). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 158
58953 406694 pfam16352 DUF4980 Domain of unknown function (DUF4980). This family consists of uncharacterized proteins around 610 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 104
58954 406695 pfam16353 DUF4981 Domain of unknown function (DUF4981). This family consists of uncharacterized proteins around 1000 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 90
58955 406696 pfam16355 DUF4982 Domain of unknown function (DUF4982). This family is found in the C-terminal of uncharacterized proteins and beta-galactosidases around 680 residues in length from various Bacteroides species. The function of this protein is unknown. 62
58956 374498 pfam16356 DUF4983 Domain of unknown function (DUF4983). This family consists of uncharacterized proteins around 600 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown. 93
58957 374499 pfam16357 PepSY_TM_like_2 Putative PepSY_TM-like. This is a family of bacterial proteins with three PepSY-like TM regions. 197
58958 406697 pfam16358 RcsF RcsF lipoprotein. The RcsF lipoprotein is a component of the Rcs signaling system. It activates the Rcs system by transmitting signals from the cell surface to the histidine kinase RcsC. 110
58959 406698 pfam16359 RcsD_ABL RcsD-ABL domain. This domain is part of the RcsD histidine kinase. It recognizes the effector domain of RcsB. 103
58960 406699 pfam16360 GTP-bdg_M GTP-binding GTPase Middle Region. This family locates between the N-terminal domain and MMR_HSR1 50S ribosome-binding GTPase of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. This region is unknown for its function. 79
58961 406700 pfam16361 Peptidase_S8_N N-terminal of Subtilase family. This is the N-terminal of Peptidase_S8 of subtilase family. It is around 100 residues in length from various Bacteroides species. The function of this family is unknown. 142
58962 406701 pfam16362 YaiA YaiA protein. This family of proteins is found in Enterobacteriaceae, where they are immediately downstream of a Shikimate kinase. 63
58963 406702 pfam16363 GDP_Man_Dehyd GDP-mannose 4,6 dehydratase. 327
58964 406703 pfam16364 Antigen_C Cell surface antigen C-terminus. This repeated domain is found at the C-terminus of cell surface antigens. In the Streptococcus mutans antigen I/II there are three repeats of this domain, a cleft between the first two of these forms a binding site for the human salivary agglutinin (SAG). 171
58965 292975 pfam16365 EutK_C Ethanolamine utilization protein EutK C-terminus. This is the C-terminal domain of the ethanolamine utilization protein EutK. It is a helix-turn-helix domain and is predicted to bind to nucleic acids. 55
58966 406704 pfam16366 CEBP_ZZ Cytoplasmic polyadenylation element-binding protein ZZ domain. This ZZ-type zinc finger domain binds zinc via two conserved histidines in the C-terminal part of the domain. 56
58967 406705 pfam16367 RRM_7 RNA recognition motif. 94
58968 406706 pfam16368 CEBP1_N Cytoplasmic polyadenylation element-binding protein 1 N-terminus. This is the N-terminal domain of cytoplasmic polyadenylation element-binding protein 1. 307
58969 406707 pfam16369 GH43_C C-terminal of Glycosyl hydrolases family 43. This is the C-terminal of Glycosyl hydrolases family 43. It is around 100 residues in length from various Bacteroides species. The function of this family is unknown. 106
58970 406708 pfam16370 MetallophosC C terminal of Calcineurin-like phosphoesterase. This is the C-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown. 156
58971 406709 pfam16371 MetallophosN N terminal of Calcineurin-like phosphoesterase. This is the N-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown. 73
58972 406710 pfam16372 DUF4984 Domain of unknown function (DUF4984). This domain is around 150 residues long and is located in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this domain remains unknown. 163
58973 406711 pfam16373 DUF4985 Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 114
58974 406712 pfam16374 CIF Cycle inhibiting factor (CIF). Cycle inhibiting factors (Cif) are bacterial effectors that interfere with the eukaryotic cell cycle. CIF induces an irreversible cell cycle arrest upon injection into the host cell. CIF blocks degradation of cyclin-dependent kinase inhibitors p21 and p27, inducing their accumulation in the cell. The x-ray crystal structure of Cif reveals it to be a divergent member of a superfamily of enzymes including cysteine proteases and acetyltransferases. 138
58975 406713 pfam16375 DUF4986 Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Bacillus species. The function of this family remains unknown. 84
58976 292986 pfam16376 fragilysinNterm N-terminal domain of fragilysin. N-terminal domain of fragilysin, an extracellular metalloprotease toxin, which is the primary virulence factor of B. fragilis, an opportunistic pathogen of the human gut. The N-terminal domain of fragilysin inhibits fragilysin and is cleaved in a mature, virulent form. 144
58977 406714 pfam16377 DUF4987 Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 145
58978 406715 pfam16378 DUF4988 Domain of unknown function. This family around 200 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Alistipes species. The function of this family remains unknown. The N-terminus of this model has been clipped by ~30 residues as it was capturing parts of collagen sequences, pfam01391. 181
58979 339721 pfam16379 DUF4989 Domain of unknown function (DUF4989). This family around 300 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Alistipes species. The function of this family remains unknown. This entry contains a duplication of a DUF1735-like domain. 293
58980 406716 pfam16380 DUF4990 Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown. 142
58981 406717 pfam16381 Coatomer_g_Cpla Coatomer subunit gamma-1 C-terminal appendage platform. Coatomer_g_Cpla is the very C-terminal domain of the eukaryotic Coatomer subunit gamma-1 proteins. It acts as a platform domain to the C-terminal appendage. It carries one single protein/protein interaction site, which is the binding site for ARFGAP2 or ADP-ribosylation factor GTPase-activating protein. COPI-coated vesicles mediate retrograde transport from the Golgi back to the ER and intra-Golgi transport. The gamma-COPI is part of one of two subcomplexes that make up the heptameric coatomer complex along with the beta, delta and zeta subunits. 114
58982 406718 pfam16383 DUF4992 Domain of unknown function. This family around 150 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 182
58983 406719 pfam16384 DUF4993 Domain of unknown function. This family around 350 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown. 366
58984 406720 pfam16385 DUF4994 Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 98
58985 339724 pfam16386 DUF4995 Domain of unknown function. This family around 100 residues locates in the N-terminal of some uncharacterized proteins and glucuronyl hydrolases in various Bacteroides species. The function of this family remains unknown. 73
58986 406721 pfam16387 DUF4996 Domain of unknown function. This family around 100 residues locates in the N-terminal of some glycerophosphoryl diester phosphodiesterases and uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown. 102
58987 406722 pfam16389 DUF4998 Domain of unknown function. This family around 200 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown. 199
58988 406723 pfam16390 DUF4999 Domain of unknown function. This family around 75 residues locates in the N-terminal of F5/8 type C domain proteins and some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown. 76
58989 406724 pfam16391 DUF5000 Domain of unknown function. This family around 200 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown. 149
58990 406725 pfam16392 DUF5001 Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown. 86
58991 406726 pfam16394 DUF5003 Domain of unknown function (DUF5003). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically between 500 and 650 amino acids in length. 316
58992 406727 pfam16395 DUF5004 Domain of unknown function (DUF5004). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 150 amino acids in length. 145
58993 293005 pfam16396 DUF5005 Domain of unknown function (DUF5005). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 440 amino acids in length. 436
58994 406728 pfam16397 DUF5006 Domain of unknown function (DUF5006). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are around 600 amino acids in length. 263
58995 406729 pfam16398 DUF5007 Domain of unknown function (DUF5007). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides and Sphingobacterium. The members in this family are around 350 residues in length. 287
58996 406730 pfam16399 Aquarius_N Intron-binding protein aquarius N-terminus. This family represents the N-terminus of intron-binding protein aquarius, a splicing factor which links excision of introns from pre-mRNA with snoRNP assembly. 790
58997 374519 pfam16400 DUF5008 Domain of unknown function (DUF5008). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides, Paraprevotella, and Sphingobacterium. The members in this family are around 550 residues in length. 101
58998 293010 pfam16401 DUF5009 Domain of unknown function (DUF5009). This small family of proteins is functionally uncharacterized. This family is mainly found in various Bacteroides species. The members in this family are around 470 residues in length. 260
58999 406731 pfam16402 DUF5010 Domain of unknown function (DUF5010). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are around 600 amino acids in length. 341
59000 406732 pfam16403 DUF5011 Domain of unknown function (DUF5011). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides, Prevotella, and Parabacteroides. Proteins in this family are around 230 amino acids in length. 71
59001 374522 pfam16404 DUF5012 Domain of unknown function (DUF5012). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 230 amino acids in length. 125
59002 406733 pfam16405 DUF5013 Domain of unknown function (DUF5013). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Parabacteroides species. Proteins in this family are around 400 amino acids in length. 145
59003 406734 pfam16406 DUF5014 Domain of unknown function (DUF5014). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 630 amino acids in length. 90
59004 406735 pfam16407 PKD_2 PKD-like family. This is a PKD-like family of proteins found in various Bacteroides species. 157
59005 406736 pfam16408 DUF5016 Domain of unknown function (DUF5016). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 660 amino acids in length. 125
59006 406737 pfam16409 DUF5017 Domain of unknown function (DUF5017). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Prevotella species. Proteins in this family are around 350 amino acids in length. 182
59007 406738 pfam16410 DUF5018 Domain of unknown function (DUF5018). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Alistipes species. Proteins in this family are around 600 amino acids in length. 355
59008 406739 pfam16411 SusF_SusE Outer membrane protein SusF_SusE. SusE and SusF are two outer membrane proteins composed of tandem starch specific carbohydrate binding modules (CBMs) with no enzymatic activity. They are likely to play an important role in starch metabolism in Bacteroides. It has been speculated that they could compete for starch in the human intestinal tract by sequestering starch at the bacterial surface and away from competitors. SusE has higher affinity for starch compared to SusF. 165
59009 406740 pfam16412 DUF5020 Domain of unknown function (DUF5020). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 235 amino acids in length. 212
59010 406741 pfam16413 Mlh1_C DNA mismatch repair protein Mlh1 C-terminus. This is the C-terminal domain of DNA mismatch repair protein Mlh1, these proteins belong to the MutL family. This domain forms part of the endonuclease active site. 257
59011 406742 pfam16414 NPC1_N Niemann-Pick C1 N-terminus. This is the N-terminal domain of Niemann-Pick C1 family proteins. This family of proteins mediates transport of cholesterol from the intestinal lumen to enterocytes. This domain contains a cholesterol-binding pocket. 238
59012 406743 pfam16415 CNOT1_CAF1_bind CCR4-NOT transcription complex subunit 1 CAF1-binding domain. This is the CAF1-binding domain of CCR4-NOT transcription complex. It adopts a MIF4G (middle portion of eIF4G) fold. 225
59013 406744 pfam16416 GUN4_N ARM-like repeat domain, GUN4-N terminal. GUN4_N is the ARM-repeat like N-terminal domain of GUN4 proteins. It contains five helices arranged in an alternating antiparallel pattern that resembles ARM or HEAT repeats, though the functional importance of this poorly conserved domain in Gun4 is not currently known. 82
59014 406745 pfam16417 CNOT1_TTP_bind CCR4-NOT transcription complex subunit 1 TTP binding domain. This is the TTP binding domain of CCR4-NOT transcription complex subunit 1. It adopts a MIF4G (middle portion of eIF4G) fold. 183
59015 406746 pfam16418 CNOT1_HEAT CCR4-NOT transcription complex subunit 1 HEAT repeat. This domain is a HEAT repeat found in CCR4-NOT transcription complex subunit 1. 146
59016 406747 pfam16419 CNOT1_HEAT_N CCR4-NOT transcription complex subunit 1 HEAT repeat. This domain is a HEAT repeat found in fungal CCR4-NOT transcription complex subunit 1 at the N-terminus of PF16418. 224
59017 406748 pfam16420 ATG7_N Ubiquitin-like modifier-activating enzyme ATG7 N-terminus. This is the N-terminal domain of Ubiquitin-like modifier-activating enzyme ATG7. In Arabidopsis this domain binds the E2 enzymes ATG10 and ATG3. 309
59018 406749 pfam16421 E2F_CC-MB E2F transcription factor CC-MB domain. This is the coiled coil (CC) - marked box (MB) domain of E2F transcription factors. This domain forms a heterodimer with the corresponding domain of the DP transcription factor, the heterodimer binds the C-terminus of retinoblastoma protein. 94
59019 406750 pfam16422 COE1_DBD Transcription factor COE1 DNA-binding domain. 227
59020 406751 pfam16423 COE1_HLH Transcription factor COE1 helix-loop-helix domain. This is the helix-loop-helix domain of transcription factor COE1. It is responsible for dimerization. 44
59021 406752 pfam16424 DUF5021 Domain of unknown function (DUF5021). This family consists of Prepilin-type cleavage/methylation N-terminal domain proteins around 200 residues in length and is mainly found in various Eubacterium species. The function of this family is unknown. 158
59022 293034 pfam16425 DUF5022 Domain of unknown function (DUF5022). This family consists of several uncharacterized proteins around 350 in length and is mainly found in various Firmicutes species. The function of this family is unknown. 287
59023 406753 pfam16426 DUF5023 Domain of unknown function (DUF5023). This family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Eubacterium species. The function of this family is unknown. 197
59024 406754 pfam16427 DUF5024 Domain of unknown function (DUF5024). This family consists of several uncharacterized proteins around 150 or 200 in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown. 104
59025 406755 pfam16428 DUF5025 Domain of unknown function (DUF5025). This family consists of several uncharacterized proteins around 200 in length and is mainly found in various Parabacteroides species. The function of this family is unknown. 161
59026 374540 pfam16429 DUF5026 Domain of unknown function (DUF5026). This family consists of several uncharacterized proteins around 100 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown. 82
59027 374541 pfam16430 DUF5027 Domain of unknown function (DUF5027). This family consists of several uncharacterized proteins around 180 in length and is mainly found in various Clostridiales species. The function of this family is unknown. 187
59028 293040 pfam16431 DUF5028 Domain of unknown function (DUF5028). This family consists of several uncharacterized proteins around 200 in length and is mainly found in Eubacterium and Clostridium. The function of this family is unknown. 177
59029 293041 pfam16432 DUF5029 Domain of unknown function (DUF5029). This family consists of several uncharacterized proteins around 550 in length and is mainly found in Bacteroides fragilis and sp. The function of this family is unknown. 210
59030 406756 pfam16433 DUF5030 Domain of unknown function (DUF5030). This family consists of several uncharacterized proteins around 300 in length and is mainly found in various Bacteroides species. The function of this family is unknown. 307
59031 406757 pfam16434 DUF5031 Domain of unknown function (DUF5031). This family consists of several uncharacterized proteins around 380 in length and is mainly found in Bacteroides fragilis and sp. The function of this family is unknown. 415
59032 406758 pfam16435 DUF5032 Domain of unknown function (DUF5032). This family consists of several uncharacterized proteins around 270 in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown. 259
59033 406759 pfam16436 DUF5033 Domain of unknown function (DUF5033). This family consists of several uncharacterized proteins around 200 in length and is mainly found in various Bacteroides species. The function of this family is unknown. 178
59034 406760 pfam16437 DUF5034 Domain of unknown function (DUF5034). This family consists of several uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 169
59035 406761 pfam16438 DUF5035 Domain of unknown function (DUF5035). This family consists of several uncharacterized proteins around 170 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 145
59036 406762 pfam16439 DUF5036 Domain of unknown function (DUF5036). This family consists of several uncharacterized proteins around 240 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown. 225
59037 293049 pfam16440 DUF5037 Domain of unknown function (DUF5037). This family consists of several uncharacterized proteins around 270 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown. 242
59038 406763 pfam16441 DUF5038 Domain of unknown function (DUF5038). This family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown. 144
59039 406764 pfam16442 DUF5039 Domain of unknown function (DUF5039). This family consists of several uncharacterized proteins around 240 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 203
59040 406765 pfam16443 DUF5040 Domain of unknown function (DUF5040). This family consists of several uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 227
59041 406766 pfam16444 DUF5041 Domain of unknown function (DUF5041). This family consists of several uncharacterized proteins around 230 residues in length and is mainly found in various Bacteroidales species. The function of this family is unknown. 192
59042 406767 pfam16445 DUF5042 Domain of unknown function (DUF5042). This family consists of several uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 434
59043 406768 pfam16446 DUF5043 Domain of unknown function (DUF5043). This family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 155
59044 374548 pfam16447 DUF5044 Domain of unknown function (DUF5044). This family consists of several uncharacterized proteins around 220 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown. 178
59045 406769 pfam16448 LapD_MoxY_N LapD/MoxY periplasmic domain. This domain is the N-terminal periplasmic domain of the LapD and MoxY receptor proteins. 124
59046 374549 pfam16449 MatB Fimbrillin MatB. This is a family of fimbrial proteins. 168
59047 406770 pfam16450 Prot_ATP_ID_OB Proteasomal ATPase OB/ID domain. This is the interdomain (ID) or oligonucleotide binding (OB) domain of proteasomal ATPase 56
59048 406771 pfam16451 Spike_NTD Spike glycoprotein N-terminal domain. The N-terminal domain of the coronavirus spike glycoprotein functions as a receptor binding domain. It binds carcinoembryonic antigen-related cell adhesion molecule 1. 298
59049 406772 pfam16452 Phage_CI_C Bacteriophage CI repressor C-terminal domain. The C-terminal domain of the CI repressor functions in oligomer formation. 102
59050 406773 pfam16453 IQ_SEC7_PH PH domain. This PH domain is found in IQ motif and SEC7 domain-containing proteins. 135
59051 406774 pfam16454 PI3K_P85_iSH2 Phosphatidylinositol 3-kinase regulatory subunit P85 inter-SH2 domain. This domain is found between the two SH2 domains in phosphatidylinositol 3-kinase regulatory subunit P85. It forms a complex with the adaptor-binding domain of phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha. 161
59052 406775 pfam16455 UBD Ubiquitin-binding domain. This ubiquitin-binding domain is found in ubiquitin domain-containing proteins. 102
59053 406776 pfam16456 YmgD YmgD protein. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. 82
59054 406777 pfam16457 PH_12 Pleckstrin homology domain. 129
59055 406778 pfam16458 Beta-prism_lec Beta-prism lectin. This beta-prism fold lectin is the C-terminal domain of the Vibrio cholerae cytolytic pore-forming toxin hemolysin. It binds to N-glycans with a heptasaccharide GlcNAc4Man3 core (NGA2). 129
59056 406779 pfam16459 Phage_TAC_13 Phage tail assembly chaperone, TAC. This family represents the phage-tail assembly chaperone proteins from a small set of Siphoviridae from Gammaproteobacteria. TACs are required for the morphogenesis of all long-tailed phages. The proposed function for the TAC is to coat the tape-measure protein to prevent it from forming unproductive complexes or precipitating before the tail tube protein has been incorporated. 98
59057 293069 pfam16460 Phage_TTP_11 Phage tail tube, TTP, lambda-like. This family represents the phage-tail-tube protein from a set of Siphoviridae from Gammaproteobacteria. Tail tube proteins polymerize with the assistance of the Tail-tip complex, a tape measure protein and two chaperones. Infectivity of host is delivered through the tube. 137
59058 406780 pfam16461 Phage_TTP_12 Lambda phage tail tube protein, TTP. This family represents the phage-tail-tube protein from a set of Siphoviridae from Gammaproteobacteria. Tail tube proteins polymerize with the assistance of the Tail-tip complex, a tape measure protein and two chaperones. Infectivity of host is delivered through the tube. 134
59059 318627 pfam16462 Phage_TAC_14 Phage tail assembly chaperone protein, TAC. This is a family of Siphoviridae phage tail assembly chaperone proteins. 113
59060 318628 pfam16463 Phage_TTP_13 Phage tail tube protein family. This is a small family of Siphoviridae phage tail tube proteins. The tube protein polymerizes to form the shaft through which the infecting DNA passes into the host. 137
59061 406781 pfam16464 DUF5045 Domain of unknown function (DUF5045). This family consists of N-terminal of several uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown. 85
59062 406782 pfam16465 DUF5046 Domain of unknown function (DUF5046). This small family consists of C-terminal of several uncharacterized proteins around 500 residues in length and is mainly found in various Faecalibacterium species. The function of this family is unknown. This family has distant similarity to WD40 repeats. 286
59063 293075 pfam16466 DUF5047 Domain of unknown function (DUF5047). This family consists of N-terminal of several uncharacterized proteins and peptidases around 360 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown. 135
59064 406783 pfam16467 DUF5048 Domain of unknown function (DUF5048). This family consists of C-terminal of several uncharacterized proteins around 500 residues in length and is mainly found in various Faecalibacterium and Clostridium species. The function of this family is unknown. 104
59065 379845 pfam16468 DUF5049 Domain of unknown function (DUF5049). This family consists of some uncharacterized proteins around 60 residues in length and is mainly found in various Lactobacillus and Selenomonas species. The function of this family is unknown. 57
59066 406784 pfam16469 NPA Nematode polyprotein allergen ABA-1. The nematode polyprotein allergen ABA-1 is a lipid-binding protein comprising multiple tandem repeats of this domain. 116
59067 406785 pfam16470 S8_pro-domain Peptidase S8 pro-domain. This domain is the pro-domain of several peptidases belonging to family S8. 77
59068 406786 pfam16471 JIP_LZII JNK-interacting protein leucine zipper II. This is the second leucine zipper domain (LZII) of several JNK-interacting proteins (JIP). It interacts with the small GTP-binding protein ARF6. 56
59069 406787 pfam16472 DUF5050 Domain of unknown function (DUF5050). 283
59070 406788 pfam16473 DUF5051 3' exoribonuclease, RNase T-like. This is a highly divergent 3' exoribonuclease family. The proteins constitute a typical RNase fold, where the active site residues form a magnesium catalytic centre. The protein of the solved structure readily cleaves 3' overhangs in a time-dependent manner. It is similar to DEDD-type RNases and is an unusual ATP-binding protein that binds ATP and dATP. It forms a dimer in solution and both protomers in the asymmetric unit bind a magnesium ion through Asp-6 in UniProtKB:P9WJ73. 177
59071 406789 pfam16474 KIND Kinase non-catalytic C-lobe domain. The KIND domain (kinase non-catalytic C-lobe domain) evolved from a catalytic protein kinase fold and functions as an interaction domain. In SPIRE1 (protein spire homolog 1) this domain interacts with FMN2 (formin-2). 194
59072 406790 pfam16475 DUF5052 Domain of unknown function (DUF5052). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown. 199
59073 406791 pfam16476 DUF5053 Domain of unknown function (DUF5053). This family consists of C-terminal of uncharacterized proteins around 100 residues in length and is mainly found in various Prevotella species. The function of this family is unknown. 59
59074 406792 pfam16477 DUF5054 Domain of unknown function (DUF5054). This family consists of Glycosyl hydrolase family 38 proteins around 700 residues in length and is mainly found in various Clostridium and Rhizobium species. The function of this family is unknown. 287
59075 374565 pfam16478 DUF5055 Domain of unknown function (DUF5055). This family consists of several uncharacterized proteins around 100 residues in length and is mainly found in butyrate-producing bacteria. The function of this family is unknown. 105
59076 406793 pfam16479 DUF5056 Domain of unknown function (DUF5056). This family consists of uncharacterized proteins around 360 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. 93
59077 406794 pfam16480 DUF5057 Domain of unknown function (DUF5057). This family consists of C-terminal of uncharacterized proteins and F5/8 type C domain proteins around 360 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown. 353
59078 406795 pfam16481 DUF5058 Domain of unknown function (DUF5058). This family consists of uncharacterized proteins around 250 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown. 222
59079 374568 pfam16482 Staufen_C Staufen C-terminal domain. This is the C-terminal domain of Staufen proteins. It consists of an N-terminal Staufen-swapping motif (SSM) comprising two alpha helices, connected by a linker region to a dsRNA-binding-like domain ('RBD'). The 'RBD' has the fold of a functional dsRNA-binding domain, but lacks the residues required to bind RNA. This domain is responsible for dimerization, the SSM from one molecule interacts with the 'RBD' of another. 110
59080 406796 pfam16483 Glyco_hydro_64 Beta-1,3-glucanase. Family 64 glycoside hydrolases have beta-1,3-glucanase activity. 370
59081 406797 pfam16484 CPT_N Carnitine O-palmitoyltransferase N-terminus. This domain is found at the N-terminus of carnitine O-palmitoyltransferases. It functions as a regulatory domain and is linked to the catalytic domain (pfam00755) via two transmembrane regions. 47
59082 406798 pfam16485 PLN_propep Protealysin propeptide. This propeptide is cleaved during maturation of protealysin. Before cleavage it interacts with the catalytic domain, blocking the active site. 43
59083 406799 pfam16486 ArgoN N-terminal domain of argonaute. ArgoN is the N-terminal domain of argonaute proteins in eukaryotes. ArgoN is composed of an antiparallel four-stranded beta sheet core that has two alpha helices positioned along one face of the sheet and an extended beta strand towards its N-terminus. The core fold of the N domain most closely resembles the catalytic domain of replication-initiator protein Rep. The N domain is linked to the PAZ domain via linker 1 region, and together these three regions are designated the PAZ-containing lobe of argonaute. 86
59084 406800 pfam16487 ArgoMid Mid domain of argonaute. The ArgoMid domain is found to be part of the Piwi-lobe of the argonaute proteins. It is composed of a parallel four-stranded beta-sheet core surrounded by four alpha-helices and two additional short alpha-helices. It most closely resembles the amino terminal tryptic core of the E.coli lactose repressor. There is an extensive interface between the Mid and the Piwi domains. The conserved C-terminal half of the Mid has extensive interactions with Piwi, with a deep basic pocket on the surface of the Mid adjacent to the interface with Piwi. The Mid carries a binding pocket for the 5' phosphate overhang of the guide strand of DNA. The N, Mid, and Piwi domains form a base upon which the PAZ domain sits, resembling a duck. The 5' phosphate and the U1 base are held in place by a conserved network of interactions from protein residues of the Mid and Piwi domains in order to place the guide uniquely in the proper position observed in all Argonaute-RNA complexes. 84
59085 406801 pfam16488 ArgoL2 Argonaute linker 2 domain. ArgoL2 is the second linker domain in eukaryotic argonaute proteins. It starts with two alpha-helices aligned orthogonally to each other followed by a beta-strand involved in linking the two lobes, the PAZ lobe and the Piwi lobe of argonaute to each other. Linker 2 together with the N, PAZ and L1 domains form a compact global fold. Numerous residues from Piwi, L1 and L2 linkers direct the path of the phosphate backbone of nucleotides 7-9, thus allowing DNA-slicing. 47
59086 406802 pfam16489 GAIN GPCR-Autoproteolysis INducing (GAIN) domain. The GAIN is a domain of alpha-helices and beta-strands that is found in cell-adhesion GPCRs and precedes the GPS motif where the autoproteolysis occurs, family, pfam01825. The full GAIN domain comprises the GPS and the GAIN, in cell-adhesion GPCRs, and is the functional unit for autoproteolysis. The GPS motif at the end of the GAIN domain is an ancient domain that exists in primitive ancestor organisms, and the full GAIN + GPS is conserved in all cell-adhesion GPCRs and all PKD1-related proteins. 205
59087 406803 pfam16490 Oxidoreduct_C Putative oxidoreductase C terminal domain. This is the putative C-terminal domain of a bacterial oxidoreductase. It lies C-terminal to family GFO_IDH_MocA pfam01408 in some members. 278
59088 406804 pfam16491 Peptidase_M48_N CAAX prenyl protease N-terminal, five membrane helices. The five N-terminal transmembrane alpha-helices of peptidase_M48 family proteins including the CAAX prenyl proteases reside completely within the membrane of the endoplasmic reticulum. 179
59089 406805 pfam16492 Cadherin_C_2 Cadherin cytoplasmic C-terminal. Cadherin_C_2 is the cytoplasmic C-terminal domain of some proto-cadherins. It is this region of the cadherins that allows cell-adhesion and the essential feature of metazoan multicellularity. Cadherins are cell-surface receptors that function in cell adhesion, cell polarity, and tissue morphogenesis. 84
59090 406806 pfam16493 Meis_PKNOX_N N-terminal of Homeobox Meis and PKNOX1. Meis_PKNOX_N is a family found at the N-terminus of Meis, Myeloid ecotropic viral integration site, transcription regulators and PKNOX1 regulators, PBX/knotted 1 homeobox 1, homeobox proteins. 86
59091 406807 pfam16494 Na_Ca_ex_C C-terminal extension of sodium/calcium exchanger domain. Na_Ca_ex_C is a region of the higher eukaryote sodium/calcium exchanger domain that extends toward the C-terminal, and is cytoplasmic. 134
59092 406808 pfam16495 SWIRM-assoc_1 SWIRM-associated region 1. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and/or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known. 84
59093 406809 pfam16496 SWIRM-assoc_2 SWIRM-associated domain at the N-terminal. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and/or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known. 410
59094 406810 pfam16497 MHC_I_3 MHC-I family domain. 180
59095 406811 pfam16498 SWIRM-assoc_3 SWIRM-associated domain at the C-terminal. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and/or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known. 65
59096 374582 pfam16499 Melibiase_2 Alpha galactosidase A. 284
59097 406812 pfam16500 Cyclin_N2 N-terminal region of cyclin_N. Cyclin_N2 is found upstream of the family Cyclin_N, pfam00134. The exact function of this region of cyclins is not certain. 135
59098 406813 pfam16501 SCAPER_N S phase cyclin A-associated protein in the endoplasmic reticulum. SCAPER_N is a short highly conserved region close to the N-terminus. SCAPER is localized to the endoplasmic reticulum and is a substrate for cyclin A/Cdk2. It associates with cyclin A and localizes to the ER. One theory suggests that SCAPER functions to create a local high concentration of cyclin A2 in the cytoplasm. Alternatively, SCAPER might be acting to sequester a portion of cellular cyclin A2 that could then be readily available for nuclear translocation, which may be needed for exit from G0 phase. 98
59099 406814 pfam16502 DUF5059 Domain of unknown function (DUF5059). This domain is found fused to a copper-binding protein at the C-terminus, family Copper-bind, pfam00127. Its function is not known, and it is found in the Halobacteriaceae family in Archaea. 620
59100 406815 pfam16503 zn-ribbon_14 Zinc-ribbon. This is a family of zinc-ribbons largely from eukaryotes that lie at the C-terminus of cytoplasmic tRNA adenylyltransferase 1 proteins. Most of these proteins carry an ATP-binding domain towards the N-terminus. 32
59101 406816 pfam16504 SP24 Putative virion membrane protein of plant and insect virus. SP24, or structural protein of 24kD, is a family of putative virion membrane proteins of plant and insect viruses. These viruses are ssRNA positive-strand viruses, with no DNA stage. The family corresponds to the central region of the ORF3 of insect chroparaviruses and negeviruses and plant cileviruses, higreviruses and blunerviruses. It contains four transmembrane regions. Chronic bee paralysis virus (CBPV) is one of the more common member virions. SP24 is probably one of the major structural components of the virions. 147
59102 293114 pfam16505 Emaravirus_P4 P4 movement protein of Emaravirus, and the 30K superfamily. Emaravirus_P4 is composed of movement proteins of the genus of negative-strand RNA viruses Emaravirus (related to the family Bunyaviridae), which infect plants. P4 is a movement protein of the 30K superfamily. 349
59103 406817 pfam16506 DiSB-ORF2_chro Putative virion glycoprotein of insect viruses. DiSB-ORF2_chro corresponds to a short conserved region at the N-terminus of putative glycoproteins from chroparaviruses. It carries two putative disulfide bridges. No similarity can be found with any other glycoproteins outside this region. 210
59104 406818 pfam16507 BLM10_mid Proteasome-substrate-size regulator, mid region. The ordered regions of the yeast BLM10 or PA200 (human homolog), full-length protein encode 32 HEAT repeat (HR)-like modules, each comprising two helices joined by a turn, with adjacent repeats connected by a linker. Whereas a standard HEAT repeat is composed of ~50 residues, the BLM10 HEAT repeats are highly variable. The length of helices ranges from 8 to 35 residues, turns range from 2 to 87 residues, and linkers range from 1 to 88 residues, with the longest linker, between HR21 and HR22, containing additional secondary structures (two strands and three helices). BLM10_mid is the middle ordered region of the three in BLM10. BLM10 is found to surround the proteasome entry pore in the 1.2 MDa complex of proteasome and BLM10 to form a largely closed dome that is expected to restrict access of potential substrates. Thus Blm10 and PA200 are predominantly nuclear and stimulate the degradation of model peptides, although they do not appear to stimulate the degradation of proteins, recognize ubiquitin, or utilize ATP. 499
59105 406819 pfam16508 NIBRIN_BRCT_II Second BRCT domain on Nijmegen syndrome breakage protein. 114
59106 374591 pfam16509 KORA TrfB plasmid transcriptional repressor. KORA is a family of Gram-negative bacterial proteins that act as global repressors of genes involved in plasmid replication, conjugative transfer and stable inheritance in the IncP group of plasmids. KORA operates as a symmetric dimer, and contacts the DNA via the helix-turn-helix region at the N-terminus. 84
59107 293119 pfam16510 P22_portal Phage P22-like portal protein. The portal protein of P22 and similar Podoviridae tail phages is a dodecameric structure consisting of a hip (2), a leg(1) and a barrel(3). DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Domains 1 and 3 are mostly helical and form the majority of the DNA-translocating channel. Domain 2 adopts an alpha-beta-fold characterized by two sheets of eight beta-strands, which cross each other to form a beta-barrel-like structure. 668
59108 406820 pfam16511 FERM_f0 N-terminal or F0 domain of Talin-head FERM. FERM_f0 forms a stable globular structure. The fold is a ubiquitin-like fold joined to the f1 domain in a novel fixed orientation by an extensive charged interface. It is required for maximal integrin-activation, by interacting with other FA components. No binding partner has yet been found for it. 82
59109 406821 pfam16512 RhoGAP-FF1 p190-A and -B Rho GAPs FF domain. RhoGAP-FF1 is the FF domain of the Rho GTPase activating proteins (GAPs). These are the key proteins that make the switch between the active guanosine-triphosphate-bound form of Rho guanosine triphosphatases (GTPases) and the inactive guanosine-diphosphate-bound form. Rho guanosine triphosphatases (GTPases) are a family of proteins with key roles in the regulation of actin cytoskeleton dynamics. The RhoGAP-FF1 region contains the FF domain that has been implicated in binding to the transcription factor TFII-I; and phosphorylation of Tyr308 within the first FF domain inhibits this interaction. The RhoGAPFF1 domain constitutes the first solved structure of an FF domain that lacks the first of the two highly conserved Phe residues, but the substitution of Phe by Tyr does not affect the domain fold. 80
59110 374594 pfam16514 NADH-UOR_E putative NADH-ubiquinone oxidoreductase chain E. This putative NADH-ubiquinone oxidoreductase chain E family is found in Epsilonproteobacteria, chiefly in Helicobacter pylori. All proteins in the family are less than 100 residues in length. 74
59111 406822 pfam16515 HIP1_clath_bdg Clathrin-binding domain of Huntingtin-interacting protein 1. HIP1_clath_bdg is the coiled-coil region of Huntingtin-interacting protein 1. It carries a highly conserved HADLLRKN sequence motif at its N-terminus which effects the binding of HIP1R to clathrin light-chain EED regulatory site. This binding then stimulates clathrin lattice assembly. Huntingtin-interacting protein 1 (HIP1) is an obligate binding partner for Huntingtin, and loss of this interaction triggers the cascade of events that results in the apoptosis of neuronal cells and the onset of Huntington's disease. Clathrin light-chain binds to a flexible coiled-coil domain in HIP1 and induces a compact state that is refractory to actin binding. 93
59112 406823 pfam16516 CC2-LZ Leucine zipper of domain CC2 of NEMO, NF-kappa-B essential modulator. CC2-LZ is a leucine-zipper domain associated with the CC2 coiled-coil region of NF-kappa-B essential modulator, NEMO. It plays a regulatory role, along with the very C-terminal zinc-finger; it contains a ubiquitin-binding domain (UBD) and represents one region that contributes to NEMO oligomerization. NEMO itself is an integral part of the IkappaB kinase complex and serves as a molecular switch via which the NF-kappaB signalling pathway is regulated. 100
59113 406824 pfam16517 Nore1-SARAH Novel Ras effector 1 C-terminal SARAH (Sav/Rassf/Hpo) domain. The Nore1-SARAH, C-terminal, domain of Nore1, the tumor-suppressor, a novel Ras effector, has a characteristic coiled-coil structure. It is a small helical module that is important in signal-transduction networks. The recombinant SARAH domain of Nore1 crystallizes as an anti-parallel homodimer with representative characteristics of coiled coils. The central function of the SARAH domain seems to be the mediation of homo- and hetero-oligomerization between SARAH domain-containing proteins. Nore1 forms homo- and hetero complexes through its C-terminal SARAH (Sav/Rassf/Hpo) domain. 39
59114 406825 pfam16518 GrlR T3SS negative regulator,GrlR. GrlR is a family of proteobacterial type III secretion system negative regulators. Structurally, GrlR consists of a typical beta-barrel fold with eight beta-strands containing an internal hydrophobic cavity and a plug-like loop on one side of the barrel. Strong hydrophobic interactions between the two beta-barrels maintain its dimeric architecture. A unique surface-exposed EDED (Glu-Asp-Glu-Asp) motif is identified to be critical for GrlA-GrlR interaction and for the repressive activity of GrlR. The locus of enterocyte effacement (LEE) is essential for virulence of enterohaemorrhagic Escherichia coli (EHEC) and enteropathogenic E. coli (EPEC). It encodes some 20 genes including an overall regulator ler and two others, GrlR and GrlA, that form the type three secretion system for infection. GrlR complexes with GrlA to repress expression of ler. GrlA is found in family pfam00462. 112
59115 406826 pfam16519 TRPM_tetra Tetramerisation domain of TRPM. TRPM7_tetra is a short anti-parallel coiled-coil tetramerisation domain of the transient receptor potential cation channel subfamily M member proteins 1-8. It is held together by extensive core packing and interstrand polar interactions. Transient receptor potential (TRP) channels comprise a large family of tetrameric cation-selective ion channels that respond to diverse forms of sensory input. The presence of cytoplasmic domains that direct channel assembly appears to be a feature of many voltage-gated ion channel superfamily members. 55
59116 293128 pfam16520 BDV_M ssRNA-binding matrix protein of Bornaviridae. BDV_M is a family of matrix proteins from negative-strand Bornaviridae viruses. Its most stable oligomeric form is a tetramer, and it lies beneath the viral envelope where it associates with the inner layer of the viral membrane. It bridges the gap between the nucleocapsid and the viral envelope thereby imparting structural integrity and individual form to the virus particle. Borna disease virus (BDV) is a neurotropic enveloped RNA virus causing a noncytolytic, persistent infection of the central nervous system in mammals. The order to which this virus belongs, Mononegavirales, also contains the Ebola, mumps, rabies and measles viruses, amongst other highly infectious agents. 103
59117 406827 pfam16521 Myosin-VI_CBD Myosin VI cargo binding domain. Myosin-VI_CBD is a C-terminal family that allows unconventional myosin-VI to recognize and select its binding cargoes. Several adaptor proteins have been reported to interact specifically with the CBD, thus defining the specific subcellular functions of myosin VI. The crystal structure determination of the myosin VI CBD/Dab2 (an endocytic adaptor protein Disabled-2 that is a cargo) complex shows that the Myosin-VI_CBD forms a cargo-induced dimer, suggesting that the motor undergoes monomer-to-dimer conversion that is dependent upon cargo binding. In the absence of cargo myosin VI exists as a stable monomer. This cargo binding-mediated monomer-to-dimer conversion mechanism adopted by myosin VI may be shared by other unconventional myosins, such as myosin VII and myosin X. 90
59118 406828 pfam16522 FliS_cochap Flagellar FLiS export co-chaperone, HP1076. FliS_cochap is a family of largely Campylobacterales proteins that are co-chaperones for FliS, one of the type III secretion system flagellar chaperones. The HP1076 (Flis_cochap) and FliS complex together prevents premature polymerization of flagellins and is critical for flagellar assembly and bacterial colonisation. The HP1076 shows co-chaperone activity that promotes protein folding of FliS with mutations in the flagellin binding pocket and enhances the chaperone activity of FliS. 131
59119 406829 pfam16523 betaPIX_CC betaPIX coiled coil. betaPIX_CC is the very C-terminal coiled-coil region of betaPIX or p21-activated kinase interacting exchange factor proteins. The coiled-coil runs from residues 589-646 in UniProtKB:G31IU6, and the PDZ-binding site is the final eight residues immediately downstream. The coiled-coil trimerizes and thus exposes three potential PDZ-binding surfaces, although only one of these is maximally used. One of the C-terminal ends of the coiled-coil forms an extensive beta-sheet interaction with the Shank PDZ, while the other two ends are not involved in ligand binding and form random coils. Thus the coiled-coil domain allows multimerisation of betaPIX that is vital for its physiological functions. betaPIX and the Shank/ProSAP protein form a complex that acts as a protein scaffold for integrating signalling pathways and regulating postsynaptic structure. 87
59120 406830 pfam16524 RisS_PPD Periplasmic domain of Sensor histidine kinase RisS. RisS_PPD is the periplasmic domain of the sensor histidine kinase RisS. It is purported to be the region of the kinase that senses the pH of the environs. 105
59121 406831 pfam16525 MHB Haemophore, haem-binding. MHB is a coiled-coil molecule that binds free haem in mycobacterial cytoplasm to deliver it to membrane proteins for shuttling through the membrane. 75
59122 406832 pfam16526 CLZ C-terminal leucine zipper domain of cyclic nucleotide-gated channels. The CLZ domain is the C-terminal leucine-zipper domain of cyclic nucleotide-gated channel proteins. The CLZ domains form homotypic trimers in solution thus constraining the channel of the CNGs to contain three cyclic nucleotide-gated subunits, CNGA. The CLZ domains formed homotypic parallel 3-helix coiled-coil domains, consistent with their proposed role in regulating subunit assembly. 70
59123 318680 pfam16527 CpxA_peri Two-component sensor protein CpxA, periplasmic domain. CpxA_peri is the periplasmic domain-family of the Gram-negative Gammaproteobacteria two-component signalling system, Cpx. It represents the recognition-site for sensing specific envelope stress signals. The fold that the domain-core of CpxA_peri conforms to is a PAS fold. The domain senses the environmental change and triggers a signal transduction to the cytoplasmic domain. As well as the PAS-core, there is a C-terminal tail that is necessary for ligand-sensing and binding to CpxP, a CpxA-associated and a regulatory protein. 134
59124 406833 pfam16528 Exo84_C Exocyst component 84 C-terminal. Exo84_C is the C-terminal helical region of the exocyst component Exo84. This region resembles a cullin-repeat, a multi-helical bundle. The exocyst is a large complex that is required for tethering vesicles at the final stages of the exocytic pathway in all eukaryotes. Exocyst subunits are composed of mostly helical modules strung together into long rods. 203
59125 374606 pfam16529 Ge1_WD40 WD40 region of Ge1, enhancer of mRNA-decapping protein. Ge1_WD40 is the N-terminal region of Ge-1 or enhancer of mRNA-decapping proteins. WD40-repeat regions are involved in protein-protein interactions. 329
59126 406834 pfam16530 IHHNV_capsid Infectious hypodermal and haematopoietic necrosis virus, capsid. IHHNV_capsid is the single capsid protein of infectious hypodermal and haematopoietic necrosis virus, found particularly in shrimp densovirus. Densoviruses are a subfamily of the parvoviruses. The capsid protein has an eight-stranded anti-parallel beta-barrel 'jelly roll' motif similar to that found in many icosahedral viruses, including other parvoviruses. The N-terminal portion of the IHHNV coat protein adopts a 'domain-swapped' conformation relative to its twofold-related neighbor. The loops connecting the strands of the structurally conserved jelly roll motif differ considerably in structure and length from those of other parvoviruses. IHHNV was first reported as a highly lethal disease of juvenile shrimp in 1983, and has only one type of capsid protein that lacks the phospholipase A2 activity that has been implicated as a requirement during parvoviral host cell infection. The structure of recombinant virus-like particles, composed of 60 copies of the 37.5-kDa coat protein is the smallest parvoviral capsid protein reported thus far. The small size of the PstDNV capsid protein makes the system attractive as a model for studying assembly mechanisms of icosahedral virus capsids. 323
59127 406835 pfam16531 SAS-6_N Centriolar protein SAS N-terminal. SAS-6_N is the N-terminal domain of the SAS-6 centriolar protein, both in C.elegans and in humans. The N-terminal domain is the region through which the 9 rod-shaped homodimers that SAS-6 forms on oligomerization interact with each other. Proper functioning of the centriole requires this correct oligomerization. 91
59128 406836 pfam16532 Phage_tail_NK Sf6-type phage tail needle knob or tip of some Caudovirales. Phage_tail_NK is the globular tip protein of some tailed bacteriophages. Tailed bacteriophage virions deliver DNA to susceptible cells after adsorbing to specific receptors on the surface of the bacteria. In the Gram-negative bacteria these receptors are surface proteins or polysaccharides. In the phage Sf6-type needle, this distal tip folds into a knob with a TNF-like fold, similar to the fibre knobs of bacteriophage PRD1 and Adenovirus. It contains three bound L-glutamate molecules that bind tightly in the crevices between the trimers of this trimeric tip. 152
59129 406837 pfam16533 SOAR STIM1 Orai1-activating region. SOAR is the Orai1-activating region of STIM1, where STIM1 are calcium sensors in the endoplasmic reticulum. As the store of calcium is depleted the calcium sensor in the ER activates Orai1, a Ca2+-release-activated Ca2+ (CRAC) channel, in the plasma membrane. The SOAR region, which runs from residues 340-443 on UniProtKB:Q13586, forms a dimer, and is essential for oligomerization of the whole of STIM1. 98
59130 406838 pfam16534 ULD Ubiquitin-like oligomerization domain of SATB. ULD is an N-terminal oligomerization domain of SATB or special AT-rich sequence-binding proteins. SATBs are global chromatin organizers and regulators of gene expression that are essential for T-cell development, breast cancer tumor growth and metastasis. SATBs assemble into a tetramer via the ULD domain, and the tetramerisation of SATBs are essential for recognising specific DNA sequences (such as multiple AT-rich DNA fragments). Thus, SATBs may regulate gene expression directly by binding to various promoters and upstream regions and thereby influencing promoter activity. 108
59131 406839 pfam16535 T3SSipB Type III cell invasion protein SipB. T3SSipB is a family of pathogenic Gram-negative bacterial proteins that invade human intestinal cells via the type III secretion system translocators. T3SSipB represents the coiled-coil region of the proteins and is shown to be homologous in activity to the pore-forming toxins of other Gram-negative pathogens, such as colicin Ia. 155
59132 406840 pfam16536 PNKP-ligase_C PNKP adenylyltransferase domain, C-terminal region. This is a short unique anti-parallel two-helical module with an extended tail peptide. It packs tightly against an extended peptide segment, residues 489-501 in UniProtKB:A3DJ38, near the N-terminus of the NTase domain, pfam16542. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps. The exact function of this C-terminal region is unclear; however, the conformation of the bundles changes on transfer of a PO4 from ATP to AMP. 60
59133 406841 pfam16537 T2SSB Type II secretion system protein B. This is the B protein from some operons of bacterial secretion systems of type II. The exact function of the B protein is not known, though in the case of Vibrio cholerae there is a fusion protein between proteins A and B that includes an AAA domain, a PG_binding domain as well as this domain at the C-terminus. Many of the other species have no A or B domain genes in this operon. The type II secretion pathway is conserved in Gram-negative bacteria that are prevalent in bacterial pathogens of plants (Pseudomonas fluorescens, Erwinia or Xanthomonas species), animals (Aeromonas hydrophila) and humans (Klebsiella oxytoca, Pseudomonas aeruginosa, Vibrio cholerae or Legionella pneumophila). Typical type II secretion systems (T2SSs) are encoded by a set of 12 to 16 gsp (general secretion pathway) genes organized into large operons including the conserved 'core' genes denoted C to O and in some bacterial species, as indicated above, extra gsp genes such as gspAB, gspN or gspS. A different nomenclature is used for Pseudomonas T2SSs, so the B gene is referred to as the P protein. 60
59134 406842 pfam16538 FlgT_C Flagellar assembly protein T, C-terminal domain. FlgT_C is the C-terminal domain of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT_C is not essential but stabilizes the H-ring structure. 74
59135 406843 pfam16539 FlgT_M Flagellar assembly protein T, middle domain. FlgT_M is the middle region of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT-N and FlgT-M are thought to be involved in the H-ring and the T-ring formation, respectively, and FlgT-M is also required for the stable association of FlgT with the basal body. 163
59136 406844 pfam16540 MKLP1_Arf_bdg Arf6-interacting domain of mitotic kinesin-like protein 1. This family is a C-terminal region of mitotic kinesin-like proteins that is necessary for the interaction with the small GTPase Arf6. MKLP1 is a Flemming body-localising protein essential for cytokinesis, so its interaction with Arf6 shows how Arf6 is involved in cytokinesis. The Arf6-MKLP1 complex plays a crucial role in cytokinesis by connecting the microtubule bundle and membranes at the cleavage plane. 107
59137 406845 pfam16541 AltA1 Alternaria alternata allergen 1. AltA1 is a family of fungal allergens. It shows a unique beta-barrel comprising 11 beta-strands. There is structural evidence for the location of IgE antibody-binding epitopes. The crystal structure will allow efforts to promote immunotherapy for patients allergic to Alternaria species. 104
59138 406846 pfam16542 PNKP_ligase PNKP adenylyltransferase domain, ligase domain. PNKP_ligase is a classical ligase nucleotidyltransferase module of bacteria. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps. 315
59139 406847 pfam16543 DFRP_C DRG Family Regulatory Proteins, Tma46. DFRP_C is a family of eukaryotic translation machinery-associated protein 46 proteins that are the binding partner for the highly conserved Developmentally Regulated GTP-binding (DRG) GTPases. Thus this family is referred to as DRG Family Regulatory Proteins (DFRP). Binding of this DFRP modulates the function of the GTPase. 89
59140 406848 pfam16544 STAR_dimer Homodimerization region of STAR domain protein. This family is the homodimerization domain of quaking proteins. Quaking-dimer is a helix-turn-helix dimer with an additional helix in the turn region. Dimerization is required for adequate RNA-binding. Quaking is a prototypical member of the STAR (signal transducer and activator of RNA) protein family, which plays key roles in post-transcriptional gene regulation by controlling mRNA translation, stability and splicing. STAR_dimer is the homodimerization domain, Qua1 of the STAR domain of a series of proteins referred to as STAR/GSG, or Signal Transduction and Activation of RNA/GRP33, Sam68, GLD-1 family. These are conserved in higher eukaryotes and are RNA-binding transcriptional regulators. The STAR domain is a KH domain flanked by two homologous regions, Qua1 and Qua2. Qua1, this family, is the homodimerization domain, and the KH plus Qua2 is the RNA-binding region. 51
59141 406849 pfam16545 CCM2_C Cerebral cavernous malformation protein, harmonin-homology. CCM2_HHD is a folded-helical region of a family of vertebrate proteins, mutations in which cause cerebral cavernous malformations (CCMs). These malformations are congenital vascular anomalies of the central nervous system that can result in haemorrhagic stroke, seizures, recurrent headaches, and focal neurologic deficits. This domain is structurally homologous to the N-terminal domain of harmonin, so it is named the CCM2 harmonin-homology domain or CCM2_HHD. This protein is often called Malcavernin. 91
59142 406850 pfam16546 SGTA_dimer Homodimerization domain of SGTA. SGTA_dimer is a short N-terminal domain at the start of SGTA or small glutamine-rich tetratricopeptide repeat-containing proteins. It is the homodimerization domain of the SGTA, a heat-shock protein (HSP) co-chaperone involved in the targeting of tail-anchor membrane proteins to the endoplasmic reticulum. This N-terminal homodimerization domain mediates the association with a single copy of Get4 or Get5 proteins, providing a link to the rest of the GET pathway. 64
59143 406851 pfam16547 BLM10_N Proteasome-substrate-size regulator, N-terminal. The ordered regions of the yeast BLM10 or PA200 (human homolog), full-length protein encode 32 HEAT repeat (HR)-like modules, each comprising two helices joined by a turn, with adjacent repeats connected by a linker. Whereas a standard HEAT repeat is composed of ~50 residues, the BLM10 HEAT repeats are highly variable. The length of helices ranges from 8 to 35 residues, turns range from 2 to 87 residues, and linkers range from 1 to 88 residues, with the longest linker, between HR21 and HR22, containing additional secondary structures (two strands and three helices). BLM10_N is the N-terminal ordered region of the three in BLM10. BLM10 is found to surround the proteasome entry pore in the 1.2 MDa complex of proteasome and BLM10 to form a largely closed dome that is expected to restrict access of potential substrates. BLM10 and PA200 are predominantly nuclear and stimulate the degradation of model peptides, although they do not appear to stimulate the degradation of proteins, recognize ubiquitin, or utilize ATP. 81
59144 406852 pfam16548 FlgT_N Flagellar assembly protein T, N-terminal domain. FlgT_N is the N-terminal domain of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT-N contributes to the construction of the H-ring structure, and adopts a two-layer alpha-beta sandwich architecture composed of a four-stranded anti-parallel beta-sheet and two alpha helices. 87
59145 406853 pfam16549 T2SSS_2 Type II secretion system (T2SS) pilotin, S protein. T2S_S is the S protein or pilotin of the bacterial Gram-negative secretion system in Vibrio and some E.coli and Shigella. It is given the suffix _2 to distinguish it from the PulS_OutS family of pilotins from Klebsiella and Dickeya, etc. AspS is functionally equivalent and yet structurally unrelated to the pilotins found in Klebsiella and other bacteria. AspS binds to a specific targeting sequence in the Vibrio-type secretins, enhancing the kinetics of secretin assembly; homologs of AspS are found in all species of Vibrio as well as those few strains of Escherichia and Shigella that have acquired a Vibrio-type T2SS. PulS is the Klebsiella pilotin, found in PulS_OutS, pfam09691. Not all species with a type II secretion system have this pilotin or S protein. 104
59146 406854 pfam16550 RPN13_C UCH-binding domain. RPN13_C is a family of all-helical domains that forms the binding-surface for the proteasome-ubiquitin-receptor protein Rpn13 to UCH37, one of the three de-ubiquitinating enzymes of the proteasome. 106
59147 406855 pfam16551 Quaking_NLS Putative nuclear localization signal of quaking. Quaking_NLS is the very C-terminal region of quaking proteins that is purported to be the nuclear localization signal. 30
59148 406856 pfam16552 OAM_alpha D-ornithine 4,5-aminomutase alpha-subunit. OAM_alpha is the 12.8kDa, alpha subunit of d-ornithine 4,5-aminomutase, or OAM, an enzyme that converts d-ornithine to 2,4-diaminopentanoic acid by way of radical propagation from an adenosylcobalamin to a pyridoxal 5'-phosphate cofactor. OAM is an alpha2-beta2 heterodimer comprising two strongly associating subunits. The packing of the alpha subunits against the beta helps to form the substrate and co-factor binding-regions. 107
59149 406857 pfam16553 PUFD BCORL-PCGF1-binding domain. PUFD is the minimal domain at the C-terminus of BCORL (BCL6 corepressor) that is needed for binding and giving specificity to some of the PCGF proteins, polycomb-group RING finger homologs. PUFD binds to the RAWUL (RING finger- and WD40-associated ubiquitin-like) domain of the particular PCGF PCGF1, pfam16207. Polycomb group proteins form repressive complexes (PRC) that mediate epigenetic modifications of histones. In humans there are many different PCGF homologs whose functions all vary, but the direct binding partner of PCGF1 is BCOR. BCOR has emerged as an important player in development and health. 110
59150 406858 pfam16554 OAM_dimer dimerization domain of d-ornithine 4,5-aminomutase. This family is the short dimerization domain of the enzyme D-ornithine 4,5-aminomutase. It sits between the TIM-barrel pfam09043 and pfam02310. The enzyme is an alpha2-beta2-heterodimer that converts D-ornithine to 2,4-diaminopentanoic acid by way of radical propagation from an adenosylcobalamin to a pyridoxal 5'-phosphate cofactor. 78
59151 406859 pfam16555 GramPos_pilinD1 Gram-positive pilin subunit D1, N-terminal. GramPos_pilinD1 is the first subunit domain of Gram-positive pilins from Strep.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three Ig-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Ig-like domains are stabilized by internal Lys-Asn isopeptide bonds, but this N-terminal domain makes few contacts with the rest of the molecule due to the different orientation of its G beta-strand. Strand G of D1 also carries the YPKN motif that provides the essential Lys residue for the sortase-mediated intermolecular linkages along the pilus shaft. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base. 161
59152 406860 pfam16556 IL17R_fnIII_D1 Interleukin-17 receptor, fibronectin-III-like domain 1. IL17R_fnIII_D1 is the first of two fibronectin 3-like domains on interleukin-17 receptor proteins A and B. The two fnIII domains are linked and together bind two molecules of IL-17 at one of its receptor-binding interfaces. This allows the other interface to bind to another receptor, thus allowing the IL-17 family of homodimeric cytokines to coordinate two different receptors. 154
59153 406861 pfam16557 CUTL CUT1-like DNA-binding domain of SATB. CUTL is part of the N-terminal region of SATB proteins, special AT-rich sequence-binding proteins that are global chromatin organizers and gene expression regulators essential for T-cell development and breast cancer tumor growth and metastasis. CUTL carries a DNA-binding region just as CUT domains do. 71
59154 406862 pfam16558 AZUL Amino-terminal Zinc-binding domain of ubiquitin ligase E3A. The AZUL or amino-terminal zinc-binding domain of ubiquitin E3a ligase is found in eukaryotes, and is an unusual zinc-finger domain. The final cysteine is usually mutated in Angelman syndrome patients. It is likely that AZUL plays a role in Ube3A substrate-recognition. 59
59155 406863 pfam16559 GIT_CC GIT coiled-coil Rho guanine nucleotide exchange factor. GIT-CC is the coiled-coil region of GIT (G protein-coupled receptor kinase-interacting) proteins. This coiled-coil region is the surface that associates with the equivalent binding-region on beta-PIX, or p21-activated kinase-interacting exchange factor proteins. Both GIT and PIX complex together to form a scaffold for the formation of multi-protein assemblies. On its own the GIT-CC region assembles into a parallel two-stranded CC in the asymmetric unit. Similarly the PIX coiled-coil region assembles into a trimer. At least in vitro the two regions associate together into a stable heteropentameric complex that consists of one PIX trimer and one GIT dimer. 66
59156 406864 pfam16560 SAPI Putative mobile pathogenicity island. SAPI is a family of putative Gram-positive mobile pathogenicity island proteins. SAPIs are responsible for many superantigen-related diseases in humans as they carry two or more superantigens. 213
59157 406865 pfam16561 AMPK1_CBM Glycogen recognition site of AMP-activated protein kinase. AMPK1_CBM is a family found in close association with AMPKBI pfam04739. The surface of AMPK1_CBM reveals a carbohydrate-binding pocket. 85
59158 406866 pfam16562 HECW_N N-terminal domain of E3 ubiquitin-protein ligase HECW1 and 2. HECW_N is a domain on E3 ubiquitin-protein ligases that lies upstream of the C2 domain; its function is not clearly understood, except perhaps to determine the substrate spectrum of the ligase. 118
59159 406867 pfam16563 P66_CC Coiled-coil and interaction region of P66A and P66B with MBD2. This family is a short coiled-coil interaction region on the transcriptional repressors P66A and P66B. The P66A and B, or alpha and beta, complex with MBDs or methyl-binding domain-containing proteins via a coiled-coil region on each. This P66-MBD2 complex forms part of an assembly with NuRD, nucleosome remodelling and deacetylation protein. MBD2-NuRD binds methylated DNA and regulates transcription of eg, the foetal beta-globin gene during development. 37
59160 406868 pfam16564 MBDa p55-binding region of Methyl-CpG-binding domain proteins MBD. MBDa is a second MBD domain of Methyl-CpG-binding domain proteins, a region implicated in binding the RbAp46/48 (retinoblastoma protein-associated protein) homolog p55, which is one of the components of the MBD2-NuRD complex. The MBD2-NuRD complex is a nucleosome remodelling and deacetylation complex. 69
59161 406869 pfam16565 MIT_C Phospholipase D-like domain at C-terminus of MIT. MIT_C is the C-terminal domain of MIT-containing proteins, pfam04212. It contains an unanticipated phospholipase D fold (PLD fold) that binds avidly to phosphoinositide-containing membranes. It is conserved in eukaryotes, though not fungi and plants, and some bacteria. 137
59162 406870 pfam16566 CREPT Cell-cycle alteration and expression-elevated protein in tumor. CREPT (Cell-cycle alteration and expression-elevated protein in tumor) is a family of eukaryotic transcriptional regulators that promote the binding of RNA-polymerase to the CYCLIN D1, CCDN1, promoter and other genes involved in the cell-cycle. It promotes the formation of a chromatin loop in the CYCLIN D1 gene, and is preferentially expressed in a range of different human tumors. 147
59163 293175 pfam16567 CagD Pathogenicity island component CagD. CagD is a tightly conserved family of proteins found in the pathogenic strains of Helicobacter species. It is one of some 30 proteins, produced from the genomic insert termed the pathogenicity island, required for the type IV secretion system - T4SS - that delivers CagA oncoprotein toxin into the host cell. CagD is a covalent dimer in which each monomer folds as a single domain composed of five beta-strands and three alpha-helices. CagD partially associates with the inner membrane, where it may be exposed to the periplasmic space; this may indicate that CagD is released into the supernatant during host cell infection in order then to bind to the host cell surface, or to be incorporated into the pilus structure. 205
59164 406871 pfam16568 Sam68-YY Tyrosine-rich domain of Sam68. Sam68-YY is a short tyrosine-rich domain on Src-associated in mitosis, 68 kDa protein (Sam68), a protein that regulates TCF-1 alternative splicing. It is a crucial binding-partner of the APC-Arm domain that forms a superhelix with a positively charged groove, the surface-residues of which groove form numerous interactions with Sam68-YY to fix it in a bent conformation. APC-Arm is the armadillo repeat domain of the tumor-suppressor protein adenomatous polyposis coli or APC. APC plays important roles in Wnt signalling and other cellular processes. 55
59165 406872 pfam16569 GramPos_pilinBB Gram-positive pilin backbone subunit 2, Cna-B-like domain. GramPos_pilinBB is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base. 116
59166 406873 pfam16570 GramPos_pilinD3 Gram-positive pilin backbone subunit 3, Cna-B-like domain. GramPos_pilinD3 is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base. 141
59167 406874 pfam16571 FBP_C FBP C-terminal treble-clef zinc-finger. FBP_C is a family from the C terminal end of fibronectin-binding proteins. It forms an extended four-cysteine zinc-finger with a unique structural fold. Fibronectin-binding proteins bind to elongation factor G - EF-G, which is mediated by the zinc-finger binding to the C-terminus of EF-G. FBPs release ribosomes by competing with them for EF-G. 155
59168 406875 pfam16572 HlyD_D4 Long alpha hairpin domain of cation efflux system protein, CusB. HlyD_D4 is the long alpha-hairpin domain in the centre of CusB or HlyD proteins. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons. HlyD_D4 is thought to interact with the alpha-helical tunnels of the corresponding outer-membrane channels, ie the periplasmic domain of CusC. 54
59169 406876 pfam16573 CLP1_N N-terminal beta-sandwich domain of polyadenylation factor. This family is the short N-terminal domain of the pre-mRNA cleavage complex II protein Clp1. Clp1 function involves some degree of adenine or guanine nucleotide-binding and participates in the 3'-end-processing of mRNAs in eukaryotes. 92
59170 406877 pfam16574 CEP209_CC5 Coiled-coil region of centrosome protein CEP290. CEP290 and similar centrosomal proteins carry a number of coiled-coil regions, and this is the fifth along the length of the protein. It is thought that the proteins are involved in cilia biosynthesis. 128
59171 406878 pfam16575 CLP1_P mRNA cleavage and polyadenylation factor CLP1 P-loop. CLP1_P is the P-loop carrying domain of Clp1 mRNA cleavage and polyadenylation factor, Clp1, proteins in eukaryotes. Clp1 is essential for 3'-end processing of mRNAs. This region carries the P-loop suggesting it is the region that binds adenine or guanine nucleotide. 187
59172 406879 pfam16576 HlyD_D23 Barrel-sandwich domain of CusB or HlyD membrane-fusion. HlyD_D23 is the combined domains 2 and 3 of the membrane-fusion proteins CusB and HlyD, which forms a barrel-sandwich. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. The whole molecule hinges between D2 and D3. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons. 214
59173 318725 pfam16577 UBA_5 UBA domain. UBA_2 is a domain found on eukaryotic ubiquitin-interacting proteins. Sequestosome 1/p62 has recently been shown to interact with polyubiquitinated proteins through its UBA domain. This domain selectively binds K63-polyubiquitinated proteins. 62
59174 406880 pfam16578 IL17R_fnIII_D2 Interleukin 17 receptor D. IL17R_fnIII_D2 is the second extracellular fibronectin III-like domain on interleukin17-receptor-D molecules. The exact ligands of IL17R-D are not known. 105
59175 406881 pfam16579 AdenylateSensor Adenylate sensor of SNF1-like protein kinase. AdenylateSensor is a family found at the C-terminus of SNF1-like protein kinases and other protein-kinases. 118
59176 293188 pfam16580 Astro_capsid_p2 C-terminal tail of astrovirus capsid projection or spike. Astro_capsid_p2 is a family of turkey astroviral spike projections. These are globular domains on the surface of the viral capsid. Astroviruses cause diarrhoea in a variety of mammals and birds, and are small, non-enveloped, single-stranded RNA viruses. The spike carries three conserved patches on its surface which could be candidates for avian receptor-binding sites. 245
59177 406882 pfam16581 HIGH_NTase1_ass Cytidyltransferase-related C-terminal region. This domain is found as the C-terminal portion of some HIGH_NTase1 proteins. The exact function is not known. 205
59178 406883 pfam16582 TPP_enzyme_M_2 Middle domain of thiamine pyrophosphate. TPP_enzyme_M_2 is the middle domain of thiamine pyrophosphate in sequences not captured by pfam00205. This enzyme is necessary for the first step of the biosynthesis of menaquinone, or vitamin K2, an important cofactor in electron transport in bacteria. 207
59179 406884 pfam16583 ZirS_C Zinc-regulated secreted antivirulence protein C-terminal domain. ZirS_C is the C-terminal domain of ZirS, zinc-regulated secreted protein, that is part of a type V-like secretion system. The domain adopts a bacterial Ig-like fold. This domain interacts with its transporter ZirT, and ZirS also interacts directly with ZirU, the third component of this antivirulence complex. ZirT is the zinc-regulated transporter through which ZirS is secreted. 141
59180 293192 pfam16584 LolA_2 Outer membrane lipoprotein carrier protein LolA. LolA_2 is a family of Bacteroidetes outer membrane lipoprotein carrier protein LolA-like proteins. The exact function is not known. 152
59181 406885 pfam16585 Lipocalin_8 Lipocalin-like domain. 135
59182 406886 pfam16586 DUF5060 Domain of unknown function (DUF5060). This is the N-terminal domain of a putative glycoside hydrolase, DUF4038. It is found in a number of different bacterial orders. 70
59183 406887 pfam16587 DUF5061 17 kDa common-antigen outer membrane protein. This is a bacterial domain of 17 kDa common-antigen proteins. 82
59184 374649 pfam16588 zf-C2H2_10 C2H2 zinc-finger. 23
59185 406888 pfam16589 BRCT_2 BRCT domain, a BRCA1 C-terminus domain. This BRCT domain, a BRCA1 C-terminus region, is found on many RAP1 proteins, usually at the very N-terminus. The function in human at least of a BRCT is to contribute to the heterogeneity of the telomere DNA length, but that may not be its general function, which remains unknown. 84
59186 406889 pfam16590 ESP Exocrine gland-secreting peptide. ESP is a family of largely rodent exocrine gland-secreting peptides that are produced by the male extraorbital lacrimal gland to be secreted into the tear fluid. Other mice including females detect these peptides through receptors in the vomeronasal organ, and the receptors report information on mouse-strain, sex and species. The peptides are short, all carrying an N-terminal signal-peptide to indicate they are for secretion which accounts for much of the common conservation. 91
59187 406890 pfam16591 HBM Helical bimodular sensor domain. The HBM sensor domain has been identified primarily in bacterial chemoreceptors but is also present on histidine kinases. Characteristic features of this domain are its size of approximately 250 amino acids and its location in the bacterial periplasm. The McpS chemoreceptor of Pseudomonas putida KT2440 was found to possess an HBM sensor domain and its 3D structure in complex with physiologically relevant ligands has been reported. This domain is composed of 2 long and 4 short helices that form two modules each composed of a 4-helix bundle. The McpS chemoreceptor mediates chemotaxis towards a number of organic acids. Both modules of the McpS HBM domain contain a ligand binding site. Chemo-attractants bind to each of these sites and their binding was shown to trigger a chemotactic response. This domain is primarily found in different proteobacteria but also in archaea. Interestingly, amino acids in both ligand binding sites showed a high degree of conservation suggesting that members of this family sense similar ligands. 245
59188 406891 pfam16592 Cas9_REC REC lobe of CRISPR-associated endonuclease Cas9. The REC lobe of Cas9 - the CRISPR-associated endonuclease Cas9 - includes the REC1 and REC2 domains. REC1 forms an elongated, alpha-helical structure consisting of 25 alpha helices and two beta-sheets, whereas REC2 inserted within REC1 adopts a six-helix bundle structure. The REC lobe and the NUC lobe of Cas9 fold to present a positively charged groove at their interface which accommodates the negatively charged sgRNA:target DNA heteroduplex. CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system occurs naturally in bacteria as a defense against invasion by phages or other mobile genetic elements. Cas9 is targeted to specific genomic locations by sgRNAs or single guide RNAs, in order to complex with invading DNA in order to cleave it and render it inactive. 526
59189 406892 pfam16593 Cas9-BH Bridge helix of CRISPR-associated endonuclease Cas9. Cas9-BH is the bridge helix between the NUC and the REC lobes of Cas9 - the CRISPR-associated endonuclease Cas9. The REC lobe and the NUC lobe of Cas9 fold to present a positively charged groove at their interface which accommodates the negatively charged sgRNA:target DNA heteroduplex. CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system occurs naturally in bacteria as a defense against invasion by phages or other mobile genetic elements. Cas9 is targeted to specific genomic locations by sgRNAs or single guide RNAs, in order to complex with invading DNA in order to cleave it and render it inactive. 33
59190 406893 pfam16594 ATP-synt_Z Putative AtpZ or ATP-synthase-associated. This is a family of short highly conserved plant proteins that might be associated with ATP-synthase atp operon. 53
59191 406894 pfam16595 Cas9_PI PAM-interacting domain of CRISPR-associated endonuclease Cas9. Cas9_PI is a family found at the C-terminal of bacterial type II CRISPR system Cas9 endonuclease. This domain adopts a novel protein fold that is unique to the Cas9 family. It is positioned in the structure-DNA-complex to recognize the PAM sequence on the non-complementary DNA strand of the crRNA. PAM sequence is protospacer-adjacent motifs on DNA. See family CRISPR-DR2, Rfam:RF01315. Cas9 carries two nuclease domains, HNH and RuvC, which cleave the DNA strands that are complementary and non-complementary to the 20 nucleotide guide sequence in crRNAs, respectively. 264
59192 406895 pfam16596 MFMR_assoc Disordered region downstream of MFMR. This is a conserved region of disorder, identified with the MobiDB database, found in plants immediately to the C-terminus of the MFMR domain. 136
59193 406896 pfam16597 Thyroglob_assoc Thyroglobulin_1 repeat associated disordered domain. This domain of conserved disorder lies almost invariably between the two repeated Thyroglobulin_1 domains, pfam00086. 61
59194 406897 pfam16598 Edc3_linker Linker region of enhancer of mRNA-decapping protein 3. This region is located between the LSM14 pfam12701 (Lsm) and FDF pfam09532 domains of the enhancer of mRNA-decapping protein 3. This region is predicted to be natively unstructured. Its precise functional role is not known. 94
59195 406898 pfam16599 PTN13_u3 Unstructured linker region on PTN13 protein between PDZ. This natively unstructured region lies between the first two PDZ domains on long eukaryotic tyrosine-protein phosphatase non-receptor type 13 proteins. The function is not known. However, since each of the PDZ domains binds with a different protein it is likely to be a linker region allowing flexibility between the PDZs. 191
59196 406899 pfam16600 Caskin1-CID Caskin1 CASK-interaction domain. The Caskin1 protein interacts with the CASK protein via this region. CASK and Caskin1 are synaptic scaffolding proteins. The binding motif on human Caskin1 is EEIWVLRK. A similar motif is found on protein MINT1 and protein TIAM1, both shown to be able to bind to CASK through the motif. MINT1 and TIAM1 are not part of this family. This region is predicted to be natively unstructured. 55
59197 406900 pfam16601 NPF Rabosyn-5 repeating NPF sequence-motif. NPF is a natively unstructured but well-conserved region found in eukaryotic proteins of the Rabenosyn-5 type, wherein the sequence motif asparagine-proline-phenylalanine followed by several glutamates and aspartates is repeated up to four times along the sequence. NPF lies between the two Rab-binding domains, for Rab-4 and Rab-5, at the C-terminal end of these proteins. Rabosyn-5 (or rabenosyn) is also involved in cell-polarity determination in developing wing epithelia of Drosophila, when the NPF-motif may be implicated. These NPF motifs create a region of strong positive surface potential which appears to bind Eps15 homology, EH or EF-hand, domains on proteins involved in vesicle trafficking. 188
59198 406901 pfam16602 USP19_linker Linker region of USP19 deubiquitinase. This region is generally located between a CS domain pfam04969 and the enzymatic UCH domain pfam00582 of USP19 deubiquitinases. This region is predicted to be natively unstructured. Its precise functional role is not known. 121
59199 406902 pfam16605 LSM_int_assoc LSM-interacting associated unstructured. LSM_int_assoc is a family found largely on eukaryotic SART3 proteins just upstream of their C-terminal LSM-interacting domain. This region is natively unstructured. 60
59200 406903 pfam16606 zf-C2H2_assoc Unstructured conserved, between two C2H2-type zinc-fingers. This domain is found on a set of eukaryotic Zinc finger protein 536 transcriptional regulator proteins sandwiched between zf-C2H2, pfam00096 and zf-H2C2_2 pfam13465. It is not conserved between other pairs of the zinc-fingers on these sequences. It is natively unstructured, and its function is not known. The proteins recognize and bind 2 copies of the core DNA sequence 5'-CCCCCA-3'. 80
59201 406904 pfam16607 CYLD_phos_site Phosphorylation region of CYLD, unstructured. CYLD_phos_site is a natively unstructured region on a subset of tumor-suppressor and de-ubiquitinating enzyme CYLD proteins in eukaryotes. It lies between the second pair of CAP_GLY domains, pfam01302, on these proteins. This region of CYLD, being unstructured, carries a number of serine residues which, in response to cellular stimuli, become phosphorylated. This transient phosphorylation-state induces ubiquitination of TRAF2, a ubiquitin ligase that catalyzes both self-ubiquitination and the ubiquitination of specific target molecules involved in signal transduction. 165
59202 406905 pfam16608 TNRC6-PABC_bdg TNRC6-PABC binding domain. TNRC6-PABC_bdg is a natively unstructured region on the higher eukaryote TNRC6 subset of GW182 proteins that carries the binding motif for the interaction with Polyadenylate-binding protein 1, PABC. TNRC6 are trinucleotide repeat-containing gene 6 proteins required for miRNA-mediated gene silencing that are localized to the P bodies (processing bodies). P bodies are cytoplasmic mRNP aggregates that are involved in general mRNA translation repression and decay, including nonsense-mediated decay. Thus GW182 proteins are essential for microRNA-mediated translational repression and deadenylation in animal cells being a major component of miRISCs. The interaction motif that binds to PABC is ShNWPPEFHPGVPWKGLQ. This region lies between a Q-rich region and the RRM, or RNA-recognition motif, pfam13893. 290
59203 406906 pfam16609 SH3-RhoG_link SH3-RhoGEF linking unstructured region. This family of natively unstructured but conserved residues from higher eukaryotes is found to lie between an SH3 pfam00018 and the RhoGEF, pfam00621, domains. It is serine-rich and likely to be acidic and natively unstructured. 261
59204 406907 pfam16610 dbPDZ_assoc Unstructured region between two PDZ domains on Dlg5. dbPDZ_assoc is found on higher eukaryote Dlg5, Disks large homolog 5, proteins, lying between the second pair of PDZ domains. The sequence is natively unstructured but may just be long extensions of the PDZs on these sequences in this position. The function is not known. 81
59205 406908 pfam16611 RGS12_us2 Unstructured region between RBD and GoLoco. RGS12_us2 is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies between an RBD domain and a GoLoco motif, pfam02196 and pfam02188. The function is not known. 72
59206 406909 pfam16612 RGS12_usC C-terminal unstructured region of RGS12. RGS12_usC is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies at the very C-terminus. It has a highly conserved central section. The function is not known. 138
59207 406910 pfam16613 RGS12_us1 Unstructured region of RGS12. RGS12_us1 is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies N-terminal to other such regions in UniProt:E1BPP4. It is very glycine-rich, and the function is not known. 114
59208 374671 pfam16614 RhoGEF67_u2 Unstructured region two on RhoGEF 6 and 7. RhoGEF67_u2 is a region of natively unstructured residues on Rho guanine nucleotide exchange factor 6 and 7 proteins. The function is not known. It lies after the PH domain and before the C-terminal coiled-coil. 109
59209 406911 pfam16615 RhoGEF67_u1 Unstructured region one on RhoGEF 6 and 7. RhoGEF67_u1 is a region of natively unstructured residues on Rho guanine nucleotide exchange factor 6 and 7 proteins. The function is not known. It lies between the CH and the SH3 domains. 47
59210 406912 pfam16616 PHC2_SAM_assoc Unstructured region on Polyhomeotic-like protein 1 and 2. PHC2_SAM_assoc is a natively unstructured region on Polyhomeotic-like proteins 1 and 2, that lies immediately upstream of the SAM domain, pfam00536. The function is not known. 123
59211 406913 pfam16617 INTAP Intersectin and clathrin adaptor AP2 binding region. INTAP is a natively unstructured region of intersectin 1 proteins, lying between the first pair of SH3 domains, that binds to the clathrin adaptor AP2. This binding forms an intersectin-AP2 complex that functions as an important regulator of clathrin-mediated SV recycling in synapses. 115
59212 406914 pfam16618 SH3-WW_linker Linker region between SH3 and WW domains on ARHGAP12. SH3-WW_linker is a natively unstructured region on Rho-GTPase activating factor 12 proteins that lies between the SH3 and the WW domains. It is found in higher eukaryotes, and the function is not known. 196
59213 406915 pfam16619 SUIM_assoc Unstructured region C-term to UIM in Ataxin3. SUIM_assoc is a natively unstructured region on Ataxin 3 proteins that lies immediately C-terminal to the second UIM domain linking it to a third when present. The function is not known. It is rich in glutamine residues. 60
59214 374677 pfam16620 23ISL Unstructured linker between I-set domains 2 and 3 on MYLCK. 23ISL is a natively unstructured region lying between the second and third I-set domains on higher eukaryotic myosin light chain kinase (MYLCK) proteins. The function is not known. It carries a highly conserved TSSTITLQ sequence motif which might be a binding domain. 162
59215 406916 pfam16621 NECFESHC SH3 terminal domain of 2nd SH3 on Neutrophil cytosol factor 1. NECFESHC is the C-terminal domain of the second SH3 domain found on neutrophil cytosol factor 1 or p47phox proteins in higher eukaryotes. It is not unstructured as illustrated by the structure of Structure 1ng2. 50
59216 406917 pfam16622 zf-C2H2_11 zinc-finger C2H2-type. Zinc-finger of C2H2 type found in higher eukaryotes. 29
59217 406918 pfam16623 WW_FCH_linker Unstructured linker region between on GAS7 protein. WW_FCH_linker is a natively unstructured region on GAS7 or Growth arrest-specific protein 7 higher eukaryote proteins. It lies between the WW and the FCH domains. The function is not known but it carries a highly conserved TINCVTFP sequence motif which might be a binding domain. 92
59218 406919 pfam16624 zf-C2H2_assoc2 Unstructured region upstream of a zinc-finger. zf-C2H2_assoc2 is a short region of natively unstructured sequence immediately upstream of a C2H2-type zinc-finger on eukaryotic Zinc-finger proteins 592 and 800. The function is not known. 95
59219 406920 pfam16625 ISET-FN3_linker Unstructured linking region I-set and fnIII on Brother of CDO. ISET-FN3_linker is a short section of natively unstructured sequence on Biregional cell adhesion molecule-related/down-regulated by oncogenes (Cdon) binding proteins or Brother of CDO. It is found in higher eukaryotes and lies between the second I-set and the first fnIII domains, pfam07679 and pfam00041. The function is not known. 65
59220 374683 pfam16626 Papilin_u7 Linking region between Kunitz_BPTI and I-set on papilin. Papilin_u7 is a conserved region of natively unstructured residues on proteoglycan-like sulfated glycoprotein - papilin 0 in higher eukaryotes. It links the Kunitz_BPTI, pfam00014, and I-set domains pfam07679. The function is not known. 92
59221 406921 pfam16627 BRX_assoc Unstructured region between BRX_N and BRX domain. BRX_assoc is a short stretch of plant transcription regulator proteins carrying the BRX domain that is natively unstructured. It connects the BRX_N and BRX domains in plant transcription regulators. The function is not known. 70
59222 374685 pfam16628 Mac_assoc Unstructured region on maltose acetyltransferase. Mac_assoc is a region of natively unstructured residues on fungal maltose acetyltransferase proteins. It lies just upstream of the Mac, pfam12464, domain linking it with the upstream Zn_clus, pfam00172, the Zn(2)-Cys(6) binuclear cluster. The function of this region is not known. 185
59223 406922 pfam16629 Arm_APC_u3 Armadillo-associated region on APC. Arm_APC_u3 is a semi-unstructured region lying immediately downstream of the armadillo fold before the beta-catenin binding motifs, APC_crr, pfam05923, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 293
59224 406923 pfam16630 APC_u5 Unstructured region on APC between 1st and 2nd catenin-bdg motifs. APC_u5 is a short region of natively unstructured sequence lying between the first and the second 15-residue beta-catenin binding motifs, APC_15aa, pfam05972, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 100
59225 406924 pfam16631 TUTF7_u4 Unstructured region 4 on terminal uridylyltransferase 7. TUTF7_u4 is the fourth natively unstructured region found on a set of higher eukaryote Terminal uridylyltransferase 7 proteins. The function is not known. The region is rich in arginine and lysine. 88
59226 406925 pfam16632 Caskin-tail C-terminal region of Caskin. This region is found at the C-terminus of Caskin proteins. Caskins are CASK-binding synaptic scaffolding proteins. Part of this region is predicted to be in coiled-coil conformation. Its function is not known. 61
59227 406926 pfam16633 APC_u9 Unstructured region on APC between 1st two cysteine-rich regions. APC_u9 is a short region of natively unstructured sequence lying between the first and second APC_crr, pfam05923, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 89
59228 406927 pfam16634 APC_u13 Unstructured region on APC between APC_crr and SAMP. APC_u13 is a short region of natively unstructured sequence lying between the fourth cysteine-rich region, APC_crr, pfam05923, and the SAMP pfam05924, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 54
59229 406928 pfam16635 APC_u14 Unstructured region on APC between SAMP and APC_crr. APC_u14 is a short region of natively unstructured sequence lying between the second SAMP pfam05924, and the fifth cysteine-rich region, APC_crr, pfam05923, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 94
59230 406929 pfam16636 APC_u15 Unstructured region on APC between APC_crr regions 5 and 6. APC_u15 is a short region of natively unstructured sequence lying between the fifth and sixth cysteine-rich, APC_crr, pfam05923, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known. 81
59231 406930 pfam16637 zf-C2H2_assoc3 Putative zinc-finger between two C2H2 zinc-fingers on Patz. zf-C2H2_assoc3 is a partially unstructured region on Patz or POZ-, AT hook-, and zinc finger-containing proteins of higher eukaryotes. It lies between the two C2H2-type zinc-fingers towards the C-terminus of these proteins and may well be an unusual zinc-finger itself. 74
59232 406931 pfam16638 Tristanin_u2 Unstructured region on methyltransferase between zinc-fingers. Tristanin_u2 is a region of natively unstructured sequence on tristanin like or PR domain zinc finger protein 10s found in higher eukaryotes. It lies between two C2H2-type zinc-fingers. The function is not known. 121
59233 406932 pfam16639 Apocytochr_F_N Apocytochrome F, N-terminal. This is the N-terminal domain of cytochrome f. It is a soluble lumen-side domain. 154
59234 406933 pfam16640 Big_3_5 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. 90
59235 406934 pfam16641 CLIP1_ZNF CLIP1 zinc knuckle. This zinc knuckle domain is found tandemly repeated at the C-terminal of the cytoplasmic linker protein CLIP1 (CLIP170). It forms a complex with the CAP-Gly domain of Dynactin. 17
59236 406935 pfam16642 KCNQ2_u3 Unstructured region on Potassium channel subunit alpha KvLQT2. KCNQ2_u3 is a region of natively unstructured sequence on potassium voltage-gated channel subfamily KQT member 2 proteins from higher eukaryotes. It lies between families KCNQ_channel, pfam03520, and KCNQC3-Ank_bd, pfam11956. The function is not known. 96
59237 406936 pfam16643 cNMPbd_u2 Unstructured region on cNMP-binding protein. cNMPbd_u2 is a natively unstructured region on a set of higher eukaryote cyclic nucleotide-binding domain-containing proteins. It lies between the second cNMP_binding, pfam00027, and the F-box, pfam00646, domains. The function is not known but there is a highly conserved DPDPFL sequence motif. 163
59238 406937 pfam16644 NEXCaM_BD Regulatory region of Na+/H+ exchanger NHE binds to calmodulin. NEXCaM_BD is a coiled-coil domain found as part of the regulatory, C-terminal region of the 12-14 TM sodium/proton exchangers (NHEs)2 of the solute carrier 9 (SLC9) family in all animal kingdoms. The C- lobe of CaM binds the first alpha-helix of the NHE, or NEXCaM_BD region, and the N-lobe of CaM binds the second helix of NEXCaM_BD. 110
59239 318786 pfam16645 PHtD_u1 Unstructured region on Pneumococcal histidine triad protein. PHtD_u1 is a natively unstructured region on Pneumococcal histidine triad proteins of higher eukaryotes lying between the first two Strep_his_triad domains so far identified, pfam04270. The function is not known but it does not carry the characteristic histidine triad. 56
59240 406938 pfam16646 AXIN1_TNKS_BD Axin-1 tankyrase binding domain. This is the N-terminal domain tankyrase binding domain of Axin-1. 75
59241 406939 pfam16647 GCSF Granulocyte colony-stimulating factor. GCSF is a family of higher eukaryotic granulocyte colony-stimulating factor proteins. Granulocyte colony-stimulating factors are cytokines that are involved in haematopoiesis. They control the production, differentiation and function of white blood cell granulocytes. GCSF binds to the extracellular Ig-like and CRH domain of its receptor GCSFR, thereby triggering the receptor to homodimerize. Homodimerization results in activation of Janus tyrosine kinase-signal transducers and other activators of transcription (JAK-STAT)-type signalling cascades. 149
59242 406940 pfam16648 Calpain_u2 Unstructured region on Calpain-3. Calpain_u2 is a region of natively unstructured sequence that lies between the Calpain_III, pfam01067 and the first EF-hand, Pfam;PF13833, domains on higher eukaryote calpain-3 proteins. The function is not known. 68
59243 406941 pfam16649 IL23 Interleukin 23 subunit alpha. This family, interleukin 23 subunit alpha, is a heterodimer consisting of a 40 kDa subunit - p40 - that is shared with IL12 and a unique 19 kDa subunit - p19. IL23 is a pro-inflammatory cytokine that binds to adnectins and thus plays a key role in the pathogenesis of several autoimmune and inflammatory diseases. IL23 signalling on the cell membrane works through the interaction of four proteins, two of which are shared with the IL12-receptor complex; signalling through the cell membrane involves the combined aggregation of at least two receptor components and then the subsequent activation of the Jak/Tyk tyrosine kinases and the family of STAT transcription factors. 158
59244 293256 pfam16650 SPEG_u2 Unstructured region on SPEG complex protein. SPEG_u2 is a region of natively unstructured but conserved sequence on Striated muscle-specific serine/threonine-protein kinase proteins in higher eukaryotes. It lies between two I-set immunoglobulin, pfam07679, domains. The function is not known. 57
59245 406942 pfam16652 PH_13 Pleckstrin homology domain. 144
59246 406943 pfam16653 Sacchrp_dh_C Saccharopine dehydrogenase C-terminal domain. This family comprises the C-terminal domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase. 212
59247 406944 pfam16654 DAPDH_C Diaminopimelic acid dehydrogenase C-terminal domain. This family comprises the C-terminal domain of diaminopimelic acid dehydrogenase. Diaminopimelate dehydrogenase is a NADPH-dependent enzyme that catalyzes the oxidative deamination of meso-2,6-diaminopimelate, which is the direct precursor of L-lysine in bacterial lysine biosynthesis. 154
59248 379867 pfam16655 PhoD_N PhoD-like phosphatase, N-terminal domain. This domain is found at the N-terminus of proteins in the PhoD family pfam09423. 89
59249 406945 pfam16656 Pur_ac_phosph_N Purple acid Phosphatase, N-terminal domain. This domain is found at the N-terminus of Purple acid phosphatase proteins. 94
59250 406946 pfam16657 Malt_amylase_C Maltogenic Amylase, C-terminal domain. This is the C-terminal domain of Maltogenic amylase, an enzyme that hydrolyzes starch material. Maltogenic amylases are central to carbohydrate metabolism. 75
59251 406947 pfam16658 RF3_C Class II release factor RF3, C-terminal domain. 129
59252 374703 pfam16660 PHD20L1_u1 PHD finger protein 20-like protein 1. PHD20L1_u1 is a region of natively unstructured but highly conserved sequence on a set of higher eukaryotic PHD finger protein 20-like protein 1 like proteins. The function is not known. 68
59253 406948 pfam16661 Lactamase_B_6 Metallo-beta-lactamase superfamily domain. This family is part of the metallo-beta-lactamase superfamily. 192
59254 406949 pfam16662 FLYWCH_u FLYWCH-type zinc finger-containing protein 1. FLYWCH_u is a region of natively unstructured but conserved sequence that lies between the FLYWCH zinc-finger domains on FLYWCH-type zinc finger-containing protein 1 proteins in higher eukaryotes. The function is not known but the N- and C-termini are likely to be part of the zinc-finger domains specific to the eukaryotes. 62
59255 406950 pfam16663 MAGI_u1 Unstructured region on MAGI. MAGI_u1 is a region of natively unstructured but highly conserved sequence on a subset of higher eukaryote membrane-associated guanylate kinase with WW and PDZ domain-containing proteins. The function is not known. 60
59256 406951 pfam16664 STAC2_u1 Unstructured on SH3 and cysteine-rich domain-containing protein 2. STAC2_u1 is a region of natively unstructured but highly conserved sequence between C1_1 pfam00130, and an SH3 domain, eg pfam00018, on SH3 and cysteine-rich domain-containing proteins from higher eukaryotes. The function is not known. 123
59257 406952 pfam16665 NCOA_u2 Unstructured region on nuclear receptor coactivator protein. NCOA_u2 is a region of natively unstructured but highly conserved sequence found on higher eukaryote nuclear receptor coactivator proteins. It lies between a PAS domain, pfam14598 and a steroid receptor coactivator domain, pfam08832. The function is not known. 119
59258 406953 pfam16666 MAGI_u5 Unstructured region on MAGI. MAGI_u5 is a region of natively unstructured but highly conserved sequence on a subset of higher eukaryote membrane-associated guanylate kinase with WW and PDZ domain-containing proteins. The function is not known. This region lies between two PDZ, pfam00595, domains. 99
59259 406954 pfam16667 MPDZ_u10 Unstructured region 10 on multiple PDZ protein. MPDZ_u10 is a region of natively unstructured but highly conserved sequence on multiple-PDZ-containing domain proteins in higher eukaryotes. It lies between two PDZ domains, pfam00595. The function is not known. 65
59260 293273 pfam16668 JLPA Adhesin from Campylobacter. JLPA is a surface-exposed lipoprotein adhesin that promotes interaction with the host epithelial cells. It is found in the genus Campylobacter, and the structure is an unclosed half beta-barrel fold with a wide hydrophobic concave face; this represents a novel bacterial surface lipoprotein. 352
59261 406955 pfam16669 TTC5_OB Tetratricopeptide repeat protein 5 OB fold domain. This OB fold domain is located at the C-terminus of Tetratricopeptide repeat protein 5 and is required for effective p53 response. 115
59262 406956 pfam16670 PI-PLC-C1 Phosphoinositide phospholipase C, Ca2+-dependent. PI-PLC-C1 is a family of calcium 2+-dependent phosphatidylinositol-specific phospholipase C1 enzymes from bacteria and fungi. The enzyme classification number is EC:3.1.4.11. This enzyme is involved in part of the myo-inositol phosphate metabolic pathway. 330
59263 293276 pfam16671 ACD Actin cross-linking domain. This domain is found in Vibrio cholerae RtxA toxin and VgrG1 protein. This domain cross-links to G-actin leading to cytoskeletal changes. 386
59264 406957 pfam16672 LAMTOR5 Ragulator complex protein LAMTOR5. 88
59265 406958 pfam16673 TRAF_BIRC3_bd TNF receptor-associated factor BIRC3 binding domain. This domain is found in TNF receptor-associated factor 1 and 2 (TRAF1 and TRAF2), where it binds to Baculoviral IAP repeat-containing protein 3 (BIRC3) (cIAP2). 61
59266 406959 pfam16674 UCH_N N-terminal of ubiquitin carboxyl-terminal hydrolase 37. UCH_N is a domain found at the N-terminus of ubiquitin carboxyl-terminal hydrolase 37 or 26. The function is not known. 102
59267 406960 pfam16675 FOXO_KIX_bdg KIX-binding domain of forkhead box O, CR2. FOXO_KIX_bd is the first part of the region of transcription factor forkhead box O family proteins that binds to the CREB-binding proteins via the KIX domain. Coactivator CBP/p300 is recruited by FOXO3 via the binding of this domain as well as the simultaneous binding of the more C-terminal TAD domain. 76
59268 406961 pfam16676 FOXO-TAD Transactivation domain of FOXO protein family. TAD is a promiscuous binding domain that mediates the association of the transcription factor FOXO with the coactivator CBP/p300. Both this domain and the FOXO-KIX_bd family pfam16675 bind simultaneously the KIX domain of CBP/p300. Coactivator CBP/p300 is recruited by FOXO3 through binding to these two regions. The promiscuity of the TAD is further evidenced by the finding that it also binds the TAZ1 and TAZ2 domains of CBP/p300. 40
59269 406962 pfam16677 GP3_package DNA-packaging protein gp3. DNA-packaging protein gp3 (terminase small subunit) is involved in DNA packing in bacteriophage. It contains a channel where DNA is bound and passed to DNA-packaging protein gp2 (terminase large subunit). 106
59270 406963 pfam16678 HOIP-UBA HOIP UBA domain pair. HOIP-UB is a binding domain on E3 ubiquitin-protein ligase RNF31 like proteins. E3 ubiquitin-protein ligase RNF31 is often referred to as HOIL-1L binding partner. The interaction of HOIL-1L and HOIP is thus via the UBL-UBA interaction. This interaction is important in E3 complex formation and the subsequent activation of NF-kappaB. This family contains two UBA-like domains. 150
59271 406964 pfam16679 CDT1_C DNA replication factor Cdt1 C-terminal domain. This is the C-terminal domain of DNA replication factor Cdt1. This domain binds the MCM complex. 96
59272 406965 pfam16680 Ig_4 T-cell surface glycoprotein CD3 delta chain. This is an immunoglobulin-like domain. It is found on the T-cell surface glycoprotein CD3 delta chain. CD3delta and CD3epsilon complex together as part of the T-cell receptor complex. 76
59273 406966 pfam16681 Ig_5 Ig-like domain on T-cell surface glycoprotein CD3 epsilon chain. Ig_5 is an immunoglobulin domain found on T-cell surface glycoprotein CD3 epsilon chain. It forms a first-order complex with T-cell surface glycoprotein CD3 delta chain as part of the T-cell receptor complex. 75
59274 374722 pfam16682 MSL2-CXC CXC domain of E3 ubiquitin-protein ligase MSL2. MSL2-CXC is an autonomously folded domain that binds three zinc ions. It lies on the E3 ubiquitin-protein ligase MSL2 in eukaryotes. The CXC domain critically contributes to the DNA-binding activity of MSL2. It carries 9 invariant cysteines within about a 50 residue region. 55
59275 406967 pfam16683 TGase_elicitor Transglutaminase elicitor. TGase_elicitor is a family of largely oomycete sequences from plant pathogens that elicit transglutaminase/acyltransferase activity. The enzyme classification is E.C:2.3.2.13. From the presence of sequences from Vibrio spp one can propose a lateral gene transfer event having occurred between bacteria and oomycetes to the probable selective advantage of the pathogen. 361
59276 406968 pfam16684 Telomere_res Telomere resolvase. Telomere resolvase (protelomerase) catalyzes the conversion of linear double-stranded DNA into hairpin telomeres. 272
59277 374725 pfam16685 zf-RING_10 zinc RING finger of MSL2. zf-RING_10 is an N-terminal domain on E3 ubiquitin-protein ligase msl-2 proteins. The domain binds MSL1 and exhibits ubiquitin E3 ligase activity towards H2B K34, the histone proteins. 70
59278 406969 pfam16686 POT1PC ssDNA-binding domain of telomere protection protein. POT1PC is the ssDNA-binding domain on a family of fungal telomere protection protein 1 proteins. POT1PC is able to accommodate heterogeneous ssDNA ligands. Pot1 proteins are the proteins responsible for binding to and protecting the 3' single-stranded DNA (ssDNA) overhang at most eukaryotic telomeres. 152
59279 374727 pfam16687 ELYS-bb beta-propeller of ELYS nucleoporin. ELYS-bb is the N-terminal seven-bladed beta-propeller domain of ELYS nucleoporins in higher eukaryotes. It is required for anchorage of the nucleoporin to the nuclear envelope during cell-division. 488
59280 406970 pfam16688 CNV-Replicase_N Replicase polyprotein N-term from Coronavirus nsp1. CNV-Replicase_N is the N-terminal domain of a family of ssRNA positive-stranded porcine transmissible gastroenteritis coronaviruses. The domain folds into a six-stranded beta-barrel fold with a long alpha helix on the rim of the barrel. This fold is shared with SARS-CoV nsp1. 108
59281 374728 pfam16689 APC_N_CC Coiled-coil N-terminus of APC, dimerization domain. APC_N_CC is the N-terminal, coiled-coil dimerization domain of the adenomatosis polyposis coli (APC) tumor-repressor proteins. It plays a key role in the regulation of cellular levels of the oncogene product beta-catenin. Coiled-coil regions are binding repeats that in this case bind to the armadillo repeat region of beta-catenin. 52
59282 406971 pfam16690 MMACHC Methylmalonic aciduria and homocystinuria type C family. 216
59283 374730 pfam16691 DUF5062 Domain of unknown function (DUF5062). This family is found in Vibrio spp. The function is not known. 83
59284 406972 pfam16692 Folliculin_C Folliculin C-terminal domain. This is the C-terminal domain of folliculin. It has guanine nucleotide exchange factor (GEF) activity. 224
59285 293298 pfam16693 Yop-YscD_ppl Inner membrane component of T3SS, periplasmic domain. Yop-YscD-ppl is the periplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus. 254
59286 406973 pfam16694 Cytochrome_P460 Cytochrome P460. 122
59287 406974 pfam16695 Tai4 Type VI secretion system (T6SS), amidase immunity protein. Tai4 is a new form of autoimmunity protein for a type VI secretion system, T6SS. T6SS has roles in interspecies interactions, as well as higher order host-infection, by injecting effector proteins into the periplasmic compartment of the recipient cells of closely related species. Pseudomonas aeruginosa produces at least three effector proteins to other cells and thus has three specific cognate immunity proteins to protect itself. Tae4, or type VI amidase effector 4, in Enterobacter cloacae has a cognate Tai4 or type VI amidase immunity 4 protein. The effector is Tae4, pfam14113. 91
59288 406975 pfam16696 ZFYVE21_C Zinc finger FYVE domain-containing protein 21 C-terminus. This is the C-terminal domain of Zinc finger FYVE domain-containing protein 21. It has a PH-like fold and is required for the regulation of focal adhesions and in cell migration. 120
59289 406976 pfam16697 Yop-YscD_cpl Inner membrane component of T3SS, cytoplasmic domain. Yop-YscD-cpl is the cytoplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus. 92
59290 406977 pfam16698 ADAM17_MPD Membrane-proximal domain, switch, for ADAM17. ADAM17_MPD is the membrane-proximal domain of a family of disintegrin and metalloproteinase domain-containing protein 17 found in metazoan species. ADAM17 is a major sheddase that is responsible for the regulation of a wide range of biological processes, such as cellular differentiation, regeneration, and cancer progression. This MPD region acts as the sheddase switch. PDI or protein-disulfide isomerase interacts with ADAM17 to down-regulate its enzymatic activity. The interaction is directly with the MPD, the region of dimerization and substrate recognition, where it catalyzes an isomerisation of disulfide bridges within the thioredoxin motif CXXC. This isomerisation results in a major structural change between an active, open state and an inactive, closed state of the MPD. This change is thought to act as a molecular switch, allowing a global reorientation of the extracellular domains in ADAM17 and regulating its shedding activity. 62
59291 406978 pfam16699 CSTF1_dimer Cleavage stimulation factor subunit 1, dimerization domain. This family is the dimerization domain, at the N-terminal, of a family of cleavage stimulation factor subunit 1 proteins from eukaryotes. This domain allows for homodimerization such that the functional state of CSTF1 is a heterohexamer. The cleavage stimulation factor (CstF) complex is composed of three subunits and is essential for pre-mRNA 3'-end processing. CstF recognizes U and G/U-rich cis-acting RNA sequence elements and helps to stabilize the cleavage and polyadenylation specificity factor (CPSF) at the polyadenylation site as required for productive RNA cleavage. 57
59292 318831 pfam16700 SNCAIP_SNCA_bd Synphilin-1 alpha-Synuclein-binding domain. This coiled-coil domain found in Synphilin-1 is responsible for binding to alpha-Synuclein. 45
59293 406979 pfam16701 Ad_Cy_reg Adenylate cyclase regulatory domain. This domain regulates the activity of Actinobacterial adenylate cyclase in a pH-dependent manner, allowing activation at acidic pH. 185
59294 406980 pfam16702 DUF5063 Domain of unknown function (DUF5063). 164
59295 406981 pfam16703 DUF5064 Domain of unknown function (DUF5064). This is found in Pseudomonas species. Several members are annotated as being acetyl-CoA carboxylase alpha subunit, but this could not be confirmed. 117
59296 293309 pfam16704 Rab_bind Rab binding domain. This coiled-coil domain, found in GRIP and coiled-coil domain-containing protein 2 and RANBP2-like and GRIP domain-containing protein, has been shown to bind to Rab in GRIP and coiled-coil domain-containing protein 2. 65
59297 406982 pfam16705 NUDIX_5 NUDIX, or N-terminal NPxY motif-rich, region of KRIT. NUDIX_5 is found in higher eukaryotes at the N-terminus of KRIT1 or Krev interaction trapped proteins. NUDIX_5 carries three NPxY-like motifs, and it is found to bind the integrin cytoplasmic-associated protein 1 ICAP1. In the absence of KRIT1 ICAP1 binds via its C-terminal PH/PTB fold domain to the integrin beta-1 cytoplasmic tail. Binding of KRIT1 to ICAP1 via NUDIX_5 out-competes the binding of ICAP1 to integrin cytoplasmic tails such that ICAP1 is sequestered in the nucleus. Integrin activation is thus prevented. 169
59298 406983 pfam16706 Izumo-Ig Izumo-like Immunoglobulin domain. Izumo-Ig is the immunoglobulin domain on Izumo proteins from higher eukaryotes. Izumo is a typical type I membrane glycoprotein with one immunoglobulin-like domain and a putative N-glycoside link motif - glycosylation site. The full-length protein is a molecule with a single immunoglobulin (Ig) domain. It is thought that Izumo proteins bind to putative Izumo receptors on the oocyte. Izumo is not detectable on the surface of fresh sperm but becomes exposed only after an exocytotic process, the acrosome reaction, has occurred. Studies have shown that knock-out mice (Izumo-/- males) were sterile despite normal mating behaviour and ejaculation, indicating the importance of the protein in fertilisation. There is a conserved GCL sequence motif. Izumo expression has been found to be testis-specific. 86
59299 406984 pfam16707 CagS Cag pathogenicity island protein S of Helicobacter pylori. CagS is a family of proteins from the pathogenicity island of Helicobacter pylori. The gene lies just downstream of the cluster whose protein-products resemble those of the Vibrio proteins that form the structural core of T4SS. The exact function of CagS is not known. 196
59300 406985 pfam16708 LppA Lipoprotein confined to pathogenic Mycobacterium. This is a family of lipoproteins found only in pathogenic mycobacteria. These pathogenic lipoproteins may play a role in host-pathogen interactions. Lipoproteins localized to the cell-envelope of pathogenic bacteria are major determinants of virulence. The proteins are localized to the cell-surface via an N-terminal lipidation carried out by a transferase - pro-lipoprotein diacylglyceryl transferase Lgt - which attaches a diacylglyceride molecule to a sulfur atom from a crucial cysteine, and a consecutively acting lipoprotein signal peptidase LspA that cleaves the signal peptide just before the modified cysteine. When the peptidase is inactivated the pathogen has difficulty in replicating inside macrophages. 153
59301 406986 pfam16709 SCAB-IgPH Fused Ig-PH domain of plant-specific actin-binding protein. This family is a fused Ig and PH domain found on plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. This domain has the N-terminal Ig beta-sandwich fold consisting of two antiparallel beta-sheets built from strands beta1 and beta2 and strands beta3-beta6, respectively. The C-terminus of the fused domains adopts the PH fold, of seven beta-strands, beta7-beta13 and two alpha-helices, alpha1 and alpha2 arranged into a beta-barrel. The Ig and PH domains appear to be truly fused together into an integral structure which displays a few conserved patches on the surface, particularly of the PH part. The canonical phosphoinositide-binding pocket of the classic PH domain is degenerate in this fused one, and the charge on the pocket suggests that the Ig-PH domain contains a non-canonical binding site for inositol phosphates. There are a handful of bacterial members at low threshold but they are missing the PH part of the fused domain, and appear to match little else. 98
59302 293315 pfam16710 CTXphi_pIII-N1 N-terminal N1 domain of Vibrio phage CTXphi pIII. CTXphi_pIII-N1 is the N-terminal domain, N1, of the pIII protein of the CTXphi bacteriophage of Vibrio cholerae. CTXphi is a ssDNA Inovirus. pIII is a minor coat protein. This domain interacts directly with the C-terminus of TolA, a periplasmic protein of Vibrio cholerae itself as part of the infection mechanism. 111
59303 374744 pfam16711 SCAB-ABD Actin-binding domain of plant-specific actin-binding protein. SCAB-ABD is the actin-binding domain of plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. The ABD is structurally independent from the first coiled-coil, CC1, domain which is also involved in binding; the CC1 is likely to function as a dimerization module. 41
59304 406987 pfam16712 SCAB_CC Coiled-coil regions of plant-specific actin-binding protein. SCAB_CC is the two coiled-coil, dimerization domains of plant-specific actin-binding proteins or SCABs, CC1 and CC2, both of which contribute independently to dimerization. CC1 is also required for actin binding, indicating that SCAB1 is a bivalent actin cross-linker, since CC1 adopts an antiparallel helical hairpin that further dimerizes into a four-helix bundle. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. 168
59305 293318 pfam16713 EAGR_box Enriched in aromatic and glycine Residues box. The Enriched in Aromatic and Glycine Residues (EAGR) box is found in proteins from Mycoplasma, often tandemly repeated, and may have a role in cell motility. 34
59306 406988 pfam16714 TyrRSs_C Tyrosyl-tRNA synthetase C-terminal domain. This domain is found at the C-terminus of fungal tyrosyl-tRNA synthetases. It binds to group I introns. 120
59307 406989 pfam16715 CDPS Cyclodipeptide synthase. This family of proteins includes enzymes involved in the synthesis of cyclodipeptides using aminoacyl-tRNAs as substrates, including cyclo(L-leucyl-L-leucyl) synthase, cyclo(L-tyrosyl-L-tyrosyl) synthase and cyclo(L-leucyl-L-phenylalanyl) synthase. They are structurally similar to class Ic aminoacyl-tRNA synthetases (aaRSs). 220
59308 406990 pfam16716 BST2 Bone marrow stromal antigen 2. 91
59309 406991 pfam16717 RAC_head Ribosome-associated complex head domain. The RAC head domain is involved in ribosome binding. 87
59310 374750 pfam16718 IFS Immunity factor for SPN. Immunity factor for SPN (IFS) binds to and inhibits the SPN toxin. 164
59311 406992 pfam16719 SAWADEE SAWADEE domain. The SAWADEE domain, found in plant homeobox proteins, has a pair of tandem tudor-like folds that bind chromatin. 126
59312 374752 pfam16720 Albumin_I_a Albumin I chain a. The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane bound 43 kDa receptor. This domain represents the a chain. 48
59313 293326 pfam16721 zf-H3C2 Zinc-finger like, probable DNA-binding. This is a family of probably DNA-binding zinc-fingers found on Gag-Pol polyproteins from mouse retroviruses. Added to clan to resolve overlaps with zf-H2C2, but neither are true members. 96
59314 406993 pfam16722 SAPIS-gp6 Pathogenicity island protein gp6 in Staphylococcus. SAPIS-gp6 is a family of proteins produced from the pathogenicity island SAPI1 in pathogenic Staphylococcus aureus. This is a mobile genetic element that carries genes for several superantigen toxins. SAPIS-gp6 is a dimeric protein produced from the pathogenicity island with a helix-loop-helix motif similar to that of bacteriophage scaffolding proteins. It is thought to determine the size of the capsids of distribution of the SAPI1 genome as it acts as an internal scaffolding protein during capsid size determination. 72
59315 374754 pfam16723 DUF5065 Domain of unknown function (DUF5065). This family is found in Bacillus species. The function is not known. 156
59316 406994 pfam16724 T4-gp15_tss T4-like virus Myoviridae tail sheath stabilizer. T4-gp15_tss is the tail-sheath-stabilizer or tail-terminator protein of T4-like myoviridae phage. It forms a hexamer. It simultaneously forms the binding site for attachment of the capsid to the tail as gp15 binds to gp14 and gp13, the neck proteins, and completes the tail as it binds to the top of the tail via hexamer gp3 and the C-terminal domain of gp18 located in the last ring of the contractile tail sheath. 238
59317 406995 pfam16725 Nucleolin_bd Nucleolin binding domain. This domain adopts a three helix fold resembling part of a winged helix motif. It binds nucleolin. 70
59318 406996 pfam16726 OCRL_clath_bd Inositol polyphosphate 5-phosphatase clathrin binding domain. This domain is a clathrin binding domain found at the N-terminus of inositol polyphosphate 5-phosphatase OCRL. It has a PH domain-like fold. 101
59319 406997 pfam16727 REV1_C DNA repair protein REV1 C-terminal domain. This is the C-terminal domain of DNA repair protein REV1. It interacts with REV7, POLN, POLK and POLI. 91
59320 293333 pfam16728 DUF5066 Domain of unknown function (DUF5066). 213
59321 406998 pfam16729 DUF5067 Domain of unknown function (DUF5067). 125
59322 406999 pfam16730 DnaGprimase_HBD DnaG-primase C-terminal, helicase-binding domain. DnaG-primase_C is the C-terminal of a set of eubacterial DnaG primases that are a single-stranded DNA (ssDNA)-dependent RNA polymerase responsible for the synthesis of oligonucleotide primers needed for the replication of DNA. It interacts with helicase at the replication fork. 118
59323 407000 pfam16731 GARP Glutamic acid/alanine-rich protein of Trypanosoma. GARP, or glutamic acid/alanine-rich protein, is one of a subset of major surface molecules on Trypanosoma species. They are all surface-orientated, immunodominant, and highly charged. GARP is interesting as its expression coincides with the loss and gain of variant surface glycoprotein (VSG) molecules in the tsetse vector. It has an extended helical bundle structure that is homologous to the core surface structure of VSG, suggesting that it might replace the bloodstream VSG as the trypanosomes differentiate inside the tsetse vector after a blood-meal. 193
59324 407001 pfam16732 ComP_DUS Type IV minor pilin ComP, DNA uptake sequence receptor. ComP-DUS is the DNA-uptake sequence receptor of pathogenic Proteobacteria. ComP is a type IV minor pilin, one of three minor (low abundance) pilins (with PilV and PilX) in pathogenic Neisseria species. These modulate Tfp-mediated properties without affecting Tfp biogenesis. ComP plays a prominent role in competence at the level of DNA uptake. ComP is exposed on the surface of Neisseria filaments, and it is this that recognizes homotypic DNA through genus-specific DNA uptake sequence (DUS) motifs. 81
59325 407002 pfam16733 NRho Rhomboid N-terminal domain. This is the N-terminal domain of rhomboid protease. 69
59326 407003 pfam16734 Pilin_GH Type IV pilin-like G and H, putative. Pilin_GH is a family from Cyanobacteria. All the proteins are putatively annotated as being general secretion pathway proteins G and H, and are likely to be pilins of the type IV secretory pathway. 111
59327 407004 pfam16735 MYO10_CC Unconventional myosin-X coiled coil domain. This coiled coil domain is found in unconventional myosin-X and is responsible for dimerization. 52
59328 407005 pfam16736 sCache_like Single Cache-like. This entry represents the N-terminal Cache-like domain of the alkaline phosphatase synthesis sensor protein PhoR. It covers part of the PAS-like fold that shares a central five-stranded beta-sheet of identical topology with other PAS domains. 114
59329 407006 pfam16737 PHF12_MRG_bd PHD finger protein 12 MRG binding domain. This domain found in PHD finger protein 12 binds to the MRG domain of Mortality factor 4-like protein 1. 39
59330 407007 pfam16738 CBM26 Starch-binding module 26. CBM26 is a carbohydrate-binding module that binds starch. 68
59331 407008 pfam16739 CARD_2 Caspase recruitment domain. In the probable ATP-dependent RNA helicase DDX58 this CARD domain is found near the N-terminus and interacts with the C-terminal domain. 90
59332 407009 pfam16740 SKA2 Spindle and kinetochore-associated protein 2. Spindle and kinetochore-associated protein 2 (SKA2) interacts with the N-termini of SKA1 and SKA3 and forms the Ska complex. This is a microtubule binding complex required for chromosome segregation. 110
59333 407010 pfam16741 mRNA_decap_C mRNA-decapping enzyme C-terminus. The C-terminal domain of mRNA-decapping enzyme in Metazoa is responsible for trimerisation. 43
59334 407011 pfam16742 IL17R_D_N N-terminus of interleukin 17 receptor D. IL17R_D_N is found in higher eukaryotes. The function of this N-terminal domain is not known. 122
59335 407012 pfam16743 PliI Periplasmic lysozyme inhibitor of I-type lysozyme. 121
59336 407013 pfam16744 Zf_RING KIAA1045 RING finger. 72
59337 407014 pfam16745 RsgA_N RsgA N-terminal domain. This domain is found at the N-terminus of RsgA domains. It has an OB fold. 54
59338 407015 pfam16746 BAR_3 BAR domain of APPL family. BAR_12 is the BAR coiled-coil domain at the N-terminus of APPL or adaptor protein containing PH domain, PTB domain, and leucine zipper motif proteins in higher eukaryotes. This BAR domain contains four helices whereas the other classical BAR domains contain only three helices. The first three helices form an antiparallel coiled-coil, while the fourth helix is unique to APPL1. BAR domains take part in many varied biological processes such as fission of synaptic vesicles, endocytosis, regulation of the actin cytoskeleton, transcriptional repression, cell-cell fusion, apoptosis, secretory vesicle fusion, and tissue differentiation. 235
59339 407016 pfam16747 Adhesin_E Surface-adhesin protein E. Adhesin E plays a role in pathogenesis. It binds to host proteins including plasminogen, vitronectin and laminin. 125
59340 407017 pfam16748 INSC_LBD Inscuteable LGN-binding domain. This is the LGN-binding domain (LBD) of the inscuteable homolog protein. It interacts with the TPR motifs of G-protein-signaling modulator 2 (GPSM2) (LGN) and stabilizes LGN. 44
59341 374774 pfam16749 Arteri_nsp7a Arterivirus nonstructural protein 7 alpha. Nonstructural protein 7 alpha is likely to have a role in viral RNA synthesis. 128
59342 407018 pfam16750 HK_sensor Sensor domain of 2-component histidine kinase. HK_sensor is the sensor domain found at the N-terminus of the integral membrane two-component system sensor histidine kinase proteins in bacteria. 110
59343 407019 pfam16751 RsdA_SigD_bd Anti-sigma-D factor RsdA to sigma factor binding region. RsdA_SigD_bd is a domain at the N-terminus of anti-sigma-D factor RsdA proteins. It binds to the -35 promoter binding domain of sigma-D. The complex formed regulates the transcriptional expression of the bacterium. 46
59344 407020 pfam16752 TBCC_N Tubulin-specific chaperone C N-terminal domain. This N-terminal domain of tubulin-specific chaperone C has a spectrin-like fold and binds to tubulin. 115
59345 407021 pfam16753 Tipalpha TNF-alpha-Inducing protein of Helicobacter. Tipalpha is secreted from H. pylori as dimers and enters the gastric cells. It binds to DNA via the positively charged surface-patch formed between the two monomers of the crystal structure by the loop between helices alpha1 and alpha2. Each monomer consists of a helical domain and a mixed domain. 150
59346 407022 pfam16754 Pesticin Bacterial toxin homolog of phage lysozyme, C-term. This is the C-terminal activator domain of pesticin, a hydrolase enzyme secreted by Yersinia pestis and other Gammaproteobacteria to kill related bacteria occupying the same ecological niche. It is referred to as a bacteriocin and it leads to the hydrolysis of peptidoglycan. Its immunity protein is Pim. Pesticin carries an elongated N-terminal translocation domain, an intermediate receptor binding domain, and a C-terminal activity domain with structural analogy to lysozyme homologs. The full-length protein is toxic to bacteria when taken up to the target site via the outer or the inner membrane. The receptor domain is necessary for the close contact with the outer membrane; the N-terminal is a type of translocational, TonB box; the C-terminal domain is the death-delivering domain. 152
59347 407023 pfam16755 NUP214 Nucleoporin or Nuclear pore complex subunit NUP214=Nup159. NUP214 is a family of nucleoporins or nuclear pore complex subunit 214 in vertebrates and 159 in yeast found in eukaryotes. It participates in allowing family 2 of DEAD-box ATPases Dbp5/DDX19 to localize to the nuclear pore complex where it takes part in mRNA export and re-modelling. NUP214 helps to regulate DEAD-box ATPase activity. 359
59348 407024 pfam16756 PALB2_WD40 Partner and localizer of BRCA2 WD40 domain. This domain is found at the C-terminus of partner and localizer of BRCA2 (PALB2). It is a seven-bladed WD40-type beta-propeller. It binds to the N-terminus of BRCA2. 351
59349 407025 pfam16757 Fucosidase_C Alpha-L-fucosidase C-terminal domain. The C-terminal domain of Structure 1hl8 is constructed of eight anti-parallel beta-strands packed into two beta-sheets of five and three strands, respectively, forming a two-layer beta-sandwich containing a Greek key motif. 90
59350 407026 pfam16758 UL141 Herpes-like virus membrane glycoprotein UL141. UL141 is a family of glycoproteins from herpesvirus species. At its N-terminus it carries an Ig-like beta-sandwich domain, which binds to the cysteine-rich region of TRAIL-R2, a family of tumor necrosis factor receptor proteins. UL141 is both necessary and sufficient to retain TRAIL receptors in the ER, thereby preventing their cell surface expression and it is also necessary and sufficient to inhibit cell surface expression of CD155. 191
59351 407027 pfam16759 LIG3_BRCT DNA ligase 3 BRCT domain. The BRCT domain of DNA ligase 3 (LIG3) binds to the C-terminal BRCT domain of the scaffolding protein X-ray repair cross-complementing protein 1 (XRCC1) and mediates homo- and heterodimerization. 78
59352 407028 pfam16760 CBM53 Starch/carbohydrate-binding module (family 53). 75
59353 407029 pfam16761 Clr2_transil Transcription-silencing protein, cryptic loci regulator Clr2. Clr2_transil is a domain carrying the first and second of three regions on Clr2 that are necessary for transcriptional silencing by the protein. Clr2 is a protein in the SHREC complex that is a crucial factor required for heterochromatin formation and it plays a major role in mating-type and rDNA silencing. The third region is family pfam10383. 68
59354 407030 pfam16762 RHH_6 Ribbon-helix-helix domain. This ribbon-helix-helix domain binds to DNA and may be a part of a toxin-antitoxin system. 77
59355 407031 pfam16763 Spidroin_N Major ampullate spidroin 1, spider silk protein 1, N-term. Spidroin is produced by a number of arachnids. Spidroins are made up of repetitive segments flanked by conserved non-repetitive domains, and this domain is the conserved non-repetitive region. Aggregation to form the rigid silk occurs due to association at the repetitive regions, and the N-terminal domain is necessary to prevent premature aggregation during storage before extrusion. This N-terminal region inhibits precocious aggregation and then accelerates and directs self-assembly as the pH is lowered along the extrusion duct. 125
59356 407032 pfam16764 Sharpin_PH Sharpin PH domain. This PH domain is found at the N-terminus of sharpin and is involved in dimerization. 113
59357 293370 pfam16765 Pim Pesticin immunity protein. Pim is the immunity protein produced by Yersinia pestis and other Gammaproteobacteria to protect themselves against the bacteriostatic activity of the toxin pesticin, pfam16754. 98
59358 407033 pfam16766 CID_GANP Binding region of GANP to ENY2. CID is a domain on higher eukaryotic germinal-centre associated nuclear protein, or GANP, that binds to the transcription and mRNA export factor ENY2. The complex of these two proteins forms part of the TREX-2 complex that links transcription with nuclear messenger RNA export. 71
59359 407034 pfam16767 KinB_sensor Sensor domain of alginate biosynthesis sensor protein KinB. KinB_sensor is the N-terminal sensor domain of histidine kinase from Pseudomonas species. The domain is the extracellular sensing domain, and is a four-helical bundle. 120
59360 407035 pfam16768 NupH_GANP Nucleoporin homology of Germinal-centre associated nuclear protein. NupH_GANP is the nucleoporin-homology domain at the N-terminus of human GANP or germinal-centre associated nuclear proteins. GANP is part of the TREX-2 complex that links transcription with nuclear messenger RNA export, and it associates with the mRNP particle through the interaction of the NupH_GANP with NXF1, the export factor. This attachment mediates efficient delivery of mRNPs to nuclear pore complexes. 292
59361 407036 pfam16769 MCM3AP_GANP MCM3AP domain of GANP. MCM3AP_GANP is the C-terminal domain of germinal centre-associated proteins, GANPs in higher eukaryotes. GANP forms part of the TREX-2 complex which in higher eukaryotes requires the MCM3AP domain of GANP to facilitate its localization to the Nuclear pore complex and nuclear envelope. TREX-2 complex links transcription with nuclear messenger RNA export. 717
59362 407037 pfam16770 RTT107_BRCT_5 Regulator of Ty1 transposition protein 107 BRCT domain. This is the fifth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A. 91
59363 407038 pfam16771 RTT107_BRCT_6 Regulator of Ty1 transposition protein 107 BRCT domain. This is the sixth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A. 107
59364 407039 pfam16772 TERF2_RBM Telomeric repeat-binding factor 2 Rap1-binding motif. This domain, found in telomeric repeat-binding factor 2, binds to the C-terminus of repressor activator protein 1 (RAP1) (telomeric repeat-binding factor 2-interacting protein 1). 41
59365 374792 pfam16773 Phage_SSB Lactococcus phage single-stranded DNA binding protein. This single-stranded DNA binding protein is found in Lactococcus phage. It can stimulate RecA-mediated homologous recombination. Its structure is a variation of the typical oligonucleotide/oligosaccharide binding-fold of single-stranded DNA binding proteins. 117
59366 407040 pfam16774 Baseplate Baseplate protein. This protein is a structural component of the phage baseplate in Siphoviridae. 157
59367 407041 pfam16775 ZoocinA_TRD Target recognition domain of lytic exoenzyme. ZoocinA_TRD is a domain found downstream of various lytic enzymes, such as peptidase M23 and phage lysins. The domain is composed of strands of antiparallel beta sheet with one short alpha helix at the C-terminal end. 106
59368 407042 pfam16776 INPP5B_PH Type II inositol 1,4,5-trisphosphate 5-phosphatase PH domain. 144
59369 374794 pfam16777 RHH_7 Transcriptional regulator, RHH-like, CopG. RHH_7 is a ribbon-helix-helix protein family expressed by Helicobacter species. These proteins bind to specific DNA sequences with high affinity and usually act as repressors. Many are putatively named CopG. 74
59370 407043 pfam16778 Phage_tail_APC Phage tail assembly chaperone protein. Phage_tail_APC is a family of general phage tail assembly chaperone proteins from double-stranded DNA viruses with no RNA stage, many of which are unclassified. 60
59371 407044 pfam16779 DMP12 DNA-mimic protein. This is a family of DNA-mimic proteins expressed by Neisseria species. In its monomeric form DMP12 interacts with the Neisseria dimeric form of the bacterial histone-like protein HU. HU proteins promote the assembly of higher-order DNA-protein structures. The interaction between DMP12 and HU protein may be instrumental in controlling the stability of the nucleoid in Neisseria as DMP12 prevents Neisseria HU protein from being digested by trypsin. 115
59372 407045 pfam16780 AIMP2_LysRS_bd AIMP2 lysyl-tRNA synthetase binding domain. This is the lysyl-tRNA synthetase binding domain of aminoacyl tRNA synthase complex-interacting multifunctional protein 2 (AIMP2). 47
59373 407046 pfam16781 DUF5068 Domain of unknown function (DUF5068). This family is expressed by Firmicutes. The function is not known. 185
59374 407047 pfam16782 SIL1 Nucleotide exchange factor SIL1. This family consists of fungal SIL1 nucleotide-exchange factor proteins. It interacts with Hsp70 (heat-shock protein of 70 kDa) Bip. 289
59375 407048 pfam16783 FANCM-MHF_bd FANCM to MHF binding domain. FANCM-MHF_bd is a structured region on Fanconi anaemia complementation group protein M that binds to a two-histone-fold-containing protein complex MHF. MHF binds double-strand DNA, stimulates the DNA-binding activity of FANCM, and contributes to the targeting of FANCM to chromatin. 115
59376 293389 pfam16784 HNHc_6 Putative HNHc nuclease. This family is found in Gammaproteobacteria. It may be an HNH-like nuclease. The shorter matches are likely to be from phage proteins whereas the longer members are probably from the bacterial genomes. 203
59377 407049 pfam16785 SMBP Small metal-binding protein. This histidine-rich protein binds metal ions. 111
59378 407050 pfam16786 RecA_dep_nuc Recombination enhancement, RecA-dependent nuclease. REF is a family of P1-like phage RecA-dependent nucleases. It does not appear to act as a positive RecA regulator. It is a new kind of enzyme, a RecA-dependent nuclease. 102
59379 407051 pfam16787 NDC10_II Centromere DNA-binding protein complex CBF3 subunit, domain 2. NDC10_II is the second of five domains on the Kluyveromyces lactis Ndc10 protein. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilizes a DNA loop at the centromere. 313
59380 407052 pfam16788 ATF7IP_BD ATF-interacting protein binding domain. ATF7IP-BD is a short conserved region of activating transcription factor 7-interacting protein 1 found in higher eukaryotes. This domain appears to bind several key proteins such as TFIIE-alpha and TFIIE-beta as well the transcriptional regulator Sp1 which are part of the transcriptional machinery. 215
59381 374803 pfam16789 YscO-like YscO-like protein. This family of proteins is similar to the type III secretion protein YscO. The family includes Chlamydia trachomatis CT670 which is found in a type III secretion gene cluster. CT670 interacts with CT671, a putative YscP homolog and CT670 and CT671 may form a chaperone-effector pair. 160
59382 407053 pfam16790 Phage_clamp_A Bacteriophage clamp loader A subunit. This is the A subunit of bacteriophage DNA clamp loader required for loading of sliding clamps onto chromosomal DNA. These clamps are involved in processivity of DNA replication. 144
59383 407054 pfam16791 Connexin40_C Connexin 40 C-terminal domain. This is the C-terminal domain of connexin 40. It interacts with the C-terminal and cytoplasmic loop domains of connexin 43 and with the cytoplasmic loop of connexin 40. 106
59384 407055 pfam16792 Caudo_bapla16 Phage tail base-plate attachment protein of Caudovirales ORF16. Caudo_bapla16 is a family of ORF16 tail-phage P2-like proteins that forms part of the base-plate at the tip of the phage tail. The whole base-plate complex is involved in host recognition and attachment, and consists of several proteins derived from consecutive open-reading-frames. This central domain is expressed from ORF16 in the lactococcal P2-phage and forms a trimer. 372
59385 374806 pfam16793 RepB_primase RepB DNA-primase from phage plasmid. RepB_primase is a DNA-primase produced by P4-like phages. It is a zinc-independent primase unlike Pri-type primases. It takes up a dumbbell shape consisting of an N-terminal catalytic domain separated by a long alpha-helix plus tether and a C-terminal helical-bundle domain. Primases are necessary for phage replication. RepBprime primases such as in this family recognize both ssiA and ssiB, ie only 1 single-stranded primase initiation site on each strand, independently of each other and then synthesize primers that are elongated by DNA polymerase III. The phage is thus replicated exclusively in leading strand mode. 230
59386 407056 pfam16794 fn3_4 Fibronectin-III type domain. 101
59387 407057 pfam16795 Phage_integr_3 Archaeal phage integrase. Catalyzes cleavage and ligation of DNA. 162
59388 407058 pfam16796 Microtub_bd Microtubule binding. This motor homology domain binds microtubules and lacks an ATP-binding site. 143
59389 407059 pfam16797 Fungal_KA1 Fungal kinase associated-1 domain. This domain is found at the C-terminus of several fungal kinases. 115
59390 374811 pfam16798 DUF5069 Domain of unknown function (DUF5069). 134
59391 407060 pfam16799 VGPC1_C C-terminal membrane-localization domain of ion-channel, VCN1. VCN1_C is the short C-terminal region of voltage-gated proton channel 1 proteins in higher eukaryotes. The domain is necessary for achieving the dimeric architecture (two monomers form a dimer via parallel alpha-helical coiled-coil interaction), but it is also essential for localising the protein to an intracellular membrane. 48
59392 374812 pfam16800 Endopep_inhib IseA DL-endopeptidase inhibitor. This domain functions as a DL-endopeptidase inhibitor. 150
59393 407061 pfam16801 MSL1_dimer dimerization domain of Male-specific-Lethal 1. MSL1_dimer is the short coiled dimerization domain of higher eukaryotic MSL1, part of the MSL or Male-Specific Lethal complex. This complex regulates the dosage compensation of the male X chromosome in Drosophila and other eukaryotes. The structure of the MSL1/MSL2 core shows that two MSL2 subunits bind to a dimer formed by two molecules of MSL1. MSL1 is a substrate for MSL2 E3 ubiquitin ligase activity. 37
59394 407062 pfam16802 DUF5070 Domain of unknown function (DUF5070). 154
59395 407063 pfam16803 DRE2_N Fe-S cluster assembly protein DRE2 N-terminus. This is the N-terminal domain of the fungal Fe-S cluster assembly protein DRE2. 129
59396 407064 pfam16804 DUF5071 Domain of unknown function (DUF5071). 119
59397 374815 pfam16805 Trans_coact Phage late-transcription coactivator. This family of proteins is found in Caudovirales. It is a late-transcription coactivator which interacts with the host RNA polymerase forming a part of the initiation complex. 69
59398 293411 pfam16806 ExsD Antiactivator protein ExsD. The antiactivator protein ExsD represses the transcriptional activator ExsA. ExsA activates expression of type III secretion system genes. Repression of ExsA by ExsD is relieved by the secretion chaperone ExsC. 237
59399 407065 pfam16807 DUF5072 Domain of unknown function (DUF5072). 112
59400 407066 pfam16808 PKcGMP_CC Coiled-coil N-terminus of cGMP-dependent protein kinase. PKcGMP_CC is the N-terminal coiled-coil, dimerization, domain of cGMP-protein kinases. 35
59401 293414 pfam16809 NleF_casp_inhib NleF caspase inhibitor. Binds to and inhibits caspase-9, caspase-8 and caspase-4, thereby preventing caspase-induced apoptosis in the host cell. 145
59402 374818 pfam16810 RXLR RXLR phytopathogen effector protein, Avirulence activity. RXLR is a family of phytopathogen avirulence or effector proteins. RXLR proteins are defined by a secretion signal peptide - not in this family - followed by a conserved N-terminal domain with the sequence motif RXLR (Arg-Xaa-Leu-Arg) consensus sequence. The RXLR part is required for translocation inside plant cells, although it appears to be dispensable for the biochemical activity of the effectors when expressed directly inside host cells. The effector activity resides in the C-terminal part of the family, which activates effector-triggered immunity in plants that carry a corresponding resistance (R) protein. The C-terminal region exhibits a fold that appears to be able to evolve to outwit the host as the latter tries to acquire new immunity. 138
59403 374819 pfam16811 TAtT TRAP transporter T-component. TAtT is a family of one component, the T-component, of a sub-set of TRAP-Ts or Tripartite ATP-independent periplasmic transporters. TRAP-Ts are bacterial transport systems implicated in the import of small molecules into the cytoplasm in bacteria. They are all periplasmic lipoproteins. TatT consists of a 13-alpha-helical fold containing cryptic tetratricopeptide repeat motifs (cTPRs) and encompassing a pore, ie is a water-soluble trimer whose protomers are each perforated by a pore. It forms a complex with a P component, and a putative ligand-binding cleft of TatPT aligns with the pore of TatT. Family TatPT is represented by some members of pfam03480. 263
59404 318916 pfam16812 AdHead_fibreRBD C-terminal head domain of the fowl adenovirus type 1 long fibre. AdHead_fibreRBD is a C-terminal part of the head domain of the dsDNA viruses, no RNA stage, Adenovirus. This is a globular head domain with an anti-parallel beta-sandwich fold formed by two four-stranded beta-sheets with the same overall topology as human adenovirus fibre heads. This C-terminal domain is the receptor-binding domain of the avian adenovirus long fibre. 207
59405 407067 pfam16813 Cas_St_Csn2 CRISPR-associated protein Csn2 subfamily St. Cas_St_Csn2 is a family of Csn2 CRISPR-associated (Cas) proteins found in Firmicutes, largely Streptococcus and Enterococcus. CRISPR-associated (Cas) proteins are the main executioners of the process whereby prokaryotes acquire immunity against foreign genetic material. Cas allow short segments of this DNA, called spacer, to become incorporated into chromosomal loci as clustered regularly interspaced short palindromic repeats or CRISPRs; the resulting encoded RNAs are then processed into small fragments that guide the silencing of the invading genetic elements. Thus Cas are involved in the acquisition of new spacers. This family of St_Csn2 is longer than the canonical Csn2, pfam09711 through the addition of a large C-terminal domain. The central domain present in both families appears to be a channel that selectively interacts with dsDNA. 325
59406 293419 pfam16814 Read-through Read-through domain. The Enterobacteria phage minor coat protein A1 is a C-terminally extended version of the coat protein formed when ribosomes read-through a leaky stop codon. This is the C-terminal read-through domain of A1. 182
59407 407068 pfam16815 HRI1 Protein HRI1. This fungal protein interacts with Sec72 and Hrr25; its function is not yet known. 229
59408 407069 pfam16816 DotD DotD protein. The DotD protein is a component of the Dot/Icm type IVB secretion system. It is involved in the outer membrane targeting of DotH. 120
59409 407070 pfam16817 DUF5073 Domain of unknown function (DUF5073). This domain of unknown function is a membrane protein found in Mycobacterium. 121
59410 407071 pfam16818 SLM4 Protein SLM4. The fungal protein SLM4 (EGO3, GSE1) is a component of the GSE complex and the EGO (TOR) complex. The GSE complex is required for trafficking GAP1 out of the endosome. The EGO complex is involved in the regulation of autophagy and cell growth. SLM4 is required for the integrity and function of the EGO complex. 157
59411 407072 pfam16819 DUF5074 Domain of unknown function (DUF5074). This family of proteins from Bacteroidetes is found with a PKD domain at the N-terminus. Several members are annotated as putative quinoprotein alcohol dehydrogenase-like proteins but this could not be confirmed. 112
59412 407073 pfam16820 PKD_3 PKD-like domain. This PKD-like family is found in various Bacteroidetes species. 68
59413 374824 pfam16821 C_Hendra C protein from hendra and measles viruses. This is a family of C proteins from a number of Morbillivirus species. 153
59414 407074 pfam16822 ALGX SGNH hydrolase-like domain, acetyltransferase AlgX. ALGX is a family found in bacteria. The domain demonstrates catalytic activity similar to that of the SGNH hydrolase-like domain, with the typical Ser-His-Asp triad found in this enzyme. Alginate is an exopolysaccharide that contributes to biofilm formation. ALGX is secreted into the biofilm and is responsible for the acetylation of biofilm polymers that help protect them from host destruction. 266
59415 374826 pfam16823 PilZ_2 Atypical PilZ domain, cyclic di-GMP receptor. PilZ_2 is a family of cyclic di-GMP receptors found in Proteobacteria plant pathogens. PilZ_2 forms a tetramer that adopts a novel 'house-like' construct, with a central pillar domain of the four vertical alpha3 helices, a roof-top domain made up of the eight inclined alpha2 and alpha4 helices, and four corner-stone domains making up the PilZ domain. Cyclic-di-GMP is a universal secondary messenger molecule extensively involved in regulating bacterial pathogenicity, and its downstream receptor appears to be this PilZ domain. 136
59416 407075 pfam16824 CBM_26 C-terminal carbohydrate-binding module. CBM_26 is a family of bacterial carbohydrate-binding modules frequently found at the C-terminus of enzymes. The combination is not unusual as the CBMs function to bring the relevant polysaccharide into close proximity to the active site. 125
59417 407076 pfam16825 DUF5075 IGP family C-type lectin domain. This C-type lectin domain is present in the IGP 'invariant glycoprotein' family of proteins from Trypanosoma and Leishmania. 173
59418 407077 pfam16826 DUF5076 Domain of unknown function (DUF5076). 84
59419 374829 pfam16827 zf-HC3 zinc-finger. This is a family of putative zinc-fingers from Actinobacteriales. 67
59420 407078 pfam16828 GAGBD GAG-binding domain on surface antigen. GAGBD is a domain on the surface antigen of the swine pathogen Streptococcus suis and related species. This domain expresses three clusters of basic residues, largely lysines, that are critical for heparin-binding and cell adhesion during bacterium-host cell adhesion. The GAGBD domain binds to the host cell surface glycosaminoglycans or GAGs of the Streptococcus. 152
59421 293434 pfam16829 ATR13 Avirulence protein ATR13, RxLR effector. ATR13 is expressed by the plant pathogen oomycete Hyaloperonospora. Such phytopathogenic oomycetes like the one that infects Arabidopsis, Hyaloperonospora arabidopsidis (Hpa), grow intercellularly, forming parasitic structures called haustoria. Haustoria play a role in feeding and suppression of host defense systems. A whole range of pathogen proteins, called effectors, are secreted across this haustorial membrane, a subset of which are further translocated across the plant plasma membrane by an unknown mechanism that is present in both plants and animals. ATR13 is an RxLR effector from the downy mildew oomycete, and is a very dynamic protein. It contains two surface-exposed patches of polymorphism, one of which is involved in the specific recognition by host R-genes. The R-gene-products detect the presence of the infection by recognising the effector proteins. Once detected, the host R-genes trigger apoptosis of the host cell. The R-gene-products carry a specific motif, RxLR, that recognizes the effector proteins. 101
59422 407079 pfam16830 NBD94 Nucleotide-Binding Domain 94 of RH. NBD94 is a domain on one of the reticulocyte binding protein homolog family or RH proteins expressed by the malaria parasite merozoite. RH proteins recognize erythrocytes and are important in virulence. This domain has been shown to exhibit selective binding to ATP and ADP. Binding of ATP or ADP induces nucleotide-dependent structural changes in the C-terminal hinge-region of NBD94 that directly impact on the ability of the RH to bind to the red blood cells. 91
59423 318930 pfam16831 CssAB CS6 fimbrial subunits A and B, Coli surface antigen 6. CssAB is a family of CS6 pilins from E.coli, including both subunits A and B. It acts as a colonisation factor for the enterotoxigenic species of E.coli to mediate bacterial attachment to the small intestinal epithelium. Both subunits in the fibre bind to receptors on epithelial cells, and CssB, but not CssA, specifically recognizes the extracellular matrix protein fibronectin. 129
59424 318931 pfam16832 EKLF_TAD1 Erythroid krueppel-like transcription factor, transactivation 1. This family is the first part of the minimal transactivation domain of erythroid-specific transcription factor EKLF in craniates. EKLF plays an important role in red blood cell development; it is post-translationally modified by ubiquitin on several lysine residues, and its turnover in the cell is regulated by ubiquitin-mediated degradation. In the first 90 residues at the N-terminus EKLF carries a minimal transactivation or TAD domain that is highly acidic. This minimal TAD of EKLF can be further subdivided into two independent domains EKLF_TAD1 (residues 1-40) and EKLF_TAD2 (residues 51-90), pfam16833, that are both capable of independently activating transcription. TAD1 is able to form a non-covalent interaction with ubiquitin. Both TAD1 and TAD2 are highly acidic and carry a PEST (sequence rich in proline, glutamic acid, serine, and threonine) region. Deletion of either PEST domain significantly slows down degradation of EKLF by ubiquitin. The minimal TAD has an overlapping activation/degradation function that is critical for the role of EKLF in red blood cell development. 27
59425 407080 pfam16833 EKLF_TAD2 Erythroid krueppel-like transcription factor, transactivation 2. This family is the second part of the minimal transactivation domain of erythroid-specific transcription factor EKLF in craniates. EKLF plays an important role in red blood cell development; it is post-translationally modified by ubiquitin on several lysine residues, and its turnover in the cell is regulated by ubiquitin-mediated degradation. In the first 90 residues at the N-terminus EKLF carries a minimal transactivation or TAD domain that is highly acidic. This minimal TAD of EKLF can be further subdivided into two independent domains EKLF_TAD1 (residues 1-40), pfam16832, and EKLF_TAD2 (residues 51-90) that are both capable of independently activating transcription. Both TAD1 and TAD2 are highly acidic and carry a PEST (sequence rich in proline, glutamic acid, serine, and threonine) region. Deletion of either PEST domain significantly slows down degradation of EKLF by ubiquitin. The minimal TAD has an overlapping activation/degradation function that is critical for the role of EKLF in red blood cell development. 27
59426 407081 pfam16834 CSM2 Shu complex component Csm2, DNA-binding. CSM2 is one of the components of the yeast Shu complex that maintains genomic stability during replication. CSM2 complexes first with Psy3, and their L2 loops confer the DNA-binding activity to the Shu complex. The Shu complex binds to recombination sites and is required for Rad51 assembly and function during meiosis. The heterodimer of Psy3-Csm2 stabilizes the Rad51-single-stranded DNA complex independently of nucleotide cofactor because Psy3-Csm2 is a structural mimic of the Rad51-dimer. 203
59427 407082 pfam16835 SF3A2 Pre-mRNA-splicing factor SF3a complex subunit 2 (Prp11). SF3A2 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1), with Prp21 wrapping around Prp11. 92
59428 407083 pfam16836 PSY3 Shu complex component Psy3, DNA-binding description. PSY3 is one of the components of the yeast Shu complex that maintains genomic stability during replication. Psy3 complexes first with Csm2, and their L2 loops confer the DNA-binding activity to the Shu complex. The Shu complex binds to recombination sites and is required for Rad51 assembly and function during meiosis. The heterodimer of Psy3-Csm2 stabilizes the Rad51-single-stranded DNA complex independently of nucleotide cofactor because Psy3-Csm2 is a structural mimic of the Rad51-dimer. 216
59429 407084 pfam16837 SF3A3 Pre-mRNA-splicing factor SF3A3, of SF3a complex, Prp9. SF3A3 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1). Prp9 and Prp21 were not thought to interact with each other but the alpha1 helix of Prp9 does make important contacts with the SURP2 domain of Prp21, thus the two do interact via a bidentate-binding mode. Prp9 harbours a major binding site for stem-loop IIa of U2 snRNA. 77
59430 407085 pfam16838 Caud_tail_N Caudoviral major tail protein N-terminus. This is the N-terminal domain of the major tail protein, or knob protein, from Caudovirales. 120
59431 407086 pfam16839 Antimicrobial25 Nematode antimicrobial peptide. This family of antimicrobial peptides is found in nematodes. 54
59432 407087 pfam16840 ACTL7A_N Actin-like protein 7A N-terminus. The N-terminus of actin-like protein 7A is required for interaction with testin (TES). 65
59433 407088 pfam16841 CBM60 Ca-dependent carbohydrate-binding module xylan-binding. CBM60 is a family of xylan-binding modules found in conjunction with xylanase enzymes in many bacterial species that attack plant cell walls. Xylan is the major hemicellulose component of most plant cell walls, and is one of the most complex carbohydrates targeted by CBMs. CBM60 modules are evolutionarily related to CBM36 domains as both show circular permutation in the beta-barrel folds. CBM60 targets xylan but is also able to bind cellulose and galactan and thus contribute towards breakdown of the plant cell wall. Recognition of the ligand is conferred primarily through the polar interactions of O2 (oxygen) and O3 of a single sugar with a protein-bound calcium ion. 93
59434 407089 pfam16842 RRM_occluded Occluded RNA-recognition motif. This family is an unusual, usually C-terminal, RNA-recognition motif found in fungi. In yeast it is the fourth RRM domain on the essential splicing factor Prp24. Structurally, it has a non-canonical RRM fold in which the expected beta-alpha-beta-beta-alpha-beta RRM-fold is flanked by N- and C-terminal alpha-helices. These two additional flanking alpha-helices occlude the beta-sheet face. The electropositive surface thereby presented is an alternative RNA-binding surface that allows both binding and unwinding of the U6 small nuclear RNA's internal stem loop, at least in vitro. 79
59435 407090 pfam16843 Get5_bdg Binding domain to Get4 on Get5, Golgi to ER traffic protein. Get5_bdg is the binding domain at the N-terminus of Get5, or Golgi to ER traffic protein 5, in yeast, that binds to Get4. Together with Get3, this tripartite complex is involved in the insertion of tail-anchored proteins in the ER membrane. 53
59436 407091 pfam16844 DIMCO_N Dinitrogenase iron-molybdenum cofactor, N-terminal. DIMCO_N is the N-terminal domain of the gamma (Y) subunit of nitrogenase. An alternative name is NafY_N, for nitrogenase accessory factor Y N-terminal. This region is negatively charged and appears to be necessary for recognising and interacting with the apo state of dinitrogenase. The full-length NafY protein facilitates the transfer of iron-molybdenum cofactor, or FeMo-co, into apodinitrogenase by binding to both. The C-terminal region, family Nitro_FeMo-Co, pfam02579, is the part that binds to the cofactor, and the N-terminus binds to apodinitrogenase. Nitrogenase is the bacterial enzyme responsible for nitrogen fixation by catalyzing the reduction of nitrogen gas (N2) to ammonium in an ATP-dependent manner. It has two components, dinitrogenase and dinitrogenase reductase. 91
59437 407092 pfam16845 SQAPI Aspartic acid proteinase inhibitor. SQAPI, aspartic acid inhibitor first isolated from squash, inhibits a wide range of aspartic proteinases. This particular family of PAAPIs (proteinaceous aspartic acid inhibitors) seems to have evolved quite recently from an ancestral cystatin. Structurally it consists of a four-stranded anti-parallel beta-sheet gripping an alpha-helix in much the same manner that a hand grips a tennis racket. The unstructured N-terminus and the loop connecting beta-strands 1 and 2 are important for pepsin inhibition, but the loop connecting strands 3 and 4 is not. 83
59438 407093 pfam16846 Cep3 Centromere DNA-binding protein complex CBF3 subunit B. Cep3 is one of the major components of the CBF3. It dimerizes and in so doing forms a large central channel that is large enough to accommodate duplex B-form DNA. The dimerization region is followed by a linker to the zinc-finger domain at the C-terminus. The CBF3 complex is an essential core component of the budding yeast kinetochore and is required for the centromeric localization of all other kinetochore proteins. Cep3 is the only component with DNA-binding properties. 507
59439 293452 pfam16847 AvrPtoB_bdg Avirulence AvrPtoB, BAK1-binding domain. AvrPtoB_bdg is a binding region on a family of bacterial plant pathogenic proteins. Type III effector proteins are injected into plants by bacteria when they are under attack, eg Pseudomonas syringae when attacking tomato. AvrPtoB is one such effector that suppresses the plants' PAMP-triggered innate immunity. PAMPs are pathogen/microbe-associated molecular patterns that are detected as non-self by a host. AvrPtoB suppresses this response by binding to BAK1, a kinase that acts with several pattern recognition receptors to activate defense signalling. AvrPtoB_bdg is the region of AvrPtoB that binds to BAK1 thereby preventing its kinase activity after the perception of flagellin. 91
59440 407094 pfam16848 SoDot-IcmSS Substrate of the Dot/Icm secretion system, putative. This is a family of putative substrates of the Dot/Icm type IVA secretion system from Legionella species. 177
59441 293454 pfam16849 Glyco_transf_88 Glycosyltransferase family 88. This is a family of type A glycosyltransferases found in Legionella. It acts as a virulence factor by the glucosylation of EF1A (elongation factor 1A) thereby blocking protein synthesis in the host cell. 423
59442 374843 pfam16850 Inhibitor_I66 Peptidase inhibitor I66. This family of serine protease inhibitors has a beta-trefoil fold and inhibits trypsin and chymotrypsin. 146
59443 407095 pfam16851 Stomagen Stomagen. Stomagen (epidermal patterning factor-like protein 9) acts as a positive regulator of stomatal development. 50
59444 293457 pfam16852 HHV-1_VABD Herpes viral adaptor-to-host cellular mRNA binding domain. HHV-1_VABD is the short region of the Herpes simplex 1 virus' specific signature adaptor protein that binds to the cellular mRNA export factor such as mouse REF. 42
59445 407096 pfam16853 CDC13_N Cell division control protein 13 N-terminus. This domain is found at the N-terminus of fungal cell division control protein 13 (CDC13). It has an OB type fold. It is involved in dimerization of CDC13 and in interaction of CDC13 with the catalytic subunit of DNA polymerase alpha, Pol1. 208
59446 407097 pfam16854 VPS53_C Vacuolar protein sorting-associated protein 53 C-terminus. This is the C-terminal domain of fungal vacuolar protein sorting-associated protein 53. 203
59447 407098 pfam16855 Soc Small outer capsid protein. This protein attaches to and stabilizes the bacteriophage capsid. 74
59448 407099 pfam16856 CDC4_D Cell division control protein 4 dimerization domain. This is the dimerization domain (D domain) of fungal cell division control protein 4. 51
59449 374848 pfam16857 RNA_pol_inhib RNA polymerase inhibitor. This bacteriophage protein inhibits the bacterial host RNA polymerase by interacting with the RpoC subunit and inhibiting the formation of a promoter complex. 47
59450 407100 pfam16858 CNDH2_C Condensin II complex subunit CAP-H2 or CNDH2, C-term. CNDH2_C is the C-terminal domain of the H2 subunit of the condensin II complex, found in eukaryotes but not fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII are concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seem to be triggered by extensive phosphorylations, eg, entry of the complex, in Sch.pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in He-La cells thus promoting early stage chromosomal condensation by CII. 284
59451 407101 pfam16859 TetR_C_11 Bacterial transcriptional repressor C-terminal. This family of bacterial transcriptional repressors is characterized by the short approximately 50 amino acid stretch of residues constituting the helix-turn-helix DNA binding motif, around the YRFhY motif. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. Another target protein is BetI, an osmoprotectant which controls the choline-glycine betaine pathway in E.coli. 113
59452 407102 pfam16860 CX9C CHCH-CHCH-like Cx9C, IMS import disulfide relay-system,. CX9C is the first half of a twin Cx9C motif in eukaryotic proteins. The function of this motif is to import nuclear-encoded mitochondrial intermembrane-space-proteins into the IMS (intermembrane space), as these latter lack a mitochondrial targeting sequence. The Cx9C proteins have a disulfide-bonded alpha-hairpin conformation. Cx9C-containing proteins are thus putative substrates for the Mia40-dependent thiol-disulfide exchange mechanism that carries out an oxidative folding process resulting in the proteins being trapped in the IMS. 42
59453 407103 pfam16861 Carbam_trans_C Carbamoyltransferase C-terminus. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU a Rhizobium nodulation protein involved in the synthesis of nodulation factors has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains, this is the smaller, C-terminal, domain. 169
59454 407104 pfam16862 Glyco_hydro_79C Glycosyl hydrolase family 79 C-terminal beta domain. This domain is found at the C-terminus of glycosyl hydrolase family 79 proteins. Its function is not yet known. 103
59455 407105 pfam16863 NtCtMGAM_N N-terminal barrel of NtMGAM and CtMGAM, maltase-glucoamylase. NtCtMGAM_N is a beta-barrel-like structure just N-terminal to the catalytic domain of maltase-glucoamylase in eukaryotes. It contributes to the architecture of the substrate-binding site, by donating a loop that comes into close contact with two regions in the catalytic domain thereby creating the site. This family is frequently found at the N-terminus of Glycosyl hydrolase 31, pfam01055, to which it contributes as above. 111
59456 407106 pfam16864 dimerization2 dimerization domain. This domain, found in methyltransferases, functions as a dimerization domain. 87
59457 407107 pfam16865 GST_C_5 Glutathione S-transferase, C-terminal domain. Leishmania major and Trypanosoma cruzi glutathione-S-transferase (GST) has undergone gene duplication, diversification, and gene fusion leading to a four-domain enzyme which contains two repeats of a GST N-terminal domain followed by a GST C-terminal domain. 108
59458 407108 pfam16866 PHD_4 PHD-finger. 64
59459 407109 pfam16867 DMSP_lyase Dimethylsulfoniopropionate lyase. Breaks down dimethylsulfoniopropionate (DMSP) into acrylate and dimethyl sulfide. 163
59460 407110 pfam16868 NMT1_3 NMT1-like family. 289
59461 407111 pfam16869 CNDH2_M PF16858. CNDH2_M is the middle domain of the H2 subunit of the condensin II complex, found in eukaryotes but not fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII are concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seem to be triggered by extensive phosphorylations, eg, entry of the complex, in Sch.pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in He-La cells thus promoting early stage chromosomal condensation by CII. This region represents the disordered section of CNDH2 between the N- and the C-termini. 127
59462 407112 pfam16870 OxoGdeHyase_C 2-oxoglutarate dehydrogenase C-terminal. OxoGdeHyase_C is a family found immediately C-terminal to Transket_pyr, pfam02779. It is found at the C-terminus of 2-oxoglutarate dehydrogenase. 151
59463 407113 pfam16871 DUF5077 Domain of unknown function (DUF5077). This family is found at the N-terminal of DUF3472, pfam00958. 189
59464 407114 pfam16872 putAbiC Putative phage abortive infection protein. Several members are annotated as putative phage abortive infection proteins. 80
59465 374859 pfam16873 AbiGii_2 Putative abortive phage resistance protein AbiGii toxin. AbiGii is a family of putative type IV toxin-antitoxin system toxins. The AbiG abortive phage resistance protein affects lactococcal bacteriophages phiP335 and phiQ30 but not the other P335 phage species. AbiGii toxin appears to confer resistance to phages by a mechanism of abortive infection that acts by interfering with phage RNA synthesis. The cognate anti-toxin is found in pfam10899. 397
59466 407115 pfam16874 Glyco_hydro_36C Glycosyl hydrolase family 36 C-terminal domain. This domain is found at the C-terminus of many family 36 glycoside hydrolases. It has a beta-sandwich structure with a Greek key motif. 78
59467 407116 pfam16875 Glyco_hydro_36N Glycosyl hydrolase family 36 N-terminal domain. This domain is found at the N-terminus of many family 36 glycoside hydrolases. It has a beta-supersandwich fold. 256
59468 407117 pfam16876 Lipin_mid Lipin/Ned1/Smp2 multi-domain protein middle domain. This is a middle domain of lipins. Overall the enzyme acts as a magnesium-dependent phosphatidate phosphatase that catalyzes the conversion of phosphatidic acid to diacylglycerol during triglyceride, phosphatidylcholine and phosphatidylethanolamine biosynthesis. EC:3.1.3.4. 95
59469 407118 pfam16877 DUF5078 Domain of unknown function (DUF5078). This family of unknown function is found in Mycobacterium spp. 119
59470 407119 pfam16878 SIX1_SD Transcriptional regulator, SIX1, N-terminal SD domain. SIX1_SD is a family of eukaryotic proteins, and it is found N-terminal to the Homeobox domain. As a transcription factor it lacks intrinsic activation domains and thus needs to bind to the EYA family of co-factors in order to mediate transcriptional activation. It is the SD domain that is necessary for this protein-protein interaction, binding to the C-terminal region of EYA - Eyes absent homolog proteins. 110
59471 407120 pfam16879 Sin3a_C C-terminal domain of Sin3a protein. Sin3a_C is a domain family from eukaryotic species. It is found at the C-terminus of the co-repressor Sin3a, and downstream of family Sin3_corepress, pfam08295. 281
59472 407121 pfam16880 EHD_N N-terminal EH-domain containing protein. EHD_N is a short domain that lies at the very N-terminus of many dynamins and EF-hand domain-containing proteins. 33
59473 374865 pfam16881 LIAS_N N-terminal domain of lipoyl synthase of Radical_SAM family. LIAS_N is found as the N-terminal domain of the Radical_SAM family in the members that are lipoyl synthase enzymes, particularly the mitochondrial ones in metazoa but also those in bacteria. 97
59474 374866 pfam16882 DUF5079 Domain of unknown function (DUF5079). This protein is believed to be involved in the type VII secretion system. 241
59475 293488 pfam16883 DUF5080 Domain of unknown function (DUF5080). This protein is believed to be involved in the type VII secretion system. 204
59476 407122 pfam16884 ADH_N_2 N-terminal domain of oxidoreductase. N-terminal region of oxidoreductase and prostaglandin reductase and alcohol dehydrogenase. 108
59477 407123 pfam16885 CAC1F_C Voltage-gated calcium channel subunit alpha, C-term. CAC1F_C is the C-terminal region of voltage-gated calcium channel subunit alpha in higher eukaryotes. The exact function of this domain is not known. This region lies immediately downstream from the CDB motif, pfam08673. 348
59478 407124 pfam16886 ATP-synt_ab_Xtn ATPsynthase alpha/beta subunit N-term extension. ATP-synt_ab_Xtn is an extension of the alpha-beta catalytic subunit of VATA or V-type proton ATPase catalytic subunit at the N-terminal end. It is found from bacteria to humans, and was not modelled in family ATP-synt_ab, pfam00006. 120
59479 318977 pfam16887 DUF5081 Domain of unknown function (DUF5081). This protein is believed to be involved in the type VII secretion system. 230
59480 407125 pfam16888 DUF5082 Domain of unknown function (DUF5082). This protein is believed to be involved in the type VII secretion system. 122
59481 407126 pfam16889 Hepar_II_III_N Heparinase II/III N-terminus. This is the N-terminal domain of heparinase II/III proteins. It is a toroid-like domain. 344
59482 374868 pfam16890 DUF5083 Domain of unknown function (DUF5083). This protein is believed to be involved in the type VII secretion system. 157
59483 407127 pfam16891 STPPase_N Serine-threonine protein phosphatase N-terminal domain. This family is often found at the N-terminus of Metallophos family, in serine-threonine protein phosphatases. 48
59484 407128 pfam16892 CHS5_N Chitin biosynthesis protein CHS5 N-terminus. This domain is found at the N-terminus of fungal chitin biosynthesis protein CHS5. It functions as a dimerization domain. 48
59485 407129 pfam16893 fn3_2 Fibronectin type III domain. This fibronectin type III domain is found in fungal chitin biosynthesis protein CHS5 where, together with the neighboring BRCT domain (pfam00533), it binds to the Arf1 GTPase. 89
59486 318984 pfam16894 DUF5084 Domain of unknown function (DUF5084). This protein is believed to be involved in the type VII secretion system. 130
59487 374871 pfam16895 DUF5085 Domain of unknown function (DUF5085). This protein is believed to be involved in the type VII secretion system. 139
59488 407130 pfam16896 PGDH_C Phosphogluconate dehydrogenase (decarboxylating) C-term. PGDH_C is the C-terminal domain of putative bacterial phosphogluconate dehydrogenase proteins. 153
59489 407131 pfam16897 MMR_HSR1_Xtn C-terminal region of MMR_HSR1 domain. MMR_HSR1_Xtn is the C-terminal region of some members of the MMR_HSR1 family. 105
59490 407132 pfam16898 TOPRIM_C C-terminal associated domain of TOPRIM. TOPRIM_C is found as the C-terminal extension of the TOPRIM domain, pfam01751 in metazoa. 127
59491 407133 pfam16899 Cyclin_C_2 Cyclin C-terminal domain. Cyclins contain two domains of similar all-alpha fold, this family corresponds with the C-terminal domain of some cyclins including cyclin C and cyclin H. 98
59492 407134 pfam16900 REPA_OB_2 Replication protein A OB domain. Replication protein A contains two OB domains in its DNA binding region. This is the second of the OB domains. 98
59493 407135 pfam16901 DAO_C C-terminal domain of alpha-glycerophosphate oxidase. DAO_C is the C-terminal region of alpha-glycerophosphate oxidase. 126
59494 407136 pfam16902 Type2_restr_D3 Type-2 restriction enzyme D3 domain. This is the D3 domain of type-2 restriction enzyme. These enzymes contain an N-terminal recognition domain and a C-terminal catalytic domain. The recognition domain consists of the D1, D2 and D3 domains. 69
59495 407137 pfam16903 Capsid_N Major capsid protein N-terminus. This is the N-terminal domain of the major capsid protein in several dsDNA viruses. 196
59496 407138 pfam16904 PurL_C Phosphoribosylformylglycinamidine synthase II C-terminus. This is the C-terminal domain of phosphoribosylformylglycinamidine synthase II in Thermotoga and related species. 94
59497 407139 pfam16905 GPHH Voltage-dependent L-type calcium channel, IQ-associated. GPHH is a sequence motif found in this short domain on voltage-dependent L-type calcium channel proteins in eukaryotes. The domain is closely associated with the IQ-domain, pfam08763. 54
59498 407140 pfam16906 Ribosomal_L26 Ribosomal proteins L26 eukaryotic, L24P archaeal. Ribosomal_L26 is a family of the 50S and the 60S ribosomal proteins from eukaryotes - L26 - and archaea - L25. 114
59499 407141 pfam16907 Caskin-Pro-rich Proline rich region of Caskin proteins. This proline rich region is found in Caskin proteins. Caskins are CASK-binding synaptic scaffolding proteins. This region is predicted to be natively unstructured. Its function is not known. 91
59500 407142 pfam16908 VPS13 Vacuolar sorting-associated protein 13, N-terminal. VPS13 is a family of eukaryotic vacuolar sorting-associated 13 proteins that lies just downstream from Chorein_N family, pfam12624. The exact function of this domain is not known. 230
59501 407143 pfam16909 VPS13_C Vacuolar-sorting-associated 13 protein C-terminal. VPS13_C is a family of eukaryotic vacuolar sorting-associated 13 proteins that lies at the C-terminus of the members. The exact function of this domain is not known. 175
59502 407144 pfam16910 VPS13_mid_rpt Repeating coiled region of VPS13. This repeat is a family of repeating regions of eukaryotic vacuolar sorting-associated 13 proteins. This repeating region shares a common core element that includes a well-conserved P-x4-P-x13-17-G sequence. The exact function of this repeat is not known. 236
59503 407145 pfam16911 PapA_C Phthiocerol/phthiodiolone dimycocerosyl transferase C-terminus. 120
59504 407146 pfam16912 Glu_dehyd_C Glucose dehydrogenase C-terminus. 211
59505 293518 pfam16913 PUNUT Purine nucleobase transmembrane transport. PUNUT is a family of largely plant and fungal purine transporters. Most members are 10-pass transmembrane proteins, and they belong to the drug/metabolite transporter (dmt) superfamily. The plant vascular system transports nucleobases and their derivatives such as cytokinins and caffeine by a common H+-coupled high-affinity purine transport system; the PUNUT family members carry out this transport. 321
59506 374885 pfam16914 TetR_C_12 Bacterial transcriptional repressor C-terminal. This domain is found at the C-terminus of a small group of bacterial TetR transcriptional regulator proteins. 105
59507 319001 pfam16915 Eryth_link_C Annelid erythrocruorin linker subunit C-terminus. This domain is found in linker subunits of the erythrocruorin respiratory complex in annelid worms. 120
59508 407147 pfam16916 ZT_dimer dimerization domain of Zinc Transporter. ZT_dimer is the dimerization region of the whole molecule of zinc transporters since the full-length members form a homodimer during activity. The domain lies within the cytoplasm and exhibits an overall structural similarity with the copper metallochaperone Hah1 UniProtKB:O00244, exhibiting an open alpha-beta domain with two alpha helices (H1 and H2) aligned on one side and a three-stranded mixed beta-sheet (S1 to S3) on the other side. The N-terminal part of the members is the Cation_efflux family, pfam01545. 73
59509 407148 pfam16917 BPL_LplA_LipB_2 Biotin/lipoate A/B protein ligase family. 182
59510 407149 pfam16918 PknG_TPR Protein kinase G tetratricopeptide repeat. This domain is found at the C-terminus of protein kinase G and contains a tetratricopeptide repeat (TPR). 340
59511 407150 pfam16919 PknG_rubred Protein kinase G rubredoxin domain. This rubredoxin domain is found at the N-terminus of protein kinase G, and is essential for kinase activity. 139
59512 407151 pfam16920 TPKR_C2 Tyrosine-protein kinase receptor C2 Ig-like domain. In the tyrosine-protein kinase receptor NTRK1 this domain interacts with beta-nerve growth factor NGF. 45
59513 407152 pfam16921 Tex_YqgF Tex protein YqgF-like domain. This is the YqgF-like domain of the bacterial Tex protein, which is involved in transcriptional processes. 125
59514 407153 pfam16922 SLD5_C DNA replication complex GINS protein SLD5 C-terminus. The C-terminal domain of DNA replication complex GINS protein SLD5 is important in the assembly of the GINS complex, a complex which is involved in initiation of DNA replication and progression of DNA replication forks. 57
59515 407154 pfam16923 Glyco_hydro_63N Glycosyl hydrolase family 63 N-terminal domain. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the N-terminal beta sandwich domain. 221
59516 407155 pfam16924 DpaA_N Dipicolinate synthase subunit A N-terminal domain. 115
59517 407156 pfam16925 TetR_C_13 Bacterial transcriptional repressor C-terminal. 113
59518 293531 pfam16926 HisKA_4TM Archaeal 4TM region of histidine kinase. This N-terminal region of histidine-kinases consists of 4xTMs and is found in Archaea. 164
59519 407157 pfam16927 HisKA_7TM N-terminal 7TM region of histidine kinase. HisKA_7TM is an N-terminal region consisting of seven transmembrane domains found in Archaea and some bacteria. It is always found associated with histidine kinase. 221
59520 293533 pfam16928 Inj_translocase DNA/protein translocase of phage P22 injectosome. Inj_translocase is a family of putative phage translocases that are involved in the injectosome mechanism. Phage P22 of Salmonella typhimurium carries four proteins, gp7, gp16, gp20 and gp26, which are ejected from the phage virion into the bacterial cell after adsorption. These four proteins may play a role in DNA ejection. 217
59521 407158 pfam16929 Asp2 Accessory Sec system GspB-transporter. Asp2 is a family of the SecA2/Y2 accessory Sec secretory system of Gram-positive bacteria. It is specific for large serine-rich repeat, cell-wall-anchored, glycoproteins such as GspB. Export of GspB requires the three Asp1-Asp3 proteins. Asp2, in conjunction with Asp3, probably acts as a chaperone in the early stage of GspB transport. 505
59522 374891 pfam16930 Porin_5 Putative porin. 535
59523 407159 pfam16931 Phage_holin_8 Putative phage holin. 122
59524 407160 pfam16932 T4SS_TraI Type IV secretory system, conjugal DNA-protein transfer. T4SS_TraI is a family of putative Gram-negative, largely Proteobacterial, type IV conjugal DNA-Protein transfer or VirB secretory pathway (IVSP) proteins. 211
59525 407161 pfam16933 PelG Putative exopolysaccharide Exporter (EPS-E). PelG is a family of putative exopolysaccharide transporters like PelG. Most members carry twelve transmembrane regions. The family also contains fusion proteins with glycosyl transferase group 1, which are putative flippase transporters. 451
59526 293539 pfam16934 Mersacidin Two-component Enterococcus faecalis cytolysin (EFC). Mersacidin is a cytolysin, a lantibiotic produced by Gram-positive bacteria. The cytolysin is a 'pseudohaemolysin' which produces haemolysis on blood agar plates, but not in broth culture. Mersacidin is one of the type B lantibiotics (lanthionine-containing antibiotics) that contain post-translationally modified amino acids and cyclic ring structures. Mersacidin attacks the cell wall precursor lipid II, thereby inhibiting cell-wall synthesis. 68
59527 407162 pfam16935 Hol_Tox Putative Holin-like Toxin (Hol-Tox). Hol_Tox is a family of small proteins (34-48aas) with a single TM region. Members can exhibit antibacterial activity against Gram-positive bacteria but not against Gram-negative bacteria. 60
59528 407163 pfam16936 Holin_9 Putative holin. This is a family of putative holins from Actinobacteria with three TM regions. 78
59529 407164 pfam16937 T3SS_HrpK1 Type III secretion system translocator protein, HrpF. T3SS_HrpK1 is a family of putative Type III secretion system pore-forming bacterial proteins. These allow transfer of pathogenic material from bacterial cytoplasm into the plant host cytoplasm. 256
59530 407165 pfam16938 Phage_holin_Dp1 Putative phage holin Dp-1. Phage_holin_Dp1 is a family of putative phage-holins from Gram-positive bacteria, largely Firmicutes, with two probable TMSs. The family shows lytic activity. 62
59531 293544 pfam16939 Porin_6 Putative porin. Porin_6 is a family of putative porins from Leptospira species. 282
59532 374895 pfam16940 Tic110 Chloroplast envelope transporter. Tic110 is a family of chloroplast envelope proteins. Some are involved in protein translocation and others are neurotransmitter receptor, cys loop, ligand-gated ion channel or LIC proteins. 573
59533 407166 pfam16941 CymA Putative cyclodextrin porin. 341
59534 293547 pfam16942 CclA_1 Putative cyclic bacteriocin. This is a family of short proteins from Gram-positive bacteria, putatively from the carnocyclin A family of bacteriocins. 103
59535 374896 pfam16943 T4SS_CagC Cag pathogenicity island, type IV secretory system. T4SS_CagC is a family of putative pathogenicity island, type IV, conjugal DNA-protein transfer, secretory system proteins from Gram-negative bacteria. 119
59536 407167 pfam16944 KCH Fungal potassium channel. KCH is a family of fungal proteins carrying three transmembrane domains. It is a member of the fungal potassium channel family of transporters, and includes a pair of homologous sequences that localize to distinct zones of the yeast plasma membrane and are induced during the response to mating pheromones. Together KCH1 and KCH2 promote low-affinity K+ uptake and are essential for K+-dependent activation of HACS - a high-affinity Ca2+ influx system that activates calcineurin and is essential for cell survival - in S. cerevisiae cells responding to mating pheromones. 251
59537 407168 pfam16945 Phage_r1t_holin Putative Lactococcus lactis phage r1t holin. Phage_r1t_holin is a family of putative phage r1t holins from Lactococcus. These holins carry two hydrophobic putative TMs separated by a short beta-turn region. 70
59538 319021 pfam16946 Porin_OmpG_1_2 OMPG-porin 1 family. Porin_OmpG_1_2 is a family of putative porins of the OmpG-type. These are channels without solute specificity. 294
59539 407169 pfam16947 Ferredoxin_N N-terminal region of 4Fe-4S ferredoxin iron-sulfur binding. Ferredoxin_N is a short domain that is often found at the N-terminus of 4Fe-4S ferredoxin iron-sulfur binding domain proteins from Archaea and a few bacteria. 65
59540 374899 pfam16949 ABC_tran_2 Putative ATP-binding cassette. This is a family of putative two component ABC exporters. This is the membrane protein of approximately 573 residues and twelve transmembrane domains. It is encoded adjacent to an ATPase. 542
59541 293556 pfam16951 MaAIMP_sms Putative methionine and alanine importer, small subunit. MaAIMP_sms is a family of hypothetical proteins from Proteobacteria that are purported to be small subunits of a methionine and alanine importer. 60
59542 407170 pfam16952 Gln-synt_N_2 Glutamine synthetase N-terminal domain. 112
59543 407171 pfam16953 PRORP Protein-only RNase P. PRORPs (protein-only RNase P) are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. Arabidopsis thaliana contains PRORP enzymes (PRORP1, PRORP2 and PRORP3) where PRORP1 localizes to mitochondria as well as chloroplasts, while PRORP2 and PRORP3 are found in the nucleus. In humans and most other metazoans, mt-RNase P is composed of three protein subunits (mitochondrial RNase P proteins 1-3; MRPP1-3), homologs to the Arabidopsis thaliana PRORP1-3. This domain corresponds to the metallonuclease domain of PRORPs. PRORP1 has 22% sequence identity to the human homolog MRPP3. PRORP1 crystal structure shows a V-shaped tripartite structure with a C-terminal metallonuclease domain of the NYN (N4BL1, YacP-like nuclease) family, with a typical and functional two-metal-ion catalytic site that has conserved aspartate residues. 241
59544 407172 pfam16954 HRG Haem-transporter, endosomal/lysosomal, haem-responsive gene. HRG1 is a family of conserved, membrane-bound permeases that reside in distinct intracellular compartments and bind and transport haem in metazoa. These proteins carry four transmembrane domains, 4xTMs, modelled here in two pairs, the two N-terminal and the two more C-terminal. 51
59545 374902 pfam16955 OFeT_1 Ferrous iron uptake permease, iron-lead transporter. OFeT_1 is a family of conserved archaeal membrane proteins that are putative oxidase-dependent Fe2+ transporters. 206
59546 339867 pfam16956 Porin_7 Putative general bacterial porin. 274
59547 407173 pfam16957 Mal_decarbox_Al Malonate decarboxylase, alpha subunit, transporter. Mal_decarbox_Al is a family of Na+-transporting carboxylic acid decarboxylases. 547
59548 407174 pfam16958 PRP9_N Pre-mRNA-splicing factor PRP9 N-terminus. This is the N-terminal domain of pre-mRNA-splicing factor PRP9. 149
59549 407175 pfam16959 Collectrin Renal amino acid transporter. Collectrin is a single-pass transmembrane protein that is homologous to the C-terminal region of human angiotensin-converting enzyme 2, ACE2, found in Peptidase_M2 pfam01401. Collectrin is critical for normal amino acid reabsorption in the kidney. 154
59550 319033 pfam16960 HpuA Haemoglobin-haptoglobin utilisation, porphyrin transporter. HpuA is a family of Neisseria spp proteins from the hpuAB operon, which are putative porphyrin transporters. 313
59551 374905 pfam16961 OmpA_like Putative OmpA-OmpF-like porin family. This is a family of putative OmpA-OmpF-like porins from Bacteroidetes. 197
59552 407176 pfam16962 ABC_export Putative ABC exporter. This is a family of putative ABC_exporters from Firmicutes. 533
59553 407177 pfam16963 PelD_GGDEF PelD GGDEF domain. This degenerate GGDEF domain is found at the C-terminus of PelD, a membrane-bound c-di-GMP-specific receptor. It contains an RXXD motif resembling the allosteric inhibition site found in diguanylate cyclases. In PelD this RXXD motif binds to dimeric c-di-GMP. 123
59554 407178 pfam16964 TadF Putative tight adherence pilin protein F. TadF is a family of proteins from the tad locus that is part of the type IV bacterial secretory system. 176
59555 374907 pfam16965 CSG2 Ceramide synthase regulator. CSG2 is an integral membrane protein with up to 10 transmembrane segments that, when over-expressed, localizes to the endoplasmic reticulum. CSG2 is a family of fungal transmembrane proteins that regulate mannosyl phosphorylinositol ceramide synthase and are thereby implicated in calcium homoeostasis in the cell. 396
59556 407179 pfam16966 Porin_8 Porin-like glycoporin RafY. This is a family of Gram-negative Gammaproteobacteria putative raffinose-like porins. 363
59557 407180 pfam16967 TcfC E-set like domain. TcfC is a family of bacterial fimbrial proteins. These sit in the outer bacterial membrane surrounding the RcpA proteins of the fimbrial shaft. This family is from Gamma-proteobacteria. This domain represents an immunoglobulin like E-set domain. 68
59558 407181 pfam16968 TadZ_N Pilus assembly protein TadZ N-terminal. TadZ_N is the N-terminal region of the Flp pilus assembly protein TadZ, which carries an AAA, ATPase domain immediately downstream, AAA_31, pfam13614. The domain is an example of a signal-transduction-response receiver. It is localized to the cytoplasmic side of the inner bacterial cell-membrane, contacting also with both tadA and RcpC. 129
59559 407182 pfam16969 SRP68 RNA-binding signal recognition particle 68. SRP68 is a family that is part of the SRP or signal recognition particle complex. This complex, consisting of six proteins and a 7SL-RNA is necessary for guiding the emerging proteins designed for the membrane towards the translocation pore. SRP68 forms a stable heterodimer with SRP72, a protein with a TPR repeat. Specific RNA-binding of SRP68 is mediated by the N-terminal domain of approximately 200 residues of this family. 561
59560 339870 pfam16970 FimA Type-1 fimbrial protein, A. FimA is a family of Gram-negative fimbrial component A proteins that form part of the pili. There are usually up to 1000 copies of this subunit in one pilus that form a helically wound rod onto which the tip fibrillum (FimF.FimG, FimH) is attached. Pilus subunits are translocated from the cytoplasm to the periplasm via the general secretory pathway SecYEG. 145
59561 407183 pfam16971 RcpB Rough colony protein B, tight adherence - tad - subunit. RcpB is part of the Tad operon of proteins. The Tad (tight adherence) macromolecular transport system, present in many bacterial and archaeal species, represents an ancient and major new subtype of type II secretion. The three Rcp proteins (RcpA, RcpB, and RcpC) and TadD, a putative lipoprotein, are localized to the bacterial outer membrane. 168
59562 374913 pfam16972 TipE Na+ channel auxiliary subunit TipE. TipE appears to be a family of insect Na+ channel auxiliary subunit proteins. 486
59563 407184 pfam16973 FliN_N Flagellar motor switch protein FliN N-terminal. FliN is one of three proteins that form a switch-complex at the base of the basal body of the flagellum; the switch regulates the flagellum-motor. 50
59564 374915 pfam16974 NAR2 High-affinity nitrate transporter accessory. NAR2 is a family of plant proteins with a C-terminal transmembrane region that is an essential accessory for high-affinity nitrate uptake. This family works together with NRT2, a 12xTM family of proteins that is part of family MFS_1, pfam07690. NAR2 is also involved in the repression of lateral root initiation in response to high ratios of sucrose to nitrogen in the medium. Therefore the two component-system of NAR2 and NRT2 itself is likely to be involved in the signalling pathway that integrates nutritional cues for the regulation of lateral root architecture. The functional unit of the high-affinity nitrate influx complex is likely to be a tetramer, in Arabidopsis, made up of two subunits each of NRT2.1 and NAR2.1. 173
59565 407185 pfam16975 UPAR_LY6_2 Ly6/PLAUR domain-containing protein 6, Lypd6. UPAR_LY6_2 is a family of higher eukaryotic proteins expressed in neurons. It modulates nicotinic acetylcholine receptors by selectively increasing Ca2+-influx through this ion channel. The family carries an LU protein domain - about 80 amino acids long characterized by a conserved pattern of 10 cysteine residues. The family is a positive feedback regulator of Wnt/beta-catenin signalling, eg for patterning of the mesoderm and neuroectoderm in zebrafish gastrulation, where Lypd6 is GPI-anchored to the plasma-membrane and interacts with the Wnt receptor Frizzled8 and the co-receptor Lrp6. 105
59566 407186 pfam16976 RcpC Flp pilus assembly protein RcpC/CpaB. RcpC is a family of Gram-negative proteins expressed from the tight-adherence tad locus. RcpC is an auxiliary protein that sits in the inner membrane and interacts with TadB and TadZ, an AAA ATPase. A recent study has identified two tandem beta-clip domains in RcpC95. beta-Clip domains are known to interact with carbohydrate moieties in other systems, such as SAF. 115
59567 407187 pfam16977 ApeC C-terminal domain of apextrin. ApeC domain was first identified from two apextrin-like proteins (ALP) of the amphioxus Branchiostoma japonicum. Our functional studies show that amphioxus ALP1 and ALP2 are important anti-bacterial effectors, and that the apeC domain of the ALP1/2 mediates the bacterial recognition by binding to bacterial muramyl dipeptide (MDP). Further analysis shows that the apeC domain is present in various proteins from cnidarians, molluscs, arthropods, hemichordates, echinoderms and amphioxus. The apeC domain is also found to form different domain combinations with other domains (in press). 205
59568 407188 pfam16978 CRIM SAPK-interacting protein 1 (Sin1), middle CRIM domain. CRIM is a domain in the middle of Sin1 that is important in the substrate recognition of TORC2. It is conserved from yeast to humans. TOR is a serine/threonine-specific protein kinase and forms functionally distinct protein complexes referred to as TORC1 and TORC2. 137
59569 407189 pfam16979 SIN1_PH SAPK-interacting protein 1 (Sin1), Pleckstrin-homology. SIN1_PH is a pleckstrin-homology domain found at the C-terminus of SIN1. It is conserved from yeast to humans. PH-domains are involved in intracellular signalling or as constituents of the cytoskeleton. SIN1 (SAPK-interacting protein 1) plays an essential role in signal transduction, and the PH domain is involved in lipid and membrane binding. 104
59570 407190 pfam16980 CitMHS_2 Putative citrate transport. CitMHS is a family of putative citrate transporters, belonging to the Na+/H+ antiporter NhaD-like permease superfamily. 440
59571 293586 pfam16981 Chi-conotoxin chi-Conotoxin or t superfamily. Chi-conotoxin is a family of Cone snail venom chi-conopeptide class bioactive peptides based. These conopeptides show a unique ability, highly selectively and non-competitively, to inhibit the noradrenaline transporter. They show an unusual cysteine-stabilized scaffold that presents a gamma-turn in an optimized conformation for high affinity interactions with the noradrenaline transporter. 60
59572 407191 pfam16982 Flp1_like Putative Flagellin, Flp1-like, domain. 48
59573 407192 pfam16983 MFS_MOT1 Molybdate transporter of MFS superfamily. MFS_MOT1 is a family of molybdenate transporters. Molybdenum is an essential element that is taken up into the cell in the oxyanion molybdate. Molybdenum is used in the form of molybdopterin-cofactor, which participates in the active site of enzymes involved in key reactions of carbon, nitrogen, and sulfur metabolism. 111
59574 319052 pfam16984 Grp7_allergen Group 7 allergen. 180
59575 407193 pfam16985 DUF5086 Domain of unknown function (DUF5086). 118
59576 374920 pfam16986 CzcE Heavy-metal resistance protein CzcE. CzcE is involved in heavy-metal resistance. It binds copper, which induces a conformational change. 80
59577 407194 pfam16987 KIX_2 KIX domain. This KIX domain is an activator-binding domain. 83
59578 407195 pfam16988 Vps36-NZF-N Vacuolar protein sorting 36 NZF-N zinc-finger domain. The vacuolar protein sorting 36 NZF-N zinc-finger domain interacts with the C-terminus of vacuolar protein sorting 28. 65
59579 407196 pfam16989 T6SS_VasJ Type VI secretion, EvfE, EvfF, ImpA, BimE, VC_A0119, VasJ. T6SS_VasJ is a family from Gram-negative bacteria that forms a component of the type VI pathogenic secretion system. In the case of the Escherichia coli RS218 strain UniProtKB:G8IRL4, EvfF, it represents expression of the full-length gene; whereas it is just the C-terminal part of EvfE, UniProtKB:G8IRL3. The N-terminal part of these sequences is in family ImpA_N, pfam06812. 254
59580 293595 pfam16990 CBM_35 Carbohydrate binding module (family 35). This is a mannan-specific carbohydrate binding domain, previously known as the X4 module. Unlike other carbohydrate binding modules, binding to substrate causes a conformational change. 119
59581 407197 pfam16991 SIR4_SID Sir4 SID domain. This is the Sir2 interaction domain (SID domain) of silent information regulator 4 (Sir4). 159
59582 379914 pfam16992 RNA_pol_RpbG DNA-directed RNA polymerase, subunit G. RNA_pol_RpbG is a family of archaeal and fungal subunit G of DNA-directed RNA polymerase. 119
59583 407198 pfam16993 Asp1 Accessory Sec system protein Asp1. Asp1, along with SecY2, SecA2, and other proteins forms part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. Asp1 is predicted to be cytosolic. 522
59584 407199 pfam16994 Glyco_trans_4_5 Glycosyl-transferase family 4. 167
59585 407200 pfam16995 tRNA-synt_2_TM Transmembrane region of lysyl-tRNA synthetase. tRNA-synt_2_TM is a family from the N-terminal region of tRNA-synthase-2, with 6xTMs. The presence of this region indicates that the protein is anchored in the membrane. The family is found in Actinobacteria. 215
59586 407201 pfam16996 Asp4 Accessory secretory protein Sec Asp4. Asp4 and Asp5 are putative accessory components of the SecY2 channel of the SecA2-SecY2 mediated export system, but they are not present in all SecA2-SecY2 systems. This family of Asp4 is found in Firmicutes. 55
59587 407202 pfam16997 Wap1 Wap1 domain. The Wap1 domain is found at the C-terminus of fungal Wpl1 proteins (also known as Rad61). These proteins are members of the cohesin complex. The Wap1 domain binds to the ATPase domain of Smc3. 373
59588 339877 pfam16998 17kDa_Anti_2 17 kDa outer membrane surface antigen. 17kDa_Anti_2 is a surface protein that is found in several Proteobacteria species. 111
59589 339878 pfam16999 V-ATPase_G_2 Vacuolar (H+)-ATPase G subunit. This family represents vacuolar (H+)-ATPase G subunit from several bacterial and archaeal species. Subunit G is a component of the peripheral stalk of the ATPase complex. 104
59590 407203 pfam17000 Asp5 Accessory secretory protein Sec, Asp5. Asp4 and Asp5 are putative accessory components of the SecY2 channel of the SecA2-SecY2 mediated export system, but they are not present in all SecA2-SecY2 systems. This family of Asp5 is found in Firmicutes. 71
59591 407204 pfam17001 T3SS_basalb_I Type III secretion basal body protein I, YscI, HrpB, PscI. T3SS_basalb_I represents a family of Gram-negative type III secretion basal body proteins I. It is the inner rod protein of the secreted needle. YscI is suggested to form a rod that allows substrate passage across the inner membrane of the needle protein YscF through it. 94
59592 374928 pfam17002 DUF5089 Domain of unknown function (DUF5089). This is a family of microsporidial-specific proteins of unknown function. There is distant homology to synaptosomal-associated 25 family proteins. 193
59593 374929 pfam17003 Actin_micro Putative actin-like family. This is a family of microsporidial-specific proteins of unknown function. There is distant homology to the Actin family. 350
59594 374930 pfam17004 SRP_TPR_like Putative TPR-like repeat. This is a family of microsporidial sequences that are likely to fold into a TPR-like structure. Many sequences are annotated as being signal recognition proteins. 109
59595 407205 pfam17005 WD40_like WD40-like domain. This is a family of proteins which have weak homology to the WD40 repeat family. Members are largely from microsporidia and related species. 301
59596 374931 pfam17006 DUF5087 Domain of unknown function (DUF5087). This is a family of microsporidial sequences of unknown function. 292
59597 374932 pfam17007 HTH_micro HTH-like. This is a family of microsporidial sequences whose function is not known. It is possible that the proteins are DNA-binding as there is distant homology to helix-turn-helix families at the N-terminus. 454
59598 374933 pfam17008 DUF5088 Domain of unknown function (DUF5088). This is a family of microsporidial sequences of unknown function. 184
59599 374934 pfam17009 DUF5090 Domain of unknown function (DUF5090). This is a microsporidial-specific family of proteins of unknown function. The family is likely to be of four transmembrane domains. 187
59600 374935 pfam17010 DUF5092 Domain of unknown function (DUF5092). This is a family of microsporidial-specific sequences of unknown function. There is one transmembrane domain towards the C-terminus. 145
59601 374936 pfam17011 DUF5093 Domain of unknown function (DUF5093). This is a family of microsporidial sequences that may be distantly related to RRP7, pfam12923, ribosomal-RNA-processing protein 7. 131
59602 374937 pfam17012 DUF5091 Domain of unknown function (DUF5091). This is a family of microsporidial-specific sequences of unknown function. 147
59603 374938 pfam17013 Acetyltransf_15 Putative acetyl-transferase. This is a family of microsporidial proteins which may be distantly related to acetyl-transferase. 210
59604 374939 pfam17014 Mad3_BUB1_I_2 Putative Mad3/BUB1 like region 1 protein. This family of microsporidial sequences may be related to the Mad3_BUB1_I family pfam08311. 128
59605 374940 pfam17015 DUF5094 Domain of unknown function (DUF5094). This family of largely microsporidial-specific proteins is of unknown function. However there may be distant homology to family Csm1, pfam12539. 178
59606 374941 pfam17016 DUF5095 Domain of unknown function (DUF5095). This is a family of microsporidial-specific sequences. The function is not known and there is no distant homology to any Pfam families so far. 229
59607 374942 pfam17017 zf-C2H2_aberr Aberrant zinc-finger. This is a family of largely microsporidia-specific proteins with an aberrant zinc-finger motif of Cx(4)C2H repeated. 165
59608 374943 pfam17018 MICSWaP Spore wall protein. This is a family of microsporidial spore-wall proteins. 193
59609 374944 pfam17019 DUF5096 Domain of unknown function (DUF5096). This is a family of microsporidial sequences of unknown function. There is a well conserved Asp residue towards the C-terminus which may be functional. 192
59610 407206 pfam17020 DUF5097 Domain of unknown function (DUF5097). This is a family of microsporidia-specific proteins of unknown function. There is the possibility of very distant homology to the WAC domain. 119
59611 374945 pfam17021 Mei5_like Putative double-strand recombination repair-like. This is a family of microsporidia-specific sequences with homology to the double-strand recombination repair protein family, Mei5 pfam10376. 118
59612 374946 pfam17022 PTP2 Polar tube protein 2 from Microsporidia. PTP2 is a family of microsporidial polar-tube protein 2 sequences. Humans can be infected with the unicellular eukaryote Microsporidia which are obligate intracellular parasites that produce resistant spores. To initiate entry into a new host cell a unique motile process is formed by a sudden extrusion of the polar tube protein from the spore. There are a series of conserved cysteine residues. 216
59613 407207 pfam17023 DUF5098 Domain of unknown function (DUF5098). This is a family of microsporidia-specific sequences with no known function. There is a very characteristic NPW sequence motif at the very C-terminus. 461
59614 374947 pfam17024 DMAP1_like Putative DMAP1-like. This is a family of microsporidia-specific sequences that may have distant homology to the family DMAP1, pfam05499. 113
59615 374948 pfam17025 DUF5099 Domain of unknown function (DUF5099). This is a family of microsporidia-specific sequences of unknown function. 109
59616 374949 pfam17026 zf-RRPl_C4 Putative ribonucleoprotein zinc-finger of C4 type. This is a family of largely microsporidia-specific proteins. One member is annotated as being a ribonucleoprotein. The family carries two pairs of CxxC residues suggesting that there is DNA-binding. 108
59617 374950 pfam17027 Bromo_TP_like Histone-fold protein. This is a family of microsporidia-specific sequences that have distant homology to the Bromo_TP family, pfam07524. 119
59618 407208 pfam17028 8TM_micro 8TM Microsporidial transmembrane domain. This is a family of largely microsporidial-specific proteins that carry eight transmembrane regions, in two blocks of four. Such an arrangement of TMs suggests a transporter function of some kind. There is a highly conserved NFLNW sequence-motif at the C-terminus which might be of functional importance. 259
59619 374951 pfam17029 DUF5100 Domain of unknown function (DUF5100). This is a family of microsporidia-specific sequences of unknown function. 126
59620 374952 pfam17030 Beta_lactamase3 Putative beta-lactamase-like family. This is a family derived from microsporidia-specific proteins. There is homology to the beta-lactamase domain. 213
59621 374953 pfam17031 DUF5101 Domain of unknown function (DUF5101). This is a family of short microsporidia-specific proteins of unknown function. 99
59622 407209 pfam17032 zinc_ribbon_15 zinc-ribbon family. This zinc-ribbon region is found on a set of largely microsporidia-specific proteins. 73
59623 407210 pfam17033 Peptidase_M99 Carboxypeptidase controlling helical cell shape catalytic. This is the peptidase domain of a D,L-carboxypeptidase. The active site residues are Arg86, Glu222 and the metal ligands, in the peptidase domain, are Gln46, Glu49 and His128 in UniProtKB:O25708. The protein binds many zinc ions and a calcium ion and there are other metal binding sites. The catalytic activity is the release of m-Dpm from the peptide muramyl-Ala-gamma-D-Glu-m-Dpm; this is probably the precursor of the cell wall cross-linking peptide. 227
59624 374955 pfam17034 zinc_ribbon_16 Zinc-ribbon like family. This family is found at the C-terminus of WD40 repeat structures in eukaryotes. 125
59625 407211 pfam17035 BET Bromodomain extra-terminal - transcription regulation. The BET, or bromodomain extra-terminal domain, is found on bromodomain proteins that play key roles in development, cancer progression and virus-host pathogenesis. It interacts with NSD3, JMJD6, CHD4, GLTSCR1, and ATAD5 all of which are shown to impart a pTEFb-independent transcriptional activation function on the bromodomain proteins. 64
59626 407212 pfam17036 CBP_BcsS Cellulose biosynthesis protein BcsS. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 145
59627 374957 pfam17037 CBP_BcsO Cellulose biosynthesis protein BcsO. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 208
59628 407213 pfam17038 CBP_BcsN Cellulose biosynthesis protein BcsN. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 186
59629 407214 pfam17039 Glyco_tran_10_N Fucosyltransferase, N-terminal. This is the N-terminal domain of a family of fucosyltransferases. This enzyme transfers fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. The N-terminal domain is the likely binding-region for the fucose-like substrate (manuscript in publication). 109
59630 407215 pfam17040 CBP_CCPA Cellulose-complementing protein A. This is a family of bacterial cellulose-complementing protein A proteins necessary for cellulose biosynthesis. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). 121
59631 407216 pfam17041 SasG_E E domain. This short domain is about 50 amino acids in length. Its structure shows that it is composed of two beta sheets each of three strands. This domain is found associated with the pfam07501 domain and it has structural similarity with that domain although it is somewhat shorter. The E domain forms part of a rod like structure. 48
59632 407217 pfam17042 DUF1357_C Putative nucleotide-binding of sugar-metabolising enzyme. This conserved region is found in proteins of unknown function in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini interacting to produce a binding-interface in which a threonate-ADP complex is bound, suggesting that a sugar binding site is on the N-terminal domain, pfam07005, and a nucleotide binding site is in the C-terminal domain here (manuscript in preparation). 163
59633 407218 pfam17043 MAT1-1-2 Mating type protein 1-1-2 of unknown function. MAT1-1-2 is a family of proteins present in Sordariomycetes. They are encoded by the MAT1-1-2 gene which is present in the mating types of Sordariomycetes. The most famous representative of this family is Neurospora crassa. MAT1-1-2 is the generic nomenclature of all mating-type genes encoding proteins with HPG (also termed PPF) domain. This gene and its domain were first identified in Podospora anserina (its name in this species is SMR1) and Neurospora crassa (its name in this species is mat A-2) by Debuchy et al (1993). HPG was the first name proposed for the domain found in MAT1-1-2 proteins, based on the most conserved residues (histidine, proline and glycine). PPF was a second denomination proposed by Kanematsu et al (2007) for the same domain but these authors identified different conserved residues (proline, proline and phenylalanine). The function of this domain is not yet known. 147
59634 319104 pfam17044 BPTA Borrelial persistence in ticks protein A. BPTA is a family of proteins that are found in Borrelia species. The function is not known. 196
59635 407219 pfam17045 CEP63 Centrosomal protein of 63 kDa. CEP63 is a family of eukaryotic proteins involved in centriole activity. 268
59636 407220 pfam17046 Ses_B SesB domain on fungal death-pathway protein. SesB is a short conserved domain found on fungal proteins that are part of the cell death or heterokaryon incompatibility pathway. 22
59637 293652 pfam17047 SMP_LBD Synaptotagmin-like mitochondrial-lipid-binding domain. SMP is a proposed lipid-binding module, ie a synaptotagmin-like mitochondrial-lipid-binding domain found in eukaryotes. The SMP domain has a beta-barrel structure like protein modules in the tubular-lipid-binding (TULIP) superfamily. It dimerizes to form an approximately 90-Angstrom-long cylinder traversed by a channel lined entirely with hydrophobic residues. The following two C2 domains then form arched structures flexibly linked to the SMP domain. The SMP domain is a lipid-binding domain that links the ER with other lipid bilayer-membranes within the cell. 180
59638 407221 pfam17048 Ceramidse_alk_C Neutral/alkaline non-lysosomal ceramidase, C-terminal. This family represents C-terminal domain of a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. The EC classification is EC:3.5.1.23. The enzyme hydrolyzes ceramide to generate sphingosine and fatty acid. The enzyme plays a regulatory role in a variety of physiological events in eukaryotes and also functions as an exotoxin in particular bacteria. This C-terminal tail of the enzyme is highly conserved across all species and may play a role in the interaction of the enzyme with the plasma membranes. The tail is also vital for the stabilisation of the enzyme as a whole. 165
59639 407222 pfam17049 AEP1 ATPase expression protein 1. ATPase expression protein 1 (AEP1) is a yeast mitochondrial protein. It is essential for the expression of subunit 9 of mitochondrial ATP synthase. 396
59640 407223 pfam17050 AIM5 Altered inheritance of mitochondria 5. AIM5 is a fungal mitochondrial inner membrane protein. It is a component of the mitochondrial inner membrane organising system (MINOS/MitOS), which promotes normal mitochondrial morphology. 60
59641 293656 pfam17051 COA2 Cytochrome C oxidase assembly factor 2. 86
59642 407224 pfam17052 CAF20 Cap associated factor. In eukaryotes, the translation of mRNA is initiated by the binding of the eIF4F complex, which is composed of eIF4E, eIF4A and eIF4G proteins. eIF4E-binding proteins (4E-BPs) are involved in translational regulation through their interaction with eIF4E. There are two eIF4E-binding proteins (4E-BPs) found in S. cerevisiae, Caf20 and Eap1. This entry represents Caf20 (also known as p20), which competes with eIF4G for binding to eIF4E and interferes with the formation of the eIF4F complex, hence inhibiting translation. It is needed for the induction of pseudohyphal growth in response to nitrogen limitation. 151
59643 407225 pfam17053 GEP5 Genetic interactor of prohibitin 5. Genetic interactor of prohibitin 5 (GEP5), also known as required for respiratory growth protein 5 (RRG5), has been shown to interact with prohibitin ring complexes in the mitochondrial inner membrane that regulate cell proliferation as well as the dynamics and function of mitochondria. It is required for mitochondrial genome maintenance and is essential for respiratory growth. 214
59644 293659 pfam17054 JUPITER Microtubule-Associated protein Jupiter. Is a microtubule-associated protein that binds to all microtubule populations in Drosophila. 208
59645 293660 pfam17055 VMR2 Viral matrix protein M2. This is a viral transmembrane protein which forms a proton-selective ion channel that is needed for the efficient release of the viral genome during virus entry. Once attached to the cell surface, the virion enters the cell by endocytosis. Acidification of the endosome triggers M2 ion channel activity. It also plays a role in the secretory pathways of viral proteins. It elevates the intravesicular pH of normally acidic compartments, such as the trans-Golgi network. It seems that M2 protein ion channel activity can affect the status of the conformational form of cleaved HA during intracellular transport. 235
59646 407226 pfam17056 KRE1 Killer toxin-resistance protein 1. The killer toxin-resistance protein 1 family are GPI-anchored plasma membrane proteins, found in yeast. They are involved in 1,6-beta-glucan formation and in the assembly and architecture of the cell wall. They also act as plasma membrane receptors for the yeast K1 viral toxin, and are involved in subsequent lethal channel formation. The family also includes Pga1 proteins, which have a role in oxidative stress response and in adhesion and biofilm formation. 66
59647 293662 pfam17057 B3R Poxviridae B3 protein. This is a viral protein. Its function is unknown. 123
59648 407227 pfam17058 MBR1 Mitochondrial biogenesis regulation protein 1. In yeast this protein participates in mitochondrial biogenesis and stress response. It may also affect NAM7 function, possibly at the level of mRNA turnover. 208
59649 293664 pfam17059 MGTL MgtA leader peptide. MTG is a bacterial protein that makes mgtA transcription sensitive to intracellular proline levels. When the levels of proline are low, this protein is not able to be translated and stem loop'C' forms in the mgt A 5'UTR which enables the transcription of the downstream mgtA gene. 17
59650 407228 pfam17060 MPS2 Monopolar spindle protein 2. Is a fungal transmembrane protein which is part of the component of the spindle pole body (SPB) required for the insertion of the nascent SPB into the nuclear envelope and for the proper execution of spindle pole body (SPB) duplication. It seems that Mps2-Spc24 interaction may contribute to the localization of Spc24 and other kinetochore components to the inner plaque of the SPB. 340
59651 407229 pfam17061 PARM PARM. Human PARM-1 is a mucin-like, androgen-regulated transmembrane protein that is present in most tissues, with high levels in the heart, kidney and placenta. It has been shown to be induced and expressed in prostate after castration and may have a role in cell proliferation and immortalisation in prostate cancer. 296
59652 407230 pfam17062 Osw5 Outer spore wall 5. In fungi the outermost cape of the spore wall is made up of a polymer that contains cross-linked amino acid dityrosine, which is important for the stress resistance of the spore. The OSW family of proteins have been implicated in assembly of this protective dityrosine coat. OSW5 null mutant spores show an enhanced spore wall permeability and vulnerability to beta glucanase digestion. The proteins are predicted to be integral membrane proteins. 70
59653 293668 pfam17063 Psm4 Phenol-soluble modulin alpha 4 peptide. Psma4 is a methicillin-resistant Staphylococcus aureus (MRSA) protein that may recruit, activate and induce the lysis of human neutrophils. It stimulates the secretion of IL-8 and also has haemolytic activity during MRSA infection. 20
59654 407231 pfam17064 QVR Sleepless protein. In Drosophila QUIVER (also known as SLEEPLESS protein) is required for homoeostatic regulation of sleep under normal conditions and following sleep deprivation. It is a novel potassium channel subunit that modulates the Shaker potassium channel which regulates the sleep. 85
59655 293670 pfam17065 UPF0669 Putative cytokine, C6ORF120. C6orf120 is a secreted protein that promotes cell cycle progression of CD4(+) T-cells, not hepatocytes. In humans it has its main role in tunicamycin-induced CD4(+) T apoptosis that may be associated with endoplasmic reticulum stress. This suggests that it might be a new cytokine with immunoregulatory function that is selective for CD4+ T cells. It is mainly expressed in hepatocytes and cells in the germinal centre of lymph nodes. 185
59656 407232 pfam17066 RITA RBPJ-interacting and tubulin associated protein. RITA is a highly conserved protein that binds to tubulin and shuttles between the cytoplasm and nucleus. It is responsible for export of RBP-J/CBF-1 from the nucleus, which modulates Notch-mediated transcription. 267
59657 293672 pfam17067 RPS31 Ribosomal protein S31e. RPS31, Ubi3 precursor, which is part of mature 60S and 40S ribosomal subunits. It seems that linear ubiquitin fusion to Rps31 and its subsequent cleavage are required for the efficient production and functional integrity of 40S ribosomal subunits. 99
59658 293673 pfam17068 RRG8 Required for respiratory growth protein 8 mitochondrial. RRG8 is a mitochondrial protein that plays an important role in maintenance of mtDNA; it is required for respiratory activity and for maintenance and expression of the mitochondrial genome. 279
59659 293674 pfam17069 RSRP Arginine/Serine-Rich protein 1. RSRP1 is an eukaryotic protein family. Its function is unknown. 299
59660 319115 pfam17070 Thx 30S ribosomal protein Thx. Thx forms part of the 30S ribosomal subunit. It fits into a cavity between multiple RNA elements in the top of the 30S subunit head and stabilizes the organisation of these elements. 27
59661 293676 pfam17071 Capsid_VP7 Outer capsid protein VP7. Outer capsid protein VP7 is a reoviral protein that interacts with VP4 to form the outer icosahedral capsid. Outer capsids are involved directly in viral host interactions. 276
59662 407233 pfam17072 Spike_torovirin Torovirinae spike glycoprotein. The spike glycoprotein is a corona viral transmembrane protein that mediates the binding of virions to the host cell receptor and is involved in membrane fusion. The torovirinae spike proteins appear distinct from other coronaviridae spike proteins, such as human SARS coronavirus. 1271
59663 319116 pfam17073 SafA Two-component-system connector protein. SafA is a bacterial transmembrane protein family that connects the signal transduction between the two component systems EvgS/EvgA and PhoQ/Phop. SafA interacts with PhoQ, leading to the PhoQ/PhoP system activation in response to acid stress conditions. 64
59664 293679 pfam17074 Darcynin Darcynin, domain of unknown function. Darcynin is a bacterial protein family. Its function is unknown. 127
59665 374974 pfam17075 RRT14 Regulator of rDNA transcription protein 14. Regulator of rDNA transcription protein 14 (RRT14) is a nucleolar protein that is involved in ribosome biogenesis. 196
59666 293681 pfam17076 SBE2 SBE2, cell-wall formation. 820
59667 293682 pfam17077 Msap1 Mitotic spindle associated protein SHE1. She1 seems to be related to the spindle integrity function of the Dam1 complex. She1 is a dynein regulator and limits dynein offloading by gating the recruitment of dynactin to the astral microtubule plus end. Aurora B phosphorylates She1, modulating its potency against dynein. 330
59668 293683 pfam17078 SHE3 SWI5-dependent HO expression protein 3. SWI5-dependent HO expression protein 3 (She3) is an RNA-binding protein that binds specific mRNAs, including the mRNA of Ash1, which is involved in cell-fate determination. She3 acts as an adapter protein that docks the myosin motor Myo4p onto an Ash1-She2p ribonucleoprotein complex. She3 seems to bind to Myo4p and She2p via different domains. 228
59669 293684 pfam17079 SOTI Male-specific protein scotti. Soti is a post-meiotically transcribed gene that is required in late spermiogenesis for normal spermatid individualisation. Besides, it is expressed in primary spermatocytes and round spermatids. 101
59670 374975 pfam17080 SepA Multidrug Resistance efflux pump. SepA is a drug efflux protein that is involved in bacterial multidrug resistance. It is predicted to have four transmembrane domains. 144
59671 407234 pfam17081 SOP4 Suppressor of PMA 1-7 protein. SOP4 is a family of fungal ER membrane proteins that regulate the quality control and intracellular transport of Pma1-7, a mutant plasma membrane ATPase. 209
59672 293687 pfam17082 Spc29 Spindle Pole Component 29. Spc29 is a component of the Spc-110 subcomplex and is required for the SPB (Spindle pole body) duplication. Spc29 acts as a linker between the central plaque component Spc42 and the inner plaque component Spc110. 245
59673 407235 pfam17083 Swm2 Nucleolar protein Swm2. The nucleolar protein SWM2 (Synthetic With MUD-2-delta protein2) constitutes a yeast protein family. SWM2 is a nonessential gene whose function is unknown, but it encodes a protein that binds Tgs1, an enzyme responsible for 2,2,7-trimethylguanosine (TMG) capping of small nuclear (sn) RNAs implicated in pre-mRNA splicing. 130
59674 407236 pfam17084 TDA11 Topoisomerase I damage affected protein 11. Tda11 is a fungal protein family. The function is unknown. 465
59675 293690 pfam17085 UCMA Unique cartilage matrix associated protein. UCMA is a secreted cartilage-specific protein located in chromosome 2 that is predominantly expressed in resting chondrocytes. It is secreted into the extracellular matrix as an uncleaved precursor and shows the same restricted distribution pattern in cartilage as UCMA mRNA. This protein is proteolytically processed and contains tyrosine sulfates. It seems to be involved in the negative control of osteogenic differentiation of osteochondrogenic precursor cells in peripheral zones of foetal cartilage. 134
59676 374978 pfam17086 HV_small_capsid Small capsid protein of Herpesviridae. This is a family of herpes-type viral small capsid proteins. 77
59677 293692 pfam17087 HHV-5_US34A Herpesvirus US34A protein family. Proteins in this human cytomegalovirus (HHV-5) family contain a transmembrane domain. 64
59678 407237 pfam17088 YCF90 Uncharacterized protein family. Ycf90 is an algal protein located in chloroplasts. Its function is unknown. 388
59679 293694 pfam17089 YjbT Uncharacterized protein family. This is a family of bacterial proteins. The function is unknown. 92
59680 374980 pfam17090 Ytca Uncharacterized protein family. This is a family of bacterial transmembrane proteins. The function is unknown. 62
59681 293696 pfam17091 Tail_VII Inovirus G7P protein. Tail virion protein 7P is a viral transmembrane protein that interacts with the packaging signal of the viral genome, leading to the initiation of the virion concomitant assembly-budding process in the host inner membrane. 40
59682 407238 pfam17092 PCB_OB Penicillin-binding protein OB-like domain. 109
59683 293698 pfam17093 PBP_N Penicillin-binding protein N-terminus. This domain occurs at the N-terminus of some penicillin-binding proteins in Caulobacter species. 138
59684 293699 pfam17094 UPF0715 Uncharacterized protein family (UPF0715). This is a family of Bacilli transmembrane proteins. The function is unknown. 115
59685 407239 pfam17095 CAMSAP_CC1 Spectrin-binding region of Ca2+-Calmodulin. CAMSAP_CC1 is the conserved region on calmodulin-regulated spectrin-associated proteins in eukaryotes that binds spectrin. CAMSAPs are vertebrate microtubule-binding proteins, representatives of a family of cytoskeletal proteins that arose in animals. This conserved CC1 region binds to both spectrin and Ca2+/calmodulin in vitro, although the binding of Ca2+/calmodulin inhibited the binding of spectrin. CC1 appears to be a functional region of CAMSAP1 that links spectrin-binding to neurite outgrowth. 59
59686 407240 pfam17096 AIM3 Altered inheritance of mitochondria protein 3. AIM3 is a family of fungal proteins that are described as altered inheritance of mitochondria protein 3 proteins. 85
59687 407241 pfam17097 Kre28 Spindle pole body component. In Saccharomyces cerevisiae, Kre28 and Spc105 form a kinetochore microtubule binding complex, which bridges between centromeric heterochromatin and kinetochore MAPs (microtubule-associated proteins, such as Bim1, Bik1 and Slk19) and motors (Cin8, Kar3). It may be regulated by sumoylation. 360
59688 407242 pfam17098 Wtap WTAP/Mum2p family. The Wtap family includes female-lethal(2)D from Drosophila and pre-mRNA-splicing regulator WTAP from mammals. The former is required for female-specific splicing of Sex-lethal RNA, and the latter is a regulatory subunit of the RNA N6-methyladenosine methyltransferase. The family also includes the yeast Mum2p protein which is part of the Mis complex. 155
59689 293704 pfam17099 TrpP Tryptophan transporter TrpP. TrpP is a bacterial transmembrane protein that is probably involved in tryptophan uptake. Its expression is regulated by tryptophan-activated RNA-binding regulatory protein (TRAP). 169
59690 407243 pfam17100 NACHT_N N-terminal domain of NWD NACHT-NTPase. This is an N-terminal domain on putative NWD NACHT proteins, signal transducing ATPases which undergo ligand-induced oligomerization. 220
59691 407244 pfam17101 Stealth_CR1 Stealth protein CR1, conserved region 1. Stealth_CR1 is the first of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR1 carries a well-conserved IDVVYT sequence-motif. The domain is found in tandem with CR2, CR3 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system. 29
59692 407245 pfam17102 Stealth_CR3 Stealth protein CR3, conserved region 3. Stealth_CR3 is the third of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. The domain is found in tandem with CR1, CR2 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system. 49
59693 407246 pfam17103 Stealth_CR4 Stealth protein CR4, conserved region 4. Stealth_CR4 is the fourth highly conserved region on stealth proteins in metazoa and bacteria. There are four CR regions on mammalian members. CR4 carries a well-conserved CLND sequence-motif. The domain is found in tandem with CR1, CR2 and CR3 on both potential metazoan hosts and on pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains also appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system. 56
59694 407247 pfam17104 DUF5102 Domain of unknown function (DUF5102). This is a family of fungal sequences of no known function. 292
59695 407248 pfam17105 BRD4_CDT C-terminal domain of bromodomain protein 4. BRD4_CDT is the short highly conserved C-terminal domain of certain bromodomain proteins, notably Brd4. The Brd4 CTD interacts with the cyclin T1 and Cdk9 subunits of positive transcription elongation factor b (pTEFb) complex. Brd4 displaces negative regulators, the HEXIM1 and 7SKsnRNA complex, from pTEFb, thereby transforming it into an active form that can phosphorylate RNA pol II. 44
59696 374990 pfam17106 NACHT_sigma Sigma domain on NACHT-NTPases. NACHT_sigma is a short conserved region found on NACHT-NTPases. The function of this domain is not known. 42
59697 407249 pfam17107 SesA N-terminal domain on NACHT_NTPase and P-loop NTPases. This is a family of fungal N-terminal domains that appear at the N-terminus of P-loop NTPases, NACHT-NTPases and Ankyrin or WD repeat proteins. The exact function is not known. 122
59698 374992 pfam17108 HET-S N-terminal small S protein of HET, non-prionic. HET-S is an N-terminal domain on various fungal STAND proteins. The function is not known exactly. 23
59699 407250 pfam17109 Goodbye fungal STAND N-terminal Goodbye domain. The Goodbye domain is an N-terminal domain on certain fungal STAND proteins. The exact function is not known. 120
59700 407251 pfam17110 TFB6 Subunit 11 of the general transcription factor TFIIH. TFB6 is a family of fungal proteins that form the 11th subunit of the general transcription factor TFIIH. TFB6 facilitates the dissociation of Ssl2 helicase from TFIIH after the initiation of transcription. 170
59701 374995 pfam17111 Helo_like_N Fungal N-terminal domain of STAND proteins. Helo_like is a family of predicted fungal STAND NTPases. The exact function is not known. 209
59702 407252 pfam17112 Tom6 Mitochondrial import receptor subunit Tom6, fungal. Tom6 is the Tom6 subunit of the protein translocase complex TOM in fungi. This complex of the outer membrane of mitochondria is the entry gate for the vast majority of precursor proteins that are encoded by nuclear DNA, synthesized in the cytosol and imported into the mitochondria. Tom6 and Tom7 together play a role in the assembly, stability and dynamics of the TOM complex. 45
59703 374997 pfam17113 AmpE Regulatory signalling modulator protein AmpE. AmpE is a family of bacterial regulatory proteins. AmpE, in conjunction with AmpD, senses the effect of beta-lactam on peptidoglycan synthesis and relays this signal to AmpR. AmpR regulates the production of beta-lactamase. 284
59704 374998 pfam17114 Nod1 Gef2-related medial cortical node protein Nod1. This is a small family of fungal proteins that are involved in cytokinesis, the last stage of the cell-division cycle. Nod1 co-localizes with Gef2 - RhoGEF - in the contractile ring and its precursor cortical nodes. Nod1 and Gef2 interact through this C-terminal region of each, this interaction being important for their localization. 145
59705 407253 pfam17115 Toast_rack_N N-terminal domain of toast_rack, DUF2154. This short domain lies at the N-terminus of DUF2154, pfam09922, hereafter named Toast_rack from its structural resemblance. The function of both domains is unknown though DUF2154 is proposed to be a cell-adhesion protein. 92
59706 407254 pfam17116 DUF5103 Domain of unknown function (DUF5103). This is a family of Bacteroidetes proteins of unknown function. 288
59707 407255 pfam17117 DUF5104 Domain of unknown function (DUF5104). This is a family of proteins of unknown function from gut microbes. 107
59708 407256 pfam17118 DUF5105 Domain of unknown function (DUF5105). This is a family of Firmicutes proteins of unknown function. There is one structure, Structure 4r4g, a lipoprotein, whose N-terminus is represented by DUF4352, pfam11611. 189
59709 407257 pfam17119 MMU163 Mitochondrial protein up-regulated during meiosis. This is a family of fungal mitochondrial proteins of unknown function. 253
59710 375001 pfam17120 Zn_ribbon_17 Zinc-ribbon, C4HC2 type. 57
59711 375002 pfam17121 zf-C3HC4_5 Zinc finger, C3HC4 type (RING finger). 51
59712 293727 pfam17122 zf-C3H2C3 Zinc-finger. 35
59713 407258 pfam17123 zf-RING_11 RING-like zinc finger. 29
59714 375004 pfam17124 ThiJ_like ThiJ/PfpI family-like. This is a family of fungal and bacterial ThiJ/PfpI-like proteins. 188
59715 407259 pfam17125 Methyltr_RsmF_N N-terminal domain of 16S rRNA methyltransferase RsmF. This is the N-terminal domain of the RsmF methyl transferase. RsmF is a multi-site-specific methyltransferase that is responsible for the synthesis of three modifications on cytidines in 16S ribosomal RNA. The N-terminus is critical for stabilizing the catalytic core of the enzyme. 88
59716 407260 pfam17126 RsmF_methylt_CI RsmF rRNA methyltransferase first C-terminal domain. This is the first of two distinct C-terminal domains of the 16S rRNA methyltransferase RsmF. It is necessary for stabilizing the catalytic core, pfam01189. 61
59717 407261 pfam17127 DUF5106 Domain of unknown function (DUF5106). This domain, found in Bacteroidetes proteins, is frequently associated with a putative thiol-disulfide oxidoreductase domain, pfam13905. The function of this domain is not known. 153
59718 407262 pfam17128 DUF5107 Domain of unknown function (DUF5107). This family is found in a range of different bacterial species. In many proteins it lies N-terminal to a TPR-repeat region at the C-terminus. 300
59719 407263 pfam17129 Peptidase_M99_C C-terminal domain of metallo-carboxypeptidase. C-terminal immunoglobulin-like domain of helical cell shape-determining peptidoglycan hydrolases, a metallo-carboxypeptidase. The structural elements of this domain form a Ca2+ binding-channel, the Ca2+ being co-ordinated by six ligand-atoms. 100
59720 407264 pfam17130 Peptidase_M99_m beta-barrel domain of carboxypeptidase M99. This is the central, beta-barrel, domain of the metallo-carboxypeptidase that maintains helical cell-shape in Helicobacter. It shows a novel fold. It has a highly positively charged surface which contributes to a high overall isoelectric point. A calcium-binding channel is formed from residues in the C-terminal Ig-like domain in conjunction with some of the long side-chains of residues from strands beta-14 and beta-18 of this domain. 73
59721 407265 pfam17131 LolA_like Outer membrane lipoprotein-sorting protein. This is likely to be a family of outer-membrane lipoprotein-sorting proteins. 182
59722 407266 pfam17132 Glyco_hydro_106 alpha-L-rhamnosidase. 873
59723 407267 pfam17133 DUF5108 Domain of unknown function (DUF5108). This is a family of Bacteroidetes proteins. The domain lies upstream of a Fasciclin family, pfam02469. 216
59724 407268 pfam17134 DUF5109 Domain of unknown function (DUF5109). This is a family of Gram-positive Bacteroidetes and Firmicutes proteins. It lies just C-terminal to a putative glycosyl-hydrolase family, DUF4434, pfam14488. It is likely to be some form of binding or recognition domain. 114
59725 407269 pfam17135 Ribosomal_L18 Ribosomal protein 60S L18 and 50S L18e. This is a family of ribosomal proteins, 60S L18 from eukaryotes and 50S L18e from Archaea. 184
59726 407270 pfam17136 ribosomal_L24 Ribosomal proteins 50S L24/mitochondrial 39S L24. This is the family of bacterial 50S ribosomal subunit proteins L24. It also carries some mitochondrial 39S L24 proteins. 60
59727 407271 pfam17137 DUF5110 Domain of unknown function (DUF5110). This domain is likely to be a carbohydrate-binding domain of some description as it is found immediately C-terminal to the glycosyl-hydrolase family Glyco_hydro_31, pfam01055. 72
59728 407272 pfam17138 DUF5111 Domain of unknown function (DUF5111). This family is found immediately downstream of SusE, a putative starch-processing family, pfam14292. It is possible that this domain represents a substrate-binding site. 121
59729 407273 pfam17139 DUF5112 Domain of unknown function (DUF5112). This domain is frequently found upstream of family HATPase_c pfam000251. 266
59730 407274 pfam17140 DUF5113 Domain of unknown function (DUF5113). This domain is frequently found downstream of family HATPase_c pfam000251 in duplicate. 162
59731 407275 pfam17141 DUF5114 Domain of unknown function (DUF5114). This family lies further downstream of DUF5111, pfam17138, on proteins from Bacteroidetes that also carry a SusE family, pfam14292. 88
59732 407276 pfam17142 DUF5115 Domain of unknown function (DUF5115). 258
59733 407277 pfam17144 Ribosomal_L5e Ribosomal large subunit proteins 60S L5, and 50S L18. This family contains the large 60S ribosomal L5 proteins from Eukaryota and the 50S L18 proteins from Archaea. It has been shown that the amino terminal 93 amino acids of Rat Rpl5 are necessary and sufficient to bind 5S rRNA in vitro, suggesting that the entire family has a function in rRNA binding. 162
59734 407278 pfam17145 DUF5119 Domain of unknown function (DUF5119). This is a family of uncharacterized Bacteroidia sequences. 193
59735 407279 pfam17146 PIN_6 PIN domain of ribonuclease. This is a PIN domain found largely in eukaryotes. 87
59736 407280 pfam17147 PFOR_II Pyruvate:ferredoxin oxidoreductase core domain II. PFOR_II is a core domain of the anaerobic enzyme pyruvate:ferredoxin oxidoreductase and is necessary for inter-subunit contacts in conjunction with domains I and IV. 102
59737 407281 pfam17148 DUF5117 Domain of unknown function (DUF5117). This domain may fall upstream of a met-zincin domain. 189
59738 319167 pfam17149 CHASE5 Periplasmic sensor domain found in signal transduction proteins. CHASE5 is a conserved periplasmic sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases and methyl-accepting chemotaxis proteins. In Pseudomonas aeruginosa, CHASE5 is the sensor domain in the c-di-GMP phosphodiesterase BifA that regulates biofilm formation and in sensor kinase AruS that regulates arginine degradation pathways. These results suggest that CHASE5 might bind arginine or a related compound. 108
59739 407282 pfam17150 CHASE6_C C-terminal domain of two-partite extracellular sensor domain. CHASE6 was originally described as a two-partite extracellular (periplasmic) sensor domain found in histidine kinases and HD-GYP-type c-di-GMP-specific phosphodiesterases and assigned to COG4250 in the COG database. Subsequently, its N-terminal part has been described as a separate DICT (DIguanylate Cyclases and Two-component systems) domain (pfam10069) (Aravind L., Iyer LM, Anantharaman V. (2010) Natural history of sensor domains in the bacterial signalling systems. In: Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition (Spiro S, Dixon R, eds), pp. 1-38. Caister Academic Press, Norfolk, UK). The current entry contains only the C-terminal part of the original CHASE6 domain, which is found primarily in cyanobacteria. 80
59740 407283 pfam17151 CHASE7 Periplasmic sensor domain. CHASE7 is a conserved periplasmic sensor domain found in histidine kinases and diguanylate cyclases/phosphodiesterases, including the diguanylate cyclase DgcQ (YedQ) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). 187
59741 407284 pfam17152 CHASE8 Periplasmic sensor domain. CHASE8 is a conserved periplasmic sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases and methyl-accepting chemotaxis proteins, including the diguanylate cyclase DgcN (YfiN) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). In Pseudomonas aeruginosa, CHASE8 is the sensor domain in the diguanylate cyclase TpbB that regulates biofilm formation by controlling the levels of extracellular DNA. 102
59742 319171 pfam17153 CHASE9 Periplasmic sensor domain, extracellular. CHASE9 is a conserved extracellular (periplasmic) sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases, methyl-accepting chemotaxis proteins, adenylate cyclases and protein serine phosphatases, including the c-di-GMP phosphodiesterases PdeI (YliE) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). 116
59743 407285 pfam17154 GAPES3 Gammaproteobacterial periplasmic sensor domain. GAPES3 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases/phosphodiesterases, including the c-di-GMP phosphodiesterases PdeK (YhjK) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation) and HmsP of Yersinia pestis. 121
59744 319173 pfam17155 GAPES1 Gammaproteobacterial periplasmic sensor domain. GAPES1 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases and methyl-accepting chemotaxis proteins, including the diguanylate cyclase DgcJ (YeaJ) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation). 274
59745 319174 pfam17156 GAPES2 Gammaproteobacterial periplasmic sensor domain. GAPES2 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases, including the diguanylate cyclase DgcI (YliF) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). It contains three conserved Cys residues that might participate in thiol-disulfide exchange. 204
59746 407286 pfam17157 GAPES4 Gammaproteobacterial periplasmic sensor domain. GAPES4 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in various GGDEF- and EAL-containing proteins. In Escherichia coli, GAPES4 forms the N-terminal domain of the regulatory protein CsrD (YhdA) (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation), which contains enzymatically inactive GGDEF and EAL domains and controls the degradation of two non-coding RNAs, CsrB and CsrC. In Vibrio cholerae, GAPES4-containing protein MshH (Q9KUW1_VIBCH) inhibits biofilm formation, apparently acting through the glucose-specific enzyme IIA (Q9KTD8, pfam00358). 98
59747 407287 pfam17158 MASE4 Membrane-associated sensor, integral membrane domain. MASE4 (Membrane-Associated SEnsor) is an integral membrane sensor domain found in various GGDEF domain proteins, including a functional diguanylate cyclase DgcT (YcdT) and the enzymatically inactive CdgI (YeaI) of Escherichia coli (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation). In the Shiga toxin-producing enteroaggregative E. coli O104:H4, which caused the outbreak of the haemolytic uraemic syndrome in Germany in 2011, MASE4-containing diguanylate cyclase DgcX, UniProtKB:B7LBD9_ECO55, was highly expressed, ensuring strong biofilm formation. 239
59748 407288 pfam17159 MASE3 Membrane-associated sensor domain. MASE3 (Membrane-Associated SEnsor) is an integral membrane sensor domain of unknown specificity found in histidine kinases, diguanylate cyclases and protein phosphatases in various bacteria and archaea. 226
59749 319178 pfam17160 DUF5124 Domain of unknown function (DUF5124). 100
59750 407289 pfam17161 DUF5123 Domain of unknown function (DUF5123). 116
59751 407290 pfam17162 DUF5118 Domain of unknown function (DUF5118). This domain falls upstream of a met-zincin domain. 50
59752 407291 pfam17163 DUF5125 Domain of unknown function (DUF5125). 193
59753 407292 pfam17164 DUF5122 Domain of unknown function (DUF5122) beta-propeller. 36
59754 407293 pfam17165 DUF5121 Domain of unknown function (DUF5121). 111
59755 407294 pfam17166 DUF5126 Domain of unknown function (DUF5126). This domain lies C-terminal to DUF4959, pfam16323. 102
59756 407295 pfam17167 Glyco_hydro_36 Glycosyl hydrolase 36 superfamily, catalytic domain. This is the catalytic region of the superfamily of enzymes referred to as GH36. UniProtKB:Q76IQ9 is a chitobiose phosphorylase that catalyzes the reversible phosphorolysis of chitobiose into alpha-GlcNAc-1-phosphate and GlcNAc with inversion of the anomeric configuration. The full-length enzyme comprises a beta sandwich domain and an (alpha/alpha)(6) barrel domain. The alpha-helical barrel component of the domain, this family, is the catalytic region. 425
59757 407296 pfam17168 DUF5127 Domain of unknown function (DUF5127). 226
59758 407297 pfam17169 NRBF2_MIT MIT domain of nuclear receptor-binding factor 2. This MIT domain is the microtubule interaction and trafficking domain of nuclear receptor-binding factor 2 - NRBF2 - in higher eukaryotes. It is a coiled-coil region at the N-terminus of pfam08961. NRBF2 plays an essential role in autophagy, the cellular pathway that degrades long-lived proteins and other cytoplasmic contents through lysosomes. NRBF2 binds Atg14L - a Beclin-binding protein - directly via the MIT domain and enhances Atg14L-linked Vps34 kinase (a class III phosphatidylinositol-3 kinase) activity and autophagy induction. 83
59759 407298 pfam17170 DUF5128 6-bladed beta-propeller. This family is a 6-bladed beta-propeller structure of unknown function. There is a highly conserved FDxxG motif which might be important. 321
59760 407299 pfam17171 GST_C_6 Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043. 64
59761 407300 pfam17172 GST_N_4 Glutathione S-transferase N-terminal domain. This domain is homologous to pfam02798. 97
59762 375028 pfam17173 DUF5129 Domain of unknown function (DUF5129). 337
59763 379937 pfam17174 DUF5130 Domain of unknown function (DUF5130). 136
59764 407301 pfam17175 MOLO1 Modulator of levamisole receptor-1. MOLO1 is a one-pass transmembrane protein that contains a single extracellular globular domain. It is a positive regulator of levamisole-sensitive acetylcholine receptors in Caenorhabditis elegans. These receptors are Cys-loop ligand-gated ion channels, and the MOLO1 domain is an auxiliary subunit of the gated channel. The proteins carry a Rossmann fold. 119
59765 407302 pfam17176 tRNA_bind_3 tRNA-binding domain. This domain, found at the C-terminus of tRNA(Met) cytidine acyltransferase, may be involved in tRNA-binding. This family represents the tRNA-binding domain proteins not captured by pfam13725. 119
59766 407303 pfam17177 PPR_long Pentatricopeptide-repeat region of PRORP. Pentatricopeptide repeat (PPR) proteins are a large family of modular RNA-binding proteins which mediate several aspects of gene expression primarily in organelles but also in the nucleus. PPR_long is the region of Arabidopsis protein-only RNase P (PRORP) enzyme that consists of up to eleven alpha-helices. PRORPs are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. All PPR proteins contain tandemly repeated sequence motifs (the PPR motifs) which can vary in number. The series of helix-turn-helix motifs formed by PPR motifs throughout the protein produces a superhelix with a central groove that allows the protein to bind RNA. Proteins containing PPR motifs are known to have roles in transcription, RNA processing, splicing, stability, editing, and translation. Over a decade after the discovery of PPR proteins, the super-helical structure was confirmed. The protein-only mitochondrial RNase P crystal structure from Arabidopsis thaliana (PRORP1) confirmed the role of its PPR motifs in pre-tRNA binding and suggests it has evolved independently from other RNase P proteins that rely on catalytic RNA. 212
59767 407304 pfam17178 MASE5 Membrane-associated sensor. MASE5 is a family of bacterial membrane-associated sensor domains. It is an integral membrane sensor domain found in various GGDEF domain proteins, including a diguanylate cyclase DgcY (EcSMS35_1716) from multidrug-resistant environmental isolate Escherichia coli SMS-3-5 (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). 192
59768 407305 pfam17179 Fer4_22 4Fe-4S dicluster domain. 95
59769 407306 pfam17180 zf-3CxxC_2 Zinc-binding domain. 74
59770 407307 pfam17181 EPF Epidermal patterning factor proteins. EPF is a family of plant epidermal cell growth factors. It is a signalling peptide that determines the spacing and separation of the development of stomatal cells in the upper epidermis of plant leaf cells. 45
59771 407308 pfam17182 OSK OSK domain. This entry represents the OSK domain defined by Jeske and colleagues. The domain is related to SGNH hydrolases but lacks the active site residues. The domain binds to RNA. 202
59772 375036 pfam17183 Blt1_C Get5 carboxyl domain. During size-dependent cell cycle transitions controlled by the ubiquitous cyclin-dependent kinase Cdk1, Blt1 has been shown to co-localize with Cdr2 in the medial interphase nodes, as well as with Mid1 which was previously shown to localize to similar interphase structures. Physical interactions between Blt1-Mid1, Blt1-Cdr2 and Cdr2-Mid1 were detected, indicating that medial cortical nodes are formed by the ordered, Cdr2-dependent assembly of multiple interacting proteins during interphase. This entry corresponds to the C-terminal dimerization domain. 51
59773 407309 pfam17184 Rit1_C Rit1 N-terminal domain. This domain is the N-terminal domain from the enzyme (EC:2.4.2.-) which modifies exclusively the initiator tRNA in position 64 using 5'-phosphoribosyl-1'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. The N-terminal domain is the most conserved region of the protein. 272
59774 407310 pfam17185 NlpE_C NlpE C-terminal OB domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes. This domain is found at the C-terminus of the NlpE protein. 90
59775 407311 pfam17186 Lipocalin_9 Lipocalin-like domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of a family member (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication. This entry represents the C-terminal domain of the pair. 130
59776 407312 pfam17187 Svf1_C Svf1-like C-terminal lipocalin-like domain. Family of proteins that are involved in survival during oxidative stress. This entry corresponds to the C-terminal domain of a pair of lipocalin domains. 163
59777 407313 pfam17188 MucB_RseB_C MucB/RseB C-terminal domain. Members of this family are regulators of the anti-sigma E protein RseA. 98
59778 407314 pfam17189 Glyco_hydro_30C Glycosyl hydrolase family 30 beta sandwich domain. 63
59779 407315 pfam17190 RecG_N RecG N-terminal helical domain. This four helical bundle domain is found at the N-terminus of bacterial RecG proteins. 89
59780 407316 pfam17191 RecG_wedge RecG wedge domain. This DNA-binding domain has an OB-fold with large elaborations. 162
59781 407317 pfam17192 MukF_M MukF middle domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid. 161
59782 407318 pfam17193 MukF_C MukF C-terminal domain. This presumed domain is found at the C-terminus of the MukF protein. 158
59783 407319 pfam17194 AbiEi_3_N Transcriptional regulator, AbiEi antitoxin N-terminal domain. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338. 93
59784 407320 pfam17195 DUF5132 Protein of unknown function (DUF5132). Proteins in this family are uncharacterized, but have been identified as members of a gene cluster for the synthesis of Ansamitocin. 47
59785 407321 pfam17196 DUF5133 Protein of unknown function (DUF5133). This protein of unknown function is part of the Borrelidin synthesis genomic cluster. Borrelidin is a polyketide antibiotic. 65
59786 407322 pfam17197 DUF5134 Domain of unknown function (DUF5134). Proteins in this family are uncharacterized, but have been identified as members of a gene cluster for the synthesis of the tetramic-acid antibiotic streptolydigin, which inhibits bacterial RNA polymerase (RNAP). 157
59787 379943 pfam17198 AveC_like Spirocyclase AveC-like. AveC catalyzes the stereospecific spiroketalization of a dihydroxy-ketone polyketide intermediate in the biosynthetic pathway of Avermectin, a potent antiparasitic agent. Additionally, it has a unique dehydration activity that serves to determine the regiospecific saturation pattern for spiroketal diversity. MeiC, the counterpart in the biosynthesis of AVE-like meilingmycin, also has spirocyclase activity, but lacks the dehydratase activity. 229
59788 407323 pfam17199 DUF5136 Protein of unknown function (DUF5136). Sequences in this family have been identified in Micromonospora as part of the genomic cluster for the synthesis of dynemicin, an enediyne antitumor antibiotic. 28
59789 407324 pfam17200 sCache_2 Single Cache domain 2. This entry represents the single Cache domain 2 (sCache_2), which contains the long N-terminal helix domain. 154
59790 407325 pfam17201 Cache_3-Cache_2 Cache 3/Cache 2 fusion domain. The Cache_3-Cache_2 domain likely originated as a fusion of sCache_3 and sCache_2 domains. 292
59791 407326 pfam17202 sCache_3_3 Single cache domain 3. 107
59792 375046 pfam17203 sCache_3_2 Single cache domain 3. 140
59793 375047 pfam17204 Sid-5 Sid-5 family. SID-5 is a C. elegans endosome-associated protein that is required for efficient systemic RNAi. 76
59794 407327 pfam17205 PSI_integrin Integrin plexin domain. This short disulphide rich domain is found at the N-terminus of integrin beta chains. 48
59795 319224 pfam17206 SeqA_N SeqA protein N-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor. This short domain mediates dimerization. 36
59796 407328 pfam17207 MCM_OB MCM OB domain. This family contains an OB-fold found within MCM proteins. This domain contains an insertion at the zinc binding motif. 126
59797 407329 pfam17208 RBR RNA binding Region. 59
59798 407330 pfam17209 Hfq Hfq protein. 64
59799 407331 pfam17210 SdrD_B SdrD B-like domain. This family corresponds to the B-like domain from the SdrD protein. This domain has three calcium binding sites within a greek key beta sandwich fold. 112
59800 407332 pfam17211 VHL_C VHL box domain. This domain represents the short C-terminal alpha helical domain from the VHL protein. 49
59801 407333 pfam17212 Tube Tail tubular protein. This family includes the tail tubular gp11 protein from bacteriophage T7. 169
59802 407334 pfam17213 Hydin_ADK Hydin Adenylate kinase-like domain. This domain found in the Hydin protein is homologous to adenylate kinases. 202
59803 407335 pfam17214 KH_7 KH domain. 68
59804 407336 pfam17215 Rrp44_S1 S1 domain. This domain corresponds to the S1 domain found at the C-terminus of ribonucleases such as yeast Rrp44. 87
59805 375054 pfam17216 Rrp44_CSD1 Rrp44-like cold shock domain. 148
59806 407337 pfam17217 UPA UPA domain. The UPA domain is conserved in UNC5, PIDD, and Ankyrins. It has a beta sandwich structure. 140
59807 407338 pfam17218 CBX7_C CBX family C-terminal motif. This motif is found at the C-terminus of CBX family proteins. It is bound by the RAWUL domain of the RING1B protein. 33
59808 407339 pfam17219 YAF2_RYBP Yaf2/RYBP C-terminal binding motif. This motif is found in the Yaf2 and RYBP proteins that are homologous parts of the PRC1 complex. This motif forms a beta hairpin structure when it binds to the RAWUL domain pfam16207. 33
59809 375058 pfam17220 DUF5137 Protein of unknown function (DUF5137). This is a family of uncharacterized yeast proteins. 78
59810 407340 pfam17221 COMMD1_N COMMD1 N-terminal domain. This helical domain is found at the N-terminus of COMMD1. 102
59811 407341 pfam17222 Peptidase_C107 Viral cysteine endopeptidase C107. This is a family of viral cysteine endopeptidases that process RNA polyproteins. Site-directed mutagenesis suggests that H1434 and C1539 form the catalytic dyad. 314
59812 375060 pfam17223 CPCFC Cuticle protein CPCFC. This entry contains cuticle proteins with a CX(5)C motif, although some members have a CX(7)C motif. In Anopheles gambiae, mRNA for this protein is most abundant immediately following ecdysis in larvae, pupae and adults, and is localized primarily in epidermis that secretes hard cuticle, sclerites, setae, head capsules, appendages and spermatheca. EM immunolocalization studies have shown that the protein is present in the endocuticle of legs and antennae. CPCFC is found throughout the Hexapoda and in several classes of Crustacea. 17
59813 407342 pfam17224 DUF5300 Domain of unknown function (DUF5300). This small family of proteins found in Clostridiales is functionally uncharacterized. Proteins in this family are around 130 amino acids in length. Based on NMR structure 2MCA, it forms a beta-sandwich structure consisting of two 4-stranded antiparallel b-strands. The structure is very similar to glutamine glutamyltransferases (1l9n) and peptide transporters (5a9h). 98
59814 407343 pfam17225 DUF5301 Domain of unknown function (DUF5301). This small family of proteins is functionally uncharacterized. It is found mainly in Firmicutes. Proteins in this family are around 130 amino acids in length. Based on NMR structure 2MCT, it forms an alpha/beta structure with a 6 stranded antiparallel b-sheet flanked by a single alpha helix. The only protein with similar structures is a putative lipoprotein (PDB code 4R7R). 97
59815 407344 pfam17226 MTA_R1 MTA R1 domain. The R1 domain is found in the MTA1 protein and its homologs. The domain is composed of 4 alpha helices. It has been shown to bind to the RBBP4 protein. The MTA proteins contain a second partial copy of this domain called R2. The R2 domain is matched by this model for some proteins. 79
59816 407345 pfam17227 DUF5302 Family of unknown function (DUF5302). Family of unknown function found in Actinobacteria with highly conserved motif of FRRKSG found at the C-terminus. 52
59817 375063 pfam17228 SGP Sulphur globule protein. Sulphur globules are membrane-bounded intracellular globules, used by purple sulphur bacteria to transiently store sulphur during the oxidisation of reduced sulphur compounds. This proteobacterial family contains structural proteins of these sulphur globules, and includes sulphur globule protein CV1 (SgpA) and sulphur globule protein CV2 (SgpB). 96
59818 407346 pfam17229 DUF5303 Region of unknown function (DUF5303). This disordered region of unknown function shows similarity to the N-terminal region of SMG1. 106
59819 407347 pfam17230 DUF5304 Family of unknown function (DUF5304). This family of unknown function is found in Actinobacteria. 149
59820 407348 pfam17231 DUF5305 Family of unknown function (DUF5305). This family consists of several hypothetical proteins of unknown function. 215
59821 407349 pfam17232 DUF5306 Family of unknown function (DUF5306). This family of unknown function is found mainly in plants. 82
59822 407350 pfam17233 DUF5308 Family of unknown function (DUF5308). This family of uncharacterized fungal proteins is primarily found in Ascomycota. 162
59823 407351 pfam17234 MPM1 Mitochondrial peculiar membrane protein 1. This family contains mitochondrial peculiar membrane proteins, found predominantly in Saccharomycetales. 172
59824 407352 pfam17235 STD1 STD1/MTH1. This family of proteins includes the known homologs STD1 (also known as MSN3) and MTH1. Both STD1 and MTH1 are involved in modulating the expression of glucose-regulated genes in yeast, but have been shown to function by slightly different methods. It has been suggested that both STD1 and MTH1 are required to repress the hexose transporter genes in low glucose conditions. STD1 has also been shown to stimulate SNF1 kinase through interaction with the catalytic domain of SNF1, antagonising auto-inhibition and promoting an active conformation of the kinase. 213
59825 407353 pfam17236 DUF5309 Family of unknown function (DUF5309). This is a family of uncharacterized proteins found in viruses and bacteria. 280
59826 407354 pfam17237 DUF5310 Family of unknown function (DUF5310). This uncharacterized family of proteins contains members that are found mainly in fungi. 44
59827 407355 pfam17238 DUF5311 Family of unknown function (DUF5311). This is a family of proteins which is mostly found in Streptophyta. At the C-terminus of this family, the Nucleoporin Nup120/160 family pfam11715 is often present. 194
59828 375073 pfam17239 DUF5312 Family of unknown function (DUF5312). This is a family of unknown function, mostly found in Spirochaeta. 553
59829 375074 pfam17240 DUF5313 Family of unknown function (DUF5313). This is a family of unknown function, found mostly in Actinobacteria and composed of trans-membrane proteins. 123
59830 407356 pfam17241 DUF5314 Family of unknown function (DUF5314). This is a family of unknown function usually preceded by the GAG-pre-integrase domain pfam13976. 154
59831 407357 pfam17242 DUF5315 Disordered region of unknown function (DUF5315). This is a family of unknown function found mostly in Saccharomycetales. 77
59832 407358 pfam17243 POTRA_TamA_1 POTRA domain TamA domain 1. This family represents the POTRA domain found in the membrane insertase TamA. 74
59833 407359 pfam17244 CDC24_OB3 Cell division control protein 24, OB domain 3. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis. 207
59834 407360 pfam17245 CDC24_OB2 Cell division control protein 24, OB domain 2. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis. 129
59835 407361 pfam17246 CDC24_OB1 Cell division control protein 24, OB domain 1. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis. 118
59836 407362 pfam17247 DUF5316 Family of unknown function (DUF5316). This is a family of unknown function mainly found in Firmicutes. Might contain multiple trans-membrane sequences. 74
59837 407363 pfam17248 DUF5317 Family of unknown function (DUF5317). This is a family of unknown function found mainly in Bacteria. Members of this family have multiple trans-membrane domains with the majority typically constituted of 4 trans-membrane regions. 150
59838 407364 pfam17249 DUF5318 Family of unknown function (DUF5318). This family of unknown function is mostly found in Actinobacteria. 131
59839 407365 pfam17250 NDUFB11 NADH-ubiquinone oxidoreductase 11 kDa subunit. Complex I of the respiratory chain is a proton-pumping, NADH ubiquinone oxidoreductase that oxidizes NADH in the electron transport pathway. Plants contain the series of 14 highly conserved complex I subunits found in other eukaryotic and related prokaryotic enzymes. 86
59840 407366 pfam17251 Pom Protochlamydia outer membrane protein. This family represents an outer membrane protein found in environmental chlamydia. The protein shows porin function. 279
59841 407367 pfam17252 DUF5319 Family of unknown function (DUF5319). This is a family of unknown function mostly found in Actinobacteria. 121
59842 407368 pfam17253 DUF5320 Family of unknown function (DUF5320). A number of members of this family have a coiled-coil domain at the C-terminus. 98
59843 407369 pfam17254 DUF5321 Family of unknown function (DUF5321). This is a family of unknown function. Most of the members seem to carry one trans-membrane region. 160
59844 407370 pfam17255 DUF5322 Family of unknown function (DUF5322). This is a family of unknown function. The uncharacterized family is mainly found in Bacteria and consists of two putative trans-membrane domains. 133
59845 407371 pfam17256 ANAPC16 Anaphase Promoting Complex Subunit 16. The Anaphase-promoting complex/cyclosome (APC/C) is a 1.5 megaDaltons assembly ubiquitin ligase complex comprising 19 subunits. This multifunctional ubiquitin-protein ligase targets different substrates for ubiquitylation and therefore regulates a variety of cellular processes such as cell division, differentiation, genome stability, energy metabolism, cell death, autophagy as well as carcinogenesis. The APC/C complex contains two sub-complexes, the Platform and the Arc Lamp. The Arc Lamp, which mediates transient association with regulators and ubiquitination substrates, contains the small subunits APC16, CDC26, APC13, and tetratricopeptide repeat (TPR) proteins. APC16 is a conserved subunit of the APC/C. APC16 was found in association with tandem-affinity-purified mitotic checkpoint complex protein complexes. APC16 is a bona fide subunit of human APC/C. It is present in APC/C complexes throughout the cell cycle. The phenotype of APC16-depleted cells copies depletion of other APC/C subunits, and APC16 is important for APC/C activity towards mitotic substrates. APC16 sequence homologs can be identified in metazoans, but not fungi, by four conserved primary sequence stretches. 80
59846 375084 pfam17257 DUF5323 Family of unknown function (DUF5323). This family of proteins found in Eukaryota, has no known function. 62
59847 407372 pfam17258 DUF5324 Family of unknown function (DUF5324). This is a family of unknown function, mostly found in Actinobacteria. Most of the family members contain one trans-membrane domain. 220
59848 407373 pfam17259 DUF5325 Family of unknown function (DUF5325). This is a family of unknown function mainly found in Bacilli. Members of this family are predicted to have trans-membrane domains. 61
59849 407374 pfam17260 DUF5326 Family of unknown function (DUF5326). This is a family of unknown function mostly found in Actinobacteria. Many of the family members are predicted to contain two trans-membrane domains. 70
59850 407375 pfam17261 DUF5327 Family of unknown function (DUF5327). This bacterial family of proteins has no known function and is mostly found in Bacilli. 97
59851 407376 pfam17262 DUF5328 Family of unknown function (DUF5328). This family of unknown function can be found in Bacteria and Archaea. Some of the proteins in this family are annotated in UniProt as putative DNA repair proteins. 114
59852 407377 pfam17263 DUF5329 Family of unknown function (DUF5329). This is a bacterial family of proteins with unknown function. 93
59853 407378 pfam17264 DUF5330 Family of unknown function (DUF5330). This is a family of unknown function which is mostly found in Bacteria. 65
59854 407379 pfam17265 DUF5331 Family of unknown function (DUF5331). This bacterial family of unknown function can be found in Cyanobacteria. 113
59855 407380 pfam17266 DUF5332 Family of unknown function (DUF5332). This family of uncharacterized proteins is mostly found in Chromadorea. 148
59856 407381 pfam17267 DUF5333 Family of unknown function (DUF5333). This family of uncharacterized proteins is mostly found in Alphaproteobacteria. 110
59857 339984 pfam17268 DUF5334 Family of unknown function (DUF5334). This is a family of unknown function which is found mainly in Proteobacteria. 71
59858 407382 pfam17269 DUF5335 Family of unknown function (DUF5335). This bacterial family of proteins has no known function. 110
59859 407383 pfam17270 DUF5336 Family of unknown function (DUF5336). This Actinobacterial family of proteins has no known function. Most of the family members are predicted to have 4 trans-membrane regions. 115
59860 407384 pfam17271 Usher_TcfC TcfC Usher-like barrel domain. This is the presumed beta barrel domain from the usher-like TcfC family of proteins. 422
59861 407385 pfam17272 DUF5337 Family of unknown function (DUF5337). This family of unknown function is found in Rhodobacterales. Most members are predicted to have 2 trans-membrane regions. 74
59862 407386 pfam17273 DUF5338 Family of unknown function (DUF5338). This is a family of unknown function which can be found mostly in Proteobacteria. 70
59863 407387 pfam17274 DUF5339 Family of unknown function (DUF5339). This is a family of unknown function that can be found mostly in Proteobacteria. Some of the family members are predicted to contain a coiled coil region. 70
59864 407388 pfam17275 DUF5340 Family of unknown function (DUF5340). This family of unknown function can be found in Cyanobacteria. 70
59865 407389 pfam17276 DUF5341 Family of unknown function (DUF5341). This is a family of unknown function, which can be found mostly in Ascomycota. 161
59866 407390 pfam17277 DUF5342 Family of unknown function (DUF5342). This family of no known function is found in Bacilli. 69
59867 407391 pfam17278 DUF5343 Family of unknown function (DUF5343). This is a family of unknown function which is found in Bacteria and Archaea. 138
59868 407392 pfam17279 DUF5344 Family of unknown function (DUF5344). This is a Bacterial family of unknown function. Most of the members of this family are predicted to contain a coiled-coil region. 87
59869 407393 pfam17280 DUF5345 Family of unknown function (DUF5345). This is a family of unknown function. It is found mostly in Bacteria. Members of this family are predicted to contain 2 trans-membrane regions. 77
59870 407394 pfam17281 DUF5346 Family of unknown function (DUF5346). This family of unknown function is found in Nematoda. 102
59871 407395 pfam17282 DUF5347 Family of unknown function (DUF5347). This family of unknown function is found in Bacteria, mainly in Proteobacteria. 102
59872 407396 pfam17283 Zn_ribbon_SprT SprT-like zinc ribbon domain. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc ribbon. This family includes the bacterial SprT protein. 38
59873 407397 pfam17284 Spermine_synt_N Spermidine synthase tetramerisation domain. This domain represents the N-terminal tetramerization domain from spermidine synthase. 53
59874 407398 pfam17285 PRMT5_TIM PRMT5 TIM barrel domain. This domain corresponds to the N-terminal TIM barrel domain from PRMT5 proteins. 248
59875 407399 pfam17286 PRMT5_C PRMT5 oligomerization domain. 173
59876 407400 pfam17287 POTRA_3 POTRA domain. This POTRA domain is found in ShlB-like proteins. 56
59877 375107 pfam17288 Terminase_3C Terminase RNAseH like domain. 154
59878 407401 pfam17289 Terminase_6C Terminase RNaseH-like domain. 153
59879 407402 pfam17290 Arena_ncap_C Arenavirus nucleocapsid C-terminal domain. This domain represents the C-terminal domain that contains 3'-5' exoribonuclease activity involved in suppressing interferon induction. This domain has an RNaseH-like fold. 177
59880 407403 pfam17291 M60-like_N N-terminal domain of M60-like peptidases. This accessory domain has a jelly roll topology. 107
59881 407404 pfam17292 POB3_N POB3-like N-terminal PH domain. This domain is found at the N-terminus of POB3 and related proteins. 93
59882 407405 pfam17293 Arm-DNA-bind_5 Arm DNA-binding domain. This domain is the N-terminal Arm DNA-binding domain found in various tyrosine recombinases. 87
59883 407406 pfam17294 Lipoprotein_22 Uncharacterized lipoprotein family. The proteins in this family all have an N-terminal lipoprotein attachment motif. No member of this family has been functionally characterized. 166
59884 407407 pfam17295 DUF5348 Domain of unknown function (DUF5348). 69
59885 340012 pfam17296 ArenaCapSnatch Arenavirus cap snatching domain. This domain represents the N-terminal domain of the Arenavirus polymerase that is involved in cap snatching during transcription initiation. 171
59886 407408 pfam17297 PEPCK_N Phosphoenolpyruvate carboxykinase N-terminal domain. Phosphoenolpyruvate carboxykinase catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate. 218
59887 407409 pfam17298 DUF5349 Family of unknown function (DUF5349). This is a family of unknown function found in Saccharomycetaceae. 362
59888 407410 pfam17299 DUF5350 Family of unknown function (DUF5350). This family is found in Euryarchaeota, predominantly in Methanomicrobia and Archaeoglobi. No known function for this family has been demonstrated. 57
59889 407411 pfam17300 FIN1 Filament protein FIN1. Fin1 is a kinetochore protein, predicted to contain two putative coiled-coil regions at its C-terminus. It is present in a filamentous structure associated with the spindle and spindle pole in dividing cells during anaphase. Fin1 is a substrate of S-phase cyclin-dependent kinase (CDK). It binds to PP1 creating the Fin1-PP1 complex which is recruited onto kinetochores promoting spindle assembly checkpoint (SAC) dis-assembly during anaphase. This is an important step in cell division since the kinetochore is the docking site for the spindle assembly checkpoint that monitors the defects in chromosome attachment and blocks anaphase onset. Fin1 has two RXXS/T sequences: S377 (RVTS), S526 (RKVS) that can be phosphorylated. Upon phosphorylation, interaction with other proteins such as Bmh1 and Bmh2 is promoted. However, de-phosphorylation during anaphase promotes the kinetochore recruitment of Fin1-PP1. 240
59890 407412 pfam17301 LpqV Putative lipoprotein LpqV. This is a family of cell surface proteins found in Mycobacterium with no known function. 117
59891 407413 pfam17302 DUF5351 Family of unknown function (DUF5351). This family of unknown function is found in Bacillales. 29
59892 407414 pfam17303 DUF5352 Family of unknown function (DUF5352). This is a family of unknown function found mostly in Eukaryota. 165
59893 407415 pfam17304 DUF5353 Family of unknown function (DUF5353). This is a family of unknown function found mostly in Fungi. Members of this family are predicted to contain 2 trans-membrane regions. 68
59894 375117 pfam17305 DUF5354 Family of unknown function (DUF5354). This family of unknown function is found mostly in Metazoa. 124
59895 407416 pfam17306 DUF5355 Family of unknown function (DUF5355). This family of unknown function is found in Saccharomycetales. 331
59896 407417 pfam17307 Smim3 Small integral membrane protein 3. This domain family can be found in Smim3 proteins (Small integral membrane protein 3) also known as NID67 (NGF-induced differentiation clone 67). It is a primary response gene, hypothesized to be involved in forming or regulating ion channels in neuronal differentiation. It is strongly induced by NGF (Nerve Growth Factor) and FGF (Fibroblast Growth Factor), both of which cause these cells to differentiate. The amino acid sequence of NID67 is strongly conserved among rat, mouse and human. This family of small membrane proteins is only 60 amino acids long and analysis of the predicted peptide sequence reveals a stretch of 29 hydrophobic and uncharged residues which very likely comprise a trans-membrane region. 60
59897 375119 pfam17308 Corazonin Pro-corazonin. This domain family is found in Corazonin proteins in Drosophila and other Arthropods. Corazonin (Crz) is a neuropeptide with a wide spectrum of biological functions in diverse insect groups. It was first discovered due to its myostimulatory activities on the heart muscle of Periplaneta americana and the hyper-neural muscle of Carausius morosus. In Drosophila melanogaster, Crz plays diverse roles ranging from a regulator of insulin producing cells in the brain to roles specific to tissues, life stages, and gender. 134
59898 407418 pfam17309 DUF5356 Family of unknown function (DUF5356). This is a family of unknown function found in Chromadorea. 135
59899 375121 pfam17310 DUF5357 Family of unknown function (DUF5357). This is a family of unknown function found in Cyanobacteria. Most of the family members are predicted to have several trans-membrane regions. 319
59900 407419 pfam17311 DUF5358 Family of unknown function (DUF5358). This family of unknown function is found in Proteobacteria. 161
59901 407420 pfam17312 Helveticin_J Bacteriocin helveticin-J. Bacteriocins are biologically active proteins or protein complexes that display a bactericidal mode of action towards closely related species. Bacteriocins produced by lactic acid bacteria are grouped into different classes. Class III of bacteriocins includes large heat-labile proteins. Lactobacillus helveticus 481 produces a 37-kDa bacteriocin called helveticin J which is a representative for Class III bacteriocins. 310
59902 407421 pfam17313 DUF5359 Family of unknown function (DUF5359). This is a family of unknown function found in Bacillales. Most of the family members are predicted to have one trans-membrane region. 56
59903 407422 pfam17314 DUF5360 Family of unknown function (DUF5360). This is a family of unknown function. It is present in Bacteria and most of the family members are predicted to have 4 trans-membrane regions. 127
59904 407423 pfam17315 FMP23 Found in Mitochondrial Proteome. FMP23 gene encodes a putative mitochondrial protein involved in iron-copper homoeostasis. It was observed to be induced in response to ATX1 deletion and high copper conditions. 119
59905 407424 pfam17316 PET10 Petite colonies protein 10. This family of proteins found in yeast does not have a clear function but is predicted to be involved in lipid metabolism. 254
59906 407425 pfam17317 MFA1_2 Mating hormone A-factor 1&2. The polypeptides encoded by the MFa1 and MFa2 genes are precursors of 36 and 38 amino acids, respectively. These mating pheromones secreted by S. cerevisiae a-cells, exhibit a single amino acid residue difference (the MFa1 gene product contains a valine instead of the leucine coded for by MFa2 at position 6 of the mature a-factor). The most significant feature of the primary a-factor gene products is the presence of a specific C-terminal motif, found in all known farnesylated proteins, representing a signal for modification of polypeptides with an isoprenoid group. In the case of both a-factor precursors, this specific sequence of amino acids is -CVIA. However, the general motif is referred to as a CAAX box, since the consensus sequence of amino acids present at the C-terminus of isoprenylated proteins consists of an invariable cysteine (C) residue followed by two aliphatic (A) amino acids and ending in a carboxyl-terminal residue of almost any (X) type. The specific CAAX sequence has also been shown to target the peptide for either farnesylation or geranylgeranylation. 34
59907 407426 pfam17318 DUF5361 Family of unknown function (DUF5361). This is a family of unknown function found in Bacteria. 36
59908 340035 pfam17319 DUF5362 Family of unknown function (DUF5362). This is a family of unknown function found in Bacteria. Most of the family members are predicted to have 2 trans-membrane regions. 94
59909 407427 pfam17320 DUF5363 Family of unknown function (DUF5363). This is a family of unknown function found in Gammaproteobacteria. 54
59910 407428 pfam17321 Vac17 Vacuole-related protein 17. Vac17 serves as an adaptor protein recruiting vacuole vesicles to the actin cable tracks by its dual interaction with Vac8 and the Myo2 motor protein. It is directly phosphorylated by Cdk1. Vac17 plays an important role in vacuole inheritance and segregation in cell division. 445
59911 407429 pfam17322 DUF5364 Family of unknown function (DUF5364). This family of unknown function is found in Saccharomycetales. 185
59912 407430 pfam17323 ToxS Trans-membrane regulatory protein ToxS. Gram negative bacteria such as Vibrio cholerae require the production of a number of virulence factors during infection. ToxS, a member of this domain family, is required for ToxR activity. The ToxR and ToxS regulatory proteins are considered to be at the root of the V. cholerae virulence regulon, called the ToxR regulon. ToxS serves as a mediator of ToxR function, perhaps by influencing its stability and/or capacity to dimerize, hence ToxS plays an important function in transcriptional activation of Vibrio cholerae virulence genes. 147
59913 340040 pfam17324 BLI1 BLOC-1 interactor 1. In yeast BLOC-1 consists of six subunits localized to the endosomes. In the absence of BLOC-1 subunits, the balance between recycling and degradation of selected cargoes is impaired. This family contains BLI1 (BLOC-1 interactor 1) protein, a subunit of the BLOC-1 complex which mediates endosomal maturation. 111
59914 407431 pfam17325 SPG4 Stationary phase protein 4. Saccharomyces cerevisiae responds to and copes with starvation by ceasing growth and entering a non-proliferating state referred to as stationary phase. Expression of SPG4 has been shown to be higher in stressed cells and stationary phase cells compared to active cells. It is not required for growth on non-fermentable carbon sources. 109
59915 407432 pfam17326 DUF5365 Family of unknown function (DUF5365). This is a family of unknown function found in Bacillaceae. 116
59916 407433 pfam17327 AHL_synthase Acyl homoserine lactone synthase. Members of this family are involved in quorum sensing processes. In gram negative bacteria, N-acylhomoserine lactones (AHLs) act as signals. As the bacterial density increases, AHLs accumulate, and once they reach a critical level (quorum), they interact with cognate receptor proteins, which then affect target gene expression. Some AHLs are synthesized by LuxM (AHL synthase) and homologs (VanM and opaM). LuxM enzymes use S-adenosyl-methionine (SAM) as one of their two substrates and are capable of using either acyl-acyl-carrier-protein (acyl-ACP) or acyl-coenzyme A (acyl-CoA) as the other substrate. VanM, the LuxM homolog, produces two auto-inducers C6HSL and 3OC6HSL. Both autoinducers are detected by the VanN receptor. The autoinducer HAI-1 is synthesized by the cytoplasmic enzyme LuxM. 376
59917 407434 pfam17328 DUF5366 Family of unknown function (DUF5366). This is a family of unknown function, found in Bacillales. Members of the family are predicted to have between 4 and 5 trans-membrane regions. 158
59918 407435 pfam17329 DUF5367 Family of unknown function (DUF5367). This bacterial family of proteins of unknown function is predicted to contain 3 or 4 trans-membrane regions. 98
59919 407436 pfam17330 SWC7 SWR1 chromatin-remodelling complex, subunit Swc7. The SWR1 complex is involved in chromatin-remodelling by promoting the ATP-dependent exchange of histone H2A for the H2A variant Htz1 in Saccharomyces cerevisiae or H2AZ in mammals. The SWR1 chromatin-remodelling complex is composed of at least 14 subunits and has a molecular mass of about 1.2 to 1.5 MDa. In S. cerevisiae there are core conserved subunits (ATPase; Swr1, RuvB-like; Rvb1 and Rvb2, Actin; Act1, Actin-related: Arp4 and Arp6, YEATS protein; Yaf9) and non-conserved subunits (Vps71 (Swc6), Vps72 (Swc2), Swc3, Swc4, Swc5, Swc7, Bdf1). Seven of the SWR1 subunits are involved in maintaining complex integrity and H2AZ histone replacement activity: Swr1, Swc2, Swc3, Arp6, Swc5, Yaf9 and Swc6. Arp4 is required for the association of Bdf1, Yaf9, and Swc4 and Arp4 is also required for SWR1 H2AZ histone replacement activity in vitro. Furthermore the N-terminal region of the ATPase Swr1 provides the platform upon which Bdf1, Swc7, Arp4, Act1, Yaf9 and Swc4 associate. It also contains an additional H2AZ-H2B specific binding site, distinct from the binding site of the Swc2 subunit. In eukaryotes the deposition of variant histones into nucleosomes by chromatin-remodelling complexes such as the SWR1 and INO80 complexes has many crucial functions including the control of gene regulation and expression, checkpoint regulation, DNA replication and repair, telomere maintenance and chromosomal segregation and as such represents a critical component of pathways that maintain genomic integrity. This entry represents the subunit Swc7; the smallest subunit of the SWR1 complex. Swc7 is not required for H2AZ binding. It associates with the N-terminus of Swr1, and the association of Bdf1 requires Swc7, Yaf9, and Arp4. 98
59920 407437 pfam17331 GFD1 GFD1 mRNA transport factor. Following transcription, mRNA is processed, packaged into messenger ribonucleoprotein (mRNP) particles, and transported through nuclear pores (NPCs) to the cytoplasm. Gfd1 is one of several factors that, although not essential for mRNA export, enhances the efficiency of the process, either by facilitating integration of different steps in the gene expression pathway or by increasing the rate of key steps. Gfd1 localizes to the cytoplasm and nuclear rim. It interacts with a number of components of the mRNA export machinery in yeast. Most notably, Gfd1 interacts with the Dbp5-activating protein, Gle1, the cytoplasmic nucleoporin Nup42/Rip1, the putative RNA helicase, Dbp5, and a protein implicated in mRNA export, Zds1. Gfd1 forms a complex with Nab2 both in vitro and in vivo in which Gfd1 binds to the N-terminal domain of Nab2. The crystal structure, together with complementary NMR data, indicated that residues 126-150 of Gfd1 form a single alpha-helix that binds primarily to helix 2 of Nab2-N. Gfd1 functions to co-ordinate Dbp5 and Gle1 to facilitate the removal of Nab2 from mRNPs at the cytoplasmic face of nuclear pores. 22
59921 407438 pfam17332 pXO2-11 Uncharacterized protein pXO2-11. This is a protein of unknown function found in Firmicutes and predicted to contain 2 trans-membrane regions. 89
59922 375136 pfam17333 DEFB136 Beta-defensin 136. Beta-defensins are small cationic peptides that have triple-stranded beta-sheet structure. They are characterized by the presence of multiple cysteine residues (forming three distinctive intramolecular disulfide bridges) and a highly similar tertiary structure known as the defensin motif. All beta-defensin genes encode a precursor peptide that consists of a hydrophobic, leucine-rich signal sequence, a pro-sequence, and a mature six-cysteine defensin motif at the carboxy terminus. They exhibit broad-spectrum antimicrobial properties and contribute to mucosal immune responses at epithelial sites. Several beta-defensins family members have been shown to play essential roles in sperm maturation and fertility in rats, mice and humans. In addition to the wide spectrum of antimicrobial activity, mammalian beta-defensins have been reported to have other roles in the immune system, such as the chemotactic ability for immature dendritic cells and memory T-cells via chemokine receptor-6 demonstrated by human beta-defensin-2. This entry contains beta-defensins such as DEFB136, the mouse homolog Defb42, and Ostricacin-3. 51
59923 407439 pfam17334 CsgA Sigma-G-dependent sporulation-specific SASP protein. Curli are extracellular functional amyloids that are assembled by enteric bacteria during biofilm formation and host colonization. The csg (curli specific gene) operon encodes major structural and accessory proteins that are required for curli production. The csgBAC operon encodes the major and minor curli fiber components, CsgA and CsgB, respectively. CsgA is secreted to the extracellular milieu as an unfolded protein and forms amyloid polymers upon interacting with the CsgB nucleator. CsgA is comprised of five imperfect repeating units with highly conserved glutamine and asparagine residues that are important for amyloid formation. Each repeating unit is predicted to form a strand-loop-strand motif. In vitro, CsgC inhibits CsgA amyloid formation at substoichiometric concentrations and maintains CsgA in a non-beta-sheet rich conformation, making CsgC an efficient and selective amyloid inhibitor. 83
59924 407440 pfam17335 IES5 Ino80 complex subunit 5. The INO80 chromatin remodeling complex is known to be related to DNA repair in yeast, mammals, and plants. In yeast, the INO80 complex is recruited to the DSBs (DNA double-strand breaks) through the direct interaction of its Nhp10 (non-histone protein 10) or Arp4 (Actin-Related Protein) subunits with phosphorylated histone H2A. However, the ortholog of yeast Nhp10 does not exist in mammals. The Nhp10 module consists of Nhp10, Ies1, Ies3, and Ies5. These yeast-specific subunits cross-link to the N-terminus of Ino80 and form a stable complex in-vitro, which helps high-affinity targeting of INO80 to nucleosome-binding. 110
59925 407441 pfam17336 DUF5368 Family of unknown function (DUF5368). This is a family of unknown function found in Proteobacteria and predicted to contain 2 trans-membrane regions. 111
59926 407442 pfam17337 Gal_GalNac_35kD Galactose-inhibitable lectin 35 kDa subunit. The role of the cell surface D-galactose (Gal)/N-Acetyl-D-galactosamine (GalNAc), lectin in the adhesion process has been demonstrated in Entamoeba histolytica, a protozoan parasite that causes amebiasis in humans. The Gal/GalNAc lectin is a heterotrimeric protein complex. It is composed of a 260 kDa heterodimer of trans-membrane disulphide-linked heavy 170 kDa subunit and glycosylphosphatidylinositol (GPI)-anchored light 31 kDa/35 kDa subunits. The light subunits are non-covalently associated with an intermediate subunit of 150 kDa. Inhibition of expression of 35 kDa subunit of Gal/GalNAc lectin inhibits the cytotoxic and cytopathic activity of E. histolytica, but no decrease in adherence capacity to mammalian cells was evident. Interestingly, a carbohydrate-binding activity has been reported for the 35 kDa light subunit of the lectin molecules of the closely related Entamoeba invadens. This entry is related to the light subunit where this domain of unknown function is present. The light subunit consists of several polypeptide chains with considerable antigenic homology. The two light (31/35 kDa) subunits of the lectin are present in two isoforms: the 31 kDa isoform is glycerolphosphatidylinositol (GPI) anchored; and the 35 kDa isoform is more highly glycosylated. 225
59927 375140 pfam17338 GP88 Gene 88 protein. This family of unknown function is found in Bacteria. 231
59928 407443 pfam17339 DUF5369 Family of unknown function (DUF5369). This is a family of unknown function found in Chromadorea. 107
59929 407444 pfam17340 DUF5370 Family of unknown function (DUF5370). This is a family of unknown function found in Bacillaceae. 63
59930 340057 pfam17341 DUF5371 Family of unknown function (DUF5371). This is a family of unknown function found in Euryarchaeota. 65
59931 407445 pfam17342 DUF5372 Family of unknown function (DUF5372). This family of unknown function is found in Bacteria. 78
59932 407446 pfam17343 DUF5373 Family of unknown function (DUF5373). This family of unknown function is found in Caenorhabditis. Members of this family are predicted to contain 4 trans-membrane regions. 182
59933 407447 pfam17344 DUF5374 Family of unknown function (DUF5374). This is a family of unknown function found in Pasteurellaceae. 40
59934 340061 pfam17345 DUF5375 Family of unknown function (DUF5375). This is a family of unknown function found in Enterobacteriaceae. 106
59935 375143 pfam17346 DUF5376 Family of unknown function (DUF5376). This is a family of unknown function found in Bacteria. 129
59936 407448 pfam17347 DUF5377 Family of unknown function (DUF5377). This is a family of unknown function found in Pasteurellaceae. 96
59937 407449 pfam17349 DUF5378 Family of unknown function (DUF5378). This is a family of unknown function which is found in Mycoplasmataceae. Family members are predicted to contain 7 trans-membrane regions. 282
59938 340065 pfam17350 DUF5379 Family of unknown function (DUF5379). This family of unknown function is found in Methanobacteria and Methanococci. Family members are predicted to have 3 trans-membrane regions. 90
59939 375145 pfam17351 DUF5380 Family of unknown function (DUF5380). This is a family of unknown function found in Rhabditida. 85
59940 340067 pfam17352 MFS18 Male Flower Specific protein 18. This domain family is found on MFS18 protein from Maize. MFS18 mRNA accumulates in the glumes and in anther walls, paleas and lemmas of mature florets. It is particularly associated with the vascular bundle in the glumes and encodes a polypeptide of 12 kDa, rich in glycine, proline and serine that has similarities with other plant structural proteins. There is no known function of this domain family in Maize or other Poaceae. 97
59941 407450 pfam17353 DUF5381 Family of unknown function (DUF5381). This is a family of unknown function found in Bacillales. 169
59942 375147 pfam17354 DUF5382 Family of unknown function (DUF5382). This is a family of unknown function found in Caenorhabditis. 418
59943 407451 pfam17355 DUF5383 Family of unknown function (DUF5383). This is a family of unknown function found in Bacillales. Members of this family are predicted to contain one trans-membrane region. 124
59944 407452 pfam17356 PBSX_XtrA Phage-like element PBSX protein XtrA. This is a family of unknown function found in Bacilli. 64
59945 375150 pfam17357 FIT1_2 Facilitor Of Iron Transport 1 and 2. Fit proteins (facilitor of iron transport) found on Saccharomyces cerevisiae cell wall are mannoproteins implicated in the siderophore-iron bound transport. This domain family can be found in FIT1 and FIT2 proteins in Saccharomycetaceae. The FIT1-3 cell wall mannoproteins are attached to the beta-glucan layer through a GPI (glycosylphosphatidylinositol) anchor. They are very rich in serine and threonine residues (40-50 % serine and threonine) and bear several short repeats of 6-7 amino acids. The exact domain function is unknown. 86
59946 407453 pfam17358 DUF5384 Family of unknown function (DUF5384). This is a family of unknown function found in Proteobacteria. 145
59947 407454 pfam17359 DUF5385 Family of unknown function (DUF5385). This is a family of unknown function found in Mycoplasmataceae. Family members are predicted to have one trans-membrane region. 217
59948 375152 pfam17360 DUF5386 Family of unknown function (DUF5386). This is a family of unknown function found in Chromadorea. 170
59949 340076 pfam17361 DUF5387 Family of unknown function (DUF5387). This is a family of unknown function found in Strongyloides. 222
59950 375153 pfam17362 pXO2-34 Family of unknown function. This is a family of unknown function found in Bacilli. 79
59951 375154 pfam17363 DUF5388 Family of unknown function (DUF5388). This is a family of unknown function found in Lactobacillales. 70
59952 407455 pfam17364 DUF5389 Family of unknown function (DUF5389). This is a family of unknown function found in Pasteurellaceae. Family members are predicted to have 3 trans-membrane regions. 104
59953 375156 pfam17365 DUF5390 Family of unknown function (DUF5390). This is a family of unknown function found in Caenorhabditis. 141
59954 375157 pfam17366 AGA2 A-agglutinin-binding subunit Aga2. The wall of Saccharomyces cerevisiae consists of mannoproteins, beta-glucans, and a small amount of chitin. Mannoproteins include Aga2p where this domain family is found. There are two main display systems for yeast, the agglutinin system and the flocculin system. The S. cerevisiae sexual agglutinins facilitate the mating between two types of cells, a and alpha. a-Agglutinin consists of two subunits, encoded by two unlinked genes, AGA1 and AGA2. The cell surface adhesion protein (Aga2), enhances agglutination between a and alpha cells. Optimal binding includes interactions of the alpha-agglutinin binding pocket with the Aga2p terminal carboxyl group. This O-mannosylated glycopeptide is doubly disulfide linked to Aga1p. The Aga2p half-cystines near the ends of the peptide are linked to two Aga1p Cys residues separated by only two residues. This closeness of the disulfide bonds stabilizes the alpha/beta structure in Aga2p. 58
59955 407456 pfam17367 NiFe_hyd_3_EhaA NiFe-hydrogenase-type-3 Eha complex subunit A. Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. Energy-converting [NiFe] hydrogenases function as ion pumps, catalyzing the reduction of ferredoxin with H2 driven by the proton-motive force or the sodium-ion-motive force. Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhaS, EhaT). Eha and Ehb catalyze the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases. Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure. This domain family can be found on the small membrane proteins that are predicted to be the EhaA trans-membrane subunits of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes. 94
59956 407457 pfam17368 YwcE Spore morphogenesis and germination protein YwcE. The ywcE gene codes for a holin-like protein that localizes to the cell and spore membranes. It is expressed at the onset of sporulation and transcription is repressed during growth by the transition-state regulator AbrB. YwcE is an 83-residue protein with three trans-membrane domains and a highly charged C-terminal tail. Moreover, YwcE has a dual start motif, which plays a role in the regulation of class I or class II holins. It is likely to have the N-terminus on the outside of the membrane and the C-terminus in the cytoplasm. This domain family is found in YwcE proteins in Bacilli. 85
59957 407458 pfam17369 DUF5391 Family of unknown function (DUF5391). This is a family of unknown function found in Bacilli. Family members are predicted to have 4 trans-membrane regions. 135
59958 375160 pfam17370 DUF5392 Family of unknown function (DUF5392). This is a family of unknown function found in Bacilli. Family members are predicted to have 2 trans-membrane regions. 139
59959 407459 pfam17371 DUF5393 Family of unknown function (DUF5393). This is a family of unknown function found in Trypanosomatidae. 666
59960 407460 pfam17372 DUF5394 Family of unknown function (DUF5394). This is a family of unknown function found in Rickettsiales. 205
59961 407461 pfam17373 DUF5395 Family of unknown function (DUF5395). This is a family of unknown function found in Archaea and Bacteria. 81
59962 340089 pfam17374 DUF5396 Family of unknown function (DUF5396). This is a family of unknown function found in Mycoplasma. 947
59963 340090 pfam17375 DUF5397 Family of unknown function (DUF5397). This is a family of unknown function found in Proteobacteria. 64
59964 375163 pfam17376 DUF5398 Family of unknown function (DUF5398). This is a family of unknown function found in Chlamydiales. 80
59965 340092 pfam17377 DUF5399 Family of unknown function (DUF5399). This is a family of unknown function found in Chlamydiales. 134
59966 407462 pfam17378 REC104 Meiotic recombination protein REC104. REC104 is one of several meiosis specific genes required for generating meiotic DSBs (double strand breaks). It is suggested that Rec102 and Rec104 directly promote DSB formation as part of a multiprotein complex with Spo11. Rec102 and Rec104 are mutually dependent for proper sub-cellular localization, and share a requirement for Spo11 and Ski8 for their recruitment to meiotic chromosomes. Moreover, Rec102 is required for Rec104 to accumulate to normal steady-state levels and to be properly phosphorylated. It is likely that Rec102 and Rec104 move freely in and out of the nucleus but are most stably sequestered there only when they can form a complex on chromosomes. This domain family is found on Rec104 proteins in yeast. 182
59967 407463 pfam17379 DUF5400 Family of unknown function (DUF5400). This is a family of unknown function found in Methanobacteria and Methanococci. Members of this family are predicted to contain 4 trans-membrane regions. 100
59968 375164 pfam17380 DUF5401 Family of unknown function (DUF5401). This is a family of unknown function found in Chromadorea. 722
59969 375165 pfam17381 Svs_4_5_6 Seminal vesicle secretory protein 4/5/6. There are seven major proteins involved in murine seminal vesicle secretion (SVS1-7). Mouse Svs2-Svs6 genes evolved by gene duplication and belong to the same gene family. This domain family is found in SVS4/5 and 6. SVS4 is a basic, thermostable, secretory protein synthesized by rat seminal vesicle epithelium under strict androgen transcriptional control. This protein has potent nonspecies-specific immunomodulatory, anti-inflammatory, and pro-coagulant activities that have been shown to be located in the N-terminal region of Svs4 (fragment 1-70). The N-terminal segment has a high amino-acid sequence similarity with the C-terminal segment 34-66 of uteroglobin, a rabbit steroid-inducible, cytokine-like, multifunctional, secreted protein. Furthermore, SVS4 acts as a sperm capacitation inhibitor, by interacting with SVS3 and SVS2. 91
59970 340097 pfam17382 ycf70 Uncharacterized protein ycf70. This is a family of unknown function found in Poaceae. 89
59971 375166 pfam17383 kleA_kleC Uncharacterized KorC regulated protein A. This is a family of unknown function found in Proteobacteria. 76
59972 407464 pfam17384 DUF150_C RimP C-terminal SH3 domain. This family represents the C-terminal domain from RimP. 70
59973 407465 pfam17385 LBP_M Lacto-N-biose phosphorylase central domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides. 221
59974 407466 pfam17386 LBP_C Lacto-N-biose phosphorylase C-terminal domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides. 53
59975 407467 pfam17387 Glyco_hydro_59M Glycosyl hydrolase family 59 central domain. 116
59976 407468 pfam17388 GP24_25 Tail assembly protein Gp24 and Gp25. Bacteriophages (viruses of bacteria) use a specialized organelle called a tail to deliver their genetic material and proteins across the cell envelope during infection. In phages the most complex part of these contractile injection systems, the base-plate, is responsible for coordinating host recognition or other environmental signals with sheath contraction. In T4 phage, 15 different proteins encoded by Gene Products (Gps), make up the base-plate and proximal region of the tail tube. The base-plate is divided into inner, intermediate and peripheral regions. Gp25 is located in the inner region of the base-plate. It interacts with Gp53 connecting the core bundle to the central hub and the tube, stabilizing the entire assembly. Gp25 has a structurally conserved loop (residues 47-49), mediating the interaction between LysM (residues 46-82 in Gp53) and the core bundle. Orthologues of Gp25 contain an EPR motif (Glu-Pro-Arg, residues 85-87 of Gp25), which interacts with the core bundle and points towards the region of the Gp27-Gp48 interface. In summary, Gp25 plays a critical role in sheath assembly and contraction. This domain family is found on Gp24 and Gp25 Mycobacterium phages. 132
59977 407469 pfam17389 Bac_rhamnosid6H Bacterial alpha-L-rhamnosidase 6 hairpin glycosidase domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria. 340
59978 379972 pfam17390 Bac_rhamnosid_C Bacterial alpha-L-rhamnosidase C-terminal domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria. 78
59979 407470 pfam17391 Urocanase_N Urocanase N-terminal domain. 127
59980 407471 pfam17392 Urocanase_C Urocanase C-terminal domain. 196
59981 340108 pfam17393 DUF5402 Family of unknown function (DUF5402). This is a family of unknown function found in Methanobacteria and Methanococci. 119
59982 340109 pfam17394 KleE Uncharacterized KleE stable inheritance protein. This domain family of unknown function is found in Proteobacteria. Family members are predicted to contain two trans-membrane regions. 108
59983 407472 pfam17395 DUF5403 Family of unknown function (DUF5403). This is a family of unknown function found in Actinobacteria. 96
59984 407473 pfam17396 DUF1611_N Domain of unknown function (DUF1611_N) Rossmann-like domain. 93
59985 340112 pfam17397 DUF5404 Family of unknown function (DUF5404). This is a family of unknown function found in Chordata. This domain is located downstream the N-terminal of Fip1 pfam05182. The Tsx gene resides at the X-inactivation centre and once thought to encode a protein expressed in testis. However, this was disputed upon further analysis. ORF and immunostaining analysis concluded that Tsx may be non-coding. Tsx long transcript is abundantly expressed in meiotic germ cells, embryonic stem cells, and brain. In vertebrates, Fip1 is the evolutionary precursor of eutherian Tsx, hence its location upstream from the Tsx gene. 145
59986 407474 pfam17398 NolB Nodulation protein NolB. This domain family of unknown function is found in Rhizobiales. Family members are involved in Nodulation (nodule development in plants). 151
59987 407475 pfam17399 DUF5405 Domain of unknown function (DUF5405). This domain family is found in Enterobacteriaceae. This protein may have a phage origin being found in bacteriophage P2. The majority of proteins have a conserved cysteine residue close to their C-terminus which may have functional significance. 94
59988 407476 pfam17400 DUF5406 Family of unknown function (DUF5406). This is a family of unknown function found in Bacteria. 114
59989 340116 pfam17401 DUF5407 Family of unknown function (DUF5407). This is a family of unknown function found in Chlamydiales. 74
59990 340117 pfam17402 DUF5408 Family of unknown function (DUF5408). This is a family of unknown function found in Helicobacteraceae. Family members are predicted to contain one trans-membrane region. 63
59991 407477 pfam17403 Nrap_D2 Nrap protein PAP/OAS-like domain. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 148
59992 407478 pfam17404 Nrap_D3 Nrap protein domain 3. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 160
59993 407479 pfam17405 Nrap_D4 Nrap protein nucleotidyltransferase domain 4. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 201
59994 407480 pfam17406 Nrap_D5 Nrap protein PAP/OAS1-like domain 5. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 158
59995 407481 pfam17407 Nrap_D6 Nrap protein domain 6. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 128
59996 407482 pfam17408 MCD_N Malonyl-CoA decarboxylase N-terminal domain. This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidized. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK). 85
59997 407483 pfam17409 MoaF_C MoaF C-terminal domain. MoaF protein is essential for the production of the monoamine-inducible 30kDa protein in Klebsiella. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha. It is conserved in Proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis. This entry corresponds to the C-terminal domain. 113
59998 407484 pfam17410 Stevor Subtelomeric Variable Open Reading frame. The parasite protein STEVOR (Subtelomeric Variable Open Reading frame) is an erythrocyte-binding protein recognizing Glycophorin C on the red blood cell (RBC) surface. The cytoplasmic domain of STEVOR is shown to interact with ankyrin complex at the erythrocyte skeleton. It is phosphorylated by protein kinase A (PKA) at a specific serine residue (S324). The N-terminal semi-conserved region of Stevor that is present in this domain is shown to specifically bind to a chymotrypsin-resistant RBC receptor. The expression of STEVOR in multiple parasite stages including merozoites suggests that STEVOR mediates multiple distinct functions in the parasitic infectious cycle. 275
59999 407485 pfam17411 SmaI Type II site-specific deoxyribonuclease. Family members of this domain are Type II site-specific deoxyribonuclease EC=3.1.21.4. The endonuclease SmaI recognizes and cleaves the sequence CCCGGG on DNA, yielding a blunt end scission. It has been used for the diagnosis of neurogenic muscle weakness, ataxia and retinitis pigmentosa disease or Leigh's disease. Due to its specificity in recognizing the cleavage site, it is used in Leigh's disease to specifically eliminate the mutant mitochondrial DNA (mtDNA), which coexists with the wild-type mtDNA (heteroplasmy). Only the mutant mtDNA, but not the wild-type mtDNA, is selectively restricted by the enzyme. By delivering the SmaI gene fused to a mitochondrial targeting sequence, specific elimination of the mutant mtDNA was demonstrated, resulting in restoration of both the normal intracellular ATP level and normal mitochondrial membrane potential. The same strategy has also been demonstrated for neurogenic muscle weakness, ataxia and retinitis pigmentosa (NARP), where a mutant mitochondrial DNA carrying a T8993G transversion has been targeted by using SmaI enzymes. 241
60000 407486 pfam17412 VraX Family of unknown function. This domain family is found in VraX proteins from Staphylococcus aureus. The vraX gene belongs to the vra operon together with the vraA gene encoding a long chain fatty acid-CoA ligase, which is up-regulated in the VISA (vancomycin-intermediate S. aureus). The gene product, a 55-amino-acid protein, is upregulated in the stress response to cell wall-active antibiotics and other surface-interactive molecules. VraX harbors a putative phosphorylation site, and could therefore be involved in regulatory processes within the cell. However, no exact function has been demonstrated. 55
60001 340128 pfam17413 VirB7 Outer membrane lipoprotein virB7. The type IV secretion systems (T4SSs) are ancestrally related to bacterial conjugation machines and are able to translocate proteins and/or protein-DNA complexes to the extracellular milieu or the host interior, in many cases contributing to the ability of the bacterial pathogen to colonize the host and evade its immune system. In the plant pathogen Agrobacterium tumefaciens, the T4SS allows the bacterium to transfer a segment of its tumor inducing (Ti-) plasmid DNA into plant cells, causing crown gall tumor disease. Proteins in the virB and virD operons catalyze processing of the T-DNA and its transfer to plants. The VirB proteins assemble a secretion apparatus spanning both bacterial membranes to allow transfer of DNA and protein substrates into plant cells. VirB7 and VirB8, along with VirB6, VirB9 and VirB10, are the core components of the Agrobacterium DNA translocation apparatus. Structural studies with the Escherichia coli plasmid pKM101 VirB homologs showed that three proteins, TraN (VirB7 homolog), TraO (VirB9) and TraF (VirB10), form a hetero-tetradecameric structure with 14-fold symmetry forming an outer membrane channel through which the substrates pass. VirB7 stabilizes VirB9, and in its absence bacteria do not accumulate VirB9, preventing assembly of the secretion machine. Members of the VirB7 family are typically 45-65 residues long, becoming 15-20 residues shorter after removal of the N-terminal signal sequence and covalent attachment to lipid molecules. 35
60002 407487 pfam17414 MatP_C MatP C-terminal ribbon-helix-helix domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the C-terminal ribbon-helix-helix domain. 60
60003 407488 pfam17415 NigD_C NigD-like C-terminal beta sandwich domain. This family of proteins is functionally uncharacterized. This family of proteins is found in Bacteroides species. Proteins in this family are typically between 234 and 260 amino acids in length. These proteins possess an N-terminal lipoprotein attachment site. The family includes NigD a protein found in the Nig operon that encodes a bacteriocin called nigrescin. It has been suggested that NigD may be the immunity protein for nigrescin (NigC) because it is directly downstream. This entry represents the C-terminal beta-sandwich domain of NigD. 120
60004 407489 pfam17416 Glycoprot_B_PH1 Herpesvirus Glycoprotein B. This domain has a PH-like fold. 210
60005 407490 pfam17417 Glycoprot_B_PH2 Herpesvirus Glycoprotein B PH-like domain. This domain corresponds to the second PH-like domain in herpesvirus glycoprotein B. 97
60006 407491 pfam17418 SdpA Sporulation delaying protein SdpA. Spore formation by the bacterium Bacillus subtilis is an elaborate developmental process that is triggered by nutrient limitation. Cells that have entered the pathway to sporulate produce and export a killing factor and a signaling protein that act cooperatively to block sister cells from sporulating and to cause them to lyse. The sporulating cells feed on the nutrients thereby released, which allows them to keep growing rather than to complete morphogenesis. Entry into sporulation is governed by the regulatory protein Spo0A (master regulator of sporulation). Upon Spo0A phosphorylation, it represses the expression of abrB, a negative regulator of skfABCEFGH and sdpAB, leading to the transcriptional activation of the sdpAB operon. The production of SdpAB is essential for the SDP toxin. SDP is a 42-amino-acid, ribosomally synthesized AMP which contains a disulfide bond between two cysteine residues located at the N-terminus. SDP acts by rapidly collapsing the proton motive force thereby inducing autolysin mediated lysis on neighboring species and non-biofilm producing B. subtilis cells (which do not produce SdpI) to respond by moving away, while autolysis would release nutrients that can be readily used to promote biofilm growth. SdpAB proteins are required to produce SDP from SdpC33-203. This domain family is found in SdpA proteins, which are predicted to be 158-amino-acid proteins suggested to be primarily cytoplasmic. 142
60007 407492 pfam17419 MauJ Methylamine utilization protein MauJ. This domain family is found in MauJ proteins. The exact function of the MauJ proteins is unknown but thought to be involved in methylamine utilization. MauJ is predicted to be a cytoplasmic protein. 282
60008 340135 pfam17420 Gp17 Superinfection exclusion protein, bacteriophage P22. Bacteriophages infect host cells by injecting their genome through the cell wall. To this end, tailed bacteriophages have evolved complex tail machines that extend from a unique capsid vertex, providing both an attachment point to the host surface, and a channel for genome-ejection through the cell envelope. Family members of this domain are putative gp17 proteins involved in the genome delivery tail machine in Enterobacteria phage P22 and Salmonella phage ViI. Gp17 found in other bacteriophages such as SPP1 (siphophage SPP1, a lytic Bacillus subtilis phage) has been identified as a tail completion protein adopting an alpha/beta fold, and found to be located at the interface between the head-to-tail connector and the tail of bacteriophage SPP1. 98
60009 340136 pfam17421 DUF5409 Family of unknown function (DUF5409). This domain of unknown function is found in Poxviridae. 88
60010 375184 pfam17422 DUF5410 Family of unknown function (DUF5410). This is a family of unknown function found in Rickettsia. 353
60011 407493 pfam17423 SwrA Swarming motility protein. This domain family is found in Bacillus. Members of this family are SwrA proteins involved in swarming motility (a multicellular movement of hyper-flagellated cells on a surface). SwrA is a key transcription factor facilitating this cascade. It acts synergistically with DegU to drive the fla/che operon encoding flagella components, chemotaxis constituents and the alternative sigma factor sigmaD, which is regarded as the primary event in the development of motility. LonA protease of Bacillus subtilis inhibits SwrA by proteolytically restricting its accumulation. SwrA does not contain any known DNA binding domain, and it has been shown to interact with the N-terminal domain of DegU. Anecdotally, in most laboratory strains, e.g. 168, the swrA coding sequence contains a nucleotide insertion that prematurely interrupts its reading frame, causing a non-swarming phenotype. 116
60012 407494 pfam17424 DUF5411 Family of unknown function (DUF5411). This is a family of unknown function found in Bacteria. 134
60013 407495 pfam17425 Arylsulfotran_N Arylsulfotransferase Ig-like domain. This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate. This domain has an Ig-like fold. 89
60014 407496 pfam17426 Putative_G5P Putative Gamma DNA binding protein G5P. This domain family is found in Gammaproteobacterial proteins. Members of the family are predicted to be G5P DNA binding proteins. Homologous proteins are found in pfam02303 107
60015 340142 pfam17427 Phi29_Phage_SSB Phage Single-stranded DNA-binding protein. DNA replication of phi29 and related phages takes place via a strand displacement mechanism, a process that generates large amounts of single-stranded DNA (ssDNA). Consequently, phage-encoded ssDNA-binding proteins (SSBs) are essential proteins during phage phi29-like DNA replication. Single-stranded DNA-binding proteins (SSBs) destabilize double-stranded DNA (dsDNA) and bind without sequence specificity, but selectively and cooperatively, to single-stranded DNA (ssDNA) conferring a regular structure to it, which is recognized and exploited by a variety of enzymes involved in DNA replication, repair and recombination. Phage phi29 protein p5 is the SSB protein active during phi29 DNA replication. It protects ssDNA against nuclease degradation and greatly stimulates dNTP incorporation during the phi29 DNA replication process. Binding of the SSB to ssDNA prevents non-productive binding of the viral DNA polymerase to ssDNA, and allows the release of DNA polymerase molecules that are already titrated by the ssDNA. This effect would be of particular importance in phi29-like DNA replication systems, where large amounts of ssDNA are generated and SSB binding to ssDNA could favor efficient re-usage of templates. This domain family is found in SSB proteins in phage phi-29; homologs are found in pfam00436. 123
60016 407497 pfam17428 DUF5412 Family of unknown function (DUF5412). This is a family of unknown function found in Bacteria. Members of this family have one or two predicted trans-membrane regions. 118
60017 407498 pfam17429 GP70 Gene 70 protein. This family of unknown function is found in Mycobacterium phage and Actinobacteria. 54
60018 407499 pfam17431 ypmT Uncharacterized ympT. This is a family of unknown function found in Bacillus. 62
60019 407500 pfam17432 DUF3458_C Domain of unknown function (DUF3458_C) ARM repeats. This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. 320
60020 407501 pfam17433 Glyco_hydro_49N Glycosyl hydrolase family 49 N-terminal Ig-like domain. Family of dextranase (EC 3.2.1.11) and isopullulanase (EC 3.2.1.57). Dextranase hydrolyzes alpha-1,6-glycosidic bonds in dextran polymers. This domain corresponds to the N-terminal Ig-like fold. 186
60021 340149 pfam17434 DUF5413 Family of unknown function (DUF5413). This is a family of unknown function found in Bradyrhizobiaceae. Family members contain 3 or 4 predicted trans-membrane regions. 133
60022 407502 pfam17435 DUF5414 Family of unknown function (DUF5414). This is a family of unknown function found in Chlamydiales. Family members have a known structure. 183
60023 340151 pfam17436 DUF5415 Family of unknown function (DUF5415). This is a family of unknown function found in Enterococcus. 66
60024 407503 pfam17437 DUF5416 Family of unknown function (DUF5416). This is a family of unknown function found in Campylobacteria. 173
60025 407504 pfam17438 DUF5417 Family of unknown function (DUF5417). This is a family of unknown function found in Proteobacteria. 91
60026 340154 pfam17439 DUF5418 Family of unknown function (DUF5418). This is a family of unknown function found in Methanocaldococcus jannaschii. Family members have three predicted trans-membrane regions. 151
60027 407505 pfam17440 Thiol_cytolys_C Thiol-activated cytolysin beta sandwich domain. This domain has an immunoglobulin-like fold. It is found at the C-terminus of the thiol-activated cytolysin protein. 102
60028 375192 pfam17441 DUF5419 Family of unknown function (DUF5419). This is a family of unknown function found in Rhodopseudomonas. 54
60029 340157 pfam17442 U62_UL91 Functional domain of U62 and UL91 proteins. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. Human cytomegalovirus (HCMV) is responsible for significant diseases in the developing fetus as well as in an immunocompromised host. During their productive cycle, herpesviruses have a regulated temporal cascade of gene expression that can be divided into three general stages: immediate-early (IE), early (E), and late (L). Following viral DNA replication, late viral genes that mainly encode structural proteins start to be transcribed, ultimately leading to the assembly and release of infectious particles. This domain family is found in Human herpesvirus 6A and 6B (HHV-6A/B) as well as HCMV. Family members are shown to be involved in late gene expression, such as UL91 in Human Cytomegalovirus. This functional domain is located at the N-terminus (amino acids 1-71) of full-length UL91. It has been found to suffice for transcriptional activation of true-late genes within the nucleus of infected cells. In other words, UL91 is fully functional as a 71-aa N-terminal polypeptide, and this small 71-aa polypeptide contains all protein-protein interaction motifs crucial to mediate transcriptional activation. 65
60030 340158 pfam17443 pXO2-72 Uncharacterized protein pXO2-72. This is a family of unknown function found in Bacilli. 62
60031 340159 pfam17444 yhdX Uncharacterized protein YhdX. This is a family of unknown function found in Bacillus. 33
60032 340160 pfam17445 Mfa1 Mating factor A1. Many pathogenic fungi undergo morphological changes in order to infect their hosts. The Ustilago maydis pathogenic cycle starts when two mating compatible haploid yeast cells recognize each other via a pheromone-receptor system which is encoded by two sets of genes a and b. The a locus (a1 and a2) controls the cell fusion by encoding an intercellular recognition system consisting of precursors (mfa1 and mfa2) and receptors (pra1 and pra2) of lipopeptide pheromones. The open reading frame codes for a 42-amino acid precursor, which is processed to a shorter peptide of 13 amino acids. The terminal CAAX motif is typical of farnesylated fungal pheromones, in which the last three amino acids are removed during farnesylation of the cysteine residue. This terminal cysteine is known to be O-methylated in several fungal pheromones. Mating leads to the formation of a dikaryon filament, whose apical tip differentiates into a specialized structure for plant penetration known as the appressorium. Once inside the plant, U. maydis proliferates, inducing the formation of tumors and eventually develops into diploid spores. This mating process requires cross-talk between cAMP and mitogen-activated protein kinase (MAPK) signaling. Upstream regulation of a locus has been demonstrated where Hos2 (Histone deacetylases (HDACs) plant homolog) directly regulates the expression of U. maydis mating-type genes downstream of the cAMP-PKA pathway. Furthermore, pheromone recognition blocks cell cycle progression in U. maydis cells in order to prepare mating partners for conjugation where cells undergo arrest in G2 phase. This entry relates to the domain found in Mfa1 proteins in Ustilago maydis and U. hordei. 43
60033 407506 pfam17446 ltuA Late transcription unit A protein. This is a domain of unknown function found in Chlamydia. 46
60034 407507 pfam17447 ykpC Uncharacterized protein YkpC. This is a family of unknown function found in Bacillus. 42
60035 340163 pfam17448 yqaH Uncharacterized protein YqaH. This is a family of unknown function found in Bacillus. 88
60036 340164 pfam17449 yrzK Uncharacterized protein YrzK. This is a family of unknown function found in Bacillus. 54
60037 407508 pfam17450 Melibiase_2_C Alpha galactosidase A C-terminal beta sandwich domain. 86
60038 407509 pfam17451 Glyco_hyd_101C Glycosyl hydrolase 101 beta sandwich domain. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by a S. pneumoniae protein, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor. This domain represents the C-terminal beta sandwich domain. 111
60039 340167 pfam17452 YnfE Uncharacterized protein YnfE. This is a family of unknown function found in Bacillus. 78
60040 340168 pfam17453 YhdK Sigma-M inhibitor protein. This is a domain of unknown function found in Sigma M inhibitor proteins YhdK. In Bacillus subtilis, sigM (yhdM) gene, is required for growth and survival after salt stress. Expression of sigM is positively autoregulated and is controlled by growth phase and medium composition. SigM-dependent transcription is regulated by the products of both the yhdL and the yhdK genes, which are co-transcribed with the sigM gene. The small hydrophobic protein YhdK, appears to interact with the trans-membrane domain of YhdL, suggesting some specific role for YhdK in the anti-sigma function of YhdL. 96
60041 375196 pfam17454 Bee_toxin Honey bee toxin. Bee venom contains a variety of peptides such as melittin, apamin, adolapin and mast cell degranulating peptide. Bee venom has been used in the treatment of major neurodegenerative disorders, including Alzheimer's Disease, Parkinson's Disease, Epilepsy, Multiple Sclerosis and Amyotrophic Lateral Sclerosis. Secondary structure analysis of apamin, mast cell degranulating peptide, tertiapin and secapin have been studied. The predicted structure for mast cell degranulating peptide is almost spherical with the eight positive centers evenly distributed over the surface. It has also been suggested that these four peptides share a common folding pattern, which is centred on a beta-turn covalently linked to an alpha-helical segment by two disulphide links. It is further suggested that apamin, mast cell degranulating peptide and tertiapin form a single molecular class. This domain family is found in apamin, mast cell degranulating peptide and tertiapin. Apamin, the most widely studied member of this family has been shown to be a selective blocker of small-conductance Ca2+-activated K+ (KCa2.X or SK) channels. 49
60042 407510 pfam17455 LtuB Late transcription unit B protein. This is a family of unknown function which is specific to Chlamydia late transcription unit B protein. 79
60043 340171 pfam17456 TcpS Toxin-coregulated pilus protein S. The toxin-coregulated pilus (TCP) and cholera toxin (CT) are two main virulence factors produced by V. cholerae, which allow the bacterium to colonize and establish an infection in a host and to cause the physical symptoms of the disease, respectively. Increased expression of the TCP, a type IV pilus expressed by the tcp operon (tcpABQCRDSTEF) located on the Vibrio pathogenicity island (VPI), has been associated with enhanced attachment and is essential for colonization of the intestinal epithelium. This domain of unknown function is found in TcpS proteins in Vibrionaceae such as Vibrio cholerae. 152
60044 407511 pfam17457 DUF5420 Family of unknown function (DUF5420). This is a domain of unknown function found in Gammaproteobacteria such as Haemophilus influenzae. 185
60045 340173 pfam17458 DUF5421 Family of unknown function (DUF5421). This is a domain of unknown function found in Chlamydia. 284
60046 340174 pfam17459 DUF5422 Family of unknown function (DUF5422). This is a family of unknown function found in Chlamydia. Members of this family have 1-4 predicted trans-membrane regions. 153
60047 340175 pfam17460 RP854 Uncharacterized protein RP854. This is a family of unknown function found in Rickettsia. Members of this family are predicted to have one trans-membrane region. 212
60048 375197 pfam17461 DUF5423 Family of unknown function (DUF5423). This is a domain of unknown function found in Chlamydia. Family members have 4 predicted trans-membrane regions. 348
60049 340177 pfam17462 DUF5424 Family of unknown function (DUF5424). This is a family of unknown function specific to Rickettsia amblyommii. 175
60050 340178 pfam17463 Gp79 Gene Product 79. This is a domain of unknown function found in Mycobacterium phage. Family members include the full Gp79 protein found in Mycobacteriophage L5. Mycobacteriophage L5 is a phage isolated from Mycobacterium smegmatis. It forms stable lysogens in M. smegmatis and has a broad host range among the pathogenic mycobacteria. L5 encodes gene products (gp) toxic to the host M. smegmatis. Expression of gp79 interferes with the cell membrane or cell-wall synthesis of M. smegmatis, leading to altered cell morphology. It also has a bactericidal effect on E. coli. The N-terminal segment of gp79 (amino acids 1-41) shares sequence similarity with the signal peptide of the D-alanyl-D-alanine carboxypeptidase of Bacillus licheniformis. This enzyme removes C-terminal D-alanyl residues from sugar-peptide cell-wall precursors and is also a penicillin-binding protein (PBP). The homology of the hydrophobic N-terminal part of gp79 to a PBP signal peptide may indicate an interaction of gp79 with proteins or metabolites involved in the peptidoglycan synthesis of M. smegmatis. 51
60051 340179 pfam17464 Pns11_12 Non-structural protein 11 and 12. This is a domain of unknown function found in Phytoreovirus. Family members include the Rice dwarf virus Pns11 and Pns12. Rice dwarf virus (RDV) is an icosahedral, double-layered particle. The viral genome consists of 12 segmented dsRNAs that encode seven structural (P1, P2, P3, P5, P7, P8 and P9) and five non-structural (Pns4, Pns6, Pns10, Pns11 and Pns12) proteins. Pns11 is known to bind nucleic acids and Pns12 is a phosphorylated protein. The non-structural proteins Pns6, Pns11 and Pns12 of RDV are the major constituents of the matrix of viral inclusions in which the assembly of progeny virions and the synthesis of viral RNA are thought to occur. 205
60052 340180 pfam17465 Putative_CCL4 Chemokine-like protein, HHV-6 U83 gene product. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. HHV6 A/B encode two putative chemokine receptors and a chemokine-like protein. The HHV6 U83 gene encodes a CC chemokine, which functions as a highly selective and efficacious agonist for the human CCR2 receptor both in respect of signal transduction and the ability to induce chemotaxis. Homologs of the U83 gene products are found in Human cytomegalovirus encoded chemokines vCXC1 and vCXC2. HHV-6 CCL4 contains a region with the CC/CX3C chemokine motif and a glycosaminoglycan (GAG)-binding epitope, BBXB (B being a basic residue), found right before the third Cys residue, which very likely forms a disulfide bridge back to the first Cys of the protein. This gene is the only HHV-6A/B divergent gene that is specific for these viruses. The U83 chemokine gene is distinct between HHV-6A and HHV-6B strains, encoding up to 13% amino acid differences. The HHV-6A (U83A) and HHV-6B (U83B) chemokines have distinct specificities which determine chemoattraction or diversion of different leukocyte subsets for infection or immune evasion, thus an early component of cellular tropism as well as mediator of innate immunity. U83 also has a varied gene structure, with N-terminal length variation determining production of the encoded mature secreted chemokine, coupled with control by cell-directed splicing which truncates the chemokine gene early in replication to encode an antagonist. The long active form of U83A has a unique broad specificity for receptors CCR1, CCR4, CCR5, CCR6 and CCR8 present on plasmacytoid and myeloid dendritic and monocyte/macrophage antigen presenting cells, as well as both TH1 and TH2 skin homing lymphocytes and NK cells; it is also amongst the highest affinity ligands for CCR5 and inhibits HIV-1 binding at this coreceptor. U83A can both block and divert human chemokine action while occupying the human chemokine receptors. 97
60053 340181 pfam17466 NinD Family of unknown function. This is a family of unknown function found in Enterobacteria phage P22 and Enterobacteria phage lambda. 57
60054 340182 pfam17467 E7R Viral Protein E7. This domain family is found in Vaccinia and Variola viruses. Family members include the E7R gene product. Vaccinia virus (VV) is a large double-stranded DNA virus that replicates in the cytoplasm of infected cells. Many viruses express proteins that are modified by myristic acid. Myristic acid is a 14-carbon fatty acid that is cotranslationally transferred to the penultimate glycine residue found within the consensus sequence MGXXX(S/T/A/C/N) (where X is any amino acid) at the amino terminus of target proteins. E7R proteins in Vaccinia virus have been shown to be myristylated. The expressed E7R protein has also been found to reside within mature infectious virions. 60
60055 340183 pfam17468 Gp52 Phage protein Gp52. This domain of unknown function is found in Mycobacterium phage. 61
60056 340184 pfam17469 Gp68 Phage protein Gp68. This is a domain of unknown function found in Mycobacterium phage. 78
60057 340185 pfam17470 Gp45_2 Phage protein Gp45.2. This is a domain of unknown function found in Myoviridae. 58
60058 340186 pfam17471 Gp63 Hypothetical phage protein Gp63. This is a family of unknown function found in Mycobacterium. 73
60059 340187 pfam17472 DUF5425 Family of unknown function (DUF5425). This is a family of unknown function found in Borreliella burgdorferi. 76
60060 340188 pfam17473 DUF5426 Family of unknown function (DUF5426). This is a family of unknown function found in Mycoplasma. 137
60061 340189 pfam17474 U71 Tegument protein UL11 homolog. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. During their productive cycle, herpesviruses have a regulated temporal cascade of gene expression that can be divided into three general stages: immediate-early (IE), early (E), and late (L). Following viral DNA replication, late viral genes that mainly encode structural proteins start to be transcribed, ultimately leading to the assembly and release of infectious particles. This domain family is found in tegument protein UL11 homolog (U71) in HHV-6A/B. It is a myristylated virion protein which is expressed at the early stage of the lytic cycle. 52
60062 340190 pfam17475 Binary_toxB_2 Clostridial binary toxin B/anthrax toxin PA domain 2. This domain forms the middle beta sandwich domain in anthrax toxin. 218
60063 375198 pfam17476 Binary_toxB_3 Clostridial binary toxin B/anthrax toxin PA domain 3. This entry represents the beta-grasp domain in anthrax protective antigen. 102
60064 407512 pfam17477 Rota_VP4_MID Rotavirus VP4 membrane interaction domain. This entry represents the VP4 membrane interaction domain. 225
60065 340193 pfam17478 VP4_helical Rotavirus VP4 helical domain. 291
60066 407513 pfam17479 DUF3048_C Protein of unknown function (DUF3048) C-terminal domain. Some members in this bacterial family of proteins are annotated as YerB. However currently no function is known. This entry represents the C-terminal domain. 114
60067 340195 pfam17480 AlphaC_C Alpha C protein C terminal. The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N-terminal region of ACP is associated with virulence and forms a beta sandwich and a three helix bundle. This entry is the C-terminal domain of ACP. The C-terminal domain (45 amino acids) contains an LPXTG peptidoglycan-anchoring motif characteristic of cell-wall anchored surface proteins. 71
60068 407514 pfam17481 Phage_sheath_1N Phage tail sheath protein beta-sandwich domain. This entry represents the N-terminal beta sandwich domain found in a variety of phage tail sheath proteins. 99
60069 407515 pfam17482 Phage_sheath_1C Phage tail sheath C-terminal domain. This entry represents the C-terminal domain in a variety of phage tail sheath proteins. 104
60070 407516 pfam17483 TbpB_C C-lobe handle domain of Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. This domain family is found in the handle domain of the C lobe (domain C) of TbpB proteins. It consists of a squashed six-stranded beta sheet flanked by two antiparallel beta strands and has no supporting alpha helix as in the N lobe. 99
60071 407517 pfam17484 TbpB_A N-Lobe handle Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. The 4-residue conserved LSAC motif found at the amino terminus of TbpB represents a prototypical lipobox, with the cysteine residue serving as the first amino acid in the mature protein which is subsequently modified by the addition of a diacyl glycerol. A second conserved motif of interest is located two amino acids downstream of the LSAC site. This region consists of four glycine residues in tandem. Deletion of the conserved polyglycine motif has significant negative effects on growth in certain conditions, while mutational analysis revealed that the LSAC motif constituting the lipobox of TbpB is necessary for lipidation and hence tethering of TbpB to the bacterial surface. This domain family is found on the N-terminal region of TbpB proteins, which comprises the N lobe handle consisting of a four-stranded antiparallel beta sheet held together by a short surface-exposed alpha helix. Tf-binding activity primarily resides in the TbpB N lobe. 136
60072 340200 pfam17485 SatRNA_48 Satellite RNA 48 kDa protein. Satellite RNAs (satRNAs) are short RNA molecules, usually <1,500 nt, that depend on cognate helper viruses for replication, encapsidation, movement, and transmission, but most share little or no sequence homology to the helper viruses. In contrast, satellite viruses are satRNAs that encode and are encapsidated in their own capsid proteins (CPs). Members of this family are nonstructural proteins of 48 kDa in size which have been shown to be involved in the replication of the satRNA. They are found in tomato black ring virus (TBRV). 299
60073 340201 pfam17486 Cys_Knot_tox Cystine knot toxins. This family is found in Araneae (spiders) and family members are venomous peptides with 4 disulfide bonds. Cystine knot toxins (CKTs) are small, compact molecules cross-linked by three to five disulfide bonds and are often the key contributors to the activity and potency of the venom. While these disulfide-rich peptides can adopt a number of different structural motifs, three of the most observed structural scaffold motifs are the inhibitor cystine knot (ICK), the disulfide-directed beta-hairpin (DDH), and the Kunitz motif. These venomous peptides mainly act on membrane proteins in electro-excitable cell membranes by modulating voltage-activated sodium (NaV), calcium (CaV), and potassium (KV) channels, acid-sensing ion channels (ASICs), transient receptor potential (TRP) channels, and mechanosensitive channels (MSCs). 70
60074 407518 pfam17487 RPS12 Ribosomal protein S12. This is a family of unknown function. Family members are ribosomal proteins found in the mitochondria (RPS12). Homologous RPS12 proteins in bacterial ribosomes participate in stabilizing the second base pair of the codon-anticodon duplex in the A site and are likely to be critical for the fidelity of the decoding process. A similar role can be anticipated for this protein in mitochondrial ribosomes. This has been shown where the product of edited RPS12 mRNA translation represented a component of the mitoribosome's small subunit. 87
60075 407519 pfam17488 Herpes_glycoH_C Herpesvirus glycoprotein H C-terminal domain. Herpesvirus glycoprotein H (gH) is a virion associated envelope glycoprotein. Complex formation between gH and gL has been demonstrated in both virions and infected cells. This entry represents the C-terminal domain. 135
60076 407520 pfam17489 Tnp_22_trimer L1 transposable element trimerization domain. This entry represents the trimerization domain. 43
60077 407521 pfam17490 Tnp_22_dsRBD L1 transposable element dsRBD-like domain. This entry represents the double stranded RNA-binding-like domain. 65
60078 340206 pfam17491 m_DGTX_Dc1a_b_c Spider Toxins mu-diguetoxin-1 a, b and c. This family has members that are 56-59 residue mu-diguetoxin-1 toxins, which have been isolated from the weaving spider, Diguetia canities. These toxins were isolated as a result of their potent insect paralytic activities, designated mu-DGTX-Dc1a to -Dc1c (formerly DTX9.2, DTX11 and DTX12). Family members such as beta-Diguetoxin-Dc1a (Dc1a) have been structurally characterized and shown to have disulfide bonds which form a classical inhibitor cystine knot (ICK) motif in which the Cys13-Cys26 and Cys20-Cys40 disulfide bonds and the intervening sections of the polypeptide backbone form a 23-residue ring that is pierced by the Cys25-Cys54 disulfide bond. This ICK motif is commonly found in spider toxins, and this particular scaffold provides these peptides (so-called knottins) with an unusually high degree of chemical, thermal and biological stability. Dc1a contains an additional disulfide bond (Cys42-Cys52) that appears to serve as a molecular staple which limits the flexibility of a disordered serine-rich hairpin loop. The extended N-terminus of Dc1a along with an unusually large loop between Cys26 and Cys40 enables the formation of an N-terminal three-stranded antiparallel beta-sheet that is not found in any other knottin. The molecular surface of Dc1a contains a relatively uniform distribution of charged residues; moreover, there are no distinct clusters of hydrophobic residues that might mediate an interaction with lipid bilayers. 55
60079 340207 pfam17492 D_CNTX Delta Ctenitoxins. This family includes peptides isolated from Phoneutria such as delta-ctenitoxins. Members of the CNTX-Pn1a family and its paralogs (delta-CNTX-Pn1b through delta-CNTX-Pn1e) of Phoneutria toxins have complex effects on sodium channels but their primary effect appears to be an inhibition of channel inactivation, a pharmacology similar to that of the delta-atracotoxins and delta-conotoxins. Orthologous toxins such as delta-CNTX-Pr1/PK1 and Pn2 are also family members, some of which act by blocking the calcium channels. Delta-CNTX-Pn1a and delta-CNTX-PN2a are 48-amino-acid polypeptides, with 5 disulfide bridges. The latter has a complex pharmacology that results in inhibition of NaV channel inactivation and a hyperpolarizing shift in the channel activation potential. 48
60080 340208 pfam17493 DUF5428 Family of unknown function (DUF5428). This is a family of unknown function found in Betanecrovirus. 63
60081 340209 pfam17494 DUF5429 Family of unknown function (DUF5429). This is a family of unknown function. 76
60082 375203 pfam17495 DUF5430 Family of unknown function (DUF5430). This is a family of unknown function found in Feline immunodeficiency virus. 106
60083 407522 pfam17496 DUF5431 Family of unknown function (DUF5431). This is a family of unknown function found in Enterobacteriaceae. 70
60084 340212 pfam17497 DUF5432 Family of unknown function (DUF5432). This is a family of unknown function found in Orthopoxvirus. 74
60085 340213 pfam17498 DUF5433 Family of unknown function (DUF5433). This is a family of unknown function found in Orthopoxviruses. 67
60086 375204 pfam17499 Pilosulin Ant venom peptides. Members of this family are found in Myrmecia pilosula and represent a group of peptides that display cytotoxic, hypotensive, histamine-releasing and antimicrobial activities. Pilosulins constitute the major allergens of the venom of Myrmecia pilosula (Myrmeciinae). Pilosulin 1 is a long linear peptide (57 amino acids) and displays haemolytic and cytolytic activities. Pilosulins 3, 4, and 5 are a group of homo- and heterodimeric peptides. Pilosulin 1 is expressed in the venom sac of ants in the form of a propeptide (112 kDa) which undergoes extensive post-translational modification. It is proposed to give rise to a family of six homologous C-terminal peptide sub-sequences containing between 27 and 56 amino acid residues in the final venom. Furthermore, it is found to form random coils and have minimal secondary structure. However, in increasingly hydrophobic conditions, approximately one-third of the peptide forms alpha-helix secondary structures. Studies on human erythrocytes and lymphocytes, show that Pilosulin 1 is highly lytic towards leukocytes and that the NH2-terminus (20 N-terminal residues) of Pilosulin 1 is critical for its cytotoxic activity and antimicrobial activities. Another family member Pilosulin 3, is a heterodimer of Pilosulin 3a and Pilosulin 3b linked in anti-parallel fashion through 2 disulfide bridges. This peptide is the most abundant peptide found in native venom. 74
60087 340215 pfam17500 Colicin_K Colicin-K immunity protein. Colicins are bacterial toxins produced by Escherichia coli strains and are active against E. coli or related strains. These bacterial antibiotic toxins play an important role in the E. coli colonization of environmental niches. Members of this family are Colicin K peptides which require TolA, TolB, TolQ, and TolR proteins for translocation across the periplasm and binding to the outer membrane receptor. Colicin K uses the Tsx nucleoside-specific receptor for binding at the cell surface, the OmpA protein for translocation through the outer membrane, and the TolABQR proteins for the transit through the periplasm. The N-terminal domain interacts with components of its import machinery, including the TolB and TolQ proteins. 96
60088 340216 pfam17501 Viral_RdRp_C Viral RNA-directed RNA polymerase. This is the C-terminus of RNA-directed RNA polymerase (Protein A) found in Alphanodaviruses such as Flock House Virus (FHV). FHV is a positive-stranded RNA virus with a bipartite genome of RNAs, RNA1 and RNA2. RNA1 encodes protein A, which is the catalytic subunit of the RNA-dependent RNA polymerase (RdRp) and functions as the sole viral replicase protein responsible for RNA replication. FHV protein A also possesses a terminal nucleotidyl transferase (TNTase) activity, which is able to restore the nucleotide loss at the 3'-end initiation site of RNA template to rescue RNA synthesis initiation. It has also been reported that FHV protein A replicates viral RNA in concert with the mitochondrial outer membrane and other viral or cellular factors and mediates the formation of viral RNA replication complexes and small spherules by inducing membrane rearrangement. This domain is also found in B1 proteins which are encoded by the subgenomic RNA3 during FHV replication. The function of translated B1 protein is poorly defined, but may be important for maintenance of RNA replication. 101
60089 340217 pfam17502 DUF5434 Family of unknown function (DUF5434). This is a family of unknown function found in Varicellovirus. 189
60090 407523 pfam17503 DUF5435 Family of unknown function (DUF5435). This is a family of unknown function found in Varicellovirus. 208
60091 340219 pfam17504 DUF5436 Family of unknown function (DUF5436). This is a family of unknown function found in Orthopoxvirus. 79
60092 375205 pfam17505 DUF5437 Family of unknown function (DUF5437). This is a family of unknown function found in Alphabaculovirus. 60
60093 340221 pfam17506 DUF5438 Family of unknown function (DUF5438). This is a family of unknown function found in Orthopoxvirus. 71
60094 340222 pfam17507 DUF5439 Family of unknown function (DUF5439). This is a family of unknown function found in Orthopoxvirus. 75
60095 340223 pfam17508 MccV Microcin V bacteriocin. Family members are bacterial microcin-V peptides MccV, also known as colicin V. MccV was the first antibiotic substance reported to be produced by E. coli. This antibacterial agent was initially named colicin V (ColV). However, on account of several characteristics (low molecular mass, non-inducible production, and dedicated export system), it became classified within the microcins. The structural gene cvaC encodes the 103-aa MccV precursor. The dedicated export system of MccV has been well characterized and involves two genes that form the second operon. The MccV protein has an N-terminal double glycine motif which precedes the cleavage site for the precursor protein. 104
60096 375206 pfam17509 DUF5440 Family of unknown function (DUF5440). This is a family of unknown function found in bacteria. 93
60097 375207 pfam17510 Gp44 Mycobacterium phage hypothetical protein Gp44.1. This is a family with unknown function. Family members are hypothetical proteins found in Mycobacterium phages. 107
60098 340226 pfam17511 Mobilization_B Mobilization protein B. This is a family of unknown function found in Bacteria. Family members include Mobilization protein B (MobB). MobB contains a putative membrane-spanning domain, and might be involved in anchoring or presenting MobA, and the covalently-linked plasmid DNA, to the conjugative pore for subsequent export. In agreement with this, MobB has been shown to be associated with the membrane. Deletion of the membrane-spanning domain disrupts this association and decreases the frequency of both type IV transport and plasmid mobilization. MobB is one out of three proteins encoded by RSF1010 that are required for its mobilization along with MobA and MobC. MobB encoded by the broad-host-range plasmid R1162 is required for its efficient transfer by conjugation. The C-terminal half of the protein contains a membrane domain essential for transfer, while the other, functionally active region of MobB, identified by mutagenesis, is at the N-terminal end. One mutation affecting this region inhibits replication, suggesting that this part of the protein is contacting and sequestering the relaxase-linked primase. A model has been proposed in which MobB molecules are anchored in the membrane at one end and engage the relaxase at the other. This arrangement is suggested to increase the transfer frequency by raising the probability of contact between the relaxase and the membrane-embedded coupling protein for type IV secretion. 136
60099 340227 pfam17512 Sh_2 Metapneumovirus Small hydrophobic protein. This family is found in SH (small hydrophobic) proteins present in Metapneumovirus such as the Avian metapneumovirus (AMPV), a paramyxovirus that has three membrane proteins (G, F, and SH). Among them, the SH protein is a small type II integral membrane protein. It is located in both the plasma membrane as well as within intracellular compartments. AMPV type C- SH protein localizes in the endoplasmic reticulum (ER), Golgi, and cell surface, and is transported through ER-Golgi secretory pathway. AMPV SH protein is modified by N-linked glycans and can be released into the extracellular environment. Furthermore, it has been shown that glycosylated AMPV SH proteins form homodimers through cysteine-mediated disulfide bonds. 174
60100 375208 pfam17513 DUF5441 Family of unknown function (DUF5441). This is a family of unknown function found in Mastadenoviruses. 189
60101 340229 pfam17514 DUF5442 Family of unknown function (DUF5442). This is a family of unknown function found in Chironomus. 107
60102 340230 pfam17515 CPV_Polyhedrin Cypovirus polyhedrin protein. This family is found in polyhedrin proteins of Cypoviruses. These viruses possess a single capsid layer with turrets and are commonly embedded in crystalline occlusion bodies called polyhedra, which are formed in the cell cytoplasm and mainly composed of a single virus-encoded protein, polyhedrin. Cypoviruses have been classified into 21 distinct types. Within each type the amino acid sequences of polyhedrins are highly conserved, whilst between types there is little conservation. Structural analysis and comparison of the different polyhedrins reveals five variable regions: the N-terminal loop, connections between secondary structures (H2 and H3, beta-E and beta-F, beta-F and beta-G, beta-G and beta-H), and the C-terminal loop, which are designated V1-V5 respectively. V2 forms a cap at one end of the protein and is subdivided across two sections of the polypeptide, V2n and V2c. Differences in these regions give each polyhedrin its characteristic appearance. The base domain (residues 74-110) is a region that is neither required for proper folding of the protein, nor for crystal assembly, but fine-tunes the crystal, 'locking-down' the structure, often in conjunction with NTPs. This region is also implicated in virion recognition and packaging. 241
60103 407524 pfam17516 ProQ_C ProQ C-terminal domain. This domain is found at the C-terminus of many ProQ proteins. 51
60104 407525 pfam17517 IgGFc_binding IgGFc binding protein. This domain is found at the N terminal of human IgGFc-binding protein and has been shown to confer IgG Fc binding activity. It may play a role in immune protection and inflammation in the intestines of primates. 292
60105 340233 pfam17518 DUF5443 Family of unknown function (DUF5443). This is a family of unknown function found in Mycoplasma. 344
60106 407526 pfam17519 DUF5444 Family of unknown function (DUF5444). This is a family of unknown function found in Enterobacterales. 62
60107 340235 pfam17520 DUF5445 Family of unknown function (DUF5445). This is a family of unknown function found in Enterobacteriaceae. 52
60108 340236 pfam17521 Secapin Honey bee peptides. Family members are bee venom peptides such as Secapin. Mature secapin is composed of 25 amino acid residues that contain a disulfide link. Secapin has been demonstrated to act as a potent neurotoxin. In Apis mellifera secapin exhibits anti-bacterial activity and induces inflammation and pain with anti-fibrinolytic, anti-elastolytic, and anti-microbial activities. Secapin shares a common folding pattern with apamin, mast cell degranulating peptide and tertiapin; it is centred on a beta-turn covalently linked to an alpha-helical segment by one disulphide link (two disulphide links in the other peptides). 45
60109 340237 pfam17522 DUF5446 Family of unknown function (DUF5446). This is a family of unknown function found in Bacillales. 72
60110 340238 pfam17523 MPS-4 MinK-related peptide, potassium channel accessory sub-unit protein 4. MinK-related peptides (MiRPs or KCNEs) are single-transmembrane proteins that associate with pore-forming ion-channel sub-units to form stable complexes with channel properties markedly distinct from those of the isolated pore-forming sub-units. MPS-4 is expressed exclusively in the C. elegans nervous system and is essential for neuronal excitability. 78
60111 407527 pfam17524 CnrY anti-sigma factor CnrY. This family is found in alpha and beta proteobacteria. Family members include anti-sigma factor CnrY from Cupriavidus metallidurans. Sigma factors are multi-domain sub-units of bacterial RNA polymerase (RNAP) that play critical roles in transcription initiation, including the recognition and opening of promoters as well as the initial steps in RNA synthesis. They also control a wide variety of adaptive responses such as morphological development and the management of stress. A recurring theme in sigma factor control is their sequestration by anti-sigma factors that occlude their RNAP-binding determinants. CnrH controls cobalt and nickel resistance in Cupriavidus metallidurans. CnrH is regulated by a complex of two transmembrane proteins: the periplasmic sensor CnrX and the anti-sigma CnrY. At rest, CnrH is sequestered by CnrY whose 45-residue-long cytosolic domain is one of the shortest anti-sigma domains. Upon detection of Ni(II) or Co(II) ions by CnrX in the periplasm, CnrH is released between CnrH and the cytosolic domain of CnrY (CnrYC). The CnrH/CnrYC complex displays an unexpected structural similarity to the anti-sigma NepR in complex with its antagonist PhyR, whereas NepR shares no sequence similarity with CnrY. Crystal structure of CnrH/CnrY shows that CnrYC residues 3-19 are folded as a well-defined alpha-helix. The peptide further extends along the hydrophobic groove of sigma 2 with no canonical structure except for a short helical turn spanning residues 24-28. CnrY has a hydrophobic knob made of V4, W7 and L8 side chains protruding into sigma 4 hydrophobic pocket and contributing to the interface. In vivo investigation of CnrY function pinpoints part of the hydrophobic knob as a hotspot in CnrH inhibitory binding. 98
60112 340240 pfam17525 DUF5447 Family of unknown function (DUF5447). This is a family of unknown function found in Pseudomonas. 92
60113 340241 pfam17526 DUF5448 Family of unknown function (DUF5448). This is a family of unknown function found in Gammaproteobacteria. 118
60114 340242 pfam17527 ALP Phage ALP protein. During the course of infection of Escherichia coli by bacteriophage T4, transcription of viral late genes does not take place unless template DNA contains hydroxymethyl cytosine (hmCyt), a modification normally effected by virus-encoded enzymes. Bacteriophage T4 Alc protein acts as a site-specific termination factor participating in shutting off host transcription after infection of E. coli, while the bacteriophage T4 transcription is protected from the action of Alc by overall substitution of cytosine with 5-hydroxymethyl cytosine in T4 DNA. Based on genetic studies, Alc is thought to bind directly to the beta sub-unit dispensable region 1 (bDR1) of E. coli RNAP. However, immune-isolation experiments show that Alc binds both core and sigma 70-holoenzyme of RNAP. 177
60115 407528 pfam17528 DUF5449 Family of unknown function (DUF5449). This is a family of unknown function found in Lactobacillus. 174
60116 407529 pfam17529 DUF5450 Family of unknown function (DUF5450). This is a family of unknown function found in Giardia intestinalis. 161
60117 340245 pfam17530 NS3 Non-structural protein NS3. This is a family of proteins found in Densoviruses. Members of this family such as NS3 found in Junonia coenia have been shown to be involved in viral DNA replication. Generation of deletion mutants and replicative cycle analysis show that NS3 is required for viral DNA replication. Bioinformatics analysis of Bombyx mori densovirus protein NS3 shows that it has two putative zinc-finger motifs, 6 putative N glycosylation sites, and 4 putative phosphorylation sites. 245
60118 340246 pfam17531 O_Spanin_T7 outer-membrane spanin sub-unit. This family contains members of the outer membrane spanin sub-unit protein (o-spanin), found in Enterobacteria phage T7. Spanins are lytic proteins that act on bacterial outer-membrane by disrupting it, allowing progeny virions to spread. O-spanin acts together with inner membrane spanin sub-unit (i-spanin) to form the spanin complex necessary for function. 33
60119 340247 pfam17532 DUF5451 Family of unknown function (DUF5451). This is a family of unknown function found in Epstein-Barr virus. 148
60120 340248 pfam17533 DUF5452 Family of unknown function (DUF5452). This is a family of unknown function found in Mycoplasmataceae. 169
60121 340249 pfam17534 DUF5453 Family of unknown function (DUF5453). This is a family of unknown function found in Mycoplasma. Family members have 4 predicted trans-membrane regions. 186
60122 340250 pfam17535 DUF5454 Family of unknown function (DUF5454). This is a family of unknown function found in Mycoplasma. 221
60123 340251 pfam17536 Mx_ML Matrix and Matrix long proteins N-terminal. This entry represents the N-terminal fragment of family members such as the Matrix (Mx) and Matrix protein long (ML) proteins. They are found in Thogoto virus (THOV), a tick-transmitted orthomyxovirus with a genome consisting of six single-stranded RNA segments that encode seven structural proteins. Matrix proteins of the family Orthomyxoviridae are major structural components of the viral capsid, located below the viral lipid membrane and provide protection for viral ribonucleoproteins (vRNPs). They serve as a major participant during the processes of virus invasion and budding. Furthermore, they play specific roles throughout the viral life cycle, usually by interacting with other viral components or host cellular proteins. ML protein, an extended version of the viral M protein, is a viral IFN antagonist. ML is essential for virus growth and pathogenesis in an IFN-competent host. In the presence of ML the activation and/or action of the interferon regulatory factor-3 (IRF-3) is severely affected. This effect depends on direct interaction of ML with the transcription factor IIB (TFIIB). ML suppresses IRF-7 in a similar manner as it suppresses IRF-3. Studies have revealed that ML associates with IRF-7 and prevents IRF-7 dimerization and interaction with TRAF6. Structural analysis revealed that the N-terminal fragment of M protein (MN) undergoes conformational changes that result in specific, pH-dependent inter-molecular interactions. Comparison of the THOV MN and influenza A virus (IAV) MN regions showed low sequence identity. However, superimposition of the two structures in neutral conditions showed that both matrix proteins contain nine helices connected with the same topology. Since the matrix layer of IAV disassembles in the acidic endosome at the beginning of infection and repacks in the neutral cytoplasm, a change of pH might be a key regulator for the capsid assembly/disassembly transition during these processes. Hence, a pH-dependent conformational transition model was studied in THOV MN, where interactions such as hydrogen bonds and hydrophobic interactions are suggested to be involved in THOV matrix assembly. 149
60124 407530 pfam17537 DUF5455 Family of unknown function (DUF5455). This is a family of unknown function found in Proteobacteria. Family members contain three predicted trans-membrane regions. 102
60125 407531 pfam17538 C_LFY_FLO DNA Binding Domain (C-terminal) Leafy/Floricaula. This family consists of various plant development proteins which are homologs of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY is a plant-specific transcription factor (TF) essential for flower development. It is one of the few master regulators of flower development, as it integrates environmental and endogenous signals to orchestrate the whole floral network. Transcription factors such as LFY, recognize short DNA motifs primarily through their DNA-binding domain. Upon binding to short stretches of DNA called cis-elements or TF binding sites (TFBS), they regulate gene expression. This entry represents the DNA binding domain found in C-terminal of LFY proteins in plants. Structure-function studies have demonstrated that LFY binds semi-palindromic 19-bp DNA elements through its highly conserved C-terminal DBD, a unique helix-turn-helix fold that by itself dimerizes on DNA. 169
60126 340254 pfam17539 DUF5456 Family of unknown function (DUF5456). This is a family of unknown function found in Bacteroides. 152
60127 375213 pfam17540 DUF5457 Family of unknown function (DUF5457). This is a family of unknown function found in Bacteria. Family members have one predicted trans-membrane region. 89
60128 407532 pfam17541 DUF5458 Family of unknown function (DUF5458). This is a family of unknown function found in Bacteroidetes. 430
60129 340257 pfam17542 RP853 Uncharacterized RP853. This is a family of unknown function found in Rickettsia. Family members are predicted to contain one trans-membrane region. 317
60130 340259 pfam17544 DUF5460 Family of unknown function (DUF5460). This is a family of unknown function found in Rickettsia. Family members are predicted to contain one trans-membrane region. 375
60131 340260 pfam17545 DUF5461 Family of unknown function (DUF5461). This is a family of unknown function found in viruses. 93
60132 340261 pfam17546 Defb50 Beta Defensin 50. Beta-defensins are small cationic antimicrobial peptides. Family members such as beta-defensin 50 (Defb50) have poor antimicrobial activity in their oxidized form, but this improves under reducing conditions. 50
60133 407533 pfam17547 DUF5462 Family of unknown function (DUF5462). This is a family of unknown function found in Gammaproteobacteria. 157
60134 340263 pfam17548 p6 Histone-like Protein p6. Family members such as protein p6 from Bacillus subtilis phage phi29 bind double-stranded DNA, forming a large nucleoprotein complex all along the viral genome, and have been proposed to be an architectural protein with a global role in genome organization. P6 is also involved in viral transcriptional control, repressing the C2 early promoter located at the right DNA end, and together with the viral regulatory protein p4, repressing early promoters A2b/A2c and activating late promoter A3. 76
60135 340264 pfam17549 Phage_Gp17 Gene Product 17. Family members such as protein 17 (gene product 17/gp17), found in Bacillus phage phi29, are involved in DNA replication and in pulling the phage DNA into the cell during the injection process. 140
60136 340265 pfam17550 PsaF Family of unknown function. This is a family of unknown function found in Yersinia pestis. 162
60137 340266 pfam17551 DUF5463 Family of unknown function (DUF5463). This is a family of unknown function found in Yersinia pestis. 32
60138 340267 pfam17552 DUF5464 Family of unknown function (DUF5464). This is a family of unknown function found in Bacteriophages. 51
60139 340268 pfam17553 DUF5465 Family of unknown function (DUF5465). This is a family of unknown function found in Enterobacteria phage T7. 19
60140 340269 pfam17554 DUF5466 Family of unknown function (DUF5466). This is a family of unknown function found in Enterobacteria phage T7. 57
60141 407534 pfam17555 DUF5467 Family of unknown function (DUF5467). This is a family of unknown function found in Bacteria. Family members have 5 predicted trans-membrane regions. 274
60142 340271 pfam17556 MIT_LIKE_ACTX MIT-like atracotoxin family. This family includes peptides such as the Atracotoxin-Hvf17. It is a non-toxic peptide isolated from the venom of Blue Mountains funnel-web spider Hadronyche versuta. It does not function like classical funnel-web spider atracotoxins to modulate mammalian or insect voltage-gated ion channel function since it lacks insecticidal activity and fails to affect vas deferens smooth muscle or skeletal muscle contractility. This peptide has ten conserved cysteine residues similar to AVIT family members such as MIT1. Due to the lack of the AVIT N-terminal four residues and lack of functional similarity to the AVIT family, the Atracotoxin-Hvf17 is classified as MIT-like atracotoxin. 68
60143 375216 pfam17557 Conotoxin_I2 I2-superfamily conotoxins. Conotoxins (or conopeptides) are the peptidic components of the venoms of marine cone snails (genus Conus). They are classified in one of three ways: gene superfamily, cysteine framework or pharmacological family. Several distinct cysteine frameworks have been described in conotoxins. Members of this family display a XI cysteine pattern (C-C-CC-CC-C-C) and belong to the I2- superfamily conotoxins. Family members such as Kappa-conotoxin ViTx and Kappa-conotoxin SrXIA inhibit voltage gated potassium channels (Kv). 38
60144 375217 pfam17558 AGH Androgenic gland hormone. This family contains members such as the Androgenic gland hormone (AGH) of the woodlouse, Armadillidium vulgare. AGH is a heterodimeric glycopeptide synthesized and secreted from androgenic glands. It is responsible for sex differentiation in crustaceans and contains 4 disulfide bonds. 121
60145 340275 pfam17560 Megourin Aphid Megourins. This family is found in the vetch aphid Megoura viciae with members such as Megourin 1, 2 and 3. Megourins are antimicrobial peptides that act against Gram-positive bacteria and fungi. 63
60146 407535 pfam17561 DUF5469 Family of unknown function (DUF5469). This is a family of unknown function found in Bacteroidetes. Family members have one predicted trans-membrane region. 148
60147 340277 pfam17562 Styelin Styelin A-E. This is a family of antimicrobial peptides found in Styela clava (Sea squirt). Family members such as Styelin A and B are two alpha-helical phenylalanine-rich antimicrobial peptides effective against a panel of Gram-negative and Gram-positive bacteria. Styelin contains unusual amino acids such as dihydroxyarginine, dihydroxylysine, 6-bromotryptophan, and 3,4-dihydroxyphenylalanine which are important for the antimicrobial activity at high salt concentrations. 59
60148 340278 pfam17563 Cu Cupiennin. Cupiennin are small cationic alpha-helical peptides from the venom of the ctenid spider Cupiennius salei which are characterized by high bactericidal as well as hemolytic activities. Family members such as cupiennin 1a exert both cytolytic and antibacterial effects. The cytolytic activity of the cupiennin peptides depends primarily on the amphipathic N-terminus, which is capable of inserting into the membrane, and is modulated by the C-terminus via electrostatic interactions with the cell surface. 27
60149 340279 pfam17564 DUF5470 Family of unknown function (DUF5470). This is a family of unknown function found in viruses. 73
60150 340280 pfam17565 DUF5471 Family of unknown function (DUF5471). This is a family of unknown function found in Enterobacteria phage T7. 70
60151 340281 pfam17566 DUF5472 Family of unknown function (DUF5472). This is a family of unknown function found in Human papillomavirus type 11. 73
60152 340282 pfam17567 DUF5473 Family of unknown function (DUF5473). This is a family of unknown function found in Human adenovirus. 106
60153 340283 pfam17568 DUF5474 Family of unknown function (DUF5474). This is a family of unknown function found in Saccharomycetales. 77
60154 340284 pfam17569 DUF5475 Family of unknown function (DUF5475). This is a family of unknown function found in Alphabaculovirus. 81
60155 375218 pfam17570 DUF5476 Family of unknown function (DUF5476). This is a family of unknown function found in Podoviridae. 61
60156 340286 pfam17571 DUF5477 Family of unknown function (DUF5477). This is a family of unknown function found in Podoviridae. 77
60157 340287 pfam17572 DUF5478 Family of unknown function (DUF5478). This is a family of unknown function found in Alphabaculovirus. 86
60158 340288 pfam17573 GA-like GA-like domain. This domain is found in bacterial cell surface proteins. It is related to the GA domain that forms a three helix bundle. 50
60159 340289 pfam17574 TA_inhibitor Inhibitor of toxin/antitoxin system (Gp4.5). This is a family of prokaryotic toxin-antitoxin (TA) systems inhibitors, found in Podoviridae such as Enterobacteria phage T7. Family members such as Gene product 4.5 have been shown to neutralize TA-system-mediated abortive infection by inhibiting the Lon protease activity, thus preventing antitoxin degradation and toxin activation. 89
60160 340290 pfam17575 DUF5479 Family of unknown function (DUF5479). This is a family of unknown function found in Kappa-papillomavirus. 101
60161 340291 pfam17576 DUF5480 Family of unknown function (DUF5480). This is a family of unknown function found in Podoviridae. 71
60162 340292 pfam17577 ETM ECORI-T site protein ETM. This is a family of unknown function found in Alphabaculovirus. 109
60163 340293 pfam17578 DUF5481 Family of unknown function (DUF5481). This is a family of unknown function found in Myoviridae. 103
60164 340294 pfam17579 DUF5482 Family of unknown function (DUF5482). This is a family of unknown function found in Saccharomycetales. 159
60165 340295 pfam17580 GBR_NSP5 Group B Rotavirus Non-structural protein 5. Family members such as non-structural protein 5 (NSP5), are found in Group B rotaviruses (GBR). Group B rotavirus (GBR) is genetically and antigenically distinct from Group A rotavirus (GAR). Hence phylogenetic studies have been carried out and show that the C-terminal region of NSP5, which is conserved among GAR and critical for its function for viroplasm-like structure formation in cells, was also conserved in GBR NSP5. 176
60166 340296 pfam17581 DUF5483 Family of unknown function (DUF5483). This is a family of unknown function found in Saccharomycetaceae. 441
60167 340297 pfam17582 UL20 Cytomegalovirus UL20. This family has members such as the human cytomegalovirus glycoprotein UL20. UL20 is a type I trans-membrane glycoprotein with an immunoglobulin-like ectodomain that is highly polymorphic among HCMV strains. 304
60168 340298 pfam17583 DUF5484 Family of unknown function (DUF5484). This is a family of unknown function found in Myoviridae. 43
60169 340299 pfam17584 comS Bacillus Competence protein S. ComS is crucial for competence development as it prevents proteolytic degradation of ComK, the key transcriptional activator of all genes required for the uptake and integration of DNA. This family includes members of the Bacillus comS proteins. 44
60170 340300 pfam17585 Phage_Arf Accessory recombination function protein. Family members are found in Caudovirales such as Salmonella virus P22. Family members have a recombination accessory function. 47
60171 340301 pfam17586 DUF5485 Family of unknown function (DUF5485). This is a family of unknown function found in Alphabaculovirus. 56
60172 340302 pfam17587 Dmd Discriminator of mRNA degradation. This family includes Dmd peptides from T4 phages. Dmd can suppress the toxicities of toxins such as LsoA (an endoribonuclease toxin expressed by E. coli). Crystal structure analysis shows that Dmd is inserted into the deep groove between the N-terminal repeated domain (NRD) and the Dmd-binding domain (DBD) of LsoA. Site-directed mutagenesis of Dmd revealed that the conserved residues (W31 and N40) are necessary for LsoA binding and the toxicity suppression. 60
60173 340303 pfam17588 DUF5486 Family of unknown function (DUF5486). This is a family of unknown function found in Myoviridae. 53
60174 407536 pfam17589 DUF5487 Family of unknown function (DUF5487). This is a family of unknown function found in Myoviridae. 66
60175 340305 pfam17590 DUF5488 Family of unknown function (DUF5488). This is a family of unknown function found in Orthopoxvirus. 70
60176 340306 pfam17591 UL41A Herpesvirus UL41A. Members of this family are found in Human cytomegalovirus. No known function has been reported. 78
60177 340307 pfam17592 DUF5489 Family of unknown function (DUF5489). This is a family of unknown function found in Alphafusellovirus. 78
60178 340308 pfam17593 DUF5490 Family of unknown function (DUF5490). This is a family of unknown function found in Myoviridae. 62
60179 340309 pfam17594 GP57 Phage Tail fiber assembly helper protein. Gene product 57 (Gp57) is a chaperone protein for the short tail fiber phage protein that acts as a molecular chaperone of gp12, increasing the folding efficacy and production efficiency. 75
60180 340310 pfam17595 DUF5491 Family of unknown function (DUF5491). This is a family of unknown function found in Myoviridae. 68
60181 340311 pfam17596 DUF5492 Family of unknown function (DUF5492). This is a family of unknown function found in Alphabaculovirus. 80
60182 340312 pfam17597 DUF5493 Family of unknown function (DUF5493). This is a family of unknown function found in viruses. 82
60183 340313 pfam17598 DUF5494 Family of unknown function (DUF5494). This is a family of unknown function found in viruses. 84
60184 340314 pfam17599 DUF5495 Family of unknown function (DUF5495). This is a family of unknown function found in Myoviridae. 87
60185 340315 pfam17600 DUF5496 Family of unknown function (DUF5496). This is a family of unknown function found in Myoviridae. 87
60186 340316 pfam17601 DUF5497 Family of unknown function (DUF5497). This is a family of unknown function found in Alphabaculovirus. 89
60187 340317 pfam17602 DUF5498 Family of unknown function (DUF5498). This is a family of unknown function found in Myoviridae. 96
60188 340318 pfam17603 DUF5499 Family of unknown function (DUF5499). This is a family of unknown function found in Myoviridae. 97
60189 340319 pfam17604 DUF5500 Family of unknown function (DUF5500). This is a family of unknown function found in Herpesvirus. 98
60190 340320 pfam17605 DUF5501 Family of unknown function (DUF5501). This is a family of unknown function found in Alphabaculovirus. 107
60191 340321 pfam17606 DUF5502 Family of unknown function (DUF5502). This is a family of unknown function found in Listeria. 87
60192 340322 pfam17607 DUF5503 Family of unknown function (DUF5503). This is a family of unknown function found in Enterobacteriaceae. 116
60193 340323 pfam17608 DUF5504 Family of unknown function (DUF5504). This is a family of unknown function found in Lactobacillus. Family members have 4 predicted trans-membrane regions. 124
60194 340324 pfam17609 HCMV_UL124 Family of unknown function. This is a family of unknown function found in beta-herpesvirus. Family members such as UL124 are predicted membrane glycoproteins with one predicted trans-membrane region. 126
60195 340325 pfam17610 DUF5505 Family of unknown function (DUF5505). This is a family of unknown function found in Alphabaculovirus. 156
60196 340326 pfam17611 DUF5506 Family of unknown function (DUF5506). This is a family of unknown function found in Fowl aviadenovirus. 161
60197 340327 pfam17612 DUF5507 Family of unknown function (DUF5507). This is a family of unknown function found in Escherichia. 160
60198 340328 pfam17613 motB Modifier of transcription. Family members are transcription regulation-related proteins found in Myoviridae such as Enterobacteria phage T4. 162
60199 340329 pfam17614 FPV060 Viral CC-type chemokine. Family members found in Fowlpox virus are CC chemokine-like proteins. Fpv060 contains the conserved pattern of four cysteine residues similar to the CC chemokine family. Fpv060 also contains more cysteines in the mature protein, than cellular chemokines and one predicted trans-membrane region. In vitro studies show N-terminal glycosylation and show that Fpv060 from Fowl pox virus is much larger and has many more cysteine residues than host chemokines and viral homologs. 188
60200 375219 pfam17615 C166 Family of unknown function. Family members found in Fuselloviridae are predicted to play a role in virus function. 171
60201 340331 pfam17616 US6 Viral unique short region 6. This family has members such as US6 found in HCMV (Human cytomegalovirus). US6 is a unique short region glycoprotein found in the ER. It blocks the binding of ATP by TAP1 (Transporter associated with Antigen Processing 1) through a conformational change and subsequently inhibits TAP-mediated peptide translocation to the ER. It also down regulates only MHC class I. Inhibition of US6 of TAP has been shown to require residues 89 to 108 of the HCMV US6 luminal domain, whereas sequences that flank this region stabilize the binding of the viral protein to TAP. Residues 81 to 90 and the C-terminal 39 residues of HCMV US6 may also contribute to the stabilization of the interaction between US6 and TAP. 161
60202 340332 pfam17617 US10 Viral unique short region 10. This family contains US10 proteins found in HCMV Human cytomegalovirus. US10 is a unique short region trans-membrane glycoprotein found in the endoplasmic reticulum (ER). It down-regulates cell surface expression of HLA-G, but not that of classical class I MHC molecules. Despite binding to classical class I MHC molecules and delaying their trafficking, it does not affect their steady-state cell surface levels. US10 contains a tri-leucine motif in the cytoplasmic tail which is responsible for down-regulation of HLA-G. 161
60203 407537 pfam17618 SL4P Uncharacterized Strongylid L4 protein. Family members are predicted non-classically secreted proteins found in Ancylostoma ceylanicum. Homologs are found in strongylids A. ceylanicum, N. americanus, H. contortus and Angiostrongylus cantonensis, where the corresponding genes in A. cantonensis are expressed in L4 larvae. Thus the family members found in A. ceylanicum have been named strongylid L4 proteins (SL4Ps). Although SL4Ps do not resemble any domains of known function, they do have a conspicuous number of charged residues (both acidic and basic) in their N-terminal, most highly conserved regions. 88
60204 407538 pfam17619 SCVP Secreted clade V proteins. Family members are found in strongylid parasites (A. ceylanicum, N. americanus, H. contortus and Heterorhabditis bacteriophora) and in related non-parasitic clade V species (C. elegans, Caenorhabditis briggsae and P. pacificus), hence the name secreted clade V proteins (SCVPs). In A. ceylanicum, the encoded 150 residue proteins are predicted to be classically secreted. 97
60205 340335 pfam17620 ORF45 Family of unknown function. Family members found in alphabaculoviruses such as orf45 have been implicated in late gene expression when linked to orf41. 191
60206 340336 pfam17621 DUF5508 Family of unknown function (DUF5508). This is a family of unknown function found in Enterobacteriaceae. 263
60207 340337 pfam17622 UL16 Viral unique long protein 16. This family contains members such as UL16 found in the human cytomegalovirus (HCMV). It is an immunoevasin which subverts NKG2D-mediated immune responses by retaining a select group of diverse NKG2D ligands inside the cell. UL16 is a heavily glycosylated 50 kDa type I trans-membrane glycoprotein. The ectodomain folds into a modified version of the a variable (V-type) (immunoglobulin Ig)-like domain. The N-terminal plug region (amino acids 27-50) is covalently linked to the Ig-like core with a disulfide bond. UL16 protein utilizes a three-stranded beta-sheet to engage the alpha-helical surface of the MHC class I-like MICB platform domain. Residues at the center of this beta-sheet mimic a central binding motif employed by the structurally unrelated C-type lectin-like NKG2D to facilitate engagement of diverse NKG2D ligands. 204
60208 340338 pfam17623 B277 Family of unknown function. This is a family of unknown function, however family members such as B277 have been suggested to play a role in viral function. 277
60209 340339 pfam17624 US30 Family of unknown function. This is a family of unknown function found in Cytomegalovirus. One of the family members US30 is a putative membrane glycoprotein with one predicted trans-membrane region. 282
60210 340340 pfam17625 DUF5509 Family of unknown function (DUF5509). This is a family of unknown function found in Baculoviridae. 362
60211 340341 pfam17626 IncF Inclusion membrane protein F. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) are exposed to the cytosol and share a common structural feature of a long, bi-lobed hydrophobic domain but little or no primary amino acid sequence similarity. This family has members such as the IncF proteins found in Chlamydia trachomatis. IncF is enriched at the point of contact of RBs (reticulate bodies) with the inclusion membrane. It is expressed early in the developmental cycle and interacts with many other Inc proteins, like Ct058 or Ct850, which are expressed later during the cycle. Thus, IncF could act as an interaction node for Inc proteins. IncF consists of 104 amino acids of which 38 N-terminal amino acids encoding the signal sequence for the type III system and 12 C-terminal amino acids may be localized in the host cell cytoplasm, suggesting that IncF or other small Incs interact with other Inc proteins by their trans-membrane domain. It has been identified to be capable of homo-oligomerization and also displayed self-interacting properties. 104
60212 340342 pfam17627 IncE Inclusion membrane protein E. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) have two major characteristics: an N-terminal type III secretion signal that is necessary for their secretion out of the bacterium and a hydrophobic region consisting of at least two trans-membrane helices that allows insertion into the inclusion membrane. Generally, both the N- and C-terminal regions of the Inc are exposed to the host cell cytosol. This family has members such as the IncE (also known as CT116) proteins found in Chlamydia trachomatis. IncE Interacts with Retromer-Associated Sorting Nexins (SNXs) directly binding the PX-domains of SNX5/6. It is expressed within the first 2 hours of C. trachomatis infection. IncE region 101-132 is the binding site for SNX5/6 causing re-localization of SNX5/6 from endosomes to the inclusion membrane. IncE101-132 expression was shown to be sufficient to maintain CI-MPR (Cation-Independent Mannose-6-Phosphate Receptor) in retromer-containing compartments, thereby disrupting efficient CI-MPR trafficking to the trans-Golgi. It has been suggested that SNX5/6 bind directly to IncE independently of phosphoinositides and that the predicted IncE C-terminal beta-hairpin is required. IncE-mediated sequestration of retromer SNX-BAR proteins may promote Golgi fragmentation, a process that facilitates lipid acquisition by C. trachomatis and enhances progeny production. 132
60213 340343 pfam17628 IncD Inclusion membrane protein D. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) have two major characteristics: an N-terminal type III secretion signal that is necessary for their secretion out of the bacterium and a hydrophobic region consisting of at least two trans-membrane helices that allows insertion into the inclusion membrane. Generally, both the N- and C-terminal regions of the Inc are exposed to the host cell cytosol. This family has members such as the IncD proteins found in Chlamydia trachomatis. This C. trachomatis effector protein IncD has been shown to recruit the lipid transfer protein CERT to the inclusion membrane by directly interacting with CERT PH domain, which mediates the FFAT motif-dependent recruitment of the ER-resident protein VAPB (vesicle-associated membrane protein-associated protein) to the inclusion. 141
60214 340344 pfam17629 DUF5510 Family of unknown function (DUF5510). This is a family of unknown function found in Rickettsia. Family members are predicted to have 2 or 3 trans-membrane regions. 62
60215 340345 pfam17630 DUF5511 Family of unknown function (DUF5511). This is a family of unknown function found in Bacillus. 69
60216 340346 pfam17631 DUF5512 Family of unknown function (DUF5512). This is a family of unknown function found in Bacillus. 139
60217 340347 pfam17632 DUF5513 Family of unknown function (DUF5513). This is a family of unknown function found in Bacillus. 91
60218 340348 pfam17633 DUF5514 Family of unknown function (DUF5514). This is a family of unknown function found in Bacillus. 142
60219 340349 pfam17634 Gp67 Gene product 67. This is a family of unknown function found in Myoviridae such as Enterobacteria phages. Family members such as Gp67 are prohead core (scaffold) proteins. 80
60220 407539 pfam17635 DUF5515 Family of unknown function (DUF5515). This is a family of unknown function found in SARS coronavirus. 70
60221 340351 pfam17636 UL21a Viral Unique Long protein 21a. UL21a, a member of this family found in Human cytomegalovirus (HCMV), is required for HCMV to establish efficient productive infection. It is a short-lived cytoplasmic protein that facilitates HCMV replication. It has also been shown to be responsible for APC1, APC4 and APC5 degradation. 123
60222 375222 pfam17637 DUF5516 Family of unknown function (DUF5516). This is a family of unknown function found in T7 viruses. 37
60223 340353 pfam17638 UL42 HCMV UL42. Family members include UL42 proteins found in Human cytomegalovirus (HCMV). UL42 has two Pro-Pro-X-Tyr (PPxY) sequences, a hydrophobic region at the C-terminus and no N-terminal signal peptide. These features are shared with herpes simplex virus (HSV) UL56. UL42 has a putative C-terminal trans-membrane region. HCMV UL42 interacts with Itch, a member of the Nedd4 family of ubiquitin E3 ligases, through its PY motifs as observed in HSV UL56, suggestive of a regulatory function. 125
60224 340354 pfam17639 DUF5517 Family of unknown function (DUF5517). This is a family of unknown function found in Fuselloviridae. Structure analysis suggests a role in viral assembly. 100
60225 340355 pfam17640 UL17 Uncharacterized UL17. This is a family of unknown function found in beta-herpesviruses such as Human cytomegalovirus (HCMV). 102
60226 407540 pfam17641 ASPRs Ancylostoma-associated secreted protein related genes. This family includes members encoded by ASP-related genes which are distant homologs to ASPs (Ancylostoma-associated secreted proteins). ASPs are a diverse set of secreted cysteine-rich proteins pfam00188. ASPRs, on the other hand are predicted to be secreted with one ASPR in Heligmosomoides bakeri shown to be secreted by parasitic adults. Thus, like ASPs, ASPRs are suggested to comprise an important element of hookworm infection in vivo. 118
60227 407541 pfam17642 TssD Hemolysin coregulated protein Hcp (TssD). T6SSs are toxin delivery systems. It is a multiprotein complex requiring numerous core proteins (Tss proteins) including cytoplasmic, transmembrane, and outer membrane components. The needle or tube apparatus is comprised of a phage-like complex, similar to the T4 contractile bacteriophage tail, which is thought to be anchored to the membrane by a trans-envelope complex. These tube and trans-envelope sub-assemblies are linked via TssK. This entry comprises family members such as the inner tube protein Hcp (TssD). Hcp proteins form hexamers that stack to form the inner tube/needle structure of the puncturing device. Other functions have also been described for Hcp proteins, for example, some Hcp proteins have been shown to have a chaperone function in that they bind to and stabilize effectors. In addition, there are evolved Hcp proteins that have the Hcp domain at the N-terminal half of the protein and a toxic effector function present in the C-terminal portion of the protein. 127
60228 407542 pfam17643 TssR Type VI secretion system, TssR. T6SSs are toxin delivery systems. It is a multiprotein complex requiring numerous core proteins (Tss proteins) including cytoplasmic, transmembrane, and outer membrane components. The needle or tube apparatus is comprised of a phage-like complex, similar to the T4 contractile bacteriophage tail, which is thought to be anchored to the membrane by a trans-envelope complex. This entry relates to TssR family members. TssR proteins have no predicted TM regions. 745
60229 375225 pfam17644 30K_MP_core Core domain of 30K viral movement proteins. This entry represents the core domain found in viral movement proteins (MP) of the 30K type. The core domain is conserved among MPs of 30K sharing the same predicted secondary structure which consists of 1 alpha-helix and 7 predicted beta-strands. The only sequence feature common to all 30K MPs is a short region between beta-strands 1 and 2, which contains several conserved hydrophobic positions, and a nearly-invariant aspartate which constitutes the sequence signature of the superfamily. This signature aspartate has a conserved role essential for viral cell-to-cell movement. 138
60230 375226 pfam17645 Amdase Arylmalonate decarboxylase. This entry contains members such as the arylmalonate decarboxylases (AMDase; EC 4.1.1.76), which belong to the family of carboxy-lyases (EC 4.1). Amdases are capable of decarboxylating a range of alpha-disubstituted malonic acid derivates to enantiopure products without the need for any cofactor. AMDases are members of the widespread Asp/Glu racemase family pfam01177 together with aspartate (EC 5.1.1.13) and glutamate racemases (EC 5.1.1.3), hydantoin racemases (EC 5.1.99.5) and maleate isomerases (EC 5.2.1.1). 217
60231 375227 pfam17646 Zemlya Closterovirus 1a polyprotein central region. This family represents an alignment of the Zemlya region of closteroviruses. The alignment of the 1a polyprotein of the Closteroviridae family members revealed that this region was not conserved in other genera. The homologs of the Zemlya region are not found in other viral or cellular proteins. This region is named the Zemlya region (zemlya is the Russian word for earth), meaning that its conserved amino acid sequence represents solid ground within the highly variable central region of the 1a polyprotein. It is composed of four predicted alpha-helices, alphaA to alphaD, and contains three conserved positions: i) a strictly conserved glutamate (E) in helix alphaA (E1291 in Beet yellows closterovirus (BYV)); ii) a strictly conserved proline (P1380) in alphaD; and iii) a conserved basic position (arginine or lysine; R1384 in BYV). The presence of a conserved proline in helix alphaD is noteworthy because prolines are strongly disfavoured in helices; this proline most probably induces a kink in the helix. Functional studies have suggested that most of the Zemlya region targets the ER and remodels the ER membranes. More specifically, deletion analysis and substitutions of the conserved hydrophobic amino acid residues suggest a role of the putative amphipathic helix 1368-1385 (alphaD) in the formation of globules. Hence it was proposed that this specific region in the 1a protein may be involved in the biogenesis of closterovirus. 106
60232 407543 pfam17647 DUF5518 Family of unknown function (DUF5518). This is a family of unknown function found in Archaea. Family members have multiple predicted trans-membrane regions. 118
60233 407544 pfam17648 DUF5519 Family of unknown function (DUF5519). This is a family of unknown function. 96
60234 407545 pfam17649 VPS38 Vacuolar protein sorting 38. The class III phosphatidylinositol-3-kinase (PI3K) known as Vps34 (vacuolar protein sorting 34, encoded by PIK3C3) regulates intracellular membrane trafficking in endocytic sorting, cytokinesis and autophagy. Vps34 forms complexes with other proteins: Vps15 (encoded by PIK3R4, known as p150 in mammalian cells), Vps30 (encoded by VPS30/ATG6 in yeast, equivalent to mammalian Beclin 1, encoded by BECN1) and either Vps38 (UVRAG) or Atg14 (ATG14L). This family includes members such as Vps38 found in Saccharomyces cerevisiae. Vps38 is characteristic of complex II and essential for vacuolar protein sorting. In mammalian cells, complex II is also involved in autophagy, receptor degradation and cytokinesis as well as signaling, recycling and lysosomal tubulation. Independently from complex I and II, Beclin 1 and UVRAG also play separate roles in endosome function and neuron viability. In complex I, Vps38/UVRAG is substituted with Atg14/ATG14L. Although the N-terminal domains of Vps30, Vps38 and Atg14 differ, the overall similarity of their domain organizations suggests that these proteins may have evolved from a common ancestor. 425
60235 407546 pfam17650 RACo_linker RACo linker region. This family includes reductive activator of CoFeSP (RACo) proteins. Structure analysis of RACo indicates that it contains 4 regions: N-terminal region pfam00111 (residues 3-94) binding the [2Fe-2S] cluster, a linker region (residues 95-125), the middle region (residues 126-206), and the large C-terminal domain pfam14574 (residues 207-630). This entry pertains to the linker region. The linker region is only present in RACE (reductive activases for corrinoid enzymes) protein sequences with the N-terminal [2Fe-2S] cluster family pfam00111 and is absent in the RamA-like RACE proteins, suggesting that the linker domain and the N-terminal domain form one functional unit. 86
60236 407547 pfam17651 Raco_middle RACo middle region. This family includes reductive activator of CoFeSP (RACo) proteins. Structure analysis of RACo indicates that it contains 4 regions: N-terminal region pfam00111 (residues 3-94) binding the [2Fe-2S] cluster, a linker region (residues 95-125), the middle region (residues 126-206), and the large C-terminal domain pfam14574 (residues 207-630). This entry pertains to the middle region. This region contains residues in its alpha-helices (H6 and H7) that mediate dimerization with subdomain I of the C-terminal domain. 163
60237 407548 pfam17652 Glyco_hydro81C Glycosyl hydrolase family 81 C-terminal domain. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release. This entry represents the helical C-terminal domain. 349
60238 407549 pfam17653 DUF5522 Family of unknown function (DUF5522). This is a family of unknown function. Family members are found in Bacteria and Eukaryotes. In algae, this family is found on the N-terminal of diphthamide synthase family pfam01902 and is predicted to be a membrane transporter belonging to The ATP-binding Cassette (ABC) family. In nematoda, it is found on the N-terminal side of ribosomal protein family L14 pfam00238. It is also found on the N-terminal of Cob(I)alamin adenosyltransferase pfam01923 in other eukaryotes. Family members found in Homo sapiens have been shown to be highly expressed in 15 high-grade neuroendocrine tumor cell lines and YAP1-positive small-cell lung cancer cell lines as well as being up-regulated in two human Multiple Myeloma cell lines in response to selective nuclear export inhibitor KPT-276. The HMM profile of this family reveals 4 highly conserved cysteines. 48
60239 407550 pfam17654 Trnau1ap Selenocysteine tRNA 1 associated proteins. This entry represents the C-terminal region of Selenocysteine tRNA 1 associated proteins (Trnau1ap also known as Secp43). Family members found in Eukaryotes have been shown to serve an essential role in the synthesis of selenoproteins, which have critical functions in numerous biological processes. Selenium deficiency results in a variety of diseases, including cardiac disease. Trnau1ap proteins harbor RNA recognition motifs (RRM) pfam00076 and Tyr-rich region found in the C-terminal. The Tyr-rich region (amino acids 185-225) is conserved among several mammals, including human, chimp, dog, cattle, mouse and rat. Furthermore, constitutive deletion of exons corresponding to the Tyr-rich region in mouse resulted in embryonic lethality. 101
60240 407551 pfam17655 IRK_C Inward rectifier potassium channel C-terminal domain. This cytoplasmic C-terminal domain has an Ig fold. 174
60241 407552 pfam17656 ChapFlgA_N FlgA N-terminal domain. Presumed domain found to N-terminus of SAF-like domain in FlgA proteins. 76
60242 407553 pfam17657 DNA_pol3_finger Bacterial DNA polymerase III alpha subunit finger domain. 166
60243 407554 pfam17658 DUF5520 Family of unknown function (DUF5520). This is a family of unknown function found in Mammalia. 338
60244 407555 pfam17659 DUF5521 Family of unknown function (DUF5521). This is a family of unknown function found in Eukaryota. Family members include the human CXorf57. High-throughput sequencing used to identify genes driving tumorigenesis in Avian leukosis virus (ALV)-induced B-cell lymphomas, showed CXorf57 as the 10th most frequently targeted common integration site by ALV. ALV induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. CXorf57 encodes a protein that has a conserved putative replication factor A protein 1 domain. Several proteins with this domain have been shown to be involved in recognition of DNA damage for nucleotide excision repair. CXorf57 contains 24 unique integration sites that are spaced throughout the gene and in no preferred orientation. 848
60245 407556 pfam17660 BTRD1 Bacterial tandem repeat domain 1. This short domain is found in a wide variety of bacterial proteins. 50
60246 407557 pfam17661 DUF5523 Family of unknown function (DUF5523). This is a family of unknown function found in Eukaryotes. Family members such as the human CC1D2A protein carry the domain architecture where pfam15625 and pfam00168 are found at the C-terminal region of this family. However, other family members do not carry either of the above mentioned Pfam families. 255
60247 407558 pfam17662 DUF5524 Family of unknown function (DUF5524). This is a family of unknown function found in Metazoa. 290
60248 407559 pfam17663 DUF5525 Family of unknown function (DUF5525). This is a family of unknown function found in Chordata. 1017
60249 407560 pfam17664 DUF5526 Family of unknown function (DUF5526). This is a family of unknown function found in Metazoa. 168
60250 407561 pfam17665 DUF5527 Family of unknown function (DUF5527). This is a family of unknown function found in Chordata. 139
60251 407562 pfam17666 DUF5528 Family of unknown function (DUF5528). This is a family of unknown function found in Chordata. 152
60252 407563 pfam17667 Pkinase_fungal Fungal protein kinase. This domain appears to be a variant of the protein kinase domain that is found in a variety of fungal species. 386
60253 407564 pfam17668 Acetyltransf_17 Acetyltransferase (GNAT) domain. 109
60254 375243 pfam17669 DUF5529 Family of unknown function (DUF5529). This is a family of unknown function found in Chordata. 186
60255 407565 pfam17670 DUF5530 Family of unknown function (DUF5530). This is a family of unknown function found in Chordata. 141
60256 407566 pfam17671 DUF5531 Family of unknown function (DUF5531). This is a family of unknown function found in Mammalia. Family members have one or several predicted trans-membrane regions. 151
60257 375246 pfam17672 DUF5589 Family of unknown function (DUF5589). This is a family of unknown function found in Mammalia. Family members contain one or several predicted trans-membrane regions. 89
60258 407567 pfam17673 DUF5532 Family of unknown function (DUF5532). This is a family of unknown function found in mammals. 92
60259 407568 pfam17674 HHH_9 HHH domain. 70
60260 407569 pfam17675 APG6_N Apg6 coiled-coil region. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway. 131
60261 407570 pfam17676 Peptidase_S66C LD-carboxypeptidase C-terminal domain. Muramoyl-tetrapeptide carboxypeptidase hydrolyses a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66. 120
60262 407571 pfam17677 Glyco_hydro38C2 Glycosyl hydrolases family 38 C-terminal beta sandwich domain. This domain is found at the C-terminal end of various glycosyl hydrolases belonging to family 38. The domain has a beta sandwich fold. 73
60263 407572 pfam17678 Glyco_hydro_92N Glycosyl hydrolase family 92 N-terminal domain. This domain is found at the N-terminus of family 92 glycosyl hydrolase proteins. 231
60264 375249 pfam17679 Dip gp37/Dip protein. This protein is found in the giant phage phi KZ. This protein has been shown to bind to RNAse E of the host P. aeruginosa and to inhibit the RNA degradation machinery of the bacterium. 273
60265 407573 pfam17680 FlgO FlgO protein. This entry represents the FlgO protein. Mutation of this protein in Vibrio cholerae has been shown to reduce motility. FlgO is an outer membrane protein that localizes throughout the membrane and not at the flagellar pole. Although FlgO and FlgP do not specifically localize to the flagellum, they are required for flagellar stability. Proteins in this family mostly contain an N-terminal lipoprotein attachment motif. 130
60266 407574 pfam17681 GCP_N_terminal Gamma tubulin complex component N-terminal. This is the N-terminal domain found in components of the gamma-tubulin complex proteins (GCPs). Family members include spindle pole body (SPB) components such as Spc97 and Spc98 which function as the microtubule-organizing center in yeast. Furthermore, family members such as human GCP4 (Gamma-tubulin complex component 4) have been structurally elucidated. Functional studies have shown that the N-terminal domain defines the functional identity of GCPs, suggesting that all GCPs are incorporated into the helix of gamma-tubulin small complexes (gTURCs) via lateral interactions between their N-terminal domains. Thereby, they define the direct neighbors and position the GCPs within the helical wall of gTuRC. Sequence alignment of human GCPs based on the GCP4 structure helped delineate conserved regions in the N- and C-terminal domains. In addition to the conserved sequences, the N-terminal domains carry specific insertions of various sizes depending on the GCP, i.e. internal insertions or N-terminal extensions. These insertions may equally contribute to the function of individual GCPs as they have been implied in specific interactions with regulatory or structural proteins. For instance, GCP6 carries a large internal insertion phosphorylated by Plk4 and containing a domain of interaction with keratins, whereas the N-terminal extension of GCP3 interacts with the recruitment protein MOZART1. 294
60267 407575 pfam17682 Tau95_N Tau95 Triple barrel domain. TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription. 115
60268 407576 pfam17683 TFIIF_beta_N TFIIF, beta subunit N-terminus. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter. 104
60269 407577 pfam17684 SCAB-PH PH domain of plant-specific actin-binding protein. This family is a PH domain found on plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. This domain adopts the PH fold, of seven beta-strands, beta7-beta13 and two alpha-helices, alpha1 and alpha2 arranged into a beta-barrel. The canonical phosphoinositide-binding pocket of the classic PH domain is degenerate in this fused one, and the charge on the pocket suggests that the Ig-PH domain contains a non-canonical binding site for inositol phosphates. 108
60270 375254 pfam17685 DUF5533 Family of unknown function (DUF5533). This is a family of unknown function found in chordata. Family members have multiple predicted transmembrane regions. 139
60271 407578 pfam17686 DUF5534 Family of unknown function (DUF5534). This is a family of unknown function found in mammals. Family members have one or several predicted trans-membrane regions. 183
60272 375256 pfam17687 DUF5535 Family of unknown function (DUF5535). This is a family of unknown function found in mammals. 105
60273 407579 pfam17688 DUF5536 Family of unknown function (DUF5536). This is a family of unknown function found in mammals. 185
60274 407580 pfam17689 Arabino_trans_N Arabinosyltransferase concanavalin like domain. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol. 155
60275 375258 pfam17690 DUF5537 Family of unknown function (DUF5537). This is a family of unknown function found in chordata. 160
60276 407581 pfam17691 Croc_4 Contingent replication of cDNA 4. Family members are 18-kDa serine/threonine-rich polypeptides containing a P-loop motif and an SH3-binding region with phosphorylation sites for a variety of protein kinases (cdc2, CDK2, MAPK, CDK5, protein kinase C, Ca(2+)/calmodulin protein kinase 2, casein kinase 2) involved in cell proliferation and differentiation. Functional studies revealed that expression is associated with proliferating and migrating cells in developing brain. Furthermore, it has been suggested that CROC-4 participates in brain-specific c-fos signaling pathways involved in cellular remodeling of brain architecture. C1orf61 expression was also found associated with the progression of liver disease as well as human embryogenesis. It was shown to be up-regulated in hepatic cirrhosis tissues and further up-regulated in primary hepatocellular carcinoma tumors where it was suggested to play a role as a tumor activator. 79
60277 407582 pfam17692 DUF5538 Family of unknown function (DUF5538). This is a family of unknown function found in primates. 130
60278 375261 pfam17693 DUF5539 Family of unknown function (DUF5539). This is a family of unknown function found in primates. 201
60279 375262 pfam17694 DUF5540 Family of unknown function (DUF5540). This is a family of unknown function found in primates. 80
60280 407583 pfam17695 DUF5541 Family of unknown function (DUF5541). This is a family of unknown function found in primates. 116
60281 407584 pfam17696 DUF5542 Family of unknown function (DUF5542). This is a family of unknown function. The C-terminal has a strongly conserved WxxxW motif. 68
60282 375265 pfam17697 DUF5543 Family of unknown function (DUF5543). This is a family of unknown function found in primates. 118
60283 375266 pfam17698 DUF5544 Family of unknown function (DUF5544). This is a family of unknown function found in primates. 125
60284 375267 pfam17699 DUF5545 Family of unknown function (DUF5545). This is a family of unknown function found in primates. 221
60285 407585 pfam17700 DUF5546 Family of unknown function (DUF5546). This is a family of unknown function found in primates. 135
60286 407586 pfam17701 DUF5547 Family of unknown function (DUF5547). This is a family of unknown function found in mammals. 123
60287 407587 pfam17702 DUF5548 Family of unknown function (DUF5548). This is a family of unknown function found in primates. 177
60288 407588 pfam17703 DUF5549 Family of unknown function (DUF5549). This is a family of unknown function found in mammals. 200
60289 375272 pfam17704 DUF5550 Family of unknown function (DUF5550). This is a family of unknown function found in primates. 123
60290 375273 pfam17705 DUF5551 Family of unknown function (DUF5551). This is a family of unknown function found in primates. 156
60291 375274 pfam17706 DUF5552 Family of unknown function (DUF5552). This is a family of unknown function found in primates. 217
60292 407589 pfam17707 DUF5553 Family of unknown function (DUF5553). This is a family of unknown function found in primates. 223
60293 407590 pfam17708 Gasdermin_C Gasdermin PUB domain. The precise function of this protein is unknown. A deletion/insertion mutation is associated with an autosomal dominant non-syndromic hearing impairment form. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells. This family also includes the gasdermin protein. 174
60294 375277 pfam17709 DUF5554 Family of unknown function (DUF5554). This is a family of unknown function found in primates. 75
60295 407591 pfam17710 DUF5555 Family of unknown function (DUF5555). This is a family of unknown function found in mammals. 295
60296 375279 pfam17711 DUF5556 Family of unknown function (DUF5556). This is a family of unknown function found in primates. 179
60297 375280 pfam17712 DUF5557 Family of unknown function (DUF5557). This is a family of unknown function found in primates. 90
60298 407592 pfam17713 DUF5558 Family of unknown function (DUF5558). This is a family of unknown function found in Homo sapiens. 129
60299 375282 pfam17714 DUF5559 Family of unknown function (DUF5559). This is a family of unknown function found in primates. 194
60300 375283 pfam17715 DUF5560 Family of unknown function (DUF5560). This is a family of unknown function found in primates. 165
60301 407593 pfam17716 DUF5561 Family of unknown function (DUF5561). This is a family of unknown function found in eukaryota. 255
60302 407594 pfam17717 DUF5562 Family of unknown function (DUF5562). This is a family of unknown function found in mammals. Family members have one or several predicted transmembrane regions. 166
60303 375286 pfam17718 DUF5563 Family of unknown function (DUF5563). This is a family of unknown function found in chordata. 192
60304 407595 pfam17719 DUF5564 Family of unknown function (DUF5564). This is a family of unknown function found in chordata. 98
60305 407596 pfam17720 DUF5565 Family of unknown function (DUF5565). This is a family of unknown function found in bacteria and eukaryotes. 324
60306 407597 pfam17721 DUF5566 Family of unknown function (DUF5566). This is a family of unknown function found in chordata. 233
60307 407598 pfam17722 DUF5567 Family of unknown function (DUF5567). This is a family of unknown function found in chordata. 234
60308 407599 pfam17723 RHH_8 Ribbon-Helix-Helix transcriptional regulator family. This family of proteins are likely to be transcriptional regulators that have an N-terminal ribbon-helix-helix domain. Although some members of the family are annotated as CopG, this family does not include that protein. 119
60309 407600 pfam17724 DUF5568 Family of unknown function (DUF5568). This is a family of unknown function found in chordata. 203
60310 407601 pfam17725 YBD YAP binding domain. TEA domain transcription factors contain an N-terminal TEA domain pfam01285 and a C-terminal YAP binding domain (YBD). This entry corresponds to the YBD that binds to the oncoproteins YAP and TAZ. The structure of the YBD shows that it has an Ig-like beta sandwich fold. 206
60311 407602 pfam17726 DpnI_C Dam-replacing HTH domain. Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. This domain represents the C-terminal HTH domain. 69
60312 407603 pfam17727 CtsR_C CtsR C-terminal dimerization domain. This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon. This entry corresponds to the C-terminal dimerization domain. 72
60313 407604 pfam17728 BsuBI_PstI_RE_N BsuBI/PstI restriction endonuclease HTH domain. This family represents the C-terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (EC:3.1.21.4). The enzymes of the BsuBI restriction/modification (R/M) system recognize the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system. 140
60314 407605 pfam17729 DUF5569 Family of unknown function (DUF5569). This is a family of unknown function found in mammals. 227
60315 407606 pfam17730 Centro_C10orf90 Centrosomal C10orf90. This is the N-terminal region found on proteins encoded by C10orf90. Most of the family members carry an ALMS motif on their C-terminal pfam15309. siRNA-mediated functional analysis suggests that the C10orf90 encoded proteins have a role in centrosomal functions. 512
60316 407607 pfam17731 DUF5570 Family of unknown function (DUF5570). This is a family of unknown function found in chordata. Family members contain a transmembrane region in the C-terminal and have been shown to be localized to the Golgi apparatus. 123
60317 407608 pfam17732 DUF5571 Family of unknown function (DUF5571). This is a family of unknown function found in chordata. Family members carry a zinc finger family on the N-terminal pfam15663. 351
60318 407609 pfam17733 DUF5572 Family of unknown function (DUF5572). This is a family of unknown function found in eukaryotes. Family members carry a highly conserved KPWE sequence at the C-terminal. 48
60319 375300 pfam17734 Spt46 Spermatogenesis-associated protein 46. This family is found in chordata. Functional characterization studies showed that the deletion of Spata46 in mice resulted in subfertility with abnormal sperm head shape and a failure of sperm-egg fusion. Spata46 has also been shown to localize to the nuclear membrane by a transmembrane region in the N-terminal. 228
60320 407610 pfam17735 BslA Biofilm surface layer A. This family includes members such as BslA (previously called YuaB). Secreted BslA from Bacillus subtilis has been shown to form surface layers around the biofilm self-assembling at interfaces of B. subtilis biofilms, forming an elastic film. Structural analysis revealed that BslA consists of an Ig-type fold with the addition of an unusual, extremely hydrophobic cap region. The hydrophobic cap exhibits physicochemical properties similar to the hydrophobic surface found in fungal hydrophobins; thus, BslA is defined as a member of a class of bacterially produced hydrophobins. 121
60321 407611 pfam17736 Ig_C17orf99 C17orf99 Ig domain. This Ig domain is found in tandem in the uncharacterized human protein C17orf99, which is found across mammalian species. 95
60322 375302 pfam17737 Ig_C19orf38 Ig domain in C19orf38 (HIDE1). This entry represents an Ig domain found in the uncharacterized human protein C19orf38, which is found across mammals. Family members have one predicted transmembrane region which is not included in this entry. The C19orf38 protein is also known as HIDE1 after Highly expressed in Immature DEndritic cell transcript 1 protein. 91
60323 407612 pfam17738 DUF5575 Family of unknown function (DUF5575). This is a family of unknown function found in chordates. 307
60324 375304 pfam17739 DUF5576 Family of unknown function (DUF5576). This is a family of unknown function found in Hominidae. 134
60325 407613 pfam17740 DUF5577 Family of unknown function (DUF5577). This is a family of unknown function found in Metazoa. 334
60326 407614 pfam17741 DUF5578 Family of unknown function (DUF5578). This is a family of unknown function found in Eukaryotes. 268
60327 375307 pfam17742 DUF5579 Family of unknown function (DUF5579). This is a family of unknown function found in chordates. Family members carry one predicted transmembrane region. 202
60328 407615 pfam17743 DUF5580 Family of unknown function (DUF5580). This is a family of unknown function found in metazoa. 547
60329 407616 pfam17744 DUF5581 Family of unknown function (DUF5581). This is a family of unknown function found in chordates. 315
60330 407617 pfam17745 Ydr279_N Ydr279p protein triple barrel domain. RNases H are enzymes that specifically hydrolyse RNA when annealed to a complementary DNA and are present in all living organisms. In yeast, RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p); this family represents the homologues of Ydr279p. It is not known whether non-yeast proteins in this family fulfil the same function. This domain corresponds to the N-terminal triple barrel domain. 66
60331 407618 pfam17746 SfsA_N SfsA N-terminal OB domain. This family contains sugar fermentation stimulation proteins, which are probably regulatory factors involved in maltose metabolism. This domain corresponds to the N-terminal OB fold. 66
60332 407619 pfam17747 VID27_PH VID27 PH-like domain. This region has been predicted to contain a PH-like domain. 109
60333 407620 pfam17748 VID27_N VID27 N-terminal region. This region may contain a PH domain. 173
60334 407621 pfam17749 MIP-T3_C Microtubule-binding protein MIP-T3 C-terminal region. This protein, which interacts with both microtubules and TRAF3 (tumour necrosis factor receptor-associated factor 3), is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved. 154
60335 407622 pfam17750 Reo_sigmaC_M Reovirus sigma C capsid protein triple beta spiral. This short region forms a triple beta spiral structural motif. 40
60336 407623 pfam17751 SKICH SKICH domain. The SKICH domains of SKIP and PIPP mediate plasma membrane localisation. The functions of the SKICH domains of NDP52 and CALCOCO1 are not known. 102
60337 407624 pfam17752 BLF1 Burkholderia lethal factor 1. This family includes members such as BLF1 (Burkholderia lethal factor 1) also known as BPSL1549. BLF1 is a potent toxin from Burkholderia pseudomallei causing melioidosis. BLF1 interacts with the human translation factor eIF4A causing deamidation of Gln339 to Glu, thereby reducing endogenous host cell protein synthesis and triggering increased stress granule formation, which is associated with translational blocks. Structural analysis of BLF1 revealed an alpha/beta fold comprising a sandwich of two mixed beta-sheets surrounded by loops and alpha-helices, where the beta-sheet core of the catalytic pocket is structurally similar to that of the deamidase domain of CNF1 pfam05785. 213
60338 407625 pfam17753 Ig_mannosidase Ig-fold domain. This Ig-like fold domain is found in mannosidase enzymes. 78
60339 407626 pfam17754 TetR_C_14 MftR C-terminal domain. This domain is found at the C-terminus of TetR like transcription factors including the Mycofactin biosynthesis transcription factor. 112
60340 407627 pfam17755 UvrA_DNA-bind UvrA DNA-binding domain. 110
60341 407628 pfam17756 RET_CLD1 RET Cadherin like domain 1. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that RET contains four consecutive cadherin-like domains (CLD). This entry relates to the first CLD at the N-terminal. Several regions within RET-CLD1 have been shown to be important for ligand-coreceptor binding. CLD1 and CLD2 have a distinctive clamshell shape and CLD1 is essential for CLD2 folding. CLD1 contains 2 sites for GDNF receptor alpha 1 binding. 125
60342 407629 pfam17757 UvrB_inter UvrB interaction domain. This domain is found in the UvrB protein where it interacts with the UvrA protein. 91
60343 407630 pfam17758 Prot_ATP_OB_N Proteasomal ATPase OB N-terminal domain. This is the N-terminal oligonucleotide binding (OB) domain of proteasomal ATPase. 62
60344 407631 pfam17759 tRNA_synthFbeta Phenylalanyl tRNA synthetase beta chain CLM domain. This domain corresponds to the catalytic like domain (CLM) in the beta chain of phe tRNA synthetase. 214
60345 407632 pfam17760 UvrA_inter UvrA interaction domain. This domain found in UvrA proteins interacts with the UvrB protein. 109
60346 407633 pfam17761 DUF1016_N DUF1016 N-terminal domain. This family may include an HTH domain. 136
60347 407634 pfam17762 HTH_ParB HTH domain found in ParB protein. 52
60348 407635 pfam17763 Asparaginase_C Glutaminase/Asparaginase C-terminal domain. This domain is found at the C-terminus of asparaginase enzymes. 114
60349 407636 pfam17764 PriA_3primeBD 3' DNA-binding domain (3'BD). This domain represents the N-terminal DNA-binding domain found in the PriA protein. The 3'BD has been shown to bind the 3' end of the leading-strand arm of replication fork structures. 96
60350 407637 pfam17765 MLTR_LBD MmyB-like transcription regulator ligand binding domain. This domain is found in a family of actinobacterial transcription factors. The structure shows it has a PAS domain like fold and it is bound to Myristic acid. 168
60351 407638 pfam17766 fn3_6 Fibronectin type-III domain. This FN3 like domain is found at the C-terminus of cucumisin proteins. 98
60352 407639 pfam17767 NAPRTase_N Nicotinate phosphoribosyltransferase (NAPRTase) N-terminal domain. Nicotinate phosphoribosyltransferase (EC:2.4.2.11) is the rate limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. This is the N-terminal domain of the enzyme. 124
60353 407640 pfam17768 RecJ_OB RecJ OB domain. This OB-fold is found in RecJ proteins where it binds to ssDNA. 107
60354 407641 pfam17769 PurK_C Phosphoribosylaminoimidazole carboxylase C-terminal domain. This entry represents the C-terminal domain of the PurK enzyme. 56
60355 407642 pfam17770 RNase_J_C Ribonuclease J C-terminal domain. This domain is found at the C-terminus of Ribonuclease J proteins. Its function is unknown, but deletion of this domain causes dissociation to monomers. 102
60356 407643 pfam17771 ADAM_CR_2 ADAM cysteine-rich domain. This cysteine rich domain is found in a variety of ADAM like peptidases. This domain is distantly related to pfam08516. 69
60357 407644 pfam17772 zf-MYST MYST family zinc finger domain. This zinc finger domain is found in the MYST family of histone acetyltransferases. 55
60358 407645 pfam17773 UPF0176_N UPF0176 acylphosphatase like domain. This domain is found at the N-terminus of UPF0176 family proteins. It adopts a fold similar to the pfam00708 family. 92
60359 407646 pfam17774 YlmH_RBD Putative RNA-binding domain in YlmH. This domain adopts an RRM like fold and is found in the B. subtilis YlmH cell division protein. 84
60360 407647 pfam17775 UPF0225 UPF0225 domain. This entry represents an NTF2-like domain found in bacterial proteins. 99
60361 407648 pfam17776 NLRC4_HD2 NLRC4 helical domain HD2. This entry represents a helical domain found in the NLRC4 protein and NOD2 protein. 122
60362 407649 pfam17777 RL10P_insert Insertion domain in 60S ribosomal protein L10P. This domain is found in prokaryotic and archaeal ribosomal L10 protein. 71
60363 407650 pfam17778 BLACT_WH Beta-lactamase associated winged helix domain. This winged helix domain is found at the C-terminus of some beta lactamase enzymes. 46
60364 407651 pfam17779 NOD2_WH NOD2 winged helix domain. This winged helix domain is found in the NOD2 protein. Its molecular function is not known. 57
60365 407652 pfam17780 OCRE OCRE domain. The OCtamer REpeat (OCRE) has been annotated as a 42-residue sequence motif with 12 tyrosine residues in the spliceosome trans-regulatory elements RBM5 and RBM10 (RBM [RNA-binding motif]), which are known to regulate alternative splicing of Fas and Bcl-x pre-mRNA transcripts. The structure of the domain consists of an anti-parallel arrangement of six beta strands. 51
60366 407653 pfam17781 RPN1_RPN2_N RPN1/RPN2 N-terminal domain. This domain is found at the N-terminus of the 26S proteasome regulatory subunits RPN1 and RPN2. The domain is formed by an array of alpha helices. 301
60367 407654 pfam17782 DprA_WH DprA winged helix domain. This winged helix domain is found in the DprA protein. 61
60368 407655 pfam17783 CvfB_WH CvfB-like winged helix domain. This winged helix domain is found in RNA-binding proteins such as CvfB. 58
60369 407656 pfam17784 Sulfotransfer_4 Sulfotransferase domain. This family of proteins is distantly related to sulfotransferase enzymes. This protein in S. mansoni has been shown to be involved in resistance to oxamniquine and to have sulfotransferase activity. 214
60370 407657 pfam17785 PUA_3 PUA-like domain. This PUA-like domain is found at the N-terminus of SAM-dependent methyltransferases. 64
60371 407658 pfam17786 Mannosidase_ig Mannosidase Ig/CBM-like domain. This domain corresponds to domain 4 in the structure of Bacteroides thetaiotaomicron beta-mannosidase, BtMan2A. This domain has an Ig-like fold. 91
60372 407659 pfam17787 PH_14 PH domain. This entry corresponds to the PH domain found at the N-terminus of phospholipase C enzymes. 131
60373 407660 pfam17788 HypF_C HypF Kae1-like domain. This domain is found in the HypF protein. In the structure it is one of the two subdomains of the Kae1 domain. 99
60374 407661 pfam17789 MG4 Macroglobulin domain MG4. This domain is MG4 found in complement C3 and C5 proteins. 95
60375 407662 pfam17790 MG1 Macroglobulin domain MG1. This entry represents the N-terminal macroglobulin domain found in complement proteins C3, C4 and C5. 101
60376 407663 pfam17791 MG3 Macroglobulin domain MG3. This entry corresponds to the MG3 domain found in complement components C3, C4 and C5. 83
60377 407664 pfam17792 ThiD2 ThiD2 family. This domain functions as a ThiD protein and is called the ThiD2 family. The domain is associated with the ThiE domain in some proteins. 124
60378 407665 pfam17793 AHD ANC1 homology domain (AHD). This entry corresponds to the ANC1 homology domain (AHD) found in AF-9. 61
60379 407666 pfam17794 Vault_2 Major Vault Protein repeat domain. This short domain is found repeated numerous times in the Major Vault Protein. This entry is related to pfam01505. 60
60380 407667 pfam17795 Vault_3 Major Vault Protein Repeat domain. This domain is found in the Major Vault Protein. 62
60381 407668 pfam17796 Vault_4 Major Vault Protein repeat domain. 61
60382 407669 pfam17797 RL RL domain. The RRM-like (RL) domain is found in the N-terminal region of the polyA polymerase PAPD1. It contributes to PAPD1 dimerization and has a fold similar to RNP-type RBDs. 71
60383 407670 pfam17798 TRIF-NTD TRIF N-terminal domain. The N-terminal domain of TRIF/TICAM-1 has a structure that consists of eight antiparallel helices. This domain is believed to be involved in self-regulation of TRIF by interacting with its TIR domain. 157
60384 407671 pfam17799 RRM_Rrp7 Rrp7 RRM-like N-terminal domain. This domain corresponds to the N-terminal RNA-binding domain found in the Rrp7 protein. It has an RRM-like fold with a circular permutation. 160
60385 407672 pfam17800 NPL Nucleoplasmin-like domain. 88
60386 407673 pfam17801 Melibiase_C Alpha galactosidase C-terminal beta sandwich domain. This domain is found at the C-terminus of alpha galactosidase enzymes. 74
60387 407674 pfam17802 SpaA Prealbumin-like fold domain. This entry contains a prealbumin-like domain from a wide variety of bacterial surface proteins. This entry corresponds to domain 1 and domain 3 of SpaA from Corynebacterium diphtheriae. Some members of this family contain an isopeptide bond. 72
60388 407675 pfam17803 Cadherin_4 Bacterial cadherin-like domain. This entry contains numerous bacterial cadherin-like domains found in extracellular proteins. 71
60389 407676 pfam17804 TSP_NTD Tail specific protease N-terminal domain. The N-terminal domain of tail specific proteases has a novel fold composed of 10 alpha helices. 187
60390 407677 pfam17805 AsnC_trans_reg2 AsnC-like ligand binding domain. This entry contains an AsnC-like ligand binding domain. 86
60391 375343 pfam17806 SO_alpha_A3 Sarcosine oxidase A3 domain. This short domain is found in Heterotetrameric Sarcosine Oxidase's alpha A3 domain. This domain binds to FMN in sarcosine oxidase. This domain is related to pfam04324 but lacks its iron binding cysteine residues. 87
60392 407678 pfam17807 zf-UBP_var Variant UBP zinc finger. This domain is found in ubiquitin C-terminal hydrolase enzymes and is related to the pfam02148 domain. However, it has an altered pattern of zinc binding residues. 64
60393 375345 pfam17808 fn3_PAP Fn3-like domain from Purple Acid Phosphatase. This entry represents an N-terminal Fn3-like domain found at the N-terminus of purple acid phosphatase enzymes. 118
60394 375346 pfam17809 UPA_2 UPA domain. The UPA domain is conserved in UNC5, PIDD, and Ankyrins. It has a beta sandwich structure. 131
60395 407679 pfam17810 Arg_decarb_HB Arginine decarboxylase helical bundle domain. This entry represents a helical bundle domain that is found between the two enzymatic domains of the arginine decarboxylases. 84
60396 407680 pfam17811 JHD Jumonji helical domain. This 4-helix bundle domain is associated with the Jumonji domain pfam02373. 104
60397 407681 pfam17812 RET_CLD3 RET Cadherin like domain 3. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that RET contains four consecutive cadherin-like domains (CLD). This entry relates to CLD3. Classical cadherin calcium-coordinating motifs can be found between CLD2 and CLD3. 114
60398 407682 pfam17813 RET_CLD4 RET Cadherin like domain 4. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that the ligand-binding RET ectodomain (RET-ECD) contains four consecutive cadherin-like domains (CLD1-CLD4) followed by a membrane-proximal cysteine-rich domain (CRD). This entry relates to CLD4 which is required for CRD folding. 104
60399 375350 pfam17814 LisH_TPL LisH-like dimerisation domain. TOPLESS (TPL) proteins have a highly conserved N-terminal domain containing a lissencephaly homologous (LisH) dimerization motif. 30
60400 407683 pfam17815 PDZ_3 PDZ domain. This entry contains the second PDZ domain from plant peptidases such as Deg2. This domain is involved in cage assembly. 145
60401 407684 pfam17816 PDZ_4 PDZ domain. This entry represents a PDZ domain that is found in the CPAF protein from Chlamydia trachomatis. 114
60402 407685 pfam17817 PDZ_5 PDZ domain. This entry corresponds to PDZ domains found in neurabin and spinophilin proteins. The PDZ domain in spinophilin mediates its interaction with protein phosphatase PP1. 73
60403 407686 pfam17818 KCT2 Keratinocyte-associated gene product. This entry includes Keratinocyte-associated transmembrane protein 2 found in humans. Functional studies show that KCP2 localizes to the endoplasmic reticulum, consistent with a role in protein biosynthesis, and has a functional KKxx retrieval signal at its cytosolic C-terminus. 187
60404 407687 pfam17819 DUF5582 Family of unknown function (DUF5582). This is a family of unknown function found in chordata. 146
60405 407688 pfam17820 PDZ_6 PDZ domain. This entry represents the PDZ domain from a wide variety of proteins. 54
60406 375357 pfam17821 DUF5583 Family of unknown function (DUF5583). This is a family of unknown function found in chordata. 129
60407 407689 pfam17822 DUF5584 Family of unknown function (DUF5584). This is a family of unknown function found in chordata. 230
60408 407690 pfam17823 DUF5585 Family of unknown function (DUF5585). This is a family of unknown function found in chordata. 506
60409 407691 pfam17824 DUF5586 Family of unknown function (DUF5586). This is a family of unknown function found in chordata. 404
60410 407692 pfam17825 DUF5587 Family of unknown function (DUF5587). This is a family of unknown function found in chordata. 1440
60411 375362 pfam17826 DUF5588 Family of unknown function (DUF5588). This is a family of unknown function found in chordata. 362
60412 407693 pfam17827 PrmC_N PrmC N-terminal domain. This entry corresponds to the N-terminal alpha helical domain of the HemK protein. HemK is a methyltransferase enzyme that carries out the methylation of the N5 nitrogen of the glutamine found in the conserved GGQ motif of class-1 release factors. 71
60413 407694 pfam17828 FAS_N N-terminal domain in fatty acid synthase subunit beta. This entry represents the N-terminal domain found in fatty acid synthase proteins. 127
60414 407695 pfam17829 GH115_C Glycosyl hydrolase family 115 C-terminal domain. This domain is found at the C-terminus of glycosyl hydrolase family 115 proteins. This domain has a beta-sandwich fold. 172
60415 407696 pfam17830 STI1 STI1 domain. This entry corresponds to the STI1 domain that is found in two copies in the Sti1 protein. 55
60416 375365 pfam17831 PDH_E1_M Pyruvate dehydrogenase E1 component middle domain. This entry represents one of the thiamin diphosphate-binding domains found in pyruvate dehydrogenase E1 component. 230
60417 407697 pfam17832 Pre-PUA Pre-PUA-like domain. This Pre-PUA-like domain is found in a wide variety of proteins including the eukaryotic translation initiation factor 2D, where it is found at the N-terminus. 86
60418 407698 pfam17833 UPF0113_N UPF0113 Pre-PUA domain. 81
60419 407699 pfam17834 GHD Beta-sandwich domain in beta galactosidase. This entry corresponds to a beta sandwich like domain found in glycosyl hydrolase family 35 beta galactosidase enzymes. 72
60420 407700 pfam17835 NOG1_N NOG1 N-terminal helical domain. This domain is found at the N-terminus of NOG1 GTPase proteins. 160
60421 407701 pfam17836 PglD_N PglD N-terminal domain. This alpha/beta domain is found at the N-terminus of proteins such as PglD. This domain binds a UDP-sugar substrate. 78
60422 407702 pfam17837 4PPT_N 4'-phosphopantetheinyl transferase N-terminal domain. This entry represents the N-terminal domain from 4'-phosphopantetheinyl transferase enzymes. This domain is structurally related to the pfam01648 domain with which it forms a pseudodimeric arrangement. 68
60423 407703 pfam17838 PH_16 PH domain. 122
60424 407704 pfam17839 CNP_C_terminal C-terminal domain of cyclic nucleotide phosphodiesterase. This is the C-terminal domain found in Listeria monocytogenes Lmo2642 cyclic nucleotide phosphodiesterase. The auxiliary C-terminal domain consists of five alpha-helices forming a long helical bundle, and is connected to the catalytic domain pfam00149 by two loop segments. It is suggested that this auxiliary domain of Lmo2642 might confer functional specificity to the protein through the interactions with unknown factors or involving the substrate recognition. 108
60425 407705 pfam17840 Tugs Tethering Ubl4a to BAGS domain. This is the C-terminal domain of Ubiquitin-like protein 4A an ortholog of yeast Get5. In budding yeast, GET proteins directly mediate the insertion of newly synthesized TA proteins into endoplasmic reticulum membranes. Similarly, mammalian BAG6, Ubl4a, and SGTA make up a trimeric complex that binds TA proteins post-translationally and then loads them onto the cytosolic ATPase TRC40, which in turn targets them to the endoplasmic reticulum. Structural studies show that this C-terminal TUGS domain of Ubl4a is essential for BAG6 tethering. Given that BAG6 mediates oligomeric complex formation of Ubl4a, TRC35, and TRC40 (mammalian counterparts of Get5, Get4, and Get3, respectively), the C-terminal TUGS domain might be crucial for supporting BAG6-mediated Ubl4a-TRC35 complex formation in humans as an alternative to the direct Get5-Get4 interaction in yeast. 47
60426 407706 pfam17841 Bep_C_terminal BID domain of Bartonella effector protein (Bep). This entry is the BID (Bep intracellular delivery) domain located at the C-terminal of Bartonella effector proteins (Beps). It functions as a secretion signal in a subfamily of protein substrates of bacterial type IV secretion (T4S) systems. It mediates transfer of (1) relaxases and the attached DNA during bacterial conjugation, and (2) numerous Beps during protein transfer into host cells infected by pathogenic Bartonella species. Crystal structures of several representative BID domains show a conserved fold characterized by a compact, antiparallel four-helix bundle topped with a hook. 97
60427 407707 pfam17842 dsRBD2 Double-stranded RNA binding domain 2. This domain is found in HEN1 proteins from Arabidopsis. Structural characterization reveals that small RNA substrates bind to two double-stranded RNA (dsRNA)-specific binding domains, dsRBD1 and dsRBD2. This entry relates to dsRBD2 which together with dsRBD1 forms a strong grip on the duplex region of the small RNA substrate, and these interactions help position the other duplex terminus towards the MTase domain pfam13847. 147
60428 407708 pfam17843 MycE_N MycE methyltransferase N-terminal. This is the N-terminal domain found in MycE from the mycinamicin biosynthetic pathway. MycE is a tetramer of a two-domain polypeptide, comprising a C-terminal catalytic MT domain and an N-terminal auxiliary domain, which is important for quaternary assembly and for substrate binding. 110
60429 407709 pfam17844 SCP_3 Bacterial SCP ortholog. This domain is found in MSMEG_5817 gene product from M. smegmatis. It has been shown to be vital for mycobacterial survival within host macrophages. Crystal structure revealed a Rossmann-like fold alpha/beta two-layer sandwich forming a highly hydrophobic interface cavity and with high structural homology to the SCP family. Hence, it has been suggested that this domain may be involved in the interaction of apolar ligands through its hydrophobic cavity. Alanine-scanning mutagenesis of the hydrophobic cavity of MSMEG_5817 protein demonstrated that the conserved Val82 residue plays an important role in ligand binding. 93
60430 407710 pfam17845 FbpC_C_terminal FbpC C-terminal regulatory nucleotide binding domain. Most functional ABC transporters are composed of at least four sub-units: two trans-membrane (TM) domains where the transport process takes place and two cytoplasmic nucleotide binding domains (NBDs) providing the energy required for active transport. This entry is one of the two NBDs found at the C-terminal domain of FbpC, ferric iron uptake transporter, from Neisseria gonorrhoeae. The C-terminal regulatory domain adopts two OB-folds per monomer. These are similar in topology to those seen in the NBD (nucleotide binding domain) from the maltose uptake ABC transporter, MalK. However, FbpC does not open as far as MalK when ATP is removed from their respective closed structures. This difference was suggested to be due to the substantial domain swap in the regulatory domain of FbpC. 55
60431 375377 pfam17846 XRN_M Xrn1 helical domain. This helical domain is part of the Xrn1 catalytic core. Xrn1 is a cytoplasmic 5'-3' exonuclease that degrades decapped mRNAs. 442
60432 375378 pfam17847 GlcV_C_terminal Glucose ABC transporter C-terminal domain. This is the C-terminal domain found at the ATPase subunit of the glucose ABC transporter from Sulfolobus solfataricus. Overall, the C-terminal domain (residues 243-353) contains only beta-strands, which form an elongated barrel-shaped structure composed of two parts. This entry represents the upper part which includes a three-stranded anti-parallel beta-sheet and two small anti-parallel beta-strands. The overall structure of this domain is very similar to that of the C-terminal domain of MalK from T. litoralis; however, the function of the C-terminal domain in GlcV is not clear. 61
60433 407711 pfam17848 zf-ACC Acetyl-coA carboxylase zinc finger domain. Acetyl-coA carboxylase (ACC) is a central metabolic enzyme that catalyzes the committed step in fatty acid biosynthesis: biotin-dependent conversion of acetyl-coA to malonyl-coA. In bacteria this protein contains a small zinc finger domain. 26
60434 407712 pfam17849 OB_Dis3 Dis3-like cold-shock domain 2 (CSD2). This domain has an OB fold and is found in the Dis3l2 protein. This domain along with CSD1 binds to RNA. 77
60435 407713 pfam17850 CysA_C_terminal CysA C-terminal regulatory domain. ABC (ATP-binding cassette) transporters share a common architecture comprising two variable hydrophobic transmembrane domains (TMDs) that form the translocation pathway and two conserved hydrophilic ABC-ATPases that hydrolyze ATP. This is the C-terminal regulatory domain found at the ATPase subunit of CysA, a putative sulfate ABC transporter from Alicyclobacillus acidocaldarius. The regulatory domain of CysA is built up of an elongated beta-barrel composed of two beta-sandwiches that form a common hydrophobic core. 43
60436 407714 pfam17851 GH43_C2 Beta xylosidase C-terminal Concanavalin A-like domain. This domain is found to the C-terminus of the pfam04616 domain. This domain adopts a concanavalin A-like fold. 203
60437 407715 pfam17852 Dynein_AAA_lid Dynein heavy chain AAA lid domain. This entry corresponds to the extension domain of AAA domain 5 in the dynein heavy chain. This domain is composed of 8 alpha helices. 126
60438 407716 pfam17853 GGDEF_2 GGDEF-like domain. This domain is distantly related to the GGDEF domain, suggesting these may be diguanylate cyclase enzymes. 116
60439 407717 pfam17854 FtsK_alpha FtsK alpha domain. FtsK is a DNA translocase that coordinates chromosome segregation and cell division in bacteria. In addition to its role as activator of XerCD site-specific recombination, FtsK can translocate double-stranded DNA (dsDNA) rapidly and directionally and reverse direction. FtsK can be split into three domains called alpha (this entry), beta and gamma. The alpha and beta domains contain the core ATPase machinery of the DNA translocase. 101
60440 407718 pfam17855 MCM_lid MCM AAA-lid domain. This entry represents the AAA-lid domain found in MCM proteins. 86
60441 407719 pfam17856 TIP49_C TIP49 AAA-lid domain. This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the pfam00004 domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. Although the function of this gene family has not been elucidated, they are supposed to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases. 66
60442 375385 pfam17857 AAA_lid_1 AAA+ lid domain. This domain represents the AAA lid domain from dynein heavy chain D3. 100
60443 375386 pfam17858 Defensin_int Platypus intermediate defensin-like peptide. This entry represents a defensin like peptide identified in the platypus genome. Structurally it resembles the beta defensins. The peptide was found to display potent antimicrobial activity against Staphylococcus aureus and Pseudomonas aeruginosa. 44
60444 375387 pfam17859 Pelovaterin Pelovaterin. The pelovaterin peptide is a major intracrystalline peptide found in turtle eggshell. The global fold of pelovaterin is similar to that of human beta-defensins. Pelovaterin exhibits strong antimicrobial activity against two pathogenic gram-negative bacteria, Pseudomonas aeruginosa and Proteus vulgaris. 42
60445 375388 pfam17860 Defensin_RK-1 RK-1-like defensin. This family includes RK-1 a defensin like peptide from rabbit kidney. The family also includes some rat alpha defensins. 34
60446 375389 pfam17861 Laterosporulin Laterosporulin defensin-like peptide. This entry corresponds to a bacteriocin from the bacterium Brevibacillus laterosporus called laterosporulin. This peptide has a defensin-like structure. 49
60447 407720 pfam17862 AAA_lid_3 AAA+ lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 45
60448 407721 pfam17863 AAA_lid_2 AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 73
60449 380039 pfam17864 AAA_lid_4 RuvB AAA lid domain. The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This entry contains the AAA lid domain that is found to the C-terminus of the AAA domain. 74
60450 407722 pfam17865 AAA_lid_5 Midasin AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. This lid domain is found in midasin proteins. 104
60451 407723 pfam17866 AAA_lid_6 AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 60
60452 407724 pfam17867 AAA_lid_7 Midasin AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. This lid domain is found in midasin proteins. 106
60453 407725 pfam17868 AAA_lid_8 AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 72
60454 407726 pfam17869 Cys_box Anosmin cysteine rich domain. This is the Cys-box (cysteine-rich) domain found on the N-terminal of anosmin-1 proteins. It is suggested that the Cys-box domain may resemble the cysteine-rich region of the insulin-like growth factor receptor. Family members are found in chordates. 82
60455 407727 pfam17870 Insulin_TMD Insulin receptor trans-membrane segment. This entry represents the trans-membrane domain (TMD) found in insulin receptor proteins. The TMD of the insulin receptor is within the beta-subunit and contains 23 amino acids. Mutations in the TMD were shown to have effects on receptor biosynthetic processing and kinase activation. Substitution of the entire TMD of the insulin receptor (IR) resulted in constitutive kinase activation in vitro, while replacing the TMD with that of glycophorin A inhibited insulin action. Structural studies show that TMD contains a helix and a kink when it is purified in dodecylphosphocholine (DPC) micelles. The residues 942-948 preceding the TMD have a propensity to be a short helix and may interact with membrane. 47
60456 407728 pfam17871 AAA_lid_9 AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 104
60457 407729 pfam17872 AAA_lid_10 AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. 99
60458 375396 pfam17873 Rep_1B Replicase polyprotein 1ab. This entry relates to a regulatory domain found in replicase polyprotein 1ab found in Arterivirus. Structural studies of arterivirus helicase (nsp10), indicate that this domain undergoes conformational changes on substrate binding. Besides the large conformational change, it is suggested that the regions at the surface of domain 1B not directly involved in DNA binding may become flexible. For example, domain 1B residues Arg95, Gly125 and Ala131 become disordered after DNA binding. Together with domains 1A and 2A it forms a nucleic acid-binding channel where the single-stranded part of the DNA substrate is bound to. 53
60459 407730 pfam17874 TPR_MalT MalT-like TPR region. This entry contains a series of TPR repeats. 336
60460 407731 pfam17875 RPA43_OB RPA43 OB domain in RNA Pol I. This is the OB domain found in RPA43 proteins (DNA-directed RNA polymerase I subunit RPA43, also known as A43) in yeast. Functional analysis of RNA polymerase I shows that subunits A14 and A43 form the heterodimer A14/43, which is distantly related to Rpb4/7 in Pol II and C17/25 in Pol III. Crystal structure analysis shows that the A43-A14 heterodimer forms the stalk that provides a platform for initiation factors and interacts with newly synthesized RNA. 111
60461 407732 pfam17876 CSD2 Cold shock domain. Crystallographic structure analysis of E. coli wild-type RNase II revealed that the amino-terminal region starts with an alpha-helix followed by two consecutive five-stranded anti-parallel beta-barrels, identified as cold-shock domains (CSD1 and CSD2). This entry relates to CSD2 which lacks the typical sequence motifs RNPI and RNPII but contributes to RNA binding. 74
60462 407733 pfam17877 Dis3l2_C_term DIS3-like exonuclease 2 C terminal. This is the C-terminal S1 domain found in Dis3L2 proteins. Dis3L2 belongs to the RNase II/R 3-5 exonuclease superfamily, which includes the catalytic subunit of the RNA exosome in yeast and in humans. 87
60463 407734 pfam17878 ssDBP Single-stranded DNA-binding protein. Family members include single-stranded DNA binding protein encoded by the filamentous Pseudomonas bacteriophage Pf3. 72
60464 375400 pfam17879 DNA_ligase_C DNA ligase C-terminal domain. This is the C-terminal domain found in ATP-Dependent DNA Ligase from Bacteriophage T7. This domain has no ligase activity, however together with the N-terminal domain they bind to double-stranded DNA consistent with the idea that the DNA-binding site is between the domains in the intact protein. Furthermore, although the fold of domain 2 is very similar to a number of other proteins that bind single-stranded DNA, this domain does not bind to single-stranded DNA but instead has a high affinity for double-stranded DNA. 109
60465 407735 pfam17880 Yos9_DD Yos9 dimerization domain. This is the dimerization domain (DD) found in Yos9 proteins in yeast. Structural analysis revealed that this domain contributes to self association of Yos9. The overall fold of the domain can be classified as an alpha-beta-roll architecture, comprising two alpha-helices and seven beta-strands. 128
60466 407736 pfam17881 DUF5590 Domain of unknown function (DUF5590). This is a domain of unknown function found in bacterial proteins. 45
60467 407737 pfam17882 SBD OAA-family lectin sugar binding domain. This domain is found in the agglutinin family of lectins. Oscillatoria agardhii agglutinin (OAA)-family lectins comprise either one or two homologous domains, with a single domain possessing two glycan binding sites. OAA is one of the lectins with anti-HIV activity. This sugar binding domain is also found in Pseudomonas fluorescens agglutinin (PFA) and myxobacterial hemagglutinin (MBHA), where MBHA contains two sugar-binding domains (i.e. 4 sugar binding sites), whereas OAA and PFA are single-domain proteins (i.e. 2 sugar binding sites). 76
60468 407738 pfam17883 MBG MBG domain. This domain is found in a variety of bacterial extracellular proteins. Although initially described as having a divergent Ig fold this domain has a novel topology that is like a mirror image of the beta grasp fold. Hence the name of Mirror Beta Grasp (MBG) domain. 99
60469 407739 pfam17884 DUF5591 Domain of unknown function (DUF5591). This is a domain of unknown function found in archaeal tRNA-guanine transglycosylase (EC:2.4.2.48) and in archaeosine synthase (EC:2.6.1.97) proteins. 149
60470 407740 pfam17885 Smoa_sbd Styrene monooxygenase A putative substrate binding domain. This domain is found in the 46 kDa FAD-specific styrene epoxidase (SMOA) protein, comprises a part of the styrene monooxygenase (SMO) two-component flavoprotein monooxygenase enzyme. Structural analysis indicates that SMOA monomer comprises two globular domains spanned by a long alpha-helix. This domain contains a putative substrate binding site. 108
60471 407741 pfam17886 ArsA_HSP20 HSP20-like domain found in ArsA. This domain is found at the C-terminus of ArsA like proteins. This domain is related to HSP20. 63
60472 407742 pfam17887 Jak1_Phl Jak1 pleckstrin homology-like domain. This entry is for the pleckstrin homology-like (PHL) subdomain found in Jak1 proteins. JAK1 is a member of the Janus kinase (JAK) family of non-receptor tyrosine kinases that are activated in response to cytokines and interferons. PHL (residues 283-419) together with the N-terminal ubiquitin-like subdomain (residues 36-111) and an acyl-coenzyme A binding protein-like subdomain (residues 148-282), associate into a canonical tri-lobed FERM domain. 145
60473 407743 pfam17888 Carm_PH Carmil pleckstrin homology domain. This is a non-canonical pleckstrin homology (PH) domain connected to a 16-leucine-rich repeat domain found in CARMIL (CP Arp2/3 complex myosin-I linker) proteins. The PH domain is interconnected with an N-terminal helix (N-helix), residues 10-20 and a C-terminal linker (Linker), residues 129-147 in mouse F-actin-uncapping protein LRRC16A. Structural and functional studies indicate that the PH domain is involved in direct binding to the PM (plasma membrane) and that a HD (helical domain) is responsible for antiparallel dimerization and enhancement of CARMIL's membrane-binding activity. Furthermore, it appears that CARMIL's PH domain mediates non-specific binding to the membrane, in contrast to other PH domains that bind polyphosphorylated phosphatidylinositides, which are thought to function as signalling lipids. 94
60474 407744 pfam17889 NLRC4_HD NLRC4 helical domain. This is a helical domain found in NLRC4, Nucleotide-binding and oligomerization domain-like receptor (NLR) proteins. Structural and functional studies indicate that the helical domain HD2 repressively contacted a conserved and functionally important alpha-helix of the NBD (nucleotide binding domain) in mouse NLR family CARD domain-containing protein 4. Furthermore, the HD2 domain was shown to cap the N-terminal side of the LRR (leucine-rich repeat) domain via extensive interactions. Other family members carrying this domain include baculoviral IAP repeat-containing protein 1 (Birc1) also known as neuronal apoptosis inhibitory protein (Naip). 106
60475 407745 pfam17890 WW_like Peptidoglycan hydrolase LytB WW-like domain. Structural analysis revealed that the catalytic domain of LytB consists of three structurally independent modules: SH3b, WW domain-like, and the glycoside hydrolase family 73 (GH73). This entry is the WW like domain found in endo-beta-N-acetylglucosaminidase LytB from Streptococcus pneumoniae. Functional analysis shows that the deletion of both SH3b and WW modules almost completely abolished the activity of LytB. Furthermore, it was shown that the SH3b and WW modules are indispensable for LytB in cell separation. 53
60476 407746 pfam17891 FluMu_N Mu-like prophage FluMu N-terminal domain. Structural analysis of HI1506 (also known as Mu-like prophage FluMu protein gp35) from Haemophilus influenzae shows that HI1506 consists of two structured domains connected by an unstructured 30 amino acid loop. This entry is the N-terminal domain which comprises a three-stranded antiparallel beta-sheet packed against an alpha-helix. 49
60477 407747 pfam17892 Cadherin_5 Cadherin-like domain. 98
60478 380048 pfam17893 Cas9_b_hairpin CRISPR-associated endonuclease Cas9 beta-hairpin domain. This is the beta-hairpin domain found in Cas9 proteins from Actinobacteria. The beta-hairpin domain is not conserved in all type II-C Cas9 proteins. The beta-hairpin in Cas9 from Streptococcus pyogenes blocks the HNH domain active site. 90
60479 407748 pfam17894 Cas9_Topo Topo homology domain in CRISPR-associated endonuclease Cas9. This is the Topo-homology domain found in Cas9 proteins from Actinobacteria. This domain bears structural similarity to a domain found in topoisomerase II. 71
60480 407749 pfam17895 Dicer_N Giardia Dicer N-terminal domain. This is the N-terminal domain found in dicer proteins from Giardia intestinalis. The N-terminal region (i.e. the platform domain) forms a flat surface composed of antiparallel beta sheet and three alpha helices. This surface contains a large positively charged region that could interact directly with the negatively charged phosphodiester backbone of the modeled dsRNA helix. 144
60481 375411 pfam17896 Nsp2a_N Replicase polyprotein 1a N-terminal domain. This is the N-terminal domain found in Replicase polyprotein 1a (also known as non-structural protein 2a-Nsp2a). Family members are found in Gammacoronaviruses. 358
60482 407750 pfam17897 VCPO_N Vanadium chloroperoxidase N-terminal domain. This is the N-terminal domain found in Vanadium chloroperoxidase proteins found in fungi. 215
60483 407751 pfam17898 GerD Spore germination GerD central core domain. This is the central core domain found in GerD from a thermophilic Bacillus. GerD plays a critical role in nutrient receptor-mediated spore germination in Bacillus species. The crystal structure of GerD reveals this domain as a trimeric superhelical rope fold. Alterations in GerD structure have profound effects on spores' nutrient germination. 114
60484 407752 pfam17899 Peptidase_M61_N Peptidase M61 N-terminal domain. This domain is found at the N-terminus of pfam05299 and has a beta sandwich-like fold with similarity to the baculovirus p35 protein. 168
60485 407753 pfam17900 Peptidase_M1_N Peptidase M1 N-terminal domain. This domain is found at the N-terminus of aminopeptidases from the M1 family. 186
60486 375414 pfam17901 EF-hand_12 EF-hand fold domain. This domain is found in Dd-STATa, a STAT protein (Signal transducer and activator of transcription A) which transcriptionally regulates cellular differentiation in Dictyostelium discoideum. The EF-hand domains are predicted to contain several basic residues that lie close to the DNA backbone. 93
60487 407754 pfam17902 SH3_10 SH3 domain. This entry represents an SH3 domain. 65
60488 407755 pfam17903 KH_8 Krr1 KH1 domain. This entry represents the first KH domain in the KRR1 protein. Krr1 is a ribosomal assembly factor. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif and is involved in binding another assembly factor, Kri1. 81
60489 407756 pfam17904 KH_9 FMRP KH0 domain. This entry corresponds to the KH0 domain from the FMRP protein. This is a divergent KH domain that was discovered through solving the structure of an N-terminal fragment of the FMRP protein. KH0 does not have the canonical G-X-X-G motif between helices A and B. It has been suggested that this domain may be involved in RNA binding. 85
60490 407757 pfam17905 KH_10 GLD-3 KH domain 5. This entry corresponds to KH5 of the GLD-3 protein. The 4 KH domains KH2 to KH5 form a proteolytically stable structure. 95
60491 407758 pfam17906 HTH_48 HTH domain in Mos1 transposase. The N-terminal domain of the Mos1 Mariner transposase comprises two HTH domains. This HTH domain binds in the DNA major groove to the transposon's inverted repeats. 50
60492 407759 pfam17907 AWS AWS domain. This entry represents the AWS (associated with SET domain) domain. This is a zinc binding domain. The full AWS domain contains 8 cysteines. This entry represents the N-terminal part of the domain, with the C-terminal part interwoven with the SET domain. 39
60493 407760 pfam17908 APAF1_C APAF-1 helical domain. This domain represents the C-terminal alpha helical domain of the apoptotic Apaf-1 protein. 135
60494 375422 pfam17909 Htr2 Htr2 transmembrane domain. Archaebacterial photoreceptors mediate phototaxis by regulating cell motility through two-component signaling cascades like those found in chemotaxis signaling chains of enteric bacteria. The photoreceptor sensory rhodopsin II from N. pharaonis (NpSRII) in complex with its cognate transducer NpHtrII serves as a system for transmembrane signal transfer. This entry is for the transmembrane domain of the transducer HtrII. Studies suggest that conformation changes of the NpSRII/NpHtrII complex may be crucial for the mechanism of signal propagation spanning the membrane domain and feeding into the HAMP domain. Furthermore, HtrII in H. salinarum not only transmits the signal from the photoreceptor SRII but also operates as a chemoreceptor. 61
60495 407761 pfam17910 FeoB_Cyto FeoB cytosolic helical domain. FeoB is a G-protein coupled membrane protein essential for Fe(II) uptake in prokaryotes. In the structures, a canonical G-protein domain (G domain) is followed by a helical bundle domain (S-domain) which is represented by this entry. 90
60496 407762 pfam17911 Ski2_N Ski2 N-terminal region. This region is the N-terminal extended region found in the Ski2 protein. The Ski complex is a conserved multiprotein assembly required for the cytoplasmic functions of the exosome, including RNA turnover, surveillance, and interference. Ski2, Ski3, and Ski8 assemble in a tetramer with 1:1:2 stoichiometry. 134
60497 407763 pfam17912 OB_MalK MalK OB fold domain. This entry corresponds to one of two OB-fold domains found in the MalK transport protein. 53
60498 407764 pfam17913 FHA_2 FHA domain. This entry represents a divergent FHA domain which in PNK binds to phosphorylated segment of XRCC1. 97
60499 407765 pfam17914 HopA1 HopA1 effector protein family. This family includes the HopA1 effector protein from Pseudomonas syringae. Structurally this protein has an alpha + beta fold. The effector protein HopA1 was shown to affect the EDS1 complex by binding EDS1 directly and activating the immune response signaling pathway. 170
60500 375426 pfam17915 zf_Rg Reverse gyrase zinc finger. This is the N-terminal zinc finger domain present in reverse gyrase proteins. Most reverse gyrases conserve the N-terminal zinc finger of the zinc ribbon type, pointing to a crucial function of this domain. Structure of Thermotoga maritima reverse gyrase elucidates that the N-terminal zinc finger firmly attaches the H1 (helicase 1) domain to the topoisomerase domain contributing to double-strand DNA (dsDNA) binding. 49
60501 407766 pfam17916 LID LIM interaction domain (LID). LIM-homeodomain (LIM-HD) proteins are transcription factors that are critical in the development of many cell types and tissues. The activity of LIM-HD proteins is dependent on the essential cofactor LIM domain-binding protein 1 (Ldb1). This entry represents a 30-residue LIM interaction domain (LID) that binds to the LIM domains of all LIM-HD and closely related LIM-only (LMO) proteins. Isl1 (Insulin gene enhancer protein 1) and Isl2 each contain a LID-like sequence in their C-terminal regions, the Lhx3-binding domain (LBD), which binds the LIM domains Lhx3 and Lhx4. 29
60502 407767 pfam17917 RT_RNaseH RNase H-like domain found in reverse transcriptase. DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain. 104
60503 407768 pfam17918 TetR_C_15 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain found in a number of different TetR transcription regulator proteins including SlmA proteins found in E. coli. Unlike other TetR proteins, SlmA functions not as a transcription regulator but rather as an NO (nucleoid occlusion) factor. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. 108
60504 407769 pfam17919 RT_RNaseH_2 RNase H-like domain found in reverse transcriptase. 100
60505 407770 pfam17920 TetR_C_16 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain found in a number of different TetR transcription regulator proteins found in Actinobacteria. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. 107
60506 407771 pfam17921 Integrase_H2C2 Integrase zinc binding domain. This zinc binding domain is found in a wide variety of integrase proteins. 58
60507 407772 pfam17922 TetR_C_17 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain present in YfiR transcription regulator proteins found in Bacillus subtilis. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. 100
60508 375433 pfam17923 TetR_C_18 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in TetR transcriptional regulators found in proteobacteria. 113
60509 407773 pfam17924 TetR_C_19 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the transcriptional regulator heme-regulated transporter regulator (HrtR), which senses and binds a heme molecule as its physiological effector to regulate the expression of the heme-efflux system responsible for heme homeostasis in L. lactis. 117
60510 407774 pfam17925 TetR_C_20 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the transcriptional regulator KstR that regulates a large set of genes responsible for cholesterol catabolism. This is important for Mycobacterium tuberculosis during infection, both at an early stage in the macrophage phagosome and later within the necrotic granuloma. 107
60511 407775 pfam17926 TetR_C_21 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR transcriptional repressor found in Streptomyces coelicolor A3. Family members include HTH-type transcriptional repressor sco4008, which is suggested to be a transcriptional repressor of sco4007 responsible for the multidrug resistance system in S. coelicolor A3. 111
60512 407776 pfam17927 Ins134_P3_kin_N Inositol 1,3,4-trisphosphate 5/6-kinase pre-ATP-grasp domain. This family consists of several inositol 1, 3, 4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase. This entry represents the N-terminal pre-ATP-grasp domain. 80
60513 407777 pfam17928 TetR_C_22 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR transcriptional repressor sco1712 from Streptomyces coelicolor, which acts as a regulator of antibiotic production. 113
60514 407778 pfam17929 TetR_C_34 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in putative TetR family transcriptional regulators found in bacteria. 120
60515 407779 pfam17930 LpxI_N LpxI N-terminal domain. This entry represents the N-terminal domain of the LpxI enzyme that is involved in biosynthesis of lipid A. Specifically, it carries out the hydrolysis of UDP-2,3-diacylglucosamine. This step is carried out by either LpxI or LpxH. This domain has a Rossmann fold. 129
60516 407780 pfam17931 TetR_C_23 Tetracyclin repressor-like, C-terminal domain. This is a C-terminal domain present in putative TetR family transcriptional regulators found in bacteria. 127
60517 407781 pfam17932 TetR_C_24 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in family members such as HTH-type transcriptional repressor KstR2 as well as fatty acid metabolism regulator proteins. In Mycobacterium smegmatis, KstR2 is involved in cholesterol catabolism, while YsiA in Bacillus subtilis is involved in fatty acid degradation. 114
60518 407782 pfam17933 TetR_C_25 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in Rv1219c of Mycobacterium tuberculosis. Structural studies indicate that the helix alpha 10 of the C-terminal end of Rv1219c forms a long arm feature, a feature which is unique in Rv1219c compared to some other members of the TetR family. Furthermore, it has been shown that substrate binding occurs in the C-terminal regulatory domain of Rv1219c. 106
60519 407783 pfam17934 TetR_C_26 Tetracyclin repressor-like, C-terminal domain. This entry represents the C-terminal domain present in putative HTH-type transcriptional regulators. Family members are found in bacilli. 109
60520 407784 pfam17935 TetR_C_27 Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain present in putative TetR transcriptional regulators. 106
60521 407785 pfam17936 Big_6 Bacterial Ig domain. This domain is found in a wide variety of extracellular bacterial proteins often in multiple tandem copies. 83
60522 407786 pfam17937 TetR_C_28 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in CgmR (C. glutamicum multidrug-responsive transcriptional repressor), previously called CGL2612 protein. CgmR (CGL2612) from Corynebacterium glutamicum is a multidrug-resistance-related transcription factor belonging to the TetR family. It regulates expression of the immediately upstream gene cgmA (cgl2611) by binding to the operator cgmO in the cgmA promoter. The cgmA gene encodes a permease belonging to the major facilitator superfamily, a protein family composed of bacterial multidrug exporters, and the pair of CgmR and CgmA confers multidrug resistance on C. glutamicum. 97
60523 407787 pfam17938 TetR_C_29 Tetracyclin repressor-like, C-terminal domain. This domain is found in the C-terminal region of putative TetR-family regulatory proteins. 119
60524 407788 pfam17939 TetR_C_30 Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the Pseudomonas aeruginosa PsrA which regulates the fadBA5 beta-oxidation operon. Functional analysis of PsrA indicated its importance in regulating beta-oxidative enzymes. It has also been suggested that PsrA, a member of the TetR family of repressors, could affect global gene expression including activation of rpoS. 113
60525 407789 pfam17940 TetR_C_31 Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain found in putative transcriptional regulator, TetR family proteins. 107
60526 407790 pfam17941 PP_kinase_C_1 Polyphosphate kinase C-terminal domain 1. Polyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. This C1-terminal domain has a structure similar to phospholipase D. It is one of two closely related carboxy-terminal domains (C1 and C2 domains). Both the C1 and C2 domains (residues 322-502 and 503-687, respectively) consist of a seven-stranded mixed beta-sheet flanked by five alpha-helices. However, the structural topology and relative orientations of the helices to the beta-sheet in these two domains are different. The C1 and C2 domains are highly conserved in the PPK family. Some of the residues previously shown to be crucial for the enzyme catalytic activity are located in these two domains. 167
60527 407791 pfam17942 Morc6_S5 Morc6 ribosomal protein S5 domain 2-like. This domain is found in MORC6 proteins in eukaryotes. Arabidopsis microrchidia (MORC) ATPase family proteins are conserved among plants and animals and are involved in transcriptional silencing. In Arabidopsis, MORC6/DMS11 was reported to function in the condensation of pericentromeric heterochromatin, thereby facilitating transcriptional silencing. Further studies demonstrate that MORC6 and its homologs MORC1 and MORC2 form a complex which associates with SUVH9, required for Pol V occupancy in the RdDM (RNA-directed DNA methylation) pathway. 139
60528 407792 pfam17943 HOCHOB Homeobox-cysteine loop-homeobox. This domain is considered a double homeodomain, termed HOCHOB, present in the C. elegans genome. Family members include CEH-91 and CEH-93 that share extended sequence similarity with each other upstream of their typical HDs (Homeodomains). CEH-92, another family member, has three copies of this domain. The domain consists of two divergent HDs that are separated by a linker of about 17 residues. The linker has a number of conserved positions, two of which are cysteine residues suggesting that they could be involved in metal binding. Hence, the name HOCHOB (Homeobox-cysteine loop-homeobox). Furthermore, there are two conserved histidine residues, one in each HD (in CEH-91 displaced by two positions), and there is also a conserved aspartic acid. It is speculated that the HOCHOB domain is an evolutionary novelty that is derived from two HDs and may have gained metal-binding capacity. 120
60529 407793 pfam17944 Arg_decarbox_C Arginine decarboxylase C-terminal helical extension. This small three helical domain is found at the C-terminus of the arginine decarboxylase enzyme. 50
60530 407794 pfam17945 Crystall_4 Beta/Gamma crystallin. This is the C-terminal domain found in mucin glycoproteins such as secreted protease of C1 esterase inhibitor from EHEC (StcE). This domain adopts a beta/gamma crystallin fold and has been shown to be dispensable for substrate binding. Furthermore, deletion analysis suggests that lack of the C-terminal domain resulted in impaired association with the cell surface. 90
60531 407795 pfam17946 RecC_C RecC C-terminal domain. This entry corresponds to the C-terminal domain of the RecC protein. This domain has a PD(D/E)XK like fold. Deleting this domain eliminates RecD assembly within the RecBCD complex. 224
60532 407796 pfam17947 4HB Four helical bundle domain. This domain is found in elongation factor 3A where it packs against the bottom of the concave face of the HEAT domain. 78
60533 407797 pfam17948 DnaT DnaT DNA-binding domain. This domain is found in E. coli primosomal protein 1 (Pp1); the PP1 domain (residues 84-153) can bind to different types of ssDNA, which is fundamental for its physiological substrate binding. Functional analysis indicates that both the N- and C-termini are essential for the cooperative effect in binding ssDNA. The ssDNA-bound complex displays a spiral filament assembly that is adopted by many proteins that are involved in DNA replication, such as DnaA, RecA and PriB. This domain is similar to pfam08585 except that it contains an extra loop at the N-terminus (84-99). Structural analysis indicates that this extra loop might be essential for the stabilisation of the three-helix bundle. 71
60534 375443 pfam17949 PND FANCM pseudonuclease domain. This entry represents the pseudonuclease domain (PND) from the FANCM protein. This domain is part of the PD(D/E)XK superfamily but does not appear to have a full set of catalytic residues. 125
60535 407798 pfam17950 SpmSyn_N S-adenosylmethionine decarboxylase N-terminal. This is the N-terminal domain found in human spermine synthase (EC 2.5.1.22). The N-terminal domain, which forms the major part of the dimerization interface, shows a considerable structural similarity to the AdoMetDC-like fold (S-adenosylmethionine decarboxylase, the enzyme that forms the aminopropyl donor substrate), pfam02675. Deletion of the N-terminal domain led to a complete loss of spermine synthase activity, suggesting that dimerization may be required for activity. The N-terminal domain (amino acids 1-117) includes seven beta-strands and two alpha-helices. 96
60536 407799 pfam17951 FAS_meander Fatty acid synthase meander beta sheet domain. This domain is found in fungal fatty acid synthase beta chain proteins. 146
60537 407800 pfam17952 Cas6_N Cas6 N-terminal domain. The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). This entry represents the N-terminal domain of Cas6 proteins. The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. Structural analysis of Sulfolobus solfataricus P2 Cas6 (SsCas6) proteins indicates that SsCas6 is able to bind and cleave the nonstructured RNA by stabilizing an otherwise unstable duplex of two base pairs near the cleavage site, leading to an inline conformation around the scissile phosphate necessary for its breakage. 115
60538 407801 pfam17953 Csm4_C CRISPR Csm4 C-terminal domain. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci play a pivotal role in the prokaryotic host defense system against invading genetic materials. The CRISPR loci are transcribed to produce CRISPR RNAs (crRNAs), which form interference complexes with CRISPR-associated (Cas) proteins to target the invading nucleic acid for degradation. The interference complex of the type III-A CRISPR-Cas system is composed of five Cas proteins (Csm1-Csm5) and a crRNA, and targets invading DNA. This entry represents the C-terminal domain found in Csm4. Csm4 structurally resembles Cmr3, a component of the type III-B CRISPR-Cas interference complex. Studies indicate that the Csm3-Csm4 complex binds single-stranded RNA in a non-sequence-specific manner. Structural analysis shows that Csm3 and Csm4 have one and two ferredoxin-like folds (also known as RRM-like folds), respectively. The long beta-hairpin inserted into the C-terminal ferredoxin-like fold of Csm4 is well conserved in the Cmr3 structure. The corresponding beta-hairpin of Cmr3 binds the D1 domain of Cmr2, as observed in the Cmr2-Cmr3 complex structure. Furthermore, it is suggested that the hairpin of Csm4 is responsible for the interaction with Csm1 (ortholog of Cmr2). 91
60539 407802 pfam17954 Pirin_C_2 Quercetinase C-terminal cupin domain. Experiments on the YhhW protein show that it has quercetinase activity. This entry represents the C-terminal cupin domain from the two cupin domains that make up the protein. This domain is usually associated with pfam02678. 86
60540 407803 pfam17955 Cas6b_N Cas6b N-terminal domain. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci play a pivotal role in the prokaryotic host defense system against invading genetic materials. The CRISPR loci are transcribed to produce CRISPR RNAs (crRNAs), which form interference complexes with CRISPR-associated (Cas) proteins to target the invading nucleic acid for degradation. Four Cas proteins (Cas5, Cas6b, Cas7 and Cas8b) are proposed to form a Type I-B Cascade complex that mediates the antiviral defense. This is the N-terminal domain found in Cas6b proteins. Cas6b is a member of Cas6 RNA processing endoribonucleases found in bacteria and archaea whose RNA substrates have a wide range of structural features. Cocrystal structures of Cas6 from Methanococcus maripaludis (MmCas6b) bound with its repeat RNA revealed a dual-site binding structure and a cleavage site conformation poised for phosphodiester bond breakage. 104
60541 407804 pfam17956 NAPRTase_C Nicotinate phosphoribosyltransferase C-terminal domain. This domain is found at the C-terminus of some Nicotinate phosphoribosyltransferase enzymes. The function of this domain is uncertain. 111
60542 407805 pfam17957 Big_7 Bacterial Ig domain. This entry represents a bacterial ig-like domain that is found in glycosyl hydrolase enzymes. 67
60543 407806 pfam17958 EF-hand_13 EF-hand domain. This entry represents an EF-hand domain found in one of the regulatory B subunits of PP2A. 90
60544 407807 pfam17959 EF-hand_14 EF-hand domain. This EF-hand domain is found at the N-terminus of the human glutaminase enzyme. 90
60545 407808 pfam17960 TIG_plexin TIG domain. This entry represents a TIG or IPT domain (Ig domain shared by Plexins and Transcription factors) found in plexins. 89
60546 407809 pfam17961 Big_8 Bacterial Ig domain. This entry represents a bacterial Ig-fold domain that is found in a wide range of bacterial cell surface adherence proteins. 100
60547 407810 pfam17962 bMG6 Bacterial macroglobulin domain 6. This macroglobulin domain is found in bacterial alpha 2 macroglobulin proteins. It adopts an Ig-like beta sandwich fold. 112
60548 407811 pfam17963 Big_9 Bacterial Ig domain. This entry represents a wide variety of bacterial Ig domains. 90
60549 407812 pfam17964 Big_10 Bacterial Ig domain. This entry represents a bacterial Ig-like domain found associated with transpeptidase domains. 182
60550 407813 pfam17965 MucBP_2 Mucin binding domain. This domain is found in bacterial cell surface proteins that interact with mucins. The archetypal member of this family is the Mub-R5 B1 domain. This domain has a beta-grasp fold. 75
60551 407814 pfam17966 Mub_B2 Mub B2-like domain. This entry corresponds to the Mub B2 domain. This domain is related to the Mub B1 domain pfam17965. This domain may be involved in mucin binding. This domain is often found associated with the related pfam17965 in bacterial cell surface proteins. 70
60552 407815 pfam17967 Pullulanase_N2 Pullulanase N2 domain. This domain is found close to the N-terminus of the Klebsiella starch debranching pullulanase enzyme. The structure of the domain is a beta sandwich fold. 112
60553 407816 pfam17968 Tlr3_TMD Toll-like receptor 3 trans-membrane domain. Toll-like receptor (TLR) 3 is an endosomal TLR that mediates immune responses against viral infections upon activation by its ligand double-stranded RNA, a replication intermediate of most viruses. TLR3 is expressed widely in the body and activates both the innate and adaptive immune systems. This entry represents the Toll-like receptor 3 trans-membrane domain which has been shown to form dimers and trimers with different surfaces for helix-helix interaction. 33
60554 407817 pfam17969 Ldt_C L,D-transpeptidase C-terminal domain. This is the C-terminal domain found in L,D-transpeptidase (Ldt) homologues from E. coli. Three of these enzymes (YbiS, ErfK, YcfS) have been shown to cross-link Braun's lipoprotein to the peptidoglycan (PG), while the other two (YnhG, YcbB) form direct meso-diaminopimelate (DAP-DAP, or 3-3) cross-links within the PG. Family members include erfK (ldtA), ybiS (ldtB), ycfS (ldtC), and ynhG (ldtE). 67
60555 407818 pfam17970 bMG1 Bacterial Alpha-2-macroglobulin MG1 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the N-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG1 domain which is the farthest from the body of the structure. It is normally anchored to the inner membrane in vivo and connected to MG2 by a flexible linker. 105
60556 375455 pfam17971 LIFR_D2 Leukemia inhibitory factor receptor D2 domain. This is the D2 domain in cytokine-binding module 1 (CBM1) found in Leukemia inhibitory factor receptor (LIFR) and OSM receptors (OSMR). LIFR has an extracellular region with a modular structure containing two cytokine-binding modules (CBM) separated by an Ig-like domain and followed by three membrane-proximal fibronectin type-III (FNIII) domains. The D2 domain in CBM1 shows structural similarity to the corresponding CBM domains of both gp130 and IL-6Ralpha because it contains conserved structural features like the WSXWS motif. The WSXWS motif in cytokine receptors is a molecular switch involved in receptor activation. 114
60557 407819 pfam17972 bMG5 Bacterial Alpha-2-macroglobulin MG5 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the N-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG5 domain. 127
60558 407820 pfam17973 bMG10 Bacterial Alpha-2-macroglobulin MG10 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the C-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG10 domain. MG10 is markedly different from the other MG domains in that it has more beta strands and an alpha helix. The position of MG10 is stabilized by, in addition to other hydrogen bonds, the formation of a beta sheet with MG9. 128
60559 407821 pfam17974 GalBD_like Galactose-binding domain-like. Proteins containing a galactose-binding domain-like fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1. The structure of the galactose-binding domain-like members consists of a beta-sandwich, in which the strands making up the sheets exhibit a jellyroll fold. 190
60560 407822 pfam17975 RNR_Alpha Ribonucleotide reductase alpha domain. This is the alpha helical domain of ribonucleotide reductases. Family members include Ribonucleotide reductase (RNR, EC:1.17.4.1) which catalyse the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. This domain is found in Class II. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Many organisms have more than one class of RNR present in their genomes. Ribonucleotide reductase is an oligomeric enzyme composed of a large sub-unit (700 to 1000 residues) and a small sub-unit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain. Some family members carry ATP cone domain which acts as a functional regulator. Competitive binding of ATP and dATP to an N-terminal ATP-cone domain determines enzyme activity. As the ratio of dATP to ATP increases above a certain threshold, the enzyme activity is turned off. Substrate nucleotides are recognized by relatively simple H-bonding interactions at the N-terminus of one or more alpha helices. In the monomeric class II RNR, the effector binds in a pocket formed by helices in a 130 amino acid insertion which constitutes this domain. 101
60561 407823 pfam17976 zf-RING_12 RING/Ubox like zinc-binding domain. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by this domain RING0 and three additional zinc finger domains characteristic of the RBR family. RING0 binds two coordinated zinc atoms at each extremity of the domain with a hairpin. Deletion of RING0 massively derepressed parkin activity supporting the role of RING0 in autoinhibition, point mutations in RING0 (Phe146 to Ala) or RING2 (Phe463 to Ala) both increased parkin activity. The REP (repressor element of parkin) and RING0 domains play a preeminent role in repressing parkin ligase activity through their interactions with RING1 and RING2, respectively. 73
60562 375459 pfam17977 zf-RING_13 RING/Ubox like zinc-binding domain. This is a zinc binding domain found in nidovirus helicase. It includes 12 or 13 conserved Cys/His residues. Amino acid substitutions in ZBD or the adjacent spacer that connects it to the downstream domain can profoundly affect EAV (equine arteritis virus) helicase activity and RNA synthesis, with most replacements of conserved Cys or His residues yielding replication-negative virus phenotypes. 68
60563 407824 pfam17978 zf-RING_14 RING/Ubox like zinc-binding domain. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by RING0 and three additional zinc finger domains characteristic of the RBR family. This entry relates to RING1 zinc binding domain. The RING1 domain displays the C3HC4 cross-brace motif characteristic of RING domains. The N-terminal Ubl domain binds to RING1. 91
60564 407825 pfam17979 zf-CRD Cysteine rich domain with multizinc binding regions. This is a cysteine rich domain which contains four zinc-binding regions (binding five zinc ions). Family members include human CHFR which interacts with Poly(ADP-ribose) (PAR) through a 20-amino acid PAR binding zinc finger region (PBZ) found at the C-terminal end of this cysteine-rich domain (i.e. the 4th region towards the C-terminal). CHFR lacking PBZ does not co-localize with nuclear PAR foci in interphase cells and cannot rescue antephase checkpoint function despite retaining autoubiquitination activity. Hence it has been suggested that the CHFR-PAR interaction is an important part of the antephase checkpoint and could form part of the checkpoint sensor for cellular stress and microtubule poisons or be required for proper localization of CHFR. The PBZ region of CHFR contains two adenine binding sites. 158
60565 407826 pfam17980 ADD_DNMT3 Cysteine rich ADD domain in DNMT3. This is a cysteine-rich domain termed ADD (ATRX-DNMT3-DNMT3L, AD-DATRX) found in DNMT3A proteins. The ADD domains of the DNMT3 family have a decisive role in blocking DNMT activity in the areas of the genome with chromatin containing methylated H3K4. Furthermore, the ADD domain of DNMT3A (ADD-3A) competes with the chromodomain (CD) of heterochromatin protein 1 alpha (HP1alpha, CDHP1alpha) for binding to the H3 tail. The DNA methyltransferase (DNMT) 3 family members DNMT3A and DNMT3B and the DNMT3-like non-enzymatic regulatory factor DNMT3L are involved in de-novo establishment of DNA methylation patterns in early mammalian development. 56
60566 407827 pfam17981 ADD_ATRX Cysteine Rich ADD domain. This is a cysteine-rich domain termed ADD (ATRX-DNMT3-DNMT3L, AD-DATRX) found in ATRX proteins. Chromatin-associated human protein ATRX was originally identified because mutations in the ATRX gene cause a severe form of syndromal X-linked mental retardation called ATR-X syndrome. Mutations or knockdown of ATRX expression cause diverse effects, including altered patterns of DNA methylation, a telomere-dysfunction phenotype, aberrant chromosome segregation, premature sister chromatid separation and changes in gene expression. ATRX localizes predominantly to large, tandemly repeated regions (such as telomeres, centromeres and ribosomal DNA) associated with heterochromatin, and studies show that it directs H3.3 deposition to pericentric and telomeric heterochromatin. The ADD domain of ATRX, in which most syndrome-causing mutations occur, engages the N-terminal tail of histone H3 through two rigidly oriented binding pockets, one for unmodified Lys4 and the other for di- or trimethylated Lys9. Mutations in the ATRX ADD domain cause mislocalization of ATRX protein to heterochromatin, and this may contribute to understanding the underlying etiology of ATRX syndrome. Structure analysis of the ADD domain of ATRX revealed that it contains a PHD zinc-finger domain packed against a GATA-like zinc finger. Same structure is also found in the DNMT3 DNA methyltransferases and DNMT3L. 56
60567 407828 pfam17982 C5HCH NSD Cys-His rich domain. This is an NSD-specific Cys-His rich region (C5HCH) domain. Family members include NSD3 (nuclear receptor SET domain-containing) proteins. This domain is located on the C-terminal of NSD1, 2 and 3 proteins. C5HCH domain lies adjacent to the fifth plant homeodomain (PHD5). The PHD5-C5HCH module of NSD3 (PHD5-C5HCHNSD3) recognizes the H3 N-terminal peptide containing unmodified K4 and trimethylated K9. Moreover, it has been reported that the PHD5-C5HCH module of NSD1 (PHD5-C5HCH) was the sole region required for tight binding of the NUP98-NSD1 fusion protein to the HoxA9 gene promoter, implicating that PHD5-C5HCH might have chromatin targeting ability. 50
60568 375465 pfam17983 Tryp_inh Trypsin inhibitors 1,2 and 3. This domain is found in trypsin inhibitors 1, 2 and 3 from Chenopodiaceae. Structure analysis of S. oleracea trypsin inhibitor III (SOTI-III), in complex with bovine pancreatic trypsin, shows a knottin-like cystine-bridge topology. 33
60569 375466 pfam17984 TERT_thumb Telomerase reverse transcriptase thumb DNA binding domain. The catalytic subunit of telomerase is structurally similar to retroviral reverse transcriptases, viral RNA polymerases and, to a lesser extent, the bacteriophage B-family DNA polymerases. Like its structural homologs, the core catalytic subunit of telomerase, TERT, contains the fingers, palm and thumb domains required for nucleic acid and nucleotide associations as well as catalysis. The four major TERT domains: the RNA binding domain (TRBD); the fingers domain, implicated in nucleotide binding and processivity; the palm domain, which contains the active site of the enzyme; and the thumb domain, implicated in DNA binding and processivity are organized into a ring configuration similar to that observed for the substrate-free enzyme. This is the thumb domain found in Tribolium castaneum telomerase catalytic subunit, TERT. Contacts between TERT and the DNA substrate are mostly mediated via backbone interactions with the thumb loop and helix. The thumb helix sits in the minor groove of the RNA-DNA heteroduplex, making extensive contacts with the phosphodiester backbone and the ribose groups of the RNA-DNA hybrid. 191
60570 407829 pfam17985 SipA_VBS SipA vinculin binding site. This motif family includes the three vinculin binding sites found in the Shigella SipA/IpaA protein. The family also includes some proteins from Chlamydia species. 22
60571 407830 pfam17986 EKAL EMP3-KAHRP-like N-terminal domain. This is the N-terminal domain which is found at the N terminus of the erythrocyte cytoskeleton-associated protein (EMP3) and the knob-associated histidine-rich protein (KAHRP). KAHRP found in protozoan parasites such as Plasmodium falciparum, is involved in both rigidifying the host cell and the formation of cytoadherent knob structures. 53
60572 407831 pfam17987 PMT2_N Phosphoethanolamine N-methyltransferase 2 N-terminal. This is the N-terminal vestigial domain found in Haemonchus contortus phosphoethanolamine N-methyltransferases 2 (PMT2). Structural analysis reveals changes leading to loss of function in the vestigial domains of the nematode PMT. 124
60573 375470 pfam17988 VEGFR-2_TMD VEGFR-2 Transmembrane domain. This is a transmembrane domain (TMD) of vascular endothelial growth factor receptor 2 which regulates blood vessel homeostasis. Transmembrane signalling by receptor tyrosine kinases (RTKs) requires specific orientation of the intracellular kinase domains in active receptor dimers. Two mutants in VEGFR-2 TMD showed constitutive kinase activity, suggesting that precise TMD orientation is mandatory for kinase activation. Scanning mutagenesis and structural analysis indicated that introducing two polar amino acids in distinct positions of the TMD (G770E/F777E and I771E/L778E mutations) reorients transmembrane helices and leads to stable dimer formation. Therefore, it has been suggested that the transition between the inactive and the active dimeric state of VEGFR-2 implicates alternative dimeric TMD conformations. 35
60574 407832 pfam17989 ALP_N Actin like proteins N terminal domain. This is the N-terminal domain found in archaeal actin homolog Ta0583 found in thermophilic archaeon Thermoplasma acidophilum. Structural analysis indicates that the fold of Ta0583 contains the core structure of actin indicating that it belongs to the actin/Hsp70 superfamily of ATPases. Furthermore, Ta0583 co-crystallized with ADP shows that the nucleotide binds at the interface between the subdomains of Ta0583 in a manner similar to that of actin. It has been suggested that Ta0583 might function in the cellular organisation of T. acidophilum. Other family members include ParM, another actin-like protein found in Staphylococcus aureus. Crystal structure co-ordinates revealed that this protein is most structurally related to the chromosomally encoded Actin-like proteins (Alp) Ta0583 from the archaeon Thermoplasma acidophilum. Furthermore, biophysical analyses have suggested that ParM filaments undergo a treadmilling-like mechanism of motion in vitro similar to that of F-actin. The recruitment of ParM to the segrosome complex was shown to be required for the conversion of static ParM filaments to a dynamic form proficient for active segregation and facilitated by the C-terminus of ParR. 148
60575 407833 pfam17990 LodA_N L-Lysine epsilon oxidase N-terminal. This is the N-terminal domain found in antimicrobial protein (LodA) with lysine-epsilon oxidase activity (EC 1.4.3.20) which is produced by gram-negative marine bacteria such as Marinomonas mediterranea. The enzyme, previously named marinocine, catalyzes the oxidative deamination of l-lysine into 6-semialdehyde 2-aminoadipic acid, ammonia, and hydrogen peroxide (H2O2). Orthologous proteins have been detected in other bacterial genera, where they participate in biofilm development and dispersal. It has been shown that M. mediterranea LodA and its homologues induce cell death in the microcolonies formed in the process of biofilm development due to the hydrogen peroxide generated by their enzymatic activity. Moreover, cells dispersed from the biofilm by means of this mechanism show a phenotypic variation in growth and biofilm formation. The active form of LodA containing the quinonic cofactor is generated intracellularly only in the presence of LodB, suggesting that the latter protein is involved in this process. 215
60576 407834 pfam17991 Thioredoxin_10 Thioredoxin like C-terminal domain. This is the C-terminal thioredoxin like domain found in Rv2874 in the pathogenic bacterium Mycobacterium tuberculosis. Structure analysis of Rv2874-C shows the presence of a C-terminal domain formed by the 128 residues Thr568-Gly695. These residues form a jelly-roll structure in which two antiparallel beta-sheets sandwich a hydrophobic core. This domain is combined with a second domain with a carbohydrate-binding module (CBM) fold. 142
60577 407835 pfam17992 Agarase_CBM Agarase CBM like domain. This is the N-terminal CBM-like domain in exo-beta-agarase proteins (EC:3.2.1.81) found in the marine microbe Saccharophagus degradans. This enzyme catalyzes a critical step in the metabolism of agarose by S. degradans through cleaving agarose oligomers into neoagarobiose products that can be further processed into monomers. The CBM-like domain is structurally very similar to some CBM families. A loop in the CBM-like domain is involved in forming the roof of the active site channel. The contribution of the CBM-like domain to formation of the active site of the enzyme supports a role in substrate recognition explaining the exo-mode of beta-agarase action. 178
60578 407836 pfam17993 HA70_C Haemagglutinin 70 C-terminal domain. This is the C-terminal domain found in hemagglutinin component such as HA70 found in Clostridium botulinum. HA is a component of the large botulinum neurotoxin complex and is critical for its oral toxicity. HA plays multiple roles in toxin penetration in the gastrointestinal tract, including protection from the digestive environment, binding to the intestinal mucosal surface, and disruption of the epithelial barrier. HA consists of three different proteins, designated HA70 (also known as HA3), HA33 (HA1), and HA17 (HA2) based on molecular mass. HA70 consists of three domains (D1-3). The D1 and D2 domains, which adopt similar structures, mediate the trimerization of HA70 with each protomer. The D3 domain, sitting at the tip of the trimer, is composed of two similar jelly-roll-like beta-sandwich structures. Furthermore, crystal structures of HA70 in a complex with alpha2,3- or alpha2,6-SiaLac (alpha2,6-sialyllactose), show that alpha2,3- and alpha2,6-SiaLac bound to the same region in the D3 domain of HA70. This domain is the D3 domain found in HA3/HA70 which has been shown to be involved in binding to carbohydrate of glycoproteins from epithelial cells in the infection process. 135
60579 407837 pfam17994 Glft2_N Galactofuranosyltransferase 2 N-terminal. This is the N-terminal beta-barrel domain found in the polymerizing galactofuranosyltransferase GlfT2 (Rv3808c). This enzyme synthesizes the bulk of the galactan portion of the mycolyl-arabinogalactan complex, which is the largest component of the mycobacterial cell wall such as in Mycobacterium tuberculosis. The N-terminal domain contains two short helices preceding a 10-stranded beta-sandwich with jelly roll topology. 140
60580 407838 pfam17995 GH101_N Endo-alpha-N-acetylgalactosaminidase N-terminal. This is the N-terminal domain found in Streptococcus pneumoniae endo-alpha-N-acetylgalactosaminidase (EC:3.2.1.97), a cell surface-anchored glycoside hydrolase from family GH101 involved in the breakdown of mucin type O-linked glycans. This is a twisted beta-sandwich domain composed of two sheets of six and seven antiparallel beta-strands. The domain appears to be missing the extended metal and carbohydrate-binding loops. 180
60581 407839 pfam17996 CE2_N Carbohydrate esterase 2 N-terminal. This is the N-terminal beta-sheet domain with jelly roll topology found in CE2 acetyl-esterase from the bacterium Clostridium thermocellum. This enzyme displays dual activities: it catalyses the deacetylation of plant polysaccharides and also potentiates the activity of its appended cellulase catalytic module through its noncatalytic cellulose binding function. This N-terminal jelly-roll domain appears to extend the substrate/cellulose binding cleft of the catalytic domain in C. thermocellum. 108
60582 375474 pfam17997 Cry1Ac_D5 Insecticidal delta-endotoxin CryIA(c) domain 5. This domain is found in the protoxins portion of insecticidal proteins (parasporins, or Cry proteins) such as those from Bacillus thuringiensis (Bt) Cry1Ac. The protoxin portion comprise a proteolytically labile C-terminal segment (sometimes referred to as the protoxin domain). This is domain V in Cry1Ac from B. thuringiensis. One of the four protoxin domains (D-IV through D-VII). Domains V and VII are beta-rolls (similar to D-II or D-III) that closely resemble carbohydrate-binding modules (CBM) found in sugar hydrolases, however, it is difficult to guess which particular carbohydrates (if any) may serve as their ligands because residues on the putative sugar-binding interfaces are conserved neither in sequence nor in local structure. Structural analysis indicate that there are putative disulfide crosslinking at the dimer interface mediated by cysteines within 783-823 region of this domain which together with other cysteines creates a three-dimensional network of cross-links across the crystal which may play a role in stabilizing mature Bt Cry1Ac. 173
60583 407840 pfam17998 AgI_II_C2 Cell surface antigen I/II C2 terminal domain. This is the second domain (C2) located in the C-terminal region found in antigen I/II type adhesin protein AspA from S. pyogenes. Together with C3, these two domains form an elongated structure, and each domain adopts the DEv-IgG fold. Similar to the classical IgG folds, it is comprised of two major antiparallel beta-sheets, designated ABED and CFG. For the C2-domain, there are two additional strands on the CFG sheet. Furthermore, sheets ABED and CFG are interconnected by several cross-connecting loops and one alpha-helix (DH1). The side chains of D982 and N996 in the C2-domain are involved in hydrogen bonding with the side chains of R1264 and N1295 in the C3 domain. Main chain hydrogen bonding can also be observed between S992 in C2 and N1189/G1191 in C3, further stabilizing the interaction between the domains. The C2 domain contains one bound metal ion, modeled as Ca2+, and both the C2- and C3-domains are stabilized by conserved isopeptide bonds, which connect the beta-sheets of the central DEv-IgG motifs. Other members of this family include Major cell-surface adhesin PAc from Streptococcus mutans and SspB from Streptococcus gordonii. 180
60584 407841 pfam17999 PulA_N1 Pullulanase N1-terminal domain. This is the N-terminal domain found in debranching enzymes such as Pullulanase (PulA) from Anoxybacillus sp. LM18-11. The PulA structure comprises four domains (N1, N2, A, and C). This is the N1 domain which has been identified as a carbohydrate-binding motif. Two maltotriose or maltotetraose molecules were found between the N1 domain and a loop of the A domain in the PulA-maltotriose or PulA-maltotetraose structures. These carbohydrates are bound in a parallel binding mode close to each other and form hydrogen bonds. The sugar moieties bound to the N1 domain are not immediately adjacent to the active site, but the enzyme might use N1 binding to attract and grab the substrate. Functional analysis indicates that N1 is important for catalytic activity and thermostability in addition to assisting substrate binding. The structure of the N1 domain reveals a classic distorted beta-jelly roll fold consisting of two anti-parallel beta-sheets, forming a concave and a convex surface. On the concave side of N1 domain there is a cleft to accommodate two molecules of maltotriose or maltotetraose. 85
60585 407842 pfam18000 Top6b_C Type 2 DNA topoisomerase 6 subunit B C-terminal domain. This is the C-terminal domain found in archaeal type 2 DNA topoisomerase 6 subunit B (EC:5.99.1.3). This region is a small helix-two turns-helix (H2TH) domain inserted between the GHKL and transducer domains which adopts an immunoglobulin-like fold. Mutation analysis of this C-terminal domain showed that the overall activity of the mutant mesophilic methanogen M. mazei Top6B (MmT6) is modestly reduced but its relative activity on different substrates is not affected. Due to the similarity of the B subunit's CTD to known protein- and carbohydrate-binding modules, it has been suggested that it could regulate topo VI spatially, perhaps by localizing the enzyme to a specific subcellular region or functional partner. 113
60586 375477 pfam18001 Il13Ra_Ig Interleukin-13 receptor subunit alpha Ig-like domain. This is the N-terminal Ig-like domain found in IL-13Ralpha1 type two cytokine complex. The IL-13Ralpha1 contains an extra N-terminal Ig-like domain not found in other receptors of the common gamma-chain subfamily. The extra N-terminal IL-13Ralpha1 Ig-like domain contacts the dorsal surfaces of both IL-4 and IL-13. Mutational studies show that the deletion of this domain affects the binding of IL-13 to IL-13Ralpha1. 95
60587 407843 pfam18002 T6_Ig_like T6 antigen Ig like domain. This is the N-terminal immunoglobulin-like domain. Family members carrying this domain include Trypsin-resistant surface T6 protein found in Streptococcus pyogenes. 142
60588 407844 pfam18003 DUF3823_C Domain of unknown function (DUF3823_C). This is a family of uncharacterized proteins from Bacteroidetes. This domain has an Ig-like fold. 104
60589 407845 pfam18004 RPN2_C 26S proteasome regulatory subunit RPN2 C-terminal domain. This is the C-terminal domain found in S. cerevisiae Rpn2 (26S proteasome regulatory subunit RPN2) as well as other eukaryotic species. A study revealed that the C-terminal 52 residues of the Rpn2 C-terminal domain are responsible for mediating interactions with the ubiquitin-binding subunit Rpn13. Furthermore, the extreme C-terminal 20 or 21 residues of Rpn2 (926-945 or 925-945) of S. cerevisiae were shown to be equally effective at binding Rpn13. Multiple sequence alignments indicate that Rpn2 orthologs are highly conserved in this C-terminal region and share characteristic acidic, aromatic, and proline residues, suggesting a common function. In the structure of Rpn2 from S. cerevisiae, this region is exposed and disordered, and is thus accessible for associating with Rpn13. The Rpn2 binding surface of human Rpn13 has been mapped by nuclear magnetic resonance titration to one surface of its Pru domain. 159
60590 407846 pfam18005 eIF3m_C_helix eIF3 subunit M, C-terminal helix. This is the C-terminal helix domain found in Eukaryotic translation initiation factor 3 subunit M (eIF3m). In mammalian eIF3a, the C-terminal helix following the PCI domain is involved in interactions with other core subunits. 29
60591 407847 pfam18006 SepRS_C O-phosphoseryl-tRNA synthetase C-terminal domain. The SSHS domain is mainly found in Archaea. The domain makes up part of the anticodon binding domain at the C-terminal of O-phosphoserine--tRNA(Cys) ligase. 31
60592 407848 pfam18007 DUF5593 Domain of unknown function (DUF5593). This is a domain of unknown function found in Corynebacteriales. 93
60593 407849 pfam18008 Bac_RepA_C Replication initiator protein A C-terminal domain. This is the C-terminal domain (CTD) that can be found in the conserved replication initiator, RepA, essential for staphylococcal propagation. The RepA CTD shares the strongest structural homology with the Enterococcus faecalis DnaD CTD, yet the two perform distinct functions. RepA CTD shows strong sequence homology between RepA_N plasmids in genus-specific clusters, suggesting that it may perform host-specific functions necessary for replication. The RepA CTD interacts with the host DnaG primase, which binds the replicative helicase. Structural data indicate that the RepA CTD exists as a monomeric entity, flexibly tethered to the DNA-bound NTD. 94
60594 375484 pfam18009 Fer4_23 4Fe-4S iron-sulfur cluster binding domain. This is the C-terminal domain found in Deinococcus radiodurans protein DR2241 (a Ribosomal protein S2-related protein). This domain has been shown to harbour the sequence motifs CxxC and CxxxC which bind a [4Fe-4S] iron-sulphur cluster. Together with the preceding domain, it is heavily involved in the tetramer formation. 82
60595 375485 pfam18010 HTH_49 Cry35Ab1 HTH C-terminal domain. This is the C-terminal domain found in Bacillus thuringiensis protein Cry35Ab1 (an insecticidal protein). The domain has three helices held in a hydrophobic core, the first two form a typical helix-loop-helix whilst the third helix is perpendicular. The domain is structurally homologous to BinB and proteins containing similar beta-trefoil lectin-like domains. The domain is not required for insecticidal activity or for immuno-reactivity and its function is likely to be to bind to WCR brush border membrane vesicles. 29
60596 407850 pfam18011 Catalase_C C-terminal domain found in long catalases. This domain is found at the C-terminus of a variety of large catalase enzymes from bacteria. Structurally it is related to class I glutamine amidotransferase domains. The precise molecular function of this domain is uncertain. 150
60597 407851 pfam18012 PH_17 PH domain. This entry represents the C-terminal part of the split PH domain from syntrophin proteins. 59
60598 407852 pfam18013 Phage_lysozyme2 Phage tail lysozyme. This domain has a lysozyme like fold. It is found in the tail protein of various phages probably giving them the ability to degrade the host cell wall peptidoglycan layer. 139
60599 407853 pfam18014 Acetyltransf_18 Acetyltransferase (GNAT) domain. This entry represents a likely acetyltransferase enzyme that is related to pfam06852. 123
60600 407854 pfam18015 Acetyltransf_19 Acetyltransferase (GNAT) domain. This entry represents a likely acetyltransferase enzyme that is related to pfam13302. 113
60601 407855 pfam18016 SAM_3 SAM domain (Sterile alpha motif). 65
60602 407856 pfam18017 SAM_4 SAM domain (Sterile alpha motif). This entry corresponds to a SAM domain that is found at the N-terminus of the human C19orf47 protein. 84
60603 407857 pfam18018 DNA_pol_D_N DNA polymerase delta subunit OB-fold domain. The eukaryotic DNA polymerase delta (Pol delta) participates in genome replication, homologous recombination, DNA repair and damage tolerance. Human Pol delta consists of four subunits: p125, p50, p66 and p12. The first three subunits correspond to the three subunits of S. cerevisiae Pol delta. p50 serves as a scaffold for the assembly of Pol delta by interacting simultaneously with all of the other three subunits. This entry corresponds to the OB fold domain found in the p50 subunit. 129
60604 407858 pfam18019 HD_6 HD domain. This HD domain is found at the N-terminus of Cas3 enzymes fused to a helicase domain. This domain is sometimes found as a separate protein. It acts as a nuclease that cleaves ssDNA. 211
60605 407859 pfam18020 TIG_2 TIG domain found in plexin. This entry represents a TIG domain found in plexin proteins. TIG domains have an Ig-like fold. 94
60606 407860 pfam18021 Agglutinin_C Agglutinin C-terminal. This is the C-terminal domain of the alpha chain found in Marasmius Oreades lectin protein (MOA) which binds specifically with Gal.alpha(1,3)Gal-containing sugar epitopes. The enzymatic activity of the MOA may be associated with this domain. The domain has an alpha/beta-fold, which features a central six-stranded, mostly anti-parallel, beta-sheet flanked by three alpha-helices, and a short two-stranded beta-sheet. 93
60607 407861 pfam18022 Lectin_C_term Ricin-type beta-trefoil lectin C-terminal domain. This is the C-terminal domain of the beta chain found in Polyporus squamosus lectin protein (PSL). PSL binds specifically to glycans terminating with the sequence: Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is not involved in the binding to the Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is characterized by a central five-stranded beta-sheet that is flanked by three alpha-helices and topped by a short strand. It shows high fold similarity to its closest relative, the Gal.alpha1-3Gal-binding agglutinin from the mushroom Marasmius oreades agglutinin (MOA). 102
60608 375495 pfam18023 FKBP_N_2 BDBT FKBP like N-terminal. This is the N-terminal domain of the beta chain found in Drosophila melanogaster protein BDBT (a FK506-Binding Protein) which stimulates the DBT circadian function. The domain contains the DBT-binding site. The domain is structurally homologous to the peptidyl prolyl isomerase (PPIase) regions of FK506-binding proteins despite low sequence homology. BDBT is structurally related to the immunophilin FKBP51 and it shares a common domain organization consisting of PPIase-like and TPR domains with noncanonical immunophilins such as FKBP38 or FKBPL. 113
60609 407862 pfam18024 HTH_50 Helix-turn-helix domain. The TyrR protein of Haemophilus influenzae is a 36-kD transcription factor whose major function is to control the expression of genes important in the biosynthesis and transport of aromatic amino acids. This entry represents the C-terminal helix-turn-helix DNA-binding domain of TyrR and related proteins. 50
60610 407863 pfam18025 FucT_N Alpha-(1,3)-fucosyltransferase FucT N-terminal domain. This is the N-terminal domain of the alpha chain found in Helicobacter pylori Fucosyltransferase protein which is involved in the production of Lewis x trisaccharide, a major component of lipopolysaccharide. The N-terminal domain contains the catalyst base, Glu-95 which is equivalent to the Asp-100 of other members of the glycosyltransferases-B family. The domain contains the pocket where LacNAc binds. The domain is composed of 2-10 heptad repeats and a conserved N-terminal alpha-beta-alpha motif which has little sequence similarity to the conserved N-terminal motif in other glycosyltransferases. 92
60611 407864 pfam18026 Exog_C Endo/exonuclease (EXOG) C-terminal domain. This is the C-terminal domain found in EndoG-like mitochondrial endo/exonuclease (EXOG) proteins in higher eukaryotes. Evolutionary conserved mitochondrial nucleases are involved in programmed cell death and normal cell proliferation in lower and higher eukaryotes. It has been proposed that during metazoan evolution duplication of an ancestral nuclease gene could have generated the paralogous EndoG- and EXOG-protein subfamilies in higher eukaryotes, thereby maintaining the full endo/exonuclease activity found in mitochondria of lower eukaryotes. Family members include the human EXOG, a dimeric mitochondrial enzyme that displays 5'-3' exonuclease activity and further differs from EndoG in substrate specificity. This C-terminal domain is predicted to fold into a coiled-coil structure. Deletion of the domain led to a pronounced reduction in EXOG activity, revealing that the presence and most likely the proper positioning of this domain in EXOG proteins is crucial for its enzymatic activity. 49
60612 407865 pfam18027 Pepdidase_M14_N Cytosolic carboxypeptidase N-terminal domain. This entry corresponds to the N-terminal domain of cytosolic carboxypeptidases. The N-terminal domain folds into a nine-stranded antiparallel beta sandwich. This domain is specific to CCP proteins and is absent in other carboxypeptidases. It has been hypothesized that the N-terminal domain might contribute to folding, might have a regulatory function and/or might be involved in binding other proteins. 107
60613 375500 pfam18028 Zmiz1_N Zmiz1 N-terminal tetratricopeptide repeat domain. This is the N-terminal domain found in Zmiz1 proteins (Zinc finger MIZ domain-containing protein 1). Zmiz1 is a direct Notch1 cofactor that heterogeneously regulates Notch target genes. Zmiz1 directly interacts with the RAM1 domain of Notch1 through this N-terminal tetratricopeptide repeat (TPR) domain. Furthermore, it has been shown that Zmiz1 and Notch1 cooperatively recruit each other to chromatin through direct interaction via the N-terminal TPR domain resulting in a slight increase in activating histone marks and decrease of repressive histone marks. Functional analysis indicates that the N-terminal domain of Zmiz1 is important for driving Myc transcription and proliferation indirectly. 93
60614 407866 pfam18029 Glyoxalase_6 Glyoxalase-like domain. This entry comprises a diverse set of domains related to the Glyoxalase domain. The exact specificity of these proteins is uncertain. 111
60615 407867 pfam18030 Rimk_N RimK PreATP-grasp domain. This is the N-terminal domain found in Escherichia coli RimK proteins (Ribosomal protein S6-L-glutamate ligase). This domain precedes the ATP-grasp domain pfam08443. 94
60616 407868 pfam18031 UCH_C Ubiquitin carboxyl-terminal hydrolases. This is the C-terminal domain found in eukaryotic UCH37 proteins (also known as Ubiquitin carboxyl-terminal hydrolase isozyme L5, UCHL5). UCH37 is a subunit of two complexes: INO80, which performs ATP-dependent sliding of nucleosomes for transcriptional regulation and DNA repair, and the 26S proteasome, which performs ATP-dependent proteolysis of polyubiquitylated proteins in the cytosol and nucleus. Recruitment to the proteasome is mediated by the C-terminal domain of RPN13 (also known as ADRM1). Recruitment to INO80 is mediated by the N-terminal domain of NFRKB. Structural and biochemical analysis reveal that RPN13 and NFRKB make similar interactions with the UCH37 C-terminal domain but have very different interactions with the catalytic UCH domain that are activating in the case of RPN13 and highly inhibitory in the case of NFRKB. 46
60617 407869 pfam18032 FRP Photoprotection regulator fluorescence recovery protein. This family includes fluorescence recovery protein (FRP) domain, which is found in Synechocystis sp. PCC 6803 substr. Kazusa. FRP causes the dissociation of the orange carotenoid protein (OCP) from the phycobilisomes by interacting with the C-terminal domain of OCP, accelerating the conversion of the active red OCP to the inactive orange form. A patch of residues (W50, D54, H53, and R60), contributed by both chains of the FRP dimer, causes the acceleration of the OCPr to OCPo conversion. Mutation of the absolutely conserved amino acid (R60) affects the activity of FRP. 99
60618 407870 pfam18033 SpuA_C SpuA C-terminal. This is the C-terminal beta sandwich domain found in Streptococcus pneumoniae Spu4 proteins. Spu4 is a large multimodular cell wall-attached enzyme involved in the degradation of glycogen. 93
60619 407871 pfam18034 Bac_GH3_C Bacterial Glycosyl hydrolase family 3 C-terminal domain. This is the C-terminal domain of the glycoside hydrolase family (pfam00933) 137
60620 407872 pfam18035 Bap31_Bap29_C Bap31/Bap29 cytoplasmic coiled-coil domain. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31. This entry represents the cytoplasmic domain which forms a heterodimeric coiled-coil with Bap29. This Bap29 and Bap31 are homologous to each other and this entry includes both proteins. 52
60621 407873 pfam18036 Ubiquitin_4 Ubiquitin-like domain. This domain has a ubiquitin-like fold. It is found in a diverse range of proteins including the 25KDa U11/U12 component. 89
60622 407874 pfam18037 Ubiquitin_5 Ubiquitin-like domain. This entry includes N-terminal ubiquitin-like domain from proteins such as NEDD8 ultimate buster protein. 96
60623 407875 pfam18038 FERM_N_2 FERM N-terminal domain. This entry represents the FERM N-terminal domain found in focal adhesion kinases. 96
60624 407876 pfam18039 UBA_6 UBA-like domain. This entry represents a UBA-like domain found at the N-terminus of ribonuclease ZC3 proteins. 42
60625 407877 pfam18040 BPA_C beta porphyranase A C-terminal. This is the C-terminal domain found in Bacteroides plebeius proteins such as beta-porphyranase A (BPA), a beta-galactanase that cleaves the beta-1,4 glycosidic bond. Porphyranases degrade red seaweed glycans. This domain adopts a beta sandwich shape. 95
60626 407878 pfam18041 MapZ_EC1 MapZ extracellular domain 1. This is the extracellular domain 1 (MapZextra1) found in Streptococcus pneumoniae cell division site positioning protein MapZ. MapZ ensures accurate placement of the bacterial division site. The domain is a rigid four alpha-helices with two flexible linkers. The N-terminal end of MapZextra1 is connected to the trans-membrane segment of MapZ whilst the C-terminal is linked to MapZextra2 via a serine rich linker (SRL). The highly conserved residues are not accessible at the surface but are directly involved in many inter-helices interactions allowing for rigidity. 129
60627 407879 pfam18042 ORF_12_N ORF 12 gene product N-terminal. This is the N-terminal domain of Streptomyces clavuligerus ORF 12 gene product, which is directly involved in biosynthesis of Clavulanic acid (CA). The N-terminal domain consists of one four-stranded antiparallel beta-sheet surrounded by four alpha-helices and folds similarly to steroid isomerases and polyketide cyclases (PKTC). However, the N-terminal domain has no apparent polar-lined active-site pocket or conserved catalytic residues analogous to those of either the steroid isomerases or PKTCs. ORF 12 has 2 binding sites, CA is able to bind both to an active site and between the two domains. The active site pocket is lined by residues from both the N- and the C-terminal domains (His88, Ser173, Thr209, Ser234, Ser278, Met383, Phe374, Ala376 and Phe385). The C-2 carboxylate of CA is positioned deep in the pocket, making electrostatic interactions with Lys375, Arg418 and Lys89. Lys89 is located in strand beta-2 of the N-terminal domain and may also be involved in the beta-lactam core-binding region, including the residues Lys89 as well as Arg418. 92
60628 407880 pfam18043 T4_Rnl2_C T4 RNA ligase 2 C-terminal. This is the C-terminal domain of Enterobacteria phage T4 RNA ligase 2 (Rnl2). The C-terminal domain consists of a four-helix bundle. The C-terminal domain is thought to be critical for binding the Rnl2 to adenylylated nicked duplex. This is known as step 2 of the ligation pathway. 82
60629 407881 pfam18044 zf-CCCH_4 CCCH-type zinc finger. This short zinc binding domain has the pattern of three cysteines and one histidine to coordinate the zinc ion. This domain is found in a wide variety of proteins such as E3 ligases. 22
60630 407882 pfam18045 ISP3_C ISP3 C-terminal. This is the C-terminal domain of ISP3 protein, which plays a role in asexual daughter cell formation, for example in T.gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an interstrand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The loop between beta 5 and beta 6 is extended and variable. The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid interactions and play a role in mediating membrane localization through IP binding. However, the phospholipid-binding properties of PH domains are not conserved in ISP3. Unlike PH domains, ISP3 is cysteine rich. The cysteine-rich nature of the ISP3s and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. There are no disulfide bonds in ISP3 unlike in ISP1. It is worth noting that ISP1 and ISP3 share low sequence identity but contain the same secondary core elements. 110
60631 407883 pfam18046 FKBP26_C FKBP26_C-terminal. This is the C-terminal domain (CTD) of Methanocaldococcus jannaschii peptidyl-prolyl cis/trans isomerase FKBP26. FKBP26 mediates protein folding. CTD has an alpha/beta-sandwich structure composed of three alpha-helices and a three-stranded, mixed-orientation beta-sheet. The CTD domain is responsible for dimerization of FKBP26 through inter-subunit antiparallel pairing of the third beta-strands of the two CTDs. The CTD dimer forms a continuous, six-stranded mixed beta-sheet. A CTD-like structure is also found in the phylogenetically conserved NifU proteins (HIRIP5 in mouse and humans). 72
60632 407884 pfam18047 PatG_D PatG Domain. This is a domain found in PatG proteins; these proteins are involved in processing the precursor peptide to yield the cyclic Patellamide. PatG can be found in Prochloron sp. 111
60633 407885 pfam18048 TRAF6_Z2 TNF receptor-associated factor 6 zinc finger 2. This domain is the second of three zinc fingers of Homo sapiens TNF receptor associated factor 6 (TRAF6). TRAF6 mediates Lys63 (K63)-linked polyubiquitination for Necrosis Factor-kappaB activation. The first three residues and the last Cys of finger 1 form a classical type I beta-turn. 27
60634 407886 pfam18049 DNA_pol_P_Exo DNA polymerase nu pseudo-exo. This domain, known as the Pseudo-exo domain, is found in DNA polymerase nu protein in species such as Homo sapiens. Residues 192-416 of Pol nu form a degenerate 3'-5'-exonuclease domain, which deviates from the equivalent. 212
60635 407887 pfam18050 Cyclophil_like2 Cyclophilin-like family. This entry represents a family of cyclophilin-like proteins found in a range of bacterial species. 114
60636 407888 pfam18051 RPN1_C 26S proteasome non-ATPase regulatory subunit RPN1 C-terminal. This is the C-terminal domain found in RPN1 proteins (26S proteasome non-ATPase regulatory subunit 2). The 26S proteasome holocomplex consists of a 28-subunit barrel-shaped core particle (CP) in the center capped at the top and bottom by 19-subunit regulatory particles (RPs). The CP forms the catalytic chamber and the RP is formed from two subcomplexes known as the lid and the base. The lid comprises nine Rpn subunits in yeast (Rpn3/5/6/7/8/9/11/12/15) and the base comprises three Rpn subunits (Rpn1/2/13) and six ATPases (Rpt1-6). 54
60637 407889 pfam18052 Rx_N Rx N-terminal domain. This entry represents the N-terminal domain found in many plant resistance proteins. This domain has been predicted to be a coiled-coil, however the structure shows that it adopts a four helical bundle fold. 93
60638 407890 pfam18053 GyrB_insert DNA gyrase B subunit insert domain. This is the insert domain found in DNA gyrase B subunit proteins. Studies indicate that the insert has two functions, acting as a steric buttress to pre-configure the primary DNA-binding site, and serving as a relay that may help coordinate communication between different functional domains. 167
60639 407891 pfam18054 CEL_III_C CEL-III C-terminal. This is the C-terminal domain found in Cucumaria echinata CEL-III protein which is a lectin that exhibits both hemolytic and hemagglutinating activity. The domain is responsible for oligomerization and insertion of CEL-III into the erythrocyte membrane. The domain is composed of eight stranded beta sandwich and two alpha helices, the latter changes conformation upon binding to the cell surface carbohydrates. 154
60640 407892 pfam18055 RPN6_N 26S proteasome regulatory subunit RPN6 N-terminal domain. This is the N-terminal domain found in RPN6 proteins (26S proteasome regulatory subunit). The 26S proteasome holocomplex consists of a 28-subunit barrel-shaped core particle (CP) in the center capped at the top and bottom by 19-subunit regulatory particles (RPs). The CP forms the catalytic chamber and the RP is formed from two subcomplexes known as the lid and the base. The lid comprises nine Rpn subunits in yeast (Rpn3/5/6/7/8/9/11/12/15) and the base comprises three Rpn subunits (Rpn1/2/13) and six ATPases (Rpt1-6). Phosphorylation of Rpn6 enhances proteasome ATPase activity and promotes the formation of doubly capped (30S) proteasome, hence accelerating the degradation of short-lived proteins. 117
60641 407893 pfam18056 PBP3 Penicillin Binding Protein 3 Domain. This domain belongs to peptidoglycan synthesis regulatory factor 3 (PBP3) from Streptococcus pneumoniae. Peptidoglycan synthesis regulatory factors are known as Penicillin binding proteins (PBP) and are membrane-associated enzymes that perform critical functions in the bacterial cell division process. The domain contributes residues to the active site such as Arg 278. 110
60642 407894 pfam18057 DUF5594 Domain of unknown function (DUF5594). This domain was first discovered in BPSL1050, a highly immunoreactive protein found in Burkholderia pseudomallei. The domain's structure consists of three helical regions which pack onto an antiparallel beta sheet, formed by four strands. The beta sheet is solvent exposed on one side and packs tightly against the three helices on the other side, generating a network of hydrophobic and aromatic interactions that contribute to tight packing of the protein. It is thought that the small loop L1, the main loop L2, and part of helix alpha-3 extending until Leu120 are the three main immunogenic sequences. 115
60643 407895 pfam18058 SbsC_C SbsC C-terminal domain. This is the C-terminal domain found in Bacterial Cell Surface Layer Protein SbsC which can be found in species such as Geobacillus stearothermophilus. The C-terminal domain is the third and last triple-helical bundle and adopts a canonical coiled-coil structure. A similar overall arrangement of antiparallel triple-helical bundles has been found in the cytoskeletal protein spectrin (2SPC). 132
60644 407896 pfam18059 Csd3_N Csd3 N-terminal. Csd3 (also known as HdpA) is a bi-functional enzyme with delta,delta-endopeptidase activity and delta,delta-carboxypeptidase activity. The N-terminal domain is also known as domain 1 and is composed of an alpha/beta fold consisting of a five-stranded antiparallel beta-sheet and three short alpha-helices. Domain 1 blocks the active-site cleft of the LytM domain, with the protruding helix alpha-3 contributing to the Zn2+ coordination sphere. The fold of domain 1 is very remotely related to monellin/cystatin superfamily proteins, some of which act as inhibitors of cysteine peptidases. 83
60645 407897 pfam18060 F_actin_bund_C F actin bundling C terminal. This is the C-terminal domain found in 34 kDa F actin bundling protein. ABP34 is a calcium regulated actin binding protein that cross links actin filaments into bundles. Residues 216-244 in the C-domain are part of the strongest actin-binding sites (residue 193 residue 254) and have conserved sequences with the actin-binding regions of alpha-actinin and ABP120. 87
60646 407898 pfam18061 CRISPR_Cas9_WED CRISPR-Cas9 WED domain. This domain, known as the wedge (WED) domain, is found in Cas9 proteins which are present in Staphylococcus aureus. Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. The Cas9 WED domain has a fold comprising a twisted five-stranded beta sheet flanked by four alpha helices, and is responsible for the recognition of the distorted repeat: anti-repeat duplex. WED domains are responsible for the recognition of single-guide RNA scaffolds. 126
60647 407899 pfam18062 RE_AspBHI_N Restriction endonuclease AspBHI N-terminal. This is the N-terminal domain found in modification-dependent restriction endonuclease proteins such as AspBHI, which can be found in Azoarcus sp. AspBHI is a homo-tetrameric protein that recognizes 5-methylcytosine in the double-strand DNA sequence context of (C/T) (C/G) (5mC) nucleotide (C/G) and cleaves the two strands at a fixed distance (N12/N16) 3 to the modified cytosine. The N-terminal domain is responsible for DNA-recognition and resembles an SRA-like 5-methylcytosine binding domain in structure and function. 185
60648 407900 pfam18063 BB_PF Beta barrel Pore-forming domain. This domain is found in Monalysin Pore-forming Toxin which is a type of beta-barrel pore-forming toxin protein found in Pseudomonas entomophila. Monalysin forms a stable doughnut-like 18-mer complex composed of two disk-shaped nonamers to form a pore. The domain is composed of a central twisted beta -sheet composed of three antiparallel beta-strands (beta 3, beta 6, and beta 8/9) and flanked by the pore-forming segment and the C-terminal region on either side. The pore-forming domain (residues 102-170) is located between strands beta 3 and beta 6, and is formed from two antiparallel beta-strands connected by three alpha-helices, alpha 3, alpha 4, and alpha 5. The C-terminal region forms a long alpha-helix followed by a small hairpin and a short alpha-helix. 204
60649 407901 pfam18064 ParB_C Centromere-binding protein ParB C-terminal. This is the C-terminal domain found in centromere-binding protein ParB, which is used for stable segregation. The C-terminal domain has a ribbon-helix helix (RHH) motif with a C-terminal loop (residues 119-128) following helix alpha-2. The domain forms a dimer with the C-terminal of the beta chain. The function of the C-terminal domain is to bind to DNA. 47
60650 375529 pfam18065 PatG_C PatG C-terminal. This is the C-terminal domain of Prochloron sp. PatG, which processes the precursor peptide to yield the cyclic Patellamide. The C-terminal domain of PatG is 56% structurally homologous to the C-terminal domain of PatA. 115
60651 407902 pfam18066 Phage_ABA_S Phage ABA sandwich domain. This domain is found in a prophage protein BC1872 found in B. cereus. The domain forms a three-layer beta/alpha/beta sandwich. Three alpha-helices, alpha1, alpha2, and alpha3, are sandwiched on one side by three-stranded antiparallel beta-sheet (beta3, beta4, and beta5) and on the other by a beta-hairpin (beta1 and beta2). 89
60652 407903 pfam18067 Lipase_C Lipase C-terminal domain. This domain is found in Archaeoglobus fulgidus lipase (AFL). The domain consists of a layer of seven beta-sheet. When the domain is combined with the proximal domain, which is also a layer of seven beta-sheet, they form a beta sandwich. The combination of these two domains is known as the C-terminal domain. It is likely that the C-terminal domain plays an important role in substrate specificity and catalytic efficiency, and also contributes partly to AFL's stability. 96
60653 407904 pfam18068 Npun_R1517 Npun R1517. This domain belongs to NPun R1517 which is found in Nostoc punctiforme. Studies indicate that Npun R1517 is encoded by orphan gene 29. Npun R1517 adopts a sleigh-shaped structure with a two-stranded antiparallel beta-sheet forming the floor of the sleigh, a HTH forming the seat and a HTH forming the front of the sleigh. 74
60654 407905 pfam18069 DR2241 DR2241 stabilising domain. This is the middle domain found in DR2241, a multi-domain protein with an N-terminal cobalamin (vitamin B12) chelatase domain. DR2241 is found in D. radiodurans. The middle domain has four alpha-helices (alpha7-alpha10) in contact with the N-terminal domains and C-terminal domain and five anti-parallel beta-strands with strand order 12354 at the outer side of one monomer. The middle domain, as well as the C-terminal domain, are heavily involved in the tetramer stabilisation. 111
60655 407906 pfam18070 Cas9_PI2 CRISPR-Cas9 PI domain. This domain is found in Cas9 proteins which can be found in Staphylococcus aureus. Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. When this domain is combined with the C-terminal domain it is called the PAM-interacting (PI) domain. The Cas9 orthologs from different microbes have highly divergent sequences but their PI domains share a conserved core fold and recognize distinct PAM sequences. 59
60656 407907 pfam18071 HMW1C_N HMW1C N-terminal. This is the N-terminal domain found in Actinobacillus pleuropneumoniae HMW1C (ApHMW1C). HMW1 adhesin is an N-linked glycoprotein that mediates adherence to respiratory epithelium through N-glycosylation of protein acceptor sites and O-glycosylation of sugar acceptor sites. The N-terminal domain forms an all alpha domain (AAD) when combined with the domain spanning from residue 154 to residue 245. The AAD interacts extensively with the C-terminal GT-B fold in order to create a unique groove with potential to accommodate the acceptor protein. 143
60657 407908 pfam18072 FGAR-AT_linker Formylglycinamide ribonucleotide amidotransferase linker domain. This is the linker domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. The structure analysis of Salmonella typhimurium FGAR-AT reveals that this linker domain is made up of a long hydrophilic belt with an extended conformation. 50
60658 407909 pfam18073 Rubredoxin_2 Rubredoxin metal binding domain. This is the C-terminal rubredoxin metal binding domain found in lipopolysaccharide (LPS) assembly protein B (LapB). Rubredoxin proteins form small non-heme iron binding sites that use four cysteine residues to coordinate a single metal ion in a tetrahedral environment. Rubredoxins are most commonly found in bacterial systems, but have also been found in eukaryotes. The key features of these rubredoxin-like domains are the extended loops or 'knuckles' and the tetracysteine mode of iron binding. Structural analysis of LapB from Escherichia coli shows that the rubredoxin metal binding domain is intimately bound to the TPR motifs and that this association to the TPR motifs is essential to LPS regulation and growth in vivo. Other family members include RadA proteins which play a role in DNA damage repair. In E. coli, a protein known as RadA (or Sms) participates in the recombinational repair of radiation-damaged DNA in a process that uses an undamaged DNA strand in one DNA duplex to fill a DNA strand gap in a homologous sister DNA duplex. RadA carries a zinc finger at the N-terminal domain. 28
60659 407910 pfam18074 PriA_C Primosomal protein N C-terminal domain. This is the C-terminal domain found in PriA DNA helicase, a multifunctional enzyme that mediates the process of restarting prematurely terminated DNA replication reactions in bacteria. The C-terminal domain (CTD) bears similarity to the S10 subunit which binds branched rRNA within the bacterial ribosome. The C-terminal domain is part of the helicase domain of PriA proteins. It acts together with the 3' DNA-binding domain to form a site for binding ssDNA-binding protein (SSB). 96
60660 407911 pfam18075 FtsX_ECD FtsX extracellular domain. This is the extracellular domain (ECD) found in FtsX enzyme, a homolog of the transmembrane PG-hydrolase regulator. The FtsX extracellular domain binds the PG peptidase Rv2190c/RipC N-terminal segment, causing a conformational change that activates the enzyme, leading to PG hydrolysis in Mycobacterium tuberculosis. Structural analysis of FtsX ECD reveals a fold containing two lobes connected by a flexible hinge. Mutations in the hydrophobic cleft between the lobes showed reduction in RipC binding in vitro and inhibition of FtsX function in Mycobacterium smegmatis. 94
60661 407912 pfam18076 FGAR-AT_N Formylglycinamide ribonucleotide amidotransferase N-terminal. This is the N-terminal domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide and glutamine to formylglycinamidine ribonucleotide, ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. 115
60662 407913 pfam18077 DUF5595 Domain of unknown function (DUF5595). This domain is found in Nude C 80 (Ndc80) proteins which can be found in species such as Homo sapiens. Ndc80 protein complexes are a core component of the end-on attachment sites for kinetochore microtubules. Ndc80 is also known as Hec1, for highly expressed in cancer 1. 73
60663 407914 pfam18078 Thioredoxin_11 Thioredoxin-like SNTX domain. This domain is found in pore-forming toxin stonustoxin (SNTX), a lethal venom found in Synanceia horrida. Because of the thioredoxin-like nature of the domain, it is referred to as the THX domain. The THX domain is comprised of a five-stranded beta-sheet and shares greatest structural similarity with Saccharomyces cerevisiae mitochondrial THX3. It is thought that THX domain plays a purely structural role. 126
60664 407915 pfam18079 AglB_L1 Archaeal glycosylation protein B long peripheral domain. This domain is found in Archaeal Glycosylation B protein (AglB-Long) in A. fulgidus. When the domain, known as peripheral l (Pl), is combined with the central core (CC) and insertion (IS) sub-units, they form the C-terminal domain. It is thought that the C-terminal domain may contribute toward the increased thermal stability of the AglB proteins in the hyper-thermophilic. 88
60665 407916 pfam18080 Gal_mutarotas_3 Galactose mutarotase-like fold domain. This domain is found in endo-alpha-N-acetylgalactosaminidase present in Streptococcus pneumoniae. Endo-alpha-N-acetylgalactosaminidase is a cell surface-anchored glycoside hydrolase involved in the breakdown of mucin type O-linked glycans. The domain, known as domain 2, exhibits strong structural similarity to the galactose mutarotase-like fold but lacks the active site residues. Domains, found in a number of glycoside hydrolases, structurally similar to domain 2 confer stability to the multidomain architectures. 243
60666 407917 pfam18081 FANC_SAP Fanconi anemia-associated nuclease SAP domain. This domain is found in Fanconi-anemia-associated nuclease 1 (FAN1) present in Pseudomonas aeruginosa. FAN1 is a nuclease associated with Fanconi anemia (FA), an autosomal recessive genetic disorder caused by defects in FA genes responsible for processing DNA inter-strand cross-links (ICLs). The domain, known as the SAP domain, helps to augment the overall protein DNA interaction by interacting with the 3' and 5' ends of the template strand. Support of the pre-nick segment binding is crucial as multiple mutations in this domain resulted in hypersensitivity to a cross-linking agent in the SAP domain of Caenorhabditis elegans' FAN1. The helix-hairpin-helix of the SAP recognizes three consecutive phosphate groups (C19, A20 and A21) at the 3' end of the template via the basic residues K116, K135 and K117. 51
60667 407918 pfam18082 DUF5596 Domain of unknown function (DUF5596). This domain belongs to polcalcin from the weed Chenopodium album, recombinant Che a 3 (rChe a 3). Polcalcins occur in pollen as highly cross-reactive allergenic molecules. The three-dimensional structure of rChe a 3 resembles an alpha-helical fold that is essentially identical with that of the two EF-hand allergens from birch pollen, Bet v 4, and timothy grass pollen, Phl p 7. 129
60668 407919 pfam18083 PutA_N Proline utilization A N-terminal domain. This domain is found in Proline utilization A (PutA) proteins present in Geobacter sulfurreducens. PutA are bifunctional peripheral membrane flavoenzymes that catalyze the oxidation of l-proline to l-glutamate and couple the oxidation of imported proline to the reduction of membrane-associated quinones. This domain is located at the N-terminus and is referred to as the alpha domain. The hydrocarbon tail of Zwittergent 3-12 binds to an exposed hydrophobic patch of the alpha domain which contains aromatic and nonpolar residues. The domain may be involved in membrane association. 113
60669 407920 pfam18084 ARTD15_N ARTD15 N-terminal domain. This is the N-terminal domain of poly ADP-ribose polymerase (PARP16 also known as ARTD15) present in Homo sapiens. ARTDs catalyse the formation of branched or unbranched chains of ADP-ribose units on protein side chains. The N-terminal domain of ARTD15 does not share any obvious sequence similarity with the regulatory domains of ARTD1-4. The N-terminal domain arrangement in both ARTD15 and ARTD1-3 suggests a regulatory role through different mechanisms. 82
60670 407921 pfam18085 Mak_N_cap Maltokinase N-terminal cap domain. Glycogen is a central energy storage molecule in bacteria and the metabolic pathways associated with its biosynthesis and degradation are crucial for maintaining cellular energy homeostasis. In mycobacteria, the GlgE pathway involves the combined action of trehalose synthase (TreS), maltokinase (Mak) and maltosyltransferase (GlgE). The N-terminal lobe can be divided into two subdomains: a cap N-terminal subdomain comprising the first 88 amino acid residues. This entry is for the cap N-terminal domain found in mycobacterial maltokinase (Mak), (EC:2.7.1.175). The N-terminal cap subdomain and the C-terminal lobe are predominantly acidic, the intermediate subdomain is enriched in positively charged residues. A structural search with only the first 88 amino acid residues of Mak, corresponding to the N-terminal cap subdomain of maltokinases, unveiled a resemblance with proteins displaying the cystatin fold and a remote similarity with the N-terminal domain of the serine/threonine protein kinase GCN2. Conservation of the cap subdomain in maltokinases (including the bifunctional TreS-Mak enzymes), in particular of the residues in the proximity of the P-loop, together with the potential flexibility of this region, are compatible with regulatory functions for this subdomain. Hence it is hypothesized that the N-terminal cap subdomain plays a central role in modulation of Mak enzymatic activity. 88
60671 407922 pfam18086 PPIP5K2_N Diphosphoinositol pentakisphosphate kinase 2 N-terminal domain. This is the N-terminal domain found in the Inositol hexakisphosphate and diphosphoinositol-pentakisphosphate kinase 2 (PPIP5K2), EC:2.7.4.24. Structure analysis of human PPIP5K2 indicates that this region forms an alpha-beta-alpha domain. PPIP5K2 is one of the mammalian PPIP5K isoforms responsible for synthesis of diphosphoinositol polyphosphates (inositol pyrophosphates; PP-InsPs), regulatory molecules that function at the interface of cell signaling and organismic homeostasis. 90
60672 407923 pfam18087 RuBisCo_chap_C Rubisco Assembly chaperone C-terminal domain. This is the C-terminal domain, also known as the beta domain, of Rubisco Assembly Chaperone protein (Raf1). Raf1 is necessary for rubisco to catalyze the rate-limiting step of carbon fixation through carboxylating the five-carbon sugar substrate ribulose-1,5-bisphosphate. The beta domain's primary function is dimerization, which is critical for Raf1 to achieve the necessary avidity for complex formation with RbcL (the large subunit of Rubisco) assembly intermediates. The beta domain is also involved, to a small extent, in binding to RbcL with use of the lustiness near the beta domain's conserved top surface. 138
60673 407924 pfam18088 Glyco_H_20C_C Glycoside Hydrolase 20C C-terminal domain. This is the C-terminal domain of Glycoside hydrolase 20 C (GH20C) present in S. pneumoniae. GH20C possesses the ability to hydrolyze the beta-linkages joining either N-acetylglucosamine or N-acetylgalactosamine to a wide variety of aglycon residues. The C-terminal domain, commonly known as Domain III, is important in dimerization as it forms the primary interface of the dimer. However, there is presently no evidence supporting dimerization as being necessary for catalysis. Domain III is unusual among structurally characterized GH20 enzymes but in GH20 enzymes possessing domain III, dimerization seems to be a conserved feature. 194
60674 407925 pfam18089 DAPG_hydrolase DAPG hydrolase PhiG domain. This domain is found in 2,4-diacetylphloroglucinol hydrolase PhiG present in Pseudomonas fluorescens. 2,4-diacetylphloroglucinol hydrolase is the gene product of PhiG that is responsible for cleaving toxic 2,4-diacetylphloroglucinol (DAPG). The small N-terminal region of the domain is involved in dimerization through hydrogen bonding of the dimer interface. The C-terminal catalytic region resembles the tetracenomycin aromatase/cyclase and has a Bet v1-like fold. DAPG PhiG is the first discovered hydrolase whose catalytic domain belongs to the Bet v1-like fold, rather than the classical alpha/beta-fold hydrolases. 221
60675 375543 pfam18090 SoPB_HTH Centromere-binding protein HTH domain. This domain is found in centromere-binding protein (SopB). SopB displays an intriguing range of DNA-binding properties essential for partition; it binds the centromere to form a partition complex, which recruits NTPase (SopA), and it also inhibits SopA polymerization. The domain has a helix-turn-helix (HTH) structure and is thought to be the specific DNA-binding domain mainly through residues from the recognition helix, alpha 3, of the HTH. The domain has shown structural similarity to the DNA-binding domains of P1 ParB and KorB. 75
60676 407926 pfam18091 E3_UbLigase_RBR E3 Ubiquitin Ligase RBR C-terminal domain. This is the C-terminal domain of HOIP present in Homo sapiens. HOIP synthesizes the linear ubiquitin chains that help control innate immunity and inflammation. This region has an RBR domain which catalyzes the transfer of ubiquitin onto a substrate. 90
60677 407927 pfam18092 DraK_HK_N DraK Histidine Kinase N-terminal domain. This is the N-terminal domain found in DraK Histidine Kinase (HK) present in Streptomyces coelicolor. Activation of the DraK HK leads to autophosphorylation of its kinase domain in the cytoplasm and subsequent transphosphorylation of DraR promoting blue-pigmented polyketide actinorhodin (ACT) production. The N-terminal domain is known as the sensor (or input) domain and undergoes a conformational change resulting in phosphorylation of a conserved histidine in its cytoplasmic kinase domain. 85
60678 407928 pfam18093 Trm5_N tRNA methyltransferase 5 N-terminal domain. This is the N-terminal domain of tRNA methyltransferase 5 (Trm5) present in Methanocaldococcus jannaschii. Trm5 catalyzes the methyl transfer from S-adenosyl methionine (AdoMet) to N1 of G37. This domain, also known as the D1 domain, contacts the tertiary core (elbow) region of the tRNA L shape in a ternary complex of the enzyme with tRNA and AdoMet. 47
60679 407929 pfam18094 DNA_pol_B_N DNA polymerase beta N-terminal domain. This is the N-terminal domain of DNA polymerase beta present in Homo sapiens. DNA polymerase beta is a repair enzyme that has a key role in the base excision repair of simple DNA lesions. 103
60680 407930 pfam18095 PAS_12 UPF0242 C-terminal PAS-like domain. This domain is found at the C-terminus of proteins of the UPF0242 family. This domain is related to the PAS domain pfam13426. 153
60681 407931 pfam18096 Thump_like THUMP domain-like. This is a domain of unknown function found in bacteria. 77
60682 407932 pfam18097 Vta1_C Vta1 C-terminal domain. This is the C-terminal domain of Vta1 proteins pfam04652. Structural and functional analysis indicate that this C-terminal domain promotes the ATP-dependent double ring assembly of Vps4. Furthermore, it has been shown that it is necessary and sufficient for protein dimerization. Mutations in Lys-299 and Lys-302 completely abolished the ability of Vta1 to stimulate the ATPase activity of Vps4 while mutation in Lys-322 had no effect. 38
60683 407933 pfam18098 RPN5_C 26S proteasome regulatory subunit RPN5 C-terminal domain. This is the C-terminal domain of the 26S proteasome regulatory subunit RPN5 proteins. This helical domain can be found adjacent to pfam01399. The 26S proteasome is the major ATP-dependent protease in eukaryotes. Three subcomplexes form this degradation machine: the lid, the base, and the core. The helices found at the C terminus of each lid subunit form a helical bundle that directs the ordered self-assembly of the lid subcomplex. This domain, which comprises the tail of RPN5, together with the tail of Rpn9, is important for Rpn12 binding to the lid. 32
60684 407934 pfam18099 DUF5010_C DUF5010 C-terminal domain. This domain is found at the end of a family of putative glycosyl hydrolases pfam16402. This domain is likely to function as a carbohydrate binding domain due to its similarity with pfam03422. 112
60685 407935 pfam18100 PDE4_UCR Phosphodiesterase 4 upstream conserved regions (UCR). This is the upstream conserved region (UCR) found in Phosphodiesterase 4 (PDE4) enzymes. PDE4 is a contributor to intracellular signalling and an important drug target. The four members of this enzyme family (PDE4A to -D) are functional dimers in which each subunit contains two upstream conserved regions (UCR), UCR1 and -2, which precede the C-terminal catalytic domain pfam00233. Due to alternative promoters/start sites and variable mRNA splicing, transcription from the four PDE4 genes results in the expression of more than 25 different isoforms of PDE4. Each isoform has a unique N-terminal region that determines its specific subcellular localization by mediating interactions with scaffolding proteins. The isoforms are further classified into long, short, and supershort forms based on the presence or absence of two upstream conserved regions (UCRs, known as UCR1 and UCR2). Long splice variants contain both UCR1 and UCR2, short variants lack UCR1, and the supershort forms of PDE4 additionally lack part of UCR2. The extent to which UCRs are present determines critical functional differences between the isoforms. Phosphorylation by protein kinase A (PKA) at a conserved site on UCR1 activates all long PDE4 isoforms. Mutation and deletion studies have shown that long forms of PDE4 are dimeric, with key dimerization interactions mediated by UCR1 and UCR2, and that the C-terminal half of UCR2 could play a negative regulatory role. 119
60686 407936 pfam18101 Pan3_PK Pan3 Pseudokinase domain. This is a pseudokinase (PK) domain found in PAB-dependent poly(A)-specific ribonuclease subunit pan3. PAN3 proteins contain three prominent regions: an unstructured N-terminal region (N-term), a central PK domain, and a highly conserved C-terminal domain (C-term). The PAN3 PK domain has retained its ATP binding capacity, and this function is required for mRNA degradation in vivo. Analysis of Pan3 amino acid sequences shows that, despite retaining the general structural characteristics of protein kinases, the PK domain has substitutions in all the conserved motifs that are critical for kinase activity, such as in the catalytic VAIK and HRD motifs and in the Mg2+ binding DFG motif. However, the PAN3 PK domain has been shown to bind ATP. Furthermore, similar to other kinases, the ATP-binding site is located in the cleft between the N- and C-lobes of the kinase fold; however, the ATP-binding pocket is wider than that of typical kinases. 138
60687 407937 pfam18102 DTC Deltex C-terminal domain. This is the C-terminal domain found in members of the Deltex family of proteins which comprises five members (DTX1, 2, 3, 4, and 3L). This conserved C-terminal region of about 150 residues of the Deltex family is preceded by a RING E3 ligase domain in four of the members. Crystal structure of the Deltex C-terminal (DTC) domain reveals a fold composed of a central beta-sheet lined with two long parallel alpha-helices. 133
60688 375552 pfam18103 SH3_11 Retroviral integrase C-terminal SH3 domain. This is the carboxy-terminal domain (CTD) found in retroviral integrase, an essential retroviral enzyme that binds both termini of linear viral DNA and inserts them into a host cell chromosome. The CTD adopts an SH3-like fold. Each CTD makes contact with the phosphodiester backbone of both viral DNA molecules, essentially crosslinking the structure. 63
60689 407938 pfam18104 Tudor_2 Jumonji domain-containing protein 2A Tudor domain. This is the tudor domain found in histone demethylase Jumonji domain-containing protein 2A (JMJD2A). Structure and function analysis indicate that this domain can recognize equally well two unrelated histone peptides, H3K4me3 and H4K20me3, by means of two very different binding mechanisms. JMJD2, also known as KDM4, is a conserved iron (II)-dependent jumonji-domain demethylase subfamily that is essential during development. Vertebrate KDM4A-C proteins contain a conserved double tudor domain (DTD). 35
60690 407939 pfam18105 PGM1_C PGM1 C-terminal domain. This is the C-terminal domain found in PGM1 present in Streptomyces cirratus. PGM1 is a gene product that links precursor peptides together to form the antibiotic Pheganomycin. 53
60691 407940 pfam18106 Rol_Rep_N Rolling Circle replication initiation protein N-terminal domain. This is the N-terminal domain of the Rolling Circle Replication Initiator Protein (Rep) from Geobacillus stearothermophilus. This protein acts on plasmids from family pT181 to initiate replication, recruit a helicase to the site of initiation and terminate replication after DNA synthesis. These proteins possess a unique active site and a catalytically essential metal ion is bound in a distinct manner from other rolling circle Reps. 91
60692 407941 pfam18107 HTH_ABP1_N Fission yeast centromere protein N-terminal domain. This domain is found in the fission yeast centromere protein (Abp1) in species such as Schizosaccharomyces pombe. The domain, referred to as Domain 1, is DNA-binding and makes up half of the N-terminal region. 61
60693 407942 pfam18108 QSOX_Trx1 QSOX Trx-like domain. This domain is found in Quiescin sulfhydryl oxidase (QSOX), an oxidoreductase present in Homo sapiens capable of both generating and transferring disulfide modules within a single polypeptide. The domain is thioredoxin-like, hence referred to as Trx1 domain. Trx1 domain has a di-cysteine motif (Cys-X-X-Cys) which is related to the redox-active domains of protein disulfide isomerase. The Trx1 domain is responsible for intramolecular disulfide transfer through the di-cysteine motif. 108
60694 407943 pfam18109 Fer4_24 Ferredoxin I 4Fe-4S cluster domain. This domain is found in Ferredoxin I (FdI), an iron-sulfur ([Fe-S]) cluster-containing protein, present in species such as Azotobacter vinelandii. [Fe-S] proteins participate in electron transfer, catalytic, regulatory, and structural functions. The FdI cluster exhibits a pH-dependent reduction potential and reversible protonation in the reduced state. 35
60695 407944 pfam18110 BRCC36_C BRCC36 C-terminal helical domain. This is the C-terminal domain of BRCC36, a Zn2+ dependent deubiquitinating enzyme, present in Camponotus floridanus. BRCC36 hydrolyzes lysine linked ubiquitin chains as part of macromolecular complexes that participate in either interferon signalling or DNA-damage recognition. The domain consists of 2 non canonical helices. The domain interacts hydrophobically with helices alpha 4 and alpha 5 of KIAA0157 in the form of a coiled coil helical bundle. This interaction helps establish the association of BRCC36 with KIAA0157, a pseudo-DUB MPN- protein that is essential for the activity of BRCC36. 81
60696 407945 pfam18111 RPGR1_C Retinitis pigmentosa G-protein regulator interacting C-terminal. This is the C-terminal domain of retinitis pigmentosa G-protein regulator (RPGR) interacting protein-1 present in Homo sapiens. A mutation in RPGR interacting protein-1 can be observed in the eye disease Leber congenital amaurosis. The domain is commonly known as the RPGR-interacting domain (RID) and is thought to have a C2-like fold. 166
60697 407946 pfam18112 Zn-C2H2_12 Autophagy receptor zinc finger-C2H2 domain. This domain is found in calcium-binding and coiled-coil domain 2/NDP25 (CALCOCO2/NDP25) found in Homo sapiens. CALCOCO2/NDP25 is an ubiquitin-binding autophagy receptor involved in the selective autophagic degradation of invading pathogens. This domain is a typical C2H2-type zinc finger which specifically recognizes mono-ubiquitin or poly-ubiquitin chain. The overall ubiquitin-binding mode utilizes the C-terminal alpha-helix to interact with the solvent-exposed surface of the central beta-sheet of ubiquitin, similar to that observed in the RABGEF1/Rabex-5 or POLN/Pol-eta zinc finger. 27
60698 407947 pfam18113 Rbx_binding Rubredoxin binding C-terminal domain. This is the C-terminal domain found in rubredoxin reductase (RdxR) present in Pseudomonas aeruginosa. RdxR are important in prokaryotes as they allow for the metabolism of inert n-alkanes and RdxR is also crucial for archaea and anaerobic bacteria in the response to oxidative stress. This domain is known to recognize and bind to rubredoxin. 71
60699 407948 pfam18114 Suv3_N Suv3 helical N-terminal domain. This is the N-terminal domain of Suv3 present in Homo sapiens. Suv3 is an NTP-dependent RNA/DNA helicase that is necessary for the degradation of mature mtRNAs. Suv3 has been found to interact in vitro with polynucleotide phosphorylase. 117
60700 407949 pfam18115 Tudor_3 DNA repair protein Crb2 Tudor domain. This is the tudor domain found in DNA repair protein crb2. Structural and functional studies of Crb2 and its mammalian homologue 53BP1 indicate that the conserved tandem-Tudor domain of 53BP1 and Crb2 preferentially interacts with H4K20me2, though it also binds to H4K20me1. Furthermore, despite low amino acid sequence similarity, Crb2 is structurally related to 53BP1 in having two tudor domains and a conserved dimethyllysine-binding pocket, and that, like 53BP1, it directly binds H4-K20me2. 49
60701 407950 pfam18116 SNX17_FERM_C Sorting Nexin 17 FERM C-terminal domain. This is the C-terminal domain of sorting nexin 17 (SNX17) present in Homo sapiens. SNX17 localizes to early endosomes where it directly binds NPX(Y/F) motifs in the target receptors to mediate their rates of endocytic internalization, recycling, or degradation. The domain is known as terminal band 4.1/ezrin/radixin/moesin (FERM) domain. The FERM domain binds directly to the common motif, NPX(Y/F), in the cytoplasmic region of its target proteins. 109
60702 407951 pfam18117 EDS1_EP Enhanced disease susceptibility 1 protein EP domain. This is the C-terminal domain found in the enhanced disease susceptibility 1 (EDS1) protein present in Arabidopsis thaliana. EDS1 controls the post-infection basal resistance layer. This highly conserved domain is known as the EP domain and its interface consists of hydrophobic interactions, salt bridges, and an extensive hydrogen bonding network. 214
60703 407952 pfam18118 PRC2_HTH_1 Polycomb repressive complex 2 tri-helical domain. This domain can be found in the Polycomb repressive complex 2 (PRC2) present in Homo sapiens. Polycomb complexes maintain repressive chromatin states by silencing gene expression. PRC2 does this by methylating lysine 27 of histone H3. This domain makes up part of the N-lobe which is involved in regulation. 101
60704 407953 pfam18119 RIG-I_C RIG-I receptor C-terminal domain. This is the C-terminal domain of Innate Immune Pattern-Recognition Receptor RIG-I present in Homo sapiens. RIG-I is a key cytosolic pattern-recognition receptor of the vertebrate innate immune system that forms the first line of defense against RNA viral infection. RNA binding to RIG-I is mediated both by the C-terminal domain and by the helicase domain. The C-terminal domain specifically binds the 5'triphosphate end with a 10-fold higher affinity compared to 5'OH-dsRNA. 139
60705 407954 pfam18120 DUF5597 Domain of unknown function (DUF5597). This is the C-terminal domain of xyloglucan utilization locus (XyGUL) present in Cellvibrio japonicus. XyGUL is required for xyloglucan utilization. It is also the C-terminal domain of PF02449 and PF01301. 130
60706 407955 pfam18121 TFA2_Winged_2 TFA2 Winged helix domain 2. This is the second winged helix domain found in TFA2 proteins present in Saccharomyces cerevisiae. In form 2, the domain interacts directly with Rad3, a DNA helicase. 61
60707 407956 pfam18122 APC1_C Anaphase-promoting complex subunit 1 C-terminal domain. This is the C-terminal domain of chain A, also known as subunit 1, found in anaphase-promoting complex (APC/C) present in Homo sapiens. APC/C is a ubiquitin ligase that controls chromosome segregation and mitotic exit. 158
60708 407957 pfam18123 FGFR3_TM Fibroblast growth factor receptor 3 transmembrane domain. This transmembrane (TM) domain is found in Fibroblast growth factor receptor 3 (FGFR3) present in Homo sapiens. Fibroblast growth factors transduce diverse biochemical signals by lateral dimerization in the plasma membrane, followed by receptor auto-phosphorylation and stimulation of downstream signalling cascades. In FGFR3, TM domains associate in a parallel fashion in a left-handed dimer via an extended heptad motif. The N-terminal part of the TM dimer acts, most likely, as an anchor positioning the TM domain in the detergent head group region. The charged residues flanking the TM helix on both termini apparently have a profound destabilizing effect on the FGFR3 dimer, but in the absence of ligand, the TM domain interaction stabilizes the FGFR3 dimer. 31
60709 407958 pfam18124 Kindlin_2_N Kindlin-2 N-terminal domain. This is the N-terminal domain (K2-N) of Kindlin-2 protein present in Homo sapiens. Kindlin-2 is a regulator of heterodimeric integrin adhesion receptors that promotes integrin activation. Activation depends on binding of the N-terminal domain to the integrin beta cytoplasmic tail (CT), which disrupts the receptor's association with alpha-CT and triggers the conformational transitions in the receptor. K2-N contains a conserved positively charged surface that binds to membrane enriched with negatively charged phosphatidylinositol-(4,5)-bisphosphate (PIP2). K2-N is also very similar to the homologous kindlin-1 F0. 89
60710 375573 pfam18125 RlmM_FDX RlmM ferredoxin-like domain. This domain is found in Ribosomal methyltransferase RlmM (YdgE) present in E. coli. RlmM catalyzes the S-adenosyl methionine (AdoMet)-dependent 2'O methylation of C2498 in 23S ribosomal RNA. The domain is ferredoxin-like and forms part of the THUMP domain which binds RNA. THUMP domains typically have low sequence similarity. 71
60711 407959 pfam18126 Mitoc_mL59 Mitochondrial ribosomal protein mL59. This domain is a protein found in mitochondrial ribosome 54S large sub unit present in species such as Saccharomyces cerevisiae. The domain used to be referred to as MRP25 but is now called mL59 protein. mL59 is known to partially stabilize a change in the rRNA path prior to helix 82-ES1 ultimately leading to the stabilization of the phosphate backbone of the tRNA acceptor stem. It is worth noting that the domain is encoded in the nucleus and imported from the cytoplasm. 129
60712 407960 pfam18127 DUF5598 Domain of unknown function (DUF5598). This is the N-terminal domain found in Nicotinamide phosphoribosyltransferase (NAMPT) present in Homo sapiens. NAMPT captures nicotinamide (NAM) and replenishes the nicotinamide adenine dinucleotide (NAD+) pool during ADP-ribosylation and transferase reactions. 98
60713 375576 pfam18128 HydF_dimer Hydrogen maturase F dimerization domain. This domain is found in Hydrogen maturase F (HydF) present in Thermotoga neapolitana. HydF is a GTPase, containing an FeS cluster-binding motif, that is able to activate HydA produced so that HydA can drive the reversible reduction of protons to molecular H2. This domain, referred to as domain II, is responsible for HydF dimerization through the formation of a continuous beta-sheet comprising eight beta-strands from two monomers. 99
60714 407961 pfam18129 SH3_12 Xrn1 SH3-like domain. This is the C-terminal SH3-like domain which can be found in the exoribonuclease Xrn1. Xrn1 is a 175 kDa processive exoribonuclease that is conserved from yeast to mammals which targets cytoplasmic RNA substrates marked by a 5' monophosphate for processive 5'-to-3' degradation. The SH3-like domain in Xrn1 lacks the canonical SH3 residues normally involved in binding proline-rich peptide motifs and instead engages in non-canonical interactions with the catalytic domain. Additionally, it is essential in maintaining the structural integrity of Xrn1, since partial truncation of this domain in yeast Xrn1 yields an inactive protein. There is a long loop projecting from the SH3-like domain that contacts the PAZ/Tudor domain, occluding the functional surface that binds RNA or peptide motifs containing methylated arginines, respectively, in canonical PAZ and Tudor domains. 68
60715 407962 pfam18130 ATPgrasp_N ATP-grasp N-terminal domain. This is the N-terminal domain found in BL00235 present in Bacillus licheniformis. BL00235 is an ATP-grasp superfamily protein that catalyzes the formation of an alpha-peptide bond between two L-amino acids in an ATP-dependent manner. BL00235 has a highly restricted substrate specificity: the N-terminal substrate is confined to L-methionine and L-leucine, while the C-terminal substrates include small residues such as L-alanine, L-serine, L-threonine and L-cysteine. 81
60716 407963 pfam18131 KN17_SH3 KN17 SH3-like C-terminal domain. This is the C-terminal domain of human KIN17 protein. Overexpression of KIN17 modifies the nuclear morphology and inhibits S-phase progression, thus blocking cell growth as part of the response to genotoxics. The C-terminal domain binds to RNA and is generally well conserved. The domain has structural similarity with various SH3-like domains, although it lacks similarities in both primary sequence and charge distribution. 53
60717 407964 pfam18132 Tyosinase_C Tyrosinase C-terminal domain. This is the C-terminal domain of Tyrosinase present in Aspergillus oryzae. Tyrosinase is a dinuclear copper monooxygenase/oxidase that plays a crucial role in the melanin pigment biosynthesis. The C-terminal domain is referred to as the shielding domain as it prohibits substrate access to the enzyme-active site and blocks the oxidase/oxygenase activity to avoid undesirable intracellular reactions of highly reactive quinonoid products. This means the domain may play an important role in regulating the enzyme activity. Two of the three cysteines (Cys522, and Cys525) that play a significant role in the copper incorporation process belong to the C-terminal domain. 122
60718 407965 pfam18133 HydF_tetramer Hydrogen maturase F tetramerization domain. This is the C-terminal domain found in Hydrogen maturase F (HydF) present in Thermotoga neapolitana. HydF is a GTPase, containing an FeS cluster-binding motif, that is able to activate HydA produced so that HydA can drive the reversible reduction of protons to molecular H2. This domain is known as domain III, and is primarily responsible for homotetramer formation. Interactions between the two FeS cluster-binding domains worth noting are the interactions between beta-2 strands, the initial part of the long loop that connects strand beta-2 to strand beta-3, and the loop that connects strand beta-1 to helix alpha-3. There are three highly conserved cysteine residues (Cys-302, Cys-353, and Cys-356) that represent the FeS cluster-binding site which form a superficial pocket. 118
60719 407966 pfam18134 AGS_C Adenylyl/Guanylyl and SMODS C-terminal sensor domain. Predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by bacterial adenylyl/guanylyl cyclase domains. The sensing of ligands by AGS-C is predicted to activate effectors deployed by a class of conflict systems which are reliant on the on the production and sensing of the nucleotide second messengers. 129
60720 407967 pfam18135 Type_ISP_C Type ISP C-terminal specificity domain. This is the C-terminal domain of Type ISP restriction-modification enzyme LLaBIII present in Lactococcus lactis subsp. cremoris. Type ISP restriction-modification (RM) enzymes provide a potent defence against infection by foreign and bacteriophage DNA. This domain interacts extensively with DNA and is known as the target recognition domain (TRD). TRD works by recognising 6/7 base pairs of asymmetric sequence. 342
60721 407968 pfam18136 DNApol_Exo DNA mitochondrial polymerase exonuclease domain. This domain belongs to human mitochondrial DNA polymerase (Pol-gamma). Pol-gamma has a catalytic subunit, Pol gamma-A, which possesses both polymerase and proofreading exonuclease activities and an accessory subunit, Pol gamma-B, which accelerates polymerization rate and suppresses exonuclease activity. This domain is the exonuclease domain of the catalytic subunit, Pol gamma-A. 282
60722 407969 pfam18137 ORC_WH_C Origin recognition complex winged helix C-terminal. This is the C-terminal winged-helix (WH) DNA-binding domain of the origin recognition complex present in Drosophila melanogaster. The WH domain is responsible for recognizing origin sequences. 131
60723 407970 pfam18138 bacHORMA_1 Bacterial HORMA domain family 1. Family of bacterial HORMA domains found in conserved genome contexts with Pch2/TRIP13 P-loop NTPases. Acts as a 'third component' in a broad class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Together with Pch2/TRIP13, could act as co-effectors or in regulation of other effectors of the systems. 169
60724 407971 pfam18139 LSDAT_euk SLOG in TRPM. Family in the SLOG superfamily, found in several eukaryotic channels including diverse ciliate channels and the TRPM class of animal ion channels. Positioned near the N-terminus of all TRPM channels, it is predicted to play a regulatory role for the channel in potentially recognizing a universal nucleotide or nucleotide-derived ligand. 266
60725 407972 pfam18140 PCC_BT Propionyl-coenzyme A carboxylase BT domain. This domain is found in Propionyl-coenzyme A carboxylase (PCC), present in Roseobacter denitrificans. PCC is a mitochondrial biotin-dependent enzyme that is essential for the catabolism of certain amino acids, cholesterol, and fatty acids with an odd number of carbon atoms. Since this domain mediates biotin carboxylase-carboxyltransferase interactions it is referred to as the BT domain. The BT domain is located between biotin carboxylase and the biotin carboxyl carrier protein domains. The BT domain shares some structural similarity with the pyruvate carboxylase tetramerization domain of pyruvate carboxylase. 126
60726 407973 pfam18141 DUF5599 Domain of unknown function (DUF5599). This domain is found in UPF 1 present in Homo sapiens. UPF 1 is involved in the initiation of nonsense-mediated decay, the process of degradation of transcripts containing premature termination codons in order to control the quality of mRNA. The domain is referred to as 1B. 93
60727 407974 pfam18142 SLATT_fungal SMODS and SLOG-associating 2TM effector domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict. 121
60728 407975 pfam18143 HAD_SAK_2 HAD domain in Swiss Army Knife RNA repair proteins. Family of HAD domain phosphoesterases observed in large eukaryotic proteins with predicted role in RNA repair, the so-called 'Swiss Army Knife' repair proteins. May be involved in phosphate group removal during RNA re-ligation. 143
60729 407976 pfam18144 SMODS Second Messenger Oligonucleotide or Dinucleotide Synthetase domain. Nucleotide synthetase enzyme of the DNA polymerase beta superfamily. Experimental studies have demonstrated cGAMP synthetase activity in the Vibrio cholerae DncV protein, a member of the SMODS family. The diversity inherent to the SMODS family suggests members of the family could generate a range of nucleotides, cyclic and/or linear. The nucleotide second messengers generated by the SMODS domains are predicted to activate effectors in a class of conflict systems reliant on the production and sensing of the nucleotide second messengers. 164
60730 407977 pfam18145 SAVED SMODS-associated and fused to various effectors sensor domain. Predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers. 189
60731 407978 pfam18146 CinA_KH Damage-inducible protein CinA KH domain. This domain is found in competence-induced protein A (CinA) present in Thermus thermophilus. CinA is important in the horizontal transfer of genes via competence and may also participate in the pyridine nucleotide cycle, which recycles products formed by non-redox uses of NAD. This domain has a KH-type fold and contains the absolutely conserved Glu-187, which stabilizes the binding of Mg2+ and hence polarizes the P=O bond for hydrolysis. A major feature of the T. thermophilus CinA structure is the asymmetry in the dimer, which is caused by contact between a KH-type domain on the opposite chain and the bound ADP-ribose. This has the effect of closing the active site, allowing additional recognition of ADP-ribose by residues from the KH-type domain. 73
60732 407979 pfam18147 Suv3_C_1 Suv3 C-terminal domain 1. This domain is found in Suv3 present in Homo sapiens. Suv3 is an NTP-dependent RNA/DNA helicase that is necessary for the degradation of mature mtRNAs. Suv3 has been found to interact in vitro with polynucleotide phosphorylase. This domain makes up part of the C-terminal domain. 41
60733 375589 pfam18148 RGS_DHEX Regulator of G-protein signalling DHEX domain. This domain is found in RGS9 (Class C) regulator of G-protein signalling (RGS) protein present in Mus musculus. RGS proteins attenuate heterotrimeric G-protein signalling by enhancing the intrinsic GTPase activity of G-alpha subunits and are vital for proper signal transduction kinetics. The domain is referred to as DEP helical extension (DHEX) because it is located next to N-terminal Dishevelled/Egl-10/Pleckstrin homology (DEP) domain. Both the DEP and DHEX domains are necessary, but not sufficient, to bind anchoring proteins such as RGS9 anchor protein. DHEX has no close structural homologs. 100
60734 407980 pfam18149 Helicase_PWI N-terminal helicase PWI domain. This domain is found in spliceosomal RNA helicase Brr2. Brr2 is required for the assembly of a catalytically active spliceosome on a messenger RNA precursor. The domain is found in the N-terminal region and is non-canonically PWI-like. The PWI-like domain is thought to be involved in protein-protein interactions. 111
60735 407981 pfam18150 DUF5600 Domain of unknown function (DUF5600). This domain can be found in EH-domain-containing ATPase 2 (EHD2) present in Mus musculus. The domain is helical in nature and has extensive contacts with the G-domain. 107
60736 407982 pfam18151 DUF5601 Domain of unknown function (DUF5601). This domain is found in the catalytic core RABEX-5 present in Homo sapiens. RABEX, also known as Rab GTPase exchange factors, regulate endocytic trafficking through activation of the Rab families RAB5, RAB21 and RAB22. The domain is helical in nature. 65
60737 407983 pfam18152 DAHP_snth_FXD DAHP synthase ferredoxin-like domain. This domain is found in 3-Deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS) present in Thermotoga maritima. DAHPS catalyzes the first reaction of the aromatic biosynthetic pathway in bacteria, fungi, and plants, the condensation of PEP and E4P with the formation of DAHP. The domain is ferredoxin-like and is thought to play a critical role in feedback regulation of the enzyme. 67
60738 407984 pfam18153 S_2TMBeta SMODS-associating 2TM, beta-strand rich effector domain. Predicted sensor/effector coupled domain which occurs in conserved genome contexts with the SMODS nucleotide synthetase. In addition to the predicted pore-forming 2TM region, the domain contains seven predicted beta-strands, suggestive of a lipocalin-like beta-barrel structure which could act as the sensor which activates the pore-forming effector response. 180
60739 407985 pfam18154 pPIWI_RE_REase REase associating with pPIWI_RE. Restriction endonuclease (REase) domain family, found in a conserved three-gene island that also contains a DinG-type helicase and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication. 118
60740 407986 pfam18155 pPIWI_RE_Z pPIWI RE three-gene island domain Z. Poorly-understood domain observed N-terminal to DinG-type helicase, which is part of a conserved three-gene island also containing a REase domain and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication. 166
60741 407987 pfam18156 pPIWI_RE_Y pPIWI_RE three-gene island domain Y. Poorly-understood domain observed N-terminal to restriction endonuclease (REase) domain, which is part of a conserved three-gene island also containing a DinG-type helicase and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication. 144
60742 407988 pfam18157 MID_pPIWI_RE MID domain of pPIWI_RE. MID domain of the pPIWI_RE PIWI/Argonaute module. pPIWI_RE is found in a conserved three-gene island that also contains a DinG-type helicase and an REase nuclease. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication. 142
60743 407989 pfam18158 AidB_N Adaptive response protein AidB N-terminal domain. This is the N-terminal domain of Adaptive response protein AidB present in E. coli. AidB is upregulated in response to small doses of DNA-methylating agents and initiates a response that mitigates the mutagenic and cytotoxic effects of DNA methylation. Tetramer formation is thought to be carried out by the N-terminal domain. 156
60744 407990 pfam18159 S_4TM SMODS-associating 4TM effector domain. Predicted pore-forming effector domain found in conserved genome contexts with diverse nucleotide synthetases including the SMODS synthetases. Predicted to function as a pore-forming effector in a class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. S-4TM domains are predicted to initiate cell suicide responses upon their activation. 291
60745 407991 pfam18160 SLATT_5 SMODS and SLOG-associating 2TM effector domain family 5. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential. 190
60746 407992 pfam18161 ISP1_C ISP1 C-terminal. This is the C-terminal domain of ISP1 protein, which plays a role in asexual daughter cell formation, such as in Toxoplasma gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an inter-strand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid interactions and play a role in mediating membrane localization through IP binding. However, the phospholipid-binding properties of PH domains are not conserved in TgISP1. Unlike PH domains, ISP1 is cysteine rich. The cysteine-rich nature of the ISPs and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. A disulfide bond between beta 2 and beta 3 is likely a structural feature of the ISP1, as both cysteines appear broadly conserved. 107
60747 407993 pfam18162 Arc_C Arc C-lobe. This is the C-terminal domain of Arc protein present in Rattus norvegicus. The Arc protein modulates the trafficking of AMPA-type glutamate receptors. This domain's tertiary structure is similar to the capsid domain of HIV gag protein. The domain is thought to have evolved from the capsid domain of Ty3/Gypsy retrotransposon. 83
60748 407994 pfam18163 LD_cluster2 SLOG cluster2. Family in the SLOG superfamily, observed associating with distinct effector domains including the patatin lipase or a protein containing one enzymatically active and one inactive copy of the TIR domain. 262
60749 407995 pfam18164 GNAT_C GNAT-like C-terminal domain. This is the C-terminal domain found in N-acyltransferase (NAT) proteins present in Actinoplanes teichomyceticus. In this organism, NAT proteins are responsible for N-acylation in the synthesis of the antibiotic teicoplanin. The C-terminal domain undergoes a substantial conformational change upon binding to Acyl-CoA. The C-terminal domain is considered Gcn5-related N-acetyltransferase like (GNAT-like) but differs from the canonical GNAT fold in that it lacks the first beta strand and has an additional four alpha helices. 141
60750 407996 pfam18165 pP_pnuc_1 Predicted pPIWI-associating nuclease. Predicted nuclease effector domain associating with prokaryotic PIWI-centered conflict systems. 135
60751 407997 pfam18166 pP_pnuc_2 Predicted pPIWI-associating nuclease. Predicted nuclease effector domain associating with prokaryotic PIWI-centered conflict systems. 122
60752 407998 pfam18167 Sa_NUDIX SMODS-associated NUDIX domain. NUDIX domain with distinctive features in the substrate-interacting region observed associating with SMODS domain synthetases. Predicted to cleave nucleotide diphosphate bonds, potentially to regulate flux through the pore formed by fused 2TM module. 199
60753 407999 pfam18168 PPL5 Prim-pol family 5. Family of prim-pol enzymes currently known only in kinetoplastids. 321
60754 408000 pfam18169 SLATT_6 SMODS and SLOG-associating 2TM effector domain 6. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems. 176
60755 408001 pfam18171 LSDAT_prok SLOG in TRPM, prokaryote. Family in the SLOG superfamily, fused to or operonically associating with SLATT domain in diverse prokaryotes. Predicted to function as ligand sensor in conjunction with the SLATT transmembrane domain. 194
60756 408002 pfam18172 LepB_GAP_N LepB GAP domain N-terminal subdomain. This is a subdomain of a Rab GTPase-activating protein (GAP) effector from Legionella pneumophila. This GAP modulates Rab enzymes that act as molecular switches in regulating vesicular transport in eukaryotic cells. This N-terminal subdomain belongs to the GAP domain of the protein. The catalytic arginine finger (Arg444) is located within this sub-domain and it is the only arginine residue required for GAP activity. 189
60757 375605 pfam18173 bacHORMA_2 Bacterial HORMA domain 2. Family of bacterial HORMA domains found in conserved genome contexts with Pch2/TRIP13 P-loop NTPases. Acts as a 'third component' in a broad class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Together with Pch2/TRIP13, could act as co-effectors or in regulation of other effectors of the systems. 166
60758 408003 pfam18174 HU-CCDC81_bac_1 CCDC81-like prokaryotic HU domain 1. First of two HU domains found in bacterial proteins typically fused to a C-terminal transmembrane helix and an extracellular peptidoglycan-binding domain. The HU domains in many of these proteins are predicted to function in tethering the nucleoid to the cell envelope. 59
60759 408004 pfam18175 HU-CCDC81_bac_2 CCDC81-like prokaryotic HU domain 2. Second of two HU domains found in bacterial proteins typically fused to a C-terminal transmembrane helix and an extracellular peptidoglycan-binding domain. The HU domains in many of these proteins are predicted to function in tethering the nucleoid to the cell envelope. 70
60760 408005 pfam18176 KptA_kDCL KptA in kinetoplastid DICER domain. KptA ADP-ribosyltransferase domain observed in kinetoplastid DICER-like (DCL) enzymes. Appears to have lost most residues required for catalyzing phospho-transfer to NAD; however, several positively-charged residues implicated in RNA end recognition remain well-conserved. 158
60761 408006 pfam18177 La_HTH_kDCL La HTH in kinetoplastid DICER domain. Winged HTH domain family observed in kinetoplastid DICER-like (DCL) enzymes, situated N-terminal to the KptA_kDCL domain pfam18176. 89
60762 408007 pfam18178 TPALS TIR- and PNP-associating SLOG family. Family in the SLOG superfamily associating with predicted TIR- and PNP-like effector domains. Members of this family are predicted to function as sensors of nucleotide or nucleotide-derived ligands, which are likely processed or modified by the associating effectors. Often co-occur genomically with the bacterial HORMA and Pch2/TRIP13 domains. 232
60763 408008 pfam18179 SUa-2TM SMODS- and Ubiquitin system-associated 2TM effector domain. Predicted pore-forming effector domain observed exclusively in conserved genome contexts with the SMODS nucleotide synthetases and the bacterial Ubiquitin conjugation systems. 278
60764 408009 pfam18180 LD_cluster3 SLOG cluster3 family. Family in the SLOG superfamily, observed to associate with a predicted effector protein containing one enzymatically active and one inactive copy of the TIR domain. 170
60765 408010 pfam18181 SLATT_1 SMODS and SLOG-associating 2TM effector domain 1. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain. 122
60766 408011 pfam18182 mCpol minimal CRISPR polymerase domain. Minimal version of the CRISPR polymerase domain. Predicted to generate cyclic nucleotides, potentially sensed by CARF domains which in turn activate various effector domains including HEPN RNases. CARF sensors and effectors are found in conserved genome contexts. Part of a broader class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Implicates CRISPR polymerase of the Type III CRISPR/Cas systems in a nucleotide synthetase functional role. 114
60767 408012 pfam18183 SLATT_2 SMODS and SLOG-associating 2TM effector domain 2. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations. 192
60768 408013 pfam18184 SLATT_3 SMODS and SLOG-associating 2TM effector domain 3. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. 156
60769 408014 pfam18185 STALD Sir2- and TIR-associating SLOG family. Family in the SLOG superfamily, associating with predicted Sir2- and TIR-like effector domains. Members of this family are predicted to function as sensors of nucleotide or nucleotide-derived ligands, which are likely processed or modified by the associating effectors. 207
60770 408015 pfam18186 SLATT_4 SMODS and SLOG-associating 2TM effector domain family 4. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems. 165
60771 408016 pfam18187 RIF5_SNase_1 TbRIF5 SNase domain 1. Staphylococcus nuclease (SNase) domain family found in the Trypanosoma brucei TbRIF5 protein, which could contribute to the processing of dsRNA targets. 166
60772 408017 pfam18188 PPL4 Prim-pol 4. Family of prim-pol enzymes with predicted roles in RNA processing and repair, potentially acting independently of a DNA template. One member of the family is fused to kinetoplastid DICER-like protein 1 (DCL1). 159
60773 408018 pfam18189 RIF5_SNase_2 TbRIF5 SNase domain 2. Staphylococcus nuclease (SNase) domain family found in the Trypanosoma brucei TbRIF5 protein, which could contribute to the processing of dsRNA targets. 179
60774 375619 pfam18190 Plk4_PB1 Polo-like Kinase 4 Polo Box 1. This domain is found in Polo-like kinase 4 (Plk4) present in Drosophila melanogaster. Plk4 is a conserved component in the duplication pathway of centrioles which is needed to prevent chromosomal instability. The domain is Polo Box 1 (PB1) and has a pseudo-symmetric dimerization interface across PB1-PB1. 107
60775 408019 pfam18191 PnpCD_PnpD_N Hydroquinone 1,2-dioxygenase large subunit N-terminal. This is the N-terminal domain of the alpha subunit, known as PnpD, of Hydroquinone 1,2-dioxygenase (PnpCD) present in Pseudomonas sp. strain WBC-3. PnpCD is the key enzyme in the degradation pathway of pollutant para-nitrophenol (PNP). The N-terminal domain residues Trp-76 and Phe-79 are indispensable in the formation of the active site pocket. The N-terminal domain also plays a vital role in formation of the heterotetrameric structure. Structural homologs of the N-terminal domain exhibit the nature to bind nucleic acids but due to the steric effect of the C-terminal domain, this N-terminal domain cannot bind nucleic acids. 151
60776 408020 pfam18192 DNTTIP1_dimer DNTTIP1 dimerisation domain. This is the N-terminal domain of DNTTIP1, a protein that forms part of a novel histone deacetylase complex present in Homo sapiens. Histone deacetylase complexes comprise DNTTIP1, histone deacetylase (HDAC) and the repressor protein MIDEAS. The acetylation of histone tails plays a critical role in determining the accessibility of chromatin to transcriptional regulators and RNA polymerase complexes. This N-terminal domain is responsible for dimerization of histone deacetylase 1 (HDAC1). The N-terminal domain also interacts and mediates the assembly of the HDAC1-MIDEAS complex. 69
60777 408021 pfam18193 Fibrillin_U_N Fibrillin 1 unique N-terminal domain. This is the N-terminal domain of human fibrillin-1. Fibrillin is a primary constituent of microfibrils in the extracellular matrix of many elastic and non-elastic connective tissues. This domain, known as the fibrillin unique N-terminal (FUN) domain, constitutes the minimal interaction site for the fibrillin C terminus. The FUN domain has homologs in the human proteins LTBP-1L/2 and VWCE, which are also associated with a C-terminal EGF-like domain. 37
60778 408022 pfam18194 Xrn1_D3 Exoribonuclease 1 Domain-3. This domain is found in 5' to 3' exoribonuclease 1 (XRN1) present in Kluyveromyces lactis. XRN1 is involved in transcription, RNA metabolism, and RNA interference. This domain, known as D3, is the third of four domains located far from the active site. These four domains may help to stabilize the N-terminal segment of Xrn1 for catalysis. 71
60779 408023 pfam18195 GatD_N GatD N-terminal domain. This is the N-terminal domain of GatD protein present in Pyrococcus abyssi. Two GatD and two GatE associate to form a tetramer complex. The tetramer complex is able to mature Glutamic acid-tRNA Glutamine into Glutamine-tRNA Glutamine, a necessary step in the translation of proteins. The N-terminal domain is involved in anchoring GatD to GatE in order to form the tetramer. 53
60780 408024 pfam18196 Cdh1_DBD_1 Chromodomain helicase DNA-binding domain 1. This domain can be found in chromodomain helicase DNA-binding protein 1 (Chd1) present in Saccharomyces cerevisiae. Chd1 proteins have been associated with the efficient assembly and spacing of nucleosomes. The domain consists of four helices, alpha helix 1-4, and can be divided into regions SANT, HL1 and the beta-linker. The HL1 region comprises some 40 residues specific to budding yeast that are unlikely to form such a prominent feature or perform a conserved function in other species. The domain itself forms part of the DNA-binding domain. Basic residues on the alpha-1 helix are thought to be important for DNA interaction. 120
60781 380135 pfam18197 DUF5602 Domain of unknown function (DUF5602). This domain is found in TTHB210 protein present in Thermus thermophilus. TTHB210 is a Sigma-E factor regulated gene product that forms a homodecamer. This domain is chain G and can be classified with chains A, C, E and I based on its folds. 50
60782 408025 pfam18198 AAA_lid_11 Dynein heavy chain AAA lid domain. This family represents the AAA lid domain found near the C-terminal region of dynein heavy chain. 156
60783 408026 pfam18199 Dynein_C Dynein heavy chain C-terminal domain. This family represents the C-terminal domain of dynein heavy chain. This domain is a complex structure comprising six alpha-helices and an incomplete six-stranded antiparallel beta-barrel. The shape of this domain is distinctively flat, spreading over the AAA1, AAA5 and AAA6 domain. 303
60784 408027 pfam18200 Big_11 Bacterial Ig-like domain. This presumed domain is found repeated in bacterial cell surface proteins. 79
60785 408028 pfam18201 PIH1_CS PIH1 CS-like domain. This domain is found in yeast PIH1 and its homologues. This domain consists of a seven-stranded beta sandwich with the topology of a CS domain, a structural motif also found in Hsp90 co-chaperones such as p23/Sba1 and Sgt1. 100
60786 408029 pfam18202 TQ T-Q ester bond containing domain. This domain is found in gram positive bacterial surface proteins. It contains a very unusual isopeptide bond between a conserved N-terminal threonine residue on the first beta strand of the Ig-like fold and a glutamine residue in the final strand of the domain. 125
60787 408030 pfam18203 IPTL-CTERM IPTL-CTERM motif. This entry represents a predicted C-terminal sorting motif. 28
60788 408031 pfam18204 PGF-CTERM PGF-CTERM motif. 23
60789 408032 pfam18205 VPDSG-CTERM VPDSG-CTERM motif. The PEP-CTERM/exosortase system has been previously identified through in silico analysis. This entry describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria. 26
60790 408033 pfam18206 Porphyrn_cat_1 Porphyranase catalytic subdomain 1. This domain is found in porphyranase protein present in Bacteroides plebeius. Porphyranase breaks down porphyran during digestion of red seaweed glycans. It is worth noting that red seaweed glycans contain sulfate esters that are absent in terrestrial plants. This domain makes up part of the catalytic domain of the porphyranase protein. 105
60791 408034 pfam18207 LIFR_N Leukemia inhibitory factor receptor N-terminal domain. This domain can be found in leukemia inhibitory factor receptor (LIFR). LIFR is a cell surface receptor that mediates the actions of LIF and other interleukin-6 type cytokines through the formation of signalling complexes with gp130. This is the N-terminal domain, referred to as domain 1, which contains conserved disulfide bonds between Cys-10 to Cys-20 and Cys-37 to Cys-45. 75
60792 408035 pfam18208 NES_C_h Nicking enzyme C-terminal middle helical domain. This domain is found in nicking enzyme in S. aureus. NES initiates and terminates the transfer of plasmids that variously confer resistance to a range of drugs, including vancomycin and gentamicin. This domain is found in the C-terminal region of NES. The C-terminal region is required for conjugation and significantly impacts the catalytic activity of the N-terminal relaxase. 109
60793 408036 pfam18209 ESF1 Embryo surrounding factor 1. This domain is found in Embryo surrounding factor 1 (ESF1) protein present in Arabidopsis thaliana. Maternally contributed central cell ESF1 peptides play an important role in suspensor formation and pro-embryo development. The biological activity of ESF1 depends on structural topology, which is stabilized by disulfide bonds. 56
60794 408037 pfam18210 Knl1_RWD_C Knl1 RWD C-terminal domain. This domain is found in Knl1, a sub-unit of the KMN network, present in Homo sapiens. The KMN network is the core of the outer kinetochore which is responsible for microtubule binding/stabilization and controls the spindle assembly checkpoint. This domain is the second of two RING finger, WD repeat, DEAD-like helicase (RWD) domains. The tandem RWD domains mediate kinetochore targeting of the microtubule-binding subunits by interacting with the Mis12 complex. The Mis12 complex is a KMN sub-complex that tethers directly onto the underlying chromatin layer. 98
60795 408038 pfam18211 Csm1_B Csm1 subunit domain B. This domain is found in the Csm1 subunit of the Csm complex found in Thermococcus onnurineus. Csm is a type III-A CRISPR-Cas system, which is an RNA-guided immune defense mechanism that detects and destroys foreign DNA or RNA. This domain is known as domain A and is positioned side by side with domain C. Both domain A and domain C adopt the BABBA topology. Domain A interacts primarily with domain B. 94
60796 408039 pfam18212 ZNRF_3_ecto ZNRF-3 Ectodomain. This domain is found in ZNRF-3 protein present in Danio rerio. ZNRF-3 is a transmembrane E3 ubiquitin ligase that antagonizes Wnt signalling, the signalling system used to mediate Rspo protein actions. ZNFR3 and RNF43, alongside the Rspo proteins, have emerged as a system with therapeutic potential for a number of pathological processes. This domain is known as the ectodomain. 104
60797 408040 pfam18213 SUB1_ProdP9 SUB1 protease Prodomain ProdP9. This domain is the bound prodomain fragment ProdP9 of the SUB1 protein present in Plasmodium falciparum. SUB1 is a serine protease that processes a subset of parasite proteins that play indispensable roles in egress and invasion. The C-terminal stalk of ProdP9 binds in the active site groove in a substrate-like manner and is truncated at the N terminus as a result of the chymotrypsin digestion step used during purification. ProdP9 is structurally similar to MIC5 from Toxoplasma gondii, despite low sequence identity. 80
60798 408041 pfam18214 STATa_Ig STATa Immunoglobulin-like domain. This domain is found in signal transducer and activator of transcription A protein (STATa) present in Dictyostelium discoideum (dd). STATa is responsible for transcriptionally regulating cellular differentiation in Dictyostelium discoideum. ddSTATa is the only non-metazoan known to employ SH2 domain signaling. This domain adopts an Immunoglobulin-like fold. 122
60799 408042 pfam18215 Rtt106_N Histone chaperone Rtt106 N-terminal domain. This is the N-terminal domain of Rtt106 in Saccharomyces cerevisiae. Rtt106 is a histone chaperone that contributes to the deposition of newly synthesized acetylated Histone 3 Lysine 56 (H3K56ac) carrying H3-H4 complex on replicating DNA. The N-terminal domain of Rtt106 homodimerizes and interacts with H3-H4 independently of acetylation. 45
60800 375643 pfam18216 N_formyltrans_C N-formyltransferase dimerization C-terminal domain. This is the C-terminal domain of N-formyltransferase found in Francisella tularensis. N-formylated sugars are observed on O-antigens of pathogenic Gram-negative bacteria. This C-terminal domain is responsible for dimerization. In particular, the beta hairpin motif present in the domain helps create a subunit-subunit interface. The dimeric interface is characterized by a hydrophobic patch formed by Ile 195, Leu 197, Val 201, Met 203, Ile 207, Phe 223, Val 231, Val 233, Leu 235, and Leu 237 from both monomers. 52
60801 408043 pfam18217 Zap1_zf2 Zap1 zinc finger 2. This domain can be found in Zap1 present in Saccharomyces cerevisiae. Zap1 regulates S. cerevisiae which mediates the transcription of genes encoding uptake vacuolar transporters. This domain corresponds to zinc finger 2 (zf2) which has been shown to be a constitutive transcriptional activator. The interaction between the two zinc fingers stabilizes Zn(II) binding. 24
60802 408044 pfam18218 Spa1_C Lantibiotic immunity protein Spa1 C-terminal domain. This is the C-terminal domain found in SpaI present in Bacillus subtilis. SpaI is an immunity lipoprotein that protects the Gram-positive bacteria against their own lantibiotics, in this case subtilin. SpaI together with the ABC transporter SpaFEG protects the membrane from subtilin insertion. 99
60803 408045 pfam18219 SidC_N SidC N-terminal domain. This is the N-terminal domain of SidC present in Legionella pneumophila. SidC appears to be involved in modulating mammalian trafficking by promoting the communication between ER-derived vesicles and the Legionella-containing vacuole. The N-terminal domain (SidC-N) has a novel fold with 4 potential subdomains. SidC-N does not show structural similarity to any known protein domain in the Protein Data Bank. 466
60804 408046 pfam18220 BspA_v Adhesin BspA variable domain. This domain is found in BspA protein present in Streptococcus agalactiae. BspA is an antigen I/II family polypeptide that confers adhesion linked to pathogenesis in group B Streptococcus. This domain is referred to as the variable domain (BspA-V). BspA-V is responsible for binding to scavenger receptor gp340. BspA-V adopts a fold that is distinct from those of other AgI/II family polypeptide variable domains. 150
60805 375648 pfam18221 MU2_FHA Mutator 2 Fork head associated domain. This is the N-terminal forkhead-associated (FHA) domain found in Drosophila mutator 2 (MU2) protein. FHA domains are generally phosphothreonine (pThr) specific-binding domains and are present in DNA repair and checkpoint proteins. However, phosphothreonine binding is not conserved in the MU2 active site pocket due to the absence of three key residues needed for pThr binding. Dimerization is conserved between FHA domains of Drosophila MU2 and human MDC1, albeit through different interfaces. The MU2 FHA domain dimerizes via beta-sheet 2, whereas the MDC1 FHA domain dimerizes via the opposite beta-sheet 1. 94
60806 408047 pfam18222 PilN_bio_d PilN biogenesis protein dimerization domain. This domain is found in PilN type IV pilus biogenesis protein present in Thermus thermophilus. PilN is an integral inner membrane protein needed for the formation of type IV pilus. This domain forms a dimer which is mediated by symmetric contacts between residues in alpha-1, beta-1, beta-3 and alpha-3. 102
60807 408048 pfam18223 PilJ_C Pili PilJ C-terminal domain. This is the C-terminal domain of PilJ, a Type IV pilin found in gram-positive Clostridium difficile. Incorporation of PilJ into pili exposes the C-terminal domain of PilJ to create a novel interaction surface. This C-terminal domain is not observed in other Type IV pilin proteins. 95
60808 408049 pfam18224 ToxB_N ToxB N-terminal domain. This is the N-terminal domain of ToxB found in Pyrenophora tritici-repentis. This domain is crucial for toxin activity. There are only two amino acid differences between ToxB and toxb, an inactive homolog of ToxB. These two differences are a Val at position 3 in ToxB compared to a Thr in toxb, and an Ala at position 12 in ToxB compared to a Val in toxb. AvrPiz-t, a secreted avirulence protein produced by the rice blast fungus, is a structural homolog to ToxB. 61
60809 408050 pfam18225 AbfS_sensor Sensor histidine kinase (AbfS) sensor domain. This is the sensor domain of sensor histidine kinase (AbfS) present in Cellvibrio japonicus. AbfS forms part of the AbfR/S two-component system which is needed to activate the expression of the suite of enzymes that remove the numerous side chains from xylan. The overall fold of the sensor domain is that of a classical Per-Arnt-Sim domain. 65
60810 408051 pfam18226 QslA_E LasR-specific antiactivator QslA chain E. This domain is chain E of QslA present in Pseudomonas aeruginosa. QslA is an antiactivator which binds to the transcription factor LasR, disrupting its dimerization and preventing LasR from binding to target DNA. Chain E interacts with chain F of QslA and forms the dimerization interface. Chain E also interacts with chain A of ligand-binding domain in LasR. 71
60811 408052 pfam18227 LepB_GAP_C LepB GAP domain C-terminal subdomain. This subdomain is found in the Rab1 GTPase-activating protein (GAP) domain of GAP LepB present in Legionella pneumophila. LepB inactivates Rab1, a key regulator of the secretory vesicular trafficking machinery, by acting as a GTPase-activating protein. LepB is also an antagonist of DrrA, which promotes Rab1. LepB acts by an atypical RabGAP mechanism that is reminiscent of classical GAPs. This is the C-terminal subdomain of the GAP domain and consists of an unusual fold. 80
60812 375655 pfam18228 CdiI_N CdiI N-terminal domain. This is the N-terminal domain of Contact-dependent growth inhibition immunity (CdiI) proteins present in Enterobacter cloacae. CdiI proteins neutralize CdiA-CT toxins to protect toxin-producing cells from auto-inhibition. Structural homology searches reveal that Enterobacter cloacae's CdiI is most similar to the Whirly family of single-stranded DNA-binding protein. 108
60813 408053 pfam18229 GcnA_N N-acetyl-beta-D-glucosaminidase N-terminal domain. This is the N-terminal domain found in N-acetyl-beta-D-glucosaminidase (GcnA) present in Streptococcus gordonii. GcnA is a family 20 glycosidase that cleaves N-acetyl-beta-D-glucosamine and N-acetyl-beta-D-galactosamine from 4-methylumbelliferylated substrates. Similar N-terminal domains have been observed in all family 20 glycosidases although the number of beta-sheet strands may vary from five. 78
60814 408054 pfam18230 Glyc_hyd_38C_2 Glycosyl hydrolases family 38 C-terminal sub-domain. This is a subdomain found in the C-terminal region of golgi alpha-mannosidase II present in Drosophila melanogaster. These proteins are important in glycoprotein processing and are thought to cleave mannosidic bonds through a double displacement mechanism involving a reaction intermediate. This subdomain is found at the C-terminal of Glycosyl hydrolases family 38 C-terminal domain. 89
60815 375658 pfam18231 DUF5603 Domain of unknown function (DUF5603). This domain is found in the C-terminal region of free serine kinase (SerK) in the hyperthermophilic archaeon Thermococcus kodakarensis. SerK converts ADP and l-serine (Ser) into AMP and O-phospho-l-serine (Sep), which is a precursor of l-cysteine. The domain is not conserved in the ParB/Srx family. The differences between SerK and the other members of the ParB/Srx family are concentrated in the C-terminal region, which may include residues involved in Sep binding. 105
60816 375659 pfam18232 Chalcone_N Chalcone isomerase N-terminal domain. This is the N-terminal domain of chalcone isomerase present in Eubacterium ramulus. Chalcone isomerase is involved in the degradation pathway of flavone naringenin. 102
60817 408055 pfam18233 Cdc13_OB4_dimer Cdc13 OB4 dimerization domain. This domain is found in Cdc13 proteins in several Candida species. The Cdc13-Stn1-Ten1 complex is crucial for telomere protection. This domain is the C-terminal OB4 domain and is responsible for dimerization. Dimerization of Cdc13 is important for high-affinity DNA binding. 114
60818 408056 pfam18234 VioE Violacein biosynthetic enzyme VioE. This domain is VioE present in Chromobacterium violaceum. VioE plays a key role in the biosynthesis of violacein. Violacein has potential medical applications as an antibacterial, trypanocidal, anti-ulcerogenic and anti-cancer drug. VioE forms a homodimer with a chiefly hydrophobic interface between the two VioE monomers. The fact that VioE adopts a fold normally associated with lipoprotein carrier proteins may be due to VioE for binding the hydrophobic polyethylene glycol. 182
60819 408057 pfam18235 OST_P2 Oligosaccharyltransferase Peripheral 2 domain. This is a domain found in the C-terminal region of STT3 present in P. furiosus. STT3 is an Oligosaccharyltransferase which catalyzes the transfer of a heptasaccharide, containing one hexouronate and two pentose residues, onto peptides in an Asn-X-Thr/Ser-motif-dependent manner. This domain, known as the Peripheral 2 (P2) domain, encircles the central core domain. 133
60820 408058 pfam18236 AGO_N Argonaute N domain. This is the N domain often found in the N-terminal region of Argonaute (AGO) present in Kluyveromyces polyspora. AGO forms part of the RNA-induced silencing complex that mediates the gene silencing pathway, RNA interference. The N domain blocks the nucleic acid-binding channel and prevents propagation of guide-target pairing beyond position 16. 122
60821 375664 pfam18237 Tk-SP_N-pro Tk-SP N-propeptide domain. This is the N-propeptide domain found in Tk-SP, a subtilisin-like serine protease from Thermococcus kodakaraensis. The beta sheet of this domain packs tightly to the two nearly parallel alpha helices 2 and 3 located at the surface of the subtilisin domain. Gln105 and Asp107 of the N-propeptide domain also bind to the N-termini of these two alpha-helices to form helix caps. 67
60822 408059 pfam18238 LnmK_N_HDF LnmK N-terminal Hot Dog Fold domain. This domain is found in LnmK and is present in Streptomyces atroolivaceus. LnmK is a bifunctional acyltransferase/decarboxylase (AT/DC) that catalyzes first self-acylation using methylmalonyl-CoA as a substrate and subsequently trans-acylation of the methylmalonyl group to the phosphopantetheinyl group of the LnmL acyl carrier protein. LnmK is a homodimer composed of two monomeric double-hot-dog folds (DHDF). This domain is the N-terminal hot dog fold. 176
60823 408060 pfam18239 HA1 Hemagglutinin I. This domain is hemagglutinin I (HA1) present in Physarum polycephalum. Although the physiological function of the secreted HA1 remains to be established, HA1 recognizes cell wall polysaccharides of E. coli. The beta-sandwich fold of HA1, composed of two up and down beta-sheets, is conserved among other legume lectin-like proteins. The up and down beta-sheet region is a minimal carbohydrate recognition domain. 90
60824 408061 pfam18240 PSII_Pbs31 Photosystem II Psb31 protein. This domain is Psb31, an extrinsic protein found in photosystem II (PSII) present in Chaetoceros gracilis. Photosystem II (PSII) is a multisubunit, membrane protein complex located in the thylakoid membranes of oxygenic photosynthetic organisms from cyanobacteria to higher plants. The four helices in the N-terminal domain are arranged in an up-down-up-down fold and are similar in structure to PsbQ protein in Spinach, despite low sequence homology. 93
60825 408062 pfam18241 AvrM-A Flax-rust effector AvrM-A. This domain is found in AvrM-A present in Melampsora lini. AvrM-A is a natural variant of AvrM which is a secreted effector protein that can internalize into plant cells in the absence of the pathogen and bind to phosphoinositides. AvrM results in effector-triggered immunity. This domain makes up part of the C-terminal region, which is highly conserved in AvrM. The domain is required for M-dependent effector-triggered immunity. 147
60826 375669 pfam18242 LupA Legionella ubiquitin-specific protease A domain. This domain is found in Legionella ubiquitin-specific protease A (LupA). LupA removes a ubiquitin modification from LegC3 which inactivates the cognate effector. This domain is typical of eukaryotic ubiquitin proteases involved in deconjugation of ubiquitin or ubiquitin-like proteins from their targets. 178
60827 408063 pfam18243 BfiI_DBD Metal-independent restriction enzyme BfiI DNA binding domain. This domain is found in the metal-independent restriction enzyme BfiI present in Bacillus firmus. This domain is found in the C-terminal of the protein and is responsible for DNA binding. The domain exhibits a beta-barrel-like structure similar to the effector DNA-binding domain of the Mg2+ dependent restriction enzyme EcoRII and to the B3-like DNA-binding domain of plant transcription factors. 164
60828 408064 pfam18244 CttA_N Cellulose-binding protein CttA N-terminal domain. This is the N-terminal domain of cellulose-binding protein CttA present in Ruminococcus flavefaciens. CttA mediates attachment of the bacterial substrate via two carbohydrate-binding modules. The domain is known as the X-module and lacks a true hydrophobic core. Unlike the X-modules in other types of CohE-XDoc complexes it does not contribute to the binding surface. This X-module appears to serve as an extended spacer, which separates the cellulose-binding modules at the N terminus of CttA and the bacterial cell wall. The domain does not share structural similarity with other known X-modules from cellulolytic bacteria but does show similarity to G5-1 module of StrH from S. pneumoniae. 71
60829 408065 pfam18245 XRN1_DBM 5-3 exonuclease XRN1 DCP1-binding motif. This domain is found in the 5'-3' exonuclease (XRN1) present in Drosophila melanogaster. XRN1 degrades deadenylated mRNA that has recently been decapped by decapping enzyme 2 (DCP2). DCP2 associates with decapping activators DCP1 and EDC4. The direct interaction between DCP1 and XRN1 couples mRNA decapping to 5' exonucleolytic degradation. This domain is responsible for binding to DCP1. In particular, the helical C-terminal region of the domain contributes to the binding affinity and the specificity of the interaction. 26
60830 408066 pfam18246 OST_IS Oligosaccharyltransferase Insert domain. This is a domain found in STT3 present in P. furiosus. In P. furiosus, STT3 is an Oligosaccharyltransferase which catalyzes the transfer of a heptasaccharide, containing one hexouronate and two pentose residues, onto peptides in an Asn-X-Thr/Ser-motif-dependent manner. This domain is inserted into the central core (CC) domain and hence is referred to as the Insert (IS) domain. This IS domain contains a disulphide bond. 83
60831 375674 pfam18247 AvrM_N Flax-rust effector AvrM N-terminal domain. This is the N-terminal domain found in AvrM present in Melampsora lini. AvrM is a secreted effector protein that can internalize into plant cells in the absence of pathogens, binds to phosphoinositides and results in effector-triggered immunity. This domain is related to the WY domain core in oomycete effectors. 65
60832 375675 pfam18248 RalF_SCD RalF C-terminal Sec-7 capping domain. This is the C-terminal domain of RalF protein present in Legionella pneumophila. RalF is secreted into host cytosol via the Dot/Icm type IV transporter where it acts to recruit ADP-ribosylation factor (Arf) to pathogen-containing phagosomes in the establishment of a replicative organelle. This domain forms a cap over the active site in the Sec7 domain and so is referred to as the Sec7-capping domain. 147
60833 375676 pfam18249 Ca_bind_SSO6904 Calcium binding protein SSO6904. This domain is SSO6904 present in Sulfolobus solfataricus. SSO6904 is a calcium binding protein thought to have a weak affinity for other cations such as Mg2+ and Zn2+. The structure of SSO6904 is similar to that of saposin-fold proteins. Saposin proteins are membrane-interacting glycoproteins required for the hydrolysis of certain sphingolipids by specific lysosomal hydrolases. 90
60834 408067 pfam18250 Tgi2PP Effector immunity protein Tgi2PP. This domain is Tgi2PP found in Pseudomonas protegens. Tgi2PP is part of the Tge2PP-Tgi2PP effector-immunity pair secreted by the type VI secretion system (T6SS). Tgi2PP interacts predominantly by hydrogen bonding and hydrophobic interactions with Tge2PP via the insertion of the beta-sheet core of Tgi2PP into the substrate-binding groove of Tge2PP. Tgi2PP contains a similar topology to the periplasmic E. coli colicin M immunity protein. 46
60835 375678 pfam18251 Defensin_5 Fungal defensin Copsin. This domain is Copsin present in Coprinopsis cinerea. Copsin is a defensin that interferes with peptidoglycan synthesis and has a CS-alpha-beta fold. Copsin is stabilized by a unique connectivity of six cysteine bonds in contrast to most other CS-alpha-beta defensins which are linked by three or four disulfide bonds. 39
60836 408068 pfam18252 Cu_bind_CorA Copper(I)-binding protein CorA. This domain is found in CorA present in Methylomicrobium album. CorA is a copper repressible surface associated copper(I)-binding protein. CorA can bind one copper ion per protein molecule. The overall fold of CorA is similar to M. capsulatus protein MopE, including the unique copper(I)-binding site and most of the secondary structure elements. 175
60837 408069 pfam18253 HipN Hsp70-interacting protein N N-terminal domain. This is the N-terminal domain, known as HipN, found in Hsp70-interacting protein (Hip) present in Rattus norvegicus. Hip cooperates with the chaperone Hsp70 in protein folding and prevention of aggregation and may delay substrate release by slowing ADP dissociation from Hsp70. HipN is responsible for N-terminal homo-dimerization which is necessary so that the Hip dimer can interact with Hsp70 molecules. 42
60838 408070 pfam18254 HMw1_D2 HMW1 domain 2. This domain is found in Actinobacillus pleuropneumoniae HMW1C (ApHMW1C). HMW1 adhesin is an N-linked glycoprotein that mediates adherence to respiratory epithelium through N-glycosylation of protein acceptor sites and O-glycosylation of sugar acceptor sites. This domain forms an all alpha domain (AAD) when combined with the N-terminal domain. The AAD interacts extensively with the C-terminal GT-B fold in order to create a unique groove with the potential to accommodate the acceptor protein. 88
60839 408071 pfam18255 SAM_DrpA DNA processing protein A sterile alpha motif domain. This is the N-terminal domain found in DNA processing protein A (DprA) present in Streptococcus pneumoniae. DprA has recently been discovered to be a transformation-dedicated RecA loader. Transformation is believed to play a major role in genetic plasticity. This domain is known as the sterile alpha motif (SAM) domain. DprAs are able to form a type of dimer through SAM-SAM interactions, also known as N/N interactions. 62
60840 375683 pfam18256 HscB_4_cys Co-chaperone HscB tetracysteine metal binding motif. This is the N-terminal domain of human co-chaperone protein HscB (hHscB). This domain is capable of binding a metal ion through its tetracysteine metal binding motif. The metal atom is coordinated by a set of four cysteine residues (Cys41, Cys44, Cys58 and Cys61) on opposed beta-hairpins. Although the N-domain lacks any recognizable secondary structure elements, it has several distant structural homologs including C-4 zinc finger domains and rubredoxin. 27
60841 408072 pfam18257 DsbG_N Disulfide isomerase DsbG N-terminal. This is the N-terminal domain found in DsbG, a protein disulfide isomerase present in the periplasm of Helicobacter pylori. The formation of correct disulfide bonds is critical in the folding process of many secretory and membrane proteins in bacteria. Non-native disulfides are corrected by the isomerase DsbC, and, to a lesser extent, by DsbG. The N-terminal domain is involved in dimerization. The dimer interface of Helicobacter pylori's DsbG is stabilized by hydrophobic interactions and hydrogen bonds involving alpha 1, beta-3 to beta-4 loop, beta-4 and beta-4 to alpha-2 loop. This pattern of dimerization is similar to that of E. coli's DsbG. 90
60842 408073 pfam18258 IL4_i_Ig Interleukin-4 inducing immunoglobulin-binding domain. This domain is found in Interleukin-4 inducing protein alpha-1 (IPSE/alpha-1) present in Schistosoma mansoni, a parasite of humans. IPSE/alpha-1 triggers the release of IL-4 from basophils in the liver which is a major site of egg deposition during S. mansoni infection. This domain adopts a beta gamma-crystallin fold that is stabilized by three disulfide bonds within the domain (23/26, 59/93, and 111/121). The domain is involved in immunoglobulin binding. 89
60843 408074 pfam18259 CBM65_1 Carbohydrate binding module 65 domain 1. This domain is found in the non-catalytic carbohydrate binding module 65B (CBM65B) present in Eubacterium cellulosolvens. CBMs are present in plant cell wall degrading enzymes and are responsible for targeting, which enhances catalysis. CBM65s display higher affinity for oligosaccharides, such as cellohexaose, and particularly polysaccharides than cellotetraose, which fully occupies the core component of the substrate binding cleft. The concave surface presented by beta-sheet 2 comprises the beta-glucan binding site in CBM65s. C6 of all the backbone glucose moieties makes extensive hydrophobic interactions with the surface tryptophans of CBM65s. Three out of the four surface Trp are highly conserved. The conserved metal ion site typical of CBMs is absent in this CBM65 family. 113
60844 408075 pfam18260 Nab2p_Zf1 Nuclear polyadenylated RNA-binding 2 protein CCCH zinc finger 1. This domain is found in nuclear polyadenylated RNA-binding 2 protein (Nab2p) present in Saccharomyces cerevisiae. Nab2p is a major family of Poly A-binding proteins whose interactions are thought to be crucial for the control of poly(A) tail length. This domain is the first of seven CCCH zinc fingers which are responsible for polyadenosine RNA binding. When combined with the next three zinc fingers (Zf1-4), these four zinc fingers together may bind RNA in the 3' to 5' direction. 26
60845 408076 pfam18261 Rpn9_C Rpn9 C-terminal helix. This is the C-terminal domain found in Rpn9 present in Saccharomyces cerevisiae. Rpn9 is one of six PCI-domain-containing proteins that form the lid of the proteasome for ATP-dependent unfolding and hydrolysis of the polypeptide. Rpn9's C-terminal domain is not necessary for lid assembly with the exception of Rpn12, where the domain's absence prevents the association of Rpn12. 33
60846 408077 pfam18262 PhetRS_B1 Phe-tRNA synthetase beta subunit B1 domain. This is the N-terminal domain found in human cytosolic phenylalanyl tRNA synthetase beta subunit. 83
60847 408078 pfam18263 MCM6_C MCM6 C-terminal winged-helix domain. The minichromosome maintenance (Mcm) complex is the replicative helicase in eukaryotic species, that plays essential roles in the initiation and elongation phases of DNA replication. During late M and early G(1), the Mcm complex is loaded onto chromatin to form prereplicative complex in a Cdt1-dependent manner. This entry represents the C-terminal domain of human Mcm6 which is the Cdt1 binding domain (CBD). The structure of CBD exhibits a typical winged helix fold that is generally involved in protein-nucleic acid interaction. The CBD failed to interact with DNA in experiments. The CBD-Cdt1 interaction involves the helix-turn-helix motif of CBD. 107
60848 408079 pfam18264 preSET_CXC CXC domain. This domain is found to the N-terminus of the SET domain in the EZH2 protein. It is a zinc binding domain. 32
60849 408080 pfam18265 Nas2_N Nas2 N_terminal domain. Nas2 is a proteasome assembly chaperone. Nas2 bivalently binds the proteasome Rpt5 subunit. The Nas2 N-terminal helical domain masks the Rpt1-interacting surface of Rpt5. 79
60850 408081 pfam18266 Ncstrn_small Nicastrin small lobe. This domain is part of the protein Nicastrin, a component of gamma secretase present in Homo sapiens. Gamma-secretase is thought to contribute to Alzheimer's disease development by generating beta-amyloid peptides. This domain is known as the small lobe which forms the 'lid'. The lid is an extended surface loop that covers the hydrophilic pocket that is thought to be responsible for substrate recruitment. On substrate binding, the large lobe is thought to rotate relative to the small lobe. 169
60851 408082 pfam18267 Rubredoxin_C Rubredoxin NAD+ reductase C-terminal domain. This is the C-terminal domain of NADH rubredoxin oxidoreductase present in Clostridium acetobutylicum. The majority of obligatory anaerobes detoxify micro-aerobic environments by consuming O2 via H2O-forming NADH oxidase. This enzyme offers an alternate reaction pathway for scavenging of O2 and reactive oxygen species, wherein the reducing equivalent is obtained from NADH. 70
60852 408083 pfam18268 Hit1_C Hit1 C-terminal. This domain is found in Hit1 protein (Hit1p) present in Saccharomyces cerevisiae. Hit1p contributes to C/D small nucleolar RNPs (snoRNPs) stability and pre-RNA maturation kinetics by associating with U3 snoRNA precursors and influencing its 3'-end processing. Snu13p-Rsa1p-Hit1p heterotrimer binds C/D snoRNAs. C/D snoRNAs are essential for the biogenesis of ribosomes and spliceosomes. The domain adopts a Pac-Hit fold that forms a claw which locks the alpha-1 helix of Rsa1p, while the Rsa1p alpha-2 helix packs against the exposed surface of the Hit1p Pac-Hit domain alpha-3 helix. 82
60853 408084 pfam18269 T3SS_ATPase_C T3SS EscN ATPase C-terminal domain. This is the C-terminal domain of the EscN protein family of ATPases that form part of the Type III secretion system (T3SS) present in Escherichia coli. T3SS is a macromolecular complex that creates a syringe-like apparatus extending from the bacterial cytosol across three membranes to the eukaryotic cytosol. This process is essential for pathogenicity. EscN is a functionally unique ATPase that provides an inner-membrane recognition gate for the T3SS chaperone-virulence effector complexes as well as a potential source of energy for their subsequent secretion. The C-terminal domain of T3SS ATPases mediates binding with multiple contact points along the chaperone. 70
60854 375697 pfam18270 Evf Virulence factor Evf. This domain is found in Erwinia virulence factor (Evf) present in the Drosophila pathogen, Erwinia carotovora. Evf is able to bind to model membranes containing negatively charged phospholipids and to promote their aggregation. Palmitic acid covalently binds to the completely conserved Cys209. The structure of Evf is unlike any virulence factors known to date. 235
60855 408085 pfam18271 GH131_N Glycoside hydrolase 131 catalytic N-terminal domain. This is the N-terminal domain found in glycoside hydrolase family 131 (GH131A) protein observed in Coprinopsis cinerea. GH131A exhibits bifunctional exo-beta-1,3-/-1,6- and endo-beta-1,4 activity toward beta-glucan. This domain is catalytic in nature though the catalytic mechanism of C. cinerea GH131A is different from that of typical glycosidases that use a pair of carboxylic acid residues as the catalytic residues. In the case of GH131A, Glu98 and His218 may form a catalytic dyad and Glu98 may activate His218 during catalysis. 251
60856 408086 pfam18272 ssDNA_TraI_N single-stranded DNA binding TraI N-terminal subdomain. This is a subdomain found in TraI present in E. coli. TraI is a conjugative relaxase that forms part of the Type IV secretion system. This subdomain, referred to as 2A, is located in the N-terminal region of the translocation signal (TSA) domain. TSA is known to reside in a larger ssDNA-binding domain. 53
60857 408087 pfam18273 T3RM_EcoP15I_C Type III R-M EcoP15I C-terminal domain. This domain is found in the Type III restriction-modification EcoP15I complex, present in E. coli. The Mod subunits function as a dimer with ModA recognizing the DNA and ModB methylating the target adenine base. This domain is found in the C-terminal of the ModB subunit. 97
60858 375701 pfam18274 V_ATPase_prox Vacuolar ATPase Subunit I N-terminal proximal lobe. This domain is found in the cytoplasmic N-terminal domain of vacuolar ATP synthase subunit I present in Meiothermus ruber. Subunit I is a homolog of subunit A which associates with the membrane-bound complex of eukaryotic vacuolar H+-ATPase (V-ATPase) acidification machinery. The domain forms the proximal lobe that caps one end of the alpha helix bundle, with the distal lobe capping the other end. Although the two lobes exhibit a similar motif, the molecular nature of the coupling with the identical stalks is thought to be dissimilar. 52
60859 408088 pfam18275 His_Me_b4a2 His-Me finger endonuclease beta4-alpha2 domain. This domain is found in Hpy991 present in Helicobacter pylori. Hpy991 is a beta-beta-alpha-Me restriction endonuclease that recognizes the CGWCG target sequence and cleaves both DNA strands with a stagger that leads to 5'-recessed ends in the cleavage products. This domain is the first of two beta4-alpha2 repeats found after the N-terminal domain. The two repeats have low overall sequence similarity but are readily identified by a structural comparison. Both repeats contain two CXXC motifs that map to the first beta-hairpin and the first alpha-helix. The four cysteine residues coordinate a structurally bound Zn2+ ion tetrahedrally. The major groove is in contact with the first repeat, with the beta-hairpin 2 inserting deeply into the groove. 52
60860 408089 pfam18276 TcA_TcB_BD Tc toxin complex TcA C-terminal TcB-binding domain. This domain is found in the C-terminal region of the Tc toxin TcA, present in Photorhabdus luminescens. Tc Toxin complexes bind to the cell surface, are endocytosed and perforate the host endosomal membrane by forming channels that translocate toxic enzymes into the host. This domain is responsible for binding to toxin TcB. Binding of TcA to TcB/TcC opens the beta-propeller gate. 287
60861 408090 pfam18277 AbrB_C AbrB C-terminal domain. This is the C-terminal domain of AbrB protein from Bacillus subtilis. AbrB is a transition state regulator. Functions of AbrB include biofilm formation, antibiotic production, competence development, extracellular enzyme production, motility, and sporulation. The C-terminal domain is responsible for multimerization and, to a lesser extent than the N-terminal domain, also contributes to DNA binding. 37
60862 408091 pfam18278 RANK_CRD_2 Receptor activator of the NF-KB cysteine-rich repeat domain 2. This domain is found in the receptor activator of the NF-KB (RANK) present in Mus musculus. RANK and its cognate ligand RANKL play a role in bone remodelling, immune function and mammary gland development in conjunction with various cytokines and hormones. The binding of RANKL to RANK causes trimerisation of the receptor, which activates the signalling pathway and results in osteoclastogenesis from progenitor cells and the activation of mature osteoclasts. This domain is the second of four cysteine rich pseudo-repeat domains (CRDs) and so is known as CRD2. RANK moves via a hinge region between CRD2 and CRD3 to make close contact with RANKL. 41
60863 375706 pfam18279 zf-WRNIP1_ubi Werner helicase-interacting protein 1 ubiquitin-binding domain. This domain is found in the Werner helicase-interacting protein 1 present in Homo sapiens. The domain is a zinc finger with a zinc-coordinating B-B-A fold. WRNIP1 UBZ binds ubiquitin in a similar manner to Rad18 UBZ. 21
60864 375707 pfam18280 AadA_C Aminoglycoside adenyltransferase C-terminal domain. This is the C-terminal domain of aminoglycoside (3'')(9) adenyltransferase (AadA) present in Salmonella enterica. AadA acts as a monomer to catalyse the magnesium-dependent transfer of adenosine monophosphate from ATP to the two chemically dissimilar drugs streptomycin and spectinomycin. 104
60865 408092 pfam18281 BILBO1_N BILBO1 N-terminal domain. This is the N-terminal domain of BILBO1 present in Trypanosoma brucei. BILBO1 is a flagella pocket collar component. Depletion of BILBO1 prevents flagella pocket and flagella pocket collar biogenesis and leads to cell death. This domain has a ubiquitin-like fold and has a conserved patch of four aromatic residues (Phe-12, Trp-71, Tyr-87, and Phe-89) and three basic residues (Lys-15, Lys-60, and Lys-62). 92
60866 375709 pfam18282 RAP80_UIM RAP80 N-terminal ubiquitin interaction motif. This is the N-terminal domain found in RAP80 protein present in Homo sapiens. RAP80 is fundamental for protein recruitment in the DNA damage response. The N-terminal domain is a ubiquitin-interacting motif (UIM). RAP80 is involved in multivalent recognition of polyUb chains through its N-terminal domain. 57
60867 408093 pfam18283 CBM77 Carbohydrate binding module 77. This domain is the non-catalytic carbohydrate binding module 77 (CBM77) present in Ruminococcus flavefaciens. CBMs fulfil a critical targeting function in plant cell wall depolymerisation. In CBM77, a cluster of conserved basic residues (Lys1092, Lys1107 and Lys1162) confer calcium-independent recognition of homogalacturonan. 108
60868 408094 pfam18284 DNA_meth_N DNA methylase N-terminal domain. This is the N-terminal domain of DNA methylase (pfam00145). Family members include Modification methylase EcoRII (EC:2.1.1.37) and DNA-cytosine methyltransferase. 57
60869 408095 pfam18285 LuxT_C Tetracycline repressor LuxT C-terminal domain. This is the C-terminal domain of LuxT. LuxT is a tetracycline repressor family regulator identified in Vibrio alginolyticus which may play a role in the fine-tuning of the virulence via quorum sensing (QS). 87
60870 408096 pfam18286 T3SS_ExsE Type III secretion system ExsE. This domain is found in ExsE present in Pseudomonas aeruginosa. ExsE forms part of the ExsACDE signaling cascade which acts as an important regulatory switch that ensures timely expression of the Type III secretion system (T3SS) and so plays a critical role in facilitating infection. Prior to host-cell contact, the T3SS is inactive and ExsE and Type III Secretion Chaperone (ExsC) form a stable complex. ExsC forms a compact homodimer and ExsE wraps around one face of this dimer. 46
60871 375714 pfam18287 Hfx_Cass5 Integron Cassette Protein Hfx_Cass5. This domain forms part of the integron cassette protein Hfx_CASS5 present in Vibrio cholerae. The structure of Hfx is a tetramer built from two domain-swapped dimers. 80
60872 408097 pfam18288 FAA_hydro_N_2 Fumarylacetoacetase N-terminal domain 2. This domain is found in the N-terminal region of Fumarylacetoacetate (FAA) hydrolase (pfam01557). Family members of this domain include Pseudogulbenkiania ferrooxidans and Cupriavidus gilardii. 78
60873 408098 pfam18289 HU-CCDC81_euk_2 CCDC81 eukaryotic HU domain 2. This is the second of two HU domains found in the CCDC81-like proteins. CCDC81 has been experimentally linked to the centrosome; eukaryotic CCDC81 HU domains are predicted to function in protein-protein interactions in centrosome organization and potentially contribute to cargo-binding in conjunction with Dynein-VII. A striking lineage-specific expansion of the domain is observed in birds, where the HU domains could function in recognition of non-self molecules. 75
60874 408099 pfam18290 Nudix_hydro Nudix hydrolase domain. This domain is found just before the N-terminal region of nucleoside diphosphate-linked moiety (Nudix) hydrolases (pfam00293). Nudix hydrolases catalyze the hydrolysis of nucleoside diphosphates which are often toxic metabolic intermediates and signalling molecules. 80
60875 408100 pfam18291 HU-HIG HU domain fused to wHTH, Ig, or Glycine-rich motif. Rapidly-diverging family of HU domains predominantly observed in the bacteroidetes lineage with a predicted role in recognition and possible interception of the DNA of parasitic elements, a counter-conflict strategy preventing incorporation of these elements into the host genome. 125
60876 408101 pfam18292 ZIP4_domain Zinc transporter ZIP4 domain. This domain is found in ZRT1-IRT1-like protein 4 (ZIP4) present in Homo sapiens and Mus musculus. ZIP4 is a zinc transporter that allows uptake of the essential nutrient zinc. The domain is found before the N-terminal of ZIP Zinc transporter domain (pfam02535). 167
60877 408102 pfam18293 Caprin-1_dimer Caprin-1 dimerization domain. This domain is found in human Caprin-1 protein. Caprin-1 plays a role in many important biological processes, including cellular proliferation, innate immune response and synaptic plasticity. This domain is found in the highly conserved homologous region 1(HR1) and is responsible for the tight homodimerization of Caprin-1. 116
60878 408103 pfam18294 Pept_S41_N Peptidase S41 N-terminal domain. This domain is found in the N-terminal region of proteins carrying the peptidase S41 domain (pfam03572) in Bacteroidetes. 49
60879 408104 pfam18295 Pdase_M17_N2 M17 aminopeptidase N-terminal domain 2. This domain is found in the N-terminal region of M17 aminopeptidase (pfam00883) present in Homo sapiens and Mus musculus. M17 aminopeptidases are Zn-dependent exopeptidases that catalyse the removal of unsubstituted amino acid residues from the N-terminus of peptides. 121
60880 408105 pfam18296 MID_MedPIWI MID domain of medPIWI. MID domain of the medPIWI PIWI/Argonaute module. medPIWI is the core globular domain of the Med13 protein. Med13 is one member of the CDK8 subcomplex of the Mediator transcriptional coactivator complex. The medPIWI module in Med13 is predicted to bind double-stranded nucleic acids, triggering the experimentally-observed conformational switch in the CDK8 subcomplex which regulates the Mediator complex. 191
60881 408106 pfam18297 NFACT-R_2 NFACT protein RNA binding domain. NFACT-R RNA binding family found in bacteria fused to the ThiI domain as a variant of the canonical tRNA 4-thiouridylation pathway. 104
60882 408107 pfam18298 NusG_add NusG additional domain. This domain is found in Thermotoga maritima NusG, which interacts with RNA polymerase and other proteins to form multi-component complexes that modulate transcription. This domain is referred to as Domain II and is an additional domain inserted into the N-terminal domain. 109
60883 408108 pfam18299 R2K_2 ATP-grasp domain, R2K clade family 2. Family of ATP-grasp enzymes belonging to the R2K clade, wherein one of the absolutely-conserved lysine residues has migrated to the RAGYNA domain which is a part of the core ATP-grasp module. This family is predicted to catalyze peptide ligation reactions on protein substrates in biological conflict contexts, probably between bacteriophages and their hosts. 147
60884 408109 pfam18300 DUF5604 Domain of unknown function (DUF5604). This domain is often found in the N-terminal region of proteins carrying the SET domain (pfam00856), such as the SETDB1 protein present in Homo sapiens. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3. 58
60885 408110 pfam18301 preATP-grasp_3 pre ATP-grasp 3 domain. This domain is found just before the N-terminal of the ATP grasp 3 domain (pfam02655). The domain is carried by species such as Azospirillum brasilense and Methylobacter tundripaludum. 76
60886 375729 pfam18302 CPSase_C Carbamoyl phosphate synthetase C-terminal domain. This is the C-terminal domain found after the MGS domain (pfam02142) in human carbamoyl phosphate synthetase. Carbamoyl phosphate synthetase catalyzes the first step of ammonia detoxification to urea. 14
60887 408111 pfam18303 Saf_2TM SAVED-fused 2TM effector domain. Predicted pore-forming effector domain directly fused to predicted SAVED sensor domain. Binding of a ligand via the SAVED sensor is predicted to activate the Saf-2TM and initiate a cell suicide response. Component of a class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivative. 152
60888 375731 pfam18304 SabA_adhesion SabA N-terminal extracellular adhesion domain. This is the N-terminal extracellular adhesion domain of Sialic acid binding adhesin (SabA) present in Helicobacter pylori. The N-terminal domain of SabA functions as a sugar-binding adhesion domain with conserved disulfide bonds. Notably, these amino acid residues are not only conserved among SabA orthologs but also between SabA and BabA. 299
60889 408112 pfam18305 DNA_pol_A_exoN 3' to 5' exonuclease C-terminal domain. This domain is found just after the C-terminal region of the HRDC domain (pfam00570) in 3'-5' exonuclease proteins (pfam01612). The domain is carried by species such as Streptomyces griseoaurantiacus and Streptomyces albulus. 87
60890 408113 pfam18306 LDcluster4 SLOG cluster4 family. Family in the SLOG superfamily, observed as a standalone domain with little informative genome context, although related families in the SLOG superfamily are predicted to function in diverse conflict contexts. 152
60891 408114 pfam18307 Tfb2_C Transcription factor Tfb2 (p52) C-terminal domain. This is the C-terminal domain of Transcription factor Tfb2 present in Saccharomyces cerevisiae. Tfb2 is referred to as p52 in humans. The interaction between p8-Tfb5 and p52-Tfb2 has a key role in the maintenance of the transcription factor TFIIH architecture and TFIIHs function in nucleotide-excision repair (NER) pathway. The C-terminal domain of Tfb2 is thought to have a crucial role in DNA repair. 68
60892 408115 pfam18308 GGA_N-GAT GGA N-GAT domain. This domain is found in the N-terminal region of the GGA and Tom1 (GAT) domain in Golgi-localizing gamma-adaptin ARF-binding protein 1 (GGA1) present in Homo sapiens. The GAT domain is the key region in GGA that interacts with ARF. ARF plays a crucial role in docking adaptor proteins to membranes. This domain is referred to as N-GAT and it interacts extensively with ARF. 39
60893 408116 pfam18309 Ago_PAZ Argonaute PAZ domain. This PAZ domain is found in argonaute present in Thermus thermophilus. Argonaute has a central role in the RNA interference pathway by mediating the maturation of small interfering RNA (siRNA) through initial degradation of the passenger strand, followed by guide-strand-mediated sequence-specific cleavage of target mRNA. The nucleic-acid-binding channel is thought to be positioned between the PAZ and PIWI domain. 88
60894 408117 pfam18310 DUF5605 Domain of unknown function (DUF5605). This domain is found in the C-terminal region of proteins carrying pfam16586 and pfam13204. The C-terminal domain is carried by species such as Bacteroides vulgatus. 73
60895 408118 pfam18311 Rrp40_N Exosome complex exonuclease Rrp40 N-terminal domain. This is the N-terminal domain of Rrp40 of the exosome complex present in Saccharomyces cerevisiae. The RNA exosome complex is responsible for degrading RNA molecules in the 3' to 5' direction. Rrp40 is a 'cap' protein and binds the RNase PH barrel on the opposite side from the S1/KH ring. The N-terminal domain of Rrp44 forms a long beta-hairpin that is wedged in between Rrp41-Rrp42 and approaches the N terminus of the cap protein Rrp4. 47
60896 408119 pfam18312 ScsC_N Copper resistance protein ScsC N-terminal domain. This is the N-terminal domain found in Copper resistance protein ScsC present in Proteus mirabilis. ScsC is a powerful disulfide isomerase that is able to refold and reactivate the scrambled disulfide form of the model substrate RNase A. The protein has a thioredoxin 4 domain (pfam13462) but, unlike other characterized proteins in this family, it is trimeric. The N-terminal domain is responsible for trimerization of ScsC which is needed for isomerase activity. 30
60897 408120 pfam18313 TLP1_add_C Thiolase-like protein type 1 additional C-terminal domain. This domain is found in thiolase-like protein type 1 (TLP1) present in Mycobacterium smegmatis. Thiolase enzymes are acetyl-coenzyme A acetyltransferases which convert two units of acetyl-CoA to acetoacetyl CoA in the mevalonate pathway. This domain is deemed an additional C-terminal region, much like the SCP2-thiolase present in mammals which has an additional C-terminal domain termed the sterol carrier protein-2 (SCP2). However, the additional C-terminal domain in TLP1 folds differently to the traditional SCP2-fold observed in mammalian SCP2-thiolase. The topology of the C-terminal domain of TLP1 is reminiscent of single strand nucleic acid binding proteins. 82
60898 408121 pfam18314 FAS_I_H Fatty acid synthase type I helical domain. This domain is found in the fatty acid synthase (FAS) complex present in species such as Mycobacterium smegmatis and Thermomyces lanuginosus. FAS is a homo-hexameric enzyme that catalyzes synthesis of fatty acid precursors of mycolic acids. This domain is composed of dimerization module 1 (DM1) and four-helix bundle (4HB), both of which are conserved parts of the acetyl transferase. 203
60899 408122 pfam18315 VCH_CASS14 Integron cassette protein VCH_CASS1 chain. This domain is a chain that forms part of the integron cassette protein VCH_CASS14 present in Vibrio cholerae. In each monomer lies a deep binding pocket for small molecule substrates formed by helices alpha-1 and alpha-2 and residues from the central four strands of the beta-sheet. The pocket is extensively lined with hydrophobic side chains. 97
60900 408123 pfam18316 S-l_SbsC_C S-layer protein SbsC C-terminal domain. This domain is found in the crystalline bacterial cell-surface layer (S-layer) protein SbsC present in Geobacillus stearothermophilus. S-layers are a common feature of archaeal cell envelopes. SbsC is an oblique lattice forming protein. This domain, termed Domain 9, is located at the C-terminal region of SbsC. The C-terminal region comprises the self-assembly domain responsible for the formation of the crystalline array. 85
60901 408124 pfam18317 SDH_C Shikimate 5'-dehydrogenase C-terminal domain. This domain is found in the C-terminal region of Shikimate 5'-dehydrogenase (SDH) present in Methanocaldococcus jannaschii. SDH catalyses the NADPH-dependent reduction of 3-dehydroshikimate to shikimate in the shikimate pathway. The domain is found just after the C-terminal domain (pfam01488) which is responsible for NADP binding. 31
60902 408125 pfam18318 Gln-synt_C-ter Glutamine synthetase C-terminal domain. This domain is found in type III glutamine synthetase present in Bacteroides fragilis. Glutamine synthetases (GS) are large oligomeric enzymes that catalyze the condensation of ammonium and glutamate to form glutamine, the principal source of nitrogen for protein and nucleic acid synthesis. This domain is located in the C-terminal end of the protein. 118
60903 408126 pfam18319 PriA_CRR PriA DNA helicase Cys-rich region (CRR) domain. This is a cys-rich region (CRR) domain found in PriA DNA helicases. In bacteria, the replication restart process is orchestrated by the PriA DNA helicase, which identifies replication forks via structure-specific DNA binding and interactions with fork-associated ssDNA-binding proteins (SSBs). The CRR region which is embedded within the C-terminal helicase lobe has been identified to bind two Zn2+ ions. This 50-residue insertion forms a structure on the surface of the helicase core in which two Zn2+ ions are coordinated by invariant Cys residues. Biochemical experiments have shown that sequence changes to Zn2+-binding Cys residues in the PriA CRR can eliminate helicase, but not ATPase, activity and can block assembly of PriB onto DNA-bound PriA, implicating the CRR in multiple functions in PriA. 27
60904 408127 pfam18320 Csc2 Csc2 Crispr. The Csc2 Crispr family of proteins forms a core RNA recognition motif-like domain, flanked by three peripheral insertion domains: a lid domain, a Zinc-binding domain and a helical domain. The CRISPR-Cas system is possibly a mechanism of defence against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes. 298
60905 408128 pfam18321 3HCDH_RFF 3-hydroxybutyryl-CoA dehydrogenase reduced Rossmann-fold domain. This domain is found in 3-hydroxybutyryl-CoA dehydrogenase present in E. coli. 3-hydroxybutyryl-CoA dehydrogenase catalyzes the second step in the biosynthesis of n-butanol from acetyl-CoA, in which acetoacetyl-CoA is reduced to 3-hydroxybutyryl-CoA. This domain is a reduced Rossmann-fold domain and, unlike the first Rossmann-fold domain, it is missing the catalytic residues and an NAD(H) binding cleft. 69
60906 408129 pfam18322 CLIP_1 Serine protease Clip domain PPAF-2. This domain is found in Prophenoloxidase-activating factor (PPAF)-II present in the beetle Holotrichia diomphalia. PPAF-II is indispensable for the generation of the active phenoloxidase leading to melanization, a major defense mechanism of insects. This domain is the clip domain and it is thought to tightly associate with regions I-III of the serine protease-like (SPL) domain. The clip domain is a protein-interaction module that plays an essential role in the binding and activation of PO76s via its central cleft. 52
60907 408130 pfam18323 CSN5_C Cop9 signalosome subunit 5 C-terminal domain. The COP9 (Constitutive photomorphogenesis 9) signalosome (CSN), a large multiprotein complex that resembles the 19S lid of the 26S proteasome, plays a central role in the regulation of the E3-cullin RING ubiquitin ligases (CRLs). The catalytic activity of the CSN complex is carried by subunit 5 (CSN5), also known as c-Jun activation domain-binding protein-1 (Jab1). This entry is the C-terminal domain found in CSN5 proteins. CSN5, whose two C-terminal helices form an antiparallel hairpin, inserts its final C-terminal helix (helix II) into the central CSN6 framework at the core of the bundle. Deletion of the C-terminal helices has a pronounced effect on CSN integrity. 82
60908 408131 pfam18324 TT1725 Hypothetical protein TT1725. This is the hypothetical protein TT1725 found in Thermus thermophilus HB8. The sequence is conserved in three predicted prokaryotic proteins with unknown functions, including Deinococcus radiodurans, Stigmatella aurantiaca, and Mycobacterium leprae. The presence of positively-charged residues in the alpha-1 helix suggests this region binds to a protein with a negatively charged region or to nucleic acids. 110
60909 408132 pfam18325 Fas_alpha_ACP Fatty acid synthase subunit alpha Acyl carrier domain. This is the acyl carrier domain (ACP) found in fatty acid synthase subunit alpha (FAS2) EC:2.3.1.86. The fungal type I fatty acid synthase (FAS) is a 2.6 MDa multienzyme complex, catalyzing all necessary steps for the synthesis of long acyl chains. To be catalytically competent, the FAS must be activated by a posttranslational modification of the central acyl carrier domain (ACP) by an intrinsic phosphopantetheine transferase (PPT). 162
60910 375753 pfam18326 RFX5_N RFX5 N-terminal domain. This is the N-terminal domain of regulatory factor X (RFX)-5 protein of the RFX complex present in Homo sapiens. The RFX complex is made up of RFX5, RFXAP and RFXB. The complex is involved in the regulation of the expression of the major histocompatibility complex class II (MHCII) gene products. These gene products are essential for the initiation and regulation of the mammalian immune response. The N-terminal domain of RFX5 is responsible for homodimerization of RFX5 which promotes folding of the C-terminal domain of RFXAP. The folding of RFXAP results in the formation of a potential binding site for RFXB to bind to the MHCII promoter. 59
60911 408133 pfam18327 PRODH Proline utilization A proline dehydrogenase N-terminal domain. This is the N-terminal domain found in Proline utilization A (PutA) proteins. Proline utilization A (PutA) is a flavoprotein that has mutually exclusive roles as a transcriptional repressor of the put regulon and a membrane-associated enzyme that catalyzes the oxidation of proline to glutamate. The N-terminal region carries the flavoenzyme proline dehydrogenase (PRODH) domain which catalyzes the 2-electron oxidation of proline with the concomitant reduction of a flavin cofactor. 48
60912 408134 pfam18328 PfaD_N Fatty acid synthase subunit PfaD N-terminal domain. This domain is found in N-terminal region of PfaD, an enoyl reductase enzyme present in Bacillus subtilis. PfaD plays a role in the biosynthesis of polyunsaturated fatty acids. The domain is typically found just before the N-terminal region of a nitronate monooxygenase domain (pfam03060). 62
60913 408135 pfam18329 SGBP_B_XBD Surface glycan-binding protein B xyloglucan binding domain. This is the C-terminal domain found in the surface glycan-binding protein-B (SGBP-B) protein found in Bacteroides ovatus. SGBP-B is a cell-surface-localized, xyloglucan-specific binding protein. The C-terminal domain mediates xyloglucan binding. The domain displays similarity to the C-terminal beta-sandwich domain of many GH13 enzymes. 178
60914 408136 pfam18330 Lig_C Ligase Pab1020 C-terminal region. This is the C-terminal region of RNA ligase Pab1020 present in Pyrococcus abyssi. Pab1020 catalyzes the nucleotidylation of oligo-ribonucleotides in an ATP-dependent reaction. This region contains both a dimerization domain and a C-terminal domain. 125
60915 408137 pfam18331 PKHD_C PKHD-type hydroxylase C-terminal domain. This is the C-terminal domain found in PKHD-type hydroxylase enzymes. Family members are found mostly in Bacteria and carry the 2OG-Fe(II) oxygenase superfamily pfam13640. 43
60916 408138 pfam18332 XRN1_D1 Exoribonuclease Xrn1 D1 domain. This domain can be found in 5' to 3' exoribonuclease 1 (XRN1) which belongs to a family of conserved enzymes in eukaryotes and has important functions in transcription, RNA metabolism, and RNA interference. Xrn1 in fungi and animals is primarily cytosolic and is involved in degradation of decapped mRNAs, nonsense mediated decay, microRNA decay and is essential for proper development. The Xrn1 homolog in Drosophila, known as Pacman, is required for male fertility. This domain (D1) along with 3 other domains, make up a 510-residue segment following the conserved regions found in XRNs but they are only present in XRN1 and are absent in Rat1/XRN2. The amino acid sequences of these four domains contain an excess of basic residues, suggesting that these domains might help in binding the RNA substrate. Mutational studies carried out in D1 domain show that the mutant forms had dramatically reduced nuclease activity towards ssDNA substrate indicating that domain D1 is required for Xrn1 nuclease activity. 192
60917 408139 pfam18333 ssDNA_DBD Non-canonical single-stranded DNA-binding domain. This domain is found in ThermoDBP, a non-canonical single-stranded DNA-binding protein in Thermoproteales. Single-stranded DNA-binding proteins are needed for DNA metabolism, sequestering and protecting transiently formed ssDNA during DNA replication and recombination, detecting DNA damage and recruiting repair proteins. The outer edge of the ssDNA-binding cleft, formed by this domain, has a strongly positive electrostatic surface potential because of the conserved basic residues R49, K54, R65, R80, R86, R90, K97, and R112. 106
60918 408140 pfam18334 XRN1_D2_D3 Exoribonuclease Xrn1 D2/D3 domain. This domain can be found in 5' to 3' exoribonuclease 1 (XRN1) which belongs to a family of conserved enzymes in eukaryotes and has important functions in transcription, RNA metabolism, and RNA interference. Xrn1 in fungi and animals is primarily cytosolic, involved in degradation of decapped mRNAs, nonsense mediated decay, microRNA decay and is essential for proper development. The Xrn1 homolog in Drosophila, known as Pacman, is required for male fertility. This entry relates to domain 2 and 3 combined which can be found in the 510-residue C-terminal extension found in XRN1 and not in XRN2/Rat1. Domain D2 is formed by two stretches of Xrn1, residues 915-960 and 1134-1151. The presence of domain (D3) is suggested based on structure. This domain is formed by residues 979-1109, in the insert of domain D2. It is suggested that domains D2-D4 may help maintain domain D1 pfam18332 in the correct conformation, thereby indirectly stabilising the conformation of the N-terminal segment pfam03159. 87
60919 408141 pfam18335 SH3_13 ATP-dependent RecD-like DNA helicase SH3 domain. This is an SH3 (SRC homology domain 3) domain found in RecD helicases (EC 3.6.4.12) that belong to the bacterial Superfamily 1B (SF1B). This superfamily of helicases translocates in a 5'-3' direction and is required for a range of cellular activities across all domains of life. Structural analyses indicate that the extension of the 5'-tail of the unwound DNA duplex induces a large conformational change in the RecD subunit, that is transferred through the RecC subunit to activate the nuclease domain of the RecB subunit. The process involves this SH3 domain that binds to a region of the RecB subunit. Studies of RecD in E. coli also revealed that the SH3 domain interacts with the ssDNA tail in a location different to that normally occupied by a peptide in canonical eukaryotic SH3 domains, thus retaining the potential to bind peptide at the same time as the ssDNA tail. 65
60920 408142 pfam18336 Tudor_FRX1 Fragile X mental retardation Tudor domain. This is the N-terminal Tudor domain (Tud1) found in Fragile X mental retardation syndrome-related protein 1 (Fxr1). The Tud1 domain forms a canonical Tudor barrel. It is usually found in tandem with Agenet domain pfam05641. 49
60921 408143 pfam18337 Tudor_RapA RapA N-terminal Tudor like domain. This is one of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein of 110-kDa molecular weight with ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP. The N-terminal region of RapA contains two copies of a Tudor-like domain, both folded as a highly bent antiparallel beta-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP. 62
60922 408144 pfam18338 BppL_N Lower baseplate protein N-terminal domain. This domain is found in the N-terminal region of the receptor-binding protein of bacteriophage TP901-1, which infects Lactococcus lactis. The receptor-binding protein of phage TP901-1 is termed the lower baseplate protein (BppL) and is trimeric in nature. The N-terminal domain of BppL plugs into the upper baseplate protein (BppU). 25
60923 408145 pfam18339 Tudor_1_RapA RapA N-terminal Tudor like domain 1. This is one of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein of 110-kDa molecular weight with ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP. The N-terminal region of RapA contains two copies of a Tudor-like domains, both folded as a highly bent antiparallel beta-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP. 51
60924 408146 pfam18340 TraI_2B DNA relaxase TraI 2B/2B-like domain. This is the 2B and 2B-like sub-domain found in TraI (EC:5.99.1.2) a relaxase of F-family plasmids. It contains four domains; a trans-esterase domain that executes the nicking and covalent attachment of the T-strand to the relaxase, a vestigial helicase domain (carrying the 2B/2B-like sub-domain) that operates as an ssDNA-binding domain, an active 5' to 3' helicase domain, and a C-terminal domain that functions as a recruitment platform for relaxosome components. The 2B sub-domains in TraI are formed by residues 625-773 in the vestigial helicase domain and residues 1255-1397 in the active helicase domain. The 2B/2B-like sub-domain interacts with ssDNA where it contributes to the surface area where ssDNA bind. In other words the ssDNA-binding site is located in a groove between the 2B and 2B-like parts of the sub-domain. The sub-domain parts appear to act as clamps holding the ssDNA in place, resulting in the ssDNA being completely surrounded by protein. In previous studies, the 2B/2B-like sub-domain of the TraI vestigial helicase domain has been identified as translocation signal A (TSA) since it contains sequences essential for the recruitment of TraI to the T4S system. Thus, the 2B/2B-like sub-domain plays two major roles in relaxase function: (1) interacting with the DNA and possibly promoting high processivity and (2) mediating recruitment of the relaxosome to the T4S system. 79
60925 408147 pfam18341 PSA_CBD PSA endolysin C-terminal cell wall binding domain. This is the C-terminal domain of bacteriophage PSA endolysin. The C-terminal domain is the cell wall-binding domain (CBD) which is composed of two structurally homologous subdomains. CBD comprises two copies of a beta-barrel-like folds, which are held together by means of swapped beta-strands. The observed structure of the CBD sub-domains from Listeriaphage endolysin (N-acetyl-muramoyl-l-alanine amidase), could be the result of either a gene duplication during evolution of the CBD or the pick-up of another functionally equivalent coding sequence, followed by swapping of the respective ancestral leading beta-strands. 51
60926 408148 pfam18342 LytB_WW Endo-beta-N-acetylglucosaminidase LytB WW domain. This domain can be found in endo-beta-N-acetylglucosaminidase LytB (EC 3.2.1.96) of S. pneumoniae and other gram positive bacteria. Comparative analysis revealed that the second all-beta module derived from the WW-like segments is structurally similar to the chitin binding domain of S. marcescens chitinase ChiB, implying a peptide binding function for this module. 66
60927 408149 pfam18343 SH3_14 Dda helicase SH3 domain. This is a Src homology-3 (SH3) like beta-barrel domain which can be found in Dda enzyme. Dda is a phage T4 SF1B helicase. The Dda SH3 domain contains two insertions (compared to RecD2), a second beta-ribbon that is referred to as the hook and a beta-ribbon/two-helix substructure that is referred to as the tower. The tower region within the domain is rigidly connected to domain 2A in Dda and appears to be specifically designed for the task of supporting the extended pin. Hence, it is suggested that 2A and this SH3 domain move as one unit during the ATP-driven translocation of ssDNA while maintaining contact with the pin. In this scenario, the pin-tower interaction can be considered as an additional transmission site that serves to more efficiently couple the energy from ATP binding and hydrolysis to the unwinding of dsDNA. 134
60928 408150 pfam18344 CBM32 Carbohydrate binding module family 32. This domain is found in GH84C present in Clostridium perfringens. GH84C is a beta-N-acetylglucosaminidase. This domain is a family 32 carbohydrate binding module (CBM) which preferentially recognizes the non-reducing terminus of N-acetyllactosamine. 64
60929 408151 pfam18345 zf_CCCH_4 Zinc finger domain. This is a zinc finger domain found in Zinc finger CCCH-type with G patch domain-containing proteins such as ZIP. Functional studies indicate that ZIP specifically targets EGFR and represses its transcription, and that the zinc finger and the coiled-coil domains are central to that process. 19
60930 408152 pfam18346 SH3_15 Mind bomb SH3 repeat domain. This is the REP domain of Mind bomb, which serves as the substrate recognition domain for a second, membrane-distal epitope of the Notch ligand (C-box). Although the first Mib repeat of REP may play a dominant role in ligand binding, the two repeats appear to cooperate in the engagement of the Jag1 tail. Mind bomb (Mib) proteins are large, multi-domain E3 ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. Structural and functional analyses show that Mib1 contains two independent substrate recognition domains that engage two distinct epitopes from the cytoplasmic tail of the ligand Jagged1, one in the intracellular membrane proximal region and the other near the C terminus. REP domains have a five-stranded anti-parallel twisted beta sheet topology similar to that of SH3 domains. 67
60931 408153 pfam18347 DUF5606 Domain of unknown function (DUF5606). This is a domain of unknown function found at the N-terminal region of bacterial proteins. 46
60932 408154 pfam18348 SH3_16 Bacterial dipeptidyl-peptidase Sh3 domain. This is the first of two N-terminal bacterial SH3 (SH3b) domains found in bacterial dipeptidyl-peptidases VI such as gamma-D-glutamyl-L-diamino acid endopeptidases. The first SH3b domain plays an important role in defining substrate specificity by contributing to the formation of the active site, such that only murein peptides with a free N-terminal alanine are allowed. 49
60933 375776 pfam18349 Paz_1 PAZ domain. This is a Paz domain found in Argonaute proteins from Aquifex aeolicus bacteria. The PAZ core fold in Aquifex aeolicus bacteria Ago (Aa-Ago) proteins is closely related to the human hAgo1 PAZ domain. Structural and functional studies of Aa-Ago indicate that conformational rearrangement of the PAZ domain may be critical for the catalytic cycle of Argonaute and the RNA-induced silencing complex. 93
60934 375777 pfam18350 SH3_17 Restriction endonuclease SH3 domain. This is an N-terminal beta-barrel domain found in the Hpy99I protomer region. Hpy99I is a type II restriction endonuclease (REase) found in the gastric pathogen Helicobacter pylori. The beta-barrel domain has the SH3-domain fold. Deletion of the beta-barrel domain drastically reduced the activity of Hpy99I. 53
60935 408155 pfam18351 Ago_N_1 Fungal Argonaute N-terminal domain. The AGO (Argonaute) proteins have four domains: an N-terminal domain, the PAZ domain, the MID domain and the PIWI domain. This entry is for the N-terminal domain. The N-terminal domain of AGOs is the most variable domain. Compared with prokaryotic Argonautes, KpAGO (Kluyveromyces polysporus Argonaute) has numerous surface-exposed insertion segments, with a cluster of conserved insertions re-positioning the N domain, contributing to the formation of nucleic-acid-binding channel to enable full propagation of the 3' end of the guide RNA guide-target pairing. The guide strand is used by the RISC complex to specify interactions with target RNAs. If sequence complementarity between guide and target is extensive, AGO catalyses cleavage, resulting in slicing of the target RNA. 95
60936 408156 pfam18352 Gp138_N Phage protein Gp138 N-terminal domain. This domain is found in the N-terminal domain of gene product 138 (gp138) in an unidentified bacteriophage. Gp138 is thought to be involved in the process of opening the host cell membrane during infection. The domain has an OB-fold with an intramolecular disulfide bond between C114 and C120. 98
60937 375780 pfam18353 PG_isomerase_N Phosphoglucose isomerase N-terminal domain. This domain is found in the N-terminal region of glucose-6-phosphate isomerase-like protein, just before the phospho-glucose isomerase C-terminal SIS domain (pfam10432). 110
60938 375781 pfam18354 SH3_18 CarS bacterial SH3 domain. This is an SH3 domain found in antirepressor proteins such as CarS from Myxococcus xanthus. CarS antirepressor recognizes and neutralizes its cognate repressors to turn on a photo-inducible promoter. CarS physically interacts with the MerR-type winged-helix DNA-binding domain of these repressors leading to activation of carB operon. Structural studies of CarS from M. xanthus reveal a beta-barrel fold akin to that in SH3 domains. However, it diverges from the typical SH3 domain fold in the lengths and conformations of the connecting loops. Functional analysis reveals that the SH3 domain-like fold in the antirepressor CarS mimics operator DNA in sequestering the repressor DNA recognition helix to activate transcription. 85
60939 375782 pfam18355 DUF5607 Domain of unknown function (DUF5607). 65
60940 408157 pfam18356 DUF5608 Domain of unknown function (DUF5608). 56
60941 408158 pfam18357 DtxR Diphtheria toxin repressor SH3 domain. This is an SH3 domain which can be found in diphtheria toxin repressor (DtxR) proteins. DtxR from Corynebacterium diphtheriae regulates the expression of the gene on corynebacteriophages that encodes diphtheria toxin (DT). Other genes regulated by DtxR include those that encode proteins involved in siderophore-mediated iron uptake. 78
60942 408159 pfam18358 Tudor_4 Histone methyltransferase Tudor domain. This is a Tudor domain found in histone-lysine N-methyltransferase SETDB1 proteins (EC:2.1.1.43), also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4. 50
60943 408160 pfam18359 TUDOR_5 Histone methyltransferase Tudor domain 1. This is the first TUDOR domain found in SETDB1 enzymes (EC:2.1.1.43) in Homo sapiens, also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions. The second tudor domain is pfam18385. 53
60944 408161 pfam18360 hnRNP_Q_AcD Heterogeneous nuclear ribonucleoprotein Q acidic domain. This is an acidic sequence segment domain found in the splicing factor SYNCRIP (hnRNP Q) which is involved in viral replication, neural morphogenesis, modulation of circadian oscillation and the regulation of the cytidine deaminase APOBEC1. This domain is a self-folding globular domain with an all alpha-helix architecture with negatively charged surface areas. Additionally it contains a large hydrophobic cavity and a positively charged surface area as potential epitopes for inter-molecular interactions. 70
60945 375787 pfam18361 ssDBP_DBD Single stranded DNA-binding protein ss DNA binding domain. This domain is found in the N-terminal of ThermoDBP, a single stranded DNA binding protein found in Thermoproteus tenax. ThermoDBP binds specifically to ssDNA with low sequence specificity. This domain is responsible for ssDNA binding. Conserved motif 'LIYWIRSDR' is located at the C-terminal end of the domain and is thought to participate in ssDNA binding. 136
60946 408162 pfam18362 THB Tri-helix bundle domain. This domain can be found in the myosin-binding motif (m-domain) region present in myosin-binding protein C (MyBP-C). MyBP-C is a sarcomeric assembly protein necessary for the regulation of sarcomere structure and function. The MyBP-C family of proteins consists mainly of modules with immunoglobulin (Ig) or fibronectin folds. This domain exhibits a three-helix bundle fold and there is a known actin-binding motif, LK(R/K)XK positioned in the third helix (alpha3), similar to that found in villin and related proteins. 34
60947 408163 pfam18363 PI_PP_I Phosphoinositide phosphatase insertion domain. This domain is found in the effector protein SidP present in Legionella longbeachae. SidP functions as a Phosphoinositide-3-phosphatase specifically hydrolyzing Phosphoinositide(3)P, referred to as PI(3)P, and PI(3,5)P2. The domain is inserted into the N-terminal portion of the catalytic domain and is referred to as the appendage or insertion (I) domain. 94
60948 408164 pfam18364 Molybdopterin_N Molybdopterin oxidoreductase N-terminal domain. This is the N-terminal domain of pfam00384 found in a number of molybdopterin-containing oxidoreductases such as dimethyl sulfoxide/trimethylamine N-oxide reductase, also known as DMSO reductase (EC:1.7.2.3, EC:1.8.5.3). 41
60949 408165 pfam18365 PI_PP_C Phosphoinositide phosphatase C-terminal domain. This domain is found in the C-terminal region of effector protein SidP present in Legionella longbeachae. SidP functions as a PI-3-phosphatase specifically hydrolyzing PI(3)P and PI(3,5)P2. This C-terminal domain is rich with glutamate residues. 125
60950 408166 pfam18366 zf_ZIC Zic proteins zinc finger domain. This is the ZF1 (Zinc Finger 1) domain found in Zic family proteins found in Eukaryotes. In humans, there are five members of the Zic family that are involved in human congenital anomalies. One of them, ZIC3, causes X-linked heterotaxy (HTX1), which is a left-right axis disturbance that manifests as variable combinations of heart malformation, altered lung lobation, splenic abnormality and gastrointestinal malrotation. Zic family proteins contain multiple zinc finger domains (ZFD), which are generally composed of five tandemly repeated C2H2 zinc finger (ZF) motifs. Sequence comparison analysis reveals that this N-terminal ZF (ZF1) domain of the Zic zinc finger domains is unique in that it possesses more amino acid residues (6-38 amino acids) between the two cysteine residues of the C2H2 motif compared to Gli and Glis ZF1s or any of the other ZFs (ZF2-5) in the Gli/Glis/Zic superfamily of proteins. Mutations in cysteine 253 (C253S) or histidine 286 (H286R) in ZIC3 ZF1, which are found in heterotaxy patients, result in extranuclear localization of the mutant ZIC3 protein. Furthermore, mutations in the evolutionarily conserved amino acid residues (C253, W255, C268, H281 and H286) of ZF1 generally impair nuclear localization. 45
60951 408167 pfam18367 Rv2175c_C Rv2175c C-terminal domain of unknown function. This is the C-terminal domain of unknown function found in actinomycetes such as M. tuberculosis Rv2175c. Rv2175c has a DNA binding activity and possesses a winged helix-turn-helix fold, furthermore it is identified as a substrate of the PknL kinase. 56
60952 408168 pfam18368 Ig_GlcNase Exo-beta-D-glucosaminidase Ig-fold domain. This domain can be found in 2 glycoside hydrolase subfamily of beta-glucosaminidases (EC:3.2.1.165) such as CsxA, from Amycolatopsis orientalis that has exo-beta-D-glucosaminidase (exo-chitosanase) activity. It has an immunoglobulin-like topology. 104
60953 408169 pfam18369 PKS_DE Polyketide synthase dimerisation element domain. This is the dimerisation element domain found in bacterial modular polyketide synthase ketoreductases. The dimerization element (DE) domain is N-terminal to the KR domain pfam08659. DE domain is necessary for KR function, presumably because the dimeric DE orients the KR domains for optimal activity within a module. 45
60954 408170 pfam18370 RGI_lyase Rhamnogalacturonan I lyases beta-sheet domain. This is the beta-sheet domain found in rhamnogalacturonan (RG) lyases, which are responsible for an initial cleavage of the RG type I (RG-I) region of plant cell wall pectin. Polysaccharide lyase family 11 carrying this domain, such as YesW (EC:4.2.2.23) and YesX (EC:4.2.2.24), cleave glycoside bonds between rhamnose and galacturonic acid residues in RG-I through a beta-elimination reaction. Other family members carrying this domain are hemagglutinin A, lysine gingipain (Kgp) and Chitinase C (EC:3.2.1.14). 86
60955 408171 pfam18371 FAD_SOX Flavin adenine dinucleotide (FAD)-dependent sulfhydryl oxidase. This is a flavin adenine dinucleotide (FAD) binding domain found in Quiescin sulfhydryl oxidases (QSOX) (EC:1.8.3.2). QSOX is a multi-domain disulfide catalyst that is localized primarily to the Golgi apparatus and secreted fluids and has attracted attention due to its over-production in tumors. Structural studies indicate that the closure of the Trx1 domain over the FAD-binding site may enhance the active-site chemistry for disulfide formation. 104
60956 408172 pfam18372 I-EGF_1 Integrin beta epidermal growth factor like domain 1. This is the I-EGF 1 domain found in several integrin betas such as integrin beta 1-7. Structural analysis reveals epidermal growth factor-like (I-EGF) domains 1 and 2. EGF1 lacks one disulfide (C2-C4) relative to the integrin EGF 2, 3, and 4 domains; this allows the C-terminal end of EGF1 to flex remarkably relative to its N-terminal end. 29
60957 408173 pfam18373 Spectrin_like Spectrin like domain. Desmoplakin (DP) is an integral part of desmosomes, where it links desmosomal cadherins to the intermediate filaments. The N-terminal region of DP contains a plakin domain common to members of the plakin family. Plakin domains contain multiple copies of spectrin repeats (SRs) pfam00435. Spectrin repeats (SRs) consist of three alpha-helices (A, B, and C) that form an antiparallel triple-helical bundle. This entry describes SR6 which has a divergent structure relative to the other SRs. SR6 shows significant deviations in helices A and B where they are significantly shorter than in other repeats. Structural comparison revealed that SR6 is more similar to other three-helix-bundle proteins, including target of Myb1 and the syntaxin Habc domain, than to other SR proteins. Due to these differences with other spectrin repeats, this region is termed spectrin-like repeat. 78
60958 408174 pfam18374 Enolase_like_N Enolase N-terminal domain-like. This is the N-terminal domain found in o-succinylbenzoate synthase (OSBS) enzymes (EC:4.2.1.113). Like other members of the enolase superfamily, OSBS enzymes are composed of a C-terminal catalytic (beta/alpha)7beta-barrel domain pfam00113 and an N-terminal capping domain with an alpha+beta fold that is found in the enolase superfamily. This domain is different from other enolase super family N-terminal domains such as pfam03952. This actino-bacterial N-terminal domain lacks the prototypical first two helices of the enolase capping domains. Structural analysis of T. fusca OSBS reveals that this is compensated for with an extra helix appended to the C-terminus. 51
60959 408175 pfam18375 CDH1_2_SANT_HL1 CDH1/2 SANT-Helical linker 1. CHD1 is an ATP-dependent chromatin-remodelling factor and plays an important role in regulating nucleosome assembly and mobilization. CHD1 consists of double chromodomain, SNF2-related ATPase domain, and a C-terminal DNA-binding domain. The DNA-binding domain contains SANT (Swi3, Ada2, N-CoR, TFIIIB) and SLIDE (SANT-like ISWI) domains in its C-terminal region. SANT domains, which are structurally related to Myb-like domains, are common motifs found in chromatin interacting proteins. Deletion of individual SANT or SLIDE domains in CHD1 does not significantly affect nucleosome binding, but combined deletion of both domains severely compromises binding, suggesting that the SANT-SLIDE motif recognizes DNA/nucleosomes as a single cooperative unit. SANT sequences of Chd1 proteins are the most distantly related group of sequences relative to other SANT/Myb sequences, and are more diverse than other SANT proteins. The SANT and SLIDE regions are well conserved in both Chd1 and ISWI (imitation switch) remodelling enzymes. This domain comprises the SANT region and the helical linker region 1 (HL1). 90
60960 408176 pfam18376 MDD_C Mevalonate 5-diphosphate decarboxylase C-terminal domain. Mevalonate diphosphate decarboxylase (EC:4.1.1.33) catalyzes the ATP dependent decarboxylation of mevalonate 5-diphosphate (MVAPP) to form isopentenyl 5-diphosphate. The reaction is required for production of polyisoprenoids and sterols from acetyl-CoA. This entry represents the C-terminal domain of the mevalonate 5-diphosphate decarboxylase enzyme which is a member of the GHMP kinase superfamily. 186
60961 408177 pfam18377 FERM_F2 FERM F2 acyl-CoA binding protein-like domain. This is an F2 lobe domain consisting of an acyl-CoA binding protein fold found in FERM region of Jak-family tyrosine kinases. Multidomain JAK molecules interact with receptors through their FERM and SH2-like domains, triggering a series of phosphorylation events, resulting in the activation of their kinase domains. Overall, the FERM region maintains the typical three-lobed architecture, with an F1 lobe consisting of a ubiquitin-like fold, an F2 lobe consisting of an acyl-CoA binding protein fold, and an F3 lobe consisting of a pleckstrin-homology (PH) fold. JAK1 FERM-F2 domain has been shown to act as the interaction site for the IFNLR1 box1 motif (PxxLxF) of class II cytokine receptors which is essential for kinase activation. 131
60962 408178 pfam18378 Nup188_C Nuclear pore protein NUP188 C-terminal domain. This is C-terminal domain of Nup188. It is a right-handed arc-shaped superhelical structure built from 19 helices that form 6 helical repeats, which are stacked in regular order. The first helical pair (alpha1 and alpha2) forms a HEAT repeat followed by 5 ARM repeats. 371
60963 408179 pfam18379 FERM_F1 FERM F1 ubiquitin-like domain. This is an F1 lobe domain consisting of a ubiquitin like fold found in FERM region of Jak-family tyrosine kinases. Multidomain JAK molecules interact with receptors through their FERM and SH2-like domains, triggering a series of phosphorylation events, resulting in the activation of their kinase domains. Overall, the FERM region maintains the typical three-lobed architecture, with an F1 lobe consisting of a ubiquitin-like fold, an F2 lobe consisting of an acyl-CoA binding protein fold, and an F3 lobe consisting of a pleckstrin-homology (PH) fold. 96
60964 408180 pfam18380 GEN1_C Holliday junction resolvase Gen1 C-terminal domain. This is the C-terminal domain found in GEN1 resolvase. It is composed of three-strand antiparallel beta sheets and four alpha helices. GEN1 protein, a member of the XPG/Rad2 family of structure-selective endonucleases, is specialized for the cleavage of Holliday junction recombination intermediates. Structural comparison indicates that the C-terminal domain is similar to a series of chromobox homology proteins. Functional analysis indicates that the chromodomain provides an additional DNA binding site necessary for efficient HJ cleavage, and its truncation severely hampers GEN1's catalytic activity. 104
60965 408181 pfam18381 YcaO_C YcaO cyclodehydratase C-terminal domain. This is the proline-rich C-terminal domain found in ribosomal protein S12 methylthiotransferase accessory factor YcaO. It has been shown to be involved in both C protein recognition and cyclodehydration. The C-terminal domain resembles a tetratricopeptide repeat that mediates dimerization. 172
60966 408182 pfam18382 Formin_GBD_N Formin N-terminal GTPase-binding domain. This is the N-terminal GTPase-binding domain (GBD) of formins also known as formin homology domain-containing proteins (FHOD) pfam02181. This GBD is recruited by Rac and Ras GTPases in cells and plays an essential role for FHOD1-mediated actin remodelling and transcriptional activation, localizes to specific GTPases in cells, and binds to GTPases in vitro. It exhibits structural similarity to the ubiquitin superfold as found, for example, in the Ras-binding domains of c-Raf1 or PI3 kinase, but contains an unusual loop that inserts into the first FH3 repeat. 99
60967 408183 pfam18383 IFT81_CH Intraflagellar transport 81 calponin homology domain. This is the N-terminal domain found in IFT81 proteins. Crystal structure analysis revealed that IFT81-N adopts the fold of a calponin homology (CH) domain with structural similarity to the kinetochore complex component NDC80 with microtubule (MT)-binding properties. Functional analysis shows that IFT74 and IFT81 form a tubulin-binding module required for ciliogenesis. It is suggested that IFT81-N binds the globular domain of tubulin to provide specificity, and IFT74-N recognizes the beta-tubulin tail to increase affinity. 123
60968 375810 pfam18384 zf_CCCH_5 Unkempt Zinc finger domain 1 (Znf1). This is the CCCH zinc finger 1 domain found in the Unkempt N-terminal region. Unkempt is an evolutionarily conserved RNA-binding protein that regulates translation of its target genes and is required for the establishment of the early bipolar neuronal morphology. It carries six CCCH zinc fingers (ZnFs) forming two compact clusters, ZnF1-3 and ZnF4-6, that recognize distinct trinucleotide RNA substrates. These clusters recognize an unexpectedly short stretch of RNA sequence (only three consecutive ribonucleotides) with a varying degree of specificity. ZnF1-3 binds to the UUA motif of RNA substrates. 40
60969 408184 pfam18385 Tiam_CC_Ex T-lymphoma invasion and metastasis CC-Ex domain. This is the CC and Ex subdomains found in PH-CC-Ex globular domain from Tiam1 and Tiam2 proteins (T-lymphoma invasion and metastasis). The CC subdomain forms an antiparallel coiled coil with two long alpha-helices, together with the C-terminal Ex subdomain they form a small globular domain comprising three alpha-helices. The CC subdomain of the Tiam2 PHCCEx domain follows the C-terminal alpha1 helix of the PH pfam00169 subdomain through a four-residue linker. 98
60970 408185 pfam18386 ROQ_II Roquin II domain. The ROQ domain is composed of three subdomains, I, II and III. This entry describes the second domain, ROQ II. Structural analysis reveals similarity of domain II to the helix-turn-helix (HTH) fold. Mutagenesis and biochemical studies show that the HTH fold in domain II contributes to binding dsRNA at the 5'arm. 56
60971 408186 pfam18387 zf_C2H2_ZHX Zinc-fingers and homeoboxes C2H2 finger domain. This is a C2H2 zinc-finger domain found in ZHX proteins such as ZHX1. ZHXs are multidomain proteins comprising two C2H2 zinc finger motifs and five homeodomains. Both homeodomains and zinc fingers are short protein modules involved in protein-DNA and/or protein-protein interactions; they are frequently associated with roles in transcriptional regulation. All members of the ZHX family are reported to be able to form both homo- and heterodimers via the region containing homeodomain 1. ZHX1 is a transcriptional repressor which is ubiquitously expressed. It interacts with nuclear factor Y subunit A (NFYA) and DNA methyl transferase 3B (DNMT3B) for its repression activity. Changes in expression profiles of rat ZHX1 ortholog have been associated with glomerular disease. In addition to the five homeodomains, ZHX1 contains two N-terminal C2H2 zinc fingers; it forms homodimers via homeodomain and can also form heterodimers with ZHX3. 53
60972 408187 pfam18388 Atg29_N Atg29 N-terminal domain. This is the N-terminal domain found in fungal Atg proteins such as Atg29. In yeast, the induction of autophagy begins at a single perivacuolar site that is proximal to the vacuole, called the phagophore assembly site (PAS). Atg17-Atg29-Atg31 complex (Atg1 complex) formation is a prerequisite for PAS assembly. Functional analysis indicates that the N-terminal half of Atg29 can bind Atg31. 54
60973 408188 pfam18389 TrmO_C TrmO C-terminal domain. This domain is found at the C-terminus of TrmO tRNA methyltransferase proteins. This domain has a RelE fold. 65
60974 408189 pfam18390 GlgX_C Glycogen debranching enzyme C-terminal domain. This is the C-terminal domain of the glycogen debranching enzyme GlgX. GlgX hydrolyzes alpha-1,6-glycosidic linkages of phosphorylase-limit dextrin containing only three or four glucose subunits produced by glycogen phosphorylase. Sequence analysis suggests that GlgX is a debranching enzyme belonging to the glycoside hydrolase GH-13 family in the CAZy database. 85
60975 408190 pfam18391 CHIP_TPR_N CHIP N-terminal tetratricopeptide repeat domain. This is N-terminal tetratricopeptide repeat (TPR) domain found in C terminus of Hsp70 interacting proteins (CHIP). The TPR domain of CHIP binds directly to EEVD motifs located at the C termini of Hsc/Hsp70 and Hsp90. 83
60976 408191 pfam18392 CSN7a_helixI COP9 signalosome complex subunit 7a helix I domain. This is the C-terminal helix I domain found in COP9 signalosome complex subunit 7a. The helix from CSN7 (helix I) contacts CSN6 helices I and II at the base of the bundle, nearest the PCI ring. 50
60977 408192 pfam18393 MotY_N MotY N-terminal domain. The bacterial flagellar motor is a rotary motor complex composed of various proteins. MotX and MotY are essential for the Na+-driven flagellar motor motility of Vibrio, Shewanella and Aeromonas species. MotY is the main component for T-ring formation, and absence of MotY completely disrupts T-ring formation. This is the N-terminal domain of MotY which is shown to be essential for motility and responsible for the interaction with both MotX and the basal body. Functional analysis suggests that MotY-N connects the basal body to MotX and that the PomA/PomB complex associates with MotX to form the functional stator complex around the rotor. MotY-N alone does not associate strongly with the basal body, but the partial T-ring structure made of the MotY-N/MotX complex is sufficient to allow at least a few PomA/PomB stator complexes to be incorporated into the motor. 146
60978 408193 pfam18394 TBK1_CCD1 TANK-binding kinase 1 coiled-coil domain 1. This is a coiled-coil domain found in TANK-binding kinase 1 (TBK1); it comprises one of two coiled-coil domains found in the scaffold dimerization region. TBK1 is a serine/threonine kinase and a noncanonical member of the IKK family implicated in diverse cellular functions, including innate immune response as well as tumorigenesis and development. Deletion of the coiled-coil 1 region in TBK1 leads to a severe impairment in TBK1 function even upon over-expression. 256
60979 408194 pfam18395 Cas3_C Cas3 C-terminal domain. This is the C-terminal domain of Cas3 proteins. The C-terminal domain (CTD) is shown to completely wrap ssDNA inside the helicase. Deletion of the CTD (aa 819-924) reduced CRISPR interference. It is suggested that the CTD regulates the N-terminal HD nuclease activity by functioning as a substrate filter. 107
60980 408195 pfam18396 TBK1_ULD TANK binding kinase 1 ubiquitin-like domain. This is the ubiquitin-like domain (ULD) found in TANK-binding kinase 1 (TBK1). TBK1 is a serine/threonine kinase and a noncanonical member of the IKK family implicated in diverse cellular functions, including innate immune response as well as tumorigenesis and development. It has been reported that the ULD of TBK1 regulates kinase activity, playing an important role in signaling and mediating interactions with other molecules in the IFN pathway. Deletion of ULD indicates that it is required for the kinase domain to form an enzymatically active conformation. TBK1 ULD has a ubiquitin-like structure and an Ile44 hydrophobic patch, which is conserved among ULDs and IKK and IKK-related proteins. This hydrophobic patch is involved in ULD-SDD interactions in TBK1 and other IKK and IKK-related proteins. 88
60981 408196 pfam18397 IKBKB_SDD IQBAL scaffold dimerization domain. This is the C-terminal scaffold dimerization domain (SDD) found in inhibitor of nuclear factor kappa-B kinase subunit beta IKBKB (EC:2.7.11.10). IKK2, also known as IKBKB, is one of the core components of IKB kinases (IKK). IKB kinase (IKK) is an enzyme that quickly becomes active in response to diverse stresses on a cell. The SDD consists primarily of two long alpha-helices. 275
60982 408197 pfam18398 CLIP_SPH_mas Clip-domain serine protease homolog masquerade. The clip domain is a structural/regulatory unit in many arthropod serine proteases. The clip domain super-family also includes serine protease homologs (SPHs). This entry describes clip domains in the SPHs (CLIP subfamily A), which belong to group-3. SPHs usually carry between 1 and 5 clip domains. One of the most prominent family members is masquerade (mas). Deletion in Drosophila models leads to defects in somatic muscle attachment and in the formation of the nervous system during embryogenesis. 33
60983 408198 pfam18399 CLIP_SPH_Scar Clip-domain serine protease homolog Scarface. The clip domain is a structural/regulatory unit in many arthropod serine proteases. The clip domain super-family also includes serine protease homologs (SPHs). This entry describes clip domains in the SPHs (CLIP subfamily A), which belong to group-3. SPHs usually carry between 1 and 5 clip domains. The most prominent family members carrying this clip domain are Scarface proteins in Drosophila, which bear an inactive catalytic site, representing a subgroup of serine protease homologues (SPH). Loss-of-function induces defects in JNK-controlled morphogenetic events such as embryonic dorsal closure and adult male terminalia rotation. 66
60984 408199 pfam18400 Thioredoxin_12 Thioredoxin-like domain. This is one of four TRXL(thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). 185
60985 408200 pfam18401 Thioredoxin_13 Thioredoxin-like domain. This is the second out of four TRXL(thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). 136
60986 408201 pfam18402 Thioredoxin_14 Thioredoxin-like domain. This is the third out of four TRXL(thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). 248
60987 408202 pfam18403 Thioredoxin_15 Thioredoxin-like domain. This is the fourth TRXL(thioredoxin-like) domain found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). 204
60988 408203 pfam18404 Glyco_transf_24 Glucosyltransferase 24. This is the catalytic domain found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). This domain belongs to glucosyltransferase 24 family (GT24) A-type domain. The GT domain displays the expected glycosyltransferase type A (GT-A) fold. 268
60989 408204 pfam18405 Serine_protease Gammaproteobacterial serine protease. This family includes serine proteases such as L. pneumophila effector Lpg1137. Lpg1137, is a serine protease that targets the mitochondria-associated ER membrane (MAM) and degrades STX17 (syntaxin 17), a SNARE implicated in macroautophagy/autophagy as well as mitochondria dynamics and membrane trafficking in fed cells. Lpg1137 has a sequence (-Gly-Leu-Ser68-Gly-Gly-) that matches the consensus sequence for the active site of serine proteases (Gly-X-Ser-X-Gly/Ala, where X is any residue). It exhibits proteolytic activity toward STX17 in vitro, whereas an active site mutant in which Ser68 is replaced by Ala does not. Expressed Lpg1137 localizes to the MAM and mitochondria, in addition to the cytosol, and binds to STX17. 283
60990 408205 pfam18406 DUF1281_C Ferredoxin-like domain in Api92-like protein. This domain has a ferredoxin like fold. It is often found to the C terminus of pfam06924. 87
60991 408206 pfam18407 GNAT_like GCN5-related N-acetyltransferase like domain. This is a domain with a GCN5-related N-acetyltransferase (GNAT) fold which can be found in Rv1692 phosphatases. Crystal structure of Rv1692 indicates that this C-terminal extension, which is absent in other characterized HADSF members, resembles a small GCN5-related N-acetyltransferase (GNAT) fold. Furthermore, it is fused to the HADSF catalytic domain pfam13242. Functional studies indicate that this GNAT region is not likely to be involved in acetyl group transfer using AcCoA and SucCoA, but it could nonetheless be a regulatory domain. Furthermore, it is suggested that this GNAT domain is required for the solubility of the HADSF fold of Rv1692 and is potentially needed for the structural integrity of this enzyme. 60
60992 408207 pfam18408 zf_Hakai C2H2 Hakai zinc finger domain. This is the C2H2 zinc finger domain found in E3 ubiquitin ligase Hakai. Hakai targets tyrosine-phosphorylated E-cadherin. It carries a Tyr(P)-binding domain, coined the HYB domain for Hakai phosphotyrosine (Tyr(P)) binding. HYB domain structure illustrates that it forms a zinc-coordinated homodimer in an antiparallel, intertwined configuration, utilizing residues from the Tyr(P)-binding region of two Hakai monomers. The C-terminal region of the HYB domain, which harbors the atypical zinc-coordination motif and key residues involved in the Tyr(P) interaction, plays an important role in the dimerization observed in the HYB domain. 32
60993 408208 pfam18409 Plk4_PB2 Polo-like Kinase 4 Polo Box 2. This Polo box (PB) domain is found in Polo-like kinase 4 (Plk4) present in Drosophila melanogaster. Plk4 is a conserved component in the duplication pathway of centrioles which is needed to prevent chromosomal instability. Plk4 localizes to centrioles in M/G1. Structural analysis reveals two tandem, homodimerized polo boxes, PB1-PB2, that form a winged architecture. This domain is PB2, together with PB1 pfam18190, they are required for binding the centriolar protein Asterless (Asl) as well as robust centriole targeting. In other words, PB1-PB2 cassette collectively binds Asl and affords robust centriole localization, optimally positioning the kinase domain for trans-autophosphorylation. 109
60994 408209 pfam18410 BTHB Basic tilted helix bundle domain. This domain is found on the N-terminal region of FKBPs such as FKBP25 and in the core region of E3 ubiquitin ligase HectD1. It adopts a compact 5-helix bundle, hence termed BTHB (Basic Tilted Helix Bundle) domain. In FKBP25, it has been suggested to have a role in regulating the association state of nucleosomes by interacting with nucleolin. Moreover, this basic domain in FKBP25 forms alternative complexes with other chromatin-related proteins, such as the HDAC1, HDAC2, and the transcriptional regulator YY1, the DNA binding activity of which is enhanced on binding FKBP25. Structural analysis of this fold suggests that the DNA binding properties of FKBP25 and HectD1 are presented by the conserved basic region. 72
60995 408210 pfam18411 Annexin_like Annexin-like domain. This annexin-like domain can be found in astrotactin 2 (Astn-2), an integral membrane perforin-like protein linked to the planar cell polarity pathway in hair cells. The annexin-like domain is closest in fold to repeat three of human annexin V and similarly binds calcium, yet shares no sequence homology with it. Notably, this ASTN-2 annexin-like domain is closer in structure to human annexin repeat 3 than human annexin repeat 3 is to repeat 1. Annexin-like domains are known for their capacity to remodel membranes, triggered by calcium binding, and have also been suggested to be involved in the formation of pores in membranes; both are possible biological roles of the ASTN-2 annexin-like domain. 93
60996 408211 pfam18412 Wza_C Outer-membrane lipoprotein Wza C-terminal domain. This is the C-terminal domain found in Wza, an integral outer membrane lipoprotein, which is essential for group 1 capsule export in Escherichia coli. The domain is exposed on the cell surface and is suggested to mimic antimicrobial peptide pore formation. 30
60997 408212 pfam18413 Neuraminidase Neuraminidase-like domain. This is a neuraminidase-like domain, which is structurally homologous to neuraminidases. It can be found in the TcA subunit in tripartite Tc toxin complexes of bacterial pathogens. Functional analysis suggests that the neuraminidase-like domain acts as an electrostatic lock that opens at high or low pH values. 171
60998 375840 pfam18414 zf_C2H2_10 C2H2 type zinc-finger. This is a C2H2 zinc finger domain which can be found in optineurin (optic neuropathy inducing protein) and NF-kappa-B essential modulator (NEMO). Furthermore, it can be found in kinase TBK1, a member of the IKK (inhibitor of nuclear factor kappa-B kinase) family. The C-terminal region, which carries the zinc finger domain, constitutes the regulatory domain of NEMO, as it receives the activation signal from upstream molecules, and subsequently transmits this activation to the kinases bound to the N-terminal domain. The isolated NEMO zinc finger is thought to be involved in protein-protein rather than protein-DNA interaction. 26
60999 408213 pfam18415 HKR_ArcB_TM Histidine kinase receptor ArcB trans-membrane domain. Histidine kinase receptors (HKRs) are part of a two-component system, in which an HKR in the bacterial inner membrane transmits a signal to a response regulator located in the cytoplasm. This is a trans-membrane domain (TM) found in ArcB (class 2, aerobic respiratory control sensor). ArcB has two TM helices connected by a short periplasmic loop. TM domain structures suggest a loose helical packing which provides an inherent flexibility in the TM domains, and that this is perhaps essential to the mechanism of signal transduction across the membrane. 75
61000 408214 pfam18416 GbpA_2 N-acetylglucosamine binding protein domain 2. This domain can be found in N-acetylglucosamine binding protein (GbpA) from Vibrio cholerae, a bacterial pathogen that colonizes the chitinous exoskeleton of zooplankton as well as the human gastrointestinal tract. GbpA binds to GlcNAc oligosaccharides. Structural comparison shows that there are distant structural similarities between domain 2 of GbpA and the beta-domain of the flagellin protein p5. It is suggested that this domain interacts with the bacterial surface, and functions to project an alginate binding domain of the protein from the cell surface. 102
61001 408215 pfam18417 LodA_C L-lysine epsilon oxidase C-terminal domain. This is the C-terminal domain of L-Lysine epsilon-oxidase (LodA, EC 1.4.3.20), an enzyme which catalyses the oxidative deamination of free L-lysine into L-2-aminoadipate 6-semialdehyde, ammonia and hydrogen peroxide. 144
61002 408216 pfam18418 AnkUBD Ankyrin ubiquitin-binding domain. This is an Ankyrin repeat domain found in TRABID (also known as Ubiquitin thioesterase ZRANB1) (EC:3.4.19.12). In TRABID, the first ankyrin repeat spans residues 260-290 and is connected to the second repeat residues 313-340 by a long linker that packs against what would correspond to the concave surface in an extended ankyrin-repeat structure. Ankyrin-repeat domains mediate protein interactions through a variety of surfaces. The ankyrin domain of TRABID interacts with ubiquitin, hence it is referred to as the ankyrin ubiquitin-binding domain, or AnkUBD. 96
61003 408217 pfam18419 ATP-grasp_6 ATP-grasp-like domain. Glutathione biosynthesis is achieved in most organisms via a conserved two-step approach relying on the capacity of two independent and unrelated ligases to perform peptide synthesis coupled to ATP hydrolysis. In a first and rate-limiting step, gamma-glutamylcysteine ligase (gamma-ECL) (or GshA; EC:6.3.2.2) uses l-glutamate and l-cysteine to form gamma-glutamylcysteine (gamma-EC), which, in a second step, is condensed with glycine to glutathione by glutathione synthetase (GS) (or GshB; EC:6.3.2.3). However, several pathogenic and free-living bacteria carry out glutathione biosynthesis based on a single enzyme that catalyzes both the gamma-ECL and the GS reactions. Such bifunctional glutathione-synthesizing enzymes have been termed gamma-GCS-GS or GshF. Hybrid GshF contains a typical gamma-proteobacterial gamma-ECL fused to an ATP-grasp-like domain. The ATP-grasp-like module is responsible for the ensuing formation of glutathione from gamma-glutamylcysteine and glycine. The ATP-grasp-like domain has an antiparallel beta-sheet in the GshF structures in contrast to all structurally characterized members of the ATP-grasp superfamily. 54
61004 375846 pfam18420 CSN4_RPN5_eIF3a CSN4/RPN5/eIF3a helix turn helix domain. Cullin-RING E3 ubiquitin ligases (CRLs) are regulated by the eight-subunit COP9 signalosome (CSN). Enzymatically, CSN functions as an isopeptidase that removes the ubiquitin-like activator NEDD8 from CRLs, but it can also bind deneddylated CRLs and maintain them in an inactive state. The CSN subunits CSN1, CSN2, CSN3, CSN4, CSN7 and CSN8, share a common domain composition: an N-terminal array of tandem alpha-helical tetratricopeptide/-like repeats, a 34 residue motif, followed by a PCI domain, which encompasses a WH subdomain, a linker, and one or two alpha-helices at the C-terminus. This entry describes the C-terminal helices found on CSN4. The two helices from CSN4 (helices I and II) form a brace roughly perpendicular to the bundle axis in contact with the three C-terminal helices of CSN6. CSN5, whose two C-terminal helices form an antiparallel hairpin, inserts its final C-terminal helix (helix II) into the central CSN6 framework at the core of the bundle. Both CSN1 and CSN4 are dependent on the presence of their C-terminal helix (CSN1 isoform-2 residues: 466-527; and CSN4: 364-406) for integration into CSN. COP9 signalosome shares common architecture with the 26S proteasome lid and eIF3 where the 19S lid subunit RPN5 and the eIF3 core subunit eIF3a share significant structural similarity with CSN4. 42
61005 408218 pfam18421 Peptidase_M23_N Peptidase family M23 N-terminal domain. This is the N-terminal domain of Peptidase M23 pfam01551 mostly found in proteobacteria. 73
61006 408219 pfam18422 TNFR_16_TM Tumor necrosis factor receptor member 16 trans-membrane domain. This is the helical trans-membrane domain found in tumor necrosis factor receptor superfamily member 16 (also known as p75 neurotrophin receptor, and nerve growth factor receptor-NGFR). p75 plays prominent biological functions such as the induction of cell death, and it demonstrates several other activities, like survival, axonal growth, and cell migration. The trans-membrane (TM) domain of p75 stabilizes the receptor dimers through a disulfide bond, essential for the NGF signalling. Structural and mutational analyses indicate that Cys257 plays the key role in this stabilisation process. Furthermore, although the p75-C257A mutant is still capable of forming dimers and binding to NGF, it is unable to transduce the signals triggered by NGF binding in some cell signalling paradigms. 38
61007 408220 pfam18423 zf_CopZ Zinc binding domain. This is the N-terminal domain, containing a mononuclear metal center for zinc binding, found in copper chaperone CopZ proteins. 62
61008 408221 pfam18424 a_DG1_N2 Alpha-Dystroglycan N-terminal domain 2. This is the second N-terminal domain found in alpha-Dystroglycan (DG). The murine skeletal muscle N-terminal alpha-DG region, contains two autonomous domains; the first identified as an Ig-like and the second resembling ribosomal RNA-binding proteins. This domain is similar to the small subunit ribosomal protein S6 of Thermus thermophilus (S6 domain). It is suggested that the S6 domain may be of functional relevance for LARGE (like-acetylglucosaminyltransferase) recognition along the alpha-DG maturation pathway. 123
61009 408222 pfam18425 CspB_prodomain Csp protease B prodomain. Csp proteases (Csps) and the subtilase protease family Subtilases are serine proteases that contain a catalytic triad in the order of Asp, His and Ser. Structure analysis reveals that Csps are subtilisin-like proteases with two distinctive functional features: a central jellyroll domain and a retained prodomain. The prodomain adopts a similar fold to the prodomains of related subtilisin-like proteases with the C-terminal region extending deep into the catalytic cleft. However, unlike the majority of subtilisin-like proteases, the prodomain stays bound to the subtilase domain via a network of interactions that result in tighter prodomain binding relative to other subtilases. Finally the prodomain acts as both an intramolecular chaperone and an inhibitor of CspB protease activity. 89
61010 408223 pfam18426 Tli4_C Tle cognate immunity protein 4 C-terminal domain. T6SS bacteria employ toxic effectors to inhibit rival cells and concurrently use effector cognate immunity proteins to protect their sibling cells. The effector and immunity pairs (E-I pairs) endow the bacteria with a great advantage in niche competition. This is the C-terminal domain of Tli4. The Tle cognate immunity proteins (Tlis) can directly disable the transported Tle protein and thereby mediate the self-protection process. The Tle-Tli effector-immunity (E-I) pairs confer substantial advantage to the donor cell during interbacterial competition. Tli4 displays a two-domain structure, in which a large lobe and a small lobe form a crab claw-like conformation. Tli4 uses this crab claw to grasp the cap domain of Tle4, especially the lid2 region, which prevents the interfacial activation of Tle4 and thus causes enzymatic dysfunction of Tle4. Structural comparison indicates similarity between this C-terminal domain of Tli4 and Tsi3, which is the cognate immunity protein of the effector protein Tse3 in P. aeruginosa PDB:4n7s. 161
61011 408224 pfam18427 DDR_swiveling DD-reactivating factor swiveling domain. AdoCbl-dependent diol dehydratase (DD) (EC 4.2.1.28) is one of the enzymes that catalyzes the conversion of 1,2-propanediol, 1,2-ethanediol, and glycerol to the corresponding aldehydes. A DD-reactivating factor (DDR) is responsible for the rapid reactivation of the inactivated holoDD in the presence of AdoCbl, ATP, and Mg2+. DDR exists as a dimer of heterodimer (alpha-beta)2. The alpha subunit has four domains: ATPase domain, swiveling domain, linker domain, and insert domain. The beta subunit, composed of a single domain, has a similar fold to the beta subunit of diol dehydratase (DD). This entry describes the swiveling domain of DDR, which structurally connects the beta subunit and the ATPase domain of the other alpha subunit. Furthermore, the beta subunit moves with the swiveling domain while the linker domain acts as a flat spring or a hinge for the domain movement of the swiveling domain. 162
61012 408225 pfam18428 BRCT_3 BRCA1 C Terminus (BRCT) domain. Brca1 C-terminal (BRCT) domains are common protein-protein interaction regions in proteins involved in the DNA damage response and DNA repair. For example, 53BP1, which plays multiple roles in mammalian DNA damage repair, has a C-terminal tandem BRCT domain (BRCT2), which in its orthologs, Saccharomyces cerevisiae Rad9p and Schizosaccharomyces pombe Crb2, mediates binding to the equivalents of gammaH2AX. Structural and functional studies indicate that the 53BP1-BRCT2 domain is a competent binding module for phosphorylated peptides with a clear specificity for the DNA-damage marker gammaH2AX, and in isolation from other parts of 53BP1 is sufficient for localization to sites of DNA damage in cells associated with gammaH2AX. 102
61013 408226 pfam18429 DUF5609 Domain of unknown function (DUF5609). This is a probable HAD-like (haloalkanoate dehalogenase) domain found in bacterial phosphoserine phosphatases. 65
61014 408227 pfam18430 DBD_HTH Putative DNA-binding domain. This is a putative DNA-binding protein dimerization domain found in bacterial proteins such as CD3330 (a transposon-related DNA-binding protein from Clostridium difficile). Crystal structure analysis suggests that the CD3330 N-terminus is involved in dimer formation and also in crystal packing, but the C-terminus is open to solvent. 35
61015 408228 pfam18431 RNAse_A_bac Bacterial CdiA-CT RNAse A domain. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. Structure analysis of CdiA-CT shows that it adopts the same fold (with two beta-sheets forming an overall kidney shape) as angiogenin and other RNase A paralogs, but the toxin does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. Furthermore, structural comparison analysis identified human angiogenin, Rana pipiens protein P-30 (onconase) and mouse pancreatic ribonuclease (RNase 1) as the closest structural homologs of CdiA-CT. 113
61016 408229 pfam18432 ECD Extracellular Cadherin domain. This is an extracellular cadherin (EC) domain which can be found at the N-terminal region of Protocadherin 15 (Pcdh15). Pcdh15 features exceptionally long extracellular domains containing 11 ECs. These repeats are structurally similar, but not identical in sequence, often featuring linkers with conserved calcium-binding sites that confer mechanical strength to them. 110
61017 408230 pfam18433 DUF5610 Domain of unknown function (DUF5610). This is a domain of unknown function found in bacterial proteins. 114
61018 408231 pfam18434 Kazal_3 Kazal-type serine protease inhibitor domain. Kazal domain found in factor I-like modules (FIMs) region on the carboxyl-terminal of complement component C7 proteins. Complement component C7 is a subunit of the membrane attack complex (MAC), a fundamental machinery in the mammalian innate immunity. KAZAL domains are common in serine protease inhibitors. 49
61019 408232 pfam18435 EstA_Ig_like Esterase Ig-like N-terminal domain. This is an N-terminal immunoglobulin (Ig)-like domain found in esterases such as EstA. Analysis of the EstA structure confirms that it is a member of the alpha/beta hydrolase family, with a conserved Ser-Asp-His catalytic triad. The Ig-like domain presumably plays a role in the multimerization of EstA into an unusual hexameric structure. Additionally, it may also participate in the catalysis of EstA by guiding the substrate to the active site. 120
61020 408233 pfam18436 HECW1_helix Helical box domain of E3 ubiquitin-protein ligase HECW1. This is a region of 109 amino acids found in HECW1 proteins in Eukaryotes. Polymorphisms in the same region in the C. elegans homologue affect C. elegans behavioural avoidance of a lawn of Pseudomonas aeruginosa. 67
61021 408234 pfam18437 Nup54_C Nup54 C-terminal interacting domain. The mammalian nuclear pore complex (NPC) conducts nucleocytoplasmic transport and contains multiple copies of nucleoporins (nups). This is the C-terminal interacting domain found on Nup54. Nup45 is a splice variant of Nup58 with an identical alpha-helical region. Nup54 along with Nup62 and Nup58 are essential for nuclear transport. The C-terminal part of the alpha-helical region of Nup54 interacts with a C-terminal part of the alpha-helical region of Nup58. Interestingly, this region appears in two distinct conformations: a single helix and a helix-loop-helix, termed 'straight' and 'bent'. Whereas the straight conformer consists of a 34 residues long alpha helix (residues 460-493), the bent conformer is composed of two alpha helices, each 13 residues long, connected by a central loop (N helix, residues 460-472; C helix, residues 477-489). 39
61022 408235 pfam18438 Glyco_hydro_38 Glycosyl hydrolases family 38 C-terminal domain 1. The enzymatic hydrolysis of alpha-mannosides is catalyzed by glycoside hydrolases (GH), termed alpha-mannosidases. Streptococcal (Sp) GH38 alpha-mannosidase is active on N-glycans and possibly O-glycans. The SpGH38 structure can be considered as five domains: an N-terminal alpha/beta-domain, a three-helix bundle and three predominantly beta-sheet domains. This is the first of the three beta-sheet domains found in GH38, termed Beta-1. Structural analysis indicates that the beta-1 domain bows outward from the protein core and is involved in dimer interactions, whilst also forming a lid 'above' and somewhat into the active centre of its dimer. 111
61023 408236 pfam18439 zf_UBZ Ubiquitin-Binding Zinc Finger. This is the ubiquitin-binding zinc finger (UBZ) domain found in DNA polymerase eta (EC:2.7.7.7). It is important in the recruitment of the polymerase to the stalled replication machinery in translesion synthesis. The UBZ domain adopts a classical C2H2 zinc-finger structure characterized by a beta-beta-alpha fold. 32
61024 408237 pfam18440 GlcNAc-1_reg Putative GlcNAc-1 phosphotransferase regulatory domain. The Golgi enzyme UDP-GlcNAc-lysosomal enzyme N-acetylglucosamine-1-phosphotransferase (GlcNAc-1-phosphotransferase), an alpha2beta2gamma2 hexamer, mediates the initial step in the addition of the mannose 6-phosphate targeting signal on newly synthesized lysosomal enzymes. GNPTAB encodes the alpha and beta subunits of GlcNAc-1-phosphotransferase, and mutations in this gene cause the lysosomal storage disorders mucolipidosis II and III alpha-beta. The alpha-beta subunits contain three identifiable domains separated by so-called spacer regions. This domain is part of the first spacer region, Spacer-1. Studies indicate that GlcNAc-1 lacking spacer-1 exhibits enhanced phosphorylation of several non-lysosomal glycoproteins, while the phosphorylation of lysosomal acid hydrolases is not altered. In view of these effects on the maturation and function of GlcNAc-1, it is suggested to rename 'spacer-1' the 'regulatory-1' domain. 88
61025 408238 pfam18441 Hen1_Lam_C Hen1 La-motif C-terminal domain. RNA silencing is a conserved regulatory mechanism in fungi, plants and animals that regulates gene expression and defence against viruses and transgenes. A conserved S-adenosyl-l-methionine-dependent RNA methyltransferase, HUA ENHANCER 1 (HEN1), and its homologues are responsible for 2'-O-methylation on the 3' terminal nucleotide of microRNAs and small interfering RNAs (siRNAs). The 2'-O-methylation protects miRNAs and siRNAs from 3'-end uridylation and 3'-to-5' exonuclease-mediated degradation. This domain lies on the C-terminal region of the La-motif domain found in HEN1. 136
61026 375868 pfam18442 G2BR E3 gp78 Ube2g2-binding region (G2BR). The activity of RING finger ubiquitin ligases (E3) is dependent on their ability to facilitate transfer of ubiquitin from ubiquitin-conjugating enzymes (E2) to substrates. The G2BR domain within the E3 gp78 binds selectively and with high affinity to the E2 Ube2g2. Binding to the G2BR results in conformational changes in Ube2g2 that affect ubiquitin loading. The Ube2g2-G2BR interaction also causes a 50-fold increase in affinity between the E2 and RING finger. Hence, the Ube2g2-binding region (G2BR) is required for the function of gp78. In yeast, Ubc7p, the ortholog of Ube2g2, is recruited by Cue1p to the ER membrane. Cue1p directly binds Ubc7p through a stretch of 50 aa domain analogous to G2BR, i.e. suggesting that this domain which activates ERAD and Hrd1p stimulating ubiquitylation, might be the yeast equivalent of the G2BR domain. 26
61027 408239 pfam18443 Tli4_N Tle cognate immunity protein 4 N-terminal domain. T6SS bacteria employ toxic effectors to inhibit rival cells and concurrently use effector cognate immunity proteins to protect their sibling cells. The effector and immunity pairs (E-I pairs) endow the bacteria with a great advantage in niche competition. This is the N-terminal domain of Tli4. The Tle cognate immunity proteins (Tlis) can directly disable the transported Tle protein and thereby mediate the self-protection process. The Tle-Tli effector-immunity (E-I) pairs confer substantial advantage to the donor cell during interbacterial competition. Tli4 displays a two-domain conformation (domains I and II) and contains 17 beta-strands and four helices. These two domains pack into a crab claw-like conformation functioning as an inhibitor of Tle4. Both domains adopt an alpha+beta architecture. Domain I features a central antiparallel beta-sheet sandwiched by two helices and a short antiparallel beta-sheet. This entry comprises the N-terminal domain I found in Tli4 proteins. 149
61028 408240 pfam18444 RRM_9 RNA recognition motif. The Mex67-Mtr2 complex (TAP-p15 or NXF1-NXT1 in metazoans) is the principal mRNA export factor in Saccharomyces cerevisiae. Mex67 is a member of the NXF family of proteins and has conserved homologs through eukaryotes from yeast to humans. Although sequence conservation is poor between S. cerevisiae Mex67 and Homo sapiens NXF1, they do show functional complementarity. Mex67 and TAP/NXF1 are modular proteins that contain four structural domains: an N-terminal RNA recognition motif (RRM), a leucine rich repeat (LRR) domain, a nuclear transport factor 2-like domain (NTF2L) and an ubiquitin-associated domain (UBA). This entry describes the N-terminal RNA recognition motif (RRM) found in Mex67 proteins. 70
61029 375871 pfam18445 zf_PR_Knuckle PR zinc knuckle motif. This is a zinc knuckle motif found in PRDM4 (Schwann cell factor 1, SC-1), a member of the PR protein family. PRDM4 is a transcriptional regulator that has been implied in transduction of nerve growth factor signals via the p75 neurotrophin receptor and in cell growth arrest. The short motif is also present in several other PR proteins including human PRDM6 (PRISM), PRDM7, PRDM9 (meisetz), PRDM10 (tristanin), PRDM11, and PRDM15. The conservation of cysteine and histidine residues suggested that this 20 amino acid motif binds zinc, hence the name 'PR zinc knuckle' to distinguish it from the longer (30 amino acid) C2H2-like zinc fingers that are located C-terminally of the PR domain. The PR zinc knuckle fold is similar to that of Gag-knuckles (a beta-hairpin providing two zinc ligands followed by a short helix or a loop providing the other two zinc ligands) and zinc ribbons (two beta-hairpins, each providing two zinc ligands). 38
61030 408241 pfam18446 DUF5611 Domain of unknown function (DUF5611). This is a domain of unknown function. Studies of the TA0095 gene product indicate that this 96-residue hypothetical protein from Thermoplasma acidophilum is a member of the COG4004 orthologous group of unknown function found in Archaea. The structure displays an alpha/beta two-layer sandwich architecture formed by three alpha-helices and five beta-strands. Furthermore, structural homologs indicate that the TA0095 structure belongs to the TBP-like fold. 103
61031 408242 pfam18447 FN3_7 Fibronectin type III domain. This domain is found in Interleukin-7 receptor subunit alpha (IL-7Ralpha), which together with IL-7 forms a complex crucial to several signalling cascades leading to the development and homeostasis of T and B cells. IL-7Ralpha carries a 219 residue ectodomain on the N-terminal region which is crucial for T and B-cell development. Mutations in the IL-7Ralpha ectodomain inhibit T and B cell development, resulting in patients with a form of severe combined immunodeficiency (SCID). The ectodomain folds into two fibronectin type III (FNIII) domains connected by a 310-helical linker. This entry comprises the first of the two FNIII domains, D1, while the D2 domain is pfam00041. In the D1 domain of IL-7Ralpha, a disulfide bond (C22R-C37R) conserved among cytokine receptor class I (CRH I) family members bridges two beta strands. 97
61032 408243 pfam18448 CBM46 Carbohydrate binding domain. Carbohydrate active enzymes (CAZYmes) that target recalcitrant polysaccharides are modular enzymes containing noncatalytic carbohydrate-binding modules (CBMs) that direct enzymes to their cognate substrate, thus potentiating catalysis. The structure of Bacillus halodurans endo-beta-1,4-glucanase B (Cel5B) reveals that CBM46 is tightly associated with the catalytic module and, dependent on the glucan presented to the enzyme, can contribute directly to substrate binding or play a targeting role in directing the enzyme to regions of the plant cell wall rich in the polysaccharide hydrolyzed by the enzyme. The CBM46 domain displays a classic beta-sandwich jelly roll fold. Against beta-1,3-1,4-glucans CBM46 domain participates in productive substrate binding and thus plays a direct role in the hydrolytic activity of the enzyme. 105
61033 408244 pfam18449 Endotoxin_C2 Delta endotoxin. This domain (D-VI) can be found in Bacillus thuringiensis (Bt) insecticidal protein Cry1Ac. Full-length structural analysis reveals that Cry1Ac contains seven distinct domains (DI-DVII): the three canonical toxin core domains (D-I through D-III) and four protoxin domains (D-IV through D-VII). Cry1Ac is sickle-shaped with the toxic core as handle and the protoxin domains as the blade. Domains IV and VI are alpha-bundles that resemble structural/interaction domains such as spectrin (PDB ID: 1CUN) or bacterial fibrinogen-binding complement inhibitor. 63
61034 408245 pfam18450 zf_C2H2_6 Zinc Finger domain. This is a C2H2 type zinc finger domain which can be found in Zinc finger and BTB domain-containing protein 21 (ZBTB21). 28
61035 408246 pfam18451 CdiA_C Contact-dependent growth inhibition CdiA C-terminal domain. Contact-dependent growth inhibition (CDI) systems encode polymorphic toxin/immunity proteins that mediate competition between neighboring bacterial cells. CDI is mediated by the CdiB/CdiA family of two-partner secretion proteins. This domain represents the C-terminal of CdiA proteins (CdiA-CT), which contains the CDI toxin activity. The C-terminal nuclease domain forms a stable complex with its cognate immunity protein. It is also sufficient to inhibit growth when expressed in E. coli cells, consolidating the idea that they constitute the functional CDI toxin. The CdiA-CT C-terminal domains are structurally similar to type IIS restriction endonucleases suggesting that the toxins have metal-dependent DNase activity. 81
61036 408247 pfam18452 Ig_6 Immunoglobulin domain. This is an immunoglobulin domain which can be found in Interleukin-18 receptor alpha (IL-18Ra). IL-18Ra ectodomain folds into three immunoglobulin (Ig)-like domains, similar to IL-1 receptors. Each domain comprises a two-layer sandwich of six to nine beta-strands and contains at least one intra-domain disulfide bond. 50
61037 375879 pfam18453 C4bp_oligo Oligomerization domain of C4b-binding protein alpha. This is the C-terminal oligomerization domain found in C4b-binding protein (C4BP), which contains 14 cysteines that form 7 intermolecular disulfide bridges. C4BP is a plasma glycoprotein complex of 570 kDa, which is mainly produced in the liver. C4BP is the major inhibitor of complement activation, the major isoform consists of 7alpha and one beta-chain where each alpha-chain comprises eight complement control domain proteins (CCPs) and a C-terminal oligomerization domain. This domain carries a stretch of 42-45 amino acids that has been shown to be required for ring formation. 49
61038 408248 pfam18454 Mtd_N Major tropism determinant N-terminal domain. This is the N-terminal domain of major tropism determinant (Mtd), a retroelement-encoded receptor-binding protein. Mtd-N forms a three-fold symmetric beta-prism. This resembles the pseudo three-fold-symmetric beta-prisms of monocot lectins, but lacks residues in these lectins identified as binding carbohydrates. The beta-prism and beta-sandwich domains reinforce overall trimeric assembly and therefore may have indirect roles in stabilizing the backbone of the variable region. 37
61039 375881 pfam18455 GBR2_CC Gamma-aminobutyric acid type B receptor subunit 2 coiled-coil domain. This is the intracellular coiled-coil domain found in Gamma-aminobutyric acid type B receptor subunit 2 (GBR2). The coiled-coil complex between the GABAB receptor subunits GBR1 and GBR2 is responsible for facilitating the surface transport of the intact receptor. Disruption of the hydrophobic coiled-coil interface with single mutations in either subunit impairs surface expression of GBR1, confirming that the coiled-coil interaction is required to inactivate the adjacent ER retention signal of GBR1. 39
61040 408249 pfam18456 CmlA_N Diiron non-heme beta-hydroxylase N-terminal domain. This is the N-terminal domain found in Diiron non-heme beta-hydroxylase (CmlA). CmlA catalyzes beta-hydroxylation of the precursor molecule l-p-aminophenylalanine (l-PAPA) to form l-p-aminophenylserine. Structural analysis indicates that the N-terminal domain facilitates dimerization and has a mixed alpha-beta topology. Furthermore, a projecting 'dimerization arm' (residues 108-146) from the N-terminal domain of CmlA mediates the interaction between the monomers. 232
61041 408250 pfam18457 PUD1_2 Up-Regulated in long-lived daf-2. This entry includes C. elegans PUD-1 and PUD-2, two proteins up-regulated in daf-2(loss-of-function) (PUD). They are homologous 17-kD proteins with similar beta-sandwich folds that further associate with each other into a V-shaped heterodimer. 172
61042 408251 pfam18458 XPB_DRD Xeroderma pigmentosum group B helicase damage recognition domain. This domain is found in the N-terminal region of xeroderma pigmentosum group B (XPB) helicase present in Archaeoglobus fulgidus. XPB is essential for transcription, nucleotide excision repair, and TFIIH functional assembly. The domain is a damage recognition domain (DRD) which allows XPB to unwind damaged DNA as needed for nucleotide excision repair. 57
61043 408252 pfam18459 PCSK9_C1 Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds. 83
61044 408253 pfam18460 HetR_C Heterocyst differentiation regulator C-terminal Hood domain. This is the C-terminal hood domain found in Heterocyst differentiation control protein (HetR). HetR-C binds to PatS peptide. Two PatS6 peptides bind to the lateral clefts of HetR-Hood domain, and trigger significant conformational changes of the flap domain, resulting in dissociation of the auxiliary alpha-helix and eventually release of HetR from the DNA major groove and termination of transcription. 79
61045 408254 pfam18461 Atypical_Card Atypical caspase recruitment domain. The N-terminal effector domain found in NLRC5. It adopts a six alpha-helix bundle with a general death fold. Structure and sequence analysis of the NLRC5-N indicate that it possesses a fold similar to the one of the death-fold domains; however, it displays significant differences in the number of core alpha-helices and their relative orientation. Hence, it is suggested that NLRC5 belongs to the caspase recruitment domain (CARD) subfamily as an atypical CARD. 95
61046 408255 pfam18462 DUF5612 Domain of unknown function (DUF5612). This is a domain of unknown function which is mostly found at the C-terminal of ACT domains such as pfam01842. 143
61047 408256 pfam18463 PCSK9_C3 Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds. 74
61048 408257 pfam18464 PCSK9_C2 Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds. 66
61049 408258 pfam18465 Rieske_3 Rieske 3Fe-4S. This domain is comprised of the iron-sulphur cluster and Rieske subunit found in the large subunit of arsenite oxidase. Arsenite oxidase is a 100 kDa molybdenum- and iron-sulfur-containing protein located on the outer surface of the inner membrane of Gram-negative organisms. The large subunit of arsenite oxidase is similar to other members of the dimethylsulfoxide (DMSO) reductase family of molybdenum enzymes. The large subunit of arsenite oxidase is divided into four domains, with domain I binding the [3Fe-4S] cluster. Domain I consists of three antiparallel beta sheets and six helices. The [3Fe-4S] cluster is coordinated by the motif Cys21-X2-Cys24-X3-Cys28 near the interface with domains III and IV. A large, flattened funnel-like cavity bounded by domains I, II, and III leads to the molybdenum center pfam00384 located near the center of the molecule. 96
61050 408259 pfam18466 GluRS_N Glutamate--tRNA ligase N-terminal domain. This is an N-terminal domain of Glutamate--tRNA ligase (GluRS, EC:6.1.1.17). The domain adopts a classical glutathione S-transferase (GST)-like fold and it interacts with tRNA-aminoacylation cofactor ARC1 (Arc1p) N-terminal domain for the formation of aminoacyl-tRNA synthetase (aaRS) complex in yeast. 55
61051 408260 pfam18467 DUF5613 Domain of unknown function (DUF5613). This is a domain of unknown function found in bacteria. 97
61052 408261 pfam18468 Pfk_N Phosphofructokinase N-terminal domain yeast. This is a phosphofructokinase (Pfk) N-terminal domain found in yeast ATP-dependent 6-phosphofructokinase subunit alpha. ATP-dependent 6-phosphofructokinases (Pfks, EC 2.7.1.11) catalyze the phosphorylation of fructose 6-phosphate (F-6-P) to fructose 1,6-bisphosphate, a key control step of glycolysis in most organisms. The N-terminal domain contains the active site and is related to glyoxalase I (E.C. 4.4.1.5). 98
61053 408262 pfam18469 PH_18 Pleckstrin homology domain. This is a Pleckstrin Homology (PH) domain found on the N-terminal region of the histone chaperone Rtt106 in yeast. Rtt106 binds histone H3 acetylated at lysine 56 (H3K56ac) and facilitates nucleosome assembly during several molecular processes and this N-PH domain is shown to mediate histone binding. 140
61054 408263 pfam18470 Cas9_a Cas9 alpha-helical lobe domain. This is an alpha-helical lobe domain found in Cas9 proteins. Cas9 enzymes adopt a bilobed architecture composed of a nuclease lobe containing juxtaposed RuvC and HNH nuclease domains and a variable alpha-helical lobe likely to be involved in nucleic acid binding. Amino acid residues located in both the nuclease and alpha-helical lobe clefts are highly conserved within type II-A Cas9 proteins. 221
61055 408264 pfam18471 Ribosomal_L27_C Ribosomal L27 protein C-terminal domain. This is the C-terminal domain of 54S ribosomal protein L2 (also known as Mitochondrial large ribosomal subunit protein bL27m). bL27 C-terminal region interacts with an expansion segment of 21S rRNA to form part of the central protuberance. 240
61056 408265 pfam18472 HP1451_C HP1451 C-terminal domain. HP1451 modulates the ATPase activity of H. pylori HP0525. It is suggested that HP1451 acts as an inhibitory factor of HP0525 to regulate Cag-mediated secretion. It consists of two domains. The two HP1451 domains (referred to as the KH-domain and S8-domain) interact with each other via two salt bridges. The second domain is structurally homologous to other nucleic acid-binding proteins such as the NTD of ribosomal protein S8 and DNaseI. 67
61057 408266 pfam18473 Urease_linker Urease subunit beta-alpha linker domain. This domain is present in bacterial ureases and corresponds to the gap region between the C-terminus of the beta-chain Urease beta subunit pfam00699 and the N-terminus of the alpha-chain Urease alpha-subunit, N-terminal domain pfam00449. It is suggested that this region is required for the stability of the putative transmembrane beta-barrel, and might be the reason for bacterial urease (B. pasteurii) not being lethal to insects. 34
61058 408267 pfam18474 DUF5614 Family of unknown function (DUF5614). This is an N-terminal domain found in C7orf25 protein UPF0415. It is distantly related to the PD-(D/E)XK nucleases. 221
61059 408268 pfam18475 PIN7 PIN domain. This is a bacterial PIN-like domain of unknown function. 103
61060 408269 pfam18476 PIN_8 PIN like domain. This is a domain of unknown function, suggested to be a member of PIN like domains clan. 228
61061 408270 pfam18477 PIN_9 PIN like domain. This is a domain of unknown function that resembles the PIN like domains. Family members include Ribonuclease VapC9. 113
61062 408271 pfam18478 PIN_10 PIN like domain. This is a bacterial domain of unknown function suggested to resemble PIN like domains. 84
61063 408272 pfam18479 PIN_11 PIN like domain. This is a eukaryotic/eumetazoan PIN like domain found in the C-terminal region of bilateral ZNF451 proteins such as isoform 1 of human ZNF451. ZNF451 was shown to interact with p300 by the PIN-like domain and to negatively regulate TGF-beta signalling in a p300-dependent and sumoylation-independent manner. This domain is suggested to possess a potential active nuclease due to the presence of at least four conserved Asp residues in the predicted active site. Furthermore, it contains several conserved Cys and His residues, which may suggest stabilization of the domain structure with an embedded short zinc-binding loop. 118
61064 408273 pfam18480 DUF5615 Domain of unknown function (DUF5615). This is a domain of unknown function found in potential toxin-antitoxin system component. 77
61065 408274 pfam18481 DUF5616 Domain of unknown function (DUF5616). This domain is found in a number of prokaryotic proteins. It is mostly found fused with the N-terminal domain pfam04256. This C-terminal domain is suggested to be a PIN like domain. 140
61066 408275 pfam18482 Pih1_fungal_CS Fungal Pih1 CS domain. The Pih1 protein is part of the R2TP complex. The CS domain of Pih1 binds to the unstructured region of Tah1. The C-terminal domain of Pih1 consists of a seven-stranded beta sandwich with the topology of a CS domain, a structural motif also found in Hsp90 cochaperones such as p23/Sba1 and Sgt1. 90
61067 408276 pfam18483 Bact_lectin Bacterial lectin. This entry primarily matches to legume-like lectin domains found in prokaryotes. 211
61068 408277 pfam18484 CDCA Cadmium carbonic anhydrase repeat. This domain is the cadmium carbonic anhydrase repeat unit of the beta-carbonic anhydrase of a marine diatom, that uses both zinc and cadmium for catalysis of the reversible hydration of carbon dioxide for use in inorganic carbon acquisition for photosynthesis (thus being a cambialistic enzyme). Compared with alpha- and gamma-carbonic anhydrases that use three histidines to coordinate the zinc-atom, this beta-carbonic anhydrase has two cysteines and one histidine, and rapidly binds cadmium. 184
61069 408278 pfam18485 GST_N_5 Glutathione S-transferase, N-terminal domain. This is the N-terminal (GST-N) domain containing a thioredoxin fold. This domain is found in methionyl-tRNA synthetase (MRS), a multi-tRNA synthetase complex (MSC) component. 74
61070 408279 pfam18486 PUB_1 PNGase/UBA- or UBX-containing domain. This is a PUB domain (PNGase/UBA- or UBX-containing domain), found in E3 ubiquitin-protein ligase RNF31, also known as Ring finger protein 31 and HOIL-1-interacting protein (HOIP) (EC:2.3.2.27). RNF31/HOIP is observed to contribute to inborn human immunity disorders, in which RNF31/HOIP missense mutation at PUB domain gives rise to the de-stabilized LUBAC complex (linear ubiquitin chain assembly complex) and subsequently causes the auto-inflammation and immunodeficiency. In addition, RNF31 is reported to modify ERK and JNK pathways leading to cisplatin resistance. Functional studies indicate that HOIP and OTULIN interact and act as a bimolecular editing pair for linear ubiquitin signals where the HOIP-PUB domain binds to the PUB interacting motif (PIM) of OTULIN and the chaperone VCP/p9. This interaction plays an important role where the HOIP binding to OTULIN is required for the recruitment of OTULIN to the TNF receptor complex and to counteract HOIP-dependent activation of the NF-KB pathway. 64
61071 408280 pfam18487 TSR Thrombospondin type 1 repeat. This is a thrombospondin type I repeat (TSR) found in properdin. Properdin, also known as factor P (fP), is a glycoprotein constructed from a common pool of structure units or modules, which are homologous to the thrombospondin type 1 repeat, TSR. It is a positive regulator of the complement system that stabilizes the alternative pathway C3-convertase C3bBb. Properdin also inhibits the factor H-mediated cleavage of C3b by factor I. In addition, properdin acts as a pattern recognition molecule capable of identifying and interacting with microbial surfaces, apoptotic cells, and necrotic cells. However, this role of pattern recognition is controversial. Studies indicate that this domain is a TSR variant. It is present at the N-terminal of properdin and has been denoted as TSR-0. It is suggested that the TSR-0 domain of properdin which possesses only the six Cys residues and no CWR-layered motif (which is usually found in other TSR domains) may constitute a truncated TSR domain. 50
61072 408281 pfam18488 WYL_3 WYL domain. Many phytopathogens secrete and/or inject 'effector' proteins inside host cells to modulate cellular processes. Phytopathogens deliver effector proteins inside host plant cells to promote infection. Crystal structures of the effector domains from two oomycete RXLR proteins, Phytophthora capsici AVR3a11 and Phytophthora infestans PexRD2 reveal a core alpha-helical fold (termed the 'WY-domain') which enables functional adaptation of these fast evolving effectors through (i) insertion/deletions in loop regions between alpha-helices, (ii) extensions to the N and C termini, (iii) amino acid replacements in surface residues, (iv) tandem domain duplications, and (v) oligomerization. It is proposed that the core fold provides both a degree of molecular stability and plasticity that enables development/maintenance of effector virulence activities while allowing evasion of recognition by the plant innate immune system during rapid 'arms race' co-evolution. 63
61073 408282 pfam18489 Alpha_Helical Alpha helical domain. This is the N-terminal domain found in putative tRNA-binding proteins found in archaea. Structural analysis from Pyrococcus horikoshii indicates that it is a helical domain where many conserved residues are found in the first three helices and are mainly located on the inverse side of the putative tRNA-binding site. A structural homology search suggested that this fold prefers to bind proteins/peptides. 127
61074 408283 pfam18490 tRNA_bind_4 tRNA-binding domain. This is the N-terminal domain found in archaeal type-2 serine-tRNA ligase (SerRS) (EC:6.1.1.11). The SerRS N-terminal domain interacts with the extra-arm stem and the outer corner of tRNA specific to Selenocysteine (tRNA-Sec). 159
61075 408284 pfam18491 SRA SET and RING associated domain. This is the C-terminal domain found in PvuRts1I, a modification-dependent restriction endonuclease that recognizes 5-hydroxymethylcytosine (5hmC) as well as 5-glucosylhydroxymethylcytosine (5ghmC) in double-stranded DNA in bacteria. Structural analysis indicates that it has the typical SRA (SET and RING associated) domain fold (pfam02182). 140
61076 408285 pfam18492 ORF_2_N Open reading frame 2 N-terminal domain. This is the N-terminal domain found in ORF 2 (open reading frame 2), a protein encoded just downstream of asp (A. sobria serine protease). The ORF 2 N-terminal domain is essential for proper ASP folding. This domain is intrinsically disordered but forms some degree of secondary structure upon binding ASP. 121
61077 408286 pfam18493 DUF5617 Domain of unknown function (DUF5617). This is a C-terminal domain of unknown function found in gammaproteobacteria. 96
61078 408287 pfam18494 Pullulanase_Ins Pullulanase Ins domain. Pullulanases (pullulan 6-glucanohydrolase, EC 3.2.1.41) are debranching enzymes that are able to hydrolyze the alpha-1,6-glycosidic linkage in pullulan, starch, amylopectin, and related oligosaccharides. Type I pullulanases specifically cleave the alpha-1,6-glycosidic linkages in pullulan and branched oligosaccharides to produce maltotriose and linear oligosaccharides, respectively. Structural analysis of Klebsiella lipoprotein pullulanase (PulA) illustrates that the catalytic core is composed of two major regions: the TIM-barrel domain A and beta-sandwich fold domain C. PulA contains an extra domain, a highly mobile Ins subdomain of unknown function which is inserted into the catalytic TIM-barrel domain A of Klebsiella pullulanases. The Ins subdomain is rich in helical and loop secondary structure. A disulfide bond between Cys491 and Cys506 and two Ca2+ ions presumably stabilize this domain. This insertion is also found in pullulanases from other Gram-negative genera that have a functional T2SS, such as Vibrio, Aeromonas, and Photorhabdus. Functional analysis indicates that this domain is required for PulA secretion via the T2SS. 74
61079 408288 pfam18495 VbhA Antitoxin VbhA. VbhT is a bacterial Fic protein of the mammalian pathogen B. schoenbuchensis. It is composed of an N-terminal FIC domain and a C-terminal BID domain. FIC domains are known to catalyse adenylylation (also called AMPylation). This entry represents VbhA, an antitoxin that binds the FIC domain (filamentation induced by cyclic AMP) of VbhT and inhibits its activity. It inhibits the adenylylation activity of VbhT by positioning close to the putative ATP-binding site, hence competing with ATP binding. 47
61080 408289 pfam18496 ColG_sub Collagenase G catalytic helper subdomain. This is the catalytic helper subdomain found in collagenase G from Clostridium histolyticum. This domain is indispensable for proper folding and full peptidase activity. 116
61081 408290 pfam18497 RNase_3_N Ribonuclease III N-terminal domain. This is the N-terminal domain of eukaryotic ribonuclease 3 (RNase III, EC:3.1.26.3). Structure analysis of Saccharomyces cerevisiae RNase III (Rnt1p):RNA revealed specific contacts between the N-terminal domain (NTD) dimer and the 5' end region of the tetraloop. Deletion of the NTD led to cleavage at alternative site(s), suggesting that this region increases the precision of the cleavage site selectivity. 95
61082 408291 pfam18498 DUF5618 Domain of unknown function (DUF5618). This is a domain of unknown function found in bacteria. 123
61083 408292 pfam18499 Cue1_U7BR Ubc7p-binding region of Cue1. Cue1p (coupling of ubiquitin conjugation to ER degradation protein 1) is an integral component of yeast endoplasmic reticulum (ER)-associated degradation (ERAD) ubiquitin ligase (E3) complexes. It tethers the ERAD ubiquitin-conjugating enzyme (E2), Ubc7p, to the ER and prevents its degradation, and also activates Ubc7p. This domain represents the Ubc7p-binding region (U7BR) of Cue1p with Ubc7p. The U7BR is an E2-binding domain that includes three alpha-helices that interact extensively with the 'backside' of Ubc7p. U7BR stimulates both RING-independent and RING-dependent ubiquitin transfer from Ubc7p. Moreover, the U7BR enhances ubiquitin-activating enzyme (E1)-mediated charging of Ubc7p with ubiquitin. 52
61084 408293 pfam18500 CadC_C1 CadC C-terminal domain 1. CadC is an integral membrane protein of 512 amino acids comprising an N-terminal cytoplasmic DNA-binding domain, a transmembrane helix, and a C-terminal periplasmic domain. CadC belongs to the ToxR-like regulators that encompass biochemically non-modified one-component systems with similar gross topology, including several low pH-induced transcription regulators. Structural analysis of the C-terminal periplasmic domain indicates that it resembles the sensory domain of a (pH-activated) ToxR-like regulator. Furthermore, it is composed of two subdomains with a cavity at their interface that is suited to accommodate cadaverine, the feedback inhibitor of the Cad system. This is the N-terminal subdomain of the C-terminal periplasmic domain. It is composed of five-stranded beta-sheets. 134
61085 408294 pfam18501 REC1 Alpha helical recognition lobe domain. Cpf1 is an RNA-guided endonuclease of a type V CRISPR-Cas system. Cpf1 adopts a bilobed architecture consisting of an alpha-helical recognition (REC) lobe and a nuclease (NUC) lobe, with the small CRISPR RNAs (crRNAs)-target DNA heteroduplex bound to the positively charged, central channel between the two lobes. The REC lobe consists of the REC1 and REC2 domains where REC1 comprises 13 alpha helices, and REC2 comprises ten alpha helices and two beta strands that form a small antiparallel sheet. This entry represents REC1 domain. 245
61086 408295 pfam18502 Mrpl_C 54S ribosomal protein L8 C-terminal domain. This is the C-terminal domain of mitochondrial 54S ribosomal protein L8. 111
61087 408296 pfam18503 RPN6_C_helix 26S proteasome subunit RPN6 C-terminal helix domain. This is the C-terminal helix domain found in RPN6, a component of the 26S proteasome. The C-terminal helices are essential for lid assembly. 27
61088 375930 pfam18504 Csm1_N Csm1 N-terminal domain. In the budding yeast Saccharomyces cerevisiae, sister chromatid co-orientation in meiosis I depends on the four-protein monopolin complex (Mam1, Csm1, Lrs4, and Hrr25/casein kinase 1), which localizes to centromeres from meiotic prophase through metaphase I. Csm1 and Lrs4 form a complex that resides in the nucleolus during interphase and relocalizes to centromeres during meiotic prophase, accompanied by phosphorylation of Lrs4. This is the N-terminal domain of Csm1 which forms a coiled-coil. 70
61089 408297 pfam18505 DUF5619 Domain of unknown function (DUF5619). This is a domain of unknown function found in bacteria and archaea. 85
61090 408298 pfam18506 RelB_N RelB Antitoxin alpha helical domain. This is an alpha helix domain found in the N-terminal region of antitoxin RelB. RelE-RelB (RelBE) is a toxin-antitoxin (TA) protein complex. It is suggested that the toxic action of RelE is counteracted by antitoxin RelB, which wraps around RelE, blocks its active site and sterically prevents binding to the ribosomal A-site. The long N-terminal alpha-helix of the tightly bound antitoxin RelB covers the presumed active site of the toxin RelE that is formed by a central beta-sheet. 46
61091 375933 pfam18507 WW_1 WW domain. This is a WW domain found in histone-lysine N-methyltransferase, H3 lysine-36 specific (EC:2.1.1.43) in Saccharomyces cerevisiae. The WW domain is the simplest natural beta-sheet structure. It is a 35-residue protein module found in signaling and regulatory proteins with two highly conserved tryptophans and a strictly conserved proline. 27
61092 408299 pfam18508 zf_C2H2_13 Zinc finger domain. The SAGA (Spt-Ada-Gcn5-acetyltransferase) complex performs multiple functions in transcription activation including deubiquitinating histone H2B, which is mediated by a subcomplex called the deubiquitinating module (DUBm). The yeast DUBm comprises a catalytic subunit, Ubp8, and three additional subunits, Sgf11, Sus1 and Sgf73, all of which are required for DUBm activity. A portion of the non-globular Sgf73 subunit lies between the Ubp8 catalytic domain and the zinc finger (ZnF)-UBP domain and has been proposed to contribute to deubiquitinating activity by maintaining the catalytic domain in an active conformation. Sgf73 contributes to maintaining both the organization and ubiquitin-binding conformation of Ubp8, thereby contributing to overall DUBm activity. This domain is a Sgf73 fragment in the DUB module. It is a zinc finger (ZnF) domain whose integrity is essential for the incorporation of this subunit into DUBm as well as for the catalytic activity of Ubp8, as either a short deletion or point mutations in Sgf73 zinc-coordinating residues disrupt the association of Sgf73 with the rest of the DUBm. 42
61093 375935 pfam18509 MCR Magnetochrome domain. Magnetotactic bacteria (MTB) align along the Earth's magnetic field using an organelle called the magnetosome. The magnetosome-associated protein MamP is conserved in all MTB and has a PDZ domain, a small c-type cytochrome domain (the first magnetochrome domain, MCR1), a 17-residue linker and a second magnetochrome domain (MCR2). This entry describes the two tandem magnetochrome domains carrying c-type cytochrome motifs CX2CH. 30
61094 408300 pfam18510 NUC Nuclease domain. This is a nuclease (NUC) domain found in Cpf1, an RNA-guided endonuclease of a type V CRISPR-Cas system. Structural and functional analysis indicate that this domain is involved in DNA cleavage. 158
61095 408301 pfam18511 F-box_5 F-box. Jasmonates are a family of plant hormones that regulate plant growth, development and responses to stress. COI1 is an F-box protein that functions as the substrate-recruiting module of the Skp1-Cul1-F-box protein (SCF) ubiquitin E3 ligase complex. The role of COI1-mediated JAZ degradation in jasmonate (JA) signaling is analogous to auxin signaling through the receptor F-box protein transport inhibitor response 1 (TIR1), which promotes hormone-dependent turnover of the AUX/IAA transcriptional repressors. The crystal structure of COI1 reveals a TIR1-like overall architecture, with an N-terminal tri-helical F-box motif bound to ASK1 and a C-terminal horseshoe-shaped solenoid domain formed by 18 tandem leucine-rich repeats. This entry represents the N-terminal F-box domain which is also found in other auxin signaling f-box proteins such as AFB1, AFB2 and AFB3. 42
61096 375938 pfam18512 BssB_TutG Benzylsuccinate synthase beta subunit. Members of this family include benzylsuccinate synthase beta subunit found in bacteria. BssB acts as a regulator of activation and may additionally be involved in regulating access to the enzyme's active site. It adopts a fold similar to that of a high potential iron-sulfur protein (HiPIP) and resembles the single small subunit of HPAD, which is known as the HpdC or HPADgamma in that system. 66
61097 408302 pfam18513 Pro_sub2 Prodomain subtilisin 2. Plasmodium subtilisin 2 (Sub2) is a multidomain protein that plays an important role in malaria infection. This domain is a conserved region of the inhibitory prodomain of Sub2 from Plasmodium falciparum, termed prosub2 which has structural similarity to bacterial and mammalian subtilisin-like prodomains. 88
61098 408303 pfam18514 Get5_C Get5 C-terminal domain. Tail-anchored trans-membrane proteins are targeted to membranes post-translationally. The proteins Get4 and Get5 form an obligate complex that catalyzes the transfer of tail-anchored proteins destined to the endoplasmic reticulum from Sgt2 to the cytosolic targeting factor Get3. This is the carboxyl domain of Get5 (Get5-C), a homodimerization domain, resulting in a heterotetrameric Get4/Get5 complex. 38
61099 408304 pfam18515 Rh5 Rh5 coiled-coil domain. This is a helical coiled-coil domain found in reticulocyte-binding protein homolog 5 (RH5), a Plasmodium falciparum protein essential for erythrocyte invasion. 255
61100 408305 pfam18516 RuvC_1 RuvC nuclease domain. This is a RuvC nuclease domain found in type V CRISPR-associated protein Cas12a (Cpf1), used for genome editing applications. These proteins carry out endoribonuclease activity for processing their own guide RNAs and RNA-guided DNase activity for target DNA cleavage. The C-terminal region of Cas12a carries the RuvC domain, NUC domain pfam18510 and the arginine-rich bridge helix (BH). Both the NUC and BH domains are nested in the RuvC domain. Mutations in the RuvC domain impair cleavage of both strands in a target DNA duplex, while a mutation in the Nuc domain impaired target strand cleavage only. This indicates that the DNA nuclease active sites are located at the interface of the RuvC and Nuc domains and that cleavage of the non-target DNA strand by the RuvC domain is a prerequisite for target strand cleavage by the Nuc domain. 412
61101 408306 pfam18517 LZ3wCH Leucine zipper with capping helix domain. This domain is found at the C-terminal region of Hop2 and Mnd1 proteins. In meiotic DNA recombination, the Hop2-Mnd1 complex promotes Dmc1-mediated single-stranded DNA (ssDNA) invasion into homologous chromosomes to form a synaptic complex. Hop2 (for homologous pairing; also known as TBPIP) is expressed specifically during meiosis, as is Mnd1 (for meiotic nuclear divisions 1). The C-terminal region of both Hop2 and Mnd1 folds into three alpha-helices that are interrupted by two short non-helical regions. These alpha-helices of the two proteins together form a parallel coiled coil that provides the major interface for heterodimer formation. The non-helical regions form substantially kinked junctions between adjacent leucine zippers: the LZ1-LZ2 and LZ2-LZ3 junctions. This domain is the C-terminal segment of Hop2 and Mnd1 which folds back onto the C-terminal leucine zipper (LZ3) to form a helical bundle-like structure, hence designated LZ3wCH (for LZ3 with capping helices). The LZ3wCH region plays a role in interacting with the Dmc1 nucleofilament. 55
61102 408307 pfam18518 TcA_RBD TcA receptor binding domain. Tc toxin complexes are virulence factors of many bacteria such as the plague pathogen Yersinia pestis. Tc toxins are composed of TcA, TcB and TcC subunits. TcA forms a large bell-shaped pentameric structure and enters the membrane like a syringe, forming a translocation channel through which the cytotoxic domain is probably transported into the cytoplasm. TcA has four receptor-binding domains. This domain is one of 4 receptor binding domains found in TcA. All four domains have an immunoglobulin (Ig)-like beta-sandwich fold of two sheets with antiparallel beta-strands. The domains are structurally reminiscent of the receptor-binding domains of the diphtheria and anthrax toxins. 130
61103 408308 pfam18519 Sgf11_N SAGA-associated factor 11 N-terminal domain. The SAGA (Spt-Ada-Gcn5-Acetyltransferase) transcriptional co-activator is a protein complex that regulates inducible yeast genes by performing multiple functions including acetylating core histones, recruiting the RNA polymerase II preinitiation complex, and deubiquitinating histone H2B. The deubiquitinating activity of SAGA resides in a distinct sub-complex called the deubiquitinating module (DUBm), which consists of four proteins that are conserved across eukaryotes: Ubp8, Sgf11, Sus1 and Sgf73. The DUBm proteins are organized into two lobes around the globular domains of Ubp8. In SAGA, Sus1 binds to Sgf11 by wrapping around this N-terminal domain of Sgf11, forming a stable dimer. 39
61104 375946 pfam18520 Spc110_C Spindle pole body component 110 C-terminal domain. This is the C-terminal domain found in Spc110 proteins. Spc110 is a spindle pole body component (SPB) protein. The N-terminus is shown to bind to gamma-tubulin small complex (g-TuSC) while this C-terminal domain is essential for calmodulin-binding. The C-terminus of Spc110 is anchored to the SPB via a conserved PACT domain. 52
61105 375947 pfam18521 TAD2 Transactivation domain 2. This is a N-terminal transactivation domain (TAD) domain 2 found in p53 proteins. In p53 two TAD domains are found termed TAD1 (residues 1-39) and TAD2 (residues 40-61), both of which have been shown to be able to independently activate gene transcription and are intrinsically disordered protein domains that adopt a helical conformation for at least part of their length when bound. This inherent flexibility allows the TADs to adapt to and bind a broad range of proteins. This entry describes TAD2 which can independently interact with Taz2 domain of the histone acetyltransferase p300. It has also been shown to bind to OB-fold domain of replication protein 70 A (RPA) as well as the pleckstrin homology (PH) domain of the p62 and Tfb1 subunits of human and yeast TFIIH. 25
61106 408309 pfam18522 DUF5620 Domain of unknown function (DUF5620). This is a domain of unknown function predicted to be a carbohydrate binding module. 119
61107 408310 pfam18523 Sld3_N Sld3 N-terminal domain. Sld3 is conserved in yeast and fungi, and treslin, also known as Ticrr, has been identified as the functional counterpart of Sld3 in metazoans. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the helicase. This entry represents the N-terminal domain of Sld3 which is shown to bind to the N-terminal domain of Sld7. 116
61108 408311 pfam18524 HPIP_like High potential iron-sulfur protein like. This is a C-terminal domain found in 4-hydroxyphenylacetate decarboxylase small subunit (EC:4.1.1.83), which catalyzes the last reaction in the fermentative production of p-cresol from tyrosine. The C-terminal domain [4Fe-4S] cluster bears structural similarity to high-potential iron-sulfur proteins (HiPIPs). HiPIPs have an N-terminal extension of 20-40 residues, so the structural similarity is limited to their Fe/S cluster-binding scaffold. Furthermore, despite the weak amino acid sequence identity, the cluster binding motifs are remarkably similar: H/CX2CX12-13CX16-17C for the gamma-subunit and CX2CX13-19CX14-19C for HiPIPs. 40
61109 408312 pfam18525 Cas9_C Cas9 C-terminal domain. This is the C-terminal domain of Cas9 enzymes found in actinobacteria. 110
61110 408313 pfam18526 DB_JBP1 Thymine dioxygenase JBP1 DNA-binding domain. The J-binding protein 1 (JBP1) is essential for biosynthesis and maintenance of DNA base-J (beta-d-glucosyl-hydroxymethyluracil). Base-J and JBP1 are confined to some pathogenic protozoa and are absent from higher eukaryotes, prokaryotes and viruses. JBP1 recognizes J-containing DNA (J-DNA) through the DNA-Binding JBP1 domain (DB-JBP1), which binds to J-DNA with approximately the same affinity and specificity as full-length JBP1. Structure analysis of DB-JBP1 revealed a helix-turn-helix variant fold, a 'helical bouquet' with a 'ribbon' helix encompassing the amino acids responsible for DNA binding. Mutation of a single residue (Asp525) in the ribbon helix abrogates specificity toward J-DNA. 164
61111 408314 pfam18527 STT3_PglB_C STT3/PglB C-terminal beta-barrel domain. Asparagine-linked glycosylation is a post-translational modification of proteins containing the conserved sequence motif Asn-X-Ser/Thr. The attachment of oligosaccharides is implicated in diverse processes such as protein folding and quality control, organism development or host-pathogen interactions. The reaction is catalysed by oligosaccharyltransferase (OST), a membrane protein complex located in the endoplasmic reticulum. The central, catalytic enzyme of OST is the STT3 subunit, which has homologues in bacteria and archaea. Structural analysis of a bacterial OST, undecaprenyl-diphosphooligosaccharide protein glycotransferase EC:2.4.99.19 (PglB) protein, revealed two domains: a transmembrane domain and a periplasmic domain. This entry represents the C-terminal periplasmic beta-barrel domain. 79
61112 408315 pfam18528 Ret2_MD RNA editing 3' terminal uridylyl transferase 2 middle domain. Post-transcriptional RNA editing in Trypanosomatids (pathogenic protozoa) is catalyzed by a large multiprotein complex, the editosome. A key editosome enzyme, RNA editing terminal uridylyl transferase 2 (TUTase 2; RET2) catalyzes the uridylate addition reaction. RET2 structure consists of three domains: the N-terminal domain (NTD), the middle domain (MD) and the C-terminal domain (CTD). This MD domain is mainly composed of six helices and a four-stranded antiparallel beta-sheet. Structural comparison reveals that the fold of this MD is topologically similar to the binding domains of several RNA-binding proteins such as the RNA-binding domain of the U1A spliceosomal protein, the RRM domain of the human La protein and the CTD of an archaeal CCA-adding enzyme. The CTD of the archaeal CCA-adding enzyme has been shown to bind double-stranded tRNA stem substrate through the alpha-helical regions. Hence it is suggested that this domain might be an RNA-binding domain. 93
61113 408316 pfam18529 MIX Mitochondrial membrane-anchored proteins. MIX forms an all alpha-helical fold comprising seven alpha-helices that fold into a single domain. The distribution of helices is similar to a number of scaffold proteins, namely HEAT repeats, 14-3-3, and tetratricopeptide repeat proteins, suggesting that MIX mediates protein-protein interactions. 151
61114 408317 pfam18530 Swi6_N Swi6 N-terminal domain. This is a putative DNA binding domain, it comprises four alpha helices and five beta strands arranged in a mixed alpha/beta fold. 108
61115 408318 pfam18531 Polo_box_2 Polo box domain. In metazoans, Plk4 kinases control daughter centriole assembly. Plk4 homologs have an N-terminal kinase domain, a C-terminal polo box, and a central domain termed the 'cryptic polo box' (CPB) that has been shown to dimerize, to be sufficient for centriole localization and to be required for Plk4 to promote centriole assembly. Probable serine/threonine-protein kinase zyg-1 (EC:2.7.11.1) (ZYG-1) is a Plk4 homolog found in C. elegans. The crystal structure of the CPB of C. elegans ZYG-1 reveals that it forms a Z-shaped dimer containing an intermolecular beta-sheet with an extended basic surface patch. Electrostatic interactions between the basic patch on the ZYG-1 CPB dimer and the SPD-2 acidic region dock ZYG-1 onto centrioles to promote new centriole assembly. ZYG-1 CPB contains two tandem polo boxes (PB1 and PB2), each containing a six-stranded beta-sheet with an alpha-helix packed against one side. 112
61116 408319 pfam18532 DUF5621 Domain of unknown function (DUF5621). This is a domain of unknown function found in gammaproteobacteria. 139
61117 408320 pfam18533 DUF5622 Domain of unknown function (DUF5622). This is a domain of unknown function found in archaea-specific ribosomal proteins such as L46a which is suggested to directly bind to rRNA in the ribosome. 66
61118 408321 pfam18534 HBD Helical bundle domain. Lpg0393 is a Legionella pneumophila effector protein. Structure analysis reveals that it has two domains, the N-terminal domain is a Vps9-like domain, which is structurally most similar to the catalytic core of human Rabex-5 that activates the endosomal Rab proteins Rab5, Rab21 and Rab22. The C-terminal domain is a helical bundle domain. The C-terminal helical bundle of Lpg0393 corresponds to the N-terminal helical bundle of Rabex-5, it lacks an obvious region that corresponds to the membrane-binding motif of Rabex-5. One possibility may be that Lpg0393 localization to endosomes depends on an unknown Legionella effector. 84
61119 408322 pfam18535 Gal11_ABD1 Gal11 activator-binding domain (ABD1). This is the activator-binding domain (ABD1) found in Gal11/med15 proteins. Structural analysis indicates that it binds to the central activator domain (cAD) of Gcn4. Mutations in the Gal11-ABD1 W196 residue abolish the binding to Gcn4 cAD. 81
61120 408323 pfam18536 DUF5623 Domain of unknown function (DUF5623). This is a domain of unknown function found in proteobacteria. 119
61121 408324 pfam18537 CODH_A_N Carbon monoxide dehydrogenase subunit alpha N-terminal domain. Acetyl-coenzyme A (CoA) synthase/carbon monoxide dehydrogenase (ACS/CODH) is a bifunctional enzyme that catalyzes the reversible reduction of CO2 to CO (CODH activity). This entry is for the N-terminal domain found in ACS/CODH subunit alpha. 83
61122 408325 pfam18538 DUF5624 Domain of unknown function (DUF5624). This is a domain of unknown function found mainly in bacteria. 129
61123 408326 pfam18539 DUF5625 Domain of unknown function (DUF5625). This is a domain of unknown function found in proteobacteria. 130
61124 408327 pfam18540 DUF5626 Domain of unknown function (DUF5626). This is a domain of unknown function mostly found in firmicutes. 120
61125 408328 pfam18541 RuvC_III RuvC endonuclease subdomain 3. Cas9 proteins are abundant across the bacterial kingdom, but vary widely in both sequence and size. All known Cas9 enzymes contain an HNH domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC nuclease domain required for cleaving the noncomplementary strand (non-target strand), yielding double-strand DNA breaks (DSBs). The crystal structures of type II-A and II-C Cas9 proteins highlight the features in Cas9 enzymes that support their function as RNA-guided endonucleases. Cas9 enzymes adopt a bilobed architecture composed of a nuclease lobe containing juxtaposed RuvC and HNH nuclease domains and a variable alpha-helical lobe likely to be involved in nucleic acid binding. The RuvC domain forms the structural core of the nuclease lobe, a six-stranded beta sheet surrounded by four alpha helices, with all three conserved subdomains (I, II, III) contributing catalytic residues to the active site. 160
61126 408329 pfam18542 TFIIB_C_1 Transcription factor IIB C-terminal module 1. In the pathogenic trypanosome, Trypanosoma brucei, transcription factor IIB (tTFIIB) is essential for spliced leader (SL) RNA gene transcription and cell viability, but has a highly divergent primary sequence in comparison to TFIIB in other eukaryotes. Structure analysis of the C-terminal region of trypanosome TFIIB, reveals 2, closely packed helical modules followed by a C-terminal extension of 32 aa. The trypanosome-specific region comprises the second helical module and the C-terminal extension. Both helical modules contain the canonical 5-helix cyclin fold characteristic of TFIIB proteins. This domain is mostly found in Trypanosomatidae. 98
61127 408330 pfam18543 ID Intracellular delivery domain. This is a C-terminal domain found in BepA proteins from Bartonella henselae. It is a type IV secretion system (T4SS) effector protein. BepA from Bartonella henselae is composed of an N-terminal Fic domain and a C-terminal Bartonella intracellular delivery (ID) domain, the latter being responsible for T4SS-mediated translocation into host cells. The ID domain of BepA mediates inhibition of apoptosis and exhibits an OB (oligonucleotide/oligosaccharide binding)-fold. 55
61128 408331 pfam18544 Polo_box_3 Polo box domain. In metazoans, Plk4 kinases control daughter centriole assembly. Plk4 homologs have an N-terminal kinase domain, a C-terminal polo box, and a central domain termed the 'cryptic polo box' (CPB) that has been shown to dimerize, to be sufficient for centriole localization and to be required for Plk4 to promote centriole assembly. Probable serine/threonine-protein kinase zyg-1 (EC:2.7.11.1) (ZYG-1) is a Plk4 homolog found in C. elegans. The crystal structure of the CPB of C. elegans ZYG-1 reveals that it forms a Z-shaped dimer containing an intermolecular beta-sheet with an extended basic surface patch. Electrostatic interactions between the basic patch on the ZYG-1 CPB dimer and the SPD-2 acidic region dock ZYG-1 onto centrioles to promote new centriole assembly. ZYG-1 CPB contains two tandem polo boxes (PB1 and PB2), each containing a six-stranded beta-sheet with an alpha-helix packed against one side. This entry represents PB2. 97
61129 408332 pfam18545 HalOD1 Halobacterial output domain 1. HalOD1 (Halobacterial output domain 1) is a protein domain that is specific for haloarchaea and their viruses. It is found in a stand-alone version and also in combination with Response_reg pfam00072 and other domains (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 75
61130 408333 pfam18546 MetOD1 Metanogen output domain 1. MetOD1 (Metanogen output domain 1) is a protein domain that is found in euryarchaeal classes Methanobacteria and Methanomicrobia, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 143
61131 375973 pfam18547 HalOD2 Halobacterial output domain 2. HalOD2 (Halobacterial output domain 2) is a protein domain that is found in haloarchaea in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 52
61132 375974 pfam18548 MetOD2 Metanogen output domain 2. MetOD2 (Metanogen output domain 2) is found in euryarchaeal class Methanomicrobia, usually in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 88
61133 408334 pfam18549 NitrOD1 Nitrosopumilus output domain 1. NitrOD1 (Nitrosopumilus output domain 1) is found in thaumarchaea, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 68
61134 375976 pfam18550 NitrOD2 Nitrososphaera output domain 2. NitrOD2 (Nitrososphaera output domain 2) is found in thaumarchaea, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 104
61135 408335 pfam18551 TackOD1 Thaumarchaeal output domain 1. TackOD1 (Thaumarchaeal output domain 1) is a predicted metal-binding domain found in archaea and in some bacteria. It contains 11 highly conserved Cys residues, which form 5 CxxC motifs and an HxxC motif. In several instances, it is found in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems). 188
61136 408336 pfam18552 PheRS_DBD1 PheRS DNA binding domain 1. This is a DNA-binding fold domain found in Phenylalanyl-tRNA Synthetase (EC:6.1.1.20) N-terminal region. This domain belongs to a superfamily of 'winged helix' DNA-binding domains. The topology of DBD-1 and DBD-3 closely resembles the topology of the Z-DNA-binding domain Zalpha of double-stranded RNA (dsRNA) adenosine deaminase and other domains from DNA-binding proteins. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains. 59
61137 408337 pfam18553 PheRS_DBD3 PheRS DNA binding domain 3. This is a DNA-binding fold domain found in Phenylalanyl-tRNA Synthetase N-terminal region. This domain belongs to a superfamily of 'winged helix' DNA-binding domains. The topology of DBD-1 and DBD-3 closely resembles the topology of the Z-DNA-binding domain Zalpha of double-stranded RNA (dsRNA) adenosine deaminase and other domains from DNA-binding proteins. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains. 57
61138 408338 pfam18554 PheRS_DBD2 PheRS DNA binding domain 2. This is a DNA-binding fold domain found in Phenylalanyl-tRNA Synthetase N-terminal region. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains. DBD-2 and DBD-3 constitute large insertions sequentially included between two neighboring antiparallel strands of the DBD-1 domain. Moreover, DBD-3 pfam18553 is a domain insertion into DBD-2. 33
61139 408339 pfam18555 MobL MobL relaxases. This family includes members of relaxase enzymes. These enzymes initiate bacterial conjugation contributing to the spread of antibiotic resistance. These MobL relaxases are found mainly in Firmicutes. It is suggested that MobL type relaxases play a prominent role in horizontal gene transfer in Firmicutes bacteria. Family members carry a stretch of relaxase motif III 'HUH' sequence that is characteristic of the HUH endonuclease superfamily essential for enzymatic activity. 387
61140 408340 pfam18556 TetR_C_35 Bacterial Tetracyclin repressor, C-terminal domain. This is the C-terminal tetracyclin repressor domain found in bacteria. Family members include TetR family transcriptional regulators Rv3249c and Rv1816 found in Mycobacterium tuberculosis. Palmitic acid (a fatty acid) and isopropyl laurate (a fatty acid ester), were identified as binding ligands to Rv3249c and Rv1816 respectively. Similar to other TetR family regulators, these proteins are alpha-helical dimeric proteins consisting of a smaller N-terminal DNA-binding domain and a larger C-terminal regulatory domain. 105
61141 408341 pfam18557 NepR Anti-sigma factor NepR. The general stress response sigma factor in alphaproteobacteria, sigma EcfG is inactivated by the anti-sigma factor NepR, which is itself regulated by the response regulator PhyR. NepR forms two helices that extend over the surface of the PhyR subdomains. Homology modeling and comparative analysis of NepR, PhyR and sigmaEcfG mutants indicate that NepR contacts both proteins with the same determinants, showing sigma factor mimicry at the atomic level. This entry represents NepR domains found in alphaproteobacteria. 33
61142 408342 pfam18558 HTH_51 Helix-turn-helix domain. This is a helix-turn-helix domain found in polyketide synthases (PKSs) in fungi. They are multidomain enzymes that biosynthesize a wide range of natural products. Family members include citrinin polyketide synthases which contain a C-methyltransferase (CMeT) domain pfam08242 that adds one or more S-adenosylmethionine (SAM)-derived methyl groups to the carbon framework. 90
61143 408343 pfam18559 Exop_C Galactose-binding domain-like. This is the C-terminal domain found in ExoP (exo-1,3/1,4-beta-glucanase) from Pseudoalteromonas. This domain contains a beta-sandwich fold which is common in glycosyl hydrolases (GH7, 11, 12 and 16) and in some 23 carbohydrate-binding modules. It is suggested that the main role of this domain is to provide structural stability necessary for ExoP activity, however no substrate-binding role has been shown. 155
61144 408344 pfam18560 Lectin_like Lectin like domain. This is a lectin like domain found in Cwp84, a surface-located cysteine protease (a member of the C1A cysteine protease family, also known as papain proteases) responsible for the maturation of the SlpA precursor protein which has been implicated in the degradation of extracellular matrix proteins such as fibronectin, laminin and vitronectin. Structural comparison indicates that this domain is similar to carbohydrate-binding domains. 157
61145 408345 pfam18561 Regnase_1_C Endoribonuclease Regnase 1/ ZC3H12 C-terminal domain. This is the C-terminal domain found in regnase-1, an RNase that directly cleaves mRNAs of inflammatory genes such as IL-6 and IL-12p40, and negatively regulates cellular inflammatory responses. The C-terminal domain is composed of three alpha helices and resembles ubiquitin associated protein 1 in structure. 44
61146 408346 pfam18562 CIDR1_gamma Cysteine-Rich Interdomain Region 1 gamma. Rosetting is the capacity of infected RBCs to bind uninfected RBCs, which is consistently associated with severe malaria in African children. The rosette-forming PfEMP1 adhesins, namely IT4/R29, Palo Alto 89F5 VarO, 3D7/PF13_0003 and IT4/var60, belong to a specific sub-group called groupA/UpsA var genes and all four present a specific Duffy Binding-Like and Cysteine-Rich Interdomain Region (DBL1alpha1-CIDR1gamma) double domain Head region found at the extracellular region of PfEMP1. This entry represents the CIDR1gamma domain which increases the binding affinity to VarO (Palo Alto VarO parasites). 52
61147 408347 pfam18563 TubC_N TubC N-terminal docking domain. This is the N-terminal docking domain found in TubC proteins from the tubulysin polyketide synthase and nonribosomal polypeptide synthetase (PKS-NRPS) system, which binds to C-terminal docking domain of TubB. 52
61148 408348 pfam18564 Glyco_hydro_5_C Glycoside hydrolase family 5 C-terminal domain. This is the C-terminal domain of endo-glycoceramidase II (EGC), a membrane-associated family 5 glycosidase pfam00150. The C-terminal domain assumes a beta-sandwich fold, which resembles that of many carbohydrate-binding modules. 86
61149 408349 pfam18565 Glyco_hydro2_C5 Glycoside hydrolase family 2 C-terminal domain 5. Domain 5 is found in dimeric beta-D-galactosidase from Paracoccus sp. 32d, which contributes to stabilization of the functional dimer. It is suggested that the location of this domain 5, may be one of the factors responsible for the creation of a functional dimer and cold-adaptation of this enzyme. 103
61150 408350 pfam18566 Ldi Linalool dehydratase/isomerase. This (alpha,alpha)6 barrel fold domain is found in linalool dehydratase/isomerase (Ldi) EC:4.2.1.127. An enzyme found in the betaproteobacterium Castellaniella defragrans 65Phen that mineralizes monoterpenes coupled to anaerobic denitrification. The periplasmic enzyme reversibly catalyzes the isomerisation from the primary alkenol geraniol into the tertiary alkenol (S)-linalool and its dehydration to beta-myrcene. Each monomer is built up of a classical (alpha,alpha)6 barrel fold composed of six inner helices. Structural data of Ldi revealed the terpene binding site between two monomers inside a hydrophobic channel, and three catalytic clusters involved in catalysis. 307
61151 408351 pfam18567 TIR_3 Toll/interleukin-1 receptor domain. This is a Toll/interleukin-1 receptor (TIR) domain found in the N-terminal region of B-cell adaptor for phosphoinositide 3-kinase (BCAP). BCAP functions in linking the B-cell receptor (BCR) and the co-receptor CD19 to the activation of PI3K via interaction with the SH2 domains on the regulatory p85 subunit. BCAP TIR associates with the MAL/TIRAP adaptor and the TIR domains of Toll-like receptors (TLRs). 131
61152 408352 pfam18568 COS TRIM C-terminal subgroup One Signature domain. This domain is found in the C-terminal region of the TRIM subgroup C-1 proteins such as E3 ubiquitin-protein ligase Midline-1 protein (MID1), which is required for proper development during embryogenesis. Mutations of MID1 are associated with X-linked Opitz G syndrome, characterized by midline anomalies. This domain is also found in MURF1-3 proteins that do not contain the FNIII and B30.2 domains. MURF1-3 proteins are also associated with microtubules. Deletion of the COS domain does not affect MID1 dimerization but disrupts the localization to the microtubules. 52
61153 408353 pfam18569 Thioredoxin_16 Thioredoxin-like domain. This is a thioredoxin like domain found in AIMP2 proteins (Aminoacyl tRNA synthetase complex interacting multifunctional protein 2). Aimp2 is a component of human multi-tRNA synthetase complex (MSC). MSC is a macromolecular protein complex consisting of nine different ARSs and three ARS-interacting multifunctional proteins (AIMPs). 93
61154 408354 pfam18570 Nup54_57_C NUP57/Nup54 C-terminal domain. The nuclear pore complex (NPC) constitutes the sole gateway for bidirectional nucleocytoplasmic transport. NPCs are formed by multiple copies of 34 distinct proteins, termed nucleoporins (nups). In yeast, the channel nups Nsp1, Nup49, and Nup57 constitute part of the central transport channel and form the diffusion barrier with their disordered phenylalanine-glycine (FG) repeats. Structural studies of yeast Nup57 indicate that it contains left-handed coiled-coil domains (CCD1-3). This entry represents the third CCD located at the C-terminal region of Nup57 which is composed of five heptad repeats. Nup57 in yeast is the equivalent to human Nup54 pfam13874. 29
61155 408355 pfam18571 VWA_3_C von Willebrand factor type A C-terminal domain. This is the C-terminal domain of von Willebrand factor type A pfam13768. 47
61156 408356 pfam18572 T6PP_N Trehalose-6-phosphate phosphatase N-terminal helical bundle domain. This is the N-terminal domain found in trehalose-6-phosphate phosphatase (T6PP, EC 3.1.3.12) from parasitic nematodes such as Brugia malayi. In the model nematode Caenorhabditis elegans, T6PP is essential for survival due to the toxic effect(s) of the accumulation of trehalose 6-phosphate. T6PP has also been shown to be essential in Mycobacterium tuberculosis. The N-terminal domain composed of a three-helix bundle is similar in topology to the Microtubule Interacting and Transport (MIT) domains of the Vps4-like ATPases from Sulfolobus acidocaldarius. MIT domains are protein-interacting domains typically associated with multivesicular body formation, cytokinetic abscission, or viral budding. Mutational analysis indicate that deletion or mutation of the MIT-like domain is highly destabilizing to the enzyme. 98
61157 408357 pfam18573 BclA_C BclA C-terminal domain. This is the C-terminal domain of BclA (Bacillus collagen-like protein of anthracis) which is expressed on spores of Bacillus species. Trimers of the C-terminal domain (CTD) form the tips of the spore's hair-like nap and are the immunodominant target of vertebrate antibodies and drive trimerization. Structure analysis indicate the C-terminal region of the peptide folding into an all-beta structure with a jelly-fold topology, similar to the first human complement C1q, a member of the tumor necrosis factor (TNF)-like family. The C-terminal globular domain has been shown to be located on the exterior of the exosporium, and therefore is critical in determining the immunogenicity of the spore in a mammalian host. 127
61158 408358 pfam18574 zf_C2HC_14 C2HC Zing finger domain. This is a zinc finger domain together with a linker region found in RNF125, a small protein (25kD) that contains a RING domain, three zinc fingers (ZnFs) and a ubiquitin interacting motif (UIM). The C2HC ZnF plays an essential role in the interaction of RNF125 with the E2 UbcH5a, which originates from the requirement of the C2HC-ZnF for the structural stability of the RING domain. A mutation at one of the contact residues in the C2HC-ZnF, a highly conserved M112, resulted in the loss of ubiquitin ligase activity. Furthermore, mutations at the Zn2+ chelating cysteine residues, C100 and C103 of this domain resulted in a loss of activity. 33
61159 408359 pfam18575 HAMP_N3 HAMP N-terminal domain 3. Aer2 soluble receptor from Pseudomonas aeruginosa contains three successive HAMP domains in the N-terminal region. HAMP domains are widespread prokaryotic signaling modules. This entry is the third N-terminal HAMP domain (HAMP3). HAMP3 adopt a conformation resembling Af1503, with only minor differences in helical tilt and orientation. The basic construction of each HAMP domain consists of a monomeric unit of two parallel alpha helices (AS1 and AS2) joined by an elongated connector of 12-14 residues form a parallel four-helix bundle. 43
61160 408360 pfam18576 HTH_52 Helix-turn-helix domain. This is a helix turn helix domain found in bacilli. 64
61161 408361 pfam18577 ASTN_2_hairpin Astrotactin-2 C-terminal beta-hairpin domain. This is a beta-hairpin domain found at the C-terminal region of astrotactin 2 proteins (ASTN-2). ASTN-2 is an integral membrane perforin-like protein linked to the planar cell polarity pathway in hair cells. It consists of multiple polypeptide folds: a perforin-like domain, a minimal epidermal growth factor-like module, a fibronectin type III domain Fn (III) and an annexin-like domain as well as the beta hairpin domain which packs across the fibronectin domain. 47
61162 408362 pfam18578 Raf1_N Rubisco accumulation factor 1 alpha helical domain. This is the N-terminal alpha helical domain found in Rubisco accumulation factor1 (Raf1). Raf1 from Arabidopsis thaliana consists of an N-terminal alpha-domain, a flexible linker segment and a C-terminal beta-sheet domain that mediates dimerization. The alpha-domains mediate the majority of functionally important contacts with RbcL (Rubisco large subunits) by bracketing each RbcL dimer at the top and bottom. The alpha-domain alone is essentially inactive. 106
61163 408363 pfam18579 Raf1_HTH Rubisco accumulation factor 1 helix turn helix domain. This is a helix-turn-helix domain found in the alpha helical region of Rubisco accumulation factor1 (Raf1). Raf1 from Arabidopsis thaliana consists of an N-terminal alpha-domain, a flexible linker segment and a C-terminal beta-sheet domain that mediates dimerization. The alpha-domains mediate the majority of functionally important contacts with RbcL (Rubisco large subunits) by bracketing each RbcL dimer at the top and bottom. The alpha-domain alone is essentially inactive. 61
61164 408364 pfam18580 Sun2_CC2 SUN2 coiled coil domain 2. LINC complexes are formed by coupling of KASH (Klarsicht, ANC-1, and Syne/Nesprin Homology) and SUN (Sad1 and UNC-84) proteins from the inner and outer nuclear membranes (INM and ONM, respectively). The formation of LINC complexes by KASH and SUN proteins at the nuclear envelope (NE) establishes the physical linkage between the cytoskeleton and nuclear lamina, which is instrumental for the mechanical force transmission from the cytoplasm to the nuclear interior, and is essential for cellular processes such as nuclear positioning and migration, centrosome-nucleus anchorage, and chromosome dynamics. SUN2 possesses two coiled-coil domains (CC1 and CC2). These coiled-coil domains are also believed to act as rigid spacers to delineate the distance between the ONM and INM of the NE. Furthermore, the two coiled-coil domains of SUN2 have been indicated to be able to directly modulate SUN domain activity and regulate the subsequent interactions between the SUN and KASH domains. CC2 forms a three-helix bundle to lock the SUN domain in an inactive conformation acting as an inhibitory component. Structure-based sequence analysis demonstrated that several Gly residues are located in the flexible linker regions between the three helices which would ideally provide the breaks/turns in CC2 for three-helix bundle formation. The last helix alpha3 of CC2 (that is immediately connected to the SUN domain) has been shown to be an essential segment for promoting SUN domain trimerization in the SUN-KASH complex structure. 58
61165 408365 pfam18581 SYCP2_ARLD Synaptonemal complex 2 armadillo-repeat-like domain. Synaptonemal complex protein 2 (SYCP2) N-terminal region contains two separate subdomains an ARLD (armadillo-repeat-like domain) and an SLD (Spt16M-like domain). The ARLD domain belongs to the armadillo-repeat protein family. Armadillo-repeat units often form a superhelix, which typically provides a platform for many protein partners that transduce Wnt signaling, such as beta-catenin. The ARLD of mouse SYCP2 was found to associate with different protein partners, including CENP J and CENP F. ARLD structure is highly similar to that of the 'required for cell differentiation (RCD-1)' protein. 171
61166 408366 pfam18582 HZS_alpha Hydrazine synthase alpha subunit middle domain. The crystal structure of hydrazine synthase multiprotein complex isolated from the anammox organism Kuenenia stuttgartiensis implies a two-step mechanism for hydrazine synthesis: a three-electron reduction of nitric oxide to hydroxylamine at the active site of the gamma-subunit and its subsequent condensation with ammonia, yielding hydrazine in the active centre of the alpha-subunit. The alpha-subunit consists of three domains: an N-terminal domain which includes a six-bladed beta-propeller, a middle domain binding a pentacoordinated c-type haem (haem alphaI) and a C-terminal domain which harbours a bis-histidine-coordinated c-type haem (haem alphaII). This entry represents the middle domain of subunit alpha of hydrazine synthase (HZS). 98
61167 408367 pfam18583 Arnt_C Aminoarabinose transferase C-terminal domain. ArnT is a member of the GT-C family of glycosyltransferases, and it has a similar fold to a bacterial oligosaccharyltransferase (OST) from Campylobacter lari (PglB) and to an archaeal OST from Archaeoglobus fulgidus (AglB). This entry represents the C-terminal periplasmic domain of Arnt proteins. 103
61168 408368 pfam18584 SYCP2_SLD Synaptonemal complex 2 Spt16M-like domain. Synaptonemal complex protein 2 (SYCP2) N-terminal region contains two separate subdomains: an ARLD (armadillo-repeat-like domain) and an SLD (Spt16M-like domain). The SLD structure is highly similar to the middle domain of the histone chaperone FACT. It consists of a twisted ten-stranded beta-sheet flanked by two helices. Since the SLD domain structurally resembles Spt16M, which is known as the well-recognized histone protein H2A-H2B, it is speculated that the SLD may be involved in chromatin binding. 111
61169 408369 pfam18585 zf-CCCH_6 Chromatin remodeling factor Mit1 C-terminal Zn finger 2. The Snf2/Hdac Repressive Complex (SHREC) is the fission yeast nucleosome remodeling and deacetylation (NuRD) equivalent and plays a major role in transcriptional gene silencing (TGS) within S. pombe heterochromatin. SHREC consists of the chromatin remodeler Mit1, the HDAC Clr3, and Clr1 and Clr2 proteins. The Mit1 C terminus contains two zinc binding motifs (CCHC zinc fingers) at the C-terminus onto which the alpha helices from both Clr1 and Mit1 pack. The Mit1 chromatin remodeler uses its C-terminal domain to intimately bind to the N-terminal half of Clr1 to integrate into the SHREC complex. This entry represents the second C-terminal zinc-binding domain found on Mit-1. 53
61170 376012 pfam18586 zf-CCCH_7 Chromatin remodeling factor Mit1 C-terminal Zn finger 1. The Snf2/Hdac Repressive Complex (SHREC) is the fission yeast nucleosome remodeling and deacetylation (NuRD) equivalent and plays a major role in transcriptional gene silencing (TGS) within S. pombe heterochromatin. SHREC consists of the chromatin remodeler Mit1, the HDAC Clr3, and Clr1 and Clr2 proteins. The Mit1 C terminus contains two zinc binding motifs (CCHC zinc fingers) at the C-terminus onto which the alpha helices from both Clr1 and Mit1 pack. The Mit1 chromatin remodeler uses its C-terminal domain to intimately bind to the N-terminal half of Clr1 to integrate into the SHREC complex. This entry represents the first C-terminal zinc-binding domain found on Mit-1. 90
61171 408370 pfam18587 PLL PTX/LNS-Like (PLL) domain. Adhesion G protein-coupled receptors (aGPCRs) play critical roles in diverse neurobiological processes including brain development, synaptogenesis, and myelination. The aGPCR GPR56/ADGRG1 regulates both oligodendrocyte and cortical development. The N-terminal domain of GPR56 has low sequence identity and a fold that likely diverged from the PTX and LNS domains. It also has a conserved motif (HphiC91xxWxxxxG) that was identified among canonical PTX domains. Thus, it is termed the Pentraxin/Laminin/neurexin/sex-hormone-binding-globulin-Like (PLL) domain. Truncation-based analyses suggest that the regions of GPR56 responsible for binding TG2 and collagen III are within the PLL domain, most likely in the surface-exposed conserved patch. Furthermore, it is suggested that the conserved patch of the PLL domain mediates an essential function in CNS myelination. 134
61172 408371 pfam18588 WcbI Polysaccharide biosynthesis enzyme WcbI. Capsular polysaccharides (CPSs) are protective structures on the surfaces of many Gram-negative bacteria. wcbI is one of several genes in the CPS biosynthetic cluster whose deletion leads to significant attenuation of the pathogen. Structural analysis and biophysical assays suggest that WcbI functions as an acetyltransferase enzyme but it requires another functional module to carry out this function. WcbI adopts a predominantly helical fold where the N-terminal 100 amino acids form a ligand-binding domain and binds tightly to coenzyme A and its derivative acetyl-CoA. 207
61173 408372 pfam18589 ObR_Ig Obesity receptor immunoglobulin like domain. This is the immunoglobulin-like domain (IGD) found in obesity receptors (ObR). ObR is a single membrane-spanning receptor belonging to the class I cytokine receptor family. All isoforms have an identical extracellular part consisting of six domains: an N-terminal domain (NTD), two CRH domains (CRH1 and CRH2), an immunoglobulin-like domain (IGD), and two additional membrane-proximal fibronectin type III (FN III) domains. ObR activation depends on the CRH2, IGD, and FN III domains, however the CRH2 domain is the major leptin-binding determinant in the receptor. The IGD and membrane-proximal domains have no detectable affinity for the ligand, but are nonetheless indispensable for receptor activation. Deletion of the IGD results in a receptor with wild-type affinity for leptin, but completely devoid of biological activity. 105
61174 408373 pfam18590 IMP2_N Immune Mapped Protein 2 (IMP2) N-terminal domain. Immune Mapped Protein 2 (IMP2) N-terminal domain which is conserved across both IMP1 and IMP2 families. It is suggested that the globular domain likely contributes to a shared function, hence it is termed 'IMP1-like domain'. 87
61175 408374 pfam18591 IMP2_C Immune Mapped Protein 2 (IMP2) C-terminal domain. Immune Mapped Protein 2 (IMP2) C-terminal domain. 63
61176 408375 pfam18592 Tho1_MOS11_C Tho1/MOS11 C-terminal domain. THO is a multi-protein complex involved in the formation of messenger ribonuclear particles (mRNPs) by coupling transcription with mRNA processing and export. Some studies show that Tho1, like Sub2, can assemble onto the nascent mRNA during transcription and that Tho1 and Sub2 can provide alternative pathways for mRNP biogenesis in the absence of a functional THO complex. This is the C-terminal domain found in Tho1 and MOS11 proteins. The C-terminal region of Tho1 from Saccharomyces cerevisiae, adopts a helical fold similar to that of the WHEP RNA-binding domains of metazoan aminoacyl-tRNA synthetases. 37
61177 408376 pfam18593 CdiI_2 CdiI immunity protein. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. CDI+ bacteria also produce CdiI immunity proteins, which specifically neutralize cognate CdiA-CT toxins to prevent self-inhibition. Structure analysis of CdiI immunity protein from Yersinia kristensenii shows that it is composed of eight alpha-helices packed together to form a nearly spherical structure with weak structural homology to a putative TetR family transcriptional repressor. The CdiI protein fits into the curved cavity of the CdiA-CTYkris toxin domain where it most likely neutralizes toxin activity by blocking access to RNA substrates. This domain is mostly found in gammaproteobacteria. 91
61178 408377 pfam18594 Sas6_CC Sas6/XLF/XRCC4 coiled-coil domain. This is a coiled-coil domain found at the C-terminal of spindle assembly abnormal protein 6 (Sas6). The highly conserved protein SAS-6 constitutes the center of the cartwheel assembly that scaffolds centrioles early in their biogenesis. Structural analysis of Sas6 shows that, similar to XLF and XRCC4, it forms a parallel coiled-coil dimer. 30
61179 408378 pfam18595 DHR10 Designed helical repeat protein 10 domain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signaling, and other essential biological processes. This entry describes a MazG related domain also designated as Designed helical repeat protein 10 (DHR10). This domain is also found at the N-terminal region of Nuf2 proteins pfam03800. 117
61180 408379 pfam18596 Sld7_C Sld7 C-terminal domain. This is an alpha helical domain found at the C-terminal region of Sld7 proteins. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the replicative helicase at the replication origins of chromosomes. Sld7 forms a complex with Sld3 throughout the cell cycle, and associates with and dissociates from origins in an Sld3-dependent manner and is thought to regulate the function of Sld3. Structural analysis of S. cerevisiae Sld7 indicates that two Sld7 molecules form a homodimer using their C-terminal domains. 77
61181 408380 pfam18597 SH3_19 Myosin X N-terminal SH3 domain. This is the N-terminal Sh3 domain found in myosin X. Myosin X is essential for neuritogenesis, wound healing, cancer metastasis and some pathogenic infections. Myosin X is required for filopodia formation and extension. 52
61182 408381 pfam18598 TetR_C_36 Tetracyclin repressor-like, C-terminal domain. This is a C-terminal TetR regulatory domain found in QsdR proteins (quorum-sensing degradation regulation). 111
61183 408382 pfam18599 LCIB_C_CA Limiting CO2-inducible proteins B/C beta carbonic anhydrases. Limiting CO2-inducible B protein (LCIB)-LCIC complex plays an important role in the microalgal CO2-concentrating mechanisms (CCMs). LCIB and homologs (LCIB1-4 and LCIC) structurally resemble beta carbonic anhydrases (b-CAs) with striking similarities in overall fold, zinc-binding motif, and especially putative active site architecture. 222
61184 408383 pfam18600 Ezh2_MCSS MCSS domain. Polycomb repressive complex 2 (PRC2) carries out the methylation of lysine 27 of histone H3, a hallmark of repressive chromatin. Three core subunits make up the catalytic core of PRC2; the SET domain containing EZH2, the zinc-finger containing SUZ12 and the WD40 repeat protein EED. The complex forms a compact arrangement of three lobes. The middle lobe largely comprises two domains that mark the beginning of the carboxy (C)-terminal region of EZH2 (MCSS and SANT2) and the helical, C-terminal, component of the Suz12 Vefs domain. This entry describes the MCSS (also known as SANT2L) domain. There is one zinc binding (Zn1Cys3His1) which is formed solely by MCSS. 53
61185 408384 pfam18601 EZH2_N EZH2 N-terminal domain. Polycomb repressive complex 2 (PRC2) carries out the methylation of lysine 27 of histone H3, a hallmark of repressive chromatin. Three core subunits make up the catalytic core of PRC2; the SET domain containing EZH2, the zinc-finger containing SUZ12 and the WD40 repeat protein EED. The complex forms a compact arrangement of three lobes. This is the N-terminal domain of EZH2. 79
61186 408385 pfam18602 Rap1a Rap1a immunity proteins. The structures of the immunity proteins, Rap1a, responsible for the inhibition and neutralization of Ssp1 endopeptidase, revealed two distinct folds. The structure of the Ssp1-Rap1a complex revealed a tightly bound heteromeric assembly with two effector molecules flanking a Rap1a dimer. The Rap1a subunit displays a compact globular structure constructed from five alpha-helices that assemble to form the highly stable symmetric dimer. 86
61187 408386 pfam18603 LAL_C2 L-amino acid ligase C-terminal domain 2. l-amino-acid ligases (LALs; EC 6.3.2.28) were discovered to be ATP-grasp superfamily enzymes that catalyze the formation of an alpha-peptide bond between two l-amino acids in an ATP-dependent manner. The members of this family share a common structural architecture that consists of three domains referred to as the A-domain, B-domain and C-domain. The C domain can be further divided into the C1-subdomain and the C2-subdomain. This entry represents the C2 subdomain. 78
61188 408387 pfam18604 PreAtp-grasp Pre ATP-grasp domain. This is a preATP grasp domain region found on the N-terminal of pfam02222 in Pheganomycin (PGM1). 92
61189 376031 pfam18605 PikAIV_N Narbonolide/10-deoxymethynolide synthase PikA4 N-terminal domain. Polyketide synthase (PKS) catalyzes the biosynthesis of polyketides, which are structurally and functionally diverse natural products in microorganisms and plants. Type I modular PKSs are the large, multifunctional enzymes responsible for the production of a diverse family of structurally rich and often biologically active natural products. The efficiency of acyl transfer at the interfaces of the individual PKS proteins is thought to be governed by helical regions, termed docking domains (dd), located at the C-terminus of the upstream and N-terminus of the downstream polypeptide chains. This entry represents the N-terminal coiled-coil domain found in PikAIV (module 6) proteins from the Pik PKS system in bacteria. This N-terminal PKS docking domain (KS-side docking domain, KSdd) exhibits a coiled-coil motif and the dimer presents a small hydrophobic patch, sometimes flanked by charged residues, as a narrow binding groove where the ACPdd terminal helix can bind. 30
61190 408388 pfam18606 HTH_53 Zap helix turn helix N-terminal domain. Zinc-finger antiviral protein (ZAP) is a host factor that specifically inhibits the replication of certain viruses, such as HIV-1, by targeting viral mRNA for degradation. This domain is a helix turn helix domain found at the N-terminal region constituting the top cockpit layer of the protein. 62
61191 408389 pfam18607 HTH_54 ParA helix turn helix domain. The accurate segregation of DNA is essential for the faithful inheritance of genetic information. Segregation of the prototypical P1 plasmid par system requires two proteins, ParA and ParB, and a centromere. When bound to ATP, ParA mediates segregation by interacting with centromere-bound ParB, but when bound to ADP, ParA fulfills a different function: DNA-binding transcription autoregulation. ParA consists of an elongated N-terminal alpha-helix which mediates dimerization, a winged-HTH and a Walker-box containing C-domain. This entry describes the N-terminal alpha helix domain combined with the winged HTH region. 92
61192 408390 pfam18608 XAF1_C XIAP-associated factor 1 C-terminal domain. XIAP-associated factor 1 (XAF1) is a 301-amino-acid interferon (IFN)-inducible pro-apoptotic protein. The XIAP binding region within XAF1, XIAP RING binding site, is located at the C-terminal portion of XAF1. This entry represents the C-terminal region which is functionally identified as XIAP RING-binding domain of XAF1. 51
61193 408391 pfam18609 SAM_Exu Exuperantia SAM-like domain. Exuperantia (Exu) is associated with localization of bicoid (bcd) mRNA and required for its localization at the anterior pole of the oocyte. Crystal structure of Exu reveals a dimeric assembly with each monomer consisting of a 3'-5' EXO-like domain and a sterile alpha motif (SAM)-like domain. The SAM-like domain interacts with its target RNA as a homodimer and is required for RNA binding activity. 73
61194 408392 pfam18610 Peripla_BP_7 Periplasmic binding protein domain. Treponema pallidum, the bacterium that causes syphilis, is an obligate human parasite. T. pallidum lacks the machinery for the de novo synthesis of many key nutrients therefore it acquires these nutrients from its human host. MglB-2 from T. pallidum has been shown to act as the ligand-binding element of an ABC transporter for D-glucose. The overall fold of MglB-2 resembles those of LBPs (Ligand-binding proteins sometimes called 'Periplasmic Binding Proteins') that serve as receptors for nutrients and cofactors in bacterial ABC transporters. Furthermore, structural analysis of MglB-2 found in Treponema pallidum shows it to be one of the founding members of a family of proteins related to the 'Type I' or 'Cluster B' LBPs. This domain can also be found on the C-terminal region of pfam13407. 71
61195 408393 pfam18611 IL3Ra_N IL-3 receptor alpha chain N-terminal domain. Interleukin-3 (IL-3) is an activated T cell product that bridges innate and adaptive immunity and contributes to several immunopathologies. Structure of IL-3 receptor alpha chain (IL3Ra) in complex with the anti-leukemia antibody CSL362 reveals that the N-terminal domain (NTD), a domain also present in the granulocyte-macrophage colony-stimulating factor (GM-CSF), contains the CSL362 binding epitope. Furthermore, NTD of IL3Ra adopts a typical fibronectin type III (FnIII) fold. 74
61196 408394 pfam18612 Bac_A_amyl_C Bacterial Alpha amylase C-terminal domain. This is a bacterial alpha amylase C-terminal domain found mostly in bacilli. 69
61197 408395 pfam18613 TrkA_TMD Tyrosine kinase receptor A trans-membrane domain. This receptor consists of 796 amino acids and can be divided in the extracellular ligand-binding domain, the trans-membrane domain, and the intracellular tyrosine kinase domain. This domain is the TMD of TrkA, which has been shown to be involved in the interaction with amyloid precursor protein (APP). 22
61198 408396 pfam18614 RNase_II_C_S1 RNase II-type exonuclease C-terminal S1 domain. This entry describes the C-terminal S1 domain found in type 2 RNase exonucleases. DrR63 proteins from Deinococcus radiodurans are RNase II-type enzymes (DrII). Structure analysis of DrII indicates that it has an N-terminal HTH domain which interacts with a flexible loop that connects two beta-strands from the conserved C-terminal S1 domain, forming a beta-wing fold common in wHTH domains. 59
61199 408397 pfam18615 SMYLE_N Short myomegalin-like EB1 binding proteins, N-terminal domain. This N-terminal region is found in SMYLE (for short myomegalin-like EB1 binding protein). It includes the SMYLE homology (SmyH) domain found in the first 100 residues at the N terminus. This conserved SmyH domain is required and sufficient for PKA scaffolding protein AKAP9, and the pericentrosomal protein CDK5RAP2 binding. 388
61200 408398 pfam18616 CdiI_3 CDI immunity proteins. Contact-dependent growth inhibition (CDI) is a widespread mechanism of bacterial competition. CDI+ bacteria deliver the toxic C-terminal region of contact-dependent inhibition A proteins (CdiA-CT) into neighboring target bacteria and produce CDI immunity proteins (CdiI) which bind CdiA-CT domains and neutralize their toxic activity to protect against self-inhibition. CdiI immunity proteins are also variable and only neutralize their cognate CdiA-CT toxins. Structure analysis of CdiI from Escherichia coli 536 (EC536) shows that it is composed of a single domain and that it blocks the interaction with substrate, strongly suggesting that the immunity protein occludes the nuclease active site. 94
61201 408399 pfam18617 Nup214_FG Nucleoporin Nup214 phenylalanine-glycine (FG) domain. CRM1 is the major nuclear export receptor. During translocation through the nuclear pore, transport complexes transiently interact with phenylalanine-glycine (FG) repeats of multiple nucleoporins. On the cytoplasmic side of the nuclear pore, CRM1 tightly interacts with the nucleoporin Nup214. Nup214 binds to N- and C-terminal regions of CRM1, thereby clamping CRM1 in a closed conformation and stabilizing the export complex. This entry represents an FG repeat region within the C terminus of Nup214 which is required for its interaction with CRM1. 62
61202 408400 pfam18618 HP0268 HP0268. HP0268 is a small, characterized protein that is conserved in H. pylori strains and consists of 80 amino acid residues with a molecular weight of approximately 9.5 kDa. HP0268 has nicking endonuclease and RNase activities, both of which are specific for a single-strand of nucleotides. It is structurally similar to small MutS-related (SMR) domains, that can be categorized roughly into three subfamilies according to their arrangement in the domain architecture. HP0268 falls into subfamily 3 that is found as stand-alone type proteins. It is proposed that HP0268 has become an evolutionary intermediate between RNases and nicking endonucleases during H. pylori adaptation to the extremely acidic environment of the stomach. 80
61203 408401 pfam18619 GAIN_A GPCR-Autoproteolysis-INducing (GAIN) subdomain A. GPR56 is a cell-surface G protein-coupled receptor (GPCR) which belongs to the adhesion G protein-coupled receptor (aGPCR) family, a large family of chimeric proteins that have both adhesion and signaling functions and play critical roles in diverse neurobiological processes including brain development, synaptogenesis, and myelination. This entry represents GPCR-Autoproteolysis-INducing (GAIN) subdomain A, including PLL-GAIN linker (F161-D175) region. 48
61204 408402 pfam18620 DUF5627 Family of unknown function (DUF5627). This is a domain of unknown function found in bacteria. 133
61205 408403 pfam18621 DUF5628 Family of unknown function (DUF5628). This is a domain of unknown function found in Actinobacteria. 110
61206 408404 pfam18622 HTH_55 RctB helix turn helix domain. RctB is a highly conserved 75.3 kD protein (658 residues), which is unique to the Vibrionaceae. The first 500 amino acids of RctB are sufficient to mediate oriCII-based replication and its C-terminal 165 residues may mediate regulatory processes. RctB contains at least three DNA binding winged-helix-turn-helix motifs, and mutations within any of these severely compromise biological activity. This entry describes domain 1 located at the N-terminal region of RctB proteins. Mutational analysis shows that it binds oriCII DNA, and that this function is critical for the capacity of RctB to mediate oriCII-based replication. 107
61207 408405 pfam18623 TnsE_C TnsE C-terminal domain. The bacterial transposon Tn7 facilitates horizontal transfer by directing transposition into actively replicating DNA with the element-encoded protein TnsE. Structural analysis of the C-terminal domain of TnsE identified a central V-shaped loop that toggles between two distinct conformations. It is suggested that a conformational change within the C-terminal domain of TnsE underlies target site selection by regulating stable engagement of the target DNA while providing a signal for activating transposition. 145
61208 408406 pfam18624 CdiI_4 CDI immunity protein. Contact-dependent growth inhibition (CDI) is a mechanism of inter-cellular competition in which Gram-negative bacteria exchange polymorphic toxins using type V secretion systems. Structure analysis of the CDI toxin from Escherichia coli NC101 reveals that it has moderate structural homology to Whirly-like proteins found in plastids, but appears to lack the characteristic Whirly RNA-binding site. 104
61209 408407 pfam18625 EspB_PE ESX-1 secreted protein B PE domain. The ESX-1 secretion system is an important virulence determinant in Mycobacterium tuberculosis. ESX-1 secreted protein B (EspB) contains putative PE (Pro-Glu) and PPE (Pro-Pro-Glu) domains, and a C-terminal domain, which is processed by MycP1 protease during secretion. This domain represents the PE domain located at the N-terminal region of EspB which carries the conserved YxxxD/E secretion motif. 78
61210 408408 pfam18626 Gln_deamidase_2 Glutaminase. Protein glutaminase (PG, EC 3.5.1.44) can deamidate glutamine residues in proteins to glutamate residues. This entry represents the mature PG enzyme which bears partial homology to factor XIII-like Transglutaminase (TG), especially its Cys-His-Asp catalytic triad. A similar triad (Cys-His-Asn) is also shared by some cysteine proteases such as papain and actinidin. The mature PG is a monomer enzyme consisting of 185 amino acid residues. 106
61211 408409 pfam18627 PgdA_N Peptidoglycan GlcNAc deacetylase N-terminal domain. This is the N-terminal and middle domain found in Streptococcus pneumoniae peptidoglycan GlcNAc deacetylase (SpPgdA). PgdA protects the Gram-positive bacterial cell wall from host lysozymes by deacetylating peptidoglycan GlcNAc residues. It is a member of the family 4 carbohydrate esterases (CE-4). 218
61212 408410 pfam18628 P2_N Viral coat protein P2 N-terminal domain. P2 (30.2 kDa) is the major outer-coat protein of the marine lipid-containing bacteriophage PM2. Each sub-unit of P2 is composed of two beta barrel jelly rolls, disposed normal to the surface of the capsid, which lend pseudo-6-fold symmetry to the molecules, facilitating their close packing within the capsid. There is a Ca2+ ion located between the two beta barrels of P2 that helps stabilize the molecular organization of PM2. This entry represents the N-terminal jelly roll domain of P2. 127
61213 408411 pfam18629 DUF5629 Family of unknown function (DUF5629). This is a domain of unknown function found in hypothetical proteins from Pseudomonas aeruginosa. 98
61214 408412 pfam18630 Peptidase_M60_C Peptidase M60 C-terminal domain. This is C-terminal domain (CTD) of M60-peptidases pfam13402. It can also be found at the C-terminal region of gingipain B (RgpB) from P. gingivalis. It was found to possess a typical Ig-like fold encompassing seven antiparallel beta-strands organized in two beta-sheets, packed into a beta-sandwich structure that can spontaneously dimerize through C-terminal strand swapping. Translocation of gingipains from the periplasm across the OM is dependent on the conserved CTD, which appears to be important for secretion of the proteins and in particular, truncation of the last few C-terminal residues of this domain leads to accumulation of gingipains in the periplasm. Subsequently, the T9SS targeting signal was demonstrated to reside within the last 22 residues at the C-terminus of the CTD. During gingipain translocation across the OM, the CTD is cleaved off by PorU. 65
61215 408413 pfam18631 Cucumopine_C Cucumopine synthase C-terminal helical bundle domain. McbB from Marinactinospora thermotolerans is an enzyme that catalyzes the Pictet-Spengler (PS) reaction of L-tryptophan and oxaloacetaldehyde to produce the betaC scaffold of marinacarbolines. This is the C-terminal domain composed of 5 bundled alpha helices. It is weakly similar to the signal transduction histidine-protein kinase BarA from E. coli and the DNA endonuclease I-MsoI from Monomastix sp. 141
61216 408414 pfam18632 DUF5630 Family of unknown function (DUF5630). This is a domain of unknown function mostly found in Legionella. 218
61217 408415 pfam18633 zf-CCCH_8 Zinc-finger antiviral protein (ZAP) zinc finger domain 3. Zinc-finger antiviral protein (ZAP) is a host factor that specifically inhibits the replication of certain viruses, such as HIV-1, by targeting viral mRNA for degradation. N-terminal domain of ZAP is the major functional domain which contains four zinc-finger motifs. This entry represents the third zinc finger type CCCH. 28
61218 408416 pfam18634 RXLR_WY RXLR phytopathogen effector protein WY-domain. Filamentous plant pathogens cause devastating diseases of crops. Phytophthora infestans, the Irish potato famine pathogen, facilitates disease on its hosts by delivering effector proteins that modulate host cell processes to the benefit of the parasite, a strategy used by many biotrophic plant pathogens. The Phytophthora infestans RXLR-type effector PexRD54 binds potato ATG8 via its ATG8 family-interacting motif (AIM) and perturbs host-selective autophagy. The N-terminal region of PexRD54 contains 5 tandem WY domains. The WY domain is a conserved structural unit consisting of three alpha-helices and two characteristic hydrophobic amino acids, frequently W (Trp) and Y (Tyr), which contribute to a stable hydrophobic core. Deletion analysis shows that the WY domains of PexRD54 are dispensable for ATG8CL binding, suggesting an alternative function for these domains. 51
61219 408417 pfam18635 EpCAM_N Epithelial cell adhesion molecule N-terminal domain. EpCAM (epithelial cell adhesion molecule), a stem and carcinoma cell marker, is a cell surface protein involved in homotypic cell-cell adhesion via intercellular oligomerization and proliferative signalling via proteolytic cleavage. Structure analysis indicates that it is composed of three domains: N-domain, Thyroglobulin type-1A (TY) domain and the C-terminal domain. This entry represents the small and compact disulphide-rich N-terminal domain of 39 amino-acid residues. 33
61220 408418 pfam18636 Sld7_N Mitochondrial morphogenesis protein SLD7 N-terminal domain. The initiation of eukaryotic chromosomal DNA replication requires the formation of an active replicative helicase at the replication origins of chromosomes. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the helicase. The Sld7 protein interacts with Sld3, and the complex formed is thought to regulate the function of Sld3. Although Sld7 is a non-essential DNA replication protein that is found in only a limited range of yeasts, its depletion slowed the growth of cells and caused a delay in the S phase. Structure analysis indicates that the N-terminal domain of Sld7 binds to the N-terminal region of Sld3. 122
61221 408419 pfam18637 AUDH_Cupin Aldos-2-ulose dehydratase/isomerase (AUDH) Cupin domain. The enzyme aldos-2-ulose dehydratase/isomerase (AUDH) participates in carbohydrate secondary metabolism, catalyzing the conversion of glucosone and 1,5-d-anhydrofructose to the secondary metabolites cortalcerone and microthecin, respectively. Crystal structure analysis revealed that the enzyme subunit is built up of three domains, an N-terminal seven-bladed propeller, a bicupin and a C-terminal lectin domain. This entry describes the second Cupin domain (residues 574-739) composed of two antiparallel sheets that build up the jellyroll sandwich fold formed from four and five beta-strands. This cupin domain in AUDH is found to contain a zinc binding site where the metal site is located at the bottom of the cleft formed by the beta-sandwich, as observed in many cupins. 156
61222 408420 pfam18638 CyRPA Cysteine-Rich Protective Antigen 6 bladed domain. Plasmodium falciparum Cysteine-Rich Protective Antigen (PfCyRPA) is a 42.8 kDa protein of 362 residues with a predicted N-terminal secretion signal. It is part of a multi-protein complex including the PfRH5-interacting protein PfRipr and the reticulocyte binding-like homologous protein PfRH5, which binds to the erythrocyte receptor basigin. PfRH5, PfCyRPA, and PfRipr colocalize during parasite invasion at the junction between merozoites and erythrocytes. The complex seems to be required both for triggering Ca2+ release and establishment of tight junctions. PfCyRPA adopts a 6-bladed beta-propeller structure with similarity to the classic sialidase fold, but it has no sialidase activity and fulfills a purely non-enzymatic function. Each blade of the propeller is constructed by a four-stranded anti-parallel beta-sheet. 315
61223 408421 pfam18639 Longin_2 Yeast longin domain. This is a longin domain which is found in the N-terminal region of Lst4 proteins in yeast. Lst4 is the orthologue of Fnip1/2 found in mammals. Lst4 forms a complex with Lst7, and the two are targeted to the vacuole when the cells are starved of carbon, and to a lesser extent nitrogen. Lst4 and Fnip1/2 belong to the DENN family of proteins which comprise an N-terminal longin domain, commonly found in a variety of trafficking proteins, and a C-terminal DENN domain. This domain is made up of a core five-strand beta-sheet, with one short alpha-helix. 159
61224 408422 pfam18640 LepB_N LepB N-terminal domain. Rab GTPases constitute the largest family of small GTP-binding proteins that act as molecular switches in regulating vesicular transport in eukaryotic cells. LepB is a Rab GTPase-activating protein (GAP) effector found in Legionella pneumophila. This entry represents the N-terminal domain which is followed by a GAP domain. 183
61225 408423 pfam18641 LidA_Long_CC LidA long coiled-coil domain. LidA, another Rab1-interacting bacterial effector protein, is translocated by Legionella into the host cytosol at the beginning of infection, and it localized to the Legionella-containing vacuole (LCV) at the cytosolic surface. It has been shown that tight interaction with Rab1 allows LidA to facilitate the Legionella targeting factor (DrrA/SidM)-catalyzed release of Rab1 from GDP dissociation inhibitors (GDI). The base of the protein is formed by two antiparallel coiled-coil structures forming a long coiled-coil domain. This region of LidA interacts with switch and interswitch regions of Rab1 the nucleotide binding pocket of Rab8a, hence blocking access to the GDP/GTP-binding site to a great extent. 177
61226 408424 pfam18642 IMPa_helical Immunomodulating metalloprotease helical domain. IMPa is an immunomodulator metalloprotease that belongs to the peptidase M60 family pfam13402. This entry represents the helical domain found at the N-terminal of the Ig domain. 107
61227 408425 pfam18643 RE_BsaWI BsaWI restriction endonuclease type 2. Type II restriction endonucleases recognize short 4-8 bp nucleotide sequences and cleave phosphodiester bonds within or close to their target site. BsaWI restriction endonuclease from the thermophilic bacterium Bacillus stearothermophilus W1718 belongs to a group of restriction endonucleases that share a CCGG motif within their target sites, termed 'CCGG-family'. However, the R-(D/E)R motif residues, which are supposed to recognize CCGG from the major groove side, are poorly ordered and located far away from the DNA bases. BsaWI contacts the CCGG tetranucleotide from the minor groove side. It is folded into two domains: an N-terminal helical domain and a C-terminal catalytic domain. Furthermore, it carries a PDXKXE motif at the putative active site. 105
61228 408426 pfam18644 Phage_int_SAM_6 Phage integrase SAM-like domain. Xer recombinases are members of the tyrosine site-specific recombinase superfamily, a large group of enzymes that catalyze DNA breakage and rejoining using a conserved tyrosine nucleophile. Tyrosine recombinases promote various programmed DNA rearrangements including the monomerization of phage, plasmid and chromosome multimers, resolution of hairpin telomeres, and the movement of virulence and antibiotic resistance carrying integrative mobile genetic elements. Structural analysis of Helicobacter pylori XerH indicates that this N-terminal domain consisting of six alpha-helices contacts the DNA using a four-helix bundle. 132
61229 408427 pfam18645 DUF5631 Family of unknown function (DUF5631). This is an alpha helical domain found at the C-terminal region of the hypothetical protein Rv3899c from Mycobacterium tuberculosis which is conserved across mycobacteria. 96
61230 408428 pfam18646 DUF5632 Family of unknown function (DUF5632). This is an alpha-beta-alpha domain found at the N-terminal region of Rv3899c, a hypothetical protein from Mycobacterium tuberculosis which is conserved across mycobacteria. 80
61231 408429 pfam18647 Fungal_lectin_2 Alpha-galactosyl-binding fungal lectin. This domain can be found in alpha-galactosyl binding Lyophyllum decastes lectin (LDL). It is composed of a five-stranded anti-parallel beta-sheet and two alpha-helices and contains conserved cysteines responsible for disulfide bridges. The protein with the highest similarity is ginkbilobin-2, a protein with apparent anti-fungal properties isolated from the seeds of the Ginkgo biloba tree. Homologous sequences can be divided into two groups, where the proteins in the group with closest homology to LDL only consist of a single LDL-like domain. In the second group of sequences, the LDL-like domain is found at the C-terminal end of a larger domain with homology to members of the Ser, Gly, Asn, His consensus sequence (SGNH)-hydrolase family, which is part of the Gly, Asp, Ser, Leu motif-esterase/lipase superfamily. 102
61232 408430 pfam18648 ADPRTs_Tse2 Tse2 ADP-ribosyltransferase toxins. Tse2 from P. aeruginosa has structural features similar to ADP-ribosylating toxins. It is a cytoactive toxin secreted by a type six secretion apparatus of Pseudomonas aeruginosa and found mostly in gamma proteobacteria. It naturally attacks a target in the cytoplasm of bacterial cells. Structural analysis shows similarity between Tse2 and nicotinamide adenine dinucleotide (NAD)-dependent enzymes from bacteria, notably the mono-ADP-ribosyltransferase toxins (ADPRTs). Furthermore, it revealed that the Tse2 active site is occluded upon binding the cognate immunity protein Tsi2. The abrogation of toxicity for the R14A, S80A, and H122A mutant Tse2 proteins indicates the importance of these amino acids in the mechanism of Tse2 toxicity and, given their conservation with NAD-reactive enzymes, also supports their assignment as being involved in a catalytic reaction. 155
61233 408431 pfam18649 EcpB_C EcpB C-terminal domain. This is an immunoglobulin like domain found at the C-terminal region of EcpB. EcpB is a periplasmic chaperone which, along with EcpE, helps assemble the E. coli common pilus (ECP) EcpA and EcpD subunits. The C-terminal domain is predicted to contain residues that might be involved in binding a C-terminal carboxylate anchor. 72
61234 408432 pfam18650 IMPa_N_2 Immunomodulating metalloprotease N-terminal domain. PA0572 of P. aeruginosa is an inhibitor of PSGL-1, also known as an immunomodulating metalloprotease of P. aeruginosa (IMPa). IMPa prevents neutrophil extravasation and thereby protects P. aeruginosa from neutrophil attack. It belongs to the peptidase M60 family pfam13402. This entry represents the N-terminal alpha/beta-fold domain. 200
61235 408433 pfam18651 CshA_NR2 Surface adhesin CshA non-repetitive domain 2. The multifunctional fibrillar adhesin CshA, which mediates binding to both host molecules and other microorganisms, is an important determinant of colonization by Streptococcus gordonii, an oral commensal and opportunistic pathogen of animals and humans. CshA binds the high-molecular-weight glycoprotein fibronectin (Fn) via an N-terminal non-repetitive region, and this protein-protein interaction has been proposed to promote S. gordonii colonization at multiple sites within the host. This 259-kDa polypeptide is organized in the form of a leader peptide (residues 1-41), a non-repetitive region (residues 42-778), 17 repeat domains (R1-R17, each about 101-aa residues), and a C-terminal cell wall anchor. The non-repetitive Fn-binding region of CshA in turn is composed of three distinct domains, designated as non-repetitive domain 1 (NR1, CshA(42-222)), non-repetitive domain 2 (NR2, CshA(223-540)), and non-repetitive domain 3 (NR3, CshA(582-814)). The NR2 domain of CshA is shown to adopt a globular structure with a lectin-like fold and a ligand-binding site on its surface with structural homologues identified as those involved in binding carbohydrates or glycoproteins. 266
61236 408434 pfam18652 Adhesin_P1_N Adhesin P1 N-terminal domain. The cariogenic bacterium Streptococcus mutans uses adhesin P1 to adhere to tooth surfaces, extracellular matrix components, and other bacteria. The N terminus forms a stabilizing scaffold by wrapping behind the base of P1's elongated stalk and physically 'locking' it into place. It is suggested that the N terminus has a pronounced impact on P1 immunogenicity, antigenicity, folding, stability, and adherent function. 106
61237 408435 pfam18653 Arcadin_1 Arcadin 1. Arcadin-1 is encoded by the arcade gene cluster, which also encodes crenactin. Crenactin is a filament-forming protein from the crenarchaeon Pyrobaculum calidifontis which shows exceptional similarity to eukaryotic F-actin. Arcadin-1 on the other hand does not seem to be related to any known eukaryotic actin binding proteins nor does it affect crenactin polymerisation. 111
61238 408436 pfam18654 LegC3_N LegC3 N-terminal coiled-coil domain. LegC3 is an effector protein secreted by Legionella pneumophila which is believed to act by inhibiting vacuolar fusion. The N-terminal domain of LegC3 is composed of a long discontinuous coiled-coil capped by an antiparallel four-helix bundle that is rigidly connected to the coiled-coil segment. The features responsible for fusion inhibition are located mainly within the N-terminal domain, as this domain alone was sufficient to inhibit vacuole fusion, and that this domain remains associated with vacuoles. 296
61239 408437 pfam18655 SHIRT SHIRT domain. The SHIRT domain is found in a range of presumed bacterial adhesin proteins. 82
61240 408438 pfam18656 DUF5633 Family of unknown function (DUF5633). This entry represents a 40 residue repeat that is often found in tandem in a small set of bacterial cell surface proteins. The function of this region is not known. 41
61241 408439 pfam18657 YDG YDG domain. This presumed domain is found in a wide variety of bacterial cell surface proteins. This domain has a highly conserved YDG motif near its N-terminus. This domain is likely related to the pfam17883 domain. 82
61242 408440 pfam18658 zf-C2H2_12 Zinc-finger C2H2-type. This is a zinc finger domain C2H2 type which can be found in SPIN1 docking protein (SPIN-DOC) and Epm2a-interacting protein 1 (Epm2aip1). SPIN-DOC is a Spindlin1 (SPIN1) regulator that directly binds and strongly disrupts its histone methylation reading ability, causing it to disassociate from chromatin. Epm2aip1 is a glycogen synthase (GS)-associated protein. In the absence of Epm2aip1, the sensitivity of the liver to insulin, in which GS is a principal actor, is impaired. 64
61243 408441 pfam18659 CelTOS Cell-traversal protein for ookinetes and sporozoites. Cell-traversal protein for ookinetes and sporozoites (CelTOS) is a conserved protein that is essential for traversal of malaria parasites in both the mosquito vector and human host and is therefore critical for malaria transmission and disease pathogenesis. It specifically binds phosphatidic acid commonly present within the inner leaflet of plasma membranes, and potently disrupts liposomes composed of phosphatidic acid by forming pores. CelTOS resembles class I viral membrane fusion glycoproteins and a bacterial pore-forming toxin with roles in membrane binding and disruption. CelTOS forms an alpha helical dimer that resembles a tuning fork. Structure analysis indicates that it has a distinct structural architecture with two subdomains that independently resemble membrane binding and/or disrupting proteins and could simultaneously act during disruption. 116
61244 408442 pfam18660 Tsi6 Tsi6. Tsi6 inhibits the NADase activity of Tse6, an integral membrane toxin from Pseudomonas aeruginosa. The Tsi6 immunity protein adopts an all-alpha-helical fold that binds to a surface of Tse6. 83
61245 408443 pfam18661 AvrLm4-7 Avirulence Effector AvrLm4-7. AvrLm4-7 is found in Leptosphaeria maculans, an ascomycete fungus in the dothideomycete group which is responsible for stem canker (blackleg) of Brassica napus (oilseed rape, OSR) and other crucifers. AvrLm4-7 is one of six avirulence genes which encodes a small secreted protein strongly over-expressed at the onset of plant infection. This gene confers a dual recognition specificity by two distinct resistance genes of OSR, Rlm4 and Rlm7 and loss of AvrLm4 avirulence was demonstrated to be associated with a strong fitness cost. Structure and functional analysis of AvrLm4-7 protein show that it contains the motifs RAWG and RYRE, part of a well-structured protein region held together by disulfide bridges. Mutations in the RAWG motif or in the RYRE motif (especially mutations in both motifs) almost abolished the translocation of AvrLm4-7 into cells. Furthermore, loss of recognition of AvrLm4-7 by Rlm4 is caused by the mutation of a single glycine to an arginine residue located in a loop of the protein. 86
61246 408444 pfam18662 HTH_56 Cch helix turn helix domain. Staphylococcal Cassette Chromosome, or SCC elements, are a family of genomic islands found in S. aureus and closely related species. SCC elements that carry the mecA gene are called SCCmec and render S. aureus methicillin-resistant, creating the MRSA strains. Cch, the self-loading helicase encoded by SCCmec type IV, belongs to the pre-sensor II insert clade of AAA+ ATPases, as do the archaeal and eukaryotic MCM-family replicative helicases. The N-terminal domain carries pfam06048. The central domain (residues 157-438) contains an AAA+ ATPase fold. This domain is found at the C-terminal region, it is a winged helix-turn-helix (WH) domain typical of many dsDNA-binding proteins. 110
61247 376087 pfam18663 Pallilysin Pallilysin beta barrel domain. The Treponema pallidum protein, Tp0751 (also known as pallilysin), possesses adhesive properties and has been previously reported to mediate attachment to the host extracellular matrix components laminin, fibronectin, and fibrinogen. Tp0751 adopts an eight-stranded beta-barrel with a profile of short conserved regions consistent with a non-canonical lipocalin fold. Lipocalins, along with fatty acid-binding proteins and avidins, are members of the calycin superfamily, which is defined by the distinct features of a central beta-barrel and a key structural signature consisting of three short conserved regions (SCR1, SCR2, and SCR3). However, Tp0751 does not contain all three conserved regions, hence it is considered an outlier to canonical lipocalins. In SCR1, there is a GxW motif. 118
61248 408445 pfam18664 CdiA_C_tRNase CdiA C-terminal tRNase domain. This entry represents the C-terminal tRNase domain of CdiA a type II toxin/immunity protein complex which can be found in B. pseudomallei isolate E479. The C-terminal tRNase domain has an alpha/beta-fold characteristic of PD(D/E)XK nucleases. The PD(D/E)XK superfamily includes most restriction endonucleases and other enzymes involved in DNA recombination and repair. 116
61249 408446 pfam18665 TetR_C_37 Tetracyclin repressor-like, C-terminal domain. IcaR belongs to the tetracycline repressor (TetR) family of proteins, which are involved in a wide variety of gene regulations. It binds to a 42 bp region immediately upstream of the icaA gene. This entry represents the C-terminal domain which is involved in dimerization. 117
61250 408447 pfam18666 CBM64 Carbohydrate-binding module 64. Spirochaeta thermophila secretes seven glycoside hydrolases for plant biomass degradation that carry a carbohydrate-binding module 64 (CBM64) appended at the C-terminus. CBM64 adsorbs to various beta1-4-linked pyranose substrates and shows high affinity for cellulose. Structure analysis indicates a jelly-roll-like fold corresponding to a surface-binding type A CBM. 74
61251 408448 pfam18667 BppU_IgG Baseplate upper protein immunoglobulin like domain. This is a beta-sandwich immunoglobulin fold domain, which resembles the plexin-A2 C-terminal domain in structure. In baseplate upper protein (BppU, also known as ORF48) trimer, this domain plays part in surrounding the Dit hexameric core. 96
61252 408449 pfam18668 Tail_spike_N Tail spike TSP1/Gp66 receptor binding N-terminal domain. Bacteriophages recognize and bind to their hosts with the help of receptor-binding proteins (RBPs) that emanate from the phage particle in the form of fibers or tailspikes. RBPs of podovirus G7C tailspikes gp63.1 and gp66 are essential for infection of its natural host bacterium E. coli 4s. Gp63.1 and gp66 form a stable complex, in which the N-terminal part of gp66 serves as an attachment site for gp63.1 and anchors the gp63.1-gp66 complex to the G7C tail. The two N-terminal domains show 70% sequence identity to the N-terminal region of the CBA120 phage tailspike 1 (orf210, TSP1). The N-terminal domain of TSP1 is the virion head binding domain that interfaces with the phage baseplate. The N-terminal domain can be further divided into two subdomains, each beginning with a alpha-helix followed by an anti-parallel beta-sandwich. Subdomain two folds similarly to the chitin binding domain of Chitinase from Bacillus circulans. 70
61253 408450 pfam18669 Trp_ring Trimeric autotransporter adhesin Trp ring domain. Autotransporters are synthesized as precursor proteins with three functional domains, namely, an N-terminal signal peptide, an internal passenger domain, and a C-terminal pore-forming translocator domain. The C-terminal translocator domain is embedded in the outer membrane and facilitates delivery of the internal passenger domain to the bacterial surface. In conventional autotransporters, the C-terminal translocator domain contains approximately 300 amino acids and is monomeric. In contrast, in trimeric autotransporters, the translocator domain contains 60 to 70 amino acids and forms trimers in the outer membrane. This entry represents a Trp-ring domain which is found in the translocator region of H. influenzae Hia autotransporter, an adhesive protein that promotes adherence to respiratory epithelial cells. Trp-ring domains appear to be crucial repeated modular units in Hia, both in the general architecture of the passenger domain and in the structure of the binding domains. 49
61254 408451 pfam18670 V_ATPase_I_N V-type ATPase subunit I, N-terminal domain. Vacuolar H+-ATPase (V-ATPase) is a ubiquitous multi-subunit proton pump that acidifies a wide variety of intracellular compartments, which in turn affects many biological processes, including membrane trafficking, protein degradation and coupled transport of small molecules and pH homeostasis. Subunit 'a' of V0 (the functional domain responsible for proton transport) sector is highly conserved across eukaryotic species and exists in multiple isoforms. It is the largest subunit of V-ATPases and partitioned almost equally into an N-terminal cytosolic domain and a C-terminal integral membrane. Structure analysis of the N-terminal cytosolic domain from the Meiothermus ruber subunit 'I' homolog of subunit a shows that it is composed of a curved long central alpha-helix bundle capped on both ends by two lobes with similar alpha/beta architecture. 90
61255 408452 pfam18671 4HPAD_g_N 4-Hydroxyphenylacetate decarboxylase subunit gamma N-terminal. 4-Hydroxyphenylacetate decarboxylase (4-HPAD) is a heterotetramer consisting of catalytic beta-subunit harboring the putative glycyl/thiyl dyad and a distinct small gamma-subunit with two [4Fe-4S] clusters (EC:4.1.1.83). The gamma-subunit is proposed to be involved in the regulation of the oligomeric state and catalytic activity of the enzyme and it comprises two domains with some amino acid sequence identity that are structurally related by a pseudo-2-fold symmetry indicating a gene duplication origin. This entry represents the N-terminal domain which binds one [4Fe-4S] cluster through His3, Cys6, Cys19, and Cys36. 31
61256 376096 pfam18672 DUF5634_N Family of unknown function (DUF5634). This is an N-terminal domain of unknown function found in Deltaproteobacteria. 93
61257 408453 pfam18673 IrmA interleukin receptor mimic protein A. The E. coli interleukin [IL] receptor mimic protein A (IrmA), is a small (13 kDa) Uropathogenic E. coli (UPEC) protein that was originally identified in a large reverse genetic screen as a broadly protective vaccine antigen. It has a fibronectin III (FNIII)-like fold that forms a domain-swapped dimer with structural mimicry to the binding domain of the IL-2 receptor (IL-2R), the IL-4 receptor (IL-4R) and, to a lesser extent, the IL-10 receptor (IL-10R). IrmA binds to all three cytokines, with the greatest affinity observed for IL-4. It is suggested that IrmA may contribute to manipulation of the innate immune response during UPEC infection. 106
61258 408454 pfam18674 TarS_C1 TarS beta-glycosyltransferase C-terminal domain 1. Beta-glycosyltransferase TarS is an enzyme responsible for the glycosylation of wall teichoic acid polymers of the S. aureus cell wall, a process that has been shown to be specifically responsible for methicillin resistance in MRSA. It contains a trimerization domain composed of tandem carbohydrate binding motifs. The two C-terminally localized regions composed of a series of beta-sheets participate in an extensive trimerization interface and they assume an immunoglobulin-like fold. It is suggested that both carbohydrate binding domains may be involved in polyRboP binding; however, unlike pullulanase, the CBMs of TarS are involved in the formation of an extensive trimerization interface. 148
61259 408455 pfam18675 HepII_C Heparinase II C-terminal domain. Heparinase II (HepII) is an 85-kDa dimeric enzyme that depolymerizes both heparin and heparan sulfate glycosaminoglycans. The protein is composed of three domains: an N-terminal alpha-helical domain, a central two-layered beta-sheet domain, and a C-terminal domain forming a two-layered beta-sheet. The C-terminal domain contains nine beta-strands packed together in a manner resembling a beta-barrel. 88
61260 408456 pfam18676 MBG_2 MBG domain (YGX type). This domain is found in a variety of bacterial extracellular proteins. This domain is related to the MBG domain (pfam17883), but it replaces the characteristic YDG motif close to the N-terminus with a YGX motif. 72
61261 408457 pfam18677 ArnB_C Archaellum regulatory network B, C-terminal domain. This is the C-terminal domain found in archaeal proteins that carry a von Willebrand factor type A domain such as ArnB from Sulfolobus acidocaldarius. ArnB is involved in negative regulation of the archaellum (formerly: archaeal flagellum), and the C-terminal domain is phosphorylated by ArnC and ArnD on serine and threonine residues affecting docking of ArnA. 72
61262 408458 pfam18678 AOC_like Allene oxide cyclase barrel like domain. This is an allene oxide cyclase barrel like domain found in spirotetronate cyclases such as AbyU, a Diels-Alderase enzyme. It is comprised of two eight-stranded antiparallel beta-barrels. 122
61263 408459 pfam18679 HTH_57 ThcOx helix turn helix domain. This is a winged helix turn helix domain which is found in cyanobactin oxidase ThcOx N-terminal region. The oxidase converts thiazolines to thiazoles. 107
61264 408460 pfam18680 SPECT1 Plasmodium host cell traversal SPECT1. This domain is found in SPECT1 (sporozoite microneme protein essential for cell traversal). It is formed of a four alpha-helix bundle with a 'hook'-like feature at one end. These helices are in parallel or antiparallel alignment. 180
61265 408461 pfam18681 DUF5634 Family of unknown function (DUF5634). This is a domain of unknown function mostly found in bacilli. 95
61266 408462 pfam18682 PilA4 Pilin A4. This domain is found in the major pilin protein PilA. PilA4 binds to PilMNO forming a complex with a well-defined platform linking the cytoplasmic PilM protein to pilus subunits in the periplasm. Structure analysis indicates that it is comprised of one alpha-helix and 4 beta-sheets. 82
61267 408463 pfam18683 ChiW_Ig_like Chitinase W immunoglobulin-like domain. This is an immunoglobulin like domain found in ChiW, a chitinase with high activity towards various chitins. ChiW has a multi-modular architecture composed of six domains to function efficiently on the cell surface: a right-handed beta-helix domain (carbohydrate-binding module family 54, CBM-54), a Gly-Ser-rich loop, 1st immunoglobulin-like (Ig-like) fold domain, 1st beta/alpha-barrel catalytic domain (glycoside hydrolase family 18, GH-18), 2nd Ig-like fold domain and 2nd beta/alpha-barrel catalytic domain (GH-18). 106
61268 408464 pfam18684 PlyB_C Pleurotolysin B C-terminal domain. This is a trefoil C-terminal beta-rich domain found in PlyB, one of the components of pleurotolysin (Ply) pore-forming protein. Ply is a membrane attack complex/perforin-like family (MACPF) protein consisting of two components, PlyA and PlyB. PlyB and PlyA act together to form relatively small and regular pores in liposomes. The PlyB C-terminal trefoil sits on top of the PlyA dimer. 173
61269 408465 pfam18685 DUF5635 Family of unknown function (DUF5635). This is a domain of unknown function which is found at the C-terminal region of pfam13749 in actinobacteria. 86
61270 376110 pfam18686 DUF5636 Family of unknown function (DUF5636). This is a domain of unknown function mostly found in gammaproteobacteria. 193
61271 376111 pfam18687 DUF5637 Family of unknown function (DUF5637). This is a domain of unknown function found in predicted cysteine knot peptides. 33
61272 408466 pfam18688 DUF5638 Family of unknown function (DUF5638). This is a domain of unknown function found in Legionella. 104
61273 376113 pfam18689 PriX Primase X. This domain is found in non-catalytic subunit of the archaeal eukaryotic-type primase, PriX. Detailed sequence analysis combined with structural analysis of a truncated PriX protein from the hyperthermophilic archaeon Sulfolobus solfataricus shows that, PriX is essential for the survival of the organism and that it is homologous to the C-terminal domain of archaeal and eukaryotic large primase subunits PriL. Highly conserved PriX homologues are present in many members of the phylum Crenarchaeota. 99
61274 408467 pfam18690 DUF5639 Family of unknown function (DUF5639). This is a domain of unknown function which is mainly found in Deinococcus-Thermus. Some family members can be found in the C-terminal region of pfam01565. 82
61275 408468 pfam18691 Cdc13_OB2 Cell division control protein 13, OB2 domain. Cdc13 is an essential yeast protein required for telomere length regulation and genome stability. Cdc13, like a number of single-stranded telomere binding proteins, consists of several oligonucleotide-oligosaccharide binding (OB) folds. These folds potentially arise from evolutionary gene duplication and are involved in multiple functions, including nucleic acid and protein binding and Cdc13 dimerization. This entry represents the OB2 domain, second OB-fold counting from the N terminus of Cdc13. Biochemical assays indicate OB2 is not involved in telomeric DNA or Stn1 binding. However, disruption of the OB2 dimer in full-length Cdc13 affects Cdc13-Stn1 association, leading to telomere length deregulation, increased temperature sensitivity, and Stn1 binding defects. Hence it is suggested that the dimerization of the OB2 domain of Cdc13 is required for proper Cdc13, Stn1, Ten1 (CST) assembly and productive telomere capping. 111
61276 408469 pfam18692 DUF5640 Family of unknown function (DUF5640). This domain is found in proteins of unknown function. It is composed of eight antiparallel beta strands and carries a G-X-W motif in the N-terminal region. The conserved glycine-X-tryptophan (G-X-W) motif is characteristic of the lipocalin family. 83
61277 408470 pfam18693 TRAM_2 TRAM domain. This is a C-terminal TRAM (after TRM2, a family of uridine methylases, and MiaB) domain found in the methylthiotransferases RimO enzymes that catalyze the conversion of aspartate to 2-methylthio-aspartate (msD) in the S12 protein near the decoding center in prokaryotic ribosomes. The TRAM domain in RimO, contains five anti-parallel beta-strands and docks on the surface of the Radical-SAM domain at the distal edge of its open TIM-barrel from its conserved [4Fe-4S] cluster. 63
61278 408471 pfam18694 TDP43_N Transactive response DNA-binding protein N-terminal domain. This domain can be found at the N-terminal region of transactive response DNA-binding protein 43 kDa (TDP-43), an RNA transporting and processing protein whose aberrant aggregates are implicated in neurodegenerative diseases. TDP-43 N-terminal domain has been shown to play an important role in the aggregation of TDP-43 monomers and its loss of function affects the RNA metabolic levels. Secondary structure of the N-terminal domain consists of six beta-strands and it resembles axin 1. 74
61279 408472 pfam18695 cPLA2_C2 Cytosolic phospholipases A2 C2-domain. Cytosolic phospholipases A2 (cPLA2s) consist of a family of calcium-sensitive enzymes that function to generate lipid second messengers through hydrolysis of membrane-associated glycerophospholipids. In humans, the cPLA2 family contains six isoforms. Structural information on the full-length cPLA2alpha apo form shows that it is composed of two domains: an N-terminal Ca2+ binding C2 domain and a C-terminal alpha/beta hydrolase core. This entry describes the N-terminal Ca2+ binding C2 domain which is composed of an eight-stranded antiparallel beta-sandwich consisting of two four-stranded beta-sheets. C2 domains are present in many lipid-binding proteins including Copines, CAPRI and Rabphilin-3A, all of which are involved in membrane trafficking. 111
61280 408473 pfam18696 SMP_C2CD2L Synaptotagmin-like, mitochondrial and lipid-binding domain. This is a lipid transport domain found in phospholipid transfer proteins such as C2CD2L-like (also known as TMEM24). The TMEM24-SMP domain is shown to bind glycerolipids with a preference for phosphatidylinositol (PI). The bound PI is then transferred to the plasma membrane (PM) where it is converted to phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2] to replenish pools of this lipid hydrolyzed during glucose-stimulated signaling. PI(4,5)P2 is required for Ca2+-dependent exocytosis; hence, the SMP domain of TMEM24 is essential for sustaining the intracellular Ca2+ oscillations that trigger bursts of insulin granule release and hence insulin secretion. The SMP domain belongs to a superfamily of lipid/hydrophobic ligand-binding domains called TULIP (for tubular lipid-binding proteins). It adopts the TULIP fold, with two alpha helices and a highly curved antiparallel beta sheet forming a cornucopia-like structure. 152
61281 408474 pfam18697 MLVIN_C Murine leukemia virus (MLV) integrase (IN) C-terminal domain. This is the C-terminal domain (CTD) which can be found in murine leukemia virus (MLV) integrase (IN) proteins. The MLV IN C-terminal domain interacts with the bromo and extraterminal (BET) proteins through the ET domain. This interaction provides a structural basis for global in vivo integration-site preferences, and disruption of this interaction through truncation mutations affects the global targeting profile of MLV. The CTD consists of an SH3 fold followed by a long unstructured tail. 83
61282 408475 pfam18698 HisK_sensor Histidine kinase sensor domain. The Bacillus subtilis ResD-ResE two-component (TC) regulatory system activates genes involved in nitrate respiration in response to oxygen limitation or nitric oxide (NO). The sensor kinase ResE activates the response regulator ResD through phosphorylation, which then binds to the regulatory region of genes involved in anaerobiosis to activate their transcription. In other words, ResE is involved in sensing signals related to the redox state of the cells. ResE is composed of an N-terminal signal input domain and a C-terminal catalytic domain. The N-terminal domain contains two transmembrane subdomains and a large extra-cytoplasmic loop. Mutational analysis indicate that cytoplasmic ResE lacking the transmembrane segments and the extra-cytoplasmic loop retains the ability to sense oxygen limitation and NO, which leads to transcriptional activation of ResDE-dependent genes. Having said that, it is also proposed that the extra-cytoplasmic region may serve as a second signal-sensing subdomain. This suggests that the extracytoplasmic region could contribute to amplification of ResE activity leading to the robust activation of genes required for anaerobic metabolism in B. subtilis. This entry represents the extracytoplasmic subdomain. Family members also include SrrB found in S. aureus that is similar to ResE of B. subtilis. 126
61283 408476 pfam18699 MRPL52 Mitoribosomal protein mL52. Members of this family include the mammalian mitoribosomal protein mL52, which is found in the 39S subunit. mL52 has no homologues in yeast. 91
61284 408477 pfam18700 Castor1_N Cytosolic arginine sensor for mTORC1 subunit 1 N-terminal domain. CASTOR1 (Cytosolic arginine sensor for mTORC1 subunit 1) has been identified as the cytosolic arginine sensor for the mTORC1 pathway. In the absence of arginine, CASTOR1 binds to GATOR2 and inhibits mTORC1 signaling; whereas in the presence of arginine, CASTOR1 interacts with arginine and no longer associates with GATOR2. The arginine sits in a pocket between the N-terminal domain (NTD) and the C-terminal domain (CTD) of CASTOR1. The CASTOR1-NTD on the opposite side of the arginine-binding site was identified to mediate direct physical interaction with its downstream effector GATOR2, via GATOR2 subunit Mios. 61
61285 408478 pfam18701 DUF5641 Family of unknown function (DUF5641). This presumed domain is found in a range of retrotransposon polyproteins. 94
61286 408479 pfam18702 DUF5642 Domain of unknown function (DUF5642). This is a domain of unknown function found in actinobacteria. 186
61287 408480 pfam18703 MALT1_Ig MALT1 Ig-like domain. This is an immunoglobulin-like domain which can be found in the mucosa-associated lymphoid tissue lymphoma translocation 1 (MALT1) paracaspase. MALT1 is a key component of the Carma1/Bcl10/MALT1 signalosome and is critical for NF-kB signaling in multiple contexts. The MALT1 C-terminal Ig domain is suggested to recruit key factors to promote NF-kB activation. It is also proposed to undergo Lys63-linked ubiquitylation via TRAF6 in potentially nine different lysines to recruit the IKK complex. 138
61288 408481 pfam18704 Chromo_2 Chromatin organization modifier domain 2. Chromodomains serve as chromatin-targeting modules, general protein interaction elements as well as dimerization sites. They are found in many chromatin-associated proteins that bind modified histone tails for chromatin targeting. Chromodomains often recognize modified lysines through their aromatic cage thus targeting proteins to chromatin. Family members such as GEN1 carry a chromodomain which directly contacts DNA and its truncation severely hampers GEN1's catalytic activity. The chromodomain allows GEN1 to correctly position itself against DNA molecules, and without the chromodomain, GEN1's ability to cut DNA was severely impaired. The GEN1 chromodomain was found to be distantly related to the CDY chromodomains and chromobox proteins, particularly to the chromo-shadow domains of CBX1, CBX3 and CBX5. Furthermore, it is conserved from yeast (Yen1) to humans with the only exception being the Caenorhabditis elegans GEN1, which has a much smaller protein size of 443 amino acids compared to yeast Yen1 (759 aa) or human GEN1 (908 aa). 62
61289 408482 pfam18705 DUF5643 Family of unknown function (DUF5643). This is an immunoglobulin-like domain found in bacteria. 117
61290 408483 pfam18706 ISPD_C D-ribitol-5-phosphate cytidylyltransferase C-terminal domain. This domain is located at the C-terminal region of ISPD (isoprenoid synthase domain containing protein, EC:2.7.7.40), pfam01128. Structural homologs can be found in two distinct alpha/beta protein families including the seven-stranded NAD(P)(H)-dependent short-chain dehydrogenases/reductases and five-stranded response regulator proteins involved in bacterial sensing systems. 169
61291 408484 pfam18707 IL2RB_N1 Interleukin-2 receptor subunit beta N-terminal domain 1. IL-2Rbeta is a member of the class I cytokine receptor superfamily. It carries a cytokine-binding homology region, which is divided in two fibronectin type-III (FN-III) domains termed D1 and D2. Each domain contains seven beta-strands that form a sandwich of two antiparallel beta-sheets. The N-terminal D1 domain of IL-2Rbeta includes two highly conserved disulfide bridges. This entry describes D1 of the N-terminal region of IL2Rbeta. 92
61292 408485 pfam18708 MapZ_C2 MapZ extracellular C-terminal domain 2. In pneumococcal cell division, MapZ (Midcell Anchored Protein Z) locates at the division site before FtsZ and guides septum positioning. MapZ forms ring structures at the cell equator and moves apart as the cell elongates, therefore behaving as a permanent beacon of division sites. MapZ then positions the FtsZ-ring through direct protein-protein interactions. Structural analysis indicates that it displays a bi-modular structure composed of two subdomains separated by a flexible serine-rich linker. The extracellular C-terminal domain carries a conserved patch of amino acids which plays a crucial function in binding peptidoglycan and positioning MapZ at the cell equator. 94
61293 408486 pfam18709 DLP_helical Dynamin-like helical domain. This helical domain is found in bacterial proteins such as labile enterotoxin output A (LeoA), a large GTPase (64.2 kDa) with a putative involvement in membrane vesicle (MV) secretion in Escherichia coli. The crystal structure of LeoA reveals a fold with all the hallmarks of a dynamin-like protein (DLP). 344
61294 408487 pfam18710 ComR_TPR ComR tetratricopeptide. In Gram-positive bacteria, cell-to-cell communication mainly relies on extracellular signaling peptides. ComR is a member of the RNPP family, which positively controls competence for natural DNA transformation in streptococci. It is directly activated by the binding of its associated pheromone XIP. The crystal structure analysis of ComR shows that it contains an N-terminal helix-turn-helix (HTH) DNA-binding domain (DBD) and a C-terminal tetratricopeptide repeat (TPR) domain. The TPR domain is composed of 11 alpha-helices forming 5 TPR motifs followed by an additional C-terminal alpha-helix 16 called CAP. The pheromone XIP binding site is found in the TPR region. Biochemical and mutational analyses indicate that, if the interacting XIP is accepted, it can then trigger a conformational change of the TPR domain that opens the DBD-TPR interface, allowing the dimer formation that is required to bind DNA. 224
61295 408488 pfam18711 TxDE Toxoflavin-degrading enzyme. This domain is found in toxoflavin-degrading enzymes such as toxoflavin lyase (TflA) also known as toxoflavin-degrading enzyme (TxDE). TflA/TxDE is structurally similar to the vicinal oxygen chelate superfamily of metalloenzymes, despite the lack of apparent sequence identity. 55
61296 408489 pfam18712 DUF5644 Family of unknown function (DUF5644). This is a domain of unknown function found at the C-terminal region of Helicobacterial proteins of unknown function. 109
61297 408490 pfam18713 DUF5645 Domain of unknown function (DUF5645). This is a domain of unknown function found in Diptera. Some family members carry pfam08445 on their C-terminal. 126
61298 408491 pfam18714 PI-TkoII_IV DNA polymerase II intein Domain IV. This domain can be found in the hyperthermophilic archaeon Thermococcus kodakaraensis Pol-2 intein. It is suggested to be a potential DNA binding domain. 150
61299 408492 pfam18715 Phage_spike Phage spike trimer. Bacteriophages penetrate the host cell membrane using their tail to inject genetic material into the host. In this penetration process, they use a central spike domain located beneath their baseplate. The spike domain folds as a trimeric iron-binding structure. This entry contains three copies of the repeat unit. 53
61300 408493 pfam18716 VATC Vms1-associating treble clef domain. Treble clef fold domain found at C-terminus of many, but not all, Vms1/ANKZF1-like proteins. 43
61301 408494 pfam18717 CxC4 CxC4 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 129
61302 408495 pfam18718 CxC5 CxC5 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 117
61303 408496 pfam18719 ArlS_N ArlS sensor domain. This entry represents the N-terminal extracellular sensor domain of the ArlS protein from S. aureus. 127
61304 376143 pfam18720 EGF_Tenascin Tenascin EGF domain. This entry represents the EGF-like domains found in tenascin proteins. 29
61305 408497 pfam18721 CxC6 CxC6 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain inserted into the core of the KDZ transposase domain. 66
61306 408498 pfam18722 MazG_C MazG C-terminal domain. An alpha+beta fold domain found C-terminal to the MazG superfamily pyrophosphatase domain. The domain has a conserved DxYRxHDxxH motif indicative of catalytic activity. Based on its broader context in DNA modification, it is proposed to function as a nucleotide kinase. 190
61307 408499 pfam18723 aGPT-Pplase1 alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 1. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Members of this clade are found in phages with hypermodified bases and eukaryotes such as fungi and stramenopiles. 280
61308 408500 pfam18724 aGPT-Pplase2 Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 2. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in caudoviruses and prophages. 229
61309 408501 pfam18725 HEPN_SAV2148 SAV2148-like HEPN. SAV2148-like HEPN nuclease domain. 216
61310 408502 pfam18726 HEPN_SAV_6107 SAV_6107-like HEPN. SAV_6107-like HEPN. 98
61311 376150 pfam18727 ALMS_repeat Alstrom syndrome repeat. This entry contains a single repeat unit of approximately 47 AA. It is found in Alstrom syndrome protein 1 (ALMS1) and homologs. 47
61312 408503 pfam18728 HEPN_AbiV AbiV. AbiV-like HEPN 157
61313 408504 pfam18729 HEPN_STY4199 STY4199-like HEPN. STY4199-like HEPN nuclease domain. 282
61314 408505 pfam18730 HEPN_Cthe2314 Cthe_2314-like HEPN. Cthe_2314-like HEPN. 173
61315 408506 pfam18731 HEPN_Swt1 Swt1-like HEPN. Swt1-like HEPN. This HEPN domain might have a role in binding and sensing unspliced pre-mRNAs that are specifically targeted by the Swt1 nuclease at the nuclear envelope. 116
61316 376155 pfam18732 HEPN_AbiA_CTD HEPN like, Abia C-terminal domain. AbiA-CTD-like HEPN nuclease. Fused to Reverse Transcriptase ; in operon with R-M system. 132
61317 408507 pfam18733 HEPN_LA2681 LA2681-like HEPN. LA2681-like HEPN nuclease. 207
61318 408508 pfam18734 HEPN_AbiU2 AbiU2. AbiU2-like HEPN 193
61319 408509 pfam18735 HEPN_RiboL-PSP RiboL-PSP-HEPN. RiboL-PSP-HEPN. Fused to endoRNase L-PSP ; in operon with ParB. 191
61320 408510 pfam18736 pEK499_p136 HEPN pEK499 p136. pEK499_p136-like HEPN. 150
61321 408511 pfam18737 HEPN_MAE_28990 MAE_28990/MAE_18760-like HEPN. HEPN-like nuclease. MAE_28990 In operon with a ParB nuclease and DNA methylase genes. MAE_18760-like HEPN found fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold. 211
61322 408512 pfam18738 HEPN_DZIP3 DZIP3/ hRUL138-like HEPN. DZIP3/ hRUL138-like HEPN nuclease. Fusion to TPR, Zn-ribbon, RING, Ankyrin, CARD, NACHT ATPase, DEATH and LRR in various animal lineages. 144
61323 408513 pfam18739 HEPN_Apea Apea-like HEPN. Apea-like HEPN nuclease. In epsilonproteobacteria embedded in R-M operons. 99
61324 408514 pfam18740 EC042_2821 EC042_2821-like REase. REase Fold Fused to HEPN (EC042_2821) and an N-terminal wHTH in some. 188
61325 408515 pfam18741 MTES_1575 REase_MTES_1575. Vsr REase Fold. Fused to HEPN (SWT1/Abi2 family), along with Transglutaminase and wHTH. 96
61326 408516 pfam18742 DpnII-MboI REase_DpnII-MboI. REase Fold fused to DpnII/MboI-NTD. 150
61327 408517 pfam18743 AHJR-like REase_AHJR-like. REase Fold fused to HEPN(DUF86) pfam01934. 124
61328 376166 pfam18744 SNAD1 Secreted Novel AID/APOBEC-like Deaminase 1. A family of secreted AID/APOBEC like deaminases found sporadically across vertebrates. 208
61329 408518 pfam18745 SNAD2 Secreted Novel AID/APOBEC-like Deaminase 2. A family of secreted AID/APOBEC like deaminases found in ray-finned fishes. 211
61330 408519 pfam18746 aGPT-Pplase3 Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 3. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in bacterial mobile operons. 279
61331 408520 pfam18747 Ploopntkinase2 P-loop Nucleotide Kinase2. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis. 298
61332 408521 pfam18748 Ploopntkinase1 P-loop Nucleotide Kinase1. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis. 196
61333 376171 pfam18749 SNAD3 Secreted Novel AID/APOBEC-like Deaminase 3. A family of AID/APOBEC like deaminases found in vertebrates that were derived from secreted versions of the family. 379
61334 408522 pfam18750 SNAD4 Secreted Novel AID/APOBEC-like Deaminase 4. A family of secreted AID/APOBEC like deaminases found only in sponges that often shows lineage-specific expansions. 104
61335 408523 pfam18751 Ploopntkinase3 P-loop Nucleotide Kinase3. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis. 184
61336 376174 pfam18752 DAAD Dictyosteliid AID/APOBEC-like Deaminase. A family of secreted AID/APOBEC-like deaminases found in dictyosteliids that often shows lineage-specific expansions. 291
61337 408524 pfam18753 Nmad2 Nucleotide modification associated domain 2. A beta-strand rich domain containing a conserved cysteine and charged residues predicted to play a role in modified DNA base biosynthesis. 202
61338 408525 pfam18754 Nmad3 Nucleotide modification associated domain 3. An alpha+beta fold domain with highly conserved HxD and D motifs suggestive of enzymatic function and predicted to be involved in modified nucleotide biosynthesis. 244
61339 408526 pfam18755 RAMA Restriction Enzyme Adenine Methylase Associated. An alpha+beta fold domain associated with restriction enzymes across prokaryotes and fused to JAB deubiquitinases, and chromatin proteins in a wide range of eukaryotes. The domain is predicted to function as a modified-DNA reader domain. 108
61340 408527 pfam18756 Nmad4 Nucleotide modification associated domain 4. An alpha+beta fold domain typically associated with DNA methylases and likely to be involved in modified nucleotide biosynthesis. 87
61341 408528 pfam18757 Nmad5 Nucleotide modification associated domain 5. An alpha+beta fold domain associated with DNA base modifying genes in prokaryotes, and likely to be involved in modified DNA base biosynthesis. 205
61342 408529 pfam18758 KDZ Kyakuja-Dileera-Zisupton transposase. A transposase family with an RNaseH catalytic domain, often fused to DNA binding domains such as SAP or cysteine cluster domains. KDZ transposases are widely present in fungi, metazoa, chlorophytes and haptophytes. Fungal versions are often associated with a TET/JBP family of dioxygenases. 218
61343 408530 pfam18759 Plavaka Plavaka transposase. A transposase with an RNaseH catalytic domain that often has a histone binding BAM/BAH domain at the C-terminus and is sometimes associated with TET/JBP family of dioxygenases in fungi. 320
61344 408531 pfam18760 ART-PolyVal ADP-Ribosyltransferase in polyvalent proteins. A family of ADP-Ribosyltransferases found in polyvalent proteins of phages and conjugative elements. These are in turn related to the Tox-ART-HYD2 group of ADP-Ribosyltransferases that are seen in polymorphic toxin systems and in toxin-antitoxin systems. These are predicted to modify host proteins. 136
61345 408532 pfam18761 Heliorhodopsin Heliorhodopsin. Heliorhodopsins, distantly related to type-1 rhodopsins, are embedded in the membrane with their N termini facing the cell cytoplasm, an orientation that is opposite to that of type-1 or type-2 rhodopsins. Heliorhodopsins show photocycles that are longer than one second, which is suggestive of light-sensory activity. Heliorhodopsin photocycles accompany retinal isomerization and proton transfer, as in type-1 and type-2 rhodopsins, but protons are never released from the protein. 242
61346 408533 pfam18762 Kinase-PolyVal Serine/Threonine/Tyrosine Kinase found in polyvalent proteins. A family of protein kinases found in polyvalent proteins of phages and prophages that although preserving their active site residues for ATP-binding and phosphotransfer appear to have lost the C-terminal subdomain characteristic of this superfamily. 160
61347 408534 pfam18763 ddrB-ParB ddrB-like ParB superfamily domain. A member of the ParB/sulfiredoxin superfamily of proteins found in polyvalent proteins prototyped by the version in the phage P1 ddRB protein. These proteins are predicted to function as nucleases. 124
61348 408535 pfam18764 nos_propeller Nitrous oxide reductase propeller repeat. Nitrous oxide reductases usually contain a seven-bladed beta-propeller domain with external short alpha-helices. This entry represents a single blade of the propeller, with imperfect alpha-helix, usually at the C-terminus of the repeat region. 71
61349 408536 pfam18765 Polbeta Polymerase beta, Nucleotidyltransferase. A member of the nucleotidyltransferase fold found in polymorphic toxins (NTox45) and polyvalent proteins. 93
61350 408537 pfam18766 SWI2_SNF2 SWI2/SNF2 ATPase. A SWi2/SNF2 ATPase found in polyvalent proteins. 223
61351 408538 pfam18767 AID Activation induced deaminase. The activation induced deaminase is a vertebrate-specific member of the classical AID/APOBEC cytosine deaminases that is involved in antibody diversification. 90
61352 408539 pfam18768 HTH_Bact Helix-turn-helix bacterial domain. The bacterial PIcR helix-turn-helix transcription factor includes five TPR units of different lengths. This entry represents the central, medium-sized HTHs repeat. 210
61353 408540 pfam18769 APOBEC1 APOBEC1. APOBEC1 deaminates cytosine both in RNA and ssDNA and has roles in both mRNA editing and ssDNA mutagenesis as part of the defense against retroviruses and genomic retrotransposons. 101
61354 408541 pfam18770 Arm_vescicular Armadillo tether-repeat of vesicular transport factor. Armadillo-like tether-repeat of general vesicular transport factor. This entry contains a single copy of the repeat unit. 60
61355 408542 pfam18771 APOBEC3 APOBEC3. APOBEC3 deaminases act as restriction factors in the innate response to retroviruses and various retroelements. 135
61356 408543 pfam18772 APOBEC2 APOBEC2. APOBEC2 is a highly conserved (slow-evolving) family of AID/APOBECs found in most vertebrates including cartilaginous fishes. APOBEC2 is poorly understood in terms of its molecular function and substrate specificity. 174
61357 408544 pfam18773 Importin_rep Importin 13 repeat. Importin 13 has a spiralic structure containing repeats structurally similar to HEAT repeats. It serves as receptor for nuclear localization signals (NLS) in cargo substrates, mediating docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit. 40
61358 408545 pfam18774 APOBEC4_like APOBEC4-like -AID/APOBEC-deaminase. Cnidarian and Algal homologs of the APOBEC4-like AID/APOBEC-like deaminases characterized by a distinct Zn chelating site involving residues from the conserved loops 1 and 3. 131
61359 408546 pfam18775 APOBEC4 APOBEC4. A member of the AID/APOBEC family of cytosine deaminases. The biological function of APOBEC4 is poorly understood. However, it is widely conserved across vertebrates. 74
61360 408547 pfam18776 Hexapep_loop Hexapeptide repeat including loop. This entry contains a single hexapeptide repeat unit including a loop between two strands. 24
61361 408548 pfam18777 CRM1_repeat Chromosome region maintenance or exportin repeat. Chromosome region maintenance 1 or exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-terminal, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This entry represents the central repeats of CRM1. 37
61362 408549 pfam18778 NAD1 Novel AID APOBEC clade 1. A distinct family of AID/APOBEC-like deaminases found in ray-finned fishes, the coelacanth, amphibians, lizards, and marsupials. 175
61363 408550 pfam18779 LRR_RI_capping Capping Ribonuclease inhibitor Leucine Rich Repeat. Leucine-rich repeats are composed of a beta-alpha unit. This repeat unit is found as a capping unit (N- or C-terminal of the repeat region) of Ribonuclease Inhibitors. 30
61364 408551 pfam18780 HNH_repeat Homing endonuclease repeat. Homing endonucleases are found in bacteria and viruses and catalyze the hydrolysis of genomic DNA within the cells that synthesize them. This entry represents a single repeat unit. 23
61365 408552 pfam18781 Phage_spike_2 Phage spike trimer. This Pfam entry includes some phage spike repeats that fail to be detected with the pfam18715 model. 78
61366 408553 pfam18782 NAD2 Novel AID APOBEC clade 2. A distinct family of AID/APOBEC deaminases found only in Amphibians. 179
61367 408554 pfam18783 IPU_b_solenoid Isopullulanase beta-solenoid repeat. IPU and dextranase repeat unit includes three (or one long and one short) parallel beta-strands. The repeat region as a whole folds into a beta-helix, known as beta-solenoid. 33
61368 408555 pfam18784 CRM1_repeat_2 CRM1 / Exportin repeat 2. Chromosome region maintenance 1 / Exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This Pfam entry includes some CRM1 repeats that fail to be detected with the pfam18777 model. 68
61369 408556 pfam18785 Inv-AAD Invertebrate-AID/APOBEC-deaminase. A family of classical AID/APOBEC-like deaminases found in lophotrochozoans, echinoderms and cnidarians. 129
61370 408557 pfam18786 Importin_rep_2 Importin 13 repeat. Importin 13 serves as receptor for nuclear localization signals (NLS) in cargo substrates. It mediates docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit. 44
61371 408558 pfam18787 CRM1_repeat_3 CRM1 / Exportin repeat 3. Chromosome region maintenance 1 / Exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This Pfam entry includes some CRM1 repeats that fail to be detected with the pfam18777 model. 51
61372 408559 pfam18788 DarA_N Defence against restriction A N-terminal. This is an alpha and beta fold domain. It has a conserved aspartate, and an asparagine residue followed by a basic residue in a Nx+ motif. This predicted structural domain is mainly found in polyvalent proteins of phages/prophages. The P1 hdf protein, a solo version of the domain, and the Phage P1 DarA protein that contains this domain are components of the phage P1 head. The domain might be involved in a counter-restriction activity. 101
61373 408560 pfam18789 DarA_C Defence against restriction A C-terminal. This is a mostly alpha-helical domain found in polyvalent proteins of phages and prophages. In Phage P1, the DarA protein is a component of the phage P1 head. 70
61374 408561 pfam18790 KfrB KfrB protein. This is an alpha and beta domain found in polyvalent proteins of conjugative elements, or often in the neighbourhood of one. The KfrB domain has been speculated to play a role in discrimination of self from non-self in plasmid conjugation systems. 61
61375 408562 pfam18791 Transp_inhibit Transport inhibitor response 1 protein domain. The F-box protein Transport inhibitor response 1 (TIR1) is a receptor for auxin, triggering an auxin-enhanced and ubiquitin-mediated degradation of substrates. The targets are recruited via interaction with the leucine-rich repeat region of the protein. This Pfam entry represents a specific unit of the LRR region, including an insertion of one short alpha-helix in the loop between the beta-strand and the following helix. It shares some sequence homology with a unit with similar structure of Coronatine-insensitive protein 1. 47
61376 408563 pfam18792 UspA1_rep Ubiquitous surface protein adhesin repeat. The UspA1 head domain is globally similar to other structures of TAA head domains and comprises a trimeric left-handed parallel beta-roll, which is formed from 14 repeating 14-16 residue segments that form a ladder of beta-strand coils. This Pfam entry represents a single repeat unit of the beta-ladder. 13
61377 408564 pfam18793 nos_propeller_2 Nitrous oxide reductase propeller repeat 2. Nitrous oxide reductases usually contain a seven-bladed beta-propeller domain with external short alpha-helices. This entry represents a single blade of the propeller, without alpha-helical insertion. 70
61378 408565 pfam18794 HSM3_C DNA mismatch repair protein HSM3, C terminal domain. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to HEAT repeats. This entry includes the last 5 repeats at the C-terminal. 177
61379 408566 pfam18795 HSM3_N DNA mismatch repair protein HSM3, N terminal domain. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to the HEAT repeats. This entry includes the first 5 repeats at the N-terminal. 237
61380 408567 pfam18796 LPD1 Large polyvalent protein-associated domain 1. This is an alpha helical domain with a conserved ExxARxxE motif that is found in polyvalent proteins of both conjugative elements and phages and prophages. 78
61381 408568 pfam18797 APC_rep Adenomatous polyposis coli (APC) repeat. Adenomatous polyposis coli contains an armadillo repeat and uses its highly conserved surface groove to recognize the APC-binding region (ABR) of Asef. This entry represents a single repeat unit of the Armadillo region. 74
61382 408569 pfam18798 LPD3 Large polyvalent protein-associated domain 3. This domain is predicted to adopt an alpha and beta fold. The secondary structure arrangement suggests it to be a member of the BECR fold. The domain is found in polyvalent proteins of both conjugative elements and phages/prophages. 110
61383 408570 pfam18799 LPD5 Large polyvalent protein-associated domain 5. This domain is predicted to be an enzymatic alpha beta domain. It is often found N-terminal to a metallopeptidase domain in polyvalent proteins. The domain contains a conserved aspartate, lysine, and two arginine residues. 146
61384 408571 pfam18800 Atthog Attenuator of Hedgehog. Attenuator of Hedgehog is an integral membrane protein of the tetraspan family that functions as a negative regulator of Hedgehog signaling. 141
61385 408572 pfam18801 RapH_N response regulator aspartate phosphatase H, N terminal. Rap proteins consist of an N-terminal 3-helix bundle and a tetratricopeptide domain. This entry represents the conserved region of the C-terminal bundle. 62
61386 408573 pfam18802 CxC1 CxC1 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 104
61387 408574 pfam18803 CxC2 CxC2 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 107
61388 408575 pfam18804 CxC3 CxC3 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 113
61389 408576 pfam18805 LRR_10 Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents two repeat units. 67
61390 408577 pfam18806 Importin_rep_3 Importin 13 repeat. Importin 13 serves as receptor for nuclear localization signals (NLS) in cargo substrates. It mediates docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit. 75
61391 408578 pfam18807 TTc_toxin_rep Tripartite Tc toxins repeat. Tripartite Tc toxin complexes of bacterial pathogens perforate the host membrane and translocate toxic enzymes into the host cell. These structures undergo a transition between a prepore to a pore state and they are mainly constituted by closed beta-layer repeats. This Pfam entry includes a single repeat unit. 45
61392 408579 pfam18808 Importin_rep_4 Importin repeat. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry includes a single repeat unit. 90
61393 408580 pfam18809 PBECR1 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease1. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues. 108
61394 408581 pfam18810 PBECR2 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease2. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages. The predicted active site contains conserved arginine and threonine residues. 120
61395 408582 pfam18811 DPPIV_rep Dipeptidyl peptidase IV (DPP IV) low complexity region. Dipeptidyl peptidase IV includes a helical N-terminal region, the pfam00930 domain and the pfam00326 domain, comprising the active site. This Pfam entry represents a sequence that can be repeated in the low complexity region between the helical N-terminus and the DPPIV_N domain. 21
61396 408583 pfam18812 PBECR3 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease3. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues. 116
61397 408584 pfam18813 PBECR4 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease4. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains a conserved histidine residue. 185
61398 408585 pfam18814 PBECR5 phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease5. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues. 237
61399 408586 pfam18815 AFP_2 Bacterial antifreeze protein repeat. This family of proteins is involved in stopping the formation of ice crystals at low temperatures. The structure folds as a Ca(2+)-bound parallel beta-helix with an extensive array of ice-like surface waters that are anchored via hydrogen bonds directly to the polypeptide backbone and adjacent side chains. 52
61400 408587 pfam18816 Importin_rep_5 Importin repeat. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry includes a single repeat unit and includes sequences not captured by pfam18808. 52
61401 408588 pfam18817 HEAT_UF Repeat of uncharacterized protein PH0542. Repeat found in PH0542 showing some sequence similarity to HEAT repeat. 45
61402 408589 pfam18818 MPTase-PolyVal Metallopeptidase superfamily domain. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains a conserved histidine and threonine residues. 124
61403 408590 pfam18819 MuF_C Phage MuF-C-terminal domain. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements and also fused to the MuF domain, a structural component of the phage head. The predicted active site contains a conserved histidine and serine residues. 103
61404 376239 pfam18820 BD_b_sandwich Bdellovibrio Beta-sandwich. A beta-sandwich domain exclusively found at the N-terminal of CHROMO domains in many Bdellovibrio proteins. 131
61405 408591 pfam18821 LPD7 Large polyvalent protein-associated domain 7. This domain contains conserved aspartate and phenylalanine residues. It is widely present in polyvalent proteins and gene neighbourhoods of conjugative elements. This domain is also known as PTox1. 92
61406 408592 pfam18822 CdvA CdvA-like coiled-coil domain. A coiled coil region domain related to the CdvA-like proteins. 123
61407 408593 pfam18823 InPase Inorganic Pyrophosphatase. A type I Inorganic Pyrophosphatase family domain that is found in polyvalent proteins. 135
61408 408594 pfam18824 LPD11 Large polyvalent protein-associated domain 11. This is an alpha-helical domain with conserved hydrophobic residues. It is found in polyvalent proteins of conjugative elements. 69
61409 408595 pfam18825 LPD13 Large polyvalent protein-associated domain 13. This is an alpha and beta domain that is found in polyvalent proteins of both conjugative elements and phages/prophages. 139
61410 408596 pfam18826 bVLRF1 bacteroidetes VLRF1 release factor. Archaeo-eukaryotic release factor domain family belonging to the VLRF1 clade observed primarily in the bacteroidetes bacterial lineage. Contains a conserved glutamine residue in the release factor catalytic loop, suggesting it functions as an active peptidyl-tRNA hydrolase at the ribosome. 143
61411 408597 pfam18827 LPD14 Large polyvalent protein-associated domain 14. This is an alpha-helical domain with a conserved glutamate residue that is mainly found in polyvalent proteins of prophages. 136
61412 408598 pfam18828 LPD15 Large polyvalent-protein-associated domain 15. This is a predicted enzymatic alpha and beta domain. It is found at the N-terminus of polyvalent proteins of conjugative elements. 99
61413 408599 pfam18829 Importin_rep_6 Importin repeat 6. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry represents two consecutive repeat units and includes sequences captured by pfam18808. 110
61414 408600 pfam18830 LPD16 Large polyvalent protein-associated domain 16. This is an alpha and beta fold domain that is mainly found in polyvalent proteins of conjugative elements. 82
61415 408601 pfam18831 LRR_11 Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents one repeat unit. 29
61416 408602 pfam18832 LPD18 Large polyvalent protein-associated domain 18. This is a mostly all beta domain which contains conserved acidic residues. It is mainly found in polyvalent proteins of conjugative elements. 86
61417 408603 pfam18833 TPR_22 Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (two repeat units) that are not matched by pfam00515. 92
61418 408604 pfam18834 LPD22 Large polyvalent protein associated domain 22. This is a predicted enzymatic alpha-helical domain with highly conserved aspartate residues. The domain is found in polyvalent proteins of phage and prophage genes and is often the immediate neighbour of a lysozyme gene. 98
61419 408605 pfam18835 Beta_helix_2 Beta helix repeat of Inulin fructotransferase. This region contains a right-handed parallel beta helix repeat unit found in Inulin fructotransferase. This Pfam entry includes sequences not found by pfam13229. 68
61420 408606 pfam18836 B_solenoid_ydck Beta solenoid repeat from YDCK. The crystal structure of YDCK from Salmonella cholerae includes a beta-solenoid repeat. This Pfam entry includes a single repeat unit of YDCK repeat. 18
61421 408607 pfam18837 LRR_12 Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents one repeat unit. 30
61422 408608 pfam18838 LPD23 Large polyvalent protein associated domain 23. This is an alpha-helical domain that is usually N-terminal to a metallopeptidase domain. The domain is found in polyvalent proteins of both conjugative elements and phages/prophages. 58
61423 408609 pfam18839 LPD24 Large polyvalent protein associated domain 24. This is an all-beta domain that is mostly seen in polyvalent proteins of conjugative elements. 70
61424 408610 pfam18840 LPD25 Large polyvalent protein associated domain 25. This is an alpha and beta fold domain found in polyvalent proteins of conjugative elements. 99
61425 408611 pfam18841 B_solenoid_dext Beta solenoid repeat from Dextranase. The crystal structures of Dex49A from Penicillium minioluteum and of ATCC9642 isopullulanase from Aspergillus niger include beta-solenoid repeats, sharing structural similarities. This Pfam entry includes a single repeat unit of the repeat regions. 33
61426 408612 pfam18842 LPD26 Large polyvalent protein associated domain 26. This is a small alpha-helical domain with two acidic residues conserved in a predicted loop between two of its helices. The domain is mainly found in polyvalent proteins of conjugative elements. 57
61427 408613 pfam18843 LPD28 Large polyvalent protein associated domain 28. This is a beta strand rich domain that lacks strongly conserved polar residues. It is fast diverging and is found in polyvalent proteins of conjugative elements. 96
61428 408614 pfam18844 baeRF_family2 Bacterial archaeo-eukaryotic release factor family 2. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family contains a well-conserved 'FP' motif in the catalytic loop. 149
61429 408615 pfam18845 baeRF_family3 Bacterial archaeo-eukaryotic release factor family 3. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 168
61430 408616 pfam18846 baeRF_family5 Bacterial archaeo-eukaryotic release factor family 5. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family unusually lacks the fusion to the C-terminal Pelota domain. 132
61431 408617 pfam18847 LPD29 Large polyvalent protein associated domain 29. This is an alpha and beta fold domain with conserved polar residues that is found in polyvalent proteins of conjugative elements. 91
61432 408618 pfam18848 baeRF_family6 Bacterial archaeo-eukaryotic release factor family 6. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 149
61433 408619 pfam18849 baeRF_family7 Bacterial archaeo-eukaryotic release factor family 7. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. Consistent with this, many members of the family associate on the genome with HPF-like ribosome hibernation factor. 144
61434 408620 pfam18850 LPD30 Large polyvalent protein associated domain 30. This is an alpha and beta fold domain that is found in polyvalent proteins of conjugative elements. 121
61435 408621 pfam18851 baeRF_family8 Bacterial archaeo-eukaryotic release factor family 8. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 141
61436 408622 pfam18852 LPD34 Large polyvalent protein associated domain 34. This is a predicted enzymatic alpha and beta fold domain with a large, prominent helix with conserved glutamate residue and several additional conserved residues including the motifs HTxN and SN. The domain is associated with polyvalent proteins of firmicute conjugative elements. 213
61437 408623 pfam18853 LPD37 Large polyvalent protein associated domain 37. This is an alpha and beta fold domain that is found in polyvalent proteins that are likely to be phage/prophage-derived. 244
61438 408624 pfam18854 baeRF_family10 Bacterial archaeo-eukaryotic release factor family 10. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 140
61439 408625 pfam18855 baeRF_family11 Bacterial archaeo-eukaryotic release factor family 11. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 139
61440 408626 pfam18856 baeRF_family12 Bacterial archaeo-eukaryotic release factor family 12. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. 138
61441 408627 pfam18857 LPD38 Large polyvalent protein associated domain 38. This is an alpha and beta fold domain found in polyvalent proteins of phages and prophages. 189
61442 408628 pfam18858 LPD39 Large polyvalent protein associated domain 39. This is a predicted enzymatic alpha-helical domain that is associated with polyvalent proteins of phages and prophages. 196
61443 408629 pfam18859 acVLRF1 Actinobacteria/chloroflexi VLRF1 release factor. Archaeo-eukaryotic release factor domain family belonging to the VLRF1 clade, observed primarily in the actinobacteria and chloroflexi bacterial lineages. Contains a conserved glutamine residue in the release factor catalytic loop, suggesting it functions as an active peptidyl-tRNA hydrolase at the ribosome. 130
61444 408630 pfam18860 AbiJ_NTD3 AbiJ N-terminal domain 3. Alpha + beta domain. Found fused to AbiJ-like HEPN. Fused to other domains presumably involved in defense. 167
61445 408631 pfam18861 PTP_tm Transmembrane domain of protein tyrosine phosphatase, receptor type J. Protein tyrosine phosphatases (PTPs) are known to be signaling molecules that regulate a variety of cellular processes, including cell growth, differentiation, mitotic cycle, and oncogenic transformation. PTP receptor type J possesses an extracellular region containing five fibronectin type III repeats, the transmembrane region included in this Pfam entry, and an intracytoplasmic catalytic domain. 161
61446 408632 pfam18862 ApeA_NTD1 ApeA N-terminal domain 1. Mostly beta strands. Fused to HEPN (Apea). Several conserved aromatic residues, abundant but poorly conserved. 421
61447 408633 pfam18863 AbiJ_NTD4 AbiJ N-terminal domain 4. Alpha + beta. Found fused to AbiJ-like HEPN and heat repeats. 153
61448 408634 pfam18864 AbiTii AbiTii. Alpha + beta domain. Found fused to the N-terminus of the c2405 family of HEPN domains and in few cases to Ymh. 186
61449 408635 pfam18865 AbiJ_NTD5 AbiJ N-terminal domain 5. Mostly alpha helical. Found fused to AbiJ-like HEPN, and to other domains presumably involved in defense. 93
61450 408636 pfam18866 CxC7 CxC7 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain. 64
61451 408637 pfam18867 HEPN-like_int HEPN-like integron domain. This is a HEPN-like nuclease. Part of mobile integron element. The integron cassettes are known to be activated by stress conditions, thereby allowing swapping of genetic material that might be of adaptive value. Hence it is hypothesized that the HEPN domains present in some integron cassettes contribute to the stress response by functioning as RNases that induce dormancy by probably inhibiting translation and thus enabling survival of harsh conditions. 151
61452 408638 pfam18868 zf-C2H2_3rep Zinc finger C2H2-type, 3 repeats. This Pfam entry includes three instances of the Zinc finger C2H2-type. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 126
61453 408639 pfam18869 HEPN_RnaseLS RnaseLS-like HEPN. RnaseLS-like HEPN. 124
61454 408640 pfam18870 HEPN_RES_NTD1 HEPN/RES N-terminal domain 1. Mostly alpha helical. Fused to HEPN (MAE_28990 superfamily), RES domain, a potential RNase found in various toxin systems. 131
61455 408641 pfam18871 HEPN_Toprim_N HEPN/Toprim N-terminal domain 1. Alpha + beta domain. Fused to two distinct HEPN families: MAE_28990 and ERFG_01251 families, TOPRIM and a Mrr-like REase domain. 225
61456 408642 pfam18872 Daz Daz repeat. This short repeat is found in the Daz proteins in a varying number of copies. The molecular function of these repeats is unknown. 21
61457 408643 pfam18873 Sgo0707_N1 Sgo0707 N-terminal domain. This domain is found at the N-terminus of the cell surface Sgo0707 protein. This domain is called the N1 domain and is involved in host colonisation. The largest domain, N1, comprises a putative binding cleft with a single cysteine located in its centre and exhibits an unexpected structural similarity to the variable domains of the streptococcal Antigen I/II adhesins. 265
61458 408644 pfam18874 QPE QPE domain. This short presumed domain is found in a small set of gram positive organisms in cell surface proteins with an N-terminal collagen binding domain. We have named this domain QPE after the most conserved sequence motif. 45
61459 408645 pfam18875 AF4_int AF4 interaction motif. This short motif found in the AF4 protein interacts with AF9. 15
61460 408646 pfam18876 AF-4_C AF-4 proto-oncoprotein C-terminal region. This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila. 262
61461 408647 pfam18877 SSSPR-51 SSSPR-51 domain. This repeat domain is designated SSSPR-51, Streptococcal and Staphylococcal Surface Protein Repeat of size 51. These repeats are homologous to the listerial repeats of pfam13461, but shorter on average by about 8 amino acids. 48
61462 408648 pfam18878 PPE-PPW PPE-PPW subfamily C-terminal region. This entry represents the C-terminal region of a subfamily of PPE proteins known as the PPW subfamily. The PPW refers to three conserved residues found in the sequence alignment. The region also contains a second conserved motif GFGT. 48
61463 408649 pfam18879 EspA_EspE EspA/EspE family. This family of proteins includes Mycobacterium tuberculosis EspA and EspE proteins. 84
61464 408650 pfam18880 UDI Uracil-DNA glycosylase inhibitor. Uracil-DNA glycosylase inhibitor (UGI) found in Bacillus subtilis phage, is an inhibitor that inactivates the host uracil-DNA glycosylase (UDG), also known as (UNG) uracil-DNA N-glycosylase. UDG is a highly conserved enzyme responsible for the initiation of uracil-base excision repair (1). UGI forms a tight non-covalent bond to UDG, completely inhibiting it from binding to DNA by inserting its beta-1 strand into the conserved DNA-binding groove of the enzyme (2). In complex with UDG, UGI folds into an alpha-beta-alpha sandwich structure formed by five-stranded antiparallel beta-strands and two helices (3). 82
61465 408651 pfam18881 DUF5646 Family of unknown function (DUF5646). This is a family of unknown function. Family members include the archaeal homolog of the bacterial RelB, a toxin-antitoxin system which is activated during amino acid starvation. This family is mostly found in thermococcus archaea. 69
61466 408652 pfam18882 DUF5647 Family of unknown function (DUF5647). This is a family of unknown function. Family members include the hypothetical protein TTHC002 from Thermus thermophilus. It has an alpha-beta-alpha-beta(3) structure and forms a dimer with a single beta-sheet, folded in a barrel-like shape. 79
61467 408653 pfam18883 AC_1 Autochaperone Domain Type 1. This entry represents the autochaperone domain of type 1 (AC-1) in the Type Va Secretion System (T5aSS). Autotransporters (ATs) belong to a family of modular proteins secreted by the Type V, subtype a, secretion system (T5aSS) and considered as an important source of virulence factors in lipopolysaccharidic diderm bacteria (archetypical Gram-negative bacteria). The AC of type 1 with beta-fold appears as a prevalent and conserved structural element exclusively associated to beta-helical AT passenger. 113
61468 408654 pfam18884 TSP3_bac Bacterial TSP3 repeat. This entry contains a novel bacterial thrombospondin type 3 repeat which differs from the typical consensus by containing a glutamate in place of one of the calcium binding aspartate residues. 22
61469 408655 pfam18885 DUF5648 Repeat of unknown function (DUF5648). This entry represents a repeat of approximately 40 residues in length. It is often associated with enzymatic domains in bacterial cell surface proteins. This entry may represent a beta-propeller repeat, although most proteins only possess three repeats rather than the expected 6-8 copies. 39
61470 408656 pfam18886 DUF5649 Repeats of unknown function (DUF5649). This entry represents a series of potential beta-helix repeats found in a variety of putative bacterial adhesin proteins. 68
61471 408657 pfam18887 MBG_3 MBG domain. This entry corresponds to an MBG (mirror beta grasp) domain. It is found in a variety of bacterial cell surface proteins. 72
61472 408658 pfam18888 DUF5650 Repeat of unknown function (DUF5650). This entry represents a repeating region found in filamentous hemagglutinin proteins from various bacteria. This entry may contain a beta helix structure. 55
61473 408659 pfam18889 Beta_helix_3 Beta helix repeat. This entry contains a 30 residue repeat found in a variety of bacterial cell surface proteins. This repeat is related to pfam14262, meaning that it has a beta-helix structure. The sequence repeat is quite glycine rich. 20
61474 408660 pfam18890 FANCL_d2 FANCL UBC-like domain 2. This entry represents the second of three UBC-like domains found in the FANCL protein, which is the catalytic E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. 91
61475 408661 pfam18891 FANCL_d3 FANCL UBC-like domain 3. This entry represents the third of three UBC-like domains found in the FANCL protein, which is the catalytic E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. 97
61476 408662 pfam18892 DUF5651 Family of unknown function (DUF5651). This entry represents a probable zinc binding domain found at the C-terminus of some Firmicute bacteria. The function of these proteins is unknown. 54
61477 408663 pfam18893 DUF5652 Family of unknown function (DUF5652). This entry represents a protein containing two transmembrane helices. Many of these proteins are found in organisms in the Candidate Phyla Radiation. 71
61478 408664 pfam18894 PhageMetallopep Putative phage metallopeptidase. This entry represents a probable metallopeptidase found in a variety of phage and bacterial proteomes. 141
61479 408665 pfam18895 T4SS_pilin Type IV secretion system pilin. This entry represents likely Type IV secretion system pilins. 71
61480 408666 pfam18896 SLT_3 Lysozyme like domain. This entry represents a lysozyme like domain found in candidate phyla radiation bacteria. The domain contains several conserved cysteine and histidine residues suggesting that it may bind to zinc. 89
61481 408667 pfam18897 DUF5653 Family of unknown function (DUF5653). This entry contains a group of bacterial proteins of no known function. The proteins are approximately 230 amino acids in length. 194
61482 408668 pfam18898 DUF5654 Family of unknown function (DUF5654). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 79 and 98 amino acids in length. The region contains two predicted transmembrane helices. The Eukaryotic examples are found in the Foraminiferan Reticulomyxa filosa that contains several proteins with this family. 71
61483 408669 pfam18899 DUF5655 Domain of unknown function (DUF5655). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 122 and 304 amino acids in length. 110
61484 408670 pfam18900 DUF5656 Protein of unknown function (DUF5656). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 237 and 274 amino acids in length. These proteins are likely to be integral membrane proteins. 240
61485 408671 pfam18901 DUF5657 Family of unknown function (DUF5657). This family of small integral membrane proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. 61
61486 408672 pfam18902 DUF5658 Domain of unknown function (DUF5658). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 101 and 135 amino acids in length. There is a completely conserved aspartate and a conserved EXNP motif that may be functionally important. 82
61487 408673 pfam18903 DUF5659 Domain of unknown function (DUF5659). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 78 and 90 amino acids in length. 75
61488 408674 pfam18904 DUF5660 Domain of unknown function (DUF5660). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and fungi, and is approximately 110 amino acids in length. 109
61489 408675 pfam18905 DUF5661 Protein of unknown function (DUF5661). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 89 and 148 amino acids in length. 71
61490 408676 pfam18906 Phage_tube_2 Phage tail tube protein. This family of proteins are tube proteins which polymerise to form the phage tails. 253
61491 408677 pfam18907 DUF5662 Family of unknown function (DUF5662). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 175 and 193 amino acids in length. Many proteins in this family are annotated as catalase, but this could not be verified. 157
61492 408678 pfam18908 DUF5663 Protein of unknown function (DUF5663). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 102 and 113 amino acids in length. 86
61493 408679 pfam18909 DUF5664 Siphovirus protein of unknown function (DUF5664). This family of proteins is found predominantly in siphoviruses. Proteins in this family are typically between 117 and 208 amino acids in length. 99
61494 408680 pfam18910 DUF5665 Domain of unknown function (DUF5665). This entry represents a functionally uncharacterized family of integral membrane proteins. This protein family is found in bacteria, and is approximately 60 amino acids in length. There are several conserved glycines in the first transmembrane helix that may be functionally important. 56
61495 408681 pfam18911 PKD_4 PKD domain. This entry is composed of PKD domains found in bacterial surface proteins. 81
61496 408682 pfam18912 DZR_2 Double zinc ribbon domain. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00156. This entry corresponds to two zinc ribbon motifs. This domain is found at the N-terminus of the ComF operon protein 3. 56
61497 408683 pfam18913 FBPase_C Fructose-1-6-bisphosphatase, C-terminal domain. This entry represents the C-terminal domain of Fructose-1-6-bisphosphatase enzymes. According to ECOD this domain has a Rossmann-like fold. 125
61498 408684 pfam18914 DUF5666 Domain of unknown function (DUF5666). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. This domain is likely to adopt an OB-fold based on similarity to other families. 60
61499 408685 pfam18915 DUF5667 Domain of unknown function (DUF5667). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is typically between 95 and 113 amino acids in length. 102
61500 408686 pfam18916 Lycopene_cyc Lycopene cyclase. 92
61501 408687 pfam18917 DUF5668 Domain of unknown function (DUF5668). This entry is composed of two transmembrane helices that are often found in two or three copies in a protein. The members of this family are functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 40 amino acids in length. This entry is often associated with pfam09922, a putative adhesive domain that adopts a beta helix fold. 42
61502 408688 pfam18918 DUF5669 Family of unknown function (DUF5669). This is a family of unknown function. Family members are mostly found in gammaproteobacteria. 76
61503 408689 pfam18919 DUF5670 Family of unknown function (DUF5670). This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 50 amino acids in length. There is a single completely conserved residue W that may be functionally important. These proteins contain two transmembrane helices. 43
61504 408690 pfam18920 DUF5671 Domain of unknown function (DUF5671). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 168 and 339 amino acids in length. These proteins are likely to be integral membrane proteins. 131
61505 408691 pfam18921 Cyanophycin_syn Cyanophycin synthase-like N-terminal domain. This domain is found at the N-terminus of cyanophycin synthase proteins and related enzymes from bacteria and archaea. It is approximately 120 amino acids in length. The family is found in association with pfam08245, pfam02875. This domain is found in isolation in some proteins. 115
61506 408692 pfam18922 DUF5672 Protein of unknown function (DUF5672). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 260 and 408 amino acids in length. This entry corresponds to a region of about 200 amino acids in length with multiple conserved motifs. There are two conserved sequence motifs: GAP and NGG. In some proteins this domain is found associated with various glycosyl transferase enzyme domains, suggesting this domain has a related role in glycan biosynthesis. 165
61507 408693 pfam18923 DUF5673 Domain of unknown function (DUF5673). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 90 amino acids in length. The domain is usually found C-terminal to a pair of transmembrane helices. 66
61508 408694 pfam18924 DUF5674 Protein of unknown function (DUF5674). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 110 amino acids in length. 109
61509 408695 pfam18925 DUF5675 Family of unknown function (DUF5675). This presumed domain is found in bacteria, archaea, alveolata and caudoviruses. Proteins in this family are typically between 133 and 179 amino acids in length. 114
61510 408696 pfam18926 DUF5676 2TM family of unknown function (DUF5676). This family of presumed integral membrane proteins is found in bacteria and archaea. Proteins in this family are approximately 90 amino acids in length and contain two predicted transmembrane helices. 82
61511 408697 pfam18927 CrtO Glycosyl-4,4'-diaponeurosporenoate acyltransferase. This family of proteins is found in certain bacterial lineages. In Staphylococcus this protein is known to be Glycosyl-4,4'-diaponeurosporenoate acyltransferase, an enzyme involved in the final step of synthesis of staphyloxanthin, an orange pigment found in most S. aureus strains. Proteins in this family are typically between 157 and 184 amino acids in length. Members of the family contain an EXXH motif that may be functionally important. 119
61512 408698 pfam18928 DUF5677 Family of unknown function (DUF5677). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 250 and 347 amino acids in length. These proteins contain a conserved RXXXE motif and an invariant histidine that may be functionally important. 163
61513 408699 pfam18929 DUF5678 Family of unknown function (DUF5678). This presumed domain family is found in bacteria and archaea. Proteins in this family are typically between 64 and 76 amino acids in length. 50
61514 408700 pfam18930 DUF5679 Domain of unknown function (DUF5679). This family of domains is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 48 and 68 amino acids in length. These domains contain four conserved cysteines, suggesting that this domain is zinc binding. 40
61515 408701 pfam18931 DUF5680 Domain of unknown function (DUF5680). This family of presumed domains is found in bacteria and archaea. Proteins in this family are typically between 152 and 220 amino acids in length. In some of the proteins this domain is associated with an N-terminal HTH domain, suggesting that they are transcriptional regulators. This suggests that this may be a previously unidentified ligand binding domain. 104
61516 408702 pfam18932 DUF5681 Family of unknown function (DUF5681). This domain family is found in bacteria, archaea and viruses, and is typically between 75 and 86 amino acids in length. There is a conserved SGNP sequence motif. There are two completely conserved G residues that may be functionally important. 77
61517 408703 pfam18933 PsbP_2 PsbP-like protein. This family is related to the PsbP family. 184
61518 408704 pfam18934 DUF5682 Family of unknown function (DUF5682). This is a family of unknown function, mostly found in bacteria. 729
61519 408705 pfam18935 DUF5683 Family of unknown function (DUF5683). This is a domain of unknown function found mostly in bacteria. 151
61520 408706 pfam18936 DUF5684 Family of unknown function (DUF5684). This is a family of unknown function mostly found in bacteria. Some family members can be found at the N-terminal region of pfam00717. 80
61521 408707 pfam18937 DUF5685 Family of unknown function (DUF5685). This is a family of unknown function mostly found in bacteria. 271
61522 408708 pfam18938 aRib Atypical Rib domain. This entry contains atypical Rib (aRib) domains. These are found in a variety of bacterial cell surface proteins. These proteins share a conserved motif with the Rib domain (YPDXXD). The structure of the aRib domain has been solved from two proteins, the SrpA adhesin and the GspB adhesin. In these proteins this domain has been termed the unique domain due to its lack of similarity to any other known structures at the time. The aRib domain from SrpA has been shown to mediate a dimer interaction. This family has been added to the E-set clan based on its similarity to the Rib domain, although it does not contain the Ig fold. 71
61523 408709 pfam18939 DUF5686 Family of unknown function (DUF5686). This is a family of unknown function, mostly found in bacteria. Family members can be found at the C-terminal region of pfam13715. 666
61524 408710 pfam18940 DUF5687 Family of unknown function (DUF5687). This is a family of unknown function mainly found in bacteria. 484
61525 408711 pfam18941 DUF5688 Family of unknown function (DUF5688). This is a family of unknown function found in bacteria. 290
61526 408712 pfam18942 DUF5689 Family of unknown function (DUF5689). This is a domain of unknown function. It is mostly found in bacteria and can be present in multiple copies. 220
61527 408713 pfam18943 DUF5690 Family of unknown function (DUF5690). This is a family of unknown function mostly found in bacteria. 382
61528 408714 pfam18944 DUF5691 Family of unknown function (DUF5691). This is a family of unknown function. Some family members overlap with pfam13646. 209
61529 408715 pfam18945 VipB_2 EvpB/VC_A0108, tail sheath gpW/gp25-like domain. EvpB is a family of Gram-negative probable type VI secretion system components of the tail sheath. They have been known as COG:COG3517. These sheath-components, of which there are many copies in the sheath, are also variously referred to as VipA/VipB and TssB/TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope and punches the target bacterial cell. This entry represents the gpW/gp25-like domain. 112
61530 408716 pfam18946 Apex GpV Apex motif. This entry represents a short motif found at the C-terminus of Phage gpV proteins. These proteins act as a spike for piercing the host membrane. The apex motif contains a conserved HXH motif that coordinates an iron ion. 23
61531 408717 pfam18947 HAMP_2 HAMP domain. 67
61532 408718 pfam18948 DUF5692 Family of unknown function (DUF5692). This is a family of unknown function mostly found in bacteria. 304
61533 408719 pfam18949 DUF5693 Family of unknown function (DUF5693). This is a family of unknown function found in bacteria. 608
61534 408720 pfam18950 DUF5694 Family of unknown function (DUF5694). This is a family of unknown function, mostly found in bacteria. 185
61535 408721 pfam18951 DUF5695 Family of unknown function (DUF5695). This is a family of unknown function mainly found in fungi and bacteria. 857
61536 408722 pfam18952 DUF5696 Family of unknown function (DUF5696). This is a family of unknown function with some overlap with clan family members of CL0058. 608
61537 408723 pfam18953 SAP_new25 SAP domain-containing new25. This family includes Schizosaccharomyces-specific SAP domain-containing proteins such as gene product new25. The SAP (SAF-A/B, Acinus and PIAS) motif is a DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins. For instance, the SAP domain of the human SUMO E3 ligase PIAS1 has been shown to bind A/T-rich DNA. 51
61538 408724 pfam18954 DUF5697 Family of unknown function (DUF5697). This entry likely contains an N-terminal helix-turn-helix motif and is therefore likely to be a transcriptional regulator. 161
61539 408725 pfam18955 DUF5698 Domain of unknown function (DUF5698). This family is functionally uncharacterized. This family is found in bacteria and archaea, and is approximately 60 amino acids in length and contains two probable transmembrane helices. This entry is found in association with pfam10035. The C-terminal transmembrane helix contains a GXXXGXXXG motif that is characteristic of transmembrane helices that dimerise. 58
61540 408726 pfam18956 DUF5699 Family of unknown function (DUF5699). 61
61541 408727 pfam18957 RibLong Long Rib domain. This entry represents the Long Rib domain that is closely related to the pfam08428 Rib domain but has a conserved insertion. These domains are found in bacterial cell surface proteins. 93
61542 408728 pfam18958 DUF5700 Putative zinc dependent peptidase (DUF5700). This entry represents a group of putative zinc dependent peptidases that have the characteristic HEXXH motif. This family is most related to pfam10026. 278
61543 408729 pfam18959 DUF5701 Family of unknown function (DUF5701). This is a family of unknown function mostly found in bacteria. 193
61544 408730 pfam18960 DUF5702 Family of unknown function (DUF5702). This family is mostly found in bacteria. 276
61545 408731 pfam18961 DUF5703_N Domain of unknown function (DUF5703). This is an N-terminal domain of unknown function mostly found in bacteria. It is possible that this domain might be a putative glycoside hydrolase. This family belongs to the Galactose Mutarotase-like superfamily. 287
61546 408732 pfam18962 Por_Secre_tail Secretion system C-terminal sorting domain. Species that include Porphyromonas gingivalis, Fibrobacter succinogenes, Flavobacterium johnsoniae, Cytophaga hutchinsonii, Gramella forsetii, Prevotella intermedia, and Salinibacter ruber have on average twenty or more copies of this C-terminal domain, associated with sorting to the outer membrane and covalent modification. This domain targets proteins to type IX secretion systems and is secreted then cleaved off by a C-terminal signal peptidase. Based on similarity to other families it is likely that this domain adopts an immunoglobulin-like fold. 72
61547 408733 pfam18963 DUF5703 Family of unknown function (DUF5703). This family includes members mostly found in Actinobacteria. Some family members are thought to be dihydroorotate dehydrogenases, however there is no evidence to support the relationship to pfam01180. 51
61548 408734 pfam18964 DUF5704 Family of unknown function (DUF5704). This entry is mainly found in Firmicutes and has a notable number of highly conserved aromatic amino acids. In some members the entry is found towards the C-terminus of pfam02368, and some members contain two copies. 185
61549 408735 pfam18965 DUF5705 Family of unknown function (DUF5705). 1052
61550 408736 pfam18966 Lipoprotein_23 Uncharacterized lipoprotein. This entry includes members found in Actinobacteria (mostly Streptomycetaceae, Micromonosporales and Pseudonocardiales). Some members are annotated as lipoproteins. 179
61551 408737 pfam18967 DUF5706 Family of unknown function (DUF5706). 107
61552 408738 pfam18968 DUF5707 Family of unknown function (DUF5707). 98
61553 408739 pfam18969 DUF5708 Family of unknown function (DUF5708). This family includes members in Actinobacteria. All family members have two transmembrane regions. 60
61554 408740 pfam18970 DUF5709 Family of unknown function (DUF5709). 49
61555 408741 pfam18971 CagA_N CagA protein. The Helicobacter pylori type IV secretion effector CagA is a major bacterial virulence determinant and critical for gastric carcinogenesis. X-ray crystallographic analysis of the N-terminal CagA fragment (residues 1-876) revealed that the region has a structure comprised of three discrete domains. Domain I constitutes a mobile CagA N terminus, while Domain II tethers CagA to the plasma membrane by interacting with membrane phosphatidylserine. Domain III interacts intramolecularly with the intrinsically disordered C-terminal region, and this interaction potentiates the pathogenic scaffold/hub function of CagA. 876
61556 408742 pfam18972 Wheel Cns1/TTC4 Wheel domain. The wheel domain is found at the C-terminus of yeast Cns1 and human TTC4 proteins. The structure of the domain shows an overall fold consisting of a twisted five-stranded beta sheet surrounded by several alpha helices. The Hsp90 chaperone machinery in eukaryotes comprises a number of distinct accessory factors. Cns1 is one of the few essential co-chaperones in yeast. Cns1 is important for maintaining translation elongation, specifically chaperoning the elongation factor eEF2. In this context, Cns1 interacts with the novel co-factor Hgh1 and forms a quaternary complex together with eEF2 and Hsp90. 114
61557 408743 pfam18973 CBL Putative Chitin binding like. This family includes host-selective toxins such as SnTox1 found in S. nodorum. SnTox1 is a necrotrophic effector that contains 6 cysteine residues, a common feature for some fungal avirulence effectors such as the Avr and ECP effectors from Cladosporium fulvum. The high content of cysteine residues and high stability suggest that SnTox1 may function in the plant apoplastic space, which is abundant in plant defense components. Protein sequence analysis indicates that SnTox1 contains a C-terminal chitin binding (CB) like motif. Three-dimensional (3D) structure-based sequence alignment suggested that the putative CB motif in SnTox1 was more similar to those of plant-specific ChtBDs than to Avr4 proteins, which are related to invertebrate ChtBDs. Furthermore, SnTox1 contained all secondary-structure-related residues including the strictly conserved b-strand-forming 'CCS' motif found only in plant-specific ChtBD1 proteins. 48
61558 408744 pfam18974 DUF5710 Domain of unknown function (DUF5710). This is a domain of unknown function which can be found in DNA primases such as TraC. 44
61559 408745 pfam18975 DUF5711 Family of unknown function (DUF5711). This is a family of unknown function mostly found in bacteria and archaea. Some members contain WD repeats. 344
61560 408746 pfam18976 DUF5712 Family of unknown function (DUF5712). This is a family of unknown function mainly found in Bacteroidetes. 292
61561 408747 pfam18977 DUF5713 Family of unknown function (DUF5713). This is a family of unknown function, mainly found in bacteria. 107
61562 408748 pfam18978 DUF5714 Family of unknown function (DUF5714). This is a family of unknown function, mainly found in bacteria. It is distantly related to Pfam family pfam09719, which is a heme binding cytochrome. This domain is found associated with other domains such as the Radical SAM domain and a methyltransferase. 173
61563 408749 pfam18979 DUF5715 Family of unknown function (DUF5715). This is a family of unknown function, mainly found in bacteria. 170
61564 408750 pfam18980 DUF5716_C Family of unknown function (DUF5716) C-terminal. This is a C-terminal domain found in bacterial sequences of unknown function. 295
61565 408751 pfam18981 InlK_D3 Internalin K domain (D3/D4). This domain is found at the elbow of internalin surface proteins, used by the bacteria to invade mammalian cells. This domain has an Ig-like fold. 75
61566 408752 pfam18982 DUF5716 Family of unknown function (DUF5716). This is a family of unknown function, mostly found in bacteria. 434
61567 408753 pfam18983 DUF5717 Family of unknown function (DUF5717) C-terminal. This is a C-terminal domain of a family of unknown function found in bacteria. 306
61568 408754 pfam18984 DUF5717_N Family of unknown function (DUF5717) N-terminal. This is the N-terminal domain found in sequences of unknown function in bacteria. 875
61569 408755 pfam18985 DUF5718 Family of unknown function (DUF5718). This is a family of unknown function, mostly found in bacteria. 250
61570 408756 pfam18986 DUF5719 Family of unknown function (DUF5719). This is a family of unknown function, mostly found in bacteria. 319
61571 408757 pfam18987 DUF5720 Family of unknown function (DUF5720). This is a family of unknown function, mostly found in bacteria. 100
61572 408758 pfam18988 DUF5721 Family of unknown function (DUF5721). This is a family of unknown function, mostly found in Firmicutes. 149
61573 408759 pfam18989 DUF5722 Family of unknown function (DUF5722). This is a family of unknown function mainly found in bacteria. 394
61574 408760 pfam18990 DUF5723 Family of unknown function (DUF5723). This is a family of unknown function mainly found in bacteria. 377
61575 408761 pfam18991 DUF5724 Family of unknown function (DUF5724). This is a family of unknown function mainly found in bacteria. 342
61576 408762 pfam18992 DUF5725 Family of unknown function (DUF5725). A highly conserved domain of unknown function found in Platyhelminthes. 158
61577 408763 pfam18993 Rv0078B Rv0078B-related antitoxin. Putative antitoxin protein according to TASmania database. 63
61578 408764 pfam18994 Prophage_tailD1 Prophage endopeptidase tail N-terminal domain. This domain represents the N-terminal domain of prophage tail proteins that are probably acting as endopeptidases. This domain has a RIFT related fold. 84
61579 408765 pfam18995 PRT6_C Proteolysis_6 C-terminal. This is the C-terminal domain mainly found in E3 ubiquitin ligases. Proteolysis 6 (PRT6) encodes a ubiquitin E3 ligase belonging to the N-end rule pathway of targeted protein degradation, which is a specialized subset of the ubiquitin proteasome system. In Arabidopsis, at least two N-recognins (E3 ubiquitin ligases) with different substrate specificities exist, namely PROTEOLYSIS1 (PRT1) and PRT6. 444
61580 408766 pfam18996 DUF5726 Family of unknown function (DUF5726). This family is found in various Platyhelminthes, with many of the sequences annotated as being the polyprotein of the transposon Ty3-I Gag-Pol. However, the genomic location of other members suggests that this region is not always found in a transposon. The family members are characterized by a highly conserved DxDxDxCC motif. The function of this family is currently unknown. 90
61581 408767 pfam18997 DUF5727 Family of unknown function (DUF5727). A conserved domain of unknown function found in Platyhelminthes, closely related to glycosylphosphatidylinositol (GPI)-anchored protein GP50, known as Diagnostic antigen GP50. GP50 is a family of highly expressed antigens exclusive to tapeworms. 192
61582 408768 pfam18998 Flg_new_2 Divergent InlB B-repeat domain. This family of domains is found in bacterial cell surface proteins. They are often found in tandem arrays. This domain is closely related to pfam09479. 74
61583 408769 pfam18999 DUF5728 Family of unknown function (DUF5728). This is a highly conserved domain of unknown function found in Platyhelminthes, with many of the sequences annotated as being the small conductance calcium-activated potassium channel protein. However, the location suggests that this domain belongs to the intracellular amino terminus of these transmembrane proteins, immediately next to the calcium-activated SK potassium channel. 188
61584 408770 pfam19000 DUF5729 Family of unknown function (DUF5729). This is a highly conserved domain of unknown function found in Platyhelminthes, with many of the sequences annotated as being the small conductance calcium-activated potassium channel protein. However, the location suggests that this domain belongs to the intracellular amino terminus of these transmembrane proteins, near the calcium-activated SK potassium channel. 141
61585 408771 pfam19001 DUF5730 Family of unknown function (DUF5730). A highly conserved domain of unknown function found in Platyhelminthes, in which the first 21 residues correspond to a transmembrane domain. 182
61586 408772 pfam19002 DUF5731 Family of unknown function (DUF5731). A highly conserved domain of unknown function found in Platyhelminthes. 108
61587 408773 pfam19003 DUF5732 Family of unknown function (DUF5732). This family is found in various Platyhelminthes, with many of the sequences annotated as being the STARP-like antigen. This highly conserved domain is located next to the basic helix-loop-helix motif found in this antigen. The function of this family is currently unknown. 83
61588 408774 pfam19004 DUF5733 Family of unknown function (DUF5733). A highly conserved domain of unknown function found in Platyhelminthes. 115
61589 408775 pfam19005 DUF5734 Family of unknown function (DUF5734). This is a conserved domain found in various Platyhelminthes. The function of this family is still unknown. 83
61590 408776 pfam19006 DUF5735 Family of unknown function (DUF5735). A highly conserved domain of unknown function found in different Platyhelminthes. 138
61591 408777 pfam19007 DUF5736 Family of unknown function (DUF5736). A highly conserved domain of unknown function found in various Platyhelminthes. 111
61592 408778 pfam19008 DUF5737 Family of unknown function (DUF5737). A highly conserved domain of unknown function found in various Platyhelminthes. 99
61593 408779 pfam19009 DUF5738 Family of unknown function (DUF5738). A highly conserved domain of unknown function found in various Platyhelminthes. 104
61594 408780 pfam19010 DUF5739 Family of unknown function (DUF5739). A highly conserved domain of unknown function found in various Platyhelminthes. 114
61595 408781 pfam19011 DUF5740 Family of unknown function (DUF5740). This family is found in various Platyhelminthes, with many of the sequences annotated as being part of death domain-containing proteins. This highly conserved domain is located at the amino terminus. The function of this family is currently unknown. 94
61596 408782 pfam19012 DUF5741 Family of unknown function (DUF5741). This coiled-coil is found in various Platyhelminthes, with many of the sequences annotated as being uveal autoantigen with coiled-coil. This region is one of the numerous coiled-coil regions that constitute this antigen. The function is currently unknown. 82
61597 408783 pfam19013 DUF5742 Family of unknown function (DUF5742). A highly conserved domain of unknown function found in various Platyhelminthes. 111
61598 408784 pfam19014 DUF5743 Family of unknown function (DUF5743). A highly conserved domain of unknown function found in various Platyhelminthes. 186
61599 408785 pfam19015 DUF5744 Family of unknown function (DUF5744). This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown. 81
61600 408786 pfam19016 DUF5745 Domain of unknown function (DUF5745). This is a domain of unknown function found in Platyhelminthes. It shows homology with the calponin homology (CH) domain. 59
61601 408787 pfam19017 DUF5746 Domain of unknown function (DUF5746). This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown, with some of the sequences annotated as being Palmitoyltransferase. This highly conserved domain is located at the amino terminus, next to a DHHC domain (named for its signature tetrapeptide Asp-His-His-Cys). 127
61602 408788 pfam19018 Vanin_C Vanin C-terminal domain. This domain is found at the C terminus of Vanin 1 and related proteins. 165
61603 408789 pfam19019 Phlebo_G2_C Phlebovirus glycoprotein G2 C-terminal domain. This family consists of several Phlebovirus glycoprotein G2 C-terminal Ig-like domains. 171
61604 408790 pfam19020 Ta1207 Ta1207 family. The function of this family is unknown. The protein forms a homopentameric complex in T. acidophilum. Each protein is composed of two structurally similar domains. 276
61605 408791 pfam19021 DUF5747 Family of unknown function (DUF5747). The function of this protein family is unknown. 197
61606 408792 pfam19022 DUF5748 Family of unknown function (DUF5748). The function of this family of euryarchaeal proteins is unknown. 101
61607 408793 pfam19023 DUF5749 Family of unknown function (DUF5749). The function of this family of euryarchaeal proteins is unknown. The structure of this protein forms a beta barrel. 78
61608 408794 pfam19024 DUF5750 Family of unknown function (DUF5750). The function of this family of proteins is unknown. 91
61609 408795 pfam19025 DUF5751 Family of unknown function (DUF5751). The function of this archaeal family is unknown. 116
61610 408796 pfam19026 HYPK_UBA HYPK UBA domain. This entry represents the UBA domain found at the C-terminus of the HYPK protein and its homologues. This domain in HYPK mediates a protein interaction with the Naa15 C-terminus. 41
61611 408797 pfam19027 DUF5752 Family of unknown function (DUF5752). This family includes the OrfY protein from the hyperthermophilic archaeon Thermoproteus tenax. This protein co-occurs with the treS/P protein in an operon regulating the synthesis of trehalose. The structure of this protein shows it contains an internal duplication. 208
61612 408798 pfam19028 TSP1_spondin Spondin-like TSP1 domain. This entry represents a sub-type of TSP1 domains that have an alternative disulphide binding pattern compared to the canonical TSP1 domain. 52
61613 408799 pfam19029 DUF883_C DUF883 C-terminal glycine zipper region. This family corresponds to the C-terminal presumed transmembrane helix found in DUF883 proteins. The helix contains a glycine zipper motif suggestive of dimerisation. 30
61614 408800 pfam19030 TSP1_ADAMTS Thrombospondin type 1 domain. This subfamily of thrombospondin type 1 repeats is mainly found in ADAMTS proteins. 55
61615 408801 pfam19031 Intu_longin_1 First Longin domain of INTU, CCZ1 and HPS4. This entry is specific to the first Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1/CCZ1) family, including protein sequences of INTU, CCZ1 and HPS4 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia. 112
61616 408802 pfam19032 Intu_longin_2 Intu longin-like domain 2. This entry represents a longin-like domain found in Intu and related proteins. 119
61617 408803 pfam19033 Intu_longin_3 Intu longin-like domain 3. This entry represents a longin-like domain found in Intu and related proteins. 97
61618 408804 pfam19034 RnlA-toxin_C RNase LS, bacterial toxin C terminal. RnlA toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB. 125
61619 408805 pfam19035 TSP1_CCN CCN3 Nov like TSP1 domain. This entry represents a sub-type of TSP1 domains found in matricellular CCN proteins that have an alternative disulphide binding pattern compared to the canonical TSP1 domains. 44
61620 408806 pfam19036 Fuz_longin_1 First Longin domain of FUZ, MON1 and HPS1. This entry is specific to the first Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia. 125
61621 408807 pfam19037 Fuz_longin_2 Second Longin domain of FUZ, MON1 and HPS1. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the second Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia. 98
61622 408808 pfam19038 Fuz_longin_3 Third Longin domain of FUZ, MON1 and HPS1. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the third Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia. 103
61623 408809 pfam19039 ASK_PH ASK kinase PH domain. This PH-like domain is found in the regulatory region of ASK1 and related kinase proteins. This domain is found adjacent to the kinase domain. 97
61624 408810 pfam19040 SGNH SGNH domain (fused to AT3 domains). This entry includes SGNH domains that are found fused to membrane domains from the AT3 family pfam01757. 235
61625 408811 pfam19041 CBP30 Nuclear cap binding complex subunit CBP30. This entry represents the CBP30 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP30 is part of the complex that recognizes this cap. 299
61626 408812 pfam19042 CBP110 Nuclear cap binding complex subunit CBP110. This entry represents the CBP110 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP110 is part of the complex that recognizes this cap. 987
61627 408813 pfam19043 CBP66 Nuclear cap binding complex subunit CBP66. This entry represents the CBP66 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP66 is part of the complex that recognizes this cap. 583
61628 408814 pfam19044 P-loop_TraG TraG P-loop domain. This entry represents the P-loop domain found in the TraG conjugation protein. 413
61629 408815 pfam19045 Ligase_CoA_2 Ligase-CoA domain. This domain is related to pfam00549 and adopts a flavodoxin fold. 162
61630 408816 pfam19046 GM130_C GM130 C-terminal binding motif. This entry represents the C-terminal motif from the GM130 protein that is bound by the GRASP65 PDZ domain pfam04495. 46
61631 408817 pfam19047 HOOK_N HOOK domain. This domain is found at the N-terminus of HOOK proteins. 151
61632 408818 pfam19048 SidE_mART SidE mono-ADP-ribosyltransferase domain. This domain found in the SidE bacterial effector protein mediates the mono-ADP-ribosylation of ubiquitin, which is then ligated to host proteins by the pfam12252 domain. 313
61633 408819 pfam19049 SidE_DUB SidE DUB domain. This entry represents the N-terminal deubiquitinating domain from the SidE protein. The SidE protein is a bacterial effector protein that can ubiquitinate and presumably deubiquitinate host proteins. 173
61634 408820 pfam19050 PhoD_2 PhoD related phosphatase. This entry contains a domain that is presumed to be a phosphatase enzyme based on its similarity to pfam09423. 543
61635 408821 pfam19051 GFO_IDH_MocA_C2 Oxidoreductase family, C-terminal alpha/beta domain. This entry represents a domain found at the C-terminus of a variety of oxidoreductase enzymes. The domain is related to pfam02894. 254
61636 408822 pfam19052 BRINP BMP/retinoic acid-inducible neural-specific protein. This entry represents the BMP/retinoic acid-inducible neural-specific protein (BRINP) family, including BRINP1/2/3. They are predominantly and widely expressed in both the central nervous system (CNS) and peripheral nervous system (PNS). They inhibit neuronal cell proliferation by negative regulation of the cell cycle G1/S transition. 448
61637 408823 pfam19053 EccD EccD-like transmembrane domain. This entry represents an integral membrane component of the ESX type VII secretion systems, EccD. This region includes 11 predicted transmembrane alpha helices. 392
61638 408824 pfam19054 DUF5753 Domain of unknown function (DUF5753). This entry represents a putative ligand binding domain found in bacterial transcription regulators that have an N-terminal HTH domain pfam13560. 178
61639 408825 pfam19055 ABC2_membrane_7 ABC-2 type transporter. 409
61640 408826 pfam19056 WD40_2 WD40 repeated domain. This entry contains an array of WD40 repeats found in RhoGEF proteins. 487
61641 408827 pfam19057 PH_19 PH domain. This entry contains a PH domain found in RhoGEF proteins. 146
61642 408828 pfam19058 DUF5754 Family of unknown function (DUF5754). This is a family of uncharacterized proteins of unknown function found in viruses. 49
61643 408829 pfam19059 DUF5755 Family of unknown function (DUF5755). This family of unknown function appears to be primarily restricted to Mimiviridae and Phycodnaviridae. This entry may be found at the N terminus of longer proteins, which contain a C-terminal DNA polymerase family domain and a DNA polymerase family B exonuclease domain. 92
61644 408830 pfam19060 DUF5756 Family of unknown function (DUF5756). This family of unknown function is predominantly found in Phycodnaviridae. 66
61645 408831 pfam19061 DUF5757 Family of unknown function (DUF5757). This is a family of uncharacterized proteins of unknown function found in viruses. It is thought to be part of the early transcription factor large subunit. 94
61646 408832 pfam19062 DUF5758 Family of unknown function (DUF5758). This is a family of uncharacterized proteins of unknown function found in viruses, as well as in bacteria. It is predicted to be a pentapeptide repeat-containing protein. 105
61647 408833 pfam19063 DUF5759 Family of unknown function (DUF5759). This is a family of uncharacterized proteins of unknown function found in viruses. 71
61648 408834 pfam19064 DUF5760 Family of unknown function (DUF5760). This is a family of uncharacterized proteins of unknown function found in Phycodnaviridae and Mimiviridae. 86
61649 408835 pfam19065 DUF5761 Family of unknown function (DUF5761). This is a family of uncharacterized proteins of unknown function found in viruses. 62
61650 408836 pfam19066 DUF5762 Family of unknown function (DUF5762). This is a family of uncharacterized proteins of unknown function found in viruses. It is inferred from homology to be a membrane-component. 69
61651 408837 pfam19067 DUF5763 Family of unknown function (DUF5763). This is a family of uncharacterized proteins of unknown function found predominantly in viruses. However, some matches with predicted proteins from Archaea and Eukaryotes were also found. 39
61652 408838 pfam19068 DUF5764 Family of unknown function (DUF5764). This is a family of uncharacterized proteins of unknown function found in viruses, particularly in Phycodnaviridae and Mimiviridae. 134
61653 408839 pfam19069 DUF5765 Family of unknown function (DUF5765). This is a family of proteins of unknown function found in viruses, which are thought to be membrane proteins. 91
61654 408840 pfam19070 DUF5766 Family of unknown function (DUF5766). This is a family of uncharacterized proteins of unknown function found in viruses. 77
61655 408841 pfam19071 DUF5767 Family of unknown function (DUF5767). This is a family of uncharacterized proteins of unknown function found in viruses. 85
61656 408842 pfam19072 DUF5768 Family of unknown function (DUF5768). This is a family of uncharacterized proteins of unknown function found in viruses. 123
61657 408843 pfam19073 DUF5769 Family of unknown function (DUF5769). This is a family of uncharacterized proteins of unknown function found in Mimiviridae. 190
61658 408844 pfam19074 DUF5770 Family of unknown function (DUF5770). This is a family of uncharacterized proteins of unknown function found in Iridoviridae. 131
61659 408845 pfam19075 DUF5771 Family of unknown function (DUF5771). This is a family of uncharacterized proteins of unknown function found in viruses which apparently has N-acetyltransferase activity. 69
61660 408846 pfam19076 CshA_repeat Surface adhesin CshA repetitive domain. Repeat domain from surface fibrillar adhesin CshA. This domain forms several tandem repeats with high sequence identity. CshA is found primarily in streptococci species, but also in other Gram+ bacteria. 97
61661 408847 pfam19077 Big_13 Bacterial Ig-like domain. Presumed domain found as tandem repeats of high sequence identity in bacterial cell surface proteins. 102
61662 408848 pfam19078 Big_12 Bacterial Ig-like domain. Presumed domain found as tandem repeats of high sequence identity in bacterial cell surface proteins. 102
61663 408849 pfam19079 CFSR Collagen-flanked surface repeat. Repeat flanked by collagen-like CXX motifs found in bacterial cell surface proteins. 49
61664 408850 pfam19080 DUF5772 Family of unknown function (DUF5772). This is a family of proteins of unknown function found in viruses which appears to contain transmembrane proteins inferred from homology. 84
61665 408851 pfam19081 Ig_7 Ig-like domain CHU_C associated. Presumed Ig-like domain found as tandem repeats in proteins with the gliding motility-associated C-terminal domain CHU_C. 84
61666 408852 pfam19082 DUF5773 Family of unknown function (DUF5773). This is a family of uncharacterized proteins of unknown function found in Phycodnaviridae. 175
61667 408853 pfam19083 DUF5774 Family of unknown function (DUF5774). This is a family of uncharacterized proteins of unknown function found in viruses. 123
61668 408854 pfam19084 DUF5775 Family of unknown function (DUF5775). This is a family of uncharacterized proteins of unknown function, found in phycodnaviruses. The protein contains a predicted N-terminal transmembrane helix. 81
61669 408855 pfam19085 Choline_bind_2 Choline-binding repeat. This entry contains a pair of presumed choline-binding repeats that are often found adjacent to pfam01473. 38
61670 408856 pfam19086 Terpene_syn_C_2 Terpene synthase family 2, C-terminal metal binding. 198
61671 408857 pfam19087 DUF5776 Domain of unknown function (DUF5776). Presumed stalk domain found in bacterial surface proteins forming tandem repeats with high sequence identity. This domain is also associated with other known bacterial surface protein stalks and adhesive domains. 67
61672 408858 pfam19088 TUTase TUTase nucleotidyltransferase domain. This nucleotidyltransferase domain is found in TUTase enzymes. 333
61673 408859 pfam19089 DUF5777 Membrane bound beta barrel domain (DUF5777). This entry contains integral membrane beta barrel proteins. 247
61674 408860 pfam19090 DUF5778 Family of unknown function (DUF5778). Family of unknown function predominantly found in Halobacteria. 126
61675 408861 pfam19091 DUF5779 Family of unknown function (DUF5779). Family of unknown function predominantly found in Halobacteria. 96
61676 408862 pfam19092 DUF5780 Family of unknown function (DUF5780). This entry adopts a Greek-key beta sandwich topology, with long loops in some of the members. While a structure is known, the function of this domain is yet to be determined. 109
61677 408863 pfam19093 DUF5781 Family of unknown function (DUF5781). Family of unknown function predominantly found in Halobacteria. 243
61678 408864 pfam19094 DUF5782 Family of unknown function (DUF5782). Family of unknown function predominantly found in Halobacteria. 75
61679 408865 pfam19095 DUF5783 Family of unknown function (DUF5783). Family of unknown function predominantly found in Halobacteria. 105
61680 408866 pfam19096 DUF5784 Family of unknown function (DUF5784). Family of unknown function predominantly found in Halobacteria. 329
61681 408867 pfam19097 Snu56_snRNP Snu56-like U1 small nuclear ribonucleoprotein component. This family is a component of the U1 snRNP particle, which recognizes and binds the 5'-splice site of pre-mRNA. Together with other non-snRNP factors, U1 snRNP forms the spliceosomal commitment complex, which targets pre-mRNA to the splicing pathway. 434
61682 408868 pfam19098 DUF5785 Family of unknown function (DUF5785). Family of unknown function predominantly found in Halobacteria. 98
61683 408869 pfam19099 DUF5786 Family of unknown function (DUF5786). Family of unknown function predominantly found in Halobacteria. 55
61684 408870 pfam19100 DUF5787 Family of unknown function (DUF5787). Family of unknown function predominantly found in Halobacteria. 264
61685 408871 pfam19101 DUF5788 Family of unknown function (DUF5788). This is a family of proteins of unknown function predominantly found in Halobacteria. 132
61686 408872 pfam19102 DUF5789 Family of unknown function (DUF5789). This is a family of proteins of unknown function predominantly found in Halobacteria. 74
61687 408873 pfam19103 DUF5790 Family of unknown function (DUF5790). This is a family of proteins of unknown function found predominantly in Halobacteria. 126
61688 408874 pfam19104 DUF5791 Family of unknown function (DUF5791). This is a family of proteins of unknown function predominantly found in Halobacteria. 124
61689 408875 pfam19105 DUF5792 Family of unknown function (DUF5792). This family contains a domain of unknown function found in prasinoviruses. 152
61690 408876 pfam19106 DUF5793 Family of unknown function (DUF5793). This is a family of proteins of unknown function predominantly found in Halobacteria. 157
61691 408877 pfam19107 DUF5794 Family of unknown function (DUF5794). This is a family of proteins of unknown function predominantly found in Halobacteria. 128
61692 408878 pfam19108 DUF5795 Family of unknown function (DUF5795). Family of unknown function predominantly found in Halobacteria. 74
61693 408879 pfam19109 DUF5796 Family of unknown function (DUF5796). Family of proteins of unknown function predominantly found in Halobacteria. 139
61694 408880 pfam19110 DUF5797 Family of unknown function (DUF5797). This is a family of proteins of unknown function predominantly found in Halobacteria. 163
61695 408881 pfam19111 DUF5798 Family of unknown function (DUF5798). Family of unknown function predominantly found in Halobacteria. 89
61696 408882 pfam19112 VanA_C Vanillate O-demethylase oxygenase C-terminal domain. This domain is found in a wide variety of oxygenases such as Vanillate O-demethylase oxygenase and Toluene-4-sulfonate monooxygenase. 196
61697 408883 pfam19113 DUF5799 Family of unknown function (DUF5799). This is a family of proteins of unknown function predominantly found in Halobacteria. 148
61698 408884 pfam19114 EsV_1_7_cys EsV-1-7 cysteine-rich motif. The EsV-1-7 repeat is a cysteine-rich motif of unknown function. The motif was originally identified in the Ectocarpus "immediate upright" protein, which has an EsV-1-7 domain that contains five EsV-1-7 repeats. The name is derived from the Ectocarpus virus EsV-1 protein EsV-1-7, which possesses six EsV-1-7 repeats. Ectocarpus has a large family of EsV-1-7 domain proteins with between one and 19 copies of the motif (C-X4-C-X16-C-X2-H-X12). In addition to brown algae, EsV-1-7 domain proteins have been found in eustigmatophytes, oomycetes, cryptophytes, two families of green algae (Coccomyxaceae and Selenastraceae) and also in viral genomes, such as Emiliania huxleyi virus PS401 and Pithovirus sibericum. Based on this unusual distribution, it has been proposed that EsV-1-7 domain genes have been exchanged between lineages by horizontal gene transfer during evolution [1,2]. 35
61699 408885 pfam19115 DUF5800 Family of unknown function (DUF5800). This is a family of proteins of unknown function predominantly found in Halobacteria. 64
61700 408886 pfam19116 DUF5801 Domain of unknown function (DUF5801). This entry contains a presumed domain that is found as tandem repeats in a number of bacterial proteins. 150
61701 408887 pfam19117 Mim2 Mitochondrial import 2. This entry, together with pfam08219, forms the Mim1/Mim2 complex, which is specific to fungi. This complex is responsible for the assembly and/or insertion of a subset of mitochondrial outer membrane proteins, including subunits of the main mitochondrial outer membrane translocase. 45
61702 408888 pfam19118 DUF5802 Family of unknown function (DUF5802). Family of unknown function predominantly found in Halobacteria. 113
61703 408889 pfam19119 DUF5803 Family of unknown function (DUF5803). Family of unknown function predominantly found in Halobacteria. 196
61704 408890 pfam19120 DUF5804 Family of unknown function (DUF5804). Family of unknown function predominantly found in Halobacteria and Methanomicrobia. 108
61705 408891 pfam19121 DUF5805 Family of unknown function (DUF5805). This is a family of proteins of unknown function predominantly found in Halobacteria. 67
61706 408892 pfam19122 DUF5806 Family of unknown function (DUF5806). This is a family of proteins of unknown function predominantly found in Halobacteria and Methanomicrobia. 148
61707 408893 pfam19123 DUF5807 Family of unknown function (DUF5807). This is a family of proteins of unknown function found in Halobacteria. 106
61708 408894 pfam19124 DUF5808 Family of unknown function (DUF5808). This is a family of proteins of unknown function predominantly found in Firmicutes but also in Actinobacteria and Halobacteria. Members of this family are thought to be DUF1648 domain-containing proteins as they are membrane-components. 26
61709 408895 pfam19125 DUF5809 Family of unknown function (DUF5809). This is a family of proteins of unknown function predominantly found in Halobacteria. 130
61710 408896 pfam19126 DUF5810 Family of unknown function (DUF5810). This is a family of proteins of unknown function predominantly found in Halobacteria, but also in Eukaryotes. This family contains some members that are predicted as C2H2-type domain-containing proteins. 66
61711 408897 pfam19127 Choline_bind_3 Choline-binding repeat. Pair of presumed choline-binding repeats often found adjacent to pfam01473. 48
61712 408898 pfam19128 DUF5811 Family of unknown function (DUF5811). This is a family of proteins of unknown function predominantly found in Halobacteria. 103
61713 408899 pfam19129 DUF5812 Family of unknown function (DUF5812). This is a family of unknown function predominantly found in Halobacteria. 140
61714 408900 pfam19130 DUF5813 Family of unknown function (DUF5813). This is a family of unknown function found predominantly in Halobacteria. 143
61715 408901 pfam19131 DUF5814 Family of unknown function (DUF5814). This is a family of proteins that are thought to have helicase activity, predominantly found in Halobacteria and Methanomicrobia. 148
61716 408902 pfam19132 DUF5815 Family of unknown function (DUF5815). This is a family of unknown function predominantly found in Halobacteria. 155
61717 408903 pfam19133 DUF5816 Family of unknown function (DUF5816). This is a family of proteins predominantly found in Halobacteria. They are thought to be GNAT (Gcn5-related N-acetyltransferases) family acetyltransferases. 72
61718 408904 pfam19134 DUF5817 Family of unknown function (DUF5817). This is a family of proteins predominantly found in Halobacteria. They are thought to be the replication protein H. 51
61719 408905 pfam19135 DUF5818 Protein of unknown function (DUF5818). This is a family of uncharacterized proteins. 57
61720 408906 pfam19136 DUF5819 Family of unknown function (DUF5819). This is a family of uncharacterized proteins. 168
61721 408907 pfam19137 DUF5820 Family of unknown function (DUF5820). This is a family of unknown function predominantly found in Halobacteria. 117
61722 408908 pfam19138 DUF5821 Family of unknown function (DUF5821). This is a family of proteins of unknown function predominantly found in Halobacteria. 217
61723 408909 pfam19139 DUF5822 Family of unknown function (DUF5822). This is a family of proteins of unknown function predominantly found in Halobacteria. This family includes some members which are thought to be peptidoglycan binding proteins. 38
61724 408910 pfam19140 DUF5823 Family of unknown function (DUF5823). This is a family of uncharacterized proteins. 178
61725 408911 pfam19141 DUF5824 Family of unknown function (DUF5824). This family contains a domain of unknown function, which is predominantly found in Phycodnaviridae and Caudovirales viruses. 127
61726 408912 pfam19142 DUF5825 Family of unknown function (DUF5825). This is a family of uncharacterized proteins. 180
61727 408913 pfam19143 Omp85_2 OMP85 superfamily. This entry represents the membrane spanning beta barrel domain of various Omp85 superfamily related proteins that often have POTRA and Patatin domains. This family contains mainly flavobacterial proteins. 351
61728 408914 pfam19144 DUF5826 Family of unknown function (DUF5826). This family of unknown function is mostly found in prasinoviruses and is likely to represent membrane proteins with two transmembrane regions. 68
61729 408915 pfam19145 DUF5827 Family of unknown function (DUF5827). This is a family of proteins of unknown function predominantly found in Halobacteria. 87
61730 408916 pfam19146 DUF5828 Family of unknown function (DUF5828). This is a family of proteins of unknown function predominantly found in Halobacteria. 175
61731 408917 pfam19147 DUF5829 Family of unknown function (DUF5829). This is a family of uncharacterized proteins. 266
61732 408918 pfam19148 DUF5830 Family of unknown function (DUF5830). This is a family of proteins predominantly found in Halobacteria. Some members include MarR family transcriptional regulators. 115
61733 408919 pfam19149 DUF5831 Family of unknown function (DUF5831). This family of unknown function is found mostly in prasinoviruses. 72
61734 408920 pfam19150 DUF5832 Family of unknown function (DUF5832). This entry contains proteins of unknown function found predominantly in the Phycodnaviridae, Mimiviridae, Marseilleviridae and Iridoviridae virus families. 81
61735 408921 pfam19151 Sublancin Sublancin. This family represents sublancin, a small bacteriocin active against Gram-positive bacteria. This family appears to be restricted to Bacilli. Sublancin was thought to be a lantibiotic but was later shown to be an S-linked glycopeptide. Glycosylation is essential for its antimicrobial activity. Sublancin is biosynthesized as a precursor peptide bearing an N-terminal leader peptide, and a C-terminal core peptide that is converted into the mature peptide. Sublancin comprises two alpha helices and a well-defined inter-helical loop. Sublancin inhibits B. cereus spore outgrowth, after the germination stage, approximately 1000-fold better than it inhibits exponential growth of the same cells and inhibits B. subtilis strain ATCC6633 and B. megaterium strain 14581. 57
61736 408922 pfam19152 DUF5834 Family of unknown function (DUF5834). This family represents an uncharacterized protein that is associated with the thiocillin biosynthetic gene cluster. 159
61737 408923 pfam19153 DUF5835 Family of unknown function (DUF5835). This family represents an uncharacterized protein that is associated with the biosynthetic gene cluster for the bacteriocin salivaricin CRL 1328. The salivaricin CRL 1328 biosynthetic gene cluster is similar to the previously described gene cluster for the bacteriocin ABP118. 53
61738 408924 pfam19154 DUF5836 Family of unknown function (DUF5836). This family represents the induction peptide AbpIP, which regulates the biosynthesis of the bacteriocin ABP-118 in Lactobacillus salivarius subsp. salivarius UCC118. 38
61739 408925 pfam19155 DUF5837 Family of unknown function (DUF5837). This family of unknown function is associated with the tenuecyclamide A biosynthetic gene cluster. 63
61740 408926 pfam19156 DUF5838 Family of unknown function (DUF5838). This family of unknown function is associated with the biosynthetic gene cluster for anacyclamide A10. The family appears to be restricted to Cyanobacteria. Some matches also contain a methyltransferase domain at the N terminus. 285
61741 408927 pfam19157 DUF5839 Family of unknown function (DUF5839). This family of unknown function is associated with the biosynthetic gene cluster for glycocin F. The family appears to be restricted to Firmicutes. 87
61742 408928 pfam19158 DUF5840 Family of unknown function (DUF5840). This family contains uncharacterized proteins. It also contains the anacyclamide synthesis protein AcyE. Cyanobactins are small, cyclic peptides found in cyanobacteria. They are ribosomally synthesized and post-translationally modified. Cyanobactin biosynthesis clusters contain 7-12 genes. Anacyclamides are a type of cyanobactin produced in strains of the cyanobacteria Anabaena. AcyE is a 49-amino-acid protein with N-terminal homology to the peptide precursor proteins in the other cyanobactin pathways. The core peptide of AcyE is cleaved during post-translational processing of the precursor peptide. 49
61743 408929 pfam19159 DUF5841 Family of unknown function (DUF5841). This family of unknown function is associated with the biosynthetic gene cluster for enterocin A. 48
61744 408930 pfam19160 SPARK SPARK. This entry is typically found as an extracellular domain of plant receptor-like kinases, many of which play a role in signalling during plant-fungal symbiosis. The precise function of this entry is unknown. 161
61745 408931 pfam19161 DUF5843 Family of unknown function (DUF5843). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 196
61746 408932 pfam19162 DUF5844 Family of unknown function (DUF5844). This is a family of uncharacterized proteins of unknown function found in Iridoviridae. 109
61747 408933 pfam19163 DUF5845 Family of unknown function (DUF5845). This is a family of uncharacterized proteins of unknown function found in viruses. 80
61748 408934 pfam19164 DUF5846 Family of unknown function (DUF5846). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 116
61749 408935 pfam19165 DUF5847 Family of unknown function (DUF5847). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 407
61750 408936 pfam19166 DUF5848 Family of unknown function (DUF5848). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. This family is also found in Bacteria and Fungi. 64
61751 408937 pfam19167 DUF5849 Family of unknown function (DUF5849). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 200
61752 408938 pfam19168 DUF5850 Family of unknown function (DUF5850). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. The family contains a conserved motif towards the C-terminus of these proteins. This motif contains a central RGD sequence which suggests these viral proteins may bind to integrins. 133
61753 408939 pfam19169 DUF5851 Family of unknown function (DUF5851). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 353
61754 408940 pfam19170 DUF5852 Family of unknown function (DUF5852). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. 158
61755 408941 pfam19171 DUF5853 Family of unknown function (DUF5853). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. 136
61756 408942 pfam19172 DUF5854 Family of unknown function (DUF5854). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 161
61757 408943 pfam19173 DUF5855 Family of unknown function (DUF5855). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. 183
61758 408944 pfam19174 DUF5856 Family of unknown function (DUF5856). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. This family was also found in Bacteria. 94
61759 408945 pfam19175 DUF5857 Family of unknown function (DUF5857). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 287
61760 408946 pfam19176 DUF5858 Family of unknown function (DUF5858). This is a family of uncharacterized proteins of unknown function predominantly found in Marseillevirus. 61
61761 408947 pfam19177 DUF5859 Family of unknown function (DUF5859). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 162
61762 408948 pfam19178 DUF5860 Family of unknown function (DUF5860). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 167
61763 408949 pfam19179 DUF5861 Family of unknown function (DUF5861). This is a family of uncharacterized proteins of unknown function found in viruses. This family also includes proteins found in eukaryotes which are thought to be E3 ubiquitin-protein ligase. 116
61764 408950 pfam19180 DUF5862 Family of unknown function (DUF5862). This is a family of uncharacterized proteins of unknown function predominantly found in Ascoviridae. This family also includes uncharacterized proteins found in Bacteria. 68
61765 408951 pfam19181 DUF5863 Family of unknown function (DUF5863). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 169
61766 408952 pfam19182 DUF5864 Family of unknown function (DUF5864). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 136
61767 408953 pfam19183 DUF5865 Family of unknown function (DUF5865). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 210
61768 408954 pfam19184 DUF5866 Family of unknown function (DUF5866). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 74
61769 408955 pfam19185 DUF5867 Family of unknown function (DUF5867). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 274
61770 408956 pfam19186 DUF5868 Family of unknown function (DUF5868). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae. 193
61771 408957 pfam19187 HTH_PafC PafC helix-turn-helix domain. This entry is an N-terminal HTH domain found in the PafC protein. Transcriptional activator PafBC is responsible for upregulating the majority of genes induced by DNA damage. 115
61772 408958 pfam19188 AGRB_N Adhesion GPCR B N-terminal region. This region is found at the N-terminus of various adhesion G-protein coupled receptor B proteins. This region contains 10 cysteine residues that probably form disulphide bonds. 177
61773 408959 pfam19189 Mtf2 Mtf2 family. This family appears to be distantly related to PPR repeats. 196
61774 408960 pfam19190 BACON_2 Viral BACON domain. This family represents a distinct class of BACON domains found in crAss-like phages, the most common viral family in the human gut, in which they are found in tail fiber genes. This suggests they may play a role in phage-host interactions. 91
61775 408961 pfam19191 HEF_HK HEF_HK domain. This is a dimerization and histidine phosphotransfer (DHp) domain found in Histidine Kinases (HK). This domain belongs to the His_Kinase_A (CL0025) clan. HK domain architectures typically contain DHp domains adjacent to GHKL domains, such as HATPase_c (pfam02518) and HATPase_c_3 (pfam13589) which comprise the ATP-binding regions. 67
61776 408962 pfam19192 Response_reg_2 Response receiver domain. This is a receiver domain (REC) commonly found in the same gene-neighbourhood of its cognate HK. Together they comprise a Two-Component System (TCS). There is a high degree of specificity among REC and DHp domains in a cognate pair. This domain is related to the Response_reg (pfam00072) domain. We found that pfam19191 and this domain show high degree of linkage in the same gene-neighbourhoods across several bacterial lineages. This implies that they comprise a TCS. 176
61777 408963 pfam19193 Tectonin Tectonin domain. This entry represents proteins homologous to tectonin. This protein adopts a 6 bladed beta propeller structure. 214
61778 408964 pfam19194 DUF5869 Family of unknown function (DUF5869). This is a family of uncharacterized proteins of unknown function found in Marseillevirus. 159
61779 408965 pfam19195 DUF5870 Family of unknown function (DUF5870). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 488
61780 408966 pfam19196 DUF5871 Family of unknown function (DUF5871). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. 136
61781 408967 pfam19197 DUF5872 Family of unknown function (DUF5872). This is a family of uncharacterized proteins of unknown function predominantly found in viruses and Bacteria. 115
61782 408968 pfam19198 RsaA_NTD RsaA N-terminal domain. This entry represents the N-terminal domain of the RsaA S-layer protein from Caulobacter crescentus. This domain binds to lipopolysaccharide. 174
61783 408969 pfam19199 Phage_coatGP8 Phage major coat protein, Gp8. 68
61784 408970 pfam19200 DUF871_N DUF871 N-terminal domain. This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown. 235
61785 408971 pfam19201 DUF5873 Family of unknown function (DUF5873). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. 109
61786 408972 pfam19202 DUF5874 Family of unknown function (DUF5874). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. This entry includes proteins that are also found in Fungi. 74
61787 408973 pfam19203 DUF5875 Family of unknown function (DUF5875). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. 228
61788 408974 pfam19204 DUF5876 Family of unknown function (DUF5876). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. 561
61789 408975 pfam19205 DUF5877 Family of unknown function (DUF5877). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. 609
61790 408976 pfam19206 DUF5878 Family of unknown function (DUF5878). This is a family of uncharacterized proteins of unknown function found in viruses. 148
61791 408977 pfam19207 DUF5879 Family of unknown function (DUF5879). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. 273
61792 408978 pfam19208 DUF5880 Family of unknown function (DUF5880). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. This family also includes uncharacterized proteins found in Archaea and Eukaryotes. 97
61793 408979 pfam19209 CoV_S1_C Coronavirus spike glycoprotein S1, C-terminal. This entry represents a domain found at the C-terminus of the Coronavirus S1 protein. It is found across a range of alpha, beta and gamma coronaviruses. This small all beta stranded domain is known as subdomain 2 in the structure of the porcine epidemic diarrhea virus spike protein. 58
61794 408980 pfam19211 CoV_NSP2_N Coronavirus replicase NSP2, N-terminal. This entry corresponds to the N-terminal region of coronavirus non-structural protein 2. NSP2 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. The function of this protein is uncertain. This region contains numerous conserved and semi-conserved cysteine residues. 204
61795 408981 pfam19212 CoV_NSP2_C Coronavirus replicase NSP2, C-terminal. This entry corresponds to a presumed domain found at the C-terminus of Coronavirus non-structural protein 2 (NSP2). NSP2 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. The function of NSP2 is uncertain. This presumed domain is found in two copies in some viral NSP2 proteins. This domain is found in both alpha and betacoronaviruses. 156
61796 408982 pfam19213 CoV_NSP6 Coronavirus replicase NSP6. This entry represents proteins found in Coronaviruses and includes the Non-structural Protein 6 (NSP6). Coronaviruses encode large replicase polyproteins which are proteolytically processed by viral proteases to generate mature Nonstructural Proteins (NSPs). NSP6 is a membrane protein containing 6 transmembrane domains with a large C-terminal tail. NSP6 from the avian coronavirus, infectious bronchitis virus (IBV) and the mouse hepatitis virus (MHV) have been shown to localize to the ER and to generate autophagosomes. Coronavirus NSP6 proteins have also been shown to limit autophagosome expansion. This may favour coronavirus infection by reducing the ability of autophagosomes to deliver viral components to lysosomes for degradation. NSP6 from IBV, MHV and severe acute respiratory syndrome coronavirus (SARS-CoV) have also been found to activate autophagy. 261
61797 408983 pfam19214 CoV_S2_C Coronavirus spike glycoprotein S2, intravirion. This entry represents the cysteine rich intravirion region found at the C-terminus of coronavirus spike proteins (S). These cysteine residues are targets for palmitoylation, necessary for efficient S incorporation into virions and S-mediated membrane fusion. 42
61798 408984 pfam19215 CoV_NSP15_C Coronavirus replicase NSP15, uridylate-specific endoribonuclease. This entry represents the C-terminal domain of coronavirus non-structural protein 15 (NSP15 or nsp15). NSP15 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. This domain exhibits endoribonuclease activity, designated EndoU, is highly conserved in all known CoVs, and is part of the replicase-transcriptase complex that plays important roles in virus replication and transcription. NSP15 is a uridylate-specific endoribonuclease that cleaves the 5'-polyuridines from negative-sense viral RNA, termed PUN RNA, either upstream or downstream of uridylates, at GUU or GU, to produce molecules with 2',3'-cyclic phosphate ends. PUN RNA is a CoV MDA5-dependent pathogen-associated molecular pattern (PAMP). 154
61799 408985 pfam19216 CoV_NSP15_M Coronavirus replicase NSP15, middle domain. This entry represents the non-catalytic middle domain from coronavirus non-structural protein 15 (NSP15). NSP15 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. This domain is formed by ten beta strands organized into three beta hairpins. 90
61800 408986 pfam19217 CoV_NSP4_N Coronavirus replicase NSP4, N-terminal. This is the N-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex to modified endoplasmic reticulum membranes. This N-terminal region represents the membrane spanning region, covering four transmembrane regions. 351
61801 408987 pfam19218 CoV_NSP3_C Coronavirus replicase NSP3, C-terminal. This family represents the C-terminal region of non-structural protein NSP3 (also known as nsp3). NSP3 is the product of ORF1a. It is found in human SARS coronavirus polyprotein 1a and 1ab, and in related coronavirus polyproteins. It is a multifunctional protein comprising up to 16 different domains and regions. NSP3 binds to viral RNA, nucleocapsid protein, as well as other viral proteins and participates in polyprotein processing. 464
61802 408988 pfam19219 CoV_NSP15_N Coronavirus replicase NSP15, N-terminal oligomerisation. This is the N-terminal domain of the coronavirus nonstructural protein 15 (NSP15), which is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP15 is a nidoviral RNA uridylate-specific endoribonuclease (NendoU) carrying a C-terminal catalytic domain belonging to the EndoU family. The SARS-CoV-2 NendoU monomers assemble into a double-ring hexamer, generated by a dimer of trimers. The hexamer is stabilized by the interactions of the N-terminal oligomerization domain. 61
61803 408989 pfam19220 Crescentin Crescentin protein. This entry represents a bacterial equivalent to Intermediate Filament proteins, named crescentin, whose cytoskeletal function is required for the vibrioid and helical shapes of Caulobacter crescentus. Without crescentin, the cells adopt a straight-rod morphology. Crescentin has characteristic features of IF proteins including the ability to assemble into filaments in vitro without energy or cofactor requirements. In vivo, crescentin forms a helical structure that colocalizes with the inner cell curvatures beneath the cytoplasmic membrane. 401
61804 408990 pfam19221 MELT MELT motif. The outer kinetochore protein scaffold KNL1 is essential for error-free chromosome segregation during mitosis and meiosis. A critical feature of KNL1 is an array of repeats containing MELT-like motifs. When phosphorylated, these motifs form docking sites for the BUB1-BUB3 dimer that regulates chromosome biorientation and the spindle assembly checkpoint. This entry mainly represents vertebrate proteins although MELT motifs are found much more widely. 26
61805 408991 pfam19222 Noda_Vmethyltr Nodavirus Vmethyltransferase. This entry represents a family of nodavirus proteins that is homologous to pfam01660. These proteins are likely methyltransferases involved in mRNA capping. 148
61806 408992 pfam19223 Chropara_Vmeth Chroparavirus methyltransferase. This entry represents a family of chroparavirus proteins that is homologous to pfam01660. These proteins are likely methyltransferases involved in mRNA capping. 319
61807 408993 pfam19224 pATOM36 pATOM36 family. This entry represents the trypanosome Peripherally associated ATOM36 protein which has been shown to complement a deletion of the Mim1/Mim2 complex in yeast. The integral MOM protein, peripheral archaic translocase of the outer membrane 36 (pATOM36), in analogy to the MIM complex, is involved in the assembly and/or membrane insertion of a small subset of MOM proteins including subunits of the main trypanosomal outer membrane protein translocase (ATOM complex). 270
61808 408994 pfam19225 Spo16 Spo16 protein. This entry represents proteins related to yeast Spo16. Spo16 forms a complex with Zip2 and Zip4. Zip2 and Spo16, form a meiosis-specific XPF-ERCC1-like complex. The recombinant Zip2 XPF domain together with Spo16 preferentially binds branched DNA structures, such as D loops and HJs. The Spo16 protein contains a C-terminal HHH motif. 163
61809 408995 pfam19226 DisA DisA glycoprotein. This entry corresponds to a putative viral glycoprotein. 363
61810 408996 pfam19227 Salyut Salyut domain. This entry represents the Salyut domain found in the replicase of all viruses of the viral family Tymoviridae, composed of viruses that infect plants and insects. It is located within a long, hypervariable, Proline-rich hinge region located between the Alphavirus-like methyltransferase (pfam01660) and the pfam01443. It is located 150-20aa downstream of the Iceberg region that forms the C-terminus of Vmethyltransf. The function of this family is unknown. 51
61811 275365 sd00001 TSP3 Calcium-binding Thrombospondin type 3 (TSP3) repeat. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs. 59
61812 275366 sd00002 TSP3 Calcium-binding Thrombospondin type 3 (TSP3) repeat. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs. 59
61813 275367 sd00003 TSP3_1C Calcium-binding Thrombospondin type 3 (TSP3) repeat; C-type motif 1C. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs. The first TSP3 repeat 1C deviates from the canonical C-type calcium binding repeat in containing an insert relative to the other C-type repeats, however, the residues of the interrupted halves are positioned identically to C-type repeats without the insert. 35
61814 276811 sd00004 PPR Pentatricopeptide repeat, an RNA-binding module. The Pentatricopeptide repeat (PPR) is a 35-residue repeat motif that forms two anti-parallel alpha helices and binds single-stranded RNA in a sequence-specific and modular manner. It is present in a large family of RNA-binding proteins that are found in protists, fungi, and metazoan, but are most abundant in the mitochondria and chloroplasts of terrestrial plants. PPR proteins function in many aspects of RNA metabolism, including splicing, editing, degradation, and translation. They contain between 2 to 30 PPR repeats, organized into a hairpin of alpha helices. Proteins containing only arrays of PPR repeats that are 35-amino acid in length are called P class proteins. The second type of PPR proteins, called PLS class, contain additional C-terminal endonuclease or RNA editing domains and a distinct PPR architecture of triplet repeats alternating between a typical PPR, a longer PPR and a short PPR of 31 residues. 100
61815 276810 sd00005 TPR Tetratricopeptide repeat. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. The number of TPR motifs varies among proteins. Those containing 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein. It has been proposed that TPR proteins preferentially interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes. 60
61816 276809 sd00006 TPR Tetratricopeptide repeat. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. The number of TPR motifs varies among proteins. Those containing 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein. It has been proposed that TPR proteins preferentially interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes. 97
61817 276808 sd00008 TPR_YbbN C-terminal Tetratricopeptide repeat (TPR) region of YbbN and similar motifs. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. YbbN is a thioredoxin-like protein containing two tandem TPR repeats at the C-terminus, separated by two alpha helices. Its N-terminal thioredoxin-like domain is not a functional oxidoreductase. It functions in heat stress response and DNA synthesis as a chaperone or co-chaperone. 171
61818 276807 sd00010 SLR Sel1-like repeat. Sel1-like repeats (SLRs) share similar alpha-helical conformations with Tetratricopeptide repeats (TPRs), but with different consensus sequence lengths and superhelical topologies. SLRs contain 36 to 44 amino acids and are present in bacteria and eukaryotes but not in archaea. SLR proteins are involved in a variety of functions, and many serve as adaptor proteins for the assembly of macromolecular complexes. The SLR family was named after the Caenorhabditis elegans Sel1 protein which is predicted to fold into 11 SLRs, a transmembrane domain, and an N-terminal signal sequence. The human Sel1L protein contains an additional fibronectin type-II domain and an N-terminal PEST sequence. Its downregulation is associated with the development of breast and pancreatic carcinomas. 133
61819 276806 sd00016 Apc5 Tetratricopeptide repeat (TPR)-like motif of Apc5 and similar motifs. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This model represents the Tetratricopeptide repeat (TPR)-like motif region of Apc5. 98
61820 275368 sd00017 ZF_C2H2 Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats. 78
61821 275369 sd00018 ZF_C2H2 Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats. 24
61822 275370 sd00019 ZF_C2H2 Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats. 49
61823 275371 sd00020 ZF_C2H2 Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats. 46
61824 275375 sd00025 zf-RanBP2 RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motif, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains. 293
61825 275376 sd00029 zf-RanBP2 RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motif, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains. 74
61826 275377 sd00030 zf-RanBP2 RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motif, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains. 60
61827 275378 sd00031 LRR_1 leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 110
61828 275379 sd00032 LRR_2 leucine rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 205
61829 275380 sd00033 LRR_RI leucine-rich repeats, ribonuclease inhibitor (RI)-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 238
61830 275381 sd00034 LRR_AMN1 leucine-rich repeats, antagonist of mitotic exit network protein 1-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 212
61831 275382 sd00035 LRR_NTF leucine-rich repeats, nuclear transport factor-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 144
61832 275383 sd00036 LRR_3 leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions. 142
61833 275384 sd00037 PASTA PASTA domain. PASTA domain is found at the C-termini of several penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases. It is a small globular domain consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 126
61834 276965 sd00038 Kelch Kelch repeat. Kelch repeats are 44 to 56 amino acids in length and form a four-stranded beta-sheet corresponding to a single blade of five to seven bladed beta propellers. The Kelch superfamily is a large evolutionary conserved protein family whose members are present throughout the cell and extracellularly, and have diverse activities. Kelch repeats are often in combination with other domains, like BTB and BACK or F-box domains. 140
61835 293791 sd00039 7WD40 WD40 repeats in seven bladed beta propellers. The WD40 repeat is found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing, and cytoskeleton assembly. It typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40. Between the GH and WD dipeptides lies a conserved core. It forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet. The WD40 sequence repeat originally described in literature forms the first three strands of one blade and the last strand in the next blade. The C-terminal WD40 repeat completes the blade structure of the N-terminal WD40 repeat to create the closed ring propeller-structure. The residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands, allowing them to bind either stably or reversibly. 293
61836 293790 sd00041 GyrA-ParC_C beta-pinwheel repeat found at the C-terminus of GyrA, ParC, and similar proteins. Beta-pinwheel repeats are found at the C-terminus of both DNA gyrase subunit A and ParC, a subunit of topoisomerase IV (topo IV). DNA gyrase, a type IIA topoisomerase is a GyrA2GyrB2 heterotetramer which introduces negative supercoiling into the circular bacterial chromosome. Topo IV, a type IIB topoisomerase, is a ParC2ParE2 tetramer, which primarily relaxes positive supercoils and mediates topological unlinking of entangled DNA segments such as catenanes. The GyrA C-terminal repeat region, referred to as the C-terminal domain or CTD, binds DNA nonspecifically; it is thought to constrain a positive supercoil by wrapping a DNA duplex around its surface, upon strand passage, this wrap is converted into two negative supercoils. All known gyrase CTDs have 6 bladed beta-pinwheels, the topo IV CTD in various organisms is more variable and includes both 3-bladed and 8-bladed pinwheels. 253
61837 293789 sd00042 LVIVD LVIVD repeat. LVIVD repeats are mainly found in bacterial and archaeal cell surface proteins, many of them hypothetical. Structurally, LVIVD repeats have been predicted to form a beta-propeller, with each repeat forming one four-stranded anti-parallel beta-sheet blade. 120
61838 293788 sd00043 ARM armadillo repeat. Armadillo (ARM)/beta-catenin-like repeats are approximately 40 amino acid long, tandemly repeated sequence motif, first identified in the Drosophila segment polarity gene armadillo. These repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified. ARM repeats are related to HEAT repeats. 117
61839 293787 sd00044 HEAT HEAT repeats. The canonical HEAT repeat consists of two helices forming a helical hairpin. HEAT repeats are found in a diverse family of proteins, including the four proteins from which its name is based: Huntingtin, Elongation factor 3, the PR65/A subunit of protein phosphatase 2A (PP2A), and the lipid kinase TOR (target of rapamycin). The HEAT repeat family is related to armadillo (ARM)/beta-catenin-like repeats. 181
61840 293786 sd00045 ANK ankyrin repeats. Ankyrin repeats are one of the most abundant repeat motifs, and generally function as scaffolds for protein-protein interactions in processes including cell cycle, transcriptional regulation, signal transduction, vesicular trafficking, and inflammatory response. Although predominantly found in eukaryotic proteins, they are also found in some bacterial and viral proteins. Less is known of their physiological roles in prokaryotes. Some bacterial ANK proteins play key roles in microbial pathogenesis by mimicking or manipulating host function(s). The pathogen Providencia alcalifaciens N-formyltransferase ankyrin repeats function in small molecule binding and allosteric control. Ankyrin-repeat proteins have been associated with a number of human diseases. 98
61841 293785 sd00046 FHA_bHelix beta-helical repeat found in filamentous hemagglutinin and related adhesins and CdiA family proteins. This model contains ten copies of an approximately 20-residue repeat found in two-partner secretion (TPS) proteins, including the filamentous hemagglutinin (FHA) family of adhesins and CdiA family proteins. These repeats form a right-handed beta-helical structure, and are found in large secreted proteins from a number of plant and animal pathogens. FHA family adhesins bind to various types of cells and may contribute to attachment, aggregation, and pathogenesis. CdiA proteins are involved in contact-dependent growth inhibition (CDI). 209
61842 128322 smart00002 PLP Myelin proteolipid protein (PLP or lipophilin). 60
61843 128323 smart00003 NH Neurohypophysial hormones. Vasopressin/oxytocin gene family. 78
61844 197463 smart00004 NL Domain found in Notch and Lin-12. The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains. 38
61845 214467 smart00005 DEATH DEATH domain, found in proteins involved in cell death (apoptosis). Alpha-helical domain present in a variety of proteins with apoptotic functions. Some (but not all) of these domains form homotypic and heterotypic dimers. 88
61846 128326 smart00006 A4_EXTRA amyloid A4. amyloid A4 precursor of Alzheimer's disease. 165
61847 214468 smart00008 HormR Domain present in hormone receptors. 70
61848 197466 smart00010 small_GTPase Small GTPase of the Ras superfamily; ill-defined subfamily. SMART predicts Ras-like small GTPases of the ARF, RAB, RAN, RAS, and SAR subfamilies. Others that could not be classified in this way are predicted to be members of the small GTPase superfamily without predictions of the subfamily. 166
61849 214469 smart00012 PTPc_DSPc Protein tyrosine phosphatase, catalytic domain, undefined specificity. Protein tyrosine phosphatases. Homologues detected by this profile and not by those of "PTPc" or "DSPc" are predicted to be protein phosphatases with a similar fold to DSPs and PTPs, yet with unpredicted specificities. 105
61850 214470 smart00013 LRRNT Leucine rich repeat N-terminal domain. 33
61851 214471 smart00014 acidPPc Acid phosphatase homologues. 116
61852 197470 smart00015 IQ Calmodulin-binding motif. Short calmodulin-binding motif containing conserved Ile and Gln residues. 23
61853 214472 smart00017 OSTEO Osteopontin. Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone 287
61854 197472 smart00018 PD P or trefoil or TFF domain. Proposed role in renewal and pathology of mucous epithelia. 46
61855 128335 smart00019 SF_P Pulmonary surfactant proteins. Pulmonary surfactant associated proteins promote alveolar stability by lowering the surface tension at the air-liquid interface in the peripheral air spaces. SP-C, a component of surfactant, is a highly hydrophobic peptide of 35 amino acid residues which is processed from a larger precursor protein. SP-C is post-translationally modified by the covalent attachment of two palmitoyl groups on two adjacent cysteines 191
61856 214473 smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues. 229
61857 197474 smart00021 DAX Domain present in Dishevelled and axin. Domain of unknown function. 83
61858 214474 smart00022 PLAc Cytoplasmic phospholipase A2, catalytic subunit. Cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids. Family includes phospholipases B isoforms. 549
61859 128339 smart00023 COLIPASE Colipase. Colipase is a protein that functions as a cofactor for pancreatic lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface. Colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds. 95
61860 214475 smart00025 Pumilio Pumilio-like repeats. Pumilio-like repeats that bind RNA. 36
61861 128341 smart00026 EPEND Ependymins. Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds. 191
61862 197477 smart00027 EH Eps15 homology domain. Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences. 96
61863 197478 smart00028 TPR Tetratricopeptide repeats. Repeats present in 4 or more copies in proteins. Contain a minimum of 34 amino acids each and self-associate via a "knobs and holes" mechanism. 34
61864 214476 smart00029 GASTRIN gastrin / cholecystokinin / caerulein family. This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction. 14
61865 128345 smart00030 CLb CLUSTERIN Beta chain. 206
61866 214477 smart00031 DED Death effector domain. 79
61867 214478 smart00032 CCP Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR). The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in seventh CCP domain causes deficiency of the b subunit of factor XIII. 56
61868 214479 smart00033 CH Calponin homology domain. Actin binding domains present in duplicate at the N-termini of spectrin-like proteins (including dystrophin, alpha-actinin). These domains cross-link actin filaments into bundles and networks. A calponin homology domain is predicted in yeast Cdc24p. 101
61869 214480 smart00034 CLECT C-type lectin (CTL) or carbohydrate-recognition domain (CRD). Many of these domains function as calcium-dependent carbohydrate binding modules. 124
61870 128350 smart00035 CLa CLUSTERIN alpha chain. 216
61871 214481 smart00036 CNH Domain found in NIK1-like kinases, mouse citron and yeast ROM1, ROM2. 302
61872 128352 smart00037 CNX Connexin homologues. Connexin channels participate in the regulation of signaling between developing and differentiated cell types. 34
61873 197483 smart00038 COLFI Fibrillar collagens C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. 232
61874 128354 smart00039 CRF corticotropin-releasing factor. 40
61875 128355 smart00040 CSF2 Granulocyte-macrophage colony-stimulating factor (GM-CSF). GM-CSF stimulates the development of and the cytotoxic activity of white blood cells. 121
61876 214482 smart00041 CT C-terminal cystine knot-like domain (CTCK). The structures of transforming growth factor-beta (TGFbeta), nerve growth factor (NGF), platelet-derived growth factor (PDGF) and gonadotropin all form 2 highly twisted antiparallel pairs of beta-strands and contain three disulphide bonds. The domain is non-globular and little is conserved among these presumed homologues except for their cysteine residues. CT domains are predicted to form homodimers. 82
61877 214483 smart00042 CUB Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein. This domain is found mostly among developmentally-regulated proteins. Spermadhesins contain only this domain. 102
61878 214484 smart00043 CY Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains. 107
61879 214485 smart00044 CYCc Adenylyl- / guanylyl cyclase, catalytic domain. Present in two copies in mammalian adenylyl cyclases. Eubacterial homologues are known. Two residues (Asn, Arg) are thought to be involved in catalysis. These cyclases have important roles in a diverse range of cellular processes. 194
61880 214486 smart00045 DAGKa Diacylglycerol kinase accessory domain (presumed). Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known. 160
61881 214487 smart00046 DAGKc Diacylglycerol kinase catalytic domain (presumed). Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain is presumed to be the catalytic domain. Bacterial homologues are known. 124
61882 214488 smart00047 LYZ2 Lysozyme subfamily 2. Eubacterial enzymes distantly related to eukaryotic lysozymes. 147
61883 128363 smart00048 DEFSN Defensin/corticostatin family. Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels. 29
61884 214489 smart00049 DEP Domain found in Dishevelled, Egl-10, and Pleckstrin. Domain of unknown function present in signalling proteins that contain PH, rasGEF, rhoGEF, rhoGAP, RGS, PDZ domains. DEP domain in Drosophila dishevelled is essential to rescue planar polarity defects and induce JNK signalling (Cell 94, 109-118). 77
61885 214490 smart00050 DISIN Homologues of snake disintegrins. Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs. 75
61886 128366 smart00051 DSL delta serrate ligand. 63
61887 214491 smart00052 EAL Putative diguanylate phosphodiesterase. Putative diguanylate phosphodiesterase, present in a variety of bacteria. 242
61888 197491 smart00053 DYNc Dynamin, GTPase. Large GTPases that mediate vesicle trafficking. Dynamin participates in the endocytic uptake of receptors, associated ligands, and plasma membrane following an exocytic event. 240
61889 197492 smart00054 EFh EF-hand, calcium binding motif. EF-hands are calcium-binding motifs that occur at least in pairs. Links between disease states and genes encoding EF-hands, particularly the S100 subclass, are emerging. Each motif consists of a 12 residue loop flanked on either side by a 12 residue alpha-helix. EF-hands undergo a conformational change upon binding calcium ions. 29
61890 214492 smart00055 FCH Fes/CIP4 homology domain. Alignment extended from original report. Highly alpha-helical. Also known as the RAEYL motif or the S. pombe Cdc15 N-terminal domain. 87
61891 214493 smart00057 FIMAC factor I membrane attack complex. 68
61892 214494 smart00058 FN1 Fibronectin type 1 domain. One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding. 45
61893 128373 smart00059 FN2 Fibronectin type 2 domain. One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function. 49
61894 214495 smart00060 FN3 Fibronectin type 3 domain. One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins. 83
61895 214496 smart00061 MATH meprin and TRAF homology. 95
61896 214497 smart00062 PBPb Bacterial periplasmic substrate-binding proteins. bacterial proteins, eukaryotic ones are in PBPe 219
61897 214498 smart00063 FRI Frizzled. Drosophila melanogaster frizzled mediates signalling that polarises a precursor cell along the anteroposterior axis. Homologues of the N-terminal region of frizzled exist either as transmembrane or secreted molecules. Frizzled homologues are reported to be receptors for the Wnt growth factors. (Not yet in MEDLINE: the FRI domain occurs in several receptor tyrosine kinases [Xu, Y.K. and Nusse, Curr. Biol. 8 R405-R406 (1998); Masiakowski, P. and Yanopoulos, G.D., Curr. Biol. 8, R407 (1998)]). 113
61898 214499 smart00064 FYVE Protein present in Fab1, YOTB, Vac1, and EEA1. The FYVE zinc finger is named after four proteins where it was first found: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. The FYVE finger is structurally related to the PHD finger and the RING finger. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. The FYVE finger functions in the membrane recruitment of cytosolic proteins by binding to phosphatidylinositol 3-phosphate (PI3P), which is prominent on endosomes. The R+HHC+XCG motif is critical for PI3P binding. 68
61899 214500 smart00065 GAF Domain present in phytochromes and cGMP-specific phosphodiesterases. Mutations within these domains in PDE6B result in autosomal recessive inheritance of retinitis pigmentosa. 149
61900 214501 smart00066 GAL4 GAL4-like Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA-binding domain. Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. Is present only in fungi. 43
61901 128381 smart00067 GHA Glycoprotein hormone alpha chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. 87
61902 214502 smart00068 GHB Glycoprotein hormone beta chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. 107
61903 214503 smart00069 GLA Domain containing Gla (gamma-carboxyglutamate) residues. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration. 65
61904 128384 smart00070 GLUCA Glucagon like hormones. 27
61905 128385 smart00071 Galanin Galanin. Galanin is a neuropeptide that controls various biological activities: it regulates the release of growth hormone, inhibits the release of insulin and somatostatin, contracts smooth muscle of the gastrointestinal and genitourinary tract and may be involved in the control of adrenal secretion. 103
61906 214504 smart00072 GuKc Guanylate kinase homologues. Active enzymes catalyze ATP-dependent phosphorylation of GMP to GDP. Structure resembles that of adenylate kinase. So-called membrane-associated guanylate kinase homologues (MAGUKs) do not possess guanylate kinase activities; instead at least some possess protein-binding functions. 174
61907 197502 smart00073 HPT Histidine Phosphotransfer domain. Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper. 92
61908 214505 smart00075 HYDRO Hydrophobins. 76
61909 197503 smart00076 IFabd Interferon alpha, beta and delta. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related but gamma-interferon. 117
61910 128390 smart00077 ITAM Immunoreceptor tyrosine-based activation motif. Motif that may be dually phosphorylated on tyrosine that links antigen receptors to downstream signalling machinery. 21
61911 214506 smart00078 IlGF Insulin / insulin-like growth factor / relaxin family. Family of proteins including insulin, relaxin, and IGFs. Insulin decreases blood glucose concentration. 66
61912 197504 smart00079 PBPe Eukaryotic homologues of bacterial periplasmic substrate binding proteins. Prokaryotic homologues are represented by a separate alignment: PBPb 133
61913 197505 smart00080 LIF_OSM leukemia inhibitory factor. OSM, Oncostatin M 157
61914 214507 smart00082 LRRCT Leucine rich repeat C-terminal domain. 51
61915 197507 smart00084 NMU Neuromedin U. Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians. 25
61916 214508 smart00085 PA2c Phospholipase A2. 117
61917 197509 smart00086 PAC Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain). PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. 43
61918 128398 smart00087 PTH Parathyroid hormone. 36
61919 214509 smart00088 PINT motif in proteasome subunits, Int-6, Nip-1 and TRIP-15. Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function. 88
61920 214510 smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases. 79
61921 214511 smart00090 RIO RIO-like kinase. 237
61922 214512 smart00091 PAS PAS domain. PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. 67
61923 128403 smart00092 RNAse_Pc Pancreatic ribonuclease. 123
61924 214513 smart00093 SERPIN SERine Proteinase INhibitors. 359
61925 214514 smart00094 TR_FER Transferrin. 332
61926 128406 smart00095 TR_THY Transthyretin. 121
61927 128407 smart00096 UTG Uteroglobin. 69
61928 128408 smart00097 WNT1 found in Wnt-1. 305
61929 214515 smart00098 alkPPc Alkaline phosphatase homologues. 419
61930 128410 smart00099 btg1 tob/btg1 family. The tob/btg1 is a family of proteins that inhibit cell proliferation. 108
61931 197516 smart00100 cNMP Cyclic nucleotide-monophosphate binding domain. Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases. 120
61932 128412 smart00101 14_3_3 14-3-3 homologues. 14-3-3 homologues mediate signal transduction by binding to phosphoserine-containing proteins. They are involved in growth factor signalling and also interact with MEK kinases. 244
61933 214516 smart00102 ADF Actin depolymerisation factor/cofilin -like domains. Severs actin filaments and binds to actin monomers. 127
61934 214517 smart00103 ALBUMIN serum albumin. 187
61935 197517 smart00104 ANATO Anaphylatoxin homologous domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. 35
61936 214518 smart00105 ArfGap Putative GTP-ase activating proteins for the small GTPase, ARF. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. 119
61937 128417 smart00107 BTK Bruton's tyrosine kinase Cys-rich motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region. 36
61938 214519 smart00108 B_lectin Bulb-type mannose-specific lectin. 114
61939 197519 smart00109 C1 Protein kinase C conserved region 1 (C1) domains (Cysteine-rich domains). Some bind phorbol esters and diacylglycerol. Some bind RasGTP. Zinc-binding domains. 50
61940 128420 smart00110 C1Q Complement component C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor. 135
61941 128421 smart00111 C4 C-terminal tandem repeated domain in type 4 procollagens. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. 114
61942 214520 smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. 81
61943 128423 smart00113 CALCITONIN calcitonin. This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones. 38
61944 128424 smart00114 CARD Caspase recruitment domain. Motif contained in proteins involved in apoptotic signalling. Mediates homodimerisation. Structure consists of six antiparallel helices arranged in a topology homologous to the DEATH and the DED domain. 88
61945 214521 smart00115 CASc Caspase, interleukin-1 beta converting enzyme (ICE) homologues. Cysteine aspartases that mediate programmed cell death (apoptosis). Caspases are synthesised as zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologues. 241
61946 214522 smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease. 49
61947 214523 smart00119 HECTc Domain Homologous to E6-AP Carboxyl Terminus with. E3 ubiquitin-protein ligases. Can bind to E2 enzymes. 328
61948 214524 smart00120 HX Hemopexin-like repeats. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). 45
61949 197525 smart00121 IB Insulin growth factor-binding protein homologues. High affinity binding partners of insulin-like growth factors. 75
61950 128430 smart00125 IL1 Interleukin-1 homologues. Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. 147
61951 128431 smart00126 IL6 Interleukin-6 homologues. Family includes granulocyte colony-stimulating factor (G-CSF) and myelomonocytic growth factor (MGF). IL-6 is also known as B-cell stimulatory factor 2. 154
61952 128432 smart00127 IL7 Interleukin-7 and interleukin-9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine that, although originally described as a T-cell growth factor, its function in T-cell response remains unclear. 146
61953 214525 smart00128 IPPc Inositol polyphosphate phosphatase, catalytic domain homologues. Mg(2+)-dependent/Li(+)-sensitive enzymes. 306
61954 214526 smart00129 KISc Kinesin motor, catalytic domain. ATPase. Microtubule-dependent molecular motors that play important roles in intracellular transport of organelles and in cell division. 335
61955 214527 smart00130 KR Kringle domain. Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. 83
61956 197529 smart00131 KU BPTI/Kunitz family of serine protease inhibitors. Serine protease inhibitors. One member of the family is encoded by an alternatively-spliced form of Alzheimer's amyloid beta-protein. 53
61957 214528 smart00132 LIM Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways. 54
61958 214529 smart00133 S_TK_X Extension to Ser/Thr-type protein kinases. 64
61959 214530 smart00134 LU Ly-6 antigen / uPA receptor -like domain. Three-fold repeated domain in urokinase-type plasminogen activator receptor; occurs singly in other GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2). Topology of these domains is similar to that of snake venom neurotoxins. 85
61960 214531 smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 43
61961 214532 smart00136 LamNT Laminin N-terminal domain (domain VI). N-terminal domain of laminins and laminin-related proteins such as Unc-6/netrins. 238
61962 214533 smart00137 MAM Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others). Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions. 161
61963 214534 smart00138 MeTrc Methyltransferase, chemotaxis proteins. Methylates methyl-accepting chemotaxis proteins to form gamma-glutamyl methyl ester residues. 264
61964 214535 smart00139 MyTH4 Domain in Myosin and Kinesin Tails. Domain present twice in myosin-VIIa, and also present in 3 other myosins. 152
61965 128445 smart00140 NGF Nerve growth factor (NGF or beta-NGF). NGF is important for the development and maintenance of the sympathetic and sensory nervous systems. 106
61966 197537 smart00141 PDGF Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family. Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF. 83
61967 214536 smart00142 PI3K_C2 Phosphoinositide 3-kinase, region postulated to contain C2 domain. Outlier of C2 family. 100
61968 197539 smart00143 PI3K_p85B PI3-kinase family, p85-binding domain. Region of p110 PI3K that binds the p85 subunit. 78
61969 197540 smart00144 PI3K_rbd PI3-kinase family, Ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation). 108
61970 214537 smart00145 PI3Ka Phosphoinositide 3-kinase family, accessory domain (PIK domain). PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. 184
61971 214538 smart00146 PI3Kc Phosphoinositide 3-kinase, catalytic domain. Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities. 240
61972 214539 smart00147 RasGEF Guanine nucleotide exchange factor for Ras-like small GTPases. 242
61973 197543 smart00148 PLCXc Phospholipase C, catalytic domain (part); domain X. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs. 143
61974 128454 smart00149 PLCYc Phospholipase C, catalytic domain (part); domain Y. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs. 115
61975 197544 smart00150 SPEC Spectrin repeats. 101
61976 128456 smart00151 SWIB SWI complex, BAF60b domains. 77
61977 128457 smart00152 THY Thymosin beta actin-binding motif. 37
61978 128458 smart00153 VHP Villin headpiece domain. 36
61979 197545 smart00154 ZnF_AN1 AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. 39
61980 197546 smart00155 PLDc Phospholipase D. Active site motifs. Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. 28
61981 197547 smart00156 PP2Ac Protein phosphatase 2A homologues, catalytic domain. Large family of serine/threonine phosphatases, that includes PP1, PP2A and PP2B (calcineurin) family members. 271
61982 197548 smart00157 PRP Major prion protein. The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy. 218
61983 128463 smart00159 PTX Pentraxin / C-reactive protein / pentaxin family. This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding. 206
61984 197549 smart00160 RanBD Ran-binding domain. Domain of approximately 150 residues that stabilises the GTP-bound form of Ran (the Ras-like nuclear small GTPase). 130
61985 128465 smart00162 SAPA Saposin/surfactant protein-B A-type DOMAIN. Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin amoebapores and granulysin. Putative phospholipid membrane binding domains. 34
61986 214540 smart00164 TBC Domain in Tre-2, BUB2p, and Cdc16p. Probable Rab-GAPs. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases. 216
61987 197551 smart00165 UBA Ubiquitin associated domain. Present in Rad23, SNF1-like kinases. The newly-found UBA in p62 is known to bind ubiquitin. 37
61988 197552 smart00166 UBX Domain present in ubiquitin-regulatory proteins. Present in FAF1 and Shp1p. 77
61989 128469 smart00167 VPS9 Domain present in VPS9. Domain present in yeast vacuolar sorting protein 9 and other proteins. 117
61990 214541 smart00173 RAS Ras subfamily of RAS small GTPases. Similar in fold and function to the bacterial EF-Tu GTPase. p21Ras couples receptor Tyr kinases and G protein receptors to protein kinase cascades 164
61991 197554 smart00174 RHO Rho (Ras homology) subfamily of Ras-like small GTPases. Members of this subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms. 174
61992 197555 smart00175 RAB Rab subfamily of small GTPases. Rab GTPases are implicated in vesicle trafficking. 164
61993 128473 smart00176 RAN Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases. Ran is involved in the active transport of proteins through nuclear pores. 200
61994 128474 smart00177 ARF ARF-like small GTPases; ARF, ADP-ribosylation factor. Ras homologues involved in vesicular transport. Activator of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. ARFs are N-terminally myristoylated. Contains ATP/GTP-binding motif (P-loop). 175
61995 197556 smart00178 SAR Sar1p-like members of the Ras-family of small GTPases. Yeast SAR1 is an essential gene required for transport of secretory proteins from the endoplasmic reticulum to the Golgi apparatus. 184
61996 214542 smart00179 EGF_CA Calcium-binding EGF-like domain. 39
61997 214543 smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain. 46
61998 214544 smart00181 EGF Epidermal growth factor-like domain. 35
61999 214545 smart00182 CULLIN Cullin. 143
62000 128480 smart00183 NAT_PEP Natriuretic peptide. Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general. 24
62001 214546 smart00184 RING Ring finger. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; various RING fingers exhibit binding activity towards E2 ubiquitin-conjugating enzymes (Ubc's). 40
62002 214547 smart00185 ARM Armadillo/beta-catenin-like repeats. Approx. 40 amino acid repeat. Tandem repeats form superhelix of helices that is proposed to mediate interaction of beta-catenin with its ligands. Involved in transducing the Wingless/Wnt signal. In plakoglobin arm repeats bind alpha-catenin and N-cadherin. 41
62003 214548 smart00186 FBG Fibrinogen-related domains (FReDs). Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous. 212
62004 197563 smart00187 INB Integrin beta subunits (N-terminal portion of extracellular region). Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). 423
62005 128485 smart00188 IL10 Interleukin-10 family. Interleukin-10 inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. 137
62006 128486 smart00189 IL2 Interleukin-2 family. Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response. 154
62007 197564 smart00190 IL4_13 Interleukins 4 and 13. Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells. 138
62008 214549 smart00191 Int_alpha Integrin alpha (beta-propellor repeats). Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Alpha integrins are proposed to contain a domain containing a 7-fold repeat that adopts a beta-propellor fold. Some of these domains contain an inserted von Willebrand factor type-A domain. Some repeats contain putative calcium-binding sites. The 7-fold repeat domain is homologous to a similar domain in phosphatidylinositol-glycan-specific phospholipase D. 57
62009 197566 smart00192 LDLa Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia. 33
62010 128490 smart00193 PTN Pleiotrophin / midkine family. Heparin-binding domain family. 80
62011 214550 smart00194 PTPc Protein tyrosine phosphatase, catalytic domain. 259
62012 214551 smart00195 DSPc Dual specificity phosphatase, catalytic domain. 138
62013 214552 smart00197 SAA Serum amyloid A proteins. Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins. 103
62014 214553 smart00198 SCP SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains. Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function. 144
62015 197570 smart00199 SCY Intercrine alpha family (small cytokine C-X-C) (chemokine CXC). Family of cytokines involved in cell-specific chemotaxis, mediation of cell growth, and the inflammatory response. 59
62016 214554 smart00200 SEA Domain found in sea urchin sperm protein, enterokinase, agrin. Proposed function of regulating or binding carbohydrate sidechains. 121
62017 197571 smart00201 SO Somatomedin B -like domains. Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity. 43
62018 214555 smart00202 SR Scavenger receptor Cys-rich. The sea urchin egg peptide speract contains 4 repeats of SR domains that contain 6 conserved cysteines. May bind bacterial antigens in the protein MARCO. 101
62019 197573 smart00203 TK Tachykinin family. Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilatators and contract (directly or indirectly) many smooth muscles. These peptides are synthesized as longer precursors and then processed to peptides from ten to twelve residues long. 11
62020 214556 smart00204 TGFB Transforming growth factor-beta (TGF-beta) family. Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types. 102
62021 128501 smart00205 THN Thaumatin family. The thaumatin family gathers proteins related to plant pathogenesis. The thaumatin family includes very basic members with extracellular and vacuolar localization. Thaumatin itself is a potent sweet-tasting protein. Several members of this family display significant in vitro activity of inhibiting hyphal growth or spore germination of various fungi probably by a membrane permeabilizing mechanism. 218
62022 128502 smart00206 NTR Tissue inhibitor of metalloproteinase family. Form complexes with metalloproteinases, such as collagenases, and irreversibly inactivate them. 172
62023 214557 smart00207 TNF Tumour necrosis factor family. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor. 125
62024 214558 smart00208 TNFR Tumor necrosis factor receptor / nerve growth factor receptor repeats. Repeats in growth factor receptors that are involved in growth factor binding. TNF/TNFR 39
62025 214559 smart00209 TSP1 Thrombospondin type 1 repeats. Type 1 repeats in thrombospondin-1 bind and activate TGF-beta. 53
62026 214560 smart00210 TSPN Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin 184
62027 214561 smart00211 TY Thyroglobulin type I repeats. The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases and binding partners of heparin. 46
62028 214562 smart00212 UBCc Ubiquitin-conjugating enzyme E2, catalytic domain homologues. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. This pathway functions in regulating many fundamental processes required for cell viability. TSG101 is one of several UBC homologues that lacks this active site cysteine. 145
62029 214563 smart00213 UBQ Ubiquitin homologues. Ubiquitin-mediated proteolysis is involved in the regulated turnover of proteins required for controlling cell cycle progression 72
62030 214564 smart00214 VWC von Willebrand factor (vWF) type C domain. 59
62031 214565 smart00215 VWC_out von Willebrand factor (vWF) type C domain. 67
62032 214566 smart00216 VWD von Willebrand factor (vWF) type D domain. Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation. 163
62033 197580 smart00217 WAP Four-disulfide core domains. 47
62034 128514 smart00218 ZU5 Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function. 104
62035 197581 smart00219 TyrKc Tyrosine kinase, catalytic domain. Phosphotransferases. Tyrosine-specific kinase subfamily. 257
62036 214567 smart00220 S_TKc Serine/Threonine protein kinases, catalytic domain. Phosphotransferases. Serine or threonine-specific kinase subfamily. 254
62037 214568 smart00221 STYKc Protein kinase; unclassified specificity. Phosphotransferases. The specificity of this class of kinases can not be predicted. Possible dual-specificity Ser/Thr/Tyr kinase. 258
62038 214569 smart00222 Sec7 Sec7 domain. Domain named after the S. cerevisiae SEC7 gene product, which is required for proper protein transport through the Golgi. The domain facilitates guanine nucleotide exchange on the small GTPases, ARFs (ADP ribosylation factors). 189
62039 128519 smart00223 APPLE APPLE domain. Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder. 79
62040 128520 smart00224 GGL G protein gamma subunit-like motifs. 63
62041 197585 smart00225 BTB Broad-Complex, Tramtrack and Bric a brac. Domain in Broad-Complex, Tramtrack and Bric a brac. Also known as POZ (poxvirus and zinc finger) domain. Known to be a protein-protein interaction motif found at the N-termini of several C2H2-type transcription factors as well as Shaw-type potassium channels. Known structure reveals a tightly intertwined dimer formed via interactions between N-terminal strand and helix structures. However in a subset of BTB/POZ domains, these two secondary structures appear to be missing. Be aware SMART predicts BTB/POZ domains without the beta1- and alpha1-secondary structures. 97
62042 197586 smart00226 LMWPc Low molecular weight phosphatase family. 134
62043 128523 smart00227 NEBU The Nebulin repeat is present also in Las1. Tandem arrays of these repeats are known to bind actin. 31
62044 214570 smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities. 85
62045 214571 smart00229 RasGEFN Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif. A subset of guanine nucleotide exchange factor for Ras-like small GTPases appear to possess this domain N-terminal to the RasGef (Cdc25-like) domain. The recent crystal structure of Sos shows that this domain is alpha-helical and plays a "purely structural role" (Nature 394, 337-343). 127
62046 128526 smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit). 318
62047 214572 smart00231 FA58C Coagulation factor 5/8 C-terminal domain, discoidin domain. Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. 139
62048 214573 smart00232 JAB_MPN JAB/MPN domain. Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function. 135
62049 214574 smart00233 PH Pleckstrin homology domain. Domain commonly found in eukaryotic signalling proteins. The domain family possesses multiple functions including the abilities to bind inositol phosphates, and various proteins. PH domains have been found to possess inserted domains (such as in PLC gamma, syntrophins) and to be inserted within other domains. Mutations in Brutons tyrosine kinase (Btk) within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients. Point mutations cluster into the positively charged end of the molecule around the predicted binding site for phosphatidylinositol lipids. 102
62050 214575 smart00234 START in StAR and phosphatidylcholine transfer protein. putative lipid-binding domain in StAR and phosphatidylcholine transfer protein 205
62051 214576 smart00235 ZnMc Zinc-dependent metalloprotease. Neutral zinc metallopeptidases. This alignment represents a subset of known subfamilies. Highest similarity occurs in the HExxH zinc-binding site/ active site. 139
62052 197593 smart00236 fCBD Fungal-type cellulose-binding domain. Small four-cysteine cellulose-binding domain of fungi 34
62053 197594 smart00237 Calx_beta Domains in Na-Ca exchangers and integrin-beta4. Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins) 90
62054 197595 smart00238 BIR Baculoviral inhibition of apoptosis protein repeat. Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes. 71
62055 214577 smart00239 C2 Protein kinase C conserved region 2 (CalB). Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles. 101
62056 214578 smart00240 FHA Forkhead associated domain. Found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. 52
62057 214579 smart00241 ZP Zona pellucida (ZP) domain. ZP proteins are responsible for sperm-adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan). 252
62058 214580 smart00242 MYSc Myosin. Large ATPases. ATPase; molecular motor. Muscle contraction consists of a cyclical interaction between myosin and actin. The core of the myosin structure is similar in fold to that of kinesin. 677
62059 128539 smart00243 GAS2 Growth-Arrest-Specific Protein 2 Domain. GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain 73
62060 214581 smart00244 PHB prohibitin homologues. prohibitin homologues 160
62061 214582 smart00245 TSPc tail specific protease. tail specific protease 192
62062 128542 smart00246 WH2 Wiskott Aldrich syndrome homology region 2. Wiskott Aldrich syndrome homology region 2 / actin-binding motif 18
62063 214583 smart00247 XTALbg Beta/gamma crystallins. Beta/gamma crystallins 82
62064 197603 smart00248 ANK ankyrin repeats. Ankyrin repeats are about 33 amino acids long and occur in at least four consecutive copies. They are involved in protein-protein interactions. The core of the repeat seems to be a helix-loop-helix structure. 30
62065 214584 smart00249 PHD PHD zinc finger. The plant homeodomain (PHD) finger is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in epigenetics and chromatin-mediated transcriptional regulation. The PHD finger binds two zinc ions using the so-called 'cross-brace' motif and is thus structurally related to the RING finger and the FYVE finger. It is not yet known if PHD fingers have a common molecular function. Several reports suggest that it can function as a protein-protein interaction domain and it was recently demonstrated that the PHD finger of p300 can cooperate with the adjacent BROMO domain in nucleosome binding in vitro. Other reports suggesting that the PHD finger is a ubiquitin ligase have been refuted as these domains were RING fingers misidentified as PHD fingers. 47
62066 197605 smart00250 PLEC Plectin repeat. 38
62067 128547 smart00251 SAM_PNT SAM / Pointed domain. A subfamily of the SAM domain 82
62068 214585 smart00252 SH2 Src homology 2 domains. Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae. 84
62069 128549 smart00253 SOCS suppressors of cytokine signalling. suppressors of cytokine signalling 43
62070 214586 smart00254 ShKT ShK toxin domain. ShK toxin domain 33
62071 214587 smart00255 TIR Toll - interleukin 1 - resistance. 140
62072 197608 smart00256 FBOX A Receptor for Ubiquitination Targets. 41
62073 197609 smart00257 LysM Lysin motif. 44
62074 128554 smart00258 SAND SAND domain. 73
62075 128555 smart00259 ZnF_A20 A20-like zinc fingers. A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation. 26
62076 214588 smart00260 CheW Two component signalling adaptor domain. 138
62077 214589 smart00261 FU Furin-like repeats. 45
62078 214590 smart00262 GEL Gelsolin homology domain. Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains. 90
62079 197612 smart00263 LYZ1 Alpha-lactalbumin / lysozyme C. 127
62080 214591 smart00264 BAG BAG domains, present in regulator of Hsp70 proteins. BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains 79
62081 128561 smart00265 BH4 BH4 Bcl-2 homology region 4. 27
62082 128562 smart00266 CAD Domains present in proteins implicated in post-mortem DNA fragmentation. 74
62083 128563 smart00267 GGDEF diguanylate cyclase. Diguanylate cyclase, present in a variety of bacteria. 163
62084 214592 smart00268 ACTIN Actin. ACTIN subfamily of ACTIN/mreB/sugarkinase/Hsp70 superfamily 373
62085 197615 smart00269 BowB Bowman-Birk type proteinase inhibitor. 55
62086 214593 smart00270 ChtBD1 Chitin binding domain. 38
62087 197617 smart00271 DnaJ DnaJ molecular chaperone homology domain. 60
62088 197618 smart00272 END Endothelin. 22
62089 214594 smart00273 ENTH Epsin N-terminal homology (ENTH) domain. 127
62090 128570 smart00274 FOLN Follistatin-N-terminal domain-like. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence 24
62091 214595 smart00275 G_alpha G protein alpha subunit. Subunit of G proteins that contains the guanine nucleotide binding site 342
62092 214596 smart00276 GLECT Galectin. Galectin - galactose-binding lectin 128
62093 197621 smart00277 GRAN Granulin. 51
62094 197622 smart00278 HhH1 Helix-hairpin-helix DNA-binding motif class 1. 20
62095 197623 smart00279 HhH2 Helix-hairpin-helix class 2 (Pol1 family) motifs. 36
62096 197624 smart00280 KAZAL Kazal type serine protease inhibitors. Kazal type serine protease inhibitors and follistatin-like domains. 46
62097 214597 smart00281 LamB Laminin B domain. 127
62098 214598 smart00282 LamG Laminin G domain. 132
62099 214599 smart00283 MA Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer). Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis. 262
62100 128580 smart00284 OLF Olfactomedin-like domains. 255
62101 197628 smart00285 PBD P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). 36
62102 128582 smart00286 PTI Plant trypsin inhibitors. 29
62103 214600 smart00287 SH3b Bacterial SH3 domain homologues. 63
62104 197630 smart00288 VHS Domain present in VPS-27, Hrs and STAM. Unpublished observations. Domain of unknown function. 133
62105 214601 smart00289 WR1 Worm-specific repeat type 1. Worm-specific repeat type 1. Cysteine-rich domain apparently unique (so far) to C. elegans. Often appears with KU domains. About 3 dozen worm proteins contain this domain. 38
62106 197632 smart00290 ZnF_UBP Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger. 50
62107 197633 smart00291 ZnF_ZZ Zinc-binding domain, present in Dystrophin, CREB-binding protein. Putative zinc-binding domain present in dystrophin-like proteins, and CREB-binding protein/p300 homologues. The ZZ in dystrophin appears to bind calmodulin. A missense mutation of one of the conserved cysteines in dystrophin results in a patient with Duchenne muscular dystrophy. 44
62108 214602 smart00292 BRCT breast cancer carboxy-terminal domain. 78
62109 214603 smart00293 PWWP domain with conserved PWWP motif. conservation of Pro-Trp-Trp-Pro residues 63
62110 128590 smart00294 4.1m putative band 4.1 homologues' binding motif. 19
62111 214604 smart00295 B41 Band 4.1 homologues. Also known as ezrin/radixin/moesin (ERM) protein domains. Present in myosins, ezrin, radixin, moesin, protein tyrosine phosphatases. Plasma membrane-binding domain. These proteins play structural and regulatory roles in the assembly and stabilization of specialized plasma membrane domains. Some PDZ domain containing proteins bind one or more of this family. Now includes JAKs. 201
62112 197636 smart00297 BROMO bromo domain. 107
62113 214605 smart00298 CHROMO Chromatin organization modifier domain. 55
62114 128594 smart00299 CLH Clathrin heavy chain repeat homology. 140
62115 197638 smart00300 ChSh Chromo Shadow Domain. 61
62116 214606 smart00301 DM Doublesex DNA-binding motif. 54
62117 128597 smart00302 GED Dynamin GTPase effector domain. 92
62118 197639 smart00303 GPS G-protein-coupled receptor proteolytic site domain. Present in latrophilin/CL-1, sea urchin REJ and polycystin. 49
62119 197640 smart00304 HAMP HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) domain. 53
62120 197641 smart00305 HintC Hint (Hedgehog/Intein) domain C-terminal region. Hedgehog/Intein domain, C-terminal region. Domain has been split to accommodate large insertions of endonucleases. 46
62121 197642 smart00306 HintN Hint (Hedgehog/Intein) domain N-terminal region. Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases. 100
62122 214607 smart00307 ILWEQ I/LWEQ domain. Thought to possess an F-actin binding function. 200
62123 214608 smart00308 LH2 Lipoxygenase homology 2 (beta barrel) domain. 105
62124 197643 smart00309 PAH Pancreatic hormones / neuropeptide F / peptide YY family. Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions. 36
62125 197644 smart00310 PTBI Phosphotyrosine-binding domain (IRS1-like). 99
62126 214609 smart00311 PWI PWI, domain in splicing factors. 74
62127 214610 smart00312 PX PhoX homologous domain, present in p47phox and p40phox. Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, a PI3K isoform. 105
62128 214611 smart00313 PXA Domain associated with PX domains. unpubl. observations 176
62129 214612 smart00314 RA Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Kalhammer et al. have shown that not all RA domains bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and nore1 found to bind RasGTP. Included outliers (Grb7, Grb14, adenylyl cyclases etc.) 90
62130 214613 smart00315 RGS Regulator of G protein signalling domain. RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. 118
62131 197648 smart00316 S1 Ribosomal protein S1-like RNA-binding domain. 72
62132 214614 smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues 124
62133 214615 smart00318 SNc Staphylococcal nuclease homologues. 137
62134 128614 smart00319 TarH Homologues of the ligand binding domain of Tar. Homologues of the ligand binding domain of the wild-type bacterial aspartate receptor, Tar. 135
62135 197651 smart00320 WD40 WD40 repeats. Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. 40
62136 214616 smart00321 WSC present in yeast cell wall integrity and stress response component proteins. Domain present in WSC proteins, polycystin and fungal exoglucanase 95
62137 197652 smart00322 KH K homology RNA-binding domain. 68
62138 214617 smart00323 RasGAP GTPase-activator protein for Ras-like GTPases. All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. Improved domain limits from structure. 344
62139 214618 smart00324 RhoGAP GTPase-activator protein for Rho-like GTPases. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. Better domain limits and outliers. 174
62140 214619 smart00325 RhoGEF Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Improved coverage. 180
62141 214620 smart00326 SH3 Src homology 3 domains. Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations. 56
62142 214621 smart00327 VWA von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 175
62143 214622 smart00328 BPI1 BPI/LBP/CETP N-terminal domain. Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain 225
62144 128624 smart00329 BPI2 BPI/LBP/CETP C-terminal domain. Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain 202
62145 214623 smart00330 PIPKc Phosphatidylinositol phosphate kinases. 342
62146 214624 smart00331 PP2C_SIG Sigma factor PP2C-like phosphatases. 193
62147 214625 smart00332 PP2Cc Serine/threonine phosphatases, family 2C, catalytic domain. The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. 252
62148 197660 smart00333 TUDOR Tudor domain. Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished). 57
62149 197661 smart00335 ANX Annexin repeats. 53
62150 197662 smart00336 BBOX B-Box-type zinc finger. 42
62151 214626 smart00337 BCL BCL (B-Cell lymphoma); contains BH1, BH2 regions. (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation 100
62152 197664 smart00338 BRLZ basic region leucine zipper. 65
62153 214627 smart00339 FH FORKHEAD. FORKHEAD, also known as a "winged helix" 89
62154 128634 smart00340 HALZ homeobox associated leucine zipper. 44
62155 128635 smart00341 HRDC Helicase and RNase D C-terminal. Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease. 81
62156 197666 smart00342 HTH_ARAC helix_turn_helix, arabinose operon control protein. 84
62157 197667 smart00343 ZnF_C2HC zinc finger. 17
62158 214628 smart00344 HTH_ASNC helix_turn_helix ASNC type. AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli 108
62159 197669 smart00345 HTH_GNTR helix_turn_helix gluconate operon transcriptional repressor. 60
62160 214629 smart00346 HTH_ICLR helix_turn_helix isocitrate lyase regulation. 91
62161 197670 smart00347 HTH_MARR helix_turn_helix multiple antibiotic resistance protein. 101
62162 128642 smart00348 IRF interferon regulatory factor. interferon regulatory factor, also known as tryptophan pentad repeat 107
62163 214630 smart00349 KRAB krueppel associated box. 61
62164 214631 smart00350 MCM minichromosome maintenance proteins. 509
62165 128645 smart00351 PAX Paired Box domain. 125
62166 197673 smart00352 POU Found in Pit-Oct-Unc transcription factors. 75
62167 197674 smart00353 HLH helix loop helix domain. 53
62168 197675 smart00354 HTH_LACI helix_turn_helix lactose operon repressor. 70
62169 197676 smart00355 ZnF_C2H2 zinc finger. 23
62170 214632 smart00356 ZnF_C3H1 zinc finger. 27
62171 214633 smart00357 CSP Cold shock protein domain. RNA-binding domain that functions as a RNA-chaperone in bacteria and is involved in regulating translation in eukaryotes. Contains sub-family of RNA-binding domains in the Rho transcription termination factor. 64
62172 214634 smart00358 DSRM Double-stranded RNA binding motif. 67
62173 214635 smart00359 PUA Putative RNA-binding Domain in PseudoUridine synthase and Archaeosine transglycosylase. 76
62174 214636 smart00360 RRM RNA recognition motif. 73
62175 214637 smart00361 RRM_1 RNA recognition motif. 70
62176 214638 smart00363 S4 S4 RNA-binding domain. 60
62177 214639 smart00364 LRR_BAC Leucine-rich repeats, bacterial type. 20
62178 197684 smart00365 LRR_SD22 Leucine-rich repeat, SDS22-like subfamily. 22
62179 197685 smart00367 LRR_CC Leucine-rich repeat - CC (cysteine-containing) subfamily. 26
62180 197686 smart00368 LRR_RI Leucine rich repeat, ribonuclease inhibitor type. 28
62181 197687 smart00369 LRR_TYP Leucine-rich repeats, typical (most populated) subfamily. 24
62182 197688 smart00370 LRR Leucine-rich repeats, outliers. 24
62183 197689 smart00380 AP2 DNA-binding domain in plant proteins such as APETALA2 and EREBPs. 64
62184 214640 smart00382 AAA ATPases associated with a variety of cellular activities. AAA - ATPases associated with a variety of cellular activities. This profile/alignment only detects a fraction of this vast family. The poorly conserved N-terminal helix is missing from the alignment. 148
62185 197691 smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y). 13
62186 214641 smart00385 CYCLIN domain present in cyclins, TFIIB and Retinoblastoma. A helical domain present in cyclins and TFIIB (twice) and Retinoblastoma (once). A protein recognition domain functioning in cell-cycle and transcription control. 83
62187 214642 smart00386 HAT HAT (Half-A-TPR) repeats. Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. 33
62188 214643 smart00387 HATPase_c Histidine kinase-like ATPases. Histidine kinase-, DNA gyrase B-, phytochrome-like ATPases. 111
62189 214644 smart00388 HisKA His Kinase A (phosphoacceptor) domain. Dimerisation and phosphoacceptor domain of histidine kinases. 66
62190 197696 smart00389 HOX Homeodomain. DNA-binding factors that are involved in the transcriptional regulation of key developmental processes 57
62191 214645 smart00390 GoLoco LGN motif, putative GEFs specific for G-alpha GTPases. GEF specific for Galpha_i proteins 23
62192 128673 smart00391 MBD Methyl-CpG binding domain. Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain 77
62193 214646 smart00392 PROF Profilin. Binds actin monomers, membrane polyphosphoinositides and poly-L-proline. 129
62194 214647 smart00393 R3H Putative single-stranded nucleic acids-binding domain. 79
62195 197697 smart00394 RIIa RIIalpha, Regulatory subunit portion of type II PKA R-subunit. RIIalpha, Regulatory subunit portion of type II PKA R-subunit. Contains dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs). 38
62196 197698 smart00396 ZnF_UBR1 Putative zinc finger in N-recognin, a recognition component of the N-end rule pathway. Domain is involved in recognition of N-end rule substrates in yeast Ubr1p 71
62197 197699 smart00397 t_SNARE Helical region found in SNAREs. All alpha-helical motifs that form twisted and parallel four-helix bundles in target soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptor proteins. This motif found in "Q-SNAREs". 66
62198 197700 smart00398 HMG high mobility group. 70
62199 197701 smart00399 ZnF_C4 c4 zinc finger in nuclear hormone receptors. 70
62200 128681 smart00400 ZnF_CHCC zinc finger. 55
62201 214648 smart00401 ZnF_GATA zinc finger binding to DNA consensus sequence [AT]GATA[AG]. 52
62202 214649 smart00404 PTPc_motif Protein tyrosine phosphatase, catalytic domain motif. 105
62203 214650 smart00406 IGv Immunoglobulin V-Type. 81
62204 214651 smart00407 IGc1 Immunoglobulin C-Type. 75
62205 197706 smart00408 IGc2 Immunoglobulin C-2 Type. 63
62206 214652 smart00409 IG Immunoglobulin. 85
62207 214653 smart00410 IG_like Immunoglobulin like. IG domains that cannot be classified into one of IGv1, IGc1, IGc2, IG. 85
62208 197709 smart00411 BHL bacterial (prokaryotic) histone like domain. 90
62209 128690 smart00412 Cu_FIST Copper-Fist. binds DNA only in presence of copper or silver 39
62210 197710 smart00413 ETS erythroblast transformation specific domain. variation of the helix-turn-helix motif 87
62211 197711 smart00414 H2A Histone 2A. 106
62212 214654 smart00415 HSF heat shock factor. 105
62213 128694 smart00417 H4 Histone H4. 74
62214 197713 smart00418 HTH_ARSR helix_turn_helix, Arsenical Resistance Operon Repressor. 66
62215 128696 smart00419 HTH_CRP helix_turn_helix, cAMP Regulatory protein. 48
62216 197714 smart00420 HTH_DEOR helix_turn_helix, Deoxyribose operon repressor. 53
62217 197715 smart00421 HTH_LUXR helix_turn_helix, Lux Regulon. lux regulon (activates the bioluminescence operon) 58
62218 197716 smart00422 HTH_MERR helix_turn_helix, mercury resistance. 70
62219 214655 smart00423 PSI domain found in Plexins, Semaphorins and Integrins. 47
62220 128701 smart00424 STE STE like transcription factors. 111
62221 214656 smart00425 TBOX Domain first found in the mice T locus (Brachyury) protein. 190
62222 128703 smart00426 TEA TEA domain. 68
62223 197718 smart00427 H2B Histone H2B. 97
62224 128705 smart00428 H3 Histone H3. 105
62225 214657 smart00429 IPT ig-like, plexins, transcription factors. 90
62226 214658 smart00430 HOLI Ligand binding domain of hormone receptors. 163
62227 128708 smart00431 SCAN leucine rich region. 113
62228 197721 smart00432 MADS MADS domain. 59
62229 214659 smart00433 TOP2c TopoisomeraseII. Eukaryotic DNA topoisomerase II, GyrB, ParE 594
62230 214660 smart00434 TOP4c DNA Topoisomerase IV. Bacterial DNA topoisomerase IV, GyrA, ParC 444
62231 214661 smart00435 TOPEUc DNA Topoisomerase I (eukaryota). DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomerase 391
62232 214662 smart00436 TOP1Bc Bacterial DNA topoisomerase I ATP-binding domain. Extension of TOPRIM in Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase beta subunit 89
62233 214663 smart00437 TOP1Ac Bacterial DNA topoisomerase I DNA-binding domain. Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit 259
62234 128715 smart00438 ZnF_NFX Repressor of transcription. 20
62235 214664 smart00439 BAH Bromo adjacent homology domain. 121
62236 128717 smart00440 ZnF_C2C2 C2C2 Zinc finger. Nucleic-acid-binding motif in transcriptional elongation factor TFIIS and RNA polymerases. 40
62237 128718 smart00441 FF Contains two conserved F residues. A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues. 55
62238 214665 smart00442 FGF Acidic and basic fibroblast growth factor family. Mitogens that stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family play essential roles in patterning and differentiation during vertebrate embryogenesis, and have neurotrophic activities. 126
62239 197727 smart00443 G_patch glycine rich nucleic binding domain. A predicted glycine rich nucleic binding domain found in the splicing factor 45, SON DNA binding protein and D-type Retrovirus- polyproteins. 47
62240 214666 smart00444 GYF Contains conserved Gly-Tyr-Phe residues. Proline-binding domain in CD2-binding protein. Contains conserved Gly-Tyr-Phe residues. 56
62241 214667 smart00445 LINK Link (Hyaluronan-binding). 94
62242 197729 smart00446 LRRcap occurring C-terminal to leucine-rich repeats. A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. 19
62243 214668 smart00448 REC cheY-homologous receiver domain. CheY regulates the clockwise rotation of E. coli flagellar motors. This domain contains a phosphoacceptor site that is phosphorylated by histidine kinase homologues. 55
62244 214669 smart00449 SPRY Domain in SPla and the RYanodine Receptor. Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues. 122
62245 197731 smart00450 RHOD Rhodanese Homology Domain. An alpha beta fold found duplicated in the Rhodanese protein. The cysteine-containing, enzymatically active version of the domain is also found in the CDC25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and stress proteins such as Senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions with a loss of the cysteine are also seen in Dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases. These are likely to play a role in protein interactions. 100
62246 197732 smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins. 35
62247 214670 smart00452 STI Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. 172
62248 197734 smart00453 WSN Worm-specific (usually) N-terminal domain. 69
62249 197735 smart00454 SAM Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation. 68
62250 128731 smart00455 RBD Raf-like Ras-binding domain. 70
62251 197736 smart00456 WW Domain with 2 conserved Trp (W) residues. Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides. 33
62252 214671 smart00457 MACPF membrane-attack complex / perforin. 195
62253 214672 smart00458 RICIN Ricin-type beta-trefoil. Carbohydrate-binding domain formed from presumed gene triplication. 118
62254 128735 smart00459 Sorb Sorbin homologous domain. First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins. 50
62255 214673 smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events. 68
62256 214674 smart00461 WH1 WASP homology region 1. Region of the Wiskott-Aldrich syndrome protein (WASp) that contains point mutations in the majority of patients with WAS. Unknown function. Ena-like WH1 domains bind polyproline-containing peptides, and Homer contains a WH1 domain. 106
62257 214675 smart00462 PTB Phosphotyrosine-binding domain, phosphotyrosine-interaction (PI) domain. PTB/PI domain structure similar to those of pleckstrin homology (PH) and IRS-1-like PTB domains. 134
62258 214676 smart00463 SMR Small MutS-related domain. 80
62259 197740 smart00464 LON Found in ATP-dependent protease La (LON). N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs. 92
62260 214677 smart00465 GIYc GIY-YIG type nucleases (URI domain). 84
62261 197742 smart00466 SRA SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533. 155
62262 197743 smart00467 GS GS motif. An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity. 30
62263 128744 smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished. 98
62264 128745 smart00469 WIF Wnt-inhibitory factor-1 like domain. Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity. 136
62265 214678 smart00470 ParB ParB-like nuclease domain. Plasmid RK2 ParB preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5'-->3' exonuclease activity. 89
62266 214679 smart00471 HDc Metal dependent phosphohydrolases with conserved 'HD' motif. Includes eukaryotic cyclic nucleotide phosphodiesterases (PDEc). This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit). 124
62267 197746 smart00472 MIR Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. 57
62268 214680 smart00473 PAN_AP divergent subfamily of APPLE domains. Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions. 78
62269 214681 smart00474 35EXOc 3'-5' exonuclease. 3'-5' exonuclease proofreading domain present in DNA polymerase I, Werner syndrome helicase, RNase D and other enzymes 172
62270 214682 smart00475 53EXOc 5'-3' exonuclease. 259
62271 128752 smart00476 DNaseIc deoxyribonuclease I. Deoxyribonuclease I catalyzes the endonucleolytic cleavage of double-stranded DNA. The enzyme is secreted outside the cell and also involved in apoptosis in the nucleus. 276
62272 214683 smart00477 NUC DNA/RNA non-specific endonuclease. prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases 210
62273 214684 smart00478 ENDO3c endonuclease III. includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases 149
62274 214685 smart00479 EXOIII exonuclease domain in DNA-polymerase alpha and epsilon chain, ribonuclease T and other exonucleases. 169
62275 214686 smart00480 POL3Bc DNA polymerase III beta subunit. 345
62276 197753 smart00481 POLIIIAc DNA polymerase alpha chain like domain. DNA polymerase alpha chain like domain, incl. family of hypothetical proteins 67
62277 214687 smart00482 POLAc DNA polymerase A domain. 207
62278 214688 smart00483 POLXc DNA polymerase X family. includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases 334
62279 214689 smart00484 XPGI Xeroderma pigmentosum G I-region. domain in nucleases 73
62280 214690 smart00485 XPGN Xeroderma pigmentosum G N-region. domain in nucleases 99
62281 214691 smart00486 POLBc DNA polymerase type-B family. DNA polymerase alpha, delta, epsilon and zeta chain (eukaryota), DNA polymerases in archaea, DNA polymerase II in E. coli, mitochondrial DNA polymerases and virus DNA polymerases 474
62282 214692 smart00487 DEXDc DEAD-like helicases superfamily. 201
62283 214693 smart00488 DEXDc2 DEAD-like helicases superfamily. 289
62284 197757 smart00490 HELICc helicase superfamily c-terminal domain. 82
62285 214694 smart00491 HELICc2 helicase superfamily c-terminal domain. 142
62286 214695 smart00493 TOPRIM topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins. 75
62287 214696 smart00494 ChtBD2 Chitin-binding domain type 2. 49
62288 197760 smart00495 ChtBD3 Chitin-binding domain type 3. 41
62289 128772 smart00496 IENR2 Intron-encoded nuclease repeat 2. Short helical motif of unknown function (unpublished results). 17
62290 197761 smart00497 IENR1 Intron encoded nuclease repeat motif. Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished). 53
62291 214697 smart00498 FH2 Formin Homology 2 Domain. FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain. 392
62292 214698 smart00499 AAI Plant lipid transfer protein / seed storage protein / trypsin-alpha amylase inhibitor domain family. 79
62293 128776 smart00500 SFM Splicing Factor Motif, present in Prp18 and Pr04. 44
62294 128777 smart00501 BRIGHT BRIGHT, ARID (A/T-rich interaction domain) domain. DNA-binding domain containing a helix-turn-helix structure 93
62295 128778 smart00502 BBC B-Box C-terminal domain. Coiled coil region C-terminal to (some) B-Box domains 127
62296 214699 smart00503 SynN Syntaxin N-terminal domain. Three-helix domain that (in Sso1p) slows the rate of its reaction with the SNAP-25 homologue Sec9p 117
62297 128780 smart00504 Ubox Modified RING finger domain. Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination. 63
62298 214700 smart00505 Knot1 Knottins. Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins. 45
62299 214701 smart00506 A1pp Appr-1"-p processing enzyme. Function determined by Martzen et al. Extended family detected by reciprocal PSI-BLAST searches (unpublished results, and Pehrson & Fuji). 133
62300 214702 smart00507 HNHc HNH nucleases. 52
62301 214703 smart00508 PostSET Cysteine-rich motif following a subset of SET domains. 17
62302 197766 smart00509 TFS2N Domain in the N-terminus of transcription elongation factor S-II (and elsewhere). 75
62303 128786 smart00510 TFS2M Domain in the central regions of transcription elongation factor S-II (and elsewhere). 102
62304 128787 smart00511 ORANGE Orange domain. This domain confers specificity among members of the Hairy/E(SPL) family. 45
62305 214704 smart00512 Skp1 Found in Skp1 protein family. Family of Skp1 (kinetochore protein required for cell cycle progression) and elongin C (subunit of RNA polymerase II transcription factor SIII) homologues. 104
62306 128789 smart00513 SAP Putative DNA-binding (bihelical) motif predicted to be involved in chromosomal organisation. 35
62307 214705 smart00515 eIF5C Domain at the C-termini of GCD6, eIF-2B epsilon, eIF-4 gamma and eIF-5. 83
62308 214706 smart00516 SEC14 Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p). Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) and in RhoGAPs, RhoGEFs and the RasGAP, neurofibromin (NF1). Lipid-binding domain. The SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. 158
62309 197769 smart00517 PolyA C-terminal domain of Poly(A)-binding protein. Present also in Drosophila hyperplastics discs protein. Involved in homodimerisation (either directly or indirectly) 64
62310 214707 smart00518 AP2Ec AP endonuclease family 2. These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites 273
62311 128794 smart00520 BASIC Basic domain in HLH proteins of MYOD family. 91
62312 128795 smart00521 CBF CCAAT-Binding transcription Factor. 62
62313 214708 smart00523 DWA Domain A in dwarfin family proteins. 109
62314 197770 smart00524 DWB Domain B in dwarfin family proteins. 171
62315 197771 smart00525 FES iron-sulphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3). 21
62316 197772 smart00526 H15 Domain in histone families 1 and 5. 66
62317 197773 smart00527 HMG17 domain in high mobility group proteins HMG14 and HMG17. 88
62318 128801 smart00528 HNS Domain in histone-like proteins of HNS family. 46
62319 197774 smart00529 HTH_DTXR Helix-turn-helix diphtheria tox regulatory element. iron dependent repressor 95
62320 197775 smart00530 HTH_XRE Helix-turn-helix XRE-family like proteins. 56
62321 128804 smart00531 TFIIE Transcription initiation factor IIE. 147
62322 214709 smart00532 LIGANc Ligase N family. 441
62323 214710 smart00533 MUTSd DNA-binding domain of DNA mismatch repair MUTS family. 308
62324 197777 smart00534 MUTSac ATPase domain of DNA mismatch repair MUTS family. 185
62325 197778 smart00535 RIBOc Ribonuclease III family. 129
62326 197779 smart00536 AXH domain in Ataxins and HMG containing proteins. unknown function 116
62327 214711 smart00537 DCX Domain in the Doublecortin (DCX) gene product. Tandemly-repeated domain in doublin, the Doublecortin gene product. Proposed to bind tubulin. Doublecortin (DCX) is mutated in human X-linked neuronal migration defects. 89
62328 197780 smart00538 POP4 A domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins. 92
62329 214712 smart00539 NIDO Extracellular domain of unknown function in nidogen (entactin) and hypothetical proteins. 152
62330 128813 smart00540 LEM in nuclear membrane-associated proteins. LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin. 44
62331 128814 smart00541 FYRN FY-rich domain, N-terminal region. is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. 44
62332 197781 smart00542 FYRC FY-rich domain, C-terminal region. is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. 86
62333 214713 smart00543 MIF4G Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press) 200
62334 214714 smart00544 MA3 Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains Ponting (TIBS) "Novel eIF4G domain homologues" in press 113
62335 128818 smart00545 JmjN Small domain found in the jumonji family of transcription factors. To date, this domain always co-occurs with the JmjC domain (although the reverse is not true). 42
62336 214715 smart00546 CUE Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs). CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2. Ponting (Biochem. J.) "Proteins of the Endoplasmic reticulum" (in press) 43
62337 197784 smart00547 ZnF_RBZ Zinc finger domain. Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP. 25
62338 214716 smart00548 IRO Motif in Iroquois-class homeodomain proteins (only). Unknown function. 18
62339 197785 smart00549 TAFH TAF homology. Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1). 92
62340 128823 smart00550 Zalpha Z-DNA-binding domain in adenosine deaminases. Helix-turn-helix-containing domain. Also known as Zab. 68
62341 214717 smart00551 ZnF_TAZ TAZ zinc finger, present in p300 and CBP. 79
62342 214718 smart00552 ADEAMc tRNA-specific and double-stranded RNA adenosine deaminase (RNA-specific editase). 374
62343 197786 smart00553 SEP Domain present in Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. 93
62344 214719 smart00554 FAS1 Four repeated domains in the Fasciclin I family of proteins, present in many other contexts. 97
62345 128828 smart00555 GIT Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins. Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX. 31
62346 214720 smart00557 IG_FLMN Filamin-type immunoglobulin domains. These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29). 93
62347 214721 smart00558 JmjC A domain family that is part of the cupin metalloenzyme superfamily. Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press). 58
62348 128831 smart00559 Ku78 Ku70 and Ku80 are 70kDa and 80kDa subunits of the Lupus Ku autoantigen. This is a single stranded DNA- and ATP-dependent helicase that has a role in chromosome translocation. This is a domain of unknown function C-terminal to its von Willebrand factor A domain, that also occurs in bacterial hypothetical proteins. 140
62349 214722 smart00560 LamGL LamG-like jellyroll fold domain. 133
62350 214723 smart00561 MBT Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation. 96
62351 197791 smart00562 NDK Enzymes that catalyze nonsubstrate specific conversions of nucleoside diphosphates to nucleoside triphosphates. These enzymes play important roles in bacterial growth, signal transduction and pathogenicity. 135
62352 214724 smart00563 PlsC Phosphate acyltransferases. Function in phospholipid biosynthesis and have either glycerolphosphate, 1-acylglycerolphosphate, or 2-acylglycerolphosphoethanolamine acyltransferase activities. Tafazzin, the product of the gene mutated in patients with Barth syndrome, is a member of this family. 118
62353 128836 smart00564 PQQ beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases. 33
62354 128837 smart00567 EZ_HEAT E-Z type HEAT repeats. Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role. 30
62355 214725 smart00568 GRAM domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins. 60
62356 197794 smart00569 L27 domain in receptor targeting proteins Lin-2 and Lin-7. 53
62357 197795 smart00570 AWS associated with SET domains. subdomain of PRESET 50
62358 214726 smart00571 DDT domain in different transcription and chromosome remodeling factors. 63
62359 128842 smart00572 DZF domain in DSRM or ZnF_C2H2 domain containing proteins. 246
62360 214727 smart00573 HSA domain in helicases and associated with SANT domains. 73
62361 214728 smart00574 POX domain associated with HOX domains. 140
62362 128845 smart00575 ZnF_PMZ plant mutator transposase zinc finger. 28
62363 128846 smart00576 BTP Bromodomain transcription factors and PHD domain containing proteins. subdomain of archaeal histone-like transcription factors 77
62364 214729 smart00577 CPDc catalytic domain of ctd-like phosphatases. 148
62365 214730 smart00579 FBD domain in FBox and BRCT domain containing plant proteins. 72
62366 197798 smart00580 PUG domain in protein kinases, N-glycanases and other nuclear proteins. 57
62367 128850 smart00581 PSP proline-rich domain in spliceosome associated proteins. 54
62368 214731 smart00582 RPR domain present in proteins, which are involved in regulation of nuclear pre-mRNA. 124
62369 214732 smart00583 SPK domain in SET and PHD domain containing proteins and protein kinases. 114
62370 214733 smart00584 TLDc domain in TBC and LysM domain containing proteins. 165
62371 128854 smart00586 ZnF_DBF Zinc finger in DBF-like proteins. 49
62372 214734 smart00587 CHK ZnF_C4 and HLH domain containing kinases domain. subfamily of choline kinases 196
62373 128856 smart00588 NEUZ domain in neuralized proteins. 123
62374 128857 smart00589 PRY associated with SPRY domains. 52
62375 214735 smart00591 RWD domain in RING finger and WD repeat containing proteins and DEXDc-like helicases subfamily related to the UBCc domain. 107
62376 197800 smart00592 BRK domain in transcription and CHROMO domain helicases. 45
62377 214736 smart00593 RUN domain involved in Ras-like GTPase signaling. 64
62378 214737 smart00594 UAS UAS domain. 122
62379 214738 smart00595 MADF subfamily of SANT domain. 89
62380 128863 smart00596 PRE_C2HC PRE_C2HC domain. 69
62381 214739 smart00597 ZnF_TTF zinc finger in transposases and transcription factors. 91
62382 214740 smart00602 VPS10 VPS10 domain. 612
62383 128866 smart00603 LCCL LCCL domain. 85
62384 214741 smart00604 MD MD domain. 145
62385 214742 smart00605 CW CW domain. 94
62386 128869 smart00606 CBD_IV Cellulose Binding Domain Type IV. 129
62387 128870 smart00607 FTP eel-Fucolectin Tachylectin-4 Pentraxin-1 Domain. 151
62388 214743 smart00608 ACR ADAM Cysteine-Rich Domain. 137
62389 197803 smart00609 VIT Vault protein Inter-alpha-Trypsin domain. 130
62390 214744 smart00611 SEC63 Domain of unknown function in Sec63p, Brr2p and other proteins. 312
62391 128874 smart00612 Kelch Kelch domain. 47
62392 214745 smart00613 PAW domain present in PNGases and other hypothetical proteins. present in several copies in proteins with unknown function in C. elegans 89
62393 214746 smart00614 ZnF_BED BED zinc finger. DNA-binding domain in chromatin-boundary-element-binding proteins and transposases 50
62394 128877 smart00615 EPH_lbd Ephrin receptor ligand binding domain. 177
62395 214747 smart00630 Sema semaphorin domain. 390
62396 214748 smart00631 Zn_pept Zn_pept domain. 277
62397 214749 smart00632 Aamy_C Aamy_C domain. 81
62398 214750 smart00633 Glyco_10 Glycosyl hydrolase family 10. 263
62399 214751 smart00634 BID_1 Bacterial Ig-like domain (group 1). 92
62400 214752 smart00635 BID_2 Bacterial Ig-like domain 2. 81
62401 214753 smart00636 Glyco_18 Glyco_18 domain. 334
62402 214754 smart00637 CBD_II CBD_II domain. 92
62403 214755 smart00638 LPD_N Lipoprotein N-terminal Domain. 574
62404 214756 smart00639 PSA Paramecium Surface Antigen Repeat. 62
62405 214757 smart00640 Glyco_32 Glycosyl hydrolases family 32. 437
62406 128889 smart00641 Glyco_25 Glycosyl hydrolases family 25. 109
62407 214758 smart00642 Aamy Alpha-amylase domain. 166
62408 214759 smart00643 C345C Netrin C-terminal Domain. 114
62409 214760 smart00644 Ami_2 Ami_2 domain. 126
62410 214761 smart00645 Pept_C1 Papain family cysteine protease. 175
62411 214762 smart00646 Ami_3 Ami_3 domain. 113
62412 214763 smart00647 IBR In Between Ring fingers. the domain occurs between pairs of RING fingers 64
62413 197818 smart00648 SWAP Suppressor-of-White-APricot splicing regulator. domain present in regulators which are responsible for pre-mRNA splicing processes 54
62414 197819 smart00649 RL11 Ribosomal protein L11/L12. 132
62415 128898 smart00650 rADc Ribosomal RNA adenine dimethylases. 169
62416 197820 smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing 67
62417 128900 smart00652 eIF1a eukaryotic translation initiation factor 1A. 83
62418 214764 smart00653 eIF2B_5 domain present in translation initiation factor eIF2B and eIF5. 110
62419 128902 smart00654 eIF6 translation initiation factor 6. 200
62420 214765 smart00656 Amb_all Amb_all domain. 190
62421 128904 smart00657 RPOL4c DNA-directed RNA-polymerase II subunit. 118
62422 197821 smart00658 RPOL8c RNA polymerase subunit 8. subunit of RNA polymerase I, II and III 143
62423 128906 smart00659 RPOLCX RNA polymerase subunit CX. present in RNA polymerase I, II and III 44
62424 197822 smart00661 RPOL9 RNA polymerase subunit 9. 52
62425 214766 smart00662 RPOLD RNA polymerases D. DNA-directed RNA polymerase subunit D and bacterial alpha chain 224
62426 214767 smart00663 RPOLA_N RNA polymerase I subunit A N-terminus. 295
62427 214768 smart00664 DoH Possible catecholamine-binding domain present in a variety of eukaryotic proteins. A predominantly beta-sheet domain present as a regulatory N-terminal domain in dopamine beta-hydroxylase, mono-oxygenase X and SDR2. Its function remains unknown at present (Ponting, Human Molecular Genetics, in press). 148
62428 214769 smart00665 B561 Cytochrome b-561 / ferric reductase transmembrane domain. Cytochrome b-561 recycles ascorbate for the generation of norepinephrine by dopamine-beta-hydroxylase in the chromaffin vesicles of the adrenal gland. It is a transmembrane heme protein with the two heme groups being bound to conserved histidine residues. A cytochrome b-561 homologue, termed Dcytb, is an iron-regulated ferric reductase in the duodenal mucosa. Other homologues of these are also likely to be ferric reductases. SDR2 is proposed to be important in regulating the metabolism of iron in the onset of neurodegenerative disorders. 129
62429 214770 smart00666 PB1 PB1 domain. Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate. 81
62430 128913 smart00667 LisH Lissencephaly type-1-like homology motif. Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. 34
62431 128914 smart00668 CTLH C-terminal to LisH motif. Alpha-helical motif of unknown function. 58
62432 214771 smart00670 PINc Large family of predicted nucleotide-binding domains. From similarities to 5'-exonucleases, these domains are predicted to be RNases. PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in RNAi. 111
62433 214772 smart00671 SEL1 Sel1-like repeats. These represent a subfamily of TPR (tetratricopeptide repeat) sequences. 36
62434 214773 smart00672 CAP10 Putative lipopolysaccharide-modifying enzyme. 256
62435 197827 smart00673 CARP Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. 38
62436 197828 smart00674 CENPB Putative DNA-binding domain in centromere protein B, mouse jerky and transposases. 66
62437 128920 smart00675 DM11 Domains in hypothetical proteins in Drosophila including 2 in CG15241 and CG9329. 164
62438 128921 smart00676 DM10 Domains in hypothetical proteins in Drosophila, C. elegans and mammals. Occurs singly in some nucleoside diphosphate kinases. 104
62439 128922 smart00678 WWE Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis. 73
62440 128923 smart00679 CTNS Repeated motif present between transmembrane helices in cystinosin, yeast ERS1p, mannose-P-dolichol utilization defect 1, and other hypothetical proteins. Function unknown, but likely to be associated with the glycosylation machinery. 32
62441 197829 smart00680 CLIP Clip or disulphide knot domain. Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme. 52
62442 214774 smart00682 G2F G2 nidogen domain and fibulin. 227
62443 128926 smart00683 DM16 Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. 55
62444 128927 smart00684 DM15 Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. 39
62445 128928 smart00685 DM14 Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. 59
62446 128929 smart00686 DM13 Domain present in fly proteins (CG14681, CG12492, CG6217), worm H06A10.1 and Arabidopsis thaliana MBG8.9. 108
62447 128930 smart00688 DM7 Domain of unknown function in Drosophila CG15332, CG15333 and CG18293. 95
62448 214775 smart00689 DM6 Cysteine-rich domain currently specific to Drosophila. 157
62449 214776 smart00690 DM5 Domain of unknown function, currently peculiar to Drosophila. 102
62450 128933 smart00692 DM3 Zinc finger domain in CG10631, C. elegans LIN-15B and human P52rIPK. 59
62451 214777 smart00693 DysFN Dysferlin domain, N-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. 62
62452 128935 smart00694 DysFC Dysferlin domain, C-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. 34
62453 197831 smart00695 DUSP Domain in ubiquitin-specific proteases. 88
62454 128937 smart00696 DM9 Repeats found in Drosophila proteins. 71
62455 214778 smart00697 DM8 Repeats found in several Drosophila proteins. 93
62456 197832 smart00698 MORN Possible plasma membrane-binding motif in junctophilins, PIP-5-kinases and protein kinases. 22
62457 214779 smart00700 JHBP Juvenile hormone binding protein domains in insects. The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions. 224
62458 128941 smart00701 PGRP Animal peptidoglycan recognition proteins homologous to Bacteriophage T3 lysozyme. The bacteriophage molecule, but not its moth homologue, has been shown to have N-acetylmuramoyl-L-alanine amidase activity. One member of this family, Tag7, is a cytokine. 142
62459 214780 smart00702 P4Hc Prolyl 4-hydroxylase alpha subunit homologues. Mammalian enzymes catalyse hydroxylation of collagen, for example. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor. 165
62460 214781 smart00703 NRF N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4). Also present in several other worm and fly proteins. 110
62461 197836 smart00704 ZnF_CDGSH CDGSH-type zinc finger. Function unknown. 38
62462 128945 smart00705 THEG Repeats in THEG (testicular haploid expressed gene) and several fly proteins. 20
62463 214782 smart00706 TECPR Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. 35
62464 128947 smart00707 RPEL Repeat in Drosophila CG10860, human KIAA0680 and C. elegans F26H9.2. 26
62465 214783 smart00708 PhBP Insect pheromone/odorant binding protein domains. 103
62466 128949 smart00709 Zpr1 Duplicated domain in the epidermal growth factor- and elongation factor-1alpha-binding protein Zpr1. Also present in archaeal proteins. 160
62467 214784 smart00710 PbH1 Parallel beta-helix repeats. The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates. 23
62468 197839 smart00711 TDU Short repeats in human TONDU, fly vestigial and other proteins. Unknown function. 16
62469 197840 smart00712 PUR DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. 63
62470 128953 smart00713 GYR Motif of unknown function with conserved Gly, Tyr, Arg tripeptide in Drosophila proteins. 18
62471 197841 smart00714 LITAF Possible membrane-associated motif in LPS-induced tumor necrosis factor alpha factor (LITAF), also known as PIG7, and other animal proteins. 67
62472 128955 smart00715 LA Domain in the RNA-binding Lupus La protein; unknown function. 80
62473 197842 smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains. 49
62474 214785 smart00718 DM4_12 DM4/DM12 family of domains in Drosophila melanogaster proteins of unknown function. 95
62475 197843 smart00719 Plus3 Short conserved domain in transcriptional regulators. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1. 109
62476 214786 smart00720 calpain_III calpain_III domain. 143
62477 214787 smart00721 BAR BAR domain. 239
62478 214788 smart00722 CASH Domain present in carbohydrate binding proteins and sugar hydrolases. 153
62479 128962 smart00723 AMOP Adhesion-associated domain present in MUC4 and other proteins. 154
62480 214789 smart00724 TLC TRAM, LAG1 and CLN8 homology domains. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains. 205
62481 214790 smart00725 NEAT NEAr Transporter domain. 123
62482 197845 smart00726 UIM Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins. 20
62483 128966 smart00727 STI1 Heat shock chaperonin-binding motif. 41
62484 214791 smart00728 ChW Clostridial hydrophobic, with a conserved W residue, domain. 46
62485 214792 smart00729 Elp3 Elongator protein 3, MiaB family, Radical SAM. This superfamily contains MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases. 216
62486 214793 smart00730 PSN Presenilin, signal peptide peptidase, family. Presenilin 1 and presenilin 2 are polytopic membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. Distant homologues, present in eukaryotes and archaea, also contain conserved aspartic acid residues which are predicted to contribute to catalysis. At least one member of this family has been shown to possess signal peptide peptidase activity. 249
62487 214794 smart00731 SprT SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function. 146
62488 128971 smart00732 YqgFc Likely ribonuclease with RNase H fold. YqgF proteins are likely to function as an alternative to RuvC in most bacteria, and could be the principal holliday junction resolvases in low-GC Gram-positive bacteria. In Spt6p orthologues, the catalytic residues are substituted indicating that they lack enzymatic functions. 99
62489 197848 smart00733 Mterf Mitochondrial termination factor repeats. Human mitochondrial termination factor is a DNA-binding protein that acts as a transcription termination factor. Six repeats occur in human mTERF, that also are present in numerous plant proteins. 31
62490 128973 smart00734 ZnF_Rad18 Rad18-like CCHC zinc finger. Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids. 24
62491 128974 smart00735 ZM ZASP-like motif. Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules. 26
62492 214795 smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions. 97
62493 214796 smart00737 ML Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids. 119
62494 197850 smart00738 NGN In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold. 106
62495 128978 smart00739 KOW KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54. 28
62496 197851 smart00740 PASTA PASTA domain. 67
62497 214797 smart00741 SapB Saposin (B) Domains. Present in multiple copies in prosaposin and in pulmonary surfactant-associated protein B. In plant aspartic proteinases, a saposin domain is circularly permuted. This causes the prediction algorithm to predict two such domains, where only one is truly present. 76
62498 128981 smart00742 Hr1 Rho effector or protein kinase C-related kinase homology region 1 homologues. Alpha-helical domain found in vertebrate PRK1 and yeast PKC1 protein kinases C. The HR1 in rhophilin binds RhoGTP; those in PRK1 bind RhoA and RhoB. Also called RBD (Rho-binding domain). 57
62499 214798 smart00743 Agenet Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions. 59
62500 128983 smart00744 RINGv The RING-variant domain is a C4HC3 zinc-finger like motif found in a number of cellular and viral proteins. Some of these proteins have been shown both in vivo and in vitro to have ubiquitin E3 ligase activity. The RING-variant domain is reminiscent of both the RING and the PHD domains and may represent an evolutionary intermediate. To describe this domain the term PHD/LAP domain has been used in the past. Extended description: The RING-variant (RINGv) domain contains a C4HC3 zinc-finger-like motif similar to the PHD domain, while some of the spacing between the Cys/His residues follow a pattern somewhat closer to that found in the RING domain. The RINGv domain, similar to the RING, PHD and LIM domains, is thought to bind two zinc ions co-ordinated by the highly conserved Cys and His residues. RING variant domain: C-x (2) -C-x(10-45)-C-x (1) -C-x (7) -H-x(2)-C-x(11-25)-C-x(2)-C As opposed to a PHD: C-x(1-2) -C-x (7-13)-C-x(2-4)-C-x(4-5)-H-x(2)-C-x(10-21)-C-x(2)-C Classical RING domain: C-x (2) -C-x (9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48) -C-x(2)-C 49
62501 197854 smart00745 MIT Microtubule Interacting and Trafficking molecule domain. 77
62502 214799 smart00746 TRASH metallochaperone-like domain. 39
62503 128986 smart00747 CFEM eight cysteine-containing domain present in fungal extracellular membrane proteins. 65
62504 214800 smart00748 HEPN Higher Eukaryotes and Prokaryotes Nucleotide-binding domain. 113
62505 197856 smart00749 BON bacterial OsmY and nodulation domain. 61
62506 214801 smart00750 KIND kinase non-catalytic C-lobe domain. It is an interaction domain identified as being similar to the C-terminal protein kinase catalytic fold (C lobe). Its presence at the N terminus of signalling proteins and the absence of the active-site residues in the catalytic and activation loops suggest that it folds independently and is likely to be non-catalytic. The occurrence of KIND only in metazoa implies that it has evolved from the catalytic protein kinase domain into an interaction domain possibly by keeping the substrate-binding features 176
62507 128990 smart00751 BSD domain in transcription factors and synapse-associated proteins. 51
62508 214802 smart00752 HTTM Horizontally Transferred TransMembrane Domain. Sequence analysis of vitamin K dependent gamma-carboxylases (VKGC) revealed the presence of a novel domain, HTTM (Horizontally Transferred TransMembrane) in its N-terminus. In contrast to most known domains, HTTM contains four transmembrane regions. Its occurrence in eukaryotes, bacteria and archaea is more likely caused by horizontal gene transfer than by early invention. The conservation of VKGC catalytic sites indicates an enzymatic function also for the other family members. 271
62509 214804 smart00754 CHRD A domain in the BMP inhibitor chordin and in microbial proteins. 118
62510 197860 smart00755 Grip golgin-97, RanBP2alpha, Imh1p and p230/golgin-245. 46
62511 214805 smart00756 VKc Family of likely enzymes that includes the catalytic subunit of vitamin K epoxide reductase. Bacterial homologues are fused to members of the thioredoxin family of oxidoreductases. 142
62512 214806 smart00757 CRA CT11-RanBPM. protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi) 99
62513 214807 smart00758 PA14 domain in bacterial beta-glucosidases other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins. 136
62514 128998 smart00759 Flu_M1_C Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein. 95
62515 197863 smart00760 Bac_DnaA_C Bacterial dnaA protein helix-turn-helix domain. Could be involved in DNA-binding. 69
62516 214808 smart00761 HDAC_interact Histone deacetylase (HDAC) interacting. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. 102
62517 214809 smart00762 Cog4 COG4 transport protein. This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport. 324
62518 214810 smart00763 AAA_PrkA PrkA AAA domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain. 361
62519 129003 smart00764 Citrate_ly_lig Citrate lyase ligase C-terminal domain. Proteins of this family contain the C-terminal domain of citrate lyase ligase EC:6.2.1.22. 182
62520 129004 smart00765 MANEC The MANEC domain, formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase. 93
62521 197866 smart00766 DnaG_DnaB_bind DNA primase DnaG DnaB-binding. DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB. 125
62522 214811 smart00767 DCD DCD is a plant specific domain in proteins involved in development and programmed cell death. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone. 132
62523 197867 smart00768 X8 Possibly involved in carbohydrate binding. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges. 85
62524 214812 smart00769 WHy Water Stress and Hypersensitive response. 100
62525 214813 smart00770 Zn_dep_PLPC Zinc dependent phospholipase C (alpha toxin). This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium. Structure information: PDB 1ca1. 241
62526 129010 smart00771 ZipA_C ZipA, C-terminal domain (FtsZ-binding). C-terminal domain of ZipA, a component of cell division in E.coli. It interacts with the FtsZ protein in one of the initial steps of septum formation. The structure of this domain is composed of three alpha-helices and a beta-sheet consisting of six antiparallel beta-strands. 131
62527 214814 smart00773 WGR Proposed nucleic acid binding domain. This domain is named after its most conserved central motif. It is found in a variety of polyA polymerases as well as in molybdate metabolism regulators (e.g. in E.coli) and other proteins of unknown function. The domain is found in isolation in some proteins and is between 70 and 80 residues in length. It is proposed that it may be a nucleic acid binding domain. 84
62528 214815 smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding. 59
62529 197870 smart00775 LNS2 This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance. 157
62530 214816 smart00776 NPCBM This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. 145
62531 214817 smart00777 Mad3_BUB1_I Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. 124
62532 129016 smart00778 Prim_Zn_Ribbon Zinc-binding domain of primase-helicase. This region represents the zinc binding domain. It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities. 37
62533 197872 smart00780 PIG-X PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. 203
62534 129018 smart00782 PhnA_Zn_Ribbon PhnA Zinc-Ribbon. This protein family includes an uncharacterised member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterised phosphonoacetate hydrolase designated PhnA. 47
62535 129019 smart00783 A_amylase_inhib Alpha amylase inhibitor. Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases. 69
62536 214818 smart00784 SPT2 SPT2 chromatin protein. This entry includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation. 106
62537 129021 smart00785 AARP2CN AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU. 83
62538 129022 smart00786 SHR3_chaperone ER membrane protein SH3. This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). Shr3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER. 196
62539 197874 smart00787 Spc7 Spc7 kinetochore protein. This domain is found in cell division proteins which are required for kinetochore-spindle association. 312
62540 197875 smart00788 Adenylsucc_synt Adenylosuccinate synthetase. Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalyzing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterized from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present - one involved in purine biosynthesis and the other in the purine nucleotide cycle. The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, comprised of two strands and three strands each, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana when compared with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein. 417
62541 129025 smart00789 Ad_cyc_g-alpha Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins. 51
62542 129026 smart00790 AFOR_N Aldehyde ferredoxin oxidoreductase, N-terminal domain. Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea; carboxylic acid reductase found in clostridia; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum. GAPOR may be involved in glycolysis, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases. 199
62543 129027 smart00791 Agglutinin Amaranthus caudatus agglutinin or amaranthin is a lectin from the ancient South American crop, amaranth grain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins. The protein is a homodimer, with each monomer consisting of two beta-trefoil domains. 139
62544 197876 smart00792 Agouti Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation. 124
62545 214819 smart00793 AgrB Accessory gene regulator B. The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. AgrB is involved in the proteolytic processing of AgrD and may have both proteolytic enzyme activity and a transporter facilitating the export of the processed AgrD peptide. 184
62546 129030 smart00794 AgrD Staphylococcal AgrD protein. This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr. 45
62547 129031 smart00795 Agro_virD5 Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein. 780
62548 214820 smart00796 AHS1 Allophanate hydrolase subunit 1. This domain represents subunit 1 of allophanate hydrolase (AHS1). 201
62549 214821 smart00797 AHS2 Allophanate hydrolase subunit 2. This domain represents subunit 2 of allophanate hydrolase (AHS2). 280
62550 214822 smart00798 AICARFT_IMPCHas AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT), this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. 311
62551 214823 smart00799 DENN Domain found in a variety of signalling proteins, always encircled by uDENN and dDENN. The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 183
62552 214824 smart00800 uDENN Domain always found upstream of DENN domain, found in a variety of signalling proteins. The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 89
62553 129037 smart00801 dDENN Domain always found downstream of DENN domain, found in a variety of signalling proteins. The dDENN domain is part of the tripartite DENN domain. It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 69
62554 214825 smart00802 UME Domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Characteristic domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules. 107
62555 129039 smart00803 TAF TATA box binding protein associated factor. TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold. 65
62556 197882 smart00804 TAP_C C-terminal domain of vertebrate Tap protein. The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for the nuclear export of mRNA. Its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate shuttling. This domain forms a compact four-helix fold related to that of a UBA domain. 63
62557 197883 smart00805 AGTRAP Angiotensin II, type I receptor-associated protein. This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear. 159
62558 214826 smart00806 AIP3 Actin interacting protein 3. Aip3p/Bud6p is a regulator of cell and cytoskeletal polarity in Saccharomyces cerevisiae that was previously identified as an actin-interacting protein. Actin-interacting protein 3 (Aip3p) localizes at the cell cortex where cytoskeleton assembly must be achieved to execute polarized cell growth, and deletion of AIP3 causes gross defects in cell and cytoskeletal polarity. Aip3p localization is mediated by the secretory pathway, mutations in early- or late-acting components of the secretory apparatus lead to Aip3p mislocalization. 426
62559 214827 smart00807 AKAP_110 A-kinase anchor protein 110 kDa. This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction. 851
62560 197885 smart00808 FABD F-actin binding domain (FABD). FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution. 126
62561 197886 smart00809 Alpha_adaptinC2 Adaptin C-terminal domain. Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology. The adaptor appendage contains an additional N-terminal strand. 104
62562 129046 smart00810 Alpha-amyl_C2 Alpha-amylase C-terminal beta-sheet domain. This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet. 61
62563 214828 smart00811 Alpha_kinase Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. 198
62564 214829 smart00812 Alpha_L_fucos Alpha-L-fucosidase. O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Family 29 encompasses alpha-L-fucosidases, which is a lysosomal enzyme responsible for hydrolyzing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Deficiency of alpha-L-fucosidase results in the lysosomal storage disease fucosidosis. 384
62565 214830 smart00813 Alpha-L-AF_C Alpha-L-arabinofuranosidase C-terminus. This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. 189
62566 129050 smart00814 Alpha_TIF Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology. 356
62567 214831 smart00815 AMA-1 Apical membrane antigen 1. Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites. 239
62568 129052 smart00816 Amb_V_allergen Amb V Allergen. Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactive T cell epitopes within these related allergens. 45
62569 214832 smart00817 Amelin Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. 411
62570 197891 smart00818 Amelogenin Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide. 165
62571 214833 smart00822 PKS_KR This enzymatic domain is part of bacterial polyketide synthases. It catalyses the first step in the reductive modification of the beta-carbonyl centres in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group. 180
62572 214834 smart00823 PKS_PP Phosphopantetheine attachment site. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. 86
62573 214835 smart00824 PKS_TE Thioesterase. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa. 212
62574 214836 smart00825 PKS_KS Beta-ketoacyl synthase. The structure of beta-ketoacyl synthase is similar to that of the thiolase family and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. 298
62575 214837 smart00826 PKS_DH Dehydratase domain in polyketide synthase (PKS) enzymes. 167
62576 214838 smart00827 PKS_AT Acyl transferase domain in polyketide synthase (PKS) enzymes. 298
62577 214839 smart00828 PKS_MT Methyltransferase in polyketide synthase (PKS) enzymes. 224
62578 214840 smart00829 PKS_ER Enoylreductase. Enoylreductase in Polyketide synthases. 287
62579 214841 smart00830 CM_2 Chorismate mutase type II. Chorismate mutase catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine. 79
62580 214842 smart00831 Cation_ATPase_N Cation transporter/ATPase, N-terminus. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases. 75
62581 214843 smart00832 C8 This domain contains 8 conserved cysteine residues. Not all of the conserved cysteines have been included in the alignment model. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. 76
62582 214844 smart00833 CobW_C Cobalamin synthesis protein cobW C-terminal domain. CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression. 92
62583 197903 smart00834 CxxC_CXXC_SSSS Putative regulatory protein. CxxC_CXXC_SSSS represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence. 41
62584 214845 smart00835 Cupin_1 Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. 146
62585 214846 smart00836 DALR_1 DALR anticodon binding domain. This all alpha helical domain is the anticodon binding domain of Arginyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids. 122
62586 129070 smart00837 DPBB_1 Rare lipoprotein A (RlpA)-like double-psi beta-barrel. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen. 87
62587 197906 smart00838 EFG_C Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold. 85
62588 214847 smart00839 ELFV_dehydrog Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction. 102
62589 214848 smart00840 DALR_2 This DALR domain is found in cysteinyl-tRNA-synthetases. 56
62590 214849 smart00841 Elong-fact-P_C Elongation factor P, C-terminal. These nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology. 57
62591 214850 smart00842 FtsA Cell division protein FtsA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. 187
62592 197911 smart00843 Ftsk_gamma This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding. 63
62593 197912 smart00844 GA GA module. The protein G-related albumin-binding (GA) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin. 60
62594 197913 smart00845 GatB_Yqey GatB domain. This domain is found in GatB and proteins related to bacterial Yqey. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB which transamidates Glu-tRNA to Gln-tRNA. The function of this domain is uncertain. It does however suggest that Yqey and its relatives have a role in tRNA metabolism. 147
62595 214851 smart00846 Gp_dh_N Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. The N-terminal domain is a Rossmann NAD(P) binding fold. 149
62596 214852 smart00847 HA2 Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding. 82
62597 214853 smart00848 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. 57
62598 214854 smart00849 Lactamase_B Metallo-beta-lactamase superfamily. Apart from the beta-lactamases a number of other proteins contain this domain. These proteins include thiolesterases, members of the glyoxalase II family, that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein these proteins bind two zinc ions per molecule as cofactor. 177
62599 197918 smart00850 LytTR LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. 96
62600 214855 smart00851 MGS MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site. 91
62601 214856 smart00852 MoCF_biosynth Probable molybdopterin binding domain. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation. 138
62602 214857 smart00853 MutL_C MutL C terminal dimerisation domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation. 140
62603 214858 smart00854 PGA_cap Bacterial capsule synthesis protein PGA_cap. This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein. 239
62604 214859 smart00855 PGAM Phosphoglycerate mutase family. Phosphoglycerate mutase (PGAM) and bisphosphoglycerate mutase (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction, the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer and the degradation of 2,3-DPG to 3-PGA (phosphatase activity). In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein. 158
62605 214860 smart00856 PMEI Plant invertase/pectin methylesterase inhibitor. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences, suggesting that both PMEs and their inhibitors are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical. 148
62606 214861 smart00857 Resolvase Resolvase, N terminal domain. The N-terminal domain of the resolvase family contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase. 148
62607 214862 smart00858 SAF This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. 63
62608 214863 smart00859 Semialdhyde_dh Semialdehyde dehydrogenase, NAD binding domain. The semialdehyde dehydrogenase family is found in N-acetyl-glutamine semialdehyde dehydrogenase (AgrC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a NAD binding region of semialdehyde dehydrogenase. 123
62609 214864 smart00860 SMI1_KNR4 SMI1 / KNR4 family. Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. 127
62610 214865 smart00861 Transket_pyr Transketolase, pyrimidine binding domain. Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose receptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 Kd subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. In the peroxisomes of methylotrophic yeast Hansenula polymorpha, there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates. 136
62611 214866 smart00862 Trans_reg_C Transcriptional regulatory protein, C terminal. This domain is almost always found associated with the response regulator receiver domain. It may play a role in DNA binding. 76
62612 197931 smart00863 tRNA_SAD Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain. 43
62613 214867 smart00864 Tubulin Tubulin/FtsZ family, GTPase domain. This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division, it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases, this entry is the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. 192
62614 214868 smart00865 Tubulin_C Tubulin/FtsZ family, C-terminal domain. This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division, it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain. 120
62615 214869 smart00866 UTRA The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain. It has a similar fold to HutC/FarR-like bacterial transcription factors of the GntR family. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules. 143
62616 214870 smart00867 YceI YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold. 166
62617 214871 smart00868 zf-AD Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA. 73
62618 214872 smart00869 Autotransporter Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type IV pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different peptidase is used and in some cases no cleavage occurs. 268
62619 214873 smart00870 Asparaginase Asparaginase, found in various plant, animal and bacterial cells. Asparaginase catalyses the deamination of asparagine to yield aspartic acid and an ammonium ion, resulting in a depletion of free circulatory asparagine in plasma. The enzyme is effective in the treatment of human malignant lymphomas, which have a diminished capacity to produce asparagine synthetase: in order to survive, such cells absorb asparagine from blood plasma. If Asn levels have been depleted by injection of asparaginase, the lymphoma cells die. 323
62620 214874 smart00871 AraC_E_bind Bacterial transcription activator, effector binding domain. This domain is found in the probable effector binding domain of a number of different bacterial transcription activators and is also present in some DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate that these do not bind DNA. 158
62621 214875 smart00872 Alpha-mann_mid Alpha mannosidase, middle domain. Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase. 79
62622 214876 smart00873 B3_4 B3/4 domain. This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. 174
62623 197942 smart00874 B5 tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits. 68
62624 197943 smart00875 BACK BTB And C-terminal Kelch. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 101
62625 214877 smart00876 BATS Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerisation. 94
62626 197945 smart00877 BMC Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures. 75
62627 214878 smart00878 Biotin_carb_C Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain. 107
62628 214879 smart00879 Brix The Brix domain is found in a number of eukaryotic proteins. Members include SSF proteins from yeast and humans, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins. 180
62629 214880 smart00880 CHAD The CHAD domain is an alpha-helical domain functionally associated with some members of the adenylate cyclase family. It has conserved histidines that may chelate metals. 262
62630 214881 smart00881 CoA_binding CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. 100
62631 214882 smart00882 CoA_trans Coenzyme A transferase. Coenzyme A (CoA) transferases belong to an evolutionary conserved family of enzymes catalyzing the reversible transfer of CoA from one carboxylic acid to another. They have been identified in many prokaryotes and in mammalian tissues. The bacterial enzymes are heterodimers of two subunits (A and B) of about 25 Kd each while eukaryotic SCOT consists of a single chain which is colinear with the two bacterial subunits. 212
62632 197951 smart00883 Cpn10 Chaperonin 10 Kd subunit. The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six to eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn-10 binds to cpn-60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. 93
62633 214883 smart00884 Cullin_Nedd8 Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue. 68
62634 197953 smart00885 D5_N D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases. 141
62635 214884 smart00886 Dabb Stress responsive A/B Barrel Domain. The function of this domain is unknown, but it is upregulated in response to salt stress in Populus balsamifera (balsam poplar). It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. It is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an alpha-beta barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence. 97
62636 214885 smart00887 EB_dh Ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyses the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria. 209
62637 214886 smart00888 EF1_GNE EF-1 guanine nucleotide exchange domain. Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta). This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma). 88
62638 214887 smart00889 EFG_IV Elongation factor G, domain IV. Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome. EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the 'body' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place. 120
62639 197958 smart00890 EKR Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known. 57
62640 214888 smart00891 ERCC4 ERCC4 domain. This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases. 98
62641 214889 smart00892 Endonuclease_NS DNA/RNA non-specific endonuclease. A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion. 198
62642 214890 smart00893 ETF Electron transfer flavoprotein domain. Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II). ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides. This entry represents the N-terminal domain of both the alpha and beta subunits from Group I and Group II ETFs. 185
62643 214891 smart00894 Excalibur Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 37
62644 214892 smart00895 FCD This entry represents the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain. This domain is found in and that are regulators of sugar biosynthesis operons. Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon. In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector binding or oligomerisation domain at the C terminus. The winged-helix DNA-binding domain is well conserved in structure for the whole of the GntR family, and is similar in structure to other transcriptional regulator families. The C-terminal effector-binding and oligomerisation domains are more variable and are consequently used to define the subfamilies. Based on the sequence and structure of the C-terminal domains, the GntR family can be divided into four major groups, as represented by FadR, HutC, MocR and YtrA, as well as some minor groups such as those represented by AraR and PlmA. 123
62645 214893 smart00896 FDX-ACB Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2). 93
62646 214894 smart00897 FIST FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids. 196
62647 214895 smart00898 Fapy_DNA_glyco Formamidopyrimidine-DNA glycosylase N-terminal domain. This entry represents the catalytic domain of DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway. These enzymes are primarily from bacteria, and have both DNA glycosylase activity and AP lyase activity. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine. These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two-turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82). The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger. 115
62648 214896 smart00899 FeoA This entry represents the core domain of the ferrous iron (Fe2+) transport protein FeoA found in bacteria. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment. 72
62649 214897 smart00900 FMN_bind This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins. 86
62650 214898 smart00901 FRG This domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised. 103
62651 214899 smart00902 Fe_hyd_SSU Iron hydrogenase small subunit. Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while FeFe (Fe only) hydrogenases tend to be more involved in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore, hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold. 52
62652 214900 smart00903 Flavin_Reduct Flavin reductase like domain. This entry represents the FMN-binding domain found in NAD(P)H-flavin oxidoreductases (flavin reductases), a class of enzymes capable of producing reduced flavin for bacterial bioluminescence and other biological processes. This domain is also found in various other oxidoreductase and monooxygenase enzymes. This domain consists of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. The flavin reductases have a different dimerisation mode than that found in the PNP oxidase-like family, which also carries an FMN-binding domain with a similar topology. 147
62653 214901 smart00904 Flavokinase Riboflavin kinase. Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme. 124
62654 214902 smart00905 FolB Dihydroneopterin aldolase. Dihydroneopterin aldolase catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. In the opportunistic pathogen Pneumocystis carinii, dihydroneopterin aldolase function is expressed as the N-terminal portion of the multifunctional folic acid synthesis protein (Fas). This region encompasses two domains, FasA and FasB, which are 27% amino acid identical. FasA and FasB also share significant amino acid sequence similarity with bacterial dihydroneopterin aldolases. This region consists of two tandem sequences each homologous to folB and which form tetramers. 113
62655 214903 smart00906 Fungal_trans Fungal specific transcription factor domain. This domain is found in a number of fungal transcription factors including transcriptional activator xlnR, yeast regulatory protein GAL4, and other transcription proteins regulating a variety of cellular and metabolic processes. 93
62656 197975 smart00907 GDNF GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity. 86
62657 214904 smart00908 Gal-bind_lectin Galactoside-binding lectin. Animal lectins display a wide variety of architectures. They are classified according to the carbohydrate-recognition domain (CRD) of which there are two main types, S-type and C-type. Galectins (previously S-lectins) bind exclusively beta-galactosides like lactose. They do not require metal ions for activity. Galectins are found predominantly, but not exclusively in mammals. Their function is unclear. They are developmentally regulated and may be involved in differentiation, cellular regulation and tissue construction. 122
62658 214905 smart00909 Germane Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 (pfam01520), Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold. 84
62659 214906 smart00910 HIRAN The HIRAN protein (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks. 90
62660 214907 smart00911 HWE_HK HWE histidine kinase. The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as. In. the HWE family was defined by the presence of a conserved H residue and a WXE motif and was limited to members of the proteobacteria. However, many homologues of this domain lack the WXE motif. Furthermore, homologues are found in a wide range of Gram-positive and Gram-negative bacteria as well as in several archaea. 84
62661 214908 smart00912 Haemagg_act haemagglutination activity domain. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins. 119
62662 197981 smart00913 IBN_N Importin-beta N-terminal domain. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport. 67
62663 197982 smart00914 IDEAL A short protein domain of unknown function. It is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members. 37
62664 214909 smart00915 Jacalin Jacalin-like lectin domain. This entry represents a mannose-binding lectin domain with a beta-prism fold consisting of three 4-stranded beta-sheets, with an internal pseudo 3-fold symmetry. Some lectins in this group stimulate distinct T- and B- cell functions, such as Jacalin, which binds to the T-antigen and acts as an agglutinin. This domain is found in 1 to 6 copies in lectins. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. 128
62665 197984 smart00916 L51_S25_CI-B8 Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. Proteins containing this domain are located in the mitochondrion and include ribosomal proteins L51 and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. 70
62666 214910 smart00917 LeuA_dimer LeuA allosteric (dimerisation) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich. 131
62667 214911 smart00918 Lig_chan-Glu_bd Ligated ion channel L-glutamate- and glycine-binding site. This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan. 62
62668 214912 smart00919 Malic_M Malic enzyme, NAD binding domain. Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate. 231
62669 214913 smart00920 MHC_II_alpha Class II histocompatibility antigen, alpha domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection). 81
62670 197989 smart00921 MHC_II_beta Class II histocompatibility antigen, beta domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection). 72
62671 214914 smart00922 MR_MLE Mandelate racemase / muconate lactonizing enzyme, C-terminal domain. Mandelate racemase (MR) and muconate lactonizing enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyze mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures. This entry represents the C-terminal region of these proteins. 97
62672 197991 smart00923 MbtH MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters. 49
62673 214915 smart00924 MgtE_N MgtE intracellular N domain. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. It is presumed to be an intracellular domain, that may be involved in magnesium binding. 105
62674 214916 smart00925 MltA MltA specific insert domain. This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. 153
62675 197994 smart00926 Molybdop_Fe4S4 Molybdopterin oxidoreductase Fe4S4 domain. The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain. 55
62676 214917 smart00927 MutH DNA mismatch repair enzyme MutH. MutS, MutL and MutH are the three essential proteins for initiation of methyl-directed DNA mismatch repair to correct mistakes made during DNA replication in Escherichia coli. MutH cleaves a newly synthesized and unmethylated daughter strand 5' to the sequence d(GATC) in a hemi-methylated duplex. Activation of MutH requires the recognition of a DNA mismatch by MutS and MutL. 100
62677 197996 smart00928 NADH_4Fe-4S NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 46
62678 214918 smart00929 NADH-G_4Fe-4S_3 NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 41
62679 197998 smart00930 NIL This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family. 76
62680 197999 smart00931 NOSIC NOSIC (NUC001) domain. This is the central domain in Nop56/SIK1-like proteins. 52
62681 214919 smart00932 Nfu_N Scaffold protein Nfu/NifU N terminal. This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters. 88
62682 214920 smart00933 NurA NurA nuclease. This family includes NurA, a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius. 262
62683 214921 smart00934 OMPdecase Orotidine 5'-phosphate decarboxylase / HUMPS family. Orotidine 5'-phosphate decarboxylase (OMPdecase) catalyzes the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins. 212
62684 214922 smart00935 OmpH Outer membrane protein (OmpH-like). This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterized as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery. 140
62685 198004 smart00936 PBP5_C Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides. 92
62686 214923 smart00937 PCRF This domain is found in peptide chain release factors. 116
62687 198006 smart00938 P-II Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase. 102
62688 214924 smart00939 PepX_C X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry, that are not characterised as peptidases, show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases. 214
62689 198008 smart00940 PepX_N X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction. 156
62690 214925 smart00941 PYNP_C Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer. 75
62691 214926 smart00942 PriCT_1 Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases. 66
62692 214927 smart00943 Prim-Pol Bifunctional DNA primase/polymerase, N-terminal. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities. 154
62693 214928 smart00944 Pro-kuma_activ Pro-kumamolisin, activation domain. This domain is found at the N-terminus of peptidases belonging to MEROPS peptidase family S53 (sedolisin, clan SB). The domain adopts a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase. 136
62694 198013 smart00945 ProQ ProQ/FINO family. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. 113
62695 198014 smart00946 ProRS-C_1 Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif. 67
62696 214929 smart00947 Pro_CA Carbonic anhydrase. Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate. In photosynthetic bacteria and plant chloroplast, CA is essential to inorganic carbon fixation. Prokaryotic and plant chloroplast CA are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs. 154
62697 198016 smart00948 Proteasome_A_N Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins. 23
62698 198017 smart00949 PAZ This domain is named PAZ after the proteins Piwi, Argonaute and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerisation. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway. 138
62699 214930 smart00950 Piwi This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg2+ dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNase H and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA. 301
62700 214931 smart00951 QLQ QLQ is named after the conserved Gln, Leu, Gln motif. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions. 36
62701 214932 smart00952 RAP This domain is found in various eukaryotic species, particularly in apicomplexans. In Plasmodium falciparum, the domain is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain. 58
62702 214933 smart00953 RES RES domain. This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues long. 121
62703 214934 smart00954 RelA_SpoT Region found in RelA / SpoT proteins. The functions of Escherichia coli RelA and SpoT differ somewhat. RelA produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species. 111
62704 214935 smart00955 RNB This domain is the catalytic domain of ribonuclease II. 286
62705 214936 smart00956 RQC This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain. 92
62706 214937 smart00957 SecA_DEAD SecA DEAD-like domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the. 380
62707 214938 smart00958 SecA_PP_bind SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain. 114
62708 198027 smart00959 Rho_N Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain. 43
62709 214939 smart00960 Robl_LC7 Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. 88
62710 198029 smart00961 RuBisCO_small Ribulose bisphosphate carboxylase, small chain. RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. The function of the small subunit is unknown. While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are distributed in a tissue specific manner. They are transcriptionally regulated by light receptor phytochrome, which results in RuBisCO being more abundant during the day when it is required. 96
62711 214940 smart00962 SRP54 SRP54-type protein, GTPase domain. This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. The GTPase domain is evolutionary related to P-loop NTPase domains found in a variety of other proteins. 197
62712 214941 smart00963 SRP54_N SRP54-type protein, helical bundle domain. This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. 77
62713 214942 smart00964 STAT_int STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain. 120
62714 198033 smart00965 STN Secretin and TonB N terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates. 52
62715 198034 smart00966 SpoVT_AbrB SpoVT / AbrB like domain. This domain is found in AbrB from Bacillus subtilis. The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation. AbrB is thought to interact directly with the transcription initiation regions of genes under its control. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins. The product of the B. subtilis gene spoVT is another member of this family and is also a transcriptional regulator. DNA-binding activity in this AbrB homologue requires hexamerisation. Another family member has been isolated from the Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The Escherichia coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins. 45
62716 214943 smart00967 SpoU_sub_bind RNA 2'-O ribose methyltransferase substrate binding. This domain is a RNA 2'-O ribose methyltransferase substrate binding domain. 70
62717 214944 smart00968 SMC_hinge SMC proteins Flexible Hinge Domain. This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. 120
62718 198037 smart00969 SOCS_box The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases. 34
62719 214945 smart00970 s48_45 Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation. 116
62720 198039 smart00971 SATase_N Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria. 105
62721 198040 smart00972 SCPU Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins as well as a family of secreted pili proteins involved in motility and biofilm formation. 59
62722 214946 smart00973 Sec63 Sec63 Brl domain. This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases. 314
62723 214947 smart00974 T5orf172 This entry represents the putative helicase A859L. 80
62724 214948 smart00975 Telomerase_RBD Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA. 136
62725 214949 smart00976 Telo_bind Telomeric single stranded DNA binding POT1/CDC13. The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold. 137
62726 198045 smart00977 TilS_C TilS substrate C-terminal domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. 69
62727 214950 smart00978 Tim44 Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. 147
62728 198047 smart00979 TIFY This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response. 36
62729 214951 smart00980 THAP The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes. 80
62730 214952 smart00981 THUMP The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterised RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets. 83
62731 198050 smart00982 TRCF This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript. 100
62732 214953 smart00983 TPK_B1_binding Thiamin pyrophosphokinase, vitamin B1 binding domain. Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 66
62733 214954 smart00984 UDPG_MGDP_dh_C UDP binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 99
62734 214955 smart00985 UBA_e1_C Ubiquitin-activating enzyme e1 C-terminal domain. This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised. 128
62735 214956 smart00986 UDG Uracil DNA glycosylase superfamily. 156
62736 214958 smart00988 UreE_N UreE urease accessory protein, N-terminal domain. UreE is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid. 65
62737 198057 smart00989 V4R The V4R (vinyl 4 reductase) domain is a predicted small molecular binding domain, that may bind to hydrocarbons. 61
62738 214959 smart00990 VRR_NUC This model contains proteins with the VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI. 108
62739 214960 smart00991 WHEP-TRS A conserved domain of 46 amino acids, called WHEP-TRS has been shown to exist in a number of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present one to six times in the several enzymes. There are three copies in mammalian multifunctional aminoacyl-tRNA synthetase in a region that separates the N-terminal glutamyl-tRNA synthetase domain from the C-terminal prolyl-tRNA synthetase domain, and six copies in the intercatalytic region of the Drosophila enzyme. The domain is found at the N-terminal extremity of the mammalian tryptophanyl-tRNA synthetase and histidyl-tRNA synthetase, and the mammalian, insect, nematode and plant glycyl-tRNA synthetases. This domain could contain a central alpha-helical region and may play a role in the association of tRNA-synthetases into multienzyme complexes. 56
62740 214961 smart00992 YccV-like Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. 98
62741 198061 smart00993 YL1_C YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins. 30
62742 198062 smart00994 zf-C4_ClpX ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known. 39
62743 214962 smart00995 AD Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. 90
62744 214963 smart00996 AdoHcyase S-adenosyl-L-homocysteine hydrolase. 426
62745 198065 smart00997 AdoHcyase_NAD S-adenosyl-L-homocysteine hydrolase, NAD binding domain. 162
62746 198066 smart00998 ADSL_C Adenylosuccinate lyase C-terminus. Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide (the fifth step of de novo IMP biosynthesis); the formation of adenosine monophosphate (AMP) from adenylosuccinate (the final step in the synthesis of AMP from IMP). This entry represents the C-terminal, seven alpha-helical, domain of adenylosuccinate lyase. 81
62747 198067 smart00999 Aerolysin Aerolysin toxin. This family represents the pore forming lobe of aerolysin. 368
62748 214964 smart01000 Aha1_N Activator of Hsp90 ATPase, N-terminal. This domain is predominantly found in the protein 'Activator of Hsp90 ATPase'; it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity. 134
62749 214965 smart01001 AIRC AIR carboxylase. Members of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. 152
62750 214966 smart01002 AlaDh_PNT_C Alanine dehydrogenase/PNT, C-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. 149
62751 214967 smart01003 AlaDh_PNT_N Alanine dehydrogenase/PNT, N-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. 133
62752 214968 smart01004 ALAD Delta-aminolevulinic acid dehydratase. This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme. 321
62753 214969 smart01005 Ala_racemase_C Alanine racemase, C-terminal domain. Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. This domain is found in both prokaryotic and eukaryotic proteins. 124
62754 198074 smart01006 AlcB Siderophore biosynthesis protein domain. AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle. 48
62755 214970 smart01007 Aldolase_II Class II Aldolase and Adducin N-terminal domain. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. 185
62756 214971 smart01008 Ald_Xan_dh_C Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2FE-2S clusters as cofactors. Xanthine dehydrogenase catalyses the hydrogenation of xanthine to urate, and also requires FAD, molybdenum and two 2FE-2S clusters as cofactors. This activity is often found in a bifunctional enzyme with xanthine oxidase activity too. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups. 107
62757 214972 smart01009 AlkA_N AlkA N-terminal domain. This domain is found at the N terminus of bacterial AlkA. AlkA (3-methyladenine-DNA glycosylase II) is a base excision repair glycosylase from Escherichia coli. It removes a variety of alkylated bases from DNA, primarily by removing alkylation damage from duplex and single stranded DNA. AlkA flips a 1-azaribose abasic nucleotide out of DNA. This produces a 66 degree bend in the DNA and a marked widening of the minor groove. 113
62758 214973 smart01010 AMPKBI 5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family. 100
62759 198079 smart01011 AMP_N Aminopeptidase P, N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain. However, little or no sequence similarity exists between the two families. 135
62760 198080 smart01012 ANTAR ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil. 55
62761 198081 smart01013 APC2 Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein. 60
62762 198082 smart01014 ARID ARID/BRIGHT DNA binding domain. Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. 88
62763 214974 smart01015 Arfaptin Arfaptin-like domain. Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin. 217
62764 214975 smart01016 Arg_tRNA_synt_N Arginyl tRNA synthetase N terminal dom. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition. 85
62765 214976 smart01017 Arrestin_C Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; Cone photoreceptors C-arrestin (arrestin-X), which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction. 142
62766 198086 smart01018 B12-binding_2 B12 binding domain. Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin. Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases. 84
62767 214977 smart01019 B3 B3 DNA binding domain. Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations. 96
62768 198088 smart01020 B2-adapt-app_C Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15). 111
62769 214978 smart01021 Bac_rhodopsin Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). 233
62770 214979 smart01022 ASCH The ASCH domain adopts a beta-barrel fold similar to that of the PUA domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation. 99
62771 198091 smart01023 BAF Barrier to autointegration factor. Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to lamin filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA. 87
62772 214980 smart01024 BCS1_N This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein. 170
62773 214981 smart01025 BEN The BEN domain is found in diverse animal proteins. Proteins containing BEN domains are BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, the chordopoxvirus virosomal protein E5R and several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription. 80
62774 214982 smart01026 Beach Beige/BEACH domain. The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown. 280
62775 214983 smart01027 Beta-Casp Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. 126
62776 198096 smart01028 Beta-TrCP_D D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with F-box domain, WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation. 40
62777 198097 smart01029 BetaGal_dom2 Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members. 182
62778 214984 smart01030 BHD_1 Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA. 54
62779 214985 smart01031 BHD_2 Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA. 56
62780 198100 smart01032 BHD_3 Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA. 75
62781 198101 smart01033 BING4CT BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins. 80
62782 198102 smart01034 BLUF Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light. 92
62783 214986 smart01035 BOP1NT BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins. 264
62784 214987 smart01036 BP28CT BP28CT (NUC211) domain. This C-terminal domain is found in BAP28-like nucleolar proteins. 151
62785 198105 smart01037 Bet_v_1 Pathogenesis-related protein Bet v I family. This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kd that are wide-spread among dicotyledonous plants. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10. - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1, an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones. - (S)-Norcoclaurine synthases are enzymes catalysing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. - Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens. 151
62786 214988 smart01038 Bgal_small_N Beta galactosidase small chain. This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase. 272
62787 198107 smart01039 BRICHOS The BRICHOS domain is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for the domain, which is about 100 amino acids long, include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999 provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region. 96
62788 214989 smart01040 Bro-N BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins. 89
62789 214990 smart01041 BRO1 BRO1-like domain. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1. 381
62790 198110 smart01042 Brr6_like_C_C Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport. 134
62791 198111 smart01043 BTAD Bacterial transcriptional activator domain. Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR along with the C terminal region is capable of independently directing actinorhodin production. This family contains TPR repeats. 145
62792 214991 smart01044 Btz CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide. 106
62793 214992 smart01045 BURP The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown. 222
62794 198114 smart01046 c-SKI_SMAD_bind c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4. 95
62795 214993 smart01047 C1_4 TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C. 49
62796 214994 smart01048 C6 This domain of unknown function is found in a C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge. 98
62797 214995 smart01049 Cache_2 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins. Members include the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions. 91
62798 214996 smart01050 CactinC_cactus Cactus-binding C-terminus of cactin protein. CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain further upstream. 129
62799 198119 smart01051 CAMSAP_CKK Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin. 129
62800 214997 smart01052 CAP_GLY Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. 68
62801 198121 smart01053 CaMBD Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. 76
62802 214998 smart01054 CaM_binding Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defence responses and stolonisation or tuberization. 115
62803 214999 smart01055 Cadherin_pro Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions. 87
62804 198124 smart01056 Candida_ALS_N Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins. 245
62805 215000 smart01057 Carb_anhydrase Eukaryotic-type carbonic anhydrase. Carbonic anhydrases are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion. 247
62806 215001 smart01058 CarD_TRCF CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase. 99
62807 215002 smart01059 CAT Chloramphenicol acetyltransferase. Chloramphenicol acetyltransferase (CAT) catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT, evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat containing-transferases family. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen. 202
62808 215003 smart01060 Catalase Catalases are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria. 373
62809 215004 smart01061 CAT_RBD CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template. 55
62810 198130 smart01062 Ca_chan_IQ Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms, calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF). 31
62811 215005 smart01063 CBM49 Carbohydrate binding domain CBM49. This domain is found at the C terminal of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose. 84
62812 198132 smart01064 CBM_10 Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. 29
62813 215006 smart01065 CBM_2 Starch binding domain. 88
62814 198134 smart01066 CBM_25 Carbohydrate binding domain. 83
62815 215007 smart01067 CBM_3 Cellulose binding domain. 83
62816 215008 smart01068 CBM_X Putative carbohydrate binding domain. 62
62817 215009 smart01069 CDC37_C Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (Heat shocked protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37. 93
62818 215010 smart01070 CDC37_M Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (Heat shocked protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 whose function is unclear. 155
62819 198139 smart01071 CDC37_N Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp (Heat shocked protein) 90-binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function. 154
62820 215011 smart01072 CDC48_2 Cell division protein 48 (CDC48) domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain. 64
62821 215012 smart01073 CDC48_N Cell division protein 48 (CDC48) N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain. 82
62822 215013 smart01074 Cdc6_C CDC6, C terminal. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localisation factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity. 84
62823 215014 smart01075 CDT1 DNA replication factor CDT1 like. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin. 164
62824 198144 smart01076 CG-1 CG-1 domains are highly conserved domains of about 130 amino-acid residues. The domains contain a predicted bipartite NLS and are named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin -binding Transcription Activator) that are transcription factors containing a calmodulin -binding domain and ankyrins (ANK) motifs. 118
62825 198145 smart01077 Cg6151-P Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined. 111
62826 198146 smart01078 CGGC This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function. 106
62827 215015 smart01079 CHASE This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases. Predicted to be a ligand binding domain. 176
62828 215016 smart01080 CHASE2 CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time. 303
62829 215017 smart01081 CHB_HEX Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 160
62830 198150 smart01082 CHZ Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones. 38
62831 198151 smart01083 Cir_N N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex. 37
62832 198152 smart01084 CKS Cyclin-dependent kinase regulatory subunit. Cyclin-dependent kinase regulatory subunit. 70
62833 198153 smart01085 CK_II_beta Casein kinase II regulatory subunit. 184
62834 198154 smart01086 ClpB_D2-small C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2. 90
62835 215018 smart01087 COG6 Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation. 598
62836 198156 smart01088 Col_cuticle_N Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins. 53
62837 198157 smart01089 Connexin_CCC Gap junction channel protein cysteine-rich domain. 67
62838 215019 smart01090 Copper-fist Copper fist is an N-terminal domain involved in copper-dependent DNA binding. The domain is named for its resemblance to a fist. It can be found in some fungal transcription factors. These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding. 38
62839 215020 smart01091 CorC_HlyC Transporter associated domain. This small domain is found in a family of proteins with the DUF21 domain and two CBS domains, with this domain found at the C-terminus of the proteins; the domain is also found at the C terminus of some Na+/H+ antiporters. This domain is also found in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates. 78
62840 215021 smart01092 CO_deh_flav_C CO dehydrogenase flavoprotein C-terminal domain. 102
62841 198161 smart01093 CP12 CP12 domain. 72
62842 215022 smart01094 CpcD CpcD/allophycocyanin linker domain. 51
62843 198163 smart01095 Cpl-7 Cpl-7 lysozyme C-terminal domain. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far. 42
62844 198164 smart01096 CPSase_L_D3 Carbamoyl-phosphate synthetase large chain, oligomerisation domain. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. 124
62845 198165 smart01097 CPSase_sm_chain Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. The small chain has a GATase domain in the carboxyl terminus. 130
62846 215023 smart01098 CPSF73-100_C This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known. 212
62847 198167 smart01099 CPW_WPC This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown. 60
62848 215024 smart01100 CRAL_TRIO_N CRAL/TRIO, N-terminal domain. 48
62849 215025 smart01101 CRISPR_assoc This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. 215
62850 198170 smart01102 CRM1_C CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat. 321
62851 198171 smart01103 CRS1_YhbY Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. 84
62852 215026 smart01104 CTD Spt5 C-terminal nonapeptide repeat binding Spt4. The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif. 121